Proxmox LXC backup/restore failures: tar errors, permissions, and filesystem gotchas

You run vzdump on a Proxmox LXC container. It churns for a while. Then it detonates with a tar complaint that sounds like a 1998 Linux forum post:
“tar: … Cannot open: Permission denied”, or “xattrs not supported”, or the classic “Unexpected EOF in archive”.

Meanwhile the business thinks “backups are green” because the job existed, not because it restored. Storage thinks it’s an app problem. The app team thinks it’s “a Proxmox thing.”
It’s usually a filesystem thing wearing a tar costume.

Fast diagnosis playbook

The goal is not to admire the error message. The goal is to find the bottleneck fast: storage capability, filesystem features, or identity mapping.
Here’s the order that saves time.

1) Confirm which phase failed: backup creation, compression, or restore extraction

  • If the archive file is missing or tiny, it’s a creation problem (read errors, permissions, snapshot issues).
  • If the archive exists but restore fails early, it’s usually xattrs/ACLs/ownership mapping, or the target storage can’t represent metadata.
  • If restore fails late, look for “No space left”, inode exhaustion, or timeouts on network storage.

2) Identify the storage type on both ends (source rootfs and backup target)

LXC rootfs on ZFS behaves differently than rootfs on dir. Backup target on NFS behaves differently than backup target on ZFS.
“It’s just a file” is a comforting lie your storage will punish.

3) Check whether the container is unprivileged and whether the backup/restore path preserves ownership

Unprivileged containers rely on UID/GID mapping. If you restore onto storage that can’t store those IDs or xattrs, tar will squeal and Proxmox will abort.

4) Reproduce with no compression and verbose tar output

Compression hides the first real error. Turn it off, reproduce, and read the first failing file path.

5) Only then chase “tar bugs”

Tar is usually the messenger. Shooting it doesn’t fix the message.

How Proxmox backs up LXC (and where tar fits)

Proxmox’s LXC backups are usually driven by vzdump. For containers, the default “artifact” is a tar archive of the container’s root filesystem
plus metadata (config, mount points). Depending on mode, Proxmox might:

  • stop mode: stop container, tar the filesystem consistently, start it again.
  • snapshot mode: if storage supports snapshots, take one and tar from the snapshot while the container keeps running.
  • suspend mode: older compromise; less common and less loved.

The key: even if your rootfs is ZFS, the resulting backup is still a tar stream unless you back up to Proxmox Backup Server, which uses its own archive format (and carries the same metadata concerns).
That tar stream carries metadata: permissions, owners, timestamps, device nodes, ACLs, and extended attributes (xattrs).
If the source filesystem has features the destination (backup target) or extraction destination can’t represent, you get failures that look arbitrary.
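If you're not sure which mode a scheduled job actually used, run vzdump by hand and read the first lines of the log; the "backup mode:" line tells you what you got. A minimal sketch, reusing the CT ID and storage name from the examples later in this article (substitute your own):

cr0x@server:~$ sudo vzdump 101 --mode snapshot --storage backup-nfs --compress zstd

If the rootfs storage can't snapshot, vzdump notes it in the log and typically falls back to another mode; that fallback is often the moment "file changed as we read it" warnings start appearing.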

Interesting facts & historical context (because this stuff has roots)

  1. Tar predates Linux by a decade. It was built for tape; “streaming” behavior is why partial archives happen when something interrupts the pipe.
  2. GNU tar gained xattr/ACL support gradually. Older distros treated ACLs as optional frosting; modern containers treat them as identity.
  3. LXC’s unprivileged containers normalized UID shifting. The “root inside is 100000 outside” convention is not universal, but it’s common.
  4. Proxmox inherited design DNA from OpenVZ tooling. The name vzdump is a fossil from that era; it still does the job.
  5. ZFS snapshots are cheap; restores aren’t always. Snapshots are metadata, but sending/receiving or extracting tar is real I/O.
  6. NFS has multiple personalities. v3 vs v4, root_squash, and attribute caching can turn “permission denied” into performance theater.
  7. CIFS/SMB is not a POSIX filesystem. It can emulate Unix mode bits, but xattrs and device nodes are a negotiation, not a guarantee.
  8. Overlay filesystems changed expectations. Containers made people expect copy-on-write layers; tar expects stable inodes and paths.

One operational rule that ages well: if you can’t restore, you don’t have a backup. That’s not poetry, it’s accounting.

Quote (paraphrased idea) from Werner Vogels: “Everything fails, all the time—design and operate assuming failure.” It applies brutally well to backups.

Failure modes: tar errors and what they really mean

“tar: Cannot open: Permission denied”

This is rarely “tar lacks permission.” It’s usually the Proxmox backup process running on the host trying to traverse a path that the host can’t read
due to idmapped ownership, bind mounts, or a rootfs that isn’t what you think it is.
Common triggers:

  • Bind mount into the container from a directory with restrictive permissions on the host.
  • NFS mount with root_squash where host root becomes nobody.
  • Unprivileged container where certain paths have unexpected ownership after a manual chown or rsync.
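A quick sanity check before going deeper: read the exact failing file as host root. The path here is the one from the example log later in this article; use the one from your own log.

cr0x@server:~$ sudo head -c 64 /srv/data/cache/tmp.db > /dev/null && echo readable || echo not-readable

If host root can't read it and the path sits on NFS, suspect root_squash before anything else.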

“tar: Cannot set xattr” / “Operation not supported”

This is a filesystem feature mismatch. The extraction target doesn’t support the xattr namespace tar is trying to restore.
Classic cases:

  • Restoring onto CIFS without proper Unix extensions.
  • Restoring onto an NFS export that strips xattrs or maps them oddly.
  • Restoring into a filesystem mounted with xattr/acl disabled (yes, people still do that).
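You can probe xattr and ACL support on a candidate restore target before committing to it. A sketch, assuming the target is mounted at /mnt/target (a placeholder) and the attr/acl packages are installed:

cr0x@server:~$ touch /mnt/target/probe  # /mnt/target is a placeholder path
cr0x@server:~$ setfattr -n user.test -v 1 /mnt/target/probe && getfattr -n user.test /mnt/target/probe
cr0x@server:~$ setfacl -m u:nobody:r /mnt/target/probe && getfacl /mnt/target/probe

If either command returns "Operation not supported", tar will hit the same wall during restore, just later and louder.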

“tar: Cannot change ownership to uid …”

Usually one of:

  • You’re restoring into an unprivileged container rootfs on a filesystem that doesn’t like large UIDs/GIDs, or you’re restoring without the right mapping context.
  • Extraction is happening as a user that cannot chown (e.g., backup running inside a restricted environment or over certain network filesystems).
  • Backup was taken from a privileged container and restored to an unprivileged one (or vice versa) without adjusting expectations.
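The equivalent probe for ownership: create a file on the intended restore target and chown it to a shifted UID. 101000 is just a representative value from the usual 100000+ range.

cr0x@server:~$ touch /mnt/target/probe  # /mnt/target is a placeholder path
cr0x@server:~$ sudo chown 101000:101000 /mnt/target/probe && stat -c '%u:%g' /mnt/target/probe

If the chown fails, or stat reports different IDs than you set (common with root_squash or NFSv4 idmapping), unprivileged restores onto that storage will not end well.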

“Unexpected EOF in archive”

Tar is a stream. If the process writing the stream dies, the reader sees EOF. Root causes:

  • Out of space on backup target mid-stream.
  • Network hiccup to NFS/CIFS causing write failure and pipe break.
  • OOM kill, or timeout in compression stage (zstd/gzip).
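Before chasing any of those, confirm whether the stream itself is truncated. zstd has a built-in test mode, and a full tar listing forces a read of the entire archive; either exposes a short file. The path is the example archive used in the tasks below.

cr0x@server:~$ zstd -t /mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst
cr0x@server:~$ sudo tar -tf /mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst > /dev/null

If either fails, stop debugging permissions: you have an incomplete archive, and the interesting logs are on the storage and network side.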

“File changed as we read it”

This is what happens when you back up a running filesystem without a snapshot barrier. Some churn is benign; some yields broken application state.
If you see this often, switch to snapshot mode on supported storage, or stop mode for correctness.

Joke #1: Backups are like parachutes—if you wait to test them during the jump, you’ve committed to a learning experience.

Permissions: privileged vs unprivileged, idmaps, and why root isn’t root

Containers are not VMs. They share the host kernel. Proxmox LXC uses Linux user namespaces for unprivileged containers.
That means:

  • Privileged container: container root maps to host root (UID 0).
  • Unprivileged container: container root maps to some high host UID (often 100000). File ownership on disk reflects host UIDs, not container UIDs.

Why this matters for backup/restore:

  • Tar stores numeric UIDs/GIDs. If your backup contains host-shifted UIDs (e.g., 100000+), restore needs to recreate them exactly.
  • Some filesystems and exports don’t preserve large UIDs well, or map them strangely.
  • Bind mounts can introduce mixed ownership worlds: part of the tree is shifted, part is not.
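To see the shift with your own eyes, compare ownership of the same file from inside the container and from the host. A sketch for CT 101 with its rootfs on the local-zfs subvolume used in the tasks below; for a default-mapped unprivileged container, the host view is offset by 100000:

cr0x@server:~$ sudo pct exec 101 -- stat -c '%u:%g %n' /etc/hostname
0:0 /etc/hostname
cr0x@server:~$ sudo stat -c '%u:%g %n' /rpool/data/subvol-101-disk-0/etc/hostname
100000:100000 /rpool/data/subvol-101-disk-0/etc/hostname

Those host-side numbers are exactly what tar records, and exactly what the restore target has to be able to reproduce.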

Bind mounts are the silent saboteur

Proxmox allows container mount points (mp0, mp1, etc.) that bind host paths into the container.
This is great until you back up. Depending on configuration, those paths might be included, excluded, or behave differently during tar.
The host permissions on the source path decide what tar can read.

Practical advice: if you use bind mounts for application data, treat that data as a separate backup domain with its own method (filesystem snapshot, database dump, etc.).
Trying to “just include it in vzdump” works right up until it doesn’t. And it always fails on a weekend.
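What that separate backup domain looks like depends on what actually backs the bind mount. If it's a local ZFS dataset, a snapshot plus send/receive is often enough; in the example scenario later in this article the source is NFS, so the equivalent snapshot would happen on the NFS server instead. Sketch only; tank/appdata and backuphost are placeholders:

cr0x@server:~$ sudo zfs snapshot tank/appdata@nightly-2025-12-26
cr0x@server:~$ sudo zfs send tank/appdata@nightly-2025-12-26 | ssh backuphost zfs receive -u backup/appdata

Databases still deserve logical dumps on top of this; a filesystem snapshot of a busy database is crash-consistent at best.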

Filesystem gotchas: ZFS, dir, NFS, CIFS, btrfs, and friends

ZFS: snapshots help, but dataset properties can still hurt you

ZFS is the grown-up in the room: consistent snapshots, checksums, compression, and fast clones. But it’s not magic.
The usual restore landmines are:

  • Mountpoint confusion: restoring into a dataset that’s not mounted where Proxmox expects.
  • Dataset permissions: ACL mode and xattr storage set in ways that don’t match container expectations.
  • Refquota/refreservation: restore fails mid-way with ENOSPC even though the pool has space.
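All of those are visible in one query. A hedged example for the subvolume used in the tasks below (adjust the dataset name):

cr0x@server:~$ zfs get -o property,value mountpoint,mounted,acltype,xattr,refquota,available rpool/data/subvol-101-disk-0

acltype=posixacl with xattr=sa is the combination container rootfs datasets generally want; mounted=no where Proxmox expects a mounted path, or available measured in megabytes, explains a lot of "mysterious" restore failures.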

Directory storage (“dir”): simple, readable, and surprisingly easy to misconfigure

A plain filesystem directory as storage is often fine. Your enemy here is not complexity; it’s subtle mount options:
noacl, nouser_xattr, or storing backups on something like exFAT because “it’s just a backup disk.”
ExFAT does not understand your Linux metadata needs. It understands regret.

NFS: the permission model is the product

If your backups live on NFS, you need to understand:

  • root_squash: host root becomes anonymous; tar can’t chown and may not even read.
  • idmapping: NFSv4 name-based mapping can mismatch numeric IDs, leading to “wrong ownership after restore.”
  • locking and attribute caching: can cause weird “file changed” warnings during backup of active trees.
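The authoritative view of export policy is on the NFS server, not the client. If you have access there, exportfs shows the effective options; the hostname and export below are the ones from the example scenario and will differ in your environment.

cr0x@nfs01:~$ sudo exportfs -v | grep appdata
/export/appdata 10.0.0.0/24(rw,wdelay,root_squash,no_subtree_check,sec=sys)

root_squash in that line is the single most common explanation for host-root "Permission denied" on bind-mounted data.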

CIFS/SMB: fine for office documents, risky for container rootfs semantics

SMB can work for backup archives (a tar.zst file sitting on a share) if permissions are correct.
But restoring rootfs onto CIFS is a bad idea unless you enjoy debugging xattrs at 2 a.m.
If you must: ensure Unix extensions, xattrs, and proper mount options. Even then, test restores regularly.

btrfs: snapshots are there, but operational maturity varies

btrfs can do snapshots and send/receive. In Proxmox, btrfs support exists but is less common than ZFS in the wild.
The gotchas tend to be:

  • Subvolume layout confusion leading to “restore succeeded but container won’t start.”
  • Quota groups causing unexpected ENOSPC even with apparent free space.
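Two quick checks cover most of it; a sketch assuming the btrfs storage is mounted at /mnt/btrfs-pool (a placeholder path):

cr0x@server:~$ sudo btrfs subvolume list /mnt/btrfs-pool | head
cr0x@server:~$ sudo btrfs qgroup show /mnt/btrfs-pool

Subvolumes that aren't where the container config says they are explain the "restored but won't start" case; qgroup limits sitting near their maximum explain the phantom ENOSPC.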

Joke #2: Tar doesn’t “hate you.” It just refuses to lie about your filesystem’s personality.

Practical tasks: commands, what the output means, and the decision you make

These are the real moves. Each task includes a runnable command, sample output, what it tells you, and what you do next.
Run them on the Proxmox host unless noted.

Task 1: Identify the container’s rootfs and storage backend

cr0x@server:~$ pct config 101 | egrep '^(arch|cores|hostname|ostype|rootfs|unprivileged|mp[0-9]+)'
hostname: app-lxc-01
ostype: debian
arch: amd64
cores: 4
unprivileged: 1
rootfs: local-zfs:subvol-101-disk-0,size=16G
mp0: /srv/data,mp=/var/lib/app-data

Meaning: Rootfs is a ZFS subvolume. Container is unprivileged. There’s a bind mount (mp0) from /srv/data.

Decision: Expect UID shifting. Treat mp0 separately; it is likely the first source of permission failures.

Task 2: Verify the backup storage configuration and type

cr0x@server:~$ pvesm status
Name             Type     Status           Total            Used       Available        %
local             dir     active        196529976        42108740       144544184   21.43%
local-zfs         zfspool active        2147483648      126903296      2020580352    5.91%
backup-nfs        nfs     active       1953514588      110154852      1843369736    5.64%

Meaning: Backups likely go to backup-nfs. NFS means root_squash and xattr behavior are suspects.

Decision: Confirm NFS export options before blaming Proxmox or tar.

Task 3: Inspect a failed job log for the first real error

cr0x@server:~$ journalctl -u pvedaemon -S -2h | egrep -i 'vzdump|tar:|error|exit'
Dec 26 11:01:02 server pvedaemon[2218]: vzdump 101 --storage backup-nfs --mode snapshot --compress zstd
Dec 26 11:01:45 server pvedaemon[2218]: tar: ./var/lib/app-data/cache: Cannot open: Permission denied
Dec 26 11:01:45 server pvedaemon[2218]: tar: Exiting with failure status due to previous errors
Dec 26 11:01:45 server pvedaemon[2218]: ERROR: Backup of VM 101 failed - command 'tar ...' failed: exit code 2

Meaning: The failing path is inside the bind mount target (/var/lib/app-data), which maps to /srv/data on host.

Decision: Check host permissions on /srv/data and NFS squashing if it’s on an NFS mount.

Task 4: Check whether the bind mount source is on NFS/CIFS and with what options

cr0x@server:~$ findmnt -T /srv/data -o TARGET,SOURCE,FSTYPE,OPTIONS
TARGET   SOURCE                  FSTYPE OPTIONS
/srv/data nfs01:/export/appdata  nfs4   rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.10

Meaning: Your bind mount source is NFS. If the export has root_squash, host root won’t be able to read everything.

Decision: Validate export permissions. Consider excluding bind mounts from vzdump and backing them up via storage-native snapshot instead.

Task 5: Confirm root squashing symptoms by trying to read as root

cr0x@server:~$ sudo -n ls -ld /srv/data /srv/data/cache
drwxr-x--- 10 100000 100000 4096 Dec 26 10:40 /srv/data
drwx------  2 100123 100123 4096 Dec 26 10:45 /srv/data/cache

Meaning: Ownership is shifted (100000+), consistent with unprivileged container data landing on NFS. Root on host may still read, but NFS export policy may override.

Decision: If root can’t traverse, fix NFS export (no_root_squash for a dedicated backup/host net) or change backup approach.

Task 6: Re-run backup without compression to surface the true failure quickly

cr0x@server:~$ sudo vzdump 101 --storage backup-nfs --mode snapshot --compress 0
INFO: starting new backup job: vzdump 101 --storage backup-nfs --mode snapshot --compress 0
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive 'vzdump-lxc-101-2025_12_26-11_07_10.tar'
ERROR: tar: ./var/lib/app-data/cache: Cannot open: Permission denied
ERROR: tar: Exiting with failure status due to previous errors
INFO: removing partial backup file 'backup-nfs:backup/vzdump-lxc-101-2025_12_26-11_07_10.tar'
ERROR: Backup of VM 101 failed - command 'tar' failed: exit code 2

Meaning: Same failure, cleaner log. It’s not compression.

Decision: Stop trying different compressors. Fix permissions or exclude that mount from backup.

Task 7: Check container mount points and whether they’re included in backup

cr0x@server:~$ grep -E '^(mp[0-9]+|lxc.mount.entry)' /etc/pve/lxc/101.conf
mp0: /srv/data,mp=/var/lib/app-data

Meaning: That mount is defined via Proxmox config. Whether vzdump includes it depends on the mount point type and its backup flag, so verify against the actual archive contents (Task 14) rather than assuming either way.

Decision: If it’s external/shared data, exclude it from container backup and back it up as its own dataset/export.

Task 8: Verify xattr/ACL capability on backup target filesystem

cr0x@server:~$ findmnt -T /mnt/pve/backup-nfs -o TARGET,FSTYPE,OPTIONS
TARGET            FSTYPE OPTIONS
/mnt/pve/backup-nfs nfs4   rw,relatime,vers=4.1,sec=sys,hard,timeo=600,retrans=2

Meaning: NFSv4 may support xattrs, but server export config matters. Also, your backup target only needs to store the tar file; xattrs matter more on restore target.

Decision: Separate “can write archive file” from “can restore metadata.” Test restore on the intended target storage.

Task 9: Validate a backup archive without restoring it (detect corruption)

cr0x@server:~$ sudo tar -tf /mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst | head
./
./etc/
./etc/hostname
./etc/hosts
./var/
./var/lib/
./var/lib/app-data/
./var/lib/app-data/cache/

Meaning: You can list the archive. If this fails with “Unexpected EOF,” the archive is corrupt/incomplete.

Decision: If corrupted: investigate storage/network interruption, not container permissions.

Task 10: Confirm available space and inodes on restore target

cr0x@server:~$ df -h /rpool/data /rpool/data/subvol-101-disk-0
Filesystem                    Size  Used Avail Use% Mounted on
rpool/data                    600G  410G  190G  69% /rpool/data
rpool/data/subvol-101-disk-0   16G   16G  200M  99% /rpool/data/subvol-101-disk-0

cr0x@server:~$ df -i /rpool/data
Filesystem       Inodes  IUsed    IFree IUse% Mounted on
rpool/data            0      0        0     - /rpool/data

Meaning: ZFS allocates inodes dynamically, so df -i isn't a meaningful limit here and the empty figures are expected. Focus on dataset quotas and available space.

Decision: If restoring to ext4/xfs, inode exhaustion is real; on ZFS, check quota properties instead.

Task 11: Check ZFS dataset quotas that can fake “no space left”

cr0x@server:~$ zfs get -o name,property,value -H refquota,quota,used,avail rpool/data/subvol-101-disk-0
rpool/data/subvol-101-disk-0	refquota	16G
rpool/data/subvol-101-disk-0	quota	none
rpool/data/subvol-101-disk-0	used	15.8G
rpool/data/subvol-101-disk-0	avail	200M

Meaning: The dataset is boxed in by refquota. Even if the pool has terabytes free, this subvolume doesn’t.

Decision: Increase size in Proxmox or bump refquota before restore, or restore to a larger dataset and adjust container disk size afterward.
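Growing the disk through Proxmox keeps the container config and the ZFS refquota in sync, which is usually what you want. A sketch, assuming 4 GiB of extra headroom is enough:

cr0x@server:~$ sudo pct resize 101 rootfs +4G

Setting refquota directly with zfs set also works, but then the size in the container config no longer matches reality, and the next person to read it will make the wrong assumption.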

Task 12: Inspect mapping for unprivileged containers (host side)

cr0x@server:~$ sudo grep -E '^(root|lxc)' /etc/subuid /etc/subgid
/etc/subuid:root:100000:65536
/etc/subgid:root:100000:65536

Meaning: Host root has a range for mapping. If these ranges are missing or too small, unprivileged containers and restores can behave strangely.

Decision: Ensure consistent subuid/subgid ranges across cluster nodes, especially if you migrate/restore on different hosts.

Task 13: Run a restore in a scratch container ID to test without committing

cr0x@server:~$ sudo pct restore 999 /mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst --storage local-zfs
restoring 'vzdump-lxc-101-2025_12_26-10_00_00.tar.zst' to VM 999 ...
extracting archive '/mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst'
tar: ./var/lib/app-data/cache: Cannot open: Permission denied
tar: Exiting with failure status due to previous errors
ERROR: Restore failed - command 'zstd -q -d -c ... | tar ...' failed: exit code 2

Meaning: Restore fails while extracting the bind-mounted data that was captured in the archive. The problem travels with the archive and its ownership metadata, not with the container runtime.

Decision: Exclude that mount from backup, or fix access to it during backup/restore. If it’s external data, don’t put it in the rootfs backup.
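Once the scratch restore has answered the question (either way), clean up so the scratch CTID stays free for the next test:

cr0x@server:~$ sudo pct destroy 999

If the restore aborted before the container was created, there may be nothing to destroy; a leftover partial volume on the target storage is still worth checking for.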

Task 14: Confirm whether mount points are inside the archive (and decide if they should be)

cr0x@server:~$ sudo tar -tf /mnt/pve/backup-nfs/dump/vzdump-lxc-101-2025_12_26-10_00_00.tar.zst | egrep '^./var/lib/app-data' | head
./var/lib/app-data/
./var/lib/app-data/cache/
./var/lib/app-data/cache/tmp.db

Meaning: The bind-mounted data is included in the archive.

Decision: If that data lives on shared storage (NFS) and has its own lifecycle, exclude it and back it up at the storage layer. Otherwise, fix host access to read it.

Task 15: Catch AppArmor/LSM interference (rare, but painful)

cr0x@server:~$ dmesg -T | egrep -i 'apparmor|denied|audit' | tail
[Thu Dec 26 11:12:22 2025] audit: type=1400 audit(1766747542.123:410): apparmor="DENIED" operation="open" profile="lxc-pct" name="/srv/data/cache/tmp.db" pid=31222 comm="tar" requested_mask="r" denied_mask="r" fsuid=0 ouid=100123

Meaning: The kernel denied tar’s read, not classic Unix permissions. That’s a policy problem.

Decision: Adjust LXC/AppArmor profile or move backup/restore path to locations allowed by policy. Don’t “disable AppArmor” as a first resort.

Three corporate mini-stories from the trenches

Incident caused by a wrong assumption: “The bind mount is part of the container, so it’s backed up”

A mid-sized SaaS shop ran Proxmox with a tidy LXC fleet. Application teams used bind mounts for anything stateful:
/srv/postgres into the DB container, /srv/uploads into the web container, and so on.
The infra team assumed that because Proxmox showed the mount in the container config, it was “included in the backup.”

The failure started quietly. Backups “succeeded” for weeks because the bind-mounted paths were readable most of the time.
Then an NFS export change went in: root squashing was enabled broadly as part of a security sweep.
That night, vzdump failed on several containers with Permission denied. The scheduler still reported “job ran,” and nobody looked.

Two months later a container was corrupted during an upgrade. Restore “worked,” the container booted, and the app came up—empty.
The archive had the rootfs, but the critical bind-mounted data had never been consistently captured. Some restores failed outright; others restored a stale subset.
It was the worst kind of incident: apparently successful recovery with incorrect state.

The fix wasn’t heroic. They stopped treating bind mounts as part of container backup scope.
Database containers got logical backups plus storage snapshots. Uploads got object storage replication.
vzdump backups remained for rootfs and config, but they were no longer pretending to be the whole truth.

Optimization that backfired: “Let’s back up to SMB because it’s cheaper and already there”

Another org had a Windows-based NAS cluster and a mandate: consolidate backups onto it.
Someone pointed Proxmox’s backup storage at an SMB share. It worked for a handful of containers, so they rolled it out widely.
Costs looked good. The spreadsheet smiled.

The first crack appeared during restore testing. Small containers restored fine; larger ones failed intermittently with tar errors:
Cannot set xattr, Operation not supported, and occasionally Unexpected EOF.
They “fixed” EOF by increasing timeouts and retried until green.
That’s not a fix; that’s gambling with extra steps.

The deeper problem: SMB semantics and mount options varied by node. Some nodes mounted with different uid/gid mapping.
Some preserved xattrs, some didn’t. The same backup restored differently depending on where you ran it.
Restore became a lottery with better logging.

They eventually pivoted to using SMB only as a dumb repository for backup files produced elsewhere, not as a filesystem to restore rootfs onto.
For restores, they extracted onto ZFS-backed storage on the Proxmox host. Problem vanished. Costs rose a bit. Sleep improved a lot.

Boring but correct practice that saved the day: routine restore drills and a “scratch restore” ID

A regulated environment ran Proxmox with ZFS and strict change control. Their backup job had one “annoying” requirement:
every week, restore two random containers into a scratch ID range (900–999) on an isolated network.
No one loved spending time on it, but it was policy.

One week, the restore drill failed on a container that had recently been converted to unprivileged.
Tar errors showed up around ownership and xattrs. It wasn’t a crisis because it was caught in the drill, not in an outage.

The cause was mundane: a node had inconsistent /etc/subuid ranges after a rebuild.
Backups were fine. Restores were host-dependent. They fixed the mapping, reran the drill, and moved on.
The real win: they found it before an incident forced them to care.

This is the boring truth: the restore drill wasn’t glamorous, but it prevented a high-stress, high-visibility failure later.
They didn’t need heroics because they had a routine.

Common mistakes (symptom → root cause → fix)

1) Symptom: Backup fails with “Permission denied” on a path under /var/lib/...

Root cause: That path is a bind mount from the host; host permissions (or NFS root_squash) block tar.

Fix: Ensure host root can read the bind mount source; or exclude the mount from container backup and back it up separately.

2) Symptom: Restore fails with “Cannot set xattr” or “Operation not supported”

Root cause: Restore target filesystem/export doesn’t support the xattrs/ACLs from the archive.

Fix: Restore onto POSIX-native storage (ZFS, ext4, xfs). Avoid restoring rootfs onto CIFS; validate NFS export xattr support.

3) Symptom: Restore fails with “Cannot change ownership” or lots of numeric UIDs in logs

Root cause: Unprivileged container UID shifting; restore environment can’t represent or apply those ownerships.

Fix: Keep consistent subuid/subgid across nodes. Restore onto storage that supports chown and large UIDs. Don’t mix privileged/unprivileged without a plan.

4) Symptom: “Unexpected EOF” when listing or restoring an archive

Root cause: Incomplete archive (write interrupted, out of space, network drop, killed process).

Fix: Check storage capacity, NFS stability, and system logs for OOM kills. Prefer local fast storage for backup staging if network is flaky.

5) Symptom: Backup succeeds but restore produces wrong permissions inside the container

Root cause: Restore onto filesystem that remaps ownership (some NFSv4 setups) or loses ACLs/xattrs.

Fix: Restore onto ZFS/ext4/xfs locally, then migrate. Or adjust NFS idmapping to be consistent and test.

6) Symptom: Restore fails with ENOSPC though the pool has space

Root cause: ZFS refquota/quota or btrfs qgroups limit the dataset/subvolume.

Fix: Increase dataset size/quotas before restore. Don’t trust “pool free” as the final answer.

7) Symptom: “File changed as we read it” warnings, sometimes followed by broken app state

Root cause: Backup taken without a proper snapshot boundary for a busy workload.

Fix: Use snapshot mode on snapshot-capable storage, or stop mode for consistency. For databases, do logical backups.

8) Symptom: Restore works on one node but fails on another

Root cause: Inconsistent host configuration: subuid/subgid, mount options, storage plugin versions, or NFS mounts.

Fix: Standardize node configs, treat them as cattle, and run restore drills on multiple nodes.

Checklists / step-by-step plan

Step-by-step: from failed log to root cause in under 30 minutes

  1. Get the first failing path from journalctl or the vzdump log. Ignore the last error line; it’s usually generic.
  2. Map the path to container config: is it under a bind mount (mpX)? If yes, treat it as external storage.
  3. Confirm storage types (rootfs storage, backup storage, restore target storage) with pvesm status and findmnt.
  4. Check permissions on the host for the exact failing file/dir. If it’s NFS/CIFS, check export/share policy, not just mode bits.
  5. Re-run with no compression to get deterministic errors and avoid CPU noise.
  6. Validate archive integrity via tar -t before attempting restore repeatedly.
  7. For unprivileged containers, verify subuid/subgid consistency and avoid restoring onto non-POSIX filesystems.
  8. Fix the cause, then run a scratch restore to a new CTID before touching the production container ID.

Operational checklist: make LXC restores boring

  • Keep container rootfs on ZFS or a sane local POSIX filesystem.
  • Use snapshot backups when supported; use stop backups for correctness-sensitive services.
  • Do not rely on bind mounts for “automatic inclusion” in backups. Decide explicitly.
  • Standardize /etc/subuid and /etc/subgid across all cluster nodes.
  • Store archives wherever you want, but restore onto storage that preserves metadata.
  • Run routine scratch restores. Rotate which node performs them.
  • Track backup failures as alerts, not as “someone will notice.”

Decision guide: choose the right backup scope

  • Container rootfs + config: vzdump is fine. Restores are fast and predictable on ZFS/local storage.
  • Bind-mounted stateful data: back up using the storage system that owns it (ZFS snapshots, NFS server snapshots, database-native dumps).
  • Databases: prioritize logical backups and replication over filesystem tar consistency.

FAQ

1) Why does Proxmox use tar for LXC backups instead of snapshots only?

Portability. A tar archive can be stored anywhere and restored onto different backends. Snapshots are storage-specific; tar is the lowest common denominator.

2) Is “exit code 2” from tar always a fatal backup failure?

In this context, treat it as fatal. Tar uses exit code 2 for “fatal errors.” Proxmox will usually mark the backup failed and may delete partial output.

3) Can I ignore “file changed as we read it” warnings?

For a busy app server, sometimes. For a database or anything transactional, no. If you want consistent restores, use snapshot mode or stop mode, plus app-consistent backups.

4) Why do unprivileged containers complicate restore?

Because the on-disk ownership is shifted (high UIDs/GIDs). The restore process must faithfully recreate those numeric IDs and metadata. Some storage backends won’t.

5) My backups are stored on NFS. Is that bad?

Not inherently. Storing archives on NFS is common. The risk increases if your container data itself is on NFS via bind mounts, or if you try to restore rootfs onto NFS.

6) Why does restoring onto CIFS sometimes fail with xattr/ACL errors?

CIFS is not POSIX-native. Whether it supports Linux xattrs, ACLs, and device nodes depends on server features and mount options. Containers expect Linux semantics.

7) What’s the safest restore workflow?

Restore to a new CTID on local ZFS (or ext4/xfs), validate boot and application checks, then swap traffic and decommission the broken container.
Don’t restore in place unless you’re willing to extend the outage.

8) Why does restore fail with “no space left” when ZFS pool has plenty of free space?

Because the dataset may be limited by refquota or similar. Restore writes into the dataset, not the whole pool.
Check dataset avail, not pool free space.

9) How do I handle bind mounts in backups the “right” way?

Decide whether the bind-mounted data is part of the container’s recovery objective. If it is, back it up explicitly using a method suited to that storage.
If it isn’t, exclude it from container backup scope and document the dependency.

10) Should I switch everything to privileged containers to avoid UID mapping issues?

No. That trades operational convenience for a larger blast radius. Fix mapping consistency and storage compatibility instead. Privileged containers are not a free lunch.

Conclusion: next steps that actually reduce risk

When Proxmox LXC backup/restore fails with tar errors, the winning move is to stop reading the tar output like it’s a riddle.
It’s usually telling you one of three truths: the host can’t read a path, the target can’t represent metadata, or the storage ran out of something mid-stream.

Do this next:

  1. Run a scratch restore for one representative container today. If you can’t restore on demand, treat that as an incident.
  2. Inventory bind mounts across your containers and decide whether each is in scope for container-level backups.
  3. Standardize subuid/subgid and storage mount options across Proxmox nodes. Drift is how “works on my node” happens.
  4. Prefer restoring onto ZFS/local POSIX storage. Use network shares for storing archives, not for being the filesystem that must hold a container rootfs.
  5. Turn restore drills into routine. Boring is the point.