You schedule backups, go to bed, and wake up to the kind of alert that makes coffee taste like regret: “backup storage not available on node”. You check the Proxmox GUI. Storage says “Shared”. Your brain says “So it’s available everywhere.” Reality says “Cute.”
This error is Proxmox being blunt: the node running the backup can’t use that storage right now. The “Shared” checkbox is not a magic distributed filesystem fairy. It’s metadata and assumptions. The fix is to prove those assumptions on every node, in the order that actually matters: mount, reachability, permissions, identity mapping, and config consistency.
Fast diagnosis playbook
When you need answers fast, don’t wander. Run this like a checklist. The goal is to find where “shared” breaks: config, mount, network, permissions, or identity.
First: confirm what node is failing and what storage Proxmox thinks it’s using
- Look at the backup job history: which node executed it? If you have a cluster, jobs can run on multiple nodes depending on where the VM is.
- Identify the storage ID (e.g., backup-nfs, pbs01).
- Check that Proxmox sees it as "active" on that node.
Second: prove the mount/path exists on the failing node
- pvesm status and pvesm path for that storage.
- findmnt for the mountpoint.
- Create a test file as root in the target directory.
Third: isolate network vs permissions vs identity mapping
- Network: ping, nc (for PBS), showmount (for NFS), an SMB probe for CIFS.
- Permissions: try touch, check ownership and mode, inspect NFS export options (root_squash and friends).
- Identity: check whether the backup process runs as root; confirm the remote side expects root or maps it.
Fourth: check cluster config consistency and split-brain symptoms
- pvecm status for quorum.
- Verify /etc/pve/storage.cfg content is identical across nodes (it usually is, unless pmxcfs is unhappy or someone edited local files incorrectly).
- Check time sync; Kerberos-based SMB and some TLS setups get spicy when clocks drift.
Stop early when you find the first broken layer. Fixing permissions won’t help if the mount never existed. Rebooting won’t help if you’re pointing to the wrong DNS name from one node.
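If you want those four passes as one quick sweep, here is a minimal sketch to run on the failing node. The storage ID backup-nfs, mountpoint /mnt/pve/backup-nfs, and host nas01 match the examples later in this article and will differ in your environment:
cr0x@pve2:~$ pvesm status | grep backup-nfs
cr0x@pve2:~$ findmnt /mnt/pve/backup-nfs || echo "not mounted"
cr0x@pve2:~$ getent hosts nas01
cr0x@pve2:~$ nc -vz nas01 2049
cr0x@pve2:~$ sudo touch /mnt/pve/backup-nfs/.write-test && sudo rm /mnt/pve/backup-nfs/.write-test
Whichever line fails first tells you which layer to fix; don't bother with the later ones until it passes.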
Facts and history that explain the trap
- Fact 1: Proxmox's cluster filesystem (pmxcfs) is a user-space, replicated config store. It distributes configuration, not mounts or kernel state.
- Fact 2: The "shared" attribute predates many people's current expectation of "cloud semantics". It's from the era where admins knew an NFS mount was their job, not the hypervisor's.
- Fact 3: In Linux, a mount is per-node kernel state. No cluster config file can “share” a mount unless you deploy it on each node.
- Fact 4: NFS's root_squash behavior is a security default that frequently collides with backup software running as root. It's not a Proxmox bug; it's your security policy meeting your assumptions.
- Fact 5: CIFS/SMB "permissions" are a multi-layer cake: server ACLs, share permissions, client mount options, and sometimes ID mapping. It's impressive when it works.
- Fact 6: Proxmox Backup Server (PBS) is not a filesystem mount; it’s an API-backed datastore with chunking and dedup. Availability errors there are often network/TLS/auth, not “mount missing”.
- Fact 7: Proxmox’s storage plugins often run checks by attempting to access paths or perform operations; “not available” can mean “can’t stat directory”, “wrong content type”, or “backend unreachable”.
- Fact 8: Systemd changed the game for mounts: x-systemd.automount can make mounts "lazy", which is great for boot speed and terrible for time-bound backup jobs if misconfigured.
One quote that belongs in every on-call handbook:
“Hope is not a strategy.” — paraphrased idea attributed to many operations leaders
Shared storage is one of those places where hope shows up wearing a “works on my node” T-shirt.
Joke #1: “Shared” storage is like “shared” responsibility in a postmortem: everyone agrees it exists, and nobody is sure who owns it.
Why the storage is “not available”: the real failure modes
1) The storage definition is cluster-wide, but the backend is not mounted on every node
This is the classic. The storage is defined as dir or NFS/CIFS, but only one node has the actual mount or directory. Another node tries to run a backup job and finds an empty directory, a missing mountpoint, or a local path that is not what you think it is.
2) The mount exists, but it’s mounted differently (options, versions, paths)
NFSv3 on one node and NFSv4 on another. Different rsize/wsize. Different vers=. Different credential cache for SMB. Or worse: one node mounted a different export with the same mountpoint name. The GUI doesn’t scream; it just fails later.
3) Permissions: root squashed, ACL mismatch, or wrong owner
VZDump and many Proxmox operations run as root. If your NFS server maps root to nobody and your directory isn’t writable for that user, Proxmox gets an I/O or permission error and reports “not available” or a backup failure. You might see the storage as “active”, but writes fail.
4) DNS/routing asymmetry between nodes
Node A can resolve nas01 to the right IP. Node B resolves it to an old IP, a different VLAN, or a dead interface. Or one node routes through a firewall that blocks NFS ports. Storage “works” until a job lands on the wrong node.
5) Cluster state issues: quorum loss or pmxcfs weirdness
If a node loses quorum, some cluster operations are restricted. Storage config might be readable but changes may not propagate the way you expect. Also, a node with a sick cluster filesystem can present stale config. It’s rarer, but it’s real.
6) Backup target isn’t “storage” in the filesystem sense (PBS)
PBS failures look like storage failures in the GUI because Proxmox treats it as a storage backend. But the root causes are different: expired fingerprints, TLS trust issues, auth tokens, datastore removed, or network port blocked.
7) Boot ordering and “nofail” mounts create a time bomb
nofail in /etc/fstab lets the node boot even if the NAS is down. Great. But now the node is “up” and can schedule backups, while the backup directory is actually just an empty local folder. That’s not a backup; that’s performance art.
Joke #2: A backup that writes to an unmounted directory is Schrödinger’s disaster: it both succeeded and failed until you need to restore.
Practical tasks (commands, outputs, decisions)
These are the tasks I run when diagnosing “backup storage not available on node”. Each one includes a realistic command, a plausible snippet of output, and what decision you make from it. Run them on the failing node first, then compare to a known-good node.
Task 1: Identify storage status from Proxmox’s point of view
cr0x@pve2:~$ pvesm status
Name Type Status Total Used Available %
local dir active 98.00GiB 21.13GiB 71.87GiB 21.56%
backup-nfs nfs inactive 0B 0B 0B 0.00%
pbs01 pbs active n/a n/a n/a n/a
What it means: Proxmox thinks backup-nfs is inactive on this node. That’s not a backup problem; it’s a mount/reachability problem.
Decision: Skip VM-level troubleshooting. Fix the storage backend first.
Task 2: Confirm the storage configuration exists and is marked shared
cr0x@pve2:~$ grep -A6 -n "backup-nfs" /etc/pve/storage.cfg
18:nfs: backup-nfs
19: export /exports/pve-backups
20: path /mnt/pve/backup-nfs
21: server nas01
22: content backup
23: prune-backups keep-last=7
24: shared 1
What it means: The storage is configured cluster-wide as NFS, content type “backup”, and “shared 1”.
Decision: Now prove the path is actually mounted and writable.
Task 3: Ask Proxmox where it expects the path to be
cr0x@pve2:~$ pvesm path backup-nfs
/mnt/pve/backup-nfs
What it means: That’s the mountpoint Proxmox will use.
Decision: Validate kernel mount state at that path.
Task 4: Verify the mount is real (not just an empty directory)
cr0x@pve2:~$ findmnt /mnt/pve/backup-nfs
What it means: No output usually means “not mounted”. If it returns an NFS line, you’re mounted.
Decision: If not mounted, check /etc/fstab and systemd units; if mounted, move to permissions and connectivity.
Task 5: Compare with a known-good node to catch “same name, different mount”
cr0x@pve1:~$ findmnt /mnt/pve/backup-nfs
TARGET SOURCE FSTYPE OPTIONS
/mnt/pve/backup-nfs nas01:/exports/pve-backups nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2
What it means: This is what “good” looks like: correct server, correct export, correct mountpoint.
Decision: Make the failing node match this, byte-for-byte where possible.
Task 6: Attempt to mount (and capture the actual error)
cr0x@pve2:~$ mount -av
/ : ignored
/mnt/pve/backup-nfs : mount.nfs4: Connection timed out
What it means: Network reachability problem to nas01 (routing, firewall, DNS, or NAS down).
Decision: Stop and troubleshoot network before touching permissions or Proxmox settings.
Task 7: Test DNS and routing symmetry
cr0x@pve2:~$ getent hosts nas01
10.20.30.50 nas01
What it means: Node resolves nas01 to an IP. That doesn’t prove it’s the right IP.
Decision: Compare output across nodes. If different, fix DNS/hosts. If same, check connectivity to that IP.
Task 8: Prove network reachability (ICMP is not enough, but it’s a start)
cr0x@pve2:~$ ping -c 2 10.20.30.50
PING 10.20.30.50 (10.20.30.50) 56(84) bytes of data.
64 bytes from 10.20.30.50: icmp_seq=1 ttl=63 time=0.462 ms
64 bytes from 10.20.30.50: icmp_seq=2 ttl=63 time=0.497 ms
--- 10.20.30.50 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1004ms
rtt min/avg/max/mdev = 0.462/0.479/0.497/0.017 ms
What it means: The host is reachable, but NFS still might be blocked.
Decision: Test NFS specifically (portmapper for v3, 2049 for v4).
Task 9: Test NFS service reachability
cr0x@pve2:~$ nc -vz 10.20.30.50 2049
Connection to 10.20.30.50 2049 port [tcp/nfs] succeeded!
What it means: NFSv4 TCP port reachable. If this fails, it’s firewall/routing or NAS service down.
Decision: If reachable, check exports and permissions; if not, fix network/security first.
Task 10: Verify the export exists (server-side view via showmount)
cr0x@pve2:~$ showmount -e nas01
Export list for nas01:
/exports/pve-backups 10.20.30.0/24
What it means: The NAS claims it exports the path to your subnet.
Decision: If your node isn’t in that subnet (different VLAN), you’ve found the mismatch.
Task 11: If mounted, validate you can write as root (the backup reality)
cr0x@pve2:~$ sudo sh -c 'touch /mnt/pve/backup-nfs/.pve-write-test && ls -l /mnt/pve/backup-nfs/.pve-write-test'
-rw-r--r-- 1 root root 0 Dec 26 03:12 /mnt/pve/backup-nfs/.pve-write-test
What it means: Root can create files. That’s the minimum for backup storage.
Decision: If this fails with “Permission denied”, inspect NFS export options and directory permissions.
Task 12: Detect root squashing (common on NFS) from the client side
cr0x@pve2:~$ stat -c "%U %G %a %n" /mnt/pve/backup-nfs
nobody nogroup 755 /mnt/pve/backup-nfs
What it means: The server may be mapping root to nobody, and the directory mode is not writable.
Decision: Either adjust export settings (carefully) or create a dedicated backup user and align UID/GID across nodes and NAS.
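If you choose the dedicated-identity route, here is a hedged sketch of the server side. The UID/GID 34000 is arbitrary, and the export syntax assumes a Linux NFS server managed through /etc/exports (NAS appliances expose the same knobs under different names):
# On the NFS server (illustrative values)
# /etc/exports: squash root to a dedicated identity instead of nobody
/exports/pve-backups 10.20.30.0/24(rw,sync,root_squash,anonuid=34000,anongid=34000)
# Make that identity own the export, then reload exports
chown 34000:34000 /exports/pve-backups
chmod 770 /exports/pve-backups
exportfs -ra
With that mapping, root on the Proxmox nodes is squashed to UID 34000, which owns the directory, so backup writes succeed without granting no_root_squash.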
Task 13: Check for the “mounted but stale” condition (NFS hiccups)
cr0x@pve2:~$ timeout 5 ls -la /mnt/pve/backup-nfs | head
total 16
drwxr-xr-x 2 root root 4096 Dec 26 02:10 .
drwxr-xr-x 10 root root 4096 Dec 26 01:55 ..
-rw-r--r-- 1 root root 0 Dec 26 03:12 .pve-write-test
What it means: Directory listing returns quickly. If it hangs until timeout, you likely have a stale mount or network flaps.
Decision: Investigate network stability, NFS server load, and consider hard mounts with sane timeouts; avoid “soft” for backup integrity.
Task 14: For CIFS/SMB-based backup storage, verify mount options and credential use
cr0x@pve3:~$ findmnt /mnt/pve/backup-smb
TARGET SOURCE FSTYPE OPTIONS
/mnt/pve/backup-smb //files01/backups cifs rw,relatime,vers=3.1.1,cache=strict,username=svc_pve,uid=0,gid=0,file_mode=0640,dir_mode=0750
What it means: SMB mount exists and maps ownership to root. That’s common for Proxmox backup targets.
Decision: If a node lacks this mount or uses different credentials, standardize via /etc/fstab or a systemd mount unit deployed uniformly.
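A hedged sketch of that standardization, assuming a root-only credentials file at /etc/cifs-backup.cred (the path, domain, and account name are illustrative):
# /etc/cifs-backup.cred on every node, chmod 600
username=svc_pve
password=CHANGEME
domain=CORP

# /etc/fstab entry, identical on every node
//files01/backups /mnt/pve/backup-smb cifs credentials=/etc/cifs-backup.cred,vers=3.1.1,uid=0,gid=0,file_mode=0640,dir_mode=0750,_netdev 0 0
Keeping credentials in a file rather than inline in fstab means the mount line itself can be diffed across nodes without exposing secrets.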
Task 15: For Proxmox Backup Server storage, validate API reachability and auth
cr0x@pve2:~$ nc -vz pbs01 8007
Connection to pbs01 8007 port [tcp/*] succeeded!
What it means: Network path to PBS API is open. That’s step one.
Decision: If blocked, fix firewall/VLAN. If open, check Proxmox storage config and certificates/fingerprint.
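If the port is open, the next thing to compare is what the node thinks it is talking to. The PBS entry in storage.cfg carries the server, datastore, user, and pinned fingerprint; the values below are placeholders, not real ones:
cr0x@pve2:~$ grep -A6 "pbs: pbs01" /etc/pve/storage.cfg
pbs: pbs01
        datastore main
        server pbs01
        content backup
        username backup@pbs
        fingerprint <pinned sha256 fingerprint>
        prune-backups keep-last=14
If the fingerprint on the PBS side changed because of a planned certificate renewal, update it here deliberately; if you can't explain the change, stop and treat it as a security event.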
Task 16: Inspect backup job logs for the exact storage-related failure
cr0x@pve2:~$ journalctl -u pvedaemon -u pvescheduler --since "today" | grep -i backup | tail -n 12
Dec 26 02:01:01 pve2 pvescheduler[1890]: starting new backup job: vzdump 101 --storage backup-nfs --mode snapshot
Dec 26 02:01:02 pve2 vzdump[2033]: ERROR: storage 'backup-nfs' is not available on node 'pve2'
Dec 26 02:01:02 pve2 vzdump[2033]: INFO: Failed at 2025-12-26 02:01:02
What it means: The error is coming from the backup tool invoking storage checks. It’s not a VM snapshot failure; it’s storage availability.
Decision: Focus on why pvesm status shows inactive or why the path check fails.
Task 17: Confirm the node is healthy in the cluster (quorum and membership)
cr0x@pve2:~$ pvecm status
Cluster information
-------------------
Name: prod-pve
Config Version: 42
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Fri Dec 26 03:18:10 2025
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000002
Ring ID: 1.10
Quorate: Yes
What it means: Cluster is quorate. You’re not debugging storage under a cluster partition (good).
Decision: If Quorate: No, stabilize the cluster first; storage config and job scheduling can get unpredictable.
Task 18: Validate time sync (quiet cause of TLS and SMB weirdness)
cr0x@pve2:~$ timedatectl
Local time: Fri 2025-12-26 03:19:44 UTC
Universal time: Fri 2025-12-26 03:19:44 UTC
RTC time: Fri 2025-12-26 03:19:44
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
What it means: Clock is synchronized. This reduces the chance of TLS handshake failures with PBS or SMB auth edge cases.
Decision: If out of sync, fix time first. It’s boring and it works.
Task 19: Check that the storage mount survives reboot the way you think it does
cr0x@pve2:~$ grep -n "backup-nfs" /etc/fstab
12:nas01:/exports/pve-backups /mnt/pve/backup-nfs nfs4 rw,hard,timeo=600,retrans=2,_netdev 0 0
What it means: There is an fstab entry. Good. Options matter.
Decision: If missing on one node, that’s why “shared” isn’t shared. Standardize mounts across nodes.
Task 20: Spot the dangerous “nofail” + local directory trap
cr0x@pve2:~$ grep -n "backup-nfs" /etc/fstab
12:nas01:/exports/pve-backups /mnt/pve/backup-nfs nfs4 rw,nofail,_netdev 0 0
What it means: If NFS is down at boot, the mount might not occur, but the directory still exists, and jobs may write locally.
Decision: Replace with systemd automount or explicit dependency ordering so services don’t proceed without storage (details in the checklist section).
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
They had a three-node Proxmox cluster and one shiny “backup” storage pointing at an NFS share. The admin who set it up did the right thing in the GUI: added NFS storage, ticked “Shared”, set content to “VZDump backup file”. Everyone nodded.
Backups succeeded for weeks. That’s the dangerous part. The only reason they succeeded is that most of the “important” VMs lived on node 1 for historical reasons, and the backup schedule ran when node 1 was healthy.
Then a host maintenance window moved a batch of VMs to node 2. The next nightly backups ran from node 2. Storage reported “not available on node.” Someone reran jobs manually on node 1 and called it “temporary.” Two days later, node 1 had an unplanned reboot and the business discovered that “temporary” is another word for “permanent.”
The root cause was not exotic: only node 1 had an /etc/fstab entry for the NFS share. Nodes 2 and 3 had the directory /mnt/pve/backup-nfs (created by Proxmox), but it wasn't mounted. The team assumed the cluster config made it "shared." It didn't.
The fix was also not exotic: identical mount configuration deployed via automation, plus a canary file check in monitoring: “is this path mounted and writable?” Incidents rarely need cleverness. They need discipline.
Mini-story 2: The optimization that backfired
A different company wanted faster boot times and fewer “node stuck at boot waiting for NAS” incidents. They changed NFS mounts to include nofail and added some systemd tweaks so the hypervisor would come up even if storage was down.
Boot times improved. On paper. In reality, they swapped one failure mode for a sneakier one: nodes came up, the Proxmox GUI looked healthy, and the backup jobs ran. But on the mornings when the NAS was slow to respond, mounts didn’t happen in time. The backup path existed as a normal directory, so the backup job wrote to local disk.
Local disk filled. VMs started pausing. Logs exploded. The storage team got paged for “NAS performance” even though the NAS was fine—Proxmox had quietly diverted workload to local storage because the mount wasn’t there.
They fixed it by using x-systemd.automount (so access triggers the mount), removing “write-to-local” risk by making the mountpoint owned by root but not writable unless mounted, and adding a pre-backup hook to validate mount status. The moral: optimizing boot behavior without thinking about failure semantics is how you create haunted systems.
Mini-story 3: The boring but correct practice that saved the day
A financial services shop ran Proxmox with PBS as the primary backup target and NFS as a secondary “export” location. They were not exciting people. They wrote everything down, and they tested restores quarterly. This is why they slept.
One weekend, a core switch firmware upgrade introduced an ACL change that blocked TCP/8007 from one rack. Two of four Proxmox nodes could reach PBS; two could not. Backups started failing only for VMs currently running on those isolated nodes.
They caught it within an hour because their monitoring didn’t just watch “backup job success.” It watched storage reachability from every node. The alert said, essentially, “pve3 cannot reach pbs01:8007.” No guessing, no archaeology.
They failed over by pinning backup jobs to healthy nodes temporarily, then rolled back the ACL change. After that, they added a change-management checklist item: “validate PBS connectivity from each hypervisor.” Boring. Correct. Saved the day.
Common mistakes: symptom → root cause → fix
1) Symptom: “storage is not available on node” only on one node
Root cause: Mount exists only on some nodes, or DNS resolves differently per node.
Fix: Standardize mounts via /etc/fstab or systemd mount units across all nodes; verify with findmnt and getent hosts on every node.
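A hedged way to run that verification in one pass, assuming root SSH works between cluster nodes and using the node names from this article's examples:
cr0x@pve1:~$ for n in pve1 pve2 pve3; do echo "== $n =="; ssh root@$n 'getent hosts nas01; findmnt -n /mnt/pve/backup-nfs || echo "NOT MOUNTED"'; done
Any node whose output differs from the others is your suspect.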
2) Symptom: storage shows “active”, but backups fail with permission errors or “cannot create file”
Root cause: NFS root_squash, SMB ACL mismatch, or wrong ownership/mode on the target directory.
Fix: Decide your security model: either allow root writes on the export (carefully) or map to a dedicated service user with consistent UID/GID; then test with touch as root.
3) Symptom: backups “succeed” but space usage is on local disk, not NAS
Root cause: Mount not present; writes went to the mountpoint directory on the local filesystem (often due to nofail at boot).
Fix: Remove the trap. Use systemd automount or enforce mount availability before scheduler runs; make the mountpoint non-writable when not mounted; monitor for “is mounted” not “directory exists”.
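The difference between "directory exists" and "is mounted" is one command; mountpoint asks the kernel rather than the directory tree:
cr0x@pve2:~$ mountpoint -q /mnt/pve/backup-nfs && echo "mounted, safe to back up" || echo "NOT mounted, refusing to write"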
4) Symptom: NFS mount works manually but not during boot
Root cause: Network not ready when mount is attempted; missing _netdev or systemd ordering problems.
Fix: Add _netdev, consider x-systemd.automount, and ensure network-online target if needed. Validate via reboot test.
5) Symptom: only PBS-backed storage fails, filesystem storages fine
Root cause: Port blocked, TLS trust/fingerprint mismatch, auth token revoked, or datastore renamed/removed on PBS.
Fix: Verify connectivity to TCP/8007, validate PBS storage config, re-approve fingerprint if it changed intentionally, and confirm datastore exists.
6) Symptom: storage intermittently “inactive” with NFS under load
Root cause: Network flaps, NFS server saturation, stale handles, or too-aggressive timeouts.
Fix: Stabilize network, tune NFS server, use hard mounts with sane timeouts, and consider separating backup traffic onto a dedicated VLAN/interface.
7) Symptom: after adding a new node, backups fail on that node only
Root cause: New node didn’t get the OS-level mount setup, firewall rules, DNS search domains, or CA trust store entries.
Fix: Treat node provisioning as code. Apply the same storage mount and network policy as existing nodes before putting it into rotation.
8) Symptom: storage config looks right, but node behaves like it’s not in the cluster
Root cause: Quorum loss, corosync issues, or pmxcfs problems.
Fix: Restore cluster health first. Validate pvecm status, network between nodes, and corosync ring stability.
Checklists / step-by-step plan
Step-by-step: make “shared” actually shared (NFS/CIFS directory style)
- Pick one canonical storage ID and mountpoint. Example: backup-nfs mounted at /mnt/pve/backup-nfs. Do not create per-node variations like /mnt/pve/backup-nfs2 "just for now". "Just for now" is how you get archaeology jobs.
- Standardize name resolution. Use getent hosts nas01 on all nodes and ensure it resolves identically. If you must pin, use /etc/hosts consistently.
- Deploy identical mount configuration to every node. Use /etc/fstab or systemd mount units. The key is identical behavior on reboot. Minimal NFSv4 example:
cr0x@pve1:~$ sudo sh -c 'printf "%s\n" "nas01:/exports/pve-backups /mnt/pve/backup-nfs nfs4 rw,hard,timeo=600,retrans=2,_netdev 0 0" >> /etc/fstab'
Decision: If you require the node to boot even when the NAS is down, don't blindly add nofail. Use automount plus guardrails.
- If you use systemd automount, do it deliberately. It avoids boot hangs and reduces "mount wasn't ready" races. Example mount options in fstab:
cr0x@pve1:~$ sudo sed -i 's#nfs4 rw,hard,timeo=600,retrans=2,_netdev#nfs4 rw,hard,timeo=600,retrans=2,_netdev,x-systemd.automount,x-systemd.idle-timeout=600#' /etc/fstab
Decision: If your workload includes tight backup windows, validate automount latency under load.
- Enforce "not mounted means not writable". One practical tactic: make the mountpoint owned by root and mode 000 when unmounted, then let the mount overlay provide permissions. This reduces "writes went local" surprises. Test it carefully so Proxmox can still mount. (A hedged command sketch follows this checklist.)
- Validate on each node: mount exists, write works, latency acceptable.
cr0x@pve2:~$ sudo mount -a && findmnt /mnt/pve/backup-nfs && sudo sh -c 'dd if=/dev/zero of=/mnt/pve/backup-nfs/.speedtest bs=1M count=64 conv=fdatasync'
64+0 records in
64+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.88 s, 76.6 MB/s
Decision: If throughput is wildly different per node, you likely have routing differences, NIC issues, or a switch path problem.
- Confirm Proxmox sees it active everywhere.
cr0x@pve3:~$ pvesm status | grep backup-nfs
backup-nfs nfs active 9.09TiB 3.21TiB 5.88TiB 35.31%
Decision: Only after this do you consider backup job scheduling tweaks.
- Add monitoring that checks mount and writability from every node. The check should be dumb on purpose: "is mounted", "can create a file", and "is the filesystem type expected". (A minimal check sketch follows.)
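For the "not mounted means not writable" step above, one hedged tactic is to lock the unmounted mountpoint down with chmod plus the immutable attribute. Note that chmod alone does not stop root, and chattr +i requires a local filesystem that supports it (ext4 does); verify afterwards that Proxmox can still activate the storage:
cr0x@pve2:~$ sudo umount /mnt/pve/backup-nfs
cr0x@pve2:~$ sudo chmod 000 /mnt/pve/backup-nfs
cr0x@pve2:~$ sudo chattr +i /mnt/pve/backup-nfs
cr0x@pve2:~$ sudo mount /mnt/pve/backup-nfs
And for the monitoring step, a deliberately dumb per-node check; the script path is arbitrary and the filesystem type matches the NFSv4 examples above:
#!/bin/sh
# /usr/local/bin/check-backup-mount.sh (path is arbitrary)
MP=/mnt/pve/backup-nfs
# Mounted with the expected filesystem type?
findmnt -n -t nfs4 "$MP" >/dev/null || { echo "CRITICAL: $MP not mounted as nfs4"; exit 2; }
# Actually writable?
F="$MP/.monitor-write-test.$(hostname)"
touch "$F" 2>/dev/null || { echo "CRITICAL: $MP mounted but not writable"; exit 2; }
rm -f "$F"
echo "OK: $MP mounted and writable"
exit 0
Run it from every node, not just one; this error is, by definition, per-node.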
Step-by-step: PBS-backed “storage not available”
- Confirm TCP reachability to PBS on port 8007 from every node. Use nc -vz.
- Confirm the storage is active in pvesm status. If inactive, it's usually connectivity or auth.
- Validate time sync. TLS hates time travel.
- Confirm PBS datastore exists and hasn’t been renamed. Storage config can outlive the thing it points to.
- Be strict about certificates and fingerprints. If a fingerprint changed unexpectedly, treat it as a security event until proven otherwise.
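A quick end-to-end sanity check from each node is to list the PBS-backed storage; if name resolution, TLS, authentication, and the datastore are all healthy, this returns your backup volumes (or nothing, if none exist yet) instead of an error:
cr0x@pve2:~$ pvesm list pbs01
If it hangs or fails, the problem is the connection, credentials, or the datastore itself, not the guests you are trying to back up.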
Step-by-step: make backups resilient without lying to yourself
- Prefer PBS for dedup + integrity. Use filesystem shares as secondary export/replication, not your only lifeline.
- Keep a local fallback only if you alert on it. A local backup directory can save you during a NAS outage, but only if you monitor local disk pressure and rotate aggressively.
- Test restores. Not once. Regularly. A backup you haven’t restored is a rumor.
FAQ
1) What does “storage is not available on node” mean in Proxmox?
It means the node executing the operation (often vzdump) can’t access that storage backend at that moment. Typically: not mounted, unreachable, wrong credentials, or permission denied.
2) If storage is marked “Shared”, why doesn’t Proxmox mount it everywhere?
Because Proxmox manages storage definitions, not kernel mounts. Mounting is OS-level state. You must configure mounts on each node (or use a backend that is inherently shared, like Ceph).
3) Can I fix this by unchecking “Shared”?
You can silence some scheduling and migration expectations, but you won’t fix the underlying problem. If multiple nodes need to back up to it, it must be accessible from multiple nodes. Make reality match the checkbox, not the other way around.
4) Why does it fail only sometimes?
Because only some backups run on the node that can’t access storage, or because mounts are flaky (automount delays, network blips, NAS load). Intermittent failures are still failures; they just wait for your worst day.
5) What’s the difference between NFS backup storage and PBS storage in Proxmox?
NFS is a mounted filesystem path. PBS is an API-based backup datastore with deduplication, compression, verification, and pruning semantics. Troubleshooting PBS availability is closer to debugging an application dependency than a mount.
6) My NFS share mounts, but backups fail with “Permission denied”. What now?
Check for root_squash and directory permissions. Proxmox backup jobs typically need root to write. Either allow root writes on that export (risk trade-off) or map to a dedicated service identity with consistent UID/GID and appropriate permissions.
7) How do I prevent backups from writing to local disk when the NFS mount is missing?
Don’t rely on directory existence. Enforce mount checks: use systemd automount and/or make the mountpoint non-writable when unmounted, and monitor “is mounted” plus “write test”. Avoid casual nofail without guardrails.
8) Does cluster quorum affect storage availability?
Not directly for mounts, but quorum loss can cause cluster services and config distribution to behave differently. If a node isn’t quorate, fix cluster health first so you aren’t debugging two problems at once.
9) Is it okay to have different mount options on different nodes if it still “works”?
It’s okay right up until it isn’t. Different NFS versions and mount options can change locking behavior, performance, and failure semantics. Standardize. Your future self will send a thank-you note.
10) Should I use CIFS/SMB for Proxmox backups?
You can, but it’s usually more fragile than NFS in Linux hypervisor environments due to auth and ACL complexity. If you must use it, standardize mount options and credentials across nodes and test failure behavior.
Conclusion: next steps that prevent repeats
“Backup storage not available on node” is Proxmox telling you the truth. The unpleasant part is that the truth lives below the GUI: mounts, networks, permissions, identity mapping, and boot ordering.
Next steps that actually change outcomes:
- Pick the failing node and run the fast diagnosis playbook. Confirm whether the storage is inactive, unmounted, unreachable, or unwritable.
- Standardize mounts across every node. Same server name, same export/share, same mountpoint, same options.
- Remove the "writes went local" trap. If you use nofail, counterbalance it with automount and monitoring.
- Add monitoring per node. Check mount + writability + expected filesystem type, not just "backup job succeeded".
- Test a restore. Not because it’s fun. Because it’s cheaper than learning during an outage.
If you want one mental model to keep: the “Shared” checkbox is a promise you make to Proxmox. Keep it, and backups become boring. Break it, and backups become a weekly surprise.