Proxmox restore speed: tuning PBS, compression choices, and why restores are slow

The backup finished in 12 minutes. The restore is “estimating…” like it’s contemplating life choices.
Meanwhile, your incident channel is doing what incident channels do: turning into a live podcast.

Proxmox Backup Server (PBS) is designed to be efficient at ingesting data, deduplicating it, and storing it safely.
Restores are a different workload. They punish your weakest link: random reads, metadata lookups, decompression,
checksum validation, and a network path that suddenly matters. This is how you get “backups are fast, restores are slow”.

How PBS restores actually work (and why it’s not just “copy the file back”)

PBS is not a dumb repository of tar files. It’s a content-addressed chunk store with deduplication and incremental backups.
During backup, the client splits data into chunks, calculates checksums, compresses chunks, and sends “new” chunks to the server.
That’s a mostly sequential write workload plus some hashing. Modern systems love sequential writes.

Restore is the inverse but not symmetrical. Restoring a VM or container requires reading a lot of chunks, in a specific order,
reassembling data, verifying integrity, decompressing, and streaming it into the target storage. Chunk reads can be less sequential
than you hope, especially when the backup is highly deduplicated across time or across machines. Dedup is great for capacity.
It can be mildly rude to your restore throughput if your underlying storage is weak on random reads or metadata operations.
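If you want to see where that "seekiness" comes from, look at the datastore on disk. A minimal sketch, assuming the datastore is mounted at /mnt/datastore as in the examples later in this article; PBS stores each chunk as its own file, fanned out into prefix directories under .chunks:

# Chunk files are named by checksum and spread across prefix subdirectories.
ls /mnt/datastore/.chunks | head -n 4                   # prefix dirs (0000, 0001, ...)
find /mnt/datastore/.chunks/0000 -type f | head -n 2    # a couple of chunk files in one prefix dir

A large restore has to open and read a great many of these files, in an order dictated by the backup's index rather than by what happens to be physically adjacent on disk.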

The restore pipeline (mental model)

  • PVE host asks PBS for snapshot content (via proxmox-backup-client / integrated tooling).
  • PBS reads chunk metadata, locates chunk files, reads them from datastore, verifies checksums.
  • PBS decompresses chunks (depending on how the data was stored).
  • PVE host receives a stream and writes it into your target storage (ZFS, LVM-thin, Ceph, directory, etc.).
  • Target storage does its own work: allocation, copy-on-write, checksumming, parity, TRIM behavior, etc.

Your restore speed is basically the minimum of four ceilings:
datastore read IOPS/throughput, CPU for decompression/checksums,
network throughput/latency, and target storage write behavior.
You only need one of them to be bad to make everything bad.
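A back-of-envelope sketch of that "minimum of four ceilings" idea. The numbers are invented for illustration; plug in your own measurements:

# Hypothetical ceilings in MB/s: datastore reads, decompression, network, target writes.
SIZE_GB=2048                                   # full image size you must recreate, not the nightly delta
READ=400; CPU=600; NET=1100; WRITE=350
MIN=$(printf '%s\n' "$READ" "$CPU" "$NET" "$WRITE" | sort -n | head -n1)
# Time ≈ size / slowest ceiling: ~2 TB at 350 MB/s is roughly 100 minutes.
awk -v gb="$SIZE_GB" -v mbs="$MIN" 'BEGIN { printf "bottleneck %d MB/s, ETA ~%.0f min\n", mbs, gb*1024/mbs/60 }'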

One of the more annoying truths: “backup speed” is often bounded by change rate, compression savings, and dedup efficiency.
“Restore speed” is bounded by how quickly you can recreate the entire image, not just the delta.

Interesting facts and historical context (because systems didn’t get weird by accident)

  1. Deduplication became mainstream in enterprise backup in the mid-to-late 2000s because disk became cheaper than tape libraries and WAN links were painful.
  2. Content-addressed storage (store blocks by hash) has roots in earlier research systems, but it exploded in popularity with tools like Git because it’s reliable and dedup-friendly.
  3. Modern compression defaults shifted from gzip to zstd in many ecosystems in the late 2010s because zstd offers high ratios with much faster decode than older codecs.
  4. “Incremental forever” backups often optimize for backup windows, not restore windows; the restore path can become a pile of indirections.
  5. ZFS popularized end-to-end checksumming in commodity storage; that’s great for correctness, but every checksum is still CPU work at restore time.
  6. RAID5/6 write penalties are old news; less remembered is that parity RAID can also be unimpressive at random reads under load, which chunk stores can trigger.
  7. NVMe changed expectations for “disk speed” so much that many teams now blame the network first—sometimes incorrectly, sometimes correctly.
  8. Backup verification jobs are a cultural reaction to silent corruption incidents from the 1990s and early 2000s; they’re worth it, but they can collide with restores.

One quote that operations people repeat for a reason, usually attributed to General Gordon R. Sullivan: hope is not a strategy.
A restore plan that assumes “it’ll be fast enough” is just hope with a budget line.

Fast diagnosis playbook (find the bottleneck in minutes)

Restores are time-sensitive. You do not have time to “optimize everything”. You need to identify the limiting factor quickly,
then change the one thing that lifts the ceiling.

First: is it PBS read-side, network, or target write-side?

  1. Check live throughput on both ends. If PBS disks are pegged but the target host is idle, you’re read-bound. If the network is saturated, it’s network-bound. If target storage is busy (high iowait) while PBS is fine, you’re write-bound.
  2. Check CPU. If one core is hot and others are idle during restore, you might be single-thread limited somewhere (decompression, checksum, or a storage driver).
  3. Check contention. If GC/verify/prune is running on PBS, you may be fighting yourself. Stop the background parade during an emergency restore.

Second: does the datastore behave like “lots of small reads” or “big sequential reads”?

  • If your PBS datastore is on HDD RAID and you see low MB/s with high IOPS pressure, chunk random-read behavior is biting you.
  • If it’s on SSD/NVMe and still slow, look at CPU, encryption/compression, or network offload issues.

Third: is the target storage the silent villain?

  • ZFS with small recordsize, heavy sync, or bad ashift can make writes sad.
  • LVM-thin under metadata pressure can tank.
  • Restoring into Ceph/RBD has its own replication and backfill realities.

Joke #1: A restore is like a fire drill—except the building is actually on fire and everyone suddenly remembers they have “questions”.

Compression choices: zstd vs lzo vs none, and what restores really pay for

Compression is the most misunderstood knob in backup systems. Teams pick a compression algorithm based on backup window,
then are surprised when restore day charges interest.

What you’re trading

  • CPU: compression and decompression cost cycles. Decompression is usually cheaper than compression, but not free.
  • Network: compressed data reduces bytes on the wire; that can massively improve restore if you’re network-limited.
  • Datastore I/O: fewer bytes read can help if you’re throughput-limited, but if you’re random-read limited, CPU might become the next bottleneck.
  • Dedup efficiency: some compression strategies interact with chunking/dedup. Generally, chunking is done on raw data and chunks are compressed afterward; this keeps dedup reasonable.

Rule of thumb that usually holds in production

  • zstd (moderate levels) is the sane default. Good ratio, fast decode, generally friendly restores.
  • lzo can be attractive for low CPU overhead, but you pay with bigger data (more disk reads, more network). On 10GbE+, that can still be fine; on 1GbE, it’s often painful.
  • no compression is rarely the answer unless you’re CPU-starved and have absurd disk and network bandwidth to spare.
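For completeness, here is where that knob usually lives on the PVE side, sketched as node-wide defaults in /etc/vzdump.conf. One nuance worth stating: backups that go to a PBS datastore compress chunks with zstd on the client anyway, so this setting mainly matters for file-based (.vma) backups to non-PBS storages.

# /etc/vzdump.conf -- node-wide defaults for vzdump-style backups
compress: zstd        # zstd | lzo | gzip | 0 (none)
zstd: 0               # zstd thread count; 0 means "use half of the available cores"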

Restore reality: the “fast backup, slow restore” compression trap

It’s common to crank compression levels to make backups finish quickly or reduce storage cost. That can backfire.
Restore requires decompressing the entire dataset you want back, not just changed blocks. If your PBS CPU is modest
and you chose a heavy compression level, you can end up CPU-bound on the server during restore.

The best practice is boring: pick a compression setting that keeps both backup and restore within your RTO.
Then validate it with actual restore tests, not vibes.
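What "actual restore tests" can look like, as a sketch. The repository string, VM 102, and its snapshot timestamp come from the examples in this article; the scratch VMID 9102 and the storage names are placeholders for your environment:

# Rough health check of the PBS path (TLS, hashing, compression/decompression, AES speeds):
proxmox-backup-client benchmark --repository pbs@pam@10.10.10.20:pbs

# Timed restore of a real snapshot into a scratch VMID on fast local storage:
time qmrestore pbs:backup/vm/102/2025-12-28T09:40:12Z 9102 --storage local-nvme --unique 1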

Storage path matters: PBS datastore disks, ZFS, RAID, SSDs, and cache myths

PBS performance is overwhelmingly about the datastore. Not the UI. Not the catalog. The datastore: where chunk files live.
If you store your datastore on slow disks, restores will remind you.

PBS datastore I/O patterns in plain terms

  • Backup ingest: lots of sequential-ish writes of new chunks plus metadata updates.
  • Restore: many reads of chunk files, which can be partially sequential but often becomes “seeky” depending on chunk distribution and dedup history.
  • Verify: reads lots of chunks to confirm integrity; it competes directly with restore for read bandwidth.
  • GC (garbage collection): can do substantial metadata work and disk reads; it’s not free.

Datastore placement recommendations (opinionated)

  • Best: NVMe mirror (or redundant SSDs) dedicated to PBS datastore, especially for many VMs and frequent restores.
  • Good: SSD RAID10 or ZFS mirror with decent capacity. Random reads matter.
  • Acceptable: HDD RAID10 for larger archives where restores are rare and RTO is flexible.
  • Avoid: parity RAID (RAID5/6) of HDDs for active restore workloads, unless you’ve tested and accepted the restore time.
  • Also avoid: putting PBS datastore on the same pool that is also running VM workloads. That’s self-harm with extra steps.

ZFS specifics (because most PBS boxes end up on ZFS)

ZFS can be excellent for PBS: checksumming, snapshots, and predictable behavior. But you must respect how ZFS allocates and caches.
A PBS datastore is a file workload with lots of reads and metadata lookups. That’s ARC pressure and potentially small-block read behavior.

  • ARC size: enough RAM helps metadata and hot chunks. Not enough RAM means more disk reads and slower restores.
  • Recordsize: for a filesystem storing chunk files, default recordsize is often fine. Don’t randomly tune recordsize unless you know the chunk size distribution and have measured outcomes.
  • SLOG: generally irrelevant for PBS restores; SLOG affects synchronous writes, not reads. If someone tries to “fix restore speed with SLOG”, you’ve found a confident guesser.
  • L2ARC: can help if your working set of chunks is cacheable and repeated. For one-off disaster restores, it often warms slowly and does little.

Joke #2: L2ARC is like an intern—useful after onboarding, but it won’t save you on day one of the incident.
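Two quick checks before anyone starts "tuning" ZFS on the PBS box, as a sketch. The dataset name rpool/pbsdata matches the df example later in this article; substitute your own:

arc_summary | head -n 40                              # ARC size, hit ratio, metadata pressure
zfs get recordsize,compression,atime rpool/pbsdata    # confirm what the datastore dataset is actually set to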

Network and transport: when 10GbE still restores like it’s 2009

Network issues are boring until they’re not. A restore is a sustained stream that makes every bad NIC setting and every
oversubscribed switch suddenly visible.

What to watch

  • MTU mismatches: jumbo frames on one end but not the other lead to fragmentation or blackholing. Performance becomes “mysteriously bad”.
  • CPU per packet: small packets at high rates can peg a core. This happens more than people admit.
  • Bonding/LACP mistakes: you may not get aggregate throughput for a single flow; many restores are a single TCP stream.
  • Firewall/IPS: deep packet inspection on the backup VLAN is a great way to turn restores into performance art.

Latency matters more than you think

Deduplicated chunk fetch patterns can involve many small reads and metadata lookups. If the client-server interaction requires
lots of request/response cycles, latency and packet loss matter. This is why “10GbE” is not a guarantee. It’s a ceiling.
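Measuring the real single-flow ceiling is cheap and ends a lot of arguments. A sketch with iperf3, using 10.10.10.20 as the PBS host like the other examples here:

# On the PBS host:  iperf3 -s
# On the PVE host:
iperf3 -c 10.10.10.20 -P 1 -t 30     # one stream, roughly what a single restore gets
iperf3 -c 10.10.10.20 -P 4 -t 30     # if four streams are much faster, suspect LACP hashing / single-flow limits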

PBS tuning that actually moves the needle

PBS doesn’t have 400 hidden performance knobs, which is a feature. The wins are mostly architectural:
fast storage, enough CPU, enough RAM, and scheduling background jobs so they don’t fight restores.

Schedule and contention: the easiest win

  • Verify jobs are important but should not run during business hours if restores need to be fast.
  • GC should be scheduled with awareness of restore windows (see the scheduling sketch after this list). GC plus restore equals sad disks.
  • Prune is usually lighter but can still cause churn if snapshots are huge and frequent.
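Where those schedules live, sketched with placeholder names: the job ID and the datastore name store1 are yours to substitute, and the calendar expressions are only examples. Pick windows that cannot collide with likely restore hours.

proxmox-backup-manager verify-job list
proxmox-backup-manager verify-job update <job-id> --schedule 'sat 01:00'
proxmox-backup-manager datastore update store1 --gc-schedule 'sun 03:30'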

Encryption considerations

If you use encryption, you add CPU overhead. On modern CPUs with AES-NI, it’s often fine. But “often” is not “always”.
If your PBS is an older box repurposed from someone’s lab, encryption can become the bottleneck during restore.
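Before blaming encryption, check whether the hardware actually accelerates it. A quick sketch:

grep -c -w aes /proc/cpuinfo                             # 0 means no AES-NI flag; anything else means it's there
openssl speed -evp aes-256-gcm 2>/dev/null | tail -n 3   # rough idea of single-core AES throughput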

Datastore layout

Keep the datastore on a filesystem and block device stack you understand. Exotic layers (remote filesystems, stacked encryption,
thin provisioning under thin provisioning) tend to produce unpredictable behavior under restore pressure.

Proxmox restore side: QEMU, LVM-thin, ZFS zvols, and where speed goes to die

People love blaming PBS because it’s “the backup thing”. But restores write into something. That something can be slow.

ZFS target storage

Restoring a big VM into a ZFS zvol means ZFS is allocating, checksumming, and writing. If the target pool is fragmented,
near full, or backed by HDDs, it will show. Also: if you have sync=always or a workload forcing sync writes,
you can kneecap throughput.
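Before the restore starts, it is worth confirming what the target pool actually looks like. A sketch using the pool and dataset names from the examples in this article:

zpool list -o name,size,alloc,free,frag,cap,health rpool   # headroom and fragmentation at a glance
zfs get sync,compression rpool/vmdata                      # sync=always here will kneecap a streaming restore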

LVM-thin target storage

Thin pools can restore quickly when healthy. When thin metadata is under pressure, or when the pool is close to full,
performance can collapse. It fails like a grown-up: quietly, slowly, and during the only restore that matters.

Ceph / replicated storage

Restoring into Ceph means you’re writing replicated objects, potentially triggering backfill or recovery if the cluster isn’t perfectly calm.
Your restore speed becomes “whatever the cluster can safely ingest without tipping over”.
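If the target is Ceph, check the cluster's mood before pushing a multi-terabyte stream at it. A sketch with standard Ceph tooling:

ceph -s          # health, and whether recovery/backfill is already eating write bandwidth
ceph df          # pool fullness; near-full pools throttle writes
ceph osd perf    # per-OSD latency; one slow OSD drags down the whole write path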

Practical tasks: commands, what output means, and the decision you make

These are not “run these and feel productive” commands. Each one tells you where the restore is being throttled and what to do next.
Run them during a restore test or an actual restore, because idle systems lie.

Task 1: Identify whether PBS is CPU-bound during restore

cr0x@pbs1:~$ top -H -b -n 1 | head -n 25
top - 10:14:21 up 42 days,  3:11,  2 users,  load average: 11.42, 10.98, 9.87
Threads: 412 total,   9 running, 403 sleeping,   0 stopped,   0 zombie
%Cpu(s): 92.1 us,  3.2 sy,  0.0 ni,  1.8 id,  2.9 wa,  0.0 hi,  0.0 si,  0.0 st
...
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
51231 backup    20   0 5348128 612144  28740 R  99.3   3.8  12:41.22 proxmox-backup
51244 backup    20   0 5348128 612144  28740 R  98.7   3.8  12:39.08 proxmox-backup

What it means: If proxmox-backup threads are consuming lots of CPU and iowait is low, you’re likely CPU-bound (decompression, encryption, checksum).

Decision: Lower compression level for future backups, scale PBS CPU, or move datastore to faster storage so CPU isn’t wasted waiting.

Task 2: Confirm datastore disk utilization and iowait on PBS

cr0x@pbs1:~$ iostat -xm 1 5
Linux 6.2.16-20-pve (pbs1) 	12/28/2025 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.54    0.00    6.91   41.33    0.00   29.22

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s  w_await aqu-sz  %util
nvme0n1         220.0  85432.0     0.0    0.0    3.12   388.3      15.0   6240.0   8.90   1.20   78.0

What it means: High iowait and high device utilization suggest the datastore is the bottleneck. Look at r_await: if it’s high (tens of ms on SSD, hundreds on HDD), reads are stalling.

Decision: Move datastore to faster media, add spindles (RAID10), or stop competing jobs (verify/GC).

Task 3: Check if verify or GC is competing with restores

cr0x@pbs1:~$ proxmox-backup-manager task list | egrep -i 'garbage_collection|verif'
│ UPID:... │ 2025-12-28T10:05:12Z │ garbage_collection │ store1 │ running │ 00:21:44 │
│ UPID:... │ 2025-12-28T09:30:00Z │ verificationjob    │ store1 │ running │ 00:56:48 │

What it means: If GC/verify is running during a restore, you’re intentionally creating read contention.

Decision: Stop or reschedule these jobs during urgent restores; set job windows that respect RTO.

Task 4: Inspect PBS task logs for slow phases

cr0x@pbs1:~$ proxmox-backup-manager task list --limit 5
┌──────────┬──────────────────────────┬───────────────┬─────────┬──────────┬───────────┐
│  upid    │          starttime       │     worker    │  type   │ status   │ duration  │
╞══════════╪══════════════════════════╪═══════════════╪═════════╪══════════╪═══════════╡
│ UPID:... │ 2025-12-28T09:58:03Z     │ reader        │ restore │ running  │ 00:16:22  │
└──────────┴──────────────────────────┴───────────────┴─────────┴──────────┴───────────┘

What it means: The task list shows what PBS thinks it’s doing and for how long. Pair this with system metrics.

Decision: If restore tasks run long while system is idle, suspect network/client-side. If system is loaded, suspect PBS storage/CPU.

Task 5: Validate datastore free space and fragmentation risk

cr0x@pbs1:~$ df -h /mnt/datastore
Filesystem      Size  Used Avail Use% Mounted on
rpool/pbsdata   7.1T  6.5T  0.6T  92% /mnt/datastore

What it means: Very full filesystems can slow down allocations and metadata updates, especially on CoW filesystems.

Decision: Prune aggressively, expand storage, or add a second datastore. Keep headroom; 80–85% is a happier place.

Task 6: Check ZFS pool health and latency on PBS (if datastore on ZFS)

cr0x@pbs1:~$ zpool iostat -v 1 3
                              capacity     operations     bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
rpool                       6.50T   600G   1200     80   140M  10.2M
  mirror                    6.50T   600G   1200     80   140M  10.2M
    nvme0n1                 -      -       600     40   70.0M  5.1M
    nvme1n1                 -      -       600     40   70.0M  5.1M
--------------------------  -----  -----  -----  -----  -----  -----

What it means: Confirms whether ZFS vdevs are the limiting factor and whether read bandwidth matches expectations.

Decision: If bandwidth is low and ops are high, you’re IOPS-bound; upgrade to NVMe or add vdevs (carefully) to improve parallelism.

Task 7: Check ARC pressure (ZFS caching) on PBS

cr0x@pbs1:~$ arcstat 1 5
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
10:20:01  3420  1210     35  1100  91    110   9     0   0   28.5G  32.0G
10:20:02  3588  1402     39  1280  91    122   9     0   0   28.5G  32.0G

What it means: High miss% during restore means ZFS is going to disk frequently. Some misses are normal; persistent high misses suggest RAM is limiting or the workload doesn’t cache well.

Decision: Add RAM if metadata/chunks could be cached, or accept that this restore is disk-bound and upgrade disks.

Task 8: Measure network throughput and retransmits during restore

cr0x@pve1:~$ ip -s link show vmbr0
2: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    RX:  bytes packets errors dropped  missed   mcast
    8123456789  7212345      0      12       0       0
    TX:  bytes packets errors dropped carrier collsns
    6234567890  6123456      0       0       0       0

What it means: Drops on the bridge/NIC during a restore are a red flag. Drops trigger retransmits and tank throughput.

Decision: Fix the physical/network layer: MTU consistency, switch buffers, NIC offloads, or move restore traffic off congested segments.

Task 9: Validate MTU end-to-end (common silent killer)

cr0x@pve1:~$ ping -M do -s 8972 -c 2 10.10.10.20
PING 10.10.10.20 (10.10.10.20) 8972(9000) bytes of data.
ping: local error: message too long, mtu=1500
ping: local error: message too long, mtu=1500

--- 10.10.10.20 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1026ms

What it means: This host is not configured for jumbo frames. If you expected MTU 9000, you don’t have it.

Decision: Either standardize on 1500 everywhere or configure jumbo frames everywhere (NICs, bridges, switches). “Somewhere” is not a valid MTU.

Task 10: Check client-side write performance (target datastore on PVE)

cr0x@pve1:~$ iostat -xm 1 3
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.12    0.00    4.87   55.21    0.00   29.80

Device            r/s     rkB/s   r_await  w/s     wkB/s   w_await  %util
dm-3              1.0      8.0     2.10  920.0  94208.0   34.90   99.0

What it means: If the target device is at ~99% util and iowait is high, restore is write-bound on the PVE host.

Decision: Restore to faster storage, change target (e.g., restore to local NVMe then migrate), or fix the underlying pool/array.

Task 11: Check whether thin pool is near full (LVM-thin)

cr0x@pve1:~$ lvs -a -o+seg_monitor,metadata_percent,data_percent vg0
  LV               VG  Attr       LSize   Pool Origin Data%  Meta%  Monitor
  vmdata           vg0 twi-aotz--  2.50t             93.12  78.55  monitored
  [vmdata_tmeta]   vg0 ewi-ao----  4.00g
  [vmdata_tdata]   vg0 ewi-ao----  2.50t

What it means: A thin pool at 93% data usage is a performance and risk problem; metadata at ~79% isn’t great either.

Decision: Expand thin pool, free space, or move restores elsewhere. Also consider trimming and monitoring before you hit the wall.

Task 12: Check ZFS target pool fullness and fragmentation risk

cr0x@pve1:~$ zfs list -o name,used,avail,refer,mountpoint
NAME            USED  AVAIL  REFER  MOUNTPOINT
rpool           1.92T   110G   192K  /rpool
rpool/ROOT       32G   110G    32G  /
rpool/vmdata    1.88T   110G   1.88T  -

What it means: 110G available on a multi-terabyte pool is “we are living dangerously”. ZFS performance degrades as pools fill up.

Decision: Free space or add vdev capacity before doing massive restores; target a pool with headroom.

Task 13: Confirm the restore path and process on the PVE host

cr0x@pve1:~$ ps aux | egrep 'proxmox-backup-client|qemu-img|vma' | grep -v egrep
root     144233  110  2.1 2154320 342112 ?      Sl   10:02   8:14 proxmox-backup-client restore vm/102/2025-12-28T09:40:12Z drive-scsi0.img.fidx /dev/zvol/rpool/vm-102-disk-0 --repository pbs@pam@10.10.10.20:pbs

What it means: Shows what tool and target path are in play. Restoring into a zvol vs a file vs LVM changes behavior.

Decision: If you’re writing into a slow backend, change restore target (restore to a directory on fast NVMe, then import/migrate).

Task 14: Check for obvious NIC offload weirdness (quick sanity)

cr0x@pve1:~$ ethtool -k eno1 | egrep 'tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload'
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on

What it means: Offloads being off can raise CPU cost per packet. Offloads being on but buggy can also hurt, but start with basics.

Decision: If CPU is pegged on softirq during restore, test toggling offloads (carefully) and validate driver/firmware versions.

Task 15: Quick and dirty disk throughput test on PBS datastore (off-hours)

cr0x@pbs1:~$ fio --name=readtest --directory=/mnt/datastore --rw=read --bs=1M --ioengine=libaio --iodepth=16 --numjobs=1 --size=8G --direct=1 --runtime=30 --time_based
readtest: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=16
fio-3.33
read: IOPS=820, BW=820MiB/s (860MB/s)(24.0GiB/30001msec)

What it means: This approximates sequential reads; restores might be more random. Still, if your datastore can’t do decent reads here, it won’t restore fast.

Decision: If BW is far below what the hardware should do, investigate controller settings, pool layout, and disk health.

Task 16: Detect slow disks or errors on PBS

cr0x@pbs1:~$ dmesg -T | egrep -i 'error|reset|timeout|nvme|ata' | tail -n 10
[Sat Dec 27 22:14:09 2025] nvme nvme1: I/O 123 QID 4 timeout, reset controller
[Sat Dec 27 22:14:10 2025] nvme nvme1: Abort status: 0x371

What it means: Timeouts and controller resets will crater performance and reliability. A restore will amplify the pain.

Decision: Replace flaky media, update firmware, and stop treating errors as “noise”.

Three corporate mini-stories from the trenches

Mini-story 1: The incident caused by a wrong assumption

A mid-sized company consolidated backups onto PBS and celebrated: backup windows shrank, storage growth slowed, and dashboards looked calm.
The team assumed restore would scale the same way because “it’s just reading back what we wrote”.

The first real test came during an application outage that required restoring a multi-terabyte VM quickly. The restore started, then crawled.
Network graphs showed plenty of headroom. CPU graphs were fine. The PBS datastore disks were at high utilization with ugly read latency.

The datastore lived on a big parity RAID of HDDs. It had been chosen for capacity per dollar, and it was excellent at streaming writes.
Restores, however, triggered a chunk-heavy read pattern that behaved like semi-random I/O. The array did what parity HDD arrays do under that load: it suffered politely.

They fixed it by moving the active datastore to mirrored SSDs (and leaving older, colder data on the HDD pool). Restore times became predictable.
The wrong assumption wasn’t about PBS; it was about read behavior being “the same as write behavior”. Storage never promised you symmetry.

Mini-story 2: The optimization that backfired

Another org wanted to cut backup storage costs. Someone proposed increasing compression aggressively because backups were “mostly the same stuff”
and “CPU is cheap”. They turned the knob up and watched datastore usage flatten. The change looked brilliant.

Months later, they had to restore multiple VMs simultaneously after a bad patch rollout. Restores were painfully slow and jittery.
PBS CPU was pegged, but not in a healthy “using resources efficiently” way—more like “I can’t breathe”.

The compression level they chose was fine for overnight backups, but restore was a daytime emergency workload. Decompression became the bottleneck,
and because multiple restores ran at once, they amplified the CPU contention. The storage was fast. The network was fine. The CPU was not.

The fix wasn’t heroic: they dropped compression to a moderate zstd level, added a bit more CPU, and—this is the part nobody wants to hear—tested restores quarterly.
Storage costs rose modestly. RTO improved dramatically. The “optimization” had been optimizing the wrong thing.

Mini-story 3: The boring but correct practice that saved the day

A regulated company had an unglamorous rule: every quarter, pick a random VM and do a full bare-metal style restore into an isolated VLAN.
They logged the time, the bottleneck, and the steps needed. Nobody loved this. It took hours and produced no shiny feature.

One quarter, the restore ran slower than expected. The test environment showed dropped packets on one switch port and a rising counter of RX errors.
The VM still restored—eventually—but the postmortem entry said, “fix network path before it matters”.

Two months later, a host failed. They restored several VMs under pressure. The restores ran close to their tested numbers, because the bad switch port had already been swapped.
No one clapped. No one bought cake. The incident channel stayed boring, which is the best outcome operations gets.

The practice wasn’t clever. It was repeatable. It saved them from learning about their network during a real outage.

Common mistakes: symptom → root cause → fix

1) “Backups are fast, restores are slow” (classic)

Symptom: Backup jobs complete quickly; restores crawl, especially for large VMs.

Root cause: Datastore optimized for sequential write, not chunk-heavy reads (HDD parity RAID, busy pool, or low IOPS).

Fix: Put active datastore on SSD/NVMe mirror/RAID10. Separate “hot restore” datastore from cold archive.

2) Restore speed swings wildly hour to hour

Symptom: Sometimes restores are fine; sometimes they’re unusable.

Root cause: Contention with verify/GC/prune, or shared storage with other workloads.

Fix: Schedule background jobs; isolate PBS storage; cap concurrency during business hours.

3) Restore stalls at some percentage and “estimation” looks stuck

Symptom: UI progress barely moves; logs show long pauses.

Root cause: Disk read latency spikes, SMR disks, controller timeouts, or failing media causing retries.

Fix: Check dmesg, SMART/NVMe logs, and iostat latency. Replace unstable disks; avoid SMR for active datastore.

4) 10GbE link, but restore tops out at 80–150 MB/s

Symptom: Network “should” do more, but throughput stays low.

Root cause: Single-flow limits, MTU mismatch, packet drops, firewall/IPS overhead, or CPU softirq saturation.

Fix: Validate MTU end-to-end; check drops and retransmits; move restore traffic; tune NIC offloads and IRQ affinity if needed.

5) CPU is pegged on PBS during restore

Symptom: High CPU usage, low disk utilization; restores slow.

Root cause: Heavy compression level, encryption overhead, or insufficient cores for concurrent restores.

Fix: Use moderate zstd; scale CPU; limit concurrent restores; consider separating ingest and restore workloads (bigger PBS or second PBS).

6) Restoring into ZFS is slower than expected

Symptom: PBS is fine; PVE host iowait is high, writes are slow.

Root cause: Target pool near full, slow vdev layout, fragmentation, or sync settings.

Fix: Add capacity, keep free space headroom, restore to faster temporary storage then migrate, verify ZFS pool design (mirrors for IOPS).

7) LVM-thin restores start fast then degrade

Symptom: First few minutes are great; then it drops off a cliff.

Root cause: Thin pool or metadata nearing capacity; metadata IO contention.

Fix: Expand thin pool; free space; monitor Data% and Meta%; avoid running thin pools at high utilization.

8) Multiple restores at once make everything unusable

Symptom: One restore is okay; three restores are terrible.

Root cause: Shared bottleneck (datastore IOPS, PBS CPU, network queueing, or target storage write limits).

Fix: Limit concurrency; stage restores; upgrade the shared bottleneck; consider separate datastores or PBS nodes.

Checklists / step-by-step plan

Step-by-step: making restores faster without cargo cult tuning

  1. Set a restore objective: define an RTO for “single critical VM” and for “worst day: 5 critical VMs”. If you can’t name numbers, you can’t tune.
  2. Run a real restore test during a quiet window. Measure throughput end-to-end (PBS disk, PBS CPU, network, PVE disk); see the capture sketch after this list.
  3. Eliminate contention: schedule verify/GC away from restore windows. Don’t share PBS datastore with VM workloads.
  4. Decide your bottleneck using the fast diagnosis playbook. Don’t upgrade three things at once.
  5. If read-side is the bottleneck: move datastore to SSD/NVMe, prefer mirrors/RAID10, ensure enough RAM for metadata caching.
  6. If CPU is the bottleneck: drop compression level, ensure AES acceleration if encrypted, add cores, and limit concurrent restores.
  7. If network is the bottleneck: validate MTU, remove packet drops, ensure the path is clean (no surprise inspection), and verify expected single-flow throughput.
  8. If target write-side is the bottleneck: restore to faster local storage then migrate; fix ZFS pool headroom; expand thin pools.
  9. Retest after each change and record the result. “Feels faster” is not a metric.
  10. Institutionalize the boring practice: quarterly restore drills, with logged timings and bottlenecks.
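A sketch of the “measure end-to-end” step from the plan above: capture basic metrics on both hosts for the duration of the drill so the bottleneck ends up in a log file instead of in someone’s memory. The file paths are arbitrary.

# On the PBS host, for the duration of the restore test:
iostat -xm 10 | tee /root/drill-pbs-iostat.log &
# On the PVE host:
iostat -xm 10 | tee /root/drill-pve-iostat.log &
sar -n DEV 10 | tee /root/drill-pve-net.log &
# Afterwards: record wall-clock restore time, peak MB/s on each side, and which resource saturated first.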

Restore-day checklist (when it’s already on fire)

  • Stop/suspend PBS verify and GC jobs temporarily if they compete for read I/O (see the sketch after this checklist).
  • Run iostat on PBS and PVE to decide read-bound vs write-bound.
  • Check PBS CPU saturation and PVE iowait.
  • Check NIC drops and MTU mismatch signs.
  • Limit concurrent restores until you understand the bottleneck.
  • If target storage is slow, restore to a faster temporary target and migrate later.
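Stopping the competing PBS jobs mid-incident, as a sketch; the UPID is whatever the task list shows for the running verify or garbage-collection worker, elided here:

proxmox-backup-manager task list              # find the running verify / garbage_collection task
proxmox-backup-manager task stop 'UPID:...'   # stop it, and reschedule it for after the restore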

FAQ

1) Why do my backups finish quickly but restores take forever?

Backups can be incremental and deduplicated—so you’re mostly sending changed chunks. Restores rebuild the whole VM image and can trigger lots of chunk reads plus decompression.
Your datastore read IOPS and latency matter much more on restore than on backup.

2) Is PBS restore speed mostly disk, CPU, or network?

It’s whichever one is worst on that path. In practice, datastore disk read latency is the most common limiter, followed by target write performance.
CPU becomes the limiter when compression/encryption is heavy or the PBS is under-provisioned.

3) Should I use zstd or lzo for better restores?

Moderate zstd is usually the best overall choice: good compression with fast decompression. lzo reduces CPU but increases bytes read/transferred.
If you’re network-limited (1GbE), zstd often restores faster. If you’re CPU-limited and have fast storage/network, lzo can help.

4) Does deduplication slow restores?

Dedup can make the read pattern more complex, which hurts on slow random-read storage. On SSD/NVMe, the penalty is typically small.
Dedup is usually worth it; just don’t put the datastore on disks that hate random reads.

5) Will adding a SLOG device speed up restores?

Not materially. SLOG helps synchronous write latency, not read throughput. Restores are read-heavy on PBS and write-heavy on the target.
Spend your budget on faster datastore media and better target write performance.

6) Should I run verify jobs daily?

Integrity checking is good. Running it when it collides with restores is bad. Schedule verify when restore urgency is low,
and size the datastore so verify doesn’t take your entire day.

7) Why is restoring many VMs at once so much slower than one VM?

You’re increasing contention on shared resources: datastore IOPS, PBS CPU, network queues, and target storage writes.
If you need parallel restores, design for it: more IOPS (NVMe mirrors), more CPU, and a clean network path.

8) Is it better to restore to local storage then migrate, instead of restoring directly to shared storage?

Often yes. Restoring to fast local NVMe can drastically reduce the time-to-first-boot.
You can then migrate the VM to slower/replicated storage after the incident is over and everyone is less emotional.

9) How much free space should I keep on PBS datastore and ZFS pools?

Enough that you’re not constantly near the cliff. A practical target is keeping 15–20% headroom on pools that need performance and predictable allocations.
If you’re at 90%+, expect performance surprises.

10) What’s the single highest-impact upgrade for restores?

Put the PBS datastore on SSD/NVMe with a layout that provides IOPS (mirrors/RAID10). Capacity-centric HDD parity arrays are a common restore bottleneck.
After that, ensure the target storage can accept writes at the pace you’re restoring.

Conclusion: the next actions that pay off

Restore performance isn’t mystical. It’s a pipeline with predictable choke points. If your restores are slow, stop guessing and measure:
datastore read latency, PBS CPU, network health, and target write behavior. You’ll usually find one resource screaming louder than the rest.

Practical next steps that tend to produce immediate improvement:

  • Schedule GC/verify so they never compete with restore windows.
  • Move the active datastore to SSD/NVMe mirrors (or RAID10) if you care about restore speed.
  • Pick sane compression (moderate zstd) and validate with quarterly restore drills.
  • Keep headroom on both PBS datastore and target pools; full systems are slow systems.
  • Document a restore runbook that includes the fast diagnosis commands, so you don’t debug under adrenaline.

The goal isn’t to make restores “fast”. The goal is to make them predictable. Predictable restores are what let you sleep through the night.
Or at least sleep until the next, different incident shows up.
