ZFS snapshots are the closest thing storage engineers get to time travel. They let you roll back a dataset, recover a file somebody “didn’t delete,” and ship consistent data to another machine without shutting services down. You can take them every minute and sleep better—until you wake up to a pool at 98% and a cluster that suddenly treats fsync() like a suggestion.
This piece is for people who run production systems, not lab demos. We’ll cover what snapshots really cost, how space accounting lies to you (honestly), and how to run retention without accidentally turning your snapshot schedule into a slow-motion denial of service. Along the way: commands you can run today, three corporate-world stories you’ll recognize, and a fast diagnosis playbook for when the pool is filling faster than your pager battery.
What a ZFS snapshot really is (and isn’t)
A ZFS snapshot is a read-only reference to a dataset’s state at a point in time. ZFS is copy-on-write: when data changes, ZFS writes new blocks and then updates metadata to point to them. Old blocks remain until nothing references them. A snapshot is just “something else that still references the old blocks.”
This is why snapshots are fast to create: ZFS isn’t copying your dataset; it’s pinning references. And it’s also why snapshots can “use” a lot of space later: not because they grow on their own, but because they prevent old blocks from being freed when the live dataset changes.
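If you want to watch the pinning happen, here is a minimal sketch on a scratch dataset. The dataset name tank/scratch and the 1 GB file are placeholders; the point is how refer and usedbysnapshots move in opposite directions.
cr0x@server:~$ sudo zfs create tank/scratch
cr0x@server:~$ sudo dd if=/dev/urandom of=/tank/scratch/blob bs=1M count=1024
cr0x@server:~$ sudo zfs snapshot tank/scratch@pin
cr0x@server:~$ sudo rm /tank/scratch/blob
cr0x@server:~$ zfs list -o name,refer,usedbysnapshots tank/scratch
Expect refer to fall back toward a few kilobytes while usedbysnapshots jumps to roughly 1G: the live view forgot the blob, the snapshot did not. Destroy tank/scratch@pin (and then the dataset) and the space comes back.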
The mental model that prevents outages
Think of your dataset as a playlist, not a folder. The blocks are the songs. When you “edit” the playlist, ZFS doesn’t edit songs in place; it writes new songs and points the playlist to them. A snapshot is a saved version of the playlist that continues to point at the old songs. If you keep editing, you keep accumulating old songs that snapshots still reference.
If that metaphor feels too gentle for production: snapshots are “garbage collector roots.” They keep blocks alive. If you create a lot of roots and churn the heap, you’ll fill memory. Here, memory is your pool.
Snapshots are not backups (most of the time)
Snapshots protect you against logical problems: accidental deletes, bad deploys, ransomware that you notice quickly, or that one script that “only removes old files” but has an off-by-one in the path. Backups protect you against physical and existential problems: pool loss, operator error that destroys snapshots, fire, theft, and “we migrated the wrong dataset and didn’t notice for a month.”
You can turn snapshots into backups by replicating them elsewhere (e.g., zfs send/receive) with independent retention. Until you do that, snapshots are a safety net attached to the same trapeze.
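A minimal replication sketch, assuming a second machine called backuphost with a pool named vault and working SSH; the snapshot names are illustrative:
cr0x@server:~$ sudo zfs snapshot tank/home@rep-2025-12-25
cr0x@server:~$ sudo zfs send tank/home@rep-2025-12-25 | ssh backuphost sudo zfs receive vault/home
cr0x@server:~$ sudo zfs snapshot tank/home@rep-2025-12-26
cr0x@server:~$ sudo zfs send -i tank/home@rep-2025-12-25 tank/home@rep-2025-12-26 | ssh backuphost sudo zfs receive vault/home
The first send ships the full dataset; later sends with -i ship only the delta between two snapshots. The receiving side keeps its own snapshot chain with its own retention, which is exactly what upgrades “snapshot” to “backup.” Most teams wrap this pipeline in tooling (syncoid, zrepl, znapzend, or similar), but the mechanism underneath is this pipe.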
Joke #1: A snapshot is like “undo,” except it never forgets and it bills you in terabytes.
Facts & context: why snapshots became a big deal
Some historical and technical context helps explain why ZFS snapshotting feels magical—and why it can surprise teams coming from traditional filesystems and LVM.
- ZFS shipped with snapshots as a first-class feature when many filesystems treated snapshots as a bolt-on (volume manager, array feature, or proprietary tooling).
- Copy-on-write is the enabler: ZFS can snapshot cheaply because it never overwrites live blocks in place; it writes new blocks and flips pointers.
- Space accounting trips people up: operators and monitoring tools routinely conflate “referenced,” “used,” and “logical” sizes, which leads to inconsistent dashboards in mixed environments.
- Solaris shops normalized aggressive snapshot schedules long before containers made “immutable infrastructure” fashionable; daily/hourly/minutely patterns were common in enterprise NAS workflows.
- Snapshots are dataset-scoped, not pool-scoped. This matters when people assume “one retention policy for the whole pool” will behave the same everywhere.
- Clones made snapshots more powerful and more dangerous: a clone depends on its origin snapshot, so deletes get blocked and your “prune old snaps” job suddenly becomes a liar.
- VM images and databases changed the game: random-write workloads churn blocks quickly, making snapshots accumulate referenced old blocks far faster than a mostly-append log workload.
- “Pool full” became an operational class of incident because ZFS performance and allocation behavior degrade dramatically at high utilization; it’s not just “no more space,” it’s “everything gets slow and then errors start.”
- Cheap snapshots made retention politics harder: once teams know you can keep 30 days “for free,” someone will ask for 365 days “because compliance.”
The uncomfortable truth: how snapshots “use” space
The most common wrong assumption is: “Snapshots are cheap, therefore snapshots are small.” Creation is cheap. Space is not guaranteed to be.
Three kinds of “size” you’ll see
ZFS reports multiple metrics and they answer different questions:
- logicalused: how much data is logically stored (ignores compression, copies, etc.).
- used: the space consumed by this dataset and everything it owns, including its snapshots, children, and reservations (for a snapshot, used is the space unique to that snapshot).
- referenced: the data reachable through the live dataset right now. It can be shared with snapshots and clones, so it is not “space freed if I destroy this”; it is “what the dataset currently sees.”
- usedbysnapshots / usedbydataset / usedbychildren: a breakdown that stops arguments in meetings.
Snapshots “use” the space of blocks that are no longer referenced by the live dataset but are still referenced by one or more snapshots. That means your snapshot usage is proportional to change rate, not total dataset size. A 10 TB dataset with low churn can have years of snapshots with modest overhead; a 500 GB VM datastore with high churn can eat a pool in a weekend.
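The written property is the quick way to measure that change rate. written reports new data since the most recent snapshot, and written@<snapshot> reports new data since a specific one; the numbers below are illustrative:
cr0x@server:~$ zfs get -H -o name,value written tank/vm
tank/vm 38G
cr0x@server:~$ zfs get -H -o value written@auto-2025-12-24-0000 tank/vm
1.1T
If consecutive snapshots routinely show tens of gigabytes written, your retention math has to start from that churn number, not from the dataset’s size.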
Why deleting files doesn’t free space (sometimes)
You delete a 200 GB directory. Users cheer. Monitoring doesn’t move. That’s snapshots doing their job: they still reference the blocks, so the pool can’t free them. This is also why “cleanup scripts” can be misleading in snapshot-heavy systems: they reduce live referenced space, but not pool allocation, until old snapshots are removed.
“But we only snapshot /var/log, how bad can it be?”
Append-only patterns are snapshot-friendly. But plenty of systems aren’t append-only:
- VM images: guest filesystem metadata churn, small random writes, log rotation inside guests.
- Databases: checkpointing and compaction can rewrite large regions.
- Build caches and CI workspaces: delete-and-recreate patterns churn everything.
- Container layers: frequent rebuilds; lots of small-file change churn.
The “pool is full” cliff is real
Many teams treat 95% utilization as “still 5% free.” In ZFS, that last 5% is often where performance goes to die, especially on pools with fragmentation, small recordsize, or high random write. Metadata allocations become harder, free space becomes scattered, and workloads that were fine at 70% start timing out at 92%.
Joke #2: ZFS at 98% is like a suitcase at the airport: it still closes, but everyone involved is sweating and somebody’s zipper is about to fail.
Snapshot deletion doesn’t always free space immediately
Destroying snapshots usually frees space quickly, but not always instantly in the way humans expect:
- Pending frees can defer actual reclamation; you might see “freeing” lag (a quick check follows this list).
- Holds can block deletion entirely.
- Clones can keep blocks referenced even after snapshot removal attempts (or prevent removal).
- Special devices / metadata can complicate where the pressure shows up (e.g., a special vdev filling while main vdev looks fine).
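The freeing lag in particular is easy to check: the pool-level freeing property shows how much space from recent destroys is still being processed. Pool name and value here are illustrative:
cr0x@server:~$ zpool get -H -o value freeing tank
1.21T
A nonzero value trending toward zero means ZFS is still working through deferred frees; give it time instead of re-running destroys in a panic. Holds and clones are covered in the tasks below.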
Practical tasks (commands + interpretation)
These are real operational tasks you can run during a quiet day to understand your snapshot footprint—or during a loud day to stop the bleeding. The commands assume a Linux-style prompt and OpenZFS tooling.
Task 1: Get the truth about pool capacity and health
cr0x@server:~$ zpool status -v
pool: tank
state: ONLINE
scan: scrub repaired 0B in 00:12:31 with 0 errors on Sun Dec 21 01:10:03 2025
config:
NAME        STATE     READ WRITE CKSUM
tank        ONLINE       0     0     0
  mirror-0  ONLINE       0     0     0
    sda     ONLINE       0     0     0
    sdb     ONLINE       0     0     0
errors: No known data errors
Interpretation: If the pool isn’t healthy, snapshot cleanup is the wrong first move. A degraded vdev, resilvering, or checksum errors can change performance symptoms and risk tolerance. Scrub results tell you whether you’re dealing with “space pressure” or “data integrity pressure” too.
cr0x@server:~$ zpool list -o name,size,alloc,free,capacity,health,fragmentation
NAME SIZE ALLOC FREE CAPACITY HEALTH FRAG
tank 20T 17.8T 2.2T 89% ONLINE 43%
Interpretation: Capacity and fragmentation together matter. 89% and 43% fragmentation on a random-write heavy pool is a different beast than 89% and 5% fragmentation on an append-only pool.
Task 2: List datasets with a snapshot-aware breakdown
cr0x@server:~$ zfs list -o name,used,avail,refer,usedbysnapshots,usedbydataset,usedbychildren -r tank
NAME USED AVAIL REFER USEDSNAP USEDDS USEDCHILD
tank 17.8T 2.2T 256K 0B 256K 17.8T
tank/home 1.2T 2.2T 980G 220G 980G 0B
tank/vm 12.9T 2.2T 3.1T 9.8T 3.1T 0B
tank/db 3.4T 2.2T 1.9T 1.5T 1.9T 0B
Interpretation: This is your headline. tank/vm has 9.8T “used by snapshots.” That doesn’t mean the snapshots are 9.8T copies; it means that much old data is pinned by snapshots. The live referenced footprint is only 3.1T.
Task 3: Count snapshots per dataset (retention drift detector)
cr0x@server:~$ zfs list -H -t snapshot -o name | awk -F@ '{print $1}' | sort | uniq -c | sort -nr | head
4320 tank/vm
720 tank/db
120 tank/home
Interpretation: 4320 snapshots often means “every 10 minutes for 30 days” or “every minute for 3 days,” depending on the naming scheme. Either way: it’s a lot of metadata and a lot of pinned history if churn is high.
Task 4: Find the heaviest snapshots (by used)
cr0x@server:~$ zfs list -t snapshot -o name,used,refer,creation -s used | tail -n 10
tank/db@auto-2025-12-24-2200 16G 1.9T Wed Dec 24 22:00 2025
tank/db@auto-2025-12-24-2300 18G 1.9T Wed Dec 24 23:00 2025
tank/vm@auto-2025-12-24-1600 41G 3.1T Wed Dec 24 16:00 2025
tank/vm@auto-2025-12-24-1700 42G 3.1T Wed Dec 24 17:00 2025
tank/vm@auto-2025-12-24-1800 43G 3.1T Wed Dec 24 18:00 2025
tank/vm@auto-2025-12-24-1900 44G 3.1T Wed Dec 24 19:00 2025
tank/vm@auto-2025-12-24-2000 44G 3.1T Wed Dec 24 20:00 2025
tank/vm@auto-2025-12-24-2100 46G 3.1T Wed Dec 24 21:00 2025
tank/vm@auto-2025-12-24-2200 47G 3.1T Wed Dec 24 22:00 2025
tank/vm@auto-2025-12-24-2300 48G 3.1T Wed Dec 24 23:00 2025
Interpretation: Snapshot used is “space uniquely held by this snapshot.” Big numbers usually indicate heavy churn between snapshots. If you see dozens of 40–50G deltas hourly, your retention policy must be conservative with time, not count.
Task 5: Quickly see which datasets are churn monsters
cr0x@server:~$ zfs list -o name,refer,usedbysnapshots,compression,recordsize,primarycache -r tank/vm
NAME REFER USEDSNAP COMPRESS RECSIZE PRICACHE
tank/vm 3.1T 9.8T lz4 128K all
Interpretation: Compression and recordsize influence churn dynamics. A poor recordsize choice for VM images (often better served by 16K–64K, depending on the guest filesystem and I/O pattern) can increase write amplification and snapshot deltas.
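If you decide a different recordsize is warranted, remember it only applies to blocks written after the change; existing data keeps its old block size until it is rewritten or migrated. A hedged example, with tank/vm2 as a hypothetical new dataset for migrated images:
cr0x@server:~$ sudo zfs create -o recordsize=32K -o compression=lz4 tank/vm2
cr0x@server:~$ zfs get -H -o value recordsize tank/vm2
32K
Flipping recordsize on tank/vm alone rewrites nothing; the payoff comes when VM disks are moved (send/receive or a storage-level migration) onto the dataset with the new setting.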
Task 6: Confirm whether holds are blocking deletion
cr0x@server:~$ zfs holds tank/vm@auto-2025-12-01-0000
NAME TAG TIMESTAMP
tank/vm@auto-2025-12-01-0000 backup Mon Dec 1 00:00 2025
Interpretation: A hold is an explicit “do not delete” pin. If your pruning job claims it deleted snapshots but they remain, holds are a prime suspect.
Task 7: Release a hold (carefully) to allow pruning
cr0x@server:~$ sudo zfs release backup tank/vm@auto-2025-12-01-0000
Interpretation: This does not delete anything. It removes the safety catch. If a replication job uses holds to protect “not yet replicated” snapshots, coordinate before releasing. The correct workflow is: confirm replication state, then release, then destroy.
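A minimal version of that workflow, assuming the backup hold belongs to your replication tooling and that vault/vm on backuphost is the replication target (both names are illustrative):
cr0x@server:~$ ssh backuphost zfs list -H -o name vault/vm@auto-2025-12-01-0000
vault/vm@auto-2025-12-01-0000
cr0x@server:~$ sudo zfs release backup tank/vm@auto-2025-12-01-0000
cr0x@server:~$ sudo zfs destroy tank/vm@auto-2025-12-01-0000
If the first command errors, the snapshot never made it to the target; fix replication and leave the hold alone.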
Task 8: Detect clones that depend on snapshots
cr0x@server:~$ zfs get -r -H -o name,value origin tank | awk '$2 != "-"'
tank/vm/dev-clone-01 tank/vm@auto-2025-12-15-0000
tank/vm/dev-clone-02 tank/vm@auto-2025-12-20-0000
Interpretation: If a clone exists, the origin snapshot can’t be destroyed until the clone is promoted or destroyed. Teams often forget dev/test clones and then wonder why retention “stopped working.”
Task 9: Identify exactly what prevents snapshot destruction
cr0x@server:~$ sudo zfs destroy tank/vm@auto-2025-12-15-0000
cannot destroy 'tank/vm@auto-2025-12-15-0000': snapshot has dependent clones
Interpretation: ZFS is telling you the truth. Your retention policy is now a dependency graph problem. Either remove the clones, promote them, or change the policy to keep origins longer.
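Promotion flips the dependency: the clone takes ownership of the origin snapshot (and the history before it), and the original dataset becomes the dependent one. A sketch using the clone from Task 8:
cr0x@server:~$ sudo zfs promote tank/vm/dev-clone-01
cr0x@server:~$ zfs get -r -H -o name,value origin tank | awk '$2 != "-"'
After the promote you should see the relationship reversed: the clone no longer reports an origin, and tank/vm does. For a throwaway dev clone, destroying it is usually the simpler path; promote only when the clone is the dataset you actually intend to keep.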
Task 10: Delete snapshots in a controlled window
cr0x@server:~$ sudo zfs destroy tank/vm@auto-2025-12-01-0000
Interpretation: Destroying one snapshot at a time is slower but safer when you’re learning the environment. Once you trust your naming and dependency rules, you can delete ranges or use scripting—but only after you prove the outputs.
Task 11: Delete a range of snapshots by pattern (surgical, not casual)
cr0x@server:~$ zfs list -H -t snapshot -o name -s creation | grep '^tank/vm@auto-2025-11' | head -n 5
tank/vm@auto-2025-11-01-0000
tank/vm@auto-2025-11-01-0100
tank/vm@auto-2025-11-01-0200
tank/vm@auto-2025-11-01-0300
tank/vm@auto-2025-11-01-0400
cr0x@server:~$ zfs list -H -t snapshot -o name -s creation | grep '^tank/vm@auto-2025-11' | wc -l
720
cr0x@server:~$ zfs list -H -t snapshot -o name -s creation | grep '^tank/vm@auto-2025-11' | sudo xargs -n 1 zfs destroy
Interpretation: First show, then count, then destroy. zfs destroy takes one snapshot argument per invocation, so xargs -n 1 runs a separate destroy per snapshot: slower, but failures stay isolated. If any snapshot in the list has holds or clones, that particular destroy errors; inspect the failures and don’t assume the whole batch succeeded.
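Once you trust the naming, OpenZFS can also destroy a contiguous range in a single call using the % syntax, and zfs destroy supports -n (dry run) and -v (verbose) so you can preview the damage first. The endpoints here are illustrative:
cr0x@server:~$ sudo zfs destroy -nv tank/vm@auto-2025-11-01-0000%auto-2025-11-30-2300
Read the “would destroy” lines and the reclaim estimate at the end, compare them against the listing you proved above, and only then drop the -n. Holds and clones inside the range will still block those particular snapshots.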
Task 12: Watch space reclaim after deletes (and don’t panic)
cr0x@server:~$ zpool list -o name,alloc,free,capacity
NAME ALLOC FREE CAPACITY
tank 17.8T 2.2T 89%
cr0x@server:~$ sudo zfs destroy tank/vm@auto-2025-11-01-0000
cr0x@server:~$ zpool list -o name,alloc,free,capacity
NAME ALLOC FREE CAPACITY
tank 17.7T 2.3T 88%
Interpretation: Reclaim might be gradual. If you’re under pressure, delete enough to get back to a safe headroom (often 15–20% free depending on workload), not “just one snapshot.”
Task 13: Find datasets with unexpected reservations (space that can’t be used)
cr0x@server:~$ zfs get -r -H -o name,property,value reservation,refreservation tank | grep -v 'none$'
tank/vm refreservation 2T
Interpretation: A refreservation can make a pool look “more full” because it’s guaranteed space. This is not a snapshot issue, but it often shows up during a snapshot incident because everyone is staring at “where did my free space go?”
Task 14: Measure snapshot creation rate and send/receive health (replication environments)
cr0x@server:~$ zfs list -t snapshot -o name,creation -s creation | tail -n 5
tank/vm@auto-2025-12-24-2000 Wed Dec 24 20:00 2025
tank/vm@auto-2025-12-24-2100 Wed Dec 24 21:00 2025
tank/vm@auto-2025-12-24-2200 Wed Dec 24 22:00 2025
tank/vm@auto-2025-12-24-2300 Wed Dec 24 23:00 2025
tank/vm@auto-2025-12-25-0000 Thu Dec 25 00:00 2025
Interpretation: If snapshots are coming in but replication can’t keep up, you’ll accumulate “must keep until replicated” holds (manual or tool-managed), and retention will stall. Operationally, “snapshot explosion” is often a replication lag problem wearing a storage costume.
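A quick lag check that doesn’t trust the replication tool’s own dashboard is to compare the newest snapshot on both sides. vault/vm on backuphost is the hypothetical target from earlier; output is illustrative:
cr0x@server:~$ zfs list -H -t snapshot -o name -s creation -d 1 tank/vm | tail -n 1
tank/vm@auto-2025-12-25-0000
cr0x@server:~$ ssh backuphost zfs list -H -t snapshot -o name -s creation -d 1 vault/vm | tail -n 1
vault/vm@auto-2025-12-24-1100
Thirteen hours of lag means roughly thirteen hourly snapshots pinned on the source until they ship. At that point the cleanup problem and the replication problem are the same problem.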
Fast diagnosis playbook: bottlenecks and runaway space
This is the “pager is going off” sequence. The goal is to identify whether you’re dealing with (a) real space exhaustion, (b) snapshot retention failure, (c) reservations/special vdev pressure, or (d) performance collapse because you’re near-full and fragmented.
First: confirm the blast radius
- Pool capacity and health: check zpool list and zpool status.
- Is it one dataset or many? Check zfs list -o name,usedbysnapshots,refer -r POOL.
- Any special vdev pressure? If you use a special device, look for it in zpool status and compare pool symptoms to metadata-heavy workloads.
Second: decide if it’s “snapshots pinned blocks” or “something else”
- Snapshot footprint: datasets with high usedbysnapshots are usually the culprits.
- Snapshot count drift: count snapshots per dataset; sudden spikes often mean a retention job failed, naming changed, or a schedule doubled.
- Reservations: check refreservation/reservation; don’t delete snapshots to compensate for a reservation someone set months ago.
Third: identify what prevents cleanup
- Holds: run zfs holds on the oldest snapshots you expect to delete.
- Clones: check origin across descendants; clones block destruction.
- Replication lag: if snapshots are held for replication, verify the replication pipeline is healthy before releasing holds.
Fourth: stabilize, then optimize
- Stabilize: free enough space to avoid the near-full cliff. If you’re above ~90%, aim to get below it quickly and safely.
- Stop the bleeding: pause snapshot schedules or reduce frequency for churn-heavy datasets while you investigate (one way to do that follows this list).
- Then tune: review recordsize, workload placement, snapshot frequency/retention, and replication architecture.
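How you pause depends on your tooling. If your scheduler honors the com.sun:auto-snapshot user property (zfs-auto-snapshot does; other tools have their own config), a hedged example of muting one churn-heavy dataset looks like this:
cr0x@server:~$ sudo zfs set com.sun:auto-snapshot=false tank/vm
cr0x@server:~$ zfs get -H -o value com.sun:auto-snapshot tank/vm
false
Revert it with zfs inherit com.sun:auto-snapshot tank/vm once retention and replication are healthy again; a “temporary” mute nobody reverts is how a dataset ends up with no recovery points at all.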
Three corporate-world mini-stories
1) The incident caused by a wrong assumption: “Snapshots are basically free”
It started as a sensible modernization project: move a legacy virtualization cluster onto ZFS-backed storage to get checksumming, compression, and—everybody’s favorite word—snapshots. The team set hourly snapshots on the VM dataset and kept 30 days. That policy had worked on a file server for years, so it felt safe.
The wrong assumption wasn’t about ZFS correctness; it was about workload churn. A VM datastore isn’t a file server. Guest OSes do constant small writes, update metadata, rotate logs, and rewrite parts of virtual disks that “look idle” at the application level. The ZFS dataset wasn’t huge, but it changed everywhere, all the time.
Two weeks later, the pool hit the high-80s. Then the high-90s. Latency spiked first—small synchronous writes getting stuck behind allocation and fragmentation pressure. VM boot times got weird. Database VMs started throwing timeouts. The storage graphs looked like a heart monitor after an espresso.
The team’s first reaction was to hunt “the big file” and delete it. That freed almost nothing, because old blocks were pinned across hundreds of hourly snapshots. Only after they ran a snapshot-aware breakdown did the pattern become obvious: usedbysnapshots was carrying the pool.
The fix wasn’t a hero move. It was controlled cleanup and a policy reset: fewer snapshots on churn-heavy datasets, shorter retention for high-frequency snapshots, and replication to another box for longer-term recovery points. The lesson that stuck: snapshots are cheap to create, expensive to keep under churn—and your workload decides the bill.
2) The optimization that backfired: “Let’s snapshot every minute”
A different organization had a real business problem: developers were frequently rolling back test environments, and the infrastructure team wanted faster recovery without restoring from backups. Someone proposed minute-level snapshots for a handful of datasets. On paper, it was elegant: take more snapshots, reduce rollback granularity, reduce rebuild time.
For a while, it worked. The team celebrated the first big save: a broken schema migration in staging, rolled back in seconds. That success created a subtle incentive: “If minute-level snapshots are good, why not everywhere?” More datasets joined the schedule. Nobody wanted to be the team without the safety net.
Then the backfire: snapshot metadata overhead and operational complexity. Listing snapshots became slow on heavily-snapshotted datasets; scripts that assumed “a few dozen snapshots” began timing out. Replication lag grew because incremental streams had to traverse more snapshot boundaries, and the receiving side started falling behind during peak hours.
And here’s the part that surprised them most: snapshot frequency didn’t reduce space usage the way they expected. With a high-churn workload, minute snapshots can actually increase retained block diversity: more intermediate versions get pinned, fewer blocks become “fully obsolete” until you delete a much larger slice of history. The deltas were small, but there were so many of them that the pinned set stayed large.
The fix was counterintuitive: fewer snapshots, but smarter tiers. Keep minutely snapshots for a short window (hours), hourly for a medium window (days), daily for longer (weeks). And for datasets that needed “oops recovery,” they moved the most dangerous workflows (like test DB refreshes) to separate datasets so churn didn’t poison everything else.
3) The boring but correct practice that saved the day: “Retention with guardrails”
This one is almost disappointing because it’s not dramatic. A finance-adjacent company ran ZFS on servers that hosted both user homes and application data. Their snapshot policy was conservative and frankly unsexy: hourly snapshots kept for 48 hours, daily for 30 days, monthly for 12 months. They also had two boring rules: snapshots are named predictably, and deletion is always preceded by an inventory report.
One Friday, a deployment bug corrupted application state in a way that wasn’t immediately obvious. The app kept running, but data quality degraded. By Monday, the business noticed. This is exactly the scenario where “just restore from last night” doesn’t cut it—you need multiple points-in-time to bracket when corruption began.
The infrastructure team didn’t debate whether snapshots existed or whether the schedule ran. They already had daily and hourly points, and they already had scripts that could list and compare snapshot candidates. They mounted a snapshot, validated application-level invariants, and moved forward stepwise until they found the last good state.
They recovered without filling the pool, without inventing a new process mid-incident, and without the classic “we have snapshots but we’re not sure what’s in them” chaos. The unglamorous part that mattered: they also had a hard capacity guardrail—alerts at 75/80/85% with an explicit runbook—so the pool never got close to the near-full cliff while they were doing forensic restores.
People love to buy reliability with features. They saved the day with habits.
Common mistakes: symptoms and fixes
Mistake 1: Confusing “referenced” with “used” and declaring victory too early
Symptom: You delete large directories, zfs list shows lower refer, but zpool list barely changes.
Cause: Snapshots are holding old blocks. Live dataset shrank; pool allocation didn’t.
Fix: Check usedbysnapshots and prune snapshots per retention policy. Validate no holds/clones block deletion.
Mistake 2: Blind snapshot pruning with a name pattern you didn’t verify
Symptom: Retention job deletes the wrong snapshots (or none), or deletes the newest ones and keeps the old ones.
Cause: Naming format changed, timezones changed, multiple tools writing different prefixes, or lexicographic ordering doesn’t match time ordering.
Fix: Sort by creation not by name; prove with a dry-run listing before destroying. Prefer zfs list -s creation as the source of truth.
Mistake 3: Ignoring clones and then wondering why destruction fails
Symptom: cannot destroy ... snapshot has dependent clones
Cause: Someone created a clone for dev/test or forensic work and forgot it existed.
Fix: Find clones via zfs get origin. Decide: destroy the clone, or promote it if it must live independently, then revisit retention.
Mistake 4: Holds used by replication, but nobody monitors replication lag
Symptom: Snapshot count grows; deletions silently fail; “oldest snapshot” age increases.
Cause: Replication tooling applies holds until a snapshot is safely transferred; replication falls behind and the held set grows.
Fix: Monitor replication end-to-end. During incident response, confirm last replicated snapshot before releasing holds. Fix the pipeline rather than fighting symptoms.
Mistake 5: Running the pool too hot
Symptom: Latency spikes, allocation slowdowns, sometimes ENOSPC-like errors while “df shows space.”
Cause: High pool utilization and fragmentation cause performance collapse, especially on random writes.
Fix: Restore headroom. Set operational SLOs for capacity (alert early). Re-evaluate snapshot retention and workload placement.
Mistake 6: Treating a metadata special vdev like a magic performance upgrade
Symptom: Main pool shows free space, but system still fails writes or performance collapses; special vdev is full.
Cause: Special vdev filled (metadata/small blocks). Snapshots and small-file churn can amplify metadata pressure.
Fix: Monitor special vdev allocation. Keep headroom there too. Consider adjusting special_small_blocks and snapshot policies, and plan capacity for metadata growth.
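zpool list -v breaks allocation down per vdev, which is the fastest way to spot a special vdev running hot while the data vdevs look comfortable:
cr0x@server:~$ zpool list -v tank
Look specifically at the ALLOC and FREE columns on the special vdev’s own line, not just the pool total. Once the special class fills, new metadata and small blocks start landing on the main vdevs, and the latency benefit you bought it for quietly evaporates.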
Checklists / step-by-step plan
Step-by-step: build a snapshot retention policy that doesn’t eat your pool
- Classify datasets by churn: VM images, databases, CI caches, user homes, logs. Don’t apply one schedule to all.
- Define recovery objectives: “I want hourly rollback for two days” is a requirement. “Keep everything forever” is a wish.
- Use tiered retention: short high-frequency window + longer low-frequency window. This is how you get granularity without indefinite pinning.
- Separate risky workloads: put CI caches and scratch space in their own datasets with aggressive prune schedules.
- Make snapshot names predictable: consistent prefix + ISO-like timestamp. Predictability prevents deletion bugs.
- Decide how holds/clones are handled: document who can clone, how long clones live, and how origins are protected.
- Add capacity guardrails: alerts at 75/80/85% and a hard “stop snapshot creation at X%” policy for non-critical datasets (a minimal sketch of that guard follows this checklist).
- Rehearse restore: mounting a snapshot and restoring a file should be routine, not a midnight art project.
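A minimal sketch of those last two habits combined: predictable names plus a machine-enforced capacity guard. Everything here is illustrative (path, threshold, naming prefix), and a real deployment would add logging and alerting:
cr0x@server:~$ cat /usr/local/bin/snap-guard.sh
#!/bin/sh
# Snapshot the dataset given as $1 unless its pool is above the capacity ceiling.
DATASET="$1"
LIMIT=85
POOL="${DATASET%%/*}"
CAP="$(zpool list -H -o capacity "$POOL" | tr -d '%')"
if [ "$CAP" -ge "$LIMIT" ]; then
    echo "pool $POOL at ${CAP}%: skipping snapshot of $DATASET" >&2
    exit 1
fi
exec zfs snapshot "${DATASET}@auto-$(date -u +%Y-%m-%d-%H%M)"
A cron job or systemd timer calls it per dataset; the point is that the “stop at X%” rule is enforced by the machine, not by whoever happens to be awake when the pool crosses the line.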
Step-by-step: safely reclaim space during a snapshot-related capacity incident
- Freeze change: pause snapshot creation schedules if they’re making things worse (especially minutely jobs).
- Find the top snapshot consumers: zfs list -o name,usedbysnapshots -r POOL.
- Identify blockers: holds and clones on the oldest snapshots you want to remove.
- Delete the oldest snapshots first: they typically pin the widest history. Avoid deleting “recent recovery points” unless necessary.
- Recheck pool capacity: don’t stop at “it went from 98% to 96%.” Get back to safe headroom.
- Post-incident fix: adjust retention tiers, tune snapshot frequency per dataset class, and repair replication lag if applicable.
Step-by-step: routine weekly audit (the boring practice)
- Review zpool list and fragmentation trends.
- Review top datasets by usedbysnapshots.
- Check snapshot counts per dataset; flag outliers.
- Look for old snapshots with unexpected holds.
- Scan for forgotten clones and decide whether they should be promoted or removed.
FAQ
1) Do ZFS snapshots copy the whole dataset?
No. They pin references to existing blocks. Space impact comes from later changes: old blocks can’t be freed while snapshots reference them.
2) Why does zfs list show huge snapshot usage on a dataset that “hasn’t changed much”?
Something changed more than you think. VM images, databases, and small-file workloads can churn blocks even when app-level data seems stable. Also verify you’re reading usedbysnapshots and not confusing it with refer.
3) If I delete a big directory, when do I get the space back?
You get it back when no snapshot references those blocks. If snapshots exist that predate the delete, the blocks remain pinned until those snapshots are destroyed (or until they age out of retention).
4) Why can’t I destroy a snapshot?
Common reasons: it has dependent clones, or it has holds. Check with zfs holds SNAP and find clones using zfs get origin -r.
5) Will deleting snapshots hurt performance?
Deleting many snapshots can create IO and CPU work, especially on busy systems, and it can contend with workload. But the alternative—running near-full—often hurts more. Do bulk deletes in controlled batches and observe latency.
6) Are more frequent snapshots always better?
No. More frequent snapshots improve recovery granularity, but can increase metadata overhead and keep more intermediate versions pinned. Tiered retention usually beats “minute snapshots forever.”
7) How much free space should I keep in a ZFS pool?
There’s no single number, but many production teams treat ~80–85% as a soft ceiling for mixed workloads, lower for heavy random-write pools. What matters is leaving enough contiguous-ish free space to avoid allocation pain and fragmentation spirals.
8) Are snapshots safe from ransomware?
Snapshots can protect against ransomware that encrypts files—if the attacker can’t delete the snapshots. If attackers get privileged access, they may destroy snapshots too. Replicate snapshots to a separate system with restricted credentials and independent retention if you want real resilience.
9) Why does snapshot “used” look different across servers?
Differences can come from property reporting defaults, feature flags, compression, recordsize, and workload churn. Always compare using explicit columns (e.g., usedbysnapshots) and consistent commands.
10) Is it okay to snapshot databases?
Yes, with care. For crash-consistent snapshots, you may need application coordination (flush/checkpoint) depending on the DB and recovery expectations. Also expect higher churn and plan retention accordingly.
Conclusion
ZFS snapshots are a superpower because they turn time into an operational primitive: you can create consistent restore points quickly, often without pausing workloads. The trap is that time isn’t free. Every snapshot is a promise to remember old blocks, and high-churn datasets will collect those promises like interest.
Run snapshots like you run any other production feature: measure change rate, classify datasets, set tiered retention, monitor capacity early, and treat holds/clones as first-class dependencies. Do that, and snapshots stay what they were meant to be: the fastest “undo” you’ve ever used—without turning your pool into a landfill.