If you’ve run ZFS in anger—on a database host, a VM farm, or a file server that quietly became a data lake—you’ve seen it: du swears a dataset is 2.1T, but zfs list insists it’s using 3.4T. Someone asks, “So… where did the extra terabyte come from?” and suddenly you’re doing incident response with a calculator.
This isn’t a ZFS bug. It’s accounting. ZFS reports multiple “truths” depending on the question you’re asking. du is a filesystem walk. zfs list is block-ownership accounting across snapshots, reservations, metadata, and pool-level policies like slop space. When you align the questions, the numbers align—mostly. When you don’t, you get surprises that look like leaks.
The core mismatch: files vs blocks vs history
du answers: “How many disk blocks are currently allocated to the files I can reach by walking this directory tree right now?”
zfs list answers: “How many blocks in the pool are attributable to this dataset and its snapshots, plus some subtleties like reservations?” Depending on the columns you ask for, it might answer “unique blocks” or “shared blocks” or “logical bytes before compression.”
Same storage system, different question. That difference gets amplified by ZFS’s defining features:
- Copy-on-write means old blocks stick around as long as snapshots reference them.
- Compression means “bytes in files” and “bytes on disk” are two different currencies.
- Block pointers, metadata, and spacemap overhead exist even when your directory looks “empty.”
- Pool-level behavior (slop space) hides capacity to keep the system alive.
Joke #1: Asking “why du and zfs list disagree” is like asking “why my bank balance doesn’t match my spending tracker”—one includes pending transactions and one includes your optimism.
Facts and historical context that actually matter
Storage behavior is politics plus physics, but ZFS also has history. A few short context points that help you reason about today’s numbers:
- ZFS popularized mainstream copy-on-write snapshots in general-purpose filesystems; that choice makes space usage inherently time-dependent.
- The original ZFS lineage came out of Solaris, where the operational target wasn’t “a laptop”; it was “a storage server that must not corrupt data when everything goes wrong.”
- ZFS space reporting evolved: newer OpenZFS exposes properties like usedbydataset, usedbysnapshots, and logicalused to reduce guesswork.
- The slop-space behavior (ZFS withholds a slice of pool capacity, 1/32 by default in OpenZFS) exists to reduce catastrophic failure modes near full pools; it's not a UI prank.
- Historically, admins filled filesystems to 99% because ext-era tooling taught them they could; ZFS punishes that habit because allocation and metaslab behavior degrade sharply when nearly full.
- Compression moved from “niche performance hack” to “default recommendation” for many ZFS deployments because modern CPUs are cheap and disk is not.
- Special vdevs (metadata/small blocks on fast media) changed the performance profile of “small file workloads,” but also introduced new ways to run out of the wrong kind of space.
- Snapshots are cheap in the “create” sense, not in the “keep forever” sense; the bill arrives when data churns.
- VM image and container layers made “sparse file + churn + snapshots” a common failure cocktail; it didn’t exist at this scale when ZFS was born.
A practical mental model of ZFS space
When you read ZFS space stats, keep three mental buckets:
1) Logical bytes (what applications think they wrote)
This is file sizes, database pages, VM image “virtual capacity.” It’s what humans want to reason about. ZFS can expose this via logicalused and logicalreferenced.
2) Physical bytes (what actually occupies the pool)
This is what determines if you hit 80%, 90%, and then “everything is on fire.” Compression, copies, parity/RAIDZ overhead, and metadata all land here. ZFS exposes it via used, referenced, and more detailed usedby* fields.
3) Ownership/attribution (who “owns” blocks across time)
With snapshots, blocks can be shared by multiple “views” of data. Your live filesystem may no longer reference a block, but a snapshot does, so the pool keeps it. du can’t see that because it’s walking the live tree, not the historical tree.
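If you want to see all three buckets side by side, one zfs get call covers it. A minimal sketch, reusing the tank/proj example dataset that appears in the tasks later; the figures are illustrative:
cr0x@server:~$ zfs get -o name,property,value used,logicalused,usedbydataset,usedbysnapshots tank/proj
NAME       PROPERTY         VALUE
tank/proj  used             3.40T
tank/proj  logicalused      4.65T
tank/proj  usedbydataset    1.80T
tank/proj  usedbysnapshots  1.55T
used and logicalused answer the physical-vs-logical question; the usedby* split answers the ownership question.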
What du measures (and what it ignores)
du walks directories and sums allocated blocks for reachable files. That’s useful, but it has blind spots on ZFS:
- Snapshots: A snapshot is not part of the live directory tree, so du ignores space pinned by snapshots unless you explicitly traverse .zfs/snapshot (if exposed).
- Dataset metadata and overhead: du reports file allocations, not ZFS metadata like block pointers, spacemap overhead, and indirect blocks that scale with fragmentation and churn.
- Compression truthfulness: du typically reports allocated blocks from the OS view. On ZFS, it can be closer to "physical" than "logical," but the exact behavior depends on platform and flags (--apparent-size vs default).
- Reservations: If a dataset has a reservation, the pool has capacity set aside, but du will not tell you about it.
- Holes/sparse regions: du counts allocated blocks; sparse zeros don't count unless written.
What zfs list measures (and why it feels “bigger”)
zfs list is block accounting, not a directory walk. Its default columns are deceptively simple:
- USED: physical space consumed by the dataset and its descendants (depending on context), including space referenced by snapshots.
- AVAIL: what ZFS will let you allocate, factoring in pool free space, quotas, reservations, and slop space.
- REFER: physical space accessible by this dataset (not including descendants).
Those words are accurate but incomplete. In production, you almost always want the breakdown columns, because USED is a composite figure.
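For reference, this is the default view most people start from. A minimal sketch with illustrative numbers, using the same example dataset as the tasks below:
cr0x@server:~$ zfs list tank/proj
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank/proj  3.40T  2.10T  1.95T  /tank/proj
USED already bundles live data, snapshots, children, and refreservation; the usedby* columns in the tasks below are how you take it apart.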
Snapshots: the usual suspect
Most du vs zfs list disagreements are “snapshots are holding onto deleted/overwritten blocks.” Copy-on-write means modifications allocate new blocks, and snapshots keep the old ones. If you delete a 500G directory today, and last night’s snapshot still exists, those blocks are still in the pool.
This gets worse with high churn workloads: databases, VM images, container layers, and anything that rewrites large files in place. The live dataset might look small (du), but its snapshot set is quietly preserving history as physical blocks (zfs list).
Joke #2: Snapshots are like that “temporary” Slack channel—cheap to create, expensive to keep, and nobody wants to be the one to delete it.
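If you want to know what actually churned between two snapshots, zfs diff names the files. The snapshot names here match the examples later in this article; the output is illustrative:
cr0x@server:~$ sudo zfs diff tank/proj@daily-2025-11-30 tank/proj@daily-2025-12-01
M       /tank/proj/db/state.mdb
-       /tank/proj/artifacts/build-4471.tar
+       /tank/proj/artifacts/build-4492.tar
Every modified or deleted entry corresponds to old blocks the earlier snapshot keeps pinned in the pool.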
Compression, logical space, and why humans get tricked
Compression is where intuition goes to die. On a compressed dataset:
- ls -l shows logical file size.
- du (default) often shows allocated space, which may track physical usage more closely.
- zfs list USED is physical usage (with snapshot/reservation caveats).
- zfs get logicalused shows logical usage (what was written before compression).
So you can see du smaller than zfs list because snapshots pin old blocks, but you can also see the opposite: du --apparent-size can show “bigger” than physical reality due to compression.
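A minimal demonstration you can run on a compressed dataset (assuming lz4 or similar is enabled on tank/proj; the file path and the 33M figure are made up to illustrate highly compressible data):
cr0x@server:~$ yes "the same log line over and over" | head -c 1G > /tank/proj/repeat.log
cr0x@server:~$ du -h /tank/proj/repeat.log
33M     /tank/proj/repeat.log
cr0x@server:~$ du -h --apparent-size /tank/proj/repeat.log
1.0G    /tank/proj/repeat.log
Same file, two answers, both correct for the question each flag asks.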
Operationally: when capacity planning, you care about physical bytes in the pool. When chargeback or tenant expectations show up, you may care about logical bytes. Pick one, label it, and don’t mix them in the same spreadsheet.
Sparse files, zvols, and “allocated” vs “written”
Sparse files are a classic mismatch generator: a VM disk image can be “1T” logical size but only 80G allocated. du and ZFS’s physical stats will tend to agree on “allocated,” while application teams keep quoting the logical size and wondering why “it doesn’t fit.”
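You can reproduce that mismatch in seconds with a sparse file (hypothetical path; the allocated figure varies by platform, but it will be tiny until real data lands):
cr0x@server:~$ truncate -s 1T /tank/proj/images/thin.img
cr0x@server:~$ du -h --apparent-size /tank/proj/images/thin.img
1.0T    /tank/proj/images/thin.img
cr0x@server:~$ du -h /tank/proj/images/thin.img
512     /tank/proj/images/thin.img
Until blocks are actually written, neither du nor ZFS physical accounting charges you for the virtual terabyte.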
Zvols add another layer: they’re block devices living inside ZFS. They can be thick or thin depending on settings (volmode, provisioning behavior, and the guest’s write patterns). Snapshotting zvols is common in VM stacks; it’s also a great way to preserve an impressive amount of churn.
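A quick way to see how a zvol is provisioned; tank/vm/zvol01 reappears in the tasks below, and the values here are illustrative. A thin zvol typically shows refreservation=none, while a thick one reserves roughly its volsize up front:
cr0x@server:~$ zfs get -o name,property,value volsize,volblocksize,refreservation,used tank/vm/zvol01
NAME            PROPERTY        VALUE
tank/vm/zvol01  volsize         200G
tank/vm/zvol01  volblocksize    128K
tank/vm/zvol01  refreservation  none
tank/vm/zvol01  used            83.4G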
Metadata, xattrs, ACLs, and small-file tax
Some workloads are mostly metadata: millions of small files, heavy xattrs, ACL-rich trees, or maildir-style patterns. ZFS metadata overhead is not free, and it does not show up as “file bytes.” du can under-represent the footprint because it’s summing file allocations, not counting all the indirect blocks, dnodes, and metadata structures needed to keep the filesystem coherent and fast.
If you’ve ever watched a “tiny files” dataset eat a pool, you learn to stop trusting averages like “files are only 4K.” They’re not. The file is 4K; the ecosystem around the file is the rest of your weekend.
Reservations, quotas, refreservations, and slop space
Reservations create “phantom usage” from the perspective of du. ZFS may set aside space for a dataset (reservation or refreservation) even if files aren’t using it. ZFS reports that reserved space as used/unavailable because it’s promised to someone.
Quotas (quota, refquota) constrain allocation. They don’t directly explain du vs zfs list mismatches, but they do explain “why AVAIL is smaller than expected.”
Then there’s slop space: ZFS typically withholds a chunk of pool free space to keep allocations and metadata updates from failing catastrophically near 100%. So even if you “have” free space, ZFS may refuse to hand it out.
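You can see the effect by comparing the pool's raw free space with what the root dataset is actually allowed to allocate. Illustrative numbers; the gap is slop space plus any reservations, and RAIDZ parity accounting widens it further:
cr0x@server:~$ zpool list -o name,size,free tank
NAME   SIZE   FREE
tank  20.0T  2.80T
cr0x@server:~$ zfs list -o name,avail tank
NAME  AVAIL
tank  2.16T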
copies=2, special vdevs, and other foot-guns
The copies property is a silent multiplier: copies=2 stores two copies of user data (within the same pool). Great for certain reliability needs; terrible if you forget it’s set and then compare against du.
Special vdevs (metadata/small blocks on SSD/NVMe) can make ZFS feel like a different filesystem. But they also create a second capacity domain: you can run out of special vdev space while the main pool looks fine. Your “space problem” becomes a “wrong vdev full” problem, and the symptoms are subtle until they aren’t.
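zpool list -v breaks capacity down per vdev, which is how you spot a nearly full special vdev while the pool as a whole looks fine. A trimmed, illustrative sketch:
cr0x@server:~$ zpool list -v -o name,size,alloc,free tank
NAME          SIZE  ALLOC   FREE
tank         20.0T  17.2T  2.80T
  raidz2-0   19.5T  16.8T  2.77T
  special        -      -      -
    mirror-1   480G   452G  28.0G
If that mirror-1 line fills up, new metadata and small-block allocations typically spill over to the regular vdevs, and the comfortable FREE number at the top stops telling the whole story.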
Three corporate-world mini-stories
Mini-story 1: An incident caused by a wrong assumption
The ticket started as a finance complaint: “Storage is charging us for 40% more than we store.” The team pulled du from the project directory and felt confident. The numbers were small. The chargeback numbers were big. Someone concluded the storage team’s reports were inflated and escalated.
We looked at the dataset and saw the usual pattern: a CI system generating artifacts, deleting them, generating more, and doing that all day. The dataset also had an “hourly snapshots, keep for 30 days” policy copied from a more stable workload. The live tree was modest; the snapshots were basically an archaeological record of build outputs.
The immediate mistake wasn’t “snapshots exist.” It was assuming du measured the same thing as pool consumption. The project’s du output was true for “what exists now.” It was irrelevant for “what blocks the pool must retain.”
We didn’t delete all snapshots; that’s how you earn enemies. We changed the retention for that dataset to match its churn profile, then introduced a separate dataset for artifacts with shorter retention and no long-lived snapshots. Chargeback stopped being a debate because the numbers were now aligned with intention.
The postmortem action item that mattered: every dataset got an explicit data-class label (ephemeral build artifacts, user home directories, databases, VM images) and snapshot policies were attached to the class, not vibes.
Mini-story 2: An optimization that backfired
A performance-minded team enabled aggressive compression and recordsize tuning across a mixed workload dataset. Their benchmark looked great: less disk IO, more cache hits, better throughput. They rolled it out widely and declared victory.
Then backup windows started slipping. Not because ZFS got slower, but because snapshots started retaining far more unique blocks than expected. The workload included large files that were frequently modified in small regions. A recordsize choice that was “fine for sequential throughput” increased write amplification for random updates. Each small change caused more block churn, which snapshots dutifully preserved. Physical usage climbed faster than anyone’s model.
The irony: the “optimization” did improve live performance while quietly increasing the long-term storage cost of the snapshot strategy. And because the pool was sized for the old churn profile, the growth looked like a leak.
The fix wasn’t to abandon compression. It was to segment workloads: separate datasets with appropriate recordsize, snapshot retention tuned to churn, and monitoring on written per snapshot interval. The lesson was boring: performance tuning without lifecycle accounting is just moving cost from one axis to another.
Mini-story 3: A boring but correct practice that saved the day
A different organization had a dull rule: every dataset must have a monthly “space accounting review” report including usedbydataset, usedbysnapshots, and a list of top snapshot consumers. People complained it was paperwork. It wasn’t paperwork; it was reconnaissance.
One quarter, a pool started trending toward 85% despite stable business metrics. Nobody panicked because the monthly reports showed a shift: usedbysnapshots was rising, but usedbydataset wasn’t. That’s not “more data”; that’s “more history per unit of data,” which usually means churn increased or retention changed.
They caught it early: an application had switched from append-only logs to periodic rewrites of a large state file, increasing churn dramatically. Because the team was already watching snapshot space as a first-class metric, they could respond before the pool hit the danger zone.
The fix was almost insulting in its simplicity: change the application to write new files and rotate, and shorten snapshot retention for that dataset. No heroic data migration. No emergency capacity purchase. The boring practice—measuring the right breakdown—saved the day.
Practical tasks: commands and interpretation
These are the field moves I reach for when someone says “ZFS is using more space than du.” Each task includes what to look for and how to interpret it.
Task 1: Get a truthful dataset breakdown (usedby*)
cr0x@server:~$ zfs list -o name,used,refer,usedbydataset,usedbysnapshots,usedbychildren,usedbyrefreservation tank/proj
NAME USED REFER USEDDS USEDSNAP USEDCHILD USEDREFRESERV
tank/proj 3.40T 1.95T 1.80T 1.55T 50.0G 0B
Interpretation: The mismatch is right there: snapshots account for 1.55T. du is mostly reflecting the ~1.8T live dataset (give or take compression and metadata), not the snapshot history.
Task 2: Compare du “allocated” vs “apparent” sizes
cr0x@server:~$ du -sh /tank/proj
1.9T /tank/proj
cr0x@server:~$ du -sh --apparent-size /tank/proj
2.6T /tank/proj
Interpretation: Default du is closer to allocated blocks; apparent size is logical file sizes. If compression is on, logical can be much larger than physical.
Task 3: Check compression ratio and logical space at the dataset level
cr0x@server:~$ zfs get -o name,property,value -H compressratio,compression,logicalused,logicalreferenced tank/proj
tank/proj compressratio 1.37x
tank/proj compression lz4
tank/proj logicalused 4.65T
tank/proj logicalreferenced 2.70T
Interpretation: Logical bytes are higher than physical. If someone is comparing logical totals to physical pool occupancy, you’ll have an argument instead of a plan.
Task 4: List snapshots and see which ones are expensive
cr0x@server:~$ zfs list -t snapshot -d 1 -o name,used,refer,creation -S used tank/proj
NAME USED REFER CREATION
tank/proj@daily-2025-12-01 120G 1.95T Mon Dec 1 00:00 2025
tank/proj@daily-2025-11-30 118G 1.95T Sun Nov 30 00:00 2025
tank/proj@hourly-2025-12-24-23 22G 1.95T Wed Dec 24 23:00 2025
Interpretation: Snapshot USED is “unique space attributable to this snapshot” (changes since the previous snapshot in that lineage). Large numbers indicate heavy churn during that interval.
Task 5: Find the “written since snapshot” signal (great for churn)
cr0x@server:~$ zfs get -H -o name,property,value written tank/proj
tank/proj written 0B
cr0x@server:~$ zfs snapshot tank/proj@now
cr0x@server:~$ zfs get -H -o name,property,value written tank/proj
tank/proj written 18.4G
Interpretation: written is how much data has been written since the last snapshot. If this is huge every hour, long retention will be expensive.
Task 6: Confirm whether .zfs snapshots are visible and whether du is walking them
cr0x@server:~$ zfs get -H -o value snapdir tank/proj
hidden
cr0x@server:~$ ls -a /tank/proj | grep zfs
cr0x@server:~$ ls /tank/proj/.zfs/snapshot
daily-2025-11-30  daily-2025-12-01  hourly-2025-12-24-23
Interpretation: With snapdir=hidden, .zfs does not appear in directory listings, so du ignores snapshots entirely; the directory is still reachable by explicit path. If snapdir=visible, an incautious recursive du can traverse snapshots and “double count” from a human perspective.
Task 7: Look for reservations and refreservations
cr0x@server:~$ zfs get -H -o name,property,value reservation,refreservation tank/proj
tank/proj reservation 0B
tank/proj refreservation 500G
Interpretation: A refreservation can make the dataset look “big” from ZFS’s point of view even if files are small. It is space promised to that dataset’s referenced data.
Task 8: Inspect quotas and refquotas (why AVAIL looks wrong)
cr0x@server:~$ zfs get -H -o name,property,value quota,refquota,avail,used tank/proj
tank/proj quota 2T
tank/proj refquota none
tank/proj avail 120G
tank/proj used 1.88T
Interpretation: Even if the pool has free space, a quota caps growth. People often mistake this for “mysterious missing capacity.” It’s policy doing its job.
Task 9: Check pool health and slop-space reality via zpool list
cr0x@server:~$ zpool list -o name,size,alloc,free,capacity,health tank
NAME SIZE ALLOC FREE CAPACITY HEALTH
tank 20.0T 17.2T 2.80T 86% ONLINE
Interpretation: At ~86% full, you’re in the zone where fragmentation and allocation behavior can get ugly. Also expect zfs list AVAIL to be smaller than “FREE” due to slop space and dataset-level constraints.
Task 10: Find datasets with heavy snapshot overhead across the pool
cr0x@server:~$ zfs list -o name,used,usedbydataset,usedbysnapshots -r tank | head
NAME USED USEDDS USEDSNAP
tank 17.2T 2.10T 14.7T
tank/proj 3.40T 1.80T 1.55T
tank/vm 8.90T 2.40T 6.30T
tank/home 2.10T 1.95T 120G
Interpretation: When usedbysnapshots dominates, the pool is paying for history. That may be correct—but it must be intentional.
Task 11: Spot “deleted but still used” space: open file handles
cr0x@server:~$ sudo lsof +L1 | head
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
java 2114 app 123w REG 0,119 1048576 0 912345 /tank/proj/logs/app.log (deleted)
Interpretation: This is not ZFS-specific, but it’s a frequent culprit. The file is deleted, du doesn’t count it, but the space remains allocated until the process closes it.
Task 12: Estimate small-file pressure and metadata-heavy trees
cr0x@server:~$ find /tank/proj -xdev -type f -size -16k | wc -l
4821931
cr0x@server:~$ du -sh /tank/proj
1.9T /tank/proj
Interpretation: Millions of tiny files means metadata overhead and potential special-vdev pressure (if configured). If your “data” is small but you have a huge count of files, expect the pool to behave differently than a big-file workload.
Task 13: Check recordsize and volblocksize choices (write amplification clues)
cr0x@server:~$ zfs get -H -o name,property,value recordsize tank/proj
tank/proj recordsize 1M
cr0x@server:~$ zfs get -H -o name,property,value volblocksize tank/vm/zvol01
tank/vm/zvol01 volblocksize 128K
Interpretation: Large recordsize can be great for streaming IO, but can increase churn cost under snapshots for random updates. Zvol volblocksize matters similarly.
Task 14: Detect copies property multiplying usage
cr0x@server:~$ zfs get -H -o name,property,value copies tank/proj
tank/proj copies 2
Interpretation: You’re paying double for data blocks (plus metadata). If nobody intended this, it’s a clean explanation for “why ZFS is bigger than du.”
Task 15: Show refreservation impact on referenced space specifically
cr0x@server:~$ zfs list -o name,refer,usedbyrefreservation tank/proj
NAME REFER USEDREFRESERV
tank/proj 1.95T 500G
Interpretation: Even if REFER is stable, the dataset has 500G pinned for it. That reduces pool flexibility and explains “why free space disappeared.”
Fast diagnosis playbook
This is the “I have 10 minutes before a capacity review call” playbook. It won’t solve everything, but it will tell you which direction to run.
First: determine if the discrepancy is snapshots, reservations, or something else
cr0x@server:~$ zfs list -o name,used,refer,usedbydataset,usedbysnapshots,usedbyrefreservation tank/target
NAME USED REFER USEDDS USEDSNAP USEDREFRESERV
tank/target 3.4T 1.9T 1.8T 1.6T 0B
If usedbysnapshots is big: it’s historical retention + churn.
If usedbyrefreservation is big: it’s policy/reservation.
If usedbydataset is big but du is small: suspect open-but-deleted files, special cases like copies, or du walking the wrong mountpoint.
Second: check whether du is measuring logical or physical, and whether it’s crossing boundaries
cr0x@server:~$ du -sh /tank/target
cr0x@server:~$ du -sh --apparent-size /tank/target
cr0x@server:~$ mount | grep 'tank/target'
What you learn: whether you’re comparing allocated vs apparent bytes, and whether the path you’re scanning is actually the dataset you think it is.
Third: identify churn rate and the snapshot retention policy
cr0x@server:~$ zfs get -H -o name,property,value written,com.sun:auto-snapshot tank/target
tank/target written 220G
tank/target com.sun:auto-snapshot true
What you learn: If written jumps by hundreds of gigabytes per interval, long snapshot retention will dominate physical usage.
Fourth: confirm pool fullness and whether you’re hitting the “near full” pain curve
cr0x@server:~$ zpool list -o name,alloc,free,capacity tank
NAME ALLOC FREE CAPACITY
tank 17.2T 2.8T 86%
What you learn: At high utilization, everything gets harder: frees don’t show up quickly (because they’re snapshot-held), allocations fragment, performance can sag, and “AVAIL” becomes political.
Common mistakes, symptoms, and fixes
Mistake 1: Using zfs list USED as “live data size”
Symptom: Dataset “USED” is huge, but du is smaller and app teams swear nothing changed.
Cause: USED includes snapshot-referenced space (and potentially reservations).
Fix: Use usedbydataset for live data and usedbysnapshots for historical overhead; adjust snapshot retention or churn behavior.
Mistake 2: Treating “deleted files” as immediately freed space
Symptom: rm -rf runs, du drops, but pool space doesn’t recover.
Cause: Snapshots still reference those blocks; or a process still has the file open (deleted-but-open).
Fix: Check snapshot consumers; run lsof +L1; restart/logrotate misbehaving processes; expire snapshots intentionally.
Mistake 3: Comparing du --apparent-size to zfs USED
Symptom: du --apparent-size says 10T, ZFS says 4T; someone calls it “missing data.”
Cause: Compression (and sometimes sparse files) reduces physical usage.
Fix: Decide whether you want logical or physical accounting. Use logicalused alongside used and label charts clearly.
Mistake 4: Ignoring refreservation and then wondering where capacity went
Symptom: Pool free space shrinks faster than live data growth; AVAIL looks low everywhere.
Cause: refreservation pins space even if not used by files.
Fix: Audit refreservation/reservation across datasets; remove or right-size. Keep reservations for workloads that truly need guaranteed headroom.
Mistake 5: Snapshotting high-churn VM images like they’re home directories
Symptom: Snapshot space grows explosively; rollbacks work great, capacity doesn’t.
Cause: Random writes + COW + frequent snapshots = lots of unique blocks retained.
Fix: Tune snapshot frequency/retention; consider replication cadence; separate VM datasets; ensure recordsize/volblocksize choices fit the IO pattern.
Mistake 6: copies=2 enabled “temporarily” and forgotten
Symptom: Used space is roughly double what you expect; no smoking gun in snapshots.
Cause: copies property duplicates blocks.
Fix: Audit with zfs get -r copies tank. If removing, understand that changing it affects new writes; existing blocks may remain until rewritten.
Mistake 7: Running the pool too full, then calling it a capacity bug
Symptom: Writes slow down, frees don’t seem to help, allocation errors near “still some free.”
Cause: High fragmentation + slop space + metaslab constraints.
Fix: Keep pools under a sensible threshold (many teams aim below ~80% for general workloads). Expand capacity or migrate data before you reach the cliff.
Checklists / step-by-step plan
Checklist: “du smaller than zfs list USED” (most common)
- Run zfs list -o usedbydataset,usedbysnapshots,usedbyrefreservation for the dataset.
- If usedbysnapshots is large, list snapshots sorted by used and identify retention policy.
- Check churn rate with zfs get written between snapshot intervals.
- Confirm nobody is accidentally snapshotting too frequently (automation properties, cron jobs, orchestration tooling).
- Verify open-but-deleted files with lsof +L1.
- Decide: change retention, change churn behavior, or buy space. Do not “just delete random snapshots” without understanding replication/backup dependencies.
Checklist: “zfs list REFER smaller than du”
- Check whether you used du --apparent-size (logical) while REFER is physical.
- Check compression settings and compressratio.
- For sparse files, compare ls -l logical size vs du allocated.
- Ensure du isn’t crossing mountpoints or scanning a different path than the dataset mount (see the sketch below).
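A quick way to confirm you are measuring the dataset you think you are, and that nothing else is mounted beneath it. Illustrative output, reusing the tank/target name from the playbook:
cr0x@server:~$ zfs get -H -o value mountpoint tank/target
/tank/target
cr0x@server:~$ df -h /tank/target
Filesystem    Size  Used  Avail Use% Mounted on
tank/target   2.0T  1.9T   120G  95% /tank/target
cr0x@server:~$ du -xsh /tank/target
1.9T    /tank/target
du -x stays on one filesystem, so child datasets and foreign mounts under the path don’t leak into the number.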
Checklist: “AVAIL is much smaller than pool free”
- Check dataset quota/refquota and reservation/refreservation.
- Check pool utilization (zpool list) and consider slop space behavior.
- Look for other datasets with large reservations pinning capacity (see the audit sketch after this checklist).
- If the pool is near full, stop treating “FREE” as usable; plan expansion/migration.
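A one-line audit that surfaces reservations across the pool; a sketch, where the awk filter simply hides datasets with nothing reserved:
cr0x@server:~$ zfs get -r -H -o name,property,value reservation,refreservation tank | awk '$3 != "none" && $3 != "0B"'
tank/proj       refreservation  500G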
FAQ
1) Which number should I trust: du or zfs list?
Trust the one that matches your question. For “how big is the live directory tree,” use du (allocated) or du --apparent-size (logical). For “how much pool capacity is consumed,” use ZFS physical accounting (zfs list plus usedby* breakdown).
2) Why does deleting files not free space in the pool?
Because snapshots keep old blocks. Deleting only removes the live reference. Space returns when no snapshot (and no clone) references those blocks. Also check for deleted-but-open files with lsof +L1.
3) What does zfs list REFER mean?
REFER is the amount of physical space referenced by the dataset itself (not its descendants). It’s closer to “live view” than USED, but it still won’t match du --apparent-size on compressed datasets.
4) What does “snapshot USED” actually represent?
Snapshot USED is the amount of space that would be freed if that snapshot were destroyed, assuming no other snapshot/clone also references the same blocks. It’s a “unique contribution” estimate, not “snapshot size” in the way people imagine.
5) Why is AVAIL smaller than zpool FREE?
Because AVAIL is constrained by dataset quotas/reservations and by pool behavior (including slop space). ZFS is conservative on purpose near full pools.
6) Can du ever be larger than zfs list?
Yes, if you use du --apparent-size (logical sizes) on a compressed dataset, or when sparse files report huge logical size. ZFS physical usage can be much smaller than that.
7) How do I find what’s consuming snapshot space?
Start by identifying datasets where usedbysnapshots is high, then list snapshots sorted by used. Next, measure churn with written per snapshot interval. Finally, correlate with workload changes (VM churn, database maintenance jobs, CI artifact patterns).
8) Does compression make snapshots cheaper or more expensive?
Both, depending on the workload. Compression reduces the physical size of each block, so retaining old blocks can be cheaper. But if compression changes record packing or workload behavior, it can also increase churn and thus snapshot-retained unique blocks. Measure written and snapshot used rather than guessing.
9) Are reservations “wasted space”?
They’re reserved space—capacity held back so a dataset can keep writing under contention. That’s not waste if it prevents outages for critical services. It is waste if you set it and forget it on everything.
10) Is it safe to just delete snapshots to get space back?
Mechanically, yes; operationally, “it depends.” Snapshots might be part of your backup/replication chain, or required for recovery objectives. Delete intentionally: pick a retention policy, validate replication behavior, and remove snapshots in the right order if tooling expects it.
Conclusion
du and zfs list disagree because they’re measuring different realities: the live filesystem’s reachable files versus the pool’s block ownership across time, policy, and metadata. ZFS isn’t lying; it’s telling you about history, promises (reservations), and physics (compression, block sizes, overhead).
The operational move is to stop arguing about “the right number” and start standardizing which number you use for which decision: logical bytes for user expectations, physical bytes for capacity, and a snapshot/churn breakdown for anything involving retention. Once you do that, the discrepancy stops being mysterious—and starts being a lever you can pull.