ZFS snapdir=visible: Auditing Snapshots Without Making Users Rage

There’s a particular kind of panic that hits an organization when someone says, “We need to audit the files as they looked last Tuesday,” and the storage team replies, “Sure, but only root can see the snapshots.” It’s not a technical limitation so much as a policy choice—and like most policy choices in storage, it becomes a UX problem the second users are involved.

snapdir=visible is one of ZFS’s deceptively small switches that changes how people interact with a filesystem. Done right, it turns snapshot forensics into a self-serve experience and stops your on-call from becoming a human file-restore API. Done wrong, it becomes the world’s most confusing “mystery folder,” breaks expectations in SMB/NFS clients, and creates the kind of ticket storm that makes you consider farming.

What snapdir actually does (and what it doesn’t)

ZFS snapshots are not “backup copies” in the old file-server sense. They’re read-only, point-in-time views of a dataset. Internally, they’re metadata references to blocks that already exist; ZFS uses copy-on-write, so blocks referenced by snapshots are preserved until no snapshot (and no live file) references them anymore.

The snapdir dataset property controls how snapshots are exposed through the filesystem namespace. Specifically:

  • snapdir=hidden (the default): the .zfs directory is not shown in directory listings, but it still exists and can be entered by explicit path (cd <mountpoint>/.zfs works).
  • snapdir=visible: the .zfs directory appears in listings at the root of the dataset’s mountpoint, with a snapshot subdirectory containing each snapshot as a browsable, read-only directory tree.

What it doesn’t do: it doesn’t grant permission. It doesn’t override POSIX mode bits, ACLs, or SMB share rules. It doesn’t magically make snapshot contents writable. It also doesn’t turn snapshots into independent datasets you can mount anywhere. This matters because a lot of “users can see snapshots now!” expectations collapse into “users can see snapshot names but not the files they want,” which is less empowering and more rage-inducing.

Two sentences of wisdom from production: visibility is not access, and access without a story is chaos. If you toggle visibility without explaining what users should do with it, they will explore it like a toddler in a server room: curiously and with no sense of consequences.

Why make snapshots visible at all?

The sales pitch is simple: faster audits and fewer restores. The reality is nuanced: it changes the shape of incident response and shifts some complexity from SREs to users. That can be a win—as long as you constrain it with sane defaults, training, and guardrails.

Use cases where snapdir=visible shines

  • Audit & compliance reviews: “Show me what this folder looked like on the 1st.” Users can browse snapshots to verify presence/absence of artifacts without opening a ticket.
  • Self-serve recovery: restoring a deleted file becomes copy-from-snapshot, not “wait for storage team.”
  • Forensic diffing: comparing two points in time for configuration drift or data tampering. ZFS snapshots give you a consistent view.
  • Reducing restore blast radius: users can restore only what they need, instead of requesting a broad rollback (which is the storage equivalent of using a sledgehammer to fix a watch).

Use cases where it causes problems

  • Shared datasets with messy permissions: users see snapshot trees but can’t read most of them due to ACLs. This generates “broken backups” tickets.
  • SMB/NFS client oddities: some clients treat leading-dot directories as hidden; others don’t. Indexers and backup tools may recurse unexpectedly.
  • Multi-tenant environments: exposing snapshot names can leak information (project names, incident names, retention policies) even if data access is blocked.

Joke #1: Making snapshots visible without a plan is like putting your spare keys under the doormat and then acting surprised when everyone checks the doormat.

How the .zfs directory behaves in real life

When you set snapdir=visible on a dataset, ZFS exposes a special directory at the root of that dataset: .zfs. Under it you typically get:

  • .zfs/snapshot/ — directories named after each snapshot, each containing a full filesystem tree as it looked at snapshot time.

A few behaviors that matter operationally:

  • The snapshot tree is read-only. Users can copy out of it but not modify within it.
  • Permissions are evaluated the same way as in the live dataset. If a user couldn’t read /data/hr/payroll.xlsx yesterday (dataset mounted at /data), they also can’t read it at /data/.zfs/snapshot/yesterday/hr/payroll.xlsx.
  • It’s per-dataset. If you have nested datasets, each dataset has its own .zfs view. This can be confusing when users traverse mountpoints and suddenly see different snapshot lists.
  • It’s not a normal directory. Some tools treat it strangely: recursion behavior, inode numbers, and traversal semantics can surprise older utilities.

There’s also a subtle social behavior: once users discover .zfs/snapshot, they start treating it like a “time machine.” That’s fine—until retention expires and the “time machine” runs out of fuel. If you want fewer angry emails, communicate retention as a product feature (“we keep 30 daily snapshots”) rather than a magical property of the universe.
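
That retention statement is easy to back with a dumb pruning rule. A minimal sketch, assuming snapshot names sort chronologically (the dataset names and keep count below are placeholders; in production the list would come from zfs list -H -t snapshot -o name -s creation -r <dataset>):

```shell
# Sketch: print snapshots beyond the newest $keep as prune candidates.
# Assumes names sort chronologically (auto-YYYYMMDD-HHMM does).
keep=2
printf '%s\n' \
  tank/projects@auto-20251221-0900 \
  tank/projects@auto-20251222-0900 \
  tank/projects@auto-20251223-0900 \
  tank/projects@auto-20251224-0900 |
  head -n -"$keep"   # GNU head: everything except the newest $keep lines
```

Here the two oldest names are printed as candidates; pipe the result into your change process, not straight into zfs destroy.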

Facts & historical context worth knowing

Some context makes better decisions. Here are a handful of concrete facts that repeatedly matter in real deployments:

  1. ZFS snapshots are cheap to create because they mostly record metadata; the big cost appears later when snapshots keep old blocks alive.
  2. Copy-on-write is why snapshots work: ZFS never overwrites in-place; changes write new blocks, and snapshots keep references to the old blocks.
  3. The .zfs directory is synthetic: it isn’t stored like normal files; it’s a virtual view provided by ZFS.
  4. Visibility was designed for admin workflows and later became a self-service recovery mechanism in many shops—often without a formal UX plan.
  5. “Snapshot name leaks” are a real thing: even if data is protected, snapshot names can reveal project codenames, incident dates, or HR-related hints if you bake those into names.
  6. Recursive snapshotting can create illusionary completeness: snapshotting a parent dataset doesn’t automatically mean all child datasets are included unless you do it explicitly or use recursive flags.
  7. SMB “Previous Versions” is a separate feature from snapdir=visible; you can support one, the other, or both, and they behave differently.
  8. Snapshots are not backups: they live in the same pool, so pool loss takes them too, and if ransomware encrypts the live dataset you may faithfully replicate the encrypted state. Snapshots are a local safety net, not a disaster recovery plan.
  9. Retention policy is capacity policy: “keep more snapshots” is indistinguishable from “buy more disk,” just with fewer procurement meetings.
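
Fact 6 deserves a concrete shape. A hedged sketch of the recursive flag (the dataset name is a placeholder, and the command is echoed rather than executed so you can review it first):

```shell
# Dry-run sketch: build a recursive snapshot command and echo it.
# "tank/projects" is a placeholder dataset name.
ds="tank/projects"
snap="auto-$(date -u +%Y%m%d-%H%M)"
# -r snapshots the dataset AND all child datasets atomically, under one name.
echo zfs snapshot -r "${ds}@${snap}"
```

Without -r, only the parent dataset gets the snapshot, and users browsing a child dataset’s .zfs will find nothing for that timestamp.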

Three corporate-world mini-stories (pain, irony, salvation)

1) Incident caused by a wrong assumption: “Visible means recoverable”

A large internal file share had frequent “oops” deletions. The storage team enabled snapdir=visible to cut restore tickets. Within a day, the helpdesk queue improved—until a compliance analyst tried to retrieve a file from a snapshot and got “Permission denied.” They escalated, assuming snapshots were “backup archives” and therefore accessible to auditors by design.

The wrong assumption was subtle: the analyst’s role granted read access to some reports via an application layer, not via filesystem permissions. In the live world, the app mediated access. In the snapshot world, the user was talking directly to POSIX permissions and ACLs, which were stricter. The user could list snapshot names but couldn’t traverse key directories.

The incident became political because it looked like storage had “broken the audit process.” In truth, storage exposed reality: the audit role never had direct filesystem access. The fix wasn’t technical at first—it was vocabulary. The team wrote a short internal note: “Snapshots expose filesystem history; they do not elevate privileges.” Then they added a controlled “auditor group” with read-only ACLs on specific datasets, and they created a documented process for temporary access approvals.

After that, the win was real: audits became faster, and the on-call stopped doing artisanal restores at 2 a.m. But the key lesson stuck: a feature toggle is not a process.

2) Optimization that backfired: “Let’s make everything visible everywhere”

A platform team decided to standardize: all datasets get snapdir=visible. Their goal was consistency and fewer special cases. They rolled it out on a Friday afternoon because the change was “just a property.” (If you’re looking for a sign from the universe: that was the sign.)

Monday morning brought the fun. A set of file-indexing agents—originally deployed to speed up enterprise search—started crawling .zfs/snapshot trees. Each snapshot looked like a whole new universe of files. The indexers didn’t understand that these were historical views and happily re-indexed the same content dozens of times across snapshots. CPU on the indexer farm spiked, network traffic climbed, and the storage cluster saw a lot more metadata reads than usual.

Worse, some teams used tools that treat any directory tree as candidate backup content. A few “homegrown backup scripts” (the kind written once, never tested again, and kept alive by superstition) followed .zfs and ballooned backup windows. Nobody wanted to admit it, but the “optimization” had created a tax across multiple systems.

The rollback was quick: disable visibility on broad shares, then re-enable selectively where self-serve recovery mattered and where clients were well-behaved. The longer-term fix was to configure indexers and backup tools to exclude /.zfs, and to create datasets with clear intent: end-user shares got a curated snapshot exposure policy, while machine datasets stayed hidden.
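
As a concrete example of a tool-level exclusion, mlocate’s updatedb can skip any directory named .zfs via PRUNENAMES. The config path and existing defaults vary by distribution, so treat this as a sketch and append to, rather than replace, your current list:

```
# /etc/updatedb.conf (mlocate): PRUNENAMES matches directory basenames
# anywhere in the tree, so a single entry covers .zfs on every dataset.
PRUNENAMES = ".zfs"
```

Indexers, antivirus scanners, and backup agents each have their own exclusion syntax; the principle is the same everywhere: .zfs is history, not new content.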

Joke #2: The fastest way to find a forgotten crawler is to give it an infinite maze and watch it sprint.

3) A boring but correct practice that saved the day: “A tiny dataset boundary and predictable names”

An engineering org stored build artifacts and release packages on ZFS. They were strict about dataset layout: each product line had its own dataset, and permissions were consistent. Snapshot naming followed a boring standard: auto-YYYYMMDD-HHMM, created hourly and pruned daily.

One day, a release manager discovered a signed binary had disappeared from the “final” directory. The immediate fear was compromise. They needed to answer two questions: when did it disappear, and who had access? Because snapdir=visible was enabled on that dataset, the release team could immediately browse snapshots, find the last snapshot where the file existed, and narrow the window to a specific hour.

The storage team still got involved, but now the question wasn’t “please restore everything.” It was “we’ve identified the disappearance window; can you correlate it with SMB audit logs and confirm there was no pool issue?” That is the kind of ticket you want: bounded, evidence-driven, and fast.

The final outcome was mundane: an automation job had an overly broad cleanup pattern. But the investigation avoided drama because the dataset boundaries were clean, snapshot names were predictable, and snapshot exposure was intentional. It wasn’t heroic engineering; it was good hygiene paying interest.

Practical tasks: commands, outputs, and interpretation

Below are real tasks you can run in production to implement, audit, and troubleshoot snapdir=visible. Each includes what to look for so it’s not just command spam.

Task 1: Check current snapdir setting on a dataset

cr0x@server:~$ zfs get -H -o name,property,value,source snapdir tank/projects
tank/projects	snapdir	hidden	default

Interpretation: It’s using the default value (hidden). If the source is local, someone explicitly set it; if it’s inherited, a parent dataset controls it.

Task 2: Enable snapdir=visible (single dataset)

cr0x@server:~$ sudo zfs set snapdir=visible tank/projects
cr0x@server:~$ zfs get -H snapdir tank/projects
tank/projects	snapdir	visible	local

Interpretation: The dataset will now show a .zfs directory at its mountpoint. This does not change snapshot retention, create snapshots, or alter permissions.

Task 3: Verify .zfs exists and is visible in the mountpoint

cr0x@server:~$ ls -la /tank/projects | head
total 12
drwxr-xr-x  6 root root    6 Dec 24 10:01 .
drwxr-xr-x  4 root root    4 Dec 24 09:58 ..
drwxr-xr-x  2 root root    2 Dec 24 10:01 .zfs
drwxr-xr-x 18 root root   18 Dec 24 09:59 engineering
drwxr-xr-x 11 root root   11 Dec 24 10:00 finance

Interpretation: If .zfs isn’t present, confirm you’re listing the dataset’s mountpoint (not a parent directory) and that the dataset is mounted.

Task 4: List snapshots via ZFS CLI (ground truth)

cr0x@server:~$ zfs list -t snapshot -o name,creation,used,refer,mountpoint -r tank/projects | head
NAME                                   CREATION                USED  REFER  MOUNTPOINT
tank/projects@auto-20251224-0900        Wed Dec 24 09:00 2025   12M   1.20T  -
tank/projects@auto-20251224-1000        Wed Dec 24 10:00 2025   18M   1.21T  -

Interpretation: Snapshot USED is how much unique space that snapshot is holding (blocks not referenced elsewhere). If USED grows, retention has a direct capacity cost.

Task 5: List snapshots by browsing .zfs

cr0x@server:~$ ls -1 /tank/projects/.zfs/snapshot | head
auto-20251224-0900
auto-20251224-1000

Interpretation: If ZFS CLI shows snapshots but .zfs/snapshot is empty, you’re likely not in the dataset you think you are (nested dataset confusion), or the client path is crossing a mountpoint boundary.

Task 6: Copy a deleted file back from a snapshot (safe self-serve restore)

cr0x@server:~$ cp -av /tank/projects/.zfs/snapshot/auto-20251224-0900/engineering/readme.txt \
> /tank/projects/engineering/readme.txt
'/tank/projects/.zfs/snapshot/auto-20251224-0900/engineering/readme.txt' -> '/tank/projects/engineering/readme.txt'

Interpretation: Use cp (or rsync) to restore into the live tree. Don’t attempt to write inside the snapshot tree; it’s read-only and will fail.

Task 7: Confirm a user’s ability to traverse snapshot directories

cr0x@server:~$ sudo -u alice ls -la /tank/projects/.zfs/snapshot/auto-20251224-0900/finance
ls: cannot open directory '/tank/projects/.zfs/snapshot/auto-20251224-0900/finance': Permission denied

Interpretation: This is usually correct behavior. It means the snapshot honors the same permissions as the live tree. If users expect access, you need an ACL/permission discussion, not a storage toggle.

Task 8: Find what’s consuming space due to snapshots (high-level)

cr0x@server:~$ zfs list -o name,used,avail,refer,compressratio,mountpoint tank/projects
NAME          USED  AVAIL  REFER  RATIO  MOUNTPOINT
tank/projects 3.41T 2.12T  1.21T  1.68x  /tank/projects

Interpretation: USED includes snapshot-held blocks. If REFER is stable but USED climbs, snapshots (or clones) are keeping old data around.

Task 9: See snapshot space consumption per snapshot

cr0x@server:~$ zfs list -t snapshot -o name,used,refer -s used -r tank/projects | tail -5
tank/projects@auto-20251220-0900  1.9G  1.08T
tank/projects@auto-20251221-0900  2.3G  1.10T
tank/projects@auto-20251222-0900  3.1G  1.14T
tank/projects@auto-20251223-0900  4.8G  1.18T
tank/projects@auto-20251224-0900   12M  1.20T

Interpretation: Older snapshots often hold more unique blocks, especially if the dataset sees churn (VM images, build outputs, databases). A few heavy snapshots can dominate costs.
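
If you want a single number for how much space snapshots are pinning, sum the per-snapshot USED column. A sketch using sample byte counts (in production, replace the here-doc with zfs list -Hp -t snapshot -o used -r tank/projects, where -p emits raw bytes):

```shell
# Sum the per-snapshot USED column (bytes) into one human-readable total.
# The three numbers below are sample data standing in for real output.
awk '{ total += $1 } END { printf "%.1f GiB held by snapshots\n", total / 1073741824 }' <<'EOF'
1073741824
2147483648
3221225472
EOF
# prints: 6.0 GiB held by snapshots
```

One caveat: per-snapshot USED counts only blocks unique to that snapshot, so the sum understates what deleting all snapshots would free; usedbysnapshots on the dataset is the authoritative total.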

Task 10: Detect “snapshot directory crawlers” using filesystem activity symptoms

cr0x@server:~$ sudo zpool iostat -v tank 2 3
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        8.10T  2.12T    980    210   118M  22.1M
  raidz2    8.10T  2.12T    980    210   118M  22.1M
    sda         -      -    160     35  19.7M  3.6M
    sdb         -      -    165     36  19.4M  3.7M
    sdc         -      -    162     34  19.6M  3.5M

Interpretation: A sudden rise in read IOPS and read bandwidth after enabling visibility often points to crawlers (indexers, antivirus, backup scripts) walking snapshot trees.

Task 11: Check dataset mountpoints and nested dataset boundaries (avoids confusion)

cr0x@server:~$ zfs list -o name,mountpoint -r tank/projects
NAME                       MOUNTPOINT
tank/projects              /tank/projects
tank/projects/engineering  /tank/projects/engineering
tank/projects/finance      /tank/projects/finance

Interpretation: Each child dataset has its own snapshot namespace and potentially its own snapdir setting. Users crossing into /tank/projects/engineering will see .zfs for that dataset, not the parent’s snapshots.

Task 12: Set snapdir via inheritance (apply to a subtree)

cr0x@server:~$ sudo zfs set snapdir=visible tank/projects
cr0x@server:~$ zfs get -H -o name,value,source snapdir tank/projects/engineering
tank/projects/engineering	visible	inherited from tank/projects

Interpretation: Inheritance is your friend for consistency, but it also increases blast radius. For user-facing shares, consider explicit local settings and document intent.

Task 13: Disable visibility (the “stop the bleeding” button)

cr0x@server:~$ sudo zfs set snapdir=hidden tank/projects
cr0x@server:~$ zfs get -H snapdir tank/projects
tank/projects	snapdir	hidden	local

Interpretation: This hides .zfs from directory listings again. It doesn’t delete snapshots; it just removes the UX surface area.

Task 14: Verify snapshot retention and prune intentionally

cr0x@server:~$ sudo zfs destroy tank/projects@auto-20251220-0900
cr0x@server:~$ zfs list -t snapshot -r tank/projects | grep auto-20251220-0900
cr0x@server:~$ echo $?
1

Interpretation: Removing snapshots frees space only if those blocks are no longer referenced by other snapshots or clones. If space doesn’t drop, you likely have other snapshots holding the same churn, or clones pinning blocks.

Task 15: Investigate clones pinning snapshot space

cr0x@server:~$ zfs get -H -o name,value -r origin tank/projects | awk -F'\t' '$2 != "-"'
tank/projects/engineering/testclone	tank/projects@auto-20251222-0900

Interpretation: If datasets are cloned from snapshots, those snapshots (or their blocks) may be effectively undeletable until clones are destroyed or promoted. Visibility won’t cause this, but it frequently shows up during snapshot audits.

Access control model: what users can really see and do

snapdir=visible changes discoverability, not authorization. In most POSIX setups, a user needs:

  • Execute (traverse) permission on each directory in the path to reach a file in the snapshot.
  • Read permission on the file to copy it out.
  • Appropriate ACL rights if you’re using NFSv4 ACLs or POSIX ACLs.

That means you can use visibility without handing out new power, but it also means users can feel “teased” by snapshot names they can’t use. A practical pattern is to separate datasets by audience:

  • User shares (home directories, team shares): consider snapdir=visible plus explicit guidance on self-restore, with exclusions configured for crawlers.
  • Machine datasets (databases, VM images, CI caches): keep snapdir=hidden unless you have a strong reason; these trees are crawler catnip and often churny.
  • Audit datasets: if auditors need access, grant it intentionally via ACLs to a read-only group; don’t rely on “they can see it so it must be allowed.”

A note on naming: snapshot names become UI when you expose them. If your snapshot names contain spaces, jokes, or incident codenames, you’re basically writing your internal diary on the front door. Use predictable, boring names. Boring is scalable.
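
A boring name is one a script can produce and a human can sort. A minimal sketch of the convention used throughout this article (UTC timestamp, no meaning encoded in the name):

```shell
# Generate a predictable, sortable snapshot name: auto-YYYYMMDD-HHMM (UTC).
# Lexicographic order equals chronological order, which keeps pruning scripts dumb.
snap="auto-$(date -u +%Y%m%d-%H%M)"
echo "$snap"
```

Anything with semantic content (ticket numbers, incident names) belongs in your change system, which has access control; snapshot names, once visible, do not.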

Performance and operational impact

In theory, exposing .zfs/snapshot is just a directory listing. In practice, it changes what users and programs do, which changes load patterns. Most “performance issues” blamed on snapdir=visible are actually caused by one of these:

  • Recursive scanning by indexers, antivirus, backup agents, or “find / -type f” scripts that now include snapshots.
  • Metadata-heavy browsing by users who treat snapshot trees as archives and run lots of directory listings across many snapshots.
  • Confusing mount boundaries where users traverse into child datasets and trigger different snapshot sets, multiplying their browsing.

Operationally, plan for:

  • More metadata reads (especially if users browse older snapshots with large directory trees).
  • More cache pressure (ARC) if snapshot trees are frequently traversed and don’t fit in memory.
  • Support questions about “what is this .zfs folder?” and “why are there so many copies?”

One pragmatic trick: treat snapdir=visible as a user-facing product. Ship it with exclusions (indexers/backups), with a retention statement, and with a “how to restore your file” snippet. You’ll save more hours than any micro-optimization in recordsize ever will.

Fast diagnosis playbook

When someone says “snapshots are slow” or “enabling snapdir caused latency,” you don’t have time for philosophical debates. Here’s a fast order of operations that finds bottlenecks quickly.

First: confirm what changed and where

  1. Confirm the dataset and mountpoint users are accessing (nested datasets matter).
  2. Confirm snapdir is actually visible on that dataset (and whether inherited).
  3. Confirm whether the complaint is about listing snapshot names or traversing snapshot contents.
cr0x@server:~$ zfs list -o name,mountpoint -r tank/projects
cr0x@server:~$ zfs get -H -o name,value,source snapdir tank/projects tank/projects/engineering

Second: check pool health and obvious saturation

  1. Pool health (errors, resilvering) first. A degraded pool makes every feature look guilty.
  2. IOPS and bandwidth on the pool during the complaint window.
  3. Latency hints: high read ops with low throughput often indicates metadata storm.
cr0x@server:~$ zpool status -x
all pools are healthy
cr0x@server:~$ zpool iostat -v tank 1 5

Third: look for snapshot tree crawlers and metadata storms

  1. Are there processes doing wide directory traversals?
  2. Did an indexer/AV/backup agent recently update config?
  3. Is the dataset exported via SMB/NFS and being crawled by clients?
cr0x@server:~$ ps auxww | grep -E 'find |updatedb|locate|rsync|backup|index' | head
cr0x@server:~$ sudo lsof +D /tank/projects/.zfs/snapshot 2>/dev/null | head

Interpretation: lsof +D can be expensive on huge trees; use it carefully. If it’s too slow, scope it to a single snapshot directory or use targeted checks based on known agents.

Fourth: validate ARC pressure and memory contention

  1. If ARC is thrashing, metadata workloads hurt more.
  2. Confirm you’re not swapping; swapping turns ZFS into a performance art piece.
cr0x@server:~$ free -h
cr0x@server:~$ vmstat 1 5

Fifth: decide the mitigation

  • If crawlers are the cause: exclude /.zfs at the tool level, then re-test.
  • If users are doing legitimate heavy browsing: consider providing a dedicated “restore” host or documented workflow, or keep visibility only on selected datasets.
  • If the pool is saturated: you may be capacity- or IOPS-bound regardless of visibility—fix the pool, not the symptom.
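
For homegrown scripts, the classic find idiom below prunes .zfs entirely. This demo builds a throwaway tree so it is self-contained; point root at a real share in practice:

```shell
# Walk a tree while pruning any directory named .zfs.
root="$(mktemp -d)"
mkdir -p "$root/live" "$root/.zfs/snapshot/auto-20251224-0900"
touch "$root/live/a.txt" "$root/.zfs/snapshot/auto-20251224-0900/a.txt"
# -prune stops descent into .zfs; -o -type f -print lists everything else.
find "$root" -name .zfs -prune -o -type f -print
# only the live copy is printed; the snapshot copy is skipped
rm -rf "$root"
```

The same -prune pattern belongs in any cron job or backup wrapper that recurses over user shares.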

Common mistakes: symptoms and fixes

Mistake 1: Enabling visibility on broad shares without excluding crawlers

Symptoms: sudden spike in read IOPS, slower directory listings, indexer/AV CPU increase, backup windows explode.

Fix: hide snapdir again to stop the bleeding, then configure exclusions in crawlers for /.zfs. Re-enable selectively on datasets where it’s useful.

cr0x@server:~$ sudo zfs set snapdir=hidden tank/projects
cr0x@server:~$ zpool iostat -v tank 2 3

Mistake 2: Assuming snapshots are readable by auditors because they’re visible

Symptoms: “Permission denied” complaints, audit escalation, accusations that storage “withheld” data.

Fix: align filesystem permissions/ACLs with audit requirements, or provide a mediated workflow. Document that visibility does not elevate privileges.

cr0x@server:~$ getfacl -p /tank/projects/finance | head -40

Mistake 3: Confusing parent dataset snapshots with child dataset snapshots

Symptoms: users say “the snapshot I need isn’t there,” but it exists elsewhere; inconsistent snapshot lists across paths.

Fix: map dataset boundaries and mountpoints; ensure snapshot automation covers child datasets as intended.

cr0x@server:~$ zfs list -o name,mountpoint -r tank/projects
cr0x@server:~$ zfs list -t snapshot -r tank/projects/engineering | head

Mistake 4: Snapshot naming that becomes a security or HR problem

Symptoms: users can infer sensitive events from snapshot names; awkward questions in audits; “why is there a snapshot called layoff-plan?”

Fix: standardize on neutral names (timestamp + policy), keep “meaning” in your change system, not in snapshot names.

cr0x@server:~$ zfs list -t snapshot -o name -r tank/projects | head

Mistake 5: Treating snapshots as a replacement for backups

Symptoms: no off-box copies, ransomware encrypts live data and snapshots age out or replicate encrypted state, pool loss is catastrophic.

Fix: keep snapshots as local rollback; use replication and separate retention domains for real backups (and test restores).

cr0x@server:~$ zfs list -t snapshot -r tank | wc -l

Mistake 6: Letting snapshot retention drift until the pool is full

Symptoms: pool hits high allocation, performance degrades, deletes don’t free space “like they used to,” emergency snapshot purge.

Fix: monitor snapshot growth, prune with intent, and understand clones. Consider quota/refquota to cap damage per dataset.

cr0x@server:~$ zpool list
cr0x@server:~$ zfs list -o name,used,refer,avail -r tank/projects
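
The quota/refquota suggestion can be sketched as dry-run commands (dataset name and sizes are placeholders, and the commands are echoed so nothing is applied by accident). refquota caps live referenced data only; quota also counts snapshots and descendants:

```shell
# Dry-run sketch: echo capacity caps instead of applying them.
# "tank/projects/engineering" and the sizes are placeholders.
ds="tank/projects/engineering"
echo zfs set refquota=500G "$ds"   # limit live (referenced) data only
echo zfs set quota=750G "$ds"      # limit live data + snapshots + children
```

The gap between the two numbers is, in effect, your snapshot retention budget for that dataset.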

Checklists / step-by-step plan

Checklist A: Safe rollout of snapdir=visible on a user share

  1. Pick the dataset(s) intentionally. Prefer team shares or home directories with stable permissions.
  2. Confirm snapshot automation exists (hourly/daily) and retention is documented.
  3. Standardize snapshot names to be predictable and boring.
  4. Inventory crawlers: indexers, antivirus, backup agents, file analytics, search tools. Plan exclusions for /.zfs.
  5. Enable visibility in a pilot (one dataset, one team).
  6. Measure impact: pool iostat, client experience, ticket types.
  7. Publish a two-minute self-restore guide (“copy from .zfs/snapshot/<snap>/path”).
  8. Expand gradually, with a rollback plan (snapdir=hidden).

Checklist B: Self-serve restore workflow (user-friendly, low-risk)

  1. Find the right dataset mountpoint (avoid nested dataset confusion).
  2. Browse .zfs/snapshot for the time window.
  3. Locate the file in the snapshot tree.
  4. Copy it back into live space with a new name first (safety).
  5. Verify content, then replace the original if needed.
cr0x@server:~$ ls /tank/projects/.zfs/snapshot | tail -5
cr0x@server:~$ cp -av /tank/projects/.zfs/snapshot/auto-20251224-0900/engineering/spec.docx \
> /tank/projects/engineering/spec.docx.restored
cr0x@server:~$ ls -l /tank/projects/engineering/spec.docx*

Checklist C: Audit workflow (prove “what existed when”)

  1. Identify dataset and relevant path.
  2. Find the snapshot closest to the time of interest (zfs list -t snapshot).
  3. Check existence and metadata in snapshot view.
  4. If needed, hash the snapshot file and the restored copy to show integrity (hashing is an operational decision; it can be heavy).
  5. Record snapshot name and creation time as your evidence anchor.
cr0x@server:~$ zfs list -t snapshot -o name,creation -r tank/projects | tail -5
cr0x@server:~$ ls -l /tank/projects/.zfs/snapshot/auto-20251224-0900/finance/ledger.csv
cr0x@server:~$ sha256sum /tank/projects/.zfs/snapshot/auto-20251224-0900/finance/ledger.csv

FAQ

1) Does snapdir=visible make snapshots accessible to everyone?

No. It makes the snapshot directory visible in listings. Access to snapshot contents still depends on filesystem permissions and ACLs.

2) Will enabling it slow down my pool?

Not directly. The risk is behavioral: once snapshots are visible, tools and users may traverse them, causing metadata reads and increased load. If you exclude crawlers and apply it selectively, impact is usually modest.

3) Why can users list snapshot names but not open folders inside them?

Because listing .zfs/snapshot may be permitted while traversal into protected directories is blocked by directory execute permissions or ACLs. This is common in mixed-permission datasets.

4) How do I hide snapshots again without deleting anything?

Set snapdir=hidden on the dataset. The snapshots remain; they just stop showing up via .zfs.

cr0x@server:~$ sudo zfs set snapdir=hidden tank/projects

5) Is .zfs present on child datasets too?

Yes, each dataset has its own snapshot namespace and property inheritance rules. Nested datasets are a frequent source of “wrong snapshot list” confusion.

6) Can users delete or modify files inside snapshots?

No. Snapshot contents are read-only. Users can copy data out to the live filesystem if they have permission to write there.

7) Is this the same as “Previous Versions” in Windows shares?

No. Windows “Previous Versions” is typically implemented by the SMB server mapping snapshots into that UI. snapdir=visible exposes snapshots via a filesystem directory. You can use one, the other, or both; test client behavior carefully.

8) If I delete a snapshot, will I immediately get space back?

Not always. Space is freed only when blocks are no longer referenced by any snapshot, clone, or the live filesystem. Clones are a common reason snapshot deletion doesn’t reclaim space.

9) What’s the safest way to let users restore files without giving them too much rope?

Enable visibility only on datasets designed for end-user browsing, keep snapshot names predictable, publish a restore workflow, and ensure your crawlers exclude /.zfs. Combine with quotas/refquotas to cap worst-case growth.

Conclusion

snapdir=visible is a classic ZFS move: one property that turns a storage primitive into a user-facing feature. The engineering is solid, but the outcome depends on how humans and software behave once the door is open.

If you treat snapshot visibility as a product—scoped rollout, clear retention, sane naming, crawler exclusions, and a documented restore workflow—you get faster audits, fewer restore tickets, and less midnight drama. If you flip it everywhere and walk away, you’ll learn which tools in your environment are secretly powered by recursion. In storage, the data is predictable; it’s the users and their scripts that keep things exciting.
