ZFS Readonly: The Anti-Ransomware Trick You Can Deploy Today

Ransomware doesn’t usually “hack” storage. It abuses the fact that your storage is doing exactly what it was asked to do: accept writes from a trusted identity that suddenly isn’t trustworthy. Your file server doesn’t know the difference between a payroll clerk saving a spreadsheet and malware encrypting 4 million documents at line rate. To the disk, it’s all just enthusiastic writing.

ZFS gives you a deceptively simple lever: readonly datasets. Combined with snapshots and replication, it lets you put parts of your storage in a posture where ransomware can’t encrypt what it can’t write. This isn’t a silver bullet—nothing is—but it’s a control you can deploy today, with commands you can run in production without buying anything or reorganizing your whole architecture.

What ZFS readonly really means (and what it doesn’t)

The ZFS property readonly is exactly what it sounds like: it tells ZFS to mount a dataset in read-only mode. When it’s set, writes through that mount should fail. That’s powerful, but it’s not magical. It’s a storage-layer policy applied to a dataset, not a moral judgement about intent.

Readonly applies to the dataset, not your intentions

When you set readonly=on on a dataset:

  • Writes through the mounted filesystem are blocked (applications will see errors like “Read-only file system”).
  • Metadata updates that require writes (creating files, renames, chmod, etc.) also fail.
  • You can still read, traverse, and compute checksums; ZFS is happy to serve data.

What it does not mean:

  • It does not automatically protect snapshots. Snapshots have their own safety story.
  • It does not stop a privileged user from flipping it back to off.
  • It does not protect you from destruction at the pool level (e.g., zpool destroy by root). This is why separation of duties matters.
  • It does not stop writes via a different path that bypasses the mounted filesystem (for example, if you replicate a dataset into it, you may be writing at the ZFS layer, not through the mount).

In other words: readonly is a strong seatbelt, not a teleporter. It helps you survive common crashes, but you still shouldn’t drive into a wall.

What ransomware typically does to your files

Ransomware at the file level usually:

  • Enumerates directories, often using standard APIs, sometimes with multithreading.
  • Reads a file, writes an encrypted version (either in place or side-by-side), then renames.
  • Deletes originals or truncates them.
  • Targets file servers over SMB/NFS because they’re rich in shared data and often broadly writable.

If the underlying dataset is readonly, those writes, truncations, renames, and deletions fail. The malware can still read your data (confidentiality is a separate control), but it cannot encrypt your stored dataset in place. That’s a huge difference when you’re trying to keep the business running.

One joke, because we’ve earned it: readonly is like putting your data behind glass—people can still admire it, but they can’t “improve” it with a ransomware paint job.

Why readonly works against ransomware

Most anti-ransomware guidance focuses on detection, endpoint controls, and backups. All good. But the operational truth is: ransomware doesn’t need to be clever if your storage is wide open. Storage-layer immutability changes the game by making the “encrypt and overwrite” step impossible (or at least dramatically harder).

Readonly reduces your blast radius

In production, data is rarely homogeneous. You have:

  • Active working sets (need frequent writes).
  • Reference data (rarely changes, heavily read).
  • Archives (should never change).
  • Backup/replica targets (should be write-once in practice).

Readonly is a perfect fit for reference data, archives, and replica targets. If ransomware hits a workstation or a server account with broad access, it can only damage datasets that are actually writable.

Readonly is immediate, cheap, and observable

Security controls that require a long project tend to lose to “urgent business priorities.” ZFS readonly is not one of those controls. You can:

  • Turn it on with a single property change.
  • Confirm it with zfs get and with real-world app behavior.
  • Monitor for attempted writes (applications will log errors; SMB/NFS clients will complain).
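A cheap way to notice those complaints centrally is to watch system logs for the kernel’s “Read-only file system” error string. A minimal sketch, assuming the failing services log to journald; log sources vary by distro and by how the shares are served:

cr0x@server:~$ sudo journalctl --since "1 hour ago" --no-pager | grep -i 'read-only file system' | tail -n 5

A sudden burst of these against a dataset nobody should be writing to is exactly the kind of early signal you want.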

That observability matters. In several incidents I’ve watched unfold, the first hard clue was “why did this directory suddenly become read-only?” It wasn’t read-only. The system was trying to write garbage at high speed and hitting guardrails. Those guardrails turned a catastrophe into a noisy annoyance.

Readonly pairs well with snapshots—and snapshots are what you’ll actually use

Readonly is a preventative control. Snapshots are your recovery control. Together, they let you say:

  • “This dataset is not supposed to change.” (readonly)
  • “If something changes that shouldn’t, I can roll back or restore quickly.” (snapshots + replication)

If you only do snapshots, ransomware can still encrypt the live filesystem and you’ll be in restore mode. If you only do readonly, you might block legitimate business changes. The sweet spot is designing your dataset layout so you can be strict where it’s safe and flexible where it’s needed.
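For the snapshot half, the mechanism matters less than a predictable cadence and naming scheme. A minimal sketch, assuming you drive it from cron or a systemd timer (the dataset and naming pattern are examples; dedicated snapshot tooling can handle retention for you):

cr0x@server:~$ sudo zfs snapshot -r tank/projects@hourly-$(date +%Y-%m-%d-%H:00)
cr0x@server:~$ sudo zfs list -t snapshot tank/projects | tail -n 3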

Interesting facts and historical context

Storage engineering has a long memory. A few context points that make ZFS readonly feel less like a trick and more like the modern form of old ideas:

  1. ZFS started in the early 2000s at Sun Microsystems, built around the idea that the filesystem and volume manager should be one system, not two awkward roommates.
  2. Copy-on-write (CoW) is ZFS’s superpower: data isn’t overwritten in place; new blocks are written and pointers are updated. This makes snapshots cheap and consistent.
  3. Snapshots predate ransomware as a mainstream threat. They were designed for admin mistakes and consistency, but they map beautifully to rollback-based recovery.
  4. “Immutability” isn’t new: WORM (write once, read many) storage has been sold for decades for compliance archives. ZFS readonly is a practical approximation for many datasets.
  5. Modern ransomware evolved from “screen lockers” into encryption-based extortion when cryptocurrency made payments easier to operationalize and backups became more common.
  6. SMB shares became high-value targets because they concentrate shared data and inherit broad permissions—one compromised user can touch a lot.
  7. Early ZFS adopters learned the hard way that “snapshots exist” isn’t the same as “snapshots are safe.” If an attacker can destroy snapshots, you’re back to square one.
  8. Operational separation of duties is older than computers: the person who can approve spending shouldn’t be the person who can transfer funds. The same logic applies to who can delete snapshots or flip readonly.
  9. “Offline backups” used to mean tapes. Today it often means “a replica you cannot modify from the production identity plane.” Readonly datasets are one piece of that puzzle.

Second joke (and the last one): ransomware is the only workload I’ve seen that scales perfectly without a performance review.

Design patterns: where readonly fits in real storage

Readonly is most effective when your datasets mirror reality: what changes, what shouldn’t, and what must be recoverable quickly. You don’t sprinkle readonly like seasoning; you build boundaries.

Pattern 1: Split “active” and “published” data

Common in analytics and content pipelines:

  • Active ingest dataset: writable, used by ETL or upload jobs.
  • Published dataset: readonly, used by readers (BI tools, web servers, downstream teams).

Promotion is a controlled action: a snapshot or clone is moved into the published namespace, then locked readonly. If ransomware hits an ingest node, it can trash ingest, but published stays intact.
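A minimal sketch of one way to do that promotion, assuming an ingest dataset tank/ingest/reports and a published tree tank/published (names are illustrative): snapshot the ingest dataset, clone the snapshot into the published namespace, and lock the clone.

cr0x@server:~$ sudo zfs snapshot tank/ingest/reports@publish-2025-12-24
cr0x@server:~$ sudo zfs clone tank/ingest/reports@publish-2025-12-24 tank/published/reports-2025-12-24
cr0x@server:~$ sudo zfs set readonly=on tank/published/reports-2025-12-24

A clone stays dependent on its origin snapshot; if you want the published copy fully independent of the ingest dataset, use a local zfs send | zfs receive instead and accept the extra space.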

Pattern 2: Readonly replicas as “poor man’s air gap”

Set up a backup/DR host that receives ZFS replication. On the target side, keep datasets mounted readonly (or not mounted at all) and restrict who can change properties. The target becomes “online but not easily writable,” which is often the operational sweet spot.

The key is identity separation: production systems shouldn’t have credentials that can rewrite the backup target beyond the narrow replication capability. If the same root SSH key can administer both ends, you’ve built a high-speed self-destruct mechanism.
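One way to keep the replication capability narrow is ZFS delegation on the receiving side: give the replication account only what zfs receive needs, and nothing that can flip properties or destroy snapshots. A sketch, assuming a dedicated account named replrecv (the name and exact permission set are illustrative; test what your OpenZFS version and platform require for non-root receive):

cr0x@backup:~$ sudo zfs allow replrecv receive,create,mount tank/backup-recv
cr0x@backup:~$ sudo zfs allow tank/backup-recv

If the production side only ever talks to that account, compromising production doesn’t automatically buy the ability to rewrite or destroy the replicas.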

Pattern 3: Immutable home directories (carefully)

For some environments—kiosks, labs, call centers—user data is not meant to persist or change outside a controlled sync. You can mount home directories from a readonly dataset and overlay a writable layer elsewhere. This blocks a whole category of “encrypt everything I can see” attacks.
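One way to build the writable layer is an overlay mount: the ZFS-backed home image stays readonly underneath, and per-session changes land in scratch space you can throw away. A rough sketch with hypothetical paths (lowerdir is the readonly ZFS mount; upperdir and workdir live on local writable storage):

cr0x@server:~$ sudo mkdir -p /scratch/alice/upper /scratch/alice/work
cr0x@server:~$ sudo mount -t overlay overlay -o lowerdir=/srv/home-image/alice,upperdir=/scratch/alice/upper,workdir=/scratch/alice/work /home/alice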

But be honest: for knowledge workers, home directories are active data. Don’t turn readonly into a productivity denial-of-service.

Pattern 4: Protect your “known-good” restores

Most orgs have a directory like /srv/releases, /srv/installers, /srv/golden, or “the thing we reinstall from when the world ends.” Make it a separate dataset. Make it readonly. Snapshot it. Replicate it. If ransomware hits and you can’t reinstall safely, you’ll discover new definitions of “bad day.”

Threat model reality check

Readonly helps most against:

  • Compromised user accounts with file share access.
  • Malware running on application servers that write to shared datasets.
  • Accidental destructive scripts (yes, those count).

Readonly helps less against:

  • Root compromise on the storage server itself.
  • Attackers with ZFS administrative rights who can flip properties or destroy snapshots.
  • Physical access and malicious kernel-level tampering.

That’s not a reason to skip it. It’s a reason to pair it with role separation, snapshot protection, and a replication target that’s not governed by the same set of keys.

Practical tasks (commands + interpretation)

All commands below assume OpenZFS on Linux with a typical pool named tank. Adjust names to match your environment. The point is not the exact string; it’s the operational intent and what “good” looks like.

Task 1: Inventory datasets and spot obvious candidates for readonly

cr0x@server:~$ sudo zfs list -o name,used,avail,refer,mountpoint
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank                 3.41T  5.22T   192K  /tank
tank/home             610G  5.22T   610G  /home
tank/projects         1.20T  5.22T  1.20T /srv/projects
tank/releases          42G  5.22T   42G   /srv/releases
tank/backup-recv      1.55T  5.22T  1.55T /srv/backup-recv

Interpretation: tank/releases and tank/backup-recv are immediate readonly candidates. tank/home and tank/projects are likely active and need a different strategy (snapshots + quotas + permissions + maybe published/active split).

Task 2: Check current readonly state across the pool

cr0x@server:~$ sudo zfs get -r -o name,property,value,source readonly tank
NAME             PROPERTY  VALUE  SOURCE
tank             readonly  off    default
tank/home        readonly  off    default
tank/projects    readonly  off    default
tank/releases    readonly  off    default
tank/backup-recv readonly  off    default

Interpretation: Everything is writable. That’s normal on day zero, and exactly why ransomware has a field day on file servers.

Task 3: Set a dataset to readonly

cr0x@server:~$ sudo zfs set readonly=on tank/releases
cr0x@server:~$ sudo zfs get -o name,property,value,source readonly tank/releases
NAME          PROPERTY  VALUE  SOURCE
tank/releases readonly  on     local

Interpretation: The property is now locally set. Existing processes that assume they can write into /srv/releases will start failing, which is the point—but you should still coordinate the change.

Task 4: Confirm behavior with a real write attempt

cr0x@server:~$ sudo touch /srv/releases/should-fail
touch: cannot touch '/srv/releases/should-fail': Read-only file system

Interpretation: This is the failure mode you want during an encryption attempt: loud, immediate, and not “successfully overwrote your data.”

Task 5: Make it operationally safe to “temporarily allow writes”

Readonly isn’t set-and-forget if you sometimes legitimately update the dataset. The trick is to create a repeatable “maintenance window” procedure that leaves evidence.

cr0x@server:~$ sudo zfs snapshot tank/releases@before-maint-2025-12-24
cr0x@server:~$ sudo zfs set readonly=off tank/releases
cr0x@server:~$ sudo rsync -a --delete /staging/releases/ /srv/releases/
cr0x@server:~$ sudo zfs set readonly=on tank/releases
cr0x@server:~$ sudo zfs snapshot tank/releases@after-maint-2025-12-24

Interpretation: You’ve created two clear restore points. If the maintenance host was compromised, you can roll back to @before-maint or inspect diffs between snapshots.
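To inspect those diffs, zfs diff compares two snapshots of the same dataset and prints changed paths with a leading marker (+ created, - removed, M modified, R renamed):

cr0x@server:~$ sudo zfs diff tank/releases@before-maint-2025-12-24 tank/releases@after-maint-2025-12-24 | head

If anything shows up that the maintenance job shouldn’t have touched, you have a concrete lead before deciding whether to keep the change or roll back.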

Task 6: Tighten snapshot behavior with a hold (snapshot “pinning”)

A snapshot can be destroyed by an attacker with sufficient privileges. ZFS provides “holds” to prevent deletion until the hold is released.

cr0x@server:~$ sudo zfs snapshot tank/releases@golden
cr0x@server:~$ sudo zfs hold keep tank/releases@golden
cr0x@server:~$ sudo zfs holds tank/releases@golden
NAME                  TAG   TIMESTAMP
tank/releases@golden  keep  Wed Dec 24 10:02 2025

Interpretation: Even if someone runs zfs destroy tank/releases@golden, it won’t work until the hold is released. This is not absolute protection against root on the storage host, but it raises the effort and prevents accidental deletion.
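When a held snapshot legitimately needs to go (retention cleanup, space pressure), the hold has to be released first, which is exactly the deliberate extra step you want in front of deletion:

cr0x@server:~$ sudo zfs release keep tank/releases@golden
cr0x@server:~$ sudo zfs destroy tank/releases@golden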

Task 7: Build a replication target that defaults to readonly

On a backup receiver host, you can enforce a policy: data arrives via zfs receive, but the mounted view is readonly.

cr0x@backup:~$ sudo zfs create -o mountpoint=/srv/backup-recv tank/backup-recv
cr0x@backup:~$ sudo zfs set readonly=on tank/backup-recv
cr0x@backup:~$ sudo zfs get -o name,property,value readonly tank/backup-recv
NAME             PROPERTY  VALUE
tank/backup-recv readonly  on

Interpretation: This protects against “someone mounted the backups and then wrote into them.” It does not prevent receiving replication, which writes at the dataset level, not through POSIX file writes.

Task 8: Replicate snapshots (send/receive) in a way that’s actually recoverable

cr0x@prod:~$ sudo zfs snapshot tank/projects@replica-001
cr0x@prod:~$ sudo zfs send -w tank/projects@replica-001 | ssh backup sudo zfs receive -u tank/backup-recv/projects

Interpretation: -u receives without mounting, which is underrated security hygiene. Unmounted backups can’t be casually browsed or modified by random processes, and they don’t become part of your everyday namespace.

Task 9: Verify replication and set the received dataset readonly

cr0x@backup:~$ sudo zfs list -t snapshot tank/backup-recv/projects | tail -n 3
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
tank/backup-recv/projects@replica-001        0B      -  1.20T  -
cr0x@backup:~$ sudo zfs set readonly=on tank/backup-recv/projects
cr0x@backup:~$ sudo zfs get readonly,mounted,mountpoint tank/backup-recv/projects
NAME                      PROPERTY    VALUE    SOURCE
tank/backup-recv/projects readonly    on       local
tank/backup-recv/projects mounted     no       -
tank/backup-recv/projects mountpoint  /srv/backup-recv/projects   inherited from tank/backup-recv

Interpretation: You now have a replica dataset that isn’t even mounted and is marked readonly. This is the kind of boring posture that frustrates attackers and delights auditors.

Task 10: Roll back fast after an encryption event (when you’re sure)

Rollback is a chainsaw. It’s also the fastest way back to “operational,” which is why people reach for it during incidents. Use it carefully.

cr0x@server:~$ sudo zfs list -t snapshot -o name,creation tank/projects | tail -n 5
tank/projects@hourly-2025-12-24-08:00  Wed Dec 24 08:00 2025
tank/projects@hourly-2025-12-24-09:00  Wed Dec 24 09:00 2025
tank/projects@hourly-2025-12-24-10:00  Wed Dec 24 10:00 2025
tank/projects@hourly-2025-12-24-11:00  Wed Dec 24 11:00 2025
tank/projects@hourly-2025-12-24-12:00  Wed Dec 24 12:00 2025
cr0x@server:~$ sudo zfs rollback -r tank/projects@hourly-2025-12-24-10:00

Interpretation: This reverts the dataset to the snapshot state; -r also destroys any snapshots (and bookmarks) newer than the target so the rollback can proceed. Only do this when you have confirmed the compromise scope and you accept losing all changes made after that snapshot.

Task 11: If rollback is too destructive, mount a snapshot read-only and copy out

cr0x@server:~$ sudo mkdir -p /mnt/projects-restore
cr0x@server:~$ sudo mount -t zfs tank/projects@hourly-2025-12-24-10:00 /mnt/projects-restore
cr0x@server:~$ sudo rsync -a /mnt/projects-restore/ /srv/projects/
cr0x@server:~$ sudo umount /mnt/projects-restore

Interpretation: This is slower but safer for partial restores and investigation. You can selectively restore without rewinding the entire dataset’s timeline.
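An alternative to an explicit mount is the hidden .zfs directory that every ZFS filesystem exposes: each snapshot appears as a read-only subdirectory you can browse and copy from. It’s reachable by path even when hidden; the snapdir property just controls whether it shows up in listings (paths below are illustrative):

cr0x@server:~$ sudo zfs set snapdir=visible tank/projects
cr0x@server:~$ ls /srv/projects/.zfs/snapshot/
cr0x@server:~$ sudo rsync -a /srv/projects/.zfs/snapshot/hourly-2025-12-24-10:00/finance/ /srv/projects/finance/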

Task 12: Detect suspicious write pressure with ZFS and OS metrics

cr0x@server:~$ sudo zpool iostat -v tank 1 5
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        3.41T  5.22T    220   4100   25.3M  620M
  raidz2    3.41T  5.22T    220   4100   25.3M  620M
    sda         -      -     40    690   4.3M  102M
    sdb         -      -     42    710   4.5M  104M
    sdc         -      -     46    680   4.8M  101M
    sdd         -      -     45    705   4.7M  105M

Interpretation: A sudden write spike with lots of small operations is consistent with mass file rewrite/encryption. It’s not proof, but it’s a reason to look at SMB logs, process lists, and recent file churn.
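To turn “a reason to look” into evidence, check what actually changed recently on the suspect dataset. A quick sketch (paths, snapshot names, and thresholds are illustrative): a find over recent modification times is crude but fast, and zfs diff against the most recent snapshot is more precise:

cr0x@server:~$ sudo find /srv/projects -type f -mmin -15 | head -n 20
cr0x@server:~$ sudo zfs diff tank/projects@hourly-2025-12-24-12:00 tank/projects | head -n 20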

Task 13: Audit who can change ZFS properties (delegation)

If your operational model allows it, delegate limited ZFS rights rather than giving broad root access.

cr0x@server:~$ sudo zfs allow tank/releases
---- Permissions on tank/releases ---------------------------------------
Local+Descendent permissions:
user backupop create,mount,snapshot,send,receive

Interpretation: Delegation can keep a backup operator functional without giving them the keys to destroy your world. If everyone is root, readonly is a speed bump, not a barrier.

Task 14: Make readonly inherited for an entire subtree

cr0x@server:~$ sudo zfs set readonly=on tank/archive
cr0x@server:~$ sudo zfs create tank/archive/2024
cr0x@server:~$ sudo zfs get readonly tank/archive/2024
NAME              PROPERTY  VALUE  SOURCE
tank/archive/2024 readonly  on     inherited from tank/archive

Interpretation: Inheritance is how you avoid “we forgot to lock the new dataset.” In production, the enemy is often not malice—it’s drift.

Task 15: Confirm client-visible behavior over NFS/SMB (quick sanity)

From the server side you can validate the mount flags. The exact output varies by distro and setup.

cr0x@server:~$ mount | grep '/srv/releases'
tank/releases on /srv/releases type zfs (ro,xattr,posixacl)

Interpretation: You want to see ro. If it says rw, you haven’t achieved readonly at the mount level, regardless of what you think you set.

Three corporate-world mini-stories

Mini-story 1: The incident caused by a wrong assumption

The company had a tidy theory: “The NAS is safe because only the file server can write to it.” And on the whiteboard, it was true. The file server exported SMB shares, users mapped drives, and the storage backend was “internal.” People love the word internal. It sounds like a locked door. It’s usually a curtain.

One Monday morning, a junior admin reported that a project share was “acting weird.” Files weren’t opening. The helpdesk noticed a wave of tickets: Excel files wouldn’t load, PDFs showed errors, and “there’s a new file in every folder with instructions.” The normal incident rhythm kicked in—containment, communication, triage. Then someone asked the question nobody wants to ask: “Are the backups okay?”

Backups existed. They were also mounted read-write on the same file server because “it’s easier to restore.” Malware doesn’t care about easier to restore. It cares about easier to destroy. The ransomware had landed on a workstation, captured the user’s session, moved laterally with cached credentials, and used SMB like it was designed: a fast, authenticated file API. It encrypted the share, then cheerfully walked into the mounted backup path and did the same.

The wrong assumption wasn’t “we have backups.” It was “backups are safe because they’re backups.” Backups are only safe when they’re harder to change than production data. After the postmortem (the kind that reads like a horror novella), they rebuilt the backup target as ZFS replicas received into unmounted datasets and marked readonly. Restore got slightly less convenient. Survival got dramatically more convenient.

The operational lesson: if your backup path is writable from the same identity plane as production, you don’t have backups—you have additional copies of your failure.

Mini-story 2: The optimization that backfired

An infrastructure team wanted to improve performance for a large media workload. They reorganized datasets to reduce “overhead,” consolidating lots of small datasets into one giant dataset with broad ACLs. The idea was simple: fewer mountpoints, fewer headaches, better performance. They also turned off a few “noisy” snapshot schedules because snapshots were “using too much space.” The graphs looked better. The team went home feeling like winners.

Three months later, they got hit with a destructive script, not even ransomware: a pipeline job with a buggy path variable that resolved to the share root. It started cleaning up “old files.” It cleaned up nearly everything. The team reached for snapshots and found… not much. The consolidated dataset design meant there were no natural boundaries. The pipeline’s service account had rights everywhere because “it needs to access media.” With fewer datasets, there were fewer places to apply different policies, and with fewer snapshots, there were fewer restore points.

The backfire wasn’t that consolidation is always bad. The backfire was that consolidation eliminated control points. ZFS properties—readonly, snapshot schedules, quotas, reservations—operate at dataset boundaries. If everything is one dataset, everything shares the same fate. In practice, that’s how “an optimization” turns into “a single point of failure.”

They reversed course: active ingest stayed writable, published media became a separate readonly dataset, and pipeline service accounts got access scoped to where they should write. Snapshot schedules came back with retention tuned to business reality. Performance stayed fine. Recovery got sane.

The operational lesson: data layout is security layout. If you remove boundaries to simplify, you also remove places to apply guardrails.

Mini-story 3: The boring but correct practice that saved the day

A finance-heavy organization had a habit that looked almost quaint: every month-end close, they “sealed” the prior month’s dataset. It was a separate ZFS dataset per accounting period, with a predictable name and a runbook checklist. After close, the dataset was set readonly, snapshotted, and replicated to a secondary system. The team did it like clockwork, even when nobody was watching.

Then came the incident. A compromised workstation account started encrypting a wide swath of shared folders. It was fast, aggressive, and—like most ransomware—indifferent to business calendars. The file server lit up. Monitoring showed a steep climb in write IOPS and latency. Users reported file access errors. The incident commander started the familiar dance: isolate endpoints, disable accounts, block SMB sessions, preserve evidence.

When the dust settled, the bad news was real: several active datasets were partially encrypted. The good news was surgical: all sealed month-end datasets were untouched. Not because the attackers were polite, but because those datasets were readonly and the malware couldn’t overwrite them. The month-end close data—the thing that would trigger regulatory chaos if lost—was intact and immediately usable. They still had a mess, but it was a bounded mess.

Recovery became a two-speed operation: restore active work areas from snapshots and replicas, while continuing operations using the sealed datasets for reporting and compliance. The organization didn’t “avoid” the incident. They avoided the existential failure mode.

The operational lesson: the boring routines are the ones that save you. Not because they’re exciting, but because they’re repeatable under stress.

Fast diagnosis playbook

When something feels wrong—latency spikes, users report weird file extensions, write failures—the best teams don’t debate theory. They run a short playbook that narrows possibilities fast. Here’s one that works well for ZFS-backed file services.

First: confirm whether you’re seeing write pressure, metadata churn, or actual readonly enforcement

  1. Check pool I/O: is it mostly writes? Are ops small and numerous?
cr0x@server:~$ sudo zpool iostat -v tank 1 3

Interpretation: Massive write ops plus high bandwidth suggest overwrite/encryption or a bulk copy. High ops with low bandwidth can be metadata-heavy churn (rename storms, small files).

  2. Check whether target datasets are readonly as expected (and whether it’s inherited or local):
cr0x@server:~$ sudo zfs get -r -o name,property,value,source readonly tank/projects tank/releases

Interpretation: If a “protected” dataset shows readonly=off from local when you expected inherited or on, you may have drift or a change during maintenance.

  3. Check mount flags to ensure the OS is honoring readonly:
cr0x@server:~$ mount | egrep '/srv/releases|/srv/backup-recv'

Interpretation: You should see ro for readonly datasets. If you don’t, treat it as “not protected.”

Second: identify the dataset and client responsible (or at least correlated)

  1. Find which dataset is hot:
cr0x@server:~$ sudo zfs list -o name,used,refer,quota,reservation,mountpoint -S used | head

Interpretation: Rapid growth can indicate encryption producing new files, or a job dumping data unexpectedly.

  2. Check pool write latency alongside throughput (zpool iostat reports at the pool/vdev level, not per dataset):
cr0x@server:~$ sudo zpool iostat -l tank 1 5

Interpretation: Look for sustained write bandwidth inconsistent with normal business patterns.

  3. Correlate with SMB/NFS sessions (examples; your commands will differ):
cr0x@server:~$ sudo smbstatus -p | head
PID     Username     Group        Machine                                   Protocol Version  Encryption           Signing
4123    jdoe         domain users  10.10.14.27 (ipv4:10.10.14.27:51234)      SMB3_11           -                    partial

Interpretation: A small number of clients doing a huge amount of damage is common. If you see a workstation that’s not supposed to touch a share, that’s a lead.

Third: decide whether you’re in “security incident” mode or “performance bug” mode

Symptoms overlap. Encryption looks like a performance issue at first. Performance bugs sometimes look like ransomware (high IOPS, lots of file ops). The difference is whether data content is changing unexpectedly and whether write attempts are failing due to readonly enforcement.

Quick differentiators:

  • Readonly errors in app logs and user reports: suggests guardrails are stopping writes (good) or you accidentally locked an active dataset (bad).
  • New extensions / ransom notes: security incident.
  • One job or host correlates exactly: could still be security, but may be a runaway batch process.
  • Latency spikes plus CPU spikes on endpoints: often encryption.

Either way, you want the same first containment move for file shares: narrow write access, isolate suspicious clients, and preserve snapshots before you touch anything.

Common mistakes (symptoms + fixes)

Mistake 1: “Readonly on the dataset means snapshots are safe”

Symptom: You set readonly=on, feel smug, then discover snapshots are missing after an incident.

Why it happens: Snapshot deletion is not blocked by dataset readonly. If an attacker has ZFS administrative access, they can destroy snapshots or change properties.

Fix: Use snapshot holds for key restore points, restrict ZFS admin rights, and replicate to a target where the attacker’s credentials don’t exist.

Mistake 2: Turning on readonly for a dataset that contains active application state

Symptom: Applications start failing with “Read-only file system,” databases refuse to start, CI pipelines break.

Why it happens: Someone applied a security control without mapping data flows.

Fix: Split datasets: put mutable app state in a writable dataset; publish artifacts to a separate readonly dataset. If you must keep one dataset, don’t use readonly—use snapshots and permissions instead.

Mistake 3: Assuming readonly propagates everywhere without checking inheritance

Symptom: A newly created dataset under a protected tree is writable.

Why it happens: It was created elsewhere and moved, or someone set readonly=off locally on a child dataset, overriding inheritance.

Fix: Audit with zfs get -r readonly and pay attention to source. For important trees, enforce standards via runbooks and reviews.
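That audit is easy to script: list everything under the protected tree whose readonly value isn’t on, then restore inheritance where a local override crept in (dataset names below are examples):

cr0x@server:~$ sudo zfs get -r -H -o name,value,source readonly tank/archive | awk '$2 != "on"'
cr0x@server:~$ sudo zfs inherit readonly tank/archive/2024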

Mistake 4: Mounted backup replicas that are browsable (and writable) from production identities

Symptom: Backups get encrypted right along with production data.

Why it happens: Convenience: admins mount backup datasets so restores are a simple file copy. Malware loves convenience.

Fix: Receive replicas with zfs receive -u, keep them unmounted by default, and provide restores via controlled mount/clone workflows.

Mistake 5: “We’ll just flip readonly off during deployments” without a process

Symptom: Readonly stays off after a rushed change, and nobody notices until later.

Why it happens: Humans are bad at remembering to re-lock doors in the middle of fires.

Fix: Use a scripted maintenance procedure that snapshots before/after and explicitly sets readonly back on. Audit property drift regularly.
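A minimal sketch of such a procedure, reusing the releases dataset and staging path from Task 5 (a hypothetical wrapper, not a drop-in tool; wrap it in your own change process):

#!/bin/sh
# reseal-releases.sh -- sketch of a maintenance window that always re-locks
set -eu
DS=tank/releases
STAMP=$(date +%Y-%m-%d-%H%M)

zfs snapshot "${DS}@before-maint-${STAMP}"
# Re-lock the dataset even if the sync fails partway through.
trap 'zfs set readonly=on "${DS}"' EXIT
zfs set readonly=off "${DS}"
rsync -a --delete /staging/releases/ /srv/releases/
zfs snapshot "${DS}@after-maint-${STAMP}"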

Mistake 6: Confusing filesystem permissions with ZFS readonly

Symptom: A user can still modify files even though “we locked it down,” or legitimate writes fail unexpectedly.

Why it happens: POSIX permissions/ACLs and ZFS readonly are different layers. One controls who may write; the other controls whether anyone may write.

Fix: Use permissions for normal access control, and readonly for immutability goals. Don’t try to replace one with the other.

Mistake 7: Ignoring side-channel writes (temp directories, logs, metadata)

Symptom: An application “only reads,” but it still fails on a readonly dataset.

Why it happens: Many “readers” write caches, lock files, temp files, or update timestamps in place.

Fix: Redirect caches and temp paths to a writable location. For services, set explicit cache dirs and ensure logs go elsewhere.

Checklists / step-by-step plan

Step-by-step: Deploy readonly safely in production (without breaking everything)

  1. Classify datasets by behavior: active-write, published-read, archive, backup target.
  2. Split datasets where boundaries are unclear: don’t fight reality; create control points.
  3. Pick one low-risk dataset (e.g., releases, installers, archives) and pilot readonly.
  4. Snapshot before the change so rollback is trivial.
  5. Enable readonly and validate at three layers: ZFS property, mount flags, and client behavior.
  6. Update runbooks: how to do planned writes (maintenance mode), who approves, and how to re-lock.
  7. Set monitoring expectations: write failures should alert, but not page the entire company.
  8. Add snapshot holds for critical restore points where appropriate.
  9. Build a replica target that receives snapshots unmounted; mark it readonly where possible.
  10. Restrict who can flip readonly and who can destroy snapshots; use delegation if it fits your org.
  11. Test restore quarterly: mount a snapshot, copy out, or clone and verify applications can read it.
  12. Audit drift monthly: readonly, mount status, snapshot presence, replication freshness.

Operational checklist: “Seal” a dataset after publication

  1. Create a snapshot named for the event/time.
  2. Verify consumers are reading from the right mountpoint.
  3. Set readonly=on.
  4. Hold the sealing snapshot if it’s a key restore point.
  5. Replicate to backup/DR.
  6. Record the snapshot name and replication status in your ticketing/change log.
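Mapped to commands, the sealing checklist above looks roughly like this. The dataset, snapshot, and replica names are examples; the receive side assumes the replica layout from Tasks 7–9:

cr0x@server:~$ sudo zfs snapshot tank/finance/2025-11@sealed-close
cr0x@server:~$ sudo zfs set readonly=on tank/finance/2025-11
cr0x@server:~$ sudo zfs hold sealed tank/finance/2025-11@sealed-close
cr0x@server:~$ sudo zfs send -w tank/finance/2025-11@sealed-close | ssh backup sudo zfs receive -u tank/backup-recv/finance-2025-11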

Operational checklist: Suspected ransomware on file shares

  1. Contain: cut SMB/NFS access for suspected clients; disable suspect accounts.
  2. Preserve: take immediate snapshots of affected datasets (yes, even if they’re partially encrypted); see the sketch after this list.
  3. Assess: identify earliest “good” snapshot; verify snapshot integrity and replication status.
  4. Decide: rollback vs selective restore. Rollback is fast; selective restore is safer for partial recovery.
  5. Recover: restore data; rotate credentials; validate endpoints; re-enable access in stages.
  6. Harden: move reference and archive data to readonly datasets; enforce backup immutability posture.
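Step 2 deserves emphasis because it’s nearly free: a recursive snapshot plus a hold freezes the current state (even partially encrypted) for forensics and shields it from routine pruning. A sketch with an example timestamped name:

cr0x@server:~$ sudo zfs snapshot -r tank@incident-2025-12-24-1130
cr0x@server:~$ sudo zfs hold -r incident tank@incident-2025-12-24-1130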

FAQ

1) Is ZFS readonly the same as “immutable storage”?

No. It’s a strong filesystem-level write block for a dataset, but a sufficiently privileged actor can still change properties or destroy datasets/snapshots. True immutability requires stronger separation (different admin domains, offline media, or governance controls). Readonly is still extremely useful because many ransomware events operate with user-level or service-level permissions, not storage-admin power.

2) If I set a dataset readonly, can I still replicate into it with zfs receive?

Generally yes. zfs receive operates at the ZFS layer, not via the mounted filesystem path. Readonly blocks normal file writes through the mount; it does not necessarily block administrative dataset updates like receiving a stream. Test this in your environment and treat replication permissions as a privileged capability.

3) Will readonly protect me if the storage server itself is compromised?

Not reliably. If an attacker has root and ZFS admin capability on the storage host, they can often flip readonly back off and destroy snapshots. Your protection then depends on controls outside that host: separate backup targets, access separation, and potentially offline copies.

4) Should I make my entire pool readonly?

No. That’s a great way to discover which applications write more than you thought, usually during peak hours. The value comes from making specific datasets readonly: archives, published artifacts, sealed reporting periods, and replica targets.

5) How do I handle legitimate updates to a readonly dataset?

Create a maintenance procedure: snapshot, temporarily set readonly off, perform controlled writes, set readonly on, snapshot again. The key is repeatability and auditability, not heroics.

6) What’s the difference between readonly and setting permissions/ACLs to deny writes?

Permissions answer “who may write.” Readonly answers “may anyone write.” Permissions are for normal operation. Readonly is for immutability guarantees and blast-radius reduction. Use both, but don’t confuse them.

7) Can ransomware delete snapshots from a client machine?

Not directly through SMB/NFS in most typical setups, because snapshot management is a storage-admin function. But if your environment exposes snapshot deletion via scripts, APIs, or over-privileged credentials on servers, then yes—snapshots can be targeted. Treat snapshot deletion capability as highly sensitive.

8) Should I mount backup replicas at all?

Only when you’re restoring or verifying. Keeping replicas unmounted by default is a simple way to reduce accidental modification and reduce the chance that some process “discovers” the backups and starts writing into them.

9) Does readonly have performance implications?

Minimal direct impact. The bigger performance story is the architecture you enable: fewer writes, fewer metadata updates, and less churn on protected datasets. During an attack, readonly can improve performance by preventing destructive write storms from succeeding (though the attempted writes still consume some resources).

10) What’s the single most important companion control to readonly?

Identity separation between production and backups/replicas. Readonly reduces damage, but if the same compromised identity can administer the storage and the backup target, you’re betting everything on “nobody will get that far.” In practice, somebody eventually does.

Conclusion

ZFS readonly isn’t a buzzword feature. It’s one of those unglamorous controls that changes incident outcomes. When you apply it to the right datasets—published data, archives, sealed periods, and backup replicas—you force ransomware into a narrower lane. It can still be loud. It can still be scary. But it’s much less likely to be fatal.

The trick is to treat dataset boundaries as security boundaries, pair readonly with snapshots you can actually trust, and practice the boring routines: sealing, replication, restore testing, and drift audits. In storage, the systems that survive aren’t the clever ones. They’re the ones with guardrails that are hard to bypass and easy to verify.
