ZFS Raw Send: Replicating Encrypted Data Without Sharing Keys

ZFS encryption is one of those rare features that actually makes the storage engineer’s day better: the data is encrypted on disk, the keys are managed at the dataset level, and replication can happen without turning your DR site into a key escrow service. The magic trick behind that last part is raw send: a way to replicate the encrypted bytes as-is, so the destination can store and serve the dataset without ever learning the encryption keys.

This is not a theoretical “security whitepaper” story. Raw send is how you build sane multi-site backups when legal, compliance, or plain common sense says “the backup provider must not have keys.” It’s also how you avoid the classic mistake of decrypting on the source just so you can ship plaintext to somewhere “safe.” Spoiler: somewhere “safe” eventually becomes “somewhere with a compromised admin laptop.”

What raw send actually is (and what it isn’t)

Let’s get precise, because ZFS replication discussions tend to collapse into vague statements like “it’s encrypted, so it’s safe.” Raw send is not “use ssh and call it encrypted.” Raw send is not “send a dataset that happens to be encrypted.” Raw send is a specific ZFS send mode where the send stream contains the encrypted on-disk blocks and metadata, rather than ZFS decrypting them for the stream.

In practical terms:

  • Normal send of an encrypted dataset generally produces a stream of decrypted data (unless you tell ZFS otherwise). The receiver reconstructs the dataset content, but the stream itself is not the ciphertext that lived on disk.
  • Raw send produces a stream that keeps the dataset encrypted end-to-end as ZFS stores it. The receiver stores the encrypted dataset and can replicate it further, take snapshots, and so on—without being able to mount/read it unless a key is later provided.

One sentence summary: raw send replicates the ciphertext and the encryption context, not the plaintext.
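
A minimal side-by-side sketch, assuming a hypothetical encrypted dataset and snapshot named tank/secret@snap1 (placeholder names) with the key loaded on the source:

cr0x@source:~$ zfs send tank/secret@snap1 > /tmp/plain.stream     # logical stream: ZFS decrypts blocks into the stream
cr0x@source:~$ zfs send -w tank/secret@snap1 > /tmp/raw.stream    # raw stream: on-disk ciphertext, receivable without keys

The first command only works with the key loaded; the second works either way, which is exactly the property that makes keyless DR targets possible.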

Two operational consequences that matter:

  • With raw send, your DR site can be a “dumb vault” for encrypted datasets. You can hand them storage and replication responsibility without handing them the keys.
  • With raw send, the receiving side can still do ZFS things (snapshots, holds, further replication), but it can’t read or repair the contents for you: lose the keys and the replica is just well-organized ciphertext. Encryption is a great teacher: it grades you harshly.

    Encryption is like a seatbelt: it feels restrictive until you watch someone try to “just hold on” during an accident.

    Facts & historical context

    Storage engineering is full of “we invented this in 2005 and are still explaining it in 2025” energy. Here are a few concrete facts and bits of context that make raw send click:

    1. ZFS send/receive predates native ZFS encryption. Replication existed long before encryption was part of the core design, so the replication model had to be extended carefully without breaking existing assumptions.
    2. ZFS encryption is dataset-native and block-level. It’s not a filesystem-on-top-of-LUKS story. ZFS encrypts blocks and stores encryption metadata with the dataset, enabling per-dataset keys and key rotation.
    3. Raw send exists because “encrypted dataset” doesn’t automatically mean “encrypted replication stream.” Without raw mode, ZFS may send a logical representation of the dataset that is effectively plaintext-in-stream.
    4. Snapshots are the unit of replication. ZFS send is not “sync the folder.” It’s “send the difference between two points-in-time.” This is why your snapshot policy is your replication policy.
    5. Resume tokens changed the operational story. On unstable links, “start over” used to be the default behavior. Resume tokens let you continue a receive without resending everything.
    6. Raw send preserves the encryption root and properties. The encryption boundary matters: datasets can inherit encryption, but raw send needs the full encryption context to remain consistent across sites.
    7. Dedup and compression are not free lunches. Especially with encryption in play, expectations about dedup effectiveness often collide with reality (and RAM requirements).
    8. ZFS streams can include embedded data, large blocks, and recordsize behavior. If you’re used to file-level tools, the “why is this 3x bigger” moment happens fast when recordsize, compression, and snapshot churn interact.

    Threat model: what raw send protects (and what it doesn’t)

    Raw send is a security tool, not a security blanket. Use it to solve the right problem.

    What raw send protects

    • Destination storage admins can’t read data without keys. They can replicate it, snapshot it, destroy it, or lose it, but they can’t browse it.
    • Intermediary systems can’t read the stream contents if the raw stream is transported over untrusted infrastructure. (They can still tamper with it, which is a different conversation.)
    • Key separation across environments. You can keep production keys in production and still maintain offsite copies of the encrypted datasets.

    What raw send does not protect

    • Integrity by itself. Raw send gives you confidentiality, not authenticity. ZFS checksums help detect corruption, but you still need to think about transport integrity, disk errors, and hostile modification.
    • Availability. If you lose keys, your DR site becomes a museum exhibit titled “Encrypted Blocks, Circa Last Tuesday.”
    • Metadata leakage beyond what encryption covers. Depending on implementation and properties, some metadata (dataset names, snapshot names, sizes, timing) may be observable through ZFS properties and operational logs.
    • Bad snapshot hygiene. If you snapshot and replicate secrets you shouldn’t have kept, raw send faithfully ships your mistake offsite.

    Encryption and send streams: the mental model you need

    The most useful mental model is this: ZFS has two broad ways to represent a dataset for replication.

    • Logical representation: “Here are the files/blocks as they should appear.” ZFS can reconstruct from this and write out blocks on the receiver. This is where plaintext can show up in the stream even if the on-disk data is encrypted—because ZFS is describing the logical contents.
    • Raw representation: “Here are the exact on-disk blocks and metadata.” If the dataset is encrypted, those blocks are ciphertext. The receiver stores them as ciphertext. No keys required to store them; keys required to interpret them.

    Raw send also interacts with the concept of an encryption root. In ZFS, a dataset may inherit encryption from a parent. The encryption root is where encryption settings are established. When you replicate, you need to preserve the encryption boundary, or you’ll end up with a destination that can’t properly represent key inheritance and properties.

    This is where people trip: they assume “encrypted dataset” implies “receiver will be able to load key and mount it.” That can be true, but it’s optional. Raw send is explicitly designed so the receiver does not need keys to receive. Whether it can later mount depends on whether you later load the key at the destination and whether the encryption parameters match what you expect.

    In storage, “it worked in staging” is just a polite way of saying “production has more entropy.”

    Practical tasks: commands you’ll run in production

    Below are real tasks you’ll do when building and operating raw send replication. Each task includes commands and what to look for. I’m assuming OpenZFS on Linux-style tooling, but the concepts translate to other platforms with minor flag differences.

    Task 1: Confirm the dataset is encrypted and identify the encryption root

    cr0x@source:~$ zfs get -H -o name,property,value encryption,keylocation,keystatus,encryptionroot tank/prod
    tank/prod	encryption	aes-256-gcm
    tank/prod	keylocation	prompt
    tank/prod	keystatus	available
    tank/prod	encryptionroot	tank/prod

    Interpretation: If encryption=off, raw send will not magically encrypt anything. If encryptionroot is a parent dataset, be aware you might be replicating a tree where children inherit keys.
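
    If the encryption root is a parent, a quick recursive check (a sketch using this article’s dataset names) shows at a glance which children share a key:

    cr0x@source:~$ zfs get -r -H -o name,value encryptionroot tank/prod

    Every child that reports the same encryption root is covered by the same key inheritance when you replicate the tree.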

    Task 2: Create a replication snapshot with a naming convention that won’t hurt later

    cr0x@source:~$ zfs snapshot -r tank/prod@rep-2025-12-25T0200Z
    cr0x@source:~$ zfs list -t snapshot -o name,used,creation -r tank/prod | tail -5
    tank/prod@rep-2025-12-25T0200Z	0B	Thu Dec 25 02:00 2025
    tank/prod/db@rep-2025-12-25T0200Z	0B	Thu Dec 25 02:00 2025
    tank/prod/app@rep-2025-12-25T0200Z	0B	Thu Dec 25 02:00 2025

    Interpretation: Snapshot names become operational handles. If your naming is inconsistent, your incremental replication scripts will become haunted.

    Task 3: Do a full raw send into a new destination dataset (unmounted)

    cr0x@source:~$ zfs send -w -R tank/prod@rep-2025-12-25T0200Z | ssh cr0x@dest "zfs receive -uF backup/prod"
    cr0x@dest:~$ zfs list -o name,used,avail,keystatus,mounted backup/prod
    NAME         USED  AVAIL  KEYSTATUS  MOUNTED
    backup/prod  112G  38.2T  unavailable  no

    Interpretation: -w is the raw send switch. -R replicates the dataset and its descendants plus properties. On the receiver, -u keeps it unmounted (good practice when keys are not present). Seeing keystatus=unavailable on the destination is the point: it received encrypted data without keys.
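
    Because -R brings descendants along, it’s worth a recursive check on the destination (a sketch using this article’s dataset names):

    cr0x@dest:~$ zfs list -r -o name,keystatus,mounted backup/prod

    Every row should show keystatus as unavailable and mounted as no; anything else deserves investigation before the next incremental.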

    Task 4: Verify you actually received an encrypted dataset (not plaintext)

    cr0x@dest:~$ zfs get -H -o name,property,value encryption,keystatus,encryptionroot backup/prod
    backup/prod	encryption	aes-256-gcm
    backup/prod	keystatus	unavailable
    backup/prod	encryptionroot	backup/prod

    Interpretation: If encryption is off here, you did not do what you thought you did. Stop and investigate before you replicate further.

    Task 5: Perform an incremental raw send (the workhorse)

    cr0x@source:~$ zfs snapshot -r tank/prod@rep-2025-12-25T0300Z
    cr0x@source:~$ zfs send -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z | ssh cr0x@dest "zfs receive -uF backup/prod"
    cr0x@dest:~$ zfs list -t snapshot -o name,creation -r backup/prod | tail -3
    backup/prod@rep-2025-12-25T0200Z	Thu Dec 25 02:00 2025
    backup/prod@rep-2025-12-25T0300Z	Thu Dec 25 03:00 2025

    Interpretation: -I sends all intermediate snapshots between two snapshot points (useful when you might have skipped some). If your snapshot graph is clean, -i is tighter. For operational simplicity, -I often wins.
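
    For comparison, the tighter -i form of the same incremental (only the delta between exactly these two snapshots, no intermediates) would look like this:

    cr0x@source:~$ zfs send -w -R -i tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z | ssh cr0x@dest "zfs receive -uF backup/prod"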

    Task 6: Estimate send size before you saturate a link (or a change window)

    cr0x@source:~$ zfs send -nP -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z
    size	21474836480

    Interpretation: -nP is a dry run with parsable output. That “size” is your planning number. If this is wildly bigger than expected, you probably have snapshot churn, recordsize mismatch, or an app doing “rewrite the whole file every minute.”
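
    If you want that number in human units for a change-window estimate, a small sketch (assuming GNU coreutils’ numfmt is available):

    cr0x@source:~$ zfs send -nP -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z | awk '$1 == "size" {print $2}' | numfmt --to=iec

    Divide that by your realistic link throughput, not the advertised one, and you have a defensible window estimate.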

    Task 7: Receive into a safe holding area using -u and verify nothing auto-mounts

    cr0x@dest:~$ zfs receive -uF backup/prod < /dev/null
    cannot receive: failed to read from stream

    Interpretation: The command fails because there’s no stream, but the point is cultural: always include -u in automation for encrypted DR copies unless you are explicitly running a warm standby with keys loaded.

    Task 8: Use resume tokens to survive flaky links

    cr0x@dest:~$ zfs get -H -o value receive_resume_token backup/prod
    1-8f9c2f1d7c-100000-7890abcdef-200000-10

    Interpretation: If a receive that was started with -s (save partial state) gets interrupted, this property contains a token. It’s not a trophy; it’s your escape hatch.

    cr0x@source:~$ ssh cr0x@dest "zfs get -H -o value receive_resume_token backup/prod"
    1-8f9c2f1d7c-100000-7890abcdef-200000-10
    cr0x@source:~$ zfs send -t 1-8f9c2f1d7c-100000-7890abcdef-200000-10 | ssh cr0x@dest "zfs receive -su backup/prod"

    Interpretation: zfs send -t resumes from the token; the token records how the original send was made (including raw mode), so you don’t repeat -w here. This can save hours (and your relationship with the network team).
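
    A resume-aware wrapper is mostly an if-statement. A sketch (it assumes the interrupted receive was started with -s, so a token exists to find):

    cr0x@source:~$ TOKEN=$(ssh cr0x@dest "zfs get -H -o value receive_resume_token backup/prod")
    cr0x@source:~$ if [ "${TOKEN}" != "-" ]; then
    >   zfs send -t "${TOKEN}" | ssh cr0x@dest "zfs receive -su backup/prod"
    > else
    >   echo "no partial receive to resume"
    > fi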

    Task 9: Confirm the destination can’t mount without keys (expected behavior)

    cr0x@dest:~$ zfs mount backup/prod
    cannot mount 'backup/prod': encryption key not loaded

    Interpretation: This is a success condition for “vault mode.” If you expected to mount, you need a key management plan, not a different ZFS flag.

    Task 10: Load a key on the destination (only when appropriate)

    cr0x@dest:~$ zfs load-key backup/prod
    Enter passphrase for 'backup/prod': 
    cr0x@dest:~$ zfs get -H -o value keystatus backup/prod
    available
    cr0x@dest:~$ zfs mount backup/prod
    cr0x@dest:~$ zfs get -H -o value mounted backup/prod
    yes

    Interpretation: You have now crossed the line from “storage-only DR” to “read-capable DR.” This is where audit folks start asking who can type that passphrase and where it’s stored.
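
    If the passphrase is delivered by secrets tooling rather than typed, a one-off load can point at a root-only keyfile (the path below is a placeholder, not a recommendation for where keys should live):

    cr0x@dest:~$ zfs load-key -L file:///run/keys/backup-prod.key backup/prod

    The file contents must match the dataset’s keyformat, and the audit question simply moves from “who can type the passphrase” to “who can read that file.”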

    Task 11: Validate replication state by comparing snapshot lists

    cr0x@source:~$ zfs list -t snapshot -o name -r tank/prod | grep '@rep-' | tail -5
    tank/prod@rep-2025-12-25T0200Z
    tank/prod@rep-2025-12-25T0300Z
    tank/prod@rep-2025-12-25T0400Z
    tank/prod@rep-2025-12-25T0500Z
    tank/prod@rep-2025-12-25T0600Z
    
    cr0x@dest:~$ zfs list -t snapshot -o name -r backup/prod | grep '@rep-' | tail -5
    backup/prod@rep-2025-12-25T0200Z
    backup/prod@rep-2025-12-25T0300Z
    backup/prod@rep-2025-12-25T0400Z
    backup/prod@rep-2025-12-25T0500Z
    backup/prod@rep-2025-12-25T0600Z

    Interpretation: Replication correctness starts with “do we have the same snapshot endpoints?” Don’t overcomplicate it.
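
    If you want the comparison automated rather than eyeballed, a sketch assuming bash process substitution on the source host:

    cr0x@source:~$ diff <(zfs list -H -t snapshot -o name -r tank/prod | grep '@rep-' | sed 's|^tank/prod|backup/prod|' | sort) \
    >      <(ssh cr0x@dest "zfs list -H -t snapshot -o name -r backup/prod" | grep '@rep-' | sort)

    Empty output means the endpoints match; any line prefixed with "<" is a snapshot the destination is missing.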

    Task 12: Use holds to prevent snapshot deletion mid-replication

    cr0x@source:~$ zfs hold -r keepforrep tank/prod@rep-2025-12-25T0600Z
    cr0x@source:~$ zfs holds -r tank/prod@rep-2025-12-25T0600Z | head
    NAME                          TAG         TIMESTAMP
    tank/prod@rep-2025-12-25T0600Z  keepforrep  Thu Dec 25 06:01 2025

    Interpretation: Holds are underrated. They stop well-meaning cleanup jobs from deleting the snapshot your incremental chain depends on.

    Task 13: Check send/receive throughput and identify where time goes

    cr0x@source:~$ zpool iostat -v 2
                                  capacity     operations     bandwidth
    pool                        alloc   free   read  write   read  write
    --------------------------  -----  -----  -----  -----  -----  -----
    tank                         9.21T  5.32T    210    980   185M   612M
      raidz2-0                   9.21T  5.32T    210    980   185M   612M
        sda                          -      -     26    120  23.1M  76.4M
        sdb                          -      -     25    122  22.9M  77.1M
    ...

    Interpretation: If disk write bandwidth is high but your network is low, you might be bottlenecked on CPU (encryption/compression), ssh cipher choice, or receiver-side sync writes. If disk read is low, the send might not be pulling as much as expected (or is blocked elsewhere).
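
    To see what the pipeline itself sustains, put a meter between send and ssh (a sketch assuming pv is installed on the source):

    cr0x@source:~$ zfs send -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z | pv -rtab | ssh cr0x@dest "zfs receive -uF backup/prod"

    pv shows the end-to-end rate of the whole pipeline; compare it with zpool iostat on both ends and the network counters to see which stage is the ceiling.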

    Task 14: Receive safely when destination already has data (force rollback with care)

    cr0x@dest:~$ zfs receive -uF backup/prod
    cannot receive: failed to read from stream

    Interpretation: The important part is the flag: -F rolls the destination back to its most recent snapshot before receiving, and with -R it can also destroy destination snapshots and datasets that no longer exist on the source. It’s powerful, and it can destroy newer local snapshots and changes. Use it only when the destination is strictly a replication target, not a place people do local work.

    Task 15: Verify encryption key is not accidentally copied into automation

    cr0x@dest:~$ zfs get -H -o name,property,value keylocation backup/prod
    backup/prod	keylocation	prompt

    Interpretation: For “vault mode,” keylocation=prompt (rather than a file:// URI that is readable at boot) reduces the chance of keys being quietly loaded on reboot. If your environment expects auto-load, document it and audit it.

    Three corporate-world mini-stories

    1) Incident caused by a wrong assumption: “Encrypted dataset” equals “encrypted replication”

    In one large organization (the kind with three ticketing systems and zero shared calendars), a team moved a sensitive dataset onto ZFS encryption and declared victory. Compliance liked the words “AES-256,” leadership liked that nobody asked for budget, and the storage team liked that they didn’t have to teach developers about secrets management.

    Then came DR replication. Someone wired up zfs send and zfs receive in a hurry, because the old backup system was end-of-life and the vendor had started speaking exclusively in renewal quotes. The replication worked, snapshots appeared at the remote site, and everyone went back to their day jobs.

    Months later, during a security review, an auditor asked a simple question: “Can admins at the remote site read the replicated data?” The answer was supposed to be “no.” The test result was “yes,” and it was awkward in exactly the way you’re imagining—quiet, professional, and with a lot of notes being taken.

    The postmortem was not about malice. It was about an assumption: the team thought encryption-on-disk implied encryption-in-flight and encryption-in-stream. But their replication was sending a non-raw stream, and the receiver was storing a dataset that could be mounted locally without the original encryption boundary. They had built DR that looked correct in monitoring but violated the core security requirement.

    The fix was straightforward: rebuild replication using raw send, verify destination keystatus=unavailable, and write a runbook that forced an explicit decision: “Are we building a vault or a warm standby?” The lesson was the one we always relearn: in ZFS, a dataset property is not a policy unless you test it end-to-end.

    2) Optimization that backfired: “Let’s turn on dedup for the backups”

    Another place, another good intention. They had nightly snapshots of VM images and database volumes. Replication to the DR site was chewing bandwidth and they wanted it smaller. Someone suggested dedup: “We have lots of similar blocks; ZFS dedup will crush this.” The idea sounded plausible, and the phrase “crush bandwidth” has a certain music to it.

    They enabled dedup on the destination pool, reasoning that it was “just backups.” For a short while, they saw improved space usage and slightly smaller deltas. Then the system began to feel sluggish. At first it looked like network jitter. Then it looked like disk latency. Eventually it looked like everything: slow receives, slow snapshots, and occasional stalls that made people suspect the hypervisor layer.

    The backfire was predictable in hindsight: dedup needs memory for the DDT (dedup table), and raw encrypted blocks don’t dedup the way people intuitively expect. Encryption tends to turn “similar plaintext blocks” into “random-looking ciphertext blocks,” so the dedup win may be small while the DDT cost is real. They had traded bandwidth for persistent complexity and a new failure mode.

    They rolled back dedup, accepted that compression and sane snapshot frequency were better tools, and added a metric: “replication cost per changed GB.” The war story takeaway: if an optimization changes the memory and metadata profile of your storage, you didn’t optimize; you changed the system. Sometimes that’s okay, but it’s never “free.”

    3) A boring but correct practice that saved the day: snapshot holds and chain discipline

    The most heroic incidents are often the least glamorous. One team ran hourly snapshots and raw incremental replication to a remote site. They also had a painfully boring rule: replication snapshots got a hold tag, and cleanup jobs were forbidden to remove held snapshots. It wasn’t clever. It didn’t impress anyone in architecture meetings. It was just… policy.

    Then a change hit: an application team deployed a new build that generated massive churn—rewriting large files instead of appending. Replication deltas ballooned, the link saturated, and the replication job fell behind. Meanwhile, the cleanup job on the source was happily deleting older snapshots because “we only keep 48 hours.”

    Except it couldn’t delete the replication anchor snapshots, because they were held. That meant the incremental chain stayed intact while the operations team throttled workloads, adjusted snapshot frequency, and gave replication time to catch up. Without holds, they would have lost the base snapshot mid-stream and been forced into a full resync at the worst possible time.

    When the smoke cleared, the postmortem had no villains. The replication system survived because of a practice that nobody wanted to talk about: naming conventions, holds, and refusing to be clever. In production, boring is a feature.

    Fast diagnosis playbook

    When raw send replication is slow, failing, or behaving oddly, you don’t need a 40-step checklist first. You need a fast funnel that identifies the bottleneck category: CPU, disk, network, or ZFS semantics (snapshots/streams/properties). This is the playbook I use before I let anyone start “tuning.”

    First: confirm what you’re actually doing (raw vs non-raw, full vs incremental)

    cr0x@source:~$ zfs send -nP -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z
    size	21474836480

    Look for: if size is far larger than expected, the “bottleneck” is often “you’re sending more than you think” due to churn or snapshot selection.

    Second: check receiver-side constraints (space, readonly, incompatible dataset state)

    cr0x@dest:~$ zpool list
    NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
    backup  40T   28.1T  11.9T        -         -    21%    70%  1.00x  ONLINE  -
    
    cr0x@dest:~$ zfs get -H -o name,property,value readonly,available,receive_resume_token backup/prod
    backup/prod	readonly	off
    backup/prod	available	11.2T
    backup/prod	receive_resume_token	-

    Look for: low free space, readonly=on, or a resume token that indicates repeated interruptions.

    Third: decide whether the bottleneck is disk, CPU, or network

    Disk:

    cr0x@source:~$ zpool iostat -v 2
    cr0x@dest:~$ zpool iostat -v 2

    Look for: high await/latency (on platforms that show it), single vdev saturation, or receiver writes pegged while CPU is idle.

    CPU / crypto:

    cr0x@source:~$ top -b -n 1 | head -20
    cr0x@dest:~$ top -b -n 1 | head -20

    Look for: one core pegged by ssh or kernel crypto, indicating cipher choice or missing hardware acceleration.
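
    Two quick sanity checks on Linux x86 hosts (a sketch; adjust for your platform):

    cr0x@source:~$ grep -c -w aes /proc/cpuinfo
    cr0x@source:~$ ssh -c aes128-gcm@openssh.com cr0x@dest true

    A zero from the first command means no AES instructions are advertised; the second confirms an AES-GCM cipher can actually be negotiated with the destination before you blame the disks.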

    Network:

    cr0x@source:~$ ip -s link show dev eth0
    cr0x@dest:~$ ip -s link show dev eth0

    Look for: errors/drops. If the link is clean but throughput is low, suspect a single-threaded stage (ssh cipher, userland buffer, or receiver sync behavior).

    Fourth: inspect ZFS receive behavior and stream compatibility

    cr0x@dest:~$ zpool events -v | tail -30

    Look for: events about failed receives, checksum errors, or pool-level faults that make replication “slow” because it’s constantly retrying.

    Common mistakes, symptoms, fixes

    Mistake 1: Forgetting raw mode and assuming encryption carries over

    Symptom: destination dataset is mountable/readable without loading the expected key; or destination encryption=off.

    Fix: rebuild replication with raw send and verify at the destination:

    cr0x@source:~$ zfs send -w -R tank/prod@rep-2025-12-25T0200Z | ssh cr0x@dest "zfs receive -uF backup/prod"
    cr0x@dest:~$ zfs get -H -o name,property,value encryption,keystatus backup/prod
    backup/prod	encryption	aes-256-gcm
    backup/prod	keystatus	unavailable

    Mistake 2: Breaking incremental chains by deleting snapshots

    Symptom: cannot send '...': incremental source ... does not exist or receiver refuses stream due to missing base.

    Fix: add holds to anchor snapshots and only prune snapshots that are confirmed replicated.

    cr0x@source:~$ zfs hold -r keepforrep tank/prod@rep-2025-12-25T0600Z
    cr0x@source:~$ zfs release -r keepforrep tank/prod@rep-2025-12-25T0200Z
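
    A small guard turns “confirmed replicated” from a hope into a check (a sketch using this article’s dataset names):

    cr0x@source:~$ OLD=tank/prod@rep-2025-12-25T0200Z
    cr0x@source:~$ if ssh cr0x@dest "zfs list -H -o name backup/prod@rep-2025-12-25T0200Z" >/dev/null 2>&1; then
    >   zfs release -r keepforrep "${OLD}" && zfs destroy -r "${OLD}"
    > fi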

    Mistake 3: Using -F blindly on a target that isn’t “replication-only”

    Symptom: “mysterious” loss of local snapshots or local changes on destination after replication runs.

    Fix: separate datasets: one strictly for replication, one for local use. Or remove -F and handle divergences explicitly.

    Mistake 4: Letting the destination auto-mount encrypted datasets

    Symptom: unexpected mounts at boot, keys loaded “somehow,” increased blast radius of a compromised DR admin account.

    Fix: receive with -u, set controlled keylocation, and audit key loading procedures.

    cr0x@dest:~$ zfs set canmount=off backup/prod
    cr0x@dest:~$ zfs get -H -o name,property,value canmount,mounted backup/prod
    backup/prod	canmount	off
    backup/prod	mounted	no

    Mistake 5: Confusing “encrypted transport” with “raw send confidentiality”

    Symptom: security review flags that backups are decryptable at destination, or that intermediate systems might see plaintext.

    Fix: treat ssh/TLS as transport protection; treat raw send as data-format protection. Use both when you need defense in depth.

    Mistake 6: Performance “tuning” by adding layers (and latency)

    Symptom: replication slower after introducing extra compression, extra encryption, or heavy buffering defaults.

    Fix: measure each stage; simplify pipeline; confirm CPU and disk balance. If you add a tool, you must justify it with metrics.

    Checklists / step-by-step plan

    Plan A: Build a vault-style DR replica (destination cannot read data)

    1. On source, confirm encryption is enabled and note encryption root.
    2. Create a recursive replication snapshot with a stable naming scheme.
    3. Run a dry-run size estimate for the initial full send window.
    4. Do full raw send with -w -R.
    5. Receive with -u and consider canmount=off on destination.
    6. Verify destination keystatus=unavailable and dataset stays unmounted.
    7. Automate hourly/daily incremental raw sends with snapshot holds.
    8. Set pruning rules that respect holds and verify replicated snapshots exist before deletion.

    Plan B: Build a warm standby (destination can mount with controlled keys)

    1. Do everything from Plan A initially.
    2. Define key handling: who loads keys, where passphrases live, how audits happen.
    3. Test manual key load and mount in a controlled maintenance window.
    4. Decide if keys should auto-load on boot (most environments should say “no” unless there is a strong operational reason).
    5. Run a failover rehearsal: promote services, confirm application consistency, measure RTO.

    Step-by-step: A minimal, reproducible replication loop

    This is the “if you’re tired and it’s 02:00” version. It assumes SSH connectivity and that destination is a strict replica.

    cr0x@source:~$ export SRC=tank/prod
    cr0x@source:~$ export DST=backup/prod
    cr0x@source:~$ export SNAP=rep-$(date -u +%Y-%m-%dT%H%MZ)
    
    cr0x@source:~$ zfs snapshot -r ${SRC}@${SNAP}
    cr0x@source:~$ zfs hold -r keepforrep ${SRC}@${SNAP}
    
    cr0x@source:~$ LAST=$(zfs list -H -t snapshot -o name -s creation -d 1 ${SRC} | grep "^${SRC}@rep-" | tail -2 | head -1)
    cr0x@source:~$ CURR=${SRC}@${SNAP}
    
    cr0x@source:~$ zfs send -nP -w -R -I ${LAST} ${CURR}
    
    cr0x@source:~$ zfs send -w -R -I ${LAST} ${CURR} | ssh cr0x@dest "zfs receive -uF ${DST}"

    Interpretation: This creates a new snapshot, holds it, finds the previous replication snapshot, estimates size, and then sends incrementally in raw mode. In real automation, you’d handle the “first run” case and release holds when safe.
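
    One way to handle that “first run” case (a sketch, not a finished tool): if there is no earlier replication snapshot, LAST and CURR end up identical, and you fall back to a full send.

    cr0x@source:~$ if [ -z "${LAST}" ] || [ "${LAST}" = "${CURR}" ]; then
    >   zfs send -w -R ${CURR} | ssh cr0x@dest "zfs receive -uF ${DST}"
    > else
    >   zfs send -w -R -I ${LAST} ${CURR} | ssh cr0x@dest "zfs receive -uF ${DST}"
    > fi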

    FAQ

    1) Does raw send mean I don’t need SSH encryption?

    No. Raw send protects the payload’s confidentiality even if intercepted, but SSH still matters for authentication, protection against tampering in transit, and reducing operational risk. Use raw send for “no keys on destination,” and SSH for “no strangers inject garbage into my receive.”

    2) Can the destination replicate the encrypted dataset onward without keys?

    Yes, that’s one of the best parts. The destination can perform raw sends of the encrypted dataset to another target, acting as a relay or secondary vault, without ever loading keys.

    3) What does keystatus=unavailable actually mean?

    It means ZFS knows the dataset is encrypted but does not have the key loaded. The dataset can exist, replicate, and be managed, but it cannot be mounted or read.

    4) If I lose keys, can I restore from the raw replica?

    You can restore the dataset structure and snapshots, but you cannot decrypt the contents. The raw replica is not a workaround for key loss. Treat keys like the crown jewels: backed up, access controlled, and tested.

    5) Why is my incremental send huge when only a little data changed?

    Common causes: an application rewrites large files instead of appending, a VM image sees large block churn, recordsize mismatch amplifies changes, or you’re sending a broader snapshot range than intended (e.g., -I including intermediate snapshots with more churn).

    6) Can I change properties on the destination replica?

    You can, but be careful: replication with -R tends to preserve properties from the source, and repeated receives can revert destination-side changes. If you need different properties at the destination (mountpoints, quotas), plan for it explicitly and test how your receive flags interact.
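
    Recent OpenZFS versions let you override or exclude properties at receive time, which is usually cleaner than re-setting them after every run (a sketch; whether a given property can be overridden on a raw -R stream depends on your version, so test it):

    cr0x@source:~$ zfs send -w -R -I tank/prod@rep-2025-12-25T0200Z tank/prod@rep-2025-12-25T0300Z | ssh cr0x@dest "zfs receive -uF -o canmount=off -x quota backup/prod"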

    7) What’s the difference between receiving with -u and setting canmount=off?

    -u prevents mounting during the receive operation. canmount=off is a persistent property that prevents mounting at all unless changed. For vault replicas, I often do both: receive unmounted and keep it that way by policy.

    8) Should I use -i or -I for incrementals?

    -i sends from one snapshot to another directly (tighter, assumes clean chains). -I includes intermediate snapshots between the points (more resilient to skipped sends, sometimes larger streams). In corporate environments with missed jobs and human scheduling, -I is frequently the safer default.

    9) How do I know the destination is not accidentally getting keys?

    Check keystatus and keylocation, and audit who can run zfs load-key. Also check boot-time units/scripts for key loading behavior. The “keys aren’t there” promise fails most often due to convenience automation added later.

    10) Is raw send slower than normal send?

    It depends. Raw send can be efficient because it ships stored blocks, but overall throughput depends on the whole pipeline: source read, ZFS send processing, transport, receiver writes, and pool topology. The bigger performance predictor is usually churn and I/O characteristics, not the raw flag itself.

    Conclusion

    ZFS raw send is one of those features that quietly enables grown-up infrastructure: you can replicate encrypted datasets to places that should never see keys, you can build multi-hop backup chains, and you can do it with first-class ZFS semantics—snapshots, incrementals, and resumable streams.

    The operational discipline is the real cost: snapshot hygiene, chain integrity, key management boundaries, and verifying what you think you’re replicating. Do those well and raw send is a gift. Do them casually and you’ll either leak data or discover, at the worst moment, that your “backups” are cryptographic paperweights.
