ZFS copies=2/3: Extra Redundancy Without a New VDEV—Smart or Waste?

ZFS has a property called copies that looks like a cheat code: set it to 2 or 3 and your data gets extra redundancy—no new disks, no new vdevs, no pool rebuild. It’s tempting, especially when procurement is “looking into it” and your risk register is “on fire.”

In practice, copies is neither magic nor useless. It’s a sharp tool that can save a production week—or quietly burn a pile of capacity and performance while giving you a false sense of safety. Let’s treat it like a grown-up feature: understand exactly what it duplicates, where it places those blocks, how it interacts with mirrors/RAIDZ/special vdevs, and how to operate it without turning your pool into a slow, expensive science project.

What copies really does (and what it doesn’t)

The ZFS dataset property copies controls how many copies ZFS stores for each block written to that dataset: copies=1 (default), copies=2, or copies=3.
It’s per-dataset, inheritable, and it applies only to newly written (or rewritten) blocks; existing blocks keep their old policy until they are rewritten.
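For example, you can scope the property at dataset creation time (dataset names here are placeholders), and child datasets will inherit it unless you override them:

cr0x@server:~$ sudo zfs create -o copies=2 tank/critical
cr0x@server:~$ zfs get copies tank/critical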

What it does

When you write a block, ZFS allocates space and writes that block. With copies=2, it allocates two distinct block locations and writes the same block twice. With copies=3, three locations.
Reads can come from any valid copy; if a copy fails checksum validation, ZFS will try another.

What it does not do

copies does not turn a single-disk pool into a safe pool. It does not protect you from losing a vdev. It does not survive “disk is gone” unless the redundancy of the vdev(s) can still provide at least one valid copy.
It also does not replace backups. It is still the same pool, same administrative domain, same “oops I deleted it” blast radius.

A useful mental model: vdev redundancy (mirror/RAIDZ) protects you from device failure. copies protects you from some block-level loss/corruption scenarios within the surviving topology, and it can improve recoverability when you’re already running degraded or have flaky sectors.

Joke #1: Setting copies=3 on your entire pool is like triple-bagging groceries—great until you realize you can’t carry anything else and you still forgot the eggs.

Quick facts & historical context

Here are a few concrete context points that shape how copies should be understood in 2025:

  1. ZFS was built around end-to-end checksumming: corruption is detected at read time, not guessed at via drive “success.” That makes multiple copies meaningful because ZFS can choose a known-good copy.
  2. Early ZFS deployments pushed “scrub or regret”: periodic scrubs turned latent sector errors into detected-and-repaired events—long before consumer storage got good at admitting problems.
  3. copies is older than many people’s ZFS careers: it’s not a trendy add-on; it’s been around for years as a targeted redundancy knob.
  4. RAIDZ is not “just RAID”: ZFS parity is integrated with checksums and self-healing reads, but parity still has failure domains (whole vdev loss, missing devices) that extra copies can’t magically cross.
  5. Special vdevs changed what “metadata safety” means: moving metadata (and optionally small blocks) onto a special vdev is a performance win—and a reliability trap if you skimp on redundancy for that vdev. Extra copies can be part of the mitigation story, but not the whole story.
  6. Modern drives lie politely: a “successful” read can still deliver bad bits. Checksums catch this; redundancy fixes it. Extra copies can provide another chance when parity reconstruction is painful or impossible.
  7. Compression became mainstream: with compression=lz4, the capacity tax of copies=2 is sometimes less catastrophic than you’d assume—on compressible datasets.
  8. Record sizes got bigger, small files got weirder: databases, VM images, and object stores stress different block patterns; copies interacts with that in ways that aren’t intuitive from “twice the data, twice the cost.”

Under the hood: block placement, checksums, and failure modes

Copies are separate block allocations, not “RAID inside RAID”

ZFS doesn’t store a “primary block” plus an “internal mirror.” It stores multiple independent block pointers in metadata that all point to equivalent block payloads.
On write, ZFS picks multiple allocation targets. On read, ZFS can choose among them.

The important nuance: those allocations typically end up in the same vdev class (normal data vdevs, or special vdevs if the block is classified to go there), and within the same pool.
That means extra copies are not independent in the way a second pool, a second chassis, or a second site is independent.
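If you want to see those multiple pointers yourself, zdb can dump an object’s block pointers on a test pool; with copies=2 each data block pointer should list two DVA entries (DVA[0] and DVA[1]). This is a rough sketch with placeholder names, and the exact output format varies by OpenZFS version:

cr0x@server:~$ echo probe > /tank/critical/copies-probe
cr0x@server:~$ sudo zpool sync tank
cr0x@server:~$ ls -i /tank/critical/copies-probe
cr0x@server:~$ sudo zdb -ddddd tank/critical <object-number-from-ls>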

Checksums are the referee

ZFS checksums every block and stores checksums in parent blocks (not next to the data it protects). When you read a block, ZFS validates it. If the checksum fails and redundancy exists, ZFS can try:

  • another copy (with copies>1),
  • another mirror side (on a mirror vdev),
  • parity reconstruction (on RAIDZ),
  • or any combination of the above.

What failures does copies help with?

In the real world, the failures that matter are rarely “drive dead, replaced cleanly.” They’re usually:

  • Latent sector errors surfacing during scrub/resilver.
  • Partially failing devices returning intermittent read errors.
  • Firmware/transport glitches that cause timeouts and I/O errors under load.
  • Bad blocks in a degraded pool, where parity reconstruction or mirror alternatives are already constrained.

Extra copies can help when the pool has redundancy to keep at least one copy accessible, but individual blocks become unreadable in one location.
If the vdev itself is lost (e.g., too many disks gone in a RAIDZ, or both sides of a two-way mirror dead), all copies stored on that vdev are irrelevant.

Where do the copies go?

The practical answer: they go where the allocator finds space, subject to ZFS’s block allocation and metaslab behavior.
The uncomfortable truth: you should not assume the copies land on separate physical disks in a way that mimics a mirror.
Both copies can touch the same disks in a RAIDZ vdev (at different offsets), because RAIDZ stripes data and parity across all member disks and the “copy” is just another logical block distributed by the same RAIDZ mapping. This is still useful: it is a second, independently allocated block that can survive localized corruption or unreadable regions, but it is not a second independent fault domain.

Interaction with mirrors vs RAIDZ

On a mirror vdev, you already have two (or more) physical copies across disks. Setting copies=2 on top of a two-way mirror means you now have two logical copies, each of which is mirrored—effectively four physical instances across two disks. That can help with certain corruption/repair edge cases, but it’s a lot of write amplification for marginal gain.

On RAIDZ, copies can be more rational for small, high-value data where parity reconstruction during degraded periods is risky or slow. You’re paying for extra allocations, but you’re buying more “shots on goal” for a block to be readable without heroic recovery.

Special vdevs: the place where copies becomes political

If you use a special vdev to store metadata and (optionally) small blocks, the availability of that vdev becomes existential for the pool.
If the special vdev dies and you don’t have redundancy there, you can lose the pool even if your big RAIDZ data disks are fine.
Extra copies can reduce the chance that a specific metadata block becomes unreadable due to media errors, but it does not fix “special vdev lost.”
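A quick way to keep an eye on that tier is per-vdev capacity reporting; the special vdev shows up with its own alloc/free numbers, separate from the data vdevs:

cr0x@server:~$ zpool list -v tank

If the special vdev is filling much faster than the rest of the pool, revisit what you are sending there before you revisit copies.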

Smart or waste? A decision framework

The right question is not “is copies=2 good?” The right question is: “What failure am I trying to survive, and what am I willing to pay to survive it?”

When copies=2 is smart

  • Small, business-critical datasets: configs, secrets vault exports, small databases, license keys, CI artifacts that must exist.
  • Metadata-heavy trees: millions of small files, where a single unreadable metadata path can be a nightmare during restore.
  • Degraded-risk windows: environments where you expect to run degraded (remote sites, slow hands on hardware), and you want extra margin against UREs during scrub/resilver.
  • Boot environments and OS datasets: where “won’t boot” is operationally expensive even if data is intact elsewhere.
  • Special vdev datasets: some teams put copies=2 on metadata-heavy datasets when special vdev capacity allows, as a belt-and-suspenders approach against localized media errors.

When it’s waste (or worse)

  • Bulk media, backups, and cold archives: if you can rehydrate, extra copies are often wasted; use replication to another pool/site instead.
  • Databases with high write rates: write amplification hurts twice: latency and SSD endurance (or HDD IOPS). You’ll feel it and you’ll pay for it.
  • Entire pool set to copies=2 “just in case”: this is the storage equivalent of putting the whole company on a group chat because one person misses messages.
  • Trying to compensate for a bad topology: if your design is “single RAIDZ1 of huge drives,” copies is not the fix. The fix is the vdev design, spares, monitoring, and a realistic resilver plan.

A practical decision rubric

I use three questions in production:

  1. What is the recovery path if a few blocks are unreadable? If the answer is “restore from backup in hours” or “rebuild the object,” you probably don’t need copies.
  2. What is the write profile? If it’s high-churn random write, copies can turn a stable system into a latency complaint generator.
  3. Can I constrain the blast radius? If you can isolate critical data into a dataset and set copies there, the feature becomes viable. Blanket use is how you end up explaining storage math to finance.

Performance and capacity costs (the parts you actually feel)

Capacity: usually close to linear, sometimes not

The naive expectation is correct most of the time: copies=2 roughly doubles referenced space for that dataset; copies=3 roughly triples it.
Compression and recordsize can reduce the absolute numbers, but the multiplier effect remains: every block you keep, you keep multiple times.
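A rough worked example: 100 GiB of logical data that compresses at 1.6x occupies about 62 GiB on disk with copies=1, about 125 GiB with copies=2, and about 187 GiB with copies=3 (ignoring metadata and padding overhead). Compression softens the absolute cost, but the multiplier never goes away.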

Where people get surprised is not the multiplier; it’s the interaction with snapshots and rewrite patterns. ZFS is copy-on-write. If you have snapshots and you rewrite blocks, you now have:

  • old blocks retained by snapshots (with whatever copies setting existed when they were written),
  • new blocks written with the current copies setting,
  • and potentially multiple generations of duplicated data if the dataset churns.

Write amplification: the real price tag

Each extra copy is extra allocation, extra I/O, extra checksum work, and extra metadata updates. On HDD RAIDZ, the pain often shows up as:

  • higher write latency and longer txg sync times,
  • more fragmentation (more allocations per logical write),
  • worse small-write IOPS.

On SSD/NVMe pools, it can still hurt, but the symptom is often endurance and garbage collection behavior, plus occasional latency spikes when the system is under memory pressure.

Read behavior: sometimes better, sometimes worse

Reads can benefit if one copy is slow or intermittently failing; ZFS can pick another. But you can also make reads worse by creating more fragmented layouts and more metadata overhead over time.
In clean conditions, reads usually don’t get magically faster from copies.

Joke #2: copies=3 is a bit like bringing three umbrellas to avoid rain—you’ll still get wet if the roof collapses, and now you have three umbrellas to dry.

Practical patterns: where copies shines

Pattern 1: “Tiny but terrifying” dataset

A small dataset containing things like Terraform state exports, cluster bootstrap assets, or certificate authorities can be existential.
Storing it with copies=2 buys you extra margin against localized media errors without redesigning the entire pool.

Pattern 2: Metadata-heavy file trees

Millions of small files means the metadata graph is the system. Losing a directory block or an indirect block is not “one file missing”; it can be “this whole subtree is now a crime scene.”
copies=2 can reduce the probability that a single bad block turns into a recovery expedition.

Pattern 3: The “degraded is normal” edge site

If you operate remote sites where replacing a disk takes days, not hours, you spend real time in degraded mode.
Extra copies can improve your odds during that window—especially on RAIDZ—when scrubs/resilvers are more likely to hit read errors elsewhere.

Pattern 4: Special vdev protection, carefully scoped

If you use a special vdev, keep it redundant first. Then consider extra copies on datasets that place many small blocks there, but only if capacity is ample and you’ve measured txg behavior.
The goal isn’t to “make special vdev safe.” The goal is to reduce the chance that a few nasty blocks ruin your week.

Three corporate-world mini-stories (pain included)

Mini-story #1: An incident caused by a wrong assumption

A mid-sized company ran a single pool for “everything,” built on a large RAIDZ2. A team lead read about copies=2 and decided to enable it on the dataset holding VM disk images, reasoning: “RAIDZ2 plus copies=2 equals basically RAIDZ4, right?” This change sailed through because it sounded like free safety.

Two months later, a routine hypervisor patch kicked off a burst of writes: VM snapshots, package updates, and log churn. Latency jumped. The storage graphs looked like a seismograph. The first reaction was to blame the network, then the hypervisors, then “maybe the new kernel.” Nobody suspected the filesystem property because, in theory, it was “just redundancy.”

The postmortem was humbling: the dataset was a high-write workload and copies=2 doubled allocation pressure and sync work. The pool wasn’t out of space, but it was out of patience—txg sync times grew, and synchronous writes (and anything pretending to be synchronous) started queueing behind reality.

The wrong assumption wasn’t that ZFS could store multiple copies. The wrong assumption was that redundancy is free if you don’t buy disks. They backed out copies for VM images, carved out a small “critical config” dataset with copies=2, and put the rest of the redundancy discussion where it belonged: vdev design, spares, and replication.

Mini-story #2: An optimization that backfired

Another org decided to “optimize” their special vdev usage. They had NVMe devices as a special vdev for metadata and small blocks, and someone wanted maximum performance for a monorepo build cache: lots of small files, lots of metadata, very hot.
They set special_small_blocks to a large value so more data would land on the special vdev, and then they added copies=2 “to be safe.”

Performance initially looked great. Builds got faster, inode-heavy operations flew, and everyone congratulated the storage team for being “innovative.” Then the special vdev started filling faster than expected. Because it was now hosting not just metadata but a huge amount of small data, copies=2 doubled the consumption there. The pool as a whole had plenty of space; the special vdev did not.

Once the special vdev crossed uncomfortable fullness, allocations got slower, fragmentation increased, and the system developed a new hobby: latency spikes at the worst possible times (release days, of course). The “optimization” had moved a capacity constraint into the hottest path.

The resolution wasn’t to declare special vdevs “bad.” It was to treat them as a separate tier with its own capacity planning. They reduced special_small_blocks, kept the special vdev redundant, and used copies=2 only on a very small dataset that truly warranted it. The build cache got performance tuning elsewhere—because storage is not a substitute for deleting old caches.

Mini-story #3: A boring but correct practice that saved the day

A financial services team had a habit that looked painfully conservative: monthly scrub windows, alerting on checksum errors, and a policy that any dataset containing “keys, auth, bootstrap” got copies=2—but only those datasets, and only after capacity checks.
Nobody bragged about this. It wasn’t glamorous. It was paperwork with CLI commands.

During a routine scrub, ZFS reported a small number of checksum errors on one disk. The disk didn’t fail SMART thresholds; it didn’t drop out. It just started occasionally returning garbage. The scrub detected bad blocks, ZFS healed them using redundancy, and alerts fired. The team replaced the drive during business hours without drama.

A week later, they hit a second issue: a separate device had a few unreadable sectors. This time, the dataset affected was one of the “keys and bootstrap” datasets. Because it had copies=2, ZFS was able to read from an alternate copy without depending exclusively on parity reconstruction under stress.

Nothing exploded. No war room. No “restore from tape.” Just a couple of tickets and a hardware swap. The boring practice wasn’t that they used copies; it was that they used it narrowly, watched their pools like adults, and ran scrubs on schedule even when nothing was on fire. Boring is underrated in storage.

Hands-on tasks: commands, outputs, and interpretation

The commands below assume OpenZFS on Linux style tooling. Adjust pool/dataset names. The goal is not to memorize flags; it’s to build muscle memory for verifying behavior before and after you touch copies.

Task 1: Inspect current copies settings and inheritance

cr0x@server:~$ zfs get -r copies tank
NAME              PROPERTY  VALUE  SOURCE
tank              copies    1      default
tank/critical     copies    2      local
tank/critical/ca  copies    2      inherited from tank/critical
tank/vm           copies    1      inherited from tank

Interpretation: you want to see SOURCE as local only where you meant it. Accidental inheritance is how “small safety measure” becomes “why is the pool full?”

Task 2: Change copies for a single dataset

cr0x@server:~$ sudo zfs set copies=2 tank/critical

Interpretation: this affects new writes (new blocks). Existing blocks remain as they were until rewritten.

Task 3: Confirm whether existing data is rewritten (it won’t be)

cr0x@server:~$ zfs get copies tank/critical
NAME           PROPERTY  VALUE  SOURCE
tank/critical  copies    2      local

Interpretation: property is set, but that alone doesn’t retroactively duplicate existing blocks.

Task 4: Force a rewrite so blocks get new copy policy

cr0x@server:~$ sudo zfs create -o copies=2 tank/critical-rewrite
cr0x@server:~$ sudo rsync -a /tank/critical/ /tank/critical-rewrite/

Interpretation: rewriting is expensive and temporarily doubles the data you carry. Verify space headroom first, confirm the copy, then swap the datasets (rename) and destroy the old one once snapshot retention allows. Be careful: “retrofit copies” is not a free operation.

Task 5: Track referenced space and logical used space

cr0x@server:~$ zfs list -o name,used,refer,logicalused,logicalrefer,compressratio tank/critical
NAME           USED  REFER  LUSED  LREFER  RATIO
tank/critical  18G   18G    11G    11G     1.60x

Interpretation: logical* shows pre-compression logical size; USED/REFER are on-disk. With copies=2, you expect on-disk to rise relative to logical—unless compression offsets it.

Task 6: Watch pool health and error counters

cr0x@server:~$ zpool status -v tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 02:11:19 with 0 errors on Sun Dec 22 03:10:05 2025
config:

        NAME                         STATE     READ WRITE CKSUM
        tank                         ONLINE       0     0     0
          raidz2-0                   ONLINE       0     0     0
            sda                      ONLINE       0     0     0
            sdb                      ONLINE       0     0     0
            sdc                      ONLINE       0     0     0
            sdd                      ONLINE       0     0     0
            sde                      ONLINE       0     0     0
            sdf                      ONLINE       0     0     0

errors: No known data errors

Interpretation: CKSUM errors are the ones that tell you “the drive returned bad data.” That’s where redundancy and extra copies matter.

Task 7: Run a scrub (and know what you’re signing up for)

cr0x@server:~$ sudo zpool scrub tank
cr0x@server:~$ zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub in progress since Mon Dec 23 02:00:02 2025
        1.02T scanned at 1.40G/s, 210G issued at 290M/s, 6.20T total
        0B repaired, 3.39% done, 06:40:12 to go

Interpretation: scrubs are how you find latent errors while you still have redundancy. If copies is your safety net, scrubs are how you confirm the net exists.

Task 8: Measure txg sync behavior (a proxy for “writes are hurting”)

cr0x@server:~$ sudo zpool get -H -o name,property,value,source autotrim,ashift,autoreplace tank
tank	autotrim	off	default
tank	ashift	12	local
tank	autoreplace	off	default

Interpretation: not txg yet, but you’re validating fundamentals. ashift mistakes and trim settings can dominate performance, and people blame copies when the real issue is misalignment or device behavior.
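If you do want to look at txg timing itself, ZFS on Linux exposes a txg history kstat. This is a sketch assuming the zfs_txg_history module parameter is nonzero; column names and availability vary by OpenZFS version:

cr0x@server:~$ cat /sys/module/zfs/parameters/zfs_txg_history
cr0x@server:~$ tail -n 5 /proc/spl/kstat/zfs/tank/txgs

Interpretation: watch the dirty-data and timing columns (ndirty, otime/wtime/stime) while the workload runs. If sync times climb after enabling copies=2 on a busy dataset, that is write amplification showing up at the txg level.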

Task 9: Observe real-time I/O by pool and vdev

cr0x@server:~$ zpool iostat -v tank 2
                              capacity     operations     bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
tank                        5.20T  1.00T    120    980  12.3M  210M
  raidz2-0                  5.20T  1.00T    120    980  12.3M  210M
    sda                         -      -     20    170  2.1M   35.0M
    sdb                         -      -     21    165  2.1M   34.1M
    sdc                         -      -     18    162  2.0M   33.8M
    sdd                         -      -     19    161  2.1M   34.6M
    sde                         -      -     21    160  2.0M   35.2M
    sdf                         -      -     21    162  2.0M   36.1M
--------------------------  -----  -----  -----  -----  -----  -----

Interpretation: after enabling copies=2 on a write-heavy dataset, you’ll often see write ops climb and bandwidth rise for the same app-level throughput. That’s write amplification showing up in daylight.

Task 10: Verify dataset-level properties that interact with copies

cr0x@server:~$ zfs get -o name,property,value,source -s local,inherited,default recordsize,compression,sync,copies,logbias tank/critical
NAME           PROPERTY     VALUE     SOURCE
tank/critical  recordsize   128K      default
tank/critical  compression  lz4       inherited from tank
tank/critical  sync         standard  default
tank/critical  copies       2         local
tank/critical  logbias      latency   default

Interpretation: sync, recordsize, and compression often determine whether copies is tolerable. A small-recordsize sync workload with extra copies is how you create a storage team support queue.

Task 11: Check snapshot footprint before changing policy

cr0x@server:~$ zfs list -t snapshot -d 1 -o name,used,refer,creation -s creation tank/critical | tail -n 5
tank/critical@auto-2025-12-23-0100   220M   18G  Mon Dec 23 01:00 2025
tank/critical@auto-2025-12-23-0200   110M   18G  Mon Dec 23 02:00 2025
tank/critical@auto-2025-12-23-0300   190M   18G  Mon Dec 23 03:00 2025
tank/critical@auto-2025-12-23-0400   140M   18G  Mon Dec 23 04:00 2025
tank/critical@auto-2025-12-23-0500   160M   18G  Mon Dec 23 05:00 2025

Interpretation: snapshots pin old blocks. If you switch to copies=2 and then churn data, you can end up paying for both old single-copy history and new double-copy future until snapshots age out.
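A complementary view is the usedby* breakdown, which separates space pinned by snapshots from space referenced by the live dataset:

cr0x@server:~$ zfs get usedbysnapshots,usedbydataset,usedbychildren tank/critical

Interpretation: if usedbysnapshots is already large and you are about to change the copies policy and churn data, expect to carry both generations until retention ages the old blocks out.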

Task 12: Validate special vdev presence and health (if used)

cr0x@server:~$ zpool status tank
  pool: tank
 state: ONLINE
config:

        NAME                               STATE     READ WRITE CKSUM
        tank                               ONLINE       0     0     0
          raidz2-0                         ONLINE       0     0     0
            sda                            ONLINE       0     0     0
            sdb                            ONLINE       0     0     0
            sdc                            ONLINE       0     0     0
            sdd                            ONLINE       0     0     0
            sde                            ONLINE       0     0     0
            sdf                            ONLINE       0     0     0
          special
            mirror-1                       ONLINE       0     0     0
              nvme0n1p1                    ONLINE       0     0     0
              nvme1n1p1                    ONLINE       0     0     0

Interpretation: if you have a special vdev, treat it like a first-class citizen. Extra copies won’t save a non-redundant special vdev design.

Task 13: Spot pool-wide fragmentation pressure

cr0x@server:~$ zpool get fragmentation tank
NAME  PROPERTY       VALUE  SOURCE
tank  fragmentation  38%    -

Interpretation: extra copies increase allocation activity. On long-lived pools, that can increase fragmentation and harm latency. This doesn’t mean “never use copies,” it means “don’t use it blindly and forever.”

Task 14: Confirm no accidental pool-wide setting change

cr0x@server:~$ zfs get copies tank
NAME  PROPERTY  VALUE  SOURCE
tank  copies    1      default

Interpretation: if this shows local at the pool root, you’ve probably just doubled the cost of every dataset unless overridden. That’s the kind of discovery you want at 10:00, not at 02:00.

Fast diagnosis playbook

When someone says “storage is slow” and you recently touched copies, don’t start by debating philosophy. Start by narrowing the bottleneck in minutes.

Step 1: Is the pool healthy, or are we masking a failure?

  • Check zpool status -v for read/write/checksum errors, degraded vdevs, or ongoing resilver/scrub.
  • If you see checksum errors, assume you have a device or path problem first. Extra copies might be helping you limp, but they’re not the root cause.

Step 2: Is it capacity pressure or special vdev pressure?

  • Check zfs list and zpool list for free space; ZFS doesn’t like being very full.
  • If you use a special vdev, verify its allocation and that it isn’t near-full. A “fine” pool with a “full-ish” special vdev can still behave badly.

Step 3: Is it write amplification and txg contention?

  • Watch zpool iostat -v 2 during the slowdown. If write ops and bandwidth are high but app throughput is flat, you’re paying an amplification tax.
  • Check the datasets involved: zfs get copies,sync,recordsize,compression. A sync-heavy dataset with copies=2 is a prime suspect.

Step 4: Is it one bad disk (or one slow disk) dragging the vdev?

  • In mirrors, a slow side can create weird latency patterns depending on read policy and queueing.
  • In RAIDZ, one sick disk can bottleneck reconstruction and scrubs/resilvers brutally.

Step 5: Confirm you didn’t accidentally expand the blast radius

  • zfs get -r copies tank and look for unexpected inheritance.
  • Check whether a “temporary” dataset with copies=3 became the parent of everything because someone likes shortcuts.

Common mistakes: symptoms and fixes

Mistake 1: Setting copies=2 at the pool root

Symptom: pool usage climbs unexpectedly fast; write latency increases across many workloads.

Fix: reset it at the root with zfs inherit, then explicitly set it only on the intended datasets.

cr0x@server:~$ sudo zfs inherit copies tank
cr0x@server:~$ sudo zfs set copies=2 tank/critical

Mistake 2: Expecting retroactive protection without rewriting data

Symptom: you set copies=2, but scrubs still report unrecoverable errors on old blocks (or you don’t see the capacity increase you expected).

Fix: plan a rewrite/migration for the specific dataset if you truly need old blocks duplicated. Do it with verified free space and snapshot policy awareness.

Mistake 3: Using copies to “fix” a risky vdev design

Symptom: you still can’t tolerate a second disk loss in RAIDZ1; a degraded resilver is still terrifying.

Fix: redesign with appropriate parity/mirrors, add hot spares, and use replication. copies is not a substitute for fault-domain engineering.

Mistake 4: Enabling copies=2 on write-heavy VM or database datasets

Symptom: sudden latency spikes after “safety improvement,” higher txg sync times, more IO wait, and angry application owners.

Fix: revert to copies=1 for those datasets; use proper redundancy and consider SLOG/sync tuning only if you fully understand the durability requirements.

cr0x@server:~$ sudo zfs set copies=1 tank/vm

Mistake 5: Filling special vdev space by accident

Symptom: pool has free space, but metadata operations slow down; special vdev allocation is high; latency spikes occur during builds or file storms.

Fix: reduce special vdev load (adjust dataset placement policies, prune small-file datasets), and ensure special vdev capacity is sized for the long term. Avoid doubling it with copies unless you’ve budgeted space.
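If the culprit is an over-generous special_small_blocks setting, scoping it back down looks like this (the dataset name and the 16K threshold are illustrative; pick a value from your measured block-size distribution):

cr0x@server:~$ zfs get special_small_blocks tank/buildcache
cr0x@server:~$ sudo zfs set special_small_blocks=16K tank/buildcache

Like copies, this only affects new writes; blocks already allocated on the special vdev stay there until they are rewritten or freed.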

Mistake 6: Treating copies as a backup plan

Symptom: ransomware, accidental deletion, or a bad automation run destroys data across all copies instantly.

Fix: use snapshots, immutability where available, and replication/offsite backups. Extra copies are still “inside the blast radius.”
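A minimal replication sketch, assuming a reachable backup host and a pool named backuppool (both placeholders); real setups add incremental sends, retention, and monitoring:

cr0x@server:~$ sudo zfs snapshot -r tank/critical@repl-2025-12-23
cr0x@server:~$ sudo zfs send -R tank/critical@repl-2025-12-23 | ssh backup-host sudo zfs receive -u backuppool/critical

The point is independence: that copy lives in a different pool and a different administrative domain, which is exactly what copies=N cannot give you.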

Checklists / step-by-step plan

Checklist A: Before you enable copies=2

  1. Scope the dataset: confirm you can isolate the critical data into its own dataset (or a small subtree).
  2. Measure baseline: capture zpool iostat -v, zpool status, and dataset used/refer/logicalused.
  3. Confirm free space headroom: don’t do this on a pool already pushing fullness; allocator stress will hide your real results.
  4. Check snapshot retention: understand how long old single-copy blocks will remain pinned if you churn data.
  5. Confirm vdev health: if you already have checksum errors, fix hardware first; don’t paper over it with extra copies.

Checklist B: Enable and validate safely

  1. Set copies=2 on the dataset only.
  2. Write a small test payload and verify behavior via space accounting deltas (expect more on-disk growth per logical write).
  3. Monitor latency and txg symptoms during your normal peak workload.
  4. Run a scrub in your normal maintenance cycle and verify “0 errors” stays true.

Checklist C: If you need existing data duplicated

  1. Plan space: you may temporarily need extra free space to rewrite.
  2. Pick a rewrite method: a controlled copy to a new dataset (preferred), or an in-place rewrite approach (riskier).
  3. Validate with checksums: use ZFS send/receive where appropriate to force rewrite semantics while preserving snapshots, but only if it matches your operational model (see the sketch after this checklist).
  4. Age out snapshots: ensure old blocks eventually disappear, or you’ll pay for both worlds indefinitely.
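A sketch of the send/receive rewrite path, assuming enough free space and placeholder names; verify mountpoints and received properties before you swap anything:

cr0x@server:~$ sudo zfs snapshot -r tank/critical@rewrite
cr0x@server:~$ sudo zfs send -R tank/critical@rewrite | sudo zfs receive -o copies=2 tank/critical-new
cr0x@server:~$ zfs list -o name,used,logicalused tank/critical tank/critical-new

Interpretation: the received dataset’s blocks were written fresh under copies=2, and its snapshots came along for the ride. After validation you can rename the datasets to swap them, then destroy the old one once retention allows.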

FAQ

1) Does copies=2 mean I can survive two disk failures?

No. Disk failure survivability is determined by your vdev layout (mirror/RAIDZ level). copies adds additional block instances within that layout; it doesn’t create a new independent failure domain.

2) Does copies protect against silent data corruption?

ZFS checksums detect silent corruption. Redundancy repairs it. Extra copies add additional valid sources for repair when one location is bad. It improves odds for certain corruption patterns, but it’s not a replacement for proper redundancy and scrubs.

3) Is copies=2 useful on a mirrored pool?

Sometimes, for small critical datasets. But mirrors already provide multiple physical copies. On busy mirrored workloads, copies can be an expensive way to buy a small incremental reliability gain.

4) Is copies=2 useful on RAIDZ?

More often than on mirrors—when used narrowly. RAIDZ parity can reconstruct, but degraded periods and UREs are where extra independently allocated copies can improve recoverability for specific high-value data.

5) Does setting copies=2 duplicate existing data automatically?

No. It affects new writes. To apply it to existing blocks, you must rewrite/migrate the data.

6) How does copies interact with compression?

Compression happens per block. ZFS stores the compressed block multiple times. If the data compresses well, the absolute space cost of extra copies is smaller—but the multiplier remains.

7) Will copies=3 ever be the right answer?

Rarely, but yes: tiny datasets that are existential and rarely written (for example, a minimal bootstrap tree), where capacity cost is negligible and you want maximum margin against localized unreadable blocks. If you’re considering it for terabytes, pause and re-read the question.

8) Is copies better than replication?

Different tools. copies improves survivability within one pool against some block-level issues. Replication creates an independent copy in another pool/system, which helps with disasters, operator error, and site-level failures. If you can only afford one, replication usually wins.

9) Can I set copies only for metadata?

Not directly as a “metadata-only” toggle per dataset. You can, however, target datasets that are metadata-heavy, and you can design with special vdevs and the special_small_blocks threshold (relative to recordsize) to influence which blocks land where. But don’t confuse influence with guarantees.

10) What’s the biggest operational risk of enabling copies?

Accidental scope creep and performance regression. The failures I’ve seen weren’t “ZFS broke”; they were “someone doubled writes on the busiest dataset and didn’t measure.”

Conclusion

copies=2 and copies=3 are not gimmicks, and they’re not a universal upgrade. They’re targeted redundancy levers that work best when you can point to a small dataset and say, “If this gets a bad block at the wrong time, we lose a day (or a company).”

Use copies like you use fire suppression: concentrated where the risk is high, tested periodically (scrubs), and never confused with “the building can’t burn down.” If you want real fault-domain improvement, build it with vdev design and replication. If you want extra margin for a few critical blocks without rebuilding the world, copies can be smart—just don’t pay for it everywhere.
