ZFS zpool split: Cloning a Mirror Pool for Migration or DR

There are two kinds of migrations: the ones you plan, and the ones your storage forces you to perform at 2 a.m. after a “minor” firmware update. If you run mirrored ZFS pools, zpool split is one of the rare tools that can turn a migration into a controlled, mostly boring operation: you literally detach one side of your mirrors and walk away with a second pool.

This is not a marketing-grade “instant clone” fairy tale. Splitting a pool is decisive: you’re changing redundancy posture, you’re creating a second identity for the storage, and you’re making promises to future-you about importability, mountpoints, encryption keys, and the state of the last transaction group (TXG) when you pulled the pin. Done well, it’s a clean DR seed or a fast migration jump-start. Done casually, it’s a ticket to “why did production mount the DR copy and start writing to it?” land.

What zpool split really is (and is not)

zpool split takes a pool that contains mirrored vdevs and creates a new pool by separating mirror members. Conceptually: every mirror vdev is a pair (or more) of disks; splitting lets you peel off one disk from each mirror to form a new pool with the same dataset structure and metadata. It’s not copying blocks; it’s reassigning ownership of existing disks.

That’s why it’s fast. It also explains the first rule of split: you only get a “full pool” if you have enough mirrors to donate a member from each mirrored vdev. If your pool is a single mirror vdev of two disks, a split can still create a new pool—but you’re effectively turning one disk into a single-disk pool. That may be acceptable for temporary migration staging, and it may be unacceptable for anything you plan to sleep next to.
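
If you want to see exactly which disks would leave before committing, zpool split has a dry-run flag. A minimal sketch, using the pool and disk names from the tasks later in this article (the exact output wording can vary by OpenZFS version):

cr0x@server:~$ sudo zpool split -n tank tank_dr
would create 'tank_dr' with the following layout:

        tank_dr
          ata-SAMSUNG_SSD_1TB_BBB
          ata-SAMSUNG_SSD_1TB_DDD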

What it’s good at

  • Migration staging: Split, ship disks (or move them to a new chassis), import, and you’ve moved the pool identity without network transfer.
  • DR “seed” creation: Split to create an initial DR pool quickly, then use incremental zfs send to keep it updated.
  • Fast environment duplication: Clone a production dataset tree for a one-off analytics job, then destroy the split pool after.

What it is not

  • Not a backup. If your data is corrupt, a split will happily hand you two copies of the corruption.
  • Not a snapshot mechanism. The split pool reflects the state of the pool at the TXG boundary when the split occurred, not a dataset snapshot you can browse.
  • Not a substitute for replication strategy. It’s a tool in the toolbox, not the toolbox.

Joke #1: Splitting a mirror pool is like photocopying your house key by sawing it in half—you will indeed have two keys, but your lock is about to have opinions.

Facts & historical context (the “why” behind the knob)

ZFS has a long memory, and so do the engineers who built operational workflows around it. Here are some facts and context points that explain why zpool split exists and how it ended up in real runbooks:

  1. ZFS was built around pooled storage, not per-filesystem devices. That’s why operations like split act on pool topology, not on “a filesystem” the way older tools did.
  2. Mirrors were always the “operationally friendly” vdev. In production, mirrors are popular because they heal quickly, degrade predictably, and tolerate mixed drive replacement better than parity vdevs.
  3. The on-disk format is portable across systems that share feature flags. Importing a pool on another host is normal, but only if feature flags align and the OS supports them.
  4. Split is a topology transformation, not a data move. That’s why it’s instant compared to zfs send, and also why it changes your redundancy immediately.
  5. Historically, “sneakernet” migrations were common. Before cheap 10/25/40/100GbE was everywhere (and before you trusted it at 95% utilization for 12 hours), moving disks was a legitimate migration path.
  6. ZFS’s checksumming made “move the disks” safer than older filesystems. When you import on the far side, you have end-to-end verification available via scrub and per-block checksums.
  7. Pool GUIDs and device paths are intentionally abstracted. ZFS tracks vdev GUIDs; the OS device names can change and the pool can still import—if you used stable by-id paths in the first place.
  8. Feature flags turned “it imports everywhere” into “it imports where it should.” Enabling newer features can block import on older platforms; split doesn’t change that, but it makes you confront it.
  9. Boot environments and root-on-ZFS made splitting more interesting. Splitting a pool that contains the OS is possible, but you’re now in the business of bootloader compatibility and host IDs.

When to use split vs send/receive vs replication

If you have a mirrored pool and physical access to the disks, split is the fastest way to create a second pool with the same dataset tree. But “fast” is not the same as “best.” Here’s how I decide.

Use zpool split when

  • You have mirrored vdevs and can spare one side temporarily.
  • You need a quick DR seed and your network replication window is unacceptable.
  • You want to migrate a large pool to new hardware without saturating links for days.
  • You can tolerate a temporary reduction in redundancy on the source pool (because after the split, the source has fewer mirror members).

Prefer zfs send | zfs receive when

  • You need point-in-time semantics: snapshots, incrementals, rollback options.
  • You can’t reduce redundancy on production even briefly.
  • Your vdevs aren’t mirrors (RAIDZ cannot be split like this).
  • You need to transform properties as part of migration (recordsize, encryption, dataset layout), and want deliberate control.

Prefer true replication when

  • You need continuous, audited DR with RPO and RTO targets.
  • You need to keep historical points (snap retention policies).
  • You have multiple downstream targets, or need to replicate across trust boundaries.

Joke #2: The fastest data transfer is still a human carrying disks—until someone puts them in a backpack with a magnet, and now you’ve invented performance art.

Preflight: what to confirm before you split

Splitting a pool is easy. Splitting a pool and being confident about what happens next is where adults earn their coffee.

Confirm topology: mirrors only

You need mirrored vdevs. If your pool includes RAIDZ vdevs, zpool split won’t do what you want. Mixed topology pools are common in “we grew it over time” environments—confirm what you actually have.
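
A quick scriptable gate—just a sketch, assuming your pool is named tank—is to grep the status output for parity vdev types:

cr0x@server:~$ sudo zpool status tank | grep -E 'raidz|draid' || echo "no raidz/draid vdevs found"
no raidz/draid vdevs found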

Confirm feature flags and target OS support

If you split a pool on a host with newer ZFS features enabled and then try to import on an older appliance, you may get an import refusal. This is the #1 “it worked in the lab” failure mode I’ve seen in enterprise migrations.
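
On the destination, you can list which features that host's ZFS stack supports and compare against the source pool's enabled features (Task 3 below). A sketch, assuming an OpenZFS destination; output trimmed:

cr0x@drhost:~$ sudo zpool upgrade -v | head -12
This system supports ZFS pool feature flags.

The following features are supported:

FEAT DESCRIPTION
-------------------------------------------------------------
async_destroy                         (read-only compatible)
     Destroy filesystems asynchronously.
empty_bpobj                           (read-only compatible)
     Snapshots use less space.
lz4_compress
     LZ4 compression algorithm support.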

Confirm encryption key handling

Native ZFS encryption (dataset-level) moves with the on-disk data. The split pool will contain encrypted datasets; importing is not the same as mounting. Make sure the receiving host has access to keys and you know whether keys are loaded automatically or manually.
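
Before the split, dump the key-related properties for every encrypted dataset so there are no surprises on the far side. A sketch against this article's example datasets (keyformat and keylocation will be whatever you chose at creation time):

cr0x@server:~$ sudo zfs get -o name,property,value keyformat,keylocation,keystatus tank/apps tank/vm
NAME       PROPERTY     VALUE
tank/apps  keyformat    raw
tank/apps  keylocation  file:///root/keys/apps.key
tank/apps  keystatus    available
tank/vm    keyformat    passphrase
tank/vm    keylocation  prompt
tank/vm    keystatus    available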

Confirm you have stable device IDs

On Linux, /dev/sdX is a suggestion, not a contract. Use /dev/disk/by-id for sanity. On the destination, devices will enumerate differently; ZFS can usually find them, but your troubleshooting time explodes if you can’t map “this serial number” to “this vdev member.”
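
To build the serial-to-device map before you pull drives, lsblk can print serials directly. A sketch (the serial strings are placeholders; trust the drive label and /dev/disk/by-id as ground truth):

cr0x@server:~$ sudo lsblk -d -o NAME,MODEL,SERIAL,SIZE,TYPE
NAME MODEL            SERIAL   SIZE TYPE
sda  SAMSUNG SSD 1TB  AAA    931.5G disk
sdb  SAMSUNG SSD 1TB  BBB    931.5G disk
sdc  SAMSUNG SSD 1TB  CCC    931.5G disk
sdd  SAMSUNG SSD 1TB  DDD    931.5G disk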

Confirm mountpoint strategy

After split and import, datasets may auto-mount. That’s fine until they mount on the wrong host with the wrong paths, and suddenly you have two servers writing to different copies thinking they’re authoritative. Plan your mountpoint and canmount posture before you import.

Practical tasks: commands and how to read the output

The following tasks are written as if you’re on a Linux host with OpenZFS. Adjust service names and paths for your platform. Each task is something I’ve either run in anger, or wished I had run before the pager introduced itself.

Task 1: Inventory the pool and confirm mirrors

cr0x@server:~$ sudo zpool status -v tank
  pool: tank
 state: ONLINE
config:

        NAME                         STATE     READ WRITE CKSUM
        tank                         ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_AAA  ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_BBB  ONLINE       0     0     0
          mirror-1                   ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_CCC  ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_DDD  ONLINE       0     0     0

errors: No known data errors

Interpretation: You want every top-level vdev to be mirror-N. If any top-level vdev is raidz or a single disk, the split will be refused outright—zpool split requires every data vdev to be a mirror. Also check for existing errors; splitting a degraded mirror creates a new pool that may be “born sick.”

Task 2: Check current I/O pressure before you do anything disruptive

cr0x@server:~$ iostat -x 2 3
Linux 6.8.0 (server)  12/25/2025  _x86_64_ (32 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.12    0.00    1.88    0.74    0.00   91.26

Device            r/s     w/s   rkB/s   wkB/s  await  svctm  %util
nvme0n1          5.0    40.0   640.0  5120.0   2.10   0.15   0.70

Interpretation: A split is quick, but the operational risk comes from what you do around it (scrubs, exports, imports). If the pool is already saturated (%util near 100% and high await), schedule a quieter window.

Task 3: Confirm feature flags and compatibility posture

cr0x@server:~$ sudo zpool get all tank | egrep 'feature@|compatibility|version'
tank  compatibility  off        default
tank  feature@async_destroy  enabled  local
tank  feature@bookmarks      enabled  local
tank  feature@encryption     enabled  local
tank  feature@edonr          enabled  local

Interpretation: The split pool will carry these features. If the destination host doesn’t support them, import may fail. Treat “feature enabled” as a hard requirement for the destination stack.

Task 4: Capture dataset list and critical properties (mountpoints, canmount)

cr0x@server:~$ sudo zfs list -r -o name,used,avail,mountpoint,canmount,encryption,keylocation tank
NAME             USED  AVAIL  MOUNTPOINT     CANMOUNT  ENCRYPTION  KEYLOCATION
tank             220G   700G  /tank          on        off         -
tank/apps         80G   700G  /tank/apps     on        aes-256-gcm file:///root/keys/apps.key
tank/home         40G   700G  /tank/home     on        off         -
tank/vm          100G   700G  /tank/vm       on        aes-256-gcm prompt

Interpretation: This is your “what will mount where” map. On a DR host, you may want canmount=off until you’re ready. Also note keylocation and whether keys are file-based or prompting.

Task 5: Create a “safety snapshot” set (optional but usually smart)

cr0x@server:~$ sudo zfs snapshot -r tank@pre-split-2025-12-25
cr0x@server:~$ sudo zfs list -t snapshot -r tank | tail -5
tank/apps@pre-split-2025-12-25  0B  -
tank/home@pre-split-2025-12-25  0B  -
tank/vm@pre-split-2025-12-25    0B  -

Interpretation: A split doesn’t require snapshots, but snapshots give you rollback and send/receive options if the “move disks” plan goes sideways. It’s cheap insurance.

Task 6: Quiesce risky writers (databases, VM images) and verify

cr0x@server:~$ sudo systemctl stop postgresql
cr0x@server:~$ sudo systemctl stop libvirtd
cr0x@server:~$ sudo lsof +D /tank/vm | head
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME

Interpretation: For a migration/DR seed you want crash-consistent at minimum; for databases you want application-consistent if you can. If lsof still shows busy files under critical datasets, you’re not quiesced.

Task 7: Run a quick health check and consider a scrub schedule

cr0x@server:~$ sudo zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 00:21:33 with 0 errors on Sun Dec 22 02:14:01 2025
config:
...
errors: No known data errors

Interpretation: If the last scrub is ancient or errors exist, address that first. Splitting a pool is not the moment to discover one mirror member has been quietly throwing checksum errors.

Task 8: Perform the split to create a new pool name

cr0x@server:~$ sudo zpool split tank tank_dr
cr0x@server:~$ sudo zpool import -N tank_dr
cr0x@server:~$ sudo zpool status -v tank
  pool: tank
 state: ONLINE
config:

        NAME                         STATE     READ WRITE CKSUM
        tank                         ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_AAA  ONLINE       0     0     0
          mirror-1                   ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_CCC  ONLINE       0     0     0

errors: No known data errors

cr0x@server:~$ sudo zpool status -v tank_dr
  pool: tank_dr
 state: ONLINE
config:

        NAME                         STATE     READ WRITE CKSUM
        tank_dr                      ONLINE       0     0     0
          ata-SAMSUNG_SSD_1TB_BBB    ONLINE       0     0     0
          ata-SAMSUNG_SSD_1TB_DDD    ONLINE       0     0     0

errors: No known data errors

Interpretation: By default, the last device in each mirror goes to the new pool; you can name specific devices on the command line if you care which side leaves. The split pool is left exported unless you pass -R, which is why the example imports it with -N (no mounts) before inspecting it—a plain import would try to mount its datasets right on top of the live ones. Note that tank_dr now consists of single-disk top-level vdevs (not mirrors), because each mirror contributed one disk. This is the part people forget: you trade redundancy for portability. If you want the split pool to be mirrored too, you need additional disks to attach later on the destination.

Task 9: Immediately adjust auto-mount posture on the split pool (recommended)

cr0x@server:~$ sudo zfs set canmount=off tank_dr
cr0x@server:~$ sudo zfs set mountpoint=/tank_dr tank_dr
cr0x@server:~$ sudo zfs list -o name,mountpoint,canmount tank_dr
NAME     MOUNTPOINT  CANMOUNT
tank_dr  /tank_dr    off

Interpretation: This keeps the top dataset from mounting on an accidental import, but note that canmount is not inherited—child datasets are still canmount=on and will mount on a normal import. That’s why the destination steps below import read-only (or with an altroot) first. If you want the whole tree held back, set canmount on each dataset, as in the sketch below. You can re-enable mounts per dataset later.
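
One way to sweep the whole tree—a sketch, assuming the same tank_dr pool. canmount=noauto keeps datasets mountable by an explicit zfs mount, which is often friendlier on a DR host than off; the rest of this article keeps the simpler posture above (off on the top dataset only):

cr0x@server:~$ for ds in $(sudo zfs list -H -o name -r tank_dr); do sudo zfs set canmount=noauto "$ds"; done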

Task 10: Export the split pool for physical move

cr0x@server:~$ sudo zpool export tank_dr
cr0x@server:~$ sudo zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank  1.81T   220G  1.60T        -         -     3%    11%  1.00x  ONLINE  -

Interpretation: Export cleanly so the destination import doesn’t look like a “stolen pool” from a still-live host. This reduces import friction and prevents accidental multi-import conflicts.

Task 11: On the destination, locate disks by stable ID and import read-only first

cr0x@drhost:~$ sudo ls -l /dev/disk/by-id/ | egrep 'SAMSUNG_SSD_1TB_(BBB|DDD)'
lrwxrwxrwx 1 root root  9 Dec 25 09:02 ata-SAMSUNG_SSD_1TB_BBB -> ../../sdb
lrwxrwxrwx 1 root root  9 Dec 25 09:02 ata-SAMSUNG_SSD_1TB_DDD -> ../../sdc

cr0x@drhost:~$ sudo zpool import
   pool: tank_dr
     id: 15277416958755799222
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        tank_dr                    ONLINE
          ata-SAMSUNG_SSD_1TB_BBB  ONLINE
          ata-SAMSUNG_SSD_1TB_DDD  ONLINE

cr0x@drhost:~$ sudo zpool import -o readonly=on -o cachefile=none tank_dr
cr0x@drhost:~$ sudo zpool get readonly tank_dr
NAME     PROPERTY  VALUE     SOURCE
tank_dr  readonly  on        local

Interpretation: Read-only import is a fantastic first landing. It lets you inspect datasets, validate structure, and confirm you didn’t bring the wrong disks—without changing on-disk state.

Task 12: Load encryption keys and mount deliberately

cr0x@drhost:~$ sudo zfs load-key tank_dr/vm
Enter passphrase for 'tank_dr/vm':
cr0x@drhost:~$ sudo zfs load-key tank_dr/apps
cr0x@drhost:~$ sudo zfs get -o name,property,value canmount,mountpoint tank_dr tank_dr/apps
NAME          PROPERTY    VALUE
tank_dr       canmount    off
tank_dr       mountpoint  /tank_dr
tank_dr/apps  canmount    on
tank_dr/apps  mountpoint  /tank/apps
cr0x@drhost:~$ sudo zfs mount tank_dr/apps

Interpretation: tank_dr/vm uses keylocation=prompt, so load-key asks for the passphrase; tank_dr/apps reads its key from file:///root/keys/apps.key, so that file has to be on this host (moved securely, out of band) before the key will load. Importing is not mounting: with canmount=off on the top dataset, zfs mount -a skips it, but children still default to canmount=on. Mount the specific child datasets you need, after keys are loaded, and after you’ve confirmed their mountpoints won’t collide with local paths. While the pool is imported read-only, these mounts are read-only too—fine for validation.

Task 13: Convert the split pool back into mirrors on the destination (attach new disks)

cr0x@drhost:~$ sudo zpool status tank_dr
  pool: tank_dr
 state: ONLINE
config:

        NAME                         STATE     READ WRITE CKSUM
        tank_dr                      ONLINE       0     0     0
          ata-SAMSUNG_SSD_1TB_BBB    ONLINE       0     0     0
          ata-SAMSUNG_SSD_1TB_DDD    ONLINE       0     0     0

errors: No known data errors

cr0x@drhost:~$ sudo zpool attach tank_dr ata-SAMSUNG_SSD_1TB_BBB /dev/disk/by-id/ata-SAMSUNG_SSD_1TB_EEE
cr0x@drhost:~$ sudo zpool attach tank_dr ata-SAMSUNG_SSD_1TB_DDD /dev/disk/by-id/ata-SAMSUNG_SSD_1TB_FFF
cr0x@drhost:~$ sudo zpool status tank_dr
  pool: tank_dr
 state: ONLINE
  scan: resilver in progress since Thu Dec 25 09:24:11 2025
        52.1G scanned at 1.20G/s, 12.4G issued at 290M/s, 220G total
        12.4G resilvered, 5.64% done, 00:12:31 to go
config:

        NAME                              STATE     READ WRITE CKSUM
        tank_dr                           ONLINE       0     0     0
          mirror-0                        ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_BBB       ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_EEE       ONLINE       0     0     0
          mirror-1                        ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_DDD       ONLINE       0     0     0
            ata-SAMSUNG_SSD_1TB_FFF       ONLINE       0     0     0

Interpretation: zpool attach modifies the pool, so this only works on a writable import; if you’re still on the read-only validation import from Task 11, do the export and read-write re-import shown in Task 14 first. This is how you restore redundancy after shipping a single-disk-per-vdev split pool. Resilver rates depend on ashift, recordsize, drive behavior, and load. Plan time accordingly.

Task 14: Make the pool writable and set cachefile for persistence

cr0x@drhost:~$ sudo zpool export tank_dr
cr0x@drhost:~$ sudo zpool import -o readonly=off tank_dr
cr0x@drhost:~$ sudo zpool set cachefile=/etc/zfs/zpool.cache tank_dr
cr0x@drhost:~$ sudo zpool get cachefile tank_dr
NAME     PROPERTY  VALUE                 SOURCE
tank_dr  cachefile /etc/zfs/zpool.cache  local

Interpretation: Read-only import was for validation; now you transition to normal ops. Setting cachefile ensures the pool is remembered across reboots (platform dependent).

Task 15: Validate data integrity with a scrub after import

cr0x@drhost:~$ sudo zpool scrub tank_dr
cr0x@drhost:~$ sudo zpool status tank_dr
  pool: tank_dr
 state: ONLINE
  scan: scrub in progress since Thu Dec 25 10:02:18 2025
        88.3G scanned at 1.10G/s, 88.3G issued at 1.10G/s, 220G total
        0B repaired, 40.15% done, 00:01:52 to go
config:
...
errors: No known data errors

Interpretation: A scrub after relocation is the grown-up move. It catches transport issues, bad cables, marginal HBAs, and “that one disk that only fails when it’s cold.”

Checklists / step-by-step plan

Checklist A: DR seed via split (minimal downtime, minimal surprises)

  1. Confirm pool is mirrors only: zpool status.
  2. Confirm last scrub clean and recent enough for your risk appetite.
  3. Record dataset properties: mountpoints, canmount, encryption status.
  4. Create recursive snapshots as a fallback and for future incrementals.
  5. Quiesce high-churn applications if you need application-consistency.
  6. Run zpool split to create tank_dr; the new pool is left exported, so import it with -N (no mounts) before adjusting its properties.
  7. Set canmount=off and/or adjust mountpoints on tank_dr to avoid accidental mounts.
  8. Export tank_dr.
  9. Move disks, import read-only on destination, confirm datasets and properties.
  10. Load encryption keys and mount selectively.
  11. Re-import read-write, attach new disks to restore the mirrors, and let the resilver finish.
  12. Scrub, then transition to replication (incremental sends) going forward.

Checklist B: Migration to a new host (keeping service names and paths sane)

  1. Decide authoritative cutover moment: when does the destination become writable?
  2. Before split: set canmount=off on critical datasets if you want manual mounting after import.
  3. Split and export the new pool.
  4. On new host: import with -o altroot=/mnt for inspection if you want zero risk of mounting into production paths (see the example after this checklist).
  5. Validate: zfs list, zpool status, check encryption keys can load.
  6. When ready: export/import normally, set correct mountpoints, enable mounts.
  7. Start services and validate application-level health.
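
A minimal sketch of that inspection-only import, assuming the tank_dr pool from earlier; with an altroot set, every mountpoint is prefixed, so nothing can land on a real production path:

cr0x@drhost:~$ sudo zpool import -o altroot=/mnt tank_dr
cr0x@drhost:~$ sudo zfs list -r -o name,mountpoint tank_dr
NAME          MOUNTPOINT
tank_dr       /mnt/tank_dr
tank_dr/apps  /mnt/tank/apps
tank_dr/home  /mnt/tank/home
tank_dr/vm    /mnt/tank/vm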

Three corporate-world mini-stories (failures and the one that saved the day)

1) Incident caused by a wrong assumption: “The split pool is a backup”

At a mid-sized company with a heavily mirrored ZFS backend, a team created a DR seed by splitting a production pool and shipping the drives to a second site. They did the mechanics correctly: split, export, import, mount. They felt good. Too good.

Two months later, a developer pushed a schema migration that dropped a set of tables in a way that the app didn’t immediately notice. The damage was silent: writes kept flowing, and monitoring watched request rates, not data correctness. When the issue was discovered, the team went to “the DR copy.” It had the same missing data. Of course it did: they hadn’t been doing snapshot-based replication with retention; they’d just created a second live copy and periodically re-split as a “refresh.”

The wrong assumption was subtle: “We have another pool, therefore we have recovery.” But recovery requires time as a dimension—history, points-in-time, retention—and the split pool had none of that. It was a clone, not a backup.

The operational fix was straightforward but painful: they implemented disciplined snapshots (with naming and retention), then incremental replication. They still used split to seed new DR storage quickly, but only as the first step in an actual DR pipeline.

The cultural fix was the bigger win: they stopped calling the split pool “backup” in tickets and dashboards. Words matter. If you name it “backup,” someone will eventually bet the business on it.

2) Optimization that backfired: “Let’s split at peak time; it’s instant”

A different org had a mirrored pool backing a virtualization cluster. Someone proposed a clever plan: do the split during business hours because the split itself is instantaneous, then export and move drives after the evening change window opens. It sounded efficient. The first half was true. The second half was where reality showed up with receipts.

Splitting the pool reduced the number of mirror members in production immediately. The pool stayed ONLINE, but the performance profile changed. A few workloads that had been enjoying “accidental” read parallelism from mirrors (and some cache warmth) found themselves competing more aggressively for IOPS. Latency rose, not catastrophically, but enough to trigger timeouts in a chatty internal service. The service flapped, retries spiked, and the load amplifier turned a small latency increase into a minor incident.

They learned two practical lessons. First: in real systems, “instant” operations can have second-order effects. Removing mirror members can change scheduling and queue behavior in ways your happy-path benchmark never exercised. Second: migrations are not just storage operations; they’re performance events. If your service is tuned right to the edge, topology changes are felt.

The remediation was simple: do the split in a quiet window, and if you need daytime safety, do a snapshot-based replication seed instead. They also added a pre-split performance gate: if 95th percentile latency is already elevated, no topology changes happen.

3) Boring but correct practice that saved the day: stable device IDs and disciplined exports

One of the most successful split-based migrations I’ve seen was almost aggressively uninteresting. The team had a rule: pools are built using stable /dev/disk/by-id paths only, every time, no exceptions. They also had a habit of exporting pools before any physical move—even if “we’re just rebooting into new firmware.”

During a data center migration, they split a mirrored pool into a migration pool, exported it, and moved the drives to new hosts. On the destination, device enumeration was different, HBAs were different, and a couple of disks ended up on a different backplane than planned. None of that mattered. zpool import found the pool by labels, and the by-id naming made it obvious which serial numbers were missing when one cable wasn’t seated.

The boring practice mattered again when someone tried to import the pool on a “staging” host to inspect data while the destination host was being racked. Because the pool had been exported cleanly, there was no “pool was previously in use” confusion, no forced import, no risk that two systems thought they owned it at different times.

Nothing dramatic happened, which is exactly the point. In production, “nothing dramatic happened” is a feature you earn through habits that look pedantic until they aren’t.

Fast diagnosis playbook (what to check first, second, third)

When a split-based migration goes slow or weird, you don’t have time to become a philosopher. Here’s the quick triage order I use to find the bottleneck and decide whether to stop, proceed, or roll back.

First: Is this a ZFS health problem or an OS/hardware problem?

cr0x@server:~$ sudo zpool status -x
all pools are healthy

cr0x@server:~$ dmesg | tail -30
[...]
[ 8921.221] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 8921.223] blk_update_request: I/O error, dev sdc, sector 12345678 op 0x0:(READ)

Decision: If dmesg shows I/O errors/timeouts, stop and stabilize hardware before blaming ZFS. ZFS can survive a lot, but it can’t negotiate with a SATA link that’s doing interpretive dance.

Second: Is a resilver/scrub dominating I/O?

cr0x@drhost:~$ sudo zpool status tank_dr
  pool: tank_dr
 state: ONLINE
  scan: resilver in progress since Thu Dec 25 09:24:11 2025
        180G scanned, 95G issued, 220G total
        95G resilvered, 43.18% done, 00:08:41 to go

Decision: If resilver is active, your performance is often “working as designed.” Either wait, or throttle expectations. If the business needs performance now, consider deferring some workloads until resilver completes.

Third: Are you CPU-bound, ARC-bound, or disk-bound?

cr0x@drhost:~$ sudo arcstat 2 5
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
09:40:01   812   121     14     0    0    31    4    90   11   42G   64G
09:40:03   905   188     20     0    0    45    5   143   15   42G   64G

Decision: High miss rates during migration testing can look like “storage is slow” when it’s just cold cache. Warm up workloads or test with realistic cache conditions. If CPU is high and you’re using encryption/compression, verify whether you’re CPU-limited.

Fourth: Are mounts and properties doing something surprising?

cr0x@drhost:~$ sudo zfs get -r -o name,property,value canmount,mountpoint tank_dr | head -20
NAME         PROPERTY    VALUE
tank_dr      canmount    off
tank_dr      mountpoint  /tank_dr
tank_dr/apps canmount    on
tank_dr/apps mountpoint  /tank/apps

Decision: If a child dataset still points to a production mountpoint (like /tank/apps), you may mount over existing directories on the DR host. That’s how you end up “testing DR” by overwriting something you liked.

Common mistakes (with symptoms and fixes)

Mistake 1: Splitting without realizing you’re losing redundancy

Symptom: After split, the new pool shows single-disk top-level vdevs; the old pool now has fewer members; performance changes; risk posture changes.

Fix: Plan to zpool attach new disks on the destination to restore mirrors, or accept the risk explicitly for a short-lived migration pool. Document it in the change record; future-you will forget.

Mistake 2: Importing and auto-mounting into production paths

Symptom: After import, datasets mount under paths that collide with existing directories; services start reading/writing to the wrong copy.

Fix: Import with -o altroot=/mnt for inspection, or set canmount=off before export. Only enable mounting when you’re ready.

Mistake 3: Forgetting encryption key workflows on the destination

Symptom: Pool imports, but datasets won’t mount; applications see empty directories; zfs mount fails with key-related errors.

Fix: Confirm zfs get encryption,keylocation,keystatus. Move key material securely, test zfs load-key, and don’t assume “import” equals “usable.”

Mistake 4: Feature flag mismatch between source and destination

Symptom: zpool import refuses with “unsupported feature(s)” or similar.

Fix: Upgrade destination ZFS stack to support the pool’s feature flags. If you must support older systems, you needed to plan that before enabling features on the source—ZFS does not do “downgrade.”

Mistake 5: Using unstable device names and losing track of disks

Symptom: Post-move import is confusing; you can’t map /dev/sdX to real drives; accidental wrong-disk operations become likely.

Fix: Build and operate using /dev/disk/by-id. When troubleshooting, use serial numbers as your ground truth.

Mistake 6: Splitting a pool with existing errors or a degraded mirror

Symptom: The split pool imports but scrubs show checksum errors; resilvers take forever; one disk starts throwing I/O errors.

Fix: Stabilize first: replace failing disks, clear errors mentioned in zpool status -v, run a scrub, then split. If you must split under duress, import read-only on the destination and scrub immediately to assess damage.

Mistake 7: Assuming split gives you a clean “point in time” across apps

Symptom: Databases recover but with missing or inconsistent recent transactions; VM filesystems show journal replays.

Fix: Quiesce apps or use snapshots coordinated with application hooks. Split captures the pool state, not application coherence.

FAQ

1) Does zpool split copy data?

No. It reassigns mirror members to form a new pool. It’s fast because it’s not moving blocks; it’s changing ownership of existing disks.

2) Can I split a RAIDZ pool?

No. zpool split requires every top-level vdev to be a mirror. RAIDZ members don’t each hold a complete copy of the data, so no single member can be peeled off as a coherent pool of its own.

3) Will the split pool have the same datasets and snapshots?

Yes: datasets, properties, and snapshots present at split time come along because the on-disk data is the same. But remember: it’s not a backup. If the source had logical corruption, you now have two copies of it.

4) What happens to pool GUIDs and names?

The new pool gets its own identity (new pool name, distinct pool GUID). This is good: it reduces accidental “same pool imported twice” confusion, but you still must be careful with mountpoints and host automation.

5) Can I import the split pool on another host while the original is still running?

Yes, that’s often the point. But you must export the split pool cleanly, and you should treat it as a separate system: avoid mounting into shared paths, and ensure no application writes to both copies unless that’s explicitly designed (it usually isn’t).

6) How do I keep the DR pool updated after the initial split seed?

Usually with snapshot-based incrementals: take snapshots on the source and zfs send/zfs receive to the DR pool. Split is great for the first copy; replication is how you keep it current.
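
A minimal sketch of that hand-off, assuming the recursive pre-split snapshot from Task 5 still exists on both pools (the split carried it across), that drhost allows non-interactive sudo over SSH, that keys are loaded on drhost for the encrypted datasets (or that you switch to raw -w sends), and that the snapshot name repl-2025-12-26 is purely illustrative:

cr0x@server:~$ sudo zfs snapshot -r tank@repl-2025-12-26
cr0x@server:~$ sudo zfs send -R -I tank@pre-split-2025-12-25 tank@repl-2025-12-26 | ssh drhost "sudo zfs receive -F tank_dr"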

7) What about native encryption—does split “break” it?

No. Encryption metadata is part of the datasets. But operationally, you need keys on the destination to mount encrypted datasets. Plan key distribution and test key load workflows before declaring victory.

8) Is it safe to run zpool split while the pool is online and in use?

It can be done online, but “safe” depends on your workload and risk tolerance. The split itself is quick, but removing mirror members changes redundancy and may change latency under load. ZFS will also refuse to split a pool that is in the middle of a resilver. For databases and VM images, I prefer quiescing or doing it in a low-traffic window.

9) Can I “undo” a split?

Not as a single magic command. You can attach disks back and re-mirror, or destroy one pool and reattach its disks, but you must treat it as a topology change with consequences. If you need reversibility, consider snapshot replication instead.

10) Why import read-only first?

Because it’s the safest way to confirm you brought the right disks and the pool is healthy without advancing on-disk state. It’s a cheap sanity check before you let services near it.

Conclusion

zpool split is one of those ZFS features that feels like cheating the first time you use it: instant pool clone, no network copy, no waiting. In production, it’s powerful precisely because it’s blunt. You’re not creating a “copy” in the backup sense—you’re creating a second pool by sacrificing mirror members, and you’re taking responsibility for redundancy, mount behavior, feature compatibility, and keys.

If you remember three things, remember these: verify you have mirrors, import read-only first, and control mounts like your weekend depends on it (because it does). Split is a great way to start a migration or DR plan. The rest of the plan—the boring snapshot cadence, the replication discipline, the scrubs and audits—is what turns that quick win into a system you can trust.
