When ZFS says it found corruption, it’s being painfully specific—if you ask the right questions. The trick is that the first thing you’ll see is usually not “/home/alice/report.xlsx is bad,” but a pool state, a vdev, and some checksum counters that look like they were designed by someone who hates sleep.
This piece is about turning zpool status -v from a scary red light into an actionable list: what’s broken, where it lives, whether ZFS repaired it, and how to pull the exact affected file(s) so you can restore or re-copy them with confidence. We’ll do it the way it works in production: fast triage first, deep forensics second, and prevention always.
What zpool status -v really means
At a high level, zpool status answers: “Is the pool healthy?” and “Which device is implicated?” Adding -v is the difference between “something is wrong” and “these specific files are known-bad right now.”
But here’s the part people miss: zpool status -v does not promise a neat filename every time. It prints paths only for permanent errors that ZFS can associate with data files and metadata. If ZFS repaired the data during a scrub (using redundancy), the file may never show up because it was never permanently damaged. If the error is in metadata that can’t be mapped to a file path (or can’t be decoded without deeper inspection), you’ll see object numbers, or worse, only “Permanent errors have been detected” with no helpful list.
Think of ZFS errors in three buckets:
- Correctable: A read returned bad data, but ZFS had another copy (mirror/RAIDZ) and healed it.
- Uncorrectable but detectable: ZFS knows the checksum doesn’t match, but it has no good copy. This is when you start seeing “permanent errors.”
- Operational: timeouts, I/O failures, disconnects. These may or may not imply data corruption, but they often lead to it if they cause partial writes on non-atomic devices or flaky paths.
Two jokes, as promised, because you’ll need them:
Joke #1: ZFS doesn’t “lose” data—it just keeps it in a location you can’t prove exists anymore.
Joke #2: The only thing more persistent than ZFS checksums is the VP asking if it’s “just a reboot.”
Reading the important parts of status output
Here’s a typical starting point:
cr0x@server:~$ sudo zpool status -v tank
pool: tank
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
scan: scrub repaired 0B in 02:14:33 with 0 errors on Tue Dec 24 03:10:11 2025
config:
NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
ata-WDC_WD80...-part1 ONLINE 0 0 0
ata-WDC_WD80...-part1 ONLINE 0 0 2
ata-WDC_WD80...-part1 ONLINE 0 0 0
errors: Permanent errors have been detected in the following files:
tank/media/exports/2024-archive.tar
tank/vmstore/vm-112-disk-0.qcow2
What matters:
- state: “DEGRADED” can mean a disk is missing or errors exist. “ONLINE” doesn’t mean “fine,” it means “present.”
- READ/WRITE/CKSUM columns: read/write are I/O failures; cksum is “data returned but wrong.” The last is often where silent corruption shows up.
- scan: scrub/resilver outcomes. “0 errors” in the scrub line doesn’t always mean “no harm ever happened,” it means “the scrub didn’t find uncorrected issues at that time.”
- errors: the “Permanent errors…” list is gold. It’s also incomplete when metadata is involved.
Facts and historical context that actually help
These aren’t trivia for trivia’s sake—each one influences what you do next.
- ZFS was designed to distrust storage. Checksums are end-to-end: data is verified after it comes off disk, not just when it’s written.
- “CKSUM errors” often implicate the path, not just the disk: SATA cables, HBAs, backplanes, expander firmware, and power can all flip bits or drop commands.
- Scrubs are proactive integrity audits. They read every block and verify checksums; resilvers only rebuild what’s needed for redundancy after a device event.
- ZFS keeps multiple layers of metadata. Some corruption affects a file; some affects metadata that maps blocks to files. The latter can be harder to print as a path.
- Block pointers include checksums and physical location, which is how ZFS detects wrong data even when the disk happily returns “success.”
- Copies can exist without mirrors: ZFS supports multiple copies of metadata (and optionally data) via properties like copies=2. This changes repair behavior (see the short example after this list).
- "ashift" is forever (for that vdev). A wrong sector size alignment can increase write amplification and stress devices—sometimes it becomes the slow-motion root cause of later corruption.
- Compression changes the blast radius. Compressed blocks mean logical file offsets don’t map 1:1 to physical blocks, so your “which part is corrupt?” answer may be “a record,” not “a sector.”
- ZFS can heal on read. On redundant pools, reading a corrupt block can trigger repair even before a scrub runs.
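If you're curious whether a dataset already carries extra copies, checking (and, where it makes sense, setting) the property is quick. A minimal sketch; tank/important is a hypothetical dataset name, and copies=2 only applies to blocks written after the change:
cr0x@server:~$ zfs get copies,checksum,compression tank/important
cr0x@server:~$ sudo zfs set copies=2 tank/important
It roughly doubles space consumption for new data on that dataset, so treat it as a targeted tool, not a default.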
Fast diagnosis playbook
This is the order I use when I’m on-call and the pool just turned yellow—or the app team is yelling because a VM image won’t boot.
1) Confirm the scope: pool-wide or single dataset/file?
cr0x@server:~$ sudo zpool status -v
...
Interpretation: If you see “Permanent errors…” with file paths, you already have a target list. If you see only device errors or “too many errors,” you’re in “stability first” territory: stop the bleeding before forensic mapping.
2) Decide whether this is corruption or connectivity
cr0x@server:~$ sudo zpool status tank
...
NAME STATE READ WRITE CKSUM
ata-WDC... ONLINE 0 0 54
Interpretation: A rising CKSUM count with low READ/WRITE often means bad data is being returned (disk media, controller, cable). Rising READ/WRITE counts or device OFFLINE/FAULTED points more toward link resets, power, or a dying drive.
3) Check if a scrub already fixed it
cr0x@server:~$ sudo zpool status -v tank
scan: scrub repaired 0B in 02:14:33 with 0 errors on Tue Dec 24 03:10:11 2025
Interpretation: “Repaired X” means ZFS found bad blocks and corrected them using redundancy. If it repaired data, re-run the workload that failed and see if it’s gone. If you still have “Permanent errors,” redundancy wasn’t enough for those blocks.
4) Identify the fault domain: one disk, one HBA, one backplane lane?
cr0x@server:~$ ls -l /dev/disk/by-id/ | grep WDC_WD80 | head
...
Interpretation: If errors cluster on disks behind the same controller or expander, suspect the shared component. If it’s one disk repeatedly, suspect that disk (or its slot/cable).
5) If the pool is unstable, stabilize it before deep mapping
If devices are dropping, don’t run fancy zdb archaeology first. Fix cabling, replace the disk, stop the resets. A scrub on a flapping disk can turn “recoverable” into “permanent” by repeatedly failing reads.
Practical tasks: commands + interpretation (12+)
These are tasks I’ve actually run under pressure. Each includes what the output means and what decision it should drive.
Task 1: Get the high-signal pool view
cr0x@server:~$ sudo zpool status -xv
all pools are healthy
Interpretation: -x prints only pools with issues; -v adds file lists if available. If you get “healthy,” stop digging. If not, proceed.
Task 2: Pull a specific pool’s verbose status
cr0x@server:~$ sudo zpool status -v tank
pool: tank
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
...
errors: Permanent errors have been detected in the following files:
tank/projects/build-cache/index.sqlite
Interpretation: Even “ONLINE” can mean data corruption happened. If apps “may be affected,” treat it as “is affected until proven otherwise.”
Task 3: Find when errors started (event history)
cr0x@server:~$ sudo zpool events -v | tail -n 30
...
Interpretation: Look for device removal, checksum error events, or resilver start/finish. This often correlates with maintenance windows, kernel updates, or someone “tidying cables.”
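If the raw event stream is noisy, filtering on event class names usually narrows it down. A minimal sketch, assuming an OpenZFS build that reports ereport.fs.zfs.* and sysevent.fs.zfs.* classes:
cr0x@server:~$ sudo zpool events | egrep -i "checksum|io|data|resilver|scrub" | tail -n 30
Checksum and data ereports clustered around a single timestamp often line up with a dmesg link reset or an entry in your change log.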
Task 4: Run (or re-run) a scrub intentionally
cr0x@server:~$ sudo zpool scrub tank
Interpretation: Scrub is the integrity truth serum. Run it when the pool is stable. If you’re in the middle of an incident, scrubbing immediately can be right—unless you suspect hardware instability, in which case you fix that first.
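If a scrub starts hurting production latency mid-incident, you don't have to let it run to completion. Current OpenZFS releases support pausing and cancelling; a minimal sketch:
cr0x@server:~$ sudo zpool scrub -p tank   # pause; running 'zpool scrub tank' again resumes it
cr0x@server:~$ sudo zpool scrub -s tank   # stop the scrub entirely
Pausing is usually the better option during business hours: you keep the progress made so far instead of starting over later.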
Task 5: Watch scrub progress without guessing
cr0x@server:~$ watch -n 5 "sudo zpool status tank | sed -n '1,25p'"
Every 5.0s: sudo zpool status tank | sed -n '1,25p'
pool: tank
state: ONLINE
scan: scrub in progress since Wed Dec 25 01:10:11 2025
612G scanned at 1.20G/s, 110G issued at 220M/s, 4.21T total
0B repaired, 2.61% done, 0:52:01 to go
Interpretation: “scanned” vs “issued” tells you if the scrub is I/O constrained. If issued is low, you might be bottlenecked by vdev latency, SMR behavior, or throttling.
Task 6: Clear errors (only after you understand what you’re clearing)
cr0x@server:~$ sudo zpool clear tank
Interpretation: This clears counters and “known bad” state; it does not magically fix data. Use it after replacing hardware or after restoring/replacing corrupt files so you can see if errors return.
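Before clearing anything, save the evidence somewhere the counters can't erase it. A minimal sketch; the output paths are arbitrary:
cr0x@server:~$ sudo zpool status -v tank | sed -n '/errors:/,$p' > /tmp/tank-errors-$(date +%F).txt
cr0x@server:~$ sudo zpool events -v > /tmp/tank-events-$(date +%F).txt
Attach both files to the incident ticket; the error list in particular is the artifact you'll be restoring from.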
Task 7: Identify whether “CKSUM” is still climbing
cr0x@server:~$ sudo zpool status tank | awk 'NR==1,NR==40 {print}'
...
Interpretation: Take two snapshots in time. If CKSUM increments while idle, suspect hardware/firmware. If it increments only during heavy reads, suspect marginal media or controller errors under load.
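A concrete way to take those two snapshots in time, assuming a ten-minute observation window is acceptable:
cr0x@server:~$ sudo zpool status tank > /tmp/cksum-before.txt
cr0x@server:~$ sleep 600; sudo zpool status tank > /tmp/cksum-after.txt
cr0x@server:~$ diff /tmp/cksum-before.txt /tmp/cksum-after.txt
Any changed CKSUM column in the diff is your answer; no change during the window may just mean you need a longer window or a read workload to provoke it.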
Task 8: Check device health (SMART) for the specific offender
cr0x@server:~$ sudo smartctl -a /dev/sdc | egrep -i "realloc|pending|uncorrect|crc|error|self-test"
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 23
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
Interpretation: UDMA CRC errors scream “cable/backplane/controller path,” not “disk media.” Reallocated/pending sectors suggest the disk itself is failing. Neither is a perfect oracle; use it to choose what you replace first.
Task 9: Check kernel logs for resets and I/O errors
cr0x@server:~$ sudo dmesg -T | egrep -i "zfs|sd.*error|ata.*error|reset|I/O error|checksum" | tail -n 40
...
Interpretation: If you see link resets, command timeouts, or bus resets around the time ZFS saw corruption, the “corrupt file” is the symptom, not the disease.
Task 10: Get the dataset mountpoints and confirm the path is real
cr0x@server:~$ zfs list -o name,mountpoint,compression,recordsize -r tank | head -n 20
NAME MOUNTPOINT COMPRESS RECORDSIZE
tank /tank lz4 128K
tank/media /tank/media lz4 1M
tank/vmstore /tank/vmstore lz4 128K
Interpretation: If zpool status -v lists tank/vmstore/..., you need to know where that is on disk (/tank/vmstore) and what recordsize/compression might imply for application-level recovery.
Task 11: Verify the file can be read end-to-end (and trigger healing if redundant)
cr0x@server:~$ sudo dd if=/tank/media/exports/2024-archive.tar of=/dev/null bs=4M status=progress
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 4 s, 536 MB/s
...
Interpretation: If the read errors out or stalls, you’ve confirmed impact. On redundant pools, a read can also prompt ZFS to repair; after a successful read, check if CKSUM counters changed.
Task 12: If you have snapshots, restore just the file
cr0x@server:~$ zfs list -t snapshot -o name,creation -S creation | head
NAME CREATION
tank/media@autosnap_2025-12-24_0300 Tue Dec 24 03:00 2025
tank/media@autosnap_2025-12-23_0300 Mon Dec 23 03:00 2025
cr0x@server:~$ sudo cp -a /tank/media/.zfs/snapshot/autosnap_2025-12-24_0300/exports/2024-archive.tar /tank/media/exports/2024-archive.tar
Interpretation: This is often the fastest “fix”: replace the corrupt file with a known-good version. If it’s a VM disk, you may prefer to restore it into a new filename and do integrity checks before swapping.
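If you want proof that the restored copy is byte-identical to the snapshot version before declaring victory, a straight comparison works; it also forces a full read of both copies, which on a redundant pool doubles as a heal-on-read pass. A sketch using the same paths as above:
cr0x@server:~$ sudo cmp /tank/media/.zfs/snapshot/autosnap_2025-12-24_0300/exports/2024-archive.tar /tank/media/exports/2024-archive.tar && echo "identical"
Expect this to take a while on multi-terabyte files; for large VM images, a checksum tool with progress output may be kinder to your patience.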
Task 13: If no snapshots, verify a replica or re-copy from source
cr0x@server:~$ sudo rsync -avh --inplace --checksum /mnt/source/2024-archive.tar /tank/media/exports/2024-archive.tar
sending incremental file list
2024-archive.tar
4.21T 100% 1.05GB/s 1:04:10 (xfr#1, to-chk=0/1)
Interpretation: --checksum forces verification beyond timestamps; --inplace is controversial on copy-on-write systems, but can be appropriate when you want to rewrite blocks without creating a second full copy. Use with care (see mistakes section).
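A more conservative pattern on copy-on-write storage is to copy to a temporary name and rename into place only after the transfer verifies; a rename within the same dataset is atomic. A sketch, assuming you have room for a second full copy:
cr0x@server:~$ sudo rsync -avh --checksum /mnt/source/2024-archive.tar /tank/media/exports/.2024-archive.tar.tmp
cr0x@server:~$ sudo mv /tank/media/exports/.2024-archive.tar.tmp /tank/media/exports/2024-archive.tar
Readers never see a half-written file, and a failed transfer leaves nothing worse behind than what you started with.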
Task 14: Replace a failing disk cleanly
cr0x@server:~$ sudo zpool replace tank ata-WDC_WD80...-old /dev/disk/by-id/ata-WDC_WD80...-new
cr0x@server:~$ sudo zpool status tank
scan: resilver in progress since Wed Dec 25 02:11:12 2025
310G scanned at 1.05G/s, 92.1G issued at 320M/s, 4.21T total
0B resilvered, 2.19% done, 3:40:21 to go
Interpretation: Replacement addresses the underlying risk. It does not automatically fix already-corrupt blocks unless redundancy existed and the pool can reconstruct them during scrub/resilver reads.
Task 15: Use ZFS-level checksums to verify a dataset (quick sanity)
cr0x@server:~$ sudo zfs set readonly=on tank/media
cr0x@server:~$ sudo zpool scrub tank
cr0x@server:~$ sudo zfs set readonly=off tank/media
Interpretation: Temporarily setting a dataset read-only reduces write churn during investigation and can keep an application from overwriting evidence. Don’t do this on live app datasets without coordination.
Mapping pool errors to exact files (the real workflow)
If zpool status -v already prints file paths, congratulations—you’re ahead of the usual case. You still need to answer three questions:
- Is the file actually corrupt right now, or was it repaired?
- Is the corruption limited to that file, or a sign of deeper trouble?
- What’s the fastest safe recovery action?
When zpool status -v gives you file paths
ZFS will list paths it can associate with permanent errors. For each listed path:
- Try to read it end-to-end (or at least the critical portion).
- If redundant, see whether reading triggers repair (watch CKSUM and subsequent scrubs).
- Restore from snapshot or re-copy from source.
- Clear errors and scrub again to confirm the pool is clean.
Example verification loop:
cr0x@server:~$ sudo zpool status -v tank
errors: Permanent errors have been detected in the following files:
tank/projects/build-cache/index.sqlite
cr0x@server:~$ sudo sqlite3 /tank/projects/build-cache/index.sqlite "PRAGMA integrity_check;"
*** in database main ***
On tree page 914 cell 27: Rowid 188394 out of order
database disk image is malformed
Interpretation: The database is truly damaged at the application layer. ZFS did its job by telling you “this file is not trustworthy.” Your job is to restore/rebuild the database, not to argue with it.
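The same "trust the application-level check" idea applies to the other file types that tend to show up in these lists. Two hedged examples, assuming the relevant tools are installed and reusing paths from earlier output:
cr0x@server:~$ tar -tf /tank/media/exports/2024-archive.tar > /dev/null && echo "archive reads cleanly"
cr0x@server:~$ sudo qemu-img check /tank/vmstore/vm-112-disk-0.qcow2
tar -tf forces a full read of the archive structure; qemu-img check validates qcow2 metadata, though it won't vouch for the guest filesystem inside the image.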
When zpool status -v does NOT give you file paths
This is the more interesting case. You might see:
- “Permanent errors have been detected…” but no list
- “too many errors”
- errors attributed to metadata (sometimes shown as <metadata> or object IDs, depending on platform)
When ZFS can’t print a path, you can still often map corruption to files using object numbers and zdb. The idea is:
- Identify the dataset that contains the bad blocks (often hinted in status or by workload impact).
- Use zdb to inspect dataset objects and locate the affected object number(s).
- Map object numbers to paths (when possible) or at least identify the file type (zvol, dnode, directory, etc.).
Important operational warning: zdb is powerful and sharp. Use read-only flags where available, run it off-peak if possible, and capture output for later review. It shouldn’t modify the pool, but in a crisis you don’t want surprises.
Task: Identify datasets and whether the corruption likely affects a zvol
cr0x@server:~$ zfs list -t filesystem,volume -o name,type,used,volsize -r tank
NAME TYPE USED VOLSIZE
tank filesystem 3.21T -
tank/media filesystem 1.88T -
tank/vmstore filesystem 1.02T -
tank/vmstore/vm-112 volume 120G 120G
Interpretation: If the impacted thing is a volume (zvol), it won’t show up as a regular file path in the same way. Your “file” might be a block device consumed by a hypervisor or iSCSI target.
Task: Check if the reported path is inside a snapshot, clone, or current filesystem
cr0x@server:~$ sudo zpool status -v tank | sed -n '/Permanent errors/,$p'
errors: Permanent errors have been detected in the following files:
tank/media/exports/2024-archive.tar
cr0x@server:~$ ls -lah /tank/media/exports/2024-archive.tar
-rw-r--r-- 1 root root 4.3T Dec 21 01:01 /tank/media/exports/2024-archive.tar
Interpretation: Validate the file exists where you think it does. Sounds obvious; in the real world, mountpoints drift, bind mounts happen, and someone will insist it’s “under /srv.”
Task: Use zdb to map a path to internal object info (advanced)
On many OpenZFS systems you can ask zdb about a file path within a dataset. A common pattern is: find the dataset, then query with zdb to locate the dnode/object.
cr0x@server:~$ sudo zdb -ddd tank/media | head -n 25
Dataset tank/media [ZPL], ID 52, cr_txg 1, 1.88T, 9.21M objects
Object lvl iblk dblk dsize dnsize lsize %full type
1 1 128K 1K 16.0K 512 1.50K 100.00 ZFS master node
2 1 128K 1K 16.0K 512 14.5K 100.00 ZFS directory
...
Interpretation: This confirms the dataset type (ZPL filesystem) and that zdb can read metadata. If zdb itself throws I/O errors, stop and stabilize hardware.
Task: Identify a file’s inode/object (portable approach)
cr0x@server:~$ stat -c 'path=%n inode=%i size=%s' /tank/media/exports/2024-archive.tar
path=/tank/media/exports/2024-archive.tar inode=188394 size=4639123456789
Interpretation: On ZFS, the inode number often corresponds to an internal object number (dnode) for ZPL filesystems. That can be the bridge between “file path” and “object 188394” you may see elsewhere.
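The bridge also works in the other direction. When the error list shows something like tank/media:<0x2dfea> instead of a path (a hypothetical object reference), converting the hex object number to decimal and searching by inode can recover the filename, provided the object is a plain file in a mounted filesystem:
cr0x@server:~$ printf '%d\n' 0x2dfea
188394
cr0x@server:~$ sudo find /tank/media -xdev -inum 188394   # prints the owning path, if any
Expect this to be slow on filesystems with many millions of files, and to return nothing for metadata-only objects.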
Task: Deep-dive a specific object with zdb (forensic mapping)
cr0x@server:~$ sudo zdb -dddd tank/media 188394 | sed -n '1,120p'
Object lvl iblk dblk dsize dnsize lsize %full type
188394 2 128K 128K 1.00M 512 4.21T 98.11 ZFS file
Bonus System attributes
ZPL_SIZE 4639123456789
ZPL_GEN 4821
Indirect blocks:
...
Interpretation: If you can inspect the object cleanly, you can sometimes go further and locate which block pointer(s) are failing, which helps differentiate “single bad sector” from “widespread trouble.” More importantly: it confirms you’re looking at the right thing.
Task: Prove whether corruption is localized or systemic
cr0x@server:~$ sudo zpool status tank | egrep -A3 "NAME|raidz|mirror|ata-|nvme"
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ata-WDC...-part1 ONLINE 0 0 0
ata-WDC...-part1 ONLINE 0 0 2
ata-WDC...-part1 ONLINE 0 0 0
Interpretation: “2 CKSUM on one disk” could be a one-off bit rot event or a cable sneeze. “Thousands across multiple disks” is usually infrastructure: HBA, backplane, firmware, power, or a bad batch of drives.
Task: If errors are on a single top-level vdev, understand the blast radius
cr0x@server:~$ sudo zpool list -v tank
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH
tank 21.8T 16.2T 5.6T - - 22% 74% 1.00x ONLINE
raidz1-0 21.8T 16.2T 5.6T - - 22% 74%
Interpretation: Single-vdev pools are common. If that vdev has issues, the whole pool’s integrity depends on it. Multi-vdev pools can have failure isolation, but also wider operational complexity.
Task: Check whether special vdevs or metadata devices are involved
cr0x@server:~$ sudo zpool status tank | sed -n '1,120p'
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
...
special
mirror-1 ONLINE 0 0 0
nvme-SAMSUNG... ONLINE 0 0 0
nvme-SAMSUNG... ONLINE 0 0 0
Interpretation: If a special vdev is failing (metadata/small blocks), you can see “file corruption” symptoms that look random and terrifying. Treat special vdev reliability like it’s the pool’s nervous system—because it is.
Task: If you suspect “bad RAM” or transient memory corruption
cr0x@server:~$ sudo zpool status -v tank
status: One or more devices has experienced an error resulting in data corruption.
...
Interpretation: ZFS is not immune to bad RAM. ECC helps. Without ECC, you can still run ZFS, but you should be more paranoid: repeatable checksum errors on different disks without a clear hardware path issue can be a hint. Correlate with system logs, crashes, and other anomalies.
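There's no ZFS command that proves memory is bad, but on Linux you can at least check whether the kernel has logged corrected or uncorrected memory events. A sketch, assuming the platform exposes EDAC counters (not all hardware does):
cr0x@server:~$ sudo dmesg -T | egrep -i "edac|mce|machine check" | tail -n 20
cr0x@server:~$ grep . /sys/devices/system/edac/mc/mc*/ce_count 2>/dev/null
A climbing corrected-error count on ECC systems is a warning sign; on non-ECC systems you're left with memtest runs during a maintenance window and a healthy level of suspicion.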
Three corporate-world mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A company I worked with had a “simple” rule: if zpool status says ONLINE, storage is fine. That rule lived in a runbook, which meant it eventually got promoted from “helpful shortcut” to “policy.”
One morning, a VM image wouldn’t boot. The hypervisor logs showed I/O errors; the storage node showed the pool ONLINE. So the incident commander declared the storage layer “cleared” and pushed the virtualization team to “fix the guest.” Classic hot potato, but with more Slack channels.
Someone finally ran zpool status -v (not just zpool status) and got a permanent error listing the exact .qcow2 backing file. The pool was ONLINE because the vdevs were present; the data was not fine because the pool had uncorrectable checksum failures in a few blocks.
The wrong assumption wasn’t “ZFS lies.” It was “ONLINE means healthy.” ONLINE means “the pool can be accessed.” It says nothing about whether every byte is intact. After that incident, the team changed the rule: storage is “green” only if zpool status -xv is clean and the last scrub completed without uncorrected errors.
The fix was boring: restore the VM disk from the previous night’s snapshot, then scrub the pool and replace the suspect drive path hardware. The lesson stuck because it was expensive: the VM had been “repaired” by reinstalling packages for two hours before anyone proved the disk image was corrupted.
Mini-story 2: The optimization that backfired
Another place decided scrubs were “too expensive” during business hours. They had analytics workloads and didn’t want read bandwidth consumed by integrity scans. So they moved scrubs to a monthly schedule and throttled them aggressively.
On paper, it looked responsible: fewer scrubs, less I/O, happier dashboards. In practice, it stretched the detection window. A single marginal SAS cable started producing intermittent checksum errors. Nothing crashed. The pool kept healing what it could. And because scrubs were rare, the problem stayed invisible long enough for a second disk to develop real media errors.
By the time the monthly scrub ran, it found uncorrectable blocks scattered across several datasets. zpool status -v listed a handful of files, but metadata damage meant some errors weren’t mapped cleanly to paths. Recovery turned into a messy mixture of restoring snapshots, verifying databases, and explaining to management why “we have RAIDZ, so we’re safe” is not a sentence you want in a postmortem.
The optimization wasn’t the schedule alone—it was the confidence it created. They treated scrubbing like defragmentation: optional. Scrubs are closer to “fire alarm tests.” They’re disruptive until the day you need them, and then you’ll wish you’d been doing them all along.
The outcome: they returned to weekly scrubs, tuned scrub speed based on latency SLOs, and started alerting on any nonzero CKSUM deltas per day. The funny part is the cluster ran faster afterward, because the underlying link errors were fixed and retries stopped dragging latency through the floor.
Mini-story 3: The boring but correct practice that saved the day
The most successful corruption incident I’ve seen was almost anticlimactic. A storage node reported a few CKSUM errors and zpool status -v listed two files: an exported archive and a small Postgres WAL segment.
The team had a habit—some called it paranoid—of taking frequent snapshots and replicating critical datasets. It wasn’t fancy: consistent naming, retention policies, periodic restore tests. The kind of thing nobody celebrates because it doesn’t ship features.
They did the basics: paused the ingestion job touching the dataset, restored the two files from a snapshot taken an hour earlier, and ran a scrub to validate no other uncorrected errors existed. They also pulled SMART and noticed UDMA CRC errors on one disk, swapped the cable during a maintenance window, and watched the CKSUM counter stop moving.
No heroic zdb spelunking. No “maybe the filesystem will fix itself.” Just disciplined hygiene. The post-incident review lasted 20 minutes because there wasn’t much drama left to narrate, which is the best kind of review.
If you want a moral: snapshots don’t prevent corruption, but they turn corruption into an inconvenience. And in corporate life, an inconvenience is basically a win.
Common mistakes: symptoms and fixes
Mistake 1: Clearing errors before restoring files
Symptom: You run zpool clear, the status looks clean, then the application reads the file and fails again later—or you lose track of which files were affected.
Fix: Treat the “Permanent errors” list as an incident artifact. Copy it to your ticket, restore/replace the files, then clear and scrub to confirm. Clearing is for verifying recurrence, not for denial.
Mistake 2: Assuming CKSUM means “disk is bad” (it might be, but not always)
Symptom: You replace a disk, resilver completes, and CKSUM errors keep rising—sometimes on the replacement drive.
Fix: Check SATA/SAS cabling, backplanes, HBAs, expander firmware, and power stability. Look at SMART UDMA CRC counters and kernel logs for link resets.
Mistake 3: Scrubbing on unstable hardware
Symptom: A disk is intermittently dropping; you scrub; afterward you have more permanent errors than before.
Fix: Stabilize first. Scrub is a full-surface read. If the path is flaky, you are forcing the system to repeatedly touch the weak spot. Replace the cable/HBA/disk as indicated, then scrub.
Mistake 4: Confusing “repaired 0B” with “no corruption ever happened”
Symptom: Scrub reports “0B repaired,” but you still see permanent errors or application-level corruption.
Fix: “0B repaired” can mean either “nothing was wrong” or “nothing could be repaired.” Always check the “errors:” section and the device error counters.
Mistake 5: Ignoring metadata/special vdev implications
Symptom: Random files across different datasets fail; zpool status -v output is sparse or weird; performance becomes erratic.
Fix: Check whether special vdevs exist and whether they are healthy. Treat them as critical. If a special vdev fails and you have no redundancy, the pool can be effectively lost even if data disks are fine.
Mistake 6: Treating a VM disk file like a normal file during recovery
Symptom: You restore a .qcow2 or raw image “in place” while the VM is running; later you get subtle guest filesystem corruption.
Fix: Quiesce the VM, restore to a new file, run guest-level checks if possible, then swap. VM images are big, hot, and sensitive to partial restore workflows.
Mistake 7: Turning on an “optimization” without understanding write patterns
Symptom: After changing recordsize, compression, or using rsync --inplace, you see worse fragmentation, longer scrubs, or unexpected latency spikes.
Fix: Treat performance tuning as a change request: measure, stage, and roll back if it backfires. Integrity workflows depend on predictable I/O behavior.
Checklists / step-by-step plan
Checklist A: “I just saw corruption” response plan
- Capture evidence: zpool status -v, zpool events -v, dmesg excerpts.
- Determine if the pool is stable (devices not flapping). If unstable, prioritize hardware stabilization.
- If zpool status -v lists files, inventory them and classify: critical DB/VM images first.
- Run or schedule a scrub (when stable). Monitor progress and errors.
- Restore/replace corrupt files from snapshots or upstream source.
- Clear errors (zpool clear) only after remediation.
- Re-scrub to validate and watch counters for regression.
- Root-cause hardware: SMART, cabling, controller firmware, power events.
Checklist B: Step-by-step to find “the exact file”
- Run zpool status -v. If you get file paths, stop here—you already have them.
- If no paths, identify the affected vdev/disk via error counters.
- Identify the likely dataset by workload impact (which mountpoint/app is seeing errors) and by reviewing event history.
- Use stat on suspected files to get inode/object numbers.
- Use zdb to inspect dataset and object details; confirm object types.
- Once you have a suspect file/object, attempt controlled reads to confirm.
- Recover from snapshot/replica; validate application-level integrity (db checks, archive tests, VM boot checks).
Checklist C: Post-fix verification
- zpool status -xv should be clean.
- Scrub completes with 0 errors.
- CKSUM counters stop increasing after normal workloads resume.
- SMART and kernel logs show no link resets/timeouts.
- Restore tests prove backups/snapshots are valid.
FAQ
1) Does zpool status -v always show the exact filename?
No. It shows file paths when ZFS can map permanent errors to a file in a ZPL filesystem. If corruption is in metadata, affects unmapped objects, involves a zvol, or ZFS can’t safely resolve the path, you may get no filename or only object-like references.
2) What’s the difference between READ, WRITE, and CKSUM errors?
READ/WRITE are I/O failures (device couldn’t read/write successfully). CKSUM means the device returned data successfully, but it didn’t match the expected checksum—often pointing to silent corruption or a bad path (cable/HBA/backplane).
3) If ZFS repaired data during a scrub, do I need to do anything?
Yes: you need to find the root cause. A repair means redundancy saved you this time. Check device counters, SMART, and logs. One repaired event can be a cosmic ray; repeated repairs are a hardware problem auditioning for a full-time role.
4) Should I run a scrub immediately after seeing errors?
If the hardware path is stable, yes—scrub gives you the ground truth and may repair. If devices are flapping or you see resets/timeouts, fix stability first; scrubbing unstable gear can increase uncorrectable reads.
5) What’s the fastest way to “fix” a listed corrupt file?
Restore it from a snapshot (/path/.zfs/snapshot/...) or re-copy it from a known-good source. Then clear errors and scrub again to validate. For databases and VM images, do application-level integrity checks as well.
6) Can a bad SATA/SAS cable really cause checksum errors?
Absolutely. It’s one of the most common “that can’t be it” root causes. SMART’s UDMA CRC counter and kernel logs about link resets are strong signals. Replace the cable, reseat connectors, and ensure the backplane is healthy.
7) If I replace the disk, will the corrupt file automatically be fixed?
Only if redundancy allows reconstruction of the bad block and the pool actually reads and repairs it (scrub/heal-on-read). If the pool never had a good copy (or the corruption existed in all copies), replacing hardware won’t resurrect the data—you still need snapshots/backups.
8) Why does zpool status sometimes say “applications are unaffected” when it lists errors?
That line usually appears when ZFS detected and corrected an issue (or believes it did). Treat it as “no immediate I/O failures observed by applications,” not as a guarantee. If there are permanent errors, some data is untrusted.
9) What does “too many errors” mean?
ZFS is telling you the error rate is high enough that listing everything is impractical, or the situation is actively deteriorating. Focus on stabilizing hardware, running a scrub, and triaging impacted datasets based on application symptoms.
10) Can I use zpool clear to get rid of the scary message?
You can, but it’s like removing the smoke detector battery because it’s loud. Clear after remediation so you can detect whether errors recur. If you clear first, you lose a key breadcrumb trail.
Conclusion
zpool status -v is one of the rare tools in infrastructure that’s both honest and useful: it tells you what ZFS can prove, not what you want to hear. When it prints a file list, treat that output like a surgical checklist—verify, restore, validate, then clear and re-scrub. When it doesn’t print paths, don’t panic; it usually means the corruption is in metadata or otherwise hard to map, and that’s your cue to stabilize hardware and escalate to deeper inspection with zdb and object-level reasoning.
The operational win isn’t merely finding the corrupt file. It’s building a habit: frequent scrubs that fit your latency budget, snapshots you’ve actually tested, and alerting that catches a rising CKSUM counter before it becomes a postmortem headline.