You bought a “fast” SATA SSD, added it as a ZFS SLOG, and expected miracles. Instead you got slower NFS, jittery VM latency,
or—worse—a quiet increase in risk that only shows up during the first power event that matters.
A SLOG is not a cache. It is not a turbo button. It is a promise you make to applications that say “this write is safe now.”
If you keep that promise with a consumer SATA SSD, you’re betting production on a device that was never designed for that kind of oath.
What a SLOG actually does (and what it doesn’t)
ZFS has two related concepts that get mashed together in forum lore: the ZIL (ZFS Intent Log) and the SLOG
(Separate LOG device).
ZIL: the in-pool intent log that exists whether you like it or not
The ZIL is not an “extra feature.” It’s part of how ZFS provides POSIX semantics for synchronous writes. When an application
calls fsync(), uses O_SYNC, or when a protocol insists on stable writes (hello, NFS), ZFS must acknowledge only after the
data is safe against a crash.
For sync writes, ZFS first writes transaction records to the ZIL. Later, during the normal transaction group (TXG) commit,
data is written to its final location in the pool. On crash/reboot, ZFS replays the ZIL to recover the last acknowledged sync writes.
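A quick way to feel the difference from the command line, assuming a throwaway dataset mounted at /tank/test (paths and sizes here are illustrative, not a benchmark methodology):
cr0x@server:~$ sudo dd if=/dev/zero of=/tank/test/async.bin bs=4k count=20000
cr0x@server:~$ sudo dd if=/dev/zero of=/tank/test/sync.bin bs=4k count=20000 oflag=dsync
The first run lands in memory and is flushed with the next TXG; the second forces a durable write per block, so its throughput reflects your ZIL (or SLOG) path rather than your pool's streaming speed.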
SLOG: relocating the ZIL to a dedicated device
By default, the ZIL lives on the main pool, spread across top-level vdevs. Adding a SLOG tells ZFS: “put the ZIL records here instead.”
The goal is to lower sync write latency and smooth out the random write penalty of sync workloads.
It only helps if your workload issues sync writes. If your workload is mostly async (typical bulk writes, many databases tuned for async,
media storage), a SLOG is a decorative spoiler on a minivan.
What a SLOG is not
- Not an L2ARC. It does not cache reads.
- Not a write-back cache. It does not absorb all writes; it only accelerates sync write acknowledgments.
- Not a performance freebie. A bad SLOG can make latency worse and increase failure blast radius.
The SLOG write pattern is brutal: small, mostly sequential, latency-sensitive writes that must be durable now.
The device must honor flushes. It must have power-loss protection or equivalent durability.
And it must not stall under sustained fsync pressure.
Joke #1: A consumer SATA SSD as SLOG is like using a travel pillow as a motorcycle helmet—soft, optimistic, and wrong at speed.
Why SATA SSD SLOG so often disappoints or fails
The cheap upgrade pitch is seductive: “I have a spare SATA SSD. Add it as log. Sync writes go brrr.” Sometimes you get a small win.
Often you get a mess. Here’s why.
1) SATA SSDs lie (or at least, they negotiate)
The SLOG is only as good as the device’s ability to make writes durable on demand. In ZFS terms, this means honoring cache flushes
and not acknowledging writes until data is in non-volatile media.
Many consumer SSDs have volatile DRAM caches and varying levels of firmware discipline around flush commands. Some are great.
Some are “fine until they aren’t.” The worst case is a drive that acknowledges quickly but loses the last seconds of writes on power loss.
For a SLOG, that is catastrophic: ZFS will replay the ZIL after reboot, but if the SLOG lost acknowledged log records, you’ve created
a window for silent corruption or application-level inconsistency.
2) Latency matters more than bandwidth, and SATA is bad at latency under pressure
Sync write performance is dominated by tail latency. A SLOG that does 50,000 IOPS in a benchmark but occasionally pauses
for garbage collection, firmware housekeeping, or SLC cache exhaustion will turn “fast” into “spiky.”
SATA’s protocol and queues are also limited compared to NVMe. You’re not buying just throughput; you’re buying better behavior
under concurrent flush-heavy workloads.
3) Endurance and write amplification show up earlier than you think
A SLOG sees a constant stream of small writes: ZFS writes log records, then discards them once the corresponding TXG commits.
That churn can produce write amplification and steady wear.
Consumer SATA drives often have lower endurance ratings and weaker sustained write performance once their pseudo-SLC cache fills.
The failure mode is not always “drive dies.” Often it’s: latency degrades, then you start seeing application timeouts,
then someone disables sync to “fix it,” and now you’re running without a safety net.
4) The “single cheap SLOG” is a single point of drama
ZFS can operate without a SLOG. If the log device fails, the pool typically continues, but if that failure coincides with a crash
you can lose the last acknowledged synchronous writes (until the TXG commit, they existed only on the failed SLOG).
Mirroring the SLOG removes that class of risk, but then your “cheap upgrade” now needs two devices—and you still need them to be
power-loss safe and stable under flush.
5) You might not have a sync problem at all
Plenty of ZFS performance pain is not the ZIL. It’s undersized ARC, recordsize mismatch, fragmented HDD vdevs,
too many small I/O patterns on RAIDZ, CPU saturation from checksumming/compression, or a hypervisor stack misconfigured for sync.
Adding a SLOG to a system that is already IOPS-bound elsewhere is a classic “tool applied to the wrong wound.”
Facts and historical context worth knowing
A few concrete points—some history, some engineering—that help cut through myths. These aren’t trivia; they change decisions.
- ZFS originated at Sun in the mid-2000s with an explicit focus on data integrity: end-to-end checksums and copy-on-write were not optional features.
- The ZIL exists even without a SLOG; adding a log device only relocates it. People who “add SLOG for faster writes” often misunderstand this.
- NFS traditionally treats many operations as synchronous (or demands stable storage semantics), which is why SLOG talk often starts in NFS-heavy shops.
- Early SSD eras made flush behavior notoriously inconsistent; firmware bugs around write barriers and FUA led to years of “it benchmarks fast” surprises.
- SATA’s command queue depth and protocol overhead are limited compared to NVMe; this matters most for flush-heavy and parallel I/O patterns.
- Power-loss protection (PLP) was historically an enterprise feature because it requires hardware (capacitors) and validation; consumer drives often omit it or implement partial measures.
- ZFS TXGs typically commit every few seconds (tunable), which defines how long log records live before being discarded: short-lived, high-churn writes. (A quick way to check the interval is shown right after this list.)
- “Disable sync” became a folk remedy in virtualization stacks because it makes benchmarks look great; it also makes crash recovery look like a crime scene.
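For reference, on an OpenZFS-on-Linux host the TXG interval is exposed as a module parameter; the value below is the common default (in seconds), not a tuning recommendation:
cr0x@server:~$ cat /sys/module/zfs/parameters/zfs_txg_timeout
5
Under light load, that is roughly the upper bound on how long a log record matters before the regular TXG commit writes the data to its final location.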
Failure modes: performance, correctness, and “it seemed fine”
Performance failure: the SLOG becomes the bottleneck
When you add a SLOG, synchronous operations must hit that log device. If that device has higher latency than your pool’s best-case
sync path (say, a pool of decent SSDs, or even a well-behaved mirror of HDDs with write cache behavior you understand), you can make
sync writes slower.
Classic symptom: your pool’s regular writes are fine, reads are fine, but anything that calls fsync spikes to tens or hundreds of
milliseconds intermittently.
Correctness failure: acknowledging writes that aren’t truly durable
The nightmare scenario is a drive that returns success before the data is actually safe. With a SLOG, ZFS is using that “success” to
tell applications “your synchronous write is safe.” If power dies and the drive loses those acknowledged records, ZFS cannot replay what
it never actually got. The pool will import. It may even look clean. Your application will be the one discovering missing or corrupted
last-second transactions.
People ask, “Doesn’t ZFS protect against that?” ZFS protects against a lot. It can’t make a lying device honest.
Reliability failure: the SLOG dies and you lose the last safe writes
A non-mirrored SLOG is a single device standing between you and loss of acknowledged sync writes during that device’s failure window.
Even if you accept the risk, the operational reality is uglier: when a SLOG starts failing, it often fails by stalling I/O,
causing system-wide latency storms. Now you’re troubleshooting in production while your hypervisor queue backs up.
Operational failure: someone “fixes” it by disabling sync
This is where the cheap upgrade often ends. A team adds a SATA SLOG, sees worse latency, flips sync=disabled on a dataset,
celebrates, and unknowingly changes durability semantics for every VM or NFS client using that dataset.
Werner Vogels’ point, paraphrased: everything fails, all the time; design your systems assuming that reality.
Fast diagnosis playbook
You want to find the bottleneck quickly, without belief-based tuning. Here’s a practical order of operations.
First: prove you have a sync workload
- Check ZFS dataset sync properties.
- Check application/protocol behavior (NFS exports, hypervisor settings, database fsync patterns).
- Run a controlled sync write test and compare with async.
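On OpenZFS-on-Linux you can also glance at the global ZIL counters while the workload runs; if commit counts climb steadily, something is issuing sync writes. Exact counter names and locations vary between OpenZFS versions, so treat this as a sketch:
cr0x@server:~$ grep -E 'zil_commit_count|zil_itx_count' /proc/spl/kstat/zfs/zil
Sample it twice a few seconds apart; a delta of zero during your “slow” period suggests the pain is not sync writes at all.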
Second: measure SLOG latency and saturation
- Use iostat to see if the log device is the busiest disk during the issue.
- Look for spikes in await / service times on the SLOG.
- Check if the SLOG is a SATA SSD with questionable flush behavior or no PLP.
Third: confirm pool health and TXG behavior
- Verify no vdev is degraded or resilvering.
- Check for sync write amplification due to small blocks / recordsize mismatch.
- Watch TXG commit times and dirty data limits if you suspect stalls.
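Two places to look on Linux, both read-only; these are the standard OpenZFS locations, and the txgs kstat shows per-TXG dirty bytes and commit timing:
cr0x@server:~$ grep . /sys/module/zfs/parameters/zfs_dirty_data_max /sys/module/zfs/parameters/zfs_dirty_data_max_percent
cr0x@server:~$ tail -n 5 /proc/spl/kstat/zfs/tank/txgs
If TXG sync times (the stime column) stretch into seconds while dirty data sits at its ceiling, the pool itself, not the SLOG, is the thing that cannot keep up.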
Fourth: decide whether to remove, mirror, or replace the SLOG
- If the SLOG is slower than the pool: remove it.
- If you need SLOG for NFS/VM sync semantics: replace with PLP NVMe or enterprise SATA with capacitors; mirror it if the workload matters.
- If you don’t need sync acceleration: don’t run a SLOG “just because.”
Hands-on tasks: commands, outputs, and decisions
These are real tasks you can run on a Linux host using OpenZFS. Each one includes what the output means and what decision you make.
Use a test window if you’re changing properties; the commands below are primarily read-only unless stated.
Task 1: Identify whether a SLOG exists and what it is
cr0x@server:~$ sudo zpool status -v tank
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
sda ONLINE 0 0 0
sdb ONLINE 0 0 0
sdc ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
sdf ONLINE 0 0 0
logs
sdz ONLINE 0 0 0
errors: No known data errors
Meaning: There is a dedicated log vdev (sdz). That’s your SLOG.
Decision: If sdz is a consumer SATA SSD, treat it as suspicious until proven otherwise.
Task 2: Confirm the dataset sync setting (and find “helpful” overrides)
cr0x@server:~$ sudo zfs get -r sync tank
NAME PROPERTY VALUE SOURCE
tank sync standard default
tank/vmstore sync standard local
tank/nfs sync disabled local
Meaning: tank/nfs has sync=disabled, which changes correctness semantics.
Decision: If this was done to “fix performance,” treat it as a production risk item and plan a rollback with a proper SLOG or workload change.
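If the override turns out to be someone’s old “performance fix,” reverting it is a one-liner; a minimal sketch using the dataset from the output above (sync latency will reappear immediately, so coordinate the change):
cr0x@server:~$ sudo zfs inherit sync tank/nfs
cr0x@server:~$ sudo zfs get sync tank/nfs
NAME      PROPERTY  VALUE     SOURCE
tank/nfs  sync      standard  default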
Task 3: Determine whether the log device is SATA and what model it is
cr0x@server:~$ lsblk -d -o NAME,ROTA,TRAN,MODEL,SIZE,SERIAL | grep -E 'sdz|nvme'
sdz 0 sata CT500MX500SSD1 465.8G 2219E5A1B2C3
Meaning: The SLOG is a SATA Crucial MX500 (common consumer drive).
Decision: Assume no full PLP. Plan to validate flush behavior and latency under sync load; strongly consider replacing with a PLP-capable device.
Task 4: Check whether the drive claims to have volatile write cache
cr0x@server:~$ sudo hdparm -W /dev/sdz
/dev/sdz:
write-caching = 1 (on)
Meaning: Write cache is enabled. That’s not automatically bad if the drive has PLP; it’s dangerous if it doesn’t.
Decision: If this is a consumer SSD without PLP, don’t trust it as a durability device for synchronous acknowledgments.
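One blunt experiment, if you suspect the volatile cache is doing the heavy lifting: turn the drive’s write cache off and rerun your sync test. On consumer SSDs without PLP this usually craters performance, which tells you the “fast” numbers depended on a cache that a power cut can erase. Whether the setting persists across reboots is firmware-dependent, so treat it as a test, not a fix:
cr0x@server:~$ sudo hdparm -W 0 /dev/sdz
/dev/sdz:
 setting drive write-caching to 0 (off)
 write-caching = 0 (off)
cr0x@server:~$ sudo hdparm -W 1 /dev/sdz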
Task 5: Pull SMART details and look for power-loss protection hints
cr0x@server:~$ sudo smartctl -a /dev/sdz | sed -n '1,60p'
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.0] (local build)
=== START OF INFORMATION SECTION ===
Model Family: Crucial/Micron MX500 SSDs
Device Model: CT500MX500SSD1
Serial Number: 2219E5A1B2C3
Firmware Version: M3CR046
User Capacity: 500,107,862,016 bytes [500 GB]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s
Local Time is: Thu Dec 26 11:02:41 2025 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Meaning: SMART identifies a consumer SATA SSD family. SMART rarely “confirms PLP” directly on consumer SATA.
Decision: Treat absence of explicit PLP support as “no PLP.” For SLOG, that pushes you toward replacement or removal.
Task 6: Check wear and media errors on the would-be SLOG
cr0x@server:~$ sudo smartctl -a /dev/sdz | egrep -i 'Reallocated_Sector_Ct|Reported_Uncorrect|UDMA_CRC_Error_Count|Percent_Lifetime_Remain|Power_Loss'
Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
Meaning: No obvious media errors. That doesn’t mean it’s suitable as SLOG; it just means it’s not already dying loudly.
Decision: If you see CRC errors, suspect cabling/backplane—fix that before blaming ZFS.
Task 7: Watch per-disk latency during the incident window
cr0x@server:~$ sudo iostat -x 1
Linux 6.8.0 (server) 12/26/2025 _x86_64_ (32 CPU)
Device r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 2.0 18.0 64.0 980.0 98.0 1.20 22.3 8.1 24.0 1.8 36.0
sdz 0.0 420.0 0.0 2100.0 10.0 12.50 29.8 0.0 29.8 0.2 98.0
Meaning: The log device sdz is nearly saturated (%util ~98) with small writes (avgrq-sz of 10 sectors, roughly 5 KB each), and its await is high.
Decision: Your SLOG is the bottleneck. Either replace it with a low-latency PLP device, mirror it, or remove it to fall back to in-pool ZIL.
Task 8: Confirm that sync writes are actually happening
cr0x@server:~$ sudo zpool iostat -v tank 1
capacity operations bandwidth
pool alloc free read write read write
---------- ----- ----- ----- ----- ----- -----
tank 4.22T 6.58T 120 980 9.8M 44.1M
raidz2-0 4.22T 6.58T 120 910 9.8M 40.7M
sda - - 20 150 1.7M 6.8M
sdb - - 20 150 1.6M 6.8M
sdc - - 20 150 1.6M 6.8M
sdd - - 20 150 1.6M 6.8M
sde - - 20 155 1.7M 6.8M
sdf - - 20 155 1.6M 6.7M
logs - - 0 70 0K 3.4M
sdz - - 0 70 0K 3.4M
Meaning: The log vdev is actively receiving writes. That strongly suggests sync activity.
Decision: If you expected async, find out who (NFS, hypervisor, app) is forcing sync. Fix the workload or provision a proper SLOG.
Task 9: Measure sync vs async write latency with fio (carefully)
Run this on a test dataset, not directly on production paths unless you know what you’re doing.
cr0x@server:~$ sudo fio --name=syncwrite --directory=/tank/test --rw=write --bs=4k --iodepth=1 --numjobs=1 --size=512M --fsync=1 --direct=1
syncwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
write: IOPS=620, BW=2480KiB/s (2540kB/s)(512MiB/211498msec)
lat (usec): min=450, max=85000, avg=1600.12, stdev=4100.55
Meaning: Sync write latency has ugly tail spikes (max 85ms). That’s what users feel.
Decision: If max latency is high and correlated with SLOG saturation, replace/remove SLOG rather than tuning around it.
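For contrast, the same job without --fsync=1 gives an async baseline; the gap between the two runs is roughly what the sync path (and therefore the SLOG) is costing you. Parameters mirror the sync test and are illustrative:
cr0x@server:~$ sudo fio --name=asyncwrite --directory=/tank/test --rw=write --bs=4k --iodepth=1 --numjobs=1 --size=512M --direct=1
Compare IOPS and, more importantly, max latency between the two runs; a huge gap confirms the sync path is the bottleneck, a small gap points the investigation elsewhere.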
Task 10: Validate whether the SLOG is mirrored (it should be, if it matters)
cr0x@server:~$ sudo zpool status tank | sed -n '1,40p'
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
logs
sdz ONLINE 0 0 0
Meaning: Single-device SLOG. If it fails, you risk losing the last acknowledged sync writes.
Decision: If your business cares about those writes (VM storage, NFS for home directories, databases), mirror the SLOG or don’t use one.
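If you keep a log device, turning a lone SLOG into a mirror does not require removing it first; zpool attach works on log vdevs too. The second device name below is a placeholder for a PLP-capable drive you actually trust:
cr0x@server:~$ sudo zpool attach tank sdz /dev/disk/by-id/nvme-EXAMPLE_PLP_SSD_SERIAL0001  # placeholder device path
After the attach, zpool status should show the log as a mirror with both legs ONLINE.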
Task 11: Check if the pool is currently resilvering/scrubbing (performance red flag)
cr0x@server:~$ sudo zpool status tank
pool: tank
state: ONLINE
scan: scrub in progress since Thu Dec 26 10:41:01 2025
1.92T scanned at 1.34G/s, 512G issued at 356M/s, 4.22T total
0B repaired, 12.13% done, 02:45:19 to go
config:
...
Meaning: A scrub is in progress. That can amplify latency issues, especially on HDD pools.
Decision: If the performance complaint coincides with scrub/resilver, pause the tuning crusade and retest after the maintenance load is gone.
Task 12: Find out whether the dataset is using small blocks that punish RAIDZ
cr0x@server:~$ sudo zfs get recordsize,volblocksize,compression tank/vmstore
NAME PROPERTY VALUE SOURCE
tank/vmstore recordsize 128K local
tank/vmstore volblocksize - -
tank/vmstore compression lz4 local
Meaning: If this is a VM image dataset backed by files, 128K may be okay, but often you want smaller blocks for random I/O patterns.
Decision: If your issue is random sync writes on RAIDZ, consider mirrors for VM workloads or tune the VM storage approach—don’t expect a SATA SLOG to save RAIDZ from physics.
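If you conclude smaller records fit the VM I/O pattern, the property change itself is trivial; 16K below is an assumption to illustrate the syntax, not a universal answer:
cr0x@server:~$ sudo zfs set recordsize=16K tank/vmstore
cr0x@server:~$ sudo zfs get recordsize tank/vmstore
NAME          PROPERTY    VALUE  SOURCE
tank/vmstore  recordsize  16K    local
Existing data keeps its old block size until it is rewritten, so don’t expect an instant effect.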
Task 13: Verify whether your SLOG is actually being used (not bypassed)
cr0x@server:~$ sudo zdb -C tank | sed -n '1,120p'
MOS Configuration:
version: 5000
name: 'tank'
...
vdev_tree:
type: 'root'
id: 0
guid: 12345678901234567890
children[0]:
type: 'raidz'
...
children[1]:
type: 'log'
id: 1
guid: 998877665544332211
children[0]:
type: 'disk'
path: '/dev/disk/by-id/ata-CT500MX500SSD1_2219E5A1B2C3'
Meaning: The pool config includes a log vdev. ZFS will use it for sync writes unless disabled at dataset level or other constraints apply.
Decision: If you expected a mirrored log but see one child, you’ve found a design flaw, not a tuning knob.
Task 14: Remove a problematic SLOG (if you decide it’s doing harm)
This changes behavior. Schedule it. Communicate it. And remember: removing the SLOG does not “turn off” the ZIL; it moves it back to the pool.
cr0x@server:~$ sudo zpool remove tank sdz
cr0x@server:~$ sudo zpool status tank
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
errors: No known data errors
Meaning: The dedicated log is gone. Sync writes now go to in-pool ZIL.
Decision: If latency improves immediately, the SATA SLOG was your bottleneck. Next step is either “no SLOG” or “proper mirrored PLP SLOG.”
Task 15: If you must have a SLOG, add it as a mirror, using stable device IDs
cr0x@server:~$ sudo zpool add tank log mirror /dev/disk/by-id/nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0A /dev/disk/by-id/nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0B
cr0x@server:~$ sudo zpool status -v tank | sed -n '1,80p'
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
sda ONLINE 0 0 0
sdb ONLINE 0 0 0
sdc ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
sdf ONLINE 0 0 0
logs
mirror-1 ONLINE 0 0 0
nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0A ONLINE 0 0 0
nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0B ONLINE 0 0 0
errors: No known data errors
Meaning: Mirrored SLOG using NVMe devices (example). That’s the right structural pattern for reliability.
Decision: If you can’t afford two appropriate devices, you can’t afford a SLOG for important sync workloads. Run without it.
Three corporate mini-stories from the trenches
Mini-story #1: An incident caused by a wrong assumption
A mid-sized company consolidated a few aging NAS boxes into a shiny ZFS server. They ran NFS for home directories and build outputs.
Someone read that “SLOG speeds up writes” and added a spare consumer SATA SSD as the log device. They did a quick file copy test, saw
no obvious regression, and moved on. Everyone loves a quick win. Everyone loves a checkbox.
Weeks later, a short power event hit the rack—UPS transfer, not a full blackout. The server stayed up. The SATA SSD did not.
It dropped off the bus for a moment and came back. ZFS didn’t immediately panic; the pool stayed online. The team breathed again.
The next morning, developers complained about “random” build failures and corrupted artifacts. Nothing was consistently broken.
Re-running the same build would succeed. That’s the worst kind of failure: intermittent, confidence-eroding, and hard to reproduce.
The key detail was the assumption: they believed the SLOG was “just performance,” not durability semantics. They also believed an SSD
is inherently safer than spinning disks. But the SLOG was the only place those acknowledged synchronous records lived until TXG commit.
When the device went weird during a power anomaly, some acknowledged log writes never made it.
The fix wasn’t heroic. They removed the SLOG, forced clients to remount, and stopped the corruption pattern. Later they added a mirrored,
power-loss safe log device and documented why it existed. The lesson stuck because the incident was just painful enough, and not so painful
that everyone got fired.
Mini-story #2: An “optimization” that backfired
A virtualization cluster hosted mixed workloads: a few latency-sensitive databases, a lot of general-purpose VMs. The storage team
watched graphs and saw periodic fsync spikes. They added a SATA SLOG to the ZFS backend expecting to flatten those spikes.
Initially, median latency improved a bit. The team celebrated. Then month-end reporting arrived, and the real-world pattern changed:
lots of concurrent sync-heavy transactions. The SLOG drive hit its sustained write limits, the pseudo-SLC cache exhausted, and write
latency went from “mostly fine” to “jittery and occasionally awful.”
The VMs didn’t just slow down; they synchronized their misery. When the SLOG stalled, it stalled the synchronous acks for many VMs,
causing guest OSes to queue, causing applications to time out, causing retries that increased write pressure. Latency storms have a talent
for becoming self-sustaining.
Someone proposed the classic fix: disable sync on the zvol dataset backing the VM storage. It made the graphs look fantastic.
It also turned a crash-consistent workload into a “good luck” workload. A week later, an unrelated host panic forced a reboot.
A database came back with recovery errors that were expensive to triage and impossible to fully “prove” after the fact.
They rolled back the “optimization,” removed the SATA SLOG, and later installed a mirrored PLP NVMe SLOG. The real fix was not
“faster SSD.” It was stable latency and correct semantics, plus a clear policy: no one disables sync without a signed risk acceptance.
Mini-story #3: The boring but correct practice that saved the day
Another org ran ZFS for NFS and iSCSI with a mix of HDD mirrors and SSD mirrors. They had a SLOG—but it was mirrored, and it was made of
enterprise drives with power-loss protection. The choice looked extravagant compared to a cheap SATA SSD. It wasn’t.
What made them different wasn’t just hardware; it was process. They treated the SLOG like a durability component, not a performance toy.
They tracked SMART wear. They ran monthly fault-injection drills during a maintenance window: pull one SLOG device, confirm the pool stays
healthy, confirm performance remains acceptable, replace, resilver, repeat.
One day a firmware bug caused one SLOG device to start throwing errors. The monitoring fired early—before users screamed. They offlined the
device cleanly, replaced it, and kept running on the remaining mirror leg. No data loss. Minimal latency impact. A ticket, a swap, and done.
The incident never became a story inside the company because it never became dramatic. That’s the point. Boring is a feature in storage.
Common mistakes: symptom → root cause → fix
1) Symptom: “Adding SLOG made NFS slower”
Root cause: The SATA SLOG has worse fsync latency than your in-pool ZIL path, especially under load or garbage collection.
Fix: Remove the SLOG and retest; if you need SLOG, replace with low-latency PLP device (preferably mirrored).
2) Symptom: “Latency spikes every few minutes”
Root cause: SSD firmware housekeeping, SLC cache exhaustion, or TXG-related bursts interacting with a saturated SLOG.
Fix: Observe iostat -x for SLOG %util and await. If the SLOG is pegged, replace or remove it.
3) Symptom: “It’s fast in benchmarks, users still complain”
Root cause: You benchmarked throughput, but users feel tail latency. Sync workloads care about the slowest 1%.
Fix: Use fio --fsync=1 with low iodepth and track max latency; solve tail latency, not peak bandwidth.
4) Symptom: “We disabled sync and everything got better”
Root cause: You removed the requirement to make writes durable before ack. Performance “improves” because you changed the contract.
Fix: Re-enable sync for datasets that need correctness; deploy proper SLOG or redesign workload (e.g., local caching, application-level journaling).
5) Symptom: “After power event, application data is inconsistent”
Root cause: Non-PLP SLOG lost acknowledged writes, or drive lied about flush. ZFS cannot replay what never persisted.
Fix: Stop using consumer SATA as SLOG. Use PLP devices; mirror the SLOG; review UPS and write-cache policy.
6) Symptom: “Pool imports fine, but last transactions are missing”
Root cause: Sync acknowledgments were made based on SLOG that didn’t persist.
Fix: Treat as a durability incident. Audit dataset sync history and SLOG hardware. Replace with PLP mirror, then validate recovery procedures.
7) Symptom: “SLOG drive keeps dropping from SATA bus”
Root cause: Cabling/backplane issues, power instability, or consumer SSD firmware not happy with sustained flush patterns.
Fix: Fix hardware path (cables, HBA firmware, power), then stop using that model as log device. A flaky SLOG is worse than no SLOG.
8) Symptom: “SLOG wear climbs fast”
Root cause: High sync write rate + write amplification; consumer endurance is insufficient.
Fix: Use enterprise endurance drives for log, size appropriately, and monitor wear; consider workload changes to reduce forced sync (where safe).
Joke #2: Disabling sync to “fix” SLOG latency is like removing the smoke alarm because it’s too loud.
Checklists / step-by-step plan
Step-by-step: decide whether you should have a SLOG at all
- List the consumers of the dataset. NFS? VM images? Databases? Identify who issues sync writes.
- Check dataset properties. If sync=disabled is present anywhere, flag it as a risk item.
- Measure sync write latency without SLOG. If it’s already acceptable, don’t add complexity.
- If you need a SLOG, define the contract. Are you accelerating sync writes for correctness, or masking a design issue?
Step-by-step: if you already installed a SATA SSD SLOG
- Identify the device and model. If it’s consumer, assume “not PLP.”
- Check if it’s mirrored. If not mirrored, document the risk and prioritize remediation.
- Observe under load. Watch iostat -x and zpool iostat -v during sync-heavy periods.
- Run a controlled fio sync test. Track max latency and jitter, not just IOPS.
- Make the call: remove it, or replace with mirrored PLP devices.
Step-by-step: build a correct SLOG setup (the version you won’t regret)
- Pick devices designed for durable low-latency writes. PLP is the headline feature; consistent latency is the hidden one.
- Use two devices as a mirror. If you can’t, accept that you’re choosing “risk” as a feature.
- Use stable device paths. Prefer /dev/disk/by-id/..., not /dev/sdX.
- Test failover. Offline one log device in a maintenance window; confirm the pool stays healthy and latency stays sane (a minimal drill is sketched after this list).
- Monitor wear and errors. Set alerts for SMART wear, media errors, and bus resets.
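A minimal failover drill, reusing the example device names from Task 15; run it in a maintenance window and watch sync latency while one leg is offline:
cr0x@server:~$ sudo zpool offline tank nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0A
cr0x@server:~$ sudo zpool status tank | grep -A 3 logs
cr0x@server:~$ sudo zpool online tank nvme-INTEL_SSDPE2KX010T8_PHBT1234001A1P0A
The pool keeps serving I/O with the log mirror degraded, sync writes flow through the surviving leg, and the re-onlined device catches up almost immediately.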
Operational checklist: things to document so future-you doesn’t suffer
- Which datasets require sync semantics and why (NFS exports, VM stores, database volumes).
- The expected behavior if the SLOG fails (what risk exists, what alerts fire, what runbook steps are).
- How to remove/replace the SLOG safely.
- Who is allowed to change sync properties and under what approval.
FAQ
1) Will a SLOG speed up all writes on ZFS?
No. It only helps synchronous writes. Async writes bypass it and go through normal TXG buffering and commit.
2) How do I know if my workload is sync-heavy?
Look for heavy writes on the log vdev via zpool iostat -v. Confirm dataset sync settings.
For NFS and many VM stacks, assume significant sync unless you’ve verified client/server settings.
3) Is using a consumer SATA SSD as SLOG always wrong?
For non-critical homelab experimentation, you can do it. For production where data correctness matters, it’s usually the wrong bet.
The risk is durability and latency spikes, not just raw speed.
4) What’s the difference between ZIL and SLOG?
ZIL is the mechanism. SLOG is a dedicated device where ZIL records are stored. Without SLOG, the ZIL lives on the pool.
5) Should SLOG be mirrored?
If you care about acknowledged sync writes surviving device failure, yes. A single SLOG device is a single point of “those writes are gone.”
6) If the SLOG dies, do I lose the whole pool?
Usually no; the pool can keep running or import without the log device depending on failure timing and configuration. The real danger is loss of the most recent acknowledged synchronous writes.
7) Why not just set sync=disabled and move on?
Because you’re changing the storage contract. Databases, VM filesystems, and NFS clients may believe data is safe when it isn’t.
That’s how you get “everything looked fine” followed by post-crash inconsistency.
8) How big does a SLOG need to be?
Often smaller than people think. You’re storing short-lived log records until TXG commit, so a common rule of thumb is a few seconds’ worth
of your peak sync write ingest (a couple of TXG intervals), which for most setups lands in the low tens of gigabytes. Extra capacity still helps
with overprovisioning and endurance, but latency consistency and PLP matter more than capacity.
9) Is NVMe always better for SLOG?
NVMe tends to have better latency characteristics and queueing, but “NVMe” is not a guarantee of PLP or consistent behavior.
You still choose models known for durable flush behavior and stable tail latency.
10) Could my issue be ARC or RAM, not SLOG?
Yes. If reads are thrashing and the system is memory-starved, everything gets slow and you’ll misattribute it to the log.
That’s why the fast diagnosis starts with proving it’s a sync-write bottleneck.
Conclusion: next steps you can act on today
A SATA SSD SLOG is tempting because it’s cheap and easy. That’s also why it’s such a reliable source of production pain:
it changes durability semantics, concentrates sync latency onto a single device, and exposes the ugly corners of consumer SSD behavior.
Practical next steps:
- Run the fast diagnosis playbook. Confirm you have a sync problem before you buy hardware or flip properties.
- If you already have a consumer SATA SLOG, measure it. If it’s pegged or spiky, remove it and retest. Don’t guess.
- If you need a SLOG for NFS/VM correctness, do it properly. Use power-loss protected devices, and mirror them.
- Stop treating sync=disabled like a tuning knob. Treat it like a risk acceptance that requires adult supervision.
The goal isn’t maximal benchmark numbers. The goal is stable latency and reliable semantics—especially on the worst day, when power flickers,
a drive misbehaves, and your job becomes explaining reality to people who were promised safety.