ZFS logbias: Latency vs Throughput—Pick What You Actually Need

ZFS has a reputation: rock-solid integrity, sensible defaults, and just enough knobs to let you either prevent an outage—or cause one. logbias is one of those knobs. It’s not glamorous, and it won’t fix a fundamentally slow pool. But it can absolutely change the shape of your performance, especially when you care about synchronous writes.

If you run databases, VM storage, NFS exports, or anything that uses fsync() like it’s getting paid by the call, your real question isn’t “How fast is ZFS?” It’s “What do I want: lower latency per sync, or more throughput over time?” logbias is where you tell ZFS what you actually need—rather than what you hope will happen.

What logbias actually is (and what it is not)

logbias is a per-dataset (or per-zvol) property that influences how ZFS handles synchronous writes: whether it should preferentially use the ZFS Intent Log (ZIL) path for low-latency acknowledgment, or whether it should bias toward pushing the data into the main pool more directly for better aggregate throughput.

Two values matter in practice:

  • logbias=latency (default): prioritize low latency for synchronous writes; use the ZIL/SLOG path aggressively.
  • logbias=throughput: try to avoid “double-writing” patterns and reduce dependence on the log device for large sync writes; push more into the main pool when reasonable.

What it is not:

  • Not a magic “make my pool fast” flag. If your pool is overloaded or has terrible random write behavior, logbias won’t redeem it.
  • Not the same as sync=disabled. logbias doesn’t change correctness semantics. sync=disabled does.
  • Not a substitute for a proper SLOG device (if your workload needs it), nor for proper record sizing or database tuning.

One-sentence operational summary: logbias is about where ZFS prefers to pay the cost of synchronous write acknowledgment—on the log path now, or in the main pool soon—without lying to applications.
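
In practice this is a one-line property change per dataset. A minimal sketch, assuming a pool named tank with a database dataset and a backup dataset (adjust the names to your environment):

cr0x@server:~$ sudo zfs get logbias tank/db tank/backups        # inspect current values
cr0x@server:~$ sudo zfs set logbias=throughput tank/backups     # bias only the bulk dataset

Task 1 and Task 8 further down walk through the same commands with example output and interpretation.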

Joke #1: Setting logbias=throughput without measuring is like “optimizing” a meeting by removing the agenda—technically faster, spiritually disastrous.

A practical mental model: ZIL, SLOG, TXGs

If you’ve been around ZFS long enough, you’ve heard these acronyms thrown around like they’re self-evident. They’re not. Let’s make them useful.

ZIL: the intent log, always present

The ZFS Intent Log (ZIL) exists for one job: ensure ZFS can safely acknowledge synchronous writes. When an application issues a sync write (or does an fsync()), ZFS needs to commit enough information to stable storage so that, after a crash, it can replay those operations and keep application-level guarantees.

Important: the ZIL is not a write cache for everything. It’s a log of recent synchronous transactions, and it’s only used for recovery after an unclean shutdown. Under normal operation, data eventually lands in the main pool as part of transaction group (TXG) commits.

SLOG: a separate device for the ZIL, optional but powerful

SLOG is the name operators use for a separate log device attached via log vdev(s) (e.g., an enterprise SSD or NVMe). When present, ZFS writes ZIL records to that device instead of placing them in the main pool.

This is the critical nuance that causes so many production arguments: SLOG is about latency and IOPS for synchronous writes, not about making async writes faster, and not about increasing total pool bandwidth for sequential workloads.

TXGs: the batching system that makes ZFS efficient

ZFS groups modifications into TXGs and periodically commits them to disk. The default cadence is on the order of seconds (implementation-dependent; the operational point is “batched”). This batching is where ZFS gets much of its performance: it can reorder, coalesce, and write efficiently.

Synchronous writes interrupt the party. ZFS still wants to batch, but it must also provide a stable acknowledgment. The ZIL/SLOG is the compromise: “I’ll log enough to survive a crash now, then I’ll merge everything into the main pool in a sane batch later.”
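
On OpenZFS-on-Linux builds, the TXG commit interval is exposed as a module parameter, and reading it is a quick way to confirm the cadence on your system. This is a read-only check, no tuning implied; 5 seconds is the long-standing default, but commits can also fire earlier under dirty-data pressure, so treat it as an upper bound rather than a fixed clock:

cr0x@server:~$ cat /sys/module/zfs/parameters/zfs_txg_timeout
5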

The “double write” you should actually care about

In many sync-heavy workloads, you effectively write data twice:

  1. Write log records to ZIL (on SLOG or pool) so you can acknowledge the sync.
  2. Later, write the real blocks to their final location during TXG commit.

This is not “waste”; it’s how ZFS provides correctness and performance. But it means your fastest device can become your bottleneck in a very specific way: a tiny SLOG doing tiny sync writes can cap your entire application’s transaction rate.
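
You can usually watch both streams directly: during a sync-heavy burst, the log vdev shows write activity immediately, and the data vdevs absorb the corresponding TXG writes a few seconds later. A minimal observation sketch, assuming a pool named tank that has a log vdev (Task 4 below shows what typical output looks like):

cr0x@server:~$ sudo zpool iostat -v tank 5      # compare the logs section against the data vdevs over a few intervals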

Latency vs throughput: what logbias changes

Here’s the operator-grade explanation: ZFS has to decide how to handle synchronous writes, particularly large ones. Logging large amounts of data to the ZIL can be expensive and can create a second stream of writes that your system must later absorb again. In those situations, it can be smarter to treat “sync large write” more like “I should just get this into the pool promptly” rather than “I should stuff all of this through the log device.”

logbias is the hint that steers that decision.

logbias=latency (default): make sync acknowledgments fast

When your workload issues sync writes frequently and cares about the response time of each transaction—databases with durable commits, NFS with sync semantics, VM disks with barriers—latency is king.

With logbias=latency:

  • ZFS is more willing to push synchronous write data into the ZIL/SLOG path.
  • If you have a good SLOG (low latency, power-loss protection), you’ll often see dramatically lower commit latency.
  • Your pool can remain relatively calm because the SLOG absorbs the sync storm, and the pool catches up in TXG writes.

What can go wrong: if your SLOG is slow, consumer-grade, lacks PLP, or attached over a bus that’s already saturated, you have created a single point of performance failure. The pool might be fine; the sync acknowledgment path is not.
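
If you are not sure what your log device actually is, identify it before arguing about it. A quick sketch, assuming the log sits on /dev/nvme0n1 (your device name will differ); there is no single flag that reports PLP, so you still have to check the vendor datasheet for the model you find:

cr0x@server:~$ sudo zpool status tank | grep -A3 logs     # which devices back the log vdev
cr0x@server:~$ sudo smartctl -i /dev/nvme0n1              # model and firmware, to match against the datasheet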

logbias=throughput: reduce log dependence, favor streaming into the pool

logbias=throughput is typically used for workloads that generate synchronous writes but where the latency per write is less important than the overall throughput—think large sequential writes that are marked synchronous because of application behavior, virtualization layers, or conservative export settings.

With logbias=throughput:

  • ZFS is encouraged to avoid pushing large synchronous writes through the ZIL, because that can turn into “log it now, write it again later” at scale.
  • The main pool does more of the heavy lifting directly, which can be better if your pool is wide (many vdevs) and your SLOG is a relatively small device.

What can go wrong: you can shift the bottleneck from SLOG latency to pool write latency. If the pool is made of slow spinning disks, or if it’s already fragmented, or if you have heavy read pressure, pushing more sync data “straight to the pool” can increase application-visible latency. In other words: you got throughput, and paid for it in tail latency.

What logbias does not fix

Some problems are upstream and will laugh at your property tweaks:

  • Mis-sized recordsize/volblocksize for the workload.
  • VM guests doing small random sync writes to thin-provisioned zvols with compression off and no TRIM.
  • NFS exports with sync behavior that forces every write to be durable before returning, combined with a mediocre SLOG.
  • A pool that is simply out of IOPS headroom.
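
Before reaching for logbias, audit these upstream settings in one pass. A minimal sketch, assuming a pool named tank; recordsize applies only to filesystems and volblocksize only to zvols, so some rows will simply show a dash:

cr0x@server:~$ sudo zfs get -r recordsize,volblocksize,compression,sync,logbias tank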

Joke #2: The SLOG is like a nightclub bouncer—if you hire someone slow, it doesn’t matter how big the dance floor is; nobody gets in.

Facts & history you can use in meetings

These are the kind of short, concrete points that help you cut through cargo-cult tuning in design reviews.

  1. ZFS was designed around transactional semantics, not “write it in place immediately.” TXGs are fundamental, not an add-on.
  2. The ZIL exists even without a SLOG. If you don’t add a log vdev, the ZIL records land on the main pool devices.
  3. SLOG is only used for synchronous writes. Asynchronous streaming writes won’t suddenly speed up because you added a log device.
  4. The ZIL is normally only read after a crash. Under healthy operation, ZIL data is “write-only” and gets discarded after TXG commit.
  5. Power-loss protection (PLP) is not a performance feature; it’s a correctness feature. Without PLP, “fast” acknowledgment can become “fast data loss.”
  6. Many virtualization and network storage stacks generate sync writes conservatively (barriers, flushes, stable writes). Sometimes the app isn’t paranoid; the stack is.
  7. Early ZFS deployments popularized the “add an SSD SLOG” playbook, but many consumer SSDs on the market lied about flush durability—operators learned the hard way.
  8. Separate log devices can be mirrored, because losing a SLOG during operation can cause a pool panic/offline event on some platforms/configurations—availability matters.
  9. Latency and throughput can move in opposite directions when you change the sync acknowledgment path, because you’re changing which device becomes “the gate.”

Three corporate-world mini-stories (failure, backfire, save)

1) Incident caused by a wrong assumption: “SLOG makes everything faster”

A mid-sized company ran a ZFS-backed NFS service for build artifacts and VM images. They had a performance problem: some clients were complaining about “random pauses.” Someone proposed the classic: “Add a fast NVMe as SLOG; ZFS will fly.” Procurement delivered a consumer NVMe with great sequential benchmarks and a price tag that made Finance smile.

They added it as a single log vdev and went home. For a few days, things looked better—lower average latency on NFS writes, fewer complaint tickets. Then they hit a power event: brief outage, UPS transfer, nothing dramatic. The storage server came back, and within minutes the pool was in trouble. Clients saw I/O errors and hung mounts. The box wasn’t dead; it was worse: it was alive and confused.

The root cause was painfully ordinary. The consumer NVMe did not provide reliable power-loss protection for flush semantics. During the power event, log records that were acknowledged as durable were not actually durable. ZFS did exactly what it should: it refused to proceed safely when the log chain didn’t make sense. The team assumed “fast SSD = good SLOG,” and production disagreed.

The fix was not mystical tuning. They replaced the SLOG with an enterprise device with PLP and mirrored it. Then they did the boring part: documented that “SLOG is not a cache; it is a promise.” They also added monitoring to detect log device latency and error rates before it turned into a service-wide incident.

2) Optimization that backfired: “logbias=throughput everywhere”

Another environment: a virtualization cluster using zvols for VM disks over iSCSI. The storage team observed that the SLOG was busy, and some VMs had mediocre write throughput. Someone read that logbias=throughput can improve performance for large sync writes and decided to apply it across the board: all datasets, all zvols, no exceptions. The change window was small and the motivation was understandable: fewer changes means fewer mistakes.

The next day, the helpdesk tickets were not about throughput. They were about “VM feels slow,” “database commits sometimes stall,” and the one that gets your attention: “payment API timing out intermittently.” The graphs looked fine in averages. The tail latencies did not.

What happened was architectural. Some VMs were doing large sequential writes (backup jobs) and indeed saw improved throughput. But the noisy neighbors were not the issue. The sensitive workloads were small, sync-heavy databases. By biasing away from the log path, the system pushed more sync work to the main pool. The pool could do it, but not at low latency when mixed with read load and periodic scrub activity. Commit latency spiked, then cascaded into application timeouts.

They rolled back selectively: logbias=latency for database zvols, logbias=throughput for backup and bulk-ingest volumes. Then they set performance SLOs that explicitly tracked p95/p99 commit latency for sync writes, not just MB/s. The optimization wasn’t wrong; the assumption that one setting fits all workloads was.

3) A boring but correct practice that saved the day: “measure sync latency and mirror the SLOG”

A financial-services shop (the kind that treats storage like a first-class product) ran ZFS for a set of internal PostgreSQL clusters. They’d been bitten before by “fast but flimsy” hardware, so their practice was almost dull: mirrored enterprise SLOG devices with PLP, and a dashboard that tracked ZIL-related metrics and sync write latency.

One afternoon, application teams started reporting slightly elevated transaction latency. Not an outage—just the early warning signs that usually get ignored until they become a weekend. The SRE on call checked the dashboard and noticed a change: the log vdev’s write latency had crept up, while pool latency was normal. That’s a very specific smell.

They pulled system logs and saw intermittent media errors on one of the SLOG devices. Because the log vdev was mirrored, the pool stayed healthy and the service stayed up. Because they measured the right thing (sync latency), they caught it before it became an incident. They replaced the failing device during business hours with no drama, which is the best kind of incident report: the one nobody writes.

This is the part that’s hard to sell in budget meetings: the boring practice didn’t “increase performance.” It prevented performance from becoming availability.

Fast diagnosis playbook

This is the “you have 15 minutes before the escalation call” sequence. The goal is to identify whether your bottleneck is the application semantics, the log path, or the main pool.

First: confirm you’re actually dealing with synchronous writes

  1. Check dataset/zvol properties: sync, logbias.
  2. Check client/protocol behavior: NFS mounts, database durability settings, VM write barriers.
  3. Look for symptoms: high latency at low throughput, lots of small writes, fsync-heavy patterns.

Second: determine whether the log path (SLOG) is the gate

  1. If you have a SLOG, check log vdev latency and errors.
  2. Compare app commit latency to SLOG device latency. If they track, you found your choke point.
  3. Check whether the SLOG is saturating (IOPS, queue depth) or stalling (flush latency).

Third: if not SLOG, the pool is your bottleneck (or the CPU is)

  1. Check pool write latency and utilization. If your pool vdevs are near 100% busy, you’re out of headroom.
  2. Check ARC pressure and read amplification; heavy reads can starve writes.
  3. Check compression, checksumming, and encryption overhead if CPU is pegged.
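
Condensed into commands, the three checks above look roughly like this. A minimal sketch, assuming a pool named tank; each line maps to one step of the playbook, and the Tasks section below covers each with example output:

cr0x@server:~$ sudo zfs get -r -s local sync,logbias tank     # step 1: what sync policy is actually in effect?
cr0x@server:~$ sudo zpool iostat -v -l tank 5                 # step 2: is latency accumulating on the logs section or the data vdevs?
cr0x@server:~$ iostat -x 5                                    # step 3: are the pool devices out of headroom?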

Decision point

If synchronous write latency is the issue:

  • Prefer logbias=latency with a proper SLOG for latency-sensitive datasets.
  • Consider logbias=throughput only where the workload is large and sequential and where extra pool writes won’t violate latency SLOs.

Hands-on tasks: commands, outputs, interpretation

These are real tasks you can run on a typical OpenZFS-on-Linux system. Adjust pool/dataset names to match your environment. The point is not to run commands for sport; it’s to turn logbias from folklore into a measured choice.

Task 1: Inspect current logbias and sync on a dataset

cr0x@server:~$ sudo zfs get -o name,property,value,source logbias,sync tank/db
NAME      PROPERTY  VALUE     SOURCE
tank/db   logbias   latency   local
tank/db   sync      standard  inherited from tank

Interpretation: This dataset is explicitly set to logbias=latency. Sync behavior is standard (honor application requests). If you’re debugging commit latency, this is the baseline you expect.

Task 2: Find where logbias is set across the pool

cr0x@server:~$ sudo zfs get -r -o name,property,value,source logbias tank | grep -v default
tank/db                  logbias   latency     local
tank/backups             logbias   throughput  local

Interpretation: You have mixed intent: databases are latency-biased; backups are throughput-biased. That’s usually healthy, assuming your workloads really differ.

Task 3: Check whether you have a separate log (SLOG) vdev

cr0x@server:~$ sudo zpool status -v tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 02:11:33 with 0 errors on Sun Dec 22 03:10:01 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            sda                     ONLINE       0     0     0
            sdb                     ONLINE       0     0     0
            sdc                     ONLINE       0     0     0
            sdd                     ONLINE       0     0     0
        logs
          mirror-1                  ONLINE       0     0     0
            nvme0n1p2               ONLINE       0     0     0
            nvme1n1p2               ONLINE       0     0     0

Interpretation: You have a mirrored SLOG. This is a solid production posture for sync-heavy workloads: performance plus availability.

Task 4: Watch pool-level IO and latency

cr0x@server:~$ sudo zpool iostat -v tank 1
                              capacity     operations     bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
tank                         3.21T  8.54T    210    460  18.4M  62.1M
  raidz2-0                    3.21T  8.54T    210    460  18.4M  62.1M
    sda                           -      -     35     78  3.1M  10.5M
    sdb                           -      -     37     80  3.3M  10.2M
    sdc                           -      -     34     76  3.0M  10.4M
    sdd                           -      -     35     77  3.1M  10.6M
logs                              -      -      -      -      -      -
  mirror-1                        -      -      0    920    0B  12.4M
    nvme0n1p2                     -      -      0    470    0B   6.2M
    nvme1n1p2                     -      -      0    450    0B   6.1M

Interpretation: Lots of writes going to the log vdev: strong hint you’re dealing with sync-heavy activity. If applications complain about latency, you now know where to look next: log device latency and flush behavior.

Task 5: Pull extended latency stats (where supported)

cr0x@server:~$ sudo zpool iostat -v -l tank 2
                              capacity     operations     bandwidth    total_wait     disk_wait
pool                        alloc   free   read  write   read  write   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----  ----- -----   ----- -----
tank                         3.21T  8.54T    190    510  16.9M  64.8M   3ms  18ms     2ms  15ms
  raidz2-0                    3.21T  8.54T    190    240  16.9M  52.1M   2ms  22ms     2ms  19ms
logs                              -      -      0   1080    0B  12.7M    -   2ms       -   1ms
  mirror-1                        -      -      0   1080    0B  12.7M    -   2ms       -   1ms

Interpretation: Log vdev writes are low-latency (good). Pool writes are slower (22ms). If sync latency is good but throughput is bad, pool may be the limit; if sync latency is bad, log might be the gate. Numbers like these help you separate the two.

Task 6: Check dataset recordsize / volblocksize alignment (often the hidden tax)

cr0x@server:~$ sudo zfs get -o name,property,value recordsize tank/db
NAME     PROPERTY    VALUE
tank/db  recordsize  16K

cr0x@server:~$ sudo zfs get -o name,property,value volblocksize tank/vm-001
NAME         PROPERTY     VALUE
tank/vm-001  volblocksize 8K

Interpretation: Databases often behave better with smaller records (e.g., 8K–16K), depending on DB page size and access pattern. zvols backing VM disks commonly use small volblocksize values such as 8K. Misalignment can amplify IO and make sync storms worse no matter what logbias says.
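
If you decide the record size is wrong, fix it with eyes open. A hedged sketch (the value is an example only; match it to your database page size and access pattern): recordsize affects only blocks written after the change, so existing files keep their old block size until rewritten, and volblocksize on an existing zvol cannot be changed at all; you create a new zvol and migrate.

cr0x@server:~$ sudo zfs set recordsize=16K tank/db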

Task 7: Verify whether the workload is forcing sync at the ZFS layer

cr0x@server:~$ sudo zfs get -o name,property,value sync tank
NAME  PROPERTY  VALUE
tank  sync      standard

Interpretation: standard is normally what you want: honor application requests. If someone set sync=always “for safety,” expect more ZIL traffic. If someone set sync=disabled “for performance,” expect faster benchmarks and more creative incident reports.

Task 8: Change logbias for a specific dataset (safely scoped)

cr0x@server:~$ sudo zfs set logbias=throughput tank/backups
cr0x@server:~$ sudo zfs get -o name,property,value,source logbias tank/backups
NAME         PROPERTY  VALUE       SOURCE
tank/backups logbias   throughput  local

Interpretation: This is a low-risk change if the dataset hosts large sequential writes and you can tolerate higher per-write latency. Don’t do this on the dataset that holds your transaction logs unless you enjoy emergency calls.
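
Keep the rollback next to the change. A minimal sketch: zfs inherit puts the property back to whatever the parent (or the default) provides, which is usually the cleanest way to undo a scoped experiment:

cr0x@server:~$ sudo zfs inherit logbias tank/backups
cr0x@server:~$ sudo zfs get -o name,property,value,source logbias tank/backups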

Task 9: Benchmark sync write latency in a way that resembles reality

Use fio if available. This example issues 4K random writes with an fdatasync() after every write (--fdatasync=1), which approximates the latency a durability-sensitive application feels on commit.

cr0x@server:~$ sudo fio --name=sync4k --directory=/tank/dbtest --rw=randwrite --bs=4k \
  --iodepth=1 --numjobs=1 --size=2G --direct=1 --fdatasync=1 --time_based --runtime=60 --group_reporting
sync4k: (groupid=0, jobs=1): err= 0: pid=22190: Tue Dec 24 10:11:09 2025
  write: IOPS=820, BW=3280KiB/s (3359kB/s)(192MiB/60001msec)
    clat (usec): min=500, max=32000, avg=1215.4, stdev=820.1
    lat (usec): min=510, max=32050, avg=1222.0, stdev=822.0

Interpretation: This tells you what the application feels: ~1.2ms average completion latency, with tails out to 32ms. If you change logbias or SLOG hardware and this doesn’t move, you’re tuning the wrong layer.

Task 10: Compare behavior with a throughput-oriented test

cr0x@server:~$ sudo fio --name=seq128k --directory=/tank/backuptest --rw=write --bs=128k \
  --iodepth=16 --numjobs=4 --size=8G --direct=1 --fsync=0 --time_based --runtime=60 --group_reporting
seq128k: (groupid=0, jobs=4): err= 0: pid=22310: Tue Dec 24 10:13:30 2025
  write: IOPS=2100, BW=262MiB/s (275MB/s)(15.4GiB/60001msec)

Interpretation: This is async-ish throughput. SLOG won’t matter much here, and logbias usually won’t either unless your stack is forcing sync semantics. If your “SLOG upgrade” changed this number, something else changed too.

Task 11: Identify whether NFS clients are forcing synchronous behavior (run these on an NFS client)

cr0x@server:~$ mount | grep nfs
10.0.0.20:/export/vmstore on /mnt/vmstore type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.42)

cr0x@server:~$ nfsstat -m | sed -n '1,6p'
/mnt/vmstore from 10.0.0.20:/export/vmstore
 Flags: rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys

Interpretation: Mount options don’t always tell the whole story; NFS semantics plus application fsync patterns do. But this confirms you’re not accidentally mounted read-only or with tiny rsize/wsize that makes everything feel like molasses.
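
On the server side, it is also worth confirming what the export itself promises. A quick sketch, assuming a Linux kernel NFS server exporting /export/vmstore; look for sync vs async in the option list (sync is the default and requires stable storage before the server replies, async trades that guarantee away):

cr0x@server:~$ sudo exportfs -v | grep vmstore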

Task 12: Check ZFS statistics for ZIL activity (Linux)

cr0x@server:~$ awk 'NR==1 || /zil/ {print}' /proc/spl/kstat/zfs/zil | head
13 1 0x01 122 4880 167920131122 293229773812
zil_commit_count                         4    189223
zil_commit_writer_count                  4    189223
zil_itx_count                            4    812333
zil_itx_indirect_count                   4    1102

Interpretation: If commit counts are climbing rapidly during your incident window, you have a sync workload. Correlate with log device latency. If commits are low but the app is slow, you’re likely fighting something else (pool saturation, CPU, network, guest behavior).

Task 13: Confirm your log devices are not silently dying

cr0x@server:~$ sudo smartctl -a /dev/nvme0n1 | sed -n '1,25p'
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        41 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    2%
Data Units Written:                 12,345,678
Media and Data Integrity Errors:    0
Error Information Log Entries:      0

Interpretation: Media errors and error log entries matter more than “percent used.” A SLOG is hammered by small writes and flushes; a device can look “healthy” until it isn’t. Watch error counters and latency, not just capacity.

Task 14: Validate where your bottleneck lives using iostat

cr0x@server:~$ iostat -x 1 3
Device            r/s     w/s   rkB/s   wkB/s  await  svctm  %util
sda              30.0    75.0   3072   10432   18.5   2.1   98.0
sdb              29.0    76.0   2976   10384   19.0   2.0   97.5
sdc              28.0    74.0   2880   10240   18.9   2.0   97.1
sdd              29.0    75.0   2992   10368   19.2   2.1   98.3
nvme0n1           0.0   480.0      0    6144    1.1   0.2    9.5
nvme1n1           0.0   470.0      0    6016    1.0   0.2    9.2

Interpretation: The HDDs are pegged at ~98% utilization. Even if your SLOG is fast, the pool is saturated, which will eventually bleed into sync behavior via TXG pressure and overall service responsiveness. If your “fix” was only logbias, it won’t survive this reality.

Checklists / step-by-step plan

Checklist: choosing logbias per dataset

  1. Classify the workload: Is it dominated by small sync writes (DB commits, metadata-heavy operations), or large sequential sync writes (bulk ingest with sync semantics)?
  2. Define the success metric: p95/p99 latency for commits, or MB/s for bulk writes. If you can’t name it, you can’t tune it.
  3. Check whether you have a real SLOG: enterprise-grade, PLP, low latency; preferably mirrored for availability.
  4. Set logbias=latency for latency-sensitive datasets: databases, VM metadata-heavy disks, NFS homes with interactive workloads.
  5. Set logbias=throughput for bulk datasets: backups, media ingest, large-file staging—especially when sync semantics are unavoidable.
  6. Benchmark before and after: use a sync-like benchmark for sync workloads, not a streaming write test that ignores fsync.

Step-by-step plan: safe rollout in production

  1. Pick one dataset with a clear workload identity and a clear owner (someone who will confirm success or pain).
  2. Capture baseline metrics (sync latency, IOPS, tail latency, pool utilization, SLOG latency if present).
  3. Change only one variable: set logbias (don’t also change recordsize, compression, and hardware in the same window).
  4. Observe during peak, not just in the maintenance window. Sync problems often show up under contention.
  5. Roll back quickly if tail latencies regress. Keep the rollback command ready.
  6. Document the rationale (what workload, what metric improved, what metric degraded). This stops “tuning amnesia” six months later.

Operational guardrails

  • Never use sync=disabled as a performance workaround on workloads that claim durability requirements. If you must, treat it like a deliberate risk decision with sign-off.
  • Mirror SLOG devices for availability if your platform and risk tolerance demand it. A lost log device can turn into downtime.
  • Separate bulk from latency-sensitive workloads at least at the dataset level, ideally at the pool level if contention is severe.

Common mistakes, symptoms, fixes

Mistake 1: Assuming logbias=throughput is “faster” in general

Symptoms: averages look fine, but p95/p99 latency gets worse; databases time out; VMs “stutter” under load.

Why it happens: you shifted sync work from a fast log device to a busy pool, increasing contention and tail latency.

Fix: set logbias=latency on latency-sensitive datasets; keep throughput for bulk datasets only. Verify with sync benchmarks and tail metrics.

Mistake 2: Buying a “fast SSD” for SLOG without PLP

Symptoms: intermittent pool issues after power events; mysterious I/O errors; ZFS refusing to import cleanly or complaining about log replay.

Why it happens: the device lies (or is ambiguous) about flush durability. Sync acknowledgments become untrustworthy.

Fix: use enterprise devices with power-loss protection; mirror the log vdev; monitor device error logs and latency.

Mistake 3: Forgetting that SLOG does not help async writes

Symptoms: you add SLOG and see no change in large streaming write tests; leadership asks why you “wasted money.”

Why it happens: your workload is mostly asynchronous; the SLOG isn’t on the hot path.

Fix: benchmark the right thing (sync latency) and validate that the application/protocol actually issues sync writes.

Mistake 4: Setting global properties and calling it “standardization”

Symptoms: one team is happier, another is on fire; storage graphs look “okay” but product SLOs fail.

Why it happens: mixed workloads require mixed policy. ZFS gives you per-dataset control for a reason.

Fix: define workload classes and apply properties accordingly: DB vs backup vs VM vs home directories.

Mistake 5: Ignoring pool saturation because “the SLOG is fast”

Symptoms: sync latency is initially good, then degrades over time; periodic storms during scrub/resilver; “random pauses.”

Why it happens: TXG commits still have to land. A saturated pool eventually becomes everyone’s problem.

Fix: add vdevs, redesign layout, reduce fragmentation, separate workloads, or move to faster media. logbias can’t manufacture IOPS.

FAQ

1) Does logbias=throughput disable the ZIL?

No. The ZIL still exists and ZFS still provides synchronous semantics. logbias influences how ZFS prefers to handle certain sync write patterns, especially large ones, but it does not remove correctness guarantees.

2) If I have a SLOG, should I always use logbias=latency?

For latency-sensitive sync workloads, yes, that’s usually the right default. For bulk datasets with large sync writes, logbias=throughput can reduce log pressure and improve aggregate throughput. The right answer is “per dataset, per workload, measured.”

3) Will adding a faster SLOG improve my database throughput?

It can improve transaction rate if the database is gated by fsync() latency and you’re currently bottlenecked on slow stable writes. But it won’t fix poor query plans, insufficient RAM, or a saturated pool. Measure commit latency before buying hardware.

4) What’s the difference between ZIL and SLOG in one sentence?

ZIL is the intent log mechanism that always exists; SLOG is a separate device you optionally provide so ZIL writes land somewhere faster (and ideally safer) than the main pool.

5) Is it safe to run without a SLOG?

Yes, in the sense that correctness still holds: ZIL records will be written to the main pool. Performance may suffer for sync-heavy workloads because the pool devices must handle the synchronous acknowledgment path.

6) Should I mirror my SLOG?

If the service matters and the platform behavior makes loss of a log device disruptive, mirroring is the sane choice. Performance is rarely the reason to mirror; availability is. In production, availability usually wins.

7) Can logbias fix NFS “random pauses”?

Sometimes. If the pauses are due to synchronous write acknowledgment latency and your log device is the bottleneck (or missing), tuning logbias and/or adding a proper SLOG can help. If the pauses are due to pool saturation, network issues, or client-side behavior, it won’t.

8) Should I use sync=disabled instead of messing with logbias?

Only if you are explicitly choosing to risk data loss on power failure or crash, and the application owners agree that durability is optional. For most production systems, sync=disabled is not a tuning option; it’s a policy decision with consequences.

9) How do I know whether my application is doing sync writes?

Look for increasing ZIL commit counters, heavy log vdev write activity, and benchmarks that change drastically when you force fdatasync/fsync. Also check application settings (database durability modes) and protocol semantics (NFS stable writes, virtualization barriers).
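
Two quick shell-level checks, as a hedged sketch; <PID> is a placeholder for your application process, and /proc/spl/kstat/zfs/zil is where OpenZFS on Linux exposes the ZIL counters:

cr0x@server:~$ watch -n 2 'grep -E "zil_commit_count|zil_itx_count" /proc/spl/kstat/zfs/zil'
cr0x@server:~$ sudo strace -c -f -e trace=fsync,fdatasync,sync_file_range -p <PID>

If the counters climb and the syscall summary fills up with fsync/fdatasync calls, the workload is sync-heavy; if both stay quiet while the app feels slow, look elsewhere.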

10) If my SLOG is fast, why is my pool still busy?

Because SLOG only helps you acknowledge synchronous writes quickly. The data still has to be written to its final location during TXG commit. If the pool is undersized, heavily fragmented, or sharing mixed workloads, it can remain the long-term bottleneck.

Conclusion

logbias is not a performance “boost” setting. It’s a statement of priorities. When you set logbias=latency, you’re saying: “I care about how fast you acknowledge sync writes, and I’ve built the system—especially the SLOG—to make that safe and fast.” When you set logbias=throughput, you’re saying: “I care about moving data efficiently, and I’m willing to trade some per-write response time to avoid turning the log path into a second job.”

The systems that run well in production don’t pick one ideology. They pick per workload, measure the right metrics (including tail latency), and keep the boring parts—PLP, mirroring, monitoring—non-negotiable. If you do that, logbias stops being a mysterious property and becomes what it should have been all along: a deliberate choice.
