Docker: Slow writes on overlay2 — when to switch to volumes and why

You notice it during an incident, because that’s when you’re most honest with yourself.
The app is “fine,” CPU is boring, network is calm, but requests are stalling.
Somewhere, a container is writing to disk like it’s chiseling bytes into granite.

Nine times out of ten, the culprit isn’t “Docker is slow.” It’s the default storage driver (overlay2) doing exactly what it was designed to do:
make images and container layers convenient. It was not designed to be fast for write-heavy, fsync-happy workloads.

Overlay2 in one mental model (and why writes hurt)

overlay2 is the Docker storage driver that uses Linux OverlayFS: a union filesystem that layers a writable “upperdir” on top of one or more read-only “lowerdir” layers (your image layers).
When a container reads a file that exists in the image, it reads from the lower layers. When it writes, it writes to the upper layer.

The pain begins at the intersection of copy-on-write and metadata churn.
If a container modifies a file that exists in the lower layer, OverlayFS often needs to “copy up” the file into the upper layer first.
That copy-up is real I/O. It can be large. It also comes with metadata operations that are cheap on paper and expensive at scale.
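
If you want to watch copy-up happen, a quick throwaway experiment works. This is a minimal sketch: the image, the file, the size, and the timestamp are illustrative; any file that ships in the image and isn’t currently executing will do.

cr0x@server:~$ docker run -d --name copyup-demo ubuntu:22.04 sleep 600
cr0x@server:~$ UPPER=$(docker inspect copyup-demo --format '{{.GraphDriver.Data.UpperDir}}')
cr0x@server:~$ sudo ls "$UPPER/usr/bin" 2>/dev/null | wc -l     # nothing has been copied up yet
0
cr0x@server:~$ docker exec copyup-demo sh -c 'echo x >> /usr/bin/perl'   # append one byte to a file from the image layer
cr0x@server:~$ sudo ls -lh "$UPPER/usr/bin/perl"                         # the whole file was copied up for a 1-byte append
-rwxr-xr-x 1 root root 3.8M Jan  3 18:25 /var/lib/docker/overlay2/.../upper/usr/bin/perl
cr0x@server:~$ docker rm -f copyup-demo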

For write-heavy workloads, two patterns are especially punishing:

  • Small random writes with fsync (databases, SQLite, message queues with durability flags). The union layer isn’t the main issue; it’s the extra indirection and metadata work plus how your host filesystem behaves under it.
  • Lots of file creation/deletion (build caches, unpacking archives, language package managers, temp directories). Overlay2 can turn “many tiny writes” into “many tiny writes plus many tiny metadata ops.”

Volumes exist because we eventually admit we’re running stateful workloads. A Docker volume is essentially a directory managed by Docker, mounted into the container directly from the host filesystem (no union layering on that path).
For the mounted path, you bypass copy-on-write overhead and usually reduce write amplification.
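
You can prove the “no union layering on that path” part from inside any container. A minimal check (the volume name scratch is an example; the device and mount options in the output are illustrative):

cr0x@server:~$ docker run --rm -v scratch:/data alpine sh -c 'grep -E " (/|/data) " /proc/mounts'
overlay / overlay rw,relatime,lowerdir=...,upperdir=...,workdir=... 0 0
/dev/nvme0n1p2 /data ext4 rw,relatime 0 0

The container’s root is an overlay mount; the volume path is a plain mount of the host filesystem.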

The decision isn’t philosophical. It’s mechanical: if the data is mutable and performance-sensitive, stop storing it in the container layer.
Keep the container filesystem for binaries and configuration; keep your state somewhere boring and direct.

Interesting facts & historical context

Short, concrete bits that explain why we ended up here:

  1. Docker didn’t start with overlay2. Early Docker used AUFS heavily; it was fast-ish and featureful, but not upstream Linux-friendly. The shift toward OverlayFS was partly about being “in-kernel” and maintainable.
  2. Device mapper era scars are real. The devicemapper storage driver (loop-lvm especially) caused spectacular write latency and “why is my disk full?” moments. overlay2 became the default for good reasons.
  3. OverlayFS matured over multiple kernel releases. Features like multiple lower layers and performance fixes arrived iteratively. Your kernel version matters more than people want to admit.
  4. XFS d_type matters. OverlayFS requires directory entry file type support (d_type). On XFS, that’s controlled by ftype=1. Get it wrong and Docker will warn—or worse, you’ll get weird behavior/performance.
  5. Copy-up is not hypothetical. Editing a file that was in the image layer triggers a copy of that file into the container’s upperdir. For big files, you pay immediately.
  6. Whiteouts are how deletions work in union filesystems. When you delete a file that exists in a lower layer, OverlayFS records a “whiteout” in upperdir. That’s extra metadata, and it accumulates (see the short demo after this list).
  7. Journaling filesystems trade latency for safety. ext4 and XFS have different default behaviors; add fsync-heavy apps and you can amplify the pain. Overlay2 doesn’t erase that; it can magnify it.
  8. Container logs aren’t special. If you log to a file inside the container layer, you’re writing into overlay2. If you log to stdout, Docker’s logging driver writes somewhere else—still on disk, but different path and different failure modes.
  9. “It’s fast on my laptop” is often page cache, not throughput. Overlay2 can look great until you hit sync writes, memory pressure, or a node with noisy neighbors.
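
Fact 6 is easy to see for yourself. A minimal sketch (container name is an example; the timestamp is illustrative): delete an image-owned file, then look at the container’s upperdir. The deletion shows up as a character device with major/minor 0,0.

cr0x@server:~$ docker run -d --name whiteout-demo alpine sleep 600
cr0x@server:~$ docker exec whiteout-demo rm /etc/motd
cr0x@server:~$ sudo ls -l "$(docker inspect whiteout-demo --format '{{.GraphDriver.Data.UpperDir}}')/etc"
total 0
c--------- 1 root root 0, 0 Jan  3 18:30 motd
cr0x@server:~$ docker rm -f whiteout-demo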

What “slow writes” looks like in production

overlay2 slowness is rarely a single smoking gun. It’s a set of symptoms that rhyme:

  • P99 latency climbs while CPU stays polite. The app threads block on I/O.
  • Database checkpoints or compactions take longer inside containers than on the host.
  • “Disk is not full” but writes stall: you’re actually out of inodes, stuck behind journal contention, or throttled by the underlying block device queue.
  • Node-level iowait spikes without a corresponding “big throughput” story. That’s a classic sign of sync-heavy small writes.
  • Container restarts get slower over time if the writable layer accumulates lots of files and whiteouts. Metadata walks don’t age gracefully.

The tricky part: overlay2 isn’t always the bottleneck. The bottleneck might be your block storage, your filesystem mount options, your kernel,
or the fact that you put a database WAL on a union filesystem and then asked it to fsync like its job depends on it (it does).

Fast diagnosis playbook

When you’re on-call, you don’t want a 40-step investigation. You want a short ladder that gets you to “move data to a volume” or “this is the underlying disk” quickly.

First: confirm what’s writing and where

  • Identify top writers at the host level (process, device).
  • Map the container PID to a container name.
  • Determine whether the hot path is inside /var/lib/docker/overlay2 or a mounted volume/bind mount.

Second: decide if it’s sync-write latency or throughput

  • If await is high and %util is high: the device is saturated or queueing.
  • If await is high but %util is moderate: you might be paying per-operation latency (fsync, journal locks, metadata storms).
  • If the app is calling fsync constantly: you’re in the “latency game,” not the “MB/s game.”

Third: check for overlay2-specific amplifiers

  • Copy-up triggers (writing to files that came from the image layer).
  • Huge numbers of files in the writable layer (inode pressure, directory traversal cost).
  • Underlying filesystem mismatch (XFS ftype, weird mount options).

Fourth: choose the smallest safe change

  • If it’s mutable data: move it to a volume or bind mount. Prefer volumes for operational hygiene.
  • If it’s ephemeral build/cache data: consider tmpfs if it fits, or accept slower writes but stop persisting it.
  • If it’s the underlying disk: fix the disk story (IOPS, latency, scheduler, storage class). overlay2 is just the messenger.

Practical tasks: commands, outputs, decisions (12+)

These are the tasks you actually run at 2 a.m. Each has: a command, what typical output means, and what decision you make next.
Commands assume a Linux host running Docker Engine with overlay2.

Task 1: Confirm Docker is using overlay2 (and on what backing filesystem)

cr0x@server:~$ docker info --format '{{.Driver}} {{.DockerRootDir}}'
overlay2 /var/lib/docker

What it means: Storage driver is overlay2; Docker data root is /var/lib/docker.

Decision: All container writable layers live under that root unless you moved it. That’s where you look for heat.

Task 2: Check backing filesystem type and mount options for Docker root

cr0x@server:~$ findmnt -no SOURCE,FSTYPE,OPTIONS /var/lib/docker
/dev/nvme0n1p2 ext4 rw,relatime,errors=remount-ro

What it means: Docker root is on ext4, with typical options.

Decision: If this is network storage or slow HDD, stop blaming overlay2 and start blaming physics. If it’s XFS, verify ftype=1 (Task 3).

Task 3: If using XFS, verify d_type support (ftype=1)

cr0x@server:~$ xfs_info /dev/nvme0n1p2 | grep ftype
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1

What it means: OverlayFS can work correctly. ftype=0 is a red flag.

Decision: If ftype=0, plan a migration to a properly formatted filesystem. Don’t “tune around” a structural mismatch.
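
There is no remount toggle for ftype; it is set at mkfs time. A sketch of the rebuild, assuming you have already evacuated /var/lib/docker and can restore or re-pull afterwards (device name is illustrative, and this destroys the filesystem):

cr0x@server:~$ sudo systemctl stop docker
cr0x@server:~$ sudo mkfs.xfs -f -n ftype=1 /dev/nvme0n1p2    # recent xfsprogs default to ftype=1; being explicit documents intent
cr0x@server:~$ sudo mount /dev/nvme0n1p2 /var/lib/docker
cr0x@server:~$ sudo systemctl start docker                   # then restore volumes from backup and re-pull images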

Task 4: Find top block devices and whether they’re saturated

cr0x@server:~$ iostat -xmz 1 5
Linux 6.2.0 (server) 	01/03/2026 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.10    0.00    3.40   22.80    0.00   61.70

Device            r/s     w/s   rMB/s   wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  %util
nvme0n1          10.0   980.0     0.5    12.0     0.0    20.0    0.0    2.0    1.20   18.50  20.10     52.0     12.5   89.0

What it means: Writes dominate; w_await is high and %util is near saturation. The device is queueing.

Decision: This is not purely overlay2. You either need more IOPS/lower latency storage, or fewer sync writes, or to isolate workloads.

Task 5: Identify which process is doing the I/O

cr0x@server:~$ sudo iotop -oP
Total DISK READ:         0.00 B/s | Total DISK WRITE:      25.30 M/s
  PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
23144 be/4  999        0.00 B/s   18.20 M/s  0.00 %  35.00 % postgres: wal writer process
19872 be/4  root       0.00 B/s    5.10 M/s  0.00 %  10.00 % dockerd --host=fd://

What it means: A database process (likely inside a container) is pushing WAL writes; dockerd is also active.

Decision: Map PID to container and check where its data directory lives. Databases on overlay2 are a known self-own.

Task 6: Map a PID to a container

cr0x@server:~$ ps -o pid,cmd -p 23144
  PID CMD
23144 postgres: wal writer process

cr0x@server:~$ sudo cat /proc/23144/cgroup | grep -E 'docker|kubepods' | head -n 1
0::/docker/5c3b1f2d0b0a8b3d5a2d6c5d8c1f0e9a7b6c5d4e3f2a1b0c9d8e7f6a5b4c3

What it means: The process belongs to a Docker container with that ID prefix.

Decision: Inspect that container’s mounts (Task 7). If PGDATA is not a volume/bind mount, fix that.

Task 7: Check container mounts and confirm whether data is on overlay2

cr0x@server:~$ docker inspect 5c3b1f2d0b0a --format '{{range .Mounts}}{{println .Destination .Type .Source}}{{end}}'
/var/lib/postgresql/data volume /var/lib/docker/volumes/pgdata/_data

What it means: The data directory is a Docker volume (good). If you saw no mount for PGDATA, it would be inside overlay2 (bad).

Decision: If it’s already a volume and still slow, the bottleneck is likely underlying storage or fsync patterns, not overlay2 layering.

Task 8: Prove whether writes hit overlay2 paths

cr0x@server:~$ sudo lsof -p 23144 | grep overlay2 | head
postgres 23144 999  cwd    DIR  8,2     4096  131081 /var/lib/docker/overlay2/9d2f.../merged/var/lib/postgresql/data

What it means: If you see open files under /var/lib/docker/overlay2/.../merged, that process is operating on the union mount.

Decision: Move that path to a volume. If it’s a database, don’t debate it—do it.

Task 9: Check inode exhaustion (sneaky “disk full”)

cr0x@server:~$ df -hi /var/lib/docker
Filesystem      Inodes  IUsed   IFree IUse% Mounted on
/dev/nvme0n1p2    20M    19M     1M   95% /var/lib/docker

What it means: You’re close to running out of inodes. Writes may fail or degrade as the FS struggles.

Decision: Clean up images/layers, move high-churn directories to volumes, and consider a filesystem with more inodes or different layout for Docker root.
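
Before pruning blindly, see what is actually reclaimable. A minimal sketch; the sizes are illustrative and the one-week retention filter is an assumption you should adapt to your own policy:

cr0x@server:~$ docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          42        9         61.2GB    48.7GB (79%)
Containers      11        9         18.4GB    2.1GB (11%)
Local Volumes   6         5         120GB     8.2GB (6%)
Build Cache     310       0         22.5GB    22.5GB

cr0x@server:~$ docker image prune -a --filter "until=168h"   # unused images older than a week; prompts before deleting
cr0x@server:~$ docker builder prune                          # build cache is a classic inode sink on CI hosts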

Task 10: Find big writable layers and churners

cr0x@server:~$ docker ps --format 'table {{.Names}}\t{{.ID}}'
NAMES                 ID
api-1                  2f1c9b8c0d3a
worker-1               8a7d6c5b4e3f
postgres-1             5c3b1f2d0b0a

cr0x@server:~$ docker container inspect api-1 --format '{{.GraphDriver.Data.UpperDir}}'
/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper

cr0x@server:~$ sudo du -sh /var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper
18G	/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper

What it means: The container’s writable layer is huge. That usually means someone is writing data that should be a volume, or caching aggressively inside the container filesystem.

Decision: Identify which directories are growing and move them to volumes/bind mounts (or tmpfs). Treat a giant upperdir as an operational smell.
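
To see which directories inside that writable layer are growing, and whether the problem is bytes or file count, something like this helps (the layer path is carried over from the task above; output is illustrative):

cr0x@server:~$ sudo du -xh --max-depth=2 /var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper | sort -h | tail -n 4
1.1G	/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper/tmp
2.3G	/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper/var/log
14G	/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper/app/cache
18G	/var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper

cr0x@server:~$ sudo find /var/lib/docker/overlay2/1a2b3c4d5e6f7g8h9i0j/upper -xdev -type f | wc -l   # file count = inode pressure
1743219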

Task 11: Confirm whether your workload is fsync-bound

cr0x@server:~$ sudo strace -p 23144 -e trace=fdatasync,fsync -tt -T -f
18:20:11.102938 fdatasync(7)            = 0 <0.024981>
18:20:11.128201 fdatasync(7)            = 0 <0.031442>

What it means: Each sync call costs ~25–30ms. That’s brutal for database throughput. overlay2 may add overhead, but the real enemy is sync latency.

Decision: Put the WAL/data on the fastest storage you can justify, and avoid union FS paths. If you can’t change storage, reduce fsync frequency only if your durability model allows it.

Task 12: Benchmark overlay2 path vs a volume path (A/B test, not vibes)

cr0x@server:~$ docker run --rm -it alpine sh -lc 'apk add --no-cache fio >/dev/null && fio --name=randwrite --directory=/tmp --size=512m --bs=4k --rw=randwrite --iodepth=1 --direct=1 --numjobs=1 --runtime=20 --time_based --fsync=1'
randwrite: (groupid=0, jobs=1): err= 0: pid=23: Fri Jan  3 18:20:45 2026
  write: IOPS=120, BW=480KiB/s (492kB/s)(9600KiB/20001msec)
    clat (usec): min=2000, max=65000, avg=8000.00, stdev=5000.00

cr0x@server:~$ docker volume create fiotest
fiotest

cr0x@server:~$ docker run --rm -it -v fiotest:/data alpine sh -lc 'apk add --no-cache fio >/dev/null && fio --name=randwrite --directory=/data --size=512m --bs=4k --rw=randwrite --iodepth=1 --direct=1 --numjobs=1 --runtime=20 --time_based --fsync=1'
randwrite: (groupid=0, jobs=1): err= 0: pid=23: Fri Jan  3 18:21:15 2026
  write: IOPS=320, BW=1280KiB/s (1311kB/s)(25600KiB/20002msec)
    clat (usec): min=1500, max=32000, avg=3500.00, stdev=2000.00

What it means: Volume path delivers materially better IOPS and lower latency for sync-heavy 4k writes.

Decision: If your real workload resembles this pattern (databases, queues), move it off overlay2.

Task 13: Check Docker’s logging driver and log path impact

cr0x@server:~$ docker info --format '{{.LoggingDriver}}'
json-file

cr0x@server:~$ docker inspect api-1 --format '{{.LogPath}}'
/var/lib/docker/containers/2f1c9b8c0d3a.../2f1c9b8c0d3a...-json.log

cr0x@server:~$ sudo ls -lh /var/lib/docker/containers/2f1c9b8c0d3a.../*json.log
-rw-r----- 1 root root 9.2G Jan  3 18:21 /var/lib/docker/containers/2f1c9b8c0d3a.../2f1c9b8c0d3a...-json.log

What it means: Logs are growing on the Docker root filesystem. That can compete with overlay2 writes and fill disks fast.

Decision: Rotate logs, cap them, or switch to a logging driver appropriate for your environment. Don’t let “printf debugging” become a storage DoS.
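
For the json-file driver, caps can be set globally in /etc/docker/daemon.json (takes effect for containers created after a daemon restart) or per container. A minimal sketch; the 50m/3 numbers and the image name are placeholders, not a recommendation:

cr0x@server:~$ cat /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}

cr0x@server:~$ docker run -d --name api-2 --log-opt max-size=50m --log-opt max-file=3 myorg/api:latest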

Task 14: Look for mount propagation mistakes (volume not actually mounted)

cr0x@server:~$ docker exec -it api-1 sh -lc 'mount | grep -E "/data|overlay" | head -n 3'
overlay on / type overlay (rw,relatime,lowerdir=...,upperdir=...,workdir=...)
/dev/nvme0n1p2 on /data type ext4 (rw,relatime)

What it means: /data is a real mount (good). If you only saw the overlay mount and no separate mount for your intended data path, your “volume” isn’t mounted where you think.

Decision: Fix the container spec (wrong destination path, typo, missing -v), then re-test performance. Assumptions are where outages breed.

When to switch to volumes (and when not to)

Switch to volumes when the data is mutable and one of these is true

  • It’s a database (Postgres, MySQL, MongoDB, Redis with AOF, etc.). Databases are not “just files.” They’re carefully choreographed fsync machines.
  • It’s a write-ahead log or journal (Kafka-like segments, queue durability logs, search index translogs). These workloads punish latency.
  • It’s high-churn app state (uploads, cache directories you actually care about, user-generated content).
  • It’s build output you rely on across restarts (CI caches, package caches) and you want predictability.
  • You need backup/restore semantics that aren’t “commit the container and pray.” Volumes give you a clear boundary for snapshotting and migration.

Keep overlay2 for what it’s good at

  • Immutable application bits: binaries, libraries, config templates. That’s what image layers are for.
  • Short-lived scratch files that don’t need to persist and don’t do a million fsync calls. If you need speed, use tmpfs. If you need persistence, use a volume.

Volumes vs bind mounts: pick deliberately

A bind mount is “mount this host path into the container.” A Docker volume is “Docker manages a host path and mounts it for you.”
Performance can be similar because both bypass the union layer on that path. The difference is operational:

  • Volumes are easier to enumerate, migrate, and reason about with Docker tooling. They also avoid accidental coupling to host directory layout.
  • Bind mounts are great when you need tight control (pre-existing directory trees, specific filesystem, or integration with host tools). They also make it easy to mount something you didn’t mean to mount.

My bias: in production, prefer volumes unless you have a clear reason not to.
Your future self will enjoy fewer “why is this path empty on the new node?” conversations.
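
For reference, the two syntaxes side by side (image name and host path are placeholders):

cr0x@server:~$ docker volume create appdata
cr0x@server:~$ docker run -d --name app-vol  -v appdata:/data myorg/app:latest
cr0x@server:~$ docker run -d --name app-bind --mount type=bind,source=/srv/appdata,target=/data myorg/app:latest

The -v form creates the named volume if it doesn’t exist; with --mount type=bind, the source directory must already exist on the host, which is exactly the kind of explicitness you either want or don’t.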

Joke #1: Running a database on overlay2 is like wearing flip-flops to a construction site—possible, but the injury report writes itself.

How to switch safely: patterns that don’t page you later

Pattern 1: Explicit data directories, never “whatever the image uses by default”

Images often define default data paths inside the container filesystem. If you don’t override them with a volume mount, you’re implicitly using overlay2.
For databases, be explicit and loud.

cr0x@server:~$ docker volume create pgdata
pgdata

cr0x@server:~$ docker run -d --name pg \
  -e POSTGRES_PASSWORD=example \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16
c1d2e3f4a5b6c7d8e9f0

Pattern 2: Move only the hot path, not the whole filesystem

You don’t need to mount /. You don’t need to re-platform everything. Identify the write-heavy directories:
database data, WAL, uploads, caches, logs (sometimes).
Mount those. Leave the rest alone.

Pattern 3: Split WAL/log from data when latency requirements differ

Not every deployment needs this, but when it does, it’s a lifesaver:
put WAL (or equivalent) on the fastest storage class, keep bulk data on capacity storage.
This is easier with orchestrators, but you can do it with Docker too if you’re disciplined.
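
With the official postgres image, a sketch looks like this: one volume for data, one for WAL, with the WAL volume backed by a directory on the faster device. POSTGRES_INITDB_WALDIR is documented by the image but only applies when the cluster is first initialized, and /mnt/fastnvme is an assumed mount point — verify both against your image version and hardware.

cr0x@server:~$ sudo mkdir -p /mnt/fastnvme/pgwal
cr0x@server:~$ docker volume create --driver local --opt type=none --opt o=bind --opt device=/mnt/fastnvme/pgwal pgwal
cr0x@server:~$ docker volume create pgdata
cr0x@server:~$ docker run -d --name pg \
  -e POSTGRES_PASSWORD=example \
  -e POSTGRES_INITDB_WALDIR=/var/lib/postgresql/wal \
  -v pgdata:/var/lib/postgresql/data \
  -v pgwal:/var/lib/postgresql/wal \
  postgres:16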

Pattern 4: Avoid writing logs into container layers

Logging to stdout is not automatically “free,” but it avoids the union filesystem for app logs.
If you log to files, mount a volume or bind mount for the log directory and rotate.
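
If the application insists on file logs, a dedicated mount keeps that churn out of the writable layer (names are placeholders):

cr0x@server:~$ docker volume create applogs
cr0x@server:~$ docker run -d --name api-3 -v applogs:/var/log/myapp myorg/api:latest

Then rotate inside the volume (logrotate on the host path, or a sidecar), or the volume simply becomes the new place the disk fills up.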

Pattern 5: Decide what durability you actually need

The storage driver debate often hides a business decision: is data loss acceptable?
If you disable fsync (or use async modes), you will get better numbers—right up until you don’t.
“We can lose the last 5 seconds of data” is a valid policy. “We didn’t think about it” is not.

Three corporate mini-stories from the trenches

Mini-story 1: The incident caused by a wrong assumption

A team rolled out a new API service that ingested events and wrote them to a local queue before shipping to the main pipeline.
In staging it looked fine. In production, it started timing out under load. The on-call graph showed rising latency and a sudden increase in iowait.
CPU was low, which made everyone suspicious of “network” because that’s what we do when CPU isn’t guilty.

They assumed that because the queue was “just a file,” it would behave like a host file. But the file lived inside the container filesystem.
That meant overlay2. That meant union mount semantics. And under load, the queue did what queues do: lots of small appends, lots of syncs.

The symptom was bizarre: throughput would be fine for a few minutes after deploy, then degrade.
The writable layer grew, the metadata got hotter, and the device queue built up.
Someone tried “bigger CPU” because the dashboards looked empty. It did nothing. Of course it did nothing.

The fix was boring: mount a volume at the queue directory and redeploy.
Latency stabilized. Write amplification dropped. The incident ended quietly, which is the only kind of ending you should aim for.

The real lesson: if a component is a durability boundary (queue, journal, database), treat its storage path as infrastructure, not as an implementation detail.
The container layer is not infrastructure; it’s packaging.

Mini-story 2: The optimization that backfired

A platform team saw slow build times in CI. Containers were unpacking dependencies repeatedly.
They decided to “speed it up” by caching package directories inside the container filesystem, thinking it would avoid external I/O overhead.
They also liked the simplicity: fewer volumes, fewer mounts, less “state.”

It worked—briefly. Then disk usage ballooned under /var/lib/docker/overlay2.
The host wasn’t out of bytes, but it was bleeding inodes. Cleanup jobs started taking longer.
New builds became slower, not faster, because every container started with a fat writable layer and a lot of directory churn.

The team then “optimized” cleanup by pruning images aggressively between jobs.
That reduced disk usage but added registry pull traffic and cache misses, and it increased write pressure during unpack because nothing was warm anymore.
The net effect was higher variance and more sporadic timeouts.

The fix was to move caches to a dedicated volume (or per-runner cache directory via bind mount) and manage it intentionally:
caps, expiration, and clear ownership. Build output became predictable again.

The moral: caching is storage. If you treat it like a side effect, it will treat your disk like a suggestion.

Mini-story 3: The boring but correct practice that saved the day

Another org ran several stateful services in containers—not because it was trendy, but because it reduced drift and made upgrades less terrifying.
They had one rule that sounded unglamorous in design reviews: all persistent state must be on named volumes, and each service must document its volume mounts.

When a node started showing elevated write latency, they could immediately separate “overlay2 writes” from “volume writes.”
Their dashboards tracked disk latency per device, and their deployment specs made it obvious what lived where.
Triage was fast because the layout was consistent across services.

During the incident, they live-migrated a workload to a node with better storage and reattached volumes.
No container layer data mattered, so they weren’t stuck copying random directories out of /var/lib/docker/overlay2.
Recovery time was measured in minutes, not in archaeology.

The service owners hated the rule during prototyping because it forced them to think about paths early.
The on-call team loved it forever.

Joke #2: Nothing ruins a “stateless microservice” narrative like a 20GB writable layer named “upper.”

Common mistakes: symptom → root cause → fix

This is the section you read after you’ve already tried restarting it. Don’t worry, we’ve all been there.

1) Writes are slow only inside the container

Symptom: Same operation on host is fast; inside container it’s sluggish.

Root cause: Data path is inside overlay2, paying copy-on-write and metadata overhead; or the container is using different fsync patterns due to config.

Fix: Move the writable path to a volume/bind mount. Verify with mount in the container and benchmark A/B.

2) Database latency spikes during checkpoints/flushes

Symptom: Periodic stalls; WAL writer or checkpoint process shows high IO wait.

Root cause: fsync latency and journal contention on underlying storage; overlay2 can magnify it if DB files are in the writable layer.

Fix: Ensure DB data/WAL are on volumes on fast storage. If still slow, address underlying IOPS/latency (storage class, device, filesystem tuning).

3) “Disk full” errors but df -h shows space available

Symptom: Writes fail; Docker pulls fail; containers crash; but bytes aren’t exhausted.

Root cause: Inode exhaustion on Docker root, often from high file churn in overlay2 upperdirs or build caches.

Fix: Check df -hi. Prune, reduce file churn, move churn-heavy paths to volumes, and consider a filesystem/layout with sufficient inodes.

4) Performance degrades over time without increased load

Symptom: Same RPS, worse latency after days/weeks.

Root cause: Writable layers accumulate lots of files/whiteouts; metadata operations get slower; backups/anti-virus scanners touch Docker root; log files grow.

Fix: Keep state out of overlay2, rotate logs, prune unused images/containers, and keep Docker root on a filesystem not shared with noisy host processes.

5) “We moved to volumes and it’s still slow”

Symptom: Data is on a volume, but write latency remains high.

Root cause: Underlying storage is the bottleneck (IOPS-limited network disk, burst credits, throttling), or workload is sync-latency bound.

Fix: Measure device latency with iostat, look at fsync timing, and upgrade/isolate storage. A volume can’t make slow disks fast.

6) Mysterious errors on XFS-backed Docker root

Symptom: Weird file behavior or warnings; sometimes poor performance.

Root cause: XFS formatted with ftype=0 (no d_type) or unsupported kernel/filesystem combination.

Fix: Migrate Docker root to XFS with ftype=1 (or use ext4). This is a rebuild/migration job, not a toggle.

7) Massive Docker root growth and slow node behavior

Symptom: /var/lib/docker grows rapidly; node gets sluggish; container operations slow.

Root cause: Unbounded container logs, large writable layers, or runaway build artifacts inside containers.

Fix: Implement log caps/rotation, move artifacts to volumes, and enforce policies around image builds and container runtime writes.

Checklists / step-by-step plan

Checklist A: Decide if you should switch to volumes (quick gating)

  • Is the data mutable? If yes, lean volume.
  • Does the workload do fsync/fdatasync often? If yes, avoid overlay2 for that path.
  • Is the path expected to grow beyond a few hundred MB? If yes, don’t keep it in upperdir.
  • Do you need backups or migration? If yes, volumes make it sane.
  • Is the path a cache you can drop? If yes, consider tmpfs or accept overlay2 but cap/clean.

Checklist B: Migration plan for a stateful service (no drama edition)

  1. Identify the true data directories. Read the app config. Don’t guess. For databases, find data dir and WAL/redo logs.
  2. Create volumes with names that mean something. Avoid “data1” for everything; you will hate it later.
  3. Stop the service cleanly. Let it flush. “Kill -9” is not a migration tool.
  4. Copy data from old path to the volume. Preserve ownership and permissions (see the sketch after this checklist).
  5. Mount the volume at the exact same path the app expects. Consistency wins.
  6. Start and validate with application-level checks. Not just “container is running.”
  7. Benchmark the write path (fio or application metrics) and compare to baseline.
  8. Set policy: log rotation, size caps, pruning schedules.
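
A condensed sketch of steps 3–6 for a database whose data currently lives in the writable layer. Container and volume names are examples; docker cp -a copies out of the stopped container with uid/gid preserved, and the official postgres image runs as uid 999, which you can sanity-check after the copy:

cr0x@server:~$ docker stop pg-old                              # clean stop so the database flushes
cr0x@server:~$ docker volume create pgdata
cr0x@server:~$ sudo docker cp -a pg-old:/var/lib/postgresql/data/. /var/lib/docker/volumes/pgdata/_data/
cr0x@server:~$ sudo ls -ln /var/lib/docker/volumes/pgdata/_data | head -n 3   # verify ownership survived (expect uid 999)
cr0x@server:~$ docker run -d --name pg-new \
  -e POSTGRES_PASSWORD=example \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16
cr0x@server:~$ docker exec pg-new pg_isready                   # application-level check, not just "container is running"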

Checklist C: If volumes didn’t fix it (storage reality check)

  • Measure device latency and saturation with iostat.
  • Check for burstable storage throttling (cloud volumes can do this silently).
  • Confirm filesystem health and mount options.
  • Look for noisy neighbors: other containers writing logs or doing compactions.
  • Consider separating Docker root, volumes, and logs onto different devices.

FAQ

1) Is overlay2 “slow” by design?

overlay2 is optimized for image layering and reasonable container filesystem usage. It’s not designed as a high-performance database filesystem.
For heavy, sync-latency-sensitive writes, it’s often slower than a direct mount.

2) Do Docker volumes always outperform overlay2?

Not always. If the underlying storage is slow, a volume won’t fix physics.
But volumes remove union filesystem overhead on the mounted path, which often helps for metadata-heavy and fsync-heavy workloads.

3) What about bind mounts—are they “faster” than volumes?

Performance is usually similar. The difference is manageability and safety.
Volumes are easier to inventory and migrate with Docker tooling; bind mounts are explicit host paths and can be great when you need that control.

4) Why do small writes hurt more than big sequential writes?

Small sync writes are dominated by per-operation latency: metadata updates, journal commits, flushes, and sometimes barriers.
overlay2 adds extra layers of filesystem work. Big sequential writes can be buffered and streamed more efficiently.

5) Can I “tune” overlay2 to be as fast as volumes?

You can reduce pain (kernel updates, filesystem choices, avoiding copy-up patterns), but you can’t tune away the fundamental union filesystem semantics.
If you care about write performance on mutable data, mount it from the host.

6) Should I put /var/lib/docker on a separate disk?

Often yes in production. Separating Docker root from the OS disk reduces contention and makes capacity planning easier.
If logs are on the same disk too, you’re basically inviting a loud service to share a bedroom with a light sleeper.

7) Is tmpfs a good alternative to volumes for performance?

For truly ephemeral data, yes. tmpfs is fast and avoids disk latency.
But it consumes RAM (and can swap if you’re not careful). Don’t put “important” data on tmpfs unless you enjoy explaining data loss.
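
A minimal tmpfs mount with a hard size cap, using a throwaway alpine container (path and size are examples; output is illustrative):

cr0x@server:~$ docker run --rm --tmpfs /scratch:rw,size=512m alpine df -h /scratch
Filesystem                Size      Used Available Use% Mounted on
tmpfs                   512.0M         0    512.0M   0% /scratch

Set the size explicitly; without a cap, a busy writer can keep growing the mount until it pressures host memory.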

8) Does Kubernetes change the advice?

The principle stays: don’t store mutable, performance-sensitive state in the container layer.
In Kubernetes you’ll use PersistentVolumes, hostPath (carefully), or ephemeral volumes like emptyDir (backed by disk or memory) depending on durability needs.

9) If my database is on a volume, can I ignore overlay2 completely?

Not completely. Container start/stop, image pulls, and any writes outside your mounted directories still hit overlay2 and Docker root.
Keep Docker root healthy: space, inodes, log rotation, and reasonable image hygiene.

10) What’s the simplest rule that prevents most overlay2 write incidents?

If it persists across restarts and you’d be sad if it vanished, it goes on a volume. If it’s a database, it goes on a volume even if you think you’d be fine.

Practical next steps

overlay2 is a great default for packaging and launching software. It’s not a great place to stash mutable, high-churn, durability-sensitive data.
When writes are slow, the right question is: “What path is hot, and is it living in the container layer?”

Next steps you can do today:

  1. Run the fast diagnosis playbook: identify the writer, map it to a container, confirm the data path.
  2. Benchmark overlay2 vs a volume path with a quick fio A/B test that matches your workload (sync vs async matters).
  3. Move databases, queues, and logs you care about onto volumes. Keep the rest in the image.
  4. Put guardrails in place: log caps, inode monitoring, and a simple policy that persistent state never lives in upperdir.

One paraphrased idea from Werner Vogels (Amazon CTO): you build reliability by expecting failure and designing systems that keep working anyway.
Treat overlay2 as a packaging layer, not a storage strategy, and your systems get boring in the best possible way.
