Docker “Text file busy” During Deploy: The Fix That Stops Flaky Restarts

You deploy. The container restarts. Then restarts again. Logs show a classic: Text file busy.
Sometimes it works on the second try, sometimes after five, sometimes only when you’re watching.

This error is a small message with a long tail: Linux telling you that you tried to modify or execute something
that the kernel is still treating as “in use.” In production, it shows up as flaky restarts, mysterious deploy
failures, and teams blaming Docker when the real culprit is how you ship files.

What “Text file busy” really means (and why Docker gets blamed)

On Linux, Text file busy maps to ETXTBSY: you attempted an operation on an
executable file (historically called “text” because the text segment contains instructions) while it’s being
executed or otherwise pinned by the kernel in a way that blocks your action.
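
A minimal reproduction you can try on a scratch box (the path and PID here are illustrative):

cr0x@server:~$ cp /bin/sleep /tmp/busy-demo
cr0x@server:~$ /tmp/busy-demo 60 &
[1] 4242
cr0x@server:~$ echo overwrite > /tmp/busy-demo
-bash: /tmp/busy-demo: Text file busy

The shell cannot open the running executable for writing, so the kernel returns ETXTBSY. Note that rm /tmp/busy-demo would succeed: unlinking a busy executable is allowed; overwriting its inode is not.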

In container land, this happens during deploys because we mix two worlds:

  • Immutable-ish images (great): layers are content-addressed, built once, run many.
  • Mutable bind mounts (dangerous): host files show up live inside the container.

The classic failure pattern: a CI job, deploy script, or sidecar updates a binary or entrypoint script in a bind-mounted path
while an existing container is still starting up, shutting down, or restarting. The kernel refuses the operation at the worst moment.
Docker is just the messenger. Docker also happens to be the messenger you yell at, because it’s convenient and cannot talk back.

The fix that stops flakiness is boring and absolute: never update executables in place in a path that a running
container might execute. Build a new file, then atomically switch what “current” points to (typically via a symlink swap or bind mount swap),
or stop using bind mounts for executables entirely.

The one quote you should tape near your deploy scripts

“Hope is not a strategy.” (a maxim long repeated in ops and reliability circles)

If your deploy relies on “hopefully the old process exits before we overwrite the file,” you’re not deploying. You’re gambling with better branding.

Joke #1: Containers are cattle, but the one holding your deploy script is always a pet. It has a name, and it bites.

Fast diagnosis playbook

This is the “I have 10 minutes before the release manager starts a thread” sequence. The goal is not philosophical purity.
The goal is to identify whether you’re dealing with (a) bind mount mutation, (b) overlayfs/layer behavior, (c) shutdown race, or (d) something else.

First: find the exact file that’s “busy”

  • From logs, extract the path: /app/bin/service, /entrypoint.sh, /usr/local/bin/foo.
  • If logs don’t show it, run the failing command with strace (see tasks below) to capture which file returns ETXTBSY.

Second: determine if that path is a bind mount

  • Run docker inspect and check the .Mounts array for the container.
  • If it’s a bind mount, you’ve probably found your culprit.

Third: identify what process still has it open/executing

  • Use lsof or fuser on the host for the host-side path.
  • If it’s inside the container filesystem (overlay2), check running processes and their executable paths.

Fourth: decide which failure mode you’re in

  • Deploy script overwrote a running executable → fix deploy method (atomic switch), stop in-place edits.
  • Entrypoint is bind-mounted and replaced → stop mounting entrypoint scripts; bake into image or mount read-only with release switching.
  • Shutdown takes too long → handle SIGTERM, add preStop/wait, raise grace, don’t force kill early.
  • Self-updating process (yes, people still do this) → remove self-update; ship new image instead.

Fifth: apply a reliable mitigation while you work on the real fix

  • Stop overwriting; write to a new path and rename/symlink swap.
  • Mount executables read-only.
  • Disable “restart always” temporarily to avoid a restart storm that hides the root cause (see the command below).
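
For that last item, one way to pause the restart policy without editing your Compose files (container name is illustrative):

cr0x@server:~$ docker update --restart=no api-1
api-1

Re-enable it with docker update --restart=always api-1 once the deploy path is fixed; leaving it off permanently just trades one failure mode for another.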

Root causes: the 8 ways you earn ETXTBSY

1) In-place overwrite of a running binary (bind mount or shared volume)

Someone does cp new-binary /srv/app/bin/service while the old service is still running, or a container is in the middle of starting.
Linux allows many file operations on open files, but writing to a binary that is currently being executed fails with ETXTBSY, and executing a file that someone holds open for writing fails the same way. Which side you hit depends on the sequence.
The message is your only warning that your deployment model is living dangerously.

2) Replacing an entrypoint script that’s currently being executed

Shell scripts are executables too. If your container starts with /entrypoint.sh from a bind mount, and your deploy updates that file,
you can hit “text file busy” during startup—exactly when your orchestrator is doing lots of restarts, health checks, and impatience.

3) CI/CD writes into a directory that is also the live runtime directory

The “simple” approach: a job builds artifacts and drops them into /srv/app/current. Another job restarts the container.
If those steps overlap or re-run, you’ve created a race condition with production as the referee.
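
A cheap way to make overlapping jobs fail fast instead of racing is an exclusive lock at the top of the deploy script. A minimal sketch, assuming a lock file path of your choosing:

# abort immediately if another deploy holds the lock
exec 9>/srv/app/.deploy.lock
if ! flock -n 9; then
    echo "another deploy is already running; aborting" >&2
    exit 1
fi
# ... stage, verify, activate ...

flock(1) ships with util-linux; the lock is released automatically when the script exits, so there is nothing to clean up.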

4) Two containers share the same host path and deploy out of sync

One container is still running old code from a shared bind mount; another container updates that mount.
Congratulations: you’ve implemented distributed lock contention using only shell scripts.

5) Aggressive restart policies create self-inflicted DoS

restart: always is fine when the failure is rare. When the failure is triggered by a deploy step that keeps happening,
you get a tight restart loop. That loop increases the chances you collide with the file replacement window.
The error becomes “flaky” because the timing changes.

6) Overlayfs quirks when you mutate “should-be-immutable” paths

Docker’s overlay2 driver is designed for copy-on-write layers. Most of the time it behaves like a normal filesystem.
But when you try to do clever things—like hot-swapping executables in writable layers during startup—you’re leaning on subtleties:
whiteouts, copy-up, and per-layer semantics. You might not see ETXTBSY every time, but you’ll see it when you least want it.

7) Atomicity misunderstandings: rename is atomic, copy is not

People say “we update it atomically” and then show you cp. Copying over an existing file truncates and rewrites it in place, which is not atomic in any way you need.
A rename within the same filesystem is atomic; an in-place copy is an invitation to partial reads, half-written binaries, and weird errors.
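
If you genuinely need to replace a single file, the safe shape is “write a new inode next to it, then rename.” A sketch with illustrative paths:

# write the new file under a temporary name on the SAME filesystem
install -m 0755 build/api /srv/app/releases/staging/api.new
# rename() atomically swaps the directory entry; a process still
# executing the old binary keeps its old inode until it exits
mv -f /srv/app/releases/staging/api.new /srv/app/releases/staging/api

rename() never touches the old inode’s contents, which is exactly why it sidesteps ETXTBSY while an in-place cp does not.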

8) Long shutdowns + forced kill + immediate restart

If your service takes a while to exit and your orchestrator kills it early, the process may still be present during the next start attempt
(or the filesystem still has execution references). Tight loops amplify this.
Often the fix isn’t “sleep 5” (although it “works”), it’s making shutdown deterministic and deploy steps non-overlapping.

Joke #2: Adding sleep 10 to fix a race condition is like fixing a leaky roof by buying louder rain.

The durable fix: stop editing executables in place

If you remember one thing: deploy by switching pointers, not by mutating live files.
That means release directories, symlinks, and read-only mounts, or building a new image and swapping containers.
You want the runtime to see a consistent, complete artifact, every time.

What “good” looks like

  • Artifacts are immutable: a binary or script is written once, then never modified.
  • Activation is atomic: you switch from release A to release B using an atomic operation (rename a symlink, update a bind mount target).
  • Rollback is the same operation: switch back to the previous pointer.
  • Containers don’t share mutable executables: if they share anything, it’s data, not code.

The two deployment patterns that actually behave

Pattern A: Bake executables into the image (recommended)

The container image is the artifact. Deploy means: pull new image, start new container, stop old container.
No bind mounts for /app/bin. No “live patching” inside the container. This is what Docker was built for.
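
A minimal sketch of that model (base image and paths are illustrative):

# Dockerfile: the executable is an immutable image layer, not a mount
FROM debian:bookworm-slim
COPY build/api /app/bin/api
RUN chmod 0755 /app/bin/api
ENTRYPOINT ["/app/bin/api"]

Deploy becomes: build and push the image, pull on the host, start the new container, stop the old one. There is no live file to overwrite mid-flight.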

Pattern B: Release directories + atomic symlink swap (when you must use bind mounts)

Sometimes you’re stuck: regulatory constraints, giant artifacts, air-gapped builds, legacy runtime assumptions.
Fine. Then do it like grown-ups:

  • Write new release to /srv/app/releases/2026-01-03_120501/.
  • Verify it (checksums, permissions, smoke test).
  • Atomically update /srv/app/current symlink to point to the new release.
  • Restart containers that mount /srv/app/current read-only.

The key is that /srv/app/current changes instantly as a pointer; the release directory contents never change.
That eliminates “half-copied executable” and greatly reduces “text file busy” because you’re not overwriting the file being executed.
If something is still executing the old binary, it keeps executing it from its old inode. The new containers start on the new inode.
This is how you buy sanity with filesystem semantics.
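
Put together, activation is a short script. A sketch under the path conventions above (the manifest file and build/ source directory are assumptions):

#!/usr/bin/env bash
set -euo pipefail

REL="/srv/app/releases/$(date +%Y-%m-%d_%H%M%S)"

# stage into a brand-new directory; nothing executes from here yet
mkdir -p "$REL"
cp -a build/. "$REL/"

# verify before activation
cd "$REL"
sha256sum -c manifest.sha256
bin/api --version >/dev/null   # smoke test

# atomic activation: build a new pointer, then rename it over "current"
ln -sfn "$REL" /srv/app/current.new
mv -Tf /srv/app/current.new /srv/app/current

Rollback is the same two pointer operations aimed at the previous release directory.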

Small but important hardening choices

  • Mount code read-only into containers. If something tries to mutate it, it fails loudly (see the Compose sketch after this list).
  • Never bind mount over /usr/local/bin unless you enjoy archaeology.
  • Make entrypoints immutable (bake them into the image). If you must mount them, mount a versioned path and switch via symlink.
  • Control restarts: avoid infinite loops masking real failures; use backoff and alerting.
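
In Compose terms, the read-only mount and a bounded restart policy look roughly like this (service and image names are illustrative):

services:
  api:
    image: api:prod
    volumes:
      - /srv/app/current:/app:ro   # code is read-only inside the container
    restart: on-failure:5          # bounded retries instead of an infinite loop

The :ro flag turns accidental in-container mutation into a loud, immediate failure, which is what you want.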

Practical tasks: 12+ commands that tell you what’s going on

These are the commands you run when the deploy is failing and people are suggesting “just reboot the node.”
Each task includes: command, sample output, what it means, and the decision you make.

Task 1: Confirm the failure signature in container logs

cr0x@server:~$ docker logs --tail=80 api-1
exec /app/bin/api: text file busy

What it means: The kernel rejected an execve() of /app/bin/api (or a shell/script invoked it) with ETXTBSY.

Decision: Identify whether /app/bin/api is from an image layer or a mount. If it’s a mount, stop updating it in place.

Task 2: Inspect mounts and find bind mounts quickly

cr0x@server:~$ docker inspect api-1 --format '{{json .Mounts}}'
[{"Type":"bind","Source":"/srv/app/current","Destination":"/app","Mode":"ro","RW":false,"Propagation":"rprivate"}]

What it means: /app is a bind mount from the host path /srv/app/current. If deploy scripts modify files under that tree, you can race with execution.

Decision: Verify whether /srv/app/current is a symlink to versioned releases. If not, implement it.

Task 3: Check whether “current” is a symlink (and where it points)

cr0x@server:~$ ls -l /srv/app/current
lrwxrwxrwx 1 deploy deploy 44 Jan  3 11:58 /srv/app/current -> /srv/app/releases/2026-01-03_115801

What it means: Good sign: current is a pointer. If deploy updates the symlink atomically, containers see clean cutovers.

Decision: Ensure deploy writes into a new release directory and never edits contents of the pointed-to release after activation.

Task 4: Catch “in-place copy” behavior in deploy scripts

cr0x@server:~$ grep -R --line-number -E 'cp .* /srv/app/current|rsync .* /srv/app/current' /srv/deploy/scripts
/srv/deploy/scripts/deploy.sh:83:cp build/api /srv/app/current/bin/api

What it means: Someone is copying directly into the live tree. That’s the race.

Decision: Change deploy to stage into a new directory and symlink swap, or build a new image and redeploy.

Task 5: Identify who is holding the file open (host-side)

cr0x@server:~$ sudo lsof /srv/app/releases/2026-01-03_115801/bin/api | head
COMMAND   PID USER  FD   TYPE DEVICE SIZE/OFF    NODE NAME
api     23144  1001 txt    REG  253,0  834912 4123912 /srv/app/releases/2026-01-03_115801/bin/api

What it means: PID 23144 is executing the binary (the txt mapping). Overwriting that inode is exactly how you trigger ETXTBSY and worse.

Decision: Do not overwrite that file. Deploy a new inode (new path) and switch via symlink; or stop the process cleanly before any replacement.

Task 6: Use fuser to confirm processes using the executable

cr0x@server:~$ sudo fuser -v /srv/app/releases/2026-01-03_115801/bin/api
                     USER        PID ACCESS COMMAND
/srv/app/releases/2026-01-03_115801/bin/api:
                     1001     23144 ...e.  api

What it means: Same result, different tool: a process is executing the file.

Decision: Fix the deploy to avoid file replacement; don’t “retry until it works.”

Task 7: See the container’s view of the executable and confirm it matches the mount

cr0x@server:~$ docker exec api-1 readlink -f /app/bin/api
/srv/app/current/bin/api

What it means: The container is executing from the bind-mounted tree.

Decision: Treat /srv/app like production code storage. Version it on disk; mount read-only; switch pointers atomically.

Task 8: Prove whether the file is being modified during deploy (inotify)

cr0x@server:~$ sudo inotifywait -m /srv/app/current/bin -e create,modify,move,delete
Setting up watches.
Watches established.
/srv/app/current/bin/ MODIFY api

What it means: Something is modifying api in place. That’s your smoking gun.

Decision: Remove in-place mutation. If you need to update, write api.new, verify, then rename or symlink swap.

Task 9: Check Docker’s restart loop behavior

cr0x@server:~$ docker ps --filter name=api-1 --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}'
NAMES   STATUS                          IMAGE
api-1   Restarting (1) 3 seconds ago    api:prod

What it means: Container is flapping. That amplifies timing races and makes logs harder to interpret.

Decision: Temporarily stop the container to stabilize the system, then fix deploy method; or add backoff in orchestrator.

Task 10: Inspect container state for the last error

cr0x@server:~$ docker inspect api-1 --format '{{.State.Status}} {{.State.ExitCode}} {{.State.Error}}'
restarting 1 

What it means: Docker didn’t capture a separate “error” string here; you need logs + system call tracing for detail.

Decision: Use strace on the failing exec path or reproduce in a one-shot container to capture the ETXTBSY origin.

Task 11: Reproduce with a one-shot run that prints the failing exec

cr0x@server:~$ docker run --rm -v /srv/app/current:/app:ro api:prod /app/bin/api --version
bash: /app/bin/api: Text file busy

What it means: Even a clean one-shot run hits the issue. That suggests the host path is actively being modified or the file is in a weird intermediate state.

Decision: Stop deploy pipeline, verify file integrity (size, checksum, permissions), and confirm nobody writes to the live path.

Task 12: Confirm whether the file is being replaced via rename (good) or overwritten (bad)

cr0x@server:~$ sudo auditctl -w /srv/app/current/bin/api -p wa -k api-bin-watch
cr0x@server:~$ sudo ausearch -k api-bin-watch | tail -n 6
type=SYSCALL msg=audit(1767442101.220:911): arch=c000003e syscall=2 success=yes exit=3 a0=7f5b7a3c a1=241 a2=1b6 a3=0 items=1 ppid=1102 pid=28440 auid=1000 uid=1000 gid=1000 exe="/usr/bin/cp" key="api-bin-watch"

What it means: You caught cp writing to the executable directly. That’s not atomic and it collides with execution.

Decision: Replace “copy over” with “write new file then rename” or “stage release then symlink swap.”

Task 13: Verify atomic rename behavior in your deploy directory

cr0x@server:~$ cd /srv/app
cr0x@server:~$ ln -sfn /srv/app/releases/2026-01-03_120501 current.new
cr0x@server:~$ mv -Tf current.new current

What it means: mv -T treats the destination as a normal file rather than a directory to descend into; -f forces replacement. The rename() underneath is atomic, which is why this is a reliable pattern for symlink updates on Linux.

Decision: Standardize this as the activation step. No partial file copies into current.

Task 14: Confirm mount is read-only inside the container

cr0x@server:~$ docker exec api-1 sh -lc 'mount | grep " /app "'
/dev/sda1 on /app type ext4 (ro,relatime,errors=remount-ro)

What it means: The container can’t modify code under /app. That’s good: it prevents self-modifying behavior and keeps blame on the deploy pipeline where it belongs.

Decision: Keep it read-only. If something breaks because it expected to write there, fix the app to write to a data volume.

Task 15: Validate graceful shutdown timing to avoid overlapping executions

cr0x@server:~$ docker stop -t 30 api-1
api-1
cr0x@server:~$ docker ps -a --filter name=api-1 --format 'table {{.Names}}\t{{.Status}}'
NAMES   STATUS
api-1   Exited (0) 3 seconds ago

What it means: The process exits cleanly within the grace period. If it didn’t, you’d see forced kill behavior and a higher chance of restarts colliding with deploy steps.

Decision: If shutdown is slow, fix signal handling, add preStop hooks, and adjust timeouts. Don’t compensate with “deploy retries.”
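
If a shell entrypoint sits between Docker and your process, make sure SIGTERM actually reaches the process. A minimal sketch:

#!/bin/sh
# entrypoint.sh: exec replaces the shell, so the service becomes PID 1
# and receives SIGTERM from `docker stop` directly
set -e
exec /app/bin/api "$@"

Without exec, the shell stays as PID 1, the signal never reaches the service, and Docker escalates to SIGKILL at the end of the grace period.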

Task 16: Use strace to confirm the syscall returning ETXTBSY

cr0x@server:~$ strace -f -e trace=execve -s 256 /srv/app/current/bin/api --version
execve("/srv/app/current/bin/api", ["/srv/app/current/bin/api", "--version"], 0x7ffd1efc8b10 /* 36 vars */) = -1 ETXTBSY (Text file busy)
strace: exec: Text file busy
+++ exited with 1 +++

What it means: No guessing. The kernel returned ETXTBSY on execve of that path. (Tracing docker run itself won’t capture this execve, because the container process is spawned from the daemon’s process tree, not from the docker CLI; since /app is a bind mount, executing the host-side path hits the same kernel check.)

Decision: Treat this as an artifact lifecycle bug. Change how the file is produced and activated; don’t “tune Docker.”

Three corporate-world mini-stories

Mini-story #1: The incident caused by a wrong assumption

A mid-size fintech ran a Docker Compose stack on a few beefy VMs. They shipped a Go binary and a couple of scripts via a bind mount:
/srv/finapp/current mounted into /app. The assumption was simple and wrong: “Linux lets you replace files while they’re in use.”

The deploy job did cp of a new /srv/finapp/current/bin/api and immediately ran docker compose up -d.
On quiet days it worked. On busy days, some containers restarted, hit Text file busy, and flapped. The pager went off.
People blamed “Docker being flaky,” because the error showed up at container start, not during copy.

The post-incident review showed the real failure mode: multiple app instances were executing bin/api from the bind mount,
while the deploy replaced it in place. Sometimes the copy and the exec collided. Sometimes the copy partially wrote and the next exec
got a different error. They had built a race that depended on timing, CPU scheduling, and a healthy dose of spite.

The fix was not exotic. They staged artifacts into /srv/finapp/releases/<id>, verified them, and atomically swapped
current using mv -Tf. They also mounted /app read-only to make sure no container could mutate it.
The next deploy was boring, which is the correct emotional tone for a deploy.

Mini-story #2: The optimization that backfired

An ad-tech company wanted faster deploys. They were tired of building images, so they tried “artifact injection”:
build once on the host, then bind mount it into the container. They also enabled an aggressive restart policy so services would “heal.”

It was fast. It was also a great way to turn a harmless deploy race into a cluster-wide restart festival.
When a deploy kicked off, containers restarted quickly, some grabbed the file while it was mid-update, and several hit ETXTBSY.
Restart policy kicked in, which restarted them again, which increased the chance they collided with the deploy window. Feedback loop achieved.

The team “fixed” it by adding sleep calls between copy and restart, then by adding more sleep when the first sleep wasn’t sleepy enough.
Deploys got slower. Failures got rarer. Then a particularly busy day hit and the timing shifted again.
The error returned like a seasonal allergy.

The real fix was to stop optimizing the wrong thing. They returned to building images for production, and only used bind mounts in dev.
For the few services that still needed host-mounted assets, they used release directories and a symlink swap. Deploy time went up a bit.
Incident time went down a lot. That’s the trade you want.

Mini-story #3: The boring but correct practice that saved the day

A healthcare SaaS provider had a strict “immutable artifacts” policy. Engineers complained about it the way people complain about seatbelts.
Every deploy produced a versioned release directory with checksums and a manifest. Activation was a symlink swap. Rollback was the same.

One Friday, a storage hiccup caused a deploy job to retry in the middle of staging. A second job started before the first finished.
That would have been a classic “text file busy” incident, because both jobs targeted the same app. But they didn’t target the same directory.
Each staging run wrote to a new, unique release path.

The activation step used a lock and an atomic update of current. Only one job won. The other job failed fast and loud.
The service never saw partial artifacts. Containers restarted exactly once. Nobody learned a new error message that day.

The postmortem was short. The fix was to make the lock more explicit and to improve alerting on deploy contention.
The practice that saved them wasn’t heroism. It was good filesystem hygiene applied consistently, which is how most reliability is actually achieved.

Common mistakes: symptom → root cause → fix

1) Symptom: “Text file busy” only during deploy, then it disappears

Root cause: In-place artifact update collides with container restarts; timing-dependent race.

Fix: Stage to versioned directory; atomically switch symlink; mount runtime code read-only.

2) Symptom: It happens more when traffic is high

Root cause: Slow shutdown or longer startup increases overlap window; restart loops amplify collisions.

Fix: Make shutdown deterministic (SIGTERM handling), increase grace period, add backoff; avoid immediate restart storms during deploy.

3) Symptom: Only one node shows the problem

Root cause: Node-specific deploy script behavior, different filesystem (NFS vs local ext4), or different mount options.

Fix: Compare mount types and options, standardize deploy mechanism, avoid executing from network shares when possible.

4) Symptom: “Text file busy” for entrypoint.sh or startup scripts

Root cause: Entrypoint is bind-mounted and updated; or config management rewrites it.

Fix: Bake entrypoint into the image; or version scripts and switch pointers, never overwrite.

5) Symptom: Deploy uses rsync and still fails

Root cause: rsync replaces files one at a time (and overwrites them directly with --inplace), so a multi-file tree is never updated atomically as a whole; temporary-file and rename behavior depends on flags.

Fix: Use staging directory; or rsync into a new release path; then symlink swap. Don’t rsync into “current.”
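
The safe rsync shape, with illustrative paths:

cr0x@server:~$ rsync -a build/ /srv/app/releases/2026-01-03_120501/

The destination is a directory nothing executes from yet, so rsync’s per-file replacement behavior stops mattering; activation is still the atomic symlink swap.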

6) Symptom: “But we use rename, it should be atomic”

Root cause: Rename is atomic only within the same filesystem and only for the rename operation; your process might still be overwriting or copying.

Fix: Ensure staging and activation occur on the same filesystem; use mv -Tf on symlinks; avoid cross-filesystem moves.

7) Symptom: Container fails with “permission denied” after you “fixed” it

Root cause: New release directory has wrong ownership/exec bit; read-only mount exposes sloppy packaging.

Fix: Set correct permissions in build step; verify with stat; add a pre-activation smoke test.

8) Symptom: It only happens in Kubernetes, not locally

Root cause: Lifecycle hooks, rapid rescheduling, and readiness/liveness restarts create more aggressive timing. Also shared volumes are more common.

Fix: Avoid shared executable volumes; use images; if you must use volumes, use versioned paths and atomic pointer updates plus proper terminationGracePeriodSeconds.
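
In pod-spec terms, the relevant knobs look roughly like this (names are illustrative; the first-choice fix is still baking code into the image rather than using hostPath):

spec:
  terminationGracePeriodSeconds: 30   # give SIGTERM time to work
  containers:
    - name: api
      image: api:prod
      volumeMounts:
        - name: app-code
          mountPath: /app
          readOnly: true              # executables are never writable
  volumes:
    - name: app-code
      hostPath:
        path: /srv/app/current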

Checklists / step-by-step plan

Checklist: Immediate containment (today)

  1. Stop the restart storm: temporarily disable automatic restarts or scale down the failing service.
  2. Freeze deploy jobs that write into the runtime path.
  3. Identify the busy file path from logs or strace.
  4. Verify whether that path is bind-mounted; if yes, treat it as the primary suspect.
  5. Use lsof/fuser on the host to confirm which process is executing it.
  6. If needed, stop the service cleanly before further changes. Avoid force-kill loops.

Checklist: Durable remediation (this week)

  1. Decide your artifact model:
    • Prefer: bake binaries into the image and redeploy containers.
    • Fallback: versioned release directories + symlink swap.
  2. Change deploy to stage artifacts into a unique directory:
    • Write: /srv/app/releases/<release-id>/
    • Never write: /srv/app/current/
  3. Add verification before activation:
    • Checksum/size validation
    • Permission checks (+x)
    • Basic run test (--version or --help)
  4. Activate via atomic pointer switch:
    • ln -sfn ... current.new
    • mv -Tf current.new current
  5. Mount the code path read-only in Docker/Compose/Kubernetes.
  6. Ensure shutdown is graceful:
    • Handle SIGTERM
    • Set reasonable stop timeouts

Checklist: Guardrails (this quarter)

  1. Add a CI check that fails builds if deploy scripts copy into “current” (see the sketch after this list).
  2. Add auditing/inotify during canary deploys to ensure no in-place modification occurs.
  3. Standardize artifact paths and mount patterns across services.
  4. Record deploy contention: if two deploys overlap, fail one immediately with a clear error.
  5. Make rollbacks first-class: keep previous releases on disk; switch symlink back.
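
The first guardrail can be a few lines in CI. A sketch, assuming deploy scripts live under deploy/ in the repository:

# fail the build if any deploy script writes into the live pointer
if grep -REn '(cp|rsync|install) .*/srv/app/current' deploy/; then
    echo "ERROR: deploy scripts must write to releases/, not current/" >&2
    exit 1
fi

Crude, but it catches the regression at review time instead of at 2 a.m.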

Interesting facts and historical context

  • ETXTBSY is old Unix baggage with modern consequences: the name comes from “text segment,” the executable code mapping in early Unix.
  • Linux can unlink running executables: a process can keep running even if its executable is deleted, because it holds an open inode reference.
  • Atomic rename is one of the strongest filesystem guarantees: on POSIX filesystems, renaming within the same filesystem is atomic, which is why symlink swaps work.
  • Copy-on-write filesystems changed expectations: overlayfs and layered images encourage immutability, but bind mounts reintroduce mutability right where it hurts.
  • Shell scripts can trigger ETXTBSY too: anything executed via execve can be “busy,” not just compiled binaries.
  • Restart loops hide root causes: orchestrators retry quickly; logs rotate; the first failure message vanishes under a pile of identical restarts.
  • NFS and network filesystems add their own flavor: close-to-open consistency and caching can produce timing-dependent behavior that looks like ETXTBSY or partial updates.
  • “Hot patching” is usually a deployment smell: self-updating services were more common pre-container; with images, it’s almost always the wrong move.
  • Symlink-based releases predate containers: the pattern is decades old in web hosting and app servers because it matches how kernels and filesystems behave.

FAQ

1) Is “Text file busy” a Docker bug?

Almost never. It’s the kernel refusing an operation (usually execve or a file update) due to how the file is being used.
Docker just happens to be where you see it.

2) Why does it only happen sometimes?

Races are schedule-dependent. CPU load, IO latency, restart timing, and whether another deploy job overlaps all change the window.
“Sometimes” is exactly how race conditions introduce themselves.

3) Can I fix it by adding retries or sleep?

You can mask it. You won’t fix it. You’re betting that timing will be kinder next time, which is not a contract you can enforce.
The real fix is to stop overwriting executables in place and activate new releases atomically.

4) Does mounting the directory read-only help?

Yes, as a guardrail. It prevents containers from modifying their own code and surfaces incorrect assumptions fast.
It doesn’t fix a host-side deploy script that still overwrites the files, so pair it with proper release staging.

5) What if I must use bind mounts for code (legacy constraints)?

Use versioned release directories and a symlink pointer like /srv/app/current. Never copy into current.
Mount current read-only into containers. Switch the symlink atomically.

6) Why does switching a symlink help if processes still run old code?

Because processes execute inodes, not path strings. A running process keeps using its already-open inode.
New processes resolve the symlink to a new inode. You avoid editing the inode a running process depends on.
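
You can watch this directly (PID is illustrative):

cr0x@server:~$ cp /bin/sleep /tmp/oldbin
cr0x@server:~$ /tmp/oldbin 300 &
[1] 31337
cr0x@server:~$ rm /tmp/oldbin
cr0x@server:~$ readlink /proc/31337/exe
/tmp/oldbin (deleted)

The process keeps running from the deleted inode; the kernel frees it only when the last reference goes away.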

7) Does this happen with overlay2 even without bind mounts?

It can, but it’s less common. Most ETXTBSY deploy issues come from mutating host-mounted files.
If you’re modifying files inside a container at runtime (especially executables), you’re recreating the same problem inside overlayfs.

8) How do I prove which process is responsible?

Use lsof or fuser on the host path, and confirm the container mount mapping with docker inspect.
If needed, use strace to catch execve returning ETXTBSY.

9) Is rsync safe if I use –inplace or –delay-updates?

--inplace is actively unsafe for live executables. --delay-updates is better, but the safest pattern is still:
rsync into a brand-new release directory, verify, then switch the pointer.

10) What’s the quickest structural fix with the least organizational drama?

Keep your existing bind mount structure, but change deploy to create a new release directory and do an atomic symlink swap.
It’s low impact, high value, and easy to audit.

Conclusion: what to change Monday morning

“Text file busy” during Docker deploys is not a cosmic mystery. It’s your filesystem telling you your deployment method is unsafe.
The fix is also not mysterious: stop mutating executables in place, and activate releases atomically.

Next steps that pay off immediately:

  1. Find the path that triggers ETXTBSY and confirm whether it’s bind-mounted.
  2. Eliminate in-place copies into the live runtime directory.
  3. Adopt either image-based deploys or release directories + symlink swap.
  4. Mount code read-only and make entrypoints immutable.
  5. Reduce restart storm behavior so failures are visible, not smeared across retries.

The goal isn’t to “never see ETXTBSY again.” The goal is to build deploys that don’t depend on luck, timing, or the current phase of the moon.
Your future self will still be on call. Do them a favor.
