You deploy a container. It boots. It’s alive. But it’s not behaving. The log says it’s talking to the wrong database, using the wrong log level, or binding the wrong port. You stare at your docker-compose.yml like it just lied to your face.
It probably did—accidentally, through precedence. Docker and Docker Compose have multiple layers that can set “the same” environment variable, and the winner is rarely the one you assumed. The fix isn’t heroics. It’s learning the rules, then instrumenting the truth.
The mental model: three different “env var” problems
Most Docker env var confusion comes from mixing up three separate mechanisms that merely look related:
1) Container runtime environment (what the process actually sees)
This is the set of variables inside the running container, visible to PID 1 and friends via env or /proc/1/environ. It comes from Docker’s container configuration, which itself can be sourced from image defaults, Compose, CLI flags, and other inputs.
2) Compose file interpolation (what Compose substitutes into YAML)
Compose also uses environment variables on your host to replace ${VAR} placeholders in the YAML. That substitution happens before any container is created. It’s not the same as setting the container runtime environment.
This is where people get burned: they set environment: in Compose and assume it will also affect any ${VAR} elsewhere in the file. It won’t.
3) Application configuration layering (what your app chooses to honor)
Even after Docker gets it right, your application may override it: config files, flags, defaults, framework conventions, or a library that reads env vars with a prefix, or ignores empty values, or treats "false" as truthy because it’s non-empty. Docker can’t save you from that.
If you want one slogan for production: Docker env vars are not “a setting.” They’re an input. Inputs have precedence rules. Inputs get ignored. Inputs drift.
Facts and history: how we got this mess
- Docker didn’t invent env-based configuration; the “12-factor app” popularized env vars as configuration in the early 2010s, and containers made it feel universal.
- ENV in Dockerfiles predates Compose; image authors baked defaults long before most teams standardized on Compose files for deployment.
- Compose variable substitution was modeled after shell habits: ${VAR}, default values, and “pull from your current environment.” Great for dev convenience, terrible for repeatability.
- Early Compose implementations loaded a default .env file automatically, and that behavior became muscle memory—even when teams later split env files per environment.
- Docker has two “env file” concepts: the CLI’s --env-file and Compose’s env_file:. They look similar, they are not interchangeable, and they don’t behave identically in edge cases.
- The OCI image spec stores Env in image config; ENV in a Dockerfile becomes metadata baked into the image, and it participates in runtime merging.
- Secrets management became mainstream after env var leaks; env vars are easy to dump, log, or expose through debugging endpoints. “Convenient” isn’t the same as “safe.”
- Compose evolved into a spec; different implementations (docker compose plugin vs older docker-compose) historically disagreed on some behaviors. The sharp edges are mostly sanded down now, but legacy habits remain.
One operational truth hasn’t changed: when you stack multiple layers that can all define “the same” value, you will eventually ship the wrong one. Not because you’re careless. Because you’re human and the system is happy to accept conflicting inputs.
Precedence maps: Docker run, Dockerfile, Compose, and interpolation
Dockerfile ENV vs runtime overrides
Image authors set defaults with ENV in the Dockerfile. These become the image’s Config.Env. At runtime, Docker merges environment variables from multiple sources. The rule you should memorize:
Runtime settings override image defaults.
So if the image contains ENV LOG_LEVEL=info and you run:
cr0x@server:~$ docker run --rm -e LOG_LEVEL=debug alpine:3.20 env | grep LOG_LEVEL
LOG_LEVEL=debug
docker run precedence (practical view)
When you start a container with docker run, the effective environment is basically:
- Image defaults (ENV in Dockerfile)
- Env from --env-file (if used)
- Explicit -e KEY=VALUE flags, which override earlier values
There are nuances (like -e KEY meaning “take value from the client environment”), but operationally: explicit beats implicit.
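A quick sketch of the pass-through form (API_TOKEN is a hypothetical variable; the value comes from the client shell):

cr0x@server:~$ export API_TOKEN=abc123
cr0x@server:~$ docker run --rm -e API_TOKEN alpine:3.20 env | grep API_TOKEN
API_TOKEN=abc123

If API_TOKEN weren’t exported in your shell, the container simply wouldn’t receive it: a silent no-op, which is exactly the kind of implicit behavior that turns into drift.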
Compose is two different precedence games at once
Compose makes it worse because it plays two games:
- Interpolate variables into the Compose file (host-side): ${VAR} resolution.
- Define runtime environment for the service (container-side): environment:, env_file:, plus whatever is already in the image.
Compose interpolation precedence (host-side)
For replacing ${VAR} in the YAML, Compose generally follows this intuition:
- Variables from your shell environment (the environment where you run docker compose)
- Variables from the project .env file (if present in the working directory / project directory)
- Default values in the expression, like ${VAR:-default}
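The interpolation syntax gives you two safety valves: a fallback default and a hard-required variable. A minimal sketch (the variable names are illustrative):

cr0x@server:~$ cat docker-compose.yml
services:
  api:
    image: alpine:3.20
    environment:
      LOG_LEVEL: ${LOG_LEVEL:-info}
      DB_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD must be set}

With ${VAR:-default}, a missing or empty LOG_LEVEL falls back to info. With ${VAR:?message}, a missing DB_PASSWORD makes docker compose config fail loudly instead of substituting a blank.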
If you only remember one thing: environment: does not feed interpolation. If you write:
cr0x@server:~$ cat docker-compose.yml
services:
api:
image: alpine:3.20
environment:
DB_HOST: db
command: ["sh", "-lc", "echo ${DB_HOST}"]
That ${DB_HOST} is resolved on the host, not inside the container. If your host doesn’t have DB_HOST set (and your .env doesn’t either), you’ll print an empty string or trigger a warning depending on your Compose version/settings.
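If you actually want the container’s shell to expand the variable at runtime, escape the dollar sign: in Compose files, $$ becomes a literal $. A minimal sketch of the corrected service:

cr0x@server:~$ cat docker-compose.yml
services:
  api:
    image: alpine:3.20
    environment:
      DB_HOST: db
    command: ["sh", "-lc", "echo $$DB_HOST"]

Interpolation now passes $DB_HOST through untouched, and the shell inside the container resolves it against the runtime environment, printing db.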
Compose runtime environment precedence (container-side)
For the environment that ends up inside the container, the usual winner order is:
- Image ENV defaults
- Variables loaded via env_file:
- Variables set in environment:, which override env_file:
- Some implementations also allow CLI overrides via docker compose run -e, which beat the file-based values
Also: if you specify multiple env_file entries, later files override earlier ones. That’s handy for layering. It’s also how you accidentally ship staging settings to production because a file got reordered in a diff.
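Here is what deliberate layering looks like; the paths are illustrative:

cr0x@server:~$ cat docker-compose.yml
services:
  api:
    image: myorg/api:1.8.4
    env_file:
      - ./env/common.env   # base layer
      - ./env/prod.env     # later file wins on conflicting keys
    environment:
      LOG_LEVEL: info      # overrides both files

If LOG_LEVEL appears in all three places, the environment: value ships. Treat file order as code: a reordered list in a diff is a config change, not a cleanup.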
Null, empty, and “present but blank”
There’s a nasty difference between:
- Unset: variable does not exist in the environment
- Empty: variable exists with value ""
- Literal string “null”: exists and equals null (common when templating YAML)
Compose YAML can express empty values in ways that look innocent:
- FOO: (empty)
- FOO: "" (explicit empty string)
Apps often treat “empty but present” as “configured,” leading to misbehavior. For databases, empty hostnames can resolve to localhost or fail over to defaults in library code. The container is fine. The app is “helping.”
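You can tell “present but blank” from “unset” with POSIX parameter expansion; a minimal sketch against a hypothetical OPTIONAL_FLAG (Task 11 below does the same check with printenv):

cr0x@server:~$ docker exec api-1 sh -lc '[ "${OPTIONAL_FLAG+set}" = set ] && echo "present:[${OPTIONAL_FLAG}]" || echo unset'
present:[]

The +set expansion yields set whenever the variable exists, even with an empty value: exactly the case that fools applications.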
One quote to keep you honest
Hope is not a strategy.
— General Gordon R. Sullivan
Env var precedence is where hope goes to die. Instrument it instead.
Where config drift is born: the gotchas that hurt in prod
Gotcha: .env is not your container environment
The .env file used by Compose for interpolation is a convenience feature. It is not automatically injected into the container unless you explicitly reference it with env_file: or map values into environment:.
That’s why “it worked on my laptop” happens: the laptop has a .env in the project directory, CI does not, and production uses a different working directory or runs Compose from a wrapper script.
Gotcha: “I changed the env var, why didn’t the running container change?”
Because containers are not shells. Updating a Compose file doesn’t live-patch the environment of an existing container, and a plain restart reuses the stored container config. You must recreate the container for the new environment to apply.
Gotcha: Healthchecks and sidecars read a different world
Healthcheck commands run inside the container, so they see the container env. But if you templated the healthcheck command with host-side interpolation, you may have injected the wrong value at creation time. The check passes, then production traffic fails. My favorite genre.
Gotcha: Proxy variables and “helpful” defaults
HTTP_PROXY, NO_PROXY, and friends are frequently set on corporate laptops, build agents, and even Docker daemon systemd units. They silently influence build and runtime behavior. You end up with containers that can reach the internet only in certain environments and no one knows why.
Joke #1: Environment variables are like office gossip: they spread everywhere, they’re rarely documented, and they always pick the worst time to be true.
Gotcha: Your app has its own precedence
Common patterns:
- Frameworks that prioritize config files over env vars unless a “use env” switch is set.
- Libraries that read DATABASE_URL if present, otherwise read DB_HOST/DB_USER, etc.
- Apps that treat 0, false, and no inconsistently.
In production, you don’t want “magic.” You want an explicit config contract: which variable wins, and how to validate it at startup.
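One way to enforce that contract is a small entrypoint guard that refuses to start when critical variables are unset or blank. A sketch, assuming a POSIX shell in the image; the script name and variable list are illustrative:

cr0x@server:~$ cat validate-env.sh
#!/bin/sh
# validate-env.sh: fail fast if required vars are unset or empty, then exec the real command
set -eu
for name in DB_HOST DB_USER DB_PASSWORD; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "FATAL: required env var $name is unset or empty" >&2
    exit 1
  fi
done
exec "$@"

Wired in as the entrypoint, the container dies at startup with a clear message instead of limping along on a blank DB_HOST.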
Practical tasks: commands that tell you what’s real (and what to do next)
You don’t debug precedence by reading YAML harder. You debug it by asking the runtime what it actually did. Below are practical tasks I’ve used on real systems, with commands, representative outputs, and the decision you make.
Task 1: Inspect the container’s effective environment (fast truth)
cr0x@server:~$ docker inspect -f '{{range .Config.Env}}{{println .}}{{end}}' api-1 | sort | sed -n '1,10p'
DB_HOST=db-prod
LOG_LEVEL=info
PORT=8080
TZ=UTC
What it means: These are the env vars recorded in the container config. This is what the process gets at start.
Decision: If the value is wrong here, stop blaming the application. Fix Compose/run flags/image defaults and recreate the container.
Task 2: Check what the process actually sees (in case entrypoint mutates env)
cr0x@server:~$ docker exec api-1 sh -lc 'tr "\0" "\n" < /proc/1/environ | sort | grep -E "DB_HOST|LOG_LEVEL|PORT"'
DB_HOST=db-prod
LOG_LEVEL=info
PORT=8080
What it means: PID 1’s environment. If this differs from docker inspect, something inside the container changed it (entrypoint script, supervisor, etc.).
Decision: If PID 1 differs, audit entrypoint scripts and startup tooling. That’s an app/image problem, not Compose.
Task 3: See the fully rendered Compose config (interpolation resolved)
cr0x@server:~$ docker compose config | sed -n '/services:/,/networks:/p' | sed -n '1,80p'
services:
api:
command:
- sh
- -lc
- ./start-api
environment:
DB_HOST: db-prod
LOG_LEVEL: info
image: myorg/api:1.8.4
What it means: This is Compose’s final interpretation, after merging files and resolving ${VAR}.
Decision: If docker compose config shows the wrong value, your interpolation inputs are wrong (shell env, .env, defaults) or your override files aren’t what you think.
Task 4: List containers and confirm you’re debugging the right one
cr0x@server:~$ docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}\t{{.Ports}}'
NAMES IMAGE STATUS PORTS
api-1 myorg/api:1.8.4 Up 2 hours 0.0.0.0:8080->8080/tcp
db-1 postgres:16 Up 2 hours 5432/tcp
What it means: Names, images, uptime, ports. “Wrong config” bugs are often “wrong container” bugs.
Decision: If you see multiple similarly named containers, confirm project name and context; you may be inspecting last week’s stack.
Task 5: Verify which Compose project you’re operating on
cr0x@server:~$ docker compose ls
NAME STATUS CONFIG FILES
billing running(6) /srv/billing/docker-compose.yml
billing-dev running(6) /home/cr0x/billing/docker-compose.yml
What it means: Compose project names are part of identity. Same YAML, different project, different containers.
Decision: If you’re in the wrong project, stop. Switch directory, use -p, or clean up duplicates.
Task 6: Check whether the container was recreated after config changes
cr0x@server:~$ docker inspect -f '{{.Name}} {{.Created}}' api-1
/api-1 2026-01-03T08:12:54.123456789Z
What it means: Creation timestamp. If you edited config at 09:00 and the container was created at 08:12, nothing changed.
Decision: Recreate: docker compose up -d --force-recreate (or at least up -d if Compose detects a diff).
Task 7: Confirm what env files are being applied (and in what order)
cr0x@server:~$ docker compose config --services
api
db
cr0x@server:~$ grep -nE 'env_file|environment' docker-compose.yml
14: env_file:
15: - ./env/common.env
16: - ./env/prod.env
17: environment:
18: LOG_LEVEL: info
What it means: Multiple env files means layering. Later wins. environment: overrides them both.
Decision: If you need a value to be truly authoritative, put it in environment: (or a single final env file) and enforce file order.
Task 8: Detect host-side interpolation values (what Compose is pulling from)
cr0x@server:~$ env | grep -E '^DB_HOST=|^LOG_LEVEL='
LOG_LEVEL=debug
What it means: Your shell already has LOG_LEVEL=debug. If your Compose file uses ${LOG_LEVEL}, you just overrode production config with your terminal history.
Decision: Run Compose with a clean environment for production operations, or explicitly set required vars in a controlled file.
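A sketch of the clean-environment approach, assuming docker is on the default PATH and your controlled interpolation inputs live in a hypothetical ./env/deploy.env:

cr0x@server:~$ env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin HOME="$HOME" docker compose --env-file ./env/deploy.env config > /tmp/rendered.yml

env -i wipes the inherited environment, so the only interpolation inputs are the ones you passed explicitly. If the rendered output changes between runs, a human shell was leaking into your config.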
Task 9: Show which variables are missing during interpolation (catch silent blanks)
cr0x@server:~$ docker compose config 2>&1 | grep -i warning
WARNING: The "DB_PASSWORD" variable is not set. Defaulting to a blank string.
What it means: Compose substituted an empty string. Your YAML is now valid but your system isn’t.
Decision: Treat this as a deployment failure. Wire CI to fail on missing variables, or use required-variable patterns in your templates.
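A minimal CI gate, assuming the warning wording shown above (it varies slightly across Compose versions, so verify the pattern against yours):

cr0x@server:~$ docker compose config >/dev/null 2>/tmp/compose-warnings.txt
cr0x@server:~$ if grep -qi 'variable is not set' /tmp/compose-warnings.txt; then echo "FAIL: blank substitution"; exit 1; fi

If any interpolation input is missing, the grep matches, the job exits nonzero, and the blank string never reaches production.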
Task 10: Confirm image defaults (ENV baked into image)
cr0x@server:~$ docker image inspect myorg/api:1.8.4 -f '{{json .Config.Env}}'
["PORT=8080","LOG_LEVEL=warn","TZ=UTC"]
What it means: The image ships with LOG_LEVEL=warn. If you thought “unset means default,” the image author already decided.
Decision: Either explicitly override in Compose, or remove the default from the image if it causes surprises (prefer documentation + explicit config).
Task 11: Verify whether a variable is set to empty vs not set (inside container)
cr0x@server:~$ docker exec api-1 sh -lc 'if printenv OPTIONAL_FLAG >/dev/null 2>&1; then echo "present:[$OPTIONAL_FLAG]"; else echo "unset"; fi'
present:[]
What it means: The variable exists but is empty. Many apps treat that as “configured,” and then fall into weird code paths.
Decision: If empty should behave like unset, don’t set it at all. Remove it from environment: or your env files.
Task 12: Find the source of a value by diffing rendered config against runtime
cr0x@server:~$ docker compose config --format json | jq -r '.services.api.environment'
{
"DB_HOST": "db-prod",
"LOG_LEVEL": "info"
}
cr0x@server:~$ docker inspect -f '{{range .Config.Env}}{{println .}}{{end}}' api-1 | grep -E 'DB_HOST|LOG_LEVEL'
DB_HOST=db-prod
LOG_LEVEL=info
What it means: Compose and runtime agree. If behavior is still wrong, the app is overriding config or misparsing.
Decision: Shift the investigation to the application: startup logs, config dump endpoints, library precedence.
Task 13: Confirm what Compose thinks changed (avoid no-op deploys)
cr0x@server:~$ docker compose up -d
[+] Running 0/0
What it means: Compose saw nothing to do. Your env change may not be in the service definition (or it’s only in interpolation but didn’t change output).
Decision: Use --force-recreate or explicitly recreate the affected service after confirming the rendered config actually changed.
Task 14: Spot accidental proxy inheritance (classic corporate network trap)
cr0x@server:~$ docker exec api-1 sh -lc 'env | grep -i proxy'
HTTP_PROXY=http://proxy.corp:3128
NO_PROXY=localhost,127.0.0.1,db
What it means: The container is using a proxy. That can break service-to-service calls, certificate validation, and latency.
Decision: If this should not be present, explicitly unset or override proxy vars in Compose for production workloads.
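A sketch of pinning proxy behavior in the service definition (whether an empty string truly disables a proxy depends on the HTTP library, so verify with your app):

cr0x@server:~$ cat docker-compose.yml
services:
  api:
    image: myorg/api:1.8.4
    environment:
      HTTP_PROXY: ""
      HTTPS_PROXY: ""
      NO_PROXY: "localhost,127.0.0.1,db"

This doesn’t unset the variables; it pins them to known values so host and daemon inheritance can’t smuggle in a corporate proxy.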
Fast diagnosis playbook
This is the order that minimizes time-to-truth when a container is “configured wrong.” Don’t improvise. Use the funnel.
First: identify the runtime truth
- Confirm the target container: docker ps and verify name/image/uptime.
- Inspect effective env: docker inspect ... .Config.Env.
- Check PID 1 env: docker exec ... /proc/1/environ.
If runtime env is wrong, you’re in Docker/Compose land. If runtime env is right, you’re in application land.
Second: confirm Compose’s rendered intent
- Render the config: docker compose config.
- Check interpolation inputs: your shell env and your project .env.
- Check layering: multiple Compose files, env_file order, environment: overrides.
Third: confirm the change actually shipped
- Creation time: docker inspect ... .Created.
- Recreate if needed: docker compose up -d --force-recreate for the service.
- Verify again: inspect env after recreation.
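The whole funnel fits in one small script you can keep next to the Compose file; a sketch (the script name is hypothetical):

cr0x@server:~$ cat verify-env.sh
#!/bin/sh
# verify-env.sh <container>: Docker's recorded env, PID 1's env, creation time
set -eu
c="${1:?usage: verify-env.sh <container>}"
echo "== config env =="
docker inspect -f '{{range .Config.Env}}{{println .}}{{end}}' "$c" | sort
echo "== pid1 env =="
docker exec "$c" sh -c 'tr "\0" "\n" < /proc/1/environ | sort'
echo "== created =="
docker inspect -f '{{.Created}}' "$c"

If the first two sections disagree, look inside the image (entrypoints, supervisors). If they agree but are wrong, look at Compose and your deploy environment.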
Joke #2: In Docker, the only consistent configuration is the one you didn’t mean to override.
Three corporate mini-stories (anonymized, technically real)
Incident: the wrong assumption (“Compose env overrides everything, right?”)
A mid-sized company ran a payments API in Docker Compose on a handful of VM hosts. Their workflow was clean on paper: a base docker-compose.yml plus an override file per environment. The application accepted configuration via env vars. Classic.
A Friday deploy rolled out a change: a new env var PAYMENTS_PROVIDER_TIMEOUT_MS. The engineer set it in environment: for the production override. The service still timed out under load—then started retrying aggressively, causing upstream rate limits to kick in.
The team assumed the value “didn’t apply.” They increased it again. Same behavior. The real issue was that the image already had ENV PAYMENTS_PROVIDER_TIMEOUT_MS=2000 and the application had a config file baked into the image that took priority over env vars unless USE_ENV_CONFIG=true was set. In staging, that flag was set via a developer’s shell export and interpolated into the override file. In production, it wasn’t set. Compose substituted a blank. The container started with the default behavior: ignore env config.
They debugged the provider, the network, and the database before anyone ran docker compose config and saw the missing variable warning. The fix was boring: make USE_ENV_CONFIG explicit in the Compose service definition, and fail the deploy if it’s missing. Then rebuild the image to stop shipping an ambiguous config file.
Postmortem takeaway: don’t treat “env var exists” as “app uses env.” That’s two different contracts, and one of them was never written down.
Optimization that backfired: “Let’s deduplicate config with a shared env file”
A large enterprise team got tired of repeating configuration across 20 services. They introduced a shared env/common.env and included it via env_file: everywhere. Then they added env/prod.env, env/stage.env, and so on. The diff noise dropped. People celebrated. This is how it starts.
Three months later, a new service was added. Someone copied an existing service stanza and forgot to include env/prod.env. The service came up with defaults from the image and from common.env. It “worked,” but talked to the staging cache cluster because CACHE_HOST lived in prod.env for most services and in common.env for one legacy service. Both values were plausible hostnames. No immediate crash. Just wrong data paths.
They tried to “fix” it by moving more into common.env. That made the blast radius larger. Now a mistaken inclusion or omission changed behavior across multiple environments. They introduced “environment drift by file inclusion.” It’s a specific kind of failure: not a typo, but a missing layer.
The recovery was to stop pretending one env file could serve all services. They kept a shared file, but limited it to truly global non-risky values (time zone, log format, common feature toggles), and they required each service to have an explicit, environment-specific file. CI validated that every service had the expected env sources. Deduplication stayed, but with guardrails.
Boring but correct practice: pin, render, verify (and it saved the day)
A smaller team ran a customer support platform where downtime was visible within minutes. They had a habit that looked paranoid: every deploy produced an artifact containing the fully rendered Compose config (docker compose config output) and the final env var list per service, and they stored it alongside the build metadata.
One afternoon, the API started rejecting requests because it thought it was in “maintenance mode.” That mode was controlled by MAINTENANCE=true. Nobody had set it. Nobody admitted to setting it. Slack was doing its thing.
They compared the current artifact to the previous one. The rendered config clearly showed MAINTENANCE=true had been interpolated from the host environment during a manual hotfix deploy. The engineer had exported it earlier for a local test and forgot. Compose obediently substituted it into YAML that used ${MAINTENANCE} for convenience.
The fix took five minutes: rerun the deploy with a clean environment and remove host interpolation for that flag. The lesson stuck because it was measurable: their “paranoid” practice turned a vague mystery into a single diff line. Nothing heroic. Just evidence.
Common mistakes: symptoms → root cause → fix
1) Symptom: variable is correct in docker-compose.yml, but container has a different value
Root cause: You changed YAML but didn’t recreate the container, or you’re looking at a different Compose project.
Fix: Check docker compose ls, confirm container creation time, then docker compose up -d --force-recreate api.
2) Symptom: ${VAR} becomes blank even though you set it under environment:
Root cause: Confusing host-side interpolation with container runtime environment. Compose interpolates from the shell and .env, not from environment:.
Fix: Move the value into the host environment (controlled), or stop interpolating and use literals. Validate with docker compose config.
3) Symptom: value differs between staging and prod even with the same Compose files
Root cause: Different shell environment at deploy time (CI agent vs human shell), or different .env file in the working directory.
Fix: Deploy from a controlled environment. Avoid relying on developer shell exports. Store required interpolation vars in an explicit file used by the deploy job.
4) Symptom: app behaves like a setting is “enabled” even when you set false
Root cause: App parses booleans incorrectly (“non-empty string is truthy”), or uses a different variable name than you think.
Fix: Confirm by dumping effective config at app startup. Use strict parsing in the app. Prefer 0/1 if the app is sloppy.
5) Symptom: secrets show up in logs or support bundles
Root cause: Secrets passed via env vars are easy to dump via env, debug endpoints, or crash dumps.
Fix: Use Docker secrets (or file-mounted secrets) and pass paths, not values. At minimum, redact and lock down diagnostics.
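A sketch of the file-based pattern using Compose secrets; the _FILE suffix convention is app-specific (common in official images, but confirm your application reads it):

cr0x@server:~$ cat docker-compose.yml
services:
  api:
    image: myorg/api:1.8.4
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt

The env var now carries a path, not a credential. An env dump or crash report shows /run/secrets/db_password, which is useless to anyone reading logs.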
6) Symptom: requests randomly go through a proxy or fail only on some hosts
Root cause: Proxy env vars inherited from host, systemd unit, or CI runner.
Fix: Explicitly set/unset HTTP_PROXY/NO_PROXY in Compose for production. Verify inside the container.
7) Symptom: Compose warns “variable not set, defaulting to blank string,” but deploy still succeeds
Root cause: Missing required interpolation inputs, and your pipeline doesn’t treat warnings as failures.
Fix: Gate deployments on “no missing variables” by parsing docker compose config output in CI, or by validating a required-vars list before running Compose.
8) Symptom: a variable from env_file doesn’t seem to apply
Root cause: It is overridden by environment:, by a later env_file, or by docker compose run -e.
Fix: Render the config, check ordering, and decide which layer is authoritative. Make it explicit.
Checklists / step-by-step plan
Checklist: make env var precedence boring (the goal)
- Stop using host interpolation for runtime settings unless you have a controlled deploy environment. If it changes, it should change in versioned config.
- Prefer explicit environment: for service-critical values (DB endpoints, feature flags that can break safety, modes like maintenance, etc.).
- Use env_file: for bulk defaults, but keep it small and predictable. Avoid “kitchen sink” env files shared by every service.
- Use a single, environment-specific env file per service if you must use env files. Layering is fine, but layering requires discipline and validation.
- Don’t ship meaningful defaults in the image unless you mean it. Image defaults are invisible to Compose readers and love to surprise you.
- Make missing variables fatal. A blank string is rarely a safe default for credentials, endpoints, or feature toggles.
- Recreate containers when env changes. Bake that into your deploy procedure; don’t rely on human memory.
- Record the rendered Compose config for each deploy so you can diff what you intended vs what you shipped (see the sketch after this checklist).
- Keep secrets out of env vars. Use files/secrets, pass paths, and audit what your diagnostics dump.
- Add a startup log line or endpoint that prints non-secret config (sanitized) so you can confirm the app’s interpretation.
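Recording the rendered config takes two commands; a sketch with a hypothetical artifacts directory and representative diff output:

cr0x@server:~$ docker compose config > "artifacts/rendered-$(date +%Y%m%dT%H%M%S).yml"
cr0x@server:~$ diff artifacts/rendered-20260103T081254.yml artifacts/rendered-20260103T090001.yml
12a13
>       MAINTENANCE: "true"

One diff line is exactly how the third mini-story above found its phantom maintenance mode.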
Step-by-step: when you need to change a value safely
- Change the authoritative layer (usually environment: or a single final env file).
- Render the result: run docker compose config and confirm the value appears where you expect.
- Ship the change: docker compose up -d --force-recreate service.
- Verify at runtime: docker inspect and /proc/1/environ.
- Verify app interpretation: read startup logs or hit a config/status endpoint.
- Write down the precedence rule for that setting (even one sentence) so nobody repeats the incident.
FAQ
1) Does .env automatically become container environment variables?
No. .env is primarily used by Compose for variable substitution in the Compose file. To inject values into the container, use env_file: or environment:.
2) What wins: env_file or environment?
environment wins. If both define FOO, the value in environment: is the one the container gets.
3) What wins: image ENV or Compose environment?
Compose environment wins. Image defaults are the base layer; runtime configuration overrides them.
4) Why does ${VAR} sometimes substitute to an empty string without failing?
Because Compose treats missing interpolation variables as “not set” and may default to blank, often emitting a warning. If you don’t treat warnings as failures, you just shipped an empty value.
5) I set an env var and restarted the container. Why didn’t it apply?
A restart doesn’t change the container configuration. You need to recreate the container so Docker stores the new env in the container config.
6) Is docker exec env enough to know what’s configured?
It’s good, but check PID 1’s environment (/proc/1/environ) if you suspect entrypoint scripts or supervisors. Also confirm with docker inspect to see what Docker thinks the config is.
7) Are environment variables safe for secrets?
They’re convenient, not safe. They can leak through process listings, crash dumps, debug endpoints, and support bundles. Prefer secrets mounted as files and pass paths via env vars if needed.
8) Why do different machines produce different rendered Compose configs?
Because interpolation depends on the environment where Compose runs: shell variables, a local .env, and sometimes different working directories or wrappers. Rendered config is a build artifact—treat it like one.
9) If my app reads DATABASE_URL and also DB_HOST, what should I do?
Pick one contract and enforce it. If you must support both, define a strict precedence in the app and log which source won at startup (without printing secrets).
10) How do I stop developers’ shell env from affecting production deploys?
Run deploys from CI or a dedicated deploy environment with a sanitized environment. Avoid ${VAR} interpolation for runtime-critical settings unless the inputs are controlled and validated.
Conclusion: next steps that stop the bleeding
If you’ve been treating env vars as “simple,” Docker has been quietly disagreeing with you. The system isn’t malicious—just layered. The cure is to make those layers explicit and observable.
- Start using docker compose config as a first-class deploy artifact. If it’s not rendered, it’s not real.
- Make runtime verification routine: docker inspect for container config, /proc/1/environ for process truth.
- Collapse unnecessary layers: fewer env files, fewer overrides, fewer “magic defaults” in images.
- Fail fast on missing variables. Warnings about blank defaults should be treated like a broken build.
- Move secrets out of env vars. Your future incident report will thank you.
When a system is behaving oddly, the fastest path is rarely deeper intuition. It’s asking the runtime what it did, then making it hard for humans to accidentally do the wrong thing again.