You know the scene: a repo with docker-compose.yml, docker-compose.dev.yml, docker-compose.prod.yml,
and one more file someone made “temporarily” during an incident. Then a year passes. Now your “dev” stack accidentally enables the prod-grade
reverse proxy, or prod quietly runs the debug image because the override chain is haunted.
Compose profiles are the grown-up answer: one Compose file, multiple stacks, predictable behavior. Less YAML archaeology, fewer “works on my laptop”
surprises, and far fewer Friday-night reversions.
What Compose profiles are (and what they aren’t)
A Compose profile is a label you attach to a service (and sometimes other resources) so it only starts when that profile is enabled.
It’s basically conditional inclusion. The Compose file remains one coherent model; profiles decide which parts are active for a run.
Here’s the core mental model:
- Without profiles: running docker compose up starts every service in the file (subject to dependencies).
- With profiles: services tagged with profiles are excluded unless that profile is enabled via --profile or COMPOSE_PROFILES.
- Default services: services with no profiles: entry behave like "always on".
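A minimal sketch of that shape, with hypothetical service and image names (this is not the reference file shown later):

services:
  api:
    image: ghcr.io/acme/api:1.0.0      # no profiles key: always part of the stack
  adminer:
    image: adminer:4
    profiles: ["dev"]                  # only considered when the dev profile is enabled

# docker compose up -d                  -> creates api only
# docker compose --profile dev up -d    -> creates api and adminer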
What profiles are not: a full templating system, a secrets manager, or a replacement for a proper deployment tool. They won’t stop you
from doing something reckless; they just make it harder to do it accidentally.
Opinionated guidance: treat profiles as feature gates for runtime topology. Use them to add/remove sidecars, tooling,
dev-only dependencies, and operational helpers. Don’t use them to paper over fundamentally different production architectures. If prod runs
on Kubernetes and dev runs on Compose, fine—profiles still help in dev and in local prod-like validation. But don’t pretend Compose profiles
magically make dev identical to prod. They make it disciplined.
One quote to keep your head straight during the next “just ship it” debate:
Hope is not a strategy.
— General Gordon R. Sullivan
Joke #1: If your dev and prod Compose files diverge long enough, they’ll eventually file separate tax returns.
Facts and history: why profiles exist
Profiles feel obvious now, but they’re a response to years of messy reality. Some context helps you understand the sharp edges.
8 concrete facts that matter in practice
- Compose started life as Fig (2013–2014 era): it was designed for local multi-container apps, not enterprise deployment. Profiles are a later concession to how people actually used it.
- Override files became the default workaround: docker-compose.override.yml was a convenience feature, and it accidentally trained teams to fork configuration endlessly.
- Profiles arrived to reduce YAML sprawl: they let one file represent multiple shapes without a pile of overrides.
- Compose V2 shifted into the Docker CLI: docker compose (space) replaced docker-compose (dash) for most modern installs. Profiles are much more consistently supported there.
- Profiles are resolved client-side: the Compose CLI decides what to create. The Engine isn't aware of your intent. That means the "source of truth" is the Compose config you actually ran.
- Profiles interact with dependencies in non-obvious ways: a service with a profile can be pulled in because another service depends on it (depending on how you start things). You need to test your startup paths (see the sketch after this list).
- Multi-environment drift is an availability problem: duplicate YAML files don't just waste time; they create unknown unknowns that show up during incidents.
- Profiles pair well with "operational tooling" containers: backup jobs, migration runners, log shippers, and admin UIs can be opt-in without infecting your default stack.
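For the dependency point above, the cheapest check is to start one service and look at what actually came up with it. A rough sketch, assuming a compose.yml in the current directory and a service named app:

# Start a single service and list what Compose created alongside it.
docker compose -f compose.yml up -d app
docker compose -f compose.yml ps --services   # services that have running containers
# If a profile-gated dependency appears here, depends_on pulled it in.
docker compose -f compose.yml down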
Design principles: how to structure a single-file dev/prod stack
A single Compose file can be clean or cursed. Profiles don’t save you if you design for chaos. Design for predictability instead.
1) Separate “always-on” from “contextual” services
Put your app, its database, and whatever is required to boot into the default set (no profile).
Put developer luxuries (live reload, admin UIs, fake SMTP, local S3, debug shells) behind dev.
Put production-only infra choices (real TLS edge, WAF-ish reverse proxy rules, log forwarders) behind prod or ops.
2) Keep ports boring, stable, and intentional
In dev, you probably publish ports to the host. In prod, you often don’t; you attach to a network and let a reverse proxy handle ingress.
Use profiles to avoid “prod accidentally binds to 0.0.0.0:5432” incidents.
3) Prefer named volumes; make persistence explicit
Storage is where “dev/prod differences” become data loss. Named volumes are fine for local, but prod should use mounted paths or a managed volume
driver and clearly defined backup/restore workflows.
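One way to make that explicit while keeping named-volume syntax is the local driver bound to a known host path. A sketch, where /srv/app/pgdata is a hypothetical directory that must already exist on the host:

volumes:
  db_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /srv/app/pgdata   # backups and restores now target a path you control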
4) Treat environment variables as API, not as a junk drawer
Use .env files, but don’t let them become a second configuration language. Use explicit defaults, document required variables,
and validate them in your entrypoint if the app is yours.
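If the app image is yours, a few lines in the entrypoint turn a mystery crash loop into an explicit error. A minimal sketch (the variable names are examples, not a standard):

#!/bin/sh
# entrypoint.sh: fail fast if required configuration is missing
set -eu

for var in DATABASE_URL REDIS_URL APP_ENV; do
  eval "value=\${$var:-}"
  if [ -z "$value" ]; then
    echo "FATAL: required environment variable $var is not set" >&2
    exit 1
  fi
done

exec "$@"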
5) Compose is not an orchestrator; don’t cosplay
Compose can restart containers, do healthchecks, and define dependencies. It is not scheduling across nodes, doing progressive rollout, or managing
secrets at scale. Use it as a reliable “stack runner”. If you need more, graduate—don’t bolt on a pile of scripts until you reinvent a worse Kubernetes.
Joke #2: “Just one more override file” is how you summon YAML poltergeists.
A reference Compose file using profiles (dev/prod/ops)
This is a realistic baseline: a web app, a Postgres database, a cache, and optional helpers. The goal isn’t to be fancy.
The goal is to be hard to misuse.
cr0x@server:~$ cat compose.yml
services:
  app:
    image: ghcr.io/acme/demo-app:1.8.2
    environment:
      APP_ENV: ${APP_ENV:-dev}
      DATABASE_URL: postgres://app:${POSTGRES_PASSWORD:-devpass}@db:5432/app
      REDIS_URL: redis://redis:6379/0
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    networks: [backend]
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost:8080/healthz"]
      interval: 10s
      timeout: 2s
      retries: 12

  db:
    image: postgres:16
    environment:
      POSTGRES_DB: app
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-devpass}
    volumes:
      - db_data:/var/lib/postgresql/data
    networks: [backend]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d app"]
      interval: 5s
      timeout: 2s
      retries: 20

  redis:
    image: redis:7
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    networks: [backend]

  # Dev-only: bind ports, live reload, friendly tools
  app-dev:
    profiles: ["dev"]
    image: ghcr.io/acme/demo-app:1.8.2
    environment:
      APP_ENV: dev
      LOG_LEVEL: debug
      DATABASE_URL: postgres://app:${POSTGRES_PASSWORD:-devpass}@db:5432/app
      REDIS_URL: redis://redis:6379/0
    command: ["./run-dev.sh"]
    volumes:
      - ./src:/app/src:delegated
    ports:
      - "8080:8080"
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    networks: [backend]

  mailhog:
    profiles: ["dev"]
    image: mailhog/mailhog:v1.0.1
    ports:
      - "8025:8025"
    networks: [backend]

  adminer:
    profiles: ["dev"]
    image: adminer:4
    ports:
      - "8081:8080"
    networks: [backend]

  # Prod-ish: reverse proxy and tighter exposure
  edge:
    profiles: ["prod"]
    image: nginx:1.27
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d:ro
    ports:
      - "80:80"
    depends_on:
      app:
        condition: service_healthy
    networks: [frontend, backend]

  # Ops-only: migrations and backups
  migrate:
    profiles: ["ops"]
    image: ghcr.io/acme/demo-app:1.8.2
    command: ["./migrate.sh"]
    environment:
      APP_ENV: ${APP_ENV:-prod}
      DATABASE_URL: postgres://app:${POSTGRES_PASSWORD}@db:5432/app
    depends_on:
      db:
        condition: service_healthy
    networks: [backend]

  pg-backup:
    profiles: ["ops"]
    image: postgres:16
    environment:
      PGPASSWORD: ${POSTGRES_PASSWORD}
    entrypoint: ["/bin/sh", "-lc"]
    # One list element, so the whole pipeline reaches sh -c intact;
    # $$ escapes Compose interpolation so the shell sees $(date ...)
    command:
      - >
        pg_dump -h db -U app -d app
        | gzip -c
        > /backup/app-$$(date +%F_%H%M%S).sql.gz
    volumes:
      - ./backup:/backup
    depends_on:
      db:
        condition: service_healthy
    networks: [backend]

networks:
  frontend: {}
  backend: {}

volumes:
  db_data: {}
What this structure buys you
- Default is safe: app, db, and redis run with no host port exposure by default.
- Dev is ergonomic: enable dev to get a live-reload app, mail testing, and Adminer.
- Prod is controlled: enable prod to add an edge proxy; still no random dev ports.
- Ops is explicit: migrations and backups are not "always running"; they're invoked intentionally.
Note the deliberate duplication: app and app-dev are separate services. That’s not laziness.
It’s a safety boundary. The dev service binds ports and mounts source code; the prod-ish service does neither.
You can share an image tag while separating runtime behavior.
Practical tasks: 12+ real commands, outputs, and decisions
Below are concrete operational moves you’ll actually use. Each has: a command, what typical output means, and what decision you make next.
Run them in the repo root where compose.yml lives.
Task 1: Verify your Compose supports profiles (and which version you’re running)
cr0x@server:~$ docker compose version
Docker Compose version v2.27.0
Meaning: Compose V2 is installed. Profiles are supported.
If you see “command not found” or an ancient v1 binary, expect inconsistent behavior.
Decision: Standardize on docker compose across your team/CI. Mixed v1/v2 is how you get “but it worked yesterday” tickets.
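A small guard at the top of CI scripts makes that standard enforceable rather than aspirational; a sketch:

# Fail early if the Compose v2 plugin isn't available on this runner.
if ! docker compose version >/dev/null 2>&1; then
  echo "docker compose (v2 plugin) not found; refusing to continue" >&2
  exit 1
fi
docker compose version   # record the exact version in the job log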
Task 2: Render the effective config for a profile (catch surprises before you start containers)
cr0x@server:~$ docker compose -f compose.yml --profile dev config
services:
  adminer:
    image: adminer:4
    networks:
      backend: null
    ports:
      - mode: ingress
        target: 8080
        published: "8081"
        protocol: tcp
  app:
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      APP_ENV: dev
      DATABASE_URL: postgres://app:devpass@db:5432/app
      REDIS_URL: redis://redis:6379/0
    healthcheck:
      test:
        - CMD
        - curl
        - -fsS
        - http://localhost:8080/healthz
    image: ghcr.io/acme/demo-app:1.8.2
    networks:
      backend: null
  app-dev:
    command:
      - ./run-dev.sh
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      APP_ENV: dev
      DATABASE_URL: postgres://app:devpass@db:5432/app
      LOG_LEVEL: debug
      REDIS_URL: redis://redis:6379/0
    image: ghcr.io/acme/demo-app:1.8.2
    networks:
      backend: null
    ports:
      - mode: ingress
        target: 8080
        published: "8080"
        protocol: tcp
    volumes:
      - type: bind
        source: /home/cr0x/src
        target: /app/src
        bind:
          create_host_path: true
  db:
    environment:
      POSTGRES_DB: app
      POSTGRES_PASSWORD: devpass
      POSTGRES_USER: app
    healthcheck:
      test:
        - CMD-SHELL
        - pg_isready -U app -d app
    image: postgres:16
    networks:
      backend: null
    volumes:
      - type: volume
        source: db_data
        target: /var/lib/postgresql/data
  mailhog:
    image: mailhog/mailhog:v1.0.1
    networks:
      backend: null
    ports:
      - mode: ingress
        target: 8025
        published: "8025"
        protocol: tcp
  redis:
    command:
      - redis-server
      - --save
      - ""
      - --appendonly
      - "no"
    image: redis:7
    networks:
      backend: null
networks:
  backend: {}
volumes:
  db_data: {}
Meaning: You can see exactly which services appear under dev. This is your sanity check for accidental port exposure,
missing env vars, or wrong images.
Decision: If the rendered config includes anything you didn’t intend, fix the Compose file before you run it. Don’t debug a configuration typo at runtime.
Task 3: Start the safe default stack (no profiles enabled)
cr0x@server:~$ docker compose -f compose.yml up -d
[+] Running 5/5
✔ Network server_backend Created
✔ Volume "server_db_data" Created
✔ Container server-db-1 Started
✔ Container server-redis-1 Started
✔ Container server-app-1 Started
Meaning: Only default services started. No dev tools, no edge proxy.
Decision: Use this as your baseline for CI smoke tests and “prod-like local” runs. The more boring it is, the better it behaves during incidents.
Task 4: Start the dev experience explicitly
cr0x@server:~$ docker compose -f compose.yml --profile dev up -d
[+] Running 3/3
✔ Container server-mailhog-1 Started
✔ Container server-adminer-1 Started
✔ Container server-app-dev-1 Started
Meaning: Compose added only the dev-profile services; the default services were already running.
Decision: Make “dev is opt-in” a team rule. If someone wants debug ports in prod, they should have to say it out loud with a profile flag.
Task 5: Prove which profiles are enabled (useful in CI logs)
cr0x@server:~$ COMPOSE_PROFILES=prod docker compose -f compose.yml config --profiles
prod
Meaning: The CLI acknowledges which profile(s) will be considered. This is a small trick that prevents big misunderstandings.
Decision: In CI, echo effective profiles at the top of the job. You’re buying future-you a shorter incident.
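A sketch of that CI step (the artifact filename is just an example):

echo "COMPOSE_PROFILES=${COMPOSE_PROFILES:-<unset>}"
docker compose -f compose.yml config --profiles
docker compose -f compose.yml config > rendered-config.yml   # attach to the job summary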
Task 6: List containers for the project and spot profile services
cr0x@server:~$ docker compose -f compose.yml ps
NAME IMAGE COMMAND SERVICE STATUS PORTS
server-adminer-1 adminer:4 "entrypoint.sh php …" adminer running 0.0.0.0:8081->8080/tcp
server-app-1 ghcr.io/acme/demo-app:1.8.2 "./start.sh" app running (healthy)
server-app-dev-1 ghcr.io/acme/demo-app:1.8.2 "./run-dev.sh" app-dev running 0.0.0.0:8080->8080/tcp
server-db-1 postgres:16 "docker-entrypoint…" db running (healthy) 5432/tcp
server-mailhog-1 mailhog/mailhog:v1.0.1 "MailHog" mailhog running 0.0.0.0:8025->8025/tcp
server-redis-1 redis:7 "docker-entrypoint…" redis running 6379/tcp
Meaning: You can see which services are running and which ports are published. The PORTS column is your “what did we expose?” audit.
Decision: If you see ports published in environments where they shouldn’t be, stop and fix the file. Don’t normalize accidental exposure.
Task 7: Confirm why a service won’t start (dependency and healthcheck reality check)
cr0x@server:~$ docker compose -f compose.yml logs --no-log-prefix --tail=30 app
curl: (7) Failed to connect to localhost port 8080: Connection refused
Meaning: The healthcheck is failing. Either the app isn’t listening, it’s listening on a different port, or it’s crashing before bind.
Decision: Check docker compose logs app for startup errors, then docker exec into the container to validate the listening port.
Don’t touch the DB yet; most app healthcheck failures are app config, not storage.
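A sketch of that follow-up, assuming the image ships a shell and at least one of ss or netstat (adjust to what your image actually contains):

docker compose -f compose.yml logs --tail=100 app        # startup errors first
docker compose -f compose.yml exec app sh -lc 'ss -ltn || netstat -ltn || cat /proc/net/tcp'
docker compose -f compose.yml exec app sh -lc 'env | grep -i port'   # is it even told to bind 8080?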
Task 8: Inspect effective environment variables (find the “wrong .env” problem fast)
cr0x@server:~$ docker compose -f compose.yml exec -T app env | egrep 'APP_ENV|DATABASE_URL|REDIS_URL'
APP_ENV=dev
DATABASE_URL=postgres://app:devpass@db:5432/app
REDIS_URL=redis://redis:6379/0
Meaning: The container sees the values you think it sees. If the password is missing or empty, your .env isn’t loaded or the variable name is wrong.
Decision: If env vars are wrong, fix the caller side (your shell export, your CI secret injection, or the Compose file). Don’t “hotfix” by editing containers.
Task 9: Identify image drift between dev and prod profile services
cr0x@server:~$ docker compose -f compose.yml images
CONTAINER REPOSITORY TAG IMAGE ID SIZE
server-app-1 ghcr.io/acme/demo-app 1.8.2 7a1d0f2c9a33 212MB
server-app-dev-1 ghcr.io/acme/demo-app 1.8.2 7a1d0f2c9a33 212MB
server-db-1 postgres 16 5e2c6e1e12b8 435MB
server-redis-1 redis 7 1c90a3f8e3a4 118MB
Meaning: Both app services use the same image ID. That’s good: your dev behavior differs by command/volumes/ports, not by untracked code.
Decision: If image IDs differ unexpectedly, decide whether that’s intentional. If it’s not, unify tags or stop pretending the environments are comparable.
Task 10: Prove which services are actually part of a profile (useful during refactors)
cr0x@server:~$ docker compose -f compose.yml config --services
adminer
app
app-dev
db
edge
mailhog
migrate
pg-backup
redis
Meaning: This lists all services in the file, including profile-gated ones. Now you can cross-check ownership and remove dead weight.
Decision: If nobody can explain why a service exists, delete it or move it behind an ops profile and require explicit invocation.
Task 11: Start prod profile locally without dev exposure
cr0x@server:~$ COMPOSE_PROFILES=prod docker compose -f compose.yml up -d
[+] Running 1/1
✔ Container server-edge-1 Started
Meaning: Only the edge service was added; default services were already present.
Decision: Use this to validate nginx config changes with the same app/db you use elsewhere, without bringing in dev-only tools.
Task 12: Run one-off ops jobs without leaving zombie containers
cr0x@server:~$ COMPOSE_PROFILES=ops docker compose -f compose.yml run --rm migrate
Running migrations...
Migrations complete.
Meaning: The migration container ran and was removed. No long-running service, no surprise restarts.
Decision: Keep “ops actions” as run --rm jobs. If your migrations run as a permanent service, you’re creating a self-inflicted pager.
Task 13: Take a backup with the ops profile and validate the file exists
cr0x@server:~$ COMPOSE_PROFILES=ops docker compose -f compose.yml run --rm pg-backup
cr0x@server:~$ ls -lh backup | tail -n 2
-rw-r--r-- 1 cr0x cr0x 38M Jan 3 01:12 app-2026-01-03_011230.sql.gz
Meaning: The backup landed on the host filesystem. That’s the difference between “we have backups” and “we have a comforting story”.
Decision: If the file isn’t there, don’t proceed with risky changes. Fix mounts/permissions first. Backups that don’t restore are just performance art.
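A restore drill can be a throwaway Postgres container plus the latest dump; a sketch with hardcoded names and a deliberately crude wait:

latest=$(ls -t backup/app-*.sql.gz | head -n 1)
docker run -d --rm --name restore-test -e POSTGRES_PASSWORD=scratch postgres:16
sleep 15   # crude; a pg_isready retry loop is the grown-up version
docker exec restore-test psql -U postgres -c "CREATE ROLE app LOGIN"
docker exec restore-test createdb -U postgres -O app app
gunzip -c "$latest" | docker exec -i restore-test psql -U app -d app
docker exec restore-test psql -U app -d app -c '\dt'   # did the tables actually come back?
docker rm -f restore-test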
Task 14: Detect port collisions before blaming Docker
cr0x@server:~$ ss -ltnp | egrep ':8080|:8081|:8025' || true
LISTEN 0 4096 0.0.0.0:8080 0.0.0.0:* users:(("docker-proxy",pid=22419,fd=4))
LISTEN 0 4096 0.0.0.0:8081 0.0.0.0:* users:(("docker-proxy",pid=22455,fd=4))
LISTEN 0 4096 0.0.0.0:8025 0.0.0.0:* users:(("docker-proxy",pid=22501,fd=4))
Meaning: The host ports are already bound by Docker proxy processes. If your next up fails with “port is already allocated”, this is why.
Decision: Either stop the competing stack or change published ports. Don’t “solve” it by running everything privileged and hoping.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-sized SaaS team kept two Compose files: one for dev, one for “prod-like”. The assumption was polite and deadly:
“They’re basically the same; prod-like just adds nginx.” Nobody re-verified that statement after the tenth small change.
A new engineer added a Redis container to the dev file only, because the app had a feature flag and “prod doesn’t use it yet”.
Weeks later, prod started enabling the flag in a canary. The prod-like Compose stack used in CI didn’t have Redis.
CI passed because the relevant tests were skipped when Redis wasn’t detected.
Then came a deployment where the feature flag rolled wider than intended. The app’s fallback behavior was to retry Redis connections
aggressively. CPU shot up, request latency followed, and a couple of nodes started getting killed by the kernel OOM reaper. Not all of them,
just enough to create a rolling brownout that looked like “network flakiness”.
The fix wasn’t heroic. They collapsed back to one Compose file and used profiles: Redis became default in the stack used for CI, and a new
profile gated “experimental” dependencies. That forced a conscious decision: if the app might use Redis in prod, Redis must exist in the prod-like model.
Lesson: assumptions about environment parity are like milk. They expire quietly, then ruin your day loudly.
Mini-story 2: The optimization that backfired
A large enterprise platform team tried to “optimize developer experience” by using profiles to swap entire images:
a tiny debug image for dev and a hardened image for prod. On paper, it reduced local build time and made the prod image stricter.
In practice, they created a forked universe.
The dev image had extra packages: curl, netcat, Python, and a few CA bundles that “just made things work”.
The prod image was slim: fewer libs, fewer tools, less attack surface. Respectable goals.
But the app had a hidden dependency on system CA certificates due to a third-party SDK doing TLS calls.
Dev never saw the bug because the debug image had the right CA chain. Prod did: TLS handshakes failed intermittently depending on which endpoint
the SDK hit, and the failures were wrapped in opaque exceptions. The incident dragged on because engineers kept reproducing in dev, where it worked.
They kept profiles, but changed the rule: profiles may change commands, mounts, and ports, but not the base OS composition of the runtime
image without a formal test that runs the prod image in dev workflows. They also added a “prod-image” profile that forces the prod image locally.
Lesson: optimizing for speed by changing the runtime substrate is the fastest way to buy slow incidents.
Mini-story 3: The boring but correct practice that saved the day
An internal payments team ran Compose for local dev and for a small on-prem “lab” environment used for partner integrations.
Their practice was unsexy: every change to Compose had to include an updated docker compose config output artifact in CI logs for each profile.
Not stored forever, just attached to the job summary.
One morning, a change landed that moved a port mapping from a dev-only service into a default service. It wasn’t malicious;
it was a copy/paste error while refactoring. The service happened to be a database admin UI. You know where this is going.
The lab environment had a strict firewall, so it wasn’t internet-exposed. But it was accessible to a large corporate network,
which is its own kind of wilderness. The team caught the mistake before deploying because the CI artifact for the default profile
suddenly showed a published port that hadn’t been there the day before.
They reverted, then reintroduced the change correctly behind the dev profile. No incident, no shame spiral, no “we’ll fix it later”.
Just a small, boring guardrail doing its job.
Lesson: printing the effective config is the ops equivalent of washing your hands. It’s not glamorous, and it prevents infections.
Fast diagnosis playbook: what to check first/second/third
When a Compose stack “doesn’t work”, the fastest path is to stop guessing what Compose did and inspect what it actually did.
Profiles add one more dimension to confusion, so your triage needs to be crisp.
First: confirm the intended profile set and rendered config
- Run docker compose --profile X config and scan for:
  - unexpected published ports
  - missing services you assumed were there (cache, message broker, reverse proxy)
  - env var defaults you forgot were defaults
- If config output surprises you, stop. Fix configuration before chasing runtime symptoms.
Second: check container state and health, not just “running”
- Run docker compose ps. Look for (healthy) and restart loops.
- A service can be "Up" and still be dead inside. Healthchecks are your cheap lie detector.
Third: determine whether you have a dependency failure or an app failure
- If DB is unhealthy: check storage, permissions, and volume mounts.
- If DB is healthy but app is unhealthy: check app logs and env vars.
- If everything is healthy but requests fail: check networking, published ports, and reverse proxy config (especially if the prod profile adds an edge).
Bonus: isolate by removing profiles
If dev profile introduces breakage, run the default stack alone. If the default stack works, the regression is in dev-only services,
mounts, or port conflicts. Profiles make this isolation trivial—if you keep your defaults clean.
Common mistakes: symptoms → root cause → fix
Mistake 1: “Why is my dev tool running in prod?”
Symptom: Admin UI, MailHog, or debug endpoints appear in environments where they don’t belong.
Root cause: Service lacks profiles: ["dev"], or the environment sets COMPOSE_PROFILES=dev globally.
Fix: Add profiles to the service, and audit CI/hosts for leaked COMPOSE_PROFILES. In prod scripts, set COMPOSE_PROFILES=prod explicitly.
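A quick audit sketch; the paths are examples, so point it at wherever your CI definitions and shell profiles actually live:

env | grep -i compose_profiles || echo "COMPOSE_PROFILES not set in this shell"
grep -rn "COMPOSE_PROFILES" .github/ ci/ Makefile 2>/dev/null || true        # example locations
grep -n "COMPOSE_PROFILES" ~/.bashrc ~/.profile /etc/environment 2>/dev/null || true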
Mistake 2: “Enabling a profile didn’t start anything”
Symptom: docker compose --profile ops up shows no new containers, or only defaults start.
Root cause: The services are defined with a different profile name than you passed (typo), or you expected run-style jobs to appear under up.
Fix: Use docker compose config --services and inspect profiles sections. For one-off jobs, use docker compose run --rm SERVICE.
Mistake 3: “The app can’t connect to the database in dev, but prod works”
Symptom: Connection refused/timeouts only in dev profile.
Root cause: Dev service uses a different DATABASE_URL, or you accidentally pointed it at localhost instead of the service name db.
Fix: In containers, use service DNS names on the Compose network: db:5432. Confirm with docker compose exec app env.
Mistake 4: “Port is already allocated” appears randomly
Symptom: Starting dev profile fails with a port binding error.
Root cause: Another stack already binds the port, or you started two profiles that both publish the same host port (common with app and app-dev if both publish 8080).
Fix: Only publish ports in one of the services (typically the dev one). Verify collisions with ss -ltnp.
Mistake 5: “depends_on didn’t wait; the app started too early”
Symptom: App starts before DB is ready, causing crash loops.
Root cause: You used depends_on without health conditions, or the DB healthcheck is missing/incorrect.
Fix: Add healthchecks and use condition: service_healthy. Also make the app resilient with retries; Compose isn’t your reliability layer.
Mistake 6: “We thought profile services weren’t created, but they were”
Symptom: A profile-gated service exists as a container/network artifact, even when the profile wasn’t enabled.
Root cause: You previously ran with that profile enabled; resources remain until removed. Or your automation uses docker compose up with environment variables set.
Fix: Use docker compose down (and optionally -v in dev only). Treat “what’s currently running” as state, not intent.
Mistake 7: “Our backups succeeded but restores failed”
Symptom: Backup job runs without errors; restore later fails or produces empty data.
Root cause: Backup container wrote to a path inside the container that wasn’t mounted, or permissions prevented writing to the host.
Fix: Store backups on a host-mounted path. After backup, verify file presence and size with ls -lh. Periodically test restore.
Checklists / step-by-step plan
Step-by-step: migrate from multiple Compose files to one file with profiles
- Inventory services across files. List services and note differences (ports, volumes, image tags, commands).
- Define profiles that match decisions, not people. Use names like dev, prod, ops, debug. Avoid alice or newthing.
- Pick the "safe default" stack. No dev tools, no published ports except what's required for basic function (often none).
- Move dev-only services behind dev. MailHog, Adminer, fake S3, local tracing UIs, etc.
- Split services when runtime behavior differs materially. If dev needs bind mounts and a different command: create app-dev rather than trying to toggle everything with env vars.
- Keep image identity stable where possible. Prefer the same image for app and app-dev; change command/mounts/ports.
- Render configs in CI for each profile. Save docker compose config outputs in build logs.
- Document "how to run" commands. Make them copy/pasteable; people will copy/paste them anyway.
- Test three paths: default only, --profile dev, --profile prod (or prod-like).
- Kill the old files. Don't keep them "just in case". That's how drift returns.
Operational checklist: before you declare a profile strategy “done”
- Default profile starts and is functional without published DB ports.
- docker compose config output is stable and reviewed for each profile.
- Dev profile does not change base images without an explicit test plan.
- Ops tasks use run --rm and write output to host-mounted paths.
- Port mappings are unique across services that might run together.
- Healthchecks exist for stateful dependencies (DB) and the app.
- Secrets are not committed, and prod invocations set profiles explicitly.
CI plan: minimal but effective
- Render config for default + dev + prod and store in logs.
- Start default stack, run smoke tests, tear down.
- Start dev stack (or subset), run unit/integration tests, tear down.
- Run ops migrations as a one-off job in a disposable environment.
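A sketch of that plan as one shell script; the profile names match this article's file, while the test commands are placeholders you would replace:

#!/bin/sh
set -eu

# 1. Render and record the effective config per profile.
for p in "" dev prod; do
  COMPOSE_PROFILES="$p" docker compose -f compose.yml config > "config-${p:-default}.yml"
done

# 2. Default stack: start, smoke test, tear down.
docker compose -f compose.yml up -d --wait
./scripts/smoke-test.sh            # placeholder for your real checks
docker compose -f compose.yml down -v

# 3. Dev stack: start, run tests, tear down.
docker compose -f compose.yml --profile dev up -d --wait
./scripts/run-tests.sh             # placeholder
docker compose -f compose.yml --profile dev down -v

# 4. Ops migration as a one-off job against a disposable stack.
docker compose -f compose.yml up -d --wait db
COMPOSE_PROFILES=ops docker compose -f compose.yml run --rm migrate
docker compose -f compose.yml down -v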
FAQ
1) Should I use profiles or override files?
Use profiles for topology changes (which services exist) and for “dev tools are optional”.
Use override files sparingly for machine-local tweaks (like a developer’s custom port), and only if you can tolerate drift.
If you must choose one: profiles are easier to reason about and easier to audit.
2) Can a service belong to multiple profiles?
Yes. You can set profiles: ["dev", "ops"] for a service that’s useful in both contexts.
Be careful: multi-profile membership can become a logic puzzle during incidents.
Keep it rare and justified.
3) What happens if I run docker compose up with no profile specified?
Services without a profiles key are started. Services with a profiles key are ignored.
That’s why your default services must be safe and minimal.
4) Can enabling a profile accidentally start extra services via dependencies?
It can, depending on how you start things and how your dependencies are declared. Your job is to test startup paths:
starting “just the app”, starting the full stack, and starting profile services.
Assume humans will run weird commands during incidents.
5) Do profiles affect networks and volumes?
Profiles gate services. Networks and volumes are typically created as needed by the services that reference them.
If a volume is only referenced by a profiled service, it won’t be created unless that profile is active.
6) How do I prevent dev ports from being exposed when someone runs the wrong profile?
Make the default profile safe, and make prod invocations explicit. In scripts, set COMPOSE_PROFILES=prod
rather than relying on whatever environment variables happen to exist.
Also avoid publishing ports in default services unless you truly need them.
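One convention that makes this hard to get wrong is a tiny wrapper script committed to the repo; a sketch with a hypothetical name:

#!/bin/sh
# deploy-prod.sh: the only sanctioned way to bring the prod-profile stack up on this host
set -eu
export COMPOSE_PROFILES=prod           # explicit, regardless of what the shell already had
exec docker compose -f compose.yml up -d --remove-orphans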
7) How should I handle migrations with profiles?
Put migrations in an ops profile as a one-off job and run them with docker compose run --rm migrate.
Don’t make migrations a long-running service. If it restarts, you’ll eventually migrate twice. That’s not an upgrade plan.
8) Are profiles suitable for “prod on a single VM” deployments?
Yes, with discipline. Profiles help you keep ops tools out of the baseline and prevent accidental exposure.
But don’t confuse “works on one VM” with “is an orchestrated production platform”.
Add monitoring, backups, and explicit rollback procedures. Compose won’t invent them.
9) What’s the cleanest way to switch between dev and prod behavior for the same app?
Prefer separate services (like app and app-dev) when the differences are meaningful (bind mounts, commands, ports).
Keep them on the same image tag when possible. Separate behavior, shared artifact.
10) Should I keep a debug profile?
Yes, if you use it responsibly. A debug profile for ephemeral tooling (tcpdump container, shell container, profiling agent)
can reduce mean time to understand. Just don’t let it become “prod with training wheels always on”.
Conclusion: practical next steps
Compose profiles are the simplest way to stop duplicating YAML while still running different stacks for different contexts.
They don’t eliminate complexity; they make it visible and controllable. That’s the point.
Do this next, in order
- Pick a safe default stack with no dev tools and minimal host port exposure.
- Add dev, prod, and ops profiles to gate what's optional, risky, or one-off.
- Make docker compose config output part of CI logs for each profile. Treat it as an audit trail.
- Convert migrations/backup utilities into run --rm jobs behind ops.
- Delete your extra Compose files once the single-file approach is validated. Drift loves sentimental attachment.
When you’re on call, you want fewer moving parts and fewer undocumented branches in behavior. Profiles give you that—if you keep your defaults clean
and your profiles intentional. Run less magic. Ship more predictability.