You deploy a routine change. A container restarts. Then Docker throws the one message that sounds simple and usually isn’t: “network not found”. Suddenly services won’t start, Compose is sulking, and your “quick fix” ideas start lining up like bad decisions at 2 a.m.
This is a field guide for rebuilding Docker networks safely—without turning your host into a museum of orphaned veth devices and broken NAT rules. We’ll move from fast diagnosis to surgical repairs, and only then to the “nuke from orbit” options.
What “network not found” actually means
Docker networking is mostly bookkeeping. When you create a network, Docker stores a record (name, ID, driver, IPAM settings) in its local state. When a container starts, Docker asks: “attach this container endpoint to network X.” If Docker can’t find X by ID, you get “network not found.”
The tricky part is that the name you see is not the real key. Docker uses a network ID, and Compose stores/uses that ID behind the scenes. If the network entry is deleted, corrupted, or the daemon state is reset, Compose may still try to attach containers to a now-nonexistent ID. That’s how you end up with a network that “exists” in your YAML, “exists” in your brain, but not in Docker’s actual state.
Where the error shows up
- docker run: Error response from daemon: network <id> not found
- docker compose up: failures during container creation or attach
- Swarm: tasks stuck with network attach errors
- After reboot: daemon restarts and “forgets” networks due to a broken state directory or partial restore
Important: “network not found” is not the same as “cannot reach service.” The latter is usually routing, iptables/nftables, MTU, DNS, or application config. “Network not found” is almost always control-plane state: Docker can’t locate a network object you referenced.
Dry-funny truth: Docker networks are like org charts. You can delete the box, but everyone still tries to report to it.
Fast diagnosis playbook (check these first)
This is the order that minimizes collateral damage and gets you from “error text” to “fix” with the least guessing. The goal isn’t to be clever; it’s to be fast and correct.
1) Identify what is failing and what network it wants
Grab the exact error. Is it referencing a network name or a hex ID? IDs are the giveaway that something stored a stale reference.
2) Confirm whether Docker believes the network exists
If docker network ls doesn’t show it, you are not dealing with a packet problem. You’re dealing with missing state.
3) Check whether Compose created and labeled the project network
Compose networks have labels like com.docker.compose.project. If those labels don’t exist anymore, you’re rebuilding, not debugging.
4) Check daemon health and logs before you “fix” anything
Network state disappears for reasons: disk full, filesystem corruption, daemon crash loops, or an overzealous cleanup. Treat it like a symptom of a bigger problem until proven otherwise.
5) Decide the blast radius: single project vs host-wide
If only one Compose project is affected, don’t burn the whole host. Recreate the project network and reattach. If multiple networks vanished, suspect daemon state damage; you may need a controlled reset.
Interesting facts and a little history (because it helps)
- Docker’s “bridge” networking predates modern Compose workflows. Early Docker defaulted to the docker0 bridge and iptables NAT; user-defined bridges came later to improve DNS and isolation.
- User-defined bridge networks have built-in name-based DNS. The embedded DNS server (typically at 127.0.0.11 in containers) is why containers can resolve service names without extra tooling.
- Compose v2 is a Docker CLI plugin. The move from a separate Python tool to a plugin changed how errors appear and how contexts are handled.
- Network IDs are content-addressable-style opaque identifiers. Humans operate on names; the daemon operates on IDs. Losing the mapping is how “but the YAML says…” becomes irrelevant.
- iptables vs nftables has been a recurring pain point. Docker historically programmed iptables directly; on some distros the nftables backend changes behavior and surprises people.
- Docker’s local state lives under /var/lib/docker. If that directory is wiped, restored incompletely, or mounted oddly, networks “disappear” along with everything else.
- Overlay networks were built for Swarm’s multi-host story. Even if you never use Swarm, the networking model still shows up in error strings and drivers.
- MACVLAN/IPVLAN drivers exist specifically to bypass the bridge. They’re great until your upstream network team discovers them and asks why the host can’t talk to its own containers.
- Some “cleanup” tools remove networks more aggressively than you think. Anything that prunes “unused” objects can race with restarts or mistakenly classify resources as unused.
One paraphrased idea worth keeping on your desk, often attributed to Werner Vogels: everything fails, all the time—design and operate assuming components will break.
Common root causes and failure modes
1) The network was deleted (sometimes “helpfully”)
Someone ran docker network rm on “unused” networks. Or a cleanup job ran a prune. Or a CI agent on a shared host cleaned up after itself and “after itself” included your networks.
2) Docker daemon state drift or corruption
Unclean shutdown + disk issues + a stressed filesystem can produce partial state. Docker starts, but some objects are missing. Networks are often the first casualties because they’re metadata-heavy and referenced by IDs across multiple objects.
3) Compose project rename or directory move
Compose uses the project name (often derived from the directory) to name resources. Move the directory or change COMPOSE_PROJECT_NAME, and you can end up with containers expecting networks that were created under the old project name.
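One way to take this failure mode off the table is to pin both the project name and the network name explicitly, so nothing depends on the directory path. A minimal sketch (the names `app` and `app_net` are illustrative, not from any real stack):

```yaml
# compose.yaml fragment (illustrative names; pinning keeps resource
# names stable even if the directory is moved or renamed)
name: app            # fixes the Compose project name explicitly
services:
  api:
    image: app-api:latest
    networks: [app_net]
networks:
  app_net:
    name: app_net    # explicit name instead of the derived <project>_default
```

With both names pinned, moving the checkout or running from CI with a different working directory no longer silently changes which network Compose looks for.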
4) Multiple Docker contexts or wrong host
You think you’re on production, but your CLI points at a different Docker context. The network “is missing” because you’re looking at the wrong daemon. This is more common than people admit.
5) Swarm and overlay networks in a half-configured state
Overlay networks require cluster state. If a node is no longer part of the cluster, or the manager state is inconsistent, tasks can fail network attachment.
6) Firewall/iptables/nftables is broken (but that’s a different error)
This is a common distraction. Broken NAT rules usually produce reachability failures, not “network not found.” Still: a daemon restart can reprogram rules and collide with distro changes.
Practical tasks: commands, expected output, and decisions
These are real operator moves. Each task includes: the command, what the output means, and what decision you make next. Don’t copy-paste blindly on a shared host. Read what you’re doing, then do it.
Task 1: Confirm you’re talking to the right Docker daemon
cr0x@server:~$ docker context show
default
What it means: You’re using the default context (local daemon). If it shows something else, you might be debugging the wrong machine.
Decision: If unexpected, run docker context ls, switch to the correct one, and re-check the issue before touching networks.
Task 2: Verify the daemon is healthy and responsive
cr0x@server:~$ docker info --format '{{.ServerVersion}} {{.OperatingSystem}}'
27.2.0 Ubuntu 24.04.1 LTS
What it means: The CLI can talk to the daemon and get structured info. If this hangs or errors, your networking problem may be a daemon problem.
Decision: If unresponsive, check systemd and logs first. Don’t churn networks while the daemon is sick.
Task 3: List networks and see what exists right now
cr0x@server:~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
1f3a2c1a9e2d bridge bridge local
a9b7b5d41c1f host host local
c3d6f2d51c9a none null local
What it means: Only default networks exist. If your Compose network is missing, this strongly supports “deleted network” or “lost state.”
Decision: If the target network is missing, stop trying to “connect” containers to it. Plan to recreate it.
Task 4: Find the network reference from a failing container
cr0x@server:~$ docker inspect -f '{{json .NetworkSettings.Networks}}' api_1
null
What it means: The container isn’t attached to any network (or doesn’t exist). If the container failed to create, inspect may fail instead.
Decision: If the container doesn’t exist, look at Compose events/logs; the network reference is likely in the error output during create.
Task 5: Inspect a network by name (if you think it exists)
cr0x@server:~$ docker network inspect app_default
[]
Error response from daemon: network app_default not found
What it means: Docker confirms the network object is absent.
Decision: Recreate the network (usually via Compose) rather than trying to repair attachments.
Task 6: Check Compose’s view of the project
cr0x@server:~$ docker compose ls
NAME STATUS CONFIG FILES
app running(3) /srv/app/docker-compose.yml
What it means: Docker Compose believes a project exists. It may still be running partially.
Decision: If status is inconsistent with reality, you may have orphaned containers or missing networks. Next step: docker compose ps.
Task 7: See what Compose thinks is running and what networks it expects
cr0x@server:~$ docker compose -p app ps
NAME IMAGE COMMAND SERVICE STATUS
app-db-1 postgres:16 "docker-entrypoint…" db Up 2 hours
app-api-1 app-api:latest "/bin/api" api Exit 1
What it means: Some services run, some fail. This often happens when a subset of containers is attached to a now-missing network, or the network vanished after restarts.
Decision: Check the failing service logs and events; confirm the “network not found” is the cause, not an app crash.
Task 8: Pull the real failure reason from events
cr0x@server:~$ docker events --since 10m --filter type=container --filter event=start --filter event=die
2026-01-03T09:11:26.321615241Z container start 5a8c0e3f8b2b (name=app-api-1, image=app-api:latest)
2026-01-03T09:11:26.412992531Z container die 5a8c0e3f8b2b (exitCode=1, name=app-api-1, image=app-api:latest)
What it means: The container did start and then died (exit 1). That’s not “network not found.” The error is application-level.
Decision: If events show container create failures instead (network attach failures often appear there), focus on network recreation. If it’s dying after start, debug app logs.
Task 9: Check daemon logs for network store or libnetwork errors
cr0x@server:~$ sudo journalctl -u docker --since "1 hour ago" | tail -n 20
Jan 03 09:03:10 server dockerd[1123]: time="2026-01-03T09:03:10.991132145Z" level=error msg="network sandbox join failed" error="network app_default not found"
Jan 03 09:03:10 server dockerd[1123]: time="2026-01-03T09:03:10.991264223Z" level=error msg="Handler for POST /v1.46/containers/create returned error: network app_default not found"
What it means: The daemon is explicitly failing container creation due to missing network. This is the textbook case.
Decision: Recreate app_default (preferably via docker compose up which also labels it correctly). Avoid manual creation unless you must match IPAM settings.
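When the journal is noisy, it helps to filter for the control-plane signature specifically. The sketch below inlines a couple of illustrative log lines (not real daemon output); on a live host you would pipe `journalctl -u docker` into the same grep:

```shell
# Sample dockerd-style log lines for illustration only
cat > /tmp/dockerd-sample.log <<'EOF'
time="2026-01-03T09:03:10Z" level=error msg="network sandbox join failed" error="network app_default not found"
time="2026-01-03T09:03:11Z" level=info msg="container create succeeded"
EOF

# Match the missing-network (control-plane) signature, not generic noise
grep -E 'network [[:alnum:]_-]+ not found' /tmp/dockerd-sample.log
```

If this grep matches, you are in state-rebuild territory; if it comes back empty while containers still fail, look at application logs instead.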
Task 10: Find who “owns” a network (labels matter)
cr0x@server:~$ docker network inspect -f '{{json .Labels}}' app_default
Error response from daemon: network app_default not found
What it means: Not found, so no labels to inspect. When it does exist, labels tell you if Compose created it and which project.
Decision: If you recreate manually, you lose Compose ownership metadata unless you reapply labels (not recommended). Better: let Compose create it.
Task 11: Recreate the network the boring, correct way (Compose)
cr0x@server:~$ cd /srv/app
cr0x@server:/srv/app$ docker compose -p app up -d
[+] Running 4/4
✔ Network app_default Created
✔ Container app-db-1 Started
✔ Container app-api-1 Started
✔ Container app-web-1 Started
What it means: Compose created the missing network and started containers. This is usually the cleanest fix.
Decision: Verify connectivity and DNS. If containers still fail with “network not found,” you may have multiple projects or stale references; proceed to deeper cleanup.
Task 12: Validate the network driver and subnet (catch silent differences)
cr0x@server:~$ docker network inspect -f 'Driver={{.Driver}} Subnet={{(index .IPAM.Config 0).Subnet}}' app_default
Driver=bridge Subnet=172.22.0.0/16
What it means: Confirms driver and IPAM config. If the subnet differs from what your firewall rules or allowlists expect, you’ll get “it starts but can’t reach anything.”
Decision: If subnet must be stable, pin it in Compose ipam and recreate intentionally, not by accident.
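Pinning looks like this in the Compose file. The 172.22.0.0/16 range mirrors the inspect output above; adjust to whatever your firewall rules actually expect:

```yaml
# compose.yaml fragment: pin the project default network's driver and subnet
networks:
  default:
    driver: bridge
    ipam:
      config:
        - subnet: 172.22.0.0/16
```

Without this, Docker picks the next free range at creation time, which is exactly how a rebuilt network ends up on a subnet your allowlists have never heard of.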
Task 13: Check container resolv.conf and embedded DNS behavior
cr0x@server:~$ docker exec app-api-1 cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0
What it means: Embedded DNS is in use. If DNS fails, it’s often because the container is on the wrong network or the network is broken.
Decision: If you see a host DNS server instead, you may be using network_mode: host or custom settings. That changes how you troubleshoot.
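A quick classification sketch, using an inlined sample resolv.conf for illustration (in practice you would read the real file via docker exec as above):

```shell
# Illustrative resolv.conf content for the embedded-DNS case
cat > /tmp/resolv-sample.conf <<'EOF'
nameserver 127.0.0.11
options ndots:0
EOF

# 127.0.0.11 means the container sits on a user-defined network
# with Docker's embedded resolver; anything else suggests host
# networking or custom DNS settings.
if grep -q '^nameserver 127\.0\.0\.11' /tmp/resolv-sample.conf; then
  echo "embedded-dns"
else
  echo "host-dns"
fi
```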
Task 14: Confirm name resolution between services on the network
cr0x@server:~$ docker exec app-api-1 getent hosts db
172.22.0.2 app-db-1
What it means: DNS is working and db resolves to the expected container IP.
Decision: If it doesn’t resolve, verify both containers are on the same network and that you’re using the correct service name.
Task 15: Show which networks a container is attached to (avoid guessing)
cr0x@server:~$ docker inspect -f '{{range $k,$v := .NetworkSettings.Networks}}{{$k}} {{end}}' app-api-1
app_default
What it means: The container is attached to app_default.
Decision: If a container is attached to a different network than expected (like bridge), fix the Compose file or recreate the container.
Task 16: Detect orphaned endpoints and “in use” networks before removal
cr0x@server:~$ docker network inspect -f 'Containers={{len .Containers}}' app_default
Containers=3
What it means: Three container endpoints are attached. If you try to remove this network, Docker will refuse unless you disconnect or remove containers.
Decision: If you need to rebuild the network, plan a controlled stop/recreate of the stack.
Task 17: Safely stop a project and remove only its network
cr0x@server:~$ cd /srv/app
cr0x@server:/srv/app$ docker compose -p app down
[+] Running 4/4
✔ Container app-web-1 Removed
✔ Container app-api-1 Removed
✔ Container app-db-1 Removed
✔ Network app_default Removed
What it means: Project containers and its default network are removed. Volumes remain unless you used -v.
Decision: This is the safe rebuild method when the network is corrupted or misconfigured. Next: docker compose up -d to recreate.
Task 18: Confirm iptables/nftables hasn’t been silently neutered
cr0x@server:~$ sudo iptables -t nat -S DOCKER | head -n 5
-N DOCKER
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 5432 -j DNAT --to-destination 172.22.0.2:5432
What it means: Docker’s iptables chains exist. If the DOCKER chain is missing or empty on a host relying on published ports, connectivity will fail even if the network exists.
Decision: If chains are missing after daemon restart, check daemon config, firewall manager, and whether the host uses nftables backend with incompatible rules.
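One configuration worth ruling out explicitly: the daemon can be told not to manage iptables at all. If someone set this in /etc/docker/daemon.json, the empty-chains picture above is expected behavior, not corruption:

```json
{
  "iptables": false
}
```

This is a real daemon option, usually set deliberately on hosts where an external firewall manager owns the rules. If you find it and nobody can explain why it is there, that is your lead.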
Joke #1: If you fix Docker networking by rebooting, you didn’t fix it—you rolled the dice and won this time.
How to rebuild networks without breaking everything
Principle 1: Prefer recreating networks via the tool that created them
If Compose created the network, let Compose recreate it. Compose applies labels and naming conventions that keep future operations predictable. Manual docker network create is fine for one-off experiments; in production stacks it’s how you manufacture mysteries.
Principle 2: Contain the blast radius
There are three scopes of intervention:
- Project scope: Recreate app_default and project containers only.
- Host scope: Repair default networks (bridge/docker0) and iptables programming. Riskier.
- Daemon state reset: Last resort; can destroy all Docker state on the host.
Project-scope rebuild: the safest common fix
If you get “network not found” for a Compose-managed network, do this:
- Confirm the network name Compose expects (often <project>_default).
- Run docker compose up -d in the correct directory and with the correct project name. This should recreate the network if missing.
- If containers are stuck or half-created, do docker compose down and then up -d.
When does docker compose up -d not fix it? When you have stale resources with conflicting names, mismatched project names, or external networks referenced that aren’t created.
External networks: where humans set traps for themselves
Compose supports external: true. That means “Compose will not create this network.” If the network is missing, Compose will fail—correctly. This is the scenario where operators swear Compose is broken. It isn’t. You told it not to create the network.
If your stack uses external networks, your job is to ensure they exist and have stable configuration (subnet, driver) across host rebuilds. If you can’t guarantee that, stop using external networks unless there’s a real reason.
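For reference, this is what the trap looks like in the Compose file (proxy_net matches the mini-story below; the name is otherwise illustrative):

```yaml
# compose.yaml fragment: Compose will attach to this network but NEVER create it
networks:
  proxy_net:
    external: true
    name: proxy_net   # must already exist on the host, created by provisioning
```

If the provisioning step that creates proxy_net is missing from your host rebuild procedure, every deploy against a fresh host fails with exactly the error this article is about.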
Manual recreation (only when you must)
Sometimes you can’t run Compose (CI system down, repo unavailable, or you need a temporary bridge to bring critical services up). In that case, recreate manually with care:
- Match the driver (bridge, overlay, macvlan).
- Match IPAM settings if anything depends on fixed ranges.
- Know that Compose labels and ownership won’t be there, which affects docker compose down later.
cr0x@server:~$ docker network create --driver bridge --subnet 172.22.0.0/16 app_default
b9a1f1a2a3c4d5e6f7a8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8
What it means: You created a bridge network with a fixed subnet.
Decision: Use this only as a stopgap. Once stable, revert to Compose-managed creation to avoid drift.
Rebuilding the default bridge (docker0) without chaos
If the error is about the bridge network or docker0 is missing/broken, you’re in host-scope territory. Symptoms: containers fail to start even with default networking, or published ports stop working after upgrades.
Proceed like an adult:
- Schedule a window if this host runs anything important.
- Stop workloads cleanly (Compose down, Swarm drain, whatever applies).
- Restart Docker and validate docker0, routes, and iptables chains.
cr0x@server:~$ ip link show docker0
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:8f:aa:bb:cc brd ff:ff:ff:ff:ff:ff
What it means: The interface exists. State DOWN is normal if no containers are attached.
Decision: If docker0 doesn’t exist, suspect daemon config disabling bridge creation, or a corrupted state directory.
Daemon state reset: last resort, very sharp knife
If networks routinely vanish, or Docker refuses to create new ones, you may be dealing with a corrupted /var/lib/docker (or storage driver issues). The fix might be to wipe Docker state and redeploy workloads from source-of-truth. That’s not “rebuild the network.” That’s “rebuild the node.”
Do not improvise this on a pet server that contains the only copy of your volumes. If you’re not sure, stop and inventory what data lives where.
Joke #2: “docker system prune -a” is not a maintenance plan; it’s a personality test.
Three corporate-world mini-stories (pain included)
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company ran a few internal apps on a single beefy VM. The team used Docker Compose, and it worked well enough that nobody touched the networking section for months. Then an engineer added an external: true network because they wanted two Compose projects to share a reverse proxy network.
They assumed “external” meant “shared across projects” and that Compose would still create it if missing. They deployed a new VM from an image, ran docker compose up -d, and watched it fail immediately: “network proxy_net not found.” The on-call did what on-calls do: retried, restarted Docker, retried again.
The outage wasn’t the error itself; it was the response. Under pressure, they created the network manually—but used a different subnet than the old host. The reverse proxy came up, but internal allowlists and a couple of hard-coded monitoring checks expected the previous range. Half the traffic was now “mysteriously” blocked by a firewall rule meant to protect a database.
The fix was simple and annoyingly educational: create the external network explicitly and pin its subnet in infrastructure code, not in someone’s memory. Then rebuild the host in a controlled way so the network configuration didn’t depend on who typed it at the keyboard.
Mini-story 2: The optimization that backfired
A different org wanted “faster deployments” and reduced disk use. Someone added a nightly cleanup job that ran a prune command to delete unused Docker objects. It did reclaim space—until it didn’t.
The job ran during a quiet period when a Compose stack was being redeployed. For a brief window, containers were stopped and networks looked “unused.” The prune removed a user-defined bridge network that was about to be reused seconds later. The deployment resumed, and containers failed to start: “network not found.” The alert was delayed because the health checks only ran every few minutes. Great timing.
They “fixed” it by rerunning the deployment, which recreated the network. But the real damage was that the prune job remained. Weeks later, the same pattern hit again, this time during a heavier change window. The second incident was longer because engineers chased iptables ghosts and DNS issues that weren’t the problem.
The eventual correction was boring: prune only with explicit filters, never on shared hosts, and never without checking for running deployments. They also moved workloads to a system where ephemeral nodes made cleanup irrelevant. Speed improved. Not because prune was clever, but because the platform stopped being fragile.
Mini-story 3: The boring but correct practice that saved the day
A finance-adjacent team had a rule: every Compose project must declare its networks with explicit names and IPAM ranges, and every external network must be created by a provisioning step before any app deploy runs. The rule annoyed developers, which is usually a sign it’s doing something useful.
One day a host lost Docker state after a disk incident. Networks were gone. Containers were gone. Everyone had a bad afternoon. But the redeploy runbook didn’t involve guesswork: provision external networks, deploy Compose stacks, verify connectivity. Because subnets were pinned, firewall rules and monitoring didn’t need emergency edits.
The recovery wasn’t magical; it was predictable. Their runbook also included “verify you’re on the right Docker context” and “dump docker info and journalctl output into the incident ticket.” That meant less folklore and more evidence.
They still had an incident. They just didn’t have a second, self-inflicted incident layered on top of it. In operations, that’s what winning looks like.
Common mistakes: symptom → root cause → fix
1) Symptom: “network <hex> not found” after moving a Compose project directory
Root cause: Project name changed, old resources referenced by ID no longer exist, or you’re invoking Compose with a different -p value.
Fix: Run Compose with the original project name if you want to reuse resources, or run docker compose -p <name> down (if possible) and redeploy with the new name consistently.
2) Symptom: Compose fails only on a network marked external
Root cause: External network is not created (by design, Compose won’t create it).
Fix: Create it explicitly (ideally via provisioning/automation), and pin the subnet/driver so it matches expectations.
3) Symptom: “network not found” appears after a cleanup job or disk pressure event
Root cause: Automated prune removed networks, or Docker state was partially lost due to disk full/corruption.
Fix: Remove/limit cleanup jobs; rebuild networks via Compose; investigate host storage health and Docker state durability.
4) Symptom: Network exists, but new containers cannot attach; errors mention sandbox/join
Root cause: Stale endpoints, leftover veth pairs, or daemon/libnetwork internal inconsistency after crashes.
Fix: Controlled project restart (compose down/up). If persistent across projects, restart Docker during a window; if still broken, plan a node rebuild.
5) Symptom: Published ports stop working, but no “network not found”
Root cause: iptables/nftables programming broken; firewall manager overwrote Docker chains.
Fix: Fix firewall integration; restart Docker after correcting; validate DOCKER chains and NAT rules.
6) Symptom: You can’t see the network, but coworkers can
Root cause: Wrong Docker context, wrong host, or SSH session to the wrong environment.
Fix: Verify context and endpoint; standardize shell prompts; require explicit environment indicators in runbooks.
Checklists / step-by-step plans
Checklist A: Single Compose stack fails with “network not found”
- Confirm Docker context and host identity (avoid debugging the wrong box).
- Do not prune anything. First, collect evidence: docker network ls, docker compose -p <proj> ps, sudo journalctl -u docker --since "1 hour ago".
- If the missing network is the project default: run docker compose up -d from the correct directory.
- If it still fails: run docker compose down (no volumes) then docker compose up -d.
- Validate: docker network inspect (driver/subnet), docker exec ... getent hosts ....
- If it reoccurs: find and kill the process deleting networks (cleanup job, human habit, CI agent).
Checklist B: Multiple projects affected or default networks are missing
- Assume host-level issue until proven otherwise.
- Check daemon logs for storage/state errors; check disk free and filesystem health.
- Schedule a maintenance window (yes, even if it’s “just networking”).
- Stop workloads cleanly.
- Restart Docker; confirm docker0 exists and iptables chains are present.
- If state is clearly corrupted: rebuild the node from source-of-truth rather than hand-editing /var/lib/docker.
Checklist C: External network strategy that doesn’t bite you later
- Use external networks only when you truly need cross-project connectivity.
- Create them in provisioning code, not in a human’s terminal history.
- Pin subnet and driver and document why.
- Validate with a disposable container: attach, resolve names, reach expected ports.
- Audit cleanup jobs: external networks should never be pruned automatically on shared hosts.
FAQ
1) Why does Docker complain about a network ID instead of a name?
Because internally Docker references networks by ID. Names are for humans. Stacks and containers often store the ID, so when the ID disappears you get a hex-shaped clue.
2) Can I just recreate a network with the same name and be done?
Sometimes. But if a container references the old network ID, recreating by name won’t help until the container is recreated or reattached. The safest approach is to let Compose recreate both the network and the affected containers.
3) Does docker compose up -d recreate missing networks automatically?
Yes for networks it manages (non-external). If the network is marked external: true, Compose will not create it and will fail if it’s missing.
4) What’s the safest way to rebuild a broken Compose network?
docker compose down (without -v unless you mean it), then docker compose up -d. That clears containers and the project network and recreates cleanly.
5) Will removing a network delete my data?
Removing a network does not delete volumes. But if your data lives inside the container filesystem (no volume), removing containers will delete it. Inventory your volumes before you “clean up.”
6) How do I know if a cleanup job caused this?
Check systemd timers/cron, CI scripts, and shell histories. Also look at daemon logs timestamps: a sudden “network removed” or a gap followed by missing networks is a strong sign.
7) Is “network not found” ever caused by iptables or firewall rules?
Not typically. Firewall issues cause connectivity problems, not missing-object errors. If the daemon says “not found,” it can’t locate the network object in its state.
8) What if this is Swarm and overlay networks?
Then you need to confirm Swarm cluster health. Overlay networks depend on manager state. A node that left the cluster or a broken manager quorum can make tasks fail network attach. Fix cluster state first, then redeploy services.
9) How do I prevent this from happening again?
Stop running aggressive prune jobs on shared hosts, pin network configuration when stability matters, and treat /var/lib/docker as durable state that needs real disk hygiene and monitoring.
Next steps you should actually do
When Docker says “network not found,” believe it. Don’t waste time debugging packets for a network object that doesn’t exist.
- Run the fast diagnosis: confirm context, confirm the network is missing, check daemon logs.
- Fix at the smallest scope: recreate via Compose for that one project; avoid host-wide resets.
- If this keeps happening, treat it as a systems problem: cleanup jobs, disk health, daemon state stability, and repeatable provisioning of external networks.
- Write the runbook you wish you had today: commands to verify state, and the exact “down/up” sequence that is safe for your stack.
Rebuilding networks without breaking everything isn’t about heroics. It’s about knowing which layer is broken and refusing to “optimize” your way into new outages.