It always starts the same way: “containers can’t reach the internet,” “port 443 stopped publishing,” or the crowd favorite, “it worked yesterday.” You check Docker, it’s running. You check the app, it’s healthy. Then you look at the firewall and discover you’re not debugging “networking.” You’re debugging two firewall engines that both think they’re in charge.
On modern Linux, iptables might be a compatibility shim over nftables. Or it might be the legacy binary. Docker still speaks iptables. Your distro might prefer nftables. Add a firewall manager that reloads rules whenever it sneezes, and you’ve got a production-grade flame war conducted entirely in kernel tables.
What’s actually happening when Docker “breaks networking”
Docker’s default networking model on a single host is deceptively simple:
- Create a Linux bridge (usually docker0).
- Attach veth pairs for containers to that bridge.
- Do NAT (MASQUERADE) so containers can reach the outside world.
- Add filter rules so forwarding works, and so published ports land in the right container.
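To make that concrete, here is roughly what those steps look like as raw iptables rules. This is only a sketch, assuming the default 172.17.0.0/16 bridge subnet and a single container at 172.17.0.2 publishing port 8080; it is not the literal rule set Docker programs (Docker uses its own chains, shown later).
# NAT so containers can reach the outside world
iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
# Allow forwarding out of the bridge, and return traffic back in
iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
# Publish a port: host :8080 -> container 172.17.0.2:80
iptables -t nat -A PREROUTING -p tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:80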
Every one of those steps touches netfilter. The part that hurts is that “iptables” isn’t just a tool; it’s also an interface to kernel rule tables. And in the last few years, the userland tool has been playing musical chairs:
- iptables-legacy: talks to the historical xtables interface.
- iptables-nft: the iptables command that writes rules into nftables (via a compatibility layer).
- nft: the native nftables tool and language.
If Docker is writing rules with one backend but your system is inspecting or managing the other backend, you will “see nothing,” believe nothing is configured, and then confidently “fix” it by making it worse.
There are also three policy layers that routinely collide:
- Kernel settings like net.ipv4.ip_forward and the bridge netfilter sysctls.
- Firewall managers (firewalld, ufw, custom systemd units, config management) that set a default policy and reload rules.
- Docker’s rule management, which assumes it can insert chains like DOCKER and DOCKER-USER and then jump to them.
The failure modes are consistent:
- NAT rules exist in one ruleset, but packets are evaluated by another.
- Forwarding is blocked because the default policy is DROP, or because a firewall reload forgot Docker’s jumps.
- Published ports don’t work because DNAT rules are missing or the DOCKER chain isn’t referenced.
- Inter-container traffic dies because bridge filtering is enabled and you didn’t allow it.
Pick a side. Make it stable. Stop letting three tools scribble in the same notebook.
One short joke, as promised. NAT is like office politics: everything works until someone “simplifies” the rules and suddenly nobody can talk to anyone.
Fast diagnosis playbook (first/second/third checks)
First: confirm the firewall backend and where rules are being written
- Is iptables using nft or legacy?
- Does nft list ruleset show Docker chains/rules?
- Are you inspecting the same backend Docker is programming?
Second: confirm forwarding and NAT exist for the Docker bridge
- net.ipv4.ip_forward=1?
- MASQUERADE rule for the docker0 subnet?
- FORWARD chain allows docker0 → external and return traffic?
Third: locate who is overwriting rules after Docker starts
- firewalld reloads? ufw enable? custom hardening scripts?
- systemd ordering: does Docker start before/after your firewall service?
- Do rules disappear after a firewall reload?
When you’re under pressure
If this is production and the site is down, the quickest safe path is usually:
- Make sure Docker and your host are using the same backend (often by selecting iptables-legacy or aligning everything to nftables).
- Restore forwarding + NAT and confirm with packet counters.
- Prevent future rule wipes by fixing service ordering and using DOCKER-USER for your policy.
Then you circle back and make it correct, not just “working right now.”
Interesting facts and short history (so the weirdness makes sense)
- nftables landed in Linux 3.13 (2014) as the successor to iptables, offering a more flexible ruleset and better performance characteristics for large rule counts.
- iptables-nft is a compatibility layer, not “iptables with different output.” It translates iptables rules into nftables objects, and the translation has edge cases.
- Some distros switched iptables to nft by default (via alternatives), which quietly changed what iptables -S means on the same command line.
- Docker’s networking model predates nftables, and its operational assumptions were built around iptables chains it can insert and manage incrementally.
- nftables supports atomic ruleset updates, which is great for correctness; it also means a tool can replace the entire ruleset in one shot and accidentally delete Docker’s chains.
- firewalld evolved toward nftables, but many environments still run mixed tooling where Docker uses iptables commands and firewalld uses nftables directly.
- The DOCKER-USER chain exists for a reason: Docker needed a stable insertion point for user policy so it wouldn’t be overwritten by Docker’s own updates.
- Bridge netfilter is a common foot-gun: packets traversing Linux bridges can be passed to iptables/nftables depending on sysctls, which changes the path your rules must match.
- Container networking failures are often “policy failures” not “routing failures”: routes look correct, ARP works, but netfilter silently drops or fails to NAT.
Practical tasks: commands, outputs, and the decision you make
These are the checks I run when the graph is red and Slack is loud. Each task includes: a command, realistic output, what it means, and what decision you make.
Task 1: Identify which iptables backend you’re using
cr0x@server:~$ sudo iptables --version
iptables v1.8.9 (nf_tables)
What it means: Your iptables command is writing to nftables (iptables-nft). If Docker is using the same binary, its rules live in nftables, not legacy xtables.
Decision: You must inspect with nft or with iptables consistently. If another component is using iptables-legacy, you’ve got split-brain.
Task 2: Check alternatives (Debian/Ubuntu) for iptables and friends
cr0x@server:~$ sudo update-alternatives --display iptables
iptables - auto mode
link best version is /usr/sbin/iptables-nft
link currently points to /usr/sbin/iptables-nft
link iptables is /usr/sbin/iptables
/usr/sbin/iptables-legacy - priority 10
/usr/sbin/iptables-nft - priority 20
What it means: System preference is nft-backed iptables. Your tooling and expectations need to match that.
Decision: If you find Docker or your firewall uses legacy, align them (either move everything to nft, or deliberately switch to legacy for Docker-era simplicity).
Task 3: Confirm Docker’s bridge and subnets
cr0x@server:~$ docker network inspect bridge --format '{{json .IPAM.Config}}'
[{"Subnet":"172.17.0.0/16","Gateway":"172.17.0.1"}]
What it means: Default bridge subnet is 172.17.0.0/16. NAT and forwarding rules should reference that CIDR or interface.
Decision: If you use custom subnets, you need to see them here and verify the firewall matches. Mismatched subnets = NAT never triggers.
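If you need a predictable subnet your firewall rules can reference, pin it in the daemon config rather than relying on the default. A minimal sketch of /etc/docker/daemon.json, assuming 172.20.0.0/16 and 172.21.0.0/16 are actually free in your environment:
{
  "bip": "172.20.0.1/16",
  "default-address-pools": [
    { "base": "172.21.0.0/16", "size": 24 }
  ]
}
Restart the daemon after changing it and re-run the inspect above to confirm the subnets match what your firewall expects.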
Task 4: Verify kernel forwarding is enabled
cr0x@server:~$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0
What it means: The kernel won’t forward packets between interfaces. Containers can talk to the host, but not beyond it.
Decision: Enable forwarding permanently via sysctl. Temporary toggles are for incident response only.
cr0x@server:~$ sudo sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
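To make the setting survive reboots, drop it into /etc/sysctl.d/ instead of relying on someone remembering the incident. A sketch (the file name is my choice):
cr0x@server:~$ echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-docker-forward.conf
net.ipv4.ip_forward = 1
cr0x@server:~$ sudo sysctl --system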
Task 5: Check bridge netfilter sysctls (common in hardened builds)
cr0x@server:~$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables 2>/dev/null
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
What it means: Bridged traffic is being passed to iptables/nftables for filtering. That can be fine, but it means your FORWARD policies matter a lot.
Decision: If you didn’t intend to firewall intra-bridge traffic, consider setting these to 0, or write correct rules. Don’t guess.
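Whatever value you decide on, persist both the module and the setting; the net.bridge.* keys only exist while br_netfilter is loaded. A sketch with file names of my choosing (verify module/sysctl ordering on your distro):
# /etc/modules-load.d/br_netfilter.conf
br_netfilter
# /etc/sysctl.d/90-bridge-nf.conf  (use the values you actually decided on)
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1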
Task 6: See whether Docker created its expected chains (iptables view)
cr0x@server:~$ sudo iptables -S | sed -n '1,40p'
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
What it means: Docker inserted chains, but your default FORWARD policy is DROP. Docker tries to punch holes; firewall managers sometimes reset policies and break this.
Decision: If traffic still fails, inspect counters and confirm these jumps still exist after firewall reloads. If not, the firewall manager is overwriting them.
Task 7: Verify NAT MASQUERADE exists for container egress
cr0x@server:~$ sudo iptables -t nat -S | sed -n '1,80p'
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
What it means: MASQUERADE is present. If containers still can’t reach the internet, either forwarding is blocked, DNS is broken, or packets are using a different interface than you think.
Decision: If this MASQUERADE line is missing, fix the rule backend conflict first. Re-adding the rule manually is a band-aid; Docker will fight you later.
Task 8: Inspect the nftables ruleset directly (truth serum)
cr0x@server:~$ sudo nft list ruleset | sed -n '1,80p'
table inet filter {
chain input {
type filter hook input priority 0; policy accept;
}
chain forward {
type filter hook forward priority 0; policy drop;
jump DOCKER-USER
jump DOCKER-ISOLATION-STAGE-1
}
chain output {
type filter hook output priority 0; policy accept;
}
chain DOCKER-USER {
return
}
}
table ip nat {
chain PREROUTING {
type nat hook prerouting priority -100; policy accept;
}
chain POSTROUTING {
type nat hook postrouting priority 100; policy accept;
ip saddr 172.17.0.0/16 oifname != "docker0" masquerade
}
}
What it means: Docker-relevant rules exist in nftables. If iptables -S shows nothing but nft does, you’re likely using iptables-legacy when inspecting.
Decision: Standardize on one inspection and management toolchain. In incidents, the fastest way to waste an hour is to stare at the wrong ruleset.
Task 9: Prove whether firewall reload wipes Docker rules
cr0x@server:~$ sudo systemctl reload firewalld
cr0x@server:~$ sudo iptables -t nat -S | grep -E 'MASQUERADE|DOCKER' | head
Example output (bad sign): nothing. The same grep that showed MASQUERADE and DOCKER chain rules before the reload now returns empty output.
What it means: After reload, Docker’s nat rules are gone (or you’re looking at the wrong backend). Containers will lose egress and published ports.
Decision: Fix the integration: either configure firewalld’s Docker zone handling, ensure Docker starts after the firewall and re-adds rules, or stop using a manager that replaces rulesets without care.
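If boot ordering is part of the problem, a systemd drop-in is the least invasive fix. A sketch, assuming firewalld is your firewall unit (adjust the unit name for ufw or a custom service):
# /etc/systemd/system/docker.service.d/10-after-firewall.conf
[Unit]
After=firewalld.service
Wants=firewalld.service
cr0x@server:~$ sudo systemctl daemon-reload && sudo systemctl restart docker
Note this only fixes ordering at boot; a reload while Docker is running still needs the firewall manager configured to preserve Docker’s chains.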
Task 10: Check DOCKER-USER chain for your policy insertion point
cr0x@server:~$ sudo iptables -S DOCKER-USER
-N DOCKER-USER
-A DOCKER-USER -j RETURN
What it means: Docker created the chain but you’re not using it. That’s fine, but it’s also where you should put your “block this container CIDR from that network” rules.
Decision: Put corporate policy in DOCKER-USER, not by editing Docker’s DOCKER chain. Docker treats its own chains like a scratch pad.
Task 11: Confirm port publishing creates DNAT rules
cr0x@server:~$ docker run -d --name webtest -p 8080:80 nginx:alpine
Unable to find image 'nginx:alpine' locally
nginx:alpine: Pulling from library/nginx
Status: Downloaded newer image for nginx:alpine
3c1c0f4c8b2a3d0a5a1f2d7b3f4d9c2d12e4f0f3c2c7c9bb8a1a3c0f8ad9f1b2
cr0x@server:~$ sudo iptables -t nat -S DOCKER | grep 8080
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:80
What it means: Docker published port 8080 by creating a DNAT rule. If the application is still unreachable, either FORWARD filtering blocks it, or the service isn’t listening in the container.
Decision: If no DNAT appears, Docker is failing to program iptables (often due to missing capabilities, daemon flags, or backend mismatch).
Task 12: Validate connectivity from inside a container (not from the host)
cr0x@server:~$ docker exec -it webtest sh -c 'ip route; wget -qO- --timeout=3 https://example.com | head'
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 scope link src 172.17.0.2
What it means: The route is correct. If the wget hangs or fails, suspect NAT/forwarding/firewall or DNS. If it works, your problem is likely inbound port publishing, not outbound egress.
Decision: Split the problem: outbound vs inbound. They fail for different reasons and require different rules to fix.
Task 13: Watch packet counters to see if rules match at all
cr0x@server:~$ sudo iptables -t nat -L POSTROUTING -v -n | sed -n '1,10p'
Chain POSTROUTING (policy ACCEPT 120 packets, 7200 bytes)
pkts bytes target prot opt in out source destination
38 2280 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
What it means: Counters incrementing = NAT rule is matching. If counters are zero while containers attempt egress, traffic may not be forwarded, may be leaving via a different interface, or may be blocked earlier.
Decision: Use counters to avoid cargo-cult rule changes. If counters don’t move, your “fix” is not in the right place.
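A quick way to prove matching without guessing: zero the counters, generate known traffic from a container, and re-read. A sketch, assuming the webtest container from Task 11 is still running:
cr0x@server:~$ sudo iptables -t nat -Z POSTROUTING
cr0x@server:~$ docker exec webtest wget -qO- --timeout=3 http://example.com >/dev/null
cr0x@server:~$ sudo iptables -t nat -L POSTROUTING -v -n | grep MASQUERADE
If the MASQUERADE counter stays at zero while the wget runs, your traffic isn’t reaching that rule; look earlier in the path.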
Task 14: Find who last touched netfilter rules (journal clues)
cr0x@server:~$ sudo journalctl -u docker -u firewalld -u ufw --since "2 hours ago" | tail -n 25
Jan 02 09:14:22 server dockerd[1289]: time="2026-01-02T09:14:22.902311" level=info msg="Loading containers: done."
Jan 02 09:14:23 server dockerd[1289]: time="2026-01-02T09:14:23.110022" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.1/16. Daemon option --bip can be used to set a preferred IP address"
Jan 02 09:41:07 server firewalld[1022]: Reloaded firewall rules
Jan 02 09:41:07 server dockerd[1289]: time="2026-01-02T09:41:07.514010" level=warning msg="Failed to program NAT chain: iptables: No chain/target/match by that name."
What it means: firewalld reload happened, then Docker failed to program rules. That’s your smoking gun: rule tables got replaced and Docker’s expected chain references broke.
Decision: Fix service ordering, firewall manager configuration, or reduce rule ownership (see strategy section). Don’t keep restarting Docker as a “solution.”
Pick a strategy: legacy iptables, nftables-first, or “Docker doesn’t touch my firewall”
Most outages here come from indecision. Teams accidentally run a hybrid: Docker assumes iptables, the distro assumes nftables, and the firewall manager assumes it owns the whole ruleset. Choose a model and implement it deliberately.
Strategy A: Standardize on iptables-legacy (the “boring and predictable” route)
If you need maximum compatibility with older tooling or you’re untangling years of scripts, iptables-legacy can be the least-worst path. It reduces translation weirdness and aligns with Docker’s original assumptions.
When to choose it:
- Older distributions or kernel/userspace combos where nft tooling is inconsistent.
- Lots of operational muscle memory around iptables output and scripts.
- Minimal appetite to migrate firewall policy right now.
Cost: You’re betting against the distro’s direction. It can be fine, but own the decision.
Implementation sketch (Debian/Ubuntu):
cr0x@server:~$ sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives: using /usr/sbin/iptables-legacy to provide /usr/sbin/iptables (iptables) in manual mode
Decision point: After switching, restart Docker and confirm rules appear with iptables -S. If rules still don’t appear, something else is off (permissions, daemon flags, firewall manager wiping).
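The confirmation step, sketched; remember that ip6tables has its own alternatives link and should be switched together with iptables:
cr0x@server:~$ sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
cr0x@server:~$ sudo systemctl restart docker
cr0x@server:~$ sudo iptables -S | grep -c DOCKER
A non-zero count means Docker is programming the backend you are now inspecting.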
Strategy B: Go nftables-first (the “modern kernel, modern rules” route)
This is where Linux is going. It’s also where people get hurt when they half-migrate. If you choose nftables-first, you need to be disciplined about rule ownership and inspection.
What “nftables-first” actually means:
- Use iptables only as a compatibility interface if required, but inspect with nft.
- Ensure your firewall manager uses nftables and does not wipe Docker-managed chains, or explicitly integrates with Docker.
- Persist nftables rules properly and avoid tools that “flush and replace everything” without coordination.
Important operational point: Docker may still emit iptables commands, but if those are the nft-backed iptables, it’s fine. The key is that you can’t mix nft and legacy tables and expect coherent behavior.
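One pattern that keeps nft-native host policy from stomping on Docker: own a single table of your own and never flush the whole ruleset. A minimal sketch for an nft script file (the table name hostpolicy and the port list are my inventions; adjust to taste):
# Create-if-missing, then replace only our own table. Never "flush ruleset" on a Docker host.
add table inet hostpolicy
delete table inet hostpolicy
table inet hostpolicy {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif "lo" accept
        tcp dport { 22, 443 } accept
    }
}
Because this table only touches the input hook, Docker’s forward and nat rules (whether written natively or via iptables-nft) stay untouched.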
Strategy C: Tell Docker to stop managing iptables (for people who truly control the edge)
Docker has a daemon flag: --iptables=false. It’s tempting. It’s also a sharp knife, and it cuts quietly.
When this is sane:
- You have a dedicated network platform team managing all NAT/filter rules with nftables.
- You run fixed container networks and published ports are explicitly defined in firewall rules.
- You accept that Docker will not “just work” out of the box for developers.
When this is a trap:
- You rely on dynamic port publishing or ephemeral compose stacks.
- You have multiple teams deploying arbitrary containers on the same host.
Opinionated guidance: If you’re not prepared to write and maintain the equivalent of Docker’s NAT and forwarding rules yourself, do not disable Docker iptables. The kernel will not reward your optimism.
Second short joke: Disabling Docker’s iptables is like removing the steering wheel because you “prefer road feel.” You’ll definitely feel the road.
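If you do go this route with eyes open, the switch lives in /etc/docker/daemon.json. This sketch shows only the flag, not the NAT, forwarding, and DNAT rules you now owe the kernel:
{
  "iptables": false
}
After this, -p port publishing no longer creates DNAT rules; you provide them, for every service, forever.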
The one chain you should respect: DOCKER-USER
Docker gives you a supported place to apply policy: DOCKER-USER. It’s evaluated before Docker’s own rules for forwarded traffic. Put your allow/deny policy there. Keep it small. Keep it readable. Make it auditable.
Example policy: block containers from reaching RFC1918 networks except a needed service subnet. (Adjust for your environment.)
cr0x@server:~$ sudo iptables -I DOCKER-USER 1 -d 10.0.0.0/8 -j REJECT
cr0x@server:~$ sudo iptables -I DOCKER-USER 2 -d 172.16.0.0/12 -j REJECT
cr0x@server:~$ sudo iptables -I DOCKER-USER 3 -d 192.168.0.0/16 -j REJECT
cr0x@server:~$ sudo iptables -A DOCKER-USER -j RETURN
What it means: Containers can still reach the internet (NAT), but can’t laterally roam your corporate network. You’re using Docker’s supported hook instead of fighting its chains.
Decision: If you need policy, use DOCKER-USER. If you need complex policy, consider moving container networking to something with explicit network policy (and accept the operational overhead).
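One caveat on the example above: with bridge netfilter enabled, those RFC1918 rejects also match container-to-container traffic on the default 172.17.0.0/16 bridge. If you rely on that traffic, return early for your container subnets first (a sketch; adjust the CIDR to yours):
cr0x@server:~$ sudo iptables -I DOCKER-USER 1 -s 172.17.0.0/16 -d 172.17.0.0/16 -j RETURN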
A single quote (paraphrased idea) to keep you honest
Richard Cook (paraphrased idea): “Success hides complexity; failure reveals it.”
When Docker networking works, nobody asks who owns the firewall. When it fails, you learn exactly how many tools were writing rules behind your back.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
The team had a routine: deploy a new container image, watch the health check, move on. One Tuesday, they rolled a minor patch to a payment service that published a port on the host. The service went healthy, but customer traffic fell off a cliff. From the app’s perspective, nothing was wrong. From the load balancer’s perspective, the backend stopped accepting connections.
The first responder did what every tired engineer does: checked docker ps, confirmed -p 443:8443 was present, and then checked iptables. No DNAT rules. No DOCKER chain references. “Docker didn’t program iptables,” they said, and restarted Docker. It came back for a minute. Then it broke again after a firewall reload. That’s when the incident got interesting.
The wrong assumption was subtle: they assumed the system used legacy iptables because iptables -S showed nothing useful. In reality, the host had switched to nftables months earlier during an OS upgrade. Their runbook still used iptables-legacy tooling, and their “verification step” was inspecting the wrong backend. The rules existed, just not where they were looking.
Meanwhile, a security hardening job ran every 15 minutes and reloaded firewall policy using nftables, replacing the whole ruleset atomically. Docker’s nft-backed iptables rules were not preserved because the hardening job didn’t include them. Every quarter-hour, the published port vanished. It was a perfect metronome for downtime.
The fix was boring: stop replacing the entire nft ruleset, integrate Docker’s required chains, and pin the inspection tooling. The lesson stuck because it hurt: a “simple redeploy” became a firewall backend lesson delivered at production volume.
Mini-story 2: The optimization that backfired
A different company chased boot-time improvements. Someone noticed Docker startup occasionally took longer on a busy host. They profiled it and saw time spent manipulating iptables rules. The proposed optimization: disable Docker’s iptables management and let the “real firewall” handle it. Faster boot, fewer moving parts, better security posture. On a slide deck, it looked like maturity.
They rolled the change to a staging cluster where apps had stable ports and fixed networks. Everything passed. Then it hit production, where teams used Compose stacks and temporary services that published ports for debugging. Those ports started failing in the most annoying way: the containers ran, they logged “listening,” and local curl from the host worked via container IP. But the published host ports were dead because no DNAT rules were created. The operational symptom was “random ports don’t work,” which is a great way to trigger a week of finger-pointing.
To patch it, they added static nftables rules for “the important services.” That helped until the next team deployed something new. The firewall rules became a manual registry of ports that drifted from reality. The supposed optimization created an on-call tax: every new service required a firewall PR. Every incident required someone who understood nftables and Docker networking. The boot-time win was paid back with compound interest.
They ended up re-enabling Docker iptables management and moved policy into DOCKER-USER. Boot time regressed slightly. Incidents dropped sharply. The moral: performance “wins” that remove automation often just relocate work to humans, and humans are notoriously non-deterministic.
Mini-story 3: The boring but correct practice that saved the day
A third team ran a regulated environment. They treated firewall state like code: versioned config, explicit ownership, and integration tests that validated required chains existed after reloads. No heroics, just discipline. It was the kind of practice that feels slow until it isn’t.
During a routine OS patch, the distribution changed the default iptables alternative from legacy to nft. On the first reboot, half the containers lost outbound connectivity. It looked like a typical “Docker broke after patch” event, the kind that ruins weekends.
But their monitoring had a targeted check: verify that MASQUERADE for the container subnet exists and that packet counters increment under synthetic traffic. The alert fired within minutes of the reboot. The on-call followed a runbook that started with “verify backend” and “verify NAT counters,” not “restart Docker until it behaves.”
They found the mismatch fast: their firewall manager was loading legacy rules while the system’s iptables binary now targeted nftables. Two parallel universes. The fix was equally unglamorous: explicitly pin alternatives for iptables in the configuration management, and add a post-reload verification that checks the active ruleset with nft list ruleset.
Nothing magical happened. They just had fewer assumptions. That’s what “boring” looks like when it works.
Common mistakes: symptom → root cause → fix
1) Containers have no internet, but DNS resolves
Symptom: dig works; curl hangs or times out from containers.
Root cause: Missing MASQUERADE rule or forwarding blocked (FORWARD policy DROP, missing ACCEPT rules).
Fix: Confirm NAT in the correct backend; ensure net.ipv4.ip_forward=1; ensure FORWARD rules allow docker0 egress and RELATED,ESTABLISHED return traffic.
2) Published ports stopped working after a firewall reload
Symptom: docker run -p works until firewalld/ufw reload; then inbound breaks.
Root cause: Firewall manager replaces the ruleset and drops Docker’s DNAT chains/jumps.
Fix: Configure firewall manager to preserve Docker chains or order services so Docker reprograms rules after reload. Prefer policy in DOCKER-USER.
3) iptables shows no Docker rules, but containers are running
Symptom: iptables -S looks empty; you assume Docker didn’t set rules.
Root cause: You’re looking at iptables-legacy while Docker is writing via iptables-nft (or vice versa).
Fix: Check iptables --version and alternatives. Inspect with nft list ruleset if nft-backed.
4) Inter-container traffic on the same bridge is blocked
Symptom: Containers can reach the internet but can’t reach each other by IP.
Root cause: Bridge netfilter enabled with restrictive FORWARD rules, or Docker isolation chains plus custom rules blocking intra-bridge.
Fix: Allow -i docker0 -o docker0 in forward path (or the equivalent nft rule). Re-check sysctls bridge-nf-call-iptables.
5) Docker fails to start with “Failed to program NAT chain”
Symptom: Docker daemon logs errors about missing chains/targets.
Root cause: Another tool flushed tables while Docker was configuring, or you have inconsistent iptables modules/backends.
Fix: Fix service ordering; stop flushing tables behind Docker; align iptables backend; restart Docker after ensuring a stable firewall state.
6) Only some hosts in a fleet have the problem
Symptom: “It’s fine on host A, broken on host B” after the same deployment.
Root cause: Different iptables alternatives, different firewall managers, or different kernel sysctls from hardening baselines.
Fix: Standardize: pin alternatives, enforce sysctls, and verify with automated checks that validate NAT/forwarding rules exist in the active backend.
7) IPv6 works but IPv4 doesn’t (or the opposite)
Symptom: Containers reach IPv6 destinations but not IPv4, or vice versa.
Root cause: Separate rule paths: ip6tables vs iptables (or inet tables in nft). One side got configured, the other didn’t.
Fix: Inspect both families. In nft, prefer table inet for filter rules when appropriate, and ensure NAT exists for the address family you care about.
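A quick way to see whether both families got Docker rules at all (a sketch; the counts will differ per host):
cr0x@server:~$ sudo iptables -S | grep -c DOCKER
cr0x@server:~$ sudo ip6tables -S | grep -c DOCKER
A zero on one side while the other has rules tells you which family was never programmed.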
Checklists / step-by-step plan
Checklist A: Stabilize a broken host during an incident (15 minutes, no heroics)
- Confirm backend: iptables --version, then verify with nft list ruleset if nft-backed.
- Confirm forwarding: sysctl net.ipv4.ip_forward and set to 1 if needed.
- Confirm NAT: look for MASQUERADE for the Docker subnet in the active backend.
- Confirm FORWARD path: ensure DOCKER-USER and Docker’s forward ACCEPT rules exist and are referenced.
- Check whether a firewall reload wipes rules. If yes, stop the bleeding: temporarily avoid reloads, or restart Docker after reload as a short-term workaround.
- Use counters to confirm packets hit NAT and forward rules.
Checklist B: Make the fix permanent (the part everyone skips and then regrets)
- Pick your policy owner: either Docker manages iptables rules (default), or your firewall does and Docker is constrained. Don’t do “both.”
- Pin iptables backend consistently: use alternatives to ensure fleet-wide uniformity.
- Lock service ordering: Docker should start after the firewall is ready, and firewall reloads must not erase Docker chains.
- Put custom policy in DOCKER-USER: that’s the supported, stable hook.
- Persist sysctls: forwarding and bridge sysctls should be set via /etc/sysctl.d/.
- Add validation: post-boot and post-firewall-reload checks that verify MASQUERADE and chain jumps exist in the active backend.
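A validation check can be embarrassingly small and still catch this whole class of failure. A sketch to run post-boot and post-reload (assumes the default bridge subnet and that the alpine image and example.com check suit your environment; adapt both):
#!/bin/sh
# Run as root. Fail loudly if the Docker firewall baseline is missing.
set -e
test "$(sysctl -n net.ipv4.ip_forward)" = "1"
iptables -t nat -S POSTROUTING | grep -q MASQUERADE
iptables -S FORWARD | grep -q DOCKER-USER
docker run --rm alpine:3.20 wget -qO- --timeout=5 http://example.com >/dev/null
echo "docker networking baseline OK"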
Checklist C: A clean migration from legacy iptables to nftables (without surprise downtime)
- Inventory current iptables rules and identify which are Docker-managed vs site policy.
- Move site policy into DOCKER-USER where possible.
- Switch inspection tooling in runbooks from iptables output to nft (or ensure iptables-nft is used consistently).
- Test firewall reload behavior in staging: confirm Docker’s chains persist and that published ports survive reload.
- Roll to a small canary set of hosts; validate with synthetic container egress/inbound checks.
- Only then switch the fleet.
FAQ
1) Why does Docker still use iptables in 2026?
Because the kernel interface for packet filtering/NAT is netfilter, and iptables has been the practical interface for years. Docker’s model is built around inserting chains and rules incrementally. nftables is newer and better in many ways, but “newer” doesn’t automatically mean “every ecosystem component migrated.”
2) What’s the difference between iptables-nft and nftables?
iptables-nft is the iptables command writing rules into nftables using a translation layer. Native nftables uses nft and its own rule language and objects. The important operational detail: tools that manage nftables natively can replace rulesets atomically and accidentally remove rules created via iptables-nft unless they intentionally preserve them.
3) How do I know if I’m looking at the “right” rules?
Start with iptables --version. If it says (nf_tables), then nft list ruleset is the authoritative view. If you see (legacy), then iptables output reflects the actual active rules. Don’t trust muscle memory; verify.
4) Should I use firewalld with Docker?
You can, but you need to configure it intentionally. The main failure mode is firewalld reload wiping Docker’s rules or changing default forwarding behavior. If you use firewalld, test reload behavior, confirm Docker chains persist, and avoid policies that flush/replace the entire ruleset without accommodating Docker.
5) Where should I put “block containers from reaching X” rules?
Use the DOCKER-USER chain. Docker expects it and won’t overwrite it. Do not edit Docker’s own chains as your long-term policy store.
6) Is it safe to restart Docker to “fix networking”?
It’s a tactical move, not a fix. Restarting Docker may reprogram rules temporarily, but if a firewall manager reload wipes them again, you’ll be back here—just with extra disruption. Use restarts to restore service, then fix the ownership and ordering problem.
7) What about rootless Docker?
Rootless changes the networking model; it often uses user-space networking (like slirp4netns) rather than programming host iptables. That can avoid the firewall war, but it comes with different performance and feature trade-offs, especially for published ports and low-level networking requirements.
8) Why does the FORWARD policy being DROP matter if INPUT is ACCEPT?
Container traffic forwarded through the host hits the FORWARD chain, not INPUT. INPUT is for traffic destined to the host itself. If FORWARD is DROP and you don’t have the right exceptions, containers will be trapped on their bridge like it’s a very polite prison.
9) Can I run nftables for host security and still let Docker manage NAT?
Yes, if you do it coherently. Let Docker program rules via iptables-nft into nftables, and make sure your nftables management doesn’t wipe those rules. Put your custom policy in DOCKER-USER (via iptables) or create nft-native rules that integrate without flushing Docker’s tables.
10) My distro update flipped the backend and now everything is weird. What’s the safest response?
Pick one backend and enforce it fleet-wide. During stabilization, many teams choose iptables-legacy because it matches older assumptions. Longer term, migrating to nftables-first is fine—but do it with tests that cover reloads, NAT, forwarding, and published ports.
Conclusion: the next steps that actually stick
Docker networking failures around iptables/nftables aren’t mysterious. They’re political. Too many actors, not enough ownership. Your job is to stop the war by making a clear choice and enforcing it.
- Decide the backend (nftables-first or iptables-legacy) and pin it using alternatives/config management.
- Verify forwarding and NAT with counters, not vibes.
- Prove rule persistence across firewall reloads and reboots. If rules disappear, fix ordering or stop doing full ruleset replacements.
- Use DOCKER-USER as your stable policy hook. Keep Docker’s chains Docker’s problem.
- Automate a post-boot check that validates: backend, MASQUERADE present, forward jump to DOCKER-USER present, and at least one synthetic container can reach an external endpoint.
If you do those five things, you won’t eliminate networking incidents—this is Linux, not a fairy tale—but you will eliminate the class of incidents where you’re debugging the wrong firewall universe.