Docker networks: Bridge vs host vs macvlan — pick the one that won’t bite later

Some outages don’t start with a crash. They start with “We only changed networking.” Then suddenly your containers can’t reach the database, your monitoring goes blind, and the security team discovers your app is now listening on every interface like it’s 2009.

Docker gives you three tempting levers for local L2/L3 behavior—bridge, host, and macvlan. All three can work. All three can ruin your weekend if you pick them for the wrong reason. Let’s choose the one that ages well in production.

Decision first: what to pick in 60 seconds

If you’re running production systems, your default should be boring. Boring networks get paged less.

Pick bridge when

  • You need port publishing (-p) and reasonable isolation.
  • You want the host to stay the host, not “container soup with root on top.”
  • You have multiple services per host and want them to coexist without port conflicts.
  • You want a path to more advanced designs later (multiple networks, internal networks, policy controls).

Pick host when

  • You need very low overhead and can accept the blast radius (packet filters, ports, and namespaces are shared with the host).
  • You run one major network-facing service per host, and it already binds the ports it needs.
  • You have a good story for firewalling and observability on the host.

Pick macvlan when

  • You need containers to appear as first-class L2 citizens with their own MAC/IP on your physical LAN.
  • You’re integrating with systems that expect unique IPs per workload (legacy licensing, ACLs, upstream routers, multicast, awkward monitoring, “security appliances”).
  • You can manage L2 realities: ARP, CAM tables, switch port security, and IPAM discipline.

My opinionated default: start with user-defined bridge networks. Reach for host only when you can defend the risk in writing. Use macvlan only when you must interface with the LAN like a real host and you’ve validated the switch won’t punish you for it.

The mental model that prevents dumb outages

Docker networking isn’t magic. It’s Linux network namespaces plus some glue: virtual Ethernet pairs (veth), a bridge device, routing rules, and NAT/iptables/nftables rules. The “driver” you choose mostly decides where those packets travel and who owns the ports.
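
None of this is hidden; you can look at the plumbing directly on a live host. A quick, read-only sketch (device and chain names will differ between setups):

lsns -t net                        # network namespaces: one per container, plus the host's own
ip -br link show type veth         # host-side halves of the container veth pairs
ip -br link show type bridge       # docker0 and any br-<id> user-defined bridges
sudo iptables -t nat -S DOCKER     # the NAT rules Docker maintains for published ports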

Three questions that decide everything

  1. Where do ports live? Are you mapping container ports to host ports (bridge), or does the container share the host’s port space (host), or does the container get its own IP (macvlan)?
  2. Who does L3/L4 policy enforcement? Host firewall + Docker-managed rules, or upstream network ACLs, or both?
  3. What is your failure domain? Do you want “one container got weird” to become “the host is weird”?

Also: performance is rarely your first problem. Debuggability and predictability are. A network driver that’s 3% faster but 30% harder to triage is not an optimization. It’s a future incident with a calendar invite.

One line I’ve watched come true in more postmortems than I care to count, paraphrasing Gene Kranz: “Hope is not a strategy.”

Bridge networking: the default for a reason

Bridge mode is Docker’s “I want containers to be their own little world, but still reachable” story. The container gets an IP on a private subnet. Docker creates a Linux bridge (like docker0 or a user-defined one), then wires container eth0 to it via a veth pair. Outbound traffic is routed/NATed to the host’s interface. Inbound traffic typically uses published ports.

Why bridge is production-friendly

  • Port publishing is explicit. You open what you mean to open. That matters when the container image changes and suddenly binds extra ports.
  • Names matter. User-defined bridge networks provide built-in DNS-based service discovery, so containers can talk by name without you duct-taping IPs into configs (see the sketch after this list).
  • Isolation is real-ish. It’s not a full VM boundary, but it’s a meaningful containment line for accidental port collisions and some classes of misconfig.
  • Debugging is tractable. You can reason about flows: container → veth → bridge → host → uplink; and the NAT rules are visible.
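
A minimal sketch of that pattern, assuming a hypothetical my-api image that listens on 8080 (the database image and all names here are examples):

docker network create app-net
docker run -d --name db-1 --network app-net -e POSTGRES_PASSWORD=change-me postgres:16   # nothing published; internal only
docker run -d --name api-1 --network app-net -p 127.0.0.1:8080:8080 my-api:latest        # one explicit, loopback-bound port
docker exec api-1 getent hosts db-1                                                      # name resolution via the embedded DNS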

When bridge bites

  • MTU mismatches. Overlay networks, VPNs, or jumbo frames can make PMTUD lie and packets disappear.
  • NAT surprises. Source IP changes can break upstream ACLs or confuse logs.
  • Hairpin behavior. Container-to-host-to-container via published port can behave differently than container-to-container direct traffic.
  • Firewall drift. Docker manipulates iptables/nftables. If your baseline firewall assumes full control, you’ll have a turf war.

Bridge networking is like a reliable sedan. Not sexy, but it starts in the winter, the parts are cheap, and everyone knows how to fix it.

Host networking: fast, sharp, and unsafe by default

--network host drops the pretense: the container shares the host’s network namespace. No NAT. No container IP. No port publishing. If the process binds 0.0.0.0:443, it’s binding the host’s port 443. That’s the point.
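
A minimal sketch of what that means in practice (nginx:alpine listens on port 80 by default; any image that binds a port behaves the same way):

docker run -d --name edge --network host nginx:alpine
sudo ss -lntp | grep ':80 '                                # nginx owns host port 80 directly; no DNAT, no published port
docker run -d --name edge-2 --network host nginx:alpine   # a second copy loses the race for port 80 (its logs will show "address already in use")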

What host mode is actually good for

  • High packet rate workloads where NAT and conntrack overhead are measurable and painful.
  • Network appliances (routing daemons, BGP speakers, DHCP servers) where you want direct interface semantics.
  • Simple single-tenant hosts where one workload owns the box.

What host mode breaks (quietly)

  • Port collisions become “random” failures. Deploy two services that both want 8125/udp and you’ll discover it at runtime.
  • Security boundaries get blurry. The container can see host interfaces, sometimes host-local services, and your “it’s just inside Docker” assumption dies.
  • Observability gets weird. Tools that expect container IPs lose an anchor; traffic attribution can require cgroup-aware tooling.

Short joke #1: Host networking is like giving your container the master keys because it promised it would only move the car to vacuum it.

Host mode governance that makes it survivable

  • Use systemd or an orchestrator to prevent two containers binding the same port.
  • Enforce host firewall policy explicitly; don’t rely on Docker’s “nice defaults.”
  • Document port ownership per host like it’s a contract. Because it is.
  • Prefer host mode only for workloads that justify it: packet capture, metrics agents, edge proxies, or real networking daemons.

Macvlan: the “looks like a real host” option (and its traps)

Macvlan assigns each container its own MAC address and IP on your physical network segment. From the rest of the LAN, the container is a peer. No port mapping, no NAT. Just “here is another machine.” It’s seductive because it eliminates a category of awkwardness: upstream systems can talk directly to containers without special port juggling.

When macvlan is the right answer

  • Legacy ACLs and IP-based allowlists where you can’t or won’t rewrite policy around NAT.
  • Appliance-like containers that need their own IP identity for routing, monitoring, or network segmentation.
  • Multicast/broadcast-dependent software where NAT/bridge semantics are painful (with caveats; not everything becomes easy).

The big macvlan gotcha: host-to-container traffic

By default, with macvlan, the host cannot talk to its own macvlan children on the same physical interface. This surprises people every time: the macvlan driver deliberately isolates the parent interface from its child interfaces, so the traffic never makes the hop you expect.

The fix is usually to create a macvlan sub-interface on the host (a “shim” interface) in the same macvlan network and route through it. That’s not hard, but it’s another moving part that has to survive reboots and live in config management.

Where macvlan bites later

  • Switch port security and CAM table limits. Some switches don’t like one physical port suddenly emitting dozens of MACs. You’ll learn this at 2 a.m., during an incident, if you don’t ask first.
  • ARP storms and neighbor table churn. Containers come and go; ARP caches don’t always keep up.
  • IPAM becomes your problem. Docker’s IPAM can allocate from a range, but it won’t negotiate with your DHCP server or your network team’s spreadsheet unless you do the legwork.
  • Harder “single box” testing. You can’t just run everything on one laptop and expect the LAN to behave identically, especially when Wi‑Fi enters the chat.

Short joke #2: Macvlan is great until your switch sees thirty new MAC addresses and decides it’s time to practice mindfulness by dropping traffic.

Interesting facts and a little history (useful, not trivia night)

  • Linux network namespaces (the foundation for container networking) landed in the kernel in the late 2000s, originally to isolate networking stacks for processes without full virtualization.
  • Docker’s early networking leaned heavily on iptables NAT rules; for years, “Docker broke my firewall” was a rite of passage because it inserted chains automatically.
  • User-defined bridge networks added embedded DNS-based service discovery, which was a big step up from the old --link mechanism that baked brittle host entries into containers.
  • Macvlan as a kernel feature predates Docker adoption; it’s a Linux driver that lets multiple virtual MACs share one physical interface, used historically for network segregation and lab environments.
  • Conntrack (connection tracking) is a hidden tax in NAT-heavy setups; high connection rates can exhaust conntrack tables and look like random packet loss.
  • MTU issues got worse as overlays/VPNs became common; “works on one host, breaks across sites” often comes down to path MTU and fragmentation behavior.
  • Host networking is effectively opting out of one of containers’ most practical isolations: port namespace separation. This is why many orchestrators treat it as a privileged choice.
  • Bridge vs macvlan is often a debate about where identity lives: in the host (bridge/NAT) or in the LAN (macvlan). Neither is free.

Practical tasks: commands, outputs, and the decision you make

These are the tasks I actually run when someone says, “Networking is weird.” Each has a command, an example output, what it means, and what you decide next.

Task 1: List Docker networks and spot the obvious

cr0x@server:~$ docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
a1b2c3d4e5f6   bridge    bridge    local
f6e5d4c3b2a1   host      host      local
112233445566   none      null      local
77aa88bb99cc   app-net   bridge    local

What it means: You have the default bridge, plus a user-defined bridge (app-net). That’s good: user-defined bridge networks give better DNS behavior and separation.

Decision: If your services are still using the default bridge and legacy patterns, move them to a user-defined network unless there’s a reason not to.
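
A minimal sketch of that migration for a running container (legacy-api is a placeholder name; redeploying with --network is cleaner if you can afford the restart):

docker network create app-net                    # skip if it already exists
docker network connect app-net legacy-api        # the container is now attached to both networks
docker network disconnect bridge legacy-api      # drop the default bridge once name resolution and traffic look right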

Task 2: Inspect a network and confirm subnet, gateway, and options

cr0x@server:~$ docker network inspect app-net
[
  {
    "Name": "app-net",
    "Driver": "bridge",
    "IPAM": {
      "Config": [
        {
          "Subnet": "172.22.0.0/16",
          "Gateway": "172.22.0.1"
        }
      ]
    },
    "Options": {
      "com.docker.network.bridge.name": "br-77aa88bb99cc"
    }
  }
]

What it means: This bridge uses a dedicated Linux bridge device and a defined subnet. Predictable.

Decision: If this subnet overlaps with your corporate VPN or data center ranges, change it now. Overlap causes “only broken from some laptops” incidents.
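
Two ways to keep control of that, assuming you can pick the ranges (the subnets below are placeholders; use space your network team has actually reserved). Per network:

docker network create --subnet=10.213.42.0/24 app-net

Or daemon-wide, via default-address-pools in /etc/docker/daemon.json (restart dockerd afterwards; it only affects networks created later):

{
  "default-address-pools": [
    { "base": "10.213.0.0/16", "size": 24 }
  ]
}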

Task 3: Check which network a container is actually using

cr0x@server:~$ docker inspect -f '{{json .NetworkSettings.Networks}}' api-1
{"app-net":{"IPAddress":"172.22.0.10","Gateway":"172.22.0.1","MacAddress":"02:42:ac:16:00:0a"}}

What it means: The container is on app-net with an internal IP.

Decision: If the app expects to be reachable from the LAN without port mapping, bridge is not enough; consider macvlan or proper ingress/proxying.

Task 4: Confirm published ports and where they bind

cr0x@server:~$ docker ps --format 'table {{.Names}}\t{{.Ports}}'
NAMES     PORTS
api-1     0.0.0.0:8080->8080/tcp, [::]:8080->8080/tcp
db-1      5432/tcp

What it means: api-1 is exposed on host port 8080. db-1 is not published; it’s internal-only (good).

Decision: If you see 0.0.0.0 bindings you didn’t intend, lock it down (-p 127.0.0.1:... or firewall) before someone else finds it.
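
A minimal sketch of tighter publishing (my-api is a placeholder image; 10.0.0.5 stands in for an internal-facing interface address):

docker run -d --name api-1 -p 127.0.0.1:8080:8080 my-api:latest    # loopback only; put a local reverse proxy in front
docker run -d --name admin-1 -p 10.0.0.5:8443:8443 my-api:latest   # published only on the internal interface

If most services on a host should never be public, daemon.json also accepts an "ip" setting that changes the default address -p binds to; treat that as defense in depth, not a substitute for a firewall.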

Task 5: Check Docker’s NAT rules (iptables) and whether they exist

cr0x@server:~$ sudo iptables -t nat -S | sed -n '1,40p'
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-N DOCKER_OUTPUT
-N DOCKER_POSTROUTING
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER_OUTPUT
-A POSTROUTING -s 172.22.0.0/16 ! -o br-77aa88bb99cc -j MASQUERADE

What it means: Docker is managing NAT for the bridge subnet. That MASQUERADE rule is your outbound path.

Decision: If you’re on nftables-only systems, verify Docker is compatible with your firewall stack. Mixed tooling causes “rules exist but don’t apply” confusion.
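
A quick way to check which firewall backend is actually in play (exact wording varies by distro):

iptables --version                          # prints "(nf_tables)" or "(legacy)" after the version string
sudo nft list ruleset | grep -ic docker     # non-zero means Docker's chains live in the nftables ruleset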

Task 6: Check conntrack pressure (NAT pain shows up here)

cr0x@server:~$ sudo conntrack -S
cpu=0 found=120384 invalid=42 ignore=0 insert=120410 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0

What it means: Invalid packets are low; no drops. Conntrack isn’t currently on fire.

Decision: If you see drops/insert_failed climbing during load, raising conntrack limits or reducing NAT/connection churn becomes urgent. Host mode sometimes “fixes” this by avoiding NAT, but it’s a trade.
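
If you do see pressure, these are the first knobs to inspect (the value below is an example, not a recommendation; size it from your own measurements):

sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max    # how close to the ceiling are you?
sudo sysctl -w net.netfilter.nf_conntrack_max=262144                      # temporary bump for the running kernel
echo 'net.netfilter.nf_conntrack_max = 262144' | sudo tee /etc/sysctl.d/90-conntrack.conf   # make it persistent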

Task 7: Verify route and MTU from inside the container

cr0x@server:~$ docker exec api-1 ip route
default via 172.22.0.1 dev eth0
172.22.0.0/16 dev eth0 proto kernel scope link src 172.22.0.10
cr0x@server:~$ docker exec api-1 ip link show eth0
2: eth0@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:ac:16:00:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0

What it means: Default route is the bridge gateway; MTU is 1500.

Decision: If your underlay is 1450 because of VPN/overlay, set Docker network MTU or host MTU so the container doesn’t send packets that get black-holed.
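
A minimal sketch of pinning the MTU at network creation, assuming an underlay with an effective MTU of 1450:

docker network create -o com.docker.network.driver.mtu=1450 app-net-vpn
docker run --rm --network app-net-vpn alpine ip link show eth0    # confirm the container side reports mtu 1450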

Task 8: See the host-side veth and bridge membership

cr0x@server:~$ ip link show br-77aa88bb99cc
7: br-77aa88bb99cc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:11:22:33:44 brd ff:ff:ff:ff:ff:ff
cr0x@server:~$ bridge link | grep br-77aa88bb99cc | head
21: veth6d2b1a2@if20: <BROADCAST,MULTICAST,UP,LOWER_UP> master br-77aa88bb99cc state forwarding priority 32 cost 2

What it means: The container’s veth peer is attached to the bridge and forwarding.

Decision: If you don’t see the interface or it’s not forwarding, you likely have a kernel/bridge issue or the container is in a bad network namespace state. Restarting Docker might “fix it,” but first capture evidence.

Task 9: Test DNS inside a user-defined bridge network

cr0x@server:~$ docker exec api-1 getent hosts db-1
172.22.0.11    db-1

What it means: Docker’s embedded DNS works; container-to-container resolution is fine.

Decision: If DNS fails here, don’t blame your corporate DNS first. Check that containers share a user-defined network; the default bridge behaves differently.

Task 10: Confirm host networking behavior by checking container IP (there won’t be one)

cr0x@server:~$ docker run --rm --network host alpine ip addr show eth0 | sed -n '1,12p'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 0c:de:ad:be:ef:01 brd ff:ff:ff:ff:ff:ff
    inet 10.20.30.40/24 brd 10.20.30.255 scope global eth0
       valid_lft forever preferred_lft forever

What it means: You’re seeing the host’s interface addressing from inside the container. That’s what “host network” means.

Decision: If you need per-container firewall rules or distinct IP identities, host mode is the wrong tool.

Task 11: Create a macvlan network with a controlled IP range

cr0x@server:~$ docker network create -d macvlan \
  --subnet=10.50.10.0/24 --gateway=10.50.10.1 \
  --ip-range=10.50.10.128/25 \
  -o parent=eno1 macvlan-net
8f9e0d1c2b3a4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e

What it means: Containers can be assigned addresses in 10.50.10.128/25 while the rest of the subnet stays reserved for other uses.

Decision: If you can’t reserve a clean range and document it, don’t use macvlan. IP conflicts are slow-motion disasters.

Task 12: Run a container on macvlan and confirm it has a LAN IP

cr0x@server:~$ docker run -d --name web-mv --network macvlan-net --ip 10.50.10.140 nginx:alpine
c2b1a0f9e8d7c6b5a4f3e2d1c0b9a8f7e6d5c4b3a2f1e0d9c8b7a6f5e4d3c2b1
cr0x@server:~$ docker exec web-mv ip addr show eth0 | sed -n '1,10p'
2: eth0@if33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:0a:32:0a:8c brd ff:ff:ff:ff:ff:ff
    inet 10.50.10.140/24 brd 10.50.10.255 scope global eth0

What it means: The container is now a first-class LAN endpoint.

Decision: Verify your switch policy (port security, MAC limits). If packets drop after a few containers, it’s not “Docker being flaky.” It’s your L2 enforcing rules.

Task 13: Confirm the classic macvlan limitation: host can’t reach container (default)

cr0x@server:~$ ping -c 2 10.50.10.140
PING 10.50.10.140 (10.50.10.140) 56(84) bytes of data.

--- 10.50.10.140 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1017ms

What it means: Host-to-macvlan-child traffic isn’t working. This is expected in many setups.

Decision: If host services (backup agents, local monitors, sidecars) must talk to those containers, create a macvlan shim on the host.

Task 14: Add a macvlan shim interface on the host to reach macvlan containers

cr0x@server:~$ sudo ip link add macvlan-shim link eno1 type macvlan mode bridge
cr0x@server:~$ sudo ip addr add 10.50.10.2/24 dev macvlan-shim
cr0x@server:~$ sudo ip link set macvlan-shim up
cr0x@server:~$ ping -c 2 10.50.10.140
PING 10.50.10.140 (10.50.10.140) 56(84) bytes of data.
64 bytes from 10.50.10.140: icmp_seq=1 ttl=64 time=0.412 ms
64 bytes from 10.50.10.140: icmp_seq=2 ttl=64 time=0.398 ms

--- 10.50.10.140 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms

What it means: The host can now reach the macvlan network via the shim.

Decision: Make it persistent (systemd-networkd/NetworkManager) or it will vanish on reboot and you’ll rediscover this limitation in an incident.
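
One way to persist it, assuming systemd-networkd manages eno1 (NetworkManager and ifupdown have equivalents): declare the device, give it the address, and attach it to the parent interface’s .network file.

# /etc/systemd/network/25-macvlan-shim.netdev
[NetDev]
Name=macvlan-shim
Kind=macvlan

[MACVLAN]
Mode=bridge

# /etc/systemd/network/25-macvlan-shim.network
[Match]
Name=macvlan-shim

[Network]
Address=10.50.10.2/24

# plus, in the .network file that matches eno1, add under its [Network] section:
MACVLAN=macvlan-shim

# then apply: sudo systemctl restart systemd-networkd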

Task 15: Validate ARP/neighbor table sanity on the host

cr0x@server:~$ ip neigh show dev eno1 | head
10.50.10.1 lladdr 00:11:22:33:44:55 REACHABLE
10.50.10.140 lladdr 02:42:0a:32:0a:8c REACHABLE
10.50.10.141 lladdr 02:42:0a:32:0a:8d STALE

What it means: The host is learning neighbors. If you see lots of FAILED or constant churn, macvlan might be stressing L2/L3.

Decision: If neighbor churn correlates with packet loss, reduce container churn, adjust GC thresholds, or reconsider macvlan for that environment.

Task 16: Confirm which process owns a port (host mode and “mystery listeners”)

cr0x@server:~$ sudo ss -lntp | grep ':8080'
LISTEN 0      4096         0.0.0.0:8080      0.0.0.0:*    users:(("nginx",pid=21457,fd=6))

What it means: Something (here, nginx) owns port 8080 on the host network stack.

Decision: If you expected Docker port publishing but see a direct listener, you might be in host network mode or running the service on the host by accident. Fix the deployment model before chasing phantom firewall issues.

Task 17: Trace packet path quickly with tcpdump (bridge vs host vs macvlan)

cr0x@server:~$ sudo tcpdump -ni br-77aa88bb99cc port 5432 -c 5
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on br-77aa88bb99cc, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:01:10.112233 IP 172.22.0.10.49822 > 172.22.0.11.5432: Flags [S], seq 123456789, win 64240, options [mss 1460,sackOK,TS val 1 ecr 0,nop,wscale 7], length 0

What it means: You’re seeing container-to-container traffic on the bridge. That confirms the issue isn’t “packets never left the container.”

Decision: If traffic appears on the bridge but not on the uplink, focus on routing/NAT rules. If it appears on uplink but not reaching the destination, it’s upstream.

Fast diagnosis playbook

This is the order that saves time. Not the order that makes you feel like a network wizard.

First: identify the network mode and the intended reachability

  • Run docker inspect on the container and confirm if it’s bridge/host/macvlan.
  • Clarify: is the failure container → internet, container → container, LAN → container, or host → container?

Bottleneck hint: Most “Docker networking” issues are really “your mental model is wrong” issues. Fix the model, then fix the config.

Second: check the obvious L3 basics (inside container and on host)

  • Inside container: ip addr, ip route, DNS resolution with getent hosts.
  • On host: bridge state, veth presence, route to subnet, neighbor table (macvlan).

Bottleneck hint: Missing default route or bad DNS causes 80% of “can’t reach X” complaints.

Third: validate policy and translation (iptables/nftables, port bindings, conntrack)

  • Bridge mode inbound issues: check published ports and iptables NAT chains.
  • Host mode issues: check which process owns the port, and host firewall rules.
  • Macvlan issues: check ARP/neighbor status and switch/security constraints.
  • High load weirdness: check conntrack stats and kernel logs.

Bottleneck hint: If traffic works briefly then fails under load, assume conntrack pressure or upstream L2 enforcement before you blame Docker.

Fourth: capture packets at the right interface

  • Bridge: tcpdump on the container interface (inside) and on the bridge (br-*).
  • Host: tcpdump on the host interface and use process-level tools to attribute traffic.
  • Macvlan: tcpdump on the parent interface and the macvlan shim (if used).

Bottleneck hint: If packets leave the container but never hit the bridge/uplink, you’ve got namespace wiring trouble. If they hit the uplink, it’s upstream.

Common mistakes: symptoms → root cause → fix

1) “The container can’t reach the internet, but DNS resolves”

Symptom: getent hosts example.com works; curl hangs or times out.

Root cause: Missing/incorrect default route, broken NAT masquerade rule, or host firewall blocking forwarding.

Fix: Verify ip route inside the container; on host, confirm iptables -t nat MASQUERADE and that IP forwarding is enabled. Ensure host firewall allows forwarding from the bridge subnet.
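
The same checks in command form (api-1 and the bridge subnet come from the earlier tasks; substitute your own names):

docker exec api-1 ip route                                # is there a default route via the bridge gateway?
sudo sysctl net.ipv4.ip_forward                           # must be 1 for the host to forward container traffic
sudo iptables -t nat -S POSTROUTING | grep MASQUERADE     # is the masquerade rule for the bridge subnet present?
sudo iptables -S DOCKER-USER                              # site-specific DROP rules often hide in this chain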

2) “Service is up, but nothing can connect from outside” (bridge)

Symptom: Container listens on 0.0.0.0:8080 internally, but LAN clients can’t connect.

Root cause: Port not published, or published only on localhost, or host firewall blocks the published port.

Fix: Check docker ps for 0.0.0.0:hostport->containerport. If missing, add -p. Then verify host firewall allows inbound.

3) “Two containers keep flapping, sometimes one won’t start” (host)

Symptom: Restart loops or intermittent bind errors; logs mention “address already in use.”

Root cause: Host mode shares port namespace; both services want the same port.

Fix: Stop using host networking for both, or redesign ports. In host mode, treat port allocation like a global resource per host.

4) “Macvlan containers work from the LAN, but the host can’t reach them”

Symptom: From another machine on the subnet, you can connect; from the Docker host, it fails.

Root cause: Default macvlan behavior prevents host-to-child communication on the same parent.

Fix: Add a macvlan shim interface on the host with an IP in the same subnet and route via it (or choose ipvlan L3 if appropriate for your environment).

5) “Everything worked until we added more containers; then random timeouts” (macvlan)

Symptom: New containers are reachable sometimes; ARP looks flaky; switch logs complain.

Root cause: Switch port security or MAC address limits, CAM table pressure, or ARP rate limiting.

Fix: Coordinate with network team: raise MAC limits on the port, disable strict security where appropriate, or avoid macvlan on that segment.

6) “Some requests hang, especially large responses” (any mode)

Symptom: Small pings work; large transfers stall; TLS handshakes sometimes fail.

Root cause: MTU mismatch and broken path MTU discovery.

Fix: Measure effective MTU, set Docker network MTU or adjust interface MTUs consistently, and validate with DF-bit pings.
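
A minimal sketch of that measurement with iputils ping from the host (203.0.113.10 stands in for a host on the far side of the VPN/overlay; 1472 = 1500 minus 28 bytes of IP and ICMP headers):

ping -M do -s 1472 -c 3 203.0.113.10    # fails if the effective path MTU is below 1500
ping -M do -s 1422 -c 3 203.0.113.10    # works at ~1450; bisect the size until you find the real ceiling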

7) “Container-to-container works by IP, not by name” (bridge)

Symptom: ping 172.22.0.11 works; ping db-1 fails.

Root cause: Containers are not on the same user-defined bridge network, or you’re using the default bridge without proper DNS behavior.

Fix: Put both containers on the same user-defined bridge network and use container names (or explicit aliases) there.
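
In command form, assuming both containers already exist (the --alias is optional; plain names work once both sides share the network):

docker network connect --alias db app-net db-1    # adds an extra DNS alias "db" on this network
docker network connect app-net api-1
docker exec api-1 getent hosts db                 # both "db-1" and "db" should now resolve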

Checklists / step-by-step plan

Step-by-step: choosing the driver safely

  1. Write the reachability matrix. Who needs to talk to whom (LAN → container, container → LAN, host → container, container → internet).
  2. Decide identity requirements. Do upstream systems require per-workload IPs? If yes, macvlan might be required; otherwise, prefer bridge.
  3. Decide exposure model. Do you want explicit published ports (bridge) or shared host ports (host)? Default to explicit.
  4. Check your firewall ownership model. If the host firewall is centrally managed and Docker changes are frowned upon, plan the integration carefully (bridge) or avoid Docker NAT-heavy patterns.
  5. Check your switch policy if macvlan is in play (MAC limits, port security, ARP inspection). Do this before deployment.
  6. Pick the simplest model that satisfies requirements. Then document it so the next person doesn’t “optimize” it.

Bridge network production checklist

  • Use user-defined bridge networks, not the default bridge, for real apps.
  • Pick subnets that do not overlap with VPN/datacenter ranges.
  • Publish only required ports; bind to specific host IPs where possible.
  • Decide how you’ll manage iptables/nftables (and test after OS upgrades).
  • Validate MTU end-to-end and set it intentionally.

Host network production checklist

  • Reserve host mode for workloads that justify it (packet rate, real network daemons, host agents).
  • Maintain a port ownership map per host (or enforce with automation).
  • Harden host firewall rules; don’t rely on container defaults.
  • Verify observability attribution: can you tie traffic to a container/cgroup?

Macvlan production checklist

  • Reserve a documented IP range; avoid mixing with DHCP unless you really know what you’re doing.
  • Confirm switch policy supports multiple MACs per port and won’t flap.
  • Plan host-to-container access (shim interface) if needed.
  • Monitor ARP/neighbor table behavior and rate limits.
  • Decide who owns DNS: you’ll want forward/reverse records if corporate tooling expects them.

Three corporate mini-stories (because you will repeat them otherwise)

Mini-story 1: an incident caused by a wrong assumption (macvlan host reachability)

A mid-sized company was containerizing a legacy reporting service that needed to be reachable by a set of upstream batch jobs. They chose macvlan because every upstream job had a hardcoded allowlist of destination IPs, and rewriting the policy would have taken political capital they didn’t have.

Staging looked good. From other machines on the subnet, they could hit the container IPs directly. The change went to production on a Friday afternoon, because of course it did, and immediately their on-host health checks started failing. The orchestrator marked instances unhealthy and restarted them. Now the service was flapping, which made upstream jobs fail more often, which triggered more retries, which made everything louder.

The wrong assumption: “If the LAN can reach it, the host can reach it.” With macvlan, the host-to-child path is a known exception on many setups. Their health checks ran on the host network namespace and couldn’t hit the macvlan container IP. The service was fine; the host just couldn’t see it.

The fix was mundane: create a macvlan shim interface on each host, give it an IP in the macvlan subnet, and update the health checks to use that path. They also documented the constraint so nobody “simplified” it away later.

Mini-story 2: an optimization that backfired (host networking to avoid NAT)

A different org ran a high-throughput metrics ingestion pipeline. Someone noticed conntrack counters rising under load and decided bridge NAT was “wasting CPU.” The proposed fix was simple: move the ingest containers to --network host so packets avoid NAT and conntrack entirely.

In a narrow benchmark, it worked. CPU dropped, latency improved, and everyone got to feel like they had discovered a cheat code. The change rolled out gradually across a fleet.

Then weirdness started: a subset of hosts had intermittent ingest failures after deploys. Not all hosts, not all the time. The root cause was port collisions between the ingest service and a separate debugging sidecar that also bound a UDP port. In bridge mode, they could coexist via separate namespaces. In host mode, the second service simply couldn’t bind. Sometimes deploy ordering masked it; sometimes it exploded.

They reverted host networking for the ingest tier and instead tuned conntrack capacity and reduced connection churn with batching. The actual bottleneck wasn’t “NAT is slow.” It was “we created too many short-lived flows.” Host mode treated the symptom and introduced a new failure class that was harder to reason about.

Afterward, they added a rule: any change that increases the blast radius (like host networking) needs a written threat model and a rollback plan. That policy was less exciting than the benchmark graph, but it worked.

Mini-story 3: boring but correct practice that saved the day (user-defined bridge + explicit publishing)

A financial services team ran multiple customer-facing services on shared hosts. They standardized on user-defined bridge networks per application stack and published ports explicitly, binding to specific host interfaces. It wasn’t fashionable. It was also resilient.

One night, a vendor image update introduced a new debug listener inside the container. Nothing malicious; just one of those “oops, left it on” defaults. The service itself still ran normally.

Because the team used explicit port publishing, the new internal port stayed internal. It didn’t suddenly show up on the host, and it didn’t become reachable from the LAN. Monitoring didn’t scream, security didn’t panic, and the incident never became a headline.

Their postmortem was short: “We didn’t have to care because we didn’t expose it.” That’s the kind of boring win you only notice when you compare it to the alternate timeline.

FAQ

1) Is bridge networking always slower than host networking?

No. Bridge mode adds overhead (veth, bridge lookup, often NAT/conntrack), but for many workloads it’s not your bottleneck. Measure before you “fix” it. If you’re not saturating CPU on softirq/conntrack, bridge is usually fine.

2) Why should I prefer a user-defined bridge over the default bridge?

User-defined bridges provide better DNS/service discovery behavior, clearer separation, and more predictable multi-network setups. The default bridge is legacy-friendly, not production-friendly.

3) When is --network host a good idea?

When the workload truly needs direct host networking semantics: high packet rates where NAT overhead matters, network daemons, or host agents. Also when the host is effectively single-purpose. Otherwise, host mode increases blast radius and debugging complexity.

4) Why can’t the host reach macvlan containers by default?

Because macvlan isolates traffic between the parent interface and macvlan endpoints in a way that prevents the host’s stack from talking directly to its children on that same interface. The common workaround is a macvlan shim interface on the host.

5) Can I run macvlan on Wi‑Fi?

Sometimes, but it’s often painful. Many Wi‑Fi drivers and access points don’t handle multiple source MAC addresses per station the way you want. If you must try, do it in a lab first and prepare for disappointment.

6) Should I use macvlan just to avoid port conflicts?

Usually no. Port conflicts are better solved with bridge networking plus port publishing, or by placing a reverse proxy/ingress in front. Macvlan trades port conflicts for IPAM and L2 complexity.

7) How do I stop Docker from messing with my firewall?

You can constrain Docker’s behavior, but you can’t pretend it doesn’t need rules if you want NAT/published ports. The practical approach is to decide whether the host firewall is “Docker-aware” and test rule ordering after upgrades. If your environment forbids dynamic firewall changes, redesign around routed networking or upstream load balancers.

8) What’s the safest way to expose services from bridge mode?

Publish only required ports, bind to a specific host IP when appropriate (for example, internal interface only), and enforce host firewall rules. Treat published ports as external API surface.

9) How do I handle logging and client IP visibility with bridge NAT?

Depending on the path, a NAT or proxy hop can hide the original client IP. Use reverse proxies that pass X-Forwarded-For or the PROXY protocol where applicable, or avoid NAT for that hop (macvlan/host or a routed design) if the true client IP is mandatory.

Next steps you can do this week

  • Inventory your hosts: list containers and note which ones use host networking and why. If “because it worked” is the reason, that’s not a reason.
  • Migrate one stack to a user-defined bridge if you’re still using the default bridge. Validate name resolution and port publishing discipline.
  • Pick a non-overlapping subnet plan for Docker bridges across environments (dev/stage/prod). Overlaps are a slow leak that becomes a flood.
  • Decide your macvlan policy: either “allowed with switch validation and IP range reservation” or “not allowed.” Ambiguity is how macvlan shows up in production without adults in the room.
  • Write a one-page runbook using the Fast diagnosis playbook above and include the exact commands your on-call can run without thinking.

If you want a rule that holds up under stress: use bridge until you can name the specific bridge limitation you’re hitting. Then choose host or macvlan as a deliberate exception, not a vibe.
