You’ve got two networks that both think they own 10.0.0.0/8. Or worse, both are 192.168.1.0/24 because someone, somewhere, printed “192.168.1.1” on a sticker and called it architecture.
Now you need a site-to-site VPN between them. Routing alone can’t save you—because routing assumes IP addresses are meaningful and unique. Yours aren’t.
NAT over VPN is the crowbar you use to pry these networks apart and make them talk. It works. It’s also easy to do “mostly right” while quietly breaking DNS, logging, identity, and any application that thinks IPs are stable. This is the guide I wish more people read before shipping a tunnel to production on a Friday.
When NAT-over-VPN is the right tool (and when it’s not)
NAT over VPN exists for one core reason: overlapping address space. If Site A and Site B both have 10.0.0.0/16, you cannot route between them because “10.0.5.10” doesn’t mean one host—it means two. Routing needs uniqueness. NAT creates it by rewriting addresses as traffic crosses the boundary.
Use NAT over VPN when
- You can’t renumber one side in time (acquisitions, vendor networks, “temporary” labs that became production).
- You need a narrow integration: a few services across the tunnel, not full mesh connectivity.
- You need strong isolation: you want explicit translation boundaries and controlled reachability.
- You’re integrating cloud VPCs that were created with the same default CIDRs by different teams.
Don’t use NAT over VPN when
- You can renumber with a sane blast radius. Renumbering hurts once. NAT hurts forever.
- You need end-to-end identity based on IP addresses (some legacy ACL schemes, brittle licensing, geo rules). NAT will turn “who” into “some gateway.”
- You need inbound connections from both sides to arbitrary hosts. NAT can do it, but the operational load increases fast.
- You require perfect transparency (e.g., routing protocols, certain security tools, some protocols embedding IPs). NAT breaks “transparent” by design.
NAT is not evil. It’s a trade. Take it when you must, and then engineer it like it’s going to be audited by an irritated future version of you.
A mental model that won’t lie to you at 2 a.m.
You are building a translation boundary across a tunnel. The tunnel is just a transport—WireGuard, IPsec, OpenVPN, GRE over IPsec—it doesn’t matter. The important bit is this:
addresses used inside each LAN do not have to be globally unique, but addresses used across the boundary must be unique at that boundary.
Think in three address spaces:
- Local-real: the actual addresses on Site A (e.g., 10.0.0.0/16).
- Remote-real: the actual addresses on Site B (also 10.0.0.0/16, because life is pain).
- Translated (virtual): the addresses you pretend the other side has when viewed across the VPN (e.g., “Site B becomes 172.20.0.0/16 when seen from Site A”).
You then decide directionality:
- SNAT (source NAT): “When my hosts talk over the tunnel, rewrite their source address.”
- DNAT (destination NAT): “When traffic arrives over the tunnel to a translated address, rewrite it to the real destination.”
- Both, often: SNAT one direction, DNAT the other, so each side sees the other as a unique virtual CIDR.
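To make the directionality concrete, here is a minimal iptables sketch for the Site A gateway. It assumes Site A’s LAN is 10.0.1.0/24, the tunnel interface is wg0, Site A is presented to B as 172.21.0.0/16, and B is reachable from A as 172.22.0.0/16; the interfaces, CIDRs, and the published host are illustrative, not a drop-in config.

# SNAT: A's hosts heading to B's virtual range leave as the gateway's translated IP
iptables -t nat -A POSTROUTING -s 10.0.1.0/24 -d 172.22.0.0/16 -o wg0 \
  -j SNAT --to-source 172.21.0.1

# DNAT: traffic arriving over the tunnel for one published virtual address
# is rewritten to the real host behind this gateway
iptables -t nat -A PREROUTING -i wg0 -d 172.21.1.50 \
  -j DNAT --to-destination 10.0.1.50

Deterministic 1:1 mapping of whole ranges (NETMAP) is sketched later under the task examples.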
The “NAT device” is typically the VPN gateway. That’s good: central control, fewer moving parts, easier debugging. But it also means that gateway becomes your single point of translation truth—and therefore your single point of failure if you misconfigure conntrack, routing, or firewall state.
Joke #1: NAT is like duct tape. If you use it carefully, you get home; if you use it everywhere, you eventually become the duct tape.
Design patterns: what to NAT, where to NAT, and why
Pattern A: “One side translates the other” (unidirectional translation)
Site A can reach Site B by translating Site B into a non-overlapping range as seen from A. Site B may not need to initiate connections back, or it can initiate using separate rules.
Good for: “A talks to vendor,” “A pulls metrics from B,” “one-way API calls.”
Risk: asymmetry. You’ll forget it’s asymmetric until an incident requires reverse connectivity (remote admin, callbacks, mutual TLS with IP-based allowlists, etc.).
Pattern B: “Both sides translate each other” (bidirectional virtual CIDRs)
Each side gets a virtual view of the other side. Example:
- Site A real: 10.0.0.0/16; Site A virtual (as seen by B): 172.21.0.0/16
- Site B real: 10.0.0.0/16; Site B virtual (as seen by A): 172.22.0.0/16
That means:
- From A, you reach B hosts via 172.22.x.y.
- From B, you reach A hosts via 172.21.x.y.
With 1:1 netmapping the host bits carry over, so B’s real 10.0.5.10 appears from A as 172.22.5.10.
Good for: two-way integrations, admin access, service meshes that aren’t aware they’re crossing org boundaries.
Risk: more NAT rules, more complexity, more “wait, what address did you test?” moments.
Pattern C: NAT only for “shared collision zones”
Sometimes only part of the CIDR overlaps. Example: both sites use 10.10.0.0/16, but Site A also has 10.20.0.0/16 that doesn’t overlap.
You can route the unique portion directly and NAT the overlapping slice.
Good for: reducing translation scope and preserving true source IPs where possible.
Risk: operational nuance. Troubleshooting becomes “this service is routed, that service is NATed,” which is a fancy way of saying it’s a trap for new on-call.
Where to NAT: gateway vs. host vs. dedicated middlebox
- NAT on the VPN gateway: best default. One place to manage state, firewall, and logging.
- NAT on individual hosts: avoid unless you love snowflakes. It breaks uniform policy and turns migrations into art projects.
- Dedicated NAT middlebox: good when VPN terminates on managed hardware you can’t customize, or when you need HA pairs and clean separation.
Pick your translated CIDRs like you pick passwords: not obvious
Don’t translate one side into 192.168.0.0/16 unless you enjoy collisions with home networks, coffee-shop VPNs, and that one exec who insists on tethering during outages.
Pick something like 172.20.0.0/14 or a carved chunk of 100.64.0.0/10 (CGNAT space) if your environment tolerates it. Be consistent and document it.
State, conntrack, and why “it pings” isn’t a design review
Most NAT implementations rely on connection tracking. That means:
- Return traffic must traverse the same gateway (symmetry).
- Failover without state sync can break live flows.
- High connection churn can exhaust conntrack tables.
If your app uses long-lived connections (databases, message brokers), plan for that. If it uses short bursts (HTTP with no keepalive, certain RPC patterns), plan harder.
What breaks when you do it wrong
NAT over VPN fails in predictable, boring ways. The problem is you usually discover them in the least boring moment possible.
1) Routing loops and black holes
If you translate traffic into a CIDR that either side already routes somewhere else, you create a loop or a sinkhole. Your monitoring might show “VPN up” while packets do interpretive dance between routers.
2) Asymmetric return paths (the silent killer)
NAT is stateful. If outbound goes through Gateway A but return comes back via Gateway B, Gateway B has no conntrack entry, so it drops or mis-NATs. Symptoms look like “works one way” or “SYN-SYN/ACK-then-nothing.”
3) DNS and name-to-address confusion
If Site A resolves db.siteb.internal to the real address (10.0.5.10) but must reach it via the translated address (172.22.5.10), your apps fail even though the network path is fine.
Fixing the network isn’t enough; you must fix naming.
4) IP-based allowlists and identity collapse
When you SNAT, the remote side may see all traffic as coming from the gateway’s translated IP. Your “allowlist this subnet” policy becomes “allow the gateway.”
That’s not inherently wrong—but it changes your threat model and your audit story.
5) Protocols that embed IP addresses
Some protocols carry IP literals inside payloads. Classic examples include certain FTP modes, SIP/VoIP, some VPN-in-VPN weirdness, legacy licensing checks, and apps that self-advertise endpoints.
NAT rewrites headers, not payloads, unless you add an application-layer gateway (which you probably shouldn’t).
6) Logging and forensics become less truthful
NAT changes source addresses. Unless you log both pre- and post-NAT tuples, your incident response gets worse.
NAT is not a reason to stop caring about attribution; it’s a reason to log like an adult.
7) MTU and fragmentation get weird
VPN encapsulation reduces effective MTU. NAT doesn’t cause that, but NAT-over-VPN deployments often coincide with “we added a tunnel and now some HTTPS calls hang.”
You’ll see PMTUD issues, blackholed ICMP, and “small packets work.”
Joke #2: The VPN was “up” the whole time. So was the Titanic.
Interesting facts and historical context
- NAT wasn’t part of the original Internet plan. It became mainstream in the mid-1990s as IPv4 address exhaustion got real and organizations wanted private networks.
- RFC 1918 (private IPv4 ranges) is from 1996. It formalized 10/8, 172.16/12, and 192.168/16, enabling the “everyone uses 10.0.0.0/8” era.
- IPsec originally didn’t love NAT. AH authenticated the very IP header that NAT rewrites, and ESP has no ports for a NAT device to track, so NAT traversal (NAT-T, UDP encapsulation) emerged to cope.
- Carrier-grade NAT (CGNAT) normalized large-scale translation. That’s the 100.64.0.0/10 space—built because even NAT at the edge wasn’t enough.
- NAT breaks the strict end-to-end principle. That design principle predates modern “zero trust,” but the tension is still visible: translation adds middle state.
- Linux conntrack tables have been a production limit for decades. High connection rates can exhaust tracking state long before CPU maxes out, leading to “random” drops.
- Many enterprises accidentally standardized on the same subnets. Default VPC/VNet templates and “copy the last site’s VLAN plan” created overlap as a normal business outcome.
- Policy-based routing predates many cloud VPN services. Old-school network engineers used it to steer traffic through NAT and tunnels long before “Transit Gateway” was a product category.
One quote worth keeping on a sticky note:
Everything fails, all the time.
— Werner Vogels
Three corporate mini-stories from the NAT mines
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company acquired a smaller one. Both sides used 10.0.0.0/16 internally, because of course they did. The integration plan was “quick IPsec tunnel, NAT the acquired side into 172.20.0.0/16, done.”
It worked in the lab. It even worked for a week in production.
Then payroll processing failed. Not fully—just enough to be an incident with executive attention. The SRE on call saw successful pings and working TCP connects from the app servers to the payroll API. Yet the API returned 403s.
That’s the kind of problem that makes people blame the VPN, the firewall, or the moon.
Root cause: the payroll API had an IP-based allowlist. During testing, the source IPs were individual app hosts. In production, a new “cleanup” rule SNATed all outbound traffic to a single translated gateway IP to simplify policy.
The allowlist didn’t include the gateway IP. Nobody thought NAT would change identity, because “we’re just making routing work.”
Fix: update allowlists, add per-subnet SNAT pools so different app tiers preserved coarse identity, and—most importantly—write down that NAT changes attribution and must be reviewed with security controls.
The tunnel was fine. The assumption was not.
Mini-story 2: The optimization that backfired
Another org had a stable NAT-over-VPN setup, but the gateway CPU was higher than expected during peak hours. A network engineer decided to “optimize” by turning off connection tracking where possible and using stateless rules plus routing tweaks.
For a few protocols, it looked like free performance.
Two weeks later, random customer calls started failing—intermittently. Some requests succeeded; others stalled. The incident timeline was a mess because the VPN stayed up and packet loss wasn’t obvious.
It manifested mostly as timeouts at the application layer, which meant retries, which meant more load, which meant more timeouts. Classic.
The actual failure was subtle: some flows took a slightly different return path after a routing change on the remote side. With conntrack disabled for that traffic class, NAT mappings weren’t consistent for long-lived sessions, and the firewall state no longer matched what the application expected.
The system drifted into a half-broken state that was hard to reproduce on demand.
Fix: put conntrack back, make routing symmetric with explicit policy routing, and scale the gateway properly (bigger instance, better NIC offloads, and sensible conntrack sizing).
The “optimization” saved CPU and spent reliability. That’s not a bargain.
Mini-story 3: The boring but correct practice that saved the day
A global company ran NAT-over-VPN between a factory network and a central ERP environment. Nothing fancy: a couple of translated CIDRs, tight firewall rules, and a change control process that demanded three things: a diagram, a test plan, and a rollback plan.
One night, a carrier event caused flapping and a failover to a backup VPN headend. The backup was correctly configured, but it didn’t sync conntrack state (because most setups don’t).
Several long-lived sessions dropped, alarms fired, and on-call started to sweat.
Here’s the boring part: their runbook included a known-good “session drain” procedure. They temporarily reduced keepalive intervals on the app tier, forced reconnects during a controlled window, and validated recovery using a checklist of synthetic transactions.
It wasn’t elegant. It was predictable.
The result was a contained incident, minimal data inconsistency, and a postmortem that didn’t include the phrase “we weren’t sure what the NAT rules did.”
Boring wins. Boring scales.
Practical tasks: commands, expected output, decisions
The point of NAT-over-VPN debugging is to stop guessing. Below are tasks you can run on Linux gateways and test hosts. Each includes the command, what the output means, and what decision to make next.
Task 1: Confirm the overlap and find the collision
cr0x@server:~$ ip -brief addr
lo UNKNOWN 127.0.0.1/8 ::1/128
eth0 UP 10.0.1.1/24
wg0 UP 10.200.0.1/24
Meaning: This gateway is on 10.0.1.0/24 locally and has a tunnel interface wg0.
If the remote side is also 10.0.1.0/24 (or anything overlapping), pure routing will collide.
Decision: Choose a translated CIDR that does not exist on either side (and won’t later).
Task 2: Inspect routing table for ambiguous routes
cr0x@server:~$ ip route show
default via 10.0.1.254 dev eth0
10.0.1.0/24 dev eth0 proto kernel scope link src 10.0.1.1
10.200.0.0/24 dev wg0 proto kernel scope link src 10.200.0.1
Meaning: There is no explicit route for the translated remote CIDR yet.
Decision: Add a route for the remote translated CIDR via the tunnel interface (or via a policy routing table if needed).
Task 3: Verify the VPN handshake is actually established (WireGuard example)
cr0x@server:~$ sudo wg show
interface: wg0
public key: 3xQk...redacted
listening port: 51820
peer: c0z9...redacted
endpoint: 198.51.100.20:51820
allowed ips: 10.200.0.2/32
latest handshake: 31 seconds ago
transfer: 48.22 MiB received, 51.10 MiB sent
Meaning: The tunnel is up and passing traffic.
Decision: If handshake is stale, stop here and fix connectivity/keys/UDP reachability before touching NAT.
Task 4: Confirm IP forwarding is enabled on the gateway
cr0x@server:~$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
Meaning: The kernel will forward packets between interfaces.
Decision: If 0, enable it and persist via /etc/sysctl.d/. Without this, you’ll blame NAT for a routing problem.
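A minimal sketch of doing that (the file name under /etc/sysctl.d/ is just a convention):

# Enable immediately
sudo sysctl -w net.ipv4.ip_forward=1

# Persist across reboots
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-vpn-nat.conf
sudo sysctl --system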
Task 5: Check whether rp_filter will drop your asymmetrically routed packets
cr0x@server:~$ sysctl net.ipv4.conf.all.rp_filter
net.ipv4.conf.all.rp_filter = 1
Meaning: Strict reverse-path filtering is enabled globally. With policy routing or multiple uplinks, rp_filter can drop valid traffic.
Decision: For VPN NAT gateways, consider setting to 2 (loose) on relevant interfaces if you use asymmetric routing by design.
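A hedged sketch of relaxing it only where needed; the kernel applies the higher of the “all” and per-interface values, so loosening the tunnel interface is usually enough:

# Loose reverse-path filtering on the tunnel interface only
sudo sysctl -w net.ipv4.conf.wg0.rp_filter=2

# Persist alongside your other VPN sysctls
echo 'net.ipv4.conf.wg0.rp_filter = 2' | sudo tee /etc/sysctl.d/99-vpn-nat-rpfilter.conf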
Task 6: Observe packets pre-NAT and post-NAT with tcpdump
cr0x@server:~$ sudo tcpdump -ni eth0 host 10.0.1.50 and port 443 -c 3
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:00:01.100000 IP 10.0.1.50.51544 > 172.22.5.10.443: Flags [S], seq 1000, win 64240, options [mss 1460], length 0
cr0x@server:~$ sudo tcpdump -ni wg0 host 172.22.5.10 and port 443 -c 3
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wg0, link-type RAW (Raw IP), snapshot length 262144 bytes
12:00:01.100500 IP 172.21.1.50.51544 > 172.22.5.10.443: Flags [S], seq 1000, win 64240, options [mss 1360], length 0
Meaning: On the LAN, the source is 10.0.1.50; over the tunnel, the source became 172.21.1.50 (SNAT applied). The MSS in the forwarded SYN also dropped from 1460 to 1360; that is MSS clamping for the tunnel’s smaller MTU, not NAT itself.
Decision: If you don’t see the translated source on the tunnel, your NAT rule isn’t matching or is in the wrong chain/hook.
Task 7: Inspect NAT rules (iptables legacy example)
cr0x@server:~$ sudo iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-A POSTROUTING -s 10.0.1.0/24 -d 172.22.0.0/16 -o wg0 -j SNAT --to-source 172.21.0.1
-A PREROUTING -i wg0 -d 172.21.0.0/16 -j DNAT --to-destination 10.0.1.0/24
Meaning: There’s SNAT for traffic leaving to the remote translated CIDR. There is also a DNAT rule that looks wrong: DNAT to an entire CIDR is not valid in that form and suggests confusion.
Decision: Use deterministic 1:1 mapping rules (NETMAP) or explicit DNAT per host/service, not “DNAT to a network” unless you know exactly what your tooling supports.
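A sketch of the deterministic 1:1 version using the iptables NETMAP target, assuming Site A’s real LAN 10.0.1.0/24 is presented as 172.21.1.0/24 (prefix lengths must match; all CIDRs are illustrative):

# Outbound: rewrite A's real /24 into the matching virtual /24, host bits preserved
iptables -t nat -A POSTROUTING -o wg0 -s 10.0.1.0/24 -d 172.22.0.0/16 \
  -j NETMAP --to 172.21.1.0/24

# Inbound: map the virtual /24 back to the real /24
iptables -t nat -A PREROUTING -i wg0 -d 172.21.1.0/24 \
  -j NETMAP --to 10.0.1.0/24

With rules like these, 10.0.1.50 is always seen remotely as 172.21.1.50, which matches the capture in Task 6.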
Task 8: Inspect NAT rules (nftables modern example)
cr0x@server:~$ sudo nft list ruleset
table ip nat {
chain prerouting {
type nat hook prerouting priority dstnat; policy accept;
iifname "wg0" ip daddr 172.21.0.0/16 dnat to 10.0.1.0/24
}
chain postrouting {
type nat hook postrouting priority srcnat; policy accept;
oifname "wg0" ip saddr 10.0.1.0/24 ip daddr 172.22.0.0/16 snat to 172.21.0.1
}
}
Meaning: Similar issue: the DNAT attempt is conceptually incorrect. DNAT needs a specific destination (or a mapping construct), not “all of these go to that network.”
Decision: Redesign using 1:1 netmapping (NETMAP) or use a pair of non-overlapping virtual CIDRs with deterministic translation.
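A hedged nftables equivalent: explicit per-host maps work on any reasonably current nftables, and recent kernels/nftables also support prefix-to-prefix mappings (addresses are illustrative; verify prefix NAT support on your version before relying on it):

table ip nat {
  chain prerouting {
    type nat hook prerouting priority dstnat; policy accept;
    # Explicit 1:1 entries for a few published hosts
    iifname "wg0" dnat to ip daddr map { 172.21.1.50 : 10.0.1.50, 172.21.1.51 : 10.0.1.51 }
  }
  chain postrouting {
    type nat hook postrouting priority srcnat; policy accept;
    # Prefix-to-prefix netmap (requires prefix NAT mapping support)
    oifname "wg0" ip daddr 172.22.0.0/16 snat ip prefix to ip saddr map { 10.0.1.0/24 : 172.21.1.0/24 }
  }
}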
Task 9: Validate conntrack sees the NATed flow
cr0x@server:~$ sudo conntrack -L -p tcp --dport 443 | head -n 3
tcp 6 431999 ESTABLISHED src=10.0.1.50 dst=172.22.5.10 sport=51544 dport=443 src=172.22.5.10 dst=172.21.0.1 sport=443 dport=51544 [ASSURED] mark=0 use=1
Meaning: Conntrack shows the original tuple and the translated tuple. That’s good.
Decision: If the table is empty while traffic is flowing, conntrack might be bypassed or broken—expect intermittent failures under load.
Task 10: Check conntrack capacity before it becomes a “random drops” incident
cr0x@server:~$ sudo sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_count = 24891
net.netfilter.nf_conntrack_max = 262144
Meaning: You’re at ~9.5% utilization. Fine.
Decision: If count approaches max during peak, increase max (with memory awareness) and tune timeouts for your traffic profile.
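A hedged tuning sketch; the numbers are examples, so size them against gateway memory and your measured peak:

# Raise the conntrack ceiling (each entry consumes kernel memory)
sudo sysctl -w net.netfilter.nf_conntrack_max=524288

# Don't track idle established TCP flows for the default five days
sudo sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=86400

# Persist both in /etc/sysctl.d/ and alert well before count approaches max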
Task 11: Detect MTU issues with DF ping (classic VPN pain)
cr0x@server:~$ ping -M do -s 1472 172.22.5.10 -c 2
PING 172.22.5.10 (172.22.5.10) 1472(1500) bytes of data.
ping: local error: message too long, mtu=1420
ping: local error: message too long, mtu=1420
--- 172.22.5.10 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1030ms
Meaning: Effective MTU is 1420; trying to send 1500-byte frames with DF fails.
Decision: Clamp MSS on the tunnel/firewall or set interface MTU appropriately. Also ensure ICMP “frag needed” isn’t blocked.
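A hedged example of the usual fixes; interface names and values are illustrative, and many WireGuard setups already default to an MTU of 1420:

# Set the tunnel MTU explicitly
sudo ip link set dev wg0 mtu 1420

# Clamp TCP MSS to the path MTU for flows crossing the tunnel
sudo iptables -t mangle -A FORWARD -o wg0 -p tcp --tcp-flags SYN,RST SYN \
  -j TCPMSS --clamp-mss-to-pmtu

# Repeat with -i wg0 if you need clamping in the inbound direction too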
Task 12: Confirm the remote sees the right source (application-impact check)
cr0x@server:~$ curl -sS --connect-timeout 3 https://172.22.5.10/whoami
{"remote_addr":"172.21.0.1","x_forwarded_for":null}
Meaning: The remote app sees the gateway’s SNAT address, not the original host.
Decision: If the remote app needs per-client identity, don’t SNAT everything to one IP. Use 1:1 mapping or a SNAT pool and update allowlists accordingly.
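If coarse, per-subnet identity is enough, a SNAT pool keeps clients distinguishable without full 1:1 mapping (the range is illustrative; true per-host identity needs the NETMAP approach from Task 7):

# Spread app-tier clients across a small translated pool instead of one IP;
# --persistent keeps a given client on the same pool address
iptables -t nat -A POSTROUTING -o wg0 -s 10.0.1.0/24 -d 172.22.0.0/16 \
  -j SNAT --to-source 172.21.0.10-172.21.0.50 --persistent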
Task 13: Validate policy routing when you must keep NAT traffic symmetric
cr0x@server:~$ ip rule show
0: from all lookup local
100: from 10.0.1.0/24 to 172.22.0.0/16 lookup 100
32766: from all lookup main
32767: from all lookup default
cr0x@server:~$ ip route show table 100
172.22.0.0/16 dev wg0 scope link
Meaning: Traffic from the local subnet to the remote translated CIDR is forced into table 100, which routes it over the tunnel.
Decision: If you have multiple WANs or multiple tunnels, this is how you keep return paths consistent and avoid NAT state loss.
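A sketch of creating that rule and table (table 100 and the priority are arbitrary; persist them through your distro’s network configuration so they survive reboots):

# Dedicated table: the translated remote CIDR goes via the tunnel
sudo ip route add 172.22.0.0/16 dev wg0 table 100

# Send matching traffic to that table before the main table is consulted
sudo ip rule add from 10.0.1.0/24 to 172.22.0.0/16 lookup 100 priority 100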
Task 14: Check firewall counters to prove what’s dropping
cr0x@server:~$ sudo iptables -vnL FORWARD
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
812K 92M ACCEPT all -- eth0 wg0 10.0.1.0/24 172.22.0.0/16
221K 21M ACCEPT all -- wg0 eth0 172.22.0.0/16 10.0.1.0/24
512 42K DROP all -- * * 0.0.0.0/0 0.0.0.0/0
Meaning: Forwarding is allowed both directions for those CIDRs, and counters increment. There is also some default drop.
Decision: If counters on ACCEPT rules stay at zero while you test, your traffic isn’t reaching this box or isn’t matching the CIDRs you think it is (translation mismatch, wrong interface, wrong route).
Task 15: Confirm the remote has a route back to your translated space
cr0x@server:~$ ip route get 172.21.1.50
172.21.1.50 dev wg0 src 10.200.0.2 uid 0
cache
Meaning: The remote knows to return to 172.21.0.0/16 via the tunnel interface.
Decision: If the remote tries to send 172.21.0.0/16 to its default gateway instead, you’ll get one-way traffic and blame NAT. Add the route.
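A hedged fix on the remote gateway; note that with WireGuard the peer’s AllowedIPs must also cover the translated ranges, or packets are dropped before routing even matters (the public key below is a placeholder):

# Return path for Site A's translated range
sudo ip route add 172.21.0.0/16 dev wg0

# WireGuard only: the peer entry must list the translated CIDR as well
# ("wg set ... allowed-ips" replaces the list, so keep the tunnel /32 in it)
sudo wg set wg0 peer <peer-public-key> allowed-ips 10.200.0.1/32,172.21.0.0/16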
Fast diagnosis playbook
This is the order that gets you to the bottleneck fastest, without performing interpretive NAT on a live incident bridge.
First: prove the tunnel and routing are real
- VPN health: handshake/SA established, bytes incrementing. If not, stop.
- Routes: each side has a route to the translated CIDR via the tunnel. No route, no return path, no joy.
- Forwarding and firewall: ip_forward=1; forwarding chain permits traffic; counters move.
Second: prove translation is actually happening
- tcpdump on both interfaces: see original tuple on LAN and translated tuple on tunnel.
- conntrack entry exists: confirm NAT state for the flow.
- Remote sees expected source: validate with a simple endpoint or server logs.
Third: find the “works for ping but not for apps” issues
- MTU/MSS: do DF pings; watch for stalls on TLS/HTTP POST.
- DNS: confirm the names resolve to the translated addresses (or provide split-horizon).
- Allowlist/auth: check if remote side expects real client IPs; NAT may collapse them.
Fourth: look for scale and aging problems
- conntrack capacity and timeouts under load.
- CPU softirq / NIC drops on gateways (interrupt saturation looks like packet loss).
- HA failover behavior: state loss and asymmetric paths after failover.
Common mistakes: symptoms → root cause → fix
1) “I can ping, but TCP times out”
Symptoms: ICMP works, small requests work, uploads or TLS handshakes hang.
Root cause: MTU/PMTUD issues due to encapsulation; ICMP “frag needed” blocked; missing MSS clamping.
Fix: clamp MSS on traffic over the tunnel, set correct MTU on tunnel interface, allow necessary ICMP.
2) “Works from gateway, not from hosts behind it”
Symptoms: Gateway can reach remote service, internal hosts cannot.
Root cause: missing FORWARD rules, missing SNAT for forwarded traffic, or missing return route for the translated client subnet.
Fix: enable forwarding, add forward firewall rules, implement SNAT, and add remote route back to translated client range.
3) “One direction works; reverse direction fails”
Symptoms: A→B works, B→A fails (or vice versa). Or SYNs go out, SYN/ACK never completes.
Root cause: asymmetric routing across HA gateways or multiple WANs; conntrack state exists on one node only.
Fix: enforce symmetric routing with policy routing; use active/standby; consider conntrack state sync if you truly need active/active.
4) “Remote app returns 403/unauthorized after NAT change”
Symptoms: Network connectivity is fine; application denies access.
Root cause: IP-based allowlists now see the NAT gateway IP, not the original clients.
Fix: update allowlists; use 1:1 translation or SNAT pools; move identity enforcement up the stack where possible.
5) “Some hosts reachable, others not”
Symptoms: A subset of remote hosts works; others are black holes.
Root cause: partial overlap; incorrect or incomplete netmapping; routes exist for some translated ranges but not others.
Fix: map deterministically and cover the entire intended range; audit routes on both sides; avoid mixing routed and NATed segments without strong documentation.
6) “After a failover, everything breaks until we restart stuff”
Symptoms: HA event causes widespread timeouts; eventually recovers after sessions expire or services restart.
Root cause: conntrack/NAT state lost during failover; long-lived sessions don’t reestablish cleanly.
Fix: prefer active/standby with stable routing; shorten keepalives; accept that some sessions will drop and design for retries; add controlled drain procedures.
7) “DNS resolves, but connections go to the wrong place”
Symptoms: Client resolves remote name, but reaches local host with same IP or wrong service.
Root cause: DNS returns real (overlapping) address; local routing prefers local subnet.
Fix: split-horizon DNS or conditional forwarding returning translated addresses; do not leak real overlapping IPs across the boundary.
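A minimal split-horizon sketch, assuming Site A’s internal resolver runs dnsmasq; the resolver choice, file path, host names, and addresses are illustrative, and the same idea works with conditional forwarding or response-policy zones elsewhere:

# On Site A's resolver: answer Site B names with translated addresses only,
# never with B's real, overlapping 10.x addresses
sudo tee /etc/dnsmasq.d/siteb-translated.conf >/dev/null <<'EOF'
address=/db.siteb.internal/172.22.5.10
address=/api.siteb.internal/172.22.5.20
EOF
sudo systemctl restart dnsmasq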
8) “It used to work; now it drops under load”
Symptoms: intermittent failures during peak; logs show conntrack insert failures or drops.
Root cause: conntrack table exhaustion; insufficient NAT state capacity; CPU softirq saturation.
Fix: increase nf_conntrack_max, tune timeouts, scale gateway resources, and measure packet drops at NIC and kernel layers.
Checklists / step-by-step plan
Step-by-step: a sane NAT-over-VPN deployment
- Inventory the address space on both sides: actual CIDRs, DHCP ranges, static blocks, and “mystery” lab networks.
- Pick translated CIDRs that will not collide with either side, with room for growth. Write them down in a shared place.
- Decide translation direction: unidirectional or bidirectional. Default to bidirectional only if you truly need it.
- Decide translation granularity:
- 1:1 netmapping for preserving per-host identity
- many-to-one SNAT for simplicity (accept identity loss)
- Implement routes to translated CIDRs on both sides (static, or via your routing system if appropriate).
- Implement NAT rules with explicit matching: source CIDR, destination CIDR, ingress/egress interfaces.
- Implement firewall rules as if NAT didn’t exist: permit only required ports/services across translated CIDRs.
- Fix DNS: split-horizon, conditional forwarding, or explicit host overrides for translated names. If DNS is wrong, everything feels wrong.
- Handle MTU: set tunnel MTU and clamp MSS if needed. Test with DF pings and real application flows.
- Log translation: keep NAT/firewall logs (rate-limited) and maintain visibility into original vs translated tuples.
- Load test conntrack: confirm headroom under expected peak with burst factor.
- Write and drill a rollback: removing NAT is rarely instant; plan a “disable NAT, keep tunnel” and “disable tunnel” path.
Pre-change checklist (what to verify before you flip it on)
- Translated CIDRs do not appear in ip route on either side except the intended VPN routes.
- Conditional DNS answers return translated addresses to the opposite site.
- Firewall default policy is explicit and tested.
- MTU/MSS tested with real payload sizes (not just ping).
- Conntrack sizing verified; monitoring exists for nf_conntrack_count and drop counters.
Post-change checklist (what to check immediately after)
- tcpdump confirms pre/post NAT addresses on the correct interfaces.
- Remote logs show expected source addresses (and security controls still match).
- DNS resolution from both sides returns the correct view.
- Long-lived sessions (DB, brokers) sustain for at least an hour without drops.
- On-call has a one-page runbook with translated CIDRs and test commands.
FAQ
1) Can I avoid NAT by just adding more specific routes?
Not when the overlap is exact or ambiguous. If both sides have 10.0.1.0/24, a route doesn’t tell you which 10.0.1.50 you meant. You need uniqueness—either renumber or translate.
2) Is NAT over VPN the same as NAT-T?
No. NAT-T (NAT Traversal) is a technique to allow IPsec to work through NAT devices by encapsulating in UDP. NAT-over-VPN is you intentionally translating addresses across a tunnel to solve overlap.
3) Should I use SNAT, DNAT, or 1:1 mapping?
If you need remote side to distinguish hosts (allowlists, logging, per-host policies), use deterministic 1:1 mapping. If you just need “clients can reach service,” SNAT-to-one-IP is simpler but hides identity.
4) Where should the NAT live?
On the VPN gateway, unless you have a strong reason not to. Centralizing NAT reduces configuration drift and makes packet capture and firewall policy coherent.
5) How do I handle DNS with translated addresses?
Use split-horizon DNS: each site gets answers that make sense for its view. If Site A must reach Site B via 172.22.0.0/16, Site A’s DNS must return 172.22.x.y for Site B names.
6) Will NAT break mutual TLS or certificate validation?
Not directly—TLS doesn’t care about source IP. But if your authorization policies (outside TLS) use IP allowlists, NAT changes who the remote thinks you are.
7) Can I run active/active NAT gateways for high availability?
You can, but be careful. Without state synchronization and symmetric routing, you’ll break flows. Active/standby is operationally simpler for stateful NAT, and “simpler” is a feature.
8) Why does it work for some apps but not others?
Usually one of three reasons: MTU issues, DNS returning real (overlapping) IPs, or an application embedding IP addresses / expecting the client IP for authorization.
9) Is IPv6 the real fix?
IPv6 reduces address scarcity and overlap pressure, but it doesn’t magically fix existing IPv4-only devices, vendor appliances, or your current M&A timeline. It’s a strategy, not tonight’s mitigation.
10) What’s the best way to document NAT-over-VPN so people stop breaking it?
A diagram showing real and translated CIDRs, a table of NAT rules in plain language, and a small set of test cases with expected results. Also: put the translated ranges in IPAM if you have one.
Conclusion: next steps that actually reduce risk
NAT over VPN is a perfectly respectable solution to overlapping networks—if you treat it like a first-class design and not a temporary hack that magically stays temporary.
The work isn’t just NAT rules. It’s routing symmetry, DNS correctness, MTU handling, logging, and capacity planning for state.
Practical next steps:
- Write down the translated CIDRs and reserve them like real address space.
- Pick a translation model (1:1 vs many:1) based on identity needs, not convenience.
- Build a minimal test suite: ping with DF, TCP connect, a real HTTP request, and one long-lived session test.
- Add monitoring for conntrack utilization, packet drops, and VPN byte counters.
- Run a failover drill if you have HA. If you don’t, admit it and plan downtime expectations accordingly.
If you can renumber later, do it. But if you can’t, make NAT-over-VPN boring, predictable, and well-lit. Your future incidents will still happen—just with fewer mysteries and fewer people yelling at packet captures.