It’s the worst kind of outage: nothing is “down.” Ping works. DNS works. Your monitoring is mostly green. And yet a handful of websites hang forever, TLS handshakes stall, or a login page half-renders and then freezes like it’s thinking deeply about its life choices.
That’s when you should suspect MTU. Not because it’s fashionable, but because MTU failures create selective, maddening symptoms that look like browser bugs, SaaS flakiness, or “the internet is weird today.” They’re not. They’re physics plus misconfiguration plus someone blocking ICMP in 2011 and never admitting it.
The mental model: why MTU breaks “only some” traffic
MTU (Maximum Transmission Unit) is the largest IP packet size that can traverse a link without being fragmented. Ethernet’s default MTU is 1500 bytes. That number is old, practical, and still everywhere. The problem is that modern networks are a stack of tunnels, overlays, VPNs, and “helpful” middleboxes. Each layer steals bytes for its headers. If you don’t budget for that overhead, packets that used to fit stop fitting.
When a packet is too large for the next hop, there are only a few outcomes:
- Fragmentation happens (IPv4 only, and only if the sender left the DF bit unset): the router splits the packet. It often works, but it’s slower and fragile. Some devices mishandle fragments. Some security policies drop them.
- Packet is dropped and an ICMP message is sent back: “Fragmentation needed” (IPv4) or “Packet Too Big” (IPv6). This is the healthy path for Path MTU Discovery.
- Packet is dropped and no helpful ICMP returns: this is the infamous PMTUD black hole. TCP keeps retransmitting big packets that never make it. From the application’s perspective, things “hang.”
Why does it hit only some sites? Because not all flows generate packets the same size. A small HTTP response might fit. A larger TLS record, a cookie-heavy request, or a server that immediately sends a full-sized segment might exceed the real path MTU. Some CDNs and some sites are simply better at triggering your broken path than others.
One more important detail: TCP MSS (Maximum Segment Size) controls the TCP payload size, which is typically MTU minus IP+TCP headers. If you clamp MSS appropriately at a tunnel boundary, you can avoid generating too-large packets in the first place. It’s a band-aid with legitimate medical uses.
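That MTU-to-MSS budget is just subtraction. A quick shell sanity check (base header sizes only; TCP options such as timestamps shave a few more bytes off the payload):

```shell
# MSS = MTU minus IP header minus TCP header (no TCP options counted).
mtu=1500
ipv4_mss=$((mtu - 20 - 20))   # 20B IPv4 header + 20B TCP header
ipv6_mss=$((mtu - 40 - 20))   # 40B IPv6 header + 20B TCP header
echo "IPv4 MSS: $ipv4_mss"    # 1460, the value you see on most SYNs
echo "IPv6 MSS: $ipv6_mss"    # 1440
```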
Dry-funny truth: MTU bugs are the network equivalent of a door that opens inward when the fire code expects outward—you only learn during the panic.
What “some sites don’t load” usually means at the packet level
Typical failure sequence:
- Client resolves DNS and connects to server IP: OK.
- TCP three-way handshake (SYN, SYN/ACK, ACK): OK (small packets).
- TLS handshake begins: often OK at first.
- Server sends a larger packet (certificate chain, HTTP headers, or initial content): dropped if it exceeds path MTU and PMTUD is broken.
- Client waits, retries, browser spins, and someone blames the SaaS vendor.
Facts and history: how we got here
MTU problems feel modern because we see them most with VPNs and overlays, but the ingredients are ancient. A few concrete facts and historical points that matter operationally:
- 1500-byte Ethernet MTU became common because it balanced efficiency and hardware limits in early Ethernet designs; it’s effectively the “default contract” of a lot of equipment.
- PPPoE famously reduces MTU to 1492 due to its encapsulation overhead; this has been biting DSL and some fiber deployments for decades.
- Path MTU Discovery (PMTUD) relies on ICMP “too big” feedback; when networks started blocking ICMP broadly for “security,” PMTUD failures became routine.
- IPv6 forbids in-network fragmentation; only endpoints fragment. That makes ICMPv6 “Packet Too Big” not optional if you want a functioning internet.
- IPsec, GRE, VXLAN, and WireGuard add overhead; a safe MTU inside tunnels is often lower than people assume, especially with multiple encapsulation layers.
- Jumbo frames (9000 MTU) are great inside controlled domains, but become a liability when accidentally extended across boundaries that can’t carry them.
- TCP MSS clamping became a common workaround in firewalls and edge routers precisely because PMTUD was unreliable in real-world networks.
- “Don’t block ICMP” has been standard advice for years, yet many corporate networks still do it partially—often allowing echo but blocking “fragmentation needed,” which is the wrong half.
- Browser behavior amplifies pain: modern browsers open many connections, use HTTP/2 or HTTP/3, and can hide per-connection stalls behind “loading” spinners that look like app bugs.
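Those overhead figures stack. A sketch of the budget arithmetic, using typical per-layer overheads (the exact numbers depend on cipher, options, and address family, so treat them as assumptions to verify, not constants):

```shell
# Effective inner MTU after stacked encapsulation, with typical overheads.
# These overhead values are common defaults; measure your own stack.
underlay=1500
wg_overhead=60      # WireGuard over an IPv4 underlay (80 over IPv6)
vxlan_overhead=50   # outer IPv4 + UDP + VXLAN headers
inner=$((underlay - wg_overhead - vxlan_overhead))
echo "Safe inner MTU: $inner"   # 1390 under these assumptions
```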
One quote that belongs in every ops team’s brain: “Everything fails, all the time.” — Werner Vogels. It’s short, a little bleak, and operationally correct.
Fast diagnosis playbook (first/second/third)
If you have ten minutes and a user shouting “only some sites,” do this in order. Don’t freestyle. Freestyle is how you end up restarting the wrong router at 2 a.m.
First: verify it’s an MTU-shaped failure
- Reproduce from a host on the same network path as the user (same VPN, same VLAN, same Wi-Fi).
- Check whether small responses work but larger ones stall (a tell).
- Use a “do not fragment” ping test to find the largest working packet size.
Second: locate the boundary where MTU changes
- Look for tunnels: VPN, IPsec, GRE, WireGuard, cloud transit gateway, SD-WAN, overlay networks, WAN optimizers.
- Compare interface MTUs on both sides: client NIC, tunnel interface, firewall inside/outside, cloud ENI.
- Confirm ICMP “too big” is permitted end-to-end (or compensate with MSS clamping).
Third: apply the least-dangerous fix
- Prefer correcting MTU/PMTUD and allowing required ICMP.
- If politics or vendor gear blocks that, apply TCP MSS clamping at the tunnel edge as a pragmatic workaround.
- Document the chosen MTU and bake it into provisioning, not tribal memory.
Second short joke (and last one, per the laws of this document): The easiest way to find an MTU bug is to declare the incident “probably DNS.” It will immediately become MTU out of spite.
Symptom patterns that scream MTU
1) TLS handshake stalls or “hangs” after ClientHello/ServerHello
Small packets pass. Then a certificate chain or a larger handshake message triggers a too-big packet. If ICMP is blocked, neither endpoint learns the path MTU, so it keeps sending packets that never arrive.
2) HTTP headers arrive, but large downloads fail early
The first response fits. Larger segments don’t. Users report “the page loads but images don’t,” or “login works but the dashboard doesn’t.”
3) It works on mobile hotspot, fails on corporate Wi-Fi/VPN
Different path, different MTU, different ICMP policy. Mobile carriers often have their own MTU quirks too, but the path is different enough to avoid your black hole.
4) SSH connects, but scp stalls
Interactive keystrokes are tiny. File transfers fill TCP segments, exposing MTU issues quickly.
5) IPv4 works, IPv6 doesn’t (or vice versa)
IPv6 requires ICMPv6 PTB to work well. If a firewall blocks it, you get selective IPv6 misery. Conversely, some IPv4 paths “work” via fragmentation while IPv6 breaks hard.
6) Kubernetes/containers: pod-to-external is flaky, node-to-external is fine
Overlay networks (VXLAN, Geneve) reduce effective MTU inside the cluster. If the CNI MTU is wrong, pods generate packets that get dropped at egress.
Practical tasks: commands, outputs, decisions (12+)
These are the commands I actually run when I’m on-call and tired. Each task includes: command, sample output, what it means, and the decision you make.
Task 1: Check the local interface MTU
cr0x@server:~$ ip link show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
Meaning: The host believes eth0 MTU is 1500. That’s only the first hop, not the path.
Decision: If you expect a tunnel/overlay, check that interface too. Don’t assume 1500 is safe end-to-end.
Task 2: List tunnel interfaces and their MTU
cr0x@server:~$ ip -d link show | grep -E -A2 'wg0|tun0|gre|vxlan'
5: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/none
7: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
Meaning: WireGuard at 1420, VXLAN at 1450. Overhead is being accounted for, at least locally.
Decision: If users complain over VPN, 1420 may still be too high depending on underlay. Verify with DF pings.
Task 3: DF ping to find maximum working MTU (IPv4)
cr0x@server:~$ ping -M do -s 1472 -c 3 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 1472(1500) bytes of data.
ping: local error: message too long, mtu=1492
ping: local error: message too long, mtu=1492
ping: local error: message too long, mtu=1492
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss
Meaning: You tried to send a 1500-byte IP packet (1472 bytes of payload + 28 bytes of ICMP and IP headers). The “local error” means the kernel refused to transmit at all: the egress route or interface MTU is 1492 (often PPPoE), so the packet never left the host.
Decision: Drop payload until it passes. Then set tunnel MTU or MSS clamping accordingly.
Task 4: Binary search the working payload size
cr0x@server:~$ ping -M do -s 1464 -c 2 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 1464(1492) bytes of data.
1472 bytes from 1.1.1.1: icmp_seq=1 ttl=57 time=13.4 ms
1472 bytes from 1.1.1.1: icmp_seq=2 ttl=57 time=13.2 ms
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
Meaning: Path supports 1492-byte IP packets (payload 1464 + 28 header).
Decision: If you run a VPN over this link, subtract tunnel overhead from 1492, not 1500.
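Task 4’s manual narrowing can be automated. A sketch (the probe wrapper and the search bounds are mine, not from any standard tool; 28 bytes is the IPv4 + ICMP header overhead):

```shell
# Binary-search the largest payload that passes a DF ping, then add the
# 28 bytes of IPv4 + ICMP headers to get the path MTU.
probe() { ping -M do -s "$1" -c 1 -W 2 "$TARGET" >/dev/null 2>&1; }

find_pmtu() {
  lo=1200; hi=1472; best=0
  while [ "$lo" -le "$hi" ]; do
    mid=$(((lo + hi) / 2))
    if probe "$mid"; then best=$mid; lo=$((mid + 1)); else hi=$((mid - 1)); fi
  done
  echo $((best + 28))
}

# Usage: TARGET=1.1.1.1 find_pmtu
```

On the 1492 path from Task 4, this converges on payload 1464 and prints 1492 in a handful of probes.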
Task 5: Check PMTUD ICMP messages in packet capture
cr0x@server:~$ sudo tcpdump -ni eth0 'icmp and (icmp[0]=3 and icmp[1]=4)'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:01:44.912345 IP 203.0.113.1 > 192.0.2.10: ICMP unreachable - need to frag (mtu 1420), length 36
Meaning: You’re receiving “need to frag” with an MTU hint. PMTUD is at least partially functioning.
Decision: If you never see these when you expect them, suspect firewall rules dropping ICMP type 3 code 4 (IPv4) or ICMPv6 PTB.
Task 6: See if ICMP is blocked on your firewall (nftables)
cr0x@server:~$ sudo nft list ruleset | sed -n '1,120p'
table inet filter {
chain input {
type filter hook input priority 0; policy drop;
ct state established,related accept
iif "lo" accept
ip protocol icmp icmp type echo-request accept
ip6 nexthdr icmpv6 icmpv6 type echo-request accept
}
}
Meaning: Echo requests are allowed, but other ICMP types are not. That’s the classic “ping works, PMTUD dies” setup.
Decision: Permit ICMP “fragmentation needed” (IPv4) and “packet too big” (IPv6) at minimum, ideally allow relevant ICMP error types broadly.
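In nftables terms, the minimum viable policy looks something like this (a sketch; the table and chain names are the generic filter/input from the ruleset above and must match yours):

```shell
# Permit the ICMP errors PMTUD needs; the inet family covers v4 and v6.
nft add rule inet filter input icmp type destination-unreachable icmp code frag-needed accept
nft add rule inet filter input icmpv6 type packet-too-big accept
```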
Task 7: Confirm TCP MSS advertised on SYN packets
cr0x@server:~$ sudo tcpdump -ni eth0 'tcp[tcpflags] & (tcp-syn) != 0 and host 93.184.216.34' -c 3
12:05:01.100001 IP 192.0.2.10.51544 > 93.184.216.34.443: Flags [S], seq 1234567890, win 64240, options [mss 1460,sackOK,TS val 111 ecr 0,nop,wscale 7], length 0
12:05:01.200002 IP 192.0.2.10.51545 > 93.184.216.34.443: Flags [S], seq 1234567891, win 64240, options [mss 1460,sackOK,TS val 112 ecr 0,nop,wscale 7], length 0
12:05:01.300003 IP 192.0.2.10.51546 > 93.184.216.34.443: Flags [S], seq 1234567892, win 64240, options [mss 1460,sackOK,TS val 113 ecr 0,nop,wscale 7], length 0
Meaning: MSS 1460 implies a 1500 MTU assumption (1460 payload + 40 bytes IPv4+TCP headers).
Decision: If your real path MTU is 1492 or lower (or tunnel reduces it), you should clamp MSS lower at the edge so endpoints don’t send oversized segments.
Task 8: Apply MSS clamping (iptables) as a workaround
cr0x@server:~$ sudo iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
cr0x@server:~$ sudo iptables -t mangle -S FORWARD | tail -n 1
-A FORWARD -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
Meaning: You’re rewriting MSS on SYN packets based on route MTU. This often avoids black holes even when ICMP is blocked.
Decision: Use this when you can’t reliably fix ICMP filtering quickly. Still schedule the real fix: correct MTU and ICMP policy.
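If the edge box runs nftables instead, the equivalent clamp is one rule (same caveat: table and chain names must match your ruleset):

```shell
# Rewrite MSS on forwarded SYNs down to what the route MTU allows.
nft add rule inet filter forward tcp flags syn tcp option maxseg size set rt mtu
```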
Task 9: Diagnose with tracepath (Linux)
cr0x@server:~$ tracepath 93.184.216.34
1?: [LOCALHOST] pmtu 1500
1: 192.0.2.1 0.381ms
2: 198.51.100.1 1.992ms pmtu 1492
3: 203.0.113.9 6.110ms
4: 93.184.216.34 12.834ms reached
Resume: pmtu 1492 hops 4 back 4
Meaning: Path MTU discovered as 1492 at hop 2. That’s valuable and actionable.
Decision: If tracepath cannot discover PMTU (stays at 1500 suspiciously), suspect blocked ICMP PTB/frag-needed or an asymmetric path issue.
Task 10: Check the route MTU hint in Linux
cr0x@server:~$ ip route get 93.184.216.34
93.184.216.34 via 192.0.2.1 dev eth0 src 192.0.2.10 uid 1000
cache mtu 1492
Meaning: Kernel route cache believes PMTU is 1492 for that destination.
Decision: If applications still stall, the issue might be elsewhere (TLS inspection, proxy), or the PMTU differs for other destinations/paths.
Task 11: Validate IPv6 PMTUD basics
cr0x@server:~$ ping6 -c 2 -s 1452 -M do 2606:4700:4700::1111
PING 2606:4700:4700::1111(2606:4700:4700::1111) 1452 data bytes
1460 bytes from 2606:4700:4700::1111: icmp_seq=1 ttl=57 time=14.1 ms
1460 bytes from 2606:4700:4700::1111: icmp_seq=2 ttl=57 time=14.0 ms
--- 2606:4700:4700::1111 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
Meaning: Large-ish IPv6 packets pass with DF semantics (IPv6 doesn’t do router fragmentation). Good sign.
Decision: If IPv6 fails only for larger sizes, check ICMPv6 type 2 (Packet Too Big) filtering on firewalls.
Task 12: Identify container/CNI MTU mismatch (Kubernetes node)
cr0x@server:~$ ip link show dev cni0
9: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 0a:58:0a:f4:00:01 brd ff:ff:ff:ff:ff:ff
cr0x@server:~$ ip link show dev vxlan.calico
7: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
Meaning: Bridge is 1500 but VXLAN is 1450. Pods might try to use 1500 and get burned once encapsulated.
Decision: Set CNI/bridge MTU to match effective overlay MTU (often 1450) so pods never generate too-large frames.
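This comparison is easy to automate. A sketch of a mismatch checker (the check_mtu helper and its "name mtu" input format are mine; pipe real data in from ip -o link):

```shell
# Warn about any interface whose MTU exceeds the overlay's effective MTU.
# Reads "name mtu" pairs on stdin; the overlay MTU is the first argument.
check_mtu() {
  overlay=$1
  while read -r name mtu; do
    [ -n "$name" ] || continue
    [ "$mtu" -gt "$overlay" ] && echo "WARN: $name mtu $mtu > overlay $overlay"
  done
  return 0
}

# Live usage: ip -o link show | awk '{gsub(":","",$2); print $2, $5}' | check_mtu 1450
printf 'cni0 1500\nvxlan.calico 1450\n' | check_mtu 1450
# prints "WARN: cni0 mtu 1500 > overlay 1450"
```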
Task 13: Confirm whether PMTUD is black-holed by testing a big TCP transfer
cr0x@server:~$ curl -I --max-time 10 https://example.com/
HTTP/2 200
content-type: text/html
server: envoy
cr0x@server:~$ curl --max-time 10 -o /dev/null -sS https://example.com/largefile.bin
curl: (28) Operation timed out after 10002 milliseconds with 0 bytes received
Meaning: Small request works; large transfer stalls. Not proof, but a strong MTU/fragmentation signal.
Decision: Immediately run DF ping / tracepath and inspect ICMP filtering. Don’t waste an hour on TLS cipher theories.
Task 14: Observe retransmissions and “stuck” large segments
cr0x@server:~$ sudo ss -ti dst 93.184.216.34:443 | sed -n '1,20p'
ESTAB 0 0 192.0.2.10:51544 93.184.216.34:443
cubic wscale:7,7 rto:204 rtt:52.3/1.2 ato:40 mss:1460 pmtu:1500 rcvmss:536 advmss:1460 cwnd:10 bytes_acked:3456 bytes_received:1200 segs_out:45 segs_in:38 retrans:8/12
Meaning: Retransmissions are happening. MSS is 1460, PMTU looks like 1500, but earlier tests suggested a lower path MTU—something’s inconsistent.
Decision: Suspect ICMP “too big” is not being received by this host (or asymmetric filtering). Implement MSS clamp and fix ICMP policy.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
The company had a hybrid setup: on-prem offices, a cloud VPC, and a “simple” IPsec tunnel between them. Everyone assumed the tunnel was a dumb pipe. It was installed years ago, it “worked,” and it was nobody’s favorite topic.
Then a new internal web app shipped behind a reverse proxy in the cloud. Users in one office reported that the login page loaded, but after entering credentials the browser spun forever. The app team saw no errors. The cloud load balancer metrics looked fine. Support tried the classic ritual: clear cache, try a different browser, reboot your laptop. Nothing consistent.
An SRE finally reproduced it by connecting through that specific office network. TCP handshake succeeded. TLS started. Then the response died mid-flight. The key detail: this office had recently moved to a new ISP service using PPPoE. Their edge firewall still assumed 1500, and the IPsec overhead pushed packets beyond the real path MTU.
The wrong assumption was simple: “Ethernet is 1500, so the path is 1500.” The fix was also simple: adjust the tunnel MTU and allow the right ICMP types. A temporary MSS clamp stabilized things immediately. The postmortem action item was boring and essential: document the effective MTU budget per link and per tunnel, and validate it during provider changes.
Mini-story 2: The optimization that backfired
A different org decided to “optimize performance” in their data center by enabling jumbo frames end-to-end. They had storage traffic, vMotion, backups, the usual. On paper it looked clean: fewer packets, fewer interrupts, higher throughput. The network team rolled it out gradually and it seemed fine—inside the data center.
Then came an innocuous change: a new firewall pair at the edge, also configured for jumbo frames on internal interfaces. The assumption was that bigger is better and everything internal supported it. Except one segment didn’t: an older load balancer appliance in a remote rack that never got the memo and was still at 1500.
Most services didn’t notice. That’s the trick with MTU bugs: small control-plane traffic worked. Health checks were tiny. The load balancer passed some requests. But large responses and some TLS records vanished into the abyss. The outage pattern was surreal: “Site works for me unless I click the reports tab.”
The backfiring optimization wasn’t jumbo frames themselves. It was the belief that you can change MTU like it’s a cosmetic setting. You can’t. MTU is a contract, and you don’t get to renegotiate contracts unilaterally. They fixed it by enforcing MTU consistency per L2 domain, adding automated checks, and refusing to mix 1500 and 9000 without a deliberate gateway.
Mini-story 3: The boring but correct practice that saved the day
A fintech shop ran multiple VPN concentrators for remote employees. Nothing exotic: split tunnel for some services, full tunnel for others. They had a change management rule that sounded painfully conservative: any new tunnel profile must include an MTU test and a stored “known-good” MSS clamp rule, even if they didn’t enable it by default.
During a provider outage, they failed over one VPN endpoint to a backup transit path. Suddenly, a small percentage of users couldn’t access a couple of SaaS apps. Most users were fine. The helpdesk tickets were messy and contradictory because that’s what selective failures do.
The on-call SRE pulled up the runbook, ran the DF ping tests from a test host behind the failover path, and confirmed the effective MTU dropped. They enabled the pre-approved MSS clamp for that VPN profile and closed the incident fast, without debating firewall politics mid-crisis.
Later they still did the “real fix” (ICMP policy alignment and correct MTU settings), but the boring practice—having a tested, ready-to-enable workaround—turned an all-night mystery into a short incident with a clear timeline.
Common mistakes: symptom → root cause → fix
1) “Ping works, so the network is fine” → ICMP echo allowed, PMTUD ICMP blocked → allow the right ICMP types
Symptom: ping succeeds, but HTTPS stalls or large downloads fail.
Root cause: Firewall allows echo-request/reply but drops ICMP “fragmentation needed” (IPv4 type 3 code 4) and/or ICMPv6 Packet Too Big.
Fix: Permit PMTUD-related ICMP. If you can’t, clamp TCP MSS at the edge.
2) “VPN is up, therefore it can carry normal traffic” → tunnel overhead not budgeted → lower tunnel MTU and/or MSS clamp
Symptom: SSH works, scp stalls; some web apps load partially.
Root cause: IPsec/GRE/WireGuard overhead reduces effective MTU. Endpoints still send 1500-based MSS.
Fix: Set tunnel MTU to a safe value (commonly 1380–1420 depending on stack) and validate with DF ping. Add MSS clamp if needed.
3) “We enabled jumbo frames, nothing exploded immediately” → mixed MTUs in one L2 domain → enforce consistent MTU per segment
Symptom: intermittent failures, often size-dependent; monitoring mostly green.
Root cause: Some links/devices at 9000, others at 1500; black holes appear at boundaries.
Fix: Standardize MTU per VLAN/L2 domain. Where you must interconnect, route between domains and clamp/fragment appropriately.
4) “It’s an application bug” → TCP retransmits, browser waits → prove it with packet sizing tests
Symptom: only certain pages or SaaS vendors fail; engineers argue in Slack.
Root cause: Large segments dropped due to PMTUD failure; application just experiences a stall.
Fix: Run tracepath and DF pings; capture ICMP PTB/frag-needed; fix ICMP or clamp MSS.
5) “IPv6 is optional” → ICMPv6 PTB blocked → fix firewall policy for IPv6 correctly
Symptom: IPv6 connections hang for some sites; IPv4 fallback sometimes masks it.
Root cause: Blocking ICMPv6 breaks PMTUD; IPv6 relies on it.
Fix: Allow ICMPv6 error messages including Packet Too Big; validate with ping6 -M do tests.
6) “Containers are just processes” → CNI MTU wrong for overlay → set pod/bridge MTU to underlay-safe value
Symptom: node can reach external sites, pods can’t (or are flaky).
Root cause: Overlay adds encapsulation; if pods use 1500 they exceed underlay MTU after encapsulation.
Fix: Configure CNI MTU (often 1450 for VXLAN on 1500 underlay, but validate). Roll it out carefully; it affects pod networking.
Checklists / step-by-step plan
Incident checklist: when “some sites don’t load” hits production
- Reproduce on-path: test from the same network segment/VPN as the affected users.
- Run DF ping sizing: determine maximum working packet size to a stable external IP.
- Run tracepath: see whether PMTU is discovered and where it drops.
- Check firewall ICMP policy: confirm frag-needed/PTB allowed, not just echo.
- Check tunnel/overlay MTU: compare MTU of physical NIC, tunnel, and virtual interfaces.
- Look for retransmits: use ss -ti and tcpdump to confirm repeated large segments and missing ICMP.
- Apply safe workaround: MSS clamp at the appropriate edge if you need immediate relief.
- Verify with a large transfer: curl a large object, scp a file, or run a known problematic site.
- Write down what changed: provider swap, new firewall, new CNI config, new SD-WAN policy—MTU bugs love change windows.
Change management checklist: prevent MTU bugs before they exist
- Define MTU domains: per VLAN, per WAN link, per tunnel type. Don’t mix 1500 and 9000 casually.
- Budget overhead: document encapsulation overhead for each tunnel/overlay in your environment.
- Standardize CNI MTU: set it explicitly; don’t let defaults drift across clusters.
- ICMP policy review: allow PMTUD-related ICMP types. Treat it as reliability traffic, not “noise.”
- Have a tested MSS clamp plan: know where to apply it and how to roll it back.
- Automate verification: run scheduled DF ping/tracepath checks from key network segments and alert on PMTU changes.
- Document “known good” values: store MTU/MSS settings in infrastructure-as-code and in runbooks.
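The scheduled verification can be as small as parsing tracepath’s Resume line (pmtu_of is a hypothetical helper name; the sample text mirrors Task 9’s output):

```shell
# Extract the discovered PMTU from tracepath output so a cron job can
# alert when it drifts from the documented known-good value.
pmtu_of() { awk '/Resume:/ {print $3}'; }

# Live usage: tracepath -n 93.184.216.34 | pmtu_of
pmtu=$(printf 'Resume: pmtu 1492 hops 4 back 4\n' | pmtu_of)
expected=1492
[ "$pmtu" = "$expected" ] || echo "ALERT: pmtu ${pmtu:-unknown}, expected $expected"
```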
Decision table: what to fix, in what order
- If ICMP PTB/frag-needed is blocked: fix firewall policy first; MSS clamp as stopgap.
- If tunnel MTU is set higher than underlay budget: lower tunnel MTU; then retest. Don’t rely only on MSS clamp unless you must.
- If jumbo frames are involved: verify every hop supports it. If you can’t guarantee that, keep jumbo frames inside a controlled zone and route at the boundary.
- If only pods break: fix CNI/overlay MTU configuration; don’t “solve” it by random sysctls on nodes.
FAQ
1) Why do only some websites fail, not everything?
Because only some flows generate packets big enough to exceed your real path MTU. Small requests work; big TLS records, headers, or responses don’t.
2) If PMTUD exists, why is this still a problem?
PMTUD requires ICMP feedback. Many networks block the exact ICMP messages PMTUD needs, creating black holes. The mechanism is fine; the operational reality is messy.
3) Is blocking ICMP actually “more secure”?
Not in any useful modern sense. Selectively blocking ICMP breaks diagnostics and core functions (especially in IPv6). Good security is stateful filtering and least privilege, not sabotaging the control plane.
4) Should I just clamp TCP MSS everywhere and forget about it?
No. MSS clamping is a legitimate workaround at tunnel boundaries and edges, but blanket clamping can hide underlying issues and reduce performance unnecessarily. Fix ICMP and MTU correctness first; clamp where it’s justified.
5) What MTU should I set for WireGuard?
There’s no universal number. 1420 is common because it often works over typical 1500 underlays, but PPPoE, additional tunnels, or provider quirks may require lower. Measure with DF pings and adjust.
6) How does this relate to HTTP/3 and QUIC?
QUIC runs over UDP and still depends on the path MTU. Fragmentation behavior differs, but the core issue remains: oversized packets get dropped. QUIC stacks often use PMTUD-like logic too, and ICMP filtering can still hurt.
7) My ISP says their MTU is 1500. Why does tracepath show 1492?
Because some portion of your path (often PPPoE or another encapsulation) reduces effective MTU. The “ISP MTU” statement is usually about one segment, not your end-to-end path.
8) Can MTU issues look like DNS problems?
Yes, indirectly. If DNS responses are large (DNSSEC, many records) and the path mishandles fragmentation or blocks ICMP, DNS over UDP can fail selectively. But “some sites don’t load” is more commonly TCP/TLS MTU pain.
9) What’s the safest immediate mitigation during an incident?
Enable MSS clamping on the edge device that forwards traffic into the problematic tunnel/path, then validate with large transfers. After the fire is out, fix ICMP policy and correct MTU settings.
10) How do I prove it’s MTU to a skeptical team?
Show a before/after: DF ping maximum size, a tracepath PMTU drop, and a packet capture where large segments retransmit with no ICMP PTB returning. Then apply MSS clamp and demonstrate the problem disappears.
Next steps you can actually do
MTU problems aren’t glamorous. They’re not even interesting in the fun way. They’re interesting in the “why is the CEO’s laptop the only one that can’t load payroll” way.
Do these next, in this order:
- Add MTU/PMTU checks to your incident muscle memory: DF ping and tracepath should be as routine as checking DNS.
- Fix ICMP policy deliberately: allow PMTUD-related ICMP (IPv4 frag-needed, IPv6 packet-too-big). Stop celebrating “ping works” as proof of anything.
- Inventory tunnels and overlays: list where overhead is added and set explicit MTUs there. Defaults are not a strategy.
- Keep MSS clamping ready: treat it like a fire extinguisher—tested, placed at the edge, and used when needed.
- Document MTU domains: especially if you run jumbo frames. Make mixed-MTU boundaries explicit and routed, not accidental.
If your organization hears “some sites don’t load” and immediately starts arguing about browsers, you don’t need better browsers. You need better MTU hygiene.