Your SRE dashboard is quiet, your link LEDs are happy, and iperf3 just screamed “10 Gbit/s” like it’s proud of itself. Meanwhile the real application feels like it’s sending requests by carrier pigeon. This is the particular kind of failure that makes teams argue in Slack: “It can’t be the network, iperf is fine.” “It’s always the network.” Both can be wrong.
Case #24 is the classic trap: a throughput benchmark proves the pipe is wide, but your workload needs something else—low tail latency, sane CPU scheduling, stable queues, predictable DNS, or storage that can keep up. I’ll show you where the bottlenecks hide on Ubuntu 24.04, how to catch them with commands you can actually run, and how to make decisions from the output instead of collecting screenshots for a postmortem.
Fast diagnosis playbook
When the app is slow but iperf3 looks great, don’t “tune” anything yet. First locate the bottleneck class: CPU/interrupts, TCP/path issues, name resolution/TLS, storage, or application concurrency. Here’s the order that finds the culprit fastest in production; a one-shot capture sketch that bundles these checks follows the list.
1) Confirm the symptom is latency, not just throughput
- Measure request timing from the client side (app logs, curl timing, DB client stats).
- Run a latency tool (ping, mtr, ss retransmits) while the slowdown is happening.
2) Check CPU pressure and softirq/irq imbalance
- If CPU is pinned, the network can be “fast” in iperf (one stream on one core) while your real app is blocked on softirq, TLS, GC, or syscalls.
- Look for high softirq, iowait, or cgroup throttling.
3) Check TCP health: retransmits, congestion, MTU blackholes
- ss -ti shows retransmits and pacing.
- MTU mismatch and broken PMTUD often produce “fast sometimes” plus “mysterious hangs.”
4) Check queueing and drops: NIC rings, qdisc, bufferbloat
- Drops can be invisible to iperf if its flows recover; your app might not.
- Watch ethtool -S, nstat, and qdisc stats.
5) Check storage and filesystem: iowait, sync writes, NFS/SMB quirks
- If the app waits on disk, networking benchmarks are irrelevant.
- Look at iostat, pidstat -d, and mount options.
6) Check “plumbing” dependencies: DNS, time, TLS, proxies
- Slow DNS looks like “slow network” and wastes everyone’s time.
- TLS handshake stalls can dominate P99 even with a perfect link.
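Here’s the promised one-shot capture sketch that bundles those snapshots into a single timestamped file. Run it during the slowdown; the interface (enp5s0f0) and peer (10.0.0.20) are this article’s examples, so substitute your own.
# One-shot evidence capture during the slowdown (bash; adjust IFACE/PEER).
IFACE=enp5s0f0; PEER=10.0.0.20; OUT="netsnap-$(date +%Y%m%dT%H%M%S).txt"
{
  uptime                                             # load: runnable + uninterruptible
  mpstat -P ALL 1 2                                  # per-CPU usage, iowait, softirq
  ss -ti dst "$PEER"                                 # per-connection RTT and retransmits
  nstat -az | egrep 'Retrans|Timeout|Listen'         # kernel-wide TCP trouble counters
  sudo ethtool -S "$IFACE" | egrep -i 'drop|discard|miss|overrun'   # NIC drops
  tc -s qdisc show dev "$IFACE"                      # qdisc drops and backlog
  iostat -xz 1 2                                     # disk latency and utilization
} > "$OUT" 2>&1
echo "wrote $OUT"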
Paraphrased idea (attributed): Werner Vogels has pushed the idea that you optimize for the tail, because users experience the slowest part.
That’s the heart of this case: iperf tells you the median pipe width; your users are stuck in the tail.
Why iperf looks good while your app suffers
iperf3 is a fantastic tool. It’s also a very specific tool: it measures throughput (and some loss/jitter in UDP mode) under a synthetic flow pattern. Real applications typically want a different set of guarantees:
- Latency distribution, not raw bandwidth. Many apps send small requests, wait for responses, then send more. The pipeline is never full. iperf fills it.
- Many concurrent flows. iperf3 defaults to a single stream (a handful with -P). Your service mesh might be juggling thousands of connections.
- CPU-heavy per byte. TLS, compression, JSON parsing, logging, kernel copies, checksums, and userspace networking stacks all eat CPU. iperf can be CPU-light by comparison.
- Different code paths. Your app may go through proxies, NAT, conntrack, DNS, L7 load balancers, sidecars, or a VPN. iperf is often run point-to-point and bypasses the real path.
- Different packet sizes. MSS, MTU, GSO/TSO, GRO/LRO all change behavior. iperf’s payload patterns are not your payload patterns.
- Different backpressure. Apps can block on disk, locks, database pools, or rate limiters. The socket becomes the messenger that gets blamed.
Here’s the uncomfortable truth: iperf being fast proves almost nothing beyond “two hosts can push bytes when asked nicely.” Production traffic is not asked nicely.
Short joke #1: iperf is like a treadmill test: it proves your heart works, but it doesn’t explain why you get tired carrying groceries.
Interesting facts and short history that matter
- TCP was built for reliability before speed was fashionable. It’s excellent at turning loss into latency when queues fill.
- Linux has used advanced queueing disciplines for years. Modern defaults lean toward fair queueing (like fq_codel) to fight bufferbloat, but your environment may override it.
- “Bufferbloat” became a mainstream ops term in the 2010s. Too much buffering can make throughput look great while interactive latency collapses.
- Interrupt moderation and NAPI changed the game. NICs moved from “interrupt per packet” to batching, trading latency for CPU efficiency—good until it isn’t.
- TSO/GSO/GRO are performance miracles with footnotes. Offloads reduce CPU by bundling packets, but can mask loss patterns and complicate troubleshooting.
- conntrack isn’t free. Stateful NAT and firewall tracking can become the bottleneck long before the link saturates.
- PMTUD failures are a classic “works in the lab” bug. If ICMP “Fragmentation needed” is blocked, certain paths blackhole large packets and create weird stalls.
- DNS moved from “small annoyance” to “major dependency.” Microservices turned lookups into a high-QPS workload; timeouts become user-visible.
- TLS got faster, and also heavier. TLS 1.3 reduced round trips but pushed more work to CPU. On busy hosts, handshakes can still dominate.
Practical tasks: commands, outputs, and decisions
These are real tasks you can run on Ubuntu 24.04. Each one includes: the command, what the output means, and the decision you make. Run them during the slowdown if you can; otherwise you’re diagnosing a ghost.
Task 1: Reproduce iperf in a way that matches your app
cr0x@server:~$ iperf3 -c 10.0.0.20 -P 8 -t 20
Connecting to host 10.0.0.20, port 5201
[SUM] 0.00-20.00 sec 22.8 GBytes 9.79 Gbits/sec 0 sender
[SUM] 0.00-20.04 sec 22.8 GBytes 9.77 Gbits/sec receiver
What it means: Multiple parallel streams (-P 8) can hide per-flow limits. If your app uses many connections, this test is a closer match to reality than a single stream.
Decision: If single-stream is slow but multi-stream is fast, suspect per-flow constraints: congestion control, MTU/PMTUD issues, policers, or a single CPU/IRQ bottleneck.
Task 2: Measure latency and path stability with mtr
cr0x@server:~$ mtr -rwzc 200 10.0.0.20
Start: 2025-12-30T10:12:31+0000
HOST: server Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.0.0.1 0.0% 200 0.25 0.32 0.19 1.90 0.18
2.|-- 10.0.0.20 0.0% 200 0.42 0.55 0.35 5.70 0.44
What it means: Low average but high worst-case latency hints at queueing or intermittent drops/retransmits.
Decision: If worst-case spikes correlate with app slowness, move to queue/drops and TCP retransmit checks.
Task 3: See if the host is CPU-starved or stuck in iowait
cr0x@server:~$ uptime
10:14:02 up 12 days, 3:21, 2 users, load average: 18.42, 16.90, 14.11
What it means: Load average is runnable + uninterruptible tasks. High load can be CPU contention or I/O waits.
Decision: High load with low CPU usage often means blocked I/O (disk or network filesystem) or lock contention. Don’t touch NIC tuning until you know.
Task 4: Break down CPU, softirq, and iowait
cr0x@server:~$ mpstat -P ALL 1 5
Linux 6.8.0-xx-generic (server) 12/30/2025 _x86_64_ (32 CPU)
12:14:21 CPU %usr %sys %iowait %soft %idle
12:14:22 all 38.10 21.33 0.15 18.40 22.02
12:14:22 7 10.00 15.00 0.00 70.00 5.00
What it means: CPU 7 is drowning in %soft (softirq). That’s packet processing and friends.
Decision: If one or two CPUs are pegged in softirq, you likely have IRQ/RPS/RFS imbalance. Fix that before you blame “the network.”
Task 5: Identify which interrupts are hot
cr0x@server:~$ cat /proc/interrupts | egrep -i 'eth0|mlx|ixgbe|enp|eno' | head
98: 192837465 0 0 0 PCI-MSI 524288-edge enp5s0f0-TxRx-0
99: 120044 0 0 0 PCI-MSI 524289-edge enp5s0f0-TxRx-1
100: 118900 0 0 0 PCI-MSI 524290-edge enp5s0f0-TxRx-2
What it means: Queue 0 takes nearly all interrupts. The other queues are idle.
Decision: Enable or correct IRQ affinity (or use irqbalance carefully), and verify RSS queues are configured and spreading load.
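If that’s the decision you land on, here’s a hedged sketch using the same example NIC and the IRQ number from the output above; your queue counts, IRQ numbers, and CPU choice will differ, so verify with the first two commands before changing anything.
# How many RSS queues does the driver expose vs. currently use?
sudo ethtool -l enp5s0f0
# If more combined queues are available than active, enable them (example count).
sudo ethtool -L enp5s0f0 combined 8
# Pin the hot queue's IRQ (98 in the output above) to a specific CPU; stop or
# configure irqbalance first, or it may move it right back.
echo 4 | sudo tee /proc/irq/98/smp_affinity_list    # CPU 4 is an arbitrary example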
Task 6: Check NIC offloads and current link state
cr0x@server:~$ sudo ethtool enp5s0f0
Settings for enp5s0f0:
Supported link modes: 1000baseT/Full 10000baseT/Full
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: on
Link detected: yes
What it means: Link is negotiated at expected speed/duplex. If you see 1000Mb/s unexpectedly, stop right here and fix cabling/SFP/switch port config.
Decision: Correct the physical/link layer mismatch before doing anything “clever” in software.
Task 7: Look for NIC-level drops and ring overruns
cr0x@server:~$ sudo ethtool -S enp5s0f0 | egrep -i 'drop|discard|miss|overrun|timeout' | head -n 20
rx_missed_errors: 18234
rx_no_buffer_count: 14590
tx_timeout_count: 0
What it means: rx_no_buffer_count suggests the NIC/driver ran out of receive buffers—often a ring size issue, CPU/softirq starvation, or too-small memory budgets under pressure.
Decision: If these counters rise during the slowdown, address CPU/IRQ distribution and consider ring tuning (ethtool -g/-G). Also check if the host is memory pressured.
Task 8: Inspect ring sizes (and whether you can change them)
cr0x@server:~$ sudo ethtool -g enp5s0f0
Ring parameters for enp5s0f0:
Pre-set maximums:
RX: 4096
TX: 4096
Current hardware settings:
RX: 512
TX: 512
What it means: Rings are small relative to max. Small rings can drop during microbursts; huge rings can increase latency by buffering too much. Pick your poison intentionally.
Decision: If you see RX drops and latency spikes under bursty load, increasing RX ring (moderately) can help. If your complaint is tail latency under moderate load, don’t blindly inflate buffers.
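If you do grow the rings, a hedged sketch; the sizes are illustrative, not a recommendation, and some drivers briefly reset the link when applying them. Measure drops and P99 before and after.
# Raise RX/TX rings moderately (example values, well below the 4096 maximum).
sudo ethtool -G enp5s0f0 rx 1024 tx 1024
# Confirm the new sizes, then watch whether the drop counters keep climbing.
sudo ethtool -g enp5s0f0
sudo ethtool -S enp5s0f0 | egrep -i 'no_buffer|missed'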
Task 9: Check TCP retransmits and socket health
cr0x@server:~$ ss -ti dst 10.0.0.20 | head -n 20
ESTAB 0 0 10.0.0.10:46322 10.0.0.20:443
cubic wscale:7,7 rto:204 rtt:3.2/0.8 ato:40 mss:1448 pmtu:1500 rcvmss:1448 advmss:1448
bytes_sent:184233 bytes_retrans:12480 bytes_acked:171753 segs_out:2219 segs_in:1962 data_segs_out:2010
What it means: Non-trivial bytes_retrans is a smoking gun. Retransmits turn into latency, head-of-line blocking, and user-visible slowness.
Decision: If retransmits correlate with slow periods, focus on path loss, queue drops, policers, MTU issues, or NIC/driver problems.
Task 10: Check kernel network statistics for retransmits and timeouts
cr0x@server:~$ nstat -az | egrep 'TcpRetransSegs|TcpExtTCPTimeouts|TcpExtTCPSynRetrans|IpExtInNoRoutes'
TcpRetransSegs 18432 0.0
TcpExtTCPTimeouts 1220 0.0
TcpExtTCPSynRetrans 219 0.0
What it means: Rising retransmits/timeouts point to loss, poor queueing, or path blackholes. SYN retransmits can indicate SYN drops (firewall, conntrack, or overloaded listener).
Decision: If SYN retransmits spike, inspect conntrack limits, firewall rules, and accept queue/backlog on the server.
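To check the server-side accept queue from that decision, a quick sketch (port 443 is this article’s example listener):
# For LISTEN sockets, Recv-Q is the current accept queue depth and Send-Q is the
# configured backlog limit; Recv-Q pinned at the limit means accept() can't keep up.
ss -ltn 'sport = :443'
# Listen-queue overflows also show up as kernel counters.
nstat -az | egrep 'TcpExtListenOverflows|TcpExtListenDrops'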
Task 11: Detect MTU/PMTUD blackholes with a controlled probe
cr0x@server:~$ ping -M do -s 1472 -c 5 10.0.0.20
PING 10.0.0.20 (10.0.0.20) 1472(1500) bytes of data.
1472 bytes from 10.0.0.20: icmp_seq=1 ttl=64 time=0.42 ms
1472 bytes from 10.0.0.20: icmp_seq=2 ttl=64 time=0.44 ms
--- 10.0.0.20 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4004ms
What it means: This confirms 1500-byte MTU works end-to-end. If it fails with “Frag needed” or silently drops, you have an MTU mismatch or blocked ICMP causing PMTUD failure.
Decision: If it fails, fix MTU consistency or allow necessary ICMP. Don’t “solve” it with random MSS clamping unless you control the whole path and understand the tradeoff.
Task 12: Inspect qdisc and queueing behavior
cr0x@server:~$ tc -s qdisc show dev enp5s0f0
qdisc mq 0: root
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 29837294812 bytes 21833890 pkt (dropped 482, overlimits 0 requeues 1221)
backlog 0b 0p requeues 1221
What it means: Drops at qdisc level can be healthy (controlled dropping) or symptomatic (too small limit, bursty app). Requeues suggest contention or pacing.
Decision: If you’re seeing sustained drops and rising latency, tune queues and pacing—but only after verifying CPU and NIC drops. If tail latency is the pain, fq_codel is usually your friend; replacing it with pfifo_fast is how you time-travel back to 2009.
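If something did swap out fair queueing, a minimal sketch to check and restore the default; the drop-in file name is illustrative, and on a multiqueue NIC the per-queue children under the mq root are created from this setting when qdiscs are set up.
# What qdisc does the kernel attach to new tx queues by default?
sysctl net.core.default_qdisc
# Make fq_codel the default persistently (file name is an example).
echo 'net.core.default_qdisc = fq_codel' | sudo tee /etc/sysctl.d/90-qdisc.conf
sudo sysctl --system
# Verify what is actually attached after the interface (or host) is reinitialized.
tc -s qdisc show dev enp5s0f0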
Task 13: Check conntrack pressure (NAT/firewall bottleneck)
cr0x@server:~$ sudo sysctl net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_max = 262144
cr0x@server:~$ sudo cat /proc/sys/net/netfilter/nf_conntrack_count
258901
What it means: You’re close to the limit. When conntrack is full, new connections fail or stall. iperf might still work if it reuses connections or if you test off-path.
Decision: Increase nf_conntrack_max (with memory awareness), reduce connection churn, or bypass conntrack for traffic that doesn’t need it. Also check for timeouts too high for your workload.
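A hedged sketch of the sizing fix; the limit and the drop-in file name are illustrative, so size against available memory and your actual connection churn.
# Raise the table limit now and make it persistent (example value and file name).
sudo sysctl -w net.netfilter.nf_conntrack_max=524288
echo 'net.netfilter.nf_conntrack_max = 524288' | sudo tee /etc/sysctl.d/90-conntrack.conf
# With conntrack-tools installed (package "conntrack" on Ubuntu), watch for
# insert_failed and drop counters rising: that's what users feel as stalls.
sudo conntrack -S | head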
Task 14: Confirm DNS isn’t the “network problem”
cr0x@server:~$ resolvectl query api.internal.example
api.internal.example: 10.0.0.20 -- link: enp5s0f0
(A) -- flags: answer
52ms
What it means: 52ms to resolve an internal name is suspicious in a LAN context. A few of those per request chain and suddenly “the app is slow.”
Decision: If DNS latency is high or inconsistent, inspect the resolver path (systemd-resolved, upstream DNS, search domains, retries). Cache where appropriate and fix the upstream.
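Two quick follow-ups, assuming systemd-resolved is in the path (the Ubuntu 24.04 default):
# Which resolvers and search domains is each link actually using?
resolvectl status
# Is the local cache earning its keep, or does every lookup go upstream?
resolvectl statistics
# Query the fully qualified name (trailing dot) to skip search-domain expansion
# and compare timings; long search lists can turn one lookup into several.
resolvectl query api.internal.example.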
Task 15: Time a full request: DNS + connect + TLS + first byte
cr0x@server:~$ curl -sS -o /dev/null -w 'dns:%{time_namelookup} connect:%{time_connect} tls:%{time_appconnect} ttfb:%{time_starttransfer} total:%{time_total}\n' https://10.0.0.20/health
dns:0.000 connect:0.003 tls:0.118 ttfb:0.146 total:0.147
What it means: TLS handshake is dominating. Your “slow network” is actually CPU, entropy, certificate chain issues, OCSP stapling problems, or overloaded TLS termination.
Decision: If TLS is slow, profile the terminating endpoint: CPU, ciphers, session resumption, cert chain size, and whether there’s a proxy doing extra work.
Task 16: Check disk I/O wait and saturation (because the app writes somewhere)
cr0x@server:~$ iostat -xz 1 3
Device r/s w/s rkB/s wkB/s w_await %util
nvme0n1 120.0 950.0 5200.0 88100.0 24.10 92.00
What it means: The disk is ~92% utilized and write waits average 24ms. Apps waiting on disk will “feel like network slowness” because responses don’t get produced.
Decision: If %util is high and await spikes during incidents, fix storage contention: separate workloads, tune fsync patterns, add cache, or scale storage.
Task 17: Find which process is doing the I/O or getting throttled
cr0x@server:~$ pidstat -dl 1 5
# Time UID PID kB_rd/s kB_wr/s iodelay Command
10:18:01 1001 21422 0.00 8420.00 120 postgres
10:18:01 0 9981 0.00 1200.00 40 fluent-bit
What it means: The database and logging agent are competing. In the real world, logs often eat the latency budget first.
Decision: If your logging/telemetry is heavy, rate-limit it, move it off the hot path, or send it to a separate disk. Observability that causes downtime is performance art, not engineering.
Task 18: Check cgroup CPU throttling (common in containers)
cr0x@server:~$ cat /sys/fs/cgroup/system.slice/docker.service/cpu.stat
usage_usec 8922334455
user_usec 6122334400
system_usec 2800000055
nr_periods 238994
nr_throttled 82211
throttled_usec 992233445
What it means: The service is being throttled. That creates latency spikes, connection stalls, and slow TLS—even though the NIC is fine.
Decision: Increase CPU quota, reduce CPU work per request, or re-balance placement. If you’re throttling, stop interpreting iperf output as reality.
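One caveat: docker.service in that path is the Docker daemon’s own cgroup; individual containers usually live in their own scope, depending on the cgroup driver. A sketch for a specific container, assuming cgroup v2 with the systemd driver and a hypothetical container named myapp:
# Resolve the container ID, then read that container's cpu.stat directly.
CID=$(docker inspect --format '{{.Id}}' myapp)
cat "/sys/fs/cgroup/system.slice/docker-${CID}.scope/cpu.stat"
# nr_throttled and throttled_usec rising per interval = latency you inflicted on yourself.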
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
They had a migration window and a checklist. Someone ran iperf3 between the old app servers and the new database cluster. It hit line rate. They declared the network “validated” and moved on.
Monday morning, the ticket queue started filling with “app freezes for 10–30 seconds.” Not a crash. Not a hard failure. Just long, humiliating pauses. The networking team pointed at iperf results. The database team pointed at CPU graphs that were “fine.” The application team rewrote three client timeouts in a panic and made it worse.
The actual issue was MTU mismatch across a VLAN boundary. The new path had jumbo frames enabled on a subset of switch ports, while a firewall in the middle dropped ICMP “fragmentation needed.” Small packets sailed through, which is why ping looked fine and why many requests worked. Certain payload sizes triggered PMTUD failure and blackholed until TCP timed out and tried smaller segments.
iperf looked great because the test used a different path (a direct routed hop that bypassed the firewall) and because the stream behavior recovered. The app, with many short-lived TLS connections and occasional larger responses, hit the broken edge cases constantly.
Fixing it was boring: consistent MTU, allow ICMP types needed for PMTUD, and one standard path for tests. The lesson wasn’t “iperf is bad.” The lesson was “the assumption that iperf proves the whole path is a lie you told yourself.”
Mini-story 2: The optimization that backfired
A platform team wanted lower CPU usage on their Ubuntu 24.04 nodes. Network processing was showing up in flamegraphs as softirq time. Someone suggested turning on more aggressive offloads and increasing NIC rings “so we never drop.” It sounded reasonable. It also shipped to production on a Friday, which is a strong indicator of optimism.
CPU usage went down. Success, right? But the customer-facing symptom—P99 latency—got worse. Not a little. Calls that were normally fine became spiky and unpredictable. The business side noticed because it always notices when you mess with tail latency.
The problem was classic bufferbloat, but inside the host. Enlarged rings and deeper queues absorbed bursts instead of dropping early, so TCP didn’t get timely congestion signals. Packets sat in buffers longer, and interactive flows waited behind bulk transfers. Throughput charts looked excellent. User experience did not.
Rolling back the change immediately improved P99. Then they did the mature thing: re-applied changes gradually, measured tail latency, kept fair queueing, and tuned rings only to the point where drops stopped being pathological.
Short joke #2: Nothing makes a system “fast” like buffering requests until next week.
Mini-story 3: The boring but correct practice that saved the day
A finance-adjacent service had strict latency SLOs and a habit of failing loudly during quarter-end. The team didn’t have time for hero debugging. They did something dull: they standardized a “golden” performance runbook for every host class.
Every node shipped with the same baseline: consistent MTU, known-good qdisc settings, IRQ affinity templates per NIC model, and a small set of “always-on” exporters that tracked TcpRetransSegs, softirq CPU, and disk await. They also pinned DNS behavior: explicit resolvers, short search domain lists, and caching where appropriate.
When a slowdown hit, the on-call didn’t debate ideology. They compared the live host against the baseline outputs. One node showed rising rx_no_buffer_count and had a different kernel driver version due to a missed reboot. That’s all. No mystery thriller.
They drained the node, rebooted to the correct kernel, and the incident ended. The postmortem was short and slightly embarrassing, which is the best kind: “we had a baseline, we detected drift, we fixed drift.”
The practice wasn’t exciting. It was effective. Most reliability work is just preventing your future self from being surprised.
Common mistakes: symptoms → root cause → fix
This section is where I get opinionated. These are patterns that waste days because they feel plausible. Use them as a smell test.
1) “iperf is fast, so the network is fine”
Symptom: iperf reports high Gbit/s; app has high P95/P99 latency and timeouts.
Root cause: You validated throughput for a small number of streams, not loss, jitter, queueing, or dependency latency. Real traffic uses different paths (proxies/NAT/VPN/service mesh).
Fix: Re-run tests on the real path (same VIP, same DNS, same TLS) and measure retransmits, qdisc drops, and DNS/TLS timings. Use curl -w style breakdowns and ss -ti.
2) “The app is slow only under load”
Symptom: Works in low traffic, collapses during spikes; iperf at off-peak is perfect.
Root cause: Queueing and bufferbloat, softirq saturation, conntrack exhaustion, or disk contention under real concurrency.
Fix: Observe during the spike. Capture mpstat, nstat, ethtool -S, tc -s, iostat. Fix the tightest resource: CPU/IRQ spreading, qdisc fairness, conntrack sizing, or storage isolation.
3) “Packet loss is tiny, so it can’t matter”
Symptom: Loss under 1%, yet user-visible stalls and retries.
Root cause: TCP is sensitive to loss, especially for short flows and during slow start. Even modest drops can balloon tail latency.
Fix: Locate drop point: NIC ring drops, qdisc drops, switch policers, Wi-Fi/underlay issues, or oversubscribed links. Drops are not all equal; drops at the wrong time hurt more.
4) “We increased buffers, now it’s stable” (until it’s not)
Symptom: Fewer drops, worse interactivity, P99 climbs.
Root cause: Too much buffering causes queueing delay; fairness between flows gets worse; TCP sees congestion later.
Fix: Prefer fair queueing and controlled dropping over deep buffers. Tune ring sizes carefully and measure latency, not just throughput.
5) “It’s Kubernetes/service mesh overhead” (sometimes true, often lazy)
Symptom: On-host iperf is fast; pod-to-pod or service-to-service is slow.
Root cause: Overlay MTU mismatch, conntrack pressure, kube-proxy iptables scaling, sidecar CPU throttling, or noisy neighbor cgroups.
Fix: Verify MTU end-to-end, monitor conntrack count, measure per-pod CPU throttling, and confirm the test uses the same service path as production traffic.
6) “The database is slow, but the network is fast”
Symptom: DB queries hang; iperf is fine; CPU seems okay.
Root cause: Storage latency (fsync, WAL, NFS, saturated NVMe), or lock contention that looks like “waiting for bytes.”
Fix: Measure iostat await, per-process I/O with pidstat -d, and database-specific wait events. Don’t use network tools to debug storage.
Checklists / step-by-step plan
Checklist A: When someone says “iperf is fine”
- Ask: “What exactly is slow?” Identify whether it’s connect time, TLS time, first byte, or total response.
- Confirm the app’s path: DNS name, VIP, proxy chain, NAT, overlay, firewall zones. Ensure your test uses the same path.
- Capture retransmits and timeouts: ss -ti, nstat.
- Capture drops: ethtool -S, tc -s qdisc.
- Capture CPU softirq and IRQ distribution: mpstat, /proc/interrupts.
- Capture storage latency: iostat -xz, pidstat -d.
- Only then consider tuning. If you tune first, you’re just changing the crime scene.
Checklist B: Fixing IRQ/softirq hotspots safely
- Verify RSS is enabled and multiple queues exist (ethtool -l if supported).
- Inspect interrupts per queue in /proc/interrupts.
- Check whether irqbalance is running and whether it helps or harms your workload.
- Change one thing at a time (affinity or RPS; see the RPS sketch after this checklist), then measure softirq distribution and application latency.
- Record your final configuration in config management. Drift is the silent killer.
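The promised RPS sketch, to pair with the IRQ affinity example from Task 5; the CPU mask and interface are illustrative, and RPS is the software fallback for when hardware RSS queues are scarce.
# Steer receive processing for rx queue 0 across CPUs 0-3 (hex bitmask 0f).
echo 0f | sudo tee /sys/class/net/enp5s0f0/queues/rx-0/rps_cpus
# Verify, then confirm %soft actually spreads out in mpstat -P ALL.
cat /sys/class/net/enp5s0f0/queues/rx-0/rps_cpus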
Checklist C: Validating MTU end-to-end
- Confirm interface MTU on all participating hosts (ip link show).
- Probe with ping -M do at the expected size.
- If overlay/VPN exists, account for encapsulation overhead and reduce inner MTU accordingly (worked example after this checklist).
- Ensure ICMP messages required for PMTUD are not blocked inside your security policy.
- Retest application-level behavior (TLS handshakes, large responses) after changes.
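A worked example of the overhead math, using VXLAN’s commonly cited ~50 bytes over IPv4; your encapsulation may differ, so check its header sizes, and the vxlan0 interface name is illustrative.
# Underlay MTU 1500; VXLAN adds ~50 bytes (outer IP 20 + UDP 8 + VXLAN 8 + inner Ethernet 14),
# so the inner interface should carry at most 1450.
# Probe it: ping payload = inner MTU - 20 (IP) - 8 (ICMP) = 1422.
ping -M do -s 1422 -c 5 10.0.0.20
# Set the overlay interface to match.
sudo ip link set dev vxlan0 mtu 1450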
Checklist D: When it smells like DNS
- Time lookups with resolvectl query during the incident.
- Check if search domains are causing repeated NXDOMAIN lookups.
- Verify resolver upstream health and whether caching is effective.
- Confirm the app isn’t doing per-request lookups due to missing connection pooling or bad client config.
FAQ
1) If iperf is fast, can the network still be the problem?
Yes. iperf mostly proves bandwidth under a specific flow pattern. Your problem may be loss, jitter, queueing delay, PMTUD, policers, or a different path involving NAT/proxies.
2) Should I run iperf with more parallel streams?
Often, yes. Use -P to mimic connection concurrency. If many streams are fast but one stream is slow, suspect per-flow constraints (loss, policers, MTU, congestion control).
3) What’s the fastest indicator that packets are being dropped?
Rising retransmits in ss -ti and nstat, plus NIC counters like rx_no_buffer_count or qdisc drops in tc -s. Drops don’t always show up in ping.
4) Why do I see high softirq and the app slows down?
Softirq is where the kernel processes packets. If it’s pinned on one core (common with bad IRQ/RSS distribution), your app threads compete for CPU and get delayed. Spread interrupts and packet steering across CPUs.
5) Is increasing NIC ring buffers a good idea?
Sometimes. It can reduce drops during bursts, but it can also increase latency by buffering more. If your SLO is latency, tune rings conservatively and measure P99 before/after.
6) How can DNS make my “network” slow?
Every lookup is a network dependency with timeouts and retries. In microservices, DNS lookups can sit on the critical path more often than people admit. Time it with resolvectl and break down request timing with curl -w.
7) Could storage be the bottleneck even if clients complain about “slow responses”?
Absolutely. If the server can’t read or write data quickly, it can’t respond quickly. Check iostat await and %util, then identify which processes are driving I/O.
8) Why does it only happen in containers or Kubernetes?
Containers add cgroup limits, overlay networks, and often conntrack-heavy paths. CPU throttling, MTU overhead, and conntrack exhaustion are common. Measure throttling and conntrack counts directly.
9) Should I disable offloads (TSO/GRO/LRO) to “fix latency”?
Not as a first move. Disabling offloads often increases CPU use and can worsen tail latency under load. Only change offloads when you have evidence of a driver/firmware bug or specific offload interaction.
10) What if iperf is fast, ping is fine, and everything still stalls?
Look at the “plumbing”: TLS handshake time, DNS, conntrack, accept queues/backlogs, and application pool exhaustion (DB connections, thread pools). Network tools won’t reveal those directly.
Conclusion: next steps you can do today
If you remember one thing from case #24, make it this: throughput is not user experience. iperf is a blunt instrument; your app is a fussy organism that cares about the tail.
Do these next, in this order:
- Capture a single “slow request” breakdown (curl -w or app timing) to identify whether the pain is DNS, connect, TLS, or server time.
- During the incident, collect four snapshots: mpstat, ss -ti, nstat, iostat. Those alone usually tell you which subsystem is lying.
- If softirq/IRQs are hot, fix distribution before tuning buffers.
- If retransmits/timeouts rise, hunt drops and MTU issues end-to-end.
- If storage await is high, stop blaming the network and go have the uncomfortable “disk is on fire” conversation.
- Write down a baseline and enforce it. The best incident is the one you can end by comparing outputs, not by improvising.