Ubuntu 24.04 “Connection reset by peer”: prove whether it’s client, proxy, or server (case #74)

“Connection reset by peer” is the networking equivalent of a shrug. The socket was alive, then it wasn’t. Your app logs blame the network, your network team blames the app, your proxy claims innocence, and your incident channel fills with vibes instead of evidence.

This piece is about turning that shrug into a signed confession. On Ubuntu 24.04, you can prove whether the reset came from the client, the proxy, or the server—down to the packet, the PID, the timeout, and the exact layer where the decision was made.

What “connection reset by peer” actually means (and what it doesn’t)

At the syscall level, “connection reset by peer” usually maps to ECONNRESET. Your process tried to read or write on a TCP socket, and the kernel told you: the other side nuked the connection.

In TCP terms, the “nuke” is typically a packet with the RST flag set. That RST can be sent for several reasons:

  • The peer explicitly reset the connection (application closed abruptly, or kernel decided the socket is invalid).
  • A middlebox forged a reset (proxy, firewall, load balancer, NAT, IDS). The “peer” might be a liar with good posture.
  • The host received traffic for a connection it doesn’t recognize (no socket, state lost, port closed) and responded with RST.

What it is not:

  • Not a timeout. Timeouts are usually ETIMEDOUT or application-layer timeouts (e.g., HTTP client gives up). A reset is immediate and explicit.
  • Not a clean close. Clean close is FIN/ACK and results in EOF on read, not ECONNRESET.
  • Not necessarily “the server crashed.” Sometimes the client bailed, sometimes the proxy enforced policy, sometimes conntrack did a disappearing act.

Two rules of thumb that hold up in production:

  1. If you didn’t capture packets, you don’t know who sent the RST. Logs can hint. Packets prove.
  2. If there’s a proxy, assume it’s involved until disproven. Proxies are paid to interfere.

Fast diagnosis playbook (first/second/third)

This is the “I’m on-call and it’s 02:13” version. The goal is to identify the origin of the reset (client, proxy, or server) fast enough to stop the bleeding.

First: locate the reset’s origin by vantage point

  1. Pick one failing flow: client IP, proxy IP, server IP, destination port, and timestamp window (±30 seconds).
  2. Capture packets on the proxy and the server simultaneously (or as close as possible). If you can only do one, do it on the proxy.
  3. Look for the first RST in each capture and identify which interface saw it first and from which IP/port tuple.
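For steps 2 and 3, a minimal capture sketch, assuming the flow uses port 443, the interface is ens5, and the client IP and filename below are placeholders to adjust:

# Record the suspect flow to a file so the evidence outlives the incident call.
sudo tcpdump -i ens5 -nn -s 0 -w /var/tmp/rst-evidence.pcap 'host 10.20.2.17 and tcp port 443'
# Read it back showing only RST packets, with absolute timestamps for cross-host correlation.
sudo tcpdump -nn -tttt -r /var/tmp/rst-evidence.pcap 'tcp[tcpflags] & tcp-rst != 0'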

Second: correlate with process and logs

  1. On the sender host, map the 4‑tuple to a socket and owning process with ss -tnpi.
  2. Check proxy logs for upstream resets vs client aborts vs timeouts (they are different words; don’t hand-wave them into one bucket).
  3. Check server logs for accept errors, TLS alerts, app exceptions, worker restarts.

Third: validate systemic causes

  1. Conntrack exhaustion (NAT or firewall path) and ephemeral port exhaustion (client side) can create “random” resets.
  2. MTU/PMTUD issues can look like resets when devices misbehave, especially with tunnels/VPNs.
  3. Backpressure and timeouts: a proxy with aggressive timeouts will reset “slow” upstreams. Slow can be CPU, storage, GC, or lock contention.

Joke #1: The fastest way to find the culprit is to add a proxy in front of it—now you have two culprits and a meeting invite.

A proof model: how to stop guessing and start pinning blame

When you’re trying to prove where an RST came from, you need a method that survives politics, not just physics. Here’s the model I use in postmortems:

1) Define the flow and its “truth anchors”

A single request might traverse: client → NAT → ingress proxy → service proxy → backend. The only things you can treat as “truth anchors” are:

  • Packet captures at multiple points.
  • Kernel socket state (ss, /proc), ideally on the host that sent the RST.
  • Timestamped logs with correlation IDs that actually propagate.

2) Use a timeline, not vibes

You’re looking for a sequence like this:

  • SYN/SYN-ACK/ACK establishes a connection.
  • Data flows (or doesn’t).
  • A party sends RST (or a device injects it).

The “who” is whichever hop first emits that RST, not whichever host logged the first exception.

3) Prove directionality with the 4‑tuple and ACK numbers

RST packets aren’t just flags. They include sequence/acknowledgment numbers. When you capture on both ends, you can often tell if a reset is:

  • Locally generated by the host you’re sniffing (it leaves the interface with correct routing, correct MAC, expected TTL patterns).
  • Forwarded (seen inbound on a proxy from upstream, then separately seen outbound to client, sometimes with different port mapping).
  • Injected (odd TTL, unexpected MAC/vendor OUI, or appearing only on one side).
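If you suspect injection, one way to surface TTL and source MAC on just the resets is a sketch like this (tcpdump’s -e prints link-layer headers, -v prints the IP TTL; interface and peer IP are placeholders):

# Compare TTL and source MAC on the RST against ordinary packets from the same peer.
# A different TTL or an unfamiliar MAC vendor on only the reset is a strong injection hint.
sudo tcpdump -i ens5 -nn -e -v 'host 10.20.4.18 and tcp[tcpflags] & tcp-rst != 0'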

4) Decide what “peer” means in your architecture

If the client talks to a proxy, the “peer” from the client’s perspective is the proxy. The backend server can be innocent while the proxy resets. That’s not semantics; it changes which team fixes it.

5) Treat timeouts as policy decisions

Many resets are not “failures”; they’re enforcement. Example: proxy sees upstream idle too long, sends RST to client to free resources. That’s a product decision wearing a networking costume.

Interesting facts and historical context (the useful kind)

  • Fact 1: TCP RST behavior was formalized early; resets exist to kill invalid connections fast instead of waiting on timeouts. This is why resets are so blunt.
  • Fact 2: Many classic “connection reset” incidents in the 2000s were actually NAT timeouts on stateful firewalls. NAT forgets you; your packets become “out of the blue.”
  • Fact 3: Linux can send RST when a packet arrives for a port with no listening socket—common during deploys where the process restarts and loses accept state.
  • Fact 4: Proxies popularized the idea of “client abort” vs “upstream reset” as separate categories; modern Nginx/HAProxy/Envoy logs can usually tell which direction broke first.
  • Fact 5: “RST injection” has been used historically for censorship and traffic interference. In enterprise networks, the less dramatic cousin is “security appliance that terminates flows.”
  • Fact 6: HTTP/2 and HTTP/3 shifted some failure modes: an HTTP/2 connection reset nukes many streams at once, amplifying impact.
  • Fact 7: Ubuntu 24.04 ships with a newer kernel and userland; changes in defaults (e.g., congestion control, TLS libraries, systemd services) can move the line where a reset shows up, even if your app didn’t change.
  • Fact 8: Keepalives were invented to avoid half-open connections lingering forever, but aggressive keepalive + short NAT timeout can still create the “it worked yesterday” kind of reset.

Practical tasks: commands, outputs, and decisions (12+)

These tasks are written to be run on Ubuntu 24.04 hosts. Each one includes the command, what the output means, and what decision you make.

Task 1: Confirm the error is actually ECONNRESET and where it appears

cr0x@server:~$ journalctl -u myapp --since "10 min ago" | tail -n 30
Dec 31 01:58:12 api-1 myapp[21450]: error: read ECONNRESET from 10.20.4.18:443
Dec 31 01:58:12 api-1 myapp[21450]: request_id=9b2a... upstream=payments duration_ms=212

What it means: The application saw a reset while reading. That suggests the peer (from this host’s perspective) sent RST or an intermediate device did.

Decision: Identify what “peer” is in this hop. If 10.20.4.18:443 is a proxy VIP, start on the proxy.

Task 2: Resolve the network path: who is the peer, really?

cr0x@server:~$ ip route get 10.20.4.18
10.20.4.18 via 10.20.1.1 dev ens5 src 10.20.2.17 uid 0
    cache

What it means: Traffic to the peer exits via ens5 and gateway 10.20.1.1. No surprise overlay here.

Decision: Capture on ens5. If route goes through a tunnel device, also suspect MTU and PMTUD.

Task 3: Snapshot live TCP health (server or proxy)

cr0x@server:~$ ss -s
Total: 1218
TCP:   842 (estab 311, closed 377, orphaned 2, timewait 355)

Transport Total     IP        IPv6
RAW       0         0         0
UDP       14        12        2
TCP       465       419       46
INET      479       431       48
FRAG      0         0         0

What it means: Lots of TIME_WAIT is normal on busy HTTP; “orphaned” staying low is good. A huge “orphaned” count can indicate abrupt closes and kernel pressure.

Decision: If TIME_WAIT is exploding on the client side and you’re close to ephemeral port limits, consider connection reuse and port range checks (later tasks).

Task 4: Find the exact socket and owning process for a failing connection

cr0x@server:~$ ss -tnpi dst 10.20.4.18:443 | head
ESTAB 0 0 10.20.2.17:51644 10.20.4.18:443 users:(("myapp",pid=21450,fd=78)) timer:(keepalive,116min,0)

What it means: The connection exists and belongs to myapp. Keepalive timer is set; the app is reusing connections.

Decision: If resets happen on reused connections, suspect idle timeouts on proxies/NATs. Go check proxy timeout settings and NAT idle timers.
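If the NAT or firewall hop is a Linux box you can log into, the relevant idle timers are conntrack and keepalive sysctls. A quick check (standard parameter names; the conntrack one requires the nf_conntrack module to be loaded):

# How long an idle established TCP flow stays in the state table (seconds):
sysctl net.netfilter.nf_conntrack_tcp_timeout_established
# Kernel keepalive defaults, to compare with the timer ss reported above:
sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes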

Task 5: Capture packets on the host where you suspect the RST originates

cr0x@server:~$ sudo tcpdump -i ens5 -nn -s 0 -vvv 'host 10.20.4.18 and tcp port 443' -c 30
tcpdump: listening on ens5, link-type EN10MB (Ethernet), snapshot length 262144 bytes
01:58:12.041223 IP 10.20.2.17.51644 > 10.20.4.18.443: Flags [P.], seq 182:496, ack 911, win 501, length 314
01:58:12.042007 IP 10.20.4.18.443 > 10.20.2.17.51644: Flags [R.], seq 911, ack 496, win 0, length 0
30 packets captured

What it means: The peer (10.20.4.18) sent an RST immediately after receiving data. That’s not a timeout; that’s an active refusal midstream.

Decision: Focus on the peer’s logs and process. If 10.20.4.18 is a proxy VIP, capture on the proxy host to see if it received an upstream RST or generated one.

Task 6: Capture on the proxy: prove whether it’s upstream or client-side

cr0x@server:~$ sudo tcpdump -i ens5 -nn -s 0 -vvv '(host 10.20.2.17 and tcp port 443) or (host 10.30.7.55 and tcp port 8443)' -c 50
tcpdump: listening on ens5, link-type EN10MB (Ethernet), snapshot length 262144 bytes
01:58:12.041700 IP 10.20.4.18.41210 > 10.30.7.55.8443: Flags [P.], seq 991:1305, ack 4402, length 314
01:58:12.041750 IP 10.30.7.55.8443 > 10.20.4.18.41210: Flags [P.], seq 4402:4520, ack 1305, length 118
01:58:12.041980 IP 10.20.2.17.51644 > 10.20.4.18.443: Flags [P.], seq 182:496, ack 911, length 314
01:58:12.042001 IP 10.20.4.18.443 > 10.20.2.17.51644: Flags [R.], seq 911, ack 496, length 0

What it means: Proxy reset the client connection while the upstream was still happily responding. That’s a proxy-generated reset.

Decision: Stop blaming the backend. Audit proxy timeouts, request size limits, TLS settings, and worker health. The proxy is making a decision.

Task 7: Check Nginx logs for client abort vs upstream reset (if Nginx)

cr0x@server:~$ sudo tail -n 20 /var/log/nginx/error.log
2025/12/31 01:58:12 [info] 1192#1192: *884 client prematurely closed connection while reading client request body, client: 10.20.2.17, server: api.example, request: "POST /v1/payments HTTP/1.1"
2025/12/31 01:58:12 [error] 1192#1192: *885 upstream prematurely closed connection while reading response header from upstream, client: 10.20.2.17, server: api.example, request: "GET /v1/status HTTP/1.1", upstream: "http://10.30.7.55:8443/v1/status"

What it means: Two different problems. “Client prematurely closed” points to client abort. “Upstream prematurely closed” points to backend closing early (could be crash, keepalive mismatch, app close, or upstream timeout).

Decision: Split incidents by log phrasing. Treat them as different root-cause trees, not one big “network glitch.”

Task 8: Check HAProxy termination state (if HAProxy)

cr0x@server:~$ sudo tail -n 5 /var/log/haproxy.log
Dec 31 01:58:12 lb-1 haproxy[2034]: 10.20.2.17:51644 [31/Dec/2025:01:58:12.041] fe_https be_api/api-3 0/0/1/2/3 200 512 - - ---- 12/12/0/0/0 0/0 "GET /v1/status HTTP/1.1"
Dec 31 01:58:12 lb-1 haproxy[2034]: 10.20.2.17:51645 [31/Dec/2025:01:58:12.050] fe_https be_api/api-2 0/0/0/0/1 0 0 - - cD-- 3/3/0/0/0 0/0 "POST /v1/payments HTTP/1.1"

What it means: The termination state cD-- points at the client side: lowercase “c” is a client-side timeout and “D” means it fired during the data phase (an uppercase “C” would be an outright client abort). Either way, the break came from the client side and the request never completed.

Decision: If HAProxy reports client aborts but clients swear they didn’t, verify client timeouts and intermediates (mobile networks, corporate proxies, SDK defaults).

Task 9: Check Envoy for upstream reset reasons (if Envoy)

cr0x@server:~$ sudo journalctl -u envoy --since "10 min ago" | tail -n 10
Dec 31 01:58:12 edge-1 envoy[1640]: [debug] upstream reset: reset reason: connection termination, transport failure reason: delayed connect error: 111
Dec 31 01:58:12 edge-1 envoy[1640]: [debug] downstream connection termination

What it means: Envoy gives you an upstream reset reason and often a transport failure reason. “connect error: 111” is connection refused (RST from upstream host because no listener).

Decision: Treat “connection refused” as a backend availability issue (service not listening, wrong port, deploy gone wrong), not as random resets.

Task 10: Check kernel counters for listen/backlog pain

cr0x@server:~$ netstat -s | egrep -i 'listen|overflow|reset' | head -n 20
    14 times the listen queue of a socket overflowed
    14 SYNs to LISTEN sockets ignored
    2318 connection resets received
    1875 connections reset sent

What it means: Listen queue overflow can cause clients to see resets or connection failures under bursts. “reset sent” indicates this host is actively resetting peers. (netstat comes from net-tools, which a minimal Ubuntu 24.04 install may not have; the same counters are visible via nstat from iproute2.)

Decision: If listen overflow increases during incidents, tune backlog (somaxconn, app accept backlog), or scale out. Don’t touch timeouts first; you’ll just hide the symptom.
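Before tuning, look at the current ceiling and each listener’s configured backlog. For LISTEN sockets, ss reports the live accept-queue depth in Recv-Q and the backlog limit in Send-Q (a sketch, assuming the service listens on 443):

# Kernel-wide cap that the application's listen() backlog is clipped to:
sysctl net.core.somaxconn
# Per-listener view: Recv-Q = connections waiting to be accepted, Send-Q = backlog limit
ss -ltn 'sport = :443'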

Task 11: Check conntrack exhaustion (common on NAT/proxy boxes)

cr0x@server:~$ sudo sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_count = 262131
net.netfilter.nf_conntrack_max = 262144

What it means: You’re basically full. Once conntrack is saturated, new flows get dropped or mangled. The resulting application errors can include resets, timeouts, and weird partial handshakes.

Decision: Increase nf_conntrack_max if memory allows, reduce timeouts for dead flows, and reduce connection churn (keepalive, pooling). Also find the source of connection explosion.
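A minimal sketch of the change, assuming you have checked memory headroom first (the value is illustrative, not a recommendation):

# Immediate, non-persistent:
sudo sysctl -w net.netfilter.nf_conntrack_max=524288
# Persistent across reboots via a sysctl drop-in:
echo 'net.netfilter.nf_conntrack_max = 524288' | sudo tee /etc/sysctl.d/90-conntrack.conf
sudo sysctl --system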

Task 12: Validate client ephemeral port range and TIME_WAIT pressure

cr0x@server:~$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768	60999

What it means: About 28k ports available per source IP. With lots of short-lived connections, you can run out and get connection failures that show up as resets upstream.

Decision: Prefer keepalive/pooling; if you must, widen port range and consider more source IPs (SNAT pools) on high-volume clients.

Task 13: Check for MTU blackholes and PMTUD breakage

cr0x@server:~$ ping -M do -s 1472 10.20.4.18 -c 3
PING 10.20.4.18 (10.20.4.18) 1472(1500) bytes of data.
From 10.20.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1450)
From 10.20.1.1 icmp_seq=2 Frag needed and DF set (mtu = 1450)
From 10.20.1.1 icmp_seq=3 Frag needed and DF set (mtu = 1450)

--- 10.20.4.18 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2025ms

What it means: Path MTU is 1450, not 1500. If something blocks ICMP “frag needed,” some TCP sessions stall and then get killed by intermediates, sometimes via resets.

Decision: Fix MTU consistency (tunnels, VXLAN), ensure ICMP is permitted for PMTUD, or clamp MSS at edges.

Task 14: Confirm whether TLS handshake is being reset

cr0x@server:~$ openssl s_client -connect 10.20.4.18:443 -servername api.example -tls1_2 
CONNECTED(00000003)
write:errno=104

What it means: Errno 104 is ECONNRESET. Reset during handshake often points to proxy TLS policy (SNI required, ciphers, ALPN), rate limiting, or a backend that’s not actually speaking TLS on that port.

Decision: Compare with a known-good client, verify SNI, inspect proxy TLS config, and capture packets to see who sends the RST.

Task 15: Inspect live drops and errors at the interface

cr0x@server:~$ ip -s link show dev ens5
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 06:3a:9e:11:22:33 brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
    1287349832 1298374      0     412       0       0
    TX:  bytes packets errors dropped carrier collsns
    983748233  992331      0       9       0       0

What it means: RX drops are non-zero. Drops can cause retransmissions and trigger timeout policies that end in resets.

Decision: If drops correlate with incident times, investigate NIC queueing, host CPU saturation, and upstream congestion. Don’t chase application ghosts until the packet path is stable.

Task 16: Tie a reset to a local firewall rule (nftables/ufw)

cr0x@server:~$ sudo nft list ruleset | sed -n '1,120p'
table inet filter {
  chain input {
    type filter hook input priority 0; policy accept;
    tcp dport 443 ct state new limit rate 50/second accept
    tcp dport 443 ct state new reject with tcp reset
  }
}

What it means: You are explicitly sending TCP resets when the rate limit is exceeded. That’s not “the network.” That’s policy.

Decision: If resets spike during bursts, adjust rate limits, add buffering/queueing, or scale. Also: document the rule so the next on-call doesn’t rediscover it at 3 a.m.

Proxy-specific failure modes (Nginx, HAProxy, Envoy)

If you run any kind of proxy, you are in the business of “connection reset by peer,” whether you like it or not. Here are the patterns that matter.

Proxy resets the client because the upstream is slow

Common when:

  • Upstream response headers take too long (proxy_read_timeout in Nginx; timeouts in HAProxy/Envoy).
  • Proxy buffers are full; it chooses to kill connections instead of going down with the ship.

How to prove: Packet capture on proxy shows client-side RST with no corresponding upstream RST at that moment. Proxy logs show timeout/504/termination state.

Fix: Decide whether to increase timeouts or fix upstream latency. Increasing timeouts is a bet that slowness is “normal.” In production, slowness is rarely normal; it’s a symptom.
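For Nginx specifically, the decision lives in a handful of directives. A sketch with placeholder values that should be derived from your latency budget, not copied:

# Inside the relevant server/location block (values are illustrative)
proxy_connect_timeout 5s;    # TCP/TLS connect to the upstream
proxy_read_timeout    30s;   # max silence while waiting for / reading the response
proxy_send_timeout    30s;   # max silence while sending the request upstream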

Proxy resets because of request body limits or header limits

Proxies can close connections aggressively on oversized headers/bodies. Depending on config and timing, the client can see a reset rather than a neat 413/431.

How to prove: Proxy error logs mention “client sent too large request” or header parsing failures. Packet capture shows RST soon after client sends headers/body.

Fix: Raise limits carefully and intentionally. If you raise them globally, you’re also raising your blast radius for abuse and memory pressure.

Keepalive mismatch between proxy and upstream

One of the most common production-grade footguns: the proxy reuses upstream connections longer than the upstream wants. The upstream closes idle connections. Proxy tries to write on a dead socket. Reset happens somewhere, and your error message points the finger at the wrong hop.

How to prove: On proxy: upstream connection in ss shows keepalive timers; upstream logs show idle timeout closes; packet capture shows upstream FIN/RST on idle connection followed by proxy attempting to write.

Fix: Align keepalive idle timeouts across: client SDK, edge proxy, service proxy, backend server, and NAT. Pick a hierarchy (edge smallest or backend smallest) and document it.
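In Nginx terms the alignment looks roughly like this (directive names are real; the numbers are assumptions and must stay below the backend’s idle timeout):

upstream be_api {
    server 10.30.7.55:8443;
    keepalive 32;            # idle upstream connections kept per worker
    keepalive_timeout 50s;   # must be shorter than the backend's idle timeout
}
server {
    location / {
        proxy_pass http://be_api;
        proxy_http_version 1.1;          # upstream keepalive needs HTTP/1.1
        proxy_set_header Connection "";  # drop "Connection: close" so reuse happens
    }
}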

TLS policy enforcement (SNI/ALPN/ciphers) that looks like random resets

TLS failure modes are often logged poorly by applications and are sometimes handled by proxies with hard closes. A client missing SNI can get reset. An old client offering weak ciphers can get reset. HTTP/2 negotiation can complicate this when ALPN is involved.

How to prove: openssl s_client shows reset during handshake; proxy logs show handshake errors; packet capture shows reset right after ClientHello.

Fix: Enforce TLS policy, but make it observable. If you’re going to reset, log why with enough detail to be actionable.
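To reproduce the usual policy triggers from the client side, compare a few handshakes (standard openssl s_client options; host, SNI name, and ALPN value are placeholders):

# No SNI at all; strict edges often hard-close here:
openssl s_client -connect 10.20.4.18:443 -noservername
# Correct SNI with explicit ALPN, to catch h2-only or h1-only policy:
openssl s_client -connect 10.20.4.18:443 -servername api.example -alpn h2
# Pin the TLS version to separate version policy from cipher policy:
openssl s_client -connect 10.20.4.18:443 -servername api.example -tls1_3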

Server-side causes: app, kernel, TLS, storage (yes, storage)

Resets are sometimes the server’s fault in the boring way: the server is overloaded, misconfigured, or restarting, and the kernel does what kernels do—makes your app’s day worse.

App restarts and connection churn

If your backend restarts under load (OOM, crash loop, deploy), existing connections can be dropped. Clients see resets if the process dies or if a sidecar/proxy tears down sockets hard.

Proof: Correlate reset timestamps with service restarts in journalctl and orchestrator events. Packet capture shows resets coinciding with SYNs hitting no listener.

Listen backlog overflow

A server can be “up” and still not accept connections. If the accept queue overflows, SYNs get dropped or ignored; clients retry; intermediates react; eventually someone resets.

Proof: netstat -s shows listen overflows; server CPU may be pegged; app logs show slow accepts.

Fix: Tune backlog, scale, and reduce per-connection overhead. Don’t mask it with bigger timeouts.

Storage latency masquerading as network resets

This is where storage engineers get dragged into “network issues.” If your server threads block on disk (database fsync storms, log volume saturation, EBS hiccups, ZFS sync writes), request latency spikes. Proxies hit timeouts and reset clients. Clients blame the server. The server blames the network. Nobody blames the disk because the disk is “green.”

Proof: Latency spikes in application metrics align with proxy timeouts and client resets. On the server, iostat and pidstat show IO wait during incident window.

Fix: Treat storage latency as part of the request path. Put SLOs on it. If you can’t measure it, you can’t exonerate it.

TLS offload confusion

Backends accidentally speaking HTTP on a port that the proxy expects TLS (or vice versa) can yield resets that look like handshake flakiness. It’s a config drift classic.

Proof: Packet capture shows plaintext where TLS should be, or immediate reset after ClientHello. Proxy logs show “wrong version number” or handshake failure.

Fix: Make port/protocol contracts explicit. Treat “it’s always been that way” as a bug report, not a reason.

Client-side causes: aborts, NATs, MTU, and “helpful” libraries

Clients cause a shocking number of resets. Not out of malice—out of impatience, default timeouts, and middleboxes that aggressively garbage-collect state.

Client aborts due to local timeout

SDKs often default to short timeouts. Mobile clients are worse because their networks are worse. Corporate desktop clients can be the worst because security proxies are creative.

Proof: Proxy logs say “client prematurely closed connection” or HAProxy termination state indicates client abort. Packet capture on proxy shows FIN/RST from client side first.

Fix: Make client timeouts explicit and aligned with server/proxy behavior. If you need 30 seconds for a request, don’t ship a 5-second client timeout and then accuse the server.

NAT idle timeouts killing long-lived idle connections

A client behind NAT may keep a TCP connection idle longer than the NAT’s timeout. The NAT forgets the mapping. Next packet goes out, return traffic can’t match state, and some device responds with RST or drops.

Proof: Resets occur after idle periods; changing keepalive intervals changes failure rate. Conntrack tables on NAT show short timeouts for established flows.

Fix: Keepalives at intervals shorter than NAT idle timeout, or avoid long-lived idle connections in hostile networks.
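On a Linux client the kernel-side knobs are the tcp_keepalive sysctls (real parameter names; the values below assume a NAT that forgets idle flows after roughly five minutes, and the application still has to enable SO_KEEPALIVE on its sockets):

# First probe after 120s idle, then every 30s, declare the peer dead after 4 misses.
sudo sysctl -w net.ipv4.tcp_keepalive_time=120
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=30
sudo sysctl -w net.ipv4.tcp_keepalive_probes=4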

MTU mismatch and blackholes

MTU problems aren’t supposed to cause resets. In a perfect world, they cause fragmentation or PMTUD adjustment. In the real world, some devices block ICMP, and others make “helpful” decisions like killing flows.

Proof: ping -M do shows path MTU smaller than expected; packet capture shows retransmissions and then resets by proxy after timeouts.

Fix: Fix the path MTU or clamp MSS at the edge of tunnels.
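An MSS clamp in nftables, in the same syntax as the ruleset shown earlier, looks roughly like this (“rt mtu” clamps the SYN’s MSS option to the route MTU; table and chain names are placeholders):

table inet mangle {
  chain forward {
    type filter hook forward priority mangle; policy accept;
    tcp flags syn tcp option maxseg size set rt mtu
  }
}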

Joke #2: “It’s probably the client” is not a diagnosis; it’s a coping mechanism with a ticket number.

Three corporate-world mini-stories

Mini-story 1: The incident caused by a wrong assumption

They had a clean narrative: “The database is resetting connections.” The error was ECONNRESET, the stack trace pointed at the DB driver, and the service was timing out under load. So the DB team got paged, again, and everyone prepared for the usual dance.

A senior engineer asked a rude but necessary question: “Where did the reset come from?” Nobody knew. They had metrics, dashboards, and a war room; they did not have a packet capture.

They captured at two points: the application host and the HAProxy layer. On the application host, the RST appeared to come from the HAProxy VIP. On HAProxy, there was no upstream RST from the database. Instead, HAProxy was issuing resets because its server-side timeout (how long it was willing to wait on the database) was shorter than the slowest percentile of DB queries during peak IO wait.

The wrong assumption was that the “peer” in the error message was the database. In reality the peer was the proxy. The database was slow, yes, but it wasn’t resetting anything; the proxy was enforcing policy.

The fix was not “increase all timeouts forever.” They increased the proxy timeout slightly, but the real work was reducing IO stalls—index maintenance scheduling, better query plans, and moving a logging workload off the same volume. After that, the proxy stopped “helping.”

Mini-story 2: The optimization that backfired

A team tried to reduce latency and CPU by increasing keepalive reuse everywhere. Fewer handshakes, fewer TCP setups, less TLS overhead. On paper it was tidy.

They rolled it out at the edge proxy first, letting upstream connections live much longer. Within a day, customers reported sporadic resets on perfectly ordinary requests. The logs were infuriating: some requests succeeded instantly, others died mid-request with “connection reset by peer.”

The root cause was a mismatch: the upstream application server had an aggressive idle timeout and occasionally performed worker recycling. The proxy, now holding upstream connections longer, would reuse sockets that had been quietly killed. The first write on a half-dead connection triggered a reset. Under high concurrency it looked random.

Packet captures made it obvious: the upstream sent FIN on idle; the proxy didn’t notice in time and wrote anyway; the upstream’s kernel answered the write on a closed socket with an RST. The “optimization” increased the probability of hitting stale upstream sockets.

The fix was boring: align timeouts, enable active health checks that actually validate requests, and set upstream keepalive to something less heroic. Performance improved after they stopped trying to be clever.

Mini-story 3: The boring but correct practice that saved the day

A platform team had a rule: every customer-facing incident requires a “two-sided capture” if it involves networking errors. People complained because it felt slow and procedural. It was, and that was the point.

One afternoon, an internal service started failing with resets. The application team insisted it was the proxy. The proxy team insisted it was the app. The network team insisted it was “upstream.” The incident channel started to smell like a committee.

They followed the rule. Capture on the proxy’s client-facing interface and on its upstream interface. The reset was visible coming from a firewall hop between proxy and backend, only on the upstream side. The proxy was just passing the pain along.

Because they had evidence early, they avoided an hour of config thrashing. The firewall change window showed a recent rule adjustment that rejected certain new flows with TCP reset after a rate threshold. It wasn’t malicious; it was “protective.”

The fix was to adjust the firewall rule to match the service’s legitimate burst profile and to add an explicit exception for the proxy subnet. The reset rate dropped immediately. Nobody had to pretend they “felt” the issue.

Common mistakes: symptoms → root cause → fix

  • Symptom: Resets only on large POST requests.
    Root cause: Proxy body size limit or buffering behavior; sometimes WAF terminating flows.
    Fix: Raise specific limits (client_max_body_size), adjust buffering, confirm WAF policy; verify with packet capture and proxy logs.
  • Symptom: Resets happen after ~60 seconds of idle time on reused connections.
    Root cause: NAT/proxy idle timeout shorter than client keepalive reuse period.
    Fix: Set keepalive/idle timeouts consistently; consider TCP keepalive tuning; reduce idle reuse.
  • Symptom: Burst traffic causes immediate resets/connection refused.
    Root cause: Listen backlog overflow or rate limiting that rejects with TCP reset.
    Fix: Tune backlog, scale out, or adjust rate limiting; confirm with netstat -s and firewall rules.
  • Symptom: Only certain clients (older SDKs) see resets during TLS handshake.
    Root cause: TLS policy enforcement: missing SNI, unsupported ciphers, ALPN mismatch; proxy hard-closes.
    Fix: Make TLS requirements explicit; improve error reporting; test with openssl s_client using client-like settings.
  • Symptom: Resets correlate with deploys but only some percent of traffic fails.
    Root cause: Connection draining not configured; proxy sends traffic to restarting instances; stale upstream keepalive sockets.
    Fix: Add graceful shutdown, proper readiness gates, draining, and shorter upstream keepalive.
  • Symptom: “Random” resets under high connection churn; NAT/proxy hosts show odd behavior.
    Root cause: Conntrack table near full; drops/evictions create state loss.
    Fix: Increase conntrack max, reduce churn, tune timeouts, identify top talkers.
  • Symptom: Resets cluster with packet drops and retransmits.
    Root cause: Interface drops, queue overruns, CPU saturation, or path congestion leading to policy timeouts and resets.
    Fix: Fix packet loss first; then tune timeouts. Use ip -s link and captures to confirm.
  • Symptom: Resets appear only across VPN/tunnel paths.
    Root cause: MTU mismatch or ICMP blocked causing PMTUD failure; proxies terminate slow/stalled flows.
    Fix: Clamp MSS, align MTU, allow ICMP “frag needed,” validate with DF pings.

Checklists / step-by-step plan

Step-by-step plan: prove client vs proxy vs server in under an hour

  1. Pick one failing request: timestamp, client IP, destination VIP, request path, correlation ID if available.
  2. Identify the first proxy hop (ingress/load balancer). In modern stacks, that’s usually the true “peer” for the client.
  3. Capture packets at two points:
    • On the proxy: client-facing interface.
    • On the proxy: upstream-facing interface (or on the backend host).
  4. Find the first RST and record:
    • Source IP:port and destination IP:port
    • Exact time
    • Whether it’s [R.] or [R] and whether it’s in response to data
  5. Correlate with logs on the host that sent the RST:
    • Nginx error log phrases
    • HAProxy termination flags
    • Envoy upstream reset reasons
    • Kernel counters (netstat -s)
  6. Check systemic constraints (fast):
    • conntrack count vs max
    • interface drops
    • CPU saturation / IO wait
  7. Make a call:
    • If proxy generated RST: fix proxy config/timeouts/policy.
    • If backend generated RST: fix app lifecycle, listener, TLS, or overload.
    • If client generated RST: fix client timeout/retry logic or client network path.
    • If a middlebox injected it: fix firewall/WAF/NAT policy or state tables.

Operational checklist: what to record for a defensible incident report

  • pcap snippet(s) showing the RST at two vantage points
  • the 4‑tuple(s) involved and any NAT mapping if known
  • proxy log lines for the same request
  • server restart or overload evidence (journald, metrics)
  • conntrack and port range state if relevant
  • the exact config knobs implicated (timeouts, limits, rate limits)
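If keeping short pcaps during incidents sounds expensive, a ring-buffer capture bounds the cost (standard tcpdump options; interface, filter, and sizes are placeholders):

# Five rotating ~100 MB files (incident.pcap0..4); the oldest is overwritten.
# -s 96 keeps only headers: enough for flags/seq/ack, small enough to keep around.
sudo tcpdump -i ens5 -nn -s 96 -C 100 -W 5 -w /var/tmp/incident.pcap 'tcp port 443'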

FAQ

1) Does “connection reset by peer” always mean the remote server crashed?

No. It means someone on the other end of that TCP hop sent a reset—or a middlebox pretended to be them. In proxied architectures, “peer” is often the proxy.

2) How do I prove the proxy sent the reset?

Capture on the proxy. If the proxy sends RST to the client while the upstream side shows no corresponding RST (and may even be sending data), the proxy is the origin.

3) Can a firewall cause resets instead of drops?

Yes. Firewalls and WAFs often “reject with tcp reset” to fail fast. This is common with rate limits, invalid state, or policy violations.

4) Why do I only see it on keepalive connections?

Because stale connections get reused. NAT idle timeouts, proxy idle timeouts, and backend idle timeouts don’t automatically align. First write after idle is where you discover you were talking to a ghost.

5) Is packet capture really necessary? Can I do it with logs?

Logs can suggest; packets can prove. When multiple teams are involved, “suggest” buys you meetings. Proof buys you a fix.

6) What’s the difference between FIN and RST in practice?

FIN is a polite close: read returns EOF. RST is a hard abort: read/write fails with ECONNRESET. Many proxies and policy engines choose RST to reclaim resources quickly.

7) Can storage issues really lead to connection resets?

Indirectly, yes. Storage latency can make backends slow, triggering proxy timeouts; the proxy then resets the client. The reset is a symptom; the root cause can be IO wait.

8) How do I tell client aborts from server aborts?

Proxy logs are usually the fastest clue (client prematurely closed vs upstream closed). For proof, packet capture shows which side first sends FIN/RST.

9) What if I see resets but no one’s logging anything?

Then you’re missing observability at the layer that made the decision. Add structured logging on proxies (termination reason), keep short pcaps during incidents, and record socket state.

Conclusion: practical next steps

You don’t fix “connection reset by peer.” You fix the component that decided to send a reset—or the condition that forced it to.

Next steps that pay for themselves:

  1. Adopt the two-sided capture rule for networking incidents. It ends arguments quickly.
  2. Normalize timeout budgets across clients, proxies, and servers. Write them down. Make them part of change review.
  3. Make resets observable: proxy termination reasons, firewall reject counters, conntrack saturation alerts, backlog overflow alerts.
  4. Run the tasks above during the incident and paste the outputs into the ticket. Future-you will send past-you a thank-you note.

Paraphrased idea (attributed): Werner Vogels has emphasized building systems that assume failure and recover quickly, because failure is normal in distributed systems.
