Debian 13 policy routing: debug ip rule and ip route without pain

You changed nothing “important,” just added a second uplink, a VPN, or a container bridge. Suddenly half your outbound traffic vanishes,
or replies come back on the wrong interface and die quietly. The logs say nothing. Your monitoring says “packet loss.”
Your coworkers say “Linux routing is black magic.”

It’s not magic. It’s policy routing. It’s deterministic. And it’s also unforgiving in the way only mature subsystems can be: it will do
exactly what you asked, not what you meant. This is a production-grade Debian 13 field guide to ip rule and ip route
debugging, with the kind of commands you can run under pressure and the kind of conclusions you can explain in a postmortem.

The mental model: what actually happens to a packet

When Debian 13 routes a packet, it does not pick “the default route” in a single global table and call it a day.
It runs a policy engine. That engine evaluates a list of rules (ip rule) from top to bottom.
Each rule can match packet metadata (source address, destination, fwmark, incoming interface, UID, and more) and point the lookup at a routing table.
The chosen table is then searched for the best route (ip route), and only then does the kernel decide the output interface, gateway, and source address.

The key move is that routing tables are not just “the routes.” They are “a possible universe of routes,”
and rules decide which universe applies to this packet.
If you’re debugging, don’t ask “what’s my route?” Ask: “which rule wins, which table is consulted, and what route is selected from that table?”
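
In command form, those three questions map to three lookups. A minimal sketch, reusing table and address names that appear in the examples later in this guide; substitute your own destination, source, and table.

# Which rule wins for this traffic class?
ip rule show
# Which table does the winning rule consult? (example table name: isp2)
ip route show table isp2
# What route does the kernel actually select for a given packet?
ip route get 1.1.1.1 from 192.0.2.10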

What “black magic” feels like in practice

Policy routing failures have a signature: things work from one IP but not another; ping works but TCP doesn’t; outbound works but replies don’t;
one subnet is fine while another is dead. Standard routing failures are usually more uniform. Policy failures are selective.

Another signature: you run ip route, see a perfectly sane default route, and still traffic exits the wrong interface.
That’s because the relevant packet never consulted the main table. It consulted a different table due to a rule you forgot existed.
Or a rule inserted by a network manager. Or a VPN client. Or your own “temporary” workaround from three quarters ago.

Policy routing is not optional once you have complexity

If you have any of these, you’re doing policy routing whether you admit it or not:

  • Two uplinks (dual-WAN, LTE backup, ISP migration)
  • VPN split tunneling
  • Multiple source IPs on one host
  • Kubernetes/CNI with host routing quirks
  • Traffic steering by application (UID/cgroup marks)
  • Any security policy using fwmarks to steer traffic

The trick is to make it observable and boring. Boring is how you keep your weekends.

A single quote worth keeping on your desk

Hope is not a strategy. — General Gordon R. Sullivan

Routing “hope” usually looks like “it worked in staging” or “Linux will figure it out.” It will. Just not in the way you wanted.

Facts and history that make today’s weirdness make sense

  • Policy routing in Linux isn’t new. It’s been in mainline since the 2.2 kernel era, and iproute2 grew up around it.
  • The “main” table isn’t special by design, only by convention. Rules decide when it’s used; most systems send most traffic through main simply because the default rule set points there.
  • ip rule is evaluated in priority order, not insertion order. Lower numbers are checked first, several rules can share a priority, and the first rule whose table lookup succeeds wins.
  • There are multiple built-in rules even on a “simple” host. Typically: local table lookup, then main, then default. That local lookup is why your host can reach its own IPs even if you broke everything else.
  • Reverse path filtering (rp_filter) was built for sanity, not multi-homing. Strict mode drops packets that come in on an interface that the kernel wouldn’t use to reach the source.
  • fwmark-based routing is older than most people’s careers. Netfilter marking and policy routing have been paired for a long time because it’s a clean separation: classify with firewall, route with rules.
  • “Default route” is not singular in the kernel. You can have multiple default routes, metrics decide preference, and rules decide whether a default route is even considered.
  • Network managers love to add rules. systemd-networkd, ifupdown2, VPN clients, and container stacks often install their own rules and tables for isolation.
  • Conntrack can preserve decisions you no longer want. A flow routed one way can keep going that way until it expires, even after you “fixed” routing.

These aren’t trivia. They are the reasons policy routing feels like it has moods.
It doesn’t. You just changed the rules, and the rules changed the world.

Fast diagnosis playbook (first/second/third)

When traffic “mysteriously” chooses the wrong interface, do not start by editing config files. Start by asking the kernel what it would do.
These checks are ordered to collapse the problem space fast.

First: confirm the routing decision for a specific packet

  • Use ip route get for the destination, with a source IP if relevant.
  • If fwmarks are involved, test with mark using ip route get ... mark (when supported) or emulate via rule inspection and table lookups.
  • Confirm the chosen table via ip rule and the selected route via ip route show table X.
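
A hedged sketch of that first pass, assuming a destination of 1.1.1.1, a candidate source of 192.0.2.10, and a fwmark of 0x1 as in the examples below; adjust to the traffic you are actually chasing.

# Decision for the destination alone, then with the suspected source
ip route get 1.1.1.1
ip route get 1.1.1.1 from 192.0.2.10
# If traffic is steered by fwmark, ask with the mark as well
ip route get 1.1.1.1 mark 0x1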

Second: identify which rule is winning and why

  • Dump rules with priorities. Look for matches: from, to, iif, fwmark, uidrange.
  • Check for rules you didn’t create: VPN clients, cloud init, networkd, container stacks.
  • Confirm the route exists in the consulted table; “blackhole” or “unreachable” routes can be deliberate.

Third: validate the return path and filtering

  • Check source address selection and rp_filter settings.
  • Look for asymmetric routing: outbound via one interface, inbound replies via another.
  • Validate NAT/conntrack behavior; flush carefully if needed.

If you follow this order, you typically avoid the 2 a.m. ritual of “restarting networking until it feels fixed.”
That ritual is how you turn a routing issue into an outage.

Practical tasks: 12+ commands, expected output, and what to do next

Every task below has three parts: a command you can run, what the output means, and the decision you make from it.
This is the difference between debugging and performing interpretive dance at the terminal.

Task 1: Snapshot the rules (with priorities) and spot surprises

cr0x@server:~$ ip -details -statistics rule show
0:	from all lookup local
32764:	from all fwmark 0x1 lookup vpn
32765:	from 192.0.2.10 lookup isp2
32766:	from all lookup main
32767:	from all lookup default

What it means: The kernel will first try the local table. Then any packet marked 0x1 uses table vpn.
Any packet sourced from 192.0.2.10 uses isp2. Everything else goes to main.

Decision: If traffic “randomly” goes to VPN or the wrong ISP, you now know the conditions. If you don’t recognize a rule, find which tool installed it before deleting it.
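
One hedged way to find the owner before deleting anything: search the usual configuration locations for the table name or the mark. The paths below are common on Debian systems but are assumptions; your fleet may keep network config elsewhere, and rules installed purely at runtime leave no file at all.

# Who references the table name or the mark in configuration? (paths are examples)
grep -rn -e 'isp2' -e 'vpn' -e 'fwmark' \
  /etc/network /etc/systemd/network /etc/openvpn /etc/wireguard 2>/dev/null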

Task 2: List routing table names and ensure you’re not guessing table IDs

cr0x@server:~$ cat /etc/iproute2/rt_tables
#
# reserved values
#
255	local
254	main
253	default
0	unspec
#
# local
#
100	isp2
200	vpn

What it means: Table names map to numeric IDs. If scripts refer to “table 200” and someone renames it, you’ve created a puzzle for Future You.

Decision: Use names in automation where possible (lookup vpn), keep this file under config management, and treat table IDs as part of API stability.
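
If a name-to-ID mapping is missing on a host, adding it is a one-liner. A sketch, assuming the drop-in directory that recent iproute2 reads; the ID and name must match whatever your rules and routes already use.

# Declare the mapping once, then refer to the table by name everywhere
echo '100 isp2' | sudo tee /etc/iproute2/rt_tables.d/isp2.conf
ip route show table isp2   # now resolvable by name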

Task 3: Ask the kernel “how would you route this?”

cr0x@server:~$ ip route get 1.1.1.1
1.1.1.1 via 203.0.113.1 dev eth0 src 203.0.113.10 uid 0
    cache

What it means: For that destination, the kernel chose gateway 203.0.113.1 on eth0 and would source packets from 203.0.113.10.

Decision: If your expectation was “use eth1” or “source from 192.0.2.10,” you have a rules/tables mismatch, not an application issue.

Task 4: Test source-based policy explicitly (the common gotcha)

cr0x@server:~$ ip route get 1.1.1.1 from 192.0.2.10
1.1.1.1 via 192.0.2.1 dev eth1 src 192.0.2.10 uid 0
    cache

What it means: With source 192.0.2.10, the kernel now uses eth1. That implies a matching from 192.0.2.10 rule.

Decision: If replies are failing, check that return traffic also uses the same source and that rp_filter isn’t dropping asymmetric paths.

Task 5: Dump the main table and check the default route and metrics

cr0x@server:~$ ip -statistics route show table main
default via 203.0.113.1 dev eth0 metric 100
default via 192.0.2.1 dev eth1 metric 200
203.0.113.0/24 dev eth0 proto kernel scope link src 203.0.113.10
192.0.2.0/24 dev eth1 proto kernel scope link src 192.0.2.10

What it means: Two defaults exist; the lower metric (100) wins when the main table is used.

Decision: If you intended failover, make sure you have health checking (not just metrics). If you intended split traffic, metrics alone won’t do it; you need rules and separate tables.
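
A minimal sketch of the rules-plus-table pattern, reusing the addresses from this example (192.0.2.10 on eth1 behind gateway 192.0.2.1); the priority value is arbitrary as long as it sits above main.

# Per-uplink table: connected subnet plus its own default
ip route add 192.0.2.0/24 dev eth1 src 192.0.2.10 table isp2
ip route add default via 192.0.2.1 dev eth1 table isp2
# Rule: anything sourced from the second uplink's address consults that table
ip rule add from 192.0.2.10 lookup isp2 priority 32765
# Verify the decision for that source
ip route get 1.1.1.1 from 192.0.2.10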

Task 6: Inspect a non-main table (where the “magic” usually hides)

cr0x@server:~$ ip route show table isp2
default via 192.0.2.1 dev eth1
192.0.2.0/24 dev eth1 scope link src 192.0.2.10

What it means: Table isp2 has its own default and connected route. This is the minimal sane policy-routing table: it can reach the gateway and the internet.

Decision: If a table has only a default route but lacks the connected subnet route, gateway and ARP/neighbor resolution can get weird. Add the connected route to that table explicitly; the proto kernel route created by address assignment only lands in the main and local tables, not in yours.

Task 7: Check for blackhole/unreachable routes that silently “work as designed”

cr0x@server:~$ ip route show table vpn
blackhole 10.0.0.0/8
default via 10.8.0.1 dev tun0
10.8.0.0/24 dev tun0 scope link src 10.8.0.2

What it means: Someone intentionally blackholed RFC1918 space in the VPN table. That can prevent accidental hairpinning into corporate networks.

Decision: If your app needs to reach 10.x services and it’s failing, you don’t have a “routing bug,” you have a policy decision. Fix the policy: remove or narrow the blackhole, or adjust rules so that only specific traffic hits that table.
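
A hedged sketch of narrowing that policy instead of deleting it. The 10.20.0.0/16 prefix is a hypothetical placeholder for whatever range the app actually needs; the more specific route wins while the broad blackhole keeps catching everything else.

# Allow only the needed prefix through the tunnel, keep the rest blackholed
ip route add 10.20.0.0/16 via 10.8.0.1 dev tun0 table vpn
ip route show table vpn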

Task 8: Confirm whether rp_filter is dropping your packets

cr0x@server:~$ sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.eth0.rp_filter net.ipv4.conf.eth1.rp_filter
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth1.rp_filter = 1

What it means: Strict reverse path filtering is on. In multi-homed/policy-routing setups, strict mode often drops legitimate asymmetric traffic.

Decision: If you have intentional asymmetry or multiple uplinks, switch to loose mode (2) on relevant interfaces or globally—after you understand the security implications.
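
A sketch of switching to loose mode and persisting it, assuming eth0 and eth1 are the multi-homed interfaces and that you have weighed the anti-spoofing trade-off; the drop-in filename is just an example.

# Runtime change
sudo sysctl -w net.ipv4.conf.all.rp_filter=2
sudo sysctl -w net.ipv4.conf.eth0.rp_filter=2
sudo sysctl -w net.ipv4.conf.eth1.rp_filter=2
# Persist across reboots
printf 'net.ipv4.conf.all.rp_filter = 2\nnet.ipv4.conf.eth0.rp_filter = 2\nnet.ipv4.conf.eth1.rp_filter = 2\n' \
  | sudo tee /etc/sysctl.d/90-rpfilter.conf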

Task 9: Verify the rules are actually matching (source, iif, fwmark)

cr0x@server:~$ ip rule show
0:	from all lookup local
32764:	from all fwmark 0x1 lookup vpn
32765:	from 192.0.2.10 lookup isp2
32766:	from all lookup main
32767:	from all lookup default

What it means: Rules are broad. The fwmark rule matches any marked packet from any source. The source rule matches exactly one IP.

Decision: If too much traffic is going to VPN, stop marking everything. If too little traffic is going to isp2, broaden from to a subnet or add additional from rules for other addresses.
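
If the design calls for the whole secondary subnet to use isp2, a hedged sketch of swapping the single-IP rule for a subnet rule; the priority mirrors the existing one, and adding before deleting avoids a window with no matching rule.

# Add the broader rule first, then remove the narrow one
ip rule add from 192.0.2.0/24 lookup isp2 priority 32765
ip rule del from 192.0.2.10 lookup isp2 priority 32765
ip rule show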

Task 10: Inspect nftables/iptables marking (the hidden steering wheel)

cr0x@server:~$ nft list ruleset | sed -n '1,120p'
table inet filter {
  chain output {
    type filter hook output priority 0; policy accept;
    meta skuid 1001 meta mark set 0x1
  }
}

What it means: Traffic generated by UID 1001 gets marked 0x1 in the OUTPUT hook, which triggers the fwmark routing rule.

Decision: If the wrong app is being steered, fix UID selection, use cgroup marks, or tag only specific destinations/ports. Avoid broad marks that accidentally capture package updates, monitoring, or DNS.
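
A hedged nftables sketch that marks only the traffic you intend to steer. The destination prefix and port are hypothetical placeholders; the table and chain names match the ruleset shown above, and the rule handle comes from the -a listing on your host.

# Replace the broad UID mark with a UID + destination + port match
nft add rule inet filter output meta skuid 1001 ip daddr 198.51.100.0/24 tcp dport 443 meta mark set 0x1
# Find and delete the old broad rule by its handle (handle number is an example)
nft -a list chain inet filter output
nft delete rule inet filter output handle 4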

Task 11: Use tcpdump with interface focus to prove asymmetry

cr0x@server:~$ sudo tcpdump -ni eth0 'host 1.1.1.1 and (tcp or icmp)'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:10:55.120001 IP 203.0.113.10 > 1.1.1.1: ICMP echo request, id 511, seq 1, length 64

What it means: Outbound ICMP is leaving on eth0. If replies return on eth1, you’ve found the asymmetry.

Decision: If you see outbound on one interface and inbound on another, check rp_filter and ensure the return path is pinned with correct source-based rules and per-table routes.

Task 12: Validate neighbor/ARP on the intended egress interface

cr0x@server:~$ ip neigh show dev eth1
192.0.2.1 lladdr 52:54:00:12:34:56 REACHABLE

What it means: The gateway MAC is known and reachable. If it were FAILED or missing, routing might be correct but L2 connectivity isn’t.

Decision: If neighbor resolution fails, stop blaming ip rule. Check VLAN tagging, switch ports, security groups, or ARP filtering settings.

Task 13: Check the local table and rule 0 (because localhost lies)

cr0x@server:~$ ip route show table local | head
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 203.0.113.10 dev eth0 proto kernel scope host src 203.0.113.10
local 192.0.2.10 dev eth1 proto kernel scope host src 192.0.2.10
broadcast 203.0.113.0 dev eth0 proto kernel scope link src 203.0.113.10

What it means: The local table is why you can always reach your own addresses even if your main routing is broken.
It can mask issues during testing (“curl to my IP works!”).

Decision: When debugging, test from another host, or use ip route get to see real egress decisions. Local table success is not proof of external reachability.

Task 14: Inspect conntrack for flows that keep taking the old path

cr0x@server:~$ sudo conntrack -L -p tcp 2>/dev/null | head
tcp      6 431999 ESTABLISHED src=203.0.113.10 dst=93.184.216.34 sport=42412 dport=443 src=93.184.216.34 dst=203.0.113.10 sport=443 dport=42412 [ASSURED] mark=1 use=1

What it means: This flow is established and carries mark=1. Even if you change marking rules, existing connections may persist on their original route.

Decision: If you’re testing routing changes, either use new connections (different source port) or surgically delete conntrack entries for the test flow. Avoid flushing conntrack on production unless you enjoy explaining it.
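
A hedged sketch of removing just the test flow, using the destination from the entry above; list what the filter matches before deleting anything on a busy box.

# Show what would match first
sudo conntrack -L -p tcp -d 93.184.216.34 --dport 443
# Then delete only those entries
sudo conntrack -D -p tcp -d 93.184.216.34 --dport 443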

Task 15: Validate that the right source address is chosen (and fix with ip route if needed)

cr0x@server:~$ ip route show table isp2 default
default via 192.0.2.1 dev eth1

What it means: The default route does not specify src. The kernel may choose a source address you didn’t intend if multiple addresses exist.

Decision: In multi-address environments, set src explicitly on key routes in policy tables to make behavior stable and debuggable.
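
A sketch of pinning the preferred source on the policy table's default route, reusing this example's addresses; replace keeps the command idempotent.

ip route replace default via 192.0.2.1 dev eth1 src 192.0.2.10 table isp2
ip route show table isp2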

Joke #1: Policy routing is like office politics: the official org chart exists, but the real decisions happen in the side rules.

Three corporate mini-stories (how grown-ups break routing)

Incident #1: The wrong assumption (“main table means default behavior”)

A mid-sized SaaS company migrated from a single ISP to dual uplinks: primary fiber and a backup LTE router.
The host fleet had multiple IPs per box because of legacy allowlists. Everything looked fine in a quick test:
ip route showed the fiber default with a lower metric. Great. Ship it.

The next morning, a subset of outbound API calls started timing out. Not all. Not even most.
Only the calls from a specific service, running under a dedicated system user, and only to a handful of partner networks.
The team chased DNS. They chased MTU. They chased the partner’s firewall. The partner chased them back.

The actual problem was a single ip rule inserted months earlier by a “temporary” VPN client:
from all fwmark 0x1 lookup vpn. The service user’s traffic was being marked by an old nftables rule that no one remembered.
That traffic never consulted the main table. It went into the VPN table, which had a blackhole route for one of the partner’s RFC1918 ranges used behind NAT.

The wrong assumption was subtle: “if ip route looks right, routing is right.” It isn’t.
ip route without a table argument or a get query shows only the default universe: the main table.
The fix was boring: remove the stale marking rule, re-scope VPN steering to only the intended destinations, and add a routing regression test to CI that runs ip route get with service sources.

Postmortem action item: “No policy rule goes into production without an owner and a comment in config management.” That’s not bureaucracy; that’s memory.

Incident #2: The optimization that backfired (“let’s mark everything for performance”)

A large enterprise had a Debian-based proxy tier with two exits: a cheap internet link for bulk traffic and a premium link for latency-sensitive API calls.
Someone had a clever idea: mark traffic by process UID and let policy routing handle the rest. No need for complex proxy ACLs.
It worked. It also encouraged more “clever.”

The optimization was to mark all outbound traffic from a group of services to use the premium link, under the argument that it reduced tail latency.
The nftables rule was broad: “if UID in this range, set mark 0x1.”
That included not only the service, but the update agent, a metrics shipper, and a TLS certificate renewer running under the same account.

For a while, nothing broke. Then a routing flap occurred on the premium link during a maintenance window.
The failover design relied on metrics in the main table, but the marked traffic bypassed main entirely and went to a policy table with a single default route.
When the premium gateway became unreachable, the marked traffic didn’t fail over. It just stalled.
Monitoring looked weird because unmarked traffic was fine and marked traffic died in silence.

The fix wasn’t to abandon policy routing. The fix was to respect failure modes:
provide a backup default in the policy table with a higher metric, or use a rule that falls back to main when the premium route is unavailable.
And stop using UID marking as a blunt instrument; match on destination prefixes or ports where possible.

Joke #2: If you “optimize” routing without a failure plan, you’ve invented a new way for packets to take a long lunch break.

Incident #3: The boring practice that saved the day (“route get as a standard test”)

A fintech ran Debian hosts with strict compliance requirements: all database traffic must go over a private network, everything else over internet.
They had multiple NICs, multiple VLANs, and a VPN for admin access. That’s a lot of moving parts.
The network team didn’t rely on tribal knowledge; they codified routing behavior as tests.

During an OS upgrade cycle, a change in network configuration management inadvertently swapped a table ID: vpn moved from 200 to 201 on some hosts.
The rules still referenced lookup 200 in a legacy script, so those machines had a rule that pointed to an empty table.
The kernel fell through to main for admin traffic, which would have looked “fine” right up until the security team noticed admin connections arriving from the wrong egress IP and flagged it.

The saving grace was boring: their deployment pipeline ran a suite of route assertions on each host after network changes.
It executed ip route get checks for representative destinations with representative sources, including admin subnets, partner APIs, and private database ranges.
The assertion for admin traffic failed immediately on affected hosts. No outage, no mystery, just a red light.

They rolled back, fixed the scripts to use table names instead of numeric IDs, and kept the tests.
The system stayed complicated. The behavior stayed predictable. That’s the real win.
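
A minimal sketch of that kind of assertion, as a shell check a pipeline could run after network changes. The destinations, sources, and expected interfaces below are placeholders taken from this article's examples, not the fintech's real values.

#!/bin/sh
# Fail loudly if the kernel's routing decision drifts from expectations.
assert_route() {
    dest="$1"; src="$2"; want_dev="$3"
    got=$(ip route get "$dest" from "$src" 2>/dev/null)
    case "$got" in
        *"dev $want_dev "*) echo "OK   $dest from $src -> $want_dev" ;;
        *) echo "FAIL $dest from $src -> expected $want_dev, got: $got"; exit 1 ;;
    esac
}

assert_route 1.1.1.1 203.0.113.10 eth0   # default egress
assert_route 1.1.1.1 192.0.2.10   eth1   # second uplink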

Common mistakes: symptom → root cause → fix

1) “Traffic exits the wrong interface, but ip route looks correct”

Symptom: ip route shows the expected default, but packets leave via another NIC.

Root cause: A policy rule sends that traffic to a different table (fwmark, source-based, iif-based).

Fix: Use ip rule show and ip route get DEST from SRC. Fix the winning rule or the consulted table. Stop trusting the main table view.

2) “Outbound works, inbound replies are dropped (or vice versa)”

Symptom: SYNs leave, SYN-ACKs never complete, or ICMP replies vanish.

Root cause: Asymmetric routing plus rp_filter=1 (strict) or upstream anti-spoofing.

Fix: Pin return path with source-based rules and per-table connected routes; set rp_filter=2 where appropriate; ensure correct src on policy routes.

3) “VPN split tunnel leaks traffic or hijacks everything”

Symptom: Some traffic unexpectedly goes through VPN, or VPN-intended traffic goes direct.

Root cause: Overbroad mark rule, wrong rule priority, or missing specific routes in VPN table.

Fix: Narrow marking to destinations/ports; ensure VPN rule priority is above main; verify VPN table has explicit routes for the tunneled prefixes and a deliberate default strategy.

4) “A policy table default exists but nothing works through it”

Symptom: Packets select the right table but never reach gateway.

Root cause: Missing connected route in that table, or gateway unreachable on that interface, or ARP/neighbor issues.

Fix: Add the link route in that table; verify ip neigh; confirm interface IP and L2 connectivity.

5) “After fixing rules, behavior doesn’t change for active connections”

Symptom: New flows behave correctly, old flows keep failing or keep using the old path.

Root cause: Conntrack entries preserve state and marks for established flows.

Fix: Test with new connections; delete specific conntrack entries; consider reducing timeouts for impacted flows during incident response rather than flushing globally.

6) “Packets routed correctly, but source IP is wrong”

Symptom: Traffic egresses the intended interface but uses an unexpected source address, causing upstream drops.

Root cause: Source address selection chooses a different address on that interface or from another interface due to route scope and address labels.

Fix: Specify src on critical routes in policy tables; ensure each interface has correct primary address; verify with ip route get ... from ....

7) “Everything works until a link failure, then only some traffic fails over”

Symptom: Unmarked traffic fails over fine; marked or source-routed traffic stalls.

Root cause: Policy tables lack a secondary default or lack health-aware failover; main table metrics don’t apply to policy table lookups.

Fix: Build failover inside the policy tables (secondary default with higher metric) or implement link-state-aware switching with explicit rule changes.
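
A hedged sketch of a secondary default inside a policy table, reusing this article's example gateways; the metric values are placeholders, and link-down failover still needs health checking if the gateway can die while the link stays up.

# The table already has its primary default via eth1; add a connected route
# and a higher-metric backup default via eth0
ip route add 203.0.113.0/24 dev eth0 src 203.0.113.10 table isp2
ip route add default via 203.0.113.1 dev eth0 metric 200 table isp2
ip route show table isp2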

8) “Routing works interactively, fails in production service”

Symptom: Your manual curl works; the daemon fails.

Root cause: UID-based marking, network namespace differences, or service binds to a specific source IP.

Fix: Check nftables mark rules on OUTPUT; check the service unit’s user and network namespace; validate the service’s bind settings.

Checklists / step-by-step plans you can hand to on-call

Checklist A: “Traffic takes the wrong exit” (15 minutes, no changes yet)

  1. Run rule snapshot.

    cr0x@server:~$ ip rule show
    0:	from all lookup local
    32764:	from all fwmark 0x1 lookup vpn
    32765:	from 192.0.2.10 lookup isp2
    32766:	from all lookup main
    32767:	from all lookup default
    

    Decision: Identify the likely matching rule for the traffic class (source IP? fwmark?).

  2. Run route decision query for the exact destination, then with the suspected source.

    cr0x@server:~$ ip route get 93.184.216.34
    93.184.216.34 via 203.0.113.1 dev eth0 src 203.0.113.10 uid 0
        cache
    
    cr0x@server:~$ ip route get 93.184.216.34 from 192.0.2.10
    93.184.216.34 via 192.0.2.1 dev eth1 src 192.0.2.10 uid 0
        cache
    

    Decision: Confirm whether the wrong behavior is source-dependent. If yes, focus on source rules and per-table routes.

  3. Dump the consulted table.

    cr0x@server:~$ ip route show table isp2
    default via 192.0.2.1 dev eth1
    192.0.2.0/24 dev eth1 scope link src 192.0.2.10
    

    Decision: If the table is missing essentials (connected route, correct gateway), fix the table; do not “fix” by changing main.

  4. Prove packet path with tcpdump on both interfaces (short capture).

    cr0x@server:~$ sudo tcpdump -ni eth0 'host 93.184.216.34 and tcp'
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
    
    cr0x@server:~$ sudo tcpdump -ni eth1 'host 93.184.216.34 and tcp'
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
    

    Decision: If you see asymmetry, jump to rp_filter and return-path rules.

Checklist B: “VPN split tunnel is misbehaving”

  1. Identify the VPN table and rule.

    cr0x@server:~$ ip rule show | grep -i vpn
    32764:	from all fwmark 0x1 lookup vpn
    

    Decision: If VPN selection is mark-based, you must audit marking. If it’s source-based, audit application bind/source selection.

  2. Inspect marking rules.

    cr0x@server:~$ nft list ruleset | grep -n 'mark set' | head
    12:    meta skuid 1001 meta mark set 0x1
    

    Decision: If the match is too broad, narrow it. Mark only what you can justify in writing.

  3. Confirm VPN table routes include what you intend to tunnel.

    cr0x@server:~$ ip route show table vpn
    blackhole 10.0.0.0/8
    default via 10.8.0.1 dev tun0
    10.8.0.0/24 dev tun0 scope link src 10.8.0.2
    

    Decision: If you need to reach some private prefixes, remove or narrow blackholes and add explicit routes.

Checklist C: “Failover doesn’t work for some traffic”

  1. Determine which table that traffic uses (rule match).

    cr0x@server:~$ ip -details rule show
    0:	from all lookup local
    32764:	from all fwmark 0x1 lookup vpn
    32765:	from 192.0.2.10 lookup isp2
    32766:	from all lookup main
    32767:	from all lookup default
    

    Decision: If it’s not using main, main table metrics won’t save you.

  2. Ensure the policy table has a secondary default route (if your design requires it).

    cr0x@server:~$ ip route show table isp2 | grep '^default'
    default via 192.0.2.1 dev eth1
    

    Decision: If there’s only one default and that gateway fails, that traffic will stall. Add a backup path in that table or implement rule changes on link failure.

FAQ

1) Why does ip route lie to me?

It’s not lying; it’s showing the main table by default. Policy routing might be sending your packet to a different table.
Use ip rule show and ip route get DEST from SRC to see the real decision.

2) What’s the difference between ip route show and ip route get?

show lists routes in a table. get asks the kernel to resolve a specific destination (and optional source),
returning the chosen gateway, interface, and source IP. In incidents, get is your truth serum.

3) Can I have two default routes in Debian 13?

Yes. The kernel chooses based on route metrics within a table. But if policy rules steer traffic into a different table, that table’s defaults apply instead.
Dual defaults are normal; unmanaged dual defaults are chaos.

4) Why does policy routing break only for one source IP?

Because many policies are source-based: ip rule add from 192.0.2.10 lookup isp2.
Apps binding to a specific source, or the kernel choosing a different source, can change which rule matches.

5) Do I need to disable rp_filter for policy routing?

Not always, but strict mode (1) frequently conflicts with intentional asymmetry in multi-homed setups.
Loose mode (2) is a common compromise. Keep security in mind: rp_filter helps against spoofing, so don’t change it casually.

6) Why did my fix work for new connections but not old ones?

Conntrack keeps state for established flows, including marks in many setups. Old flows may stick to old routing decisions.
Test with new connections or surgically remove conntrack entries for specific flows.

7) What’s the cleanest way to implement split tunneling?

Put only the tunneled prefixes in the VPN table and keep a deliberate default policy (either no default, or a default only for marked traffic).
Then steer only the intended traffic using specific rules (destination-based where possible; marks when necessary).

8) How do I know which service is causing fwmark-based routing?

Look at nftables rules that set marks (often matching UID/cgroup). Then map UID to a service.
Also inspect conntrack entries for mark= to see which flows are marked.

9) Should I use numeric table IDs directly in scripts?

Avoid it when you can. Use named tables and keep /etc/iproute2/rt_tables stable.
Numeric IDs work, but they drift easily across fleets and are hard to audit in postmortems.

10) Is it okay to “just flush” rules and routes during an incident?

On a production box that carries real traffic: no, not as a first move.
You can drop SSH, break service bindings, and erase the evidence you needed to diagnose. Snapshot first, then change surgically.

Conclusion: next steps that reduce future pain

Policy routing on Debian 13 is only “black magic” when you treat it like folklore.
Treat it like a program: rules are control flow, tables are data, and ip route get is your debugger.
The kernel is consistent. Your configuration might not be.

Practical next steps that pay rent:

  • Standardize on named routing tables and keep /etc/iproute2/rt_tables under config management.
  • Add a minimal routing regression test: a handful of ip route get checks for critical destinations and sources.
  • Audit ip rule and nftables marking rules quarterly; delete “temporary” rules before they become policy.
  • Decide explicitly how failover works in each policy table. If you didn’t design failure, you designed surprise.
  • Document the owner and purpose of each non-default rule. If it has no owner, it has no right to exist.

Do that, and “routing black magic” turns back into what it always was: a set of deterministic lookups you can reason about,
even when the pager is yelling at you.
