Split DNS is one of those “should be boring” problems that turns into a weekend because someone rebooted a laptop, the VPN reconnected,
and suddenly git.company.internal resolves to the public internet (or worse, doesn’t resolve at all).
If you’ve ever watched a production deploy stall because a build agent lost internal DNS mid-run, you know the feeling: it’s not dramatic,
it’s just expensive.
Debian 13 can do split DNS cleanly. The trick is picking one DNS control plane and making everything else feed it.
This piece gives you a setup that survives reboot, Wi‑Fi changes, VPN reconnects, and “helpful” DHCP servers.
What you are building (and what you are not)
Split DNS means: certain domains resolve using specific DNS servers, while everything else uses your “normal” DNS.
Typical example:
- *.corp.internal and *.svc.cluster.local resolve through the VPN’s DNS servers
- everything else resolves through your LAN DNS (or public resolvers)
You are not building a DNS server farm. You’re building a deterministic client-side resolver configuration that:
(1) routes the right names to the right resolvers, (2) doesn’t leak internal queries to public DNS, and (3) stays stable through reboots.
Your enemy isn’t DNS complexity. It’s conflicting DNS managers. Debian systems often have multiple actors tugging on /etc/resolv.conf:
NetworkManager, systemd-resolved, VPN clients, DHCP clients, and occasionally someone’s dotfile from 2014.
Interesting facts and short history (because context prevents pain)
- Split DNS predates VPN hype: enterprises used it early for split-horizon zones where internal names should never resolve externally.
- The /etc/resolv.conf format is intentionally simple: so simple it can’t express per-domain routing. Split DNS requires something smarter.
- systemd-resolved’s “routing domains” (the ~domain concept) are designed specifically to express “this suffix goes to that link’s DNS”.
- The classic “3 nameserver limit” in resolv.conf is not just folklore: many resolvers still effectively behave that way, and ordering can bite you.
- Negative caching (remembering NXDOMAIN) can make DNS look haunted: fix the server, and clients still fail until caches expire.
- DNS search domains are older than most modern VPN tooling, and they’re frequently abused. A long search list increases query load and surprises.
- “DNS leaks” are often not about privacy drama—they’re about reliability. Internal zones queried via public DNS return NXDOMAIN and break apps.
- Many corporate VPNs historically pushed a full DNS takeover because older clients couldn’t do per-domain routing; that habit still lingers.
One paraphrased idea from Werner Vogels (Amazon CTO): everything fails, all the time; design systems assuming failure is normal.
Split DNS is a tiny version of that philosophy—design for interface flaps and reconnects.
Design choices that actually hold up
Pick one owner for DNS decisions
On Debian 13, the cleanest approach is: let systemd-resolved be the resolver policy engine.
Let NetworkManager (or your VPN client) feed it link-specific DNS servers and domains.
Then point /etc/resolv.conf at resolved’s stub.
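Under the assumption that systemd-resolved is installed but not yet the owner, the baseline is a short sequence (a sketch; verify the stub path on your system before symlinking):

```shell
# Sketch of the reboot-proof baseline (run as root; paths per Debian defaults)
systemctl enable --now systemd-resolved

# Point resolv.conf at resolved's stub. The stub file lives under /run and is
# regenerated on boot, so the symlink is the part that survives reboots.
ln -sf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf

# Sanity check: mode should report "stub"
resolvectl status | grep 'resolv.conf mode'
```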
What you should avoid: a “choose-your-own-adventure” resolver stack where NetworkManager writes /etc/resolv.conf,
the VPN overwrites it, and a post-up script tries to patch it back. It works right up until it doesn’t—and it usually fails after reboot.
Understand the two modes of resolved
- Stub listener: your system points to 127.0.0.53 (the local stub), and resolved does the upstream routing and caching.
- Foreign / direct resolv.conf: apps query upstream servers directly. This is where split DNS dies quietly.
VPN should publish domains, not just servers
A VPN that only pushes DNS servers but no “which domains belong to me” is asking for a fight.
You want either:
- VPN interface gets routing domains like ~corp.internal, ~company.local
- or at least search domains (less precise, more risk)
DNS priority is not a vibe
DNS selection has rules. Resolvers pick servers based on link, route, and domain match. If you don’t set priorities,
the “winner” can change when interface metrics change (Wi‑Fi roaming is great at that).
Joke #1: DNS is the only system where “it’s cached” is accepted as both an explanation and an alibi.
Fast diagnosis playbook
When split DNS breaks, don’t start by editing files. Start by answering three questions quickly:
which resolver is active, which link owns the domain, and what query path the application is using.
First: Is systemd-resolved actually in charge?
- Check if /etc/resolv.conf points to the stub
- Check resolvectl status for link-specific DNS
Second: Is the domain routed to the VPN link?
- resolvectl query git.corp.internal and verify it uses the VPN DNS server
- Confirm the VPN link has ~corp.internal (routing domain), not just a generic search domain
Third: Is your app bypassing the OS resolver?
- Browsers, container runtimes, and some SDKs can do their own DNS or caching
- Check with getent hosts versus dig and compare results
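A quick way to run that comparison, assuming dig from the dnsutils package and the example names and servers used in this article:

```shell
# Three views of the same name; disagreements tell you which layer is lying
getent hosts git.corp.internal               # NSS + libc path (what most apps use)
resolvectl query git.corp.internal           # resolved's own routing and cache
dig +short git.corp.internal @10.20.30.53    # raw query straight at the VPN DNS
```

If getent and resolvectl agree but the app disagrees, the app is doing its own resolution.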
Quick bottleneck sniff test
- If resolvectl query is fast but the app is slow: suspect app-level caching, proxy settings, or DoH/DoT inside the app.
- If resolvectl query is slow: suspect an unreachable DNS server, MTU issues on the VPN, or blocked UDP/TCP 53.
- If only some domains fail: suspect routing domains, split-horizon conflicts, or stale negative cache.
Prerequisites and baseline checks
Debian 13 typically ships with systemd and can run systemd-resolved. NetworkManager is common on desktops and laptops; on servers you might
use systemd-networkd or ifupdown. This guide assumes you’re using NetworkManager or you have a VPN interface you can configure.
Decide your intended behavior before you touch anything:
- Do you want only internal domains to use VPN DNS (preferred)?
- Or do you want all DNS to go via VPN while connected (sometimes required by policy)?
- Do you need to support both IPv4 and IPv6 DNS servers?
Preferred setup: systemd-resolved as the DNS control plane
1) Ensure resolved is running
If resolved isn’t active, everything else becomes duct tape. Enable it and make it boring.
2) Point /etc/resolv.conf at the stub
You want /etc/resolv.conf to be a symlink to systemd’s stub resolv.conf. This is the anchor that survives reboot.
If you let other tools write it, you’ll get a different DNS personality every Monday.
3) Let NetworkManager talk to resolved
NetworkManager can integrate with systemd-resolved. When it does, DNS servers and domains become per-link properties, which is exactly what split DNS needs.
4) Attach routing domains to the VPN link
The magic is the routing domain syntax: ~corp.internal. The tilde means “route queries for this domain to this link’s DNS”.
Without it, you’re back to global DNS roulette.
NetworkManager VPN split DNS: WireGuard and OpenVPN patterns
WireGuard via NetworkManager
WireGuard is clean, but the ecosystem around it varies. If you manage it through NetworkManager, set DNS and domains on the WireGuard connection profile.
Some deployments also push DNS via scripts; avoid that if possible—keep one authority.
OpenVPN via NetworkManager
OpenVPN commonly pushes “dhcp-option DNS” and “dhcp-option DOMAIN”. NetworkManager can translate those into resolved link settings.
The key is ensuring domains become routing domains (or at least search domains), and that DNS priority doesn’t allow the LAN to steal internal names.
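On the server side, the relevant push directives look roughly like this (a sketch; the addresses and zone names are this article’s example values, not defaults):

```text
# OpenVPN server config fragment: push the DNS servers *and* the domains they own
push "dhcp-option DNS 10.20.30.53"
push "dhcp-option DNS 10.20.30.54"
push "dhcp-option DOMAIN corp.internal"
push "dhcp-option DOMAIN svc.cluster.local"
```

Pushing the DOMAIN options is what lets the client scope those servers instead of adopting them globally.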
Joke #2: VPN DNS is like a corporate org chart—technically documented, emotionally unpredictable.
Optional: local dnsmasq for stubborn edge cases
I prefer systemd-resolved alone for Debian clients. But there are edge cases:
- You need conditional forwarding rules that a specific legacy application expects.
- You want to mirror a long-standing corporate pattern: “send corp.internal to 10.x, everything else to ISP”.
- You’re dealing with an environment where VPN tooling can’t reliably communicate domains to resolved.
In those cases, dnsmasq can act as a local policy forwarder, and resolved can forward to it—or you can skip resolved and let NetworkManager feed dnsmasq.
The risk is increasing complexity. Complexity is fine when it’s written down and stable. It’s not fine when it’s accidental.
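If you do go the dnsmasq route, the conditional-forwarding rules are short (a sketch; server addresses are the article’s example values):

```text
# /etc/dnsmasq.d/split.conf: send internal zones to VPN DNS, everything else upstream
server=/corp.internal/10.20.30.53
server=/corp.internal/10.20.30.54
server=/svc.cluster.local/10.20.30.53
# default upstream for everything else
server=192.168.1.1
# don't read /etc/resolv.conf (avoids loops if resolv.conf points back at us)
no-resolv
```

Write rules like these down next to the reason they exist; that is the difference between deliberate and accidental complexity.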
Practical tasks: commands, outputs, and decisions
These tasks are written like you’re on-call: you run a command, interpret output, then decide what to do next.
You don’t “try stuff.” You change one variable at a time.
Task 1: Confirm Debian is using systemd-resolved
cr0x@server:~$ systemctl status systemd-resolved --no-pager
● systemd-resolved.service - Network Name Resolution
Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; preset: enabled)
Active: active (running) since Mon 2025-12-29 08:41:12 UTC; 2h 13min ago
Docs: man:systemd-resolved.service(8)
Meaning: “active (running)” is non-negotiable for this design.
Decision: If it’s inactive/disabled, enable it before touching VPN DNS.
Task 2: Verify what owns /etc/resolv.conf
cr0x@server:~$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 39 Dec 29 08:41 /etc/resolv.conf -> /run/systemd/resolve/stub-resolv.conf
Meaning: This symlink is the “reboot-proof” part.
Decision: If it points somewhere else (or is a regular file), fix it; otherwise you’re debugging a moving target.
Task 3: Inspect resolved’s global and per-link DNS view
cr0x@server:~$ resolvectl status
Global
Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Current DNS Server: 192.168.1.1
DNS Servers: 192.168.1.1
DNS Domain: lan
Link 2 (enp3s0)
Current Scopes: DNS
Protocols: +DefaultRoute
Current DNS Server: 192.168.1.1
DNS Servers: 192.168.1.1
DNS Domain: lan
Link 5 (wg0)
Current Scopes: DNS
Protocols: -DefaultRoute
DNS Servers: 10.20.30.53 10.20.30.54
DNS Domain: ~corp.internal ~svc.cluster.local
Meaning: The VPN link has routing domains (~corp.internal) and is not the default route for DNS.
Decision: If the VPN link lacks routing domains, split DNS won’t happen. Add them at the connection level.
Task 4: Confirm a specific internal name routes to the VPN DNS
cr0x@server:~$ resolvectl query git.corp.internal
git.corp.internal: 10.20.40.12 -- link: wg0
(git.corp.internal)
Meaning: The query went via link wg0.
Decision: If it says link: enp3s0 or “No appropriate name servers”, fix routing domains or DNS reachability.
Task 5: Compare libc resolution path (what most apps use)
cr0x@server:~$ getent hosts git.corp.internal
10.20.40.12 git.corp.internal
Meaning: NSS + libc are resolving correctly.
Decision: If resolvectl query works but getent fails, suspect NSS misconfiguration or an app bypassing the system resolver.
Task 6: Check NetworkManager is using resolved (not writing resolv.conf directly)
cr0x@server:~$ nmcli general status
STATE CONNECTIVITY WIFI-HW WIFI WWAN-HW WWAN
connected full enabled enabled enabled enabled
cr0x@server:~$ nmcli -t -f RUNNING,VERSION,STATE general
running:yes
version:1.48.0
state:connected
Meaning: NM is alive; next step is verifying it’s configured to use systemd-resolved.
Decision: If NM isn’t running, your DNS changes must be done via systemd-networkd or the VPN tooling directly.
Task 7: Inspect a VPN connection’s DNS and domains in NetworkManager
cr0x@server:~$ nmcli -f NAME,TYPE,DEVICE connection show --active
NAME TYPE DEVICE
Office-WG wireguard wg0
Home-LAN ethernet enp3s0
cr0x@server:~$ nmcli connection show "Office-WG" | sed -n '1,120p'
connection.id: Office-WG
connection.type: wireguard
connection.interface-name: wg0
ipv4.method:                            manual
ipv4.dns: 10.20.30.53,10.20.30.54
ipv4.dns-search: corp.internal,svc.cluster.local
ipv4.ignore-auto-dns: yes
ipv6.method: disabled
Meaning: NM has DNS and search domains. With resolved integration, these can become routing domains.
Decision: If ipv4.ignore-auto-dns is no on LAN, DHCP may override your intentions; tune priorities and ignore-auto-dns where appropriate.
Task 8: Set VPN-specific DNS servers and search domains (and persist them)
cr0x@server:~$ sudo nmcli connection modify "Office-WG" ipv4.dns "10.20.30.53 10.20.30.54"
cr0x@server:~$ sudo nmcli connection modify "Office-WG" ipv4.dns-search "corp.internal svc.cluster.local"
cr0x@server:~$ sudo nmcli connection modify "Office-WG" ipv4.ignore-auto-dns yes
cr0x@server:~$ sudo nmcli connection down "Office-WG" && sudo nmcli connection up "Office-WG"
Connection 'Office-WG' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/17)
Connection 'Office-WG' successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/19)
Meaning: You made the VPN’s DNS explicit and persistent in NM.
Decision: If after reconnect resolvectl status still doesn’t show routing domains, check NM’s resolved integration or use a resolved drop-in.
Task 9: Confirm resolved sees the VPN DNS servers after reconnect
cr0x@server:~$ resolvectl status wg0
Link 5 (wg0)
Current Scopes: DNS
Protocols: -DefaultRoute
DNS Servers: 10.20.30.53 10.20.30.54
DNS Domain: ~corp.internal ~svc.cluster.local
Meaning: Perfect: link-scoped DNS servers and routing domains.
Decision: If domains are missing the tilde, you can still function via search domains, but expect leaks and weirdness. Fix it properly.
Task 10: Force routing domains on the VPN link via resolved (when the VPN client can’t)
Note: drop-ins under /etc/systemd/resolved.conf.d set global options only; per-link domains can’t live there. For live triage, set them directly on the link:
cr0x@server:~$ sudo resolvectl domain wg0 '~corp.internal' '~svc.cluster.local'
cr0x@server:~$ sudo resolvectl dns wg0 10.20.30.53 10.20.30.54
Meaning: You applied settings live. Note: resolvectl domain changes are not automatically persistent across reboot unless your network manager re-applies them.
Decision: Use this for triage; then move the configuration into NetworkManager or systemd-networkd so it survives reboot.
Task 11: Validate you’re not leaking internal queries to public DNS
cr0x@server:~$ resolvectl query doesnotexist.corp.internal
doesnotexist.corp.internal: resolve call failed: 'does not exist' (NXDOMAIN) -- link: wg0
Meaning: NXDOMAIN came from the VPN link. Good. That means the query went where you intended.
Decision: If NXDOMAIN comes from your LAN DNS, your split routing isn’t working and internal queries are leaking.
Task 12: Check for a “helpful” local stub or conflicting listener on port 53
cr0x@server:~$ sudo ss -ltnup | grep ':53 '
udp UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=620,fd=13))
tcp LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=620,fd=14))
Meaning: resolved owns the stub listener. No surprise local bind.
Decision: If you see dnsmasq/unbound also binding 127.0.0.1:53, decide who wins and disable the other. “Both” is not a plan.
Task 13: Confirm nsswitch is sane for host resolution
cr0x@server:~$ grep -E '^(hosts|networks):' /etc/nsswitch.conf
hosts: files mdns4_minimal [NOTFOUND=return] dns
networks: files
Meaning: DNS is in the chain; local files take precedence.
Decision: If dns is missing, or you have exotic modules, fix NSS before blaming resolved.
Task 14: Observe DNS behavior during a VPN reconnect
cr0x@server:~$ journalctl -u systemd-resolved -n 80 --no-pager
Dec 29 10:42:11 server systemd-resolved[620]: Switching to fallback DNS server 192.168.1.1#53.
Dec 29 10:42:25 server systemd-resolved[620]: Using degraded feature set UDP instead of UDP+EDNS0 for DNS server 10.20.30.53.
Dec 29 10:42:29 server systemd-resolved[620]: DNS server 10.20.30.53#53 connected via wg0.
Meaning: resolved is adapting to server reachability and EDNS0 capability. This is often a hidden reason for “it was fast yesterday.”
Decision: If you see constant switching, investigate MTU, packet loss, or firewall rules on the VPN path.
Three corporate mini-stories (because scars teach)
Mini-story 1: The incident caused by a wrong assumption
A mid-size company rolled out a new VPN concentrator. The network team tested it with a Windows client, confirmed internal sites worked,
then announced victory. Linux users were told to “just use OpenVPN” and given a config that pushed DNS servers but no domains.
The wrong assumption: “If we push internal DNS servers, internal names will resolve.” On Linux clients, that often means those DNS servers become
globally preferred. Then when users disconnected, the stale config stuck around—or worse, the resolver still tried to query unreachable 10.x servers.
People experienced it as “DNS randomly dies after VPN.”
The outage wasn’t a single big bang. It was a slow bleed: CI runners would fail to pull dependencies because public resolution worked, but internal artifact
names didn’t. Developers re-ran jobs until they passed. You can guess how that went for capacity planning.
The fix was not heroic. They stopped letting the VPN overwrite global DNS, and instead published routing domains for corp.internal and
svc.cluster.local. Linux clients then sent only those suffixes to the VPN DNS. Everything else stayed on LAN DNS.
The lesson: split DNS is not “optional polish.” It’s correctness. VPN DNS without domains is like handing out phone numbers without area codes.
Mini-story 2: The optimization that backfired
A different org wanted faster builds. Someone noticed DNS latency to the corporate resolvers over VPN was ~60–120ms during peak hours.
The proposed optimization: “Let’s cache everything aggressively on laptops using a local caching resolver and bump TTLs at the internal DNS.”
It sounded reasonable—until it wasn’t.
They rolled out a local caching layer, and it did improve median resolution time. But split DNS boundaries became blurry.
Some internal records were short-lived by design because services moved behind load balancers. The caching layer dutifully kept old answers.
When a service failed over, clients continued calling the dead endpoint for minutes.
The backfire moment was subtle: monitoring showed the service was healthy and the failover worked. Only a subset of clients failed.
The resolver caches were the culprit. A “faster DNS” project became a “why are only remote engineers broken” incident.
The rollback was painful because everyone had become dependent on the speed-up. The long-term fix was to respect TTLs, cache responsibly,
and stop treating DNS like a static configuration database. It isn’t. It’s a control plane.
Mini-story 3: The boring but correct practice that saved the day
A global company had a strict rule for endpoints: one DNS owner (system resolver), one configuration path (NetworkManager profiles),
and no post-connect scripts that edit /etc/resolv.conf. Engineers grumbled because it felt bureaucratic.
Then a VPN gateway patch introduced a bug where the pushed DNS server list occasionally included an unreachable resolver first.
Some clients would time out on the first server and fall back; others would stall due to retry logic. It could have been a mess.
Because they had a consistent design, diagnosis was fast. On affected machines, resolvectl status clearly showed the bad DNS server on the VPN link.
They could deploy a one-line NetworkManager profile update to reorder DNS servers and reduce the retry pain.
No heroics, no hand-editing files on laptops, no “run this script after you connect.” Just a boring config change that applied on reconnect.
The incident stayed contained, and the helpdesk didn’t melt.
Common mistakes: symptom → root cause → fix
- Symptom: DNS works until reboot, then internal names fail.
  Root cause: /etc/resolv.conf is a regular file or managed by the wrong tool; resolved isn’t the active stub.
  Fix: Enable resolved and symlink /etc/resolv.conf to /run/systemd/resolve/stub-resolv.conf.
- Symptom: While on VPN, public DNS stops working or gets slow.
  Root cause: VPN DNS servers are being used as global resolvers; they either block recursion or have bad egress.
  Fix: Use routing domains (~corp.internal) so only internal domains hit VPN DNS; keep LAN/public resolvers for everything else.
- Symptom: Internal domains resolve to public IPs (wrong service, wrong cert, confusing errors).
  Root cause: Split-horizon mismatch: queries for internal zones are going to public resolvers first.
  Fix: Ensure the internal suffix is routed to the VPN link; verify with resolvectl query showing link: wg0.
- Symptom: resolvectl query works, but a browser or Java app fails.
  Root cause: App bypassing the OS resolver (DoH), pinned DNS, or stale app cache.
  Fix: Confirm with getent hosts; disable app DoH or align it with corporate policy; restart the app after DNS changes.
- Symptom: Only some internal names fail, especially short hostnames.
  Root cause: Missing search domain on the VPN link, or conflicting search order between LAN and VPN.
  Fix: Add explicit domains; prefer FQDNs; reduce and order search domains carefully.
- Symptom: Random timeouts when resolving internal names over VPN.
  Root cause: MTU/fragmentation issues on the VPN path, or the DNS server requires TCP fallback and it’s blocked.
  Fix: Check resolved logs for degraded mode; test TCP/53 reachability; fix MTU or firewall rules.
- Symptom: DNS queries go to the wrong local service on port 53.
  Root cause: Another resolver (dnsmasq/unbound) is listening, or a container network is hijacking DNS.
  Fix: Decide on one local listener; stop the other; confirm with ss -ltnup.
Checklists / step-by-step plan
Checklist A: Clean reboot-proof baseline
- Enable and start systemd-resolved.
- Symlink /etc/resolv.conf to resolved’s stub resolv.conf.
- Confirm resolvectl status shows resolv.conf mode: stub.
- Pick one network manager: NetworkManager on desktops, systemd-networkd on servers. Don’t mix casually.
Checklist B: VPN split DNS that doesn’t leak
- Set VPN DNS servers explicitly (avoid “automatic if it works”).
- Attach routing domains to the VPN link: ~corp.internal, ~svc.cluster.local.
- Ensure the VPN link is not the default DNS route unless policy requires it.
- Validate with resolvectl query showing link: wg0 for internal domains.
- Validate public domains still resolve via LAN DNS.
Checklist C: Hardening for the real world
- Keep search domains short; avoid stacking five corp suffixes “just in case.”
- Prefer FQDNs in automation and config management.
- Make DNS behavior observable: teach people resolvectl status and resolvectl query.
- Test reconnect loops: disconnect/reconnect VPN, change Wi‑Fi, suspend/resume.
- Document the single source of truth: “DNS is managed by resolved; NM feeds it.”
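Reconnect testing can be scripted crudely; a sketch, where “Office-WG” and the probe name are this article’s example values:

```shell
# Crude reconnect soak test: bounce the VPN and verify internal resolution each time
for i in 1 2 3 4 5; do
  nmcli connection down "Office-WG" && nmcli connection up "Office-WG"
  sleep 5
  if ! resolvectl query git.corp.internal >/dev/null 2>&1; then
    echo "FAIL: internal resolution broke on iteration $i"
  fi
done
```

Five clean iterations won’t prove stability, but one failure proves instability, which is the cheaper signal.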
FAQ
1) Do I really need systemd-resolved for split DNS on Debian 13?
No, but you need something that can express per-domain routing. resolved is the most straightforward on Debian 13
and integrates well with NetworkManager.
2) What’s the difference between search domains and routing domains?
Search domains append suffixes to short names (e.g., git becomes git.corp.internal).
Routing domains (~corp.internal) tell resolved which DNS servers should receive queries for that suffix.
Use routing domains for split DNS. Use search domains sparingly.
3) Why not just put VPN DNS servers first in resolv.conf?
Because it breaks the moment the VPN disconnects, and because it sends non-corp queries to corp resolvers.
Split DNS is about correctness, not just “make internal resolve while connected.”
4) I use WireGuard with wg-quick, not NetworkManager. Can I still do this?
Yes, but you must ensure DNS and domain routing are applied persistently when the interface comes up.
A common approach is integrating with resolved via interface-up hooks—then verifying after reboot that the hooks run.
If you can, managing WireGuard through NetworkManager tends to be more consistent on laptops.
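For the wg-quick path, a sketch of the interface section; keys and addresses are placeholders, and the PostUp hook is the assumed way to get routing domains rather than mere search domains:

```ini
# /etc/wireguard/wg0.conf (fragment)
[Interface]
PrivateKey = <redacted>
Address = 10.20.30.2/32
# wg-quick hands these to the system resolver; non-IP entries become search domains
DNS = 10.20.30.53, 10.20.30.54
# Upgrade to proper routing domains once the link is up (%i expands to wg0)
PostUp = resolvectl domain %i '~corp.internal' '~svc.cluster.local'
```

After a reboot, confirm with resolvectl status wg0 that the tildes actually arrived.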
5) My internal DNS servers only answer internal zones and refuse recursion. Is that OK?
It’s fine and often intentional. It’s another reason you want split DNS: only internal suffixes should hit those servers,
while public names resolve elsewhere.
6) Why does “NXDOMAIN” sometimes persist after I fix DNS?
Negative caching. Resolvers and applications can cache NXDOMAIN for a TTL. Restarting resolved can clear its cache,
but apps may still cache. When troubleshooting, test with new names or restart the application.
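For resolved specifically, you can drop the cache without a restart; both verbs below are standard resolvectl subcommands:

```shell
# Drop resolved's positive and negative caches, then inspect hit/miss counters
sudo resolvectl flush-caches
resolvectl statistics
```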
7) How do I know if my app is bypassing the system resolver?
Compare results between getent hosts name and the app. If libc resolves but the app doesn’t,
check the app for DoH settings, embedded resolvers, or proxy configuration.
8) Should I disable IPv6 to “fix DNS issues”?
Usually no. Dual-stack adds complexity, but disabling IPv6 often hides the real issue (like broken VPN DNS over IPv6 or bad router advertisements).
Fix the configuration; don’t amputate the network.
9) When would you choose dnsmasq over systemd-resolved?
When you need conditional forwarding rules that don’t map cleanly to link-scoped DNS domains, or when legacy tooling can’t communicate split domains.
Treat it as an exception, not the default.
Conclusion: next steps that won’t ruin your Monday
The stable path on Debian 13 is simple: systemd-resolved owns DNS policy; NetworkManager (or your network stack) feeds it per-link DNS servers and routing domains.
Make /etc/resolv.conf point to the stub, and stop letting random scripts scribble on it.
Practical next steps:
- Verify /etc/resolv.conf is the resolved stub symlink.
- Ensure your VPN link has routing domains (~corp.internal) and VPN DNS servers in resolvectl status.
- Run three tests: resolvectl query for an internal name, getent hosts for the same name, and a public name query.
- Reboot once on purpose. If it breaks after a planned reboot, it was never stable, just temporarily lucky.