Ubuntu 24.04 “Temporary failure in name resolution”: stop guessing and fix DNS the right way

If you’re seeing Temporary failure in name resolution on Ubuntu 24.04, you’re already losing time in the worst possible way: the system is up, the app is down, and everyone’s blaming “the network.”

This error is almost never mystical. It’s usually a very specific break somewhere along the resolver chain: libc → NSS → /etc/resolv.conf → systemd-resolved (the local stub) → routing/firewall → the upstream nameserver. The fix is not “try 8.8.8.8 and hope.” The fix is to measure each hop and then change the one thing that’s actually wrong.

What the error actually means (and what it doesn’t)

Temporary failure in name resolution is typically returned by the resolver layer when a DNS query can’t be completed right now. In POSIX terms, that’s often EAI_AGAIN coming out of getaddrinfo(). The system is saying: “I tried to resolve a name and couldn’t reach a working resolver chain.”
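
If you want to see exactly what an application experiences (rather than what ping or dig report), a quick sketch via Python’s getaddrinfo wrapper works on a stock install; the hostname is just an example:

cr0x@server:~$ python3 -c 'import socket; print(socket.getaddrinfo("archive.ubuntu.com", 443)[0][4])'

When the resolver chain is broken, this typically raises socket.gaierror: [Errno -3] Temporary failure in name resolution, which is EAI_AGAIN surfacing through libc. When it works, you get an address tuple back.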

What it does not mean:

  • “The internet is down.” Your default route might be fine.
  • “DNS is misconfigured.” Sometimes DNS settings are correct, but UDP/53 is blocked, MTU is wrong, or you’re routing to the wrong interface.
  • “It’s definitely systemd-resolved’s fault.” It’s often just the messenger.

What it usually means in production:

  • Your machine doesn’t know which DNS servers to ask (bad config, empty config, overwritten config).
  • Your machine knows which servers to ask, but can’t reach them (routing, firewall, VPN, broken link, DNS server outage).
  • Your machine reaches a DNS server that answers sometimes (timeouts, packet loss, EDNS/MTU issues, flaky upstream).
  • Your machine is asking the wrong DNS server for the name (split DNS, search domains, wrong interface precedence).

One maxim worth keeping on your wall: “Hope is not a strategy.” The SRE crowd repeats it for good reason, and DNS deserves the same energy: measure, don’t guess.

Fast diagnosis playbook (first/second/third)

This is the fastest path to the bottleneck. Run it in order. Don’t skip to “fixes” until you can name the failing hop.

First: is it DNS, or is it routing?

  1. Resolve a name via the system resolver (getent). If it fails, proceed.
  2. Resolve via a specific known-good DNS server (dig @x.x.x.x). If that works, your upstream network is fine and your local resolver chain is broken.
  3. Ping a known IP (like your default gateway or a public IP if allowed). If IP connectivity is broken, DNS is a symptom.

Second: is systemd-resolved healthy and pointed somewhere sane?

  1. resolvectl status: check which DNS servers are in use per link and globally.
  2. ls -l /etc/resolv.conf: confirm whether you’re using the stub resolver or a static file, and whether that matches your intent.
  3. systemctl status systemd-resolved: confirm it’s running and not stuck in a loop.

Third: is the chosen DNS server reachable and answering?

  1. Test UDP/TCP reachability to the DNS server (timeouts matter more than “refused”).
  2. Query with dig for A and AAAA records; compare behavior. IPv6 often “works” just enough to waste your afternoon.
  3. If split DNS: ensure the query goes out the intended interface and with the right domain routing rules.

Dry-funny joke #1: DNS is the only system where “it’s just a name” can take down payroll.

Ubuntu 24.04 DNS stack in practice

Ubuntu 24.04 (like recent Ubuntu releases) typically uses:

  • Netplan to define network configuration (YAML under /etc/netplan/).
  • systemd-networkd or NetworkManager as the network backend, depending on server/desktop and your config.
  • systemd-resolved as the local caching stub resolver, usually bound to 127.0.0.53.
  • NSS (Name Service Switch) rules in /etc/nsswitch.conf determining whether lookups use DNS, files, mDNS, etc.
  • /etc/resolv.conf as the compatibility interface for legacy tools. On a modern Ubuntu, it’s often a symlink to a systemd-managed file.

Here’s the practical mental model. When an app asks “what is repo.internal.corp?”

  1. The app calls getaddrinfo().
  2. glibc consults /etc/nsswitch.conf (hosts: line) and chooses methods.
  3. For DNS, glibc reads /etc/resolv.conf and uses that nameserver list.
  4. If that file points to 127.0.0.53, queries go to systemd-resolved.
  5. systemd-resolved chooses an upstream DNS server based on per-interface configuration and routing domains, then issues the query.

So when you “edit resolv.conf,” you might be editing a symlink that gets overwritten, or you might be bypassing the intended resolver path. You can make it work temporarily and still be building a trap for Future You.
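
A quick way to see which situation you’re in (the two paths below are the standard systemd-resolved targets; anything else means another tool owns the file):

cr0x@server:~$ readlink -f /etc/resolv.conf

/run/systemd/resolve/stub-resolv.conf means apps talk to the 127.0.0.53 stub and resolved does the real work. /run/systemd/resolve/resolv.conf means apps get the upstream server list directly and skip the stub, while resolved still maintains the file. A plain file with no symlink means someone (or something) took manual control, and finding out who is step one.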

Interesting facts and historical context (short but useful)

  • DNS predates most people’s careers. It was designed in the early 1980s as a distributed replacement for a single HOSTS.TXT file that everyone had to download.
  • “resolv.conf” is a fossil that refuses to die. It’s still the universal interface even though modern systems have dynamic, per-link DNS rules.
  • systemd-resolved introduced a local stub by default. That’s why you often see nameserver 127.0.0.53—it’s not “wrong,” it’s an indirection layer.
  • DNS over UDP is fast until it isn’t. Packet loss turns “fast” into “mysteriously slow,” because retries and timeouts can stack up across libraries and apps.
  • TCP fallback matters more than people admit. Large responses, DNSSEC, or certain middleboxes can force TCP/53, and if TCP is blocked you get timeouts that look like random failures.
  • Search domains are productivity tools and outage generators. A missing dot can trigger multiple queries (api becomes api.prod.corp, api.corp, etc.), adding latency and confusion.
  • Negative caching is a thing. If your resolver caches “NXDOMAIN” and you just created a record, you can spend the TTL arguing with reality.
  • Split DNS got common because VPNs and SaaS got common. Your laptop/server may need public DNS for the world and private DNS for internal names, simultaneously.
  • Ubuntu’s defaults shifted over time. Older releases leaned on resolvconf or direct DHCP writes; modern ones prefer systemd tooling. Copy-pasting old playbooks can quietly break things.

Practical tasks: commands, outputs, decisions (12+)

These are real tasks I’d run on an Ubuntu 24.04 server during an incident. Each includes: command, example output, what it means, and the decision you make.

Task 1: Confirm the symptom using the system resolver (not ping)

cr0x@server:~$ getent ahosts archive.ubuntu.com
getent: Name or service not known

Meaning: This uses libc/NSS and reflects what most applications experience. “Name or service not known” or a timeout here confirms it’s not just ping being weird.

Decision: Continue with resolver-chain debugging. If getent works but your app fails, suspect application-specific DNS settings (containers, chroot, custom resolver libraries).
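
For the container case, compare the resolver view inside and outside. A minimal sketch, assuming Docker and a hypothetical container named app1 (swap in your runtime and names):

cr0x@server:~$ cat /etc/resolv.conf
cr0x@server:~$ sudo docker exec app1 cat /etc/resolv.conf
cr0x@server:~$ sudo docker exec app1 getent ahosts archive.ubuntu.com

If the container’s resolv.conf points at servers the container network can’t reach (or the image lacks getent entirely), the host can look perfectly healthy while the app keeps failing.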

Task 2: Check whether IP routing works at all

cr0x@server:~$ ip route
default via 10.0.0.1 dev ens3 proto dhcp src 10.0.0.42 metric 100
10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.42

Meaning: You have a default route via 10.0.0.1. If there’s no default route, DNS might fail because nothing can reach upstream servers.

Decision: If default route is missing/wrong, fix networking first (Netplan, DHCP, interface state) before touching DNS.

Task 3: Ping the default gateway (local L2 sanity)

cr0x@server:~$ ping -c 2 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.391 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.412 ms

--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1024ms
rtt min/avg/max/mdev = 0.391/0.401/0.412/0.010 ms

Meaning: L2/L3 to the gateway is fine. DNS failures are less likely to be “the NIC is dead.”

Decision: Move up-stack: resolver config and DNS reachability.

Task 4: Inspect /etc/resolv.conf (symlink tells the story)

cr0x@server:~$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 39 May  8 10:14 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

Meaning: This host uses the systemd stub resolver at 127.0.0.53. If you “fix DNS” by editing /etc/resolv.conf directly, it will likely be overwritten or ignored.

Decision: Use resolvectl and Netplan/NetworkManager configuration instead of hand-editing the file.
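
If you need resolution working right now and the link is managed by systemd-networkd, resolvectl can also set per-link DNS at runtime. It isn’t persistent (a DHCP renewal or netplan apply can replace it), so treat it as a stopgap; the interface and addresses are the placeholders used throughout this article’s example:

cr0x@server:~$ sudo resolvectl dns ens3 10.0.0.53 10.0.0.54
cr0x@server:~$ sudo resolvectl domain ens3 corp.internal
cr0x@server:~$ resolvectl status ens3

Then make the same change durable in Netplan (Task 14) so the next lease renewal doesn’t undo it.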

Task 5: Verify systemd-resolved is running and not degraded

cr0x@server:~$ systemctl status systemd-resolved --no-pager
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; preset: enabled)
     Active: active (running) since Wed 2025-12-18 09:51:22 UTC; 2h 13min ago
       Docs: man:systemd-resolved.service(8)
             man:resolvectl(1)
   Main PID: 812 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 18712)
     Memory: 7.9M
        CPU: 1.231s
     CGroup: /system.slice/systemd-resolved.service
             └─812 /usr/lib/systemd/systemd-resolved

Meaning: The service is up. That doesn’t guarantee upstream DNS is reachable, but it removes “resolver daemon crashed” from the list.

Decision: Look at resolver configuration and upstream reachability next.

Task 6: Check resolver configuration with resolvectl

cr0x@server:~$ resolvectl status
Global
         Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
  resolv.conf mode: stub
Current DNS Server: 10.0.0.53
       DNS Servers: 10.0.0.53 10.0.0.54
        DNS Domain: corp.internal

Link 2 (ens3)
    Current Scopes: DNS
         Protocols: +DefaultRoute
Current DNS Server: 10.0.0.53
       DNS Servers: 10.0.0.53 10.0.0.54
        DNS Domain: corp.internal

Meaning: DNS servers are 10.0.0.53 and 10.0.0.54; search domain corp.internal. That looks plausible. If this showed no DNS servers, you’ve found your problem.

Decision: If DNS servers are empty or wrong, fix Netplan/DHCP/NetworkManager. If they look right, test reachability and response behavior.

Task 7: Query through systemd-resolved explicitly

cr0x@server:~$ resolvectl query archive.ubuntu.com
archive.ubuntu.com: resolve call failed: Temporary failure in name resolution

Meaning: This confirms the failure is happening at or beyond systemd-resolved (not some app-specific DNS config).

Decision: Test direct queries to upstream DNS servers to separate “resolved” from “upstream network.”

Task 8: Query the upstream DNS server directly with dig (bypass the stub)

cr0x@server:~$ dig +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A

; <<>> DiG 9.18.24-1ubuntu1.1-Ubuntu <<>> +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Meaning: The configured DNS server is not reachable (or not responding). This is not a “systemd-resolved config” issue; it’s a connectivity or server-side DNS problem.

Decision: Check routing to the DNS server, firewall rules, VPN state, and whether the DNS server is alive.
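
Before blaming the server, confirm which path the queries actually take. ip route get shows the interface and source address the kernel would pick for that destination:

cr0x@server:~$ ip route get 10.0.0.53

If the output names a VPN or secondary interface you didn’t expect, you’re debugging routing or VPN state, not DNS.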

Task 9: Confirm you can reach the DNS server at the network layer

cr0x@server:~$ ping -c 2 10.0.0.53
PING 10.0.0.53 (10.0.0.53) 56(84) bytes of data.
64 bytes from 10.0.0.53: icmp_seq=1 ttl=63 time=1.12 ms
64 bytes from 10.0.0.53: icmp_seq=2 ttl=63 time=1.09 ms

--- 10.0.0.53 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1014ms
rtt min/avg/max/mdev = 1.09/1.10/1.12/0.015 ms

Meaning: The server is reachable by ICMP. That doesn’t prove UDP/53 or TCP/53 is allowed, but it’s a good sign.

Decision: Test UDP and TCP DNS explicitly.

Task 10: Test UDP/53 and TCP/53 behavior (because middleboxes exist)

cr0x@server:~$ dig +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A +notcp

; <<>> DiG 9.18.24-1ubuntu1.1-Ubuntu <<>> +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A +notcp
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
cr0x@server:~$ dig +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A +tcp

; <<>> DiG 9.18.24-1ubuntu1.1-Ubuntu <<>> +time=2 +tries=1 @10.0.0.53 archive.ubuntu.com A +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41861
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;archive.ubuntu.com.          IN      A

;; ANSWER SECTION:
archive.ubuntu.com.   60      IN      A       91.189.91.81

;; Query time: 24 msec
;; SERVER: 10.0.0.53#53(10.0.0.53) (TCP)
;; WHEN: Wed Dec 18 12:06:12 UTC 2025
;; MSG SIZE  rcvd: 59

Meaning: UDP queries time out, but TCP works. That’s classic “UDP/53 blocked somewhere” (host firewall, network ACL, security group, or an overzealous DNS security policy).

Decision: Fix the firewall/network policy for UDP/53, or force TCP only as a stopgap (not ideal).
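
If the block turns out to be the local firewall (it isn’t in the ruleset shown in the next task, where the output policy is accept), the allow rules are one-liners. A sketch for ufw and for a plain nftables output chain; the table and chain names match the example ruleset and may differ on your host:

cr0x@server:~$ sudo ufw allow out 53/udp
cr0x@server:~$ sudo nft add rule inet filter output udp dport 53 accept

If the local rules are already permissive, the block is upstream: security groups, network ACLs, or the DNS server’s own firewall.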

Task 11: Check local firewall rules (UFW and nftables)

cr0x@server:~$ sudo ufw status verbose
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), disabled (routed)
New profiles: skip
cr0x@server:~$ sudo nft list ruleset | sed -n '1,120p'
table inet filter {
        chain input {
                type filter hook input priority filter; policy drop;
                ct state established,related accept
                iif "lo" accept
                tcp dport 22 accept
        }
        chain output {
                type filter hook output priority filter; policy accept;
        }
}

Meaning: The output chain policy is accept, so the local firewall is probably not blocking outbound UDP/53. If the output policy were drop, or there were an explicit reject/drop rule for port 53, that would be your culprit.

Decision: If local firewall is clean, look at upstream ACLs/security groups/VPC NACLs, or the DNS server’s own firewall.

Task 12: Confirm which backend manages the link (NetworkManager vs networkd)

cr0x@server:~$ networkctl status ens3
● 2: ens3
                     Link File: /usr/lib/systemd/network/99-default.link
                  Network File: /run/systemd/network/10-netplan-ens3.network
                          Type: ether
                         State: routable (configured)
                  Online state: online
                       Address: 10.0.0.42 (DHCP4)
                       Gateway: 10.0.0.1
                           DNS: 10.0.0.53 10.0.0.54

Meaning: This interface is managed by systemd-networkd via a Netplan-generated file. Your DNS settings should be corrected in Netplan (or DHCP), not in NetworkManager.

Decision: Edit /etc/netplan/*.yaml appropriately and apply.

Task 13: Inspect Netplan DNS config (and spot “looks right, behaves wrong”)

cr0x@server:~$ sudo netplan get
network:
  version: 2
  ethernets:
    ens3:
      dhcp4: true

Meaning: DNS is coming from DHCP, not static config. If DHCP is handing out dead DNS servers, you’ll fail even though your host config is “clean.”

Decision: Fix DHCP options upstream, or override DNS locally in Netplan for this host role.

Task 14: Override DNS servers in Netplan (the right way) and apply

cr0x@server:~$ sudo tee /etc/netplan/50-dns-override.yaml >/dev/null <<'EOF'
network:
  version: 2
  ethernets:
    ens3:
      dhcp4: true
      nameservers:
        addresses: [10.0.0.53, 10.0.0.54]
        search: [corp.internal]
EOF
cr0x@server:~$ sudo chmod 600 /etc/netplan/50-dns-override.yaml
cr0x@server:~$ sudo netplan apply

Meaning: You’ve declared intent in the supported interface (the chmod avoids netplan’s warning about world-readable configuration files). If something overwrites it later, you now have a configuration artifact to audit.

Decision: Re-check resolvectl status and run a query again. If it still fails, it’s not a “missing DNS servers” problem.
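
Verification after the change is three commands, in increasing order of realism:

cr0x@server:~$ resolvectl status ens3
cr0x@server:~$ resolvectl query archive.ubuntu.com
cr0x@server:~$ getent ahosts archive.ubuntu.com

If resolvectl shows the new servers but queries still time out, stop editing config; you’re back to reachability (Tasks 8-10).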

Task 15: Check journal logs for resolved (timeouts leave fingerprints)

cr0x@server:~$ sudo journalctl -u systemd-resolved --since "20 min ago" --no-pager
Dec 18 11:52:07 server systemd-resolved[812]: Using degraded feature set UDP instead of UDP+EDNS0 for DNS server 10.0.0.53.
Dec 18 12:02:14 server systemd-resolved[812]: DNS server 10.0.0.53: Timeout while contacting DNS server.
Dec 18 12:02:19 server systemd-resolved[812]: Switching to DNS server 10.0.0.54.

Meaning: Resolved is timing out and failing over. “Degraded feature set” often hints at MTU/fragmentation problems, broken EDNS0 handling, or middleboxes.

Decision: If multiple DNS servers time out, it’s connectivity. If only one is bad, remove/replace it. If EDNS0 triggers failures, check MTU/path and DNS server behavior.
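
Two quick tests help separate “this server is flaky” from “this path mangles large UDP packets”: force a small EDNS buffer (or no EDNS at all) with dig, and check whether a full-size unfragmented packet survives the path. Addresses are from this example:

cr0x@server:~$ dig +time=2 +tries=1 +bufsize=512 @10.0.0.53 archive.ubuntu.com A
cr0x@server:~$ dig +time=2 +tries=1 +noedns @10.0.0.53 archive.ubuntu.com A
cr0x@server:~$ ping -c 3 -M do -s 1472 10.0.0.53

If small or EDNS-free queries succeed while defaults fail, you’re in EDNS/fragmentation territory. If the 1472-byte ping can’t get through with fragmentation prohibited, the path MTU is below 1500 and DNS is just the first victim.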

Task 16: Validate NSS order (sometimes you’re not even using DNS)

cr0x@server:~$ grep -E '^\s*hosts:' /etc/nsswitch.conf
hosts:          files mymachines mdns4_minimal [NOTFOUND=return] dns

Meaning: DNS is consulted after files and some mDNS logic. That’s normal on many Ubuntu installs. If you’re in a server environment and mDNS is causing delays, this line can matter.

Decision: If lookups hang before failing, consider whether mDNS or a misconfigured local name service is adding the delay. Don’t “optimize” this line without understanding the blast radius (see the stories later).
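
Two low-risk checks before touching nsswitch.conf: see whether the mDNS module is even installed, and time a lookup so you know where the seconds go. Use the name that’s actually failing for you:

cr0x@server:~$ apt policy libnss-mdns
cr0x@server:~$ time getent ahosts archive.ubuntu.com

If libnss-mdns isn’t installed, the mdns4_minimal token is effectively inert and not your problem. If the lookup burns several seconds before failing, you’re looking at per-server timeouts and retries rather than an instant “no servers configured” failure.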

Dry-funny joke #2: If you hardcode DNS everywhere, you’ll eventually run a distributed system made entirely of exceptions.

Three corporate mini-stories from the trenches

Incident caused by a wrong assumption: “It can’t be DNS because the IP pings”

A mid-sized company ran a fleet of Ubuntu servers in a cloud VPC. One Tuesday, builds started failing with the familiar error during package installs. The incident channel filled up fast: the build nodes could ping the NAT gateway, could curl a public IP, and could even hit the artifact store by IP. So the early assumption was: “DNS is fine; the repo is down.”

The repo wasn’t down. The build nodes were using a corporate DNS forwarder, reachable only through a security group rule that had recently been “tightened.” ICMP was allowed, TCP/443 was allowed, but UDP/53 outbound from that subnet had been removed. Not malicious—just someone optimizing a rule set with a spreadsheet.

The team lost hours because they kept testing the wrong thing. Pinging the DNS server succeeded, which reinforced the assumption. But DNS queries were UDP and silently dropped. The build tooling retried, hit timeouts, and fell back in inconsistent ways across languages and libraries. So it looked flaky.

The fix was boring: restore UDP/53 egress from that subnet to the forwarders, and document that DNS is not “just reachability.” Afterward they added a canary test that ran dig +notcp and dig +tcp separately, because “DNS works” is not a single boolean.

Optimization that backfired: caching harder (and caching the wrong answers)

At another shop, a platform team wanted faster deployments. They noticed a lot of repeated DNS queries during container pulls, service discovery, and telemetry. Their bright idea: increase caching aggressively and reduce query load on upstream resolvers.

They pushed resolver changes broadly. Query volume dropped. Graphs looked great. Then, the weirdness started: some hosts couldn’t reach new services for minutes after a rollout. Others would resolve a service to an old IP long after the service had moved. A few nodes “fixed themselves” after restarts, which made the incident feel supernatural.

The root cause wasn’t “caching is bad.” The root cause was caching without discipline. Service records had low TTLs for a reason (rapid failover, blue/green). By forcing longer caching and layering caches (node cache plus a forwarder cache), they effectively multiplied TTL behavior. Negative caching added another twist: transient NXDOMAIN responses got pinned.

They rolled back the aggressive cache settings and instead targeted the real bottleneck: upstream resolver capacity and latency, plus sane TTLs aligned with deployment mechanics. Performance improved without making DNS lie to the application.

Boring but correct practice that saved the day: per-role DNS configuration and a test you can’t argue with

A finance org ran Ubuntu 24.04 servers across multiple network zones: user-facing, internal app, and restricted data. Each zone had slightly different DNS needs. The tempting approach was “one golden image” with “one resolver config.” They didn’t do that.

They maintained per-role Netplan snippets: each role declared its DNS servers and search domains explicitly, with DHCP allowed for addressing but not trusted for resolver details. The network team still provided DNS via DHCP, but the hosts didn’t rely on it blindly.

The key part: a health check that ran on every host, every few minutes, and reported three measures: (1) getent lookup to an internal name, (2) dig to each configured DNS server over UDP, and (3) a TCP fallback test. The check produced unambiguous failure modes: “host resolver broken” vs “DNS server unreachable” vs “UDP blocked.”

When an upstream change broke UDP for one zone, the alarm didn’t say “name resolution failed.” It said “UDP/53 blocked to dns-a; TCP/53 works.” The network team fixed the ACL in minutes because the problem statement was precise, and the platform team didn’t waste time restarting services like it was a ritual.
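
A minimal sketch of that kind of canary, assuming bash, dig, and getent are present; the internal name and server list are placeholders you would replace with your own:

#!/usr/bin/env bash
# dns-canary.sh: report three distinct failure modes instead of one vague alarm.
# NAME and SERVERS are placeholders for your zone.
NAME="repo.internal.corp"
SERVERS="10.0.0.53 10.0.0.54"

# 1) Host resolver chain, as applications see it.
getent ahosts "$NAME" >/dev/null && echo "resolver-chain: ok" || echo "resolver-chain: FAIL"

# 2) and 3) UDP and TCP reachability to each configured server.
for s in $SERVERS; do
  dig +time=2 +tries=1 +notcp @"$s" "$NAME" A >/dev/null 2>&1 && echo "udp53 $s: ok" || echo "udp53 $s: FAIL"
  dig +time=2 +tries=1 +tcp @"$s" "$NAME" A >/dev/null 2>&1 && echo "tcp53 $s: ok" || echo "tcp53 $s: FAIL"
done

Feed the output into whatever monitoring you already run; the value is that the alert names the failing hop instead of saying “DNS is broken.”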

Common mistakes: symptoms → root cause → fix

These are the patterns I see repeatedly on Ubuntu 24.04. The symptoms look similar; the fixes are not.

1) Symptom: apt update fails, but ping 1.1.1.1 works

  • Root cause: DNS resolver chain broken or upstream DNS unreachable, while general routing works.
  • Fix: Use resolvectl status to identify active DNS servers; then dig @server to test reachability. Fix DHCP/Netplan DNS or firewall/ACL for UDP/TCP 53.

2) Symptom: /etc/resolv.conf shows correct nameserver, but it keeps reverting

  • Root cause: /etc/resolv.conf is managed (symlink to systemd-resolved or generated by NetworkManager). Manual edits get overwritten.
  • Fix: Configure DNS in Netplan (networkd) or NetworkManager, or adjust /etc/systemd/resolved.conf if you truly want global overrides. Don’t fight the generator.

3) Symptom: Internal names fail, public names work (or vice versa)

  • Root cause: Split DNS misrouted (wrong interface chosen, missing routing domains, VPN DNS not applied).
  • Fix: Check per-link DNS and domains in resolvectl status. Ensure the right link has the right DNS Domain and that the VPN client integrates correctly (or configure per-link domains via Netplan/NetworkManager).

4) Symptom: DNS works for a while after reboot, then fails later

  • Root cause: DHCP renewal changes DNS servers, VPN reconnect changes link priority, or cloud-init/network scripts rewrite config.
  • Fix: Identify the writer: check journalctl for network events, confirm Netplan config, and if needed lock down DNS in Netplan or NetworkManager. Avoid “one-off” edits.

5) Symptom: dig works, but applications still fail

  • Root cause: App uses a different resolver path (container with its own resolv.conf, static DNS config, JVM settings), or NSS config differs in the environment.
  • Fix: Test with getent on the host and inside the container. Compare /etc/resolv.conf and /etc/nsswitch.conf in both contexts. Fix at the right layer.

6) Symptom: UDP queries time out, TCP works

  • Root cause: UDP/53 blocked or mangled; sometimes MTU/fragmentation/EDNS issues cause large UDP responses to fail.
  • Fix: Allow UDP/53. If EDNS/MTU is suspected, look for “degraded feature set” in resolved logs and validate path MTU or DNS server EDNS support.

7) Symptom: Very slow lookups, then intermittent failures

  • Root cause: Search domain expansion causing multiple queries; packet loss; one of multiple DNS servers is blackholing.
  • Fix: Use resolvectl statistics (if available) and dig with +tries=1 +time=1 against each server. Remove dead servers; reduce search domains for server roles.

Checklists / step-by-step plan (make it boring and correct)

Step-by-step: restore DNS on a broken Ubuntu 24.04 host

  1. Confirm it’s real: run getent ahosts example.com. If it works, your problem is higher-level (proxy, TLS, app config).
  2. Confirm routing: ip route and ping the gateway.
  3. Check resolver mode: ls -l /etc/resolv.conf.
  4. Inspect resolved config: resolvectl status and note:
    • Global DNS servers
    • Link-specific DNS servers
    • Search/routing domains
    • Which link is marked default route
  5. Bypass the stack: dig @<dns-server> example.com A with short timeout. Do this for each configured server.
  6. Test UDP vs TCP: dig +notcp and dig +tcp to the same server.
  7. Check local firewall: ufw status verbose and nft list ruleset.
  8. Fix the correct owner of config:
    • If interface managed by networkd: fix Netplan YAML and netplan apply.
    • If managed by NetworkManager: use nmcli to set DNS and connection properties.
    • If DHCP is wrong: fix DHCP option 6 upstream, then renew lease.
  9. Validate after change: resolvectl query, getent, and a real workflow (apt update if that’s the failing path).
  10. Stabilize: add a small DNS check in your monitoring so you catch drift and partial failures before users do.

Operational checklist: prevent the next incident

  • Standardize “who owns DNS settings” per host role: DHCP-only, Netplan override, or NetworkManager policy. Pick one.
  • Ensure at least two DNS servers, but don’t list dead ones “for later.” A dead secondary can still cost you seconds per lookup depending on retry behavior.
  • Document split DNS domains and which interface provides them (especially with VPN clients and overlay networks).
  • Test UDP and TCP DNS in CI for base images if you operate in tightly filtered networks.
  • Keep search domains short for servers. Developer laptops can afford convenience; production services want determinism.
  • Decide whether IPv6 is supported. If it’s half-supported, you’re volunteering for strange timeouts.

FAQ

1) Why does Ubuntu show nameserver 127.0.0.53 in /etc/resolv.conf?

That’s the local stub for systemd-resolved. Applications query localhost; resolved forwards to the real DNS servers based on per-link config and caches answers.

2) Should I disable systemd-resolved to “fix DNS”?

Only if you have a clear reason and a plan. Disabling it can work, but it’s a blunt instrument that often breaks split DNS, VPN behavior, and future maintenance. Fix upstream reachability or correct Netplan/NetworkManager config first.

3) I edited /etc/resolv.conf and it worked—why did it break later?

Because that file is often generated. Your edit either hit the stub file (which gets regenerated) or you replaced a symlink that a package/service expects. The correct durable fix is in Netplan, NetworkManager, or resolved configuration.
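
If a global override in resolved configuration really is what you want (for example, a single-link host with no split DNS), the knobs live under /etc/systemd/resolved.conf. A minimal sketch using a drop-in file; the addresses and domain are placeholders:

cr0x@server:~$ sudo mkdir -p /etc/systemd/resolved.conf.d
cr0x@server:~$ sudo tee /etc/systemd/resolved.conf.d/dns-override.conf >/dev/null <<'EOF'
[Resolve]
DNS=10.0.0.53 10.0.0.54
Domains=corp.internal
EOF
cr0x@server:~$ sudo systemctl restart systemd-resolved

Per-link settings from DHCP or Netplan still exist alongside the global ones, so check resolvectl status afterward and confirm which servers actually win for the names you care about.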

4) Why do some tools work while others fail (e.g., dig works, apt fails)?

dig speaks DNS directly and can target a specific server with @, so it bypasses NSS entirely. apt uses libc’s resolver and therefore whatever the host provides via NSS and /etc/resolv.conf. Test with getent to match application behavior.

5) What’s the fastest way to tell “DNS config issue” vs “DNS server unreachable”?

Run resolvectl status to see what server you’re using, then dig @that-server example.com with a short timeout. If the direct query times out, the server path is the issue. If it works, your local resolver chain is miswired.

6) Why does UDP matter if TCP works?

Most DNS queries default to UDP for performance. If UDP/53 is blocked, many clients will time out before falling back to TCP (if they do at all). That looks like intermittent failure and latency spikes.

7) How do VPNs cause “Temporary failure in name resolution” after disconnect?

VPN clients often add link-specific DNS and routing domains, then fail to cleanly remove or restore prior state. You end up with DNS servers that are unreachable post-VPN, or a different interface becoming the “default” for DNS routing.

8) Should I put public resolvers (like 1.1.1.1) on servers as a fallback?

Not as a reflex. In corporate environments you can leak internal names or break policy controls. In production, prefer resolvers that are intended for that network zone, and ensure they’re reachable with the correct firewall rules.

9) My DNS servers are reachable, but resolved logs say “degraded feature set UDP instead of UDP+EDNS0.” Is that bad?

It’s a clue. It usually points to a path that drops fragmented UDP packets or mishandles EDNS0. You might still “work,” but you’re close to intermittent failures—especially with DNSSEC, large TXT records, or busy resolvers.

10) What’s a safe “emergency workaround” during an outage?

Use a temporary Netplan override to point to known-good internal DNS servers for that zone, apply it, and validate. Avoid hand-editing /etc/resolv.conf unless you accept it may be overwritten and you document the deviation.

Conclusion: next steps you can do today

Stop treating Temporary failure in name resolution like a weather event. On Ubuntu 24.04, DNS failures are diagnosable if you respect the chain: application → NSS → resolv.conf → systemd-resolved → upstream DNS → network policy.

Next steps that pay off immediately:

  • Adopt the fast diagnosis playbook and teach it to your on-call rotation.
  • Standardize DNS ownership (Netplan vs NetworkManager vs DHCP) per host role, and write it down in the repo that builds your images.
  • Add one small DNS canary check that distinguishes: local resolver failure, upstream server failure, and UDP/TCP policy problems.
  • When you fix it, fix it at the right layer—so it stays fixed after renewals, reboots, and VPN drama.