BIND9: Zone Transfers Gone Wild — Lock It Down Without Breaking Secondary DNS

February 16, 2026 • Read: 21 min

Zone transfers are the DNS equivalent of a warehouse loading dock. When they work, inventory moves quietly and everyone goes home on time. When they’re misconfigured, you get trucks you never ordered, pallets everywhere, and someone “just testing something” with a forklift.

If you run BIND9 in production, you’ve probably felt the pain: secondaries stuck on old data, transfer storms chewing bandwidth, or the unsettling discovery that strangers can AXFR your internal zones. This is how you lock it down—without bricking your secondary DNS.

How zone transfers break in real life

Zone transfers (AXFR for full, IXFR for incremental) are supposed to be boring background replication. You publish a zone on one or more primaries, your secondaries pull updates, resolvers query whichever authoritative server they hit, and everyone trusts TTLs like adults.

Then you introduce one of the classic accelerants:

Over-permissive transfer ACLs (“allow-transfer { any; };” the DNS equivalent of leaving your badge reader on “demo mode”).
NAT or firewalls with asymmetric policy (NOTIFY gets through, the TCP transfer doesn’t; or vice versa).
Serial number sloppiness (manual edits without serial bumps; or dynamic DNS with hidden primary patterns that don’t match your assumptions).
Transfer storms (lots of secondaries, aggressive refresh, flapping masters, or a config change that forces full retransfer).
View mismatches (split-horizon zones, but transfer credentials/ACLs don’t line up, so secondaries replicate the wrong view or none at all).
TSIG drift (wrong key, wrong algorithm, wrong “server” statement association, or clocks off enough to fail signed messages).

When transfers go wild, you don’t just have “DNS issues.” You have:

Availability risk: secondaries serving stale data during incidents, causing partial outages that are infuriating to debug.
Confidentiality risk: AXFR leaks internal hostnames, service topology, and naming conventions. Attackers love that.
Capacity risk: transfer retries can be self-inflicted DDoS, especially if you have many zones or huge zones (think: auto-generated records).
Change-control risk: a “harmless” ACL tweak can quietly break replication and only show up after TTL and cache lifetimes expire.

Joke #1: DNS is the only system where “serial number management” is a reliability strategy and also a plot device.

Interesting facts and context (why we got here)

Short, concrete context matters because BIND’s behaviors are shaped by decades of operational scar tissue.

AXFR predates modern “least privilege” thinking. Early DNS assumed cooperating servers; the idea of hostile scanning for transfers came later.
Zone transfers use TCP. Regular DNS queries are usually UDP, but AXFR always runs over TCP, and BIND requests IXFR over TCP as well, so that lots of data can move reliably.
NOTIFY was an add-on. Originally, secondaries waited for refresh timers; NOTIFY (RFC 1996) made propagation faster but introduced new failure modes when UDP 53 is filtered differently than TCP 53.
IXFR exists to avoid full transfers. Incremental transfers (RFC 1995) reduce bandwidth and load, but they rely on journals/diffs being available and serial progression being sane.
TSIG is older than many people think. Transaction SIGnature (RFC 2845, since replaced by RFC 8945) became the workhorse for authenticating transfers without needing full DNSSEC complexity.
BIND’s “views” were built for split-horizon operations. Powerful, easy to misapply: your transfer ACL must align with the right view, or you’ll replicate the wrong answer set.
“Hidden master” designs became common to reduce attack surface. Exposing only secondaries to the internet is great—until you forget that the secondaries still need to reach the master on TCP 53.
Transfer formatting and SOA semantics are still central. Despite modern automation, SOA serial, refresh, retry, expire, and negative TTL remain the knobs that drive behavior.

A useful mental model: AXFR/IXFR/NOTIFY and who initiates what

Primary vs secondary: the direction of travel

In classic primary/secondary DNS, the secondary initiates the transfer. It connects to the primary (TCP 53), asks for IXFR or AXFR, and stores a local copy. That means your security posture should be: the primary only allows transfers to known secondaries, and the secondaries only talk to known primaries.

NOTIFY is the “doorbell,” not the delivery truck

NOTIFY is a quick heads-up from primary to secondary: “my serial changed; go check.” It is usually UDP. The actual transfer is the heavy lift and happens over TCP. So your firewall needs to allow both directions appropriately, and your BIND configuration needs to avoid accepting NOTIFY from random hosts.

IXFR is not magic

IXFR only works when the primary can supply the incremental changes. In BIND, that usually depends on a zone journal. Lose the journal (or disable it), and the secondary will often fall back to AXFR. If your zone is large, that fallback is expensive.
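
As a hedged illustration of that journal dependency: a dynamic zone gets a journal automatically, but a zone maintained by editing the file and reloading only has diffs to offer if BIND is told to compute them. The fragment below is a sketch that reuses the zone name and file path from this article; the 10m journal cap is an arbitrary example value, not a recommendation.

```conf
zone "example.com" {
	type master;
	file "/var/cache/bind/db.example.com";
	// On each reload, diff the new zone against the previous version and
	// record the changes in the journal so IXFR has something to serve.
	ixfr-from-differences yes;
	// Cap journal growth (example value only).
	max-journal-size 10m;
};
```

Computing differences costs CPU and memory on reload, so for very large zones it is worth measuring before enabling it everywhere.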

One quote to keep you honest

Hope is not a strategy. — Gene Kranz

Threat model: what you’re defending against

Locking down transfers isn’t paranoia; it’s hygiene. Here’s the realistic set of problems:

Unauthorized AXFR/IXFR attempts to enumerate internal names and services.
Misconfigured internal hosts accidentally acting like secondaries and hammering the primary with retries.
Compromised secondary used as a trusted pivot—still allowed by your primary’s ACL.
Leftover operational mistakes: someone opened transfers “temporarily” to fix a broken secondary, then left it.
Resource exhaustion: transfer concurrency, TCP sockets, disk I/O for zone writes, journal churn, and log flooding.

The goal isn’t “no transfers.” The goal is only the right transfers, at the right time, from the right hosts, with authentication, and with sane limits.

Fast diagnosis playbook

When secondaries aren’t updating or transfers are melting your servers, you need a ruthless order of operations. Don’t start by editing named.conf in a panic. Start by finding the bottleneck.

First: prove basic reachability and protocol (TCP vs UDP)

Can the secondary reach the primary on TCP 53?
Is NOTIFY arriving (UDP 53)?
Is there any middlebox doing “helpful” DNS inspection?
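
The first two checks can be sketched as one small helper. This is an illustration only: probe_soa is a hypothetical function name, and it simply asks for the SOA over a forced transport with dig. Run it on the secondary with tcp against the primary; run it on the primary with udp against the secondary as a stand-in for the NOTIFY direction (it proves UDP/53 passes that way, not that named actually sent a NOTIFY).

```shell
# probe_soa: ask for a zone's SOA over a forced transport (hypothetical helper).
#   $1 = tcp | udp    $2 = zone    $3 = server to query
probe_soa() {
    case "$1" in
        tcp) proto_flag=+tcp ;;
        udp) proto_flag=+notcp ;;
        *)   echo "usage: probe_soa tcp|udp zone server" >&2; return 2 ;;
    esac
    # Lines starting with ';' are dig diagnostics (timeouts, errors), not answers.
    probe_serial=$(dig "$proto_flag" +norecurse +time=3 +tries=1 +short SOA "$2" "@$3" \
        | awk '!/^;/ && NF >= 7 { print $3; exit }')
    if [ -n "$probe_serial" ]; then
        echo "$1 ok serial=$probe_serial"
    else
        echo "$1 FAILED"
        return 1
    fi
}
```

Example: `probe_soa tcp example.com 203.0.113.11` on the secondary, then `probe_soa udp example.com 203.0.113.22` on the primary. If one transport works and the other fails, look at the firewall before touching named.conf.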

Second: prove authorization and identity

Does the primary allow transfer to the secondary IP(s)?
Is TSIG configured on both ends, and are you actually using it for transfers?
Are you dealing with the correct view?

Third: prove state and serial movement

Does the primary’s SOA serial match what you expect?
Is the secondary stuck because it can’t IXFR and keeps failing AXFR?
Is the secondary serving stale because it never completed a transfer?
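
When comparing serials, keep in mind that DNS uses serial number arithmetic (RFC 1982): values wrap at 2^32, so a plain numeric "greater than" can mislead after a wrap or a botched serial reset. A minimal sketch of the comparison, with serial_is_newer as a hypothetical helper name:

```shell
# serial_is_newer: succeed if serial $1 is newer than serial $2 under
# RFC 1982 serial arithmetic (32-bit, wrapping). Hypothetical helper.
serial_is_newer() {
    # Difference modulo 2^32.
    serial_diff=$(( ($1 - $2) & 4294967295 ))
    # Newer means the difference is in the range 1 .. 2^31 - 1.
    # A difference of exactly 2^31 is undefined in RFC 1982; treated as "not newer".
    [ "$serial_diff" -ne 0 ] && [ "$serial_diff" -lt 2147483648 ]
}
```

Example: `serial_is_newer 2026020401 2026020309 && echo "primary is ahead"`. If the secondary's serial turns out to be "newer" than the primary's, the secondary will not transfer at all, which looks exactly like a stuck secondary.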

Fourth: isolate capacity constraints

CPU pegged? Probably DNSSEC signing, log storms, or too many concurrent transfers.
Disk I/O pegged? Zone file writes/journals, or a pathological number of small updates.
Network saturated? Full transfers, retries, and too many secondaries refreshing at once.

Practical tasks: commands, outputs, and decisions (12+)

These are real operator moves: commands you can run, what the output is telling you, and the next decision. Adjust hostnames and paths to your environment, but keep the logic.

Task 1: Confirm which server is primary for a zone (from anywhere)

cr0x@server:~$ dig +norecurse SOA example.com @ns1.example.net
;; ANSWER SECTION:
example.com.     3600  IN  SOA  ns1.example.net. hostmaster.example.com. 2026020401 1200 300 1209600 300

What it means: The SOA MNAME is ns1.example.net. Serial is 2026020401.

Decision: That’s the authority you should be transferring from (unless you use hidden masters). If your “primary” isn’t listed, you’re in a multi-master or hidden-master design—confirm your intended master list.
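
To read the rest of that SOA line without counting fields by eye, a small helper can label them. This is a sketch, not part of BIND: soa_fields is a hypothetical name, and it assumes the one-line format that dig +short SOA prints.

```shell
# soa_fields: label the seven SOA RDATA fields (hypothetical helper).
# Reads one line in `dig +short SOA <zone>` format on stdin.
soa_fields() {
    awk '!/^;/ && NF >= 7 {
        printf "mname=%s\nrname=%s\nserial=%s\n", $1, $2, $3
        printf "refresh=%ss\nretry=%ss\nexpire=%ss\nnegative_ttl=%ss\n", $4, $5, $6, $7
        exit
    }'
}
```

Example: `dig +norecurse +short SOA example.com @ns1.example.net | soa_fields`. For the record above, that reads as refresh every 1200 s (20 minutes), retry after 300 s (5 minutes), expire after 1209600 s (14 days), and a negative TTL of 300 s.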

Task 2: Check if AXFR is (accidentally) open to the world

cr0x@server:~$ dig AXFR example.com @ns1.example.net
; Transfer failed.

What it means: Good sign. The server rejected the AXFR request from this source address.

Decision: If AXFR succeeds from an untrusted host, treat it as a security incident: tighten allow-transfer, add TSIG, and consider whether the zone content is sensitive.

Task 3: Test transfer from a known secondary and verify TSIG works

cr0x@server:~$ dig AXFR example.com @ns1.example.net -y hmac-sha256:transfer-key:7b3Nq9hR7oR6Zp3kVvQb8w==
; (1 server found)
example.com.     3600 IN SOA ns1.example.net. hostmaster.example.com. 2026020401 1200 300 1209600 300
example.com.     3600 IN NS  ns1.example.net.
example.com.     3600 IN NS  ns2.example.net.
www.example.com. 300  IN A   203.0.113.10
example.com.     3600 IN SOA ns1.example.net. hostmaster.example.com. 2026020401 1200 300 1209600 300
;; Query time: 45 msec
;; XFR size: 28 records (messages 1, bytes 2150)

What it means: Transfer succeeded with the TSIG key. You saw the opening and closing SOA (the listing between them is trimmed here) and the record count.

Decision: If you can transfer only with TSIG, you’re on the right track. If it succeeds without TSIG too, you still have an ACL problem.

Task 4: Verify TCP 53 is reachable (because AXFR/IXFR are TCP)

cr0x@server:~$ nc -vz ns1.example.net 53
Connection to ns1.example.net 53 port [tcp/domain] succeeded!

What it means: The TCP path exists.

Decision: If this fails but UDP queries work, fix firewall/security groups. NOTIFY can fool you into thinking “DNS is open” when transfers aren’t.

Task 5: On the primary, check what BIND thinks is happening with the zone

cr0x@server:~$ sudo rndc zonestatus example.com
name: example.com
type: master
files: /var/cache/bind/db.example.com
serial: 2026020401
nodes: 1421
last loaded: Wed, 04 Feb 2026 08:12:22 GMT
secure: no
dynamic: no

What it means: The master has serial 2026020401 loaded from a specific file. It’s not dynamic.

Decision: If you expected dynamic updates, your deployment model is lying to you. If serial doesn’t match your file, you might be editing the wrong file or using inline-signing or a hidden master.

Task 6: On the secondary, verify whether it’s actually pulling from the right master

cr0x@server:~$ sudo rndc zonestatus example.com
name: example.com
type: slave
masters: 203.0.113.11 port 53
serial: 2026020309
last loaded: Tue, 03 Feb 2026 19:41:03 GMT
next refresh: Wed, 04 Feb 2026 08:01:03 GMT

What it means: The secondary is behind (serial 2026020309 vs primary 2026020401). It believes the master is 203.0.113.11.

Decision: Confirm that IP is correct (hidden master patterns often trip people). If it’s wrong, fix the masters list. If it’s correct, you’re troubleshooting auth/reachability/transfer failure.

Task 7: Force the secondary to re-check and pull (controlled nudge)

cr0x@server:~$ sudo rndc refresh example.com
zone refresh queued

What it means: BIND will attempt to refresh; it’s not a guarantee of success.

Decision: Immediately watch logs for transfer attempts. If refresh does nothing, you may have a view mismatch or the zone isn’t loaded in that instance.

Task 8: Tail logs specifically for transfer messages

cr0x@server:~$ sudo journalctl -u bind9 -n 50 --no-pager
Feb 04 08:22:10 ns2 named[2143]: zone example.com/IN: refresh: retry limit reached
Feb 04 08:22:10 ns2 named[2143]: zone example.com/IN: Transfer started.
Feb 04 08:22:10 ns2 named[2143]: transfer of 'example.com/IN' from 203.0.113.11#53: connected using 203.0.113.22#47922
Feb 04 08:22:10 ns2 named[2143]: transfer of 'example.com/IN' from 203.0.113.11#53: failed while receiving responses: REFUSED

What it means: Primary is refusing the transfer. That’s almost always allow-transfer, TSIG mismatch, or view mismatch on the master.

Decision: Fix authorization on the primary (and be explicit). Do not “temporarily” set allow-transfer { any; }; and then forget about it.

Task 9: Confirm the master’s `allow-transfer` posture by inspecting config

cr0x@server:~$ sudo named-checkconf -p | sed -n '/zone "example.com"/,/};/p'
zone "example.com" {
	type master;
	file "/var/cache/bind/db.example.com";
	allow-transfer { key transfer-key; 203.0.113.22; };
	also-notify { 203.0.113.22; };
};

What it means: This zone allows transfers either from a TSIG-authenticated client using transfer-key or from the IP 203.0.113.22.

Decision: Prefer “key-only” for transfers unless you have a specific reason to permit IP-only. If you keep IPs, keep the ACL tight and audited.

Task 10: Validate zone file integrity before blaming networking

cr0x@server:~$ sudo named-checkzone example.com /var/cache/bind/db.example.com
zone example.com/IN: loaded serial 2026020401
OK

What it means: The zone parses and serial is what you think it is.

Decision: If this fails, fix the zone file first. A broken zone can prevent loads, prevent transfers, or cause secondaries to keep retrying forever.

Task 11: Check for transfer concurrency and whether you’re choking yourself

cr0x@server:~$ sudo rndc status
version: BIND 9.18.24-1ubuntu1.4-Ubuntu (Extended Support Version) <id:...>
running on ns1: Linux x86_64 5.15.0-97-generic
boot time: Wed, 04 Feb 2026 07:45:10 GMT
last configured: Wed, 04 Feb 2026 08:10:03 GMT
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
tcp clients: 12/150

What it means: xfers running and xfers deferred count inbound transfers (this server pulling as a secondary). rndc status reports no per-zone serial and no dedicated counter for outgoing transfers, so on a primary the tcp clients figure is the closest hint: if it is high, you might be in a transfer storm or being probed.

Decision: If the TCP client count is unexpectedly high, check who is connecting and whether your ACLs are too open. Also consider tuning transfers-out and friends (with care).

Task 12: Identify who is connecting to TCP/53 on the primary

cr0x@server:~$ sudo ss -tnp 'sport = :53' | head
State  Recv-Q Send-Q Local Address:Port Peer Address:Port  Process
ESTAB  0      0      203.0.113.11:53  203.0.113.22:47922 users:(("named",pid=2143,fd=42))
ESTAB  0      0      203.0.113.11:53  198.51.100.77:51244 users:(("named",pid=2143,fd=51))
ESTAB  0      0      203.0.113.11:53  198.51.100.88:51310 users:(("named",pid=2143,fd=63))

What it means: You have established TCP connections to port 53 from multiple peers. Only one of those is your known secondary.

Decision: If you see random internet IPs, you’re being scanned or you left transfers open. Lock down allow-transfer and consider firewalling TCP/53 to trusted secondaries only (if architecture allows).

Task 13: Prove NOTIFY is accepted only from the right hosts

cr0x@server:~$ sudo named-checkconf -p | sed -n '/options {/,/};/p' | sed -n '1,80p'
options {
	directory "/var/cache/bind";
	allow-notify { 203.0.113.11; };
	notify yes;
};

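What it means: This secondary accepts NOTIFY from 203.0.113.11 in addition to the primaries already listed for each zone.

Decision: By default BIND only processes NOTIFY from a zone’s configured primaries, so allow-notify is needed only for extra senders that are not in that list. Keep it tight: every address on it can ring the doorbell and make you waste time checking for updates.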

Task 14: Confirm the secondary is serving the updated serial (client-side reality check)

cr0x@server:~$ dig +norecurse SOA example.com @ns2.example.net
;; ANSWER SECTION:
example.com.     3600  IN  SOA  ns1.example.net. hostmaster.example.com. 2026020401 1200 300 1209600 300

What it means: The secondary is now serving the new serial.

Decision: If it still serves an old serial, the transfer didn’t land or didn’t load. Go back to logs and zone status.

Hardening BIND9 transfers without breaking secondaries

Rule 1: Explicitly define who may transfer, per zone

Start with the stance that transfers are denied by default. Then punch precise holes. In BIND, that’s typically allow-transfer on each master zone (or via a shared ACL). If you do it globally in options, you’ll eventually forget a zone that should be handled differently.
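
A minimal sketch of that stance, assuming the addresses and paths used elsewhere in this article. The ACL name secondaries-prod is illustrative.

```conf
// Named ACL so zone stanzas stay readable.
acl "secondaries-prod" { 203.0.113.22; };

options {
	directory "/var/cache/bind";
	// Deny by default; zones opt in explicitly below.
	allow-transfer { none; };
};

zone "example.com" {
	type master;   // newer releases also accept "type primary"
	file "/var/cache/bind/db.example.com";
	// Per-zone override: only the known secondaries may transfer.
	allow-transfer { secondaries-prod; };
	also-notify { 203.0.113.22; };
};
```

Running named-checkconf after a change like this catches syntax slips before a reload does.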

Rule 2: Prefer TSIG for transfers (and rotate keys like you mean it)

IP ACLs are necessary but not sufficient. IPs change. NAT lies. And “but it’s on a private network” is how you end up exporting a zone to a compromised host inside your own perimeter.

With TSIG, you get message authentication. You still need the IP policy, but TSIG gives you identity at the DNS layer. Use modern algorithms (like hmac-sha256) and keep key distribution controlled.
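
One way this can look in named.conf, as a sketch under stated assumptions: the key name matches the one used in the tasks above, the secret is a placeholder to be replaced with real tsig-keygen output, and the secondary's zone file name is hypothetical.

```conf
// --- On both servers: identical key definition ---
key "transfer-key" {
	algorithm hmac-sha256;
	secret "<base64 secret generated by tsig-keygen>";   // placeholder
};

// --- On the primary ---
zone "example.com" {
	type master;
	file "/var/cache/bind/db.example.com";
	// Require BOTH a known address and the key. The nested ACL rejects
	// any client other than 203.0.113.22 before the key is considered.
	allow-transfer { !{ !203.0.113.22; any; }; key transfer-key; };
};

// --- On the secondary ---
zone "example.com" {
	type slave;
	file "db.example.com";   // hypothetical local file name
	// Sign every request to this primary with the key.
	masters { 203.0.113.11 key transfer-key; };
};
```

A plain `allow-transfer { key transfer-key; 203.0.113.22; };` is an either/or match, which is why the nested form is used when both conditions must hold.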

Rule 3: Don’t forget NOTIFY policy (it’s a load lever)

NOTIFY isn’t a data leak, but it’s a workload trigger. If anyone can send it, anyone can cause your secondaries to run refresh logic and attempt transfers. That’s a nice little way to waste CPU and log space.

Rule 4: Design for failure domains, not just “it works”

Two practical architectures:

Hidden master: internal master(s), public secondaries only. Great security posture, but requires clean TCP/53 connectivity from secondaries to masters and careful ACLs.
Public primary: one of your public servers is master. Simpler routing, bigger attack surface, more reason to be strict with transfers.

Rule 5: Watch transfer volume as a first-class SLO

If you never graph transfer rates and failures, you will find out you have a transfer problem only after customers do. Count:

successful IXFR vs AXFR ratio (a sudden AXFR spike is a smoke alarm)
transfer failures (REFUSED, NOTAUTH, TSIG errors, timeouts)
zone load times and I/O wait
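
Before proper metrics exist, a rough count from the logs is better than nothing. The sketch below is an assumption-heavy illustration: summarize_xfer_log is a hypothetical helper, and the patterns assume named's usual wording ("AXFR started", "IXFR started", "AXFR-style IXFR started" on the sending side, plus the REFUSED line shown in Task 8 on the receiving side). Check them against your own logs before trusting the numbers.

```shell
# summarize_xfer_log: count transfer outcomes in named log lines on stdin.
# Hypothetical helper; the match patterns are assumptions about log wording.
summarize_xfer_log() {
    awk '
        /AXFR-style IXFR started/ { full++; next }   # IXFR request answered with a full zone
        /AXFR started/            { full++; next }
        /IXFR started/            { incr++; next }
        /failed while receiving responses: REFUSED/ { refused++ }
        END { printf "full=%d incremental=%d refused=%d\n", full, incr, refused }
    '
}
```

Example: `sudo journalctl -u bind9 --since "1 hour ago" --no-pager | summarize_xfer_log`. A full count that suddenly dwarfs the incremental count is the smoke alarm described above.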

Rule 6: Use limits—but use them like a surgeon, not like a gambler

BIND has knobs such as transfers-out, transfers-in, and serial-query-rate. They can prevent a meltdown, or they can slowly starve your secondaries so you serve stale data for hours. Apply limits after you understand normal transfer behavior and have visibility.
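
For orientation, this is where those knobs live. The values shown are BIND's documented defaults as far as I know, included only so the starting point is visible; they are not tuning advice.

```conf
options {
	// Concurrent outbound transfers this server will serve.
	transfers-out 10;
	// Concurrent inbound transfers this server will run.
	transfers-in 10;
	// Concurrent inbound transfers from any single remote server.
	transfers-per-ns 2;
	// SOA refresh queries sent per second.
	serial-query-rate 20;
};
```

Change one value at a time and watch serial drift afterwards; a limit that looks fine at noon can starve secondaries during a bulk zone update.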

Joke #2: A transfer limit set too low is like a meeting room calendar—technically organized, functionally a denial of service.

Three corporate mini-stories from the zone-transfer mines

Mini-story 1: The incident caused by a wrong assumption

A mid-sized company ran a hidden-master setup: internal master, two public secondaries. The network team assumed “DNS is UDP 53,” because most of the world experiences DNS as quick UDP queries. Their firewall rules allowed inbound UDP/53 to the secondaries and allowed the secondaries to query the master over UDP/53 for “DNS.”

NOTIFY worked. Sort of. The master sent NOTIFY, the secondaries received it, and immediately tried to pull an IXFR. Over TCP. Which was blocked. BIND logged transfer timeouts, then retries, then more retries. Nothing updated.

It stayed invisible for a while because TTLs were long and cached answers masked the staleness. Eventually, a routine certificate rotation added a new validation record. Some resolvers got the new record from the master’s internal view during troubleshooting, but the public secondaries never served it. External validations failed intermittently. The incident response team got to enjoy the special misery of “it works for me” with DNS.

They fixed it by allowing TCP/53 from secondaries to master and by explicitly monitoring secondary SOA serial drift. The postmortem conclusion was painfully simple: they assumed DNS meant UDP, and that assumption ran production for them—right up until it didn’t.

Mini-story 2: The optimization that backfired

A larger org had hundreds of zones. Transfers were noisy, so an engineer tried to “optimize away” transfer load by making refresh timers longer and by reducing NOTIFY chatter. The thought: fewer refreshes, fewer transfers, less load.

It worked in steady state. Graphs looked calmer. Then a deployment bug pushed a bad zone file to the master: missing an NS record that some resolvers relied on. They rolled back quickly. But here’s the sting: many secondaries had not refreshed yet because refresh intervals were now long. A portion of the fleet served the broken version for much longer than anyone expected.

Clients got a roulette wheel of answers depending on which authoritative server they hit. Support saw “sporadic DNS failures.” Engineering saw “it’s fixed.” Both were true. That’s the kind of truth that ruins afternoons.

The remediation was to return refresh/retry to sane defaults, keep NOTIFY enabled, and instead reduce transfer load by fixing the real cause: too many full AXFRs due to missing IXFR journals and sloppy reload patterns. They also introduced a rollback playbook that forced secondaries to refresh immediately after critical DNS fixes.

Mini-story 3: The boring, correct practice that saved the day

A financial services shop had strict change control and a lot of split-horizon DNS using BIND views. Not exciting. The interesting part was their discipline: every zone had a per-zone transfer ACL, TSIG was mandatory, and every change went through named-checkconf and named-checkzone before deployment. They also had a dashboard for “serial drift” between master and each secondary.

One day, a secondary VM was restored from an older snapshot after a hypervisor failure. It came up with an outdated TSIG key file (old secret) and a slightly wrong time due to a busted NTP config. Transfers failed. The secondary kept serving stale data.

The monitoring caught it fast: serial drift alert plus transfer failure messages. The on-call didn’t have to guess. They replaced the TSIG key, fixed NTP, and forced a refresh. The incident was short, contained, and—most importantly—boring to explain.

That’s the standard you want. Not heroics. Just tight controls, preflight checks, and visibility that tells you exactly which secondary is lying.

Common mistakes (symptoms → root cause → fix)

1) Symptom: Secondary never updates; logs show REFUSED

Root cause: Primary’s allow-transfer doesn’t include the secondary IP or TSIG key, or the secondary is hitting the wrong view on the primary.

Fix: On the primary, set allow-transfer explicitly for that zone (prefer TSIG). Confirm view matching by checking which source address/view is being used.

2) Symptom: Transfers succeed sometimes, fail other times

Root cause: Secondary has multiple masters listed; one is reachable but not authorized, or NAT changes source IP unpredictably, or you have anycast complications.

Fix: Use TSIG so auth doesn’t depend on source IP. Prune master lists to the intended set. If anycast is involved, make sure transfer targets are stable and not “nearest node roulette.”

3) Symptom: Sudden spike in AXFR volume; network utilization climbs

Root cause: IXFR not available (journal lost, inline-signing changes, frequent reloads), or secondaries forced into full transfers after failures.

Fix: Investigate why IXFR is failing. Confirm journals are enabled/retained. Avoid unnecessary zone reload patterns. Consider transfer limits only after you reduce full transfers.

4) Symptom: Secondaries serve old data after a rollback

Root cause: Refresh intervals increased too much, NOTIFY disabled, or rollback didn’t trigger forced refresh.

Fix: Keep NOTIFY on for managed secondaries. For rollbacks, use rndc notify on master and rndc refresh on secondaries to converge quickly.

5) Symptom: TSIG errors like “bad key” or “clock skew”

Root cause: Wrong secret, wrong algorithm, wrong key name, or time drift causing signed messages to be rejected.

Fix: Verify key definition matches exactly on both ends. Ensure NTP is healthy. Rotate keys carefully and keep old/new during migration if needed.

6) Symptom: Transfers work for one view but not another

Root cause: The zone exists in multiple views; allow-transfer is set only in one, or the secondary’s source address hits a different view than expected.

Fix: Make view intent explicit. Use match-clients and/or dedicated transfer interfaces. Configure transfers per view and test from the secondary’s actual source IP.
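
A sketch of one common way to make view selection explicit for transfers: let the TSIG key, not just the source address, decide which view a secondary lands in. Key names, the 10.0.0.0/8 range, and file paths are illustrative, and the key definitions themselves are omitted.

```conf
view "internal" {
	// A request signed with the external key must never match here,
	// even if it arrives from an internal address.
	match-clients { !key xfer-external; key xfer-internal; 10.0.0.0/8; };
	zone "example.com" {
		type master;
		file "/var/cache/bind/internal/db.example.com";
		allow-transfer { key xfer-internal; };
	};
};

view "external" {
	match-clients { any; };
	zone "example.com" {
		type master;
		file "/var/cache/bind/external/db.example.com";
		allow-transfer { key xfer-external; };
	};
};
```

Views are matched in order, so the internal view must come first for the negated key to have any effect.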

7) Symptom: “connection reset” or timeouts mid-transfer

Root cause: Middlebox killing long-lived TCP, MTU issues, or server resource pressure (file I/O, TCP backlog).

Fix: Capture with tcpdump on both sides, verify path MTU, increase system TCP limits if needed, and reduce full AXFR frequency by fixing IXFR/journals.

8) Symptom: Unauthorized parties can AXFR your zone

Root cause: allow-transfer { any; }; or no restriction at all on the master, combined with exposed TCP/53.

Fix: Lock down per zone with tight ACL/TSIG. Consider firewalling TCP/53 to known secondaries. Audit periodically with external tests.

Checklists / step-by-step plan

Step-by-step: lock down transfers without breaking replication

Inventory your zones and intended secondaries. Make a list: zone → master(s) → secondary IPs → whether TSIG is required.
Decide your policy baseline: TSIG required for all transfers unless there’s a documented exception.
Create dedicated ACLs per environment. Keep them readable. “acl secondaries-prod { … }” beats a graveyard of IPs in every zone stanza.
Implement per-zone allow-transfer. Prefer key-only. If you must include IPs, include only the secondaries.
Restrict NOTIFY handling on secondaries. Configure allow-notify to accept from the master(s) only.
Ensure firewall rules match the protocol reality. Allow secondary → master TCP/53. Allow master → secondary UDP/53 for NOTIFY (or accept refresh-only behavior if you choose to disable NOTIFY).
Enable logging that can prove what happened. You want transfer successes and failures to show up clearly, without drowning everything else.
Roll out gradually. Start with one zone and one secondary. Verify, then expand.
Validate convergence. Compare SOA serial on all servers after a change.
Add monitoring: serial drift, transfer failures, and transfer volume. Alert on drift beyond a threshold and on sustained failures.
Test from outside. Attempt AXFR from an untrusted network and confirm it fails.
Write down the exception process. If anyone needs temporary access, it has an expiration and a ticket, not a memory.
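
The convergence and drift checks in this plan can be sketched as a small script. Treat it as an illustration: soa_serial and check_serial_drift are hypothetical helper names, and the comparison is a plain equality test against the primary, which is enough for alerting but says nothing about which side is ahead.

```shell
# soa_serial: print the SOA serial a server reports for a zone (hypothetical helper).
#   $1 = zone    $2 = server
soa_serial() {
    dig +norecurse +time=3 +tries=1 +short SOA "$1" "@$2" \
        | awk '!/^;/ && NF >= 7 { print $3; exit }'
}

# check_serial_drift: compare each secondary with the primary.
#   $1 = zone    $2 = primary    remaining args = secondaries
# Returns 0 if all match, 1 on drift, 2 if the primary gave no answer.
check_serial_drift() {
    drift_zone=$1
    drift_primary=$2
    shift 2
    drift_want=$(soa_serial "$drift_zone" "$drift_primary")
    if [ -z "$drift_want" ]; then
        echo "UNKNOWN $drift_primary no SOA answer"
        return 2
    fi
    drift_rc=0
    for drift_ns in "$@"; do
        drift_have=$(soa_serial "$drift_zone" "$drift_ns")
        if [ "$drift_have" = "$drift_want" ]; then
            echo "OK $drift_ns $drift_have"
        else
            echo "DRIFT $drift_ns have=${drift_have:-none} want=$drift_want"
            drift_rc=1
        fi
    done
    return "$drift_rc"
}
```

Example: `check_serial_drift example.com 203.0.113.11 203.0.113.22`. Run from cron or a monitoring agent, and alert only when drift persists longer than a refresh interval so routine propagation does not page anyone.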

Operational checklist: when adding a new secondary

Confirm the secondary’s source IP(s) (especially with NAT/containers).
Generate and distribute TSIG key securely.
Add the key and server association on both ends.
Update allow-transfer on the master zone(s).
Update also-notify on the master (optional but recommended).
Update allow-notify on the secondary.
Test TCP/53 connectivity both ways as required.
Force initial transfer and confirm SOA serial matches.

Operational checklist: when transfers spike

Identify who is connecting to TCP/53.
Check logs for REFUSED vs TSIG vs timeout patterns.
Measure AXFR vs IXFR ratio.
Check for zone reload loops (automation pushing unchanged zones repeatedly).
Consider temporary rate/transfer limits only after you confirm authorization isn’t wide open.

FAQ

1) Should I disable zone transfers entirely?

Only if you don’t use secondary DNS. If you have secondaries, you need transfers (or an alternative replication mechanism). The right move is strict authorization + TSIG.

2) Is TSIG enough, or do I still need IP-based ACLs?

Use both where you can. TSIG authenticates at the DNS layer; IP ACLs reduce noise and exposure. Defense in depth, not theology.

3) Why do my secondaries update slowly even though NOTIFY is enabled?

Because NOTIFY just triggers a check. If TCP/53 is blocked, TSIG fails, or the master refuses transfers, the secondary will still wait and retry. Check logs for the first failure after NOTIFY.

4) What’s the quickest way to tell if I’m leaking zone data?

From an external host you don’t trust, try dig AXFR zone @auth. It should fail. If it doesn’t, fix it today, not after lunch.
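
To repeat that check across every advertised name server, something like the following sketch works. The function names are hypothetical, it relies on the ";; XFR size:" summary line dig prints after a successful transfer, and it only covers servers listed in the zone's NS set; hidden masters have to be checked by address.

```shell
# axfr_exposed: succeed if the server hands over the zone to this host.
#   $1 = zone    $2 = server
axfr_exposed() {
    dig +time=5 +tries=1 AXFR "$1" "@$2" | grep -q '^;; XFR size:'
}

# audit_zone: try AXFR against every NS the zone advertises.
#   $1 = zone
audit_zone() {
    for audit_ns in $(dig +short NS "$1"); do
        if axfr_exposed "$1" "$audit_ns"; then
            echo "EXPOSED $audit_ns"
        else
            # Covers refusals as well as timeouts and unreachable servers.
            echo "no-transfer $audit_ns"
        fi
    done
}
```

Example: `audit_zone example.com`, run from a host that should have no transfer rights. Any EXPOSED line is the "fix it today" case.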

5) AXFR works from one secondary but not another. Why?

Most common causes: missing that secondary’s IP in allow-transfer, TSIG misconfiguration on that node, or the secondary’s traffic is sourced from a different IP than you think (containers/NAT).

6) Can I run transfers over a non-standard port?

You can, but it’s usually self-harm. Standardizing on TCP/53 reduces weird firewall policies and reduces “tribal knowledge.” If you do change ports, document and test relentlessly.

7) Do BIND views affect zone transfers?

Yes. Transfers occur within a view context. If the secondary matches a different view than intended, it may get REFUSED or transfer the wrong dataset. Be explicit with match-clients and test from the secondary’s actual source IP.

8) What’s the safe way to rotate TSIG keys without downtime?

Run old and new keys in parallel during a transition: allow either key for transfers, deploy new key everywhere, verify success, then remove the old key. Also ensure clocks are correct.
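
The parallel-key window could look like this on the primary. It is a sketch: transfer-key-new is a hypothetical name for the replacement key, and both key statements are assumed to be defined elsewhere in the configuration.

```conf
// Step 1 (primary): accept either key while secondaries are migrated.
zone "example.com" {
	type master;
	file "/var/cache/bind/db.example.com";
	allow-transfer { key transfer-key; key transfer-key-new; };
};

// Step 2 (each secondary): point the masters clause at the new key, e.g.
//   masters { 203.0.113.11 key transfer-key-new; };
// then force a refresh and confirm the serial matches.

// Step 3 (primary): once every secondary transfers with the new key,
// drop "key transfer-key;" from allow-transfer and delete the old key.
```

Leaving step 3 undone is the usual failure: the old key keeps working indefinitely, which defeats the rotation.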

9) Is it okay to rely on “allow-transfer { none; }” globally and override per zone?

Yes. That’s a sane default. Just make sure you actually override it for the zones that need secondaries, and monitor for drift so you notice when you forgot.

10) If I firewall TCP/53 to only secondaries, do I still need `allow-transfer`?

Yes. Firewalls drift, rules get cloned, and internal networks aren’t inherently trusted. Keep the application-layer control.

Next steps

Do these in order if you want to sleep:

Audit exposure: attempt AXFR from an untrusted network against every authoritative server you operate.
Make transfers explicit: per-zone allow-transfer and per-secondary TSIG.
Fix the transport reality: ensure secondary → master TCP/53 is permitted; ensure NOTIFY paths are consistent with your firewall policy.
Add serial drift monitoring: if a secondary lags, you should know before users do.
Practice one controlled failure: intentionally break TSIG on a staging secondary and confirm your alerts, logs, and runbook lead you to the right fix.

Zone transfers don’t need to be exciting. If your transfers are exciting, your configuration is trying to tell you something. Listen to it.