Nothing ruins a calm on-call shift like “IP conflict detected” popping up on a server console, a helpdesk ticket, or a monitoring alert—followed by users describing the network as “haunted.” One minute the service is fine, the next minute it’s flapping, SSH sessions freeze, storage mounts hiccup, and some unlucky workstation keeps losing its default gateway.
This is one of those problems where you don’t need heroics. You need disciplined observation and a short list of moves that turn “mystery” into “MAC address on switchport 17.” The goal is simple: identify who is using the IP, where they’re connected, and why it happened—fast enough that you fix it once.
What an IP conflict really is (and why it looks random)
An IP conflict is not a “network error.” It’s an accounting error. Two hosts believe they own the same IP address on the same L2 broadcast domain (or on two domains that are accidentally bridged). The network will happily deliver packets to whichever host currently “wins” the mapping between IP and link-layer address (MAC for IPv4/ARP, or L2 address for IPv6/NDP). That mapping changes over time, which is why the failure looks intermittent and malicious.
What actually flips is the neighbor cache entry on other devices:
- In IPv4, ARP caches map IPv4 → MAC. A conflict causes ARP cache churn. You see logs like “duplicate address detected,” “ARP flux,” or “kernel: arp: duplicate address.”
- In IPv6, NDP (Neighbor Discovery Protocol) maps IPv6 → link-layer. IPv6 conflicts are less common but can be nastier because you might have multiple IPv6 addresses (SLAAC + DHCPv6 + static) and privacy addresses that complicate attribution.
Conflicts tend to show up in a handful of scenarios:
- Static IP squats in DHCP range. Someone hard-codes 192.168.1.50 on a printer because “it’s easier,” and DHCP later hands it out.
- Cloned VMs or containers with preserved network identity. A template is cloned, a MAC address is duplicated, or a config management mistake forces the same IP on multiple nodes.
- Rogue DHCP server. A consumer router plugged into an office port starts offering leases and nonsense gateways.
- Layer-2 extension you didn’t intend. A bridge, mispatched VLAN, or stretched network makes two sites share the same subnet unexpectedly.
- Gratuitous ARP storms. Some devices spam “I own this IP” announcements during failover or reboot, and other systems believe them.
One idea that survives every postmortem is a reliability truism often attributed to Richard Cook in operations circles, paraphrased here: “Systems succeed because people constantly adapt; failures happen when that adaptation can’t keep up.”
In IP conflicts, the “adaptation” is your network constantly relearning ARP/NDP—until it can’t.
Joke #1: An IP conflict is the only time two machines agree on something and everyone else suffers.
Fast diagnosis playbook
This is the part you print (or at least internalize). When you’re under pressure, do not start by rebooting random devices. Reboots can temporarily “fix” it by changing the last-seen MAC, which destroys evidence.
First: confirm it’s a real conflict and scope it
- Identify the contested IP from logs/alerts/user reports.
- Find at least two different MAC addresses claiming that IP (ARP/NDP evidence). If you only have one MAC, you may be chasing a routing or firewall issue.
- Determine the broadcast domain/VLAN where the conflict lives. “Same IP” is only a problem if it’s on the same L2 segment, or bridged segments behaving like one.
Second: locate the offender physically/logically
- From a host on the same VLAN, query ARP for the IP and note the MAC.
- On the switch, look up that MAC in the forwarding table to get the switchport (or trunk/uplink).
- Follow the MAC hop by hop until you reach an access port. If it’s wireless, you’ll land on a WLC/AP association table instead.
Third: decide whether it’s DHCP, static, or virtualization
- Check DHCP logs/leases for that IP: was it assigned, to which MAC, when?
- Check for a rogue DHCP server if clients have wrong gateway/DNS or leases from a weird source.
- Check virtualization platforms for cloned MACs or duplicate static config (VM templates, cloud-init, netplan, systemd-networkd, Kubernetes CNI).
Stop conditions (when you have enough)
- You can name the two MAC addresses and map each to a device/port.
- You know which one is “legit” (inventory, DHCP reservation, documented static allocation).
- You understand the mechanism: static-in-DHCP, rogue DHCP, clone, mis-VLAN, failover behavior.
Interesting facts and historical context
- ARP is older than most “modern” ops habits. ARP was standardized in 1982 (RFC 826), when networks were smaller and “duplicate IP” was mostly a human mistake.
- Gratuitous ARP has two lives. It’s used both for helpful announcements (failover, cache refresh) and for abuse (spoofing/poisoning). Same mechanism, different intent.
- Some OSes actively defend their IP. Many stacks send ARP probes before using an address (the address conflict detection procedure described in RFC 5227), and may log “address conflict” if replies come back.
- DHCP conflict detection exists, but it’s not magic. Many DHCP servers can ARP-ping an address before leasing it. This reduces conflicts but doesn’t prevent static misconfiguration after the fact.
- Duplicate MACs can masquerade as IP conflicts. If two NICs share a MAC (bad clones, mis-set MAC), switch learning becomes unstable, causing traffic to pinball between ports.
- IPv6 tried to make this rarer. Duplicate Address Detection (DAD) is built into IPv6 neighbor discovery. It helps, but it also creates its own failure modes when L2 is flaky.
- “ARP flux” is a real thing, not a spooky term. Linux can answer ARP on “the wrong interface” if configured loosely; multi-homed hosts can confuse peers.
- High availability uses the same tricks as attackers. VRRP, CARP, and many clustering systems rely on rapidly moving an IP and updating neighbor caches; when misconfigured, they look like conflicts.
Your toolkit: ARP, NDP, DHCP, switches, and packet captures
You’ll solve most IP conflicts with four tools: neighbor tables, DHCP logs, switch MAC tables, and packet captures. The trick is sequencing them so each answer narrows the search space. If you jump straight to tcpdump without knowing what you’re looking for, you’ll spend an hour reading noise.
IPv4: ARP is the courtroom
When two devices claim an IPv4 address, they announce it with ARP replies (sometimes unsolicited). Everyone else updates their cache. Your job is to catch those announcements and associate them with MACs and ports.
IPv6: NDP and DAD are the same drama with longer addresses
In IPv6, conflicts are often exposed during DAD: the host proposes an address and listens for objections. The objection is an NDP Neighbor Advertisement. Capturing that exchange is gold because it includes the offender’s link-layer address.
DHCP: the paper trail (when it exists)
DHCP provides structure: leases, reservations, timestamps. It also provides chaos when there’s more than one DHCP server, or when someone uses static IPs inside the pool. DHCP logs answer two vital questions: was this IP leased and to whom?
Switches and wireless controllers: where the body is buried
Once you have the offender MAC, your switch infrastructure can usually tell you the port (or the upstream device). That turns an “invisible network ghost” into a concrete ticket: “Port Gi1/0/17 has MAC aa:bb:cc:dd:ee:ff, labeled ‘Conference Room’.”
Packet capture: the truth serum
When logs lie, packets don’t. A short capture focused on ARP/NDP will show who is claiming the address and how often. That frequency matters: failover systems will burst then quiet; broken devices will spam forever; malware will be weirdly periodic.
Joke #2: ARP tables are like office seating charts—outdated, occasionally wrong, and everyone trusts them anyway.
Practical tasks: commands, outputs, decisions (12+)
These are real tasks you can run during an incident. Each includes: the command, an example output, what it means, and the decision you make next. Run them from a host on the affected VLAN whenever possible; running from “somewhere else” gives you stale or irrelevant neighbor data.
Task 1: Confirm the IP responds and see MAC resolution (IPv4)
cr0x@server:~$ ping -c 2 10.20.30.40
PING 10.20.30.40 (10.20.30.40) 56(84) bytes of data.
64 bytes from 10.20.30.40: icmp_seq=1 ttl=64 time=0.412 ms
64 bytes from 10.20.30.40: icmp_seq=2 ttl=64 time=0.398 ms
--- 10.20.30.40 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.398/0.405/0.412/0.007 ms
Meaning: The IP is alive (at least right now). That doesn’t prove conflict, but it sets up the ARP cache.
Decision: Immediately check the neighbor entry; don’t wait for it to age out.
Task 2: Inspect the ARP/neighbor entry (IPv4)
cr0x@server:~$ ip neigh show 10.20.30.40
10.20.30.40 dev eth0 lladdr 00:11:22:33:44:55 REACHABLE
Meaning: Your host currently believes 10.20.30.40 maps to MAC 00:11:22:33:44:55.
Decision: Record the MAC. Then watch for it to change; changes are the signature of a conflict.
Task 3: Watch ARP cache churn in real time
cr0x@server:~$ watch -n 1 "ip neigh show 10.20.30.40"
Every 1.0s: ip neigh show 10.20.30.40
10.20.30.40 dev eth0 lladdr 00:11:22:33:44:55 STALE
Every 1.0s: ip neigh show 10.20.30.40
10.20.30.40 dev eth0 lladdr aa:bb:cc:dd:ee:ff REACHABLE
Meaning: The MAC changed from 00:11:22:33:44:55 to aa:bb:cc:dd:ee:ff. That’s your smoking gun: two devices are answering for the same IP.
Decision: Now you hunt both MAC addresses through the network. Don’t “fix” yet; identify both endpoints first.
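Eyeballing watch works, but a timestamped log of each flip is easier to correlate with DHCP and switch logs later. The sketch below is one way to do that, assuming a Linux host with iproute2; lladdr_of and watch_neigh_changes are hypothetical helper names, not standard tools.

```shell
#!/bin/sh
# Print the link-layer address from one line of `ip neigh show` output on stdin.
# Prints nothing if the entry has no lladdr (for example FAILED or INCOMPLETE).
lladdr_of() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") { print $(i + 1); exit } }'
}

# Poll the neighbor entry for an IP and print a timestamped line only when
# the MAC differs from the previous sample. Runs until interrupted (Ctrl-C).
# Arguments: $1 = IP address, $2 = poll interval in seconds (default 1).
watch_neigh_changes() {
    ip_addr=$1
    interval=${2:-1}
    last=""
    while :; do
        mac=$(ip neigh show "$ip_addr" | lladdr_of)
        if [ -n "$mac" ] && [ "$mac" != "$last" ]; then
            printf '%s %s is-at %s (was %s)\n' \
                "$(date '+%F %T')" "$ip_addr" "$mac" "${last:-unknown}"
            last=$mac
        fi
        sleep "$interval"
    done
}
```

Typical use would be something like: watch_neigh_changes 10.20.30.40 1 | tee neigh-changes.log. The loop only sees what the local cache holds at each sample, so very fast flips can be missed; the packet capture in the next task is the more complete record.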
Task 4: Capture ARP claims (who is shouting “I own this IP”)
cr0x@server:~$ sudo tcpdump -ni eth0 arp and host 10.20.30.40
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:01:10.102345 ARP, Reply 10.20.30.40 is-at 00:11:22:33:44:55, length 28
12:01:10.508901 ARP, Reply 10.20.30.40 is-at aa:bb:cc:dd:ee:ff, length 28
Meaning: Two different MACs are emitting ARP replies for the same IP. This is definitive.
Decision: Move to switch/WLC lookup for each MAC. If you don’t have access, hand these MACs to whoever does.
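If the capture runs for more than a few seconds, a per-MAC count is quicker to read than raw lines, and the ratio hints at who is announcing more aggressively. A minimal sketch, assuming tcpdump was run with -n and prints ARP replies in the format shown above; arp_claimants is a hypothetical helper name, and it counts replies only (gratuitous ARP sent as requests is not counted).

```shell
#!/bin/sh
# Summarize tcpdump ARP output read from stdin.
# Prints "<count> <mac>" for every MAC that sent an ARP reply for the IP in $1.
arp_claimants() {
    awk -v ip="$1" '
        $2 == "ARP," && $3 == "Reply" && $4 == ip && $5 == "is-at" {
            mac = $6
            sub(/,$/, "", mac)   # tcpdump prints the MAC followed by a comma
            count[mac]++
        }
        END { for (m in count) print count[m], m }
    ' | sort -k2
}
```

Usage would look like: sudo tcpdump -ni eth0 -c 200 arp and host 10.20.30.40 | arp_claimants 10.20.30.40. More than one line of output means more than one claimant.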
Task 5: Check if your Linux host is the culprit (don’t laugh)
cr0x@server:~$ ip -4 addr show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 10.20.30.41/24 brd 10.20.30.255 scope global eth0
       valid_lft forever preferred_lft forever
Meaning: This host is not configured with 10.20.30.40. Good—still worth checking because self-inflicted wounds are common.
Decision: If it was configured with the contested IP, stop and correct it before chasing the network.
Task 6: Look for kernel duplicate address messages (Linux)
cr0x@server:~$ sudo journalctl -k --since "1 hour ago" | grep -Ei "duplicate|arp|conflict"
Feb 05 11:58:22 server kernel: arp: 10.20.30.40 is detected on eth0 with different hwaddr aa:bb:cc:dd:ee:ff
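Meaning: The kernel observed a mismatch for the IP/MAC mapping. This confirms the issue from the OS perspective. Not every kernel or distribution logs IPv4 duplicates, so an empty result here does not rule out a conflict.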
Decision: Use the timestamp to correlate with DHCP logs, switch CAM flaps, or a device reboot.
Task 7: Identify the “legitimate” owner via DHCP lease (ISC dhcpd example)
cr0x@server:~$ sudo grep -n "10.20.30.40" /var/lib/dhcp/dhcpd.leases | tail -n 12
1827:lease 10.20.30.40 {
1828: starts 3 2026/02/05 11:30:01;
1829: ends 3 2026/02/05 23:30:01;
1830: cltt 3 2026/02/05 11:30:01;
1831: binding state active;
1832: hardware ethernet 00:11:22:33:44:55;
1833: uid "\001\000\021\"3DU";
1834: client-hostname "acct-laptop-17";
1835:}
Meaning: DHCP believes 10.20.30.40 belongs to MAC 00:11:22:33:44:55 (client hostname included).
Decision: Treat 00:11:22:33:44:55 as the likely legitimate client if your DHCP scope is correct and not being poisoned. Now hunt aa:bb:cc:dd:ee:ff as the interloper.
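The ISC lease file is append-only: the same IP can appear several times, and the last declaration in the file is the one in force. Grepping with tail usually lands on it, but a small parser makes that explicit. This is a sketch under the assumption that the file follows the layout shown above; lease_owner is a hypothetical helper name, and hostnames containing spaces are not handled.

```shell
#!/bin/sh
# Print "<mac> <binding-state> <hostname>" for the LAST lease declaration of
# the IP in $1. $2 is the lease file (default: /var/lib/dhcp/dhcpd.leases);
# pass "-" to read from stdin. A missing hostname is printed as "-".
lease_owner() {
    awk -v ip="$1" '
        $1 == "lease" && $3 == "{" {
            in_lease = ($2 == ip)
            if (in_lease) { mac = ""; state = ""; host = "" }
            next
        }
        !in_lease { next }
        $1 == "hardware" && $2 == "ethernet" { mac = $3;   sub(/;$/, "", mac) }
        $1 == "binding"  && $2 == "state"    { state = $3; sub(/;$/, "", state) }
        $1 == "client-hostname"              { host = $2;  gsub(/[";]/, "", host) }
        $1 == "}" {
            if (host == "") host = "-"
            last = mac " " state " " host   # later declarations overwrite earlier ones
            in_lease = 0
        }
        END { if (last != "") print last }
    ' "${2:-/var/lib/dhcp/dhcpd.leases}"
}
```

For example, sudo sh -c '. ./lease_owner.sh; lease_owner 10.20.30.40' (the script filename is an assumption) prints a single line you can compare directly with the MACs from the ARP capture.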
Task 8: Check systemd-networkd DHCP client logs (to see what server answered)
cr0x@server:~$ journalctl -u systemd-networkd --since "2 hours ago" | grep -Ei "DHCPv4|lease|server"
Feb 05 11:31:04 server systemd-networkd[612]: eth0: DHCPv4 address 10.20.30.41/24 via 10.20.30.1
Feb 05 11:31:04 server systemd-networkd[612]: eth0: DHCPv4 server: 10.20.30.2
Meaning: Your host got its lease from DHCP server 10.20.30.2. If clients are getting leases from different servers, you may have a rogue DHCP situation.
Decision: If the “server” IP is unexpected, escalate to network team to locate the rogue and enable DHCP snooping/port security.
Task 9: Find the switchport from a MAC (generic approach via ARP + SNMP-friendly workflow)
cr0x@server:~$ arp -an | grep "10.20.30.40"
? (10.20.30.40) at aa:bb:cc:dd:ee:ff [ether] on eth0
Meaning: You have the MAC to search in the switch CAM table.
Decision: On the access switch, run a MAC address-table lookup (vendor-specific). If you can’t, send the MAC to someone who can.
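One practical snag: Linux prints MACs as aa:bb:cc:dd:ee:ff, while many switch CLIs expect another notation (Cisco IOS, for instance, uses the dotted form aabb.ccdd.eeff). The helpers below are a small sketch for converting between notations before you paste into a switch session; the function names are hypothetical.

```shell
#!/bin/sh
# Normalize a MAC written with ":", "-", or "." separators
# to 12 lowercase hex digits with no separators.
mac_hex() {
    printf '%s\n' "$1" | tr -d ':.-' | tr 'A-F' 'a-f'
}

# Dotted notation in groups of four hex digits: aabb.ccdd.eeff
mac_dotted() {
    mac_hex "$1" | sed 's/^\(....\)\(....\)\(....\)$/\1.\2.\3/'
}

# Colon notation as printed by Linux tools: aa:bb:cc:dd:ee:ff
mac_colon() {
    mac_hex "$1" | sed 's/\(..\)/\1:/g; s/:$//'
}
```

On a Cisco IOS switch the lookup is then typically along the lines of show mac address-table address aabb.ccdd.eeff; other vendors differ, so treat that command as an assumption about your platform.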
Task 10: Detect MAC flapping on Linux bridge (sometimes the “switch” is a host)
cr0x@server:~$ bridge fdb show br br0 | head
00:11:22:33:44:55 dev vnet3 master br0
aa:bb:cc:dd:ee:ff dev vnet7 master br0
Meaning: On a virtualization host, you can see which vNICs correspond to each MAC. This is often the fastest path in VM-heavy environments.
Decision: Map vnet3/vnet7 to VM names (libvirt, qemu, etc.) and check their IP configs or cloud-init.
Task 11: Check for duplicated IP configuration in netplan (common in image-based provisioning)
cr0x@server:~$ sudo cat /etc/netplan/01-netcfg.yaml
network:
  version: 2
  ethernets:
    eth0:
      addresses: [10.20.30.40/24]
      gateway4: 10.20.30.1
      nameservers:
        addresses: [10.20.30.2,8.8.8.8]
Meaning: This host is statically configured with the contested IP. If you see this on multiple machines due to a golden image, you’ve found your conflict factory.
Decision: Fix the provisioning pipeline: unique addressing via DHCP or templated static allocation. Then change the offending host’s IP and flush caches.
Task 12: Flush neighbor cache after the fix (client-side stabilization)
cr0x@server:~$ sudo ip neigh flush 10.20.30.40
Meaning: You removed the stale mapping so the host can relearn the correct MAC quickly.
Decision: Do this on impacted servers after remediation, especially if they pinned the “wrong” MAC and are still failing.
Task 13: IPv6 conflict check: view neighbor entry and DAD hints
cr0x@server:~$ ip -6 neigh show dev eth0 | head
fe80::a00:27ff:fe12:3456 lladdr 08:00:27:12:34:56 router REACHABLE
2001:db8:20:30::40 lladdr 00:11:22:33:44:55 STALE
Meaning: You have an IPv6 neighbor mapping. If it flips between two lladdrs, you have the IPv6 version of the problem.
Decision: Capture NDP (next task) and look for Neighbor Advertisements from multiple sources.
Task 14: Capture IPv6 NDP and look for a specific address
cr0x@server:~$ sudo tcpdump -ni eth0 "icmp6 and (ip6[40] == 135 or ip6[40] == 136)"
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:07:11.001122 IP6 fe80::1 > ff02::1:ff00:40: ICMP6, neighbor solicitation, who has 2001:db8:20:30::40, length 32
12:07:11.003210 IP6 fe80::a00:27ff:fe12:3456 > fe80::1: ICMP6, neighbor advertisement, tgt is 2001:db8:20:30::40, length 32
12:07:11.004001 IP6 fe80::b00:dead:beef:1 > fe80::1: ICMP6, neighbor advertisement, tgt is 2001:db8:20:30::40, length 32
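Meaning: The filter matches every neighbor solicitation (ICMPv6 type 135) and neighbor advertisement (type 136) on the segment, so look for the contested address in the tgt field. (ip6[40] reads the ICMPv6 type byte and assumes no IPv6 extension headers.) Here, two different link-local sources are advertising ownership of the same IPv6 address. That’s a conflict.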
Decision: Extract the L2 addresses (use -e on tcpdump if needed) and trace them through switching/wireless like you would for ARP.
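With -e added, each line starts with the source MAC, which makes the same per-claimant summary possible for IPv6. The sketch below assumes tcpdump -e -n output where the second field is the source MAC and the advertisement carries "tgt is <address>,"; na_claimants is a hypothetical helper name. The target must be written the way tcpdump prints it (compressed form).

```shell
#!/bin/sh
# Summarize `tcpdump -e -n` output read from stdin.
# Prints "<count> <source-mac>" for every MAC that sent a neighbor
# advertisement whose target is the IPv6 address in $1.
na_claimants() {
    awk -v tgt="$1" '
        /neighbor advertisement, tgt is/ {
            for (i = 1; i < NF; i++)
                if ($i == "tgt" && $(i + 1) == "is") {
                    t = $(i + 2)
                    sub(/,$/, "", t)              # strip the trailing comma
                    if (t == tgt) count[$2]++     # $2 is the source MAC with -e
                    break
                }
        }
        END { for (m in count) print count[m], m }
    ' | sort -k2
}
```

Usage would be roughly: sudo tcpdump -e -ni eth0 "icmp6 and ip6[40] == 136" | na_claimants 2001:db8:20:30::40. Two output lines mean two L2 addresses are answering for the same IPv6 address.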
Task 15: Detect rogue DHCP offers on the wire (DORA inspection)
cr0x@server:~$ sudo tcpdump -ni eth0 -vvv -c 20 "udp and (port 67 or port 68)"
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:10:01.100001 IP (tos 0x0, ttl 64, id 1001, offset 0, flags [none], proto UDP (17), length 328)
0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:11:22:33:44:55, length 300, xid 0x1a2b3c4d
12:10:01.101111 IP (tos 0x0, ttl 64, id 2001, offset 0, flags [none], proto UDP (17), length 342)
10.20.30.2.67 > 10.20.30.41.68: BOOTP/DHCP, Reply, length 314, xid 0x1a2b3c4d, yiaddr 10.20.30.41
12:10:01.101999 IP (tos 0x0, ttl 255, id 3001, offset 0, flags [none], proto UDP (17), length 342)
192.168.0.1.67 > 10.20.30.41.68: BOOTP/DHCP, Reply, length 314, xid 0x1a2b3c4d, yiaddr 10.20.30.99
Meaning: Two different DHCP servers responded: 10.20.30.2 (expected) and 192.168.0.1 (highly suspicious on this subnet). That’s a rogue DHCP server, which can indirectly cause IP conflicts and definitely causes misrouting.
Decision: Engage network/security: shut the port, enable DHCP snooping, locate the rogue device by MAC from the DHCP packet (-e) and switchport lookup.
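To turn a longer DHCP capture into a short list of responders, you can count replies per source IP. A minimal sketch, assuming tcpdump -n output in the format shown above (with or without -v); dhcp_reply_sources is a hypothetical helper name. On relayed segments the source may be the relay agent rather than the server itself, so read the result with your topology in mind.

```shell
#!/bin/sh
# Summarize tcpdump DHCP output read from stdin.
# Prints "<count> <source-ip>" for every host that sent a BOOTP/DHCP reply.
dhcp_reply_sources() {
    awk '
        {
            # Find "<src>.<port> > <dst>.<port>: BOOTP/DHCP, Reply," anywhere on the line.
            for (i = 2; i + 3 <= NF; i++)
                if ($i == ">" && $(i + 2) == "BOOTP/DHCP," && $(i + 3) == "Reply,") {
                    src = $(i - 1)
                    sub(/\.[0-9]+$/, "", src)   # drop the trailing source port (".67")
                    seen[src]++
                    break
                }
        }
        END { for (s in seen) print seen[s], s }
    ' | sort -k2
}
```

Anything in the output that is not one of your documented DHCP servers (or relays) is a candidate rogue and worth tracing by MAC.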
Three corporate mini-stories from the trenches
Story 1: The incident caused by a wrong assumption
The company was mid-migration from a legacy office network to a new segmented design. A small team spun up a “temporary” subnet for a lab: same RFC1918 range as production, because “it’s isolated.” They were confident because the lab switch was in a different closet and the lab router had a different WAN uplink. Isolation, by vibes.
Two months later, a contractor needed remote access to the lab environment. Someone added a VPN tunnel and bridged the lab VLAN into the campus core “just for a week.” Nobody updated the network diagrams. Nobody updated the IPAM records. The tunnel stayed.
Then came the fun: production endpoints started intermittently losing access to a file service. The service IP was stable; the MAC was not. Helpdesk blamed the storage team. Storage blamed the firewall. The firewall team blamed DNS because that’s what firewall teams do when they’re bored.
The breakthrough was painfully simple: an ARP capture showed two MACs claiming the file service IP. One was the real server. The other was a lab VM configured with a copy-pasted netplan file from a wiki page. The “isolated” lab had become part of the production broadcast domain through a bridging mistake.
The fix was not heroic. It was careful: remove the bridge, audit for shared RFC1918 overlap, and implement a rule—no overlapping address space between environments unless you can prove L2 and L3 separation. The postmortem conclusion wasn’t “people messed up.” It was “we assumed isolation without verifying the path.”
Story 2: The optimization that backfired
A different org was proud of their rapid provisioning. They had a golden VM image that booted fast and joined the fleet with minimal external dependencies. To reduce boot time, they decided to preconfigure static addressing in the image for a special service tier: fewer DHCP calls, faster readiness, fewer moving parts. On paper, tidy.
The image included /etc/netplan/01-netcfg.yaml with a static IP. The plan was “we’ll override it per-host during provisioning.” But the override relied on a cloud-init step that occasionally failed when the metadata service was slow. Those machines came up with the baked-in IP anyway.
It didn’t explode immediately. It smoldered. Only when scaling events happened—like patch windows or a sudden traffic spike—would several instances boot simultaneously and collide on the same IP. The monitoring graph looked like a heartbeat: up, down, up, down. The team called it “the flapper.”
They tried to dampen it: longer ARP cache timeouts, keepalives, “maybe it’s the load balancer.” All of that was symptom management. The actual optimization—static IP baked into the image—was the root of the chaos.
The fix was to stop being clever: remove static addressing from the base image, let DHCP do its job, and use reservations where stable IPs were required. Boot time increased slightly. The incident count dropped dramatically. That’s a trade most businesses can live with.
Story 3: The boring but correct practice that saved the day
In a heavily regulated environment, the network team kept an unglamorous habit: every VLAN had documented DHCP scopes, exclusions, and reserved static ranges. They also enforced that “static” addresses were only assigned out of the reserved range, never inside the DHCP pool. No exceptions, no “just for a printer,” no “it’s temporary.”
One morning, a wave of IP conflict alerts hit a clinical application segment. Triage began: ARP captures showed two MACs for the same IP—one belonging to a known thin client model, the other unknown. The team pulled the DHCP lease and immediately saw the intended owner. That narrowed the culprit to “something static or rogue,” not “DHCP randomly misbehaving.”
Next, they did the most boring thing imaginable: switch MAC lookup. The unknown MAC landed on an access port in a meeting room that was not supposed to be on that VLAN. The port was patched to a small unmanaged switch. From there: a consumer Wi‑Fi router someone brought from home. It was bridging and offering DHCP, and also had a static IP set from a previous environment that collided with a leased address.
The resolution took minutes: disable the port, remove the device, flush neighbor caches on key servers, confirm no other rogue DHCP sources. The post-incident report was satisfyingly dull: documented scopes and switchport tracing worked exactly as designed.
Boring wins. That’s not a slogan; it’s an operational truth.
Common mistakes: symptom → root cause → fix
1) Symptom: “It works for some users but not others”
Root cause: ARP caches differ across clients; some have MAC A for the IP, others have MAC B. The service appears randomly reachable.
Fix: Capture ARP replies to confirm dual claimants, then trace MACs via switch tables. After remediation, flush neighbor caches on critical clients/servers.
2) Symptom: “Conflict happens only after reboots or failover”
Root cause: HA pair misconfiguration (VRRP/CARP/keepalived) where both nodes think they’re master, or failover sends gratuitous ARP too aggressively and confuses upstream devices.
Fix: Validate HA state machine, preemption settings, and health checks. Ensure only one node holds the VIP. Capture ARP during failover to prove behavior.
3) Symptom: “We see duplicate IP alerts, but the IP doesn’t ping”
Root cause: The address is being claimed via ARP but host firewall drops ICMP; or you’re observing ARP spoofing / security scanning; or the conflict exists in a different VLAN than where you’re testing.
Fix: Don’t use ping as the truth. Use ARP/NDP evidence and switchport location. Verify VLAN context (trunk tagging, access VLAN).
4) Symptom: “New devices keep getting wrong gateway/DNS”
Root cause: Rogue DHCP server, often a consumer router or a misconfigured VM running dnsmasq.
Fix: Identify DHCP offer sources with tcpdump, then isolate by MAC and switchport. Implement DHCP snooping and block DHCP server ports at the edge.
5) Symptom: “The same MAC shows up on two switchports”
Root cause: Duplicate MAC address (cloned VMs, manual MAC setting, buggy NIC firmware) or a loop/unmanaged switch causing CAM instability.
Fix: Confirm in virtualization inventory; enforce unique MAC generation. Check for L2 loops; ensure STP is enabled; consider port security to limit MAC moves.
6) Symptom: “IP conflicts only on Wi‑Fi”
Root cause: Client isolation/roaming quirks, multiple SSIDs bridged to the same VLAN unexpectedly, or a misconfigured wireless controller relaying DHCP incorrectly.
Fix: Correlate by WLC client table using MAC, verify SSID-to-VLAN mapping, and look for rogue APs bridging networks.
7) Symptom: “We fixed it, but it came back a day later”
Root cause: You fixed the symptom (changed one host IP) but not the mechanism (template, DHCP overlap, unmanaged device that returns).
Fix: Root-cause discipline: find the source device and the reason it had that IP. Update IPAM, enforce exclusions, and add detection.
Checklists / step-by-step plan
Incident checklist: 15 minutes to clarity
- Write down: contested IP, affected VLAN/subnet, first observed time, who reported it.
- From a host on the same VLAN: ip neigh show <IP> and record MAC.
- Run: tcpdump -ni <if> arp and host <IP> for 30–60 seconds. Record all MACs seen.
- Check DHCP leases/logs: is the IP leased? to which MAC? Any unusual DHCP server IPs?
- Trace MAC(s) on the switch: find access port or upstream link. Repeat hop-by-hop.
- Identify device: asset inventory, vendor OUI, DHCP hostname, virtualization mapping, wireless client entry.
- Stop the bleeding: disable offending port, remove static config, or correct HA state. Avoid “just reboot it.”
- Stabilize: flush neighbor caches on key servers, restart affected services if needed.
- Prevent recurrence: fix DHCP scope overlaps, update IPAM, add guardrails (snooping, reservations, template fixes).
Prevention checklist: make conflicts boring again
- Maintain a reserved static range per VLAN and keep it out of DHCP pools.
- Enable DHCP conflict detection on the server where available (ARP check before offer).
- Enable DHCP snooping and Dynamic ARP Inspection (where supported) on access switches.
- Enforce unique MAC addressing in VM templates; never bake static IPs into golden images unless you also bake uniqueness.
- Log and alert on MAC flaps for the same IP (on firewall, switches, or via passive sensors).
- Keep a minimal “whois for MACs” process: OUI lookup + internal inventory mapping.
- Document HA virtual IP behavior (VRRP/CARP/keepalived) and test split-brain scenarios.
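The first item on this list is easy to audit mechanically if you have a list of statically assigned addresses (from IPAM, config management, or a spreadsheet export). The sketch below flags any static address that falls inside a DHCP pool; the function names and the pool boundaries in the usage example are hypothetical, and it handles plain dotted-quad IPv4 without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeeds (exit 0) if IP $1 lies within the inclusive range $2..$3.
in_pool() {
    ip=$(ip_to_int "$1"); lo=$(ip_to_int "$2"); hi=$(ip_to_int "$3")
    [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

# Read static IPs (one per line) on stdin and print those that fall
# inside the DHCP pool given as $1 (first address) and $2 (last address).
statics_in_pool() {
    while read -r addr; do
        [ -n "$addr" ] || continue
        if in_pool "$addr" "$1" "$2"; then
            echo "$addr"
        fi
    done
}
```

For example, statics_in_pool 10.20.30.100 10.20.30.200 < static-ips.txt prints every documented static address that DHCP could also hand out. An empty result is the goal.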
Post-incident verification checklist
- Confirm ARP/NDP for the IP resolves to exactly one MAC over a 5–10 minute watch.
- Confirm DHCP has no active lease for that IP bound to the “wrong” MAC (or clear it).
- Confirm switch CAM table shows the MAC on the expected port and is stable.
- Confirm affected applications are stable: no connection resets, no intermittent timeouts.
- Write the root cause in one sentence that includes the mechanism (not just the culprit device).
FAQ
1) How do I know it’s an IP conflict and not DNS?
DNS issues don’t change MAC addresses. If ip neigh (or ARP captures) show multiple MACs for the same IP, it’s a conflict. DNS can be broken too, but it’s a different failure signature.
2) Why does the problem come and go?
Because ARP/NDP caches age and refresh. Whichever device last announced ownership “wins” until something triggers a refresh (traffic, cache expiry, gratuitous ARP, neighbor solicitation).
3) Can I just clear the ARP table to fix it?
Clearing neighbor caches can restore connectivity briefly, but it doesn’t remove the second claimant. It’s a stabilization step after you fix the underlying cause, not a cure.
4) What if both devices are “legitimate,” like an HA pair?
Then only one of them should be master for the VIP at any time. If both are claiming it, you have split brain or misconfigured failover. Capture ARP during the event and verify the HA state transitions.
5) How do I find the device if I only have an IP and no switch access?
Use ARP capture to get the MAC. Then use whatever inventory you have: DHCP lease hostname, virtualization host bridge tables, wireless controller client lists, or endpoint management records that index by MAC.
6) Do IPv6 networks get IP conflicts?
Yes, but they’re often detected earlier thanks to Duplicate Address Detection. Still, mis-bridged segments, manual static IPv6, or cloned images can create duplicates. Use NDP captures and neighbor tables just like ARP.
7) What’s the fastest way to prove there are two devices?
tcpdump on ARP (or NDP for IPv6) and a watch loop on ip neigh show. If the MAC alternates, you have proof that’s hard to argue with.
8) Why do printers and IoT devices show up in these incidents so often?
They get “temporarily” statically configured, moved between networks, and rarely updated. They also tend to live under desks or in closets, which makes them perfect villains.
9) Should we put everything on DHCP to avoid conflicts?
For most endpoints and general servers, yes. For infrastructure needing stable addresses, use DHCP reservations or a documented static range outside the pool. “Some static, some DHCP, no documentation” is how you invite conflicts.
Conclusion: next steps that prevent repeats
When an IP conflict hits, the winning move is to stop guessing and start collecting two identifiers: the contested IP and the claiming MAC addresses. From there, it’s pure mechanics: map MAC to port (or VM vNIC, or Wi‑Fi association), validate DHCP intent, and remove the extra claimant.
Your practical next steps:
- Adopt the fast playbook and require ARP/NDP evidence before “fixes.”
- Clean up addressing: keep static IPs out of DHCP pools, and document reserved ranges.
- Harden the edge: DHCP snooping, ARP inspection, and sane port policies where your hardware supports them.
- Fix your templates: no baked-in static IPs, no duplicated MACs, and no “temporary” configs that become permanent.
If you do those four, “IP conflict detected” goes back to being a rare curiosity instead of a recurring character in your incident queue.