You log into a Proxmox node and the UI greets you with: “cluster not ready – no quorum?”.
Half your buttons are greyed out. VM starts fail. Storage looks “unknown”.
Everyone’s first impulse is to click harder. Second impulse is to “just force it”. Both are how you turn a bad day into a career-limiting event.
This is the guide you want when the cluster is wobbling and you’re trying to bring it back without triggering split-brain, corrupting cluster config, or causing a second outage while fixing the first.
What quorum really means in Proxmox (and why it blocks you)
Proxmox clustering is built on Corosync for membership and messaging, plus pmxcfs (Proxmox Cluster File System)
to replicate configuration (VM configs, storage definitions, firewall rules, users, etc.) across nodes. When quorum is lost, Proxmox intentionally
stops making “cluster-wide” changes because it cannot safely know whether it’s the only living truth.
Think of quorum as the cluster’s ability to answer one question confidently: “Are we the legitimate majority view?”
Without that, two halves of a partition can both believe they’re in charge. That’s split-brain. Split-brain is not an exciting architecture pattern;
it’s just corruption with better marketing.
In Proxmox, losing quorum usually manifests as:
- UI banner: “cluster not ready – no quorum?”
- CLI errors: pvesh / pveproxy errors, inability to write to /etc/pve
- VM operations blocked: especially HA-managed workloads
- Weird storage state: because parts of config live in /etc/pve
Note the nuance: your VMs might still be running. Your storage might still be fine. Your network might be fine.
Quorum loss is primarily a cluster coordination problem. The danger is that it tempts you to take actions that create a data problem.
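You can see that split for yourself in about ten seconds: the quorum layer says no, while local reads and running workloads still answer. A minimal sketch, no changes made (qm list and pct list only read state):
cr0x@pve1:~$ pvecm status | grep -i quorate
Quorate: No
cr0x@pve1:~$ qm list
cr0x@pve1:~$ pct list
If those lists come back populated while Quorate says No, you have a coordination problem, not (yet) a data problem.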
One quote worth keeping in your head during cluster incidents:
“Hope is not a strategy.”
— Gene Kranz
Joke #1: Quorum is like a meeting that only counts if enough people show up. For once, the cluster is the adult in the room.
Fast diagnosis playbook (first/second/third checks)
When you’re in the middle of a disruption, you don’t need theory. You need to find the bottleneck quickly, choose the least risky recovery path,
and avoid “fixes” that only work because you haven’t noticed the second failure yet.
First: confirm it’s really quorum, not a UI or proxy problem
- Run pvecm status on the node you’re on.
- Check systemctl status pve-cluster corosync.
- Check if /etc/pve is mounted and responsive.
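The same three checks as one copy-paste block, if you prefer. A minimal sketch; if the last command hangs instead of answering, that hang is itself your answer about pmxcfs:
cr0x@pve1:~$ pvecm status | grep -iE "quorate|expected votes|total votes"
cr0x@pve1:~$ systemctl is-active pve-cluster corosync
cr0x@pve1:~$ stat -f /etc/pve >/dev/null && echo "/etc/pve responds"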
Second: determine membership reality (who is alive, who can talk)
- Run pvecm nodes.
- Check Corosync ring status (corosync-cfgtool -s or corosync-quorumtool -s).
- From each node, test ring interfaces with ping/arping and validate routes/VLANs.
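For the ping part, a small loop beats typing addresses by hand. A sketch using the ring IPs of the example cluster in this article (10.10.10.12 and .13); substitute your own:
for ip in 10.10.10.12 10.10.10.13; do
  ping -c 2 -W 1 "$ip" >/dev/null && echo "$ip reachable" || echo "$ip UNREACHABLE"
done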
Third: decide the recovery class
Pick one of these, in this order of preference:
- Restore missing nodes or networking so the original cluster regains quorum naturally.
- Temporarily adjust expected votes only to match the reality you can prove is safe.
- Last resort: force a single node to become operational long enough to stabilize (with explicit acceptance of risk).
- Disaster recovery: rebuild cluster and rejoin nodes, after freezing the old state.
If you can’t articulate which recovery class you are in, you are not ready to type commands that change quorum.
Interesting facts and context (why this stuff is the way it is)
- Fact 1: “Quorum” in Corosync is provided by the votequorum service, which decides if the partition has enough votes to operate.
- Fact 2: Proxmox’s /etc/pve is a distributed filesystem (pmxcfs) stored in RAM and synchronized via Corosync; it’s not a normal directory.
- Fact 3: The “expected votes” mechanism exists because clusters change size; it can also be abused to “convince” a partition it’s quorate.
- Fact 4: Two-node clusters are inherently awkward for quorum: without a third vote, a partition can’t distinguish “peer is down” from “link is down”.
- Fact 5: The idea behind quorum and majority voting goes back decades in distributed systems, and is a practical compromise: safety over availability during partitions.
- Fact 6: Corosync uses ring IDs and membership transitions; frequent ring changes usually mean packet loss, MTU mismatch, or unstable links.
- Fact 7: Proxmox HA uses cluster state; if quorum is lost, HA generally refuses to do anything “clever” because “clever” is how you double-start VMs.
- Fact 8: The qdevice concept (external tie-breaker) exists largely because organizations insist on even-number clusters and then act surprised when physics happens.
Joke #2: A two-node cluster without a tie-breaker is like two managers arguing in Slack. The only winner is the incident channel.
Safety rules: do these before touching anything
1) Decide what you are protecting: VM integrity, storage integrity, or just the UI
If storage is shared (Ceph, SAN, NFS, iSCSI), the biggest risk is two nodes believing they own the same writable resource.
If storage is local (ZFS per node), the risk shifts toward configuration drift and failed HA operations rather than raw block corruption.
2) Freeze automation
If you run any external automation that changes Proxmox config (Terraform, Ansible, scripts calling API), pause it.
Quorum incidents are where idempotence goes to die because the cluster doesn’t accept writes consistently.
3) Don’t “fix” by rebooting everything
Rebooting can sometimes clear a stuck state, but it can also destroy the evidence you needed: logs, ring transitions, or the one node that still had the correct config.
Treat reboots as controlled interventions, not therapy.
4) If shared storage is in play, establish fencing facts
If you have shared LUNs or NFS exports mounted read-write by multiple nodes, you need to know whether there is any fencing mechanism:
IPMI power control, SAN-level SCSI reservations, Ceph OSD rules, etc. If the answer is “uh”, do not force quorum casually.
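With IPMI access, “establish fencing facts” can be as literal as asking each suspect node’s BMC whether the chassis has power. A sketch; the BMC address and credentials below are placeholders for your own out-of-band details:
cr0x@pve1:~$ ipmitool -I lanplus -H 10.20.0.13 -U admin -P 'REDACTED' chassis power status
Chassis Power is off
“Power is off” is a fencing fact. “The monitoring dashboard looked red” is not.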
Practical tasks with commands: read, decide, act
Below are real tasks you can run on a Proxmox node. Each includes: command, example output, what it means, and the decision you make.
Use them as a drill-down, not as a magical incantation list.
Task 1: Confirm quorum state and expected votes
cr0x@pve1:~$ pvecm status
Cluster information
-------------------
Name: prod-cluster
Config Version: 12
Transport: knet
Secure auth: on
Quorum information
------------------
Date: 2025-12-25 10:14:18
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.2c
Quorate: No
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:
Meaning: This partition has 1 vote but expects 3; quorum requires 2. You are not quorate.
Decision: Don’t touch HA or cluster-wide config. Next, figure out where the other two votes went: nodes down or network partition.
Task 2: See what nodes the cluster thinks exist
cr0x@pve1:~$ pvecm nodes
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 pve1
0x00000002 1 pve2
0x00000003 1 pve3
Meaning: Cluster config expects three nodes with one vote each.
Decision: If pve2/pve3 are intentionally decommissioned, you need a controlled membership change. If they’re supposed to be alive, fix their reachability first.
Task 3: Check corosync service health on the local node
cr0x@pve1:~$ systemctl status corosync --no-pager
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: active (running) since Thu 2025-12-25 10:01:03 UTC; 13min ago
Docs: man:corosync
Main PID: 1189 (corosync)
Tasks: 10
Memory: 45.2M
CPU: 7.412s
Meaning: Corosync is running. This is not “service is dead”; it’s “service can’t see enough peers.”
Decision: Focus on network/ring/peer health, not on restarting blindly.
Task 4: Check pve-cluster and pmxcfs readiness
cr0x@pve1:~$ systemctl status pve-cluster --no-pager
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled)
Active: active (running) since Thu 2025-12-25 10:01:10 UTC; 13min ago
Main PID: 1260 (pmxcfs)
Tasks: 12
Memory: 78.0M
CPU: 3.201s
Meaning: pmxcfs is running, but may be in a read-only-ish mode for cluster writes due to no quorum.
Decision: Validate whether /etc/pve is accessible and whether writes are blocked.
Task 5: Confirm /etc/pve is mounted (and not wedged)
cr0x@pve1:~$ mount | grep /etc/pve
pve on /etc/pve type fuse.pve (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
Meaning: pmxcfs is mounted. That’s good.
Decision: If it’s missing, you have a bigger local node problem (pmxcfs not mounted), not just quorum loss.
Task 6: Check if cluster filesystem is writable (practically)
cr0x@pve1:~$ touch /etc/pve/.rw-test
touch: cannot touch '/etc/pve/.rw-test': Read-only file system
Meaning: With no quorum, pmxcfs refuses writes (common behavior).
Decision: Stop trying to “edit your way out.” Restore quorum first, or use a controlled single-node recovery procedure if that’s the chosen path.
Task 7: Look at Corosync ring status (link-level signal)
cr0x@pve1:~$ corosync-cfgtool -s
Local node ID 1, transport knet
LINK ID 0 udp
addr = 10.10.10.11
status:
nodeid: 1: connected
nodeid: 2: disconnected
nodeid: 3: disconnected
Meaning: pve1 is connected to itself but cannot reach pve2/pve3 on the ring network.
Decision: Confirm ring network reachability (IP, VLAN, MTU, firewall) before touching votes.
Task 8: Check for packet loss and MTU mismatch on the ring
cr0x@pve1:~$ ping -c 5 -M do -s 8972 10.10.10.12
PING 10.10.10.12 (10.10.10.12) 8972(9000) bytes of data.
ping: local error: message too long, mtu=1500
--- 10.10.10.12 ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4087ms
Meaning: You tried a jumbo ping and discovered MTU 1500 on the sender path. If you expected jumbo frames, you’ve found a real cause.
Decision: Align MTU end-to-end on the corosync network (switch ports, bonds, VLAN interfaces). Corosync does not enjoy mystery fragmentation.
Task 9: Validate basic reachability to peers (don’t skip the boring part)
cr0x@pve1:~$ ping -c 3 10.10.10.12
PING 10.10.10.12 (10.10.10.12) 56(84) bytes of data.
64 bytes from 10.10.10.12: icmp_seq=1 ttl=64 time=0.602 ms
64 bytes from 10.10.10.12: icmp_seq=2 ttl=64 time=0.581 ms
64 bytes from 10.10.10.12: icmp_seq=3 ttl=64 time=0.594 ms
--- 10.10.10.12 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2041ms
rtt min/avg/max/mdev = 0.581/0.592/0.602/0.009 ms
Meaning: IP connectivity exists, at least for ICMP. That’s necessary but not sufficient.
Decision: If ICMP works but Corosync still shows disconnected, suspect UDP filtering, asymmetric routing, or wrong bindnetaddr.
Task 10: Check corosync logs for membership churn and token timeouts
cr0x@pve1:~$ journalctl -u corosync -n 50 --no-pager
Dec 25 10:12:02 pve1 corosync[1189]: [KNET ] host: 2 (passive) link: 0 is down
Dec 25 10:12:03 pve1 corosync[1189]: [TOTEM ] A processor failed, forming new configuration.
Dec 25 10:12:04 pve1 corosync[1189]: [QUORUM] Members[1]: 1
Dec 25 10:12:04 pve1 corosync[1189]: [QUORUM] This node is within the non-primary component and will NOT provide service.
Meaning: KNET link down to host 2, and the quorum subsystem is refusing service. This is the smoking gun for “network or remote node down.”
Decision: Fix physical/virtual network path or bring the remote node back, then wait for membership to stabilize.
Task 11: Inspect Corosync configuration (bindnetaddr and node list)
cr0x@pve1:~$ cat /etc/pve/corosync.conf
totem {
version: 2
cluster_name: prod-cluster
transport: knet
interface {
linknumber: 0
bindnetaddr: 10.10.10.0
mcastport: 5405
}
}
nodelist {
node {
name: pve1
nodeid: 1
quorum_votes: 1
ring0_addr: 10.10.10.11
}
node {
name: pve2
nodeid: 2
quorum_votes: 1
ring0_addr: 10.10.10.12
}
node {
name: pve3
nodeid: 3
quorum_votes: 1
ring0_addr: 10.10.10.13
}
}
quorum {
provider: corosync_votequorum
}
Meaning: Corosync expects the ring network on 10.10.10.0/24. If the nodes moved subnets, changed IPs, or VLAN tags, membership will break.
Decision: If IPs changed, fix them back or update corosync.conf using a quorate node (or a controlled recovery method). Don’t hand-edit while pmxcfs is unhappy.
Task 12: Determine if the missing nodes are actually down
cr0x@pve1:~$ ssh root@10.10.10.12 "uptime && systemctl is-active corosync && pvecm status | sed -n '1,20p'"
10:15:41 up 12 days, 3:22, 2 users, load average: 0.22, 0.18, 0.15
active
Cluster information
-------------------
Name: prod-cluster
Config Version: 12
Transport: knet
Secure auth: on
Meaning: pve2 is up and corosync is active. So why is pve1 disconnected? Likely network partition, firewall, or asymmetric routing.
Decision: Run the same ring checks from pve2. You want symmetric evidence (pve2 also cannot see pve1, or sees something different).
Task 13: Confirm UDP port reachability (Corosync traffic) via tcpdump
cr0x@pve1:~$ tcpdump -ni vmbr1 udp port 5405 -c 10
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vmbr1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:16:02.112233 IP 10.10.10.11.5405 > 10.10.10.12.5405: UDP, length 256
10:16:03.113244 IP 10.10.10.11.5405 > 10.10.10.12.5405: UDP, length 256
10:16:04.114255 IP 10.10.10.11.5405 > 10.10.10.12.5405: UDP, length 256
Meaning: Outbound packets are leaving. If you don’t see inbound replies, the problem is between the nodes (ACL, switch, routing, or peer firewall).
Decision: Capture on pve2 simultaneously. If pve2 sees inbound but replies vanish, suspect asymmetric routing or upstream filtering.
Task 14: Check host firewall status (PVE firewall can bite)
cr0x@pve1:~$ pve-firewall status
Status: enabled/running
Meaning: Firewall is on. That’s not bad; it’s just a variable.
Decision: Verify rules allow Corosync on the ring interface. If you recently hardened firewall rules, this is a prime suspect.
Task 15: Confirm cluster communication isn’t blocked by time drift
cr0x@pve1:~$ timedatectl
Local time: Thu 2025-12-25 10:16:55 UTC
Universal time: Thu 2025-12-25 10:16:55 UTC
RTC time: Thu 2025-12-25 10:16:55
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Meaning: Time sync is healthy on this node. Corosync isn’t TLS-validity-checking your packets, but time drift correlates with “everything is weird,” including auth and logs.
Decision: If one node is minutes off, fix NTP before you trust any diagnosis from timestamps.
Task 16: As a controlled action, set expected votes (temporary) to regain quorum
cr0x@pve1:~$ pvecm expected 1
Setting expected votes to 1
Meaning: You told votequorum to expect only 1 vote. That can make this partition quorate instantly.
Decision: Only do this if you are certain the other nodes are down or you have fenced them. Otherwise, you risk two quorate partitions at once (that’s the horror movie).
Task 17: Validate quorum after the change
cr0x@pve1:~$ pvecm status | grep -E "Quorate|Expected votes|Total votes|Quorum"
Quorate: Yes
Expected votes: 1
Total votes: 1
Quorum: 1
Meaning: The node is now quorate (per its new expectation).
Decision: Use this window to stabilize the cluster configuration, but plan to revert expected votes once the cluster is whole again.
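The revert is a single command once the cluster is whole again; in this article’s three-node example that means (sketch):
cr0x@pve1:~$ pvecm expected 3
Then re-check pvecm status and confirm Expected votes matches the real node count before you call the incident done.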
Task 18: Remove a dead node cleanly (when the cluster is quorate)
cr0x@pve1:~$ pvecm delnode pve3
Removing node pve3 from cluster
Meaning: Cluster membership is updated; expected votes and quorum math change accordingly.
Decision: Do this only when you have decided pve3 is permanently gone (or will be reinstalled and rejoined). Don’t delnode as a “ping test.”
Task 19: Check pmxcfs health after quorum restoration
cr0x@pve1:~$ ls -la /etc/pve/nodes
total 0
drwxr-xr-x 2 root www-data 0 Dec 25 10:18 .
drwxr-xr-x 1 root www-data 0 Dec 25 10:18 ..
drwxr-xr-x 2 root www-data 0 Dec 25 10:18 pve1
drwxr-xr-x 2 root www-data 0 Dec 25 10:18 pve2
Meaning: Nodes directory reflects current cluster members; this is a sanity check that cluster FS is coherent.
Decision: If nodes appear/disappear unexpectedly, stop and investigate membership flapping before doing config changes.
Task 20: If HA is configured, check manager status before unblocking operations
cr0x@pve1:~$ systemctl status pve-ha-lrm pve-ha-crm --no-pager
● pve-ha-lrm.service - PVE Local Resource Manager Daemon
Active: active (running)
● pve-ha-crm.service - PVE Cluster Resource Manager Daemon
Active: active (running)
Meaning: HA agents are running. That doesn’t mean they’re safe to act yet; it means they will act once quorum is back and state converges.
Decision: Review HA resources and ensure you won’t trigger unexpected migrations/starts the moment quorum returns.
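To see what HA would actually act on, not just whether the daemons are alive, read the HA view first. Both commands below are read-only:
cr0x@pve1:~$ ha-manager status
cr0x@pve1:~$ ha-manager config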
Recovery paths: pick the least dangerous option
“Restore quorum” is not one thing. It’s a family of moves with different blast radii.
Here’s how I choose under pressure.
Path A (best): restore network and/or bring missing nodes back
If nodes are healthy but disconnected, fix the Corosync ring network. This is the cleanest resolution because it preserves membership history and avoids special quorum hacks.
Typical culprits: VLAN mis-tag, bond mode mismatch, MTU mismatch, LACP misconfig, firewall rules, or a switch reboot that came back with a different port profile.
Once connectivity returns, Corosync membership should converge, votequorum becomes quorate, pmxcfs becomes writable, and life returns.
If membership keeps flapping, do not proceed to configuration changes. Fix stability first.
Path B (acceptable): temporarily adjust expected votes
pvecm expected is a tool, not a lifestyle. It’s appropriate when:
- You have a multi-node cluster, but enough nodes are permanently offline right now that you cannot regain majority quickly.
- You can prove the other side cannot also declare quorum (because it’s powered off, fenced, or isolated from shared writable storage).
- You need to regain cluster write capability to perform housekeeping (delnode, adjust HA, fix config, schedule maintenance).
It is not appropriate when you merely suspect the others are down. Suspicion is for mystery novels, not cluster math.
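“Prove the other side cannot also declare quorum” is checkable, not philosophical: ask every node you can reach for its own view before lowering anything. A sketch assuming this article’s node names and working root SSH:
for n in pve2 pve3; do
  echo "== $n =="
  ssh -o ConnectTimeout=5 root@"$n" 'pvecm status | grep -iE "quorate|expected votes|total votes"' || echo "$n unreachable over SSH"
done
If any reachable node already shows Quorate: Yes with lowered expected votes, stop. Someone is about to create exactly the two-partition situation you are trying to avoid.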
Path C (high risk): force a single node to operate
Sometimes you have one surviving node and you must get services back. The risk is that when the network heals, you might have two divergent “truths” of cluster config.
If you go this route, you need containment:
- Confirm other nodes are down or fenced.
- Disable HA actions if they could start duplicates.
- If shared storage exists, ensure only one side can write (fencing, export lock, SAN masking).
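For the HA part of that containment, one approach is to check what HA manages and explicitly mark critical resources disabled so nothing auto-starts behind your back. A sketch with a hypothetical resource ID vm:100; note the set command is a cluster write, so it only succeeds once this partition is (force-)quorate:
cr0x@pve1:~$ ha-manager config
cr0x@pve1:~$ ha-manager set vm:100 --state disabled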
In practice, many people “fix” quorum by forcing it, then discover later that HA restarted a VM elsewhere while they also started it locally.
That’s not a quorum fix; that’s a choose-your-own-disaster book.
Path D (rebuild): when the cluster has lost coherence
If pmxcfs is inconsistent, multiple nodes have been force-quorated in isolation, or the corosync config diverged, you may be in rebuild territory:
pick a source of truth node, capture configs, reinstall/recreate cluster, rejoin nodes, and reintroduce HA carefully.
This is slower, but it’s often the only way to regain trust in the system.
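“Capture configs” deserves to be concrete. Before any rebuild, copy both the pmxcfs view and its backing database somewhere off the node; a sketch (stopping pve-cluster first gives a cleaner copy of config.db, but a hot copy is better than nothing):
cr0x@pve1:~$ tar czf /root/pve-etc-backup-$(date +%F).tar.gz /etc/pve
cr0x@pve1:~$ cp /var/lib/pve-cluster/config.db /root/config.db.$(date +%F).bak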
Three corporate mini-stories (realistic pain included)
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company ran a three-node Proxmox cluster. Two nodes were in the same rack; the third was “nearby” in another row.
The Corosync ring used a dedicated VLAN. Someone did a switch maintenance window and moved a few access ports “temporarily” to a default VLAN.
The third node’s ring NIC landed in the wrong VLAN. The node stayed up, VMs kept running, and monitoring showed the host healthy.
The next morning, someone saw “cluster not ready – no quorum?” on one of the rack-local nodes. Their assumption: “pve3 must be down.”
They didn’t verify from pve3. They ran pvecm expected 1 on pve1 and immediately regained the UI and write access to /etc/pve.
Victory, right?
Not quite. pve3 was alive and still connected to pve2 intermittently through a different accidental path because of a second network change.
Now there were moments where two partitions briefly became “confident” in different realities. Config changes (firewall and storage edits) were applied on one side only.
When the VLAN was corrected, the cluster re-formed, and the team got a fresh set of strange symptoms: VMs missing from the UI on one node, storage definitions out of sync,
and sporadic permission errors. They spent hours “fixing permissions” that were actually symptoms of config divergence during the forced-quorum window.
The fix was boring: revert expected votes to the real value, stabilize ring connectivity, then reconcile config from backups and from one chosen source-of-truth node.
The lesson was sharper: don’t change quorum math based on assumptions about node health. Prove it.
Mini-story 2: The optimization that backfired
Another org wanted “faster cluster traffic” and decided to enable jumbo frames across their virtualization network.
They changed MTU on the Proxmox bridges and bonds. They changed it on the top-of-rack switches.
They did not change it on a couple of intermediate VLAN trunk ports because those were owned by a different team and “surely standard.”
For a while, everything looked fine. VM traffic was fine. Storage traffic was mostly fine.
Corosync, however, started flapping under load. Membership changes happened during peak backup windows.
Quorum would drop, return, drop again. HA got nervous and stopped making moves; operators got nervous and started rebooting nodes.
The flapping was not random. Corosync traffic was getting fragmented or dropped across the mismatched MTU path.
Because Corosync is designed to protect correctness, it treated packet loss like a membership threat. Which it was.
The “optimization” resulted in a cluster that was technically faster but operationally less reliable. They reverted to MTU 1500 on the Corosync ring and kept jumbo frames
for storage only, where they could prove end-to-end consistency.
The point: optimize where you can validate. Corosync is not impressed by your throughput graphs if your packets don’t arrive consistently.
Mini-story 3: The boring but correct practice that saved the day
A finance-adjacent company (so: audits, change control, and long meetings) ran a five-node Proxmox cluster.
They had a habit that engineers mocked: a printed one-page runbook with the corosync ring IPs, switch ports, and IPMI addresses for every node.
It was updated quarterly, signed off, and laminated. Yes, laminated.
One afternoon, a power distribution unit failed and took out two nodes. Another node stayed up but lost its ring switch uplink due to a spanning-tree re-convergence event.
The cluster dropped to two reachable nodes out of five. No quorum. The UI went cranky. Phones lit up.
Instead of forcing quorum, the on-call did three things quickly: verified which nodes were truly powered off via IPMI, confirmed the ring network issue on the remaining nodes,
and used the runbook to identify the exact switch uplink to check. They didn’t need to guess. They didn’t need to “discover” the topology during an incident.
The uplink was put back, the third node rejoined, quorum returned naturally (3/5), and they avoided any forced-vote maneuver.
Later they brought the two dead nodes back and let the cluster settle. Minimal drama.
The lesson: boring inventory and topology hygiene beats improvisation. Also, laminated runbooks are not cool, but neither is a split-brain postmortem.
Common mistakes: symptom → root cause → fix
1) Symptom: “cluster not ready – no quorum?” after a network change
Root cause: Corosync ring network broken (VLAN, MTU, bond/LACP, firewall).
Fix: Use corosync-cfgtool -s, tcpdump, and MTU pings to confirm loss; restore L2/L3 consistency; don’t change votes as a first response.
2) Symptom: One node sees quorum, another does not
Root cause: Partition, asymmetric routing, or one node forced expected votes locally.
Fix: Compare pvecm status on every node; undo forced vote settings; ensure all nodes share the same membership view before making cluster config changes.
3) Symptom: /etc/pve is read-only or writes fail
Root cause: No quorum, or pmxcfs is unhealthy due to corosync instability.
Fix: Restore quorum (preferred) or use controlled expected-votes change; then confirm membership stability before editing cluster config.
4) Symptom: Quorum drops intermittently (flapping)
Root cause: Packet loss, MTU mismatch, unstable link, overloaded ring interface, or noisy virtualized networking.
Fix: Treat it like a network reliability incident: measure loss, check switch errors, disable problematic offloads if needed, and consider dedicating a clean ring network.
5) Symptom: Two-node cluster loses quorum whenever one node reboots
Root cause: Two-node quorum math: majority of 2 is 2; losing one vote means no quorum.
Fix: Add a third vote (qdevice or third node); see the qdevice sketch after this list. If you must run two nodes, accept limited cluster semantics and plan maintenance procedures carefully.
6) Symptom: After “fixing quorum,” HA starts/shuts down VMs unexpectedly
Root cause: HA state reconciliation after membership changes; resources may be in stale states.
Fix: Before restoring quorum, review HA resources; consider putting services into maintenance mode; restore quorum, then validate HA decisions before allowing automation.
7) Symptom: Cluster looks fine, but nodes can’t join or keep rejoining
Root cause: Wrong corosync.conf node addresses, duplicate node IDs, stale hostnames, or mismatched auth keys.
Fix: Validate /etc/pve/corosync.conf on the authoritative node; ensure unique nodeid and correct ring addresses; rejoin nodes cleanly rather than hacking.
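The qdevice fix from mistake 5 is a short procedure, not a project. A sketch assuming a small external host at 10.10.10.50 (outside the cluster) running the corosync-qnetd package, with corosync-qdevice installed on every cluster node:
cr0x@pve1:~$ apt install corosync-qdevice
cr0x@pve1:~$ pvecm qdevice setup 10.10.10.50
Afterwards pvecm status should show the extra qdevice vote, which is exactly the tie-breaker a two-node cluster is missing.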
Checklists / step-by-step plan
Checklist 1: “I just saw no quorum” (do this in 5 minutes)
- On the affected node: run pvecm status. Screenshot or copy the output into the incident log.
- Run pvecm nodes to confirm expected membership.
- Run corosync-cfgtool -s to see who is disconnected.
- Check journalctl -u corosync -n 50 for link-down messages.
- From the node, ping the ring IPs of missing peers. If ping fails, stop and fix network or power.
- If ping works, run tcpdump -ni <ring-if> udp port 5405 while you try from the peer too.
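If you want all of that in one pass for the incident log, a minimal evidence-capture sketch (the output path is arbitrary):
{
  date
  pvecm status
  pvecm nodes
  corosync-cfgtool -s
  journalctl -u corosync -n 50 --no-pager
} > /root/quorum-evidence-$(date +%F-%H%M).log 2>&1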
Checklist 2: “Should I use pvecm expected?” (decision gate)
- Are the missing nodes confirmed powered off (IPMI) or fenced? If no: don’t do it.
- Is there shared writable storage that could be mounted by both partitions? If yes: don’t do it unless storage is fenced/locked.
- Do you need to make cluster-wide writes (delnode, fix corosync.conf via pmxcfs, adjust HA) immediately? If no: wait and fix network instead.
- Do you have an explicit plan to revert expected votes once nodes return? If no: write the plan first.
Checklist 3: Controlled quorum restoration (safe-ish workflow)
- Stabilize membership: fix the ring network until corosync-cfgtool -s shows peers connected and the logs stop flapping.
- Confirm quorum: pvecm status shows Quorate: Yes with expected votes matching the real cluster size.
- Validate /etc/pve writes: create and remove a test file in /etc/pve (or verify config edits work via the UI); see the sketch after this checklist.
- Check HA: ensure HA services are healthy and no unexpected actions are pending.
- Only then: do cluster changes (remove dead nodes, change storage config, modify firewall rules).
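The write-validation step is just Task 6 with a happy ending; a sketch:
cr0x@pve1:~$ touch /etc/pve/.rw-test && rm /etc/pve/.rw-test && echo "cluster FS writable"
cluster FS writable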
Checklist 4: If you must run a single node temporarily
- Confirm other nodes are down via IPMI or physically disconnected from ring network.
- If shared storage exists: ensure only this node can write (disable exports, remove LUN mappings, or enforce fencing).
- Adjust expected votes only as long as needed. Document the change in the incident timeline.
- Avoid making large-scale cluster config edits in the forced state; do the minimum to restore critical workloads.
- When other nodes return, revert expected votes and watch membership converge before re-enabling HA and automation.
FAQ
1) Does “no quorum” mean my VMs are corrupted?
Usually no. It means cluster coordination is unsafe. VMs can keep running, especially on local storage.
The corruption risk rises if you start/stop the same VM from multiple partitions or if shared storage is writable from both sides.
2) Can I just restart corosync to fix it?
Restarting corosync can help if the process is wedged, but it rarely fixes the root cause (network partition, MTU, firewall, dead peer).
Also, restarting a component during flapping can cause more churn. Diagnose first; restart with intent.
3) What exactly does “pvecm expected” do?
It sets the expected total vote count used to compute quorum. Lowering it can make a smaller partition become quorate.
It’s powerful and dangerous: you can create a situation where two partitions both think they’re the majority if you do it on both sides.
4) Why are two-node Proxmox clusters such a headache?
Because majority of 2 is 2. If either node is unreachable, neither can prove it has the majority.
Without a tie-breaker, the safest behavior is to stop cluster writes. That’s what you’re seeing.
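The arithmetic, for reference: quorum is floor(votes/2) + 1. Two votes need 2 (no tolerance for loss), three need 2 (one node can fail), four need 3 (still only one can fail), five need 3 (two can fail). That is why odd-sized clusters, or an even cluster plus a qdevice vote, are the standard recommendation.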
5) Do I need a qdevice?
If you have two nodes and you want clean quorum behavior, yes: add a qdevice (or a third node) to provide a third vote.
If you have three or more nodes, a qdevice is optional, but can still help in certain designs.
6) Why is /etc/pve special?
It’s pmxcfs: a cluster filesystem mounted via FUSE, backed by distributed state. It’s designed to prevent unsafe writes when membership is uncertain.
Treat it like a database, not like a local folder.
7) After quorum returns, why does everything feel “slow” for a bit?
Membership re-forms, state converges, HA recalculates, and services reestablish connections.
If you had flapping, you may have a backlog of retries. Give it a minute, but watch logs for continued churn.
8) How do I know I’m at risk of split-brain?
If you can’t prove that only one partition can access shared writable resources, you’re at risk.
Another red flag: different nodes show different membership or different expected votes. That’s how split-brain starts—quietly.
9) Is it safe to edit corosync.conf directly?
It’s safe only when you have a quorate cluster (so the change propagates consistently) or during a controlled single-node recovery where you understand
you are creating a new source of truth. Random edits during no-quorum are a great way to produce inconsistent cluster state.
10) What if only one node is left and I need the UI to manage VMs?
You can use expected votes to regain quorum on that node, but only after you’ve confirmed the other nodes are down or fenced.
Then keep changes minimal, and plan how you’ll reintroduce other nodes cleanly.
Conclusion: practical next steps
Quorum loss is Proxmox doing you a favor: it’s refusing to let you create an inconsistent cluster state.
Your job is to restore connectivity and membership first, and only then restore convenience.
Next steps that actually help:
- Run the fast diagnosis playbook and identify whether the failure is power, network, or configuration divergence.
- Fix the Corosync ring network stability before changing votes. Treat flapping as a network incident, not a Proxmox incident.
- If you must use pvecm expected, do it as a controlled, documented, time-boxed action with fencing facts.
- Once stable, reduce future pain: avoid two-node clusters without a tie-breaker, keep Corosync on a clean network, and document topology like you mean it.
If you take one thing from this: don’t “force quorum” until you can explain where every other vote went.
That explanation is the difference between recovery and archaeology.