If you’ve just run updates on a Proxmox VE host and your browser greets you with “Connection refused” on :8006, you’re in a very particular kind of pain. The host is often “up” (you can ping it, maybe SSH works), but the management plane is face-down, and suddenly every small admin task turns into a spelunking expedition.
This is one of those incidents where calm beats clever. “Connection refused” is a gift: it narrows the failure to a listening socket, a local firewall, or something killing the proxy. We’ll take that gift, walk a tight sequence, and get your UI back without guessing.
Fast diagnosis playbook (first 10 minutes)
This is the order that minimizes time-to-root-cause in production. It’s biased toward the failures that happen right after updates: service restart issues, certificate problems, firewall changes, and cluster dependencies.
Minute 0–2: Confirm the symptom is local, not “the network”
- From your workstation: does ssh work? If SSH fails too, this isn’t a “8006” problem; it’s routing, link, IP, or host down.
- From the Proxmox host itself: test its own port 8006 with curl. If it can’t reach itself, your issue is almost certainly pveproxy not listening or being blocked locally. (A minimal sketch of both checks follows this list.)
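A minimal sketch of both checks, assuming a management IP of 10.20.10.11 (swap in your own):

# From your workstation: can you even reach the host over SSH?
ssh root@10.20.10.11 true && echo "SSH reachable"
# From the Proxmox host itself: does the UI answer locally?
curl -k -sS -o /dev/null -w 'local 8006: HTTP %{http_code}\n' https://127.0.0.1:8006/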
Minute 2–5: Check whether anything is listening on 8006
- Use ss to see if 8006 is bound. If nothing is listening, don’t touch firewalls yet. Fix pveproxy first.
- Check systemctl status pveproxy. If it’s failed, read the error and jump straight to logs (journalctl -u pveproxy). (Sketch below.)
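The same two checks as a quick sketch:

# Is anything bound to 8006, and on which address?
ss -lntp | grep ':8006'
# If nothing is listening, ask systemd why before touching the firewall.
systemctl status pveproxy --no-pager
journalctl -u pveproxy -b --no-pager -n 50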
Minute 5–7: Validate the management stack dependencies
- Check pvedaemon and pvestatd. The UI can “connect” but act broken if these are dead; “connection refused” itself, though, is usually pveproxy.
- Check disk space and inode pressure. Full root filesystems and log partitions kill daemons in unglamorous ways. (Sketch below.)
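A quick sketch of the dependency and capacity checks:

# Are the other management daemons alive?
systemctl --no-pager status pvedaemon pvestatd
# Capacity problems that quietly kill daemons: space and inodes on the root filesystem.
df -h / && df -i /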
Minute 7–10: Firewall, then cluster, then certificates
- Firewall: confirm pve-firewall and nftables/iptables rules aren’t rejecting tcp/8006.
- Cluster state: if this host is in a cluster, quorum and corosync can indirectly stall management (especially around the config filesystem).
- Certificates: a broken or unreadable SSL key/cert can keep pveproxy from starting. (A compressed sketch of all three checks follows this list.)
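A compressed sketch of the minute 7–10 checks; the hands-on tasks below go deeper, and pvecm status only applies to clustered hosts:

# Firewall: is anything rejecting tcp/8006?
pve-firewall status
nft list ruleset | grep -n '8006'
# Cluster: quorum and the config filesystem.
pvecm status
mount | grep ' /etc/pve '
# Certificates: do the files pveproxy needs actually exist?
ls -l /etc/pve/local/pve-ssl.pem /etc/pve/local/pve-ssl.key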
Do this in order. If you start by “just restarting everything,” you can turn a clean failure into a messy one. Yes, I know restarting feels productive. It’s also the adult version of blowing on a Nintendo cartridge.
What “Connection refused” really means on 8006
Browsers are dramatic. “Connection refused” is not “slow,” not “TLS error,” and not “auth failed.” It’s usually one of three things:
- No process is listening on that IP:port. The kernel replies with RST (reset), and your client reports “refused.”
- A local firewall is actively rejecting the connection (also often RST). This is rarer than people think, but it happens—especially with rules that changed during updates.
- A reverse proxy / socket activation mismatch (less common). For Proxmox, the relevant daemon is pveproxy, which binds to 8006 and handles TLS.
If the problem were “blocked” (DROP), you’d more often see timeouts. If it were “TLS broke,” you’d usually connect and then see a certificate error, not “refused.” So treat “refused” as a strong signal: the port is not reachable at the TCP accept level.
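One way to tell these apart from a client, a sketch assuming a management IP of 10.20.10.11:

curl -k --connect-timeout 5 https://10.20.10.11:8006/
#   "Connection refused"    -> an RST came back: nothing listening, or an explicit REJECT rule
#   "Connection timed out"  -> packets silently dropped (DROP policy, routing, or host down)
#   certificate/TLS error   -> TCP connected fine; the problem is trust or cert material, not the socket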
There’s also a subtle variant: 8006 is listening, but only on the wrong interface (for example, bound to 127.0.0.1). That can still look like a refusal from outside. We’ll check that explicitly.
Interesting facts and context (because this stuff has a history)
- Port 8006 wasn’t chosen for aesthetics. Proxmox historically used a high, non-standard port to avoid clashing with other web stacks on the same host.
- pveproxy is a Perl daemon. It’s not trendy, but it’s battle-tested and tightly integrated with Proxmox’s API and auth stack.
- Proxmox rides on Debian. Many “after updates” incidents are really Debian-level changes: kernel, libc, OpenSSL, systemd behavior, nftables defaults.
- Proxmox migrated the firewall backend over time. Debian’s broader shift from iptables to nftables has been a recurring source of “rules exist, but not where you think.”
- Cluster config is a filesystem. Proxmox cluster configuration lives in a distributed config filesystem (pmxcfs), and when it’s unhappy, management workflows get weird fast.
- TLS defaults have gotten stricter. OpenSSL and related libraries regularly deprecate old ciphers/protocols; daemons that can’t load keys/certs correctly often fail hard.
- Systemd made services more observable. That’s the good news. The bad news is it also made “restart loops” very efficient, so a broken daemon can fail 50 times before you blink.
- Proxmox updates are not just UI updates. They routinely touch storage tooling (ZFS), networking (ifupdown2), and virtualization layers (QEMU/KVM). You’re updating a small datacenter, not a web app.
One paraphrased idea worth keeping in your pocket, attributed to Werner Vogels: everything fails eventually; resilience comes from assuming that and designing operations accordingly.
Hands-on tasks: commands, outputs, decisions (12+)
These are the checks I actually run when I’m on-call and someone says, “Proxmox is down.” Each task includes: the command, a realistic output snippet, what it means, and what decision you make next.
Task 1: Verify the host is up and you’re on the right box
cr0x@server:~$ hostnamectl
Static hostname: pve01
Icon name: computer-server
Chassis: server
Machine ID: 1c3f0c0c9b0d4c2d9d2a2e0d1a0cbeef
Boot ID: 6b77c2a5a1a74c469c9c4a9b3d9d1234
Operating System: Debian GNU/Linux 12 (bookworm)
Kernel: Linux 6.8.12-2-pve
Architecture: amd64
Means: You’re on the Proxmox host you think you’re on, and you know the OS/kernel context after updates.
Decision: If the kernel changed, keep in mind: NIC naming, driver changes, and firewall backends can shift after reboot.
Task 2: Confirm whether 8006 is listening, and on which IP
cr0x@server:~$ ss -lntp | grep -E ':8006|State'
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 0.0.0.0:8006 0.0.0.0:* users:(("pveproxy",pid=1432,fd=6))
Means: pveproxy is listening on all interfaces. “Connection refused” from outside is unlikely to be “service down.”
Decision: If you see 127.0.0.1:8006 only, treat it as a bind/interface problem. If you see nothing, go to Task 4 (service status) and Task 5 (logs).
Task 3: Test locally with curl (cuts the network out of the picture)
cr0x@server:~$ curl -k -sS -o /dev/null -w '%{http_code}\n' https://127.0.0.1:8006/
200
Means: The web UI endpoint responds locally. If remote users get refused, the problem is almost certainly firewall/routing/VIP, not Proxmox services.
Decision: Jump to firewall checks (Task 10/11) and interface checks (Task 8/9).
Task 4: Check pveproxy health via systemd (don’t guess)
cr0x@server:~$ systemctl status pveproxy --no-pager
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Wed 2025-12-25 10:44:02 UTC; 45s ago
Process: 1519 ExecStart=/usr/bin/pveproxy start (code=exited, status=1/FAILURE)
Main PID: 1519 (code=exited, status=1/FAILURE)
Dec 25 10:44:02 pve01 pveproxy[1519]: starting server
Dec 25 10:44:02 pve01 pveproxy[1519]: can't load certificate '/etc/pve/local/pve-ssl.pem': No such file or directory
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Failed with result 'exit-code'.
Means: Clear failure: missing cert file under /etc/pve. That points at pmxcfs/cluster filesystem not mounted/healthy, or cert generation got interrupted.
Decision: Go to Tasks 6 and 7 (pmxcfs and /etc/pve checks), then regenerate certs once /etc/pve is sane.
Task 5: Read logs with context (journalctl is your friend)
cr0x@server:~$ journalctl -u pveproxy -b --no-pager -n 60
Dec 25 10:44:02 pve01 pveproxy[1519]: starting server
Dec 25 10:44:02 pve01 pveproxy[1519]: can't load certificate '/etc/pve/local/pve-ssl.pem': No such file or directory
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Failed with result 'exit-code'.
Dec 25 10:44:12 pve01 systemd[1]: pveproxy.service: Scheduled restart job, restart counter is at 5.
Means: It’s restart-looping. That can cause cascading issues (log spam, CPU churn), but the root cause is still the first error line.
Decision: Fix the cert/path/pmxcfs cause. Don’t keep restarting. You’re just making heat.
Task 6: Verify pmxcfs is running (because /etc/pve is not “normal”)
cr0x@server:~$ systemctl status pve-cluster --no-pager
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
Active: active (running) since Wed 2025-12-25 10:42:21 UTC; 3min ago
Main PID: 602 (pmxcfs)
Tasks: 7 (limit: 154322)
Memory: 52.1M
CPU: 1.824s
Means: pmxcfs is up. That doesn’t guarantee it’s healthy, but it rules out the obvious “service down” case.
Decision: If it’s failed, fix it first (storage/full disk/quorum). If it’s running, confirm /etc/pve is mounted and populated (Task 7).
Task 7: Confirm /etc/pve is mounted (and not a sad empty directory)
cr0x@server:~$ mount | grep ' /etc/pve '
pve:/etc/pve on /etc/pve type fuse.pve (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
Means: It’s mounted via FUSE. Good. Now check whether the expected cert file exists.
Decision: If you don’t see a mount line, you’re not looking at the cluster filesystem; Proxmox services will misbehave. Investigate pmxcfs startup and cluster/quorum (Task 12/13).
Task 8: Make sure the management IP is still on the interface (updates + reboots love surprises)
cr0x@server:~$ ip -br addr
lo UNKNOWN 127.0.0.1/8 ::1/128
eno1 UP 10.20.10.11/24
vmbr0 UP 10.20.10.11/24 fe80::2a0:98ff:fe12:3456/64
Means: The IP is present. Note: if eno1 is enslaved to vmbr0, the address normally lives only on the bridge; seeing it on both usually points to a leftover or rewritten interface config after the update, and is worth checking.
Decision: If the expected management IP is missing, fix networking first. Proxmox UI won’t magically bind to an IP you don’t have.
Task 9: Check routing (because “it pings” is not a routing table)
cr0x@server:~$ ip route
default via 10.20.10.1 dev vmbr0 proto kernel onlink
10.20.10.0/24 dev vmbr0 proto kernel scope link src 10.20.10.11
Means: Default route exists. If your management station is on another subnet, you need this to be correct to reach 8006.
Decision: Wrong default route or missing route: fix network config. Don’t blame Proxmox services for your router’s sins.
Task 10: Check Proxmox firewall status (PVE firewall can reject 8006)
cr0x@server:~$ pve-firewall status
Status: enabled/running
Means: PVE firewall is active. That’s fine, but it means rules matter.
Decision: If enabled, inspect ruleset and confirm 8006 is allowed on the management interface. If disabled, still check nftables/iptables (Task 11).
Task 11: Inspect nftables rules for explicit rejects (refused often equals REJECT)
cr0x@server:~$ nft list ruleset | sed -n '1,120p'
table inet filter {
chain input {
type filter hook input priority filter; policy drop;
iif "lo" accept
ct state established,related accept
tcp dport 22 accept
tcp dport 8006 reject with tcp reset
}
}
Means: Someone (or something) is explicitly rejecting 8006 with a TCP reset. That produces “Connection refused” from clients.
Decision: Fix the rule (remove the reject or replace with accept for your admin networks). Don’t “temporarily flush everything” unless you enjoy surprise exposure.
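A minimal remediation sketch, assuming the table/chain names shown above and an admin subnet of 10.20.0.0/16 (both are examples; adjust to your environment). If the rule came from the PVE firewall or configuration management, fix it at the source too, or it will come straight back:

# Find the handle of the offending rule, then remove it and add a scoped allow.
nft -a list chain inet filter input
nft delete rule inet filter input handle 42        # 42 = the handle you just read, not a constant
nft insert rule inet filter input ip saddr 10.20.0.0/16 tcp dport 8006 accept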
Task 12: Check corosync/quorum (cluster hosts can act haunted without it)
cr0x@server:~$ pvecm status
Cluster information
-------------------
Name: prod-cluster
Config Version: 42
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Wed Dec 25 10:47:11 2025
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1.2a
Quorate: Yes
Means: Quorum is fine. If it were Quorate: No, a bunch of config-dependent services can stall or refuse writes, and you’ll see secondary breakage.
Decision: If not quorate, decide: restore quorum (preferred) or isolate the node safely. Avoid “just reboot another node” as a strategy.
Task 13: Check disk space and inodes (boring, frequent, deadly)
cr0x@server:~$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/pve-root 94G 94G 0 100% /
Means: Root filesystem is full. Services that need to write PID files, logs, or state will fail in ways that look unrelated to storage.
Decision: Free space immediately and then restart affected services. Also find what grew (Task 14) and fix the underlying log/backup behavior.
Task 14: Identify what’s eating the root filesystem
cr0x@server:~$ du -xhd1 /var | sort -h
1.1G /var/cache
2.4G /var/log
38G /var/lib
Means: Something under /var/lib is large. On Proxmox, candidates include ISO storage, container images, backups, or runaway journald.
Decision: Drill down further (du -xhd1 /var/lib) and remove/move the correct offender. Don’t randomly delete cluster configs. That way lies regret.
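A boring, reasonably safe cleanup sketch once you know the offender (verify every path before deleting anything; the size target is an example):

journalctl --vacuum-size=200M        # cap the systemd journal at roughly 200 MB
apt-get clean                        # drop cached .deb files under /var/cache/apt
du -xhd1 /var/lib | sort -h          # keep drilling into whichever directory is largest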
Task 15: Validate certificates exist and are readable
cr0x@server:~$ ls -l /etc/pve/local/pve-ssl.pem /etc/pve/local/pve-ssl.key
-rw-r----- 1 root www-data 3882 Dec 25 10:10 /etc/pve/local/pve-ssl.pem
-rw-r----- 1 root www-data 1704 Dec 25 10:10 /etc/pve/local/pve-ssl.key
Means: Cert and key exist with plausible permissions.
Decision: If missing or corrupt, regenerate (Task 16). If permissions are wrong (too restrictive), fix ownership/mode so pveproxy can read them.
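If you want to confirm the files are not just present but usable, a quick openssl sketch:

# Does the certificate parse, and when does it expire?
openssl x509 -in /etc/pve/local/pve-ssl.pem -noout -subject -enddate
# Does the private key parse at all?
openssl pkey -in /etc/pve/local/pve-ssl.key -noout && echo "key parses"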
Task 16: Regenerate Proxmox certificates (when cert problems prevent pveproxy from starting)
cr0x@server:~$ pvecm updatecerts --force
Generating new certificates...
Restarting pveproxy...
Restarting pvedaemon...
done
Means: Proxmox has regenerated cluster-aware certs and restarted the relevant daemons.
Decision: Re-check systemctl status pveproxy and verify listening on 8006 (Task 2). If it still fails, go back to logs; don’t repeat this endlessly.
Task 17: Check for port conflicts (rare, but worth one glance)
cr0x@server:~$ lsof -nP -iTCP:8006 -sTCP:LISTEN
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
pveproxy 1432 root 6u IPv6 31245 0t0 TCP *:8006 (LISTEN)
Means: The right process owns the port.
Decision: If something else is listening on 8006, stop it or reconfigure it. Proxmox expects to own that port; fighting it is a lifestyle choice.
Task 18: Verify remote reachability from another host in the same subnet
cr0x@server:~$ nc -vz 10.20.10.11 8006
Connection to 10.20.10.11 8006 port [tcp/*] succeeded!
Means: TCP connection works. If your browser still errors, it’s likely TLS trust/UI caching, or you’re hitting the wrong IP/DNS.
Decision: Validate DNS resolution and browser target. Also check if you’re using a VIP, reverse proxy, or port-forward that changed during network updates.
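If raw TCP works but the browser still fails, confirm what name the browser actually resolves; pve01.example.com is a hypothetical name standing in for yours:

getent hosts pve01.example.com                      # does DNS point where you think it does?
curl -k -sS -o /dev/null -w '%{http_code}\n' https://pve01.example.com:8006/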
One more operational truth: “Connection refused” is rarely a mystery and frequently a checklist failure. That’s good news. It means you can fix it without poetry.
Joke #1: If your post-update plan is “we’ll just roll back,” congratulations—you’ve discovered time travel, but only for package versions.
Three corporate mini-stories from the trenches
Incident #1: The wrong assumption (“It must be the firewall”)
A midsize SaaS company ran a three-node Proxmox cluster for internal CI runners and staging. After routine updates on one node, the web UI stopped responding on 8006. The on-call engineer’s first instinct was the usual: “security update tweaked the firewall.” They disabled the firewall service and still got “Connection refused.” That should have been the clue, but adrenaline is a persuasive drug.
The engineer then flushed nftables rules entirely. The node became reachable on 8006 for about thirty seconds and then went dark again. “See? Firewall.” Except it wasn’t. What actually happened was: pveproxy was in a restart loop, briefly binding 8006 between crashes. Flushing rules changed timing, not causality.
The real root cause lived in plain sight: root filesystem had hit 100%. pveproxy couldn’t write its state, pvedaemon got unhappy, and systemd performed its usual efficient failure choreography. The firewall scapegoat cost them about an hour, plus a fun audit finding because the node ran with an effectively open input policy longer than intended.
What fixed it wasn’t cleverness. They freed space under /var/log, rotated journals, restarted the services, and then added monitoring on filesystem fill. The final postmortem line was blunt: “We assumed cause from correlation (update → firewall). We didn’t validate the socket state first.”
Incident #2: The optimization that backfired (“Let’s lock down input policy to drop”)
A large corporate IT team standardized on a hardened host baseline. One change looked tidy on paper: set the default input policy to drop, then explicitly allow only required ports. Sensible. Controlled. Audit-friendly.
The problem was how they implemented it on Proxmox. They pushed an nftables template that allowed SSH and a handful of monitoring ports, but they assumed the web UI was behind a corporate reverse proxy and didn’t need direct 8006 access. That assumption was true for most environments. It was false for break-glass access when the reverse proxy was down—or during maintenance when admins SSHed from a jump host and expected to reach https://node:8006 directly.
After an update and reboot, Proxmox came back up fine, pveproxy listened on 8006, and local curl returned 200. From the jump host, every attempt got “Connection refused,” because the template used reject with tcp reset for unlisted ports. That’s the detail that misled people: a “refused” error smells like “service down.”
The fix was not “open everything.” They added a narrow allow rule for 8006 from the jump host subnet and kept the default drop policy. More importantly, they documented that management-plane ports are part of the break-glass path, not an optional convenience. The optimization (stronger baseline) was right; the rollout without operational exceptions was the mistake.
Incident #3: The boring practice that saved the day (pre-checks and staged reboots)
A financial services team ran Proxmox for VDI and a pile of internal services. They were conservative to the point of comedy: every update window began with a pre-check script, saved to a ticket, reviewed by a second person. It looked like bureaucracy until it wasn’t.
One night, updates included changes that required a reboot. Before rebooting, the pre-check captured: free space, memory pressure, cluster quorum, and whether the node was currently hosting critical VMs. It also captured “what is listening on 8006 right now” and the current firewall ruleset hash. Boring. Repetitive. Exactly the point.
After reboot, 8006 was refusing connections. The on-call engineer didn’t start flailing because they had a baseline: before reboot, nftables had an allow for 8006; after reboot, the allow was missing. That sharply reduced the search space. The issue was traced to a configuration management run that applied the wrong role to that node during the maintenance window.
They reverted the role, restored the allow rule, and were back online quickly. The postmortem had no heroics, just a line everyone hates and everyone needs: “Our pre-checks turned a mystery into a diff.” The only exciting part was how unexciting the recovery was.
Joke #2: The Proxmox UI didn’t “randomly break after updates.” It broke on schedule—you just didn’t read the calendar.
Common mistakes: symptom → root cause → fix
This section is where the time goes to die if you don’t recognize patterns. Here are the repeat offenders, tuned specifically for “Connection refused” on 8006 right after updates.
1) “Connection refused” immediately after reboot
- Symptom: Browser says refused; ping works; SSH works.
- Root cause: pveproxy failed to start (cert missing/corrupt, pmxcfs not mounted, or a config syntax issue).
- Fix: systemctl status pveproxy → journalctl -u pveproxy. Validate the /etc/pve mount; regenerate certs with pvecm updatecerts --force only after /etc/pve is healthy.
2) Local curl works, remote refused
- Symptom: curl -k https://127.0.0.1:8006 returns 200; remote browser refused.
- Root cause: nftables/iptables rejects 8006, or the service is bound only to loopback, or the IP moved to a different interface/VLAN.
- Fix: Check ss -lntp for the bind address; check firewall rules (nft list ruleset); verify IP and routes (ip -br addr, ip route).
3) “It was working before the update,” and now only one node is broken
- Symptom: Cluster has multiple nodes; only updated node refuses on 8006.
- Root cause: That node’s /etc/pve is stale/unmounted, a corosync link issue, or the node booted into a different network config (bridge name changes, missing VLAN).
- Fix: Check pvecm status, systemctl status pve-cluster, mount | grep /etc/pve, and the network config under /etc/network/interfaces (carefully).
4) “Connection refused” only from some networks
- Symptom: Works from jump host, refused from your laptop, or vice versa.
- Root cause: Management access rules are subnet-specific; maybe a corporate baseline introduced an explicit reject for 8006 except from approved ranges.
- Fix: Confirm with nft counters and rule order; add an explicit allow for 8006 from the required admin sources.
5) 8006 “refused” and other random services flapping
- Symptom: Not just the UI—metrics, backups, or container starts fail too.
- Root cause: Disk full, inode exhaustion, or filesystem mounted read-only after errors.
- Fix: df -h, df -i, dmesg -T | tail; remediate storage, then restart services.
6) You keep restarting services and it keeps failing
- Symptom: systemctl restart pveproxy “works” but the UI is still refused; or it fails immediately.
- Root cause: Restarting doesn’t fix the underlying missing files, permission issues, or blocked port. It just moves the timestamp.
- Fix: Stop and read the logs. Fix the first error. Only then restart. (See the sketch below.)
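A small sketch for pulling the first error-looking line out of a noisy restart loop instead of eyeballing it:

journalctl -u pveproxy -b --no-pager | grep -iE 'error|fail|unable|can.t' | head -n 1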
Checklists / step-by-step plan (safe recovery)
This is the plan you follow when you want the UI back and you don’t want to create collateral damage. It’s written to be safe on standalone hosts and cluster nodes.
Phase 1: Confirm scope (2–5 minutes)
- Verify host reachability: SSH in; confirm the correct host (hostnamectl).
- Check local UI reachability: curl -k https://127.0.0.1:8006/.
- Check the listener: ss -lntp | grep :8006.
If 8006 isn’t listening: go to Phase 2.
If it is listening locally but remote fails: skip to Phase 4 (network/firewall).
Phase 2: Fix the service (5–15 minutes)
- Status and logs: systemctl status pveproxy, then journalctl -u pveproxy -b.
- Dependencies: systemctl status pvedaemon pve-cluster.
- Check /etc/pve: mount | grep ' /etc/pve ' and ls -l /etc/pve/local/.
- Disk sanity: df -h /, df -i /. Fix a full filesystem before doing anything else.
Phase 3: Certificates and config filesystem (as needed)
- If cert files are missing or broken: ensure /etc/pve is properly mounted and populated.
- Then regenerate certs: pvecm updatecerts --force.
- Re-check the listener: ss -lntp | grep :8006 and local curl.
Phase 4: Firewall and network path (when local works but remote doesn’t)
- Confirm the bind address: if it’s only 127.0.0.1:8006, fix the config that constrains binding (rare; not default behavior).
- Check the Proxmox firewall: pve-firewall status. If enabled, confirm rules allow 8006 from your admin subnets.
- Check nftables: nft list ruleset and look for tcp dport 8006 rules that reject/drop.
- Validate IP and routing: ip -br addr, ip route.
- Test from another host: nc -vz <ip> 8006.
Phase 5: Cluster-specific sanity (only if clustered)
- Check quorum: pvecm status.
- Check corosync: systemctl status corosync and its logs if needed.
- Avoid risky “fixes”: don’t rip nodes out of the cluster just to make the UI come back. Restore network connectivity between nodes first.
When to stop and escalate
- If you see filesystem errors in dmesg or filesystems remounting read-only.
- If cluster quorum is lost and you’re not sure which nodes are “real.”
- If the host is also your storage head (ZFS/NFS/iSCSI) and a bad move could take down other services.
FAQ
1) Why does “Connection refused” usually mean pveproxy, not the API daemon?
Because 8006 is the TLS endpoint served by pveproxy. If that daemon isn’t listening (or a firewall rejects), the TCP connection won’t establish. pvedaemon problems tend to show up as UI errors after connect, not as a refusal.
2) What’s the single fastest check?
ss -lntp | grep :8006. If nothing is listening, stop thinking about routers and start reading pveproxy logs.
3) Local curl works but browser from my laptop is refused. What now?
Assume firewall or network path. Check nftables for reject with tcp reset on 8006, validate the IP is on the correct interface, and test from a host in the same subnet using nc -vz.
4) Can I just restart pveproxy and be done?
You can try once. If it fails, repeated restarts are theater. Read journalctl -u pveproxy and fix the first error line.
5) Does a missing /etc/pve mount really break the UI?
Yes. Proxmox stores key config and cert material under /etc/pve, backed by pmxcfs. If it’s not mounted or unhealthy, services that depend on it can fail or start with missing files.
6) How often is disk full the real cause?
Often enough that it should be in your first five checks. Root being 100% full breaks daemons, cert generation, logging, and sometimes package postinst scripts. It’s not glamorous, but it’s common.
7) I updated packages but didn’t reboot. Can that still cause 8006 to refuse?
Yes. Package post-install scripts can restart services immediately. If a dependency changed (OpenSSL, perl modules, config filesystem state) the restart can fail even without a reboot.
8) If quorum is lost, will 8006 refuse connections?
Not always. Quorum loss more commonly causes management actions to fail or configs to become read-only-ish. But it can indirectly break certificate paths and pmxcfs behavior, which can keep pveproxy from starting.
9) Should I open 8006 to the internet to “make it work”?
No. Fix the actual issue. If you need remote access, put it behind a VPN, a bastion, or a tightly restricted allowlist. Exposing management planes is how you end up spending weekends learning about incident response.
10) What if 8006 is listening but I still can’t load the UI?
That’s no longer “connection refused.” Check browser errors, TLS alerts, and whether you’re hitting the correct IP/DNS. Also verify pvedaemon is running and that the host isn’t under extreme resource pressure.
Next steps you should actually do
Restore service first, then make it harder for this to happen again.
- Lock in the fast checks: ss -lntp, systemctl status pveproxy, journalctl -u pveproxy, df -h, nft list ruleset. Put them in your runbook exactly in that order (runbook sketch after this list).
- Add monitoring that catches the real killers: root filesystem usage, inode usage, pmxcfs health, and whether 8006 is listening from an external vantage point.
- Stop treating updates as “just patches”: schedule them, stage them, and capture a before/after diff of firewall and network state. The boring practice is the one that works at 3 a.m.
- For clusters: treat quorum and corosync connectivity as part of the management plane. If the cluster is unstable, the UI is a symptom, not the disease.
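A runbook sketch that strings the fast checks together in that exact order; the filename and output labels are mine, not anything Proxmox ships:

#!/usr/bin/env bash
# fast-8006-triage.sh -- first-look checks for "connection refused" on the Proxmox UI (sketch)
set -u
echo "== listener ==";    ss -lntp | grep ':8006' || echo "nothing listening on 8006"
echo "== pveproxy ==";    systemctl --no-pager status pveproxy | head -n 5
echo "== last errors =="; journalctl -u pveproxy -b --no-pager -n 20
echo "== disk ==";        df -h /; df -i /
echo "== firewall ==";    nft list ruleset | grep -n '8006' || echo "no explicit 8006 rules"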
If you do nothing else: the next time you see “Connection refused,” check whether anything is listening on 8006 before you touch a firewall. That single habit saves hours and keeps you from “fixing” the wrong thing aggressively.