Proxmox “Connection refused” on 8006 after updates: what to check first

If you’ve just run updates on a Proxmox VE host and your browser greets you with “Connection refused” on :8006, you’re in a very particular kind of pain. The host is often “up” (you can ping it, maybe SSH works), but the management plane is face-down, and suddenly every small admin task turns into a spelunking expedition.

This is one of those incidents where calm beats clever. “Connection refused” is a gift: it narrows the failure to a listening socket, a local firewall, or something killing the proxy. We’ll take that gift, walk a tight sequence, and get your UI back without guessing.

Fast diagnosis playbook (first 10 minutes)

This is the order that minimizes time-to-root-cause in production. It’s biased toward the failures that happen right after updates: service restart issues, certificate problems, firewall changes, and cluster dependencies.

Minute 0–2: Confirm the symptom is local, not “the network”

  1. From your workstation: does ssh work? If SSH fails too, this isn’t a “8006” problem; it’s routing, link, IP, or host down.
  2. From the Proxmox host itself: test its own port 8006 with curl. If it can’t reach itself, your issue is almost certainly pveproxy not listening or being blocked locally.
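
If you want the keystrokes, a minimal sketch of those two checks (the host IP and the workstation prompt are placeholders; adjust to your environment):

cr0x@workstation:~$ ssh root@10.20.10.11 hostname
cr0x@server:~$ curl -k --connect-timeout 5 -sS -o /dev/null -w '%{http_code}\n' https://127.0.0.1:8006/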

Minute 2–5: Check whether anything is listening on 8006

  1. Use ss to see if 8006 is bound. If nothing is listening, don’t touch firewalls yet. Fix pveproxy first.
  2. Check systemctl status pveproxy. If it’s failed, read the error and jump straight to logs (journalctl -u pveproxy).
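
The same checks as commands (the tasks further down walk through realistic outputs):

cr0x@server:~$ ss -lntp | grep ':8006'
cr0x@server:~$ systemctl status pveproxy --no-pager
cr0x@server:~$ journalctl -u pveproxy -b --no-pager -n 50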

Minute 5–7: Validate the management stack dependencies

  1. Check pvedaemon and pvestatd. The UI can connect yet behave strangely if those are dead, but a hard “connection refused” almost always points at pveproxy.
  2. Check disk space and inode pressure. Full root filesystems and log partitions kill daemons in unglamorous ways.
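
A minimal sketch of those dependency and capacity checks:

cr0x@server:~$ systemctl is-active pvedaemon pvestatd pveproxy
cr0x@server:~$ df -h / && df -i /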

Minute 7–10: Firewall, then cluster, then certificates

  1. Firewall: confirm pve-firewall and nftables/iptables rules aren’t rejecting tcp/8006.
  2. Cluster state: if this host is in a cluster, quorum and corosync can indirectly stall management (especially around config FS).
  3. Certificates: a broken or unreadable SSL key/cert can keep pveproxy from starting.
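
And the corresponding quick looks, assuming an nftables backend and a clustered node (skip pvecm on a standalone host):

cr0x@server:~$ pve-firewall status
cr0x@server:~$ nft list ruleset | grep 8006
cr0x@server:~$ pvecm status | grep -i quorate
cr0x@server:~$ ls -l /etc/pve/local/pve-ssl.pem /etc/pve/local/pve-ssl.key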

Do this in order. If you start by “just restarting everything,” you can turn a clean failure into a messy one. Yes, I know restarting feels productive. It’s also the adult version of blowing on a Nintendo cartridge.

What “Connection refused” really means on 8006

Browsers are dramatic. “Connection refused” is not “slow,” not “TLS error,” and not “auth failed.” It’s usually one of three things:

  • No process is listening on that IP:port. The kernel replies with RST (reset), and your client reports “refused.”
  • A local firewall is actively rejecting the connection (also often RST). This is rarer than people think, but it happens—especially with rules that changed during updates.
  • A reverse proxy / socket activation mismatch (less common). For Proxmox, the relevant daemon is pveproxy, which binds to 8006 and handles TLS.

If the problem were “blocked” (DROP), you’d more often see timeouts. If it were “TLS broke,” you’d usually connect and then see a certificate error, not “refused.” So treat “refused” as a strong signal: the port is not reachable at the TCP accept level.
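
Timing is the quickest way to tell these apart from a remote host: a refused connection fails in milliseconds, a dropped one hangs for the full timeout, and a TLS problem fails only after the TCP connect succeeds. A minimal probe, with the IP as a placeholder:

cr0x@workstation:~$ time curl -kv --connect-timeout 10 -o /dev/null https://10.20.10.11:8006/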

There’s also a subtle variant: 8006 is listening, but only on the wrong interface (for example, bound to 127.0.0.1). That can still look like a refusal from outside. We’ll check that explicitly.

Interesting facts and context (because this stuff has a history)

  • Port 8006 wasn’t chosen for aesthetics. Proxmox historically used a high, non-standard port to avoid clashing with other web stacks on the same host.
  • pveproxy is a Perl daemon. It’s not trendy, but it’s battle-tested and tightly integrated with Proxmox’s API and auth stack.
  • Proxmox rides on Debian. Many “after updates” incidents are really Debian-level changes: kernel, libc, OpenSSL, systemd behavior, nftables defaults.
  • Proxmox migrated the firewall backend over time. Debian’s broader shift from iptables to nftables has been a recurring source of “rules exist, but not where you think.”
  • Cluster config is a filesystem. Proxmox cluster configuration lives in a distributed config filesystem (pmxcfs), and when it’s unhappy, management workflows get weird fast.
  • TLS defaults have gotten stricter. OpenSSL and related libraries regularly deprecate old ciphers/protocols; daemons that can’t load keys/certs correctly often fail hard.
  • Systemd made services more observable. That’s the good news. The bad news is it also made “restart loops” very efficient, so a broken daemon can fail 50 times before you blink.
  • Proxmox updates are not just UI updates. They routinely touch storage tooling (ZFS), networking (ifupdown2), and virtualization layers (QEMU/KVM). You’re updating a small datacenter, not a web app.

One paraphrased idea worth keeping in your pocket, attributed to Werner Vogels: everything fails eventually, and resilience comes from assuming that and designing your operations accordingly.

Hands-on tasks: commands, outputs, decisions (12+)

These are the checks I actually run when I’m on-call and someone says, “Proxmox is down.” Each task includes: the command, a realistic output snippet, what it means, and what decision you make next.

Task 1: Verify the host is up and you’re on the right box

cr0x@server:~$ hostnamectl
 Static hostname: pve01
       Icon name: computer-server
         Chassis: server
      Machine ID: 1c3f0c0c9b0d4c2d9d2a2e0d1a0cbeef
         Boot ID: 6b77c2a5a1a74c469c9c4a9b3d9d1234
Operating System: Debian GNU/Linux 12 (bookworm)
          Kernel: Linux 6.8.12-2-pve
    Architecture: amd64

Means: You’re on the Proxmox host you think you’re on, and you know the OS/kernel context after updates.

Decision: If the kernel changed, keep in mind: NIC naming, driver changes, and firewall backends can shift after reboot.

Task 2: Confirm whether 8006 is listening, and on which IP

cr0x@server:~$ ss -lntp | grep -E ':8006|State'
State  Recv-Q Send-Q Local Address:Port  Peer Address:Port Process
LISTEN 0      4096         0.0.0.0:8006       0.0.0.0:*     users:(("pveproxy",pid=1432,fd=6))

Means: pveproxy is listening on all interfaces. “Connection refused” from outside is unlikely to be “service down.”

Decision: If you see 127.0.0.1:8006 only, treat it as a bind/interface problem. If you see nothing, go to Task 4 (service status) and Task 5 (logs).

Task 3: Test locally with curl (cuts the network out of the picture)

cr0x@server:~$ curl -k -sS -o /dev/null -w '%{http_code}\n' https://127.0.0.1:8006/
200

Means: The web UI endpoint responds locally. If remote users get refused, the problem is almost certainly firewall/routing/VIP, not Proxmox services.

Decision: Jump to firewall checks (Task 10/11) and interface checks (Task 8/9).

Task 4: Check pveproxy health via systemd (don’t guess)

cr0x@server:~$ systemctl status pveproxy --no-pager
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2025-12-25 10:44:02 UTC; 45s ago
    Process: 1519 ExecStart=/usr/bin/pveproxy start (code=exited, status=1/FAILURE)
   Main PID: 1519 (code=exited, status=1/FAILURE)

Dec 25 10:44:02 pve01 pveproxy[1519]: starting server
Dec 25 10:44:02 pve01 pveproxy[1519]: can't load certificate '/etc/pve/local/pve-ssl.pem': No such file or directory
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Failed with result 'exit-code'.

Means: Clear failure: missing cert file under /etc/pve. That points at pmxcfs/cluster filesystem not mounted/healthy, or cert generation got interrupted.

Decision: Go to Tasks 6 and 7 (pmxcfs and /etc/pve checks), then regenerate certs once /etc/pve is sane.

Task 5: Read logs with context (journalctl is your friend)

cr0x@server:~$ journalctl -u pveproxy -b --no-pager -n 60
Dec 25 10:44:02 pve01 pveproxy[1519]: starting server
Dec 25 10:44:02 pve01 pveproxy[1519]: can't load certificate '/etc/pve/local/pve-ssl.pem': No such file or directory
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 10:44:02 pve01 systemd[1]: pveproxy.service: Failed with result 'exit-code'.
Dec 25 10:44:12 pve01 systemd[1]: pveproxy.service: Scheduled restart job, restart counter is at 5.

Means: It’s restart-looping. That can cause cascading issues (log spam, CPU churn), but the root cause is still the first error line.

Decision: Fix the cert/path/pmxcfs cause. Don’t keep restarting. You’re just making heat.

Task 6: Verify pmxcfs is running (because /etc/pve is not “normal”)

cr0x@server:~$ systemctl status pve-cluster --no-pager
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: active (running) since Thu 2025-12-25 10:42:21 UTC; 3min ago
   Main PID: 602 (pmxcfs)
      Tasks: 7 (limit: 154322)
     Memory: 52.1M
        CPU: 1.824s

Means: pmxcfs is up. That doesn’t guarantee it’s healthy, but it rules out the obvious “service down” case.

Decision: If it’s failed, fix it first (storage/full disk/quorum). If it’s running, confirm /etc/pve is mounted and populated (Task 7).

Task 7: Confirm /etc/pve is mounted (and not a sad empty directory)

cr0x@server:~$ mount | grep ' /etc/pve '
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)

Means: It’s mounted via FUSE. Good. Now check whether the expected cert file exists.

Decision: If you don’t see a mount line, you’re not looking at the cluster filesystem; Proxmox services will misbehave. Investigate pmxcfs startup and cluster/quorum (Task 12/13).

Task 8: Make sure the management IP is still on the interface (updates + reboots love surprises)

cr0x@server:~$ ip -br addr
lo               UNKNOWN        127.0.0.1/8 ::1/128
eno1             UP             10.20.10.11/24
vmbr0            UP             10.20.10.11/24 fe80::2a0:98ff:fe12:3456/64

Means: The IP is present. Note: having the same IP on a physical NIC and bridge can be legitimate depending on config, but it can also be a misconfig after an interface rewrite.

Decision: If the expected management IP is missing, fix networking first. Proxmox UI won’t magically bind to an IP you don’t have.

Task 9: Check routing (because “it pings” is not a routing table)

cr0x@server:~$ ip route
default via 10.20.10.1 dev vmbr0 proto kernel onlink
10.20.10.0/24 dev vmbr0 proto kernel scope link src 10.20.10.11

Means: Default route exists. If your management station is on another subnet, you need this to be correct to reach 8006.

Decision: Wrong default route or missing route: fix network config. Don’t blame Proxmox services for your router’s sins.

Task 10: Check Proxmox firewall status (PVE firewall can reject 8006)

cr0x@server:~$ pve-firewall status
Status: enabled/running

Means: PVE firewall is active. That’s fine, but it means rules matter.

Decision: If enabled, inspect ruleset and confirm 8006 is allowed on the management interface. If disabled, still check nftables/iptables (Task 11).
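
One way to see what the PVE firewall actually programmed, without trusting the GUI (which backend holds the rules depends on your version, so check both):

cr0x@server:~$ iptables-save 2>/dev/null | grep 8006
cr0x@server:~$ nft list ruleset | grep 8006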

Task 11: Inspect nftables rules for explicit rejects (refused often equals REJECT)

cr0x@server:~$ nft list ruleset | sed -n '1,120p'
table inet filter {
  chain input {
    type filter hook input priority filter; policy drop;
    iif "lo" accept
    ct state established,related accept
    tcp dport 22 accept
    tcp dport 8006 reject with tcp reset
  }
}

Means: Someone (or something) is explicitly rejecting 8006 with a TCP reset. That produces “Connection refused” from clients.

Decision: Fix the rule (remove the reject or replace with accept for your admin networks). Don’t “temporarily flush everything” unless you enjoy surprise exposure.
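
A minimal sketch of a targeted fix, using the table and chain names from the output above and a hypothetical admin subnet of 10.20.0.0/16. If this ruleset is owned by pve-firewall or configuration management, fix it at the source instead, or your change will be overwritten on the next run:

cr0x@server:~$ nft -a list chain inet filter input
cr0x@server:~$ nft insert rule inet filter input ip saddr 10.20.0.0/16 tcp dport 8006 accept
cr0x@server:~$ nft delete rule inet filter input handle 7

The first command lists rule handles; the handle in the delete is an example, yours will differ. nft insert puts the accept at the top of the chain, so it wins before the reject even if you keep that rule around.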

Task 12: Check corosync/quorum (cluster hosts can act haunted without it)

cr0x@server:~$ pvecm status
Cluster information
-------------------
Name: prod-cluster
Config Version: 42
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Thu Dec 25 10:47:11 2025
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1.2a
Quorate: Yes

Means: Quorum is fine. If it were Quorate: No, a bunch of config-dependent services can stall or refuse writes, and you’ll see secondary breakage.

Decision: If not quorate, decide: restore quorum (preferred) or isolate the node safely. Avoid “just reboot another node” as a strategy.

Task 13: Check disk space and inodes (boring, frequent, deadly)

cr0x@server:~$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/mapper/pve-root   94G   94G     0 100% /

Means: Root filesystem is full. Services that need to write PID files, logs, or state will fail in ways that look unrelated to storage.

Decision: Free space immediately and then restart affected services. Also find what grew (Task 14) and fix the underlying log/backup behavior.
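
Some low-risk ways to claw back space quickly (the vacuum size is an example; check what the journal actually uses first):

cr0x@server:~$ journalctl --disk-usage
cr0x@server:~$ journalctl --vacuum-size=200M
cr0x@server:~$ apt-get clean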

Task 14: Identify what’s eating the root filesystem

cr0x@server:~$ du -xhd1 /var | sort -h
1.1G	/var/cache
2.4G	/var/log
38G	/var/lib

Means: Something under /var/lib is large. On Proxmox, candidates include ISO storage, container images, backups, or runaway journald.

Decision: Drill down further (du -xhd1 /var/lib) and remove/move the correct offender. Don’t randomly delete cluster configs. That way lies regret.

Task 15: Validate certificates exist and are readable

cr0x@server:~$ ls -l /etc/pve/local/pve-ssl.pem /etc/pve/local/pve-ssl.key
-rw-r----- 1 root www-data  3882 Dec 25 10:10 /etc/pve/local/pve-ssl.pem
-rw-r----- 1 root www-data  1704 Dec 25 10:10 /etc/pve/local/pve-ssl.key

Means: Cert and key exist with plausible permissions.

Decision: If missing or corrupt, regenerate (Task 16). If permissions are wrong (too restrictive), fix ownership/mode so pveproxy can read them.

Task 16: Regenerate Proxmox certificates (when cert problems prevent pveproxy from starting)

cr0x@server:~$ pvecm updatecerts --force
Generating new certificates...
Restarting pveproxy...
Restarting pvedaemon...
done

Means: Proxmox has regenerated cluster-aware certs and restarted the relevant daemons.

Decision: Re-check systemctl status pveproxy and verify listening on 8006 (Task 2). If it still fails, go back to logs; don’t repeat this endlessly.
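
If your version of the tooling doesn't restart the daemons itself, do it once explicitly after the files are back, then confirm the listener:

cr0x@server:~$ systemctl restart pveproxy pvedaemon
cr0x@server:~$ ss -lntp | grep ':8006'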

Task 17: Check for port conflicts (rare, but worth one glance)

cr0x@server:~$ lsof -nP -iTCP:8006 -sTCP:LISTEN
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
pveproxy 1432 root    6u  IPv6  31245      0t0  TCP *:8006 (LISTEN)

Means: The right process owns the port.

Decision: If something else is listening on 8006, stop it or reconfigure it. Proxmox expects to own that port; fighting it is a lifestyle choice.

Task 18: Verify remote reachability from another host in the same subnet

cr0x@workstation:~$ nc -vz 10.20.10.11 8006
Connection to 10.20.10.11 8006 port [tcp/*] succeeded!

Means: TCP connection works. If your browser still errors, it’s likely TLS trust/UI caching, or you’re hitting the wrong IP/DNS.

Decision: Validate DNS resolution and browser target. Also check if you’re using a VIP, reverse proxy, or port-forward that changed during network updates.

One more operational truth: “Connection refused” is rarely a mystery and frequently a checklist failure. That’s good news. It means you can fix it without poetry.

Joke #1: If your post-update plan is “we’ll just roll back,” congratulations—you’ve discovered time travel, but only for package versions.

Three corporate mini-stories from the trenches

Incident #1: The wrong assumption (“It must be the firewall”)

A midsize SaaS company ran a three-node Proxmox cluster for internal CI runners and staging. After routine updates on one node, the web UI stopped responding on 8006. The on-call engineer’s first instinct was the usual: “security update tweaked the firewall.” They disabled the firewall service and still got “Connection refused.” That should have been the clue, but adrenaline is a persuasive drug.

The engineer then flushed nftables rules entirely. The node became reachable on 8006 for about thirty seconds and then went dark again. “See? Firewall.” Except it wasn’t. What actually happened was: pveproxy was in a restart loop, briefly binding 8006 between crashes. Flushing rules changed timing, not causality.

The real root cause lived in plain sight: root filesystem had hit 100%. pveproxy couldn’t write its state, pvedaemon got unhappy, and systemd performed its usual efficient failure choreography. The firewall scapegoat cost them about an hour, plus a fun audit finding because the node ran with an effectively open input policy longer than intended.

What fixed it wasn’t cleverness. They freed space under /var/log, rotated journals, restarted the services, and then added monitoring on filesystem fill. The final postmortem line was blunt: “We assumed cause from correlation (update → firewall). We didn’t validate the socket state first.”

Incident #2: The optimization that backfired (“Let’s lock down input policy to drop”)

A large corporate IT team standardized on a hardened host baseline. One change looked tidy on paper: set the default input policy to drop, then explicitly allow only required ports. Sensible. Controlled. Audit-friendly.

The problem was how they implemented it on Proxmox. They pushed an nftables template that allowed SSH and a handful of monitoring ports, but they assumed the web UI was behind a corporate reverse proxy and didn’t need direct 8006 access. That assumption was true for most environments. It was false for break-glass access when the reverse proxy was down—or during maintenance when admins SSHed from a jump host and expected to reach https://node:8006 directly.

After an update and reboot, Proxmox came back up fine, pveproxy listened on 8006, and local curl returned 200. From the jump host, every attempt got “Connection refused,” because the template used reject with tcp reset for unlisted ports. That’s the detail that misled people: a “refused” error smells like “service down.”

The fix was not “open everything.” They added a narrow allow rule for 8006 from the jump host subnet and kept the default drop policy. More importantly, they documented that management-plane ports are part of the break-glass path, not an optional convenience. The optimization (stronger baseline) was right; the rollout without operational exceptions was the mistake.

Incident #3: The boring practice that saved the day (pre-checks and staged reboots)

A financial services team ran Proxmox for VDI and a pile of internal services. They were conservative to the point of comedy: every update window began with a pre-check script, saved to a ticket, reviewed by a second person. It looked like bureaucracy until it wasn’t.

One night, updates included changes that required a reboot. Before rebooting, the pre-check captured: free space, memory pressure, cluster quorum, and whether the node was currently hosting critical VMs. It also captured “what is listening on 8006 right now” and the current firewall ruleset hash. Boring. Repetitive. Exactly the point.

After reboot, 8006 was refusing connections. The on-call engineer didn’t start flailing because they had a baseline: before reboot, nftables had an allow for 8006; after reboot, the allow was missing. That sharply reduced the search space. The issue was traced to a configuration management run that applied the wrong role to that node during the maintenance window.

They reverted the role, restored the allow rule, and were back online quickly. The postmortem had no heroics, just a line everyone hates and everyone needs: “Our pre-checks turned a mystery into a diff.” The only exciting part was how unexciting the recovery was.

Joke #2: The Proxmox UI didn’t “randomly break after updates.” It broke on schedule—you just didn’t read the calendar.

Common mistakes: symptom → root cause → fix

This section is where the time goes to die if you don’t recognize patterns. Here are the repeat offenders, tuned specifically for “Connection refused” on 8006 right after updates.

1) “Connection refused” immediately after reboot

  • Symptom: Browser says refused; ping works; SSH works.
  • Root cause: pveproxy failed to start (cert missing/corrupt, pmxcfs not mounted, or config syntax issue).
  • Fix: systemctl status pveproxy, then journalctl -u pveproxy. Validate the /etc/pve mount; regenerate certs with pvecm updatecerts --force only after /etc/pve is healthy.

2) Local curl works, remote refused

  • Symptom: curl -k https://127.0.0.1:8006 returns 200; remote browser refused.
  • Root cause: nftables/iptables rejects 8006, or service bound only to loopback, or IP moved to different interface/VLAN.
  • Fix: Check ss -lntp for bind address; check firewall rules (nft list ruleset); verify IP and routes (ip -br addr, ip route).

3) “It was working before the update,” and now only one node is broken

  • Symptom: Cluster has multiple nodes; only updated node refuses on 8006.
  • Root cause: That node’s /etc/pve is stale/unmounted, corosync link issue, or the node booted into a different network config (bridge name changes, missing VLAN).
  • Fix: Check pvecm status, systemctl status pve-cluster, mount | grep /etc/pve, and network config under /etc/network/interfaces (carefully).

4) “Connection refused” only from some networks

  • Symptom: Works from jump host, refused from your laptop, or vice versa.
  • Root cause: Management access rules are subnet-specific; maybe a corporate baseline introduced an explicit reject for 8006 except from approved ranges.
  • Fix: Confirm with nft counters and rule order; add explicit allow for 8006 from required admin sources.

5) 8006 “refused” and other random services flapping

  • Symptom: Not just the UI—metrics, backups, or container starts fail too.
  • Root cause: Disk full, inode exhaustion, or filesystem mounted read-only after errors.
  • Fix: df -h, df -i, dmesg -T | tail; remediate storage, then restart services.

6) You keep restarting services and it keeps failing

  • Symptom: systemctl restart pveproxy “works” but UI still refused; or it fails immediately.
  • Root cause: Restarting doesn’t fix the underlying missing files, permission issues, or blocked port. It just moves the timestamp.
  • Fix: Stop and read the logs. Fix the first error. Only then restart.

Checklists / step-by-step plan (safe recovery)

This is the plan you follow when you want the UI back and you don’t want to create collateral damage. It’s written to be safe on standalone hosts and cluster nodes.

Phase 1: Confirm scope (2–5 minutes)

  1. Verify host reachability: SSH in; confirm correct host (hostnamectl).
  2. Check local UI reachability: curl -k https://127.0.0.1:8006/.
  3. Check listener: ss -lntp | grep :8006.

If 8006 isn’t listening: go to Phase 2.
If it is listening locally but remote fails: skip to Phase 4 (network/firewall).

Phase 2: Fix the service (5–15 minutes)

  1. Status and logs: systemctl status pveproxy, then journalctl -u pveproxy -b.
  2. Dependencies: systemctl status pvedaemon pve-cluster.
  3. Check /etc/pve: mount | grep ' /etc/pve ' and ls -l /etc/pve/local/.
  4. Disk sanity: df -h /, df -i /. Fix if full before doing anything else.

Phase 3: Certificates and config filesystem (as needed)

  1. If cert files are missing or broken: ensure /etc/pve is properly mounted and populated.
  2. Then regenerate certs: pvecm updatecerts --force.
  3. Re-check listener: ss -lntp | grep :8006 and local curl.

Phase 4: Firewall and network path (when local works but remote doesn’t)

  1. Confirm bind address: if pveproxy is listening only on 127.0.0.1:8006, fix whatever constrains the bind (rare, and not the default; see the sketch after this list).
  2. Check Proxmox firewall: pve-firewall status. If enabled, confirm rules allow 8006 from your admin subnets.
  3. Check nftables: nft list ruleset and look for tcp dport 8006 rules that reject/drop.
  4. Validate IP and routing: ip -br addr, ip route.
  5. Test from another host: nc -vz <ip> 8006.
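
For step 1, the usual place a constrained bind comes from is /etc/default/pveproxy. A minimal sketch, assuming a LISTEN_IP override is the culprit (check the pveproxy man page for the options your version supports):

cr0x@server:~$ cat /etc/default/pveproxy
LISTEN_IP="127.0.0.1"
cr0x@server:~$ sed -i 's/^LISTEN_IP=.*/LISTEN_IP="0.0.0.0"/' /etc/default/pveproxy
cr0x@server:~$ systemctl restart pveproxy

Removing the LISTEN_IP line entirely also restores the default wildcard bind.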

Phase 5: Cluster-specific sanity (only if clustered)

  1. Check quorum: pvecm status.
  2. Check corosync: systemctl status corosync and logs if needed.
  3. Avoid risky “fixes”: don’t rip nodes out of the cluster just to make the UI come back. Restore network connectivity between nodes first.

When to stop and escalate

  • If you see filesystem errors in dmesg or remounts read-only.
  • If cluster quorum is lost and you’re not sure which nodes are “real.”
  • If the host is also your storage head (ZFS/NFS/iSCSI) and a bad move could take down other services.

FAQ

1) Why does “Connection refused” usually mean pveproxy, not the API daemon?

Because 8006 is the TLS endpoint served by pveproxy. If that daemon isn’t listening (or a firewall rejects), the TCP connection won’t establish. pvedaemon problems tend to show up as UI errors after connect, not as a refusal.

2) What’s the single fastest check?

ss -lntp | grep :8006. If nothing is listening, stop thinking about routers and start reading pveproxy logs.

3) Local curl works but browser from my laptop is refused. What now?

Assume firewall or network path. Check nftables for reject with tcp reset on 8006, validate the IP is on the correct interface, and test from a host in the same subnet using nc -vz.

4) Can I just restart pveproxy and be done?

You can try once. If it fails, repeated restarts are theater. Read journalctl -u pveproxy and fix the first error line.

5) Does a missing /etc/pve mount really break the UI?

Yes. Proxmox stores key config and cert material under /etc/pve, backed by pmxcfs. If it’s not mounted or unhealthy, services that depend on it can fail or start with missing files.

6) How often is disk full the real cause?

Often enough that it should be in your first five checks. Root being 100% full breaks daemons, cert generation, logging, and sometimes package postinst scripts. It’s not glamorous, but it’s common.

7) I updated packages but didn’t reboot. Can that still cause 8006 to refuse?

Yes. Package post-install scripts can restart services immediately. If a dependency changed (OpenSSL, perl modules, config filesystem state) the restart can fail even without a reboot.

8) If quorum is lost, will 8006 refuse connections?

Not always. Quorum loss more commonly causes management actions to fail or configs to become read-only-ish. But it can indirectly break certificate paths and pmxcfs behavior, which can keep pveproxy from starting.

9) Should I open 8006 to the internet to “make it work”?

No. Fix the actual issue. If you need remote access, put it behind a VPN, a bastion, or a tightly restricted allowlist. Exposing management planes is how you end up spending weekends learning about incident response.

10) What if 8006 is listening but I still can’t load the UI?

That’s no longer “connection refused.” Check browser errors, TLS alerts, and whether you’re hitting the correct IP/DNS. Also verify pvedaemon is running and that the host isn’t under extreme resource pressure.

Next steps you should actually do

Restore service first, then make it harder for this to happen again.

  1. Lock in the fast checks: ss -lntp, systemctl status pveproxy, journalctl -u pveproxy, df -h, nft list ruleset. Put them in your runbook exactly in that order.
  2. Add monitoring that catches the real killers: root filesystem usage, inode usage, pmxcfs health, and whether 8006 is listening from an external vantage point (a minimal probe sketch follows this list).
  3. Stop treating updates as “just patches”: schedule them, stage them, and capture a before/after diff of firewall and network state. The boring practice is the one that works at 3 a.m.
  4. For clusters: treat quorum and corosync connectivity as part of the management plane. If the cluster is unstable, the UI is a symptom, not the disease.
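
For the external vantage point in step 2, even a dumb TCP probe from a monitoring host beats nothing. A minimal sketch, assuming nc is installed and 10.20.10.11 is the management IP; wire the exit code into whatever alerting you already run:

cr0x@monitor:~$ nc -z -w 5 10.20.10.11 8006; echo "exit=$?"

Exit 0 means the TCP accept works; anything else is exactly the "connection refused" class of failure this article is about, caught before a human notices.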

If you do nothing else: the next time you see “Connection refused,” check whether anything is listening on 8006 before you touch a firewall. That single habit saves hours and keeps you from “fixing” the wrong thing aggressively.
