USB passthrough on Proxmox works great right up until it doesn’t. One minute your Zigbee coordinator is humming, your UPS is reporting cleanly, or your external SSD is happily serving backups. The next minute: the device disappears, the VM logs fill with reconnect spam, and whatever automation you trusted starts acting like it has free will.
This is usually not “a Proxmox bug.” It’s power, runtime power management, controller resets, flaky hubs, or a subtle mismatch between how you passed the device through and how Linux expects it to behave under load. The good news: you can make USB boring again. The better news: you can prove which layer is failing before you change anything.
Fast diagnosis playbook
When USB passthrough drops in production, you don’t have time for interpretive dance. You need a tight sequence that separates: device failure, power issues, kernel power management, controller reset behavior, and virtualization plumbing.
First: determine where the disconnect is happening
- Host only? Host kernel logs show disconnects and re-enumeration. The VM/container just sees “device vanished.” Fix host power/PM/controller first.
- Guest only? Host sees stable USB, but guest driver resets. Focus on guest kernel, QEMU USB emulation choice, and how you passed the device.
- Physical only? Device LED blinks/reset, hub clicks, or other devices on the same hub also drop. Fix power, cable, hub, or port.
Second: capture the evidence while it’s happening
- Run dmesg -Tw on the Proxmox host during a drop.
- Check the guest logs at the same timestamp.
- Correlate whether the bus resets (usb X-Y: reset) or the device disconnects (usb X-Y: USB disconnect).
Third: classify the failure mode
- Brownout / undervoltage: disconnects under load; hub or port shows power events; multiple devices may drop together.
- Autosuspend: drops after idle period; recurring interval; wakes on traffic and flaps.
- xHCI reset quirks: “xHCI host controller not responding” or frequent “reset SuperSpeed Gen 1 USB device.”
- Bad passthrough approach: device-level passthrough to QEMU works until the device re-enumerates with a different identity; passing through by bus/port or by the whole controller is needed.
Fourth: apply the smallest effective fix
- Swap cable/port/hub and ensure powered hub where appropriate.
- Disable autosuspend for that device (not globally, unless you’re desperate).
- If it’s a chronic xHCI reset issue: pass through the entire USB controller via PCIe (IOMMU), or apply conservative kernel quirks.
How USB passthrough actually fails (and why it’s not random)
USB is deceptively simple at the human level: plug thing in, thing works. Under the hood it’s a negotiation between device, hub, host controller, kernel drivers, and power policies. Add virtualization and you’ve now got two kernels and a hypervisor involved in that negotiation.
Most “random” USB disconnects follow a handful of predictable patterns:
- Enumeration instability: the device disconnects and comes back with a new address. If your passthrough was pinned to a fragile identifier, the guest loses it.
- Runtime power management: Linux tries to save power by suspending the device. Some devices interpret that as “time to panic.”
- Signal integrity: marginal cables, unshielded extensions, and overloaded hubs create CRC errors and resets, especially at USB 3.x speeds.
- Controller reset behavior: xHCI controllers can go into reset storms under certain conditions, and the kernel’s recovery logic is not always kind to long-running sessions.
- Virtualized timing and buffering: guest drivers sometimes behave differently when the USB transport is emulated or mediated through QEMU instead of being native.
There’s also a harsh truth: a lot of popular USB dongles (Zigbee sticks, Z-Wave sticks, RTL-SDRs) were designed for hobbyist desktops, not for always-on servers sitting behind a noisy hub on a rack shelf. You can still make them reliable. You just need to treat them like production dependencies, not cute accessories.
Paraphrased idea from Werner Vogels (Amazon CTO): you build reliability by assuming things will fail and designing so failures don’t become outages.
One joke, since we’re about to read logs for a while: USB stands for “Unexpected Sudden Bye-bye.” That’s not official, but it matches the tickets.
Interesting facts and historical context (useful, not trivia)
- USB autosuspend has been a Linux feature for years, and it’s grown more aggressive as laptops drove power-saving defaults. Servers inherit those defaults unless you override them.
- xHCI replaced EHCI/OHCI/UHCI as USB 3.x arrived, consolidating complexity into one controller model. Great for features; occasionally spicy for stability.
- USB device addresses are not stable identifiers. They’re assigned during enumeration and can change after a reset. If your configuration keys off a changing address, it will fail eventually.
- Some “USB 3” ports share internal hubs on motherboards, meaning two physical ports can be one logical root hub. A reset can take out both.
- VM USB passthrough historically started with emulated controllers (UHCI/OHCI/EHCI in QEMU). Modern setups prefer host-device passthrough or entire controller passthrough for reliability.
- USB selective suspend in Windows and autosuspend in Linux solve similar problems with different knobs. Many devices are tested against Windows defaults, not Linux server workloads.
- USB 3.x uses additional “SuperSpeed” pairs beyond USB 2.0’s wires. A cable that “works fine” at USB 2.0 can fail spectacularly at USB 3.x speeds.
- PCIe passthrough of a USB controller often fixes flakiness by removing the emulation layer and giving the guest native control—at the cost of flexibility.
Practical tasks: commands, outputs, and the decision you make
These are real operations tasks you can run on a Proxmox host. Each task includes: command, what you’re looking at, and what decision you make next. Run them during normal operation and during a failure if you can.
Task 1: Watch host kernel events live
cr0x@server:~$ sudo dmesg -Tw
[Thu Dec 26 11:18:02 2025] usb 2-2: USB disconnect, device number 7
[Thu Dec 26 11:18:03 2025] usb 2-2: new full-speed USB device number 8 using xhci_hcd
[Thu Dec 26 11:18:03 2025] usb 2-2: New USB device found, idVendor=10c4, idProduct=ea60, bcdDevice= 1.00
[Thu Dec 26 11:18:03 2025] usb 2-2: Product: CP2102N USB to UART Bridge Controller
What it means: The host lost the device and re-enumerated it. This is not “guest only.”
Decision: Focus on host-side power management, cabling, hub, and controller resets. Guest tuning alone won’t fix physical disconnects.
Task 2: Confirm what Proxmox thinks is attached (VID:PID and path)
cr0x@server:~$ lsusb
Bus 002 Device 008: ID 10c4:ea60 Silicon Labs CP210x UART Bridge
Bus 002 Device 002: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 1d6b:0002 Linux Foundation 2.0 root hub
What it means: You have a CP210x device on Bus 002. Device number can change. Vendor/product is stable.
Decision: Prefer passing through by vendor/product and physical port (host bus/port path) rather than by device number.
Task 3: Get a stable physical topology (find the port chain)
cr0x@server:~$ lsusb -t
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 5000M
|__ Port 2: Dev 8, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/14p, 480M
What it means: The device is on Bus 02, Port 2 of the root hub. That physical relationship is the anchor you want.
Decision: If you can keep it on the same port/hub chain, you can write deterministic udev rules and Proxmox passthrough configs.
Task 4: Inspect USB power control state for the device
cr0x@server:~$ DEVPATH=$(udevadm info -q path -n /dev/ttyUSB0); echo "$DEVPATH"
/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0
cr0x@server:~$ cat /sys$DEVPATH/power/control
auto
What it means: Runtime PM is enabled (auto). The kernel may autosuspend this interface.
Decision: If you see disconnects after idle, set this to on via udev rule (device-specific) or disable autosuspend.
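If you want to test the effect before writing a udev rule, you can flip the sysfs knob directly. This is temporary (it resets on replug or reboot) and reuses the DEVPATH from the previous command:
cr0x@server:~$ echo on | sudo tee /sys$DEVPATH/power/control
on
If the flapping stops, make it permanent with the udev rule shown later in the autosuspend section.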
Task 5: Check autosuspend delay for the interface
cr0x@server:~$ cat /sys$DEVPATH/power/autosuspend_delay_ms
2000
What it means: The system is willing to autosuspend after 2 seconds of idle. Many dongles hate this.
Decision: For serial dongles, coordinators, UPS USB HID, and SDRs: disable autosuspend for that device.
Task 6: Verify the USB controller and its driver (xHCI, etc.)
cr0x@server:~$ lspci -nnk | grep -A3 -i usb
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:a36d] (rev 10)
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
What it means: Your USB controller is Intel xHCI on 00:14.0.
Decision: If you later choose PCI passthrough of a controller, this is the device you’d isolate and pass through (if it’s safe and in its own IOMMU group).
Task 7: Check IOMMU enablement (required for PCIe controller passthrough)
cr0x@server:~$ dmesg | grep -E "DMAR|IOMMU" | head
[ 0.812345] DMAR: IOMMU enabled
[ 0.812678] DMAR: Host address width 39
[ 0.813210] DMAR: DRHD base: 0x000000fed91000 flags: 0x0
What it means: VT-d/IOMMU is on and the kernel sees it.
Decision: You can consider passing through a whole USB controller if it’s in a sane IOMMU group.
Task 8: Determine IOMMU group for the USB controller
cr0x@server:~$ for d in /sys/kernel/iommu_groups/*/devices/*; do
if [[ "$d" == *"0000:00:14.0"* ]]; then echo "USB controller is in: ${d%/*}"; fi
done
USB controller is in: /sys/kernel/iommu_groups/5/devices
What it means: Controller is in group 5. Now you must confirm group 5 doesn’t include devices you can’t give up to a VM (like SATA or the NIC).
Decision: If the group includes critical devices, do not pass that controller through; use a separate PCIe USB card instead.
Task 9: List everything in the same IOMMU group
cr0x@server:~$ ls -1 /sys/kernel/iommu_groups/5/devices
0000:00:14.0
0000:00:14.2
What it means: There’s another device (00:14.2) in the group. You need to identify it before any passthrough.
Decision: If it’s something like the Intel PCH thermal sensor or MEI, you might still decide to pass the group through; if it’s storage or the NIC, stop.
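A quick way to identify the neighbor is to ask lspci about that specific address. The output below is illustrative; yours will differ:
cr0x@server:~$ lspci -nns 00:14.2
00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f]
Shared SRAM or thermal devices are usually tolerable companions; a SATA controller or NIC in the same group is a hard stop.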
Task 10: Inspect Proxmox VM config for fragile USB mapping
cr0x@server:~$ sudo cat /etc/pve/qemu-server/101.conf
agent: 1
boot: order=scsi0;net0
cores: 4
memory: 4096
net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0
scsi0: local-lvm:vm-101-disk-0,iothread=1,size=32G
usb0: host=10c4:ea60
What it means: This VM passes through by VID:PID, which is usually good. But if you have multiple identical dongles, it can still pick the wrong one.
Decision: If you have more than one matching device, pass through by bus/port (or use udev to create stable symlinks and bind by serial/path).
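If you decide to pin by physical port instead, Proxmox accepts a bus-port path in the usbN option. A minimal sketch, assuming the Bus 02 / Port 2 location from Task 3 (substitute your own lsusb -t path and VM ID):
cr0x@server:~$ sudo qm set 101 -usb0 host=2-2
update VM 101: -usb0 host=2-2
The mapping now follows the physical port, so a re-enumerated device on that port is re-attached; moving the cable to another port breaks it.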
Task 11: Confirm if the device has a serial number you can pin to
cr0x@server:~$ sudo udevadm info -a -n /dev/ttyUSB0 | grep -E "serial|idVendor|idProduct" | head -n 10
ATTRS{idVendor}=="10c4"
ATTRS{idProduct}=="ea60"
ATTRS{serial}=="01A2B3C4"
What it means: Great: a stable serial is exposed. This is gold for rules and stable assignment.
Decision: Write udev rules to set power/control and to create stable symlinks keyed on serial.
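A minimal sketch of such a rule, assuming the serial shown above; the symlink name zigbee0 is an arbitrary example:
cr0x@server:~$ sudo tee /etc/udev/rules.d/99-usb-serial-names.rules > /dev/null <<'EOF'
SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", ATTRS{serial}=="01A2B3C4", SYMLINK+="zigbee0"
EOF
cr0x@server:~$ sudo udevadm control --reload-rules && sudo udevadm trigger
After a replug, /dev/zigbee0 points at the right tty no matter which number the kernel hands out.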
Task 12: Check if usbcore autosuspend is globally enabled via kernel cmdline
cr0x@server:~$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.8.12-4-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
What it means: No explicit usbcore autosuspend setting. Defaults apply (often autosuspend on).
Decision: Prefer device-specific fixes, but if you’re debugging, you can temporarily set usbcore.autosuspend=-1 to test the hypothesis.
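If you want to test without a reboot, the module parameter can be changed at runtime. Note it only applies to devices enumerated after the change, so replug the device afterward:
cr0x@server:~$ cat /sys/module/usbcore/parameters/autosuspend
2
cr0x@server:~$ echo -1 | sudo tee /sys/module/usbcore/parameters/autosuspend
-1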
Task 13: Validate whether runtime PM is actually suspending the device
cr0x@server:~$ cat /sys$DEVPATH/power/runtime_status
suspended
What it means: The interface is suspended right now. If your service expects constant presence, this is a red flag.
Decision: Disable autosuspend for the device and watch whether disconnects stop.
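To catch it in the act, a simple polling loop with timestamps is enough. A rough sketch using the DEVPATH from Task 4; the output lines are illustrative:
cr0x@server:~$ while true; do echo "$(date -Is) $(cat /sys$DEVPATH/power/runtime_status)"; sleep 30; done
2025-12-26T11:40:12+00:00 active
2025-12-26T11:40:42+00:00 suspended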
Task 14: Check for xHCI “controller not responding” patterns
cr0x@server:~$ journalctl -k -b | grep -i -E "xhci|host controller|not responding|reset" | tail -n 12
Dec 26 10:55:14 server kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
Dec 26 10:55:14 server kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
Dec 26 10:55:15 server kernel: usb 2-2: USB disconnect, device number 7
Dec 26 10:55:17 server kernel: xhci_hcd 0000:00:14.0: xHCI Host Controller
Dec 26 10:55:17 server kernel: xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
What it means: This is a controller-level reset. Everything under that controller will drop, not just your dongle.
Decision: Stop wasting time on per-device tweaks. Use a different USB controller (add a PCIe USB card) or pass through a dedicated controller to the VM.
Power and signal integrity: the unglamorous root cause
If you want stable USB passthrough, start by assuming your setup is underpowered or electrically messy. That assumption is often correct, and it’s cheaper than chasing ghost bugs.
Common power-related patterns
- “Disconnect under load”: external SSD drops when backup starts; SDR dongle drops when sampling ramps; Zigbee stick drops when radio transmits more.
- “Multiple devices drop together”: indicates hub or controller reset, or shared power rail sag.
- “Works on desktop, fails on server”: desktop ports often have different hub topology and sometimes better front-panel cabling than that random internal header adapter you used in the server.
What to do (opinionated)
Use a powered hub for low-quality dongles and any long cable run. Yes, even if the device “should” draw little power. Many dongles are fine on average but spike at annoying times. The hub’s regulator and bulk capacitance often make the difference.
Avoid bargain-bin USB 3 cables. USB 3 signal integrity is less forgiving. If you need a long run, use a high-quality cable or keep it at USB 2 speeds where possible.
Prefer rear I/O ports directly on the motherboard. Front panel headers and internal bracket cables are a reliability tax.
How to prove it’s power
- Disconnects correlate with device activity, not time.
- Moving to a powered hub or different port reduces or eliminates drops.
- You see “over-current” or “power surge” type messages (less common, but very telling).
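To check for those messages explicitly, grep the kernel log; the hit below is illustrative:
cr0x@server:~$ journalctl -k -b | grep -iE "over-current|insufficient.*power"
Dec 26 09:41:03 server kernel: usb usb2-port2: over-current condition
Even one of these lines is strong evidence that you’re fighting power delivery, not software.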
Second joke (and the last one): a powered hub is basically an apology letter to physics with a power brick attached.
Autosuspend and runtime PM: the silent device killer
Linux runtime power management is a good idea with a bad habit: it assumes devices implement the spec correctly. Many do. Some absolutely do not. For USB serial adapters, radio coordinators, and HID-ish devices that pretend to be simple, autosuspend can cause flapping or “device vanished” behavior that looks like hardware failure.
Three approaches, from best to worst
1) Disable autosuspend per device via udev (preferred)
Find identifying attributes (vendor/product, serial, or interface) and force power/control=on for that device. This keeps runtime PM enabled for everything else.
cr0x@server:~$ sudo tee /etc/udev/rules.d/99-usb-no-autosuspend.rules > /dev/null <<'EOF'
ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="10c4", ATTR{idProduct}=="ea60", TEST=="power/control", ATTR{power/control}="on"
EOF
cr0x@server:~$ sudo udevadm control --reload-rules
cr0x@server:~$ sudo udevadm trigger
What it means: New device additions matching VID:PID will have runtime PM forced on.
Decision: If your device drops after idle, this is usually the first durable fix.
2) Disable autosuspend globally (use for testing, avoid as a permanent habit)
Setting usbcore.autosuspend=-1 turns off autosuspend across the board. This can be useful to prove causality quickly, especially in an incident. But it’s a shotgun.
cr0x@server:~$ sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="/GRUB_CMDLINE_LINUX_DEFAULT="usbcore.autosuspend=-1 /' /etc/default/grub
cr0x@server:~$ sudo update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.8.12-4-pve
done
What it means: Next boot will disable USB autosuspend globally.
Decision: If stability improves immediately, go back and implement per-device udev rules instead of leaving this forever.
3) Keep-alive traffic (sometimes necessary, often ugly)
Some users run periodic reads/writes or polling to keep devices awake. This is a workaround when device firmware is fragile. It’s also how you end up with a cron job that becomes “critical infrastructure.”
Decision: Use keep-alives only when you’ve proven autosuspend is the trigger and udev power controls can’t tame it.
xHCI resets, quirks, and controller-level fixes
If your logs show the xHCI controller dying and coming back, you’re not dealing with “a flaky dongle” anymore. You’re dealing with the host controller or its interaction with firmware, PCIe power management, or a bad combination of devices on the bus.
Recognize controller resets
These patterns matter:
- xHCI host controller not responding, assume dead
- HC died; cleaning up
- Bus numbers getting re-assigned after the reset
- Multiple USB devices disconnecting at once
Fix hierarchy (do this in order)
1) Firmware/BIOS sanity
- Update motherboard BIOS/UEFI. USB stability bugs are depressingly common in firmware.
- Disable “ErP” or deep sleep features that meddle with USB power rails, if applicable.
- Consider disabling PCIe ASPM for servers that don’t need it. ASPM can interact badly with some controllers.
2) Add a dedicated PCIe USB controller (my go-to)
A $20–$50 PCIe USB card with a reasonable chipset can isolate your “problem child” devices from the PCH’s integrated controller. You can then pass through that controller or keep it on the host with tuned power settings.
Decision: If controller resets are killing uptime, buy isolation. It’s cheaper than your time.
3) Pass through the entire controller (PCI passthrough)
For devices that hate mediated passthrough, giving the VM direct control over the USB controller is often the end of the story. The VM sees native hardware; resets and re-enumerations stay inside the guest.
Tradeoffs: you lose host access to those ports, and you must manage IOMMU groups correctly. Also: live migration becomes harder or impossible depending on your setup.
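On Proxmox that usually means a hostpciN entry pointing at the controller’s PCI address from Task 6. A minimal sketch, assuming the 00:14.0 controller, that Tasks 7–9 showed a clean IOMMU group, and that any usbN lines for the same device are removed first:
cr0x@server:~$ sudo qm set 101 -hostpci0 0000:00:14.0
update VM 101: -hostpci0 0000:00:14.0
After the VM starts, the host no longer owns those ports; everything plugged into that controller belongs to the guest.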
4) Kernel parameters and quirks (surgical, but know what you’re doing)
Sometimes you can work around flakiness with kernel parameters affecting xHCI or USB core behavior. This is where you keep change control tight because you’re affecting the host.
Decision: If a dedicated controller fixes it, don’t get clever with quirks. If you must use quirks, document the rationale and the observed log patterns.
Choosing the right passthrough mode (device vs. port vs. controller)
Most Proxmox USB passthrough pain is self-inflicted by picking the wrong granularity. The right choice depends on how the device behaves when it disconnects and re-enumerates.
Option A: Pass through a USB device (VID:PID)
Good for: unique devices with stable identity, where re-enumeration doesn’t confuse the mapping.
Bad for: multiple identical devices, or devices that change behavior across modes (bootloader vs. runtime) and present different IDs.
Failure mode: the wrong device gets attached after a reboot, or the device disappears and comes back as a different ID and never reattaches.
Option B: Pass through by bus/port path
This is more deterministic if the device stays physically on the same port chain. It’s also less flexible if you move cables around.
Good for: labs and servers where the dongle is “part of the machine,” not a movable accessory.
Option C: Pass through the entire controller (PCI)
Good for: devices needing low-level access, devices that re-enumerate in messy ways, and setups where the guest should own the USB stack end-to-end (Home Assistant VMs are common here).
Bad for: hosts where you need those ports for other host functions, or where IOMMU grouping prevents safe isolation.
VM vs. LXC: what changes, what doesn’t
Proxmox gives you two main consumers of USB: QEMU VMs and LXC containers. The hardware realities don’t change: power is power, the host controller is the host controller, and the host kernel is still in the loop unless you pass through a PCI controller.
QEMU VMs
- You can attach USB devices to the VM via host passthrough.
- If the host loses the device, the VM will lose it too.
- Passing through the whole controller via PCIe gives the guest native enumeration and often the best stability.
LXC containers
- LXC shares the host kernel. You are not virtualizing the USB stack; you’re granting access to device nodes.
- Autosuspend and runtime PM are host behaviors. Fix them on the host.
- Permissions and cgroup device rules matter: a “disconnect” may actually be “device node changed and container can’t open it.”
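To make the last two points concrete, here is a minimal sketch of raw LXC keys appended to the container’s config under /etc/pve/lxc/. The by-id path and the character major (188 for USB serial adapters) are examples to adjust for your device:
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.mount.entry: /dev/serial/by-id/usb-Silicon_Labs_CP2102N_01A2B3C4-if00-port0 dev/ttyUSB0 none bind,optional,create=file
The point is to bind a stable by-id path into the container instead of the host’s /dev/ttyUSB0, so a re-enumeration doesn’t leave the container holding a dead node.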
Decision rule: If you need a finicky USB device to be stable, a VM with PCIe USB controller passthrough is often the cleanest architecture. LXC is great, but it’s not a USB isolation boundary.
Common mistakes: symptom → root cause → fix
1) Device drops exactly after a few seconds/minutes of idle
Symptom: Works when actively used; drops when quiet; reconnects when traffic resumes; sometimes the guest app never recovers.
Root cause: USB autosuspend/runtime PM suspending a device that can’t resume cleanly.
Fix: udev rule: set power/control=on for that VID:PID or serial; optionally test with usbcore.autosuspend=-1.
2) Multiple USB devices vanish simultaneously
Symptom: Zigbee stick, UPS cable, and keyboard drop at the same time. Bus numbers reset.
Root cause: xHCI controller reset or power rail drop affecting a shared hub/controller.
Fix: move critical device to a dedicated controller; add PCIe USB card; update BIOS; avoid overloaded hubs.
3) External USB disk drops during backup jobs
Symptom: I/O errors in guest or host, followed by device re-enumeration; ZFS or backup job fails.
Root cause: power/cable issues; UAS quirks; enclosure firmware; hub brownout.
Fix: powered hub or direct port; shorter cable; consider disabling UAS for the enclosure (device-specific); prefer SATA/NVMe for serious backup targets.
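If you suspect UAS specifically, the kernel accepts a per-device quirk on the command line. A minimal sketch, assuming an enclosure ID of 174c:55aa (substitute the VID:PID from lsusb); the u flag tells the kernel not to bind the uas driver for that device:
cr0x@server:~$ sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="/GRUB_CMDLINE_LINUX_DEFAULT="usb-storage.quirks=174c:55aa:u /' /etc/default/grub
cr0x@server:~$ sudo update-grub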
4) Passthrough works until device firmware update mode
Symptom: You flash a dongle; it disappears; it comes back as a different USB ID; VM no longer captures it.
Root cause: bootloader enumerates as a different VID:PID and your mapping is too narrow.
Fix: temporarily passthrough by port path; or passthrough controller; or include both IDs if you know them.
5) Two identical dongles swap places after reboot
Symptom: VM gets “the wrong stick” and everything breaks in subtle ways.
Root cause: mapping by VID:PID without unique serial; enumeration order changes.
Fix: pin by serial via udev; or use physical port mapping; label cables like an adult.
6) LXC container loses device after reconnect
Symptom: Container had /dev/ttyUSB0, then after reconnect it becomes /dev/ttyUSB1; app fails.
Root cause: device node assignment changes; container permissions/cgroup allowlist doesn’t match new node.
Fix: create stable symlink in /dev/serial/by-id or /dev/serial/by-path; bind-mount stable path into container; ensure cgroup device permissions allow it.
7) “Fix” was disabling autosuspend globally and now other USB behaves oddly
Symptom: Power draw increases; some devices behave differently; laptop-like power features become irrelevant, but you’re on a server anyway.
Root cause: global hammer used for a per-device problem.
Fix: revert global setting; implement udev rule for the specific device; confirm runtime_status stays active for that device only.
Checklists / step-by-step plan
Step-by-step hardening plan (production-friendly)
- Capture evidence: record host dmesg -Tw output during at least one disconnect; save relevant journalctl -k lines.
- Confirm topology: run lsusb -t; note the bus/port chain. If you can’t describe where it’s plugged in, you can’t make it stable.
- Baseline power: move the device to a rear motherboard port; remove extension cables; test with a powered hub if it’s a dongle.
- Disable autosuspend per device: implement a udev rule for VID:PID (and serial if available). Replug and verify power/control shows on.
- Verify stability under idle: leave it idle for longer than the previous failure window; confirm no disconnects in journalctl -k.
- Stress the device: generate realistic traffic (radio activity, disk I/O, UPS polling). Watch for resets.
- If controller resets appear: stop. Add a dedicated PCIe USB controller or isolate via PCI passthrough.
- Choose the passthrough granularity: device passthrough if unique and stable; port/path if physically fixed; controller passthrough for finicky devices.
- Pin identity: use stable symlinks in /dev/serial/by-id or by-path; avoid brittle numbering like /dev/ttyUSB0.
- Write it down: document which port, which rule, and which VM config line. Future you will be tired.
- Monitor it: add a lightweight check that alerts on kernel disconnect messages or missing device nodes.
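A lightweight version of that check can be a few lines of shell run from cron or a systemd timer. A rough sketch, where the by-id path is an example to replace with your device:
cr0x@server:~$ sudo tee /usr/local/bin/check-usb-dongle.sh > /dev/null <<'EOF'
#!/bin/bash
# Fail (exit 1) if the stable symlink is missing or the kernel logged a USB
# disconnect in the last 10 minutes; the device path below is an example.
DEV="/dev/serial/by-id/usb-Silicon_Labs_CP2102N_01A2B3C4-if00-port0"
if [[ ! -e "$DEV" ]]; then
  logger -p user.err "usb-check: $DEV is missing"
  exit 1
fi
if journalctl -k --since "10 minutes ago" | grep -q "USB disconnect"; then
  logger -p user.warning "usb-check: kernel logged a USB disconnect recently"
  exit 1
fi
EOF
cr0x@server:~$ sudo chmod +x /usr/local/bin/check-usb-dongle.sh
Wire the exit code into whatever alerting you already have; the script itself is deliberately boring.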
Rollback checklist (because you will need it once)
- Keep a copy of the original /etc/default/grub and the udev rule files.
- Apply one change at a time when possible. If you change hub + kernel params + passthrough mode, you won’t know what fixed it.
- Reboot windows: plan them. Some USB PM settings require a reboot to fully settle; udev rules don’t.
Three corporate mini-stories from the trenches
Incident caused by a wrong assumption: “It’s in the VM, so it’s a VM problem”
At a mid-size company, a team ran a Proxmox cluster supporting a handful of “small but critical” services. One of them: a VM that talked to a USB-connected UPS for safe shutdown signaling. They passed the USB HID device through to the VM and moved on with life.
Months later, they saw sporadic VM-side errors: the UPS software lost the device for a few seconds, then reconnected. The team assumed it was a guest driver glitch because the VM logs showed the symptom first. They upgraded the VM kernel, pinned packages, and even swapped UPS software. It improved nothing.
During a longer outage, someone finally tailed host dmesg while reproducing the issue. The host was logging USB disconnects on the same timestamps. It wasn’t the VM. It was the host suspending the device and occasionally failing to resume cleanly.
The fix was boring: a udev rule forced power/control=on for that VID:PID, and the device stayed awake. They also moved the UPS cable to a rear port away from a noisy hub used for lab gear. The incident writeup had one key correction: “Always prove which kernel logged the disconnect first.”
Optimization that backfired: “Power saving everywhere”
In another organization, someone decided to get aggressive about power efficiency. This was a rack of small servers in an office with limited cooling. They enabled a set of power-saving tweaks, including deeper C-states and more aggressive runtime power management defaults.
The USB passthrough casualties started quietly: a Zigbee coordinator in a Home Automation VM would drop once a day. Then an external USB SSD used for a weekly export started throwing I/O errors under sustained write. It didn’t fail in the lab; it failed in the office at 2 a.m., which is the traditional time for surprises.
Engineers chased application issues, then storage stack issues, then QEMU issues. The common factor was “idle-to-busy transitions.” Autosuspend and runtime PM were saving power in the only way they know how: by taking naps at the worst time.
They backed out the aggressive USB autosuspend behavior device-by-device instead of globally, and kept CPU power saving because it wasn’t implicated. The lesson wasn’t “power saving is bad.” It was “power saving needs per-device exceptions when USB peripherals are part of your reliability chain.”
Boring but correct practice that saved the day: dedicated controllers and deterministic wiring
A finance-adjacent company had a Proxmox host running a mix of services, including a VM that handled a small hardware security module connected over USB. Not a mass-market dongle; still USB, still finicky.
They treated it like any other production dependency. The USB device lived on a dedicated PCIe USB controller. That controller was passed through to the VM via PCI passthrough. The physical port was labeled, the cable was short and known-good, and there was a powered hub only where it added electrical stability (not because they ran out of ports).
When the host later received kernel upgrades, a handful of unrelated USB devices on the motherboard ports started exhibiting occasional resets. Nobody cared. The critical USB chain was isolated. The VM kept its controller, its device, and its stability.
The practice was boring because it required planning, not heroics. It also meant incidents were about the actual app, not about whether a dongle decided to re-enumerate today.
FAQ
1) Should I disable USB autosuspend globally on Proxmox?
Only as a short diagnostic test. If it helps, implement a per-device udev rule that forces power/control=on. Global disable is a blunt instrument.
2) Why does passing through by “Bus 002 Device 008” fail later?
Because the “Device 008” number is assigned at enumeration and changes after disconnects/resets. Use VID:PID, serial, by-path, or pass through the whole controller.
3) My Zigbee/Z-Wave stick resets when I plug it into a USB 3 port. Why?
USB 3 ports can be electrically noisier and often share hubs differently. Some 2.4 GHz radios are sensitive to interference, and some dongles behave better on USB 2. Use a USB 2 extension or a powered hub that keeps it away from the host.
4) Is PCI passthrough of a USB controller always better?
It’s often the most stable, but not always the most convenient. It reduces flexibility (host can’t use those ports) and can complicate migrations. Use it when device-level passthrough keeps breaking.
5) How do I know it’s power and not Linux power management?
Power issues correlate with load and may affect multiple devices; autosuspend correlates with idle time and shows runtime_status=suspended. Logs plus timing tell the story.
6) Why does my LXC container lose /dev/ttyUSB0 after a reconnect?
Because the kernel may reassign the node as /dev/ttyUSB1, etc. Use stable symlinks in /dev/serial/by-id or by-path and bind-mount those into the container.
7) Can hubs cause disconnects even if the device draws very little power?
Yes. Hubs can be unstable due to poor power regulation, firmware quirks, or signal integrity. A powered, decent-quality hub often fixes “mystery” drops.
8) My device has two different VID:PID pairs depending on mode. How do I handle passthrough?
Either passthrough by physical port/path so re-enumeration is still captured, or include both IDs, or use controller passthrough for the duration of firmware updates.
9) Is this a Proxmox issue or a Linux kernel issue?
Most of the time it’s neither; it’s the hardware, power management defaults, or controller behavior. Proxmox sits on the Linux kernel; host logs are the arbiter.
10) What’s the most reliable setup for a “must not drop” USB device?
Dedicated PCIe USB controller, controller passed through via PCI, device-specific autosuspend disabled on the host if host still touches it, short known-good cable, and deterministic port labeling.
Conclusion: next steps that actually stick
If you’re fighting Proxmox USB passthrough disconnects, don’t start with exotic kernel flags. Start with proof, then apply fixes that survive reboots and upgrades.
- Prove where the disconnect happens: host dmesg -Tw is your first truth source.
- Stabilize the physical layer: rear port, good cable, powered hub if needed.
- Kill autosuspend for the specific device using a udev rule; verify power/control and runtime_status.
- If the controller resets: isolate with a dedicated PCIe USB controller and consider PCI passthrough.
- Make identity deterministic: serial/by-id/by-path, not tty numbering.
- Document and monitor: the best fix is the one you can explain during an incident at 3 a.m.
USB can be stable in a hypervisor. It just needs the same respect you give disks and networks: clean power, deterministic wiring, and configurations that assume the device will misbehave eventually.