You reboot a Debian 13 box for something routine—kernel update, firmware, a “quick” storage change—and it comes back with the cheery message:
“Dependency failed”. The login prompt never arrives, or you land in emergency mode staring at a blinking cursor and your own life choices.
The trick is that systemd is rarely failing “the system.” It’s failing a unit, and one unit (or a small chain) is enough to stall the boot. Your job is to identify
the single service that blocks startup, then decide whether to fix it, isolate it, or make the boot resilient.
What “Dependency failed” actually means in systemd
systemd is a dependency scheduler. It turns “start the machine” into a directed graph of units (services, mounts, sockets, devices, targets).
When you see “Dependency failed”, you’re typically seeing the downstream symptom, not the upstream cause.
Example: remote-fs.target can fail because a single .mount unit failed, which can fail because a device wasn’t found,
which can be because a LUKS mapping didn’t unlock, which can be because a keyfile wasn’t available yet. systemd will politely tell you
“dependency failed” at each hop. It’s like hearing “the meeting is canceled” and trying to debug the entire company calendar.
Three unit relationships that matter during boot
- Requires= — hard dependency. If A has Requires=B and B fails, A fails.
- Wants= — soft dependency. If B fails, A can still proceed.
- After=/Before= — ordering only. Doesn’t imply dependency, but can create “waiting” chains that look like hangs.
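If the difference feels abstract, here is a hypothetical drop-in that uses all three; the unit names (app.service, postgres.service, metrics-agent.service) are made up for illustration:
# /etc/systemd/system/app.service.d/deps.conf (hypothetical)
[Unit]
# Hard dependency: if postgres.service fails to start, app.service fails with it.
Requires=postgres.service
# Soft dependency: app.service still starts even if the metrics agent fails.
Wants=metrics-agent.service
# Ordering only: wait for these before starting, but do not fail because of them.
After=postgres.service network-online.target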
Most boot stalls that present as “Dependency failed” boil down to one of these: a mount that can’t mount, a device that never appears, a network-online wait,
or a storage import that times out. Find that single unit, and you’ve got the thread to pull.
Joke #1: systemd isn’t “slow”; it’s just extremely committed to waiting for that one thing you promised would exist.
Fast diagnosis playbook (do this first)
This is the high-signal sequence when you have a console (physical, IPMI, or VM console) and you need the culprit quickly. You’re hunting
for the one unit that’s either failed or waiting forever.
1) Get to a shell without wasting time
If you’re stuck in a boot splash, switch to another TTY (often Ctrl+Alt+F2). If you’re in emergency mode, you already have a shell.
If you cannot log in, use recovery mode in GRUB or add systemd.unit=emergency.target once for this boot.
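If you go the GRUB route, press e on the boot entry, find the line beginning with linux, and append the parameter for this boot only (the kernel path and root= value are placeholders here):
linux /vmlinuz-... root=UUID=... ro quiet systemd.unit=emergency.target
Boot with Ctrl+x or F10. The change is not persistent; the next reboot uses the normal target.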
2) Identify what failed in the previous boot
Run journalctl -b -1 -p err (if you rebooted) or journalctl -b -p err (current boot). You want the first error,
not the last complaint.
3) Ask systemd what blocked the boot critical path
systemd-analyze critical-chain points at the unit that dominated boot time. If you see a mount or a network-online target sitting there for 1–2 minutes,
that’s your blocker. If you see a service “waiting,” inspect its dependencies with systemctl list-dependencies.
4) Confirm the single unit and its upstream cause
For that unit, run systemctl status UNIT and then journalctl -u UNIT -b. If it’s a mount: check /etc/fstab.
If it’s storage: check device presence. If it’s network-online: check the network stack actually used.
5) Make a decision: fix now, bypass now, or isolate
- Fix now: correct fstab, unlock LUKS, import the pool, correct the unit file.
- Bypass now: temporarily comment out a failing fstab line, add nofail, or mask a non-critical unit.
- Isolate: boot into multi-user.target without the problematic target, or use emergency mode to regain control.
Your goal is uptime first, perfection later. The system doesn’t care that your NFS mount is “important” if it’s preventing SSH from starting.
Interesting facts and historical context (for your mental model)
- systemd didn’t just replace SysV init; it replaced the boot philosophy. Instead of linear scripts, boot became a dependency graph with jobs and timeouts.
- The phrase “Dependency failed” is systemd reporting a job failure propagation, not a “kernel panic”-class failure. It’s higher-level and usually fixable without reinstalling.
- Debian adopted systemd as default in Debian 8 (Jessie), after years of debate. The main operational impact: fewer “mystery sleeps” in init scripts, more explicit ordering.
- Targets like network-online.target exist because “network is configured” and “network is usable” are different states; many services mistakenly depend on the latter.
- remote-fs.target and local-fs.target are coordination points, not “real services.” A single bad mount can hold them hostage.
- systemd’s mount handling is tightly integrated with /etc/fstab. A typo there can manifest as a unit failure, not as a friendly message from mount.
- LUKS and dm-crypt are typically orchestrated during boot by initramfs and systemd units; a missing keyfile can become a “dependency failed” two steps away from the real issue.
- Debian’s default journald configuration historically balanced disk usage vs. forensics. If logs are volatile-only, you may lose the evidence after a hard reset.
One idea worth paraphrasing from W. Edwards Deming: a system’s results are mostly determined by the system itself, not by heroic effort.
That applies to boot, too—fix the dependency system, not just today’s failure.
The “one blocker” method: isolate the unit that stalls boot
“Dependency failed” is loud but unhelpful. The useful question is: which unit is earliest in the failure chain and/or which unit is longest on the critical chain?
Those are often the same, but not always.
Two patterns you’ll see
Pattern A: a unit fails quickly, and other units fail as a consequence
This is the clean case. Example: mnt-data.mount fails because UUID doesn’t exist. Then local-fs.target reports dependency failure,
and you drop to emergency mode.
Pattern B: a unit doesn’t fail; it waits until timeout
This is the “it feels hung” case. Example: systemd-networkd-wait-online.service waits 2 minutes for a carrier that never comes up.
The system eventually boots, but late, and you’ll get dependency failures if other services demand “online” rather than “configured.”
Operational rule: blame the first failing upstream unit, not the target
When a target fails, it’s almost never the target’s fault. The target is a mailbox. The letter inside is a failed mount, a failed import,
a stale device path, or a service with a hard dependency it shouldn’t have.
The fastest way to find “the one service” is to:
(1) list failed units,
(2) inspect the critical chain,
(3) read the journal for that unit,
(4) map the dependency graph just far enough to see what’s actually missing.
Practical tasks (commands, what the output means, the decision you make)
These are field commands you can run on Debian 13. Each task includes what you’re looking at and what decision follows. Use them in order when possible.
When the system is half-booted, prefer journalctl and systemctl over guesswork.
Task 1 — List failed units (your first shortlist)
cr0x@server:~$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● mnt-data.mount loaded failed failed /mnt/data
● systemd-networkd-wait-online.service loaded failed failed Wait for Network to be Configured
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit state, i.e. generalization of SUB.
SUB = The low-level unit state, values depend on unit type.
2 loaded units listed.
What it means: You have concrete suspects. Ignore “targets” unless they’re the only thing listed.
Decision: pick the unit that matches the boot symptom: emergency mode often points to mounts; long boots often point to wait-online.
Task 2 — See what blocked boot time (critical chain)
cr0x@server:~$ systemd-analyze critical-chain
graphical.target @2min 4.112s
└─multi-user.target @2min 4.112s
└─remote-fs.target @2min 3.900s
└─mnt-nfs.mount @2min 3.880s +2min
└─network-online.target @3.870s
└─systemd-networkd-wait-online.service @1.500s +2.360s
└─systemd-networkd.service @1.200s +250ms
└─systemd-udevd.service @900ms +280ms
└─systemd-tmpfiles-setup-dev.service @650ms +220ms
└─kmod-static-nodes.service @420ms +200ms
└─systemd-journald.socket @380ms
└─system.slice @370ms
What it means: mnt-nfs.mount consumed 2 minutes. That’s your bottleneck, even if “Dependency failed” points elsewhere.
Decision: investigate that mount and why it requires network-online; decide whether it should be nofail or automounted.
Task 3 — Show the exact error lines from the current boot
cr0x@server:~$ journalctl -b -p err --no-pager
Mar 12 08:41:02 server systemd[1]: mnt-data.mount: Mount process exited, code=exited, status=32/n/a
Mar 12 08:41:02 server systemd[1]: mnt-data.mount: Failed with result 'exit-code'.
Mar 12 08:41:02 server systemd[1]: Failed to mount /mnt/data.
Mar 12 08:41:02 server systemd[1]: local-fs.target: Dependency failed for Local File Systems.
Mar 12 08:41:10 server systemd-networkd-wait-online[412]: Timeout occurred while waiting for network connectivity.
What it means: There are at least two independent problems: a failed local mount and a network wait timeout.
Decision: fix local mounts first if you’re in emergency mode; fix wait-online next if boot is just slow.
Task 4 — Inspect a suspect unit’s status (what systemd thinks happened)
cr0x@server:~$ systemctl status mnt-data.mount --no-pager
× mnt-data.mount - /mnt/data
Loaded: loaded (/etc/fstab; generated)
Active: failed (Result: exit-code) since Wed 2025-03-12 08:41:02 UTC; 2min 12s ago
Where: /mnt/data
What: UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20
Tasks: 0 (limit: 9250)
Memory: 0B
CPU: 8ms
Error: 32 (Mount failure)
Mar 12 08:41:02 server systemd[1]: Mounting /mnt/data...
Mar 12 08:41:02 server mount[771]: mount: /mnt/data: special device UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20 does not exist.
Mar 12 08:41:02 server systemd[1]: mnt-data.mount: Mount process exited, code=exited, status=32/n/a
Mar 12 08:41:02 server systemd[1]: mnt-data.mount: Failed with result 'exit-code'.
What it means: It’s a generated mount unit from /etc/fstab, and the UUID isn’t present.
Decision: confirm whether the disk is gone, renamed, not decrypted, or the UUID changed; fix fstab accordingly.
Task 5 — Confirm the device exists (don’t trust fstab)
cr0x@server:~$ lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1 vfat FAT32 9C2B-1A1B 504.7M 1% /boot/efi
├─sda2 ext4 1.0 6e6a86fb-4d93-4b8b-9b62-6a0c9b99afc2 712.4M 22% /boot
└─sda3 crypto_LUKS 2       1f7c0a2c-3f9f-4e5a-9d3c-6d7f8e9a1b2c
└─cryptroot
ext4 1.0 2b1c4d9a-9d10-4d0a-ae91-5f8b21c4b1a1 38.9G 41% /
sdb
└─sdb1 ext4 1.0 DATA 9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d
What it means: The UUID in fstab doesn’t match what’s actually on disk. The DATA filesystem is present but with a different UUID.
Decision: update /etc/fstab to the correct UUID (or use LABEL=DATA if your labeling discipline is solid).
Task 6 — Inspect /etc/fstab carefully (and pessimistically)
cr0x@server:~$ sed -n '1,200p' /etc/fstab
# /etc/fstab: static file system information.
UUID=2b1c4d9a-9d10-4d0a-ae91-5f8b21c4b1a1 / ext4 errors=remount-ro 0 1
UUID=6e6a86fb-4d93-4b8b-9b62-6a0c9b99afc2 /boot ext4 defaults 0 2
UUID=9C2B-1A1B /boot/efi vfat umask=0077 0 1
UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20 /mnt/data ext4 defaults 0 2
What it means: One line references a UUID that doesn’t exist.
Decision: if /mnt/data is optional for boot, add nofail,x-systemd.device-timeout=5s or switch to automount; if it’s mandatory, fix the UUID and validate the device chain.
Task 7 — Test mounts without rebooting (catch errors safely)
cr0x@server:~$ mount -a
mount: /mnt/data: can't find UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20.
What it means: The problem is reproducible outside of boot. Good—debugging during boot is miserable.
Decision: fix the fstab entry; then re-run mount -a until it’s clean.
Task 8 — For network-online stalls: see who wants it
cr0x@server:~$ systemctl list-dependencies --reverse network-online.target
network-online.target
● ├─mnt-nfs.mount
● ├─remote-fs.target
● └─docker.service
What it means: Your NFS mount and Docker are forcing “online.” That may be unnecessary if they can tolerate “network configured.”
Decision: decouple non-critical services from network-online.target, or fix the actual network connectivity issue if it truly must be online.
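Before you decouple anything, check where the network-online pull actually comes from. Dependencies in systemd are additive, so a drop-in cannot subtract a Wants= baked into the vendor unit; removing it means a full unit override or the service’s own configuration. A quick check, using docker.service from the example above:
cr0x@server:~$ systemctl cat docker.service | grep -n 'network-online'
cr0x@server:~$ systemctl show -p Wants -p After docker.service
If the reference appears in the vendor file rather than one of your drop-ins, plan the change accordingly instead of fighting it with yet another drop-in.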
Task 9 — Check the wait-online unit configuration (timeouts are policy)
cr0x@server:~$ systemctl cat systemd-networkd-wait-online.service
# /lib/systemd/system/systemd-networkd-wait-online.service
[Unit]
Description=Wait for Network to be Configured
Documentation=man:systemd-networkd-wait-online.service(8)
DefaultDependencies=no
Conflicts=shutdown.target
Before=network-online.target shutdown.target
[Service]
Type=oneshot
ExecStart=/lib/systemd/systemd-networkd-wait-online --timeout=120
RemainAfterExit=yes
[Install]
WantedBy=network-online.target
What it means: The default timeout can be long. 120 seconds feels like a hang in an outage.
Decision: if your environment has intermittent link, set a shorter timeout or stop requiring network-online for services that can retry later.
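A minimal sketch of a drop-in that shortens the wait (30 seconds is an arbitrary example, not a recommendation). For a oneshot service you must clear ExecStart= before redefining it:
# /etc/systemd/system/systemd-networkd-wait-online.service.d/timeout.conf
[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online --timeout=30
Run systemctl daemon-reload afterwards; the new timeout applies on the next boot. If only one link matters, --interface=NAME (or --any) narrows the wait further.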
Task 10 — See whether your network stack is actually networkd
cr0x@server:~$ systemctl is-enabled systemd-networkd NetworkManager
enabled
disabled
What it means: You’re using networkd, so wait-online is relevant. If NetworkManager were enabled, you’d want the NM wait unit instead.
Decision: ensure you don’t have the wrong “wait-online” for the network stack you’re using. Mismatches lead to pointless timeouts.
Task 11 — Build a dependency graph for the suspect unit (surgical scope)
cr0x@server:~$ systemctl list-dependencies --all mnt-data.mount
mnt-data.mount
● ├─system.slice
● ├─-.mount
● ├─dev-disk-by\x2duuid-2f2c3a5b\x2d8d5e\x2d4e1b\x2db4d2\x2d2e1b1f4d2f20.device
● └─local-fs-pre.target
What it means: This mount is directly waiting on a device unit that doesn’t exist.
Decision: find why the device unit isn’t created (missing disk, wrong UUID, encryption not opened, driver missing).
Task 12 — Validate device discovery (udev and kernel messages)
cr0x@server:~$ dmesg -T | tail -n 30
[Wed Mar 12 08:40:41 2025] sd 1:0:0:0: [sda] 1000215216 512-byte logical blocks: (512 GB/477 GiB)
[Wed Mar 12 08:40:41 2025] sd 2:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[Wed Mar 12 08:40:42 2025] EXT4-fs (sdb1): mounted filesystem 9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d ro with ordered data mode
[Wed Mar 12 08:40:42 2025] EXT4-fs (sdb1): re-mounted 9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d rw
What it means: The disk exists and the filesystem UUID is visible. The problem is your configuration, not hardware detection.
Decision: update references (UUID/LABEL) and consider adding guardrails (automount, nofail) for non-critical mounts.
Task 13 — Temporarily bypass a non-critical failing mount (get the box back)
cr0x@server:~$ cp -a /etc/fstab /etc/fstab.bak
cr0x@server:~$ sed -i 's|^UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20|# UUID=2f2c3a5b-8d5e-4e1b-b4d2-2e1b1f4d2f20|' /etc/fstab
cr0x@server:~$ systemctl daemon-reload
What it means: You’ve disabled the mount for now. This is a pragmatic move, not a moral failing.
Decision: reboot to restore basic services; then fix the mount properly during a controlled window.
Task 14 — Make the mount resilient instead of blocking boot
cr0x@server:~$ grep -n '/mnt/data' /etc/fstab
12:UUID=9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d /mnt/data ext4 nofail,x-systemd.device-timeout=5s,x-systemd.automount 0 2
What it means: This mount won’t block boot; it’ll mount on access, and if the disk is missing it won’t throw you into emergency mode.
Decision: apply this only to mounts that are not required for the OS to function. Databases and root filesystems are not “nofail material.”
Task 15 — For emergency mode caused by fstab: verify systemd’s view of local-fs
cr0x@server:~$ systemctl status local-fs.target --no-pager
× local-fs.target - Local File Systems
Loaded: loaded (/lib/systemd/system/local-fs.target; static)
Active: failed (Result: dependency) since Wed 2025-03-12 08:41:02 UTC; 3min ago
Docs: man:systemd.special(7)
Mar 12 08:41:02 server systemd[1]: local-fs.target: Dependency failed for Local File Systems.
What it means: The target is a casualty. The real failure is a mount unit under it.
Decision: stop staring at local-fs.target; go back to failed .mount units and fix those.
Task 16 — If the journal is missing: check journald persistence
cr0x@server:~$ ls -ld /var/log/journal
ls: cannot access '/var/log/journal': No such file or directory
What it means: Logs may be volatile. After a reboot, the evidence is gone.
Decision: enable persistent journaling on servers where boot forensics matters (most of them). Otherwise you’re debugging with vibes.
Task 17 — Enable persistent journal (so next time is less dumb)
cr0x@server:~$ sudo mkdir -p /var/log/journal
cr0x@server:~$ sudo systemd-tmpfiles --create --prefix /var/log/journal
cr0x@server:~$ sudo systemctl restart systemd-journald
What it means: Journald will start writing persistent logs (subject to config and disk space).
Decision: confirm disk usage policy; on small root filesystems, set limits to avoid filling /.
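A sketch of a size cap via a journald drop-in; the values are examples to tune against your disk, not recommendations:
# /etc/systemd/journald.conf.d/limits.conf
[Journal]
Storage=persistent
SystemMaxUse=500M
SystemKeepFree=1G
Restart systemd-journald after creating the file so the limits take effect.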
Case #29: a typical Debian 13 boot stall chain
Let’s pin down what “identify the one service that blocks startup” looks like in practice. Case #29 is a pattern I keep seeing on Debian hosts
that do “a little of everything”: local data disk, some remote mounts, and a service that insists on the network being perfect before it will even start.
The symptom set
- On console: “A start job is running for /mnt/data” or “Dependency failed for Local File Systems.”
- Sometimes it drops into emergency mode, sometimes it boots after ~2 minutes.
- Post-boot: systemctl --failed shows one mount failed and optionally a wait-online failure.
The root cause chain (how it happens)
Someone swapped a disk, restored from backup, or re-created a filesystem. The UUID changed. The old UUID stayed in /etc/fstab.
systemd generated a mount unit for that entry and tried to satisfy it during boot. It couldn’t find the device; the mount failed.
Because the mount was treated as required (the default), local-fs.target failed and the boot sequence took the dramatic exit to emergency mode.
Separately, there was an NFS mount that used _netdev and systemd translated that into needing network. A unit or two pulled in
network-online.target, which pulled in systemd-networkd-wait-online.service. On a host with no cable plugged in (or a VLAN misconfigured),
wait-online timed out at 120 seconds. That’s not a “failure” in a human sense; it’s a policy you forgot you had.
So what’s “the one service” blocking startup?
In this case, it’s mnt-data.mount when you drop to emergency mode. The “Dependency failed” line mentions local-fs.target,
but the blocker is the mount unit generated from /etc/fstab.
If you’re not dropping to emergency mode but boot is slow, the one service is often systemd-networkd-wait-online.service or a remote mount unit
waiting behind it.
Your diagnosis must match the operational impact:
hard stop at boot → fix the hard dependency (usually local mounts);
slow boot → remove unnecessary online dependencies, shorten timeouts, or make mounts automount/nofail.
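For the slow-boot variant, a hedged example of a remote mount entry that neither blocks boot nor drags you into emergency mode when the server is unreachable (hostname and export path are placeholders):
nfs.example.internal:/export/data /mnt/nfs nfs _netdev,nofail,x-systemd.automount,x-systemd.mount-timeout=30s 0 0
With x-systemd.automount the mount happens on first access, so boot proceeds even when the NFS server is down; the mount timeout bounds how long that first access can hang.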
Joke #2: Nothing says “high availability” like a server refusing to boot until a remote share it’s never used becomes reachable.
Three corporate mini-stories (what actually happens in companies)
1) The incident caused by a wrong assumption: “UUIDs are forever”
A mid-sized SaaS company ran Debian on a small fleet of stateful workers. They had a data disk mounted at /srv/data, referenced by UUID in /etc/fstab.
The ops team liked UUIDs because device names like /dev/sdb1 are fickle—plug order changes, controllers reorder, chaos.
During a hardware refresh, a technician cloned disks and “cleaned up” partitions. The filesystem was recreated rather than cloned bit-for-bit.
Nobody noticed because the mountpoint existed and the directory looked fine on the staging bench. The UUID changed.
The first production reboot after a kernel update was ugly. Several workers landed in emergency mode with “Dependency failed for Local File Systems.”
The on-call engineer initially chased the wrong thing: they saw local-fs.target failing and assumed systemd was “broken.”
They tried to restart targets like they were services. That’s like rebooting a spreadsheet to fix a typo in cell B7.
The fix was simple: update UUIDs in /etc/fstab. The lesson was not “don’t use UUIDs.” The lesson was to treat storage identity as configuration
that must be validated after maintenance. They added a pre-reboot check that compared fstab entries to lsblk -f output and refused to proceed
when a referenced UUID wasn’t present.
Nobody got fired. But everyone stopped saying “it’s just a reboot.”
2) The optimization that backfired: “Make services start as early as possible”
Another organization—enterprise, compliance-heavy, lots of internal tooling—wanted faster boot times for a Debian-based appliance.
Someone looked at the boot chart and decided the network wait was “wasted time.” They disabled the wait-online unit and removed dependencies on network-online.target.
Boot was faster. Benchmarks looked great. Slide deck approved.
Then reality showed up. The appliance ran a service that required a stable IP and DNS before it could register with a central controller.
Without waiting for the network to be usable, the service started immediately, tried to resolve a name, failed, and exited. systemd dutifully restarted it.
The box wasn’t “down,” but it was in a restart loop for several minutes—sometimes longer on congested networks.
The worst part: it was intermittent. If DHCP was fast, it worked. If not, it didn’t. The team got to enjoy the classic enterprise experience:
a problem that appears after you declare victory.
They walked it back, but not by re-enabling a blanket 120-second wait. They made the dependency accurate:
services that truly required online networking used a tuned wait with a shorter timeout and better interface selection;
services that could retry later stopped pulling in network-online.
Optimization is a tax. If you don’t pay it in design, you pay it in incidents.
3) The boring but correct practice that saved the day: persistent logs + change notes
A financial services team ran Debian on bare metal with encrypted root and a couple of auxiliary mounts. They had an old rule:
persistent journaling enabled everywhere, and every storage-related change required a one-line note in an internal change log: “what changed, how to undo.”
Boring. Slightly annoying. Perfect.
One Monday, a host booted into emergency mode after a weekend maintenance. The console showed “Dependency failed,” but the engineer didn’t have to guess.
The previous boot logs were there. The journal pointed to a single mount unit failing, and the error line included the exact device path it couldn’t find.
The change log said a new iSCSI LUN had been added and a mount entry created. The person who did it also wrote the rollback: “comment the fstab line and reboot.”
Within minutes, the host was back up, and the team investigated the iSCSI discovery issue in daylight.
The practice wasn’t glamorous. It didn’t “scale AI-driven operations.” It did something better: it made the next failure cheap.
Common mistakes: symptoms → root cause → fix
1) Symptom: “Dependency failed for Local File Systems” and emergency mode
Root cause: A required mount from /etc/fstab failed (wrong UUID/LABEL, missing disk, fsck failure, missing crypto unlock).
Fix: Identify the failed .mount in systemctl --failed. Fix fstab. If non-critical, add nofail and a short device timeout.
2) Symptom: 90–120 second boot delay, then system eventually comes up
Root cause: *-wait-online.service timing out, usually because there is no carrier, wrong VLAN, or waiting for an interface that shouldn’t count.
Fix: Check which wait-online unit is active (networkd vs NetworkManager). Reduce timeout, adjust required interfaces, or remove unnecessary dependency on network-online.target.
3) Symptom: “Dependency failed for Remote File Systems”
Root cause: One remote mount unit (NFS/SMB/iSCSI-backed filesystem) failed. Could be DNS, network, creds, or server down.
Fix: Make remote mounts nofail + x-systemd.automount where appropriate; ensure _netdev is used; validate name resolution in early boot.
4) Symptom: “A start job is running for dev-disk-by…” then timeout
Root cause: systemd is waiting for a device unit that never appears. Often a stale UUID, missing multipath mapping, or storage driver not in initramfs.
Fix: Confirm with lsblk -f and dmesg. Update identifiers. For early-boot storage, ensure initramfs includes needed modules and configs.
5) Symptom: Boot fails after adding a new filesystem entry; manual mount works later
Root cause: Ordering issue: the device becomes available after the mount is attempted (e.g., iSCSI or delayed udev). Or you need x-systemd.requires= on a specific service.
Fix: Use x-systemd.automount or add proper dependencies; avoid “sleep 10” hacks in unit files.
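A hedged sketch of what that looks like in /etc/fstab; the UUID is a placeholder and iscsid.service is just an example of whichever unit actually brings the device up:
# illustrative entry - replace the UUID and the required unit with your own
UUID=replace-with-real-uuid /srv/iscsi ext4 _netdev,nofail,x-systemd.requires=iscsid.service,x-systemd.device-timeout=30s 0 2
x-systemd.requires= adds both Requires= and After= on the named unit to the generated mount, which is what the sleep hack was trying to approximate.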
6) Symptom: You “fixed it,” but next reboot fails again
Root cause: You edited runtime state or a generated unit, not the source (e.g., edited in /run), or you forgot daemon-reload, or the initramfs still contains old config.
Fix: Edit /etc/fstab and drop-ins under /etc/systemd/system. Run systemctl daemon-reload. If encryption/initramfs involved, rebuild initramfs.
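On Debian the sequence after editing the real sources typically looks like this (rebuilding the initramfs only matters when crypttab, early-boot storage, or modules changed):
cr0x@server:~$ sudo systemctl daemon-reload
cr0x@server:~$ sudo update-initramfs -u -k all
cr0x@server:~$ sudo mount -a
Then verify with systemctl --failed before trusting the next reboot.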
7) Symptom: “Dependency failed” messages but everything seems fine
Root cause: Units that should be optional are failing (e.g., a data mount that is treated as required because it lacks nofail), or a one-shot service is misconfigured but not critical.
Fix: Decide whether the unit should be required. Convert Requires= to Wants= where safe. Add nofail to optional mounts. Clean up failed units to reduce noise.
Checklists / step-by-step plan
Checklist A — Recover the system now (minutes matter)
- Get a shell (emergency mode, recovery mode, or console TTY).
- Run systemctl --failed and write down the failing .mount/.service.
- Run journalctl -b -p err --no-pager and find the earliest relevant error.
- If a mount is failing and not critical: comment it out in /etc/fstab, run systemctl daemon-reload, reboot.
- If a mount is critical: confirm device presence with lsblk -f, fix the UUID/LABEL, run mount -a, then continue boot.
Checklist B — Identify “the one unit” blocking boot (be precise)
- Run systemd-analyze critical-chain and note the unit with the largest time.
- Confirm it’s actually blocking: check whether it’s on the path to multi-user.target or your default target.
- Inspect it: systemctl status UNIT.
- Pull its journal: journalctl -u UNIT -b --no-pager.
- Map direct dependencies: systemctl list-dependencies --all UNIT and --reverse for the target it delays.
- Stop when you find the missing resource (device, network, credential, config file). That’s the true root cause.
Checklist C — Fix properly (so next reboot isn’t a rerun)
- For mounts: use stable identifiers (UUID/LABEL) and confirm with lsblk -f.
- For remote mounts: prefer x-systemd.automount and nofail unless the machine truly cannot function without them.
- For network-online: only require it when the service truly needs working routes/DNS at startup.
- Set sane timeouts: short for optional dependencies, longer only where justified.
- Enable persistent journaling on servers; keep boot evidence.
- Test with systemd-analyze blame and a controlled reboot window.
Checklist D — Prevent recurrence (the SRE tax you actually want to pay)
- Add a pre-reboot validation script: confirm all UUIDs referenced in /etc/fstab exist (a sketch follows this checklist).
- Standardize mount options for optional data disks (automount + short timeout).
- Use drop-in unit overrides instead of editing vendor unit files.
- Document rollback steps for storage and networking changes.
- After any storage migration, run a reboot rehearsal in a maintenance window.
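A minimal sketch of the pre-reboot UUID check mentioned above, assuming plain UUID= entries in /etc/fstab (extend it if you also use LABEL= or PARTUUID=); the path is hypothetical:
cr0x@server:~$ cat /usr/local/sbin/check-fstab-uuids
#!/bin/bash
# Compare every UUID referenced in /etc/fstab against block devices present right now.
# Exit non-zero if anything is missing, so automation can refuse to reboot.
set -euo pipefail
missing=0
while read -r uuid; do
    if ! blkid -U "$uuid" >/dev/null 2>&1; then
        echo "MISSING: UUID=$uuid is in /etc/fstab but no such device exists" >&2
        missing=1
    fi
done < <(grep -oP '^\s*UUID=\K[0-9A-Fa-f-]+' /etc/fstab)
exit "$missing"
Run it as root before any planned reboot; a non-zero exit is your cue to stop and reconcile fstab with lsblk -f first.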
FAQ
1) Is “Dependency failed” a kernel problem?
Usually no. It’s systemd reporting that a unit couldn’t start because something it requires failed. Kernel issues tend to show up as panics, driver errors, or missing root device.
2) Why does it mention local-fs.target instead of the real failure?
Targets aggregate other units. When one required unit fails, the target reports dependency failure. The real failure is typically a specific .mount or .service.
3) What’s the fastest way to find the blocker?
Combine systemctl --failed with systemd-analyze critical-chain, then read the journal for the top suspect. Don’t start by editing random unit files.
4) Why does my system wait 120 seconds for networking?
That’s the default timeout in many wait-online units. It’s a policy choice, not a law of physics. If you don’t need “online,” don’t depend on it; if you do, tune it.
5) Should I use nofail in /etc/fstab?
For optional mounts, yes—especially removable, remote, or “nice to have” data volumes. For required mounts (root, critical application data), no. Fix the dependency instead.
6) What does x-systemd.automount buy me?
It defers the mount until first access. Boot doesn’t block on it. If the resource is temporarily missing, the system still comes up and you can remediate remotely.
7) My mount works after boot, but fails during boot. Why?
Ordering. The device or network path isn’t ready when the mount is attempted. Automount is often the cleanest fix. Otherwise, add correct dependencies—avoid sleep hacks.
8) How do I know whether I’m using NetworkManager or networkd?
Check which service is enabled and active. If you have NetworkManager managing interfaces, networkd’s wait-online may be irrelevant and will time out pointlessly.
9) I fixed /etc/fstab but systemd still fails the same way. What did I miss?
Run systemctl daemon-reload and re-test with mount -a. If the failure involves initramfs (encrypted or early-boot devices), rebuild initramfs too.
10) Can I just mask the failing unit?
You can, and sometimes you should—temporarily. Masking is a bypass, not a cure. It’s acceptable for non-critical units while you restore availability and plan a proper fix.
Next steps (what to do after you’ve recovered boot)
Once the system is up, don’t stop at “it boots now.” Capture the root cause while it’s fresh:
identify the single unit that blocked boot, document why it blocked, and decide whether it should ever be allowed to block again.
- Record the culprit unit and the failing dependency (UUID, interface, remote endpoint).
- Make optional dependencies optional: nofail, automount, shorter device timeouts.
- Make required dependencies reliable: correct identifiers, correct ordering, correct initramfs contents.
- Enable persistent journaling if you haven’t; next incident should be cheaper.
- Reboot once in a controlled window to validate the fix. Confidence is earned, not assumed.
Debian boots fine. Your configuration is what turns it into a courtroom drama. systemd is just the clerk reading out the consequences.