Ubuntu 24.04 servers: Snap vs apt — where Snap quietly causes pain (and what to do)

The weirdest production outages don’t start with a kernel panic. They start with a perfectly normal Tuesday: you patch nothing, deploy nothing,
and yet a service restarts with a different binary than yesterday. Or disk usage creeps up on a “static” host until you’re paging a human at 03:00
because /var went read-only.

On Ubuntu 24.04, the Snap ecosystem is mature enough to be boring on desktops—and still surprisingly sharp-edged on servers.
Not always. Not everywhere. But often enough that you should have an opinion and an operational plan.

Snap vs apt in one page: what changes on a server

apt installs Debian packages (.deb) from APT repositories. Files land in the normal filesystem locations,
libraries are shared, upgrades are controlled by your APT policy, and configuration management tooling has had 20 years to learn the dance.
When you upgrade, you upgrade the whole dependency graph that repository offers at that moment. You can pin versions, hold packages, and stage updates.
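
The apt-side controls people rely on are one-liners. A minimal sketch (nginx and the preferences file are stand-ins you would adapt):

cr0x@server:~$ sudo apt-mark hold nginx
nginx set on hold.
cr0x@server:~$ cat /etc/apt/preferences.d/99-nginx-pin
Package: nginx
Pin: version 1.24.*
Pin-Priority: 1001

Hold blocks upgrades entirely; the pin keeps you on a version family while still accepting updates that match it.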

Snap installs squashfs images plus a runtime layer (snapd). Each app is more self-contained, ships more of its dependencies,
and runs under confinement (AppArmor profiles, mount namespaces, cgroups integration, etc.). Updates are primarily driven by
snap refresh (automatic by default), and apps may update independently from the rest of the OS.
That separation is both the point and the problem.

Server reality, not marketing

  • Operational control: apt is “your change management”; Snap is “your change management plus a second one.”
    If you don’t deliberately align them, you will eventually trip over drift.
  • Disk and mounts: Snap adds loop devices and mounts. That’s not inherently bad, but it changes what “df shows full” means and how
    some tools behave under pressure.
  • Security model: confinement can reduce blast radius, but also breaks assumptions about filesystem paths, sockets, and access to host resources.
  • Network dependency: Snap expects to talk to the Snap Store. Proxies, TLS inspection, air-gaps, and restricted egress make this… entertaining.

My bias: on production servers, default to apt unless Snap gives you a specific, audited advantage you can operate.
“It was the default on the install image” is not an advantage. It’s just entropy with a logo.

Joke #1: Snap is like a minibar—convenient until you check the bill and discover you’ve been paying for tiny bottles of stuff you already own.

Facts & history that actually matter

These aren’t trivia-night facts. They’re the context that explains why your Ubuntu 24.04 server behaves differently from your mental model.

  1. Snap started as “snappy” for Ubuntu Core and IoT, where atomic updates and app bundling are features, not annoyances.
    Servers inherited that packaging approach later, sometimes without the same constraints.
  2. snapd runs as a system service and manages mounts, confinement, and refresh scheduling. On a minimal server, that’s another daemon
    with its own logs, timers, and failure modes.
  3. Snaps are squashfs images mounted via loop devices. This is why you see extra /dev/loop* devices and mounts under /snap.
  4. Snap revisions are kept for rollback (typically multiple revisions). That’s useful when a refresh breaks you, and also a reason disk usage grows quietly.
  5. Refresh is transactional at the Snap level. You usually get either “old revision works” or “new revision works,” which is great—until the refresh happens
    during your busiest hour.
  6. Canonical has pushed certain apps to Snap-first distribution over time. The classic example is browsers on desktops; on servers, the pressure shows up
    when “the package you expected” is replaced or deprecated.
  7. Snap confinement relies heavily on AppArmor. If you disable AppArmor or run in unusual security contexts, Snap’s security model can degrade or fail in odd ways.
  8. systemd integrates with Snap via generated unit files and snapd-managed services. The snap.<name>.<app>.service files do live in /etc/systemd/system,
    but snapd generates and rewrites them on refresh, so "my unit file, my edits" is the assumption that breaks.

One paraphrased idea that holds up in ops: "Hope is not a strategy" (often attributed to General Gordon R. Sullivan).
Packaging choices are strategy. You’re either choosing it, or inheriting it.

Where Snap quietly causes pain (real failure modes)

1) Surprise updates in the middle of traffic

Snap refresh is automatic by design. The default behavior is acceptable on a laptop. On a server, “acceptable” is a synonym for “eventually paged.”
The refresh can restart services, rotate binaries, and change behavior out from under a running workload.

The insidious part is not that updates happen. It’s that updates happen outside the change-control path your team already follows for apt,
config management, and maintenance windows.
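
If you suspect this, snapd keeps a change log you can line up against the incident timeline; a quick check (output illustrative):

cr0x@server:~$ snap changes
ID   Status  Spawn                   Ready                   Summary
141  Done    yesterday at 06:10 UTC  yesterday at 06:10 UTC  Auto-refresh snap "snapd"
142  Done    today at 06:16 UTC      today at 06:17 UTC      Auto-refresh snaps "core22", "lxd"

If a Done auto-refresh lands right at your incident start, you have a prime suspect.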

2) Disk usage that creeps without looking like it creeps

Snap keeps multiple revisions and stores data under /var/snap and /var/lib/snapd.
If you run small root volumes (common in cloud images), Snap can push you into “root is 100% full” territory faster than you expect.
Meanwhile df output gets noisier because of mounted snap images.

The failure mode is classic: log rotation fails, journald can’t write, package operations fail, then something important falls over.
Snap isn’t the only culprit, but it’s a common silent contributor because it doesn’t look like “app data” to people scanning disk usage quickly.
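
If you need space back today, the usual lever is dropping the disabled (old) revisions snapd keeps for rollback. A minimal sketch, assuming you can live without instant revert for those snaps:

cr0x@server:~$ snap list --all | awk '/disabled/{print $1, $3}'
core22 1340
lxd 32890
cr0x@server:~$ snap list --all | awk '/disabled/{print $1, $3}' |
>   while read -r name rev; do sudo snap remove "$name" --revision="$rev"; done

Pair this with refresh.retain=2 (Task 10 below) so the space doesn’t quietly grow back.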

3) Confinement breaks integration assumptions

Snaps are sandboxed. Great. But “sandboxed” means “your app can’t see the host the way you think it can.”
Common pain points:

  • Access to /etc and /var paths you assume exist (or are writable).
  • Binding to privileged ports, or binding to host network in unexpected ways depending on the snap and interface connections.
  • Reading host certificates, SSH keys, or CA bundles from standard locations when the snap uses its own.
  • Talking to host sockets (Docker socket, systemd socket activation, custom Unix sockets) without the right interfaces.

In incident review, this often shows up as: “But the config file is correct.” Yes. The process just can’t read it.
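
Before touching file permissions, check the kernel log for AppArmor denials; the line below is illustrative and mirrors the LXD example used later in this article:

cr0x@server:~$ sudo journalctl -k --since "today" --no-pager | grep 'apparmor="DENIED"' | tail -n 1
Dec 29 06:18:04 server kernel: audit: type=1400 audit(1766989084.113:201): apparmor="DENIED" operation="open" profile="snap.lxd.daemon" name="/etc/lxd/cluster.crt" pid=14555 comm="lxd" requested_mask="r" denied_mask="r"

A denial against a snap.* profile means confinement, not filesystem permissions, is saying no.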

4) Observability drift: logs, paths, and service units are not where you expect

With apt, you know where the unit file lives, where logs land, and where to patch configs. Snap abstracts those details and sometimes relocates them.
It’s survivable, but it slows you down under pressure.
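
The map is learnable: generated units live under /etc/systemd/system as snap.<name>.<app>.service, app data under /var/snap/<name>, and logs in journald per snap service. A quick orientation check (output trimmed and illustrative):

cr0x@server:~$ systemctl cat snap.lxd.daemon.service | head -n 4
# /etc/systemd/system/snap.lxd.daemon.service
[Unit]
Description=Service for snap application lxd.daemon
Requires=snap-lxd-33110.mount

Put these paths in the runbook before the incident, not during it.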

5) Store dependency and fleet behavior under constrained networks

If your servers live behind strict egress rules, proxies, or TLS inspection, snap refresh can fail repeatedly.
Repeated failures can cause:

  • Persistent timers waking up and doing work (log spam, CPU wakeups).
  • Stuck refreshes that block other operations.
  • Partial fleet drift: some machines update, others don’t, and you get “same version” lies in postmortems.
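
If snaps must stay in such an environment, snapd can at least be pointed at an egress proxy; the URL below is a placeholder for whatever your network team actually runs, and true air-gaps need a supported offline strategy instead:

cr0x@server:~$ sudo snap set system proxy.http="http://proxy.internal:3128"
cr0x@server:~$ sudo snap set system proxy.https="http://proxy.internal:3128"
cr0x@server:~$ snap get system proxy.https
http://proxy.internal:3128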

6) Mount and loop device overhead in fragile environments

Most servers won’t care about a few extra mounts. Some will:

  • Systems with tight mount namespace assumptions (some containers, chroot-based tooling, or hardening scripts).
  • Legacy monitoring that treats “more loop devices” as “someone is mining crypto.”
  • Hosts with strict limits on loop devices or odd initramfs behavior.

7) “But it worked yesterday” because the runtime changed

Snap updates can pull in new runtimes and bases (for example, core snaps). That can subtly change TLS behavior, cipher defaults,
or bundled library behavior. You won’t see it in apt history. You’ll see it as a client handshake error and a growing sense of doom.
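
When you suspect a base change, check which base a snap declares and whether that base refreshed recently (a sketch; output illustrative):

cr0x@server:~$ grep '^base:' /snap/lxd/current/meta/snap.yaml
base: core22
cr0x@server:~$ snap list --all core22
Name    Version   Rev   Tracking       Publisher   Notes
core22  20240920  1340  latest/stable  canonical✓  base,disabled
core22  20241001  1380  latest/stable  canonical✓  base

If the active base revision changed around the time the handshakes started failing, you’ve found the thread to pull.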

Joke #2: When Snap says it’s “refreshing,” it’s not offering your server a spa day—it’s redecorating the kitchen while dinner service is running.

Practical tasks: commands, outputs, and decisions (12+)

Below are tasks you can run on Ubuntu 24.04 right now. Each includes: a command, realistic output, what it means, and the decision you make.
These are written for people who have to fix things, not debate packaging philosophy.

Task 1: Identify whether Snap is installed and active

cr0x@server:~$ systemctl status snapd --no-pager
● snapd.service - Snap Daemon
     Loaded: loaded (/usr/lib/systemd/system/snapd.service; enabled; preset: enabled)
     Active: active (running) since Mon 2025-12-29 08:11:22 UTC; 2h 14min ago
TriggeredBy: ● snapd.socket
   Main PID: 1123 (snapd)
      Tasks: 14 (limit: 18956)
     Memory: 54.7M
        CPU: 1min 12.333s
     CGroup: /system.slice/snapd.service
             └─1123 /usr/lib/snapd/snapd

What it means: snapd is running and enabled. Snap refreshes and mounts are in play.

Decision: If this is a production server and Snap isn’t explicitly required, plan to remove/disable it or lock refresh behavior.

Task 2: List installed snaps (and spot “surprise” packages)

cr0x@server:~$ snap list
Name    Version   Rev    Tracking       Publisher   Notes
core22  20241001  1380   latest/stable  canonical✓  base
lxd     5.21      33110  5.21/stable    canonical✓  -
snapd   2.63.1    23159  latest/stable  canonical✓  snapd

What it means: You’re running LXD from Snap, plus the base and snapd itself.

Decision: Confirm ownership: does your platform team support LXD as a snap? If not, migrate to apt packages (where available) or document Snap as a dependency.

Task 3: See when Snap will refresh next (and whether it’s blocked)

cr0x@server:~$ snap refresh --time
timer: 00:00~24:00/4
last: today at 06:17 UTC
next: today at 10:19 UTC

What it means: With the default timer, Snap can refresh up to four times a day, at moments snapd picks within that window, not moments you chose.

Decision: On servers, set a refresh window aligned to your maintenance window, or hold refresh temporarily during critical events.

Task 4: Set a refresh window (maintenance window control)

cr0x@server:~$ sudo snap set system refresh.timer=sun,03:00-05:00

What it means: Snap will attempt refresh during Sunday 03:00–05:00.

Decision: Use a window that matches your on-call reality. If your maintenance window is “never,” your system is already running on vibes.

Task 5: Hold refresh during a freeze (short-term, not forever)

cr0x@server:~$ sudo snap refresh --hold=24h
General refreshes of "all snaps" held until 2025-12-30T08:33:10Z

What it means: You’ve paused refreshes for 24 hours.

Decision: Use this during incident response or major launches. Create a follow-up ticket to remove the hold and patch intentionally.

Task 6: Check whether a snap refresh recently restarted a service

cr0x@server:~$ journalctl -u snapd --since "today" --no-pager | tail -n 12
Dec 29 06:16:59 server snapd[1123]: Refreshing "lxd"
Dec 29 06:17:21 server snapd[1123]: Restarted snap.lxd.daemon.service
Dec 29 06:17:22 server snapd[1123]: Refreshing "core22"
Dec 29 06:17:35 server snapd[1123]: Finished refresh, 2 snaps updated

What it means: LXD was refreshed and its daemon restarted.

Decision: If you didn’t schedule that restart, tighten refresh controls and add monitoring for snap-triggered restarts.

Task 7: Identify snap-managed systemd units

cr0x@server:~$ systemctl list-units --type=service | grep -E 'snap\.'
snap.lxd.activate.service          loaded active exited  Service for snap application lxd.activate
snap.lxd.daemon.service            loaded active running Service for snap application lxd.daemon
snap.snapd.apparmor.service        loaded active exited  Load AppArmor profiles managed internally by snapd

What it means: These are snap-generated units; editing them directly is usually the wrong move.

Decision: Use snap configuration knobs or systemd drop-ins (carefully) and document what’s supported.

Task 8: Understand the loop devices and mounts Snap adds (disk/mount noise)

cr0x@server:~$ lsblk -o NAME,TYPE,SIZE,MOUNTPOINT | grep -E 'loop|NAME'
NAME   TYPE   SIZE MOUNTPOINT
loop0  loop  74.2M /snap/core22/1380
loop1  loop  95.3M /snap/lxd/33110
loop2  loop  53.6M /snap/snapd/23159

What it means: Each snap revision is a mounted squashfs image via loop.

Decision: Don’t “clean up loop devices” manually. If disk is tight, reduce retained revisions and remove unused snaps.

Task 9: Measure how much space Snap uses (and where)

cr0x@server:~$ sudo du -sh /var/lib/snapd /var/snap 2>/dev/null
1.3G	/var/lib/snapd
612M	/var/snap

What it means: You’re spending ~2 GB on Snap infrastructure and snap app data.

Decision: On small root volumes, either budget space for Snap or remove it. If you keep it, configure retention and monitor /var.

Task 10: Reduce retained snap revisions (disk control)

cr0x@server:~$ sudo snap set system refresh.retain=2

What it means: Snap will keep fewer old revisions around (default is often higher).

Decision: Set retain to 2 on servers unless you have a strong rollback policy requiring more, and your disk budget supports it.

Task 11: Roll back a snap after a bad refresh

cr0x@server:~$ sudo snap revert lxd
lxd reverted to 5.21

What it means: You’ve rolled back to the previous revision that snapd kept.

Decision: Treat revert as a stabilization step. Then: freeze refresh, open an incident, and plan a controlled forward fix.

Task 12: Find Snap interface connections (why access is denied)

cr0x@server:~$ snap connections lxd | head
Interface      Plug               Slot            Notes
network        lxd:network        :network        -
network-bind   lxd:network-bind   :network-bind   -
mount-observe  lxd:mount-observe  :mount-observe  -

What it means: Snaps use interfaces for permissions. Missing connections can cause “permission denied” even for root-run services.

Decision: If a snap can’t access something, check interfaces before rewriting configs. Use snap connect only when you understand the security trade-off.

Task 13: Spot the update schedulers on the host (apt timers vs snapd)

cr0x@server:~$ systemctl list-timers --all | grep -E 'snap|apt' | head -n 20
Mon 2025-12-29 20:44:19 UTC 10h left  Mon 2025-12-29 06:12:41 UTC 4h 28min ago apt-daily.timer          apt-daily.service
Tue 2025-12-30 06:27:10 UTC 19h left  Mon 2025-12-29 06:27:10 UTC 4h 18min ago apt-daily-upgrade.timer  apt-daily-upgrade.service
Tue 2025-12-30 10:14:22 UTC 23h left  Mon 2025-12-29 10:14:22 UTC 31min ago    snapd.snap-repair.timer  snapd.snap-repair.service

What it means: apt updates run from systemd timers you can see here. Snap refresh does not show up as a timer: snapd schedules it internally
(check snap refresh --time, as in Task 3). The snapd.snap-repair.timer is snapd's repair-assertion mechanism, not the refresh schedule.
Either way, the box has two independent update schedulers.

Decision: Align them with your maintenance plan. If you use a centralized patching system, consider disabling default timers and driving updates explicitly.

Task 14: Confirm whether a given command is a snap, not a deb

cr0x@server:~$ command -v lxd
/snap/bin/lxd

What it means: You are executing the snap-provided binary, not an apt-installed one.

Decision: In runbooks, record paths. In incidents, check which binary is running before assuming configuration locations and upgrade paths.

Task 15: Check apt equivalents and what’s installed

cr0x@server:~$ apt-cache policy lxd | sed -n '1,12p'
lxd:
  Installed: (none)
  Candidate: 1:0.0~git20240301.abcdef-0ubuntu1
  Version table:
     1:0.0~git20240301.abcdef-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu noble/universe amd64 Packages

What it means: Nothing is installed via apt, and the candidate the archive offers may be a shim or transitional package rather than the full server you expect. Either way, you’re running the snap.

Decision: Decide on one packaging channel per service. Mixed installs are how you get phantom bugs and “it depends which host” behavior.

Task 16: Remove a snap cleanly (if you’re choosing apt-first)

cr0x@server:~$ sudo snap remove --purge lxd
Remove "lxd" snap
lxd removed

What it means: The snap package and its data were removed (purged).

Decision: Only do this after you’ve migrated data and validated the apt-based deployment plan. Purge is not “maybe.” Purge is “I meant it.”

Task 17: Disable snapd (if you must keep the package but stop the behavior)

cr0x@server:~$ sudo systemctl disable --now snapd.service snapd.socket
Removed "/etc/systemd/system/multi-user.target.wants/snapd.service".
Removed "/etc/systemd/system/sockets.target.wants/snapd.socket".

What it means: snapd is stopped and won’t auto-start via the socket.

Decision: If any critical components are snaps, this will break them. Use only when you’ve verified you are snap-free, or you’re intentionally taking the hit.

Task 18: Prove snapd is no longer functional (sanity check)

cr0x@server:~$ snap list
error: cannot communicate with server: Post "http://localhost/v2/snaps": dial unix /run/snapd.socket: connect: no such file or directory

What it means: The snap client can’t talk to snapd. Good—if that’s what you intended.

Decision: Update your baseline checks and golden images so you don’t end up with half-managed hosts.

Fast diagnosis playbook: what to check first/second/third

This is the “it’s on fire and I have 15 minutes” sequence for Snap-related pain on Ubuntu 24.04 servers.
The goal is not to be elegant. The goal is to be correct quickly.

First: Did something refresh or restart?

  • Check snap refresh timing:

    cr0x@server:~$ snap refresh --time
    timer: 00:00~24:00/4
    last: today at 06:17 UTC
    next: today at 10:19 UTC

    Interpretation: If last lines up with the incident start, suspect refresh-induced change.

  • Check snapd logs for restarts:

    cr0x@server:~$ journalctl -u snapd --since "today" --no-pager | grep -E 'Refreshing|Restarted' | tail -n 20
    Dec 29 06:16:59 server snapd[1123]: Refreshing "lxd"
    Dec 29 06:17:21 server snapd[1123]: Restarted snap.lxd.daemon.service

    Interpretation: Snap did something. Now you need to decide whether to revert/hold.

Second: Is the pain disk, CPU, network, or permissions?

  • Disk pressure (root/var):

    cr0x@server:~$ df -h / /var
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/vda1        30G   28G  1.2G  96% /
    /dev/vda1        30G   28G  1.2G  96% /

    Interpretation: You are one log spike away from a bad day. Snap retention and /var/lib/snapd are prime suspects.

  • CPU churn from refresh attempts:

    cr0x@server:~$ top -b -n 1 | head -n 12
    top - 10:41:10 up  1 day,  2:30,  1 user,  load average: 1.21, 1.05, 0.88
    Tasks: 231 total,   1 running, 230 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  9.0 us,  2.1 sy,  0.0 ni, 88.2 id,  0.3 wa,  0.0 hi,  0.4 si,  0.0 st
    MiB Mem :  3906.7 total,   612.3 free,  1298.4 used,  1996.0 buff/cache
    MiB Swap:  2048.0 total,  2048.0 free,     0.0 used.  2340.1 avail Mem
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
       1123 root      20   0 1349808  56032  19840 S   6.0   1.4   1:12.33 snapd

    Interpretation: snapd is consuming CPU. If this coincides with refresh loops or store connectivity issues, fix network/proxy and set a refresh window.

  • Permission or confinement issues:

    cr0x@server:~$ journalctl -u snap.lxd.daemon --since "today" --no-pager | tail -n 8
    Dec 29 06:18:04 server lxd[14555]: Error: failed to open /etc/lxd/cluster.crt: permission denied
    Dec 29 06:18:04 server lxd[14555]: Error: daemon failed to start

    Interpretation: The service can’t access a host path. Check snap interfaces and confinement expectations before “fixing” file permissions blindly.

Third: Stabilize, then decide channel strategy

  • Stabilize: hold refresh and revert if needed.

    cr0x@server:~$ sudo snap refresh --hold=24h
    General refreshes of "all snaps" held until 2025-12-30T10:44:22Z
  • Decide: keep Snap but operate it (windows, monitoring, retention), or remove it and standardize on apt.

Three corporate mini-stories from the trenches

Mini-story #1: The incident caused by a wrong assumption

A mid-sized company ran a fleet of Ubuntu servers hosting internal developer tooling: artifact mirrors, CI runners, and a small LXD cluster for ephemeral test environments.
They were good at patching: apt updates rolled out in weekly waves, with canaries, dashboards, and a human watching error budgets.

During a busy sprint, developers started complaining that LXD containers failed to start on a subset of hosts. The on-call engineer checked apt history and found nothing.
The config repo was unchanged. CPU and memory were fine. The hosts looked “identical” on paper.

The wrong assumption was simple: “If it’s installed, it’s managed by apt.” In reality, LXD was installed via Snap on these images because someone had followed an upstream guide months earlier.
Snap refreshed LXD on a few hosts earlier that morning, restarting the daemon. The new revision tightened a confinement rule and the daemon stopped reading a certificate path that lived outside the snap’s allowed access.

The fix was fast once they realized what was happening: snap revert to stabilize, snap refresh --hold to stop churn, then a proper interface/paths solution.
The real work was the postmortem change: they added a baseline check in provisioning that flagged any host with unexpected snaps installed, and they documented “supported packaging channels” for core services.

Nobody was fired. But a few runbooks were rewritten with less optimism and more commands.

Mini-story #2: The optimization that backfired

Another org had a cost-reduction project: smaller root volumes on cloud instances, aggressive log shipping, and a mandate to “keep OS minimal.”
They switched some tooling to snaps because the apps were self-contained and “would reduce dependency mess.” That part was true.

The backfire came from basic math. Snap revisions accumulated across a fleet that rarely rebooted and almost never had hands-on maintenance.
Between /var/lib/snapd, old revisions, and application data under /var/snap, machines with 20–30 GB roots started flirting with 90% usage.
Not enough to alert. Enough to be fragile.

Then a noisy neighbor incident elsewhere caused log spikes. A few hosts crossed 100% used, journald stopped writing, health checks started timing out, and the orchestrator began rescheduling workloads.
Rescheduling made the remaining hosts hotter, and a boring disk issue turned into a cascading failure that looked like “random service instability.”

Their “optimization” was to reduce base image complexity. The reality was they introduced a second packaging cache with its own retention behavior, and they didn’t budget for it.
They fixed it by setting refresh.retain=2, moving some data off root volumes, and—this is the part people skip—adding a dashboard that broke down disk usage by category,
including Snap-managed paths. If you don’t chart it, it will still fill. It will just fill quietly.

Mini-story #3: The boring but correct practice that saved the day

A financial services team ran Ubuntu servers with strict change windows. They weren’t anti-Snap, but they treated Snap like any other source of change:
schedule it, observe it, and make it reversible.

Their baseline included:

  • a refresh window aligned to Sunday maintenance,
  • refresh retention set to 2,
  • monitoring for snapd unit restarts,
  • a simple CI job that compared snap list output across the fleet to catch drift.

Nothing fancy. Just disciplined.

One weekend, a snap update introduced a behavior change in a monitoring agent. A few canaries updated first (because refresh windows were staggered by group).
The dashboards flagged the change immediately: agent reconnect rate dropped, CPU ticked up slightly, and logs had a new warning signature.

Because the team had an operational path, the response was boring: hold refresh across the fleet, revert on the canaries, open an internal ticket with the vendor,
and re-test in staging. Production never saw broad impact. Nobody learned about it from customers.

This is what “reliability engineering” looks like most days: you prevent excitement. It’s deeply unsexy. It works.

Common mistakes: symptom → root cause → fix

1) Symptom: “Service restarted by itself”

Root cause: snap refresh restarted a snap-managed systemd unit.

Fix: Set refresh.timer to a maintenance window; monitor snapd restarts; consider holding refresh during critical events.

2) Symptom: “Root filesystem slowly fills up even though app data is elsewhere”

Root cause: Snap revisions and snap data under /var/lib/snapd and /var/snap.

Fix: Set refresh.retain to 2; remove unused snaps; budget space; monitor Snap directories explicitly.

3) Symptom: “Permission denied reading /etc/… even as root”

Root cause: Snap confinement and missing interface connections; the process runs in a restricted context.

Fix: Use snap connections to inspect; adjust configuration to snap-approved paths; connect only required interfaces knowingly.

4) Symptom: “snap refresh keeps failing, logs spam, CPU wakes up”

Root cause: Restricted egress, proxy issues, TLS inspection, or DNS problems blocking store access.

Fix: Validate outbound connectivity; configure proxy for snapd; if you can’t support store access, don’t run snaps in that environment.

5) Symptom: “Monitoring shows lots of loop devices; someone panics about compromise”

Root cause: Normal Snap behavior: each revision mounts a squashfs image as a loop device.

Fix: Educate monitoring and humans; alert on disk usage and refresh events, not on the existence of loop devices.

6) Symptom: “We updated via apt, but the running binary didn’t change”

Root cause: The binary is from /snap/bin and not managed by apt.

Fix: Check command -v, standardize packaging channel, and enforce it in provisioning.

7) Symptom: “Systemd overrides don’t work”

Root cause: Snap-generated units may be regenerated; editing the unit file directly is brittle.

Fix: Use drop-ins carefully; prefer snap configuration options; document which changes survive refreshes.
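
A minimal sketch of the drop-in route (the CPUQuota value is only an example of the kind of setting a drop-in can carry):

cr0x@server:~$ sudo systemctl edit snap.lxd.daemon.service
cr0x@server:~$ cat /etc/systemd/system/snap.lxd.daemon.service.d/override.conf
[Service]
CPUQuota=200%

The drop-in file is yours and survives snapd rewriting the main unit; whether the setting still makes sense after the next refresh is yours to re-check.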

8) Symptom: “Same app, different behavior across hosts”

Root cause: Snap refresh drift (different revisions), especially in fleets without consistent refresh windows or with failed refresh on some nodes.

Fix: Compare snap list across fleet; enforce refresh windows; alert on version skew; for critical services consider apt with pinned versions.
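
A fleet comparison doesn’t need tooling to get started; hashing each host’s inventory is enough to spot skew. A minimal sketch (hosts.txt, SSH access, and the hostnames are assumptions; hashes truncated):

cr0x@server:~$ while read -r h; do
>   printf '%s  %s\n' "$(ssh "$h" 'snap list | sort' | sha256sum | cut -d ' ' -f1)" "$h"
> done < hosts.txt | sort
1f3a9c...  web-01
1f3a9c...  web-02
77d0be...  web-03

Two distinct hashes means two distinct snap inventories; web-03 is the host to look at.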

Checklists / step-by-step plan

Plan A: You keep Snap (but operate it like you mean it)

  1. Inventory snaps on every server role.

    cr0x@server:~$ snap list
    Name   Version   Rev   Tracking       Publisher   Notes
    snapd  2.63.1    23159 latest/stable  canonical✓  snapd

    Decision: If a role has snaps you didn’t expect, stop and decide ownership.

  2. Set a refresh window aligned to maintenance.

    cr0x@server:~$ sudo snap set system refresh.timer=sun,03:00-05:00
    

    Decision: Put updates where humans exist.

  3. Reduce retention unless disk is abundant.

    cr0x@server:~$ sudo snap set system refresh.retain=2
    

    Decision: If you want “infinite rollback,” buy “infinite disk.” Otherwise keep it tight.

  4. Alert on snapd restarts and snap-managed service restarts.

    cr0x@server:~$ systemctl list-units --type=service | grep -E 'snap\.'
    snap.snapd.apparmor.service        loaded active exited  Load AppArmor profiles managed internally by snapd

    Decision: Treat refresh like deploys. Surface them.

  5. Document confinement expectations for each snap (paths, sockets, ports).

    cr0x@server:~$ snap connections lxd | head
    Interface      Plug               Slot            Notes
    network        lxd:network        :network        -
    network-bind   lxd:network-bind   :network-bind   -
    

    Decision: If your app needs host integration beyond standard interfaces, reconsider Snap for that app.

  6. Run canaries: stagger refresh windows by group.

    Decision: You want a small blast radius when the next update is “interesting.”
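
    A minimal sketch of the stagger, assuming your config management applies a different window per host group (the hostnames and split are illustrative):

    cr0x@canary01:~$ sudo snap set system refresh.timer=sun,03:00-04:00
    cr0x@server:~$ sudo snap set system refresh.timer=sun,05:00-06:00
    cr0x@server:~$ snap refresh --time | head -n 1
    timer: sun,05:00-06:00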

Plan B: You go apt-first (recommended for most production servers)

  1. Find snap-provided binaries in your operational path.

    cr0x@server:~$ command -v lxd
    /snap/bin/lxd

    Decision: If your critical runtime is under /snap/bin, plan migration before removal.

  2. Migrate service-by-service (don’t rip out snapd first).

    cr0x@server:~$ snap list
    Name  Version  Rev  Tracking       Publisher   Notes
    lxd   5.21     33110 5.21/stable   canonical✓

    Decision: Remove one snap at a time, verify functionality, then proceed.

  3. Remove snaps cleanly once migrated.

    cr0x@server:~$ sudo snap remove --purge lxd
    Remove "lxd" snap
    lxd removed

    Decision: Purge removes data. Make backups first.

  4. Disable snapd when the host is snap-free.

    cr0x@server:~$ sudo systemctl disable --now snapd.service snapd.socket
    Removed "/etc/systemd/system/multi-user.target.wants/snapd.service".
    Removed "/etc/systemd/system/sockets.target.wants/snapd.socket".

    Decision: This is a policy decision. Enforce it in your base image.

  5. Confirm apt ownership and pin/hold as needed.

    cr0x@server:~$ apt-mark showhold
    

    Decision: For critical components, use apt pinning/holds plus controlled rollout, not “whatever updated last.”

  6. Make it hard to reintroduce Snap accidentally via golden images and CI checks.

    Decision: Packaging drift is a regression like any other. Test it.
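
    A minimal sketch of such a check, run against a freshly built image (how it plugs into your pipeline is up to you):

    # fail the image build if Snap crept back in
    if dpkg -l snapd 2>/dev/null | grep -q '^ii'; then
        echo "FAIL: snapd package is installed" >&2; exit 1
    fi
    if command -v snap >/dev/null && snap list 2>/dev/null | grep -qv '^Name'; then
        echo "FAIL: unexpected snaps are installed" >&2; exit 1
    fi
    echo "OK: image is snap-free"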

FAQ

1) Is Snap “bad” for servers?

Not inherently. It’s a second packaging system with automatic updates, confinement, and store dependencies.
Those are fine when they match your operational model. On many production servers, they don’t—unless you actively manage them.

2) What’s the single biggest Snap risk in production?

Uncoordinated change. Auto-refresh that restarts daemons outside your maintenance window is the #1 way Snap turns into a pager.
Fix that first: refresh windows, holds during freezes, and monitoring.

3) Can I just disable snap refresh and forget about it?

You can hold refresh temporarily, and you can constrain refresh timing. But permanently freezing updates is how you grow a security problem
with a nice user interface. If you can’t patch snaps reliably, don’t run snaps for critical services.

4) Why does Snap eat disk space even when I removed the app?

Snap keeps revisions, and snapd plus bases take space. If you removed only one app, you might still have base snaps.
Check snap list, du -sh /var/lib/snapd /var/snap, and reduce refresh.retain.

5) Why do I see so many loop devices?

Each snap is a mounted squashfs image presented as a loop device. It’s normal.
The operational action is not “remove loop devices.” It’s “control how many revisions are retained” and “monitor disk usage sanely.”

6) What’s the cleanest way to tell if a binary is coming from Snap?

Use command -v or which. If it’s under /snap/bin, you’re running the snap.
Then check snap list to see which revision and channel.

7) Are snaps more secure because of confinement?

Confinement can reduce blast radius, yes. But security is a system property: you also need patch cadence you control, observability,
and a network model that supports the store. A well-managed deb can be safer than a snap you never update.

8) What about air-gapped or restricted egress environments?

Assume Snap will be painful unless you have a supported offline strategy and you’ve tested refresh behavior end-to-end.
If you can’t guarantee store access (directly or via an approved mechanism), apt repos you mirror internally are usually simpler and more predictable.

9) If I keep Snap, what are the minimum controls I should apply?

Set a refresh window, reduce retention, monitor snapd and snap-managed service restarts, and regularly audit version skew across the fleet.
If you can’t do those, you don’t “have Snap”—Snap has you.

10) Why does troubleshooting feel slower with Snap?

Because the mental map changes: unit files are generated, logs and data locations may differ, confinement errors look like permission issues,
and updates happen on a separate schedule. Once your runbooks include Snap-specific checks, it gets easier.

Practical next steps

Decide whether your servers are apt-first or Snap-operated. Don’t be “whatever happened historically.”
Mixed strategy is where bugs go to reproduce only on Fridays.

  1. Inventory: run snap list across every server role and record what’s installed.
  2. Control change: if Snap exists, set refresh.timer to a real maintenance window and set refresh.retain=2.
  3. Monitor: alert on snapd refresh events and snap-managed unit restarts; graph /var/lib/snapd growth.
  4. Standardize: for critical services, pick one channel, document it, and enforce it in your image build and CI checks.
  5. Practice rollback: test snap revert and your apt rollback strategy in staging so you don’t learn under pressure.

Ubuntu 24.04 is a solid server platform. Snap can be part of that story. But on servers, convenience is not a feature unless you can operate it.
The best packaging system is the one that doesn’t surprise you. The second best is the one you can roll back before customers notice.
