Ubuntu 24.04 tmpfs/ramdisk gone wild: stop it eating RAM (without breaking apps)


You log into an Ubuntu 24.04 box because “the app is slow.” You find RAM at 97%, swap churning, and the kernel has started picking winners. Then you notice something awkward: nobody “leaked” memory. A tmpfs did.

tmpfs is wonderful—until it isn’t. It makes files feel like RAM, because they effectively are. When it balloons, it does what RAM does when it’s full: it ruins your afternoon. Let’s stop it from eating the machine while keeping your applications happy and your boot process boring.

What you’re seeing (and why it’s confusing)

tmpfs is a filesystem that stores data in memory (page cache / shmem) and can swap under pressure. Ubuntu mounts several tmpfs instances by default: /run, /dev/shm, sometimes /tmp (depending on configuration), plus various per-service runtime directories.

The confusing part is that tools show tmpfs “Size” that looks huge—often half of RAM—and admins interpret that as allocated memory. It isn’t. That number is a limit, not usage. The usage is the “Used” column, and it’s real memory (or swap) that can push you into OOM territory if it grows.

When tmpfs “goes wild,” the usual root causes are boring:

  • An app writes large temporary files to /dev/shm or /run because it’s “fast.”
  • A queue or spool directory is on tmpfs and load spikes.
  • Containers mount tmpfs for /tmp or an emptyDir volume, and logs or metrics go nuts.
  • A misconfigured service keeps generating runtime artifacts and never cleans them up.
  • Somebody “optimized” by turning /tmp into tmpfs on a small-RAM server.

Fixing this isn’t one magical sysctl. It’s (1) measuring who’s writing, (2) setting limits that match reality, and (3) making sure the workload falls back to disk or fails gracefully instead of taking the host down.

Facts and short history that actually help

  1. tmpfs has been in Linux for decades and replaced the old “ramdisk everywhere” mindset because it can grow/shrink and can swap under memory pressure.
  2. Classic ramdisks pre-allocate: a block device backed by RAM with a fixed size. tmpfs is file-oriented and uses memory on demand.
  3. On many distros, tmpfs default size appears as ~50% of RAM (or similar), but that’s just the maximum allowed before ENOSPC.
  4. tmpfs data lives in the same memory pool as everything else. It competes with your application heap, filesystem cache, and kernel memory.
  5. Writing to tmpfs shows up as Shmem in /proc/meminfo and is also counted inside Cached; reading those fields properly matters more than glancing at a single number.
  6. systemd made tmpfs more centrally managed: mounts like /run and per-unit runtime directories are part of the boot story now, not a pile of ad-hoc init scripts.
  7. /run replaced /var/run in modern systems: it’s tmpfs by design, so runtime state doesn’t persist across reboots.
  8. /dev/shm is POSIX shared memory, not “a nice scratch dir.” Some software treats it like a fast temp area anyway, and it will happily eat your RAM.
  9. Memory pressure can push tmpfs pages to swap, which is “fine” until it isn’t: latency spikes and swap storms can be worse than just using disk for temporary files.

One paraphrased idea from Werner Vogels (Amazon CTO) still holds in ops: everything fails all the time, so design and operate with that assumption. tmpfs is not exempt; plan for it to fill.

Fast diagnosis playbook

This is the order that gets you to the culprit fastest, without wandering into theory.

1) Confirm it’s tmpfs and not “mystery memory”

  • Check df -hT to see which tmpfs mount is filling.
  • Check /proc/meminfo for Shmem, MemAvailable, and swap usage.
  • Check recent OOM or memory pressure signals in journalctl.

2) Find what directory is growing

  • Use du -x on the offending mount (stays within filesystem).
  • Check for deleted-but-open files with lsof +L1.

3) Identify the process and decide: cap it, move it, or fix it

  • If it’s a service: inspect unit, runtime dirs, and environment variables that point to /run, /tmp, /dev/shm.
  • If it’s a container: check mounts and tmpfs volumes; apply memory limits and tmpfs size limits.
  • If it’s a known cache: configure retention/TTL and enforce cleanup.

4) Apply the least-dangerous mitigation first

  • Set a sane tmpfs size= limit for the risky mount.
  • Move large scratch paths to disk (e.g., /var/tmp or a dedicated SSD-backed directory).
  • Only then consider aggressive kernel tuning; most tmpfs blow-ups are just file growth.

Joke #1: tmpfs isn’t “free memory.” It’s the same memory, just wearing a filesystem hat.

Practical tasks (commands, outputs, decisions)

Below are field-tested commands. For each: what it tells you, and what decision you make.

Task 1: List tmpfs mounts and spot the bloated one

cr0x@server:~$ df -hT | awk 'NR==1 || $2=="tmpfs" || $2=="devtmpfs"'
Filesystem     Type     Size  Used Avail Use% Mounted on
tmpfs          tmpfs    3.2G  1.9G  1.3G  60% /run
tmpfs          tmpfs     16G  8.1G  7.9G  51% /dev/shm
tmpfs          tmpfs    5.0M   44K  5.0M   1% /run/lock
tmpfs          tmpfs    3.2G  4.0K  3.2G   1% /run/user/1000

Meaning: /dev/shm is consuming 8.1G. That’s real pressure if the host has 16G RAM.

Decision: Focus on that mount first. Don’t touch /run/lock; it’s tiny and unrelated.

Task 2: Confirm actual memory pressure (not just a big tmpfs “Size”)

cr0x@server:~$ grep -E '^(MemTotal|MemAvailable|Cached|SwapTotal|SwapFree|Shmem):' /proc/meminfo
MemTotal:       16329640 kB
MemAvailable:    1128400 kB
Cached:          9431720 kB
SwapTotal:       4194300 kB
SwapFree:         112340 kB
Shmem:           8551120 kB

Meaning: Shmem is ~8.1G, which lines up with /dev/shm. Those shmem pages are also counted inside Cached, which is why Cached looks inflated. MemAvailable is low and swap is nearly exhausted.

Decision: Treat this as a live incident: stop growth and/or evict workload. Don’t “just add swap” as the only fix.

Task 3: Check for OOM kills or memory pressure logs

cr0x@server:~$ journalctl -k -b | grep -Ei 'oom|out of memory|memory pressure' | tail -n 8
kernel: Memory cgroup out of memory: Killed process 27144 (python3) total-vm:9652144kB, anon-rss:5132140kB, file-rss:11320kB, shmem-rss:2048kB
kernel: oom_reaper: reaped process 27144 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
kernel: Out of memory: Killed process 30911 (node) total-vm:7621400kB, anon-rss:2874100kB, file-rss:9020kB, shmem-rss:1780kB

Meaning: The kernel already killed processes. Even if shmem-rss in killed processes looks small, the system-wide shmem is huge.

Decision: Reduce tmpfs growth now (cap, clean, restart offenders), then fix root cause.

Task 4: Identify what’s actually in the tmpfs mount

cr0x@server:~$ sudo du -xh --max-depth=1 /dev/shm | sort -h
0	/dev/shm/snap.lxd
12K	/dev/shm/systemd-private-2d3c...-chrony.service-...
20M	/dev/shm/app-cache
8.1G	/dev/shm/feature-store
8.1G	/dev/shm

Meaning: One directory dominates: /dev/shm/feature-store.

Decision: Find which process owns/uses it; that’s your lever. Cleaning random systemd-private dirs is not the lever.

Task 5: Map the directory to a process (fast and usually good enough)

cr0x@server:~$ sudo lsof +D /dev/shm/feature-store 2>/dev/null | head -n 10
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
python3  18422  svc   12w  REG   0,37  536870912  131112 /dev/shm/feature-store/chunk-0001.bin
python3  18422  svc   13w  REG   0,37  536870912  131113 /dev/shm/feature-store/chunk-0002.bin
python3  18422  svc   14w  REG   0,37  536870912  131114 /dev/shm/feature-store/chunk-0003.bin

Meaning: PID 18422 is writing multi-hundred-MB files into shared memory.

Decision: This is application behavior, not Ubuntu “randomly using RAM.” Fix app config or change its scratch path.

Task 6: Catch deleted-but-open tmpfs files (the “df says full, du says not” trap)

cr0x@server:~$ sudo lsof +L1 | head -n 10
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NLINK    NODE NAME
node    30911  svc   28w  REG   0,37 1073741824     0  140221 /dev/shm/feature-store/tmp.log (deleted)

Meaning: A 1G file was deleted from the directory tree but is still held open by a process. du won’t count it; df will.

Decision: Restart or signal the process to close/reopen logs; otherwise the space won’t return.

Task 7: Inspect mount options to understand current limits

cr0x@server:~$ findmnt -no TARGET,FSTYPE,OPTIONS /dev/shm
/dev/shm tmpfs rw,nosuid,nodev,size=16329640k,inode64

Meaning: Size limit is basically RAM-sized. That’s permissive; it allows an app to try to consume the whole host.

Decision: Set a lower size limit, but only after verifying what needs shared memory and how much.

Task 8: Check systemd’s view of tmpfs mounts and who owns them

cr0x@server:~$ systemctl status dev-shm.mount --no-pager
● dev-shm.mount - /dev/shm
     Loaded: loaded (/proc/self/mountinfo)
     Active: active (mounted) since Mon 2025-12-29 08:31:12 UTC; 3h 12min ago
      Where: /dev/shm
       What: tmpfs

Meaning: systemd tracks /dev/shm as a mount unit even though it’s mounted early in boot rather than from a unit file you can point at. This matters because persistent changes belong in a drop-in for dev-shm.mount, not in random files that will be overwritten.

Decision: Use systemd override for mount options if you want it persistent and upgrade-safe.

Task 9: See if /tmp is tmpfs (it might be, depending on your choices)

cr0x@server:~$ findmnt -T /tmp
TARGET SOURCE         FSTYPE OPTIONS
/      /dev/nvme0n1p2 ext4   rw,relatime

Meaning: /tmp is just a directory on the disk-backed root filesystem here (findmnt -T reports the filesystem containing the path). If /tmp were its own tmpfs mount, you’d see TARGET /tmp, FSTYPE tmpfs, and a size= option.

Decision: If your incident is on /tmp and it’s disk-backed, tmpfs is not your villain.

Task 10: Measure per-process shared memory usage (when /dev/shm is the hot spot)

cr0x@server:~$ awk '/^Name:/{n=$2} /^RssShmem:/{print $2, n, FILENAME}' /proc/[0-9]*/status 2>/dev/null | sort -nr | head -n 3
2101240 python3 /proc/18422/status
922120 python3 /proc/19211/status
210800 gnome-shell /proc/1123/status

Meaning: RssShmem (in kB, from /proc/<pid>/status) counts resident tmpfs and shared memory pages mapped by each process. Shared pages are counted once per process that maps them, so it’s not a perfect attribution, but it’s a good directional signal: PID 18422 dominates.

Decision: Focus on that service. If it’s a fleet issue, you now have a quick “top talkers” query to automate.

Task 11: Check memory cgroup limits (common in containers and systemd services)

cr0x@server:~$ systemctl show myapp.service -p MemoryMax -p MemoryHigh -p MemoryCurrent
MemoryMax=infinity
MemoryHigh=infinity
MemoryCurrent=9312415744

Meaning: The service has no memory ceiling. If it uses tmpfs heavily, it can eat the host.

Decision: Consider setting MemoryHigh and MemoryMax so the service gets throttled/killed before the whole node dies.

Task 12: Check whether swap is masking the problem (until it explodes)

cr0x@server:~$ swapon --show
NAME      TYPE SIZE USED PRIO
/dev/sda2 partition 4G  3.9G -2

Meaning: Swap is almost fully used; performance will be ugly and your next allocation might trigger OOM.

Decision: Stop the tmpfs growth. Adding swap can buy time, but it’s not a responsible long-term plan.

Task 13: Identify tmpfs-heavy directories under /run (classic culprit: runaway runtime files)

cr0x@server:~$ sudo du -xh --max-depth=1 /run | sort -h | tail -n 8
4.0M	/run/systemd
12M	/run/udev
64M	/run/user
1.2G	/run/myapp
1.9G	/run

Meaning: /run/myapp is huge. Something is treating runtime state like storage.

Decision: Fix the app: runtime dirs should be bounded, rotated, or moved to disk if they’re large.

Task 14: Verify if tmpfs is constrained by inode exhaustion (yes, it happens)

cr0x@server:~$ df -ih /run
Filesystem    Inodes IUsed IFree IUse% Mounted on
tmpfs           800K  792K  8.0K   99% /run

Meaning: You’re out of inodes on tmpfs. You can be “full” even when bytes are available.

Decision: Find and delete the file spam. If it’s expected behavior, increase inode limit via mount options or move that workload off tmpfs.

Task 15: Remount a tmpfs with a temporary cap (incident mitigation)

cr0x@server:~$ sudo mount -o remount,size=4G /dev/shm
cr0x@server:~$ findmnt -no TARGET,OPTIONS /dev/shm
/dev/shm rw,nosuid,nodev,size=4194304k,inode64

Meaning: You just set a hard ceiling at 4G; writes beyond it will fail with ENOSPC. One caveat: the kernel refuses to shrink a tmpfs below its current usage, so clean up or restart the offender first, or pick a cap just above the current Used value and tighten it later.

Decision: Do this only when you understand application impact. It’s a guardrail, not a cure.

Task 16: Make the cap persistent with a systemd mount drop-in

cr0x@server:~$ sudo systemctl edit dev-shm.mount
# editor opens; add:
# [Mount]
# Options=rw,nosuid,nodev,size=4G,mode=1777
cr0x@server:~$ sudo systemctl daemon-reload
cr0x@server:~$ sudo systemctl restart dev-shm.mount
cr0x@server:~$ findmnt -no TARGET,OPTIONS /dev/shm
/dev/shm rw,nosuid,nodev,size=4194304k,inode64

Meaning: The limit survives reboots. mode=1777 keeps the usual world-writable sticky-bit behavior of /dev/shm; since that’s the tmpfs default, it isn’t echoed back in the mount options. If restarting the mount unit fails because processes hold shared memory open, plan the change for a maintenance window.

Decision: Put this through change control if your estate includes apps that rely on large shared memory segments (databases, browsers, ML pipelines).

How tmpfs really uses memory (and why “Size” lies)

tmpfs stores file contents in memory pages managed by the kernel. That sounds like “RAM is allocated,” but allocation happens as you write. The size= you see in df is a maximum, enforced like a quota. It doesn’t reserve memory up front.
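
You can watch this happen with a throwaway file in /dev/shm; the starting Shmem value below is illustrative, but the delta is exactly what you write:

cr0x@server:~$ grep '^Shmem:' /proc/meminfo
Shmem:            123520 kB
cr0x@server:~$ dd if=/dev/zero of=/dev/shm/demo bs=1M count=512 status=none
cr0x@server:~$ grep '^Shmem:' /proc/meminfo
Shmem:            647808 kB
cr0x@server:~$ rm /dev/shm/demo

Shmem jumps by 512M the moment the data is written, and drops again when the file is removed. Nothing was reserved when the mount was created.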

Where does the memory show up? Typically in a mix of:

  • Shmem in /proc/meminfo (shared memory and tmpfs pages).
  • Cached, which includes tmpfs/shmem pages even though they aren’t reclaimable like ordinary file cache from a block device, so it can look inflated.
  • Swap, if the system is under pressure and those pages are swappable.

That last point is where humans get tricked. “tmpfs can swap” sounds comforting. In practice, if your tmpfs is holding hot working data and it starts swapping, you now have:

  • an application that expected RAM-speed file semantics,
  • a kernel that will try to keep it alive by paging,
  • and a disk doing its best impression of a crying toaster.

Joke #2: If your incident response plan is “let tmpfs swap,” your disk just joined the on-call rotation.

tmpfs vs “ramdisk” on Ubuntu

People say “ramdisk” as a catch-all. There are two very different things:

  • tmpfs: dynamic, file-based, can swap, uses memory as needed up to a limit.
  • brd/ramdisk: a block device in RAM with fixed size (less common in day-to-day Ubuntu ops now).

Most “ramdisk problems” on Ubuntu 24.04 are tmpfs problems: /dev/shm, /run, or a tmpfs mounted by systemd or a container runtime.

When tmpfs growth is “normal”

Some workloads legitimately want memory-backed files:

  • High-frequency IPC using POSIX SHM.
  • Build systems that create massive temporary trees and benefit from RAM speed (on big-memory builders).
  • Some ML feature stores and model serving caches (if sized correctly and bounded).

“Normal” has two properties: it’s bounded and it’s recoverable. Unbounded tmpfs is just memory leak with extra steps.

Set safe limits without breaking apps

Your goal is not “make tmpfs small.” Your goal is “make failure localized.” When tmpfs is unlimited (or effectively so), a single misbehaving component can starve the entire host. When tmpfs is sensibly capped, the component gets ENOSPC and fails in a way your monitoring and retry logic can handle.

Pick the right target to cap

Don’t randomly cap everything. Start with the mounts that are both write-heavy and easiest to misuse:

  • /dev/shm: frequently abused as a fast scratch dir. It’s also used for legitimate IPC; be careful.
  • /run: should not hold huge data. If it does, something is wrong. Capping can help, but fixing the writer matters more.
  • Custom tmpfs mounts created for application caches: these should always have explicit size and inode limits.
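
For that last category, a dedicated mount with explicit limits is one line of configuration. A sketch via /etc/fstab, with a hypothetical path and limits you’d adjust to your workload:

# /etc/fstab: a dedicated, bounded cache tmpfs (illustrative path and limits)
tmpfs  /var/cache/myapp-hot  tmpfs  rw,nosuid,nodev,size=512M,nr_inodes=65536,mode=0750  0  0

systemd’s fstab generator turns this into a mount unit at boot; the app gets a fast cache that can never take more than 512M of RAM or 65,536 inodes.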

Rule of thumb sizing (opinionated)

  • /run: usually a few hundred MB is plenty on typical servers. If you need more than 1–2G, your runtime directory design is questionable.
  • /dev/shm: size based on the biggest legitimate shared memory user. If you don’t have one, cap it. Common caps: 1G–4G on small/medium hosts.
  • App-specific tmpfs: size it to what you can afford to lose without paging the host into sludge. Then implement cleanup/TTL in the app.

Prefer “move it to disk” over “make tmpfs infinite”

If an app wants to write 20GB of “temporary” data, that’s not a tmpfs use case. Put it on disk. Use:

  • /var/tmp for temp data that may survive reboot and can be large.
  • A dedicated directory with fast storage (NVMe) and quotas if needed.
  • Per-service directories with ownership and mode set by systemd (StateDirectory=, CacheDirectory=, RuntimeDirectory=).
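
Here’s a minimal sketch combining those ideas as a drop-in for the myapp.service used earlier (the directory names and the TMPDIR value are illustrative):

# /etc/systemd/system/myapp.service.d/storage.conf
[Service]
# systemd creates /var/lib/myapp, /var/cache/myapp and /run/myapp with correct ownership
StateDirectory=myapp
CacheDirectory=myapp
RuntimeDirectory=myapp
# Steer generic temp files onto disk instead of any RAM-backed /tmp
Environment=TMPDIR=/var/tmp

The point isn’t the specific directives; it’s that scratch data ends up in places with known limits and known cleanup, not wherever the process happened to have write access.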

Failure mode: ENOSPC is a feature, not a bug

When you cap tmpfs, the app will eventually see “No space left on device.” That’s good: it’s explicit. The “bad” alternative is “the kernel killed your database because a sidecar filled /dev/shm.”

Your job is to make sure the app reacts sensibly to ENOSPC: back off, rotate, purge cache, or fail fast with a clear alert. If it crashes, that might still be an improvement over host-wide collapse—depending on your redundancy.
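
You can rehearse that failure mode safely with a small scratch tmpfs (the path and size below are made up for the demo):

cr0x@server:~$ sudo mkdir -p /mnt/scratch-test
cr0x@server:~$ sudo mount -t tmpfs -o size=64M,mode=1777 tmpfs /mnt/scratch-test
cr0x@server:~$ dd if=/dev/zero of=/mnt/scratch-test/fill bs=1M count=128 status=none
dd: error writing '/mnt/scratch-test/fill': No space left on device
cr0x@server:~$ sudo umount /mnt/scratch-test

Point your application at a directory like this in staging and watch what it does on that error. That behavior is what you’ll get in production.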

systemd knobs that matter on Ubuntu 24.04

Ubuntu 24.04 is systemd-first. That’s good: configuration is centralized, and mounts are units with clear ownership. It’s also bad if you keep editing /etc/fstab out of habit and wonder why your change doesn’t apply to a systemd-managed mount.

Controlling /dev/shm via dev-shm.mount

/dev/shm is typically managed by dev-shm.mount. Use a drop-in override to set options like size=, mode=, and potentially nr_inodes= if inode exhaustion is a real issue.
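
The same drop-in from Task 16 can carry all of those knobs at once; nr_inodes is worth adding only if you’ve actually seen inode pressure (the values here are illustrative):

# /etc/systemd/system/dev-shm.mount.d/override.conf
[Mount]
Options=rw,nosuid,nodev,size=4G,nr_inodes=1048576,mode=1777

One small file, reviewable in change control, and it survives package upgrades.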

If you cap it, do it intentionally:

  • Know which apps use POSIX SHM (PostgreSQL extensions, Chromium-based tooling, some JVM IPC patterns, ML frameworks).
  • Test under load. Shared memory failures can be subtle.
  • Roll out gradually in a fleet. tmpfs caps are the kind of change that “works in staging” and then meets production traffic with different file churn.

Controlling /run: usually fix the writer, not the mount

/run being full is almost always misbehavior: log files, data spools, or caches in a runtime directory. You can cap /run, but you risk boot-time breakage if critical services can’t create sockets and PID files.

Better: identify the big directory (Task 13), then fix the service. Common corrections:

  • Move large runtime artifacts to /var/lib/myapp (state) or /var/cache/myapp (cache).
  • Use systemd unit directives: StateDirectory=, CacheDirectory=, RuntimeDirectory=, and set LogsDirectory= if relevant.
  • Audit cleanup on restart: stale runtime files can accumulate if the app only ever appends.

Per-service protections: MemoryMax, MemoryHigh, and friends

If tmpfs usage is caused by one service, you can stop it from taking the whole host by placing the service in a memory cgroup with limits.

Practical stance:

  • MemoryHigh is a pressure threshold; it throttles allocations and makes the service feel the pain first.
  • MemoryMax is the hard ceiling; crossing it can trigger kills.

This doesn’t “fix tmpfs,” but it limits blast radius. In production, blast radius is most of the game.
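
As a sketch, using the same myapp.service from Task 11 (the thresholds are illustrative; derive yours from observed peaks, not vibes):

cr0x@server:~$ sudo systemctl edit myapp.service
# editor opens; add:
# [Service]
# MemoryHigh=6G
# MemoryMax=8G
cr0x@server:~$ sudo systemctl restart myapp.service
cr0x@server:~$ systemctl show myapp.service -p MemoryHigh -p MemoryMax
MemoryHigh=6442450944
MemoryMax=8589934592

Now the worst case is one throttled or killed service, not a kernel choosing victims host-wide.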

Containers: Docker and Kubernetes tmpfs pitfalls

tmpfs gets nastier inside containers because you’ve added at least two layers of misunderstanding:

  • The container filesystem might be overlayfs, and tmpfs mounts may be injected per-container.
  • Memory accounting is cgroup-based; the host and container may disagree about “free memory.”
  • Kubernetes emptyDir with medium: Memory is literally tmpfs. It’s not “fast ephemeral disk.” It’s RAM.

Docker: tmpfs mounts need explicit sizing

If you’re using Docker tmpfs mounts (e.g., --tmpfs /tmp), specify size. Otherwise you’re implicitly allowing growth constrained mostly by host limits and container memory limits.

Also, remember: container memory limits can turn tmpfs writes into immediate OOM inside the container. That’s better than killing the host, but still a service incident.
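
A minimal illustration with plain docker run (the image, size, and memory limit are placeholders):

cr0x@server:~$ docker run --rm --tmpfs /tmp:rw,size=256m,mode=1777 --memory 1g ubuntu:24.04 df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           256M     0  256M   0% /tmp

The tmpfs is visible and bounded; drop the size= and you’re back to trusting every process in the container to be polite.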

Kubernetes: emptyDir memory is not free

Kubernetes makes it easy to create memory-backed volumes. It’s also easy to forget to set:

  • Requests/limits for container memory.
  • Size limits for the volume (depending on your cluster policies and version/features).
  • Eviction thresholds so nodes don’t get wedged.

The pattern I trust: if you use memory-backed volumes, treat them like caches. Bound them, set TTL, and accept that eviction happens.
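
A sketch of what “bounded” looks like in a Pod spec fragment (names, image, and sizes are illustrative; the available knobs depend on your cluster version and policies):

# Pod spec fragment (under spec:)
  containers:
    - name: worker
      image: registry.example.com/worker:1.2.3
      resources:
        requests:
          memory: "1Gi"
        limits:
          memory: "2Gi"
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory       # this is tmpfs: usage is RAM, charged against the pod's memory
        sizeLimit: 512Mi     # exceeding this gets the pod evicted, not the node wedged

Treat the volume like a cache: bounded, evictable, and safe to lose.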

Swap, zram, and the “tmpfs ate RAM” illusion

Ubuntu 24.04 may be deployed with or without swap, and some environments use zram. This changes how tmpfs incidents feel:

  • No swap: tmpfs growth hits OOM faster, but at least you don’t spend 30 minutes in swap-death before it happens.
  • Swap on disk: tmpfs pages can be swapped out. This can keep the box “alive” while it becomes unusably slow.
  • zram: compressed RAM swap can absorb some pressure. It can also hide runaway tmpfs longer, and then fail in a more confusing way.
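
If you’re not sure whether zram is in play, zramctl (part of util-linux) lists any zram devices and how well they compress; the output below is illustrative, and no output means no zram:

cr0x@server:~$ zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 zstd          7.6G  1.1G  310M  330M       8 [SWAP]

Pair it with swapon --show from Task 12 to see how much of your “swap” is actually compressed RAM.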

If your tmpfs is used for “hot” work files, swapping them is self-defeating. That workload should be on disk (fast disk, but still disk) or explicitly bounded so it can’t force swapping.

Operational rule: if swap usage rises in lockstep with tmpfs growth, you are not “saving memory.” You are paying interest on it.

Three corporate mini-stories from the trenches

Mini-story 1: The incident caused by a wrong assumption

They had a fleet of Ubuntu servers running a data processing service. One team member had read that tmpfs “uses up to half of RAM,” saw /dev/shm showing a large Size in df, and assumed it was pre-allocated. They filed a ticket: “Ubuntu is wasting 32GB on /dev/shm.”

The “fix” was fast and confident: cap /dev/shm to something tiny across the fleet. It was deployed during business hours because the change looked harmless. The next batch run started, and a component that used POSIX shared memory began failing intermittently. Not crashing cleanly—just corrupting its own workflow state when shared memory allocations failed mid-flight.

The incident wasn’t that shared memory was used. It was the assumption that the shown Size was wasted memory. In reality, the service had been using shared memory responsibly under normal conditions; the cap broke it by making it fail under legitimate load.

They recovered by rolling back the cap, then doing the boring work: measure actual Shmem during peak load, identify the legitimate high-water mark, and set a cap just above it. Then they added app-level handling for allocation failure to avoid silent corruption.

The postmortem lesson was simple: df shows limits, not allocations. Memory incidents are rarely solved with a single “global optimization.”

Mini-story 2: The optimization that backfired

A different organization had a latency-sensitive API. Someone decided to speed up request handling by moving a JSON cache and a temporary render directory into tmpfs. It worked in benchmarking: p99 improved, disks stayed quieter, graphs looked gorgeous.

Then a marketing campaign hit, traffic tripled, and the cache turned from “helpful” into “unbounded.” The tmpfs filled. The cache didn’t have eviction; it had hope. Memory pressure rose, swap filled, and the node became a slow-motion failure where health checks timed out and the orchestrator started replacing instances.

The kicker: the replacement surge amplified the problem. Cold-started instances refilled the cache, hitting tmpfs again, which caused a rolling brownout rather than a clean crash. The team spent hours chasing “network issues” because everything got slow at once.

The real fix was unsexy: move the cache back to disk (fast local SSD), implement LRU eviction, and set explicit size limits. They kept a small tmpfs for genuinely tiny, hot data—kilobytes to a few megabytes—not gigabytes. Performance stayed good, and the failure mode became manageable.

Mini-story 3: The boring but correct practice that saved the day

A payments-adjacent service ran on Ubuntu with strict reliability requirements. The team had a checklist for every node class: verify tmpfs mount options, cap /dev/shm based on workload, and set per-service memory limits with systemd. It was the kind of hygiene work nobody tweets about.

One afternoon a new version of a worker service started writing debug artifacts into /run. A feature flag had been misapplied; instead of sampling 1% of traffic, it sampled essentially everything. The files were small, but the count was enormous, and inode consumption shot up.

On a less disciplined setup, /run would have filled and taken down basic system functions. On their nodes, /run had monitoring for inode usage and the service had a memory ceiling. The service started failing, alerts fired with a clear “/run inode pressure” signal, and the host stayed healthy enough for operators to log in without fighting a half-dead system.

The remediation was swift: disable the flag, deploy a patch, and purge the runtime directory. No cascading outage. The boring practice didn’t prevent the bug, but it contained it.

Common mistakes: symptom → root cause → fix

1) “df shows /dev/shm is huge, so Ubuntu is wasting RAM”

Symptom: df -h shows tmpfs Size = half (or more) of RAM.

Root cause: Confusing tmpfs maximum size with actual allocation.

Fix: Check Used in df and Shmem in /proc/meminfo. Only act if usage grows.

2) “The host is OOMing, but du shows the tmpfs is small”

Symptom: df reports a full tmpfs; du doesn’t add up.

Root cause: Deleted-but-open files on tmpfs.

Fix: Use lsof +L1, restart the offending process, and confirm space is released.

3) “We capped /dev/shm and now random apps fail”

Symptom: Intermittent failures, IPC errors, weird crashes after change.

Root cause: Legitimate POSIX shared memory consumer exceeded the new cap.

Fix: Measure peak SHM usage, raise cap appropriately, or reconfigure the application to use disk-backed temp storage where acceptable.

4) “/run filled and the machine behaved like it was haunted”

Symptom: Services can’t start, PID files missing, sockets fail, logins weird.

Root cause: Something wrote large data or too many files into /run (bytes or inodes).

Fix: Identify large subdirs (du -x, df -ih), move writer to /var/lib or /var/cache, add cleanup and rotation.

5) “We put /tmp on tmpfs for speed and now builds crash”

Symptom: Compilers and package tools fail with ENOSPC or OOM during large builds.

Root cause: Build temp files are big; tmpfs was a poor fit on the given RAM size.

Fix: Keep /tmp on disk, or provide a dedicated fast disk path and point build tools to it.

6) “Kubernetes pods OOMKilled after we switched emptyDir to memory”

Symptom: Pods restart under load; node memory looks tight.

Root cause: emptyDir with medium: Memory consumes memory counted against pod/container limits.

Fix: Set memory requests/limits, cap tmpfs usage, or switch back to disk-backed emptyDir and optimize the workload.

7) “We remounted tmpfs smaller and now writes fail”

Symptom: The remount is rejected, or new writes fail with ENOSPC immediately after the mitigation.

Root cause: The working set already exceeded the new limit. The kernel refuses to shrink a tmpfs below its current usage, and a cap set below the working set turns every further write into ENOSPC.

Fix: First reduce usage (delete files, restart processes holding deleted files), then remount with a smaller limit.

Checklists / step-by-step plan

Incident containment (15–30 minutes)

  1. Locate the mount: run df -hT and find the tmpfs with high Used%.
  2. Confirm memory pressure: check /proc/meminfo for MemAvailable, Shmem, and swap.
  3. Find the big directory: du -xh --max-depth=1 on that mount.
  4. Find the writer: lsof +D or targeted lsof / fuser.
  5. Check for deleted-but-open: lsof +L1.
  6. Stop growth: pause the job, scale down the deployment, or restart the service holding space.
  7. Guardrail if needed: remount tmpfs with a temporary size= cap only if you can tolerate ENOSPC.
  8. Stabilize: ensure swap is not pegged; if it is, consider restarting worst offenders to reclaim memory quickly.

Permanent fix (same day)

  1. Decide which data belongs in tmpfs: IPC and small hot caches only.
  2. Move large scratch to disk: change app config to use /var/tmp or a dedicated path.
  3. Set explicit tmpfs limits: use systemd mount overrides for /dev/shm or your custom mounts.
  4. Add cleanup: TTL, size-based eviction, or rotation. “We’ll clean it up later” is how tmpfs becomes a weapon.
  5. Add monitoring: alert on tmpfs Used% and inode usage (df -ih), plus host MemAvailable and swap; a starter one-liner follows this list.
  6. Add blast-radius controls: systemd MemoryHigh/MemoryMax for the worst offenders.
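
For the monitoring step, a cheap starting point before you wire it into a real alerting pipeline:

cr0x@server:~$ df --output=target,fstype,pcent,ipcent -t tmpfs
Mounted on     Type  Use% IUse%
/run           tmpfs  60%   12%
/dev/shm       tmpfs  51%    1%
/run/lock      tmpfs   1%    1%
/run/user/1000 tmpfs   1%    1%

Both columns matter: Use% is bytes, IUse% is inodes, and either one hitting the ceiling produces the same “No space left on device.”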

Fleet hardening (next sprint)

  1. Standardize mount policies: per node class, define tmpfs caps and document which apps need larger shared memory.
  2. Container policy: require explicit memory limits and forbid unbounded memory-backed volumes for non-cache data.
  3. Test failure modes: force tmpfs ENOSPC in staging and ensure apps degrade predictably (not corrupt, not hang).
  4. Audit /run usage: runtime dirs should not hold large datasets; enforce in code review and packaging.

FAQ

1) Is tmpfs actually using RAM or just “virtual” memory?

It uses real memory pages as data is written. Those pages can be swapped out under pressure, but they still count against system memory accounting.

2) Why does tmpfs show a Size equal to half my RAM?

That’s a default maximum. It’s not pre-allocated. The Used column is what matters, plus Shmem in /proc/meminfo.

3) Should I mount /tmp as tmpfs on Ubuntu 24.04?

Only if you’ve sized it and tested your workloads. On general-purpose servers, keeping /tmp disk-backed is safer. If you do use tmpfs, cap it and plan for ENOSPC.

4) Can I just increase swap to solve tmpfs memory spikes?

Increasing swap buys time, not correctness. It can also turn a clean failure into a slow brownout. Fix the writer and set limits.

5) What’s the safest place to put large temporary files?

Use disk: /var/tmp or a dedicated directory on fast storage. If you need performance, use NVMe and keep tmpfs for small hot data.

6) Why does “df says full” but “du says small” on tmpfs?

Usually deleted-but-open files. The directory entries are gone, but the process still holds the file descriptor. Use lsof +L1 and restart the process.

7) If I cap /dev/shm, will I break things?

Possibly. Some software legitimately uses shared memory. Measure actual usage under peak, set a cap above it, and test. Randomly choosing “512M seems fine” is how you earn weekend work.

8) How do I cap tmpfs persistently on Ubuntu 24.04?

For systemd-managed mounts like /dev/shm, create a drop-in override via systemctl edit dev-shm.mount and set Options=...size=....

9) Is /run supposed to be tmpfs?

Yes. It’s for runtime state that shouldn’t persist across reboots. If it’s huge, something is abusing it.

10) Does tmpfs have inode limits too?

Yes. You can run out of inodes before bytes. Check with df -ih. If inode usage is high, find file spam and fix the generator.

Next steps you can do today

tmpfs isn’t a bug in Ubuntu 24.04. It’s a sharp tool. The failures happen when you treat it like infinite scratch space and forget it shares the same RAM your applications need to breathe.

Do this in order:

  1. Identify which tmpfs is growing (df -hT), then verify it’s real pressure (/proc/meminfo).
  2. Find the directory and process responsible (du, lsof, lsof +L1).
  3. Move big temp data to disk, and cap tmpfs where misuse is likely (especially /dev/shm), using systemd overrides.
  4. Add monitoring for tmpfs bytes and inodes, plus memory/swap pressure, so this becomes a page you see coming—not a mystery outage.
  5. Contain blast radius with per-service memory limits where it makes sense.

If you do nothing else: cap the risky tmpfs mounts intentionally, and make applications handle ENOSPC like adults. Your kernel will thank you by not picking victims at 3 a.m.
