Ubuntu 24.04 Memory Ballooning Surprises: Set Sane Limits and Stop Swap Storms (Case #16)

It starts as a “minor slowdown.” Then your graphs go modern art: load average climbs, latency spikes, disks look busy without doing anything useful, and the VM feels like it’s running through molasses.

On Ubuntu 24.04 in virtualized environments, memory ballooning can turn from “helpful overcommit feature” into “swap storm generator.” The fix is rarely mystical. It’s usually limits, accounting, and a hard decision about what you want to happen when memory is actually scarce.

What’s actually happening when ballooning hurts

Memory ballooning exists because hypervisors like to play Tetris with RAM. A VM may be configured for 16 GB, but most of the time it uses 4–8 GB. Ballooning lets the host reclaim the “unused” part and lend it to other VMs. In theory, everyone wins.

In practice, ballooning can produce a specific kind of misery:

  • The host reclaims memory from the guest at the worst possible time (because the host is under pressure).
  • The guest kernel sees less RAM available and starts reclaiming pages.
  • If reclaim can’t find enough clean page cache, it starts swapping anonymous memory.
  • Swap I/O happens on virtual disks, which are often on shared storage, which amplifies latency.
  • Now the guest is slower, so it makes progress more slowly, so it stays under pressure longer, so it swaps more.

Ballooning itself isn’t evil. Unbounded ballooning with optimistic overcommit and weak guardrails is evil. The easiest way to describe it: you’re doing memory management in two places (host and guest) with two kernels that don’t share a brain.

Here’s the part that surprises people on Ubuntu 24.04: the default ecosystem around memory pressure has gotten more “helpful.” cgroup v2 is standard, systemd is more proactive, and OOM behavior can look different than an older LTS. None of that is bad. It just means your old assumptions can fail loudly.

One idea to keep you honest, paraphrasing Gene Kranz's reliability mindset: be tough and competent, no excuses. It applies to memory limits too.

Joke #1: Memory ballooning is like a “temporary” desk you set up in the kitchen. It’s temporary until dinner is ruined and you can’t find the forks.

Interesting facts and context (the stuff that explains the weirdness)

  1. Ballooning isn’t new. VMware popularized balloon drivers decades ago; KVM’s virtio-balloon later made it common in open stacks.
  2. Linux doesn’t treat swap as a failure by default. The kernel will swap to keep file cache and smooth spikes; that’s rational until the storage backend makes swap expensive.
  3. cgroup v2 changed the game. Memory accounting is tighter and more unified. If you set a memory.max, it’s a real wall, not a polite suggestion.
  4. “Available” memory is not “free.” The Linux “available” metric estimates what can be reclaimed without swapping. People still panic at “free.” They shouldn’t.
  5. kswapd is a symptom, not a villain. High kswapd CPU means the kernel is trying to reclaim pages under pressure. The root cause is almost always: too little RAM for the working set.
  6. Swap storms are contagious. One VM swapping heavily can degrade shared storage latency, which makes other VMs slower, which increases their memory pressure, which makes them swap too. Congratulations, you've invented a cluster-wide performance incident.
  7. Ballooning can look like a memory leak. From inside the guest, it feels like RAM vanished. Because it did.
  8. zram is a modern compromise. Compressed RAM swap reduces I/O but increases CPU use. Great for some workloads; awful for others.
  9. THP can complicate reclaim. Transparent Huge Pages can make reclaim and compaction more expensive during pressure, depending on workload and settings.

Fast diagnosis playbook (first/second/third)

First: confirm the machine is actually under memory pressure

  • Look for swap-in/out activity and major faults.
  • Confirm whether “available” memory is low and staying low.
  • Check PSI (pressure stall information) to see if the system is stalling on memory reclaim.

Second: prove whether ballooning is involved

  • Check virtio_balloon driver presence and balloon stats.
  • Correlate “host reclaimed” behavior (if visible) with guest memory drops.
  • In platforms like Proxmox/OpenStack, compare configured vs effective memory and balloon target changes.

Third: identify what is consuming memory and whether it’s reclaimable

  • Top processes by RSS and by anon memory.
  • File cache vs anonymous usage (page cache is your friend until it isn’t).
  • cgroup limits and per-service memory caps.

Decide the outcome you want

  • If you want predictable latency: reduce overcommit, set hard limits, and prefer OOM kill over swap storms.
  • If you want maximal consolidation: allow ballooning but cap its minimum, use zram carefully, and monitor PSI like it’s revenue.

Practical tasks: commands, outputs, and the decision you make

These are the tasks I actually run when a VM “mysteriously got slow.” Each includes what the output means and the decision that follows. Run them in order until you have a coherent story. If you can’t explain the story, don’t touch knobs yet.

Task 1: Check the headline memory state (and don’t misread it)

cr0x@server:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           7.7Gi       6.9Gi       112Mi       204Mi       731Mi       402Mi
Swap:          4.0Gi       3.6Gi       410Mi

What it means: Available is only 402 MiB and swap is heavily used. You are already in “reclaim mode.”

Decision: Move immediately to vmstat and PSI to see whether it’s actively thrashing or just swap-resident from a past event.

Task 2: Detect active swap storm vs “swap happened earlier”

cr0x@server:~$ vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  1 3718920 112640  55872 634320  64  128   220   310  410  790 12  8 55 25  0
 3  2 3721048  98304  53120 621104 256  512   880  1200  610 1200 14 10 35 41  0
 4  2 3723096  94208  51456 609440 512 1024  1300  1900  720 1400 12  9 28 51  0
 3  1 3725144  90112  49664 601112 128  256   600   840  540 1100 10  7 45 38  0

What it means: Non-zero si/so over multiple samples plus high wa indicates active swapping and I/O wait. This is not “old swap.”

Decision: You need to relieve the pressure. Add memory, reduce ballooning, shed workload, or accept an OOM kill with a bounded blast radius.

Task 3: Check PSI to confirm stalls (Ubuntu 24.04 has it)

cr0x@server:~$ cat /proc/pressure/memory
some avg10=18.24 avg60=12.10 avg300=5.43 total=218443992
full avg10=6.12 avg60=3.98 avg300=1.22 total=68423992

What it means: “some” means tasks are delayed sometimes; “full” means the system is fully stalled waiting for memory. Non-trivial full values correlate strongly with user-visible pain.

Decision: Treat this like an incident. Stop thinking “maybe it’s CPU” until memory pressure is addressed.

Task 4: Check if balloon driver exists (guest side)

cr0x@server:~$ lsmod | grep -E 'virtio_balloon|vmw_balloon'
virtio_balloon         24576  0

What it means: virtio ballooning is available. That doesn’t prove it’s actively inflating, but it makes ballooning a plausible cause.

Decision: Inspect balloon statistics to see if the guest believes memory has been taken.

Task 5: Inspect virtio balloon stats (if exposed)

cr0x@server:~$ grep -H . /sys/devices/virtio*/balloon*/{num_pages,actual,free_page_hint} 2>/dev/null
/sys/devices/virtio0/virtio0/balloon/num_pages:2097152
/sys/devices/virtio0/virtio0/balloon/actual:1572864
/sys/devices/virtio0/virtio0/balloon/free_page_hint:1

What it means: Balloon “actual” indicates pages currently ballooned (reclaimed). If this grows during host pressure, your VM will see less usable RAM.

Decision: If ballooned pages are significant relative to total, stop treating RAM as “guaranteed.” Set a minimum or disable ballooning for latency-critical VMs.
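
If you have shell access to the host on a libvirt/KVM stack, you can also check the balloon from the outside. A hedged sketch: the domain name is hypothetical, and the exact fields reported depend on the balloon driver and libvirt version.

cr0x@kvm-host:~$ sudo virsh dommemstat vm-web01
cr0x@kvm-host:~$ sudo virsh dominfo vm-web01 | grep -i memory

Compare the reported current/actual memory against the configured maximum. If the balloon target sits well below the maximum while the guest is swapping, the fix belongs on the host, not in guest sysctls.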

Task 6: Confirm configured memory vs what the kernel sees

cr0x@server:~$ grep -E 'MemTotal|MemAvailable' /proc/meminfo
MemTotal:        8049136 kB
MemAvailable:     411224 kB

What it means: MemTotal is what the guest currently thinks it has. If you “configured 16G” but MemTotal is ~8G, ballooning (or hotplug sizing) is in play.

Decision: Align expectations with reality. If the VM needs 16G, don’t let the host treat it like a piggy bank.

Task 7: Identify top memory consumers by RSS (fast triage)

cr0x@server:~$ ps -eo pid,comm,rss --sort=-rss | head -n 10
  PID COMMAND           RSS
 2481 java           2381440
 1822 postgres       1024320
 1999 node            612480
 1320 snapd           188224
  911 systemd-journal  123456
 2766 python3          112320

What it means: You have large anonymous memory users (e.g., JVM) and stateful services. Killing the wrong process may “fix swapping” and also “delete your business.”

Decision: For known heavy hitters (JVM, databases), verify configured heap/buffers. Memory pressure caused by ballooning may expose overly optimistic app defaults.

Task 8: Check for kernel reclaim trouble (compaction, THP)

cr0x@server:~$ grep -E 'compact|thp' /proc/vmstat | head
compact_migrate_scanned 184221
compact_free_scanned 91234
compact_isolated 1422
thp_fault_alloc 1221
thp_collapse_alloc 18

What it means: High compaction activity under pressure can increase CPU time in reclaim and stall workloads. THP allocation/collapse can add overhead.

Decision: If you see heavy compaction alongside PSI “full,” consider tuning THP or ensuring more headroom rather than “optimizing swap.”

Task 9: Look for OOM or near-OOM events (journal)

cr0x@server:~$ journalctl -k --since "2 hours ago" | grep -Ei "oom|out of memory|kswapd|memory pressure" | tail -n 20
Dec 30 09:41:12 server kernel: Memory cgroup out of memory: Killed process 1999 (node) total-vm:3128448kB, anon-rss:598112kB, file-rss:2048kB
Dec 30 09:41:12 server kernel: oom_reaper: reaped process 1999 (node), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

What it means: The kernel killed something due to cgroup memory OOM. That’s not random; it’s a limit being enforced (or a leak that hit the wall).

Decision: If you prefer this to swap storms, good—now make it intentional: set memory limits per service and document them. If you don’t, raise limits or reduce ballooning.

Task 10: Inspect swap devices and priorities

cr0x@server:~$ swapon --show --bytes
NAME       TYPE      SIZE        USED       PRIO
/swap.img  file  4294967296  3865051136       -2

What it means: A swapfile is in use, low priority. If it lives on slow or contended storage, swapping will punish everything.

Decision: Decide whether to keep swap (often yes, but smaller) and whether to use zram, or disable swap for latency-critical VMs (with guardrails).

Task 11: Check swap behavior knobs (swappiness, vfs cache pressure)

cr0x@server:~$ sysctl vm.swappiness vm.vfs_cache_pressure
vm.swappiness = 60
vm.vfs_cache_pressure = 100

What it means: Default-ish. Under ballooning pressure, the kernel might swap sooner than you want for certain workloads.

Decision: For database/latency VMs, consider lowering swappiness (e.g., 10–20) after you set sane memory limits. Tuning without limits is wishful thinking.

Task 12: Confirm cgroup v2 memory constraints (system-wide)

cr0x@server:~$ stat -fc %T /sys/fs/cgroup
cgroup2fs

What it means: You’re on cgroup v2. Ubuntu 24.04 defaults this way.

Decision: Use v2 controls (memory.max, memory.high, memory.swap.max). Stop applying v1-era advice verbatim.
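
If you want to see the same thing from the kernel's side, read the service's cgroup files directly. A small sketch, assuming the default systemd hierarchy and using nginx.service as the placeholder unit; "max" means unlimited.

cr0x@server:~$ cat /sys/fs/cgroup/system.slice/nginx.service/memory.max
max
cr0x@server:~$ cat /sys/fs/cgroup/system.slice/nginx.service/memory.swap.max
max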

Task 13: Check if systemd-oomd is active (it can change outcomes)

cr0x@server:~$ systemctl is-enabled systemd-oomd
enabled

What it means: Userspace OOM daemon may intervene based on pressure signals. That can be good (faster recovery) or confusing (unexpected kills).

Decision: If you run critical single-tenant workloads, configure oomd scopes/slices or disable it deliberately—don’t just “discover” it mid-incident.
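
To see what oomd is actually watching before you decide, query it directly. This assumes the oomctl tool that ships alongside systemd-oomd (verify it exists on minimized images); user.slice below is just an example unit to inspect.

cr0x@server:~$ oomctl
cr0x@server:~$ systemctl show -p ManagedOOMSwap -p ManagedOOMMemoryPressure -p ManagedOOMMemoryPressureLimit user.slice

If nothing you care about carries ManagedOOM settings, oomd is mostly idle; if a slice does, know which one before the incident.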

Task 14: Spot I/O saturation caused by swap (the “it’s storage” angle)

cr0x@server:~$ iostat -xz 1 3
Linux 6.8.0-41-generic (server) 	12/30/2025 	_x86_64_	(4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          11.01    0.00    7.32   39.88    0.00   41.79

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz  aqu-sz  %util
vda              42.0   1120.0     0.0   0.00   18.5    26.7      51.0   2496.0     2.0   3.77   44.2    48.9     3.3   95.0

What it means: High %util, high await, and lots of smallish writes: classic swap churn on virtual disk.

Decision: Fix memory pressure first. Storage tuning won’t outrun a VM swapping itself to death on shared backing.

Task 15: Confirm whether you’re in a VM and which hypervisor (helps pick the right knobs)

cr0x@server:~$ systemd-detect-virt
kvm

What it means: KVM guest. Ballooning is likely virtio-based; qemu-guest-agent might also be relevant depending on your platform.

Decision: On KVM stacks, decide: disable ballooning for critical workloads, or set a floor and monitor balloon target changes.
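
On plain libvirt, the balloon target can be pushed back up from the host. A sketch with a hypothetical domain name; management layers like Proxmox or OpenStack may override manual changes with their own ballooning policy, so prefer their native settings where they exist.

cr0x@kvm-host:~$ sudo virsh setmem vm-web01 16G --live
cr0x@kvm-host:~$ sudo virsh dumpxml vm-web01 | grep -A1 memballoon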

Task 16: Check memory.high/memory.max for a specific service (real culprit in modern systems)

cr0x@server:~$ systemctl show -p ControlGroup -p MemoryMax -p MemoryHigh nginx.service
ControlGroup=/system.slice/nginx.service
MemoryMax=infinity
MemoryHigh=infinity

What it means: No explicit memory control. If nginx is not the big user, fine. For heavy services, “infinity” means they can compete until the whole box suffers.

Decision: Put memory limits on the handful of services that can balloon unpredictably (workers, JVMs, build tools, batch jobs).

Set sane limits: hypervisor, guest, and cgroup v2

Ballooning becomes survivable when you set boundaries. In production, “unbounded sharing” is just an outage with optimism.

1) Decide your memory contract: guaranteed, burstable, or best-effort

Every VM should have one of these contracts:

  • Guaranteed: Memory is reserved. Ballooning disabled or minimum set close to max. Used for databases, control-plane services, latency SLOs.
  • Burstable: Some reclaim is allowed, but there’s a hard floor. Used for web tiers, caches, app servers that can shed load.
  • Best-effort: Ballooning on, low floor, may swap/oom. Used for dev, CI, batch, ephemeral workers.

2) On the hypervisor: stop promising the same RAM to everyone

Whatever stack you run—raw libvirt, Proxmox, OpenStack—the concept is the same:

  • Set a maximum memory (what the VM could have).
  • Set a minimum/reservation memory (what it should always keep).
  • Cap host-level overcommit to something you can survive when workloads align badly.

Why? Because reclaim is not free. The guest may have to swap anonymous pages to satisfy balloon inflation, and your storage will be the one paying the bill.
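
On plain libvirt/KVM, the relevant pieces of the domain XML look roughly like this. It's a fragment, not a complete definition, and the domain name is hypothetical; Proxmox and OpenStack express the same ideas through their own memory and balloon settings.

cr0x@kvm-host:~$ sudo virsh edit vm-web01
<!-- fragment of the domain XML -->
<memory unit='GiB'>16</memory>                  <!-- maximum the guest could have -->
<currentMemory unit='GiB'>16</currentMemory>    <!-- balloon target; keep equal to max for guaranteed VMs -->
<!-- ...inside <devices>: -->
<memballoon model='none'/>                      <!-- or drop the balloon device entirely for that tier -->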

3) In the guest: use cgroup v2 limits to force local containment

If you can’t control the host perfectly (welcome to Earth), you can at least contain damage in the guest. cgroup v2 gives you three big tools:

  • memory.max: the hard limit. Exceed it and you get an OOM kill in that cgroup.
  • memory.high: a soft limit. The kernel will throttle/reclaim to keep usage near it.
  • memory.swap.max: cap swap usage per cgroup (powerful for preventing swap storms from a single service).

For systemd-managed services, you typically set these via unit overrides. Here’s what that looks like for a “burstable but bounded” worker service.

cr0x@server:~$ sudo systemctl edit worker.service
# (an editor opens; put these lines in the override drop-in, then save)
[Service]
MemoryHigh=2G
MemoryMax=3G
MemorySwapMax=512M

What it means: The service can use memory, but it can’t consume the machine. If it grows beyond 3G, it gets killed instead of pushing the whole VM into swap.

Decision: Use this for untrusted or bursty components: background jobs, message consumers, “one more exporter,” and anything written in a language that might discover infinity.
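
The new limits take effect on the service's cgroup once you restart it. Verify that systemd picked them up; values are reported in bytes.

cr0x@server:~$ sudo systemctl restart worker.service
cr0x@server:~$ systemctl show -p MemoryHigh -p MemoryMax -p MemorySwapMax worker.service
MemoryHigh=2147483648
MemoryMax=3221225472
MemorySwapMax=536870912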

4) Make OOM decisions deliberate: kernel vs systemd-oomd

Ubuntu 24.04 can involve systemd-oomd, which acts based on PSI and cgroup memory pressure. This is not the classic kernel OOM killer. Different trigger, different behavior, sometimes earlier, often cleaner.

What you do:

  • If you want strict service containment, keep oomd enabled and manage slices.
  • If you run a single critical monolith and oomd might kill the wrong thing, tune it or disable it—but only after you have alternate containment (limits, reservations).

cr0x@server:~$ systemctl status systemd-oomd --no-pager
● systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer
     Loaded: loaded (/usr/lib/systemd/system/systemd-oomd.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-12-30 08:01:12 UTC; 2h 10min ago

What it means: It’s running and will act if configured slices cross pressure thresholds.

Decision: If you didn’t plan for it, plan now. “Surprise process kills” is not a monitoring strategy.
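
If you keep oomd, tell it explicitly which units it may act on instead of relying on defaults. A minimal sketch for a hypothetical batch-jobs.service; the 60% pressure limit is an example value to tune, not a recommendation.

cr0x@server:~$ sudo systemctl edit batch-jobs.service
# (editor opens; drop-in content)
[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=60%

cr0x@server:~$ sudo systemctl restart batch-jobs.service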

Stop swap storms: swap choices, swappiness, and pressure control

Swap is neither purely good nor purely evil. Swap is a tool. In VMs, it is also a performance trap because swap I/O is usually the slowest and most contended path in the entire stack.

Choose your swap model

You typically want one of these:

  • Small swap + low swappiness: Enough to avoid catastrophic OOM during small spikes, but not enough to sustain a long thrash.
  • zram swap: Swap in RAM with compression. Reduces disk I/O but spends CPU. Good for dev and bursty workloads; evaluate for CPU-bound production services.
  • No swap (rarely correct): Only when you have strict reservations and you prefer immediate failure over degraded behavior. Works best with good limits and alerting.

Task: See whether you’re using zram already

cr0x@server:~$ lsblk -o NAME,TYPE,SIZE,MOUNTPOINT | grep -E 'zram|swap'
zram0 disk  2G

What it means: zram device exists. It may or may not be configured as swap.

Decision: Confirm with swapon --show. If you have both zram and disk swap, set priorities intentionally.
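
If zram is the trade-off you want and it isn't set up yet, the systemd zram generator is the least surprising way to configure it. A sketch assuming the systemd-zram-generator package on Ubuntu 24.04; the size expression and compression algorithm are example values.

cr0x@server:~$ sudo apt install systemd-zram-generator
cr0x@server:~$ sudo tee /etc/systemd/zram-generator.conf >/dev/null <<'EOF'
[zram0]
zram-size = min(ram / 4, 2048)
compression-algorithm = zstd
swap-priority = 100
EOF
cr0x@server:~$ sudo systemctl daemon-reload
cr0x@server:~$ sudo systemctl start systemd-zram-setup@zram0.service
cr0x@server:~$ swapon --show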

Task: Set swappiness persistently (only after you fix limits)

cr0x@server:~$ sudo tee /etc/sysctl.d/99-memory-sane.conf >/dev/null <<'EOF'
vm.swappiness=15
vm.vfs_cache_pressure=100
EOF
cr0x@server:~$ sudo sysctl --system | tail -n 5
* Applying /etc/sysctl.d/99-memory-sane.conf ...
vm.swappiness = 15
vm.vfs_cache_pressure = 100

What it means: Kernel will prefer reclaiming cache over swapping anonymous memory compared to default behavior, though it still may swap under real pressure.

Decision: If the VM is swapping because ballooning removed RAM, swappiness tuning helps at the margins. The primary fix is to stop stealing the RAM in the first place.

Cap swap per service (the underrated fix)

Swap storms usually start with one greedy process. If you cap its swap, it can’t smear its suffering across the whole machine.

cr0x@server:~$ sudo systemctl set-property batch-jobs.service MemorySwapMax=0
cr0x@server:~$ systemctl show -p MemorySwapMax batch-jobs.service
MemorySwapMax=0

What it means: That service cannot use swap. If it runs out of memory, it will get OOM-killed within its cgroup.

Decision: Apply this to workloads where “slowly swapping for 30 minutes” is worse than “fail fast and retry.” Batch processing is a prime candidate.

Understand why swap storms feel like storage failures

As a storage engineer, I’ll say the quiet part out loud: a VM swapping heavily is indistinguishable from a storage denial-of-service attack—except it’s self-inflicted and fully authenticated.

Swap produces:

  • Small random writes (page-outs), plus reads on faults.
  • High queue depths and increased latency.
  • Contended backing devices (especially on shared SSD pools or network storage).

When you see “storage latency incident” and one guest is swapping, fix the guest memory contract first. Then talk about IOPS.

Joke #2: Swap storms are the only weather system that forms indoors, and it still manages to take your service down.

Three corporate mini-stories (how this fails in real companies)

Mini-story #1: The outage caused by a wrong assumption

The company had a tidy belief: “Configured memory is guaranteed memory.” They’d been running older Ubuntu guests on a KVM cluster with ballooning enabled for years, and most VMs never complained. The dashboards showed each VM “had 16 GB,” so people sized applications accordingly.

Then a host-level incident hit: one physical node lost a DIMM channel and the cluster rebalanced. Overcommit was still enabled. The scheduler packed workloads tightly to maintain capacity. Ballooning inflated on several guests at once, because the host needed memory immediately, not politely.

Inside a critical Ubuntu 24.04 VM, MemTotal didn’t change in anyone’s head, but it effectively did: available dropped, reclaim spiked, and the kernel started swapping out parts of a JVM heap. Latency increased, which caused request queues to grow, which caused more heap retention. It was a feedback loop with a nice UI.

The team initially chased CPU steal and “noisy neighbor storage.” Both were real, but downstream. The primary event was memory removal via ballooning during host pressure. Their assumption had been wrong: configured memory wasn’t a promise; it was a maximum.

The fix was boring and effective: disable ballooning for the SLO-bound tier, set host reservations for critical VMs, and stop overcommitting memory on the nodes that carried stateful services. Cost went up a bit. Incidents went down a lot.

Mini-story #2: The optimization that backfired

A different org tried to be clever: they enabled ballooning everywhere and reduced base VM sizes, betting that “Linux uses cache and can give it back.” They were technically correct, and operationally reckless.

For a few weeks, it looked great. They fit more VMs per host. Finance smiled. Then a routine deploy pushed a new build that increased memory use for a background indexing service. Not a leak—just more data structures.

Under ballooning, those VMs were living close to the edge. When the indexer surged, the guest reclaimed page cache and started swapping. The indexer’s own throughput dropped, so it ran longer and stayed memory-hot longer. Meanwhile, swap I/O hit the same storage pool used by database volumes.

The “optimization” turned into a multi-service degradation incident: web requests timed out, database commits slowed, and on-call was staring at storage graphs wondering why write latency spiked during “low traffic.” It wasn’t low traffic; it was high swap.

They rolled back the deploy and saw partial recovery, but the real solution was architectural: separate tiers by memory contract, set balloon floors, and enforce per-service memory caps so a background task can’t turn the entire cluster into a slow-motion disaster.

Mini-story #3: The boring but correct practice that saved the day

This one is less dramatic, which is the point. A team ran Ubuntu 24.04 guests for internal CI runners and a small production API. They had a strict rule: production VMs had ballooning disabled, swap capped, and systemd service limits defined. CI runners were best-effort and disposable.

One day the hypervisor cluster experienced unexpected memory pressure after a vendor firmware update changed power/performance characteristics. The host started reclaiming aggressively. Several best-effort VMs slowed down immediately.

Production stayed stable. Not because it had “more memory,” but because it had a real reservation and no balloon reclaim. The API VMs had a small swap, low swappiness, and a memory.max on a background log processing unit. When pressure increased, the background unit died and restarted. The API latency barely moved.

The incident report was short: “Host memory pressure affected best-effort tier. Production unaffected due to reservations and service limits.” No heroics, no kernel spelunking, no emotional support Slack threads.

They didn’t win a prize for it. They shipped features on schedule while everyone else was learning, again, that predictability is purchased with constraints.

Common mistakes: symptom → root cause → fix

1) Symptom: load average skyrockets, CPU usage looks moderate

Root cause: Memory pressure stalls and I/O wait from swapping. Load includes tasks waiting on I/O and reclaim, not just CPU execution.

Fix: Confirm with vmstat (si/so), iostat (await/util), and PSI. Then reduce ballooning and/or add memory headroom. Lower swappiness only after the contract is fixed.

2) Symptom: “We have 16 GB configured, but MemTotal is ~8 GB”

Root cause: Ballooning/hotplug configuration means the guest is not currently holding max memory.

Fix: Set a balloon minimum/reservation on the hypervisor for that VM, or disable ballooning for that tier. Verify by re-checking MemTotal and balloon stats.

3) Symptom: swap usage keeps growing even when traffic is flat

Root cause: Working set exceeds RAM after ballooning reclaim; kernel swaps out anon pages to survive.

Fix: Increase guaranteed memory or reduce reclaim. If the growth is tied to a single service, cap it with cgroup v2 memory.max and memory.swap.max.

4) Symptom: random process kills that look “unpredictable”

Root cause: systemd-oomd or cgroup memory OOM is enforcing limits (or default slice pressure rules).

Fix: Make memory control explicit: set MemoryMax/High on services, define slices, and decide which components should die first. Review oomd configuration rather than assuming kernel-only OOM behavior.

5) Symptom: storage latency incident with no obvious heavy workload

Root cause: VM swap churn generates constant small I/O. On shared storage, it looks like noisy neighbors.

Fix: Find the swapping guest via iostat/vmstat in the guest and host metrics. Stop the memory pressure. Storage tuning won’t fix sustained swap churn.

6) Symptom: “We disabled swap and now we crash more”

Root cause: No buffer for transient spikes; also no per-service containment, so one spike triggers system-wide OOM.

Fix: Reintroduce small swap or zram, but add cgroup limits so spikes are contained. Prefer fail-fast for non-critical services with MemorySwapMax=0.

7) Symptom: high kswapd CPU and frequent stutters during GC or compaction

Root cause: Reclaim and compaction overhead during memory pressure, potentially worsened by THP behavior.

Fix: Add headroom and reduce ballooning first. Then consider THP mode changes for specific workloads if compaction is a recurring cost center.

Checklists / step-by-step plan

Step-by-step: stabilize a swapping Ubuntu 24.04 VM right now

  1. Prove it’s memory pressure: run free -h, vmstat 1 10, and cat /proc/pressure/memory. If PSI full is non-trivial and si/so are active, you’re in a swap storm.
  2. Find the hog: use ps sorted by RSS. Confirm whether it’s a single process or broad pressure.
  3. Check ballooning: confirm virtio_balloon is loaded and inspect balloon stats. If MemTotal is lower than expected, treat it as balloon reclaim until proven otherwise.
  4. Reduce immediate harm: if one background service is the offender, cap it with MemoryMax and optionally MemorySwapMax (see the runtime one-liner after this list). Restart that service to get back to a stable state.
  5. Stop storage collateral damage: if swap is hammering the disk, shed load or temporarily scale out. Don’t “tune iostat.”
  6. Make the fix permanent: decide the VM’s memory contract and adjust host reservation/ballooning and guest limits accordingly.
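
For step 4, a runtime-only containment sketch. The service name is hypothetical, and --runtime means the cap disappears on reboot, which is what you want mid-incident before you've settled the permanent contract.

cr0x@server:~$ sudo systemctl set-property --runtime indexer.service MemoryMax=2G MemorySwapMax=0
cr0x@server:~$ sudo systemctl restart indexer.service

The restart clears the already-swapped pages and brings the service back within the new limits; the property alone mostly stops further growth.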

Checklist: configuration decisions that prevent repeat incidents

  • For each VM: classify as guaranteed/burstable/best-effort. Write it down.
  • For guaranteed VMs: disable ballooning or set a high minimum. Avoid heavy swap; keep a small safety swap if you can tolerate it.
  • For burstable VMs: set balloon minimum, use cgroup memory.high/max for known hog services, and consider swap caps.
  • For best-effort: allow ballooning, but implement retries and automation. Treat OOM as a normal failure mode.
  • Monitoring: alert on PSI memory full, sustained swap-out rate, and disk await/util spikes correlated with swap (a minimal PSI check sketch follows this checklist).
  • Change management: any change in host overcommit policy is a production change. Put it through the same review as a database migration.
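
For the PSI alert, a minimal shell sketch you can wire into whatever runs your checks. The threshold is an assumption to tune per fleet, not a universal constant, and the script path and name are arbitrary.

cr0x@server:~$ sudo tee /usr/local/bin/check-mem-psi >/dev/null <<'EOF'
#!/bin/sh
# Exit 1 if memory PSI "full" avg60 exceeds THRESHOLD (percent of time fully stalled).
THRESHOLD="${1:-5}"
full_avg60=$(awk '/^full/ {for (i=2;i<=NF;i++) if ($i ~ /^avg60=/) {sub("avg60=","",$i); print $i}}' /proc/pressure/memory)
awk -v v="$full_avg60" -v t="$THRESHOLD" 'BEGIN {exit (v+0 >= t+0) ? 1 : 0}'
EOF
cr0x@server:~$ sudo chmod +x /usr/local/bin/check-mem-psi
cr0x@server:~$ check-mem-psi 5 || echo "memory pressure above threshold: investigate"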

Checklist: “stop making it worse” during an incident

  • Don’t reboot first. Rebooting can mask cause and guarantees downtime.
  • Don’t raise swap size as your primary response. That often extends the suffering window.
  • Don’t “drop caches” as a reflex. If pressure is anonymous memory, dropping cache doesn’t solve it and can increase disk reads.
  • Don’t change five sysctls at once. You won’t know what helped, and future you will hate present you.

FAQ

1) Is memory ballooning always bad?

No. It’s useful for consolidation and bursty workloads. It’s bad when you treat it as invisible and let it reclaim memory from latency-critical or stateful services.

2) How do I know ballooning is the cause and not an application memory leak?

Look for a mismatch between “what you think the VM has” and /proc/meminfo MemTotal, plus virtio balloon stats changing. A leak usually shows a steady increase in RSS for a process independent of balloon target changes.

3) Should I disable swap on Ubuntu 24.04 VMs to prevent swap storms?

Sometimes, but rarely as the first move. Swap-off without memory reservations and per-service limits turns “slow and degraded” into “fast and dead.” Prefer small swap + containment, or zram where appropriate.

4) What’s the single best metric to alert on?

Memory PSI “full” sustained above a small threshold is hard to ignore because it directly measures time spent stalled on memory pressure. Pair it with swap-out rate.

5) Why does the VM look like it has plenty of memory in the hypervisor UI, but the guest is swapping?

Because UIs often show configured maximum memory, not the current effective memory after ballooning. Trust the guest’s view and balloon stats.

6) Does lowering vm.swappiness fix ballooning issues?

It can reduce how eagerly the kernel swaps, but it cannot manufacture RAM. If ballooning removes too much memory, you’ll still hit reclaim pain—just in a different order.

7) Is zram a good default in VMs?

It depends. zram trades disk I/O for CPU. On CPU-light but storage-contended systems, it’s a win. On CPU-bound workloads (or when CPU steal is high), it can worsen tail latency.

8) How should I size swap in a VM that must not thrash?

Small: enough to absorb brief spikes (or capture a dump if you do that), not enough to let the VM “survive” in a permanently degraded state. Then use cgroup swap caps for known offenders.

9) What’s the best practice for databases under ballooning?

Don’t balloon them. Reserve memory. If you must balloon, set a high floor and ensure the database’s memory settings (buffers, caches) don’t assume the maximum is always present.

10) Why does everything get slow even if only one service is memory-hungry?

Because global reclaim and swap I/O affect the whole system: CPU time goes to reclaim, I/O queues fill, and latency cascades. Contain memory usage at the service level.

Conclusion: practical next steps

If Ubuntu 24.04 in a VM is surprising you with ballooning-driven swap storms, the remedy is not a magical sysctl. It’s an explicit memory contract and enforcement at the right layers.

  1. Within 30 minutes: run the fast diagnosis playbook, capture free, vmstat, PSI, and iostat. Prove whether swapping is active and whether ballooning is involved.
  2. Within a day: classify VMs into guaranteed/burstable/best-effort and adjust ballooning minimums or disable ballooning for guaranteed tiers.
  3. Within a week: add cgroup v2 memory limits for the handful of services that can dominate memory, and cap swap for those services where “fail fast” beats “slow forever.”
  4. Always: alert on memory PSI full and sustained swap-out. Treat swap storms as an incident class, not a mystery.

Ballooning is fine when you can afford surprises. Production generally can’t. Set limits like you mean them, and your storage will stop screaming at 3 a.m.
