You upgrade a Debian host, it boots, services start… and then random processes start dying with the same deadpan message: Segmentation fault.
No neat error, no obvious pattern. Just the kernel dropping cores like confetti and your on-call channel getting louder.
This case (#55) is about turning that vague “segfault after upgrade” complaint into a specific, nameable fault: the exact shared library mismatch
(which file, which package, which symbol/ABI), plus the decision path to fix it without making the system worse.
What’s actually happening when “it segfaults”
After a major upgrade, “segfault” is usually not a mystery of modern computing. It’s an accounting error.
The process loaded a set of shared objects that don’t agree on a contract: ABI, symbol versions, calling conventions, structure layout,
or even assumptions about thread-local storage. The CPU does what it’s told, not what you meant.
On Debian, those contracts are normally enforced through packaging discipline: dependencies, symbol versioning, shlibs, triggers,
and the simple rule that you don’t mix stable release libraries with random blobs in /usr/local.
When you upgrade and still get segfaults, you’re often in one of these situations:
- Partial upgrade: you have a new userland piece using old libraries (or vice versa).
- Shadow libraries: /usr/local/lib or an application bundle is overriding distro libs.
- RPATH/RUNPATH weirdness: the binary has hard-coded search paths that win over sane defaults.
- Wrong architecture / multiarch mix: the loader finds a library with the same name but incompatible ELF class/arch.
- Out-of-tree plugins: modules compiled against older headers load into a newer process and scribble on memory.
- Corrupt files: less common, but an interrupted upgrade or disk issue can produce subtly broken ELF objects.
Your job isn’t “stop the segfault.” Your job is “identify the mismatched object and prove it.” When you can name the file and the package,
fixes become boring: reinstall, remove overrides, align versions, rebuild plugins, or roll back.
Fast diagnosis playbook
If you’re in production, you don’t start with philosophical debugging. You start with the fastest path to narrowing the blast radius and
getting a concrete hypothesis.
1) First: confirm it’s the dynamic loader and not the kernel or hardware
- Check journalctl for consistent faulting IP/library names.
- Confirm whether it’s many binaries or one service.
- Scan for obvious /usr/local or custom loader involvement.
2) Second: grab one core dump and produce a backtrace with library paths
- Use coredumpctl or the core file path.
- In gdb, print loaded shared libraries and backtrace.
- Identify the first “weird” library path, version, or missing symbol info.
3) Third: verify package integrity and version alignment
- apt policy for suspected libraries and the crashing binary.
- dpkg -V and debsums for corruption/overwrites.
- Look for held packages, diversions, and pinned priorities.
4) Fourth: trace loader decisions
- LD_DEBUG=libs,versions on the failing binary (in a safe environment).
- Confirm which exact .so got loaded, from which directory, and why.
5) Fifth: fix with the smallest possible blast radius
- Prefer removing overrides and reinstalling official packages.
- Don’t “symlink your way out” unless you like chasing ghosts later.
One quote to keep you honest. Werner Vogels (Amazon CTO) put it roughly like this (paraphrased):
“Everything fails all the time; design and operate assuming failure.” The operator’s version: assume your upgrade can leave a mixed world behind.
Interesting facts and context (so you stop being surprised)
- “Segmentation fault” is older than your career. The term comes from memory segmentation in early protected-mode designs, long before Linux was a glimmer.
- Linux reports “segfault” even when the root cause is an ABI contract break. The CPU faults; the OS reports; your actual bug is higher-level.
- glibc symbol versioning is a stability superpower. Libraries can export multiple versions of the same symbol to keep older binaries running—until you bypass the system.
- Debian’s packaging has a whole metadata system for shared libraries. The shlibs/symbols mechanism exists to prevent exactly this, but it can’t defend against local overrides.
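- RPATH came first; RUNPATH came later. RPATH is searched before LD_LIBRARY_PATH and also applies to transitive dependencies; RUNPATH is searched after LD_LIBRARY_PATH and applies only to the object’s direct dependencies. That difference can make “it works on one box” failures maddening.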
- /usr/local is historically meant for local admin builds. It predates container culture and is still a common place to stash time bombs.
- Major upgrades expose “undefined behavior” debt. A program that “worked” while relying on UB may crash when compiler, libc, or allocator behavior changes.
- C++ ABI issues are a recurring genre. Mixing compiler versions and libstdc++ expectations can produce crashes that look like random memory corruption.
Joke #1: A segfault is just your program’s way of saying it would like to go lie down for a while, preferably in someone else’s memory.
Practical tasks: commands, outputs, and decisions
These are not “try stuff” commands. Each task includes what you’re looking for and what decision you make next.
Run them on the affected host or, better, on a clone/snapshot if you’re dealing with production.
Task 1: Confirm the crash signature in the journal
cr0x@server:~$ journalctl -b --no-pager | grep -E "segfault|SIGSEGV|coredump" | tail -n 30
Dec 30 10:21:14 server kernel: myapp[22198]: segfault at 0 ip 00007f2d2a8f4c90 sp 00007ffd3b2a1c10 error 4 in libssl.so.3[7f2d2a860000+87000]
Dec 30 10:21:14 server systemd-coredump[22213]: Process 22198 (myapp) of user 1001 dumped core.
Dec 30 10:21:18 server kernel: myapp[22302]: segfault at 0 ip 00007f2d2a8f4c90 sp 00007ffd3b2a1c10 error 4 in libssl.so.3[7f2d2a860000+87000]
What it means: If the kernel names a library (in libssl.so.3), that’s your first suspect, not your last.
The faulting IP inside a library suggests either a real bug in that library or (more likely post-upgrade) a mismatch in what called into it.
Decision: Pick one crashing PID/core and investigate the exact library file path and package version for that library.
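When the journal is noisy, it helps to collapse the kernel lines into a count per process and faulting object before picking a core. The following is a minimal sketch, not a polished tool: `segfault_summary` is a hypothetical helper name, and it assumes the x86 kernel message format shown above and process names without spaces.

```bash
# Sketch: summarise kernel segfault lines as "count process object".
# Reads journal text on stdin. Lines that are not kernel segfault
# reports (for example systemd-coredump lines) are ignored.
segfault_summary() {
  sed -nE 's/.*kernel: ([^[]+)\[[0-9]+\]: segfault at .* in ([^[]+)\[.*/\1 \2/p' \
    | sort | uniq -c | sort -rn | awk '{ print $1, $2, $3 }'
}
```

A possible invocation is `journalctl -k -b --no-pager | segfault_summary`. One dominant pair (say, myapp and libssl.so.3) points at a single suspect; many different pairs suggest a system-wide problem.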
Task 2: See whether crashes are widespread or isolated
cr0x@server:~$ coredumpctl list --no-pager | tail -n 10
TIME PID UID GID SIG COREFILE EXE
Tue 2025-12-30 10:21:14 UTC 22198 1001 1001 11 present /usr/local/bin/myapp
Tue 2025-12-30 10:21:18 UTC 22302 1001 1001 11 present /usr/local/bin/myapp
Tue 2025-12-30 10:22:03 UTC 22511 0 0 11 present /usr/sbin/nginx
What it means: If both distro binaries (/usr/sbin/nginx) and local binaries (/usr/local/bin/myapp) are crashing,
you may have a system-wide library issue. If only /usr/local is crashing, start by distrusting /usr/local.
Decision: If system binaries crash too, prioritize verifying core system libraries and package integrity.
If only local apps crash, focus on loader paths, bundled libs, and build assumptions.
Task 3: Identify OS and architecture baseline
cr0x@server:~$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 13 (trixie)"
NAME="Debian GNU/Linux"
VERSION_ID="13"
VERSION="13 (trixie)"
ID=debian
cr0x@server:~$ uname -a
Linux server 6.12.0-1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.0-1 (2025-11-20) x86_64 GNU/Linux
What it means: Kernel/arch matter for debugging symbols and for ruling out “wrong-arch” libraries.
Decision: If you see mixed architectures installed (e.g., i386 on amd64), be ready to verify multiarch paths and loader selection.
Task 4: Verify the crashing executable’s origin and linkage
cr0x@server:~$ file /usr/local/bin/myapp
/usr/local/bin/myapp: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b7c7d6..., for GNU/Linux 3.2.0, stripped
What it means: “dynamically linked” means you’re in shared-library land. “stripped” means backtraces will be less fun.
Decision: If the binary is local and stripped, plan to install debug symbols for system libraries and, if possible, obtain an unstripped build.
Task 5: Quick dependency view with ldd (but don’t worship it)
cr0x@server:~$ ldd /usr/local/bin/myapp | head -n 20
linux-vdso.so.1 (0x00007ffd7b7d7000)
libssl.so.3 => /usr/local/lib/libssl.so.3 (0x00007f2d2a860000)
libcrypto.so.3 => /usr/local/lib/libcrypto.so.3 (0x00007f2d2a3d0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2d2a1ef000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2d2a9f0000)
What it means: The loader is picking /usr/local/lib/libssl.so.3 instead of Debian’s packaged OpenSSL in
/usr/lib/x86_64-linux-gnu. That’s a giant red flag after an upgrade.
Decision: Treat /usr/local/lib/libssl.so.3 as suspect #1. Confirm its version and whether it matches the expected ABI.
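On a binary with dozens of dependencies, eyeballing ldd output gets old fast. Here is a small sketch that prints only the libraries resolved outside the usual distro directories. The helper name `non_distro_libs` is made up for illustration, and treating /lib, /lib64, /usr/lib (and their 32/x32 variants) as "distro" is an assumption you may need to adjust.

```bash
# Sketch: from ldd output on stdin, print "soname path" for every library
# that resolved to a path outside the standard distro library directories.
non_distro_libs() {
  awk '$2 == "=>" && $3 ~ /^\// && $3 !~ /^\/(usr\/)?lib(32|64|x32)?\// {
         print $1, $3
       }'
}
```

Possible usage: `ldd /usr/local/bin/myapp | non_distro_libs`. Empty output means nothing is being shadowed at link-resolution time, which still says nothing about libraries pulled in later via dlopen.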
Task 6: Ask the dynamic linker what it’s doing (LD_DEBUG)
cr0x@server:~$ LD_DEBUG=libs,versions /usr/local/bin/myapp 2>&1 | head -n 40
22891: find library=libssl.so.3 [0]; searching
22891: search path=/usr/local/lib:/usr/lib/x86_64-linux-gnu:/lib/x86_64-linux-gnu (system search path)
22891: trying file=/usr/local/lib/libssl.so.3
22891: find library=libcrypto.so.3 [0]; searching
22891: trying file=/usr/local/lib/libcrypto.so.3
22891: checking for version `OPENSSL_3.2.0' in file /usr/local/lib/libssl.so.3 [0] required by file /usr/local/bin/myapp [0]
22891: checking for version `GLIBC_2.38' in file /lib/x86_64-linux-gnu/libc.so.6 [0] required by file /usr/local/lib/libssl.so.3 [0]
What it means: You now have proof of resolution order and symbol-version requirements.
If you see it requiring a symbol version not provided by your installed libraries, you’ll get loader errors.
But segfaults happen when the loader succeeds and the ABI still doesn’t match what the caller expects.
Decision: If the path preference is wrong, fix the path preference. Don’t try to “patch around” ABI mismatch at runtime.
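LD_DEBUG output is long, and the interesting part is the last "trying file" for each library plus the reason that search path was consulted. The sketch below condenses that. `loader_decisions` is a hypothetical helper, and it assumes the glibc LD_DEBUG=libs line shapes shown above ("find library=", "search path=... (reason)", "search cache=", "trying file=").

```bash
# Sketch: condense LD_DEBUG=libs output (stdin) to one line per library:
#   <soname> <last file tried> via <reason for that search path>
# "?" is printed when the log did not contain the piece of information.
loader_decisions() {
  awk '
    function emit() { if (lib != "") print lib, file, "via", fwhy }
    /find library=/ {
      emit()
      lib = $0; sub(/.*find library=/, "", lib); sub(/ .*/, "", lib)
      file = "?"; why = "?"; fwhy = "?"
    }
    /search cache=/ { why = "ld.so.cache" }
    /search path=/ {
      why = $0
      if (sub(/.*\(/, "", why)) sub(/\).*/, "", why); else why = "?"
    }
    /trying file=/ { file = $0; sub(/.*trying file=/, "", file); fwhy = why }
    END { emit() }
  '
}
```

Possible usage: `LD_DEBUG=libs /usr/local/bin/myapp 2>&1 | loader_decisions`. A reason such as "RUNPATH from file ..." or "LD_LIBRARY_PATH" tells you which knob to fix, rather than just which file won.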
Task 7: Confirm which package owns the suspect library (or that none does)
cr0x@server:~$ dpkg -S /usr/local/lib/libssl.so.3
dpkg-query: no path found matching pattern /usr/local/lib/libssl.so.3
What it means: It’s not a Debian-managed file. It could be a hand-installed OpenSSL, a vendor tarball, or leftovers from some “quick fix”.
Decision: If a core crypto library is unmanaged, strongly consider removing it from loader paths and using Debian’s packaged library.
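Checking ownership one file at a time does not scale to a full dependency list. The following sketch reads paths on stdin and prints those that no package claims. `unowned_files` is a hypothetical helper. It queries both the path as given and its symlink-resolved form, on the assumption that merged-/usr systems may record a file under /usr/lib while ldd reports it under /lib.

```bash
# Sketch: print every path from stdin that dpkg does not know about.
# Tries the literal path first, then the resolved path (merged-/usr case).
unowned_files() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    resolved=$(readlink -f -- "$path" 2>/dev/null) || resolved=$path
    if ! dpkg -S "$path" >/dev/null 2>&1 &&
       ! dpkg -S "$resolved" >/dev/null 2>&1; then
      printf '%s\n' "$path"
    fi
  done
}
```

Possible usage: `ldd /usr/local/bin/myapp | awk '$2 == "=>" {print $3}' | unowned_files`. Anything it prints is a library the distro cannot upgrade, verify, or vouch for.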
Task 8: Inspect the suspect ELF metadata and BuildID
cr0x@server:~$ readelf -n /usr/local/lib/libssl.so.3 | sed -n '1,80p'
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 4b2f5d0a1c6b0f7a9e6d5d3a2b1f0c9d8e7a6b5c
What it means: BuildIDs let you match binaries to debug symbols and to known builds. A local library whose BuildID matches no package or build record you can find is another clue.
Decision: If you can’t trace provenance, plan to replace it with a packaged library or rebuild it in a controlled pipeline.
Task 9: Check Debian’s OpenSSL library version and candidates
cr0x@server:~$ apt-cache policy libssl3
libssl3:
Installed: 3.3.2-1
Candidate: 3.3.2-1
Version table:
*** 3.3.2-1 500
500 http://deb.debian.org/debian trixie/main amd64 Packages
100 /var/lib/dpkg/status
What it means: Debian’s OpenSSL is installed and consistent with the release.
If your app is not using it, you’ve got an override problem, not “Debian broke crypto.”
Decision: Prefer Debian’s /usr/lib/x86_64-linux-gnu/libssl.so.3 unless you have a compelling, audited reason not to.
Task 10: Verify loader configuration and search paths
cr0x@server:~$ grep -R --line-number "/usr/local/lib" /etc/ld.so.conf /etc/ld.so.conf.d/* 2>/dev/null
/etc/ld.so.conf.d/local.conf:1:/usr/local/lib
cr0x@server:~$ ldconfig -v 2>/dev/null | head -n 25
/usr/local/lib: (from /etc/ld.so.conf.d/local.conf:1)
libssl.so.3 -> libssl.so.3
libcrypto.so.3 -> libcrypto.so.3
/usr/lib/x86_64-linux-gnu:
libssl.so.3 -> libssl.so.3
libcrypto.so.3 -> libcrypto.so.3
What it means: The system loader cache includes both, but preference is effectively giving /usr/local/lib a front-row seat.
That’s how you get mismatches after upgrades: distro libraries changed, your local ones didn’t (or changed differently).
Decision: Remove /usr/local/lib from global loader config unless you are absolutely sure you want it to override distro libs.
If you do want local libs, constrain them per-app, not globally.
Task 11: Capture a core dump and extract loaded libraries
cr0x@server:~$ coredumpctl info /usr/local/bin/myapp | sed -n '1,80p'
PID: 22198
UID: 1001
GID: 1001
Signal: 11 (SEGV)
Timestamp: Tue 2025-12-30 10:21:14 UTC
Command Line: /usr/local/bin/myapp --serve
Executable: /usr/local/bin/myapp
Control Group: /system.slice/myapp.service
Unit: myapp.service
Message: Process 22198 (myapp) of user 1001 dumped core.
cr0x@server:~$ coredumpctl debug /usr/local/bin/myapp
GNU gdb (Debian 15.2-1) 15.2
Reading symbols from /usr/local/bin/myapp...
(No debugging symbols found in /usr/local/bin/myapp)
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007f2d2a9f5000 0x00007f2d2aa1b000 Yes (*) /lib64/ld-linux-x86-64.so.2
0x00007f2d2a860000 0x00007f2d2a8e2000 Yes (*) /usr/local/lib/libssl.so.3
0x00007f2d2a3d0000 0x00007f2d2a84d000 Yes (*) /usr/local/lib/libcrypto.so.3
0x00007f2d2a1ef000 0x00007f2d2a3b8000 Yes (*) /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007f2d2a8f4c90 in ?? () from /usr/local/lib/libssl.so.3
#1 0x000000000040f23a in ?? ()
#2 0x0000000000410a11 in ?? ()
#3 0x00007f2d2a20a2ca in __libc_start_call_main () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f2d2a20a385 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x000000000040b4e5 in ?? ()
What it means: You’ve confirmed the crash occurs inside the local libssl. That doesn’t prove libssl is “buggy”; it proves it’s in the line of fire.
The key is the mismatch between caller expectations and callee ABI/data structures.
Decision: Install debug symbols for the suspect libs (or switch to Debian libs) to get a meaningful backtrace.
Task 12: Install debug symbols for system libraries (when available)
cr0x@server:~$ sudo apt-get install -y gdb libc6-dbg libssl3-dbgsym
Reading package lists... Done
Building dependency tree... Done
The following NEW packages will be installed:
gdb libc6-dbg libssl3-dbgsym
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
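What it means: With symbols, you can see function names and sometimes argument hints. Note that -dbgsym packages live in Debian’s separate debug archive (debian-debug), so that archive must be in your APT sources for this install to succeed. If your crashing lib is in /usr/local,
distro dbgsym won’t help for that exact file, but it will help you understand the call boundary and whether glibc is implicated.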
Decision: If the crashing library is unmanaged, either obtain matching debug symbols from whoever built it, or stop using it.
Task 13: Check for partial upgrades, held packages, and pins
cr0x@server:~$ apt-mark showhold
libc6
cr0x@server:~$ apt-get -s dist-upgrade | sed -n '1,80p'
Reading package lists... Done
Building dependency tree... Done
Calculating upgrade... Done
The following packages will be upgraded:
libc6 libc6-dev
2 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
What it means: A held libc6 is basically a self-inflicted compatibility fork. If you upgraded userland but pinned glibc, you’re gambling.
Decision: Remove the hold and complete the upgrade, or roll back and keep the whole system consistent. “Half-new Debian” is where segfaults breed.
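Not every hold is dangerous, but a hold on the core runtime is. This is a small sketch that filters a hold list down to the packages this article treats as risky. The function name `risky_holds` and the exact package list are assumptions; extend the list to match your own runbook.

```bash
# Sketch: from a list of held packages on stdin (one per line, optionally
# with an :arch suffix), print only holds on core runtime libraries.
risky_holds() {
  grep -E '^(libc6|libc-bin|libgcc-s1|libstdc\+\+6)(:[a-z0-9-]+)?$'
}
```

Possible usage: `apt-mark showhold | risky_holds`. Any output here deserves an explanation before the release upgrade, not after.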
Task 14: Verify package integrity for core components
cr0x@server:~$ sudo dpkg -V libc6 libssl3 | head -n 50
??5??????   /lib/x86_64-linux-gnu/libc.so.6
What it means: The 5 indicates an MD5 mismatch: the file differs from what the package expects. The ? marks are checks dpkg does not perform.
This can happen if something overwrote the file, if you restored from an inconsistent filesystem snapshot, or if the upgrade was interrupted.
Decision: Reinstall the affected packages immediately. If the mismatch returns, suspect disk corruption or a configuration management tool writing into system paths.
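A full `dpkg -V` run usually lists plenty of edited configuration files, which are expected and not what you are hunting. The sketch below keeps only checksum mismatches on non-conffiles. `modified_non_conffiles` is a hypothetical helper, and it assumes the default rpm-style verify output (nine flag characters, an optional `c` marker for conffiles, then the path) and paths without spaces.

```bash
# Sketch: from `dpkg -V` output on stdin, print paths whose checksum
# differs (third flag character is "5") and that are not conffiles.
modified_non_conffiles() {
  awk 'substr($1, 3, 1) == "5" && $2 != "c" { print $NF }'
}
```

Possible usage: `sudo dpkg -V | modified_non_conffiles`. A shared library in this list is a strong reinstall candidate.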
Task 15: Reinstall the damaged packages (surgical repair)
cr0x@server:~$ sudo apt-get install --reinstall -y libc6 libssl3
Reading package lists... Done
Building dependency tree... Done
0 upgraded, 0 newly installed, 2 reinstalled, 0 to remove and 0 not upgraded.
What it means: You’ve restored package-managed libraries to the expected bits.
Decision: Retest the failing service. If it still loads /usr/local/lib, you haven’t fixed the mismatch—only cleaned the baseline.
Pinpointing the exact mismatch (the real goal)
“Library mismatch” is a vague diagnosis. We want something you can put in an incident report without embarrassment:
“/usr/local/lib/libssl.so.3 built against OpenSSL X and glibc Y was selected via global ld.so.conf and crashed when called by myapp built against Debian libssl3.”
That level of specificity is achievable with a few more targeted checks.
Step A: Prove the wrong library is being selected
You already saw it in ldd and LD_DEBUG. Now prove it’s due to global loader config versus app-specific settings.
cr0x@server:~$ /lib64/ld-linux-x86-64.so.2 --list /usr/local/bin/myapp | head -n 30
linux-vdso.so.1 (0x00007ffda91f2000)
libssl.so.3 => /usr/local/lib/libssl.so.3 (0x00007f6b4d3c0000)
libcrypto.so.3 => /usr/local/lib/libcrypto.so.3 (0x00007f6b4cf30000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6b4cd4f000)
Decision: If the loader itself confirms the mapping, stop debating. Fix the mapping.
Step B: Check for RPATH/RUNPATH and old build habits
cr0x@server:~$ readelf -d /usr/local/bin/myapp | grep -E "RPATH|RUNPATH"
0x000000000000001d (RUNPATH) Library runpath: [/usr/local/lib]
What it means: The binary is hard-coded to prefer /usr/local/lib.
Even if you remove /usr/local/lib from ld.so.conf, this binary will still hunt there.
Decision: Rebuild the binary without that RUNPATH, or patch it with care (and document it like an adult).
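If you need to audit more than one binary, it helps to split the RPATH/RUNPATH value into one directory per line. This is a minimal sketch that assumes the `readelf -d` line format shown above; `runpath_entries` is a hypothetical helper name.

```bash
# Sketch: from `readelf -d` output on stdin, print one line per search
# directory, prefixed with the tag it came from (RPATH or RUNPATH).
runpath_entries() {
  sed -nE 's/.*\((RPATH|RUNPATH)\).*\[(.*)\].*/\1 \2/p' \
    | while read -r kind dirs; do
        printf '%s\n' "$dirs" | tr ':' '\n' | sed "s|^|$kind |"
      done
}
```

Possible usage: `readelf -d /usr/local/bin/myapp | runpath_entries`. Entries starting with `$ORIGIN` are scoped to the application's own directory; entries pointing at shared locations such as /usr/local/lib are the ones to question.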
Step C: Identify symbol version expectations
Sometimes the mismatch is subtle: the library exists, loads, exports the right names, but the versions don’t match what the binary expects.
cr0x@server:~$ objdump -T /usr/local/bin/myapp | grep -E "OPENSSL_|GLIBC_" | head -n 20
0000000000000000 DF *UND* 0000000000000000 (OPENSSL_3.3.0) EVP_MD_fetch
0000000000000000 DF *UND* 0000000000000000 (OPENSSL_3.3.0) SSL_CTX_new
0000000000000000 DF *UND* 0000000000000000 (GLIBC_2.38) memcpy
cr0x@server:~$ objdump -T /usr/local/lib/libssl.so.3 | grep -E "OPENSSL_" | head -n 20
000000000003d2a0 g DF .text 00000000000000f5 OPENSSL_3.2.0 SSL_CTX_new
0000000000047b10 g DF .text 0000000000000120 OPENSSL_3.2.0 EVP_MD_fetch
What it means: The binary wants OPENSSL_3.3.0 symbols. The library exports OPENSSL_3.2.0.
Usually this would fail at load with “version not found”. But mismatches can hide if the binary is less strict, or if plugins call into different versions indirectly.
Either way, you now have a crisp incompatibility statement.
Decision: Align OpenSSL versions: use Debian’s libssl that matches the binary’s build assumptions, or rebuild the binary against the library you intend to ship.
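The comparison above can be done mechanically: collect the version tags the binary imports and subtract the tags the library defines. Below is a sketch under stated assumptions. `version_tags` and `missing_versions` are hypothetical helpers, the inputs are saved `objdump -T` outputs, and tags are assumed to look like PREFIX_1.2.3.

```bash
# Sketch: list version tags with the given prefix found on stdin.
version_tags() {
  grep -oE "${1}_[0-9]+(\.[0-9]+)*" | LC_ALL=C sort -u
}

# Sketch: print tags the binary requires (undefined symbols) that the
# library does not define.
#   $1 = tag prefix (e.g. OPENSSL)
#   $2 = file holding `objdump -T` output of the binary
#   $3 = file holding `objdump -T` output of the library
missing_versions() {
  provided=$(grep -vF '*UND*' "$3" | version_tags "$1")
  grep -F '*UND*' "$2" | version_tags "$1" | while IFS= read -r tag; do
    printf '%s\n' "$provided" | grep -qxF -- "$tag" || printf '%s\n' "$tag"
  done
}
```

Possible usage: save `objdump -T /usr/local/bin/myapp` and `objdump -T /usr/local/lib/libssl.so.3` to two files, then run `missing_versions OPENSSL bin.txt lib.txt`. Each printed tag is a line you can paste straight into the incident report.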
Step D: Detect “two copies of the same dependency” in one process
This is where things get spicy: two copies of a library with similar symbols, loaded from different paths, in the same process.
That can happen with plugins, dlopen, and vendor SDKs.
cr0x@server:~$ sudo gdb -q /usr/local/bin/myapp -ex 'set pagination off' -ex 'run --serve' -ex 'info proc mappings' -ex 'quit'
Reading symbols from /usr/local/bin/myapp...
(No debugging symbols found in /usr/local/bin/myapp)
Starting program: /usr/local/bin/myapp --serve
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 23102
Mapped address spaces:
Start Addr End Addr Size Offset Perms objfile
0x7f2d2a860000 0x7f2d2a8e2000 0x82000 0x0 r-xp /usr/local/lib/libssl.so.3
0x7f2d29f00000 0x7f2d29f82000 0x82000 0x0 r-xp /usr/lib/x86_64-linux-gnu/libssl.so.3
What it means: Both copies are mapped. That’s not “a little bad.” That’s “undefined behavior with a suit on.”
Decision: Find who is loading the second copy (often a plugin or dlopen path) and eliminate duplication. One process, one libssl.
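The same check works on a live process without gdb by reading its memory map. A minimal sketch follows, assuming the standard /proc/PID/maps layout where the sixth field is the mapped file. `duplicate_libs` is a hypothetical helper and compares file basenames only.

```bash
# Sketch: from /proc/<pid>/maps content on stdin, report shared objects
# whose basename is mapped from more than one distinct path.
duplicate_libs() {
  awk '$6 ~ /\.so(\.|$)/ { print $6 }' | LC_ALL=C sort -u \
    | awk -F/ '{ n[$NF]++; p[$NF] = p[$NF] " " $0 }
               END { for (s in n) if (n[s] > 1) print s ":" p[s] }'
}
```

Possible usage: `duplicate_libs < /proc/$PID/maps`, with `$PID` set to the running service's process ID. Because it only compares basenames, two copies with different file names (for example differing patch-level suffixes) will slip past it.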
Step E: Find the loader trigger: environment, systemd unit, wrapper scripts
cr0x@server:~$ systemctl cat myapp.service
# /etc/systemd/system/myapp.service
[Service]
ExecStart=/usr/local/bin/myapp --serve
Environment=LD_LIBRARY_PATH=/usr/local/lib
What it means: Someone set LD_LIBRARY_PATH in a systemd unit. That defeats most of Debian’s packaging safety rails.
Decision: Remove the environment override and ship dependencies properly. If you must use LD_LIBRARY_PATH, scope it tightly and document why.
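Unit files accumulate drop-ins, so it is worth extracting loader-related variables rather than reading every override by eye. The sketch below scans unit text for the three variables that most directly change loader behavior. `loader_env_overrides` is a hypothetical helper, and it only inspects `Environment=` lines; files referenced through `EnvironmentFile=` are not followed.

```bash
# Sketch: from unit file text on stdin, print loader-affecting variable
# assignments found on Environment= lines.
loader_env_overrides() {
  grep -E '^[[:space:]]*Environment=' \
    | grep -oE '(LD_LIBRARY_PATH|LD_PRELOAD|LD_AUDIT)=[^[:space:]"]*'
}
```

Possible usage: `systemctl cat myapp.service | loader_env_overrides`. No output means the unit itself is clean; wrapper scripts and the invoking user's shell profile still need a separate look.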
Step F: Confirm the package state is consistent across the upgrade
cr0x@server:~$ dpkg -l | grep -E '^(iF|iU)'
iU libgcc-s1:amd64 14.2.0-3 amd64 GCC support library
cr0x@server:~$ sudo dpkg --configure -a
Setting up libgcc-s1:amd64 (14.2.0-3) ...
Processing triggers for libc-bin (2.38-6) ...
What it means: An “unconfigured” package can leave half-written state, including old library cache entries, triggers not run, and inconsistent dependencies.
Decision: Always complete configuration and triggers after an interrupted upgrade, then re-run ldconfig if needed.
Step G: Validate glibc and libstdc++ ABI expectations (common post-upgrade crash vector)
If your crashing frames are inside libstdc++.so.6, libgcc_s.so.1, or memory allocators, you might be in C++ ABI territory.
cr0x@server:~$ ldd /usr/local/bin/myapp | grep -E "libstdc\+\+|libgcc_s"
libstdc++.so.6 => /usr/local/lib/libstdc++.so.6 (0x00007f7a1a400000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7a1a1e0000)
cr0x@server:~$ strings /usr/local/lib/libstdc++.so.6 | grep -E "GLIBCXX_" | tail -n 5
GLIBCXX_3.4.29
GLIBCXX_3.4.30
GLIBCXX_3.4.31
GLIBCXX_3.4.32
GLIBCXX_3.4.33
What it means: A local libstdc++ overriding distro libstdc++ is a frequent cause of “crashes after upgrade,” especially when plugins were built with a different toolchain.
The GLIBCXX_ versions help you see what the library claims to support.
Decision: Avoid overriding libstdc++ globally. Use the distro toolchain end-to-end, or vendor the entire runtime in a controlled container/sandbox.
Joke #2: If you’re debugging a segfault with LD_LIBRARY_PATH set system-wide, you’re not troubleshooting—you’re reenacting a horror movie.
Three corporate-world mini-stories (anonymized, painfully real)
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company ran a fleet of Debian servers for a customer-facing API. The upgrade to Debian 13 was staged: first the OS, later the application.
The team assumed the app binaries were “self-contained enough” because they shipped a tarball with a few .so files beside the executable.
The first sign of trouble wasn’t downtime. It was noise: a new trickle of 502s and a few workers dying per hour. The kernel logs blamed libcrypto.
The developers shrugged—“we didn’t touch crypto”—and the incident sat in the uncomfortable zone: not broken enough for a rollback, not stable enough to ignore.
The wrong assumption was subtle: they believed “bundled libraries are only used by our app.” In reality, a prior admin had added /opt/vendor/lib
to /etc/ld.so.conf.d years earlier to satisfy a legacy tool. After the OS upgrade, the loader started selecting the vendor’s OpenSSL first
for multiple processes, not just the API workers.
The fix was boring and immediate: remove the global loader path, run ldconfig, restart affected services, and ensure the API workers use Debian’s OpenSSL
(or explicitly ship their own in an isolated runpath that doesn’t pollute the host).
The lesson that stuck: shared libraries are shared. If you teach the loader a new trick globally, every process learns it, including the ones you like.
Mini-story 2: The optimization that backfired
Another organization optimized their build pipeline by producing a single “portable” binary package. They built on a newer toolchain and linked dynamically
against a curated set of libraries copied into /usr/local/lib on each host. It reduced build artifacts and made deployments faster.
It also made upgrades fragile.
After moving to Debian 13, they saw intermittent segfaults under load. The failures correlated with TLS handshakes and HTTP/2 traffic,
and the stack traces varied. Some pointed at libssl, others at memory allocation. Engineers chased race conditions for days.
They added retries. They tuned thread pools. They even toggled compiler flags. The segfaults laughed politely and continued.
Eventually someone ran LD_DEBUG=libs and noticed a pattern: sometimes the process loaded Debian’s libcrypto, sometimes the local one,
depending on how plugins were loaded and which paths were active. Two copies of OpenSSL in one address space is not a supported configuration; it’s a chaos generator.
The “optimization” was using shared host paths as a deployment substrate. It seemed efficient, but it blurred ownership and versioning boundaries.
The fix required undoing that: either statically link where permissible, or ship a proper isolated runtime (container, chroot, or at least per-service runpath),
and stop mixing host and vendor copies.
The postmortem conclusion was blunt: convenience is a dependency. If you optimize by bypassing the distro’s dependency solver, you become the solver.
Mini-story 3: The boring but correct practice that saved the day
A financial services shop had a rule: before any major Debian upgrade, take a filesystem snapshot, capture dpkg selections,
and export the exact APT pinning state. No one loved this rule. It felt like paperwork for computers.
During their Debian 13 upgrade, a background job started crashing with segfaults inside libpq. The application had plugins, and
the plugin vendor provided a binary-only extension compiled “for Debian.” That phrase means nothing without details, but procurement was happy.
The SRE on call compared pre- and post-upgrade library inventories and noticed that the vendor installer had dropped a copy of libstdc++.so.6
into /usr/local/lib on a subset of hosts weeks earlier. Most boxes didn’t have it. The ones that did were crashing.
Because they had snapshots, they could diff loader cache state and confirm the change quickly.
The fix was straightforward: remove the stray local library, restore Debian’s package ownership, and push a policy that vendor installers must not modify global loader paths.
The vendor extension was rebuilt properly afterward, but the incident was stopped within an hour because the team could prove “what changed” without guessing.
The boring practice wasn’t heroics. It was hygiene: snapshots, inventory, and reproducibility. It doesn’t make you faster every day. It makes you fast when it counts.
Common mistakes: symptom → root cause → fix
Here are the repeat offenders I see after Debian upgrades. The goal is pattern recognition: you see the symptom, you jump to the right subsystem,
and you avoid “random debugging.”
1) Symptom: Only locally deployed apps crash; Debian services are fine
- Root cause: Local binaries built against older/newer libs; RUNPATH points to /usr/local/lib; bundled libs shadow distro libs.
- Fix: Remove RUNPATH to global locations; rebuild against Debian 13 toolchain; ship dependencies in an isolated runtime directory, not global loader paths.
2) Symptom: Many unrelated binaries segfault shortly after upgrade
- Root cause: Partial upgrade or held core packages (glibc, libgcc); inconsistent loader cache; interrupted dpkg configuration.
- Fix: Complete upgrade (dpkg --configure -a, apt-get dist-upgrade), remove holds, reinstall core libs, rebuild initramfs if relevant.
3) Symptom: Backtrace points into libssl/libcrypto but app code changed little
- Root cause: Two copies of OpenSSL loaded; wrong libssl.so resolved from /usr/local or vendor directories; plugin built against different OpenSSL.
- Fix: Ensure one OpenSSL per process; remove global overrides; rebuild plugins; use distro libssl or vendor a full isolated stack.
4) Symptom: Crash only happens when a plugin/module loads
- Root cause: Plugin ABI mismatch (C++ ABI, struct layout, glibc expectations). The host app upgraded; plugin didn’t.
- Fix: Rebuild plugin against the upgraded headers/libs; enforce plugin version compatibility checks; avoid binary-only plugins unless you control the toolchain.
5) Symptom: dpkg -V shows mismatches in libc or other core libs
- Root cause: File overwritten, corruption, interrupted upgrade, or configuration management writing into system directories.
- Fix: Reinstall packages; investigate who modified files (audit config management, immutable infrastructure rules); run disk health checks if mismatches persist.
6) Symptom: Loader errors like “version `GLIBC_2.xx' not found” or “undefined symbol”
- Root cause: Hard mismatch in required symbol version; binary built on newer glibc/libstdc++ than target.
- Fix: Build on the oldest target you need (or inside a Debian 13 build environment); don’t copy random libc.so.6 around; align toolchain.
Checklists / step-by-step plan
Checklist A: Triage in the first 15 minutes
- Get one clear crash signature from journalctl (faulting library name, service, frequency).
- Confirm whether the crashing executable is distro-managed (dpkg -S) or local (/usr/local, /opt).
- Grab a core dump with coredumpctl and list loaded shared libraries in gdb.
- Run ldd and readelf -d to identify RPATH/RUNPATH and library selection.
- Check for LD_LIBRARY_PATH in systemd units, wrappers, cronjobs.
Checklist B: Confirm package consistency (stop living in a partial upgrade)
- Check holds: apt-mark showhold. Remove holds unless you have a fully tested reason.
- Check broken state: dpkg --audit, then dpkg --configure -a.
- Simulate full upgrade: apt-get -s dist-upgrade.
- Verify integrity: dpkg -V for libc and the suspect libraries. Reinstall anything mismatching.
Checklist C: Fix the mismatch with minimal collateral damage
- Preferred: stop overriding distro libraries globally. Remove /usr/local/lib from ld.so.conf if it’s not strictly required.
- Remove or rename unmanaged conflicting libraries (keep a rollback copy somewhere safe, not in loader paths).
- Run ldconfig and restart only affected services (or reboot if core loader state is questionable).
- If the app needs custom libraries, deploy them in an app-owned directory and reference via RUNPATH scoped to that app, or use containers.
- Rebuild plugins/modules against Debian 13 libraries and toolchain; treat binary-only plugins as suspect until proven compatible.
Checklist D: Make it not happen again
- Policy: no vendor installer may modify /etc/ld.so.conf.d or drop libs into /usr/local/lib without review.
- Inventory: periodically scan for unmanaged ELF libs in loader paths.
- Upgrade runbook: enforce “no holds for glibc/libgcc/libstdc++ during release upgrades.”
- Build discipline: build in a Debian 13 environment; publish a manifest of required library versions.
FAQ
1) Why does a library mismatch cause a segfault instead of a clean loader error?
Loader errors happen when required symbols can’t be resolved. Segfaults happen when symbols resolve but the ABI contract is broken:
wrong struct layout, wrong expectations about allocation ownership, mismatched C++ types, or duplicate libraries in one process.
2) Is ldd reliable for diagnosing what will happen at runtime?
It’s a good first glance, not gospel. ldd can be affected by environment variables and doesn’t show all dlopen/plugin behavior.
For runtime truth, use LD_DEBUG=libs and inspect a live process’s mappings or a core dump.
3) Can Debian 13 upgrades legitimately introduce segfaults in stable software?
It’s possible but less common than people think. Most post-upgrade segfaults are triggered by local overrides, plugins, partial upgrades,
or binaries built on a different baseline. Treat “Debian broke it” as a hypothesis you must earn with evidence.
4) Should I “fix” it by symlinking /usr/local/lib/libssl.so.3 to Debian’s libssl?
No. That’s a brittle hack that makes provenance opaque and breaks package management expectations. Remove the override and let the loader select the packaged library,
or rebuild the app properly. Symlink hacks are how future-you ends up doing incident response on a holiday.
5) What if I must ship a custom OpenSSL for compliance reasons?
Then ship it in an isolated directory owned by the application, and ensure only that application uses it.
Avoid global loader path changes. Make the dependency boundary explicit (RUNPATH scoped to the app, container runtime, or chroot).
6) How do I tell if two copies of the same library are loaded?
Use gdb (info proc mappings), inspect /proc/$PID/maps, or analyze the core dump mappings.
If you see both /usr/local/lib/libssl.so.3 and /usr/lib/x86_64-linux-gnu/libssl.so.3, you’ve found a serious root cause.
7) Why do these crashes sometimes show up only under load?
ABI mismatches and duplicate libraries can behave like race conditions because memory layout and timing change with concurrency.
Under light load, you may not hit the corrupting code path. Under load, you will—reliably enough to ruin your week.
8) What’s the cleanest remediation when the crashing binary is in /usr/local?
Rebuild it against Debian 13 in a controlled environment, remove hard-coded RUNPATH to global locations, and remove reliance on global LD_LIBRARY_PATH.
If rebuilding isn’t possible, isolate the runtime (container/chroot) so it stops interfering with the host.
9) Do core dumps create a security risk?
Yes: cores can contain secrets. Use systemd-coredump storage policies, restrict access, and consider disabling cores for sensitive services.
But for this class of issue, one well-handled core dump can save days of guessing.
10) When should I roll back instead of debugging forward?
If core system libraries show integrity mismatches you can’t explain, if multiple critical services are crashing, or if you suspect storage corruption,
rolling back to a known-good snapshot is often the safest move. Debug on a clone, not on the bleeding host.
Conclusion: next steps you can execute
“Segfault after upgrade” stops being scary once you treat it like an inventory problem: which binary loaded which exact shared objects from which exact paths,
and do those objects agree on ABI and symbol versions.
If you take nothing else from case #55, take this: stop letting /usr/local/lib (or vendor paths) override Debian libraries globally.
Pinpoint the mismatch with coredumpctl + gdb + LD_DEBUG, prove it with file paths and symbol versions,
and then fix it by restoring a single consistent library set.
Practical next steps:
- Pick one crash and extract the loaded library list from the core dump.
- Use LD_DEBUG=libs,versions to prove the loader’s choice and the version requirements.
- Eliminate global overrides (ld.so.conf.d, LD_LIBRARY_PATH, RUNPATH to shared locations).
- Reinstall and verify core packages if dpkg -V indicates mismatches.
- Rebuild or isolate anything that isn’t managed by Debian packaging.