Docker Vulnerability Scanning: What to Trust and What Is Noise

You ran a container scan and it screamed: “327 vulnerabilities, 14 critical.”
Meanwhile the service is a tiny Go binary, no shell, no package manager, and it doesn’t even open an inbound port.
The scanner is panicking anyway. Your security team wants a ticket. Your on-call wants sleep.

Container vulnerability scanning is valuable—when you treat it like an instrument panel, not a prophecy.
This is how you separate signal from noise, decide what to fix, and build a workflow that survives contact with production.

What scanners actually know (and what they don’t)

Most Docker vulnerability scanners work by inspecting image contents and matching what they find against vulnerability databases.
That sounds straightforward until you remember what a container image is: a pile of files plus metadata layers, built from an ecosystem that loves abstraction.
Scanners don’t “understand your app.” They infer, match, and guess.

What scanners are good at

  • Detecting known vulnerable package versions in OS package managers (dpkg, rpm, apk) and common language ecosystems.
  • Highlighting stale base images where upstream has shipped fixes but you haven’t rebuilt.
  • Inventorying components (SBOM-like behavior) so you can stop arguing about what’s “in the image.”
  • Finding obvious foot-guns: embedded private keys, world-writable files, or a surprise SSH server—depending on the tool.

Where scanners routinely lie (or at least mislead)

  • Reachability: a vulnerability in a library that’s present but never loaded will still be reported.
  • Backports: enterprise distros patch vulnerabilities without bumping versions in a way naive matching expects.
  • “Fixed in” ambiguity: the scanner says “fixed in 1.2.3” but your distro ships “1.2.2-ubuntu0.4” with the patch backported.
  • Context-free severity: CVSS assumes a generic deployment. Your container might be non-root, no network, read-only filesystem, seccomp locked down. CVSS does not care.
  • Ghost packages: build-time dependencies left behind in the final layer because someone forgot multi-stage builds.

Vulnerability scanning is a necessary signal generator, not a risk calculator. The risk calculation is your job.
Or, more precisely: your job is to create a system that computes risk consistently so humans don’t improvise at 2 a.m.

One quote I keep taped to the mental dashboard, from John Allspaw: “Automation should be a force multiplier, not a source of mystery.”

Facts and historical context (why this mess exists)

If vulnerability scanning feels like arguing with a very confident spreadsheet, that’s because it is. Some context helps you predict failure modes.

  1. CVE started in 1999 as a naming system so vendors could talk about the same bug without tribal translation.
  2. CVSS v2 (2005) and v3 (2015) standardized severity scoring, but it’s still an abstract model that can’t see your runtime mitigations.
  3. Containers popularized “immutable infrastructure”, but many teams still treat images like mutable VMs: patch inside, ship, forget.
  4. Distros backport fixes as policy (Debian, Ubuntu, RHEL families), which breaks simple “version < fixed_version” logic.
  5. Alpine’s musl libc ecosystem changed the dependency landscape; smaller images, but different compatibility and sometimes different vulnerability narratives.
  6. Heartbleed (2014) taught everyone a lesson: “just update OpenSSL” is not a strategy when you don’t even know where OpenSSL is.
  7. Log4Shell (2021) pushed SBOMs into the corporate mainstream because people were scanning everything to find where the jar lived.
  8. Executive Order 14028 (2021) accelerated SBOM requirements and “prove what’s inside” expectations in procurement pipelines.
  9. VEX emerged because SBOMs without “is it exploitable?” context create alert fatigue at industrial scale.

Fact #10, unofficial but painfully real: security tooling vendors discovered that “more findings” often sells better than “more accurate findings.”
That’s a market reality, not a conspiracy.

A practical trust model: three layers of truth

When you’re triaging findings, you need a hierarchy. Not for philosophy—so you can stop bikeshedding.

Layer 1: What’s in the image (inventory truth)

This is what scanners can usually do well: enumerate packages, libraries, and sometimes files.
The first question is not “is it critical?” The first question is “is it actually there?”

Trust signals:

  • Results derived from OS package metadata (dpkg/rpm/apk database) rather than filename guessing.
  • SBOM generation with checksums and package identifiers (purl, CPE, SPDX identifiers).
  • Reproducible builds or at least deterministic Dockerfiles so you can confirm the scan is for what you think it is.

Layer 2: Is it vulnerable in that distro/ecosystem (advisory truth)

A CVE is not a patch note. It’s a label.
“Vulnerable” depends on distro patch status, backports, compile flags, disabled features, and vendor decisions.

Trust signals:

  • Distro security trackers and vendor advisories referenced by the scanner, not only NVD.
  • Awareness of backporting: mapping by distro package release strings, not just upstream semver.
  • Tooling that distinguishes “unfixed,” “fixed,” “affected but mitigated,” and “not affected.”
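
A quick way to gather that evidence yourself is to read the changelog the distro ships inside the image. A minimal sketch, assuming a Debian-based image where changelogs have not been stripped (slim variants often prune /usr/share/doc), using libssl3 as the example package:

cr0x@server:~$ # Debian/Ubuntu packages document backported CVE fixes in their changelog.
cr0x@server:~$ docker run --rm myapp:prod sh -lc \
    'zcat /usr/share/doc/libssl3/changelog.Debian.gz | head -n 40 | grep -iE "cve-[0-9]{4}-[0-9]+"'

If the CVE you are triaging appears in the changelog for the installed release string, record that as evidence for a "fixed via backport" decision instead of arguing with the scanner.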

Layer 3: Is it exploitable in your runtime (reality truth)

Exploitability is where scanning stops and operations begins:
network exposure, privileges, file system write access, secrets in env vars, kernel hardening, egress controls, and what the code path actually does.

Trust signals:

  • Runtime configuration known: user IDs, Linux capabilities, seccomp/apparmor, read-only rootfs, no-new-privileges.
  • Exposure known: which ports are reachable, ingress rules, service mesh policy, egress restrictions.
  • Evidence-based reachability analysis (SCA with call graph) for language dependencies where feasible.
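
Much of Layer 3 is decided by how the container is launched. A minimal sketch of a hardened run, assuming the myapp:prod image and UID 10001 used elsewhere in this article; the flags are standard docker run options and the port is illustrative:

cr0x@server:~$ # Non-root user, no capabilities, read-only rootfs with tmpfs scratch space,
cr0x@server:~$ # and no privilege escalation via setuid binaries.
cr0x@server:~$ docker run -d --name myapp-prod \
    --user 10001:10001 \
    --cap-drop ALL \
    --read-only --tmpfs /tmp \
    --security-opt no-new-privileges:true \
    -p 8080:8080 \
    myapp:prod

Each of those settings is also evidence you can cite when you downgrade a finding; the inspection commands in the tasks below confirm them on a running container.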

If your pipeline treats Layer 1 as Layer 3, you will drown in alerts and still get popped by something boring.

Common noise categories (and when they stop being noise)

1) “Unfixed” CVEs in base image packages you don’t use

Example: your image includes libxml2 as a transitive dependency, scanner flags CVEs, but your app never parses XML.
That’s often “tolerable risk,” not “ignore forever.”

When it becomes real:

  • The vulnerable library is used by system components you do expose (nginx, curl, git, package manager calls).
  • The container runs privileged or shares host namespaces.
  • The library is commonly used via unexpected paths (e.g., image processing libraries through a user upload feature).

2) Backport false positives

Debian and Ubuntu frequently backport patches without bumping the package to the upstream fix version.
A scanner that only compares upstream versions will claim you’re vulnerable when you’re not.

When it becomes real:

  • You’re on a distro without backport culture, or you’re using upstream binaries.
  • The package vendor explicitly marks it as affected and unfixed.

3) Language dependency CVEs with no reachable code path

JavaScript, Java, Python—ecosystems where a dependency graph can be a small forest fire.
Scanners will flag libraries present in node_modules even if the code is never imported at runtime.

When it becomes real:

  • You ship the entire dependency tree into production (common in Node images).
  • There’s dynamic loading / reflection that makes reachability analysis hard.
  • You have user-controlled inputs that could hit obscure parsers.

4) CVEs that require local user access, in a container that runs as non-root

Many “local privilege escalation” CVEs are still relevant if an attacker can execute code in your container.
If they already have code execution, you’re already having a bad day.
But LPE matters because it can become container escape or lateral movement.

5) Kernel CVEs reported against images

Containers share the host kernel. If a scanner reports a kernel CVE “in the image,” treat that as informational at best.
Your fix is on the host nodes, not in the Dockerfile.

Joke #1: CVSS scores are like weather forecasts—useful, but if you plan surgery based on them, you’re going to have paperwork.

A triage framework that works under pressure

Here’s the workflow I want teams to use. It’s not perfect. It’s consistent, and consistency beats heroics.

Step A: Confirm you scanned the right artifact

You’d think this is obvious. It isn’t.
People scan :latest, or scan a local build while production runs an older digest.
Always tie findings to an immutable digest.

Step B: Reduce to “exploitable + exposed + unmitigated”

Severity is not a decision; it’s a hint.
Your decision should be driven by:

  • Exposed surface: is the vulnerable component reachable from outside the trust boundary?
  • Exploit maturity: public exploit? weaponized? just theoretical?
  • Privileges: can it lead to root in container? escape? data access?
  • Mitigations: config disables feature, network policies, sandboxing, WAF, read-only rootfs.
  • Time-to-fix: can you rebuild a base image today, or will it take a month to untangle?

Step C: Decide between rebuild, patch, or accept

  • Rebuild when the fix is in upstream packages and you just need a new base image.
  • Patch when you control the code and the vulnerability is in your dependency tree.
  • Accept (with record) when it’s not exploitable, not exposed, or mitigated—and you can prove it (VEX helps).

Step D: Prevent recurrence with build hygiene

Most “scanner noise” is self-inflicted: giant images, build tools in runtime layers, old base images, undisciplined dependency management.
Fix the pipeline and the noise drops.
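
Most of that hygiene is a single Dockerfile change. A minimal multi-stage sketch, assuming the Go service from the intro; the golang and distroless base images are examples and ./cmd/myapp is a hypothetical package path:

cr0x@server:~$ cat > Dockerfile <<'EOF'
# Build stage: compilers, headers, and caches stay here and never ship.
FROM golang:1.22-bookworm AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /myapp ./cmd/myapp

# Runtime stage: just the binary plus the certs/tzdata the distroless base provides.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /myapp /app/myapp
ENTRYPOINT ["/app/myapp"]
EOF
cr0x@server:~$ docker build --pull -t myapp:prod-slim .

The runtime stage has no shell and no package manager, so whole categories of findings disappear; Task 14 below shows how to measure that difference.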

Hands-on tasks: commands, outputs, and decisions (12+)

These are practical tasks you can run on a workstation or CI runner with Docker available.
Each one includes what the output means and what decision you make.

Task 1: Identify the exact image digest you’re scanning

cr0x@server:~$ docker image inspect --format '{{.RepoTags}} {{.Id}}' myapp:prod
[myapp:prod] sha256:9d2c7c0a0d4e2fd0f2c8a7f0b6b1a1f2a3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8

Meaning: The image ID is content-addressed and immutable. Scan an immutable reference, not a tag that can move. Note that the local image ID is not the same hash as the registry digest (RepoDigests) that deployments reference, so when you compare against production in the next task, compare digests of the same kind.

Decision: If your scanner reports findings without a digest reference, treat the report as “possibly wrong artifact” until proven otherwise.

Task 2: Confirm what production actually runs (digest, not tag)

cr0x@server:~$ docker ps --no-trunc --format 'table {{.Names}}\t{{.Image}}\t{{.ID}}'
NAMES          IMAGE                                                                 ID
myapp-prod     myregistry.local/myapp@sha256:2b1d...e9c3                             61c9f1c1f9d9b0a3b2c1d7b0e3a1c8f9d2e4b5a6c7d8e9f0a1b2c3d4e5f6a7b8

Meaning: Running container references a digest from your registry. If the digest differs from what you scanned, your scan is irrelevant.

Decision: Re-scan the running digest, or update deployment to match the scanned digest before filing remediation tickets.

Task 3: Quick inventory check—what OS are we dealing with?

cr0x@server:~$ docker run --rm myapp:prod cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"

Meaning: Distro context matters. Debian backports? Yes, routinely and as policy. Alpine? Different advisory database and patch cadence.

Decision: Configure scanners to use distro-specific feeds where possible; don’t rely solely on generic upstream “fixed in” data.

Task 4: Count packages in the runtime image (noise predictor)

cr0x@server:~$ docker run --rm myapp:prod dpkg-query -W | wc -l
143

Meaning: More packages usually means more CVEs. A runtime image with 143 packages might be fine, but it’s a clue.

Decision: If this number is hundreds or thousands for a simple service, plan a multi-stage build or distroless move; you’re paying vulnerability tax for convenience.

Task 5: Find the biggest layers (often where build tools got baked in)

cr0x@server:~$ docker history --no-trunc myapp:prod | head
IMAGE          CREATED        CREATED BY                                      SIZE      COMMENT
sha256:9d2c…   3 days ago     /bin/sh -c apt-get update && apt-get install…   312MB
sha256:1aa3…   3 days ago     /bin/sh -c useradd -r -u 10001 appuser          1.2MB
sha256:2bb4…   3 days ago     /bin/sh -c #(nop)  COPY file:app /app           24MB

Meaning: That 312MB layer smells like compilers, headers, and whatever else someone installed “temporarily.”

Decision: Split build and runtime stages; remove package manager caches; keep runtime minimal to reduce CVE surface.

Task 6: Run Trivy scan with useful defaults (and read it correctly)

cr0x@server:~$ trivy image --severity HIGH,CRITICAL --ignore-unfixed myapp:prod
2026-01-03T10:12:41Z	INFO	Detected OS: debian
2026-01-03T10:12:41Z	INFO	Number of language-specific files: 1
myapp:prod (debian 12)
=================================
Total: 3 (HIGH: 2, CRITICAL: 1)

libssl3 3.0.11-1~deb12u2   (CRITICAL)  CVE-202X-YYYY  fixed in 3.0.11-1~deb12u3
zlib1g  1:1.2.13.dfsg-1    (HIGH)      CVE-202X-ZZZZ  fixed in 1:1.2.13.dfsg-2

Meaning: --ignore-unfixed removes “no patch exists” items, which often reduces noise in day-to-day ops.
The remaining findings are actionable because a fixed version exists in the distro channel.

Decision: If fixed versions exist, rebuild with updated base packages. If not, assess exploitability and compensating controls; don’t block releases on “unfixed” by default.
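
If you gate CI on this, gate on the actionable subset and scan the immutable reference. A sketch, reusing the truncated digest placeholder from Task 2; --exit-code makes Trivy return non-zero when matching findings exist:

cr0x@server:~$ # Fail the pipeline only on CRITICAL findings that already have a fix available.
cr0x@server:~$ trivy image --exit-code 1 --severity CRITICAL --ignore-unfixed \
    myregistry.local/myapp@sha256:2b1d...e9c3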

Task 7: Generate an SBOM and treat it as an artifact

cr0x@server:~$ syft myapp:prod -o spdx-json > myapp.spdx.json
cr0x@server:~$ jq -r '.packages[0].name, .packages[0].versionInfo' myapp.spdx.json
base-files
12.4+deb12u5

Meaning: You now have a machine-readable inventory tied to an image build.

Decision: Store SBOMs alongside images (in artifact storage). Use them to diff builds and to answer “what changed?” in minutes, not meetings.
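
Diffing two SBOMs is a one-liner once they are stored. A sketch assuming two syft SPDX outputs, myapp-old.spdx.json and myapp-new.spdx.json (hypothetical filenames), flattened to name@version pairs:

cr0x@server:~$ # Flatten each SBOM to sortable name@version lines, then diff the builds.
cr0x@server:~$ jq -r '.packages[] | "\(.name)@\(.versionInfo)"' myapp-old.spdx.json | sort > old.txt
cr0x@server:~$ jq -r '.packages[] | "\(.name)@\(.versionInfo)"' myapp-new.spdx.json | sort > new.txt
cr0x@server:~$ diff old.txt new.txt | head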

Task 8: Correlate scanner findings with installed package status

cr0x@server:~$ docker run --rm myapp:prod dpkg-query -W libssl3
libssl3	3.0.11-1~deb12u2

Meaning: Confirms what’s installed, not what the scanner guessed.

Decision: If the installed version already includes a backported fix (vendor says fixed), mark scanner entry as false positive and document it (preferably as VEX).

Task 9: Check if the container runs as root (risk multiplier)

cr0x@server:~$ docker inspect --format '{{.Config.User}}' myapp:prod

Meaning: Empty output means the image sets no USER, so the container runs as root unless the runtime overrides it. Many exploits go from “annoying” to “catastrophic” when you hand them root in the container.

Decision: If it runs as root, fix it unless there’s a strong reason. Add USER 10001 (or similar) and handle file permissions properly.

Task 10: Validate Linux capabilities at runtime

cr0x@server:~$ docker inspect --format '{{json .HostConfig.CapAdd}} {{json .HostConfig.CapDrop}}' myapp-prod
null ["ALL"]

Meaning: CapDrop ["ALL"] is a strong hardening move. If you see added capabilities (NET_ADMIN, SYS_ADMIN), the blast radius increases.

Decision: If capabilities are added “just in case,” remove them and test. Capabilities should be treated like root passwords: issued sparingly.

Task 11: Confirm read-only filesystem configuration (mitigation evidence)

cr0x@server:~$ docker inspect --format '{{.HostConfig.ReadonlyRootfs}}' myapp-prod
true

Meaning: Read-only rootfs reduces exploit persistence and blocks many write-then-exec attack chains.

Decision: If false, consider making it true and mounting only required writable paths (/tmp, app cache) as tmpfs/volumes.

Task 12: Identify which ports are actually exposed/listening

cr0x@server:~$ docker exec myapp-prod ss -lntp
State  Recv-Q Send-Q Local Address:Port  Peer Address:Port Process
LISTEN 0      4096   0.0.0.0:8080       0.0.0.0:*       users:(("myapp",pid=1,fd=7))

Meaning: One listener on 8080. That’s your primary remote attack surface.

Decision: If scanner findings target components not involved in this path (e.g., SSH, FTP libs), downgrade urgency unless there’s an internal exploit chain.

Task 13: Detect “shell present” and other convenience tools (attack surface)

cr0x@server:~$ docker run --rm myapp:prod sh -lc 'command -v bash; command -v curl; command -v gcc; true'
/usr/bin/bash
/usr/bin/curl
/usr/bin/gcc

Meaning: That’s a lot of tooling for a runtime container. Great for debugging. Also great for attackers.

Decision: Move tooling to a debug image or ephemeral toolbox container. Keep production runtime images boring.
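
When you do need tooling against a running container, attach it on demand instead of shipping it. A sketch using busybox as an example toolbox image; any debug image you trust works the same way:

cr0x@server:~$ # Share the app container's network and PID namespaces from an ephemeral toolbox,
cr0x@server:~$ # then throw the toolbox away when you're done.
cr0x@server:~$ docker run --rm -it \
    --network container:myapp-prod \
    --pid container:myapp-prod \
    busybox:1.36 sh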

Task 14: Compare two images to see if the CVE reduction is real

cr0x@server:~$ trivy image --severity HIGH,CRITICAL myapp:prod | grep -E 'Total:|CRITICAL|HIGH' | head
Total: 17 (HIGH: 13, CRITICAL: 4)
cr0x@server:~$ trivy image --severity HIGH,CRITICAL myapp:prod-slim | grep -E 'Total:|CRITICAL|HIGH' | head
Total: 4 (HIGH: 3, CRITICAL: 1)

Meaning: The slim image reduced findings materially. Now confirm functionality and operational requirements (logging, CA certs, timezone files).

Decision: If the slim image passes integration tests and still supports observability, promote it. If not, you learned which dependencies you truly need.

Task 15: Detect “stale rebuild” risk (base image updates not pulled)

cr0x@server:~$ docker pull debian:12-slim
12-slim: Pulling from library/debian
Digest: sha256:7a8b...cdef
Status: Image is up to date for debian:12-slim

Meaning: You’ve confirmed the local base image matches current registry state. In CI you should always pull to avoid building on stale layers.

Decision: If CI is caching base images without refresh, fix that. Vulnerability remediation often is “rebuild weekly,” not “patch manually in a container.”
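
For the scheduled security rebuild itself, make the freshness explicit rather than hoping the cache behaves. A sketch; the tag naming is only an example:

cr0x@server:~$ # Force a fresh base image pull and skip stale layer cache for the periodic rebuild.
cr0x@server:~$ docker build --pull --no-cache -t myapp:rebuild-$(date +%Y%m%d) .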

Task 16: Verify you didn’t ship build-time secrets into layers

cr0x@server:~$ docker run --rm myapp:prod sh -lc 'grep -R --line-number --exclude-dir=proc --exclude-dir=sys -E "BEGIN (RSA|OPENSSH) PRIVATE KEY" / 2>/dev/null | head -n 3 || true'

Meaning: No output is good. Scanners sometimes find secrets; you can also sanity-check yourself.

Decision: If you find anything, treat it as an incident: rotate keys, rebuild images, and fix the Dockerfile/build context immediately.

Joke #2: The only thing more persistent than a vulnerability is a layer cache with bad decisions baked in.

Three corporate-world mini-stories

Mini-story 1: The incident caused by a wrong assumption

A mid-sized SaaS company ran nightly container scans and had a simple policy:
“If it’s Critical, block deployment.” The intention was good. The implementation was blunt.

One Friday, a new critical CVE landed in a widely used compression library.
Their scanner flagged it across dozens of images—API, workers, cron jobs, even a one-off migration tool.
The assumption: “Critical means internet-exploitable.” So the security response was to halt all releases.

Meanwhile, the real operational issue was unrelated: a database schema migration needed to ship to stop a slow-burning performance regression.
With deployments blocked, the regression became customer-visible. Latency climbed. Retries multiplied. Costs followed.

After a long weekend, they discovered the “critical” library was only present in a build toolchain layer for several images, not in the runtime stage.
For others, it was installed but not reachable from their exposed services. The correct fix would have been a scheduled rebuild and a targeted exception with justification.

The lesson wasn’t “ignore critical CVEs.” The lesson was “don’t confuse severity labels with exploitability in your architecture.”
They replaced the block rule with a triage rule: block only when (1) a fixed version exists, (2) the component is in the runtime image, and (3) it is reachable from an exposed path, or when (4) the CVE is known to be actively exploited regardless.

Mini-story 2: The optimization that backfired

An enterprise internal platform team got tired of long build times and registry bloat.
They introduced aggressive layer caching and pinned base images to digests in Dockerfiles to “ensure reproducibility.”
Builds got faster. Scan results got stable. Everyone clapped.

Three months later, a new wave of scanner alerts arrived—this time from external auditors, not their CI.
The auditors scanned images in the registry and found old, vulnerable packages that had long been fixed upstream.
The platform team’s scanners didn’t flag them because they weren’t rebuilding, and their scan pipeline only ran on “changed images.”

The optimization had created a trap: pinning base images to digests made builds reproducible, but also made them permanently stale unless someone updated the pin.
Their caching made it easy to forget that security patches are delivered through rebuilds.

The remediation wasn’t complicated, just operationally annoying:
they added scheduled rebuilds (weekly for internet-facing services, monthly for internal tools), plus an automated PR bot that bumps base image digests and runs integration tests.
Cache stayed—but it stopped being an excuse for never rebuilding.

The lesson: performance optimizations that reduce friction also reduce feedback. If you remove the “pain” of rebuilding, you must add an explicit rebuild cadence or your security posture quietly rots.

Mini-story 3: The boring but correct practice that saved the day

A fintech company had a habit that nobody bragged about: every image built in CI produced three artifacts—
the image digest, an SBOM, and a signed attestation of the build inputs (base image digest, git SHA, build args).
It wasn’t exciting. It was paperwork, automated.

Then a high-profile vulnerability dropped in a library used by their customer-facing gateway.
The scanner dashboard lit up, as dashboards do. The security team asked the usual question: “Are we affected?”
Historically, this question triggers weeks of hunting through repos and arguing about which service uses what.

This time, the SRE on call pulled the SBOM for the running gateway digest, grepped for the component, and had an answer in minutes:
yes, the vulnerable library was present, and yes, the affected feature was enabled in their config.
They also used attestations to confirm which builds introduced it and which environments were running those digests.

They rebuilt the base image, rolled out progressively, and used the digest mapping to confirm that prod was actually updated.
No guesswork. No “but we deployed, right?” Slack threads. Just a controlled fix.

The lesson: the difference between panic and progress is evidence.
SBOMs and attestations are not security theater when they’re wired into operations and tied to immutable digests.

Fast diagnosis playbook

When you have a scanner report and limited time, do this in order. The goal is to find the bottleneck in decision-making fast.

1) Confirm artifact identity

  • Do we have an image digest?
  • Does production run that digest?
  • Is the scan timestamp after the image build?

Why first: Half of scan drama is mismatched artifacts. Fix that and a bunch of “urgent” issues evaporate or become correctly targeted.

2) Determine fix availability

  • Is there a fixed package version in the distro repo?
  • Is there an upstream release for the language dependency?
  • Is the issue marked “unfixed” because no patch exists?

Why second: “No fix exists” changes the conversation from “patch now” to “mitigate and monitor.”

3) Check runtime exposure and privileges

  • Which ports are listening? What’s reachable from outside?
  • Does it run as root? Any extra capabilities? Privileged container?
  • Read-only filesystem? seccomp profile? no-new-privileges?

Why third: If the runtime is locked down and the vulnerable component isn’t exposed, you have time to rebuild correctly instead of rushing a risky hotfix.

4) Eliminate build-time dependencies from runtime image

  • Is the CVE in compilers, headers, package managers, shells?
  • Are these needed in production?

Why: Multi-stage builds can remove entire classes of findings without “patching” anything.

5) Decide: rebuild, patch, accept-with-evidence

  • Rebuild when fix exists in base image packages.
  • Patch when fix is in app dependencies.
  • Accept only with a documented rationale and a review date; ideally produce VEX statements.

Common mistakes: symptoms → root cause → fix

1) Symptom: “We fixed it, but the scanner still shows it”

Root cause: You updated code or packages but did not rebuild the image from a fresh base; or you rebuilt but deployment still runs the old digest.

Fix: Pull base image fresh in CI, rebuild, push, and deploy by digest. Verify with docker ps / orchestrator equivalent that the digest changed.

2) Symptom: “The scanner says vulnerable, vendor says not affected”

Root cause: Backported patches or mismatched advisory sources (NVD vs distro security tracker).

Fix: Prefer distro-aware scanning sources; record an exception with evidence (package release string + vendor advisory status). Consider VEX to encode “not affected.”

3) Symptom: “Every image has hundreds of CVEs, we ignore everything”

Root cause: Bloated runtime images, old base images, and no rebuild cadence. Alert fatigue becomes policy.

Fix: Multi-stage builds, minimal runtime base, scheduled rebuilds, and gating only on actionable categories (fixed+exposed+high impact).

4) Symptom: “Critical CVE in kernel package inside container”

Root cause: Misinterpretation: container images don’t bring their own kernel; scanners sometimes report kernel-related packages or headers.

Fix: Patch host nodes. For the image, remove kernel headers/tools unless needed. Treat “kernel CVE in image” as informational unless you ship kernel-related tooling.

5) Symptom: “We upgraded the base image and broke TLS/CA validation”

Root cause: Moving to slim/distroless without explicitly including CA certificates or timezone data your app expects.

Fix: Ensure ca-certificates (or equivalent) is present; explicitly copy cert bundles if distroless. Add integration tests for outbound TLS.
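
A cheap integration check catches this before production does. A sketch that lists the image filesystem without needing a shell inside it (useful for distroless); it assumes the image defines an entrypoint so docker create succeeds:

cr0x@server:~$ # Export the image filesystem and confirm the CA bundle and zoneinfo are present.
cr0x@server:~$ docker create --name certcheck myapp:prod-slim
cr0x@server:~$ docker export certcheck | tar -tf - | grep -E 'ssl/certs/ca-certificates.crt|share/zoneinfo' | head
cr0x@server:~$ docker rm certcheck

If the bundle is missing, copy it into the runtime stage from the build stage (for example, COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/) rather than reinstalling a package manager just to get it.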

6) Symptom: “Scan reports CVEs in packages we don’t have”

Root cause: Scanner is detecting by file signatures or guessing language dependencies from manifests that don’t match runtime content.

Fix: Confirm with package database queries inside the image; generate SBOM; update scanner config to prefer package manager metadata.

7) Symptom: “We can’t patch because the fixed version doesn’t exist yet”

Root cause: Upstream has no patch, or your distro hasn’t shipped it, or you pinned repositories.

Fix: Mitigate: disable vulnerable feature, restrict exposure, drop privileges, add WAF rules, isolate network. Track and rebuild when fix lands.

Checklists / step-by-step plan

Daily operations checklist (keeps noise from becoming culture)

  1. Scan images by digest, not tag.
  2. Store scan results tied to digest and build ID.
  3. Track “actionable” findings: fixed available + in runtime + exposed/relevant.
  4. File tickets with exact package name/version and remediation path (rebuild vs code change).
  5. Require a justification for exceptions, including evidence (vendor status, runtime mitigations).

Weekly security hygiene plan (boring, effective)

  1. Rebuild internet-facing images at least weekly from fresh bases.
  2. Re-run scans on rebuilt images and compare deltas; investigate increases.
  3. Rotate out-of-date base images; stop using end-of-life distros.
  4. Prune build-time tools from runtime images (multi-stage).
  5. Verify runtime hardening: non-root user, drop capabilities, read-only rootfs where possible.

Step-by-step triage plan for a “Critical” finding

  1. Artifact check: confirm running digest matches scanned digest.
  2. Component check: verify package/library is installed in runtime layer.
  3. Fix availability: is there a patched package or release?
  4. Exposure: is the vulnerable component in the request path or reachable via user input?
  5. Privileges: root? added capabilities? writable filesystem?
  6. Mitigations: config disables feature? network policies? sandboxing?
  7. Decision: rebuild now, patch code, or accept with evidence and review date.
  8. Verification: deploy by digest; re-scan; confirm production runs new digest.

FAQ

1) Should we block deployments on vulnerability scans?

Yes, but only on findings that are actionable and meaningful: fixed version exists, component is in the runtime image, and the risk is relevant (exposed/high impact).
Blocking on “unfixed” findings by default usually trains people to bypass the system.

2) Which scanner should we trust: Trivy, Docker Scout, Grype, something else?

Trust is not a brand; it’s a workflow. Use at least one scanner that is distro-aware and one that can generate an SBOM.
The best scanner is the one you keep updated and whose output you can explain to an auditor without interpretive dance.

3) Why does my distroless image still have CVEs?

Distroless reduces packages, but it doesn’t eliminate vulnerabilities. You can still ship vulnerable libraries, CA bundles, or language runtimes.
Also, scanners may still report CVEs for bundled components inside your application binary or runtime.

4) Are “unfixed” vulnerabilities safe to ignore?

Not safe to ignore. Safe to treat differently.
If no patch exists, your options are mitigation (reduce exposure), replacement (different component), or acceptance with evidence and a review trigger.

5) Why do we see CVEs in packages we never use?

Because packages are installed, not used. Scanners report presence.
Your job is to reduce presence (smaller images) and assess reachability/exposure for what remains.

6) How do we handle backport false positives without creating a loophole?

Make exceptions structured: tie them to image digest, package release string, vendor status (“fixed via backport” / “not affected”), and set a revalidation date.
Prefer VEX-style statements over ad-hoc wiki pages.
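
What that looks like in practice: a minimal sketch of an OpenVEX document; the CVE ID, purl, author, and justification value are placeholders you would replace with your own evidence:

cr0x@server:~$ cat > myapp-vex.json <<'EOF'
{
  "@context": "https://openvex.dev/ns/v0.2.0",
  "@id": "https://vex.example.com/myapp/2026-01.json",
  "author": "platform-security@example.com",
  "timestamp": "2026-01-03T10:00:00Z",
  "version": 1,
  "statements": [
    {
      "vulnerability": { "name": "CVE-202X-YYYY" },
      "products": [
        { "@id": "pkg:deb/debian/libxml2@2.9.14+dfsg-1.3?arch=amd64" }
      ],
      "status": "not_affected",
      "justification": "vulnerable_code_not_in_execute_path"
    }
  ]
}
EOF

Tie the document to the image digest in your artifact store and revisit it on the revalidation date, exactly like any other exception.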

7) Should we prefer Alpine for fewer CVEs?

Choose a base image for operational compatibility first (glibc vs musl issues are real), then for security hygiene.
Fewer reported CVEs is not the same as less risk; it can also be different reporting coverage.

8) How often should we rebuild images?

Internet-facing services: weekly is a good default. Internal batch jobs: monthly might be fine.
If you rebuild “only when code changes,” you are choosing to miss security fixes delivered through base image updates.

9) Do runtime mitigations mean we can ignore image vulnerabilities?

Mitigations reduce exploitability; they don’t erase vulnerabilities.
Use mitigations to buy time and reduce blast radius, but still rebuild/patch when fixes exist—especially for broadly exploited CVEs.

10) What’s the simplest way to reduce scan noise quickly?

Multi-stage builds and removing build tools from runtime images. Then add scheduled rebuilds from fresh bases.
Those two moves alone usually cut findings dramatically without reducing functionality.

Conclusion: next steps that actually reduce risk

Vulnerability scanning is a flashlight, not a court verdict. Treat it like telemetry: correlate, confirm, and act on what matters.
If you want less noise and more safety, you don’t need a new dashboard. You need evidence and a cadence.

  1. Scan by digest and prove production runs what you scanned.
  2. Generate and store SBOMs for every build; use them to answer “what’s inside” instantly.
  3. Rebuild regularly from fresh base images; security updates often arrive via rebuild, not code changes.
  4. Reduce runtime surface area with multi-stage builds and minimal bases.
  5. Harden runtime defaults: non-root, drop capabilities, read-only rootfs where possible.
  6. Adopt structured exceptions (ideally VEX) so “not affected” doesn’t become “ignored forever.”

Do those, and your scanner stops being a panic generator. It becomes what it should have been all along: a tool that helps you ship safer systems without losing your weekends.
