You deploy. It worked yesterday. Today your pipeline dies with manifest unknown, and the only “change”
is that someone bumped “latest” somewhere because “it’s fine, it’s just a tag.” Now prod is stuck in
ImagePullBackOff, your incident channel is cooking, and you’re about to learn the difference between a
human-friendly label and a cryptographic pointer.
The good news: manifest unknown is usually not mysterious. It’s the registry saying “I can’t find the
manifest you asked for.” The bad news: you asked for it in a way that’s easy to get wrong. Let’s make it boring again.
What “manifest unknown” actually means
In Docker/OCI land, an “image” is not a single blob. It’s a manifest that references layers and config, stored
in a registry. When you run docker pull repo:tag, your client asks the registry for the manifest
corresponding to that reference.
“manifest unknown” is the registry’s way of saying: “I don’t have a manifest at that name+reference.”
That reference might be:
- A tag (repo:1.2.3)
- A digest (repo@sha256:…)
If the registry can’t resolve the tag to a manifest, or can’t find the digest, you get this error. It’s not (usually)
a network timeout. It’s not a permissions error. It’s lookup failure.
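If you want to see that failure without Docker’s wrapping, ask the registry API directly. A minimal sketch with an illustrative registry and repo, assuming credentials in $REG_USER/$REG_PASS; the error body follows the standard Registry v2 / OCI distribution format:
cr0x@server:~$ curl -s -u "$REG_USER:$REG_PASS" \
  -H 'Accept: application/vnd.oci.image.manifest.v1+json' \
  https://registry.example.com/v2/payments/api/manifests/1.2.3
{"errors":[{"code":"MANIFEST_UNKNOWN","message":"manifest unknown"}]}
The message in your client logs is just this response with extra words wrapped around it.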
Typical real-world causes:
- That tag never existed in that registry (typo, wrong project, wrong region).
- That tag existed, but was deleted or moved (retention policy, garbage collection, human cleanup).
- You pushed a multi-arch image incorrectly, so the tag points to something different than the client expects.
- You’re pulling from the wrong registry endpoint (proxy/cache/mirror mismatch).
- Authorization is weird and the registry deliberately returns “unknown” to avoid leaking what exists.
One operational principle to tattoo on the inside of your eyelids:
tags are pointers, not identities. If you treat them like identities, your pager will treat you like a hobby.
Tags vs digests: the truth and the lies
Tags: friendly, mutable, easy to break
A Docker tag is a string label associated with a manifest in a repository. The registry stores a mapping:
tag → manifest. That’s it. There’s no promise that the mapping is stable. There’s no promise that the
tag won’t be force-updated. There’s no promise your CI/CD won’t race another push.
Tags are great for humans. They’re great for workflows like “promote 1.2.3 to prod by retagging it as prod.”
They are terrible as the only supply-chain control in production.
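That promotion-by-retag workflow doesn’t even require pulling the image. A minimal sketch with illustrative names, assuming your registry lets you create or move the target tag:
cr0x@server:~$ docker buildx imagetools create \
  -t registry.example.com/payments/api:prod \
  registry.example.com/payments/api:1.2.3
It copies the existing manifest (or index) to a new tag without touching the content, which is exactly why a tag alone proves nothing about what is behind it.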
Digests: boring, immutable, the only thing you can prove
A digest (like sha256:...) identifies specific content. When you pull by digest, you’re asking for an
exact manifest by its hash. That hash changes if the content changes. So the digest is effectively immutable.
In practical terms:
- Tag pull: “give me whatever the registry currently thinks 1.2.3 is.”
- Digest pull: “give me this exact manifest, no substitutions.”
When your incident is “it worked in staging but not in prod,” the first thing I look for is tag drift. The
second thing I look for is multi-arch drift. The third thing is whether someone “optimized” the registry cleanup job.
Joke #1: Tags are like office fridge labels—helpful until someone decides “this looks old” and throws your lunch away.
What does the digest actually hash?
This matters because people get confused about “layer digest” vs “manifest digest”:
- Manifest digest: hash of the manifest JSON document (references layers/config). Pulling by digest targets this.
- Layer digests: hashes of each compressed layer blob. These get shared across images.
- Config digest: hash of the image config JSON (env, entrypoint, history, rootfs diff IDs, etc.).
“manifest unknown” is about the manifest (or manifest list) not being found for your reference. It’s not complaining that
a layer blob is missing—when layers are missing you’ll see different errors (often 404 on blob, or download failures).
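To see all three in one place, dump the raw manifest and pick out the digests. A sketch using skopeo against an illustrative reference (digests truncated for readability):
cr0x@server:~$ skopeo inspect --raw --creds "$REG_USER:$REG_PASS" \
  docker://registry.example.com/payments/api@sha256:aa1c... \
  | jq '{mediaType, config: .config.digest, layers: [.layers[].digest]}'
{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": "sha256:9f2e...",
  "layers": [
    "sha256:d4a1...",
    "sha256:7c0b..."
  ]
}
The hash of that whole JSON document is the manifest digest, and that is the thing the registry says it can’t find.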
Manifests, manifest lists, and why multi-arch makes this noisier
Single-platform images are straightforward: a tag points to a manifest for one platform (say linux/amd64). Multi-arch
images add one more level: the tag points to a manifest list (OCI index / Docker manifest list),
which then points to per-platform manifests.
Your client does content negotiation with headers and chooses an entry matching its platform (architecture, OS,
sometimes variant). If the manifest list is missing the platform entry, you may see errors that look like
“manifest unknown” or “no matching manifest,” depending on client and registry.
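The index itself is a small JSON document. A trimmed sketch of its shape (digests illustrative):
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    { "digest": "sha256:aa1c...", "platform": { "architecture": "amd64", "os": "linux" } },
    { "digest": "sha256:bb2d...", "platform": { "architecture": "arm64", "os": "linux" } }
  ]
}
If the platform your node needs isn’t in that manifests array, there is nothing for the client to resolve, and the exact error text depends on who notices first.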
Common pattern in 2026: teams build on Arm laptops, push “latest,” then prod nodes (amd64) pull and fail. Nobody
changed code. Reality changed architecture.
Why proxies and mirrors make it worse
Registry caches (pull-through caches, corporate mirrors, artifact managers) may cache tags, cache manifests, or
cache blobs with different TTLs. If a tag is updated upstream but the cache is stale—or worse, partially updated—you
can get “manifest unknown” from the cache even though upstream is fine. Or you can get it from upstream because you’re
pulling from the cache’s namespace by mistake.
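For context, the Docker-daemon side of this is one small setting. A minimal /etc/docker/daemon.json sketch (the mirror hostname is illustrative and matches the one that appears in docker info later):
{
  "registry-mirrors": ["https://mirror.corp.local"]
}
containerd-based Kubernetes nodes configure mirrors separately in containerd’s registry config, which is one more way two hosts can quietly disagree about where an image comes from.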
Interesting facts and short history (because the past is still on-call)
- Fact 1: The Docker Registry HTTP API v2 (the one everyone uses now) was introduced to fix scaling and correctness issues from v1, including better content addressing.
- Fact 2: OCI (Open Container Initiative) standardized the image format and distribution specs after Docker’s early ecosystem explosion, so you’re often dealing with “OCI images” even when you say “Docker image.”
- Fact 3: A “manifest list” in Docker terms is essentially an OCI “image index.” Two names, same operational headaches.
- Fact 4: Registries frequently implement deletion and garbage collection as separate steps; you can delete a tag reference without immediately removing blobs, or remove blobs later and break old references.
- Fact 5: Some registries intentionally return 404/“unknown” for unauthorized clients to avoid leaking which repositories/tags exist.
- Fact 6: Multi-arch images became mainstream when Arm servers and developer machines became common; before that, many teams never saw manifest lists in day-to-day work.
- Fact 7: Kubernetes doesn’t “understand tags” beyond passing them to the runtime; when tags drift, K8s faithfully deploys your chaos at scale.
- Fact 8: The digest you see as RepoDigests locally is tied to the registry reference; the same local image may have multiple repo digests if it was pulled from multiple registries/tags over time.
- Fact 9: BuildKit and buildx made multi-platform builds accessible, but also made it easy to push partial results if you don’t use the right flags.
Fast diagnosis playbook
The goal is speed: confirm whether you’re dealing with (a) wrong name/tag, (b) missing platform in multi-arch, (c) auth/mirror weirdness,
or (d) registry retention/deletion.
First: confirm the exact reference your system is trying to pull
- Copy the full image reference from logs: registry host, repo path, tag or digest.
- Check for invisible mistakes: wrong project, missing namespace, tag typo, accidental uppercase.
Second: test pull from a clean environment (no cached credentials, no cached image)
- Try docker pull from a throwaway runner or a workstation with a known-good network path.
- If it works there but not on nodes, suspect proxy/mirror, auth differences, or stale cache.
Third: resolve the tag to a digest and then pull by digest
- If pulling by digest works, your issue is tag drift or tag deletion.
- If digest pull fails with “unknown,” the manifest is gone (or you’re in the wrong registry).
Fourth: check multi-arch / platform matching
- Inspect with docker buildx imagetools inspect.
- If the needed platform isn’t listed, rebuild/push properly.
Fifth: confirm registry-side state and retention policies
- List tags if supported; inspect audit logs if you have them.
- Look for automated cleanup jobs, “untagged” purge rules, or time-based retention that ate your tag.
Paraphrased idea from Werner Vogels (Amazon CTO): You build it, you run it.
If you ship images, you own how they’re referenced and retained.
Practical tasks: commands, outputs, and decisions (12+)
These are real operator moves: run a command, interpret output, decide the next action. Keep them in your runbook.
Task 1: Reproduce the failure exactly (and don’t “simplify” the image name)
cr0x@server:~$ docker pull registry.example.com/payments/api:1.8.4
Error response from daemon: manifest for registry.example.com/payments/api:1.8.4 not found: manifest unknown: manifest unknown
What it means: The registry endpoint answered, and it doesn’t have a manifest for that repo+tag.
Decision: Verify reference correctness, then check whether the tag exists and what it points to.
Task 2: Check if it’s a tag typo by inspecting nearby tags (where supported)
cr0x@server:~$ curl -s -u "$REG_USER:$REG_PASS" https://registry.example.com/v2/payments/api/tags/list | jq .
{
"name": "payments/api",
"tags": [
"1.8.3",
"1.8.4-hotfix1",
"1.9.0",
"latest"
]
}
What it means: 1.8.4 does not exist, but 1.8.4-hotfix1 does.
Decision: Fix the deployment reference. If you expected 1.8.4, find out why it was never pushed.
Task 3: Resolve a tag to a digest (client-side) with buildx imagetools
cr0x@server:~$ docker buildx imagetools inspect registry.example.com/payments/api:1.9.0
Name: registry.example.com/payments/api:1.9.0
MediaType: application/vnd.oci.image.index.v1+json
Digest: sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
Manifests:
Name: registry.example.com/payments/api:1.9.0@sha256:aa1c... (linux/amd64)
Name: registry.example.com/payments/api:1.9.0@sha256:bb2d... (linux/arm64)
What it means: The tag exists and points to a multi-arch index with two platforms.
Decision: If your nodes are amd64, pulling should work—unless a proxy is interfering or auth differs.
Task 4: Pull by digest to eliminate tag drift
cr0x@server:~$ docker pull registry.example.com/payments/api@sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
4c3b7d6b2a6d: Pulling from payments/api
Digest: sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
Status: Downloaded newer image for registry.example.com/payments/api@sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
What it means: The content exists; the tag problem is separate.
Decision: Update manifests/deployments to pin by digest for prod, or fix your promotion process.
Task 5: Confirm what you actually have locally (RepoTags vs RepoDigests)
cr0x@server:~$ docker image inspect registry.example.com/payments/api:1.9.0 --format '{{json .RepoTags}} {{json .RepoDigests}}'
["registry.example.com/payments/api:1.9.0"] ["registry.example.com/payments/api@sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7"]
What it means: You have both the mutable tag and the immutable digest reference.
Decision: Use the digest in lockfiles, Helm values, and Kubernetes manifests for production.
Task 6: Check platform mismatch quickly
cr0x@server:~$ uname -m
aarch64
What it means: You’re on Arm64.
Decision: If the image only has amd64 manifests, you’ll hit “no matching manifest” or “unknown” variants. Inspect the index and rebuild as multi-arch.
Task 7: Inspect the manifest list and verify the platform you need exists
cr0x@server:~$ docker buildx imagetools inspect registry.example.com/analytics/worker:2.1.0
Name: registry.example.com/analytics/worker:2.1.0
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest: sha256:9b7f...
Manifests:
Name: registry.example.com/analytics/worker:2.1.0@sha256:1c2a... (linux/amd64)
What it means: Only amd64 is published.
Decision: Either rebuild/push arm64, or force scheduling to amd64 nodes, or stop pretending “works on my laptop” is a platform strategy.
Task 8: Prove you’re talking to the registry you think you are
cr0x@server:~$ docker info | sed -n '1,25p'
Client:
Version: 26.1.0
Context: default
Debug Mode: false
Server:
Containers: 12
Running: 3
Paused: 0
Stopped: 9
Registry Mirrors:
https://mirror.corp.local
Live Restore Enabled: false
What it means: There’s a registry mirror configured.
Decision: If pulls fail only on certain hosts, compare mirror settings; try bypassing the mirror for diagnosis.
Task 9: Bypass a registry mirror (temporary) to confirm mirror staleness
cr0x@server:~$ sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
cr0x@server:~$ sudo jq 'del(.["registry-mirrors"])' /etc/docker/daemon.json.bak | sudo tee /etc/docker/daemon.json >/dev/null
cr0x@server:~$ sudo systemctl restart docker
cr0x@server:~$ docker pull registry.example.com/payments/api:1.9.0
1.9.0: Pulling from payments/api
Digest: sha256:4c3b...
Status: Downloaded newer image for registry.example.com/payments/api:1.9.0
What it means: The mirror was the problem (stale cache, blocked path, auth mismatch).
Decision: Fix or evict mirror cache; don’t leave nodes “temporarily” bypassing mirrors unless you like surprise egress bills.
Task 10: Check Kubernetes events when the error appears in clusters
cr0x@server:~$ kubectl -n payments describe pod api-7f95b6d8dd-4n2qg | sed -n '/Events:/,$p'
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 2m50s kubelet Pulling image "registry.example.com/payments/api:1.8.4"
Warning Failed 2m49s kubelet Failed to pull image "registry.example.com/payments/api:1.8.4": rpc error: code = Unknown desc = failed to resolve reference "registry.example.com/payments/api:1.8.4": not found
Warning Failed 2m49s kubelet Error: ErrImagePull
Warning BackOff 2m35s kubelet Back-off pulling image "registry.example.com/payments/api:1.8.4"
Warning Failed 2m35s kubelet Error: ImagePullBackOff
What it means: Kubelet/runtime couldn’t resolve the reference. This is consistent with tag missing, auth/mirror issues, or platform mismatch.
Decision: Validate image reference and credentials used by the node, not by your laptop.
Task 11: Verify the node’s runtime and its view of the image reference
cr0x@server:~$ kubectl get node ip-10-1-2-3 -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}{"\n"}'
containerd://1.7.18
What it means: You’re using containerd, not Docker Engine.
Decision: Use crictl tooling or containerd tooling on-node for deeper pull debugging; Docker CLI alone can mislead you.
Task 12: Confirm the imagePullSecret is present and correct (auth failures can masquerade)
cr0x@server:~$ kubectl -n payments get secret regcred -o jsonpath='{.type}{"\n"}'
kubernetes.io/dockerconfigjson
What it means: The secret exists and is the right type.
Decision: If still failing, decode it and confirm it points to the same registry host you’re pulling from.
Task 13: Decode the dockerconfigjson and check the registry hostname matches
cr0x@server:~$ kubectl -n payments get secret regcred -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq -r '.auths | keys[]'
https://registry.example.com
https://registry-old.example.com
What it means: Multiple auth entries exist, including an old endpoint.
Decision: Remove stale entries; make sure the exact hostname in the image reference has credentials.
Task 14: Use skopeo to inspect remotely without pulling (great for CI runners)
cr0x@server:~$ skopeo inspect --creds "$REG_USER:$REG_PASS" docker://registry.example.com/payments/api:1.9.0 | jq -r '.Digest,.Architecture,.Os'
sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
amd64
linux
What it means: The tag resolves, and you can see the digest and primary architecture from the remote manifest.
Decision: If skopeo sees it but your nodes don’t, suspect network path, mirror, or node credentials.
Task 15: Inspect registry responses directly (confirm 404 vs 401)
cr0x@server:~$ curl -s -o /dev/null -w "%{http_code}\n" -u "$REG_USER:$REG_PASS" \
-H 'Accept: application/vnd.oci.image.manifest.v1+json' \
https://registry.example.com/v2/payments/api/manifests/1.8.4
404
What it means: The registry returns 404 for that tag’s manifest. That’s consistent with “tag doesn’t exist” or “deleted.”
Decision: Stop debugging nodes. Fix the reference or restore the tag/manifest in the registry.
Task 16: Prove a retention policy deleted the tag (look for “untagged” images)
cr0x@server:~$ skopeo inspect --creds "$REG_USER:$REG_PASS" docker://registry.example.com/payments/api:1.8.4-hotfix1 | jq -r '.Created,.Digest'
2026-01-02T17:11:09Z
sha256:11aa22bb33cc44dd55ee66ff77889900aabbccddeeff00112233445566778899
What it means: A similar build exists with a different tag and a recent creation time, suggesting your expected tag was never promoted or got deleted.
Decision: Decide whether to redeploy using the correct tag/digest, or republish the missing tag as part of a controlled release process.
Task 17: Correctly push multi-arch images (avoid partial pushes)
cr0x@server:~$ docker buildx build --platform linux/amd64,linux/arm64 \
-t registry.example.com/analytics/worker:2.1.0 \
-t registry.example.com/analytics/worker:2.1 \
--push .
[+] Building 214.3s (31/31) FINISHED
=> pushing manifest for registry.example.com/analytics/worker:2.1.0
=> pushing manifest for registry.example.com/analytics/worker:2.1
What it means: You pushed a proper multi-arch index. The tag should resolve for both platforms.
Decision: Re-test with imagetools inspect. If only one platform appears, your builder likely didn’t have QEMU/binfmt configured or the build failed for one platform.
Task 18: Pin in Kubernetes by digest (the “stop bleeding” move)
cr0x@server:~$ kubectl -n payments set image deploy/api api=registry.example.com/payments/api@sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
deployment.apps/api image updated
What it means: The deployment now references immutable content.
Decision: Do this for production workloads. Keep tags for human workflows, but deploy by digest.
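If the deployment is managed declaratively, put the same pin in the manifest so the next kubectl apply doesn’t silently revert it. A minimal pod-spec fragment, using the same illustrative image reference:
containers:
  - name: api
    image: registry.example.com/payments/api@sha256:4c3b7d6b2a6d7f3e9c5b2d6c0a7c9b3a2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7
Digest references always resolve to the same content, so imagePullPolicy stops being a correctness knob and goes back to being a caching preference.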
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption (tags are versions)
A mid-sized fintech had a “simple” rule: staging deploys used :latest, production deploys used a semver tag
like :2.6.1. It sounded mature. It looked mature in PowerPoint. Then a minor release turned into a Friday
night incident.
The release engineer rebuilt 2.6.1 to include a last-minute CA bundle update. Instead of cutting
2.6.2, they re-tagged and pushed 2.6.1 again. Nobody thought it mattered. “Same code,” they said.
Their registry allowed tag overwrite, and nobody had policies preventing it.
Production was a rolling update across multiple clusters. Some nodes pulled the original 2.6.1. Some pulled the
new 2.6.1. The app’s migrations ran once, but the app behavior differed slightly due to changed TLS defaults.
Half the fleet talked to an upstream dependency; half failed. The on-call team spent two hours investigating “random”
failures until someone compared RepoDigests across nodes.
The fix was dull and effective: production deployments pinned digests, and the release process treated tags as
immutable. If you need a new build, you cut a new tag. No exceptions. The culture shift was bigger than the technical change:
“version” now meant “digest-backed artifact,” not “string that looked like a version.”
Mini-story 2: The optimization that backfired (aggressive cleanup)
At an enterprise SaaS company, the container registry storage bill was climbing. Someone proposed a cleanup policy:
delete untagged manifests older than 14 days, and aggressively garbage-collect blobs to reclaim space. It was pitched as
a cost optimization with no downside. That should have been your first clue.
Their deployment strategy used “promotion by retagging”: build :commit-abc123, test it, then retag the same
digest as :prod. But the CI system occasionally created intermediate tags and then removed them. For a while,
the manifest was temporarily “untagged” before being re-tagged in promotion.
The cleanup job ran during that window. It deleted manifests it believed were untagged and eligible. Later, promotion tried
to retag a digest that no longer had a manifest. Result: manifest unknown for :prod in the middle
of a deployment. That’s how you learn your “untagged” definition is not the same as “unused.”
They fixed it by separating concerns: no deletion of recent untagged content, promotion performed as an atomic operation
when supported, and the cleanup window avoided business hours. Also: the registry team started treating “storage” as a
reliability dependency, not a line item to squeeze with cron jobs.
Mini-story 3: The boring but correct practice that saved the day (digest pinning + artifact attestations)
A regulated healthcare company had a rule: every production deployment must reference images by digest, and the digest must
be recorded in the change ticket. People grumbled. It felt bureaucratic. It slowed down the “just ship it” vibe, which is
exactly why it worked.
One day, their registry provider had an internal replication delay between regions. Tags were visible in one region before
the underlying manifests were fully replicated in another. Several teams saw intermittent manifest unknown
during pulls in their disaster recovery environment.
The on-call team didn’t debate whether :3.4.7 was “the right version.” They already had the digest from the
last approved deploy. They pulled by digest from the region where it existed, mirrored it internally, and updated the DR
manifests to point to the internal copy by digest. Systems recovered with minimal ambiguity.
It wasn’t glamorous. No heroics. Just an audit trail of “this digest is what we run,” which turned a fuzzy registry issue
into a straightforward replication problem. Compliance asked questions later; the answers were already written down.
Common mistakes: symptoms → root cause → fix
1) Symptom: “manifest unknown” for a tag that “definitely exists”
- Root cause: You’re pulling from a different registry host/namespace than the one you checked. Mirrors and similarly named repos love this trick.
- Fix: Compare the full image reference in logs. Verify mirror settings with docker info. Resolve via skopeo inspect against the exact host.
2) Symptom: Works on laptop, fails on nodes
- Root cause: Platform mismatch (arm64 vs amd64) or node runtime credentials differ from developer credentials.
- Fix: Check uname -m on nodes; inspect manifest list platforms; verify imagePullSecrets and node IAM/credential helpers.
3) Symptom: “manifest unknown” right after a push
- Root cause: Eventual consistency/replication delay, or you pushed to one region and pulled from another, or cache/mirror stale.
- Fix: Pull from the same endpoint you pushed to; bypass mirrors; retry with backoff; pin digest after confirming.
4) Symptom: Tag disappeared overnight
- Root cause: Retention policy deleted tags/manifests, or someone ran registry GC without understanding dependencies.
- Fix: Adjust retention policies; protect release tags; keep a “golden” internal mirror; store digests in release metadata.
5) Symptom: Kubernetes shows not found but registry UI shows the tag
- Root cause: The UI may be showing tag metadata from a different backend or cached view; kubelet is pulling from a different endpoint or through a proxy.
- Fix: Pull from the node network path; check container runtime mirror settings; use direct registry API calls from the node.
6) Symptom: Only one architecture works after “multi-arch push”
- Root cause: You pushed only one platform manifest, or pushed an index missing the desired platform due to build failure/QEMU missing.
- Fix: Use docker buildx build --platform ... --push and verify with imagetools inspect before tagging as release.
7) Symptom: Random nodes fail, random nodes succeed
- Root cause: Mixed node architectures, mixed mirror configs, or mixed credentials (some nodes can see private tags, others can’t).
- Fix: Standardize node bootstrap; audit daemon configs; ensure secrets/credential providers are consistent across node pools.
Joke #2: “manifest unknown” is the registry’s polite way of saying “I have no idea what you’re talking about,” which is also how auditors feel about “latest.”
Checklists / step-by-step plan
Step-by-step: stop the outage first
- Capture the exact failing reference from logs (including registry hostname and namespace).
- Attempt a direct pull from a clean host on the same network path as the failing nodes.
- Resolve the tag to a digest using docker buildx imagetools inspect or skopeo inspect.
- Deploy by digest (temporarily or permanently). This bypasses tag drift and buys you time.
- If digest is missing too, locate the last known good digest from CI logs, SBOM/attestations, or previous deployment history.
- If multi-arch mismatch, schedule onto compatible nodes or publish the missing platform immediately.
Step-by-step: fix the system so it doesn’t happen again
- Make release tags immutable (policy). If your registry supports it, enforce it. If not, enforce via CI guards.
- Store digests as first-class release metadata (in Git, artifact metadata, deployment manifests, change records); a small post-push step like the sketch after this list is enough.
- Use tags for humans, digests for machines. Promotion can still use tags, but production should run digests.
- Verify multi-arch outputs in CI before marking a build as releasable.
- Align retention with release reality: protect release tags, delay untagged deletion, and don’t GC aggressively without understanding reference patterns.
- Standardize mirror configuration and monitor it like a dependency. If it can return stale manifests, it can cause outages.
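The digest-recording step is small enough to live in any pipeline. A minimal sketch, assuming the illustrative registry and credentials used throughout this article:
# right after pushing the release tag: resolve it to a digest and record it
DIGEST=$(skopeo inspect --creds "$REG_USER:$REG_PASS" \
  docker://registry.example.com/payments/api:1.9.0 | jq -r '.Digest')
echo "registry.example.com/payments/api@${DIGEST}" >> release-metadata.txt
Commit that file (or attach it to the change record) and the “what exactly did we ship” question answers itself during the next incident.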
Checklist: before you blame Docker
- Is the tag spelled correctly and in the correct repository path?
- Are you pulling from the same registry endpoint your CI pushes to?
- Are you behind a mirror, and is it consistent across nodes?
- Is the required platform present in the manifest list?
- Is the reference pinned by digest for production?
- Could retention/GC have deleted or orphaned the manifest?
- Do node credentials match what your CI/dev machine uses?
FAQ
1) Is “manifest unknown” always a missing tag?
No. It’s “missing manifest for the reference you asked for.” That can be a missing tag, a deleted digest, a wrong
repository path, or an auth/mirror situation that hides existence.
2) Why does Docker sometimes say “manifest unknown” instead of “unauthorized”?
Some registries intentionally return 404/unknown to unauthorized clients to avoid leaking which repos/tags exist. Always
verify credentials on the same machine/runtime that fails.
3) What’s the difference between RepoTags and RepoDigests?
RepoTags are the names you used to reference an image locally (mutable). RepoDigests are
registry-backed content references (immutable per content). Use digests for deployments.
4) If digests are immutable, why do I still see failures pulling by digest?
Because immutability doesn’t guarantee availability. If the registry deleted the manifest (or you’re querying a different
registry/region), the digest can be “unknown.” Immutable is not immortal.
5) Should we stop using tags entirely?
No. Tags are useful for humans and workflows: “this is release-2.1,” “this is prod,” “this is canary.” Just don’t let
production correctness depend on a mutable pointer. Deploy by digest, label by tag.
6) How do I make sure multi-arch images actually include both amd64 and arm64?
Build with docker buildx build --platform ... --push, then verify with
docker buildx imagetools inspect. Make the verification a CI gate, not a hope.
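A minimal sketch of such a gate, with an illustrative image reference and a fixed list of required platforms:
# pipeline step right after the push: fail if a required platform is missing
for p in linux/amd64 linux/arm64; do
  docker buildx imagetools inspect registry.example.com/analytics/worker:2.1.0 \
    | grep -q "$p" || { echo "missing platform: $p" >&2; exit 1; }
done
It’s crude, but it fails in CI, where nobody gets paged.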
7) Why does it work when I bypass the registry mirror?
Your mirror is stale, misconfigured, or doesn’t have credentials for private repos. Mirrors are systems, not magical
performance dust. Monitor them, version their configs, and test failover behavior.
8) Kubernetes shows “not found,” but I can pull manually from my laptop. Why?
Different credentials, different DNS, different proxy/mirror, different architecture. Kubernetes nodes pull as the node
runtime, not as you. Reproduce from the node network path and runtime context.
9) What’s the fastest “fix” during an incident?
Pin the deployment to a known-good digest. It removes tag ambiguity immediately. Then you can investigate why the tag
didn’t resolve without holding production hostage.
10) Is retagging a valid promotion strategy?
Yes, if you do it with discipline: tags that represent environments can move, but the release artifact should be recorded
by digest. Also protect release tags from overwrite, or you’ll eventually ship two different “same versions.”
Conclusion: what to change on Monday
“manifest unknown” is not a mystical Docker curse. It’s a content lookup failure, usually caused by humans using tags as
identities, or by systems treating deletion and caching like harmless housekeeping.
Next steps that actually reduce incidents:
- Deploy by digest in production. Keep tags, but stop relying on them for correctness.
- Make release tags immutable by policy and enforcement.
- Validate multi-arch images in CI and fail builds that don’t publish the required platforms.
- Audit mirrors and retention policies like they’re production dependencies—because they are.
- Write down the digest at release time. Future-you will enjoy not guessing.