Dark Mode That Doesn’t Look Cheap — Token System Done Right

February 26, 2026 • Read: 24 min

You shipped dark mode. The screenshots looked fine. Then Monday arrives: product says it feels “muddy,” design says the neutrals are “radioactive,”
accessibility says contrast is failing, and support says buttons “disappear sometimes.” That’s not a color problem. That’s a system problem.

Dark mode that looks expensive—calm, readable, consistent—comes from one thing: a token system with actual semantics, real layering, and operational discipline.
You don’t need a new palette. You need a theming architecture that survives scale, refactors, and people.

How cheap dark mode happens (and how it presents)

Cheap-looking dark mode has a consistent set of failure modes. You can recognize them like a seasoned on-call engineer recognizes a bad deploy:
everything is “technically working,” but the vibes are wrong and the incident channel is warming up.

Smell #1: Everything is the same gray

If your background, surface, and elevated surfaces are separated by tiny deltas (or random deltas), the UI becomes a flat puddle. In light mode,
you can often get away with sloppy surface elevation because shadows and ambient light do some of the work. In dark mode, your surfaces must carry
the hierarchy.

Smell #2: Text contrast is fine, but it still reads poorly

WCAG contrast ratios don’t guarantee comfort. Dark UIs can hit contrast targets while still producing halation—bright text that feels like it’s vibrating.
The fix is not “reduce contrast until it fails,” it’s to stop using pure white and pure black, and to control how your neutrals step.

Smell #3: Brand colors look neon or dirty

Saturated accents on dark surfaces can look like cheap LED signage. Conversely, if you just darken the same brand hue, it turns into a bruise.
You need separate token values per theme for accents, with constraints tied to contrast and perceived brightness, not “multiply by 0.8.”

Smell #4: Components “break character” under edge states

Hover, active, focus, disabled, and error states reveal whether you built a system or a collage. Dark mode is especially punishing:
a focus ring that worked in light mode can vanish; disabled text can become indistinguishable from normal; borders can look like accidental hairlines.

Smell #5: Every app team has its own interpretation

If tokens exist but every product uses them differently, that’s not theming; that’s “shared vocabulary” without shared meaning.
The fix is semantic tokens mapped from primitives, with clear intended usage and guardrails.

One operational truth: the more apps you have, the more you should treat theming like a platform. It’s not “colors.” It’s a compatibility contract.

Joke #1: Dark mode isn’t hard because colors are tricky. It’s hard because your organization treats “#121212” like a strategy.

Useful facts and historical context (so you stop repeating history)

Early “dark UIs” weren’t aesthetic choices. Many terminals showed light text on dark backgrounds because phosphor displays and power limits made that the practical option.
“Invert colors” has been failing since forever. Early accessibility tools tried inversion and ran into image and brand-color chaos; modern theming is still cleaning that up.
OLED changed the conversation. Dark pixels can save power on OLED; on LCD, savings are smaller and sometimes negligible depending on brightness.
Material Design popularized structured elevation. It normalized thinking in surfaces and layers rather than “background + some cards,” which matters more in dark mode.
WCAG contrast ratios are math, not comfort. They measure luminance contrast, not perceived glare or readability over time.
Design tokens became a standardization response. As component libraries grew, teams needed a portable, tool-agnostic way to encode design decisions.
System preference became first-class. OS-level prefers-color-scheme turned theme into a runtime concern, not just a compile-time styling choice.
“Dark mode everywhere” created image/icon debt. Icons, illustrations, and charts often relied on implicit light backgrounds; theming forced explicit decisions.

If you take one lesson from history: every time teams treated dark mode as “light mode but darker,” they ended up building a second UI anyway—just without
admitting it. Admit it early. Build the system.

The token model: primitives, semantics, and components

Token systems fail when they’re organized like paint buckets instead of like an API. A good token system looks boring because it’s predictable.
Predictable is what you want at 2 a.m. when a hotfix can’t risk visual regressions.

Layer 1: Primitive tokens (raw materials)

Primitive tokens are your palette and measurements: grayscale steps, brand hues, spacing units, radii, type scales. They answer:
“What colors exist?” not “Where do they go?”

Rules that keep primitives sane:

Use steps, not vibes. Define neutrals as a ladder (e.g., neutral-0…neutral-1000) with consistent luminance deltas.
Keep primitives theme-aware. You can have separate primitive sets for light and dark (especially for neutrals and accents).
Primitives don’t leak into product code. If product code uses neutral-900 directly, you’ve already lost consistency.

Layer 2: Semantic tokens (meaning and intent)

Semantic tokens represent roles: bg, surface, text-primary, border-subtle, focus-ring,
danger, success, link.

Semantics answer: “What is this token for?” A semantic token should have a narrow meaning and a stable contract. The values can change per theme and brand.
The meaning should not.

Strong opinions that will save you:

Define a semantic token glossary. Write down intended usage and “do not use for X.”
Separate content from container. Text tokens should not be reused for borders because someone thought “it’s the same gray.”
Include interaction and state tokens. Hover/active/focus/disabled are not afterthoughts; they’re where dark mode breaks.
Include data-viz semantics early. Charts become unreadable quickly in dark mode if you don’t predefine axes, grid, series, and tooltip tokens.

Layer 3: Component tokens (last-mile tuning)

Component tokens are for cases where a component needs special tuning without polluting global semantics: e.g., button-primary-bg,
tooltip-bg, modal-overlay.

The trick: component tokens should reference semantic tokens by default, and only diverge when there’s a real reason.
If every component defines its own colors, you built a second palette with worse governance.

Naming that doesn’t rot

Avoid naming tokens after colors (blue-500) when you mean purpose (link). Name by intent. Always.
If you must keep color names (for primitives), keep them away from product code.

A practical naming scheme:

Primitives: color.neutral.0, color.neutral.900, color.brand.primary.600
Semantics: color.bg, color.surface, color.text.primary, color.border.subtle
Components: color.button.primary.bg, color.input.border.focus

If you prefer CSS custom properties, map to --color-bg, --color-text-primary, etc. Same idea, fewer dots.
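To make the layering concrete, here is a minimal sketch of the three layers and the name mapping. It assumes a "{name}" reference syntax between layers, and the helper names (resolveToken, toCssVarName) are hypothetical; the token names follow the scheme above and the hex values are only illustrative.

```typescript
// Hypothetical three-layer token tree; the "{name}" reference syntax is an assumption.
type TokenMap = { [name: string]: string };

// Layer 1: primitives hold raw values.
const primitives: TokenMap = {
  "color.neutral.0": "#e7eaf0",
  "color.neutral.900": "#0f1115",
  "color.brand.primary.600": "#7aa2ff",
};

// Layer 2: semantics describe roles and point at primitives.
const semanticsDark: TokenMap = {
  "color.bg": "{color.neutral.900}",
  "color.text.primary": "{color.neutral.0}",
  "color.focus.ring": "{color.brand.primary.600}",
};

// Layer 3: component tokens reference semantics by default.
const componentsDark: TokenMap = {
  "color.input.border.focus": "{color.focus.ring}",
};

// Follow "{name}" references through the layers until a raw value is reached.
function resolveToken(name: string, layers: TokenMap[], depth = 0): string {
  if (depth > 10) throw new Error("Reference cycle at " + name);
  for (const layer of layers) {
    const value = layer[name];
    if (value === undefined) continue;
    const ref = /^\{(.+)\}$/.exec(value);
    return ref ? resolveToken(ref[1], layers, depth + 1) : value;
  }
  throw new Error("Unknown token: " + name);
}

// "color.text.primary" becomes "--color-text-primary".
function toCssVarName(tokenName: string): string {
  return "--" + tokenName.split(".").join("-");
}
```

Because the component token only holds a reference, retuning color.focus.ring for dark mode changes every component that points at it, with no per-component edits.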

Color behavior in dark UI: contrast, luminance, and why “just invert it” fails

Dark mode isn’t simply “light mode with different hex values.” You’re working against human perception and device physics.
Treat it like capacity planning: the numbers matter, but the real story is in the shape of the curve.

Pick “near-black,” not black

Pure black (#000000) looks harsh in many contexts and exaggerates contrast with text and elevation effects. Most high-quality dark themes use a near-black,
usually slightly tinted (cool or warm) to feel intentional and to reduce banding.

Use off-white text

Pure white on dark backgrounds can cause halation: perceived glow and eye strain. Use an off-white for primary text and step down for secondary and disabled.
Keep the steps consistent.

Elevation in dark mode is mostly about lightening surfaces

In light mode, elevated surfaces often get darker shadows. In dark mode, shadows don’t read the same; you typically lighten the elevated surface slightly,
and use subtle shadows or borders sparingly to signal separation.
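One way to keep "lighten the elevated surface" honest is to measure it. The sketch below assumes a surface ladder given as #rrggbb values and uses the WCAG relative luminance formula to flag any step that fails to get lighter than the one beneath it. The function names and the sample ladder are hypothetical.

```typescript
// Relative luminance of an sRGB color, following the WCAG definition.
function relativeLuminance(hex: string): number {
  const m = /^#([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error("Expected #rrggbb, got " + hex);
  const n = parseInt(m[1], 16);
  const channels = [(n >> 16) & 255, (n >> 8) & 255, n & 255].map((v) => {
    const c = v / 255;
    return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * channels[0] + 0.7152 * channels[1] + 0.0722 * channels[2];
}

// Returns the names of ladder steps that are not lighter than the previous step.
function findFlatSteps(ladder: { name: string; hex: string }[]): string[] {
  const flat: string[] = [];
  for (let i = 1; i < ladder.length; i++) {
    const previous = relativeLuminance(ladder[i - 1].hex);
    const current = relativeLuminance(ladder[i].hex);
    if (current <= previous) flat.push(ladder[i].name);
  }
  return flat;
}

// Illustrative dark ladder, ordered from lowest to highest elevation.
const darkSurfaces = [
  { name: "bg", hex: "#0f1115" },
  { name: "surface", hex: "#161a21" },
  { name: "surface-2", hex: "#1d2533" },
  { name: "surface-3", hex: "#1d2533" }, // same value as surface-2: no visible step
];
```

A stricter version could also require a minimum luminance delta per step; what that delta should be is a design decision this sketch does not make.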

Don’t reuse light-mode border logic

Borders in light mode are often slightly darker than the surface. In dark mode, borders may need to be slightly lighter than the surface,
or they disappear. This is a classic “copy the token mapping” bug.

Accent colors need theme-specific tuning

The same brand color can look different against dark surfaces. You generally need:

a slightly brighter (higher luminance) accent for dark backgrounds,
an outline/focus variant that remains visible on both the accent and the background,
and a muted variant for subtle backgrounds (badges, highlights) that doesn’t look like a bruise.

One quote, because it’s still true

“Hope is not a strategy.” — Rick Page

If your theme relies on hoping that components “probably look okay,” you’re going to ship regressions. Instrument it.

A production-grade theming architecture

You want a theming system that can do three things without drama:
(1) switch themes reliably, (2) keep components consistent, (3) support multiple brands or products without forking the world.

Decision: Where do tokens live?

Put tokens in a single versioned package. Generate outputs for the platforms you care about (CSS variables, JSON, TS types).
The “source of truth” should be one format, not three hand-edited copies.

Decision: How do themes apply?

Use a single root attribute and CSS variables:

data-theme="light" and data-theme="dark" on <html> or <body>
Define variable sets scoped to that attribute

This prevents the “half-themed” page where some subtree is stuck on old values. It also makes it measurable: you can assert the attribute exists.
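As a rough sketch of what "variable sets scoped to that attribute" could look like in a generator, the function below turns per-theme token maps into one CSS block per theme. The name renderThemeCss and the input shape are assumptions, not part of any particular build tool.

```typescript
type ThemeValues = { [tokenName: string]: string };

// Emits one [data-theme="..."] block per theme, with one CSS variable per token.
function renderThemeCss(themes: { [themeName: string]: ThemeValues }): string {
  const blocks: string[] = [];
  for (const themeName of Object.keys(themes)) {
    const values = themes[themeName];
    const lines = Object.keys(values)
      .sort()
      .map((token) => "  --" + token.split(".").join("-") + ": " + values[token] + ";");
    blocks.push('[data-theme="' + themeName + '"] {\n' + lines.join("\n") + "\n}");
  }
  return blocks.join("\n\n");
}

// Illustrative input: the same semantic names, different values per theme.
const generatedCss = renderThemeCss({
  light: { "color.bg": "#ffffff", "color.text.primary": "#1a1d24" },
  dark: { "color.bg": "#0f1115", "color.text.primary": "#e7eaf0" },
});
```

With output shaped like this, switching themes at runtime is a single attribute write, for example document.documentElement.setAttribute("data-theme", "dark"), and nothing else in the DOM has to change.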

Decision: How do tokens map across themes?

Treat mapping as a matrix. Each semantic token must have a value for each theme (and for each brand if needed). If a semantic token doesn’t have
a value in dark mode, that is a build-time failure, not a runtime surprise.
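A minimal sketch of that matrix check, assuming themes are plain objects keyed by semantic token name. The function name and shapes are hypothetical; the point is that the build can call it and refuse to continue when the result is non-empty.

```typescript
type ThemeMap = { [token: string]: string };

// Returns "theme: token" for every cell of the matrix that has no value.
function findMissingMappings(
  semanticNames: string[],
  themes: { [themeName: string]: ThemeMap }
): string[] {
  const missing: string[] = [];
  for (const themeName of Object.keys(themes)) {
    for (const token of semanticNames) {
      if (!Object.prototype.hasOwnProperty.call(themes[themeName], token)) {
        missing.push(themeName + ": " + token);
      }
    }
  }
  return missing;
}

// Illustrative data: dark has no focus ring mapping.
const missingCells = findMissingMappings(
  ["color.bg", "color.focus.ring"],
  {
    light: { "color.bg": "#ffffff", "color.focus.ring": "#2f5fd0" },
    dark: { "color.bg": "#0f1115" },
  }
);
// A build script would fail here if missingCells is not empty.
```

If the themes are authored in TypeScript, typing each theme as a record over a union of semantic names gets a similar guarantee from the compiler; the runtime check is still useful for JSON sources.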

Decision: How do you handle user preference and persistence?

Use the OS preference as default (prefers-color-scheme), but persist explicit user choices. Make it deterministic:

Order: user override → stored preference → OS preference → default
Apply the theme attribute as early as possible to avoid flashes
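The precedence order can be written as one small pure function, which also makes it easy to test. This is a sketch under assumptions: the ThemeSources shape and field names are hypothetical, and how you read each source (session state, localStorage, a prefers-color-scheme media query) is up to your app.

```typescript
type Theme = "light" | "dark";

interface ThemeSources {
  userOverride?: string | null;     // explicit choice made in this session
  storedPreference?: string | null; // previously persisted choice
  osPrefersDark?: boolean | null;   // result of a prefers-color-scheme query, if known
}

function isTheme(value: unknown): value is Theme {
  return value === "light" || value === "dark";
}

// Order: user override, then stored preference, then OS preference, then default.
function resolveTheme(sources: ThemeSources, fallback: Theme = "light"): Theme {
  const override = sources.userOverride;
  if (isTheme(override)) return override;
  const stored = sources.storedPreference;
  if (isTheme(stored)) return stored;
  if (sources.osPrefersDark === true) return "dark";
  if (sources.osPrefersDark === false) return "light";
  return fallback;
}
```

Invalid stored values fall through to the next source instead of producing an unknown theme name, which keeps a corrupted preference from leaving the page unthemed.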

Performance: prevent theme switch jank

Theme switching touches computed styles. Large DOMs make this expensive. Keep variable sets shallow (root-scoped), avoid per-component inline styles,
and don’t animate everything during a theme swap.

Joke #2: If your theme toggle animates 300 CSS properties, congratulations—you’ve invented a battery drain simulator.

Testing: treat theme as a release surface

A real theming system has:

token completeness tests (no missing mappings),
contrast tests for key pairs (text/background, icons/surfaces, focus rings),
visual regression tests for representative screens,
lint rules that prevent product code from using primitives directly.

Workflows and governance: how tokens don’t devolve into a junk drawer

Tokens fail slowly. First, a team “just needs one special gray.” Then another team adds “slightly different” hover behavior. Six months later,
dark mode looks like a patchwork quilt and nobody can explain why. Governance isn’t bureaucracy; it’s how you keep the system cheap to operate.

Define ownership and the change path

Pick a small group (design systems + one product engineer) as maintainers. Everyone else files changes through a predictable process:

Request: what UI problem are you solving?
Proposed semantic token: why does it deserve to exist?
Mapping: values for light + dark (and brands), including contrast notes
Rollout plan: how do we migrate old usage?

Make “semantic drift” observable

Create a small token usage report in CI:

Which semantic tokens are unused?
Which primitives are referenced outside the token package?
Which components define custom colors instead of using semantics?
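The first question in that report is cheap to answer. The sketch below assumes you can hand it the list of defined CSS variable names and the text of your source files; findUnusedTokens is a hypothetical name, and it only looks at var() references, so tokens consumed some other way would be reported as unused.

```typescript
// Returns defined variables that no source references through var().
function findUnusedTokens(definedVars: string[], sources: string[]): string[] {
  const used: { [name: string]: boolean } = {};
  const pattern = /var\(\s*(--[a-z0-9-]+)/gi;
  for (const source of sources) {
    pattern.lastIndex = 0; // reset the global regex before each file
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      used[match[1]] = true;
    }
  }
  return definedVars.filter((name) => !used[name]);
}

// Illustrative input: one token is referenced, one is not.
const unusedReport = findUnusedTokens(
  ["--color-bg", "--color-link-visited"],
  [".page { background: var(--color-bg); }", ".card { color: var( --color-bg ); }"]
);
```

Treat the output as a prompt for review rather than a delete list: an unused token may simply be waiting on a migration.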

Plan migrations like you plan storage migrations

Storage engineers learn this early: you don’t migrate data by “just changing the path.” You run dual-write, you validate, you roll out gradually.
Tokens are similar. Introduce new semantic tokens, map them, migrate components, deprecate old ones with warnings, and only then remove.
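The "deprecate with warnings" step can be as simple as an alias table that keeps old names working while telling callers where to go. This is a sketch with hypothetical names (Deprecation, resolveDeprecated, the removeIn field); the warning sink is passed in so the same function can feed a console, a lint report, or a CI log.

```typescript
interface Deprecation {
  replacement: string; // the token to use instead
  removeIn: string;    // planned removal version, for the warning text
}

// Follows deprecation aliases to the current token name, warning at each hop.
function resolveDeprecated(
  name: string,
  deprecated: { [oldName: string]: Deprecation },
  warn: (message: string) => void
): string {
  let current = name;
  const seen: string[] = [];
  while (Object.prototype.hasOwnProperty.call(deprecated, current)) {
    if (seen.indexOf(current) !== -1) throw new Error("Deprecation cycle at " + current);
    seen.push(current);
    const entry = deprecated[current];
    warn(current + " is deprecated; use " + entry.replacement + " (removal planned for " + entry.removeIn + ")");
    current = entry.replacement;
  }
  return current;
}

// Illustrative alias table.
const deprecatedTokens: { [oldName: string]: Deprecation } = {
  "color.border": { replacement: "color.border.default", removeIn: "v3" },
};
```

Old usages keep rendering during the migration window, and removal becomes safe once the warning count for a name reaches zero.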

Stop theme bugs at the boundary

Enforce the rule: product code consumes only semantic tokens (or component tokens), not primitives.
If a team needs a new shade, that’s a token change request, not a local hack.
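A boundary rule is only real if something checks it. The sketch below is a stand-in for a lint rule, not a plugin for any specific linter: it scans stylesheet text line by line and reports primitive references and hard-coded hex values. The function name, the Leak shape, and the --color-neutral-N naming are assumptions taken from the examples in this article.

```typescript
interface Leak {
  line: number;
  kind: "primitive" | "hex";
  text: string;
}

// Flags lines that reference neutral primitives or assign a hex color directly.
function findColorLeaks(source: string): Leak[] {
  const leaks: Leak[] = [];
  const primitive = /--color-neutral-[0-9]{1,4}/;
  const hexValue = /:\s*#[0-9a-fA-F]{3,8}\b/; // hex used as a declaration value
  source.split("\n").forEach((text, index) => {
    if (primitive.test(text)) {
      leaks.push({ line: index + 1, kind: "primitive", text: text.trim() });
    } else if (hexValue.test(text)) {
      leaks.push({ line: index + 1, kind: "hex", text: text.trim() });
    }
  });
  return leaks;
}

// Illustrative component stylesheet with two violations.
const sampleStyles = [
  ".tag {",
  "  background: #1d2533;",
  "  color: var(--color-neutral-50);",
  "  border-color: var(--color-border-subtle);",
  "}",
].join("\n");
```

Run it over product code only; the token package itself is the one place where primitives and raw hex values are supposed to appear.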

Practical tasks with commands: inspect, measure, decide (12+)

The quickest way to improve dark mode is to treat it like an operational surface. Measure first. Then decide.
Below are real, runnable tasks you can use in a repo that stores tokens and ships CSS variables.
I’ll show commands, example outputs, what they mean, and the decision you make.

Task 1: Find direct usage of primitive tokens in product code

cr0x@server:~$ rg -n "neutral\.(?:[0-9]{1,4})|--color-neutral-[0-9]{1,4}" apps/ packages/
apps/web/src/components/Banner.css:14:  color: var(--color-neutral-50);
apps/admin/src/pages/Settings.tsx:92:  background: var(--color-neutral-950);

Output meaning: Product code is using primitives directly, bypassing semantics.

Decision: Open a refactor issue: replace with semantic tokens (--color-text-secondary, --color-surface, etc.) and add a lint rule to prevent recurrence.

Task 2: Verify every semantic token has a value in dark theme

cr0x@server:~$ jq -r '.semantic | keys[]' tokens/semantic.json | wc -l
148

cr0x@server:~$ jq -r '.themes.dark.semantic | keys[]' tokens/theme-dark.json | wc -l
146

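Output meaning: The dark theme defines 2 fewer keys than the semantic set, so at least 2 mappings are missing.

Decision: Fail the build until the missing keys are mapped. If the generator emits no CSS variable for a missing key, every var() that references it becomes invalid at computed-value time and the property falls back to its inherited or initial value. Those are the runtime “random colors,” which is how cheap happens.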

Task 3: List the missing semantic keys

cr0x@server:~$ comm -23 \
  <(jq -r '.semantic | keys[]' tokens/semantic.json | sort) \
  <(jq -r '.themes.dark.semantic | keys[]' tokens/theme-dark.json | sort)
color.focus.ring
color.table.row.hover

Output meaning: Two roles lack dark-mode definitions: focus ring and table hover row.

Decision: Define them explicitly for dark. Don’t “borrow” from light mappings.

Task 4: Validate CSS variables actually exist in the built artifact

cr0x@server:~$ npm run build:css
...output...
dist/tokens.css  34.2kb

cr0x@server:~$ rg -n -e "--color-focus-ring" dist/tokens.css | head
211:  --color-focus-ring: #7aa2ff;

Output meaning: The token is present in the built CSS.

Decision: If missing, your build pipeline is dropping tokens or the name differs. Fix the generator or naming mismatch.

Task 5: Check for duplicate or conflicting token definitions

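cr0x@server:~$ rg -n -e "--color-text-primary:" dist/tokens.css
54:  --color-text-primary: #e7eaf0;
912:  --color-text-primary: #f6f7fb;

Output meaning: Two definitions. With two theme scopes you expect exactly one per scope, so check which scope each line falls in. Two hits in the same scope, or a hit outside any theme scope, points to overlapping scopes or a merge mistake.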

Decision: Ensure variables are scoped under [data-theme="dark"] and [data-theme="light"], not duplicated at root.
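Counting by eye gets old quickly, so here is a sketch that counts definitions of one variable per selector. It assumes flat CSS with no nested blocks (which is what a generated token file usually is), and countDefinitionsPerScope is a hypothetical helper, not part of any tool.

```typescript
// Counts how many times varName is defined inside each top-level rule block.
function countDefinitionsPerScope(css: string, varName: string): { [selector: string]: number } {
  const counts: { [selector: string]: number } = {};
  const block = /([^{}]+)\{([^{}]*)\}/g; // selector { body }, no nesting
  let m: RegExpExecArray | null;
  while ((m = block.exec(css)) !== null) {
    const selector = m[1].trim();
    const occurrences = m[2].split(varName + ":").length - 1;
    if (occurrences > 0) counts[selector] = (counts[selector] || 0) + occurrences;
  }
  return counts;
}

// Illustrative artifact: a stray root definition and a duplicate in the dark scope.
const sampleTokensCss = [
  ":root { --color-text-primary: #111111; }",
  '[data-theme="light"] { --color-text-primary: #1a1d24; }',
  '[data-theme="dark"] { --color-text-primary: #e7eaf0; --color-text-primary: #f6f7fb; }',
].join("\n");

const scopeCounts = countDefinitionsPerScope(sampleTokensCss, "--color-text-primary");
```

A healthy result has exactly one entry per theme selector, each with a count of 1, and nothing under :root.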

Task 6: Confirm theme scoping is correct (root attribute only)

cr0x@server:~$ rg -n "\[data-theme=" dist/tokens.css | head -n 20
1:[data-theme="light"] {
401:[data-theme="dark"] {

Output meaning: Only two scopes exist. Good.

Decision: If you see many scattered scopes, consolidate to root to avoid partial theme application and performance hits.

Task 7: Measure token bundle size regression (don’t ship a theme as a novel)

cr0x@server:~$ ls -lh dist/tokens.css
-rw-r--r-- 1 cr0x cr0x 34K Feb  4 09:12 dist/tokens.css

Output meaning: Baseline size is modest.

Decision: If it jumps significantly, investigate token explosion (often from per-component overrides). Size isn’t only bandwidth; it’s parse time.

Task 8: Run a quick contrast audit on critical pairs (script-driven)

cr0x@server:~$ node scripts/contrast-audit.mjs tokens/theme-dark.json | head
PASS color.text.primary on color.bg ratio=12.8
PASS color.text.secondary on color.bg ratio=7.1
FAIL color.text.disabled on color.surface ratio=2.3
FAIL color.focus.ring on color.bg ratio=2.0

Output meaning: Disabled text and focus ring are too low-contrast on their intended surfaces.

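Decision: Adjust the semantic values (not component hacks). Define minimum ratios per role: for example, 4.5:1 for body text and 3:1 for focus rings and other non-text UI, in line with WCAG 2.1 AA. WCAG exempts disabled controls, so set your own floor for cases where disabled text must still be readable.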
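For readers who don’t have an audit script yet, here is a sketch of the core calculation. It is not the script invoked above; it assumes a flat theme object with #rrggbb values and a hand-written list of pairs with per-role minimums, and all names in it are hypothetical. The ratio itself follows the WCAG contrast formula.

```typescript
// Relative luminance of an sRGB color, following the WCAG definition.
function luminance(hex: string): number {
  const m = /^#([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error("Expected #rrggbb, got " + hex);
  const n = parseInt(m[1], 16);
  const lin = [(n >> 16) & 255, (n >> 8) & 255, n & 255].map((v) => {
    const c = v / 255;
    return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * lin[0] + 0.7152 * lin[1] + 0.0722 * lin[2];
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(a: string, b: string): number {
  const la = luminance(a);
  const lb = luminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

interface PairCheck {
  fg: string;  // semantic token used as foreground
  bg: string;  // semantic token it sits on
  min: number; // minimum acceptable ratio for this role
}

// Produces one PASS/FAIL line per pair.
function auditPairs(theme: { [token: string]: string }, pairs: PairCheck[]): string[] {
  return pairs.map((pair) => {
    const ratio = contrastRatio(theme[pair.fg], theme[pair.bg]);
    const verdict = ratio >= pair.min ? "PASS" : "FAIL";
    return verdict + " " + pair.fg + " on " + pair.bg + " ratio=" + ratio.toFixed(1);
  });
}
```

Keeping the pair list explicit is deliberate: it limits the audit to combinations the UI really uses, and each role carries its own threshold.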

Task 9: Detect “accidental pure black/white” creeping in

cr0x@server:~$ rg -n "#000000|#ffffff" tokens/ dist/ packages/ | head
tokens/theme-dark.json:18:    "color.text.primary": "#ffffff"
tokens/theme-dark.json:22:    "color.bg": "#000000"

Output meaning: You’ve got pure black background and pure white text—a classic glare recipe.

Decision: Replace with near-black and off-white, then re-check contrast and perceived comfort.

Task 10: Spot tokens that are identical across themes (often a sign of missing design work)

cr0x@server:~$ node scripts/diff-themes.mjs tokens/theme-light.json tokens/theme-dark.json | head
SAME color.link.visited = #6b7cff
SAME color.chart.grid = #2a2f3a
DIFF color.bg light=#ffffff dark=#0f1115

Output meaning: Some tokens are unchanged across themes. Sometimes that’s correct; often it’s lazy mapping.

Decision: Review each “SAME” token. Links and chart grids rarely behave identically in light and dark contexts.
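The comparison itself is small. The sketch below is one possible shape for such a diff, not the script invoked above: it assumes two flat objects keyed by semantic token name and prints lines in the same SAME/DIFF style. The function name is hypothetical.

```typescript
type FlatTheme = { [token: string]: string };

// Compares tokens present in both themes; completeness is a separate check.
function diffThemes(light: FlatTheme, dark: FlatTheme): string[] {
  const lines: string[] = [];
  for (const token of Object.keys(light).sort()) {
    if (!Object.prototype.hasOwnProperty.call(dark, token)) continue;
    const lightValue = light[token].toLowerCase();
    const darkValue = dark[token].toLowerCase();
    if (lightValue === darkValue) {
      lines.push("SAME " + token + " = " + lightValue);
    } else {
      lines.push("DIFF " + token + " light=" + lightValue + " dark=" + darkValue);
    }
  }
  return lines;
}

// Illustrative input; note the case difference in the grid color.
const themeDiff = diffThemes(
  { "color.bg": "#ffffff", "color.chart.grid": "#2a2f3a" },
  { "color.bg": "#0f1115", "color.chart.grid": "#2A2F3A" }
);
```

Values are lowercased before comparison so that #2A2F3A and #2a2f3a count as the same color rather than hiding an unreviewed token behind a formatting difference.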

Task 11: Verify `prefers-color-scheme` behavior in compiled CSS

cr0x@server:~$ rg -n "prefers-color-scheme" dist/app.css
122:@media (prefers-color-scheme: dark) {

Output meaning: The app has OS preference support.

Decision: Ensure OS preference doesn’t override explicit user selection. If it does, your theme will “flip back” and users will file bugs with capital letters.

Task 12: Check for a flash of the wrong theme (the theming cousin of FOUC) in server-rendered HTML

cr0x@server:~$ curl -sS -D- http://localhost:3000/ | head -n 30
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
...
<html lang="en">
<head>
...
</head>
<body>

Output meaning: HTML is returned, but the <html> tag carries no data-theme attribute, so the theme can only be applied later by client-side JavaScript.

Decision: If theme is applied via late JS only, you’ll get flashes. Fix by setting data-theme server-side or via an inline early script.
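The same check can run in CI against the raw response body. This sketch assumes the attribute lives on the <html> tag (if you put it on <body>, adjust the tag pattern) and uses a plain regular expression rather than an HTML parser, which is good enough for a smoke test. readThemeAttribute is a hypothetical helper.

```typescript
// Returns the data-theme value from the <html> tag, or null if it is absent.
function readThemeAttribute(html: string): string | null {
  const tag = /<html\b[^>]*>/i.exec(html);
  if (!tag) return null;
  const attr = /\bdata-theme\s*=\s*["']([^"']*)["']/i.exec(tag[0]);
  return attr ? attr[1] : null;
}

// Illustrative responses: one themed on the server, one not.
const themedHtml = '<!doctype html><html lang="en" data-theme="dark"><head></head><body></body></html>';
const unthemedHtml = '<!doctype html><html lang="en"><head></head><body></body></html>';
```

A null result on the server response means the first paint depends on JavaScript to pick the theme, which is the condition that produces the flash.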

Task 13: Confirm the theme attribute exists at runtime (headless check)

cr0x@server:~$ node scripts/check-theme-attribute.mjs http://localhost:3000/
OK html[data-theme] present value=dark

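Output meaning: The attribute is present on <html> once the page has loaded. That alone doesn’t prove it was set before first paint, so pair this with the Task 12 check.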

Decision: If missing, fix SSR or early bootstrap. Don’t rely on “it updates quickly.” Users notice.

Task 14: Find “one-off” component overrides that bypass tokens

cr0x@server:~$ rg -n "background:\s*#|color:\s*#" apps/ packages/ | head
apps/web/src/components/Tag.css:8:  background: #1d2533;
apps/web/src/components/Tag.css:9:  color: #cfe1ff;

Output meaning: Hard-coded hex values exist in product components.

Decision: Replace with semantic/component tokens. Hard-coded colors are untestable at scale and will drift between themes.

Fast diagnosis playbook: find the bottleneck fast

When dark mode looks wrong in production, you need a triage path that’s faster than a design review. Here’s the order that finds root causes quickly.

First: confirm the theme selector and scope

Is data-theme set correctly on the root element?
Are the CSS variable definitions scoped only at root theme selectors?
Is there any subtree overriding variables?

If this is wrong, everything else is noise. Fix scope before touching colors.

Second: check token completeness and fallbacks

Any missing semantic keys in dark theme mapping?
Any CSS variables referenced that aren’t defined?
Any unintended fallbacks like var(--x, #fff) shipping to production?

Missing tokens cause “random” behavior that changes across pages and components. It’s not random. It’s undefined behavior with better branding.
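Two of those questions can be answered from the built CSS alone. The sketch below scans a stylesheet for var() references to variables that are never defined in it, and for var() fallbacks that are hard-coded hex colors. It is regex-based and assumes all definitions live in the text you pass in; auditVarUsage and the VarIssue shape are hypothetical.

```typescript
interface VarIssue {
  name: string;
  problem: "undefined" | "hard-coded fallback";
}

// Reports var() references with no matching definition or with a hex fallback.
function auditVarUsage(css: string): VarIssue[] {
  const defined: { [name: string]: boolean } = {};
  const definition = /(--[a-z0-9-]+)\s*:/gi;
  let m: RegExpExecArray | null;
  while ((m = definition.exec(css)) !== null) {
    defined[m[1]] = true;
  }

  const issues: VarIssue[] = [];
  const usage = /var\(\s*(--[a-z0-9-]+)\s*(?:,\s*([^)]*))?\)/gi;
  while ((m = usage.exec(css)) !== null) {
    const name = m[1];
    const fallback = m[2];
    if (!defined[name]) issues.push({ name, problem: "undefined" });
    if (fallback !== undefined && /^#[0-9a-f]{3,8}$/i.test(fallback.trim())) {
      issues.push({ name, problem: "hard-coded fallback" });
    }
  }
  return issues;
}

// Illustrative stylesheet: one healthy reference, one broken one.
const sampleCss = [
  '[data-theme="dark"] { --color-bg: #0f1115; }',
  ".card { background: var(--color-bg); color: var(--color-text-primary, #fff); }",
].join("\n");
```

The hex fallback is the more dangerous of the two findings, because it hides the missing token: the page renders, just with a light-mode color baked into dark mode.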

Third: identify the failing role, not the failing component

Is the problem text legibility? Then inspect color.text.*.
Is the problem hierarchy? Then inspect bg/surface/elevation tokens.
Is the problem state visibility? Then inspect hover/active/focus/disabled roles.

Fix roles at the semantic level. Patching a single component is how you accumulate theme debt.

Fourth: measure contrast and glare against the intended surfaces

Run your contrast audit and review pairs that matter. Don’t waste time on theoretical pairs nobody uses.

Fifth: check rendering and performance symptoms

Theme toggle causes jank? Suspect too many per-node style changes or transitions.
Only some components update? Suspect shadow DOM boundaries, iframes, or nested theme scopes.
Icons look wrong? Suspect asset pipeline and SVG fill/stroke tokens.

Three corporate mini-stories (anonymized, technically accurate)

Mini-story #1: An incident caused by a wrong assumption

A mid-size SaaS company rolled out dark mode in a “platform week.” They had a component library, a token package, and a confident PM.
The plan was simple: map the old light-mode semantic tokens to darker hex values, ship, celebrate.

The wrong assumption: “Borders are just lower-contrast text colors.” In light mode, they’d been using a slightly darker gray for borders and had
accidentally re-used a text token because it looked fine. They carried that shortcut into dark mode.

In production, form fields “disappeared” for a subset of users. Not all users—only those on lower-end laptop panels with poor black levels and
people who had reduced brightness. Support tickets described it as “input boxes missing,” which sounded like layout bugs.

Engineering triaged CSS. Layout was fine. DOM was fine. Then someone toggled a debug overlay that highlighted focusable elements: the inputs were there,
but the border token was mapped to a value too close to the surface in dark mode. The error was systemic: dozens of components depended on that role.

The fix wasn’t “make borders brighter.” The fix was to introduce proper semantic separation: color.border.subtle,
color.border.default, color.text.secondary—and to ban reusing text tokens for borders in code review.
Dark mode didn’t break their UI. Their assumptions did.

Mini-story #2: An optimization that backfired

Another company wanted instant theme switching with zero flash and minimal CSS. An engineer proposed a clever optimization:
generate only one set of CSS variables and “compute” the dark palette on the client by applying a transform to the light palette.
Fewer tokens, less CSS, faster builds. It sounded tidy.

They shipped it behind a flag. It worked… until it didn’t. Brand colors turned ugly because the transform didn’t preserve perceived luminance.
Warning and error colors became ambiguous. Chart colors collided. Focus rings faded into the background on some surfaces.

The operational pain was worse than the aesthetics. Bugs were hard to reproduce because the computed palette depended on runtime math, browser rounding,
and sometimes user zoom. QA couldn’t “diff” the theme because it wasn’t a static artifact.

The postmortem conclusion was blunt: they had optimized the wrong thing. Token count and CSS size were not their bottleneck; correctness and predictability were.
They reverted to explicit theme mappings with a smaller semantic surface area and used conventional minification for size.

The backfire pattern is common: “smart” theming tricks increase hidden complexity. In production systems, hidden complexity bills you later, with interest.

Mini-story #3: A boring but correct practice that saved the day

A larger org ran multiple products on a shared design system. They enforced a rule that felt annoying: every token change required
updating a token snapshot test and a small set of contrast assertions. No exceptions. Engineers grumbled; designers rolled their eyes.

Then a rebrand happened. New accent color. New neutrals. Everyone expected the dark theme to be a mess. The token maintainers merged the new primitives,
updated the semantic mappings, and CI lit up like a dashboard during a bad deploy.

The tests caught two issues immediately: focus ring contrast dropped below the internal threshold on the darkest surface, and disabled text became too faint
on a common table background. Neither was obvious in a quick glance, and both would have become customer-facing bugs within hours.

They fixed the mappings before release. The rollout was boring. Boring is the highest compliment you can pay a theming change in production.

The practice wasn’t glamorous. It was just guardrails: snapshot, contrast checks, and a rule that product code can’t use primitives.
The result: dark mode survived a rebrand with fewer incidents than a typical minor CSS refactor.

Common mistakes: symptom → root cause → fix

1) Symptom: “Everything looks gray and flat”

Root cause: Surface tokens don’t encode elevation; background, surface, and raised surfaces are too similar.

Fix: Define a surface scale: bg, surface, surface-2, surface-3 with measured luminance steps in dark theme. Use borders/shadows sparingly and consistently.

2) Symptom: “Text passes contrast checks but feels like it glows”

Root cause: Pure white text on a pure-black or very dark background; the extreme contrast triggers halation and eye strain.

Fix: Use off-white for primary text and slightly raise background luminance. Keep contrast high enough but not extreme; validate with real reading, not only ratios.

3) Symptom: “Disabled controls are invisible”

Root cause: Disabled tokens were derived by reducing opacity uniformly; on dark surfaces this collapses distinctions.

Fix: Create dedicated semantic tokens for disabled text, icons, and borders. Validate on the darkest common surface and on elevated surfaces.

4) Symptom: “Focus rings disappear on some components”

Root cause: Focus ring token doesn’t account for backgrounds; it’s too close to both surface and accent colors.

Fix: Use a focus ring color with sufficient contrast on both bg and surface. Consider dual rings (outer + inner) using two tokens.

5) Symptom: “Borders look too bright or too heavy”

Root cause: Border tokens were copied from light mode logic or mapped from text tokens.

Fix: Define border roles separately (subtle, default, strong) and tune per theme. Never reuse text tokens for borders.

6) Symptom: “Brand color looks neon on dark”

Root cause: Accent color used unchanged; saturation and perceived brightness explode on dark backgrounds.

Fix: Provide dark-theme-specific accent values. Add semantic tokens for accent-on-surface and accent-on-accent text colors.

7) Symptom: “Only some parts of the page switch theme”

Root cause: Tokens are scoped in multiple places or overridden inside component styles; shadow DOM/iframe boundaries not handled.

Fix: Root-scope variables. For iframes/shadow roots, explicitly pass theme attribute and inject variables consistently.

8) Symptom: “Theme toggle causes jank”

Root cause: Theme change triggers expensive recalculation across large DOM; transitions applied broadly; per-component inline styles.

Fix: Keep variables at root. Reduce transitions during theme change. Avoid updating hundreds of nodes; update one attribute.

Checklists / step-by-step plan

Step-by-step: build a dark theme that holds up

Define semantic roles first. List the roles your UI needs: backgrounds, surfaces, text tiers, borders, icons, states, focus, overlays, charts.
Create a neutral ladder for dark. Pick near-black background and step surfaces upward with measured differences.
Set text tiers explicitly. Primary, secondary, tertiary, disabled. Use off-white for primary.
Define state tokens. Hover/active/focus/disabled for common controls, including subtle variants.
Map semantics to primitives per theme. Do not auto-transform; define values intentionally.
Generate platform outputs. CSS variables + JSON + types from one source.
Enforce consumption rules. Product code cannot reference primitives; only semantics/components.
Add CI checks. Completeness, duplicates, contrast checks for critical pairs, snapshot for token output.
Run visual regression on representative screens. Auth, settings, tables, modals, forms, empty states, error states.
Roll out gradually. Feature flag, measure support tickets, monitor session feedback, then expand.

Checklist: “Does this dark theme look expensive?”

Surfaces show hierarchy without relying on thick borders.
Text is readable for long sessions; no “glow” effect.
Focus is unmissable on every surface and component.
Disabled states are clearly disabled, not invisible.
Error/warning/success are distinct and not painfully saturated.
Charts remain legible; grids and axes don’t vanish.
No hard-coded hex values in product code.
Theme switch is fast and doesn’t flash the wrong theme.

Checklist: token governance that doesn’t turn into a committee

One owner group for token source of truth.
Clear criteria for adding a semantic token.
Deprecation process with warnings and migrations.
Automated report for primitive leakage and unused tokens.
Release notes for token changes that impact UI semantics.

FAQ

1) Should I have separate primitive palettes for light and dark?

For neutrals: yes, usually. For brand accents: often yes. Trying to reuse one primitive set across themes tends to produce muddy neutrals or neon accents.
Keep semantics stable; allow primitives to differ per theme.

2) Are semantic tokens worth the overhead?

If you have more than one team or more than one product, yes. Semantic tokens are how you prevent every component from becoming its own color theory experiment.
They also make theme bugs diagnosable: you fix a role, not 40 components.

3) Can we just use opacity for disabled states?

You can, but it’s fragile in dark mode. Opacity blending depends on what’s behind the element and can collapse contrast unpredictably.
Prefer explicit disabled tokens for text/icons/borders, validated on common surfaces.

4) What’s the best background color for dark mode?

Near-black with a slight tint, chosen alongside your surface steps. The “best” value is the one that supports hierarchy and long-form reading without glare.
If you pick pure black, you’ll spend time compensating everywhere else.

5) How do we handle images and illustrations?

Treat them as assets with theme variants or design them to work on both backgrounds. For SVG icons, prefer currentColor and semantic icon tokens.
For illustrations, decide whether to provide dark variants or keep them in neutral containers.

6) How do we prevent a flash of wrong theme on first load?

Apply data-theme before first paint: server-render it when possible, or run a tiny inline script in the document head that reads stored preference
and sets the attribute immediately.

7) Does WCAG guarantee our dark mode is good?

WCAG helps you not ship unreadable text. It doesn’t guarantee comfort or aesthetic quality. Use contrast checks as guardrails, then evaluate glare, hierarchy,
and state visibility with real screens at realistic brightness.

8) How do we support multiple brands without forking everything?

Keep one semantic layer shared across brands. Provide brand-specific primitive palettes and semantic mappings. If brands need different semantics, challenge it:
often it’s actually a component variant request, not a new semantic contract.

9) Should we animate the theme switch?

Lightly, if at all. A subtle background fade can be pleasant; animating every token change is expensive and can feel like the UI is melting.
Prioritize correctness and speed; animation is optional.

10) What’s the simplest rule to enforce in code review?

“No hard-coded colors in product code.” If a color needs to exist, it must be a token with semantics. This single rule eliminates a huge class of regressions.

Next steps you can actually do this week

Inventory your semantic tokens. Write the glossary: what each role means and what it must never be used for.
Run leakage detection. Find primitives and hard-coded hex in product code; open a migration backlog.
Add two CI checks immediately: (a) semantic completeness per theme, (b) contrast audit for critical pairs.
Fix the top three perception killers: near-black background + off-white text + a visible focus ring on every surface.
Pick one “hero screen” and one “worst screen.” Make them perfect in dark mode. Then scale out, component by component.
Stop doing theme work in components. Theme mapping belongs in tokens. Components consume semantics. Keep the contract clean.

Dark mode that doesn’t look cheap is not a miracle palette. It’s an operationally sound system: semantics, layering, tests, and a refusal to ship undefined behavior.
Build it like you build reliable infrastructure—because at scale, that’s what it is.