NVIDIA/AMD Driver Crash Loop: Clean Install the Right Way (DDU + Safe Mode)

February 22, 2026 • Read: 24 min


When your GPU driver gets into a crash loop, the machine stops being a computer and starts being a slot machine. You boot, the desktop appears, the screen flashes, then—black screen, driver reset, repeat. Sometimes you can’t even log in long enough to click “uninstall.”

This is one of those problems where “just reinstall the driver” sounds reasonable and fails spectacularly. A clean install isn’t one action. It’s a controlled sequence: gather evidence, cut Windows Update out of the loop, use Safe Mode so the driver isn’t actively fighting you, remove the right artifacts (not random ones), then reinstall with verification. Do it once, do it correctly, and you typically don’t see the issue again—unless the hardware is actually dying, which is a different kind of exciting.

What a driver crash loop looks like (and what it isn’t)

A “driver crash loop” is usually the Windows graphics stack repeatedly resetting the display driver. You see:

Screen flicker to black and back, often right after login.
Apps closing with “device removed”/“device hung” errors (DirectX/Vulkan).
Event Viewer logs like “Display driver nvlddmkm stopped responding and has successfully recovered” or AMD equivalents.
Stuck on a black screen with a cursor, or the desktop loads but is unusable.
Fans ramping briefly, then silence, then ramping again.

What it is not (most of the time):

A random game crash with a stable desktop afterward. That’s usually the game, an overlay, or an undervolt.
A full system power-off under load. That’s often PSU, power cabling, VRM, or thermal shutdown.
Consistent artifacts on the BIOS splash screen. That’s often physical GPU memory or core damage.

If you’re getting a loop specifically after a driver update, Windows update, or GPU swap, treat it as an install integrity problem until proven otherwise. The goal is to get you back to a stable baseline driver with minimal superstition.

Joke #1: A GPU driver crash loop is the only treadmill where the computer does the running and you still lose progress.

Fast diagnosis playbook (first/second/third)

First: confirm it’s a driver reset loop, not a power/hardware collapse

Check Event Viewer for display driver resets and TDR events.
Check Reliability Monitor for recurring “Windows Hardware Error Architecture” entries or “LiveKernelEvent.”
Sanity check temperatures if you can stay in Windows for 60 seconds: if the GPU hits thermal limits instantly, you have a cooling or paste problem.

Decision: If the system hard-powers off, reboots without logs, or shows artifacts in BIOS, stop the driver dance and investigate PSU/cables/GPU hardware. If it’s recoverable resets and black flickers, proceed with a clean driver workflow.

Second: stop Windows from “helping”

Disconnect from the network or disable Windows device driver updates temporarily.
Identify if Windows keeps injecting an older driver right after you uninstall.

Decision: If you uninstall and the driver comes back on reboot without you installing it, Windows Update or the DriverStore is reinfecting the system. You must isolate and control that.

Third: isolate the driver stack and overlays

Boot Safe Mode to remove the active vendor driver.
Remove known conflict layers (overlay recorders, RGB kernel drivers, monitoring tools) only after you’ve captured logs.

Decision: If Safe Mode is stable and normal mode is not, you’re looking at a driver/stack issue, not “Windows is broken.” You can fix it with discipline.
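The three decisions above can be read as one small triage function. This is only a sketch of the playbook's reasoning: the `Observations` fields and the returned strings are hypothetical names chosen for illustration, not part of any Windows tool.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Answers to the playbook questions (all field names are hypothetical)."""
    hard_power_off: bool = False                  # powers off or reboots with no logs
    artifacts_in_bios: bool = False               # corruption visible before Windows loads
    driver_returns_after_uninstall: bool = False  # driver reappears on reboot by itself
    safe_mode_stable: bool = True                 # Safe Mode runs without resets


def next_step(obs: Observations) -> str:
    """Map the playbook decisions to a single recommended next action."""
    # First: a power or hardware collapse overrides any driver work.
    if obs.hard_power_off or obs.artifacts_in_bios:
        return "investigate PSU, cables, and GPU hardware"
    # If even Safe Mode is unstable, a driver cleanup alone is unlikely to help.
    if not obs.safe_mode_stable:
        return "suspect hardware or OS corruption"
    # Second: stop Windows from re-injecting a driver before cleaning up.
    if obs.driver_returns_after_uninstall:
        return "isolate Windows Update, then clean install in Safe Mode"
    # Third: Safe Mode stable, normal mode not: a driver/stack issue.
    return "clean install in Safe Mode"
```

Calling `next_step(Observations(driver_returns_after_uninstall=True))` returns the isolate-first recommendation. The ordering of the checks is the design choice here: hardware signals are tested before anything driver-related.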

Interesting facts and context (so the behavior makes sense)

TDR exists to keep your desktop alive. Windows Timeout Detection and Recovery (TDR) resets the GPU driver when the GPU appears hung, instead of forcing a reboot. Great for uptime, confusing during failures.
WDDM changed everything. Starting with Windows Vista, the Windows Display Driver Model moved GPU scheduling and memory management into a more structured model. It improved stability overall, but made partial driver leftovers more “interesting.”
Windows keeps a driver warehouse. The DriverStore caches driver packages so Windows can reapply them. This is fantastic when your NIC driver disappears—less fantastic when it keeps resurrecting a broken display package.
DDU became popular because uninstallers are not surgeons. Vendor uninstallers often leave registry keys, services, driver packages, and settings intended for “smooth upgrades.” Smooth upgrades are exactly what you don’t want in a crash loop.
“Clean install” inside NVIDIA’s installer is not DDU. It resets some settings and profiles, but it does not fully disinfect the DriverStore or remove every artifact.
Hybrid graphics adds complexity. Laptops with iGPU + dGPU (Optimus, Advanced Optimus, AMD Switchable Graphics) can fail in ways desktops never will—wrong device gets the primary path, wrong power state, wrong mux mode.
Hardware-accelerated GPU scheduling (HAGS) is relatively new. It can be beneficial, but it adds one more moving part in the pipeline. When things are unstable, fewer moving parts is a valid strategy.
“Studio” vs “Game Ready” is mostly packaging and validation cadence. It’s not magic. But switching branches can avoid a bad regression when one track ships a bug first.

Decision points: software, configuration, or hardware?

You don’t want to spend three hours doing a perfect DDU cleanup if the GPU is physically failing. Conversely, you don’t want to RMA a GPU because Windows Update reinstalled a mismatched driver.

Signals that it’s probably software/config

Safe Mode is stable; normal mode flickers and resets.
The problem started immediately after a driver update or Windows update.
Event Viewer shows repeated display driver resets (TDR) without hard resets.
Switching to Microsoft Basic Display Adapter stops the loop.
Different driver version behaves differently (even if both are imperfect).

Signals that it’s probably hardware/power

Artifacts in BIOS/UEFI or before Windows loads.
System loses power under load (instant off) with no useful logs.
Driver crash loop persists across a clean OS install.
GPU temperatures or hotspot spike abnormally at idle, or fans fail.
Changing PSU/cables/slot changes symptoms more than driver changes do.

There’s also the gray zone: unstable undervolts/overclocks, flaky RAM, or a borderline PSU can manifest as “driver crashes” because the driver is the component forced to handle the failure. Drivers get blamed because they’re on the scene of the crime.

One idea worth keeping on a sticky note, paraphrasing Werner Vogels: everything fails, so design and operate accordingly. That’s exactly how you should treat GPU drivers in a production workstation: assume they can fail and build a recovery path.

Practical tasks with commands (12+), outputs, and decisions

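These tasks are designed for Windows 10/11 and are written for an elevated Command Prompt. In PowerShell, two adjustments apply: run the body of each powershell -Command "..." wrapper directly (the outer double quotes would otherwise expand $_ too early), and quote bcdedit’s identifier as '{current}', because bare braces are parsed as a script block. The code blocks show a Linux-style prompt; the commands themselves are Windows-native.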

Task 1: Confirm GPU and current driver version (Device Manager equivalent via PowerShell)

cr0x@server:~$ powershell -NoProfile -Command "Get-CimInstance Win32_VideoController | Select-Object Name,DriverVersion,DriverDate | Format-List"

Name          : NVIDIA GeForce RTX 3080
DriverVersion : 31.0.15.5161
DriverDate    : 12/01/2023 00:00:00

What it means: You’re seeing the active driver that Windows believes is loaded.

Decision: If the version isn’t what you installed (or changes after reboot), Windows Update or another package is overwriting it.
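For NVIDIA cards, the Windows-style DriverVersion can be mapped to the release number shown on the installer: take the last five digits and put a dot before the final two. The helper below is a minimal sketch of that conversion. The function name is made up for illustration, and the rule applies to NVIDIA only (AMD packages use a different scheme).

```python
def nvidia_release_from_windows_version(driver_version: str) -> str:
    """Convert a Windows-style NVIDIA driver version to its release number.

    The release number is the last five digits of the dotted version with a
    dot inserted before the final two digits.
    """
    digits = driver_version.replace(".", "")
    if not digits.isdigit() or len(digits) < 5:
        raise ValueError(f"unexpected driver version: {driver_version!r}")
    tail = digits[-5:]
    return f"{tail[:3]}.{tail[3:]}"


# The sample output above, 31.0.15.5161, corresponds to release 551.61.
print(nvidia_release_from_windows_version("31.0.15.5161"))
```

This makes it easier to confirm that the version Windows reports matches the installer file you actually ran.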

Task 2: Find display driver resets (Event Viewer via wevtutil)

cr0x@server:~$ wevtutil qe System /q:"*[System[(EventID=4101)]]" /c:5 /f:text

Event[0]:
  Log Name: System
  Source: Display
  Event ID: 4101
  Level: Warning
  Description:
  Display driver nvlddmkm stopped responding and has successfully recovered.

What it means: Event ID 4101 is the classic TDR recovery symptom for display drivers.

Decision: If 4101 repeats in tight intervals after boot/login, you’re dealing with a crash loop, not a one-off app crash.
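"Tight intervals" can be made concrete with a sliding-window check over the event timestamps. In the sketch below, the thresholds (three resets within ten minutes) are illustrative assumptions, not values from Windows or either vendor, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def is_crash_loop(timestamps, min_events=3, window=timedelta(minutes=10)):
    """Return True if at least `min_events` resets fall inside one `window`.

    `timestamps` is any iterable of datetime objects for Event ID 4101 entries.
    The default thresholds are assumptions; tune them to your environment.
    """
    ordered = sorted(timestamps)
    # Compare each event with the one (min_events - 1) positions later.
    for i in range(len(ordered) - min_events + 1):
        if ordered[i + min_events - 1] - ordered[i] <= window:
            return True
    return False


boot = datetime(2026, 2, 4, 9, 12, 0)
burst = [boot, boot + timedelta(seconds=40), boot + timedelta(minutes=2)]
print(is_crash_loop(burst))
```

A single reset, or resets spread hours apart, would not trip this check, which matches the distinction between a loop and a one-off app crash.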

Task 3: Check for LiveKernelEvent / WHEA hints (Reliability Monitor data via WMI)

cr0x@server:~$ powershell -NoProfile -Command "Get-CimInstance Win32_ReliabilityRecords | Where-Object { $_.SourceName -match 'Windows' -or $_.SourceName -match 'Hardware' } | Select-Object -First 5 TimeGenerated,SourceName,ProductName,Message | Format-List"

TimeGenerated : 2/4/2026 9:12:10 AM
SourceName    : Windows
ProductName   : Windows
Message       : The Desktop Window Manager process has exited.

What it means: DWM exiting repeatedly often correlates with GPU driver instability.

Decision: If you see WHEA corrected errors alongside GPU resets, consider PSU/RAM/PCIe stability as contributors.

Task 4: Identify driver packages in DriverStore (pnputil)

cr0x@server:~$ pnputil /enum-drivers | findstr /i "nvidia amd display"

Published Name : oem42.inf
Original Name  : nv_dispi.inf
Provider Name  : NVIDIA
Class Name     : Display adapters

Published Name : oem17.inf
Original Name  : u0397489.inf
Provider Name  : Advanced Micro Devices, Inc.
Class Name     : Display adapters

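What it means: DriverStore contains one or more display driver packages, sometimes from multiple vendors if you swapped GPUs. Because findstr prints only the lines that match, the listing above is abridged; run pnputil /enum-drivers unfiltered to read the Published Name (oemNN.inf) that belongs to each match.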

Decision: If stale packages exist for the wrong vendor, plan to remove them during cleanup to prevent “driver resurrection.”
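To pair each display package with its oemNN.inf name without reading the full listing by eye, the pnputil output can be parsed into records. This is a rough sketch that assumes English field labels like those in the sample above and blank lines between packages; the function names are hypothetical.

```python
def parse_pnputil_drivers(text):
    """Split `pnputil /enum-drivers` output into one dict per driver package."""
    packages, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current package record.
            if current:
                packages.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore banner lines that have no "label: value" shape
            current[key.strip()] = value.strip()
    if current:
        packages.append(current)
    return packages


def display_packages(text):
    """Return (published name, provider) for every display-class package."""
    return [
        (pkg.get("Published Name"), pkg.get("Provider Name"))
        for pkg in parse_pnputil_drivers(text)
        if pkg.get("Class Name", "").startswith("Display")
    ]
```

Feed it the saved output of an unfiltered `pnputil /enum-drivers` run. Any provider in the result that does not match the installed GPU is a candidate stale package.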

Task 5: Remove a specific display driver package (carefully)

cr0x@server:~$ pnputil /delete-driver oem42.inf /uninstall /force

Driver package deleted successfully.

What it means: This removes the package from DriverStore and uninstalls devices using it.

Decision: If deletion fails due to “in use,” you’re not cleanly detached—use Safe Mode, or disable the device first.

Task 6: Confirm Windows is not auto-installing drivers (Device Installation Settings via registry)

cr0x@server:~$ reg query "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\DriverSearching" /v SearchOrderConfig

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\DriverSearching
    SearchOrderConfig    REG_DWORD    0x1

What it means: 1 typically means Windows is allowed to search Windows Update for drivers.

Decision: For a controlled rebuild, temporarily set this to 0 and/or disconnect network before reinstall.

Task 7: Disable automatic driver updates via registry (temporary control)

cr0x@server:~$ reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\DriverSearching" /v SearchOrderConfig /t REG_DWORD /d 0 /f

The operation completed successfully.

What it means: Windows should stop fetching drivers automatically from Windows Update.

Decision: Do this before cleanup/reinstall, then revert later if your environment requires it.
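Since the change is meant to be temporary, it helps to record the original value before overwriting it. The sketch below parses the `reg query` output from Task 6 and builds the matching `reg add` command to restore it later. Both function names are hypothetical, and the parser assumes the REG_DWORD output shape shown above.

```python
import re


def parse_search_order_config(reg_output):
    """Extract SearchOrderConfig from `reg query` output as an int, or None."""
    match = re.search(
        r"SearchOrderConfig\s+REG_DWORD\s+(0x[0-9a-fA-F]+)", reg_output
    )
    return int(match.group(1), 16) if match else None


def restore_command(original_value):
    """Build the `reg add` command that puts the saved value back."""
    key = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\DriverSearching"
    return (
        f'reg add "{key}" /v SearchOrderConfig '
        f"/t REG_DWORD /d {original_value} /f"
    )


saved = parse_search_order_config(
    "    SearchOrderConfig    REG_DWORD    0x1"
)
print(restore_command(saved))
```

Keeping the generated command in a note next to the downloaded installer makes the revert step hard to forget.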

Task 8: Verify Safe Mode boot configuration (bcdedit)

cr0x@server:~$ bcdedit /enum | findstr /i safeboot

safeboot                  Minimal

What it means: The system is configured to boot into Safe Mode (Minimal). No output means the safeboot flag is not set and the next boot is normal.

Decision: Use this when you can’t reliably click through Advanced Startup options.

Task 9: Set Safe Mode for next boot (then reboot)

cr0x@server:~$ bcdedit /set {current} safeboot minimal

The operation completed successfully.

What it means: Next boot goes to Safe Mode.

Decision: After cleanup, remove the safeboot flag or you’ll keep booting into Safe Mode and wonder why your audio is missing.

Task 10: Remove Safe Mode boot flag (return to normal boot)

cr0x@server:~$ bcdedit /deletevalue {current} safeboot

The operation completed successfully.

What it means: Restores normal boot behavior.

Decision: Run this after DDU and after you’re ready to install the new driver.
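Because a forgotten safeboot flag is easy to misread as a failed install, a small guard over the `bcdedit /enum` output can be useful in a scripted workflow. This is a minimal sketch: the function name is hypothetical, and it only inspects text you have already captured.

```python
def safeboot_mode(bcdedit_output):
    """Return the safeboot value (for example 'Minimal') or None if unset."""
    for line in bcdedit_output.splitlines():
        parts = line.split()
        # The entry looks like: "safeboot                  Minimal"
        if len(parts) >= 2 and parts[0].lower() == "safeboot":
            return parts[1]
    return None


captured = "safeboot                  Minimal"
if safeboot_mode(captured) is not None:
    print("Safe Mode flag is still set; clear it before judging the install.")
```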

Task 11: Check if Windows is using Microsoft Basic Display Adapter (good baseline)

cr0x@server:~$ powershell -NoProfile -Command "Get-PnpDevice -Class Display | Format-Table -AutoSize Status,Class,FriendlyName,InstanceId"

OK     Display  Microsoft Basic Display Adapter  PCI\VEN_10DE&DEV_2206&SUBSYS...

What it means: You’re on the generic driver. Ugly resolution, but stable enough to work.

Decision: If Basic Display Adapter is stable, your crash loop is almost certainly in the vendor driver or settings layer.
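Even when the friendly name says Microsoft Basic Display Adapter, the InstanceId still carries the PCI vendor ID of the physical GPU, so you can confirm which vendor's driver to install. The sketch below decodes the three common vendor IDs. The function name is hypothetical, and the table is deliberately limited to NVIDIA, AMD, and Intel.

```python
import re

# Well-known PCI vendor IDs for the GPU vendors discussed here.
PCI_VENDORS = {"10DE": "NVIDIA", "1002": "AMD", "8086": "Intel"}


def gpu_vendor(instance_id):
    """Return the vendor name encoded in a PCI InstanceId, or None."""
    match = re.search(r"VEN_([0-9A-Fa-f]{4})", instance_id)
    if not match:
        return None
    return PCI_VENDORS.get(match.group(1).upper(), "unknown")


# The sample InstanceId above starts with PCI\VEN_10DE, i.e. an NVIDIA device.
print(gpu_vendor(r"PCI\VEN_10DE&DEV_2206"))
```

On hybrid laptops, expect two display devices with different vendor IDs; check each InstanceId separately.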

Task 12: Capture installed GPU-related software that can hook the stack

cr0x@server:~$ powershell -NoProfile -Command "Get-ItemProperty HKLM:\Software\Microsoft\Windows\CurrentVersion\Uninstall\*,HKLM:\Software\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall\* | Where-Object { $_.DisplayName -match 'NVIDIA|AMD|Radeon|GeForce|Afterburner|Rivatuner|Overlay' } | Select-Object DisplayName,DisplayVersion | Sort-Object DisplayName | Format-Table -AutoSize"

AMD Software                         24.1.1
MSI Afterburner                      4.6.5
RivaTuner Statistics Server          7.3.5

What it means: You have potential hook/overlay tools installed.

Decision: Don’t uninstall everything blindly. But if a clean driver install still loops, remove overlays and monitoring tools next.

Task 13: Check system file integrity (because crashes corrupt things)

cr0x@server:~$ sfc /scannow

Beginning system scan. This process will take some time.
Windows Resource Protection found corrupt files and successfully repaired them.

What it means: Windows system files were damaged and repaired.

Decision: If SFC repairs files, follow with DISM repair; unstable drivers can leave the OS in a half-broken state.

Task 14: Repair Windows component store (DISM)

cr0x@server:~$ DISM /Online /Cleanup-Image /RestoreHealth

Deployment Image Servicing and Management tool
Version: 10.0.22621.1
The restore operation completed successfully.

What it means: The component store is healthy again.

Decision: If DISM fails repeatedly, you may be dealing with deeper OS corruption—fix that before blaming the GPU driver.

Task 15: Check if the GPU is throwing WHEA PCIe errors (hardware signal)

cr0x@server:~$ wevtutil qe System /q:"*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger'] and (EventID=17 or EventID=18)]]" /c:5 /f:text

Event[0]:
  Provider Name: Microsoft-Windows-WHEA-Logger
  Event ID: 17
  Level: Warning
  Description:
  A corrected hardware error has occurred.

What it means: Corrected hardware errors can correlate with PCIe instability, marginal PSU, riser cables, or aggressive overclocks.

Decision: If WHEA events spike during driver resets, consider removing risers, reseating the card, updating BIOS/chipset, and backing off overclocks/undervolts.
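One way to test whether WHEA events "spike during driver resets" is to line up the two sets of timestamps and keep the WHEA entries that land close to a reset. The following is a sketch only: the 60-second window is an assumed threshold and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def whea_near_resets(reset_times, whea_times, window=timedelta(seconds=60)):
    """Return WHEA timestamps that fall within `window` of any driver reset.

    `reset_times` are Event ID 4101 timestamps, `whea_times` are WHEA-Logger
    timestamps. The default window is an assumption, not a documented value.
    """
    return [
        w for w in whea_times
        if any(abs(w - r) <= window for r in reset_times)
    ]


resets = [datetime(2026, 2, 4, 9, 12, 10)]
whea = [datetime(2026, 2, 4, 9, 12, 5), datetime(2026, 2, 4, 14, 0, 0)]
print(len(whea_near_resets(resets, whea)))
```

If most WHEA entries cluster around resets, that supports the hardware or PCIe stability line of investigation. If they are unrelated in time, the driver stack remains the main suspect.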

Checklists / step-by-step plan (DDU + Safe Mode)

Principles (the rules that prevent rework)

Control the environment: no Windows Update surprise installs mid-cleanup.
Remove drivers when they are not active: Safe Mode reduces file locks and running services.
Reboot at the right times: not constantly, not never.
Change one variable at a time: don’t “also tweak registry TdrDelay and undervolt” on the same pass.

Pre-flight checklist (5 minutes, saves hours)

Download the correct driver installer for your GPU and OS beforehand. Put it on the desktop. You’re going offline later.
Download DDU (Display Driver Uninstaller) beforehand and extract it to a known folder (e.g., C:\Tools\DDU).
Note your current GPU tuning settings (Afterburner/Adrenalin) and reset to defaults if you can. If you can’t, don’t panic—we can still clean install.
Disconnect from the network (unplug Ethernet, disable Wi‑Fi) or set driver searching to off (Task 7). Do both if Windows has been particularly “helpful.”
Create a restore point if the system is stable enough. Not because restore points are perfect, but because they’re cheap insurance.

Step-by-step: the clean install workflow that actually works

Force Safe Mode for next boot (if you can’t reliably reach Advanced Startup).
```
cr0x@server:~$ bcdedit /set {current} safeboot minimal

The operation completed successfully.
```
Decision: If this fails with access denied, your shell is not elevated. Fix that first.
Reboot into Safe Mode.

In Safe Mode, the system should use a basic driver and be less crash-happy. If Safe Mode itself crashes, shift your suspicion toward hardware or severe OS corruption.
Run DDU as Administrator and choose the right device type (GPU) and vendor (NVIDIA or AMD).

Settings you generally want: prevent downloads of drivers from Windows Update (DDU can set policies). This is one of those times where a tool being bossy is a feature.

Action: “Clean and restart” is the normal choice. “Clean and shutdown” is useful if you’re swapping GPUs.
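After DDU restarts, remove the Safe Mode flag so the next boot is normal. If the restart landed in Safe Mode again because the flag was still set, clear it there and reboot once more.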
```
cr0x@server:~$ bcdedit /deletevalue {current} safeboot

The operation completed successfully.
```
Decision: If you forget this, you’ll keep booting into Safe Mode and misdiagnose “driver didn’t install.” It installed; you’re just in Safe Mode.
Stay offline and boot into normal mode.

At this point, Windows should be running on Microsoft Basic Display Adapter. Resolution may be wrong. That’s fine. Stability is the goal.

Verify:
```
cr0x@server:~$ powershell -NoProfile -Command "Get-PnpDevice -Class Display | Select-Object -ExpandProperty FriendlyName"

Microsoft Basic Display Adapter
```
Install the driver you downloaded, not whatever Windows wants to fetch.
- NVIDIA: consider “Driver only” if you’re troubleshooting, and skip GeForce Experience until stable.
- AMD: choose a “Factory Reset” option only if you are not already using DDU (DDU already did the heavy lifting). If you did DDU, you typically don’t need both.
Decision: If the crash loop returns during install, cancel and reboot; then try a different known-good driver version (often one branch back). This is where “latest” is not a virtue.

Reboot once after installation.

Don’t stack additional changes yet. First, confirm stability and correct driver version:

cr0x@server:~$ powershell -NoProfile -Command "Get-CimInstance Win32_VideoController | Select-Object Name,DriverVersion | Format-Table -AutoSize"

Name                       DriverVersion
----                       -------------
NVIDIA GeForce RTX 3080    31.0.15.5161

Re-enable network and confirm Windows does not overwrite the driver.

Wait a few minutes. Reboot once more. Re-check driver version. If it changed, you have a driver update policy problem to solve (see Common mistakes).
Only after stability: add back software layers.

Overlays, monitoring, RGB drivers, capture tools—add them back one by one. Yes, it’s tedious. That’s why it works.
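The persistence check in the workflow above (install, reboot, re-enable network, reboot again, compare versions) can be written down as a tiny comparison over the versions you record after each reboot. This is a sketch: the function name and return strings are hypothetical, and the versions are whatever the Task 1 command reports.

```python
def check_persistence(installed_version, versions_after_reboots):
    """Report whether the installed driver version survived every reboot.

    `versions_after_reboots` lists the DriverVersion observed after each
    reboot, in order, with the network re-enabled.
    """
    if not versions_after_reboots:
        # No observations means persistence has not been demonstrated.
        raise ValueError("record at least one post-reboot version")
    for boot, version in enumerate(versions_after_reboots, start=1):
        if version != installed_version:
            return f"drifted on reboot {boot}: {version}"
    return "persisted"


print(check_persistence("31.0.15.5161", ["31.0.15.5161", "31.0.15.5161"]))
```

A "drifted" result points at driver update policy rather than at the driver itself, so fix that before testing other versions.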

Joke #2: Safe Mode is like a fire drill for your PC: everything looks worse, but you can finally see who’s starting the smoke.

Three corporate mini-stories from the trenches

Mini-story 1: The incident caused by a wrong assumption

A media team had a couple of high-end Windows workstations used for color grading and GPU-accelerated encoding. A driver update went out as part of “routine patching.” The next morning, one box started flickering to black every few seconds. The operator did what everyone does: reinstalled the newest driver again, because “maybe it was corrupted.” It got worse.

The wrong assumption: the installer is authoritative. They assumed that if the installer completed, the system was running that version. In reality, Windows Update had a display driver in its queue and kept racing the vendor installer. After each reboot, the system landed on a different driver build with different components. The user experienced it as randomness. Operations experienced it as a support ticket that reproduced only when no one was looking.

We pulled a short log set: Event ID 4101 bursts after login, driver version flipping between boots, and multiple OEM INF packages in DriverStore for both NVIDIA and an old AMD card that had been used six months earlier. No one remembered that old swap. The box did.

The fix was not heroic: isolate the workstation from the network, boot Safe Mode, run DDU, remove stale driver packages, install a known-stable branch driver, then reintroduce network connectivity only after confirming the driver version stayed put. The loop vanished. The most valuable step was the least glamorous: stopping Windows from “helping.”

The postmortem lesson was blunt: you don’t have a driver version until you can prove it persists across a reboot with Windows Update enabled. Anything else is vibes.

Mini-story 2: The optimization that backfired

A finance department had GPU-accelerated charting on trading desktops. Someone read that “disabling TDR improves performance” for long GPU compute tasks and decided to standardize a registry change to increase TDR delays. It wasn’t malicious. It was the classic “make the error go away by hiding it.”

It worked for exactly one week. Then a subset of machines started freezing hard instead of recovering. Before the change, a GPU hang would trigger a driver reset, the app would crash, and the user would reopen it. Annoying, but survivable. After the change, the OS waited longer before deciding the GPU was hung. That meant the whole desktop remained unresponsive for longer stretches. People interpreted it as “the PC is dead,” and they power-cycled—often in the middle of disk writes.

The second-order effect was worse: forced power-offs led to occasional file system repairs on boot, profile corruption, and one machine that got stuck in an automatic repair loop. The “performance tweak” had turned a recoverable app-level fault into a system reliability problem.

We rolled the TDR tweak back, returned to default timeouts, then fixed the actual cause: a specific driver version + overlay combination that triggered hangs during rapid multi-monitor mode switches. The workstations went back to predictable behavior: if a hang happened, it recovered quickly, logged clearly, and didn’t encourage users to yank the power cord.

Lesson: don’t tune failure detection mechanisms until you understand what they’re detecting. TDR is not your enemy; it’s your parachute.

Mini-story 3: The boring practice that saved the day

An engineering org maintained a lab of Windows machines used to validate GPU-accelerated builds. Nothing glamorous. Just a set of boxes that needed to be stable and repeatable. The team had a habit that felt bureaucratic: every machine had a “driver baseline sheet” listing GPU model, driver branch, exact installer file name, and the date it was certified for the lab.

One afternoon, several machines started showing driver resets after a routine Windows cumulative update. Panic tried to happen. But the baseline sheet made it boring. They compared driver versions, saw that two machines had drifted to a different driver build, and quickly identified that Windows Update had pushed a new display driver to those two only.

The response was simple and fast: isolate network, Safe Mode, DDU, reinstall baseline driver, and reapply a policy to block automatic driver updates for that device class. The “fix” took under an hour because the team wasn’t debating what “good” looked like. They already had a known-good state and the means to return to it.

That’s the unsexy truth of reliability: a clean rollback path is worth more than a thousand clever tweaks. The baseline sheet didn’t prevent the issue, but it prevented thrash. In production, that’s a win.

Common mistakes: symptom → root cause → fix

1) Symptom: driver reinstalls itself after you uninstall

Root cause: Windows Update and/or DriverStore contains a display driver package that auto-applies on boot.

Fix: Disconnect network; set SearchOrderConfig to 0; use DDU in Safe Mode; remove stale packages with pnputil. Verify driver version persists across reboot with network re-enabled.

2) Symptom: Safe Mode is stable, normal mode crash loops

Root cause: Vendor driver stack, settings, or hook software (overlay/monitoring/RGB) triggers resets.

Fix: DDU in Safe Mode; clean install driver; delay installing overlays; test stability between changes.

3) Symptom: black screen after install, but system is “alive” (RDP works)

Root cause: Display output path/mode issue (multi-monitor EDID, refresh rate, HDR, cable/port negotiation) or bad default resolution on reboot.

Fix: Boot into Safe Mode and remove the driver; boot normal with Basic Display Adapter; reconnect one monitor on a known-good port/cable; install driver; then add monitors back.

4) Symptom: “nvlddmkm” or AMD driver resets only when launching games

Root cause: unstable OC/UV, bad shader cache state, overlay conflict, or driver regression with a specific API path.

Fix: reset GPU to stock; clear shader caches via driver UI; reinstall driver; disable overlays; if still present, roll back to the previous driver branch.

5) Symptom: random flicker + resets after Windows update, especially with multiple monitors

Root cause: update changed graphics subsystem behavior; HAGS/VRR/HDR interactions; monitor firmware quirks.

Fix: turn off HAGS and VRR temporarily; test single-monitor; reinstall stable driver; update monitor firmware if applicable.

6) Symptom: crash loop persists even after DDU and reinstall

Root cause: deeper OS corruption, conflicting kernel drivers, or actual hardware instability (PCIe, PSU, GPU).

Fix: run SFC/DISM; check WHEA events; reseat GPU; remove risers; test different PSU/cables; run memory test; consider clean OS install or hardware RMA if BIOS artifacts appear.

7) Symptom: system hard-reboots or powers off under GPU load

Root cause: power delivery issue, PSU transient handling, cable/connector issue, or VRM/thermal protection.

Fix: separate PCIe power cables (no daisy-chain on high-end cards), verify connectors seated, reduce power limit to test, check PSU capacity/quality, inspect temps and hotspots.

8) Symptom: “clean install” option used, but issues remain

Root cause: vendor “clean install” is not a full removal; leftovers remain in DriverStore/services/settings.

Fix: Use DDU in Safe Mode and control Windows Update. Treat vendor clean install as a convenience feature, not a remediation tool.

Operational mindset: why DDU + Safe Mode is the right default

In SRE terms, a driver crash loop is a flapping dependency. The display driver is repeatedly failing, Windows is repeatedly recovering, and your workstation is stuck in a partial outage state. The instinct is to “do something” repeatedly—reinstall, reboot, reinstall again—until it magically stabilizes.

That approach fails because the system isn’t deterministic during the loop. Files are locked. Services are mid-start. Windows Update is racing you. Settings are half-applied. It’s like trying to replace a disk while the RAID controller keeps re-adding the same bad member from a closet.

Safe Mode reduces the number of active components. DDU reduces the number of leftover artifacts. Combined, they create a predictable maintenance window where you can actually make the system converge to a known state.

DDU workflow details that matter (and the parts people skip)

Offline is not optional (if you’ve seen driver “resurrection”)

If Windows can reach Windows Update, it can fetch a driver at the exact wrong moment—between your uninstall and your reinstall. That can leave you with mismatched components: control panel from one version, driver core from another, audio driver from a third. This is how you get weirdness like HDMI audio missing, or the control panel refusing to open, or the driver resetting when you open the settings UI.

Practical guidance: unplug Ethernet. Disable Wi‑Fi. Then do the work. If you’re in a corporate environment where that’s hard, use a policy-based driver update block, and validate it.

Remove only what you intend to remove

DDU is powerful. So is pnputil /delete-driver. Power is not the same thing as wisdom.

If you’re on NVIDIA, don’t go deleting chipset drivers because “they also said NVIDIA.” NVIDIA makes chipset packages for some platforms; deleting the wrong thing can break storage or networking.
If you’re on AMD, remember AMD touches both GPU and chipset ecosystems. Be precise about what you’re removing.

The target is display driver packages and their services/settings—nothing more. We’re disinfecting the graphics stack, not performing an exorcism.

Reboots are part of state convergence

People either reboot after every click or avoid reboots like they cost money. The right number is: reboot when the cleanup tool tells you to reboot, and reboot once after installation. Add one more reboot if you’re validating “does the driver persist across boot with network enabled?” That’s it.

Keep a known-good driver handy

“Latest” is great for features and game fixes. “Known-good” is great for work. If you’re troubleshooting, pick stability first. Use a driver version you’ve run before without issues, or one validated by your org.

FAQ

1) Do I really need DDU? Can’t I just uninstall from Apps & Features?

If you’re in a crash loop, usually yes. App uninstallers don’t reliably remove DriverStore packages, services, and settings in a way that prevents reapplication. DDU in Safe Mode gives you a clean baseline.

2) Is NVIDIA “Clean installation” checkbox enough?

No. It resets some settings and profiles, but it’s not the same as removing driver packages and preventing Windows from reinstalling something else. Use it as a convenience feature after you’re stable, not as your primary remediation.

3) Should I install GeForce Experience / AMD overlay tools while troubleshooting?

Not at first. Get the driver stable with minimal extras. Then add management tools back if you need them. Every overlay and recorder is another hook into the graphics pipeline.

4) What if Safe Mode still crash loops?

That’s a red flag. Possible causes: severe OS corruption, failing storage, or hardware instability bad enough that even basic display paths trigger issues. Run SFC/DISM, check WHEA logs, and consider hardware checks (reseat GPU, PSU/cabling, RAM test).

5) Do I need to change TDR registry keys like TdrDelay?

Usually no. TDR tweaks are a last resort for specific workloads (e.g., long GPU compute kernels) and can turn recoverable hangs into long freezes. Fix the root cause first: driver version, overlays, power/thermals, or OS integrity.

6) Why does the crash loop happen right after login?

Login triggers a bunch of GPU-accelerated activity: DWM composition, startup apps, overlays, multi-monitor configuration, HDR/VRR negotiation, and power state changes. If the driver is fragile, that spike is where it shows.

7) I switched from AMD to NVIDIA (or vice versa). Is that special?

Yes. Cross-vendor swaps commonly leave driver packages and services behind. Windows can also keep old packages in DriverStore. DDU plus DriverStore cleanup is the correct move, and “clean and shutdown” in DDU is helpful if you’re physically swapping cards.

8) How do I know if Windows Update is overwriting my driver?

Check your driver version (Task 1), reboot with network enabled, and check again. If it changes without you installing anything, Windows Update or device install policy is doing it. Fix that before you keep testing driver versions, or you’ll be benchmarking chaos.

9) Can a bad HDMI/DisplayPort cable cause what looks like a driver crash loop?

A bad cable usually causes flicker, dropouts, or no signal—not repeated driver reset events. But multi-monitor negotiation problems can look similar. If logs show TDR resets, it’s more than a cable; still, simplifying to one monitor on a known-good cable is a good isolation step.

10) Should I do a full Windows reinstall?

Only after you’ve done the controlled DDU workflow and verified Windows isn’t reintroducing drivers. If you still crash loop on a clean driver with no overlays and OS integrity checks pass, a clean OS install can separate software rot from hardware reality.

Conclusion: next steps that actually stick

If your system is stuck in an NVIDIA/AMD driver crash loop, stop improvising. The reliable path is:

Prove it’s a driver reset loop (Event ID 4101, DWM exits, Reliability Monitor).
Cut off driver reinfection (offline + disable automatic driver searching temporarily).
Boot Safe Mode and run DDU to remove the active vendor stack cleanly.
Boot normal on Basic Display Adapter, install a known-good driver you already downloaded, then reboot once.
Re-enable network and verify the driver version persists across reboot.
Add overlays/tuning tools back one at a time. If instability returns, you just found your culprit.

When you do it this way, the system converges. When you don’t, you end up “testing” different driver versions while Windows Update swaps them underneath you and an overlay injects itself into every graphics API call. That’s not troubleshooting; that’s performance art.

If, after a disciplined clean install, the loop persists—and especially if you see WHEA errors or BIOS artifacts—treat it like a hardware stability problem. Reseat the GPU, check cabling, test PSU, and back off any OC/UV. Drivers can be buggy. Hardware can be tired. Your job is to figure out which one is lying to you today.