Use WinDbg to Solve Blue Screen Crashes

Everyone has experienced a Blue Screen crash at some point or another. Windows Vista and Windows 7 greatly reduced the frequency of this happening, and now it’s pretty rare. I’ve only seen one on my home computer in several years, and it was caused by a faulty sound card, not by Windows itself. And that’s normally the case now: the core operating system is stable, but devices and programs you add can take away from the system’s reliability.

One of the Directors at my company has been experiencing frequent blue screen crashes since upgrading to a new computer. We suspected a hardware issue, but weren’t able to pin it down to anything specific. That’s where the Debugging Tools for Windows come in. These are in-depth, technical tools that dig into the internal workings of the operating system, and can show what the system was doing at the moment it crashed and shut down.

Before beginning any debugging steps, you’ll need to make sure you have a crash dump file. This can be enabled in your system’s properties (Advanced system settings > Startup and Recovery). After the system recovers from a crash, Windows will display a prompt at the beginning of your next session with the location of the crash dump file. Make a note of its location, or copy it to a flash drive for debugging on another machine. The file can range from a few hundred KB up to the size of your complete physical memory, depending on the type of dump you’ve selected. Most crash dumps are on the smaller end.

After downloading and installing the Debugging Tools for Windows, you also need to download the symbol packages. These map memory addresses to entry points and function names, and are necessary to actually see what’s going on. The symbols are a few hundred megabytes, and aren’t included by default because relatively few people need to use them daily. Once you’ve installed the SDK, start WinDbg from the “Debugging Tools for Windows” entry on your Start menu, and you’ll be staring at a rather unfriendly-looking console.

Now, set up your symbols: go to File > Symbol File Path and add the location where you installed the symbol files. You can also add a link to an online symbol server, which will help fill in the gaps if you didn’t download the right symbol package.
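As a rough illustration of what goes into that Symbol File Path box, here is a small Python sketch that assembles a path from local symbol folders plus a symbol server entry. The folder names are hypothetical and the helper function is my own; the `srv*<cache>*<url>` form and the Microsoft public symbol server address are the standard ones, but check them against your installed version of the tools.

```python
# Public Microsoft symbol server.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_dirs, cache_dir, server_url):
    """Build a semicolon-separated symbol path.

    local_dirs: folders where symbol packages were installed (hypothetical here).
    cache_dir:  local folder where symbols fetched from the server are kept.
    server_url: symbol server to query when a local folder has no match.
    """
    parts = [d for d in local_dirs if d]
    # "srv*<cache>*<url>" asks the debugger to download missing symbols
    # from <url> and store them under <cache> for later sessions.
    parts.append(f"srv*{cache_dir}*{server_url}")
    return ";".join(parts)


if __name__ == "__main__":
    # Example folders only; substitute wherever you actually put the symbols.
    print(build_symbol_path([r"C:\Symbols"], r"C:\SymCache", MS_SYMBOL_SERVER))
```

The resulting string is what you would paste into the Symbol File Path dialog. Local folders are listed first so the debugger checks them before falling back to the server.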

Navigate to File > Open Crash Dump… and select the crash dump you’d like to investigate. Then wait a moment while it loads symbol files.

Type “!analyze -v” at the prompt, or click it where it appears as a link in the output, to run the debugger through the trace and find out what happened. You’ll get a print-out:

7: kd> !analyze -v
**                        Bugcheck Analysis                                   **   
BAD_POOL_HEADER (19)
The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of the problem,
and then special pool applied to the suspect tags or the driververifier to a suspect driver.
Arguments:
Arg1: 0000000000000003, the pool freelist is corrupt.
Arg2: fffffa800a420080, the pool entry being checked.
Arg3: fffffa8007c34030, the read back flink freelist value (should be the same as 2).
Arg4: 1002026001f00705, the read back blink freelist value (should be the same as 2).
Debugging Details:

PROCESS_NAME:  UltraMonTaskba
LAST_CONTROL_TRANSFER:  from fffff8000300d70f to fffff80002ee3640

FOLLOWUP_IP:
nt!ExDeferredFreePool+cbb
fffff800`0300d70f cc              int     3
SYMBOL_NAME:  nt!ExDeferredFreePool+cbb
FOLLOWUP_NAME:  Pool_corruption
IMAGE_NAME:  Pool_Corruption
MODULE_NAME: Pool_Corruption
FAILURE_BUCKET_ID:  X64_0x19_3_nt!ExDeferredFreePool+cbb
BUCKET_ID:  X64_0x19_3_nt!ExDeferredFreePool+cbb

Below the Debugging Details header is the really interesting piece of information: the PROCESS_NAME entry, in this case UltraMonTaskba[r] (the debugger truncates the name). UltraMon is an add-on utility designed to make computing with multiple monitors easier and more intuitive by adding per-screen task bars and allowing you to quickly manipulate windows across multiple monitors. In this case, it appears to have an incompatibility with the Director’s particular system configuration and is corrupting part of the system memory.
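If you save the analysis text to a file, picking out the handful of fields that matter can be scripted. The following Python sketch is only an illustration: the function name and regular expression are my own, and it assumes the `KEY:  value` layout seen in the print-out above, which may differ between debugger versions.

```python
import re

# Matches lines such as "PROCESS_NAME:  UltraMonTaskba": an upper-case
# key, a colon, whitespace, then a value on the same line.
FIELD_RE = re.compile(r"^([A-Z_]+):\s+(\S.*)$")


def parse_analysis(text):
    """Return a dict of KEY -> value for the labelled lines in the output."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # Keep the first occurrence if a key is repeated.
            fields.setdefault(match.group(1), match.group(2).strip())
    return fields


# Excerpt copied from the print-out above.
SAMPLE = """\
Arg1: 0000000000000003, the pool freelist is corrupt.
PROCESS_NAME:  UltraMonTaskba
LAST_CONTROL_TRANSFER:  from fffff8000300d70f to fffff80002ee3640
SYMBOL_NAME:  nt!ExDeferredFreePool+cbb
IMAGE_NAME:  Pool_Corruption
MODULE_NAME: Pool_Corruption
FAILURE_BUCKET_ID:  X64_0x19_3_nt!ExDeferredFreePool+cbb
"""

if __name__ == "__main__":
    info = parse_analysis(SAMPLE)
    for key in ("PROCESS_NAME", "IMAGE_NAME", "FAILURE_BUCKET_ID"):
        print(key, "=", info.get(key))
```

Mixed-case labels such as Arg1 are deliberately skipped, since the upper-case entries under Debugging Details are the ones worth scanning first.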

Updating UltraMon should solve the issue in this case, but if it doesn’t, then it may have to be removed entirely. I’ve used this technique to identify several difficult-to-diagnose problems, including left-over remnants of an improperly uninstalled antivirus application, a failing graphics card, a third-party firewall that was stepping on the built-in Windows firewall, and more. The debugging output can be a bit verbose, but most of it can be ignored, which makes it that much easier to use.
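If you end up doing this often, the same analysis can be run without the WinDbg window. The sketch below is an assumption-laden example rather than part of the walkthrough above: it uses kd.exe, the console debugger that ships alongside WinDbg in the Debugging Tools for Windows, with its -z (dump file), -y (symbol path) and -c (initial command) options. The executable location and file paths are placeholders you would need to adjust.

```python
import subprocess


def build_analyze_command(debugger_exe, dump_path, symbol_path):
    """Return the argument list for a one-shot crash dump analysis."""
    return [
        debugger_exe,
        "-z", dump_path,         # open this crash dump
        "-y", symbol_path,       # where to look for symbols
        "-c", "!analyze -v; q",  # run the analysis, then quit
    ]


def analyze_dump(debugger_exe, dump_path, symbol_path):
    """Run the debugger and return whatever it printed to standard output."""
    cmd = build_analyze_command(debugger_exe, dump_path, symbol_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Placeholder values: point these at your own debugger, dump and symbols.
    cmd = build_analyze_command(
        "kd.exe",
        r"C:\Dumps\example.dmp",
        r"srv*C:\SymCache*https://msdl.microsoft.com/download/symbols",
    )
    print(" ".join(cmd))
```

Calling analyze_dump with real paths on a machine that has the tools installed should return the same kind of print-out shown earlier, ready to be saved or searched.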

