The Blue Screen of Death (BSOD) that Windows produces on critical system failures is something most Windows users have come across at some point in the last two decades. So ubiquitous has the screen become that not even Bill Gates himself can escape it, and to many users it is merely a sign that something went wrong and it’s time to restart the computer. The truth is that the screen is trying to tell us something. Knowing how to troubleshoot a BSOD can be a critical skill in maintaining healthy systems. In this series, you’ll be introduced to crash dump analysis: the examination of Windows crash dumps, the byproduct of a Blue Screen of Death.
Whatever you call it, you’re sure to know what it means. All systems administrators and IT support staff know what a blue screen is, but have you ever taken the time to figure out what a particular blue screen means? When it happens right after you’ve installed new hardware, you can be relatively sure that the two are connected somehow. But when an executive’s laptop is crashing and rebooting every day, and they assure you “they haven’t installed anything or made any changes”, you’ll really be glad you know how to troubleshoot blue screens properly.
As shown here, most systems are configured to automatically restart after the BSOD happens. This is great for the users. Not that they love having their computer restart in the middle of their work, but at least they can get back to work instead of calling the help desk and reading out every number and letter on the screen.
The automatic restart can have a bad effect on computer support, though, since it makes it so easy to reboot and just forget the whole thing ever happened.
Like the text of the BSOD itself says, “If this is the first time you’ve seen this message, reboot…” However, as long as they’ve sent you in, you might as well see for yourself exactly what has been causing the reboots.
If you’ve ever heard this, thought this, or applied this, you’re not alone. Many problems are indeed caused by video drivers. And many times a blue screen is resolved by upgrading or rolling back a video driver.
But unless you’ve learned how to troubleshoot using psychic abilities, it’s a better approach to be systematic and to have some tools at your disposal.
The main tool you need to familiarize yourself with is the Windows Debugger (windbg.exe).
To help get you going, this article is going to show you how to:
It is really not hard to do basic crash dump analysis. How easy is it? It’s amazingly simple. There is a little bit of setup, and then completing a basic analysis is seriously a couple of mouse clicks. The debugger does most of the work, especially if all you want is a basic analysis. There are more advanced techniques for system crash debugging, but even the “couple of mouse clicks” basic crash dump debugging can reduce the number of times you rebuild the computer. I estimate 50% of crash dump files reveal exactly the nature of the blue screen through simple, basic debugging.
I have seen some cases where technicians had suggested a rebuild of the system to resolve a blue screen that they could not find a cause for. I stepped in and offered to determine what the blue screen was caused by. After running one of the crash dump files through WinDbg.exe, I could tell that it was most likely bad RAM causing the system crashes. Running memory diagnostic software confirmed the faulty hardware, and the system stopped crashing after the bad RAM was removed. The system was repaired, not rebuilt, and the owner of the computer was much happier than owners usually are after they’re told they have to reinstall all of their programs.
So to get started, the first thing you need to do is install the Windows Debugger on one of your systems. Since you can move the memory dumps without much trouble, it is best to install the debugger on your machine, instead of going through the installation on the troubled machines. The Windows Debugger (WinDbg.exe) is installed as part of the Windows Software Development Kit (SDK).
1) Download the SDK from: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=8279.
2) When installing, you may choose not to install the options in the Windows Native Code Development section. If you’re not doing Windows development, you won’t need them, and they add about 350MB to the download.
3) You may get a notice that you’ve got a .NET client installed instead of the full version. This will not affect debugging. Install the full client if you wish, or dismiss the notice dialog and continue with the installation.
4) Once completed, you’ll have the Windows Debugger (windbg.exe) available from the Start Menu.
Symbols are a key component of debugging. The symbols are the decoder ring for what’s going on in the executable files and DLLs. When a call is made, you don’t just get a reference to the command that looks like a memory address. With properly configured symbols, you get the method name that’s being called. Though the method names usually include some shorthand, they at least provide clues about what they do. And though it may not be exactly clear from the name what a method does, it can at least be looked up instead of remaining a mystery. Now, with previous versions of the Windows Debugger, you had to download the specific symbols for the operating system you were using. You also had to download the symbols for every OS you wanted to perform debugging on. Troubleshooting an XP machine meant loading up the XP symbols. Working with Vista today? Load up the Vista symbols.
It’s much easier now. Instead of loading all of the symbols, you can configure WinDbg to use Microsoft’s Symbol Server to provide all the symbols, as needed and automatically. Here’s how to configure the symbol server:
1) Open Windbg.exe.
2) Trace through “File >> Symbol File Path”, or use the CTRL+S keyboard shortcut.
3) Enter SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols in the dialog box that opens.
4) Create the folder C:\Symbols. This can be any folder you choose. If you have utilities in C:\utilities and want a subfolder there, or in a debugging folder, that’s completely fine. Just make sure the folder exists, and use its path in the symbol path in place of the C:\Symbols shown in the example above.
5) Exit WinDbg.exe, saving the workspace when prompted.
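For anyone who prefers to script the setup, the symbol path from step 3 and the folder from step 4 can be sketched in a few lines of Python. This is only a minimal sketch, not part of WinDbg itself: the function names build_symbol_path and ensure_cache_dir are made up for illustration, and the only details taken from the steps above are the SRV*<local folder>*<server URL> layout and the Microsoft symbol server address.

```python
from pathlib import Path

# Microsoft's public symbol server, as entered in step 3 above.
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir: str, server: str = MS_SYMBOL_SERVER) -> str:
    """Return a symbol path of the form SRV*<local folder>*<symbol server>.

    Hypothetical helper: it only assembles the string you would paste
    into the Symbol File Path dialog.
    """
    return f"SRV*{cache_dir}*{server}"


def ensure_cache_dir(cache_dir: str) -> Path:
    """Create the local symbol folder from step 4 if it does not exist yet."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

On the debugging machine, calling ensure_cache_dir(r"C:\Symbols") creates the folder, and build_symbol_path(r"C:\Symbols") returns the exact string from step 3, ready to paste into the dialog. Passing a different folder, such as a subfolder of C:\utilities, changes only the middle part of the path.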
To summarize, Blue Screens of Death are caused by errors, and more often than not these errors are easily traceable through the crash dump they produce. Now that you have installed the Windows Debugger on your debugging machine, you’re ready to learn how to debug memory dump files by using some basic debugging techniques and a properly configured WinDbg.exe, which we cover in Part 2 of this series.