So, you're interested in learning to solve BSODs? A satisfying goal, and there's good job security as there's an endless supply of threads.
To be a good BSOD analyst, you don't need deep technical knowledge of how Windows works (though it doesn't hurt!). You do need a good "technician's knowledge" of computers, as there's so much more to it than "what driver was blamed?". As often as not, hardware is the cause, and you should be proficient in that regard. Instructing OPs how to swap out RAM, change memory voltages, and spot PSU problems is SO much easier when you are familiar with the processes already.
Good surface knowledge of Windows is essential. What if that driver won't install right? What if Windows won't boot right? What if you suspect malware is the cause...do you know how to spot other signs of it? What if the OP wants to do a repair install but his DVD is giving him an error message? You could just farm stuff out, but it's better if you're capable of handling it all yourself.
Perhaps even more important is a desire to get to the bottom of the case, no matter what it is. Good BSOD analysts don't feel the need to stick to the "rules" of the game. They exercise complete liberty to post whatever they want in the thread, no matter how unorthodox it might be. Feel like turning the OP into a guinea pig? Go for it! Try new things, learn what doesn't work, and remember what did work for next time. And when you see a thread someone else has solved, spend the 30 seconds to find out what symptoms the OP was having, and what the solution was.
Ready to proceed?
Start by installing Windbg from the Windows SDK: http://msdn.microsoft.com/en-us/wind...dware/hh852360
Once installed, associate .dmp files with Windbg by entering the following in a command prompt:
"C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\windbg.exe" -IA
If Windbg is installed in a different location, change the command accordingly. Just a heads-up, the -IA part is case sensitive. Confused the heck out of me when I first tried it, as most commands are not case sensitive.
When done, open a copy of Windbg, go to File > Symbol file path, and copy/paste:
You can replace C:\symbols with any other path you'd like the symbol cache to be stored on. If you have a low-capacity SSD, be warned the folder can grow to a couple GBs.
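For reference, a symbol path that caches to C:\symbols typically takes the form SRV*C:\symbols*<server>. Here is a minimal sketch of building one, assuming Microsoft's public symbol server URL and a hypothetical helper name, build_symbol_path (neither is quoted from the post above):

```python
# Assumption: Microsoft's public symbol server; swap in another server if needed.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir: str = r"C:\symbols",
                      server: str = MS_SYMBOL_SERVER) -> str:
    """Return a symbol path of the form SRV*<local cache>*<server>.

    build_symbol_path is a hypothetical helper for illustration only.
    """
    if not cache_dir:
        raise ValueError("cache_dir must not be empty")
    return f"SRV*{cache_dir}*{server}"


# Default cache location, then a cache on another drive (e.g. off a small SSD).
print(build_symbol_path())
print(build_symbol_path(r"D:\symcache"))
```

The resulting string is what goes into File > Symbol file path. Only the middle part (the cache folder) needs to change if you want the cache elsewhere.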
After that, you can just double-click a .dmp file and it will open in Windbg. If a driver or program is the cause of the BSODs, it will usually show up in the Probably Caused By line.
Probably caused by: e1c62x64.sys
You can look up the drivers it blames here: Driver Reference Table
A couple other tips:
If a Windows/system driver is blamed, it's usually not the real problem. Use your powers of reasoning: if tcpip.sys is blamed, perhaps the network adapter drivers are at fault?
You can use Driver Verifier to try to get 3rd-party drivers blamed: Driver Verifier - Enable and Disable
If Verifier_Enabled dumps continue to point to system drivers, hardware is most likely the cause. The most common culprit is RAM, though the CPU, motherboard, PSU, video card, hard drive, and sometimes some funky ones (monitor, USB devices) can also cause problems. I wrote up tutorials for the diagnostics we use most often: "RAM - Test with Memtest86+" and "Hardware - Stress Test With Prime95".
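The reasoning in these tips can be sketched as a small decision helper. This is only an illustration: the SYSTEM_MODULES set is a tiny hand-picked sample (a real list would be far longer), and the function names blamed_module and next_step are hypothetical.

```python
import re
from typing import Optional

# Illustrative sample only: a few modules that ship with Windows.
SYSTEM_MODULES = {"ntoskrnl.exe", "ntkrnlmp.exe", "hal.dll",
                  "tcpip.sys", "ntfs.sys", "win32k.sys"}

# Matches both "Probably caused by: x.sys" and "Probably caused by : x.sys ( ... )".
BLAME_RE = re.compile(r"^Probably caused by\s*:\s*(\S+)",
                      re.IGNORECASE | re.MULTILINE)


def blamed_module(analysis_text: str) -> Optional[str]:
    """Return the lower-cased module from the Probably Caused By line, if any."""
    match = BLAME_RE.search(analysis_text)
    return match.group(1).lower() if match else None


def next_step(analysis_text: str, verifier_enabled: bool) -> str:
    """Suggest what to try next, following the tips above."""
    module = blamed_module(analysis_text)
    if module is None:
        return "no module blamed: read the rest of the dump"
    if module not in SYSTEM_MODULES:
        return f"look up {module} in the Driver Reference Table"
    if not verifier_enabled:
        return f"{module} is a Windows module: enable Driver Verifier and collect new dumps"
    return f"{module} blamed with Verifier on: suspect hardware, start with RAM"


print(next_step("Probably caused by: e1c62x64.sys", verifier_enabled=False))
print(next_step("Probably caused by : tcpip.sys ( tcpip+1a2b )", verifier_enabled=False))
print(next_step("Probably caused by : ntkrnlmp.exe ( nt+7cc40 )", verifier_enabled=True))
```

Treat the output as a prompt for your own judgment, not a verdict. A blamed tcpip.sys, for example, still calls for a look at the network adapter drivers.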
To get a list of the running drivers on the system at the time of the crash, run from Windbg:
Spend some time looking up those drivers on the Driver Reference Table until you can quickly glance down the list and pick out the 3rd-party ones. The Windows drivers are rarely of any consequence, but you should still know what they do. One word of warning, however: don't fall into the pitfall that all too many people do, which is putting too much emphasis on the date of the driver. It is true that older drivers can have compatibility problems and should be updated, but few things that I see BSOD analysts doing irritate me more than lists of drivers to update. If a 3rd-party driver is the cause, 95% of the time it will be blamed directly.
I'd be a fool not to at least mention the !analyze -v command. Try running that on a dump and see what kind of information it reveals. PROCESS_NAME shows which process was running at the time of the crash; that's usually not enough to draw any conclusions, but when collected from many dumps from the same system, it may reveal some circumstantial evidence. FAILURE_BUCKET_ID and BUCKET_ID can sometimes reveal culprit drivers that are not blamed in the Probably Caused By line.
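Comparing those fields across many dumps from the same system is easier with a quick tally. Here is a minimal sketch, assuming you have saved the !analyze -v text for each dump. The sample strings below are invented for illustration, and tally_fields is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Dict, Iterable

FIELDS = ("PROCESS_NAME", "FAILURE_BUCKET_ID", "BUCKET_ID")

# Anchored at line start so BUCKET_ID does not match inside FAILURE_BUCKET_ID.
FIELD_RE = re.compile(r"^(PROCESS_NAME|FAILURE_BUCKET_ID|BUCKET_ID):\s*(\S+)",
                      re.MULTILINE)


def tally_fields(analyses: Iterable[str]) -> Dict[str, Counter]:
    """Count how often each field value appears across saved analysis texts."""
    counts: Dict[str, Counter] = {name: Counter() for name in FIELDS}
    for text in analyses:
        for name, value in FIELD_RE.findall(text):
            counts[name][value] += 1
    return counts


# Invented sample text standing in for three saved analyses.
samples = [
    "PROCESS_NAME:  System\nFAILURE_BUCKET_ID:  X64_0xD1_e1c62x64+1a2b\n"
    "BUCKET_ID:  X64_0xD1_e1c62x64+1a2b\n",
    "PROCESS_NAME:  game.exe\nFAILURE_BUCKET_ID:  X64_0x3B_win32k+5c0f\n"
    "BUCKET_ID:  X64_0x3B_win32k+5c0f\n",
    "PROCESS_NAME:  System\nFAILURE_BUCKET_ID:  X64_0xD1_e1c62x64+1a2b\n"
    "BUCKET_ID:  X64_0xD1_e1c62x64+1a2b\n",
]

for field, counter in tally_fields(samples).items():
    print(field, counter.most_common())
```

A value that keeps recurring is only circumstantial evidence, but it is a good hint about where to look next.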
And one last set of commands I rarely see any other BSOD analysts on the volunteer forums using: the !sysinfo commands. !sysinfo machineid shows information about the motherboard and OEM. !sysinfo smbios reveals a wealth of information about the motherboard configuration. Want to know what size DIMMs are installed in which slots, and what speed they're running at? Give it a whirl! Or run the generic !sysinfo command for a list of supported arguments and try them out. Get to know what information you have access to. Once you do, you will no longer be content to simply use the dumps. I once resigned from a Moderator position and left another forum; there were other reasons, but a major one was that they didn't see the point in asking for the other info and weren't on board with my attempts to get some instructions stickied.
That's the basic idea of what we do. As you go along, you'll have dozens (if not more!) of questions. Feel free to post questions in this thread, or you can PM me and I'll help you along.