BSOD error 0x000000C4 after many attempts to resolve

Arc · 17 Jan 2013

AndySitz said:

Ah, I didn't even know what a RAID was; I just looked it up. I don't have a RAID array, so I'm not sure I'll need it, but if anyone here suggests that I should install it anyway, I'll take your word for it. That's why I came here: for good advice and assistance.

Anyway, so far no issues. It seemed like the driver might have failed once and then recovered, but I couldn't be sure because there was no notification in the tray like there was the other times. My primary monitor did go black for a brief period and then came back on, which seemed consistent with what was going on before, though right now it's too early to tell whether that might just happen occasionally. As long as I don't get a BSOD and a forced restart, I don't mind the occasional driver crash, I suppose. I'll update in another 48 hours or so with news of whether I've experienced any BSOD problems and, hopefully, if not, I'll be able to mark this topic as solved.

Don't install the Marvell RAID driver, then.

Let us know which version of the display driver you installed after the clean reinstall.

AndySitz · 17 Jan 2013

Arc said:

Let us know which version of the display driver you installed after the clean reinstall.

Sure thing. The video driver version I have on here currently is Nvidia 306.23, the one you suggested. I selected it from the menu of beta and legacy drivers after doing a search for my card's particulars. Windows Update wanted to update it to a later version, but I hid that update. I'm just going to keep this version on here for now and see what happens over the next day or so of using my PC.

AndySitz · 17 Jan 2013

Well crud, updating here: something crashed again, and I believe it was my graphics driver. All I did was open a web browser window, and my primary monitor flashed black and after a moment or two went into power save mode, followed closely by my second monitor. I didn't actually see a blue screen, but when I logged back on, I was given the following blue screen information for the unexpected crash:

Blue Screen Details said:

Problem signature:
Problem Event Name: BlueScreen
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 1033

Additional information about the problem:
BCCode: 116
BCP1: FFFFFA8009EC44E0
BCP2: FFFFF8800F37E584
BCP3: FFFFFFFFC00000B5
BCP4: 000000000000000A
OS Version: 6_1_7601
Service Pack: 1_0
Product: 256_1

Files that help describe the problem:
C:\Windows\Minidump\011713-32198-01.dmp
C:\Users\Andy\AppData\Local\Temp\WER-45895-0.sysdata.xml

Read our privacy statement online:
Windows 7 Privacy Statement - Microsoft Windows

If the online privacy statement is not available, please read our privacy statement offline:
C:\Windows\system32\en-US\erofflps.txt

I'm going to grab the files and attach a new zip to this post with the latest crash dump.
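For anyone reading the problem signature above, here is a minimal Python sketch of how those fields can be decoded. It assumes the BCCode and BCP values are hexadecimal and that, for stop 0x116, the third parameter carries an NTSTATUS code. The helper names and the two small lookup tables are made up for illustration and only cover the values seen in this thread.

```python
# Hypothetical helper for illustration; it is not part of any Windows tool.
# The lookup tables cover only the two values seen in this thread.
KNOWN_BUGCHECKS = {0x116: "VIDEO_TDR_FAILURE"}
KNOWN_NTSTATUS = {0xC00000B5: "STATUS_IO_TIMEOUT"}

SIGNATURE = """\
BCCode: 116
BCP1: FFFFFA8009EC44E0
BCP2: FFFFF8800F37E584
BCP3: FFFFFFFFC00000B5
BCP4: 000000000000000A
"""


def parse_problem_signature(text):
    """Return the BCCode/BCPn fields as integers (values are hexadecimal)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and (key == "BCCode" or key.startswith("BCP")):
            fields[key] = int(value.strip(), 16)
    return fields


def describe(fields):
    """Summarise the stop code and the NTSTATUS carried in BCP3."""
    status = fields["BCP3"] & 0xFFFFFFFF  # NTSTATUS is 32 bits; drop the sign extension
    return {
        "bugcheck": KNOWN_BUGCHECKS.get(fields["BCCode"], "unknown"),
        "ntstatus": hex(status),
        "ntstatus_name": KNOWN_NTSTATUS.get(status, "unknown"),
        "is_error": (status >> 30) == 0b11,  # top two bits hold the severity
    }


summary = describe(parse_problem_signature(SIGNATURE))
print(summary)
```

The I/O timeout status fits the description of a TDR failure: the display driver did not respond in time and the reset attempt did not recover it. It does not by itself say whether the driver, the card, or something else is at fault.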

Arc · 18 Jan 2013

Code:

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 116, {fffffa8009ec44e0, fffff8800f37e584, ffffffffc00000b5, a}

Unable to load image \SystemRoot\system32\DRIVERS\nvlddmkm.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for nvlddmkm.sys
*** ERROR: Module load completed but symbols could not be loaded for nvlddmkm.sys
Probably caused by : nvlddmkm.sys ( nvlddmkm+14b584 )

Followup: MachineOwner
---------

7: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: fffffa8009ec44e0, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff8800f37e584, The pointer into responsible device driver module (e.g. owner tag).
Arg3: ffffffffc00000b5, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 000000000000000a, Optional internal context dependent data.

Debugging Details:
------------------


FAULTING_IP: 
nvlddmkm+14b584
fffff880`0f37e584 4055            push    rbp

DEFAULT_BUCKET_ID:  GRAPHICS_DRIVER_TDR_FAULT

CUSTOMER_CRASH_COUNT:  1

BUGCHECK_STR:  0x116

PROCESS_NAME:  System

CURRENT_IRQL:  0

STACK_TEXT:  
fffff880`02d14638 fffff880`02f42000 : 00000000`00000116 fffffa80`09ec44e0 fffff880`0f37e584 ffffffff`c00000b5 : nt!KeBugCheckEx
fffff880`02d14640 fffff880`02f41ddb : fffff880`0f37e584 fffffa80`09ec44e0 fffffa80`114e8160 fffffa80`114cb010 : dxgkrnl!TdrBugcheckOnTimeout+0xec
fffff880`02d14680 fffff880`02e0ff07 : fffffa80`09ec44e0 00000000`c00000b5 fffffa80`114e8160 fffffa80`114cb010 : dxgkrnl!TdrIsRecoveryRequired+0x273
fffff880`02d146b0 fffff880`02e3dd5a : fffffa80`ffffffff 00000000`0001d41e fffff880`02d14800 00000000`00000006 : dxgmms1!VidSchiReportHwHang+0x40b
fffff880`02d14790 fffff880`02e23bb1 : fffffa80`1150e000 ffffffff`feced300 00000000`00000006 fffffa80`1150e000 : dxgmms1!VidSchWaitForCompletionEvent+0x196
fffff880`02d147d0 fffff880`02e2104c : 00000000`000135f6 fffff8a0`0e83ee50 fffffa80`1150e000 00000000`00000000 : dxgmms1!VIDMM_GLOBAL::xWaitForAllEngines+0x1e9
fffff880`02d148d0 fffff880`02e1e8af : 00000000`00000000 fffffa80`0b5e7070 00000000`00000034 fffffa80`0a35c158 : dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForSubmission+0xfc
fffff880`02d14910 fffff880`02e3865d : 00000000`00000000 fffff8a0`0dd9ffd0 fffffa80`00000000 fffffa80`0fbae900 : dxgmms1!VIDMM_GLOBAL::PrepareDmaBuffer+0xe1b
fffff880`02d14ae0 fffff880`02e38398 : fffff800`00b96080 fffff880`02e37d00 fffffa80`00000000 fffffa80`00000000 : dxgmms1!VidSchiSubmitRenderCommand+0x241
fffff880`02d14cd0 fffff880`02e37e96 : 00000000`00000000 fffffa80`0f43f010 00000000`00000080 fffffa80`114cb010 : dxgmms1!VidSchiSubmitQueueCommand+0x50
fffff880`02d14d00 fffff800`0371ce5a : 00000000`054111df fffffa80`1150d060 fffffa80`09d449e0 fffffa80`1150d060 : dxgmms1!VidSchiWorkerThread+0xd6
fffff880`02d14d40 fffff800`03476d26 : fffff800`035f8e80 fffffa80`1150d060 fffff800`03606cc0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`02d14d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16


STACK_COMMAND:  .bugcheck ; kb

FOLLOWUP_IP: 
nvlddmkm+14b584
fffff880`0f37e584 4055            push    rbp

SYMBOL_NAME:  nvlddmkm+14b584

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nvlddmkm

IMAGE_NAME:  nvlddmkm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  503f8bb8

FAILURE_BUCKET_ID:  X64_0x116_IMAGE_nvlddmkm.sys

BUCKET_ID:  X64_0x116_IMAGE_nvlddmkm.sys

Followup: MachineOwner
---------

7: kd> lmvm nvlddmkm
start             end                 module name
fffff880`0f233000 fffff880`0ff1b000   nvlddmkm T (no symbols)           
    Loaded symbol image file: nvlddmkm.sys
    Image path: \SystemRoot\system32\DRIVERS\nvlddmkm.sys
    Image name: nvlddmkm.sys
    Timestamp:        Thu Aug 30 21:20:16 2012 (503F8BB8)
    CheckSum:         00CCA304
    ImageSize:        00CE8000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
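
The numbers in the output above can be cross-checked with a few lines of arithmetic. This is only a sketch using values copied from the dump (module start and end from `lmvm`, the faulting address from Arg2, and the image timestamp); the variable names are my own.

```python
from datetime import datetime, timezone

# Values copied from the debugger output in this thread.
module_start = 0xFFFFF8800F233000  # nvlddmkm start (lmvm)
module_end = 0xFFFFF8800FF1B000    # nvlddmkm end (lmvm)
faulting_ip = 0xFFFFF8800F37E584   # Arg2 / FAULTING_IP
image_timestamp = 0x503F8BB8       # DEBUG_FLR_IMAGE_TIMESTAMP

# Offset of the faulting address inside the module: "nvlddmkm+14b584".
offset = faulting_ip - module_start

# The loaded range should match the ImageSize field (00CE8000).
image_size = module_end - module_start

# The address must fall inside the module for the attribution to make sense.
inside_module = module_start <= faulting_ip < module_end

# The image timestamp is seconds since the Unix epoch, shown here in UTC.
built_utc = datetime.fromtimestamp(image_timestamp, tz=timezone.utc)

print(hex(offset))            # 0x14b584
print(hex(image_size))        # 0xce8000
print(inside_module)          # True
print(built_utc.isoformat())  # 2012-08-30T15:50:16+00:00
```

The debugger printed the timestamp as 21:20:16, which is 5 hours 30 minutes ahead of the UTC value, presumably because it was rendered in the local time zone of the machine that ran the analysis.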

Follow the guide: STOP 0x116: VIDEO_TDR_ERROR troubleshooting

AndySitz · 25 Jan 2013

Hi, sorry it took me a little while to get back to this thread. I tried some of the troubleshooting tips and things had been improving slightly; at least it wasn't actually blue-screening or forcing a restart when the driver crashed. What was particularly strange was when the driver would crash. I'd be on my day off, playing Star Wars: The Old Republic on Very High graphics settings while using Fraps to record my gameplay, and it would be fine for a fairly lengthy gaming session. Then, after quitting SWTOR and doing some ordinary computer use (web browsing, etc.), the driver would crash and recover.

It was strange because the crashes didn't seem to track how graphically intense my current activity was. I don't think it's the driver itself: I rolled that back to a more stable version based on advice in this thread and haven't updated it since (I uninstalled Nvidia Update and hid the Windows updates relating to the video driver). So the only thing it seemed it could be was the RAM issue the guy from EVGA technical support had spoken with me about.

So I RMA'd my three 4 GB sticks of Corsair Vengeance and got an advance RMA so that I wouldn't be without my PC while waiting. The replacement RAM got in yesterday; I installed it just a few minutes ago and will be observing my PC for the next couple of days. If it crashes any more, I have no idea what the problem could be (or maybe it's just fairly normal, I dunno...). At this point, I've RMA'd pretty much every single piece of my PC build except for the motherboard, the processor and the case itself. I'll update in about 48 hours, hopefully with positive results, and if the problem has resolved, I'll include a very brief summary of what the problem likely was and what I did to fix it.

Thanks again to everyone who provided me with advice for this problem.

AndySitz · 28 Jan 2013

This is frustrating. I've tried a lot of things now: I've had basically my entire computer build RMA'd, I followed the troubleshooting guide, and the graphics driver still crashes from time to time. Fortunately, lately it hasn't been followed by a blue screen or forced restart, but there was one time that a timeout did occur. I'll go ahead and attach that dump file here, but I'm sure it will be similar, if not identical, to the one I last posted.

I guess a question I have: would having three monitors hooked up to my card have any impact on the graphics driver crashing? The primary monitor is 1080p, but the other two are lower resolution, just so you're aware. I also have the Windows 7 computer usage gauge showing, and whenever I'm not doing anything particularly stressful on my machine (like right now, typing this out), the CPU usage is at roughly 2 percent and the RAM usage is at about 18 percent.

Another question I have: is there a program I can get hold of to monitor the temperature of my GPU at all times? I've run benchmark programs before, such as the one from the site of EVGA (the manufacturer of my card), which are able to monitor the GPU's temperature and percent usage, but I haven't been able to find one that will just let me see the temperature at all times. On that note, even when running the stress tests and benchmarks, the GPU temperature has never exceeded 86 degrees. That's generally where it tops out, and it has completed the full 2 hour stress test from EVGA's site without the driver failing once.
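
One scriptable way to watch the temperature is sketched below in Python. It is only an assumption that this applies to this setup: it relies on NVIDIA's `nvidia-smi` command-line tool being installed and on the PATH, and on the card and driver version actually reporting temperature through it, which is not guaranteed for every GeForce card or older driver. The function names are hypothetical.

```python
import subprocess
import time

# Assumption: nvidia-smi is on PATH and reports temperature for this card.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"]


def parse_temperatures(output):
    """Turn one-value-per-line output into a list of ints (degrees C)."""
    temps = []
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():  # skips blanks and "not supported" style answers
            temps.append(int(line))
    return temps


def read_gpu_temperatures():
    """Return one temperature per GPU, or None if nvidia-smi cannot be run."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_temperatures(result.stdout)


def log_temperatures(samples=5, interval_s=2.0):
    """Print a timestamped reading every interval_s seconds, samples times."""
    for _ in range(samples):
        temps = read_gpu_temperatures()
        reading = temps if temps is not None else "nvidia-smi unavailable"
        print(time.strftime("%H:%M:%S"), reading)
        time.sleep(interval_s)
```

Calling `log_temperatures(samples=30, interval_s=10)` while browsing would leave a short log to compare against the time of the next driver crash.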

That's one thing I don't understand: the driver seems more prone to failure when I'm doing something like I am now than when I'm actually playing a graphically intense game, such as Star Wars: The Old Republic. It will occasionally fail and recover in SWTOR, but it seems to do so less often than when I'm using the PC for simple tasks such as web or forum browsing, email, typing responses, Notepad, etc.

Arc · 28 Jan 2013

Your GPU is good enough to control three displays, IMO.

There are a few gadgets you can try:

You have done a lot; try one more thing.

Download and install Driver Fusion.
Reboot the computer into safe mode from Advanced Boot Options. Search for Driver Fusion in your Start menu and remove all components of your NVIDIA display driver.
Now boot normally. Go to Drivers - Download NVIDIA Drivers, Drivers > Beta and Legacy; search there with your card's particulars for GeForce 306.23 Driver, dated 13.9.2012, and install it.

Let us know the results.

AndySitz · 29 Jan 2013

I went ahead and did that just now. I uninstalled the video driver, restarted, used advanced boot into safe mode, and got rid of all Nvidia components with Driver Fusion. Then I restarted, used advanced boot into safe mode again, and installed the Nvidia driver version you suggested (306.23) specifically in safe mode so that Windows Update wouldn't attempt to install a later version. Then I rebooted normally. I'll post an update here in a day or so. Thanks for the suggestion and thanks for the tools; I'll give them a look and make sure my graphics card isn't running hot.

AndySitz · 29 Jan 2013

Well, that was awfully fast. About a minute, maybe two minutes tops, after I had made that post and gotten the temp monitoring tool installed, I opened my web browser again. The temp for GPU1 showed at roughly 43 C and GPU was about 62 C, and the driver experienced another fatal crash which forced a shutdown. I don't understand what else this could be. Is it possible that, after all this time RMAing other stuff after the graphics card, EVGA might have accidentally sent me a faulty replacement graphics card in the first place? That's the only thing I can even think of right now. When I had my old card (same model as the replacement) before it died, it didn't do this. I'm sure the dump info is pretty much identical to some of the others I've posted, but I'm going to upload it anyway.

EDIT: I got in contact with EVGA support again and told the support representative all the things I'd done to try to fix the crashes. He suggested disabling the HD Audio Driver in Device Manager (the 0 slot was onboard, but there were two other entries in my Device Manager, which he said was odd). I disabled both of the entries besides the onboard one, and he suggested a clean install of the driver. He said that disabling that HD audio driver could sometimes resolve instability issues. So I did that, and now I'm waiting to see. So far so good, but it's only been a few minutes. He also said that if that didn't work, I should adjust the voltage setting for my GPU itself using a tool from EVGA called PrecisionX. So if I experience any crashes in the next day or so, I'll run that and increase the voltage a couple of times, in increments of 50 millivolts. He said that's pretty much all they can do to troubleshoot based on what I've done so far, so I'm really hoping that one of these steps will finally resolve this issue. I really don't want to have to wait for a replacement card again. I'll post an update later.

Arc · 30 Jan 2013

How far have you followed the STOP 0x116: VIDEO_TDR_ERROR troubleshooting guide?