Sure thing, the video driver version I have on here currently is Nvidia 306.23, the one you suggested. I selected it from the menu of beta and legacy drivers after doing a search for my card's particulars. Windows Update wanted to try to update it to a later version, but I hid/disabled that update. I'm just going to keep this version on here for now and see what happens over the next day or so of using my PC.
Well crud, updating here: something crashed again, and I believe it was my graphics driver. All I did was open an internet browsing window; my primary monitor flashed black and after a moment or two went into power save mode, followed closely by my second monitor. I didn't actually see a blue screen, but when I logged back on, I was given the following blue screen information for the unexpected crash:
I'm going to grab the files and attach a new zip to this post with the latest crash dump.
Follow the guide: STOP 0x116: VIDEO_TDR_ERROR troubleshooting

Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 116, {fffffa8009ec44e0, fffff8800f37e584, ffffffffc00000b5, a}

Unable to load image \SystemRoot\system32\DRIVERS\nvlddmkm.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for nvlddmkm.sys
*** ERROR: Module load completed but symbols could not be loaded for nvlddmkm.sys
Probably caused by : nvlddmkm.sys ( nvlddmkm+14b584 )

Followup: MachineOwner
---------

7: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: fffffa8009ec44e0, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff8800f37e584, The pointer into responsible device driver module (e.g. owner tag).
Arg3: ffffffffc00000b5, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 000000000000000a, Optional internal context dependent data.

Debugging Details:
------------------

FAULTING_IP:
nvlddmkm+14b584
fffff880`0f37e584 4055 push rbp

DEFAULT_BUCKET_ID: GRAPHICS_DRIVER_TDR_FAULT
CUSTOMER_CRASH_COUNT: 1
BUGCHECK_STR: 0x116
PROCESS_NAME: System
CURRENT_IRQL: 0

STACK_TEXT:
fffff880`02d14638 fffff880`02f42000 : 00000000`00000116 fffffa80`09ec44e0 fffff880`0f37e584 ffffffff`c00000b5 : nt!KeBugCheckEx
fffff880`02d14640 fffff880`02f41ddb : fffff880`0f37e584 fffffa80`09ec44e0 fffffa80`114e8160 fffffa80`114cb010 : dxgkrnl!TdrBugcheckOnTimeout+0xec
fffff880`02d14680 fffff880`02e0ff07 : fffffa80`09ec44e0 00000000`c00000b5 fffffa80`114e8160 fffffa80`114cb010 : dxgkrnl!TdrIsRecoveryRequired+0x273
fffff880`02d146b0 fffff880`02e3dd5a : fffffa80`ffffffff 00000000`0001d41e fffff880`02d14800 00000000`00000006 : dxgmms1!VidSchiReportHwHang+0x40b
fffff880`02d14790 fffff880`02e23bb1 : fffffa80`1150e000 ffffffff`feced300 00000000`00000006 fffffa80`1150e000 : dxgmms1!VidSchWaitForCompletionEvent+0x196
fffff880`02d147d0 fffff880`02e2104c : 00000000`000135f6 fffff8a0`0e83ee50 fffffa80`1150e000 00000000`00000000 : dxgmms1!VIDMM_GLOBAL::xWaitForAllEngines+0x1e9
fffff880`02d148d0 fffff880`02e1e8af : 00000000`00000000 fffffa80`0b5e7070 00000000`00000034 fffffa80`0a35c158 : dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForSubmission+0xfc
fffff880`02d14910 fffff880`02e3865d : 00000000`00000000 fffff8a0`0dd9ffd0 fffffa80`00000000 fffffa80`0fbae900 : dxgmms1!VIDMM_GLOBAL::PrepareDmaBuffer+0xe1b
fffff880`02d14ae0 fffff880`02e38398 : fffff800`00b96080 fffff880`02e37d00 fffffa80`00000000 fffffa80`00000000 : dxgmms1!VidSchiSubmitRenderCommand+0x241
fffff880`02d14cd0 fffff880`02e37e96 : 00000000`00000000 fffffa80`0f43f010 00000000`00000080 fffffa80`114cb010 : dxgmms1!VidSchiSubmitQueueCommand+0x50
fffff880`02d14d00 fffff800`0371ce5a : 00000000`054111df fffffa80`1150d060 fffffa80`09d449e0 fffffa80`1150d060 : dxgmms1!VidSchiWorkerThread+0xd6
fffff880`02d14d40 fffff800`03476d26 : fffff800`035f8e80 fffffa80`1150d060 fffff800`03606cc0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`02d14d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16

STACK_COMMAND: .bugcheck ; kb

FOLLOWUP_IP:
nvlddmkm+14b584
fffff880`0f37e584 4055 push rbp

SYMBOL_NAME: nvlddmkm+14b584
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 503f8bb8
FAILURE_BUCKET_ID: X64_0x116_IMAGE_nvlddmkm.sys
BUCKET_ID: X64_0x116_IMAGE_nvlddmkm.sys

Followup: MachineOwner
---------

7: kd> lmvm nvlddmkm
start end module name
fffff880`0f233000 fffff880`0ff1b000 nvlddmkm T (no symbols)
Loaded symbol image file: nvlddmkm.sys
Image path: \SystemRoot\system32\DRIVERS\nvlddmkm.sys
Image name: nvlddmkm.sys
Timestamp: Thu Aug 30 21:20:16 2012 (503F8BB8)
CheckSum: 00CCA304
ImageSize: 00CE8000
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
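For anyone trying to read a dump like the one above, here is a rough Python sketch (not part of the original analysis) that pulls out the bugcheck code, its arguments and the blamed module, then checks that Arg2 really is the nvlddmkm base address plus the reported offset. The function name `summarize_bugcheck` and the `SAMPLE` string are made up for illustration; the base address is copied from the `lmvm` output. As far as I know, the low 32 bits of Arg3 (0xC00000B5) correspond to STATUS_IO_TIMEOUT, which would fit a GPU that stopped responding, but treat that as an assumption.

```python
import re


def summarize_bugcheck(dump_text: str) -> dict:
    """Extract the bugcheck code, its arguments, and the blamed module from WinDbg text."""
    bug = re.search(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", dump_text)
    blamed = re.search(
        r"Probably caused by\s*:\s*(\S+)\s*\(\s*(\w+)\+([0-9A-Fa-f]+)\s*\)",
        dump_text,
    )
    if bug is None or blamed is None:
        raise ValueError("not a recognizable WinDbg bugcheck summary")
    return {
        "code": int(bug.group(1), 16),
        "args": [int(a.strip(), 16) for a in bug.group(2).split(",")],
        "image": blamed.group(1),
        "module": blamed.group(2),
        "offset": int(blamed.group(3), 16),
    }


# Two lines copied from the dump above (hypothetical variable name).
SAMPLE = """\
BugCheck 116, {fffffa8009ec44e0, fffff8800f37e584, ffffffffc00000b5, a}
Probably caused by : nvlddmkm.sys ( nvlddmkm+14b584 )
"""

# Module start address, copied from the "lmvm nvlddmkm" output.
NVLDDMKM_BASE = 0xFFFFF8800F233000

info = summarize_bugcheck(SAMPLE)
# Arg2 should equal module base + offset if the blame is self-consistent.
fault_address = NVLDDMKM_BASE + info["offset"]
# Arg3 is a sign-extended NTSTATUS; keep only the low 32 bits.
ntstatus = info["args"][2] & 0xFFFFFFFF

print(hex(info["code"]), info["image"], hex(fault_address), hex(ntstatus))
```

The computed fault address matches Arg2, so the debugger's pointer does land inside the Nvidia kernel driver. That only says where the timeout was detected, not whether the driver, the card, or something else is at fault.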
Hi, sorry it took me a little while to get back to this thread. I tried some of the troubleshooting tips and things had been improving slightly; at least it wasn't actually blue-screening/force restarting when the driver would crash. What was particularly strange was when the driver would crash. I'd be on my day off, playing Star Wars: The Old Republic on Very High graphics settings while using Fraps to record my gameplay, and it would be fine for a fairly lengthy gaming session. Then, after quitting SWTOR and doing some average computer use (web browsing, etc.), the driver would crash and recover.
It was just strange because the crashes didn't seem to depend on how graphically intense my current activity was. I don't think it's the driver itself: I rolled that back to a more stable version based on advice in this thread and haven't updated it since (uninstalled Nvidia Update and hid the Windows updates relating to the video driver). So the only thing it seemed it could be was the RAM issue the guy from EVGA technical support had spoken with me about.
So I RMA'd my 3 sticks of 4 gigs of Corsair Vengeance and got an advance RMA so that I wouldn't be without my PC while waiting. The replacement RAM just got in yesterday; I installed it just a few minutes ago and will be observing my PC for the next couple of days. If it crashes anymore, I have no idea what the problem could be (or maybe it's just fairly normal, I dunno...). At this point, I've RMA'd pretty much every single piece of my PC build except for the motherboard, the processor and the case itself. I'll update in about 48 hours, hopefully with positive results, and--if the problem has resolved--I'll include a very brief summary of what the problem likely was and what I did to fix it.
Thanks again to everyone who provided me with advice for this problem.
This is frustrating. I've tried a lot of things now: I've had basically my entire computer build RMA'd, I followed the troubleshooting guide, and the graphics driver still crashes from time to time. Fortunately, lately it hasn't been followed by a blue screen or forced restart, but there was one time that a timeout did occur. I'll just go ahead and attach that dump file here, but I'm sure it will likely be similar--if not identical--to the one I last posted.
I guess a question I have: would having three monitors hooked up to my card have any impact on the graphics driver crashing? The primary monitor is 1080p, but the other two are lower resolution, just so you're aware. I also have the Windows 7 computer usage gauge monitor showing and at any given time when I'm not doing anything particularly stressful on my machine--like right now, typing this out, for instance--the CPU usage is at roughly 2 percent and the RAM usage is at about 18 percent.
Another question I have: Is there a program I can get a hold of to monitor the temperature of my GPU at all times? I've run benchmark programs before--such as the one from EVGA's (the manufacturer of my card) site--which are able to monitor the GPU's temperature and percent usage, but I haven't been able to find one which will just allow me to see the temperature at all times. On that note, even when running the stress tests and benchmarks, the GPU temperature has never exceeded 86 degrees, that's generally where it tops out and it has completed the full 2 hour stress test from EVGA's site without the driver failing once.
That's one thing I don't understand: the driver seems more prone to failure when I'm doing something like I am now than when I'm actually playing a graphically intense game, such as Star Wars: The Old Republic. It will occasionally fail and recover in SWTOR, but it seems to do so less often than when I'm using the PC for simple tasks such as web or forum browsing, email, typing responses, Notepad, etc.
Your GPU is good enough to control three displays IMO.
There are a few gadgets you can try:
You have done a lot; try one more thing.
- Download and install Driver Fusion.
- Reboot the computer into safe mode from Advanced Boot Options. Search for Driver Fusion in your Start menu, and remove all components of your nVidia display driver.
- Boot normally now. Go to Drivers - Download NVIDIA Drivers, Drivers > Beta and Legacy; search there with your card's particulars for GeForce 306.23 Driver, dated 13.9.2012, and install it.
Let us know the results.
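On the question of watching the GPU temperature all the time: besides desktop gadgets, a small script can poll it. Below is a minimal Python sketch, assuming the installed Nvidia driver ships the `nvidia-smi` command-line tool, that it is on the PATH, and that your version supports the `--query-gpu` option (older drivers may not, and some GeForce cards report limited data). The function names and the 86-degree warning level (borrowed from the stress-test figure mentioned earlier) are just illustrative choices.

```python
import subprocess
import time

# Assumption: nvidia-smi is installed with the driver and reachable on PATH.
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(output: str) -> list[int]:
    """Turn nvidia-smi CSV output (one line per GPU) into integer temperatures in C."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def read_temps() -> list[int]:
    """Run nvidia-smi once and return the current temperature of each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temps(result.stdout)


def watch(interval_s: float = 5.0, warn_at_c: int = 86) -> None:
    """Print a timestamped reading every interval_s seconds until Ctrl+C."""
    while True:
        temps = read_temps()
        flag = "  <-- at or above warning level" if temps and max(temps) >= warn_at_c else ""
        print(time.strftime("%H:%M:%S"), temps, flag)
        time.sleep(interval_s)
```

Calling `watch()` from a console window left open on a second monitor gives a running log, so the last reading before a driver crash is still visible afterwards. Redirecting the output to a file would keep the history across a forced restart.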
I went ahead and did that just now. I uninstalled the video driver, restarted, used advanced boot into safe mode, got rid of all Nvidia components with Driver Fusion, restarted, then used advanced boot into safe mode again and installed the Nvidia driver version you suggested (306.23), specifically in safe mode so that Windows Update wouldn't attempt to install a later version. Then I rebooted normally. I'll post an update here in a day or so. Thanks for the suggestion and thanks for the tools; I'll give them a look and make sure my graphics card isn't running hot.
Well, that was awfully fast. About a minute, maybe two minutes tops, after I had made that post and gotten the temp monitoring tool installed, I opened my web browser again. The temp for GPU1 showed at roughly 43 C and GPU was about 62 C, and then the driver experienced another fatal crash which forced a shutdown. I don't understand what else this could be. Is it possible that after all this time RMAing other stuff after the graphics card, EVGA might have accidentally sent me a faulty replacement graphics card in the first place? That's the only thing I can even think of right now. When I had my old card (same as the replacement) before it died, it didn't do this. I'm sure the dump info is likely pretty much identical to some of the others I've posted, but I'm going to upload it anyway.
EDIT: I got into contact with EVGA support again and told the support representative all the things I'd done to try to fix the crashes. He suggested disabling the HD Audio Driver in Device Manager (the 0 slot was onboard, but there were two other entries in my Device Manager, which he said was odd). I disabled both of the two besides the onboard, and he suggested a clean install of the driver. He said that disabling that HD audio driver could sometimes resolve instability issues. So I did that, and now I'm waiting to see. So far so good, but it's only been a few minutes. He also said if that didn't work, I should adjust the voltage setting for my GPU itself using a tool from EVGA called PrecisionX. So if I experience any crashes in the next day or so, I'll run that and increase the voltage in increments of 50 millivolts a couple of times. He said that's pretty much all they can do to troubleshoot based on what I've done thus far, so I'm really hoping that one of these steps will finally resolve this issue. I really don't want to have to wait for a replacement card again. I'll post an update later.
Last edited by AndySitz; 29 Jan 2013 at 16:58.