I've been dealing with this issue for almost 2 months now, to no avail, so I'll document the problem and the steps I've taken to try and 'troubleshoot' the source of the problem.
The Problem:
Almost 2 months ago my old GPU (EVGA 9800 GTX+) died. I bought an XFX HD 7850 2GB and have experienced numerous BSODs since, all with the same error message, "Attempt to reset the display driver and recover from timeout failed", while playing Skyrim. I've tried various versions of AMD's Catalyst driver, but none has resolved the issue. I have not overclocked the GPU; everything is at factory default settings. The BSODs are not reproducible on demand, but they keep recurring. So far they have only happened when I play Skyrim, usually after 40+ minutes of gaming (but not always, as I've gone longer than that without a BSOD). First I see artifacting on the screen, to the point where I can't see anything. Sometimes I can still pull up the in-game menu and save the game (sometimes I cannot, as the system freezes), and then comes the BSOD. In 2 other instances, I got the BSOD after I quit Skyrim and immediately tried to launch Firefox. In all cases, the error message is the same:
Probably caused by : atikmpag.sys
VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
*I will post the latest full crash dump at the end of this post.
My system specs:
Mobo: EVGA NVIDIA nForce 750i SLI FTW
CPU: Intel Core 2 Duo E8400 3.00 GHz
PSU: OCZ GXS 600W
OS: Windows 7 Professional 64-bit SP1
RAM: 2GB x2
HDD: Seagate 500 GB
What I've done to 'troubleshoot' the problem:
- Display Drivers: I have tried different versions of AMD's Catalyst driver. The first version I installed, right after installing the GPU, was the 'release build' Catalyst 13.1. I uninstalled the old NVIDIA display driver, cleaned out orphaned NVIDIA driver files in safe mode using Driver Sweeper, then installed Catalyst 13.1. Other Catalyst versions I have tried are: 12.8, 12.11, 13.2 beta 3, 13.2 beta 5, 13.3 beta 3 (latest beta as of the time of this post).
- I used FurMark and MSI Kombustor to stress test my GPU. No BSODs.
- I have updated all drivers for my motherboard (sound and network drivers), and flashed my BIOS to the latest available version, "EVGA 750i FTW (E175) - SZ1K BIOS Released" (it is an old motherboard).
- To eliminate the possibility of 'overheating', I used SpeedFan, HWInfo, and MSI Afterburner to monitor my system temperature (the latter two programs allowed me to monitor temperature during the game). No apparent 'overheating' during Skyrim or before the BSODs (highest GPU temperature is 60 degrees Celsius, highest CPU temperature is 57 degrees Celsius). The motherboard temperature displayed in the screenshot is inaccurate. Apparently both SpeedFan and HWInfo have issues reading the correct temperature for my motherboard. I checked in the BIOS, and the idle temperature for my motherboard is 40 degrees Celsius, not 100+....
- I used Memtest86+ to test my memory: 8 passes without errors (it took a bloody long time). I also reseated the memory sticks.
- I used SeaTools to test my HDD for bad sectors. Found none.
- Windows Power Management is set to 'performance' so my PC and display will never go into 'sleep mode.'
- I got a power supply tester from a friend to test my PSU to rule out a failing PSU as the culprit.
- I contacted XFX to initiate the RMA process. The entire process took 3 weeks. They told me they tested the card and found no problems, but that they would send me a replacement card of the same model (which I assume would be a refurbished GPU). After receiving and installing the replacement card, I am still experiencing the same BSOD with the same error message.
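For anyone who wants to repeat the temperature check from the list above on their own logs, here is a minimal Python sketch that pulls the peak readings out of a sensor log exported as CSV. The column names (`time`, `gpu_temp_c`, `cpu_temp_c`), the 90 degree limit, and the sample rows are all assumptions for illustration. Real exports from HWInfo or MSI Afterburner use their own headers, so the names would need to be adjusted to match.

```python
import csv
import io


def summarize_temps(csv_text, columns=("gpu_temp_c", "cpu_temp_c"), limit_c=90.0):
    """Return {column: (peak, reached_limit)} for each temperature column.

    The column names and the limit are hypothetical defaults, not values
    taken from any particular monitoring tool.
    """
    peaks = {name: float("-inf") for name in columns}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name in columns:
            peaks[name] = max(peaks[name], float(row[name]))
    return {name: (peak, peak >= limit_c) for name, peak in peaks.items()}


# Made-up sample rows, chosen only to mirror the peaks quoted in the post.
SAMPLE_LOG = """time,gpu_temp_c,cpu_temp_c
13:20:00,58,55
13:25:00,60,57
13:29:00,59,56
"""

summary = summarize_temps(SAMPLE_LOG)
for name, (peak, reached_limit) in summary.items():
    print(name, peak, "AT/OVER LIMIT" if reached_limit else "ok")
```

Looking at the last few rows before a crash, not just the overall peak, is probably the more useful check, since a sudden spike right before the BSOD would not stand out in an average.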
Here is my latest crash dump.
Code:
WARNING: Whitespace at end of path element
Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Kernel Version 7601 (Service Pack 1) MP (2 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.18044.amd64fre.win7sp1_gdr.130104-1431
Machine Name:
Kernel base = 0xfffff800`03052000 PsLoadedModuleList = 0xfffff800`03296670
Debug session time: Fri Mar 29 13:29:38.986 2013 (UTC - 4:00)
System Uptime: 0 days 1:59:31.718
Loading Kernel Symbols
...............................................................
................................................................
..............................................
Loading User Symbols
Loading unloaded module list
.......
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 116, {fffffa800512a4e0, fffff880079398a4, 0, 2}
*** WARNING: Unable to verify timestamp for atikmpag.sys
*** ERROR: Module load completed but symbols could not be loaded for atikmpag.sys
Probably caused by : atikmpag.sys ( atikmpag+98a4 )
Followup: MachineOwner
---------
1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: fffffa800512a4e0, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff880079398a4, The pointer into responsible device driver module (e.g. owner tag).
Arg3: 0000000000000000, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 0000000000000002, Optional internal context dependent data.
Debugging Details:
------------------
FAULTING_IP:
atikmpag+98a4
fffff880`079398a4 4055 push rbp
DEFAULT_BUCKET_ID: GRAPHICS_DRIVER_TDR_FAULT
CUSTOMER_CRASH_COUNT: 1
BUGCHECK_STR: 0x116
PROCESS_NAME: System
CURRENT_IRQL: 0
STACK_TEXT:
fffff880`03aac988 fffff880`0730f000 : 00000000`00000116 fffffa80`0512a4e0 fffff880`079398a4 00000000`00000000 : nt!KeBugCheckEx
fffff880`03aac990 fffff880`0730ed0a : fffff880`079398a4 fffffa80`0512a4e0 fffffa80`07039c00 fffffa80`06fb9010 : dxgkrnl!TdrBugcheckOnTimeout+0xec
fffff880`03aac9d0 fffff880`073b5f07 : fffffa80`0512a4e0 00000000`00000000 fffffa80`07039c00 fffffa80`06fb9010 : dxgkrnl!TdrIsRecoveryRequired+0x1a2
fffff880`03aaca00 fffff880`073dfb75 : 00000000`ffffffff 00000000`000702e8 00000000`00000000 00000000`00000002 : dxgmms1!VidSchiReportHwHang+0x40b
fffff880`03aacae0 fffff880`073de2bb : 00000000`00000102 00000000`00000004 00000000`000702e8 00000000`00000000 : dxgmms1!VidSchiCheckHwProgress+0x71
fffff880`03aacb10 fffff880`073b12c6 : ffffffff`ff676980 fffffa80`06fb9010 00000000`00000000 00000000`00000000 : dxgmms1!VidSchiWaitForSchedulerEvents+0x1fb
fffff880`03aacbb0 fffff880`073dde7a : 00000000`00000000 fffffa80`075278d0 00000000`00000080 fffffa80`06fb9010 : dxgmms1!VidSchiScheduleCommandToRun+0x1da
fffff880`03aaccc0 fffff800`0336834a : 00000000`fffffc32 fffffa80`07040060 fffffa80`03c70b30 fffffa80`07040060 : dxgmms1!VidSchiWorkerThread+0xba
fffff880`03aacd00 fffff800`030b8946 : fffff880`009e6180 fffffa80`07040060 fffff880`009f0f40 6c43582b`30685035 : nt!PspSystemThreadStartup+0x5a
fffff880`03aacd40 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16
STACK_COMMAND: .bugcheck ; kb
FOLLOWUP_IP:
atikmpag+98a4
fffff880`079398a4 4055 push rbp
SYMBOL_NAME: atikmpag+98a4
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: atikmpag
IMAGE_NAME: atikmpag.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 5147bf76
FAILURE_BUCKET_ID: X64_0x116_IMAGE_atikmpag.sys
BUCKET_ID: X64_0x116_IMAGE_atikmpag.sys
Followup: MachineOwner
---------
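Since I have several of these dumps, a small script makes it easier to compare them. Below is a rough Python sketch that extracts the bugcheck code, its arguments, the blamed image, and the driver image timestamp from saved `!analyze -v` text like the output above. The function name `parse_bugcheck` is just something made up for this example, and it assumes the `DEBUG_FLR_IMAGE_TIMESTAMP` value is the usual hex count of seconds since 1970 (UTC) taken from the driver file's header.

```python
import re
from datetime import datetime, timezone


def parse_bugcheck(text):
    """Extract a few key fields from WinDbg '!analyze -v' output (a sketch)."""
    info = {}
    m = re.search(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", text)
    if m:
        info["code"] = int(m.group(1), 16)  # bugcheck codes are printed in hex
        info["args"] = [a.strip() for a in m.group(2).split(",")]
    m = re.search(r"IMAGE_NAME:\s*(\S+)", text)
    if m:
        info["image"] = m.group(1)
    m = re.search(r"DEBUG_FLR_IMAGE_TIMESTAMP:\s*([0-9A-Fa-f]+)", text)
    if m:
        # Assumption: hex seconds since the Unix epoch, interpreted as UTC.
        info["driver_built"] = datetime.fromtimestamp(int(m.group(1), 16), tz=timezone.utc)
    return info


# Lines copied from the dump above.
SAMPLE = """\
BugCheck 116, {fffffa800512a4e0, fffff880079398a4, 0, 2}
IMAGE_NAME: atikmpag.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 5147bf76
"""

info = parse_bugcheck(SAMPLE)
print(hex(info["code"]), info["image"], info["driver_built"].date())
```

Running this over each saved analysis should show whether every crash names the same module and which driver build was loaded at the time.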
As I do not have a spare machine to test the GPU in, nor a spare modern GPU to test in my machine, I've pretty much exhausted all the options I know of and am at my wit's end. If you want me to run some sort of log collector, I'll be happy to do that. I am not sure I have another modern game I can use to check whether this is a 'Skyrim-specific' BSOD (but what could possibly cause that?)....maybe I can try and dig up my copy of GTA IV....anyway, thanks for reading.