I'm still looking through the kernel dump. I'm trying to trace back where the "no resource" error that's holding up the driver originally came from, but I'm running into a bit of a snag. Suffice it to say, though, your newest crash is identical to the previous ones, except Avast is no longer present.
I've read through your OP, and I think this may be either a bad motherboard or a rare case where your replacement GPU ended up being bad, too! Either way, we're definitely dealing with some bad hardware here; it's a matter of figuring out just which part.
You mentioned you ran Memtest. How many passes of it did you actually run? It should be at least 7 consecutive passes; one is simply not enough. I also recommend Prime95. Run its Torture Test on Blend settings overnight, then do another overnight run on Large FFTs. Report back on what crashed/errored and what didn't, and make sure your CPU doesn't overheat while running this, since Prime95 runs at max load!
Of course, there's also the possibility that your PSU is bad and only bugs out during higher-load operations, especially when the GPU is involved. Unfortunately there's no definitive test for it besides taking it to a shop or buying a diagnostic kit.
Analysts:
Nothing so far appears unusual software-wise. First I checked the TDR_RECOVERY_CONTEXT pointed to by Arg1 of the bugcheck. While I don't have any symbols for it, I decided to go dirty and just dump raw memory:
Code:
3: kd> dps fffffa80085ff4e0;dc fffffa80085ff4e0
fffffa80`085ff4e0 00000000`52445476
fffffa80`085ff4e8 fffffa80`09b72be8
fffffa80`085ff4f0 00000000`00000002
fffffa80`085ff4f8 00000000`00000080
fffffa80`085ff500 fffffa80`09b87000
fffffa80`085ff508 fffffa80`073f34c0
fffffa80`085ff510 00000000`00000000
fffffa80`085ff518 00000000`00038dfd
fffffa80`085ff520 00000000`00038dfd
fffffa80`085ff528 00000000`ffffffff
fffffa80`085ff530 fffff880`0520cb10 nvlddmkm!nvDumpConfig+0x2cdc38
fffffa80`085ff538 00000000`52445476
fffffa80`085ff540 00000000`00042caa
fffffa80`085ff548 00000117`00000010
fffffa80`085ff550 00040000`00000001
fffffa80`085ff558 00002005`00000007
fffffa80`085ff4e0 52445476 00000000 09b72be8 fffffa80 vTDR.....+......
fffffa80`085ff4f0 00000002 00000000 00000080 00000000 ................
fffffa80`085ff500 09b87000 fffffa80 073f34c0 fffffa80 .p.......4?.....
fffffa80`085ff510 00000000 00000000 00038dfd 00000000 ................
fffffa80`085ff520 00038dfd 00000000 ffffffff 00000000 ................
fffffa80`085ff530 0520cb10 fffff880 52445476 00000000 .. .....vTDR....
fffffa80`085ff540 00042caa 00000000 00000010 00000117 .,..............
fffffa80`085ff550 00000001 00040000 00000007 00002005 ............. ..
While dps didn't show anything much worthwhile besides the Nvidia driver doing a dump config, what I did notice was that this memory allocation starts with some ASCII characters that look very much like a pool tag! As you know, kernel/driver memory resides in a part of RAM reserved for it called pool memory. To my understanding, all pool allocations by drivers are required to have an associated pool tag to identify the owner, which typically comes as 4 alphanumeric characters. In our case it's vTDR. So obviously we're looking at pool memory here, which we can get more details on via !pool, using the same address we dumped the contents from:
Code:
3: kd> !pool fffffa80085ff4e0
Pool page fffffa80085ff4e0 region is Nonpaged pool
fffffa80085ff000 size: 170 previous size: 0 (Free ) CcPL
fffffa80085ff170 size: 90 previous size: 170 (Allocated) WfpH
fffffa80085ff200 size: 210 previous size: 90 (Free) CcSc
fffffa80085ff410 size: c0 previous size: 210 (Allocated) WfpL
*fffffa80085ff4d0 size: b30 previous size: c0 (Allocated) *vTDR
Pooltag vTDR : Video timeout detection/recovery, Binary : dxgkrnl.sys
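As a side note, here's a minimal Python sketch of the two sanity checks I'm doing by eye on this output: decoding the dword 52445476 into its ASCII tag, and checking that the entries listed by !pool tile the page with no gaps. This is my own illustration, not something WinDbg produces. The helper names are made up, and the 16-byte pool header size on x64 is an assumption on my part.

```python
import struct

PAGE_SIZE = 0x1000        # 4 KB page, which the listing itself adds up to
POOL_HEADER_SIZE = 0x10   # assumption: 16-byte pool header on x64

# (offset within the page, size, tag), copied from the !pool listing above
POOL_ENTRIES = [
    (0x000, 0x170, "CcPL"),
    (0x170, 0x090, "WfpH"),
    (0x200, 0x210, "CcSc"),
    (0x410, 0x0C0, "WfpL"),
    (0x4D0, 0xB30, "vTDR"),
]


def decode_pool_tag(value: int) -> str:
    """Read a 32-bit value as four little-endian ASCII characters."""
    return struct.pack("<I", value & 0xFFFFFFFF).decode("ascii")


def entries_fill_page(entries, page_size=PAGE_SIZE) -> bool:
    """True if each entry starts where the previous one ended and the
    last one ends exactly at the page boundary."""
    expected = 0
    for start, size, _tag in entries:
        if start != expected:
            return False
        expected = start + size
    return expected == page_size


def owning_entry(entries, offset):
    """Return the (start, size, tag) entry containing the page offset."""
    for start, size, tag in entries:
        if start <= offset < start + size:
            return start, size, tag
    return None


if __name__ == "__main__":
    print(decode_pool_tag(0x52445476))          # vTDR
    print(entries_fill_page(POOL_ENTRIES))      # True
    start, size, tag = owning_entry(POOL_ENTRIES, 0x4E0)
    print(tag, hex(0x4E0 - start))              # vTDR 0x10
```

The last line shows that the Arg1 address (page offset 4e0) sits 0x10 bytes past the start of the vTDR entry at 4d0, which is what I'd expect if the pointer refers to the data right after the pool header.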
The !pool output confirms it is indeed pool memory, and as was hinted by the TDR_RECOVERY_CONTEXT name, the tag says it's associated with video TDR, so this pool allocation contains the recovery context. That's also kinda hinted by Nvidia's nvDumpConfig function being present; I assume part of the context holds contents from the card's configuration. When checking the entire allocation, I didn't find anything of note other than that it's a somewhat larger allocation (nothing unusual, to my understanding) and has a large range of zeroed data, but that may just be to ensure whatever data the card needs to dump can fit inside the allocation. Overall there's nothing I can personally find odd. However, I would like to know whether the driver is perhaps running out of memory to use, so I'll check system memory usage with !vm:
Code:
3: kd> !vm
*** Virtual Memory Usage ***
Physical Memory: 2089506 ( 8358024 Kb)
Page File: \??\C:\pagefile.sys
Current: 8358024 Kb Free Space: 8358020 Kb
Minimum: 8358024 Kb Maximum: 25074072 Kb
Available Pages: 1314351 ( 5257404 Kb)
ResAvail Pages: 1950874 ( 7803496 Kb)
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 33512065 ( 134048260 Kb)
Modified Pages: 46961 ( 187844 Kb)
Modified PF Pages: 46827 ( 187308 Kb)
NonPagedPool Usage: 14816 ( 59264 Kb)
NonPagedPool Max: 1552404 ( 6209616 Kb)
PagedPool 0 Usage: 32245 ( 128980 Kb)
PagedPool 1 Usage: 6766 ( 27064 Kb)
PagedPool 2 Usage: 1624 ( 6496 Kb)
PagedPool 3 Usage: 1720 ( 6880 Kb)
PagedPool 4 Usage: 1698 ( 6792 Kb)
PagedPool Usage: 44053 ( 176212 Kb)
PagedPool Maximum: 33554432 ( 134217728 Kb)
Session Commit: 7049 ( 28196 Kb)
Shared Commit: 108163 ( 432652 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 7205 ( 28820 Kb)
PagedPool Commit: 44103 ( 176412 Kb)
Driver Commit: 8262 ( 33048 Kb)
Committed pages: 866927 ( 3467708 Kb)
Commit limit: 4178549 ( 16714196 Kb)
Total Private: 587401 ( 2349604 Kb)
0fe4 League of Lege 311834 ( 1247336 Kb)
0ae4 LolClient.exe 100257 ( 401028 Kb)
0074 svchost.exe 33119 ( 132476 Kb)
097c Skype.exe 32174 ( 128696 Kb)
0a18 explorer.exe 9597 ( 38388 Kb)
08bc svchost.exe 9340 ( 37360 Kb)
0db0 LoLLauncher.ex 8526 ( 34104 Kb)
09f8 dwm.exe 8091 ( 32364 Kb)
0518 AvastSvc.exe 8048 ( 32192 Kb)
0630 audiodg.exe 7374 ( 29496 Kb)
03d8 svchost.exe 5607 ( 22428 Kb)
0c1c SearchIndexer. 5217 ( 20868 Kb)
0230 csrss.exe 4533 ( 18132 Kb)
0164 svchost.exe 3952 ( 15808 Kb)
05b8 svchost.exe 2891 ( 11564 Kb)
0488 svchost.exe 2868 ( 11472 Kb)
0a50 nvxdsync.exe 2070 ( 8280 Kb)
042c svchost.exe 1988 ( 7952 Kb)
0c58 nvtray.exe 1854 ( 7416 Kb)
09d4 AvastUI.exe 1641 ( 6564 Kb)
059c spoolsv.exe 1517 ( 6068 Kb)
0a74 nvvsvc.exe 1448 ( 5792 Kb)
0dcc svchost.exe 1369 ( 5476 Kb)
0e8c wmpnetwk.exe 1338 ( 5352 Kb)
025c services.exe 1308 ( 5232 Kb)
0304 svchost.exe 1199 ( 4796 Kb)
0390 svchost.exe 1163 ( 4652 Kb)
07a4 iTunesHelper.e 1150 ( 4600 Kb)
0264 lsass.exe 1073 ( 4292 Kb)
0768 AppleMobileDev 917 ( 3668 Kb)
11a0 WmiPrvSE.exe 891 ( 3564 Kb)
07e4 rads_user_kern 872 ( 3488 Kb)
096c iPodService.ex 854 ( 3416 Kb)
0a80 taskhost.exe 836 ( 3344 Kb)
09a8 HsMgr64.exe 806 ( 3224 Kb)
0150 daemonu.exe 764 ( 3056 Kb)
078c mDNSResponder. 754 ( 3016 Kb)
0e9c wuauclt.exe 724 ( 2896 Kb)
034c nvvsvc.exe 723 ( 2892 Kb)
02a8 winlogon.exe 719 ( 2876 Kb)
09a0 rundll32.exe 711 ( 2844 Kb)
07f4 svchost.exe 709 ( 2836 Kb)
0270 lsm.exe 683 ( 2732 Kb)
0364 nvSCPAPISvr.ex 660 ( 2640 Kb)
01dc csrss.exe 549 ( 2196 Kb)
0380 svchost.exe 548 ( 2192 Kb)
0b74 jusched.exe 518 ( 2072 Kb)
07a0 nusb3mon.exe 511 ( 2044 Kb)
094c HsMgr.exe 500 ( 2000 Kb)
0218 wininit.exe 429 ( 1716 Kb)
0144 smss.exe 139 ( 556 Kb)
0004 System 38 ( 152 Kb)
0f98 chrome.exe 0 ( 0 Kb)
0eac SndVol.exe 0 ( 0 Kb)
0c68 ielowutil.exe 0 ( 0 Kb)
0440 PMB.exe 0 ( 0 Kb)
All healthy, so there must be some other oddity behind the driver not having enough space. Perhaps it's looking for a contiguous memory region that it can't find. I doubt that, but I can't prove or disprove it with my current knowledge. Right now I just think we're dealing with some hardware fluke causing unusual behavior in an otherwise stable environment.
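For what it's worth, the "all healthy" reading of the !vm output can be backed with a bit of arithmetic. Below is a rough Python sketch using page counts copied from the output above. The dictionary keys and helper names are my own, and the 4 KB page size is an assumption, though it matches every page/Kb pair that !vm printed.

```python
PAGE_KB = 4  # assumption: 4 KB pages, consistent with the page/Kb pairs in !vm

# Page counts copied from the !vm output above
VM = {
    "physical": 2089506,
    "available": 1314351,
    "nonpaged_usage": 14816,
    "nonpaged_max": 1552404,
    "committed": 866927,
    "commit_limit": 4178549,
}


def pages_to_kb(pages: int) -> int:
    """Convert a page count to kilobytes."""
    return pages * PAGE_KB


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print("Physical memory (Kb):", pages_to_kb(VM["physical"]))
    print("Available physical:   %.1f%%" % percent(VM["available"], VM["physical"]))
    print("Nonpaged pool in use: %.2f%%" % percent(VM["nonpaged_usage"], VM["nonpaged_max"]))
    print("Commit charge in use: %.1f%%" % percent(VM["committed"], VM["commit_limit"]))
```

Nonpaged pool usage comes out at under 1% of its maximum, commit charge at roughly a fifth of the limit, and well over half of physical memory is still available, so nothing here suggests the system as a whole was short on memory at the time of the crash.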