- May 7, 2013
- 10,400
Rich (BB code):
DRIVER_PNP_WATCHDOG (1d5)
A driver has failed to complete a Pnp operation within a specific time.
Arguments:
Arg1: 0078006600670069, First few character of the service associated with the devnode.
Arg2: fffff80189897750, Pointer to the nt!TRIAGE_PNP_WATCHDOG on Win10 RS4 and higher.
Arg3: ffffc88d43595040, Thread responsible for the Pnp Watchdog.
Arg4: 0000000000057e4f, Milliseconds elapsed since the watchdog was armed.
The description of the bugcheck is straightforward, a thread has got stuck and hasn't completed a PnP I/O request within the specified time limit, which is what the PnP watchdog is responsible for monitoring. The first parameter doesn't actually contain any useful information but you would usually use the dc command with this pointer. The most useful parameter is actually the third parameter which contains the address of the stuck thread.
Now, there is actually two ways which you can debug this crash: you can either go through the call stack of the hung thread or use !pnptriage and then for PnP requests. We'll do the latter and then confirm this finding by using the former method.
Rich (BB code):
0: kd> !pnptriage
********************************************************************************
Dumping PnP DeviceAction Queue @ 0xfffff80184643400
********************************************************************************
Dumping nt!_PNP_DEVICE_ACTION_ENTRY @ 0xffffc88d43fcade0
Request to ReenumerateDeviceOnly !devstack ffffc88d3a526de0
Argument - 00000000
CompletionEvent - 0x00000000
Dumped 1 entries
PnP DeviceActionWorker active!
********************************************************************************
Dumping PnP DeviceEvent Queue @ 0xffffc88d3a49d190
********************************************************************************
List = 0xffffc88d3a49d208, 0xffffc88d3a49d208
Event list is empty
********************************************************************************
Dumping PnP DeviceCompletion Queue @ 0xfffff80184644980
********************************************************************************
1 Pnp operation(s) dispatched (IRP pending) currently.
Dumping pending asynchronous request list...
Dumping pended asynchronous request...
!devnode ffffc88d3c587c50
CompletionState = DeviceNodeEnumerateCompletion (0x30d)
Pending !irp ffffc88d43613a40
Dumping completed asynchronous request list...
[...]
We can see that there is a pending I/O request which is related to a PnP operation. This matches what mentioned in the bugcheck description, if we examine the request using the suggested command then we can find the following:
Rich (BB code):
0: kd> !irp ffffc88d43613a40
Irp is active with 3 stacks 3 is current (= 0xffffc88d43613ba0)
No Mdl: No System Buffer: Thread ffffc88d43595040: Irp stack trace.
cmd flg cl Device File Completion-Context
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_QUERY_DEVICE_RELATIONS(7)]
0 e0 ffffc88d43e8c030 00000000 fffff80183d6d250-ffffc88d41165360 Success Error Cancel
Unable to load image \SystemRoot\system32\DRIVERS\igdkmd64.sys, Win32 error 0n2
\Driver\igfx nt!PnpDeviceCompletionRoutine
Args: 00000000 00000000 00000000 00000000
It looks like a device relations request wasn't completed in the appropriate time by the Intel graphics card driver. However, you may be wondering why there is seemingly two different drivers shown in the output? Well, I wondered the same and at first thought it could have been a separate but related driver, although, it seems that the driver is just an alias for the same driver shown.
Rich (BB code):
sc qc igfx
[SC] QueryServiceConfig SUCCESS
SERVICE_NAME: igfx
TYPE : 1 KERNEL_DRIVER
START_TYPE : 3 DEMAND_START
ERROR_CONTROL : 0 IGNORE
BINARY_PATH_NAME : \SystemRoot\system32\DRIVERS\igdkmd64.sys
LOAD_ORDER_GROUP : Video
TAG : 6
DISPLAY_NAME : igfx
DEPENDENCIES :
SERVICE_START_NAME :
Okay, so we've identified the faulty driver from the !pnptriage but let's investigate a little further and understand exactly why the thread became hung to begin with.
Rich (BB code):
0: kd> .thread ffffc88d43595040
Implicit thread is now ffffc88d`43595040
Rich (BB code):
0: kd> knL
*** Stack trace for last set context - .thread/.cxr resets it
# Child-SP RetAddr Call Site
00 ffffa187`54941040 fffff801`83c1c9c0 nt!KiSwapContext+0x76
01 ffffa187`54941180 fffff801`83c1beef nt!KiSwapThread+0x500
02 ffffa187`54941230 fffff801`83c1b793 nt!KiCommitThreadWait+0x14f (3)
03 ffffa187`549412d0 fffff801`83c17fdd nt!KeWaitForSingleObject+0x233
04 ffffa187`549413c0 fffff801`83c2260a nt!ExpWaitForResource+0x6d
05 ffffa187`54941440 fffff801`83c22074 nt!ExpAcquireResourceSharedLite+0x4da
06 ffffa187`54941500 fffff801`89cc4833 nt!ExAcquireResourceSharedLite+0x44
07 ffffa187`54941540 fffff801`89ce9006 dxgkrnl!DpiFdoHandleQueryDeviceRelations+0xf3 (2)
08 ffffa187`54941600 fffff801`89cac0f9 dxgkrnl!DpiFdoDispatchPnp+0xd6
09 ffffa187`549416c0 fffff801`83c11385 dxgkrnl!DpiDispatchPnp+0xe9
0a ffffa187`549417e0 fffff801`84165b4e nt!IofCallDriver+0x55
0b ffffa187`54941820 fffff801`83c045ee nt!PnpAsynchronousCall+0xea
0c ffffa187`54941860 fffff801`84137d29 nt!PnpSendIrp+0x9e
0d ffffa187`549418d0 fffff801`84137c90 nt!PnpQueryDeviceRelations+0x51
0e ffffa187`54941960 fffff801`84135ce7 nt!PipEnumerateDevice+0xc8 (1)
0f ffffa187`54941990 fffff801`8412fcc6 nt!PipProcessDevNodeTree+0x297
10 ffffa187`54941a60 fffff801`83d6efba nt!PiRestartDevice+0xba
11 ffffa187`54941ab0 fffff801`83c8e5c5 nt!PnpDeviceActionWorker+0x46a
12 ffffa187`54941b70 fffff801`83d265f5 nt!ExpWorkerThread+0x105
13 ffffa187`54941c10 fffff801`83e048d8 nt!PspSystemThreadStartup+0x55
14 ffffa187`54941c60 00000000`00000000 nt!KiStartSystemThread+0x28
Let's start from the beginning of the call stack, a bus enumeration request is created by the PnP Manager (1) which in turns lead to a PnP-based IRP being created as shown earlier. Our device receives the request (2) and begins processing it, however, in order to process the request further it attempts to acquire a lock; this is where the issue starts, the lock is currently being held by another thread, this thread doesn't release the lock in time and the PnP watchdog eventually pulls down the system. Remember, the PnP watchdog is essentially just a thread attached to a timer, which when the timer has expired, will bugcheck the system.
Rich (BB code):
05 ffffa18754941440 fffff80183c22074 nt!ExpAcquireResourceSharedLite+4da
Parameter[0] = ffffc88d4398ec90
Parameter[1] = 0000000000000001
Parameter[2] = (unknown)
Parameter[3] = (unknown)
The ExAcquireResourceSharedLite is responsible for acquiring the lock mentioned earlier, in this case, it is an ERESOURCE object which can dump using the !locks command.
Rich (BB code):
0: kd> !locks ffffc88d4398ec90
Resource @ 0xffffc88d4398ec90 Exclusively owned
Contention Count = 1
NumberOfSharedWaiters = 1
Threads: ffffc88d43cf3040-01<*> ffffc88d43595040-01
1 total locks
Now, the function will attempt to acquire an ERESOURCE lock with shared access, that is to say, acquire this lock while another thread also has acquisition of it. The caveat here is that the thread will only additionally own the lock if the lock isn't exclusively owned. In our case, it is and therefore the thread is placed in a wait state.
Source: ExAcquireResourceSharedLite function (wdm.h) - Windows drivers
- If the resource is currently owned as exclusive by another thread or if there is another thread waiting for exclusive access and the caller does not already have shared access to the resource, the current thread either is put into a wait state (Wait set to TRUE) or ExAcquireResourceSharedLite returns FALSE.
The second parameter indicates that the thread should be put in a wait state if the lock cannot be immediately acquired. To understand where !locks is getting it's information from, you can dump the ERESOURCE object using:
Rich (BB code):
0: kd> dt _ERESOURCE ffffc88d4398ec90
FLTMGR!_ERESOURCE
+0x000 SystemResourcesList : _LIST_ENTRY [ 0xffffc88d`44048dc0 - 0xffffc88d`43e8d050 ]
+0x010 OwnerTable : 0xffffc88d`432206d0 _OWNER_ENTRY
+0x018 ActiveCount : 0n1
+0x01a Flag : 0xc86
+0x01a ReservedLowFlags : 0x86 ''
+0x01b WaiterPriority : 0xc ''
+0x020 SharedWaiters : 0xffffa187`549414a0 Void
+0x028 ExclusiveWaiters : (null)
+0x030 OwnerEntry : _OWNER_ENTRY
+0x040 ActiveEntries : 1
+0x044 ContentionCount : 1
+0x048 NumberOfSharedWaiters : 1
+0x04c NumberOfExclusiveWaiters : 0
+0x050 Reserved2 : (null)
+0x058 Address : (null)
+0x058 CreatorBackTraceIndex : 0
+0x060 SpinLock : 0
The SharedWaiters field is actually a pointer to another structure called _KWAIT_CHAIN which is a linked list of all the threads which are waiting to acquire this lock.
Rich (BB code):
0: kd> dt _KWAIT_CHAIN 0xffffa187`549414a0
nt!_KWAIT_CHAIN
+0x000 Head : 0xffffa187`549414a0 Void
Rich (BB code):
0: kd> dl 0xffffa187`549414a0
ffffa187`549414a0 ffffa187`549414a0 ffffa187`549414a0
ffffa187`549414b0 ffffc88d`43595040 00000000`00060001
The OwnerEntry field is a pointer to the named structure and contains information who owns the lock.
Rich (BB code):
0: kd> dt _OWNER_ENTRY ffffc88d4398ec90+30
FLTMGR!_OWNER_ENTRY
+0x000 OwnerThread : 0xffffc88d`43cf3040
+0x008 IoPriorityBoosted : 0y0
+0x008 OwnerReferenced : 0y0
+0x008 IoQoSPriorityBoosted : 0y1
+0x008 OwnerCount : 0y00000000000000000000000000001 (0x1)
+0x008 TableSize : 0xc
Now, if we switch to the owner thread to try and understand why it may have not released the lock in time, we can see that the call stack has multiple calls to our problematic driver.
Rich (BB code):
0: kd> knL
*** Stack trace for last set context - .thread/.cxr resets it
# Child-SP RetAddr Call Site
00 ffffa187`54622330 fffff801`a82eef4c igdkmd64+0x1119ad
01 ffffa187`54622360 fffff801`a8264e83 igdkmd64+0xbef4c
02 ffffa187`546223e0 fffff801`a825e59b igdkmd64+0x34e83
03 ffffa187`54622460 fffff801`a85e15a9 igdkmd64+0x2e59b
04 ffffa187`54622810 fffff801`89cd3622 igdkmd64+0x3b15a9
05 ffffa187`54622850 fffff801`89cd2d02 dxgkrnl!DpiDxgkDdiStartDevice+0x6a
06 ffffa187`546228b0 fffff801`89cd3b38 dxgkrnl!DpiFdoStartAdapter+0x58e
07 ffffa187`54622a30 fffff801`89ce8c40 dxgkrnl!DpiFdoStartAdapterThreadImpl+0x308
08 ffffa187`54622be0 fffff801`83d265f5 dxgkrnl!DpiFdoStartAdapterThread+0x30
09 ffffa187`54622c10 fffff801`83e048d8 nt!PspSystemThreadStartup+0x55
0a ffffa187`54622c60 00000000`00000000 nt!KiStartSystemThread+0x28
If we go back to the call stack, you can find the same information as the !pnptriage command:
Rich (BB code):
07 ffffa18754941540 fffff80189ce9006 dxgkrnl!DpiFdoHandleQueryDeviceRelations+f3
Parameter[0] = ffffc88d43e8c030 << Device Object
Parameter[1] = ffffc88d43613a40 << IRP (I/O Request)
Parameter[2] = (unknown)
Parameter[3] = (unknown)
Rich (BB code):
0: kd> !devstack ffffc88d43e8c030
!DevObj !DrvObj !DevExt ObjectName
> ffffc88d43e8c030 \Driver\igfx ffffc88d43e8c180
ffffc88d3c5f1cd0 \Driver\ACPI ffffc88d3a5b8bf0
ffffc88d3c58d360 \Driver\pci ffffc88d3c58d4b0 NTPNP_PCI0002
!DevNode ffffc88d3c587c50 :
DeviceInst is "PCI\VEN_8086&DEV_0162&SUBSYS_84CA1043&REV_09\3&11583659&0&10"
ServiceName is "igfx"
Rich (BB code):
0: kd> !DevNode ffffc88d3c587c50
DevNode 0xffffc88d3c587c50 for PDO 0xffffc88d3c58d360
Parent 0xffffc88d3a611ae0 Sibling 0xffffc88d3a5bf020 Child 0000000000
InstancePath is "PCI\VEN_8086&DEV_0162&SUBSYS_84CA1043&REV_09\3&11583659&0&10"
ServiceName is "igfx"
State = DeviceNodeEnumeratePending (0x30c)
Previous State = DeviceNodeStarted (0x308)
StateHistory[11] = DeviceNodeStarted (0x308)
StateHistory[10] = DeviceNodeStartPostWork (0x307)
StateHistory[09] = DeviceNodeStartCompletion (0x306)
StateHistory[08] = DeviceNodeStartPending (0x305)
StateHistory[07] = DeviceNodeResourcesAssigned (0x304)
StateHistory[06] = DeviceNodeDriversAdded (0x303)
StateHistory[05] = DeviceNodeInitialized (0x302)
StateHistory[04] = DeviceNodeUninitialized (0x301)
StateHistory[03] = DeviceNodeInitialized (0x302)
StateHistory[02] = DeviceNodeUninitialized (0x301)
StateHistory[01] = DeviceNodeInitialized (0x302)
StateHistory[00] = DeviceNodeUninitialized (0x301)
StateHistory[19] = Unknown State (0x0)
StateHistory[18] = Unknown State (0x0)
StateHistory[17] = Unknown State (0x0)
StateHistory[16] = Unknown State (0x0)
StateHistory[15] = Unknown State (0x0)
StateHistory[14] = Unknown State (0x0)
StateHistory[13] = Unknown State (0x0)
StateHistory[12] = Unknown State (0x0)
Flags (0x6c000030) DNF_ENUMERATED, DNF_IDS_QUERIED,
DNF_NO_LOWER_DEVICE_FILTERS, DNF_NO_LOWER_CLASS_FILTERS,
DNF_NO_UPPER_DEVICE_FILTERS, DNF_NO_UPPER_CLASS_FILTERS
Notice how the device enumeration request still hasn't completed and is pending?
Rich (BB code):
0: kd> !irp ffffc88d43613a40
Irp is active with 3 stacks 3 is current (= 0xffffc88d43613ba0)
No Mdl: No System Buffer: Thread ffffc88d43595040: Irp stack trace.
cmd flg cl Device File Completion-Context
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_QUERY_DEVICE_RELATIONS(7)]
0 e0 ffffc88d43e8c030 00000000 fffff80183d6d250-ffffc88d41165360 Success Error Cancel
\Driver\igfx nt!PnpDeviceCompletionRoutine
Args: 00000000 00000000 00000000 00000000
And to prove that is a bus enumeration request as previously mentioned, you can check the Parameters field of the IRP:
Rich (BB code):
0: kd> dx -r1 (*((FLTMGR!_IO_STACK_LOCATION *)0xffffc88d43613ba0)).Parameters.QueryDeviceRelations
(*((FLTMGR!_IO_STACK_LOCATION *)0xffffc88d43613ba0)).Parameters.QueryDeviceRelations [Type: <anonymous-tag>]
[+0x000] Type : BusRelations (0) [Type: _DEVICE_RELATION_TYPE]
Just to recap, our device driver receives a bus enumeration request, it then attempts to acquire a lock, however, the lock is exclusively owned by another thread which is also running our driver code, so our IRP processing thread is then placed into a wait state, which in turn eventually leads to the watchdog discovering that the request hasn't completed within the time limit and therefore bugchecks the system.
References:
IRP_MN_QUERY_DEVICE_RELATIONS - Windows drivers
https://geoffchappell.com/studies/windows/km/ntoskrnl/api/ex/resource/eresource.htm
Last edited: