Debugging Stop 0x1D5 - DRIVER_PNP_WATCHDOG

x BlueRobot

Administrator
Staff member
Joined
May 7, 2013
Posts
10,400
Rich (BB code):
DRIVER_PNP_WATCHDOG (1d5)
A driver has failed to complete a Pnp operation within a specific time.
Arguments:
Arg1: 0078006600670069, First few character of the service associated with the devnode.
Arg2: fffff80189897750, Pointer to the nt!TRIAGE_PNP_WATCHDOG on Win10 RS4 and higher.
Arg3: ffffc88d43595040, Thread responsible for the Pnp Watchdog.
Arg4: 0000000000057e4f, Milliseconds elapsed since the watchdog was armed.

The description of the bugcheck is straightforward, a thread has got stuck and hasn't completed a PnP I/O request within the specified time limit, which is what the PnP watchdog is responsible for monitoring. The first parameter doesn't actually contain any useful information but you would usually use the dc command with this pointer. The most useful parameter is actually the third parameter which contains the address of the stuck thread.

Now, there is actually two ways which you can debug this crash: you can either go through the call stack of the hung thread or use !pnptriage and then for PnP requests. We'll do the latter and then confirm this finding by using the former method.

Rich (BB code):
0: kd> !pnptriage

********************************************************************************
Dumping PnP DeviceAction Queue @ 0xfffff80184643400
********************************************************************************

Dumping nt!_PNP_DEVICE_ACTION_ENTRY @ 0xffffc88d43fcade0
    Request to ReenumerateDeviceOnly !devstack ffffc88d3a526de0
    Argument - 00000000
    CompletionEvent - 0x00000000

Dumped 1 entries
PnP DeviceActionWorker active!


********************************************************************************
Dumping PnP DeviceEvent Queue @ 0xffffc88d3a49d190
********************************************************************************

List = 0xffffc88d3a49d208, 0xffffc88d3a49d208
Event list is empty

********************************************************************************
Dumping PnP DeviceCompletion Queue @ 0xfffff80184644980
********************************************************************************

1 Pnp operation(s) dispatched (IRP pending) currently.

Dumping pending asynchronous request list...
Dumping pended asynchronous request...
!devnode ffffc88d3c587c50
CompletionState = DeviceNodeEnumerateCompletion (0x30d)
        Pending !irp ffffc88d43613a40

Dumping completed asynchronous request list...
[...]

We can see that there is a pending I/O request which is related to a PnP operation. This matches what mentioned in the bugcheck description, if we examine the request using the suggested command then we can find the following:

Rich (BB code):
0: kd> !irp ffffc88d43613a40
Irp is active with 3 stacks 3 is current (= 0xffffc88d43613ba0)
 No Mdl: No System Buffer: Thread ffffc88d43595040:  Irp stack trace. 
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000   

            Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000   

            Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_QUERY_DEVICE_RELATIONS(7)]
            0 e0 ffffc88d43e8c030 00000000 fffff80183d6d250-ffffc88d41165360 Success Error Cancel
          Unable to load image \SystemRoot\system32\DRIVERS\igdkmd64.sys, Win32 error 0n2
 \Driver\igfx    nt!PnpDeviceCompletionRoutine
            Args: 00000000 00000000 00000000 00000000

It looks like a device relations request wasn't completed in the appropriate time by the Intel graphics card driver. However, you may be wondering why there is seemingly two different drivers shown in the output? Well, I wondered the same and at first thought it could have been a separate but related driver, although, it seems that the driver is just an alias for the same driver shown.

Rich (BB code):
sc qc igfx
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: igfx
        TYPE               : 1  KERNEL_DRIVER
        START_TYPE         : 3   DEMAND_START
        ERROR_CONTROL      : 0   IGNORE
        BINARY_PATH_NAME   : \SystemRoot\system32\DRIVERS\igdkmd64.sys
        LOAD_ORDER_GROUP   : Video
        TAG                : 6
        DISPLAY_NAME       : igfx
        DEPENDENCIES       :
        SERVICE_START_NAME :

Okay, so we've identified the faulty driver from the !pnptriage but let's investigate a little further and understand exactly why the thread became hung to begin with.

Rich (BB code):
0: kd> .thread ffffc88d43595040
Implicit thread is now ffffc88d`43595040

Rich (BB code):
0: kd> knL
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr               Call Site
00 ffffa187`54941040 fffff801`83c1c9c0     nt!KiSwapContext+0x76
01 ffffa187`54941180 fffff801`83c1beef     nt!KiSwapThread+0x500
02 ffffa187`54941230 fffff801`83c1b793     nt!KiCommitThreadWait+0x14f (3)
03 ffffa187`549412d0 fffff801`83c17fdd     nt!KeWaitForSingleObject+0x233
04 ffffa187`549413c0 fffff801`83c2260a     nt!ExpWaitForResource+0x6d
05 ffffa187`54941440 fffff801`83c22074     nt!ExpAcquireResourceSharedLite+0x4da
06 ffffa187`54941500 fffff801`89cc4833     nt!ExAcquireResourceSharedLite+0x44
07 ffffa187`54941540 fffff801`89ce9006     dxgkrnl!DpiFdoHandleQueryDeviceRelations+0xf3 (2)
08 ffffa187`54941600 fffff801`89cac0f9     dxgkrnl!DpiFdoDispatchPnp+0xd6
09 ffffa187`549416c0 fffff801`83c11385     dxgkrnl!DpiDispatchPnp+0xe9
0a ffffa187`549417e0 fffff801`84165b4e     nt!IofCallDriver+0x55
0b ffffa187`54941820 fffff801`83c045ee     nt!PnpAsynchronousCall+0xea
0c ffffa187`54941860 fffff801`84137d29     nt!PnpSendIrp+0x9e
0d ffffa187`549418d0 fffff801`84137c90     nt!PnpQueryDeviceRelations+0x51
0e ffffa187`54941960 fffff801`84135ce7     nt!PipEnumerateDevice+0xc8 (1)
0f ffffa187`54941990 fffff801`8412fcc6     nt!PipProcessDevNodeTree+0x297
10 ffffa187`54941a60 fffff801`83d6efba     nt!PiRestartDevice+0xba
11 ffffa187`54941ab0 fffff801`83c8e5c5     nt!PnpDeviceActionWorker+0x46a
12 ffffa187`54941b70 fffff801`83d265f5     nt!ExpWorkerThread+0x105
13 ffffa187`54941c10 fffff801`83e048d8     nt!PspSystemThreadStartup+0x55
14 ffffa187`54941c60 00000000`00000000     nt!KiStartSystemThread+0x28

Let's start from the beginning of the call stack, a bus enumeration request is created by the PnP Manager (1) which in turns lead to a PnP-based IRP being created as shown earlier. Our device receives the request (2) and begins processing it, however, in order to process the request further it attempts to acquire a lock; this is where the issue starts, the lock is currently being held by another thread, this thread doesn't release the lock in time and the PnP watchdog eventually pulls down the system. Remember, the PnP watchdog is essentially just a thread attached to a timer, which when the timer has expired, will bugcheck the system.

Rich (BB code):
05 ffffa18754941440 fffff80183c22074 nt!ExpAcquireResourceSharedLite+4da 
    Parameter[0] = ffffc88d4398ec90
    Parameter[1] = 0000000000000001
    Parameter[2] = (unknown)       
    Parameter[3] = (unknown)

The ExAcquireResourceSharedLite is responsible for acquiring the lock mentioned earlier, in this case, it is an ERESOURCE object which can dump using the !locks command.

Rich (BB code):
0: kd> !locks ffffc88d4398ec90

Resource @ 0xffffc88d4398ec90    Exclusively owned
    Contention Count = 1
    NumberOfSharedWaiters = 1
     Threads: ffffc88d43cf3040-01<*> ffffc88d43595040-01    
1 total locks

Now, the function will attempt to acquire an ERESOURCE lock with shared access, that is to say, acquire this lock while another thread also has acquisition of it. The caveat here is that the thread will only additionally own the lock if the lock isn't exclusively owned. In our case, it is and therefore the thread is placed in a wait state.

  • If the resource is currently owned as exclusive by another thread or if there is another thread waiting for exclusive access and the caller does not already have shared access to the resource, the current thread either is put into a wait state (Wait set to TRUE) or ExAcquireResourceSharedLite returns FALSE.
Source: ExAcquireResourceSharedLite function (wdm.h) - Windows drivers

The second parameter indicates that the thread should be put in a wait state if the lock cannot be immediately acquired. To understand where !locks is getting it's information from, you can dump the ERESOURCE object using:

Rich (BB code):
0: kd> dt _ERESOURCE ffffc88d4398ec90
FLTMGR!_ERESOURCE
   +0x000 SystemResourcesList : _LIST_ENTRY [ 0xffffc88d`44048dc0 - 0xffffc88d`43e8d050 ]
   +0x010 OwnerTable       : 0xffffc88d`432206d0 _OWNER_ENTRY
   +0x018 ActiveCount      : 0n1
   +0x01a Flag             : 0xc86
   +0x01a ReservedLowFlags : 0x86 ''
   +0x01b WaiterPriority   : 0xc ''
   +0x020 SharedWaiters    : 0xffffa187`549414a0 Void
   +0x028 ExclusiveWaiters : (null) 
   +0x030 OwnerEntry       : _OWNER_ENTRY
   +0x040 ActiveEntries    : 1
   +0x044 ContentionCount  : 1
   +0x048 NumberOfSharedWaiters : 1
   +0x04c NumberOfExclusiveWaiters : 0
   +0x050 Reserved2        : (null) 
   +0x058 Address          : (null) 
   +0x058 CreatorBackTraceIndex : 0
   +0x060 SpinLock         : 0

The SharedWaiters field is actually a pointer to another structure called _KWAIT_CHAIN which is a linked list of all the threads which are waiting to acquire this lock.

Rich (BB code):
0: kd> dt _KWAIT_CHAIN 0xffffa187`549414a0
nt!_KWAIT_CHAIN
   +0x000 Head             : 0xffffa187`549414a0 Void

Rich (BB code):
0: kd> dl 0xffffa187`549414a0
ffffa187`549414a0  ffffa187`549414a0 ffffa187`549414a0
ffffa187`549414b0  ffffc88d`43595040 00000000`00060001

The OwnerEntry field is a pointer to the named structure and contains information who owns the lock.

Rich (BB code):
0: kd> dt _OWNER_ENTRY ffffc88d4398ec90+30
FLTMGR!_OWNER_ENTRY
   +0x000 OwnerThread      : 0xffffc88d`43cf3040
   +0x008 IoPriorityBoosted : 0y0
   +0x008 OwnerReferenced  : 0y0
   +0x008 IoQoSPriorityBoosted : 0y1
   +0x008 OwnerCount       : 0y00000000000000000000000000001 (0x1)
   +0x008 TableSize        : 0xc

Now, if we switch to the owner thread to try and understand why it may have not released the lock in time, we can see that the call stack has multiple calls to our problematic driver.

Rich (BB code):
0: kd> knL
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr               Call Site
00 ffffa187`54622330 fffff801`a82eef4c     igdkmd64+0x1119ad
01 ffffa187`54622360 fffff801`a8264e83     igdkmd64+0xbef4c
02 ffffa187`546223e0 fffff801`a825e59b     igdkmd64+0x34e83
03 ffffa187`54622460 fffff801`a85e15a9     igdkmd64+0x2e59b
04 ffffa187`54622810 fffff801`89cd3622     igdkmd64+0x3b15a9
05 ffffa187`54622850 fffff801`89cd2d02     dxgkrnl!DpiDxgkDdiStartDevice+0x6a
06 ffffa187`546228b0 fffff801`89cd3b38     dxgkrnl!DpiFdoStartAdapter+0x58e
07 ffffa187`54622a30 fffff801`89ce8c40     dxgkrnl!DpiFdoStartAdapterThreadImpl+0x308
08 ffffa187`54622be0 fffff801`83d265f5     dxgkrnl!DpiFdoStartAdapterThread+0x30
09 ffffa187`54622c10 fffff801`83e048d8     nt!PspSystemThreadStartup+0x55
0a ffffa187`54622c60 00000000`00000000     nt!KiStartSystemThread+0x28

If we go back to the call stack, you can find the same information as the !pnptriage command:

Rich (BB code):
07 ffffa18754941540 fffff80189ce9006 dxgkrnl!DpiFdoHandleQueryDeviceRelations+f3 
    Parameter[0] = ffffc88d43e8c030 << Device Object
    Parameter[1] = ffffc88d43613a40 << IRP (I/O Request)
    Parameter[2] = (unknown)       
    Parameter[3] = (unknown)

Rich (BB code):
0: kd> !devstack ffffc88d43e8c030
  !DevObj           !DrvObj            !DevExt           ObjectName
> ffffc88d43e8c030  \Driver\igfx       ffffc88d43e8c180  
  ffffc88d3c5f1cd0  \Driver\ACPI       ffffc88d3a5b8bf0  
  ffffc88d3c58d360  \Driver\pci        ffffc88d3c58d4b0  NTPNP_PCI0002
!DevNode ffffc88d3c587c50 :
  DeviceInst is "PCI\VEN_8086&DEV_0162&SUBSYS_84CA1043&REV_09\3&11583659&0&10"
  ServiceName is "igfx"

Rich (BB code):
0: kd> !DevNode ffffc88d3c587c50
DevNode 0xffffc88d3c587c50 for PDO 0xffffc88d3c58d360
  Parent 0xffffc88d3a611ae0   Sibling 0xffffc88d3a5bf020   Child 0000000000   
  InstancePath is "PCI\VEN_8086&DEV_0162&SUBSYS_84CA1043&REV_09\3&11583659&0&10"
  ServiceName is "igfx"
  State = DeviceNodeEnumeratePending (0x30c)
  Previous State = DeviceNodeStarted (0x308)
  StateHistory[11] = DeviceNodeStarted (0x308)
  StateHistory[10] = DeviceNodeStartPostWork (0x307)
  StateHistory[09] = DeviceNodeStartCompletion (0x306)
  StateHistory[08] = DeviceNodeStartPending (0x305)
  StateHistory[07] = DeviceNodeResourcesAssigned (0x304)
  StateHistory[06] = DeviceNodeDriversAdded (0x303)
  StateHistory[05] = DeviceNodeInitialized (0x302)
  StateHistory[04] = DeviceNodeUninitialized (0x301)
  StateHistory[03] = DeviceNodeInitialized (0x302)
  StateHistory[02] = DeviceNodeUninitialized (0x301)
  StateHistory[01] = DeviceNodeInitialized (0x302)
  StateHistory[00] = DeviceNodeUninitialized (0x301)
  StateHistory[19] = Unknown State (0x0)
  StateHistory[18] = Unknown State (0x0)
  StateHistory[17] = Unknown State (0x0)
  StateHistory[16] = Unknown State (0x0)
  StateHistory[15] = Unknown State (0x0)
  StateHistory[14] = Unknown State (0x0)
  StateHistory[13] = Unknown State (0x0)
  StateHistory[12] = Unknown State (0x0)
  Flags (0x6c000030)  DNF_ENUMERATED, DNF_IDS_QUERIED, 
                      DNF_NO_LOWER_DEVICE_FILTERS, DNF_NO_LOWER_CLASS_FILTERS, 
                      DNF_NO_UPPER_DEVICE_FILTERS, DNF_NO_UPPER_CLASS_FILTERS

Notice how the device enumeration request still hasn't completed and is pending?

Rich (BB code):
0: kd> !irp ffffc88d43613a40
Irp is active with 3 stacks 3 is current (= 0xffffc88d43613ba0)
 No Mdl: No System Buffer: Thread ffffc88d43595040:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    

            Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    

            Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_QUERY_DEVICE_RELATIONS(7)]
            0 e0 ffffc88d43e8c030 00000000 fffff80183d6d250-ffffc88d41165360 Success Error Cancel 
           \Driver\igfx    nt!PnpDeviceCompletionRoutine
            Args: 00000000 00000000 00000000 00000000

And to prove that is a bus enumeration request as previously mentioned, you can check the Parameters field of the IRP:

Rich (BB code):
0: kd> dx -r1 (*((FLTMGR!_IO_STACK_LOCATION *)0xffffc88d43613ba0)).Parameters.QueryDeviceRelations
(*((FLTMGR!_IO_STACK_LOCATION *)0xffffc88d43613ba0)).Parameters.QueryDeviceRelations                 [Type: <anonymous-tag>]
    [+0x000] Type             : BusRelations (0) [Type: _DEVICE_RELATION_TYPE]

Just to recap, our device driver receives a bus enumeration request, it then attempts to acquire a lock, however, the lock is exclusively owned by another thread which is also running our driver code, so our IRP processing thread is then placed into a wait state, which in turn eventually leads to the watchdog discovering that the request hasn't completed within the time limit and therefore bugchecks the system.

References:

IRP_MN_QUERY_DEVICE_RELATIONS - Windows drivers
https://geoffchappell.com/studies/windows/km/ntoskrnl/api/ex/resource/eresource.htm
 
Last edited:

Has Sysnative Forums helped you? Please consider donating to help us support the site!

Back
Top