Rich (BB code):
VIDEO_DXGKRNL_FATAL_ERROR (113)
The dxgkrnl has detected that a violation has occurred. This resulted
in a condition that dxgkrnl can no longer progress. By crashing, dxgkrnl
is attempting to get enough information into the minidump such that somebody
can pinpoint the crash cause. Any other values after parameter 1 must be
individually examined according to the subtype.
Arguments:
Arg1: 0000000000000019, The subtype of the bugcheck:
Arg2: 0000000000000001
Arg3: 00000000000010de << Vendor Id
Arg4: 0000000000001e84 << Device Id
Along with Stop 0x116, Stop 0x117 and Stop 0x119, this is another video-related bugcheck and probably one of the rarest among them. I've only seen it on a handful of occasions, and the dumps usually don't provide much information. I was curious about the meaning of the bugcheck parameters, though, and while searching for clues I managed to find a possible reason why the bugcheck was caused.
Let's begin with the bugcheck parameters, since that is where I usually start when debugging a crash. The first two parameters are unknown to me and I still haven't been able to find out what they mean. Fortunately, I have been able to find the meaning of the other two parameters: the vendor ID and device ID respectively. Together these values identify a device model and can be looked up in an online PCI ID database to find which device they belong to.
The raw stack also contains the following string which corresponds to the values seen in the parameters.
Rich (BB code):
PCI\VEN_10DE&DEV_1E84&SUBSYS_C7261462&REV_A1\4&2f503515&0&0019
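As a rough illustration of how the fields in that string line up with Arg3 and Arg4, here is a small Python sketch which pulls the IDs out of the hardware ID with a regular expression. The function name `parse_pci_hardware_id` is made up for this post; only the string and the bugcheck arguments come from the dump.

```python
import re

# Hardware ID string found in the raw stack of the dump
HARDWARE_ID = r"PCI\VEN_10DE&DEV_1E84&SUBSYS_C7261462&REV_A1\4&2f503515&0&0019"


def parse_pci_hardware_id(hardware_id):
    """Return the VEN/DEV/SUBSYS/REV fields of a PCI hardware ID as integers.

    Hypothetical helper written for this example, not part of any tool.
    """
    fields = {}
    for key, value in re.findall(r"(VEN|DEV|SUBSYS|REV)_([0-9A-Fa-f]+)", hardware_id):
        fields[key] = int(value, 16)
    return fields


ids = parse_pci_hardware_id(HARDWARE_ID)

# Arg3 and Arg4 as shown in the bugcheck parameters
arg3 = 0x00000000000010DE
arg4 = 0x0000000000001E84

print(f"vendor {ids['VEN']:04x} matches Arg3: {ids['VEN'] == arg3}")
print(f"device {ids['DEV']:04x} matches Arg4: {ids['DEV'] == arg4}")
```

Vendor ID 10DE is registered to NVIDIA, which agrees with the nvlddmkm driver seen in the dump.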
Rich (BB code):
15: kd> !load cmkd; !stack -p
Call Stack : 7 frames
## Stack-Pointer Return-Address Call-Site
00 ffff880e514f43d8 fffff8048cc241a0 nt!KeBugCheckEx+0
Parameter[0] = 0000000000000113
Parameter[1] = 0000000000000019
Parameter[2] = 0000000000000001
Parameter[3] = 00000000000010de
01 ffff880e514f43e0 fffff8048cb41427 watchdog!WdLogEvent5_WdCriticalError+e0
Parameter[0] = ffffa381e4df2ee0
Parameter[1] = (unknown)
Parameter[2] = (unknown)
Parameter[3] = (unknown)
02 ffff880e514f4420 fffff8048ca1dfa8 dxgkrnl!DpiFdoHandleSurpriseRemoval+167
Parameter[0] = ffffa381e5441030 << Device Object
Parameter[1] = ffffa381ea99fda0 << IRP Address
Parameter[2] = (unknown)
Parameter[3] = (unknown)
03 ffff880e514f4460 fffff8048c978029 dxgkrnl!DpiFdoDispatchPnp+b8
Parameter[0] = ffffa381e5441030
Parameter[1] = ffffa381ea99fda0
Parameter[2] = (unknown)
Parameter[3] = (unknown)
04 ffff880e514f4510 fffff8048e7565e4 dxgkrnl!DpiDispatchPnp+e9
Parameter[0] = ffffa381e5441030
Parameter[1] = ffffa381ea99fda0
Parameter[2] = (unknown)
Parameter[3] = (unknown)
05 ffff880e514f4630 ffffa381e5666000 nvlddmkm+d65e4 (leaf)
Parameter[0] = (unknown)
Parameter[1] = (unknown)
Parameter[2] = (unknown)
Parameter[3] = (unknown)
The call stack above shows what was happening just before the crash. A third-party driver is present, the graphics card driver nvlddmkm, which fits the context of the bugcheck description. However, that alone doesn't tell us why the crash occurred, which is what matters when diagnosing an issue. Stack frames 4 and 3 are related to dispatching a PnP IRP, and stack frame 2 indicates that the IRP was a surprise removal IRP, which is one kind of PnP IRP. If we look closely, we can see that the surprise removal handler takes two parameters: a device object and an IRP.
If we dump the device object, we can see that it is related to the graphics card and the driver which was mentioned in the call stack.
Rich (BB code):
15: kd> !devobj ffffa381e5441030
Device object (ffffa381e5441030) is for:
\Driver\nvlddmkm DriverObject ffffa381e49ceba0
Current Irp 00000000 RefCount 0 Type 00000023 Flags 00002004
SecurityDescriptor ffffc60f81a9ed60 DevExt ffffa381e5441180 DevObjExt ffffa381e5442808
ExtensionFlags (0000000000)
Characteristics (0x00000100) FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) ffffa381df8ee6e0 Name paged out
Let's dump the IRP stack and examine who is currently processing it.
Rich (BB code):
15: kd> !irp ffffa381ea99fda0
Irp is active with 3 stacks 3 is current (= 0xffffa381ea99ff00)
No Mdl: No System Buffer: Thread ffffa381e4b770c0: Irp stack trace.
cmd flg cl Device File Completion-Context
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_SURPRISE_REMOVAL(17)]
0 0 ffffa381e5441030 00000000 00000000-00000000
\Driver\nvlddmkm
Args: 00000000 00000000 00000000 00000000
Notice that it is the same device object which was passed as a parameter earlier? We can clearly see that the IRP is the surprise removal IRP seen in the call stack. This IRP can be issued for a number of reasons, including the device being suddenly removed from the system.
Next, let's examine the IRP dispatch table for the driver and see what we can find. The IRP dispatch table contains the dispatch routines which are called when a device receives an IRP of a particular major function type. In our case, we're interested in the PnP IRP.
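To make the reading of the `!irp` output a little more concrete, the sketch below picks out the current stack location (the line marked with `>`) and decodes its major and minor function codes. The helper name `current_stack_location` and the abbreviated sample text are my own; the two numeric constants are the values defined in wdm.h.

```python
import re

# Abbreviated !irp output from the dump; ">" marks the current stack location
IRP_OUTPUT = """\
 [N/A(0), N/A(0)]
 [N/A(0), N/A(0)]
>[IRP_MJ_PNP(1b), IRP_MN_SURPRISE_REMOVAL(17)]
"""

# Values defined in wdm.h
IRP_MJ_PNP = 0x1B
IRP_MN_SURPRISE_REMOVAL = 0x17

# Matches "[MAJOR_NAME(hex), MINOR_NAME(hex)]"
LOCATION = re.compile(r"\[([^(\[]+)\(([0-9a-fA-F]+)\),\s*([^(]+)\(([0-9a-fA-F]+)\)\]")


def current_stack_location(irp_output):
    """Return (major name, major code, minor name, minor code) for the ">" line.

    Hypothetical helper for this example; returns None if no line is marked.
    """
    for line in irp_output.splitlines():
        if line.startswith(">"):
            match = LOCATION.search(line)
            if match:
                major_name, major, minor_name, minor = match.groups()
                return major_name, int(major, 16), minor_name, int(minor, 16)
    return None


location = current_stack_location(IRP_OUTPUT)
major_name, major, minor_name, minor = location
is_surprise_removal = major == IRP_MJ_PNP and minor == IRP_MN_SURPRISE_REMOVAL
print(major_name, minor_name, is_surprise_removal)
```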
Rich (BB code):
15: kd> !drvobj ffffa381e49ceba0 7
fffff8047fe47b58: Unable to get value of ObpRootDirectoryObject
fffff8047fe47b58: Unable to get value of ObpRootDirectoryObject
Driver object (ffffa381e49ceba0) is for:
\Driver\nvlddmkm
Driver Extension List: (id , addr)
Couldn't read extension at 0xffffa381e5426570
Device Object list:
ffffa381e8195b20 ffffa381e8194b20: Could not read device object
DriverEntry: fffff8048fd191b8 nvlddmkm
DriverStartIo: 00000000
DriverUnload: fffff8048e758960 nvlddmkm
AddDevice: fffff8048e755ba8 nvlddmkm
Dispatch routines:
[00] IRP_MJ_CREATE fffff8048e755fb8 nvlddmkm+0xd5fb8
[01] IRP_MJ_CREATE_NAMED_PIPE fffff8048e755fb8 nvlddmkm+0xd5fb8
[02] IRP_MJ_CLOSE fffff8048e755fb8 nvlddmkm+0xd5fb8
[03] IRP_MJ_READ fffff8048e755fb8 nvlddmkm+0xd5fb8
[04] IRP_MJ_WRITE fffff8048e755fb8 nvlddmkm+0xd5fb8
[05] IRP_MJ_QUERY_INFORMATION fffff8048e755fb8 nvlddmkm+0xd5fb8
[06] IRP_MJ_SET_INFORMATION fffff8048e755fb8 nvlddmkm+0xd5fb8
[07] IRP_MJ_QUERY_EA fffff8048e755fb8 nvlddmkm+0xd5fb8
[08] IRP_MJ_SET_EA fffff8048e755fb8 nvlddmkm+0xd5fb8
[09] IRP_MJ_FLUSH_BUFFERS fffff8048e755fb8 nvlddmkm+0xd5fb8
[0a] IRP_MJ_QUERY_VOLUME_INFORMATION fffff8048e755fb8 nvlddmkm+0xd5fb8
[0b] IRP_MJ_SET_VOLUME_INFORMATION fffff8048e755fb8 nvlddmkm+0xd5fb8
[0c] IRP_MJ_DIRECTORY_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[0d] IRP_MJ_FILE_SYSTEM_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[0e] IRP_MJ_DEVICE_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[0f] IRP_MJ_INTERNAL_DEVICE_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[10] IRP_MJ_SHUTDOWN fffff8048e755fb8 nvlddmkm+0xd5fb8
[11] IRP_MJ_LOCK_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[12] IRP_MJ_CLEANUP 00000000
[13] IRP_MJ_CREATE_MAILSLOT 00000000
[14] IRP_MJ_QUERY_SECURITY 00000000
[15] IRP_MJ_SET_SECURITY 00000000
[16] IRP_MJ_POWER 00000000
[17] IRP_MJ_SYSTEM_CONTROL 00000000
[18] IRP_MJ_DEVICE_CHANGE 00000000
[19] IRP_MJ_QUERY_QUOTA 00000000
[1a] IRP_MJ_SET_QUOTA 00000000
[1b] IRP_MJ_PNP 00000000 << No dispatch routine to handle the IRP
Device Object stacks:
!devstack ffffa381e8195b20 :
!DevObj !DrvObj !DevExt ObjectName
> ffffa381e8195b20 ffffa381e8195b20: Could not read device object or _DEVICE_OBJECT not found
ffffa381e8195c70 InfoMask field not found for _OBJECT_HEADER at ffffa381e8195af0
Could not read DeviceObjectExtension from DeviceObject 0xffffa381e8195b20
ffffa381e8194b20: Could not read device object
Error processing device objects. Processed 1 device objects before error.
Notice that there is no handler routine for the PnP IRP? The driver doesn't know how to handle this I/O request, which in turn leads to our bugcheck. You may think this is rather odd: why doesn't the driver have a dispatch routine for every type of IRP? However, it is perfectly normal and reasonable for a driver not to process every type of IRP, especially in the context of this driver. The surprise removal IRP is designed for PnP devices which may be removed unexpectedly at any time by the user. Take a USB keyboard or mouse as an example: I may unplug my keyboard at a moment's notice, whereas it is very unlikely that I would remove my graphics card while the system is still running.
With what we've learned from the dump file and what the user has mentioned:
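If you would rather not scan the dispatch table by eye, something along these lines lists the major function codes whose entry is printed as zero. It is only a sketch: `entries_without_routine` is a name I made up, the sample is a few lines copied from the `!drvobj` output, and the result only reflects what the debugger printed for this dump.

```python
import re

# A few lines copied from the !drvobj dispatch table in the dump
DISPATCH_TABLE = """\
[0e] IRP_MJ_DEVICE_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[11] IRP_MJ_LOCK_CONTROL fffff8048e755fb8 nvlddmkm+0xd5fb8
[12] IRP_MJ_CLEANUP 00000000
[16] IRP_MJ_POWER 00000000
[1b] IRP_MJ_PNP 00000000
"""

# Matches "[index] IRP_MJ_NAME address" at the start of a line
ENTRY = re.compile(r"^\[([0-9a-f]{2})\]\s+(IRP_MJ_\w+)\s+([0-9a-f]+)", re.MULTILINE)


def entries_without_routine(table_text):
    """Return (index, name) pairs for entries whose printed address is zero.

    Hypothetical helper for this example, not a debugger feature.
    """
    missing = []
    for index, name, address in ENTRY.findall(table_text):
        if int(address, 16) == 0:
            missing.append((int(index, 16), name))
    return missing


missing = entries_without_routine(DISPATCH_TABLE)
for index, name in missing:
    print(f"[{index:02x}] {name} has no dispatch routine listed")
```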
Also I noticed that alot of times it crashed it was after I bumped my desk or something. Then my screen would go black, fans start running at 100% and nothing happens until I manually reboot.
I believe that a loose connection could in fact be causing the crash. A bump could quite easily cause the graphics card to become momentarily disconnected, which would cause the PnP manager to issue a surprise removal request once it notices that the graphics card is no longer present in the device tree. This is just an assumption at the moment and the thread is still active.
If you're interested in following the development of this bugcheck, the thread is available here - Frequent crashes
References:
Handling an IRP_MN_SURPRISE_REMOVAL Request - Windows drivers
Writing IRP Dispatch Routines - Windows drivers