Vir Gnarus
BSOD Kernel Dump Expert
- Mar 2, 2012
- 474
Howdy all, just a short tip here that I use frequently to get more nitty gritty on what was going on during a crash. For any crashdump where the last running thread is relevant to the cause, sometimes you may not get any important clues on a potential suspect at first glance. Take this !analyze -v output for instance (crashdump attached to this thread):
This occurred when the client turned on DV (Driver Verifier) and had his computer quickly go into reboot loop after having done so. Yet despite the analysis engine's best intentions, it came up with cdrom.sys as the culprit, which is probably not the cause. Nothing in the running thread's callstack unwind showed any other drivers of suspicion, and unfortunately for us, while we can run the address of the pending IRP through !irp extension, we get no valuable details from this minidump (like the device object). So far this seems like we're stuck.
But wait! We have one trick we can pull, and that's dump the raw stack for this thread. The stack unwind information shows us what the analysis engine was able to produce as being most likely the callstack for this thread. However, there may be references to other functions and whatnot that are not present in this stack unwind. Therefore we'll want to dump the raw stack to show details on all the different variables and registers that were involved in this thread that it decided to preserve on the stack. So, to do that, we start with getting details on the thread first using !thread:
Note that I just had to type !thread and didn't need to specify the address of the thread because it was smart enough to figure that I wanted to use the thread involved in the current context (again, context helps!). Because that context happens to be for the latest thread that was running before the crash, that's the one that we want (in minidumps it's the only context you'll get!). Note that the stack unwind has KeBugCheckEx in it so that means it's our crashing thread.
Now, while the stack unwind is no different than what we saw from the !analyze -v output, we do have a lot of extra relevant data on the thread. While all of this can be used for various debugging efforts, our focus is centered on one line, which contains the Base and the Limit. This is essentially the very bottom of the stack and the maximum limit that the stack can fill up too. So what we do is take those values and use them as start and end points for a range to show all the data inbetween. You can use various types of d command to dump the data, but the one we want in particular is dps, which is part of a triplet of special d commands that are designed to show symbols. As you might remember in other articles, symbols are those informative function names like what you see in the stack unwind info that gives you a good idea what the purpose of that particular function is. dps and its fellow brothers dds and dqs will parse through the range of data you give it, determine if there's an instruction address in it, and then use the appropriate symbol for that address. It's preferred to use dps as opposed to its other two related commands because unlike them, dps will discover whether you're dealing with x32 or x64 sized memory addresses and will read and output the data accordingly. So anyways, let's do that:
The first thing you're probably wondering is why I switched the start and end values by using the Limit for the start and Base for the end. That's because stacks grow backwards (downwards in value) so that is reflected by the Base and Limit values. Since you can't tell from this output I presented since I've clipped a lot of it, there's a good couple hundred lines it posts from this. Fortunately most of the stack in this thread hasn't even been used, so there's a long range of question marks at the beginning and end. That means we have less to work with.
Now, if you're following along, even scrolling through it you'll immediately discover a rather obtuse line in the midst of it:
Ah ha! So we have a suspect here! As you can tell, what happened is that while dps did find this as a valid instruction address, it could not find any symbols for the module (driver) that this is a part of, which ends up having it bicker. That's because this - like many other 3rd-party drivers - do not have publicly available symbols that you loaded. If it weren't for you setting up Windbg to use the Microsoft symbol server or to install all the symbol packages from its website, you'd see the same thing for all the Windows functions, too! That's why it's very important to have symbols and the proper symbols, otherwise a stack unwind will be extremely vague or very misleading.
As it was figured, this is part of Daemon Tools, and in fact it's dated from Jan 2012, so while it's recent at the time of this writing, it still was causing problems. After the client removed it, they reported no issues. Yay!
Please understand that this is not a 100% surefire way of finding an exact cause. Every bugcheck is different, and this is just one way of finding important data on the situation that may help you. This was a relatively easy case once Driver Verifier kicked in, and all it took is to use this method to extract data that ended up being most important to solving this. There's something different for every case, so it's imperative that you have a general idea what to look for and start using tricks like this to dump all the data that will assist you in that approach.
Update:
I think it's important to mention that this did not in fact fix the client's problem afterall, and I know exactly why. Daemon Tools driver here was responsible for the crash, yes, but why it crashed is important:
If I am correct, I am pretty sure this is related to the Force Pending I/O Requests check that's in Driver Verifier. Often this check is recommended on technical support forums without people being aware of exactly what it does and what it's for. As mentioned in BSOD Method & Tips, this check is a no-no. This is because it's a stress test creating an artificial environment that would very rarely occur in a real-world situation and so it does this to accelerate the testing procedure for this. Many drivers out there actually are not designed to handle pended IRPs like this, and so forcing this behavior on them will cause them to bug out (Windows drivers included!). While Daemon Tools driver does fail to properly handing this, it's not really supposed too, and so it whacks out when it is forced too using the Force Pending I/O Requests check.
Understand that the 0xC9 crash was really caused by the IO subsystem checking (I/O Verification) option selected in Driver Verifier. However, what really triggered it is Force Pending I/O Requests, because that option will randomly force an IRP to go pending, which is not a common experience when running a PC. An IRP related to Daemon Tools goes pending, Daemon Tools coughs on it because it doesn't know how to handle pending IRPs correctly, and the I/O Verification check in DV notices this mishandling and triggers the bugcheck. When looking at the Windbg help manual for 0xC9 bugchecks, you'll find that the subcode for this, 0x23E, is a Non-fatal error. It's an ugly and potentially risky fault, but it is not capable in and of itself to cause harm to the system, so it is not generally worthy of creating a BSOD. DV only gives a BSOD anyways to inform the driver developer of the bug to help them with the debugging process when bugtesting their code. In fact, if you actually set up a live kernel debugger on the faulting system, the debugger will catch this kind of fault, and you can choose to ignore the fault and continue on with regular PC activity. It's not often you can recover from a BSOD like that!
Oh, and just to confirm the fact the client had Force Pending I/O Requests on:
Again, note that I/O subsystem checking is on, which actually caused the BSOD, but only because of Force Pending I/O Request's intervention in the first place.
So the real culprit for the client's problems continued to be at large, and all we did was waste time with this. So remember folks, don't turn on Force Pending I/O Requests! Note that this still does not nullify this example as being a legitimate method for finding suspect drivers. We still found what was involved in this bugcheck, but it was not responsible for the real-world problems the client was dealing with.
Bonus:
Just so you know, this driver actually did show up in the stack unwind present in both the !analyze -v and !thread outputs. Take the address of the instruction in the driver as we see in the raw stack, and search for it in the stack unwind output:
It showed up here in the Args to Child because it's part of what's called the home address space. It is where a function typically (keyword here!) stores the contents of 4 general registers (rcx, rdx, r8, and r9, respectively) before it goes ahead and calls another function, which that function can then use these to work with. If you're a programmer, you'll notice that you can pass parameters to functions or during API calls. The first 4 parameters most commonly are what resides in these 4 registers. However, note that this is just a common use for this portion of the stack, and it is not set in stone (plus, this all only applies to x64 stacks). A function can at any time decide to use the home space for anything other than storing those 4 particular registers, though most commonly this is how it works, especially in kernel mode (drivers and Windows code, not user mode applications). So what we have here is this instruction address to inside dtsoftbus01.sys was quite possibly slapped into the r9 register (since it's the fourth address or parameter here) some point in time and it stored it here for safe keeping. The purpose and reasoning behind the exact placement of this here in the raw stack I'm not entirely sure yet, still learning on it. But that's the general idea.
You can read up more info on home spaces and stack usage in x64 stacks here and here. Also a comprehensive demonstration and tutorial is here. It's a bit advanced but it helped me a lot on figuring out stack construction on x64 and how to walk it.
Code:
2: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DRIVER_VERIFIER_IOMANAGER_VIOLATION (c9)
The IO manager has caught a misbehaving driver.
Arguments:
Arg1: 000000000000023e, A driver has marked an IRP pending but didn't return STATUS_PENDING.
Arg2: fffffa8005aea2c0, The address in the driver's code where the error was detected.
Arg3: fffff9800a9aaee0, IRP address.
Arg4: 0000000000000000, Status code.
Debugging Details:
------------------
TRIAGER: Could not open triage file : C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\triage\modclass.ini, error 2
OVERLAPPED_MODULE: Address regions for 'watchdog' and 'cdrom.sys' overlap
BUGCHECK_STR: 0xc9_23e
DRIVER_VERIFIER_IO_VIOLATION_TYPE: 23e
FAULTING_IP:
+0
fffffa80`05aea2c0 4883ec48 sub rsp,48h
FOLLOWUP_IP:
cdrom!DeviceSendRequestSynchronously+c9
fffff880`0838be95 488b0da490ffff mov rcx,qword ptr [cdrom!WdfDriverGlobals (fffff880`08384f40)]
IRP_ADDRESS: fffff9800a9aaee0
DEVICE_OBJECT: fffffa8005f0b060
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
PROCESS_NAME: System
CURRENT_IRQL: 2
LOCK_ADDRESS: fffff800034e1b80 -- (!locks fffff800034e1b80)
Resource @ nt!PiEngineLock (0xfffff800034e1b80) Available
WARNING: SystemResourcesList->Flink chain invalid. Resource may be corrupted, or already deleted.
WARNING: SystemResourcesList->Blink chain invalid. Resource may be corrupted, or already deleted.
1 total locks
PNP_TRIAGE:
Lock address : 0xfffff800034e1b80
Thread Count : 0
Thread address: 0x0000000000000000
Thread wait : 0x0
LAST_CONTROL_TRANSFER: from fffff8000376c3dc to fffff800032e2c40
STACK_TEXT:
fffff880`033c39a8 fffff800`0376c3dc : 00000000`000000c9 00000000`0000023e fffffa80`05aea2c0 fffff980`0a9aaee0 : nt!KeBugCheckEx
fffff880`033c39b0 fffff800`0377647a : fffff800`0376a9f0 fffffa80`05aea2c0 fffff980`0a9aaee0 00000000`00000000 : nt!VerifierBugCheckIfAppropriate+0x3c
fffff880`033c39f0 fffff800`0377734e : 00000000`0000023e 00000000`00000000 fffff980`0a9aaee0 00000000`ffffffff : nt!ViErrorFinishReport+0xda
fffff880`033c3a40 fffff800`0378271f : fffffa80`061af820 fffffa80`0617fab0 fffffa80`0617fb68 fffff880`015a8228 : nt!VfErrorReport6+0x6e
fffff880`033c3b10 fffff800`03782b63 : fffffa80`0617fab0 00000000`00000002 fffffa80`05f0b060 fffff880`015a43f4 : nt!IovpCallDriver2+0x13f
fffff880`033c3b70 fffff800`03788c2e : fffff980`0a9aaee0 fffff980`0a9aaee0 00000000`00000002 fffffa80`05f0b060 : nt!VfAfterCallDriver+0x353
fffff880`033c3bc0 fffff880`00d680c2 : fffff880`00de17c0 fffffa80`0614ea20 00000000`00000025 fffffa80`0617fab0 : nt!IovCallDriver+0x57e
fffff880`033c3c20 fffff880`00d7d59f : fffffa80`063b8da0 00000000`00000000 fffff880`00000000 fffff880`083775d5 : Wdf01000!FxIoTarget::SubmitSync+0x24a
fffff880`033c3cd0 fffff880`0838be95 : fffffa80`00000020 fffffa80`063b8da0 fffffa80`0614ea20 00000000`00000001 : Wdf01000!imp_WdfRequestSend+0x24b
fffff880`033c3d20 fffff880`0838c2f4 : 00000000`00000000 00000000`00000024 00000000`00000001 fffff880`033c3ee0 : cdrom!DeviceSendRequestSynchronously+0xc9
fffff880`033c3d80 fffff880`0838af04 : 0000057f`f9f447e8 fffff880`033c3ee0 fffffa80`060eb830 00000000`00000024 : cdrom!DeviceSendSrbSynchronously+0x3ec
fffff880`033c3eb0 fffff880`08389825 : fffffa80`00000002 fffffa80`060bbb80 fffff880`083813b8 0000057f`f9f447e8 : cdrom!DeviceCacheDeviceInquiryData+0xbc
fffff880`033c3f80 fffff880`00dd29d4 : 00000000`00000000 fffff880`00000010 00000000`0000000c fffff880`00ddb620 : cdrom!DeviceEvtSelfManagedIoInit+0x105
fffff880`033c3fc0 fffff880`00dd294e : fffffa80`0612aeb0 00000000`0000000c fffff880`00ddb620 fffff880`00ddb600 : Wdf01000!FxSelfManagedIoMachine::Init+0x34
fffff880`033c3ff0 fffff880`00dc6b9b : 00000000`00000000 00000000`00000000 00000000`00000000 fffff880`00ddb620 : Wdf01000!FxSelfManagedIoMachine::ProcessEvent+0x14a
fffff880`033c4060 fffff880`00dc66eb : 00000000`00000312 00000000`00000000 fffff880`00ddb600 fffff6fc`c00544e0 : Wdf01000!FxPkgPnp::PowerD0StartingStartSelfManagedIo+0x2f
fffff880`033c4090 fffff880`00dc638e : fffffa80`0614db60 00000000`00000040 fffff880`00ddb3e0 00000000`00000040 : Wdf01000!FxPkgPnp::PowerEnterNewState+0x1db
fffff880`033c41c0 fffff880`00dc606c : 00000000`00000000 00000000`00000040 00000000`00000501 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PowerProcessEventInner+0x13e
fffff880`033c4230 fffff880`00dd3b5e : 00000000`00000000 00000000`00000501 00000000`00000000 00000000`00000501 : Wdf01000!FxPkgPnp::PowerProcessEvent+0x1d4
fffff880`033c42c0 fffff880`00dd3abb : 00000000`00000501 00000000`00000000 00000000`00000500 fffff800`037875ec : Wdf01000!FxPkgPnp::NotPowerPolOwnerStarting+0xe
fffff880`033c42f0 fffff880`00dcf950 : fffffa80`0614db60 00000000`00000001 00000000`00000000 fffff880`00dd7ff0 : Wdf01000!FxPkgPnp::NotPowerPolicyOwnerEnterNewState+0x1c7
fffff880`033c4370 fffff880`00dcf372 : 00000000`00000000 00000000`00000001 00000000`00000000 fffffa80`0614ee20 : Wdf01000!FxPkgPnp::PowerPolicyProcessEventInner+0x3f8
fffff880`033c43e0 fffff880`00dcba45 : 00000000`00000000 fffffa80`0614ee20 00000000`00000108 00000000`00000000 : Wdf01000!FxPkgPnp::PowerPolicyProcessEvent+0x1e2
fffff880`033c4470 fffff880`00dcb841 : 00000000`00000101 00000000`00000108 00000000`00000108 fffff800`00000006 : Wdf01000!FxPkgPnp::PnpEventHardwareAvailable+0x111
fffff880`033c44b0 fffff880`00dcb4fe : fffffa80`0614db60 fffff880`033c45e0 00000000`00000004 fffff880`00dda390 : Wdf01000!FxPkgPnp::PnpEnterNewState+0x1a5
fffff880`033c4520 fffff880`00dcb201 : 00000000`00000000 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PnpProcessEventInner+0x122
fffff880`033c4590 fffff880`00dc2d9c : 00000000`00000000 fffffa80`0614db60 00000000`00000002 00000000`00000000 : Wdf01000!FxPkgPnp::PnpProcessEvent+0x1b1
fffff880`033c4620 fffff880`00dc1dd6 : fffffa80`0614cd30 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::_PnpStartDevice+0x20
fffff880`033c4650 fffff880`00d91245 : fffff980`0a82ee50 fffff980`0a82ee50 fffffa80`0614ee20 00000000`00000000 : Wdf01000!FxPkgPnp::Dispatch+0x1b2
fffff880`033c46c0 fffff880`00d9114b : fffffa80`0614ee20 fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 : Wdf01000!FxDevice::Dispatch+0xa9
fffff880`033c46f0 fffff800`03788c16 : fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 fffffa80`0614ee20 : Wdf01000!FxDevice::DispatchWithLock+0x93
fffff880`033c4730 fffff800`0369bd6e : fffff980`0a82ee50 fffffa80`06186830 fffffa80`0614ee20 fffffa80`061b9a40 : nt!IovCallDriver+0x566
fffff880`033c4790 fffff800`033d387d : fffffa80`05f0b060 fffffa80`06186830 fffff800`033dcfa0 00000000`00000000 : nt!PnpAsynchronousCall+0xce
fffff880`033c47d0 fffff800`036ab0b6 : fffff800`034e1940 fffffa80`0614a890 fffffa80`06186830 fffffa80`0614aa38 : nt!PnpStartDevice+0x11d
fffff880`033c4890 fffff800`036ab354 : fffffa80`0614a890 fffffa80`039e002b fffffa80`039ef010 00000000`00000001 : nt!PnpStartDeviceNode+0x156
fffff880`033c4920 fffff800`036cea86 : fffffa80`0614a890 fffffa80`039ef010 00000000`00000002 00000000`00000000 : nt!PipProcessStartPhase1+0x74
fffff880`033c4950 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!PipProcessDevNodeTree+0x296
STACK_COMMAND: kb
SYMBOL_STACK_INDEX: 9
SYMBOL_NAME: cdrom!DeviceSendRequestSynchronously+c9
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: cdrom
IMAGE_NAME: cdrom.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 4ce79298
FAILURE_BUCKET_ID: X64_0xc9_23e_VRF_cdrom!DeviceSendRequestSynchronously+c9
BUCKET_ID: X64_0xc9_23e_VRF_cdrom!DeviceSendRequestSynchronously+c9
Followup: MachineOwner
---------
This occurred when the client turned on DV (Driver Verifier) and had his computer quickly go into reboot loop after having done so. Yet despite the analysis engine's best intentions, it came up with cdrom.sys as the culprit, which is probably not the cause. Nothing in the running thread's callstack unwind showed any other drivers of suspicion, and unfortunately for us, while we can run the address of the pending IRP through !irp extension, we get no valuable details from this minidump (like the device object). So far this seems like we're stuck.
But wait! We have one trick we can pull, and that's dump the raw stack for this thread. The stack unwind information shows us what the analysis engine was able to produce as being most likely the callstack for this thread. However, there may be references to other functions and whatnot that are not present in this stack unwind. Therefore we'll want to dump the raw stack to show details on all the different variables and registers that were involved in this thread that it decided to preserve on the stack. So, to do that, we start with getting details on the thread first using !thread:
Code:
2: kd> !thread
GetPointerFromAddress: unable to read from fffff80003515000
THREAD fffffa80039d6680 Cid 0004.0040 Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 2
Not impersonating
GetUlongFromAddress: unable to read from fffff80003454ba4
Owning Process fffffa80039b8740 Image: System
Attached Process N/A Image: N/A
fffff78000000000: Unable to get shared data
Wait Start TickCount 1064
Context Switch Count 4460 IdealProcessor: 2
ReadMemory error: Cannot get nt!KeMaximumIncrement value.
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address nt!ExpWorkerThread (0xfffff800032ecef0)
Stack Init fffff880033c4db0 Current fffff880033c3840
Base [COLOR=#ff8c00]fffff880033c5000[/COLOR] Limit [COLOR=#0000cd]fffff880033bf000[/COLOR] Call 0
Priority 13 BasePriority 12 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr : Args to Child : Call Site
fffff880`033c39a8 fffff800`0376c3dc : 00000000`000000c9 00000000`0000023e fffffa80`05aea2c0 fffff980`0a9aaee0 : nt!KeBugCheckEx
fffff880`033c39b0 fffff800`0377647a : fffff800`0376a9f0 fffffa80`05aea2c0 fffff980`0a9aaee0 00000000`00000000 : nt!VerifierBugCheckIfAppropriate+0x3c
fffff880`033c39f0 fffff800`0377734e : 00000000`0000023e 00000000`00000000 fffff980`0a9aaee0 00000000`ffffffff : nt!ViErrorFinishReport+0xda
fffff880`033c3a40 fffff800`0378271f : fffffa80`061af820 fffffa80`0617fab0 fffffa80`0617fb68 fffff880`015a8228 : nt!VfErrorReport6+0x6e
fffff880`033c3b10 fffff800`03782b63 : fffffa80`0617fab0 00000000`00000002 fffffa80`05f0b060 fffff880`015a43f4 : nt!IovpCallDriver2+0x13f
fffff880`033c3b70 fffff800`03788c2e : fffff980`0a9aaee0 fffff980`0a9aaee0 00000000`00000002 fffffa80`05f0b060 : nt!VfAfterCallDriver+0x353
fffff880`033c3bc0 fffff880`00d680c2 : fffff880`00de17c0 fffffa80`0614ea20 00000000`00000025 fffffa80`0617fab0 : nt!IovCallDriver+0x57e
fffff880`033c3c20 fffff880`00d7d59f : fffffa80`063b8da0 00000000`00000000 fffff880`00000000 fffff880`083775d5 : Wdf01000!FxIoTarget::SubmitSync+0x24a
fffff880`033c3cd0 fffff880`0838be95 : fffffa80`00000020 fffffa80`063b8da0 fffffa80`0614ea20 00000000`00000001 : Wdf01000!imp_WdfRequestSend+0x24b
fffff880`033c3d20 fffff880`0838c2f4 : 00000000`00000000 00000000`00000024 00000000`00000001 fffff880`033c3ee0 : cdrom!DeviceSendRequestSynchronously+0xc9
fffff880`033c3d80 fffff880`0838af04 : 0000057f`f9f447e8 fffff880`033c3ee0 fffffa80`060eb830 00000000`00000024 : cdrom!DeviceSendSrbSynchronously+0x3ec
fffff880`033c3eb0 fffff880`08389825 : fffffa80`00000002 fffffa80`060bbb80 fffff880`083813b8 0000057f`f9f447e8 : cdrom!DeviceCacheDeviceInquiryData+0xbc
fffff880`033c3f80 fffff880`00dd29d4 : 00000000`00000000 fffff880`00000010 00000000`0000000c fffff880`00ddb620 : cdrom!DeviceEvtSelfManagedIoInit+0x105
fffff880`033c3fc0 fffff880`00dd294e : fffffa80`0612aeb0 00000000`0000000c fffff880`00ddb620 fffff880`00ddb600 : Wdf01000!FxSelfManagedIoMachine::Init+0x34
fffff880`033c3ff0 fffff880`00dc6b9b : 00000000`00000000 00000000`00000000 00000000`00000000 fffff880`00ddb620 : Wdf01000!FxSelfManagedIoMachine::ProcessEvent+0x14a
fffff880`033c4060 fffff880`00dc66eb : 00000000`00000312 00000000`00000000 fffff880`00ddb600 fffff6fc`c00544e0 : Wdf01000!FxPkgPnp::PowerD0StartingStartSelfManagedIo+0x2f
fffff880`033c4090 fffff880`00dc638e : fffffa80`0614db60 00000000`00000040 fffff880`00ddb3e0 00000000`00000040 : Wdf01000!FxPkgPnp::PowerEnterNewState+0x1db
fffff880`033c41c0 fffff880`00dc606c : 00000000`00000000 00000000`00000040 00000000`00000501 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PowerProcessEventInner+0x13e
fffff880`033c4230 fffff880`00dd3b5e : 00000000`00000000 00000000`00000501 00000000`00000000 00000000`00000501 : Wdf01000!FxPkgPnp::PowerProcessEvent+0x1d4
fffff880`033c42c0 fffff880`00dd3abb : 00000000`00000501 00000000`00000000 00000000`00000500 fffff800`037875ec : Wdf01000!FxPkgPnp::NotPowerPolOwnerStarting+0xe
fffff880`033c42f0 fffff880`00dcf950 : fffffa80`0614db60 00000000`00000001 00000000`00000000 fffff880`00dd7ff0 : Wdf01000!FxPkgPnp::NotPowerPolicyOwnerEnterNewState+0x1c7
fffff880`033c4370 fffff880`00dcf372 : 00000000`00000000 00000000`00000001 00000000`00000000 fffffa80`0614ee20 : Wdf01000!FxPkgPnp::PowerPolicyProcessEventInner+0x3f8
fffff880`033c43e0 fffff880`00dcba45 : 00000000`00000000 fffffa80`0614ee20 00000000`00000108 00000000`00000000 : Wdf01000!FxPkgPnp::PowerPolicyProcessEvent+0x1e2
fffff880`033c4470 fffff880`00dcb841 : 00000000`00000101 00000000`00000108 00000000`00000108 fffff800`00000006 : Wdf01000!FxPkgPnp::PnpEventHardwareAvailable+0x111
fffff880`033c44b0 fffff880`00dcb4fe : fffffa80`0614db60 fffff880`033c45e0 00000000`00000004 fffff880`00dda390 : Wdf01000!FxPkgPnp::PnpEnterNewState+0x1a5
fffff880`033c4520 fffff880`00dcb201 : 00000000`00000000 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PnpProcessEventInner+0x122
fffff880`033c4590 fffff880`00dc2d9c : 00000000`00000000 fffffa80`0614db60 00000000`00000002 00000000`00000000 : Wdf01000!FxPkgPnp::PnpProcessEvent+0x1b1
fffff880`033c4620 fffff880`00dc1dd6 : fffffa80`0614cd30 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::_PnpStartDevice+0x20
fffff880`033c4650 fffff880`00d91245 : fffff980`0a82ee50 fffff980`0a82ee50 fffffa80`0614ee20 00000000`00000000 : Wdf01000!FxPkgPnp::Dispatch+0x1b2
fffff880`033c46c0 fffff880`00d9114b : fffffa80`0614ee20 fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 : Wdf01000!FxDevice::Dispatch+0xa9
fffff880`033c46f0 fffff800`03788c16 : fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 fffffa80`0614ee20 : Wdf01000!FxDevice::DispatchWithLock+0x93
fffff880`033c4730 fffff800`0369bd6e : fffff980`0a82ee50 fffffa80`06186830 fffffa80`0614ee20 fffffa80`061b9a40 : nt!IovCallDriver+0x566
fffff880`033c4790 fffff800`033d387d : fffffa80`05f0b060 fffffa80`06186830 fffff800`033dcfa0 00000000`00000000 : nt!PnpAsynchronousCall+0xce
fffff880`033c47d0 fffff800`036ab0b6 : fffff800`034e1940 fffffa80`0614a890 fffffa80`06186830 fffffa80`0614aa38 : nt!PnpStartDevice+0x11d
fffff880`033c4890 fffff800`036ab354 : fffffa80`0614a890 fffffa80`039e002b fffffa80`039ef010 00000000`00000001 : nt!PnpStartDeviceNode+0x156
fffff880`033c4920 fffff800`036cea86 : fffffa80`0614a890 fffffa80`039ef010 00000000`00000002 00000000`00000000 : nt!PipProcessStartPhase1+0x74
fffff880`033c4950 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!PipProcessDevNodeTree+0x296
Note that I just had to type !thread and didn't need to specify the address of the thread because it was smart enough to figure that I wanted to use the thread involved in the current context (again, context helps!). Because that context happens to be for the latest thread that was running before the crash, that's the one that we want (in minidumps it's the only context you'll get!). Note that the stack unwind has KeBugCheckEx in it so that means it's our crashing thread.
Now, while the stack unwind is no different than what we saw from the !analyze -v output, we do have a lot of extra relevant data on the thread. While all of this can be used for various debugging efforts, our focus is centered on one line, which contains the Base and the Limit. This is essentially the very bottom of the stack and the maximum limit that the stack can fill up too. So what we do is take those values and use them as start and end points for a range to show all the data inbetween. You can use various types of d command to dump the data, but the one we want in particular is dps, which is part of a triplet of special d commands that are designed to show symbols. As you might remember in other articles, symbols are those informative function names like what you see in the stack unwind info that gives you a good idea what the purpose of that particular function is. dps and its fellow brothers dds and dqs will parse through the range of data you give it, determine if there's an instruction address in it, and then use the appropriate symbol for that address. It's preferred to use dps as opposed to its other two related commands because unlike them, dps will discover whether you're dealing with x32 or x64 sized memory addresses and will read and output the data accordingly. So anyways, let's do that:
Code:
2: kd> dps [COLOR=#0000cd]fffff880033bf000[/COLOR] [COLOR=#ff8c00]fffff880033c5000[/COLOR]
fffff880`033bf000 ????????`????????
fffff880`033bf008 ????????`????????
fffff880`033bf010 ????????`????????
fffff880`033bf018 ????????`????????
fffff880`033bf020 ????????`????????
fffff880`033bf028 ????????`????????
fffff880`033bf030 ????????`????????
...
fffff880`033c4fc0 ????????`????????
fffff880`033c4fc8 ????????`????????
fffff880`033c4fd0 ????????`????????
fffff880`033c4fd8 ????????`????????
fffff880`033c4fe0 ????????`????????
fffff880`033c4fe8 ????????`????????
fffff880`033c4ff0 ????????`????????
fffff880`033c4ff8 ????????`????????
fffff880`033c5000 ????????`????????
The first thing you're probably wondering is why I switched the start and end values by using the Limit for the start and Base for the end. That's because stacks grow backwards (downwards in value) so that is reflected by the Base and Limit values. Since you can't tell from this output I presented since I've clipped a lot of it, there's a good couple hundred lines it posts from this. Fortunately most of the stack in this thread hasn't even been used, so there's a long range of question marks at the beginning and end. That means we have less to work with.
Now, if you're following along, even scrolling through it you'll immediately discover a rather obtuse line in the midst of it:
Code:
fffff880`033c3b28 fffff880`015a8228Unable to load image \SystemRoot\system32\DRIVERS\[COLOR=#ff0000]dtsoftbus01.sys[/COLOR], Win32 error 0n2
*** WARNING: Unable to verify timestamp for [COLOR=#ff0000]dtsoftbus01.sys[/COLOR]
*** ERROR: Module load completed but symbols could not be loaded for [COLOR=#ff0000]dtsoftbus01.sys[/COLOR]
[COLOR=#ff0000]dtsoftbus01+0x5228[/COLOR]
Ah ha! So we have a suspect here! As you can tell, what happened is that while dps did find this as a valid instruction address, it could not find any symbols for the module (driver) that this is a part of, which ends up having it bicker. That's because this - like many other 3rd-party drivers - do not have publicly available symbols that you loaded. If it weren't for you setting up Windbg to use the Microsoft symbol server or to install all the symbol packages from its website, you'd see the same thing for all the Windows functions, too! That's why it's very important to have symbols and the proper symbols, otherwise a stack unwind will be extremely vague or very misleading.
As it was figured, this is part of Daemon Tools, and in fact it's dated from Jan 2012, so while it's recent at the time of this writing, it still was causing problems. After the client removed it, they reported no issues. Yay!
Please understand that this is not a 100% surefire way of finding an exact cause. Every bugcheck is different, and this is just one way of finding important data on the situation that may help you. This was a relatively easy case once Driver Verifier kicked in, and all it took is to use this method to extract data that ended up being most important to solving this. There's something different for every case, so it's imperative that you have a general idea what to look for and start using tricks like this to dump all the data that will assist you in that approach.
Note
Tip: if you suspect a thread's stack is corrupt, you can use this to dump the stack, and along with dps you can use dc to dump the stack with ASCII characters. Often times a stack is trashed by mishandling of it and having a function unintentionally dump a bunch of data onto the stack. Sometimes you may have strings of text sitting in that data that will point you to a cause, and the dc command is just the thing to use to sniff them out!
Update:
I think it's important to mention that this did not in fact fix the client's problem afterall, and I know exactly why. Daemon Tools driver here was responsible for the crash, yes, but why it crashed is important:
Code:
DRIVER_VERIFIER_IOMANAGER_VIOLATION (c9)
The IO manager has caught a misbehaving driver.
Arguments:
Arg1: 000000000000023e, [COLOR=#ff0000]A driver has marked an IRP pending but didn't return STATUS_PENDING[/COLOR].
Arg2: fffffa8005aea2c0, The address in the driver's code where the error was detected.
Arg3: fffff9800a9aaee0, IRP address.
Arg4: 0000000000000000, Status code.
If I am correct, I am pretty sure this is related to the Force Pending I/O Requests check that's in Driver Verifier. Often this check is recommended on technical support forums without people being aware of exactly what it does and what it's for. As mentioned in BSOD Method & Tips, this check is a no-no. This is because it's a stress test creating an artificial environment that would very rarely occur in a real-world situation and so it does this to accelerate the testing procedure for this. Many drivers out there actually are not designed to handle pended IRPs like this, and so forcing this behavior on them will cause them to bug out (Windows drivers included!). While Daemon Tools driver does fail to properly handing this, it's not really supposed too, and so it whacks out when it is forced too using the Force Pending I/O Requests check.
Understand that the 0xC9 crash was really caused by the IO subsystem checking (I/O Verification) option selected in Driver Verifier. However, what really triggered it is Force Pending I/O Requests, because that option will randomly force an IRP to go pending, which is not a common experience when running a PC. An IRP related to Daemon Tools goes pending, Daemon Tools coughs on it because it doesn't know how to handle pending IRPs correctly, and the I/O Verification check in DV notices this mishandling and triggers the bugcheck. When looking at the Windbg help manual for 0xC9 bugchecks, you'll find that the subcode for this, 0x23E, is a Non-fatal error. It's an ugly and potentially risky fault, but it is not capable in and of itself to cause harm to the system, so it is not generally worthy of creating a BSOD. DV only gives a BSOD anyways to inform the driver developer of the bug to help them with the debugging process when bugtesting their code. In fact, if you actually set up a live kernel debugger on the faulting system, the debugger will catch this kind of fault, and you can choose to ignore the fault and continue on with regular PC activity. It's not often you can recover from a BSOD like that!
Oh, and just to confirm the fact the client had Force Pending I/O Requests on:
Code:
2: kd> !verifier
Verify Level fbb ... enabled options are:
Special pool
Special irql
All pool allocations checked on unload
[COLOR=#0000ff]Io subsystem checking enabled[/COLOR]
Deadlock detection enabled
DMA checking enabled
Security checks enabled
[COLOR=#ff0000]Force pending I/O requests[/COLOR]
IRP Logging
Miscellaneous checks enabled
Summary of All Verifier Statistics
RaiseIrqls 0x0
AcquireSpinLocks 0x22ba
Synch Executions 0x66
Trims 0x1fee
Pool Allocations Attempted 0x6b15
Pool Allocations Succeeded 0x6b15
Pool Allocations Succeeded SpecialPool 0x6b15
Pool Allocations With NO TAG 0x3
Pool Allocations Failed 0x0
Resource Allocations Failed Deliberately 0x0
Current paged pool allocations 0x4e for 0002A83C bytes
Peak paged pool allocations 0x50 for 0002A874 bytes
Current nonpaged pool allocations 0x34c9 for 00436B1C bytes
Peak nonpaged pool allocations 0x34ca for 00436CA0 bytes
GetPointerFromAddress: unable to read from fffff80003515100
Again, note that I/O subsystem checking is on, which actually caused the BSOD, but only because of Force Pending I/O Request's intervention in the first place.
So the real culprit for the client's problems continued to be at large, and all we did was waste time with this. So remember folks, don't turn on Force Pending I/O Requests! Note that this still does not nullify this example as being a legitimate method for finding suspect drivers. We still found what was involved in this bugcheck, but it was not responsible for the real-world problems the client was dealing with.
Bonus:
Just so you know, this driver actually did show up in the stack unwind present in both the !analyze -v and !thread outputs. Take the address of the instruction in the driver as we see in the raw stack, and search for it in the stack unwind output:
Code:
fffff880`033c3b28 [COLOR=#006400]fffff880`015a8228[/COLOR]Unable to load image \SystemRoot\system32\DRIVERS\dtsoftbus01.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for dtsoftbus01.sys
*** ERROR: Module load completed but symbols could not be loaded for dtsoftbus01.sys
dtsoftbus01+0x5228
Child-SP RetAddr : Args to Child : Call Site
fffff880`033c39a8 fffff800`0376c3dc : 00000000`000000c9 00000000`0000023e fffffa80`05aea2c0 fffff980`0a9aaee0 : nt!KeBugCheckEx
fffff880`033c39b0 fffff800`0377647a : fffff800`0376a9f0 fffffa80`05aea2c0 fffff980`0a9aaee0 00000000`00000000 : nt!VerifierBugCheckIfAppropriate+0x3c
fffff880`033c39f0 fffff800`0377734e : 00000000`0000023e 00000000`00000000 fffff980`0a9aaee0 00000000`ffffffff : nt!ViErrorFinishReport+0xda
fffff880`033c3a40 fffff800`0378271f : fffffa80`061af820 fffffa80`0617fab0 fffffa80`0617fb68 [COLOR=#006400]fffff880`015a8228[/COLOR] : nt!VfErrorReport6+0x6e
fffff880`033c3b10 fffff800`03782b63 : fffffa80`0617fab0 00000000`00000002 fffffa80`05f0b060 fffff880`015a43f4 : nt!IovpCallDriver2+0x13f
fffff880`033c3b70 fffff800`03788c2e : fffff980`0a9aaee0 fffff980`0a9aaee0 00000000`00000002 fffffa80`05f0b060 : nt!VfAfterCallDriver+0x353
fffff880`033c3bc0 fffff880`00d680c2 : fffff880`00de17c0 fffffa80`0614ea20 00000000`00000025 fffffa80`0617fab0 : nt!IovCallDriver+0x57e
fffff880`033c3c20 fffff880`00d7d59f : fffffa80`063b8da0 00000000`00000000 fffff880`00000000 fffff880`083775d5 : Wdf01000!FxIoTarget::SubmitSync+0x24a
fffff880`033c3cd0 fffff880`0838be95 : fffffa80`00000020 fffffa80`063b8da0 fffffa80`0614ea20 00000000`00000001 : Wdf01000!imp_WdfRequestSend+0x24b
fffff880`033c3d20 fffff880`0838c2f4 : 00000000`00000000 00000000`00000024 00000000`00000001 fffff880`033c3ee0 : cdrom!DeviceSendRequestSynchronously+0xc9
fffff880`033c3d80 fffff880`0838af04 : 0000057f`f9f447e8 fffff880`033c3ee0 fffffa80`060eb830 00000000`00000024 : cdrom!DeviceSendSrbSynchronously+0x3ec
fffff880`033c3eb0 fffff880`08389825 : fffffa80`00000002 fffffa80`060bbb80 fffff880`083813b8 0000057f`f9f447e8 : cdrom!DeviceCacheDeviceInquiryData+0xbc
fffff880`033c3f80 fffff880`00dd29d4 : 00000000`00000000 fffff880`00000010 00000000`0000000c fffff880`00ddb620 : cdrom!DeviceEvtSelfManagedIoInit+0x105
fffff880`033c3fc0 fffff880`00dd294e : fffffa80`0612aeb0 00000000`0000000c fffff880`00ddb620 fffff880`00ddb600 : Wdf01000!FxSelfManagedIoMachine::Init+0x34
fffff880`033c3ff0 fffff880`00dc6b9b : 00000000`00000000 00000000`00000000 00000000`00000000 fffff880`00ddb620 : Wdf01000!FxSelfManagedIoMachine::ProcessEvent+0x14a
fffff880`033c4060 fffff880`00dc66eb : 00000000`00000312 00000000`00000000 fffff880`00ddb600 fffff6fc`c00544e0 : Wdf01000!FxPkgPnp::PowerD0StartingStartSelfManagedIo+0x2f
fffff880`033c4090 fffff880`00dc638e : fffffa80`0614db60 00000000`00000040 fffff880`00ddb3e0 00000000`00000040 : Wdf01000!FxPkgPnp::PowerEnterNewState+0x1db
fffff880`033c41c0 fffff880`00dc606c : 00000000`00000000 00000000`00000040 00000000`00000501 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PowerProcessEventInner+0x13e
fffff880`033c4230 fffff880`00dd3b5e : 00000000`00000000 00000000`00000501 00000000`00000000 00000000`00000501 : Wdf01000!FxPkgPnp::PowerProcessEvent+0x1d4
fffff880`033c42c0 fffff880`00dd3abb : 00000000`00000501 00000000`00000000 00000000`00000500 fffff800`037875ec : Wdf01000!FxPkgPnp::NotPowerPolOwnerStarting+0xe
fffff880`033c42f0 fffff880`00dcf950 : fffffa80`0614db60 00000000`00000001 00000000`00000000 fffff880`00dd7ff0 : Wdf01000!FxPkgPnp::NotPowerPolicyOwnerEnterNewState+0x1c7
fffff880`033c4370 fffff880`00dcf372 : 00000000`00000000 00000000`00000001 00000000`00000000 fffffa80`0614ee20 : Wdf01000!FxPkgPnp::PowerPolicyProcessEventInner+0x3f8
fffff880`033c43e0 fffff880`00dcba45 : 00000000`00000000 fffffa80`0614ee20 00000000`00000108 00000000`00000000 : Wdf01000!FxPkgPnp::PowerPolicyProcessEvent+0x1e2
fffff880`033c4470 fffff880`00dcb841 : 00000000`00000101 00000000`00000108 00000000`00000108 fffff800`00000006 : Wdf01000!FxPkgPnp::PnpEventHardwareAvailable+0x111
fffff880`033c44b0 fffff880`00dcb4fe : fffffa80`0614db60 fffff880`033c45e0 00000000`00000004 fffff880`00dda390 : Wdf01000!FxPkgPnp::PnpEnterNewState+0x1a5
fffff880`033c4520 fffff880`00dcb201 : 00000000`00000000 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::PnpProcessEventInner+0x122
fffff880`033c4590 fffff880`00dc2d9c : 00000000`00000000 fffffa80`0614db60 00000000`00000002 00000000`00000000 : Wdf01000!FxPkgPnp::PnpProcessEvent+0x1b1
fffff880`033c4620 fffff880`00dc1dd6 : fffffa80`0614cd30 00000000`00000002 00000000`00000000 fffffa80`0614db60 : Wdf01000!FxPkgPnp::_PnpStartDevice+0x20
fffff880`033c4650 fffff880`00d91245 : fffff980`0a82ee50 fffff980`0a82ee50 fffffa80`0614ee20 00000000`00000000 : Wdf01000!FxPkgPnp::Dispatch+0x1b2
fffff880`033c46c0 fffff880`00d9114b : fffffa80`0614ee20 fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 : Wdf01000!FxDevice::Dispatch+0xa9
fffff880`033c46f0 fffff800`03788c16 : fffff980`0a82ee50 00000000`00000002 fffffa80`0614ee20 fffffa80`0614ee20 : Wdf01000!FxDevice::DispatchWithLock+0x93
fffff880`033c4730 fffff800`0369bd6e : fffff980`0a82ee50 fffffa80`06186830 fffffa80`0614ee20 fffffa80`061b9a40 : nt!IovCallDriver+0x566
fffff880`033c4790 fffff800`033d387d : fffffa80`05f0b060 fffffa80`06186830 fffff800`033dcfa0 00000000`00000000 : nt!PnpAsynchronousCall+0xce
fffff880`033c47d0 fffff800`036ab0b6 : fffff800`034e1940 fffffa80`0614a890 fffffa80`06186830 fffffa80`0614aa38 : nt!PnpStartDevice+0x11d
fffff880`033c4890 fffff800`036ab354 : fffffa80`0614a890 fffffa80`039e002b fffffa80`039ef010 00000000`00000001 : nt!PnpStartDeviceNode+0x156
fffff880`033c4920 fffff800`036cea86 : fffffa80`0614a890 fffffa80`039ef010 00000000`00000002 00000000`00000000 : nt!PipProcessStartPhase1+0x74
fffff880`033c4950 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!PipProcessDevNodeTree+0x296
It showed up here in the Args to Child because it's part of what's called the home address space. It is where a function typically (keyword here!) stores the contents of 4 general registers (rcx, rdx, r8, and r9, respectively) before it goes ahead and calls another function, which that function can then use these to work with. If you're a programmer, you'll notice that you can pass parameters to functions or during API calls. The first 4 parameters most commonly are what resides in these 4 registers. However, note that this is just a common use for this portion of the stack, and it is not set in stone (plus, this all only applies to x64 stacks). A function can at any time decide to use the home space for anything other than storing those 4 particular registers, though most commonly this is how it works, especially in kernel mode (drivers and Windows code, not user mode applications). So what we have here is this instruction address to inside dtsoftbus01.sys was quite possibly slapped into the r9 register (since it's the fourth address or parameter here) some point in time and it stored it here for safe keeping. The purpose and reasoning behind the exact placement of this here in the raw stack I'm not entirely sure yet, still learning on it. But that's the general idea.
You can read up more info on home spaces and stack usage in x64 stacks here and here. Also a comprehensive demonstration and tutorial is here. It's a bit advanced but it helped me a lot on figuring out stack construction on x64 and how to walk it.
Attachments
Last edited: