Yes, forensic analysis books such as the one you have there are excellent material to get into. OS design books are also worth reading. Unfortunately, I have no specific titles to recommend at this time.
Unless you have the know-how to figure out to a T everything that caused the crash, you'll eventually reach a point where you have to make an educated guess at what it could be. While this isn't ideal, the good thing is that your current knowledge should at least be able to explain what it could not be. Someone without that knowledge would look at a BSOD and have no idea what could've generated it. You, however, have been able to determine that it's related to the keyboard. As broad an estimate as that is, it's still quite an improvement over being completely clueless! Someone else could be tinkering with their video card, not realizing it has absolutely nothing to do with this BSOD. So don't get worked up just because you've only been able to go so far with your estimate. You've accomplished one step successfully; now take the next step by figuring out just what happened with the I/O involving this keyboard that got all out of whack.
What I did was start with the DV error: some I/O problem occurred that DV caught, meaning it detected a misbehaving driver. Ok, what did DV catch? Check the minor error code in Arg1 of the bugcheck: 0x23b, or "The caller has changed the status field of an IRP it does not understand." Hold it! What does this mean? This is where understanding IRP handling is important, which you should read up on extensively in both the MSDN articles pertaining to IRPs and the Windows Internals book's coverage of I/O. Basically, an IRP pertaining to some I/O gets passed down a stack of drivers relevant to that I/O, such as disk drivers and filter drivers (like A/V drivers) for file I/O. One of the elements of an IRP is the status field, which tells the drivers handling it the current status of the I/O that this IRP pertains to. Here's a breakdown of an IRP for ya:
Code:
2: kd> dt !_IRP
nt!_IRP
+0x000 Type : Int2B
+0x002 Size : Uint2B
+0x008 MdlAddress : Ptr64 _MDL
+0x010 Flags : Uint4B
+0x018 AssociatedIrp : <unnamed-tag>
+0x020 ThreadListEntry : _LIST_ENTRY
+0x030 IoStatus : _IO_STATUS_BLOCK
+0x040 RequestorMode : Char
+0x041 PendingReturned : UChar
+0x042 StackCount : Char
+0x043 CurrentLocation : Char
+0x044 Cancel : UChar
+0x045 CancelIrql : UChar
+0x046 ApcEnvironment : Char
+0x047 AllocationFlags : UChar
+0x048 UserIosb : Ptr64 _IO_STATUS_BLOCK
+0x050 UserEvent : Ptr64 _KEVENT
+0x058 Overlay : <unnamed-tag>
+0x068 CancelRoutine : Ptr64 void
+0x070 UserBuffer : Ptr64 Void
+0x078 Tail : <unnamed-tag>
There's the IoStatus substructure. We'll break it down further:
2: kd> dt !_IRP IoStatus.
nt!_IRP
+0x030 IoStatus :
+0x000 Status : Int4B
+0x000 Pointer : Ptr64 Void
+0x008 Information : Uint8B
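Notice that Status and Pointer both sit at offset +0x000, which means they overlap in memory (a union). As a rough illustration only, here is a small Python model of that layout using ctypes. The class name IoStatusBlock is made up for this sketch, it assumes a little-endian host (as x64 Windows is), and it is not how the kernel or Windbg defines anything:

```python
import ctypes

class IoStatusBlock(ctypes.Structure):
    """Hypothetical user-mode model of the 64-bit _IO_STATUS_BLOCK layout."""

    class _StatusOrPointer(ctypes.Union):
        # Both members start at offset 0, so they share the same bytes.
        _fields_ = [
            ("Status", ctypes.c_int32),    # Int4B in the dt output
            ("Pointer", ctypes.c_uint64),  # Ptr64 Void, modeled as a plain integer
        ]

    _anonymous_ = ("u",)
    _fields_ = [
        ("u", _StatusOrPointer),           # +0x000
        ("Information", ctypes.c_uint64),  # +0x008
    ]

iosb = IoStatusBlock()
iosb.Pointer = 0xC0000010  # write through one member of the union...

# ...and read the same bytes back through the other (little-endian host assumed).
print(iosb.Status)
print(hex(iosb.Pointer))
print(ctypes.sizeof(IoStatusBlock), IoStatusBlock.Information.offset)
```

Writing through Pointer and reading through Status gives the same bits back, just interpreted as a signed 32-bit number instead of a 64-bit one.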
Ok, that's the structure of things. Now let's see what it looks like for the faulting IRP in your example, using the IRP address in Arg3 of the bugcheck:
Code:
2: kd> dt !_IRP fffffa800a83fcf0
nt!_IRP
+0x000 Type : 0n6
+0x002 Size : 0x310
+0x008 MdlAddress : (null)
+0x010 Flags : 0x40000000
+0x018 AssociatedIrp : <unnamed-tag>
+0x020 ThreadListEntry : _LIST_ENTRY [ 0xfffffa80`0a83fd10 - 0xfffffa80`0a83fd10 ]
+0x030 IoStatus : _IO_STATUS_BLOCK
+0x040 RequestorMode : 0 ''
+0x041 PendingReturned : 0 ''
+0x042 StackCount : 8 ''
+0x043 CurrentLocation : 7 ''
+0x044 Cancel : 0 ''
+0x045 CancelIrql : 0 ''
+0x046 ApcEnvironment : 0 ''
+0x047 AllocationFlags : 0x80 ''
+0x048 UserIosb : (null)
+0x050 UserEvent : (null)
+0x058 Overlay : <unnamed-tag>
+0x068 CancelRoutine : (null)
+0x070 UserBuffer : (null)
+0x078 Tail : <unnamed-tag>
Let's break it down further:
2: kd> dt !_IRP IoStatus. fffffa800a83fcf0
nt!_IRP
+0x030 IoStatus :
+0x000 Status : 0n-1073741808
+0x000 Pointer : 0x00000000`c0000010 Void
+0x008 Information : 0
Pointer and Status hold the same value, just presented differently (one in signed decimal, one in hex); they share offset +0x000, so they overlap in memory. It's clear we're dealing with an error status code, because it fits the familiar pattern ("c000XXXX"). Let's look it up:
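If you'd like to see why the decimal and hex values match, and why a leading "c" marks an error, here is a minimal Python sketch that splits an NTSTATUS value into its bit fields (severity in the top two bits, then the customer bit, a reserved bit, a 12-bit facility and a 16-bit code). The function name decode_ntstatus is just something I made up for this example:

```python
SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}

def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS (signed or unsigned) into its bit fields."""
    raw = value & 0xFFFFFFFF  # turn the signed decimal form into unsigned
    return {
        "hex": f"0x{raw:08x}",
        "severity": SEVERITY_NAMES[raw >> 30],  # bits 31-30
        "customer": (raw >> 29) & 0x1,          # bit 29
        "facility": (raw >> 16) & 0xFFF,        # bits 27-16
        "code": raw & 0xFFFF,                   # bits 15-0
    }

# The signed decimal value that dt printed for IoStatus.Status
decoded = decode_ntstatus(-1073741808)
print(decoded)
```

The top hex digit "c" is binary 1100, so the two severity bits are both set, which is why so many error statuses start with c000.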
Code:
2: kd> !error c0000010
Error code: (NTSTATUS) 0xc0000010 (3221225488) - The specified request is not a valid operation for the target device.
Ah ha, so the status field that ultimately got passed was that the request was not valid for the target device. This is where we hit a crossroads in our analysis: we can either figure out why the request was not valid, or what the device was that didn't like it. Let's start with the device first, by looking up the IRP using !irp:
Code:
2: kd> !irp fffffa800a83fcf0 1
Irp is active with 8 stacks 7 is current (= 0xfffffa800a83ff70)
No Mdl: No System Buffer: Thread 00000000: Irp stack trace.
Flags = 40000000
ThreadListEntry.Flink = fffffa800a83fd10
ThreadListEntry.Blink = fffffa800a83fd10
IoStatus.Status = c0000010
IoStatus.Information = 00000000
RequestorMode = 00000000
Cancel = 00
CancelIrql = 0
ApcEnvironment = 00
UserIosb = 00000000
UserEvent = 00000000
Overlay.AsynchronousParameters.UserApcRoutine = 00000000
Overlay.AsynchronousParameters.UserApcContext = 00000000
Overlay.AllocationSize = 00000000 - 00000000
CancelRoutine = 00000000
UserBuffer = 00000000
&Tail.Overlay.DeviceQueueEntry = fffffa800a83fd68
Tail.Overlay.Thread = 00000000
Tail.Overlay.AuxiliaryBuffer = 00000000
Tail.Overlay.ListEntry.Flink = 00000000
Tail.Overlay.ListEntry.Blink = 00000000
Tail.Overlay.CurrentStackLocation = fffffa800a83ff70
Tail.Overlay.OriginalFileObject = 00000000
Tail.Apc = 00000000
Tail.CompletionKey = 00000000
cmd flg cl Device File Completion-Context
[ 0, 0] 0 2 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 ffffffffc0000010
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 17,ff] 0 2 fffffa800a38c060 00000000 00000000-00000000
\Driver\SaiMini
Args: fffffa800a384040 00000000 00000000 00000000
>[ 17,ff] 0 e0 fffffa800a38c060 00000000 fffff80003709da0-fffffa800a83ffb8 Success Error Cancel
\Driver\SaiMini nt!IovpInternalCompletionTrap
Args: fffffa800a384040 00000000 00000000 00000000
[ 17,ff] 0 e0 fffffa800a389aa0 00000000 fffff80003713240-fffff880031a1720 Success Error Cancel
\DRIVER\VERIFIER_FILTER nt!ViIrpSynchronousCompletionRoutine
Args: fffffa800a384040 00000000 00000000 00000000
I added "1" in order to give us extra details on the IRP, though it really tells us nothing different than what we got previously by dumping the _IRP data structure. The IoStatus.Status subfield is easily visible here. Evidently this is the easier way to check it, but I went the other route first to show how to view data structures properly.
Now, what we're looking for is the device, which is shown in the Device column of the stack locations. You'll notice there are two devices here. Let's start with the bottom one:
Code:
2: kd> !devobj fffffa800a389aa0
Device object (fffffa800a389aa0) is for:
\DRIVER\VERIFIER_FILTER DriverObject fffffa8008583cc0
Current Irp 00000000 RefCount 0 Type 00000022 Flags 00002010
DevExt fffffa800a389bf0 DevObjExt fffffa800a389c30
ExtensionFlags (0xc0000800) DOE_BOTTOM_OF_FDO_STACK, DOE_DESIGNATED_FDO
Unknown flags 0x00000800
AttachedTo (Lower) fffffa800a38c060 \Driver\SaiMini
Device queue is not busy.
This is the VERIFIER_FILTER device object, so we evidently needn't worry about it. It's obvious this is what started the IRP since Driver Verifier was involved, and it also shows that this IRP is probably a fake IRP created by DV to test drivers for bugs. We'll elaborate on that later. For now, judging from the "Current Irp" being null, it's no longer involved. Let's move to the next device object. Note that it is listed as the lower device in the device stack in relation to the one we're looking at now, meaning it's "closer" to the actual device inside the OS kernel doing the I/O than this DV device object is. Remember that an OS is designed to be the medium through which applications (software) interact with hardware to accomplish stuff. Anyways, let's move on:
Code:
2: kd> !devobj fffffa800a38c060
Device object (fffffa800a38c060) is for:
_HID00000001 \Driver\SaiMini DriverObject fffffa800a389e70
Current Irp fffffa800a83fcf0 RefCount 0 Type 00000022 Flags 00002050
Dacl fffff9a100083df1 DevExt fffffa800a38c1b0 DevObjExt fffffa800a38c648
ExtensionFlags (0xe0000800) DOE_RAW_FDO, DOE_BOTTOM_OF_FDO_STACK,
DOE_DESIGNATED_FDO
Unknown flags 0x00000800
AttachedDevice (Upper) fffffa800a389aa0 \DRIVER\VERIFIER_FILTER
AttachedTo (Lower) fffffa800a3893d0 \DRIVER\VERIFIER_FILTER
Device queue is not busy.
Getting warmer. We can tell this device object is for the SaiMini driver and is currently handling the faulting IRP. We can also tell from the "_HID00000001" that it's dealing with a HID (Human Interface Device) such as a mouse or keyboard. Now, for a bit of a bigger picture, let's look at the entire device stack this is involved with. Any of the device object addresses mentioned above will do:
Code:
2: kd> !devstack fffffa800a38c060
!DevObj !DrvObj !DevExt ObjectName
fffffa800a389aa0 \DRIVER\VERIFIER_FILTERfffffa800a389bf0
> fffffa800a38c060 \Driver\SaiMini fffffa800a38c1b0 _HID00000001
fffffa800a3893d0 \DRIVER\VERIFIER_FILTERfffffa800a389520
fffffa800a384040 \Driver\SaiNtBus fffffa800a384190
!DevNode fffffa800a386010 :
DeviceInst is "SaitekMagicBus\SaitekKeyboard\1&31a7fa5&0&0000"
ServiceName is "SaiMini"
All right! Just from looking at the device node info, we can easily determine this is a keyboard we're dealing with and not a mouse. We also get the big picture of the entire device stack involved with this IRP and how I/O flows. From the top, DV has a filter driver in place to watch activity between SaiMini and user land (the usermode environment: applications and services); then we have SaiMini, which I guess is a minidriver for the HID device (read up on these in Windows Internals), followed by yet another DV filter driver to verify activity between SaiMini and the lowest driver in the stack, SaiNtBus.
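To make the ordering concrete, here is a tiny Python sketch that treats the !devstack output as a top-to-bottom list and looks up the neighbors of a device object. The addresses and driver names are copied from the output above; the DEVICE_STACK list and the neighbors function are my own invention for illustration, not anything the debugger provides:

```python
# Top of the stack first, exactly as !devstack printed it.
DEVICE_STACK = [
    ("fffffa800a389aa0", r"\DRIVER\VERIFIER_FILTER"),
    ("fffffa800a38c060", r"\Driver\SaiMini"),
    ("fffffa800a3893d0", r"\DRIVER\VERIFIER_FILTER"),
    ("fffffa800a384040", r"\Driver\SaiNtBus"),
]

def neighbors(stack, devobj):
    """Return the (upper, lower) entries around devobj, or None at either end."""
    index = [address for address, _ in stack].index(devobj)
    upper = stack[index - 1] if index > 0 else None
    lower = stack[index + 1] if index + 1 < len(stack) else None
    return upper, lower

upper, lower = neighbors(DEVICE_STACK, "fffffa800a38c060")
print("AttachedDevice (Upper):", upper)
print("AttachedTo (Lower):", lower)
```

The result lines up with what !devobj reported for SaiMini: a VERIFIER_FILTER device object both above and below it.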
Ok, now that we've figured out the device involved, it's time to determine what the request was and why it was invalid. So we'll have to look back at the IRP, reprinted here for convenience:
Code:
2: kd> !irp fffffa800a83fcf0
Irp is active with 8 stacks 7 is current (= 0xfffffa800a83ff70)
No Mdl: No System Buffer: Thread 00000000: Irp stack trace.
cmd flg cl Device File Completion-Context
[ 0, 0] 0 2 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 ffffffffc0000010
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 17,ff] 0 2 fffffa800a38c060 00000000 00000000-00000000
\Driver\SaiMini
Args: fffffa800a384040 00000000 00000000 00000000
>[ 17,ff] 0 e0 fffffa800a38c060 00000000 fffff80003709da0-fffffa800a83ffb8 Success Error Cancel
\Driver\SaiMini nt!IovpInternalCompletionTrap
Args: fffffa800a384040 00000000 00000000 00000000
[ 17,ff] 0 e0 fffffa800a389aa0 00000000 fffff80003713240-fffff880031a1720 Success Error Cancel
\DRIVER\VERIFIER_FILTER nt!ViIrpSynchronousCompletionRoutine
Args: fffffa800a384040 00000000 00000000 00000000
The major (MJ) function code and the minor (MN) function code are what we're looking for, which are printed on the left of the !irp output: 17,ff, for major and minor respectively. Let's look them up in the Windbg help manual for !irp. The MJ function code turns out to be IRP_MJ_SYSTEM_CONTROL. Looking it up on MSDN, you'll find the respective article here. Judging by the description, it's a WMI function, so look back at the Windbg manual for !irp again, but in the WMI minor function code table (descriptions of its items are here). Now this is funny: our current MN code value of 0xff is not in the list. What's the deal? Time to look at the article in the previous link on the WMI minor IRP codes for an idea. It's from here we discover:
Drivers that do not register as WMI data providers must forward all WMI requests to the next-lower driver.
...
If the driver receives an IRP containing any other IRP minor function code [outside of what's listed], it should forward the IRP to the next-lower driver.
The second line there is what's really important. Since the MN code of 0xff doesn't match any of the listed options, any driver handling the IRP should not fiddle with it and should just pass it down the line in the device stack. However, remember the c0000010 error concerning the device not being able to understand the request? This is the request it's referring to. What appears to be happening is that DV is sending this fake request that should be passed down the line untouched, but SaiMini is interfering with it by saying it doesn't understand it, and is therefore passing it down with the altered IoStatus. That's exactly what is described in the bugcheck explanation. So, apparently, the driver wasn't coded properly to conform to the WMI rule that it should not tamper with IRPs it has no business touching, and therefore DV caught it and issued a bugcheck. Whether it's actually responsible for the person's dilemma, I'm not sure, but it is a valid bug in the code of this driver, and seeing that this and the other related Saitek drivers are from 2010, I'd say it's probably out of date, too. If there are no updates for it, the only options are probably to have them contact Saitek about this, or to uninstall the drivers and Saitek software for the keyboard and rely on the basic Windows drivers, which should suffice in most cases.
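Here's the whole idea boiled down to a toy Python model, in case it helps. To be clear, this is only a sketch of the rule and not real driver code (real drivers are written in C against the WDK). The Irp class, the handler functions and first_status_tamperer are all hypothetical names, and the initial STATUS_NOT_SUPPORTED value is my assumption about what the test IRP starts out with. The major and WMI minor code values are the ones from the WDK headers:

```python
IRP_MJ_SYSTEM_CONTROL = 0x17
STATUS_NOT_SUPPORTED = 0xC00000BB           # assumed initial status of the test IRP
STATUS_INVALID_DEVICE_REQUEST = 0xC0000010  # what ended up in IoStatus.Status

WMI_MINOR_CODES = {
    0x00: "IRP_MN_QUERY_ALL_DATA", 0x01: "IRP_MN_QUERY_SINGLE_INSTANCE",
    0x02: "IRP_MN_CHANGE_SINGLE_INSTANCE", 0x03: "IRP_MN_CHANGE_SINGLE_ITEM",
    0x04: "IRP_MN_ENABLE_EVENTS", 0x05: "IRP_MN_DISABLE_EVENTS",
    0x06: "IRP_MN_ENABLE_COLLECTION", 0x07: "IRP_MN_DISABLE_COLLECTION",
    0x08: "IRP_MN_REGINFO", 0x09: "IRP_MN_EXECUTE_METHOD",
    0x0B: "IRP_MN_REGINFO_EX",
}

def parse_codes(header):
    """Turn a stack-location header such as '[ 17,ff]' into (major, minor)."""
    major, minor = header.strip("[] ").split(",")
    return int(major, 16), int(minor, 16)

class Irp:
    """Toy IRP: just the function codes and the status field."""
    def __init__(self, major, minor, status):
        self.major, self.minor, self.status = major, minor, status

def forwards_untouched(irp):
    """Correct behavior: leave an IRP we don't understand alone."""

def fails_unknown_minor(irp):
    """The bug: overwrite the status of a WMI IRP with an unknown minor code."""
    if irp.major == IRP_MJ_SYSTEM_CONTROL and irp.minor not in WMI_MINOR_CODES:
        irp.status = STATUS_INVALID_DEVICE_REQUEST

def first_status_tamperer(stack, irp):
    """Pass the IRP down the stack; name the first driver that changes its status."""
    original = irp.status
    for name, handler in stack:
        handler(irp)
        if irp.status != original:
            return name
    return None

major, minor = parse_codes("[ 17,ff]")
stack = [("VERIFIER_FILTER", forwards_untouched),
         ("SaiMini", fails_unknown_minor),
         ("SaiNtBus", forwards_untouched)]
culprit = first_status_tamperer(stack, Irp(major, minor, STATUS_NOT_SUPPORTED))
print(culprit)  # SaiMini
```

Swap fails_unknown_minor for forwards_untouched in the SaiMini slot and the function returns None, which is the well-behaved case the WMI documentation describes.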
Hope that helps. I too didn't understand much of this bugcheck until I did a little bit of research. Took me a couple hours but now I know a bit more about IRP handling which will help me in the future (unless I'm misguided somehow!).