Moderator, BSOD Kernel Dump Expert, Contributor
- May 7, 2013
- Minkowski Space
The Complete Debugging Guide to Stop 0x124 - Part 1
The Stop 0x124 is mostly caused by hardware and, in some exceptional cases, can be caused by buggy device drivers. There isn't much of a methodology for debugging a Stop 0x124, but there is plenty of background information which is useful for understanding the terminology seen in a Stop 0x124 bugcheck.
If a Stop 0x124 fails to be created successfully, a Stop 0x122 is usually produced instead. A debugging tutorial for Stop 0x122 can be found here - http://www.sysnative.com/forums/bsod-kernel-dump-analysis-debugging-information/13977-debugging-stop-0x122-whea_internal_error.html
WHEA (Windows Hardware Error Architecture) was introduced with Windows Vista and Windows Server 2008 to provide an error reporting system which would make debugging more effective, and to take precedence over the MCA (Machine Check Architecture) as the primary error reporting architecture for hardware devices. MCA and MCE do still exist on Windows Vista and later operating systems, but are delivered through WHEA instead.
Structure of WHEA:
WHEA consists of a number of different components; the main ones are LLHEHs (Low-Level Hardware Error Handlers), PSHEDs (Platform-Specific Hardware Error Drivers) and WHEA error records. The following diagram obtained from the Microsoft documentation provides an overview of how these components interact with the rest of the operating system:
The LLHEH is the first component to handle the error discovered by the error source. Error sources will be discussed later in this guide, but for now, I will simply mention that the error source is the hardware component which discovered the hardware error, which is not necessarily where the error originated. The following flow diagram will hopefully help to illustrate the entire WHEA process.
Hardware Error -> Error Source Alerts OS -> LLHEH for corresponding error source is invoked -> Error Packet is created -> Error Packet is processed into an Error Record -> Error Record is processed by PSHED -> Bugcheck is produced
It is important to note that the above flow diagram is rather crude and doesn't show the details of each step involved in the WHEA bugchecking process. Please note it also only illustrates what happens with a fatal hardware error, that is, an error which always leads to a bugcheck.
I will now begin to discuss Error Sources, and their purpose within a WHEA bugcheck. To begin, we need to understand and identify that the first parameter of the Stop 0x124 is the value of the error source.
2: kd> .bugcheck
Bugcheck code 00000124
Arguments 00000000`00000000 fffffa80`04ba6028 00000000`be000000 00000000`00800400
2: kd> dt nt!_WHEA_ERROR_SOURCE_TYPE
   WheaErrSrcTypeMCE = 0n0
   WheaErrSrcTypeCMC = 0n1
   WheaErrSrcTypeCPE = 0n2
   WheaErrSrcTypeNMI = 0n3
   WheaErrSrcTypePCIe = 0n4
   WheaErrSrcTypeGeneric = 0n5
   WheaErrSrcTypeINIT = 0n6
   WheaErrSrcTypeBOOT = 0n7
   WheaErrSrcTypeSCIGeneric = 0n8
   WheaErrSrcTypeIPFMCA = 0n9
   WheaErrSrcTypeIPFCMC = 0n10
   WheaErrSrcTypeIPFCPE = 0n11
   WheaErrSrcTypeMax = 0n12
2: kd> .frame /r 3
03 fffff880`02f6db00 fffff800`02c26052 hal!HalpMcaReportError+0x4c
rax=0000000000000000 rbx=fffffa8004c17ea0 rcx=0000000000000124
rdx=0000000000000000 rsi=fffff88002f6de00 rdi=fffffa8004c17ef0
rip=fffff80002c26700 rsp=fffff88002f6db00 rbp=fffff88002f6de30
 r8=fffffa8004ba6028  r9=00000000be000000 r10=0000000000800400
r11=0000000000000002 r12=00000000ffffff02 r13=0000000000000000
r14=0000000000000000 r15=0000000000000001
iopl=0 ov up ei pl nz na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00000a06
hal!HalpMcaReportError+0x4c:
fffff800`02c26700 488b8c2430010000 mov rcx,qword ptr [rsp+130h] ss:0018:fffff880`02f6dc30=ffff00906cfd8774
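To make the link between the first bugcheck parameter and the error source concrete, here is a minimal Python sketch which decodes the Arguments line from the .bugcheck output above. The enumeration values are copied from the dt nt!_WHEA_ERROR_SOURCE_TYPE output; the helper names (parse_bugcheck_arguments, error_source_name) are hypothetical and only exist for this illustration, they are not part of any debugger API.

```python
from enum import IntEnum


class WheaErrorSourceType(IntEnum):
    """Values copied from the dt nt!_WHEA_ERROR_SOURCE_TYPE output."""
    WheaErrSrcTypeMCE = 0
    WheaErrSrcTypeCMC = 1
    WheaErrSrcTypeCPE = 2
    WheaErrSrcTypeNMI = 3
    WheaErrSrcTypePCIe = 4
    WheaErrSrcTypeGeneric = 5
    WheaErrSrcTypeINIT = 6
    WheaErrSrcTypeBOOT = 7
    WheaErrSrcTypeSCIGeneric = 8
    WheaErrSrcTypeIPFMCA = 9
    WheaErrSrcTypeIPFCMC = 10
    WheaErrSrcTypeIPFCPE = 11


def parse_bugcheck_arguments(line):
    """Convert an 'Arguments ...' line from .bugcheck into a list of integers."""
    tokens = line.split()
    # tokens[0] is the word 'Arguments'; the backtick is only a visual separator.
    return [int(token.replace("`", ""), 16) for token in tokens[1:]]


def error_source_name(arguments):
    """Name of the error source, taken from the first bugcheck parameter."""
    return WheaErrorSourceType(arguments[0]).name


arguments = parse_bugcheck_arguments(
    "Arguments 00000000`00000000 fffffa80`04ba6028 "
    "00000000`be000000 00000000`00800400"
)
print(error_source_name(arguments))
print(hex(arguments[1]))
```

For the dump shown above, the first parameter is 0, which maps to WheaErrSrcTypeMCE. The same values can also be seen in the register dump of frame 3, where rcx holds the bugcheck code (0x124) and rdx, r8 and r9 hold the first three parameters.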
As mentioned previously, an LLHEH will produce an error packet, which in turn can be investigated with the debugger.
Each error packet is represented by the WHEA_ERROR_PACKET macro, and there are currently two different versions: WHEA_ERROR_PACKET_V1 and WHEA_ERROR_PACKET_V2. The V1 type is supported by Windows Vista SP1 and Windows Server 2008; V2 is supported by Windows 7 and all later operating systems.
The only difference between the two structures is the Signature member. The Signature member takes the value of WHEA_ERROR_PACKET_V2_SIGNATURE for Version 2 or WHEA_ERROR_PACKET_V1_SIGNATURE for Version 1. Since Windows Vista systems are pretty much obsolete now, there isn't any real reason to bother examining the Version 1 structure.
2: kd> dt _WHEA_ERROR_PACKET_V2
nt!_WHEA_ERROR_PACKET_V2
   +0x000 Signature : Uint4B
   +0x004 Version : Uint4B
   +0x008 Length : Uint4B
   +0x00c Flags : _WHEA_ERROR_PACKET_FLAGS
   +0x010 ErrorType : _WHEA_ERROR_TYPE
   +0x014 ErrorSeverity : _WHEA_ERROR_SEVERITY
   +0x018 ErrorSourceId : Uint4B
   +0x01c ErrorSourceType : _WHEA_ERROR_SOURCE_TYPE
   +0x020 NotifyType : _GUID
   +0x030 Context : Uint8B
   +0x038 DataFormat : _WHEA_ERROR_PACKET_DATA_FORMAT
   +0x03c Reserved1 : Uint4B
   +0x040 DataOffset : Uint4B
   +0x044 DataLength : Uint4B
   +0x048 PshedDataOffset : Uint4B
   +0x04c PshedDataLength : Uint4B
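The offsets in the dt output above are enough to read the fixed header of an error packet from raw bytes. The following Python sketch is only an illustration of that layout: it assumes a little-endian x64 dump, treats the 4-byte Flags, ErrorType, ErrorSeverity, ErrorSourceType and DataFormat members as plain integers, and uses a hypothetical helper name (parse_error_packet_v2). The sample bytes at the end are synthetic and were not taken from a real dump.

```python
import struct
import uuid
from collections import namedtuple

# Layout taken from the dt _WHEA_ERROR_PACKET_V2 output:
# eight 4-byte members (0x00-0x1c), a 16-byte GUID (0x20),
# an 8-byte Context (0x30) and six more 4-byte members (0x38-0x4c).
PACKET_V2_FORMAT = "<8I16sQ6I"

WheaErrorPacketV2 = namedtuple(
    "WheaErrorPacketV2",
    "Signature Version Length Flags ErrorType ErrorSeverity ErrorSourceId "
    "ErrorSourceType NotifyType Context DataFormat Reserved1 DataOffset "
    "DataLength PshedDataOffset PshedDataLength",
)


def parse_error_packet_v2(raw):
    """Unpack the fixed 0x50-byte header of a V2 error packet."""
    packet = WheaErrorPacketV2(*struct.unpack_from(PACKET_V2_FORMAT, raw, 0))
    # A Windows GUID stores its first three fields little-endian.
    return packet._replace(NotifyType=uuid.UUID(bytes_le=packet.NotifyType))


# Synthetic example bytes; every value here is made up for the demonstration.
sample = struct.pack(
    PACKET_V2_FORMAT,
    0x11111111, 2, 0x50, 0, 0, 1, 0, 0,   # Signature .. ErrorSourceType
    bytes(16),                            # NotifyType (all-zero GUID)
    0,                                    # Context
    0, 0, 0x50, 0, 0x50, 0,               # DataFormat .. PshedDataLength
)

packet = parse_error_packet_v2(sample)
print(hex(struct.calcsize(PACKET_V2_FORMAT)))
print(packet.ErrorType, packet.ErrorSourceType, hex(packet.DataOffset))
```

The header size computed from the format string (0x50) matches the end of the last member shown by dt (0x04c + 4), which is a quick sanity check that no member was skipped.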
The ErrorType field contains a value from the WHEA_ERROR_TYPE enumeration, which describes the type of hardware which reported the error.
2: kd> dt nt!_WHEA_ERROR_TYPE
   WheaErrTypeProcessor = 0n0
   WheaErrTypeMemory = 0n1
   WheaErrTypePCIExpress = 0n2
   WheaErrTypeNMI = 0n3
   WheaErrTypePCIXBus = 0n4
   WheaErrTypePCIXDevice = 0n5
   WheaErrTypeGeneric = 0n6
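When reading an ErrorType value out of an error packet by hand, a small lookup table saves checking the enumeration each time. This is a minimal sketch using the values from the dt nt!_WHEA_ERROR_TYPE output above; the function name describe_error_type is hypothetical, and the fallback for unknown values is my own choice rather than anything WHEA defines.

```python
# Values copied from the dt nt!_WHEA_ERROR_TYPE output.
WHEA_ERROR_TYPE_NAMES = {
    0: "WheaErrTypeProcessor",
    1: "WheaErrTypeMemory",
    2: "WheaErrTypePCIExpress",
    3: "WheaErrTypeNMI",
    4: "WheaErrTypePCIXBus",
    5: "WheaErrTypePCIXDevice",
    6: "WheaErrTypeGeneric",
}


def describe_error_type(value):
    """Return the enumeration name for an ErrorType value, or a marker if unknown."""
    return WHEA_ERROR_TYPE_NAMES.get(value, "unknown (0x%x)" % value)


print(describe_error_type(0))
print(describe_error_type(2))
print(describe_error_type(42))
```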