1. #1

    Double Fault

    I was recently sent a pretty neat kernel-dump by my good friend Jared. I've always wanted to go into double faults, so let's get started! Thanks, Jared : )

    Code:
    UNEXPECTED_KERNEL_MODE_TRAP (7f)
    This means a trap occurred in kernel mode, and it's a trap of a kind
    that the kernel isn't allowed to have/catch (bound trap) or that
    is always instant death (double fault).
    Arguments:
    Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT
    Arg2: 0000000080050033
    Arg3: 00000000000406f8
    Arg4: fffff800032aa875
    In our case, the first argument is 8, which indicates that a double fault occurred. So, what is a double fault, and when and why does one occur?
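
To make the Arg1 lookup concrete, here is a minimal Python sketch. The table uses the standard x86/x64 exception vector numbers; the names `TRAP_NAMES` and `describe_trap` are made up for this example and are not part of any debugger API.

```python
# Illustrative lookup of the trap number reported in Arg1 of bugcheck 0x7F.
# Vector numbers follow the x86/x64 exception table; the labels are informal.
TRAP_NAMES = {
    0x0: "Divide Error (#DE)",
    0x4: "Overflow (#OF)",
    0x5: "BOUND Range Exceeded (#BR)",
    0x6: "Invalid Opcode (#UD)",
    0x8: "Double Fault (#DF)",
    0xD: "General Protection (#GP)",
    0xE: "Page Fault (#PF)",
}

def describe_trap(arg1: int) -> str:
    """Return a readable name for the trap number, or a fallback string."""
    return TRAP_NAMES.get(arg1, f"unknown trap 0x{arg1:x}")

# Arg1 exactly as printed by the debugger (hexadecimal text).
print(describe_trap(int("0000000000000008", 16)))
```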

    A double fault occurs when the CPU hits a second exception while it is still trying to call the handler for an earlier one. In most cases, two exceptions raised back to back are simply handled one after the other. In some cases they cannot be: for example, if a page fault occurs and the page fault handler itself is located in a not-present page, the attempt to deliver the first page fault raises a second one, and neither of them can be handled. This is known as a double fault! Double faults can also occur (as in this scenario) when the processor cannot properly service a pending interrupt.
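
The "handled separately versus double fault" distinction can be sketched in Python. This is a simplified model of the benign / contributory / page fault classification described in the processor manuals, not kernel code; the function names are invented for illustration and newer exception types are left out.

```python
# Sketch of the rule that decides whether a second exception, raised while
# the CPU is delivering a first one, escalates to a double fault (#DF).
CONTRIBUTORY = {0, 10, 11, 12, 13}   # #DE, #TS, #NP, #SS, #GP
PAGE_FAULT = {14}                    # #PF

def exception_class(vector: int) -> str:
    if vector in CONTRIBUTORY:
        return "contributory"
    if vector in PAGE_FAULT:
        return "page_fault"
    return "benign"

def is_double_fault(first: int, second: int) -> bool:
    a, b = exception_class(first), exception_class(second)
    if a == "contributory":
        return b == "contributory"
    if a == "page_fault":
        return b in ("contributory", "page_fault")
    return False  # a benign first exception is always handled serially

print(is_double_fault(14, 14))  # page fault while delivering a page fault
print(is_double_fault(14, 3))   # breakpoint after a page fault
```

The first call models the not-present handler page case from the paragraph above; the second models two exceptions that are simply handled one after the other.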

    Code:
    4: kd> k
    Child-SP          RetAddr           Call Site
    fffff880`009b9de8 fffff800`0328b169 nt!KeBugCheckEx
    fffff880`009b9df0 fffff800`03289632 nt!KiBugCheckDispatch+0x69
    fffff880`009b9f30 fffff800`032aa875 nt!KiDoubleFaultAbort+0xb2 <- Uh oh, double fault!
    fffff880`03dccfd0 fffff800`032909ba nt!KiIpiSendRequest+0x305 <- Processor #4 sent an inter-processor interrupt to interrupt another processor saying "Hey, we need to flush the TLB."
    fffff880`03dcd090 fffff800`032ec198 nt!KeFlushMultipleRangeTb+0x22a <- Flushing translation lookaside buffer, this is a multiprocessor job.
    fffff880`03dcd160 fffff800`033935ea nt! ?? ::FNODOBFM::`string'+0x204ce
    fffff880`03dcd350 fffff800`03394be7 nt!MiEmptyWorkingSet+0x24a <- Removing as many pages as possible from the working set.
    fffff880`03dcd400 fffff800`0372f371 nt!MiTrimAllSystemPagableMemory+0x218 <- Unmapping all pageable system memory.
    fffff880`03dcd460 fffff800`0372f4cf nt!MmVerifierTrimMemory+0xf1
    fffff880`03dcd490 fffff800`0372fc24 nt!ViKeRaiseIrqlSanityChecks+0xcf  <- A sanity check is essentially verifier saying "Okay, what IRQL are we on and are we supposed to be here?"
    fffff880`03dcd4d0 fffff880`018443f5 nt!VerifierKeAcquireSpinLockRaiseToDpc+0x54 <- IRST raising IRQL to DISPATCH_LEVEL (2) and then acquiring a spin lock.
    fffff880`03dcd530 fffff880`018222a2 iaStor+0x253f5 <- Intel Rapid Storage Technology
    fffff880`03dcd560 fffff880`01871489 iaStor+0x32a2 <- Intel Rapid Storage Technology
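
For anyone who wants to pick the third-party modules out of a stack like this automatically, here is a rough Python sketch. It assumes the three-column `k` layout shown above with the annotations removed; the helper names and the short list of modules treated as OS components are my own assumptions.

```python
import re

# A few frames copied from the stack above (annotations removed).
STACK = """\
fffff880`009b9f30 fffff800`032aa875 nt!KiDoubleFaultAbort+0xb2
fffff880`03dccfd0 fffff800`032909ba nt!KiIpiSendRequest+0x305
fffff880`03dcd4d0 fffff880`018443f5 nt!VerifierKeAcquireSpinLockRaiseToDpc+0x54
fffff880`03dcd530 fffff880`018222a2 iaStor+0x253f5
fffff880`03dcd560 fffff880`01871489 iaStor+0x32a2
"""

# Child-SP, RetAddr, then the call site (everything to the end of the line).
FRAME = re.compile(r"^([0-9a-f`]+)\s+([0-9a-f`]+)\s+(.+)$")

def parse_frames(text):
    """Return a list of (module, call_site) tuples, one per stack frame."""
    frames = []
    for line in text.splitlines():
        m = FRAME.match(line.strip())
        if m:
            site = m.group(3)
            # The module name ends at '!' (symbol) or '+' (raw offset).
            module = re.split(r"[!+]", site, maxsplit=1)[0]
            frames.append((module, site))
    return frames

def third_party_modules(frames, os_modules=("nt", "hal")):
    """List modules on the stack that are not in the assumed OS set."""
    seen = []
    for module, _ in frames:
        if module not in os_modules and module not in seen:
            seen.append(module)
    return seen

print(third_party_modules(parse_frames(STACK)))
```
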
    Code:
    4: kd> ub nt!KiIpiSendRequest+0x305
    nt!KiIpiSendRequest+0x2eb:
    fffff800`032aa85b 5e              pop     rsi
    fffff800`032aa85c 5d              pop     rbp
    fffff800`032aa85d c3              ret
    fffff800`032aa85e 8bc6            mov     eax,esi
    fffff800`032aa860 e9e2feffff      jmp     nt!KiIpiSendRequest+0x1d7 (fffff800`032aa747)
    fffff800`032aa865 0fb70db4892100  movzx   ecx,word ptr [nt!KeActiveProcessors (fffff800`034c3220)]
    fffff800`032aa86c 0fb705af892100  movzx   eax,word ptr [nt!KeActiveProcessors+0x2 (fffff800`034c3222)]
    fffff800`032aa873 8bfa            mov     edi,edx
    By unassembling backwards from nt!KiIpiSendRequest+0x305, it looks like there's a check for active processors, and then the attempt to send the IPI.

    Code:
    4: kd> !ipi
    IPI State for Processor 0
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 1
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 2
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 3
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 4
        TargetCount          0  PacketBarrier        0  IpiFrozen     0 [Running]
    
    
    IPI State for Processor 5
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 6
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    
    
    IPI State for Processor 7
        TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
    By running !ipi we can check the inter-processor interrupt state for every processor on the box. We can see here that every single processor except #4 is in a frozen state (idle), so our IPI is never going to be serviced; it will remain pending, and we're going to double fault.
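
On a box with many processors it can be handy to summarize the !ipi output rather than read it by eye. The Python sketch below assumes the text layout shown above; `ipi_states` is a hypothetical helper, not a debugger command.

```python
import re

# Abbreviated !ipi output in the same layout as the dump above.
IPI_OUTPUT = """\
IPI State for Processor 3
    TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
IPI State for Processor 4
    TargetCount          0  PacketBarrier        0  IpiFrozen     0 [Running]
IPI State for Processor 5
    TargetCount          0  PacketBarrier        0  IpiFrozen     2 [Frozen]
"""

def ipi_states(text):
    """Map processor number -> state label ('Frozen', 'Running', ...)."""
    states = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*IPI State for Processor (\d+)", line)
        if header:
            current = int(header.group(1))
            continue
        state = re.search(r"IpiFrozen\s+\d+\s+\[(\w+)\]", line)
        if state and current is not None:
            states[current] = state.group(1)
    return states

states = ipi_states(IPI_OUTPUT)
running = [cpu for cpu, label in states.items() if label == "Running"]
print(states, running)
```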

    Code:
    4: kd> lmvm iastor
    start             end                 module name
    fffff880`0181f000 fffff880`01bc3000   iaStor     (no symbols)           
        Loaded symbol image file: iaStor.sys
        Image path: \SystemRoot\system32\DRIVERS\iaStor.sys
        Image name: iaStor.sys
        Timestamp:        Wed Feb 01 19:15:24 2012
    The IRST driver dates from early 2012, which is likely the problem: it is a notoriously problematic driver, and it gets worse as it ages. A newer version would likely solve it, but honestly, I usually recommend that a user safely remove this driver and replace it with the standard MSFT driver if they aren't running a RAID setup. Kaspersky was also present on this system, and antivirus suites don't tend to play nice with this software either.
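
If you check driver dates often, the Timestamp line from lmvm can be turned into an age with a few lines of Python. This is only a sketch: `driver_age_days` is a made-up helper, and the reference date below is a placeholder, so substitute the date of the crash you are looking at.

```python
from datetime import datetime

def driver_age_days(timestamp: str, reference: datetime) -> int:
    """Whole days between the driver's build timestamp and a reference date."""
    # Format of the lmvm "Timestamp:" field, e.g. "Wed Feb 01 19:15:24 2012".
    built = datetime.strptime(timestamp, "%a %b %d %H:%M:%S %Y")
    return (reference - built).days

# Hypothetical reference date, chosen only for the example.
print(driver_age_days("Wed Feb 01 19:15:24 2012", datetime(2015, 1, 1)))
```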

    This post also shows how helpful Driver Verifier is. Without it in this specific scenario, we likely would have had no idea what was causing the crash, and might have interpreted it as a hardware problem.

    Thanks for reading!



  2. #2
    x BlueRobot's Avatar
    Join Date
    May 2013
    Location
    Minkowski Space
    Posts
    1,590

    Re: Double Fault

    Just to add, there are also Triple Faults, which occur when an exception is raised while the CPU is trying to invoke the Double Fault handler. A Triple Fault results in a CPU reset and a reboot of the entire computer.
    Machines Can Think

    I am currently studying again, and therefore may not be available very often.


  3. #3

    Re: Double Fault

    Yep!

    Triple Faults are mainly caused by buffer overflows (or underflows) in 3rd party drivers which lead to writing over the Interrupt Descriptor Table (IDT). The Triple Fault itself actually occurs when the next interrupt fires and the CPU cannot call the interrupt handler or the double fault handler because the IDT descriptors are now corrupted.

    Do you know if the shutdown cycle occurs on x64 as well? I know it definitely does (the CPU reset) on x86.

  4. #4
    Jared's Avatar
    Join Date
    Feb 2014
    Age
    20
    Posts
    1,568
    • specs System Specs
      • Manufacturer:
        Custom
      • Motherboard:
        ASUS Maximus VII Ranger
      • CPU:
        i7 4790K @ 4.4GHz
      • Memory:
        Corsair Vengeance 16GB 1866MHz
      • Graphics:
        MSI Gaming 4G GTX 980
      • Sound Card:
        Creative Soundblaster ZxR
      • Hard Drives:
        Samsung 850 SSD 250GB
      • Disk Drives:
        Western Digital Black Caviar 2TB
      • Power Supply:
        Corsair RM650 Modular 650 Watts
      • Case:
        Fractal Design Define R5 Window
      • Cooling:
        Corsair H100i GTX
      • Display:
        Dell U2515H 25inch 2560x1440 + LG Flatron M2262D 22inch 1920x1080
      • Operating System:
        Windows 10 Professional x64

    Re: Double Fault

    Good post. I was thinking of doing something similar when I had the time; I might write it up on my blog.

    One question, what exactly is the nt! ?? ::FNODOBFM::`string'+0x204ce function?
    I'm guessing it's a user mode function but I can't say for sure.
    I've seen those strings on call stacks quite a lot. I remember reading what it was, but I can't recall the answer.

    AFAIK it does still initiate a shutdown cycle on x64 systems.

  5. #5

    Re: Double Fault

    One question, what exactly is the nt! ?? ::FNODOBFM::`string'+0x204ce function?
    Wow, I cannot believe I actually found my post from nearly five months ago where I talked about this.

    nt! ?? ::FNODOBFM::`string' - TO MY KNOWLEDGE, the debugger (WinDbg) is slightly confused about symbol names in the kernel image (the nt module) because the binary has been reorganized into function chunks. The functions are no longer contiguous in memory. Hot code paths are clustered together with hot code paths of other functions. “Cold” code paths are moved elsewhere. That way you save on paging I/O by maximizing the amount of relevant code on each code page.

    Essentially, to my understanding, when a function is compiled, it normally occupies a single contiguous chunk of memory. The optimizer, however, can spread the executable code all over the place, replacing the inline code with a jump to some other memory location.

    When this happens, to my knowledge, FunctionName+Offset no longer equals FunctionAddress+Offset, so the information the debugger prints isn't correct. In these specific cases, the code is moved to a location (which is random, to my knowledge) where the closest symbolic name is a string in the image. The debugger (WinDbg) then uses that string as its best guess at a name for the return address on the stack.
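
To illustrate the "closest symbolic name" idea, here is a small Python sketch of a nearest-preceding-symbol lookup. The symbol table, addresses, and function names are all invented for the example, and this is only my guess at the general approach, not how WinDbg is actually implemented.

```python
import bisect

# Hypothetical symbol table: (start address, name). All values are invented.
SYMBOLS = sorted([
    (0x1000, "HotFunction"),
    (0x1400, "OtherFunction"),
    (0x9000, "`string'"),   # a string literal that happens to carry a symbol
])
ADDRESSES = [addr for addr, _ in SYMBOLS]

def nearest_symbol(address: int) -> str:
    """Name an address as the closest preceding symbol plus an offset."""
    i = bisect.bisect_right(ADDRESSES, address) - 1
    if i < 0:
        return f"0x{address:x}"   # nothing precedes this address
    start, name = SYMBOLS[i]
    return f"{name}+0x{address - start:x}"

# Code inside the function's contiguous body resolves sensibly.
print(nearest_symbol(0x1010))
# A relocated cold chunk lands after the string symbol, so the name is misleading.
print(nearest_symbol(0x9000 + 0x204ce))
```

The second lookup shows why a large offset from a `string' symbol is a hint that the name is a best guess rather than the real function.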

  6. #6
    x BlueRobot's Avatar

    Re: Double Fault

    Quote Originally Posted by Jared View Post
    AFAIK it does still initiate a shutdown cycle on x64 systems.
    I thought it would, wouldn't really make any sense otherwise.


  7. #7
    jcgriff2's Avatar
    Join Date
    Feb 2012
    Location
    New Jersey Shore
    Posts
    14,528

    Re: Double Fault

    Great thread!

    Quote Originally Posted by Jared View Post
    One question, what exactly is the nt! ?? ::FNODOBFM::`string'+0x204ce function?
    I was told by a crash expert from Microsoft years ago that it was a CPU instruction.



    A few years ago under Vista and early Windows 7, almost always when a BSOD had the double-fault bugcheck 0x7f (0x8,,,), the first thing we looked for was to see if COMODO was installed. All it took was a quick check for inspect.sys.

    No idea why, but COMODO was responsible for the majority of double-fault BSODs that we saw during that time period.


  8. #8
    Jared's Avatar

    Re: Double Fault

    Interesting, I wonder if there's a way we could find out what the instruction is...

  9. #9

    Re: Double Fault

    Probably with internal Microsoft symbols. If it's a function, it looks the way it does to us because we don't have the private symbols needed to resolve the function name.
