Windows custom driver freezes system with 100% CPU

bharatgade (New member, joined Apr 1, 2019, 1 post)
A kernel-level driver is installed on a terminal server. It works fine for a certain period of time, but later the terminal server itself gets into a frozen state in which nobody can connect to it through RDP or the web console. In my case, the CPU is always at 100% in the frozen state, and the only way out is a hard reboot using the VM "power off" option. After uninstalling the driver, the terminal server always works fine and responds properly. Even when it hits 100% CPU usage and gets slow, it still responds to RDP and the web console.

This scenario is hard to reproduce, but I managed to capture a complete memory dump from the machine while it was frozen, and I analyzed it with Microsoft's WinDbg tool. WinDbg reported the faulty driver module name and the call stack below.

Module Name: MMTEProxy (Installed Driver)

0: kd> !analyze -v
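*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************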

NMI_HARDWARE_FAILURE (80)
This is typically due to a hardware malfunction. The hardware supplier should
be called.
Arguments:
Arg1: 00000000004f4454
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:
------------------
KEY_VALUES_STRING: 1

PROCESSES_ANALYSIS: 1

SERVICE_ANALYSIS: 1

STACKHASH_ANALYSIS: 1

TIMELINE_ANALYSIS: 1

DUMP_CLASS: 1

DUMP_QUALIFIER: 402

BUILD_VERSION_STRING: 9600.17415.amd64fre.winblue_r4.141028-1500

SYSTEM_MANUFACTURER: VMware, Inc.

VIRTUAL_MACHINE: VMware

SYSTEM_PRODUCT_NAME: VMware Virtual Platform

SYSTEM_VERSION: None

BIOS_VENDOR: Phoenix Technologies LTD

BIOS_VERSION: 6.00

BIOS_DATE: 04/05/2016

BASEBOARD_MANUFACTURER: Intel Corporation

BASEBOARD_PRODUCT: 440BX Desktop Reference Platform

BASEBOARD_VERSION: None

DUMP_TYPE: 0

BUGCHECK_P1: 4f4454

BUGCHECK_P2: 0

BUGCHECK_P3: 0

BUGCHECK_P4: 0

CPU_COUNT: 2

CPU_MHZ: bb8

CPU_VENDOR: GenuineIntel

CPU_FAMILY: 6

CPU_MODEL: 3e

CPU_STEPPING: 4

CPU_MICROCODE: 6,3e,4,0 (F,M,S,R) SIG: 42C'00000000 (cache) 42C'00000000 (init)

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: 0x80

PROCESS_NAME: svchost.exe

CURRENT_IRQL: 0

ANALYSIS_SESSION_HOST: INPN01LAP107

ANALYSIS_SESSION_TIME: 03-26-2019 16:30:13.0120

ANALYSIS_VERSION: 10.0.18317.1001 amd64fre

LAST_CONTROL_TRANSFER: from fffff8005ae205b2 to fffff8009a6601a7

STACK_TEXT:
nt!KxWaitForLockOwnerShip+0x27
MMTEProxy!SVSessionLutTranslatePort+0x2c2 [c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\sessionlut.c @ 873]
MMTEProxy!PerformProxySocketRedirection+0xba7 [c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\filteralebindredirect.c @ 247]
MMTEProxy!TriggerProxyByALERedirectInline+0x244 [c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\filteralebindredirect.c @ 690]
MMTEProxy!DDProxyBindRedirectClassify+0x537 [c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\filteralebindredirect.c @ 881]

THREAD_SHA1_HASH_MOD_FUNC: 03f7fb5fd041c46c9b4dff8f1685ccff753d3642

THREAD_SHA1_HASH_MOD_FUNC_OFFSET: 7f4a5e830d38804e610244f134268d53640c97a0

THREAD_SHA1_HASH_MOD: 2a8f232a3e3c38ad2a6b44b0d2253b97c2ac4b2a

FOLLOWUP_IP:
MMTEProxy!SVSessionLutTranslatePort+2c2 [c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\sessionlut.c @ 873]
fffff800`5ae205b2 c644244000 mov byte ptr [rsp+40h],0

FAULT_INSTR_CODE: 402444c6

FAULTING_SOURCE_LINE: c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\sessionlut.c

FAULTING_SOURCE_FILE: c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\sessionlut.c

FAULTING_SOURCE_LINE_NUMBER: 873

FAULTING_SOURCE_CODE:
No source found for 'c:\users\dkelone\git\MMTE\MMTE\MMTEdriver\sessionlut.c'

SYMBOL_STACK_INDEX: 1

SYMBOL_NAME: MMTEProxy!SVSessionLutTranslatePort+2c2

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: MMTEProxy

IMAGE_NAME: MMTEProxy.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 5a60d5f0

STACK_COMMAND: .thread ; .cxr ; kb

BUCKET_ID_FUNC_OFFSET: 2c2

FAILURE_BUCKET_ID: 0x80_MMTEProxy!SVSessionLutTranslatePort

BUCKET_ID: 0x80_MMTEProxy!SVSessionLutTranslatePort

PRIMARY_PROBLEM_CLASS: 0x80_MMTEProxy!SVSessionLutTranslatePort

TARGET_TIME: 2019-02-26T11:15:36.000Z

OSBUILD: 9600

OSSERVICEPACK: 0

SERVICEPACK_NUMBER: 0

OS_REVISION: 0

SUITE_MASK: 16

PRODUCT_TYPE: 3

OSPLATFORM_TYPE: x64

OSNAME: Windows 8.1

OSEDITION: Windows 8.1 Server TerminalServer

OS_LOCALE:

USER_LCID: 0

OSBUILD_TIMESTAMP: 2014-10-29 06:08:48

BUILDDATESTAMP_STR: 141028-1500

BUILDLAB_STR: winblue_r4

BUILDOSVER_STR: 6.3.9600.17415.amd64fre.winblue_r4.141028-1500

ANALYSIS_SESSION_ELAPSED_TIME: 685

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:0x80_MMTEProxy!svsessionluttranslateport

FAILURE_ID_HASH: {c64b7e97-0bf3-daf1-ad95-9f39cbf37a9a}

Followup: MachineOwner
---------

I am not an expert in kernel-level driver development, but I tried to research how the driver works. Internally it uses the following lock for any operation on the process table or the session table:


#Code snippet

{
    ...
    ...

    /* Take the queued spin lock that guards the process table. */
    KeAcquireInStackQueuedSpinLock(&gProcessTableLock, &processTableLockHandle);

    /* Walk the circular list, starting at the head and stopping when it comes back around. */
    tempNode = processTableListHead;

    while (processTableListHead != tempNode->Flink)
    {
        tempNode = tempNode->Flink;

        if (CONTAINING_RECORD(tempNode, PROCESS_TABLE, list_entry)->processId == processId &&
            CONTAINING_RECORD(tempNode, PROCESS_TABLE, list_entry)->inUse == TRUE)
        {
            *sessionID = CONTAINING_RECORD(tempNode, PROCESS_TABLE, list_entry)->sessionId;
            found = TRUE;
            break;
        }
    }

    ....
    ....
    ....

    KeReleaseInStackQueuedSpinLock(&processTableLockHandle);
}
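
One thing worth checking in a loop like this is what happens if the list is ever modified without the lock, or an entry is freed while it is still linked: the walk may never get back to the head, and it then spins forever while holding the spin lock. The following is a minimal user-mode C sketch of that idea, not the driver's code. All names (list_entry, process_entry, lookup_session, max_steps) are hypothetical stand-ins, and the step limit is an assumed debugging aid rather than anything the driver is known to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-mode stand-ins for the kernel list types; all names are hypothetical. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

typedef struct process_entry {
    list_entry link;
    unsigned long process_id;
    unsigned long session_id;
    bool in_use;
} process_entry;

/* Recover the containing record from its embedded list link. */
#define ENTRY_FROM_LINK(p) \
    ((const process_entry *)((const char *)(p) - offsetof(process_entry, link)))

static void list_init(list_entry *head)
{
    head->flink = head;
    head->blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *node)
{
    node->flink = head;
    node->blink = head->blink;
    head->blink->flink = node;
    head->blink = node;
}

typedef enum { LOOKUP_FOUND, LOOKUP_NOT_FOUND, LOOKUP_CORRUPT } lookup_result;

/* Walk the circular list, but give up after max_steps instead of looping
 * forever when the links no longer lead back to the head. */
static lookup_result lookup_session(const list_entry *head, unsigned long pid,
                                    size_t max_steps, unsigned long *session_id)
{
    assert(head != NULL && session_id != NULL);

    const list_entry *node = head->flink;
    for (size_t steps = 0; node != head; node = node->flink, ++steps) {
        if (node == NULL || steps >= max_steps)
            return LOOKUP_CORRUPT;

        const process_entry *entry = ENTRY_FROM_LINK(node);
        if (entry->process_id == pid && entry->in_use) {
            *session_id = entry->session_id;
            return LOOKUP_FOUND;
        }
    }
    return LOOKUP_NOT_FOUND;
}
```

In a driver, a bounded walk of this kind could report the corruption (for example through a trace message or a debug assert) instead of hanging the CPU with the lock held.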


With the help of WinDbg, I observed that it mostly fails at a source line that assigns a value to a variable, and that variable is defined before the lock is acquired. You can see this in the driver code snippet above. My driver is a WFP ALE filter driver. It inspects traffic, runs in a multithreaded environment, and allocates and frees memory in the non-paged pool.
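
For context on the top frame of the stack: nt!KxWaitForLockOwnerShip appears to be where a processor spins while it waits its turn for a queued spin lock, and the FOLLOWUP_IP address matches the "from" address in LAST_CONTROL_TRANSFER, so it is probably just the instruction right after the acquire call. That would explain why the debugger points at a plain assignment. The sketch below is a user-mode approximation of a queued (MCS-style) spin lock using C11 atomics. It is an assumption about the general technique, not the actual Windows implementation, and the names qlock, qnode, and qlock_enqueue are hypothetical. It shows that a waiter keeps spinning until the current owner releases, so if the owner never releases, every later caller burns CPU indefinitely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiter, normally placed on the caller's stack. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

/* The lock itself is only a pointer to the last node in the queue. */
typedef struct {
    _Atomic(qnode *) tail;
} qlock;

static void qlock_init(qlock *l)
{
    atomic_init(&l->tail, NULL);
}

/* Join the queue. Returns true if the lock was free and the caller owns it
 * now, false if the caller is queued behind another owner. */
static bool qlock_enqueue(qlock *l, qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->must_wait, true);

    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev == NULL) {
        atomic_store(&me->must_wait, false);
        return true;
    }
    atomic_store(&prev->next, me);
    return false;
}

/* Acquire: join the queue, then spin on our own flag until the previous
 * owner hands the lock over. This spin is where a stuck waiter would sit. */
static void qlock_acquire(qlock *l, qnode *me)
{
    assert(l != NULL && me != NULL);
    if (qlock_enqueue(l, me))
        return;
    while (atomic_load(&me->must_wait)) {
        /* busy-wait */
    }
}

/* Release: hand the lock to the next waiter, or mark it free if none. */
static void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return; /* nobody was waiting */
        /* A waiter swapped itself in but has not linked to us yet. */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* busy-wait */
        }
    }
    atomic_store(&succ->must_wait, false);
}
```

If this model is close to what happens in the dump, the useful next step would be to find which other processor or thread currently owns gProcessTableLock and what it is doing, rather than looking at the waiter's own source line.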

I still cannot tell what is causing this issue: whether the lock is handled incorrectly in the code, or whether it only happens in some particular situation.


Can you please help me with a pointer or direction?
 
Hi. . .

Forget about the drivers here; the bugcheck (0x80) is what you need to look at:

Code:
NMI_HARDWARE_FAILURE (80)
This is typically due to a hardware malfunction.

Some piece of hardware on the system has failed.

Start with RAM and HDD diagnostic tests -

RAM - memtest86+ - run 1 stick at a time; alternate the slots - Test RAM With Memtest86+

HDD - SeaTools for DOS, LONG test - Hard Drive (HDD) Diagnostics (Sea Tools for DOS) & SSD Test

Regards. . .

jcgriff2
 
i had to hard reboot only by using VM option "power off"

Possibly? I'm wondering whether, because it crashed within a VM instance, hardware may be erroneously blamed for the issue, especially since the error doesn't occur with a particular driver removed.
 
