We've all seen it: Bitdefender's protection process involved in an otherwise blank 0x4A bug check. After seeing it so many times, I dug in to find out what the issue really is. It couldn't just have been the user's fault, since the crashes were consistent and stopped once Bitdefender was removed.
First off, taking a look at a non-verifier enabled kernel dump, here's our bug check as discussed:
Code:
10: kd> .bugcheck
Bugcheck code 0000004A
Arguments 00000000`77a1dc2a 00000000`00000002 00000000`00000000 fffff880`13695b60
0x4A bug check, essentially implying that the thread which was previously involved in a system call attempted to return to user mode at an IRQL higher than PASSIVE_LEVEL (zero [0] on x86 and x64).
Code:
10: kd> !irql
Debugger saved IRQL for processor 0xa -- 2 (DISPATCH_LEVEL)
In this case, at the time of the crash, the IRQL was DISPATCH_LEVEL (Two [2] on x86 and x64).
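For reference, the parameters above can be pulled apart with a few lines of Python. This is only a sketch: the `decode_0x4a` helper, its key names and the small IRQL table are my own naming (not any debugger API), and the user-mode boundary used is the usual x64 canonical split.

```python
# Hypothetical helper: decode the "Arguments" line printed by .bugcheck
# for bug check 0x4A. Names and table below are illustrative only.

# IRQL names on x64 for the levels relevant here (not the full table).
IRQL_NAMES = {0: "PASSIVE_LEVEL", 1: "APC_LEVEL", 2: "DISPATCH_LEVEL"}

def decode_0x4a(arguments_line):
    # Drop the leading "Arguments" word and the backtick WinDbg puts
    # in the middle of 64-bit values.
    fields = arguments_line.split()[1:]
    args = [int(field.replace("`", ""), 16) for field in fields]
    irql = args[1]
    return {
        "parameter1_address": args[0],
        "irql": irql,
        "irql_name": IRQL_NAMES.get(irql, "unknown"),
        # Assumption: addresses below the canonical split are user mode.
        "is_user_mode_address": args[0] < 0x0000800000000000,
    }

line = "Arguments 00000000`77a1dc2a 00000000`00000002 00000000`00000000 fffff880`13695b60"
info = decode_0x4a(line)
print(hex(info["parameter1_address"]), info["irql_name"])
```

Parameter 2 is the interesting one here: anything other than 0 at this point is what triggers the bug check.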
Code:
X64_RAISED_IRQL_FAULT_vsserv.exe_nt!KiSystemServiceExit+245
The process involved in the IRQL raise was vsserv.exe, Bitdefender's main active protection process.
Let's also go further and dump the address of the system function involved:
Code:
10: kd> !address 0000000077a1dc2a
Usage: VAD
Base Address: 00000000`779d0000
End Address: 00000000`77b79000
Region Size: 00000000`001a9000
VA Type: UserRange
VAD Address: 0xfffffa8020d1f830
Commit Charge: 0xd
Protection: 0x7 [ReadWriteCopyExecute]
Memory Usage: Section [\Windows\System32\ntdll.dll]
No Change: no
More info: !vad 0x779d0000
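We can see its VA type is UserRange, and its protection is 0x7, which the debugger decodes as ReadWriteCopyExecute (EXECUTE_WRITECOPY): readable and executable, with copy-on-write rather than plain write access.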
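The numbers in that !address output can be sanity-checked by hand. Here is a minimal sketch in Python; the `MM_PROTECTION` table is my assumption about the memory manager's internal protection values, and `describe` is just an illustrative helper.

```python
# Values copied from the !address output above.
BASE = 0x779d0000
END = 0x77b79000
ADDRESS = 0x77a1dc2a

# Assumed table of the memory manager's internal protection values;
# only the low three bits are modelled here.
MM_PROTECTION = {
    0: "NOACCESS", 1: "READONLY", 2: "EXECUTE", 3: "EXECUTE_READ",
    4: "READWRITE", 5: "WRITECOPY", 6: "EXECUTE_READWRITE",
    7: "EXECUTE_WRITECOPY",
}

def describe(address, base, end, protection):
    return {
        "in_region": base <= address < end,
        "region_size": end - base,
        "offset_in_image": address - base,
        "protection": MM_PROTECTION[protection & 0x7],
    }

result = describe(ADDRESS, BASE, END, 0x7)
print(hex(result["region_size"]), hex(result["offset_in_image"]), result["protection"])
```

The region size matches the 0x1a9000 the debugger reports, and the offset tells us how far into ntdll.dll's mapping the address sits.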
If we run !vad on the VAD Address field, we can see frequent mention of Bitdefender:
Code:
10: kd> !vad 0xfffffa8020d1f830
...
fffffa8018c95a70 ( 2) 7fee7980 7fee79c8 6 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\mimepack.dll
fffffa802278cb70 ( 3) 7fee79d0 7fee7aa2 69 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\asregex.dll
fffffa8018c944e0 ( 0) 7fee7ab0 7fee7bb9 9 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\asmcocr.dll
fffffa804d4baa50 ( 2) 7fee7bc0 7fee7dea 259 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\asunicode.dll
fffffa8022f25450 ( 3) 7fee7df0 7fee810b 89 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\asemlthin.mdl
fffffa8050690e60 ( 1) 7fee8110 7fee8333 86 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\asemlrtr.mdl
fffffa8018c0b1e0 ( 2) 7fee8340 7fee8442 81 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\asengines_00015_008\ascore.dll
fffffa804f4f7970 ( 3) 7fee8550 7fee8622 69 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\asregex.dll
fffffa804e14ff80 (-1) 7fee8630 7fee885a 259 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\asunicode.dll
fffffa804f509640 ( 2) 7fee8860 7fee89ef 82 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\ashttprbl.mdl
fffffa804f4bfc80 ( 3) 7fee89f0 7fee8cc9 88 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\ashttpph.mdl
fffffa80232c0460 ( 1) 7fee8cd0 7fee8dca 82 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\ashttpdsp.mdl
fffffa804ce69120 ( 3) 7fee8dd0 7fee8edb 81 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\ashttpbr.mdl
fffffa804ce52bb0 ( 2) 7fee8ee0 7fee8fe2 81 Mapped Exe EXECUTE_WRITECOPY \Program Files\Bitdefender\Bitdefender 2015\otengines_00350_006\otcore.dll
...
Let's use !address and -v together to get nice verbose PTE/PFN/VAD information:
Code:
10: kd> !address -v -map 0x779d0000
PXE: fffff6fb7dbed000 [contains 02e0000763d62867]
Page Frame Number: 763d62, at address: fffffa80162b8260
Page Location: 6 (ActiveAndValid)
PTE Frame: 0000000000763e3c
Attributes: M:Modified,Cached
Usage: PPEs Process fffffa8020d96b10 [vsserv.exe], Entries:5
PPE: fffff6fb7da00008 [contains 18700007641ad867]
Page Frame Number: 7641ad, at address: fffffa80162c5070
Page Location: 6 (ActiveAndValid)
PTE Frame: 0000000000763d62
Attributes: M:Modified,Cached
Usage: PDEs Process fffffa8020d96b10 [vsserv.exe], Entries:31
PDE: fffff6fb40001de0 [contains 0370000764ab6867]
Page Frame Number: 764ab6, at address: fffffa80162e0220
Page Location: 6 (ActiveAndValid)
PTE Frame: 00000000007641ad
Attributes: M:Modified,Cached
Usage: PTEs Process fffffa8020d96b10 [vsserv.exe], Entries:159
PTE: fffff680003bce80 [contains 82a000079e2d4025]
Page Frame Number: 79e2d4, at address: fffffa8016da87c0
Page Location: 6 (ActiveAndValid)
PTE Frame: 000000000079ec0c
Attributes: P:Prototype,Cached
Usage: MappedFile CA:fffffa801f3a3010 [\Windows\System32\ntdll.dll]
Type: Valid
Attrs: Private,NormalPage,NotDirty,NotDirty1,Accessed,User,NotWritable,NotWriteThrough
PFN: 79e2d4
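The PXE/PPE/PDE/PTE addresses in that output aren't magic; they fall out of the virtual address itself. A rough sketch in Python follows, assuming the fixed self-map bases that x64 Windows 7 uses (the `*_BASE` constants are my assumption, and later Windows versions randomize them):

```python
# Assumed self-map bases for x64 Windows 7.
PTE_BASE = 0xFFFFF68000000000
PDE_BASE = 0xFFFFF6FB40000000
PPE_BASE = 0xFFFFF6FB7DA00000
PXE_BASE = 0xFFFFF6FB7DBED000

def paging_entry_addresses(va):
    # Each level consumes 9 bits of the virtual address above the
    # 12-bit page offset; every entry is 8 bytes wide.
    va &= 0x0000FFFFFFFFFFFF  # keep the 48 translated bits
    return {
        "PXE": PXE_BASE + ((va >> 39) << 3),
        "PPE": PPE_BASE + ((va >> 30) << 3),
        "PDE": PDE_BASE + ((va >> 21) << 3),
        "PTE": PTE_BASE + ((va >> 12) << 3),
    }

entries = paging_entry_addresses(0x779d0000)
for name, address in entries.items():
    print(name, format(address, "016x"))
```

For 0x779d0000 this reproduces the four entry addresses the debugger printed.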
Overall, we can see the page-table pages belong to vsserv.exe and that every level is active and valid, with the final page backed by ntdll.dll as a mapped file:
Code:
10: kd> !vad 0x779d0000
VAD level start end commit
fffffa8020d1f830 (-1) 779d0 77b78 13 Mapped Exe EXECUTE_WRITECOPY \Windows\System32\ntdll.dll
Throughout all of the 0x4A Bitdefender related crashes, the NT kernel was labeled as the fault:
Code:
Probably caused by : ntkrnlmp.exe
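Given we're seeing ntdll, the likely reason the NT kernel gets blamed is that the address in parameter 1 sits inside ntdll (presumably at the return from one of its system call stubs, which only transition into the kernel), and the bug check itself is raised from the kernel's system service exit path. The image name is ntkrnlmp.exe because that's the name the multi-processor kernel carries, and this is a multi-processor system. The PAE/non-PAE kernel distinction only applies to 32-bit Windows, so it doesn't come into play on this x64 box: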
Code:
10: kd> vertarget
Windows 7 Kernel Version 7601 (Service Pack 1) MP (12 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.18839.amd64fre.win7sp1_gdr.150427-0707
Machine Name:
Kernel base = 0xfffff800`03064000 PsLoadedModuleList = 0xfffff800`032ab730
Debug session time: Fri May 15 09:00:27.644 2015 (UTC - 4:00)
System Uptime: 0 days 16:20:00.892
For processor #10, that's about as far as we can go: it's the bug check thread, and its stack tells us very little:
Code:
10: kd> k
Child-SP RetAddr Call Site
fffff880`13695928 fffff800`030d7e69 nt!KeBugCheckEx
fffff880`13695930 fffff800`030d7da0 nt!KiBugCheckDispatch+0x69
fffff880`13695a70 00000000`77a1dc2a nt!KiSystemServiceExit+0x245
00000000`2798f908 00000000`00000000 0x77a1dc2a
All we can see is that we're returning to user mode through KiSystemServiceExit, and we go off the rails right there at KiSystemServiceExit+0x245. This function sits on the exit path of system calls: it handles the various call styles used to enter kernel mode and then returns to user mode.
With that said, let's switch to the other processor within the system that was involved and see what's going on at the time of the crash. To find out the active processors on the specific system, we'll use !running:
Code:
10: kd> !running
System Processors: (0000000000000fff)
Idle Processors: (00000000000003ff) (0000000000000000) (0000000000000000) (0000000000000000)
Prcbs Current (pri) Next (pri) Idle
10 fffff880038c9180 fffffa8021c85b50 ( 9) fffff880038d41c0 ................
11 fffff8800393b180 fffffa8021c91060 ( 9) fffff880039461c0 ................
We can see our processors are #10 and #11. We've explored #10, so let's check #11. The reason 0-9 aren't listed is because they're idle.
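The two masks at the top of the !running output say the same thing. A quick Python check (the `set_bits` helper is just for illustration):

```python
def set_bits(mask):
    # Return the processor numbers whose bit is set in the mask.
    return [n for n in range(mask.bit_length()) if (mask >> n) & 1]

system_mask = 0x0000000000000fff  # "System Processors" from !running
idle_mask = 0x00000000000003ff    # first "Idle Processors" group

idle = set_bits(idle_mask)
running = set_bits(system_mask & ~idle_mask)
print("idle:", idle)
print("running:", running)
```

0xfff covers all 12 processors, 0x3ff covers 0 through 9, and what's left over is 10 and 11.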
Code:
11: kd> knL
# Child-SP RetAddr Call Site
00 fffff880`03969e20 fffff880`0584f75e e1c62x64+0x558e
01 fffff880`03969e50 fffff880`0584ff1f e1c62x64+0x2075e
02 fffff880`03969ec0 fffff880`0584fb43 e1c62x64+0x20f1f
03 fffff880`03969f70 fffff880`0584fa49 e1c62x64+0x20b43
04 fffff880`03969fa0 fffff800`0301f62f e1c62x64+0x20a49
05 fffff880`03969fe0 fffff880`01a0c600 hal!HalBuildScatterGatherList+0x203
06 fffff880`0396a050 fffff880`0584ffb2 ndis!NdisMAllocateNetBufferSGList+0x110
07 fffff880`0396a0f0 fffff880`05850649 e1c62x64+0x20fb2
08 fffff880`0396a150 fffff880`0585028e e1c62x64+0x21649
09 fffff880`0396a1b0 fffff880`01ac84f1 e1c62x64+0x2128e
0a fffff880`0396a1f0 fffff880`01a0c4d4 ndis!ndisMSendNBLToMiniport+0xb1
0b fffff880`0396a250 fffff880`05c6d6b8 ndis!NdisFSendNetBufferLists+0x64
0c fffff880`0396a290 fffff880`05c6d92c bdfndisf6+0x16b8
0d fffff880`0396a2f0 fffff880`05c6df4b bdfndisf6+0x192c
0e fffff880`0396a380 fffff880`01a0c4d4 bdfndisf6+0x1f4b
0f fffff880`0396a480 fffff880`00c16199 ndis!NdisFSendNetBufferLists+0x64
10 fffff880`0396a4c0 fffff880`01a0c419 pacer!PcFilterSendNetBufferLists+0x29
11 fffff880`0396a5c0 fffff880`01ac85d5 ndis!ndisSendNBLToFilter+0x69
12 fffff880`0396a620 fffff880`01c60eb6 ndis!NdisSendNetBufferLists+0x85
13 fffff880`0396a680 fffff880`01c67895 tcpip!IpNlpFastSendDatagram+0x496
14 fffff880`0396aa30 fffff880`01c68450 tcpip!TcpTcbSend+0x495
15 fffff880`0396acb0 fffff880`01c671a8 tcpip!TcpEnqueueTcbSendOlmNotifySendComplete+0xa0
16 fffff880`0396ace0 fffff880`01b30267 tcpip!TcpEnqueueTcbSend+0x258
17 fffff880`0396ad90 fffff880`01b35f5d NETIO!StreamInjectRequestsToStack+0x287
18 fffff880`0396ae60 fffff880`01b376b4 NETIO!StreamPermitDataHelper+0x5d
19 fffff880`0396ae90 fffff800`030e41dc NETIO!StreamPermitRemoveDataDpc+0x84
1a fffff880`0396af00 fffff800`030db335 nt!KiRetireDpcList+0x1bc
1b fffff880`0396afb0 fffff800`030db14c nt!KyRetireDpcList+0x5
1c fffff880`13abf190 fffff800`0312371c nt!KiDispatchInterruptContinue
1d fffff880`13abf1c0 fffff800`030c2aec nt!KiDpcInterrupt+0xcc
1e fffff880`13abf350 fffff880`01b383aa nt!KeInsertQueueDpc+0x1dc
1f fffff880`13abf3e0 fffff880`01b3b468 NETIO!StreamPermitData+0x13a
20 fffff880`13abf450 fffff880`01b3b99a NETIO!StreamInternalClassify+0x1a8
21 fffff880`13abf520 fffff880`01b3bd8e NETIO!StreamInject+0x1ca
22 fffff880`13abf5f0 fffff880`01b91df3 NETIO!FwppStreamInject+0x12e
23 fffff880`13abf680 fffff880`05c9aaf1 fwpkclnt!FwpsStreamInjectAsync0+0xcf
24 fffff880`13abf6e0 fffff880`05c9bce3 bdfwfpf+0x2af1
25 fffff880`13abf780 fffff880`05ca469c bdfwfpf+0x3ce3
26 fffff880`13abf7c0 fffff880`05ca4d0a bdfwfpf+0xc69c
27 fffff880`13abf840 fffff880`05c9ebb3 bdfwfpf+0xcd0a
28 fffff880`13abf8a0 fffff800`033f3e47 bdfwfpf+0x6bb3
29 fffff880`13abf8d0 fffff800`033f46a6 nt!IopXxxControlFile+0x607
2a fffff880`13abfa00 fffff800`030d7b53 nt!NtDeviceIoControlFile+0x56
2b fffff880`13abfa70 00000000`77a1dc2a nt!KiSystemServiceCopyEnd+0x13
2c 00000000`27a9f928 00000000`00000000 0x77a1dc2a
I used knL as opposed to the other stack dump commands as I wanted to get the frame # feature for reference reasons.
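With a stack this long, it can help to boil it down to just the modules involved. Below is a small Python sketch that does that for pasted kn/knL lines; the regular expression and the `module_path` helper are my own and only handle the line shapes seen in this dump.

```python
import re
from itertools import groupby

# Matches "<frame> <child-sp> <retaddr> <module>" followed by "!" or "+".
# Raw addresses such as 0x77a1dc2a have neither, so they are skipped.
FRAME_RE = re.compile(r"^[0-9a-f]+\s+\S+\s+\S+\s+(?P<module>\w+)[!+]")

def module_path(kn_lines):
    modules = []
    for line in kn_lines:
        match = FRAME_RE.match(line.strip())
        if match:
            modules.append(match.group("module").lower())
    # Collapse runs of consecutive frames from the same module.
    return [name for name, _ in groupby(modules)]

sample = [
    "21 fffff880`13abf520 fffff880`01b3bd8e NETIO!StreamInject+0x1ca",
    "22 fffff880`13abf5f0 fffff880`01b91df3 NETIO!FwppStreamInject+0x12e",
    "23 fffff880`13abf680 fffff880`05c9aaf1 fwpkclnt!FwpsStreamInjectAsync0+0xcf",
    "24 fffff880`13abf6e0 fffff880`05c9bce3 bdfwfpf+0x2af1",
    "25 fffff880`13abf780 fffff880`05ca469c bdfwfpf+0x3ce3",
    "29 fffff880`13abf8d0 fffff800`033f46a6 nt!IopXxxControlFile+0x607",
    "2c 00000000`27a9f928 00000000`00000000 0x77a1dc2a",
]
path = module_path(sample)
print(" <- ".join(path))
```

Read the result left to right as "called by": the most recent module is first, just like in the stack dump.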
Starting at frame # 2a, we can see the NtDeviceIoControlFile function calls IopXxxControlFile. The latter function appears to be undocumented, so I'm unsure as to what it does. What I do know is, the NtDeviceIoControlFile function is ultimately used to build descriptors for a driver. I imagine it's using the IopXxxControlFile function to aid in passing such to the driver.
Also, for what it's worth, Microsoft's documentation describes NtDeviceIoControlFile as superseded by DeviceIoControl. That doesn't necessarily mean Bitdefender deliberately picked the native function, though: the Win32 DeviceIoControl is a wrapper that normally ends up in NtDeviceIoControlFile, so the kernel stack looks the same either way.
Code:
11: kd> ln nt!IopXxxControlFile
(fffff800`033f3840) nt!IopXxxControlFile
(fffff800`033f4650) nt!NtDeviceIoControlFile
Exact matches:
nt!IopXxxControlFile (<no parameter info>)
If we disassemble this function, we can wade through some of the stuff and find some of the interesting tidbits:
Code:
11: kd> u fffff800`033f3840 fffff800`033f4650
fffff800`033f3956 e845c5ffff call nt!ProbeForWrite (fffff800`033efea0)
fffff800`033f39b7 e81498fdff call nt!ObReferenceObjectByHandleWithTag (fffff800`033cd1d0)
fffff800`033f3b05 e84688cfff call nt!IoGetRelatedDeviceObject (fffff800`030ec350)
fffff800`033f402a e8d130cdff call nt!IoGetAttachedDevice (fffff800`030c7100)
fffff800`033f3c01 e88a7bcfff call nt!IoAllocateIrp (fffff800`030eb790)
fffff800`033f40d1 e82af9cfff call nt!IoAllocateMdl (fffff800`030f3a00)
So after neatly putting together this disassembly of sorts, we can see that this is indeed how the NtDeviceIoControlFile function is passing on the buffer and such to the driver.
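As a side note, the target addresses in those call lines can be verified from the opcode bytes alone. A short Python sketch (the `call_target` helper is my own naming):

```python
import struct

def call_target(instruction_address, opcode_bytes):
    # E8 is the x86-64 near CALL with a signed 32-bit displacement
    # that is relative to the address of the next instruction.
    raw = bytes.fromhex(opcode_bytes)
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a 5-byte E8 call")
    (displacement,) = struct.unpack("<i", raw[1:])
    return instruction_address + len(raw) + displacement

# First line of the listing: the call to nt!ProbeForWrite.
target = call_target(0xfffff800033f3956, "e845c5ffff")
print(format(target, "016x"))
```

The result lines up with the address the debugger shows in parentheses for nt!ProbeForWrite.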
IoAllocateIrp allocates the IRP for the request, and IoAllocateMdl allocates an MDL that describes the caller's buffer and can be associated with that IRP. IoGetRelatedDeviceObject is likely called to obtain the devobj that corresponds to the file object, and IoGetAttachedDevice to return a pointer to the highest-level devobj in that driver stack.
ObReferenceObjectByHandleWithTag is called to increment the reference count of the object, and to write a four-byte value known as a "tag" so it can support object reference tracing for debugging purposes. Finally, the ProbeForWrite function is called to ensure that a user-mode buffer meets the following:
- Resides in the user-mode portion of the address space.
- Is writeable.
- Is correctly aligned.
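To make those three checks a bit more concrete, here's a toy model in Python. It's only a sketch under assumptions: `MM_USER_PROBE_ADDRESS` is the value I believe x64 Windows keeps in MmUserProbeAddress, `probe_for_write_model` is a made-up name, and the real routine raises exceptions rather than returning strings.

```python
# Assumed highest user-mode probe address on x64 Windows.
MM_USER_PROBE_ADDRESS = 0x00007FFFFFFF0000

def probe_for_write_model(address, length, alignment):
    # Zero-length buffers are not checked at all.
    if length == 0:
        return "ok"
    # Alignment is expected to be a power of two (1, 2, 4, 8, ...).
    if address & (alignment - 1):
        return "STATUS_DATATYPE_MISALIGNMENT"
    # The whole buffer has to end inside the user-mode range.
    if address + length > MM_USER_PROBE_ADDRESS:
        return "STATUS_ACCESS_VIOLATION"
    # The real routine also touches each page to prove it is writable;
    # that step cannot be modelled without a live address space.
    return "ok"

print(probe_for_write_model(0x2798f908, 16, 8))          # user-mode, aligned
print(probe_for_write_model(0xfffff88013695b60, 16, 8))  # kernel address
print(probe_for_write_model(0x1001, 4, 4))               # misaligned
```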
As all appears to have gone well, we can see the driver we were ultimately building and passing descriptors to was bdfwfpf.sys, which is Bitdefender's firewall filter driver. As it's a driver in charge of a firewall, it of course uses the WFP API (Windows Filtering Platform) to achieve its goals (not just filtering and monitoring).
We can confirm this easily by looking at the very first module called after Bitdefender's firewall driver, which is fwpkclnt.sys. Specifically, Bitdefender's firewall driver called it to inject new/cloned data into the data stream. Directly afterwards we have calls into the Network I/O Subsystem (NETIO.sys) to continue the injection. fwpkclnt.sys exports the kernel-mode WFP functions, as opposed to fwpuclnt.dll, which handles the user-mode side.
To continue the injection into the data stream, KeInsertQueueDpc is called to queue a DPC for execution.
Code:
11: kd> !dpcs
CPU Type KDPC Function
10: Normal : 0xfffffa806a7b7cb0 0xfffff88001b37630 NETIO!StreamPermitRemoveDataDpc
After discussion with Jared, we also thought the IRQL may have been at DISPATCH_LEVEL because of the multiple injections, so Windows deferred the work to a DPC. If that's the case, then while the DPC was still to be worked on, the system service finished with the IRQL still at DISPATCH_LEVEL, and that gives us the bug check.
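To illustrate the rule being broken (and only the rule, not what NETIO actually does internally), here is a toy Python model. Every name in it is made up for the example; it just shows that a raise without a matching lower is enough to trip the check on the way back to user mode.

```python
PASSIVE_LEVEL, DISPATCH_LEVEL = 0, 2

class BugCheck(Exception):
    pass

class Processor:
    def __init__(self):
        self.irql = PASSIVE_LEVEL

    def raise_irql(self, new_level):
        old, self.irql = self.irql, new_level
        return old

    def lower_irql(self, old_level):
        self.irql = old_level

    def system_service_exit(self):
        # Returning to user mode above PASSIVE_LEVEL is fatal.
        if self.irql > PASSIVE_LEVEL:
            raise BugCheck(f"0x4A at IRQL {self.irql}")
        return "returned to user mode"

def balanced_call(cpu):
    old = cpu.raise_irql(DISPATCH_LEVEL)
    cpu.lower_irql(old)
    return cpu.system_service_exit()

def unbalanced_call(cpu):
    cpu.raise_irql(DISPATCH_LEVEL)  # never lowered again
    return cpu.system_service_exit()

print(balanced_call(Processor()))
try:
    unbalanced_call(Processor())
except BugCheck as error:
    print("bug check:", error)
```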
We continue through netio.sys' functions regarding the data stream injection, ultimately injecting the request to the stack and going through a few tcpip.sys functions.
To continue sending the data along, NDIS' NdisSendNetBufferLists function is called, and an NDIS filter driver (pacer.sys, the QoS Packet Scheduler) calls NdisFSendNetBufferLists to pass the list of network data buffers down to the next filter, bdfndisf6.sys, Bitdefender's NDIS filter driver.
Bitdefender's NDIS filter driver then calls NdisFSendNetBufferLists itself to send the list on to the user's network miniport driver, e1c62x64.sys (Intel(R) 82579V Gigabit Network Connection). The network miniport driver then calls NDIS' NdisMAllocateNetBufferSGList function to obtain a scatter/gather list for the network data of the associated NET_BUFFER structure.
In order to do so, NDIS needs to call the HAL, which we can see through the function HalBuildScatterGatherList. The HAL builds the scatter/gather list and calls back into the miniport driver (frames #04 through #00), and frame #00, inside e1c62x64.sys, is as far as this processor got before the bug check.
So, where's our problem? Frame #23:
Code:
23 fffff880`13abf680 fffff880`05c9aaf1 fwpkclnt!FwpsStreamInjectAsync0+0xcf
FwpsStreamInjectAsync0, the function in charge of injecting TCP data segments into a TCP data stream, is the issue. How so? Well, let's get dirty once again.
Using the NDIS debugging extension (!ndiskd), we can get a lot of information to help us here. On its lonesome, !ndiskd isn't too special. However, when we use !ndiskd.miniport, it gets fun.
Code:
11: kd> !ndiskd.miniport
MiniDriver Miniport Name
fffffa8020c71cd0 fffffa8018c281a0 RAS Async Adapter
fffffa801f844cd0 fffffa801f8771a0 SonicWALL NetExtender Adapter
fffffa801f862840 fffffa801f86b1a0 WAN Miniport (SSTP)
fffffa801f84bb70 fffffa801f8671a0 WAN Miniport (PPTP)
fffffa801f837c30 fffffa801f8631a0 WAN Miniport (PPPOE)
fffffa801f8409b0 fffffa801f85e1a0 WAN Miniport (IPv6)
fffffa801f8409b0 fffffa801f85a1a0 WAN Miniport (IP)
fffffa801f8409b0 fffffa801f8561a0 WAN Miniport (Network Monitor)
fffffa801f835cd0 fffffa801f8411a0 WAN Miniport (L2TP)
fffffa801f82f820 fffffa801f83d1a0 WAN Miniport (IKEv2)
fffffa801f664020 fffffa801f7b81a0 Intel(R) 82579V Gigabit Network Connection
fffffa801f5cb9e0 fffffa801f5e61a0 Teredo Tunneling Pseudo-Interface
fffffa801f5cb9e0 fffffa801f5e21a0 Microsoft ISATAP Adapter #2
fffffa801f5cb9e0 fffffa801f5de1a0 Microsoft ISATAP Adapter
fffffa801f5cb9e0 fffffa801f5d61a0 Microsoft 6to4 Adapter
So we know that our miniport involved in all of this was the Intel Gigabit, so let's look at that one:
Code:
11: kd> !ndiskd.minidriver fffffa801f664020
MINIPORT DRIVER
e1cexpress
Ndis handle fffffa801f664020
Driver Context NULL
DRIVER_OBJECT fffffa801f7b6e70
Driver image e1c62x64.sys
Registry path \REGISTRY\MACHINE\SYSTEM\ControlSet001\services\e1cexpress
Reference Count 2
Flags [No flags set]
MINIPORTS
Miniport
fffffa801f7b81a0 - Intel(R) 82579V Gigabit Network Connection
If we take a look at the miniport address:
Code:
11: kd> !ndiskd.miniport fffffa801f7b81a0
MINIPORT
Intel(R) 82579V Gigabit Network Connection
Ndis handle fffffa801f7b81a0
Ndis API version v6.20
Adapter context fffffa801f990000
Miniport driver fffffa801f664020 - e1cexpress v12.6
Network interface fffffa8019c8c870
Media type 802.3
Device instance PCI\VEN_8086&DEV_1503&SUBSYS_849C1043&REV_06\3&11583659&0&C8
Device object fffffa801f7b8050 More information
MAC address e0-3f-49-78-a1-dd
Code:
STATE
Miniport Running
Device PnP Started
Datapath Normal
Interface Up
Media Connected
Power D0
References 0n10
Total resets 0
Pending OID None
Flags BUS_MASTER, 64BIT_DMA, SG_DMA, DEFAULT_PORT_ACTIVATED,
SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,
MEDIA_CONNECTED
PnP flags PM_SUPPORTED, DEVICE_POWER_ENABLED,
DEVICE_POWER_WAKE_ENABLE, RECEIVED_START,
HARDWARE_DEVICE
Code:
BINDINGS
Protocol list Driver Open Context
RSPNDR fffffa8021b39cf0 fffffa8021b608d0 fffffa8021b62010
LLTDIO fffffa8021b1a8f0 fffffa8021b528d0 fffffa8021b361b0
TCPIP6 fffffa801d05c2c0 fffffa801fb13010 fffffa801fb0b010
TCPIP fffffa8019c7b890 fffffa801fb08580 fffffa801fb03ba0
Filter list Driver Module Context
WFP LightWeight Filter-0000
fffffa801f59f010 fffffa801faff660 fffffa801faff400
QoS Packet Scheduler-0000
fffffa801f5ab930 fffffa801fb00780 fffffa801f9d3010
BitDefender Firewall NDIS6 Filter Driver-0000
fffffa801f574d40 fffffa801fb04c80 fffffa801fb04850
We get a lot of good information, and can see that Bitdefender's firewall filter driver is/was involved with this miniport. We know this, because we saw it all happening in the stack, but this just confirms it.
Anyway, what's next? Well, let's check for any pending NBLs (NET_BUFFER_LIST structures):
Code:
11: kd> !ndiskd.pendingnbls fffffa801f7b81a0
PHASE 1/3: Found 23 NBL pool(s).
PHASE 2/3: Found 512 freed NBL(s).
Pending Nbl Currently held by
fffffa80593c82c0 fffffa801f7b81a0 - Intel(R) 82579V Gigabit Network Connection [Miniport]
PHASE 3/3: Found 1 pending NBL(s) of 789 total NBL(s).
Search complete.
Ah ha, we have one held by the miniport driver that was involved in passing data to Bitdefender's firewall filter driver. Let's look at the pending NBL:
Code:
11: kd> !ndiskd.nbl fffffa80593c82c0
NBL fffffa80593c82c0 Next NBL NULL
First NB fffffa80593c83f0 Source fffffa801fb08580 - TCPIP
From here we can take a direct look at the NBL:
Code:
11: kd> dt _NET_BUFFER_LIST fffffa80593c82c0
ndis!_NET_BUFFER_LIST
_NET_BUFFER_LIST
+0x000 Next : (null)
+0x008 FirstNetBuffer : 0xfffffa80`593c83f0 _NET_BUFFER
+0x000 Link : _SLIST_HEADER
+0x010 Context : 0xfffffa80`593c84a0 _NET_BUFFER_LIST_CONTEXT
+0x018 ParentNetBufferList : (null)
+0x020 NdisPoolHandle : 0xfffffa80`1cfe6080 Void
+0x030 NdisReserved : [2] (null)
+0x040 ProtocolReserved : [4] 0x746c6100`00000001 Void
+0x060 MiniportReserved : [2] 0xfffffa80`1f990000 Void
+0x070 Scratch : (null)
+0x078 SourceHandle : 0xfffffa80`1fb08580 Void
+0x080 NblFlags : 0
+0x084 ChildRefCount : 0n0
+0x088 Flags : 0x100
+0x08c Status : 0n0
+0x090 NetBufferListInfo : [19] 0x00000000`00220015 Void
What appears to be happening here is that multiple NBLs in a chain are being passed, FwpsStreamInjectAsync0 is called to pass Bitdefender's data, and the chain is broken as the call goes on (note that the NBL's Next member is null).
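For anyone not used to NBL chains, here's a tiny Python model of what a null Next member means. It's purely illustrative: the `NetBufferList` class only models the Next link, and `detach_first` is a hypothetical helper, not an NDIS or WFP routine.

```python
class NetBufferList:
    # Only the Next link is modelled; a real NET_BUFFER_LIST has many
    # more members, as the dt output above shows.
    def __init__(self, name, next_nbl=None):
        self.name = name
        self.next = next_nbl

def chain_names(head):
    names = []
    while head is not None:
        names.append(head.name)
        head = head.next
    return names

def detach_first(head):
    # Split the first NBL off the chain and return (first, rest).
    rest = head.next
    head.next = None
    return head, rest

chain = NetBufferList("nbl0", NetBufferList("nbl1", NetBufferList("nbl2")))
first, rest = detach_first(chain)
print(chain_names(first), chain_names(rest))
```

Once an NBL has been split off like this, looking at it on its own (as we did with dt) can't tell us whether it ever had siblings, which is why the broken chain above is an inference rather than something the dump states outright.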
Possibly a fix (in Bitdefender's case) is to avoid multiple injections inside the stream callout routine, possibly taking NBLs in a chain and calling the FwpsStreamInjectAsync0 function just ONCE for each callout routine execution. Unsure, kernel development isn't my strong point : ) It's not Bitdefender's fault as this is a Windows bug apparently, anyway.
A fix for users is to install this hotfix and hope it works, as it should. Overall, instead of making any development changes, maybe Bitdefender could just raise awareness of this issue, for example with a well-explained documentation page that links to the hotfix.