BSOD - Win7 x64 - Stumped

Yes, both machines are work issued and both are running Trend and if that's the cause, I'm 100% certain that we will never find a solution since there is zero chance our Helpdesk will allow me to use anything else and/or disable it (even temporarily for troubleshooting).

DTN.IQ should not be a concern. As Patrick pointed out, it isn't malicious software and the BSODs happen even when it isn't running and it has no hooks into the system when not running.

Here are 3 fresh minidumps from Friday. Interesting note: the BSODs were particularly bad on Friday, until I disconnected the VPN and shut down the Cisco app. I am currently running without the VPN connected to see if I still get BSODs.

Question: would it make sense to turn Driver Verifier back on at this point?
 

Attachments

  • minidumps.zip (89.2 KB)
Code:
BugCheck C5, {2c00000008, 2, 0, fffff800033f5b05}

Probably caused by : Pool_Corruption ( nt!ExDeferredFreePool+249 )

Code:
2: kd> knL
 # Child-SP          RetAddr           Call Site
00 fffff880`033c4458 fffff800`032c0169 nt!KeBugCheckEx
01 fffff880`033c4460 fffff800`032bede0 nt!KiBugCheckDispatch+0x69
02 fffff880`033c45a0 fffff800`033f5b05 nt!KiPageFault+0x260
03 fffff880`033c4730 fffff800`033f44f1 nt!ExDeferredFreePool+0x249
04 fffff880`033c47c0 fffff880`04274c23 nt!ExFreePoolWithTag+0x411
05 fffff880`033c4870 fffff880`046c7095 tdx!TdxTdiDispatchClose+0x33
06 fffff880`033c48a0 00000000`00000000 tmtdi+0xd095

It seems that your Trend Micro Internet Security suite has caused an illegal page fault, resulting in a bugcheck.
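
For anyone following along, here is a minimal Python sketch of pulling the code and the four arguments out of a summary line like the one above. The function name parse_bugcheck and the small name table are my own, for illustration only, and are not part of any debugger tooling. The three names listed are the standard ones for the bug check codes that show up in this thread.

```python
import re

# Names for the bug check codes seen in this thread (illustrative subset).
BUGCHECK_NAMES = {
    0x01: "APC_INDEX_MISMATCH",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
    0xC5: "DRIVER_CORRUPTED_EXPOOL",
}

# Matches e.g. "BugCheck C5, {2c00000008, 2, 0, fffff800033f5b05}"
_LINE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

def parse_bugcheck(line):
    """Return (code, name, [arg1..arg4]) from a WinDbg 'BugCheck X, {...}' line."""
    m = _LINE.search(line)
    if m is None:
        raise ValueError("not a BugCheck summary line")
    code = int(m.group(1), 16)
    # WinDbg prints the arguments in hex without a 0x prefix.
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, BUGCHECK_NAMES.get(code, "UNKNOWN"), args

code, name, args = parse_bugcheck(
    "BugCheck C5, {2c00000008, 2, 0, fffff800033f5b05}")
print(hex(code), name, [hex(a) for a in args])
```

For 0xC5 the four arguments are the memory referenced, the IRQL at the time, 0 for a read or 1 for a write, and the address of the code that touched the memory. The fourth argument here is the same address as nt!ExDeferredFreePool+0x249 in the stack.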

I would suggest looking for an updated version of the program, or removing the program temporarily as a test:

Code:
2: kd> lmvm tmtdi
start             end                 module name
fffff880`046ba000 fffff880`046d7000   tmtdi    T (no symbols)           
    Loaded symbol image file: tmtdi.sys
    Image path: \SystemRoot\system32\DRIVERS\tmtdi.sys
    Image name: tmtdi.sys
    Timestamp:        Wed Jan 09 11:35:01 2013 (50ED55E5)
    CheckSum:         0001CDF1
    ImageSize:        0001D000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
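
As a side note, the hex value in parentheses on the Timestamp line is the build time stored in the driver's PE header, counted in seconds since 1970-01-01. A minimal Python sketch of the conversion follows; the helper name pe_timestamp_to_utc is made up for this example, and treating the value as UTC is an assumption on my part.

```python
from datetime import datetime, timezone

def pe_timestamp_to_utc(stamp_hex):
    """Interpret a PE TimeDateStamp (hex string) as seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(int(stamp_hex, 16), tz=timezone.utc)

# Value taken from the lmvm output for tmtdi.sys above.
built = pe_timestamp_to_utc("50ED55E5")
print(built.isoformat())
```

This comes out as 9 January 2013, 11:35:01, which agrees with the date the debugger printed, so the driver was a couple of years old at the time of these crashes.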
 
If your IT department doesn't want to disable it temporarily or swap it out, go to a higher up and explain. That's about all you can do.
 
Thanks for the replies, guys. I wanted to give an update on the situation. Per my last post, I left the VPN client shut down all weekend and Monday and experienced zero BSODs (so ~3.5 days stable, but 2 of those days were light workload). On Tuesday, I uninstalled and re-installed the Cisco VPN client (same old version our company uses) and re-enabled it. I had another BSOD within an hour.

This was finally enough evidence for our Helpdesk to allow me to update to a newer version of the Cisco VPN software. I have been running the new version for almost 24hrs now and so far it has been stable.

I'm hoping that is the cause of the crashes.

Of course with the recent dumps pointing towards Trend being the culprit, I'm hesitant to declare victory just yet.

Do you guys think that there could have been some sort of conflict between Trend and Cisco causing the problem and the new version of Cisco resolved the issue? I imagine the answer you guys will give is "it's possible but we can't be certain". Just curious as to what your best guess would be at this point.
 
Do you guys think that there could have been some sort of conflict between Trend and Cisco

Yes, this happens all the time. It's not uncommon for security software to conflict with network software. Usually in business environments, if it happens on the latest versions of both products, you have to talk to both vendors and hope they care enough to work out a patch/fix so you can keep using both without having to ditch one.

However if you crash again, then you know what your next step is (remove Trend). Curious as to why you guys even use Trend in your environment, as I haven't seen it lately.
 
It *does* happen with Cisco software regularly enough that I am always fearful when issues arise on machines running Cisco VPN software...
 
Curious as to why you guys even use Trend in your environment, as I haven't seen it lately.

I don't have a real answer for you because I don't really know but we have used it for the last 10+ years. We are a fairly large company (and part of one of the largest in the world). I would guess we have ~1000-1500 end user machines (not counting servers and some of those also run Trend) between our various offices (just for our small segment of the corporation). A few years ago, I asked one of our helpdesk guys why we don't move to a cheaper solution (like MSSE) and their response was that none of the low-cost solutions have any way to manage them from a central location (I don't want to use the word free because all of them have some sort of cost associated with them even if it isn't licensing). Granted, that was a few years back and may not be the case anymore and/or may not be the reason anymore but we use a lot of old methodologies around here. The "if it ain't broke, don't fix it" mentality runs rampant to a fault.
 
Most IT departments choose based on ease of management first, then how many other products can be bundled under the license (I'm looking at you, McAfee), and then perhaps based on quality of solution.

I think it's part of the reason why Microsoft moved Forefront Client Security to be Forefront Endpoint Protection (now System Center Endpoint Protection) and under the management of SCCM, and also why MBAM can be integrated with SCCM for management of BitLocker as well - not because these made the products themselves that much better, but because it made the management/reporting story much better for IT departments.
 
Sigh...

Unfortunately the BSODs returned yesterday (after 2 days stable on the new version of Cisco VPN).

Interestingly enough, when I look at the crash dumps myself (my only knowledge of crash dump debugging has been acquired from this site since this thread started), the most recent full memory dump https://onedrive.live.com/redir?resid=469B5CD55B3DC661!7159&authkey=!ABYpNIkp_HiJCi0&ithint=file,zip points to a different problem than the minidump from the same crash (attached along with 2 others from yesterday). Is this normal or does it point to some other problem? If so, I'm assuming the full dump should be more accurate right?

Now, while waiting for the full dump to upload to OneDrive... and to further my own understanding and enable me to help myself better in the future:

When analyzing the memory dump, I see the following:

Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 3B, {c0000005, fffff80003470ac5, fffff88020135b60, 0}
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for TmPreFlt.sys - 
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for fltmgr.sys - 
Probably caused by : TmPreFlt.sys ( TmPreFlt!TmpQueryFullName+1027 )

TmPreFlt.sys is a Trend driver.

This is confirmed by the stack trace:
Code:
1: kd> kn
 # Child-SP          RetAddr           Call Site
00 fffff880`20135298 fffff800`03478169 nt!KeBugCheckEx
01 fffff880`201352a0 fffff800`03477abc nt!KiBugCheckDispatch+0x69
02 fffff880`201353e0 fffff800`034a375d nt!KiSystemServiceHandler+0x7c
03 fffff880`20135420 fffff800`034a2535 nt!RtlpExecuteHandlerForException+0xd
04 fffff880`20135450 fffff800`034b34c1 nt!RtlDispatchException+0x415
05 fffff880`20135b30 fffff800`03478242 nt!KiDispatchException+0x135
06 fffff880`201361d0 fffff800`03476b4a nt!KiExceptionDispatch+0xc2
07 fffff880`201363b0 fffff800`03470ac5 nt!KiGeneralProtectionFault+0x10a
08 fffff880`20136540 fffff800`035acd0e nt!ExpInterlockedPopEntrySListFault16
09 fffff880`20136550 fffff880`0adde0bb nt!ExAllocatePoolWithTag+0xfe
0a fffff880`20136640 fffff880`0adde317 TmPreFlt!TmpQueryFullName+0x1027
0b fffff880`20136680 fffff880`01587067 TmPreFlt!TmpQueryFullName+0x1283
0c fffff880`20136710 fffff880`01588329 fltmgr!FltAcquirePushLockShared+0x907
0d fffff880`20136810 fffff880`015866c7 fltmgr!FltIsCallbackDataDirty+0xa39
0e fffff880`20136890 fffff800`0378242f fltmgr+0x16c7
0f fffff880`201368f0 fffff800`03770424 nt!IopCloseFile+0x11f
10 fffff880`20136980 fffff800`037701e1 nt!ObpDecrementHandleCount+0xb4
11 fffff880`20136a00 fffff800`037707a4 nt!ObpCloseHandleTableEntry+0xb1
12 fffff880`20136a90 fffff800`03477e53 nt!ObpCloseHandle+0x94
13 fffff880`20136ae0 00000000`776713aa nt!KiSystemServiceCopyEnd+0x13
14 00000000`0027e228 00000000`00000000 0x776713aa

Is there any way to figure out exactly what Trend was doing at the time? I was following through some of the tutorials on the site in some spare time but of course they all seem more specific to other problems so I can only gather so much from them (especially when I don't know exactly what I should be looking for).

Btw, I did broach the subject of Trend with my Helpdesk and, as expected, they were very reluctant. However, I noticed in my scan logs from the most recent BSOD that there appears to be a hung scan from back when these BSODs first started (mid December) that keeps restarting itself and attempting to complete but apparently never does. There are 3 more recent scans in this same state, but there are also several successfully completed scans, so I don't know if that issue is being caused by the BSODs or vice versa. I am getting some traction with my helpdesk on getting that issue resolved but they are pretty much ignoring my request to unload Trend and replace it with something else, even if only for troubleshooting purposes (as expected).
 

Attachments

  • Laptop1-minidumps.zip (91.5 KB)
Without Trend's symbols, it'll have to be reversed. I'm taking a look now, but it looks an awful lot like pool corruption by the Trend driver right now. Which is what they're *famous* for, by the way (or maybe that should be infamous? This is the same company that built their initial product drivers on the DDK samples from Microsoft, judging by the pool tags I used to find. That changed a few years back, but that doesn't mean it's gotten any better).
 
Code:
1: kd> .bugcheck
Bugcheck code 0000003B
Arguments 00000000`c0000005 fffff800`03470ac5 fffff880`20135b60 00000000`00000000

Code:
1: kd> .cxr fffff880`20135b60
rax=000000107cc70004 rbx=fffff8a02769ec78 rcx=fffff880009edbe0
rdx=11170002123de255 rsi=0000000000000006 rdi=0000000000000000
rip=fffff80003470ac5 rsp=fffff88020136540 rbp=fffff880009edbe0
 r8=11170002123de254  r9=fffff80003403000 r10=fffff880009edbe0
r11=fffff8a023e23d10 r12=fffff80003608580 r13=0000000000000000
r14=0000000000000002 r15=0000000066704d54
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
nt!ExpInterlockedPopEntrySListFault16:
fffff800`03470ac5 498b08          mov     rcx,qword ptr [r8] ds:002b:11170002`123de254=????????????????

The faulting instruction is loading rcx with the value at the address held in r8.

Code:
1: kd> !pte 11170002123de254
                                           VA 11170002123de254
PXE at FFFFF6FB7DBED000    PPE at FFFFF6FB7DA00040    PDE at FFFFF6FB40008488    PTE at FFFFF68001091EF0
contains 0070000188ED6867  contains 0000000000000000
pfn 188ed6    ---DA--UWEV  not valid

WARNING: noncanonical VA, accesses will fault !

It's noncanonical, so of course it faulted.
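
To make the "noncanonical" part concrete: with 48-bit virtual addressing on x64, bits 63 through 47 of an address must all be the same (all zeros or all ones). Here is a rough Python sketch of that check; the function name is_canonical is my own, and the 48-bit width is an assumption that fits a Windows 7 x64 machine.

```python
def is_canonical(va, va_bits=48):
    """True if bits 63..(va_bits-1) of a 64-bit virtual address are all equal."""
    top = va >> (va_bits - 1)                  # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones

# r8 from the .cxr output, then the faulting rip for comparison.
print(is_canonical(0x11170002123DE254))
print(is_canonical(0xFFFFF80003470AC5))
```

The r8 value fails the check because its upper bits are 0x1117, which is neither all zeros nor all ones, so no page table lookup could ever make that access succeed.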

Code:
1: kd> k
  *** Stack trace for last set context - .thread/.cxr resets it
Child-SP          RetAddr           Call Site
fffff880`20136540 fffff800`035acd0e nt!ExpInterlockedPopEntrySListFault16
fffff880`20136550 fffff880`0adde0bb nt!ExAllocatePoolWithTag+0xfe
fffff880`20136640 fffff880`0adde317 TmPreFlt!TmpQueryFullName+0x1027 // Trend kernel-mode driver
fffff880`20136680 fffff880`01587067 TmPreFlt!TmpQueryFullName+0x1283 // Trend kernel-mode driver
fffff880`20136710 fffff880`01588329 fltmgr!FltpPerformPreCallbacks+0x2f7 
fffff880`20136810 fffff880`015866c7 fltmgr!FltpPassThrough+0x2d9
fffff880`20136890 fffff800`0378242f fltmgr!FltpDispatch+0xb7
fffff880`201368f0 fffff800`03770424 nt!IopCloseFile+0x11f
fffff880`20136980 fffff800`037701e1 nt!ObpDecrementHandleCount+0xb4
fffff880`20136a00 fffff800`037707a4 nt!ObpCloseHandleTableEntry+0xb1
fffff880`20136a90 fffff800`03477e53 nt!ObpCloseHandle+0x94
fffff880`20136ae0 00000000`776713aa nt!KiSystemServiceCopyEnd+0x13
00000000`0027e228 00000000`00000000 0x776713aa

Trend's driver is calling into the ExAllocatePoolWithTag function to allocate pool memory.
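
ExAllocatePoolWithTag takes a four-character tag that is stored as a little-endian 32-bit value. As a sketch, the Python below turns such a value back into text. I'm feeding it the r15 value from the .cxr output purely as an example; that r15 really holds the pool tag at this point is my assumption, not something the dump states, and decode_pool_tag is a made-up helper name.

```python
import struct

def decode_pool_tag(value):
    """Decode a 32-bit pool tag value into its four ASCII characters."""
    raw = struct.pack("<I", value & 0xFFFFFFFF)  # little-endian byte order
    return raw.decode("ascii", errors="replace")

# r15 = 0000000066704d54 in the context record above (assumed to be the tag).
print(decode_pool_tag(0x66704D54))
```

If the assumption holds, the tag would be a quick way to tie allocations back to this driver when looking through pool in the dump.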

Code:
1: kd> lmvm TmPreFlt
start             end                 module name
fffff880`0addb000 fffff880`0adeb000   TmPreFlt   (export symbols)       TmPreFlt.sys
    Loaded symbol image file: TmPreFlt.sys
    Image path: \??\C:\Program Files (x86)\Trend Micro\OfficeScan Client\TmPreFlt.sys
    Image name: TmPreFlt.sys
    Timestamp:        Sat Aug 30 09:11:25 2014

Code:
1: kd> .foreach (place {.shell -ci "!poolfind *" sed s/.size.*// | sed s/*//} ) {!poolval place}
Pool page 0000000000736564 region is Unknown

Validating Pool headers for pool page: 0000000000736564

Pool page [ 0000000000736000 ] is __inVALID.

Code:
Pool page [ fffff80003470000 ] is __inVALID.

Analyzing linked list...
[ fffff80003470000 ]: invalid previous size [ 0x48 ] should be [ 0x0 ]
[ fffff80003470000 --> fffff80003470440 (size = 0x440 bytes)]: Corrupt region
[ fffff80003470680 --> fffff80003470b70 (size = 0x4f0 bytes)]: Corrupt region

As usual Cluberti is right on the money, looks like Trend is just causing a slew of issues (namely pool corruption). As to why? No idea. Probably just comes down to bugs in their kernel-mode driver(s) among other possible things.
 
OK, so it appears there certainly was a corrupted Trend install on this machine. I finally got our helpdesk to agree to reinstall it (best I could get out of them) to address the issue of the several month old scans still running. When they went to uninstall, the uninstaller wouldn't run and they had to do a manual uninstall. Of course there were then problems with the reinstall but after a long day of fighting with it yesterday, I was finally able to get it cleaned out properly (including all of the hidden drivers in device manager) and everything reinstalled properly and is now once again working correctly. The old scans are gone and it appears to be as good as I will get going forward as far as Trend is concerned.

Of course, this didn't solve the problem since I just had another bluescreen this morning. This time, the dump is pointing to usbhub.sys and I have no idea what that means beyond the obvious.

https://onedrive.live.com/redir?resid=469B5CD55B3DC661!7818&authkey=!AF-s-Aj16RitOrM&ithint=file,zip

Note that over the past several days (prior to re-installing Trend), I had several bluescreens that were once again pointing towards the audio drivers as in the original post. Yesterday, I removed the audio device from Device Manager, uninstalled the drivers, and let Windows install new ones after a reboot. I have not yet reinstalled the latest audio drivers from Dell's website.

Are we still dealing with a Trend issue? Or perhaps there is something else causing the corruption and Trend is somehow getting the blame?

There are two things I can't seem to get past when dealing with this issue. The first is that, according to my helpdesk group, I am the only person reporting problems, and they've already sent me a new PC which is also getting the bluescreens. The second is: why would it only BSOD when connected to the VPN? I don't have a really good answer for either of these and neither does my helpdesk.

At this point (like the title of my thread), I'm stumped as to a fix.
 

Attachments

  • 021315-45583-01.zip (29.9 KB)
No offense, but how exactly would reinstalling a product that is failing be expected to fix the issue? Unless the vendor has fixed the issues causing the failures, reinstalling will fix nothing.
 
When they went to uninstall, the uninstaller wouldn't run

Did they use a removal tool as opposed to Add/Remove Programs list? I hope so.....

Either way, as Cluberti said, unless there was a weird issue with that specific installation of Trend that caused the issue we're seeing here (entirely unlikely), then reinstalling will do absolutely no good.
 
My apologies for the late reply, I thought I had posted this before but apparently I didn't.

No offense, but how exactly would reinstalling a product that is failing be expected to fix the issue? Unless the vendor has fixed the issues causing the failures, reinstalling will fix nothing.
The same way that replacing a failing part on a car fixes an issue. Or do you just take the part off and wait/hope the manufacturer releases a new model of the part because the previous design must have been faulty? As I said above, the installation of Trend on my machine started behaving badly just before the first bluescreens. My hope was that fixing Trend would fix the problem. Once again I'll reiterate that my company is not a small company. We have literally thousands of machines running Trend and I'm the only person having problems.

When they went to uninstall, the uninstaller wouldn't run

Did they use a removal tool as opposed to Add/Remove Programs list? I hope so.....

Either way, as Cluberti said, unless there was a weird issue with that specific installation of Trend that caused the issue we're seeing here (entirely unlikely), then reinstalling will do absolutely no good.
They initially started using Add/Remove but the uninstaller wouldn't accept the password (the password was the reason I couldn't do it myself). At that point they used a removal tool but apparently it was a bad one because it didn't get everything. They couldn't get it to re-install until I manually removed all the Trend services (sc delete xxxxxx). At that point it re-installed but not correctly (still had some other weirdness going on) and I ran through the manual uninstall directions on Trend's website to remove not only the services, but also the registry keys, files on disk, and device drivers from device manager. At that point, everything installed correctly.

From that point, everything worked great for about a week and then the BSODs started up again. I then figured out that I could GREATLY reduce the occurrences of the BSODs by keeping 1 machine connected to the VPN (with Trend on it) but sitting idle unless I needed something on the company network. Whenever I needed something on the network, I would remote into that machine from my main laptop. This setup worked for a couple of weeks without either machine crashing, but this week my main PC (which was no longer connected to the VPN) started crashing again. After the 2nd crash, my manager gave me the OK to disobey our helpdesk and I removed Trend entirely using the manual uninstall directions from Trend's website (I couldn't use the uninstaller because I don't know the password).

I installed MSE and I am once again using Cisco VPN on this machine connecting directly to the VPN. That was Monday.

And now I'm back because today I've had another BSOD.
Memory Dump attached
https://onedrive.live.com/redir?resid=469B5CD55B3DC661!8173&authkey=!AIfXmgU0idAIuZc&ithint=file,zip

Thanks again in advance for any and all help.
 
Code:
2: kd> .bugcheck
Bugcheck code 00000001
Arguments 00000000`7715132a 00000000`00000000 00000000`0000ffff fffff880`0a8e1b60

0x1 bug check (APC_INDEX_MISMATCH), no info at all really with these. Taking a look at the 3rd parameter, we can see the value of the CombinedApcDisable field. It's split into two 16-bit values: SpecialApcDisable in the high half and KernelApcDisable in the low half.

Here the value is 0000ffff, so KernelApcDisable is -1 and SpecialApcDisable is 0. A negative count tells us that APCs were disabled and never re-enabled. Since only normal kernel APCs were disabled and special APCs were not, the thread had entered a Critical region as opposed to a Guarded region (https://msdn.microsoft.com/en-us/li...925(v=vs.85).aspx?f=255&MSPPError=-2147217396).

Drivers enter Guarded/Critical regions when holding locks to prevent APCs suspending or terminating the thread, which would cause a hang or deadlock and inevitably result in a bug check since we're dealing with kernel-mode drivers & threads.
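
To double-check the split of the third argument, here is a small Python sketch. It assumes the layout given in Microsoft's description of bug check 0x1, namely (SpecialApcDisable << 16) | KernelApcDisable, with both halves read as signed 16-bit counters. The function name split_combined_apc_disable is my own.

```python
import struct

def split_combined_apc_disable(value):
    """Split the 32-bit CombinedApcDisable into its signed 16-bit halves.

    Assumed layout: (SpecialApcDisable << 16) | KernelApcDisable.
    """
    # Little-endian packing puts the low half (KernelApcDisable) first.
    kernel, special = struct.unpack("<hh", struct.pack("<I", value & 0xFFFFFFFF))
    return {"KernelApcDisable": kernel, "SpecialApcDisable": special}

# Third bug check argument from the dump above.
result = split_combined_apc_disable(0x0000FFFF)
print(result)
```

For this dump's 0x0000ffff it gives KernelApcDisable = -1 and SpecialApcDisable = 0, so only the kernel APC count was left negative when the thread returned from the system call.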



Overall, your best bet is to enable verifier to see what driver is causing this. Without it, we'd just be taking random stabs in the dark.

Driver Verifier:

What is Driver Verifier?

Driver Verifier monitors Windows kernel-mode drivers, graphics drivers, and even 3rd party drivers to detect illegal function calls or actions that might corrupt the system. Driver Verifier can subject the Windows drivers to a variety of stresses and tests to find improper behavior.

Essentially, if there's a 3rd party driver believed to be causing the issues at hand, enabling Driver Verifier will help us see which specific driver is causing the problem.

Before enabling Driver Verifier, it is recommended to create a System Restore Point:

Vista - START | type rstrui - create a restore point
Windows 7 - START | type create | select "Create a Restore Point"
Windows 8/8.1 - Restore Point - Create in Windows 8

How to enable Driver Verifier:

Start > type "verifier" without the quotes > Select the following options -

1. Select - "Create custom settings (for code developers)"
2. Select - "Select individual settings from a full list"
3. Check the following boxes -
- Special Pool
- Pool Tracking
- Force IRQL Checking
- Deadlock Detection
- Security Checks (only on Windows 7 & 8/8.1)
- DDI compliance checking (only on Windows 8/8.1)
- Miscellaneous Checks
4. Select - "Select driver names from a list"
5. Click on the "Provider" tab. This will sort all of the drivers by the provider.
6. Check EVERY box that is NOT provided by Microsoft / Microsoft Corporation.
7. Click on Finish.
8. Restart.
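
The same selection can also be made from an elevated command prompt with verifier /flags and /driver instead of the GUI. Below is a minimal Python sketch that adds up the flag bits for the Windows 7 options listed above and prints the resulting command line. The bit values are as I understand them from Microsoft's verifier.exe documentation, so treat them as an assumption and check them against your Windows version. The driver names are hypothetical placeholders, and DDI compliance checking is left out because it only applies to Windows 8/8.1.

```python
# Flag bits as I understand them from the verifier.exe documentation (assumption).
VERIFIER_FLAGS = {
    "Special Pool": 0x001,
    "Force IRQL Checking": 0x002,
    "Pool Tracking": 0x008,
    "Deadlock Detection": 0x020,
    "Security Checks": 0x100,
    "Miscellaneous Checks": 0x800,
}

def verifier_command(options, drivers):
    """Build a 'verifier /flags N /driver ...' command line as a string."""
    flags = 0
    for opt in options:
        flags |= VERIFIER_FLAGS[opt]
    # The flag value is emitted in decimal.
    return "verifier /flags {} /driver {}".format(flags, " ".join(drivers))

# Hypothetical third-party driver names, for illustration only.
command = verifier_command(list(VERIFIER_FLAGS), ["example1.sys", "example2.sys"])
print(command)
```

The script only prints the command; it does not run it. If the machine ends up in a crash loop after enabling verifier, running verifier /reset from safe mode clears the settings again.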
 
Quick Update.

When I turned on driver verifier, the machine went into an endless BSOD loop. Immediately after startup it would BSOD (it got as far as loading Windows, but crashed while loading the startup items).

I booted into safe mode and turned off driver verifier only for the driver that was failing at startup (my webcam). I was then able to reboot into normal mode and discovered that the webcam drivers would cause a BSOD as soon as you tried to activate the video (I typically only use it for audio). I reloaded the most current drivers from Logitech's website (I was intentionally only using stock Windows drivers at the time) and all is well with that again.

I've been 100% stable since the 19th (1 week) while running driver verifier. It bogs down at times due to the verifier running and audio is frequently distorted, but that's certainly better than crashing several times a day. Unfortunately, whenever I have run verifier in the past, it has seemed to make the machine more stable (which as I understand it is exactly the opposite of what should happen). This machine has gone stable for 1wk+ in the past with verifier running.

I'm going to turn it back off this afternoon and start running without verifier again. If it stays stable for a couple of weeks without verifier running, I'll once again load Trend and see if the BSODs return. If they do, I believe I have pretty concrete proof to take to our helpdesk team that it was Trend causing the BSODs all along, and hopefully they will allow me to continue running with MSE (but I doubt it... I'm sure they will want me to go into several more months of troubleshooting to figure out why my machine can't handle Trend but all the others work fine).

Thanks again for everyone's help. This has been a long and frustrating process. Without your help, I'm not sure where I would be at this point.
 
Have you ever tried running it outside of the docking station for an extended period of time, and then experienced a crash?
 