
[SOLVED] BSODs on 25-30 Computers

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
Good morning Sysnative!

The organization I work for has been plagued with BSODs on approximately 25-30 machines since updating to the latest version of Windows 10 (Fall Creators Update, v1709) in November/December of last year. The BSODs always happen after a user logs off their account (either by logging off directly, or by shutting down or restarting). The strange thing is, I can't reproduce it immediately. I can try a solution and then reboot, shut down, or log off to my heart's content with no BSOD. But after some time (as little as 20-30 minutes, but sometimes a full day), the BSOD will happen when the user logs out.

Here is a list of the computer models this is happening on:

- Dell Latitude E5470
- Dell Latitude E7470
- Dell OptiPlex 790, 7010, and 9020
- Microsoft Surface Pro 4
- Microsoft Surface Pro 2017

All laptops and tablets have docks that connect them to 1-2 monitors. All computers are running Windows 10 v1709 x64, have between 4 and 16 GB of RAM, and have Intel Core processors (a mix of i3, i5, and i7). Some desktop computers have AMD Radeon 7470 graphics cards, and some of them have a USB graphics card for triple monitor configurations. A couple of laptops have Intel+AMD graphics, and the rest of the machines use Intel Graphics. Most of the Windows installs are Dell OEM, but I think there is a sprinkling of fresh installs too.

These are the four bugcheck codes we see when a user logs off. The code varies from day to day and doesn't seem to be tied to a particular model of computer:

- 0x00000139
- 0x000000c2
- 0x0000003b
- 0x00000050
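For readers who don't have the codes memorized, here is a small sketch that maps the four values above to their symbolic names as listed in Microsoft's bug check code reference. The helper name `describe_bugcheck` is made up for illustration; the names say what kind of failure the kernel detected, not which driver caused it.

```python
# Symbolic names for the four bugcheck codes listed above
# (taken from Microsoft's bug check code reference).
BUGCHECK_NAMES = {
    0x139: "KERNEL_SECURITY_CHECK_FAILURE",
    0xC2: "BAD_POOL_CALLER",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
}


def describe_bugcheck(code_text: str) -> str:
    """Turn a hex string such as '0x000000c2' into 'code name' text."""
    code = int(code_text, 16)  # base 16 accepts the leading '0x'
    name = BUGCHECK_NAMES.get(code, "UNKNOWN")
    return f"0x{code:08X} {name}"


for text in ("0x00000139", "0x000000c2", "0x0000003b", "0x00000050"):
    print(describe_bugcheck(text))
```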

Often the dumps reference win32kfull.sys and other system components that are considered sacrosanct. I tried analyzing the basics with WinDBG and did not see any obvious third-party drivers.

I ran Driver Verifier on a test system, and it immediately crashed one of the network filter drivers related to our antivirus (Panda Endpoint Protection). I removed the antivirus software on another computer not running DV, and it still crashed. I ran DV on my work machine without any antivirus, and crashed a couple drivers related to some software that I use (not common on the other machines to my knowledge). I removed that software, and DV did not crash any more drivers on my system. I believe I ran it for a good 24-48 hours on my machine. However, I still received a bugcheck about a week later on logoff.

In short, here is a list of solutions we have tried (in no particular order):

- Disabling Fast Startup
- Updating drivers using Dell Command Update
- Updating drivers using Dell SupportAssist
- Updating drivers manually from Dell's website
- Updating drivers with versions not available from Dell (Intel Ethernet, WiFi, Bluetooth, SATA, Graphics, etc)
- Updating BIOS
- Removing AV (Panda Endpoint Protection)
- Removing remote support (TeamViewer)
- Installing latest Windows Updates (we manage updates through WSUS)
- Removing Group Policy from some machines
- Complete clean install of Windows 10 v1709 from ISO (downloaded from Microsoft VLSC - it's a legit ISO. We later reinstalled Windows 10 on another computer using a newer copy of the ISO from VLSC, so the ISO wasn't corrupt.)

Since the problem is happening on 25-30 different machines, I am skeptical of it being a hardware issue, but who knows - there have been stranger things.

All of our machines have a vanilla configuration with the following common software. Some machines have additional software depending on the staff's position:

- Adobe Flash Player
- Microsoft Office Professional 2016
- Google Chrome
- Mozilla Firefox
- VLC Media Player
- Foxit Reader
- TeamViewer
- Panda Endpoint Protection
- Dell Command Update

Below is a link to four memory dumps - one of each bugcheck code we have encountered. There are dumps from yesterday (2/12/2018) and one from today (2/13/2018).

Dumps: WeTransfer

Any help is much appreciated! I feel we are at our wits' end with this problem.
 

HyperHenry

Active member
Joined
Feb 12, 2018
Posts
40
Location
Currently Texas
Could you please read this thread and upload them here as instructed. It's easier and safer than third party sites. Thanks. :wave:
 

Tekno Venus

Senior Administrator, Site Designer
Staff member
Joined
Jul 21, 2012
Posts
6,070
Location
UK
Could you please read this thread and upload them here as instructed. It's easier and safer than third party sites. Thanks. :wave:
The dump files are too large for our site attachment limit. The 3rd party site linked is fine :-)
 

HyperHenry

Active member
Joined
Feb 12, 2018
Posts
40
Location
Currently Texas
NP, I'm not able to access them but you are in good hands here. :thumbsup2: It most likely is due to settings somewhere as I can't access most third party sites.
 

cwsink

Sysnative Staff, BSOD Kernel Dump Expert
Joined
Apr 3, 2017
Posts
279
The callstacks all seem to have GDI or sprite related functions involved. You said after 20 to 30 minutes? Do the computers have an old screen saver loading which might be using old graphics calls, perhaps?
 

philc43

BSOD Academy Instructor, BSOD Kernel Dump Expert
Joined
Jul 7, 2017
Posts
123
Location
Cambridge, UK
I ran a scan of all the drivers in one of your crash dumps (Surface Device) and located all of them that had no symbols; here is the result.

Code:
*** ERROR: Module load completed but symbols could not be loaded for RTKVHD64.sys
From Realtek Audio

Code:
*** ERROR: Module load completed but symbols could not be loaded for PSINKNC.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSSTRM.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSSMTP.sys
*** ERROR: Module load completed but symbols could not be loaded for PSINDvct.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSPRV.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSHTTPS.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSDHCP.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSDNS.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSHTTP.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSPICC.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSPIHSW.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSPOP3.sys
*** ERROR: Module load completed but symbols could not be loaded for PSKMAD.sys
All from Panda Security


Code:
*** ERROR: Module load completed but symbols could not be loaded for IntcDAud.sys
From: Intel Display Audio (installed with the Intel graphics driver package)

Code:
*** ERROR: Module load completed but symbols could not be loaded for iacamera64.sys
From Camera Driver


Drivers without symbols can often cause problems, and this list supports removing Panda Security while troubleshooting. One of my concerns is that you can never be sure all of its drivers are gone unless you search for each of them after uninstalling.
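A rough sketch of that check, assuming the debugger output has been saved as text: it pulls the driver names out of the symbol-load errors and reports which of them still exist under a given folder. The function names are hypothetical, and the folder shown in the comment is the usual Windows driver directory, which you may need to adjust.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Matches the WinDbg message quoted above and captures the driver file name.
SYMBOL_ERROR = re.compile(
    r"symbols could not be loaded for (\S+\.sys)", re.IGNORECASE
)


def modules_without_symbols(debugger_output: str) -> List[str]:
    """Return driver file names mentioned in symbol-load errors."""
    return SYMBOL_ERROR.findall(debugger_output)


def leftover_drivers(names: Iterable[str], driver_dir: Path) -> List[str]:
    """Return the names that still exist as .sys files under driver_dir."""
    present = {p.name.lower() for p in driver_dir.rglob("*.sys")}
    return [n for n in names if n.lower() in present]


sample = """\
*** ERROR: Module load completed but symbols could not be loaded for PSINKNC.sys
*** ERROR: Module load completed but symbols could not be loaded for NNSSTRM.sys
"""
names = modules_without_symbols(sample)
print(names)

# On the affected machine (path is an assumption):
# print(leftover_drivers(names, Path(r"C:\Windows\System32\drivers")))
```

An empty result from `leftover_drivers` only shows the files are gone from that folder; the loaded-module list in a later dump is still the better confirmation.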

Check for newer drivers for the others, here are the dates associated with your present drivers:
iacamera64.sys   Tue May 23 23:49:02 2017
IntcDAud.sys     Thu Dec 1 02:15:06 2016
RTKVHD64.sys     Fri Aug 5 01:25:28 2016


Another testing route might be to clean install W10 and test before you load any software, then reintroduce the software piece by piece until the problem appears. That could help pinpoint which program is causing the error.
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
The callstacks all seem to have GDI or sprite related functions involved. You said after 20 to 30 minutes? Do the computers have an old screen saver loading which might be using old graphics calls, perhaps?
Good thought, but screensavers are disabled through Group Policy. Computers are set to put the displays to sleep and to lock after 10 minutes. It isn't always 20-30 minutes. That's just what I noticed when I worked on one computer. Sometimes it can take almost a full work day before logging out will trigger the BSOD.
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
Drivers without symbols can often cause problems, and this list supports removing Panda Security while troubleshooting. One of my concerns is that you can never be sure all of its drivers are gone unless you search for each of them after uninstalling.

Check for newer drivers for the others, here are the dates associated with your present drivers:
iacamera64.sys   Tue May 23 23:49:02 2017
IntcDAud.sys     Thu Dec 1 02:15:06 2016
RTKVHD64.sys     Fri Aug 5 01:25:28 2016


Another testing route might be to clean install W10 and test before you load any software, then reintroduce the software piece by piece until the problem appears. That could help pinpoint which program is causing the error.
I do have a dump from when Panda Security was uninstalled. It's from 1/31/2018, but it's uploaded here if you want to see it: WeTransfer

I will check a Surface and see if I can get those specific drivers updated. The Surface has the latest driver package from Microsoft (Feb. 2018 IIRC), but I will see if I can find newer drivers from other sources.
 

philc43

BSOD Academy Instructor, BSOD Kernel Dump Expert
Joined
Jul 7, 2017
Posts
123
Location
Cambridge, UK
I do have a dump from when Panda Security was uninstalled. It's from 1/31/2018, but it's uploaded here if you want to see it: WeTransfer
I can confirm that there were no Panda Security drivers loaded in that crash dump so you did not have any traces left behind.
 

cwsink

Sysnative Staff, BSOD Kernel Dump Expert
Joined
Apr 3, 2017
Posts
279
The crash seems to have occurred while cleaning up a device context in the latest linked dump. So another graphics related function. It's hard to imagine what all the systems have in common as far as graphics hardware or even graphics related drivers. Do you have any graphics related utilities installed on all of the systems? Screen capture, OLE copy & paste, virtual desktop, shell extensions, etc. ?

With logoff and shutdown the system would be doing some clean up and I'm wondering if the logoff process is telling something to clean up handles to graphics objects that have already been freed.
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
The crash seems to have occurred while cleaning up a device context in the latest linked dump. So another graphics related function. It's hard to imagine what all the systems have in common as far as graphics hardware or even graphics related drivers. Do you have any graphics related utilities installed on all of the systems? Screen capture, OLE copy & paste, virtual desktop, shell extensions, etc. ?

With logoff and shutdown the system would be doing some clean up and I'm wondering if the logoff process is telling something to clean up handles to graphics objects that have already been freed.
Sorry for the delay in responding. Yesterday was a busy day.

We do use a third-party screenshot tool called Snagit, but only on a small handful of computers, and none of the Surface Pros have it installed. Foxit Reader does install a shell extension in the context menu for converting a document to PDF. Could that be graphics related? There is the virtual desktop feature that is built into Windows 10, but I don't believe it is used by any staff. I will continue to look for any software that could be graphics related.


Check for newer drivers for the others, here are the dates associated with your present drivers:
iacamera64.sys   Tue May 23 23:49:02 2017
IntcDAud.sys     Thu Dec 1 02:15:06 2016
RTKVHD64.sys     Fri Aug 5 01:25:28 2016
I updated those drivers, plus the wireless adapter and graphics drivers. Apparently the latest driver packages from Microsoft for the Surface Pro 4 do not contain the most recent drivers. Interestingly, searching automatically for a driver in Device Manager would not grab the latest driver from Windows Update. However, when I searched for the device's hardware ID on Microsoft Update Catalog, I found much more recent drivers. I did this on three Surface Pro 4s, and I removed Panda Security from one of them in case we get another BSOD dump.
 

cwsink

Sysnative Staff, BSOD Kernel Dump Expert
Joined
Apr 3, 2017
Posts
279
Looking at the 8GB complete dump, pool memory was detected as being freed twice and it looks like the memory was allocated by something using the pool tag "GVdv". Searches suggest that is a pool tag used by win32k.sys so not likely to be the problem itself. The crash happened in the winlogon.exe process around the time it was doing some graphics related work and the calls look "old" to me. It appears to be loading an animated icon, for example, and a few other calls that remind me of MFC style programming but maybe that is normal. Do the computers have some sort of customized logon/logoff dialog or process?
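To make "freed twice" concrete, here is a toy model of a double free. It is not how the Windows pool manager works; `ToyPool` and `DoubleFreeError` are invented names, and the point is only that the second free of the same block is the detectable error, whoever owned the allocation.

```python
class DoubleFreeError(Exception):
    """Raised when a block is freed while it is not currently allocated."""


class ToyPool:
    """Tiny stand-in for a memory pool that tracks live allocations."""

    def __init__(self):
        self._live = {}       # address -> tag of currently allocated blocks
        self._next = 0x1000   # next fake address to hand out

    def allocate(self, tag: str) -> int:
        address = self._next
        self._next += 0x10
        self._live[address] = tag
        return address

    def free(self, address: int) -> None:
        if address not in self._live:
            # The block was already released (or never allocated).
            raise DoubleFreeError(f"block {address:#x} is not allocated")
        del self._live[address]


pool = ToyPool()
block = pool.allocate("GVdv")
pool.free(block)            # first free succeeds
try:
    pool.free(block)        # second free of the same block is the bug
except DoubleFreeError as err:
    print("detected:", err)
```

The tag on the block identifies the allocator, which is why the code that allocated it is not necessarily the code that freed it twice.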
 

cwsink

Sysnative Staff, BSOD Kernel Dump Expert
Joined
Apr 3, 2017
Posts
279
Do you know if you had Special Pool enabled while using Driver Verifier? According to this NT Debugging blog post that's the setting to use while trying to detect a double free of pool memory. You'd want to enable it for all drivers rather than myfault.sys - or at least all non-Microsoft drivers.
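As a sketch of what enabling that could look like from the command line: Driver Verifier's `verifier /flags <value> /driver <list>` form takes a bitmask in which bit 0 (0x1) is Special Pool. The helper below only builds the command. The driver list is a hypothetical selection borrowed from earlier in the thread, and in practice you would list every non-Microsoft driver on the machine.

```python
SPECIAL_POOL = 0x1  # Driver Verifier flag bit 0: Special Pool


def verifier_command(drivers, flags=SPECIAL_POOL):
    """Build a 'verifier /flags ... /driver ...' command as an argument list."""
    drivers = list(drivers)
    if not drivers:
        raise ValueError("at least one driver name is required")
    return ["verifier", "/flags", hex(flags), "/driver", *drivers]


# Hypothetical selection of third-party drivers.
cmd = verifier_command(["RTKVHD64.sys", "IntcDAud.sys", "iacamera64.sys"])
print(" ".join(cmd))

# To apply it, run the printed command from an elevated prompt and reboot,
# or from Python on Windows:
#   import subprocess; subprocess.run(cmd, check=True)
```

`verifier /querysettings` shows what is currently enabled, and `verifier /reset` turns verification off again once testing is finished.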
 

x BlueRobot

Moderator, BSOD Kernel Dump Expert, Contributor
Joined
May 7, 2013
Posts
1,878
Location
Minkowski Space
I haven't read any of the dumps, but I would suggest following cwsink's suggestion.

cwsink, you can usually find the description of the driver by using the !pooltag extension on the string.
 

cwsink

Sysnative Staff, BSOD Kernel Dump Expert
Joined
Apr 3, 2017
Posts
279
Thank you for the tip! The !pooltag command says win32k.sys as well.
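For anyone curious what that lookup amounts to, `!pooltag` resolves a tag against a text table (pooltag.txt) shipped with the debugger. Below is a minimal sketch of the same idea, assuming each entry has the layout `tag - binary - description`. The sample text is made up apart from the GVdv to win32k.sys pairing reported in this thread, and the descriptions are deliberately left out.

```python
import re

# Assumed line layout: "<4-char tag> - <binary> - <description>".
ENTRY = re.compile(r"^(.{4})\s+-\s+(\S+)\s+-\s*(.*)$")


def load_pooltags(text):
    """Parse tag-table text into {tag: (binary, description)}."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            tag, binary, description = match.groups()
            table[tag] = (binary, description.strip())
    return table


# Hypothetical excerpt; only the GVdv -> win32k.sys pairing is from the thread.
SAMPLE = """\
GVdv - win32k.sys - (description omitted)
Xmpl - example.sys - (made-up entry for illustration)
"""

tags = load_pooltags(SAMPLE)
print(tags["GVdv"][0])
```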
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
Looking at the 8GB complete dump, pool memory was detected as being freed twice and it looks like the memory was allocated by something using the pool tag "GVdv". Searches suggest that is a pool tag used by win32k.sys so not likely to be the problem itself. The crash happened in the winlogon.exe process around the time it was doing some graphics related work and the calls look "old" to me. It appears to be loading an animated icon, for example, and a few other calls that remind me of MFC style programming but maybe that is normal. Do the computers have some sort of customized logon/logoff dialog or process?
No, the only thing close to that would be the login banner, but that is pushed through GP so it isn't really a custom dialog or anything that we would have scripted. I've been looking through Task Scheduler and I've found a few tasks for Google Update and the Adobe Flash updater that seem to be common across devices, but I am not sure if that could cause the issue or not. Is there a utility you would recommend for tracking processes being started and stopped in greater detail? I started playing with SysMon from the SysInternals Suite, but I'm not sure if that would be helpful or not.

Do you know if you had Special Pool enabled while using Driver Verifier? According to this NT Debugging blog post that's the setting to use while trying to detect a double free of pool memory. You'd want to enable it for all drivers rather than myfault.sys - or at least all non-Microsoft drivers.
I enabled Special Pool on my laptop - it's running on all the drivers except the Microsoft ones. Panda Security is removed from my computer.

I appreciate the help so far!!
 

Tekno Venus

Senior Administrator, Site Designer
Staff member
Joined
Jul 21, 2012
Posts
6,070
Location
UK
I started playing with SysMon from the SysInternals Suite, but I'm not sure if that would be helpful or not!
SysMon is the tool that comes to mind for me, but I think ProcMon would also be able to do what you want.

As a starting point for SysMon, take a look at the config file here: GitHub - SwiftOnSecurity/sysmon-config: Sysmon configuration file template with default high-quality event tracing. It's very well documented and created by SwiftOnSecurity who is well regarded. It's mainly designed for security and malware forensics, but should offer a good starting point for building a SysMon config that suits your needs.

-Stephen
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
I started playing with SysMon from the SysInternals Suite, but I'm not sure if that would be helpful or not!
SysMon is the tool that comes to mind for me, but I think ProcMon would also be able to do what you want.

As a starting point for SysMon, take a look at the config file here: GitHub - SwiftOnSecurity/sysmon-config: Sysmon configuration file template with default high-quality event tracing. It's very well documented and created by SwiftOnSecurity who is well regarded. It's mainly designed for security and malware forensics, but should offer a good starting point for building a SysMon config that suits your needs.

-Stephen
Okay! I set up SysMon on my computer with the SwiftOnSecurity config. I also will have more crash dumps next week. It's an extended weekend coming up, so I won't be back in till Tuesday. Thanks for all your help, and have a great weekend!
 

turtlej0e

Active member
Joined
Feb 13, 2018
Posts
25
So I got a full crash dump on my computer last week before I left for the weekend. I had Special Pool enabled and the antivirus removed. Hopefully this dump will be more helpful. I also checked the SysMon events around the time of the crash, and they showed roughly 400 Process Creation events for esif_assist_64.exe over the span of ten seconds. It looks like it is related to the Intel Dynamic Platform and Thermal Framework. I might try reinstalling the drivers from Dell to see if that changes anything, but it could be entirely unrelated.

Full dump: https://goo.gl/hC2x7S
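A burst like that is easy to miss when scrolling through Event Viewer. Here is a small sketch for flagging it, assuming the Sysmon process-creation events (Event ID 1) have already been exported as `(timestamp, image name)` pairs. The export step is not shown, and the sample timestamps are synthetic, not taken from my logs.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def find_bursts(events, window=timedelta(seconds=10), threshold=100):
    """Return {image: peak_count} for images started at least `threshold`
    times within any sliding `window`.

    `events` is an iterable of (timestamp, image_name) pairs.
    """
    recent = defaultdict(deque)   # image -> start times inside the window
    peaks = {}
    for stamp, image in sorted(events):
        times = recent[image]
        times.append(stamp)
        # Drop start times that have fallen out of the window.
        while stamp - times[0] > window:
            times.popleft()
        if len(times) >= threshold:
            peaks[image] = max(peaks.get(image, 0), len(times))
    return peaks


# Synthetic data: 400 starts of one image spread over ~10 seconds,
# plus a few unrelated starts.
base = datetime(2018, 2, 16, 17, 0, 0)
events = [(base + timedelta(milliseconds=25 * i), "esif_assist_64.exe")
          for i in range(400)]
events += [(base + timedelta(seconds=s), "chrome.exe") for s in (1, 4, 8)]

print(find_bursts(events))
```

Lowering `threshold` catches smaller bursts, at the cost of also flagging ordinary installers and updaters.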
 