New gaming rig. Various strange BSODs while/after playing a specific game

Koncept

Member
Joined
Jan 24, 2020
Posts
16
Greetings. Recently I had some random BSODs on my new gaming rig. What really concerns me is that it could be a hardware issue, because it has Intel's latest "special edition" CPU, the 9900KS, so I am really worried that I got unlucky and something is wrong with my chip.
Below is the tl;dr version with dumps attached, plus a long version of what's happening.

OS - Windows 10, 1909 (Build 18363.592)
· 64bit
· Was originally pre-installed by gaming rig-building company; OEM edition of home version
· The rig is 2 months old
· Never reinstalled OS since
· Desktop PC
· i9-9900KS
· 2080 Ti FE (Founders Edition from NVIDIA)
· Asus ROG Maximus XI Hero
· Corsair RMX 850W
· HyperX Predator 2x16 RGB DDR4
· M.2 Evo 970 Plus 250GB (for system) + M.2 Evo 970 Plus 1TB (for everything else) (No RAID)
· CPU Cooled by Noctua NH-D14 (really massive air cooler), no water-cooling, stock cooling on GPU
· Everything runs at default, no overclocking on GPU, CPU, RAM

tl;dr: After 2 months of smooth sailing I got my first BSODs, and different ones at that, while playing one particular game: Escape from Tarkov (EFT). Before each BSOD, Windows Event Viewer logs some WHEA-Logger cache hierarchy errors, but the BSOD itself was never a WHEA one, which seems strange to me. At first I thought it must be software related (the game is in deep beta, very far from release), though it still concerned me that a game can crash with a BSOD. The latest BSOD, however, came after playing the game (maybe it was still running in the background somehow, I can't really remember or check). I check temps in every game I play on this rig with HWMonitor, and they are always fine, in this case too. I play other "heavy" games like Rainbow Six: Siege, Elder Scrolls Online etc. and never had any problems; temps are OK (never above 68-70 C under heavy load like R6 on very high settings). But something goes wrong while/after playing EFT.
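For anyone who wants to pull those WHEA-Logger entries out of the System log without clicking through Event Viewer, here is a minimal sketch. It only assembles a `wevtutil` query as an argument list and does not run it. The provider name `Microsoft-Windows-WHEA-Logger` and the default count of 20 are assumptions, so adjust them to what Event Viewer shows on your machine.

```python
# Sketch only: assembles (does not run) a wevtutil query for WHEA-Logger events.
# The provider name is an assumption; check the "Source" column in Event Viewer.
def build_whea_query(count=20, provider="Microsoft-Windows-WHEA-Logger"):
    """Return an argv list that asks for the newest `count` events as text."""
    xpath = "*[System[Provider[@Name='{}']]]".format(provider)
    return [
        "wevtutil", "qe", "System",
        "/q:" + xpath,        # XPath filter on the event provider
        "/c:" + str(count),   # maximum number of events to return
        "/rd:true",           # read newest events first
        "/f:text",            # human-readable text instead of XML
    ]


print(" ".join(build_whea_query()))
```

On Windows the returned list could be handed to `subprocess.run`; the text output is easier to paste into a forum post than screenshots.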

Long story with some notes:
The rig was built by a specialized company here in Russia, everything brand new (literally 2 months old, built around mid November 2019). It's under warranty, so I can ask them for support or replacement parts, but I don't really want to right now, because I am not sure whether this is software related or something is really wrong with my hardware.
What I noted from the start: since this company offers different builds with different parts (you can choose your CPU, your mobo etc. in their builder), they use an OEM version of Windows with various tools pre-installed for everything. What I mean is, I don't have EVGA or MSI parts, but some MSI and EVGA programs (RGB tuning and such) were pre-installed nonetheless. I even had to remove them, because the EVGA RGB program was still running its services, which were messing with my Corsair RGBs. I believe some "traces" of these programs may still be loading at startup, because I always see some Asus related services stop in Event Viewer after booting up.

I had already noticed WHEA-Logger errors before (no BSODs) while playing a different game (I don't remember which one). The first time I got really unhappy and worried, and even read that the Meltdown\Spectre protections may cause this on new Intel CPUs, so I disabled Spectre protection with the InSpectre program. I can't really tell whether that was the cause or whether I simply stopped playing that particular game, but I never saw WHEA-Logger errors after that. I switched to playing R6, Path of Exile, Mortal Kombat 11 and ESO and never saw any of them during that time; the log was really clean, no problems at all.
Then, about a week ago, I picked up EFT. Temps are fine as always, FPS is good, everything working as intended, but then a random BSOD occurs - IRQL_NOT_LESS_OR_EQUAL - and Event Viewer logged multiple WHEA-Logger cache hierarchy errors.
I don't have tools for debugging a full dump, so I went with BlueScreenView and noticed xusb22.sys listed along with ntoskrnl.exe. A quick google said that xusb22.sys is the driver for the wireless x-box type controller, and I always have mine connected, just in case I want to play MK11. Also, some people on reddit reported BSODs and odd behavior while playing EFT with a controller connected, so I unplugged it.
The WHEA-Logger errors got me concerned again, so I went ahead and successfully updated the BIOS to the latest version from the official ASUS website. BIOS settings were at default, not even a sign of overclocking; even the rig manufacturer never changed anything in there. After the BIOS update everything certainly went back to defaults, because after updating I had to disable RGB and USB charging in the powered-off state (they are enabled by default). XMP is off (I don't know what it is for or whether I need to enable it).
2 days later, the same BSOD again - IRQL_NOT_LESS_OR_EQUAL. xusb22.sys cannot be the cause this time, because I had unplugged my controller.
At this point I thought it was definitely something with the game, since everything was fine while playing other games.

Today, after the EFT servers went down and I had to ALT+F4 out of the game (mentioning this because it could possibly still have been running in the background after that), I was browsing with Chrome, waiting for them to come back up, and suddenly got a BSOD - a different one this time - SYSTEM_SERVICE_EXCEPTION.

I always keep Windows and the NVIDIA drivers up to date.
Ran Cinebench R20 and the CPU-Z test to see if it has something to do with the CPU being under heavy load - nothing, no WHEA errors, no BSODs, temps are fine during the tests.
Windows memory test - no errors
Windows integrity test - no errors
Device Manager shows no problems with any of the listed devices, though I have to mention that my keyboard is pretty old: I have to unplug it when shutting down and plug it in again after boot, otherwise it won't work after a full shutdown. POST will warn that there is no keyboard attached and Device Manager will show a controller error or something like that. After a reset it is fine. I think it's because it's an old USB 2.0 keyboard, but this never bothered me.

Below I will attach collector files.
Sorry for any grammar or spelling mistakes, English is not my native language.
 


1. Open Driver Verifier Manager by typing verifier in the search box
2. Select "Create custom settings (for code developers)"
3. Tick all standard settings and 2 additional settings (Force pending I/O requests and IRP logging)
4. Select "Select driver names from a list"
5. Tick all 3rd party drivers without Microsoft drivers
6. Apply changes and reboot computer
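
The same configuration can, in principle, be set from an elevated command prompt with `verifier.exe` instead of the GUI. The sketch below only builds the command line as a string; it does not run anything. The flag values (0x209BB for the standard set, 0x200 for Force pending I/O requests, 0x400 for IRP logging) are quoted from memory of Microsoft's Driver Verifier documentation and should be checked against `verifier /?` on your build. The driver names are placeholders.

```python
# Sketch only: builds (does not run) a verifier.exe command line.
# Flag values are assumptions recalled from Microsoft's documentation;
# confirm them with `verifier /?` before using the result.
STANDARD_FLAGS = 0x209BB      # the "standard settings" group
FORCE_PENDING_IO = 0x200      # "Force pending I/O requests"
IRP_LOGGING = 0x400           # "IRP logging"


def build_verifier_command(drivers):
    """Return a verifier.exe command for the given third-party drivers."""
    if not drivers:
        raise ValueError("pick at least one third-party driver")
    flags = STANDARD_FLAGS | FORCE_PENDING_IO | IRP_LOGGING
    return "verifier /flags 0x{:X} /driver {}".format(flags, " ".join(drivers))


# Placeholder driver names, for illustration only.
print(build_verifier_command(["example1.sys", "example2.sys"]))
```

After the reboot, `verifier /querysettings` shows what is active and `verifier /reset` turns verification off again.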
 
For a better memory test please follow the instructions in this tutorial: Test RAM with PassMark MemTest86

Please report results.

ATM away from PC, had to run to work, never expected such fast answers, thank you both! Will do as soon as I get home (3-4 hours from now)
 
Only managed to do 1 pass with MemTest86, just to be sure it works; it was OK, no errors. I will do 4 later, as you can't do 8 passes with the free version anymore.
Enabled Driver Verifier with the above settings; the query says it works and I can feel a bit of lag right now when playing, so far so good. Played EFT yesterday, everything was OK. I will need to spend some more time with it enabled for it to crash, as it is completely inconsistent - in most cases I can play all day and everything will be OK, and I don't know how to make it crash on purpose with Driver Verifier running.
Should I prioritize MemTest86 with all 4 passes or just keep running with verifier and wait for a crash?
I actually remembered one thing. The only driver I reinstalled (or installed over the default Windows one) was Realtek HD Audio, almost right after I bought this rig. If I remember correctly there was a noticeable buzz in my new headphones when idling in Windows, so I went to the Asus website and installed the driver from there. The buzz (or whatever drove me crazy) went away, but I should note that sometimes after boot there is another odd audio artifact: when sound stops playing (a person stops talking in Discord, or I pause a song, for example) there is a small but noticeable "pop" after the pause\stop. I noticed it recently and it is not always there; sometimes everything is OK, sometimes it is like that, and rebooting usually helps. I don't know if this can be connected somehow, but this is the only 3rd party driver I installed myself, and the "pop" artifact shows the same inconsistency... I can roll back to the default Windows driver in Device Manager, but I think I should not, because we are in the middle of troubleshooting with Driver Verifier. Or should I?
 
For now, play this problematic game with Driver Verifier turned on
 
Run MemTest86 overnight, when you are not using your rig, after the Driver Verifier tests are done. Use Driver Verifier for about 48 hours at most.
 
IRQL_NOT_LESS_OR_EQUAL (...) wireless x-box type controller

5. Tick all 3rd party drivers without Microsoft drivers

For now, play this problematic game with Driver Verifier turned on
Very good idea, but wouldn't it be better to tick the Windows drivers as well?

Also,
· CPU Cooled by Noctua NH-D14 (really massive air cooler), no water-cooling, stock cooling on GPU
· Everything runs at default, no overclocking on GPU, CPU, RAM
To be honest, from a low-level electronics point of view, I really appreciate people doing this.
Overclocking on boards makes the CPU and memory lose cycles here and there (mainly through bad latching), and people rely too much on CPU rollbacks. Sure, computers won't catch fire. But it makes CPU and memory I/O streams less fluid (especially without ECC RAM, which has its cost).

And we're almost Noctua twins :) (but mine is 12 years old and still good !)
 
I would not quite agree with that. When we know the problem may be drivers, and Driver Verifier on third-party drivers alone is not enough (i.e. we still cannot find the unstable driver), it is sometimes useful to run Driver Verifier on the third-party drivers and also on system drivers, e.g. the kernel (ntoskrnl.exe). Then you also check whether the driver communicates correctly with the kernel itself or another Microsoft driver. I recommend the article about Driver Verifier below:
Analyst's Perspective: My Driver Passes Driver Verifier! (Or Does It...) - OSR
It describes quite well and precisely how Driver Verifier works, how to use it correctly to find driver errors, etc.
Of course, the scenario described is one where you are developing the driver yourself, but nothing prevents you from checking third-party drivers in the same way.
 
While Windows drivers aren't bug-free, I have yet to see a situation where driver verifier is required to find a Windows driver causing crashes that hasn't been addressed with a Windows Update yet, or is really caused by hardware. Besides, Windows drivers have been tested with driver verifier before they're released publicly just like any driver provided via Windows Update.

The link describes enabling driver verifier on Windows drivers when a developer is debugging their own driver, a completely different scenario.
 
OK, maybe a different angle. Let's say you have a blue screen which, according to the crash dumps, is caused by the kernel. You have ruled out the hardware issues etc. and you are sure that the problem is drivers. You enable Driver Verifier on all third-party drivers. You still get blue screens, and debugging again points to the kernel. Why? Because Verifier checks only the calls that a verified driver makes directly to a given function. Therefore, the rule is that Verifier should be turned on both for the third-party drivers we want to check (it does not matter whether we wrote the driver ourselves or not) and for the driver that the third-party driver uses. If you enable Verifier for the third-party drivers and, in this case, also for the kernel, Verifier will also check the calls that the driver makes to the kernel, so any buffers and allocations passed from the driver to the kernel will be checked as well.
 
That's a very unrealistic scenario.

As I said, there shouldn't be a need to use driver verifier on the Windows kernel drivers.
 
I'd argue otherwise. There are cases where, after using Driver Verifier, you still can't pinpoint the right driver or, worse, there are no blue screens at all once it is turned on. The problem is that we then write such cases off as a hardware failure, when sometimes the problem is simply inaccurate diagnostics (overlooking some Driver Verifier option or a Microsoft driver, so that Verifier could not check the exact paths used by the drivers). In addition, no one on this forum has even tried to advise such diagnostics, which reinforces the belief that since Microsoft's drivers are well tested, they do not cause blue screens (although, as we well know, there are sometimes exceptions). The setting for Microsoft driver verification is not to test these drivers, but more to check if third-party drivers are correctly communicating with Microsoft drivers. For example, when a network card driver looks guilty, and changing it to an older or newer version or reinstalling it does not help, you verify both that network adapter driver and the Microsoft network drivers in its stack. That way you check the calls this "guilty" driver makes to the Microsoft drivers, and whether some other driver is the one making bad calls.
 
The setting for Microsoft driver verification is not to test these drivers, but more to check if third-party drivers are correctly communicating with Microsoft drivers.
It is to validate that the actions performed by the drivers are correct, and thus to test the drivers, though these actions may be carried out with the help of Windows drivers.


You're complaining about how some analysts approach certain situations, but they do so mostly from experience. A dozen of my experience-based approaches turned out to be wrong; that only tells me to look more carefully whenever I see a similar problem, but doesn't necessarily make me change how I handle things. I wouldn't change my approach if it was ineffective only a dozen times out of a few hundred situations, but I would be open to learning from those dozen times to see where I could improve.

In the end, everyone makes mistakes from which they do, or should, learn.
 
No, it's not that I'm complaining; we're all analysts doing a great job. I just think we shouldn't be afraid of new approaches, and it often happens that such methods also lead to solutions (I have not seen such things in my life 😉)
 
Is the "wireless x-box type controller" an actual Xbox controller or a generic controller that uses the Xbox driver?
I thought it could be useful to tick the Windows drivers because I'm not sure about the controller's response (especially signal/noise levels). Maybe someone is also playing wirelessly within 5 m of the computer. That would create IRQs that your computer doesn't know about.

I used a PS3 controller for a long time with an unsigned driver that acted as a bridge to the Xbox drivers. The moment I bought a real Xbox controller, some games that used to CTD stopped misbehaving. It never caused a BSOD though.

I really don't have many clues on this case :/
 