Frequent BSODs

Shroopy

Hello, my PC has had a problem for a while of crashing randomly. It happens on average once every few days, usually when I'm away from the computer. I'll come back and see that the PC has restarted on its own. It tends to happen in front of me when I'm not giving any mouse or keyboard input, like when I'm watching a (non-youtube, it doesn't seem to happen with YT) video or when I'm sitting in a call. It seems like the trouble might have to do with the computer going into sleep mode? It actually properly bluescreened for the first time today, giving me a KERNEL_SECURITY_CHECK_FAILURE message. I remembered this forum exists, so I thought I'd see if I could get any help.
 


Hello again, and welcome.

Of the five dumps uploaded, four are Nvidia graphics related: the nvlddmkm.sys Nvidia graphics driver is referenced in all four. Two of these are TDR timeout failures, that's the Windows Timeout Detection and Recovery feature trying to recover by resetting the graphics card and driver - that will cause a crash to desktop. The first suspect in these cases is the Nvidia graphics driver, the version you have is recent...
Code:
2: kd> lmvm nvlddmkm
Browse full module list
start             end                 module name
fffff807`a1fb0000 fffff807`a5a06000   nvlddmkm T (no symbols)          
    Loaded symbol image file: nvlddmkm.sys
    Image path: nvlddmkm.sys
    Image name: nvlddmkm.sys
    Browse all global symbols  functions  data
    Timestamp:        Fri Mar  1 19:55:35 2024 (65E21697)
    CheckSum:         0393217E
    ImageSize:        03A56000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
    Information from resource tables:
However the Nvidia drivers download website has a more recent driver (552.12 dated 4th April 2024). I would initially suggest that you install that driver.
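
As an aside, the hex value in brackets after Timestamp is the driver's build time in seconds since 1st January 1970 (UTC), and the debugger prints it in the local time zone of whoever opens the dump. Here is a minimal Python sketch for decoding it yourself, assuming the driver was built with a real timestamp. The function name is my own and not part of any tool.

```python
from datetime import datetime, timezone

def decode_driver_timestamp(hex_stamp: str) -> datetime:
    """Convert the hex timestamp shown by lmvm into a UTC datetime."""
    seconds_since_epoch = int(hex_stamp, 16)
    return datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)

# The value from the lmvm output above
print(decode_driver_timestamp("65E21697").isoformat())
# -> 2024-03-01T17:55:35+00:00
```

That is 17:55 UTC, two hours behind the 19:55 shown in the dump output, which is simply the time zone difference.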

It's also well worth removing the graphics card and re-seating it firmly. Sometimes that helps with these kinds of issues.

I would also check your chipset drivers because your System log contains a number of WHEA errors for the PCIe root port...
Code:
Log Name:      System
Source:        Microsoft-Windows-WHEA-Logger
Date:          15/04/2024 01:11:29
Event ID:      17
Task Category: None
Level:         Warning
Keywords:      
User:          LOCAL SERVICE
Computer:      DESKTOP-677QF36
Description:
A corrected hardware error has occurred.

Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)

Primary Bus:Device:Function: 0x0:0x1:0x0
Secondary Bus:Device:Function: 0x0:0x0:0x0
Primary Device Name:PCI\VEN_8086&DEV_4C01&SUBSYS_373317AA&REV_01
Secondary Device Name:
These may be symptoms of the graphics card issue of course, but it's important to check the Lenovo website for updated chipset drivers.
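
For anyone wondering how to read the Primary Device Name in that event, it is a standard PCI hardware ID: VEN is the vendor, DEV the device, SUBSYS is a subsystem ID followed by the subsystem vendor ID, and REV is the revision. Below is a rough Python sketch that splits it up. The function name and the small vendor table are my own, and the table only covers the two IDs seen here (8086 is Intel, 17AA is Lenovo).

```python
import re

# Layout of a PCI hardware ID: VEN = vendor, DEV = device,
# SUBSYS = subsystem ID followed by subsystem vendor ID, REV = revision.
HARDWARE_ID = re.compile(
    r"PCI\\VEN_(?P<vendor>[0-9A-F]{4})"
    r"&DEV_(?P<device>[0-9A-F]{4})"
    r"&SUBSYS_(?P<subsystem>[0-9A-F]{4})(?P<subsystem_vendor>[0-9A-F]{4})"
    r"&REV_(?P<revision>[0-9A-F]{2})",
    re.IGNORECASE,
)

# Only the two vendor IDs that appear in this thread; not a full list.
KNOWN_VENDORS = {"8086": "Intel", "17AA": "Lenovo"}

def parse_hardware_id(text: str) -> dict:
    """Split a PCI hardware ID into its fields (all values upper-case hex)."""
    match = HARDWARE_ID.search(text)
    if match is None:
        raise ValueError(f"not a PCI hardware ID: {text!r}")
    return {name: value.upper() for name, value in match.groupdict().items()}

fields = parse_hardware_id(r"PCI\VEN_8086&DEV_4C01&SUBSYS_373317AA&REV_01")
print(fields)
print(KNOWN_VENDORS.get(fields["vendor"]), KNOWN_VENDORS.get(fields["subsystem_vendor"]))
```

So the root port is an Intel part on a Lenovo board, which is why the Lenovo site is the place to look for the chipset driver.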

BTW. Your System and Application logs don't even cover a whole day. If you regularly clean these logs then please stop. They often contain valuable clues, especially historical clues, when you're having problems like this.

The fifth dump is an outlier in that nvlddmkm.sys isn't referenced, and it doesn't look like a graphics operation was in progress either. The dump contains a 0xC000001D exception code, that's an illegal instruction execution attempt. Whilst several things can cause this exception, bad RAM is one of the common causes. In addition, two of the nvlddmkm.sys dumps fail with 0xC0000005 exceptions - a memory access violation. This exception can also have many causes, bad RAM being a common one. For these reasons I think it's worth the effort of running Memtest86 on your RAM to see whether that's the real cause...
  1. Download Memtest86 (free), use the imageUSB.exe tool extracted from the download to make a bootable USB drive containing Memtest86 (1GB is plenty big enough). Do this on a different PC if you can, because you can't fully trust yours at the moment.
  2. Then boot that USB drive on your PC, Memtest86 will start running as soon as it boots.
  3. If no errors have been found after the four iterations of the 13 different tests that the free version does, then restart Memtest86 and do another four iterations. Even a single bit error is a failure.
Let us know how all that goes. If it BSODs again after you've done all the above then please run the SysnativeBSODCollectionApp again and upload the new output. And PLEASE stop clearing logs!
 
Really sorry about the logs! I haven’t cleared them manually, nor does anything I run clear them as far as I know.
I’ll try those suggestions tomorrow.
 
  1. Open the Event Viewer by entering the Run command eventvwr.
  2. In there expand the Windows Logs in the left-hand pane.
  3. Right-click on System and then select Properties from the menu
  4. In the System log Properties check that the size of the log file is large. Mine is set to 1048576kB (which is 1GB), if you're short of system drive space make it a bit smaller.
  5. Click OK to close that dialog
  6. Repeat for the Application log, right-click and select Properties. My Application log is also 1048576kB.
  7. The other logs you can leave at their default values.
The System and Application logs are the ones that contain the most useful information. Mine are large because I have ample space on my system drive and so that I keep a decent history - my System log goes back to June 2022, which is overkill really but better than only having a single day.
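
If you'd rather set the sizes from a command prompt than click through the dialogs, the built-in wevtutil tool can do it, but note that it takes the size in bytes rather than kB. Here is a small Python sketch that does the conversion and prints the commands. The helper names are mine, the sketch only builds the command text and doesn't run it, and you would need an elevated prompt to run the result.

```python
def kb_to_bytes(size_kb: int) -> int:
    """Event Viewer shows sizes in kB; wevtutil expects bytes."""
    return size_kb * 1024

def build_resize_command(log_name: str, size_kb: int) -> list[str]:
    """Build (but do not run) a wevtutil command that sets the maximum log size."""
    return ["wevtutil", "sl", log_name, f"/ms:{kb_to_bytes(size_kb)}"]

for log_name in ("System", "Application"):
    print(" ".join(build_resize_command(log_name, 1048576)))
```

For comparison, a 20480kB log is only 20MB, which can fill up very quickly on a system that is logging a lot of errors.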
 
My log file size is set to 20480kb for system and application logs. I've increased the size to 1gb for future troubleshooting. Again I'm really sorry about the logs - I haven't cleaned them manually at all, I don't know why they disappeared yesterday. I've also updated my graphics driver to the latest version, and reinstalled my chipset driver.
 
I've been having more BSODs. I will try memtest86 next. I wanted to say that I tried the built in windows memory checker today out of curiosity, and that seems to have been what wiped out my logs. My logs now only date back to ~30 minutes ago when the memory test finished.
 
That's unlikely but not impossible. In any case, the Windows Memory Diagnostic is not very thorough, which is why we recommend Memtest86. Though even Memtest86 isn't perfect; by far the best test of your RAM is to remove one stick and run with just the one for a few days, or until you get a BSOD. Then swap sticks and run with just the other one for a few days, or until you get a BSOD. Be sure to use the correct motherboard slot for a single RAM stick.

I'm wondering whether your problem might be system drive related? One thing at a time though, let's test your RAM first.
 
I've run Memtest86 twice with no errors. I've attached another report from the Sysnative collection app. Since you mentioned that my system drive might be at fault, I've also attached a readout from CrystalDiskInfo on my system drive, in case that might be helpful for you. I also ran chkdsk on the drive, and Windows believes that it's fine. I'll try reseating my graphics card next.
Thank you for your patience and help!
 


Those latest two dumps (16th April) are both graphics errors, so I now think that your problem may have been a graphics problem after all. Both the recent dumps are TDR failures, the Windows Timeout Detection and Recovery feature as already mentioned, and so we're looking at either the graphics card or the graphics driver.

However, I note that you have been using FitGirlRepacks, presumably to download pirated games? There are two problems with this; the first is that assisting with any pirated software is against the rules of the forum, the second is that any pirated software is highly likely to contain trojans and viruses and this could potentially be the cause of your problems.
 
I've reseated my graphics card, and I'm still getting crashes. I have Malwarebytes, so I'm sure that viruses are off the table, especially since the crashes have been going on before I downloaded those things. Also, I'm not asking for help with pirated software. I've attached another sysnative log.
 


I appreciate that you're not asking for help WITH pirated software but the links to FitGirlRepacks and your tacit admission above that you do have pirated software installed is of great concern. For all I know that may be the root cause of these issues.

It's up to you what you do of course but I'm not comfortable helping to fix a PC that is, or which looks like it is, running pirated software. I will step back now and allow others on here to assist you if they so choose.
 
I've reformatted my drive, reinstalled windows, swapped out the graphics card, swapped out the PSU, swapped out the CMOS battery, and still am having these crashes.
 
Can you upload another SysnativeFileCollectionApp output file please? If you have reinstalled Windows and you're still having these BSODs then one of two things is the cause...
  1. This is a hardware problem
  2. You have reinstalled the software/driver that's causing the problem
You can test for both of those possibilities by starting Windows in Safe Mode. In Safe Mode a stripped-down Windows system is loaded, with only critical services and drivers loaded. Typically no third-party drivers are loaded. This does mean that you won't be able to do any useful work in Safe Mode, or play games, and many of your devices may not work properly (or at all) because their drivers have not been loaded. Your display will be low resolution for example, because you'll be using only the Windows basic display driver.

The usefulness of Safe Mode is that because it's a stripped-down system consisting of Microsoft services and drivers it's very stable, so if you get BSODs or crashes in Safe Mode you have a hardware problem. On the other hand, if it's stable in Safe Mode then your problem is with a third-party driver or service that wasn't loaded in Safe Mode. There is another technique we can use in that case to locate the problem service or driver.
 
That BSOD wasn't in Safe Mode though, was it? Is it stable in Safe Mode?

This BSOD is a VIDEO_TDR_FAILURE, which happens when a graphics hang occurs and the Windows Timeout Detection and Recovery (TDR) feature is unable to recover, so the system BSODs. The cause of this BSOD is either a bad graphics card, a bad graphics driver, or some other hardware (like the motherboard slot). You've swapped the card, and the driver looks very recent (June 2nd 2024). You could try back-level graphics drivers (try the three most recent drivers) using DDU in between to ensure complete driver removal.

The potential causes are still the same two I mentioned earlier...
  • This is a hardware problem
  • You have reinstalled the software/driver that's causing the problem
Running in Safe Mode will help confirm whether this is hardware because it will eliminate (most of) your third-party services and drivers. TBH when you do a clean install of Windows for troubleshooting purposes you should always stop once Windows and all updates and drivers are installed and test the system thoroughly in that pristine state. I can see from the recent dump that you went on to reinstall your third party apps and drivers as well - it's thus entirely possible that you have reinstalled the problem.
 
I'm trying to do safe mode, but I'm having trouble - The crashes seem to consistently happen when I'm using Discord and/or OpenTogetherTube, but both of those require an internet connection. I tried turning on Safe Mode with Networking, but I didn't actually have networking. I'm using wifi, not ethernet.
 
It's unlikely you'll be able to run those apps in Safe Mode because third-party services are not loaded in Safe Mode. When running in Safe Mode you need to test it as much as you are able with what works. The objective is to see whether it crashes in Safe Mode so you do need to use it as much as you can and for long enough to have normally had a crash. If you can't get it to crash in Safe Mode then we're looking at a software/driver problem which you may have reinstalled.

On the other hand, the dumps you uploaded all point at the Nvidia graphics driver as the problem, though it could still be the card itself of course. The version of nvlddmkm.sys that you have installed is back-level, dating from 4th April 2024...
Code:
1: kd> lmvm nvlddmkm
Browse full module list
start             end                 module name
fffff806`87e80000 fffff806`8b8e8000   nvlddmkm T (no symbols)           
    Loaded symbol image file: nvlddmkm.sys
    Image path: nvlddmkm.sys
    Image name: nvlddmkm.sys
    Browse all global symbols  functions  data
    Timestamp:        Tue Apr  2 23:30:52 2024 (660C6AFC)
    CheckSum:         03946599
    ImageSize:        03A68000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
    Information from resource tables:
The Nvidia drivers download site lists four later driver versions for your GTX 1660 Super. I suggest you download the latest version (555.99 June 4th 2024) and clean install that. It might be wise to use DDU to uninstall previous driver versions first however.

It's also worth you visiting the official Lenovo driver download site for your PC, there was a critical patch update issued in Dec 2023 and there are some keyboard driver updates dated Jan 2024. I would check that you also have the latest version of all other drivers available there.
 
Are you sure you're reading the new sysnative logs? I'm not using a 1660 super anymore, I'm using a rtx 4060. I'll get those lenovo drivers installed just in case, and I already updated my nvidia driver. I'll also try more safe mode testing.
 
I did look at the wrong Sysnative file, apologies. When did you change the graphics card? I can see where you said that you'd 'swapped out the graphics card' but you didn't make it clear that you'd upgraded it. It's extraordinarily difficult to troubleshoot a system when you keep changing the hardware without being clear what new hardware is being installed. Every time you change the hardware, or change anything we haven't asked you to, you effectively reset all our troubleshooting - because you've created a different platform.

Did you use DDU to remove all traces of the GTX 1660 driver before you installed the RTX 4060? The dump from 22nd June, which is after your June 18th post, is another 0x116 VIDEO_TDR_FAILURE - exactly the same BSOD you were seeing before you upgraded the graphics card. The 0x116 indicates that there was a problem in the graphics subsystem and the Windows TDR was unable to reset the card. You can see this clearly in the latest dump (from 22nd June)...
Code:
0: kd> k
 # Child-SP          RetAddr               Call Site
00 ffffb985`6d70f128 fffff805`3f9cb1ee     nt!KeBugCheckEx
01 ffffb985`6d70f130 fffff805`3f97d012     dxgkrnl!TdrBugcheckOnTimeout+0xfe
02 ffffb985`6d70f170 fffff805`3f975589     dxgkrnl!ADAPTER_RENDER::Reset+0x12a
03 ffffb985`6d70f1a0 fffff805`3f9ca945     dxgkrnl!DXGADAPTER::Reset+0x60d
04 ffffb985`6d70f250 fffff805`3f9caaa2     dxgkrnl!TdrResetFromTimeout+0x15
05 ffffb985`6d70f280 fffff805`392ce2d5     dxgkrnl!TdrResetFromTimeoutWorkItem+0x22
06 ffffb985`6d70f2c0 fffff805`3920e957     nt!ExpWorkerThread+0x155
07 ffffb985`6d70f4b0 fffff805`3941f3b4     nt!PspSystemThreadStartup+0x57
08 ffffb985`6d70f500 00000000`00000000     nt!KiStartSystemThread+0x34
You read this stack from the bottom up, it's a push-down stack of the function calls leading up to the bugcheck. You see the thread start up, then dxgkrnl.sys (the Windows DirectX driver) encounters a graphics timeout and the TDR function is called. You can see the TDR function of dxgkrnl.sys reset the driver (dxgkrnl!TdrResetFromTimeout+0x15) and then reset the graphics card (dxgkrnl!DXGADAPTER::Reset+0x60d and dxgkrnl!ADAPTER_RENDER::Reset+0x12a) and still the graphics issue persists because we get another graphics timeout which leads to the bugcheck.
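
If reading bottom-up feels awkward, here is a rough Python sketch that parses the stack text above and prints the calls oldest first, with the module split from the function name. The parsing is my own quick approach and assumes each frame has exactly four whitespace-separated columns, as in this dump.

```python
STACK = """\
00 ffffb985`6d70f128 fffff805`3f9cb1ee     nt!KeBugCheckEx
01 ffffb985`6d70f130 fffff805`3f97d012     dxgkrnl!TdrBugcheckOnTimeout+0xfe
02 ffffb985`6d70f170 fffff805`3f975589     dxgkrnl!ADAPTER_RENDER::Reset+0x12a
03 ffffb985`6d70f1a0 fffff805`3f9ca945     dxgkrnl!DXGADAPTER::Reset+0x60d
04 ffffb985`6d70f250 fffff805`3f9caaa2     dxgkrnl!TdrResetFromTimeout+0x15
05 ffffb985`6d70f280 fffff805`392ce2d5     dxgkrnl!TdrResetFromTimeoutWorkItem+0x22
06 ffffb985`6d70f2c0 fffff805`3920e957     nt!ExpWorkerThread+0x155
07 ffffb985`6d70f4b0 fffff805`3941f3b4     nt!PspSystemThreadStartup+0x57
08 ffffb985`6d70f500 00000000`00000000     nt!KiStartSystemThread+0x34
"""

def parse_stack(text: str) -> list[tuple[str, str]]:
    """Return (module, function) pairs in the order the debugger printed them."""
    frames = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip headers and blank lines
        module, _, rest = parts[3].partition("!")
        function = rest.split("+")[0]  # drop the +0x... offset
        frames.append((module, function))
    return frames

# The debugger lists the newest call first, so reverse it to get call order.
for module, function in reversed(parse_stack(STACK)):
    print(f"{module:8} {function}")
```

The only modules that appear are nt and dxgkrnl, both Microsoft, so there is no third-party driver on this particular stack.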

Clearly the issue is not the graphics card because you've had this BSOD with two different cards. It's not the graphics driver either because you've had this issue with two different graphics drivers (assuming that you did use DDU to remove all traces of the GTX 1660 driver). The only hardware left is the motherboard.

However, before you swap out the motherboard you want to be 100% certain that there isn't a software or driver cause. There are no third party drivers on the full call stack but it's still not impossible that you reinstalled the original problem. That's why I keep encouraging you to run in Safe Mode.

One other thing that it may be worth you trying is to enable Driver Verifier to see whether that catches a rogue driver...

Driver Verifier subjects selected drivers (typically all third-party drivers) to extra tests and checks every time they are called. These extra checks are designed to uncover drivers that are misbehaving. If any selected driver fails any of the Driver Verifier tests/checks then Driver Verifier will BSOD. The resulting minidump should contain enough information for us to identify the flaky driver. It's thus essential to keep all minidumps created whilst Driver Verifier is enabled.

To enable Driver Verifier do the following:

1. Take a System Restore point and/or take a disk image of your system drive (with Acronis, Macrium Reflect, or similar). It is possible that Driver Verifier may BSOD a driver during the boot process (some drivers are loaded during boot). If that happens you'll be stuck in a boot-BSOD loop.

If you should end up in a boot-BSOD loop, boot the Windows installation media and use that to run system restore and restore to the restore point you took, to remove Driver Verifier and get you booting again. Alternatively you can use the Acronis, Macrium Reflect, or similar, boot media to restore the disk image you took.

Please don't skip this step. It's the only way out of a Driver Verifier boot-BSOD loop.

2. Start the Driver Verifier setup dialog by entering the command verifier in either the Run command box or in a command prompt.

3. On that initial dialog, click the radio button for 'Create custom settings (for code developers)' - the second option - and click the Next button.

4. On the second dialog check (click) the checkboxes for the following tests...
  • Special Pool
  • Force IRQL checking
  • Pool Tracking
  • Deadlock Detection
  • Security Checks
  • Miscellaneous Checks
  • Power framework delay fuzzing
  • DDI compliance checking
Then click the Next button.

5. On the next dialog click the radio button for 'Select driver names from a list' - the last option - and click the Next button.

6. On the next dialog click on the 'Provider' heading, this will sort the drivers on this column (it makes it easier to isolate Microsoft drivers).

7. Now check (click) ALL drivers that DO NOT have Microsoft as the provider (ie. check all third-party drivers).

8. Then, on the same dialog, check the following Microsoft drivers (and ONLY these Microsoft drivers)...
  • Wdf01000.sys
  • ndis.sys
  • fltMgr.sys
  • Storport.sys
These are high-level Microsoft drivers that manage lower-level third-party drivers that we otherwise wouldn't be able to trap. That's why they're included.

9. Now click Finish and then reboot. Driver Verifier will be enabled.

Be aware that Driver Verifier will remain enabled across all reboots and shutdowns. It can only be disabled manually.

Also be aware that we expect BSODs. Indeed, we want BSODs, to be able to identify the flaky driver(s). You MUST keep all minidumps created whilst Driver Verifier is running, so disable any disk cleanup tools you may have.

10. Leave Driver Verifier running for 48 hours, use your PC as normal during this time, but do try and make it BSOD. Use every game or app that you normally use, and especially those where you have seen it BSOD in the past. If Windows doesn't automatically reboot after each BSOD then just reboot as normal and continue testing. The Driver Verifier generated BSODs are these...
  • 0xC1: SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION
  • 0xC4: DRIVER_VERIFIER_DETECTED_VIOLATION
  • 0xC6: DRIVER_CAUGHT_MODIFYING_FREED_POOL
  • 0xC9: DRIVER_VERIFIER_IOMANAGER_VIOLATION
  • 0xD6: DRIVER_PAGE_FAULT_BEYOND_END_OF_ALLOCATION
  • 0xE6: DRIVER_VERIFIER_DMA_VIOLATION
If you see any of these BSOD types then you can disable Driver Verifier early because you'll have caught a misbehaving driver.
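
To make the check above concrete, here is a trivial Python sketch of it as a lookup: give it the bugcheck code from a BSOD and it tells you whether it is one of the Driver Verifier ones listed. The table is copied from the list above and the function name is my own.

```python
# Bugcheck codes that Driver Verifier generates, as listed above.
VERIFIER_BUGCHECKS = {
    0xC1: "SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION",
    0xC4: "DRIVER_VERIFIER_DETECTED_VIOLATION",
    0xC6: "DRIVER_CAUGHT_MODIFYING_FREED_POOL",
    0xC9: "DRIVER_VERIFIER_IOMANAGER_VIOLATION",
    0xD6: "DRIVER_PAGE_FAULT_BEYOND_END_OF_ALLOCATION",
    0xE6: "DRIVER_VERIFIER_DMA_VIOLATION",
}

def is_verifier_bugcheck(code: int) -> bool:
    """True if this bugcheck code is one that Driver Verifier raises."""
    return code in VERIFIER_BUGCHECKS

print(is_verifier_bugcheck(0xC4))   # a Driver Verifier BSOD
print(is_verifier_bugcheck(0x116))  # VIDEO_TDR_FAILURE, not from Driver Verifier
```

A 0x116 whilst Driver Verifier is running is still worth uploading, but it doesn't mean Driver Verifier has caught anything, so keep it enabled in that case.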

Note: Because Driver Verifier is doing extra work each time a third-party driver is loaded you will notice some performance degradation with Driver Verifier enabled. This is a price you'll have to pay in order to locate any flaky drivers. And remember, Driver Verifier can only test drivers that are loaded, so you need to ensure that every third-party driver gets loaded by using all apps, features and devices.

11. To turn Driver Verifier off enter the command verifier /reset in either the Run command box or a command prompt and reboot.

Should you wish to check whether Driver Verifier is enabled or not, open a command prompt and enter the command verifier /query. If drivers are listed then it's enabled, if no drivers are listed then it's not.

12. When Driver Verifier has been disabled, navigate to the folder C:\Windows\Minidump and locate all .dmp files in there that are related to the period when Driver Verifier was running (check the timestamps). Zip these files up if you like, or not as you choose. Upload the file(s) to the cloud with a link to it/them here (be sure to make it public).
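
If there are a lot of dumps in that folder, a short Python sketch like the one below can pick out the ones from the Driver Verifier period by their last-modified time. The function name and the two dates are placeholders of mine, so substitute the times you actually enabled and disabled Driver Verifier. You may need to run it from an elevated prompt to read the Minidump folder.

```python
from datetime import datetime
from pathlib import Path

def dumps_between(folder: Path, start: datetime, end: datetime) -> list[Path]:
    """List .dmp files in folder whose last-modified time falls within [start, end]."""
    selected = []
    for dump in sorted(folder.glob("*.dmp")):
        modified = datetime.fromtimestamp(dump.stat().st_mtime)  # local time
        if start <= modified <= end:
            selected.append(dump)
    return selected

if __name__ == "__main__":
    # Hypothetical dates: replace with when you enabled and disabled Driver Verifier.
    enabled = datetime(2024, 6, 24, 9, 0)
    disabled = datetime(2024, 6, 26, 9, 0)
    for dump in dumps_between(Path(r"C:\Windows\Minidump"), enabled, disabled):
        print(dump.name)
```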
 