[SOLVED] Chronic BSOD - Windows 7 x64

pchill

Contributor
Joined
Jul 22, 2014
Posts
7
I have had chronic random BSODs for the last 2 years on my custom-built HTPC. In general the crashes have pointed to video, although at times no video is running. This is an HTPC on which I run a lot of live video. No consistent activity causes the crashes. Here are my machine specs:
RAM XMS 8GB DDR3-1333 (PC3-10666) CL9 Dual Channel Desktop
PSU Corsair Builder Series 600 Watt ATX 12V Power Supply
MB Gigabyte GA-Z68AP-D3 LGA 1155 Z68 ATX Intel Motherboard
GPU Gigabyte Radeon HD 6850
CPU Core i5 2500k LGA 1155
DVD LG 12x Super Multi Blue Internal SATA 1.5Gb/s Blu-ray Combo
SSD OCZ Agility 3 AGT3-25SAT3-120G 6Gbs 2.5" Solid State Drive
Cablecard Ceton InfiniTV 4 PCie TV Tuner Card
Storage 1 WD Caviar Green 2TB 3.5" Internal
Storage 2 WD Caviar Green 2TB 3.5" Internal
Storage 3 WD - My Book 3TB External USB 3.0 Hard Drive - Black
2 Xbox 360s attached
OS Windows 7 64bit – Full Retail
Security MS Security Essentials

Most of the dumps were NTKernel related. I have always felt they might be related to the GPU. However, note point 9 below where I removed the GPU, which seems to debunk this theory. Here are the steps I have taken to try and mitigate this in order:

  1. Reloaded BIOS
  2. Took it to MicroCenter where they run their super test. They reloaded my GPU drivers and found no major issues. For my $100 I got a BSOD 3 weeks later.
  3. Reloaded Win 7 64 bit
  4. Memtest 86 – 9 passes, no errors reported
  5. Intel stress test on CPU, no errors reported
  6. Furmark stress test on GPU, no errors reported
  7. WD lifeguard test on HDDs, no errors reported
  8. Driver Verifier – Crash suggested ATIKPMAG.SYS
  9. Removed GPU, software, & drivers and installed Intel HD 3000
  10. Driver Verifier - New Crash suggests Intel driver

I was just about to go and buy a brand-new, factory-built machine, but a lot of components would carry over to the new HTPC, which raises the fear that I could carry the issue over with them.

Thank you much in advance
 


Hi,

PAGE_FAULT_IN_NONPAGED_AREA (50)

This indicates that invalid system memory has been referenced.

Bug check 0x50 usually occurs after the installation of faulty hardware or in the event of failure of installed hardware (usually related to defective RAM, be it main memory, L2 RAM cache, or video RAM).

Another common cause is the installation of a faulty system service.

Antivirus software can also trigger this error, as can a corrupted NTFS volume.

Code:
1: kd> k
Child-SP          RetAddr           Call Site
fffff880`03b9fc78 fffff800`0333c53b nt!KeBugCheckEx
fffff880`03b9fc80 fffff800`032bdcee nt! ?? ::FNODOBFM::`string'+0x43781
fffff880`03b9fde0 fffff880`041b470d nt!KiPageFault+0x16e
fffff880`03b9ff70 fffffa80`0b02b000 igdkmd64+0x11170d
fffff880`03b9ff78 fffff880`00000001 0xfffffa80`0b02b000
fffff880`03b9ff80 00000000`00000000 0xfffff880`00000001

Verifier flagged the Intel Graphics driver for referencing invalid memory. Right away this raises a flag, and it is possibly the problem: you have a dedicated GPU, yet your integrated graphics drivers are installed and enabled.
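As a rough illustration of how the stack above is read, here is a minimal Python sketch that walks the call-site column and reports the first module that is not the kernel (`nt`). The helper names (`call_sites`, `module_of`, `first_third_party`) are made up for this example and are not part of WinDbg. It only automates the eyeballing, and it is no substitute for a real debugger session.

```python
import re

# Stack text copied from the `k` output above, with colour markup removed.
STACK = """\
fffff880`03b9fc78 fffff800`0333c53b nt!KeBugCheckEx
fffff880`03b9fc80 fffff800`032bdcee nt! ?? ::FNODOBFM::`string'+0x43781
fffff880`03b9fde0 fffff880`041b470d nt!KiPageFault+0x16e
fffff880`03b9ff70 fffffa80`0b02b000 igdkmd64+0x11170d
fffff880`03b9ff78 fffff880`00000001 0xfffffa80`0b02b000
fffff880`03b9ff80 00000000`00000000 0xfffff880`00000001
"""

def call_sites(stack_text):
    """Return the third column (call site) of each stack line."""
    sites = []
    for line in stack_text.splitlines():
        parts = line.split(None, 2)  # Child-SP, RetAddr, Call Site
        if len(parts) == 3:
            sites.append(parts[2])
    return sites

def module_of(site):
    """Extract the module name before '!' or '+'; raw addresses give None."""
    match = re.match(r"([A-Za-z_][\w.]*)[!+]", site)
    return match.group(1) if match else None

def first_third_party(stack_text, kernel_modules=("nt",)):
    """First module on the stack that is not in kernel_modules."""
    for site in call_sites(stack_text):
        module = module_of(site)
        if module and module not in kernel_modules:
            return module
    return None

print(first_third_party(STACK))
```

Here the first non-kernel frame belongs to `igdkmd64`, the Intel graphics kernel-mode driver, which is why it gets the blame.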

Uninstall your Intel Graphics drivers ASAP, restart, boot into the BIOS, and then disable your integrated graphics.



A few other things I want to tackle as well..

1. wdcsam64.sys is listed and loaded which is the Western Digital SES (SCSI Enclosure Services) driver. Please remove this software ASAP as it's very troublesome and is also not necessary to the functionality of your system.

2. AppleCharger.sys is listed and loaded which is the GIGABYTE On/Off Charge driver. See here for details - GIGABYTE ON/OFF Charge

Very troublesome software, so please uninstall ASAP!

Regards,

Patrick
 
Thanks Patrick. I did the following:

Removed the 2 files you recommended
Removed the Intel onboard graphics
Reinstalled the GPU

Now I'm going to stress test it with Verifier and see how it goes.

Thank you much, again!!!
 
My pleasure, please keep me updated.

Disable verifier if you don't crash in 48 hrs.

Regards,

Patrick
 
INTERRUPT_EXCEPTION_NOT_HANDLED (3d)

Code:
STACK_TEXT:  
fffff800`00b9c9b8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : atikmdag+0x8c49f

The only thing in the stack at the time of the crash is the AMD/ATI video driver.



Aside from removing the Intel Graphics driver, you disabled integrated video, yes? If so, if you're currently on the latest ATI/AMD video card driver, try the beta. If you're on the beta, try the latest... etc. Keep going back until you find one that fits. If none of them solve the problem, this is likely a faulty video card itself. We'll cross that bridge if/when we have to.

Regards,

Patrick
 
Patrick,

A few points:
1. The BIOS was already set to turn off the integrated graphics. In this case the only option is "Turn off if PEG is detected."
2. I had another crash. I couldn't figure out what caused it, so I've uploaded another. Would you please be willing to take one more quick look at this one to confirm it looks like video?

Since this has been going on for 2 years, does it seem reasonably probable that it is the GPU? I think I'll just go get another one; it is probably time for an upgrade anyway.
 


Yes, especially since the 0x3D bug check, I am very inclined to believe this is a faulty GPU.

Regards,

Patrick
 
Hi,

I have upgraded my GPU, and I have definitely seen a significant reduction in BSODs. I have had 2 since: one when coming out of sleep mode, and the other just random, with the computer on and no programs open.

Separately, I ran Verifier for 3 days in between, but got no BSODs.

I have once again uploaded my files.
 


To be honest, I think it's bad RAM.

Code:
BugCheck D1, {0, 2, 8, 0}

Strangely, a null pointer is being dereferenced (first parameter 0), and it is also being executed (third parameter 8), hence the page fault.
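For anyone following along, here is a small Python sketch of how the four 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) parameters are usually read: the address referenced, the IRQL, the access type, and the address of the code that made the reference. The function name `decode_0xd1` and the dictionary keys are made up for this example; the bit meanings follow the standard bug check description as I understand it.

```python
def decode_0xd1(arg1, arg2, arg3, arg4):
    """Interpret the four parameters of bug check 0xD1 (illustrative only)."""
    if arg3 & 0x8:
        access = "execute"   # bit 3 set: the CPU tried to run code there
    elif arg3 & 0x1:
        access = "write"     # bit 0 set: write access
    else:
        access = "read"      # neither bit set: read access
    return {
        "referenced_address": arg1,
        "irql": arg2,
        "access": access,
        "faulting_instruction": arg4,
        "null_reference": arg1 == 0,
    }

# Values from the dump in this thread: BugCheck D1, {0, 2, 8, 0}
info = decode_0xd1(0x0, 0x2, 0x8, 0x0)
print(info)
```

With these values the access type decodes to "execute" at address 0, which matches the NULL_IP tag that shows up in the failure bucket.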

Code:
FAILURE_BUCKET_ID:  X64_0xD1_VRF_CODE_AV_NULL_IP_nt!KiPageFault+260

Driver Verifier was active for this bug check (the VRF in the bucket ID), which is another reason to believe it's hardware.




Code:
BugCheck 19, {3, fffffa80069c50a0, fffffa80069c50a0, 32fffa80069c50a0}

A list entry has been corrupted. At first glance it looks like a bit flip, but I don't think it is one.
Given that it's a minidump, and taking the other BSOD into account, I would say it's bad RAM.
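A quick way to check the bit-flip idea is to XOR the expected list pointer with the value that was read back and count the bits that differ. The Python sketch below uses the two values from the 0x19 dump; it assumes the second parameter is the expected value and the fourth is the corrupted read-back. The function name `flipped_bits` is made up for this example.

```python
def flipped_bits(expected, observed):
    """Return the XOR difference and how many bits differ."""
    diff = expected ^ observed
    return diff, bin(diff).count("1")

expected = 0xfffffa80069c50a0  # parameter 2: the pool entry being checked
observed = 0x32fffa80069c50a0  # parameter 4: the value read back

diff, count = flipped_bits(expected, observed)
print(f"{diff:#018x} differs in {count} bits")
```

The whole top byte changed from ff to 32, which is 5 differing bits rather than one, so it is not a classic single-bit flip.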

With all this said so far I suggest running Memtest86 for at least 8 passes.

Which one should I download?


You have two options to choose from. You can download the ISO version, burn it to a CD, and boot from that.
The other option is downloading the auto installer for USB sticks, you then boot from that USB stick.
Be warned though, it will format your USB then install the files needed to make it bootable so any files left over will be wiped off.

Download it here:

Memtest86+ - Advanced Memory Diagnostic Tool

So how does it work?

It works by writing a series of test patterns to most memory addresses over 9 tests. It then reads the data back and compares it for errors.

The default pass runs 9 different tests that vary in access patterns and test data. A tenth test is optional from the menu: it writes zeroes to all of memory, sleeps for 90 minutes, and then checks whether any addresses have changed. This takes 3 hours per pass.
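To make the write-then-read-back idea concrete, here is a toy Python sketch. It is not how Memtest86+ is implemented: real testers run outside the OS against physical RAM. `FlakyMemory` is a made-up stand-in that simulates one stuck bit, and `pattern_test` and the four patterns are picked just for illustration.

```python
class FlakyMemory:
    """Hypothetical stand-in for RAM: a bytearray with one optional stuck bit."""

    def __init__(self, size, stuck_at=None, stuck_mask=0x00):
        self.cells = bytearray(size)
        self.stuck_at = stuck_at      # address of the faulty cell, if any
        self.stuck_mask = stuck_mask  # bit(s) that always read back as 1

    def write(self, addr, value):
        if addr == self.stuck_at:
            value |= self.stuck_mask  # simulate the stuck-high bit
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr]


def pattern_test(mem, size, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern everywhere, read it back, and record mismatches."""
    errors = []
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            got = mem.read(addr)
            if got != pattern:
                errors.append((addr, pattern, got))
    return errors


good = FlakyMemory(64)
bad = FlakyMemory(64, stuck_at=10, stuck_mask=0x04)
print(pattern_test(good, 64))
print(pattern_test(bad, 64))
```

Note that the stuck bit only shows up for patterns where that bit should be 0. That is one reason a tester cycles through several patterns instead of just one.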

My memtest86 isn't booting! What should I do?

This can have a number of causes. A common one is your BIOS not using the correct settings; you might need to change your boot priority order.
Other causes include your motherboard not supporting bootable USB sticks in which case you'll need to use a CD (or floppy drive).

For any other issues, you might want to look here:

FAQ : please read before posting
 
I have run Memtest86+ and attached a screenshot. It did not report anything until passes 9 & 14, which show an error on test 9. Does that generally suggest a RAM issue, or will any machine produce an error if you run it long enough?

Separately, I'm rerunning it for each 4 GB stick. By pass 9, we are talking more than 16 hours. I'm just curious, as the answers I see seem to be all over the place.
 

Attachments

  • image.jpg
You will see all sorts of different recommendations for how many passes make a good benchmark. I generally say 8 because that's a solid few hours in which most of the important tests in the algorithm have run their course. However, pass 9 is just after that, so it's definitely valid. As an example, if you ran Memtest for a week straight and didn't see errors until the full week had passed, it would very likely be a false positive (as the memory has been running under stress for a long, continuous period).

All in all, I'd say you have yourself some bad RAM there.

Regards,

Patrick
 