
Random freezes/BSODs. No clue what's the cause.

TheFlow polycounter lvl 9
Hi guys, 

Honestly, I'm out of ideas. ZBrush crashes all the time, and Photoshop runs smoothly for a few hours and then eventually crashes, too. 3D modeling itself works well, but when I hit the render button it usually crashes after a few minutes of rendering. I have tried everything I know.

The blue screen messages are different almost every time. Sometimes they are KMODE exceptions, sometimes they point to the graphics driver, NTFS.sys, or some other driver. This happens even with a frequently updated system and manually installed hardware drivers. I don't get it.

What I already tried:
- Checked the RAM with Memtest86+ overnight, no issues there
- Checked the CPU cooler; the CPU never goes over 75°C, even under heavy load
- Ran Check Disk (chkdsk) and looked at Samsung Magician for my SSD, which reports that the SSD's health is fine
- Reassembled the complete system and reconnected every single cable in it
- Windows and drivers freshly installed by hand; every driver comes from the respective manufacturer's site
- GPU is brand new; the problems were exactly the same with the old NVIDIA GTX 560 Ti

Crashes also happen occasionally while streaming video or doing low-load tasks like Word, Excel, etc., but not as often as under heavy load.


This is my system:

CPU:          Intel Core i7-2600K (Sandy Bridge-DT XE, D2) 3400 MHz (34.00x100.0) @ 1600 MHz (16.00x100.0)
Mboard:    MSI P67A-G45 (MS-7673)
Chipset:    Intel P67 (Cougar Point) [B3]
Memory:    24576 MBytes @ 666 MHz, 9-9-9-24
                   - 8192 MB PC10600 DDR3 SDRAM - Kingston 9905403-888.A00LF
                   - 4096 MB PC10600 DDR3 SDRAM - Kingston 9905403-038.A00G
                   - 8192 MB PC10600 DDR3 SDRAM - Kingston 9905403-888.A00LF
                   - 4096 MB PC10600 DDR3 SDRAM - Kingston 9905403-038.A00G
Graphics:  NVIDIA GeForce GTX 1060 6GB (GP106-400) [Gainward CardExpert]
                   NVIDIA GeForce GTX 1060 6GB, 6144 MB GDDR5 SDRAM
Drive:         Samsung SSD 840 PRO Series, 125.0 GB, Serial ATA 6Gb/s @ 6Gb/s
Drive:         WDC WD5000AAKS-007AA0, 488.4 GB, Serial ATA 3Gb/s
Drive:         SAMSUNG HD200HJ, 195.4 GB, Serial ATA 3Gb/s

Sound:      Intel Cougar Point PCH - High Definition Audio Controller [B3]
Sound:      NVIDIA GP106 - High Definition Audio Controller
Netw.:        RealTek Semiconductor RTL8168/8111 PCI-E Gigabit Ethernet NIC
OS:            Microsoft Windows 10 Professional (x64) Build 15063.413 (RS2)
PSU:          Super Flower - Golden Green HX 450W 80+ Gold 


Any ideas? I'm thankful for any pointer in the right direction. 

Best regards and thank you,
-Flo

Replies

  • Farfarer
    My suspicion would be a bad RAM stick, where the bad region isn't touched until memory usage is high. Try removing one of the pairs first and see if the issue remains, then swap in the other pair.

    I know you've run Memtest, but I've had similar issues in the past with RAM that passed diagnostic tests but still caused BSODs.

    Everything else is harder to test unless you happen to have spare known-to-be-good parts around to replace them. Ultimately you're going to have to strip back the installed hardware a bit at a time until the issue doesn't manifest any more.
  • TheFlow
    TheFlow polycounter lvl 9
    Thanks, Farfarer! I forgot to mention that I already tried this a couple of times, and unfortunately the crashes still happen. I also tried the RAM sticks one at a time in different slots, but the crashes still occur. To be honest, I think they happen less often that way, but I didn't count.

    The worst part is that this sh*t mostly happens when I'm already in a deep state of work, and it is completely unpredictable. :(
  • rollin
    rollin polycounter
    Hi TheFlow, I had/have the same problem as you. Here it was caused by RAM. I was not able to get rid of it completely (I swapped in new RAM and get way, way fewer BSODs, but they still happen from time to time). I guess it could be something with the motherboard or heat.
    I have the same CPU as you, but I can't think of a plausible explanation for why it would be the culprit.

    As lame as it is, IMO your only option is to swap out your hardware piece by piece to find the cause.

    As a side note: on my work rig I had a very strange problem where an HDD controller was not correctly configured after switching from Win7 to Win10. This also caused random crashes, but no bluescreens. What I want to say is that there is a small chance that it's a driver issue.
  • throttlekitty
    It really does sound like a RAM problem, I'm with Farfarer on that.

    I'd also take a look at power. Watch the PSU voltages with HWMonitor or something similar and check that they stay constant, or rather that they fluctuate within spec. Generally 5% is acceptable, and the 5 and 12 volt lines are the more important ones to watch. Also run some benchmarks like prime95 while you do so, to put the system under load.
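
    To make the 5% rule concrete, here is a minimal sketch that checks logged rail readings against nominal voltage plus or minus 5%. The names (`NOMINAL_RAILS`, `rail_limits`, `out_of_spec`) and the sample readings are made up for illustration; in practice you would paste in values logged by your monitoring tool, and software sensor readings can be inaccurate in themselves.

```python
# Nominal rail voltages and the 5% tolerance mentioned above.
# All names and sample values here are illustrative assumptions.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) acceptable voltage for a nominal rail value."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def out_of_spec(samples, tolerance=TOLERANCE):
    """Map each rail name to the readings that fall outside its limits.

    `samples` maps a rail name to a list of logged voltage readings.
    Rails with no offending readings are left out of the result.
    """
    bad = {}
    for rail, readings in samples.items():
        low, high = rail_limits(NOMINAL_RAILS[rail], tolerance)
        offenders = [v for v in readings if not (low <= v <= high)]
        if offenders:
            bad[rail] = offenders
    return bad


if __name__ == "__main__":
    # Hypothetical readings logged while a stress test was running.
    logged = {"+12V": [12.1, 11.2, 11.9], "+5V": [5.02, 4.98]}
    for rail, values in out_of_spec(logged).items():
        print(rail, "out of spec:", values)
```

    For the 12 volt line this gives a window of roughly 11.4 V to 12.6 V, and for the 5 volt line roughly 4.75 V to 5.25 V. A reading that dips below the window only under load is the kind of thing to look for.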
  • Bek
    Bek interpolator
    I wouldn't expect a blue screen with a PSU issue; I think that usually ends in a lockup or sudden shutdown / failure to POST. How old is the PSU?

    Did you reinstall Windows from the same source? If a crash is guaranteed after x time under heavy load, try downloading a Linux distro and running it from a USB stick, do something intensive there, and see if it crashes, just to check whether it's a hardware issue or a software issue. At least if you know for sure that it's a hardware issue, you can start considering which components to test and replace.

    (I wonder if VRAM could suffer corruption? Can you remove the 1060 and test with onboard video?)

    Also, are you running an overclock on your CPU? If so it might be time to dial it back.
  • TheFlow
    TheFlow polycounter lvl 9
    Hi rollin, throttlekitty and Bek! Thanks for your help, guys!
    I already tested prime95 a couple of weeks ago and it ran smoothly without any problems. :) Like I said, I don't really think it is heat, since I watched the temperatures and they never went up too high. I also checked the voltages etc. via HWInfo, and the rates were constant enough for me, but I have no clue about these things, so my judgement may well be off. :) My power supply is about 4 years old. I have never overclocked anything on my system, apart from the occasional boost that is built into the CPU and kicks in automatically.

    What I found out (and since then, yesterday evening, everything has run smoothly even under high load): my Windows 10 hadn't installed the latest CPU driver. Even though I updated constantly and also installed the drivers by hand (via the Intel installer), Windows for some reason kept using the older ones from 2013.

    Rollin, check your Device Manager under "System devices", probably the top two entries (or something similar; my Windows is German, so I can't tell you the English names right now). They are "2nd generation Intel Core Processor family DRAM Controller" and "PCI Express controller". Both were the old version; right-clicking them and selecting "Update driver" did the job. If this really is the solution, I'll be so glad. But I won't believe that was it... not yet.
  • thomasp
    thomasp hero character
    RAM and PSU have been pretty much the only causes of hardware lockups/BSODs and sudden reboots for me in the past.

    might be worth just buying those fresh and swapping them in to see how the system behaves (separately, obviously). that is, if you have a no-questions-asked return policy in your country.

    PSU can be tricky to identify as the culprit if it's borderline maxed out.

  • igi
    igi polycounter lvl 12
    Most of the time it's a RAM mismatch or similar. Try lowering your RAM frequency (MHz) and increasing your CAS latency, and make sure the CAS latency is compatible with all the RAM modules you use. Use only identical RAM modules, and try running the RAM in single channel (i.e. slots 1,2 or 3,4 instead of the dual-channel 1,3 or 2,4; I once had a similar problem). Flash your mainboard BIOS to the latest version. As a fallback option, use only a single RAM stick and benchmark. Rearrange your SATA ports. These are the first things I'd try for diagnosis. Your PSU should be sufficient for your rig, but you can test it if you have onboard VGA or some low-power VGA card lying around.
  • TheFlow
    TheFlow polycounter lvl 9
    Ok, so after I manually selected the Intel CPU driver (for whatever strange reason Windows wasn't using the most recent driver but the one from 2013), the BSODs went down by at least 50%; I had just one in the next two days. But then they started to come more regularly again.

    After thinking for days about which way to go, I decided to buy a new PSU with even more power to be sure. Well, it looks like it was the power supply. I ran all the benchmarks and also heavy rendering, and for 24 hours the thing has been running smooth and safe. I hope I can say my hardware problems are solved. Gonna make sure to save some money on the side so I can build or buy a new machine if problems come my way again with the current rig.

    Thanks for the help everyone!!