Random BSOD, errors IRQL not less or equal/KMODE exception not handled

Page 2 of 3

  1. Posts : 11,269
    Windows 7 Home Premium 64 Bit
       #11

    I still suspect your RAID setup. The drivers are 2008 even after the update... Perhaps you need to install the chipset drivers first to support the new RAID drivers.

    I personally avoid RAID for these reasons: Why RAID is (usually) a Terrible Idea - Puget Custom Computers


  2. Posts : 12
    Windows 7 Professional 64bit
    Thread Starter
       #12

    Hi, ran some tests during the weekend...
    The good news is Memtest86+ ran for 10 hours (8 passes) and found 0 errors...
    The bad news starts with Prime95 and continues with IntelBurn.

    The first one (Prime95) stops core #3 after 0 to 2 tests, and core #4 shortly after - while cores #1 and #2 continue without problems (I'll attach the results.txt text).

    Then I checked the maximum recommended temp for my processor on the AMD site (roughly 72 °C) and started IntelBurn... It froze the PC two or three times after reaching 70 °C, usually after finishing the first of the 5 standard passes suggested in the guide. Once, the PC was shut down by Core Temp. I thought Core Temp itself might be causing the freezes, so I terminated it before running IntelBurn again: it froze anyway.
    I did get a BSOD, as well, but I forgot to write down which program was running (either Prime95 or IntelBurn, though).

    Attaching dump, results.txt and test.CSV file created with HWiNFO while running IntelBurn.

    Thanks
    ALex
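A log like the test.CSV mentioned above can be checked for its peak temperature with a short script. This is a minimal sketch, not HWiNFO's own tooling: the column name `CPU Temp [C]` and the sample rows are made up for illustration (real HWiNFO column headers depend on the sensors present), and the 72 °C limit is simply the figure quoted in the post.

```python
import csv
import io


def max_reading(csv_text, column):
    """Return the highest numeric value in `column`, or None if there is none."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = []
    for row in reader:
        raw = (row.get(column) or "").strip()
        try:
            values.append(float(raw))
        except ValueError:
            # Skip blank cells, repeated headers, or other non-numeric text.
            continue
    return max(values) if values else None


# Hypothetical sample data; a real log would be read from the CSV file instead.
SAMPLE = (
    "Time,CPU Temp [C]\n"
    "20:00:01,55.0\n"
    "20:00:02,68.5\n"
    "20:00:03,71.0\n"
)

LIMIT_C = 72.0  # approximate maximum quoted in the post, not a verified spec

peak = max_reading(SAMPLE, "CPU Temp [C]")
if peak is not None:
    print(f"peak {peak:.1f} C, headroom {LIMIT_C - peak:.1f} C")
```

For a real log, the file contents would be passed in as text and the column name replaced with whatever header the log actually uses.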


  3. Posts : 11,269
    Windows 7 Home Premium 64 Bit
       #13

    Your processor should be running in the mid to high 50s under stress. Anything higher than that, and I would be concerned that you have improper cooling. Reapply the thermal coolant compound for the processor and heatsink. Also, make sure the heatsink is properly attached and cannot budge if you gently wiggle it with your finger.

    Thermal Paste and How To Use It | techPowerUp

    As you add and remove hardware (or wiggle your heatsink), follow these steps for ESD safety:
    1. Shut down and turn off your computer.
    2. Unplug all power supplies to the computer (AC Power then battery for laptops, AC power for desktops)
    3. Hold down the power button for 30 seconds to close the circuit and ensure all power drains from components.
    4. Make sure you are grounded by using proper grounding techniques, i.e. work on an anti-static workbench, anti-static desk, or an anti-static pad. Hold something metallic while touching it to the anti-static surface, or use an anti-static wristband to attach to the anti-static material while working. If you do not have an anti-static workbench, desk, or pad, you can use your computer tower/case by finding a metal hold in it, such as a drive bay.

    Once these steps have been followed, it is safe to remove and replace components within your computer.


    It also appears that you have either a bad processor, or your RAM is not compatible. The CPU-Z screenshots may help.

    stress.txt (Overclockers Australia Forums - View Single Post - Prime95 stress.txt file ???)

    STRESS TESTING YOUR COMPUTER

    BACKGROUND
    ----------

    Today's computers are not perfect. Even brand new systems from major
    manufacturers can have hidden flaws. If any of several key components such
    as CPU, memory, cooling, etc. are not up to spec, it can lead to incorrect
    calculations and/or unexplained system crashes.

    Overclocking is the practice of increasing the speed of the CPU and/or
    memory to make a machine faster at little cost. Typically, overclocking
    involves pushing a machine past its limits and then backing off just a
    little bit.

    For these reasons, both non-overclockers and overclockers need programs
    that test the stability of their computers. This is done by running
    programs that put a heavy load on the computer. Though not originally
    designed for this purpose, this program is one of a few programs that
    are excellent at stress testing a computer.


    RESOURCES
    ---------

    This program is a good stress test for the CPU, memory, L1 and L2 caches,
    CPU cooling, and case cooling. The torture test runs continuously, comparing
    your computer's results to results that are known to be correct. Any
    mismatch and you've got a problem! Note that the torture test sometimes
    reads from and writes to disk but cannot be considered a stress test for
    hard drives.
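The idea of comparing results against known-correct answers can be illustrated with a toy version. The sketch below is not Prime95's code (Prime95 works on far larger numbers with heavily optimized arithmetic); it only assumes the same principle: run Lucas-Lehmer tests on small exponents whose outcomes are well established and report any mismatch. The names `KNOWN_RESULTS` and `torture_pass` are invented for this example.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime. Valid for odd prime exponents p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


# Reference answers for small odd prime exponents:
# True means 2**p - 1 is a Mersenne prime.
KNOWN_RESULTS = {
    3: True, 5: True, 7: True, 11: False, 13: True,
    17: True, 19: True, 23: False, 29: False, 31: True,
}


def torture_pass():
    """Run every reference case once; return the exponents that mismatched."""
    return [p for p, expected in KNOWN_RESULTS.items()
            if lucas_lehmer(p) != expected]


mismatches = torture_pass()
if mismatches:
    print("Mismatch on exponents:", mismatches)
else:
    print("All results matched the reference values.")
```

On healthy hardware the list of mismatches is always empty; a workload this small puts no real load on a CPU, so it shows the checking logic only.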

    You'll need other programs to stress video cards, PCI bus, disk access,
    networking and other important components. In addition, this is only one
    of several good programs that are freely available. Some people report
    finding problems only when running two or more stress test programs
    concurrently. You may need to raise prime95's priority when running two
    stress test programs so that each gets about 50% of the CPU time.

    Forums are a great place to learn about available stability test programs
    and to get advice on what to do when a problem is found.

    The currently popular stability test programs are (sorry, I don't have
    web addresses for these):
    Prime95 (this program's torture test)
    3DMark2001
    CPU Stability test
    Sisoft sandra
    Quake and other games
    Folding@Home
    Seti@home
    Genome@home

    Several useful websites for help (look for overclocking community or forum):
    Overclockers: The Performance Computing Community for Overclocking Hardware and How to Overclock Information
    Ars Technica
    HARDOCP - HardOCP Computer Hardware Reviews and News
    AnandTech
    Tom's Hardware: Hardware News, Tests and Reviews
    PC Overclocking, Modding and Building | Sharky Extreme
    Also try the alt.comp.hardware.overclocking Usenet newsgroup.

    Utility programs you may find useful (I'm sure there are others - look around):
    Motherboard monitor from livewiredev.com
    Memtest86 from Memtest86.com - Memory Diagnostic
    Cpuburn by redelm: http://pages.sbcglobal.net/redelm/
    TaskInfo2002 from Iarsn - High Quality System Software


    WHAT TO DO IF A PROBLEM IS FOUND?
    ---------------------------------

    The exact cause of a hardware problem can be very hard to find.

    If you are not overclocking, the most likely cause is an overheating CPU
    or memory DIMMs that are not quite up to spec. Another possibility is
    you might need a better power supply. Try running MotherBoard monitor
    and browse the forums above to see if your CPU is running too hot.
    If so, make sure the heat sink is properly attached, fans are operational,
    and air flow inside the case is good. For isolating memory problems, try
    swapping memory DIMMs with a co-worker's or friend's machine. If the errors
    go away, then you can be fairly confident that memory was the cause of
    the trouble. A power supply problem can often be identified by a significant
    drop in the voltages when prime95 starts running. Once again the overclocker
    forums are a good resource for what voltages are acceptable.
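The voltage check described above amounts to simple arithmetic on an idle reading and a reading under load. The sketch below is a rough illustration only: the readings are made up, the function names are invented, and the 5% tolerance is an assumption (a figure commonly quoted for the positive ATX rails), so the acceptable range for a particular power supply should be confirmed separately.

```python
def rail_drop_percent(idle_v, load_v):
    """Percentage by which a rail sags between idle and load."""
    return (idle_v - load_v) / idle_v * 100.0


def out_of_tolerance(nominal_v, measured_v, tolerance_pct=5.0):
    """True if the measured voltage deviates from nominal by more than tolerance_pct."""
    return abs(measured_v - nominal_v) / nominal_v * 100.0 > tolerance_pct


# Hypothetical readings for a 12 V rail, taken before and during a stress test.
idle_reading = 12.1
load_reading = 11.2

drop = rail_drop_percent(idle_reading, load_reading)
print(f"drop under load: {drop:.1f}%")
print("outside assumed tolerance:", out_of_tolerance(12.0, load_reading))
```

Software voltage readouts can be inaccurate, so a suspicious value is a reason to measure with a multimeter, not proof of a bad power supply.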

    If you are overclocking then try increasing the core voltage, reduce the
    CPU speed, reduce the front side bus speed, or change the memory timings
    (CAS latency). Also try asking for help in one of the forums above - they
    may have other ideas to try.


    CAN I IGNORE THE PROBLEM?
    -------------------------

    Ignoring the problem is a matter of personal preference. There are
    two schools of thought on this subject.

    Most programs you run will not stress your computer enough to cause a
    wrong result or system crash. If you ignore the problem, then video games
    may stress your machine resulting in a system crash. Also, stay away from
    distributed computing projects where an incorrect calculation might cause
    you to return wrong results. Bad data will not help these projects!
    In conclusion, if you are comfortable with a small risk of an occasional
    system crash then feel free to live a little dangerously! Keep in mind
    that the faster prime95 finds a hardware error the more likely it is that
    other programs will experience problems.

    The second school of thought is, "Why run a stress test if you are going
    to ignore the results?" These people want a guaranteed 100% rock solid
    machine. Passing these stability tests gives them the ability to run
    CPU intensive programs with confidence.


    FREQUENTLY ASKED QUESTIONS
    --------------------------

    Q) My machine is not overclocked. If I'm getting an error, then there must
    be a bug in the program, right?

    A) The torture test is comparing your machine's results against
    KNOWN CORRECT RESULTS. If your machine cannot generate correct
    results, you have a hardware problem. HOWEVER, if you are failing
    the torture test in the SAME SPOT with the SAME ERROR MESSAGE
    every time, then ask for help at mersenneforum.org - it is
    possible that a recent change to the torture test code may have
    introduced a software bug.

    Q) How long should I run the torture test?

    A) I recommend running it for somewhere between 6 and 24 hours.
    The program has been known to fail only after several hours and in
    some cases several weeks of operation. In most cases though, it will
    fail within a few minutes on a flaky machine.

    Q) Prime95 reports errors during the torture test, but other stability
    tests don't. Do I have a problem?

    A) Yes, you've reached the point where your machine has been
    pushed just beyond its limits. Follow the recommendations above
    to make your machine 100% stable or decide to live with a
    machine that could have problems in rare circumstances.

    Q) A forum member said "Don't bother with prime95, it always pukes on me,
    and my system is stable!" What do you make of that?

    or

    "We had a server at work that ran for 2 MONTHS straight, without a reboot.
    I installed Prime95 on it and ran it - a couple minutes later I get an error.
    You are going to tell me that the server wasn't stable?"

    A) These users obviously do not subscribe to the 100% rock solid
    school of thought. THEIR MACHINES DO HAVE HARDWARE PROBLEMS.
    But since they are not presently running any programs that reveal
    the hardware problem, the machines are quite stable. As long as
    these machines never run a program that uncovers the hardware problem,
    then the machines will continue to be stable.


  4. Posts : 12
    Windows 7 Professional 64bit
    Thread Starter
       #14

    Hi, I always had the impression it ran somewhat hot but, having had problems in the past with a fan that would stop without cause and not start again, the temps with the new fan didn't seem too high.
    I'll re-apply the thermal coolant paste, although I've done it quite a few times in the past (I always assembled my own machines, except for this one) and shouldn't have screwed up this much...
    I'm attaching the CPU-Z screenshots and going to bed :)
    Thanks and g'nite
    ALex


  5. Posts : 11,269
    Windows 7 Home Premium 64 Bit
       #15

    The overheating could be a result of a failed processor, possibly due to the broken fan you had previously... The Prime95 tests do indicate the processor as a likely culprit. Memory is the next likely.

    For the CPU-Z screenshots, are both of your modules exactly the same? Mind uploading an image of Slot #2 under the SPD tab?


  6. Posts : 12
    Windows 7 Professional 64bit
    Thread Starter
       #16

    Of course I don't mind.
    Here are shots of slots #1 and #2.
    Thanks


  7. Posts : 11,269
    Windows 7 Home Premium 64 Bit
       #17

    The modules appear to be compatible with Intel motherboards and not AMD.
    Code:
    Part Number: M2F2G64CB8HA4N-CG
    M/B Company   Motherboard Model No   Chipset
    ASUS          P6T Deluxe             Intel X58
    Gigabyte      GA-EX58-UD4P           Intel X58
    Gigabyte      GA-EX58-UD4            Intel X58
    ASUS          P6T6 WS Revolution     Intel X58
    ASUS          P6T WS PRO             Intel X58
    Gigabyte      GA-EP43T-DS3           Intel P43
    Gigabyte      GA-EP45T-Extreme       Intel P45
    Gigabyte      GA-EP45T-DS4           Intel P45
    Gigabyte      GA-EP45T-DS3R          Intel P45
    ASUS          Striker II Extreme     nVidia nForce 790 Ultra SLI
    ECS           P45T-A3                Intel P45
    ECS           X48T-A                 Intel X48
    Gigabyte      GA-X38T-DQ6            Intel X38
    MSI           MS-7353                Intel X38
    ASUS          P5K3 DELUXE            Intel X38
    Gigabyte      GA-P35T-DS4            Intel P35
    The above was taken from: elixir_Compatibility
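The reading of the list above (Intel and nVidia chipsets, no AMD) can be double-checked mechanically. This is a small sketch using only the rows quoted in the post; the names `COMPAT_LIST` and `chipset_vendors` are invented for illustration, and the vendor is assumed to be the first word of the chipset column.

```python
# Rows copied from the compatibility list quoted in the post:
# (board maker, board model, chipset)
COMPAT_LIST = [
    ("ASUS", "P6T Deluxe", "Intel X58"),
    ("Gigabyte", "GA-EX58-UD4P", "Intel X58"),
    ("Gigabyte", "GA-EX58-UD4", "Intel X58"),
    ("ASUS", "P6T6 WS Revolution", "Intel X58"),
    ("ASUS", "P6T WS PRO", "Intel X58"),
    ("Gigabyte", "GA-EP43T-DS3", "Intel P43"),
    ("Gigabyte", "GA-EP45T-Extreme", "Intel P45"),
    ("Gigabyte", "GA-EP45T-DS4", "Intel P45"),
    ("Gigabyte", "GA-EP45T-DS3R", "Intel P45"),
    ("ASUS", "Striker II Extreme", "nVidia nForce 790 Ultra SLI"),
    ("ECS", "P45T-A3", "Intel P45"),
    ("ECS", "X48T-A", "Intel X48"),
    ("Gigabyte", "GA-X38T-DQ6", "Intel X38"),
    ("MSI", "MS-7353", "Intel X38"),
    ("ASUS", "P5K3 DELUXE", "Intel X38"),
    ("Gigabyte", "GA-P35T-DS4", "Intel P35"),
]


def chipset_vendors(entries):
    """Return the distinct chipset vendors (first word of the chipset name)."""
    return sorted({chipset.split()[0] for _, _, chipset in entries})


vendors = chipset_vendors(COMPAT_LIST)
print("chipset vendors listed:", vendors)
print("any AMD chipset listed:", "AMD" in vendors)
```

An absence from a vendor's tested list means the module was not validated on that platform; it does not by itself prove the module will fail there.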




    The following gives more information about AMD compatibility:

    System Building and Compatibility

    AMD Memory for Systems


  8. Posts : 12
    Windows 7 Professional 64bit
    Thread Starter
       #18

    Damn.
    They haven't been giving me any trouble, though - the eight Memtest86+ passes haven't returned any error...

    Could it be the 2 damaged cores and the memory incompatibility have caused the BSODs?
    Is there anything in the DMPs pointing to that?

    Thanks
    ALex


  9. Posts : 11,269
    Windows 7 Home Premium 64 Bit
       #19

    Memtest86+ is not foolproof, and it may not pick up on the incompatibility. The problem may instead be showing up through the CPU's memory controller.

    The .dmps point to the AMD processor driver, but again, that is inconclusive. Even if the CPU were bad, I would still advise that you get AMD compatible RAM to go with the replacement AMD processor (unless you were considering getting all new components and an Intel based system).


  10. Posts : 12
    Windows 7 Professional 64bit
    Thread Starter
       #20

    I had a look at the compatibility lists at Elixir, and the only difference I spotted between my DIMMs and another model, listed as compatible with the AMD 790GX chipset is the presence of a heat spreader...
    Then again, please take my observations with due patience: I've never delved deep into hardware or OSs, apart from a couple of months before assembling my next PC :)

    When I leave for the holidays I'll leave the PC with the guy who assembled it, with instructions to replace processor and RAM, then. Hopefully I'll be able to hang on for the next two weeks... :)

    I'll flag the thread as "Solved" (I can always get back to it, after replacing processor and RAM - knock on wood)...

    I can't thank you enough for helping me solve this problem - pls let me know if I should click anywhere, to get you the credit for helping me (I've seen the "Add to Reputation" buttons, but don't know what they do or how they work).
    ALex


 