Stop 0x124 - what it means and what to try

  1. Posts : 1,377

    A "stop 0x124" is fundamentally different to many other types of bluescreens because it stems from a hardware complaint. Stop 0x124 minidumps contain very little practical information, and it is therefore necessary to approach the problem as a case of hardware in an unknown state of distress.

    Generic "Stop 0x124" Troubleshooting Strategy:
    1) Ensure that none of the hardware components are overclocked. Hardware that is driven beyond its design specifications - by overclocking - can malfunction in unpredictable ways.

    2) Ensure that the machine is adequately cooled. If there is any doubt, open up the side of the PC case (be mindful of any relevant warranty conditions!) and point a mains fan squarely at the motherboard. That will rule out most (lack of) cooling issues.

    3) Update all hardware-related drivers: video, sound, RAID (if any), NIC... anything that interacts with a piece of hardware. It is good practice to run the latest drivers anyway.

    4) Update the motherboard BIOS according to the manufacturer's instructions. Their website should provide detailed instructions as to the brand and model-specific procedure.

    5) Rarely, bugs in the OS may cause "false positive" 0x124 events where the hardware wasn't complaining but Windows thought otherwise (because of the bug). At the time of writing, Windows 7 is not known to suffer from any such defects, but it is nevertheless important to always keep Windows itself updated.

    6) Attempt to (stress) test those hardware components which can be put through their paces artificially. The most obvious examples are the RAM and HDD(s). For the RAM, use the built-in memory diagnostics (run MDSCHED) or the 3rd-party memtest86 utility to run many hours' worth of testing. For hard drives, check whether CHKDSK /R finds any problems on the drive(s), notably "bad sectors". Unreliable RAM, in particular, is deadly as far as software is concerned, and anything other than a 100% clear memory test result is cause for concern. Unfortunately, even a 100% clear result from the diagnostics utilities does not guarantee that the RAM is free from defects - only that none were encountered during the test passes.

    7) As the last of the non-invasive troubleshooting steps, perform a "vanilla" reinstallation of Windows: just the OS itself without any additional applications, games, utilities, updates, or new drivers - NOTHING AT ALL that is not sourced from the Windows 7 disc. Should that fail to mitigate the 0x124 problem, jump to the next steps. Otherwise, if you run the "vanilla" installation long enough to convince yourself that not a single 0x124 crash has occurred, start installing updates and applications slowly, always pausing between successive additions long enough to get a feel for whether the machine is still free from 0x124 crashes. Should the crashing resume, obviously the very last software addition(s) may be somehow linked to the root cause.
    If stop 0x124 errors persist despite the steps above, and the hardware is under warranty, consider returning it and requesting a replacement which does not suffer periodic MCE events. Be aware that attempting the subsequent hardware troubleshooting steps may, in some cases, void your warranty:
    8) Clean and carefully remove any dust from the inside of the machine. Reseat all connectors and memory modules. Use a can of compressed air to clean out the RAM DIMM sockets as much as possible.

    9) If all else fails, start removing items of hardware one-by-one in the hope that the culprit is something non-essential which can be removed. Obviously, this type of testing is a lot easier if you've got access to equivalent components in order to perform swaps.
    Should you find yourself in the situation of having performed all of the steps above without resolving the symptom, unfortunately the most likely reason is that the error message is literally correct - something is fundamentally wrong with the machine's hardware.


    Background Information:

    Windows passes on the hardware error report in the form of a "stop 0x124" because it can't do anything else once the hardware has signalled an uncorrectable fault condition. In technical terms, the vast majority of stop 0x124 crashes correspond to "Machine Check Exceptions" (MCEs) issued by the processor to alert the software to the existence of a hardware problem. It's possible for drivers to indirectly induce hardware to register MCEs by "driving" it in ways that are confusing to the hardware, but from a user's point of view that distinction is so subtle as to be invisible.

    It is important to note that there are many different possible MCE triggers, and one machine's stop 0x124 is likely to be entirely different to another's. Hence, it is best not to place too much emphasis on very specialised ways in which other individuals have resolved their own 0x124 problems - the more exotic the other machine's MCE solution, the less likely it is to apply to your own setup.

    It is possible - but painful - to interpret the hardware's error report. It's passed along in the so-called "MCi_Status" register, the contents of which are generally visible as bugcheck parameters 3 and 4 on the BSOD screen, as well as in the corresponding minidump.

    The trouble is that the hardware's complaints are almost never "practical", in the sense that they would explain what is wrong in layman's terms and include a recommendation for how to fix it. Instead, it's esoteric stuff which is intended for hardware specialists and driver developers.

    Interpreting MCi_Status Contents:

    This is not a viable troubleshooting methodology for most cases of stop 0x124 crashes, both because of the procedure's complexity and the impracticality of the resultant output. It is included here for the sake of completeness, and in case anyone should wish to go to the extreme in an attempt to understand recalcitrant stop 0x124 crashes on their machine.

    Interpreting the numbers is a matter of consulting information published by Intel and AMD. The MCi_Status register contents are a bitmask, and each individual bit has a very specific meaning. Reference:
    Machine Check Exception - Wikipedia, the free encyclopedia

    As an example, a hypothetical stop 0x124 crash may pass on an MCi_Status from the hardware whose contents are below:


    Interpretation is performed based on the position of each significant bit, starting from "63" on the far left and ending with bit "0" on the far right:

    63: VAL - MCi_STATUS register valid
    61: UC - Error uncorrected
    60: EN - Error enabled
    57: PCC - Processor context corrupt
    36: component has received a parity error on the RS[2:0]# pins for a response transaction.
    35: (Reserved)
    27/26/25: Bus queue error type = "Response Parity Error" (011)

    MCA [15:0]:
    0000 1110 0000 1111
    F: "Normal" filtering (0)
    PP: Generic (11)
    T: Request did not time out (0)
    RRRR: Generic Error (0000)
    II: Other transaction (11)
    LL: Memory hierarchy level "generic" (11)
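
    The bit-by-bit reading above can be sketched in a few lines of Python. This is only an illustration: the example value is reassembled from the bits listed in this post, on the assumption that no other bits were set, and the function names are made up for the sketch. The field positions for the low 16 bits follow the F / PP / T / RRRR / II / LL breakdown shown above.

```python
# Status flags named in the post, keyed by bit position in MCi_STATUS.
FLAG_BITS = {63: "VAL", 61: "UC", 60: "EN", 57: "PCC"}


def reassemble_example_status():
    """Rebuild the example value from the bits listed in the post.

    Assumption: only the listed bits were set in the original example.
    """
    status = 0
    for bit in (63, 61, 60, 57, 36):
        status |= 1 << bit
    status |= 0b011 << 25              # bits 27..25: bus queue error type
    status |= 0b0000_1110_0000_1111    # bits 15..0: MCA error code
    return status


def decode_flags(status):
    """Return the names of the listed flag bits that are set, highest bit first."""
    return [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
            if (status >> bit) & 1]


def decode_bus_mca_code(code):
    """Split a 16-bit bus/interconnect MCA code laid out as 000F 1PPT RRRR IILL."""
    return {
        "F": (code >> 12) & 0x1,     # filtering
        "PP": (code >> 9) & 0x3,     # participation
        "T": (code >> 8) & 0x1,      # timeout
        "RRRR": (code >> 4) & 0xF,   # request type
        "II": (code >> 2) & 0x3,     # transaction type
        "LL": code & 0x3,            # memory hierarchy level
    }


if __name__ == "__main__":
    status = reassemble_example_status()
    print(f"MCi_STATUS = 0x{status:016X}")
    print("Flags:", ", ".join(decode_flags(status)))
    print("Bus queue error type:", format((status >> 25) & 0x7, "03b"))
    print("MCA code fields:", decode_bus_mca_code(status & 0xFFFF))
```

    The sketch only splits the register into fields; what each field value means still has to be looked up in the Intel or AMD documentation for the processor in question.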
    Last edited by H2SO4; 04 Nov 2009 at 04:49.

  2. Posts : 4,772
    Windows 7 Ultimate - 64-bit | Windows 8 Pro - 64-bit

    These bugchecks can also be caused by a hardware error, such as a machine check or a nonmaskable interrupt (NMI). This category also includes disk failures that occur when the memory manager is attempting to read data to satisfy page faults.

    0x101 - CLOCK_WATCHDOG_TIMEOUT (Software bugs can cause these errors too,
    but they are most common on over-clocked hardware systems.)

  3. Arc
    Posts : 35,373
    Microsoft Windows 10 Pro Insider Preview 64-bit

    IMO, in the case of a Stop 0x9C (Machine Check Exception), the same method should be followed.

    A Machine Check Exception is usually reported in Stop 0x124, in the first parameter...
    A fatal hardware error has occurred. Parameter 1 identifies the type of error
    source that reported the error. Parameter 2 holds the address of the
    WHEA_ERROR_RECORD structure that describes the error conditon.
    Arg1: 0000000000000000, Machine Check Exception
    Arg2: fffffa800966f028, Address of the WHEA_ERROR_RECORD structure.
    Arg3: 00000000be000000, High order 32-bits of the MCi_STATUS value.
    Arg4: 0000000000800400, Low order 32-bits of the MCi_STATUS value.
    Debugging Details:
    Which is described in Bug Check 0x124: WHEA_UNCORRECTABLE_ERROR (Windows Debuggers) as
    A machine check exception occurred.
    These parameter descriptions apply if the processor is based on the x64 architecture, or the x86 architecture that has the MCA feature available (for example, Intel Pentium Pro, Pentium IV, or Xeon).
    Whereas for a Stop 0x9C, it is reported as ....
    A fatal Machine Check Exception has occurred.
    KeBugCheckEx parameters;
        x86 Processors
            If the processor has ONLY MCE feature available (For example Intel
            Pentium), the parameters are:
            1 - Low  32 bits of P5_MC_TYPE MSR
            2 - Address of MCA_EXCEPTION structure
            3 - High 32 bits of P5_MC_ADDR MSR
            4 - Low  32 bits of P5_MC_ADDR MSR
            If the processor also has MCA feature available (For example Intel
            Pentium Pro), the parameters are:
            1 - Bank number
            2 - Address of MCA_EXCEPTION structure
            3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
            4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
        IA64 Processors
            1 - Bugcheck Type
                1 - MCA_ASSERT
                2 - MCA_GET_STATEINFO
                    SAL returned an error for SAL_GET_STATEINFO while processing MCA.
                3 - MCA_CLEAR_STATEINFO
                    SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
                4 - MCA_FATAL
                    FW reported a fatal MCA.
                5 - MCA_NONFATAL
                    SAL reported a recoverable MCA and we don't support currently
                    support recovery or SAL generated an MCA and then couldn't
                    produce an error record.
                0xB - INIT_ASSERT
                0xC - INIT_GET_STATEINFO
                      SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
                0xD - INIT_CLEAR_STATEINFO
                      SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
                0xE - INIT_FATAL
                      Not used.
            2 - Address of log
            3 - Size of log
            4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
        AMD64 Processors
            1 - Bank number
            2 - Address of MCA_EXCEPTION structure
            3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
            4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
    Arg1: 0000000000000000
    Arg2: fffff880009fbc70
    Arg3: 0000000000000000
    Arg4: 0000000000000000
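
    As a rough sketch of how Arg3 and Arg4 fit together, the two 32-bit halves can be joined into the 64-bit MCi_STATUS value and split into fields. The function names are hypothetical. VAL, UC, EN and PCC are the bits named in the first post; OVER, MISCV, ADDRV and the "model-specific error code" in bits 31..16 are taken from Intel's architecture documentation rather than from this thread, so treat them as assumptions and check them against the manual for your processor.

```python
# Flag bits in the top byte of MCi_STATUS, keyed by bit position.
FLAG_BITS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # assumption from Intel's documentation
    61: "UC",     # error uncorrected
    60: "EN",     # error enabled
    59: "MISCV",  # assumption from Intel's documentation
    58: "ADDRV",  # assumption from Intel's documentation
    57: "PCC",    # processor context corrupt
}


def combine_status(arg3, arg4):
    """Join the high (Arg3) and low (Arg4) 32-bit halves into one 64-bit value."""
    return ((arg3 & 0xFFFFFFFF) << 32) | (arg4 & 0xFFFFFFFF)


def summarize(arg3, arg4):
    """Return the combined status, the flags that are set, and the two code fields."""
    status = combine_status(arg3, arg4)
    flags = [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "status": f"0x{status:016X}",
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    # Arg3/Arg4 from the Stop 0x124 output quoted above.
    print(summarize(0x00000000BE000000, 0x0000000000800400))
    # Arg3/Arg4 from the Stop 0x9C output quoted above: all zero, so VAL is clear.
    print(summarize(0x0000000000000000, 0x0000000000000000))
```

    With the Stop 0x9C values above, both halves are zero, so the VAL bit is not set and there is no status to interpret.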

  4. Arc
    Posts : 35,373
    Microsoft Windows 10 Pro Insider Preview 64-bit

    Factory bloatware causing stop 0x124.

    Random BSOD while playing or exiting games

    Adjusting my stop 0x124 troubleshooting strategy after it.

  5. Arc
    Posts : 35,373
    Microsoft Windows 10 Pro Insider Preview 64-bit

    Arc said:
    Factory bloatware causing stop 0x124.

    Random BSOD while playing or exiting games

    Adjusting my stop 0x124 troubleshooting strategy after it.
    Confirmed with a repetition! Credit goes to Boozad for such a wonderful addition!

    So now we know that Gigabyte bloatware causes stop 0x124 BSOD.

