BSOD 0x00000124 seemingly random occurrences


  1. Posts : 2
    Windows 7 Enterprise 64-bit
       #1

    BSOD:
    STOP: 0x00000124, occurring at random several times throughout the day


    Guesstimates:
    -As this did not start on day 1, it’s very possibly a software/OS issue
    -While testing the machine on 1 monitor over two days we did not experience issues; could be a video card problem?
    -Faulty CPU(s)/Motherboard?
    -USB 2.0 monitor hubs plugged into 3.0 ports?


    System Specs:
    Lenovo Thinkstation D30 Type 4223
    Windows 7 Enterprise 64-bit
    Two Intel Xeon Processors E5-2680
    Lenovo OEM Motherboard
    12 x 2 GB ECC DDR3 1Rx8 PC3-12800
    Nvidia NVS 450
    Sound Card Integrated
    Lenovo LT2452P 24" Wide LED x 3
    Crucial 512GB CT512M4SSD1
    Seagate 2TB ST2000DM001
    Lenovo OEM PSU
    Lenovo D30 OEM
    Lenovo stock OEM Cooling
    Standard Lenovo USB Keyboard and Mouse
    Logitech H390 Clearchat USB Mic Headset
    Microsoft 1080p HD LifeCam Studio


    Situation:
    We’ve been experiencing frequent BSODs on this machine. While troubleshooting it, all Windows and Lenovo updates/driver updates were run. We did find a fault in the Seagate hard drive, and replaced it with a new unit. Memtest passed all RAM during 3 full cycles. We stress-tested the machine with Prime 95 (on all 32 CPU threads) for over 4 hours; no errors. While troubleshooting it, we did not experience a single BSOD.
    After we returned the machine to the owner, the BSODs began again. I’m wondering if it’s video-related, as he’s using three 24” LCDs while we only troubleshot the machine on a single 22” LCD. I updated the video driver to Nvidia’s current driver, replacing the possibly out-of-date Lenovo-supplied driver.
    I also noticed he had some USB devices (monitor hubs) plugged into the 3.0 ports. We have experienced a few machines throwing BSODs with printers plugged into these ports, so maybe this is similarly related.

    Final Thoughts:
    I’ve attached the .zip of the Windows 7 Forums Diagnostic Tool results. I’m wondering if this is simply a bad motherboard we need to RMA. Any assistance would be greatly appreciated; thank you.


  2. Posts : 1,314
    Windows 7 64-bit
       #2

    All your crashes are identical: the PCI-Express bus is reporting an unexpected completion because a transaction timed out. I am personally not familiar enough with PCI-E WHEA errors to get into the nitty-gritty of exactly which device caused it, but I do know the USB monitor hubs can be involved. Often, especially on OEM mobos, the USB bus is implemented as an extension of the PCI-E bus, so a USB problem can manifest as a PCI-E error. The video card can also be involved because it sits on the PCI-E bus as well. Again, I cannot be sure which is which.
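
    For reference, the "PCI Express" classification can be read straight off the bug check line in the dump further down. This is a minimal sketch, assuming the usual WinDbg "BugCheck CODE, {a, b, c, d}" layout; parse_bugcheck and the WHEA_SOURCE table are hypothetical names of mine, and the table is deliberately partial (only the two values relevant here).

```python
import re

# Partial table for bug check 0x124, parameter 1 (the WHEA error source).
# Assumption: only the two values relevant to this thread are listed.
WHEA_SOURCE = {
    0x0: "machine check exception",
    0x4: "uncorrectable PCI Express error",
}


def parse_bugcheck(line):
    """Split a WinDbg 'BugCheck CODE, {a, b, c, d}' line into (code, args)."""
    m = re.match(r"\s*BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("not a BugCheck line: %r" % line)
    code = int(m.group(1), 16)
    # All arguments are printed in hexadecimal without a 0x prefix.
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


code, args = parse_bugcheck("BugCheck 124, {4, fffffa80151b78d8, 0, 0}")
source = WHEA_SOURCE.get(args[0], "other (not in this partial table)")
print("bug check 0x%X, source: %s" % (code, source))
print("error record address for !errrec: %x" % args[1])
```

    The second argument is the address that gets passed to !errrec in the dump.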

    Previous cases of this that I've dealt with came down to dust or other debris in a PCI-E slot, or a card that was not seated properly. Re-seat the cards and make sure the slots and card connectors are clean. Make sure the USB ports don't have anything iffy in them either.

    I recommend we work through a process of elimination with the USB devices, or do it backwards: start testing with one of the monitors and its associated hub and see if it bugs out again. I also recommend, if you haven't already, updating the BIOS and chipset drivers as well as any drivers associated with the USB and/or PCI-E buses, since newer versions may have fixed instability issues. Either way, think about what could be causing completion timeouts on the PCI-E/USB bus and work from that. Typically I've found it's due to a physical connection issue, but it can just as well be related to any driver associated with a USB/PCI-E device or the bus itself. If all other options have been exhausted, then you'll have to blame the motherboard (unfortunately there's no testing procedure for mobos besides a hardware swap).

    I wish I could help you pinpoint this further, but the problem with PCI-E bus errors, as I see it, is that the one reporting the error is typically the root port, which is the central hub of the PCI-E bus. The bus has several nodes (called bridges) which relay between their downstream bridges or end devices and the root port. If an error gets past the bridges to the root port, the most the root port can say is that it received it from such-and-such bridge, which doesn't really narrow things down. There is no header log to read either, which can often have decent info in it. The only hint I can garner is that it's device #2, but without understanding the layout of the PCI-E bus on that motherboard, I cannot determine what this refers to (is it USB port #2? PCI-E slot #2? The #2 device in an enumerated device list?). We'll just have to say it involves the PCI-E/USB bus and go from there. Again, update drivers and BIOS, tinker with the video card and USB devices, and see how that goes.
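
    One way to chase the "device #2" hint, as a rough sketch: assuming the Device No, Function No, Primary Bus and Segment fields in the dump are the ordinary PCI address of the reporting root port (not a slot or USB port index), they can be combined into a bus/device/function location and compared against the Location shown on a device's General tab in Device Manager. The helper name pci_location is hypothetical.

```python
def pci_location(segment, bus, device, function):
    """Format PCI address fields as BDF notation and as a Device Manager
    style location string. Ranges follow the PCI addressing scheme:
    8-bit bus, 5-bit device, 3-bit function."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus/device/function out of range")
    bdf = "%04x:%02x:%02x.%x" % (segment, bus, device, function)
    windows = "PCI bus %d, device %d, function %d" % (bus, device, function)
    return bdf, windows


# Values copied from the PCI Express section of the error record.
bdf, windows = pci_location(segment=0x0000, bus=0x00, device=0x02, function=0x00)
print(bdf)
print(windows)
```

    This only identifies the root port that reported the error. Whatever hangs off that port (a slot, an onboard controller) still has to be worked out from the device tree or the board documentation.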

    Analysis:

    Code:
    Use !analyze -v to get detailed debugging information.
    
    BugCheck 124, {4, fffffa80151b78d8, 0, 0}
    
    TRIAGER: Could not open triage file : C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\triage\modclass.ini, error 2
    Probably caused by : GenuineIntel
    
    Followup: MachineOwner
    ---------
    
    30: kd> !errrec fffffa80151b78d8
    ===============================================================================
    Common Platform Error Record @ fffffa80151b78d8
    -------------------------------------------------------------------------------
    Record Id     : 01cdcf535a539a27
    Severity      : Fatal (1)
    Length        : 672
    Creator       : Microsoft
    Notify Type   : PCI Express Error
    Timestamp     : 12/3/2012 21:55:09 (UTC)
    Flags         : 0x00000000
    
    ===============================================================================
    Section 0     : PCI Express
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa80151b7958
    Section       @ fffffa80151b79e8
    Offset        : 272
    Length        : 208
    Flags         : 0x00000001 Primary
    Severity      : Recoverable
    
    Port Type     : Root Port
    Version       : 1.1
    Command/Status: 0x0010/0x0407
    Device Id     :
      VenId:DevId : 8086:3c04 // referring to device that reported error (root port), not actual bad device. Same with class code.
      Class code  : 030400
      Function No : 0x00
      Device No   : 0x02
      Segment     : 0x0000
      Primary Bus : 0x00
      Second. Bus : 0x00
      Slot        : 0x0000
    Dev. Serial # : 0000000000000000
    Express Capability Information @ fffffa80151b7a1c
      Device Caps : 00008001 Role-Based Error Reporting: 1
      Device Ctl  : 0007 ur FE NF CE
      Dev Status  : 0003 ur fe NF CE
       Root Ctl   : 0008 fs nfs cs
    
    AER Information @ fffffa80151b7a58
      Uncorrectable Error Status    : 00014000 ur ecrc mtlp rof UC ca CTO fcp ptlp sd dlp und
      Uncorrectable Error Mask      : 00000000 ur ecrc mtlp rof uc ca cto fcp ptlp sd dlp und
      Uncorrectable Error Severity  : 00062010 ur ecrc MTLP ROF uc ca cto FCP ptlp sd DLP und
      Correctable Error Status      : 00002000 ADV rtto rnro dllp tlp re
      Correctable Error Mask        : 00000000 adv rtto rnro dllp tlp re
      Caps & Control                : 0000000e ecrcchken ecrcchkcap ecrcgenen ecrcgencap FEP
      Header Log                    : 00000000 00000000 00000000 00000000
      Root Error Command            : 00000000 fen nfen cen
      Root Error Status             : 00000000 MSG# 00 fer nfer fuf mur ur mcr cer
      Correctable Error Source ID   : 00,00,00
      Correctable Error Source ID   : 00,00,00
    
    ===============================================================================
    Section 1     : Processor Generic
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa80151b79a0
    Section       @ fffffa80151b7ab8
    Offset        : 480
    Length        : 192
    Flags         : 0x00000000
    Severity      : Informational
    
    Proc. Type    : x86/x64
    Instr. Set    : x64
    CPU Version   : 0x00000000000206d7
    Processor ID  : 0x000000000000002e

    Read my article on these types of crashes here. The UC status bit means Unexpected Completion, and CTO means Completion Timeout. Most likely the timeout triggered the unexpected completion bit. What strikes me as odd is that the PCI Express section reports its severity as Recoverable, yet the machine still BSOD'd. I wonder if a BIOS/chipset bug has anything to do with this.
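
    The status decoding can be reproduced by hand. Below is a small sketch using the bit positions of the AER Uncorrectable Error Status register as I understand them from the PCI Express specification; the table and the decode helper are my own illustration rather than anything the debugger exposes, and the "und" bit is left out.

```python
# Bit position -> (abbreviation used by !errrec, full name) for the AER
# Uncorrectable Error Status / Mask / Severity registers.
AER_UNCORRECTABLE_BITS = {
    4: ("dlp", "Data Link Protocol Error"),
    5: ("sd", "Surprise Down Error"),
    12: ("ptlp", "Poisoned TLP"),
    13: ("fcp", "Flow Control Protocol Error"),
    14: ("cto", "Completion Timeout"),
    15: ("ca", "Completer Abort"),
    16: ("uc", "Unexpected Completion"),
    17: ("rof", "Receiver Overflow"),
    18: ("mtlp", "Malformed TLP"),
    19: ("ecrc", "ECRC Error"),
    20: ("ur", "Unsupported Request Error"),
}


def decode(value):
    """Return the full names of all known bits set in a register value."""
    return [
        name
        for bit, (_abbr, name) in sorted(AER_UNCORRECTABLE_BITS.items())
        if value & (1 << bit)
    ]


status = 0x00014000    # "Uncorrectable Error Status" from the dump
severity = 0x00062010  # "Uncorrectable Error Severity" from the dump

raised = decode(status)
# A raised error is treated as fatal only if its severity bit is also set.
fatal = decode(status & severity)

print("raised:", raised)
print("raised and marked fatal:", fatal)
```

    Neither raised bit has its severity bit set, which at least matches the section being flagged Recoverable. It does not explain why the record as a whole is marked Fatal.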


  3. Posts : 2
    Windows 7 Enterprise 64-bit
    Thread Starter
       #3

    Much appreciation for the thorough investigation. Since updating the video driver and removing the LCD USB hubs from the USB 3.0 ports, we haven't had another BSOD.

    At this point I'm fairly certain you nailed it: the cause was most likely either the video card or the USB ports.

    Again, many thanks!


  4. Posts : 1,314
    Windows 7 64-bit
       #4

    I'd say keep an eye on it for a week, and if everything's in the clear, go ahead and mark this thread as solved for future reference.


 

Windows 7 Forums is an independent web site and has not been authorized, sponsored, or otherwise approved by Microsoft Corporation. "Windows 7" and related materials are trademarks of Microsoft Corp.