The issue is that you, I assume, can't boot in normal mode and stay booted long enough to run the usual tests. That is why I recommend the stripdown method.
Also, the stripdown method is the best way to get guaranteed results with hardware problems.
That's true, I cannot boot in normal mode, and that's why I had so much trouble getting the logs to you.
Can you please tell me what application you are using to read the dmp files?
I have tried Notepad++ and WinDbg.exe from the Windows debugging tools, but all I can see is garbage.
Btw, I have already stripped down my PC to the absolute essentials: MB, CPU, graphics card, RAM and 1 SSD. I am still getting the error.
I ran memtest86 and let it complete all 4 passes without any errors. Then, from safe mode with command prompt, I ran chkdsk, let it restart and run the disk check at boot, and still got no errors.
The next thing I will try is swapping the graphics card for another one.
I am afraid, though, that it's the CPU or, even worse, the MB.
I'm using WinDbg. You won't get much more understandable 'garbage' than what you see in WinDbg, provided WinDbg is properly configured.
Only a few tools can read the dumps; many don't have the right engine.
Memtest86+ needs to be run for at least 8 passes; that's our requirement for saying the RAM is fine.
As the crashes are 0x124, all that can be done is dump the error record. The 'error' is what the processor reported after a core noticed a fatal hardware error. Because we are troubleshooting 0x124 minidumps, we will NOT be able to identify the root cause from the dumps alone. We use minidumps to identify any patterns that may lead to the root cause. Finding it is a process of trial and error, which is why 0x124 is an exceptional BSOD that is often considered difficult.
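To make the pattern hunting a bit more concrete, here is a minimal Python sketch. It assumes you have saved the text of WinDbg's analysis for each minidump yourself. The helper names `summarize` and `tally` are made up for this illustration, and the choice of fields to compare is only a suggestion.

```python
import re
from collections import Counter

# Field names as WinDbg prints them in its analysis text; extend as needed.
FIELDS = ("FAILURE_BUCKET_ID", "BUGCHECK_P3", "BUGCHECK_P4")
FIELD_RE = re.compile(r"\b(" + "|".join(FIELDS) + r"):\s*(\S+)")


def summarize(log_text):
    """Pull the fields of interest out of one saved analysis log."""
    return dict(FIELD_RE.findall(log_text))


def tally(logs):
    """Count how often each (bucket, P3, P4) combination shows up."""
    counts = Counter()
    for text in logs:
        found = summarize(text)
        counts[tuple(found.get(name) for name in FIELDS)] += 1
    return counts
```

If every dump lands in the same bucket with the same status values, that points to one consistent fault. A mix of different values is a pattern too, just a less helpful one.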
Here's an example of what WinDbg tells us.
The CPU reported an error; using the address of the error record structure, the error can be dumped.
The error record tells us that a generic error occurred on the bus; this was reported by processor 1.
Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa8007912028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000f4800000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000000040e0f, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

DUMP_CLASS: 1
DUMP_QUALIFIER: 400
BUILD_VERSION_STRING: 7601.23915.amd64fre.win7sp1_ldr.170913-0600
SYSTEM_MANUFACTURER: System manufacturer
SYSTEM_PRODUCT_NAME: System Product Name
SYSTEM_SKU: SKU
SYSTEM_VERSION: System Version
BIOS_VENDOR: American Megatrends Inc.
BIOS_VERSION: 3004
BIOS_DATE: 04/14/2017
BASEBOARD_MANUFACTURER: ASUSTeK COMPUTER INC.
BASEBOARD_PRODUCT: A88XM-PLUS
BASEBOARD_VERSION: Rev X.0x
DUMP_TYPE: 2
BUGCHECK_P1: 0
BUGCHECK_P2: fffffa8007912028
BUGCHECK_P3: f4800000
BUGCHECK_P4: 40e0f
BUGCHECK_STR: 0x124_AuthenticAMD
CPU_COUNT: 4
CPU_MHZ: da3
CPU_VENDOR: AuthenticAMD
CPU_FAMILY: 15
CPU_MODEL: 60
CPU_STEPPING: 1
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT
PROCESS_NAME: svchost.exe
CURRENT_IRQL: f
ANALYSIS_SESSION_HOST: DESKTOP-3NDMJ43
ANALYSIS_SESSION_TIME: 10-28-2017 14:10:47.0423
ANALYSIS_VERSION: 10.0.16299.15 x86fre

STACK_TEXT:
fffff880`009f5a88 fffff800`0382fa3b : 00000000`00000124 00000000`00000000 fffffa80`07912028 00000000`f4800000 : nt!KeBugCheckEx
fffff880`009f5a90 fffff800`039efeef : 00000000`00000001 fffffa80`07909620 00000000`00000000 fffffa80`07909670 : hal!HalBugCheckSystem+0x1e3
fffff880`009f5ad0 fffff800`0382f700 : 00000000`00000728 fffffa80`07909620 fffff880`009f5e30 00000000`00000000 : nt!WheaReportHwError+0x26f
fffff880`009f5b30 fffff800`0382f052 : fffffa80`07909620 fffff880`009f5e30 fffffa80`07909620 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff880`009f5c80 fffff800`03822e8f : 00000000`00000000 00000000`00000001 fffff880`009f5eb0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff880`009f5cc0 fffff800`038d56ec : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x47
fffff880`009f5cf0 fffff800`038d5553 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff880`009f5e30 fffff800`03ba8110 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff880`03d1aa30 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!CmpParseCacheLookup+0xa4

THREAD_SHA1_HASH_MOD_FUNC: da4fea960fbfe8004f3ee009ae82cad3cd166d3b
THREAD_SHA1_HASH_MOD_FUNC_OFFSET: 7e642f071dbe1a1981ca36cc1601ea164a8ef7bf
THREAD_SHA1_HASH_MOD: aaf50b05ed9e59744c0f817a91f11e19affa07f1
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: AuthenticAMD
IMAGE_NAME: AuthenticAMD
DEBUG_FLR_IMAGE_TIMESTAMP: 0
STACK_COMMAND: .thread ; .cxr ; kb
FAILURE_BUCKET_ID: X64_0x124_AuthenticAMD_PROCESSOR_BUS
BUCKET_ID: X64_0x124_AuthenticAMD_PROCESSOR_BUS
PRIMARY_PROBLEM_CLASS: X64_0x124_AuthenticAMD_PROCESSOR_BUS
TARGET_TIME: 2017-10-25T20:36:11.000Z
OSBUILD: 7601
OSSERVICEPACK: 1000
SERVICEPACK_NUMBER: 0
OS_REVISION: 0
SUITE_MASK: 272
PRODUCT_TYPE: 1
OSPLATFORM_TYPE: x64
OSNAME: Windows 7
OSEDITION: Windows 7 WinNt (Service Pack 1) TerminalServer SingleUserTS
OS_LOCALE:
USER_LCID: 0
OSBUILD_TIMESTAMP: 2017-09-13 16:55:13
BUILDDATESTAMP_STR: 170913-0600
BUILDLAB_STR: win7sp1_ldr
BUILDOSVER_STR: 6.1.7601.23915.amd64fre.win7sp1_ldr.170913-0600
ANALYSIS_SESSION_ELAPSED_TIME: 67b
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:x64_0x124_authenticamd_processor_bus
FAILURE_ID_HASH: {44023482-051f-5b9f-29e6-f81bb1d7cfed}

Followup: MachineOwner
---------

1: kd> !errrec fffffa80`07912028
===============================================================================
Common Platform Error Record @ fffffa8007912028
-------------------------------------------------------------------------------
Record Id     : 01d34dd0dba38765
Severity      : Fatal (1)
Length        : 928
Creator       : Microsoft
Notify Type   : Machine Check Exception
Timestamp     : 10/25/2017 20:36:11 (UTC)
Flags         : 0x00000000

===============================================================================
Section 0     : Processor Generic
-------------------------------------------------------------------------------
Descriptor    @ fffffa80079120a8
Section       @ fffffa8007912180
Offset        : 344
Length        : 192
Flags         : 0x00000001 Primary
Severity      : Fatal

Proc. Type    : x86/x64
Instr. Set    : x64
Error Type    : BUS error
Operation     : Generic
Flags         : 0x00
Level         : 3
CPU Version   : 0x0000000000660f01
Processor ID  : 0x0000000000000001

===============================================================================
Section 1     : x86/x64 Processor Specific
-------------------------------------------------------------------------------
Descriptor    @ fffffa80079120f0
Section       @ fffffa8007912240
Offset        : 536
Length        : 128
Flags         : 0x00000000
Severity      : Fatal

Local APIC Id : 0x0000000000000001
CPU Id        : 01 0f 66 00 00 08 04 01 - 0b 32 d8 7e ff fb 8b 17
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

Proc. Info 0  @ fffffa8007912240

===============================================================================
Section 2     : x86/x64 MCA
-------------------------------------------------------------------------------
Descriptor    @ fffffa8007912138
Section       @ fffffa80079122c0
Offset        : 664
Length        : 264
Flags         : 0x00000000
Severity      : Fatal

Error         : BUSLG_GENERIC_ERR_*_NOTIMEOUT_ERR (Proc 1 Bank 5)
  Status      : 0xf480000000040e0f
  Address     : 0x0000000000000003
  Misc.       : 0x0000000000000000
Generic means that the system was unable to determine what action happened at that moment.
Action = fetching, reading, writing, execution, eviction and snoop, from the viewpoint of the processor.
LG = level generic, the cache level that the processor was using. Generic, again, means that the system was unable to determine what cache level was in use.
It further tells us that there was no timeout between data X and data Y.
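For anyone who wants to see where a name like BUSLG_GENERIC_ERR_*_NOTIMEOUT_ERR comes from, here is a rough Python sketch that splits the Status value into its bit fields. The layout used (flag bits 63 to 57, and the low 16 bits following the bus error pattern 0000 1PPT RRRR IILL) is my reading of the x86 machine check documentation. The labels in the tables are paraphrased, so treat them as assumptions and check the vendor manual for your CPU. The function names are invented for this example.

```python
# Flag bits in the upper part of MCi_STATUS (assumed common to Intel and AMD).
STATUS_FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
                59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Paraphrased field meanings for a bus error code; labels are illustrative.
PARTICIPATION = {0: "source", 1: "responder", 2: "observer", 3: "generic"}
REQUEST = {0: "generic error", 1: "generic read", 2: "generic write",
           3: "data read", 4: "data write", 5: "instruction fetch",
           6: "prefetch", 7: "evict", 8: "snoop"}
MEM_OR_IO = {0: "memory", 1: "reserved", 2: "I/O", 3: "other/generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG (generic)"}


def combine_status(high32, low32):
    """Join bugcheck Arg3 (high half) and Arg4 (low half) into one value."""
    return (high32 << 32) | low32


def decode_flags(status):
    """List the names of the flag bits that are set, highest bit first."""
    return [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
            if (status >> bit) & 1]


def decode_bus_error(status):
    """Decode the low 16 bits if they match the bus error pattern."""
    code = status & 0xFFFF
    if code & 0xF800 != 0x0800:  # top five bits must be 0000 1
        return None
    return {
        "participation": PARTICIPATION[(code >> 9) & 0x3],
        "timeout": bool((code >> 8) & 0x1),
        "request": REQUEST.get((code >> 4) & 0xF, "unknown"),
        "memory_or_io": MEM_OR_IO[(code >> 2) & 0x3],
        "level": LEVEL[code & 0x3],
    }
```

Calling `combine_status(0xF4800000, 0x00040E0F)` gives the same 0xf480000000040e0f that !errrec prints as Status. Running `decode_bus_error` on it shows every field as generic with no timeout, which is why the dump narrows things down so little.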
There is some more that can be retrieved from this small dump, but it requires knowing the technical documentation of Intel and AMD to continue.
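One small example of that extra information: the "CPU Version" value in the error record looks like the CPUID signature of the processor. That is my assumption, based on it decoding to the same family, model and stepping that WinDbg lists elsewhere in the dump. A minimal Python sketch of the decoding, with a made-up function name:

```python
def decode_cpuid_signature(eax):
    """Split a CPUID signature into (family, model, stepping).

    Follows the convention where the extended fields only count when
    the base family is 0xF. Intel also applies the extended model when
    the base family is 6; that case is not handled in this sketch.
    """
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return family, model, stepping
```

For 0x660f01 this returns family 0x15, model 0x60 and stepping 1, matching the CPU_FAMILY, CPU_MODEL and CPU_STEPPING lines, which WinDbg prints in hex.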
P.S. I would understand if this is gibberish to you. We've all been at that point.
Hehehe. I searched a little more on how to use WinDbg, found out what I was doing wrong, and was able to open the logs and check them.
I noticed an error in the logs about uGuru.sys from my old MB (abit), so I was wondering if that could be the problem. I was not able to totally uninstall the driver, although I thought I had.
I will try to find a way to do it first thing tomorrow (today is the daughter's bday party :-p ) and see where that leads me.
Thanks a lot once more for the explanations, my friend. ;-)
I used another graphics card from my wife's pc which is working fine for sure and I still got a BSOD.
I then took out my ssd and connected it as the main drive (and only one drive) to my wife's pc and I logged on fine without any problem.
I even tried formatting my HDD and reinstalling Windows from a USB drive, and while it was expanding the files (after formatting the whole drive) the PC restarted again, this time without any message.
For sure it's not a driver or a virus since I have zero data on the ssd now and there is no other drive connected.
Does that leave CPU or MB or is that only my idea?
How can I check if there is a faulty CPU or MB?
I can see the page at Ultimate Boot CD - Overview but don't know which one to use at boot.
I'll have to try that too.
I am thinking that the trouble may be that I didn't apply any extra CPU paste when I put the default cooler on the CPU.
I will have to take the cooler off and apply some paste before buying a new one to try.
Damnnnn, it gets tougher and tougher to find the source.
Please note:
Any time a CPU cooler is removed, the CPU and the cooler should be cleaned of all old paste and new thermal paste should be applied.
A very good place to learn.
Arctic Silver, Inc. - Instructions
Jack
Hey Bear.
Thanks for your answer. So far I have never removed the cooler from the CPU. Actually, the default cooler came with the paste already applied and ready to be attached to the CPU, but I thought that, since I relied on the pre-applied paste instead of cleaning it off and applying my own, this could be the problem.
Yeah, I know, grasping at straws, but what else is there right now?
Both MB and CPU are under guarantee (they were bought 4 months ago), so I was wondering if there is any way to prove conclusively that it's one or the other!
Any ideas anyone?