[Advanced] Principles of debugging


  1. Posts : 1,377
    Win7x64
       #1

    [Advanced] Principles of debugging


    Adventures In A 32-bit Minidump, Part 0
    Yes, I realise it has "stacked", but how?!?

    One of the items of information stored in a minidump is the "stack" of the crashing thread.

    All threads have a stack, at all times, and the word "stack" is not used here in the slang sense (to crumple and fall). Since a thread's stack is fundamental to our understanding of its activity, including those times when that thread has caused a crash, let's examine a simple stack in more detail:

    kd> k
    ChildEBP RetAddr
    f8b69dac 805c4cce nt!ExpWorkerThread+0x100
    f8b69ddc 805411c2 nt!PspSystemThreadStartup+0x34
    00000000 00000000 nt!KiThreadStartup+0x16

    The 'k' (stack unwind) command was used to display the current thread's stack. In the display above, there are 3 columns of information:

    1. ChildEBP: the frame pointer (EBP value) of that function's stack frame. The memory at that address stores the frame pointer of the previous function on the stack (its caller).
    2. RetAddr: the "return address" where processing will resume once this function returns (finishes what it had to do).
    3. (not labeled) Function name: the module name and function name, plus an offset, according to the module's symbols.

    Focus first on the top frame of the current stack:

    ChildEBP RetAddr
    f8b69dac 805c4cce nt!ExpWorkerThread+0x100

    The memory location that ChildEBP points to is supposed to contain the address of the previous function's frame. Let's examine what is at that location using a variation of the debugger's 'd' (dump memory) command:

    kd> dd f8b69dac L1
    f8b69dac f8b69ddc

    Notice that the address found there corresponds exactly to the previous (middle of the current stack) function's ChildEBP:

    ChildEBP RetAddr
    f8b69ddc 805411c2 nt!PspSystemThreadStartup+0x34

    And now let's repeat the same procedure with the next ChildEBP value:

    kd> dd f8b69ddc L1
    f8b69ddc 00000000

    The number found at memory address f8b69ddc is zero (0). That confirms what the last line from the full stack listing is telling us:

    ChildEBP RetAddr
    00000000 00000000 nt!KiThreadStartup+0x16

    Since the function name is "KiThreadStartup" (Kernel Internal Thread Startup), it's understandable that the stack does not extend any further back, and KiThreadStartup was in fact the first function on the stack. Hence, the ChildEBP in that case was zero (0) - it wasn't called from another function.

    What about RetAddr, the "return address"? Let's check the top frame's RetAddr:

    ChildEBP RetAddr
    f8b69dac 805c4cce nt!ExpWorkerThread+0x100
    f8b69ddc 805411c2 nt!PspSystemThreadStartup+0x34
    00000000 00000000 nt!KiThreadStartup+0x16

    The stack suggests that the ExpWorkerThread function was called by PspSystemThreadStartup, and it is logical to assume that control would be returned back to PspSystemThreadStartup once ExpWorkerThread has completed whatever task it was asked to perform. We can check that by examining the return address with the debugger's 'ln' ("list nearest" [function]) command:

    kd> ln 805c4cce
    (805c4c9a) nt!PspSystemThreadStartup+0x34

    Yes, execution will resume precisely where we left off in PspSystemThreadStartup - at 0x34 (52 decimal) bytes from the start of that function. Likewise, we can expect that once PspSystemThreadStartup is completely finished, the OS will give control back to KiThreadStartup:

    kd> ln 805411c2
    (805411ac) nt!KiThreadStartup+0x16
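
    The lookup that 'ln' performs can be sketched in a few lines of Python. This is only an illustration: the table holds just the two function start addresses shown in the 'ln' output above, and the names SYMBOLS and list_nearest are made up for the example. A real debugger gets this table from the module's symbols and also knows where each function ends.

```python
import bisect

# Function start addresses copied from the 'ln' output above, sorted by address.
SYMBOLS = [
    (0x805411AC, "nt!KiThreadStartup"),
    (0x805C4C9A, "nt!PspSystemThreadStartup"),
]

def list_nearest(address, symbols=SYMBOLS):
    """Return (symbol, offset) for the nearest symbol at or below address."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    start, name = symbols[index]
    return name, address - start

for ret_addr in (0x805C4CCE, 0x805411C2):
    name, offset = list_nearest(ret_addr)
    print(f"{ret_addr:08x} -> {name}+{offset:#x}")
```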

    =========================================

    One potential question at this point is how the debugger displays the stack based on the contents of a memory dump - how does the 'k' command produce its output?

    For each thread, the OS maintains an information block which keeps track of many thread attributes, including the locations in memory where the thread's stack starts and finishes. During a crash, the "information block" for the thread which directly caused the crash is recorded in the memory dump, and portions of it can be viewed with the '!thread' debugger command:

    kd> !thread
    THREAD 82265470 Cid 0004.0108 Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 0
    Not impersonating
    DeviceMap e1005570
    Owning Process 0 Image: <Unknown>
    Attached Process 82154500 Image: winlogon.exe
    Wait Start TickCount 38493 Ticks: 0
    Context Switch Count 768
    UserTime 00:00:00.000
    KernelTime 00:00:00.020
    Start Address nt!ExpWorkerThread (0x80533ee6)
    Stack Init f8b6a000 Current f8b69d1c Base f8b6a000 Limit f8b67000 Call 0
    Priority 12 BasePriority 12 PriorityDecrement 0 DecrementCount 0

    Note the stored stack values, including the "current" stack pointer at the time. Based on a series of steps very similar to what we did above by following ChildEBP from a function back to its caller, the debugger "unwinds" and hence displays the crashing thread's stack in text format.
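
    As a rough sketch of that unwinding, the Python below follows the ChildEBP chain through a simulated copy of the dump's memory. The saved EBP values come from the 'dd' output earlier in this post. Placing each return address at ChildEBP+4 is the usual x86 frame layout and is assumed here, since the post never dumps those locations. The names MEMORY and walk_ebp_chain are hypothetical, and a real debugger handles many cases (such as functions without frame pointers) that this ignores.

```python
# Stack bounds from the "!thread" output: Base f8b6a000, Limit f8b67000.
STACK_BASE, STACK_LIMIT = 0xF8B6A000, 0xF8B67000

# Simulated dump memory: address -> 32-bit value.
MEMORY = {
    0xF8B69DAC: 0xF8B69DDC,  # saved EBP of the caller (seen with "dd")
    0xF8B69DB0: 0x805C4CCE,  # return address, assumed to sit at EBP+4
    0xF8B69DDC: 0x00000000,  # saved EBP of zero: no further frames
    0xF8B69DE0: 0x805411C2,  # return address, assumed to sit at EBP+4
}

def walk_ebp_chain(memory, ebp):
    """Follow saved frame pointers, collecting (ChildEBP, RetAddr) pairs."""
    frames = []
    # Stop when EBP leaves the thread's stack range or the memory is missing.
    while STACK_LIMIT <= ebp < STACK_BASE and ebp in memory and ebp + 4 in memory:
        frames.append((ebp, memory[ebp + 4]))
        ebp = memory[ebp]  # step to the caller's frame
    return frames

for child_ebp, ret_addr in walk_ebp_chain(MEMORY, 0xF8B69DAC):
    print(f"{child_ebp:08x} {ret_addr:08x}")
```

    The two rows printed match the first two lines of the 'k' listing; the walk ends when it reaches the zero frame pointer.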

    The function names at or near the top of the stack form an excellent starting point for web searches regarding the question of whether others have experienced identical or very similar crashes.
    Last edited by H2SO4; 05 Nov 2009 at 03:07.


  2. Posts : 1,377
    Win7x64
    Thread Starter
       #2

    Listing loaded modules


    Adventures In A 32-bit Minidump, Part 1
    What's a "loaded modules" list?

    Another of the most important bits of information in a minidump is the list of all modules (executable files) loaded at the time of the crash.

    Such a list becomes particularly important when deterministic minidump analysis cannot specifically pinpoint a crash cause. By examining the list of modules, it is possible to make educated guesses regarding those modules most likely to be involved, and hence most likely to require updating or even removal.

    The objective of the debugger's 'lm' command ("list modules") is self-explanatory:

    0: kd> lm
    start end module name
    80bb0000 80bb8000 kdcom (deferred)
    82c03000 83003000 nt (pdb symbols) s:\symCachePub\ntkrnlmp.pdb\998A3472EEA6405CB8C089DE868F26222\ntkrnlmp.pdb
    83003000 8303a000 hal (deferred)
    87800000 87811000 fileinfo (deferred)
    88035000 8817e000 tcpip (pdb symbols) c:\symcachepub\tcpip.pdb\416A30179D3D4AF4A8278FA8146232A92\tcpip.pdb

    ...

    In the truncated display above, the debugger shows several columns of information about each module:

    • The "start" and "end" addresses in memory where the module's code has been loaded. For various reasons, that range doesn't necessarily match the file's on-disk size.
    • The name of the executable binary (module), without its extension. (Note that the debugger always refers to the kernel image as "nt".)
    • The module's symbols status - loaded or "deferred".

    "Symbols" are oft-misunderstood by those starting off in debugging. In a nutshell, a module's symbols contain information about the module which is irrelevant to the computer's execution of code in that module, but useful during subsequent debugging. One analogy is a car repair manual - it's not required for driving the car, but almost indispensable for troubleshooting, and entirely car model-specific.

    In order to minimise module size, Microsoft strips symbolic information from modules when they are built; otherwise, the OS would consume far more space. Instead of being included with the OS, each module's symbols are made available for download from an HTTP server (preferred) and in symbols "packs" for various OSs and Service Packs. Note that every module has its own specific symbols file (a PDB file), and different versions of the same module will have correspondingly different symbols.

    In our example, the "tcpip" module (tcpip.sys) already has symbols loaded by the debugger:

    88035000 8817e000 tcpip (pdb symbols) c:\symcachepub\tcpip.pdb\416A30179D3D4AF4A8278FA8146232A92\tcpip.pdb

    The long number is a unique identifier constructed from the module's attributes. The name of the module, plus that number, is what the debugger uses to search Microsoft's public symbol server for a particular PDB when it's required. The rest of the path above is specific to the machine which was used to generate the example.
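
    To make the "name plus number" lookup more concrete, here is a small Python sketch that rebuilds the relative path seen in the listing. It assumes the identifier is the PDB's 32-digit GUID followed by its "age" in hex, which is how symbol stores index PDB files as far as I know; the post itself does not spell out the layout. The function name pdb_store_path is invented for the example.

```python
import uuid

def pdb_store_path(pdb_name, guid, age):
    """Build <name>/<GUID + age>/<name>, the layout seen in the listing."""
    key = f"{guid.hex.upper()}{age:X}"  # 32 hex digits of GUID, then the age
    return f"{pdb_name}/{key}/{pdb_name}"

# GUID and age split out of 416A30179D3D4AF4A8278FA8146232A92 (assumed split).
tcpip_guid = uuid.UUID("416A3017-9D3D-4AF4-A827-8FA8146232A9")
print(pdb_store_path("tcpip.pdb", tcpip_guid, 2))
```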

    The debugger performs "lazy" symbol loading by default, which means that (for performance reasons) it will not bother to search Microsoft's symbol server for symbols corresponding to a specific module until that module's code is encountered during debugging. As an example, here is the stack from the corresponding minidump:

    0: kd> k
    ChildEBP RetAddr
    80d8a564 880a67f8 nt!KiTrap0E+0x2cf
    80d8a5d8 880a62a6 tcpip!IN6_IS_ADDR_ISATAP+0x18
    80d8a5e8 880a36c9 tcpip!Ipv6AddressType+0x27
    80d8a658 880a4bde tcpip!IppValidateSetAllRouteParameters+0x56
    80d8a698 880a4b88 tcpip!IppUpdateUnicastRouteUnderLock+0x34


    Because "nt" (actually NTOSKRNL.EXE) and TCPIP.SYS are the only modules whose functions have been encountered during the stack unwind, theirs are the only symbols loaded. All other modules in the listing above have their symbols status specified as "deferred".
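
    Working out which module an address on the stack belongs to is a simple range check against the 'lm' listing. The Python sketch below uses the five modules shown earlier and treats each "end" address as exclusive, which is an assumption on my part (it is at least consistent with "hal" starting exactly where "nt" ends). MODULES and module_for are illustrative names only.

```python
# (start, end, name) rows copied from the truncated 'lm' output above.
MODULES = [
    (0x80BB0000, 0x80BB8000, "kdcom"),
    (0x82C03000, 0x83003000, "nt"),
    (0x83003000, 0x8303A000, "hal"),
    (0x87800000, 0x87811000, "fileinfo"),
    (0x88035000, 0x8817E000, "tcpip"),
]

def module_for(address, modules=MODULES):
    """Return (module name, offset from module start), or None if not found."""
    for start, end, name in modules:
        if start <= address < end:  # end treated as exclusive (assumption)
            return name, address - start
    return None

# First return address from the stack listing above.
name, offset = module_for(0x880A67F8)
print(f"880a67f8 is in {name} at offset {offset:#x}")
```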

    The debugger's 'lm' command can optionally take switches. For example, 'lm t' displays the timestamp for each module, in addition to the basic information already discussed above:

    0: kd> lm t
    start end module name
    80bb0000 80bb8000 kdcom Tue Jul 14 11:08:58 2009 (4A5BDAAA)
    82c03000 83003000 nt Tue Jul 14 09:15:08 2009 (4A5BBFFC)
    83003000 8303a000 hal Tue Jul 14 09:11:03 2009 (4A5BBF07)

    ...

    The textual representation of a module's timestamp (the date and time) is converted for display from the "unix time" value (the hexadecimal number in parentheses) which is stored in the minidump for each module. (Unix time is defined as the number of seconds since midnight UTC, Jan 1, 1970.)
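
    The conversion is easy to reproduce. The Python sketch below turns the hex value for "nt" into a date. The listing's times appear to be rendered in the local time zone of the debugging machine; working backwards from the numbers, that looks like UTC+10, but the offset is my inference and is not stated anywhere in the output. The helper name module_time is made up for the example.

```python
from datetime import datetime, timedelta, timezone

def module_time(hex_stamp, utc_offset_hours=0):
    """Convert a hex unix timestamp to a datetime in the given UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(int(hex_stamp, 16), zone)

fmt = "%Y-%m-%d %H:%M:%S"
print(module_time("4A5BBFFC").strftime(fmt))      # UTC
print(module_time("4A5BBFFC", 10).strftime(fmt))  # assumed UTC+10, as in the listing
```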

    Lastly, the 'lm' command can optionally take 'm' (match name pattern) and 'v' (display verbose information) switches. In that combination, it is very useful for revealing as much information as possible about a specific module:

    0: kd> lm vm tcpip
    start end module name
    88035000 8817e000 tcpip (pdb symbols) c:\symcachepub\tcpip.pdb\416A30179D3D4AF4A8278FA8146232A92\tcpip.pdb
    Loaded symbol image file: tcpip.sys
    Mapped memory image file: c:\symcachepub\tcpip.sys\4A5BBF8E149000\tcpip.sys
    Image path: \SystemRoot\System32\drivers\tcpip.sys
    Image name: tcpip.sys
    Timestamp: Tue Jul 14 09:13:18 2009 (4A5BBF8E)
    CheckSum: 00146D18
    ImageSize: 00149000
    File version: 6.1.7600.16385
    Product version: 6.1.7600.16385
    ... (parts omitted for clarity)...
    FileVersion: 6.1.7600.16385 (win7_rtm.090713-1255)
    FileDescription: TCP/IP Driver
    LegalCopyright: © Microsoft Corporation. All rights reserved.


    In situations where the cause of a crash is suspected to be an unknown buggy driver which is corrupting memory, a valid approach is to list all loaded modules, create a sub-list consisting of only those drivers which are not part of the OS, and then rely on each driver's age as a general indicator as to whether an update is likely to be available.
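
    That approach can be partly automated. The Python sketch below parses 'lm t' lines like the ones shown earlier and sorts the modules oldest first. The regular expression is fitted only to the sample lines in this post, and the "skip" set of OS module names is something you would have to supply yourself; both are assumptions of the example rather than features of the debugger.

```python
import re

# start, end, module name, then the hex timestamp in parentheses at line end.
LM_T_LINE = re.compile(
    r"^\s*[0-9a-f]{8}\s+[0-9a-f]{8}\s+(\S+)\s+.*\(([0-9a-f]{8})\)\s*$",
    re.IGNORECASE,
)

def oldest_first(lm_t_lines, skip=()):
    """Return module names sorted by timestamp, oldest first."""
    modules = []
    for line in lm_t_lines:
        match = LM_T_LINE.match(line)
        if match and match.group(1) not in skip:
            modules.append((int(match.group(2), 16), match.group(1)))
    return [name for _, name in sorted(modules)]

sample = [
    "80bb0000 80bb8000 kdcom Tue Jul 14 11:08:58 2009 (4A5BDAAA)",
    "82c03000 83003000 nt Tue Jul 14 09:15:08 2009 (4A5BBFFC)",
    "83003000 8303a000 hal Tue Jul 14 09:11:03 2009 (4A5BBF07)",
    "...",
]
print(oldest_first(sample))
print(oldest_first(sample, skip={"nt", "hal"}))
```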

    Updating the oldest drivers frequently brings positive results and resolves crashes.
    Last edited by H2SO4; 28 Sep 2009 at 21:06.


  3. Posts : 1,377
    Win7x64
    Thread Starter
       #3

    Threads, registers, and contexts


    Adventures In A 32-bit Minidump, Part 2
    We are discussing "registers", not the registry

    The last of the main items of minidump information is the "context" of the thread that caused the crash. This is the most esoteric and least practical of the topics covered so far, but it is also fundamental to understanding how a computer functions at the interface between software and hardware.

    The concept of a "memory hierarchy" is frequently drawn (in computer science textbooks) as a pyramid. Since I can't easily draw here, imagine that "registers" are at the top (pointy) end, and "offline storage" is the wide pyramid base:

    1. Registers
    2. L1 processor cache
    3. L2 processor cache (if present)
    4. L3 processor cache (if present)
    5. Physical memory (RAM)
    6. Secondary storage (HDD)
    7. Offline storage (backups, DVDs, USB sticks...)

    The further down the list, the greater the storage capacity, but at the cost of slower access speed. Registers are tiny regions of extremely fast-access memory built right into the processor itself. When discussing whether a particular processor is in "32-bit" or "64-bit" mode, what that really refers to is the width (size) of each individual general purpose register.

    The IA-32 processor architecture, more commonly called x86 when it's running Windows, has a mere 8 registers which can be termed "general purpose". Almost everything that a PC does, from games, to spreadsheets, to playing music, all happens because the contents of those 8 registers are added to each other, subtracted, compared, multiplied, moved, and so on almost ad infinitum. Here is what the contents of those registers might look like, with the aid of the debugger's 'r' ("display registers") command:

    kd> r
    eax=00000001 ebx=005a0737 ecx=8054a14c edx=000003f8 esi=00000089 edi=91e65b50
    eip=80526fc8 esp=80548fd0 ebp=80548fe0 iopl=0 nv up ei pl nz na po nc


    The names of the registers are deceptively simple:

    • A: (Accumulator). General purpose, sometimes used as the destination register for results of arithmetic operations, and also for function return values.
    • B: General purpose
    • C: ("Counter"). General purpose, sometimes used to store loop counters.
    • D: General purpose
    • SI: "Source Index" pointing to source data for some memory and string operations. Also general purpose.
    • DI: "Destination Index" counterpart to SI. Also general purpose.
    • IP: "Instruction pointer" that contains the address in memory of the next instruction which is to be fetched and executed by the processor. This is not a general purpose register.
    • SP: "Stack pointer" which holds the address of the current top of the stack.
    • BP: "Base pointer" which mostly (but not always) contains the address of the beginning of the current stack frame.

    The E prefix, as in EAX, selects all 32 bits of the register; without it (just AX), only the lower 16 bits are addressed. The X suffix is used only with the A, B, C, and D registers.
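
    The relationship between the differently named pieces of one register can be shown with some bit masking. The Python sketch below splits the ECX value from the 'r' output above. The 8-bit halves (CH and CL here) are not discussed in this post but follow the same naming scheme; split_register is just an illustrative helper.

```python
def split_register(letter, value32):
    """Show the 32-, 16- and 8-bit views of one of the A/B/C/D registers."""
    low16 = value32 & 0xFFFF
    return {
        f"e{letter}x": value32,      # all 32 bits
        f"{letter}x": low16,         # lower 16 bits
        f"{letter}h": low16 >> 8,    # high byte of the lower 16 bits
        f"{letter}l": low16 & 0xFF,  # low byte
    }

# ecx=8054a14c, taken from the 'r' output above.
for name, value in split_register("c", 0x8054A14C).items():
    print(f"{name} = {value:#x}")
```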

    The cryptic-looking "nv up ei pl nz na po nc" letter combinations reveal the current contents of the FLAGS register. Because it is a bitmask, the bits in FLAGS are displayed (as above) according to their meaning, and not as a number. "nz" means "not zero" (zero flag not set), "po" means "parity odd", "nc" is "no carry" (carry flag not set), and so on for roughly 20 different flags (the debugger's 'r' command doesn't display all 20).

    Individual bits within that register are set or cleared for specific reasons, including comparison operations. For example, if a comparison between EAX and EBX finds that they contain the same value, the "zero flag" within the FLAGS register is set to TRUE (1) - there is zero difference between them. Subsequently, a JZ (jump if zero flag set) instruction may cause the execution to continue at a different point in code which is there to deal with the implications of those two numbers being equal. That is an example of conditional branching which is the cornerstone of computing - it was theorised about by Charles Babbage in the 19th century.
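
    The Python sketch below decodes a FLAGS value into the same two-letter codes and shows the zero flag changing after a comparison. The starting value 0x202 is an assumption chosen to reproduce the letters in the 'r' output above (that output does not show the raw number), and the compare function mimics only the zero-flag effect of a real CMP instruction, which updates several other flags as well.

```python
# (bit position, code when set, code when clear)
FLAG_BITS = [
    (11, "ov", "nv"),  # overflow
    (10, "dn", "up"),  # direction
    (9, "ei", "di"),   # interrupts enabled
    (7, "ng", "pl"),   # sign
    (6, "zr", "nz"),   # zero
    (4, "ac", "na"),   # auxiliary carry
    (2, "pe", "po"),   # parity
    (0, "cy", "nc"),   # carry
]
ZERO_FLAG = 1 << 6

def describe_flags(eflags):
    """Render the flag bits the way the 'r' command prints them."""
    return " ".join(
        on if (eflags >> bit) & 1 else off for bit, on, off in FLAG_BITS
    )

def compare(a, b, eflags):
    """Set or clear only the zero flag, as a comparison of a and b would."""
    if ((a - b) & 0xFFFFFFFF) == 0:
        return eflags | ZERO_FLAG
    return eflags & ~ZERO_FLAG

flags = 0x202  # assumed raw value matching "nv up ei pl nz na po nc"
print(describe_flags(flags))
print(describe_flags(compare(5, 5, flags)))  # equal values: "zr" appears
```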

    Here is what a typical series of processor instructions looks like, "unassembled" using the debugger's 'u' command (the comments after // are my additions):

    8052701c mov eax,edi // MOVe contents of EDI into EAX (copy them)
    8052701e lea ecx,[eax+1] // Load Effective Address, here simply setting ECX = EAX +1
    80527021 mov dl,byte ptr [eax] // MOVe the byte pointed at by EAX into DL, the lowest 8 bits of the D register.
    80527023 inc eax // INCrement EAX by one


    This is the only "language" that a processor understands. The vast majority of NTOSKRNL, user32.dll, ntdll.dll, and all other executable files consists of incredibly complex and intricate series of similar instructions which are usually translated ("compiled") from an original source written in a higher-level programming language such as C.
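
    To see what those four instructions do to the registers, here is a toy Python simulation. The starting register values are borrowed from the 'r' output earlier in this post, while the byte stored at the address in EDI (0x41) is purely hypothetical, since the post does not show that memory. The function run_snippet is an illustration, not how a processor or debugger works internally.

```python
MASK32 = 0xFFFFFFFF

def run_snippet(regs, memory):
    """Apply the four unassembled instructions to a dict of registers."""
    regs["eax"] = regs["edi"]                                    # mov eax,edi
    regs["ecx"] = (regs["eax"] + 1) & MASK32                     # lea ecx,[eax+1]
    regs["edx"] = (regs["edx"] & 0xFFFFFF00) | memory[regs["eax"]]  # mov dl,byte ptr [eax]
    regs["eax"] = (regs["eax"] + 1) & MASK32                     # inc eax
    return regs

regs = {"eax": 0x00000001, "ecx": 0x8054A14C, "edx": 0x000003F8, "edi": 0x91E65B50}
memory = {0x91E65B50: 0x41}  # hypothetical byte at the address held in EDI

for name, value in run_snippet(regs, memory).items():
    print(f"{name}={value:08x}")
```

    Note that only the lowest byte of EDX changes; the rest of that register is left alone.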

    By initiating the execution of such a series of instructions at a given starting point, a "thread" (of execution) is brought into temporary existence. Each processor core is capable of following the instructions of only one such thread at any given time. At regular intervals (on the order of every 15 milliseconds), the OS makes a decision as to whether it is appropriate to allow a different thread to have its slice of processor time, called "pre-empting" the currently running thread, or whether the current thread should be allowed to continue. By quickly swapping between many different threads in that manner, the OS makes it look like several things are happening at once, although in reality the number of things (threads) happening at any one instant is at most equal to the number of processor cores.

    In order to temporarily suspend a thread in favour of another one, the OS must be able to record the current state of each of the processor registers for the thread being suspended. Later, when the suspended thread is to be reanimated, the previously saved register values for that thread are restored into corresponding processor registers, and execution is allowed to resume exactly at the point where it was previously suspended. The thread is not aware of being parked aside occasionally; from its point of view, it is merely executing a sequence of instructions until the end of the sequence is encountered.
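
    The save-and-restore idea can be sketched with plain dictionaries in Python. Everything here is simplified and hypothetical: a real OS saves many more registers, does so in kernel-mode assembly, and chooses the next thread with a scheduler rather than being told which one to run.

```python
def context_switch(cpu_regs, saved_contexts, current, incoming):
    """Save the running thread's registers, then load the incoming thread's."""
    saved_contexts[current] = dict(cpu_regs)  # record state of outgoing thread
    cpu_regs.clear()
    cpu_regs.update(saved_contexts[incoming])  # restore state of incoming thread
    return incoming

# One processor core running thread "A"; thread "B" was suspended earlier.
cpu = {"eip": 0x1000, "esp": 0x9000, "eax": 7}
saved = {"B": {"eip": 0x2000, "esp": 0xA000, "eax": 0}}

running = context_switch(cpu, saved, "A", "B")
cpu["eax"] = 99  # thread B does some work
running = context_switch(cpu, saved, "B", "A")
print(running, {name: hex(value) for name, value in cpu.items()})
```

    After the second switch, thread A's registers are exactly as it left them, which is why the thread never notices that it was parked.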

    A CONTEXT structure is used by the OS to store (in memory) the register values for a given thread. We can use the debugger's 'dt' ("dump type") command to see the members of the CONTEXT struct:

    kd> dt nt!_CONTEXT
    +0x000 ContextFlags : Uint4B
    ... omitted for clarity...
    +0x09c Edi : Uint4B
    +0x0a0 Esi : Uint4B
    +0x0a4 Ebx : Uint4B
    +0x0a8 Edx : Uint4B
    +0x0ac Ecx : Uint4B
    +0x0b0 Eax : Uint4B
    +0x0b4 Ebp : Uint4B
    +0x0b8 Eip : Uint4B
    +0x0bc SegCs : Uint4B
    +0x0c0 EFlags : Uint4B
    +0x0c4 Esp : Uint4B
    +0x0c8 SegSs : Uint4B
    +0x0cc ExtendedRegisters : [512] UChar


    The computer's memory normally contains many instances of the CONTEXT struct, each of which pertains to a single thread. A bugcheck minidump contains (at least) the context of the thread that caused the crash - i.e. the content of the processor's registers at the point where the crash condition was detected. That information can subsequently be used by developers for the purpose of understanding the code-level reasons behind a crash.
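
    Given the offsets in the 'dt' output above, pulling register values out of a raw CONTEXT is a matter of reading 4-byte little-endian integers at the right positions. The Python sketch below builds a blank buffer, writes a few of the register values from the earlier 'r' output into it, and reads them back. The buffer is synthetic, and the names CONTEXT_OFFSETS and read_context are invented for the example; it does not parse a real minidump.

```python
import struct

# Offsets copied from the "dt nt!_CONTEXT" listing (each field is a Uint4B).
CONTEXT_OFFSETS = {
    "Edi": 0x09C, "Esi": 0x0A0, "Ebx": 0x0A4, "Edx": 0x0A8,
    "Ecx": 0x0AC, "Eax": 0x0B0, "Ebp": 0x0B4, "Eip": 0x0B8,
    "SegCs": 0x0BC, "EFlags": 0x0C0, "Esp": 0x0C4, "SegSs": 0x0C8,
}
CONTEXT_SIZE = 0x0CC + 512  # ExtendedRegisters is the final 512-byte member

def read_context(blob):
    """Return the listed register fields from a raw CONTEXT buffer."""
    return {
        name: struct.unpack_from("<I", blob, offset)[0]
        for name, offset in CONTEXT_OFFSETS.items()
    }

# Synthetic buffer filled with values from the 'r' output earlier in the post.
blob = bytearray(CONTEXT_SIZE)
struct.pack_into("<I", blob, CONTEXT_OFFSETS["Eip"], 0x80526FC8)
struct.pack_into("<I", blob, CONTEXT_OFFSETS["Esp"], 0x80548FD0)
struct.pack_into("<I", blob, CONTEXT_OFFSETS["Ebp"], 0x80548FE0)

context = read_context(blob)
print(f"eip={context['Eip']:08x} esp={context['Esp']:08x} ebp={context['Ebp']:08x}")
```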
    Last edited by H2SO4; 07 Oct 2009 at 12:12.


 
