Help with postmortem debugging of a mixed-mode Win32 application


Here's the situation:
Background
I have a mixed mode .NET/Native application developed in Visual Studio 2008.
What I mean by mixed mode is that the front end is written in C++ .NET which calls into a native C++ library. The native code does the bulk of the work in the app, including kicking off new threads as it requires. The .NET code is just for UI purposes (win forms).
I have a release build of the application running on a tester's computer.
The native libraries were compiled with full optimisations but also with debugging enabled (the "Debug Information Format" was set to "Program Database").
What this means is that I have the debugging symbols for the application in a PDB file.
The problem
So anyway, one of the testers is having a problem with the app where it occasionally crashes on XP. I've been able to get the minidump of the crash using Dr Watson for several runs.
When I debug into it (using the minidump - I'm not actually debugging the real app), all the debugging symbols are loaded correctly: I can see the full stack trace of all of the native threads correctly. Other threads (which are presumably the .NET threads) don't have a stack trace, but they all at least show me which dll the thread was started on (i.e. ntdll.dll).
It correctly reports the thread which fails ("Unhandled exception at 0x0563d652 in user(5).dmp: 0xC0000005: Access violation reading location 0x00000000").
However, when I go into the thread it shows nothing useful. In the stack trace there is a single entry which just has the memory address "0563d652()" (not even "ntdll.dll").
When I go into the disassembly it just shows a random section of about 30 instructions. Either side of the memory address is just "???". It almost looks like it is not part of my source code (isn't your binary loaded sequentially into memory? Is it normal to have a random set of assembly statements in the middle of nowhere?).
My questions
So basically my questions are threefold.
1) Can anyone explain the debugger's lack of information?
2) Bearing in mind that I can't show the error occurred in my code, can anyone suggest a reason for the failure?
3) Can I do anything else to help me diagnose this kind of problem in the future?
Help!
John
Update:
Here is the stack dump for the failing thread from WinDBG
# ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 099bf414 02d0e7fc 0x563d652
01 00000000 00000000 0x2d0e7fc
Weird huh? Doesn't even show a DLL.
Is it possible that I've corrupted the stack/heap somehow which has caused a thread to just get corrupted...?

Are you using WinDbg? If so, are you using the Son of Strike (SOS) extension?
Bugslayer: Son-of-Strike
-or-
Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects?

We had an issue similar to this where a code bug was silent in MSVC2K5 SP1, but if you had the MSVC2K5 SP2 runtime installed it caused an error which didn't point at valid code.
Part of the problem is that once you start executing data as code, anything can happen, so the crash location becomes useless: you cannot even get back to a valid stack trace.
We had this happen to us when the new .Net runtime install installed a newer version of the MSVC C++ Runtime in the SxS directory.
In the end our method to resolve the issue was to make the crash happen frequently and add as much logging as necessary to localize it.

Could you post the stack of the faulting thread once you've grabbed and installed a copy of WinDbg and opened the dump file there?
We could start from there.

Your EIP was just corrupted.
Assuming the ESP is valid, you can view the callstack, just type:
dds esp [enter]
dds [enter]
You can also use the memory windows:
Set address to: esp
Set format to: Pointer&Symbol
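To illustrate what `dds esp` is doing: it reads the raw stack one pointer-sized word at a time and tries to resolve each word to a symbol. Words that land inside a loaded module are candidate return addresses, which is how you can recover a call chain even when EIP itself is garbage. Below is a minimal, debugger-free sketch of that filtering idea. The `Module` and `Candidate` structs, the `CandidateReturnAddresses` function, and any module ranges you feed it are hypothetical and are not taken from the dump above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a loaded module: [base, base + size).
struct Module {
    std::string name;
    std::uint32_t base;
    std::uint32_t size;
};

// A stack word that points into a known module.
struct Candidate {
    std::uint32_t value;   // the word read from the stack
    std::string module;    // module the word falls inside
    std::uint32_t offset;  // offset from the module base
};

// Scan raw stack words and keep only those that point into a known module.
// Words outside every module (data, garbage, a corrupted EIP) are dropped.
std::vector<Candidate> CandidateReturnAddresses(
        const std::vector<std::uint32_t>& stackWords,
        const std::vector<Module>& modules) {
    std::vector<Candidate> result;
    for (std::uint32_t word : stackWords) {
        for (const Module& m : modules) {
            if (word >= m.base && word - m.base < m.size) {
                result.push_back({word, m.name, word - m.base});
                break;
            }
        }
    }
    return result;
}
```

A value such as 0x0563d652 that falls in no module range is filtered out here, which is the same situation WinDbg reports as "Frame IP not in any known module". Not every surviving word is a real return address (stale values linger on the stack), so treat the result as a list of leads rather than a verified call stack.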

Related

Qt software debugging techniques

I have an application published to the Microsoft Store. On the Health page for the application in the Dev Center, it reports this crash:
fail_fast_fatal_app_exit_c0000409_qt5core.dll!qt_logging_to_console
And stack trace:
0 ucrtbase.dll abort 0x000000000000004E
1 Qt5Core.dll qt_logging_to_console 0x000000000000017A
2 Qt5Core.dll QMessageLogger::fatal 0x0000000000000093
3 Qt5Gui.dll QPixmap::paintEngine 0x0000000000000052
4 Qt5Gui.dll QPixmap::QPixmap 0x0000000000000037
When debugging the application I don't get any crashes. The question is how to get the crash location or function line/name in the code by the stack trace? Any ideas? Thanks.

The hex values look like offsets, probably from the start of each function since they are small. You could download and compile the dia2dump utility (it is also somewhere in the VS samples), which can dump lots of information about debugging symbols, including line numbers for each offset. That would give you the line numbers for the stack.
PS: note that you need Qt's PDBs, not your program's, to investigate this stack.

I think the problem was that the program's icons were created and destroyed on the fly, and DestroyIcon destroyed a handle that was still in use.
I made some improvements: it now creates only one HICON and one QPixmap object, appends to the structure in the loop, and calls DestroyIcon to destroy the HICON handle after the loop has exited.
Also I have found this article about finding bugs from the Microsoft Store:
How to get a crash dump (or any usable crash report) for a converted Windows Store UWP app?
But also I'm going to try StackWalker application to check for other issues. Thank you.

Visual Studio - Call stack does not trace back to user function

Ran into some access violation in visual studio 2010 and here's the callstack:
Most of the call stack is assembly code in the DLL (almost illegible to me). I want to trace back to the line in my code which caused the violation, but it seems there's no user function in the call stack.
How can I find the line in my function causing the violation ? Do I need to adjust some settings ?

Getting a reliable stack trace out of optimized C or C++ code is difficult. The optimizer chooses speed over diagnosability. The debugger needs PDB files for such code to know how to interpret the stack frames correctly and find the return address to the calling method.
Clearly you don't have these PDBs, you are getting the raw addresses from the operating system DLLs instead of their function names. Getting those PDBs is pretty simple, Microsoft has a public server that does nothing but deliver those PDBs for any released version of Windows, including service packs and security updates.
Telling the debugger about that server is required, the feature is off by default. It is particularly easy for VS2010, the server name is preprogrammed in the dialog, you only have to turn it on. Tools + Options, Debugging, Symbols, tick the checkbox in front of "Microsoft Symbol Servers". Set the cache directory, any writable directory will do.
Start debugging again, it will take a while at first to cache the PDBs. When it is done, you'll see a greatly improved stack trace. Accurate and with function names for the Windows DLLs.

C++ DX11 application only runs in Visual Studio IDE

Alright, I presented this question on the MSDN forums but have yet to receive any kind of response so I figured I'd give StackOverflow a try.
I'm currently developing a DirectX application using VS2008 on Win7. I recently experienced a nasty memory corruption bug with a memory allocation class that grabbed byte-aligned memory. During this bug I could still run the debug and release executables; they would crash due to the instructions getting corrupted or whatever, but they would still execute for a bit until the crash.
I then stripped out the entire memory allocation class. The application runs perfectly in the IDE (release and debug builds) but I can't run any of the executables at all. They immediately crash with a non-responding/stop working error. And I don't think it is my environment because I get the same issue on another computer that wasn't having problems before either.
Dependency walker gives a "Warning: At least one delay-load dependency module was not found. Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module." error and indicates that GPSVC.dll and IESHIMS.DLL can't be found. I've read that this can be misleading and just indicates a potential problem somewhere. And Dependency walker wasn't giving me this error the day before.
I haven't tinkered with any of the configuration or project settings or added new code. Any idea of what could be causing this behavior?
Also another note, I installed the Windows 7.1 sdk the same day. Think this could be some kind of compiler related bug?
Just in case some useful information pops up on the MSDN post, here is the link
http://social.msdn.microsoft.com/Forums/en-IE/vsdebug/thread/f692b394-8af2-4453-991c-aa6a443a9019
Thanks!
Edit -
Here are the last few lines of Dependency Walker's profiling output
GetProcAddress(0x76CD0000 [c:\windows\syswow64\KERNEL32.DLL], "DecodePointer") called from "c:\windows\syswow64\NVWGF2UM.DLL" at address 0x6D8BAE4F and returned 0x77B59D65.
GetProcAddress(0x76CD0000 [c:\windows\syswow64\KERNEL32.DLL], "DecodePointer") called from "c:\windows\syswow64\NVWGF2UM.DLL" at address 0x6D8BAE4F and returned 0x77B59D65.
GetProcAddress(0x76CD0000 [c:\windows\syswow64\KERNEL32.DLL], "EncodePointer") called from "c:\windows\syswow64\NVWGF2UM.DLL" at address 0x6D8BAF60 and returned 0x77B60FDB.
GetProcAddress(0x76CD0000 [c:\windows\syswow64\KERNEL32.DLL], "DecodePointer") called from "c:\windows\syswow64\NVWGF2UM.DLL" at address 0x6D8BAF70 and returned 0x77B59D65.
Second chance exception 0xC0000005 (Access Violation) occurred in "c:\users\joel\desktop\DXAPP.EXE" at address 0x0110152E.
Exited "c:\users\joel\desktop\DXAPP.EXE" (process 0x27D8) with code 255 (0xFF).
Is this referring to a DLL grabbing a null pointer or to my actual instructions? Going to read up on how to use WinDbg real quick and I'll post its output if this doesn't shed any immediate light on the issue.
Edit 2 -
Simply running the application and hitting debug to bring up Visual Studio consistently brought me to where I'm compiling my shaders. I'm assuming at the moment that the root of the problem lies around this. However, I still don't understand the change of behavior during execution between using the IDE and not.
Solution! -
I was so thrown off by the previous memory corruption bug that I didn't realize my shaders weren't in a local directory to the executables. This in turn was generating a null pointer that wasn't handled properly after calling D3DX11CompileFromFile().
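This kind of "works in the IDE, crashes standalone" behaviour often comes down to the working directory: Visual Studio launches the program from the project directory by default, while double-clicking the executable runs it from the executable's folder, so relative paths resolve differently. A minimal sketch of a defensive check before handing a file to the shader compiler, assuming C++17 `std::filesystem`. `LoadShaderSource` is a hypothetical helper and does not call `D3DX11CompileFromFile()` itself.

```cpp
#include <filesystem>
#include <fstream>
#include <iostream>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the file contents, or std::nullopt (after
// printing a diagnostic that includes the working directory) if the file
// is missing or unreadable.
std::optional<std::string> LoadShaderSource(const fs::path& path) {
    std::error_code ec;
    if (!fs::is_regular_file(path, ec)) {
        std::cerr << "Shader not found: " << path
                  << " (working directory: " << fs::current_path(ec) << ")\n";
        return std::nullopt;
    }
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::nullopt;
    }
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

The caller then checks the returned optional before compiling, instead of dereferencing whatever the compile call left behind. Printing the working directory in the error message makes the IDE versus standalone difference visible immediately.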

Shoot, sorry I meant to post this as a comment...
I can only suggest more diagnostic attempts.
One would be to profile the app from within Depends, this will also show dynamic DLL loads and might show something new. Also it captures the debug output. It may behave differently than launching in the debugger itself and provide a clue. You don't mention actually profiling so I thought I'd suggest it in case you hadn't. Also, pay very close attention to the paths for the DLL's loaded - you might be surprised at a DLL loading from a location other than you intended.
Another suggestion is to try to attach to the stopped app after the crash (before dismissing the error dialog). See if you can get a stack trace or anything out of it.
Finally, try attaching (or even launching from) WinDbg rather than the IDE. Like the Depends profile, the difference in debugger behavior and how it hooks the app may allow the crash to happen, while providing the clues you need.
Good Luck!

How to extract debugging information from a crash

If my C++ app crashes on Windows I want to send useful debugging information to our server.
On Linux I would use the GNU backtrace() function - is there an equivalent for Windows?
Is there a way to extract useful debugging information after a program has crashed? Or only from within the process?
(Advice along the lines of "test you app so it doesn't crash" is not helpful! - all non-trivial programs will have bugs)

The function StackWalk64 can be used to snap a stack trace on Windows.
If you intend to use this function, you should be sure to compile your code with FPO disabled - without symbols, StackWalk64 won't be able to properly walk FPO'd frames.
You can get some code running in process at the time of the crash via a top-level __try/__except block or by calling SetUnhandledExceptionFilter. This is a bit unreliable since it requires you to have code running inside a crashed process.
Alternatively, you can just use the built-in Windows Error Reporting to collect crash data. This is more reliable, since it doesn't require you to add code running inside the compromised, crashed process. The only cost is to get a code-signing certificate, since you must submit a signed binary to the service. https://sysdev.microsoft.com/en-US/Hardware/signup/ has more details.

You can use the Windows API call MiniDumpWriteDump if you wish to roll your own code. Both Windows XP and Vista automate this process and you can sign up at https://winqual.microsoft.com to gain access to the error reports.
Also check out http://kb.mozillazine.org/Breakpad and http://www.codeproject.com/KB/debug/crash_report.aspx for other solutions.
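As a rough illustration of the roll-your-own route, here is a minimal sketch that combines SetUnhandledExceptionFilter with MiniDumpWriteDump. The function names `WriteMiniDump` and `InstallCrashDumpHandler` and the output file name `crash.dmp` are assumptions chosen for the example. The non-Windows branch exists only so the snippet compiles elsewhere. As noted above, this code runs inside the crashed process, so it can fail if the heap or stack is badly damaged.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")  // MSVC-specific way to link dbghelp

// Called by the OS when no other handler dealt with the exception.
static LONG WINAPI WriteMiniDump(EXCEPTION_POINTERS* exceptionInfo) {
    // "crash.dmp" is a placeholder; a real handler would pick a unique path.
    HANDLE file = CreateFileA("crash.dmp", GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION info;
        info.ThreadId = GetCurrentThreadId();
        info.ExceptionPointers = exceptionInfo;
        info.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &info, nullptr, nullptr);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;  // let the process terminate
}
#endif

// Installs the filter; returns false where the API is unavailable.
bool InstallCrashDumpHandler() {
#ifdef _WIN32
    SetUnhandledExceptionFilter(WriteMiniDump);
    return true;
#else
    return false;
#endif
}
```

Call `InstallCrashDumpHandler()` early in `main`. The resulting dump can be opened in WinDbg or Visual Studio together with the matching PDB.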

This website provides quite a detailed overview of stack retrieval on Win32 after a C++ exception:
http://www.eptacom.net/pubblicazioni/pub_eng/except.html
Of course, this will only work from within the process, so if the process gets terminated or crashes to the point where it terminates before that code is run, it won't work.

Generate a minidump file. You can then load it up in windbg or Visual Studio and inspect the entire stack where the crash occurred.
Here's a good place to start reading.

It's quite simple to dump the current stack frame addresses into a log file. All you have to do is get such a function called on program faults (i.e. an interrupt handler in Windows) or asserts. This can be done in release builds as well. The log file can then be matched with a map file, resulting in a call stack with function names.
I published an article about this some years ago.
See http://www.ddj.com/architect/185300443
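The matching step can be sketched as follows: given function start addresses and names (as you might parse them from a linker map file), each logged address resolves to the nearest preceding symbol plus an offset. This is an illustrative sketch, not the code from the article. `SymbolTable` and `Symbolize` are hypothetical names, and parsing the map file itself is left out.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Function start addresses mapped to names, as could be parsed from a
// linker map file. The entries are supplied by the caller.
using SymbolTable = std::map<std::uint32_t, std::string>;

// Resolve an address to "name+0xoffset" using the nearest preceding symbol.
std::string Symbolize(const SymbolTable& symbols, std::uint32_t address) {
    // upper_bound finds the first symbol that starts after the address,
    // so the entry just before it is the function containing the address.
    auto it = symbols.upper_bound(address);
    if (it == symbols.begin()) {
        return "<unknown>";  // address precedes every known symbol
    }
    --it;
    std::ostringstream out;
    out << it->second << "+0x" << std::hex << (address - it->first);
    return out.str();
}
```

One caveat: a map of start addresses alone cannot tell where the last function ends, so an address far beyond the final symbol is still attributed to it. A very large offset in the output is a hint that the address is not really inside that function.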

Let me describe how I handle crashes in my C++/WTL application.
First, in the main function, I call _set_se_translator, and pass in a function that will throw a C++ exception instead of using structured windows exceptions. This function gets an error code, for which you can get a Windows error message via FormatMessage, and a PEXCEPTION_POINTERS argument, which you can use to write a minidump (code here). You can also check the exception code for certain "meltdown" errors that you should just bail from, like EXCEPTION_NONCONTINUABLE_EXCEPTION or EXCEPTION_STACK_OVERFLOW :) (If it's recoverable, I prompt the user to email me this minidump file.)
The minidump file itself can be opened in Visual Studio like a normal project, and providing you've created a .pdb file for your executable, you can run the project and it'll jump to the exact location of the crash, together with the call stack and registers, which can be examined from the debugger.
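For readers who have not used _set_se_translator, here is a minimal sketch of the translation step only. `SehError`, `TranslateSeh` and `InstallSehTranslator` are hypothetical names, not the code linked above. It assumes MSVC with the /EHa compiler option, which the translator mechanism requires, and the non-MSVC branch exists only so the snippet compiles elsewhere.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

#ifdef _MSC_VER
#include <windows.h>
#include <eh.h>
#endif

// Hypothetical C++ exception type carrying the structured exception code.
class SehError : public std::runtime_error {
public:
    explicit SehError(unsigned int code)
        : std::runtime_error(Describe(code)), code_(code) {}
    unsigned int code() const { return code_; }

private:
    static std::string Describe(unsigned int code) {
        char buffer[32];
        std::snprintf(buffer, sizeof(buffer), "SEH exception 0x%08X", code);
        return buffer;
    }
    unsigned int code_;
};

#ifdef _MSC_VER
// Invoked by the runtime for each structured exception (needs /EHa).
// A minidump could be written here using the EXCEPTION_POINTERS argument.
static void TranslateSeh(unsigned int code, EXCEPTION_POINTERS* /*info*/) {
    throw SehError(code);
}
#endif

// The translator is maintained per thread, so call this in each thread.
bool InstallSehTranslator() {
#ifdef _MSC_VER
    _set_se_translator(TranslateSeh);
    return true;
#else
    return false;  // not available outside MSVC
#endif
}
```

After installation, an access violation surfaces as a `SehError` that an ordinary `catch` block can handle, with `code()` returning the exception code such as 0xC0000005.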

If you want to grab a callstack (plus other good info) for a runtime crash, on a release build even on site, then you need to set up Dr Watson (run DrWtsn32.exe). If you check the 'generate crash dumps' option, when an app crashes, it'll write a mini dump file to the path specified (called user.dmp).
You can take this and combine it with the symbols you created when you built your server (set your compiler/linker to generate PDB files, and keep these safe at home: they are matched against the dump to work out where in the source the crash occurred).
Get yourself WinDbg, open it and use the menu option to 'load crash dump'. Once it has loaded everything, you can type '~*kp' to get a call stack for every thread (or click the button at the top for the current thread).
There are good articles on how to do this all over the web. This one is my favourite, and you'll want to read this to understand how to help yourself manage the symbols really easily.

You will have to set up a dump generation framework in your application, here is how you may do it.
You may then upload the dump file to the server for further analysis using dump analyzers like windbg.

You may want to use adplus to capture the crash callstack.
You can download and install Debugging tools for Windows.
Usage of adplus is mentioned here:
Adplus usage
This creates the complete crash or hang dump. Once you have the dump, Windbg comes to the rescue. Map the correct pdbs and symbols and you are all set to analyze the dump. To start with use the command "!analyze -v"

How to get a stack trace when C++ program crashes? (using msvc8/2005)

Sometimes my c++ program crashes in debug mode, and what I got is a message box saying that an assertion failed in some of the internal memory management routines (accessing unallocated memory etc.). But I don't know where that was called from, because I didn't get any stack trace. How do I get a stack trace or at least see where it fails in my code (instead of library/ built-in routines)?

If you have a crash, you can get information about where the crash happened whether you have a debug or a release build. And you can see the call stack even if you are on a computer that does not have the source code.
To do this you need to use the PDB file that was built with your EXE. Put the PDB file inside the same directory as the EXE that crashed. Note: Even if you have the same source code, building twice and using the first EXE and the second PDB won't work. You need to use the exact PDB that was built with your EXE.
Then attach a debugger to the process that crashed. Example: windbg or VS.
Then simply check out your call stack, while also having your threads window open. You will have to select the thread that crashed and check the call stack for that thread. Each thread has a different call stack.
If you already have your VS debugger attached, it will automatically go to the source code that is causing the crash for you.
If the crash is happening inside a library you are using that you don't have the PDB for, there is nothing you can do.

If you run the debug version on a machine with VS, it should offer to bring it up and let you see the stack trace.
The difficulty is that the real problem is no longer on the call stack. If you free a pointer twice, the failure can show up somewhere else, unrelated to the faulty code (the next time anything accesses the heap data structures).
I wrote this blog on some tips for getting the problem to show up in the call stack so you can figure out what is going on.
http://www.atalasoft.com/cs/blogs/loufranco/archive/2007/02/06/6-_2200_Pointers_2200_-on-Debugging-Unmanaged-Code.aspx
The best tip is to use the gflags utility to make pointer issues cause immediate problems.

You can trigger a mini-dump by setting a handler for uncaught exceptions. Here's an article that explains all about minidumps
Google actually implemented their own open source crash handler called Breakpad, which I think Mozilla also uses (that's if you want something more serious: a rich and robust crash handler).

If I remember correctly that message box should have a button which says 'retry'. This should then break the program (in the debugger) at the point where the assertion happened.

CrashFinder can help you locate the place of the exception given the DLL and the address of the exception reported.
You can take this code and integrate it into your application to have a stack trace automatically generated when there is an uncaught exception. This is generally performed using __try{} __except{} or with a call to SetUnhandledExceptionFilter, which allows you to specify a callback for all unhandled exceptions.

You can also have a post-mortem debugger installed on the client system. This is a decent, general way to get information when you do not have dump creation built into your application (maybe for an older version for which you must still get information).
Dr. Watson on Windows can be installed by running: drwtsn32 -i
Running drwtsn32 (without any options) will bring up the configuration dialog. This will allow the creation of crash dump files, which you can later analyze with WinDbg or something similar.

You can use Poppy for this. You just sprinkle some macros across your code and it will gather the stack trace, together with the actual parameter values, local variables, loop counters, etc. It is very lightweight, so it can be left in the release build to gather this information from crashes on end-user machines.
