Program crash - how to read appcompat.txt? - c++

After the program I am debugging crashes, I am left with heap dump *.mdmp file & appcompat.txt in my Temp directory. I understand that appcompat.txt is an error report. Is there a description of its format?
My appcompat.txt lists a number of DLLs. Am I correct assuming that the reason for a crash could have only come from one of the listed DLLs? Can I limit my debugging effort to the DLLs listed in appcompat.txt?
Thanks in advance!

The minidump file is far more informative for diagnosing crashes:
Install Debugging Tools for Windows, if you don't already have it.
Set up the symbol path variable _NT_SYMBOL_PATH to point to the Microsoft symbol server
Run Windbg and do File -> Open Crash Dump and locate your .dmp or .mdmp file
Type !analyze -v.
This will try to isolate the location of the crash. Note that just because a crash occurs in a particular dll it doesn't mean that is where the bug resides - it could be because an invalid parameter has been passed in from your application code. The analysis should hopefully show you a meaningful stack and an error code which should help in working out the actual cause of the crash.

Related

How to display Minidump stack traces on Mac OSX?

I've written a small test program which throws a C++ exception. I've setup Breakpad to write a minidump on this thrown exception. I now have a .dmp file I'd like to see the stack trace for. Several references indicate I should.
Generate a .sym file using Breakpad's 'dump_syms' utility program, which I've done. I'm running dump_syms on the debug binary (which should have the debug symbols built in?). ./dump_syms breakpad_testing > breakpad_testing.sym
At this point I have the .dmp file and the .sym file
Inspect the first line of the .sym file to view to get a binary version hash? This will look something like this -
MODULE mac x86_64 ED3C7C3C3C283C749036117557E0A8500 breakpad_testing.
Use this to create an expected folder structure mkdir -p
./symbols/breakpad_testing/ED3C7C3C3C283C749036117557E0A8500 and
move the .sym file there. mv breakpad_testing.sym
./symbols/breakpad_testing/ED3C7C3C3C283C749036117557E0A8500
Show stack using 'minidump_stackwalk' tool. minidump_stackwalk
breakpad_testing.dmp ./symbols
However these steps don't seem to have any effect on the output of minidump_stackwalk, I still see output lines such as minidump.cc:2122: INFO: MinidumpModule could not determine version for /Users/mb/Library/Developer/Xcode/DerivedData/<blah>/Build/Products/Debug/breakpad_testing and an unsymbolicated stack trace.
Is there something I'm misunderstanding or not using properly regarding Breakpad on OSX?
This is one of the references I've been following https://blog.inventic.eu/2012/08/qt-and-google-breakpad/

Meaning of a gdb backtrace when there is not source code

I have a gdb backtrace of a crashed process, but I can't see the specific line in which the crash occurred because the source code was not in that moment. I don't understand some of the information given by the mentioned backtrace.
The backtrace is made of lines like the following one:
<path_to_binary_file>(_Z12someFunction+0x18)[0x804a378]
Notice that _Z12someFunction is the mangled name of int someFunction(double ).
My questions are:
Does the +0x18 indicate the offset, starting at _Z12someFunction address, of the assembly instruction that produced the crash?
If the previous question is affirmative, and taking into account that I am working with a 32-bit architecture, does the +0x18 indicates 0x18 * 4 bytes?
If the above is affirmative, I assume that the address 0x804a378 is the _Z12someFunction plus 0x18, am I right?
EDIT:
The error has ocurred in a production machine (no cores enabled), and it seems to be a timing-dependant bug, so it is not easy to reproduce it. That is because the information I am asking for is important to me in this occasion.
Most of your assumptions are correct. The +0x18 indeed means offset (in bytes, regardless of architecture) into the executable.
0x804a378 is the actual address in which the error occurred.
With that said, it is important to understand what you can do about it.
First of all, compiling with -g will produce debug symbols. You, rightfully, strip those for your production build, but all is not lost. If you take your original executable (i.e. - before you striped it), you can run:
addr2line -e executable
You can then feed into stdin the addresses gdb is giving you (0x804a378), and addr2line will give you the precise file and line to which this address refers.
If you have a core file, you can also load this core file with the unstriped executable, and get full debug info. It would still be somewhat mangled, as you're probably building with optimizations, but some variables should, still, be accessible.
Building with debug symbols and stripping before shipping is the best option. Even if you did not, however, if you build the same sources again with the same build tools on the same environment and using the same build options, you should get the same binary with the same symbols locations. If the bug is really difficult to reproduce, it might be worthwhile to try.
EDITED to add
Two more important tools are c++filt. You feed it a mangled symbol, and produces the C++ path to the actual source symbol. It works as a filter, so you can just copy the backtrace and paste it into c++filt, and it will give you the same backtrace, only more readable.
The second tool is gdb remote debugging. This allows you to run gdb on a machine that has the executable with debug symbols, but run the actual code on the production machine. This allows live debugging in production (including attaching to already running processes).
You are confused. What you are seeing is backtrace output from glibc's backtrace function, not gdb's backtrace.
but I can't see the specific line in which the crash occurred because
the source code was not in that moment
Now you can load executable in gdb and examine the address 0x804a378 to get line numbers. You can use list *0x804a378 or info symbol 0x804a378. See Convert a libc backtrace to a source line number and How to use addr2line command in linux.
Run man gcc, there you should see -g option that gives you possibility to add debug information to the binary object file, so when crash happens and the core is dumped gdb can detect exact lines where and why the crash happened, or you can run the process using gdb or attach to it and see the trace directly without searching for the core file.

C++ Application crash where to start looking? MSVCR90.dll

This is a very open ended question and I am really just looking for how to approach locating the issue.
The application runs for a day or so and then will crash while being used. The point in the application that it crashes is not the same each time. The memory that the application is using is not increasing.
C++ is not my standard dev language so any pointers would be appreciated.
The run time error I am given is detailed below. Having googled this I can see that 40000015 is a generic I don't know what has happened style error. Is there anyway I can use the additional information (1-4) to assist in locating the issue?
Any help is much appreciated!
Thanks
Problem signature:
Problem Event Name: APPCRASH
Application Name: Main.exe
Application Version: 1.1.10.0
Application Timestamp: 5278d640
Fault Module Name: MSVCR90.dll
Fault Module Version: 9.0.30729.4940
Fault Module Timestamp: 4ca2ef57
Exception Code: 40000015
Exception Offset: 0005beae
OS Version: 6.1.7601.2.1.0.256.48
Locale ID: 2057
Additional Information 1: 3793
Additional Information 2: 379382cf89267e4a4b730ab2a7cc6828
Additional Information 3: f05b
Additional Information 4: f05b042c097ccdb870355bd0f539be8d
Read our privacy statement online:
http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409
If the online privacy statement is not available, please read our privacy statement offline:
C:\Windows\system32\en-US\erofflps.txt
I would start by running it under debugger and leave it running for a day. Remember to enable all exceptions to catch - in my VS 2005 its in Debug->Exceptions, add handler for 40000015 exception.
If you cannot run it under debugger, ie. it happens only on client PC (still you can use remote debugging), then you can implement exception hanlder using : AddVectoredExceptionHandler, then use StackWalk64 to log call stack. If you can compile with symbols, then such stack will contain full path to the source of exception. It will be inside MSVCR90.dll, but will originate probably somewhere in your code. If you cannot include symbols then you can always use .map files or windbg with locally stored .pdb files. Of course this is a lot of work especially if C++ is not your main language, so the first suggestion is the best for you.
Ok, you can also use MiniDumpWriteDump and then use windbg instead of StackWalk64.

Strange "No such file or directory" error in cuda-gdb

I already asked this question in the nvidia forum but never got an answer link.
Every time I try to step into a kernel I get a similar error message to this:
__device_stub__Z10bitreversePj (__par0=0x110000) at
/tmp/tmpxft_00005d4b_00000000-1_bitreverse.cudafe1.stub.c:10
10 /tmp/tmpxft_00005d4b_00000000-1_bitreverse.cudafe1.stub.c: No such file or directory.
in /tmp/tmpxft_00005d4b_00000000-1_bitreverse.cudafe1.stub.c
I tried to follow the instructions of the cuda-gdb walkthrough by the error stays.
Has somebody a tip what could cause this behaviour?
The "device stub" for bitreverse(unsigned int*) (whatever that is) was compiled with debug info, and it was located in /tmp/tmpxft_00005d4b_00000000-1_bitreverse.cudafe1.stub.c (which was likely machine-generated).
The "No such file" error is telling you that that file is not (or no longer) present on your system, but this is not an error; GDB just can't show you the source.
This should not prevent you from stepping further, or from setting breakpoints in other functions and continuing.
I was able to solve this problem by using -keep flag on the nvcc compiler. This specifies that the compiler should keep all intermediate files created during the compilation, including the stub.c files created by cudafe that are needed for the debugger to step through kernel functions. Otherwise the intermediate files seem to get deleted by default at the end of the compilation and the debugger will not be able to find them. You can specify a directory for the intermediate files as well, and will need to point your debugger (cuda-gdb, nsight, etc) to this location.
I think I had such problem once, but can't really remember what was it caused with. Do you use textures in your kernel? In that case you couldn't debug it.

Aborted core dumped C++

I have a large C++ function which uses OpenCV library and running on Windows with cygwin g++ compiler. At the end it gives Aborted(core dumped) but the function runs completely before that. I have also tried to put the print statement in the end of the function. That also gets printed. So I think there is no logical bug in code which will generate the fault.
Please explain.
I am also using assert statements.But the aborted error is not due to assert statement. It does not say that assertion failed. It comes at end only without any message.
Also the file is a part of a large project so I cannot post the code also.
gdb results:
Program received signal SIGABRT, Aborted.
0x7c90e514 in ntdll!LdrAccessResource () from /c/WINDOWS/system32/ntdll.dll
It looks like a memory fault (write to freed memory, double-free, stack overflow,...). When the code can be compiled and run under Linux you can use valgrind to see if there are memory issues. Also you can try to disable parts of the application until the problem disappears, to get a clue where the error happens. But this method can also give false positives, since memory related bugs can cause modules to fail which are not the cause of the error. Also you can run the program in gdb. But also here the position the debugger points to may not be the position where the error happened.
You don't give us much to go on. However, this looks like you are running into some problem when freeing resources. Maybe a heap corruption. Have you tried running it under gdb and then looking where it crashes? Also, check if all your new/delete calls match.
Load the core dump together with the binary into gdb to get an idea at which location the problem list. Command line is:
gdb <path to the binary> <path to the core file>
For more details on gdb see GDB: The GNU Project Debugger.
Run it through AppVerifier and cdb.
E.g.
cdb -xd sov -xd av -xd ch <program> <args>