I'm using gdb to debug an intermittent crash. I can open the core dump, and see that the crash occurred inside a shared library. (I can see the function names and the file name of the library in the backtrace, though I don't have the source code for the library.)
Meanwhile, the library has been updated, so that file name now holds a different version of the library than the one that was loaded when the core dump was generated.
I can run disassemble to see the machine code for the function where the crash occurred - but would I see the code from the version in use when the crash occurred, or will gdb load the code from the library file on disk, thereby picking a mismatching version?
would I see the code from the version in use when the crash occurred, or will gdb load the code from the library file on disk, thereby picking a mismatching version?
The latter (mismatched version).
By default, executable code and other read-only file-backed mappings are not saved in the core, to save space; their contents are assumed to be available on disk already.
On Linux you can ask your system to save read-only mappings with:
echo 0x7 > /proc/self/coredump_filter
See man 5 core.
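To see what a given filter value asks for, here is a minimal Python sketch that decodes the bitmask. The bit meanings follow the list in core(5); the function names and the shortened descriptions are my own, not part of any tool.

```python
# Bit meanings as listed in core(5); the descriptions are shortened.
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
    7: "private DAX pages",
    8: "shared DAX pages",
}


def describe_coredump_filter(mask):
    """Return the mapping types that a coredump_filter mask asks the kernel to dump."""
    return [name for bit, name in COREDUMP_FILTER_BITS.items() if mask & (1 << bit)]


def read_current_filter(path="/proc/self/coredump_filter"):
    """Read this process's filter (Linux only); the file holds a hex value without 0x."""
    with open(path) as f:
        return int(f.read().strip(), 16)


print(describe_coredump_filter(0x7))
```

The usual default, 0x33, leaves bit 2 clear, which is why the library's code pages are normally absent from the core. The filter is inherited by child processes, so it has to be set before the crashing program is started.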
ALL,
I am writing an application which apparently has memory leaks according to MSVC. This application consists of the binary executable and a couple of DLLs. The application and the DLLs both use "Dynamic Linking".
I have also written an application which consists of only one binary file, which is linked statically.
I tried to apply VLD to both.
With the second application there is no problem. It can be started and is executing fine.
With the first application - I can't even start it. It is always crashing on the start-up.
I added the VLD to the main executable and to all the DLLs I am producing.
So I'm wondering what might be causing the crash: whether it is the multiple DLLs or the fact that I'm using "Dynamic Linking".
I also wonder if getting the source code of VLD and trying to compile that along with the project will help and I finally will be able to run the application and see the leaks.
Thank you for any pointers to resolve the crash.
EDIT1:
Here is the backtrace for the crash:
ntdll.dll!77c40e92()
[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]
vld_x86.dll!04f9abf0()
vld_x86.dll!04fae9df()
vld_x86.dll!04faeb4d()
KernelBase.dll!75a241e6()
user32.dll!75f57433()
user32.dll!75f55ab6()
user32.dll!75f558c4()
ntdll.dll!77c496de()
ntdll.dll!77c49658()
ntdll.dll!77c57825()
ntdll.dll!77c5b530()
ntdll.dll!77c6751f()
vld_x86.dll!04faf9b6()
vld_x86.dll!04fadd99()
msvcrt.dll!75c9b0f9()
KernelBase.dll!75a24093()
vld_x86.dll!04faf9b6()
vld_x86.dll!04faf9b6()
vld_x86.dll!04fade47()
ALL,
I installed the latest version of VLD (2.5.1), copied the 2 dlls and the pdb to the executable directory and the program was able to start without crash.
I had some issues reading the output of VLD but I will probably create a new thread for it.
Thank you for reading and sorry for the noise.
Got this callstack when I open a Windows crash dump in Visual Studio 2005:
> myprog.exe!app_crash::CommonUnhandledExceptionFilter(_EXCEPTION_POINTERS * pExceptionInfo=0x0ef4f318) Line 41 C++
pdm.dll!513fb8e2()
[Frames below may be incorrect and/or missing, no symbols loaded for pdm.dll]
kernel32.dll!_UnhandledExceptionFilter@4() + 0x1c7 bytes
...
Looking at the module load info:
...
'DumpFM-V235_76_1_0-20110412-153403-3612-484.dmp': Loaded '*C:\Program Files\Common Files\Microsoft Shared\VS7Debug\pdm.dll', No matching binary found.
...
We see that this binary was not even loaded, because the machine used to analyze the dump is a different machine than the machine that produced the dump.
I don't have access to this other machine at the moment -- can I somehow get this stack fixed, or will I always need the exact binary at this exact path location?
If you absolutely want to debug this dump in Visual Studio, then you can get away with copying the system DLLs from the machine that produced the dump to the same folder where your .dmp file is. That way, it will load those binaries instead of trying to find them in the same path on the debugging system as they were on the original system (which probably will contain different versions of the same modules).
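The copying step can be scripted. A minimal Python sketch, assuming you have already obtained the original modules from the other machine; the function name and the choice not to overwrite existing files are mine:

```python
import shutil
from pathlib import Path


def stage_modules_next_to_dump(dump_file, module_files):
    """Copy each module into the folder that holds the .dmp file.

    Returns the list of copied paths. Existing files with the same name are
    left alone so an earlier copy is not overwritten.
    """
    dump_dir = Path(dump_file).resolve().parent
    copied = []
    for module in map(Path, module_files):
        target = dump_dir / module.name
        if not target.exists():
            shutil.copy2(module, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied
```

Calling it with the path to the .dmp file and a list of DLL paths leaves the DLLs beside the dump, where the debugger will look first.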
As Naveen pointed out though, you won't have this problem when loading the dump in WinDBG (for reasons I have yet to understand). That is why, when I get dumps from clients, I always analyze them in WinDBG.
If you need help on using WinDBG for crash dump analysis, the following Web site is full of info on the subject: http://www.dumpanalysis.org/.
Another option is to use the ModuleRescue tool from the folks at DebugInfo.com. This will scan a dump file, allow you to choose the module that isn't loading symbols, and then it generates a fake module that has just enough info in it for the debugger to load the symbols from the symbol server.
When Visual Studio can't load the symbols for this module and opens a dialog asking you to find the symbols, just point your debugger at that fake module and it will load correctly.
This tool basically does the same thing that WinDbg does, albeit with a different workflow.
One of our users is getting an exception when our product starts up.
She has sent us the following error message from Windows:
Problem Event Name: APPCRASH
Application Name: program.exe
Application Version: 1.0.0.1
Application Timestamp: 4ba62004
Fault Module Name: agcutils.dll
Fault Module Version: 1.0.0.1
Fault Module Timestamp: 48dbd973
Exception Code: c0000005
Exception Offset: 000038d7
OS Version: 6.0.6002.2.2.0.768.2
Locale ID: 1033
Additional Information 1: 381d
Additional Information 2: fdf78cd6110fd6ff90e9fff3d6ab377d
Additional Information 3: b2df
Additional Information 4: a3da65b92a4f9b2faa205d199b0aa9ef
Is it possible to locate the exact place in the source code where the exception occurred, given this information?
What is the common technique for C++ programmers on Windows to locate the place of an error that occurred on a user's computer?
Our project is compiled with Release configuration, PDB file is generated.
I hope my question is not too naive.
Yes, that's possible. Start debugging with the exact same binaries as run by your user, and make sure the DLL is loaded and you've got a matching PDB file for it. Look in Debug + Windows + Modules for the DLL base address. Add the exception offset to it. Open Debug + Windows + Disassembly and enter the calculated address in the Address field (prefix with 0x). That shows you the exact machine code instruction that caused the exception. Right-click + Go To Source Code to see the matching source code line.
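The arithmetic is simple enough to script. Here is a minimal Python sketch that pulls the fields out of the crash report and adds the offset to a base address. The base address 0x10000000 is made up for illustration; read the real one from the Modules window. The timestamp decoding assumes the report's timestamps are the link time from the PE header, in hex seconds since 1970, which helps confirm you have the same build as the user.

```python
from datetime import datetime, timezone


def parse_wer_report(text):
    """Turn 'Key: value' lines from the crash dialog into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def fault_address(module_base, exception_offset_hex):
    """Address to type into the Disassembly window: base + offset."""
    return module_base + int(exception_offset_hex, 16)


def build_time(timestamp_hex):
    """Decode a report timestamp, assumed to be hex seconds since 1970 (UTC)."""
    return datetime.fromtimestamp(int(timestamp_hex, 16), tz=timezone.utc)


report = parse_wer_report("""\
Fault Module Name: agcutils.dll
Fault Module Timestamp: 48dbd973
Exception Code: c0000005
Exception Offset: 000038d7
""")

# 0x10000000 is a hypothetical base address, not taken from the report.
address = fault_address(0x10000000, report["Exception Offset"])
print(hex(address))
print(build_time(report["Fault Module Timestamp"]).date())
```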
While that shows you the statement, this isn't typically good enough to diagnose the cause. The 0xc0000005 exception is an access violation, and it has many possible causes. Often you don't even get any code, because the program may have jumped into oblivion due to a corrupted stack. Or the real problem is located far away, in some pointer manipulation that corrupted the heap. You also typically need a stack trace that shows you how the program ended up at the statement that bombed.
What you need is a minidump. You can easily get one from your user if she runs Vista or Win7. Start TaskMgr.exe, Processes tab, select the bombed program while it is still displaying the crash dialog. Right-click it and Create Dump File.
To make this smooth, you really want to automate this procedure. You'll find hints in my answer in this thread.
If you have a minidump, open it in Visual Studio, set MODPATH to the appropriate folders with the original binaries and PDBs, and tell it to "run". You may also need to tell it to load symbols from the Microsoft symbol servers. It will display the call stack at the error location. If you try to look at the source code for a particular stack location, it may ask you where the source is; if so, select the appropriate source folder. MODPATH is set in the debug command-line properties for the "project" that has the name of the minidump file.
I know this thread is very old, but this was a top Google response, so I wanted to add my $.02.
Although a mini-dump is most helpful, as long as you have compiled your code with symbols enabled (just send the file without the .pdb, and keep the .pdb!) you can look up what line this was using the MSVC Debugger or Windows Debugger. MSDN blog post on that:
http://blogs.msdn.com/b/danielvl/archive/2010/03/03/getting-the-line-number-for-a-faulting-application-error.aspx
Source code information isn't preserved in compiled C++ code, unlike in runtime-based metadata-aware languages (such as .NET or Java). The PDB file is a symbol index which can help a debugger map compiled code backwards to source, but it has to be done during program execution, not from a crash dump. Even with a PDB, Release-compiled code is subject to a number of optimizations that can prevent the debugger from identifying the source code.
Debugging problems which only manifest on end-user machines is usually a matter of careful state logging and a lot of detail-oriented time and effort combing over the source. Depending on your relationship with the user (for example, if you're internal corporate IT development), you may be able to make a virtual machine image of the user's machine and use it for debugging, which can help speed the process tremendously by precisely replicating the installed software and standard running processes on the user's workstation.
There are several ways to find the crash location after the fact.
Use a minidump. See the answers above.
Use the existing executable in a debugger. See the answers above.
If you have PDB files (Visual Studio, Visual Basic 6), use DbgHelpBrowser to load the PDB file and query it for the crash location.
If you have TDS files (separate TDS file, or embedded in the exe, Delphi, C++ Builder 32 bit), use TDS Browser to load the TDS/DLL/EXE file and query it for the crash location.
If you have DWARF symbols (embedded in the EXE, C++ Builder 64 bit, gcc, g++), use DWARF Browser to load the DLL/EXE and query it for the crash location.
If you have MAP files, use MAP File Browser to load the MAP file and query it for the crash location.
I wrote these tools for use in house. We've made them available for free.
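Whatever the symbol format, the lookup itself comes down to finding the last symbol that starts at or before the crash offset. A minimal Python sketch of that idea follows; the symbol table is invented for illustration, and this is not a description of how the tools above are implemented.

```python
import bisect

# Hypothetical (start_offset, name) pairs, as a MAP file or symbol table might list them.
SYMBOLS = sorted([
    (0x1000, "init_library"),
    (0x2400, "parse_header"),
    (0x3800, "copy_buffer"),
    (0x3A00, "free_buffer"),
])


def symbol_for_offset(symbols, offset):
    """Return (name, displacement) for the last symbol starting at or before offset."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        return None  # offset lies before the first known symbol
    start, name = symbols[index]
    return name, offset - start


print(symbol_for_offset(SYMBOLS, 0x38D7))
```

Real MAP files usually list addresses that include the preferred load address, so the module base has to be subtracted first to get comparable offsets.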
What are the 'best practices' when it comes to debugging core dumps using GDB?
Currently, I am facing a problem:
The release version of my application is compiled without the '-g' compiler flag.
The debug version of my application (compiled with '-g') is archived (along with the source code, and a copy of the release binary).
Recently, when a user gave me a core dump, I tried debugging it using
gdb --core=./core.pid ./my_app_debug-bin
The core was created by my_app_release-bin. There seems to be some kind of mismatch between the core file and the binary.
On the other hand, if I try
gdb --core=./core.pid ./my_app_release-bin
the core matches but I am unable to get source code line numbers (although I get the function names).
Is this what is practised? Because I feel I am missing something here.
It sounds like there are other differences between your release and debug builds than simply the absence or presence of the -g flag. Assuming that's the case, there is nothing you can do right now, but you can adjust your build to handle this better:
Here's what we do at my place of work.
Include the -g flag when building the release version.
Archive that version.
Run strip --strip-unneeded on the binary before shipping it to customers.
Now, when we get a crash we can use the archived version with symbols to do debugging.
One thing to note is that if your release version includes optimization, debugging may be difficult even with symbols. For example, the optimizer can reorder your code so even though the debugger will say you crashed on line N, you can't assume that the code actually executed line N-1.
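The build steps above can be sketched as a small Python helper that produces the command lines. The use of gcc, the -O2 flag, and the ".debug" suffix for the archived copy are assumptions for illustration; substitute your own compiler, flags, and archive location.

```python
def release_build_commands(source, binary):
    """Commands for: build with -g, archive the unstripped binary, strip the shipped one."""
    archived = binary + ".debug"  # hypothetical naming for the archived copy
    return [
        ["gcc", "-g", "-O2", "-o", binary, source],   # release flags plus debug info
        ["cp", binary, archived],                      # keep the copy with symbols
        ["strip", "--strip-unneeded", binary],         # this one ships to customers
    ]


for command in release_build_commands("my_app.c", "my_app"):
    print(" ".join(command))
```

When a core arrives from a customer, the archived copy is the one to hand to gdb, since it is the same build as the shipped binary with the symbols still in place.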
You need to do some additional stuff to create binaries with stripped debug information that you can then debug from cores. Best description I could find is here
No, you're not missing anything. Debug and release are just different binaries, so the core files of the release build don't match the debug binary. You have to look at machine code to get something from the release core dump.
You probably have to ask your user how the crash happened and collect additional log information or whatever your app produces.
I have a couple of questions regarding core dumps. I have gdb on Windows, using Cygwin.
What is the location of the core dump file? Is it the a.exe.stackdump file? (This is the only file that is generated after the crash.) I read on other forums that the core dump file is named "core". But I don't see any file with the name "core".
What is the command for opening and understanding core dump file?
You need to configure Cygwin to produce core dumps by including
error_start=x:\path\to\dumper.exe
in your CYGWIN environment variable (see here in section "dumper" for more information). If you didn't do this, you will only get a stacktrace -- which may also help you in diagnosing the problem, though.
Start gdb as follows to attach it to a core dump file:
gdb myexecutable --core=mycorefile
You can now use the usual gdb commands to print a stacktrace, examine the values of variables, and so on.
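If you only want the stack trace, gdb can be run non-interactively. A minimal Python sketch, assuming gdb is on the PATH; the function names are mine, and the file names are the placeholders from the command above.

```python
import shutil
import subprocess


def gdb_backtrace_command(executable, core_file):
    """argv for a non-interactive gdb run that prints the backtrace and exits."""
    return ["gdb", "--batch", "-ex", "bt", executable, "--core=" + core_file]


def print_backtrace(executable, core_file):
    """Run gdb if it is installed; return its output, or None when gdb is missing."""
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


print(" ".join(gdb_backtrace_command("myexecutable", "mycorefile")))
```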
Yes, Cygwin creates a.exe.stackdump files by default. You need to configure it to create cores as well (Martin's answer covers that).
A simple tutorial on core dump debugging can be found here.