Solaris Core dump analysis - gdb

I use pstack to analyze core dump files on Solaris.
How else can I analyze a core dump on Solaris?
What commands can be used to do this?
What other information will be available from the dump?

You can use the Solaris Modular Debugger, mdb, or dbx. mdb comes with the SUNWmdb package (or SUNWmdbx for the 64-bit version).
A core file is the image of your running process at the time it crashed.
Depending on whether your application was compiled with debug flags, you will be able to view the stack, and hence find out which function caused the core dump, the values of the parameters passed to that function, the values of the variables, the allocated memory zones, and so on.
On recent Solaris versions, you can configure what the core file will contain with the coreadm command; for instance, you can include the mapped memory segments the process was attached to.
Refer to MDB documentation and dbx documentation. The GDB quick reference card is also helpful once you know the basics of GDB.

I guess any answer to this question should start with a simple recipe:
For dbx, the recipe is:
% dbx a.out core
(dbx) where
(dbx) threads
(dbx) thread t@3
(dbx) where

If the core dump is from a program you wrote or built, use whichever debugger you would normally use to debug the running application; they should all be able to load core files. If you're not picky about debuggers and you're using Solaris, I would recommend dbx. It helps to get the latest FCS version of Sun Studio with patches, or else the latest Express version of Sun Studio. It's also very helpful to load the core file into the debugger on the same system where the core file was created. If the code in the libraries is different from when the core file was created, the stack trace won't be useful where it goes through those libraries. Debuggers also use OS helper libraries to understand the libthread and runtime linker data structures, so if you need to load the core file on a different machine, make sure the helper libraries installed there match the system data structures of the OS that produced the core. You can find out everything you never wanted to know about these system libraries in a white paper that was written a few years ago:
http://developers.sun.com/solaris/articles/DebugLibraries/DebugLibraries_content.html

The pflags command is also useful for determining the state each thread was in when it core dumped. In this way you can often pinpoint the problem.

I would suggest trying gdb first as it's easier to learn basic tasks than the native Solaris debuggers in my opinion.

GDB can be used.
It can give the call that was attempted prior to the dump.
http://sourceware.org/gdb/
http://en.wikipedia.org/wiki/GDB
Having the source is great, and being able to reproduce the error is even better, since you can then debug it directly.
Worked great for me in the past.

Attach to the process image using the dbx debugger:
dbx [executable_file_name] [coredump_file_name]
It is important that there were no changes to the executable since the core was dumped (i.e. it wasn't rebuilt).
You can view the stack trace, which shows where the program crashed, with the dbx command "where".
You can move up and down the stack with the commands "up" and "down", or jump to an exact stack frame with "frame [number]", using the numbers shown in the output of "where".
You can print the value of variables or expressions with the "print [expr]" command.
Have fun.

I found dbx on my Solaris x86 box at
/opt/SUNWspro/bin/dbx
Cheers!

Related

Headless debugging on Windows

There is a bug that I would like to fix that only occurs on Windows Server without a GUI running. I have set up a Windows Server 2019 machine on Google Compute Engine that reproduces the bug, and would like to debug it.
Ideally, I would like to use gdb, but seeing as the program was built with Visual Studio 2019, gdb can't read the debugging symbols.
I don't have a Windows machine, so using Visual Studio will be difficult. I could set up a VM, but if there's an in-terminal way to do this that would be preferred.
I did a pretty thorough Google search, but it didn't turn up anything. Is there really no Windows solution for debugging C++ code headlessly?
Microsoft ships two console debuggers, CDB and NTSD, so you don't actually need the Visual Studio GUI to debug. In fact, Microsoft provides many debugging environments for Windows besides the usual Visual Studio. Just install them on your Windows Server and control them remotely from your terminal.
You can also debug MSVC-compiled code with LLDB, since the PDB format was published long ago and LLVM on Windows supports it. I have no idea about current LLDB on Linux, though.
And since you have the source code, sometimes old-school printf debugging is the best way to analyze the issue.
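As a minimal sketch of the printf-debugging approach mentioned above: the helper name format_trace and the TRACE macro below are hypothetical, not part of any library. The idea is to tag each message with its source file and line and to flush stderr immediately, so the last message is not lost if the process dies right afterwards.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper: builds one trace line of the form "file:line: message".
std::string format_trace(const char* file, int line, const char* fmt, ...) {
    char msg[512];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(msg, sizeof msg, fmt, args);  // truncates overly long messages
    va_end(args);
    return std::string(file) + ":" + std::to_string(line) + ": " + msg;
}

// Hypothetical macro: writes the trace line to stderr and flushes at once,
// so the output survives a crash that happens right after the call.
#define TRACE(...)                                                           \
    do {                                                                     \
        std::fprintf(stderr, "%s\n",                                         \
                     format_trace(__FILE__, __LINE__, __VA_ARGS__).c_str()); \
        std::fflush(stderr);                                                 \
    } while (0)
```

A call such as TRACE("handle=%d", h) placed before and after a suspect call narrows down where the failure occurs; on a headless server, stderr can be redirected to a file.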
If you can get a Windows VM, remote debugging will work much better. Almost all debuggers support it, including GDB and LLDB, so even if you don't have the source code you can still run any Windows debugger from a remote machine and step through instructions instead of high-level code lines.
An alternative is to take a memory dump and debug it later. After getting the dump file, just drag it into your VS solution or any debugging tool such as WinDbg and then select "Start Debugging". Now you can inspect the code at the point of the dump, examine variables' values, or jump to an arbitrary function's stack frame, much as if the malfunctioning app were stopped in the debugger.
There are many ways to dump a process's memory. You can set Windows to automatically save a dump file when your app crashes, or capture memory snapshots manually at runtime. Comparing two snapshots is also useful for detecting leaked memory. For more information on how to do that, read:
Collecting User-Mode Dumps
Steps to Catch a Simple “Crash Dump” of a Crashing Process
There's also an easy way to take a dump of a live process using Task Manager (or a similar tool).

Qt Creator can't break on thrown exceptions (when using CDB as debugger)

I set Qt Creator to break when a C++ exception is thrown:
I then tested it with this code:
try {
    throw std::runtime_error("error");
} catch (std::exception &e) {
    qDebug("%s", e.what());
}
But it didn't break on throw std::runtime_error("error");. I'm using CDB, not GDB, because I'm using the MSVC Kit.
Edit: There is another question where CDB is working for the OP, even if slowly. So it should work in principle. My configuration is: Qt Creator 3.3.0, compiling with Qt4/MSVC 9.0 (x86), the debugger is CDB 6.2.9200.16384.
Edit 2: This is what I'm getting in the CDB log window (I made a diff between the CDB log with and without the breakpoint):
<bu100400 CxxThrowException
<!qtcreatorcdbext.breakpoints -t 1 -v
<!qtcreatorcdbext.pid -t 2
dATTEMPT SYNC
d*** Bp expression 'CxxThrowException' contains symbols not qualified with module name.
1 breakpoint(s) pending...
*** Unable to resolve unqualified symbol in Bp expression 'CxxThrowException' from module 'C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_5.82.7601.18201_none_ec80f00e8593ece5\comctl32.dll'.
Full CDB log (in case needed): http://pastebin.com/jhNRy9bE
Edit 3: @HansPassant explained why it fails in the comments:
Keep in mind that you are using a very old version of MSVC++, big changes at VS2012. The pastebin shows it being out of sync pretty badly, never getting to the DLL that contains __CxxThrowException@8 (MSVCR90D.dll) before the exception is thrown. It is simple with the sxe debugger command, automatic break when any exception is thrown. Maybe you shouldn't be using QT's UI at all, it looks too gimpy. – Hans Passant 10 hours ago
Just look at the trace, the debugger shows what DLLs it is searching for "CxxThrowException". It never gets to msvcr90d.dll. And the exception is thrown while it is searching for the symbol after which it all ends. Completely out of sync. – Hans Passant 56 mins ago
I'll just write up why this is going wrong, a workaround is going to be difficult to find. The debugger trace in your pastebin tells the tale.
The basic issue is that the communication between the debugger and the QT front-end is rather poor. In your case it gets out of sync: QT doesn't wait long enough for the debugger to complete the command. QT tries to set a breakpoint on the msvcr90d.dll!__CxxThrowException@8 function, the one that raises a C++ exception in the Microsoft CRT. This function can be present in more than one executable file if the program uses multiple CRTs. A pretty common mishap, caused by building with /MT. And sometimes intentional if you use a well-isolated DLL that you interface with by using COM.
Setting the breakpoint takes a while, as you might imagine (this is the complaint in the linked question): the debugger has to plow through the symbol information for every DLL that's loaded. It will take especially long if the PDB for the DLL needs to be downloaded from the symbol server and doesn't otherwise get cached so it is available the next time you debug. Not your issue afaict, it does set up the cache location to C:\Users\sasho\AppData\Local\Temp\symbolcache. Go have a look-see to verify that you do see PDBs for the operating system DLLs there.
This operation is tricky, the debugger doesn't give a good signal that it is done searching the DLLs. What the QT should do is verify the debugger feedback against the list of DLLs it obtained. It does not do that, it issues the g command before the debugger is done searching. Could be a timeout that is too short but it actually looks like QT doesn't count on the debugger performing this command in the background. A convenience to a human, not exactly very helpful here :)
There ought to be a way to configure CDB to not perform this search in the background. This is well-hidden, I don't see anything in the debugger.chm help file but it probably wasn't updated in a while. Google doesn't help either. I'd recommend you ask a question about it. Most significant, perhaps, is that you have a rather major mismatch in version numbers. The compiler you use is 2008 vintage, the debugger is quite new, SDK 8.0 version, and I can't tell what QT version you use.
So a possible workaround is to intentionally use an older version of CDB, one that's more likely to have been tested with the QT front-end version you use. Download the corresponding SDK version, version 6.0 matches the VS2008 time frame. I think the "Debugging Tools for Windows" was still a separate download back then and not yet included in the SDK. Another workaround is to stop relying on the friendly QT front-end and learn to drive CDB from the command prompt. Moderately more useful is WINDBG, uses the same debugging engine but has a GUI interface. Just moderate, it is still mostly prompt driven. You do lose several days of your life learning the commands however. Getting the debugger to break when an exception is thrown is trivial, use the sxe command.

c++ program terminating when one thread has access violation - how to catch this in linux - for win32 I get stacktraces in vs2010

c++ program terminated with no exceptions or stacktrace
I have a multi-threaded application.
If one of my threads has an access violation from reading out of bounds on an array (or any segfault condition), my entire application immediately terminates.
If this happens on the Windows counterpart using Visual Studio, I get a nice stack trace showing where the error was and what the issue was.
I desperately need this type of debugging environment to succeed at my project. I have too many threads and too many developers working on different parts of the project to let one person's unhandled exception take down the entire project.
I am running Fedora Core 14.
I am compiling with gcc 4.5.1.
gdb is Fedora 7.2-16.fc14.
My IDE is Eclipse Juno, and I am using the CDT builder.
My toolchain is Cross GCC and my builder is the CDT Internal Builder.
Is there ANY setting for the gdb or gcc or eclipse that will help me detect these type of situations?
That's what's supposed to happen. Under Unix, you get a full
core dump (which you can examine in the debugger), provided
you've authorized them (ulimit -c). Traditionally, they
were authorized by default, but Linux seems to have changed
this.
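If changing the shell's ulimit is inconvenient, a process can also raise its own core-file limit at startup. The following is a minimal sketch using the POSIX getrlimit/setrlimit calls; the function name enable_core_dumps is hypothetical. It only raises the soft limit up to the existing hard limit, so if the hard limit is 0 no core file will be produced, and where the file is written still depends on system configuration.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raises the soft core-file size limit of the current
// process to its hard limit. Returns true if both system calls succeed.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;               // could not read the current limits
    }
    lim.rlim_cur = lim.rlim_max;    // soft limit may not exceed the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling this once near the start of main() would be the typical usage; child processes inherit the raised limit.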
Of course, to get any useful information from the core dump,
you'll need to have compiled the code with symbol information,
and not stripped it later. (On the other hand, you can copy the
core dump from your user's machine onto your development machine,
and see what happened there.)
You're definitely looking for core dumps, as James Kanze wrote.
I would just add that core dumps will show you where the program crashed, which is not necessarily the same place as where the problem (memory corruption etc.) occurred. And of course some out-of-bounds reads/writes may not exhibit themselves by crashing immediately.
You can narrow the search by enabling checks for incorrect memory allocations/deallocations in glibc. The simplest way is to set the environment variable MALLOC_CHECK_. Set it to 2 and glibc will check for heap corruption on every memory allocation/deallocation and abort the program when it finds a problem (producing a core dump, if enabled). This often helps to get closer to the real problem.
http://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html
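Since out-of-bounds reads may not crash at the point of the error, another option, assuming the data lives in a standard container rather than a raw array, is to use bounds-checked access so that the failure is reported where it happens. This is only a sketch; the helper name access_is_rejected is hypothetical and exists just to show the behaviour of std::vector::at.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether index i is rejected by the bounds
// check of std::vector::at. operator[] performs no such check, and an
// out-of-range index there is undefined behaviour.
bool access_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);   // throws std::out_of_range if i >= v.size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

An uncaught std::out_of_range ends in std::terminate, which calls abort by default and so produces a core dump (if enabled) close to the faulty access.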

GDB 7.5 (OS X): Can't access source from library functions

gdb newbie here, so I hope I haven't overlooked something glaringly obvious… (and if I did, maybe a kind soul could point it out anyway? ;)
I'm debugging a GCC C++ application under OS X Lion. As it's quite heavy on STL, I'd really like to use a GDB version with python support (i.e. >=v7) for pretty printing of containers. My application is split up into a backend library (.dylib) that does all the heavy lifting, and a very simple frontend application. All of the sources and binaries are below a common source path, and everything has been compiled with debugging symbols (I tried both -g and -ggdb).
Using the GDB version in XCode (which identifies as "GNU gdb 6.3.50-20050815 (Apple version gdb-1820)"), displaying the source lines of frames in a backtrace works as expected out of the box, no matter whether the respective call happens in the frontend app or the backend library:
(gdb) f 12
#12 0x000000010002ddc5 in FL3D::Resource::createMesh_ (this=0x7fff5fbff7c8, fl3d=@0x7fff5fbff1f8, id=) at /Development/workspace/fl3d/libfl3d/resource.cpp:234
234 std::vector& t = textureIndices_.at(i);
(gdb)
So far so good. GDB 7.5 and 7.4.1, on the other hand, refuse to give me any source lines from the library:
(gdb) f 12
#12 0x000000010002ddc5 in FL3D::Resource::createMesh_(FL3D::FL3DParser&, std::string) ()
from /Development/workspace/fl3d/libfl3d/build/libfl3d.dylib
(gdb)
I'm really confused by the different responses given – gdb6 prints the correct path to the source file and the correct line, while gdb7 gets the function prototype right (supposedly read from the debugging symbols of the .dylib?), but doesn't seem to know anything about the source. Interestingly, though, it DOES show the corresponding source line for calls in the frontend's main() function!
I've already tried manually setting the path to the library's source files with "dir libfl3d", but that doesn't change anything. I also notice that gdb6 says "Reading symbols for shared libraries" a few times when I run the application and gdb7 doesn't – but the symbols don't seem to be the problem, as they seem to be resolved correctly by both versions?
I'm at my wit's end here. Any pointers?
The Apple gdb is displaying the debug information because it knows how to find and parse DWARF on this platform. The gdb version 7 that you're showing is a gdb that doesn't know how to find the DWARF debug information on a Mac OS X system; the output you're showing above is what no-debug-info looks like. My guess is that the FSF gdb version 7 support for Mac OS X has not seen a lot of attention, so I would be hesitant to recommend using it on this platform.
As bames53 notes, you're far better off using LLDB on Mac OS X at this point. It is the debugger that all of the support work is going into, and Objective-C / C++ container support is rapidly being added to LLDB but not gdb. The gdb provided by Apple is on an end-of-life path and all users will be switched over to LLDB in the future.
Give lldb a try. It's a little different but it's quite good. There is a cheat sheet that a lot of people find useful in the beginning; it shows gdb and lldb command equivalences. http://lldb.llvm.org/lldb-gdb.html

Visual studio release build

I'm trying to generate a release build for a C++ application that I've written. The application runs fine (debug & release) when you run it from within VS2008; but when you run the executable it crashes nearly every single time.
Now, is there a hack so I can run this application as a standalone application without having to run through all of the code and finding the bug that is causing it?
Thanks in advance.
In short, no.
You will have to find the bug. If it works within VS, then I'd hazard a guess that it is a timing issue; possibly you're overwriting shared thread data. This would be less likely (though still possible) to show up inside VS, as the program is being run in a debug environment, which slows it down a bit.
If you want help finding your bug, then tell us more. Otherwise, build your release with debug symbols (PDBs), install Dr. Watson as the system debugger and run it standalone. When it crashes, Dr. Watson will create a minidump file; load this into WinDbg (my favourite) and you'll be able to see exactly where your bug is. It'll even tell you that the dump contains an exception and show it to you by default. (You need to add your source code path and the path to your symbols in WinDbg to get it to do this correctly.)
Then you will also know how to diagnose crashes when the app is run on-site too.
Are you loading external resources? If you are, check that the relative paths in the C++ program are correct.
One possibility is that your program uses uninitialized heap data. Launching a program from the debugger enables the NT debug heap, which causes the heap allocator to fill new memory blocks with a fill pattern, and also enables some heap checking. Launching the same program from outside the debugger leaves the NT debug heap disabled, but if the program was linked against the debug version of the C runtime, then the CRT debug heap will still be enabled.
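To illustrate the uninitialized-heap possibility, here is a minimal sketch of the usual fix: make sure heap objects are initialized explicitly instead of relying on whatever pattern the allocator happens to leave behind. The names Settings and make_zeroed_buffer are hypothetical, and the default member initializers assume a C++11 or later compiler (with an older compiler such as VS2008, a constructor would do the same job).

```cpp
#include <cstddef>
#include <memory>

// Hypothetical example type: default member initializers guarantee that a
// heap-allocated Settings never carries indeterminate values.
struct Settings {
    int retries = 0;
    bool verbose = false;
};

// Hypothetical helper: "new int[n]()" value-initializes every element to 0,
// whereas "new int[n]" would leave the elements indeterminate.
std::unique_ptr<int[]> make_zeroed_buffer(std::size_t n) {
    return std::unique_ptr<int[]>(new int[n]());
}
```

Code that behaves the same with and without a debugger attached is a good sign that it no longer depends on the heap's fill pattern.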
A much less likely possibility is that your program requires SeDebugPrivilege to be set in its process token. The debugger enables this privilege in its process token, which has the side effect that all programs launched from the debugger inherit this privilege. If your program tries to use OpenProcess()/ReadProcessMemory()/WriteProcessMemory() and doesn't handle errors correctly, it's conceivable that it could crash.
There are a few possibilities. Besides what has already been mentioned, running an app from Visual Studio will execute in the same security context as the Visual Studio instance. So if, for instance, you are working on Vista, you might be hitting an unhandled security violation if you're trying to access protected files, or the registry.
What if you build a debug version and run it standalone? Does it crash? If so, you can usually break into the debugger from there and get a call stack to see what the malfunction is.
From the details you've given, it sounds like there may be a library issue. Are you running the program on the same computer? If not then you'll also have to deploy the appropriate libraries for your application. If you are running on the same computer but outside of the dev environment, ensure that your application can see the appropriate libraries.
The best way I have found to debug a release build is to create a crash dump when a crash happens; the dump then lets me load debug symbols on my dev computer and find out what's going on. More info here: http://www.debuginfo.com/articles/effminidumps.html
You can also go to File > Open in Visual Studio and open the .exe, so you are not starting it under the debugger per se. Not sure if it will help.
http://blogs.msdn.com/saraford/archive/2008/08/21/did-you-know-you-can-debug-an-executable-that-isn-t-a-part-of-a-visual-studio-project-without-using-tools-attach-to-process-296.aspx