Waiting with a crash for a debugger? - c++

When an assert fails or there is a segmentation fault, it would be very convenient if one of the following happened:
The program asks whether to run a debugger.
The program delays crashing until a debugger is attached.
The program leaves something behind (a core dump?) so that we can resume from that point and investigate.
The question is quite general because of the variety of platforms, languages and debuggers.
I'm asking about C++, and I guess that Windows (VS), Linux (gdb) and Mac (gdb?) solutions would be most useful for the community. I'm interested in Linux + gdb.

On Linux (and probably OS X and other Unixes), you can allow programs to leave a core dump with the shell's ulimit command (for example, ulimit -c unlimited).
Here's a quick howto.
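The same limit can also be raised from inside the program. The following is a minimal sketch, assuming a POSIX system; the function name enable_core_dumps is made up for illustration. It lifts the soft core-size limit to the hard limit, which is the most an unprivileged process is allowed to do.

```cpp
#include <cassert>
#include <sys/resource.h>

// Hypothetical helper: raise the soft RLIMIT_CORE to the hard limit so that
// a later crash is allowed to write a core file.
// Returns false if the limit could not be read or changed.
bool enable_core_dumps() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // soft limit may be raised up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Call it early in main(). If the hard limit is already 0, or the system routes core files elsewhere, no core file will appear in the working directory, so treat this as a starting point only.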

On Windows there is DebugBreak() (and IsDebuggerPresent()); breaking into the debugger is one of the options for what can happen when an assert fails.
On Mac OS there are similar API calls (Debugger() or SysBreak()).
I don't know much about Linux, but AFAIK a failed assertion on Linux calls abort(), which raises SIGABRT and produces a core dump (if core dumps are enabled) that can be looked at in the debugger.

On Linux, when something horrible happens, your program receives a signal. Each signal has a default action, but for most signals you can install your own handler to do something else, such as launching gdb. You can find out how to catch signals, and a lot more, from here, specifically here.
Regarding the assert, you can easily create your own version of assert to do whatever you want.
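As an illustration of a home-made assert, here is a minimal sketch. The names DEBUG_ASSERT, AssertHandler, assert_handler and abort_on_failure are all invented for this example. The macro forwards a failed condition to a replaceable handler, so the failure policy (abort, stop and wait for a debugger, log and continue) can be chosen at runtime.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical hook type: called with the failed expression and its location.
using AssertHandler = void (*)(const char* expr, const char* file, int line);

// Default policy in this sketch: report and abort (which raises SIGABRT).
void abort_on_failure(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

// Accessor for the currently installed handler; assign to it to replace it.
AssertHandler& assert_handler() {
    static AssertHandler handler = abort_on_failure;
    return handler;
}

// Evaluates cond once; calls the installed handler only when it is false.
#define DEBUG_ASSERT(cond) \
    ((cond) ? static_cast<void>(0) \
            : assert_handler()(#cond, __FILE__, __LINE__))
```

A handler that should hand control to a debugger could, for example, print the process ID and stop the process instead of aborting.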

Unfortunately, my response only extends to Windows but it would make sense that Linux would also have some way to signal a debugger.
Any machine with Visual Studio installed on it should have Just in Time debugging enabled. This essentially means that the debugger does not have to be running when a process encounters a fatal exception.
Just in Time debugging is enabled through a registry key. Check out the link above for additional details.
If you're looking to capture a process snapshot for later review, this is generally accomplished via Adplus.vbs (attended) or DebugDiag (unattended). Adplus is available through the Debugging Tools for Windows, but DebugDiag is a separate download.

In addition to ulimit, as suggested by gnud, it could be a good idea to use a crash reporter: http://code.google.com/p/google-breakpad/w/list

I've implemented this functionality as an LD_PRELOAD-able library in https://github.com/l29ah/waitgdb
Basically it handles debug-worthy signals and stops the process by sending it SIGSTOP, so you can attach your debugger later.
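The general idea can be sketched in a few lines. This is not the code of the linked library, only an illustration under stated assumptions: the namespace stop_on_crash, the choice of signals and the message text are all made up here. The handler restricts itself to write(), signal() and raise(), and the message is formatted in advance so that no formatting happens inside the handler.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdio>
#include <unistd.h>

namespace stop_on_crash {

// Signals treated as "debug-worthy" in this sketch (an assumption).
constexpr int kSignals[] = {SIGSEGV, SIGABRT, SIGFPE, SIGILL, SIGBUS};

char g_msg[128];            // message prepared ahead of time by install()
std::size_t g_msg_len = 0;

void on_fatal_signal(int sig) {
    ssize_t ignored = write(STDERR_FILENO, g_msg, g_msg_len);
    (void)ignored;
    signal(sig, SIG_DFL);  // if the fault repeats after resuming, die normally
    raise(SIGSTOP);        // suspend until a debugger attaches or SIGCONT arrives
}

// Installs the handler for every signal in kSignals; false on any failure.
bool install() {
    int n = std::snprintf(g_msg, sizeof g_msg,
                          "fatal signal: process stopped, attach with gdb -p %ld\n",
                          static_cast<long>(getpid()));
    if (n < 0) return false;
    g_msg_len = static_cast<std::size_t>(n) < sizeof g_msg
                    ? static_cast<std::size_t>(n)
                    : sizeof g_msg - 1;
    struct sigaction sa{};
    sa.sa_handler = on_fatal_signal;
    sigemptyset(&sa.sa_mask);
    for (int sig : kSignals) {
        if (sigaction(sig, &sa, nullptr) != 0) return false;
    }
    return true;
}

}  // namespace stop_on_crash
```

After stop_on_crash::install() has run, a crash leaves the process in the stopped state, and gdb can then attach to the printed PID.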

Related

Debugging a program without halting it

I have a large multi-threaded program written in cpp and compiled with gcc.
Every now and then I run into bugs at runtime. Is there a way to attach gdb (or anything else) and try to look what each thread is doing and maybe see some internal class members?
The thing is I do not want gdb to freeze the program. There are timing-sensitive parts, and freezing the program will change its behavior (and possibly crash it if the freeze is long enough).
Is there a way to attach gdb (or anything else) and try to look what each thread is doing and maybe see some internal class members?
Yes: GDB can examine stack trace of each thread, and local and global variables (provided you compiled with debug info).
The thing is I do not want gdb to freeze the program.
That's trickier: GDB can only examine stopped threads.
If you have some threads that should continue to run, you should look into non-stop debugging mode.

How GDB handles SIGSEGV

When debugging a C++ program that raises SIGSEGV under gdb, it is possible to handle the signal and ask gdb not to stop.
How does gdb handle this kind of scenario?
I have searched the gdb source code and couldn't find a starting point.
You cannot automatically ignore SIGSEGV, and I wouldn't recommend doing that anyway. Although you can make gdb ignore the signal and not pass it to the program, the kernel will re-run the offending instruction once the signal handler returns, which results in an infinite loop. See this answer for more information.
One way to work around it is to skip the instruction or change register values so that it does not segfault. The link shows an example of setting a register. You can also use the jump command to skip over an instruction.
is possible to handle the signal and asked to nostop.
It's unclear whether you want GDB to handle the signal, or the program itself.
If the latter, the GDB command handle SIGSEGV nostop noprint pass will do exactly that.
This is actually something the OS does. On Windows, if a program has a debugger attached and an exception is thrown, Windows will ask the debugger if it wants to handle it. If/when it declines, it passes it to the program. If the program doesn't handle it, Windows passes it to the debugger again.

Automatically Relaunch Application On Crash?

On Android, I'm running an application using the NDK that runs a series of tests in C++. If ever one of the tests fails, which most likely means a crash, I'd like the application to relaunch itself and start at the next test.
I wish I could use exceptions but the NDK doesn't support them.
Is this possible?
Why does your application have to crash? Why not catch any exception being thrown? Even though the compiler doesn't force you to add a try..catch block, RuntimeExceptions might still be thrown.
You can also use Thread.setDefaultUncaughtExceptionHandler. Note that this static method sets the default handler for all threads; a handler for a single thread is set with Thread.setUncaughtExceptionHandler.
If, for some reason, the solutions above are not suitable for you, you could create a background service that acts as a watchdog timer.
EDIT: Check this link for a custom version of the NDK that supports C++ exceptions. I found it in this thread.
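On the native side, one way to picture the watchdog idea is to run each test in its own child process, so that a crash only kills the child and the runner carries on with the next test. The sketch below is plain POSIX. Whether fork() is appropriate inside an Android application process is an assumption that would need checking. The names run_isolated, count_failures, sample_pass and sample_crash are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs one test in a child process.
// Returns true only if the child exited normally with status 0.
bool run_isolated(void (*test)()) {
    pid_t pid = fork();
    if (pid < 0) return false;  // could not create the child
    if (pid == 0) {
        test();                 // a crash here only takes down the child
        _exit(0);
    }
    int status = 0;
    pid_t done;
    do {
        done = waitpid(pid, &status, 0);
    } while (done == -1 && errno == EINTR);
    return done == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Runs every test in turn and returns how many of them failed or crashed.
int count_failures(void (*const tests[])(), int count) {
    int failures = 0;
    for (int i = 0; i < count; ++i) {
        if (!run_isolated(tests[i])) ++failures;
    }
    return failures;
}

// Hypothetical sample tests used to show the behaviour.
void sample_pass() {}
void sample_crash() { std::abort(); }
```

The runner itself never executes test code, so it survives a segmentation fault or abort in any single test.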

Possible to trap write to address (x86 - linux)

I want to be able to detect when a write to a memory address occurs, for example by setting a callback attached to an interrupt. Does anyone know how?
I'd like to be able to do this at runtime (possibly gdb has this feature, but my particular application causes gdb to crash).
If you want to intercept writes to a range of addresses, you can use mprotect() to mark the memory in question as non-writeable, and install a signal handler using sigaction() to catch the resulting SIGSEGV, do your logging or whatever and mark the page as writeable again.
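A minimal sketch of that approach follows, assuming Linux and a single watched page obtained from mmap(). The namespace write_trap and every name in it are invented for this example. Strictly speaking, mprotect() is not on the POSIX list of async-signal-safe functions; calling it from the handler is common practice on Linux, but treat that as an assumption.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

namespace write_trap {

void* g_page = nullptr;                 // start of the watched page
std::size_t g_page_size = 0;
volatile std::sig_atomic_t g_hits = 0;  // number of trapped writes
void* volatile g_last_addr = nullptr;   // address of the last trapped write

void on_segv(int, siginfo_t* info, void*) {
    auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    auto base = reinterpret_cast<std::uintptr_t>(g_page);
    if (g_page == nullptr || addr < base || addr >= base + g_page_size) {
        signal(SIGSEGV, SIG_DFL);  // a genuine crash: let it fault again and die
        return;
    }
    g_hits = g_hits + 1;           // "do your logging or whatever"
    g_last_addr = info->si_addr;
    // Make the page writable again; the faulting write is then retried.
    mprotect(g_page, g_page_size, PROT_READ | PROT_WRITE);
}

// Maps one page, installs the handler and write-protects the page.
// Returns a pointer to an int at the start of the page, or nullptr on failure.
int* make_watched_int() {
    g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* page = mmap(nullptr, g_page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return nullptr;
    g_page = page;
    struct sigaction sa{};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;      // needed to receive the fault address
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, nullptr) != 0) return nullptr;
    if (mprotect(g_page, g_page_size, PROT_READ) != 0) return nullptr;
    return static_cast<int*>(g_page);
}

}  // namespace write_trap
```

Only the first write is trapped, because the handler leaves the page writable. To keep watching, the page has to be protected again after the write has completed.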
What you need is access to the X86 debug registers: http://en.wikipedia.org/wiki/Debug_register
You'll need to set the breakpoint address in one of DR0 to DR3, and then the condition (data write) in DR7. The interrupt will occur and you can run your debug code to read DR6 and find what caused the breakpoint.
If GDB doesn't work, you might try a simpler/smaller debugger such as http://sourceforge.net/projects/minibug/ - if that isn't working, you can at least go through the code and understand how to use the debugging hardware on the processor yourself.
Also, there's a great IBM developer resource on mastering Linux debugging techniques which should provide some additional options:
http://www.ibm.com/developerworks/linux/library/l-debug/
A reasonably good article on doing this in Windows is here (I know you're running on Linux, but others might come along to this question wanting to do it in Windows):
http://www.codeproject.com/KB/debug/hardwarebreakpoint.aspx
-Adam
GDB does have that feature: it is called hardware watchpoints, and it is very well supported on Linux/x86:
(gdb) watch *(int *)0x12345678
If your application crashes GDB, build current GDB from CVS Head.
If that GDB still fails, file a GDB bug.
Chances are we can fix GDB faster than you can hack around SIGSEGV handler (provided a good test case), and fixes to GDB help you with future problems as well.
mprotect() does have a disadvantage: it works on whole pages, so your memory must be page-aligned. I had my problematic memory on the stack and was not able to use mprotect().
As Adam said, what you want is to manipulate the debug registers. On Windows, I used this: http://www.morearty.com/code/breakpoint/ and it worked great. I also ported it to Mach-O (Mac OS X), and it worked great, too. It was also easy, because Mach-O has thread_set_state(), which is equivalent to SetThreadContext().
The problem with Linux is that it doesn't have such equivalents. I found ptrace, but I thought: this can't be it, there must be something simpler. But there isn't. Yet. I think they are working on a hw_breakpoint API for both kernel and user space (see http://lwn.net/Articles/317153/).
But when I found this: http://blogs.oracle.com/nike/entry/memory_debugger_for_linux I gave it a try and it wasn't that bad. The ptrace method works by some "outside process" acting as a "debugger", attaching to your program, injecting new values for the debug registers, and terminating with your program continuing with a new hw breakpoint set. The thing is, you can create this "outside process" yourself by using fork(), (I had no success with a pthread), and doing these simple steps inline in your code.
The addwatchpoint code must be adapted to work with 64-bit Linux, but that's just changing USER_DR7 etc. to offsetof(struct user, u_debugreg[7]). Another thing is that after a PTRACE_ATTACH, you have to wait for the debuggee to actually stop. Instead of retrying a POKEUSER in a busy loop, the correct thing to do is a waitpid() on your pid.
The only catch with the ptrace method is that your program can have only one "debugger" attached at a time. So a ptrace attach will fail if your program is already running under gdb control. But just like the example code does, you can register a signal handler for SIGTRAP, run without gdb, and when you catch the signal, enter a busy loop waiting for gdb to attach. From there you can see who tried to write your memory.
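The "wait for gdb to attach" step can be sketched as follows, assuming Linux, where /proc/self/status contains a TracerPid line that is non-zero while a tracer such as gdb is attached. The function names are invented for this example. The code sticks to open(), read(), close() and sleep() so that it is at least plausible to call from a signal handler, though that is worth verifying for your platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Extracts the TracerPid value from text in /proc/<pid>/status format.
// Returns 0 when the field is missing or reports no tracer.
long parse_tracer_pid(const char* text, std::size_t len) {
    static const char key[] = "TracerPid:";
    const std::size_t klen = sizeof key - 1;
    for (std::size_t i = 0; i + klen <= len; ++i) {
        if (i != 0 && text[i - 1] != '\n') continue;  // match at line start only
        if (std::memcmp(text + i, key, klen) != 0) continue;
        std::size_t j = i + klen;
        while (j < len && (text[j] == ' ' || text[j] == '\t')) ++j;
        long pid = 0;
        while (j < len && text[j] >= '0' && text[j] <= '9') {
            pid = pid * 10 + (text[j++] - '0');
        }
        return pid;
    }
    return 0;
}

// True if some process is currently tracing this one.
bool debugger_attached() {
    char buf[4096];
    int fd = open("/proc/self/status", O_RDONLY);
    if (fd < 0) return false;
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    return n > 0 && parse_tracer_pid(buf, static_cast<std::size_t>(n)) != 0;
}

// Polls once per second until a tracer shows up.
void wait_for_debugger() {
    while (!debugger_attached()) {
        sleep(1);
    }
}
```

Calling wait_for_debugger() from the SIGTRAP handler parks the program until gdb attaches; from there a backtrace shows who performed the write.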

Debugging multithreaded programs

I have been a C programmer for many years and my favorite "debugger" has always been the printf() function. I only resort to Visual Studio's debugger when absolutely forced, and so have never been very proficient in using it. Recently I have had to convert a program from C to C++ (although of course printf still works fine), and parts of the program are now farmed out to multiple threads (one for each core on a multicore machine) to make the program run faster. Now I will no doubt come up against awkward multi-thread-related bugs like deadlocks, and I wonder what debugging methodology I can turn to. Does Visual Studio (2008) have everything I could reasonably need to help me resolve thread-related bugs? Should I take some time out now to learn how to use some third-party debugger? Could I solve most problems using my good old printf?
Could I, for example, write code which, if kept waiting on entry to a critical section, would print something like "Thread X waiting to enter ... but blocked because it's being used by thread Y"?
Visual Studio supports thread debugging to some extent. Via the Threads window you can select threads, suspend and resume threads, etc. When you switch between threads, the Call Stack window is updated accordingly so you can inspect what each thread is doing. You may also restrict breakpoints to specific threads.
If you want an alternative, WinDbg (which is part of the free Debugging Tools for Windows package from Microsoft) offers lots of options as well, but with a slightly more esoteric user interface.
As for using printf, there's the problem of synchronizing output. If you don't do it, your output will most likely be gibberish. If you do synchronize it, you basically change the concurrency of the application, which may or may not affect the problem you're trying to solve.
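For the "Thread X waiting ... blocked by thread Y" message asked about in the question, a small wrapper around a mutex is enough. This is a minimal sketch assuming a C++11 compiler (newer than the Visual Studio 2008 mentioned in the question); the class name LoggingMutex is made up. Each message is built in full and written with one call, which reduces interleaving but, as noted above, the extra work does alter timing.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// Hypothetical mutex wrapper that reports contention before blocking.
class LoggingMutex {
public:
    explicit LoggingMutex(std::string name) : name_(std::move(name)) {}

    void lock() {
        if (!mutex_.try_lock()) {
            // The owner is read without holding the lock, so the report is a
            // best-effort hint; the owner may change before we block.
            ++contended_;
            std::ostringstream msg;
            msg << "Thread " << std::this_thread::get_id()
                << " waiting to enter " << name_
                << " but blocked because it is being used by thread "
                << owner_.load() << '\n';
            std::fputs(msg.str().c_str(), stderr);
            mutex_.lock();  // now block until the section is free
        }
        owner_.store(std::this_thread::get_id());
    }

    void unlock() {
        owner_.store(std::thread::id());  // default id means "no owner"
        mutex_.unlock();
    }

    // Number of lock() calls that did not acquire the mutex immediately.
    int contended_count() const { return contended_.load(); }

private:
    std::string name_;
    std::mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
    std::atomic<int> contended_{0};
};
```

Because it provides lock() and unlock(), the class can be used with std::lock_guard<LoggingMutex> in place of a plain std::mutex.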
If you could port your project to Linux, Valgrind (especially the 'helgrind' tool) would do exactly what you ask. http://valgrind.org/
I'm not sure if this is exactly what you are asking, but to help in debugging, you can write code that gives each thread a "name", so that debug messages printed to the debug window (or a log file or whatever) include that thread "name" along with whatever other info you prescribe. The code below is in C#, but this is available even in unmanaged C++:
Thread T = new Thread(RunSchedule);
T.Name = "Scheduler"; // <=== Thread given a name here...
T.Start();
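In standard C++ a comparable effect can be had with a thread-local string. This is only a sketch, assuming C++11; the helper names current_thread_name and tagged are made up, and the name is visible to your own logging code only, not to the debugger's thread list.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Each thread gets its own copy of the name, "unnamed" until it is set.
std::string& current_thread_name() {
    thread_local std::string name = "unnamed";
    return name;
}

// Prefixes a log message with the calling thread's name.
std::string tagged(const std::string& message) {
    return "[" + current_thread_name() + "] " + message;
}
```

A worker would assign current_thread_name() = "Scheduler" as its first statement and then pass every debug message through tagged() before printing it.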
Intel provides several tools for finding threading-related issues such as data races, deadlocks and performance penalties: Intel Thread Checker and Intel Thread Profiler.