GDB doesn't return after segmentation fault - c++

I have some code, that I'm currently porting from OS X to Linux (console tool).
Somewhere in this code, I get a segmentation fault. The problem is, that if I run the program without GDB, I clearly see the segmentation fault, and the program is killed. But when I'm running GDB it just halts, and GDB never returns to the prompt. So I'm not really able to examine what's going on.
C++ is the code. Compiled with the -g option in g++.
Btw. Pretty new to GDB, so excuse me if this is obvious.
Any ideas?
Thanks in advance.
Trenskow

gdb will suspend your program when the seg fault signal is received
type where to see the stack trace and start inspecting what's going on from there.
Also consider enabling core dumps, that way you can load the core dump in GDB and investigate what is going on
you can then load the core dump like this
> gdb your_program the_core_dump

The behaviour you describe is not typical - I suspect the stack may have been trashed.
Try sending various signals directly via the 'kill' command.
Might be worth you running a test program in gdb with an abort() in it so that you can learn what the expected behaviour is for gdb.

I've seen this before when my stack was too large. Try moving stack variables onto the heap (make them globals), recompile, and see if you still get the error.

Related

SEGFAULT after inspecting variable with GDB on QtCreator

I have the following situation: My program runs fine, does all I want it to do, tests pass and Valgrind says it's all right. The only issue is the fact that if I pause execution at some point and try to inspect the state of some objects in the debug view of QtCreator (using GDB) some variables become <not-accessible> and, on resuming execution it reaches a segmentation fault.
To be a little more specific, the program is single threaded and this happens while following the pointers in a tree structure. The structure seems to be fine by the output of the tests.
Does anyone know about a possible cause? Maybe I have messed up with the stack in a way that luckily doesn't affect the tests or it may be just an IDE or debugger issue that should not care about? Thanks in advance for any answers.
Does anyone know about a possible cause?
Do you have multiple threads in your program?
When a program behaves differently depending on whether some GDB breakpoints are present or not, in 99.99% of the cases the program has a data race, and mere fact of stopping it at "inopportune" time exposes that fact.
On Linux, you could use Thread Sanitizer to check for data races.

Recover from crash with a core dump

A C++ program crashed on FreeBSD 6.2 and OS was kind enough to create a core dump. Is it possible to amputate some stack frames, reset the instruction pointer and restart the process in gdb, and how?
Is it possible to amputate some stack frames, reset the instruction pointer and restart the process in gdb?
I assume you mean: change the process state, and set it to start executing again (as if it never crashed in the first place).
No. For one thing, how do you propose GDB (if it magically had this capability) would handle your file descriptors (which the kernel automatically closed when your process died)?
Yes, gdb can debug core dumps just as well as running programs. Assuming that a.out is the name of your program's executable and that a.core is the name of your core file, invoke gdb like so:
gdb a.out a.core
And then you can debug like normal, except you cannot continue execution in any way (even if you could, the program would just crash again). You can examine the stack trace, registers, memory, etc.
Possible duplicate of this: Best practices for recovering from a segmentation fault
Summary: It is possible but not recommended. The way to do it is to usse setjmp() and longjmp() from a signal handler. (Please look at complete source code example in duplicate post.

Corrupt stack problem in C/C++ program

I am running a C/C++ program in linux servers to serve videos. The program's(say named Plugin) core functionality is to convert videos and we fork a separate Plugin process for each video request. But I am having a weird problem for which sometimes server load average gets unexpectedly high. What I see from top command at this stage is that there are some processes which are running for long time and taking some huge CPU's.
When I debug this running program with gdb and backtrace stack,what I found is the corrupt stack: "Previous frame inner to this frame (corrupt stack?)". I have searched the net and found that this occurs if the program gets segmentation fault.
But what I know if the program gets segmentation fault, the program should crash and exit at that point. But surprisingly the program still running after segmentation fault.
What can be the causes of this? I know there must be some big problems in the program but I just can't understand from where to start fixing the problem...It would be great if any of you can show me some lights...
Thanks in advance
Attaching the debugger changes the behavior of the process so you won't get reliable investigation results most probably. Corrupted stack message from the debugger can mean that the particular debugger does not understand text info from the binary.
I would recommend running pstack several time subsequently on the problematic (this is known as "Monte Carlo performance profiling") and also attach strace or truss to the problematic and check what system calls is the process doing when consuming CPU.
Run your program under Valgrind and fix any invalid memory writes that it finds.
Certain optimisations, such as frame pointer omission, can make it harder for the debugger to understand the stack.
If you have the code, compile the program in debug and run Valgrind on it.
If you don't have the code, contact the author/provider of the program.
The corrupt stack message simply means the code is doing something weird with the memory. It does not mean the program has a segmentation fault. Also, the program can still run if it choose to handle the SIGSEGV signal.
If by forking you mean that you have some process which spawn and run other smaller processes, just monitor for such spikes and restart the process. This assumes that you have no access to the fix the program.
There could be some interesting manipulation of the stack taking place through assembly code manipulation, such as true tail-recursion optimization, self-modifying code, non-returning functions, etc. that may cause the debugger to be incapable of properly back-tracing the stack and causing it to trigger a corrupted stack error, but that doesn't necessarily mean that memory is corrupted ... but definitely something non-traditional is happening under the hood.

GDB (DDD), debugging questions

Some things in GDB (actually using DDD gui) confuse me, when debugging my own C++ codes:
1) Why is there no backtrace available after a HEAP ERROR crash?
2) Why does gdb sometimes stop AFTER the breakpoint rather than AT the breakpoint?
3) Sometimes stepping through commented lines causes execution of some instructions (gdb busy)??
Any explanations greatly appreciated,
Petr
1) I'm not sure for heap error, but for example if you ran out of memory it might not be able to process the backtrace properly. Also if a heap corruption caused a pointer to blow up part of your application's stack that would cause the backtrace to be unavailable.
2) If you have optimization enabled, it's quite possible for this to happen. The compiler can reorder statements, and the underlying assembly upon which the breakpoint was placed may correspond to the later line of code. Don't use optimization when trying to debug such things.
3) This could be caused by either the source code not having been rebuilt before execution (so the binary is different from the actual source, or possibly even from optimization settings again.
Few possible explainations:
1) Why is there no backtrace available after a HEAP ERROR crash?
If the program is generating a core dump file you can run GDB as follows: "gdb program -c corefile" and get a backtrace.
2) Why does gdb sometimes stop AFTER the breakpoint rather than AT the breakpoint?
Breakpoints are generally placed on statements, so watch out for that. The problem here could also be caused by a mismatch between the binary and the code you're using.
3) Sometimes stepping through commented lines causes execution of some instructions (gdb busy)??
Again, see #2.
2) Why does gdb sometimes stop AFTER the breakpoint rather than AT the breakpoint?
Do you have optimization enabled during your compilation? If so the compiler may be doing non-trivial rearrangements of your code... This could conceivable address your number 3 as well.
With g++ use -O0 or no -O at all to turn optimization off.
I'm unclear of what your number 1 is asking.
Regarding the breakpoint and comment/instruction behavior, are you compiling with optimization enabled (e.g, -O3, etc.)? GDB can handle that but the behavior you are seeing sometimes occurs when debugging optimized code, especially with code compiled with aggressive optimizations.
Heap checks are probably done after main returns, try set backtrace past-main in GDB. If it's crashing - the process is gone - you need to load the core file into debugger (gdb prog core).
Optimized code, see #dmckee's answer
Same as 2.

Segmentation fault only when running on multi-core

I am using a c++ library that is meant to be multi-threaded and the number of working threads can be set using a variable. The library uses pthreads. The problem appears when I run the application ,that is provided as a test of library, on a quad-core machine using 3 threads or more. The application exits with a segmentation fault runtime error. When I try to insert some tracing "cout"s in some parts of library, the problem is solved and application finishes normally.
When running on single-core machine, no matter what number of threads are used, the application finishes normally.
How can I figure out where the problem seam from?
Is it a kind of synchronization error? how can I find it? is there any tool I can use too check the code ?
Sounds like you're using Linux (you mention pthreads). Have you considered running valgrind?
Valgrind has tools for checking for data race conditions (helgrind) and memory problems (memcheck). Valgrind may be to find such an error in debug mode without needing to produce the crash that release mode produces.
Some general debugging recommendations.
Make sure your build has symbols (compile with -g). This option is orthogonal to other build options (i.e. the decision to build with symbols is independent of the optimization level).
Once you have symbols, take a close look at the call stack of where the seg fault occurs. To do this, first make sure your environment is configured to generate core files (ulimit -c unlimited) and then after the crash, load the program/core in the debugger (gdb /path/to/prog /path/to/core). Once you know what part of your code is causing the crash, that should give you a better idea of what is going wrong.
You are running into a race condition.
Where multiple threads are interacting on the same resource.
There are a whole host of possible culprits, but without the source anything we say is a guess.
You want to create a core file; then debug the application with the core file. This will set up the debugger to the state of the application at the point it crashed. This will allow you to examin the variables/registers etc.
How to do this will very depending on your system.
A quick Google revealed this:
http://www.codeguru.com/forum/archive/index.php/t-299035.html
Hope this helps.