force core dump on glibc free() error - c++

I get the following error when I run my program, but it does not happen under gdb. How can I force glibc or Ubuntu to dump core on abort? I tried "ulimit -c unlimited", but this is not a segfault, so no luck. Also, I have too many memory errors in valgrind; fixing all of them would take a lot of time.
Also, setting MALLOC_CHECK_ to 0 keeps the program from aborting, but silently ignoring the corruption is not an option for me.
*** glibc detected *** ./main: free(): invalid next size (fast): 0x0000000000ae0560 ***
Edit
Anyway, I found exactly what is causing this glibc corruption using valgrind. I'm keeping the question open to see whether forcing a core dump on abort is possible.

From glibc documentation:
If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently ignored; if set to 1, a diagnostic message is printed on stderr; if set to 2, abort() is called immediately.
Calling abort() usually produces a core dump (subject to ulimit -c setting).
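As a minimal sketch of how those pieces fit together (a hypothetical toy program, not the asker's code), something like the following should abort and leave a core file when run with MALLOC_CHECK_=2 after ulimit -c unlimited:
// corrupt.cpp - deliberately writes past the end of a heap block
// build: g++ -g corrupt.cpp -o corrupt
// run:   ulimit -c unlimited          (once, in the shell)
//        MALLOC_CHECK_=2 ./corrupt    (should abort and dump core)
#include <cstdlib>
#include <cstring>

int main() {
    char *p = static_cast<char *>(std::malloc(16));
    std::memset(p, 'A', 64);   // overflow: clobbers the allocator's bookkeeping
    std::free(p);              // glibc detects the damage here and aborts
    return 0;
}
The exact message ("invalid next size", "corrupted size vs. prev_size", ...) varies with the glibc version, but the abort should leave a core file that you can open with gdb ./corrupt core.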

Use Valgrind to diagnose and fix the problem. It will be quicker and straight to the point, since this indeed looks like a classic heap corruption.
There is likely a Valgrind package available for your distro, if you use a common one.
The only other method to create a core dump would be to attach GDB to the process before the corruption happens. But that still doesn't get you any closer to finding what causes the problem. Valgrind is the superior approach.

Related

gdb segmentation fault line number missing with c++11 option [duplicate]

Is there any gcc option I can set that will give me the line number of the segmentation fault?
I know I can:
Debug line by line
Put printfs in the code to narrow down.
Edits:
bt / where in gdb give "No stack."
Helpful suggestion
I don't know of a gcc option, but you should be able to run the application with gdb and then when it crashes, type where to take a look at the stack when it exited, which should get you close.
$ gdb blah
(gdb) run
(gdb) where
Edit for completeness:
You should also make sure to build the application with debug flags on using the -g gcc option to include line numbers in the executable.
Another option is to use the bt (backtrace) command.
Here's a complete shell/gdb session
$ gcc -ggdb myproj.c
$ gdb a.out
gdb> run --some-option=foo --other-option=bar
(gdb will say your program hit a segfault)
gdb> bt
(gdb prints a stack trace)
gdb> q
[are you sure, your program is still running]? y
$ emacs myproj.c # heh, I know what the error is now...
Happy hacking :-)
You can have your program print a stack trace when it gets a SEGV signal, similar to how Java and other friendlier languages handle null pointer exceptions. See my answer here for more details:
how to generate a stacktace when my C++ app crashes ( using gcc compiler )
The nice thing about this is you can just leave it in your code; you don't need to run things through gdb to get the nice debug output.
If you compile with -g and follow the instructions there, you can use a command-line tool like addr2line to get file/line information from the output.
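For reference, here is a minimal sketch of that approach (hypothetical file name; the backtrace()/backtrace_symbols_fd() calls are glibc-specific, from <execinfo.h>). The handler writes the raw frame addresses to stderr, and addr2line -e ./trace can map them back to file and line when the binary is built with -g:
// trace.cpp - hypothetical SIGSEGV handler that dumps a backtrace
// build: g++ -g -rdynamic trace.cpp -o trace
#include <csignal>
#include <execinfo.h>
#include <unistd.h>

static void segv_handler(int) {
    void *frames[64];
    int n = backtrace(frames, 64);                   // collect return addresses
    backtrace_symbols_fd(frames, n, STDERR_FILENO);  // print without calling malloc
    _exit(1);                                        // don't return into the faulting code
}

int main() {
    std::signal(SIGSEGV, segv_handler);   // install the handler
    volatile int *p = nullptr;
    *p = 42;                              // deliberate crash to demonstrate
    return 0;
}
Note that calling _exit() from the handler skips the core dump; if you want both the printed trace and a core file, reset the handler to SIG_DFL and re-raise the signal instead.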
Run it under valgrind.
You also need to build with the debug flag -g.
You can also open the core dump with gdb (you need -g though).
If all the preceding suggestions to compile with debugging (-g) and run under a debugger (gdb, run, bt) are not working for you, then:
Elementary: Maybe you're not running under the debugger, you're just trying to analyze the postmortem core dump. (If you start a debug session, but don't run the program, or if it exits, then when you ask for a backtrace, gdb will say "No stack" -- because there's no running program at all. Don't forget to type "run".) If it segfaulted, don't forget to add the third argument (core) when you run gdb, otherwise you start in the same state, not attached to any particular process or memory image.
Difficult: If your program is/was really running but your gdb is saying "No stack", perhaps your stack pointer is badly smashed. In that case, you may have a buffer overflow problem somewhere severe enough to smash your runtime state entirely. GCC 4.1 supports the ProPolice "Stack Smashing Protector", which is enabled with -fstack-protector-all. It can be added to GCC 3.x with a patch.
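As a rough illustration of the kind of bug the stack protector catches (hypothetical code, not from the question): a local-buffer overflow like the one below, built with -fstack-protector-all, typically aborts with a "stack smashing detected" message instead of silently trashing the return address.
// smash.cpp - hypothetical stack buffer overflow
// build: g++ -g -fstack-protector-all smash.cpp -o smash
#include <cstring>

void copy_name(const char *src) {
    char buf[8];
    std::strcpy(buf, src);   // no bounds check: long input overflows buf
}

int main() {
    copy_name("this string is much longer than eight bytes");
    return 0;   // with the protector, the runtime aborts when copy_name returns
                // ("*** stack smashing detected ***") before we get here
}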
There is no method for GCC to provide this information, you'll have to rely on an external program like GDB.
GDB can give you the line where a crash occurred with the "bt" (short for "backtrace") command after the program has seg faulted. This will give you not only the line of the crash, but the whole stack of the program (so you can see what called the function where the crash happened).
The "No stack" problem seems to happen when the program exits successfully.
For the record, I had this problem because I had forgotten a return in my code, which made my program exit with a failure code.
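As a side note on the forgotten return (a hypothetical reduction, not the poster's code): flowing off the end of a non-void function is undefined behavior in C++, and g++ -Wall warns about it (-Wreturn-type), which is an easy way to catch this class of bug before it turns into a mysterious exit status.
// noreturn.cpp - hypothetical example; g++ -Wall flags the missing return
int parse_flag(bool verbose) {
    if (verbose)
        return 1;
    // missing "return 0;" here: flowing off the end of a non-void
    // function is undefined behavior (warning: -Wreturn-type)
}

int main() {
    return parse_flag(false);   // exit status is unpredictable
}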

How to find "index out of bound" if I don't get Segmentation Fault

I have a program which causes a segfault on a machine that I cannot access. However, when I compile and run it with the same compiler and the same input on my own machine, I don't get anything. The problem is probably an "array index out of bounds", which might lead to a segfault in some circumstances; however, the compiler does not show any warning. The program is huge and complicated, so I cannot find the problem just by reading the code.
Any suggestions on how to get the segmentation fault on my machine too? That way I can debug the code and find the problem.
You could use valgrind if you are on a Linux machine.
To use valgrind you just type on the console:
valgrind --leak-check=full --num-callers=20 --tool=memcheck ./program
It should report invalid reads/writes of size X for the offending variable, and (if you compiled with debugging information) it will tell you the line where the problem might be.
By the way, you can install valgrind in Ubuntu/Debian Linux (for example) just as easy as:
sudo apt-get install valgrind
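To make that concrete, here is a hypothetical example of the kind of bug memcheck finds even when the program never segfaults locally: a heap write one element past the end of an array, which valgrind reports as an "Invalid write of size 4" with a line number when the program is built with -g.
// oob.cpp - hypothetical off-by-one heap write; often runs "fine" natively
// build: g++ -g oob.cpp -o oob     check: valgrind ./oob
int main() {
    int *data = new int[10];
    for (int i = 0; i <= 10; ++i)   // off-by-one: i == 10 writes past the end
        data[i] = i;
    delete[] data;
    return 0;
}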
You can try a solution such as Valgrind, as other posters mentioned, or your compiler may also have some specific ability to insert guard words around allocated regions to detect this kind of access.

Encountered exit code 139 when running, but gdb makes it through

My question sounds specific, but I suspect it may still be a general C++ debugging issue.
I am using omnet++, which simulates wireless networks; omnet++ itself is a C++ program.
I encountered a strange phenomenon when running my program (a modified inet framework with omnet++ 4.2.2 on Ubuntu 12.04): the program exits with exit code 139 (people say this means memory fragmentation) when it reaches a certain part of the code. When I try to debug, gdb doesn't report anything wrong with the 'problematic' code where the simulation previously exited; in fact, the debug run goes through this part of the code and outputs the expected results.
gdb version info: GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Could anybody tell me why the run fails but debug doesn't?
Many thanks!
exit code 139 (people say this means memory fragmentation)
No, it means that your program died with signal 11 (SIGSEGV on Linux and most other UNIXes), also known as segmentation fault.
Could anybody tell me why the run fails but debug doesn't?
Your program exhibits undefined behavior, and can do anything (that includes appearing to work correctly sometimes).
Your first step should be running this program under Valgrind, and fixing all errors it reports.
If, after doing the above, the program still crashes, then you should let it dump core (ulimit -c unlimited; ./a.out) and then analyze that core dump with GDB: gdb ./a.out core; then use the where command.
This error can also be caused by a null pointer dereference: if you use a pointer that was never initialized, it can crash this way. To guard against calling through a null pointer you can try something like:
Class *pointer = new Class();   // note: plain new throws std::bad_alloc on failure
if (pointer != nullptr) {       // rather than returning nullptr
    pointer->myFunction();
}

gdb: malloc(): memory corruption (fast):

When executing gdb> core-file , gdb gives the following errors and then crashes, creating a core file of its own:
Reading symbols from ./libtcmalloc_minimal.so.0...
*** glibc detected *** gdb: malloc(): memory corruption (fast): 0x0000000000ec04a0 ***
I haven't found any reference to gdb crashing with this error. Has anyone run into this? If so, what can be done about it?
The version of GDB is: GNU gdb (GDB) SUSE (6.8.50.20090302-1.5.18)
Thanks
what can be done about it
Any crash in GDB itself is a bug.
However, nobody would care about this bug, unless it can be reproduced with current GDB (yours is 5 years old).
So, download the current release of GDB (7.5.1 at the time of writing) and build it.
If it works, use it to debug your problem.
If it doesn't work, file a bug in GDB bugzilla.
If you get this error as a result of calling
ptr = (ptr_t*)malloc(sizeof(ptr_t));
in your program, it may be due to a missing stdlib.h header. In C, calling malloc without its prototype makes the compiler assume it returns int, so on a 64-bit platform the pointer gets truncated and the heap ends up corrupted.

Aborted core dumped C++

I have a large C++ function which uses the OpenCV library, running on Windows with the cygwin g++ compiler. At the end it gives "Aborted (core dumped)", but the function runs to completion before that. I have also tried putting a print statement at the end of the function; that also gets printed. So I think there is no logical bug in the code that would generate the fault.
Please explain.
I am also using assert statements, but the abort is not due to an assert: it does not say that an assertion failed. It comes at the very end, without any message.
Also, the file is part of a large project, so I cannot post the code.
gdb results:
Program received signal SIGABRT, Aborted.
0x7c90e514 in ntdll!LdrAccessResource () from /c/WINDOWS/system32/ntdll.dll
It looks like a memory fault (write to freed memory, double-free, stack overflow,...). When the code can be compiled and run under Linux you can use valgrind to see if there are memory issues. Also you can try to disable parts of the application until the problem disappears, to get a clue where the error happens. But this method can also give false positives, since memory related bugs can cause modules to fail which are not the cause of the error. Also you can run the program in gdb. But also here the position the debugger points to may not be the position where the error happened.
You don't give us much to go on. However, this looks like you are running into some problem when freeing resources. Maybe a heap corruption. Have you tried running it under gdb and then looking where it crashes? Also, check if all your new/delete calls match.
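One hypothetical illustration of the new/delete mismatch mentioned above (not the asker's code): freeing memory obtained with new[] via plain delete is undefined behavior, and the runtime often only notices during cleanup, which matches an abort that appears after all of your output has already been printed.
// mismatch.cpp - hypothetical new[]/delete mismatch
#include <string>

int main() {
    std::string *names = new std::string[4];   // allocated with new[]
    names[0] = "frame.png";
    delete names;   // WRONG: must be delete[] names;
                    // undefined behavior - often aborts inside the
                    // allocator ("free(): invalid pointer") at teardown
    return 0;
}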
Load the core dump together with the binary into gdb to get an idea of the location where the problem lies. The command line is:
gdb <path to the binary> <path to the core file>
For more details on gdb see GDB: The GNU Project Debugger.
Run it through AppVerifier and cdb.
E.g.
cdb -xd sov -xd av -xd ch <program> <args>