gdb: malloc(): memory corruption (fast)

When executing gdb> core-file , gdb gives the following errors and then crashes, creating a core file:
Reading symbols from ./libtcmalloc_minimal.so.0...
*** glibc detected *** gdb: malloc(): memory corruption (fast): 0x0000000000ec04a0 ***
I haven't found any reference to gdb crashing with this error. Has anyone run into this? If so, what can be done about it?
The version of GDB is: GNU gdb (GDB) SUSE (6.8.50.20090302-1.5.18)
Thanks

what can be done about it
Any crash in GDB itself is a bug.
However, nobody would care about this bug, unless it can be reproduced with current GDB (yours is 5 years old).
So download the current release of GDB (7.5.1 at the time of writing) and build it.
If it works, use it to debug your problem.
If it doesn't work, file a bug in GDB bugzilla.

If you get this error as a result of calling
ptr = (ptr_t*)malloc(sizeof(ptr_t));
in your program, it may be due to a missing stdlib.h header: without it, a C compiler implicitly declares malloc as returning int, which can truncate the returned pointer on 64-bit systems and corrupt the heap.
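For illustration, here is a minimal sketch of the fixed call, assuming ptr_t is the asker's own struct (the type and field below are invented for the example):
#include <stdlib.h>   // declares malloc/free with the correct prototypes

typedef struct { int value; } ptr_t;   // hypothetical stand-in for the asker's type

int main(void) {
    // Without <stdlib.h>, a C compiler implicitly declares malloc as returning
    // int; the returned address can then be truncated, corrupting the heap.
    ptr_t *ptr = (ptr_t*)malloc(sizeof(ptr_t));
    if (ptr != NULL) {
        ptr->value = 42;
        free(ptr);
    }
    return 0;
}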

Related

gcc segmentation fault - how can I find a line where it happened?

I'm using Ubuntu and gcc. My application crashes and I only get a Segmentation fault message in the console (previously Segmentation fault (core dumped) was reported, but now it has changed to just Segmentation fault).
There are no hints as to where the problem is, so I do not understand how I should fix it. I need some hints to find what caused this - ideally a complete stack trace, or at least the object type/method or something like that.
What would be the correct way of troubleshooting this type of problem? (Maybe compile with some extra flags, run some tools, collect a core dump and analyze it somehow?)
You might well need to enable core dumps with
ulimit -c unlimited
Once you have a core dump, you can look at the program state with GDB:
gdb my_prog core
You should then have the same view that you would have had if you'd run the program under GDB until it crashed - you could just do that, rather than collecting the core dump. In particular, it will show you which line caused the segfault, and the state of the call stack at that point.
To get the best debugging view, you should tell the compiler to include debugging symbols (-g) and disable optimisation (-O0).
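For example (the file and function names here are made up), a small program like the following will segfault, and when built with -g -O0 GDB's backtrace points straight at the faulting line:
// crash_demo.cpp -- hypothetical example of a crash GDB can pinpoint
// Build:  g++ -g -O0 -o crash_demo crash_demo.cpp
// Run:    ulimit -c unlimited; ./crash_demo      (then: gdb ./crash_demo core)
// or simply run it under GDB:  gdb ./crash_demo   and type "run"
#include <cstdio>

static void broken(int *p) {
    *p = 7;                 // writes through a null pointer -> SIGSEGV here
}

int main() {
    int *p = nullptr;       // never points at valid storage
    broken(p);              // GDB's "bt" shows main -> broken and this line
    std::printf("never reached\n");
    return 0;
}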
You can use GDB to help with debugging.
Run gdb ./your_app_name in a terminal (if you have GDB installed) and you will see some information like the following:
.....
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./gsvod_client...done.
(gdb)
then input "r" to start your app, if it crashed again, you can type 'bt' to see the line where the problem occured.

Debug error before main() using GDB

Is there any way to debug a link error, or any kind of error that may occur before the execution of the main() function, using GDB?
Is there any way to debug a link error
Presumably you are asking about a runtime link error (e.g. "error: libfoo.so: no such file or directory"), and not about the (static) link step of your build process.
The trick is to set a breakpoint on the exit (or, on Linux, exit_group) system call, e.g. with catch syscall exit. You will then be stopped inside ld.so at the point where it gives up running your binary.
or any kind of error that may occur before the execution of the main() function using GDB?
Any other kind of error, e.g. SIGSEGV can be debugged "normally" -- for signal you don't need to do anything at all -- GDB will just stop. For other errors, just set a breakpoint as usual.
One way to debug initialization code (even if you don't have symbols) goes like this:
gdb somebinary
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
etc.
info file
Symbols from "somebinary".
Local exec file:
`/somebinary', file type elf64-x86-64.
Entry point: 0x4045a4, etc.
break *0x4045a4
run
...Breakpoint 1, 0x00000000004045a4 in ?? ()
From here on you can proceed as usual.
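To make the scenario concrete, a crash before main() usually comes from a static initializer; below is a hypothetical example that GDB will stop on before main() is ever reached:
// pre_main_crash.cpp -- hypothetical example of a fault before main()
// The constructor of a global object runs during program startup, so a bug
// there crashes before main(); under GDB the program stops at the faulting
// line even though no breakpoint in main() would ever be hit.
#include <cstring>

struct Config {
    char buffer[16];
    Config() {
        const char *path = nullptr;   // imagine a lookup that failed
        std::strcpy(buffer, path);    // SIGSEGV here, before main() starts
    }
};

static Config g_config;               // constructed before main()

int main() {
    return 0;                         // never reached
}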

Encountering exit code 139 when running, but gdb makes it through

My question sounds specific, but I suspect it is still a general C++ debugging issue.
I am using OMNeT++, which simulates wireless networks. OMNeT++ itself is a C++ program.
I ran into a strange phenomenon when running my program (a modified INET framework with OMNeT++ 4.2.2 on Ubuntu 12.04): the program exits with exit code 139 (people say this means memory fragmentation) when it reaches a certain part of the code. Yet when I try to debug it, gdb doesn't report anything wrong with the 'problematic' code where the simulation previously exited; the debug run actually goes through that part of the code and produces the expected output.
gdb version info: GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Could anybody tell me why the run fails but debug doesn't?
Many thanks!
exit code 139 (people say this means memory fragmentation)
No, it means that your program died with signal 11 (SIGSEGV on Linux and most other UNIXes), also known as segmentation fault.
Could anybody tell me why the run fails but debug doesn't?
Your program exhibits undefined behavior, and can do anything (that includes appearing to work correctly sometimes).
Your first step should be running this program under Valgrind, and fixing all errors it reports.
If, after doing the above, the program still crashes, then you should let it dump core (ulimit -c unlimited; ./a.out) and then analyze that core dump with GDB: gdb ./a.out core; then use the where command.
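As a hedged sketch of why the two runs can differ, the invented example below reads past the end of a heap allocation; whether that appears harmless or dies with signal 11 depends entirely on the heap layout, but Valgrind flags the invalid access either way:
// ub_demo.cpp -- hypothetical example of undefined behaviour that may
// "work" in one run (e.g. under GDB) and crash with signal 11 in another
#include <iostream>

int main() {
    int *values = new int[4]{1, 2, 3, 4};
    long sum = 0;
    // Off-by-one loop: values[4] is past the end of the allocation. The
    // outcome depends on what happens to be next on the heap, which is why
    // a normal run and a debug run can behave differently.
    for (int i = 0; i <= 4; ++i)
        sum += values[i];
    std::cout << "sum = " << sum << '\n';
    delete[] values;
    return 0;
}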
This error can also be caused by a null pointer dereference.
If you are using a pointer that has not been initialized, it can cause this error.
To check whether a pointer is null before using it, you can try something like:
Class *pointer = new Class();
if (pointer != nullptr) {
    pointer->myFunction();
}

Force core dump on glibc free() error

I get the following error when I run my program, and it doesn't happen under gdb. How can I force glibc or Ubuntu to dump core on abort? I tried ulimit -c unlimited, but this is not a segfault, so no luck. Also, I have too many memory errors in Valgrind; fixing all of them will take a lot of time.
Also, setting MALLOC_CHECK_ to 0 stops the program from exiting, but that's not an option for me.
*** glibc detected *** ./main: free(): invalid next size (fast): 0x0000000000ae0560 ***
Edit
Anyway, I found exactly what is causing this glibc corruption using Valgrind. Just keeping this open to see if it's possible.
From glibc documentation:
If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently ignored; if set to 1, a diagnostic is printed on stderr; if set to 2, abort is called immediately.
Calling abort() usually produces a core dump (subject to ulimit -c setting).
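As a sketch (the file name and sizes are arbitrary), the kind of corruption glibc reports as "invalid next size" can be reproduced and forced to dump core like this:
// heap_corruption_demo.cpp -- hypothetical example of heap corruption that
// glibc's malloc checking detects when free() is called
// Build:  g++ -g -O0 -o heap_demo heap_corruption_demo.cpp
// Run:    ulimit -c unlimited
//         MALLOC_CHECK_=2 ./heap_demo      # abort() is called -> core dump
#include <cstdlib>
#include <cstring>

int main() {
    char *buf = static_cast<char*>(std::malloc(8));
    // Writing past the end of the allocation tramples the allocator's
    // bookkeeping for the neighbouring chunk; glibc notices in free().
    std::memset(buf, 'A', 32);
    std::free(buf);          // corruption detected here -> abort -> core
    return 0;
}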
Use Valgrind to diagnose and fix the problem. It will be quicker and straight to the point, since this indeed looks like a classic heap corruption.
There is likely a (Valgrind) package available for your distro, if you use a common one.
The only other method to create a core dump would be to attach GDB to the process before it happens. But that still doesn't get you closer to the solution of what causes the problem. Valgrind is the superior approach.

Aborted (core dumped) C++

I have a large C++ function which uses the OpenCV library and runs on Windows with the Cygwin g++ compiler. At the end it gives Aborted (core dumped), but the function runs to completion before that. I have also tried putting a print statement at the end of the function; that also gets printed. So I think there is no logical bug in the code that would generate the fault.
Please explain.
I am also using assert statements, but the abort is not due to an assert statement: it does not say that an assertion failed. It comes at the end only, without any message.
Also, the file is part of a large project, so I cannot post the code.
gdb results:
Program received signal SIGABRT, Aborted.
0x7c90e514 in ntdll!LdrAccessResource () from /c/WINDOWS/system32/ntdll.dll
It looks like a memory fault (write to freed memory, double free, stack overflow, ...). If the code can be compiled and run under Linux, you can use Valgrind to see if there are memory issues. You can also try disabling parts of the application until the problem disappears, to get a clue where the error happens; but this method can give false positives, since memory-related bugs can cause modules to fail which are not the cause of the error. You can also run the program in gdb, but here too the position the debugger points to may not be the position where the error happened.
You don't give us much to go on. However, this looks like you are running into some problem when freeing resources. Maybe a heap corruption. Have you tried running it under gdb and then looking where it crashes? Also, check if all your new/delete calls match.
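As an invented example of that last point, a double delete is exactly the kind of bug that lets all of the program's real work finish and then aborts during cleanup:
// cleanup_abort_demo.cpp -- hypothetical example of a program whose logic
// "works", but which aborts at the very end because of a heap misuse
#include <iostream>

int main() {
    int *data = new int[100];
    for (int i = 0; i < 100; ++i)
        data[i] = i;
    std::cout << "all work done, last value = " << data[99] << '\n';

    delete[] data;
    delete[] data;    // BUG: double delete; the allocator detects it and aborts (SIGABRT)
    // The abort surfaces only here, after every print statement has already
    // run, which matches "the function runs completely before that".
    return 0;
}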
Load the core dump together with the binary into gdb to get an idea of the location where the problem lies. The command line is:
gdb <path to the binary> <path to the core file>
For more details on gdb see GDB: The GNU Project Debugger.
Run it through AppVerifier and cdb.
E.g.
cdb -xd sov -xd av -xd ch <program> <args>