I'm using FLTK in C++, and my program was crashing intermittently when I changed the value of a widget. I ran it under gdb to reproduce the error and got two similar, but not identical, backtraces. Oddly, neither backtrace lists any function from my own code; the crashes are in code I wouldn't expect to be at fault. What might be wrong in my code to produce these results?
The backtraces/errors are:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000018
0x00007fff8f018d2b in tiny_free_list_remove_ptr ()
(gdb) backtrace
#0 0x00007fff8f018d2b in tiny_free_list_remove_ptr ()
#1 0x00007fff8f01579d in szone_free_definite_size ()
#2 0x00007fff8f00f8c8 in free ()
#3 0x00007fff8ccdcfc0 in object_dispose ()
#4 0x00007fff919fff2b in -[__NSArrayI dealloc] ()
#5 0x00007fff919c228a in CFRelease ()
#6 0x00007fff8f339591 in -[NSFocusState flush] ()
#7 0x00007fff8f337f43 in -[NSView _focusFromView:withContext:] ()
#8 0x00007fff8f337719 in -[NSView lockFocusIfCanDraw] ()
#9 0x00007fff8f33744e in -[NSView lockFocus] ()
#10 0x0000000100033548 in Fl_Window::make_current ()
#11 0x0000000100042e8a in Fl_Double_Window::flush ()
#12 0x0000000100034134 in Fl_X::flush ()
#13 0x000000010003acc8 in Fl::flush ()
#14 0x0000000100036e8f in fl_mac_flush_and_wait ()
#15 0x000000010003ae39 in Fl::run ()
#16 0x0000000100100150 in main ()
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 13 at address: 0x0000000000000000
0x00007fff8ccda718 in objc_msgSend_vtable13 ()
(gdb) backtrace
#0 0x00007fff8ccda718 in objc_msgSend_vtable13 ()
#1 0x00007fff91a132fa in __CFRunLoopDoObservers ()
#2 0x00007fff919ee7b8 in __CFRunLoopRun ()
#3 0x00007fff919ee0e2 in CFRunLoopRunSpecific ()
#4 0x00007fff8e873eb4 in RunCurrentEventLoopInMode ()
#5 0x00007fff8e873b94 in ReceiveNextEventCommon ()
#6 0x00007fff8e873ae3 in BlockUntilNextEventMatchingListInMode ()
#7 0x00007fff8f2fc533 in _DPSNextEvent ()
#8 0x00007fff8f2fbdf2 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#9 0x0000000100036e1a in fl_wait ()
#10 0x0000000100036eb6 in fl_mac_flush_and_wait ()
#11 0x000000010003ae39 in Fl::run ()
#12 0x0000000100100150 in main ()
Any ideas? Thanks in advance.
Any crash inside the free implementation (your first stack trace) is usually a sign of a double free or another kind of heap corruption.
The system malloc on macOS has debugging features that you can turn on (they are described in man malloc), and they should let you catch the problem close to where it happens.
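As an illustration of the bug class, here is a minimal sketch of how a shallow copy leads to the same block being released twice. TrackingHeap and ShallowBuffer are hypothetical names invented for this example; they are not part of FLTK or of the macOS allocator. The tracking heap only records which blocks are live, so the second release is reported instead of corrupting the real heap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

// Hypothetical tracking allocator, for illustration only.
// It remembers which blocks are live so that a second release of the
// same block is detected rather than passed on to the real free().
class TrackingHeap {
public:
    int* allocate(std::size_t count) {
        assert(count > 0);
        storage_.push_back(std::unique_ptr<int[]>(new int[count]()));
        int* block = storage_.back().get();
        live_.insert(block);
        return block;
    }

    // Returns true if the block was live and is now marked released.
    // Returns false for a double release or an unknown pointer.
    bool release(int* block) {
        return live_.erase(block) == 1;
    }

    std::size_t live_blocks() const { return live_.size(); }

private:
    // The memory itself stays valid until the heap object is destroyed.
    std::vector<std::unique_ptr<int[]>> storage_;
    std::set<int*> live_;
};

// Hypothetical type with a raw pointer and no copy control (the bug).
struct ShallowBuffer {
    int* data;
};

// Copies a ShallowBuffer and releases both copies, as two destructors would.
// Returns how many releases the tracking heap rejected.
int count_rejected_releases() {
    TrackingHeap heap;
    ShallowBuffer first{heap.allocate(4)};
    ShallowBuffer second = first;  // shallow copy: both share one block

    int rejected = 0;
    if (!heap.release(first.data)) ++rejected;
    if (!heap.release(second.data)) ++rejected;  // same block again
    return rejected;
}
```

With the real allocator the second release is undefined behaviour, and the crash often shows up later in an unrelated free, which would be consistent with a backtrace that contains none of your own functions.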
I have a core dump from an application that crashed
because of a wrong index passed to std::vector::at.
I created the stripped ELF file via strip myelf -o myelf.strip.
Then I ran myelf.strip and it crashed.
When I ran gdb myelf.strip coredump I got:
(gdb) bt
#0 0x00007b864b918fff in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007b864b91a42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007b864c2310ad in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007b864c22f066 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007b864c22e089 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007b864c22e9dd in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007b864bc94f33 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7 0x00007b864bc9529b in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#8 0x00007b864c22f2bc in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007b864c257b85 in std::__throw_out_of_range_fmt(char const*, ...) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00006c48e78b6cde in ?? ()
#11 0x00006c48e78beb6a in ?? ()
...
All looks as expected. Then I wanted to decode frames #10, #11, ...
so I ran gdb myelf coredump, in other words the same ELF file
but with debug info, and got:
(gdb) bt
#0 0x00007b864b918fff in ?? ()
#1 0x7a69733e2d736968 in ?? ()
#2 0xfffffffe7fffffff in ?? ()
#3 0xffffffffffffffff in ?? ()
#4 0xffffffffffffffff in ?? ()
#5 0xffffffffffffffff in ?? ()
#6 0xffffffffffffffff in ?? ()
#7 0xffffffffffffffff in ?? ()
#8 0xffffffffffffffff in ?? ()
#9 0xffffffffffffffff in ?? ()
#10 0xffffffffffffffff in ?? ()
#11 0xffffffffffffffff in ?? ()
I validated myelf.strip via strip myelf -o myelf.strip_2 && diff myelf.strip myelf.strip_2, so myelf.strip really is the file that corresponds to myelf.
So why does gdb give different results for the stripped and unstripped files?
Does strip somehow relocate symbols?
This is an old Debian-based distro with these packages: binutils 2.28-5, gdb 7.12-6
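For context on frames #0 to #9 of the first backtrace: std::vector::at throws std::out_of_range for an invalid index, and an exception that ends in std::terminate leads to abort. The sketch below shows the throwing behaviour and one way to contain it; value_or and at_example are hypothetical helpers written for this example, not code from the application in question.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns values[index], or fallback when the index is out of range.
// at() performs the bounds check and throws std::out_of_range on failure.
int value_or(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Small usage example.
void at_example() {
    const std::vector<int> values{10, 20, 30};
    assert(value_or(values, 1, -1) == 20);  // valid index
    assert(value_or(values, 7, -1) == -1);  // invalid index, exception caught
}
```

Without the try/catch, the same invalid index would produce the raise/abort sequence shown in the stripped-binary backtrace.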
I have a binary that I am migrating from 32-bit to 64-bit.
I ran it and this was the result when I did top -H -p <name of binary>:
Note that all the entries are threads of the same process.
So I decided to check what is happening inside each thread, and started attaching to each one.
This was the result:
gdb attach 28608
(gdb) bt
#0 0x00000039a40ccfc2 in select () from /lib64/libc.so.6
#1 0x00002b40b4d20178 in ?? ()
#2 0x0000000000000000 in ?? ()
gdb attach 28472
(gdb) bt
#0 0x00000039a40ccfc2 in select () from /lib64/libc.so.6
#1 0x00002b40b4d20178 in ?? ()
#2 0x0000002d00000000 in ?? ()
#3 0x000000300000002e in ?? ()
#4 0x0000003200000000 in ?? ()
#5 0x00000000142cf418 in ?? ()
#6 0x00000000142cf3f8 in ?? ()
#7 0x0000003900000038 in ?? ()
#8 0x0000003e0000003b in ?? ()
#9 0x0000004000000000 in ?? ()
#10 0x00000000142cf278 in ?? ()
#11 0x00000000142cf2f8 in ?? ()
#12 0x0000004800000047 in ?? ()
#13 0x00007fffe259cd00 in ?? ()
#14 0x00007fffe259cd70 in ?? ()
#15 0x0000000000000000 in ?? ()
gdb attach 28475
(gdb) bt
#0 0x00000039a40ccfc2 in select () from /lib64/libc.so.6
#1 0x00002b40b4ee8f3c in ?? ()
#2 0x0000000000000002 in ?? ()
#3 0x0000000000069f50 in ?? ()
#4 0x00002b40b542e160 in ?? ()
#5 0x00002b40b4ee9e91 in ?? ()
#6 0x00002b40b5505681 in ?? ()
#7 0x00000000140fede0 in ?? ()
#8 0x0000000000000000 in ?? ()
gdb attach 28609
(gdb) bt
#0 0x00000039a40ccfc2 in select () from /lib64/libc.so.6
#1 0x00002b40b4d20178 in ?? ()
#2 0x0000000000000000 in ?? ()
I am not quite sure what this select() function is.
Can you tell me what might be wrong here? Why are all the threads stuck like that?
Have you ever encountered something like this before?
select is an API call that lets you monitor file descriptors for activity; when it returns, it tells you which descriptors are ready so that you can carry out the operation (e.g. read or write).
It looks like your program has 4 threads that are each waiting for select to return.
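To make that concrete, here is a minimal POSIX sketch of a select call on a pipe. wait_readable and pipe_readiness_demo are hypothetical helper names for this example. It uses a timeout so that it returns on its own; a thread that calls select with no timeout simply blocks until a descriptor becomes ready, which looks like the backtraces above.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Waits until fd is readable or the timeout expires.
// Returns select()'s result: 1 if fd is readable, 0 on timeout, -1 on error.
int wait_readable(int fd, long timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set read_set;
    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);

    timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    // The first argument is the highest descriptor number plus one.
    return select(fd + 1, &read_set, nullptr, nullptr, &timeout);
}

// Demonstrates both outcomes on a pipe: nothing to read, then one byte to read.
// Returns true if select reported a timeout first and readiness second.
bool pipe_readiness_demo() {
    int ends[2];
    if (pipe(ends) != 0) return false;

    const int before_write = wait_readable(ends[0], 50);  // pipe is empty

    const char byte = 'x';
    const ssize_t written = write(ends[1], &byte, 1);
    const int after_write = wait_readable(ends[0], 50);   // one byte pending

    close(ends[0]);
    close(ends[1]);
    return written == 1 && before_write == 0 && after_write == 1;
}
```

A thread sitting in select is therefore usually idle rather than broken: it is waiting for input that has not arrived yet.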
I am hunting down a deadlock, but I don't understand gdb behavior in this respect. I have two threads:
Thread 2 (Thread 0x2aaaadf66940 (LWP 10229)):
#0 0x0000003f95e0d654 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003f95e08f65 in _L_lock_1127 () from /lib64/libpthread.so.0
#2 0x0000003f95e08e63 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00002b67cbdeaded in ?? ()
#4 0x000000002d0e9608 in ?? ()
#5 0x00002b67cbd1e1f2 in ?? ()
#6 0x000000000000000b in ?? ()
#7 0x00002aaaca08e410 in ?? ()
#8 0x00002aaab405d558 in ?? ()
#9 0x00002aaaadf65f48 in ?? ()
#10 0x00002aaaadf65fa0 in ?? ()
#11 0x00002aaaadf65fc0 in ?? ()
#12 0x00002aaaadf65f40 in ?? ()
#13 0x00002aaaadf65f50 in ?? ()
#14 0x000000002d0e7460 in ?? ()
#15 0x0000000026014330 in ?? ()
#16 0x00002b67cc1d08b0 in ?? ()
#17 0x0000003f94e7587b in free () from /lib64/libc.so.6
#18 0x00002aaac8b67450 in ?? ()
#19 0x00002aaaadf66070 in ?? ()
#20 0x13477fb9fe21aee8 in ?? ()
#21 0x000003742e856f43 in ?? ()
#22 0x00002b67cbe11811 in ?? ()
#23 0x00002b67cc1cfc70 in ?? ()
#24 0x000000002d0e8328 in ?? ()
#25 0x000000002d0e9630 in ?? ()
#26 0x00002b67cbded355 in ?? ()
#27 0x0000000052cdceee in ?? ()
#28 0x000000002d0e9608 in ?? ()
#29 0x0000000000000001 in ?? ()
#30 0x000000002d0e9700 in ?? ()
#31 0x000000002d0e96a8 in ?? ()
#32 0x000000002d0e9728 in ?? ()
#33 0x000000002d0e9630 in ?? ()
#34 0x00002b67cbded538 in ?? ()
#35 0x000000002ccbc6a8 in ?? ()
#36 0x00002aaaadf66070 in ?? ()
#37 0xfffffffffffffffe in ?? ()
#38 0x0000000000000008 in ?? ()
#39 0x00002b67cbe0cf00 in ?? ()
#40 0x0000003b24002216 in ?? () from /usr/lib64/tls/libnvidia-tls.so.319.60
#41 0x00002b67cbe116ec in ?? ()
#42 0x000000002d0e9648 in ?? ()
#43 0xffffffffffffff01 in ?? ()
#44 0x00002b67cc1f38f8 in ?? ()
#45 0x00002b67cbe103fa in ?? ()
#46 0x0000000019eac470 in ?? ()
#47 0x0000000034bc8ef0 in ?? ()
#48 0x00000000ffffffff in ?? ()
#49 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x2b67c311c600 (LWP 9798)):
#0 0x0000003f95e0d654 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003f95e08f4a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x0000003f95e08e0c in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00002b67cbdf02a8 in ?? ()
#4 0x000000000000b000 in ?? ()
#5 0x000000000000b000 in ?? ()
#6 0x000000002d0e7460 in ?? ()
#7 0x00002aaad484e6c0 in ?? ()
#8 0x00007fffd540a1b0 in ?? ()
#9 0x0000003f94e73f0e in malloc () from /lib64/libc.so.6
#10 0x0000003b24002216 in ?? () from /usr/lib64/tls/libnvidia-tls.so.319.60
#11 0x0000000000000000 in ?? ()
These two threads are apparently deadlocking: Thread 1 wants to acquire the lock from thread 2 (note the owner)
(gdb) p *(pthread_mutex_t*)0x2d0e9648
$1 = {__data = {__lock = 2, __count = 0, __owner = 10229, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
and thread 2 wants to acquire the lock from thread 1
(gdb) p *(pthread_mutex_t*)0x2d0e8330
$2 = {__data = {__lock = 2, __count = 1, __owner = 9798, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
Now, what I don't understand is why the backtrace is so broken. I tried checking which libraries are mapped at those addresses (in particular the 2b67cbd range), but none are. I tried disas, with no luck:
(gdb) disas 0x00002b67cbdeaded
No function contains specified address.
There seems to be nothing on those addresses. I thought it was stack corruption, but then what actually calls the pthread lock? Who sends the thread to that code? And how reliable is the call to free() (note that the other thread is calling malloc, so they might be related in their activity)?
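For reference, the lock-order inversion described above (each thread holds one mutex and waits for the other) can be avoided in code you control by acquiring both mutexes in a single step. The following is a general sketch with hypothetical names (SharedState, run_both_orders); it assumes you own both lock sites, which may not hold here if the locks are taken inside third-party code.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

struct SharedState {
    std::mutex first;
    std::mutex second;
    long counter = 0;
};

// Names the mutexes in the order first, second.
void increment_first_then_second(SharedState& state, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // std::lock acquires both mutexes with a deadlock-avoidance algorithm.
        std::lock(state.first, state.second);
        std::lock_guard<std::mutex> hold_first(state.first, std::adopt_lock);
        std::lock_guard<std::mutex> hold_second(state.second, std::adopt_lock);
        ++state.counter;
    }
}

// Names the mutexes in the opposite order. With two separate lock() calls
// this ordering could deadlock against the function above.
void increment_second_then_first(SharedState& state, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock(state.second, state.first);
        std::lock_guard<std::mutex> hold_second(state.second, std::adopt_lock);
        std::lock_guard<std::mutex> hold_first(state.first, std::adopt_lock);
        ++state.counter;
    }
}

// Runs both functions concurrently and returns the final counter value.
long run_both_orders(int iterations) {
    assert(iterations >= 0);
    SharedState state;
    std::thread a(increment_first_then_second, std::ref(state), iterations);
    std::thread b(increment_second_then_first, std::ref(state), iterations);
    a.join();
    b.join();
    return state.counter;
}
```

The simpler alternative is to make every code path take the two mutexes in the same fixed order.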
(gdb) disas 0x00002b67cbdeaded
No function contains specified address.
There seems to be nothing on those addresses.
Your conclusion is likely not correct. Try (gdb) x/20i 0x00002b67cbdeaded-5, and you'll see that in fact there is code there, including a CALL pthread_mutex_lock.
What's likely happening is that something in your program is using a JIT compiler, and the code that calls pthread_mutex_lock does not have any symbols (that GDB knows about) associated with it.
That code also doesn't have any unwind descriptors, which makes the rest of the stack completely unreliable. free and malloc may or may not actually be on the stack.
It may be illustrative to look at /proc/<pid>/maps and see what is mapped in the 0x00002b67cbdea000 region. Most likely you'll find an anonymous mapping with rwxp permissions.
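The lookup in the maps file can also be done programmatically. Below is a minimal sketch that parses text in the Linux maps format and reports which mapping contains an address. find_mapping, looks_like_jit_region and the sample lines are made up for this example; the sample is not taken from the process in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

struct Mapping {
    bool found = false;
    std::string perms;  // e.g. "r-xp" or "rwxp"
    std::string path;   // empty for anonymous mappings
};

// Searches maps-format text ("start-end perms offset dev inode [path]")
// for the mapping that contains address. Lines without the five leading
// fields are skipped; the hex range is assumed to be well formed.
Mapping find_mapping(const std::string& maps_text, std::uint64_t address) {
    std::istringstream lines(maps_text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string range, perms, offset, device, inode, path;
        if (!(fields >> range >> perms >> offset >> device >> inode)) continue;
        fields >> path;  // stays empty when the mapping has no path

        const std::size_t dash = range.find('-');
        if (dash == std::string::npos) continue;
        const std::uint64_t start = std::stoull(range.substr(0, dash), nullptr, 16);
        const std::uint64_t end = std::stoull(range.substr(dash + 1), nullptr, 16);

        if (address >= start && address < end) {
            Mapping result;
            result.found = true;
            result.perms = perms;
            result.path = path;
            return result;
        }
    }
    return Mapping();
}

// Heuristic: an anonymous mapping that is both writable and executable.
bool looks_like_jit_region(const Mapping& mapping) {
    return mapping.found && mapping.path.empty() &&
           mapping.perms.find('w') != std::string::npos &&
           mapping.perms.find('x') != std::string::npos;
}

// Usage example with invented sample data.
void maps_example() {
    const std::string sample =
        "2b67cbd00000-2b67cbe20000 rwxp 00000000 00:00 0\n"
        "3f94e00000-3f94f4e000 r-xp 00000000 08:01 1234 /lib64/libc.so.6\n";
    assert(looks_like_jit_region(find_mapping(sample, 0x2b67cbdeadedULL)));
    assert(find_mapping(sample, 0x3f94e7587bULL).path == "/lib64/libc.so.6");
    assert(!find_mapping(sample, 0x1000ULL).found);
}
```

To use it on a live process, read the contents of that process's maps file into a string and pass it as maps_text.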
When I Ctrl + left-click the unfocused window of a Qt application launched via Cygwin X, I get a very repeatable segfault. At this point I've trimmed off all of my application code and can still see the same behavior while running a simple QMainWindow holding a few empty TextEdits. A simple left click into the unfocused window (while holding Ctrl) causes a segfault about 50% of the time.
Has anybody noticed similar behavior? I'm very curious, since I don't see this behavior documented or reported elsewhere.
Note: I've noticed that this behavior applies to all modifier keys (Alt, Ctrl, Shift, etc.).
#0 0x00007f8429cf614b in ?? () from /usr/local/lib/qt5/5.0.2/gcc_64/plugins/platforms/libqxcb.so
#1 0x00007f8429cee501 in ?? () from /usr/local/lib/qt5/5.0.2/gcc_64/plugins/platforms/libqxcb.so
#2 0x00007f8429ce9e20 in ?? () from /usr/local/lib/qt5/5.0.2/gcc_64/plugins/platforms/libqxcb.so
#3 0x00007f8429ceb0cb in ?? () from /usr/local/lib/qt5/5.0.2/gcc_64/plugins/platforms/libqxcb.so
#4 0x00007f84306c955e in QObject::event(QEvent*) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#5 0x00007f84311f25b4 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Widgets.so.5
#6 0x00007f84311f5991 in QApplication::notify(QObject*, QEvent*) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Widgets.so.5
#7 0x00007f84306a27c4 in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#8 0x00007f84306a4701 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#9 0x00007f84306ea0d3 in ?? () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#10 0x00007f842e7e3f05 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007f842e7e4248 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007f842e7e4304 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x00007f84306ea514 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#14 0x00007f84306a169b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
#15 0x00007f84306a4c3e in QCoreApplication::exec() () from /usr/local/lib/qt5/5.0.2/gcc_64/lib/libQt5Core.so.5
About a week ago I started having trouble getting a decent backtrace from a core dump using GDB. If I load the program in GDB and have it crash, I get a backtrace just fine.
This is what I get when doing it from a core dump:
(gdb) bt
#0 0x00007fd10ad42425 in ?? ()
#1 0x00007fd10ad45b8b in ?? ()
#2 0x0000000000000004 in ?? ()
#3 0x0000000000000005 in ?? ()
#4 0x00007ffff770887e in ?? ()
#5 0x0000000000000009 in ?? ()
#6 0x00007fd10ae87ea7 in ?? ()
#7 0x0000000000000003 in ?? ()
#8 0x00007ffff77072ba in ?? ()
#9 0x0000000000000006 in ?? ()
#10 0x00007fd10ae87eab in ?? ()
#11 0x0000000000000002 in ?? ()
#12 0x00007ffff77072ce in ?? ()
#13 0x0000000000000002 in ?? ()
#14 0x00007fd10ae85b82 in ?? ()
#15 0x0000000000000001 in ?? ()
#16 0x00007fd10ae87ea7 in ?? ()
#17 0x0000000000000003 in ?? ()
#18 0x00007ffff77072b4 in ?? ()
#19 0x000000000000000c in ?? ()
#20 0x00007fd10ae87eab in ?? ()
#21 0x0000000000000002 in ?? ()
#22 0x0000000000000020 in ?? ()
#23 0x0000000000000000 in ?? ()
(gdb)
This happens regardless of whether it's a SIGSEGV or a SIGABRT (unhandled exception or assert/verify).
I am compiling with the following compiler flags:
g++ -Wall -Wextra -g -ggdb -std=gnu++0x -rdynamic -pthread -O0
I can't really think of anything that has changed to be causing this. Any ideas?
It turns out that despite the "core dumped" message, an older existing core file was not being overwritten, so I was loading a stale core. This is apparently an Ubuntu bug, according to this question:
Why is my core file not overwritten?
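One quick sanity check for this situation is to compare modification times: a core file that is older than the executable cannot have come from the current build. The sketch below uses POSIX stat; is_older and core_is_stale are hypothetical helper names, and the check is only a heuristic, since a core newer than the binary can still be left over from an earlier run.

```cpp
#include <cassert>
#include <ctime>
#include <sys/stat.h>

// True when the core file was last modified before the executable was.
bool is_older(std::time_t core_mtime, std::time_t executable_mtime) {
    return core_mtime < executable_mtime;
}

// Returns 1 if the core file predates the executable, 0 if it does not,
// and -1 if either file cannot be inspected.
int core_is_stale(const char* core_path, const char* executable_path) {
    assert(core_path != nullptr && executable_path != nullptr);

    struct stat core_info;
    struct stat executable_info;
    if (stat(core_path, &core_info) != 0 ||
        stat(executable_path, &executable_info) != 0) {
        return -1;
    }
    return is_older(core_info.st_mtime, executable_info.st_mtime) ? 1 : 0;
}
```

Deleting any old core file before reproducing the crash avoids the problem altogether.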