When I compile same code with g++ with -o2 flag I can backtrace successfully without any Bogus adresses. Like;
0x08156079 in CItem::GetValue (this=0x3adb0f00, idx=0) at item.cpp:957
0x081b123c in quest::item_has_flag (L=0x3af9bdc0) at questlua_item.cpp:155
0x08363cba in luaD_precall (L=0x3af9bdc0, func=0x3b1cedd8) at ldo.c:249
0x0836ba86 in luaV_execute (L=0x3af9bdc0) at lvm.c:637
0x08363fad in resume (L=0x3af9bdc0, ud=0xffffa164) at ldo.c:337
0x0836393b in luaD_protectedparser (L=0x3af9bdc0, z=0x8363f80, bin=-24220)
....
But I need to use g++48 (with c++11) for better performance and other reasons... So, when I do same thing with -o3 optimize flag and g++48 I can't get any file names etc. Like;
#0 0x28a56f3c in ?? ()
No symbol table info available.
#1 0x00000032 in ?? ()
No symbol table info available.
#2 0xbfbf9838 in ?? ()
No symbol table info available.
#3 0x28a4ea3a in ?? ()
No symbol table info available.
#4 0x00000032 in ?? ()
No symbol table info available.
#5 0x00000004 in ?? ()
No symbol table info available.
#6 0x00000001 in ?? ()
No symbol table info available.
#7 0x28a70694 in ?? ()
No symbol table info available.
#8 0xbfbf969c in ?? ()
No symbol table info available.
#9 0x28a6b06c in ?? ()
No symbol table info available.
Which flags I must not use for debugging? (-fno-omit-frame-pointer) Which flags should I use for debugging? And reason... I'm not a SO expert.
With gcc 4.8 you can use -Og switch, to enable all optimizations that do not interfere with debugging.
Also make sure you enabled debug info (-g switch). If you updated your compiler to newer release, you should also update the debugger. Another thing you may try is to make sure gcc generates debug info in compatible format (try -gdwarf-2 or similar).
Related
I am trying to debug my c++ programming assignment application using gdb on an Ubuntu server, because it produces segmentation fault.
But the file produces ?? symbols that are unreadable to me when I try bt it gives me.
(gdb) bt
#0 0x00007f141956d277 in ?? ()
#1 0x00007ffc1a866bd0 in ?? ()
#2 0x000055e1f101d5e0 in ?? ()
#3 0x00007ffc1a866db0 in ?? ()
#4 0x000055e1f1433e70 in ?? ()
#5 0x00007ffc1a866bd0 in ?? ()
#6 0x000055e1f10224a9 in ?? ()
#7 0x000055e1f14341f8 in ?? ()
#8 0x00000001f14344d0 in ?? ()
#9 0x0000000000000000 in ?? ()
I was following this link, and it told me to load these symbols
symbol-file /path/to/my/binary
sharedlibrary
The sharedlibrary was found, but the symbol-file path is not there. So,it did change bt command output somehow
(gdb) bt
#0 tcache_get (tc_idx=0) at malloc.c:2943
#1 _GI__libc_malloc (bytes=19) at malloc.c:3050
#2 0x000055e1f10224a9 in ?? ()
#3 0x000055e1f14341f8 in ?? ()
#4 0x00000001f14344d0 in ?? ()
#5 0x0000000000000000 in ?? ()
I still don't understand the bug.
Now, I don't know it's a problem from the GDB for not having this symbol-file or its a compilation problem which I don't know how or that's enough for me to debug, but I was following Debugging a Segmentation Fault and it was much clearer to troubleshoot.
When I search for similar cases, all of them were answered only for their case, not a general solution how to deal with these kinds of error. I also thought of installing or locating that symbol-file but I didn't understand how.
If someone could help me, I need to understand what is my problem and how should I fix it.
Note: core dump is produced in the /tmp not in current application directory
I was following this link, and it told me to load these symbols
Don't follow this link (it's unnecessarily complicated for your use case).
Instead, do this:
gdb /path/to/my/binary
(gdb) run
... GDB will stop when your program encounters SIGSEGV
(gdb) bt # should produce meaningful output now.
I'm using the Point Cloud Library with cmake for compilation, and I've got it building in debug mode, but my program doesn't seg fault or abort in the way I'd expect it to.
Specifically, I get messages like this:
(gdb) run bunny
Starting program: debug/our_cvfh bunny
libc++abi.dylib: terminating
[New Thread 0x170b of process 80178]
Program received signal SIGABRT, Aborted.
0x00007fff88c6f866 in ?? ()
(gdb) bt
#0 0x00007fff88c6f866 in ?? ()
#1 0x00007fff8bb5235c in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) break rec/registered_views_source.h:305
Cannot access memory at address 0x961d60
In this case, I know where the error is, but I'd like to be able to backtrace it and see what called the function in this case.
Is PCL creating another thread, and that's why I can't backtrace? I'm not doing any visualizations right now, so I can't figure out why it'd be using threading.
I've also tried running the program in the debug directory instead of from my source root directory. Here's another example of it not working:
$ gdb our_cvfh
GNU gdb (GDB) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin13.1.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from our_cvfh...done.
run (gdb) run
Starting program: /Users/jwoods/Projects/lidargl/fpfh/debug/our_cvfh
[New Thread 0x170b of process 33571]
Program received signal SIGSEGV, Segmentation fault.
0x000000010016cdec in ?? ()
(gdb) bt
#0 0x000000010016cdec in ?? ()
#1 0x00007fff5fbfbd08 in ?? ()
#2 0x00007fff5fbfbcc0 in ?? ()
#3 0x00007fff5fbfbcc8 in ?? ()
#4 0x00007fff5fbfbcc8 in ?? ()
#5 0x00007fff5fbfbcc8 in ?? ()
#6 0xffffffffffffffff in ?? ()
#7 0x00007fff5fbfbcc8 in ?? ()
#8 0x00007fff5fbfbcc8 in ?? ()
#9 0x00007fff5fbfbcc0 in ?? ()
#10 0x00007fff5fbfbcc0 in ?? ()
#11 0x00007fff5fbfbcc8 in ?? ()
#12 0x00007fff5fbfbcc8 in ?? ()
#13 0x00007fff5fbfbcc8 in ?? ()
#14 0x00007fff5fbff4a8 in ?? ()
#15 0x00007fff5fbff4d8 in ?? ()
#16 0x00007fff5fbff420 in ?? ()
#17 0x00007fff5fbff4d8 in ?? ()
#18 0x0000000000000000 in ?? ()
(gdb)
gdb works fine when I'm not using CMake, so my guess is it has something to do with CMake. This doesn't seem to present a problem for anyone else, which tells me it may also have to do with the fact that I'm using CMake with Mac OS X.
How do I get my normal GDB behavior?
Update
I can run dsymutil my_output_binary in order to generate the debugging symbols (following make). This is a workaround. I'd like it to be done automatically, and amn't sure why it's not. The dsymutil strategy works for most segfaults, but doesn't work for some cases of SIGABRT. Here is the output:
Calling compute <--- normal std::cerr output of my program, single-threaded
Assertion failed: (index >= 0 && index < size()), function operator[], file /usr/local/Cellar/eigen/3.2.1/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h, line 378.
[New Thread 0x170b of process 64108]
Program received signal SIGABRT, Aborted.
0x00007fff84999866 in ?? ()
(gdb) bt
#0 0x00007fff84999866 in ?? ()
#1 0x00007fff862c335c in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) info threads
Id Target Id Frame
2 Thread 0x170b of process 64108 0x00007fff8499a662 in ?? ()
* 1 Thread 0x1503 of process 64108 0x00007fff84999866 in ?? ()
(gdb)
Note that my program is not itself multi-threaded, but it appears to be making use of a library which is creating threads — or at least that's what I gather from the gdb output.
I've tried disabling threading with Eigen, which is used by PCL.
Interestingly, lldb is able to generate the backtraces, but I'm curious as to why GDB can't.
Other than simply using LLDB in lieu of GDB, I haven't figured out how to debug threads.
However, I have figured out how to automatically produce debugging symbols! Hooray.
You need three files:
UseCompVer.cmake
AddOptions.cmake
UseDebugSymbols.cmake
Put these in your project's cmake/Modules/ directory.
In CMakeLists.txt, you'll need the following lines after your project declaration:
set(CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake/Modules/")
if (APPLE) # this if statement is optional, but you definitely need the include()
include(UseDebugSymbols)
endif (APPLE)
That will look in the appropriate location.
I should note that I made another change to my project simultaneously, which may also be useful to you. In my add_executable line, right after the name of the target, I added MACOSX_BUNDLE. This flag will cause it to be compiled as a .app instead of a regular binary. Here's an example from my project:
add_executable(pose MACOSX_BUNDLE pose.cpp rec/global_nn_recognizer_cvfh.cpp rec/global_nn_recognizer_cvfh.hpp rec/render_views_tesselated_sphere.cpp)
Im receving many cores from different programs on my redhat server those cores happens without any specific pattern it can happen with Tuxedo servers as well as ordinary programs the only common thing between all program is that all of them have the same top error with this 8629 number check_match.8629 ()
how can I identify what this number refering to?
Thanks in advance
data from core dump file
#0 0x005546b1 in check_match.8629 () from /lib/ld-linux.so.2
No symbol table info available.
#1 0x00554e17 in do_lookup_x () from /lib/ld-linux.so.2
No symbol table info available.
#2 0x005550da in _dl_lookup_symbol_x () from /lib/ld-linux.so.2
No symbol table info available.
#3 0x00559a05 in _dl_fixup () from /lib/ld-linux.so.2
No symbol table info available.
#4 0x0055fc90 in _dl_runtime_resolve () from /lib/ld-linux.so.2
You need a library with debugging symbols to debug the core file. Once you have that, you can get a backtrace from core which will give you leads.
The number with the core file might be the PID. Check this to confirm that - How to generate core dump file in Ubuntu
or cat /proc/sys/kernel/core_pattern
Currently I have a C++ program which crashed every now and then when running. I run it in Dr. Memory, but when it doesn't crash there is no obvious error reported, and when it does drmemory doesn't report anything at all (it just ends). So I debugged it with the shipped GDB in the corresponding MinGW package.
The program is compiled & linked with MinGW gcc 32bit (dwarf2 & sjlj produce similar results), under 64bit Windows 7. The GDB debugger caught a SIGSEGV that caused the program to crash. I made a complete backtrace, but I don't know how to analyze it, for I haven't done it before.
This is where the debugger stopped when the program crashed:
Thread 539 (Thread 5904.0x11b4):
#0 0x00000000 in ?? ()
No symbol table info available.
#1 0x00433f1b in pthread_create_wrapper ()
No symbol table info available.
#2 0x763d1287 in msvcrt!_itow_s () from C:\Windows\syswow64\msvcrt.dll
No symbol table info available.
#3 0x763d1328 in msvcrt!_endthreadex () from C:\Windows\syswow64\msvcrt.dll
No symbol table info available.
#4 0x74e5336a in KERNEL32!BaseThreadInitThunk () from C:\Windows\syswow64\kernel32.dll
No symbol table info available.
#5 0x77069f72 in ntdll!RtlInitializeExceptionChain () from C:\Windows\system32\ntdll.dll
No symbol table info available.
#6 0x77069f45 in ntdll!RtlInitializeExceptionChain () from C:\Windows\system32\ntdll.dll
No symbol table info available.
#7 0x00000000 in ?? ()
No symbol table info available.
I don't know if I should post other parts of the backtrace; it's very long, because the program is a multi-threaded Qt Gui program.
I guess the #0 00000000 in ?? () is where the program crashed, because in other threads there is always a new thread started after ... -> msvcrt!_endthreadex () -> msvcrt!_itow_s () -> pthread_create_wrapper ().
I've been debugging this for days...it's driving me up the wall. Any help will be greatly appreciated!!
P.S. Please let me know if I should post other parts of the backtrace too. I don't know whether the point GDB stopped is just the point problems popped in a multithreaded program.
There was a core dump produced at the customer end for my application and while looking at the backtrace I don't have the symbols loaded...
(gdb) where
#0 0x000000364c032885 in ?? ()
#1 0x000000364c034065 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) bt full
#0 0x000000364c032885 in ?? ()
No symbol table info available.
#1 0x000000364c034065 in ?? ()
No symbol table info available.
#2 0x0000000000000000 in ?? ()
No symbol table info available.
One think I want to mention in here is that the application being used is build with -g option.
To me it seems that the required libraries are not being loaded. I tried to load the libraries manually using the "symbol-file", but this doesn't help.
What could be the possible issue?
No symbol table info available.
Chances are you invoked GDB incorrectly. Don't do this:
gdb core
gdb -c core
Do this instead:
gdb exename core
Also see this answer for what you'll likely have to do to get meaningful crash stack trace for a core from customer's machine.
I was facing a similar issue and later found out that I am missing -g option, Make sure you have compiled the binary with -g.
This happens when you run gdb with path to executable that does not correspond to the one that produced the core dump.
Make sure that you provide gdb with the correct path.
<put an example of correct code or commands here>