GDB core dump analysis inside a docker container - c++

I'm trying to do a post-mortem analysis with gdb on a C++ application inside a docker container. I have increased ulimit, added the SYS_PTRACE capability, and set seccomp=unconfined. But when I load the core file in gdb together with the main application built with debug symbols, I see this:
warning: Unexpected size of section `.reg-xstate/10734' in core file.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/10734' in core file.
#0 0x00007f05ce755350 in ?? ()
[Current thread is 1 (Thread 0x7f04d3ffe700 (LWP 10734))]
(gdb)
(gdb) bt
#0 0x00007f05ce755350 in ?? ()
#1 0x00007f05cea2382f in ?? ()
#2 0x00007f0000000008 in ?? ()
#3 0x00007f05ced76c40 in ?? ()
#4 0x00007f04d3ffdd50 in ?? ()
#5 0x00007f05ce897d69 in ?? ()
#6 0x00007f05ced76c40 in ?? ()
#7 0x00007f04d3ffdd70 in ?? ()
#8 0x00007f04d3ffdd90 in ?? ()
#9 0x00007f05ce89796c in ?? ()
#10 0x00007f04d3ffdd90 in ?? ()
#11 0x0000000000000000 in ?? ()
Why is this happening when the ulimit is unlimited?
root@57a1bc01c676:/workspace/builddir# ulimit
unlimited
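One detail worth checking here: in bash, `ulimit` with no option reports the file-size limit (`-f`), not the core-file limit (`-c`), so the "unlimited" above does not by itself say anything about core dumps. A minimal sketch, assuming a Linux/POSIX system, that reads the core limit from inside the process; the function name `core_limit_text` is hypothetical:

```cpp
#include <string>
#include <sys/resource.h>

// Hypothetical helper: returns the soft RLIMIT_CORE of the calling process
// as text: "unlimited", a byte count, or "error" if getrlimit() fails.
std::string core_limit_text() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return "error";
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(static_cast<unsigned long long>(lim.rlim_cur));
}
```

Printing this value at startup of the application in the container shows the limit the process actually runs with. It does not explain the ?? frames on its own; it only rules one variable in or out.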

Related

GDB produces ?? symbols for segmentation fault on Ubuntu

I am trying to debug my C++ programming assignment with gdb on an Ubuntu server, because it produces a segmentation fault.
But the backtrace shows only ?? symbols, which I can't read. When I run bt, it gives me:
(gdb) bt
#0 0x00007f141956d277 in ?? ()
#1 0x00007ffc1a866bd0 in ?? ()
#2 0x000055e1f101d5e0 in ?? ()
#3 0x00007ffc1a866db0 in ?? ()
#4 0x000055e1f1433e70 in ?? ()
#5 0x00007ffc1a866bd0 in ?? ()
#6 0x000055e1f10224a9 in ?? ()
#7 0x000055e1f14341f8 in ?? ()
#8 0x00000001f14344d0 in ?? ()
#9 0x0000000000000000 in ?? ()
I was following this link, and it told me to load these symbols
symbol-file /path/to/my/binary
sharedlibrary
The sharedlibrary command worked, but the symbol-file path was not found. Still, the bt output did change somewhat:
(gdb) bt
#0 tcache_get (tc_idx=0) at malloc.c:2943
#1 __GI___libc_malloc (bytes=19) at malloc.c:3050
#2 0x000055e1f10224a9 in ?? ()
#3 0x000055e1f14341f8 in ?? ()
#4 0x00000001f14344d0 in ?? ()
#5 0x0000000000000000 in ?? ()
I still don't understand the bug.
I don't know whether the problem is that GDB lacks this symbol-file, whether it is a compilation problem I don't know how to fix, or whether this output is already enough for me to debug. I was following Debugging a Segmentation Fault, where the output was much clearer to troubleshoot.
When I searched for similar cases, every answer addressed only that specific case rather than giving a general way to deal with this kind of error. I also thought of installing or locating that symbol-file, but I didn't understand how.
I need to understand what my problem is and how I should fix it.
Note: core dump is produced in the /tmp not in current application directory
I was following this link, and it told me to load these symbols
Don't follow this link (it's unnecessarily complicated for your use case).
Instead, do this:
gdb /path/to/my/binary
(gdb) run
... GDB will stop when your program encounters SIGSEGV
(gdb) bt # should produce meaningful output now.

Address sanitizer: PC is at a non-executable region. Maybe a wild jump?

My program is faulty, so I built it with g++ and AddressSanitizer to find the problem, using the options -Og -g3 -fsanitize=address -fno-omit-frame-pointer.
I used llvm-symbolizer:
export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-9
$ ASAN_OPTIONS=symbolize=1 ./a.out
AddressSanitizer:DEADLYSIGNAL
=================================================================
==7296==ERROR: AddressSanitizer: SEGV on unknown address 0x000200000039 (pc 0x000200000039 bp 0x000000000001 sp 0x7ffd2b566c48 T0)
==7296==The signal is caused by a READ memory access.
==7296==Hint: PC is at a non-executable region. Maybe a wild jump?
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.
The output has no meaningful information that helps me find the source of the problem in my code.
It does not dump a stack trace. What does this mean, and how can I get the stack trace or the location of the reported addresses?
Update: using gdb does not give much more information either:
(gdb) run
Starting program: /media/bin/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Full compilation was performed in 0.005376 seconds.
Code has an error.
No input file is given
==8965==LeakSanitizer has encountered a fatal error.
==8965==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==8965==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
Program received signal SIGABRT, Aborted.
0x00007ffff1eac438 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff1eac438 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007ffff1eae03a in __GI_abort () at abort.c:89
#2 0x00007ffff72d14be in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#3 0x00007ffff72dbe78 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#4 0x00007ffff72e0ecf in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#5 0x00007ffff72e0f05 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#6 0x00007ffff1eb137a in __cxa_finalize (d=0x7ffff752e4a0) at cxa_finalize.c:56
#7 0x00007ffff71ca793 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#8 0x00007fffffffdc50 in ?? ()
#9 0x00007ffff7de7e27 in _dl_fini () at dl-fini.c:235
Backtrace stopped: frame did not save the PC

Crash when running Qt application on Ubuntu Server

My Qt application runs fine on Ubuntu desktop edition. When I take the same application to Ubuntu server edition, it crashes on startup. So far I have found that I need to make Qt render offscreen by setting this environment variable:
export QT_QPA_PLATFORM=offscreen
And then when I run the application I get this stack trace with the application crashing:
Thread 3 "hmi" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fa8700 (LWP 18084)]
0x00007fffdf88decb in ?? () from /Qt5.6.2/5.6/gcc_64/plugins/platforms/libqoffscreen.so
(gdb) bt
#0 0x00007fffdf88decb in ?? () from /Qt5.6.2/5.6/gcc_64/plugins/platforms/libqoffscreen.so
#1 0x00007fffdf88e283 in ?? () from /Qt5.6.2/5.6/gcc_64/plugins/platforms/libqoffscreen.so
#2 0x00007ffff399a78d in QOpenGLContext::create() () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Gui.so.5
#3 0x00007ffff41d2a67 in ?? () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Quick.so.5
#4 0x00007ffff41d32d2 in ?? () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Quick.so.5
#5 0x00007ffff39633ea in QWindow::event(QEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Gui.so.5
#6 0x00007ffff4206553 in QQuickWindow::event(QEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Quick.so.5
#7 0x00007ffff4eb25ca in QCoreApplication::notify(QObject*, QEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Core.so.5
#8 0x00007ffff4eb2720 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Core.so.5
#9 0x00007ffff3958c69 in QGuiApplicationPrivate::processExposeEvent(QWindowSystemInterfacePrivate::ExposeEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Gui.so.5
#10 0x00007ffff39597fd in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Gui.so.5
#11 0x00007ffff393aad3 in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Gui.so.5
#12 0x00007fffdf88e3f0 in ?? () from /Qt5.6.2/5.6/gcc_64/plugins/platforms/libqoffscreen.so
#13 0x00007fffe9eb3197 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x00007fffe9eb33f0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x00007fffe9eb349c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16 0x00007ffff4f01f07 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Core.so.5
#17 0x00007ffff4eb076a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Core.so.5
#18 0x00007ffff4eb85fd in QCoreApplication::exec() () from /Qt5.6.2/5.6/gcc_64/lib/libQt5Core.so.5
When I run the same application with offscreen set on the desktop edition, I do not see this crash. I cannot find much information about the Qt library libqoffscreen.so, and I cannot find symbols for the prebuilt libraries to get a better stack trace. Is there anything I need to install on the Ubuntu server to be able to run this Qt application?
You’re trying to use Qt Quick, and this won’t work with the offscreen platform, as there’s no OpenGL support there. I don’t understand why you’d want to run such an application headless; presumably it won’t be useful. Since it is your code, you shouldn’t be showing the GUI in headless mode: either add a command-line option to start without the GUI, or disable the GUI code using compile options (e.g. a macro passed from the makefile).
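The command-line option suggested above could be sketched like this. The flag name `--no-gui` and the function `wants_headless` are hypothetical, chosen only for illustration; the sketch deliberately avoids Qt so it stays self-contained:

```cpp
#include <cstring>

// Hypothetical helper: returns true when the (made-up) "--no-gui" flag
// appears among the command-line arguments. argv[0] is the program name
// and is skipped.
bool wants_headless(int argc, const char* const* argv) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "--no-gui") == 0) {
            return true;
        }
    }
    return false;
}
```

In main, the result would decide whether to construct the GUI at all. For example, one might create a QCoreApplication instead of a QGuiApplication and skip loading the Qt Quick window when the flag is set, so QOpenGLContext::create() is never reached on the server.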

OpenCV-C++ VideoCapture fails to open video files

Recently I upgraded my OS from Ubuntu Precise Saucy (13.10) to Trusty (14.04). After this upgrade, cv::VideoCapture stopped working properly. The program aborts when reading a video file. For example,
int main(int argc, char**argv)
{
cv::VideoCapture vin("sample/vout2l.avi");
...
When executed, this program aborts with the message:
*** Error in `./cv2-videoread.out': malloc(): memory corruption: 0x0000000000e3eff0 ***
Abort (core dumped)
The backtrace looks like:
[New LWP 15586]
[New LWP 15587]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./cv2-videoread.out'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007ff953e61c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ff953e61c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ff953e65028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ff953e9e2a4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ff953eabe26 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007ff953eac1ab in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007ff953eadba4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007ff953eaf7d2 in posix_memalign () from /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007ff94fa640fe in av_malloc () from /usr/lib/x86_64-linux-gnu/libavutil.so.52
#8 0x00007ff94fa641b1 in av_strdup () from /usr/lib/x86_64-linux-gnu/libavutil.so.52
#9 0x00007ff94fa5e5db in av_dict_set ()
from /usr/lib/x86_64-linux-gnu/libavutil.so.52
#10 0x00007ff954738574 in CvCapture_FFMPEG::open(char const*) ()
from /usr/lib/libopencv_highgui.so.2.4
#11 0x00007ff954738719 in cvCreateFileCapture_FFMPEG ()
from /usr/lib/libopencv_highgui.so.2.4
#12 0x00007ff95473aac9 in cvCreateFileCapture_FFMPEG_proxy(char const*) ()
from /usr/lib/libopencv_highgui.so.2.4
---Type <return> to continue, or q <return> to quit---
#13 0x00007ff954722d89 in cvCreateFileCapture ()
from /usr/lib/libopencv_highgui.so.2.4
#14 0x00007ff954723045 in cv::VideoCapture::open(std::string const&) ()
from /usr/lib/libopencv_highgui.so.2.4
#15 0x00007ff95472315c in cv::VideoCapture::VideoCapture(std::string const&) ()
from /usr/lib/libopencv_highgui.so.2.4
#16 0x0000000000401281 in main (argc=1, argv=0x7fff1f938388) at cv2-videoread.cpp:30
(gdb)
NOTE: cv::VideoCapture vin(... is on line 30.
Before upgrading the OS, this code was working with the same input file.
From the backtrace, it seems that the trouble happens at CvCapture_FFMPEG and libavutil. I upgraded the packages ffmpeg libavutil-dev libavutil51 libavutil52 but they were already up-to-date.
Also, OpenCV packages are up-to-date (I checked libopencv-core-dev libopencv-core2.4 libopencv-dev libopencv-highgui-dev libopencv-highgui2.4).
I also tested OpenCV built from source, but got the same results.
Do you have ideas to figure this out?
So, I've solved this issue.
By analyzing the program with ldd, I found that it was linked against, for example, /usr/lib/libopencv_highgui.so. However, on an x86_64 system, it should be /usr/lib/x86_64-linux-gnu/libopencv_highgui.so. On my system, both files were installed.
The issue was caused by /usr/lib/libopencv_*.so (I'm not sure how I installed them. Maybe from source code...?). I removed these files, and compiled the above program again. Then it worked without errors.

Reproducing an "_Unwind_Resume" call seen in a gdb backtrace

Here is the problem which I'm trying to address:
We've got a core dump while processing data. The result of backtracing is:
#0 0x00a99402 in __kernel_vsyscall ()
#1 0x00306df0 in raise () from /lib/libc.so.6
#2 0x00308701 in abort () from /lib/libc.so.6
#3 0x001c4530 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#4 0x001c1f35 in ?? () from /usr/lib/libstdc++.so.6
#5 0x001c12ca in ?? () from /usr/lib/libstdc++.so.6
#6 0x001c1d99 in __gxx_personality_v0 () from /usr/lib/libstdc++.so.6
#7 0x00d1c7e6 in ?? () from /lib/libgcc_s.so.1
#8 0x00d1cb62 in _Unwind_Resume () from /lib/libgcc_s.so.1
........
I've looked through the code base of our application, and it is not clear whether the problem is an uncaught exception or something else (but I know it is somehow connected with exceptions, because the _Unwind_Resume call is there). So I'm trying to write a simple program that also fails with a core dump and whose gdb backtrace contains the lines above.
OS: CentOS, compiler: GNU GCC 4.1.2, language: C/C++
Any suggestions about the problem or the code would be much appreciated.
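One way to get std::terminate called while the stack is being unwound is to let an exception run through a frame that has a cleanup and then reach a function that is not allowed to let it escape. The sketch below is an assumption-laden illustration, not a confirmed reproduction: it uses `noexcept`, which needs C++11 and therefore does not exist in GCC 4.1.2, and whether _Unwind_Resume appears in the backtrace depends on the compiler and runtime. All names (`Guard`, `thrower`, `with_cleanup`, `no_escape`, `terminating_signal`) are hypothetical.

```cpp
#include <csignal>
#include <cstdio>
#include <stdexcept>
#include <sys/wait.h>
#include <unistd.h>

// A local object with a destructor gives the enclosing frame a cleanup
// that has to run when an exception passes through it.
struct Guard {
    ~Guard() { std::fputs("cleanup\n", stderr); }
};

void thrower() { throw std::runtime_error("boom"); }

void with_cleanup() {
    Guard g;    // destroyed during unwinding
    thrower();
}

// The exception must not leave this function, so std::terminate is called.
void no_escape() noexcept { with_cleanup(); }

// Runs fn in a forked child and returns the signal that killed the child,
// 0 if it exited normally, or -1 on fork/waitpid failure.
int terminating_signal(void (*fn)()) {
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling no_escape() directly from main and loading the resulting core in gdb would show which frames this particular toolchain produces; terminating_signal(no_escape) only confirms that the process dies via abort().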