So I am experiencing this really weird behavior of gdb on Linux (KDE Neon 5.20.2):
I start gdb and load my executable using the file command:
(gdb) file mumble
Reading symbols from mumble...
As you can see, it did find debug symbols. Then I start my program (using start), which causes gdb to pause at the entry to the main function. At this point I can also print the backtrace using bt and it works as expected.
If I now continue my program and interrupt it at any point during startup, I can still display the backtrace without issues. However, all of the startup happens in thread 1. If I do something in my application that runs in a different thread and interrupt the program there, gdb is no longer able to display the backtrace properly. Instead it gives
(gdb) bt
#0 0x00007ffff5bedaff in ?? ()
#1 0x0000555556a863f0 in ?? ()
#2 0x0000555556a863f0 in ?? ()
#3 0x0000000000000004 in ?? ()
#4 0x0000000100000001 in ?? ()
#5 0x00007fffec005000 in ?? ()
#6 0x00007ffff58a81ae in ?? ()
#7 0x0000000000000000 in ?? ()
which shows that it can't find the respective debug symbols.
I compiled my application with cmake (gcc) using -DCMAKE_BUILD_TYPE=Debug. I also verified that debug symbols are present in the binary using objdump --debug mumble (which also printed a few lines of objdump: Error: LEB value too large, but I'm not sure whether this is related to the problem I am seeing).
While playing around with gdb, I also encountered the error
Cannot find user-level thread for LWP <SomeNumber>: generic error
a few times, which makes me suspect that there is indeed some issue involving threads here...
Finally, I tried running set verbose on in gdb before loading my binary, which yields
(gdb) set verbose on
(gdb) file mumble
Reading symbols from mumble...
Reading in symbols for /home/user/Documents/Git/mumble/src/mumble/main.cpp...done.
This also looks suspicious to me, as only main.cpp is explicitly listed here (even though the project has many more source files). I should also note that all the successful backtraces I am able to produce (as described above) originate from main.cpp.
I am honestly a bit clueless as to what might be the root issue here. Does someone have an idea what could be going on? Or how I could investigate further?
Note: I also tried using clang as a compiler but the result was the same.
Used program versions:
cmake: 3.18.4
gcc: 9.3.0
clang: 10.0.0
make: 4.2.1
I am working with DPDK version 18.11.8 stable on Linux, using a gcc x64 build.
At runtime I get a segmentation fault. Running gdb on the core dump gives this backtrace:
#0 0x0000000000f65680 in rte_eth_devices ()
#1 0x000000000048a03a in rte_eth_rx_burst (nb_pkts=7,
rx_pkts=0x7fab40620480, queue_id=0, port_id=<optimized out>)
at /opt/dpdk/dpdk-18.08/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:3825
#2 Socket_poll (ucRxPortId=<optimized out>, ucRxQueId=ucRxQueId@entry=0 '\000',
uiMaxNumOfRxFrm=uiMaxNumOfRxFrm@entry=7,
pISocketListener=pISocketListener@entry=0xf635d0 <FH_gtFrontHaulObj+16>)
at /data/<snip>/SocketClass.c:2188
#3 0x000000000048b941 in FH_perform (args_ptr=<optimized out>)
at /data/<snip>/FrontHaul.c:281
#4 0x00000000005788e4 in eal_thread_loop ()
#5 0x00007fab419fae65 in start_thread () from /lib64/libpthread.so.0
#6 0x00007fab4172388d in clone () from /lib64/libc.so.6
So it seems that rte_eth_rx_burst() calls rte_eth_devices () and that function crashes, presumably because of an illegal memory access. Possibly a hugepages problem?
I want to enable more debug info in DPDK. I am building DPDK using:
usertools/dpdk-setup.sh
Am I correct in thinking that the build commands in that script use make and I should modify the appropriate:
config/defconfig_*
file (defconfig_x86_64-native-linuxapp-gcc in my case) ?
If so, would these values be appropriate?
CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=y
RTE_LOG_LEVEL=RTE_LOG_DEBUG
RTE_LIBRTE_ETHDEV_DEBUG=y
(not sure whether all values should be prefixed by 'CONFIG_'?)
I tried building DPDK using:
$ export EXTRA_CFLAGS='-O0 -g'
$ make install T=x86_64-native-linuxapp-gcc
but that gave no extra info in the backtrace.
EDIT: The error has been identified and fixed; the application now runs without crashing.
Using the dpdk-debug chat room, we were able to rebuild the libraries and the application with the proper CFLAGS. With gdb we identified the probable cause: rte_eth_rx_burst was not being passed an array of mbuf pointers.
Based on the GDB details for frame 1, it looks like the application was not built with the EXTRA_CFLAGS (assuming you are using the DPDK example Makefile). The right way to build a DPDK application for debugging is as follows:
cd [dpdk target folder]
make clean
make EXTRA_CFLAGS='-O0 -ggdb'
cd [application folder]
make EXTRA_CFLAGS='-O0 -ggdb'
Then use GDB in TUI or non-TUI mode to analyze the error.
Notes:
One of the most common mistakes I make with rx_burst is passing *mbuf_array instead of **mbuf_array as the argument.
If a custom Makefile is used for the application, pass the flags as CFLAGS+="-O0 -ggdb".
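To illustrate the note about *mbuf_array versus **mbuf_array, here is a minimal sketch that does not use DPDK at all. FakeMbuf, fake_rx_burst and poll_once are hypothetical stand-ins that only mimic the shape of a burst-receive call: the callee writes one pointer per received packet into an array of pointers supplied by the caller, so the caller has to own storage for nb_pkts pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a packet buffer; this is NOT the real rte_mbuf.
struct FakeMbuf {
    std::uint32_t len;
};

// Hypothetical stand-in for a burst-receive call. It takes an array of
// pointers (FakeMbuf **) and stores one pointer per "received" packet,
// writing at most nb_pkts slots.
std::uint16_t fake_rx_burst(FakeMbuf *pool, std::size_t pool_size,
                            FakeMbuf **rx_pkts, std::uint16_t nb_pkts) {
    assert(rx_pkts != nullptr);
    std::size_t n = 0;
    while (n < nb_pkts && n < pool_size) {
        rx_pkts[n] = &pool[n];
        ++n;
    }
    return static_cast<std::uint16_t>(n);
}

// Correct usage: the caller provides storage for BURST pointers.
// Passing the address of a single FakeMbuf* instead would let the callee
// write past that one slot as soon as more than one packet arrives.
std::uint32_t poll_once() {
    constexpr std::uint16_t BURST = 7;
    static FakeMbuf pool[4] = {{64}, {128}, {256}, {512}};
    FakeMbuf *pkts[BURST] = {nullptr};  // one pointer slot per packet
    const std::uint16_t got = fake_rx_burst(pool, 4, pkts, BURST);

    std::uint32_t total = 0;
    for (std::uint16_t i = 0; i < got; ++i) {
        total += pkts[i]->len;  // only the first `got` slots are valid
    }
    return total;
}
```

In a real application the same rule applies to the rx_pkts argument shown in frame 1 of the backtrace: it must point to an array with room for at least nb_pkts pointers.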
I'm trying to use protobufs v 3.3.2 with Qt 5.9.1. This works with some Qt applications, but only if they are command line programs. Once I create a GUI application with Qt and protobufs, I get this error:
[libprotobuf FATAL
/home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:78]
This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed
version (3.3.2). Contact the program author for an update. If you
compiled the program yourself, make sure that your headers are from
the same version of Protocol Buffers as your link-time library.
(Version verification failed in
"/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
I should clarify that my part of the code is certainly using version 3.3.2 (I'm downloading and compiling protobufs from the git sources and statically linking). Look at the stack trace below to see that something that Qt is referencing is causing a protobuf version mismatch.
I'm developing on Ubuntu 16.04 and using the default desktop environment (Unity).
Work-Arounds
My troubleshooting has revealed these symptoms and work-arounds:
Use KDE / KUbuntu. Changing the desktop environment when logging in completely avoids the version mismatch issue.
Run the Qt application with -platform eglfs. This runs the application in full-screen mode using OpenGL. The program runs, but the window size is incorrect. When using the -platform eglfs option, it works even in Unity, but without this option, it gives me the above error.
Any Qt application that is a command-line only application (using QCoreApplication instead of QGuiApplication) can use protobufs 3.3.2. Changing the same app to use a GUI causes the version mismatch issue.
Questions
How can I use protobufs 3.3.2 with Qt GUI applications, and also not be dependent on what desktop environment is in use? Is it Qt that is using the version 2.6.1 of protobufs, and if so, is it feasible to compile Qt to use protobufs 3.3.2?
Debug Info
Here is a stack trace (the program crashes almost immediately upon starting):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.3.2). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
Thread 1 "scan" received signal SIGABRT, Aborted.
0x00007ffff4dff428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff4dff428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007ffff4e0102a in __GI_abort () at abort.c:89
#2 0x00007ffff543984d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff54376b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff5437701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff5437919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x0000000000603e0a in google::protobuf::internal::LogMessage::Finish (this=0x7fffffffc250)
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:268
#7 0x0000000000603e5a in google::protobuf::internal::LogFinisher::operator= (this=0x7fffffffc20f, other=...)
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:276
#8 0x0000000000603171 in google::protobuf::internal::VerifyVersion (headerVersion=2006001, minLibraryVersion=2006000,
filename=0x7fffde80aec0 "/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc")
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:86
#9 0x00007fffde7d490b in mir::protobuf::protobuf_AddDesc_mir_5fprotobuf_2eproto() ()
from /usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3
#10 0x00007fffde7d2409 in ?? () from /usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3
#11 0x00007ffff7de76ba in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffd5d8,
env=env@entry=0x7fffffffd5e8) at dl-init.c:72
#12 0x00007ffff7de77cb in call_init (env=0x7fffffffd5e8, argv=0x7fffffffd5d8, argc=1, l=<optimized out>) at dl-init.c:30
#13 _dl_init (main_map=main_map@entry=0xa2f450, argc=1, argv=0x7fffffffd5d8, env=0x7fffffffd5e8) at dl-init.c:120
#14 0x00007ffff7dec8e2 in dl_open_worker (a=a@entry=0x7fffffffc6e0) at dl-open.c:575
#15 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffc6d0, errstring=errstring@entry=0x7fffffffc6d8,
mallocedp=mallocedp@entry=0x7fffffffc6cf, operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffc6e0)
at dl-error.c:187
#16 0x00007ffff7debda9 in _dl_open (file=0xa2f048 "/opt/Qt5.8.0/5.8/gcc_64/plugins/platformthemes/libqgtk3.so", mode=-2147479551,
caller_dlopen=0x7ffff599b7a8, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0x7fffffffd5e8) at dl-open.c:660
#17 0x00007ffff1806f09 in dlopen_doit (a=a@entry=0x7fffffffc910) at dlopen.c:66
#18 0x00007ffff7de7564 in _dl_catch_error (objname=0xa02b80, errstring=0xa02b88, mallocedp=0xa02b78,
operate=0x7ffff1806eb0 <dlopen_doit>, args=0x7fffffffc910) at dl-error.c:187
#19 0x00007ffff1807571 in _dlerror_run (operate=operate@entry=0x7ffff1806eb0 <dlopen_doit>, args=args@entry=0x7fffffffc910)
at dlerror.c:163
#20 0x00007ffff1806fa1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#21 0x00007ffff599b7a8 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#22 0x00007ffff5994fd5 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#23 0x00007ffff598a647 in QFactoryLoader::instance(int) const () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#24 0x00007ffff6b392f1 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#25 0x00007ffff6b43538 in QGuiApplicationPrivate::createPlatformIntegration() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#26 0x00007ffff6b43edd in QGuiApplicationPrivate::createEventDispatcher() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#27 0x00007ffff59a57d6 in QCoreApplicationPrivate::init() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#28 0x00007ffff6b456ab in QGuiApplicationPrivate::init() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#29 0x00007ffff6b46364 in QGuiApplication::QGuiApplication(int&, char**, int) () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#30 0x00000000005c55bd in main (argc=1, argv=0x7fffffffd5d8) at /home/mkraus/Documents/dev/star385/src/linux/ui/scan/main.cpp:35
You can find here a discussion about the same issue and they talk about an interesting workaround.
It seems that this error is caused by the library libqgtk3.so located in /opt/Qt/5.9/gcc_64/plugins/platformthemes. If you don't need it in your project you can rename/remove it to make the error go away.
If you are using CMake as a build system you also need to comment all the lines in the file /opt/Qt/5.9/gcc_64/lib/cmake/Qt5Gui/Qt5Gui_QGtk3ThemePlugin.cmake to avoid configure issues.
To add on, the real problem comes from the library libmir, which depends on libprotobuf. You may run into this problem whenever you try to use a recent tensorflow together with libgtk3.0 because of this hard dependency: libmir depends on the system libprotobuf, which is normally older than the version used by tensorflow (which downloads its own version from the repository).
The good news is that this bug in libgtk was reported and fixed. However, to use the fixed version you have to move to libgtk3.0 3.22 (see the bug report).
If you are using Qt from the Ubuntu package repository, you can remove the offending library by uninstalling qt5-gtk-platformtheme. This will remove libqgtk3.so and the corresponding CMake file without having to resort to hacks that might have unintended consequences.
As Blabdouze said, this error is caused by the libqgtk3 plugin which is used to set the GUI style. libqgtk3 uses the libmir system library, which uses protobuf 2.6.1. This leads to conflicts when the application starts.
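The version numbers in the backtrace (headerVersion=2006001, minLibraryVersion=2006000) use a packed integer format, major * 1000000 + minor * 1000 + patch. The sketch below decodes that format and shows, in simplified form, why headers from 2.6.1 are rejected by a 3.3.2 runtime. The function names are hypothetical, and the threshold 3003000 used in the usage note is an assumption for illustration, not a value taken from the protobuf sources.

```cpp
#include <string>

// Decode a packed version such as 2006001 into "2.6.1".
// Layout: major * 1000000 + minor * 1000 + patch.
std::string decode_version(int v) {
    return std::to_string(v / 1000000) + "." +
           std::to_string((v / 1000) % 1000) + "." +
           std::to_string(v % 1000);
}

// Simplified, hypothetical form of the "headers too old for library" rule:
// generated code is accepted only if the headers it was compiled against
// are at least as new as the oldest header version the runtime supports.
bool headers_compatible(int header_version, int min_header_for_library) {
    return header_version >= min_header_for_library;
}
```

With these helpers, decode_version(2006001) gives "2.6.1" and decode_version(3003002) gives "3.3.2", and headers_compatible(2006001, 3003000) is false. That mirrors the situation in the trace: libmirprotobuf.so.3 was generated against 2.6.1, but its static initializer ends up calling the VerifyVersion of the 3.3.2 runtime that is linked into the application.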
I found a workaround that allows you to avoid editing of Qt files:
Copy the "plugins" folder from ".../Qt/5.xx.xx/gcc_64/" to some other location (for example, next to the project build folder).
Remove "platformthemes/libqgtk3.so" and "platformthemes/libqgtk3.so.debug" from the copied folder.
In main(), before creating a QApplication instance, call the static function "QApplication::setLibraryPaths("path/to/copied/plugins/folder")".
Finally, add an "LD_LIBRARY_PATH" variable with the value ".../Qt/5.xx.xx/gcc_64/lib" (the correct path depends on your Qt version) to the project's "environment settings" in Qt Creator. You may also add a "QT_DEBUG_PLUGINS" variable with a value of "1". This lets you check which plugins your project uses, so that you can remove unnecessary plugins from the release version.
In conclusion, I would like to note that this error occurred when running the project on Ubuntu 16.04, but it disappeared when I switched to version 18.04. It seems that on 18.04 the app uses the default Qt style instead of the GTK style.
I'm compiling a program on Ubuntu 14.04.3. I then copy it to an Amazon AWS server running Ubuntu 14.04.2. Yet it instantly crashes with Illegal Instruction (it works on the source machine) with the following stacktrace from gdb:
Program received signal SIGILL, Illegal instruction.
...
(gdb) bt
#0 0x000000000093716b in std::vector<int, std::allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&) ()
#1 0x0000000000706581 in _GLOBAL__sub_I__ZN5abcdf6kfjg446zcadetERKSs ()
#2 0x0000000000b2abad in __libc_csu_init ()
#3 0x00007ffff7106e55 in __libc_start_main (main=0x6fa390 <main>, argc=2, argv=0x7fffffffe668,
init=0xb2ab60 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe658)
at libc-start.c:246
#4 0x0000000000708437 in _start ()
What gives? It appears they are using the same versions of libc.
Because I am you, I was able to check your compiler flags and found the following among them:
-march=native
As per this answer:
If you use -march then GCC will be free to generate instructions that work on the specified CPU, but not on (typically) earlier CPUs in the architecture family.
I went ahead and recompiled your program without -march=native and it ran on the Amazon server without a hitch. I am not sure why this ever worked before - perhaps because you switched from VirtualBox to VMWare, which upgraded the local VM's processor capabilities beyond that of the Amazon server's, which caused -march=native to start generating incompatible code.
Continuing with that answer, you can alternatively try -mtune for a safe way to optimize the program:
If you use -mtune, then the compiler will generate code that works on any of them, but will favour instruction sequences that run fastest on the specific CPU you indicated.
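To compare what the build machine and the target machine actually support, a small probe like the following can be compiled once (without -march=native) and run on both. It is only a sketch: supported_extensions is a hypothetical name, the four extensions probed are an arbitrary selection, and the __builtin_cpu_init and __builtin_cpu_supports builtins are specific to GCC and Clang on x86, so on other targets the function returns an empty list.

```cpp
#include <string>
#include <vector>

// Returns the names of the probed instruction-set extensions that the CPU
// running this process reports as supported.
std::vector<std::string> supported_extensions() {
    std::vector<std::string> found;
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // initialise the CPU feature data before querying
    if (__builtin_cpu_supports("sse4.2"))  found.push_back("sse4.2");
    if (__builtin_cpu_supports("avx"))     found.push_back("avx");
    if (__builtin_cpu_supports("avx2"))    found.push_back("avx2");
    if (__builtin_cpu_supports("avx512f")) found.push_back("avx512f");
#endif
    return found;
}
```

If the list printed on the build machine contains an entry that is missing on the server, code generated with -march=native on the build machine may use instructions the server cannot execute, which is consistent with the SIGILL above.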
Whenever I create two separate libraries with LLVM 3.0 and link them together, I always get the following stack trace on exit.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000001004b0000
#0 0x00007fff8a95cda2 in memmove$VARIANT$sse42 ()
#1 0x00000001006020a0 in llvm::PassRegistry::removeRegistrationListener ()
#2 0x00000001005fbe60 in llvm::cl::list<llvm::PassInfo const*, bool, llvm::PassNameParser>::~list ()
#3 0x00007fff8a9767c8 in __cxa_finalize ()
#4 0x00007fff8a976652 in exit ()
I am creating one shared library from the Core component and one from the Target component.
I have tried calling:
LLVMPassRegistryRef pass_registry = LLVMGetGlobalPassRegistry();
LLVMInitializeCore(pass_registry);
Any ideas on how to proceed?
I've found a simple solution in case anyone is wondering. The --enable-shared option of the configure script (disabled by default) creates an LLVM-3.X shared library. Linking against this rather than the output of llvm-config --libs core solved it.