Running a DPDK application binary with all dependent libraries (compiled on a different machine) on another machine

A DPDK application with several runtime-dependent libraries is compiled on one machine.
The binaries and libraries are then copied from that machine to another machine with similar specs and environment.
The DPDK application is run with the parameters given below, but it crashes during rte_eal_init():
App-binary -l 1 -a 0000:02:00.0 -a 0000:03:00.0 -d /opt/upf/lib/ --proc-type=primary --file-prefix=.app_0000:02:00.0
This is the backtrace from the crash core file in gdb:
#0 0x00007faaa0ead337 in raise () from /lib64/libc.so.6
#1 0x00007faaa0eaea28 in abort () from /lib64/libc.so.6
#2 0x00007faaa125104f in __rte_panic () from /opt/upf/lib/librte_eal.so.21
#3 0x00007faa9e228e1c in tailqinitfn_rte_ring_tailq () from /opt/upf/lib/librte_ring.so.21.0
#4 0x00007faaa278a973 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#5 0x00007faaa278f54e in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#6 0x00007faaa278a784 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#7 0x00007faaa278eb3b in _dl_open () from /lib64/ld-linux-x86-64.so.2
#8 0x00007faaa0c73eeb in dlopen_doit () from /lib64/libdl.so.2
#9 0x00007faaa278a784 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#10 0x00007faaa0c744ed in _dlerror_run () from /lib64/libdl.so.2
#11 0x00007faaa0c73f81 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#12 0x00007faaa125bc55 in eal_plugins_init () from /opt/upf/lib/librte_eal.so.21
#13 0x00007faaa126f2ba in rte_eal_init () from /opt/upf/lib/librte_eal.so.21
#14 0x000000000041414a in Dpdk_LibTask (arg=<optimized out>) at /root/5g_upf/core/service/common/dpdk/dpdk.c:1244
#15 0x00007faaa2566e65 in start_thread () from /lib64/libpthread.so.0
#16 0x00007faaa0f7588d in clone () from /lib64/libc.so.6
Updates:
Host Machine details:
i3 8100 3.6GHz 4 Cores
8 GB RAM
CentOS 7
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
GNU ld version 2.27-44.base.el7
DPDK 20.11.0
3 NICs bound to DPDK
0000:01:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e
0000:07:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e
0000:08:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e
Target Machine details:
i3 8100 3.6GHz 4 Cores
8 GB RAM
CentOS 7
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
GNU ld version 2.27-44.base.el7
DPDK 20.11.0
3 NICs bound to DPDK
0000:02:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e
0000:03:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e
0000:04:00.0 '82574L Gigabit Network Connection 10d3' drv=uio_pci_generic unused=e1000e

[Based on the live debug with Sumesh].
Application background:
The application depends on DPDK libraries, 3rd party libraries and GNU libraries.
The actual open-source project builds DPDK 20.11.
A Docker instance is started with docker run, with root permission shared, and it copies these libraries over to a local folder (on the same machine).
With LD_LIBRARY_PATH set to the desired folder, the DPDK library dependencies are resolved.
What caused the issues:
Machine-A is used to build the DPDK 20.11 libraries.
Instead of running the Docker instance on Machine-A, Machine-B is chosen as the target machine.
The DPDK libraries are copied from Machine-A to Machine-B.
docker run is used to start the application on Machine-B.
How to fix the issue:
The DPDK libraries, once built, depend on numerous other libraries and on specific versions of them.
Build and install DPDK on the target (Machine-B); do not copy the libraries over.
In docker run, share permission to access /usr/lib/lib64, which houses the DPDK, 3rd party and GNU libraries.
Update LD_LIBRARY_PATH to point to the right folder so that the dependencies resolve.
Note:
Tested and validated on both Host and Docker with hello-world for sanity
Sumesh is updating the scripts to reflect the folder permission for the custom application.
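As a quick sanity check before launching on the target, the unresolved dependencies can be listed from ldd output. The following is a minimal sketch, not part of the original fix: the function name missing_libs and the canned sample lines are made up for illustration, and it only catches libraries that are missing outright, not ones that resolve to an incompatible version.

```shell
# Print the names of shared libraries that ldd reports as unresolved.
# Reads ldd output on stdin, so it also works on a saved log.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use on the target machine (the binary path is whatever you run):
#   LD_LIBRARY_PATH=/opt/upf/lib ldd ./App-binary | missing_libs
#
# Demonstration with canned ldd-style lines (hypothetical addresses):
printf '%s\n' \
    '    librte_eal.so.21 => /opt/upf/lib/librte_eal.so.21 (0x00007f0000000000)' \
    '    librte_ring.so.21 => not found' \
    '    libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)' | missing_libs
```

An empty result only means every name resolved somewhere on the library path; it does not prove the resolved copies match what the application was built against.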

Related

Can't see symbols from Erlang NIF library in core file

I'm working on an Erlang wrapper over a 3rd party C library on Ubuntu Linux on x86, so I'm creating a NIF. Sometimes my code (I think) crashes, resulting in a core file. Unfortunately the stacktrace is not really helpful:
(gdb) bt
#0 0x00007fc22229968a in ?? ()
#1 0x0000000060e816d8 in ?? ()
#2 0x0000000007cd48b0 in ?? ()
#3 0x00007fc228031410 in ?? ()
#4 0x00007fc228040b80 in ?? ()
#5 0x00007fc228040c50 in ?? ()
#6 0x00007fc22223de0b in ?? ()
#7 0x0000000000000000 in ?? ()
even though I built my NIF .so file with debug info:
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=b70dd1f2450f5c0e9980c8396aaad2e1cd29024c, with debug_info, not stripped
The beam also has debug info:
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e0a5dba6507b8c2b333faebc89fbc6ea2f7263b9, for GNU/Linux 3.2.0, with debug_info, not stripped
However, info sharedlibrary shows neither the NIF nor the 3rd party lib:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007fc28942ed50 0x00007fc289432004 Yes /lib/x86_64-linux-gnu/libgtk3-nocsd.so.0
0x00007fc289429220 0x00007fc28942a179 Yes /lib/x86_64-linux-gnu/libdl.so.2
0x00007fc2892e83c0 0x00007fc28938ef18 Yes /lib/x86_64-linux-gnu/libm.so.6
0x00007fc2892b76a0 0x00007fc2892c517c Yes /lib/x86_64-linux-gnu/libtinfo.so.6
0x00007fc28928dae0 0x00007fc28929d4d5 Yes /lib/x86_64-linux-gnu/libpthread.so.0
0x00007fc2890b9630 0x00007fc28922e20d Yes /lib/x86_64-linux-gnu/libc.so.6
0x00007fc289657100 0x00007fc289679674 Yes (*) /lib64/ld-linux-x86-64.so.2
0x00007fc24459c040 0x00007fc2445ab8ad Yes /home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto.so
0x00007fc2239e3000 0x00007fc223b7c800 Yes (*) /lib/x86_64-linux-gnu/libcrypto.so.1.1
0x00007fc2896500e0 0x00007fc28965028c Yes /home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto_callback.so
0x00007fc289649380 0x00007fc28964bc1c Yes /home/nar/otp/23.3.4.2/lib/asn1-5.0.15/priv/lib/asn1rt_nif.so
0x00007fc289638720 0x00007fc28963bd70 Yes /lib/x86_64-linux-gnu/librt.so.1
I found this answer mentioning that "The Erlang VM doesn't load NIF libraries with global symbols exposed". Could this be the reason why I don't see the symbols? Is there a way to tell gdb to look up symbols from my .so file?
I built the Erlang VM with debug enabled (I used kerl to build and set KERL_BUILD_DEBUG_VM to true), then started Erlang with the -debug option. This seems to have enabled some asserts in the code; they fired, and that led me to the bugs in my code. Since then I haven't had the crashes.

How to install libstdc++6 debug symbols on Ubuntu 20.04?

For example, take the following minimal example:
#include <cstdio>
#include <stdexcept>
int main(int argc, char* argv[]){
#ifdef __GLIBCPP__
std::printf("GLIBCPP: %d\n",__GLIBCPP__);
#endif
#ifdef __GLIBCXX__
std::printf("GLIBCXX: %d\n",__GLIBCXX__);
#endif
throw std::runtime_error("Were are libstdc++.so.6 debug symbols?");
return 0;
}
When running it inside my gdb, it does not show the debug symbols for libstdc++.so.6:
$ g++ -o testmain test.cpp -ggdb --std=c++98 && gdb ./testmain
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./testmain...
(gdb) r
Starting program: /home/user/Downloads/testmain
GLIBCXX: 20200408
terminate called after throwing an instance of 'std::runtime_error'
what(): Were are libstdc++.so.6 debug symbols?
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt f
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {0, 0, 0, 0, 0, 0, 0, 0, 29295, 0, 0, 0, 0, 0, 0, 0}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#1 0x00007ffff7be1859 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = ... <stderr>}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007ffff7e67951 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3 0x00007ffff7e7347c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#4 0x00007ffff7e734e7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#5 0x00007ffff7e73799 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6 0x000055555555524a in main (argc=1, argv=0x7fffffffdef8) at test.cpp:11
No locals.
(gdb)
It just shows No symbol table info available for the libstdc++.so.6 frames.
How can I show the symbols for the libstdc++.so.6?
Searching on this list https://packages.ubuntu.com/search?keywords=libstdc%2B%2B6, I already tried installing the following packages, but none of them fixed the problem:
libgcc-10-dev:amd64 <none> 10.2.0-5ubuntu1~20.0
libstdc++-10-dev:amd64 <none> 10.2.0-5ubuntu1~20.0
libstdc++6-10-dbg:amd64 <none> 10.2.0-5ubuntu1~20.0
libc6-amd64-cross:all <none> 2.31-0ubuntu7cross
linux-libc-dev-amd64-cross:all <none> 5.4.0-21.25cross
libc6-dev-amd64-cross:all <none> 2.31-0ubuntu7cross
libstdc++6-amd64-cross:all <none> 10.2.0-5ubuntu1~20.04cross
libgcc-10-dev-amd64-cross:all <none> 10.2.0-5ubuntu1~20.04cross
libstdc++-10-dev-amd64-cross:all <none> 10.2.0-5ubuntu1~20.04cross
libstdc++6-10-dbg-amd64-cross:all <none> 10.2.0-5ubuntu1~20.04cross
libx32stdc++6-10-dbg:amd64 <none> 10.2.0-5ubuntu1~20.0
Related questions:
How do you find what version of libstdc++ library is installed on your linux machine?
/usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.15' not found
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
Update 1
$ dpkg --list | grep libstdc++6
ii libstdc++6:amd64 10.2.0-5ubuntu1~20.04 amd64 GNU Standard C++ Library v3
ii libstdc++6-10-dbg-amd64-cross 10.2.0-5ubuntu1~20.04cross1 all GNU Standard C++ Library v3 (debug build) (amd64)
ii libstdc++6-7-dbg:amd64 7.5.0-6ubuntu2 amd64 GNU Standard C++ Library v3 (debug build)
ii libstdc++6-amd64-cross 10.2.0-5ubuntu1~20.04cross1 all GNU Standard C++ Library v3 (amd64)
Update 2
$ dpkg --list | grep libstdc++6
ii libstdc++6:amd64 10.2.0-5ubuntu1~20.04 amd64 GNU Standard C++ Library v3
ii libstdc++6-10-dbg:amd64 10.2.0-5ubuntu1~20.04 amd64 GNU Standard C++ Library v3 (debug build)
ii libstdc++6-10-dbg-amd64-cross 10.2.0-5ubuntu1~20.04cross1 all GNU Standard C++ Library v3 (debug build) (amd64)
ii libstdc++6-amd64-cross 10.2.0-5ubuntu1~20.04cross1 all GNU Standard C++ Library v3 (amd64)
Background Story:
A few days ago I was curious about the same question, but on CentOS.
What can I do differently after I install those missing debug info packages for gdb?
You can check that question to see what I learned while searching; I solved your question with that prior knowledge.
In short, on CentOS the only difficulty is installing the debug info packages, because gdb on CentOS tells you exactly which versions of the debug info packages you need and gives the full command:
debuginfo-install glibc-2.17-307.el7.1.x86_64 libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64
But this command does not work out of the box; you need to manually add some package sources first.
However, as soon as you succeed in installing the debug info packages, everything else is set up nicely, even the source files! You can step (s) into e.g. abort() and list the surrounding source code!
In Ubuntu:
You have to find the exact version of your libstdc++.so.xxx and install the corresponding debug info files.
No library (e.g. libstdc++) source files are installed or set up when you install the corresponding debug info packages, but you can set that up manually with set substitute-path.
Answer Part:
I made my gdb work under Ubuntu 18.04.5 LTS. I think it may apply to yours too.
I assume you know this https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html .
So first I ran ldd on my a.out:
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fbfa6f84000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbfa697b000)
...
On my Ubuntu, reading the debug symbols for libc.so.6 succeeds, so I wanted to compare the .gnu_debuglink section of both .so files.
libc.so.6 is a link to libc-2.27.so,
so I read that section with readelf -x.gnu_debuglink libc-2.27.so, which gives:
Hex dump of section '.gnu_debuglink':
0x00000000 6c696263 2d322e32 372e736f 00000000 libc-2.27.so....
0x00000010 32e033a0 2.3.
This means its debug info file's name is libc-2.27.so, which exists in /usr/lib/debug/lib/x86_64-linux-gnu directory.
Now check libstdc++.so.6, which is a link to libstdc++.so.6.0.25 in my machine.
readelf -x.gnu_debuglink libstdc++.so.6.0.25 gives:
Hex dump of section '.gnu_debuglink':
0x00000000 31313961 34346139 39373538 31313436 119a44a997581146
0x00000010 32306338 65396438 65323433 64373039 20c8e9d8e243d709
0x00000020 34663737 66362e64 65627567 00000000 4f77f6.debug....
0x00000030 30573da0 0W=.
This 119a44a99758114620c8e9d8e243d7094f77f6.debug is a build-id debug file.
Following your question and the comments below, I ran dpkg --list | grep libstdc++, which shows:
ii libstdc++-7-dev:amd64 7.5.0-3ubuntu1~18.04 amd64 GNU Standard C++ Library v3 (development files)
ii libstdc++-8-dev:amd64 8.4.0-1ubuntu1~18.04 amd64 GNU Standard C++ Library v3 (development files)
ii libstdc++6:amd64 8.4.0-1ubuntu1~18.04 amd64 GNU Standard C++ Library v3
ii libstdc++6:i386 8.4.0-1ubuntu1~18.04 i386 GNU Standard C++ Library v3
So I ran sudo apt install libstdc++6-8-dbg.
Then I used dpkg-query -L libstdc++6-8-dbg to see which files are installed by this package.
tianhe@tianhe-windy:/lib/x86_64-linux-gnu$ dpkg -L libstdc++6-8-dbg
/.
/usr
/usr/lib
/usr/lib/debug
/usr/lib/debug/.build-id
/usr/lib/debug/.build-id/f2
/usr/lib/debug/.build-id/f2/119a44a99758114620c8e9d8e243d7094f77f6.debug
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/debug
/usr/lib/x86_64-linux-gnu/debug/libstdc++.a
/usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6.0.25
/usr/lib/x86_64-linux-gnu/debug/libstdc++fs.a
/usr/share
/usr/share/doc
/usr/share/gdb
/usr/share/gdb/auto-load
/usr/share/gdb/auto-load/usr
/usr/share/gdb/auto-load/usr/lib
/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu
/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/debug
/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6.0.25-gdb.py
/usr/lib/x86_64-linux-gnu/debug/libstdc++.so
/usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6
/usr/share/doc/libstdc++6-8-dbg
And I knew I had the debug files when I saw this line:
/usr/lib/debug/.build-id/f2/119a44a99758114620c8e9d8e243d7094f77f6.debug.
Then I opened gdb again and it worked. I can now step (s) into string s = "hello";.
So try checking the things I describe above and see whether they match on your system.
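The build-id file name from the package listing above follows a fixed layout: the first two hex digits of the build-id become a subdirectory and the remaining digits become the file name. A minimal sketch of that mapping is below. The helper name build_id_debug_path is made up for illustration, and the full build-id used in the demo is an assumption reconstructed from the f2/ directory and file name in the listing.

```shell
# Map a GNU build-id (hex string) to the path where gdb looks for the
# separate debug file under the global debug directory.
build_id_debug_path() {
    id=$1
    debug_dir=${2:-/usr/lib/debug}
    # The first two hex digits name the subdirectory; the rest names the file.
    prefix=$(printf '%s' "$id" | cut -c1-2)
    rest=$(printf '%s' "$id" | cut -c3-)
    printf '%s/.build-id/%s/%s.debug\n' "$debug_dir" "$prefix" "$rest"
}

# The build-id of a library can be read with:
#   readelf -n /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep 'Build ID'
# Demo with the id assumed from the dpkg listing (f2 + 119a44...):
build_id_debug_path f2119a44a99758114620c8e9d8e243d7094f77f6
```

If the printed path does not exist on disk, the installed debug package most likely belongs to a different build of the library than the one the program actually loads.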
I followed these instructions https://www.hiroom2.com/ubuntu-2004-dbgsym-en/.
Adding the debug symbols repo:
#!/bin/sh -e
U=http://ddebs.ubuntu.com
C=$(lsb_release -cs)
cat <<EOF | sudo tee /etc/apt/sources.list.d/ddebs.list
deb ${U} ${C} main restricted universe multiverse
#deb ${U} ${C}-security main restricted universe multiverse
deb ${U} ${C}-updates main restricted universe multiverse
deb ${U} ${C}-proposed main restricted universe multiverse
EOF
wget -O - http://ddebs.ubuntu.com/dbgsym-release-key.asc | \
sudo apt-key add -
sudo apt update -y
Then install the symbols for libstdc++6:
sudo apt-get install libstdc++6-dbgsym
In addition to @Rick's answer:
On Ubuntu 20.04+, you need to install libstdc++6-dbgsym, and before that you need to add the debug symbol repo to apt.
To get the source code, you should run apt source libstdc++6, then run ./debian/rules patch as described in debian/README.source.
(Personally I feel that installing debug info and source code on Ubuntu is much more complex than on CentOS. I suggest using CentOS if you just want to have a look at libstdc++'s source code.)

How to debug DPDK libraries to diagnose segmentation fault?

I am working with DPDK version 18.11.8 stable on Linux, using a gcc x64 build.
At runtime I get a segmentation fault. Running gdb on the core dump gives this backtrace:
#0 0x0000000000f65680 in rte_eth_devices ()
#1 0x000000000048a03a in rte_eth_rx_burst (nb_pkts=7, rx_pkts=0x7fab40620480, queue_id=0, port_id=<optimized out>) at /opt/dpdk/dpdk-18.08/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:3825
#2 Socket_poll (ucRxPortId=<optimized out>, ucRxQueId=ucRxQueId@entry=0 '\000', uiMaxNumOfRxFrm=uiMaxNumOfRxFrm@entry=7, pISocketListener=pISocketListener@entry=0xf635d0 <FH_gtFrontHaulObj+16>) at /data/<snip>/SocketClass.c:2188
#3 0x000000000048b941 in FH_perform (args_ptr=<optimized out>) at /data/<snip>/FrontHaul.c:281
#4 0x00000000005788e4 in eal_thread_loop ()
#5 0x00007fab419fae65 in start_thread () from /lib64/libpthread.so.0
#6 0x00007fab4172388d in clone () from /lib64/libc.so.6
So it seems that rte_eth_rx_burst() calls rte_eth_devices () and that function crashes, presumably because of an illegal memory access. Possibly a hugepages problem?
I want to enable more debug info in DPDK. I am building DPDK using:
usertools/dpdk-setup.sh
Am I correct in thinking that the build commands in that script use make and I should modify the appropriate:
config/defconfig_*
file (defconfig_x86_64-native-linuxapp-gcc in my case) ?
If so, would these values be appropriate?
CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=y
RTE_LOG_LEVEL=RTE_LOG_DEBUG
RTE_LIBRTE_ETHDEV_DEBUG=y
(not sure whether all values should be prefixed by 'CONFIG_'?)
I tried building DPDK using:
$ export EXTRA_CFLAGS='-O0 -g'
$ make install T=x86_64-native-linuxapp-gcc
but that gave no extra info in the backtrace.
EDIT: the error has been identified and fixed; the application now runs without crashing.
Using the dpdk-debug chat room, we were able to rebuild the libraries and the application with the proper CFLAGS. Using gdb, we identified the probable cause: rte_eth_rx_burst was not being passed a pointer array for the mbufs.
Based on the GDB details for frame 1, it looks like the application was not built with EXTRA_CFLAGS (assuming you are using a DPDK example Makefile). The right way to build a DPDK application for debugging is to follow these steps:
cd [dpdk target folder]
make clean
make EXTRA_CFLAGS='-O0 -ggdb'
cd [application folder]
make EXTRA_CFLAGS='-O0 -ggdb'
Then use GDB in TUI or non-TUI mode to analyze the error.
Note:
One of the most common mistakes I make with rx_burst is passing *mbuf_array instead of **mbuf_array as the argument.
If a custom Makefile is used for the application, pass the flags as CFLAGS+="-O0 -ggdb".

Protobuf version conflicts with Qt

I'm trying to use protobufs v 3.3.2 with Qt 5.9.1. This works with some Qt applications, but only if they are command line programs. Once I create a GUI application with Qt and protobufs, I get this error:
[libprotobuf FATAL
/home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:78]
This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed
version (3.3.2). Contact the program author for an update. If you
compiled the program yourself, make sure that your headers are from
the same version of Protocol Buffers as your link-time library.
(Version verification failed in
"/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
I should clarify that my part of the code is certainly using version 3.3.2 (I'm downloading and compiling protobufs from the git sources and statically linking). Look at the stack trace below to see that something that Qt is referencing is causing a protobuf version mismatch.
I'm developing on Ubuntu 16.04 and using the default desktop environment (Unity).
Work-Arounds
My troubleshooting has revealed these symptoms and work-arounds:
Use KDE / KUbuntu. Changing the desktop environment when logging in completely avoids the version mismatch issue.
Run the Qt application with -platform eglfs. This runs the application in full-screen mode using OpenGL. The program runs, but the window size is incorrect. When using the -platform eglfs option, it works even in Unity, but without this option, it gives me the above error.
Any Qt application that is a command-line only application (using QCoreApplication instead of QGuiApplication) can use protobufs 3.3.2. Changing the same app to use a GUI causes the version mismatch issue.
Questions
How can I use protobufs 3.3.2 with Qt GUI applications, and also not be dependent on what desktop environment is in use? Is it Qt that is using the version 2.6.1 of protobufs, and if so, is it feasible to compile Qt to use protobufs 3.3.2?
Debug Info
Here is a stack trace (the program crashes almost immediately upon starting):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.3.2). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
Thread 1 "scan" received signal SIGABRT, Aborted.
0x00007ffff4dff428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff4dff428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007ffff4e0102a in __GI_abort () at abort.c:89
#2 0x00007ffff543984d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff54376b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff5437701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff5437919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x0000000000603e0a in google::protobuf::internal::LogMessage::Finish (this=0x7fffffffc250)
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:268
#7 0x0000000000603e5a in google::protobuf::internal::LogFinisher::operator= (this=0x7fffffffc20f, other=...)
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:276
#8 0x0000000000603171 in google::protobuf::internal::VerifyVersion (headerVersion=2006001, minLibraryVersion=2006000,
filename=0x7fffde80aec0 "/build/mir-ui6vjS/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc")
at /home/mkraus/Documents/dev/star385/build/linux-desktop-debug-libs/protobuf/src/src/google/protobuf/stubs/common.cc:86
#9 0x00007fffde7d490b in mir::protobuf::protobuf_AddDesc_mir_5fprotobuf_2eproto() ()
from /usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3
#10 0x00007fffde7d2409 in ?? () from /usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3
#11 0x00007ffff7de76ba in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffd5d8,
env=env@entry=0x7fffffffd5e8) at dl-init.c:72
#12 0x00007ffff7de77cb in call_init (env=0x7fffffffd5e8, argv=0x7fffffffd5d8, argc=1, l=<optimized out>) at dl-init.c:30
#13 _dl_init (main_map=main_map@entry=0xa2f450, argc=1, argv=0x7fffffffd5d8, env=0x7fffffffd5e8) at dl-init.c:120
#14 0x00007ffff7dec8e2 in dl_open_worker (a=a@entry=0x7fffffffc6e0) at dl-open.c:575
#15 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffc6d0, errstring=errstring@entry=0x7fffffffc6d8,
mallocedp=mallocedp@entry=0x7fffffffc6cf, operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffc6e0)
at dl-error.c:187
#16 0x00007ffff7debda9 in _dl_open (file=0xa2f048 "/opt/Qt5.8.0/5.8/gcc_64/plugins/platformthemes/libqgtk3.so", mode=-2147479551,
caller_dlopen=0x7ffff599b7a8, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0x7fffffffd5e8) at dl-open.c:660
#17 0x00007ffff1806f09 in dlopen_doit (a=a@entry=0x7fffffffc910) at dlopen.c:66
#18 0x00007ffff7de7564 in _dl_catch_error (objname=0xa02b80, errstring=0xa02b88, mallocedp=0xa02b78,
operate=0x7ffff1806eb0 <dlopen_doit>, args=0x7fffffffc910) at dl-error.c:187
#19 0x00007ffff1807571 in _dlerror_run (operate=operate@entry=0x7ffff1806eb0 <dlopen_doit>, args=args@entry=0x7fffffffc910)
at dlerror.c:163
#20 0x00007ffff1806fa1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#21 0x00007ffff599b7a8 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#22 0x00007ffff5994fd5 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#23 0x00007ffff598a647 in QFactoryLoader::instance(int) const () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#24 0x00007ffff6b392f1 in ?? () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#25 0x00007ffff6b43538 in QGuiApplicationPrivate::createPlatformIntegration() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#26 0x00007ffff6b43edd in QGuiApplicationPrivate::createEventDispatcher() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#27 0x00007ffff59a57d6 in QCoreApplicationPrivate::init() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Core.so.5
#28 0x00007ffff6b456ab in QGuiApplicationPrivate::init() () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#29 0x00007ffff6b46364 in QGuiApplication::QGuiApplication(int&, char**, int) () from /opt/Qt5.8.0/5.8/gcc_64/lib/libQt5Gui.so.5
#30 0x00000000005c55bd in main (argc=1, argv=0x7fffffffd5d8) at /home/mkraus/Documents/dev/star385/src/linux/ui/scan/main.cpp:35
You can find here a discussion about the same issue and they talk about an interesting workaround.
It seems that this error is caused by the library libqgtk3.so located in /opt/Qt/5.9/gcc_64/plugins/platformthemes. If you don't need it in your project you can rename/remove it to make the error go away.
If you are using CMake as a build system you also need to comment all the lines in the file /opt/Qt/5.9/gcc_64/lib/cmake/Qt5Gui/Qt5Gui_QGtk3ThemePlugin.cmake to avoid configure issues.
To add on, the real problem comes from the library libmir, which depends on libprotobuf. You may run into this problem whenever you try to use a recent tensorflow with libgtk3.0 because of this hard dependency: libmir depends on the system libprotobuf, which is normally behind the version used by tensorflow (which downloads its own version from the repository).
The good news is that this bug in libgtk was reported and fixed; however, to use the fixed version you have to move to libgtk3.0 3.22 (see the bug report).
If you are using Qt from the Ubuntu package repository, you can remove the offending library by uninstalling qt5-gtk-platformtheme. This will remove libqgtk3.so and the corresponding CMake file without having to resort to hacks that might have unintended consequences.
As Blabdouze said, this error is caused by the libqgtk3 plugin which is used to set the GUI style. libqgtk3 uses the libmir system library, which uses protobuf 2.6.1. This leads to conflicts when the application starts.
I found a workaround that allows you to avoid editing of Qt files:
You need to copy the "plugins" folder from ".../Qt/5.хх.хх/gcc_64/" to some other location (for example, next to the project build folder).
Then you must remove "platformthemes/libqgtk3.so" and "platformthemes/libqgtk3.so.debug" from the copied folder.
In main(), before creating a QApplication instance, call the static function "QApplication::setLibraryPaths("path/to/copied/plugins/folder")".
Finally, you must add variable "LD_LIBRARY_PATH" with the value ".../Qt/5.хх.хх/gcc_64/lib" (correct path will depend on your Qt version) in a project's "environment settings" in Qt Creator. You also may add a "QT_DEBUG_PLUGINS" variable with a value of "1". It will allow you to check which plugins are used by your project and remove unnecessary plugins from the release version.
In conclusion I would like to note that this error occurred when running the project in Ubuntu 16.04, but it disappeared when I switched to version 18.04. It seems that in version 18.04 app uses the default Qt style instead of the GTK style.

Program compiled on a VMWare machine crashes with illegal instruction when run on an Amazon server

I'm compiling a program on Ubuntu 14.04.3. I then copy it to an Amazon AWS server running Ubuntu 14.04.2. Yet it instantly crashes with Illegal Instruction (it works on the source machine) with the following stacktrace from gdb:
Program received signal SIGILL, Illegal instruction.
...
(gdb) bt
#0 0x000000000093716b in std::vector<int, std::allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&) ()
#1 0x0000000000706581 in _GLOBAL__sub_I__ZN5abcdf6kfjg446zcadetERKSs ()
#2 0x0000000000b2abad in __libc_csu_init ()
#3 0x00007ffff7106e55 in __libc_start_main (main=0x6fa390 <main>, argc=2, argv=0x7fffffffe668,
init=0xb2ab60 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe658)
at libc-start.c:246
#4 0x0000000000708437 in _start ()
What gives? It appears they are using the same versions of libc.
Because I am you, I was able to check your compiler flags and found the following among them:
-march=native
As per this answer:
If you use -march then GCC will be free to generate instructions that work on the specified CPU, but not on (typically) earlier CPUs in the architecture family.
I went ahead and recompiled your program without -march=native and it ran on the Amazon server without a hitch. I am not sure why this ever worked before - perhaps because you switched from VirtualBox to VMWare, which upgraded the local VM's processor capabilities beyond that of the Amazon server's, which caused -march=native to start generating incompatible code.
Continuing with that answer, you can alternatively try -mtune for a safe way to optimize the program:
If you use -mtune, then the compiler will generate code that works on any of them, but will favour instruction sequences that run fastest on the specific CPU you indicated.
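To see whether -march=native could have picked instructions the target lacks, one option is to compare the CPU feature flags of the build machine and the target. The sketch below is illustrative only: the function name flags_only_on_build_host and the sample flag lists are hypothetical, and on a real system each list would come from the flags line of /proc/cpuinfo on the respective machine.

```shell
# Print CPU feature flags present in the first list but absent from the second.
# Each argument is a space-separated flag list, for example the output of
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
# captured on the build machine and on the target machine respectively.
flags_only_on_build_host() {
    build_flags=$1
    target_flags=$2
    for f in $build_flags; do
        case " $target_flags " in
            *" $f "*) ;;                 # target also has this feature
            *) printf '%s\n' "$f" ;;     # feature missing on the target
        esac
    done
}

# Hypothetical flag lists, for illustration only.
flags_only_on_build_host "sse2 avx avx2 bmi2" "sse2 avx"
```

Any flag printed here names an instruction set extension that code built with -march=native on the build machine may use, and that the target would reject with SIGILL.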