Wt Segfault on ORM Transaction Commit - c++

http://www.webtoolkit.eu/wt/doc/tutorial/dbo.html says
The complete source code for the examples used in this tutorial are available as ready-to-run programs in the examples/feature/dbo/ folder of Wt.
I'm trying to run tutorial1.C from that directory, and I get the following output:
(gdb) run
Starting program: /home/lawsa/sources/memory/dist/flashcard --docroot . --http-address 0.0.0.0 --http-port 9090
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
begin transaction
create table "user" (
"id" integer primary key autoincrement,
"version" integer not null,
"name" text not null,
"password" text not null,
"role" integer not null,
"karma" integer not null
)
commit transaction
created tables
ending transaction
Program received signal SIGSEGV, Segmentation fault.
0x000000000041a4a8 in void std::vector<Wt::Dbo::ptr_base*, std::allocator<Wt::Dbo::ptr_base*> >::emplace_back<Wt::Dbo::ptr_base*>(Wt::Dbo::ptr_base*&&) ()
(gdb) bt
#0 0x000000000041a4a8 in void std::vector<Wt::Dbo::ptr_base*, std::allocator<Wt::Dbo::ptr_base*> >::emplace_back<Wt::Dbo::ptr_base*>(Wt::Dbo::ptr_base*&&) ()
#1 0x0000000000419c8e in std::vector<Wt::Dbo::ptr_base*, std::allocator<Wt::Dbo::ptr_base*> >::push_back(Wt::Dbo::ptr_base*&&) ()
#2 0x0000000000419682 in void Wt::Dbo::Session::implSave<User>(Wt::Dbo::MetaDbo<User>&) ()
#3 0x0000000000418c4e in Wt::Dbo::MetaDbo<User>::flush() ()
#4 0x00007ffff6c8eae2 in Wt::Dbo::Session::flush() ()
from /usr/lib/libwtdbo.so.38
#5 0x00007ffff6c9d14d in Wt::Dbo::Transaction::Impl::commit() ()
from /usr/lib/libwtdbo.so.38
#6 0x00007ffff6c9d1a9 in Wt::Dbo::Transaction::commit() ()
from /usr/lib/libwtdbo.so.38
#7 0x00000000004063d2 in run() ()
#8 0x0000000000407066 in main ()
(gdb)
For your reference, here's my code: http://sprunge.us/PYSO (I hope that lasts for a while, but let me know if it stops working). And my Makefile: http://sprunge.us/UCge and I ran gdb using $ gdb --args ./flashcard --docroot . --http-address 0.0.0.0 --http-port 9090
You can see the output from line 80 but not from 83, and the backtrace from gdb suggests that line 81 (commit) is the problem. If I remove line 81 so that the transaction commits because of going out of scope, the same problem exists, but it comes from the destructor of transaction.
I'm running archlinux with Wt 3.3.4-4, gcc 5.1.0-5, compiling with -std=c++0x.
The only thing I can imagine is if there is some binary incompatibility with std::vector?

Recompiling Wt should fix the problem. Instead of using the Wt as packaged through archlinux (pacman), compile Wt from source.
There is perhaps an ABI discrepancy somewhere that should be solved by compiling everything on your own machine with the same standard and boost libraries.

Related

gdb only finds some debug symbols

So I am experiencing this really weird behavior of gdb on Linux (KDE Neon 5.20.2):
I start gdb and load my executable using the file command:
(gdb) file mumble
Reading symbols from mumble...
As you can see it did find debug symbols. Then I start my program (using start) which causes gdb to pause at the entry to the main function. At this point I can also print out the back trace using bt and it works as expected.
If I now continue my program and interrupt it at any point during startup, I can still display the backtrace without issues. However if I do something in my application that happens in another thread than the startup (which all happens in thread 1) and interrupt my program there, gdb will no longer be able to display the stacktrace properly. Instead it gives
(gdb) bt
#0 0x00007ffff5bedaff in ?? ()
#1 0x0000555556a863f0 in ?? ()
#2 0x0000555556a863f0 in ?? ()
#3 0x0000000000000004 in ?? ()
#4 0x0000000100000001 in ?? ()
#5 0x00007fffec005000 in ?? ()
#6 0x00007ffff58a81ae in ?? ()
#7 0x0000000000000000 in ?? ()
which shows that it can't find the respective debug symbols.
I compiled my application with cmake (gcc) using -DCMAKE_BUILD_TYPE=Debug. I also ensured that a bunch of debug symbols are present in the binary using objdump --debug mumble (Which also printed a few lines of objdump: Error: LEB value too large, but I'm not sure if this is related to the problem I am seeing).
While playing around with gdb, I also encountered the error
Cannot find user-level thread for LWP <SomeNumber>: generic error
a few times, which lets me suspect that maybe there is indeed some issue invloving threads here...
Finally I tried starting gdb and before loading my binary using set verbose on which yields
(gdb) set verbose on
(gdb) file mumble
Reading symbols from mumble...
Reading in symbols for /home/user/Documents/Git/mumble/src/mumble/main.cpp...done.
This does also look suspicious to me as only main.cpp is explicitly listed here (even though the project has much, much more source files). I should also note that all successful backtraces that I am able to produce (as described above) all originate from main.cpp.
I am honestly a bit clueless as to what might be the root issue here. Does someone have an idea what could be going on? Or how I could investigate further?
Note: I also tried using clang as a compiler but the result was the same.
Used program versions:
cmake: 3.18.4
gcc: 9.3.0
clang: 10.0.0
make: 4.2.1

GDB is not able to read the core file it produced

I'm debugging a SIGSEGV error on a huge application running on Yocto/ARM64 (iMX8QM).
If I run the application in GDB, I can get the backtrace:
Thread 1 "HmiAppCentral" received signal SIGSEGV, Segmentation fault.
0x0000000000b0a0d0 in kanzi::Node3D::~Node3D() ()
(gdb) bt
#0 0x0000000000b0a0d0 in kanzi::Node3D::~Node3D() ()
#1 0x0000000000cd4e44 in kanzi::Model3D::~Model3D() ()
#2 0x0000000000b09c38 in kanzi::Node3D::removeChild(unsigned long) ()
[...]
Then I export the core dump, quit GDB and restart it:
(gdb) generate-core-file
warning: target file /proc/2279/cmdline contained unexpected null characters
[...]
gdb -c core.2279
Then GDB is not able to print the backtrace anymore:
(gdb) bt full
#0 0x0000000000b0a0d0 in ?? ()
No symbol table info available.
#1 0x0000000000000001 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
The address of the first frame is correct (0x0000000000b0a0d0), however GDB is not able to find the function name when reloading the core dump. Any hint?
Just like when the OS creates a core file, the original program executable is not included in the core file itself, and it is this executable that contains the debug information (or allows GDB to find the debug information).
What this means is, if you want to debug with the debug information then you need to provide both the executable and the core file, so something like:
gdb my_program.exe -c core.pid

Gdb cannot find assertion failure positions after recompiling

It seems that gdb fails finding the code position of an assertion failure, after I recompile my code. More precisely, I expect the position of a signal raise, relative to an assertion failure, to be
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.`6
while instead I obtain
0x00007ffff7a5ff00 in ?? ()
For instance, consider the following code
#include <assert.h>
int main()
{
assert(0);
return 0;
}
compiled with debug symbols and debugged with gdb.
> gcc -g main.c
> gdb a.out
On the first run of gdb, the position is found, and the backtrace is reported correctly:
GNU gdb (Gentoo 8.0.1 p1) 8.0.1
...
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
#1 0x00007ffff7a61baa in abort () from /lib64/libc.so.6
#2 0x00007ffff7a57cb7 in ?? () from /lib64/libc.so.6
#3 0x00007ffff7a57d72 in __assert_fail () from /lib64/libc.so.6
#4 0x00005555555546b3 in main () at main.c:5
(gdb)
The problem comes when I recompile the code. After recompiling, I issue the run command in the same gdb instance. Gdb re-reads the symbols, starts the program from the beginning, but does not find the right position:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
`/home/myself/a.out' has changed; re-reading symbols.
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in ?? ()
(gdb) bt
#0 0x00007ffff7a5ff00 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) up
Initial frame selected; you cannot go up.
(gdb) n
Cannot find bounds of current function
At this point the debugger is unusable. One cannot go up, step forward.
As a workaround, I can manually reload the file, and positions are found again.
(gdb) file a.out
Load new symbol table from "a.out"? (y or n) y
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb)
Unfortunately, after reloading the file this way, gdb fails resetting the breakpoints.
ERRATA CORRIGE: I was experiencing failure in resetting the breakpoints using gdb 7.12.1. After upgrading to 8.0.1 the problem vanished. Supposedly, this was related to the bugfix https://sourceware.org/bugzilla/show_bug.cgi?id=21555. However, code positions where assertions fail still cannot be found correctly.
Does anybody have any idea about what is going on here?
This has started happening after a system update. The system update recompiled all system libraries, including the glibc, as position independent code, i.e., compiled with -fPIC.
Also, the version of the gcc I am using is 6.4.0
Here is a workaround. Since file re-reads the symbols correctly, while run does not, we can define a hook for the command run so to execute file before:
define hook-run
pi gdb.execute("file %s" % gdb.current_progspace().filename)
end
after you change the source file and recompile u are generating a different file from the one loaded to GDB.
you need to stop the running debug cession and reload the file.
you cant save the previously defined breakpoints and watch points in the file to a changed source, since gdb is actually inserting additional code to your source to support breakpoints and registrar handlers.
if you change the source the the behavior is undefined and you need to reset those breakpoints.
you can refer to gdb manual regarding saving breakpoints in a file as
Mark Plotnick suggested, but it wont work if you change the file(from my experience)
https://sourceware.org/gdb/onlinedocs/gdb/Save-Breakpoints.html

Calling a shared library from Haskell via FFI blocks, while it doesn't when linked from a C program

I'm trying to interface with a Basler USB3 camera from a Haskell application, but I'm experiencing some difficulty. The camera comes with a C++ library that makes it fairly straight forward. The following code can be used to acquire a camera source:
extern "C" {
void basler_init() {
PylonAutoInitTerm pylon;
CInstantCamera camera( CTlFactory::GetInstance().CreateFirstDevice());
camera.RegisterConfiguration( (CConfigurationEventHandler*) NULL, RegistrationMode_ReplaceAll, Cleanup_None);
cout << "Using device " << camera.GetDeviceInfo().GetModelName() << endl;
}
}
I've used this source code to build a shared library - libbasler.so. To confirm it works, here's a basic C program that links against it and confirms everything is working:
void basler_init();
int main () {
basler_init();
}
I compile and run this as:
$ gcc Test2.c -lbasler -Llib -Wl,--enable-new-dtags -Wl,-rpath,pylon5/lib64 -Wl,-E -lpylonbase -o Test2-c
$ PYLON_CAMEMU=1 LD_LIBRARY_PATH=lib ./Test2-c
Using device Emulation
This is the expected output.
However, when I try and use this with Haskell, the behaviour changes and the program blocks indefinitely. Here's the Haskell source code:
{-# LANGUAGE ForeignFunctionInterface #-}
foreign import ccall "basler_init" baslerInit :: IO ()
main :: IO ()
main = baslerInit
I compile and run this as:
$ ghc --make Test2.hs -o Test2-haskell -Llib -lbasler -optl-Wl,--enable-new-dtags -optl-Wl,-rpath,pylon5/lib64 -optl-Wl,-E -lpylonbase
$ PYLON_CAMEMU=1 LD_LIBRARY_PATH=lib ./Test2-haskell
The application now hangs indefinitely.
I have ran both through strace to try and get an idea of what is going on, but I'm unable to really make much sense of it. The output is too long to add here, but please see these two pastes:
strace output for the C application: https://gist.github.com/ocharles/001b5f42c09229bc7a8482a22cadf486
strace output for the Haskell application: https://gist.github.com/ocharles/4c1c45a9ee78f75cd723f1a2910998f3
On top of that, I've used gdb to try and ascertain where the Haskell application is getting stuck:
$ PYLON_CAMEMU=1 LD_LIBRARY_PATH=lib gdb Test2-haskell
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from Test2-haskell...done.
(gdb) run
Starting program: /home/ollie/work/circuithub/receiving-station/Test2-haskell
warning: File "/nix/store/9ljgbhb26ca0j9shwh8bwsa77h42izr2-gcc-5.4.0-lib/lib/libstdc++.so.6.0.21-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /nix/store/9ljgbhb26ca0j9shwh8bwsa77h42izr2-gcc-5.4.0-lib/lib/libstdc++.so.6.0.21-gdb.py
line to your configuration file "/home/ollie/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/ollie/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/bb32xf954imhdrzn7j8h82xs1bx7p3fr-glibc-2.23/lib/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x00007ffff6c6fb33 in __recvfrom_nocancel () from /nix/store/98s2znxww6x7h2ch7cj1w5givahxmdna-glibc-2.23/lib/libc.so.6
(gdb) bt
#0 0x00007ffff6c6fb33 in __recvfrom_nocancel () from /nix/store/98s2znxww6x7h2ch7cj1w5givahxmdna-glibc-2.23/lib/libc.so.6
#1 0x00007fffedb885c2 in GxImp::CEnumCollector::OnReady(unsigned int, _GX_SOCKET_INTERFACE_INFO const*) () from /home/ollie/work/circuithub/receiving-station/pylon5/lib64/libgxapi-5.0.1.so
#2 0x00007fffedb8d54d in CCollector::Collect(GxImp::CSocket*, unsigned int, unsigned int, _GX_SOCKET_INTERFACE_INFO const*) () from /home/ollie/work/circuithub/receiving-station/pylon5/lib64/libgxapi-5.0.1.so
#3 0x00007fffedb8817b in CBroadcastSocketCollection::Collect(CCollector&, unsigned int) () from /home/ollie/work/circuithub/receiving-station/pylon5/lib64/libgxapi-5.0.1.so
#4 0x00007fffedb889ab in Gx::Enumerator::Discover(Gx::Enumerator::Callee*, unsigned int, unsigned int, sockaddr const*) () from /home/ollie/work/circuithub/receiving-station/pylon5/lib64/libgxapi-5.0.1.so
#5 0x00007fffeddeaca0 in Pylon::CBaslerGigETl::DoDeviceEnumeration(Pylon::DeviceInfoList&, bool, sockaddr const*) () from pylon5/lib64/libpylon_TL_gige-5.0.1.so
#6 0x00007fffeddeaebc in Pylon::CBaslerGigETl::InternalEnumerateDevices(Pylon::DeviceInfoList&) () from pylon5/lib64/libpylon_TL_gige-5.0.1.so
#7 0x00007fffeddf3c99 in Pylon::CTransportLayerBase<Pylon::IGigETransportLayer>::EnumerateDevices(Pylon::DeviceInfoList&, Pylon::DeviceInfoList const&, bool) () from pylon5/lib64/libpylon_TL_gige-5.0.1.so
#8 0x00007ffff7949669 in Pylon::CTlFactory::EnumerateDevices(Pylon::DeviceInfoList&, Pylon::DeviceInfoList const&, bool) () from pylon5/lib64/libpylonbase-5.0.1.so
#9 0x00007ffff7949c8f in Pylon::CTlFactory::InternalCreateDevice(Pylon::CDeviceInfo const&, GenICam_3_0_Basler_pylon_v5_0::gcstring_vector const&, bool) () from pylon5/lib64/libpylonbase-5.0.1.so
#10 0x00007ffff794a655 in Pylon::CTlFactory::CreateFirstDevice(Pylon::CDeviceInfo const&) () from pylon5/lib64/libpylonbase-5.0.1.so
#11 0x00007ffff7bd7dc5 in basler_init () from lib/libbasler.so
#12 0x0000000000438415 in rFl_info ()
#13 0x0000000000000000 in ?? ()
For the C program:
Reading symbols from Test2-c...done.
(gdb) run
Starting program: /home/ollie/work/circuithub/receiving-station/Test2-c
warning: File "/nix/store/9ljgbhb26ca0j9shwh8bwsa77h42izr2-gcc-5.4.0-lib/lib/libstdc++.so.6.0.21-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /nix/store/9ljgbhb26ca0j9shwh8bwsa77h42izr2-gcc-5.4.0-lib/lib/libstdc++.so.6.0.21-gdb.py
line to your configuration file "/home/ollie/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/ollie/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/bb32xf954imhdrzn7j8h82xs1bx7p3fr-glibc-2.23/lib/libthread_db.so.1".
[New Thread 0x7fffed4ae700 (LWP 13792)]
[New Thread 0x7fffeccad700 (LWP 13793)]
Using device Emulation
[Thread 0x7fffeccad700 (LWP 13793) exited]
[Thread 0x7fffed4ae700 (LWP 13792) exited]
[Inferior 1 (process 13788) exited normally]
My guess is GHC's runtime is doing something that causes pthreads to have different behaviour, but I'm not sure what that could be.
I believe this TRAC commentary is relevant:
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals
The difference occurs starting a line 437 in the C strace output
vs. line 495 in the Haskell strace output.
At this point the library creates two UDP sockets and sends two UDP datagrams
(lines 448-449 C / lines 506-507 Haskell). The packets are broadcast to
two local networks: 192.168.1.0/24 and 192.168.56.0/24.
It then waits for a response on either of these sockets with a timeout of (apparently)
25 microseconds (line 450 C / line 508 Haskell). In the C case the select call
times out. In the Haskell case the select call is repeatedly interrupted by
a SIGVTALRM signal which is used by the GHC RTS. This is the same pattern
which is shown in the the above TRAC commentary.
For a potential fix, have a look at how the mysql package implements and uses the block_rts_signals() and unblock_rts_signals() macros:
mysql_signals.c

gdb doesn't display parameter info

I compiled my binaries using -g and tried to debug using gdb. I am able to debug it, but it does not show parameter values. I tried all configuration option to display parameter, it didn't help. Please let me know if I miss anything. Let me paste the command output
<code>
root> gdb /os/libexec/libmgr
GNU gdb (GDB) 7.2
This GDB was configured as "i386-linux".
Reading symbols from /os/libexec/libmgr...done.
(gdb) attach 2685
Attaching to program: /os/libexec/libmgr, process 2685
Reading symbols from /os/lib/libts.so.1.0...done.
................
0x74890047 in select () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
(gdb) bt
\#0 0x74890047 in select () at ../sysdeps/unix/syscall-template.S:82
\#1 0x0817774b in thread_fetch ()
\#2 0x0809b528 in libmLocalMain ()
\#3 0x0809b6bb in main ()
(gdb)
</code>
I ported gdb 7.8. The problem is solved without changing single line of code.