I am compiling the unit tests (using Google Test) for my program. Whenever I compile in DEBUG mode on Solaris 11.3 with Eigen 3.2.x, I get a SIGSEGV followed by a core dump when running the program in gdb:
(gdb) r
...
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x0830fc30 in Eigen::internal::ploadu (from=0xfeffe5a0) at ./eigen/Eigen/src/Core/arch/SSE/Complex.h:307
307 { EIGEN_DEBUG_UNALIGNED_LOAD return Packet1cd(ploadu((const double*)from)); }
(gdb)
When I print from in gdb, this is what I get:
gdb p from: (const std::complex< double > *) 0xfeffe5a0
This SIGSEGV occurs only on Solaris, and only when compiling with -Og. I've compiled and tested the code on multiple other OSes with no issues whatsoever. Is this a known issue? It looks like it has to do with SSE optimizations and alignment, but I cannot pinpoint what exactly is going on.
Environment
OS: CentOS 7.9_x64
Memory, CPU, current Disk Space: Memory 96G, Disk 1T
TDengine Version: TDengine-server-2.0.20.13-Linux-x64
The TDengine taosd daemon crashes with a core dump.
gdb output:
[New LWP 5461]
[New LWP 5499]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/taosd'.
Program terminated with signal 11, Segmentation fault.
#0 0x000056308db735cf in gcBuildQueryJson (pContext=0x7fdfdc0008c0, cmd=0x7fdfe00014a0, result=0x7fdfcc048ab0, numOfRows=682) at /home/ubuntu/workroom/jenkins/TDinternal/community/src/plugins/http/src/httpGcJson.c:154
154 /home/ubuntu/workroom/jenkins/TDinternal/community/src/plugins/http/src/httpGcJson.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64
How can I resolve this?
It's a bug in TDengine-server. You don't "resolve" bugs.
You can try to figure out what the bug is (via debugging), or you can try a newer version of TDengine-server (the current one appears to be 2.2.0.2) and hope that the particular bug you've hit has been fixed.
I attached gdb to a long-running process (more than 25 hours). To manage the session, I used screen on my Ubuntu machine. I was able to get the session back, along with the gdb console. But on continuing, my process received SIGABRT and exited, followed by other exit messages.
[New LWP 122]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fe8ef29ea15 in futex_abstimed_wait_cancelable (private=0, abstime=0x7ffc8c628420, expected=0, futex_word=0x7fe8e6378640) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205 ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) c
Continuing.
(gdb) [Thread 0x7be8d8bfd700 (LWP 48) exited]
Thread 32 "my-process" received signal SIGABRT, Aborted.
[Switching to Thread 0x7be8d2bbd700 (LWP 60)]
0x00007fe8eece4428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) c
Continuing.
Couldn't get registers: No such process.
Couldn't get registers: No such process.
Couldn't get registers: No such process.
(gdb) [Thread 0x7be8b67ff700 (LWP 119) exited]
[Thread 0x7be8b49fe700 (LWP 122) exited]
...
I am not able to get the gdb console back after that, although I still see the gdb process running when I run ps -ef:
root 133 0 1 Jan14 ? 00:26:09 gdb --pid=23
How do I get the console back for this gdb process? I wanted to see the backtrace.
Or is there a better way to attach gdb to a long-running process?
I wrote a C++ program that calls some functions provided by libhdfs (the HDFS API for C++, implemented with JNI). It runs fine when executed normally. However, when I launch the program under gdb and enter the run command, it fails and I get the following error message in gdb:
[Thread debugging using libthread_db enabled]
[New Thread 0x40100940 (LWP 18482)]
[New Thread 0x40201940 (LWP 18483)]
...
[New Thread 0x41514940 (LWP 18502)]
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaac26c862 in ?? ()
I ran shell echo $CLASSPATH from within gdb, and it shows the correct HDFS-related environment.
I searched Google and Stack Overflow but did not find anything helpful.
Any tips?
The question "Why does java app crash in gdb but runs normally in real life?" provided a solution:
handle SIGSEGV nostop noprint pass
However, it is not very elegant.
I am unable to make progress in examining the core dump.
This is what I got when I typed:
gdb normal_estimation core
Reading symbols from /home/sai/Documents/pcl_learning/normal_estimation/build/normal_estimation...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 11816]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `./normal_estimation'.
Program terminated with signal 11, Segmentation fault.
#0 0xb53101d6 in free () from /lib/i386-linux-gnu/libc.so.6
(gdb)
Please let me know what I should do.
Program terminated with signal 11, Segmentation fault.
#0 0xb53101d6 in free () from /lib/i386-linux-gnu/libc.so.6
The first command you need to learn is backtrace (or its synonym: where).
This will tell you which code invoked the free, which crashed.
However, it is possible that that code has nothing to do with the actual problem: any crash in free is always caused by heap corruption of some sort (freeing un-allocated memory, freeing the same memory twice, writing to memory that has already been freed, or overflowing an allocated buffer).
The most useful tools to diagnose heap corruption on Linux are Valgrind and AddressSanitizer. Chances are either of these tools will tell you exactly what you are doing wrong.
I have this output when trying to debug:
Program received signal SIGSEGV, Segmentation fault.
0x43989029 in std::string::compare (this=0x88fd430, __str=@0xbfff9060) at /home/devsw/tmp/objdir/i686-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:253
253 { return memcmp(__s1, __s2, __n); }
Current language: auto; currently c++
Current language: auto; currently c++
Using valgrind, I get this output:
==12485== Process terminating with default action of signal 11 (SIGSEGV)
==12485== Bad permissions for mapped region at address 0x0
==12485== at 0x1: (within path_to_my_executable_file/executable_file)
You don't need to use Valgrind; what you want here is the GNU Debugger (GDB).
Run the application under gdb (gdb path_to_my_executable_file/executable_file). Provided you've compiled it with debugging information enabled (-g or -ggdb for the GNU C/C++ compilers), you can start the application with the run command at the gdb prompt and, once it hits the segfault, do a backtrace (bt) to see which part of your program called std::string::compare, where it died.
Example (C):
mctaylor@mpc:~/stackoverflow$ gcc -ggdb crash.c -o crash
mctaylor@mpc:~/stackoverflow$ gdb -q ./crash
(gdb) run
Starting program: /home/mctaylor/stackoverflow/crash
Program received signal SIGSEGV, Segmentation fault.
0x00007f78521bdeb1 in memcpy () from /lib/libc.so.6
(gdb) bt
#0 0x00007f78521bdeb1 in memcpy () from /lib/libc.so.6
#1 0x00000000004004ef in main (argc=1, argv=0x7fff3ef4d848) at crash.c:5
(gdb)
So the error I'm interested in is located on crash.c line 5.
Good luck.
Just run the app in the debugger. At one point it will die and you will have a stack trace with the information you want.