Using libc++ causes GDB to segfault on OS X - c++

I'm trying to use C++11 (with Clang and libc++ on OS X) for a program, but whenever I debug with gdb and try to inspect standard containers, gdb segfaults. Here's a minimal example:
file.cpp:
#include <iostream>
#include <string>
int main(int argc, char* argv[])
{
std::string str = "Hello world";
std::cout << str << std::endl; // Breakpoint here
}
If I compile for C++11 using the following:
$ c++ --version
Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
Target: x86_64-apple-darwin12.4.0
Thread model: posix
$
$ c++ -ggdb -std=c++11 -stdlib=libc++ -Wall -pedantic -O0 -c file.cpp
$ c++ -ggdb -std=c++11 -stdlib=libc++ -Wall -pedantic -O0 file.o -o program
And then debug as follows, it crashes when I try to p str.size():
$ gdb program
GNU gdb 6.3.50-20050815 (Apple version gdb-1824) (Wed Feb 6 22:51:23 UTC 2013)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done
(gdb) br file.cpp:8
Breakpoint 1 at 0x100000d80: file file.cpp, line 8.
(gdb) run
Starting program: /Users/mjbshaw/School/cs6640/2/program
Reading symbols for shared libraries ++............................. done
Breakpoint 1, main (argc=1, argv=0x7fff5fbffab0) at file.cpp:8
8 std::cout << str << std::endl; // Breakpoint here
(gdb) p str.size()
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
std::__1::operator<< <char, std::__1::char_traits<char>, std::__1::allocator<char> > (__os=#0x7fff5fc3d628, __str=#0x1) at string:1243
1243
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long() const) will be abandoned.
If I don't run this in gdb, I get no crash and it works fine (but I need gdb to debug my program). Also, if I remove -std=c++11 -stdlib=libc++ from the compiling options, then it works fine (even in gdb), but I need C++11 for my program.
Are there some known issues with gdb and C++11 (specifically libc++)? I know libc++ and libstdc++ can cause issues if used together, but I'm not trying to use them together (at least not consciously; all I want to use is libc++). Am I specifying some compilation options wrong? Is there a way to properly compile for C++11 on OS X and still be able to debug properly?

GDB 6.3 is almost nine years old. That's just about eternity in Internet years. The product has improved greatly since then. Updating to the last stable release is a must for every developer.

Related

How do you find out the cause of rare crashes that are caused by things that are not caught by try catch (access violation, divide by zero, etc.)?

I am a .NET programmer who is starting to dabble into C++. In C# I would put the root function in a try catch, this way I would catch all exceptions, save the stack trace, and this way I would know what caused the exception, significantly reducing the time spent debugging.
But in C++ some stuff(access violation, divide by zero, etc.) are not caught by try catch. How do you deal with them, how do you know which line of code caused the error?
For example let's assume we have a program that has 1 million lines of code. It's running 24/7, has no user-interaction. Once in a month it crashes because of something that is not caught by try catch. How do you find out which line of code caused the crash?
Environment: Windows 10, MSVC.
C++ is meant to be a high performance language and checks are expensive. You can't run at C++ speeds and at the same time have all sorts of checks. It is by design.
Running .Net this way is akin to running C++ in debug mode with sanitizers on. So if you want to run your application with all the information you can, turn on debug mode in your cmake build and add sanitizers, at least undefined and address sanitizers.
For Windows/MSVC it seems that address sanitizers were just added in 2021. You can check the announcement here: https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-with-msvc/
For Windows/mingw or Linux/* you can use Gcc and Clang's builtin sanitizers that have largely the same usage/syntax.
To set your build to debug mode:
cd <builddir>
cmake -DCMAKE_BUILD_TYPE=debug <sourcedir>
To enable sanitizers, add this to your compiler command line: -fsanitize=address,undefined
One way to do that is to add it to your cmake build so altogether it becomes:
cmake -DCMAKE_BUILD_TYPE=debug \
-DCMAKE_CXX_FLAGS_DEBUG_INIT="-fsanitize=address,undefined" \
<sourcedir>
Then run your application binary normally as you do. When an issue is found a meaningful message will be printed along with a very informative stack trace.
Alternatively you can set so the sanitizer breaks inside the debugger (gdb) so you can inspect it live but that only works with the undefined sanitizer. To do so, replace
-fsanitize=address,undefined
with
-fsanitize-undefined-trap-on-error -fsanitize-trap=undefined -fsanitize=address
For example, this code has a clear problem:
void doit( int* p ) {
*p = 10;
}
int main() {
int* ptr = nullptr;
doit(ptr);
}
Compile it in the optimized way and you get:
$ g++ -O3 test.cpp -o test
$ ./test
Segmentation fault (core dumped)
Not very informative. You can try to run it inside the debugger but no symbols are there to see.
$ g++ -O3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) r
Starting program: /tmp/test
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555044 in main ()
(gdb)
That's useless so we can turn on debug symbols with
$ g++ -g3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
...
Reading symbols from ./test...
(gdb) r
Starting program: /tmp/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
test.cpp:4:5: runtime error: store to null pointer of type 'int'
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555259 in doit (p=0x0) at test.cpp:4
4 *p = 10;
Then you can inspect inside:
(gdb) p p
$1 = (int *) 0x0
Now, turn on sanitizers to get even more messages without the debugger:
$ g++ -O0 -g3 test.cpp -fsanitize=address,undefined -o test
$ ./test
test.cpp:4:5: runtime error: store to null pointer of type 'int'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==931717==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x563b7b66c259 bp 0x7fffd167c240 sp 0x7fffd167c230 T0)
==931717==The signal is caused by a WRITE memory access.
==931717==Hint: address points to the zero page.
#0 0x563b7b66c258 in doit(int*) /tmp/test.cpp:4
#1 0x563b7b66c281 in main /tmp/test.cpp:9
#2 0x7f36164a9082 in __libc_start_main ../csu/libc-start.c:308
#3 0x563b7b66c12d in _start (/tmp/test+0x112d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/test.cpp:4 in doit(int*)
==931717==ABORTING
That is much better!

"Python Exception <class 'gdb.error'> There is no member named _M_dataplus." when trying to print string

I'm trying to debug a segfault in a homework program and I've discovered that my GDB can no longer even print std::strings. How can I fix it?
I'm on Ubuntu 18.04 LTS.
CLang++ version:
$ clang++ --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
GDB version:
$ gdb --version
GNU gdb (Ubuntu 8.1-0ubuntu3.1) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
I've writen a small test program called gdbbroke.cpp:
#include <string>
int main()
{
std::string test = "sanity check";
return 0;
}
gdbbroke$ clang++ -o gdbbroke gdbbroke.cpp -std=c++11 -Wall -Wextra -Wpedantic -Wconversion -Wnon-virtual-d
tor -ggdb
gdbbroke$ gdb ./gdbbroke
[...]
Reading symbols from ./gdbbroke...done.
(gdb) break main()
Breakpoint 1 at 0x4007a3: file gdbbroke.cpp, line 5.
(gdb) run
Starting program: gdbbroke/gdbbroke
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main () at gdbbroke.cpp:5
5 std::string test = "sanity check";
(gdb) print test
$1 = Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
(gdb)
I expected print test to output:
(gdb) print test
$1 = "sanity check"
However it just throws the Python error above.
With Clang, to print the string you need debug symbols of libstdc++ to be installed.
See this Clang bug resolved as INVALID https://bugs.llvm.org/show_bug.cgi?id=24202.
The string should be printed if you install libstdc++ debug symbols. On the other hand you can simply use GCC instead of Clang. In that case you don't need to install libstdc++ debug symbols because GCC already emits them. Clang does not emit them because it does debug information optimization while GCC does not do it. See also related question Cannot view std::string when compiled with clang.

Debug symbols not included in gcc-compiled C++

I am building a module in C++ to be used in Python. My flow is three steps: I compile the individual C++ sources into objects, create a library, and then run a setup.py script to compile the .pyx->.cpp->.so, while referring to the library I just created.
I know I can just do everything in one step with the Cython setup.py, and that is what I used to do. The reason for splitting it into multiple steps is I'd like the C++ code to evolve on its own, in which case I would just use the compiled library in Cython/python.
So this flow works fine, when there are no bugs. The issue is I am trying to find the source of a segfault, so I'd like to get the debugging symbols so that I can run with gdb (which I installed on OSX 10.14, it was a pain but it worked).
I have a makefile, which does the following.
Step 1: Compile individual C++ source files
All the files are compiled with the bare minimum flags, but -g is there:
gcc -mmacosx-version-min=10.7 -stdlib=libc++ -std=c++14 -c -g -O0 -I ./csrc -o /Users/colinww/system-model/build/data_buffer.o csrc/data_buffer.cpp
I think even here there is a problem: when I do nm -pa data_buffer.o, I see no debug symbols. Furthermore, I get:
(base) cmac-2:system-model colinww$ dsymutil build/data_buffer.o
warning: no debug symbols in executable (-arch x86_64)
Step 2: Compile cython sources
The makefile has the line
cd $(CSRC_DIR) && CC=$(CC) CXX=$(CXX) python3 setup_csrc.py build_ext --build-lib $(BUILD)
The relevant parts of setup.py are
....
....
....
compile_args = ['-stdlib=libc++', '-std=c++14', '-O0', '-g']
link_args = ['-stdlib=libc++', '-g']
....
....
....
Extension("circbuf",
["circbuf.pyx"],
language="c++",
libraries=["cpysim"],
include_dirs = ['../build'],
library_dirs=['../build'],
extra_compile_args=compile_args,
extra_link_args=link_args),
....
....
....
ext = cythonize(extensions,
gdb_debug=True,
compiler_directives={'language_level': '3'})
setup(ext_modules=ext,
cmdclass={'build_ext': build_ext},
include_dirs=[np.get_include()])
When this is run, it generates a bunch of compilation/linking commands like
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/colinww/anaconda3/include -arch x86_64 -I/Users/colinww/anaconda3/include -arch x86_64 -I. -I../build -I/Users/colinww/anaconda3/lib/python3.7/site-packages/numpy/core/include -I/Users/colinww/anaconda3/include/python3.7m -c circbuf.cpp -o build/temp.macosx-10.7-x86_64-3.7/circbuf.o -stdlib=libc++ -std=c++14 -O0 -g
and
g++ -bundle -undefined dynamic_lookup -L/Users/colinww/anaconda3/lib -arch x86_64 -L/Users/colinww/anaconda3/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.7-x86_64-3.7/circbuf.o -L../build -lcpysim -o /Users/colinww/system-model/build/circbuf.cpython-37m-darwin.so -stdlib=libc++ -g
In both commands, the -g flag is present.
Step 3: Run debugger
Finally, I run my program with gdb
(base) cmac-2:sim colinww$ gdb python3
(gdb) run system_sim.py
It dumps out a ton of stuff related to system files (seems unrelated) and finally runs my program, and when it segfaults:
Thread 2 received signal SIGSEGV, Segmentation fault.
0x0000000a4585469e in cpysim::DataBuffer<double>::Write(long, long, double) () from /Users/colinww/system-model/build/circbuf.cpython-37m-darwin.so
(gdb) info local
No symbol table info available.
(gdb) where
#0 0x0000000a4585469e in cpysim::DataBuffer<double>::Write(long, long, double) () from /Users/colinww/system-model/build/circbuf.cpython-37m-darwin.so
#1 0x0000000a458d6276 in cpysim::ChannelFilter::Filter(long, long, long) () from /Users/colinww/system-model/build/chfilt.cpython-37m-darwin.so
#2 0x0000000a458b0d29 in __pyx_pf_6chfilt_6ChFilt_4filter(__pyx_obj_6chfilt_ChFilt*, long, long, long) () from /Users/colinww/system-model/build/chfilt.cpython-37m-darwin.so
#3 0x0000000a458b0144 in __pyx_pw_6chfilt_6ChFilt_5filter(_object*, _object*, _object*) () from /Users/colinww/system-model/build/chfilt.cpython-37m-darwin.so
#4 0x000000010002f1b8 in _PyMethodDef_RawFastCallKeywords ()
#5 0x000000010003be64 in _PyMethodDescr_FastCallKeywords ()
As I mentioned above, I think the problem starts in the initial compilation step. This has nothing to do with cython, I'm just calling gcc from the command line, passing the -g flag.
(base) cmac-2:system-model colinww$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Any help is appreciated, thank you!
UPDATE
I removed the gcc tag and changed it to clang. So I guess now I'm confused, if Apple will alias gcc to clang, doesn't that imply that in "that mode" it should behave like gcc (and, implied, someone made sure it was so).
UPDATE 2
So, I never could get the debug symbols to appear in the debugger, and had to resort to lots of interesting if-printf statements, but the problem was due to an index variable becoming undefined. So thanks for all the suggestions, but the problem is more or less resolved (until next time). Thanks!
The macOS linker doesn't link debug information into the final binary the way linkers usually do on other Unixen. Rather it leaves the debug information in the .o files and writes a "debug map" into the binary that tells the debugger how to find and link up the debug info read from the .o files. The debug map is stripped when you strip your binary.
So you have to make sure that you don't move or delete your .o files after the final link, and that you don't strip the binary you are debugging before debugging it.
You can check for the presence of the debug map by doing:
$ nm -ap <PATH_TO_BINARY> | grep OSO
You should see output like:
000000005d152f51 - 03 0001 OSO /Path/To/Build/Folder/SomeFile.o
If you don't see that the executable probably got stripped. If the .o file is no around then somebody cleaned your build folder earlier than they should have.
I also don't know if gdb knows how to read the debug map on macOS. If the debug map entries and the .o files are present, you might try lldb and see if that can find the debug info. If it can, then this is likely a gdb-on-macOS problem. If the OSO's and the .o files are all present, then something I can't guess went wrong, and it might be worth filing a bug with http://bugreporter.apple.com.

Why doesn't SIGSEGV crash the process?

I'm trying to implement breakpad to get crash reports and stack traces for our cross-platform Qt application. I think I implemented all the necessary code, but I can't get the application to crash reliably on Windows.
I use MinGW gcc compiler and Qt.
I created a button in the UI.
void crash() {
printf(NULL);
int* x = 0;
*x = 1;
int a = 1/0;
}
/* .... */
connect(ui->btnCrash, SIGNAL(clicked()),this,SLOT(crash()));
When clicking the button, nothing really happens. However when running in debug mode, the debugger (gdb) detects a SIGSEGV on first function call and then abandons running the rest of the method. I notice the same behavior when deliberately doing illegal stuff in other places in the code. This leads to unexpected/undefined behavior.
Now this behavior is different from Linux, where when calling this crash(), the process is properly crashed, and a dump is created.
So what's the difference ? How can I have the same behavior across platforms ?
Here is source for a minimal console program that attempts
to dereference a null pointer
main.c
#include <stdio.h>
int shoot_my_foot() {
int* x = 0;
return *x;
}
int main()
{
int i = shoot_my_foot();
printf("%d\n",i);
return 0;
}
I'll compile and run it on (Ubuntu 18.04) Linux:
$ gcc -Wall -Wextra -o prog main.c
$ ./prog
Segmentation fault (core dumped)
What was the system return code?
$ echo $?
139
When a program is killed for a fatal signal, Linux returns 128 + the signal number to the caller. So
that was 128 + 11, i.e. 128 + SIGSEGV.
That is what happens, on Linux, when a program tries to dereference a null pointer.
This is what Linux did to the misbehaving program: it killed it and returned us
128 + SIGSEGV. It is not what the program did: it does not handle any signals.
Now I'll hop into a Windows 10 VM and compile and run the same program with the
Microsoft C compiler:
>cl /Feprog /W4 main.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25547 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
main.c
Microsoft (R) Incremental Linker Version 14.11.25547.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:prog.exe
main.obj
>prog
>
Nothing. So the program crashed, and:
>echo %errorlevel%
-1073741819
The system return code was -1073741819, which is the signed integral value
of 0xc0000005, the famous Windows error code that means Access Violation.
Still in Windows, I'll now compile and run the program with GCC:
>gcc --version
gcc (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 7.2.0
>gcc -Wall -Wextra -o prog.exe main.c
>prog
>echo %errorlevel%
-1073741819
As before, the program crashed, system code 0xc0000005.
One more time from the top:
>gcc -Wall -Wextra -o prog.exe main.c
>prog
>echo %errorlevel%
-1073741819
No change.
That is what happens, on Windows, when a program tries to dereference a null pointer.
That is what Windows does to the misbehaving program: it kills it and returns us
0xc0000005.
There is nothing about the misbehaving C program we can thank for the fact
Windows does the same thing with it whether we compile with it MinGW-W64 gcc
or MS cl. And there is nothing about it we can blame for the fact that Windows
does not do the same thing with it as Linux.
Indeed, there is nothing about it we can even thank for the fact that the same thing happened
to the misbehaving program, compiled with GCC, both times when we just ran it. Because the C
(or C++) Standard does not promise that dereferencing a null pointer will cause SIGSEGV to
be raised (or that division by 0 will cause SIGFPE, and so on). It just promises that
this operation results in undefined behaviour, including possibly causing SIGSEGV
when the program is run under gdb, on Tuesdays, and otherwise not.
As a matter of fact, the program does cause a SIGSEGV in all three of our
compilation scenarios, as we can observe by giving the program a handler for that
signal:
main_1.c
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <assert.h>
static void handler(int sig)
{
assert(sig == SIGSEGV);
fputs("Caught SIGSEGV\n", stderr);
exit(128 + SIGSEGV);
}
int shoot_my_foot(void) {
int* x = 0;
return *x;
}
int main(void)
{
int i;
signal(SIGSEGV, handler);
i = shoot_my_foot();
printf("%d\n",i);
return 0;
}
On Linux:
$ gcc -Wall -Wextra -o prog main_1.c
$ ./prog
Caught SIGSEGV
$ echo $?
139
On Windows, with MinGW-W64gcc`:
>gcc -Wall -Wextra -o prog.exe main_1.c
>prog
Caught SIGSEGV
>echo %errorlevel%
139
On Windows, with MS cl:
>cl /Feprog /W4 main_1.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25547 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
main_1.c
Microsoft (R) Incremental Linker Version 14.11.25547.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:prog.exe
main_1.obj
>prog
Caught SIGSEGV
>echo %errorlevel%
139
That consistent behaviour is different from what we'd observe with the
the original program under gdb:
>gcc -Wall -Wextra -g -o prog.exe main.c
>gdb -ex run prog.exe
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from prog.exe...done.
Starting program: C:\develop\so\scrap\prog.exe
[New Thread 6084.0x1e98]
[New Thread 6084.0x27b8]
Thread 1 received signal SIGSEGV, Segmentation fault.
0x0000000000401584 in shoot_my_foot () at main.c:5
5 return *x;
(gdb)
The reason for that is that gdb by default installs signal handlers for
all fatal signals, and the behaviour of its SIGSEGV handler is to output
the like of:
Thread 1 received signal SIGSEGV, Segmentation fault.
0x0000000000401584 in shoot_my_foot () at main.c:5
5 return *x;
and drop to the gdb prompt, unlike the behaviour of the SIGSEGV handler we installed in
main_1.c.
So there you have an answer to the question:
How can I have the same behavior across platforms ?
that in practice is as good as it gets:-
You can handle signals in your program, and confine your signal handlers to
code whose behaviour is the same across platforms, within your preferred
meaning of the same.
And this answer is only as good as it gets, in practice, because in principle,
per the language Standard, you cannot depend upon an operation that causes undefined
behaviour to raise any specific signal, or have any specific or even consistent outcome.
If it is in fact your objective to implement consistent cross-platform handling of
fatal signals, then the appropriate function call to provoke signal sig for your testing
purposes is provided by the standard header <signal.h> (in C++,
<csignal>):
int raise( int sig )
Sends signal sig to the program.
Your code has undefined behaviour in
*x = 1;
because you shall not dereference a null pointer. Actually I am not so certain about dividing by zero, but once you got off the rails all bets are off anyhow.
If you want to signal a SIGSEGV then do that, but dont use undefined behaviour that may cause you code to do anything. You should not expect your code to have any output but rather fix it ;).

C++0x thread static linking problem

I am having some issues trying to statically link programs using c++0x thread features. Code looks this: (Compiler is gcc 4.6.1 on Debian x86_64 testing)
#include <iostream>
#include <thread>
static void foo() {
std::cout << "FOO BAR\n";
}
int main() {
std::thread t(foo);
t.join();
return 0;
}
I link it with:
g++ -static -pthread -o t-static t.cpp -std=c++0x
When I execute the program, I have the following error:
terminate called after throwing an instance of 'std::system_error'
what(): Operation not permitted
Aborted
GDB Debug output looks like this:
Debugger finished
Current directory is ~/testspace/thread/
GNU gdb (GDB) 7.2-debian
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/will/testspace/thread/t-static...done.
(gdb) list -
1 #include <iostream>
(gdb) b 1
Breakpoint 1 at 0x4007c8: file t.cpp, line 1.
(gdb) r
Starting program: /home/will/testspace/thread/t-static
terminate called after throwing an instance of 'std::system_error'
what(): Operation not permitted
Program received signal SIGABRT, Aborted.
0x00000000004a8e65 in raise ()
(gdb) bt
#0 0x00000000004a8e65 in raise ()
#1 0x000000000045df90 in abort ()
#2 0x000000000044570d in __gnu_cxx::__verbose_terminate_handler() ()
#3 0x0000000000442fb6 in __cxxabiv1::__terminate(void (*)()) ()
#4 0x0000000000442fe3 in std::terminate() ()
#5 0x0000000000443cbe in __cxa_throw ()
#6 0x0000000000401fe4 in std::__throw_system_error(int) ()
#7 0x00000000004057e7 in std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) ()
#8 0x0000000000400b18 in std::thread::thread<void (&)()> (this=0x7fffffffe540, __f=#0x4007c4) at /usr/include/c++/4.6/thread:135
#9 0x00000000004007f3 in main () at t.cpp:11
(gdb)
Update:
Linking with static libstdc++ could (possibly) make this error disappear, and the compiled C++0x programs can run on systems without gcc 4.6 libs:
g++ -static-libgcc -pthread -L.-o t thread.cpp -std=c++0x
But first, we should make a symbolic link to 'libstdc++.a' at current directory:
ln -s `g++ -print-file-name=libstdc++.a`
(Reference: http://www.trilithium.com/johan/2005/06/static-libstdc/)
you can use -u to resolve the problem (test in gcc version 4.6.3/(Ubuntu EGLIBC 2.15-0ubuntu10.4) 2.15 , gcc version 4.8.1/(Ubuntu EGLIBC 2.15-0ubuntu10.5~ppa1) 2.15)
-Wl,-u,pthread_cancel,-u,pthread_cond_broadcast,-u,pthread_cond_destroy,-u,pthread_cond_signal,-u,pthread_cond_wait,-u,pthread_create,-u,pthread_detach,-u,pthread_cond_signal,-u,pthread_equal,-u,pthread_join,-u,pthread_mutex_lock,-u,pthread_mutex_unlock,-u,pthread_once,-u,pthread_setcancelstate
1. reproduce the bug
g++ -g -O0 -static -std=c++11 t.cpp -lpthread
./a.out
terminate called after throwing an instance of 'std::system_error'
what(): Enable multithreading to use std::thread: Operation not permitted
Aborted (core dumped)
nm a.out | egrep "\bpthread_.*"
w pthread_cond_broadcast
w pthread_cond_destroy
w pthread_cond_signal
w pthread_cond_wait
w pthread_create
w pthread_detach
w pthread_equal
w pthread_join
w pthread_mutex_lock
w pthread_mutex_unlock
w pthread_once
w pthread_setcancelstate
2. resolve the bug
g++ -g -O0 -static -std=c++11 t.cpp -lpthread -Wl,-u,pthread_join,-u,pthread_equal
./a.out
FOO BAR
nm a.out | egrep "\bpthread_.*"
0000000000406320 T pthread_cancel
w pthread_cond_broadcast
w pthread_cond_destroy
w pthread_cond_signal
w pthread_cond_wait
0000000000404970 W pthread_create
w pthread_detach
00000000004033e0 T pthread_equal
00000000004061a0 T pthread_getspecific
0000000000403270 T pthread_join
0000000000406100 T pthread_key_create
0000000000406160 T pthread_key_delete
00000000004057b0 T pthread_mutex_lock
00000000004059c0 T pthread_mutex_trylock
0000000000406020 T pthread_mutex_unlock
00000000004063b0 T pthread_once
w pthread_setcancelstate
0000000000406220 T pthread_setspecific
For reasons exactly unknown to me (I consider this a bug) you can not use std::thread on gcc 4.6 when linking statically, since the function __ghtread_active_p() will be inlined as returning false (look at the assembly of _M_start_thread), causing this exception to be thrown. It might be that they require weak symbols for the pthread_create function there and when statically linking they are not there, but why they don't do it otherwise is beyond me (Note that the assembly later contains things like callq 0x0, there seems to be going something very wrong).
For now I personally use boost::threads since I am using boost anyways...
You should make sure you're linking against pthread library, otherwise you'll get the "Operation not permitted" message.
For instance, to compile your source code, I'd use this:
g++ -Wall -fexceptions -std=c++0x -g -c file.cpp -o file.o
Then linking with this:
g++ -o file file.o -lpthread
When doing it without object files, you could try something like this:
g++ -Wall -fexceptions -std=c++0x -g main.cpp -o file -lpthread
Remember to leave libraries at the end, since they'll be used on the linking process only.
My previous answer was deleted and I wrote detailed answer.
In common cases this issue happens due incomplete linking of the libpthread. I found information about this here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52590
You can try link you app with the following flags:
-Wl,--whole-archive -lpthread -Wl,--no-whole-archive
Also you could look at this questions with the similar problem:
What is the correct link options to use std::thread in GCC under linux?
Starting a std::thread with static linking causes segmentation fault