segfault when using boost::signal with -D_GLIBCXX_DEBUG compiler flag - c++

I'm building with g++, and yesterday a helpful person on SO told me to compile with the -D_GLIBCXX_DEBUG and -D_GLIBCXX_DEBUG_PEDANTIC flags. I did so, and I spent most of yesterday tweaking my code to conform to these flags. Now it's complaining about my use of boost::signal, and I'm not sure where the problem is.
I have a class Yarl that has a function refresh() that I want to bind to a signal sigRefresh in another class EventHandler:
class Yarl
{
private:
void refresh();
(...)
};
class EventHandler
{
public:
boost::signal<void()> sigRefresh;
(...)
};
Then, in a member function of Yarl, I have this bit of code:
EventHandler eventHandler;
eventHandler.sigRefresh.connect(boost::bind(&Yarl::refresh, this));
Before I started compiling with those flags, this code ran fine. Now that I'm using them, my program segfaults at the second line.
Here's the backtrace from gdb:
#0 0x001eeee6 in __gnu_debug::_Safe_iterator_base::_M_detach_single() ()
from /usr/lib/libstdc++.so.6
#1 0x001f0555 in __gnu_debug::_Safe_sequence_base::_M_detach_all() ()
from /usr/lib/libstdc++.so.6
#2 0x0804e8a3 in ~_Safe_sequence_base (this=0x812cda4,
__in_chrg=<value optimized out>)
at /usr/include/c++/4.4/debug/safe_base.h:180
#3 0x08085af9 in __gnu_debug::_Safe_sequence<std::__debug::vector<boost::signals::trackable const*, std::allocator<boost::signals::trackable const*> > >::~_Safe_sequence() ()
#4 0x08085b44 in std::__debug::vector<boost::signals::trackable const*, std::allocator<boost::signals::trackable const*> >::~vector() ()
#5 0x080873ab in boost::signals::detail::slot_base::data_t::~data_t() ()
#6 0x080873e3 in void boost::checked_delete<boost::signals::detail::slot_base::data_t>(boost::signals::detail::slot_base::data_t*) ()
#7 0x0808802e in boost::detail::sp_counted_impl_p<boost::signals::detail::slot_base::data_t>::dispose() ()
#8 0x08083d04 in boost::detail::sp_counted_base::release (this=0x812ce30)
at /usr/local/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:145
#9 0x08083d76 in ~shared_count (this=0xbffff358,
__in_chrg=<value optimized out>)
at /usr/local/boost/smart_ptr/detail/shared_count.hpp:217
#10 0x08083f70 in ~shared_ptr (this=0xbffff354,
__in_chrg=<value optimized out>)
at /usr/local/boost/smart_ptr/shared_ptr.hpp:169
#11 0x080847f1 in ~slot_base (this=0xbffff354, __in_chrg=<value optimized out>)
at /usr/local/boost/signals/slot.hpp:27
#12 0x08084829 in ~slot (this=0xbffff354, __in_chrg=<value optimized out>)
at /usr/local/boost/signals/slot.hpp:105
#13 0x0808390f in yarl::Yarl::mainLoop (this=0xbffff3dc) at src/Yarl.cpp:408
#14 0x08083a96 in yarl::Yarl::startGame (this=0xbffff3dc) at src/Yarl.cpp:452
#15 0x08083abe in main () at src/Yarl.cpp:461
Anyone see what I should fix?
EDIT: I have a small sample program that illustrates the problem, as suggested by Daniel Trebbien.
Here's the header file (test.hpp):
#include <boost/bind.hpp>
#include <boost/signal.hpp>
#include <iostream>
#include <tr1/memory>
namespace yarl
{
class Yarl
{
private:
void refresh();
public:
void hookSignal();
};
namespace events
{
class EventHandler
{
public:
boost::signal<void()> sigRefresh;
};
}
}
and here's the implementation:
#include "test.hpp"
using namespace std;
namespace yarl
{
void Yarl::refresh()
{
cout << "in refresh" << endl;
}
void Yarl::hookSignal()
{
events::EventHandler eventHandler;
eventHandler.sigRefresh.connect(boost::bind(&Yarl::refresh, this));
eventHandler.sigRefresh();
}
}
int main()
{
yarl::Yarl y;
y.hookSignal();
}
As before, this sample program works fine when compiled in g++ with only a -g flag, but if I add -D_GLIBCXX_DEBUG and -D_GLIBCXX_DEBUG_PEDANTIC, it segfaults on the eventHandler.sigRefresh.connect line.
I recompiled boost with -D_GLIBCXX_DEBUG and -D_GLIBCXX_DEBUG_PEDANTIC, and it didn't fix the problem, but while it was compiling I noticed it was doing something odd. I compiled with bjam using this command (according to this boost tutorial):
sudo bjam --build-dir=. --toolset=gcc --variant=debug --cxxflags=-D_GLIBCXX_DEBUG,-D_GLIBCXX_DEBUG_PEDANTIC --layout=tagged stage
Despite the --variant=debug option, it was still compiling the release versions. I also didn't see any mention of my debug flags anywhere in the output. Is it possible I compiled it wrong?

Do I have to have differently compiled versions of boost for release code and debug code?
I'm afraid you do. From personal experience, boost is extremely sensitive to changes in compiler flags. A few years ago a free software project I was hacking on had to stop using boost::system and boost::filesystem just because those modules have shared libraries that weren't reliably compiled (by the Linux distributors) with exactly the same flags as our code. The symptoms were just the same - inexplicable crashes on correct code.
Because of this I have to recommend not using any Boost module that ships a shared library. Ever. It's sad.
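If you control the library yourself, one way to turn this kind of flag mismatch into a link error instead of a runtime crash is to encode the STL mode in the symbol names. The following is only a sketch: the namespace mylib, the macro MYLIB_STL_TAG and the function stlTag are made-up names for illustration, it needs C++11 for inline namespaces, and it is not something Boost itself does as far as I know.

```cpp
#include <string>

// Hypothetical tag names, one per libstdc++ mode.
#ifdef _GLIBCXX_DEBUG
#  define MYLIB_STL_TAG stl_debug
#else
#  define MYLIB_STL_TAG stl_normal
#endif

namespace mylib {
inline namespace MYLIB_STL_TAG {

// The inline namespace becomes part of the mangled symbol name, so a caller
// compiled with the other setting looks for a symbol the library does not
// export and fails at link time.
// In a real library this definition would live in the library's .cpp file.
inline std::string stlTag() {
#ifdef _GLIBCXX_DEBUG
    return "stl_debug";
#else
    return "stl_normal";
#endif
}

} // inline namespace MYLIB_STL_TAG
} // namespace mylib
```

Callers still write mylib::stlTag(); only the mangled name changes between the two builds.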

Related

Why are my enums causing segfaults during exit()?

I am running into a weird issue at work where after updating from RHEL 7 (linux kernel 3.10.0, GCC 4.8.5) to RHEL 8 (linux kernel 4.18.0, GCC 8.3.1), our enums have started to cause problems while destructing. From my best diagnosis in gdb, it is trying to call the destructor on the same static object more than once (once for each lib that instantiates the enums and is used to build the executable in question) and segfaulting on the second attempt, as the object has already been destroyed.
Here is the backtrace:
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff3c91b6f in __tcf_2 () at /sourcepath/ExampleEnum.H:106
#2 0x00007ffff68ae3c7 in __cxa_finalize () from /lib64/libc.so.6
#3 0x00007ffff3c33c87 in __do_global_dtors_aux () from /libpath/lib64/libsecond_lib.so
#4 0x00007fffffff9c10 in ?? ()
#5 0x00007ffff7de42a6 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
This is the second time it reaches that line of ExampleEnum.H in __tcf_2, a function related to static destruction. The first time is no problem.
Here is the structure of the enums:
#ifndef _EXAMPLEENUM_H
#define _EXAMPLEENUM_H
#include "OurString.H"
#define EXAMPLEENUM_SOURCE_LIST(enum) \
enum(THIS_EXAMPLE_ENUM, "THIS_EXAMPLE", "", false),\
enum(ExampleEnumMax, "ExampleEnumMax", "error", false)
#define NAME_GENERATOR(name, guiname, description, p4) name
#define GUI_NAME_STR_GENERATOR(name, guiname, description, p4) guiname
class Example {
public:
enum Enum {
EXAMPLEENUM_SOURCE_LIST(NAME_GENERATOR)
};
static const int NUM_FIELDS = ExampleEnumMax + 1;
static const char* names[NUM_FIELDS];
};
typedef Example::Enum ExampleEnum;
extern const OurString ExampleEnum_GuiName[Example::ExampleEnumMax + 1];
#ifdef CONSTRUCT_ENUM_STRINGS
const OurString ExampleEnum_GuiName[Example::ExampleEnumMax + 1] = {
EXAMPLEENUM_SOURCE_LIST(GUI_NAME_STR_GENERATOR)
};
#endif
#endif
And then in the libs where it is used, this names.C is compiled into the lib:
#define CONSTRUCT_ENUM_STRINGS 1
#include <enumpath/ExampleEnum.H>
#undef CONSTRUCT_ENUM_STRINGS
const char* Example::names[Example::NUM_FIELDS] = {
EXAMPLEENUM_SOURCE_LIST(GUI_NAME_STR_GENERATOR)
};
We have a band-aid solution that basically just covers up the problem, ie calling _exit(0) at the end of main() skips all destructors, including static destructors which pose the problem so it doesn't segfault. However, obviously we want to fix the way our enums are handled such that we can run all necessary destructors (and no more than necessary) without segfaulting.
Is there anything obviously wrong with our enums? They have been working through several kernel/gcc versions and have only recently posed a problem.
Is there likely to be anything wrong with how they are used in the libs? This problem only occurs when an executable is built with multiple libs that use the same enum, which is unfortunately quite often. Is there some strict dependency structure we could keep to that would fix this?
Why did it work up until we updated the OS?
EDIT:
Concerns about OurString's destructor have been raised, I didn't include it because it was trivial:
~OurString() throw () {}
ALSO: a little more debugging and going through a version compiled by GCC 4.8.5 that doesn't segfault shows me that __tcf_2 is entered twice there too, so my theory about improperly calling the destructor multiple times is wrong, and it looks like @PaulMcKenzie's theory of static initialization order is likely.
Thanks in advance!
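If static initialization or destruction order does turn out to be the cause, one possible way around it is to keep the name table as string literals behind an accessor function, so nothing is constructed at load time or destroyed at exit. This is a sketch under assumptions, not a drop-in fix: it swaps OurString for plain const char*, hard-codes the two entries from the example instead of using the macro list, and the function name exampleEnumGuiName is made up.

```cpp
#include <cassert>
#include <cstring>

class Example {
public:
    enum Enum { THIS_EXAMPLE_ENUM, ExampleEnumMax };
    static const int NUM_FIELDS = ExampleEnumMax + 1;
};

// Hypothetical accessor. The table holds only pointers to string literals,
// so it needs no dynamic initialization and no destructor is registered
// to run at exit.
inline const char* exampleEnumGuiName(Example::Enum value) {
    static const char* const names[Example::NUM_FIELDS] = {
        "THIS_EXAMPLE", "ExampleEnumMax"
    };
    assert(static_cast<int>(value) >= 0 &&
           static_cast<int>(value) < Example::NUM_FIELDS);
    return names[value];
}
```

Each library can include this header without defining a global object of its own, which sidesteps the question of which library constructs or destroys the array.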

same piece of C++ code works in g++ 4.6 compiler but crashes with 5.1

The following piece of code works with the g++ 4.6 compiler but crashes with a segmentation fault when compiled with the g++ 5.1 compiler. Accessing the variable gString is what causes the segmentation fault.
#define _GLIBCXX_DEBUG 1
#define _GLIBCXX_USE_CXX11_ABI 0
#include<string>
#include<iostream>
#include<vector>
static std::string gString("hello");
static void
__attribute__((constructor))
initialize()
{
gString.assign("hello world");
return;
}
static void
__attribute__((destructor))
finalize()
{
return;
}
int main(int ac, char **av)
{
//std::cerr<<gString;
return 0;
}
GDB output:
Reading symbols from /home/rk/str...done.
(gdb) b initialize
Breakpoint 1 at 0x401419: file str.cc, line 15.
(gdb) r
Starting program: /home/rk/str
Breakpoint 1, initialize() () at str.cc:15
15 gString.assign("hello world");
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x00000000004018d6 in std::string::size() const () at /usr/include/c++/5/bits/basic_string.h:3118
3118 { return _M_rep()->_M_length; }
(gdb) bt
#0 0x00000000004018d6 in std::string::size() const () at /usr/include/c++/5/bits/basic_string.h:3118
#1 0x00000000004016ff in std::string::assign(char const*, unsigned long) () at /usr/include/c++/5/bits/basic_string.tcc:706
#2 0x000000000040166e in std::string::assign(char const*) () at /usr/include/c++/5/bits/basic_string.h:3542
#3 0x0000000000401428 in initialize() () at str.cc:15
#4 0x00000000004023dd in __libc_csu_init ()
#5 0x00007ffff71ad700 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000401289 in _start ()
Why are you using __attribute__((constructor)) in C++ instead of simply a global object with a constructor? Those attributes are useful in C code, but redundant in C++.
The problem is that your constructor runs before the standard iostreams have been initialized, which would not be a problem if you used a global object with a constructor.
You could try adding a priority to your constructor, but I don't think it will help in this case:
__attribute__((constructor(999)))
The runtime error also happens with gcc 4.9.2 (see ideone example).
The problem is related to the iostreams, which are not yet initialized. Comment out the cerr line and everything works fine.
Apparently, it's a known issue.
Edit: Additional remarks
This small workaround seems to work, at least with 4.9: use C stdio instead of iostreams:
fprintf(stderr, "_initialize"); // this works
But I fully agree with Jonathan's suggestion of using a global (singleton ?) object relying solely on well defined standard C++ behaviour, unless you really need the constructor being run exactly at the moment of a dynamic library load.
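For reference, the global-object approach suggested above might look roughly like this. It is a minimal sketch: the names Initializer, gInitializer and currentGreeting are made up for illustration, and it relies on the standard rule that globals defined in the same translation unit are initialized in the order of their definitions.

```cpp
#include <cassert>
#include <string>

namespace {

std::string gString("hello");

// Hypothetical helper type: its constructor does the work that the
// __attribute__((constructor)) function did in the question.
struct Initializer {
    Initializer() { gString.assign("hello world"); }
};

// Defined after gString in the same translation unit, so gString has
// already been constructed when this constructor runs.
Initializer gInitializer;

} // namespace

// Hypothetical accessor, used here only to observe the result.
const std::string& currentGreeting() {
    assert(!gString.empty());
    return gString;
}
```

By the time main() starts, currentGreeting() returns "hello world", without relying on any compiler-specific attribute.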

segfault using SWIG converted code for tcl

I'm having a segmentation fault with my program.
I wrote a library in C++ and wrapped it for Tcl using SWIG.
The segfault occurs here:
return Tcl_NewIntObj(static_cast< int >(value));
where value=0
the gdb back trace shows:
(gdb) bt
#0 0x000054b6 in ?? ()
#1 0xb6650d5d in SWIG_From_long (value=0) at mntdisplay_wrap.cc:1712
#2 SWIG_From_int (value=0) at mntdisplay_wrap.cc:1722
#3 Testguimnt_Init (interp=0x9714e28) at mntdisplay_wrap.cc:3774
#4 0xb76748fe in Tcl_LoadObjCmd () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#5 0xb75d02af in TclNREvalObjv () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#6 0xb75d0859 in Tcl_EvalObjv () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#7 0xb75d0d99 in TclEvalEx () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#8 0xb7670045 in Tcl_FSEvalFileEx () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#9 0xb767645f in Tcl_MainEx () from /opt/ActiveTcl-8.6/lib/libtcl8.6.so
#10 0x0804885c in main ()
In the wrapper:
line 1712:
SWIGINTERNINLINE Tcl_Obj*
SWIG_From_long (long value)
{
if (((long) INT_MIN <= value) && (value <= (long) INT_MAX)) {
return Tcl_NewIntObj(static_cast< int >(value)); //1712
} else {
return Tcl_NewLongObj(value);
}
}
1722:
SWIGINTERNINLINE Tcl_Obj *
SWIG_From_int (int value)
{
return SWIG_From_long (value); //1722
}
3774:
SWIG_Tcl_SetConstantObj(interp, "MESSAGE_NEW", SWIG_From_int(static_cast< int >(MESSAGE_NEW)));
where MESSAGE_NEW is defined in a enum and is 0.
If you have any ideas, please help me. Thank you!
EDIT:
I found the cause of the problem: it's a linking error.
I created a new thread for this issue:
C++: linked library disappears and gives segfault during execution
I found the problem.
Please see my other post:
C++: linked library disappears and gives segfault during execution
There was an undefined symbol of my library. I defined it and problem solved!
The confusing part was that my program crashed in the middle of the Tcl wrapper functions, where my undefined symbol was not involved at all. I don't really know why, but that's how it was.
Hope it will help others!

segmentation fault while using dynamically allocated object in shared lib loaded at runtime

I have a static library linked into an executable. The executable itself doesn't use any symbols from the library, but it loads some shared libraries at runtime, one of which does use symbols from the library. Below is a very simplified version of the library source files.
ParentClass.h
#include <iostream>
using namespace std;
class ParentClass {
public:
ParentClass() {}
// some functionality
};
ChildClass.h
#include <ParentClass.h>
struct StaticData {
static const char *staticString;
};
class ChildClass : public ParentClass, public StaticData {
public:
ChildClass() {}
// some extended functionality here
};
ChildClass.cpp
#include "ChildClass.h"
const char * StaticData::staticString = "string";
// functionality implementation
Here are several facts:
1. Code like this:
ChildClass test;
//extended use of test functionality
works quite well.
2. Code like this:
ChildClass *test = new ChildClass();
test->some_func(); // some_func doesn't use dynamic memory
test->some_other_func(); // dynamic memory is used (in my case malloc inside the gethostbyname system function)
works quite well when used in a binary directly linked with the library, but fails with the segfault "path_to_exec malloc(): memory corruption: some_address" when used in a shared library loaded at runtime (see the description at the beginning).
3. Code like this:
ParentClass *test = new ParentClass();
test->some_func();
test->some_other_func();
Works well everywhere.
I'm having trouble understanding why the code in item 2 causes a segmentation fault, but I suspect the trouble is in the use of static data in ChildClass. Besides that difference, ChildClass only defines some new functions with extended functionality that use ParentClass functions, and the segmentation fault occurs even when I only call ParentClass functions that are not overridden. But I can't connect this single difference with the fact that the segmentation fault occurs only when ChildClass is used in a shared library dynamically loaded into the executable my library was linked with.
I'd be glad to hear any ideas for getting rid of this segfault.
Update: bt when using logger function with std::cout (some names are omitted). Call sequence:
ChildClass *test = new ChildClass();
test->printInfo();
test->connect();
The connect function isn't redefined in ChildClass.
(gdb) bt
#0 0x00007f756f67e165 in raise () from /lib/libc.so.6
#1 0x00007f756f680f70 in abort () from /lib/libc.so.6
#2 0x00007f756f6b427b in ?? () from /lib/libc.so.6
#3 0x00007f756f6bdad6 in ?? () from /lib/libc.so.6
#4 0x00007f756f6c0b6d in ?? () from /lib/libc.so.6
#5 0x00007f756f6c2930 in malloc () from /lib/libc.so.6
#6 0x00007f756f6af35b in ?? () from /lib/libc.so.6
#7 0x00007f756f7291de in ?? () from /lib/libc.so.6
#8 0x00007f756f72aa65 in __res_maybe_init () from /lib/libc.so.6
#9 0x00007f756f72ca70 in __nss_hostname_digits_dots () from /lib/libc.so.6
#10 0x00007f756f731fe4 in gethostbyname_r () from /lib/libc.so.6
#11 0x0000000000507929 in underlaying_c_code_connect (client=0x7f7564017348) at /home/beduin/???/lib/???/UnderlayingCCode.cpp:1477
#12 0x0000000000504a24 in ParentClass::connect (this=0x7f7564017340) at /home/beduin/???/lib/???/ParentClass.cpp:216
#13 0x00007f7569342f68 in Plugin::Start (this=0x7f75640208c0) at /home/beduin/???/plugins/???/Plugin.cpp:84
#14 0x00000000004c7d45 in ???::PluginHolder::StartPlugin (this=0x7fffed7dc5e0, it=@0x7fffed7dbad0) at /home/beduin/???/plugins.cpp:317
#15 0x00000000004c8656 in ???::PluginHolder::Start (this=0x7fffed7dc5e0) at /home/beduin/mrvs/framework/base/plugins.cpp:401
#16 0x00000000004c7935 in ???::PluginHolder::LockNLoad (this=0x7fffed7dc5e0) at /home/beduin/???/plugins.cpp:284
#17 0x00000000004afe6f in main (argc=3, argv=0x7fffed7dd978) at /home/beduin/???/main.cpp:148
Using custom logger:
#0 0x00007f824aa12165 in raise () from /lib/libc.so.6
#1 0x00007f824aa14f70 in abort () from /lib/libc.so.6
#2 0x00007f824aa4827b in ?? () from /lib/libc.so.6
#3 0x00007f824aa51ad6 in ?? () from /lib/libc.so.6
#4 0x00007f824aa54b6d in ?? () from /lib/libc.so.6
#5 0x00007f824aa56930 in malloc () from /lib/libc.so.6
#6 0x00007f824b2a46bd in operator new () from /usr/lib/libstdc++.so.6
#7 0x00007f824b280b29 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
#8 0x00007f824b281aeb in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
#9 0x00007f824b28205c in std::string::reserve () from /usr/lib/libstdc++.so.6
#10 0x00007f824b27c021 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow () from /usr/lib/libstdc++.so.6
#11 0x00007f824b280215 in std::basic_streambuf<char, std::char_traits<char> >::xsputn () from /usr/lib/libstdc++.so.6
#12 0x00007f824b2763b5 in std::__ostream_insert<char, std::char_traits<char> > () from /usr/lib/libstdc++.so.6
#13 0x00007f824b27662f in std::operator<< <std::char_traits<char> > () from /usr/lib/libstdc++.so.6
#14 0x00000000004f4fb0 in ???::Logger::LogWriter::operator<< <char [25]> (this=0x7fff8e241fc0, str=@0x52b2fd)
at /home/beduin/???/log:184
#15 0x0000000000500388 in ChildClass::printInfo (this=0x7f8240017470) at /home/beduin/???/ChildClass.cpp:480
#16 0x00007f82446d6f5c in Plugin::Start (this=0x7f82400208a0) at /home/beduin/???/plugins/???/Plugin.cpp:83
#17 0x00000000004c7d35 in ???::PluginHolder::StartPlugin (this=0x7fff8e243b30, it=@0x7fff8e243020) at /home/beduin/???/plugins.cpp:317
#18 0x00000000004c8646 in ???::PluginHolder::Start (this=0x7fff8e243b30) at /home/beduin/???/plugins.cpp:401
#19 0x00000000004c7925 in ???::PluginHolder::LockNLoad (this=0x7fff8e243b30) at /home/beduin/???/plugins.cpp:284
#20 0x00000000004afe5f in main (argc=3, argv=0x7fff8e244ec8) at /home/beduin/???/main.cpp:148
Run your program under valgrind (rather than gdb). It will show you the first place where invalid memory access occurs, which may be different from the place where the crash ultimately happens.
Regarding the fact that it's broken when linked as a shared library, are you using -fPIC or not? If not, try it.

Boost: what could be the reasons for a crash in boost::slot<>::~slot?

I am getting such a crash:
#0 0x90b05955 in __gnu_debug::_Safe_iterator_base::_M_detach
#1 0x90b059ce in __gnu_debug::_Safe_iterator_base::_M_attach
#2 0x90b05afa in __gnu_debug::_Safe_sequence_base::_M_detach_all
#3 0x000bc54f in __gnu_debug::_Safe_sequence_base::~_Safe_sequence_base at safe_base.h:170
#4 0x000aac05 in __gnu_debug::_Safe_sequence<__gnu_debug_def::vector<boost::signals::trackable const*, std::allocator<boost::signals::trackable const*> > >::~_Safe_sequence at safe_sequence.h:97
#5 0x000ac9c1 in __gnu_debug_def::vector<boost::signals::trackable const*, std::allocator<boost::signals::trackable const*> >::~vector at vector:95
#6 0x000acf65 in boost::signals::detail::slot_base::data_t::~data_t at slot.hpp:32
#7 0x000acf8f in boost::checked_delete<boost::signals::detail::slot_base::data_t> at checked_delete.hpp:34
#8 0x000b081e in boost::detail::sp_counted_impl_p<boost::signals::detail::slot_base::data_t>::dispose at sp_counted_impl.hpp:78
#9 0x0000a016 in boost::detail::sp_counted_base::release at sp_counted_base_gcc_x86.hpp:145
#10 0x0000a046 in boost::detail::shared_count::~shared_count at shared_count.hpp:217
#11 0x000a9fb0 in boost::shared_ptr<boost::signals::detail::slot_base::data_t>::~shared_ptr at shared_ptr.hpp:169
#12 0x000aa459 in boost::signals::detail::slot_base::~slot_base at slot.hpp:27
#13 0x000aad07 in boost::slot<boost::function<bool ()(char, int)> >::~slot at slot.hpp:105
#14 0x001b943b in main at vermes.cpp:102
This is the code:
#include <boost/signal.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/function.hpp>
#include <boost/bind.hpp>
bool dummyfunc(char,int) { return false; }
int main(int argc, char **argv)
{
boost::signal<bool (char, int)> myslot;
myslot.connect(0, &dummyfunc);
return 0;
}
It's the first time I am working with Boost and I am also completely new to the code of the project I am trying to port here.
That is why I would like to ask if such a crash could be in any way explained by Boost or if it must be unrelated to Boost.
I already tried to understand the crash itself but I got stuck somehow. It seems that probably the std::vector, which is going to be deleted here, is messed up (messed up = memory corrupt). That vector is a member of slot_base::data_t. The deletion is done in the destructor of slot_base::shared_ptr. So perhaps the shared_ptr also was messed up - so perhaps even the whole slot_base was messed up. But in the code I have, I don't really see a reason why that memory could be messed up. It is even the first access at all after the construction of myslot.
Addition: What I also don't really understand is why ~slot_base() is called here at all when I do the connect. I also couldn't find the connect member function. Is that a magic macro somewhere?
I found the problem. When I enable these preprocessor definitions (my Xcode does that by default in Debug configuration), it crashes:
-D _GLIBCXX_DEBUG=1
-D _GLIBCXX_DEBUG_PEDANTIC=1
I guess Boost (bjam) compiled without those and that causes such problems because the STL structures (like vector) look different in binary form when compiled with or without this.
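A quick way to check this guess is to build a small probe twice, once with -D_GLIBCXX_DEBUG and once without, and compare what it reports. The two function names below are made up for this sketch. On libstdc++ I would expect the debug build to report a larger vector, because the debug container carries extra bookkeeping for its iterators, but I have not listed concrete numbers since they depend on the platform.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Name of the libstdc++ mode this translation unit was compiled in.
inline std::string stlMode() {
#ifdef _GLIBCXX_DEBUG
    return "debug";
#else
    return "normal";
#endif
}

// Size in bytes of a vector of pointers, the same kind of container that
// appears in the backtrace. If this differs between the library build and
// the application build, the two sides disagree about the object layout.
inline std::size_t trackedVectorSize() {
    return sizeof(std::vector<const void*>);
}
```

Printing stlMode() and trackedVectorSize() from both builds makes the layout difference visible without involving Boost at all.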
It sounds like your GConsole class is not derived from boost::trackable.
When a signal is bound to a member function, it expects the member's object to exist, always.
You can either explicitly disconnect signals when member function's owner is destroyed, or you can derive the object from boost::trackable, which will do the maintenance automatically when the object is destroyed.
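To illustrate the idea of disconnecting when the owner is destroyed, here is a sketch that uses only the standard library (C++11). MiniSignal and Receiver are made-up stand-ins, not the Boost API; the destructor of Receiver does by hand the maintenance that the trackable base class is described as doing automatically.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a signal; not the Boost API.
class MiniSignal {
public:
    typedef std::size_t Id;

    // Stores the slot and returns an id that can be used to disconnect it.
    Id connect(std::function<void()> slot) {
        Id id = next_++;
        slots_[id] = std::move(slot);
        return id;
    }
    void disconnect(Id id) { slots_.erase(id); }

    // Calls every slot that is still connected.
    void operator()() const {
        for (const auto& entry : slots_) entry.second();
    }
    std::size_t size() const { return slots_.size(); }

private:
    std::map<Id, std::function<void()>> slots_;
    Id next_ = 0;
};

// Hypothetical receiver that connects a member callback and removes it
// again in its destructor, so the signal never calls a dead object.
class Receiver {
public:
    explicit Receiver(MiniSignal& signal)
        : signal_(signal), id_(signal.connect([this] { ++calls_; })) {}
    ~Receiver() { signal_.disconnect(id_); }

    Receiver(const Receiver&) = delete;
    Receiver& operator=(const Receiver&) = delete;

    int calls() const { return calls_; }

private:
    MiniSignal& signal_;
    MiniSignal::Id id_;
    int calls_ = 0;
};
```

Once a Receiver goes out of scope, the signal holds no slot for it, so firing the signal afterwards is harmless.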