How to use global variable safely in dlopen-ed shared library? - c++

One of my projects for Linux uses a lot of shared libraries, which are loaded with dlopen/dlsym/dlclose to gain modularity. For thread safety, I need to put a global variable (boost::mutex) in one of the shared libraries. Obviously, boost::mutex has a non-trivial destructor (it can even abort(3) the program).
I understand that using global variables that are not trivially destructible is a bad idea, and some guidelines explicitly forbid it, but I really need a global mutex for the module.
After adding a global boost::mutex variable, the program crashes on exit (boost::mutex::lock() throws because pthread_mutex_lock returns EINVAL). I assume that the shared library is called after main returns, when the mutex has already been destroyed (the static destruction order fiasco, I assume).
Unfortunately, this program is quite complicated and I need a workaround to do my job on time.
After digging into how constructors and destructors are run, I found that this code works.
#include <new> // placement new
#include <boost/aligned_storage.hpp>
#include <boost/thread/lock_guard.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/type_traits/alignment_of.hpp>
boost::aligned_storage<sizeof(boost::mutex), boost::alignment_of<boost::mutex>::value> gMutexStorage;
boost::mutex* gMutex = NULL;
__attribute__ ((constructor))
void MyModuleInit()
{
gMutex = new(reinterpret_cast<boost::mutex*>(gMutexStorage.address())) boost::mutex();
}
__attribute__ ((destructor))
void MyModuleUnInit()
{
gMutex->~mutex();
gMutex = NULL;
}
// global mutex use
extern "C" __attribute__ ((visibility("default")))
int SomeFunc(...)
{
boost::lock_guard<boost::mutex> guard(*gMutex);
....
}
With this, MyModuleUnInit is registered in the .fini_array section of the shared library (for reference, the destructors of other static C++ variables are registered via __cxa_atexit), and it is called after the last use of the mutex, so the program exits normally.
However, I am not confident that this use of the mutex is safe. Is this code okay to use? If not, how should I deal with static (global) variables in a dlopen-ed shared library?
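One commonly suggested alternative, sketched here rather than offered as a verified fix for this particular program, is to create the mutex on first use inside a function and never destroy it, so that no destructor can run before a late caller. The sketch swaps in std::mutex for boost::mutex to stay self-contained; ModuleMutex, gCallCount, and the body of SomeFunc are hypothetical.

```cpp
#include <mutex>

// Hypothetical accessor: the mutex is created on first use and intentionally
// leaked, so it is never destroyed and stays valid during process shutdown.
// Since C++11, initialization of a function-local static is thread-safe.
static std::mutex& ModuleMutex()
{
    static std::mutex* m = new std::mutex();  // never deleted on purpose
    return *m;
}

static int gCallCount = 0;  // example state protected by the mutex

extern "C" __attribute__((visibility("default")))
int SomeFunc()
{
    std::lock_guard<std::mutex> guard(ModuleMutex());
    return ++gCallCount;
}
```

The trade-off is a small leak: if the library is unloaded with dlclose and loaded again, a new mutex is allocated and the old one is never freed.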

Related

Registering file handlers in code running prior to main()

I'm looking into ways to prevent unnecessary clutter in setup code in main() as well as various other places. I often have tons of setup code that registers itself with some factory. A standard example is e.g. handlers for various file types.
To avoid having to write this code and instead just make handlers magically work if linked into the application, I figured I could replace the code by something like the following:
test.cc:
int main() {
return 0;
}
loader.h:
#ifndef LOADER_H_
#define LOADER_H_
#include <functional>
namespace loader {
class Loader {
public:
Loader(std::function<void()> f);
};
} // namespace loader
#define REGISTER_HANDLER(name, f) \
namespace { \
::loader::Loader _macro_internal_ ## name(f); \
}
#endif // LOADER_H_
loader.cc:
#include "loader.h"
#include <iostream>
namespace loader {
Loader::Loader(std::function<void()> f) { f(); }
} // namespace loader
a.cc:
#include <iostream>
#include "loader.h"
REGISTER_HANDLER(a, []() {
std::cout << "hello from a" << std::endl;
})
The idea here is that, in a real application, a.cc would call some method that registers it as a handler for a certain file type. Compiling the code with c++ -std=c++11 test.cc loader.cc a.cc creates a binary that prints "hello from a", while c++ -std=c++11 test.cc loader.cc stays silent.
I'm wondering if there's something subtle that I might need to be careful with? For example, if someone creates complex objects in the lambda that is run here, I assume weird things can happen during cleanup for example in a multithreaded application?
You wrote:
... unnecessary clutter in setup code in main() ...
int main() {
return 0;
}
This is not preventing unnecessary clutter. This is hiding your initializations. They still occur, but now you have to chase after them. That's really not the way to do it. Also, it will force the use of a lot of global state - in many independent global variables, most probably - which is also a bad thing. Instead, consider writing something like:
class my_app_state { /* ... */ };
my_app_state initialize(/* perhaps with argc and argv here? */) {
//
// Your "unnecessary" clutter goes here...
//
return whatever;
}
int main() {
auto app_state = initialize();
//
// do stuff involving the app_state...
//
}
and don't try to "game" the program loader.
This approach is not guaranteed to work:
[basic.start.dynamic]/4 It is implementation-defined whether the dynamic initialization of a non-local non-inline variable with static storage duration is sequenced before the first statement of main or is deferred. If it is deferred, it strongly happens before any non-initialization odr-use of any non-inline function or non-inline variable defined in the same translation unit as the variable to be initialized.
Thus, the initialization of _macro_internal_a may be deferred until something in a.cc is used. And since nothing in a.cc is in fact used, the initialization may not be performed at all.
In practice, linkers tend to discard object files that do not appear to be referenced by anything in the program (especially when those files come from libraries).
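A minimal sketch of the explicit alternative, assuming a hypothetical registry keyed by file extension: each handler exposes an ordinary registration function and main calls it by name, so neither deferred initialization nor linker pruning can skip it. The names HandlerRegistry, register_a_handler, and register_all_handlers are illustrative and do not come from the question.

```cpp
#include <functional>
#include <map>
#include <string>

namespace loader {

using Handler = std::function<std::string(const std::string&)>;

// Function-local static: constructed on first use, so it is always ready
// before any registration touches it.
inline std::map<std::string, Handler>& HandlerRegistry()
{
    static std::map<std::string, Handler> registry;
    return registry;
}

// Would live in a.cc: an ordinary function instead of a global object.
inline void register_a_handler()
{
    HandlerRegistry()[".a"] = [](const std::string& path) {
        return "hello from a: " + path;
    };
}

// Called once from main(); referencing each registration function
// also keeps its object file in the link.
inline void register_all_handlers()
{
    register_a_handler();
}

}  // namespace loader
```

With this layout main() still shows that handlers are being registered, but the per-handler details stay in their own files.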

Is LTO allowed to remove unused global object if there is code in a different translation unit relying on side effects of its construction?

First, to avoid an XY problem: this issue comes from https://github.com/cnjinhao/nana/issues/445#issuecomment-502080177. The library code should probably not do such a thing (rely on the construction of an unused global object), but the question is about whether this is valid LTO behaviour rather than about code quality.
Minimal code that showcases the same problem (untested, just to make example smaller):
// main.cpp
#include <lib/font.hpp>
int main()
{
lib::font f;
}
// lib/font.hpp
namespace lib
{
struct font
{
font();
int font_id;
};
}
// lib/font.cpp
#include <lib/font.hpp>
#include <lib/font_abstraction.hpp>
namespace lib
{
font::font()
{
font_id = get_default_font_id();
}
}
// lib/font_abstraction.hpp
namespace lib
{
int get_default_font_id();
void initialize_font();
}
// lib/font_abstraction.cpp
#include <lib/font_abstraction.hpp>
namespace lib
{
static int* default_font_id;
int get_default_font_id()
{
return *default_font_id;
}
void initialize_font()
{
default_font_id = new int(1);
}
}
// lib/platform_abstraction.hpp
namespace lib
{
struct platform_abstraction
{
platform_abstraction();
};
}
// lib/platform_abstraction.cpp
#include <lib/platform_abstraction.hpp>
#include <lib/font_abstraction.hpp>
namespace lib
{
platform_abstraction::platform_abstraction()
{
initialize_font();
}
static platform_abstraction object;
}
The construction of the font object in main.cpp relies on the initialization of the pointer. The only thing that initializes the pointer is the global object named object, but it is unused; in the linked issue, that object was removed by LTO. Is such an optimization allowed? (See C++ draft 6.6.5.1.2)
Some notes:
The library was built as a static library and linked with the main file using -flto -fno-fat-lto-objects and the dynamic C++ standard library.
This example can be built without compiling lib/platform_abstraction.cpp at all; in that scenario the pointer will certainly not be initialized.
VTT's answer gives a GCC answer, but the question is tagged language-lawyer.
The ISO C++ reason is that objects defined in a translation unit must be initialized before the first call to a function defined in the same translation unit. That means platform_abstraction::object must be initialized before platform_abstraction::platform_abstraction() is called. As the linker correctly figured out, there are no other platform_abstraction objects, so platform_abstraction::platform_abstraction is never called, so object's initialization can be postponed indefinitely. A conforming program cannot detect this.
Since you never reference object from the static library in the main executable, it is not going to exist unless you link that static library with -Wl,--whole-archive. It is not a good idea to rely on the construction of global objects to perform initialization anyway, so you should just invoke initialize_font explicitly before using other functions from that library.
Additional explanation for question tagged language-lawyer:
static platform_abstraction object; cannot be eliminated under any circumstances, according to
6.6.4.1 Static storage duration [basic.stc.static]
2 If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 15.8.
So what is going on here? When linking a static library (an archive of object files), the linker by default only picks the object files required to resolve undefined symbols, and since nothing from platform_abstraction.cpp is used anywhere else, the linker omits this translation unit completely. The --whole-archive option alters this default behavior by forcing the linker to link all the object files from the static library.
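The advice to call initialize_font explicitly could look like the following sketch. It reuses the names from the question and collapses the library into one file for brevity; the null check that makes repeated calls harmless is an added assumption and is not thread-safe.

```cpp
namespace lib
{
static int* default_font_id = nullptr;

void initialize_font()
{
    // Assumption: idempotent so that calling it twice is harmless.
    // Not thread-safe; call it once before starting other threads.
    if (default_font_id == nullptr)
        default_font_id = new int(1);
}

int get_default_font_id()
{
    return *default_font_id;  // requires initialize_font() to have run
}

struct font
{
    font() : font_id(get_default_font_id()) {}
    int font_id;
};
}
```

In main, lib::initialize_font() would be the first call, before any lib::font is constructed, so the result no longer depends on which object files the linker keeps.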
Don't have global static variables.
The order of initialization across translation units is not defined (in the general case).
Put static objects inside functions as static locals; then you can guarantee they are created before use.
namespace lib
{
static int* default_font_id;
int get_default_font_id()
{
return *default_font_id;
}
void initialize_font()
{
default_font_id = new int(1);
}
}
// Change this to:
#include <memory>
namespace lib
{
int get_default_font_id()
{
// This new is guaranteed to only ever be called once.
// Direct initialization is required: the unique_ptr constructor is explicit.
static std::unique_ptr<int> default_font_id(new int(1));
return *default_font_id;
}
void initialize_font()
{
// Don't need this ever.
}
}

Access the same namespace from different libraries

I build several libraries (static and dynamic ones) which all need to access a namespace containing namespace-global variables.
The functions for altering the variables are defined in one cpp file. If a function inside a library calls one of those functions, the library seems to get its own local copy of the whole cpp file, including all variables (and maybe also functions). This means every library accesses a variable at a different address, resulting in a mess, because the variables have to be shared by all libraries. How can I get around this?
The sourcecode, reduced to the essential:
//include.h
namespace myns {
extern int vars;
}
.
//include.cpp
#include <myns/include.h>
namespace myns {
int vars;
void MyClass::setVars(int var) {
vars = var;
}
}
.
//myclass.h
namespace myns {
class MyClass {
void setVars(int var);
}
}
.
//myclass.cpp.in
//This will be generated by CMake and then compiled into a library
#include <myns/#package#/myclass.h>
namespace myns {
namespace #package# {
class __declspec(dllexport) MySubclass : public MyClass {
MySubclass();
}
MySubclass::MySubclass() {
setVar(#value#);
}
}
}
using namespace myns;
extern "C" __declspec(dllexport) void exported_function() {
new MySubclass();
}
Every time exported_function() is called, the vars variable has a different address. Why does that happen? I need all library functions to access the same vars variable!
Even though the language bears little resemblance to the alleged "C++", and even though the question is severely underspecified (what kind of libraries?), the problem is sufficiently well known that it's answerable.
In short, each Windows DLL has its own internal variables etc. (a Windows DLL is more akin to an executable than to a C++ library).
Note that standard C++ does not support dynamically loaded libraries except for a single cryptic little statement about global initialization order.
A solution is to put the shared state in its own DLL.
By the way, when you are providing setters and getters for a variable there is really little point in also making it directly accessible. That's asking for trouble. But then, using a global variable is, in the first place, also asking for trouble.
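A rough sketch of the "shared state in its own DLL" idea, assuming a hypothetical module that is the only place where vars is defined and that exports accessor functions. The MYNS_STATE_API macro and the names get_vars and set_vars are illustrative; on Windows, code that consumes the DLL would see the same declarations marked __declspec(dllimport) instead.

```cpp
// state.cpp (hypothetical name): compiled into exactly one shared library.
#if defined(_WIN32)
#  define MYNS_STATE_API __declspec(dllexport)
#else
#  define MYNS_STATE_API __attribute__((visibility("default")))
#endif

namespace myns {

namespace {
int vars = 0;  // single definition; not directly visible outside this module
}

// All other libraries go through these two functions, so they all
// read and write the same object.
MYNS_STATE_API int get_vars() { return vars; }
MYNS_STATE_API void set_vars(int value) { vars = value; }

}  // namespace myns
```

Every other library then links against this one module instead of compiling include.cpp into itself, which is what produced the separate copies.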

c++ std::thread with static member hangs

I'm trying to create a logging class where the call to write a log is static. Due to performance requirements, I want to perform the actual logging in a separate thread. Since the function that writes to the log is static, I think the thread also needs to be static, and it is tied to another static member function that performs the actual writing of the log. I tried coding it, but it hangs during the initialization of the static thread. A code sample that reproduces the behavior is below:
"Logger.h"
#ifndef LOGGER_H
#define LOGGER_H
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>
#define LIBRARY_EXPORTS
#ifdef LIBRARY_EXPORTS // inside DLL
#define LIBRARY_API __declspec(dllexport)
#else // outside DLL
#define LIBRARY_API __declspec(dllimport)
#endif
using namespace std;
namespace Company { namespace Logging {
class LIBRARY_API Logger
{
public:
~Logger();
void static Write(string message, vector<string> categories = vector<string>());
private:
Logger();
Logger(Logger const&) {}
void operator=(Logger const&) {}
static thread processLogEntriesThread;
static void ProcessLogEntries();
};
}}
#endif
"Logger.cpp"
#include "Logger.h"
#include <iostream>
using namespace std;
namespace Company { namespace Logging {
thread Logger::processLogEntriesThread = thread(&Logger::ProcessLogEntries);
Logger::Logger()
{
}
Logger::~Logger()
{
Logger::processLogEntriesThread.join();
}
void Logger::Write(string message, vector<string> categories)
{
cout << message << endl;
}
void Logger::ProcessLogEntries()
{
}
}}
One odd behavior that I found is that the hang only happens when the class is packaged in a DLL. If I add the class files directly to the console EXE project, it seems to work.
So basically my questions are why it hangs and whether I'm doing things correctly.
Thanks in advance...
you can use my logger library => https://github.com/PraGitHub/Prapository/tree/master/C_Cpp/Logger
If this post is found irrelevant, please pardon me.
the hanging part only happens when the class is packaged in a DLL
See Dynamic-Link Library Best Practices for full details why it hangs:
You should never perform the following tasks from within DllMain:
Call CreateThread. Creating a thread can work if you do not synchronize with other threads, but it is risky.
The solution is to provide an initialization function/object for your logger library that the user must call explicitly in main, rather than having a global thread object initialized before main is entered. This function should create the thread.
Or create the thread on the first logging call using std::call_once. However, this involves an extra conditional check on each logging call. This check may be cheap but it is not free.
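A minimal sketch of the std::call_once variant, assuming a hypothetical logging_sketch namespace with free functions instead of the question's Logger class. The worker body is a placeholder, and start_count exists only to make the one-time start observable.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>

namespace logging_sketch {

std::once_flag start_flag;
std::thread worker;                     // default-constructed: no thread yet
std::atomic<bool> stop_requested{false};
std::atomic<int> start_count{0};        // illustration only

void ProcessLogEntries()
{
    // Placeholder loop: a real worker would drain a message queue here.
    while (!stop_requested.load()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

void Write(const std::string& message)
{
    // The thread is created on the first call only, after static
    // initialization (and DllMain) has finished.
    std::call_once(start_flag, [] {
        ++start_count;
        worker = std::thread(&ProcessLogEntries);
    });
    (void)message;  // enqueueing is omitted in this sketch
}

// Must be called explicitly before exit, not from a static destructor.
void Shutdown()
{
    stop_requested.store(true);
    if (worker.joinable()) worker.join();
}

}  // namespace logging_sketch
```

Joining in an explicit Shutdown() mirrors the explicit start: both ends of the thread's lifetime stay out of DLL load and unload.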
I cannot see any use of the logger thread. Having a thread as a member of a class does not mean that all member functions run in that thread. The destructor of Logger will never be called, because you have no Logger instance. iostream is not thread safe!
What you have to do:
Create some kind of storage to collect the log messages. This instance must be thread safe!
Push messages from the outside world into this instance. The instance itself must have its own thread, which reads from the storage and writes the data to the output. This must also be done in a thread-safe manner, because reading and writing happen on different threads!
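The two steps above can be sketched roughly as follows, assuming a hypothetical LogQueue class guarded by a mutex and a condition variable. RunLoggerOnce is only a small driver for illustration; it collects the lines in a vector where a real logger would write to its output.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class LogQueue {
public:
    void Push(std::string message)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(message));
        }
        ready_.notify_one();
    }

    // Tells the consumer that no more messages will arrive.
    void Close()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a message is available; returns false once the
    // queue is closed and empty.
    bool Pop(std::string& out)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool closed_ = false;
};

// Hypothetical driver: one worker thread drains the queue.
std::vector<std::string> RunLoggerOnce(const std::vector<std::string>& messages)
{
    LogQueue queue;
    std::vector<std::string> written;  // stands in for the real output
    std::thread worker([&] {
        std::string line;
        while (queue.Pop(line)) written.push_back(line);
    });
    for (const auto& m : messages) queue.Push(m);
    queue.Close();
    worker.join();  // after join, reading `written` here is safe
    return written;
}
```

Only the worker touches the output, so the output itself needs no extra locking; all cross-thread traffic goes through the queue.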

Sharing data with a dynamically loaded library (dlopen,dlsym)

My main program would load a simple dynamic library called hello.so
In main
void* handle = dlopen("./hello.so", RTLD_LAZY);
In main, pass a callback function called testing (defined somewhere in main.h) and invoke hello() from the dynamic library
typedef void (*callback)();
typedef void (*hello_t)( callback);
/* do something */
hello_t hello = (hello_t) dlsym(handle, "hello");
hello(testing);
In dynamic library,
#include
#include "main.h"
extern "C" void hello( void (*fn)() ) {
/* do something and then invoke callback function from main */
fn();
}
Are there other ways to allow functions/data of main to be called/used from dynamic library apart from using callbacks?
No, this is the preferred way of doing it, in my opinion. Any other way that I can think of involves making the DLL aware of the objects in the program it's linked with, which is most likely bad practice.
Regarding data, just a reminder though you didn't ask, it's usually best practice to copy any data that needs to be stored, if it's passed across library/program boundaries. You can get into a complete mess if you have the library using data whose lifetime is controlled by the program, and vice versa.
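As a small illustration of the copying advice, here is a sketch in which the library keeps its own copy of a string passed in by the program. The functions set_name and get_name and the helper StoredName are hypothetical and not part of the question's hello.so.

```cpp
#include <string>

// Library-owned storage, created on first use.
static std::string& StoredName()
{
    static std::string name;
    return name;
}

// The library copies the characters instead of keeping the caller's pointer,
// so the caller may reuse or free its buffer afterwards.
extern "C" void set_name(const char* name)
{
    StoredName() = (name != nullptr) ? name : "";
}

// The returned pointer is owned by the library and stays valid
// until the next call to set_name.
extern "C" const char* get_name()
{
    return StoredName().c_str();
}
```

Had set_name stored the raw pointer instead, the library would end up reading whatever the program later wrote into, or freed from, that buffer.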