I was trying to think of a way to deal with C libraries that expect you to initialize them globally, and I came up with this:
namespace {
class curl_guard {
public:
curl_guard()
{
puts("curl_guard constructor");
// TODO: curl_global_init
}
~curl_guard()
{
puts("curl_guard destructor");
// TODO: curl_global_cleanup
}
};
curl_guard curl{}; // nothing is ever printed to terminal
}
But when I link this into an executable and run it, there's no output, because it is optimized out (verified with objdump), even in a debug build. As I understand it, this is intended, because this type is never accessed in any way. Is there any way to mark it so that it is not excluded? Preferably without making it accessible to the user of the library. I'd prefer a general solution, but GCC-only also works for my purposes.
I am aware of the typical singleton patterns, but I think this is a special case where none of them apply, because I never want to access this object, even from internals. I just want a class tucked away with the single job of initializing and deinitializing a C library, triggered by the library being linked in rather than by something arbitrary like "just don't forget to construct this in main", which is as useful as going back to writing C code.
The real solution was pretty simple: the code can't live in a separate compilation unit of its own. It needs to be in one of the compilation units that the user of the library actually uses.
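A minimal sketch of that arrangement, assuming the library is built as a static archive: the linker only pulls an object file out of an archive when something references a symbol defined in it, so the guard sits next to a function that users already call. `http_ready()` and the `g_initialized` flag are hypothetical stand-ins so the sketch runs without libcurl; the real `curl_global_init` / `curl_global_cleanup` calls would replace the flag updates.

```cpp
namespace {

// Hypothetical stand-in for libcurl's global state. It is constant-initialized,
// so it already holds `false` before the guard's constructor runs.
bool g_initialized = false;

class curl_guard {
public:
    curl_guard()  { g_initialized = true;  }  // real code: curl_global_init(CURL_GLOBAL_DEFAULT)
    ~curl_guard() { g_initialized = false; }  // real code: curl_global_cleanup()
    curl_guard(const curl_guard&) = delete;
    curl_guard& operator=(const curl_guard&) = delete;
};

curl_guard guard{};  // constructed before main() once this object file is linked in

} // namespace

// Hypothetical public function of the library. Because user code references it,
// the linker keeps this object file, and the guard comes along with it.
bool http_ready()
{
    return g_initialized;
}
```

The guard stays invisible to users because it lives in an unnamed namespace; only the public function is exported.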
libcurl needs to be initialized globally, and initialization is NOT thread-safe, because the libraries it depends on cannot be initialized in a thread-safe manner either. So it is one of those things where global pre-main initialization is not only convenient, but useful. In fact, I am using several other libraries like that, and I separate them from libraries that do use threads in the background by initializing the latter in main() as opposed to pre-main.
And while there are some concerns with doing something like this at all, that's exactly what I need, including library aborting before main even runs if it's installed improperly.
Sure, it's "good practice" to ignore critical errors just to please the users of libraries who insist that libraries aborting is oh so terrible, but I'm sure no one likes to learn that in the next 50 seconds they will be dead from flying vertically downwards because of something that could've been found and fixed early.
Related
In C++ it's possible to call arbitrary functions at init time before main() has been entered, including calls to libraries that are not fully initialized yet, which can cause confusing errors. If I'm writing a library, is it possible in standard modern C++ (C++20 or so) to tell whether main() has started yet, so I can prevent the user of the library from using it before it's safe?
Considered solutions:
Having a library::init() function called at the beginning of main(). The library already works fine without an init(), so it seems silly to add one just to improve error reporting. If nothing else works, obviously this is the best solution.
Using a static initializer to determine when it's safe to use the library. It cannot be predicted what order static initializers run in, so this is not reliable.
Using a function-level static variable to initialize the library (essentially lazy initialization) so that initialization order doesn't matter. I already do this for some things, but this can't extend the protection to other libraries or system resources that are not usable before main().
Walking up the stack manually to find a frame for main(). I think not. : )
Answering my own question: from what I gather based on people's comments and looking at
Is there any way in C/C++ to detect if code is running during static initialization?
How can I call a function or statically initialize an object immediately before main?
it appears that there is no way to get this information in standard C++, and the standard solution is just to provide a library::init() function for the user to call in main().
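A minimal sketch of the `library::init()` approach, under the assumption that the library only needs to reject calls made before `init()`. The names `do_work` and `g_ready` are hypothetical. This does not detect whether main() has started; it only detects whether the user has called `init()` yet.

```cpp
#include <stdexcept>

namespace library {
namespace detail {
// Constant-initialized, so it is already `false` before any dynamic
// initializer in any translation unit runs.
bool g_ready = false;
} // namespace detail

// Hypothetical entry point the user calls at the top of main().
void init()
{
    detail::g_ready = true;
}

// Hypothetical library function that refuses to run before init().
int do_work(int x)
{
    if (!detail::g_ready)
        throw std::logic_error("library::do_work called before library::init");
    return x * 2;
}

} // namespace library
```

A static initializer that calls `library::do_work` before main() would hit the exception, which gives the clearer error report the question asks for.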
While working on a library depending on another third-party library, I ran into the problem that the third-party library requires a manual call to a global setup and cleanup function.
int main()
{
setup();
//do stuff
cleanup();
}
Now, this is not much of a problem in application code, since it is just syntactically horrifying, but it is actually a pain in a library.
The library is supposed to abstract these weird implementation details away, and requiring the user to call a setup function is like slapping my own face.
I tried to make them disappear
//namespace scope
struct AutoMagic
{
AutoMagic() {setup();}
~AutoMagic() {cleanup();}
};
AutoMagic automagic;
And then I realized this won't work across translation units as seen here and there
C++ global initialization order ignores dependencies?
Is initialization order guaranteed
C++ Dynamic initialization - across translation units
Initialization across c++ translation units
C++ static initialization order
Thus the question in the title.
You could try something like this:
std::shared_ptr<AutoMagic> init(){
static std::shared_ptr<AutoMagic> ptr(new AutoMagic{});
return ptr;
}
and in every translation unit do:
auto libMagic = init();
I didn't test it, but whichever translation unit calls init() first should create the object and whichever shared_ptr is unloaded last should call the destructor.
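A single-file sketch of how this could behave, with assumptions stated up front: `setup()` and `cleanup()` are replaced by hypothetical counting stubs, and the two handles stand in for two translation units. If `init()` lives in a header, it needs to be `inline` so that every translation unit shares the same function-local static.

```cpp
#include <memory>

// Hypothetical stand-ins for the third-party setup()/cleanup() pair; the
// counters exist only so the behaviour can be observed.
int g_setup_calls = 0;
int g_cleanup_calls = 0;
void setup()   { ++g_setup_calls; }
void cleanup() { ++g_cleanup_calls; }

struct AutoMagic {
    AutoMagic()  { setup(); }
    ~AutoMagic() { cleanup(); }
    AutoMagic(const AutoMagic&) = delete;
    AutoMagic& operator=(const AutoMagic&) = delete;
};

// Would sit in a header in a real library; `inline` lets all translation
// units share one function-local static.
inline std::shared_ptr<AutoMagic> init()
{
    static std::shared_ptr<AutoMagic> ptr = std::make_shared<AutoMagic>();
    return ptr;
}

// Each translation unit would hold one handle; two are simulated here.
static auto libMagicA = init();
static auto libMagicB = init();
```

Note that the function-local static keeps its own reference, and it finishes construction before any handle copied from it does. It is therefore destroyed after those handles, so in this sketch `cleanup()` runs when the static inside `init()` is destroyed at program exit.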
In the end, the only viable thing left is unportable compiler specific code
AutoMagic automagic __attribute__((init_priority(420)));
Unportable as in, basically all mainstream compilers support this except MSVC.
This attribute forces the variable to be initialized before all variables without the attribute and before those with a greater priority number.
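A small sketch of the ordering effect, assuming GCC or Clang on a platform that supports `init_priority` (GCC documents the usable range as 101 to 65535, with lower numbers initialized first). `Tracer` and `init_order()` are hypothetical helpers that only record the order in which constructors run.

```cpp
#include <vector>

// Function-local static, so it is safe to use from any initializer.
std::vector<int>& init_order()
{
    static std::vector<int> order;
    return order;
}

struct Tracer {
    explicit Tracer(int id) { init_order().push_back(id); }
};

// Declared first, but the larger number means it is initialized second.
Tracer late  __attribute__((init_priority(2000))) = Tracer(2);
// Declared second, but the smaller number means it is initialized first.
Tracer early __attribute__((init_priority(1000))) = Tracer(1);
```

Without the attributes, the two objects would be initialized in declaration order, since both are in the same translation unit.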
This goes to illustrate that there is indeed a demand for initialization priority guarantees and that it is obviously still not being given by the standard.
Sadly, you can't really get out of this. You had a nice idea with the global instance, but indeed that doesn't work too well with dependencies. It has the further problem of potentially being optimised away by the linker unless you "use" it or are careful with build options; I encountered this issue with instances of an entity registration class inside my shared library.
Two solid options:
Add similar setup/teardown functions to your own library that defer to setup() and cleanup()
Change libraries
I would apologise profusely, but really this is the third-party author's fault. So you should get them to apologise.
Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
};
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
};
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myWrappedObject::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year, so it's not really a surprise to me that something like this can happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
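As a small illustration of that advice (the function name `sum` is hypothetical), a `(Type * array, int count)` signature can usually be replaced by a container parameter and a range-for, which leaves no separate count to drift out of sync with the data:

```cpp
#include <vector>

// C-style signature the answer warns against:
//   long sum(const int* array, int count);  // caller must keep count in sync

// Container-based version: the size travels with the data.
long sum(const std::vector<int>& values)
{
    long total = 0;
    for (int v : values)  // range-for: no index that can run past the end
        total += v;
    return total;
}
```

A call such as `sum({1, 2, 3})` builds the vector in place, so there is no raw array or length for the caller to manage.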
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.
I'm working on a project where we have several executables that share several object files. We want to add logging to all of the executables, and have a library for doing so.
However, it seems clumsy to go to the main() function of every executable file and add in the same boiler-plate function call to start the logging. It means we write the same thing over again, and lose out on maintainability and DRY ("don't repeat yourself"). It would be nice if we could systematically ensure that logging started before the main function gets called.
It occurred to me there are functions in libc++ that make the call to main, and it may be possible to override them. However, I don't know what they are and imagine this could break things if we're not careful. Does anyone know how this would be done? Or, if that's too over-the-top, any other suggestions on how to proceed?
We're using C++11 with g++ 4.8 if it makes any difference.
You do not need to do this by modifying main().
You should instead create a class at global scope in a shared object library. The constructor of this class will perform the "initialisation" you want to do, before main() runs, and its destructor will run after main().
The issue you need to deal with is that the order of this initialisation and destruction is not guaranteed to be deterministic with regards to any other global-scope objects. All of this could go in one .cpp compilation unit.
class LoggingManager // you can make this a singleton but not necessary
{
public:
LoggingManager();
~LoggingManager();
};
LoggingManager::LoggingManager()
{
// your initialisation code goes here
}
LoggingManager::~LoggingManager()
{
// your clean-up code goes here. It should not throw
}
LoggingManager loggingManagerStaticInstance;
Note that there is a small danger of the "static initialization" issue which means in reality your loggingManagerStaticInstance might not be loaded until your compilation unit is first accessed.
In reality it doesn't matter if this is after main() as long as the initialisation happens before it is first needed (a bit like a singleton) but it means your compilation unit might need to contain something that is guaranteed to get pulled in.
If you want to "stick" to GNU or similar, they provide __attribute__((constructor)), which might resolve it, although there is an easier way: have some dummy extern int, or a dummy function that returns an int, that gets referenced from within whatever header you actually use to implement logging.
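A minimal sketch of the `__attribute__((constructor))` route, assuming GCC or Clang. The names `start_logging`, `stop_logging`, and `logging_started` are hypothetical, and the flag stands in for opening a real log sink.

```cpp
namespace {
// Constant-initialized, so it is `false` before any constructor function runs.
bool g_logging_started = false;
} // namespace

// GCC/Clang extension: this function runs before main() is entered.
__attribute__((constructor)) static void start_logging()
{
    g_logging_started = true;   // real code: open the log sink here
}

// Counterpart that runs after main() returns or exit() is called.
__attribute__((destructor)) static void stop_logging()
{
    g_logging_started = false;  // real code: flush and close the log sink
}

// Hypothetical accessor that the executables (or the logging header) call.
bool logging_started()
{
    return g_logging_started;
}
```

The same linking caveat applies as for the global object: if this file is built into a static library, something in the executable has to reference a symbol from it (here `logging_started`), or the linker may leave the object file out entirely.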
I am working on a C++ project in Xcode, and one of my .cpp files instantiates some variables. Another .cpp file in the application uses these variables to instantiate another object and needs them to be instantiated so that it does not dereference a null pointer. My solution so far was simply to drag-and-drop (Xcode simplicity) the first file above the second one in the build-phase order. It works fine now, but I have a feeling that it is not the optimal solution, and that there is something fundamentally wrong with my code if I need to organise the compile order manually for the application to run properly.
Should I never instantiate something outside of functions, or what is the golden rule? Thanks.
EDIT: An example as requested.
The problem lies in a Observer/Event system.
In a source-file I do this:
Trigger* mainMenu_init = new Trigger(std::vector<Event*> {
// Event(s):
event_gameInit,
}, [](Event* e) {
// Action(s):
std::cout << "Hello World" << std::endl;
});
In the trigger's constructor, each Event is asked to add it as an observer:
for(Event* event : events)
event->addObserver(this);
BUT, the events are just external pointers, so if they are not initialised (which they are in another source file) this initialisation will fail. So what I found was that if I do not organise the compilation phase myself, random triggers will not work while others will, depending on whether they are built before or after the Event.cpp file.
I assume you are talking about non-trivial initialization of global variables (or of static variables), such as (at the top level of a file):
MyObject *myPtrObject = new MyObject(42, "blah");
MyObject myOtherObject;
("trivial" initialization is, roughly speaking, when there is no constructor involved and everything just involves constants; so if you initialize a pointer to zero, it will be zero before any code is actually invoked)
The order of initialization between different source files is NOT GUARANTEED in C++. It happens to depend on the order of the files with Apple's current system, but THAT MIGHT CHANGE.
So yes, there is something fundamentally wrong.
Golden Rules
IMPORTANT: In the initialization of a global object, don't use any other global objects from different source files.
Don't overuse global variables. They have numerous disadvantages from a software design point of view.
Keep initialization of global objects simple. That will make it easier to stick to the first rule.
Not knowing anything about your program, it's of course hard to give more concrete design advice.
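One common way to follow the first rule is the "construct on first use" idiom: replace the global pointer with a function that returns a reference to a function-local static. The sketch below adapts the names from the question (`Event`, `Trigger`, `addObserver`, `event_gameInit`, `mainMenu_init`); the class bodies are guesses made only so the example is self-contained, and turning `event_gameInit` into a function is an assumed change to the original design.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class Trigger;

class Event {
public:
    void addObserver(Trigger* t) { observers_.push_back(t); }
    std::size_t observerCount() const { return observers_.size(); }
private:
    std::vector<Trigger*> observers_;
};

class Trigger {
public:
    Trigger(std::vector<Event*> events, std::function<void(Event*)> action)
        : action_(std::move(action))
    {
        for (Event* event : events)
            event->addObserver(this);
    }
private:
    std::function<void(Event*)> action_;
};

// Would live in Event.cpp: the Event is created the first time anyone asks
// for it, so callers in other source files never see an uninitialized pointer.
Event& event_gameInit()
{
    static Event instance;
    return instance;
}

// Would live in another source file: works regardless of build or link order.
Trigger mainMenu_init({&event_gameInit()}, [](Event*) { /* action goes here */ });
```

Because the static inside `event_gameInit()` is initialized on the first call, the order in which the source files are compiled or linked no longer matters.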