How to execute an auto-setup function - c++

While working on a library depending on another third party library, I ran into the problem that the third party library is absolutely trash requires a manual call to a global setup and cleanup function.
int main()
{
setup();
//do stuff
cleanup();
}
Now, this is a total sh*t show not much of a problem in application code since it is just syntactically horrifying, but is actually a pain in a library.
The library is supposed to abstract these weird implementation details away, and requiring the user to call a setup function is like slapping my own face.
I tried to make them disappear
//namespace scope
struct AutoMagic
{
AutoMagic() {setup();}
~AutoMagic() {cleanup();}
};
AutoMagic automagic;
And then I realized this won't work across translation units as seen here and there
C++ global initialization order ignores dependencies?
Is initialization order guaranteed
C++ Dynamic initialization - across translation units
Initialization across c++ translation units
C++ static initialization order
Thus the question in the title.

You could try something like this:
std::shared_ptr<AutoMagic> init(){
static std::shared_ptr<AutoMagic> ptr(new AutoMagic{});
return ptr;
}
and in every translation unit do:
auto libMagic = init();
I didn't test it, but whichever translation unit calls init() first should create the object and whichever shared_ptr is unloaded last should call the destructor.

In the end, the only viable thing left is unportable compiler specific code
AutoMagic automagic __attribute__((init_priority(420));
Unportable as in, basically all mainstream compilers support this except msvc.
This attribute forces the variable to be initialized before all variables without this attribute and those with priority number greater than this variable.
This goes to illustrate that there is indeed a demand for initialization priority guarantees and that it is obviously still not being given by the standard.

Sadly, you can't really get out of this. You had a nice idea with the global instance, but indeed that doesn't work too well with dependencies. It has the further problem of potentially being optimised away by the linker unless you "use" it or are careful with build options; I encountered this issue with instances of an entity registration class inside my shared library.
Two solid options:
Add similar pieces of cr*p setup/teardown functions to your own library, that defer to setup() and cleanup()
Change libraries
I would apologise profusely, but really this is the third-party author's fault. So you should k*ll th*m get them to apologise.

Related

Is there a portable way to tell if main() has started yet?

In C++ it's possible to call arbitrary functions at init time before main() has been entered, including calls to libraries that are not fully initialized yet, which can cause confusing errors. If I'm writing a library, is it possible in standard modern C++ (C++20 or so) to tell whether main() has started yet, so I can prevent the user of the library from using it before it's safe?
Considered solutions:
Having a library::init() function called at the beginning of main(). The library already works fine without an init(), so it seems silly to add one just to improve error reporting. If nothing else works, obviously this is the best solution.
Using a static initializer to determine when it's safe to use the library. It cannot be predicted what order static initializers run in, so this is not reliable.
Using a function-level static variable to initialize the library (essentially lazy initialization) so that initialization order doesn't matter. I already do this for some things, but this can't extend the protection to other libraries or system resources that are not usable before main().
Walking up the stack manually to find a frame for main(). I think not. : )
Answering my own question: from what I gather based on people's comments and looking at
Is there any way in C/C++ to detect if code is running during static initialization?
How can I call a function or statically initialize an object immediately before main?
it appears that there is no way to get this information in standard C++, and the standard solution is just to provide a library::init() function for the user to call in main().

constructing completely private and anonymous static singleton

I was trying to think of a way how to deal with C libraries that expect you to globally initialize them and I came up with this:
namespace {
class curl_guard {
public:
curl_guard()
{
puts("curl_guard constructor");
// TODO: curl_global_init
}
~curl_guard()
{
puts("curl_guard destructor");
// TODO: curl_global_cleanup
}
};
curl_guard curl{}; // nothing is ever printed to terminal
}
But when I link this into an executable and run it, there's no output, because it is optimized out (verified with objdump), even in debug build. As I understand, this is intended, because this type is never accessed in any way. Is there any way to mark it so that it is not excluded? Preferably without making it accessible to the user of the library. I'd prefer a general solution, but GCC-only also works for my purposes.
I am aware of typical singleton patterns, but I think this is a special case where none of them apply, because I never want to access this even from internals, just simply have a class tucked away which has one simple job of initializing and deinitializing a C library which is caused by the library being linked in and not something arbitrary like "just don't forget to construct this in main" which is as useful as going back to writing C code.
The real solution was pretty simple - the code can't be separate and needs to be in one of compilation units that are used by the user of the library.
libcurl needs to be initialized globally, and initialization is NOT thread safe, because libraries it depends on also cannot be initialized in thread-safe manner, so it is one of those things where global pre-main initialization is not only convenient, but useful. I in fact am using several other libraries which are like that, and I separate them from libraries that do use threads in the background, by putting those in the main as opposed to pre-main.
And while there are some concerns with doing something like this at all, that's exactly what I need, including library aborting before main even runs if it's installed improperly.
Sure, it's "good practice" to ignore critical errors just to please the users of libraries who insist on libraries aborting being oh so terrible, but I'm sure noone likes to know that in next 50 seconds, they will be dead due to flying vertically downwards due to something that could've been found and fixed early.

Override the call to main()?

I'm working on a project where we have several executables that share several object files. We want to add logging to all of the executables, and have a library for doing so.
However, it seems clumsy to go to the main() function of every executable file and add in the same boiler-plate function call to start the logging. It means we write the same thing over again, and loose out on maintainability and DRY ("don't repeat yourself"). It would be nice if we could systematically ensure that logging started before the main function gets called.
It occurred to me there are functions in libc++ that make the call to main, and it may be possible to override them. However, I don't know what they are and imagine this could break things if we're not careful. Does anyone know how this would be done? Or, if that's too over-the-top, any other suggestions on how to proceed?
We're using C++11 with g++ 4.8 if it makes any difference.
You do not need to do this by modifying main().
You should instead create a class at global scope in a shared object library. The constructor of this class will perform the "initialisation" you want to do, before main() runs, and its destructor will run after main().
The issue you need to deal with is that the order of this initialisation and destruction is not guaranteed to be deterministic with regards to any other global-scope objects. All of this could go in one .cpp compilation unit.
class LoggingManager // you can make this a singleton but not necessary
{
public:
LoggingManager();
~LoggingManager();
};
LoggingManager::LoggingManager()
{
// your initialisation code goes here
}
LoggingManager::~LoggingManager()
{
// your clean-up code goes here. It should not throw
}
LoggingManager loggingManagerStaticInstance;
Note that there is a small danger of the "static initialization" issue which means in reality your loggingManagerStaticInstance might not be loaded until your compilation unit is first accessed.
In reality it doesn't matter if this is after main() as long as the initialisation happens before it is first needed (a bit like a singleton) but it means your compilation unit might need to contain something that is guaranteed to get pulled in.
If you want to "stick" to gnu or similar they provide __attribute__(constructor) which might resolve it although there is an easier way of having some dummy extern int implemented or dummy function that returns an int that gets called from within whatever header you do actually use to implement logging.

Is it bad practice if I have to organise Compile Sources

I am working on a C++ project in Xcode, and one of my .cpp files instantiates some variables. Another .cpp file in the application uses these variables to instantiate another object and needs them to be instantiated to not throw a null-pointer exception. My solution so far was simply to drag-drop (XCode simplicity) the first file over the second one in the build-phase order. It works fine now, but I have a feeling that it is not the optimal solution, and that there is something fundamentally wrong with my code if I need to organise the compile order manually for the application to run properly.
Should I never instantiate something outside of functions, or what is the golden rule? Thanks.
EDIT: An example as requested.
The problem lies in a Observer/Event system.
In a source-file I do this:
Trigger* mainMenu_init = new Trigger(std::vector<Event*> {
// Event(s):
event_gameInit,
}, [](Event* e) {
// Action(s):
std::cout << "Hello World" << std::endl;
});
In the trigger's constructor the Event is asked to add is as an observer:
for(Event* event : events)
event->addObserver(this);
BUT, the events are just external pointers, so if they are not initialised (which they are in another source-file) this initialisation will fail. So what I found was that if I do not organise the compilation-phase myself, random triggers will not work while other will, depending on if they are built before or after the Event.cpp file.
I assume you are talking about non-trivial initialization of global variables (or of static variables), such as (at the top level of a file):
MyObject *myPtrObject = new MyObject(42, "blah");
MyObject myOtherObject;
("trivial" initialization is, roughly speaking, when there is no constructor involved and everything just involves constants; so if you initialize a pointer to zero, it will be zero before any code is actually invoked)
The order of initialization between different source files is NOT GUARANTEED in C++. It happens to depend on the order of the files with Apple's current system, but THAT MIGHT CHANGE.
So yes, there is something fundamentally wrong.
Golden Rules
IMPORTANT: In the initialization of a global object, don't use any other global objects from different source files.
Don't overuse global variables. They have numerous disadvantages from a software design point of view.
Keep initialization of global objects simple. That will make it easier to stick to the first rule.
Not knowing anything about your program, it's of course hard to give more concrete design advice.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.
If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with the "Execution starts in the constructor of my global object", beware that you set yourself up to many problems related to the order of constructions of namespace scope objects defined in different translation units (So what is the entry point? The answer is: You will have multiple entry points, and what entry point is executed first is unspecified!). In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceeding include of <iostream>).
You don't have those problems and don't need to work around them (wich can be very tricky) if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having him tell the name of a class which is instantiated within main. This works similar to the following example
class MyApp {
public:
MyApp(std::vector<std::string> const& argv);
int run() {
/* code comes here */
return 0;
};
};
IMPLEMENT_APP(MyApp);
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
int main(int argc, char **argv) { \
AppClass m(std::vector<std::string>(argv, argv + argc)); \
return m.run(); \
}
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of them is that they work with different forms of higher level entry points. For example, Windows GUI programs start up in a WinMain function - IMPLEMENT_APP could then define such a function instead on that platform.
Yes! You can do away with main.
Disclaimer: You asked if it were possible, not if it should be done. This is a totally un-supported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
Find crt0.c in your compiler's CRT source directory.
Add crt0.c to your project (a copy, not the original).
Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; How difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
Create a new C++ Win32 Console Application (click next and check Empty Project).
Add new item.. C++ File, but name it crt0.c (not .cpp).
Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
Right-click on crt0.c and select Properties.
Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
Click OK.
Right-click on the project name and select Properties.
Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
Click OK.
Add new item.. C++ File, name it whatever (app.cpp for this example).
Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>
class App
{
public: App()
{
std::cout << "Hello, World! I have no main!" << std::endl;
}
};
static App theApp;
Added
I removed the superflous exit call and the blurb about lifetime as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. Main is not a function provided by the CRT, you must add it yourself. The call to that function is in the CRT provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
Initialize the CRT
Call main
Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is it removes the module 'crt0' from the CRT and replaces it with a new one. This is why you can't use the Runtime DLL, there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely. In this case a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.
Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.
Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name who's main static function is called. In Python, there's the __main__ module. In perl there's the script you specify on the command line.
If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.
If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.
If you are coding for windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite awhile, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DLLMain of the C runtime DLL
During DLLMain, you are holding the DLL loader lock.
Tring to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.
There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).