C++: When (and how) are C++ Global Static Constructors Called? - c++

I'm working on some C++ code and I've run into a question which has been nagging me for a while... Assuming I'm compiling with GCC on a Linux host for an ELF target, where are global static constructors and destructors called?
I've heard there's a function _init in crtbegin.o, and a function _fini in crtend.o. Are these called by crt0.o? Or does the dynamic linker actually detect their presence in the loaded binary and call them? If so, when does it actually call them?
I'm mainly interested to know so I can understand what's happening behind the scenes as my code is loaded, executed, and then unloaded at runtime.
Thanks in advance!
Update: I'm basically trying to figure out the general time at which the constructors are called. I don't want to make assumptions in my code based on this information, it's more or less to get a better understanding of what's happening at the lower levels when my program loads. I understand this is quite OS-specific, but I have tried to narrow it down a little in this question.

When talking about non-local static objects there are not many guarantees. As you already know (and it's also been mentioned here), it should not write code that depends on that. The static initialization order fiasco...
Static objects goes through a two-phase initialization: static initialization and dynamic initialization. The former happens first and performs zero-initialization or initialization by constant expressions. The latter happens after all static initialization is done. This is when constructors are called, for example.
In general, this initialization happens at some time before main(). However, as opposed to what many people think even that is not guaranteed by the C++ standard. What is in fact guaranteed is that the initialization is done before the use of any function or object defined in the same translation unit as the object being initialized. Notice that this is not OS specific. This is C++ rules. Here's a quote from the Standard:
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of
namespace scope is done before the first statement of main. If the initialization is deferred to some point
in time after the first statement of main, it shall occur before the first use of any function or object defined
in the same translation unit as the object to be initialized

This depends heavy on the compiler and runtime. It's not a good idea to make any assumptions on the time global objects are constructed.
This is especially a problem if you have a static object which depends on another one being already constructed.
This is called "static initialization order fiasco". Even if thats not the case in your code, the C++Lite FAQ articles on that topic are worth a read.

This is not OS specific, rather its compiler specific.
You have given the answer, initialization is done in __init.
For the second part, in gcc you can guarantee the order of initialization with a __attribute__((init_priority(PRIORITY))) attached to a variable definition, where PRIORITY is some relative value, with lower numbers initialized first.

The grantees you have:
All static non-local objects in the global namespace are constructed before main()
All static non-local objects in another namespace are constructed before any functions/methods in that namespace are used (Thus allowing the compiler to potentially lazy evaluate them [but don't count on this behavior]).
All static non-local objects in a translation unit are constructed in the order of declaration.
Nothing is defined about the order between translation units.
All static non-local objects are destroyed in the reverse order of creation. (This includes the static function variables (which are lazily created on first use).
If you have globals that have dependencies on each other you have two options:
Put them in the same translation unit.
Transform them into static function variables retrieved and constructed on first use.
Example 1: Global A's constructor uses Global log
class AType
{ AType() { log.report("A Constructed");}};
LogType log;
AType A;
// Or
Class AType()
{ AType() { getLog().report("A Constructed");}};
LogType& getLog()
{
static LogType log;
return log;
}
// Define A anywhere;
Example Global B's destructor uses Global log
Here you have to grantee that the object log is not destroyed before the object B. This means that log must be fully constructed before B (as the reverse order of destruction rule will then apply). Again the same techniques can be used. Either put them in the same translation unit or use a function to get log.
class BType
{ ~BType() { log.report("B Destroyed");}};
LogType log;
BType B; // B constructed after log (so B will be destroyed first)
// Or
Class BType()
{ BType() { getLog();}
/*
* If log is used in the destructor then it must not be destroyed before B
* This means it must be constructed before B
* (reverse order destruction guarantees that it will then be destroyed after B)
*
* To achieve this just call the getLog() function in the constructor.
* This means that 'log' will be fully constructed before this object.
* This means it will be destroyed after and thus safe to use in the destructor.
*/
~BType() { getLog().report("B Destroyed");}
};
LogType& getLog()
{
static LogType log;
return log;
}
// Define B anywhere;

According to the C++ standard they are called before any function or object of their translation unit is used. Note that for objects in the global namespace this would mean they are initialized before main() is called. (See ltcmelo's and Martin's answers for mote details and a discussion of this.)

Related

When is a `static const std::map` initialized, when not a member of class? [duplicate]

I'm working on some C++ code and I've run into a question which has been nagging me for a while... Assuming I'm compiling with GCC on a Linux host for an ELF target, where are global static constructors and destructors called?
I've heard there's a function _init in crtbegin.o, and a function _fini in crtend.o. Are these called by crt0.o? Or does the dynamic linker actually detect their presence in the loaded binary and call them? If so, when does it actually call them?
I'm mainly interested to know so I can understand what's happening behind the scenes as my code is loaded, executed, and then unloaded at runtime.
Thanks in advance!
Update: I'm basically trying to figure out the general time at which the constructors are called. I don't want to make assumptions in my code based on this information, it's more or less to get a better understanding of what's happening at the lower levels when my program loads. I understand this is quite OS-specific, but I have tried to narrow it down a little in this question.
When talking about non-local static objects there are not many guarantees. As you already know (and it's also been mentioned here), it should not write code that depends on that. The static initialization order fiasco...
Static objects goes through a two-phase initialization: static initialization and dynamic initialization. The former happens first and performs zero-initialization or initialization by constant expressions. The latter happens after all static initialization is done. This is when constructors are called, for example.
In general, this initialization happens at some time before main(). However, as opposed to what many people think even that is not guaranteed by the C++ standard. What is in fact guaranteed is that the initialization is done before the use of any function or object defined in the same translation unit as the object being initialized. Notice that this is not OS specific. This is C++ rules. Here's a quote from the Standard:
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of
namespace scope is done before the first statement of main. If the initialization is deferred to some point
in time after the first statement of main, it shall occur before the first use of any function or object defined
in the same translation unit as the object to be initialized
This depends heavy on the compiler and runtime. It's not a good idea to make any assumptions on the time global objects are constructed.
This is especially a problem if you have a static object which depends on another one being already constructed.
This is called "static initialization order fiasco". Even if thats not the case in your code, the C++Lite FAQ articles on that topic are worth a read.
This is not OS specific, rather its compiler specific.
You have given the answer, initialization is done in __init.
For the second part, in gcc you can guarantee the order of initialization with a __attribute__((init_priority(PRIORITY))) attached to a variable definition, where PRIORITY is some relative value, with lower numbers initialized first.
The grantees you have:
All static non-local objects in the global namespace are constructed before main()
All static non-local objects in another namespace are constructed before any functions/methods in that namespace are used (Thus allowing the compiler to potentially lazy evaluate them [but don't count on this behavior]).
All static non-local objects in a translation unit are constructed in the order of declaration.
Nothing is defined about the order between translation units.
All static non-local objects are destroyed in the reverse order of creation. (This includes the static function variables (which are lazily created on first use).
If you have globals that have dependencies on each other you have two options:
Put them in the same translation unit.
Transform them into static function variables retrieved and constructed on first use.
Example 1: Global A's constructor uses Global log
class AType
{ AType() { log.report("A Constructed");}};
LogType log;
AType A;
// Or
Class AType()
{ AType() { getLog().report("A Constructed");}};
LogType& getLog()
{
static LogType log;
return log;
}
// Define A anywhere;
Example Global B's destructor uses Global log
Here you have to grantee that the object log is not destroyed before the object B. This means that log must be fully constructed before B (as the reverse order of destruction rule will then apply). Again the same techniques can be used. Either put them in the same translation unit or use a function to get log.
class BType
{ ~BType() { log.report("B Destroyed");}};
LogType log;
BType B; // B constructed after log (so B will be destroyed first)
// Or
Class BType()
{ BType() { getLog();}
/*
* If log is used in the destructor then it must not be destroyed before B
* This means it must be constructed before B
* (reverse order destruction guarantees that it will then be destroyed after B)
*
* To achieve this just call the getLog() function in the constructor.
* This means that 'log' will be fully constructed before this object.
* This means it will be destroyed after and thus safe to use in the destructor.
*/
~BType() { getLog().report("B Destroyed");}
};
LogType& getLog()
{
static LogType log;
return log;
}
// Define B anywhere;
According to the C++ standard they are called before any function or object of their translation unit is used. Note that for objects in the global namespace this would mean they are initialized before main() is called. (See ltcmelo's and Martin's answers for mote details and a discussion of this.)

Using the keyword 'new' in file scope?

Given this class in the header file:
class ClassA
{
public:
ClassA(){};
}
Then in file.cpp
#include file.h
ClassA* GlobalPointerToClassAType = new ClassA();
a. Is it allowed, and is it good practice to use the keyword 'new' to allocate memory for an object in the heap(?) in lines of file-scope?
b. If it is allowed, then when exactly does the constructor ClassA() is actually called?
c. How does it differ if I wrote instead this line:
ClassA GlobalInstanceOfClassAType = ClassA();
in terms of the time of calling the constructor, in terms of memory efficiency, and in terms of good practice?
a. Is it allowed, and is it good practice to use the keyword 'new' to allocate memory for an object in the heap(?) in lines of file-scope?
It is allowed. Whether is it good practice to use new here is opinion based. And i predict that most people will answer no.
b. If it is allowed, then when exactly does the constructor ClassA() is actually called?
Let's start from some concepts.
In C++, all objects in a program have one of the following storage durations:
automatic
static
thread (since C++11)
dynamic
And if you check the cppreference, it claim:
static storage duration. The storage for the object is allocated when the program begins and deallocated when the program ends. Only one instance of the object exists. All objects declared at namespace scope (including global namespace) have this storage duration, plus those declared with static or extern. See Non-local variables and Static local variables for details on initialization of objects with this storage duration.
So, GlobalPointerToClassAType has static storage duration, it fit the statement that "All objects declared at namespace scope (including global namespace) have this storage duration...".
And if you get deeper into the link of the above section, you will find:
All non-local variables with static storage duration are initialized as part of program startup, before the execution of the main function begins (unless deferred, see below). All non-local variables with thread-local storage duration are initialized as part of thread launch, sequenced-before the execution of the thread function begins. For both of these classes of variables, initialization occurs in two distinct stages:
There's more detail in the same site, you can go deeper if you want to get more, but for this question, let's only focus on the initialization time. According to the reference, The constructor ClassA() might be called before the execution of the main function begins (unless deferred).
What is "deferred"? The answer is in the below sections:
It is implementation-defined whether dynamic initialization happens-before the first statement of the main function (for statics) or the initial function of the thread (for thread-locals), or deferred to happen after.
If the initialization of a non-inline variable (since C++17) is deferred to happen after the first statement of main/thread function, it happens before the first odr-use of any variable with static/thread storage duration defined in the same translation unit as the variable to be initialized. If no variable or function is odr-used from a given translation unit, the non-local variables defined in that translation unit may never be initialized (this models the behavior of an on-demand dynamic library). However, as long as anything from a translation unit is odr-used, all non-local variables whose initialization or destruction has side effects will be initialized even if they are not used in the program.
Let's see a tiny example, from godbolt. I use clang, directly copy your code, except that the Class A and main are defined in the same translation unit. You can see clang generate some section like __cxx_global_var_init, where the class ctor is called.

Static initialization order fiasco for built-in objects/libraries

If I have some variables that I'm initializing statically (before main begins), am I free to use any built-in stuff in these constructors, like <iostream> or <vector>?
The "static initialization order fiasco" occurs because the order in which static variables are initialized (among different translation units) is undefined.
So what if something benign like
std::cout << "Hello" << std::endl;
happens to rely on some static variable inside <iostream> being initialized ahead of time? (I'm not saying it does, but assume it did.) What's to say that these static variables inside built-in libraries are initialized before my own static variables? Like inside say "Person.cpp" or whatever.
Edit: Is std::cout guaranteed to be initialized? was suggested as a duplicate to this question. However, I think my question is slightly broader in scope because it asks about any standard built-in library, rather than just <iostream>.
The C++ standards make no strong statements of the behavior of the program before the start of main, or after main has completed.
In experience, I have found a number of issues where the C++ runtime has destroyed some object (e.g. resources used for management of std::mutex), which have created a deadlock in the destruction of a complex type.
I would recommend the following pattern for static's
C++ creates objects declared static in a function in the order they are executed. This leaves a pattern which will ensure that objects exist as they are needed.
AnObject * AnObject::getInstance() {
static AnObject a;
return &a;
}
This should be executed to get hold of the global, and will occur at the point when getInstance() is called.
C++ 11 onwards
This code is guaranteed to be thread-safe, where if multiple threads of execution arrive in getInstance, only one will construct the object and the rest will wait.
Pre C++ 11
Creating this pattern replaces ill-defined order with thread safety issues.
Luckily, it will be possible to create a criticalsection/mutex primative in main, which is able to arbitrate.
In some OSs (e.g. InitializeCriticalSection and Windows), these locks can safely be created before main as static variables.
AnObject * AnObject::getInstance() {
EnterCriticalSection( &aObjectcrit );
static AnObject a;
LeaveCriticalSection( &aObjectcrit );
return &a;
}
Assuming you have initialized aObjectcrit either in or before this function is called.
The result of this pattern is a form of onion construction, where objects are required in the order they are needed, and when the program exits, they are destroyed in the reverse order they were created in.
You're confusing objects (std::cout) and types (std::vector). The former is covered by the linked question, and the latter is a type. Static initialization applies to objects, but not to types.

Is This "Idiom" Well-Defined?

Let's say I have a C library that has its own (de)initialization routine, as many of them do.
init_API();
deinit_API();
now let's say I want to provide another level of abstraction to the user, and abstract away these calls, using a class that's instantiated statically. The method I was thinking of:
struct API_initializer{
API_initializer(){
init_API();
if(API_init_failure)
throw (APIFailureException); //important
}
~API_initializer(){
deinit_API();
}
};
struct API_initializer_holder{
static API_initializer initializer;
};
Now, my question is, is this well-defined behavior? I.E., will the static constructor be called at some reasonable point, and will all the (static) variables the C API need be initialized properly? In addition, is it bad practice to throw an exception that the user has no way of catching?
In my experience it is a bad idea to try and rely on static initialization order.
A better idea would be to get rid of the holder, and just go:
int main()
{
API_initializer foo;
// rest of program
}
If you really want it to throw on failure then include a try..catch block.
NB. Make the class non-copyable to prevent accidents.
No, the program will break if the initialization is failed. You cannot catch the exception and handle it because the global variables(static member data) is initialized before main().
It's better to call init_API() and deinit_API() in main().
This is covered by ยง3.6.2 [basic.start.init]/p4-6 of the standard:
It is implementation-defined whether the dynamic initialization of a
non-local variable with static storage duration is done before the
first statement of main. If the initialization is deferred to some
point in time after the first statement of main, it shall occur before
the first odr-use of any function or variable defined in the same
translation unit as the variable to be initialized.35
[...]
If the initialization of a non-local variable with static or thread storage duration exits via an exception, std::terminate is called.
35 A non-local variable with static storage duration having initialization with side-effects must be initialized even if it is not odr-used.
As there's no way your static object can be in the same translation unit as a function in the C library you are wrapping, the only way to guarantee initialization would be to define it in the same translation unit as main(), which doesn't strike me as adding any kind of convenience to the end user.
In any event, it's probably not a good idea to cause std::terminate to be called by throwing an exception if initialization fails. The end user probably want to do better error handling than that.
The answer is: it depends. If the external API is implemented
in C (so that it will have no dynamic initialization), and it is
not used in any of your static initializers (which should be the
case, since there is no way that they can safely do init_API),
then the idiom is more or less safe (except that you shouldn't
allow an exception to escape from a static initializer. Whether
it is a good idea or not is another question; if the
initialization can fail, you may not want to execute it before
you can catch and handle the error, which means that you're in
main.

How to do static de-initialization if the destructor has side effects and the object is accessed from another static object's destructor?

There is a simple and well-known pattern to avoid the static initialization fiasco, described in section 10.13 of the C++ FAQ Lite.
In this standard pattern, there is a trade-off made in that either the constructed object gets never destructed (which is not a problem if the destructor does not have important side effects) or the static object cannot safely be accessed from another static object's destructor (see section 10.14 of the C++ FAQ Lite).
So my question is: How do you avoid the static de-initialization fiasco if a static object's destructor has important side effects that must eventually occur and the static object must be accessed by another static object's destructor?
(Note: the FAQ-lite mentions this question is answered in FAQ 16.17 of C++ FAQs: Frequently Asked Questions by M. Cline and and G. Lomow. I do not have access to this book, which is why I ask this question instead.)
function static objects like global objects are guaranteed to be destroyed (assuming they are created).
The order of destruction is the inverse of creation.
Thus if an object depends on another object during destruction you must guarantee that it is still available. This is relatively simple as you can force the order of destruction by making sure the order of creation is done correctly.
The following link is about singeltons but describes a similar situation and its solution:
Finding C++ static initialization order problems
Extrapolating to the general case of lazy initialized globals as described in the FAQ lite we can solve the problem like this:
namespace B
{
class B { ... };
B& getInstance_Bglob;
{
static B instance_Bglob;
return instance_Bglob;;
}
B::~B()
{
A::getInstance_abc().doSomthing();
// The object abc is accessed from the destructor.
// Potential problem.
// You must guarantee that abc is destroyed after this object.
// To gurantee this you must make sure it is constructed first.
// To do this just access the object from the constructor.
}
B::B()
{
A::getInstance_abc();
// abc is now fully constructed.
// This means it was constructed before this object.
// This means it will be destroyed after this object.
// This means it is safe to use from the destructor.
}
}
namespace A
{
class A { ... };
A& getInstance_abc()
{
static A instance_abc;
return instance_abc;
}
}
As long as the other object's static destructor runs first, you're ok. You can assure this by having the other object get constructed before "object A". As long as both objects are declared in the same compilation unit, they will be initialized in the order they appear in the source, and destructed in the opposite order.
If you need this to happen across compilation units, you're out of luck. Better is to create them dynamically at runtime and destroy them at the end of main, rather than making them static.
It is a bit of a hack but I would add some static bools to keep track of the order of de initialization Then the object that ends up last does the clean up even if it isn't the owner.