If I have some variables that I'm initializing statically (before main begins), am I free to use any built-in stuff in these constructors, like <iostream> or <vector>?
The "static initialization order fiasco" occurs because the order in which static variables are initialized (among different translation units) is undefined.
So what if something benign like
std::cout << "Hello" << std::endl;
happens to rely on some static variable inside <iostream> being initialized ahead of time? (I'm not saying it does, but assume it did.) What's to say that these static variables inside built-in libraries are initialized before my own static variables? Like inside say "Person.cpp" or whatever.
Edit: Is std::cout guaranteed to be initialized? was suggested as a duplicate to this question. However, I think my question is slightly broader in scope because it asks about any standard built-in library, rather than just <iostream>.
The C++ standards make no strong statements of the behavior of the program before the start of main, or after main has completed.
In my experience, I have run into a number of issues where the C++ runtime had already destroyed some object (e.g. resources used to manage std::mutex), which caused a deadlock during the destruction of a complex type.
I would recommend the following pattern for statics.
C++ constructs objects declared static in a function the first time execution reaches their declaration. This gives a pattern which ensures that objects exist by the time they are needed.
AnObject * AnObject::getInstance() {
static AnObject a;
return &a;
}
Call this function whenever you need the global; the object is constructed the first time getInstance() is called.
C++ 11 onwards
This code is guaranteed to be thread-safe, where if multiple threads of execution arrive in getInstance, only one will construct the object and the rest will wait.
Pre C++ 11
Creating this pattern replaces ill-defined initialization order with thread-safety issues.
Luckily, it is possible to create a critical section/mutex primitive in main that can arbitrate.
On some OSs (e.g. Windows, with InitializeCriticalSection), these locks can safely be created before main as static variables.
AnObject * AnObject::getInstance() {
EnterCriticalSection( &aObjectcrit );
static AnObject a;
LeaveCriticalSection( &aObjectcrit );
return &a;
}
This assumes aObjectcrit has been initialized before this function is first called.
The result of this pattern is a form of onion construction, where objects are created in the order they are needed, and when the program exits, they are destroyed in the reverse order they were created in.
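As a rough sketch of the pattern above (C++11 or later is assumed, and Logger, Database, construction_order and demo_order are hypothetical names that do not come from the answer), two function-local statics end up constructed in the order they are first needed:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: records the order in which objects are constructed.
// It is itself a function-local static, so it exists before anyone uses it.
std::vector<std::string>& construction_order() {
    static std::vector<std::string> order;
    return order;
}

class Logger {
public:
    static Logger& getInstance() {
        static Logger instance;  // constructed on first call (thread-safe since C++11)
        return instance;
    }
    void report(const std::string&) {}
private:
    Logger() { construction_order().push_back("Logger"); }
};

class Database {
public:
    static Database& getInstance() {
        static Database instance;
        return instance;
    }
private:
    // Database needs Logger, so Logger is constructed first, on demand.
    Database() {
        Logger::getInstance().report("Database starting");
        construction_order().push_back("Database");
    }
};

void demo_order() {
    Database::getInstance();  // the only explicit request
    const std::vector<std::string>& order = construction_order();
    assert(order.size() == 2);
    assert(order[0] == "Logger");
    assert(order[1] == "Database");
}
```

Only Database is requested explicitly; Logger is built along the way because the Database constructor asks for it. At exit, the two objects are destroyed in the opposite order.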
You're confusing objects (std::cout) and types (std::vector). The former is covered by the linked question, and the latter is a type. Static initialization applies to objects, but not to types.
Related
I'm working on some C++ code and I've run into a question which has been nagging me for a while... Assuming I'm compiling with GCC on a Linux host for an ELF target, where are global static constructors and destructors called?
I've heard there's a function _init in crtbegin.o, and a function _fini in crtend.o. Are these called by crt0.o? Or does the dynamic linker actually detect their presence in the loaded binary and call them? If so, when does it actually call them?
I'm mainly interested to know so I can understand what's happening behind the scenes as my code is loaded, executed, and then unloaded at runtime.
Thanks in advance!
Update: I'm basically trying to figure out the general time at which the constructors are called. I don't want to make assumptions in my code based on this information, it's more or less to get a better understanding of what's happening at the lower levels when my program loads. I understand this is quite OS-specific, but I have tried to narrow it down a little in this question.
When talking about non-local static objects there are not many guarantees. As you already know (and it's also been mentioned here), you should not write code that depends on the order of their initialization. The static initialization order fiasco...
Static objects go through a two-phase initialization: static initialization and dynamic initialization. The former happens first and performs zero-initialization or initialization by constant expressions. The latter happens after all static initialization is done. This is when constructors are called, for example.
In general, this initialization happens at some time before main(). However, contrary to what many people think, even that is not guaranteed by the C++ standard. What is in fact guaranteed is that the initialization is done before the first use of any function or object defined in the same translation unit as the object being initialized. Notice that this is not OS specific; these are C++ rules. Here's a quote from the Standard:
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of
namespace scope is done before the first statement of main. If the initialization is deferred to some point
in time after the first statement of main, it shall occur before the first use of any function or object defined
in the same translation unit as the object to be initialized
This depends heavily on the compiler and runtime. It's not a good idea to make any assumptions about the time at which global objects are constructed.
This is especially a problem if you have a static object which depends on another one being already constructed.
This is called the "static initialization order fiasco". Even if that's not the case in your code, the C++ FAQ Lite articles on that topic are worth a read.
This is not OS specific; rather, it's compiler specific.
You have given the answer, initialization is done in __init.
For the second part, in gcc you can guarantee the order of initialization with a __attribute__((init_priority(PRIORITY))) attached to a variable definition, where PRIORITY is some relative value, with lower numbers initialized first.
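A minimal sketch of that attribute, assuming GCC (or a compatible compiler) and using hypothetical names (Tracer, g_order, g_count, demo_priority). GCC accepts priorities from 101 to 65535 and only on namespace-scope objects of class type:

```cpp
#include <cassert>
#include <cstring>

// Constant-initialized buffer: usable from any constructor, however early.
static char g_order[8] = {};
static int g_count = 0;

struct Tracer {
    explicit Tracer(char tag) {
        assert(g_count < 7);       // keep room for the terminating '\0'
        g_order[g_count++] = tag;  // record who was constructed, in order
    }
};

// GCC extension: lower priority values are initialized first,
// regardless of the order of definition.
Tracer late __attribute__((init_priority(2000))) = Tracer('L');   // defined first, built second
Tracer early __attribute__((init_priority(1000))) = Tracer('E');  // defined second, built first

void demo_priority() {
    assert(std::strcmp(g_order, "EL") == 0);
}
```

This is non-portable, so it is better suited to a last resort than to a general design.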
The guarantees you have:
All static non-local objects in the global namespace are constructed before main()
All static non-local objects in another namespace are constructed before any functions/methods in that namespace are used (Thus allowing the compiler to potentially lazy evaluate them [but don't count on this behavior]).
All static non-local objects in a translation unit are constructed in the order of declaration.
Nothing is defined about the order between translation units.
All static non-local objects are destroyed in the reverse order of creation. (This includes the static function variables, which are lazily created on first use.)
If you have globals that have dependencies on each other you have two options:
Put them in the same translation unit.
Transform them into static function variables retrieved and constructed on first use.
Example 1: Global A's constructor uses Global log
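LogType log;   // defined before A in the same translation unit, so constructed first
class AType
{ public: AType() { log.report("A Constructed");}};
AType A;
// Or
LogType& getLog()
{
    static LogType log;   // constructed on first call
    return log;
}
class AType
{ public: AType() { getLog().report("A Constructed");}};
// Define A anywhere;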
Example 2: Global B's destructor uses Global log
Here you have to guarantee that the object log is not destroyed before the object B. This means that log must be fully constructed before B (as the reverse order of destruction rule will then apply). Again the same techniques can be used. Either put them in the same translation unit or use a function to get log.
LogType log;
class BType
{ public: ~BType() { log.report("B Destroyed");}};
BType B; // B constructed after log (so B will be destroyed first)
// Or
LogType& getLog()
{
    static LogType log;
    return log;
}
class BType
{ public:
    BType() { getLog();}
    /*
     * If log is used in the destructor then it must not be destroyed before B
     * This means it must be constructed before B
     * (reverse order destruction guarantees that it will then be destroyed after B)
     *
     * To achieve this just call the getLog() function in the constructor.
     * This means that 'log' will be fully constructed before this object.
     * This means it will be destroyed after and thus safe to use in the destructor.
     */
    ~BType() { getLog().report("B Destroyed");}
};
// Define B anywhere;
According to the C++ standard they are called before any function or object of their translation unit is used. Note that for objects in the global namespace this would mean they are initialized before main() is called. (See ltcmelo's and Martin's answers for more details and a discussion of this.)
I have C++ code which declares static-lifetime variables which are initialized by function calls. The called function constructs a vector instance and calls its push_back method. Is the code risking doom via the C++ static initialization order fiasco? If not, why not?
Supplementary information:
What's the "static initialization order fiasco"?
It's explained in C++ FAQ 10.14
Why would I think use of vector could trigger the fiasco?
It's possible that the vector constructor makes use of the value of another static-lifetime variable initialized dynamically. If so, then there is nothing to ensure that vector's variable is initialized before I use vector in my code. Initializing result (see code below) could end up calling the vector constructor before vector's dependencies are fully initialized, leading to access to uninitialized memory.
What does this code look like anyway?
struct QueryEngine {
QueryEngine(const char *query, string *result_ptr)
: query(query), result_ptr(result_ptr) { }
static void AddQuery(const char *query, string *result_ptr) {
if (pending == NULL)
pending = new vector<QueryEngine>;
pending->push_back(QueryEngine(query, result_ptr));
}
const char *query;
string *result_ptr;
static vector<QueryEngine> *pending;
};
vector<QueryEngine> *QueryEngine::pending = NULL;
string Register(const char *query, string *result_ptr) {
QueryEngine::AddQuery(query, result_ptr);
return string(); // placeholder value, so that Register can initialize result
}
string result = Register("query", &result);
Fortunately, static objects are zero-initialised even before any other initialisation is performed (even before the "true" initialisation of the same objects), so you know that the NULL will be set on that pointer long before Register is first invoked.1
Now, in terms of operating on your vector, it appears that (technically) you could run into such a problem:
[C++11: 17.6.5.9/3]: A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.
[C++11: 17.6.5.9/4]: [Note: This means, for example, that implementations can’t use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. —end note]
Notice that, although synchronisation is being required in this note, that's been mentioned within a passage that ultimately acknowledges that static implementation details are otherwise allowed.
That being said, it seems like the standard should further state that user code should avoid operating on standard containers during static initialisation, if the intent were that the semantics of such code could not be guaranteed; I'd consider this a defect in the standard, either way. It should be clearer.
1 And it is a NULL pointer, whatever the bit-wise representation of that may be, rather than a blit to all-zero-bits.
vector doesn't depend on anything preventing its use in dynamic initialisation of statics. The only issue with your code is a lack of thread safety - no particular reason to think you should care about that, unless you have statics whose construction spawns threads....
Initializing result (see code below) could end up calling the vector constructor before that class is fully initialized, leading to access to uninitialized memory.
No... initialising result calls AddQuery which checks if (pending == NULL) - the initialisation to NULL will certainly have been done before any dynamic initialisation, per 3.6.2/2:
Constant initialization is performed:
...
— if an object with static or thread storage duration is not initialized by a constructor call and if either the object is value-initialized or every full-expression that appears in its initializer is a constant expression
So even if the result assignment is in a different translation unit it's safe. See 3.6.2/2:
Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place.
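To make that ordering concrete, here is a small sketch modelled on the pending pointer from the question. Registry, first_id, second_id and demo_registry are hypothetical names, and C++11 is assumed for nullptr. The dynamic initializers run before the pointer's definition is reached in source order, yet they still see a null pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Registry {
public:
    static int add(const std::string& name) {
        assert(!name.empty());
        if (entries == nullptr)                      // safe: constant-initialized pointer
            entries = new std::vector<std::string>;  // deliberately never deleted
        entries->push_back(name);
        return static_cast<int>(entries->size());
    }
    static std::size_t size() { return entries ? entries->size() : 0; }
private:
    static std::vector<std::string>* entries;
};

// Dynamic initialization: these calls run at startup, in order of definition.
int first_id = Registry::add("first");
int second_id = Registry::add("second");

// Defined after its users, but constant initialization of every static
// object happens before any dynamic initialization takes place.
std::vector<std::string>* Registry::entries = nullptr;

void demo_registry() {
    assert(first_id == 1);
    assert(second_id == 2);
    assert(Registry::size() == 2);
}
```

Never deleting the vector is a deliberate choice in this sketch: it avoids any destruction-order problem at exit, at the cost of one object that is never freed.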
I read several posts on C++ initialization from Google, some of which direct me here on StackOverflow. The concepts I picked from those posts are as follows:
The order of initialization of C++ is:
Zero Initialization;
Static Initialization;
Dynamic Initialization.
Static objects (variables included) are first Zero-initialized, and then Static-initialized.
I have several inquiries as to the initialization issue (storage class issue may be related as well):
Global objects (defined without static keyword) are also static objects, right?
Global objects are also initialized like static objects by two steps like above, right?
What is the Static Initialization? Does it refer to initializing static objects (defined with static keyword)?
I also read that objects defined within block (i.e. in a function) with static keyword is initialized when the execution thread first enters the block! This means that local static objects are not initialized before main function execution. This means they are not initialized as the two steps mentioned above, right?
Dynamic initialization refers to initialization of objects created by new operator, right? It might refer to initialization like myClass obj = myClass(100); or myClass obj = foo();
I have too many inquiries on the initialization and storage class specifier issues. I read the C++2003 Standard document, but cannot find a clear logic since they are scattered throughout the document.
I hope you give me an answer that logically explains the whole map of storage class specifier and initialization. Any reference is welcome!
Code that might explain my question:
class myClass{
public:
int i;
myClass(int j = 10): i(j){}
// other declarations
};
myClass obj1;//global scope
static myClass obj2(2);//file scope
{ //local scope
myClass obj3(3);
static myClass obj4(4);
}
EDIT:
If you think my question is rather tedious, you can help explain your ideas based on the code above.
I read several posts on C++ initialization from Google, some of which direct me here on StackOverflow. The concepts I picked from those posts are as follows:
The order of initialization of C++ is:
Zero Initialization;
Static Initialization;
Dynamic Initialization.
Yes, indeed there are 3 phases (in the Standard). Let us clarify them before continuing:
Zero Initialization: the memory is filled with 0s at the byte level.
Constant Initialization: a pre-computed (compile-time) byte pattern is copied at the memory location of the object
Static Initialization: Zero Initialization followed by Constant Initialization
Dynamic Initialization: a function is executed to initialize the memory
A simple example:
int const i = 5; // constant initialization
int const j = foo(); // dynamic initialization
Static objects (variables included) are first Zero-initialized, and then Static-initialized.
Yes and no.
The Standard mandates that the objects be first zero-initialized and then they are:
constant initialized if possible
dynamically initialized otherwise (the compiler could not compute the memory content at compile-time)
Note: in the case of constant initialization, the compiler may skip zero-initializing the memory first, following the as-if rule.
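The phases can be observed from inside a program. The sketch below uses hypothetical names (seed, compute, observe, snapshot, demo_phases). The volatile read is an assumption made only to keep the compiler from turning the dynamic initialization into a static one, which it is otherwise allowed to do:

```cpp
#include <cassert>

volatile int seed = 7;            // volatile read keeps compute() out of constant evaluation

int compute() { return seed; }    // has to run code: dynamic initialization
int observe();                    // defined below

// Dynamic initializers in one translation unit run in order of definition,
// so observe() runs before dynamic_value's initializer.
int snapshot = observe();
int constant_value = 42;          // constant initialization
int dynamic_value = compute();    // dynamic initialization

int observe() {
    // constant_value is already 42, dynamic_value is still zero-initialized.
    return constant_value * 1000 + dynamic_value;
}

void demo_phases() {
    assert(snapshot == 42000);    // saw constant_value == 42 and dynamic_value == 0
    assert(dynamic_value == 7);   // dynamic initialization completed afterwards
}
```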
I have several inquiries as to the initialization issue (storage class issue may be related as well):
Global objects (defined without static keyword) are also static objects, right?
Yes, at file scope the static keyword is just about the visibility of the symbol. A global object can be referred to, by name, from another source file whilst a static object name is completely local to the current source file.
The confusion stems from the reuse of the word static in many different situations :(
Global objects are also initialized like static objects by two steps like above, right?
Yes, as are local static objects in fact.
What is the Static Initialization? Does it refer to initializing static objects (defined with static keyword)?
No, as explained above it refers to initializing objects without executing a user-defined function but instead copying a pre-computed byte pattern over the object's memory. Note that in the case of objects that will later be dynamically initialized, this is just zero-ing the memory.
I also read that objects defined within block (i.e. in a function) with static keyword is initialized when the execution thread first enters the block! This means that local static objects are not initialized before main function execution. This means they are not initialized as the two steps mentioned above, right?
They are initialized with the two-step process, though indeed only the first time execution passes through their definition. So the process is the same but the timing is subtly different.
In practice though, if their initialization is static (ie, the memory pattern is a compile-time pattern) and their address is not taken they might be optimized away.
Note that in case of dynamic initialization, if their initialization fails (an exception is thrown by the function supposed to initialize them) it will be re-attempted the next time flow-control passes through their definition.
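The retry behaviour is easy to check with a small sketch. The names load_setting, setting, attempts and demo_retry are hypothetical, and the first call is made to fail on purpose:

```cpp
#include <cassert>
#include <stdexcept>

int attempts = 0;

// Hypothetical initializer that fails the first time it is called.
int load_setting() {
    if (++attempts == 1)
        throw std::runtime_error("resource not ready");
    return 99;
}

int setting() {
    static int value = load_setting();  // not marked initialized if this throws
    return value;
}

void demo_retry() {
    bool failed = false;
    try {
        setting();
    } catch (const std::runtime_error&) {
        failed = true;                  // first pass: initialization threw
    }
    assert(failed);
    assert(setting() == 99);            // second pass: initialization re-attempted
    assert(setting() == 99);            // third call: already initialized
    assert(attempts == 2);              // the initializer ran exactly twice
}
```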
Dynamic initialization refers to initialization of objects created by new operator, right? It might refer to initialization like myClass obj = myClass(100); or myClass obj = foo();
Not at all, it refers to initialization requiring the execution of a user defined function (note: std::string has a user-defined constructor as far as the C++ language is concerned).
EDIT: My thanks to Zach who pointed to me I erroneously called Static Initialization what the C++11 Standard calls Constant Initialization; this error should now be fixed.
I believe there are three different concepts: initializing the variable, the location of the variable in memory, the time the variable is initialized.
First: Initialization
When a variable is allocated in memory, typical processors leave the memory untouched, so the variable will have the same value that somebody else stored earlier. For security, some compilers add the extra code to initialize all variables they allocate to zero. I think this is what you mean by "Zero Initialization". It happens when you say:
int i; // not all compilers set this to zero
However if you say to the compiler:
int i = 10;
then the compiler instructs the processor to put 10 in the memory rather than leaving it with old values or setting it to zero. I think this is what you mean by "Static Initialization".
Finally, you could say this:
int i;
...
...
i = 11;
then the processor "zero initializes" (or leaves the old value) when executing int i; then, when it reaches the line i = 11, it "dynamically initializes" the variable to 11 (which can happen long after the first initialization).
Second: Location of the variable
There are: stack-based variables (sometimes called static variables), and memory-heap variables (sometimes called dynamic variables).
Variables can be created in the stack segment using this:
int i;
or the memory heap like this:
int *i = new int;
The difference is that the stack segment variable is lost after exiting the function call, while memory-heap variables are left until you say delete i;. You can read an Assembly-language book to understand the difference better.
Third: The time the variable is initialized
A stack-segment variable is "zero-initialized" or "statically-initialized" when you enter the function call it is defined within.
A memory-heap variable is "zero-initialized" or "statically-initialized" when it is first created by the new operator.
Final Remark
You can think about static int i; as a global variable with a scope limited to the function it is defined in. I think the confusion about static int i; comes about because static here means something else (the variable is not destroyed when you exit the routine, so it retains its value). I am not sure, but I think the trick used for static int i; is to put it in the stack of main() which means it is not destroyed until you exit the whole program (so it retains the first initialization), or it could be that it is stored in the data segment of the application.
I have heard that using static member objects is not a very good practice.
Say for example, I have this code:
class Foo {
...
static MyString str;
};
I define and initialize this variable in the implementation file of this class as:
MyString Foo::str = "Some String"; // This is fine as my string API handles this.
When I run this code, I get a warning:
warning:'Foo::str' requires global construction.
I have quite a few such members in my class; what is the best way to handle this?
Thanks,
Most of the arguments against them are the same as for global variables:
Initialization order between different compilation units is undefined.
Initialization order inside one compilation unit may affect the behavior — thus non trivial ordering may be required.
If a constructor throws an exception you can't catch it, your program is terminated.
APPENDED: To handle this properly you must either make sure that above points don't apply to your code and ignore the warning, or redesign your program: Do you really need them static? Why not use const char* Foo::str = "Some String";?
The biggest reason for concern with this example is that constructing the static member object happens before main() and destruction happens after main() (or when you call exit()). So far, that's a good thing. But when you have multiple objects like this in your program, you risk a bug where code attempts to use an object that has not yet been constructed or has already been destroyed.
The C++ FAQ Lite has some helpful discussion on this topic, including a workaround/solution. Recommended reading is questions 10.14 through 10.18. 10.17 is most directly applicable to your example.
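As a sketch of how the construct-on-first-use idea might be applied to this example: std::string stands in for the asker's MyString (an assumption), and raw and demo_foo are hypothetical names. The first form builds the string lazily; the second needs no constructor at all:

```cpp
#include <cassert>
#include <string>

class Foo {
public:
    // Replaces `static MyString str;`: the string is built on the first call,
    // so no constructor has to run before main() on its account.
    static const std::string& str() {
        static const std::string value = "Some String";
        return value;
    }
    // Alternative with no construction at all: a constant-initialized
    // pointer to a string literal.
    static const char* const raw;
};

const char* const Foo::raw = "Some String";

void demo_foo() {
    assert(Foo::str() == "Some String");
    assert(Foo::str() == Foo::raw);      // same contents either way
    assert(&Foo::str() == &Foo::str());  // same object on every call
}
```

Callers write Foo::str() instead of Foo::str, which is the main cost of the change.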
With a static member, you are not guaranteed thread safety. Imagine two threads trying to access the static member: what would its value be, the one from thread x or the one from thread y? This also opens the door to race conditions, where one thread modifies the static member before the other thread has finished with it. In other words, using a static member can be hazardous.
As an example, suppose you need to know the number of instances of a class. This would require a static class member to track the count of instances.
There is nothing wrong with having a static member in a class if the solution to the problem requires such a design. It's just that the details mentioned in the other posts have to be taken care of.
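A minimal sketch of such an instance counter, assuming C++11 and using hypothetical names (Widget, live, demo_count). Making the counter a std::atomic<int> is one way to address the race mentioned above, and because its constructor is constexpr the counter is constant-initialized rather than constructed at startup:

```cpp
#include <atomic>
#include <cassert>

class Widget {
public:
    Widget() { ++count_; }
    Widget(const Widget&) { ++count_; }  // copies are instances too
    ~Widget() { --count_; }
    static int live() { return count_.load(); }
private:
    // std::atomic<int> keeps concurrent increments and decrements from racing.
    static std::atomic<int> count_;
};

// Constant-initialized, so it is ready before any dynamic initializer runs.
std::atomic<int> Widget::count_{0};

void demo_count() {
    assert(Widget::live() == 0);
    {
        Widget a;
        Widget b(a);
        assert(Widget::live() == 2);
    }
    assert(Widget::live() == 0);  // both destroyed at the end of the block
}
```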