I had a discussion this morning with a colleague about static variable initialization order. He mentioned the Nifty/Schwarz counter and I'm (sort of) puzzled. I understand how it works, but I'm not sure if this is, technically speaking, standard compliant.
Suppose the 3 following files (the first two are copy-pasta'd from More C++ Idioms):
//Stream.hpp
class StreamInitializer;
class Stream {
friend class StreamInitializer;
public:
Stream () {
// Constructor must be called before use.
}
};
static class StreamInitializer {
public:
StreamInitializer ();
~StreamInitializer ();
} initializer; //Note object here in the header.
//Stream.cpp
static int nifty_counter = 0;
// The counter is initialized at load-time i.e.,
// before any of the static objects are initialized.
StreamInitializer::StreamInitializer ()
{
if (0 == nifty_counter++)
{
// Initialize Stream object's static members.
}
}
StreamInitializer::~StreamInitializer ()
{
if (0 == --nifty_counter)
{
// Clean-up.
}
}
// Program.cpp
#include "Stream.hpp" // initializer increments "nifty_counter" from 0 to 1.
// Rest of code...
int main ( int, char ** ) { ... }
... and here lies the problem! There are two static variables:
"nifty_counter" in Stream.cpp; and
"initializer" in Program.cpp.
Since the two variables happen to be in two different compilation units, there is no (AFAIK) official guarantee that nifty_counter is initialized to 0 before initializer's constructor is called.
I can think of two quick explanations as to why this "works":
modern compilers are smart enough to resolve the dependency between the two variables and place the code in the appropriate order in the executable file (highly unlikely);
nifty_counter is actually initialized at "load-time" like the article says and its value is already placed in the "data segment" in the executable file, so it is always initialized "before any code is run" (highly likely).
Both of these seem to me like they depend on some unofficial, yet possible implementation. Is this standard compliant or is this just "so likely to work" that we shouldn't worry about it?
I believe it's guaranteed to work. According to the standard (§3.6.2/1): "Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place."
Since nifty_counter has static storage duration, it gets initialized before initializer is created, regardless of distribution across translation units.
Edit: After rereading the section in question, and considering input from @Tadeusz Kopec's comment, I'm less certain about whether it's well defined as it stands right now, but it is quite trivial to ensure that it is well-defined: remove the initializer from the definition of nifty_counter, so it looks like:
static int nifty_counter;
Since it has static storage duration, it will be zero-initialized, even without specifying an initializer -- and removing the initializer removes any doubt about any other initialization taking place after the zero-initialization.
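A minimal single-file sketch of the counter with the initializer removed, as suggested above. The names Initializer, first_tu, second_tu and the two helper functions are hypothetical; the two objects only stand in for the copies that each including file would get. A single file cannot reproduce the cross-translation-unit ordering question, so this only shows the counting logic relying on zero-initialization.

```cpp
#include <cassert>

// No initializer: static storage duration means zero-initialization,
// which happens before any dynamic initialization.
static int nifty_counter;
static bool counter_was_zero_at_first_use;  // also zero-initialized (false)

// Hypothetical stand-in for StreamInitializer.
struct Initializer {
    Initializer() {
        if (nifty_counter++ == 0) {
            // First initializer to run: set up the shared state here.
            counter_was_zero_at_first_use = true;
        }
    }
    ~Initializer() {
        if (--nifty_counter == 0) {
            // Last initializer to go away: clean up here.
        }
    }
};

// Stand-ins for the copy of `initializer` that each including file defines.
static Initializer first_tu;
static Initializer second_tu;

int live_initializers() { return nifty_counter; }
bool first_use_saw_zero() { return counter_was_zero_at_first_use; }
```

By the time main runs, both stand-in objects have been constructed, so live_initializers() reports 2 and the first constructor saw a counter of 0.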
I think what is missing from this example is how the construction of Stream is avoided, which is often non-portable. Besides maintaining the nifty counter, the initialiser's role is to construct something like:
extern Stream in;
One compilation unit owns the memory associated with that object. Either the object has some special constructor that runs before placement new is used, or, in the cases I've seen, the memory is allocated in another way to avoid any conflict. It seems to me that if there is a no-op constructor on this stream, then it is not defined whether the initialiser or the no-op constructor runs first.
Allocating an area of raw bytes is often non-portable. For example, in GNU iostream the space for cin is defined as:
typedef char fake_istream[sizeof(istream)] __attribute__ ((aligned(__alignof__(istream))));
...
fake_istream cin;
llvm uses:
_ALIGNAS_TYPE (__stdinbuf<char> ) static char __cin [sizeof(__stdinbuf <char>)];
Both make certain assumptions about the space needed for the object. The Schwarz counter then initialises the object in that space with placement new:
new (&cin) istream(&buf)
Practically this doesn't look that portable.
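For comparison, here is a sketch of the same idea using only standard C++11 (alignas plus placement new) instead of compiler attributes. This is not how any particular standard library does it; Stream, StreamInit, stream_storage, stream_ptr and stream_name are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical stream-like type with a non-trivial constructor.
struct Stream {
    std::string name;
    explicit Stream(const char* n) : name(n) {}
};

// Raw storage with the size and alignment of Stream; no constructor runs here.
alignas(Stream) static unsigned char stream_storage[sizeof(Stream)];
static Stream* stream_ptr;   // zero-initialized to a null pointer
static int nifty_counter;    // zero-initialized

struct StreamInit {
    StreamInit() {
        if (nifty_counter++ == 0)
            stream_ptr = new (stream_storage) Stream("demo");  // placement new
    }
    ~StreamInit() {
        if (--nifty_counter == 0)
            stream_ptr->~Stream();  // explicit destructor call, no delete
    }
};

static StreamInit stream_init;

const std::string& stream_name() { return stream_ptr->name; }
```

The storage array and the counter need no dynamic initialization, so the only code that runs at startup is the StreamInit constructor.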
I've noticed that some compilers like gnu, microsoft and AIX do have compiler extensions to influence static initialiser order:
For Gnu this is: Enable the init-priority with the -f flag and use __attribute__ ((init_priority (n))).
On windows with a microsoft compiler there is a #pragma (http://support.microsoft.com/kb/104248)
Here is the code:
int factorial(int n)
{
if ( n < 0 ) return -1; //indicates input error
else if ( n == 0 ) return 1;
else return n * factorial(n-1);
}
int const a = 10 ; //static initialization
//10 is known at compile time. It's 10!
int const b = factorial(8); //dynamic initialization
//factorial(8) isn't known at compile time,
//rather it's computed at runtime.
(stolen from here)
So it makes sense to me why b is dynamically initialized and a is statically initialized.
But what if a and b had automatic storage duration(maybe they had been initialized in main()), could you then still call their initialization either static or dynamic? Because, to me, they sound like a more general name for initialization than for example Copy initialization.
Also, I have read this and can anybody tell me why they have not directly explained what static and dynamic initialization are? I mean, it looks like that they only have explained in what situations they happen, but maybe there is a reason why?
cppreference states that an initializer may invoke some kind of initialization (value initialization, etc.), but later in the article they mention static and dynamic initialization as if those two were more general names for some initializations. This could sound confusing, but here I have illustrated what I understand:
(not the most beautiful thing)
Static and dynamic initialization describe the process of loading a binary and getting to the point when main is ready to run.
static initialization describes the information the compiler can work out at compile time, and allows for fixed values to be stored in the binary so at the point when the binary is loaded by the operating system, it has the correct value.
dynamic initialization describes the code which is inserted by the compiler before main runs, which initializes the information which the compiler was unable to calculate. That may be because it involves code directly, or that it refers to information which was not visible to the compiler at compile time.
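A small sketch of the distinction, assuming C++11. The names factorial_ce and factorial_rt are made up here; the point is only that a value the compiler can compute is usable in a static_assert, while a dynamically initialized one is not.

```cpp
#include <cassert>

// constexpr version: usable in constant expressions (C++11 single-return form).
constexpr int factorial_ce(int n) {
    return n < 0 ? -1 : (n == 0 ? 1 : n * factorial_ce(n - 1));
}

// Ordinary function: the compiler is not required to evaluate it at compile time.
int factorial_rt(int n) {
    return n < 0 ? -1 : (n == 0 ? 1 : n * factorial_rt(n - 1));
}

const int a = 10;                   // static (constant) initialization
const int b = factorial_rt(8);      // formally dynamic initialization
constexpr int c = factorial_ce(8);  // static (constant) initialization

static_assert(a == 10, "a is a constant expression");
static_assert(c == 40320, "c is computed at compile time");
// static_assert(b == 40320, "");   // would not compile: b is not a constant expression
```

Both b and c end up holding 40320 at run time; only the moment at which the value is produced differs.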
But what if a and b had automatic storage duration
The simple case when a is an automatic variable of limited scope.
int a = 12;
This could not be statically initialized, because the compiler will not know where to initialize a, as it would be different each time, and on each thread which called it.
The compiler will be able to initialize a with something like.
mov (_addr_of_a), 12
As _addr_of_a is unknown until runtime, and the value 12 is already embedded in the code, there is nothing to be gained from statically initializing it.
More complex cases ...
int a[] = { /* some integer values */ };
This is possibly going to be implemented by the compiler as a mixture of static and dynamic code as below.
static int a_init[] = { /* some integer values */ };
memcpy( a, a_init, length_in_bytes_of_a );
So some cases there will be "leakage" from static initialization into runtime behaviour.
Dynamic behavior is more problematic - it assumes that a function which does not normally expose its implementation, has both a slow execution time, and is a constexpr to give value to the caching at start of the result. I have not seen this optimization occur.
Static and dynamic initialization are technical terms which describe the process of creating a running program. Similar patterns may exist for local variables, but they would not fall into the technical definition of static and dynamic initialization.
Let's say I create:
class Hello {
public:
void World(int in)
{
static int var = 0; // <<<< This thing here.
if (in >= 0) {
var = in;
} else {
cout << var << endl;
}
}
};
Now, if I do:
Hello A;
Hello B;
A.World(10);
A.World(-1);
B.World(-1);
I'm getting output of "10" followed by another "10". The value of the local variable of a method just crossed over from one instance of a class to another.
It's not surprising - technically methods are just functions with a hidden this parameter, so a static local variable should behave just like in common functions. But is it guaranteed? Is it a behavior enforced by standard, or is it merely a happy byproduct of how the compiler handles methods? In other words - is this behavior safe to use? (...beyond the standard risk of baffling someone unaccustomed...)
Yes. It doesn't matter if the function is a [non-static] member of a class or not, it's guaranteed to have only one instance of its static variables.
The proper technical explanation is that such variables are objects with static storage duration: the name has block scope and no linkage, but there is a single object that lives until the program exits, and every call to the function, through any instance, refers to that same object.
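A compact way to check this, as a sketch: the method below is changed to return the stored value instead of printing it, and read_through_other_instance is a hypothetical helper added for the demonstration.

```cpp
#include <cassert>

class Hello {
public:
    // Stores a non-negative input; always returns the currently stored value.
    int World(int in) {
        static int var = 0;  // one object for the whole program
        if (in >= 0) var = in;
        return var;
    }
};

// Writes through one instance and reads through another.
int read_through_other_instance() {
    Hello a, b;
    a.World(10);
    return b.World(-1);
}
```

The helper returns 10: the value written through a is visible through b.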
Just one thing to add to the correct answer. If your class was templated, then the instance of var would only be shared amongst objects of the same instantiation type. So if you had:
template<typename C>
class Hello {
public:
void World(int in)
{
static int var = 0; // <<<< This thing here.
if (in >= 0) {
var = in;
} else {
cout << var << endl;
}
}
};
And then:
Hello<int> A;
Hello<int> B;
Hello<unsigned> C;
A.World(10);
A.World(-1);
B.World(-1);
C.World(-1);
Then the final output would be "0" rather than "10", because the Hello<unsigned> instantiation would have its own copy of var.
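The same point as a sketch that can be checked with assertions. World is changed to return the value rather than print it, and the two helper functions are hypothetical names added for the demonstration.

```cpp
#include <cassert>

template <typename C>
class Hello {
public:
    // Stores a non-negative input; always returns the currently stored value.
    int World(int in) {
        static int var = 0;  // one object per instantiation of Hello<C>
        if (in >= 0) var = in;
        return var;
    }
};

// Two objects of the same instantiation share var.
int same_instantiation() {
    Hello<int> a, b;
    a.World(10);
    return b.World(-1);
}

// A different instantiation has its own var, still at its initial value.
int other_instantiation() {
    Hello<int> a;
    Hello<unsigned> c;
    a.World(10);
    return c.World(-1);
}
```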
If we are talking about the Windows Compiler it's guaranteed
https://msdn.microsoft.com/en-us/library/y5f6w579.aspx
The following example shows a local variable declared static in a member function. The static variable is available to the whole program; all instances of the type share the same copy of the static variable.
They use an example very similar to yours.
I don't know about GCC
Yes, it is guaranteed. Now, to answer the question "Any risk of sharing local static variable of a method between instances?" it might be a bit less straightforward. There might be potential risks in the initialization and utilization of the variable and these risks are specific to variables local to the method (as opposed to class variables).
For the initialization, a relevant part in the standard is 6.7/4 [stmt.dcl]:
Dynamic initialization of a block-scope variable with static storage
duration (3.7.1) or thread storage duration (3.7.2) is performed the
first time control passes through its declaration; such a variable is
considered initialized upon the completion of its initialization. If
the initialization exits by throwing an exception, the initialization
is not complete, so it will be tried again the next time control
enters the declaration. If control enters the declaration concurrently
while the variable is being initialized, the concurrent execution
shall wait for completion of the initialization. If control
re-enters the declaration recursively while the variable is being
initialized, the behavior is undefined.
In the simple cases, everything should work as expected. When the construction and initialization of the variable is more complex, there will be risks specific to this case. For instance, if the constructor throws, it will have the opportunity to throw again on the next call. Another example would be recursive initialization which is undefined behavior.
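The retry-after-exception rule can be seen in a small sketch. The names attempts, make_value, get_value and call_throws are made up for the illustration; the initializer is rigged to fail on its first run only.

```cpp
#include <cassert>
#include <stdexcept>

int attempts = 0;  // counts how many times the initializer has run

int make_value() {
    if (++attempts == 1)
        throw std::runtime_error("first attempt fails");
    return 42;
}

int get_value() {
    // Not considered initialized until make_value() returns normally,
    // so a throwing attempt is repeated on the next call.
    static int value = make_value();
    return value;
}

// Reports whether a call to get_value() exits with an exception.
bool call_throws() {
    try {
        get_value();
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

The first call throws, the second call runs the initializer again and succeeds, and later calls no longer run it at all.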
Another possible risk is the performance of the method. The compiler will need to implement a mechanism to ensure compliant initialization of the variable. This is implementation-dependent and it could very well be a lock to check if the variable is initialized, and that lock could be executed every time the method is called. When that happens, it can have a significant adverse effect on performance.
I noticed that if you initialize a static variable in C++ in code, the initialization only runs the first time you run the function.
That is cool, but how is that implemented? Does it translate to some kind of twisted if statement? (if given a value, then ..)
void go( int x )
{
static int j = x ;
cout << ++j << endl ; // see 6, 7, 8
}
int main()
{
go( 5 ) ;
go( 5 ) ;
go( 5 ) ;
}
Yes, it does normally translate into an implicit if statement with an internal boolean flag. So, in the most basic implementation your declaration normally translates into something like
void go( int x ) {
static int j;
static bool j_initialized;
if (!j_initialized) {
j = x;
j_initialized = true;
}
...
}
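To make that concrete, here is a runnable sketch putting the built-in form next to the hand-written flag version. Both functions return the incremented value instead of printing it; go_builtin and go_manual are hypothetical names, and the manual version ignores thread safety.

```cpp
#include <cassert>

int go_builtin(int x) {
    static int j = x;  // initialized on the first call only
    return ++j;
}

int go_manual(int x) {
    static int j;               // zero-initialized
    static bool j_initialized;  // zero-initialized to false
    if (!j_initialized) {
        j = x;
        j_initialized = true;
    }
    return ++j;
}
```

Calling either function three times with 5 yields 6, 7 and 8.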
On top of that, if your static object has a non-trivial destructor, the language has to obey another rule: such static objects have to be destructed in the reverse order of their construction. Since the construction order is only known at run-time, the destruction order becomes defined at run-time as well. So, every time you construct a local static object with non-trivial destructor, the program has to register it in some kind of linear container, which it will later use to destruct these objects in proper order.
Needless to say, the actual details depend on implementation.
It is worth adding that when it comes to static objects of "primitive" types (like int in your example) initialized with compile-time constants, the compiler is free to initialize that object at startup. You will never notice the difference. However, if you take a more complicated example with a "non-primitive" object
void go( int x ) {
static std::string s = "Hello World!";
...
then the above approach with if is what you should expect to find in the generated code even when the object is initialized with a compile-time constant.
In your case the initializer is not known at compile time, which means that the compiler has to delay the initialization and use that implicit if.
Yes, the compiler usually generates a hidden boolean "has this been initialized?" flag and an if that runs every time the function is executed.
There is more reading material here: How is static variable initialization implemented by the compiler?
While it is indeed "some kind of twisted if", the twist may be more than you imagined...
ZoogieZork's comment on AndreyT's answer touches on an important aspect: the initialisation of static local variables - on some compilers including GCC - is by default thread safe (a compiler command-line option can disable it). Consequently, it's using some inter-thread synchronisation mechanism (a mutex or atomic operation of some kind) which can be relatively slow. If you wouldn't be comfortable - performance wise - with explicit use of such an operation in your function, then you should consider whether there's a lower-impact alternative to the lazy initialisation of the variable (i.e. explicitly construct it in a threadsafe way yourself somewhere just once). Very few functions are so performance sensitive that this matters though - don't let it spoil your day, or make your code more complicated, unless your program's too slow and your profiler's fingering that area.
They are initialized only once because that's what the C++ standard mandates. How this happens is entirely up to compiler vendors. In my experience, a local hidden flag is generated and used by the compiler.
I'm used to thinking of all initialization of globals/static-class-members as happening before the first line of main(). But I recently read somewhere that the standard allows initialization to happen later to "assist with dynamic loading of modules." I could see this being true when dynamic linking: I wouldn't expect a global initialized in a library to be initialized before I dlopen'ed the library. However, within a grouping of statically linked together translation units (my app's direct .o files) I would find this behavior very unintuitive. Does this only happen lazily when dynamically linking or can it happen at any time? (or was what I read just wrong? ;)
The standard has the following in 3.6.2/3:
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of
namespace scope is done before the first statement of main. If the initialization is deferred to some point
in time after the first statement of main, it shall occur before the first use of any function or object defined
in the same translation unit as the object to be initialized.
But of course you can never officially tell when the initialization takes place, since the initialization will occur before you access the variable! For example:
// t1.cc
#include <iostream>
int i1 = 0;
int main () {
std::cout << i1 << std::endl;
}
// t2.cc
extern int i1;
int i2 = ++i1;
I can confirm that g++ 4.2.4 at least appears to perform the initialization of 'i2' before main.
The problem that one wanted to solve with that rule is the one of dynamic loading. The allowance isn't restricted to dynamic loading and formally could apply in other cases. I don't know of an implementation that uses it for anything other than dynamic loading.
Let's review a pseudocode:
In DLL:
static int ItsDllVar = 1;
int EXPORTED_FUNCTION() { return ItsDllVar; }
In application:
static int AppVar1 = 2;
static int AppVar2 = EXPORTED_FUNCTION() + AppVar1;
So with initialization at startup, AppVar2 gets 1+2=3.
Lazy initialization applies to local static variables (regardless of DLL):
int f()
{
static int local_i = 5;//it gets 5 only after visiting f()
return local_i;
}
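The laziness is easier to observe when the initializer is a function call rather than a constant. A sketch, where init_calls and compute_initial are hypothetical names added to count how often the initializer runs:

```cpp
#include <cassert>

int init_calls = 0;  // how many times the initializer has run

int compute_initial() {
    ++init_calls;
    return 5;
}

int f() {
    static int local_i = compute_initial();  // runs on the first call to f() only
    return local_i;
}
```

Before f() is called for the first time, init_calls is still 0; after any number of calls it is 1.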
I think this is what happened in my case with g++ 4.7 and CMake (not sure if CMake is a relevant detail). I have code that registers a function in a factory. It relies on the constructor of a global variable being called.
When this code was in a statically linked library, the initialization didn't happen! It works fine now that I have moved it to object files that are linked directly (i.e., they are not combined into a library first).
So, I suspect that you are correct.
Can I control the order static objects are being destructed?
Is there any way to enforce my desired order? For example to specify in some way that I would like a certain object to be destroyed last, or at least after another static object?
The static objects are destructed in the reverse order of construction. And the order of construction is very hard to control. The only thing you can be sure of is that two objects defined in the same compilation unit will be constructed in the order of definition. Anything else is more or less random.
The other answers to this insist that it can't be done. And they're right, according to the spec -- but there is a trick that will let you do it.
Create only a single static variable, of a class or struct that contains all the other things you would normally make static variables, like so:
class StaticVariables {
public:
StaticVariables(): pvar1(new Var1Type), pvar2(new Var2Type) { };
~StaticVariables();
Var1Type *pvar1;
Var2Type *pvar2;
};
static StaticVariables svars;
You can create the variables in whatever order you need to, and more importantly, destroy them in whatever order you need to, in the constructor and destructor for StaticVariables. To make this completely transparent, you can create static references to the variables too, like so:
static Var1Type &var1(*svars.pvar1);
Voilà -- total control. :-) That said, this is extra work, and generally unnecessary. But when it is necessary, it's very useful to know about it.
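A sketch of what the destructor might look like, since the snippet above only declares it. Var1Type and Var2Type are given hypothetical logging destructors here, and teardown_order uses a local StaticVariables object instead of a static one purely so that the destruction order can be observed before the program exits.

```cpp
#include <cassert>
#include <string>

std::string teardown_log;  // records destruction order for the demo

// Hypothetical member types that log when they are destroyed.
struct Var1Type { ~Var1Type() { teardown_log += "1"; } };
struct Var2Type { ~Var2Type() { teardown_log += "2"; } };

class StaticVariables {
public:
    StaticVariables() : pvar1(new Var1Type), pvar2(new Var2Type) {}
    ~StaticVariables() {
        // Destroy in whatever order is needed; here var1 goes first, then var2.
        delete pvar1;
        delete pvar2;
    }
    // Owning raw pointers, so copying is disabled.
    StaticVariables(const StaticVariables&) = delete;
    StaticVariables& operator=(const StaticVariables&) = delete;

    Var1Type* pvar1;
    Var2Type* pvar2;
};

// Creates and destroys a holder, then reports the order that was logged.
std::string teardown_order() {
    teardown_log.clear();
    { StaticVariables scoped; }
    return teardown_log;
}
```

Had the two objects been direct members, they would be destroyed in reverse declaration order; with pointers, the order is whatever the destructor body says.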
Static objects are destroyed in the reverse of the order in which they're constructed (e.g. the first-constructed object is destroyed last), and you can control the sequence in which static objects are constructed, by using the technique described in Item 47, "Ensure that global objects are initialized before they're used" in Meyers' book Effective C++.
For example to specify in some way that I would like a certain object to be destroyed last, or at least after another static object?
Ensure that it's constructed before the other static object.
How can I control the construction order? not all of the statics are in the same dll.
I'll ignore (for simplicity) the fact that they're not in the same DLL.
My paraphrase of Meyers' item 47 (which is 4 pages long) is as follows. Assuming that your global is declared in a header file like this ...
//GlobalA.h
extern GlobalA globalA; //declare a global
... add some code to that include file like this ...
//GlobalA.h
extern GlobalA globalA; //declare a global
class InitA
{
static int refCount;
public:
InitA();
~InitA();
};
static InitA initA;
The effect of this will be that any file which includes GlobalA.h (for example, your GlobalB.cpp source file which defines your second global variable) will define a static instance of the InitA class, which will be constructed before anything else in that source file (e.g. before your second global variable).
This InitA class has a static reference counter. When the first InitA instance is constructed, which is now guaranteed to be before your GlobalB instance is constructed, the InitA constructor can do whatever it has to do to ensure that the globalA instance is initialized.
Short answer: In general, no.
Slightly longer answer: For global static objects in a single translation unit, the initialization order is top to bottom and the destruction order is exactly the reverse. The order between several translation units is unspecified.
If you really need a specific order, you need to make this up yourself.
There's no way to do it in standard C++, but if you have a good working knowledge of your specific compiler's internals it can probably be achieved.
In Visual C++ the pointers to the static init functions are located in the .CRT$XI segment (for C type static init) or .CRT$XC segment (for C++ type static init). The linker collects all declarations and merges them alphabetically. You can control the order in which static initialization occurs by declaring your objects in the proper segment using
#pragma init_seg
for example, if you want file A's objects to be created before file B's:
File A.cpp:
#pragma init_seg(".CRT$XCB")
class A{}A;
File B.cpp:
#pragma init_seg(".CRT$XCC")
class B{}B;
.CRT$XCB gets merged in before .CRT$XCC. When the CRT iterates through the static init function pointers it will encounter file A before file B.
In Watcom the segment is XI and variations on #pragma initialize can control construction:
#pragma initialize before library
#pragma initialize after library
#pragma initialize before user
...see documentation for more
Read:
SO Initialization Order
SO Solving the Order of Initialization Problem
No, you can't. You should never rely on the order of construction/destruction of static objects.
You can always use a singleton to control the order of construction/destruction of your global resources.
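One common form of this is the function-local static accessor ("construct on first use"). The sketch below is an illustration with hypothetical names (event_log, Logger, Service, build_order): each object is constructed the first time its accessor is called, so a constructor can force its dependencies into existence first, and destruction then happens in the reverse order.

```cpp
#include <cassert>
#include <string>

// Constructed on first use, so it exists before anything that writes to it.
std::string& event_log() {
    static std::string log;
    return log;
}

struct Logger {
    Logger() { event_log() += "L"; }
};

Logger& logger() {
    static Logger instance;  // constructed the first time logger() is called
    return instance;
}

struct Service {
    Service() {
        logger();            // forces Logger to be constructed before Service finishes
        event_log() += "S";
    }
};

Service& service() {
    static Service instance;
    return instance;
}

// Touches the accessors in the "wrong" order and reports construction order.
std::string build_order() {
    service();
    logger();
    return event_log();
}
```

Even though service() is requested first, the log shows the Logger being constructed before the Service.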
Do you really need the variable to be initialized before main?
If you don't you can use a simple idiom to actually control the order of construction and destruction with ease, see here:
#include <cassert>
class single {
static single* instance;
public:
static single& get_instance() {
assert(instance != 0);
return *instance;
}
single()
// : normal constructor here
{
assert(instance == 0);
instance = this;
}
~single() {
// normal destructor here
instance = 0;
}
};
single* single::instance = 0;
int real_main(int argc, char** argv) {
//real program here...
//everywhere you need
single::get_instance();
return 0;
}
int main(int argc, char** argv) {
single a;
// other classes made with the same pattern
// since they are auto variables the order of construction
// and destruction is well defined.
return real_main(argc, argv);
}
It does not STOP you from actually trying to create a second instance of the class, but if you do, the assertion will fail. In my experience it works fine.
You can effectively achieve similar functionality by having a static std::optional<T> instead of a T. Just initialize it as you'd do with a variable, use with indirection and destroy it by assigning std::nullopt (or, for boost, boost::none).
It's different from having a pointer in that it has preallocated memory, which is I guess what you want. Therefore, if you destroy it & (perhaps much later) recreate it, your object will have the same address (which you can keep) and you don't pay the cost of dynamic allocation/deallocation at that time.
Use boost::optional<T> if you don't have std:: / std::experimental::.
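A minimal sketch of the std::optional approach, assuming a C++17 compiler. Resource, g_resource, start, stop and restart_keeps_address are hypothetical names; the helper checks that the object comes back at the same address after being destroyed and recreated.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

struct Resource {
    std::string name;
    explicit Resource(std::string n) : name(std::move(n)) {}
};

// Starts out empty; the storage for a Resource lives inside the optional.
static std::optional<Resource> g_resource;

void start(const std::string& name) { g_resource.emplace(name); }  // construct in place
void stop() { g_resource = std::nullopt; }                         // destroy now

// Destroys and recreates the object, then compares addresses.
bool restart_keeps_address() {
    start("first");
    const Resource* before = &*g_resource;
    stop();
    if (g_resource.has_value()) return false;
    start("second");
    return &*g_resource == before && g_resource->name == "second";
}
```

Construction and destruction happen exactly where start and stop are called, with no heap allocation for the Resource object itself.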