initialization order of thread_local vs. global variables - c++

C.h:
#include <iostream>
class C {
public:
explicit C(int id) { std::cout<<"Initialized "<<id<<"\n"; }
};
1.cpp:
#include "C.h"
C global(1);
2.cpp:
#include "C.h"
thread_local C thread(2);
int main() {}
My question is: Is it guaranteed that global will be initialized before thread?
The C++ standard is somewhat vague on this point, as far as I understand it. It says (from the C++17 n4659 draft):
[basic.start.static] Static initialization
Variables with static storage duration are initialized as a
consequence of program initiation. Variables with thread storage
duration are initialized as a consequence of thread execution.
It stands to reason that "program initiation" happens before "thread execution", but since both of those expressions appear in the standard only in that place, I'm seeking advice from actual language lawyers.

I'm going to use the C++20 working draft since the wording there is a little cleaner, although none of the real rules have changed.
First, thread_local behaves basically like static as far as non-local goes: [basic.stc.thread]/2:
[ Note: A variable with thread storage duration is initialized as specified in [basic.start.static], [basic.start.dynamic], and [stmt.dcl] and, if constructed, is destroyed on thread exit ([basic.start.term]). — end note ]
Yes, it's a note. But a non-local object declared thread_local is basically static so this makes sense.
Now, neither global nor thread have constant initialization - so both are zero initialized and then they have to undergo dynamic initialization. To [basic.start.dynamic]!
Dynamic initialization of a non-local variable with static storage duration is unordered if the variable is an implicitly or explicitly instantiated specialization, is partially-ordered if the variable is an inline variable that is not an implicitly or explicitly instantiated specialization, and otherwise is ordered.
Neither of our variables is a specialization, and neither is inline. So both are ordered.
A declaration D is appearance-ordered before a declaration E if
D appears in the same translation unit as E, or
the translation unit containing E has an interface dependency on the translation unit containing D,
in either case prior to E.
Our declarations are not appearance-ordered with respect to each other.
Dynamic initialization of non-local variables V and W with static storage duration are ordered as follows:
Alright, sub-bullet 1:
If V and W have ordered initialization and the definition of V is appearance-ordered before the definition of W, or if V has partially-ordered initialization, W does not have unordered initialization, and for every definition E of W there exists a definition D of V such that D is appearance-ordered before E,
Doesn't apply. It's a complicated condition, but it doesn't apply.
Otherwise, if the program starts a thread other than the main thread before either V or W is initialized, it is unspecified in which threads the initializations of V and W occur; the initializations are unsequenced if they occur in the same thread.
Nope, no threads.
Otherwise, the initializations of V and W are indeterminately sequenced.
There we go. global and thread are indeterminately sequenced.
Note also that:
It is implementation-defined whether the dynamic initialization of a non-local inline variable with static storage duration is sequenced before the first statement of main or is deferred.
and:
It is implementation-defined whether the dynamic initialization of a non-local non-inline variable with thread storage duration is sequenced before the first statement of the initial function of a thread or is deferred.

There is no such guarantee, and currently there cannot be one.
Imagine the following case: another, unrelated static global variable Z uses your thread_local variable during its initialization, or even creates another thread that uses it.
Now suppose Z happens to be initialized before your static global variable global. This implies that the thread_local variable had to be initialized before global as well.
Note: there is currently no way to guarantee the order in which static global variables in different translation units are initialized, which is a known C++ issue. So if one global variable uses another during its initialization, it may or may not lead to an error (technically undefined behavior). I don't think this affects thread_local variables in any way, as their initialization mechanism tends to be very different.

Related

Guarantee of deferred dynamic initialization of non odr-used global variable

Consider the following complete program consisting of two TU's:
// 1.cpp
bool init() { /* ... */ }
const auto _{init()};
// 2.cpp
int main() {}
Question: is there any guarantee that _ is initialized at some point (I do not care when)?
Now consider the program consisting of one TU:
// 1.cpp
bool init() { /* ... */ }
const auto _{init()};
int main() {}
Note that _ is not odr-used.
However, can main(), in the second case, be said to be odr-used, since it gets (sort of) "referred by the implementation" as it gets called when the program is run?
And if main() is odr-used, does this imply that _ is guaranteed to be initialized even if it's not odr-used?
EDIT:
This is what en.cppreference.com says about Deferred dynamic initialization:
If no variable or function is odr-used from a given translation unit,
the non-local variables defined in that translation unit may never be
initialized (this models the behavior of an on-demand dynamic library)
Can you answer my questions considering the above when reading my two examples?
It's supposedly the linker's job to collate all objects with static storage duration from all translation units for initialization during program initiation. However, it's a bit more than that: the guarantee is that those objects will be initialized before the use of any function within that translation unit.
basic.start.static/1: Variables with static storage duration are initialized as a
consequence of program initiation....
Also see:
basic.stc.static/2: If a variable with static storage duration has initialization or a
destructor with side effects, it shall not be eliminated even if it
appears to be unused...
The object _ is guaranteed to be initialized. According to [basic.start.static]/1,
Variables with static storage duration are initialized as a consequence of program initiation. Variables with
thread storage duration are initialized as a consequence of thread execution.
In case you were wondering whether that could be read only as guaranteeing that static initialization shall occur, and not guaranteeing that dynamic initialization shall occur, see [dcl.dcl]/11,
A definition causes the appropriate amount of storage to be
reserved and any appropriate initialization (11.6) to be done.
Thus, all initialization required by the semantics of the initializer {init()} shall be performed on the object _.
As usual, the as-if rule applies. If init() has any observable behaviour, that behaviour must occur. If it has any side effects that affect observable behaviour, those side effects must occur.
The fact that _ is not odr-used is irrelevant. The tangent about main is irrelevant too.

constructor execution sequence/order: dependent initialization of static variable (class instance) in a function

For the following code segment:
class Bar {
public:
int x;
int y;
Bar(int _x, int _y) { /* some codes here */ ...}
};
class Foo {
public:
int x;
int y;
int z;
Foo(Bar b):x(b.x), y(b.y)
{
z = someFunction(x, y);
}
};
void f(int x, int y)
{
Bar b(x, y);
static Foo x(b);
}
int main()
{
f(2, 3);
}
In my mind, a static variable inside a function should be initialized even before main(). However, the static variable x of type Foo depends on a local variable b of type Bar.
The questions are:
1) When does the constructor of x execute? That is, is x initialized on the first invocation, once the local variable b exists? I don't want the result of one particular compiler; I want to know whether it is well-defined in the C++ language.
2) Is it a valid program?
3) Is it a good practice?
In my mind, a static variable inside a function should be initialized even before main()
Your mind is incorrect... at least partly. A static local variable may be initialized early in some situations, but not in a case where the constructor depends on a local variable such as this one.
n3242 draft of the standard §6.7/4:
... An implementation is permitted to perform early initialization of other block-scope variables with static or thread storage duration under the same conditions that an implementation is permitted to statically initialize a variable with static or thread storage duration in namespace scope (3.6.2). Otherwise such a variable is initialized the first time control passes through its declaration; ...
For completeness, here are the requirements for constant (static) initialization, §3.6.2/2:
Constant initialization is performed:
— if each full-expression (including implicit conversions) that appears in the initializer of a reference with
static or thread storage duration is a constant expression (5.19) and the reference is bound to an lvalue
designating an object with static storage duration or to a temporary (see 12.2);
— if an object with static or thread storage duration is initialized by a constructor call, if the constructor is
a constexpr constructor, if all constructor arguments are constant expressions (including conversions),
and if, after function invocation substitution (7.1.5), every constructor call and full-expression in the
mem-initializers is a constant expression;
— if an object with static or thread storage duration is not initialized by a constructor call and if every
full-expression that appears in its initializer is a constant expression.
1) x is initialized when execution reaches its declaration for the first time, and that's when the constructor is run. So b is fully initialized when the initialization of x starts.
2) As far as the initialization dependency is concerned, yes.
3) Sure, if you need that, a constructor of a static local object may depend on a local object, as long as you don't refer to that local object after it's out of scope. In this case you simply copy its members, so you don't depend on it after constructing x.
According to the C++ Standard (6.7 Declaration statement)
4 The zero-initialization (8.5) of all block-scope variables with
static storage duration (3.7.1) or thread storage duration (3.7.2) is
performed before any other initialization takes place. ...Otherwise
such a variable is initialized the first time control passes through
its declaration; such a variable is considered initialized upon the
completion of its initialization. If the initialization exits by
throwing an exception, the initialization is not complete, so it will
be tried again the next time control enters the declaration. ...
Thus local static variables are zero-initialized before the function ever gets control, and they are then initialized using their initializers (or constructors) the first time control passes through their declarations.
Vlad did the hard part of finding references in C++ standard.
Now for your questions :
When will the constructor of x be executed? Per Vlad's answer, it will be executed the first time you call f, and x will then keep its value through all later calls.
Is it a valid program? The current program does not compile, but because of other mistakes: x and y are private in Bar, and in f, x is already a parameter. But initializing a static variable with the values passed at the first invocation is fine.
Is it a good practice? It is not a common usage, so it should be explained in a comment for later readers or maintainers. Apart from that, nothing is wrong with it, provided you know why you use this construct.

Does the C++ standard require that dynamic initialization of static variables be performed in the main thread?

Does the C++ standard require that dynamic initialization of non-local static variables be performed in the same thread that calls main()?
More specifically, in C++11, is std::this_thread::get_id() guaranteed to return the same result in static initializers and inside main()?
Edit:
Even more specifically, given the following code:
#include <iostream>
#include <thread>
static std::thread::id id = std::this_thread::get_id();
int main()
{
std::cout << id << "\n";
std::cout << std::this_thread::get_id() << "\n";
return 0;
}
are the two emitted thread IDs required/guaranteed to match?
No. The standard nowhere provides such a guarantee, and in fact the contrary is implied by [basic.start.init]/p2:
If a program starts a thread (30.3), the subsequent initialization of
a variable is unsequenced with respect to the initialization of a
variable defined in a different translation unit. Otherwise, the
initialization of a variable is indeterminately sequenced with respect
to the initialization of a variable defined in a different translation
unit. If a program starts a thread, the subsequent unordered
initialization of a variable is unsequenced with respect to every
other dynamic initialization. Otherwise, the unordered initialization
of a variable is indeterminately sequenced with respect to every other
dynamic initialization.
There would be no need to weaken the sequencing guarantee in the presence of threads if all initializations had to be performed on the same thread.
The standard doesn't say anything about which thread should perform such initialization. It only requires specific orderings and guarantees:
3.6.2 Initialization of non-local variables [basic.start.init]
2. Static initialization shall be performed before any dynamic initialization takes place. [...] Variables with ordered initialization defined within a single translation unit shall be initialized in the order of their definitions in the translation unit. [...] If a program starts a thread, the subsequent initialization of a variable is unsequenced with respect to the initialization of a variable defined in a different translation unit. Otherwise, the initialization of a variable is indeterminately sequenced with respect to the initialization of a variable defined in a different translation unit.
4. It is implementation-defined whether the dynamic initialization of a non-local variable with static storage duration is done before the first statement of main. If the initialization is deferred to some point in time
after the first statement of main, it shall occur before the first odr-use of any function or variable defined in the same translation unit as the variable to be initialized.
5. It is implementation-defined whether the dynamic initialization of a non-local variable with static or thread storage duration is done before the first statement of the initial function of the thread. If the initialization is deferred to some point in time after the first statement of the initial function of the thread, it shall occur before the first odr-use of any variable with thread storage duration defined in the same translation unit as the variable to be initialized.
However, most implementations will do so: initialization of static non-local variables is performed in the same thread that calls main(). Example from Visual C++ 11:
#include <iostream>
#include <thread>
using namespace std;
struct Cx
{
public:
Cx()
{
cout<<"Cx: "<<std::this_thread::get_id()<<endl;
}
};
static Cx c;
int main()
{
cout<<"Main: "<<std::this_thread::get_id()<<endl;
return 0;
}
Output:
Cx: 5820
Main: 5820
No, though it may be a good idea to write your program that way. The language requires that static initialization happen in a deterministic way, but does not dictate things like the thread involved.

Is the static initialization of global variables completed before `main()`?

Some relevant excerpts from the C++ standard 1998:
The storage for objects with static storage duration shall be zero-initialized before any other initialization takes place. Zero-initialization and initialization with constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types with static storage duration initialized with constant expressions shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
It is implementation-defined whether or not the dynamic initialization of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.
Consider the following code.
#include <iostream>
int a = 1;
int main()
{
std::cout << a << std::endl;
return 0;
}
According to the standard, the static initialization takes place before the dynamic initialization, and the dynamic initialization may take place after main() is entered. My question is: is the global variable a initialized to be 1 before main() is entered? Then if all the threads are created after main() is entered the static initialization of global variables is guaranteed to be thread-safe.
The standard says that all objects in the same translation unit (roughly, the object file that corresponds to a single source file) are initialized before any function in that translation unit is called. In your example, it looks like they are in the same file, so a will be initialized before main() is called.
The standard is allowing lazy initialization to occur in the event that a DLL is loaded at run time. If you allow run-time linkage of your code, you can't say that everything is initialized before main().
Yes, if you had an object outside a function such as
Foobar foo;
If Foobar has a constructor, it would nominally run before main(). Likewise, its destructor runs after main() exits. I would be hesitant to make use of this feature, though. One issue: if you have these sorts of objects in multiple files, the order of creation is indeterminate.

Are static class members guaranteed to be initialized before `main` is called?

Is there any guarantee that static class members are initialized before main is called?
I think no:
[C++03: 3.6.2/3]: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.
Hmm, really?
Well, arguably, "defined in namespace scope" is not quite the same thing as "an object of namespace scope":
[C++03: 9.4.2/2]: The declaration of a static data member in its class definition is not a definition and may be of an incomplete type other than cv-qualified void. The definition for a static data member shall appear in a namespace scope enclosing the member's class definition. In the definition at namespace scope, the name of the static data member shall be qualified by its class name using the :: operator. The initializer
expression in the definition of a static data member is in the scope of its class (3.3.6).
However, it's the initializer that's in the class's scope; there's no mention of the static member itself having anything other than namespace scope (unless we mentally inject the word "lexically" everywhere).
There is this pleasing paragraph:
[C++03: 9.4.2/7]: Static data members are initialized and destroyed exactly like non-local objects (3.6.2, 3.6.3).
However, unfortunately, the only further definition of the sequencing of main and static initialisation, with respect to "non-local objects", is the aforementioned [C++03: 3.6.2/3].
So what then?
I believe that the intent of this otherwise potentially ambiguous rule is clearly shown by the new wording in C++11, which resolves everything:
[C++11: 9.4.2/6]: Static data members are initialized and destroyed exactly like non-local variables (3.6.2, 3.6.3).
[C++11: 3.6.2/4]: It is implementation-defined whether the dynamic initialization of a non-local variable with static storage duration is done before the first statement of main. [..]
C++03: In short, no guarantee
C++11: No guarantee, see Lightness' answer.
My interpretation/analysis of the C++03 statements:
Terminology: [basic.start.init]/1
Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization.
Order of initialization on non-local objects:
Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place.
But it doesn't mention when "any other initialization" takes place, i.e. there's no guarantee it'll be before the first statement of main, even for zero-initialization.
Objects of POD types (3.9) with static storage duration initialized with constant expressions (5.19) shall be initialized before any dynamic initialization takes place.
But again, no guarantee.
Dynamic initialization
[basic.start.init]/3
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.
But what is an "object of namespace scope"? I have not found any clear definition in the Standard. Scope is actually a property of a name, not of an object. Therefore we could read this as "object defined in namespace scope" or "object introduced by a name of namespace scope". Note the reference "9.4" after dynamic initialization. It refers to "Static members", which can only mean static data members. So I'd say it means "object defined at namespace scope", as static data members are defined at namespace scope:
[class.static.data]/2
The definition for a static data member shall appear in a namespace scope enclosing the member’s class definition.
Even if you don't agree on this interpretation, there's still
[basic.start.init]/1
Objects with static storage duration defined in namespace scope in the same translation
unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
This clearly applies to static data members, which means that they cannot be initialized differently than objects introduced by names of namespace scope if there's such an object before the definition of the static data member. That is, if there was no guarantee at all on the dynamic initialization of static data members, the guarantees of any preceding object introduced by a name of namespace scope would apply - which are: none (it does not have to be initialized before the first statement of main).
If there's no such object preceding the definition of the static data member and you disagree on the interpretation - there would be no guarantee on the dynamic initialization of static data members at all.
Conclusion
So we only have a guarantee that dynamic initialization happens sometime (before any usage) plus an exception that initialization with side-effects must not be eliminated. Still, we have no guarantee that any kind of initialization of non-local objects is performed before the first statement of main.
Note: There are workarounds, like:
#include <iostream>
struct my_class
{
static int& my_var()
{
static int i = 42;
return i;
}
};
int j = ++my_class::my_var();
int k = ++my_class::my_var();
int main()
{
std::cout << j << " : " << k << std::endl;
}