I've recently become painfully aware of the Static Initialization Order Fiasco. I am wondering though if the rule that "initialization order is undefined across translation units" still holds for static members in a parent class which are needed by static members in a child class.
For example, say we have (excluding, for brevity, all the # guards and includes)
// a.h
class A {
static int count;
static int register_subclass();
};
// a.cpp
int A::count = 0;
int A::register_subclass() {
return count ++;
}
And then a subclass of A,
// b.h
class B : public A {
static int id;
};
// b.cpp
int B::id = A::register_subclass();
There are two translation units here with static objects in one depending on static objects in the other on initialization... it seems like it may be an instance of the static initialization order fiasco.
My question is: is it actually safe?
That is, am I guaranteed that there is no chance that B::id will contain junk copied from A::count before the latter is initialized? From my own tests, A always seems to get initialized first, but I'm not sure how to introduce noise in the initialization order to increase the probability of failing if the behavior is undefined.
Generally, it is not safe to rely on the static initialization order of a base class and derived class. There is no guarantee that the static initialization of A will happen before B. That is the definition of the static initialization order fiasco.
You could use the construct on first use idiom:
// a.h
class A {
private:
static int& count();
protected:
static int register_subclass();
};
// a.cpp
int& A::count() {
static int count = 0;
return count;
}
int A::register_subclass() {
return count()++;
}
// b.h
class B : public A {
public:
static int id;
};
// b.cpp
int B::id = A::register_subclass();
Update: That said, bogdan pointed out in the comments:
according to [3.6.2] in the Standard, the order of initialization in this specific example is guaranteed. It has nothing to do with inheritance, but with the fact that the initialization of A::count is constant initialization, which is guaranteed to be done before dynamic initialization, which is what B::id uses.
But unless you have a complete grasp of such intricacies, I recommend using the construct on first use idiom.
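The guarantee quoted above can be checked with a small single-file sketch. It assumes a hypothetical variant of the question's classes (made structs so the members are accessible), defines B::id before A::count on purpose, and starts the counter at 100 rather than 0 so the result cannot be mistaken for zero-initialization. The check_order helper is not part of the original code.

```cpp
#include <cassert>

// Hypothetical single-file variant of the A/B example from the question.
struct A {
    static int count;
    static int register_subclass() { return count++; }
};

struct B : A {
    static int id;
};

// Dynamic initialization: the initializer calls a function at run time.
// It is deliberately defined before A::count in this file.
int B::id = A::register_subclass();

// Constant initialization: the initializer is a constant expression, so the
// value is already in place before any dynamic initialization runs.
int A::count = 100;

// Hypothetical helper that verifies the observed order.
void check_order() {
    assert(B::id == 100);     // saw the constant-initialized value
    assert(A::count == 101);  // incremented exactly once
}
```

If A::count were instead initialized by a non-constant expression, the reasoning would no longer apply and the usual ordering rules would decide the outcome.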
The idiom is fine in this case, but be careful with functions like A::register_subclass in a multi-threaded context. The increment of the counter is not atomic, so simultaneous calls from multiple threads are a data race and anything could happen.
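One possible way to make the registration safe for concurrent callers is to keep the construct on first use idiom and swap the plain int for a std::atomic<int>. This is only a sketch: the counter() name is an assumption, and register_subclass is made public here purely so it can be exercised from outside the class.

```cpp
#include <atomic>
#include <cassert>

class A {
public:
    // fetch_add increments atomically and returns the previous value,
    // so concurrent callers each receive a distinct id.
    static int register_subclass() {
        return counter().fetch_add(1, std::memory_order_relaxed);
    }

private:
    // Hypothetical accessor: initialization of a function-local static
    // is itself thread-safe since C++11.
    static std::atomic<int>& counter() {
        static std::atomic<int> value{0};
        return value;
    }
};
```

Relaxed ordering is enough here because the only requirement is that the ids are unique; nothing else is synchronized through the counter.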
I am wondering though if the rule that "initialization order is undefined across translation units" still holds for static members in a parent class which are needed by static members in a child class.
Yes, it does.
The only way that static data members relate to inheritance hierarchies (or, really, their encapsulating classes at all) is in their fully qualified names; their definition and initialisation is completely unaware/uncaring of this.
Why and when does the compiler optimize away static member variables? I have the following code
#include <iostream>
#include <typeinfo>
class X {
public:
X(const char* s) { std::cout << s << "\n"; };
};
template <class S> class Super {
protected:
Super() { (void)m; };
static inline X m { typeid(S).name() };
};
class A : Super<A> {
};
class B : Super<B> {
B() {};
};
class C {
static inline X m { "c" };
};
A a {};
int main() { return 0; }
On the output I can see that Super<A>::m, Super<B>::m, and C::m are all initialized.
Super<A>::m is not initialized if the statement A a {}; is removed. It makes sense because then m is never accessed. However this does not explain why it is not removed for B and C.
Is this behavior specified or is it an artifact of how the compiler detects unused variables?
The definition of a static data member of a class template specialization is implicitly instantiated only if it is used in such a way that a definition would be required.
For the class B you are, unconditionally, defining the default constructor. The default constructor uses the default constructor of Super<B> to initialize the base, meaning that the definition of the Super<B>::Super() constructor will be implicitly instantiated. This constructor's definition is odr-using m in (void)m; and therefore Super<B>::m's definition will also be implicitly instantiated.
In the case of class A, you are not explicitly defining any constructor. The implicit special member functions will be defined only when they are used in such a way that a definition would be required. In the line A a {}; you are calling the implicit default constructor of A and hence it will be defined. The definition will be calling the default constructor of Super<A> as before, requiring the Super<A>::m's definition to be instantiated. Without A a {}; there is nothing in the code requiring a definition of any special member function of A or the default constructor of Super<A> or the definition of m. Therefore none of them will be defined.
In the case of C, there is no template for which we would need to consider instantiation. C::m is explicitly defined.
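The instantiation rule described above can be observed with a small sketch. All names here (Registry, bump, g_init_count, touch_int, size_of_double_registry) are made up for illustration. Only the specialization whose static member is odr-used gets a definition, so only one initializer ever runs.

```cpp
#include <cassert>

// Counts how many static members were actually initialized.
// Constant-initialized, so it is ready before any dynamic initialization.
int g_init_count = 0;

int bump() { return ++g_init_count; }

template <class T>
struct Registry {
    // The definition is instantiated only for specializations
    // in which `token` is odr-used.
    static inline int token = bump();
};

// Odr-uses Registry<int>::token, which forces its definition to exist.
int touch_int() { return Registry<int>::token; }

// Requires Registry<double> to be a complete type but never uses its static
// member, so Registry<double>::token is never defined or initialized.
int size_of_double_registry() {
    return static_cast<int>(sizeof(Registry<double>));
}
```

Calling touch_int() yields 1 and g_init_count stays at 1, whether the implementation initializes token before main or defers it to the first odr-use.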
Given that the static data member is defined, it must (generally) be initialized eventually. All of the inline static data members here have dynamic initialization with observable side effects, so the initialization must happen at runtime. It is implementation-defined whether they will be initialized before main's body starts execution or whether initialization will be deferred up to the first non-initialization odr-use of the inline static data member. (This is meant to allow for dynamic libraries.)
You aren't actually non-initialization odr-using any of the inline static data members, so it is implementation-defined whether they will actually be initialized at all. If the implementation does define the initialization to not be deferred, then all of these inline static data members which have been defined will also be initialized before main is entered.
The order in which the initializations will happen is indeterminate. The static data members of the class template specializations have unordered initialization, meaning they have no ordering guarantees with any other dynamic initialization. And there is only one static data member which isn't specialized from a template and that one is inline and therefore only partially ordered, although there is nothing else it could be ordered with.
Actually, there is one additional static storage duration object which will be initialized here, a global variable of type std::ios_base::Init included through <iostream>. The initialization of this variable causes the initialization of the standard streams (std::cout, etc.). Because your inline static data members from the templates have unordered initialization, they will not be ordered with this initialization. Similarly if you had multiple translation units containing C::m, it would also not be ordered with it. As a consequence you might be using std::cout before it is initialized, causing undefined behavior. You can cause early initialization of the standard streams by constructing an object of type std::ios_base::Init:
class X {
public:
X(const char* s) {
[[maybe_unused]] std::ios_base::Init ios_base_init;
std::cout << s << "\n";
};
};
Aside from considerations such as above, the compiler is not allowed to remove static data members if their initialization has observable side effects. Of course the as-if rule still applies as always meaning that the compiler can compile to whatever machine instructions which will result in the same observable behavior as described above.
For practical purposes you should also be careful. There are some compiler flags that are sometimes used for code size optimization which will eliminate dynamic initialization if the variable seems to be unused. (Although that is not standard-conforming behavior.) For example the --gc-sections linker flag together with GCC's -ffunction-sections -fdata-sections can have this effect.
As you can see dynamic initialization of static storage duration objects is kind of complicated in C++. In your case here there are only minor dependency issues, but this can quickly become very messy, which is why it is usually recommended to avoid it as much as possible.
Just wondering if in the following the static members are initialised before the Foo class object is initialised. Since both are static variables, one a static member and the other a static global variable, initialisation order is not guaranteed or specified.
struct Foo
{
Foo() { assert(a == 7); }
static inline int a = 7;
};
Foo foo;
int main()
{
}
So the initialisation order between the global Foo and the static class member is not defined, you would think there is no guarantee. However, I'm thinking that before a Foo is instantiated that the Foo class would need to be completed/initialised first, and so in that case that there might be a guarantee that the static member variable would be initialised first.
I'm thinking that before a Foo is instantiated that the Foo class would need to be completed/initialised first
That is not generally the case, so you should be careful with that assumption.
However, in your specific example, the order of initialization is guaranteed. The initializer of a is a constant expression and therefore a will be constant-initialized. Constant-initialization is guaranteed to happen before any dynamic initialization, which the initialization of foo is.
Even if a were not constant-initialized, there wouldn't be an issue here, because foo is defined after a in the same translation unit, foo is not an inline or template variable, and Foo is not a template. If any of these requirements were not fulfilled, there could be problems in the ordering guarantees.
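The second case can be sketched as follows, assuming a hypothetical non-constexpr helper compute_seven so that a cannot rely on being a constant expression, and a hypothetical seen member that records what the constructor observed.

```cpp
#include <cassert>

// Hypothetical helper: not constexpr, so `a` is not required to be
// constant-initialized.
int compute_seven() { return 7; }

struct Foo {
    // Records the value of `a` seen while the global object is constructed.
    Foo() : seen(a) {}
    int seen;
    static inline int a = compute_seven();
};

// Defined after Foo::a in this translation unit; foo is neither inline nor a
// template variable, so Foo::a is initialized first.
Foo foo;
```

Moving the definition of the global into a different translation unit that does not see Foo::a first, or turning Foo into a template, would remove that guarantee.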
With the new feature of C++11, we are able to do in-class member initialisation. But static data members still cannot be defined in class.
class A
{
static const int i = 10;
int j = 10;
const int k = 20;
static int m = 10; // error: non-const static data member must be initialized out of line
};
Why is this feature not provided?
Non-static data member in-class initialization
First off, this is completely different from static member initialization.
In-class member initialization is just syntactic sugar that is transformed into a constructor initializer list.
E.g.
struct X
{
int a_ = 24;
int b_ = 11;
int c_;
X(int c) : c_{c}
{
}
X(int b, int c) : b_{b}, c_{c}
{
}
};
This is pretty much equivalent to:
struct X
{
int a_;
int b_;
int c_;
X(int c) : a_{24}, b_{11}, c_{c}
{
}
X(int b, int c) : a_{24}, b_{b}, c_{c}
{
}
};
Just syntactic sugar. Nothing that couldn't have been done before C++11 with more verbose code.
Static data member in-class initialization
Things are more complicated here because there has to be just 1 symbol for the static data member. You should read about ODR (One Definition Rule).
Let's start with const static data member. You might be surprised that initialization is allowed only from compile time constant expressions:
auto foo() { return 24; }
constexpr auto bar() { return 24; }
struct X
{
static const int a = foo(); // Error
static const int b = bar(); // Ok
};
The actual rule (well not a rule per se, but a reasoning if you will) is more general (for both const and non-const static data members): static data member initialization, if in line, must be a compile time expression. This effectively means that the only static data member in line initialization allowed is for const static data members with constexpr initialization.
Now let's see the reasoning behind that: if you have an in line initialization that would make it a definition and this means that every compilation unit where the definition of X appears will have a X::a symbol. And every such compilation unit will need to initialize the static member. In our example foo would be called for each compilation unit that includes directly or indirectly the header with the definition of X.
The first problem with this is that it's unexpected. The number of calls to foo will depend on the number of compilation units that have included X, even if you wrote a single call to foo for a single initialization of a single static member.
There is a more serious problem though: foo not being a constexpr function nothing prevents foo from returning different results on each invocation. So you will end up with a bunch of X::a symbols which should be under ODR but each of them initialized with different values.
If you are still not convinced, then there is a third problem: having multiple definitions of X::a would simply be a violation of the ODR. So... the previous two problems are just some of the motivations for why the ODR exists.
Forcing an out-of line definition for X::a is the only way which allows correct definition and initialization of X::a: in a single compilation unit. You can still mess up and write the out of line definition and initialization in a header, but with an in line initialization you definitely have multiple initializations.
As n.m. showed since C++17 you have inline data members and here we are allowed in-class initialization:
struct X
{
static inline int i = foo();
};
Now we can understand why: with inline the compiler will choose just one definition of X::i (from one compilation unit), so you have just one evaluation of the initialization expression, taken from that compilation unit. Note that it is still your duty to respect ODR.
It is provided in C++17.
static inline int m = 10;
Q. Why wasn't it provided in C++11?
A. Because it wasn't ready back then. (You can ask the same question about every single new language feature, and the answer is always the same too.)
Q. Why does it require the inline keyword?
A. For simplicity of compiler development, better expressivity, and/or better consistency with the other parts of the language. Most likely there's some weighted combination of several factors.
"This feature is not provided" because in-class initialization for non-static and static members are semantically very different features. Your surprise is based on the fact that they look superficially similar. But in reality they have nothing in common.
In-class declaration of a static data member is just that - a declaration. It does not provide definition for that member. The definition has to be provided separately. The placement of that definition in the code of your program will have consequences. E.g. it will also define the initialization behavior for that static data member (the order of initialization) and it will affect the exported symbols in object files. This is why choosing the location for that definition is your responsibility. The language wants you to do it and it wants you to do it explicitly. These issues do not apply to non-static members, which is what makes them fundamentally different.
If you don't care about such matters, starting from C++17 you can explicitly tell the compiler that you don't care by declaring your static member inline. Once you do that, you'll be able to initialize it in-class.
c.hpp:
class C
{
private:
static SomeClass var;
public:
static void f()
{
// Uses var;
}
};
c.cpp:
SomeClass C::var;
Is it always safe to call C::f()? For instance, from constructor of some global variable defined in a different compilation unit?
No. The initialization order of static variables across compilation units is unspecified (function-local statics are the exception), and relying on it leads to disaster in the worst way possible.
The technical term is "Static Initialization Order Fiasco". It's real, and googleable.
The trick is to not use globals in any form. Function-local statics are incredibly useful and should be used where appropriate, provided you know when that is.
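Applied to the class from the question, the function-local static approach might look like the sketch below. SomeClass is replaced by a hypothetical stand-in, f() is changed to return a value so the effect is visible, and EarlyUser is an invented global whose constructor runs during static initialization.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the question's SomeClass.
struct SomeClass {
    std::string name{"ready"};
};

class C {
public:
    // Uses var() instead of a static data member defined in c.cpp.
    static std::string f() { return var().name; }

private:
    static SomeClass& var() {
        static SomeClass instance;  // constructed on the first call
        return instance;
    }
};

// Hypothetical global whose constructor calls C::f() before main starts.
struct EarlyUser {
    EarlyUser() : seen(C::f()) {}
    std::string seen;
};

EarlyUser g_early_user;
```

Because the instance is created the first time var() is called, C::f() works no matter which compilation unit's initializers run first. Destruction order at program exit is a separate concern that this sketch does not address.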
I'm using a third party C++ library that requires the definition of a global array of structures for it to use. I don't like this design but I'm stuck with it.
LibStruct g_Structs[] =
{
{ /* structure initialization data */ },
{ /* structure initialization data */ },
// etc
};
int g_NumStructs = sizeof(g_Structs) / sizeof(g_Structs[0]);
I'd like to break this down a bit so that classes can supply a structure definition that applies for that class. This risks causing the static initialization fiasco unless the initialization is done safely using getters or after main() begins.
If I declare MyStructs for each class in the header file for that class as static data members with __declspec(selectany) attribute (Visual C++ specific) then it seems to work. Does the selectany attribute have an effect on the construction order of that data? Does the appearance of the selectany definition of the static data member before the global array actually mean that it's constructed in that order? Or does this behaviour just depend on which of the multiple selectany definitions gets thrown away by the linker? Are there any guarantees with selectany?
// In the header for Class1
static const LibStruct __declspec(selectany) Class1::m_MyStruct = { /* structure initialization data */ };
// In the header for Class2
static const LibStruct __declspec(selectany) Class2::m_MyStruct = { /* structure initialization data */ };
// Danger - potential for static initialization fiasco?
LibStruct g_Structs[] =
{
Class1::m_MyStruct,
Class2::m_MyStruct,
// etc
};
int g_NumStructs = sizeof(g_Structs) / sizeof(g_Structs[0]);
CLARIFICATION:
In this case MyStruct is a C style struct that has no constructors or virtual functions and only contains pointers which are all initialized to point to other global data or non-member functions. So the initialization of a global const MyStruct using = {} syntax shouldn't run any constructor code.
Adding my own answer as there haven't been any other answers.
It seems that this code will work but only because the Microsoft compiler implementation chooses to initialize global const primitive data by mapping it into memory from the .data section of the executable image before any static initialization code runs. So it's implementation dependent, but dependent on the compiler's static initialization strategy, not the use of __declspec(selectany).
Stepping through the entry point of the application shows that this appears to be correct. The static const MyStruct objects are already initialized at the entry point but the global array is initialized later in static construction code.
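A portable way to get the same effect without depending on the compiler's strategy might be to make each per-class structure constexpr, so that the global array is constant-initialized as the standard requires. This is a sketch under assumptions: LibStruct here is a made-up stand-in with two fields, the handler functions are invented, and it needs C++17, where constexpr static data members are implicitly inline.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's C-style struct.
struct LibStruct {
    const char* name;
    int (*handler)(int);
};

int class1_handler(int x) { return x + 1; }
int class2_handler(int x) { return x * 2; }

struct Class1 {
    static constexpr LibStruct m_MyStruct{"Class1", &class1_handler};
};
struct Class2 {
    static constexpr LibStruct m_MyStruct{"Class2", &class2_handler};
};

extern LibStruct g_Structs[];

// Dynamic initializer placed before the array's definition on purpose: it
// still sees valid data because the array is constant-initialized.
int g_early_result = g_Structs[1].handler(21);

// Every initializer is a constant expression, so no run-time code is needed.
LibStruct g_Structs[] = {
    Class1::m_MyStruct,
    Class2::m_MyStruct,
};
int g_NumStructs = sizeof(g_Structs) / sizeof(g_Structs[0]);
```

If any entry required a run-time computation, the array would fall back to dynamic initialization and the ordering problem would return.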