How does program know if static variable needs to be initialized? [duplicate] - c++

As in the title: how does the program know that foo is already initialized when the function is called a second time?
int getFoo()
{
static int foo = 30;
return foo;
}
int main()
{
getFoo();
getFoo();
}
I want to know whether the program stores some additional information about which static variables have already been initialized.
Edit:
I found an answer here:
Why does initialization of local static objects use hidden guard flags?
As I guessed, most compilers store an additional "guard variable".
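To make the guard idea concrete, here is a minimal single-threaded sketch of what a compiler might conceptually generate. The names computeInitialValue, getFooByHand, fooGuard and fooStorage are made up for illustration; real implementations use their own ABI-specific guard layout and add locking for thread safety.

```cpp
// Counts how often the initializer runs; for illustration only.
int initializerCalls = 0;

// Hypothetical initializer that is not a constant expression.
int computeInitialValue()
{
    ++initializerCalls;
    return 30;
}

// What the programmer writes.
int getFoo()
{
    static int foo = computeInitialValue();
    return foo;
}

// Roughly what the compiler might turn it into (assumption: single-threaded,
// no exceptions thrown by the initializer).
int getFooByHand()
{
    static bool fooGuard = false; // constant-initialized, so it needs no guard itself
    static int fooStorage;        // zero-initialized storage standing in for foo

    if (!fooGuard) {
        fooStorage = computeInitialValue();
        fooGuard = true;          // set only after initialization has completed
    }
    return fooStorage;
}
```

Calling either function repeatedly runs computeInitialValue only once per function; every later call just tests the flag and returns the stored value.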

Have a look at [stmt.dcl]/4:
Dynamic initialization of a block-scope variable with static storage duration or thread storage duration is performed the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. If the initialization exits by throwing an exception, the initialization is not complete, so it will be tried again the next time control enters the declaration. If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization. If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.

You have to be careful here. Primitive statics are initialised at compile time (as long as the initialisation value is a compile-time constant, as Peter points out), so in your example, getFoo in effect just returns a constant.
HOWEVER...
statics which initialise an object (or initialise a primitive by calling a function) perform said initialisation when the scope in which they are declared is entered for the first time.
Furthermore, as of C++11 this has to be done in a thread-safe way, which generates a lot of extra code (although not much runtime overhead after the first time through), and that might be an issue on, say, a micro-controller where code size often matters.
Here's a concrete example:
#include <iostream>
struct X
{
X () { std::cout << "Initialising m\n"; m = 7; }
int m;
};
void init_x ()
{
static X x;
}
int main () {
std::cout << "main called\n";
init_x ();
std::cout << "init_x returned\n";
}
Output:
main called
Initialising m
init_x returned
Live demo: https://wandbox.org/permlink/NZApcYYGwK36vRD4
Generated code: https://godbolt.org/z/UUcL9s
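The thread-safety requirement described in this answer can be observed with a small experiment. This is only a sketch: Counted, touchStatic and runThreads are hypothetical names, and the thread count passed in is arbitrary.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Number of times the constructor has run.
std::atomic<int> constructions{0};

struct Counted
{
    Counted() { ++constructions; }
};

void touchStatic()
{
    // Since C++11, concurrent callers wait here until one of them has
    // finished the initialization; the constructor runs exactly once.
    static Counted instance;
    (void)instance;
}

// Starts threadCount threads that all race to touchStatic() and returns
// how many constructions took place in total.
int runThreads(int threadCount)
{
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back(touchStatic);
    for (auto& t : threads)
        t.join();
    return constructions.load();
}
```

Calling runThreads(8) from main should report a single construction, no matter which thread wins the race.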


Is static initialization atomic across all objects?

C++11 guarantees that the initialization of static local variables is atomic at the first call of the function. Although the standard doesn't mandate any implementation, the only way to handle this efficiently is double-checked locking.
I asked myself whether all objects are initialized under the same mutex (likely) or whether each static object's initialization uses its own mutex (unlikely). So I wrote this little C++20 program that uses some variadic and fold-expression tricks to create a number of different functions that each initialize their own static object:
#include <iostream>
#include <utility>
#include <latch>
#include <atomic>
#include <chrono>
#include <thread>
using namespace std;
using namespace chrono;
atomic_uint globalAtomic;
struct non_trivial_t
{
non_trivial_t() { ::globalAtomic = ~::globalAtomic; }
non_trivial_t( non_trivial_t const & ) {}
~non_trivial_t() { ::globalAtomic = ~::globalAtomic; }
};
int main()
{
auto createNThreads = []<size_t ... Indices>( index_sequence<Indices ...> ) -> double
{
constexpr size_t N = sizeof ...(Indices);
latch latRun( N );
atomic_uint synch( N );
atomic_int64_t nsSum( 0 );
auto theThread = [&]<size_t I>( integral_constant<size_t, I> )
{
latRun.arrive_and_wait();
if( synch.fetch_sub( 1, memory_order_relaxed ) > 1 )
while( synch.load( memory_order_relaxed ) );
auto start = high_resolution_clock::now();
static non_trivial_t nonTrivial;
nsSum.fetch_add( duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count(), memory_order_relaxed );
};
(jthread( theThread, integral_constant<size_t, Indices>() ), ...);
return (double)nsSum / N;
};
constexpr unsigned N_THREADS = 64;
cout << createNThreads( make_index_sequence<N_THREADS>() ) << endl;
}
I create 64 threads with the above code since my system has up to 64 CPUs in a processor group (Ryzen Threadripper 3990X, Windows 11). The results matched my expectations: each initialization is reported to take about 7.000ns. If each initialization acted on its own mutex, the mutex locks would take the short path, there would be no kernel contention, and the times would be orders of magnitude lower. So are there any further questions?
The question I asked myself afterwards is: what happens if the constructor of the static object has its own static object? Does the standard explicitly mandate that this should work, forcing the implementation to make the mutex recursive?
No, static initialization is not atomic across all objects. Different static objects may get initialized by different threads simultaneously.
It just so happens that GCC and Clang do in fact use a single global recursive mutex (to handle the recursive case you described, which is required to work), but other compilers use a mutex for every static function-local object (e.g. Apple's compiler). Therefore you can't rely on static initialization happening one object at a time, simply because it doesn't, depending on your compiler (and the version of that compiler).
Section 6.7.4 of the standard:
A local object of POD type (basic.types) with static
storage duration initialized with constant-expressions is
initialized before its block is first entered. An implementation is
permitted to perform early initialization of other local objects
with static storage duration under the same conditions that an
implementation is permitted to statically initialize an object with
static storage duration in namespace scope (basic.start.init).
Otherwise such an object is initialized the first time control passes
through its declaration; such an object is considered initialized upon
the completion of its initialization. If the initialization exits by
throwing an exception, the initialization is not complete, so it will
be tried again the next time control enters the declaration. If control re-enters the declaration (recursively) while the object is being initialized, the behavior is undefined.
The standard only forbids recursive initialization of the same static object; it doesn't forbid the initialization of one static object to require another static object to be initialized. Since the standard explicitly states that all static objects that don't fall in this forbidden category must be initialized when the block containing them is first executed, the case you asked about is allowed.
int getInt1();
int getInt2() { //This could be a constructor, too, and nothing would change
static int result = getInt1();
return result;
}
int getInt3() {
static int result = getInt2(); //Allowed!
return result;
}
This also applies to the case when the constructor of a function-local static object itself contains such a static object. A constructor is really just a function too, which means this case is identical to the example above.
See also: https://manishearth.github.io/blog/2015/06/26/adventures-in-systems-programming-c-plus-plus-local-statics/
The initialization of every static local variable has to be atomic. If every single one of them has its own mutex or double-checked locking, then that will be true.
There could also be a single global recursive mutex that allows one thread and one thread only to be initializing static local variables at a time. That works too. But if you have many static local variables and multiple threads accessing them for the first time then that could be horribly slow.
But let's consider your case of a static local variable having a static local variable:
int foo();
class A {
public:
A() { static int x = foo(); } // static local inside A's constructor
};
void bla() {
static A a;
}
Initializing a requires initializing x. But nothing says there can't be some other thread that also has an A c; and will be initializing x at the same time. So x still needs to be protected even though in the case of bla() it is inside an already static initialization.
Another example (hope that compiles, haven't checked):
int bla();
void foo() {
static auto fn = []() {
static int x = bla();
return x;
}(); // the lambda is invoked immediately, as part of initializing fn
}
Here x can only ever be initialized when fn is initialized. So the compiler could possibly skip protecting x. That would be an optimization that follows the as-if rule. Apart from timing there is no difference whether x is protected or not. On the other hand, locking for x would always succeed and the cost of that is very small. Compilers might not optimize it because nobody invested the time to detect and optimize such cases.

Why is a static variable allocated when the program starts but initialized later?

I'm new here. I read "Static variables in a function are initialized before the function is called for the first time", but I still don't understand why the constructor isn't called before the function starts.
#include <iostream>
using namespace std;
class Base
{
public:
Base();
~Base();
private:
};
Base::Base()
{
cout << "I'm Base" << endl;
}
Base::~Base()
{
}
int main()
{
cout << "start program!" << endl;
static Base b;
return 0;
}
I also looked at "When are static function variables allocated?"; I think my case is almost the same.
Any help will be appreciated ^_^
Not quite. Static variables are initialised the first time they are encountered, which of course is not necessarily at the start of a function.
Objects with static storage duration have two phases of initialisation: Static phase and dynamic phase. Some static variables don't have dynamic initialisation at all. Those objects that do have dynamic initialisation are initially statically zero-initialised.
The static phase of initialisation happens when the program starts, before anything else. Thus, memory must also have been allocated before anything else.
The dynamic phase of initialisation cannot be instantaneous. Dynamic initialisation may have dependencies on initialisation of other static objects. Some objects are necessarily initialised before other objects. This is why dynamic initialisation happens after allocation.
For namespace scope variables with static storage, their dynamic initialisation happens either before main, or it may be deferred later in which case it happens before anything from that same translation unit is accessed or called (in practice, deferral happens when dynamic loading is involved).
For static local variables...
Static variables in a function are initialized before the function is called for the first time
Not exactly. Their dynamic initialisation always happens exactly when execution reaches them for the first time. That is always after the function is called; not before. For example:
void foo(bool bar)
{
if (bar) {
static T var;
}
}
var will not be initialised even when the function is called, if the provided argument is false.
The order of dynamic initialisation across translation units is unspecified. This would otherwise make it impossible to safely rely on initialisation of objects with static storage from other translation units, but the "initialisation on first use" behaviour of static local variables is a feature that allows exact control over the order of their initialisation, making it possible to rely on their initialisation even across translation unit boundaries.
I think, based on your comments, that the detail you are missing is that there are both global and local static variables.
Global static variables are initialized when the program is first loaded. Edit: Apparently this is not required behavior (though it is the most common) - initialization is allowed to be delayed. See comment by @walnut.
Example:
int main()
{
std::cout << "start program!" << std::endl;
return 0;
}
static Base b;
Output:
I'm Base
start program!
Static variables in functions, by contrast, are initialized the first time control passes over them. Example:
int main()
{
std::cout << "start program!" << std::endl;
static Base b;
return 0;
}
Output:
start program!
I'm Base
I have added a second example, as per @EvilTeach's comment, to show how the static is only initialized a single time despite multiple function calls. I also added a guard (from @eerorika's answer) to show how initialization only happens when execution actually reaches the variable.
void testFunc(bool test)
{
std::cout << "testFunc called with: " << test << std::endl;
if (test)
static Base b;
}
int main()
{
testFunc(false);
testFunc(true);
testFunc(true);
return 0;
}
Output:
testFunc called with: 0
testFunc called with: 1
I'm Base
testFunc called with: 1

C++ - Non-local static object vs local static object

Regarding the book "Effective C++" by Scott Meyers, Item 4: non-local static objects can be uninitialized before they are used (static in this case means "global", with static lifetime). If you replace one with a local static object, created inside a function that returns a reference to it, the object is guaranteed to be initialized before use.
I always have a file with constants. I declare extern const int a; in an .hpp file and define it in the .cpp file. Can the same thing happen there, i.e. can a be uninitialized when it is used? Does the same rule apply to built-in types?
Even though you can, it's not such a good idea to return references to "local-static" variables. The variable was (presumably) declared locally to reduce its scope to just the enclosing function, so attempting to increase its scope in this manner is rather hacky. You could make it a global variable and use something like std::call_once to guarantee it's initialized exactly once on first usage. Returning a mutable reference to a local-static object also raises thread-safety concerns because the function may no longer be re-entrant.
POD types with static storage duration are guaranteed to be zero-initialized. You can also initialize them with a constant expression and the language will guarantee they are initialized before any dynamic initialization takes place. Here's a similar question that may provide some additional insight.
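As a rough single-file sketch of that guarantee (the names readA and earlyCopy are hypothetical, and a real project would split this across a header and a .cpp file): a constant with a constant-expression initializer is already set when any dynamic initializer runs, even one that appears earlier in the source.

```cpp
extern const int a;       // as it would appear in the .hpp file

int readA();              // defined further down

// Dynamically initialized, and deliberately placed before the definition of a.
int earlyCopy = readA();

// Constant-expression initializer: a is statically initialized, which
// happens before any dynamic initialization takes place.
const int a = 42;

int readA() { return a; }
```

Here earlyCopy ends up as 42 rather than 0. If the initializer of a were a call to a non-constexpr function instead, a would be dynamically initialized and the order problem described in the question could appear. In C++20, marking such a variable constinit asks the compiler to reject the code unless the initialization is static.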
The problem regarding static initialization is known as static initialization order fiasco:
In short, suppose you have two static objects x and y which exist in
separate source files, say x.cpp and y.cpp. Suppose further that the
initialization for the y object (typically the y object’s constructor)
calls some method on the x object.
So if another translation unit uses your constants during its own dynamic initialization, there is a real chance that your program will not work. Sometimes the outcome depends on the order in which the files were linked together; some platforms even define it in their documentation (I think Solaris is one example here).
The problem also applies to builtin types such as int. The example from the FAQ is:
#include <iostream>
int f(); // forward declaration
int g(); // forward declaration
int x = f();
int y = g();
int f()
{
std::cout << "using 'y' (which is " << y << ")\n";
return 3*y + 7;
}
int g()
{
std::cout << "initializing 'y'\n";
return 5;
}
int main() {
std::cout << x << std::endl << y << std::endl;
return 0;
}
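If you run this example, the first two lines of output are:
using 'y' (which is 0)
initializing 'y'
followed by the values of x and y (7 and 5) printed from main.
So y is first zero-initialized, and its dynamic initialization (the call to g()) only happens afterwards, once f() has already read the zero.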
The solution is the Construct On First Use Idiom:
The basic idea of the Construct On First Use Idiom is to wrap your
static object inside a function.
Static local objects are constructed the first time control flow reaches their declaration.
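Applied to the FAQ example, the idiom might look like the following sketch. Turning y into an accessor function is an illustrative choice made here, not code taken from the FAQ.

```cpp
int g()
{
    return 5;
}

// Construct On First Use: the object lives inside a function and is
// initialized the first time the function is called.
int& y()
{
    static int value = g();
    return value;
}

int f()
{
    // y() forces the initialization before the value is read.
    return 3 * y() + 7;
}

int x = f(); // now 3 * 5 + 7, regardless of where y would have been defined
```

With this change x is initialized to 22 instead of 7, because f() can no longer observe the value before it has been initialized.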

What is the lifetime of class static variables in C++?

If I have a class called Test ::
class Test
{
static std::vector<int> staticVector;
};
when does staticVector get constructed and when does it get destructed ?
Is it with the instantiation of the first object of Test class, or just like regular static variables ?
Just to clarify, this question came to my mind after reading Concepts of Programming Languages (Sebesta Ch-5.4.3.1) and it says ::
Note that when the static modifier
appears in the declaration of a
variable in a class definition in C++,
Java and C#, it has nothing to do with
the lifetime of the variable. In that
context, it means the variable is a
class variable, rather than an
instance variable. The multiple use
of a reserved word can be confusing
particularly to those learning the
language.
did you understand? :(
I want to write some text about initialization too, which I can link to later.
First the list of possibilities.
Namespace Static
Class Static
Local Static
Namespace Static
There are two initialization methods: static initialization (intended to happen at compile time) and dynamic initialization (intended to happen at runtime).
Static initialization happens before any dynamic initialization, regardless of translation unit relations.
Dynamic initialization is ordered within a translation unit, while static initialization has no particular order. Objects of namespace scope in the same translation unit are dynamically initialized in the order in which their definitions appear.
POD type objects that are initialized with constant expressions are statically initialized. Their value can be relied on by any object's dynamic initialization, regardless of translation unit relations.
If the initialization throws an exception, std::terminate is called.
Examples:
The following program prints A(1) A(2)
struct A {
A(int n) { std::printf(" A(%d) ", n); }
};
A a(1);
A b(2);
And the following, based on the same class, prints A(2) A(1)
extern A a;
A b(2);
A a(1);
Let's pretend there is a translation unit where msg is defined as the following
char const *msg = "abc";
Then the following prints abc. Note that p receives dynamic initialization. But because the static initialization (char const* is a POD type, and "abc" is an address constant expression) of msg happens before that, this is fine, and msg is guaranteed to be correctly initialized.
extern const char *msg;
struct P { P() { std::printf("%s", msg); } };
P p;
Dynamic initialization of an object is not required to happen before main at all costs. The initialization must happen before the first use of an object or function of its translation unit, though. This is important for dynamic loadable libraries.
Class Static
Behave like namespace statics.
There is a bug-report on whether the compiler is allowed to initialize class statics on the first use of a function or object of its translation unit too (after main). The wording in the Standard currently only allows this for namespace scope objects - but it seems it intends to allow this for class scope objects too. Read Objects of Namespace Scope.
For class statics that are members of templates, the rule is that they are only initialized if they are ever used. Not using them will not result in an initialization. Note that in any case, initialization will happen as explained above. Initialization will not be delayed because it's a member of a template.
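A small sketch of the template rule, using made-up names (Holder, recordInit, useIntHolder): only the static member that is actually used gets instantiated and initialized.

```cpp
int initCount = 0;

// Records that an initializer ran and returns the running count.
int recordInit()
{
    return ++initCount;
}

template <typename T>
struct Holder
{
    static int value;
};

template <typename T>
int Holder<T>::value = recordInit();

// Using Holder<int>::value causes its definition to be instantiated,
// so it is dynamically initialized.
int useIntHolder()
{
    return Holder<int>::value;
}

// Holder<double> is instantiated as a class, but Holder<double>::value is
// never used, so its initializer never runs.
Holder<double> unusedHolder;
```

After startup, initCount is 1: the initializer ran for Holder<int>::value only.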
Local Static
For local statics, special rules apply.
POD type objects initialized with a constant expression are initialized before the block in which they are defined is entered.
Other local static objects are initialized at the first time control passes through their definition. Initialization is not considered to be complete when an exception is thrown. The initialization will be tried again the next time.
Example: The following program prints 0 1:
struct C {
C(int n) {
if(n == 0)
throw n;
this->n = n;
}
int n;
};
int f(int n) {
static C c(n);
return c.n;
}
int main() {
try {
f(0);
} catch(int n) {
std::cout << n << " ";
}
f(1); // initializes successfully
std::cout << f(2);
}
In all of the above cases, the compiler may, in certain limited situations, statically initialize an object that is not required to be statically initialized, instead of initializing it dynamically. This is a tricky issue; see this answer for a more detailed example.
Also note that the order of destruction is the exact reverse of the order in which construction of the objects completed. This is common and happens in all sorts of situations in C++, including the destruction of temporaries.
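Destruction runs in the reverse order of completed construction. That is easiest to observe with automatic objects, so the sketch below uses those; objects with static storage duration follow the same rule at program exit. Tracer, events and runScope are hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Log of constructor and destructor calls.
std::vector<std::string> events;

struct Tracer
{
    std::string name;

    explicit Tracer(std::string n) : name(std::move(n))
    {
        events.push_back("ctor " + name);
    }

    ~Tracer()
    {
        events.push_back("dtor " + name);
    }
};

// Constructs two objects in a nested scope and returns the recorded events.
std::vector<std::string> runScope()
{
    events.clear();
    {
        Tracer first("first");
        Tracer second("second");
    } // second is destroyed before first
    return events;
}
```

The returned log lists both constructors in declaration order and then the destructors in the opposite order.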
Exactly like regular static (global) variables.
It gets constructed at the same time the global variables get constructed and destructed along with the globals as well.
Simply speaking:
A static member variable is constructed when the global variables are constructed. The construction order of global variables across translation units is not defined, but it happens before the main function is entered.
Destruction happens when global variables are destroyed.
Global variables are destroyed in the reversed order they were constructed; after exiting the main-function.
Regards,
Ovanes
P.S.: I suggest to take a look at C++-Standard, which explains (defines) how and when global or static member variables are constructed or destructed.
P.P.S.: Your code only declares a static member variable, but does not define it. To define (and initialize) it, you must write the following in exactly one of the compilation units:
std::vector<int> Test::staticVector;
or
std::vector<int> Test::staticVector=std::vector<int>(/* ctor params here */);
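Put together, a complete version could look like this sketch. The public size() accessor and the constructor arguments are added purely for illustration.

```cpp
#include <cstddef>
#include <vector>

class Test
{
public:
    // Hypothetical accessor so the private member can be inspected.
    static std::size_t size() { return staticVector.size(); }

private:
    static std::vector<int> staticVector; // declaration only
};

// Definition: exactly one translation unit must contain this line.
// It is constructed together with the other namespace-scope objects.
std::vector<int> Test::staticVector(3, 7); // three elements, each 7
```

No Test object needs to exist for staticVector to be constructed; Test::size() can be called right at the start of main.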
Some specific VC++ information in case that's what you're using:
Construction of static class variables occurs at the same time as that of other static/global variables.
On Windows, the CRT startup function is responsible for this construction.
This is the actual entry point of most programs you compile (it is the function which calls your main/WinMain function).
In addition, it is responsible for initializing the entire C runtime support (for example, you need it to use malloc).
The order of construction is undefined; however, when using the Microsoft VC compiler the order of construction for basic types will be OK. For example, it is legal and safe to write
statics.h:
... MyClass declaration ...
static const int a;
static int b;
static int ar[];
};
statics.cpp:
const int MyClass::a = 2;
int MyClass::b = a+3;
int MyClass::ar[a] = {1,2};

When do function-level static variables get allocated/initialized?

I'm quite confident that globally declared variables get allocated (and initialized, if applicable) at program start time.
int globalgarbage;
unsigned int anumber = 42;
But what about static ones defined within a function?
void doSomething()
{
static bool globalish = true;
// ...
}
When is the space for globalish allocated? I'm guessing when the program starts. But does it get initialized then too? Or is it initialized when doSomething() is first called?
I was curious about this so I wrote the following test program and compiled it with g++ version 4.1.2.
#include <iostream>
#include <string>
using namespace std;
class test
{
public:
test(const char *name)
: _name(name)
{
cout << _name << " created" << endl;
}
~test()
{
cout << _name << " destroyed" << endl;
}
string _name;
};
test t("global variable");
void f()
{
static test t("static variable");
test t2("Local variable");
cout << "Function executed" << endl;
}
int main()
{
test t("local to main");
cout << "Program start" << endl;
f();
cout << "Program end" << endl;
return 0;
}
The results were not what I expected. The constructor for the static object was not called until the first time the function was called. Here is the output:
global variable created
local to main created
Program start
static variable created
Local variable created
Function executed
Local variable destroyed
Program end
local to main destroyed
static variable destroyed
global variable destroyed
Some relevant verbiage from C++ Standard:
3.6.2 Initialization of non-local objects [basic.start.init]
1
The storage for objects with static storage
duration (basic.stc.static) shall be zero-initialized (dcl.init)
before any other initialization takes place. Objects of
POD types (basic.types) with static storage duration
initialized with constant expressions (expr.const) shall be
initialized before any dynamic initialization takes place.
Objects of namespace scope with static storage duration defined in
the same translation unit and dynamically initialized shall be
initialized in the order in which their definition appears in
the translation unit. [Note: dcl.init.aggr describes the
order in which aggregate members are initialized. The
initialization of local static objects is described in stmt.dcl. ]
[more text below adding more liberties for compiler writers]
6.7 Declaration statement [stmt.dcl]
...
4
The zero-initialization (dcl.init) of all local objects with
static storage duration (basic.stc.static) is performed before
any other initialization takes place. A local object of
POD type (basic.types) with static storage duration
initialized with constant-expressions is initialized before its
block is first entered. An implementation is permitted to perform
early initialization of other local objects with static storage
duration under the same conditions that an implementation is
permitted to statically initialize an object with static storage
duration in namespace scope (basic.start.init). Otherwise such
an object is initialized the first time control passes through its
declaration; such an object is considered initialized upon the
completion of its initialization. If the initialization exits by
throwing an exception, the initialization is not complete, so it will
be tried again the next time control enters the declaration. If control re-enters the declaration (recursively) while the object is being
initialized, the behavior is undefined. [Example:
int foo(int i)
{
static int s = foo(2*i); // recursive call - undefined
return i+1;
}
--end example]
5
The destructor for a local object with static storage duration will
be executed if and only if the variable was constructed.
[Note: basic.start.term describes the order in which local
objects with static storage duration are destroyed. ]
The memory for all static variables is allocated at program load. But local static variables are created and initialized the first time they are used, not at program start up. There's some good reading about that, and statics in general, here. In general I think some of these issues depend on the implementation, especially if you want to know where in memory this stuff will be located.
The compiler will allocate static variable(s) defined in a function foo at program load, however the compiler will also add some additional instructions (machine code) to your function foo so that the first time it is invoked this additional code will initialize the static variable (e.g. invoking the constructor, if applicable).
@Adam: This behind-the-scenes injection of code by the compiler is the reason for the result you saw.
I tested the code from Adam Pierce again and added two more cases: a static variable in a class and a POD type. My compiler is g++ 4.8.1 on Windows (MinGW-32).
The result is that a static variable in a class is treated the same as a global variable: its constructor is called before main is entered.
Conclusion (for g++ on Windows):
Global variables and static class members: the constructor is called before main is entered (1).
Local static variables: the constructor is only called when execution reaches the declaration for the first time.
If a local static variable is of POD type and has a constant initializer, it is also initialized before main is entered (1).
Example for POD type: static int number = 10;
(1): More precisely: "before any function from the same translation unit is called". In a simple program like the example below, that means before main.
#include <iostream>
#include <string>
using namespace std;
class test
{
public:
test(const char *name)
: _name(name)
{
cout << _name << " created" << endl;
}
~test()
{
cout << _name << " destroyed" << endl;
}
string _name;
static test t; // static member
};
test test::t("static in class");
test t("global variable");
void f()
{
static test t("static variable");
static int num = 10 ; // POD type, init before enter main function
test t2("Local variable");
cout << "Function executed" << endl;
}
int main()
{
test t("local to main");
cout << "Program start" << endl;
f();
cout << "Program end" << endl;
return 0;
}
result:
static in class created
global variable created
local to main created
Program start
static variable created
Local variable created
Function executed
Local variable destroyed
Program end
local to main destroyed
static variable destroyed
global variable destroyed
static in class destroyed
Has anybody tested this on Linux?
Or is it initialized when doSomething() is first called?
Yes, it is. This, among other things, lets you initialize globally-accessed data structures when it is appropriate, for example inside try/catch blocks. E.g. instead of
int foo = init(); // bad if init() throws something
int main() {
try {
...
}
catch(...){
...
}
}
you can write
int& foo() {
static int myfoo = init();
return myfoo;
}
and use it inside the try/catch block. On the first call, the variable will be initialized. Then, on the first and next calls, its value will be returned (by reference).
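A minimal sketch of that pattern follows. The init function below is a hypothetical stand-in that fails on its first attempt, and tryFoo is a made-up wrapper; it shows that a throwing initializer leaves the static uninitialized, so the next call tries again.

```cpp
#include <stdexcept>

int attempts = 0;

// Hypothetical initializer: fails the first time, succeeds afterwards.
int init()
{
    if (++attempts == 1)
        throw std::runtime_error("resource not ready");
    return 42;
}

int& foo()
{
    static int myfoo = init(); // retried on a later call if init() threw
    return myfoo;
}

// Returns the value of foo(), or -1 if its initialization failed this time.
int tryFoo()
{
    try {
        return foo();
    } catch (const std::runtime_error&) {
        return -1;
    }
}
```

The first call to tryFoo() returns -1, the second returns 42, and from then on init() is not called again.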
Static variables are allocated inside a data segment -- they are part of the executable image, and so those with constant initializers are mapped in already initialized.
Static variables within function scope are stored the same way; the scoping is purely a language-level construct. (Function-local statics that need dynamic initialization are still constructed the first time control reaches them.)
For this reason you are guaranteed that a static variable will be initialized to 0 (unless you specify something else) rather than an undefined value.
There are some other facets to initialization you can take advantage of -- for example, shared segments allow different instances of your executable running at once to access the same static variables.
In C++ (globally scoped) static objects have their constructors called as part of the program start up, under the control of the C runtime library. Under Visual C++ at least the order that objects are initialized in can be controlled by the init_seg pragma.
The following code prints Initial = 4, the value of static_x, before func has ever been called, because that value is already put in place at compile time.
#include <stdio.h>
#include <stdint.h>
int func(int x)
{
static int static_x = 4;
static_x = x;
printf ("Address = %p\n", (void *)&static_x ); // printed 0x40a010 in the author's build
return static_x;
}
int main()
{
int x = 8;
uint32_t *ptr = (uint32_t *)(0x40a010); // static_x location in that particular build; not portable
printf ("Initial = %d\n", (int)*ptr);
func(x);
return 0;
}