Given the following example:
struct test
{
const char* data;
const int number;
};
struct test* foo()
{
static struct test t = {
"this is some data",
69
};
return &t;
}
Is the call to foo thread-safe? In other words, is the structure initialized only once, in a thread-safe manner? Does it make a difference whether this is compiled as C or as C++?
The answer differs between C and C++ before C++11 on the one hand and C++11 or later on the other (earlier standards had no provisions for threading).
As you can see here: C++ Static local variables, since C++11 the standard guarantees that a static local variable is initialized only once. There is a specific note about the locking that ensures a single initialization in a multithreaded environment:
If multiple threads attempt to initialize the same static local
variable concurrently, the initialization occurs exactly once (similar
behavior can be obtained for arbitrary functions with std::call_once).
Note: usual implementations of this feature use variants of the
double-checked locking pattern, which reduces runtime overhead for
already-initialized local statics to a single non-atomic boolean
comparison.
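The parenthetical about std::call_once in the quote above can be illustrated with a small sketch. The names init_flag, init_count, init_once and run_threads are hypothetical and exist only for this illustration; the point is just that the callable passed to std::call_once runs once, however many threads reach it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::once_flag init_flag;         // hypothetical flag guarding the one-time setup
std::atomic<int> init_count{0};   // counts how often the setup body actually ran

void init_once() {
    // Only one caller runs the lambda; concurrent callers block until it returns.
    std::call_once(init_flag, [] { ++init_count; });
}

// Calls init_once() from n threads and returns how many times the setup ran so far.
int run_threads(int n) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(init_once);
    for (auto& t : threads) t.join();
    return init_count.load();
}
```

Calling run_threads a second time still reports a single run, because the flag remembers that the setup has already completed.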
The rules in C are specified here: C Storage duration:
static storage duration. The storage duration is the entire execution
of the program, and the value stored in the object is initialized only
once, prior to main function. All objects declared static and all
objects with either internal or external linkage that aren't declared
_Thread_local (since C11) have this storage duration.
Since C++11, the object is initialized by exactly one thread; other threads wait until the initialization completes. See this thread for reference.
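As a rough way to observe the C++11 guarantee, the sketch below races several threads into a function with a static local whose constructor counts how often it runs. Expensive, constructions, instance and construct_from_threads are made-up names for this illustration. (The struct in the question uses only constant initializers, so it needs no run-time initialization at all; the sketch uses a constructor to make the run-time case visible.)

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};   // hypothetical counter, for illustration only

struct Expensive {
    Expensive() { ++constructions; }  // records every construction
};

Expensive& instance() {
    static Expensive e;               // C++11 and later: initialized exactly once
    return e;
}

// Races n threads into instance() and reports how many constructions happened.
int construct_from_threads(int n) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back([] { instance(); });
    for (auto& t : threads) t.join();
    return constructions.load();
}
```

With a conforming C++11 compiler the counter ends at 1 no matter how many threads are started.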
Related
I was recently using an object whose purpose is to allocate and deallocate memory as a singleton. Something like
class MyValue
{
// ...
static Allocator& GetAllocator()
{
static Allocator allocator;
return allocator;
}
// ...
};
I realized later Allocator is not thread-safe: when multiple threads were using the same allocator concurrently, occasionally strange things were happening, causing assertions and segmentation faults.
Solution: use different allocators for different threads:
class MyValue
{
// ...
static Allocator& GetAllocator()
{
thread_local static Allocator allocator;
return allocator;
}
// ...
};
Awesome! My problems are gone! Just one question:
Will my allocator variable be initialized every time a thread is created, even if the majority of threads won't use this variable?
The initialization of the allocator might be a heavy operation, so I would like it to be initialized only when it is actually required, not in every thread.
I read that thread_local variables are allocated by each thread. Does that mean they are also constructed? And does this allocation (or construction) happen systematically for each thread that is created or just for threads that use it?
I faintly remember hearing in a course that most details about threads and thread-local storage are platform dependent. If this is the case, I'm particularly interested in Linux and FreeBSD.
Related (interesting reads, but I could not find the answer there):
Destruction of thread_local objects
What does the thread_local mean in C++11?
What is the performance penalty of C++11 thread_local variables in GCC 4.8?
[basic.stc.thread] states
All variables declared with the thread_local keyword have thread storage duration. The storage for these entities shall last for the duration of the thread in which they are created. There is a distinct object or reference per thread, and use of the declared name refers to the entity associated with the current thread.
A variable with thread storage duration shall be initialized before its first odr-use (6.2) and, if constructed, shall be destroyed on thread exit.
So, you will get storage for the object in each thread. We also have [stmt.dcl]/4 that states
Dynamic initialization of a block-scope variable with static storage duration (6.7.1) or thread storage duration (6.7.2) is performed the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization.
So, if we reach the declaration then the object will be initialized and if it has a constructor, it will be called.
If we put that all together you will have a number of constructor and destructor calls equal to the number of threads that actually reach
thread_local static Allocator allocator;
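The conclusion above can be checked with a small experiment. In this sketch the Allocator is a stand-in with a counting constructor, and allocator_constructions and count_constructions are hypothetical helper names; four threads are started but only two of them ever call GetAllocator().

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> allocator_constructions{0};   // hypothetical counter

struct Allocator {                              // stand-in for the question's Allocator
    Allocator() { ++allocator_constructions; }
};

Allocator& GetAllocator() {
    thread_local static Allocator allocator;    // constructed on first use, per thread
    return allocator;
}

// Starts four threads; only two of them reach the declaration above.
int count_constructions() {
    std::thread a([] { GetAllocator(); });
    std::thread b([] { GetAllocator(); GetAllocator(); });
    std::thread c([] { /* never touches the allocator */ });
    std::thread d([] { /* never touches the allocator */ });
    a.join(); b.join(); c.join(); d.join();
    return allocator_constructions.load();
}
```

The count comes out as 2: one construction per thread that actually used the allocator, and none for the threads that did not.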
You can check this empirically: call GetAllocator() several times and compare the addresses of the returned objects. If the addresses differ, a different object was returned for each call; if they are equal, the same object was returned every time. Note that with thread_local the comparison is only meaningful across threads: repeated calls from one thread always return that thread's single object.
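A minimal sketch of such an address check, assuming an empty stand-in Allocator; allocator_address, same_within_thread and differs_across_threads are hypothetical helpers. The second thread's address is taken while the first thread's object is still alive, so the two objects cannot share storage.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

struct Allocator {};   // empty stand-in for the question's Allocator

Allocator& GetAllocator() {
    thread_local static Allocator allocator;
    return allocator;
}

// Address of the calling thread's allocator, as an integer for easy comparison.
std::uintptr_t allocator_address() {
    return reinterpret_cast<std::uintptr_t>(&GetAllocator());
}

// Two calls from the same thread yield the same object.
bool same_within_thread() {
    return allocator_address() == allocator_address();
}

// A second thread gets its own object while ours is still alive.
bool differs_across_threads() {
    const std::uintptr_t mine = allocator_address();
    std::uintptr_t theirs = 0;
    std::thread worker([&theirs] { theirs = allocator_address(); });
    worker.join();
    return mine != theirs;
}
```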
If I have some variables that I'm initializing statically (before main begins), am I free to use any built-in stuff in these constructors, like <iostream> or <vector>?
The "static initialization order fiasco" occurs because the order in which static variables are initialized (among different translation units) is undefined.
So what if something benign like
std::cout << "Hello" << std::endl;
happens to rely on some static variable inside <iostream> being initialized ahead of time? (I'm not saying it does, but assume it did.) What's to say that these static variables inside built-in libraries are initialized before my own static variables? Like inside say "Person.cpp" or whatever.
Edit: Is std::cout guaranteed to be initialized? was suggested as a duplicate to this question. However, I think my question is slightly broader in scope because it asks about any standard built-in library, rather than just <iostream>.
The C++ standards make no strong statements about the behavior of the program before the start of main, or after main has completed.
In my experience, I have run into a number of issues where the C++ runtime had already destroyed some object (e.g. resources used to manage std::mutex), which caused a deadlock during the destruction of a complex type.
I would recommend the following pattern for statics:
C++ creates objects declared static in a function in the order in which their declarations are first executed. This yields a pattern which ensures that objects exist by the time they are needed.
AnObject * AnObject::getInstance() {
static AnObject a;
return &a;
}
Call this function to get hold of the global; the object is constructed the first time getInstance() is called.
C++ 11 onwards
This code is guaranteed to be thread-safe, where if multiple threads of execution arrive in getInstance, only one will construct the object and the rest will wait.
Pre C++ 11
Before C++11, this pattern trades the ill-defined initialization order for thread-safety issues.
Luckily, it is possible to create a critical-section/mutex primitive in main that can arbitrate.
On some OSs (e.g. Windows, with InitializeCriticalSection), these locks can safely be created before main as static variables.
AnObject * AnObject::getInstance() {
EnterCriticalSection( &aObjectcrit );
static AnObject a;
LeaveCriticalSection( &aObjectcrit );
return &a;
}
This assumes aObjectcrit has been initialized before this function is called.
The result of this pattern is a form of onion construction: objects are created in the order they are needed and, when the program exits, destroyed in the reverse order of their creation.
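The onion-like ordering can be made visible with a short sketch. Tracked, event_log, first and second are hypothetical names used only here; construction is recorded in a log, and destruction is reported with std::printf because the log itself is gone by the time the very last destructors have run.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// The log is itself a function-local static. Every Tracked constructor calls
// event_log() before it completes, so the log is constructed first and
// therefore destroyed last.
std::vector<std::string>& event_log() {
    static std::vector<std::string> entries;
    return entries;
}

struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {
        event_log().push_back("ctor " + name);
    }
    ~Tracked() { std::printf("dtor %s\n", name.c_str()); }
};

// Each object is created the first time its accessor is called.
Tracked& first()  { static Tracked t("first");  return t; }
Tracked& second() { static Tracked t("second"); return t; }
```

If second() is called before first(), the log records "ctor second" followed by "ctor first", and at program exit the destructors run the other way round: first is destroyed before second.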
You're confusing objects (std::cout) and types (std::vector). The former is covered by the linked question, and the latter is a type. Static initialization applies to objects, but not to types.
I have C++ code which declares static-lifetime variables which are initialized by function calls. The called function constructs a vector instance and calls its push_back method. Is the code risking doom via the C++ static initialization order fiasco? If not, why not?
Supplementary information:
What's the "static initialization order fiasco"?
It's explained in C++ FAQ 10.14
Why would I think use of vector could trigger the fiasco?
It's possible that the vector constructor makes use of the value of another static-lifetime variable initialized dynamically. If so, then there is nothing to ensure that vector's variable is initialized before I use vector in my code. Initializing result (see code below) could end up calling the vector constructor before vector's dependencies are fully initialized, leading to access to uninitialized memory.
What does this code look like anyway?
struct QueryEngine {
QueryEngine(const char *query, string *result_ptr)
: query(query), result_ptr(result_ptr) { }
static void AddQuery(const char *query, string *result_ptr) {
if (pending == NULL)
pending = new vector<QueryEngine>;
pending->push_back(QueryEngine(query, result_ptr));
}
const char *query;
string *result_ptr;
static vector<QueryEngine> *pending;
};
vector<QueryEngine> *QueryEngine::pending = NULL;
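string Register(const char *query, string *result_ptr) {
QueryEngine::AddQuery(query, result_ptr);
return string(); // placeholder return value (an assumption) so that the initializer below compiles
}
string result = Register("query", &result);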
Fortunately, static objects are zero-initialised even before any other initialisation is performed (even before the "true" initialisation of the same objects), so you know that the NULL will be set on that pointer long before Register is first invoked.1
Now, in terms of operating on your vector, it appears that (technically) you could run into such a problem:
[C++11: 17.6.5.9/3]: A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.
[C++11: 17.6.5.9/4]: [Note: This means, for example, that implementations can’t use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. —end note]
Notice that, although synchronisation is being required in this note, that's been mentioned within a passage that ultimately acknowledges that static implementation details are otherwise allowed.
That being said, it seems like the standard should further state that user code should avoid operating on standard containers during static initialisation, if the intent were that the semantics of such code could not be guaranteed; I'd consider this a defect in the standard, either way. It should be clearer.
1 And it is a NULL pointer, whatever the bit-wise representation of that may be, rather than a blit to all-zero-bits.
vector doesn't depend on anything preventing its use in dynamic initialisation of statics. The only issue with your code is a lack of thread safety - no particular reason to think you should care about that, unless you have statics whose construction spawns threads....
Initializing result (see code below) could end up calling the vector constructor before that class is fully initialized, leading to access to uninitialized memory.
No... initialising result calls AddQuery which checks if (pending == NULL) - the initialisation to NULL will certainly have been done before any dynamic initialisation, per 3.6.2/2:
Constant initialization is performed:
...
— if an object with static or thread storage duration is not initialized by a constructor call and if either the object is value-initialized or every full-expression that appears in its initializer is a constant expression
So even if the result assignment is in a different translation unit it's safe. See 3.6.2/2:
Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place.
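The ordering guarantee quoted above can be demonstrated with a reduced version of the question's code. Registry, first_size and second_size are hypothetical names for this sketch. The first dynamic initializer textually precedes the definition of the pointer, yet the pointer is already null when it runs, and its constant initialization does not later overwrite the allocated vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Registry {                        // hypothetical stand-in for QueryEngine
    static std::vector<std::string>* pending;
    static std::size_t Add(const char* query) {
        if (pending == nullptr)
            pending = new std::vector<std::string>;  // intentionally never freed
        pending->push_back(query);
        return pending->size();
    }
};

// Dynamic initialization that textually precedes the definition of pending.
std::size_t first_size = Registry::Add("first");

// Constant initialization: performed before any dynamic initialization, so it
// cannot reset the pointer set by the call above.
std::vector<std::string>* Registry::pending = nullptr;

// Another dynamic initialization, after the definition.
std::size_t second_size = Registry::Add("second");
```

Both registrations end up in the same vector, which is what the answer above argues.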
I read several posts on C++ initialization from Google, some of which direct me here on StackOverflow. The concepts I picked from those posts are as follows:
The order of initialization of C++ is:
Zero Initialization;
Static Initialization;
Dynamic Initialization.
Static objects (variables included) are first Zero-initialized, and then Static-initialized.
I have several inquiries as to the initialization issue (storage class issue may be related as well):
Global objects (defined without static keyword) are also static objects, right?
Global objects are also initialized like static objects by two steps like above, right?
What is the Static Initialization? Does it refer to initializing static objects (defined with static keyword)?
I also read that objects defined within block (i.e. in a function) with static keyword is initialized when the execution thread first enters the block! This means that local static objects are not initialized before main function execution. This means they are not initialized as the two steps mentioned above, right?
Dynamic initialization refers to initialization of objects created by new operator, right? It might refer to initialization like myClass obj = myClass(100); or myClass obj = foo();
I have too many inquiries on the initialization and storage class specifier issues. I read the C++2003 Standard document, but cannot find a clear logic since they are scattered throughout the document.
I hope you give me an answer that logically explains the whole map of storage class specifier and initialization. Any reference is welcome!
Code that might explain my question:
class myClass{
public:
int i;
myClass(int j = 10): i(j){}
// other declarations
};
myClass obj1;//global scope
static myClass obj2(2);//file scope
void someFunction() { // local scope (a block like this has to sit inside a function)
myClass obj3(3);
static myClass obj4(4);
}
EDIT:
If you think my question is rather tedious, you can help explain your ideas based on the code above.
I read several posts on C++ initialization from Google, some of which direct me here on StackOverflow. The concepts I picked from those posts are as follows:
The order of initialization of C++ is:
Zero Initialization;
Static Initialization;
Dynamic Initialization.
Yes, indeed there are 3 phases (in the Standard). Let us clarify them before continuing:
Zero Initialization: the memory is filled with 0s at the byte level.
Constant Initialization: a pre-computed (compile-time) byte pattern is copied at the memory location of the object
Static Initialization: Zero Initialization followed by Constant Initialization
Dynamic Initialization: a function is executed to initialize the memory
A simple example:
int const i = 5; // constant initialization
int const j = foo(); // dynamic initialization
Static objects (variables included) are first Zero-initialized, and then Static-initialized.
Yes and no.
The Standard mandates that the objects be first zero-initialized and then they are:
constant initialized if possible
dynamically initialized otherwise (the compiler could not compute the memory content at compile-time)
Note: in the case of constant initialization, the compiler may skip zero-initializing the memory first, under the as-if rule.
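The zero-then-dynamic sequence can be observed from inside a single translation unit. This is only a sketch: runtime_source, read_runtime_source, snapshot_of_late, early and late are hypothetical names, and the volatile read is there as an assumption-laden trick to stop the compiler from computing the value at compile time (which it would otherwise be allowed to do).

```cpp
#include <cassert>

extern int late;                    // defined further down in this file

// Constant-initialized. Reading a volatile cannot be folded at compile time,
// which keeps the initializer of late genuinely dynamic.
volatile int runtime_source = 42;

int read_runtime_source() { return runtime_source; }

int snapshot_of_late() { return late; }

// Dynamic initializers in one translation unit run in definition order.
int early = snapshot_of_late();     // runs first: late is still only zero-initialized
int late  = read_runtime_source();  // runs second: late now receives its real value
```

Here early captures the zero-initialized state of late, while late itself ends up holding the value read at run time.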
I have several inquiries as to the initialization issue (storage class issue may be related as well):
Global objects (defined without static keyword) are also static objects, right?
Yes. At file scope, the static keyword only affects the visibility of the symbol: a global object can be referred to, by name, from another source file, whilst the name of a static object is completely local to the current source file.
The confusion stems from the reuse of the word static in many different situations :(
Global objects are also initialized like static objects by two steps like above, right?
Yes, as are local static objects in fact.
What is the Static Initialization? Does it refer to initializing static objects (defined with static keyword)?
No, as explained above it refers to initializing objects without executing a user-defined function but instead copying a pre-computed byte pattern over the object's memory. Note that in the case of objects that will later be dynamically initialized, this is just zero-ing the memory.
I also read that objects defined within block (i.e. in a function) with static keyword is initialized when the execution thread first enters the block! This means that local static objects are not initialized before main function execution. This means they are not initialized as the two steps mentioned above, right?
They are initialized with the same two-step process, though only the first time execution passes through their definition. So the process is the same, but the timing is subtly different.
In practice though, if their initialization is static (ie, the memory pattern is a compile-time pattern) and their address is not taken they might be optimized away.
Note that in case of dynamic initialization, if their initialization fails (an exception is thrown by the function supposed to initialize them) it will be re-attempted the next time flow-control passes through their definition.
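The retry behaviour after a failed initialization is easy to try out. In this sketch attempts, make_value, get_value and first_call_throws are hypothetical names; the initializer throws on its first run and succeeds on the second.

```cpp
#include <cassert>
#include <stdexcept>

int attempts = 0;   // hypothetical counter of initialization attempts

int make_value() {
    if (++attempts == 1) throw std::runtime_error("first attempt fails");
    return 7;
}

int get_value() {
    static int value = make_value();   // retried on the next call if it throws
    return value;
}

// Returns true if calling get_value() throws.
bool first_call_throws() {
    try {
        get_value();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

After the failed first call, the next call runs the initializer again; once it has succeeded, later calls no longer touch make_value.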
Dynamic initialization refers to initialization of objects created by new operator, right? It might refer to initialization like myClass obj = myClass(100); or myClass obj = foo();
Not at all, it refers to initialization requiring the execution of a user defined function (note: std::string has a user-defined constructor as far as the C++ language is concerned).
EDIT: My thanks to Zach who pointed to me I erroneously called Static Initialization what the C++11 Standard calls Constant Initialization; this error should now be fixed.
I believe there are three different concepts: initializing the variable, the location of the variable in memory, the time the variable is initialized.
First: Initialization
When a variable is allocated in memory, typical processors leave the memory untouched, so the variable will have the same value that somebody else stored earlier. For security, some compilers add the extra code to initialize all variables they allocate to zero. I think this is what you mean by "Zero Initialization". It happens when you say:
int i; // not all compilers set this to zero
However if you say to the compiler:
int i = 10;
then the compiler instructs the processor to put 10 in the memory rather than leaving it with old values or setting it to zero. I think this is what you mean by "Static Initialization".
Finally, you could say this:
int i;
...
...
i = 11;
then the processor "zero initializes" the variable (or leaves the old value) when executing int i; and when it reaches the line i = 11 it "dynamically initializes" the variable to 11 (which can happen long after the first initialization). Strictly speaking, C++ calls i = 11 an assignment rather than an initialization.
Second: Location of the variable
There are: stack-based variables (usually called automatic variables), and memory-heap variables (sometimes called dynamic variables).
Variables can be created in the stack segment using this:
int i;
or the memory heap like this:
int *i = new int;
The difference is that the stack segment variable is lost after exiting the function call, while memory-heap variables are left until you say delete i;. You can read an Assembly-language book to understand the difference better.
Third: The time the variable is initialized
A stack-segment variable is "zero-initialized" or "statically-initialized" when you enter the function call it is defined within.
A memory-heap variable is "zero-initialized" or "statically-initialized" when it is first created by the new operator.
Final Remark
You can think about static int i; as a global variable with a scope limited to the function it is defined in. I think the confusion about static int i; comes from static meaning something different here (the variable is not destroyed when you exit the routine, so it retains its value). I am not sure, but I think the trick used for static int i; is that it is stored in the data segment of the application rather than on the stack, which means it is not destroyed until you exit the whole program (so it retains its first initialization).
I heard that after some version of gcc using simply something like:
static A* a = new A();
return a;
Is thread-safe for a singleton and one wouldn't need something adapted from say http://locklessinc.com/articles/singleton_pattern/ anymore...
Does anyone have a specific reference or link to where I can read about this?
Section 6.7 of draft standard (n3337.pdf), point 4:
The zero-initialization (8.5) of all block-scope variables with static storage duration (3.7.1) or thread storage
duration (3.7.2) is performed before any other initialization takes place. Constant initialization (3.6.2) of a
block-scope entity with static storage duration, if applicable, is performed before its block is first entered.
An implementation is permitted to perform early initialization of other block-scope variables with static or
thread storage duration under the same conditions that an implementation is permitted to statically initialize
a variable with static or thread storage duration in namespace scope (3.6.2). Otherwise such a variable is
initialized the first time control passes through its declaration; such a variable is considered initialized upon
the completion of its initialization. If the initialization exits by throwing an exception, the initialization
is not complete, so it will be tried again the next time control enters the declaration. If control enters
the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for
completion of the initialization. If control re-enters the declaration recursively while the variable is being
initialized, the behavior is undefined.
GCC follows the cross-vendor Itanium C++ ABI. The relevant sections covering thread-safe initialization of function-local statics are 2.8 Initialization guard variables and 3.3.2 One-time Construction API, which says:
An implementation that does not anticipate supporting multi-threading may simply check the first byte (i.e., the byte with lowest address) of that guard variable, initializing if and only if its value is zero, and then setting it to a non-zero value.
However, an implementation intending to support automatically thread-safe, one-time initialization (as opposed to requiring explicit user control for thread safety) may make use of the following API functions:
...
There were some bugs in the early GCC implementation of that API, but I think they are all fixed and it works correctly from GCC version 4.3 onwards (possibly earlier; I don't recall and can't find a reference right now).
However, Singleton is a bad, bad pattern, do not use it!
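To make the guard-variable idea more concrete, here is a hand-written approximation next to the form from the question. This is only a conceptual sketch using double-checked locking with standard library primitives; it does not use the actual ABI functions, and get_instance_manual, a_constructions and all_threads_agree are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::atomic<int> a_constructions{0};   // hypothetical counter

struct A {
    A() { ++a_constructions; }
};

// The form from the question: the compiler emits the guard for us.
A* get_instance() {
    static A* a = new A();             // intentionally never freed
    return a;
}

// Hand-written approximation of a guarded one-time initialization.
A* get_instance_manual() {
    static std::atomic<A*> instance{nullptr};  // constant-initialized, needs no guard
    static std::mutex init_mutex;              // constexpr constructor, needs no guard
    A* p = instance.load(std::memory_order_acquire);   // fast path once initialized
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(init_mutex);
        p = instance.load(std::memory_order_relaxed);  // re-check under the lock
        if (p == nullptr) {
            p = new A();                               // intentionally never freed
            instance.store(p, std::memory_order_release);
        }
    }
    return p;
}

// Calls getter from n threads and checks that every thread saw the same pointer.
bool all_threads_agree(A* (*getter)(), int n) {
    std::vector<A*> seen(static_cast<std::size_t>(n), nullptr);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([&seen, getter, i] { seen[static_cast<std::size_t>(i)] = getter(); });
    for (auto& t : threads) t.join();
    for (A* p : seen)
        if (p != seen[0]) return false;
    return true;
}
```

Both getters hand every thread the same pointer, and each constructs its A exactly once; the static-local form is simply shorter and leaves the locking to the compiler and runtime.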