I am confused by the description of thread_local in C++11. My understanding is that each thread has its own copy of the local variables in a function. Global/static variables can be accessed by all threads (possibly with access synchronized using locks). And thread_local variables are visible to all threads but can only be modified by the thread for which they are defined? Is that correct?
Thread-local storage duration is a term used to refer to data that is seemingly global or static storage duration (from the viewpoint of the functions using it) but, in actual fact, there is one copy per thread.
It adds to the current options:
automatic (exists during a block or function);
static (exists for the program duration); and
dynamic (exists on the heap between allocation and deallocation).
Something that is thread-local is brought into existence at thread creation time and disposed of when the thread finishes.
For example, think of a random number generator where the seed must be maintained on a per-thread basis. Using a thread-local seed means that each thread gets its own random number sequence, independent of all other threads.
If your seed was a local variable within the random function, it would be initialised every time you called it, giving you the same number each time. If it was a global, threads would interfere with each other's sequences.
Another example is something like strtok where the tokenisation state is stored on a thread-specific basis. That way, a single thread can be sure that other threads won't screw up its tokenisation efforts, while still being able to maintain state over multiple calls to strtok - this basically renders strtok_r (the thread-safe version) redundant.
Yet another example would be something like errno. You don't want separate threads modifying errno after one of your calls fails, but before you've had a chance to check the result.
This site has a reasonable description of the different storage duration specifiers.
When you declare a variable thread_local then each thread has its own copy. When you refer to it by name, then the copy associated with the current thread is used. e.g.
#include <iostream>
#include <thread>

thread_local int i=0;

void f(int newval){
    i=newval;
}

void g(){
    std::cout<<i;
}

void threadfunc(int id){
    f(id);
    ++i;
    g();
}

int main(){
    i=9;
    std::thread t1(threadfunc,1);
    std::thread t2(threadfunc,2);
    std::thread t3(threadfunc,3);

    t1.join();
    t2.join();
    t3.join();
    std::cout<<i<<std::endl;
}
This code will output "2349", "3249", "4239", "4329", "2439" or "3429", but never anything else. Each thread has its own copy of i, which is assigned to, incremented and then printed. The thread running main also has its own copy, which is assigned to at the beginning and then left unchanged. These copies are entirely independent, and each has a different address.
It is only the name that is special in that respect --- if you take the address of a thread_local variable then you just have a normal pointer to a normal object, which you can freely pass between threads. e.g.
#include <iostream>
#include <thread>

thread_local int i=0;

void thread_func(int*p){
    *p=42;
}

int main(){
    i=9;
    std::thread t(thread_func,&i);
    t.join();
    std::cout<<i<<std::endl;
}
Since the address of i is passed to the thread function, then the copy of i belonging to the main thread can be assigned to even though it is thread_local. This program will thus output "42". If you do this, then you need to take care that *p is not accessed after the thread it belongs to has exited, otherwise you get a dangling pointer and undefined behaviour just like any other case where the pointed-to object is destroyed.
thread_local variables are initialized "before first use", so if they are never touched by a given thread then they are not necessarily ever initialized. This is to allow compilers to avoid constructing every thread_local variable in the program for a thread that is entirely self-contained and doesn't touch any of them. e.g.
#include <iostream>
#include <thread>

struct my_class{
    my_class(){
        std::cout<<"hello";
    }
    ~my_class(){
        std::cout<<"goodbye";
    }
};

void f(){
    thread_local my_class unused;
}

void do_nothing(){}

int main(){
    std::thread t1(do_nothing);
    t1.join();
}
In this program there are 2 threads: the main thread and the manually-created thread. Neither thread calls f, so the thread_local object is never used. It is therefore unspecified whether the compiler will construct 0, 1 or 2 instances of my_class, and the output may be "", "hellohellogoodbyegoodbye" or "hellogoodbye".
Thread-local storage is in every aspect like static (= global) storage, except that each thread has a separate copy of the object. The object's lifetime starts either at thread start or, at the latest, before its first use in that thread (for global variables), or at first initialization (for block-local statics), and ends when the thread ends, i.e. when the thread function returns. Note that join() does not end the thread; it only waits until the thread has finished.
Consequently, only variables that could also be declared static may be declared as thread_local, i.e. global variables (more precisely: variables "at namespace scope"), static class members, and block-static variables (in which case static is implied).
As an example, suppose you have a thread pool and want to know how well your work load was being balanced:
#include <thread>

// Counter is shown below; in a real file its definition must come first.
thread_local Counter c;

void do_work()
{
    c.increment();
    // ...
}

int main()
{
    std::thread t(do_work);   // your thread-pool would go here
    t.join();
}
This would print thread usage statistics, e.g. with an implementation like this:
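#include <iostream>
#include <thread>

struct Counter
{
    unsigned int c = 0;
    void increment() { ++c; }
    ~Counter()
    {
        // Runs at thread exit, once for each thread that used its Counter.
        std::cout << "Thread #" << std::this_thread::get_id() << " was called "
                  << c << " times" << std::endl;
    }
};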
If a variable is declared as static in a function's scope it is only initialized once and retains its value between function calls. What exactly is its lifetime? When do its constructor and destructor get called?
void foo()
{
static string plonk = "When will I die?";
}
The lifetime of function static variables begins the first time[0] the program flow encounters the declaration and ends at program termination. This means that the run-time must perform some bookkeeping in order to destruct the variable only if it was actually constructed.
Additionally, since the standard says that the destructors of static objects must run in the reverse order of the completion of their construction[1], and the order of construction may depend on the specific program run, the order of construction must be taken into account.
Example
#include <iostream>
#include <string>
using namespace std;

struct emitter {
    string str;
    emitter(const string& s) : str(s) { cout << "Created " << str << endl; }
    ~emitter() { cout << "Destroyed " << str << endl; }
};

void foo(bool skip_first)
{
    if (!skip_first)
        static emitter a("in if");
    static emitter b("in foo");
}

int main(int argc, char*[])
{
    foo(argc != 2);
    if (argc == 3)
        foo(false);
}
Output:
C:\>sample.exe
Created in foo
Destroyed in foo

C:\>sample.exe 1
Created in if
Created in foo
Destroyed in foo
Destroyed in if

C:\>sample.exe 1 2
Created in foo
Created in if
Destroyed in if
Destroyed in foo
[0] Since C++98[2] makes no reference to multiple threads, how this behaves in a multi-threaded environment is unspecified, and it can be problematic, as Roddy mentions.
[1] C++98 section 3.6.3.1 [basic.start.term]
[2] In C++11 statics are initialized in a thread safe way, this is also known as Magic Statics.
Motti is right about the order, but there are some other things to consider:
Compilers typically use a hidden flag variable to indicate whether the local statics have already been initialized, and this flag is checked on every entry to the function. Obviously this is a small performance hit, but what's more of a concern is that, before C++11, this flag is not guaranteed to be thread-safe.
If you have a local static as above, and foo is called from multiple threads, you may have race conditions causing plonk to be initialized incorrectly or even multiple times. Also, in this case plonk may get destructed by a different thread than the one which constructed it.
Despite what the standard says, I'd be very wary of the actual order of local static destruction, because it's possible that you may unwittingly rely on a static being still valid after it's been destructed, and this is really difficult to track down.
The existing explanations aren't really complete without the actual rule from the Standard, found in 6.7:
The zero-initialization of all block-scope variables with static storage duration or thread storage duration is performed before any other initialization takes place. Constant initialization of a block-scope entity with static storage duration, if applicable, is performed before its block is first entered. An implementation is permitted to perform early initialization of other block-scope variables with static or thread storage duration under the same conditions that an implementation is permitted to statically initialize a variable with static or thread storage duration in namespace scope. Otherwise such a variable is initialized the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. If the initialization exits by throwing an exception, the initialization is not complete, so it will be tried again the next time control enters the declaration. If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization. If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.
FWIW, Codegear C++Builder doesn't destruct in the expected order according to the standard.
C:\> sample.exe 1 2
Created in foo
Created in if
Destroyed in foo
Destroyed in if
... which is another reason not to rely on the destruction order!
Storage for static variables exists from the start of program execution and remains available until the program ends.
Static variables are typically placed in the data segment of memory.
If you have shared variables between a std::thread and the main thread (or any other thread for that matter), can you still access those shared variables even if you execute the thread::detach() method immediately after creating the thread?
Yes! Global, captured and passed-in variables are still accessible after calling detach().
However, if you are calling detach, it is likely that you want to return from the function that created the thread, allowing the thread object to go out of scope. If that is the case, you will have to take care that none of the locals of that function were passed to the thread either by reference or through a pointer.
You can think of detach() as a declaration that the thread does not need anything local to the creating thread.
In the following example, a thread keeps accessing an int on the stack of the starting thread after it has gone out of scope. This is undefined behaviour!
#include <iostream>
#include <thread>

void start_thread()
{
    int someInt = 5;
    std::thread t([&]() {
        while (true)
        {
            // Will print someInt (5) repeatedly until we return. Then,
            // undefined behavior!
            std::cout << someInt << std::endl;
        }
    });

    t.detach();
}
Here are some possible ways to keep the rug from being swept out from under your thread:
Declare the int somewhere that will not go out of scope during the lifetime of any threads that need it (perhaps a global).
Declare shared data as a std::shared_ptr and pass that by value into the thread.
Pass by value (performing a copy) into the thread.
Pass by rvalue reference (performing a move) into the thread.
Yes. Detaching a thread just means that it cleans up after itself when it is finished and you no longer need to, nor are you allowed to, join it.