If you have shared variables between a std::thread and the main thread (or any other thread for that matter), can you still access those shared variables even if you execute the thread::detach() method immediately after creating the thread?
Yes! Global, captured and passed-in variables are still accessible after calling detach().
However, if you are calling detach, it is likely that you want to return from the function that created the thread, allowing the thread object to go out of scope. If that is the case, you will have to take care that none of the locals of that function were passed to the thread either by reference or through a pointer.
You can think of detach() as a declaration that the thread does not need anything local to the creating thread.
In the following example, a thread keeps accessing an int on the stack of the starting thread after it has gone out of scope. This is undefined behaviour!
void start_thread()
{
int someInt = 5;
std::thread t([&]() {
while (true)
{
// Will print someInt (5) repeatedly until we return. Then,
// undefined behavior!
std::cout << someInt << std::endl;
}
});
t.detach();
}
Here are some possible ways to keep the rug from being swept out from under your thread:
Declare the int somewhere that will not go out of scope during the lifetime of any threads that need it (perhaps a global).
Declare shared data as a std::shared_ptr and pass that by value into the thread.
Pass by value (performing a copy) into the thread.
Pass by rvalue reference (performing a move) into the thread.
Yes. Detaching a thread just means that it cleans up after itself when it is finished and you no longer need to, nor are you allowed to, join it.
Related
I have always assumed lambda were just function pointers, but I've never thought to use capture statements seriously...
If I create a lambda that captures by copy, and then move that lambda to a completely different thread and make no attempt to save the original objects used in the lambda, will it retain those copies for me?
std::thread createThread() {
std::string str("Success");
auto func = [=](){
printf("%s", str.c_str());
};
str = "Failure";
return std::thread(func);
}
int main() {
std::thread thread = createThread();
thread.join();
// assuming the thread doesn't execute anything until here...
// would it print "Success", "Failure", or deference a dangling pointer?
return 0;
}
It is guaranteed to print Success. Capture-by-copy does exactly what it says. It make a copy of the object right there and stores this copy as part of the closure object. The member of the closure object created from the capture lives as long as the closure object itself.
A lambda is not a function pointer. Lambdas are general function objects that can have internal state, which a function pointer can't have. In fact, only capture-less lambdas can be converted to function pointers and so may behave like one sometimes.
The lambda expression produces a closure type that basically looks something like this:
struct /*unnamed1*/ {
/*unnamed1*/(const /*unnamed1*/&) = default;
/*unnamed1*/(/*unnamed1*/&&) = default;
/*unnamed1*/& operator=(const /*unnamed1*/&) = delete;
void operator()() const {
printf("%s", /*unnamed2*/.c_str());
};
std::string /*unnamed2*/;
};
and the lambda expression produces an object of this type, with /*unnamed2*/ direct-initialized to the current value of str. (Direct-initialized meaning as if by std::string /*unnamed2*/(str);)
You have 3 situations
You can be design guarantee that variables live longer then the thread, because you synchronize with the end of the thread before variables go out of scope.
You know your thread may outlive the scope/life cycle of your thread but you don't need access to the variables anymore from any other thread.
You can't say which thread lives longest, you have multiple thread accessing your data and you want to extend the live time of your variables
In case 1. Capture by reference
In case 2. Capture by value (or you even use move) variables
In case 3. Make data shared, std::shared_ptr and capture that by value
Case 3 will extend the lifetime of the data to the lifetime of the longest living thread.
Note I prefer using std::async over std::thread, since that returns a RAII object (a future). The destructor of that will synchronize with the thread. So you can use that as members in objects with a thread and make sure the object destruction waits for the thread to finish.
Is passing a std::future to a detached instance of std::thread a safe operation? I know that underneath, the std::future has state in a shared_ptr which it shares with a std::promise. Here is an example.
int main()
{
std::promise<void> p;
std::thread( [f = p.get_future()]() {
if ( f.wait_for( std::chrono::seconds( 2 ) ) == std::future_status::ready )
{
return;
}
std::terminate();
} ).detach();
// wait for some operation
p.set_value();
}
There is a potential error case in the above code where the lambda is executed after the main thread exits. Does the shared state remain after the main thread exits?
[basic.start.term]/6 If there is a use of a standard library object or function not permitted within signal handlers (21.10) that does not happen before (4.7) completion of destruction of objects with static storage duration and execution of std::atexit registered functions (21.5), the program has undefined behavior.
Per [basic.start.main]/5, returning from main has the effect of calling std::exit, which does destroy objects with static storage duration and execute std::atexit registered functions. Therefore, I believe your example exhibits undefined behavior.
According to cppreference:
In a typical implementation, std::shared_ptr holds only two pointers:
- the stored pointer (one returned by get());
- a pointer to control block.
The control block is a dynamically-allocated object...
Given this information, I wouldn't think that the termination of the main thread is going to affect the shared_ptr in the worker thread.
This is mainly a question about scope and threads. Let's say we have the following struct.
struct Test
{
int number;
std::string name;
};
An instance of this struct will be used as an argument in the pthread_create function. Here is an example of what this might look like.
pthread_t tid[5];
for(int i = 0; i < 5; ++i)
{
Test test;
test.number = 5;
test.name = "test";
pthread_create(&tid[i], NULL, func, (void *)&test);
}
Is this acceptable? Since test is declared in the for's scope, that means we can only rely on it existing during a single iteration of the for loop.
When pthread_create is called, a pointer to test is given as an argument. This means that func is receiving the same pointer that is passed in to pthread_create. This means that when test goes out of scope, we can no longer rely on that pointer referring to test. If we were to create more locals, hence changing the stack, the location that the pointer points to would be overwritten by those new locals. Is this correct?
Is this acceptable?
In general, no, for the reasons you stated. Specifically, the thread's entry-function may not start executing until after the Test object has been destroyed, or the Test object might be destroyed while the thread's entry-function is still using it. Either of those possibilities will lead to undefined behavior (and a program that only works correctly "sometimes").
You need to guarantee that the data you pass to the thread remains valid for as long as the thread needs to use it. One way to do that is to allocate that Test object on the heap, and have the thread call delete on the Test-pointer when it's done using it. Alternatively, you could use a condition variable to "pause" the main thread until the child pthread signals that it is done accessing the Test on the main thread's stack, so that it is now safe for the main thread to continue execution.
If we were to create more locals, hence changing the stack, the
location that the pointer points to would be overwritten by those new
locals. Is this correct?
Yes, you understand the issue correctly.
I am confused with the description of thread_local in C++11. My understanding is, each thread has unique copy of local variables in a function. The global/static variables can be accessed by all the threads (possibly synchronized access using locks). And the thread_local variables are visible to all the threads but can only modified by the thread for which they are defined? Is it correct?
Thread-local storage duration is a term used to refer to data that is seemingly global or static storage duration (from the viewpoint of the functions using it) but, in actual fact, there is one copy per thread.
It adds to the current options:
automatic (exists during a block or function);
static (exists for the program duration); and
dynamic (exists on the heap between allocation and deallocation).
Something that is thread-local is brought into existence at thread creation time and disposed of when the thread finishes.
For example, think of a random number generator where the seed must be maintained on a per-thread basis. Using a thread-local seed means that each thread gets its own random number sequence, independent of all other threads.
If your seed was a local variable within the random function, it would be initialised every time you called it, giving you the same number each time. If it was a global, threads would interfere with each other's sequences.
Another example is something like strtok where the tokenisation state is stored on a thread-specific basis. That way, a single thread can be sure that other threads won't screw up its tokenisation efforts, while still being able to maintain state over multiple calls to strtok - this basically renders strtok_r (the thread-safe version) redundant.
Yet another example would be something like errno. You don't want separate threads modifying errno after one of your calls fails, but before you've had a chance to check the result.
This site has a reasonable description of the different storage duration specifiers.
When you declare a variable thread_local then each thread has its own copy. When you refer to it by name, then the copy associated with the current thread is used. e.g.
thread_local int i=0;
void f(int newval){
i=newval;
}
void g(){
std::cout<<i;
}
void threadfunc(int id){
f(id);
++i;
g();
}
int main(){
i=9;
std::thread t1(threadfunc,1);
std::thread t2(threadfunc,2);
std::thread t3(threadfunc,3);
t1.join();
t2.join();
t3.join();
std::cout<<i<<std::endl;
}
This code will output "2349", "3249", "4239", "4329", "2439" or "3429", but never anything else. Each thread has its own copy of i, which is assigned to, incremented and then printed. The thread running main also has its own copy, which is assigned to at the beginning and then left unchanged. These copies are entirely independent, and each has a different address.
It is only the name that is special in that respect --- if you take the address of a thread_local variable then you just have a normal pointer to a normal object, which you can freely pass between threads. e.g.
thread_local int i=0;
void thread_func(int*p){
*p=42;
}
int main(){
i=9;
std::thread t(thread_func,&i);
t.join();
std::cout<<i<<std::endl;
}
Since the address of i is passed to the thread function, then the copy of i belonging to the main thread can be assigned to even though it is thread_local. This program will thus output "42". If you do this, then you need to take care that *p is not accessed after the thread it belongs to has exited, otherwise you get a dangling pointer and undefined behaviour just like any other case where the pointed-to object is destroyed.
thread_local variables are initialized "before first use", so if they are never touched by a given thread then they are not necessarily ever initialized. This is to allow compilers to avoid constructing every thread_local variable in the program for a thread that is entirely self-contained and doesn't touch any of them. e.g.
struct my_class{
my_class(){
std::cout<<"hello";
}
~my_class(){
std::cout<<"goodbye";
}
};
void f(){
thread_local my_class unused;
}
void do_nothing(){}
int main(){
std::thread t1(do_nothing);
t1.join();
}
In this program there are 2 threads: the main thread and the manually-created thread. Neither thread calls f, so the thread_local object is never used. It is therefore unspecified whether the compiler will construct 0, 1 or 2 instances of my_class, and the output may be "", "hellohellogoodbyegoodbye" or "hellogoodbye".
Thread-local storage is in every aspect like static (= global) storage, only that each thread has a separate copy of the object. The object's life time starts either at thread start (for global variables) or at first initialization (for block-local statics), and ends when the thread ends (i.e. when join() is called).
Consequently, only variables that could also be declared static may be declared as thread_local, i.e. global variables (more precisely: variables "at namespace scope"), static class members, and block-static variables (in which case static is implied).
As an example, suppose you have a thread pool and want to know how well your work load was being balanced:
thread_local Counter c;
void do_work()
{
c.increment();
// ...
}
int main()
{
std::thread t(do_work); // your thread-pool would go here
t.join();
}
This would print thread usage statistics, e.g. with an implementation like this:
struct Counter
{
unsigned int c = 0;
void increment() { ++c; }
~Counter()
{
std::cout << "Thread #" << std::this_thread::id() << " was called "
<< c << " times" << std::endl;
}
};
I am confused with the description of thread_local in C++11. My understanding is, each thread has unique copy of local variables in a function. The global/static variables can be accessed by all the threads (possibly synchronized access using locks). And the thread_local variables are visible to all the threads but can only modified by the thread for which they are defined? Is it correct?
Thread-local storage duration is a term used to refer to data that is seemingly global or static storage duration (from the viewpoint of the functions using it) but, in actual fact, there is one copy per thread.
It adds to the current options:
automatic (exists during a block or function);
static (exists for the program duration); and
dynamic (exists on the heap between allocation and deallocation).
Something that is thread-local is brought into existence at thread creation time and disposed of when the thread finishes.
For example, think of a random number generator where the seed must be maintained on a per-thread basis. Using a thread-local seed means that each thread gets its own random number sequence, independent of all other threads.
If your seed was a local variable within the random function, it would be initialised every time you called it, giving you the same number each time. If it was a global, threads would interfere with each other's sequences.
Another example is something like strtok where the tokenisation state is stored on a thread-specific basis. That way, a single thread can be sure that other threads won't screw up its tokenisation efforts, while still being able to maintain state over multiple calls to strtok - this basically renders strtok_r (the thread-safe version) redundant.
Yet another example would be something like errno. You don't want separate threads modifying errno after one of your calls fails, but before you've had a chance to check the result.
This site has a reasonable description of the different storage duration specifiers.
When you declare a variable thread_local then each thread has its own copy. When you refer to it by name, then the copy associated with the current thread is used. e.g.
thread_local int i=0;
void f(int newval){
i=newval;
}
void g(){
std::cout<<i;
}
void threadfunc(int id){
f(id);
++i;
g();
}
int main(){
i=9;
std::thread t1(threadfunc,1);
std::thread t2(threadfunc,2);
std::thread t3(threadfunc,3);
t1.join();
t2.join();
t3.join();
std::cout<<i<<std::endl;
}
This code will output "2349", "3249", "4239", "4329", "2439" or "3429", but never anything else. Each thread has its own copy of i, which is assigned to, incremented and then printed. The thread running main also has its own copy, which is assigned to at the beginning and then left unchanged. These copies are entirely independent, and each has a different address.
It is only the name that is special in that respect --- if you take the address of a thread_local variable then you just have a normal pointer to a normal object, which you can freely pass between threads. e.g.
thread_local int i=0;
void thread_func(int*p){
*p=42;
}
int main(){
i=9;
std::thread t(thread_func,&i);
t.join();
std::cout<<i<<std::endl;
}
Since the address of i is passed to the thread function, then the copy of i belonging to the main thread can be assigned to even though it is thread_local. This program will thus output "42". If you do this, then you need to take care that *p is not accessed after the thread it belongs to has exited, otherwise you get a dangling pointer and undefined behaviour just like any other case where the pointed-to object is destroyed.
thread_local variables are initialized "before first use", so if they are never touched by a given thread then they are not necessarily ever initialized. This is to allow compilers to avoid constructing every thread_local variable in the program for a thread that is entirely self-contained and doesn't touch any of them. e.g.
struct my_class{
my_class(){
std::cout<<"hello";
}
~my_class(){
std::cout<<"goodbye";
}
};
void f(){
thread_local my_class unused;
}
void do_nothing(){}
int main(){
std::thread t1(do_nothing);
t1.join();
}
In this program there are 2 threads: the main thread and the manually-created thread. Neither thread calls f, so the thread_local object is never used. It is therefore unspecified whether the compiler will construct 0, 1 or 2 instances of my_class, and the output may be "", "hellohellogoodbyegoodbye" or "hellogoodbye".
Thread-local storage is in every aspect like static (= global) storage, only that each thread has a separate copy of the object. The object's life time starts either at thread start (for global variables) or at first initialization (for block-local statics), and ends when the thread ends (i.e. when join() is called).
Consequently, only variables that could also be declared static may be declared as thread_local, i.e. global variables (more precisely: variables "at namespace scope"), static class members, and block-static variables (in which case static is implied).
As an example, suppose you have a thread pool and want to know how well your work load was being balanced:
thread_local Counter c;
void do_work()
{
c.increment();
// ...
}
int main()
{
std::thread t(do_work); // your thread-pool would go here
t.join();
}
This would print thread usage statistics, e.g. with an implementation like this:
struct Counter
{
unsigned int c = 0;
void increment() { ++c; }
~Counter()
{
std::cout << "Thread #" << std::this_thread::id() << " was called "
<< c << " times" << std::endl;
}
};