According to cppreference.com, the
std::thread constructor with no parameter means:
Creates new thread object which does not represent a thread.
My questions are:
Why do we need this constructor? And if we create a thread using this constructor, how can we "assign" a thread function later?
Why don't we have a "run(function_address)" method so that when constructed with no parameter, we can specify a function to "run" for that thread.
Or, we can construct a thread with a callable parameter (function, functors, etc.) but call a "run()" method to actually execute the thread later. Why is std::thread not designed in this way?
Your question suggests there might be some confusion and it would be helpful to clearly separate the ideas of a thread of execution from the std::thread type, and to separate both from the idea of a "thread function".
A thread of execution represents a flow of control through your program, probably corresponding to an OS thread managed by the kernel.
An object of the type std::thread can be associated with a thread of execution, or it can be "empty" and not refer to any thread of execution.
There is no such concept as a "thread function" in standard C++. Any function can be run in a new thread of execution by passing it to the constructor of a std::thread object.
why do we need this constructor?
To construct the empty state that doesn't refer to a thread of execution. You might want to have a member variable of a class that is a std::thread, but not want to associate it with a thread of execution right away. So you default construct it, and then later launch a new thread of execution and associate it with the std::thread member variable. Or you might want to do:
std::thread t;
if (some_condition) {
t = std::thread{ func1, arg1 };
}
else {
auto result = some_calculation();
t = std::thread{ func2, arg2, result };
}
The default constructor allows the object t to be created without launching a new thread of execution until needed.
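The member-variable case mentioned above can be sketched as follows. This is only an illustration: the class name Worker and its start(), stop(), is_attached() and runs() members are hypothetical, not part of any library.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical class for illustration: the std::thread member is
// default-constructed (empty) and only later bound to a thread of execution.
class Worker {
public:
    Worker() = default;             // worker_ does not represent a thread yet
    ~Worker() { stop(); }           // never destroy a joinable std::thread

    // Launches a new thread of execution and move-assigns it to the member.
    void start() {
        assert(!worker_.joinable());    // must still be empty
        worker_ = std::thread([this] { ++runs_; });
    }

    // Waits for the thread of execution, if there is one.
    void stop() {
        if (worker_.joinable()) worker_.join();
    }

    bool is_attached() const { return worker_.joinable(); }
    int runs() const { return runs_.load(); }

private:
    std::atomic<int> runs_{0};
    std::thread worker_;
};
```

A Worker can therefore live inside another object or a container for a while before start() is called, and is_attached() reports whether the member currently manages a thread of execution.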
And if we create a thread using this constructor, how can we "assign" a thread function later?
You "assign" using "assignment" :-)
But you don't assign a "thread function" to it, that is not what std::thread is for. You assign another std::thread to it:
std::thread t;
std::thread t2{ func, args };
t = std::move(t2);
Think in terms of creating a new thread of execution not "assigning a thread function" to something. You're not just assigning a function, that's what std::function would be used for. You are requesting the runtime to create a new thread of execution, which will be managed by a std::thread object.
Why don't we have a "run(function_address)" method so that when constructed with no parameter, we can specify a function to "run" for that thread.
Because you don't need it. You start new threads of execution by constructing a std::thread object with arguments. If you want that thread of execution to be associated with an existing object then you can do that by move-assigning or swapping.
Or, we can construct a thread with a callable parameter(function, functors, etc.) but call a "run()" method to actually execute the thread later. Why std::thread is not designed in this way?
Why should it be designed that way?
The std::thread type is for managing a thread of execution not holding a callable object for later use. If you want to create a callable object that can be later run on a new thread of execution there are lots of ways to do that in C++ (using a lambda expression, or std::bind, or std::function, or std::packaged_task, or a custom functor type). The job of std::thread is to manage a thread of execution not to hold onto a callable object until you want to call it.
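As a rough sketch of "prepare the callable now, run it on a thread later", here is one of the options listed above, std::packaged_task. The functions add() and run_deferred() are made up for this example.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical function used only for this sketch.
inline int add(int a, int b) { return a + b; }

inline int run_deferred() {
    // Step 1: package the call now; nothing runs yet.
    std::packaged_task<int()> task([] { return add(2, 3); });
    std::future<int> result = task.get_future();
    assert(result.valid());

    // ... some other code ...

    // Step 2: create the thread of execution only when the work should start.
    std::thread t(std::move(task));
    int value = result.get();   // blocks until the task has run
    t.join();
    return value;
}
```

The callable and the thread of execution stay separate concepts: the packaged_task holds the work, and the std::thread only exists while that work is being run.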
The default constructor is provided so that an "empty" thread object can be created. Not all thread objects will be associated with a thread of execution at the time they are constructed. Consider the case where the thread is a member of some type and that type has a default constructor. Consider also that the thread type has no concept of a "suspended" thread, i.e. it can't be created in a suspended state.
The thread type doesn't have a "run" method of some sort since one of the original design decisions (IIRC) was to have a "strong" association between the thread object and the thread of execution. Allowing threads to be "moved" makes that intent clearer (in my opinion). Hence moving an instance of a thread object into an "empty" object is clearer than attempting to "run" a thread.
It is conceivable that you can create a wrapper class of some sort that offers the "run" method, but I think this may be a narrower use case, and that can be solved given the API of the std::thread class.
The default constructor gives you the possibility to create an array of threads:
thread my_threads[4];
for (int i=0; i<4; i++)
{
thread temp(func,...);
my_threads[i]=move(temp);
}
A thread created with the default constructor "becomes" a "real" thread through move assignment.
You can also store threads in a standard container if you prefer.
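A minimal sketch of the standard-container variant, assuming a hypothetical helper run_in_vector() that starts a number of threads and joins them all. With std::vector no default-constructed placeholders are needed, because emplace_back constructs each std::thread in place.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: starts `count` threads that each bump a shared
// counter, joins them all, and returns the final count.
inline int run_in_vector(int count) {
    assert(count >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        // Constructs the std::thread directly inside the vector.
        threads.emplace_back([&counter] { ++counter; });
    }
    for (std::thread& t : threads) t.join();
    return counter.load();
}
```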
EDIT: Maybe let's first comment on the very last part:
3.2 Why std::thread is not designed in this way?
I don't know exactly why (there are surely advantages and disadvantages; see Jonathan Wakely's answer for more details about the rationale behind it), but it seems that C++11 std::thread is modelled much more closely on pthreads than, e.g., Qt's QThread or Java's Thread classes, which might be the source of your confusion.
As to the rest of your questions:
1.1 why do we need this constructor?
You might want to create a std::thread variable without starting a thread right away (e.g. a class member variable or an element of a static array, as shown by alangab). It's not much different from a std::fstream that can be created without a filename.
1.2 And if we create a thread using this constructor, how can we "assign" a thread function later?
For example:
std::thread myThread;
// some other code
myThread = std::thread(foo); // pass the function itself; foo() would call it first
Why don't we have a "run(function_address)" method so that when constructed with no parameter, we can specify a function to "run" for that thread.
I don't know why it was designed like this, but I don't see the benefit a run method would have compared to the above syntax.
3.1 Or, we can construct a thread with a callable parameter(function, functors, etc.) but call a "run()" method to actually execute the thread later.
You can simulate this by creating a lambda or std::function object and create the thread when you want to run the function.
auto myLambda = [=]{foo(param1, param2);};
// some other code
std::thread myThread(myLambda);
If you want to use the syntax you describe, I'd recommend to write your own Thread wrapper class (should only take a few dozen lines of code) that (optionally) also ensures that the thread is either detached or joined upon destruction of the wrapper, which is - in my opinion - the main problem with std::thread.
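For illustration, such a wrapper might look roughly like the sketch below. The class name DeferredThread and its members are hypothetical; it stores the callable, starts it on run(), and joins in the destructor.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical wrapper (not part of the standard library): stores a callable,
// starts it on demand via run(), and joins in the destructor.
class DeferredThread {
public:
    explicit DeferredThread(std::function<void()> fn) : fn_(std::move(fn)) {}
    DeferredThread(const DeferredThread&) = delete;
    DeferredThread& operator=(const DeferredThread&) = delete;

    ~DeferredThread() { join(); }

    // Starts the stored callable on a new thread; later calls do nothing.
    void run() {
        assert(fn_);    // an empty std::function cannot be run
        if (!started_) {
            started_ = true;
            thread_ = std::thread(fn_);
        }
    }

    // Waits for the thread if it was started and not yet joined.
    void join() {
        if (thread_.joinable()) thread_.join();
    }

private:
    std::function<void()> fn_;
    std::thread thread_;    // default-constructed: empty until run()
    bool started_ = false;
};
```

Joining in the destructor is one policy choice; a variant that detaches instead would differ only in the destructor body. Note that the wrapper itself relies on the default constructor of std::thread for its member.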
Related
So I ran across something that seems to defeat the purpose of std::thread or at least makes it less convenient.
Say I want to spawn an std::thread to perform a task one time and don't want to worry about it again after that. I create this thread near the end of a function so the std::thread will soon go out of scope, in my case, likely while the thread is still running. This creates a problem with a couple of solutions (or at least that I know of).
I can:
A) Make the std::thread a global variable so it doesn't go out of scope.
B) Call join() on the std::thread which defeats the purpose of spawning the thread.
Are there other, hopefully better, ways to handle this kind of situation?
What you want is std::thread::detach.
It decouples the actual thread of execution and the std::thread object, allowing you to destroy it without joining the threads.
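A minimal sketch of the fire-and-forget pattern with detach(), assuming C++14 for the init-capture. The helper name fire_and_forget() is hypothetical, and the returned future exists only so the caller can observe that the detached work ran; real fire-and-forget code might omit it.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper. The task owns its promise by value, so it touches
// none of the caller's local variables after detach().
inline std::future<int> fire_and_forget() {
    std::promise<int> done;
    std::future<int> result = done.get_future();

    std::thread t([p = std::move(done)]() mutable { p.set_value(42); });
    t.detach();               // t no longer refers to the thread of execution
    assert(!t.joinable());    // so destroying t at the end of scope is safe
    return result;
}
```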
You can use std::async.
async runs the function f asynchronously (potentially in a separate thread which may be part of a thread pool) and returns a std::future that will eventually hold the result of that function call.
Since you did not mention that you need to get the result of the thread, there is no need to get the future.
As pointed out in the comments, the destructor of a std::future returned by std::async will block until the task has finished (when the task was launched on its own thread), so you will have to move it to some global object (and manually manage deletion of unused futures), or you can move it into the local scope of the asynchronously called function.
Another good option is to let the thread consume tasks from a task queue. For that, split the job into task chunks and feed them to the worker thread. To avoid polling, use a condition_variable.
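The blocking destructor can be observed with a small sketch like the one below. The function name async_future_blocks() is made up for this example, and the 50 ms sleep is just an arbitrary stand-in for real work.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical demonstration: returns whether the task had finished by the
// time the future obtained from std::async was destroyed.
inline bool async_future_blocks() {
    std::atomic<bool> finished{false};
    {
        std::future<void> task = std::async(std::launch::async, [&finished] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            finished = true;
        });
        assert(task.valid());
    }   // the future's destructor waits here for the task to complete
    return finished.load();
}
```

So simply dropping the future does not give fire-and-forget behaviour: the enclosing scope still waits for the task.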
std::thread([&](){
while(running) // thread loop
{
// consume tasks from queue
}
}).detach();
A quote from Nikolai Josuttis - Standard Library C++11:
Detached threads can easily become a problem if they use nonlocal resources. The problem is that
you lose control of a detached thread and have no easy way to find out whether and how long it runs.
Thus, make sure that a detached thread does not access any objects after their lifetime has ended. For
this reason, passing variables and objects to a thread by reference is always a risk. Passing arguments
by value is strongly recommended.
The author goes on to explain that even if you pass a reference as a function argument to a thread, it is still passed by value, so you must mark the reference with std::ref.
I have these questions, see the code below:
void f(std::vector<int> V){...}
void g(std::vector<int>& V){...}
std::vector<int> V;
std::thread t1(f, V);
std::thread t2(f, std::ref(V));
std::thread t3(g, V);
std::thread t4(g, std::ref(V));
What are the differences in these 4 lines? Which lines are equivalent?
I am not joining or detaching thread, it's not about that, it's about the ways of passing the function argument.
t1:
This simply passes a copy of V to the thread.
t2:
Similarly to t1, a copy of V is passed to the thread, but the actual copy is made in the called thread instead of the caller thread. This is an important distinction because should V be altered or cease to exist by the time the thread begins, you will end up with either a different vector or Undefined Behavior.
t3:
This should fail to compile: std::thread passes its stored copy of the vector to g as an rvalue, and an rvalue cannot bind to a non-const lvalue reference.
t4:
This passes the vector by reference to the thread. Any modifications to the passed reference will be applied to V, provided that proper synchronisation is performed, of course.
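The difference between the t1 and t4 cases can be checked with a short sketch. The function names by_value(), by_reference() and size_after_threads() are hypothetical stand-ins for f and g from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

inline void by_value(std::vector<int> v) { v.push_back(1); }       // changes its own copy
inline void by_reference(std::vector<int>& v) { v.push_back(1); }  // changes the caller's vector

inline std::size_t size_after_threads() {
    std::vector<int> V{10, 20};

    std::thread t1(by_value, V);                  // V is copied for the thread
    t1.join();
    assert(V.size() == 2);                        // original untouched

    // std::thread t3(by_reference, V);           // would not compile

    std::thread t4(by_reference, std::ref(V));    // thread really sees V
    t4.join();
    return V.size();
}
```

Joining each thread before reading V is what makes the accesses safe here; with concurrently running threads, the by-reference case would need synchronisation.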
I am doing some work with threading on an embedded platform. This platform provides a Thread class, and it has a start method that takes a function pointer, like this:
void do_in_parallel() {
// Some stuff to do in a new thread
}
Thread my_thread;
my_thread.start(do_in_parallel);
The problem is there is no way to pass parameters in [1]. I want to solve this by creating an abstract class, call it Thread2, that extends Thread (or it could just have a Thread as instance data).
Thread2 would have a pure virtual function void run(), and the goal was to pass that to Thread::start(void (*)()), except I soon learned that member function pointers have a different type and can't be used like this. I could make run() static, but then I still can't have more than one instance, defeating the whole purpose (not to mention you can't have a virtual static function).
Are there any workarounds that wouldn't involve changing the original Thread class (considering it's a library that I'm stuck with as-is)?
1. Global variables are a usable workaround in many cases, except when instantiating more than one thread from the same function pointer. I can't come up with a way to avoid race conditions in that case.
Write a global thread pool.
It maintains a queue of tasks. These tasks can have state.
When you add a task to the queue, you can choose to also request that it get a thread immediately. Or you can wait for threads in the pool to finish what they are doing.
The threads in the pool are created by the provided Thread class, and they get their marching instructions from the pool. For the most part, they should pop tasks, do them, then wait on another task being ready.
If waiting isn't permitted, you could still have some global thread manager that stores state for the threads.
The pool/manager returns the equivalent of a future<T> augmented with whatever features you want. Code that provides tasks interacts with the task through that object instead of the embedded Thread type.
A simple wrapper can be written if locking is permitted
void start(Thread& t, void (*fn)(void*), void* p)
{
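// Caution: a std::mutex must be unlocked by the thread that locked it, so a
// binary semaphore or a platform lock without thread ownership is needed here.
static std::mutex mtx; // placeholder for such a lock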
static void* sp;
static void (*sfn)(void*);
mtx.lock();
sp = p;
sfn = fn;
t.start([]{
auto p = sp;
auto fn = sfn;
mtx.unlock();
fn(p);
});
}
This is obviously not going to scale well, since all thread creation goes through the same lock, but it's likely good enough.
Note this is exception-unsafe, but I assume that is fine in embedded systems.
With the wrapper in place
template<typename C>
void start(Thread& t, C& c)
{
start(t, [](void* p){
(*(C*)p)();
}, &c);
}
Which allows any callable to be used. This particular implementation places the responsibility of managing the callable's lifetime on the caller.
You can create your own threaded dispatching mechanism (producer-consumer queue) built around the platform specific thread.
I assume that you have the equivalent of a mutex and a condition variable or other signalling mechanism on the target platform.
Create a thread safe queue that can accept function objects.
The run method creates a thread and waits on the queue.
The calling thread can call a post()/invoke() method that simply inserts a function object into the queue.
The function object can carry the necessary arguments supplied by the calling thread.
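The steps above could be sketched roughly as follows. The class name Dispatcher is hypothetical, and std::thread, std::mutex and std::condition_variable are used here only as stand-ins for whatever thread and signalling primitives the target platform provides.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical dispatcher: one worker thread consuming function objects.
class Dispatcher {
public:
    Dispatcher() : worker_([this] { loop(); }) {}
    Dispatcher(const Dispatcher&) = delete;
    Dispatcher& operator=(const Dispatcher&) = delete;

    // Lets the worker drain the remaining tasks, then stops it.
    ~Dispatcher() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Callable from any thread; arguments travel inside the function object.
    void post(std::function<void()> task) {
        assert(task);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;     // only reached when stopping_
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();     // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;    // declared last so the other members exist first
};
```

Because parameters are captured in the posted function object, the underlying thread entry point never needs to accept arguments, which is the limitation the question started from.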
As the title of the question says, why are C++ threads (std::thread and pthread) movable but not copyable? What would the consequences be if we made them copyable?
Regarding copying, consider the following snippet:
void foo();
std::thread first (foo);
std::thread second = first; // (*)
When the line marked (*) executes, presumably some of foo has already executed. What would the expected behavior be, then? Execute foo from the start? Halt the thread, copy the registers and state, and rerun it from there?
In particular, given that function objects are now part of the standard, it's very easy to launch another thread that performs exactly the same operation as some earlier thread, by reusing the function object.
There's not much motivation to begin with for this, therefore.
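As a small sketch of reusing a function object instead of copying a thread, consider the following. The Increment type and run_same_work_twice() are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical function object: adds `amount` to a shared counter.
struct Increment {
    std::atomic<int>* counter;
    int amount;
    void operator()() const { counter->fetch_add(amount); }
};

inline int run_same_work_twice() {
    std::atomic<int> counter{0};
    Increment work{&counter, 5};

    std::thread first(work);    // each std::thread gets its own copy of work
    std::thread second(work);   // same operation, separate thread of execution
    assert(first.joinable() && second.joinable());

    first.join();
    second.join();
    return counter.load();
}
```

Both threads start the operation from the beginning, which sidesteps the question of what copying a half-finished thread would even mean.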
Regarding moves, though, consider the following:
std::vector<std::thread> threads;
Without move semantics, it would be problematic: when the vector needs to resize internally, how would it move its elements to another buffer? See more on this here.
If the thread objects are copyable, who is finally responsible for the single thread of execution associated with the thread objects? In particular, what would join() do for each of the thread objects?
There are several possible outcomes, but that is the problem, there are several possible outcomes with no real overlap that can be codified (standardised) as a general use case.
Hence, the most reasonable outcome is that one thread of execution is associated with at most one thread object.
That is not to say some shared state cannot be provided, it is just that the user then needs to take further action in this regard, such as using a std::shared_ptr.
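One possible reading of this, sketched under assumptions: several owners share a single std::thread object through a std::shared_ptr, and a custom deleter decides what happens to the thread of execution when the last owner goes away (here it joins). The helper make_shared_thread() is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper: the last shared_ptr owner to go away joins the thread.
// Caveat: the last owner must not be released on the managed thread itself,
// since a thread cannot join itself.
inline std::shared_ptr<std::thread> make_shared_thread(std::function<void()> fn) {
    assert(fn);
    return std::shared_ptr<std::thread>(
        new std::thread(std::move(fn)),
        [](std::thread* t) {
            if (t->joinable()) t->join();
            delete t;
        });
}
```

There is still exactly one std::thread object per thread of execution; only the ownership of that object is shared, and the join policy is stated explicitly in the deleter.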
I create a boost::thread object with the new operator and continue without waiting for this thread to finish its work:
void do_work()
{
// perform some i/o work
}
boost::thread *thread = new boost::thread(&do_work);
I guess it's necessary to delete the thread when the work is done. What's the best way to do this without explicitly waiting for thread termination?
The boost::thread object's lifetime and the native thread's lifetime are unrelated. The boost::thread object can go out of scope at any time.
From the boost::thread class documentation
Just as the lifetime of a file may be different from the lifetime of an iostream object which represents the file, the lifetime of a thread of execution may be different from the thread object which represents the thread of execution. In particular, after a call to join(), the thread of execution will no longer exist even though the thread object continues to exist until the end of its normal lifetime. The converse is also possible; if a thread object is destroyed without join() having first been called, the thread of execution continues until its initial function completes.
Edit: If you just need to start a thread and never invoke join, you can use the thread's constructor as a function:
// Launch thread.
boost::thread(&do_work);
However, I don't suggest you do that, even if you think you're sure the thread will complete before main() does.
You can use
boost::thread t(&do_work);
t.detach();
Once the thread is detached it is no longer owned by the boost::thread object; the object can be destroyed and the thread will continue to run. The boost::thread destructor also calls detach() if the object owns a running thread, so letting t get destroyed will have the same result.
I suggest you use boost::shared_ptr, so you don't have to worry about when to delete the thread object.
boost::shared_ptr<boost::thread> thread(new boost::thread(&do_work));
You should take a look at thread interruption.
This article is good also.
http://www.boost.org/doc/libs/1_38_0/doc/html/thread/thread_management.html