For example, I want each thread not to start running until the previous one has completed. Is there a flag, something like thread.isRunning()?
#include <iostream>
#include <vector>
#include <thread>
using namespace std;
void hello() {
cout << "thread id: " << this_thread::get_id() << endl;
}
int main() {
vector<thread> threads;
for (int i = 0; i < 5; ++i)
threads.push_back(thread(hello));
for (thread& thr : threads)
thr.join();
cin.get();
return 0;
}
I know that the threads are meant to run concurrently, but what if I want to control the order?
There is no thread.isRunning(). You need some synchronization primitive to do it.
Consider std::condition_variable for example.
One approachable way is to use std::async. Under the current definition of std::async, the associated state of an operation launched by std::async can cause the returned std::future's destructor to block until the operation is complete. This can limit composability and result in code that appears to run in parallel but in reality runs sequentially.
{
std::async(std::launch::async, []{ hello(); });
std::async(std::launch::async, []{ hello(); }); // does not start until the first hello() completes
}
If the second thread needs to start only after the first one has completed, is a thread really needed?
As a solution, I think you could set a global flag: set its value in the first thread, and check the flag before starting the second thread. That should work.
You can't simply control the order by saying "First thread 1, then thread 2, ..."; you will need to make use of synchronization (e.g. std::mutex and condition variables such as std::condition_variable_any).
You can create events so as to block one thread until a certain event has happened.
See cppreference for an overview of the threading mechanisms in C++11.
You will need to use a semaphore or a lock.
If you initialize the semaphore to the value 0:
Call wait after thread.start() and call signal/release at the end of the thread's execution function (e.g. the run function in Java, an OnExit function, etc.).
The main thread will then keep waiting until the thread started in the loop has completed its execution.
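The semaphore idea above can be sketched in C++11 as follows. This is only a minimal illustration: C++11 has no standard semaphore, so the Semaphore class here is a hypothetical helper built from std::mutex and std::condition_variable, and run_in_order is a made-up name for the loop that starts a thread and then waits for it to signal.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal counting semaphore (not a standard library type).
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    // Increment the count and wake one waiter.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Block until the count is positive, then decrement it.
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Starts n threads so that each one finishes its work before the next one
// is started. Returns the order in which the workers did their work.
std::vector<int> run_in_order(int n) {
    std::vector<int> order;
    Semaphore done(0);  // initialized to 0, as described above
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&order, &done, i] {
            order.push_back(i);  // the "work"; only one worker runs at a time
            done.signal();       // signal at the end of the thread function
        });
        done.wait();  // the main thread blocks until this worker has signalled
    }
    for (auto& t : threads) t.join();
    return order;
}
```

Calling run_in_order(5) from main should give the workers' ids in ascending order. Note that this serializes the work completely, so it mostly demonstrates the mechanism rather than a reason to use threads.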
Task-based parallelism can achieve this, but C++ does not currently offer a task model as part of its threading libraries. If you have TBB or PPL you can use their task-based facilities.
I think you can achieve this by using std::mutex and std::condition_variable from C++11. To run the threads sequentially, an array of booleans is used: when a thread is done doing its work, it writes true at its own index of the array.
For example:
mutex mtx;
condition_variable cv;
bool ids[10] = { false };
void shared_method(int id) {
unique_lock<mutex> lock(mtx);
if (id != 0) {
while (!ids[id - 1]) {
cv.wait(lock);
}
}
int delay = rand() % 4;
cout << "Thread " << id << " will finish in " << delay << " seconds." << endl;
this_thread::sleep_for(chrono::seconds(delay));
ids[id] = true;
cv.notify_all();
}
void test_condition_variable() {
thread threads[10];
for (int i = 0; i < 10; ++i) {
threads[i] = thread(shared_method, i);
}
for (thread &t : threads) {
t.join();
}
}
Output:
Thread 0 will finish in 3 seconds.
Thread 1 will finish in 1 seconds.
Thread 2 will finish in 1 seconds.
Thread 3 will finish in 2 seconds.
Thread 4 will finish in 2 seconds.
Thread 5 will finish in 0 seconds.
Thread 6 will finish in 0 seconds.
Thread 7 will finish in 2 seconds.
Thread 8 will finish in 3 seconds.
Thread 9 will finish in 1 seconds.
In modern C++ with STL threads I want to have two worker threads that take turns doing their work. Only one can be working at a time and each may only get one turn before the other takes a turn. I have this part working.
The added constraint is that one thread needs to keep taking turns after the other thread finishes. But in my code the remaining worker thread deadlocks after the first worker thread finishes. I don't understand why, given that the last things the first worker did was unlock and notify the condition variable, which should've woken the second one up. Here's the code:
{
std::mutex mu;
std::condition_variable cv;
int turn = 0;
auto thread_func = [&](int tid, int iters) {
std::unique_lock<std::mutex> lk(mu);
lk.unlock();
for (int i = 0; i < iters; i++) {
lk.lock();
cv.wait(lk, [&] {return turn == tid; });
printf("tid=%d turn=%d i=%d/%d\n", tid, turn, i, iters);
fflush(stdout);
turn = !turn;
lk.unlock();
cv.notify_all();
}
};
auto th0 = std::thread(thread_func, 0, 20);
auto th1 = std::thread(thread_func, 1, 25); // Does more iterations
printf("Made the threads.\n");
fflush(stdout);
th0.join();
th1.join();
printf("Both joined.\n");
fflush(stdout);
}
I don't know whether this is something I don't understand about concurrency in STL threads, or whether I just have a logic bug in my code. Note that there is a question on SO that's similar to this, but without the second worker having to run longer than the first. I can't find it right now to link to it. Thanks in advance for your help.
When one thread is done, the other will wait for a notification that nobody will send. When only one thread is left, you need to either stop using the condition variable or signal the condition variable some other way.
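One way to "signal the condition variable some other way" is sketched below. It is an assumption about how the fix could look, not the only option: each worker records that it has finished (the finished array is a name introduced here), and the wait predicate lets a worker proceed either when it is its turn or when the other worker is already done. The printf calls are replaced by a log vector so the order of turns can be inspected.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Two workers alternate turns; whichever has more iterations keeps going
// alone after the other one has finished. Returns the tid of each turn.
std::vector<int> take_turns(int iters0, int iters1) {
    std::mutex mu;
    std::condition_variable cv;
    int turn = 0;
    bool finished[2] = {false, false};  // hypothetical "I am done" flags
    std::vector<int> log;

    auto worker = [&](int tid, int iters) {
        const int other = 1 - tid;
        for (int i = 0; i < iters; ++i) {
            std::unique_lock<std::mutex> lk(mu);
            // Proceed on our turn, or at any time once the other is done.
            cv.wait(lk, [&] { return turn == tid || finished[other]; });
            log.push_back(tid);
            turn = other;
            lk.unlock();
            cv.notify_all();
        }
        {
            // Tell the remaining worker that nobody will flip `turn` again.
            std::lock_guard<std::mutex> lk(mu);
            finished[tid] = true;
        }
        cv.notify_all();
    };

    std::thread th0(worker, 0, iters0);
    std::thread th1(worker, 1, iters1);
    th0.join();
    th1.join();
    return log;
}
```

With take_turns(20, 25) the two workers alternate for the first 40 turns, and the last five turns all belong to worker 1 instead of deadlocking.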
I'm trying to understand how to better use condition variables, and I have the following code.
Behavior.
The expected behavior of the code is that:
Each thread prints "thread n waiting"
The program waits until the user presses enter
When the user presses enter, notify_one is called once for each thread
All the threads print "thread n ready.", and exit
The observed behavior of the code is that:
Each thread prints "thread n waiting" (Expected)
The program waits until the user presses enter (Expected)
When the user presses enter, notify_one is called once for each thread (Expected)
One of the threads prints "thread n ready", but then the code hangs. (???)
Question.
Why does the code hang? And how can I have multiple threads wait on the same condition variable?
Code
#include <condition_variable>
#include <iostream>
#include <string>
#include <vector>
#include <thread>
int main() {
using namespace std::literals::string_literals;
auto m = std::mutex();
auto lock = std::unique_lock(m);
auto cv = std::condition_variable();
auto wait_then_print =[&](int id) {
return [&, id]() {
auto id_str = std::to_string(id);
std::cout << ("thread " + id_str + " waiting.\n");
cv.wait(lock);
// If I add this line in, the code gives me a system error:
// lock.unlock();
std::cout << ("thread " + id_str + " ready.\n");
};
};
auto threads = std::vector<std::thread>(16);
int counter = 0;
for(auto& t : threads)
t = std::thread(wait_then_print(counter++));
std::cout << "Press enter to continue.\n";
std::getchar();
for(int i = 0; i < counter; i++) {
cv.notify_one();
std::cout << "Notified one.\n";
}
for(auto& t : threads)
t.join();
}
Output
thread 1 waiting.
thread 0 waiting.
thread 2 waiting.
thread 3 waiting.
thread 4 waiting.
thread 5 waiting.
thread 6 waiting.
thread 7 waiting.
thread 8 waiting.
thread 9 waiting.
thread 11 waiting.
thread 10 waiting.
thread 12 waiting.
thread 13 waiting.
thread 14 waiting.
thread 15 waiting.
Press enter to continue.
Notified one.
Notified one.
thread 1 ready.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
This is undefined behavior.
In order to wait on a condition variable, the condition variable must be waited on by the same exact thread that originally locked the mutex. You cannot lock the mutex in one execution thread, and then wait on the condition variable in another thread.
auto lock = std::unique_lock(m);
This lock is obtained in the main execution thread. Afterwards, the main execution thread creates all these multiple execution threads. Each one of these execution threads executes the following:
cv.wait(lock)
The mutex lock was not acquired by the execution thread that calls wait() here, therefore this is undefined behavior.
A closer look at what you are attempting to do here suggests that you will likely get your intended results if you simply move
auto lock = std::unique_lock(m);
inside the lambda that gets executed by each new execution thread.
You also need to use notify_all() instead of calling notify_one() multiple times, due to various race conditions. Remember that wait() automatically unlocks the mutex and waits on the condition variable, and wait() returns only after the thread has successfully relocked the mutex after being notified by the condition variable.
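Putting those two changes together, a minimal sketch might look like the following. The names release_all, ready and released are made up for the illustration, the ready flag stands in for the Enter key press, and the predicate passed to wait() is an addition that guards against spurious wakeups and against a notification arriving before a thread has started waiting.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Starts n threads that all wait on the same condition variable, releases
// them with a single notify_all(), and returns how many threads woke up.
int release_all(int n) {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;  // stands in for "the user pressed enter"
    int released = 0;

    std::vector<std::thread> threads;
    for (int id = 0; id < n; ++id) {
        threads.emplace_back([&] {
            // Each thread locks the mutex itself before waiting on it.
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return ready; });
            ++released;  // still holding the lock here
        });
    }

    {
        std::lock_guard<std::mutex> lock(m);
        ready = true;
    }
    cv.notify_all();  // one call wakes every waiting thread

    for (auto& t : threads) t.join();
    return released;
}
```

Because the flag is checked under the mutex, a thread that reaches wait() only after notify_all() has been called sees ready == true and does not block at all.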
I want to create over 500 threads in C++ on a BeagleBone Black,
but the program fails with errors.
Could you explain why the errors occur and how I can fix them?
The thread function, call_from_thread(int tid):
void call_from_thread(int tid)
{
cout << "thread running : " << tid << std::endl;
}
The main function:
int main() {
thread t[500];
for(int i=0; i<500; i++) {
t[i] = thread(call_from_thread, i);
usleep(100000);
}
std::cout << "main fun start" << endl;
return 0;
}
I expect:
...
...
thread running : 495
thread running : 496
thread running : 497
thread running : 498
thread running : 499
main fun start
but I get:
...
...
thread running : 374
thread running : 375
thread running : 376
thread running : 377
thread running : 378
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
Aborted
Could you help me?
The BeagleBone Black appears to have a maximum of 512MB of DRAM.
The default stack size of a thread according to pthread_create() is 2MB,
i.e. 2^29 / 2^21 = 2^8 = 256 threads' worth of stack. So what you're probably seeing around thread 374 is that the allocator cannot free memory fast enough to meet the demand, which
is handled by throwing an exception.
If you really want to see this explode, try moving that sleep call inside your thread function. :)
You could try preallocating the stack to 1MB or less (pthreads), but that has its
own set of problems.
The questions to really ask yourself are:
Is my application I/O bound or compute bound?
What's my memory budget to run this application? If you spend your entire physical memory
on thread stacks, you'll have nothing left for the shared program heap.
Do I really need this much parallelism to do the job? The A8 is a single core machine BTW.
Could I solve the problem using a thread pool? Or not use threads at all?
Finally, you can't set the stack size in std::thread api, but you can in
boost::thread.
Or just write a thin wrapper around pthreads (assuming Linux).
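A thin pthreads wrapper of that kind could look roughly like this. It is a sketch that assumes a POSIX system such as Linux; start_with_stack and bump are hypothetical names, and the 256 KB used in the usage note is an arbitrary value, not a recommendation for the BeagleBone.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper: starts fn(arg) on a new thread whose stack has
// stack_bytes bytes. Returns 0 on success, otherwise a pthread error code.
int start_with_stack(pthread_t* out, std::size_t stack_bytes,
                     void* (*fn)(void*), void* arg) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    // Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN.
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) rc = pthread_create(out, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

// Example thread function: increments the int it is given.
void* bump(void* p) {
    ++*static_cast<int*>(p);
    return nullptr;
}
```

For example, start_with_stack(&th, 256 * 1024, bump, &value) followed by pthread_join(th, nullptr) runs bump on a thread with a 256 KB stack. The thread still has to be joined or detached, just like a std::thread.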
Whenever you use threads, there are three parts.
Start the threads
Do the work
Release the thread
You're starting the threads and doing the work, but you're not releasing them.
Releasing threads. There are two options for releasing a thread.
You can join the thread (which basically waits for it to finish)
You can detach the thread, and let it execute independently.
In this particular case, you don't want the program to finish until all threads are done executing, so you should join them.
#include <iostream>
#include <thread>
#include <vector>
#include <string>
auto call_from_thread = [](int i) {
// I create the entire message before printing it, so that there's no interleaving of messages between threads
std::string message = "Calling from thread " + std::to_string(i) + '\n';
// Because I only call print once, everything gets printed together
std::cout << message;
};
using std::thread;
int main() {
thread t[500];
for(int i=0; i<500; i++) {
// Here, I don't have to start the thread with any delay
t[i] = thread(call_from_thread, i);
}
std::cout << "main fun start\n";
// I join each thread (which waits for them to finish before closing the program)
for(auto& item : t) {
item.join();
}
return 0;
}
Assume I have the function double someRandomFunction(int n) that takes an integer and returns a double. It is random in the sense that it tries random things to come up with the solution, so even when you run the function with the same arguments, sometimes it takes 10 seconds to finish and other times 40 seconds.
The function double someRandomFunction(int n) is itself a wrapper around a black-box function. So someRandomFunction takes a while to complete, but I don't have control over the main loop of the black box; hence I can't really check a flag variable within the thread, as the heavy computation happens inside the black-box function.
I would like to start 10 threads calling that function, and I am interested in the result of whichever thread finishes first. I don't care which one it is; I only need one result from these threads.
I found the following code:
std::vector<boost::future<double>> futures;
for (...) {
auto fut = boost::async([i]() { return someRandomFunction(2); });
futures.push_back(std::move(fut));
}
for (...) {
auto res = boost::wait_for_any(futures.begin(), futures.end());
std::this_thread::yield();
std::cout << res->get() << std::endl;
}
This is the closest to what I am looking for, but I still can't see how to make my program terminate the other threads as soon as one thread returns a solution.
I would like to wait for one to finish and then carry on with the result of that one thread to continue my program execution (i.e., I don't want to terminate my program after I obtain that single result, but I would like to use it for the remaining program execution.).
Again, I want to start up 10 threads calling the someRandomFunction and then wait for one thread to finish first, get the result of that thread and stop all the other threads even though they didn't finish their work.
If the data structure supplied to the black-box has some obvious start and end values, one way to make it finish early could be to change the end value while it's computing. It could of course cause all sorts of trouble if you've misunderstood how the black-box must work with the data, but if you are reasonably sure, it can work.
main spawns 100 outer threads that each spawn one inner thread that calls the blackbox. The inner thread receives the blackbox result and notifies all waiting threads that it's done. The outer thread waits for any inner thread to get done and then modifies the data for its own blackbox to trick it to finish.
No polling (except for the spurious wakeup loops) and no detached threads.
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <vector>
#include <chrono>
// a work package for one black-box
struct data_for_back_box {
int start_here;
int end_here;
};
double blackbox(data_for_back_box* data) {
// time consuming work here:
for(auto v=data->start_here; v<data->end_here; ++v) {
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
// just a debug
if(data->end_here==0) std::cout << "I was tricked into exiting early\n";
return data->end_here;
}
// synchronizing stuff and result
std::condition_variable cv;
std::mutex mtx;
bool done=false;
double result;
// a wrapper around the real blackbox
void inner(data_for_back_box* data) {
double r = blackbox(data);
// take the lock before reading or writing the shared state
std::unique_lock<std::mutex> lock(mtx);
if(done) return; // someone has already finished, skip this result
result = r;
done=true;
// notify everyone that we're done
cv.notify_all();
}
// context setup and wait for any inner wrapper
// to signal "done"
void outer(int n) {
data_for_back_box data{0, 100+n*n};
std::thread work(inner, &data);
{
std::unique_lock<std::mutex> lock(mtx);
while( !done ) cv.wait(lock);
}
// corrupt data for blackbox:
data.end_here = 0;
// wait for this thread's blackbox to finish
work.join();
}
int main() {
std::vector<std::thread> ths;
// spawn 100 worker threads
for(int i=0; i<100; ++i) {
ths.emplace_back(outer, i);
}
double saved_result;
{
std::unique_lock<std::mutex> lock(mtx);
while( !done ) cv.wait(lock);
saved_result = result;
} // release lock
// join all threads
std::cout << "got result, joining:\n";
for(auto& th : ths) {
th.join();
}
std::cout << "result: " << saved_result << "\n";
}
I would like to launch a member function in a separate thread calling it from another member.
Maybe the code below is clearer.
There is a button which launches the counter in a thread and it works:
void MainWindow::on_pushButton_CountNoArgs_clicked()
{
myCounter *counter = new myCounter;
QFuture<void> future = QtConcurrent::run(counter, &myCounter::countUpToThousand);
}
MyCounter class member functions:
void myCounter::countUpToHundred()
{
for(int i = 0; i<=100; i++)
{
qDebug() << "up to 100: " << i;
}
}
void myCounter::countUpToThousand()
{
for(int i = 0; i<=1000; i++)
{
qDebug() << "up to 1000: " << i;
if (i == 500)
{
//here I want to launch myCounter::countUpToHundred() in another thread
}
}
}
Thanks in advance.
Assuming you want to run the 2 counters in parallel, you have 3 threads:
Thread 1: UI-Thread (or main thread)
on_pushButton_CountNoArgs_clicked() runs here. You should not do heavy work in this function, because if you want to achieve 60 frames per second, you only have about 16 ms for all the work. Starting a new thread to run countUpToThousand() is a good idea.
Thread 2: background thread (started by QtConcurrent, running countUpToThousand)
This runs in parallel to Thread 1, and you are working with the same instance of myCounter (i.e. the same place in memory) so be careful which member variables you read and write.
Thread 3: background thread (started by QtConcurrent, running countUpToHundred)
Start using (as hank pointed out)
void myCounter::countUpToThousand()
{
for(int i = 0; i<=1000; i++)
{
qDebug() << "up to 1000: " << i;
if (i == 500)
{
QtConcurrent::run(this, &myCounter::countUpToHundred);
}
}
}
This will run in parallel to Thread 1 and Thread 2.
Now you might get crazy output like 988\n99\n when one counter is at 999 and the other is at 88, because Thread 2 and Thread 3 will be printing to the console at the same time and don't care about what the other thread is doing.
Also note that you must not delete counter before Thread 2 and Thread 3 are done, because if you do, they'll still try to access the memory and your application will probably crash.