Synchronizing very fast threads - c++

In the following example (an idealized "game") there are two threads. The main thread which updates data and RenderThread which "renders" it to the screen. What I need it those two to be synchronized. I cannot afford to run several update iteration without running a render for every single one of them.
I use a condition_variable to sync those two, so ideally the faster thread will spend some time waiting for the slower. However condition variables don't seem to do the job if one of the threads completes an iteration for a very small amount of time. It seems to quickly reacquire the lock of the mutex before wait in the other thread is able to acquire it. Even though notify_one is called
#include <iostream>
#include <thread>
#include <chrono>
#include <atomic>
#include <functional>
#include <mutex>
#include <condition_variable>
using namespace std;
bool isMultiThreaded = true;
struct RenderThread
{
RenderThread()
{
end = false;
drawing = false;
readyToDraw = false;
}
void Run()
{
while (!end)
{
DoJob();
}
}
void DoJob()
{
unique_lock<mutex> lk(renderReadyMutex);
renderReady.wait(lk, [this](){ return readyToDraw; });
drawing = true;
// RENDER DATA
this_thread::sleep_for(chrono::milliseconds(15)); // simulated render time
cout << "frame " << count << ": " << frame << endl;
++count;
drawing = false;
readyToDraw = false;
lk.unlock();
renderReady.notify_one();
}
atomic<bool> end;
mutex renderReadyMutex;
condition_variable renderReady;
//mutex frame_mutex;
int frame = -10;
int count = 0;
bool readyToDraw;
bool drawing;
};
struct UpdateThread
{
UpdateThread(RenderThread& rt)
: m_rt(rt)
{}
void Run()
{
this_thread::sleep_for(chrono::milliseconds(500));
for (int i = 0; i < 20; ++i)
{
// DO GAME UPDATE
// when this is uncommented everything is fine
// this_thread::sleep_for(chrono::milliseconds(10)); // simulated update time
// PREPARE RENDER THREAD
unique_lock<mutex> lk(m_rt.renderReadyMutex);
m_rt.renderReady.wait(lk, [this](){ return !m_rt.drawing; });
m_rt.readyToDraw = true;
// SUPPLY RENDER THREAD WITH DATA TO RENDER
m_rt.frame = i;
lk.unlock();
m_rt.renderReady.notify_one();
if (!isMultiThreaded)
m_rt.DoJob();
}
m_rt.end = true;
}
RenderThread& m_rt;
};
int main()
{
auto start = chrono::high_resolution_clock::now();
RenderThread rt;
UpdateThread u(rt);
thread* rendering = nullptr;
if (isMultiThreaded)
rendering = new thread(bind(&RenderThread::Run, &rt));
u.Run();
if (rendering)
rendering->join();
auto duration = chrono::high_resolution_clock::now() - start;
cout << "Duration: " << double(chrono::duration_cast<chrono::microseconds>(duration).count())/1000 << endl;
return 0;
}
Here is the source of this small example code, and as you can see even on ideone's run the output is frame 0: 19 (this means that the render thread has completed a single iteration, while the update thread has completed all 20 of its).
If we uncomment line 75 (ie simulate some time for the update loop) everything runs fine. Every update iteration has an associated render iteration.
Is there a way to really truly sync those threads, even if one of them completes an iteration in mere nanoseconds, but also without having a performance penalty if they both take some reasonable amount of milliseconds to complete?

If I understand correctly, you want the 2 threads to work alternately: updater wait until the renderer finish before to iterate again, and the renderer wait until the updater finish before to iterate again. Part of the computation could be parallel, but the number of iteration shall be similar between both.
You need 2 locks:
one for the updating
one for the rendering
Updater:
wait (renderingLk)
update
signal(updaterLk)
Renderer:
wait (updaterLk)
render
signal(renderingLk)
EDITED:
Even if it look simple, there are several problems to solve:
Allowing part of the calculations to be made in parallel: As in the above snippet, update and render will not be parallel but sequential, so there is no benefit to have multi-thread. To a real solution, some the calculation should be made before the wait, and only the copy of the new values need to be between the wait and the signal. Same for rendering: all the render need to be made after the signal, and only getting the value between the wait and the signal.
The implementation need to care also about the initial state: so no rendering is performed before the first update.
The termination of both thread: so no one will stay locked or loop infinitely after the other terminate.

I think a mutex (alone) is not the right tool for the job. You might want to consider using a semaphore (or something similar) instead. What you describe sound a lot like a producer/consumer problem, i.e., one process is allowed to run once everytime another process has finnished a task. Therefore you might also have a look at producer/consumer patterns. For example this series might get you some ideas:
A multi-threaded Producer Consumer with C++11
There a std::mutex is combined with a std::condition_variable to mimic the behavior of a semaphore. An approach that appears quite reasonable. You would probably not count up and down but rather toggle true and false a variable with needs redraw semantics.
For reference:
http://en.cppreference.com/w/cpp/thread/condition_variable
C++0x has no semaphores? How to synchronize threads?

This is because you use a separate drawing variable that is only set when the rendering thread reacquires the mutex after a wait, which may be too late. The problem disappears when the drawing variable is removed and the check for wait in the update thread is replaced with ! m_rt.readyToDraw (which is already set by the update thread and hence not susceptible to the logical race.
Modified code and results
That said, since the threads do not work in parallel, I don't really get the point of having two threads. Unless you should choose to implement double (or even triple) buffering later.

A technique often used in computer graphics is to use a double-buffer. Instead of having the renderer and the producer operate on the same data in memory, each one has its own buffer. This is implemented by using two independent buffers, and switch them when needed. The producer updates one buffer, and when it is done, it switches the buffer and fills the second buffer with the next data. Now, while the producer is processing the second buffer, the renderer works with the first one and displays it.
You could use this technique by letting the renderer lock the swap operation such that the producer may have to wait until rendering is finished.

Related

How bad it is to lock a mutex in an infinite loop or an update function

std::queue<double> some_q;
std::mutex mu_q;
/* an update function may be an event observer */
void UpdateFunc()
{
/* some other processing */
std::lock_guard lock{ mu_q };
while (!some_q.empty())
{
const auto& val = some_q.front();
/* update different states according to val */
some_q.pop();
}
/* some other processing */
}
/* some other thread might add some values after processing some other inputs */
void AddVal(...)
{
std::lock_guard lock{ mu_q };
some_q.push(...);
}
For this case is it okay to handle the queue this way?
Or would it be better if I try to use a lock-free queue like the boost one?
How bad it is to lock a mutex in an infinite loop or an update function
It's pretty bad. Infinite loops actually make your program have undefined behavior unless it does one of the following:
terminate
make a call to a library I/O function
perform an access through a volatile glvalue
perform a synchronization operation or an atomic operation
Acquiring the mutex lock before entering the loop and just holding it does not count as performing a synchronization operation (in the loop). Also, when holding the mutex, noone can add information to the queue, so while processing the information you extract, all threads wanting to add to the queue will have to wait - and no other worker threads wanting to share the load can extract from the queue either. It's usually better to extract one task from the queue, release the lock and then work with what you got.
The common way is to use a condition_variable that lets other threads acquire the lock and then notify other threads waiting with the same condition_variable. The CPU will be pretty close to idle while waiting and wake up to do the work when needed.
Using your program as a base, it could look like this:
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
std::queue<double> some_q;
std::mutex mu_q;
std::condition_variable cv_q; // the condition variable
bool stop_q = false; // something to signal the worker thread to quit
/* an update function may be an event observer */
void UpdateFunc() {
while(true) {
double val;
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
if(stop_q) break; // time to quit
val = std::move(some_q.front());
some_q.pop();
} // lock released so others can use the queue
// do time consuming work with "val" here
std::cout << "got " << val << '\n';
}
}
/* some other thread might add some values after processing some other inputs */
void AddVal(double val) {
std::lock_guard lock{mu_q};
some_q.push(val);
cv_q.notify_one(); // notify someone that there's a new value to work with
}
void StopQ() { // a function to set the queue in shutdown mode
std::lock_guard lock{mu_q};
stop_q = true;
cv_q.notify_all(); // notify all that it's time to stop
}
int main() {
auto th = std::thread(UpdateFunc);
// simulate some events coming with some time apart
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(1.2);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(3.4);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(5.6);
std::this_thread::sleep_for(std::chrono::seconds(1));
StopQ();
th.join();
}
If you really want to process everything that is currently in the queue, then extract everything first and then release the lock, then work with what you extracted. Extracting everything from the queue is done quickly by just swapping in another std::queue. Example:
#include <atomic>
std::atomic<bool> stop_q{}; // needs to be atomic in this version
void UpdateFunc() {
while(not stop_q) {
std::queue<double> work; // this will be used to swap with some_q
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
std::swap(work, some_q); // extract everything from the queue at once
} // lock released so others can use the queue
// do time consuming work here
while(not stop_q && not work.empty()) {
auto val = std::move(work.front());
work.pop();
std::cout << "got " << val << '\n';
}
}
}
You can use it like you currently are assuming proper use of the lock across all threads. However, you may run into some frustrations about how you want to call updateFunc().
Are you going to be using a callback?
Are you going to be using an ISR?
Are you going to be polling?
If you use a 3rd party lib it often trivializes thread synchronization and queues
For example, if you are using a CMSIS RTOS(v2). It is a fairly straight forward process to get multiple threads to pass information between each other. You could have multiple producers, and a single consumer.
The single consumer can wait in a forever loop where it waits to receive a message before performing its work
when timeout is set to osWaitForever the function will wait for an
infinite time until the message is retrieved (i.e. wait semantics).
// Two producers
osMessageQueuePut(X,Y,Z,timeout=0)
osMessageQueuePut(X,Y,Z,timeout=0)
// One consumer which will run only once something enters the queue
osMessageQueueGet(X,Y,Z,osWaitForever)
tldr; You are safe to proceed, but using a library will likely make your synchronization problems easier.

Waking up multiple threads to work once per condition

I have a situation where one thread needs to occasionally wake up a number of worker threads and each worker thread needs to do it's work (only) once and then go back to sleep to wait for the next notification. I'm using a condition_variable to wake everything up, but the problem I'm having is the "only once" part. Assume that each thread is heavy to create, so I don't want to just be creating and joining them each time.
// g++ -Wall -o threadtest -pthread threadtest.cpp
#include <iostream>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <chrono>
std::mutex condMutex;
std::condition_variable condVar;
bool dataReady = false;
void state_change_worker(int id)
{
while (1)
{
{
std::unique_lock<std::mutex> lck(condMutex);
condVar.wait(lck, [] { return dataReady; });
// Do work only once.
std::cout << "thread " << id << " working\n";
}
}
}
int main()
{
// Create some worker threads.
std::thread threads[5];
for (int i = 0; i < 5; ++i)
threads[i] = std::thread(state_change_worker, i);
while (1)
{
// Signal to the worker threads to work.
{
std::cout << "Notifying threads.\n";
std::unique_lock<std::mutex> lck(condMutex);
dataReady = true;
condVar.notify_all();
}
// It would be really great if I could wait() on all of the
// worker threads being done with their work here, but it's
// not strictly necessary.
std::cout << "Sleep for a bit.\n";
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
}
Update: Here is a version implementing an almost-but-not-quite working version of a squad lock. The problem is that I can't guarantee that each thread will have a chance to wake up and derement count in waitForLeader() before one runs through again.
// g++ -Wall -o threadtest -pthread threadtest.cpp
#include <iostream>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <chrono>
class SquadLock
{
public:
void waitForLeader()
{
{
// Increment count to show that we are waiting in queue.
// Also, if we are the thread that reached the target, signal
// to the leader that everything is ready.
std::unique_lock<std::mutex> count_lock(count_mutex_);
std::unique_lock<std::mutex> target_lock(target_mutex_);
if (++count_ >= target_)
count_cond_.notify_one();
}
// Wait for leader to signal done.
std::unique_lock<std::mutex> lck(done_mutex_);
done_cond_.wait(lck, [&] { return done_; });
{
// Decrement count to show that we are no longer waiting.
// If we are the last thread set done to false.
std::unique_lock<std::mutex> lck(count_mutex_);
if (--count_ == 0)
{
done_ = false;
}
}
}
void waitForHerd()
{
std::unique_lock<std::mutex> lck(count_mutex_);
count_cond_.wait(lck, [&] { return count_ >= target_; });
}
void leaderDone()
{
std::unique_lock<std::mutex> lck(done_mutex_);
done_ = true;
done_cond_.notify_all();
}
void incrementTarget()
{
std::unique_lock<std::mutex> lck(target_mutex_);
++target_;
}
void decrementTarget()
{
std::unique_lock<std::mutex> lck(target_mutex_);
--target_;
}
void setTarget(int target)
{
std::unique_lock<std::mutex> lck(target_mutex_);
target_ = target;
}
private:
// Condition variable to indicate that the leader is done.
std::mutex done_mutex_;
std::condition_variable done_cond_;
bool done_ = false;
// Count of currently waiting tasks.
std::mutex count_mutex_;
std::condition_variable count_cond_;
int count_ = 0;
// Target number of tasks ready for the leader.
std::mutex target_mutex_;
int target_ = 0;
};
SquadLock squad_lock;
std::mutex print_mutex;
void state_change_worker(int id)
{
while (1)
{
// Wait for the leader to signal that we are ready to work.
squad_lock.waitForLeader();
{
// Adding just a bit of sleep here makes it so that every thread wakes up, but that isn't the right way.
// std::this_thread::sleep_for(std::chrono::milliseconds(100));
std::unique_lock<std::mutex> lck(print_mutex);
std::cout << "thread " << id << " working\n";
}
}
}
int main()
{
// Create some worker threads and increment target for each one
// since we want to wait until all threads are finished.
std::thread threads[5];
for (int i = 0; i < 5; ++i)
{
squad_lock.incrementTarget();
threads[i] = std::thread(state_change_worker, i);
}
while (1)
{
// Signal to the worker threads to work.
std::cout << "Starting threads.\n";
squad_lock.leaderDone();
// Wait for the worked threads to be done.
squad_lock.waitForHerd();
// Wait until next time, processing results.
std::cout << "Tasks done, waiting for next time.\n";
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
}
Following is an excerpt from a blog I created concerning concurrent design patterns. The patterns are expressed using the Ada language, but the concepts are translatable to C++.
Summary
Many applications are constructed of groups of cooperating threads of execution. Historically this has frequently been accomplished by creating a group of cooperating processes. Those processes would cooperate by sharing data. At first, only files were used to share data. File sharing presents some interesting problems. If one process is writing to the file while another process reads from the file you will frequently encounter data corruption because the reading process may attempt to read data before the writing process has completely written the information. The solution used for this was to create file locks, so that only one process at a time could open the file. Unix introduced the concept of a Pipe, which is effectively a queue of data. One process can write to a pipe while another reads from the pipe. The operating system treats data in a pipe as a series of bytes. It does not let the reading process access a particular byte of data until the writing process has completed its operation on the data.
Various operating systems also introduced other mechanisms allowing processes to share data. Examples include message queues, sockets, and shared memory. There were also special features to help programmers control access to data, such as semaphores. When operating systems introduced the ability for a single process to operate multiple threads of execution, also known as lightweight threads, or just threads, they also had to provide corresponding locking mechanisms for shared data.
Experience shows that, while the variety of possible designs for shared data is quite large, there are a few very common design patterns that frequently emerge. Specifically, there are a few variations on a lock or semaphore, as well as a few variations on data buffering. This paper explores the locking and buffering design patterns for threads in the context of a monitor. Although monitors can be implemented in many languages, all examples in this paper are presented using Ada protected types. Ada protected types are a very thorough implementation of a monitor.
Monitors
There are several theoretical approaches to creating and controlling shared memory. One of the most flexible and robust is the monitor as first described by C.A.R. Hoare. A monitor is a data object with three different kinds of operations.
Procedures are used to change the state or values contained by the monitor. When a thread calls a monitor procedure that thread must have exclusive access to the monitor to prevent other threads from encountering corrupted or partially written data.
Entries, like procedures, are used to change the state or values contained by the monitor, but an entry also specifies a boundary condition. The entry may only be executed when the boundary condition is true. Threads that call an entry when the boundary condition is false are placed in a queue until the boundary condition becomes true. Entries are used, for example, to allow a thread to read from a shared buffer. The reading thread is not allowed to read the data until the buffer actually contains some data. The boundary condition would be that the buffer must not be empty. Entries, like procedures, must have exclusive access to the monitor's data.
Functions are used to report the state of a monitor. Since functions only report state, and do not change state, they do not need exclusive access to the monitor's data. Many threads may simultaneously access the same monitor through functions without danger of data corruption.
The concept of a monitor is extremely powerful. It can also be extremely efficient. Monitors provide all the capabilities needed to design efficient and robust shared data structures for threaded systems.
Although monitors are powerful, they do have some limitations. The operations performed on a monitor should be very fast, with no chance of making a thread block. If those operations should block, the monitor will become a road block instead of a communication tool. All the threads awaiting access to the monitor will be blocked as long as the monitor operation is blocked. For this reason, some people choose not to use monitors. There are design patterns for monitors that can actually be used to work around these problems. Those design patterns are grouped together as locking patterns.
Squad Locks
A squad lock allows a special task (the squad leader) to monitor the progress of a herd or group of worker tasks. When all (or a sufficient number) of the worker tasks are done with some aspect of their work, and the leader is ready to proceed, the entire set of tasks is allowed to pass a barrier and continue with the next sequence of their activities. The purpose is to allow tasks to execute asynchronously, yet coordinate their progress through a complex set of activities.
package Barriers is
protected type Barrier(Trigger : Positive) is
entry Wait_For_Leader;
entry Wait_For_Herd;
procedure Leader_Done;
private
Done : Boolean := False;
end Barrier;
protected type Autobarrier(Trigger : Positive) is
entry Wait_For_Leader;
entry Wait_For_Herd;
private
Done : Boolean := False;
end Autobarrier;
end Barriers;
This package shows two kinds of squad lock. The Barrier protected type demonstrates a basic squad lock. The herd calls Wait_For_Leader and the leader calls Wait_For_Herd and then Leader_Done. The Autobarrier demonstrates a simpler interface. The herd calls Wait_For_Leader and the leader calls Wait_For_Herd. The Trigger parameter is used when creating an instance of either type of barrier. It sets the minimum number of herd tasks the leader must wait for before it can proceed.
package body Barriers is
protected body Barrier is
entry Wait_For_Herd when Wait_For_Leader'Count >= Trigger is
begin
null;
end Wait_For_Herd;
entry Wait_For_Leader when Done is
begin
if Wait_For_Leader'Count = 0 then
Done := False;
end if;
end Wait_For_Leader;
procedure Leader_Done is
begin
Done := True;
end Leader_Done;
end Barrier;
protected body Autobarrier is
entry Wait_For_Herd when Wait_For_Leader'Count >= Trigger is
begin
Done := True;
end Wait_For_Herd;
entry Wait_For_Leader when Done is
begin
if Wait_For_Leader'Count = 0 then
Done := False;
end if;
end Wait_For_Leader;
end Autobarrier;
end Barriers;

How to maintain certain frame rate in different threads

I have two different computational tasks that have to execute at certain frequencies. One has to be performed every 1ms and the other every 13.3ms. The tasks share some data.
I am having a hard time how to schedule these tasks and how to share data between them. One way that I thought might work is to create two threads, one for each task.
The first task is relatively simpler and can be handled in 1ms itself. But, when the second task (that is relatively more time-consuming) is going to launch, it will make a copy of the data that was just used by task 1, and continue to work on them.
Do you think this would work? How can it be done in c++?
There are multiple ways to do that in C++.
One simple way is to have 2 threads, as you described. Each thread does its action and then sleeps till the next period start. A working example:
#include <functional>
#include <iostream>
#include <chrono>
#include <thread>
#include <atomic>
#include <mutex>
std::mutex mutex;
std::atomic<bool> stop = {false};
unsigned last_result = 0; // Whatever thread_1ms produces.
void thread_1ms_action() {
// Do the work.
// Update the last result.
{
std::unique_lock<std::mutex> lock(mutex);
++last_result;
}
}
void thread_1333us_action() {
// Copy thread_1ms result.
unsigned last_result_copy;
{
std::unique_lock<std::mutex> lock(mutex);
last_result_copy = last_result;
}
// Do the work.
std::cout << last_result_copy << '\n';
}
void periodic_action_thread(std::chrono::microseconds period, std::function<void()> const& action) {
auto const start = std::chrono::steady_clock::now();
while(!stop.load(std::memory_order_relaxed)) {
// Do the work.
action();
// Wait till the next period start.
auto now = std::chrono::steady_clock::now();
auto iterations = (now - start) / period;
auto next_start = start + (iterations + 1) * period;
std::this_thread::sleep_until(next_start);
}
}
int main() {
std::thread a(periodic_action_thread, std::chrono::milliseconds(1), thread_1ms_action);
std::thread b(periodic_action_thread, std::chrono::microseconds(13333), thread_1333us_action);
std::this_thread::sleep_for(std::chrono::seconds(1));
stop = true;
a.join();
b.join();
}
If executing an action takes longer than one period to execute, then it sleeps till the next period start (skips one or more periods). I.e. each Nth action happens exactly at start_time + N * period, so that there is no time drift regardless of how long it takes to perform the action.
All access to the shared data is protected by the mutex.
So I'm thinking that task1 needs to make the copy, because it knows when it is safe to do so. Here is one simplistic model:
Shared:
atomic<Result*> latestResult = {0};
Task1:
Perform calculation
Result* pNewResult = new ResultBuffer
Copy result to pNewResult
latestResult.swap(pNewResult)
if (pNewResult)
delete pNewResult; // Task2 didn't take it!
Task2:
Result* pNewResult;
latestResult.swap(pNewResult);
process result
delete pNewResult;
In this model task1 and task2 only ever naggle when swapping a simple atomic pointer, which is quite painless.
Note that this makes many assumptions about your calculation. Could your task1 usefully calculate the result straight into the buffer, for example? Also note that at the start Task2 may find the pointer is still null.
Also it inefficiently new()s the buffers. You need 3 buffers to ensure there is never any significant naggling between the tasks, but you could just manage three buffer pointers under mutexes, such that Task 1 will have a set of data ready, and be writing another set of data, while task 2 is reading from a third set.
Note that even if you have task 2 copy the buffer, Task 1 still needs 2 buffers to avoid stalls.
You can use C++ threads and thread facilities like class thread and timer classes like steady_clock like it has been described in previous answer but if this solution works strongly depends on the platform your code is running on.
1ms and 13.3ms are pretty short time intervals and if your code is running on non-real time OS like Windows or non-RTOS Linux, there is no guarantee that OS scheduler will wake up your threads at exact times.
C++ 11 has the class high_resolution_clock that should use high resolution timer if your platform supports one but it still depends on the implementation of this class. And the bigger problem than the timer is using C++ wait functions. Neither C++ sleep_until nor sleep_for guarantees that they will wake up your thread at specified times. Here is the quote from C++ documentation.
sleep_for - blocks the execution of the current thread for at least the specified sleep_duration. sleep_for
Fortunately, most OS have some special facilities like Windows Multimedia Timers you can use if your threads are not woken up at expected times.
Here are more details. Precise thread sleep needed. Max 1ms error

Check if a thread is finished to send another param to it

I wanna to check if a thread job has been finished to call it again and send another parameter to that. The code is sth like this:
void SendMassage(double Speed)
{
Sleep(200);
cout << "Speed:" << Speed << endl;
}
int main() {
int Speed_1 = 0;
thread f(SendMassage, Speed_1);
for (int i = 0; i < 50; i++)
{
Sleep(20);
if (?)
{
another call of thread // If last thread done then call it again, otherwise not.
}
Speed_1++;
}
}
How should I do it?
Use, e.g., an atomic flag to indicate that the thread has finished:
std::atomic<bool> finished_flag{false};
void SendMassage(double Speed) {
Sleep(200);
cout << "Speed:" << Speed << endl;
finished_flag = true;
}
int main() {
int Speed_1 = 0;
thread f(SendMassage, Speed_1);
while (Speed_1 < 50) {
Sleep(20);
if (finished_flag) {
f.join();
finished_flag = false;
f = std::thread(SendMassage, Speed_1);
}
Speed_1++;
}
f.join();
}
Working example: https://wandbox.org/permlink/BrEMHFvlInshBy5V
Note that I assumed that, according to your code, you don't want to block when checking whether the thread f has finished. Otherwise, simply call f.join().
If you want to wait untill a thread has finished it's job without using Sleep, you neeed to call it's join method, like so
thread t(SendMassage, Speed_1);
t.join();
//Code here will start executing after returning from join
You can read more about it here http://en.cppreference.com/w/cpp/thread/thread/join
About sending another parameter, I think the best way would be splitting it into another function that you would call after this thread has been joined, if you need some information about something that's known only inside the function, you could create a class that would store that information in it's fields, and use it in the function you're threading.
The possibly most simple way of doing so is just joining the thread. Nothing clever, but...
OK, but why would you then want to have another thread at all if your main thread passes all its time sleeping anyway, so you quite sure are looking for something cleverer.
I personally like the principle of queues; you could use e. g. a std::deque for:
Your producer thread places in some values, your consumer thread just takes them out. Of course, you need to protect your queue via a std::mutex (or by other appropriate means) against race conditions...
The consumer would be running in an endless loop, processing the queue, if entries are available, or sleep if this is not the case. Have a look at this response for how to do the waiting...
There is the danger, though, that your queue runs full, so you might define some threshold when you stop or at least slow down producing new values, if you discover your producer being too fast. The queue has another advantage, though: If your producer is too fast, you might have more than one consumer, all serving the same queue (depending on your needs, putting together the results might need some extra efforts to keep ordering of correct).
Admitted, that's quite some work to do, it might be worth the effort, it might be overkill. If simpler approaches fit your needs already, Daniel's answer is fine, too...

Reusing thread in loop c++

I need to parallelize some tasks in a C++ program and am completely new to parallel programming. I've made some progress through internet searches so far, but am a bit stuck now. I'd like to reuse some threads in a loop, but clearly don't know how to do what I'm trying for.
I am acquiring data from two ADC cards on the computer (acquired in parallel), then I need to perform some operations on the collected data (processed in parallel) while collecting the next batch of data. Here is some pseudocode to illustrate
//Acquire some data, wait for all the data to be acquired before proceeding
std::thread acq1(AcquireData, boardHandle1, memoryAddress1a);
std::thread acq2(AcquireData, boardHandle2, memoryAddress2a);
acq1.join();
acq2.join();
while(user doesn't interrupt)
{
//Process first batch of data while acquiring new data
std::thread proc1(ProcessData,memoryAddress1a);
std::thread proc2(ProcessData,memoryAddress2a);
acq1(AcquireData, boardHandle1, memoryAddress1b);
acq2(AcquireData, boardHandle2, memoryAddress2b);
acq1.join();
acq2.join();
proc1.join();
proc2.join();
/*Proceed in this manner, alternating which memory address
is written to and being processed until the user interrupts the program.*/
}
That's the main gist of it. The next run of the loop would write to the "a" memory addresses while processing the "b" data and continue to alternate (I can get the code to do that, just took it out to prevent cluttering up the problem).
Anyway, the problem (as I'm sure some people can already tell) is that the second time I try to use acq1 and acq2, the compiler (VS2012) says "IntelliSense: call of an object of a class type without appropriate operator() or conversion functions to pointer-to-function type". Likewise, if I put std::thread in front of acq1 and acq2 again, it says " error C2374: 'acq1' : redefinition; multiple initialization".
So the question is, can I reassign threads to a new task when they have completed their previous task? I always wait for the previous use of the thread to end before calling it again, but I don't know how to reassign the thread, and since it's in a loop, I can't make a new thread each time (or if I could, that seems wasteful and unnecessary, but I could be mistaken).
Thanks in advance
The easiest way is to use a waitable queue of std::function objects. Like this:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <functional>
#include <chrono>
class ThreadPool
{
public:
ThreadPool (int threads) : shutdown_ (false)
{
// Create the specified number of threads
threads_.reserve (threads);
for (int i = 0; i < threads; ++i)
threads_.emplace_back (std::bind (&ThreadPool::threadEntry, this, i));
}
~ThreadPool ()
{
{
// Unblock any threads and tell them to stop
std::unique_lock <std::mutex> l (lock_);
shutdown_ = true;
condVar_.notify_all();
}
// Wait for all threads to stop
std::cerr << "Joining threads" << std::endl;
for (auto& thread : threads_)
thread.join();
}
void doJob (std::function <void (void)> func)
{
// Place a job on the queu and unblock a thread
std::unique_lock <std::mutex> l (lock_);
jobs_.emplace (std::move (func));
condVar_.notify_one();
}
protected:
void threadEntry (int i)
{
std::function <void (void)> job;
while (1)
{
{
std::unique_lock <std::mutex> l (lock_);
while (! shutdown_ && jobs_.empty())
condVar_.wait (l);
if (jobs_.empty ())
{
// No jobs to do and we are shutting down
std::cerr << "Thread " << i << " terminates" << std::endl;
return;
}
std::cerr << "Thread " << i << " does a job" << std::endl;
job = std::move (jobs_.front ());
jobs_.pop();
}
// Do the job without holding any locks
job ();
}
}
std::mutex lock_;
std::condition_variable condVar_;
bool shutdown_;
std::queue <std::function <void (void)>> jobs_;
std::vector <std::thread> threads_;
};
void silly (int n)
{
// A silly job for demonstration purposes
std::cerr << "Sleeping for " << n << " seconds" << std::endl;
std::this_thread::sleep_for (std::chrono::seconds (n));
}
int main()
{
// Create two threads
ThreadPool p (2);
// Assign them 4 jobs
p.doJob (std::bind (silly, 1));
p.doJob (std::bind (silly, 2));
p.doJob (std::bind (silly, 3));
p.doJob (std::bind (silly, 4));
}
The std::thread class is designed to execute exactly one task (the one you give it in the constructor) and then end. If you want to do more work, you'll need a new thread. As of C++11, that's all we have. Thread pools didn't make it into the standard. (I'm uncertain what C++14 has to say about them.)
Fortunately, you can easily implement the required logic yourself. Here is the large-scale picture:
Start n worker threads that all do the following:
Repeat while there is more work to do:
Grab the next task t (possibly waiting until one becomes ready).
Process t.
Keep inserting new tasks in the processing queue.
Tell the worker threads that there is nothing more to do.
Wait for the worker threads to finish.
The most difficult part here (which is still fairly easy) is properly designing the work queue. Usually, a synchronized linked list (from the STL) will do for this. Synchronized means that any thread that wishes to manipulate the queue must only do so after it has acquired a std::mutex so to avoid race conditions. If a worker thread finds the list empty, it has to wait until there is some work again. You can use a std::condition_variable for this. Each time a new task is inserted into the queue, the inserting thread notifies a thread that waits on the condition variable and will therefore stop blocking and eventually start processing the new task.
The second not-so-trivial part is how to signal to the worker threads that there is no more work to do. Clearly, you can set some global flag but if a worker is blocked waiting at the queue, it won't realize any time soon. One solution could be to notify_all() threads and have them check the flag each time they are notified. Another option is to insert some distinct “toxic” item into the queue. If a worker encounters this item, it quits itself.
Representing a queue of tasks is straight-forward using your self-defined task objects or simply lambdas.
All of the above are C++11 features. If you are stuck with an earlier version, you'll need to resort to third-party libraries that provide multi-threading for your particular platform.
While none of this is rocket science, it is still easy to get wrong the first time. And unfortunately, concurrency-related bugs are among the most difficult to debug. Starting by spending a few hours reading through the relevant sections of a good book or working through a tutorial can quickly pay off.
This
std::thread acq1(...)
is the call of an constructor. constructing a new object called acq1
This
acq1(...)
is the application of the () operator on the existing object aqc1. If there isn't such a operator defined for std::thread the compiler complains.
As far as I know you may not reused std::threads. You construct and start them. Join with them and throw them away,
Well, it depends if you consider moving a reassigning or not. You can move a thread but not make a copy of it.
Below code will create new pair of threads each iteration and move them in place of old threads. I imagine this should work, because new thread objects will be temporaries.
while(user doesn't interrupt)
{
//Process first batch of data while acquiring new data
std::thread proc1(ProcessData,memoryAddress1a);
std::thread proc2(ProcessData,memoryAddress2a);
acq1 = std::thread(AcquireData, boardHandle1, memoryAddress1b);
acq2 = std::thread(AcquireData, boardHandle2, memoryAddress2b);
acq1.join();
acq2.join();
proc1.join();
proc2.join();
/*Proceed in this manner, alternating which memory address
is written to and being processed until the user interrupts the program.*/
}
What's going on is, the object actually does not end it's lifetime at the end of the iteration, because it is declared in the outer scope in regard to the loop. But a new object gets created each time and move takes place. I don't see what can be spared (I might be stupid), so I imagine this it's exactly the same as declaring acqs inside the loop and simply reusing the symbol. All in all ... yea, it's about how you classify a create temporary and move.
Also, this clearly starts a new thread each loop (of course ending the previously assigned thread), it doesn't make a thread wait for new data and magically feed it to the processing pipe. You would need to implement it a differently like. E.g: Worker threads pool and communication over queues.
References: operator=, (ctor).
I think the errors you get are self-explanatory, so I'll skip explaining them.
I think you need a much more simpler answer for running a set of threads more than once, this is the best solution:
do{
std::vector<std::thread> thread_vector;
for (int i=0;i<nworkers;i++)
{
thread_vector.push_back(std::thread(yourFunction,Parameter1,Parameter2, ...));
}
for(std::thread& it: thread_vector)
{
it.join();
}
q++;
} while(q<NTIMES);
You also could make your own Thread class and call its run method like:
class MyThread
{
public:
void run(std::function<void()> func) {
thread_ = std::thread(func);
}
void join() {
if(thread_.joinable())
thread_.join();
}
private:
std::thread thread_;
};
// Application code...
MyThread myThread;
myThread.run(AcquireData);