Thread starvation in a multiprocessor system - C++

I am implementing the producer-consumer problem. The usual implementation uses std::condition_variable together with std::mutex so that threads can notify each other to wake up: with these two primitives the producer can notify the consumer and vice versa. This is generally done to avoid thread starvation. But I am wondering whether this issue really exists on a multiprocessor system.
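For reference, the classic condition-variable hand-off I mean looks roughly like this (a simplified sketch with illustrative names, not my real code):
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::queue<int> buffer;
std::mutex mtx;
std::condition_variable cv;
bool done = false;

void producer()
{
    for (int i = 0; i < 100; ++i)
    {
        {
            std::lock_guard<std::mutex> lock(mtx);
            buffer.push(i);
        }
        cv.notify_one(); // wake the consumer whenever data is pending
    }
    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;
    }
    cv.notify_one();
}

void consumer()
{
    std::unique_lock<std::mutex> lock(mtx);
    while (!done || !buffer.empty())
    {
        cv.wait(lock, [] { return done || !buffer.empty(); }); // sleep instead of spinning
        while (!buffer.empty())
        {
            int value = buffer.front();
            buffer.pop();
            (void)value; // process the value here
        }
    }
}

int main()
{
    std::thread p(producer), c(consumer);
    p.join();
    c.join();
}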
This question comes up because I am implementing this with a lock-free ring buffer, and I don't want to use std::mutex and std::condition_variable on the producer and consumer sides, since this queue cannot have a data race between enqueue() and dequeue(). Below is the code.
template<typename MessageType>
class MessageProcessor
{
public:
~MessageProcessor()
{
stop();
if (workerThread_.joinable())
workerThread_.join();
}
bool postMessage(MessageType const &msg)
{
return queue_.enqueue(msg);
}
void registerHandler(std::function<void(MessageType)> handler, int32_t coreId=-1, std::string_view const &name="")
{
std::call_once(init_, [&](){
handler_ = std::move(handler);
workerThread_ = std::thread{&MessageProcessor::process, this};
if (!setAffinity(coreId, workerThread_))
LOG("Msg Processing thread couldn't be pinned to core: " << coreId);
else
LOG("Msg Processing thread pinned to core: " << coreId);
if (! name.empty())
pthread_setname_np(workerThread_.native_handle(), std::string{name}.c_str()); // std::string_view::data() is not guaranteed to be null-terminated
});
}
void stop()
{
stop_ = true;
}
private:
void process() //This is a consumer, runs in a separate thread
{
while(!stop_.load(std::memory_order_acquire))
{
MessageType msg;
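// NOTE: dequeue() is non-blocking, so when the queue is empty this loop busy-spins instead of sleeping.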
if (! queue_.dequeue(msg))
continue;
try
{
handler_(msg);
}
catch(std::exception const &ex)
{
LOG("Error while processing data: " << msg << ", Exception: " << ex.what());
}
catch(...)
{
LOG("UNKOWN Error while processing data: " << msg);
}
}
}
bool setAffinity(int32_t const coreId, std::thread &thread)
{
int cpuCoreCount = sysconf(_SC_NPROCESSORS_ONLN); // number of online cores (requires <unistd.h>)
if (coreId < 0 || coreId >= cpuCoreCount)
return false;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(coreId, &cpuset);
pthread_t currentThread = thread.native_handle();
return pthread_setaffinity_np(currentThread, sizeof(cpu_set_t), &cpuset) == 0;
}
std::thread workerThread_;
std::atomic<bool> stop_{false};
MPMC_Circular_Queue<MessageType, 1024> queue_;
std::function<void(MessageType)> handler_{};
std::once_flag init_;
};
int main()
{
pthread_setname_np(pthread_self(), "MAIN");
MessageProcessor<int> processor;
processor.registerHandler([](int i){
LOG("Received value: " << i);
}, 2, "PROCESSOR");
std::thread t1([&]() { //Producer thread1
for (int i = 1; i <= 100000; i += 2)
{
LOG("Submitting value: " << i);
processor.postMessage(i);
}
});
pthread_setname_np(t1.native_handle(), "ODD ");
std::thread t2([&]() { //Producer thread2
for (int i = 2; i <= 100000; i += 2)
{
LOG("Submitting value: " << i);
processor.postMessage(i);
}
});
pthread_setname_np(t2.native_handle(), "EVEN");
for (int i = 1; i <= 100000; ++i)
{
LOG("Runing main thread: " << i);
}
t1.join();
t2.join();
return 0;
}
Can this code raise a thread starvation issue on a modern multiprocessor system? MPMC_Circular_Queue is a lock-free bounded queue.

Related

Condition variable's "wait" function causing unexpected behaviour when predicate is provided

As an educational exercise I'm implementing a thread pool using condition variables. A controller thread creates a pool of threads that wait on a signal (an atomic variable being set to a value above zero). When signaled the threads wake, perform their work, and when the last thread is done it signals the main thread to awaken. The controller thread blocks until the last thread is complete. The pool is then available for subsequent re-use.
Every now and then I was getting a timeout on the controller thread waiting for the worker to signal completion (likely because of a race condition when decrementing the active work counter), so in an attempt to solidify the pool I replaced the "wait(lck)" form of the condition variable's wait method with "wait(lck, predicate)". Since doing this, the behaviour of the thread pool is such that it seems to permit decrementing of the active work counter below 0 (which is the condition for reawakening the controller thread) - I have a race condition. I've read countless articles on atomic variables, synchronisation, memory ordering, spurious and lost wakeups on stackoverflow and various other sites, have incorporated what I've learnt to the best of my ability, and still cannot for the life of me work out why the way I've coded the predicated wait just does not work. The counter should only ever be as high as the number of threads in the pool (say, 8) and as low as zero. I've started losing faith in myself - it just shouldn't be this hard to do something fundamentally simple. There is clearly something else I need to learn here :)
Considering of course that there was a race condition I ensured that the two variables that drive the awakening and termination of the pool are both atomic, and that both are only ever changed while protected with a unique_lock. Specifically, I made sure that when a request to the pool was launched, the lock was acquired, the active thread counter was changed from 0 to 8, unlocked the mutex, and then "notified_all". The controller thread would only then be awakened with the active thread count at zero, once the last worker thread decremented it that far and "notified_one".
In the worker thread, the condition variable would wait and wake only when the active thread count is greater than zero, unlock the mutex, in parallel proceed to execute the work preassigned to the processor when the pool was created, re-acquire the mutex, and atomically decrement the active thread count. It would then, while still supposedly protected by the lock, test if it was the last thread still active, and if so, again unlock the mutex and "notify_one" to awaken the controller.
The problem is that the active thread counter repeatedly goes below zero after only 1 or 2 iterations. If I test the active thread count at the start of a new workload, I can find it down around -6; it is as if the pool was allowed to reawaken the controller thread before the work was completed.
Given that the thread counter and terminate flag are both atomic variables and are only ever modified while under the protection of the same mutex, I am using sequential memory ordering for all updates, I just cannot see how this is happening and I'm lost.
#include <stdafx.h>
#include <Windows.h>
#include <iostream>
#include <thread>
using std::thread;
#include <mutex>
using std::mutex;
using std::unique_lock;
#include <condition_variable>
using std::condition_variable;
#include <atomic>
using std::atomic;
#include <chrono>
#include <vector>
using std::vector;
class IWorkerThreadProcessor
{
public:
virtual void Process(int) = 0;
};
class MyProcessor : public IWorkerThreadProcessor
{
int index_ = 0;
public:
MyProcessor(int index)
{
index_ = index;
}
void Process(int threadindex)
{
for (int i = 0; i < 5000000; i++);
std::cout << '(' << index_ << ':' << threadindex << ") ";
}
};
#define MsgBox(x) do{ MessageBox(NULL, x, L"", MB_OK ); }while(false)
class ThreadPool
{
private:
atomic<unsigned int> invocations_ = 0;
//This goes negative when using the wait_for with predicate
atomic<int> threadsActive_ = 0;
atomic<bool> terminateFlag_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtxWorker_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtxWorker_);
threadsActive_ = 0;
terminateFlag_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
//Only used to see how many invocations I was getting through before experiencing the issue - sadly it's only one or two
invocations_++;
try
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use the predicated wait this break will fire !!!!!!
if (threadsActive_.load() != 0)
__debugbreak();
threadsActive_.store(poolSize_);
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
{
//As you can tell this has taken me through a journey trying to characterise the issue...
if (threadsActive_ > 0)
MsgBox(L"Thread pool timed out with still active threads");
else if (threadsActive_ == 0)
MsgBox(L"Thread pool timed out with zero active threads");
else
MsgBox(L"Thread pool timed out with negative active threads");
}
}
catch (std::exception const &e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use this predicated wait I see the failure !!!!!!
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
//!!!!!!!! Does not cause the failure but obviously will not handle
//spurious wake-ups !!!!!!!!!!
//cvSignalWork_.wait(lck);
if (terminateFlag_.load())
return;
//Unlock to parallelise the work load
lck.unlock();
processor->Process(threadIndex);
//Re-lock to decrement the work count
lck.lock();
//This returns the value before the subtraction so theoretically if the previous value was 1 then we're the last thread going and we can now signal the controller thread to wake. This is the only place that the decrement happens so I don't know how it could possibly go negative
if (threadsActive_.fetch_sub(1, std::memory_order_seq_cst) == 1)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtxWorker_);
if (!terminateFlag_)
{
terminateFlag_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception const &e)
{
__debugbreak();
}
}
};
int main()
{
std::vector<IWorkerThreadProcessor*> processors;
for (int i = 0; i < 8; i++)
processors.push_back(new MyProcessor(i));
std::cout << "Instantiating thread pool\n";
auto pool = new ThreadPool;
std::cout << "Initialisting thread pool\n";
pool->Init(processors);
std::cout << "Thread pool initialised\n";
for (int i = 0; i < 200; i++)
{
std::cout << "Workload " << i << "\n";
pool->ProcessWorkload(std::chrono::milliseconds(500));
std::cout << "Workload " << i << " complete." << "\n";
}
for (auto a : processors)
delete a;
delete pool;
return 0;
}
class ThreadPool
{
private:
atomic<unsigned int> invokations_ = 0;
std::atomic<unsigned int> awakenings_ = 0;
std::atomic<unsigned int> startedWorkloads_ = 0;
std::atomic<unsigned int> completedWorkloads_ = 0;
atomic<bool> terminate_ = false;
atomic<bool> stillFiring_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtx_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtx_);
//threadsActive_ = 0;
terminate_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
invokations_ = 0;
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
try
{
unique_lock<mutex> lck(mtx_);
invokations_++;
if (startedWorkloads_ != 0)
__debugbreak();
if (completedWorkloads_ != 0)
__debugbreak();
if (awakenings_ != 0)
__debugbreak();
if (stillFiring_)
__debugbreak();
stillFiring_ = true;
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
//[this] { return this->threadsActive_.load() == 0; })
[this] { return completedWorkloads_ == poolSize_ && !stillFiring_; })
)
{
if (completedWorkloads_ < poolSize_)
{
if (startedWorkloads_ < poolSize_)
MsgBox(L"Thread pool timed out with some threads unstarted");
else if (startedWorkloads_ == poolSize_)
MsgBox(L"Thread pool timed out with all threads started but not all completed");
}
else
__debugbreak();
}
if (completedWorkloads_ != poolSize_)
__debugbreak();
if (awakenings_ != poolSize_)
__debugbreak();
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
}
catch (std::exception const &e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtx_);
cvSignalWork_.wait(
lck,
[this] {
return
(stillFiring_ && (startedWorkloads_ < poolSize_)) ||
terminate_;
});
awakenings_++;
if (startedWorkloads_ == 0 && terminate_)
return;
if (stillFiring_ && startedWorkloads_ < poolSize_) //guard against spurious wakeup
{
startedWorkloads_++;
if (startedWorkloads_ == poolSize_)
stillFiring_ = false;
lck.unlock();
processor->Process(threadIndex);
lck.lock();
completedWorkloads_++;
if (completedWorkloads_ == poolSize_)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtx_);
if (!terminate_) //Don't attempt to double-terminate
{
terminate_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception const &e)
{
__debugbreak();
}
}
};
I'm not certain if the following helps solve the problem, but I think the error is as shown below:
This
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
should be replaced by
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[&] { return threadsActive_.load() == 0; })
)
It looks like the lambda is not accessing the instantiated member of the class. Here is some reference to back up my case: look at the Lambda Capture section of this page.
Edit:
Another place where you are using wait with a lambda:
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
Maybe modify all the lambdas and then see if it works?
The reason I'm looking at the lambda is because it seems like a case similar to a spurious wakeup. Hope it helps.

Killing a thread from another thread in C++

I am new to multithreading and I need your help.
Consider the following code:
vector <int> vec;
int j = 0;
void Fill()
{
for (int i = 0; i < 500; i++)
{
Sleep(500);
vec.push_back(i);
}
}
void Proces()
{
int count = 0;
int n=-1;
while (true) {
Sleep(250);
if (!vec.empty())
{
if (n != vec.back()) {
n = vec.back();
cout << n;
count++;
}
}
if (count == 101)break;
}
}
void getinput()
{
while (true) {
int k=0;
cin >> k;
//if the user enters an integer i want to kill all the threads
}
}
int main()
{
thread t1(Fill);
thread t2(Proces);
thread t3(getinput);
t1.join();
t2.join();
t3.join();
cout << "From main()";
}
The point is that I want to kill t1 (Fill) and t2 (Proces) from t3 (getinput). Is there any way to do it, and if there is, could you please post an example?
A common way to make a thread exit is to have an (atomic) flag that the thread checks to see if it should exit. Then externally you set this flag and the thread will notice it and exit naturally.
Something like
#include <thread>
#include <atomic>
#include <iostream>
#include <chrono>
// Flag telling the thread to continue or exit
std::atomic<bool> exit_thread_flag{false};
void thread_function()
{
// Loop while the flag is not set
while (!exit_thread_flag)
{
std::cout << "Hello from thread\n";
std::this_thread::sleep_for(std::chrono::seconds(1)); // Sleep for one second
}
}
int main()
{
std::thread t{thread_function}; // Create and start the thread
std::this_thread::sleep_for(std::chrono::seconds(5)); // Sleep for five seconds
exit_thread_flag = true; // Tell thread to exit
t.join(); // Wait for thread to exit
}
You have to define an exit condition and lock the container before accessing it. Of course, you could also build your own collection as a wrapper around an existing one, using proper locking to make it thread-safe (a sketch of such a wrapper follows the example below).
Here is an example of locking and an exit condition:
class Test
{
public:
Test()
: exitCondition(false)
{
work = std::thread([this]() { DoWork(); });
}
~Test()
{
if (work.joinable())
work.join();
}
void Add(int i)
{
mutex.lock();
things.push_back(i);
mutex.unlock();
}
void RequestStop(bool waitForExit = false)
{
exitCondition.exchange(true);
if (waitForExit)
work.join();
}
private:
void DoWork()
{
while (!exitCondition)
{
mutex.lock();
if (!things.empty())
{
for (auto itr = things.begin(); itr != things.end();)
itr = things.erase(itr);
}
mutex.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(1)); // sleep outside the lock so Add() is not blocked
}
}
private:
std::vector<int> things;
std::thread work;
std::atomic<bool> exitCondition;
std::mutex mutex;
};
int wmain(int, wchar_t**)
{
Test t;
t.Add(1);
t.Add(2);
t.Add(3);
t.RequestStop(true);
return 0;
}
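As mentioned above, you could also wrap the container itself so that every access locks internally. A minimal sketch of such a wrapper (illustrative names, not a drop-in class):
#include <mutex>
#include <vector>

// Minimal thread-safe wrapper around std::vector<int>: every operation takes the lock.
class SafeVector
{
public:
    void push_back(int value)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.push_back(value);
    }
    bool try_pop_back(int& value)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (data_.empty())
            return false;
        value = data_.back();
        data_.pop_back();
        return true;
    }
    bool empty() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.empty();
    }
private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};

int main()
{
    SafeVector v;
    v.push_back(42);
    int out = 0;
    return v.try_pop_back(out) ? 0 : 1;
}
In the question's Fill() and Proces(), such a wrapper would replace the bare global vector.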
std::atomic<bool> exit_flag{false};
...
void Fill() {
for (int i = 0; i < 500; i++) {
if (exit_flag) return;
...
}
}
void Proces() {
while (true) {
if (exit_flag) return;
...
}
}
void getinput() {
while (true) {
...
if ( /* the user enters an integer i want to kill all the threads */ )
exit_flag = true;
}
}

Do I need to implement blocking when using boost::asio?

My question is: if I run io_service::run() on multiple threads, do I need to implement blocking on these asynchronous functions?
example:
#include <boost/asio.hpp>
#include <boost/thread.hpp>
boost::asio::io_service io_service;
boost::asio::deadline_timer timer{ io_service };
int i = 0;
int j = 0;
void test_timer(boost::system::error_code ec)
{
// Do I need a lock here?
if (i++ == 10)
{
j = i * 10;
}
timer.expires_at(timer.expires_at() + boost::posix_time::milliseconds(500));
timer.async_wait(&test_timer);
}
void threadMain()
{
io_service.run();
}
int main()
{
boost::thread_group workers;
timer.async_wait(&test_timer);
for (int i = 0; i < 5; i++){
workers.create_thread(&threadMain);
}
io_service.run();
workers.join_all();
return 0;
}
The definition of async is that it is non-blocking.
If you mean to ask "do I have to synchronize access to shared objects from different threads" - that question is unrelated and the answer depends on the thread-safety documented for the object you are sharing.
For Asio, basically (rough summary) you need to synchronize concurrent access (concurrent as in: from multiple threads) to all types except boost::asio::io_context¹,².
Your Sample
Your sample uses multiple threads running the io service, meaning handlers run on any of those threads. This means that effectively you're sharing the globals and indeed they need protection.
However, because your application logic (the async call chain) dictates that only one operation is ever pending, and the next async operation on the shared timer object is always scheduled from within that chain, the access is logically all from a single thread (called an implicit strand). See Why do I need strand per connection when using boost::asio?
The simplest thing that would work:
Logical Strand
Live On Coliru
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <iostream>
boost::asio::io_service io_service;
boost::asio::deadline_timer timer { io_service };
struct state_t {
int i = 0;
int j = 0;
} state;
void test_timer(boost::system::error_code ec)
{
if (ec != boost::asio::error::operation_aborted) {
{
if (state.i++ == 10) {
state.j = state.i * 10;
if (state.j > 100)
return; // stop after 5 seconds
}
}
timer.expires_at(timer.expires_at() + boost::posix_time::milliseconds(50));
timer.async_wait(&test_timer);
}
}
int main()
{
boost::thread_group workers;
timer.expires_from_now(boost::posix_time::milliseconds(50));
timer.async_wait(&test_timer);
for (int i = 0; i < 5; i++){
workers.create_thread([] { io_service.run(); });
}
workers.join_all();
std::cout << "i = " << state.i << std::endl;
std::cout << "j = " << state.j << std::endl;
}
Note I removed the io_service::run() from the main thread as it is redundant with the join() (unless you really wanted 6 threads running the handlers, not 5).
Prints
i = 11
j = 110
Caveat
There's a pitfall lurking here. Say you didn't want to bail out at a fixed number like I did, but wanted to stop from the outside; you'd be tempted to do:
timer.cancel();
from main. That's not legal, because the deadline_timer object is not thread safe. You'd need to either
use a global atomic_bool to signal the request for termination
post the timer.cancel() on the same strand as the timer's async chain. However, there is only an implicit strand here, so you can't do that without changing the code to use an explicit strand.
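For the first option, a minimal sketch of a modified test_timer reusing the globals from the sample above (the flag name is made up; it needs #include <atomic>):
std::atomic<bool> stop_requested{ false };

void test_timer(boost::system::error_code ec)
{
    if (ec == boost::asio::error::operation_aborted)
        return;
    if (stop_requested.load())
        return; // don't re-arm the timer; run() returns once no more work is queued
    // ... update shared state exactly as before ...
    timer.expires_at(timer.expires_at() + boost::posix_time::milliseconds(50));
    timer.async_wait(&test_timer);
}

// From main (or any other thread):
// stop_requested = true; // the chain notices the flag at the next expiry and ends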
More Timers
Let's complicate things by having two timers, with their own implicit strands. This means access to the timer instances still need not be synchronized, but access to i and j does need to be.
Note In this demo I use synchronized_value<> for elegance. You can write similar logic manually using mutex and lock_guard.
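For reference, the manual mutex/lock_guard equivalent of synchronized_value<state> would look roughly like this (a standalone sketch, not the linked demo):
#include <boost/thread.hpp>
#include <iostream>

struct state {
    int i = 0;
    int j = 0;
};
state shared_state;
boost::mutex state_mtx; // guards every access to shared_state

void tick()
{
    boost::lock_guard<boost::mutex> lk(state_mtx);
    if (shared_state.i++ == 10)
        shared_state.j = shared_state.i * 10;
}

int main()
{
    boost::thread_group tg;
    for (int t = 0; t < 5; ++t)
        tg.create_thread([] { for (int k = 0; k < 3; ++k) tick(); });
    tg.join_all();
    boost::lock_guard<boost::mutex> lk(state_mtx);
    std::cout << "i = " << shared_state.i << ", j = " << shared_state.j << std::endl;
}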
Live On Coliru
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/thread/synchronized_value.hpp>
#include <iostream>
boost::asio::io_service io_service;
struct state {
int i = 0;
int j = 0;
};
boost::synchronized_value<state> shared_state;
struct TimerChain {
boost::asio::deadline_timer _timer;
TimerChain() : _timer{io_service} {
_timer.expires_from_now(boost::posix_time::milliseconds(50));
resume();
}
void resume() {
_timer.async_wait(boost::bind(&TimerChain::test_timer, this, _1));
};
void test_timer(boost::system::error_code ec)
{
if (ec != boost::asio::error::operation_aborted) {
{
auto state = shared_state.synchronize();
if (state->i++ == 10) {
state->j = state->i * 10;
}
if (state->j > 100) return; // stop after some iterations
}
_timer.expires_at(_timer.expires_at() + boost::posix_time::milliseconds(50));
resume();
}
}
};
int main()
{
boost::thread_group workers;
TimerChain timer1;
TimerChain timer2;
for (int i = 0; i < 5; i++){
workers.create_thread([] { io_service.run(); });
}
workers.join_all();
auto state = shared_state.synchronize();
std::cout << "i = " << state->i << std::endl;
std::cout << "j = " << state->j << std::endl;
}
Prints
i = 12
j = 110
Adding The Explicit Strands
Now it's pretty straight-forward to add them:
struct TimerChain {
boost::asio::io_service::strand _strand;
boost::asio::deadline_timer _timer;
TimerChain() : _strand{io_service}, _timer{io_service} {
_timer.expires_from_now(boost::posix_time::milliseconds(50));
resume();
}
void resume() {
_timer.async_wait(_strand.wrap(boost::bind(&TimerChain::test_timer, this, _1)));
};
void stop() { // thread safe
_strand.post([this] { _timer.cancel(); });
}
// ...
Live On Coliru
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/thread/synchronized_value.hpp>
#include <iostream>
boost::asio::io_service io_service;
struct state {
int i = 0;
int j = 0;
};
boost::synchronized_value<state> shared_state;
struct TimerChain {
boost::asio::io_service::strand _strand;
boost::asio::deadline_timer _timer;
TimerChain() : _strand{io_service}, _timer{io_service} {
_timer.expires_from_now(boost::posix_time::milliseconds(50));
resume();
}
void resume() {
_timer.async_wait(_strand.wrap(boost::bind(&TimerChain::test_timer, this, _1)));
};
void stop() { // thread safe
_strand.post([this] { _timer.cancel(); });
}
void test_timer(boost::system::error_code ec)
{
if (ec != boost::asio::error::operation_aborted) {
{
auto state = shared_state.synchronize();
if (state->i++ == 10) {
state->j = state->i * 10;
}
}
// continue indefinitely
_timer.expires_at(_timer.expires_at() + boost::posix_time::milliseconds(50));
resume();
}
}
};
int main()
{
boost::thread_group workers;
TimerChain timer1;
TimerChain timer2;
for (int i = 0; i < 5; i++){
workers.create_thread([] { io_service.run(); });
}
boost::this_thread::sleep_for(boost::chrono::seconds(10));
timer1.stop();
timer2.stop();
workers.join_all();
auto state = shared_state.synchronize();
std::cout << "i = " << state->i << std::endl;
std::cout << "j = " << state->j << std::endl;
}
Prints
i = 400
j = 110
¹ (or using the legacy name boost::asio::io_service)
² lifetime mutations are not considered member operations in this respect (you have to manually synchronize construction/destruction of shared objects even for thread-safe objects)

Syncing Threads in Boost

I am trying to create an application which creates one main thread and 10 slave threads. I want the slave threads to run once each time the main thread runs, so for each main-thread iteration each slave thread executes once. I tried to handle this with two different condition variables: one that the slave threads wait on until the main thread notifies them, and another that the main thread waits on, signalled after each slave finishes its task, so the main thread can check whether all the slave threads are done. The code is as follows:
// STD
#include <iostream>
#include <vector>
// BOOST
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
std::vector<boost::thread*> threads;
std::vector<boost::mutex*> data_ready_mutex;
std::vector<boost::condition_variable*> cond;
std::vector<bool> data_ready;
std::vector<int> num_run;
boost::mutex check_finish_mutex;
std::vector<bool> finished;
boost::atomic<int> data;
boost::atomic<int> next_thread_id;
boost::mutex finished_task_mutex;
boost::condition_variable finished_task_cond;
bool finished_task = false;
void signal_finished(const int& id)
{
{
boost::lock_guard<boost::mutex> lock(finished_task_mutex);
finished[id] = true;
finished_task = true;
}
finished_task_cond.notify_all();
}
void signal_slave(const int& id)
{
{
boost::lock_guard<boost::mutex> lock(*data_ready_mutex[id]);
data_ready[id] = true;
}
cond[id]->notify_all();
}
void slave_therad()
{
int id = next_thread_id++;
std::cout << "( " << id << " ) slave_thread created\n";
while (true)
{
boost::unique_lock<boost::mutex> lock(*data_ready_mutex[id]);
while (!data_ready[id])
{
cond[id]->wait(lock);
}
finished[id] = false;
data_ready[id] = false;
data++;
num_run[id]++;
signal_finished(id);
}
}
int main()
{
size_t nThreads = 10;
data_ready_mutex.resize(nThreads);
cond.resize(nThreads);
data_ready.resize(nThreads);
finished.resize(nThreads);
num_run.resize(nThreads, 0);
for (size_t i = 0; i < nThreads; i++)
{
data_ready_mutex[i] = new boost::mutex();
cond[i] = new boost::condition_variable();
data_ready[i] = false;
finished[i] = false;
}
for (size_t i = 0; i < nThreads; i++)
{
threads.push_back(new boost::thread(slave_therad));
}
while (true)
{
clock_t start_time = clock();
for (size_t i = 0; i < threads.size(); i++)
signal_slave(static_cast<int>(i));
while (true)
{
boost::unique_lock<boost::mutex> lock(finished_task_mutex);
while (!finished_task)
{
finished_task_cond.wait(lock);
}
finished_task = false;
size_t i = 0;
for (; i < finished.size(); i++)
{
if (!finished[i]) break;
}
if (i == finished.size()) break;
}
clock_t end_time = clock();
std::cout << "Elapsed Time = " << static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC << std::endl;
for (size_t i = 0; i < threads.size(); i++)
finished[i] = false;
}
for (size_t i = 0; i < nThreads; i++)
{
threads[i]->join();
}
}
The problem is that at some point the code stops and gets stuck in a deadlock.
I also tried a different implementation: I used an atomic<int> that counts the number of threads which have finished their task, and in the main thread I check whether that count equals the number of threads, but this method also gets stuck somewhere and goes into a deadlock.
The code can be found here:
// STD
#include <iostream>
#include <vector>
// BOOST
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
std::vector<boost::thread*> threads; //!< Slave Threads array
std::vector<boost::mutex*> data_ready_mutex; //!< Mutex to guard the data_ready
std::vector<bool> data_ready; //!< Shows if the data is ready for the slave thread or not.
std::vector<boost::condition_variable*> cond; //!< conditional variable to wait on data being ready for the slave thread.
std::vector<int> num_run; //!< Stores the number of times each slave thread is run.
boost::atomic<int> data; //!< Stores the data processed by each slave thread
boost::atomic<int> next_thread_id; //!< id for the next thread (used for giving an id from 0,..., nThreads-1
boost::atomic<int> num_threads_done; //!< Stores the number of slave threads which has finished their task
//! Signals a slave thread to start its task
void signal_slave(const int& id)
{
{
boost::lock_guard<boost::mutex> lock(*data_ready_mutex[id]);
data_ready[id] = true;
}
cond[id]->notify_all();
}
//! Slave thread function
void slave_therad()
{
// assign an id to the current slave_thread
int id = next_thread_id++;
std::cout << "( " << id << " ) slave_thread created\n";
while (true)
{
// wait for a signal from the main thread
boost::unique_lock<boost::mutex> lock(*data_ready_mutex[id]);
while (!data_ready[id])
{
cond[id]->wait(lock);
}
// make the data not ready, so the loop is not going to run without the main thread signal after the thread is done.
data_ready[id] = false;
// TASK for SLAVE THREAD
data++;
// Increase the number of times the thread is run
num_run[id]++;
// Increase the number of threads which has finished their tasks.
num_threads_done++;
}
}
int main()
{
size_t nThreads = 10;
// creating the data ready mutexes, conditional variables, data_ready variable (bools), num_runs array.
data_ready_mutex.resize(nThreads);
cond.resize(nThreads);
data_ready.resize(nThreads);
num_run.resize(nThreads, 0);
for (size_t i = 0; i < nThreads; i++)
{
data_ready_mutex[i] = new boost::mutex();
cond[i] = new boost::condition_variable();
data_ready[i] = false;
}
// Creating the slave threads
for (size_t i = 0; i < nThreads; i++)
{
threads.push_back(new boost::thread(slave_therad));
}
// Main Thread Body
while (true)
{
clock_t start_time = clock();
// Reset the number of threads which are done.
num_threads_done = 0;
// Signals the slave threads to start doing their task.
for (size_t i = 0; i < threads.size(); i++)
signal_slave(static_cast<int>(i));
// Wait until all the slave threads are done.
while (true)
if (num_threads_done == threads.size()) break;
clock_t end_time = clock();
std::cout << "Elapsed Time = " << static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC << std::endl;
}
for (size_t i = 0; i < nThreads; i++)
{
threads[i]->join();
}
}
I also tried to fix the issue with barriers, but it did not solve my problem. The code is as follows:
// STD
#include <iostream>
#include <vector>
// BOOST
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
boost::barrier* barrier; //!< barrier to make sure all the slave threads are done their tasks.
std::vector<boost::thread*> threads;
std::vector<boost::mutex*> data_ready_mutex; //!< Mutex to guard the data_ready
std::vector<bool> data_ready; //!< Shows if the data is ready for the slave thread or not.
std::vector<boost::condition_variable*> cond; //!< conditional variable to wait on data being ready for the slave thread.
std::vector<int> num_run; //!< Stores the number of times each slave thread is run.
boost::atomic<int> data; //!< Stores the data processed by each slave thread
boost::atomic<int> next_thread_id; //!< id for the next thread (used for giving an id from 0,..., nThreads-1
boost::atomic<int> num_threads_done; //!< Stores the number of slave threads which has finished their task
std::vector<bool> finished; //!< Array which stores if all the slave threads are done or not.
boost::mutex finished_task_mutex; //!< mutex to guard the finished_task variable
boost::condition_variable finished_task_cond; //!< Conditional variable to wait for all the threads to finish they tasks.
boost::atomic<bool> finished_task(false); //!< Variable which stores if the task of slave_threads are finished or not.
void signal_finished(const int& id)
{
{
boost::lock_guard<boost::mutex> lock(finished_task_mutex);
finished[id] = true;
finished_task = true;
}
finished_task_cond.notify_all();
}
void signal_slave(const int& id)
{
{
boost::lock_guard<boost::mutex> lock(*data_ready_mutex[id]);
data_ready[id] = true;
}
cond[id]->notify_all();
}
void slave_therad()
{
int id = next_thread_id++;
std::cout << "( " << id << " ) slave_thread created\n";
while (true)
{
boost::unique_lock<boost::mutex> lock(*data_ready_mutex[id]);
while (!data_ready[id])
{
cond[id]->wait(lock);
}
finished[id] = false;
data_ready[id] = false;
data++;
num_run[id]++;
barrier->wait();
signal_finished(id);
}
}
int main()
{
size_t nThreads = 10;
data_ready_mutex.resize(nThreads);
cond.resize(nThreads);
data_ready.resize(nThreads);
finished.resize(nThreads);
num_run.resize(nThreads, 0);
for (size_t i = 0; i < nThreads; i++)
{
data_ready_mutex[i] = new boost::mutex();
cond[i] = new boost::condition_variable();
data_ready[i] = false;
finished[i] = false;
}
barrier = new boost::barrier(nThreads);
for (size_t i = 0; i < nThreads; i++)
{
threads.push_back(new boost::thread(slave_therad));
}
while (true)
{
clock_t start_time = clock();
for (size_t i = 0; i < threads.size(); i++)
signal_slave(static_cast<int>(i));
while (true)
{
boost::unique_lock<boost::mutex> lock(finished_task_mutex);
while (!finished_task)
{
finished_task_cond.wait(lock);
}
finished_task = false;
break;
}
clock_t end_time = clock();
std::cout << "Elapsed Time = " << static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC << std::endl;
for (size_t i = 0; i < threads.size(); i++)
finished[i] = false;
}
for (size_t i = 0; i < nThreads; i++)
{
threads[i]->join();
}
}
[UPDATED]
So, I simply put the mutex, the condition variable and data_ready into a struct as follows, and now the code is working. I think there was a bug related to using pointers to the mutexes and so on. The code is as follows:
//#define SYNC_WITH_BARRIER
#define SYNC_WITH_ATOMICS
// STD
#include <iostream>
#include <vector>
// BOOST
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
#include <boost/ptr_container/ptr_vector.hpp>
std::vector<boost::thread*> threads;
boost::atomic<int> next_thread_id(0);
boost::mutex finished_task_mutex;
boost::condition_variable finished_task_cond;
bool finished_task = false;
boost::atomic<int> num_finished_tasks(0);
struct Work
{
Work(boost::barrier& _barrier) : b(&_barrier)
{
}
boost::barrier* b;
boost::mutex data_ready_mutex;
boost::condition_variable data_ready_cond;
bool data_ready = false; // initialise: otherwise the first wait reads an indeterminate value
int num_run = 0;
boost::atomic<int> data{0};
bool finished = false;
void signal_slave()
{
{
boost::lock_guard<boost::mutex> lock(data_ready_mutex);
data_ready = true;
data_ready_cond.notify_all();
}
}
void slave_therad()
{
int id = next_thread_id++;
std::cout << "( " << id << " ) slave_thread created\n";
while (true)
{
boost::unique_lock<boost::mutex> lock(data_ready_mutex);
while (!data_ready)
{
data_ready_cond.wait(lock);
}
finished = false;
data_ready = false;
data++;
num_run++;
#ifdef SYNC_WITH_BARRIER
b->count_down_and_wait();
#else
#ifdef SYNC_WITH_ATOMICS
num_finished_tasks++;
#endif
#endif
}
}
};
#include <boost/chrono.hpp>
#include <boost/chrono/chrono_io.hpp>
using hrc = boost::chrono::high_resolution_clock;
int main()
{
size_t nThreads = 10;
boost::thread_group tg;
boost::ptr_vector<Work> work_items;
work_items.reserve(nThreads);
boost::barrier finish(nThreads + 1); // one for the main thread
for (size_t i = 0; i < nThreads; i++)
{
work_items.push_back(new Work(finish));
tg.create_thread(boost::bind(&Work::slave_therad, boost::ref(work_items.back())));
}
while (true)
{
auto start_time = hrc::now();
num_finished_tasks = 0;
for (size_t i = 0; i < work_items.size(); i++)
work_items[i].signal_slave();
#ifdef SYNC_WITH_BARRIER
finish.count_down_and_wait();
#else
#ifdef SYNC_WITH_ATOMICS
while (true) if (num_finished_tasks == work_items.size()) break;
#endif
#endif
clock_t end_time = clock();
std::cout << "Elapsed Time = " << hrc::now() - start_time << std::endl;
}
tg.join_all();
}
"@sehe even with barrier, it stuck in deadlock." – mmostajab
Since you don't show anything about what you're doing there, let me give you a startup boost by incorporating a large chunk of all the suggestions you received:
Live On Coliru
#include <boost/atomic.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <iostream>
#include <vector>
namespace /*static*/ {
boost::atomic<int> data;
boost::atomic<int> num_threads_done;
struct Work {
void signal_slave()
{
boost::lock_guard<boost::mutex> lock(data_ready_mutex);
data_ready = true;
cond.notify_all();
}
void slave_thread()
{
static boost::atomic_int _id_gen(0);
id = _id_gen++;
std::cout << "(" << id << ") slave_thread created\n";
while (true) {
boost::unique_lock<boost::mutex> lock(data_ready_mutex);
cond.wait(lock, [&]{ return data_ready; });
data_ready = false;
data++;
num_run++;
num_threads_done++;
}
}
private:
int id = 0;
bool data_ready = false;
int num_run = 0;
boost::mutex data_ready_mutex;
boost::condition_variable cond;
};
}
#include <boost/chrono.hpp>
#include <boost/chrono/chrono_io.hpp>
using hrc = boost::chrono::high_resolution_clock;
int main()
{
boost::thread_group tg;
size_t nThreads = 10;
std::vector<Work> works(nThreads);
for (size_t i = 0; i < nThreads; i++) {
tg.create_thread(boost::bind(&Work::slave_thread, boost::ref(works[i])));
}
while (true) {
auto start_time = hrc::now();
for (auto& w : works)
w.signal_slave();
std::cout << "Elapsed Time = " << (hrc::now()-start_time) << std::endl;
}
tg.join_all();
}
Bear in mind, I don't know what you're trying to achieve here. When adding a barrier, I had this in mind: how to use boost barrier.
I tried to change @sehe's answer so that it solves exactly the problem I am looking at, and I arrived at this code:
#include <boost/atomic.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <iostream>
#include <vector>
namespace /*static*/ {
boost::atomic<int> data;
boost::barrier* slave_thread_finished_barrier;
boost::mutex slave_thread_finished_mutex;
boost::condition_variable slave_thread_finished_cond;
bool slave_thread_finished = false;
struct Work {
void signal_slave()
{
boost::lock_guard<boost::mutex> lock(data_ready_mutex);
data_ready = true;
cond.notify_all();
}
void slave_thread()
{
static boost::atomic_int _id_gen(0);
id = _id_gen++;
std::cout << "(" << id << ") slave_thread created\n";
while (true) {
boost::unique_lock<boost::mutex> lock(data_ready_mutex);
cond.wait(lock, [&]{ return data_ready; });
data_ready = false;
data++;
num_run++;
slave_thread_finished_barrier->wait();
// signaling the main thread that the slave threads are done.
if (id == 0)
{
boost::lock_guard<boost::mutex> lock(slave_thread_finished_mutex);
slave_thread_finished = true;
slave_thread_finished_cond.notify_one();
}
}
}
private:
int id = 0;
bool data_ready = false;
int num_run = 0;
boost::mutex data_ready_mutex;
boost::condition_variable cond;
};
}
#include <boost/chrono.hpp>
#include <boost/chrono/chrono_io.hpp>
using hrc = boost::chrono::high_resolution_clock;
int main()
{
boost::thread_group tg;
size_t nThreads = 10;
slave_thread_finished_barrier = new boost::barrier(nThreads);
std::vector<Work> works(nThreads);
for (size_t i = 0; i < nThreads; i++) {
tg.create_thread(boost::bind(&Work::slave_thread, boost::ref(works[i])));
}
while (true) {
auto start_time = hrc::now();
for (auto& w : works)
w.signal_slave();
// Wait for slave threads to finish.
boost::unique_lock<boost::mutex> lock(slave_thread_finished_mutex);
slave_thread_finished_cond.wait(lock, [&]{ return slave_thread_finished; });
slave_thread_finished = false;
std::cout << "Elapsed Time = " << (hrc::now() - start_time) << std::endl;
}
tg.join_all();
}

How to check if a std::thread is still running?

How can I check if a std::thread is still running (in a platform independent way)?
It lacks a timed_join() method and joinable() is not meant for that.
I thought of locking a mutex with a std::lock_guard in the thread and using the try_lock() method of the mutex to determine if it is still locked (the thread is running), but it seems unnecessarily complex to me.
Do you know a more elegant method?
Update: To be clear: I want to check if the thread cleanly exited or not. A 'hanging' thread is considered running for this purpose.
If you are willing to make use of C++11 std::async and std::future for running your tasks, then you can utilize the wait_for function of std::future to check if the thread is still running in a neat way like this:
#include <future>
#include <thread>
#include <chrono>
#include <iostream>
int main() {
using namespace std::chrono_literals;
/* Run some task on new thread. The launch policy std::launch::async
makes sure that the task is run asynchronously on a new thread. */
auto future = std::async(std::launch::async, [] {
std::this_thread::sleep_for(3s);
return 8;
});
// Use wait_for() with zero milliseconds to check thread status.
auto status = future.wait_for(0ms);
// Print status.
if (status == std::future_status::ready) {
std::cout << "Thread finished" << std::endl;
} else {
std::cout << "Thread still running" << std::endl;
}
auto result = future.get(); // Get result.
}
If you must use std::thread then you can use std::promise to get a future object:
#include <future>
#include <thread>
#include <chrono>
#include <iostream>
int main() {
using namespace std::chrono_literals;
// Create a promise and get its future.
std::promise<bool> p;
auto future = p.get_future();
// Run some task on a new thread.
std::thread t([&p] {
std::this_thread::sleep_for(3s);
p.set_value(true); // Is done atomically.
});
// Get thread status using wait_for as before.
auto status = future.wait_for(0ms);
// Print status.
if (status == std::future_status::ready) {
std::cout << "Thread finished" << std::endl;
} else {
std::cout << "Thread still running" << std::endl;
}
t.join(); // Join thread.
}
Both of these examples will output:
Thread still running
This is of course because the thread status is checked before the task is finished.
But then again, it might be simpler to just do it like others have already mentioned:
#include <thread>
#include <atomic>
#include <chrono>
#include <iostream>
int main() {
using namespace std::chrono_literals;
std::atomic<bool> done(false); // Use an atomic flag.
/* Run some task on a new thread.
Make sure to set the done flag to true when finished. */
std::thread t([&done] {
std::this_thread::sleep_for(3s);
done = true;
});
// Print status.
if (done) {
std::cout << "Thread finished" << std::endl;
} else {
std::cout << "Thread still running" << std::endl;
}
t.join(); // Join thread.
}
Edit:
There's also the std::packaged_task for use with std::thread for a cleaner solution than using std::promise:
#include <future>
#include <thread>
#include <chrono>
#include <iostream>
int main() {
using namespace std::chrono_literals;
// Create a packaged_task using some task and get its future.
std::packaged_task<void()> task([] {
std::this_thread::sleep_for(3s);
});
auto future = task.get_future();
// Run task on new thread.
std::thread t(std::move(task));
// Get thread status using wait_for as before.
auto status = future.wait_for(0ms);
// Print status.
if (status == std::future_status::ready) {
// ...
}
t.join(); // Join thread.
}
An easy solution is to have a boolean variable that the thread sets to true at regular intervals, and that is checked and set to false by the thread wanting to know the status. If the variable stays false for too long, the thread is no longer considered active.
A more thread-safe way is to have a counter that is increased by the child thread; the main thread compares the counter to a stored value, and if it is still the same after too long a time, the child thread is considered not active.
Note however, there is no way in C++11 to actually kill or remove a thread that has hanged.
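A rough sketch of the counter idea above (the names and intervals are made up):
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<unsigned> heartbeat{0};

void worker()
{
    for (int i = 0; i < 20; ++i)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(100)); // a slice of work
        ++heartbeat; // tell the watchdog we made progress
    }
}

int main()
{
    std::thread t(worker);
    unsigned last_seen = heartbeat;
    for (int i = 0; i < 10; ++i)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
        unsigned now = heartbeat;
        if (now == last_seen) // after the worker ends (or hangs), the counter stops changing
            std::cout << "no progress observed; thread considered inactive\n";
        last_seen = now;
    }
    t.join();
}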
Edit: How to check if a thread has cleanly exited or not: basically the same technique as described in the first paragraph. Have a boolean variable initialized to false. The last thing the child thread does is set it to true. The main thread can then check that variable and, if it is true, do a join on the child thread without much (if any) blocking.
Edit 2: If the thread exits due to an exception, then have two thread "main" functions: the first one has a try-catch inside which it calls the second, "real" main thread function. This first main function sets the "have_exited" variable. Something like this:
std::atomic<bool> thread_done = false;
void *thread_function(void *arg)
{
void *res = nullptr;
try
{
res = real_thread_function(arg);
}
catch (...)
{
}
thread_done = true;
return res;
}
You can use this simple mechanism to detect that a thread has finished without blocking in the join method.
std::thread thread([&thread]() {
sleep(3);
thread.detach();
});
while(thread.joinable())
sleep(1);
You can always check if the thread's id is different from a default-constructed std::thread::id().
A running thread always has a genuine associated id.
Try to avoid too much fancy stuff :)
Create a mutex that the running thread and the calling thread both have access to. When the running thread starts it locks the mutex, and when it ends it unlocks the mutex. To check if the thread is still running, the calling thread calls mutex.try_lock(). The return value of that is the status of the thread. (Just make sure to unlock the mutex if the try_lock worked)
One small problem with this: between the time the thread is created and the time it locks the mutex, try_lock() will succeed, so the thread will briefly look as if it had already finished. This can be avoided using a slightly more complex method.
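A minimal sketch of that idea; it sidesteps the start-up window with a crude sleep rather than the "slightly more complex method":
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex running; // locked for the whole lifetime of the worker's body

void worker()
{
    std::lock_guard<std::mutex> hold(running);
    std::this_thread::sleep_for(std::chrono::seconds(2)); // the actual work
}

bool seems_finished()
{
    if (running.try_lock()) // succeeds only when the worker is not holding the mutex
    {
        running.unlock(); // release it again right away
        return true;
    }
    return false;
}

int main()
{
    std::thread t(worker);
    std::this_thread::sleep_for(std::chrono::milliseconds(100)); // crude: wait out the start-up window
    while (!seems_finished())
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    std::cout << "worker appears to have finished\n";
    t.join();
}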
Surely have a mutex-wrapped variable initialised to false, that the thread sets to true as the last thing it does before exiting. Is that atomic enough for your needs?
I checked both approaches:
- Using thread + atomic: takes 9738 milliseconds
- Using future + async: takes 7746 milliseconds
- Without threads: 56000 milliseconds
(measured on a 6-core Core i7 laptop)
My code creates 4000 threads, but no more than 12 are running at any time.
Here is the code:
#include <iostream>
#include <thread>
#include <future>
#include <chrono>
#include <mutex> // std::mutex
#include <atomic>
#include <chrono>
#pragma warning(disable:4996)
#pragma warning(disable:6031)
#pragma warning(disable:6387)//strout
#pragma warning(disable:26451)
using namespace std;
const bool FLAG_IMPRIME = false;
const int MAX_THREADS = 12;
mutex mtx; // mutex for critical section
atomic <bool> th_end[MAX_THREADS];
atomic <int> tareas_acabadas;
typedef std::chrono::high_resolution_clock t_clock; // Windows only
std::chrono::time_point<t_clock> start_time, stop_time; char null_char;
void timer(const char* title = 0, int data_size = 1)
{
stop_time = t_clock::now();
double us = (double)chrono::duration_cast<chrono::microseconds>(stop_time - start_time).count();
if (title)
printf("%s time = %7lgms = %7lg MOPs\n", title, (double)us * 1e-3, (double)data_size / us);
start_time = t_clock::now();
}
class c_trim
{
char line[200];
thread th[MAX_THREADS];
double th_result[MAX_THREADS];
int th_index;
double milliseconds_commanded;
void hilo(int hindex,int milliseconds, double& milliseconds2)
{
sprintf(line, "%i:%ia ",hindex, milliseconds); imprime(line);
this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
milliseconds2 = milliseconds * 1000;
sprintf(line, "%i:%ib ", hindex, milliseconds); imprime(line);
tareas_acabadas++; th_end[hindex] = true;
}
int wait_first();
void imprime(char* str) { if (FLAG_IMPRIME) { mtx.lock(); cout << str; mtx.unlock(); } }
public:
void lanzatareas();
vector <future<void>> futures;
int wait_first_future();
void lanzatareas_future();//usa future
};
int main()
{
c_trim trim;
timer();
trim.lanzatareas();
cout << endl;
timer("4000 tareas using THREAD+ATOMIC:", 4000);
trim.lanzatareas_future();
cout << endl;
timer("4000 tareas using FUTURE:", 4000);
cout << endl << "Tareas acabadas:" << tareas_acabadas << endl;
cout << "=== END ===\n"; (void)getchar();
}
void c_trim::lanzatareas()
{
th_index = 0;
tareas_acabadas = 0;
milliseconds_commanded = 0;
double *timeout=new double[MAX_THREADS];
int i;
for (i = 0; i < MAX_THREADS; i++)
{
th_end[i] = true;
th_result[i] = timeout[i] = -1;
}
for (i = 0; i < 4000; i++)
{
int milliseconds = 5 + (i % 10) * 2;
{
int j = wait_first();
if (th[j].joinable())
{
th[j].join();
th_result[j] = timeout[j];
}
milliseconds_commanded += milliseconds;
th_end[j] = false;
th[j] = thread(&c_trim::hilo, this, j, milliseconds, std::ref(timeout[j]));
}
}
for (int j = 0; j < MAX_THREADS; j++)
if (th[j].joinable())
{
th[j].join();
th_result[j] = timeout[j];
}
delete[] timeout;
cout <<endl<< "Milliseconds commanded to wait=" << milliseconds_commanded << endl;
}
void c_trim::lanzatareas_future()
{
futures.clear();
futures.resize(MAX_THREADS);
tareas_acabadas = 0;
milliseconds_commanded = 0;
double* timeout = new double[MAX_THREADS];
int i;
for (i = 0; i < MAX_THREADS; i++)
{
th_result[i] = timeout[i] = -1;
}
for (i = 0; i < 4000; i++)
{
int milliseconds = 5 + (i % 10) * 2;
{
int j;
if (i < MAX_THREADS) j = i;
else
{
j = wait_first_future();
futures[j].get();
th_result[j] = timeout[j];
}
milliseconds_commanded += milliseconds;
futures[j] = std::async(std::launch::async, &c_trim::hilo, this, j, milliseconds, std::ref(timeout[j]));
}
}
//Last MAX_THREADS:
for (int j = 0; j < MAX_THREADS; j++)
{
futures[j].get();
th_result[j] = timeout[j];
}
delete[] timeout;
cout << endl << "Milliseconds commanded to wait=" << milliseconds_commanded << endl;
}
int c_trim::wait_first()
{
int i;
while (1)
for (i = 0; i < MAX_THREADS; i++)
{
if (th_end[i] == true)
{
return i;
}
}
}
// Waits until some future finishes and returns its index
int c_trim::wait_first_future()
{
int i;
std::future_status status;
while (1)
for (i = 0; i < MAX_THREADS; i++)
{
status = futures[i].wait_for(0ms);
if (status == std::future_status::ready)
return i;
}
}
I also had this problem very recently. I tried the C++20 std::jthread, using the shared stop state to check whether the thread is over, but inside the thread the std::stop_token argument is read-only and doesn't indicate to the outside when the thread finishes.
So I created a simple class (nes::uthread) extending std::thread with a flag to indicate it's finished. Example:
#include <atomic>
#include <chrono>
#include <iostream>
#include <memory>
#include <thread>
namespace nes {
class uthread final
{
std::unique_ptr<std::atomic<bool>> m_finished;
std::thread m_thr;
public:
uthread()
: m_finished { std::make_unique<std::atomic<bool>>(true) }
{}
template <class Function, class... Args>
uthread(Function&& f, Args&&... args)
: m_finished { std::make_unique<std::atomic<bool>>(false) }
, m_thr {
[](std::atomic<bool>& finished, Function&& ff, Args&&... aargs) {
try {
std::forward<Function>(ff)(std::forward<Args>(aargs)...);
finished = true;
} catch (...) {
finished = true;
throw;
}
},
std::ref(*m_finished), std::forward<Function>(f),
std::forward<Args>(args)...
}
{}
uthread(const uthread&) = delete;
uthread(uthread&&) = default;
uthread& operator=(const uthread&) = delete;
uthread& operator=(uthread&&) = default;
[[nodiscard]] std::thread::id get_id() const noexcept {
return m_thr.get_id(); }
[[nodiscard]] bool joinable() const noexcept { return m_thr.joinable(); }
void join() { m_thr.join(); }
[[nodiscard]] const std::atomic<bool>& finished() const noexcept {
return *m_finished; }
};
}
int main()
{
using namespace std;
using namespace std::chrono;
using namespace std::chrono_literals;
using namespace nes;
{
cout << "std::thread join() termination\n";
atomic<bool> finished = false;
thread t { [&finished] {
this_thread::sleep_for(2s);
finished = true;
cout << "thread ended\n";
}};
for (int i = 0; i < 5; i++) {
cout << t.get_id() << ".join() " << t.joinable()
<< " finished: " << finished << '\n';
this_thread::sleep_for(1s);
}
t.join();
}
cout << '\n';
{
cout << "std::jthread join() termination\n";
jthread t {[](stop_token st) {
this_thread::sleep_for(2s);
cout << "thread ended. stop possible: " << st.stop_possible() << '\n';
}};
auto st = t.get_stop_source();
for (int i = 0; i < 5; i++) {
cout << t.get_id() << ".join() " << t.joinable()
<< " finished: " << !st.stop_possible() << '\n';
this_thread::sleep_for(1s);
}
}
cout << '\n';
{
cout << "nes::uthread join() termination\n";
uthread t {[] {
this_thread::sleep_for(2s);
cout << "thread ended\n";
}};
for (int i = 0; i < 5; i++) {
cout << t.get_id() << ".join() " << t.joinable()
<< " finished: " << t.finished() << '\n';
this_thread::sleep_for(1s);
}
t.join();
}
}
Possible prints:
std::thread join() termination
2.join() 1 finished: 0
2.join() 1 finished: 0
thread ended
2.join() 1 finished: 1
2.join() 1 finished: 1
2.join() 1 finished: 1
std::jthread join() termination
3.join() 1 finished: 0
3.join() 1 finished: 0
thread ended. stop possible: 1
3.join() 1 finished: 0
3.join() 1 finished: 0
3.join() 1 finished: 0
nes::uthread join() termination
4.join() 1 finished: 0
4.join() 1 finished: 0
thread ended
4.join() 1 finished: 1
4.join() 1 finished: 1
4.join() 1 finished: 1
You can use std::jthread in nes::uthread so you don't need to join.
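A sketch of that last suggestion: the same wrapper idea, but holding a std::jthread (C++20) so the destructor joins automatically. The class name here is made up and exception handling is omitted:
#include <atomic>
#include <chrono>
#include <iostream>
#include <memory>
#include <thread>

class jthread_with_flag final
{
    std::unique_ptr<std::atomic<bool>> m_finished;
    std::jthread m_thr; // joins in its destructor, so no explicit join() is required
public:
    template <class Function>
    explicit jthread_with_flag(Function f)
        : m_finished{ std::make_unique<std::atomic<bool>>(false) }
        , m_thr{ [flag = m_finished.get(), fn = std::move(f)]() mutable {
              fn();
              *flag = true; // mark completion as the very last step
          } }
    {}
    [[nodiscard]] const std::atomic<bool>& finished() const noexcept { return *m_finished; }
};

int main()
{
    using namespace std::chrono_literals;
    jthread_with_flag t{ [] { std::this_thread::sleep_for(1s); } };
    while (!t.finished())
    {
        std::cout << "still running\n";
        std::this_thread::sleep_for(250ms);
    }
    std::cout << "finished\n";
} // the std::jthread member joins here automatically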