Is there already a thread monitor in C++11 or Boost?
I need to monitor thread execution, and when a thread fails for any reason I need to start it again.
I am using C++11.
This depends on what constitutes a thread failure. If you mean it could exit, you can package it up:
Let's pretend we have a "long-running" task with a 25% chance of failing midway:
int my_processing_task() // this can randomly fail
{
static const size_t iterations = 1ul << 6;
static const size_t mtbf = iterations << 2; // 25% chance of failure
static thread_local auto odds = bind(uniform_int_distribution<size_t>(0, mtbf), mt19937(time(NULL))); // one generator per thread, so concurrent callers do not race
for(size_t iteration = 0; iteration < iterations; ++iteration)
{
// long task
this_thread::sleep_for(chrono::milliseconds(10));
// that could fail
if (odds() == 37)
throw my_failure();
}
// we succeeded!
return 42;
}
If we want to keep running the task, regardless of whether it completed normally, or with an error, we can write a monitoring wrapper:
template <typename F> void monitor_task_loop(F f)
{
while (!shutdown)
try {
f();
++completions;
} catch (exception const& e)
{
std::cout << "handling: '" << e.what() << "'\n";
++failures;
}
std::cout << "shutdown requested\n";
}
In this case, I randomly thought it would be nice to count the number of regular completions and the number of failures. The shutdown flag enables the thread to be shut down:
auto timeout = async(launch::async, []{ this_thread::sleep_for(chrono::seconds(3)); shutdown = true; });
monitor_task_loop(my_processing_task);
This will run the task monitoring loop for ~3 seconds. A demonstration running three background threads monitoring our task is Live On Coliru.
Added a C++03 version using Boost: Live On Coliru.
This version uses only standard c++11 features.
#include <atomic>
#include <chrono>
#include <ctime>
#include <exception>
#include <functional>
#include <future>
#include <iostream>
#include <random>
#include <thread>
using namespace std;
struct my_failure : virtual std::exception {
char const* what() const noexcept { return "the thread failed randomly"; }
};
int my_processing_task() // this can randomly fail
{
static const size_t iterations = 1ul << 4;
static const size_t mtbf = iterations << 2; // 25% chance of failure
static thread_local auto odds = bind(uniform_int_distribution<size_t>(0, mtbf), mt19937(time(NULL))); // one generator per thread: the task runs on several threads at once
for(size_t iteration = 0; iteration < iterations; ++iteration)
{
// long task
this_thread::sleep_for(chrono::milliseconds(10));
// that could fail
if (odds() == 37)
throw my_failure();
}
// we succeeded!
return 42;
}
std::atomic_bool shutdown(false);
std::atomic_size_t failures(0), completions(0);
template <typename F> void monitor_task_loop(F f)
{
while (!shutdown)
try {
f();
++completions;
} catch (exception const& e)
{
std::cout << "handling: '" << e.what() << "'\n";
++failures;
}
std::cout << "shutdown requested\n";
}
int main()
{
auto monitor = [] { monitor_task_loop(my_processing_task); };
thread t1(monitor), t2(monitor), t3(monitor);
this_thread::sleep_for(chrono::seconds(3));
shutdown = true;
t1.join();
t2.join();
t3.join();
std::cout << "completions: " << completions << ", failures: " << failures << "\n";
}
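If the task should be restarted only when it fails (rather than re-run forever), the same idea can be expressed with std::async, because the returned future carries any exception back to the caller. The following is a minimal sketch, not part of the program above: run_with_restart and its max_attempts parameter are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <future>
#include <iostream>
#include <stdexcept>

// Hypothetical helper: run `task` on its own thread and start it again each
// time it throws, until it returns normally or `max_attempts` is used up.
// Returns the number of attempts that were needed.
template <typename F>
std::size_t run_with_restart(F task, std::size_t max_attempts)
{
    assert(max_attempts > 0);
    for (std::size_t attempt = 1; attempt <= max_attempts; ++attempt) {
        // launch::async runs the task on a new thread
        auto result = std::async(std::launch::async, task);
        try {
            result.get();      // rethrows whatever the task threw
            return attempt;    // the task finished normally
        } catch (std::exception const& e) {
            std::cout << "attempt " << attempt << " failed: " << e.what() << "\n";
        }
    }
    throw std::runtime_error("task failed on every attempt");
}
```

Calling run_with_restart(my_processing_task, 10) would, under these assumptions, retry the task up to ten times. Only failures reported as exceptions derived from std::exception are handled here; a thread that crashes the process cannot be recovered this way.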
Related
I need to create an infinite loop, and inside this loop some functions must run in parallel. Since they only access a read-only structure, there is no risk of a race condition, so I want to run them simultaneously to gain some performance.
The problem is that I don't know how to achieve this efficiently.
This is an example where I run four functions in parallel in a loop at a specific frame rate (the idea of looping at a specific frame rate is taken from here):
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <random>
#include <thread>
int getRandomIntBetween(int minValue, int maxValue) {
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<int> uni(minValue, maxValue);
return uni(rng);
}
void fun1() {
int randomInterval = getRandomIntBetween(10, 90);
std::this_thread::sleep_for(std::chrono::milliseconds(randomInterval));
std::cout << "fun1 done in " << randomInterval << "ms" << std::endl;
}
void fun2() {
int randomInterval = getRandomIntBetween(10, 90);
std::this_thread::sleep_for(std::chrono::milliseconds(randomInterval));
std::cout << "fun2 done in " << randomInterval << "ms" << std::endl;
}
void fun3() {
int randomInterval = getRandomIntBetween(10, 200);
std::this_thread::sleep_for(std::chrono::milliseconds(randomInterval));
std::cout << "fun3 done in " << randomInterval << "ms" << std::endl;
}
void fun4() {
int randomInterval = getRandomIntBetween(3, 300);
std::this_thread::sleep_for(std::chrono::milliseconds(randomInterval));
std::cout << "fun4 done in " << randomInterval << "ms" << std::endl;
}
int main(int argc, char* argv[]) {
const int64_t frameDurationInUs = 1.0e6 / 1;
std::cout << "Parallel looping testing" << std::endl;
std::condition_variable cv;
std::mutex mut;
bool stop = false;
size_t counter{ 0 };
using delta = std::chrono::duration<int64_t, std::ratio<1, 1000000>>;
auto next = std::chrono::steady_clock::now() + delta{ frameDurationInUs };
std::unique_lock<std::mutex> lk(mut);
while (!stop) {
mut.unlock();
if (counter % 10 == 0) {
std::cout << counter << " frames..." << std::endl;
}
std::thread t1{ &fun1 };
std::thread t2{ &fun2 };
std::thread t3{ &fun3 };
std::thread t4{ &fun4 };
counter++;
t1.join();
t2.join();
t3.join();
t4.join();
mut.lock();
cv.wait_until(lk, next);
next += delta{ frameDurationInUs };
}
return 0;
}
It works, but it's inefficient, because I create and destroy four thread objects at every iteration.
Instead, I'd like to keep the threads alive, call the functions inside the loop, and use some locking mechanism (mutex, semaphore) to wait inside the loop until all functions have finished before starting the next iteration.
How can I achieve this result?
If you do not want to rely on thread reusing, you don't have to resort to pooling:
In your very specific case you probably don't need to bother with a fully developed thread pool as you want each function to be run exactly once by the corresponding thread.
Your joins therefore become queries for the threads to be done with one particular job:
std::array<std::atomic<bool>, 4> done;
// loop:
std::fill(begin(done), end(done), false);
// ... run threads
for (std::size_t i = 0; i < 4; ++i) {
while (done[i] == false) {} // wait for thread i to finish
}
And thread i obviously then writes done[i] = true; once the function it was supposed to run is done.
You would distribute work packages in much the same way.
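Put together, the flag-based approach might look like the sketch below. FrameWorkers, run_frame and the go_/done_ flag names are hypothetical and not from the answer above. It assumes one long-lived thread per job and spins with std::this_thread::yield() in place of the empty wait loop.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: keeps one thread per job alive and runs every job
// exactly once per frame.
class FrameWorkers {
public:
    explicit FrameWorkers(std::vector<std::function<void()>> jobs)
        : jobs_(std::move(jobs)), go_(jobs_.size()), done_(jobs_.size()), quit_(false)
    {
        for (std::size_t i = 0; i < jobs_.size(); ++i) {
            go_[i] = false;
            done_[i] = false;
        }
        for (std::size_t i = 0; i < jobs_.size(); ++i)
            threads_.emplace_back([this, i] { worker(i); });
    }
    FrameWorkers(const FrameWorkers&) = delete;
    FrameWorkers& operator=(const FrameWorkers&) = delete;

    ~FrameWorkers()
    {
        quit_ = true;                          // idle workers notice this and exit
        for (auto& t : threads_) t.join();
    }

    // Run all jobs in parallel and return once every one of them has finished.
    void run_frame()
    {
        for (auto& d : done_) d = false;
        for (auto& g : go_) g = true;          // release the workers
        for (auto& d : done_)
            while (!d) std::this_thread::yield();  // replaces join() for this frame
    }

private:
    void worker(std::size_t i)
    {
        assert(i < jobs_.size());
        for (;;) {
            while (!go_[i]) {                  // wait for the next frame
                if (quit_) return;
                std::this_thread::yield();
            }
            go_[i] = false;
            jobs_[i]();
            done_[i] = true;                   // report completion to run_frame()
        }
    }

    std::vector<std::function<void()>> jobs_;
    std::vector<std::atomic<bool>> go_, done_;
    std::atomic<bool> quit_;
    std::vector<std::thread> threads_;
};
```

In the loop from the question, the four std::thread constructions and joins would be replaced by a single workers.run_frame() call on an object created once before the loop. Spinning keeps latency low but burns CPU while waiting; if that matters, a condition variable could take the place of the flags.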
Below is a small producer/consumer example:
#include <cassert>
#include <iostream>
#include <boost/asio/use_awaitable.hpp>
#include <boost/system/detail/generic_category.hpp>
#include <boost/asio/experimental/channel.hpp>
#include <boost/asio/experimental/awaitable_operators.hpp>
#include <boost/asio.hpp>
#include <boost/asio/experimental/as_tuple.hpp>
using namespace boost::asio::experimental::awaitable_operators;
template<typename T>
struct Channel : public boost::asio::experimental::channel<void(boost::system::error_code, T)> {
using boost::asio::experimental::channel<void(boost::system::error_code, T)>::channel;};
boost::asio::awaitable<void> consumer(Channel<int>& ch1, Channel<int>& ch2,
int nexpected) {
int num = 0;
for (;;) {
auto [order, ex0, r0, ex1, r1] = co_await boost::asio::experimental::make_parallel_group(
[&ch1](auto token) {
return ch1.async_receive(std::move(token));
},
[&ch2](auto token) {
return ch2.async_receive(std::move(token));
}
).async_wait(
boost::asio::experimental::wait_for_one{},
boost::asio::use_awaitable);
std::cout << "num = " << num << std::endl;
num++;
if (num == nexpected) {
std::cout << "consumer is all done" << std::endl;
break;
}
}
assert(num == nexpected && "sent must be equal received");
}
boost::asio::awaitable<void> producer(Channel<int>& ch, int const n, int const id) {
for (int i=0; i<n; i++) {
auto value = id == 1 ? i : -i;
std::cout << "producer " << id << ": sending " << value << std::endl;
auto const [ec] = co_await ch.async_send(
boost::system::error_code{},
value,
boost::asio::experimental::as_tuple(boost::asio::use_awaitable));
if (ec) std::cout << "error!" << std::endl;
}
co_return;
}
void test0() {
auto ctx = boost::asio::io_context{};
std::size_t const n = 10;
auto ch1 = Channel<int>{ctx, 10};
auto ch2 = Channel<int>{ctx, 10};
boost::asio::co_spawn(
ctx,
producer(ch2, 100, 2),
boost::asio::detached
);
boost::asio::co_spawn(
ctx,
producer(ch1, 100, 1),
boost::asio::detached
);
boost::asio::co_spawn(
ctx,
consumer(ch1, ch2, 200),
boost::asio::detached
);
ctx.run();
}
int main() {
test0();
return 0;
}
In short, there are two Boost.Asio experimental channels, two producers and one consumer. The consumer reads from either one of the channels. I'm using make_parallel_group with wait_for_one, which waits for one operation and cancels the others.
When running the program, I observe that one async_receive completes, the other is cancelled, and the async_send is somehow cancelled too, without an error code stating that it was cancelled. As a result the consumer sees only 100 values; I expect to see all 200.
Questions:
I'm expecting per-operation cancellation here: cancelling an async_receive should not force cancellation of an async_send.
Looking at the source code of parallel_group (detail namespace), I do not see calls to bind_cancellation_slot on a per-operation basis. Am I missing something?
Thanks
In my program I need to start 2 pieces of external hardware.
This is somewhat time consuming and I therefore want to run it in separate threads.
The start-up has two parts. The second part, hardwareTask2(), must be performed ca. simultaneously on both threads.
I therefore want to use a std::barrier to synchronize before calling this method.
However, the first part of the start-up, hardwareTask1() may fail.
If it fails on either thread I want both threads to return.
How do I achieve this?
Using std::barrier::arrive_and_drop() below I have managed to at least get the other thread to finish (not wait indefinitely at the barrier).
#include <barrier>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <thread>
bool hardwareTask1(unsigned int id) {
srand(id);
int r = rand() % 10;
std::this_thread::sleep_for(std::chrono::seconds(r));
return true;
}
// must be called ca. simultaneously:
void hardwareTask2() {
std::this_thread::sleep_for(std::chrono::seconds(5));
}
void startHardware(unsigned int id, std::barrier<>& b) {
bool ok = hardwareTask1(id);
// Simulate that the above function failed for the first thread:
if (id == 1) {
ok = false;
}
if (!ok) {
b.arrive_and_drop();
return;
}
std::cerr << id << ": finished task1\n";
b.arrive_and_wait();
std::cerr << id << ": after barrier\n";
hardwareTask2();
}
int main()
{
std::barrier<> b(2);
std::thread t1(&startHardware, 1, std::ref(b));
std::thread t2(&startHardware, 2, std::ref(b));
t1.join();
t2.join();
std::cerr << "Both threads have finished.\n";
int k;
std::cin >> k;
}
As @UlrichEckhardt mentioned in his comment, you can use a std::future / std::promise pair to do the synchronization. A small example:
#include <chrono>
#include <future>
#include <iostream>
#include <random>
#include <thread>
void hardwareTask1(std::promise<bool> p)
{
std::cout << "HW 1: First part\n";
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> distrib(1, 10);
std::this_thread::sleep_for(std::chrono::seconds(distrib(gen)));
int res = distrib(gen);
if (res > 5) {
p.set_value(true);
std::cout << "HW 1: Second part\n";
}
else {
p.set_value(false);
std::cout << "HW 1: First part failed - abort\n";
}
}
void hardwareTask2(std::future<bool> f)
{
std::cout << "HW 2: First part\n";
if (f.get())
std::cout << "HW 2: Second part\n";
else
std::cout << "HW 2: HW 1 failed - abort\n";
}
int main()
{
std::promise<bool> p;
std::thread t2(&hardwareTask2, p.get_future());
std::thread t1(&hardwareTask1, std::move(p));
t1.join();
t2.join();
}
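The example above only reports a failure from HW 1 to HW 2. If a failure on either side should stop both, one option is to give each thread its own promise and let it wait on the other's future. The pair of futures then also acts as the rendezvous before the second part. This is a sketch under that assumption; startOne and startBoth are hypothetical names, and the two first parts are passed in as callables returning bool.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: run this side's first part, publish the outcome through
// `mine`, wait for the other side's outcome through `theirs`, and run the
// second part only if both first parts succeeded.
template <typename Part1, typename Part2>
bool startOne(Part1 part1, Part2 part2,
              std::promise<bool> mine, std::future<bool> theirs)
{
    bool ok = false;
    try { ok = part1(); } catch (...) { ok = false; }  // an exception counts as failure
    mine.set_value(ok);                 // tell the other side how it went
    bool const otherOk = theirs.get();  // blocks until the other side has reported
    if (ok && otherOk) {
        part2();                        // both sides get here at about the same time
        return true;
    }
    return false;
}

// Hypothetical driver: returns true if the second part ran on both threads.
template <typename PartA, typename PartB, typename Part2>
bool startBoth(PartA partA, PartB partB, Part2 part2)
{
    std::promise<bool> promiseA, promiseB;
    std::future<bool> futureA = promiseA.get_future();
    std::future<bool> futureB = promiseB.get_future();
    bool ranA = false, ranB = false;
    std::thread a([&] { ranA = startOne(partA, part2, std::move(promiseA), std::move(futureB)); });
    std::thread b([&] { ranB = startOne(partB, part2, std::move(promiseB), std::move(futureA)); });
    a.join();
    b.join();
    assert(ranA == ranB);               // either both ran the second part or neither did
    return ranA;
}
```

Because every thread always sets its promise, neither side can be left waiting, which is the situation arrive_and_drop() was working around in the question. This needs only C++11, whereas std::barrier requires C++20.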
I am trying to run a function asynchronously. For this purpose I wrote a class called Core, in which I use std::async to run the function on a different thread and std::shared_future<int> to wait for that thread and possibly get its result. This is the code of the test program:
#include <atomic>
#include <chrono>
#include <future>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <thread>
class Core : public std::enable_shared_from_this<Core>
{
public:
Core()
: isRunning_(false) {
};
~Core() {
isRunning_ = false;
if (f_.valid())
{
f_.wait();
std::cout << "Result is: " << f_.get() << std::endl;
}
};
void Start() {
isRunning_ = true;
auto self(shared_from_this());
f_ = std::async(std::launch::async, [self, this]() {
try {
while (true) {
if (!isRunning_)
break;
std::cout << "Boom" << std::endl; // Error occurs here
std::this_thread::sleep_for(std::chrono::seconds(1));
}
}
catch (const std::exception& e) {
std::cerr << "Loop error:" << e.what();
}
return 999;
});
}
private:
std::shared_future<int> f_;
std::atomic<bool> isRunning_;
};
int main()
{
try {
std::shared_ptr<Core> load(new Core);
load->Start();
throw std::runtime_error("Generate error"); // Added in order to generate error
}
catch (const std::exception& e) {
std::cout << "Error occurred: " << e.what();
}
return 0;
}
Every time I start this program, it crashes at this line:
std::cout << "Boom" << std::endl; // Error occurs here
with this error:
This is the debugger error and call stack that I managed to capture while debugging:
It looks like the Core destructor is never called. Why does this happen?
Could you tell me where my mistake is? Thanks.
When the main thread returns from main(), it starts tearing down the environment before terminating the whole process. All the while, the background thread is accessing objects that are being destroyed or have already been destroyed.
I am not sure what you are trying to achieve, but you are doing something wrong:
Your lambda should execute some work and return as soon as it is done, i.e. it should never loop forever.
Your main thread should wait for your future to complete by calling std::future<T>::get().
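As far as I can tell, the destructor in the question never runs because the lambda captures self, a shared_ptr to the Core object, and the lambda is stored in the state owned by f_, which is itself a member of Core. The object therefore keeps itself alive while the loop is running. A minimal sketch of one way out follows, assuming the task captures only this and the owner stops and waits for it before it is destroyed. Worker, Stop and ticks are hypothetical names, not from the question.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical rework of Core: the background loop does not own the object,
// so the destructor runs normally and can stop the loop and wait for it.
class Worker {
public:
    Worker() : running_(false), ticks_(0) {}
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    ~Worker() { Stop(); }               // never outlived by its own thread

    void Start()
    {
        assert(!result_.valid() && "Start() called twice");
        running_ = true;
        result_ = std::async(std::launch::async, [this] {
            while (running_) {          // ends as soon as Stop() clears the flag
                ++ticks_;
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
            }
            return 999;
        });
    }

    // Ask the loop to end and wait for it. Returns the task's result,
    // or -1 if no task is running.
    int Stop()
    {
        running_ = false;
        return result_.valid() ? result_.get() : -1;
    }

    int ticks() const { return ticks_; }

private:
    std::atomic<bool> running_;
    std::atomic<int> ticks_;
    std::future<int> result_;
};
```

With this shape, an exception thrown in main() after Start() unwinds the Worker object, its destructor stops the loop and waits for the task, and only then does main() return.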
In this question I described a boost::asio and boost::coroutine usage pattern that causes random crashes in my application, and I published an extract of my code together with the valgrind and GDB output.
To investigate the problem further I created a smaller proof-of-concept application that applies the same pattern. The same problem arises in the smaller program, whose source I publish here.
The code starts a few threads and creates a connection pool with a few dummy connections (both counts are supplied by the user). The additional arguments are unsigned integers that play the role of fake requests. The dummy implementation of the sendRequest function just starts an asynchronous timer that waits for a number of seconds equal to the input number and yields from the function.
Can someone see the problem with this code and propose a fix for it?
#include "asiocoroutineutils.h"
#include "concurrentqueue.h"
#include <iostream>
#include <thread>
#include <boost/lexical_cast.hpp>
using namespace std;
using namespace boost;
using namespace utils;
#define id this_thread::get_id() << ": "
// ---------------------------------------------------------------------------
/*!
* \brief This is a fake Connection class
*/
class Connection
{
public:
Connection(unsigned connectionId)
: _id(connectionId)
{
}
unsigned getId() const
{
return _id;
}
void sendRequest(asio::io_service& ioService,
unsigned seconds,
AsioCoroutineJoinerProxy,
asio::yield_context yield)
{
cout << id << "Connection " << getId()
<< " Start sending: " << seconds << endl;
// waiting on this timer is a placeholder for any asynchronous operation
asio::steady_timer timer(ioService);
timer.expires_from_now(chrono::seconds(seconds));
coroutineAsyncWait(timer, yield);
cout << id << "Connection " << getId()
<< " Received response: " << seconds << endl;
}
private:
unsigned _id;
};
typedef std::unique_ptr<Connection> ConnectionPtr;
typedef std::shared_ptr<asio::steady_timer> TimerPtr;
// ---------------------------------------------------------------------------
class ConnectionPool
{
public:
ConnectionPool(size_t connectionsCount)
{
for(size_t i = 0; i < connectionsCount; ++i)
{
cout << "Creating connection: " << i << endl;
_connections.emplace_back(new Connection(i));
}
}
ConnectionPtr getConnection(TimerPtr timer,
asio::yield_context& yield)
{
lock_guard<mutex> lock(_mutex);
while(_connections.empty())
{
cout << id << "There is no free connection." << endl;
_timers.emplace_back(timer);
timer->expires_from_now(
asio::steady_timer::clock_type::duration::max());
_mutex.unlock();
coroutineAsyncWait(*timer, yield);
_mutex.lock();
cout << id << "Connection was freed." << endl;
}
cout << id << "Getting connection: "
<< _connections.front()->getId() << endl;
ConnectionPtr connection = std::move(_connections.front());
_connections.pop_front();
return connection;
}
void addConnection(ConnectionPtr connection)
{
lock_guard<mutex> lock(_mutex);
cout << id << "Returning connection " << connection->getId()
<< " to the pool." << endl;
_connections.emplace_back(std::move(connection));
if(_timers.empty())
return;
auto timer = _timers.back();
_timers.pop_back();
auto& ioService = timer->get_io_service();
ioService.post([timer]()
{
cout << id << "Wake up waiting getConnection." << endl;
timer->cancel();
});
}
private:
mutex _mutex;
deque<ConnectionPtr> _connections;
deque<TimerPtr> _timers;
};
typedef unique_ptr<ConnectionPool> ConnectionPoolPtr;
// ---------------------------------------------------------------------------
class ScopedConnection
{
public:
ScopedConnection(ConnectionPool& pool,
asio::io_service& ioService,
asio::yield_context& yield)
: _pool(pool)
{
auto timer = make_shared<asio::steady_timer>(ioService);
_connection = _pool.getConnection(timer, yield);
}
Connection& get()
{
return *_connection;
}
~ScopedConnection()
{
_pool.addConnection(std::move(_connection));
}
private:
ConnectionPool& _pool;
ConnectionPtr _connection;
};
// ---------------------------------------------------------------------------
void sendRequest(asio::io_service& ioService,
ConnectionPool& pool,
unsigned seconds,
asio::yield_context yield)
{
cout << id << "Constructing request ..." << endl;
AsioCoroutineJoiner joiner(ioService);
ScopedConnection connection(pool, ioService, yield);
asio::spawn(ioService, bind(&Connection::sendRequest,
connection.get(),
std::ref(ioService),
seconds,
AsioCoroutineJoinerProxy(joiner),
placeholders::_1));
joiner.join(yield);
cout << id << "Processing response ..." << endl;
}
// ---------------------------------------------------------------------------
void threadFunc(ConnectionPool& pool,
ConcurrentQueue<unsigned>& requests)
{
try
{
asio::io_service ioService;
while(true)
{
unsigned request;
if(!requests.tryPop(request))
break;
cout << id << "Scheduling request: " << request << endl;
asio::spawn(ioService, bind(sendRequest,
std::ref(ioService),
std::ref(pool),
request,
placeholders::_1));
}
ioService.run();
}
catch(const std::exception& e)
{
cerr << id << "Error: " << e.what() << endl;
}
}
// ---------------------------------------------------------------------------
int main(int argc, char* argv[])
{
if(argc < 3)
{
cout << "Usage: ./async_request poolSize threadsCount r0 r1 ..."
<< endl;
return -1;
}
try
{
auto poolSize = lexical_cast<size_t>(argv[1]);
auto threadsCount = lexical_cast<size_t>(argv[2]);
ConcurrentQueue<unsigned> requests;
for(int i = 3; i < argc; ++i)
{
auto request = lexical_cast<unsigned>(argv[i]);
requests.tryPush(request);
}
ConnectionPoolPtr pool(new ConnectionPool(poolSize));
vector<unique_ptr<thread>> threads;
for(size_t i = 0; i < threadsCount; ++i)
{
threads.emplace_back(
new thread(threadFunc, std::ref(*pool), std::ref(requests)));
}
for_each(threads.begin(), threads.end(), mem_fn(&thread::join));
}
catch(const std::exception& e)
{
cerr << "Error: " << e.what() << endl;
}
return 0;
}
Here are some helper utilities used by the above code:
#pragma once
#include <boost/asio/steady_timer.hpp>
#include <boost/asio/spawn.hpp>
namespace utils
{
inline void coroutineAsyncWait(boost::asio::steady_timer& timer,
boost::asio::yield_context& yield)
{
boost::system::error_code ec;
timer.async_wait(yield[ec]);
if(ec && ec != boost::asio::error::operation_aborted)
throw std::runtime_error(ec.message());
}
class AsioCoroutineJoiner
{
public:
explicit AsioCoroutineJoiner(boost::asio::io_service& io)
: _timer(io), _count(0) {}
void join(boost::asio::yield_context yield)
{
assert(_count > 0);
_timer.expires_from_now(
boost::asio::steady_timer::clock_type::duration::max());
coroutineAsyncWait(_timer, yield);
}
void inc()
{
++_count;
}
void dec()
{
assert(_count > 0);
--_count;
if(0 == _count)
_timer.cancel();
}
private:
boost::asio::steady_timer _timer;
std::size_t _count;
}; // AsioCoroutineJoiner class
class AsioCoroutineJoinerProxy
{
public:
AsioCoroutineJoinerProxy(AsioCoroutineJoiner& joiner)
: _joiner(joiner)
{
_joiner.inc();
}
AsioCoroutineJoinerProxy(const AsioCoroutineJoinerProxy& joinerProxy)
: _joiner(joinerProxy._joiner)
{
_joiner.inc();
}
~AsioCoroutineJoinerProxy()
{
_joiner.dec();
}
private:
AsioCoroutineJoiner& _joiner;
}; // AsioCoroutineJoinerProxy class
} // utils namespace
For completeness, the last missing part is the ConcurrentQueue class. It is too long to paste here, but you can find it here.
Example usage of the application is:
./connectionpooltest 3 3 5 7 8 1 0 9 2 4 3 6
where the first number 3 is the count of fake connections and the second number 3 is the number of threads used. The numbers after them are fake requests.
The valgrind and GDB output is the same as in the question mentioned above.
The Boost version used is 1.57, the compiler is GCC 4.8.3, and the operating system is CentOS Linux release 7.1.1503.
It seems that all the valgrind errors occur because the BOOST_USE_VALGRIND macro is not defined, as Tanner Sansbury points out in a comment on this question. Apart from that, the program appears to be correct.