C++ synchronization and exception handling across threads

I am using the Boost library for threading and synchronization in my application.
First of all, I must say that exception handling across synchronized threads is a completely new topic for me.
In any case, below is pseudo code for what I want to achieve. I want the synchronized threads to throw the same exception that MAY have been thrown from the thread doing the notify. How can I achieve this?
I could not find any topics on Stack Overflow about exception throwing with cross-thread interaction using the Boost threading model.
Many thanks in advance!
// mutex and condition variable for the problem
mutable boost::mutex conditionMutex;
mutable boost::condition_variable condition;

inline void doTheThing() const {
    if (/* no one is doing the thing yet */) {
        try {
            doIt();
            // I succeeded
            failed = false;
            condition.notify_all();
        }
        catch (...) {
            // I failed to do it
            failed = true;
            condition.notify_all();
            throw;
        }
    }
    else {
        boost::mutex::scoped_lock lock(conditionMutex);
        condition.wait(lock);
        if (failed) {
            // throw the same exception that was thrown from
            // the thread doing notify_all
        }
    }
}

So you want the first thread that hits doTheThing() to call doIt(), and all subsequent threads that hit doTheThing() to wait for the first thread to finish calling doIt() before they proceed.
I think this should do the trick:
boost::mutex conditionMutex; // mutable qualifier not needed
bool failed = false;
bool done = false;

inline void doTheThing() const {
    boost::unique_lock<boost::mutex> uql(conditionMutex);
    if (!done) {
        done = true;
        try {
            doIt();
            failed = false;
        }
        catch (...) {
            failed = true;
            throw;
        }
    }
    else if (failed) {
        uql.unlock();
        // now this thread knows that another thread called doIt() and an exception
        // was thrown in that thread.
    }
}
Important notes:
Every thread that calls doTheThing() must take a lock. There is no way around this. You are synchronizing threads, and for a thread to know anything about what's happening in another thread, it must take a lock. (Or it can use atomic memory operations, but that's a more advanced technique.) The variables failed and done are protected by the conditionMutex.
C++ will call the destructor of uql when the function exits, whether it returns normally or throws an exception.
EDIT Oh, and as for throwing the exception to all the other threads, forget about that, it's almost impossible, and it isn't the way things are done in C++. Instead, each thread can check to see if the first thread successfully called doIt() in the place I've indicated above.
EDIT There is no built-in support for automatically propagating an exception to another thread. You can generalize the problem of propagating exceptions to another thread to that of passing messages to another thread. There are lots of library solutions to the problem of passing messages between threads (e.g. boost::asio::io_service::post()), and you could pass a message that contains the exception, with instructions to throw that exception on receipt of the message. It's a bad idea, though. Only throw exceptions when you have an error that prevents you from unwinding the call stack by ordinary function return. That's what an exception is: an alternative way to return from a function when returning the usual way doesn't make sense.
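For completeness, here is a minimal sketch of the flag-plus-exception-pointer idea applied to the original question, built on Boost.Exception's boost::exception_ptr. This is an assumption layered on top of the answer, not part of it; the started/finished/error names and the doIt() declaration are made up for illustration. The first thread captures whatever it caught, and the waiting threads rethrow the same exception after being notified:
#include <boost/thread.hpp>
#include <boost/exception_ptr.hpp>

void doIt();  // the work in question (assumed to be defined elsewhere)

boost::mutex conditionMutex;
boost::condition_variable condition;
bool started = false;
bool finished = false;
boost::exception_ptr error;  // empty means doIt() succeeded

void doTheThing() {
    boost::unique_lock<boost::mutex> lock(conditionMutex);
    if (!started) {
        started = true;
        lock.unlock();                            // don't hold the lock while doing the work
        boost::exception_ptr caught;
        try {
            doIt();
        }
        catch (...) {
            caught = boost::current_exception();  // capture the in-flight exception
        }
        lock.lock();
        error = caught;
        finished = true;
        condition.notify_all();
        if (error) {
            boost::rethrow_exception(error);      // the doer also reports the failure
        }
    }
    else {
        while (!finished) {
            condition.wait(lock);                 // loop guards against spurious wakeups
        }
        if (error) {
            boost::rethrow_exception(error);      // waiters see the same exception
        }
    }
}
Note that boost::current_exception() can only make an exact copy of the exception if it was thrown through boost::enable_current_exception() or BOOST_THROW_EXCEPTION; otherwise it may capture a boost::unknown_exception instead.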

Related

Boost asio io_service run exception and restart from multiple threads

I was wondering what happens when an exception is thrown inside a handler function when using boost asio's run() function on the io_context from multiple threads. My thread function which calls the run operation on the io_context looks like this:
while (!io->stopped() && *stop == false) {
    try {
        auto cnt = io->run();
    } catch (std::exception &e) {
    }
    if (io->stopped()) {
        break;
    }
}
The number of threads is 1..N. The documentation states that any subsequent call to run() must be preceded by a call to restart(), but restart() must not be called while there are still active calls to run(), which I can't know, because there may still be threads calling run().
What is the solution for this when there is just one io_context and many threads calling run()?
You're right that you need to catch/handle exceptions.
The solution to your conundrum is quite simple: run() should actually not be called in a loop. This helps because it lets you realize that once run() has exited normally, you can always just break;.
More generally, you could also inspect the return value of the run*/poll* family of member functions: if it returns zero, the service has run out of work.
See also Should the exception thrown by boost::asio::io_service::run() be caught?
The updated documentation is (mutatis mutandis) identical: https://www.boost.org/doc/libs/1_77_0/doc/html/boost_asio/reference/io_service.html#boost_asio.reference.io_service.effect_of_exceptions_thrown_from_handlers.
A normal exit from run requires a call to restart in order for subsequent calls to run to not return immediately. Throwing an exception is not a normal exit so it is fine to call run again immediately.
Your code could just be:
while (true) {
    try {
        auto cnt = io->run();
        break;
    } catch (std::exception &e) {
    }
}

How to register a thread-exit-handler in C++?

When I create a new thread I want to wait until the new thread reaches a specific point. Until now, I have solved this with a promise that is passed to the thread while the creating function waits for future.get(). If the initialization of the new thread fails, I set an exception to the promise so that future.get() will also throw an exception.
This looks something like this:
boost::promise<bool> promiseThreadStart;

void threadEntry();

int main(int argc, char* argv[])
{
    // Prepare promise/future
    promiseThreadStart = boost::promise<bool>();
    boost::unique_future<bool> futureThreadStarted = promiseThreadStart.get_future();

    // Start thread
    boost::thread threadInstance = boost::thread(threadEntry);

    // Wait for the thread to successfully initialize or to fail
    try {
        bool threadStarted = futureThreadStarted.get();
        // Started successfully
    }
    catch (std::exception &e) {
        // Starting the thread failed
    }
    return 0;
}

void threadEntry() {
    // Do some preparations that could possibly fail
    if (initializationFailed) {
        // set_exception expects an exception_ptr, so wrap the exception object
        promiseThreadStart.set_exception(boost::copy_exception(std::runtime_error("Could not start thread.")));
        return;
    }
    // Thread initialized successfully
    promiseThreadStart.set_value(true);

    // Do the actual work of the thread
}
What upsets me here is the fact that the thread could fail during its initialization phase with an error that I do not handle. Then I would never set the proper exception on the promise, and the main function would wait forever for future.get() to return. With this in mind, my solution seems quite error-prone and badly designed.
I have learned about RAII and how it gives you exception safety because you can clean up in the destructor. I would like to apply a similar pattern to the situation mentioned above. Therefore, I am wondering whether there is something like a thread destructor or exit handler where I could set a default exception on the promise. But anyway, using this promise/future design seems to me like a dirty workaround. So, what's the best and most elegant way to achieve exception-safe waiting?
Finally, I stumbled upon an answer: it can be done with a std::condition_variable, where the waiting thread(s) are notified via std::notify_all_at_thread_exit if the newly created thread terminates prematurely.
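Here is a minimal sketch of that idea (the flag names and the simulated failure are assumptions, not code from the question). The key semantics of std::notify_all_at_thread_exit are that the passed-in lock stays held by the worker until the thread exits, at which point the mutex is released and the condition variable is notified, no matter how the thread exits:
#include <condition_variable>
#include <mutex>
#include <thread>
#include <stdexcept>
#include <iostream>

std::mutex m;
std::condition_variable cv;
bool initialized = false;   // set by the worker only if initialization succeeded
bool workerDone = false;    // set by the worker just before it exits

void threadEntry() {
    std::unique_lock<std::mutex> lock(m);
    // From here on, cv is notified (and m released) when this thread exits,
    // whether it returns normally or bails out early.
    std::notify_all_at_thread_exit(cv, std::move(lock));
    try {
        // Do some preparations that could possibly fail
        bool initializationFailed = true;           // simulate a failure
        if (initializationFailed) {
            throw std::runtime_error("Could not start thread.");
        }
        initialized = true;    // m is still held by this thread, so this write is protected
    }
    catch (...) {
        // initialized stays false; the waiter is still released at thread exit
    }
    workerDone = true;
}

int main() {
    std::thread threadInstance(threadEntry);
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return workerDone; });   // predicate guards against spurious wakeups
        std::cout << (initialized ? "thread initialized\n" : "thread failed to initialize\n");
    }
    threadInstance.join();
    return 0;
}
Because the waiter is only released when the worker exits, this covers the premature-termination case the question worries about; if the thread goes on to do long-running work after initialization, you would still combine it with the promise/future from the question so the main thread is not held until the very end.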

Multithreaded c++11-ish queue fails on windows

I'm not that into multi-threading, so I appreciate any advice.
In my server, which is written in a producer-consumer multi-threaded style, the queue is wrapped together with its mutex and condition variable:
template <typename Event>
struct EventsHandle {
  public: // methods:
    Event*
    getEvent()
    {
        std::unique_lock<std::mutex> lock {mutex};
        while (events.empty()) {
            condition_variable.wait(lock);
        }
        return events.front();
    };

    void
    setEvent(Event* event)
    {
        std::lock_guard<std::mutex> lock {mutex};
        events.push(event);
        condition_variable.notify_one();
    };

    void
    pop()
    { events.pop(); };

  private: // fields:
    std::queue<Event*> events;
    std::mutex mutex;
    std::condition_variable condition_variable;
};
and here is how it is used in the consumer thread:
void
Server::listenEvents()
{
    while (true) {
        processEvent(events_handle.getEvent());
        events_handle.pop();
    }
};
and in a producer:
parse input, whatever else
...
setEvent(new Event {ERASE_CLIENT, getSocket(), nullptr});
...
void
Client::setEvent(Event* event)
{
    if (event) {
        events_handle->setEvent(event);
    }
};
The code works on Linux but, for reasons I don't understand, fails on Windows with MSVC 2013.
At some point an exception is thrown with this dialog:
"Unhandled exception at 0x59432564 (msvcp120d.dll) in Server.exe: 0xC0000005: Access violation reading location 0xCDCDCDE1".
Debugging shows that the exception is thrown on the line std::lock_guard<std::mutex> lock(mutex) in the setEvent() function.
A little googling led me to these articles:
http://www.justsoftwaresolutions.co.uk/threading/implementing-a-thread-safe-queue-using-condition-variables.html
http://www.codeproject.com/Articles/598695/Cplusplus-threads-locks-and-condition-variables
I tried to follow them, but it didn't help much. So, after this long wall of text, my question is:
what's wrong with the code, mutex?
UPDATE
So... finally, after a lovely game of 'comment that line out', it turned out that the problem was in memory management: the Event destructor caused the failure.
In conclusion, it's better to use std::unique_ptr<> as Jarod42 mentions, or value semantics as juanchopanza advises, when passing objects back and forth.
And use libraries whenever possible, don't reinvent the wheel =)
You have a data race due to a missing mutex in EventsHandle::pop.
Your producer thread may push items to the queue by calling setEvent(), and get preempted while executing the line events.push(event). Now a consumer thread can execute events.pop() concurrently. You end up with two unsynchronized writes on the queue, which is undefined behavior.
Also note that if you have more than one consumer, you need to ensure that the element you pop is the same one you retrieved earlier from getEvent, in case one consumer gets preempted by another between the two calls. This is very hard to achieve with two separate member functions that are synchronized by a mutex that is a member of the class. The usual approach here is to offer a single getEventAndPop() function instead, which keeps the lock throughout the operation, and to get rid of the separate functions you currently have. This might seem like an absurd restriction at first sight, but multithreaded code has to play by different rules.
You might also want to change the setEvent method to not notify while the mutex is still locked. It depends on the scheduler but the thread waiting to be notified might wake up immediately just to wait for the mutex.
void
setEvent(Event* event)
{
    {
        std::lock_guard<std::mutex> lock{mutex};
        events.push(event);
    }
    condition_variable.notify_one();
};
In my opinion, pop should be integrated into getEvent().
This will prevent multiple threads from getting the same event (if pop is not part of getEvent, two threads could end up with the same one).
Event*
getEvent()
{
    std::unique_lock<std::mutex> lock {mutex};
    while (events.empty()) {
        condition_variable.wait(lock);
    }
    Event* front = events.front();
    events.pop();
    return front;
};

Actor calculation model using boost::thread

I'm trying to implement the actor computation model on top of threads in C++ using boost::thread.
But the program throws a weird exception during execution. The exception isn't reproducible every time; sometimes the program works correctly.
Here is my code:
actor.hpp
class Actor {
  public:
    typedef boost::function<int()> Job;

  private:
    std::queue<Job> d_jobQueue;
    boost::mutex d_jobQueueMutex;
    boost::condition_variable d_hasJob;
    boost::atomic<bool> d_keepWorkerRunning;
    boost::thread d_worker;

    void workerThread();

  public:
    Actor();
    virtual ~Actor();

    void execJobAsync(const Job& job);
    int execJobSync(const Job& job);
};
actor.cpp
namespace {

int executeJobSync(std::string *error,
                   boost::promise<int> *promise,
                   const Actor::Job *job)
{
    int rc = (*job)();
    promise->set_value(rc);
    return 0;
}

}  // namespace

void Actor::workerThread()
{
    while (d_keepWorkerRunning) try {
        Job job;
        {
            boost::unique_lock<boost::mutex> g(d_jobQueueMutex);
            while (d_jobQueue.empty()) {
                d_hasJob.wait(g);
            }
            job = d_jobQueue.front();
            d_jobQueue.pop();
        }
        job();
    }
    catch (...) {
        // Log error
    }
}

void Actor::execJobAsync(const Job& job)
{
    boost::mutex::scoped_lock g(d_jobQueueMutex);
    d_jobQueue.push(job);
    d_hasJob.notify_one();
}

int Actor::execJobSync(const Job& job)
{
    std::string error;
    boost::promise<int> promise;
    boost::unique_future<int> future = promise.get_future();
    {
        boost::mutex::scoped_lock g(d_jobQueueMutex);
        d_jobQueue.push(boost::bind(executeJobSync, &error, &promise, &job));
        d_hasJob.notify_one();
    }
    int rc = future.get();
    if (rc) {
        ErrorUtil::setLastError(rc, error.c_str());
    }
    return rc;
}

Actor::Actor()
: d_keepWorkerRunning(true)
, d_worker(&Actor::workerThread, this)
{
}

Actor::~Actor()
{
    d_keepWorkerRunning = false;
    {
        boost::mutex::scoped_lock g(d_jobQueueMutex);
        d_hasJob.notify_one();
    }
    d_worker.join();
}
Actually, the exception that is thrown is boost::thread_interrupted, on the line int rc = future.get();. But from the Boost docs I can't see a reason for this exception. The docs say:
Throws: - boost::thread_interrupted if the result associated with *this is not ready at the point of the call, and the current thread is interrupted.
But my worker thread can't be in an interrupted state.
When I use gdb with "catch throw" set, I see that the backtrace looks like:
throw thread_interrupted
boost::detail::interruption_checker::check_for_interruption
boost::detail::interruption_checker::interruption_checker
boost::condition_variable::wait
boost::detail::future_object_base::wait_internal
boost::detail::future_object_base::wait
boost::detail::future_object::get
boost::unique_future::get
I looked into the Boost sources but can't figure out why interruption_checker decided that the worker thread was interrupted.
So, C++ gurus, please help me. What do I need to do to get correct code?
I'm using:
boost 1_53
Linux version 2.6.18-194.32.1.el5 Red Hat 4.1.2-48
gcc 4.7
EDIT
Fixed it! Thanks to Evgeny Panasyuk and Lazin. The problem was in TLS
management. boost::thread and boost::thread_specific_ptr use the same TLS
storage for their purposes. In my case there was a problem when they both
tried to change this storage on creation (unfortunately I didn't work out
the details of why this happens), so the TLS became corrupted.
I replaced boost::thread_specific_ptr in my code with a __thread-specified
variable.
Off topic: during debugging I found memory corruption in an external library
and fixed it =)
EDIT 2
I found the exact problem... It is a bug in GCC =)
The _GLIBCXX_DEBUG compilation flag breaks ABI.
You can see the discussion on the Boost bug tracker:
https://svn.boost.org/trac/boost/ticket/7666
I have found several bugs:
The Actor::workerThread function does a double unlock on d_jobQueueMutex. The first unlock is the manual d_jobQueueMutex.unlock();, the second is in the destructor of boost::unique_lock<boost::mutex>.
You should prevent one of the unlocks, for example by releasing the association between the unique_lock and the mutex:
g.release(); // <------------ PATCH
d_jobQueueMutex.unlock();
Or add an additional code block plus a default-constructed Job.
It is possible that workerThread will never leave the following loop:
    while (d_jobQueue.empty()) {
        d_hasJob.wait(g);
    }
Imagine the following case: d_jobQueue is empty, Actor::~Actor() is called, it sets the flag and notifies the worker thread:
    d_keepWorkerRunning = false;
    d_hasJob.notify_one();
workerThread wakes up in the while loop, sees that the queue is empty, and sleeps again.
It is common practice to send a special final job to stop the worker thread:
~Actor()
{
    execJobSync([this]()->int
    {
        d_keepWorkerRunning = false;
        return 0;
    });
    d_worker.join();
}
In this case, d_keepWorkerRunning is not required to be atomic.
LIVE DEMO on Coliru
EDIT:
I have added the event queue code into your example.
You have a concurrent queue in both EventQueueImpl and Actor, but for different types. It is possible to extract the common part into a separate entity, concurrent_queue<T>, which works for any type. It would be much easier to debug and test the queue in one place than to catch bugs scattered over different classes.
So, you can try to use this concurrent_queue<T> (on Coliru).
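For reference, a minimal sketch of what such a concurrent_queue<T> could look like (this is an assumption based on the description above, not the code behind the Coliru link); it keeps the lock across front()+pop() so two consumers can never receive the same element, and it notifies outside the lock:
#include <boost/thread.hpp>
#include <queue>

template <typename T>
class concurrent_queue {
  public:
    void push(const T& value)
    {
        {
            boost::lock_guard<boost::mutex> lock(d_mutex);
            d_queue.push(value);
        }
        d_notEmpty.notify_one();   // notify outside the lock
    }

    // Blocks until an element is available, then removes and returns it,
    // all under one lock, so two consumers can never receive the same element.
    T wait_and_pop()
    {
        boost::unique_lock<boost::mutex> lock(d_mutex);
        while (d_queue.empty()) {
            d_notEmpty.wait(lock);
        }
        T value = d_queue.front();
        d_queue.pop();
        return value;
    }

  private:
    std::queue<T> d_queue;
    boost::mutex d_mutex;
    boost::condition_variable d_notEmpty;
};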
This is just a guess. I think that some code may actually be calling boost::thread::interrupt(). You can set a breakpoint on this function and see what code is responsible. You can test for interruption in execJobSync:
int Actor::execJobSync(const Job& job)
{
    if (boost::this_thread::interruption_requested())
        std::cout << "Interruption requested!" << std::endl;

    std::string error;
    boost::promise<int> promise;
    boost::unique_future<int> future = promise.get_future();
The most suspicious code in this case is code that holds a reference to the thread object.
It is good practice to make your boost::thread code interruption-aware anyway. It is also possible to disable interruption for some scope.
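As a reference point, disabling interruption for a scope looks like this with Boost.Thread's RAII helpers (a minimal sketch, not code from the question):
#include <boost/thread.hpp>

void uninterruptibleSection()
{
    // While 'di' is alive, interruption is disabled for the current thread:
    // interruption points such as condition_variable::wait() will not throw
    // boost::thread_interrupted.
    boost::this_thread::disable_interruption di;

    // ... work that must not be interrupted ...

    // If needed, interruption can be re-enabled temporarily inside this scope:
    // boost::this_thread::restore_interruption ri(di);
}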
If no code is requesting interruption, you need to check code that works with thread-local storage, because the thread interruption flag is stored in TLS. Maybe some of your code overwrites it. You can check for interruption before and after such a code fragment.
If no code is calling boost::thread::interrupt() and you don't work with TLS, another possibility is that your memory is corrupted. This is the hardest case; try to use a dynamic analyzer such as Valgrind or a Clang sanitizer.
Off topic:
You probably need to use a proper concurrent queue. std::queue will be very slow because of high memory contention and you will end up with poor cache performance. A good concurrent queue allows your code to enqueue and dequeue elements in parallel.
Also, an actor is not something that is supposed to execute arbitrary code. An actor's queue should receive simple messages, not functions! You're writing a job queue :) You should take a look at an actor system like Akka or libcppa.

How can I propagate exceptions between threads?

We have a function which a single thread calls into (we name this the main thread). Within the body of the function we spawn multiple worker threads to do CPU intensive work, wait for all threads to finish, then return the result on the main thread.
The result is that the caller can use the function naively, and internally it'll make use of multiple cores.
All good so far..
The problem we have is dealing with exceptions. We don't want exceptions on the worker threads to crash the application. We want the caller to the function to be able to catch them on the main thread. We must catch exceptions on the worker threads and propagate them across to the main thread to have them continue unwinding from there.
How can we do this?
The best I can think of is:
Catch a whole variety of exceptions on our worker threads (std::exception and a few of our own ones).
Record the type and message of the exception.
Have a corresponding switch statement on the main thread which rethrows exceptions of whatever type was recorded on the worker thread.
This has the obvious disadvantage of only supporting a limited set of exception types, and would need modification whenever new exception types were added.
C++11 introduced the exception_ptr type, which allows exceptions to be transported between threads:
#include <iostream>
#include <thread>
#include <chrono>
#include <exception>
#include <stdexcept>

static std::exception_ptr teptr = nullptr;

void f()
{
    try
    {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        throw std::runtime_error("To be passed between threads");
    }
    catch(...)
    {
        teptr = std::current_exception();
    }
}

int main(int argc, char **argv)
{
    std::thread mythread(f);
    mythread.join();

    if (teptr) {
        try {
            std::rethrow_exception(teptr);
        }
        catch(const std::exception &ex)
        {
            std::cerr << "Thread exited with exception: " << ex.what() << "\n";
        }
    }

    return 0;
}
Because in your case you have multiple worker threads, you will need to keep one exception_ptr for each of them.
Note that exception_ptr is a shared ptr-like pointer, so you will need to keep at least one exception_ptr pointing to each exception or they will be released.
Microsoft specific: if you use SEH Exceptions (/EHa), the example code will also transport SEH exceptions like access violations, which may not be what you want.
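A minimal sketch of the "one exception_ptr per worker" idea mentioned above (the work() function and the thread count are made up for illustration):
#include <exception>
#include <iostream>
#include <stdexcept>
#include <thread>
#include <vector>

void work(int id)
{
    if (id == 2) {
        throw std::runtime_error("worker 2 failed");   // simulate a failure
    }
}

int main()
{
    const int N = 4;
    std::vector<std::exception_ptr> errors(N);         // one slot per worker
    std::vector<std::thread> threads;

    for (int i = 0; i < N; ++i) {
        threads.emplace_back([i, &errors] {
            try {
                work(i);
            } catch (...) {
                errors[i] = std::current_exception();   // each thread writes only its own slot
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    // Back on the main thread: rethrow (or otherwise report) each captured exception.
    for (const auto& e : errors) {
        if (e) {
            try {
                std::rethrow_exception(e);
            } catch (const std::exception& ex) {
                std::cerr << "worker failed: " << ex.what() << "\n";
            }
        }
    }
}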
Currently, the only portable way is to write catch clauses for all the types of exceptions that you might like to transfer between threads, store the information somewhere from that catch clause and then use it later to rethrow an exception. This is the approach taken by Boost.Exception.
In C++0x, you will be able to catch an exception with catch(...) and then store it in an instance of std::exception_ptr using std::current_exception(). You can then rethrow it later from the same or a different thread with std::rethrow_exception().
If you are using Microsoft Visual Studio 2005 or later, then the just::thread C++0x thread library supports std::exception_ptr. (Disclaimer: this is my product).
If you're using C++11, then std::future might do exactly what you're looking for: it can automagically trap exceptions that make it to the top of the worker thread, and pass them through to the parent thread at the point that std::future::get is called. (Behind the scenes, this happens exactly as in #AnthonyWilliams' answer; it's just been implemented for you already.)
The down side is that there's no standard way to "stop caring about" a std::future; even its destructor will simply block until the task is done. [EDIT, 2017: The blocking-destructor behavior is a misfeature only of the pseudo-futures returned from std::async, which you should never use anyway. Normal futures don't block in their destructor. But you still can't "cancel" tasks if you're using std::future: the promise-fulfilling task(s) will continue running behind the scenes even if nobody is listening for the answer anymore.] Here's a toy example that might clarify what I mean:
#include <atomic>
#include <chrono>
#include <exception>
#include <future>
#include <thread>
#include <vector>
#include <stdio.h>
bool is_prime(int n)
{
if (n == 1010) {
puts("is_prime(1010) throws an exception");
throw std::logic_error("1010");
}
/* We actually want this loop to run slowly, for demonstration purposes. */
std::this_thread::sleep_for(std::chrono::milliseconds(100));
for (int i=2; i < n; ++i) { if (n % i == 0) return false; }
return (n >= 2);
}
int worker()
{
static std::atomic<int> hundreds(0);
const int start = 100 * hundreds++;
const int end = start + 100;
int sum = 0;
for (int i=start; i < end; ++i) {
if (is_prime(i)) { printf("%d is prime\n", i); sum += i; }
}
return sum;
}
int spawn_workers(int N)
{
std::vector<std::future<int>> waitables;
for (int i=0; i < N; ++i) {
std::future<int> f = std::async(std::launch::async, worker);
waitables.emplace_back(std::move(f));
}
int sum = 0;
for (std::future<int> &f : waitables) {
sum += f.get(); /* may throw an exception */
}
return sum;
/* But watch out! When f.get() throws an exception, we still need
* to unwind the stack, which means destructing "waitables" and each
* of its elements. The destructor of each std::future will block
* as if calling this->wait(). So in fact this may not do what you
* really want. */
}
int main()
{
try {
int sum = spawn_workers(100);
printf("sum is %d\n", sum);
} catch (std::exception &e) {
/* This line will be printed after all the prime-number output. */
printf("Caught %s\n", e.what());
}
}
I just tried to write a work-alike example using std::thread and std::exception_ptr, but something's going wrong with std::exception_ptr (using libc++) so I haven't gotten it to actually work yet. :(
[EDIT, 2017:
#include <exception>
#include <new>
#include <thread>
#include <stdio.h>

int main() {
    std::exception_ptr e;
    std::thread t1([&e](){
        try {
            ::operator new(-1);
        } catch (...) {
            e = std::current_exception();
        }
    });
    t1.join();

    try {
        std::rethrow_exception(e);
    } catch (const std::bad_alloc&) {
        puts("Success!");
    }
}
I have no idea what I was doing wrong in 2013, but I'm sure it was my fault.]
Your problem is that you could receive multiple exceptions, from multiple threads, as each could fail, perhaps for different reasons.
I am assuming the main thread is somehow waiting for the threads to end in order to retrieve the results, or regularly checking the other threads' progress, and that access to shared data is synchronized.
Simple solution
The simple solution would be to catch all exceptions in each thread and record them in a shared variable (in the main thread).
Once all threads have finished, decide what to do with the exceptions. This means that all the other threads continued their processing, which is perhaps not what you want.
Complex solution
The more complex solution is to have each of your threads check, at strategic points of their execution, whether an exception was thrown from another thread.
If a thread throws an exception, it is caught before exiting the thread, the exception object is copied into some container in the main thread (as in the simple solution), and some shared boolean variable is set to true.
When another thread tests this boolean, it sees that execution is to be aborted and aborts in a graceful way.
When all threads have aborted, the main thread can handle the exception as needed.
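A minimal sketch of that complex solution (all names here are assumptions, not from the answer): the workers poll a shared stop flag, record any exception in a shared container, and the main thread decides what to do once everyone has stopped.
#include <atomic>
#include <exception>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

std::atomic<bool> abortRequested{false};        // tested by every worker at strategic points
std::mutex errorsMutex;
std::vector<std::exception_ptr> errors;         // exceptions collected for the main thread

void doChunkOfWork(int step)                    // stand-in for the real CPU-intensive work
{
    if (step == 123) {
        throw std::runtime_error("something went wrong");
    }
}

void worker()
{
    try {
        for (int step = 0; step < 1000; ++step) {
            if (abortRequested.load()) {
                return;                          // another thread failed: abort gracefully
            }
            doChunkOfWork(step);
        }
    }
    catch (...) {
        {
            std::lock_guard<std::mutex> lock(errorsMutex);
            errors.push_back(std::current_exception());   // record the failure
        }
        abortRequested.store(true);              // ask the other workers to stop
    }
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i) {
        threads.emplace_back(worker);
    }
    for (auto& t : threads) {
        t.join();
    }
    if (!errors.empty()) {
        try {
            std::rethrow_exception(errors.front());       // or inspect/report each one
        } catch (const std::exception& e) {
            std::cerr << "a worker failed: " << e.what() << "\n";
        }
    }
}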
An exception thrown from a thread will not be catchable in the parent thread. Threads have different contexts and stacks, and generally the parent thread is not required to stay there and wait for the children to finish, so that it could catch their exceptions. There is simply no place in code for that catch:
try
{
    start thread();
    wait_finish( thread );
}
catch(...)
{
    // will catch exceptions generated within start and wait,
    // but not from the thread itself
}
You will need to catch exceptions inside each thread and interpret exit status from threads in the main thread to re-throw any exceptions you might need.
BTW, in the absence of a catch in a thread, it is implementation-specific whether stack unwinding will be done at all, i.e. your automatic variables' destructors may not even be called before terminate is called. Some compilers do that, but it's not required.
Could you serialize the exception in the worker thread, transmit that back to the main thread, deserialize, and throw it again? I expect that for this to work the exceptions would all have to derive from the same class (or at least a small set of classes with the switch statement thing again). Also, I'm not sure that they would be serializable, I'm just thinking out loud.
There is, indeed, no good and generic way to transmit exceptions from one thread to the next.
If, as they should, all your exceptions derive from std::exception, then you can have a top-level general exception catch that will somehow send the exception to the main thread, where it will be thrown again. The problem is that you lose the throwing point of the exception. You can probably write compiler-dependent code to get this information and transmit it, though.
If not all your exceptions inherit from std::exception, then you are in trouble and have to write a lot of top-level catches in your thread... but the solution still holds.
You will need to do a generic catch for all exceptions in the worker (including non-std exceptions, like access violations), send a message from the worker thread (I suppose you have some kind of messaging in place?) to the controlling thread containing a live pointer to the exception, and rethrow there by creating a copy of the exception.
Then the worker can free the original object and exit.
See http://www.boost.org/doc/libs/release/libs/exception/doc/tutorial_exception_ptr.html. It is also possible to write a wrapper around whatever function you call to join a child thread, which automatically re-throws (using boost::rethrow_exception) any exception emitted by the child thread.
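A minimal sketch of such a wrapper (the class and function names are assumptions, not the Boost tutorial's code): the thread body captures anything it throws into a boost::exception_ptr, and the joining wrapper rethrows it on the parent thread.
#include <boost/exception_ptr.hpp>
#include <boost/function.hpp>
#include <boost/thread.hpp>

class ChildThread {
  public:
    explicit ChildThread(boost::function<void()> body)
    : d_error()
    , d_thread([this, body] {
          try {
              body();
          }
          catch (...) {
              d_error = boost::current_exception();   // capture for the parent
          }
      })
    {
    }

    // Joins the child and rethrows, on the calling thread, anything it threw.
    void joinAndRethrow()
    {
        d_thread.join();
        if (d_error) {
            boost::rethrow_exception(d_error);
        }
    }

  private:
    boost::exception_ptr d_error;   // must be declared (and thus constructed) before d_thread
    boost::thread        d_thread;
};
Usage would be along the lines of ChildThread t([]{ throw std::runtime_error("boom"); }); followed by t.joinAndRethrow() inside a try/catch on the parent thread. As noted above, the copied exception is only an exact copy if the child threw it via boost::enable_current_exception() or BOOST_THROW_EXCEPTION.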