What I want to do is invoke a method foo() with a timeout (say 1 minute). If it finishes in less than 1 minute, return the result. Otherwise, throw an exception. Here is the code:
//PRINT "START" IN THE LOG
auto m = std::make_shared<std::mutex>();
auto cv = std::make_shared<std::condition_variable>();
auto ready = std::make_shared<bool>(false);
auto response = std::make_shared<TResponse>();
auto exception = std::make_shared<FooException>();
exception->Code = ErrorCode::None;
std::thread([=]
{
std::unique_lock<std::mutex> lk(*m);
cv->wait(lk, [=]{ return *ready; });
try
{
//PRINT "PROCESS" IN THE LOG
auto r = foo();
*response = std::move(r);
}
catch(const FooException& e)
{
*exception = std::move(e);
}
lk.unlock();
cv->notify_one();
}).detach();
std::unique_lock<std::mutex> lk(*m);
*ready = true;
cv->notify_one();
auto status = cv->wait_for(lk, std::chrono::seconds(60));
if (status == std::cv_status::timeout)
{
//PRINT "TIMEOUT" IN THE LOG
//throw timeout exception
}
else
{
//PRINT "FINISH" IN THE LOG
if (exception->Code == ErrorCode::None)
{
return *response;
}
else
{
throw *exception;
}
}
As you can see, I added START/PROCESS/FINISH/TIMEOUT logs to the code. Normally, when this method is executed, I see a START/PROCESS/FINISH or START/PROCESS/TIMEOUT pattern in the logs. However, sometimes the logs show only START/PROCESS, without any FINISH/TIMEOUT. I thought cv->wait_for should block the current thread for 60 seconds at most, and then exit with either TIMEOUT or FINISH.
The foo() method performs disk I/O on network drives, which sometimes hangs for more than 1 hour (the reason is not related to this question, and it can't be resolved now). When I replaced foo with a thread sleep, everything worked as expected. What's wrong with this code and how can I improve it?
Because you have no predicate in the cv->wait_for call, the thread might be unblocked spuriously. However, it is strange that no FINISH/TIMEOUT is printed. So we might need more information here: What does happen with the program? Does it hang, does it throw, does it just exit, does it print in the line after cv->wait_for?
You could try using std::async and see if the same behavior appears (furthermore, it would greatly simplify your code):
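// Request a new thread explicitly; with the default policy the call may be deferred and wait_for would never report ready.
std::future<int> res = std::async(std::launch::async, foo);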
std::future_status stat = res.wait_for(std::chrono::seconds(60));
if (stat != std::future_status::ready) {
std::cout << "Timed out..." << "\n";
} else {
try {
int result = res.get();
std::cout << "Result = " << result << std::endl;
} catch (const FooException& e) {
std::cerr << e.what() << '\n';
}
}
EDIT: As CuriouslyRecurringThoughts pointed out in the comments, the destructor of a future returned by std::async blocks until the task has finished. If that is not an option, the following code uses a std::promise and a detached thread instead:
std::promise<int> prom;
std::future<int> res = prom.get_future();
std::thread([p = std::move(prom)]() mutable {
try {
p.set_value(foo());
} catch (...) { // forward any exception type, not only those derived from std::exception
p.set_exception(std::current_exception());
}
}).detach();
Waiting for the std::future is done as shown before.
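Putting the two pieces together, a minimal sketch of a reusable helper might look like the following. The name call_with_timeout and the use of std::runtime_error to signal the timeout are illustrative assumptions, not part of the question. The sketch needs C++14 and a callable with a non-void return type.

```cpp
#include <chrono>
#include <exception>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical helper: runs fn() on a detached thread and waits at most
// `timeout` for its result. Throws std::runtime_error on timeout and
// rethrows whatever fn() threw otherwise.
template <typename Fn>
auto call_with_timeout(Fn fn, std::chrono::milliseconds timeout) -> decltype(fn()) {
    using Result = decltype(fn());
    std::promise<Result> prom;
    std::future<Result> fut = prom.get_future();

    // The promise is moved into the thread, so it stays alive even if
    // this function has already returned because of a timeout.
    std::thread([p = std::move(prom), f = std::move(fn)]() mutable {
        try {
            p.set_value(f());
        } catch (...) {
            p.set_exception(std::current_exception());
        }
    }).detach();

    if (fut.wait_for(timeout) != std::future_status::ready) {
        throw std::runtime_error("timeout");
    }
    return fut.get();  // returns the value or rethrows the stored exception
}
```

A call would then look like call_with_timeout(foo, std::chrono::seconds(60)). Note that a worker which never returns keeps running in the background after the timeout; the helper only stops waiting for it.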
It seems that despite the timed wait your main thread deadlocks because even when cv->wait_for returns with timeout it still tries to lk.lock() on the mutex which is currently locked by the second thread.
As mentioned on cppreference about wait_for:
When unblocked, regardless of the reason, lock is reacquired and wait_for() exits.
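The effect can be reproduced in isolation. The sketch below is only an illustration under assumed names (blocked_ms, holder, started are not from the question): a second thread grabs the mutex as soon as the waiting thread releases it inside wait_for, and holds it for longer than the timeout, just as the worker in the question holds the mutex while foo() runs.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative helper: measures how long a timed wait really blocks when
// another thread holds the mutex for hold_ms milliseconds.
long long blocked_ms(int wait_ms, int hold_ms) {
    using namespace std::chrono;
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> started{false};

    std::unique_lock<std::mutex> lk(m);
    std::thread holder([&] {
        started = true;
        // Blocks until the waiting thread releases m inside wait_for.
        std::lock_guard<std::mutex> hold(m);
        std::this_thread::sleep_for(milliseconds(hold_ms));  // stands in for a slow foo()
    });
    while (!started) std::this_thread::yield();

    const auto begin = steady_clock::now();
    cv.wait_for(lk, milliseconds(wait_ms));  // must reacquire m before it can return
    const auto elapsed = duration_cast<milliseconds>(steady_clock::now() - begin);

    lk.unlock();
    holder.join();
    return elapsed.count();
}
```

With a 100 ms timeout and a 300 ms hold, the measured time would be expected to track the hold time rather than the timeout, which matches the missing TIMEOUT log when foo() hangs for an hour.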
I'm not sure why the promise/future solution didn't work for you since you didn't post that example here, but I've tried a simple version of it which seems to work even when the second thread "hangs":
using namespace std::chrono_literals;
std::cout << "START" << std::endl;
std::promise<void> p;
auto f = p.get_future();
std::thread t([p = std::move(p)]() mutable {
std::cout << "PROCESS" << std::endl;
std::this_thread::sleep_for(5min);
p.set_value();
});
auto status = f.wait_for(5s);
std::cout << (status == std::future_status::ready ? "FINISH" : "TIMEOUT") << std::endl;
t.join();
The output is as expected:
START
PROCESS
TIMEOUT
We can create a separate thread to run the call itself, and wait on a condition variable back in your main thread which will be signaled by the thread doing the call to foo once it returns.
The trick is to wait on the condition variable with your 60s timeout, so that if the call takes longer than the timeout you will still wake up, know about it, and be able to throw the exception - all in the main thread.
Please find below a code example:
#include <iostream>
#include <chrono>
#include <thread>
#include <mutex>
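#include <condition_variable>
#include <memory>
#include <stdexcept>
using namespace std::chrono_literals;
int foo()
{
//std::this_thread::sleep_for(10s); //Will Return Success
std::this_thread::sleep_for(70s); //Will Return Timeout
return 1;
}
// State shared with the worker. It is reference-counted because the detached
// thread may still be running after foo_wrapper() has thrown on timeout.
struct SharedState
{
std::mutex m;
std::condition_variable cv;
bool done = false;
int retValue = 0;
};
int foo_wrapper()
{
auto state = std::make_shared<SharedState>();
std::thread t([state]()
{
int r = foo(); // run foo() without holding the mutex
{
std::lock_guard<std::mutex> lock(state->m);
state->retValue = r;
state->done = true;
}
state->cv.notify_one();
});
t.detach();
std::unique_lock<std::mutex> lock(state->m);
// The predicate guards against spurious wakeups and lost notifications.
if(!state->cv.wait_for(lock, 60s, [&state]{ return state->done; }))
throw std::runtime_error("Timeout");
return state->retValue;
}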
int main()
{
bool timedout = false;
try {
foo_wrapper();
}
catch(std::runtime_error& e) {
std::cout << e.what() << std::endl;
timedout = true;
}
if(!timedout)
std::cout << "Success" << std::endl;
else
std::cout << "Failure" << std::endl;
return 0;
}
With std::this_thread::sleep_for(10s); inside foo, the program prints Success.
With std::this_thread::sleep_for(70s); inside foo, foo_wrapper throws after 60 seconds and the program prints Timeout followed by Failure.
I hope it helps!
As Mike van Dyke says, and the documentation makes quite clear, you need a predicate to use a condition variable correctly, to deal with spurious wakeups:
When the condition variable is notified, a timeout expires, or a spurious wakeup occurs, the thread is awakened, and the mutex is atomically reacquired. The thread should then check the condition and resume waiting if the wake up was spurious.
Any use of a condvar for waiting without a loop and predicate is wrong. It should always have either an explicit while(!predicate) loop or look something like:
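std::unique_lock<std::mutex> lk(*m);
// With a predicate, wait_for returns the predicate's final value (a bool), not a std::cv_status.
bool satisfied = cv->wait_for(lk, std::chrono::seconds(60), predicate);
if (!satisfied)
{ /* timed out */ } else { /* predicate holds */ }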
which means you need some predicate to check: setting *ready = false before notifying the condvar in your thread (and using !*ready as your predicate) would be fine.
As for why you didn't see the expected result - I have no idea, because I can't see your real logging code or what happens outside the code snippet you provided. Waking from wait_for without either having timed out or received a valid response or exception is the most likely, but you'll either have to debug your code or provide a complete example to help with that.
Related
I'm studying concurrency in C++ and I'm trying to implement a multithreaded callback registration system. I came up with the following code, which is supposed to accept registration requests until an event occurs. After that, it should execute all the registered callbacks in the order in which they were registered. The registration order doesn't have to be deterministic.
The code doesn't work as expected. First of all, it rarely prints the "Pushing callback with id" message. Secondly, it sometimes hangs (a deadlock caused by a race condition, I assume). I'd appreciate help in figuring out what's going on here. If you see that I overcomplicate some parts of the code or misuse some pieces, please also point it out.
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
class CallbackRegistrar{
public:
void registerCallbackAndExecute(std::function<void()> callback) {
if (!eventTriggered) {
std::unique_lock<std::mutex> lock(callbackMutex);
auto saved_id = callback_id;
std::cout << "Pushing callback with id " << saved_id << std::endl;
registeredCallbacks.push(std::make_pair(callback_id, callback));
++callback_id;
callbackCond.wait(lock, [this, saved_id]{return releasedCallback.first == saved_id;});
releasedCallback.second();
callbackExecuted = true;
eventCond.notify_one();
}
else {
callback();
}
}
void registerEvent() {
eventTriggered = true;
while (!registeredCallbacks.empty()) {
releasedCallback = registeredCallbacks.front();
callbackCond.notify_all();
std::unique_lock<std::mutex> lock(eventMutex);
eventCond.wait(lock, [this]{return callbackExecuted;});
callbackExecuted = false;
registeredCallbacks.pop();
}
}
private:
std::queue<std::pair<unsigned, std::function<void()>>> registeredCallbacks;
bool eventTriggered{false};
bool callbackExecuted{false};
std::mutex callbackMutex;
std::mutex eventMutex;
std::condition_variable callbackCond;
std::condition_variable eventCond;
unsigned callback_id{1};
std::pair<unsigned, std::function<void()>> releasedCallback;
};
int main()
{
CallbackRegistrar registrar;
std::thread t1(&CallbackRegistrar::registerCallbackAndExecute, std::ref(registrar), []{std::cout << "First!\n";});
std::thread t2(&CallbackRegistrar::registerCallbackAndExecute, std::ref(registrar), []{std::cout << "Second!\n";});
registrar.registerEvent();
t1.join();
t2.join();
return 0;
}
This answer has been edited in response to more information provided by the OP in a comment; the edit is at the bottom of the answer.
Along with the excellent suggestions in the comments, the main problem that I have found in your code is with the wait condition that you have set up for the callbackCond condition variable. What happens if releasedCallback.first does not equal savedId?
When I ran your code (with a thread-safe queue and eventTriggered as an atomic), I found that the problem was in this wait call. If you put a print statement in its predicate, you will get something like this:
releasedCallback.first: 0, savedId: 1
This then waits forever.
In fact, I've found that the condition variables used in your code aren't actually needed. You only need one, and it can live inside the thread-safe queue that you are going to build after some searching ;)
After you have the thread-safe queue, the code from above can be reduced to:
class CallbackRegistrar{
public:
using NumberedCallback = std::pair<unsigned int, std::function<void()>>;
void postCallback(std::function<void()> callback) {
if (!eventTriggered)
{
std::unique_lock<std::mutex> lock(mutex);
auto saved_id = callback_id;
std::cout << "Pushing callback with id " << saved_id << std::endl;
registeredCallbacks.push(std::make_pair(callback_id, callback));
++callback_id;
}
else
{
while (!registeredCallbacks.empty())
{
NumberedCallback releasedCallback;
registeredCallbacks.waitAndPop(releasedCallback);
releasedCallback.second();
}
callback();
}
}
void registerEvent() {
eventTriggered = true;
}
private:
ThreadSafeQueue<NumberedCallback> registeredCallbacks;
std::atomic<bool> eventTriggered{false};
std::mutex mutex;
unsigned int callback_id{1};
};
int main()
{
CallbackRegistrar registrar;
std::vector<std::thread> threads;
for (int i = 0; i < 10; i++)
{
threads.push_back(std::thread(&CallbackRegistrar::postCallback,
std::ref(registrar),
[i]{std::cout << std::to_string(i) <<"\n";}
));
}
registrar.registerEvent();
for (auto& thread : threads)
{
thread.join();
}
return 0;
}
I'm not sure if this does exactly what you want, but it doesn't deadlock. It's a good starting point in any case, but you need to bring your own implementation of ThreadSafeQueue.
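For readers who do not have a queue at hand, a minimal sketch of such a ThreadSafeQueue is given below. Only the member names push, waitAndPop and empty are taken from the code above; everything else (the internal names, the use of std::queue, the single mutex) is an assumption and not the implementation the answer was tested with.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Minimal sketch of a mutex-protected queue with a blocking pop.
template <typename T>
class ThreadSafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        cond_.notify_one();  // wake one thread blocked in waitAndPop
    }

    // Blocks until an element is available, then moves it into `out`.
    void waitAndPop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        out = std::move(queue_.front());
        queue_.pop();
    }

    bool empty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.empty();
    }

private:
    mutable std::mutex mutex_;  // mutable so empty() can lock it
    std::condition_variable cond_;
    std::queue<T> queue_;
};
```

One caveat with this interface: checking empty() and then calling waitAndPop() is two separate steps, so with several consumers another thread may take the last element in between and the caller then blocks. A non-blocking try-pop member would avoid that, at the cost of a slightly larger interface.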
Edit
This edit is in response to the comment by the OP stating that "once the event occurs, all the callbacks should be executed in [the] order that they've been pushed to the queue and by the same thread that registered them".
This was not mentioned in the original question post. However, if that is the required behaviour then we need to have a condition variable wait in the postCallback method. I think this is also the reason why the OP had the condition variable in the postCallback method in the first place.
In the code below I have made a few edits to the callbacks, they now take input parameters. I did this to print some useful information while the code is running so that it is easier to see how it works, and, importantly how the condition variable wait is working.
The basic idea is similar to what you had done, I've just trimmed out the stuff you didn't need.
class CallbackRegistrar{
public:
using NumberedCallback = std::pair<unsigned int, std::function<void(int, int)>>;
void postCallback(std::function<void(int, int)> callback, int threadId) {
if (!m_eventTriggered)
{
// Lock the m_mutex
std::unique_lock<std::mutex> lock(m_mutex);
// Save the current callback ID and push the callback to the queue
auto savedId = m_currentCallbackId++;
std::cout << "Pushing callback with ID " << savedId << "\n";
m_registeredCallbacks.push(std::make_pair(savedId, callback));
// Wait until our thread's callback is next in the queue,
// this will occur when the ID of the last called callback is one less than our saved callback.
m_conditionVariable.wait(lock, [this, savedId, threadId] () -> bool
{
std::cout << "Waiting on thread " << threadId << " last: " << m_lastCalledCallbackId << ", saved - 1: " << (savedId - 1) << "\n";
return (m_lastCalledCallbackId == (savedId - 1));
});
// Once we are finished waiting, get the callback out of the queue
NumberedCallback retrievedCallback;
m_registeredCallbacks.waitAndPop(retrievedCallback);
// Update last callback ID and call the callback
m_lastCalledCallbackId = retrievedCallback.first;
retrievedCallback.second(m_lastCalledCallbackId, threadId);
// Notify one waiting thread
m_conditionVariable.notify_one();
}
else
{
// If the event is already triggered, call the callback straight away
callback(-1, threadId);
}
}
void registerEvent() {
// This is all we have to do here.
m_eventTriggered = true;
}
private:
ThreadSafeQueue<NumberedCallback> m_registeredCallbacks;
std::atomic<bool> m_eventTriggered{ false};
std::mutex m_mutex;
std::condition_variable m_conditionVariable;
unsigned int m_currentCallbackId{ 1};
std::atomic<unsigned int> m_lastCalledCallbackId{ 0};
};
The main function is as above, except I am creating 100 threads instead of 10, and I have made the callback print out information about how it was called.
for (int createdThreadId = 0; createdThreadId < 100; createdThreadId++)
{
threads.push_back(std::thread(&CallbackRegistrar::postCallback,
std::ref(registrar),
[createdThreadId](int registeredCallbackId, int callingThreadId)
{
if (registeredCallbackId < 0)
{
std::cout << "Callback " << createdThreadId;
std::cout << " called immediately, from thread: " << callingThreadId << "\n";
}
else
{
std::cout << "Callback " << createdThreadId;
std::cout << " called from thread " << callingThreadId;
std::cout << " after being registered as " << registeredCallbackId << "\n";
}
},
createdThreadId));
}
I am not entirely sure why you want to do this, as it seems to defeat the point of having multiple threads, although I may be missing something there. But, regardless, I hope this helps you to understand better the problem you are trying to solve.
Experimenting with this code some more, I found out why the "Pushing callback with id " part was rarely printed. It's because the call to registrar.registerEvent from the main thread was usually faster than the calls to registerCallbackAndExecute from separate threads. Because of that, the condition if (!eventTriggered) was almost never fulfilled (eventTriggered had been set to true in the registerEvent method) and hence all calls to registerCallbackAndExecute were falling into the else branch and executing straightaway.
Then, the program sometimes also didn't finish, because of a race condition between registerEvent and registerCallbackAndExecute. Sometimes, registerEvent was being called after the check if (!eventTriggered) but before pushing the callback to the queue. Then, registerEvent completed instantly (as the queue was empty) while the thread calling registerCallbackAndExecute was pushing the callback to the queue. The latter thread then kept waiting forever for the event (that had already happened) to happen.
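One way to close that race is to read and write the event flag under the same mutex that protects the registration, so that the check and the registration happen as one step. The following is only a sketch under that assumption; the class name OrderedCallbackRegistrar, the ticket counters and the runDemo helper are illustrative and not part of the original code.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class OrderedCallbackRegistrar {
public:
    // If the event has not happened yet, takes a ticket and blocks until the
    // event has fired and all earlier tickets have run; then runs the
    // callback on the calling thread.
    void registerCallbackAndExecute(const std::function<void()>& callback) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (eventTriggered_) {
            lock.unlock();
            callback();  // event already happened: run immediately
            return;
        }
        const unsigned ticket = nextTicket_++;
        cond_.wait(lock, [this, ticket] {
            return eventTriggered_ && nowServing_ == ticket;
        });
        callback();  // still under the lock, so callbacks cannot interleave
        ++nowServing_;
        cond_.notify_all();
    }

    void registerEvent() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            eventTriggered_ = true;
        }
        cond_.notify_all();
    }

    // Number of tickets handed out so far (used by the demo below).
    unsigned registered() {
        std::lock_guard<std::mutex> lock(mutex_);
        return nextTicket_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool eventTriggered_{false};
    unsigned nextTicket_{0};
    unsigned nowServing_{0};
};

// Usage sketch: registers n callbacks in a known order, fires the event and
// returns the order in which the callbacks actually ran.
std::vector<int> runDemo(int n) {
    OrderedCallbackRegistrar registrar;
    std::vector<int> order;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&registrar, &order, i] {
            registrar.registerCallbackAndExecute([&order, i] { order.push_back(i); });
        });
        // Wait until thread i holds its ticket before starting the next one.
        while (registrar.registered() != static_cast<unsigned>(i + 1))
            std::this_thread::yield();
    }
    registrar.registerEvent();
    for (auto& t : threads) t.join();
    return order;
}
```

Because the callbacks run while the mutex is held, a callback that calls back into the registrar would deadlock in this sketch; releasing the lock around the call is possible but needs extra care to preserve the ordering.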
What happens if I use wait_for on a future that goes out of scope due to a timeout, and set_value is then called on the promise? It would make little sense if this were undefined; however, I want to be sure and did not find an answer on my own. Example code below:
#include <future>
#include <iostream>
#include <thread>
void task1() {
std::promise<void> promise;
auto futureOutOfScopeWork = [&promise]() {
std::future<void> future = promise.get_future();
auto status = future.wait_for(std::chrono::milliseconds(500));
if (status == std::future_status::ready) {
std::cout << "in time" << std::endl;
} else if (status == std::future_status::timeout) {
std::cout << "timeout" << std::endl;
} else {
std::cout << "invalid state" << std::endl;
}
};
std::thread futureThread(futureOutOfScopeWork);
futureThread.detach();
std::this_thread::sleep_for(std::chrono::milliseconds(750));
promise.set_value();
}
int main() {
std::thread startEverything(task1);
startEverything.detach();
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
The startEverything thread creates a promise and then starts another thread, which invokes get_future on the promise. The future then goes out of scope because of a timeout. Am I running into undefined behaviour here? get_future was called, the future was destroyed, and set_value was then invoked on a promise whose linked future has gone out of scope.
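As far as the shared-state model of std::promise and std::future goes, this part appears to be well defined: the promise and the future each hold a reference to the same shared state, so destroying the future only drops its reference, and set_value still stores the value in the state the promise keeps alive. A small single-threaded sketch of just that sequence (the function name is illustrative):

```cpp
#include <chrono>
#include <future>

// Returns true if set_value succeeds after the only future has been destroyed.
bool setValueAfterFutureDestroyed() {
    std::promise<int> promise;
    {
        std::future<int> future = promise.get_future();
        future.wait_for(std::chrono::milliseconds(10));  // times out: no value yet
    }  // future is destroyed here and drops its reference to the shared state
    try {
        promise.set_value(42);  // stored in the shared state the promise still owns
    } catch (const std::future_error&) {
        return false;
    }
    return true;
}
```

A separate concern in the example above is lifetime of the promise object itself: the lambda captures it by reference and runs on a detached thread, so it must still exist when get_future is called.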
I am working with condition_variable in Visual Studio 2019. The condition_variable.wait_for() function returns std::cv_status::no_timeout without any notification.
#include <iostream>
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>
std::condition_variable cv;
std::mutex mtx;
bool called = false;
void printThread()
{
std::unique_lock<std::mutex> lck(mtx);
while (std::cv_status::timeout == cv.wait_for(lck, std::chrono::seconds(1)))
{
std::cout << "*";
}
std::cout << "thread exits" << std::endl;
}
int main()
{
std::thread th(printThread);
th.join();
std::cout << "program exits" << std::endl;
}
I think the code will never exit and keep printing *, but it exits after printing some *.
Here is the output:
********************************************************************thread exits
program exits
Why does this happen? Is it the so-called "spurious wakeups"?
Yes, it's a "spurious wakeup". This is explained on cppreference.com's reference page for wait_for:
It may also be unblocked spuriously. When unblocked, regardless of the
reason, lock is reacquired and wait_for() exits.
Translation: there are gremlins in your computer. They get grumpy, occasionally. And if they do get grumpy, wait_for returns before the requested timeout expires. And when that happens:
Return value
std::cv_status::timeout if the relative timeout specified by
rel_time expired, std::cv_status::no_timeout otherwise.
And that seems to be exactly what you're seeing. The C++ standard permits a C++ implementation to return from wait_for prematurely, for arbitrary reasons, and unless you do return from wait_for when the timeout expires, no_timeout is what you get.
You might be wondering why wait_for (and several other similar functions) may decide to throw up their hands and return "spuriously". But that would be a different question...
As already explained, the wait ends because of a spurious wakeup. This makes a bare wait_for (one without a predicate) unreliable as a timer, because waiting again after a spurious wakeup restarts the full timeout. One solution is to use wait_until with a deadline computed once, before entering the wait loop:
int count = 1;
std::mutex mutex;
std::condition_variable condition_variable;
void wait() {
std::unique_lock<std::mutex> lock(mutex);
count--;
int timeout = 1000; // 1 second
std::chrono::time_point<std::chrono::system_clock> timenow =
std::chrono::system_clock::now();
while(count < 0) {
std::cv_status status = condition_variable.wait_until(
lock,
timenow + std::chrono::duration<double,std::ratio<1,1000>>(timeout));
if ( std::cv_status::timeout == status) {
count++;
break;
}
}
}
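An alternative that avoids the manual deadline bookkeeping is the predicate overload of wait_for, which keeps waiting across spurious wakeups until either the predicate holds or the total timeout has elapsed. A minimal sketch, with the Gate type and its member names chosen purely for illustration:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

struct Gate {
    std::mutex mutex;
    std::condition_variable cond;
    bool open = false;

    // Returns true if the gate was opened within `timeout`, false otherwise.
    // Spurious wakeups are absorbed because the predicate is re-checked.
    bool waitOpen(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex);
        return cond.wait_for(lock, timeout, [this] { return open; });
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            open = true;  // change the condition under the mutex
        }
        cond.notify_all();
    }
};
```

The return value is the predicate's final value, so the caller distinguishes "condition met" from "timed out" without inspecting std::cv_status.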
I got code like below:
std::mutex mutex;
std::condition_variable condition_variable;
bool finish = false;
void test() {
while (true) {
std::unique_lock<std::mutex> lock(mutex);
condition_variable.wait(lock);
if (finish){
std::cout << "finish detected" << std::endl;
return;
}
}
}
int main() {
std::thread t(test);
std::unique_lock<std::mutex> lock(mutex);
finish = true;
lock.unlock();
//sleep(1);
condition_variable.notify_all();
std::cout << "notify_all" << std::endl;
t.join();
}
The code does not terminate when I run it: the notify_all log is printed, but the finish detected log is not. In debug mode the code terminates successfully, so I cannot give a clear picture of what the running code is doing. If I uncomment the sleep(1), the code works.
Can anyone help me find what's wrong with my code?
A condition variable has no state: if you signal it when there are no waiters, the signal is lost. That happens in your code when condition_variable.notify_all() executes before condition_variable.wait(lock);.
The code doesn't use the correct method to wait on the condition variable. The correct method is:
1. Lock the mutex.
2. Check the condition (finish here).
3. If the condition is not satisfied, wait on the condition variable. The condition variable can be woken up spuriously. Go to step 2.
Fix:
void test() {
std::unique_lock<std::mutex> lock(mutex);
while(!finish)
condition_variable.wait(lock);
std::cout << "finish detected" << std::endl;
}
There is another overload of condition_variable::wait that does the while loop for you:
void test() {
std::unique_lock<std::mutex> lock(mutex);
condition_variable.wait(lock, []{ return finish; }); // finish is a global, so it needs no capture
std::cout << "finish detected" << std::endl;
}
I am currently trying to learn how to use a condition_variable for thread synchronization. For testing, I have made the demo application shown below. When I start it, it runs into a deadlock. I know the location where this happens, but I'm unable to understand why the deadlock occurs.
I know that a condition_variable's wait function will automatically unlock the mutex when the condition is not true, so the main thread should not be blocked in the second pass. But that is exactly what happens.
Could anybody explain why?
#include <thread>
#include <condition_variable>
#include <iostream>
bool flag = false;
std::mutex g_mutex;
std::condition_variable cv;
void threadProc()
{
std::unique_lock<std::mutex> lck(g_mutex);
while (true)
{
static int count = 0;
std::cout << "wait for flag" << ++count << std::endl;
cv.wait(lck, []() {return flag; }); // !!!It blocks here in the second round
std::cout << "flag is true " << count << std::endl;
flag = false;
lck.unlock();
}
}
int main(int argc, char *argv[])
{
std::thread t(threadProc);
while (true)
{
static int count = 0;
{
std::lock_guard<std::mutex> guard(g_mutex); // !!!It blocks here in the second round
flag = true;
std::cout << "set flag " << ++count << std::endl;
}
cv.notify_one();
std::this_thread::sleep_for(std::chrono::seconds(1));
}
t.join();
return 0;
}
I know that a condition_variable's wait function will automatically unlock the mutex when the condition is not true.
Um..., yes..., Just to be absolutely clear, cv.wait(lck, f) does this:
while(! f()) {
cv.wait(lck);
}
And each call to cv.wait(lck) will:
unlock lck,
wait until some other thread calls cv.notify_one() or cv.notify_all(),
re-lock lck, and then
return.
You can fix the problem by moving the unique_lock(...) statement inside the while loop. As it is now, lck is unlocked at the end of round 1 and never locked again, so round 2 calls cv.wait(lck, ...) and lck.unlock() on a lock that is not held, which wait does not allow.
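The fix described above can be sketched as follows. To keep the example finite and deterministic, it runs a fixed number of rounds and adds a handshake in which the producer waits until the previous flag has been consumed; both the run_rounds function and that handshake are assumptions of this sketch and replace the one-second sleep of the original program.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Runs `rounds` set-flag/consume-flag cycles and returns how many the
// consumer thread observed.
int run_rounds(int rounds) {
    std::mutex mtx;
    std::condition_variable cv;
    bool flag = false;
    int seen = 0;

    std::thread consumer([&] {
        for (int i = 0; i < rounds; ++i) {
            // The lock is created inside the loop, so it is held again at
            // the start of every round.
            std::unique_lock<std::mutex> lck(mtx);
            cv.wait(lck, [&] { return flag; });
            flag = false;
            ++seen;
            lck.unlock();
            cv.notify_all();  // tell the producer the flag was consumed
        }
    });

    for (int i = 0; i < rounds; ++i) {
        {
            std::unique_lock<std::mutex> lck(mtx);
            cv.wait(lck, [&] { return !flag; });  // previous flag consumed?
            flag = true;
        }
        cv.notify_all();
    }

    consumer.join();
    return seen;
}
```

Letting the unique_lock go out of scope at the end of each iteration would work equally well; the explicit unlock() is kept only to stay close to the original threadProc.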