How to handle condition variable missed signal from another thread - c++

In the compilable code below, I'm sending a Query message to be handled by another thread, and I want to wait for a response or give up after a timeout. I don't know why wait_until misses the signal and hits the timeout when it should not. It only happens if the handler returns a response REALLY fast. How do you propose I fix the code below?
#include <mutex>
#include <memory>
#include <condition_variable>
#include <atomic>
#include <thread>
#include <iostream>
#include <queue>
#include <chrono>
#include <cstdint>
class Question
{
};
class Answer
{
public:
bool isAnswered = false;
};
class Query
{
std::condition_variable _cv;
std::mutex _mutex;
std::atomic_bool _questionAnswered;
std::atomic_bool _questionSet;
std::shared_ptr<Question> _question;
std::shared_ptr<Answer> _answer;
public:
void setQuestion(std::shared_ptr<Question> & question)
{
if(!_questionSet)
{
_question = question;
_questionSet = true;
}
};
void setAnswer(std::shared_ptr<Answer> answer)
{
std::unique_lock<std::mutex> lock(_mutex);
if(!_questionAnswered)
{
// Set the answer and notify the getAnswerWithTimeout() to unlock if holding
_answer = answer;
_questionAnswered = true;
lock.unlock();
_cv.notify_all();
}
};
std::shared_ptr<Answer> getAnswerWithTimeout(uint64_t micros)
{
std::unique_lock<std::mutex> lock(_mutex);
if(!_questionAnswered)
{
auto now = std::chrono::system_clock::now();
// When timeout occurs, lock down this class, set the answer as null, and set error to timeout
if (!_cv.wait_until(lock, now + std::chrono::microseconds(micros), [&]() { return (bool)_questionAnswered; }) )
{
_answer = nullptr;
_questionAnswered = true;
}
}
return _answer;
};
};
void function_to_run(std::shared_ptr<Query> query)
{
// Respond to query and set the answer
auto answer = std::make_shared<Answer>();
answer->isAnswered = true;
// Set the response answer
query->setAnswer(answer);
}
std::queue<std::shared_ptr<Query>> queryHandler;
bool keepRunning = true;
std::mutex queryHandlerMutex;
std::condition_variable queryHandlerCv;
void handleQueryHandler()
{
while (true)
{
std::shared_ptr<Query> query;
{
std::unique_lock<std::mutex> lock(queryHandlerMutex);
queryHandlerCv.wait(lock, [&] { return !keepRunning || !queryHandler.empty(); });
if (!keepRunning) {
return;
}
// Pop off item from queue
query = queryHandler.front();
queryHandler.pop();
}
// Process query with function
function_to_run(query);
}
}
void insertIntoQueryHandler(std::shared_ptr<Query> & query)
{
{
std::unique_lock<std::mutex> lock(queryHandlerMutex);
// Insert into Query Handler
queryHandler.emplace(query);
}
// Notify query handler to start if locked on empty
queryHandlerCv.notify_one();
}
std::shared_ptr<Answer>
ask(std::shared_ptr<Query> query, uint64_t timeoutMicros=0)
{
std::shared_ptr<Answer> answer = nullptr;
// Send Query to be handled by external thread
insertIntoQueryHandler(query);
// Hold for the answer to be returned with timeout period
answer = query->getAnswerWithTimeout(timeoutMicros);
return answer;
}
int main()
{
// Start Up Query Handler thread to handle Queries
std::thread queryHandlerThread(handleQueryHandler);
// Create queries in infinite loop and process
for(int i = 0; i < 1000000; i++)
{
auto question = std::make_shared<Question>();
auto query = std::make_shared<Query>();
query->setQuestion(question);
auto answer = ask(query, 1000);
if(!answer)
{
std::cout << "Query Timed out after 1000us" << std::endl;
}
}
// Stop the thread
{
std::unique_lock<std::mutex> lock(queryHandlerMutex);
keepRunning = false;
}
queryHandlerCv.notify_one();
queryHandlerThread.join();
return 0;
}

As discussed in the comments, the main issue here is the timeout period you're using (1ms). Between these two statements:
auto now = std::chrono::system_clock::now();
// ... another thread may sneak in here ...
if (!_cv.wait_until(lock, now + std::chrono::microseconds(micros), [&]() { return (bool)_questionAnswered; }) )
{
another thread can be scheduled and consume a whole timeslice (e.g. 10ms), so the deadline may already have passed by the time wait_until is called, and the wait times out immediately. Furthermore, there are reports of unexpected behaviour with wait_until, as described here:
std::condition_variable wait_until surprising behaviour
Increasing the timeout to something on the order of several timeslices will fix this. You can also adjust thread priorities.
Personally, I advocate waiting on a condition variable with wait_for, which is efficient and also bails out in a timely fashion (as opposed to polling a flag and sleeping).
Time slices in non-RTOS systems tend to be on the order of 10ms, so I would not expect such short timeouts to work accurately and predictably on general-purpose systems. See this for an introduction to pre-emptive multitasking:
https://www.geeksforgeeks.org/time-slicing-in-cpu-scheduling/
as well as this:
http://dev.ti.com/tirex/explore/node?node=AL.iEm6ATaD6muScZufjlQ__pTTHBmu__LATEST
As jtbandes points out, it's worth using tools such as Clang's thread sanitiser to check for potential logic races: https://clang.llvm.org/docs/ThreadSanitizer.html

Related

Handle mutex lock in callback c++

I've got a Timer class that can run with both an initial time and an interval. There's an internal function, internalQuit, that performs thread.join() before a thread is started again in resetCallback. Each public function has its own std::lock_guard on the mutex to protect the data from concurrent writes. I'm now running into an issue: when the callback itself tries to, for example, stop the timer, the mutex cannot be locked by stop(). I'm hoping to get some help on how to tackle this issue.
class Timer
{
public:
Timer(string_view identifier, Function &&timeoutHandler, Duration initTime, Duration intervalTime);
void start();
void stop() // for example
{
std::lock_guard lock{mutex};
running = false;
sleepCv.notify_all();
}
void setInitTime();
void setIntervalTime();
void resetCallback(Function &&timeoutHandler)
{
internalQuit();
{
std::lock_guard lock{mutex};
quit = false;
}
startTimerThread(std::forward<Function>(timeoutHandler));
}
private:
void internalQuit() // performs thread join
{
{
std::lock_guard lock {mutex};
quit = true;
running = false;
sleepCv.notify_all();
}
thread.join();
}
void mainLoop(Function &&timeoutHandler)
{
while(!quit)
{
std::unique_lock lock{mutex};
// wait for running with sleepCv.wait()
// handle initTimer with sleepCv.wait_until()
timeoutHandler(); // callback
// handle intervalTimer with sleepCv.wait_until()
timeoutHandler(); // callback
}
}
void startTimerThread(Function &&timeoutHandler)
{
thread = std::thread([&, timeoutHandler = std::forward<Function>(timeoutHandler)](){
mainLoop(timeoutHandler);
});
}
std::thread thread{};
std::mutex mutex{};
std::condition_variable sleepCv{};
// initTime, intervalTime and some booleans for updating with sleepCv.notify_all();
};
For testing this, I have the following testcase in Gtest. I'm expecting the timer to stop in the callback. Unfortunately, the timer will hang on acquiring the mutex lock in the stop() function.
std::atomic<int> callbackCounter;
void timerCallback()
{
callbackCounter.fetch_add(1, std::memory_order_acq_rel);
}
TEST(timerTest, timerShouldStopWhenStoppedInNewCallback)
{
std::atomic<int> testCounter{0};
Timer<std::chrono::steady_clock > t{"timerstop", &timerCallback, std::chrono::milliseconds(0), std::chrono::milliseconds(100)};
t.resetCallback([&]{
testCounter += 1;
t.stop();
});
t.start();
sleepMilliSeconds(100);
ASSERT_EQ(testCounter.load(), 1); // trigger due to original interval timeout
sleepMilliSeconds(100);
ASSERT_EQ(testCounter.load(), 1); // no trigger, because stopped in new callback
}
Removing the mutex locks from each of the public functions fixes the issue, but that could lead to race conditions on the variables being written. Hence each function takes a lock before writing to, for example, the booleans.
I've tried looking into std::move to move the thread into a different variable during resetCallback and then call join on that one. I'm also investigating recursive_mutex but have no experience with it.
void resetCallback(Function &&timeoutHandler)
{
internalQuit();
{
std::lock_guard lock{mutex};
quit = false;
}
auto prevThread = std::thread(std::move(this->thread));
// didn't know how to continue from here, requiring more selfstudy.
startTimerThread(std::forward<Function>(timeoutHandler));
}
This is a new subject for me; I have worked with mutexes and timers before, but only in relatively simple cases.
Thank you in advance.

Best way to optimize timer queue with concurrent_priority_queue C++

I'm working on a timer queue using concurrent_priority_queue.
I have implemented the basic logic of executing the most urgent event in this queue.
Here's my code.
TimerEvent ev{};
while (timer.mLoop)
{
while (timer.mQueue.empty() == false)
{
if (timer.mQueue.try_pop(ev) == false)
continue;
if (ev.Type == EVENT_TYPE::PHYSICS) // Physics event is around 15 ~ 17ms
{
auto now = Clock::now();
std::this_thread::sleep_for(ev.StartTime - now);
timer.mGameServerPtr->PostPhysicsOperation(ev.WorldID);
}
else if (ev.Type == EVENT_TYPE::INVINCIBLE) // This event is 3sec long.
{
auto now = Clock::now();
std::this_thread::sleep_for(ev.StartTime - now); // This is wrong!!
timer.mGameServerPtr->ReleaseInvincibleMode(ev.WorldID);
}
}
std::this_thread::sleep_for(10ms);
}
The problem would be easily solved if concurrent_priority_queue had something like a front/top method.
But there is no such method in the class, because it would not be thread-safe.
So I just pop the event off the queue and wait until its start time.
This way, I don't have to insert the event into the queue again.
The problem is that if I have another type of event, like EVENT_TYPE::INVINCIBLE, I shouldn't just use sleep_for, because this event is almost 3 seconds long. While waiting for 3 seconds, the PHYSICS event will not be executed in time.
I can use sleep_for for the PHYSICS event since it has the shortest wait.
But I have to re-insert the INVINCIBLE event into the queue.
How can I optimize this timer without re-inserting events into the queue?
By the looks of it, that'll be hard when using the implementation of concurrent_priority_queue you are currently using. It wouldn't be hard if you just used the standard std::priority_queue and added some locking where needed though.
Example:
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
using Clock = std::chrono::steady_clock;
using time_point = std::chrono::time_point<Clock>;
struct TimerEvent {
void operator()() { m_event(); }
bool operator<(const TimerEvent& rhs) const {
return rhs.StartTime < StartTime;
}
time_point StartTime;
std::function<void()> m_event; // what to execute when the timer is due
};
class TimerQueue {
public:
~TimerQueue() { shutdown(); }
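void shutdown() {
// set the flag under the lock so a thread about to wait cannot miss the wakeup
std::scoped_lock lock(m_mutex);
m_shutdown = true;
m_cv.notify_all();
}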
// add a new TimerEvent to the queue
template<class... Args>
void emplace(Args&&... args) {
std::scoped_lock lock(m_mutex);
m_queue.emplace(TimerEvent{std::forward<Args>(args)...});
m_cv.notify_all();
}
// Wait until it's time to fire the event that is first in the queue
// which may change while we are waiting, but that'll work too.
bool wait_pop(TimerEvent& ev) {
std::unique_lock lock(m_mutex);
while(!m_shutdown &&
(m_queue.empty() || Clock::now() < m_queue.top().StartTime))
{
if(m_queue.empty()) { // wait "forever"
m_cv.wait(lock);
} else { // wait until first StartTime
auto st = m_queue.top().StartTime;
m_cv.wait_until(lock, st);
}
}
if(m_shutdown) return false; // time to quit
ev = std::move(m_queue.top()); // extract event
m_queue.pop();
return true;
}
private:
std::priority_queue<TimerEvent> m_queue;
mutable std::mutex m_mutex;
std::condition_variable m_cv;
std::atomic<bool> m_shutdown{};
};
If an event that is due before the event we're currently waiting for in wait_pop comes in, the m_cv.wait/m_cv.wait_until will unblock (because of the m_cv.notify_all() in emplace()) and that new element will be the first in queue.
The event loop could simply be:
void event_loop(TimerQueue& tq) {
TimerEvent te;
while(tq.wait_pop(te)) {
te(); // execute event
}
// the queue was shutdown, exit thread
}
And you could put any kind of invocable with the time point when you'd like it to fire in that queue.
#include <thread>
int main() {
TimerQueue tq;
// create a thread to run the event loop
auto ev_th = std::thread(event_loop, std::ref(tq));
// wait a second
std::this_thread::sleep_for(std::chrono::seconds(1));
// add an event in 5 seconds
tq.emplace(Clock::now() + std::chrono::seconds(5), [] {
std::cout << "second\n";
});
// wait a second
std::this_thread::sleep_for(std::chrono::seconds(1));
// add an event in 2 seconds
tq.emplace(Clock::now() + std::chrono::seconds(2), [] {
std::cout << "first\n";
});
// sleep some time
std::this_thread::sleep_for(std::chrono::seconds(3));
// shutdown, only the event printing "first" will have fired
tq.shutdown();
ev_th.join();
}

How to get local hour efficiently?

I'm developing a service. Currently I need to get the local hour for every request; since that involves a system call, it costs too much.
In my case, some deviation like 200ms is OK for me.
So what's the best way to maintain a variable storing local_hour, and update it every 200ms?
static int32_t GetLocalHour() {
time_t t = std::time(nullptr);
if (t == -1) { return -1; }
struct tm *time_info_ptr = localtime(&t);
return (nullptr != time_info_ptr) ? time_info_ptr->tm_hour : -1;
}
If you want your main thread to spend as little time as possible on getting the current hour you can start a background thread to do all the heavy lifting.
For all things time use std::chrono types.
Here is an example, which uses quite a few (very useful) multithreading building blocks from C++.
#include <chrono>
#include <future>
#include <condition_variable>
#include <mutex>
#include <atomic>
#include <thread>
#include <iostream>
// building blocks
// std::future/std::async, to start a loop/function on a separate thread
// std::atomic, to be able to read/write a variable in a threadsafe way
// std::chrono, for all things time
// std::condition_variable, for communicating between threads. Basically a signal that only says that something has changed that might be interesting
// lambda functions : anonymous functions that are useful in this case for starting the asynchronous calls and to setup predicates (functions returning a bool)
// std::mutex : threadsafe access to a bit of code
// std::unique_lock : to automatically unlock a mutex when code goes out of scope (also needed for condition_variable)
// helper to convert time to start of day
using days_t = std::chrono::duration<int, std::ratio_multiply<std::chrono::hours::period, std::ratio<24> >::type>;
// class that has an asynchronously running loop that updates two variables (threadsafe)
// m_hours and m_seconds (m_seconds so output is a bit more interesting)
class time_keeper_t
{
public:
time_keeper_t() :
m_delay{ std::chrono::milliseconds(200) }, // update loop period
m_future{ std::async(std::launch::async,[this] {update_time_loop(); }) } // start update loop
{
// wait until asynchronous loop has started
std::unique_lock<std::mutex> lock{ m_mtx };
// wait until the asynchronous loop has started.
// this can take a bit of time since OS needs to schedule a thread for that
m_cv.wait(lock, [this] {return m_started; });
}
~time_keeper_t()
{
// threadsafe stopping of the mainloop
// to avoid problems that the thread is still running but the object
// with members is deleted.
{
std::unique_lock<std::mutex> lock{ m_mtx };
m_stop = true;
m_cv.notify_all(); // this will wakeup the loop and stop
}
// future.get will wait until the loop also has finished
// this ensures no member variables will be accessed
// by the loop thread and it is safe to fully destroy this instance
m_future.get();
}
// inline to avoid extra calls
inline int hours() const
{
return m_hours;
}
// inline to avoid extra calls
inline int seconds() const
{
return m_seconds;
}
private:
void update_time()
{
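// system_clock measures wall-clock time; steady_clock's epoch is unspecified
// (often boot time), so it cannot give the time of day.
// Note: this yields UTC, not local time.
auto tp = std::chrono::system_clock::now().time_since_epoch();
// calculate back till start of day
days_t days = std::chrono::duration_cast<days_t>(tp);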
tp -= days;
// calculate hours since start of day
auto hours = std::chrono::duration_cast<std::chrono::hours>(tp);
tp -= hours;
m_hours = hours.count();
// seconds since start of last hour
auto seconds = std::chrono::duration_cast<std::chrono::seconds>(tp);
m_seconds = seconds.count() % 60;
}
void update_time_loop()
{
std::unique_lock<std::mutex> lock{ m_mtx };
update_time();
// loop has started and has initialized all things time with values
m_started = true;
m_cv.notify_all();
// stop condition for the main loop, put in a predicate lambda
auto stop_condition = [this]()
{
return m_stop;
};
while (!m_stop)
{
// wait until m_cv is signaled or m_delay timed out
// a condition variable allows instant response and thus
// is better than just having a sleep here.
// (imagine a delay of seconds, that would also mean stopping could
// take seconds, this is faster)
m_cv.wait_for(lock, m_delay, stop_condition);
if (!m_stop) update_time();
}
}
std::atomic<int> m_hours;
std::atomic<int> m_seconds;
std::mutex m_mtx;
std::condition_variable m_cv;
bool m_started{ false };
bool m_stop{ false };
std::chrono::steady_clock::time_point m_now;
std::chrono::steady_clock::duration m_delay;
std::future<void> m_future;
};
int main()
{
time_keeper_t time_keeper;
// the mainloop now just can ask the time_keeper for seconds
// or in your case hours. The only time needed is the time
// to return an int (atomic) instead of having to make a full
// api call to get the time.
for (std::size_t n = 0; n < 30; ++n)
{
std::cout << "seconds now = " << time_keeper.seconds() << "\n";
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
return 0;
}
You don't need to query the local time for every request, because the hour doesn't change every 200ms. Just update the local hour variable every hour.
The most correct solution would be to register for a timer event that fires at the start of every hour, like a scheduled task on Windows or a cron job on Linux. Alternatively, create a timer that runs every hour and updates the variable.
Timer creation depends on the platform: for example, on Windows use SetTimer, and on Linux use timer_create. Here's a very simple solution using boost::asio, which assumes that you start it exactly on the hour. You'll need to make some modifications to allow it to start at any time, for example by creating a one-shot timer or by sleeping until the next hour.
#include <boost/asio.hpp>
#include <boost/bind/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <ctime>
#include <thread>
using namespace std::chrono_literals;
int32_t get_local_hour()
{
    time_t t = std::time(nullptr);
    if (t == -1) { return -1; }
    struct tm *time_info_ptr = localtime(&t);
    return (nullptr != time_info_ptr) ? time_info_ptr->tm_hour : -1;
}
// Written by the timer callback, read by the request threads
static std::atomic<int32_t> local_hour{get_local_hour()};
// Timer callback body, called every hour
void update_local_hour(const boost::system::error_code& e,
                       boost::asio::deadline_timer* t)
{
    if (e) { return; } // the wait was cancelled: do not re-arm
    local_hour = get_local_hour();
    // Re-arm relative to the previous expiry so the period does not drift
    t->expires_at(t->expires_at() + boost::posix_time::hours(1));
    t->async_wait(boost::bind(update_local_hour,
                              boost::asio::placeholders::error, t));
}
int main()
{
    boost::asio::io_context io;
    // Timer that fires every hour and updates the local_hour variable
    boost::asio::deadline_timer t(io, boost::posix_time::hours(1));
    t.async_wait(boost::bind(update_local_hour,
                             boost::asio::placeholders::error, &t));
    // run() blocks while handlers are pending, so give it its own thread
    std::thread io_thread([&io] { io.run(); });
    std::this_thread::sleep_for(3h);
    io.stop(); // stop the timer loop
    io_thread.join();
}
Now just use local_hour directly instead of calling GetLocalHour().

C++/Qt: How to create a busyloop which you can put on pause?

Is there a better answer to this question than creating a spinlock-like structure with a global boolean flag which is checked in the loop?
bool isRunning = true;
void busyLoop()
{
for (;;) {
if (!isRunning)
continue;
// ...
}
}
int main()
{
// ...
QPushButton *startBusyLoopBtn = new QPushButton("start busy loop");
QObject::connect(startBusyLoopBtn, &QPushButton::clicked, [](){ busyLoop(); });
QPushButton *startPauseBtn = new QPushButton("start/pause");
QObject::connect(startPauseBtn, &QPushButton::clicked, [](){ isRunning = !isRunning; });
// ...
}
To begin with, we waste the CPU time while checking the flag. Secondly, we need two separate buttons for this scheme to work. How can we use Qt's slot-signal mechanism for a simpler solution?
You can use std::condition_variable:
std::mutex mtx;
std::condition_variable cv_start_stop;
bool paused = true; // shared state, always accessed while holding mtx
std::thread thr([&](){
    /**
    * this thread will unpause the main loop 3 seconds later
    */
    std::this_thread::sleep_for(std::chrono::milliseconds(3000));
    {
        std::lock_guard<std::mutex> lock(mtx);
        paused = false;
    }
    cv_start_stop.notify_all();
});
while (true)
{
    {
        std::unique_lock<std::mutex> lock(mtx);
        // Sleeps while paused. The predicate guards against spurious wakeups
        // and against a notification sent before this thread started waiting.
        cv_start_stop.wait(lock, [&]{ return !paused; });
    }
    std::cout << "loop goes on until paused\n";
    std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
This does not busy-check a flag to continue; instead, it puts the thread to sleep until it is notified.
To pause, set paused = true while holding mtx. To unpause, set paused = false while holding mtx and then call cv_start_stop.notify_one() or cv_start_stop.notify_all().

A race condition in a custom implementation of recursive mutex

UPD: It seems that the problem I describe below does not exist. I have not been able to reproduce it for a week now, and I have started to suspect that it was caused by a compiler bug or corrupted memory.
I tried to implement my own recursive mutex in C++, but for some reason it fails. I tried to debug it, but I got stuck. (I know that there is a recursive mutex in std, but I need a custom implementation for a project where the STL is not available; this implementation was just a check of an idea.) I haven't thought about efficiency yet, but I don't understand why my straightforward implementation doesn't work.
First of all, here's the implementation of the RecursiveMutex:
class RecursiveMutex
{
std::mutex critical_section;
std::condition_variable cv;
std::thread::id id;
int recursive_calls = 0;
public:
void lock() {
auto thread = std::this_thread::get_id();
std::unique_lock<std::mutex> lock(critical_section);
cv.wait( lock, [this, thread]() {
return id == thread || recursive_calls == 0;
});
++recursive_calls;
id = thread;
}
void unlock() {
std::unique_lock<std::mutex> lock( critical_section );
--recursive_calls;
if( recursive_calls == 0 ) {
lock.unlock();
cv.notify_all();
}
}
};
The failing test is straightforward, it just runs two threads, both of them are locking and unlocking the same mutex (the recursive nature of the mutex is not tested here). Here it is:
std::vector<std::thread> threads;
void initThreads( int num_of_threads, std::function<void()> func )
{
threads.resize( num_of_threads );
for( auto& thread : threads )
{
thread = std::thread( func );
}
}
void waitThreads()
{
for( auto& thread : threads )
{
thread.join();
}
}
void test () {
RecursiveMutex mutex;
while (true) {
int count = 0;
initThreads(2, [&mutex, &count] () {
for( int i = 0; i < 100000; ++i ) {
try {
mutex.lock();
++count;
mutex.unlock();
}
catch (...) {
// Extremely rarely.
// Exception is "Operation not permitted"
assert(false);
}
}
});
waitThreads();
// Happens often
assert(count == 200000);
}
}
In this code I have two kinds of errors:
Extremely rarely I get an exception in RecursiveMutex::lock() which contains message "Operation not permitted" and is thrown from cv.wait. As far as I understand, this exception is thrown when wait is called on a mutex which is not owned by the thread. At the same time, I lock it just above calling the wait so this cannot be the case.
In most situations I just get the message "terminate called without an active exception" in the console.
My main question is what the bug is, but I'll also be happy to know how to debug and provoke race condition in such a code in general.
P.S. I use Desktop Qt 5.4.2 MinGW 32 bit.