I'm using a std::timed_mutex for the first time and it's not behaving the way I expect. It appears to fail immediately instead of waiting for the mutex. I'm providing the lock timeout in milliseconds (as shown here http://www.cplusplus.com/reference/mutex/timed_mutex/try_lock_for/). But the call to try_lock_for() fails right away.
Here's the class that handles locking and unlocking the mutex:
const unsigned int DEFAULT_MUTEX_WAIT_TIME_MS = 5 * 60 * 1000;
class ScopedTimedMutexLock
{
public:
ScopedTimedMutexLock(std::timed_mutex* sourceMutex, unsigned int numWaitMilliseconds=DEFAULT_MUTEX_WAIT_TIME_MS)
    : m_mutex(sourceMutex)
{
if( !m_mutex->try_lock_for( std::chrono::milliseconds(numWaitMilliseconds) ) )
{
std::string message = "Timeout attempting to acquire mutex lock for ";
message += Conversion::toString(numWaitMilliseconds);
message += "ms";
throw MutexException(message);
}
}
~ScopedTimedMutexLock()
{
m_mutex->unlock();
}
private:
std::timed_mutex* m_mutex;
};
And this is where it's being used:
void CommandService::Process( RequestType& request )
{
unsigned long callTime =
std::chrono::duration_cast< std::chrono::milliseconds >(
std::chrono::system_clock::now().time_since_epoch()
).count();
try
{
ScopedTimedMutexLock lock( m_classMutex, request.getLockWaitTimeMs(DEFAULT_MUTEX_WAIT_TIME_MS) );
// ... command processing code goes here
}
catch( MutexException& mutexException )
{
unsigned long catchTime =
std::chrono::duration_cast< std::chrono::milliseconds >(
std::chrono::system_clock::now().time_since_epoch()
).count();
cout << "The following error occured while attempting to process command"
<< "\n call time: " << callTime
<< "\n catch time: " << catchTime;
cout << mutexException.description();
}
}
Here's the console output:
The following error occured while attempting to process command
call time: 1131268914
catch time: 1131268914
Timeout attempting to acquire mutex lock for 300000ms
Any idea where this is going wrong? Is the conversion to std::chrono::milliseconds correct? How do I make try_lock_for() wait for the lock?
ADDITIONAL INFO: The call to try_lock_for() didn't always fail immediately. Many times the call acquired the lock and everything worked as expected. The failures I was seeing were intermittent. See my answer below for details about why this was failing.
The root cause of the problem is mentioned in the description for try_lock_for() at http://en.cppreference.com/w/cpp/thread/timed_mutex/try_lock_for. Near the end of the description it says:
As with try_lock(), this function is allowed to fail spuriously and
return false even if the mutex was not locked by any other thread at
some point during timeout_duration.
I naively assumed there were only two possible outcomes: (1) the function acquires the lock within the time period, or (2) the function fails after the wait time has elapsed. But there is another possibility, (3) the function fails after a relatively short time for no specified reason. TL;DR, my bad.
I solved the problem by rewriting the ScopedTimedMutexLock constructor to loop on try_lock() until the lock is acquired or the wait time limit is exceeded.
ScopedTimedMutexLock(std::timed_mutex* sourceMutex, unsigned int numWaitMilliseconds=DEFAULT_MUTEX_WAIT_TIME_MS)
    : m_mutex(sourceMutex)
{
const unsigned SLEEP_TIME_MS = 5;
bool isLocked = false;
unsigned long startMS = now();
while( now() - startMS < numWaitMilliseconds && !isLocked )
{
isLocked = m_mutex->try_lock();
if( !isLocked )
{
std::this_thread::sleep_for(
std::chrono::milliseconds(SLEEP_TIME_MS));
}
}
if( !isLocked )
{
std::string message = "Timeout attempting to acquire mutex lock for ";
message += Conversion::toString(numWaitMilliseconds);
message += "ms";
throw MutexException(message);
}
}
Where now() is defined like this:
private:
unsigned long now() {
return std::chrono::duration_cast< std::chrono::milliseconds >(
std::chrono::system_clock::now().time_since_epoch() ).count();
}
Just a little bump for those who come to this late, and many thanks for the help!
I got much the same behavior, only with std::shared_timed_mutex. After some digging I found that both try_lock_for() and try_lock_until() fail immediately if the same thread already holds an exclusive lock on the mutex, which at least saves time when testing broken code.
Tested with gcc-9, gcc-10, clang-10 and clang-12.
I did NOT test other possible combinations, such as requesting an exclusive lock while holding a shared lock, or requesting a shared lock while holding either an exclusive or a shared lock.
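To illustrate, here is a minimal sketch of the behaviour described above. Note that calling try_lock_for() on a mutex the same thread already owns is formally undefined behaviour, so this only shows what the implementations I tested happen to do:

#include <chrono>
#include <iostream>
#include <shared_mutex>   // std::shared_timed_mutex (C++14)

int main()
{
    std::shared_timed_mutex m;
    m.lock();   // this thread now owns the exclusive lock

    auto start = std::chrono::steady_clock::now();
    bool got = m.try_lock_for(std::chrono::seconds(2));   // formally UB; observed to return false immediately
    auto waited = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);

    std::cout << "acquired: " << std::boolalpha << got
              << ", waited " << waited.count() << " ms\n";

    if (got) m.unlock();   // balance the second lock if it unexpectedly succeeded
    m.unlock();
}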
Related
I have a method:
void move_robot(const vector<vector<double> > &map) {
// accumulate the footprint while the robot moves
// Iterate through the path
//std::unique_lock<std::mutex> lck(mtx);
for (unsigned int i=1; i < map.size(); i++) {
while (distance(position , map[i]) > DISTANCE_TOLERANCE ) {
this->direction = unitary_vector(map[i], this->position);
this->next_step();
lck.unlock();
this_thread::sleep_for(chrono::milliseconds(10)); // sleep for 10 ms
lck.lock();
}
std::cout << "New position is x:" << this->position[0] << " and y:" << this->position[1] << std::endl;
}
this->moving = false;
// notify to end
}
When the sleep and locks are included I get:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 143
Killed - processing time exceeded
Nevertheless, if I comment all the locks and this_thread::sleep_for it works as expected.
I need the locks because I am dealing with other threads. The complete code is this one: https://godbolt.org/z/7ErjrG
I am quite stuck because the output does not provide much information.
You have not posted the code of next_step or the definition of mtx; this is important information.
std::mutex mtx;
void next_step() {
std::unique_lock<std::mutex> lck(mtx);
this->position[0] += DT * this->direction[0];
this->position[1] += DT * this->direction[1];
}
If you read the manual for std::mutex you find out:
A calling thread must not own the mutex prior to calling lock or try_lock.
And std::unique_lock:
Locks the associated mutex by calling m.lock(). The behavior is undefined if the current thread already owns the mutex except when the mutex is recursive.
next_step, called from move_robot, violates this: it tries to lock a mutex that the calling thread already owns.
The relevant topic for your question is Can unique_lock be used with a recursive_mutex?. There you get the fix:
std::recursive_mutex mtx;
std::unique_lock<std::recursive_mutex> lck(mtx);
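A minimal, self-contained sketch of why this helps (inner() and outer() stand in for next_step and move_robot; nothing here is taken from your actual code):

#include <iostream>
#include <mutex>

std::recursive_mutex mtx;

void inner()
{
    // the same thread may re-acquire a recursive_mutex it already holds
    std::unique_lock<std::recursive_mutex> lck(mtx);
    std::cout << "inner step done\n";
}

void outer()
{
    std::unique_lock<std::recursive_mutex> lck(mtx);
    inner();   // with a plain std::mutex this call would be undefined behaviour
    std::cout << "outer step done\n";
}

int main()
{
    outer();
}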
According to the documentation
the currently-running fiber retains control until it invokes some
operation that passes control to the manager
I can think of only one such operation, boost::this_fiber::yield, which may cause control to switch from fiber to fiber. However, when I run something like
bf::fiber([](){std::cout << "Bang!" << std::endl;}).detach();
bf::fiber([](){std::cout << "Bung!" << std::endl;}).detach();
I get output like
Bang!Bung!
\n
\n
This means control was passed between the << operators from one fiber to another. How could that happen? Why? What is the general definition of control passing from fiber to fiber in the context of the boost::fiber library?
EDIT001:
Can't get away without code:
#include <boost/fiber/fiber.hpp>
#include <boost/fiber/mutex.hpp>
#include <boost/fiber/barrier.hpp>
#include <boost/fiber/algo/algorithm.hpp>
#include <boost/fiber/algo/work_stealing.hpp>
namespace bf = boost::fibers;
class GreenExecutor
{
std::thread worker;
bf::condition_variable_any cv;
bf::mutex mtx;
bf::barrier barrier;
public:
GreenExecutor() : barrier {2}
{
worker = std::thread([this] {
bf::use_scheduling_algorithm<bf::algo::work_stealing>(2);
// wait till all threads joining the work stealing have been registered
barrier.wait();
mtx.lock();
// suspend main-fiber from the worker thread
cv.wait(mtx);
mtx.unlock();
});
bf::use_scheduling_algorithm<bf::algo::work_stealing>(2);
// wait till all threads have been registered the scheduling algorithm
barrier.wait();
}
template<typename T>
void PostWork(T&& functor)
{
bf::fiber {std::move(functor)}.detach();
}
~GreenExecutor()
{
cv.notify_all();
worker.join();
}
};
int main()
{
GreenExecutor executor;
std::this_thread::sleep_for(std::chrono::seconds(1));
int i = 0;
for (auto j = 0ul; j < 10; ++j) {
executor.PostWork([idx {++i}]() {
auto res = pow(sqrt(sin(cos(tan(idx)))), M_1_PI);
std::cout << idx << " - " << res << std::endl;
});
}
while (true) {
boost::this_fiber::yield();
}
return 0;
}
Output
2 - 1 - -nan
0.503334 3 - 4 - 0.861055
0.971884 5 - 6 - 0.968536
-nan 7 - 8 - 0.921959
0.9580699
- 10 - 0.948075
0.961811
OK, there were a couple of things I missed. First, my conclusion was based on a misunderstanding of how things work in boost::fiber.
The line in the constructor mentioned in the question
bf::use_scheduling_algorithm<bf::algo::work_stealing>(2);
was installing the scheduler in the thread where the GreenExecutor instance was created (the main thread). So, by launching the worker, I actually ended up with two threads processing the submitted fibers, and they processed those fibers asynchronously, which mixed up the std::cout output. No magic; everything works as expected, and boost::this_fiber::yield is still the only way to pass control from one fiber to another within a single thread.
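A small self-contained sketch of that conclusion (not taken from the code above): with a single thread and the default round-robin scheduler, each fiber keeps control until it explicitly yields or blocks, so the prints can only interleave at the yield point:

#include <boost/fiber/all.hpp>
#include <iostream>

namespace bf = boost::fibers;

int main()
{
    for (int i = 0; i < 3; ++i) {
        bf::fiber([i] {
            std::cout << "fiber " << i << " before yield" << std::endl;
            boost::this_fiber::yield();   // control passes to another ready fiber only here
            std::cout << "fiber " << i << " after yield" << std::endl;
        }).detach();
    }

    // let the detached fibers run to completion on this thread's scheduler
    for (int i = 0; i < 10; ++i) {
        boost::this_fiber::yield();
    }
    return 0;
}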
I'm building a simulator to test student code for a very simple robot. I need to run two functions (to update the robot's sensors and position) on separate threads at regular time intervals. My current implementation is very CPU-inefficient because it has a thread dedicated simply to incrementing numbers to keep track of the position in the code. My recent theory is that I may be able to use sleep to provide the time delay between updates of the sensor values and the robot position. My first question is: is this efficient? Second: is there any way to do a similar thing but measure clock cycles instead of seconds?
Putting a thread to sleep by waiting on a mutex-like object is generally efficient. A common pattern involves waiting on a mutex with a timeout. When the timeout is reached, the interval is up. When the mutex is released, it is the signal for the thread to terminate.
Pseudocode:
void threadMethod() {
for(;;) {
bool signalled = this->mutex.wait(1000);
if(signalled) {
break; // Signalled, owners wants us to terminate
}
// Timeout, meaning our wait time is up
doPeriodicAction();
}
}
void start() {
this->mutex.enter();
this->thread.start(threadMethod);
}
void stop() {
this->mutex.leave();
this->thread.join();
}
On Windows systems, timeouts are generally specified in milliseconds and are accurate to roughly within 16 milliseconds (timeBeginPeriod() may be able to improve this). I do not know of a CPU cycle-triggered synchronization primitive. There are lightweight mutexes called "critical sections" that spin the CPU for a few thousand cycles before delegating to the OS thread scheduler. Within this time they are fairly accurate.
On Linux systems the accuracy may be a bit higher (high frequency timer or tickless kernel) and in addition to mutexes, there are "futexes" (fast mutex) which are similar to Windows' critical sections.
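For a portable version of that pattern, here is a rough C++11 sketch (my own names, not from the question) using std::condition_variable::wait_for and a stop flag instead of a raw mutex timeout:

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class PeriodicWorker {
public:
    void start() {
        worker_ = std::thread([this] {
            std::unique_lock<std::mutex> lock(mtx_);
            // wait_for returns false on timeout (interval elapsed) and true once stop_ is set
            while (!cv_.wait_for(lock, std::chrono::milliseconds(1000),
                                 [this] { return stop_; })) {
                doPeriodicAction();   // note: still holding the lock here
            }
        });
    }
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            stop_ = true;
        }
        cv_.notify_one();   // wakes the worker immediately instead of waiting out the interval
        worker_.join();
    }
private:
    void doPeriodicAction() { /* update sensors / position here */ }
    std::thread worker_;
    std::mutex mtx_;
    std::condition_variable cv_;
    bool stop_ = false;
};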
I'm not sure I grasped what you're trying to achieve, but if you want to test student code, you might want to use a virtual clock and control the passing of time yourself, for example by calling a processInputs() and a decideMovements() method that the students have to provide. After each call, one time slot is up.
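A rough sketch of that idea (StudentRobot and its two methods are hypothetical names for the interface the students would implement):

#include <iostream>

struct StudentRobot {
    void processInputs()   { /* student code: read the simulated sensors */ }
    void decideMovements() { /* student code: update the simulated position */ }
};

int main()
{
    StudentRobot robot;
    const int kTimeSlots = 1000;          // simulated duration, independent of wall-clock time

    for (int slot = 0; slot < kTimeSlots; ++slot) {
        robot.processInputs();            // sensors see the world at time `slot`
        robot.decideMovements();          // robot reacts; one time slot elapses
    }
    std::cout << "simulated " << kTimeSlots << " time slots\n";
}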
This C++11 code uses std::chrono::high_resolution_clock to measure subsecond timing, and std::thread to run three threads. The std::this_thread::sleep_for() function is used to sleep for a specified time.
#include <iostream>
#include <thread>
#include <vector>
#include <chrono>
void seconds()
{
using namespace std::chrono;
high_resolution_clock::time_point t1, t2;
for (unsigned i=0; i<10; ++i) {
std::cout << i << "\n";
t1 = high_resolution_clock::now();
std::this_thread::sleep_for(std::chrono::seconds(1));
t2 = high_resolution_clock::now();
duration<double> elapsed = duration_cast<duration<double> >(t2-t1);
std::cout << "\t( " << elapsed.count() << " seconds )\n";
}
}
int main()
{
std::vector<std::thread> t;
t.push_back(std::thread{[](){
std::this_thread::sleep_for(std::chrono::seconds(3));
std::cout << "awoke after 3\n"; }});
t.push_back(std::thread{[](){
std::this_thread::sleep_for(std::chrono::seconds(7));
std::cout << "awoke after 7\n"; }});
t.push_back(std::thread{seconds});
for (auto &thr : t)
thr.join();
}
It's hard to know whether this meets your needs because there are a lot of details missing from the question. Under Linux, compile with:
g++ -Wall -Wextra -pedantic -std=c++11 timers.cpp -o timers -lpthread
Output on my machine:
0
( 1.00014 seconds)
1
( 1.00014 seconds)
2
awoke after 3
( 1.00009 seconds)
3
( 1.00015 seconds)
4
( 1.00011 seconds)
5
( 1.00013 seconds)
6
awoke after 7
( 1.0001 seconds)
7
( 1.00015 seconds)
8
( 1.00014 seconds)
9
( 1.00013 seconds)
Other C++11 standard features that may be of interest include timed_mutex and promise/future.
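For example, a future can double as an interruptible sleep, something a raw sleep_for cannot do (a sketch, with arbitrary timings):

#include <chrono>
#include <future>
#include <iostream>
#include <thread>

int main()
{
    std::promise<void> stop_signal;
    std::future<void> stop_future = stop_signal.get_future();

    std::thread worker([&stop_future]() {
        // wait_for returns future_status::timeout until the promise is fulfilled,
        // so each iteration is effectively an interruptible 200 ms sleep
        while (stop_future.wait_for(std::chrono::milliseconds(200))
               == std::future_status::timeout) {
            std::cout << "tick\n";
        }
        std::cout << "stopped\n";
    });

    std::this_thread::sleep_for(std::chrono::seconds(1));
    stop_signal.set_value();   // wakes the worker without waiting out the interval
    worker.join();
}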
Yes, your theory is correct. You can use sleep to add a delay between executions of a function on a thread. Efficiency depends on how large you can make that delay and still get the desired result. You would have to explain the details of your implementation; for example, we don't know whether the two threads are dependent (in that case you have to take care of synchronization, which would cost some cycles).
Here's one way to do it. I'm using C++11 threads, atomics, and the high-resolution clock. The scheduler calls back a function that takes dt seconds, which is the time elapsed since the last call. The loop can be stopped by calling the stop() method or if the callback function returns false.
Scheduler code
#include <thread>
#include <chrono>
#include <functional>
#include <atomic>
#include <system_error>
class ScheduledExecutor {
public:
ScheduledExecutor()
{}
ScheduledExecutor(const std::function<bool(double)>& callback, double period)
{
initialize(callback, period);
}
void initialize(const std::function<bool(double)>& callback, double period)
{
callback_ = callback;
period_ = period;
keep_running_ = false;
}
void start()
{
keep_running_ = true;
sleep_time_sum_ = 0;
period_count_ = 0;
th_ = std::thread(&ScheduledExecutor::executorLoop, this);
}
void stop()
{
keep_running_ = false;
try {
th_.join();
}
catch(const std::system_error& /* e */)
{ }
}
double getSleepTimeAvg()
{
//TODO: make this function thread safe by using atomic types;
//right now that is not done, for performance, and the return
//value of this function is purely informational/for debugging purposes
return sleep_time_sum_ / period_count_;
}
unsigned long getPeriodCount()
{
return period_count_;
}
private:
typedef std::chrono::high_resolution_clock clock;
template <typename T>
using duration = std::chrono::duration<T>;
void executorLoop()
{
clock::time_point call_end = clock::now();
while (keep_running_) {
clock::time_point call_start = clock::now();
duration<double> since_last_call = call_start - call_end;
if (period_count_ > 0 && !callback_(since_last_call.count()))
break;
call_end = clock::now();
duration<double> call_duration = call_end - call_start;
double sleep_for = period_ - call_duration.count();
sleep_time_sum_ += sleep_for;
++period_count_;
if (sleep_for > MinSleepTime)
std::this_thread::sleep_for(std::chrono::duration<double>(sleep_for));
}
}
private:
double period_;
std::thread th_;
std::function<bool(double)> callback_;
std::atomic_bool keep_running_;
static constexpr double MinSleepTime = 1E-9;
double sleep_time_sum_;
unsigned long period_count_;
};
Example usage
bool worldUpdator(World& w, double dt)
{
w.update(dt);
return true;
}
int main() {
//create world for your simulator
World w(...);
//start scheduler loop for every 2ms calls
ScheduledExecutor exec;
exec.initialize(
std::bind(worldUpdator, std::ref(w), std::placeholders::_1),
2E-3);
exec.start();
//main thread just checks on the results every now and then
while (true) {
if (exec.getPeriodCount() % 10000 == 0) {
std::cout << exec.getSleepTimeAvg() << std::endl;
}
}
}
There are also other, related questions on SO.
I'm using Boost 1.41 in a linux app that receives data on one thread and sticks it in a queue, another thread pops it off the queue and processes it. To make it thread safe I'm using scoped locks.
My problem is that very infrequently the lock function fails in the read function with the message:
void boost::mutex::lock() Assertion '!pthread_mutex_lock(&m)' failed
It is very infrequent; on the last run, it took 36 hours (~425M transactions) before it failed. The read and write functions are listed below; it's always in the read function that the assert arises.
Write to queue
void PacketForwarder::Enqueue(const byte_string& newPacket, long sequenceId)
{
try
{
boost::mutex::scoped_lock theScopedLock(pktQueueLock);
queueItem itm(newPacket,sequenceId);
packetQueue.push(itm);
if (IsConnecting() && packetQueue.size() > MaximumQueueSize)
{
// Reached maximum queue size while client unavailable; popping.
packetQueue.pop();
}
}
catch(...)
{
std::cout << name << " Exception was caught:" << std::endl;
}
}
Read from queue
while ( shouldRun )
{
try
{
if (clientSetsHaveChanged)
{
tryConnect();
}
size_t size = packetQueue.size();
if (size > 0)
{
byte_string packet;
boost::mutex::scoped_lock theQLock(pktQueueLock);
queueItem itm = packetQueue.front();
packet = itm.data;
packetQueue.pop();
BytesSent += packet.size();
trySend(packet);
}
else
{
boost::this_thread::sleep(boost::posix_time::milliseconds(50));
}
}
catch (...)
{
cout << name << " Other exception in send packet" << endl;
}
}
I've googled and found a few problems when destroying scoped_locks but nothing on failing to get a lock. I have also had a search through boost release notes and Trac logs to see if this has been identified as an issue by anyone else. I thought my code was about as simple as it gets but obviously something is up. Any thoughts?
TIA
Paul
There is one thread-safety issue in your program, in this piece of code:
size_t size = packetQueue.size();
if (size > 0)
{
byte_string packet;
boost::mutex::scoped_lock theQLock(pktQueueLock);
queueItem itm = packetQueue.front();
packet = itm.data;
packetQueue.pop();
// ...
}
The issue here is that between the time you checked the queue size and the time you got the lock some other reader thread might take the last item out of the queue, which will cause front() and pop() to fail. Unless you have only one reader thread, you need the size check to be under the lock as well.
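A minimal sketch of the read side with the check and the pop under one lock (using int as a stand-in for the queue's item type, and leaving the send outside the critical section):

#include <queue>
#include <boost/thread/mutex.hpp>

std::queue<int> packetQueue;
boost::mutex pktQueueLock;

// One reader iteration: the emptiness check and the pop happen under the same
// lock, so another reader can no longer drain the queue in between.
bool popOnePacket(int& packet)
{
    boost::mutex::scoped_lock theQLock(pktQueueLock);
    if (packetQueue.empty()) {
        return false;                 // nothing to do; the caller can sleep
    }
    packet = packetQueue.front();
    packetQueue.pop();
    return true;                      // caller sends `packet` after the lock is released
}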
I do not know if this is the reason for the assertion failure, though. The assertion means the call to pthread_mutex_lock returned a non-zero value, signaling an error. Unfortunately, Boost does not show exactly which of the possible pthread_mutex_lock errors occurred.
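If you need the actual error code, one option is a small standalone test that calls pthread_mutex_lock directly and prints the result (purely diagnostic, nothing to do with your program's logic):

#include <cstring>
#include <iostream>
#include <pthread.h>

int main()
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    int rc = pthread_mutex_lock(&m);
    if (rc != 0) {
        // possible values include EAGAIN, EINVAL and EDEADLK ("Resource deadlock avoided")
        std::cout << "pthread_mutex_lock failed: " << std::strerror(rc) << "\n";
    } else {
        pthread_mutex_unlock(&m);
    }
    pthread_mutex_destroy(&m);
    return rc;
}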
I've written a timer using std::thread; here is what it looks like:
TestbedTimer::TestbedTimer(char type, void* contextObject) :
Timer(type, contextObject) {
this->active = false;
}
TestbedTimer::~TestbedTimer(){
if (this->active) {
this->active = false;
if(this->timer->joinable()){
try {
this->timer->join();
} catch (const std::system_error& e) {
std::cout << "Caught system_error with code " << e.code() <<
" meaning " << e.what() << '\n';
}
}
if(timer != nullptr) {
delete timer;
}
}
}
void TestbedTimer::run(unsigned long timeoutInMicroSeconds){
this->active = true;
timer = new std::thread(&TestbedTimer::sleep, this, timeoutInMicroSeconds);
}
void TestbedTimer::sleep(unsigned long timeoutInMicroSeconds){
unsigned long interval = 500000;
if(timeoutInMicroSeconds < interval){
interval = timeoutInMicroSeconds;
}
while((timeoutInMicroSeconds > 0) && (active == true)){
if (active) {
timeoutInMicroSeconds -= interval;
/// set the sleep time
std::chrono::microseconds duration(interval);
/// set thread to sleep
std::this_thread::sleep_for(duration);
}
}
if (active) {
this->notifyAllListeners();
}
}
void TestbedTimer::interrupt(){
this->active = false;
}
I'm not really happy with that kind of implementation since I let the timer sleep for a short interval and check if the active flag has changed (but I don't know a better solution since you can't interrupt a sleep_for call). However, my program core dumps with the following message:
thread is joinable
Caught system_error with code generic:35 meaning Resource deadlock avoided
thread has rejoined main scope
terminate called without an active exception
Aborted (core dumped)
I've looked up this error and it seems that I have a thread which waits for another thread (the reason for the resource deadlock). However, I want to find out where exactly this happens. I'm using a C library (which uses pthreads) in my C++ code which provides, among other features, an option to run as a daemon, and I'm afraid that this interferes with my std::thread code. What's the best way to debug this?
I've tried to use helgrind, but this hasn't helped very much (it doesn't find any error).
TIA
** EDIT: The code above is not example code, but code I've written for a routing daemon. The routing algorithm is reactive, meaning it starts a route discovery only if it has no route to a desired destination; it does not try to build up a routing table for every host in its network. Every time a route discovery is triggered, a timer is started. If the timer expires, the daemon is notified and the packet is dropped. Basically, it looks like this:
void Client::startNewRouteDiscovery(Packet* packet) {
AddressPtr destination = packet->getDestination();
...
startRouteDiscoveryTimer(packet);
...
}
void Client::startRouteDiscoveryTimer(const Packet* packet) {
RouteDiscoveryInfo* discoveryInfo = new RouteDiscoveryInfo(packet);
/// create a new timer of a certain type
Timer* timer = getNewTimer(TimerType::ROUTE_DISCOVERY_TIMER, discoveryInfo);
/// pass this class as the callback object which is notified if the timer expires (the class implements an interface for that)
timer->addTimeoutListener(this);
/// start the timer
timer->run(routeDiscoveryTimeoutInMilliSeconds * 1000);
AddressPtr destination = packet->getDestination();
runningRouteDiscoveries[destination] = timer;
}
If the timer has expired the following method is called.
void Client::timerHasExpired(Timer* responsibleTimer) {
char timerType = responsibleTimer->getType();
switch (timerType) {
...
case TimerType::ROUTE_DISCOVERY_TIMER:
handleExpiredRouteDiscoveryTimer(responsibleTimer);
return;
....
default:
// if this happens its a bug in our code
logError("Could not identify expired timer");
delete responsibleTimer;
}
}
I hope that helps to give a better understanding of what I'm doing. However, I did not intend to bloat the question with that additional code.