Boost interprocess mutexes and checking for abandonment - c++

I have a need for interprocess synchronization around a piece of hardware. Because this code will need to work on Windows and Linux, I'm wrapping with Boost Interprocess mutexes. Everything works well accept my method for checking abandonment of the mutex. There is the potential that this can happen and so I must prepare for it.
I've abandoned the mutex in my testing and, sure enough, when I use scoped_lock to lock the mutex, the process blocks indefinitely. I figured the way around this is by using the timeout mechanism on scoped_lock (since much time spent Googling for methods to account for this don't really show much, boost doesn't do much around this because of portability reasons).
Without further ado, here's what I have:
#include <boost/interprocess/sync/named_recursive_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
typedef boost::interprocess::named_recursive_mutex MyMutex;
typedef boost::interprocess::scoped_lock<MyMutex> ScopedLock;
MyMutex* pGate = new MyMutex(boost::interprocess::open_or_create, "MutexName");
{
// ScopedLock lock(*pGate); // this blocks indefinitely
boost::posix_time::ptime timeout(boost::posix_time::microsec_clock::local_time() + boost::posix_time::seconds(10));
ScopedLock lock(*pGate, timeout); // a 10 second timeout that returns immediately if the mutex is abandoned ?????
if(!lock.owns()) {
delete pGate;
boost::interprocess::named_recursive_mutex::remove("MutexName");
pGate = new MyMutex(boost::interprocess::open_or_create, "MutexName");
}
}
That, at least, is the idea. Three interesting points:
When I don't use the timeout object, and the mutex is abandoned, the ScopedLock ctor blocks indefinitely. That's expected.
When I do use the timeout, and the mutex is abandoned, the ScopedLock ctor returns immediately and tells me that it doesn't own the mutex. Ok, perhaps that's normal, but why isn't it waiting for the 10 seconds I'm telling it too?
When the mutex isn't abandoned, and I use the timeout, the ScopedLock ctor still returns immediately, telling me that it couldn't lock, or take ownership, of the mutex and I go through the motions of removing the mutex and remaking it. This is not at all what I want.
So, what am I missing on using these objects? Perhaps it's staring me in the face, but I can't see it and so I'm asking for help.
I should also mention that, because of how this hardware works, if the process cannot gain ownership of the mutex within 10 seconds, the mutex is abandoned. In fact, I could probably wait as little as 50 or 60 milliseconds, but 10 seconds is a nice "round" number of generosity.
I'm compiling on Windows 7 using Visual Studio 2010.
Thanks,
Andy

When I don't use the timeout object, and the mutex is abandoned, the ScopedLock ctor blocks indefinitely. That's expected
The best solution for your problem would be if boost had support for robust mutexes. However Boost currently does not support robust mutexes. There is only a plan to emulate robust mutexes, because only linux has native support on that. The emulation is still just planned by Ion Gaztanaga, the library author.
Check this link about a possible hacking of rubust mutexes into the boost libs:
http://boost.2283326.n4.nabble.com/boost-interprocess-gt-1-45-robust-mutexes-td3416151.html
Meanwhile you might try to use atomic variables in a shared segment.
Also take a look at this stackoverflow entry:
How do I take ownership of an abandoned boost::interprocess::interprocess_mutex?
When I do use the timeout, and the mutex is abandoned, the ScopedLock ctor returns immediately and tells me that it doesn't own the mutex. Ok, perhaps that's normal, but why isn't it waiting for the 10 seconds I'm telling it too?
This is very strange, you should not get this behavior. However:
The timed lock is possibly implemented in terms of the try lock. Check this documentation:
http://www.boost.org/doc/libs/1_53_0/doc/html/boost/interprocess/scoped_lock.html#idp57421760-bb
This means, the implementation of the timed lock might throw an exception internally and then returns false.
inline bool windows_mutex::timed_lock(const boost::posix_time::ptime &abs_time)
{
sync_handles &handles =
windows_intermodule_singleton<sync_handles>::get();
//This can throw
winapi_mutex_functions mut(handles.obtain_mutex(this->id_));
return mut.timed_lock(abs_time);
}
Possibly, the handle cannot be obtained, because the mutex is abandoned.
When the mutex isn't abandoned, and I use the timeout, the ScopedLock ctor still returns immediately, telling me that it couldn't lock, or take ownership, of the mutex and I go through the motions of removing the mutex and remaking it. This is not at all what I want.
I am not sure about this one, but I think the named mutex is implemented by using a shared memory. If you are using Linux, check for the file /dev/shm/MutexName. In Linux, a file descriptor remains valid until that is not closed, no matter if you have removed the file itself by e.g. boost::interprocess::named_recursive_mutex::remove.

Check out the BOOST_INTERPROCESS_ENABLE_TIMEOUT_WHEN_LOCKING and BOOST_INTERPROCESS_TIMEOUT_WHEN_LOCKING_DURATION_MS compile flags. Define the first symbol in your code to force the interprocess mutexes to time out and the second symbol to define the timeout duration.
I helped to get them added to the library to solve the abandoned mutex issue. It was necessary to add it due to many interprocess constructs (like message_queue) that rely on the simple mutex rather than the timed mutex. There may be a more robust solution in the future, but this solution has worked just fine for my interprocess needs.
I'm sorry I can't help you with your code at the moment; something is not working correctly there.

BOOST_INTERPROCESS_ENABLE_TIMEOUT_WHEN_LOCKING is not so good. It throws an exception and does not help much. To workaround exceptional behaviour I wrote this macro. It works just alright for common purposed. In this sample named_mutex is used. The macro creates a scoped lock with a timeout, and if the lock cannot be acquired for EXCEPTIONAL reasons, it will unlock it afterwards. This way the program can lock it again later and does not freeze or crash immediately.
#define TIMEOUT 1000
#define SAFELOCK(pMutex) \
boost::posix_time::ptime wait_time \
= boost::posix_time::microsec_clock::universal_time() \
+ boost::posix_time::milliseconds(TIMEOUT); \
boost::interprocess::scoped_lock<boost::interprocess::named_mutex> lock(*pMutex, wait_time); \
if(!lock.owns()) { \
pMutex->unlock(); }
But even this is not optimal, because the code to be locked now runs unlocked once. This may cause problems. You can easily extend the macro however. E.g. run code only if lock.owns() is true.

boost::interprocess::named_mutex has 3 defination:
on windows, you can use macro to use windows mutex instead of boost mutex, you can try catch the abandoned exception, and you should unlock it!
on linux, the boost has pthread_mutex, but it not robust attribute in 1_65_1version
so I implemented interprocess_mutex myself use system API(windows Mutex and linux pthread_mutex process shared mode), but windows Mutex is in the kernel instead of file.

Craig Graham answered this in a reply already but I thought I'd elaborate because I found this, didn't read his message, and beat my head against it to figure it out.
On a POSIX system, timed lock calls:
timespec ts = ptime_to_timespec(abs_time);
pthread_mutex_timedlock(&m_mut, &ts)
Where abs_time is the ptime that the user passes into interprocess timed_lock.
The problem is, that abs_time must be in UTC, not system time.
Assume that you want to wait for 10 seconds; if you're ahead of UTC your timed_lock() will return immediately,
and if you're behind UTC, your timed_lock() will return in hours_behind - 10 seconds.
The following ptime times out an interprocess mutex in 10 seconds:
boost::posix_time::ptime now = boost::posix_time::second_clock::universal_time() +
boost::posix_time::seconds(10);
If I use ::local_time() instead of ::universal_time(), since I'm ahead of UTC, it returns immediately.
The documentation fails to mention this.
I haven't tried it, but digging into the code a bit, it looks like the same problem would occur on a non-POSIX system.
If BOOST_INTERPROCESS_POSIX_TIMEOUTS is not defined, the function ipcdetail::try_based_timed_lock(*this, abs_time) is called.
It uses universal time as well, waiting on while(microsec_clock::universal_time() < abs_time).
This is only speculation, as I don't have quick access to a Windows system to test this on.
For full details, see https://www.boost.org/doc/libs/1_76_0/boost/interprocess/sync/detail/common_algorithms.hpp

Related

Using unique locks while in DPC

Currently working on a light weight filter in the NDIS stack. I'm trying to inject a packet which set in a global variable as an NBL. During receive NBL, if an injected NBL is pending, than a lock is taken by the thread before picking the injected NBL up to process it. Originally I was looking at using a spin lock or FAST_MUTEX. But according to the documentation for FAST_MUTEX, any other threads attempting to take the lock will wait for the lock to release before continuing.
The problem is, that receive NBL is running in DPC mode. This would cause a DPC running thread to pause and wait for the lock to release. Additionally, I'd like to be able to assert ownership of a thread's ownership over a lock.
My question is, does windows kernel support unique mutex locks in the kernel, can these locks be taken in DPC mode and how expensive is assertion of ownership in the lock. I'm fairly new to C++ so forgive any syntax errors.
I attempted to define a mutex in the LWF object
// Header file
#pragma once
#include <mutex.h>
class LWFobject
{
public:
LWFobject()
std::mutex ExampleMutex;
std::unique_lock ExampleLock;
}
//=============================================
// CPP file
#include "LWFobject.h"
LWFobject::LWFObject()
{
ExmapleMutex = CreateMutex(
NULL,
FALSE,
NULL);
ExampleLock(ExampleMutex, std::defer_lock);
}
Is the use of unique_locks supported in the kernel? When I attempt to compile it, it throws hundreds of compilation errors when attempting to use mutex.h. I'd like to use try_lock and owns_lock.
You can't use standard ISO C++ synchronization mechanisms while inside a Windows kernel.
A Windows kernel is a whole other world in itself, and requires you to live by its rules (which are vast - see for example these two 700-page books: 1, 2).
Processing inside a Windows kernel is largely asynchronous and event-based; you handle events and schedule deferred calls or use other synchronization techniques for work that needs to be done later.
Having said that, it is possible to have a mutex in the traditional sense inside a Windows driver. It's called a Fast Mutex and requires raising IRQL to APC_LEVEL. Then you can use calls like ExAcquireFastMutex, ExTryToAcquireFastMutex and ExReleaseFastMutex to lock/try-lock/release it.
A fundamental property of a lock is which priority (IRQL) it's synchronized at. A lock can be acquired from lower priorities, but can never be acquired from a higher priority.
(Why? Imagine how the lock is implemented. The lock must raise the current task priority up to the lock's natural priority. If it didn't do this, then a task running at a low priority could grab the lock, get pre-empted by a higher priority task, which would then deadlock if it tried to acquire the same lock. So every lock has a documented natural IRQL, and the lock will first raise the current thread to that IRQL before attempting to acquire exclusivity.)
The NDIS datapath can run at any IRQL between PASSIVE_LEVEL and DISPATCH_LEVEL, inclusive. This means that anything on the datapath must only ever use locks that are synchronized at DISPATCH_LEVEL (or higher). This really limits your choices: you can use KSPIN_LOCKs, NDIS_RW_LOCKs, and a handful of other uncommon ones.
This gets viral: if you have one function that can sometimes run at DISPATCH_LEVEL (like the datapath), it forces the lock to be synchronized at DISPATCH_LEVEL, which forces any other functions that hold the lock to also run at DISPATCH_LEVEL. That can be inconvenient, for example you might want to hold the locks while reading from the registry too.
There are various approaches to design your driver:
* Use spinlocks everywhere. When reading from the registry, read into temporary variables, then grab a spinlock and copy the temporary variables into global state.
* Use mutexes (or better yet: pushlocks) everywhere. Quarantine the datapath into a component that runs at dispatch level, and carefully copy any configuration state into this component's private state.
* Somehow avoid having your datapath interact with the rest of your driver, so there's no shared state, and thus no shared locks.
* Have the datapath rush to PASSIVE_LEVEL by queuing all packets to a worker thread.

scoped_lock() - an RAII implementation using pthread

I have a socket shared between 4 threads and I wanted to use the RAII principle for acquiring and releasing the mutex.
The ground realities
I am using the pthread library.
I cannot use Boost.
I cannot use anything newer than C++03.
I cannot use exceptions.
The Background
Instead of having to lock the mutex for the socket everytime before using it, and then unlocking the mutex right afterwards, I thought I could write a scoped_lock() which would lock the mutex, and once it goes out of scope, it would automatically unlock the mutex.
So, quite simply I do a lock in the constructor and an unlock in the destructor, as shown here.
ScopedLock::ScopedLock(pthread_mutex_t& mutex, int& errorCode)
: m_Mutex(mutex)
{
errorCode = m_lock();
}
ScopedLock::~ScopedLock()
{
errorCode = m_unlock();
}
where m_lock() and m_unlock() are quite simply two wrapper functions around the pthread_mutex_lock() and the pthread_mutex_unlock() functions respectively, with some additional tracelines/logging.
In this way, I would not have to write at least two unlock statements, one for the good case and one for the bad case (at least one, could be more bad-paths in some situations).
The Problem
The problem that I have bumped into and the thing that I don't like about this scheme is the destructor.
I have diligiently done for every function the error-handling, but from the destructor of this ScopedLock(), I cannot inform the caller about any errors that might be returned my m_unlock().
This is a fundamental problem with RAII, but in this case, you're in luck. pthread_unlock only fails if you set up the mutex wrong (EINVAL) or if you're attempting to unlock an error checking mutex from a thread that doesn't own it (EPERM). These errors are indications of bugs in your code, not runtime errors that you should be taking into account. asserting errorCode==0 is a reasonable strategy in this case.

Concurrent code without waiting

I'm thinking about a certain kind of synchronisation primitive, but I don't know what this kind of synchronisation is called or if something like this would be working.
So there is one variable (boolean) which basically signals if one thread is still working on a block of memory or not. At the beginning the bool is set to false, meaning the worker thread is not working on your block of memory. Now the main thread gives the worker thread a "todo-list", describing how it should be working on that block of memory. After that, it changes the state of the boolean to true, so that the worker thread knows it is now allowed to do its work. The main thread can now continue its own work and checks at certain locations if the worker thread is now done working, e.g. if the boolean has been set to false again. If it is stil true, the main thread just continues its own work and doesn't wait for the worker thread. If the boolean is false, the main thread knows the worker thread is done and starts processing the block of memory.
So the boolean just transfers the ownership over a block of memory between two threads. If one thread currently does not have the ownership of that memory, it just continues with its own work, and checks repeatedly if it now has the ownership again. This way, none of the threads is waiting for one another and can continue its own work.
What is this called and how is such a behavior implemented?
EDIT: Basically it's a mutex. But instead of waiting for the mutex to be unlocked again, it continues/skips the critical code.
EDIT: Basically it's a mutex. But instead of waiting for the mutex to
be unlocked again, it continues/skips the critical code.
It's still a mutex, just with "try" methods.
in standard C++, we're talking about std::mutex::try_lock , which tries to lock the mutex, if it fails it returns false and moves on
class unlocker{
std::mutex& m_Parent;
public :
unlocker(std::mutex& parent) : m_Parent(parent){}
~unlocker() {m_Parent.unlock(); }
};
std::mutex mtx;
if (mtx.try_lock()){
unlocker unlock(mtx); // no, you can't use std::lock_guard/unique_lock here
//success, mtx is free
} else{
// do something else
}
on Native OS's code you have similar functions depending on the operating system you are on, like pthread_mutex_trylock on Unix and TryEnterCriticalSection on Windows. needless to say that standard mutex probably does use these functions behind the scenes
What will you do if the main thread runs out of work?
Suppose you keep checking and you keep reading true. Eventually you reach a point where the main thread cannot continue without the result from the worker thread. Since you have no more work to do, the only thing left is now keep checking the value of the flag over and over, wasting CPU resources that other threads could use to do useful work.
In general, this is not what you want. Instead, you would like the operating system to put your main thread to sleep and only wake it up once the worker thread has finished processing. All kinds of locks and semaphores that ship with modern operating systems work this way. Underneath there is some flag in memory that indicates who owns the lock, but there is also a bunch of logic around it that ensure the operating system won't schedule threads that have nothing to do but wait for a lock to become ready.
That being said, there are some situations where this is not what you want. If you are sufficiently sure that you won't run into the situation where one thread just spins on a lock, and you want to save the overhead that comes with the OS locks, just checking a flag like you described might be a viable option.
Note though that low-level stuff like this should be reserved for special circumstances, not be the first tool in your toolbox. It's just too easy to end up with an algorithm that is incorrect or an implementation that is not as efficient as you thought. If you decide to go down this road, be prepared to do some serious work to get it working as expected.

Boost::mutex performance vs pthread_mutex_t

I was using pthread_mutex_ts beforehand. The code sometimes got stuck. I had a couple of lines of code scattered across functions that I wrapped...
pthread_mutex_lock(&map_mutex);// Line 1
//critical code involving reading/writing wrapped around a mutex //Line 2
pthread_mutex_unlock(&map_mutex); //Line 3
Not sure how/where the code was getting stuck, I switched the pthread_mutex_t to a boost:mutex
1) If i just substitute lines 1 and 3 with boost::lock_guard<boost::mutex> lock(map_mutex); in line 1, and everything works flawlessly, what could be going wrong with the pthread implementation?
2) Am I giving up performance by switching to boost. The critical portion here is very time-sensitive so I would like the mutex to be very lightweight. (C++, redhat)
If an exception is thrown, or the function returns, between lines 1 and 3, then the mutex will not be unlocked. The next time anyone tries to lock it, their thread will wait indefinitely.
On a Posix platform, boost::mutex is a very thin wrapper around a pthread_mutex_t, and lock_guard just contains a reference to the mutex, and unlocks it in its destructor. The only extra overhead will be to initialise that reference (and even that is likely to be optimised away), and the extra code needed to unlock the mutex in the event of an exception/return, which you'd need anyway.

Cheapest way to wake up multiple waiting threads without blocking

I use boost::thread to manage threads. In my program i have pool of threads (workers) that are activated sometimes to do some job simultaneously.
Now i use boost::condition_variable: and all threads are waiting inside boost::condition_variable::wait() call on their own conditional_variableS objects.
Can i AVOID using mutexes in classic scheme, when i work with conditional_variables? I want to wake up threads, but don't need to pass some data to them, so don't need a mutex to be locked/unlocked during awakening process, why should i spend CPU on this (but yes, i should remember about spurious wakeups)?
The boost::condition_variable::wait() call trying to REACQUIRE the locking object when CV received the notification. But i don't need this exact facility.
What is cheapest way to awake several threads from another thread?
If you don't reacquire the locking object, how can the threads know that they are done waiting? What will tell them that? Returning from the block tells them nothing because the blocking object is stateless. It doesn't have an "unlocked" or "not blocking" state for it to return in.
You have to pass some data to them, otherwise how will they know that before they had to wait and now they don't? A condition variable is completely stateless, so any state that you need must be maintained and passed by you.
One common pattern is to use a mutex, condition variable, and a state integer. To block, do this:
Acquire the mutex.
Copy the value of the state integer.
Block on the condition variable, releasing the mutex.
If the state integer is the same as it was when you coped it, go to step 3.
Release the mutex.
To unblock all threads, do this:
Acquire the mutex.
Increment the state integer.
Broadcast the condition variable.
Release the mutex.
Notice how step 4 of the locking algorithm tests whether the thread is done waiting? Notice how this code tracks whether or not there has been an unblock since the thread decided to block? You have to do that because condition variables don't do it themselves. (And that's why you need to reacquire the locking object.)
If you try to remove the state integer, your code will behave unpredictably. Sometimes you will block too long due to missed wakeups and sometimes you won't block long enough due to spurious wakeups. Only a state integer (or similar predicate) protected by the mutex tells the threads when to wait and when to stop waiting.
Also, I haven't seen how your code uses this, but it almost always folds into logic you're already using. Why did the threads block anyway? Is it because there's no work for them to do? And when they wakeup, are they going to figure out what to do? Well, finding out that there's no work for them to do and finding out what work they do need to do will require some lock since it's shared state, right? So there almost always is already a lock you're holding when you decide to block and need to reacquire when you're done waiting.
For controlling threads doing parallel jobs, there is a nice primitive called a barrier.
A barrier is initialized with some positive integer value N representing how many threads it holds. A barrier has only a single operation: wait. When N threads call wait, the barrier releases all of them. Additionally, one of the threads is given a special return value indicating that it is the "serial thread"; that thread will be the one to do some special job, like integrating the results of the computation from the other threads.
The limitation is that a given barrier has to know the exact number of threads. It's really suitable for parallel processing type situations.
POSIX added barriers in 2003. A web search indicates that Boost has them, too.
http://www.boost.org/doc/libs/1_33_1/doc/html/barrier.html
Generally speaking, you can't.
Assuming the algorithm looks something like this:
ConditionVariable cv;
void WorkerThread()
{
for (;;)
{
cv.wait();
DoWork();
}
}
void MainThread()
{
for (;;)
{
ScheduleWork();
cv.notify_all();
}
}
NOTE: I intentionally omitted any reference to mutexes in this pseudo-code. For the purposes of this example, we'll suppose ConditionVariable does not require a mutex.
The first time through MainTnread(), work is queued and then it notifies WorkerThread() that it should execute its work. At this point two things can happen:
WorkerThread() completes DoWork() before MainThread() can complete ScheduleWork().
MainThread() completes ScheduleWork() before WorkerThread() can complete DoWork().
In case #1, WorkerThread() comes back around to sleep on the CV, and is awoken by the next cv.notify() and all is well.
In case #2, MainThread() comes back around and notifies... nobody and continues on. Meanwhile WorkerThread() eventually comes back around in its loop and waits on the CV but it is now one or more iterations behind MainThread().
This is known as a "lost wakeup". It is similar to the notorious "spurious wakeup" in that the two threads now have different ideas about how many notify()s have taken place. If you are expecting the two threads to maintain synchrony (and usually you are), you need some sort of shared synchronization primitive to control it. This is where the mutex comes in. It helps avoid lost wakeups which, arguably, are a more serious problem than the spurious variety. Either way, the effects can be serious.
UPDATE: For further rationale behind this design, see this comment by one of the original POSIX authors: https://groups.google.com/d/msg/comp.programming.threads/cpJxTPu3acc/Hw3sbptsY4sJ
Spurious wakeups are two things:
Write your program carefully, and make sure it works even if you
missed something.
Support efficient SMP implementations
There may be rare cases where an "absolutely, paranoiacally correct"
implementation of condition wakeup, given simultaneous wait and
signal/broadcast on different processors, would require additional
synchronization that would slow down ALL condition variable operations
while providing no benefit in 99.99999% of all calls. Is it worth the
overhead? No way!
But, really, that's an excuse because we wanted to force people to
write safe code. (Yes, that's the truth.)
boost::condition_variable::notify_*(lock) does NOT require that the caller hold the lock on the mutex. THis is a nice improvement over the Java model in that it decouples the notification of threads with the holding of the lock.
Strictly speaking, this means the following pointless code SHOULD DO what you are asking:
lock_guard lock(mutex);
// Do something
cv.wait(lock);
// Do something else
unique_lock otherLock(mutex);
//do something
otherLock.unlock();
cv.notify_one();
I do not believe you need to call otherLock.lock() first.