Portable C++ boost::interprocess::mutex, another try - c++

I have been looking for a long time for a portable, robust solution for multiprocess synchronization. Anyone who has touched these things knows that a good solution is the boost::interprocess named sync objects. But ....
When your process holds a locked named_mutex and dies (there are many normal situations in which a process dies, not just bugs), the named_mutex remains in the locked state. There was an attempt at a robust_mutex in the Boost code, made by Ion Gaztanaga (www.boost.org/doc/libs/1_55_0/boost/interprocess/detail/robust_emulation.hpp).
He had a nice idea for detecting the abandoned state. Each participating process has its own lock file and holds that file locked while it is alive. When a lock attempt fails, Ion's robust_mutex checks the lock file of the current owner process and can thus determine whether the current mutex owner is alive. If the owner is dead, the mutex can be taken. The file-lock trick is a nice idea because the OS releases file locks when a process dies, and this seems to be quite portable. This solution wraps the base spin_mutex and holds the current owner's process id in an internal field. I did intensive testing and found 2 big problems.
The file lock handling, and the way it is implemented, slows the mutex down so much that it is faster to just use a file lock.
Decoupling the effective lock gate variable from the owner process id causes situations where the mutex can be stolen by different processes.
And here comes my question: I'm proposing a solution for both problems, and I would like to have some expert opinions about it.
Do not use a separate lock file for each existing process; instead use one file for all possible process ids (4MB should be enough) and lock just one byte per process. The position of that byte is determined by the process id itself. (This is not my idea; I found it in the code of Howard Chu and his excellent LMDB.)
Do not wrap spin_mutex as is, but rewrite its code so that it uses the current owner's process id as the lock gate instead of just 0/1, so that locking and unlocking each happen in one atomic CAS operation.
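To make the second proposal concrete, here is a minimal sketch of a lock gate that stores the owner's id instead of 0/1. It is not the code from the post and not Boost's implementation: the class name owner_gate_mutex and its member functions are hypothetical, the liveness check via the lock file is left out, and a real interprocess version would have to place the atomic in shared memory and rely on it being lock-free.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: the gate holds the owner's process id (0 means free),
// so the lock state and the owner identity change in one atomic step.
class owner_gate_mutex {
public:
    // Acquire the free mutex for process `self` (must be non-zero).
    bool try_lock(std::uint32_t self) {
        std::uint32_t expected = 0;
        return gate_.compare_exchange_strong(expected, self,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Take over from `dead_owner`, assuming the caller has already verified
    // (for example through the lock file) that this owner is gone. The CAS
    // fails if some other process recovered the mutex in the meantime.
    bool try_take_over(std::uint32_t dead_owner, std::uint32_t self) {
        std::uint32_t expected = dead_owner;
        return gate_.compare_exchange_strong(expected, self,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Release only if `self` is still the recorded owner.
    bool unlock(std::uint32_t self) {
        std::uint32_t expected = self;
        return gate_.compare_exchange_strong(expected, 0u,
                                             std::memory_order_release,
                                             std::memory_order_relaxed);
    }

    std::uint32_t owner() const { return gate_.load(std::memory_order_acquire); }

private:
    std::atomic<std::uint32_t> gate_{0};
};
```

Because the take-over compares against the specific dead owner's id, two processes that both notice the dead owner cannot both win, which is the stealing problem described above.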
I tried to implement it and tested it on Windows. I use the original Boost code and call Boost where necessary. Here is the code. It's taken from our project tree, so if you want to test it you may have to adapt some includes. It's a proposal, so please do not blame me for code style or anything else. If the idea and approach are good, I'll continue to improve it. If not, I'll just use something else, but I can't find anything else.
There are also versions for recursive_mutex and named_mutex. Then there is a sort of fix-up proposal, because if one process takes ownership of a previously abandoned mutex, there is a high probability that some sort of integrity check must be done.
I'd like to discuss possible improvements.
Thank you in advance
Ladislav.

Related

Reader/Writer: multiple heavy readers, only 1 write per day

I have a huge tbb::concurrent_unordered_map that gets "read" heavily by multiple (~60) threads concurrently.
Once per day I need to clear it (either fully, or selectively). Erasing is obviously not thread safe in tbb implementation, so some synchronisation needs to be in place to prevent UB.
However I know for a fact that the "write" will only happen once per day (the exact time is unknown).
I've been looking at std::shared_mutex to allow concurrent reads, but I am afraid that even in an uncontended scenario it might slow things down significantly.
Is there a better solution for this?
Perhaps checking a std::atomic<bool> before locking on the mutex?
Thanks in advance.
It might require a bit of extra work to maintain, but you can use a copy-on-write scheme.
Keep the map in a singleton within a shared pointer.
For "read" operations, have users thread-safely copy the shared pointer and use it for as long as they want.
For "write" operations, create a new map instance in a new shared pointer, fill it with whatever you want, and replace the old version with it in the singleton.
This way "read" users will still see the old version and can use it safely. Just make sure they occasionally get the newest version from the singleton. Perhaps, even give them a handle that automatically updates the shared pointer once a second or something.
This works in case you don't need the threads to synchronously update all at once.
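A minimal sketch of that copy-on-write holder, assuming a plain std::unordered_map stands in for the tbb map and that a tiny mutex held only while copying or swapping the pointer is acceptable (C++20's std::atomic<std::shared_ptr<T>> would be an alternative). The names MapHolder, snapshot and publish are made up for illustration.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

using Map = std::unordered_map<int, std::string>;

// Hypothetical holder: readers take cheap snapshots, the writer publishes
// a completely new map instead of modifying the one readers are using.
class MapHolder {
public:
    // Reader side: copy the pointer; the snapshot stays valid for as long
    // as the caller keeps it, even if a newer map is published meanwhile.
    std::shared_ptr<const Map> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return current_;
    }

    // Writer side: `next` was built without any lock; only the pointer
    // swap happens under the mutex.
    void publish(Map next) {
        std::shared_ptr<const Map> fresh = std::make_shared<Map>(std::move(next));
        {
            std::lock_guard<std::mutex> guard(mutex_);
            current_.swap(fresh);
        }
        // `fresh` now refers to the old map; it is released here, outside
        // the lock, once the last reader snapshot is gone.
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const Map> current_ = std::make_shared<Map>();
};
```

Readers that call snapshot() once per batch of lookups pay for one pointer copy rather than one lock per lookup.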
Another scheme: create an atomic boolean to indicate when an update is incoming, and make all threads pause their operations on the map while it is true. Once they have all stopped, perform the update and let them resume their operations.
This is a perfect job for a read/write lock.
In C++ this can be implemented with a shared_mutex, using a unique_lock to lock it for writing and a shared_lock to lock it for reading. See this post for example code.
The end effect is that readers never block each other and reads can all happen at the same time, but while the writer holds the lock, everything else blocks to let the write proceed.
If the writing takes a long time, so long that the once-per-day delay is unacceptable, then you can have the writer create and populate a new copy of the data without taking a lock, then take the write end of the lock and swap the data:
Readers:
Lock mutex with a shared_lock
Do stuff
Unlock
Repeat
Writer:
Create new copy of data
Lock mutex with a unique_lock
Swap data quickly
Unlock
Repeat tomorrow
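The reader and writer steps above could look roughly like this with C++17's std::shared_mutex. This is only a sketch: GuardedData and its members are hypothetical names, and a std::vector<int> stands in for the real data.

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical wrapper around the shared data.
class GuardedData {
public:
    // Reader: a shared lock, so any number of readers can run at once.
    std::size_t count() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return data_.size();
    }

    // Writer: `fresh` is built by the caller without holding any lock;
    // the exclusive lock is held only for the swap.
    void replace(std::vector<int> fresh) {
        {
            std::unique_lock<std::shared_mutex> lock(mutex_);
            data_.swap(fresh);
        }
        // The old contents are destroyed with `fresh`, after the unlock.
    }

private:
    mutable std::shared_mutex mutex_;
    std::vector<int> data_;
};
```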
A shared_lock on a shared_mutex will be fast. You could use a double check locking strategy but I would not do that until you do performance testing and also probably take a look at the source for shared_lock, because I suspect it does something similar already, and a double-check on the read end might just add overhead unnecessarily. Also I don't have enough coffee in me yet to work out double check locking in a read/write lock scenario.
There is a threading construct called a spin lock as well, but it's really just an encapsulation of a double-checked lock that repeats the "check" until it clears. It's a good construct but again you'll want performance analyses and a look at the shared_lock + shared_mutex source, because they might spin already. A good implementation of a spin lock can be found here, it covers some common gotchas. You might have to tweak it to get a read/write spinlock.
Generally speaking though, it's best to use existing constructs during the initial implementation at the very least as a clearly coded proof-of-concept. Then if you know that you're seeing too much read contention, you can optimize from there. But you need the general strategy down first, and 91 times out of a hundred, it's good enough. In this case, no matter what, some manifestation of a read/write lock is what you're going to end up with.

Is there a C++ design pattern that implements a mechanism or mutex that controls the amount of time a thread can own a locked resource?

I am looking for a way to guarantee that any time a thread locks a specific resource, it is forced to release that resource after a specific period of time (if it has not already released it). Envision a connection where you need to limit the amount of time any specific thread can own that connection for.
I envision this is how it could be used:
{
    std::lock_guard<std::TimeLimitedMutex> lock(this->myTimeLimitedMutex, timeout);
    try {
        // perform some operation with the resource that myTimeLimitedMutex guards.
    }
    catch (MutexTimeoutException ex) {
        // perform cleanup
    }
}
I see that there is a timed_mutex that lets the program timeout if a lock cannot be acquired. I need the timeout to occur after the lock is acquired.
There are already some situations where you get a resource that can be taken away unexpectedly. For instance, a TCP socket: once a socket connection is made, code on each side needs to handle the case where the other side drops the connection.
I am looking for a pattern that handles types of resources that normally time out on their own, but when they don't, they need to be reset. This does not have to handle every type of resource.
This can't work, and it will never work. In other words, this can never be made. It goes against every concept of ownership and atomic transactions. When a thread acquires the lock and performs two operations in a row, it expects them to become visible to the outside world atomically. In this scenario, it would be very possible for the transaction to be torn: the first part is performed, but the second is not.
What's worse is that since the lock is forcefully removed, the partly executed transaction becomes visible to the outside world before the interrupted thread has any chance to roll back.
This idea runs contrary to every school of multi-threaded thinking.
I support SergeyA's answer. Releasing a locked mutex after a timeout is a bad idea and cannot work. Mutex stands for mutual exclusion, and this is a rock-hard contract which cannot be violated.
But you can do what you want:
Problem: You want to guarantee that your threads do not hold the mutex longer than a certain time T.
Solution: Never lock the mutex for longer than time T. Instead write your code so that the mutex is locked only for the absolutely necessary operations. It is always possible to give such a time T (modulo the uncertainties and limits imposed by a multitasking and multiuser operating system, of course).
To achieve that (examples):
Never do file I/O inside a locked section.
Never call a system call while a mutex is locked.
Avoid sorting a list while a mutex is locked (*).
Avoid doing a slow operation on each element of a list while a mutex is locked (*).
Avoid memory allocation/deallocation while a mutex is locked (*).
There are exceptions to these rules, but the general guideline is:
Make your code slightly less optimal (e.g. do some redundant copying inside the critical section) to make the critical section as short as possible. This is good multithreading programming.
(*) These are just examples of operations where it is tempting to lock the entire list, do the operations and then unlock the list. Instead it is advisable to just take a local copy of the list and clear the original list while the mutex is locked, ideally by using the swap() operation offered by most STL containers, and then do the slow operation on the local copy outside of the critical section. This is not always possible but always worth considering. Sorting is at least O(n log n), some algorithms are quadratic in the worst case, and it usually needs random access to the entire list. It is useful to sort (a copy of) the list outside of the critical section and later check whether elements need to be added or removed. Memory allocations also have quite some complexity behind them, so massive memory allocations/deallocations should be avoided.
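As a small illustration of the swap() advice, here is a sketch in which the list is moved out in O(1) under the lock and sorted afterwards. WorkList, add and drain_sorted are hypothetical names, and std::vector<int> stands in for whatever the real list holds.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Hypothetical shared list whose slow processing happens outside the lock.
class WorkList {
public:
    void add(int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(value);
    }

    // The critical section only swaps two vectors (constant time);
    // the sort runs on the local copy after the mutex is released.
    std::vector<int> drain_sorted() {
        std::vector<int> local;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            local.swap(items_);
        }
        std::sort(local.begin(), local.end());
        return local;
    }

private:
    std::mutex mutex_;
    std::vector<int> items_;
};
```

The time the mutex is held no longer depends on the length of the list, which is what makes a bound T realistic.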
You can't do that with only C++.
If you are using a Posix system, it can be done.
You'll have to trigger a SIGALRM signal that's only unmasked for the thread that'll time out. In the signal handler, you'll have to set a flag and use longjmp to return to the thread code.
In the thread code, on the setjmp position, you can only be called if the signal was triggered, thus you can throw the Timeout exception.
Please see this answer for how to do that.
Also, on Linux, it seems you can throw directly from the signal handler (so no longjmp/setjmp here).
BTW, if I were you, I would code the opposite. Think about it: You want to tell a thread "hey, you're taking too long, so let's throw away all the (long) work you've done so far so I can make progress".
Ideally, you should have your long-running thread be more cooperative, doing something like "I've done A of an ABCD task, let's release the mutex so others can progress on A. Then let's check if I can take it again to do B, and so on."
You probably want to be more fine-grained (have more mutexes on smaller objects, but make sure you always lock them in the same order) or use RW locks (so that other threads can use the objects if you're not modifying them), etc...
Such an approach cannot be enforced because the holder of the mutex needs the opportunity to clean up anything which is left in an invalid state part way through the transaction. This can take an unknown arbitrary amount of time.
The typical approach is to release the lock when doing long tasks, and re-acquire it as needed. You have to manage this yourself as everyone will have a slightly different approach.
The only situation I know of where this sort of thing is accepted practice is at the kernel level, especially with respect to microcontrollers (which either have no kernel, or are all kernel, depending on who you ask). You can set an interrupt which modifies the call stack, so that when it is triggered it unwinds the particular operations you are interested in.
"Condition" variables can have timeouts. This allows you to wait until a thread voluntarily releases a resource (with notify_one() or notify_all()), but the wait itself will time out after a specified fixed amount of time.
Examples in the Boost documentation for "conditions" might make this more clear.
If you want to force a release, you have to write the code which will force it though. This could be dangerous. The code written in C++ can be doing some pretty close-to-the-metal stuff. The resource could be accessing real hardware and it could be waiting on it to finish something. It may not be physically possible to end whatever the program is stuck on.
However, if it is possible, then you can handle it in the thread in which the wait() times out.
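Here is a minimal sketch of such a timed wait using std::condition_variable rather than the Boost classes mentioned above. The Resource class and its acquire_for/release members are hypothetical; note that the timeout applies only to waiting for the resource, and nothing here forces the current holder to let go.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical resource that can be waited for with a deadline.
class Resource {
public:
    // Waits up to `timeout` for the holder to release the resource.
    // Returns true if the caller now owns it, false if the wait timed out.
    bool acquire_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!released_.wait_for(lock, timeout, [this] { return !busy_; })) {
            return false;  // timed out; the caller decides what to do next
        }
        busy_ = true;
        return true;
    }

    // Voluntary release by the current holder.
    void release() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            busy_ = false;
        }
        released_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    bool busy_ = false;
};
```

When acquire_for() returns false, the waiting thread is the place where any forced reset of the resource would have to be written, with all the caveats above.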

QReadWriteLock recursion

I'm using QReadWriteLock in recursive mode.
This code doesn't by itself make sense, but the issues I have arise from here:
lock->lockForWrite();
lock->lockForRead();
lockForRead is blocked. Note that this is in recursive mode.
The way I see it, Write is a "superior" lock: it allows me to read and write the protected data, whereas a Read lock only allows reading.
Also, I think that a write lock should not be blocked if the only reader is the same thread that is asking for the write lock.
I can see from the qreadwritelock.cpp source code that there is no attempt to make it work the way I would like. So it's not a bug, but a feature I find missing.
My question is this, should this kind of recursion be allowed? Are there any problems that arise from this kind of implementation and what would they be?
From QReadWriteLock docs:
Note that the lock type cannot be changed when trying to lock
recursively, i.e. it is not possible to lock for reading in a thread
that already has locked for writing (and vice versa).
So, like you say, it's just the way it works. I personally can't see how allowing reads on the same thread as write locked item would cause problems, but perhaps it requires an inefficient lock implementation?
You could try asking on the Qt forums, but I doubt you'll get a definitive answer.
Why don't you take the Qt source as a starting point and have a go at implementing it yourself, if it's something you need? Writing synchronisation objects can be tricky, but it's a good learning exercise.
I found this question while searching for the same functionality myself.
While thinking about implementing this on my own, I realized that there definitely is a problem arising doing so:
So you want to upgrade your lock from shared (read) to exclusive (write). Doing
lock->unlock();
lock->lockForWrite();
is not what you want, since you want no other thread to gain the write lock right after the current thread has released the read lock. But if there was a
lock->changeModus(WRITE);
or something like that, you would create a deadlock. To gain a write lock, the lock blocks until all current read locks are released. So if two threads holding read locks both ask for the upgrade, each blocks waiting for the other to release its read lock.

how to synchronize three dependent threads

If I have
1. mainThread: write data A,
2. Thread_1: read A and write it into a Buffer;
3. Thread_2: read from the Buffer.
How can I synchronize these three threads safely, without much performance loss? Is there any existing solution to use? I use C/C++ on Linux.
IMPORTANT: the goal is to know the synchronization mechanism or algorithms for this particular case, not how mutex or semaphore works.
First, I'd consider the possibility of building this as three separate processes, using pipes to connect them. A pipe is (in essence) a small buffer with locking handled automatically by the kernel. If you do end up using threads for this, most of your time/effort will be spent on creating nearly an exact duplicate of the pipes that are already built into the kernel.
Second, if you decide to build this all on your own anyway, I'd give serious consideration to following a similar model anyway. You don't need to be slavish about it, but I'd still think primarily in terms of a data structure to which one thread writes data, and from which another reads the data. By strong preference, all the necessary thread locking necessary would be built into that data structure, so most of the code in the thread is quite simple, reading, processing, and writing data. The main difference from using normal Unix pipes would be that in this case you can maintain the data in a more convenient format, instead of all the reading and writing being in text.
As such, what I think you're looking for is basically a thread-safe queue. With that, nearly everything else involved borders on trivial (at least the threading part of it does -- the processing involved may not be, but at least building it with multiple threads isn't adding much to the complexity).
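A thread-safe queue of that kind could be sketched as follows, with all the locking built into the data structure. BlockingQueue and its member names are assumptions for illustration; a production version would probably also want a capacity limit and a way to signal shutdown.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical queue: one thread pushes, another pops, and the threads
// themselves contain no locking code at all.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();
    }

    // Blocks until an item is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

    // Non-blocking variant: returns false if the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

With two such queues, the main thread pushes A into the first, Thread_1 pops from the first and pushes into the second (the Buffer), and Thread_2 pops from the second.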
It's hard to say how much experience with C/C++ threads you have. I hate to just point to a link but have you read up on pthreads?
https://computing.llnl.gov/tutorials/pthreads/
And for a shorter example with code and simple mutexes (the lock objects you need to sync data):
http://students.cs.byu.edu/~cs460ta/cs460/labs/pthreads.html
I would suggest Boost.Thread for this purpose. It is quite a good framework with mutexes and semaphores, and it is multiplatform. Here you can find a very good tutorial about it.
How exactly to synchronize these threads is another matter and needs more information about your problem.
Edit The simplest solution would be to put two mutexes -- one on A and second on Buffer. You don't have to worry about deadlocks in this particular case. Just:
Enter mutex_A from MainThread; Thread1 waits for mutex to be released.
Leave mutex from MainThread; Thread1 enters mutex_A and mutex_Buffer, starts reading from A and writes it to Buffer.
Thread1 releases both mutexes. MainThread can enter mutex_A and write data, and Thread2 can enter mutex_Buffer and safely read data from Buffer.
This is obviously the simplest solution, and probably can be improved, but without more knowledge about the problem, this is the best I can come up with.

WaitForSingleObject is not locking, Still allowing other threads to change value in C++

I'm trying to use WaitForSingleObject(fork[leftFork], INFINITE); to lock a variable shared by multiple threads, but it doesn't seem to lock anything.
I declare HANDLE fork[5] and then use the code below.
while(forks[rightFork] == 0 || forks[leftFork] == 0) Sleep(0);
WaitForSingleObject(fork[leftFork], INFINITE);
forks[leftFork]--;
WaitForSingleObject(fork[rightFork], INFINITE);
forks[rightFork]--;
I have tried WaitForMultipleObjects as well, with the same result. When I create the mutex I use fork[i] = CreateMutex(NULL, FALSE, NULL);
I was wondering if this is only good for each thread or do they share it?
First of all, you haven't shown enough code for us to be able to help you with any great certainty of correctness. But, having made that proviso, I'm going to try anyway!
Your use of the word fork suggests to me that you are approaching Windows threading from a pthreads background. Windows threads are a little different. Rather confusingly, the most effective in-process mutex object in Windows is not the mutex, it is in fact the critical section.
The interface for the critical section is much simpler to use, it being essentially an acquire function and a corresponding release function. If you are synchronizing within a single process, and you need a simple lock (rather than, say, a semaphore), you should use critical sections rather than mutexes.
In fact, only yesterday here on Stack Overflow, I wrote a more detailed answer to a question which described the standard usage pattern for critical sections. That post has lots of links to the pertinent sections of MSDN documentation.
Having said that, it would appear that all you are trying to do is to synchronize the decrementing of an array of integer values. If that is so then you can do this most simply in a lock-free manner with InterlockedIncrement or one of its friends.
You only need to use a mutex when you are performing cross process synchronization. Indeed you should only use a mutex when you are synchronizing across a process because critical sections perform so much better (i.e. faster). Since you are updating a simple array here, and since there is no obviously visible IPC going on, I can only conclude that this really is in-process.
If I'm wrong and you really are doing cross-process work and require a mutex then we would need to see more code. For example I don't see any calls to ReleaseMutex. I don't know exactly how you are creating your mutexes.
If this doesn't help, please edit your question to include more code, and also a high level overview of what you are trying to achieve.
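For reference, here is a portable sketch of the in-process pattern using std::mutex and C++17's std::scoped_lock instead of the Win32 calls discussed above, so it is an analog, not the Windows API. The names fork_locks, take_forks and put_forks are hypothetical; the point is that every acquire is paired with a release and that the availability check happens while the locks are held, not before.

```cpp
#include <array>
#include <mutex>

constexpr int kForkCount = 5;

// One lock per fork, shared by all threads in the process.
std::array<std::mutex, kForkCount> fork_locks;
std::array<int, kForkCount> forks = {1, 1, 1, 1, 1};

// Tries to take both forks. `left` and `right` must be different indices.
// std::scoped_lock acquires both mutexes without deadlocking and releases
// them automatically when the function returns.
bool take_forks(int left, int right) {
    std::scoped_lock guard(fork_locks[left], fork_locks[right]);
    if (forks[left] == 0 || forks[right] == 0) {
        return false;  // at least one fork is in use; nothing was changed
    }
    --forks[left];
    --forks[right];
    return true;
}

// Puts both forks back under the same locks.
void put_forks(int left, int right) {
    std::scoped_lock guard(fork_locks[left], fork_locks[right]);
    ++forks[left];
    ++forks[right];
}
```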