Why would I use a unique_lock<> wrapper?
I sometimes see code like this
std::unique_lock<std::mutex> lock(m_active_sessions_guard); // lock() the associated mutex
m_active_sessions[request_id] = session;
lock.unlock();
where a unique_lock<> is created just to lock the associated mutex.
Having searched, I've found that this class is not copyable. Is this the only benefit of using it?
unique_lock utilizes RAII to guarantee exception-safe code; notice that C++ does not have a finally statement.
If an exception is thrown, the mutex will still be released correctly.
Consider this code:
int func()
{
m_active_sessions_guard.lock();
... some code ...
if (x > y)
{
return -1;
}
... some more code ...
m_active_sessions_guard.unlock();
return 1;
}
We "forgot" the unlock in the early return when x > y is true. This could deadlock our program, or (worse!) cause the program to run slowly/misbehave in some other way.
By using a type that automatically unlocks the lock when the destructor is called, you are guaranteed that you don't "forget" to unlock the lock, EVER. I have certainly spent MANY hours looking for such problems, and I wouldn't wish it upon anyone - especially the ones where triggering the situation where it locks up or runs slow is only happening once in a while, so to even "catch" the failure you have to be lucky (maybe x > y only happens on Thursdays, in months without "r" and when the date is divisible by both 7 and 3. So if you are unlucky enough to get the bug report at the end of April, you'll be debugging for a while... :)
The basis is RAII ("Resource Acquisition Is Initialization"), and it's the same logic behind using std::vector rather than raw pointers with manual new/delete. Any time you can make the compiler do the job for you, you get the great benefit of "not having to remember". Computer programs, such as compilers, are very good at "remembering" things.
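Here is a minimal sketch of the earlier func() rewritten with std::unique_lock; the globals x and y and the simplified body are placeholders for illustration:

```cpp
#include <mutex>

std::mutex m_active_sessions_guard;
int x = 0, y = 1; // placeholders standing in for the real state

// Same control flow as the buggy func() above, but every return path
// unlocks the mutex automatically via the unique_lock destructor.
int func()
{
    std::unique_lock<std::mutex> lock(m_active_sessions_guard);
    if (x > y)
    {
        return -1; // destructor unlocks here
    }
    return 1;      // ...and here, and on any exception in between
}
```

Whether the function leaves via the early return, the final return, or an exception, the destructor runs and the mutex is released.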
Related
I have a loop like this:
for (auto &i : elements)
{
std::lock_guard<std::mutex> lock(i.mutex);
some heavy writing to disk with i
}
Throwing an error:
tpp.c:62: __pthread_tpp_change_priority: Assertion `new_prio == -1 || (new_prio >= __sched_fifo_min_prio && new_prio <= __sched_fifo_max_prio)' failed.
Can someone explain to me why this loop does not throw the error:
for (auto &i : elements)
{
i.mutex.lock();
some heavy writing to disk with i
i.mutex.unlock();
}
My knowledge of C++ multithreading is more from a practical viewpoint, so both loops always seemed equivalent to me. However, the first one is causing me many more problems than the second. It's also not always the same error: I get some invalid pointers with loop #1 as well, while the second loop has not yet crashed in many runs.
Any guess, what may cause the problem without knowing the rest of the code?
The tpp.c: __pthread_tpp_change_priority: Assertion failure is a known problem, and one that has been solved before:
In brief, the problem is caused by repeatedly locking a fast (non-recursive) mutex, and is solved by using a recursive mutex - the default pthread_mutex_t is not recursive. Is it possible that there's a pthread_mutex_t buried deep inside the code your thread runs?
BTW, to make the mutex recursive, set the mutex type attribute to PTHREAD_MUTEX_RECURSIVE_NP.
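A minimal sketch of setting that attribute, using the portable PTHREAD_MUTEX_RECURSIVE name (the _NP suffix is the older Linux-specific spelling of the same thing):

```cpp
#include <pthread.h>

pthread_mutex_t m; // initialized by init_recursive_mutex() before use

// Initialize m as a recursive mutex: the same thread may lock it
// repeatedly, as long as it unlocks the same number of times.
void init_recursive_mutex()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr); // attr can be destroyed after init
}
```

With a default (non-recursive) mutex, the second lock from the same thread would deadlock or trip the assertion above; with the recursive type it simply increments a lock count.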
Okay, I guess I found my answer. Thanks everyone for sorting things out. molbdnilo was closest to the problem: there was indeed undefined behaviour elsewhere in my code. Running Valgrind showed me invalid read and write operations on the lock_guard. My mutex is stored in a shared pointer, which is reset by another thread without being locked beforehand. Consequently the mutex is destroyed while the other thread still holds a lock_guard on it. I guess that's a problem?
In loop #2 the actual calls are i->mutex.lock() and i->mutex.unlock(). No error was thrown there, likely because the first lock sometimes just happened on an object that was released soon after. The unlock was then called on an already-unlocked mutex, which is undefined behaviour as well, but didn't cause any visible error so far.
I have a socket shared between 4 threads and I wanted to use the RAII principle for acquiring and releasing the mutex.
The ground realities
I am using the pthread library.
I cannot use Boost.
I cannot use anything newer than C++03.
I cannot use exceptions.
The Background
Instead of having to lock the mutex for the socket every time before using it, and then unlocking the mutex right afterwards, I thought I could write a scoped_lock() which would lock the mutex, and once it goes out of scope, it would automatically unlock the mutex.
So, quite simply I do a lock in the constructor and an unlock in the destructor, as shown here.
ScopedLock::ScopedLock(pthread_mutex_t& mutex, int& errorCode)
: m_Mutex(mutex)
{
errorCode = m_lock();
}
ScopedLock::~ScopedLock()
{
m_unlock(); // the return code cannot be handed back to the caller from here
}
where m_lock() and m_unlock() are quite simply two wrapper functions around the pthread_mutex_lock() and the pthread_mutex_unlock() functions respectively, with some additional tracelines/logging.
In this way, I would not have to write at least two unlock statements: one for the good path and one for the bad path (at least one bad path; there could be more in some situations).
The Problem
The problem that I have bumped into and the thing that I don't like about this scheme is the destructor.
I have diligently done the error-handling for every function, but from the destructor of this ScopedLock, I cannot inform the caller about any errors that might be returned by m_unlock().
This is a fundamental problem with RAII, but in this case you're in luck. pthread_mutex_unlock only fails if you set up the mutex wrong (EINVAL) or if you attempt to unlock an error-checking mutex from a thread that doesn't own it (EPERM). These errors indicate bugs in your code, not runtime errors that you should be handling. Asserting that the return code is 0 is a reasonable strategy in this case.
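A minimal sketch of that assert strategy, assuming a ScopedLock shaped like the one above (pthread_mutex_lock/unlock are called directly here instead of the m_lock()/m_unlock() wrappers):

```cpp
#include <pthread.h>
#include <assert.h>

// C++03-compatible, exception-free scoped lock. Unlock failures are
// programming bugs (EINVAL/EPERM), so the destructor asserts on them.
class ScopedLock
{
public:
    ScopedLock(pthread_mutex_t& mutex, int& errorCode)
        : m_Mutex(mutex)
    {
        errorCode = pthread_mutex_lock(&m_Mutex); // lock error reported to caller
    }
    ~ScopedLock()
    {
        int rc = pthread_mutex_unlock(&m_Mutex);
        assert(rc == 0); // a failure here means a bug, not a runtime condition
        (void)rc;        // silence unused-variable warning in NDEBUG builds
    }
private:
    ScopedLock(const ScopedLock&);            // non-copyable (C++03 idiom)
    ScopedLock& operator=(const ScopedLock&);
    pthread_mutex_t& m_Mutex;
};
```

The constructor can still report lock errors through the out-parameter, matching the original design; only the destructor-side error is turned into an assertion.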
Here is the situation:
boost::shared_mutex rwlock;
void test()
{
boost::unique_lock < boost::shared_mutex > writelock(rwlock);
// here we have deadlock
}
int main()
{
boost::shared_lock < boost::shared_mutex > readlock(rwlock);
test();
}
I know that we can do something like that:
{
boost::upgrade_lock<boost::shared_mutex> upgradeable_lock(rwlock); // here we obtain readlock
{
boost::upgrade_to_unique_lock<boost::shared_mutex> uniqueLock(upgradeable_lock); // right now we upgrade readlock to writelock
}
}
but if, like in my first example, we are in another scope, we can't see upgradeable_lock. How do I solve that issue?
I assume that the real code is a lot more complicated, with read locks acquired all over the place, multiple times in the call stack, and then you suddenly need to write somewhere this was never anticipated.
I am guessing a bit here, but if that is true and you don't want to change it, you'll have to walk up the call path from your writing function, release the shared_lock before the relevant call, and acquire it again afterwards.
Read/write locks are great, but they tend to mislead developers into acquiring read locks far too liberally.
As soon as you can refactor, try and reduce the read locks to just those places, where you really have to read. Keep the critical section as short as possible, and avoid function calls inside, that might acquire that lock as well.
When you have done that, a change to a function, that has to write now as well, will not be a big issue any more. BTW, this will also improve the concurrency, because a writer will have more chances to find a moment, where no reader holds a read lock. You might prefer to do that refactoring now, because it will make life a lot easier afterwards.
Another guess: In case you use these read-locks to have a stable state of the data during a longer process, you might want to rethink that choice now. What you really want then, is some kind of software transactional memory.
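The release-before-write pattern described above can be sketched like this; C++17's std::shared_mutex is used here as a stand-in for boost::shared_mutex, which behaves the same way for this purpose:

```cpp
#include <mutex>
#include <shared_mutex>

std::shared_mutex rwlock;
int shared_value = 0; // placeholder for the protected data

// The writing function, analogous to test() in the question.
void write_path()
{
    std::unique_lock<std::shared_mutex> writelock(rwlock);
    ++shared_value;
}

// A reader that must call into the writing path: release the read
// lock first, so the writer does not deadlock against our own lock.
void read_then_write()
{
    std::shared_lock<std::shared_mutex> readlock(rwlock);
    // ... reading under the shared lock ...
    readlock.unlock();  // release BEFORE the writing call
    write_path();       // safe: we no longer hold any lock on rwlock
    readlock.lock();    // reacquire if more reading follows
    // ... more reading ...
} // shared_lock destructor unlocks
```

Note that anything observed under the read lock must be re-validated after reacquiring it: other writers may have run in the gap between unlock() and lock().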
I am surprised to see from pstack that this code leads to deadlock! I don't see a reason for the same.
pthread_mutex_t lock;
_Cilk_for (int i = 0; i < N; ++i) {
int ai = A[i];
if (ai < pivot) {
pthread_mutex_lock(&lock);
A[ia++] = ai;
pthread_mutex_unlock(&lock);
}
else if (ai > pivot) {
pthread_mutex_lock(&lock);
A[ib++] = ai;
pthread_mutex_unlock(&lock);
}
else {
pthread_mutex_lock(&lock);
A[ic++] = ai;
pthread_mutex_unlock(&lock);
}
}
I am just using mutexes to make sure that access to A is atomic and serialized.
What is wrong with this code to lead to deadlock?
Is there a better way to implement this?
If that's code inside a function, then you're not initialising the mutex correctly. You need to set it to PTHREAD_MUTEX_INITIALIZER (for a simple, default mutex) or do a pthread_mutex_init() on it (for more complex requirements). Without proper initialisation, you don't know what state the mutex starts in - it may well be in a locked state simply because whatever happened to be on the stack at that position looked like a locked mutex.
That's why it always needs to be initialised somehow, so that there is no doubt of the initial state.
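Both initialization options can be sketched as follows (the locked_increment function is purely illustrative):

```cpp
#include <pthread.h>

// Static/global mutex: the initializer macro is enough.
pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

// Local (or dynamically allocated) mutex: initialize explicitly
// before use, and destroy it when done.
int locked_increment(int* counter)
{
    pthread_mutex_t lock;
    if (pthread_mutex_init(&lock, NULL) != 0)
        return -1;                  // initialization failed
    pthread_mutex_lock(&lock);
    ++*counter;                     // the protected operation
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return 0;
}
```

Without one of these two forms, a stack-allocated pthread_mutex_t holds whatever bytes happened to be there, which can look like a locked mutex.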
Another potential problem you may have is this:
int ai = A[i];
You probably should protect that access with the same mutex since otherwise you may read it in a "half-state" (when another thread is only part way through updating the variable).
And, I have to say, I'm not sure that threads are being used wisely here. The use of mutexes is likely to swamp a statement like A[ia++] = ai to the point where the vast majority of time will be spent locking and unlocking the mutex. They're generally more useful where the code being processed during the lock is a little more substantial.
You may find a non-threaded variant will blow this one out of the water (but, of course, don't take my word for it - my primary optimisation mantra is "measure, don't guess").
Your pthread_mutex_t lock is not properly initialized, so, since it is a local variable, it may contain garbage and might be in a strangely locked state. You should call pthread_mutex_init or initialize your lock with PTHREAD_MUTEX_INITIALIZER.
As others complained, you are not wisely using mutexes. The critical sections of your code are much too small.
AFTER you fix or otherwise verify that you are in fact initializing your lock:
pstack may be privy to control mechanisms introduced by _Cilk_for that are interfering with what would otherwise be reasonable pthread code.
A quick search shows there are mutex solutions for use with Cilk - intermixing Cilk and pthreads isn't mentioned. It does look like Cilk is a layer on top of pthreads - so if Cilk chose to put a wrapper around mutex, they likely did so for a good reason. I'd suggest staying with the Cilk API.
That aside, there's a more fundamental issue with your algorithm. In your case, the overhead for creating parallel threads and synchronizing them likely dwarfs the cost of executing the code in the body of the for-loop. It's very possible this will run faster without parallelizing it.
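For comparison, here is a hypothetical lock-free serial version of the same three-way partition; for loop bodies this small, it may well outperform the parallelized, mutex-heavy version:

```cpp
#include <vector>
#include <cstddef>

// Serial three-way partition: elements < pivot, == pivot, > pivot.
// No mutex is needed because a single thread owns all the counters.
std::vector<int> partition3(const std::vector<int>& A, int pivot)
{
    std::vector<int> lo, eq, hi;
    for (std::size_t i = 0; i < A.size(); ++i) {
        if (A[i] < pivot)      lo.push_back(A[i]);
        else if (A[i] > pivot) hi.push_back(A[i]);
        else                   eq.push_back(A[i]);
    }
    std::vector<int> out;
    out.reserve(A.size());
    out.insert(out.end(), lo.begin(), lo.end());
    out.insert(out.end(), eq.begin(), eq.end());
    out.insert(out.end(), hi.begin(), hi.end());
    return out;
}
```

Measure both, of course; but a single pass with no synchronization is a useful baseline before reaching for threads.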
[Edit: (copied from a comment) As it turns out, the problem was elsewhere, but thank you all for your input.]
I have a shared container class which uses a single mutex to lock the push() and pop() functions, since I don't want to simultaneously modify the head and tail. Here's the code:
int Queue::push( WorkUnit unit )
{
pthread_mutex_lock( &_writeMutex );
int errorCode = 0;
try
{
_queue.push_back( unit );
}
catch( std::bad_alloc )
{
errorCode = 1;
}
pthread_mutex_unlock( &_writeMutex );
return errorCode;
}
When I run this in debug mode, everything is peachy. When I run in release mode, I get crashes at roughly the time when the driver program starts pushing and popping "simultaneously". Does the try/catch block immediately force an exit if it catches a std::bad_alloc exception? If so, should I enclose the remainder of the function in a finally block?
Also, is it possible that the slower debug mode only succeeds because my push() and pop() calls never actually occur at the same time?
In C++ we use Resource Acquisition Is Initialization (RAII) for guarding against exceptions.
Is this actually bombing after an exception? Far more likely from your snippet is that you just have bad synchronization in place. That starts with the name of your mutex: "writeMutex". This is not going to work if there is also a "readMutex". All reading, peeking and writing operations need to be locked by the same mutex.
Does the try/catch block immediately
force an exit if it catches a
std::bad_alloc exception?
No. If a std::bad_alloc is thrown inside the try {...} block, the code in the catch {...} block will fire.
If your program is actually crashing, then it seems like either your push_back call is throwing some exception other than bad_alloc (which isn't handled in your code), or the bad_alloc is being thrown somewhere outside the try {...} block.
By the way, are you sure you really want to use a try...catch block here?
Plus:
what does the pop look like?
create a lock wrapper class that will automatically free the lock when it goes out of scope (as in the RAII comment)
C++ does not have finally (thanks to Mr Stroustrup being stroppy)
I would catch std::exception or none at all (ducks head down for flame war). If you catch none, then you need the wrapper
Regarding release/debug: yes, you will often find race conditions change between the two types of builds. When you deal with synchronization, your threads will run with different timing. Well-written threading will mostly run concurrently, while in poorly written threading the threads will run in a highly synchronous manner relative to each other. All types of synchronization yield some level of synchronous behavior. It is as if synchronous and synchronization come from the same root word...
So yes, given the slightly different run-time performance between debug and release, the points where the threads synchronize can sometimes cause bad code to manifest in one type of build and not the other.
You need to use RAII
This basically means using the constructor/destructor to lock/unlock the resource.
This guarantees that the mutex will always be unlocked, even when exceptions are around.
You should only be using one mutex for access to the list.
Even if you have a read-only mutex that is used by a thread that only reads, that does not mean it is safe to read while another thread is updating the queue. The queue could be in some intermediate state caused by a thread calling push(), while another thread trying to navigate it sees that invalid intermediate state.
class Locker
{
public:
Locker(pthread_mutex_t &lock)
:m_mutex(lock)
{
pthread_mutex_lock(&m_mutex);
}
~Locker()
{
pthread_mutex_unlock(&m_mutex);
}
private:
pthread_mutex_t& m_mutex;
};
int Queue::push( WorkUnit unit )
{
// building the object lock calls the constructor thus locking the mutex.
Locker lock(_writeMutex);
int errorCode = 0;
try
{
_queue.push_back( unit );
}
catch( std::bad_alloc ) // Other exceptions may happen here.
{ // You catch one that you handle locally via error codes.
errorCode = 1; // That is fine. But there are other exceptions to think about.
}
return errorCode;
} // lock destructor called here. Thus unlocking the mutex.
PS. I hate the use of a leading underscore.
Though technically it is OK here (assuming member variables), it is so easy to mess up that I prefer not to prepend '_' to identifiers. See What are the rules about using an underscore in a C++ identifier? for a whole list of rules about '_' in identifier names.
The previous code sample with the Locker class has a major problem:
What do you do when and if pthread_mutex_lock() fails?
The answer is that you must throw an exception at this point, from the constructor, and it can be caught.
Fine.
However,
According to c++ exception specs throwing an exception from a destructor is a no-no.
HOW DO YOU HANDLE pthread_mutex_unlock FAILURES?
Running code under any instrumentation software serves no purpose whatsoever.
You have to write code that works, not run it under valgrind.
In C it works perfectly fine:
pthread_cleanup_pop( 0 );
r = pthread_mutex_unlock( &mutex );
if ( r != 0 )
{
/* Explicit error handling at point of occurance goes here. */
}
But because c++ is a software abortion, there is just no reasonable way to deal with threaded code failures with any degree of certainty. Brain-dead ideas like wrapping pthread_mutex_t into a class that adds some sort of state variable are just that - brain-dead. The following code just does not work:
Locker::~Locker()
{
if ( pthread_mutex_unlock( &mutex ) != 0 )
{
failed = true; // Nonsense.
}
}
And the reason for that is that after pthread_mutex_unlock() returns, this thread may very well be sliced out of the cpu - preempted. That means the .failed public variable will still be false. Other threads looking at it will get the wrong information - the state variable says no failures, meanwhile pthread_mutex_unlock() failed. Even if, by some stroke of luck, these two statements run in one go, this thread may be preempted before ~Locker() returns, and other threads may modify the value of .failed. Bottom line, these schemes do not work - there is no atomic test-and-set mechanism for application-defined variables.
Some say, destructors should never have code that fails. Anything else is a bad design. Ok, fine. I am just curious to see what IS a good design to be 100% exception and thread safe in c++.