pthreads: thread starvation caused by quick re-locking

pthreads: thread starvation caused by quick re-locking - c++

I have a two threads, one which works in a tight loop, and the other which occasionally needs to perform a synchronization with the first:
// thread 1
while(1)
{
lock(work);
// perform work
unlock(work);
}
// thread 2
while(1)
{
// unrelated work that takes a while
lock(work);
// synchronizing step
unlock(work);
}
My intention is that thread 2 can, by taking the lock, effectively pause thread 1 and perform the necessary synchronization. Thread 1 can also offer to pause, by unlocking, and if thread 2 is not waiting on lock, re-lock and return to work.
The problem I have encountered is that mutexes are not fair, so thread 1 quickly re-locks the mutex and starves thread 2. I have attempted to use pthread_yield, and so far it seems to run okay, but I am not sure it will work for all systems / number of cores. Is there a way to guarantee that thread 1 will always yield to thread 2, even on multi-core systems?
What is the most effective way of handling this synchronization process?

You can build a FIFO "ticket lock" on top of pthreads mutexes, along these lines:
#include <pthread.h>
typedef struct ticket_lock {
pthread_cond_t cond;
pthread_mutex_t mutex;
unsigned long queue_head, queue_tail;
} ticket_lock_t;
#define TICKET_LOCK_INITIALIZER { PTHREAD_COND_INITIALIZER, PTHREAD_MUTEX_INITIALIZER }
void ticket_lock(ticket_lock_t *ticket)
{
unsigned long queue_me;
pthread_mutex_lock(&ticket->mutex);
queue_me = ticket->queue_tail++;
while (queue_me != ticket->queue_head)
{
pthread_cond_wait(&ticket->cond, &ticket->mutex);
}
pthread_mutex_unlock(&ticket->mutex);
}
void ticket_unlock(ticket_lock_t *ticket)
{
pthread_mutex_lock(&ticket->mutex);
ticket->queue_head++;
pthread_cond_broadcast(&ticket->cond);
pthread_mutex_unlock(&ticket->mutex);
}
Under this kind of scheme, no low-level pthreads mutex is held while a thread is within the ticketlock protected critical section, allowing other threads to join the queue.

In your case it is better to use condition variable to notify second thread when it is required to awake and perform all required operations.

pthread offers a notion of thread priority in its API. When two threads are competing over a mutex, the scheduling policy determines which one will get it. The function pthread_attr_setschedpolicy lets you set that, and pthread_attr_getschedpolicy permits retrieving the information.
Now the bad news:
When only two threads are locking / unlocking a mutex, I can’t see any sort of competition, the first who runs the atomic instruction takes it, the other blocks. I am not sure whether this attribute applies here.
The function can take different parameters (SCHED_FIFO, SCHED_RR, SCHED_OTHER and SCHED_SPORADIC), but in this question, it has been answered that only SCHED_OTHER was supported on linux)
So I would give it a shot if I were you, but not expect too much. pthread_yield seems more promising to me. More information available here.

Ticket lock above looks like the best. However, to insure your pthread_yield works, you could have a bool waiting, which is set and reset by thread2. thread1 yields as long as bool waiting is set.

Here's a simple solution which will work for your case (two threads). If you're using std::mutex then this class is a drop-in replacement. Change your mutex to this type and you are guaranteed that if one thread holds the lock and the other is waiting on it, once the first thread unlocks, the second thread will grab the lock before the first thread can lock it again.
If more than two threads happen to use the mutex simultaneously it will still function but there are no guarantees on fairness.
If you're using plain pthread_mutex_t you can easily change your locking code according to this example (unlock remains unchanged).
#include <mutex>
// Behaves the same as std::mutex but guarantees fairness as long as
// up to two threads are using (holding/waiting on) it.
// When one thread unlocks the mutex while another is waiting on it,
// the other is guaranteed to run before the first thread can lock it again.
class FairDualMutex : public std::mutex {
public:
void lock() {
_fairness_mutex.lock();
std::mutex::lock();
_fairness_mutex.unlock();
}
private:
std::mutex _fairness_mutex;
};

Related

Understanding condition_variable::wait for blocking a thread

While implementing a thread pool pattern in C++ based on this, I came across a few questions.
Let's assume minimal code sample:
std::mutex thread_mutex;
std::condition_variable thread_condition;
void thread_func() {
std::unique_lock<std::mutex> lock(thread_mutex);
thread_condition.wait(lock);
lock.unlock();
}
std::thread t1 = std::thread(thread_func);
Regarding cppreference.com about conditon_variable::wait(), wait() causes the current thread to block. What is locking the mutex then for when I only need one thread at all using wait() to get notified when something is to do?
unique_lock will block the thread when the mutex already has been locked by another thread. But this wouldn't be neccesary as long as wait() blocks anyway or what do I miss here?
Adding a few lines at the bottom...
std::thread t2 = std::thread(thread_func);
thread_condition.notify_all()
When unique_lock is blocking the thread, how will notify_all() reach both threads when one of them is locked by unique_lock and the other is blocked by wait()? I understand that blocking wait() will be freed by notify_all() which afterwards leads to unlocking the mutex and that this gives chance to the other thread for locking first the mutex and blocking thread by wait() afterwards. But how is this thread notified than?
Expanding this question by adding a loop in thread_func()...
std::mutex thread_mutex;
std::condition_variable thread_condition;
void thread_func() {
while(true) {
std::unique_lock<std::mutex> lock(thread_mutex);
thread_condition.wait(lock);
lock.unlock();
}
}
std::thread t1 = std::thread(thread_func);
std::thread t2 = std::thread(thread_func);
thread_condition.notify_all()
While reading documentation, I would now expect both threads running endlessly. But they do not return from wait() lock. Why do I have to use a predicate for expected behaviour like this:
bool wakeup = false;
//[...]
thread_condition.wait(lock, [] { return wakeup; });
//[...]
wakeup = !wakeup;
thread_condition.notify_all();
Thanks in advance.

This is really close to being a duplicate, but it's actually that question that answers this one; we also have an answer that more or less answers this question, but the question is distinct. I think that an independent answer is needed, even though it's little more than a (long) definition.
What is a condition variable?
The operational definition is that it's a means for a thread to block until a message arrives from another thread. A mutex alone can't possibly do this: if all other threads are busy with unrelated work, a mutex can't block a thread at all. A semaphore can block a lone thread, but it's tightly bound to the notion of a count, which isn't always appropriate to the nature of the message to receive.
This "channel" can be implemented in several ways. Very low-tech is to use a pipe, but that involves expensive system calls. Windows provides the Event object which is fundamentally a boolean on whose truth a thread may wait. (C++20 provides a similar feature with atomic_flag::wait.)
Condition variables take a different approach: their structural definition is that they are stateless, but have a special connection to a corresponding mutex type. The latter is necessitated by the former: without state, it is impossible to store a message, so arrangements must be made to prevent sending a message during some interval between a thread recognizing the need to wait (by examining some other state: perhaps that the queue from which it wants to pop is empty) and it actually being blocked. Of course, after the thread is blocked it cannot take any action to allow the message to be sent, so the condition variable must do so.
This is implemented by having the thread take a mutex before checking the condition and having wait release that mutex only after the thread can receive the message. (In some implementations, the mutex is also used to protect the workings of the condition variable, but C++ does not do so.) When the message is received, the mutex is re-acquired (which may block the thread again for a time), as is necessary to consult the external state again. wait thus acts like an everted std::unique_lock: the mutex is unlocked during wait and locked again afterwards, with possibly arbitary changes having been made by other threads in the meantime.
Answers
Given this understanding, the individual answers here are trivial:
Locking the mutex allows the waiting thread to safely decide to wait, given that there must be some other thread affecting the state in question.
If the std::unique_lock blocks, some other thread is currently updating the state, which might actually obviate the need for wait.
Any number of threads can be in wait, since each unlocks the mutex when it calls it.
Waiting on a condition variable, er, unconditionally is always wrong: the state you're after might already apply, with no further messages coming.

Implement a high performance mutex similar to Qt's one

I have a multi-thread scientific application where several computing threads (one per core) have to store their results in a common buffer. This requires a mutex mechanism.
Working threads spend only a small fraction of their time writing to the buffer, so the mutex is unlocked most of the time, and locks have a high probability to succeed immediately without waiting for another thread to unlock.
Currently, I have used Qt's QMutex for the task, and it works well : the mutex has a negligible overhead.
However, I have to port it to c++11/STL only. When using std::mutex, the performance drops by 66% and the threads spend most of their time locking the mutex.
After another question, I figured that Qt uses a fast locking mechanism based on a simple atomic flag, optimized for cases where the mutex is not already locked. And falls back to a system mutex when concurrent locking occurs.
I would like to implement this in STL. Is there a simple way based on std::atomic and std::mutex ? I have digged in Qt's code but it seems overly complicated for my use (I do not need locks timeouts, pimpl, small footprint etc...).
Edit : I have tried a spinlock, but this does not work well because :
Periodically (every few seconds), another thread locks the mutexes and flushes the buffer. This takes some time, so all worker threads get blocked at this time. The spinlocks make the scheduling busy, causing the flush to be 10-100x slower than with a proper mutex. This is not acceptable
Edit : I have tried this, but it's not working (locks all threads)
class Mutex
{
public:
Mutex() : lockCounter(0) { }
void lock()
{
if(lockCounter.fetch_add(1, std::memory_order_acquire)>0)
{
std::unique_lock<std::mutex> lock(internalMutex);
cv.wait(lock);
}
}
void unlock();
{
if(lockCounter.fetch_sub(1, std::memory_order_release)>1)
{
cv.notify_one();
}
}
private:
std::atomic<int> lockCounter;
std::mutex internalMutex;
std::condition_variable cv;
};
Thanks!
Edit : Final solution
MikeMB's fast mutex was working pretty well.
As a final solution, I did:
Use a simple spinlock with a try_lock
When a thread fails to try_lock, instead of waiting, they fill a queue (which is not shared with other threads) and continue
When a thread gets a lock, it updates the buffer with the current result, but also with the results stored in the queue (it processes its queue)
The buffer flushing was made much more efficiently : the blocking part only swaps two pointers.

General Advice
As was mentioned in some comments, I'd first have a look, whether you can restructure your program design to make the mutex implementation less critical for your performance .
Also, as multithreading support in standard c++ is pretty new and somewhat infantile, you sometimes just have to fall back on platform specific mechanisms, like e.g. a futex on linux systems or critical sections on windows or non-standard libraries like Qt.
That being said, I could think of two implementation approaches that might potentially speed up your program:
Spinlock
If access collisions happen very rarely, and the mutex is only hold for short periods of time (two things one should strive to achieve anyway of course), it might be most efficient to just use a spinlock, as it doesn't require any system calls at all and it's simple to implement (taken from cppreference):
class SpinLock {
std::atomic_flag locked ;
public:
void lock() {
while (locked.test_and_set(std::memory_order_acquire)) {
std::this_thread::yield(); //<- this is not in the source but might improve performance.
}
}
void unlock() {
locked.clear(std::memory_order_release);
}
};
The drawback of course is that waiting threads don't stay asleep and steal processing time.
Checked Locking
This is essentially the idea you demonstrated: You first make a fast check, whether locking is actually needed based on an atomic swap operation and use a heavy std::mutex only if it is unavoidable.
struct FastMux {
//Status of the fast mutex
std::atomic<bool> locked;
//helper mutex and vc on which threads can wait in case of collision
std::mutex mux;
std::condition_variable cv;
//the maximum number of threads that might be waiting on the cv (conservative estimation)
std::atomic<int> cntr;
FastMux():locked(false), cntr(0){}
void lock() {
if (locked.exchange(true)) {
cntr++;
{
std::unique_lock<std::mutex> ul(mux);
cv.wait(ul, [&]{return !locked.exchange(true); });
}
cntr--;
}
}
void unlock() {
locked = false;
if (cntr > 0){
std::lock_guard<std::mutex> ul(mux);
cv.notify_one();
}
}
};
Note that the std::mutex is not locked in between lock() and unlock() but it is only used for handling the condition variable. This results in more calls to lock / unlock if there is high congestion on the mutex.
The problem with your implementation is, that cv.notify_one(); can potentially be called between if(lockCounter.fetch_add(1, std::memory_order_acquire)>0) and cv.wait(lock); so your thread might never wake up.
I didn't do any performance comparisons against a fixed version of your proposed implementation though so you just have to see what works best for you.

Not really an answer per definition, but depending on the specific task, a lock-free queue might help getting rid of the mutex at all. This would help the design, if you have multiple producers and a single consumer (or even multiple consumers). Links:
Though not directly C++/STL, Boost.Lockfree provides such a queue.
Another option is the lock-free queue implementation in "C++ Concurrency in Action" by Anthony Williams.
A Fast Lock-Free Queue for C++
Update wrt to comments:
Queue size / overflow:
Queue overflowing can be avoided by i) making the queue large enough or ii) by making the producer thread wait with pushing data once the queue is full.
Another option would be to use multiple consumers and multiple queues and implement a parallel reduction but this depends on how the data is treated.
Consumer thread:
The queue could use std::condition_variable and make the consumer thread wait until there is data.
Another option would be to use a timer for checking in regular intervals (polling) for the queue being non-empty, once it is non-empty the thread can continuously fetch data and the go back into wait-mode.

C++ Mutexes- Check if another thread is waiting

Is it possible for a thread that already has a lock on a mutex to check whether another thread is already waiting, without releasing the mutex? For example, say a thread has 3 tasks to run on a block of data, but another thread may have a short task to run on the data as well. Ideally, I'd have the first thread check whether another thread is waiting between each of the three tasks and allow the other thread to execute its task before resuming the other two tasks. Does Boost have a type of mutex that supports this functionality (assuming the C++11 mutex or lock types don't support this), or can this be done with conditional variables?

You cannot check whether other threads are waiting on a mutex.
If you want to give other threads their chance to run, just release the mutex. No need to know if someone is waiting. Then re-acquire as necessary.
Conditional variables are events. Use them if you want to wait until something happens. To check whether something has happened you need a regular (mutex-protected or atomic) variable.

you can not check if other threads are waiting on a mutex.
if you do need such functionality, you need to implement your own.
a mutex with use_count will suffice your need.
class my_mutex{
public:
my_mutex() {count=0;}
void lock() {count++; mtx.lock();}
void unlock() {count--; mtx.unlock();}
size_t get_waiting_threads() {return count>1?count-1:0;}
private:
atomic_ulong count;
mutex mtx;
};
if you need to finish task 1 and 2 before task 3 is executed, you should use conditional_variable instead of mutex.

If two threads are waiting on a lock, the thread that started waiting first is not guaranteed to be the thread that gets it first when it becomes available. So if you're a tight-looping thread who does something like
while(true):
mutex.lock()
print "got the lock; releasing it"
mutex.unlock()
You might think you're being polite to all the other threads waiting for the lock, but you're not, the system might just give you the lock over and over again without letting any of the other threads jump in, no matter how long they've been waiting.
Condition variables are a reasonable way to solve this problem.

Mutex example / tutorial? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
I was trying to understand how mutexes work. Did a lot of Googling but it still left some doubts of how it works because I created my own program in which locking didn't work.
One absolutely non-intuitive syntax of the mutex is pthread_mutex_lock( &mutex1 );, where it looks like the mutex is being locked, when what I really want to lock is some other variable. Does this syntax mean that locking a mutex locks a region of code until the mutex is unlocked? Then how do threads know that the region is locked? [UPDATE: Threads know that the region is locked, by Memory Fencing ]. And isn't such a phenomenon supposed to be called critical section? [UPDATE: Critical section objects are available in Windows only, where the objects are faster than mutexes and are visible only to the thread which implements it. Otherwise, critical section just refers to the area of code protected by a mutex]
What's the simplest possible mutex example program and the simplest possible explanation on the logic of how it works?

Here goes my humble attempt to explain the concept to newbies around the world: (a color coded version on my blog too)
A lot of people run to a lone phone booth (they don't have mobile phones) to talk to their loved ones. The first person to catch the door-handle of the booth, is the one who is allowed to use the phone. He has to keep holding on to the handle of the door as long as he uses the phone, otherwise someone else will catch hold of the handle, throw him out and talk to his wife :) There's no queue system as such. When the person finishes his call, comes out of the booth and leaves the door handle, the next person to get hold of the door handle will be allowed to use the phone.
A thread is : Each person
The mutex is : The door handle
The lock is : The person's hand
The resource is : The phone
Any thread which has to execute some lines of code which should not be modified by other threads at the same time (using the phone to talk to his wife), has to first acquire a lock on a mutex (clutching the door handle of the booth). Only then will a thread be able to run those lines of code (making the phone call).
Once the thread has executed that code, it should release the lock on the mutex so that another thread can acquire a lock on the mutex (other people being able to access the phone booth).
[The concept of having a mutex is a bit absurd when considering real-world exclusive access, but in the programming world I guess there was no other way to let the other threads 'see' that a thread was already executing some lines of code. There are concepts of recursive mutexes etc, but this example was only meant to show you the basic concept. Hope the example gives you a clear picture of the concept.]
With C++11 threading:
#include <iostream>
#include <thread>
#include <mutex>
std::mutex m;//you can use std::lock_guard if you want to be exception safe
int i = 0;
void makeACallFromPhoneBooth()
{
m.lock();//man gets a hold of the phone booth door and locks it. The other men wait outside
//man happily talks to his wife from now....
std::cout << i << " Hello Wife" << std::endl;
i++;//no other thread can access variable i until m.unlock() is called
//...until now, with no interruption from other men
m.unlock();//man lets go of the door handle and unlocks the door
}
int main()
{
//This is the main crowd of people uninterested in making a phone call
//man1 leaves the crowd to go to the phone booth
std::thread man1(makeACallFromPhoneBooth);
//Although man2 appears to start second, there's a good chance he might
//reach the phone booth before man1
std::thread man2(makeACallFromPhoneBooth);
//And hey, man3 also joined the race to the booth
std::thread man3(makeACallFromPhoneBooth);
man1.join();//man1 finished his phone call and joins the crowd
man2.join();//man2 finished his phone call and joins the crowd
man3.join();//man3 finished his phone call and joins the crowd
return 0;
}
Compile and run using g++ -std=c++0x -pthread -o thread thread.cpp;./thread
Instead of explicitly using lock and unlock, you can use brackets as shown here, if you are using a scoped lock for the advantage it provides. Scoped locks have a slight performance overhead though.

While a mutex may be used to solve other problems, the primary reason they exist is to provide mutual exclusion and thereby solve what is known as a race condition. When two (or more) threads or processes are attempting to access the same variable concurrently, we have potential for a race condition. Consider the following code
//somewhere long ago, we have i declared as int
void my_concurrently_called_function()
{
i++;
}
The internals of this function look so simple. It's only one statement. However, a typical pseudo-assembly language equivalent might be:
load i from memory into a register
add 1 to i
store i back into memory
Because the equivalent assembly-language instructions are all required to perform the increment operation on i, we say that incrementing i is a non-atmoic operation. An atomic operation is one that can be completed on the hardware with a gurantee of not being interrupted once the instruction execution has begun. Incrementing i consists of a chain of 3 atomic instructions. In a concurrent system where several threads are calling the function, problems arise when a thread reads or writes at the wrong time. Imagine we have two threads running simultaneoulsy and one calls the function immediately after the other. Let's also say that we have i initialized to 0. Also assume that we have plenty of registers and that the two threads are using completely different registers, so there will be no collisions. The actual timing of these events may be:
thread 1 load 0 into register from memory corresponding to i //register is currently 0
thread 1 add 1 to a register //register is now 1, but not memory is 0
thread 2 load 0 into register from memory corresponding to i
thread 2 add 1 to a register //register is now 1, but not memory is 0
thread 1 write register to memory //memory is now 1
thread 2 write register to memory //memory is now 1
What's happened is that we have two threads incrementing i concurrently, our function gets called twice, but the outcome is inconsistent with that fact. It looks like the function was only called once. This is because the atomicity is "broken" at the machine level, meaning threads can interrupt each other or work together at the wrong times.
We need a mechanism to solve this. We need to impose some ordering to the instructions above. One common mechanism is to block all threads except one. Pthread mutex uses this mechanism.
Any thread which has to execute some lines of code which may unsafely modify shared values by other threads at the same time (using the phone to talk to his wife), should first be made acquire a lock on a mutex. In this way, any thread that requires access to the shared data must pass through the mutex lock. Only then will a thread be able to execute the code. This section of code is called a critical section.
Once the thread has executed the critical section, it should release the lock on the mutex so that another thread can acquire a lock on the mutex.
The concept of having a mutex seems a bit odd when considering humans seeking exclusive access to real, physical objects but when programming, we must be intentional. Concurrent threads and processes don't have the social and cultural upbringing that we do, so we must force them to share data nicely.
So technically speaking, how does a mutex work? Doesn't it suffer from the same race conditions that we mentioned earlier? Isn't pthread_mutex_lock() a bit more complex that a simple increment of a variable?
Technically speaking, we need some hardware support to help us out. The hardware designers give us machine instructions that do more than one thing but are guranteed to be atomic. A classic example of such an instruction is the test-and-set (TAS). When trying to acquire a lock on a resource, we might use the TAS might check to see if a value in memory is 0. If it is, that would be our signal that the resource is in use and we do nothing (or more accurately, we wait by some mechanism. A pthreads mutex will put us into a special queue in the operating system and will notify us when the resource becomes available. Dumber systems may require us to do a tight spin loop, testing the condition over and over). If the value in memory is not 0, the TAS sets the location to something other than 0 without using any other instructions. It's like combining two assembly instructions into 1 to give us atomicity. Thus, testing and changing the value (if changing is appropriate) cannot be interrupted once it has begun. We can build mutexes on top of such an instruction.
Note: some sections may appear similar to an earlier answer. I accepted his invite to edit, he preferred the original way it was, so I'm keeping what I had which is infused with a little bit of his verbiage.

I stumbled upon this post recently and think that it needs an updated solution for the standard library's c++11 mutex (namely std::mutex).
I've pasted some code below (my first steps with a mutex - I learned concurrency on win32 with HANDLE, SetEvent, WaitForMultipleObjects etc).
Since it's my first attempt with std::mutex and friends, I'd love to see comments, suggestions and improvements!
#include <condition_variable>
#include <mutex>
#include <algorithm>
#include <thread>
#include <queue>
#include <chrono>
#include <iostream>
int _tmain(int argc, _TCHAR* argv[])
{
// these vars are shared among the following threads
std::queue<unsigned int> nNumbers;
std::mutex mtxQueue;
std::condition_variable cvQueue;
bool m_bQueueLocked = false;
std::mutex mtxQuit;
std::condition_variable cvQuit;
bool m_bQuit = false;
std::thread thrQuit(
[&]()
{
using namespace std;
this_thread::sleep_for(chrono::seconds(5));
// set event by setting the bool variable to true
// then notifying via the condition variable
m_bQuit = true;
cvQuit.notify_all();
}
);
std::thread thrProducer(
[&]()
{
using namespace std;
int nNum = 13;
unique_lock<mutex> lock( mtxQuit );
while ( ! m_bQuit )
{
while( cvQuit.wait_for( lock, chrono::milliseconds(75) ) == cv_status::timeout )
{
nNum = nNum + 13 / 2;
unique_lock<mutex> qLock(mtxQueue);
cout << "Produced: " << nNum << "\n";
nNumbers.push( nNum );
}
}
}
);
std::thread thrConsumer(
[&]()
{
using namespace std;
unique_lock<mutex> lock(mtxQuit);
while( cvQuit.wait_for(lock, chrono::milliseconds(150)) == cv_status::timeout )
{
unique_lock<mutex> qLock(mtxQueue);
if( nNumbers.size() > 0 )
{
cout << "Consumed: " << nNumbers.front() << "\n";
nNumbers.pop();
}
}
}
);
thrQuit.join();
thrProducer.join();
thrConsumer.join();
return 0;
}

For those looking for the shortex mutex example:
#include <mutex>
int main() {
std::mutex m;
m.lock();
// do thread-safe stuff
m.unlock();
}

The function pthread_mutex_lock() either acquires the mutex for the calling thread or blocks the thread until the mutex can be acquired. The related pthread_mutex_unlock() releases the mutex.
Think of the mutex as a queue; every thread that attempts to acquire the mutex will be placed on the end of the queue. When a thread releases the mutex, the next thread in the queue comes off and is now running.
A critical section refers to a region of code where non-determinism is possible. Often this because multiple threads are attempting to access a shared variable. The critical section is not safe until some sort of synchronization is in place. A mutex lock is one form of synchronization.

You are supposed to check the mutex variable before using the area protected by the mutex. So your pthread_mutex_lock() could (depending on implementation) wait until mutex1 is released or return a value indicating that the lock could not be obtained if someone else has already locked it.
Mutex is really just a simplified semaphore. If you read about them and understand them, you understand mutexes. There are several questions regarding mutexes and semaphores in SO. Difference between binary semaphore and mutex, When should we use mutex and when should we use semaphore and so on. The toilet example in the first link is about as good an example as one can think of. All code does is to check if the key is available and if it is, reserves it. Notice that you don't really reserve the toilet itself, but the key.

SEMAPHORE EXAMPLE ::
sem_t m;
sem_init(&m, 0, 0); // initialize semaphore to 0
sem_wait(&m);
// critical section here
sem_post(&m);
Reference : http://pages.cs.wisc.edu/~remzi/Classes/537/Fall2008/Notes/threads-semaphores.txt

Modelling boost::Lockable with semaphore rather than mutex (previously titled: Unlocking a mutex from a different thread)

I'm using the C++ boost::thread library, which in my case means I'm using pthreads. Officially, a mutex must be unlocked from the same thread which locks it, and I want the effect of being able to lock in one thread and then unlock in another. There are many ways to accomplish this. One possibility would be to write a new mutex class which allows this behavior.
For example:
class inter_thread_mutex{
bool locked;
boost::mutex mx;
boost::condition_variable cv;
public:
void lock(){
boost::unique_lock<boost::mutex> lck(mx);
while(locked) cv.wait(lck);
locked=true;
}
void unlock(){
{
boost::lock_guard<boost::mutex> lck(mx);
if(!locked) error();
locked=false;
}
cv.notify_one();
}
// bool try_lock(); void error(); etc.
}
I should point out that the above code doesn't guarantee FIFO access, since if one thread calls lock() while another calls unlock(), this first thread may acquire the lock ahead of other threads which are waiting. (Come to think of it, the boost::thread documentation doesn't appear to make any explicit scheduling guarantees for either mutexes or condition variables). But let's just ignore that (and any other bugs) for now.
My question is, if I decide to go this route, would I be able to use such a mutex as a model for the boost Lockable concept. For example, would anything go wrong if I use a boost::unique_lock< inter_thread_mutex > for RAII-style access, and then pass this lock to boost::condition_variable_any.wait(), etc.
On one hand I don't see why not. On the other hand, "I don't see why not" is usually a very bad way of determining whether something will work.
The reason I ask is that if it turns out that I have to write wrapper classes for RAII locks and condition variables and whatever else, then I'd rather just find some other way to achieve the same effect.
EDIT:
The kind of behavior I want is basically as follows. I have an object, and it needs to be locked whenever it is modified. I want to lock the object from one thread, and do some work on it. Then I want to keep the object locked while I tell another worker thread to complete the work. So the first thread can go on and do something else while the worker thread finishes up. When the worker thread gets done, it unlocks the mutex.
And I want the transition to be seemless so nobody else can get the mutex lock in between when thread 1 starts the work and thread 2 completes it.
Something like inter_thread_mutex seems like it would work, and it would also allow the program to interact with it as if it were an ordinary mutex. So it seems like a clean solution. If there's a better solution, I'd be happy to hear that also.
EDIT AGAIN:
The reason I need locks to begin with is that there are multiple master threads, and the locks are there to prevent them from accessing shared objects concurrently in invalid ways.
So the code already uses loop-level lock-free sequencing of operations at the master thread level. Also, in the original implementation, there were no worker threads, and the mutexes were ordinary kosher mutexes.
The inter_thread_thingy came up as an optimization, primarily to improve response time. In many cases, it was sufficient to guarantee that the "first part" of operation A, occurs before the "first part" of operation B. As a dumb example, say I punch object 1 and give it a black eye. Then I tell object 1 to change it's internal structure to reflect all the tissue damage. I don't want to wait around for the tissue damage before I move on to punch object 2. However, I do want the tissue damage to occur as part of the same operation; for example, in the interim, I don't want any other thread to reconfigure the object in such a way that would make tissue damage an invalid operation. (yes, this example is imperfect in many ways, and no I'm not working on a game)
So we made the change to a model where ownership of an object can be passed to a worker thread to complete an operation, and it actually works quite nicely; each master thread is able to get a lot more operations done because it doesn't need to wait for them all to complete. And, since the event sequencing at the master thread level is still loop-based, it is easy to write high-level master-thread operations, as they can be based on the assumption that an operation is complete (more precisely, the critical "first part" upon which the sequencing logic depends is complete) when the corresponding function call returns.
Finally, I thought it would be nice to use inter_thread mutex/semaphore thingies using RAII with boost locks to encapsulate the necessary synchronization that is required to make the whole thing work.

man pthread_unlock (this is on OS X, similar wording on Linux) has the answer:
NAME
pthread_mutex_unlock -- unlock a mutex
SYNOPSIS
#include <pthread.h>
int
pthread_mutex_unlock(pthread_mutex_t *mutex);
DESCRIPTION
If the current thread holds the lock on mutex, then the
pthread_mutex_unlock() function unlocks mutex.
Calling pthread_mutex_unlock() with a mutex that the
calling thread does not hold will result in
undefined behavior.
...
My counter-question would be - what kind of synchronization problem are you trying to solve with this? Most probably there is an easier solution.
Neither pthreads nor boost::thread (built on top of it) guarantee any order in which a contended mutex is acquired by competing threads.

Sorry, but I don't understand. what will be the state of your mutex in line [1] in the following code if another thread can unlock it?
inter_thread_mutex m;
{
m.lock();
// [1]
m.unlock();
}
This has no sens.

There's a few ways to approach this. Both of the ones I'm going to suggest are going to involve adding an additional piece of information to the object, rather adding a mechanism to unlock a thread from a thread other than the one that owns it.
1) you can add some information to indicate the object's state:
enum modification_state { consistent, // ready to be examined or to start being modified
phase1_complete, // ready for the second thread to finish the work
};
// first worker thread
lock();
do_init_work(object);
object.mod_state = phase1_complete;
unlock();
signal();
do_other_stuff();
// second worker thread
lock()
while( object.mod_state != phase1_complete )
wait()
do_final_work(obj)
object.mod_state = consistent;
unlock()
signal()
// some other thread that needs to read the data
lock()
while( object.mod_state != consistent )
wait();
read_data(obj)
unlock()
Works just fine with condition variables, because obviously you're not writing your own lock.
2) If you have a specific thread in mind, you can give the object an owner.
// first worker
lock();
while( obj.owner != this_thread() ) wait();
do_initial_work(obj);
obj.owner = second_thread_id;
unlock()
signal()
...
This is pretty much the same solution as my first solution, but more flexible in the adding/removing of phases, and less flexible in the adding/removing of threads.
To be honest, I'm not sure how inter thread mutex would help you here. You'd still need a semaphore or condition variable to signal the passing of the work to the second thread.

Small modification to what you already have: how about storing the id of the thread which you want to take the lock, in your inter_thread_whatever? Then unlock it, and send a message to that thread, saying "I want you execute whatever routine it is that tries to take this lock".
Then the condition in lock becomes while(locked || (desired_locker != thisthread && desired_locker != 0)). Technically you've "released the lock" in the first thread, and "taken it again" in the second thread, but there's no way that any other thread can grab it in between, so it's as if you've transferred it directly from one to the other.
There's a potential problem, that if a thread exits or is killed, while it's the desired locker of your lock, then that thread deadlocks. But you were already talking about the first thread waiting for a message from the second thread to say that it has successfully acquired the lock, so presumably you already have a plan in mind for what happens if that message is never received. To that plan, add "reset the desired_locker field on the inter_thread_whatever".
This is all very hairy, though, I'm not convinced that what I've proposed is correct. Is there a way that the "master" thread (the one that's directing all these helpers) can just make sure that it doesn't order any more operations to be performed on whatever is protected by this lock, until the first op is completed (or fails and some RAII thing notifies you)? You don't need locks as such, if you can deal with it at the level of the message loop.

I don't think it is a good idea to say that your inter_thread_mutex (binary_semaphore) can be seen as a model of Lockable. The main issue is that the main feature of your inter_thread_mutex defeats the Locakble concept. If inter_thread_mutex was a model of lockable you will expect in In [1] that the inter_thread_mutex m is locked.
// thread T1
inter_thread_mutex m;
{
unique_lock<inter_thread_mutex> lk(m);
// [1]
}
But as an other thread T2 can do m.unlock() while T1 is in [1], the guaranty is broken.
Binary semaphores can be used as Lockables as far as each thread tries to lock before unlocking. But the main goal of your class is exactly the contrary.
This is one of the reason semaphores in Boost.Interprocess don't use lock/unlock to name the functions, but wait/notify. Curiously these are the same names used by conditions :)

A mutex is a mechanism for describing mutually exclusive blocks of code. It does not make sense for these blocks of code to cross thread boundaries. Trying to use such a concept in such an counter intuitive way can only lead to problems down the line.
It sounds very much like you're looking for a different multi-threading concept, but without more detail it's hard to know what.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js