Mutex guards: is there any automated protection mechanism for objects? - c++

This scenario occurs frequently: we have some threads and a shared object, and we need to make sure that at any time only one thread can modify that object.
Well, the obvious solution is to use the lock-the-door, do-the-job, get-out-of-there idiom. In this situation I always use POSIX mutexes. For example:
pthread_mutex_lock(&this->messageRW); // lock the door
P_Message x = this->messageQueue.front(); // do the job
this->messageQueue.pop();
pthread_mutex_unlock(&this->messageRW); // get out of there
// somewhere else, in another thread
while (true) {
P_Message message;
solver->listener->recvMessage(message);
pthread_mutex_lock(&(solver->messageRW)); // lock the door
solver->messageQueue.push(message); // do the job
pthread_mutex_unlock(&(solver->messageRW)); // get out of there
sem_post(&solver->messageCount);
}
I use messageQueue in so many places in the code, so I ended up with a lot of lock/unlock pairs, which is inelegant. I think there should be a way to declare messageQueue as an object that is supposed to be shared between threads, and then the threading API can take care of lock/unlock. I can think of a wrapper class, or something similar. A POSIX-based solution is preferred, though other APIs (Boost threads, or other libraries) are also acceptable.
What would you implement in a similar situation?
Update for future readers
I found this. Will be a part of C++14 I guess.

You can use boost::mutex::scoped_lock in this case. As soon as you go out of scope, it unlocks elegantly:
boost::mutex mMutex; // member mutex object defined somewhere
{ // scope start
    boost::mutex::scoped_lock scopedLock(mMutex); // lock the door: the constructor locks, no separate lock call is needed
    P_Message x = this->messageQueue.front(); // do the job
    this->messageQueue.pop();
} // scope end, the destructor unlocks the mutex
// somewhere else, in another thread
while (true) {
    P_Message message;
    solver->listener->recvMessage(message);
    boost::mutex::scoped_lock scopedLock(mMutex); // scoped lock the door
    solver->messageQueue.push(message); // do the job
    sem_post(&solver->messageCount);
} // end of each iteration, the destructor unlocks the mutex

For this I would either subclass (is-a) or include (has-a) the message queue class in another class which forces the use of mutexes.
That's functionally what other languages do, such as Java with the synchronized keyword: it modifies the underlying object to be automatically protected.

It's the message queue itself which should handle the locking
(be atomic), not the calling code. And you need more than just
a mutex, you need a condition as well to avoid race conditions.
The standard idiom would be something like:
#include <deque>
#include <pthread.h>

class ScopedLock // You should already have this one anyway
{
    pthread_mutex_t& myOwned;
    ScopedLock( ScopedLock const& );             // non-copyable
    ScopedLock& operator=( ScopedLock const& );
public:
    explicit ScopedLock( pthread_mutex_t& owned )
        : myOwned( owned )
    {
        pthread_mutex_lock( &myOwned );
    }
    ~ScopedLock()
    {
        pthread_mutex_unlock( &myOwned );
    }
};

class MessageQueue
{
    std::deque<Message> myQueue;
    pthread_mutex_t myMutex;
    pthread_cond_t myCond;
public:
    MessageQueue()
    {
        pthread_mutex_init( &myMutex, NULL );
        pthread_cond_init( &myCond, NULL );
    }
    ~MessageQueue()
    {
        pthread_cond_destroy( &myCond );
        pthread_mutex_destroy( &myMutex );
    }
    void push( Message const& message )
    {
        ScopedLock lock( myMutex ); // must be named: "ScopedLock( myMutex );" declares a variable and locks nothing
        myQueue.push_back( message );
        pthread_cond_broadcast( &myCond );
    }
    Message pop()
    {
        ScopedLock lock( myMutex );
        while ( myQueue.empty() ) {            // loop: pthread_cond_wait may return spuriously
            pthread_cond_wait( &myCond, &myMutex );
        }
        Message results = myQueue.front();
        myQueue.pop_front();
        return results;
    }
};
This needs more error handling, but the basic structure is
there.
Of course, if you can use C++11, you'd be better off using the
standard thread primitives. (Otherwise, I'd normally suggest
Boost threads. But if you're already using Posix threads, you
might want to wait to convert until you can use standard
threads, rather than converting twice.) But you'll still need
both a mutex and a condition.
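An added example, not code from the question or from either answer: the "queue guards itself" design above, written against the C++11 standard primitives the answer recommends instead of raw pthread calls. Callers never lock anything; push() and pop() do it internally with a mutex plus a condition variable, and the predicate passed to wait() handles spurious wakeups. The one thing added beyond the answer's sketch is close(), because without it a consumer blocked in pop() can never learn that the producers have gone away. The class and function names are this example's own.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// A message queue that owns its own synchronisation (added example, not the
// original page's code). Callers never touch a mutex.
template <class T>
class ThreadSafeQueue {
    std::deque<T> queue_;
    mutable std::mutex mutex_;
    std::condition_variable cond_;
    bool closed_ = false;

public:
    // Returns false if the queue has been closed; nothing is enqueued then.
    bool push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return false;
            queue_.push_back(std::move(value));
        }  // unlock before notifying so the woken consumer does not stall on the mutex
        cond_.notify_one();
        return true;
    }

    // Blocks until an item is available (true) or the queue is closed and
    // drained (false). The predicate is re-checked after every wake, so a
    // spurious wakeup simply goes back to waiting.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;  // closed and drained
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    // Wakes every waiter; consumers finish draining, then pop() returns false.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cond_.notify_all();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }
};

// Constructed workload: `producers` threads each push the values
// 1..per_producer; `consumers` threads drain until the queue is closed.
// Returns the total consumed, which equals the total produced only if no
// message was lost or duplicated.
long run_queue_demo(int producers, int consumers, int per_producer) {
    ThreadSafeQueue<int> q;
    std::mutex sum_mutex;
    long total = 0;

    std::vector<std::thread> workers;
    for (int c = 0; c < consumers; ++c) {
        workers.emplace_back([&] {
            long local = 0;
            int v;
            while (q.pop(v)) local += v;
            std::lock_guard<std::mutex> lock(sum_mutex);
            total += local;
        });
    }
    std::vector<std::thread> makers;
    for (int p = 0; p < producers; ++p) {
        makers.emplace_back([&] {
            for (int v = 1; v <= per_producer; ++v) q.push(v);
        });
    }
    for (auto& t : makers) t.join();
    q.close();  // producers are done; let the consumers drain and exit
    for (auto& t : workers) t.join();
    return total;
}

// Sum of 1..per_producer, once per producer.
long expected_total(int producers, int per_producer) {
    return static_cast<long>(producers) * per_producer * (per_producer + 1) / 2;
}

// A closed queue refuses new messages rather than silently dropping them.
bool push_after_close_is_rejected() {
    ThreadSafeQueue<int> q;
    q.close();
    return !q.push(1);
}
```

The shutdown path matters more than it looks: in the original design a consumer parked in pthread_cond_wait has no exit, so the process can only be killed. With close(), stopping the server is "close the queue, join the consumers", and every message already enqueued is still delivered before pop() reports false.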

Related

Is there a term for a "single mutex deadlock" (deadlock-type situation with non-recursive mutex)?

The following code hangs because of multiple calls to acquire a non-recursive mutex:
#include <pthread.h>
class Lock
{
public:
    Lock( pthread_mutex_t& mutex )
        : mutex_( mutex )
    {
        pthread_mutex_lock( &mutex_ );
    }
    ~Lock()
    {
        pthread_mutex_unlock( &mutex_ );
    }
private:
    pthread_mutex_t& mutex_;
};
class Foo
{
public:
    Foo()
    {
        pthread_mutex_init( &mutex_, NULL );
    }
    ~Foo()
    {
        pthread_mutex_destroy( &mutex_ );
    }
    void hang()
    {
        Lock l( mutex_ );
        subFunc();
    }
    void subFunc()
    {
        Lock l( mutex_ );
    }
private:
    pthread_mutex_t mutex_;
};
int main()
{
    Foo f;
    f.hang();
}
Is there a word or phrase for this situation? I'm not sure, but I don't think this can properly be called a deadlock: I'm of the understanding that a deadlock proper refers to the stalemate resulting from impassably ordered acquisition of multiple shared resources.
I've been anecdotally calling this a "single mutex deadlock" but I'd like to learn if there is a more proper term/phrase for this.
The Wikipedia article on reentrant mutexes cites Pattern-Oriented Software Architecture, which uses the term "self-deadlock." This term seems pretty reasonable to me!
...mutexes come in two basic flavors: recursive and non-recursive. A recursive mutex allows re-entrant locking, in which a thread that has already locked a mutex can lock it again and progress. Non-recursive mutexes, in contrast, cannot: a second lock in the same thread results in self-deadlock. Non-recursive mutexes can potentially be much faster to lock and unlock than recursive mutexes, but the risk of self-deadlock means that care must be taken when an object calls any methods on itself, either directly or via a callback, because double-locking will cause the thread to hang.
(emphasis added)
Various search results across a variety of technologies corroborate the use of this term.
https://docs.oracle.com/cd/E19253-01/816-5137/guide-35930/index.html
https://support.microsoft.com/en-in/help/2963138/fix-parallel-deadlock-or-self-deadlock-occurs-when-you-run-a-query-tha
https://issues.apache.org/jira/browse/DERBY-6692
https://github.com/citusdata/citus/issues/1572
"self deadlock" or "recursive deadlock".
According to the manual, it is undefined behavior to lock a default-initialized mutex twice from the same thread:
If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior.
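An added example, not part of the question or its answers: what the other POSIX mutex types do when the same thread locks twice, which is the practical defence against the self-deadlock above. The default type is deliberately not exercised, since relocking it has no defined result to show. A PTHREAD_MUTEX_ERRORCHECK mutex turns the second lock into an EDEADLK return (and an unlock by a non-owner into EPERM), so the hang becomes a diagnosable error; a PTHREAD_MUTEX_RECURSIVE mutex allows the nested lock as long as every lock is balanced by an unlock. The helper names are this example's own.

```cpp
#include <cerrno>
#include <pthread.h>

// Initialise a mutex of the requested type (PTHREAD_MUTEX_ERRORCHECK or
// PTHREAD_MUTEX_RECURSIVE). Returns 0 on success, else an errno-style code.
// Added example, not the original post's code.
int init_mutex_of_type(pthread_mutex_t* m, int type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);   // the mutex keeps its own copy of the attributes
    return rc;
}

// Mirrors Foo::hang() above: lock twice from the calling thread and report
// what the second lock returned, then release whatever was acquired.
int second_lock_result(int type) {
    pthread_mutex_t m;
    if (init_mutex_of_type(&m, type) != 0) return -1;
    if (pthread_mutex_lock(&m) != 0) return -1;   // first lock: always succeeds
    int rc = pthread_mutex_lock(&m);              // the re-entrant lock
    if (rc == 0) pthread_mutex_unlock(&m);        // recursive type: balance the nested lock
    pthread_mutex_unlock(&m);                     // release the first lock
    pthread_mutex_destroy(&m);
    return rc;
}

// The mirror-image bug, an unlock by a thread that does not own the mutex,
// is also caught by an error-checking mutex instead of being undefined.
int unlock_unowned_result() {
    pthread_mutex_t m;
    if (init_mutex_of_type(&m, PTHREAD_MUTEX_ERRORCHECK) != 0) return -1;
    int rc = pthread_mutex_unlock(&m);            // never locked by anyone
    pthread_mutex_destroy(&m);
    return rc;
}
```

Error-checking mutexes cost a little more per lock than the default type, so a common arrangement is to select the type at build time: ERRORCHECK in debug and test builds, where a hang in Foo::hang() becomes a loud EDEADLK you can assert on, and the default type in release once the code paths have been shaken out.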

Why is there no wait function for condition_variable which does not relock the mutex

Consider the following example.
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std::chrono_literals;

std::mutex mtx;
std::condition_variable cv;
void f()
{
    {
        std::unique_lock<std::mutex> lock( mtx );
        cv.wait( lock ); // 1
    }
    std::cout << "f()\n";
}
void g()
{
    std::this_thread::sleep_for( 1s );
    cv.notify_one();
}
int main()
{
    std::thread t1{ f };
    std::thread t2{ g };
    t2.join();
    t1.join();
}
g() "knows" that f() is waiting in the scenario I would like to discuss.
According to cppreference.com there is no need for g() to lock the mutex before calling notify_one. Now in the line marked "1" cv will release the mutex and relock it once the notification is sent. The destructor of lock releases it again immediately after that. This seems to be superfluous especially since locking is expensive. (I know in certain scenarios the mutex needs to be locked. But this is not the case here.)
Why does condition_variable have no function "wait_nolock" which does not relock the mutex once the notification arrives? If the answer is that pthreads do not provide such functionality: why can't pthreads be extended to provide it? Is there an alternative for realizing the desired behavior?
You misunderstand what your code does.
Your code on line // 1 is free to not block at all. condition_variables can (and will!) have spurious wakeups -- they can wake up for no good reason at all.
You are responsible for checking if the wakeup is spurious.
Using a condition_variable properly requires 3 things:
A condition_variable
A mutex
Some data guarded by the mutex
The data guarded by the mutex is modified (under the mutex). Then (with the mutex possibly disengaged), the condition_variable is notified.
On the other end, you lock the mutex, then wait on the condition variable. When you wake up, your mutex is relocked, and you test if the wakeup is spurious by looking at the data guarded by the mutex. If it is a valid wakeup, you process and proceed.
If it wasn't a valid wakeup, you go back to waiting.
In your case, you don't have any data guarded, you cannot distinguish spurious wakeups from real ones, and your design is incomplete.
Not surprisingly with the incomplete design you don't see the reason why the mutex is relocked: it is relocked so you can safely check the data to see if the wakeup was spurious or not.
If you want to know why condition variables are designed that way, probably because this design is more efficient than the "reliable" one (for whatever reason), and rather than exposing higher level primitives, C++ exposed the lower level more efficient primitives.
Building a higher level abstraction on top of this isn't hard, but there are design decisions. Here is one built on top of std::experimental::optional:
template<class T>
struct data_passer {
    std::experimental::optional<T> data;
    bool abort_flag = false;
    std::mutex guard;
    std::condition_variable signal;
    void send( T t ) {
        {
            std::unique_lock<std::mutex> _(guard);
            data = std::move(t);
        }
        signal.notify_one();
    }
    void abort() {
        {
            std::unique_lock<std::mutex> _(guard);
            abort_flag = true;
        }
        signal.notify_all();
    }
    std::experimental::optional<T> get() {
        std::unique_lock<std::mutex> _(guard);
        signal.wait( _, [this]()->bool{
            return data || abort_flag;
        });
        if (abort_flag) return {};
        T retval = std::move(*data);
        data = {};
        return retval;
    }
};
Now, each send can cause a get to succeed at the other end. If more than one send occurs, only the latest one is consumed by a get. If and when abort_flag is set, get() instead immediately returns {}.
The above supports multiple consumers and producers.
An example of how the above might be used is a source of preview state (say, a UI thread), and one or more preview renderers (which are not fast enough to be run in the UI thread).
The preview state dumps a preview state into the data_passer<preview_state> willy-nilly. The renderers compete and one of them grabs it. Then they render it, and pass it back (through whatever mechanism).
If the preview states come faster than the renderers consume them, only the most recent one is of interest, so the earlier ones are discarded. But existing previews aren't aborted just because a new state shows up.
Questions were asked below about race conditions.
If the data being communicated is atomic, can't we do without the mutex on the "send" side?
So something like this:
template<class T>
struct data_passer {
    std::atomic<std::experimental::optional<T>> data;
    std::atomic<bool> abort_flag = false;
    std::mutex guard;
    std::condition_variable signal;
    void send( T t ) {
        data = std::move(t);  // 1a
        signal.notify_one();  // 1b
    }
    void abort() {
        abort_flag = true;    // 1a
        signal.notify_all();  // 1b
    }
    std::experimental::optional<T> get() {
        std::unique_lock<std::mutex> _(guard);            // 2a
        signal.wait( _, [this]()->bool{                   // 2b
            return data.load() || abort_flag.load();      // 2c
        });
        if (abort_flag.load()) return {};
        T retval = std::move(*data.load());
        // data = std::experimental::nullopt; // doesn't make sense
        return retval;
    }
};
the above fails to work.
We start with the listening thread. It does step 2a, then waits (2b). It evaluates the condition at step 2c, but doesn't return from the lambda yet.
The broadcasting thread then does step 1a (setting the data), then signals the condition variable. At this moment, nobody is waiting on the condition variable (the code in the lambda doesn't count!).
The listening thread then finishes the lambda, and returns "spurious wakeup". It then blocks on the condition variable, and never notices that data was sent.
The std::mutex used while waiting on the condition variable must guard the write to the data "passed" by the condition variable (whatever test you do to determine if the wakeup was spurious), and the read (in the lambda), or the possibility of "lost signals" exists. (At least in a simple implementation: more complex implementations can create lock-free paths for "common cases" and only use the mutex in a double-check. This is beyond the scope of this question.)
Using atomic variables does not get around this problem, because the two operations of "determine if the message was spurious" and "rewait in the condition variable" must be atomic with regards to the "spuriousness" of the message.

Best way to handle multi-thread cleanup

I have a server-type application, and I have an issue with making sure threads aren't deleted before they complete. The code below pretty much represents my server; the cleanup is required to prevent a build-up of dead threads in the list.
#include <atomic>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <thread>
using namespace std;
class A {
public:
    void doSomethingThreaded(function<void()> cleanupFunction, function<bool()> getStopFlag) {
        somethingThread = thread([cleanupFunction, getStopFlag, this]() {
            doSomething(getStopFlag);
            cleanupFunction();
        });
    }
private:
    void doSomething(function<bool()> getStopFlag);
    thread somethingThread;
    ...
};
class B {
public:
    void runServer();
    void stop() {
        stopFlag = true;
        waitForListToBeEmpty();
    }
private:
    void waitForListToBeEmpty() { ... };
    void handleAccept(...) {
        shared_ptr<A> newClient(new A());
        {
            unique_lock<mutex> lock(listMutex);
            clientData.push_back(newClient);
        }
        newClient->doSomethingThreaded(bind(&B::cleanup, this, newClient), [this]() {
            return stopFlag;
        });
    }
    void cleanup(shared_ptr<A> data) {
        unique_lock<mutex> lock(listMutex);
        clientData.remove(data);
    }
    list<shared_ptr<A>> clientData;
    mutex listMutex;
    atomic<bool> stopFlag;
};
The issue seems to be that the destructors run in the wrong order - i.e. the shared_ptr is destructed when the thread's function completes, meaning the 'A' object is deleted before thread completion, causing havoc when the thread's destructor is called.
i.e.
Call cleanup function
All references to this (i.e. an A object) removed, so call destructor (including this thread's destructor)
Call this thread's destructor again -- OH NOES!
I've looked at alternatives, such as maintaining a 'to be removed' list which is periodically used to clean the primary list by another thread, or using a time-delayed deleter function for the shared pointers, but both of these seem a bit chunky and could have race conditions.
Anyone know of a good way to do this? I can't see an easy way of refactoring it to work ok.
Are the threads joinable or detached? I don't see any detach,
which means that destructing the thread object without having
joined it is a fatal error. You might try simply detaching it,
although this can make a clean shutdown somewhat complex. (Of
course, for a lot of servers, there should never be a shutdown
anyway.) Otherwise: what I've done in the past is to create
a reaper thread; a thread which does nothing but join any
outstanding threads, to clean up after them.
I might add that this is a good example of a case where
shared_ptr is not appropriate. You want full control over
when the delete occurs; if you detach, you can do it in the
clean up function (but quite frankly, just using delete this;
at the end of the lambda in A::doSomethingThreaded seems more
readable); otherwise, you do it after you've joined, in the
reaper thread.
EDIT:
For the reaper thread, something like the following should work:
class ReaperQueue
{
    std::deque<A*> myQueue;
    std::mutex myMutex;
    std::condition_variable myCond;
    A* getOne()
    {
        std::unique_lock<std::mutex> lock( myMutex );
        myCond.wait( lock, [&]{ return !myQueue.empty(); } ); // predicate form: immune to spurious wakeups
        A* results = myQueue.front();
        myQueue.pop_front();
        return results;
    }
public:
    void readyToReap( A* finished_thread )
    {
        std::unique_lock<std::mutex> lock( myMutex );
        myQueue.push_back( finished_thread );
        myCond.notify_all();
    }
    void reaperThread()
    {
        for ( ; ; )
        {
            A* mine = getOne();
            mine->somethingThread.join(); // A must expose its thread to the reaper
            delete mine;
        }
    }
};
(Warning: I've not tested this, and I've tried to use the C++11
functionality. I've only actually implemented it, in the past,
using pthreads, so there could be some errors. The basic
principles should hold, however.)
To use, create an instance, then start a thread calling
reaperThread on it. In the cleanup of each thread, call
readyToReap.
To support a clean shutdown, you may want to use two queues: you
insert each thread into the first, as it is created, and then
move it from the first to the second (which would correspond to
myQueue, above) in readyToReap. To shut down, you then wait
until both queues are empty (not starting any new threads in
this interval, of course).
The issue is that, since you manage A via shared pointers, the this pointer captured by the thread lambda really needs to be a shared pointer rather than a raw pointer to prevent it from becoming dangling. The problem is that there's no easy way to create a shared_ptr from a raw pointer when you don't have an actual shared_ptr as well.
One way to get around this is to use shared_from_this:
class A : public enable_shared_from_this<A> {
public:
    void doSomethingThreaded(function<void()> cleanupFunction, function<bool()> getStopFlag) {
        somethingThread = thread([cleanupFunction, getStopFlag, this]() {
            shared_ptr<A> temp = shared_from_this(); // extra owner, lives until the lambda returns
            doSomething(getStopFlag);
            cleanupFunction();
        });
    }
    // ... rest of A as before
};
this creates an extra shared_ptr to the A object that keeps it alive until the thread finishes.
Note that you still have the problem with join/detach that James Kanze identified -- Every thread must have either join or detach called on it exactly once before it is destroyed. You can fulfill that requirement by adding a detach call to the thread lambda if you never care about the thread exit value.
You also have potential for problems if doSomethingThreaded is called multiple times on a single A object...
For those who are interested, I took a bit of both answers given (i.e. James' detach suggestion, and Chris' suggestion about shared_ptrs).
My resultant code looks like this and seems neater and doesn't cause a crash on shutdown or client disconnect:
#include <atomic>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <thread>
using namespace std;
class A {
public:
    void doSomething(function<bool()> getStopFlag) {
        ...
    }
private:
    ...
};
class B {
public:
    void runServer();
    void stop() {
        stopFlag = true;
        waitForListToBeEmpty();
    }
private:
    void waitForListToBeEmpty() { ... };
    void handleAccept(...) {
        shared_ptr<A> newClient(new A());
        {
            unique_lock<mutex> lock(listMutex);
            clientData.push_back(newClient);
        }
        thread clientThread([this, newClient]() {
            // Capture the shared_ptr until the thread is over and done with.
            newClient->doSomething([this]() {
                return stopFlag;
            });
            cleanup(newClient);
        });
        // Detach to remove the need to store these threads until their completion.
        clientThread.detach();
    }
    void cleanup(shared_ptr<A> data) {
        unique_lock<mutex> lock(listMutex);
        clientData.remove(data);
    }
    list<shared_ptr<A>> clientData; // Can remove this if you don't
                                    // need to connect with your clients.
                                    // However, you'd need to make sure this
                                    // didn't get deallocated before all clients
                                    // finished as they reference the boolean stopFlag
                                    // OR make it a shared_ptr to an atomic boolean
    mutex listMutex;
    atomic<bool> stopFlag;
};
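An added example, not the questioner's or either answerer's code: the detached-thread design the thread settled on, with the part the final code leaves as `...` filled in, namely how stop() knows that every detached worker has finished. Each worker thread holds a shared_ptr to its client object for its whole lifetime, so the object cannot be freed under it, and the server keeps a count of live workers that the last cleanup() drives to zero under the list mutex, with a condition variable to wake stop(). Only after stop() returns is it safe to destroy the server that the workers' lambdas read stopFlag from, which is exactly the lifetime hazard the comment in the final listing warns about. The names (Client, Server, run_server_demo) are this example's own.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <list>
#include <memory>
#include <mutex>
#include <thread>

// Stand-in for the questioner's class A (added example, not the page's code).
struct Client {
    std::atomic<int> iterations{0};
    // Simulated per-client work loop; returns when the server asks it to stop.
    void serve(const std::atomic<bool>& stop) {
        while (!stop.load()) {
            ++iterations;
            std::this_thread::yield();
        }
    }
};

// Stand-in for class B, with a real waitForListToBeEmpty().
class Server {
    std::list<std::shared_ptr<Client>> clients_;
    std::mutex list_mutex_;
    std::condition_variable all_done_;
    int live_workers_ = 0;              // detached threads that have not run cleanup() yet
    std::atomic<bool> stop_flag_{false};

    void cleanup(const std::shared_ptr<Client>& c) {
        std::lock_guard<std::mutex> lock(list_mutex_);
        clients_.remove(c);
        if (--live_workers_ == 0) all_done_.notify_all();
    }

public:
    // Equivalent of handleAccept(): register the client and start its thread.
    void accept_client() {
        auto client = std::make_shared<Client>();
        {
            std::lock_guard<std::mutex> lock(list_mutex_);
            clients_.push_back(client);
            ++live_workers_;            // counted before the thread exists, so stop() cannot miss it
        }
        // The lambda owns a copy of the shared_ptr for the thread's lifetime;
        // cleanup() runs before that copy dies, so the Client is never freed
        // while serve() is still executing on it.
        std::thread([this, client] {
            client->serve(stop_flag_);
            cleanup(client);
        }).detach();
    }

    std::size_t client_count() {
        std::lock_guard<std::mutex> lock(list_mutex_);
        return clients_.size();
    }

    // Signals every worker and blocks until the last one has deregistered.
    // After this returns no detached thread can still touch this object.
    void stop() {
        stop_flag_ = true;
        std::unique_lock<std::mutex> lock(list_mutex_);
        all_done_.wait(lock, [this] { return live_workers_ == 0; });
    }

    ~Server() { stop(); }   // defensive: never let detached threads outlive the server
};

// Constructed scenario: accept n clients, let them run briefly, shut down.
// Returns the number of clients still registered after stop(), which must be
// zero if every worker ran its cleanup before the server went away.
std::size_t run_server_demo(int n) {
    Server server;
    for (int i = 0; i < n; ++i) server.accept_client();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    server.stop();
    return server.client_count();
}
```

The counter is incremented in accept_client() before the thread is even created and decremented inside the same critical section that removes the client from the list, so there is no window in which stop() can see an empty list while a worker is still running. The detach() call is what removes the need to keep std::thread handles around, as the final listing says; the counter plus condition variable is what makes detaching safe at shutdown.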

make function exception-safe

In my multithreaded server I have somefunction(), which needs to protect two global data structures, independent of each other, using EnterCriticalSection.
void somefunction()
{
EnterCriticalSection(&g_List);
...
EnterCriticalSection(&g_Variable);
...
LeaveCriticalSection(&g_Variable);
...
LeaveCriticalSection(&g_List);
}
Following the advice of better programmers I'm going to use a RAII wrapper. For example:
typedef CRITICAL_SECTION CSType; // the type EnterCriticalSection expects
class Locker
{
public:
    Locker(CSType& cs): m_cs(cs)
    {
        EnterCriticalSection(&m_cs);
    }
    ~Locker()
    {
        LeaveCriticalSection(&m_cs);
    }
private:
    CSType& m_cs;
};
My question: Is it ok to transform somefunction() to this?
(putting 2 Locker in one function):
void somefunction()
{
    // g_List, g_Variable previously initialized via InitializeCriticalSection
    Locker lockList(g_List);         // the two guards need distinct names
    Locker lockVariable(g_Variable);
    ...
    ...
}
?
Your current solution has a potential deadlock case. If you have two (or more) CSTypes which will be locked in different orders this way, you will end up in a deadlock. The best way would be to lock them both atomically. You can see an example of this in the Boost thread library: shared_lock and unique_lock can be used in deferred mode, so that first you prepare all RAII objects for all mutex objects, and then lock them all atomically in one call to the lock function.
As long as you keep the lock order the same in your threads it's OK. Do you really need to lock them both at the same time? Also, with a scoped lock you can add scopes to control when to unlock, something like this:
{
// use inner scopes to control lock duration
{
Locker lockList (g_list);
// do something
} // unlocked at the end
Locker lockVariable (g_variable);
// do something
}
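An added example, not part of the question or the answers: the lock-ordering deadlock the first answer describes and the deferred-lock fix it points to, using std::mutex instead of CRITICAL_SECTION so it runs anywhere. Two threads take the same pair of locks in opposite order. With two independent RAII guards taken one after the other that can deadlock; with std::lock() the two deferred unique_locks are acquired as a unit with deadlock avoidance (C++17's std::scoped_lock is the same idea packaged as one object). The globals and function names are this example's own.

```cpp
#include <mutex>
#include <thread>

// Two independent pieces of shared state, each with its own lock
// (added example, not the original page's code).
std::mutex g_list_mutex;
std::mutex g_variable_mutex;
int g_list_size = 0;
int g_variable = 0;

// Deadlock-free form: both guards are constructed deferred (not yet locked),
// then std::lock() acquires them together, retrying internally so that the
// order the caller named them in does not matter.
void update_both(std::mutex& first, std::mutex& second, int& a, int& b) {
    std::unique_lock<std::mutex> lock_a(first, std::defer_lock);
    std::unique_lock<std::mutex> lock_b(second, std::defer_lock);
    std::lock(lock_a, lock_b);     // all-or-nothing acquisition of both
    ++a;
    ++b;
}   // both released by the unique_lock destructors, in reverse order

// Runs `rounds` updates from two threads that name the mutexes in opposite
// order and returns the final sum of the two counters, i.e. 4 * rounds.
// The naive "Locker a(first); Locker b(second);" form of update_both would
// hang this function nondeterministically as soon as the threads interleave.
int run_opposite_order_demo(int rounds) {
    g_list_size = 0;
    g_variable = 0;
    std::thread t1([rounds] {
        for (int i = 0; i < rounds; ++i)
            update_both(g_list_mutex, g_variable_mutex, g_list_size, g_variable);
    });
    std::thread t2([rounds] {
        for (int i = 0; i < rounds; ++i)
            update_both(g_variable_mutex, g_list_mutex, g_variable, g_list_size);
    });
    t1.join();
    t2.join();
    return g_list_size + g_variable;
}
```

The second answer's advice, a single fixed lock order everywhere, is the cheaper discipline when you control every call site; std::lock() is the tool for the case where you cannot guarantee that, such as when the two locks arrive through references and the caller does not know which is which.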

Locking scheme a hack

How's this look:
class HugeHack
{
public:
    HugeHack() : m_flag( false ) { }
    void Logout( )
    {
        boost::lock_guard< boost::mutex > lock( m_lock );
        m_flag = true;
        // do stuff that in a perfect world would be atomic with library call and onLogout
        // call a library function that waits for a thread to finish, that thread calls my onLogout() function before it dies
        m_flag = false;
    }
    void onLogout()
    {
        boost::unique_lock< boost::mutex > l( m_lock, boost::defer_lock ); // defer_lock is the tag object, not the type
        if( ! m_flag )
            l.lock();
        // do stuff
    }
private:
    boost::mutex m_lock;
    bool m_flag;
};
The flag is true ONLY while Logout is running. Logout legitimately waits for a thread to die that calls onLogout, so unless someone else can call onLogout... (can't be totally sure, it's not my library I'm using - QuickFix)
I'm not sure I'm using the unique lock correctly there, if not, the goal is only to conditionally lock the lock (while maintaining scoped locking semantics).
The problem is that if you read m_flag without locking the mutex, you might see m_flag be false even if in reality it is true, and Logout is in the middle of operation. Locking the mutex issues a memory fence, which is essential to ensure proper memory visibility. And BTW stijn is right - if this is all you are after, you can ditch m_flag and use try_lock instead, like this:
boost::mutex::scoped_try_lock l( m_lock );
if ( l )
// lock succeeded
It sounds like what you want to do is release the lock in Logout at the point you are waiting for the thread to terminate (and call onLogout). So do that instead, then unconditionally take the lock in onLogout.
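An added example, not the questioner's or the answerers' code: the restructuring the last answer recommends, with std::thread standing in for the library's worker. Logout() holds the mutex only while it mutates state, releases it before waiting for the thread that will call onLogout(), and onLogout() then locks unconditionally. No flag is ever read outside the mutex, so the visibility problem the first answer describes cannot arise, and the deadlock that holding the lock across the wait would cause is impossible by construction. The class and function names are this example's own.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Added example, not the original page's code.
class Session {
    std::mutex mutex_;
    std::vector<std::string> log_;   // shared state guarded by mutex_
    std::thread worker_;

public:
    // Simulates the library: a thread that calls onLogout() before it dies.
    void start() {
        worker_ = std::thread([this] { onLogout(); });
    }

    void Logout() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            log_.push_back("logout-begin");
        }   // lock released here, BEFORE waiting on the worker
        if (worker_.joinable()) worker_.join();   // the worker runs onLogout() meanwhile
        std::lock_guard<std::mutex> lock(mutex_);
        log_.push_back("logout-end");
    }

    void onLogout() {
        std::lock_guard<std::mutex> lock(mutex_);   // unconditional; no m_flag needed
        log_.push_back("on-logout");
    }

    std::vector<std::string> history() {
        std::lock_guard<std::mutex> lock(mutex_);
        return log_;
    }
};

// Runs one logout cycle and returns the recorded events. If Logout() held
// mutex_ across join(), this would deadlock: onLogout() needs the same mutex.
std::vector<std::string> run_logout_demo() {
    Session s;
    s.start();
    s.Logout();
    return s.history();
}

// The worker may run onLogout() before or after "logout-begin" is recorded;
// what is fixed is that all three events happen and "logout-end" is last.
bool logout_sequence_is_valid() {
    std::vector<std::string> h = run_logout_demo();
    bool saw_on_logout = false;
    for (const auto& e : h) if (e == "on-logout") saw_on_logout = true;
    return h.size() == 3 && saw_on_logout && h.back() == "logout-end";
}
```

If the "do stuff" in Logout() genuinely has to exclude onLogout() up to the point of the wait, the pattern is the same: keep that work inside the first scoped block and let the block end before the join, rather than trying to tell onLogout() via an unprotected flag that the lock is already held.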