Avoid race condition using std::mutex - c++

I am working on a multi-threaded C++ project and have a question about std::mutex.
Let's assume that I have a stack.
#include <exception>
#include <memory>
#include <mutex>
#include <stack>
struct empty_stack: std::exception
{
const char* what() const throw();
};
template<typename T>
class threadsafe_stack
{
private:
std::stack<T> data;
mutable std::mutex m;
public:
threadsafe_stack(){}
threadsafe_stack(const threadsafe_stack& other)
{
std::lock_guard<std::mutex> lock(other.m);
data=other.data;
}
threadsafe_stack& operator=(const threadsafe_stack&) = delete;
void push(T new_value)
{
std::lock_guard<std::mutex> lock(m);
data.push(new_value);
}
std::shared_ptr<T> pop()
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw empty_stack();
std::shared_ptr<T> const res(std::make_shared<T>(data.top()));
data.pop();
return res;
}
void pop(T& value)
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw empty_stack();
value=data.top();
data.pop();
}
bool empty() const
{
std::lock_guard<std::mutex> lock(m);
return data.empty();
}
};
Someone said that using this stack avoids race conditions. However, I think the problem is that the mutex (mutual exclusion) only protects each individual function, not a sequence of calls. For example, I can have threads call push and pop, and those calls can still race with each other.
For example:
threadsafe_stack<int> st; // global variable for simplicity
void fun1()
{
int t = 1;
st.push(t);
t = *st.pop(); // another thread may have pushed or popped in between
//
}
void fun2()
{
int t, t2 = 2;
t = *st.pop(); // throws empty_stack if the stack is empty
// Do big things
st.push(t2);
//
}
If threads running fun1 and fun2 use the same stack (a global variable for simplicity), can this be a race condition?
The only solution I can think of is a kind of atomic transaction: instead of calling push(), pop(), and empty() directly, I call them through a single function that dispatches to them while holding only one mutex.
For example:
#define PUSH 0
#define POP 1
#define EMPTY 2
changeStack(int kindOfFunction, T* input, bool* isEmpty)
{
std::lock_guard<std::mutex> lock(m);
switch(kindOfFunction){
case PUSH:
push(input);
break;
case POP:
input = pop();
break;
case EMPTY:
isEmpty = empty();
break;
}
}
Is my solution good, or am I overthinking it and the first solution my friend suggested is good enough? Are there other solutions that avoid an "atomic transaction" like the one I suggest?

A given mutex is a single lock and can be held by a single thread at any one time.
If a thread (T1) is holding the lock on a given object in push() another thread (T2) cannot acquire it in pop() and will be blocked until T1 releases it. At that point of release T2 (or another thread also blocked by the same mutex) will be unblocked and allowed to proceed.
You do not need to do all the locking and unlocking in one member.
The point where you may still be introducing a race condition is constructs like this if they appear in consumer code:
if(!stack.empty()){
auto item=stack.pop();//Guaranteed?
}
If another thread T2 enters pop() after thread T1 enters empty() (above) and gets blocked waiting on the mutex then the pop() in T1 may fail because T2 'got there first'. Any number of actions might take place between the end of empty() and the start of pop() in that snippet unless other synchronization is handling it.
In this case you should imagine T1 & T2 literally racing to pop() though of course they may be racing to different members and still invalidate each other...
If you want to build code like that you usually have to then add further atomic member functions like try_pop() which returns (say) an empty std::shared_ptr<> if the stack is empty.
I hope this sentence isn't confusing:
Locking the object mutex inside member functions avoids race
conditions between calls to those member functions but not in
between calls to those member functions.
The best way to solve that is by adding 'composite' functions that are doing the job of more than one 'logical' operation. That tends to go against good class design in which you design a logical set of minimal operations and the consuming code combines them.
The alternative is to allow the consuming code access to the mutex. For example expose void lock() const; and void unlock() const; members. That is usually not preferred because (a) it becomes very easy for consumer code to create deadlocks and (b) you either use a recursive lock (with its overhead) or double up member functions again:
void pop(); //Self locking version...
void pop_prelocked(); //Caller must hold object mutex or program invalidated.
Whether you expose them as public or protected or not that would make try_pop() look something like this:
std::shared_ptr<T> try_pop(){
std::lock_guard<std::mutex> guard(m);
if(empty_prelocked()){
return std::shared_ptr<T>();
}
return pop_prelocked();
}
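The try_pop() fragment above relies on helpers it does not define. A minimal self-contained sketch of the same idea follows; the class name locked_stack and the bodies of empty_prelocked() and pop_prelocked() are assumptions made for illustration, not code from the question.

```cpp
#include <memory>
#include <mutex>
#include <stack>
#include <utility>

// Sketch: public members take the lock; private *_prelocked helpers
// assume the caller already holds it. locked_stack is a hypothetical name.
template <typename T>
class locked_stack {
public:
    void push(T value) {
        std::lock_guard<std::mutex> guard(m);
        data.push(std::move(value));
    }

    // Check and pop as one step under a single lock acquisition.
    // Returns an empty pointer if there was nothing to pop.
    std::shared_ptr<T> try_pop() {
        std::lock_guard<std::mutex> guard(m);
        if (empty_prelocked()) {
            return std::shared_ptr<T>();
        }
        return pop_prelocked();
    }

    bool empty() const {
        std::lock_guard<std::mutex> guard(m);
        return empty_prelocked();
    }

private:
    // Caller must already hold m.
    bool empty_prelocked() const { return data.empty(); }

    // Caller must already hold m and must have checked for emptiness.
    std::shared_ptr<T> pop_prelocked() {
        std::shared_ptr<T> result = std::make_shared<T>(std::move(data.top()));
        data.pop();
        return result;
    }

    std::stack<T> data;
    mutable std::mutex m;
};
```

Because the helpers are private, consumer code cannot call them without the lock, which avoids the risk of exposing lock() and unlock() directly.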
Adding a mutex and acquiring it at the start of each member is only the start of the story...
Footnote: Hopefully that explains mutual exclusion (mutex). There's a whole other topic around memory barriers lurking below the surface here, but if you use mutexes in this way you can treat that as an implementation detail for now...

You misunderstand something. You don't need that changeStack function.
If you forget about lock_guard, here's what it looks like (with lock_guard, the code does the same, but lock_guard makes it convenient: makes unlock automatic):
push() {
m.lock();
// do the push
m.unlock();
}
pop() {
m.lock();
// do the pop
m.unlock();
}
When push is called, mutex will be locked. Now, imagine, that on other thread, there is pop called. pop tries to lock the mutex, but it cannot lock it, because push already locked it. So it has to wait for push to unlock the mutex. When push unlocks the mutex, then pop can lock it.
So, in short, it is std::mutex which does the mutual exclusion, not the lock_guard.
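The convenience that lock_guard adds shows up most clearly when the protected code throws. The sketch below is an illustration under assumptions: demo_mutex, push_that_throws and mutex_released_after_throw are hypothetical names, and the check uses try_lock(), which the standard allows to fail spuriously even though that is rare in practice.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex demo_mutex;  // hypothetical shared mutex

// Throws while holding the lock. The lock_guard destructor still runs
// during stack unwinding and unlocks demo_mutex.
void push_that_throws() {
    std::lock_guard<std::mutex> guard(demo_mutex);
    throw std::runtime_error("push failed");
}

// Returns true if demo_mutex could be acquired again after the exception.
bool mutex_released_after_throw() {
    try {
        push_that_throws();
    } catch (const std::runtime_error&) {
        // The guard has already been destroyed at this point.
    }
    if (demo_mutex.try_lock()) {  // succeeds only if nobody holds the mutex
        demo_mutex.unlock();
        return true;
    }
    return false;
}
```

With a manual m.lock() and m.unlock() pair, the same exception would skip the unlock and leave the mutex held.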

Related

Avoiding deadlock in concurrent waiting object

I've implemented a "Ticket" class which is shared as a shared_ptr between multiple threads.
The program flow is like this:
parallelQuery() is called to start a new query job. A shared instance of Ticket is created.
The query is split into multiple tasks, each task is enqueued on a worker thread (this part is important, otherwise I'd just join threads and done). Each task gets the shared ticket.
ticket.wait() is called to wait for all tasks of the job to complete.
When one task is done it calls the done() method on the ticket.
When all tasks are done the ticket is unlocked, result data from the task aggregated and returned from parallelQuery()
In pseudo code:
std::vector<T> parallelQuery(std::string str) {
auto ticket = std::make_shared<Ticket>(2);
auto task1 = std::make_unique<Query>(ticket, str+"a");
addTaskToWorker(task1);
auto task2 = std::make_unique<Query>(ticket, str+"b");
addTaskToWorker(task2);
ticket->waitUntilDone();
auto result = aggregateData(task1, task2);
return result;
}
My code works. But I wonder if it is theoretically possible that it can lead to a deadlock in case when unlocking the mutex is executed right before it gets locked again by the waiter thread calling waitUntilDone().
Is this a possibility, and how to avoid this trap?
Here is the complete Ticket class, note the execution order example comments related to the problem description above:
#include <mutex>
#include <atomic>
class Ticket {
public:
Ticket(int numTasks = 1) : _numTasks(numTasks), _done(0), _canceled(false) {
_mutex.lock();
}
void waitUntilDone() {
_doneLock.lock();
if (_done != _numTasks) {
_doneLock.unlock(); // Execution order 1: "waiter" thread is here
_mutex.lock(); // Execution order 3: "waiter" thread is now in a deadlock?
}
else {
_doneLock.unlock();
}
}
void done() {
_doneLock.lock();
_done++;
if (_done == _numTasks) {
_mutex.unlock(); // Execution order 2: "task1" thread unlocks the mutex
}
_doneLock.unlock();
}
void cancel() {
_canceled = true;
_mutex.unlock();
}
bool wasCanceled() {
return _canceled;
}
bool isDone() {
return _done >= _numTasks;
}
int getNumTasks() {
return _numTasks;
}
private:
std::atomic<int> _numTasks;
std::atomic<int> _done;
std::atomic<bool> _canceled;
// mutex used for caller wait state
std::mutex _mutex;
// mutex used to safeguard done counter with lock condition in waitUntilDone
std::mutex _doneLock;
};
One possible solution that came to mind while editing the question is to put _done++; before _doneLock.lock(). Would that be enough?
Update
I've updated the Ticket class based on the suggestions provided by Tomer and Phil1970. Does the following implementation avoid mentioned pitfalls?
class Ticket {
public:
Ticket(int numTasks = 1) : _numTasks(numTasks), _done(0), _canceled(false) { }
void waitUntilDone() {
std::unique_lock<std::mutex> lock(_mutex);
// loop to avoid spurious wakeups
while (_done != _numTasks && !_canceled) {
_condVar.wait(lock);
}
}
void done() {
std::unique_lock<std::mutex> lock(_mutex);
// just bail out in case we call done more often than needed
if (_done == _numTasks) {
return;
}
_done++;
_condVar.notify_one();
}
void cancel() {
std::unique_lock<std::mutex> lock(_mutex);
_canceled = true;
_condVar.notify_one();
}
const bool wasCanceled() const {
return _canceled;
}
const bool isDone() const {
return _done >= _numTasks;
}
const int getNumTasks() const {
return _numTasks;
}
private:
std::atomic<int> _numTasks;
std::atomic<int> _done;
std::atomic<bool> _canceled;
std::mutex _mutex;
std::condition_variable _condVar;
};
Don't write your own wait methods but use std::condition_variable instead.
https://en.cppreference.com/w/cpp/thread/condition_variable.
Mutexes usage
Generally, a mutex should protect a given region of code. That is, the code should lock it, do its work and unlock it. In your class, you have multiple methods where some lock _mutex while others unlock it. This is very error-prone: if you call the methods in the wrong order, you might well end up in an inconsistent state. What happens if a mutex is locked twice, or unlocked when already unlocked? (For std::mutex, both are undefined behavior.)
The other thing to be aware of is that with multiple mutexes you can easily deadlock if you need to lock both but don't do it in a consistent order. Suppose that thread A locks mutex 1 first and then mutex 2, and thread B locks them in the opposite order (mutex 2 first). Something like this can occur:
Thread A locks mutex 1.
Thread B locks mutex 2.
Thread A wants to lock mutex 2 but cannot, as it is already locked.
Thread B wants to lock mutex 1 but cannot, as it is already locked.
Both threads will wait forever.
So in your code, you should at least have some checks to ensure proper usage. For example, you should verify _canceled before unlocking the mutex to ensure cancel is called only once.
Solution
I will just give some ideas.
Declare a mutex and a condition_variable to manage the done condition in your class.
std::mutex doneMutex;
std::condition_variable doneCondition;
Then waitUntilDone would look like:
void waitUntilDone()
{
std::unique_lock<std::mutex> lk(doneMutex);
doneCondition.wait(lk, [this]{ return isDone() || wasCanceled(); });
}
And done function would look like:
void done()
{
std::lock_guard<std::mutex> lk(doneMutex);
_done++;
if (_done == _numTasks)
{
doneCondition.notify_one();
}
}
And the cancel function would become:
void cancel()
{
std::lock_guard<std::mutex> lk(doneMutex);
_canceled = true;
doneCondition.notify_one();
}
As you can see, you only have one mutex now so you basically eliminate the possibility of a deadlock.
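Put together, the fragments above could look like the following self-contained sketch. It makes a few assumptions that are not in the original answer: the counters are plain members protected by the mutex rather than atomics, notify_all is used instead of notify_one so that several waiters would also be woken, and runTasks is a hypothetical driver that stands in for the worker queue.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class Ticket {
public:
    explicit Ticket(int numTasks) : _numTasks(numTasks) {}

    // Blocks until every task has reported done() or cancel() was called.
    void waitUntilDone() {
        std::unique_lock<std::mutex> lk(_doneMutex);
        _doneCondition.wait(lk, [this] { return _done >= _numTasks || _canceled; });
    }

    void done() {
        std::lock_guard<std::mutex> lk(_doneMutex);
        ++_done;
        if (_done >= _numTasks) {
            _doneCondition.notify_all();
        }
    }

    void cancel() {
        std::lock_guard<std::mutex> lk(_doneMutex);
        _canceled = true;
        _doneCondition.notify_all();
    }

    bool wasCanceled() {
        std::lock_guard<std::mutex> lk(_doneMutex);
        return _canceled;
    }

private:
    const int _numTasks;
    int _done = 0;           // protected by _doneMutex
    bool _canceled = false;  // protected by _doneMutex
    std::mutex _doneMutex;
    std::condition_variable _doneCondition;
};

// Hypothetical driver: each "task" is a thread that only calls done().
// Returns true if the ticket completed without being canceled.
bool runTasks(int numTasks) {
    Ticket ticket(numTasks);
    std::vector<std::thread> workers;
    for (int i = 0; i < numTasks; ++i) {
        workers.emplace_back([&ticket] { ticket.done(); });
    }
    ticket.waitUntilDone();
    for (auto& w : workers) {
        w.join();  // join before ticket goes out of scope
    }
    return !ticket.wasCanceled();
}
```

The predicate passed to wait() is checked before blocking, so a done() call that happens before waitUntilDone() is not lost.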
Variable naming
I suggest not using lock in the name of your mutex, since it is confusing.
std::mutex someMutex;
std::lock_guard<std::mutex> someLock(someMutex); // std::unique_lock when needed
That way, it is far easier to know which variable refers to the mutex and which one to the lock on the mutex.
Good reading
If you are serious about multithreading, then you should buy that book:
C++ Concurrency in Action
Practical Multithreading
Anthony Williams
Code Review (added section)
Essentially the same code has been posted to Code Review: https://codereview.stackexchange.com/questions/225863/multithreading-ticket-class-to-wait-for-parallel-task-completion/225901#225901.
I have put an answer there that includes some extra points.
You do not need a mutex to operate on atomic values.
UPD
My answer to the main question was wrong, so I deleted it.
You can also use a simple (non-atomic) int _numTasks;. And you do not need a shared pointer: just create the Ticket on the stack and pass a pointer
Ticket ticket(2);
auto task1 = std::make_unique<Query>(&ticket, str+"a");
addTaskToWorker(task1);
or unique ptr if you like
auto ticket = std::make_unique<Ticket>(2);
auto task1 = std::make_unique<Query>(ticket.get(), str+"a");
addTaskToWorker(task1);
because shared pointer can be cut by Occam's razor :)
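To illustrate the point that operations on atomic values need no mutex, here is a minimal sketch. The function name countWithAtomics and its parameters are hypothetical. It only covers the counting part: blocking until a counter reaches a value, as waitUntilDone() does, still needs a condition variable with its mutex or a similar waiting mechanism.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Several threads increment one std::atomic<int> without any mutex.
// Returns the final counter value.
int countWithAtomics(int numThreads, int incrementsPerThread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int j = 0; j < incrementsPerThread; ++j) {
                counter.fetch_add(1);  // atomic read-modify-write
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return counter.load();
}
```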

mutex lock synchronization between different threads

Since I have only recently started coding multi-threaded programs, this might be a stupid question. I found out about the awesome mutex and condition variable usage. As far as I can understand, their use is:
Protect sections of code/shared resources from being corrupted by access from multiple threads. By locking that portion, one can control which thread is accessing it.
If a thread is waiting for a resource/condition from another thread, one can use cond.wait() instead of polling every msec
Now Consider the following class example:
class Queue {
private:
std::queue<std::string> m_queue;
boost::mutex m_mutex;
boost::condition_variable m_cond;
bool m_exit;
public:
Queue()
: m_queue()
, m_mutex()
, m_cond()
, m_exit(false)
{}
void Enqueue(const std::string& Req)
{
boost::mutex::scoped_lock lock(m_mutex);
m_queue.push(Req);
m_cond.notify_all();
}
std::string Dequeue()
{
boost::mutex::scoped_lock lock(m_mutex);
while(m_queue.empty() && !m_exit)
{
m_cond.wait(lock);
}
if (m_queue.empty() && m_exit) return "";
std::string val = m_queue.front();
m_queue.pop();
return val;
}
void Exit()
{
boost::mutex::scoped_lock lock(m_mutex);
m_exit = true;
m_cond.notify_all();
}
};
In the above example, Exit() can be called and it will notify the threads waiting on Dequeue that it's time to exit without waiting for more data in the queue.
My question is: since Dequeue has acquired the lock (m_mutex), how can Exit acquire the same lock? Isn't it the case that Exit can acquire it only after Dequeue releases it?
I have seen this pattern in destructor implementations too: using the same class member mutex, the destructor notifies all the threads (class methods) that it's time to terminate their respective loops/functions, etc.
As Jarod mentions in the comments, the call
m_cond.wait(lock)
is guaranteed to atomically unlock the mutex, releasing it for other threads, and start listening for notifications on the condition variable.
This atomicity also ensures any code in the thread is executed after the listening is set up (so no notify calls will be missed). This assumes of course that the thread first locks the mutex, otherwise all bets are off.
Another important bit to understand is that condition variables may suffer from "spurious wakeups", so it is important to have a second boolean condition (e.g. here, you could check the emptiness of your queue) so that you don't end up awoken with an empty queue. Something like this:
m_cond.wait(lock, [this]() { return !m_queue.empty() || m_exit; });
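A self-contained sketch of the queue with the predicate form of wait() is shown below. As an assumption for the sake of a dependency-free example, it swaps boost::mutex and boost::condition_variable for their std:: counterparts; dequeueUntilExit is a hypothetical demo function, not part of the question.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class Queue {
public:
    void Enqueue(const std::string& req) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(req);
        m_cond.notify_all();
    }

    // Blocks until an item is available or Exit() was called.
    // Returns "" when woken by Exit() with nothing left to read.
    std::string Dequeue() {
        std::unique_lock<std::mutex> lock(m_mutex);
        // wait() releases m_mutex while blocked and re-acquires it
        // before returning, so Exit() and Enqueue() can lock it meanwhile.
        m_cond.wait(lock, [this] { return !m_queue.empty() || m_exit; });
        if (m_queue.empty()) {
            return "";
        }
        std::string val = m_queue.front();
        m_queue.pop();
        return val;
    }

    void Exit() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_exit = true;
        m_cond.notify_all();
    }

private:
    std::queue<std::string> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_exit = false;
};

// Hypothetical demo: a consumer calls Dequeue on an empty queue and is
// released by Exit(). Returns whatever Dequeue returned.
std::string dequeueUntilExit() {
    Queue q;
    std::string result = "unset";
    std::thread consumer([&q, &result] { result = q.Dequeue(); });
    q.Exit();
    consumer.join();
    return result;
}
```

The demo works regardless of whether Exit() runs before or after the consumer starts waiting, because the predicate is evaluated before the thread blocks.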

Deadlock simulation using std::mutex

I have following example:
template <typename T>
class container
{
public:
std::mutex _lock;
std::set<T> _elements;
void add(T element)
{
_elements.insert(element);
}
void remove(T element)
{
_elements.erase(element);
}
};
void exchange(container<int>& cont1, container<int>& cont2, int value)
{
cont1._lock.lock();
std::this_thread::sleep_for(std::chrono::seconds(1));
cont2._lock.lock();
cont1.remove(value);
cont2.add(value);
cont1._lock.unlock();
cont2._lock.unlock();
}
int main()
{
container<int> cont1, cont2;
cont1.add(1);
cont2.add(2);
std::thread t1(exchange, std::ref(cont1), std::ref(cont2), 1);
std::thread t2(exchange, std::ref(cont2), std::ref(cont1), 2);
t1.join();
t2.join();
return 0;
}
In this case I'm experiencing a deadlock. But when I use std::lock_guard instead of manually locking and unlocking mutexes I have no deadlock. Why?
void exchange(container<int>& cont1, container<int>& cont2, int value)
{
std::lock_guard<std::mutex>(cont1._lock);
std::this_thread::sleep_for(std::chrono::seconds(1));
std::lock_guard<std::mutex>(cont2._lock);
cont1.remove(value);
cont2.add(value);
}
Your two code snippets are not comparable. The second snippet locks and immediately unlocks each mutex as the temporary lock_guard object is destroyed at the semicolon:
std::lock_guard<std::mutex>(cont1._lock); // temporary object
The correct way to use lock guards is to make scoped variables of them:
{
std::lock_guard<std::mutex> lock(my_mutex);
// critical section here
} // end of critical section, "lock" is destroyed, calling mutex.unlock()
(Note that there is another common error that's similar but different:
std::mutex mu;
// ...
std::lock_guard(mu);
This declares a variable named mu (just like int(n);). However, this code is ill-formed because std::lock_guard does not have a default constructor. But it would compile with, say, std::unique_lock, and it also would not end up locking anything.)
Now to address the real problem: How do you lock multiple mutexes at once in consistent order? It may not be feasible to agree on a single lock order across an entire codebase, or even across a future user's codebase, or even in local cases as your example shows. In such cases, use the std::lock algorithm:
std::mutex mu1;
std::mutex mu2;
void f()
{
std::lock(mu1, mu2);
// order below does not matter
std::lock_guard<std::mutex> lock1(mu1, std::adopt_lock);
std::lock_guard<std::mutex> lock2(mu2, std::adopt_lock);
}
In C++17 there is a new variadic lock guard template called scoped_lock:
void f_17()
{
std::scoped_lock lock(mu1, mu2);
// ...
}
The constructor of scoped_lock uses the same algorithm as std::lock, so the two can be used compatibly.
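Applied to the exchange function from the question, the std::lock approach could look like the sketch below. It sticks to the C++11 form (std::lock plus std::adopt_lock) so that it does not depend on C++17; the aliasing check and the runExchange driver are additions made for illustration.

```cpp
#include <functional>
#include <mutex>
#include <set>
#include <thread>

template <typename T>
class container {
public:
    std::mutex _lock;
    std::set<T> _elements;
    void add(T element) { _elements.insert(element); }
    void remove(T element) { _elements.erase(element); }
};

void exchange(container<int>& cont1, container<int>& cont2, int value) {
    if (&cont1 == &cont2) {
        return;  // locking the same std::mutex twice would be undefined
    }
    // std::lock acquires both mutexes without deadlocking, whatever
    // order the arguments are given in.
    std::lock(cont1._lock, cont2._lock);
    // The guards adopt the already-held locks and release them on scope exit.
    std::lock_guard<std::mutex> lock1(cont1._lock, std::adopt_lock);
    std::lock_guard<std::mutex> lock2(cont2._lock, std::adopt_lock);
    cont1.remove(value);
    cont2.add(value);
}

// Hypothetical driver: two threads exchange in opposite directions.
// Returns true if both values ended up in the other container.
bool runExchange() {
    container<int> cont1, cont2;
    cont1.add(1);
    cont2.add(2);
    std::thread t1(exchange, std::ref(cont1), std::ref(cont2), 1);
    std::thread t2(exchange, std::ref(cont2), std::ref(cont1), 2);
    t1.join();
    t2.join();
    return cont1._elements == std::set<int>{2} &&
           cont2._elements == std::set<int>{1};
}
```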
While Kerrek SB's answer is entirely valid I thought I'd throw an alternative hat in the ring. std::lock or any try-and-retreat deadlock avoidance strategies should be seen as the last resort from a performance perspective.
How about:
#include <functional> //includes std::less<T> template.
static const std::less<void*> l;//comparison object. See note.
void exchange(container<int>& cont1, container<int>& cont2, int value)
{
if(&cont1==&cont2) {
return; //aliasing protection.
}
std::unique_lock<std::mutex> lock1(cont1._lock, std::defer_lock);
std::unique_lock<std::mutex> lock2(cont2._lock, std::defer_lock);
if(l(&cont1,&cont2)){ // in effect a portable &cont1<&cont2
lock1.lock();
std::this_thread::sleep_for(std::chrono::seconds(1));
lock2.lock();
}else{
lock2.lock();
std::this_thread::sleep_for(std::chrono::seconds(1));
lock1.lock();
}
cont1.remove(value);
cont2.add(value);
}
This code uses the memory address of the objects to determine an arbitrary but consistent lock order. This approach can (of course) be generalized.
Note also that in reusable code the aliasing protection is necessary because the version where cont1 is cont2 would be invalid by trying to lock the same lock twice. std::mutex cannot be assumed to be a recursive lock and normally isn't.
NB: The use of std::less<void*> ensures compliance, as it guarantees a consistent total ordering of addresses. Technically (&cont1<&cont2) is unspecified behavior. Thanks Kerrek SB!

Why C++ concurrency in action listing_6.1 does not use std::recursive_mutex

I am reading the book "C++ Concurrency In Action" and have some question about the mutex used in listing 6.1, the code snippet is below:
void pop(T& value)
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw empty_stack();
value=std::move(data.top());
data.pop();
}
bool empty() const
{
std::lock_guard<std::mutex> lock(m);
return data.empty();
}
The pop method locks the mutex and then calls the empty method. But the mutex is not a recursive_mutex, and the code works properly. So I wonder what the actual difference is between std::mutex and std::recursive_mutex.
It is calling data.empty() which seems like a function from a data member. Not the same as the empty function you show.
If it were, this would be a recursive call
bool empty() const
{
std::lock_guard<std::mutex> lock(m);
return data.empty();
}
and nothing would work.
Well, recursive_mutex is for... recursive functions!
On some operating systems, locking the same mutex twice can lead to a system error (the lock may be released completely, the application may crash, and all kinds of weird and undefined behaviour may occur).
look at this (silly example)
void recursivePusher(int x){
if (x>10){
return;
}
std::lock_guard<std::mutex> lock(m);
queue.push(x);
recursivePusher(x+1);
}
This function recursively increments x and pushes it into some shared queue.
As we said above, the same lock may not be locked twice by the same thread, but we do need to make sure the shared queue isn't being altered by multiple threads.
One easy solution is to move the locking outside the recursive function, but what happens if we can't do that? What happens if the function called is the only one that can lock the shared resource?
for example, my calling function may look like this:
switch(option){
case case1: recursivly_manipulate_shared_array(); break;
case case2: recursivly_manipulate_shared_queue(); break;
case case3: recursivly_manipulate_shared_map(); break;
}
Of course, you wouldn't lock all three (shared_array, shared_map, shared_queue) when only one of them will be altered.
The solution is to use std::recursive_mutex:
void recursivePusher(int x){
if (x>10){
return;
}
std::lock_guard<std::recursive_mutex> lock(m);
queue.push(x);
recursivePusher(x+1);
}
If the same thread doesn't need to lock the mutex recursively, it should use a regular std::mutex, like in your example.
PS. In your snippet, empty is not the same as T::empty.
Calling data.empty() doesn't call empty recursively.
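For the case where the locking can be moved outside the recursion, which the answer mentions as the easy solution, a plain std::mutex is enough. The sketch below assumes hypothetical names (queueMutex, sharedQueue, recursivePusherUnlocked, pushZeroToTen): a public wrapper locks once and a helper recurses without locking.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>

std::mutex queueMutex;        // hypothetical mutex guarding sharedQueue
std::queue<int> sharedQueue;  // hypothetical shared queue

// Caller must already hold queueMutex.
void recursivePusherUnlocked(int x) {
    if (x > 10) {
        return;
    }
    sharedQueue.push(x);
    recursivePusherUnlocked(x + 1);
}

// Locks exactly once, then recurses through the unlocked helper,
// so a non-recursive std::mutex is never locked twice by one thread.
void recursivePusher(int x) {
    std::lock_guard<std::mutex> lock(queueMutex);
    recursivePusherUnlocked(x);
}

// Pushes 0..10 and returns the resulting queue size.
std::size_t pushZeroToTen() {
    recursivePusher(0);
    std::lock_guard<std::mutex> lock(queueMutex);
    return sharedQueue.size();
}
```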

Why does the author claim that this code leads to race?

Why does author think that below part of source code leads to race?
Author says:
This design is subject to race conditions between calls to empty, front and pop if there is more than one thread removing items from the queue, but in a single-consumer system (as being discussed here), this is not a problem.
Here is the code:
template<typename Data>
class concurrent_queue
{
private:
std::queue<Data> the_queue;
mutable boost::mutex the_mutex;
public:
void push(const Data& data)
{
boost::mutex::scoped_lock lock(the_mutex);
the_queue.push(data);
}
bool empty() const
{
boost::mutex::scoped_lock lock(the_mutex);
return the_queue.empty();
}
Data& front()
{
boost::mutex::scoped_lock lock(the_mutex);
return the_queue.front();
}
Data const& front() const
{
boost::mutex::scoped_lock lock(the_mutex);
return the_queue.front();
}
void pop()
{
boost::mutex::scoped_lock lock(the_mutex);
the_queue.pop();
}
};
When you call empty you check whether it is safe to pop an element. In a threaded system, after you have checked that the queue is not empty, another thread could already have popped the last element, so it is no longer guaranteed that the queue is not empty.
thread A:                             thread B:
if(!queue.empty())
                                      if(!queue.empty())
                                      queue.pop();
-> it is no longer certain that
   the queue isn't empty
If you have more than one thread "consuming" data from the queue, it can lead to a race condition in a particularly bad way. Take the following pseudo code:
class consumer
{
void do_work()
{
if(!work_.empty())
{
type& t = work_.front();
work_.pop();
// do some work with t
t...
}
}
concurrent_queue<type> work_;
};
This looks simple enough, but what if you have multiple consumer objects, and there is only one item in the concurrent_queue. If the consumer is interrupted after calling empty(), but before calling pop(), then potentially multiple consumers will try to work on the same object.
A more appropriate implementation would perform the empty checking and popping in a single operation exposed in the interface, like this:
template<typename Data>
class concurrent_queue
{
private:
std::queue<Data> the_queue;
mutable boost::mutex the_mutex;
public:
void push(const Data& data)
{
boost::mutex::scoped_lock lock(the_mutex);
the_queue.push(data);
}
bool pop(Data& popped)
{
boost::mutex::scoped_lock lock(the_mutex);
if(!the_queue.empty())
{
popped = the_queue.front();
the_queue.pop();
return true;
}
return false;
}
};
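A usage sketch for the combined pop() with several consumers follows. It is an illustration under assumptions: std::mutex replaces boost::mutex to keep the example dependency-free, and drainWithConsumers is a hypothetical driver that counts how many items the consumers obtained in total.

```cpp
#include <atomic>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

template <typename Data>
class concurrent_queue {
private:
    std::queue<Data> the_queue;
    mutable std::mutex the_mutex;

public:
    void push(const Data& data) {
        std::lock_guard<std::mutex> lock(the_mutex);
        the_queue.push(data);
    }

    // Emptiness check and removal happen under one lock acquisition.
    bool pop(Data& popped) {
        std::lock_guard<std::mutex> lock(the_mutex);
        if (the_queue.empty()) {
            return false;
        }
        popped = the_queue.front();
        the_queue.pop();
        return true;
    }
};

// Hypothetical driver: several consumers drain one pre-filled queue.
// Returns the total number of successful pops across all consumers.
int drainWithConsumers(int numItems, int numConsumers) {
    concurrent_queue<int> queue;
    for (int i = 0; i < numItems; ++i) {
        queue.push(i);
    }
    std::atomic<int> popped{0};
    std::vector<std::thread> consumers;
    for (int c = 0; c < numConsumers; ++c) {
        consumers.emplace_back([&queue, &popped] {
            int item;
            while (queue.pop(item)) {  // stops once the queue is empty
                popped.fetch_add(1);
            }
        });
    }
    for (auto& t : consumers) {
        t.join();
    }
    return popped.load();
}
```

Each item is handed to exactly one consumer, so the total matches the number of items pushed.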
Because you could do this...
if (!your_concurrent_queue.empty())
your_concurrent_queue.pop();
...and still have a failure on pop if another thread called pop "in between" these two lines.
(Whether this will actually happen in practice, depends on timing of execution of concurrent threads - in essence threads "race" and who wins this race determines whether the bug will manifest itself or not, which is essentially random on modern preemptive OSes. This randomness can make race conditions very hard to diagnose and repair.)
Whenever clients do "meta-operations" like these (where there is a sequence of several calls accomplishing the desired effect), it's impossible to protect against race conditions by in-method locking alone.
And since the clients have to perform their own locking anyway, you can even consider abandoning the in-method locking, for performance reasons. Just be sure this is clearly documented so the clients know that you are not making any promises regarding thread-safety.
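A minimal sketch of that client-side locking, under assumptions: shared_work and take_one are hypothetical names, and the container itself makes no thread-safety promises, so every client has to hold the same mutex for the whole empty/front/pop sequence.

```cpp
#include <mutex>
#include <queue>

// A plain queue plus the mutex that all clients agree to hold
// whenever they touch it. The queue itself does no locking.
struct shared_work {
    std::queue<int> items;
    std::mutex mutex;
};

// Performs the whole check/front/pop sequence under one lock.
// Returns true and stores the front item if one was available.
bool take_one(shared_work& work, int& out) {
    std::lock_guard<std::mutex> lock(work.mutex);
    if (work.items.empty()) {
        return false;
    }
    out = work.items.front();
    work.items.pop();
    return true;
}
```

The cost of this design is that nothing stops a client from forgetting the lock, which is why the convention has to be documented.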
I think what's confused you is that in the code you posted, there is nothing that causes a race condition. The race condition would be caused by the threads actually CALLING this code. Imagine that thread 1 checks to see if the queue is not empty. Then that thread goes to sleep for a year. One year later when it wakes up, is it still valid for that thread to assume the queue is still not empty? Well, no, in the meantime, another thread could have easily come along and called pop.