Can atomic read operations lead to deadlocks? (C++)

I am watching this Herb Sutter talk about atomics, mutexes and memory barriers, and I have a question about it. Starting at 47:33 Herb explains how mutexes and atomics are related to memory ordering. At 49:12 he says that with the default memory order, memory_order_seq_cst, an atomic load() operation is the equivalent of locking a mutex, and an atomic store() operation is the equivalent of unlocking a mutex. At 53:15 Herb was asked what would happen if the atomic load() operation in his example was not followed by a subsequent store() operation, and his answer really confused me:
54:00 - If you don't write this release(mutex.unlock()) or this one(atomic.store()), that person will never get a chance to run...
Maybe I completely misunderstood him because my English is not very good, but this is how I interpret his words:
If a thread A reads an atomic variable without subsequently writing to it, no other thread will be able to work with this variable: it stays deadlocked by thread A, as if a mutex had been locked by that thread and never unlocked.
But is that really what he meant? It doesn't seem to be true. In my understanding, the release happens automatically right after an atomic variable is either loaded or stored. Just an example:
#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> number{0};

void foo()
{
    while (number != 104) {} // spin until another thread stores 104
    std::cout << "Number:\t" << number << '\n';
}

int main()
{
    std::thread thr1(foo);
    std::thread thr2(foo);
    std::thread thr3(foo);
    std::thread thr4(foo);
    number = 104;
    thr1.join();
    thr2.join();
    thr3.join();
    thr4.join();
}
In the above example there are 4 threads successfully reading the same atomic variable, and no writes are needed in these threads to release the variable for the others. Apparently, atomic.load() != mutex.lock(), just as atomic.store() != mutex.unlock(). They may behave the same way in terms of memory barriers, but they are not the same thing, are they?
Could you please explain what Herb actually meant by saying they are equivalent?

There's a misunderstanding here. An atomic read, regardless of memory order, is not "equivalent to locking a mutex". In terms of visibility it might have the same effects, but a mutex is much heavier.
Here's a typical problem with a mutex:
std::mutex mtx1;
std::mutex mtx2;

void thr1() {
    mtx1.lock();
    mtx2.lock();
    mtx2.unlock();
    mtx1.unlock();
}

void thr2() {
    mtx2.lock();
    mtx1.lock();
    mtx1.unlock();
    mtx2.unlock();
}
Note that the two functions lock the two mutexes in opposite orders. So it's possible that thr1 locks mtx1, then thr2 locks mtx2, then thr1 tries to lock mtx2 and thr2 tries to lock mtx1. That's a deadlock: neither thread can make progress, because the resource it needs is held by the other thread.
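One common way to avoid this, sketched below, is to let the library acquire both mutexes together. std::scoped_lock (C++17) uses a deadlock-avoidance algorithm, so the order in which the mutexes are named no longer matters (the _fixed names are mine; on pre-C++17 compilers, std::lock plus two lock guards achieves the same):

#include <mutex>

std::mutex mtx1;
std::mutex mtx2;

void thr1_fixed() {
    std::scoped_lock lock(mtx1, mtx2); // both acquired together, no deadlock
    // ... critical section ...
}

void thr2_fixed() {
    std::scoped_lock lock(mtx2, mtx1); // safe even though the order differs
    // ... critical section ...
}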
Atomics don't have that problem, because you can't run code "inside" an atomic access. You can't get that sort of resource conflict.
The issue that seems to underlie that discussion is the possibility that the thread running while (number != 104) {} won't see the updated value of number, so the code becomes an infinite loop. That's not a deadlock. Which isn't to say that it's not a problem, but the problem is one of visibility.
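To make the visibility point concrete, here is a minimal sketch of the release/acquire pairing Herb is describing (the names producer and consumer are mine): the store publishes a value, the load observes it, and at no point does either thread "hold" anything that could deadlock:

#include <atomic>
#include <cassert>
#include <thread>

int data = 0;                   // plain, non-atomic payload
std::atomic<bool> ready{false}; // the atomic plays the role of the unlock/lock pair

void producer()
{
    data = 42;                                    // happens-before the store
    ready.store(true, std::memory_order_release); // "unlock": publish
}

void consumer()
{
    while (!ready.load(std::memory_order_acquire)) {} // "lock": wait to observe
    assert(data == 42); // guaranteed visible after the acquire load
}

int main()
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join();
    t2.join();
}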


Why doesn't mutex work without lock guard?

I have the following code:
#include <chrono>
#include <ctime>   // for ctime()
#include <iostream>
#include <mutex>
#include <thread>

int shared_var {0};
std::mutex shared_mutex;

void task_1()
{
    while (true)
    {
        shared_mutex.lock();
        const auto temp = shared_var;
        std::this_thread::sleep_for(std::chrono::seconds(1));
        if (temp == shared_var)
        {
            // do something
        }
        else
        {
            const auto timenow = std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
            std::cout << ctime(&timenow) << ": Data race at task_1: shared resource corrupt \n";
            std::cout << "Actual value: " << shared_var << " Expected value: " << temp << "\n";
        }
        shared_mutex.unlock();
    }
}

void task_2()
{
    while (true)
    {
        std::this_thread::sleep_for(std::chrono::seconds(2));
        ++shared_var;
    }
}

int main()
{
    auto task_1_thread = std::thread(task_1);
    auto task_2_thread = std::thread(task_2);
    task_1_thread.join();
    task_2_thread.join();
    return 0;
}
shared_var is protected in task_1 but not in task_2.
What is expected:
I was expecting that the else branch in task_1 is never entered, since the shared resource is locked.
What actually happens:
Running this code enters the else branch in task_1.
The expected outcome is obtained when replacing shared_mutex.lock(); with std::lock_guard<std::mutex> lock(shared_mutex); and shared_mutex.unlock(); with std::lock_guard<std::mutex> unlock(shared_mutex);
Questions:
What is the problem in my current approach?
Why does it work with lock_guard?
I am running the code on:
https://www.onlinegdb.com/online_c++_compiler
Suppose you have a room with two entrances. One entrance has a door, the other doesn't. The room is called shared_var. There are two guys who want to enter the room; they are called task_1 and task_2.
You now want to make sure somehow that only one of them is inside the room at any time.
task_2 can enter the room freely through the entrance without a door. task_1 uses the door called shared_mutex.
Your question is now: can I ensure that only one guy is in the room by adding a lock to the door at the first entrance?
Obviously not, because the second entrance can still be used without you having any control over it.
If you experiment, you might observe that without the lock you sometimes find both guys in the room, while after adding the lock you don't. That is pure luck (bad luck actually, because it makes you believe that the lock helped). In fact the lock did not change much: the guy called task_2 can still enter the room while the other guy is inside.
The solution would be to make both go through the same door. They lock the door when going inside and unlock it when leaving the room. Putting an automatic lock on the door can be nice, because then the guys cannot forget to unlock the door when they leave.
Oh sorry, I got lost telling a story.
TL;DR: In your code it does not matter whether you use the lock or not. Actually, the mutex in your code is useless, because only one thread ever locks/unlocks it. To use the mutex properly, both threads need to lock it before reading/writing the shared memory.
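A minimal corrected sketch (task_2_fixed is my name for it), assuming everything else stays as in the question: task_2 must lock the same mutex before touching shared_var:

void task_2_fixed()
{
    while (true)
    {
        std::this_thread::sleep_for(std::chrono::seconds(2));
        std::lock_guard<std::mutex> lock(shared_mutex); // same mutex as task_1
        ++shared_var;                                   // now synchronized
    } // the lock is released at the end of each iteration
}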
With UB (such as a data race), the output is undetermined; you might see the "expected" output, strange values, a crash, ...
What is the problem in my current approach?
In the first sample, you have a data race, as you write the (non-atomic) shared_var in one thread without synchronization and read it in another.
Why does it work with lock_guard?
In the modified sample, you lock the same (non-recursive) mutex twice, which is also UB.
From std::mutex::lock:
If lock is called by a thread that already owns the mutex, the behavior is undefined
You just have two different behaviours for two different instances of UB (where anything can happen in both cases).
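In other words, the "fixed" version effectively does this (a sketch of the mistake; task_1_broken is my name for it):

void task_1_broken()
{
    std::lock_guard<std::mutex> lock(shared_mutex);   // locks shared_mutex
    // ... read and compare shared_var ...
    std::lock_guard<std::mutex> unlock(shared_mutex); // locks it AGAIN: UB
}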
A mutex lock does not lock a variable, it just locks the mutex so that other code cannot lock the same mutex at the same time.
In other words, all accesses to a shared variable need to be wrapped in a mutex lock on the same mutex to avoid multiple simultaneous accesses to the same variable, it's not in any way automatic just because the variable is wrapped in a mutex lock in another place in the code.
You're not locking the mutex at all in task2, so there is a race condition.
The reason it seems to work when you wrap the mutex in a std::lock_guard is that the lock guard holds the mutex lock until the end of the scope, which in this case is the end of the function.
Your function first locks the mutex with the lock lock_guard, then later in the same scope tries to lock the same mutex with the unlock lock_guard. Since the mutex is already locked by the lock lock_guard, execution stops and there is no output because the program is, in effect, no longer running.
If you output "ok" in your code at the point of the "//do something" comment, you'll see that you get the output once and then the program stops all output.
Note: as for whether this behaviour is guaranteed, see Jarod42's answer for much better info. As with most unexpected behaviour in C++, there is probably UB involved.

What does unique_lock mean when a single thread acquires 2 unique_locks of the same mutex?

I have the following code, which is from https://en.cppreference.com/w/cpp/thread/unique_lock. However, upon printing the output, I see some unexpected results and would like an explanation.
The code is:
#include <mutex>
#include <thread>
#include <chrono>
#include <iostream>

struct Box {
    explicit Box(int num) : num_things{num} {}
    int num_things;
    std::mutex m;
};

void transfer(Box &from, Box &to, int anotherNumber)
{
    // don't actually take the locks yet
    std::unique_lock<std::mutex> lock1(from.m, std::defer_lock);
    std::unique_lock<std::mutex> lock2(to.m, std::defer_lock);
    // lock both unique_locks without deadlock
    std::lock(lock1, lock2);
    from.num_things += anotherNumber;
    to.num_things += anotherNumber;
    std::cout << std::this_thread::get_id() << " " << from.num_things << "\n";
    std::cout << std::this_thread::get_id() << " " << to.num_things << "\n";
    // 'from.m' and 'to.m' mutexes unlocked in 'unique_lock' dtors
}

int main()
{
    Box acc1(100); // initialized acc1.num_things = 100
    Box acc2(50);  // initialized acc2.num_things = 50
    std::thread t1(transfer, std::ref(acc1), std::ref(acc2), 10);
    std::thread t2(transfer, std::ref(acc2), std::ref(acc1), 5);
    t1.join();
    t2.join();
}
My expectation:
acc1 will be initialized with num_things=100, and acc2 with num_things=50.
say thread t1 runs first: it acquires both mutexes via the two locks. Once the locks are held, it can add anotherNumber = 10 to num_things.
upon completion, it will print from.num_things = 110 and to.num_things = 60, in that order: "from" first, then "to".
thread t1 finishes the critical section of the code, and each unique_lock wrapper calls its destructor, which unlocks its mutex.
Here is what I don't understand.
I expected that lock1 will be unlocked first, and lock2 later.
Thread t2 then acquires the mutexes in the same order and locks lock1 first, then lock2. It will also run the critical code sequentially up to the cout.
Thread t2 will take the global acc1.num_things = 110 and acc2.num_things = 60 from t1.
I expect that t2 will print from.num_things = 115 first, then to.num_things = 65.
However, over countless trials, I always get the reverse order. And that is my confusion.
I expected that lock1 will be unlocked first, and lock2 later.
No, the reverse is true. In your function lock1 gets constructed first, then lock2. Therefore, when the function returns lock2 gets destroyed first, then lock1, so lock2's destructor releases its lock before lock1's destructor.
The actual order in which std::lock manages to acquire the multiple locks has no bearing on how the locks get destroyed and release ownership of their respective mutexes. That still follows the normal C++ rules.
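A sketch of that rule (m1 and m2 are stand-ins for from.m and to.m):

std::mutex m1, m2; // stand-ins for from.m and to.m

void f()
{
    std::unique_lock<std::mutex> lock1(m1, std::defer_lock); // constructed first
    std::unique_lock<std::mutex> lock2(m2, std::defer_lock); // constructed second
    std::lock(lock1, lock2); // acquisition order is chosen by std::lock
} // lock2 is destroyed first (releases m2), then lock1 (releases m1)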
say thread t1 runs first,
You have no guarantee of that whatsoever. In the above code it's entirely possible that t2 will enter the function first and acquire the locks on the mutexes. And it is also entirely possible that each time you run this program you'll get different results, with t1 and t2 winning the race randomly.
Without getting into technical mumbo-jumbo, the only thing that C++ guarantees you is that std::thread gets fully constructed before the thread function gets invoked in a new execution thread. You have no guarantees whatsoever that, when creating two execution threads one after another, the first one will call its function and run some arbitrary part of the thread function before the second execution thread does the same.
So it's entirely possible that t2 will get first dibs on the locks occasionally. Or always. Controlling the relative sequence of events across execution threads is much harder than you might think.
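If the order actually matters to you, you have to impose it yourself. A crude sketch (the flag t1_done is mine, and the busy-wait is purely illustrative; a condition variable would be better in real code), replacing the thread creation in main:

std::atomic<bool> t1_done{false}; // needs #include <atomic>

std::thread t1([&acc1, &acc2] { transfer(acc1, acc2, 10); t1_done = true; });
std::thread t2([&acc2, &acc1] { while (!t1_done) {} transfer(acc2, acc1, 5); });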

Difference between shared mutex and mutex (why do both exist in C++ 11)?

I haven't found an example online that demonstrates this vividly. I saw an example at http://en.cppreference.com/w/cpp/header/shared_mutex but it is still unclear. Can somebody help?
With normal mutexes, you can guarantee exclusive access to some kind of critical resource, and nothing else. Shared mutexes extend this by allowing two levels of access, shared and exclusive, as follows:
Exclusive access prevents any other thread from acquiring the mutex, just as with the normal mutex. It does not matter if the other thread tries to acquire shared or exclusive access.
Shared access allows multiple threads to acquire the mutex, but all of them only in shared mode. Exclusive access is not granted until all of the previous shared holders have returned the mutex (typically, as long as an exclusive request is waiting, new shared ones are queued to be granted after the exclusive access).
A typical scenario is a database: it does not matter if several threads read the same data simultaneously. But modification of the database is critical: if some thread reads data while another one is writing, it might receive inconsistent data. So all reads must have finished before writing is allowed, and new reads must wait until the writing has finished. After the write, further reads can again occur simultaneously.
Edit: Sidenote:
Why do readers need a lock?
This is to prevent the writer from acquiring the lock while reading is still in progress. Additionally, it prevents new readers from acquiring the lock while it is held exclusively.
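A minimal sketch of the two levels (the names rw_mutex, reader, and writer are mine):

#include <mutex>
#include <shared_mutex>

std::shared_mutex rw_mutex; // protects some shared data

void reader()
{
    std::shared_lock<std::shared_mutex> lock(rw_mutex); // shared: many readers at once
    // ... read the shared data ...
}

void writer()
{
    std::unique_lock<std::shared_mutex> lock(rw_mutex); // exclusive: blocks everyone else
    // ... modify the shared data ...
}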
A shared mutex has two levels of access: 'shared' and 'exclusive'.
Multiple threads can acquire shared access, but only one can hold 'exclusive' access (which includes there being no shared access).
The common scenario is a read/write lock. Recall that a data race can only occur when two threads access the same data and at least one of the accesses is a write.
The advantage is that data may be read by many readers at once, but when a writer needs access it must obtain exclusive access to the data.
Why have both? On the one hand, the exclusive lock constitutes a normal mutex, so arguably only the shared one is needed. But there may be overheads in a shared-lock implementation that can be avoided by using the less featured type.
Here's an example (adapted slightly from the example here http://en.cppreference.com/w/cpp/thread/shared_mutex).
#include <chrono>
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>

std::mutex cout_mutex; // Not really part of the example...

void log(const std::string& msg) {
    std::lock_guard guard(cout_mutex);
    std::cout << msg << std::endl;
}

class ThreadSafeCounter {
public:
    ThreadSafeCounter() = default;

    // Multiple threads/readers can read the counter's value at the same time.
    unsigned int get() const {
        std::shared_lock lock(mutex_); // NB: std::shared_lock will shared_lock() the mutex.
        log("get()-begin");
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
        auto result = value_;
        log("get()-end");
        return result;
    }

    // Only one thread/writer can increment/write the counter's value.
    void increment() {
        std::unique_lock lock(mutex_);
        value_++;
    }

    // Only one thread/writer can reset/write the counter's value.
    void reset() {
        std::unique_lock lock(mutex_);
        value_ = 0;
    }

private:
    mutable std::shared_mutex mutex_;
    unsigned int value_ = 0;
};

int main() {
    ThreadSafeCounter counter;
    auto increment_and_print = [&counter]() {
        for (int i = 0; i < 3; i++) {
            counter.increment();
            auto ctr = counter.get();
            {
                std::lock_guard guard(cout_mutex);
                std::cout << std::this_thread::get_id() << ' ' << ctr << '\n';
            }
        }
    };
    std::thread thread1(increment_and_print);
    std::thread thread2(increment_and_print);
    std::thread thread3(increment_and_print);
    thread1.join();
    thread2.join();
    thread3.join();
}
Possible partial output:
get()-begin
get()-begin
get()-end
140361363867392 2
get()-end
140361372260096 2
get()-begin
get()-end
140361355474688 3
//Etc...
Notice how the two consecutive get()-begin lines show that two threads are holding the shared lock during the read at the same time.
"Shared mutexes are usually used in situations when multiple readers can access the same resource at the same time without causing data races, but only one writer can do so."
cppreference.com
This is useful when you need a readers/writer lock: https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock

std::lock_guard example, explanation on why it works

I've reached a point in my project that requires communication between threads on resources that may very well be written to, so synchronization is a must. However, I don't really understand synchronization beyond the basic level.
Consider the last example in this link: http://www.bogotobogo.com/cplusplus/C11/7_C11_Thread_Sharing_Memory.php
#include <iostream>
#include <thread>
#include <list>
#include <algorithm>
#include <mutex>

using namespace std;

// a global variable
std::list<int> myList;

// a global instance of std::mutex to protect the global variable
std::mutex myMutex;

void addToList(int max, int interval)
{
    // the access to this function is mutually exclusive
    std::lock_guard<std::mutex> guard(myMutex);
    for (int i = 0; i < max; i++) {
        if ((i % interval) == 0) myList.push_back(i);
    }
}

void printList()
{
    // the access to this function is mutually exclusive
    std::lock_guard<std::mutex> guard(myMutex);
    for (auto itr = myList.begin(), end_itr = myList.end(); itr != end_itr; ++itr) {
        cout << *itr << ",";
    }
}

int main()
{
    int max = 100;
    std::thread t1(addToList, max, 1);
    std::thread t2(addToList, max, 10);
    std::thread t3(printList);
    t1.join();
    t2.join();
    t3.join();
    return 0;
}
The example demonstrates how three threads, two writers and one reader, access a common resource (a list).
Two global functions are used: one used by the two writer threads, and one used by the reader thread. Both functions use a lock_guard to lock down the same resource, the list.
Now here is what I just can't wrap my head around: the reader uses a lock in a different scope than the two writer threads, yet still locks down the same resource. How can this work? My limited understanding of mutexes lends itself well to the writer function: there you have two threads using the exact same function. I can understand that; a check is made right as you are about to enter the protected area, and if someone else is already inside, you wait.
But when the scope is different? This would indicate that there is some sort of mechanism more powerful than the process itself, some sort of runtime environment blocking execution of the "late" thread. But I thought there were no such things in C++. So I am at a loss.
What exactly goes on under the hood here?
Let’s have a look at the relevant line:
std::lock_guard<std::mutex> guard(myMutex);
Notice that the lock_guard references the global mutex myMutex. That is, the same mutex for all three threads. What lock_guard does is essentially this:
Upon construction, it locks myMutex and keeps a reference to it.
Upon destruction (i.e. when the guard's scope is left), it unlocks myMutex.
The mutex is always the same one; it has nothing to do with the scope. The point of lock_guard is just to make locking and unlocking the mutex easier for you. For example, if you lock/unlock manually but your function throws an exception somewhere in the middle, it will never reach the unlock statement. So, doing it the manual way, you have to make sure that the mutex is always unlocked. On the other hand, the lock_guard object gets destroyed automatically whenever the function is exited, regardless of how it is exited.
myMutex is global, and it is what protects myList. guard(myMutex) simply engages the lock, and exiting the block causes its destruction, disengaging the lock. guard is just a convenient way to engage and disengage the lock.
With that out of the way, a mutex does not protect any data. It just provides a way to protect data. It is the design pattern that protects data. So if I write my own function that modifies the list, as below, the mutex cannot protect it.
void addToListUnsafe(int max, int interval)
{
    for (int i = 0; i < max; i++) {
        if ((i % interval) == 0) myList.push_back(i);
    }
}
The lock only works if all pieces of code that need to access the data engage the lock before accessing it and disengage it after they are done. This design pattern of engaging and disengaging the lock before and after every access is what protects the data (myList in your case).
Now you might wonder why use a mutex at all, and why not, say, a bool? Yes, you can, but you will have to make sure that the bool variable exhibits certain characteristics, including but not limited to the list below (a minimal sketch of such a lock is given below).
Not be cached (volatile) across multiple threads.
Read and write will be atomic operation.
Your lock can handle situations where there are multiple execution pipelines (logical cores, etc.).
There are different synchronization mechanisms that provide "better locking" (across processes versus across threads, multiple processors versus a single processor, etc.) at the cost of "slower performance", so you should always choose a locking mechanism that is just about enough for your situation.
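For illustration, here is a minimal spinlock sketch built on std::atomic_flag, which satisfies the checklist above (illustrative only; a real std::mutex is almost always the better choice):

#include <atomic>

class spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock()   { while (flag.test_and_set(std::memory_order_acquire)) {} } // spin until acquired
    void unlock() { flag.clear(std::memory_order_release); } // release the flag
};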
Just to add onto what others here have said...
There is an idea in C++ called Resource Acquisition Is Initialization (RAII), which binds resources to the lifetime of objects:
Resource Acquisition Is Initialization or RAII, is a C++ programming technique which binds the life cycle of a resource that must be acquired before use (allocated heap memory, thread of execution, open socket, open file, locked mutex, disk space, database connection—anything that exists in limited supply) to the lifetime of an object.
C++ RAII Info
The use of a std::lock_guard<std::mutex> class follows the RAII idea.
Why is this useful?
Consider a case where you don't use a std::lock_guard:
std::mutex m; // global mutex

void oops() {
    m.lock();
    doSomething();
    m.unlock();
}
In this case, a global mutex is used and is locked before the call to doSomething(). Once doSomething() is complete, the mutex is unlocked.
One problem here: what happens if there is an exception? You run the risk of never reaching the m.unlock() line, which releases the mutex for other threads.
So you need to cover the case where you run into an exception:
std::mutex m; // global mutex

void oops() {
    try {
        m.lock();
        doSomething();
        m.unlock();
    } catch (...) {
        m.unlock(); // now the exception path is covered
        // throw ...
    }
}
This works but is ugly, verbose, and inconvenient.
Now let's write our own simple lock guard.
class lock_guard {
private:
    std::mutex& m;
public:
    lock_guard(std::mutex& m_) : m(m_) { m.lock(); } // lock on construction
    ~lock_guard() { m.unlock(); }                    // unlock on destruction
};
When the lock_guard object is destroyed, it will ensure that the mutex is unlocked.
Now we can use this lock_guard to handle the case from before in a better/cleaner way:
std::mutex m; // global mutex

void ok() {
    lock_guard lk(m); // our simple lock guard, protects against the exception case
    doSomething();
} // when the scope is exited, our lock guard object is destroyed and the mutex unlocked
This is the same idea behind std::lock_guard.
Again this approach is used with many different types of resources which you can read more about by following the link on RAII.
This is precisely what a lock does. When a thread takes the lock, regardless of where in the code it does so, it must wait its turn if another thread holds the lock. When a thread releases a lock, regardless of where in the code it does so, another thread may acquire that lock.
Locks protect data, not code. They do it by ensuring all code that accesses the protected data does so while it holds the lock, excluding other threads from any code that might access that same data.

Is std::mutex sufficient for data synchronization between threads

If I have a global array that multiple threads are writing to and reading from, and I want to ensure that this array remains synchronized between threads, is using std::mutex enough for this purpose, as shown in the pseudocode below? I came across this resource, which makes me think the answer is yes:
Mutual exclusion locks (such as std::mutex or atomic spinlock) are an example of release-acquire synchronization: when the lock is released by thread A and acquired by thread B, everything that took place in the critical section (before the release) in the context of thread A has to be visible to thread B (after the acquire) which is executing the same critical section.
I'm still interested in other people's opinions.
float* globalArray;
std::mutex globalMutex;

void method1()
{
    std::lock_guard<std::mutex> lock(globalMutex);
    // Perform reads/writes to globalArray
}

void method2()
{
    std::lock_guard<std::mutex> lock(globalMutex);
    // Perform reads/writes to globalArray
}

int main()
{
    std::thread t1(method1); // pass the function itself, not the result of calling it
    std::thread t2(method2);
    std::thread t3(method1);
    std::thread t4(method2);
    ...
    std::thread tn(method1);
}
This is precisely what mutexes are for. Just try not to hold them any longer than necessary to minimize the costs of contention.
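For example, a sketch of keeping the critical section minimal (computeValue is a hypothetical expensive function):

float computeValue(); // hypothetical expensive computation

void method1()
{
    const float value = computeValue(); // expensive work done outside the lock
    std::lock_guard<std::mutex> lock(globalMutex);
    globalArray[0] = value; // only the shared write happens under the lock
}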