Read shared data protected by Mutex without locking Mutex

Given shared data protected by a mutex, what is the appropriate way to read part of the shared data without locking the mutex? Is using std::atomic_ref appropriate, as in the example below?
#include <atomic>
#include <mutex>

struct A
{
    std::mutex mutex;
    int counter = 0;
    void modify()
    {
        std::lock_guard<std::mutex> guard(mutex);
        // do something with counter
    }
    int getCounter()
    {
        return std::atomic_ref<int>(counter).load(); // std::atomic_ref requires C++20
    }
};

If you bypass the mutex and perform atomic reads of the shared data (for example using std::atomic_ref), your program invokes undefined behavior if any other thread writes to that data using a non-atomic access.
If all threads use atomic operations to access the shared data, then there is no undefined behavior. However, in that case, there is probably no point in protecting the shared data with a mutex, if all accesses are atomic anyway.
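As a minimal sketch of the "all accesses are atomic" case, the variant below replaces the mutex with a std::atomic<int>. The names AtomicA and runWriters are hypothetical and only serve the illustration; the sketch assumes the counter is the only shared state that needs protecting.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical variant of struct A in which every access to the counter is atomic.
struct AtomicA
{
    std::atomic<int> counter{0};

    void modify()
    {
        // Atomic read-modify-write; no mutex is needed for this single variable.
        counter.fetch_add(1);
    }

    int getCounter() const
    {
        // Atomic read; safe to call concurrently with modify().
        return counter.load();
    }
};

// Starts `threads` writers that each call modify() `iterations` times,
// joins them, and returns the final counter value.
int runWriters(int threads, int iterations)
{
    assert(threads > 0 && iterations >= 0);
    AtomicA a;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&a, iterations] {
            for (int i = 0; i < iterations; ++i)
                a.modify();
        });
    }
    for (auto& th : pool)
        th.join();
    return a.getCounter();
}
```

If modify() has to update several variables together, a single atomic is no longer enough, and readers need the mutex as well.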

Related

Difference between shared mutex and mutex (why do both exist in C++ 11)?

I haven't found an example online that demonstrates this vividly. I saw an example at http://en.cppreference.com/w/cpp/header/shared_mutex but it is still unclear. Can somebody help?

By use of normal mutexes, you can guarantee exclusive access to some kind of critical resource – and nothing else. Shared mutexes extend this feature by allowing two levels of access: shared and exclusive as follows:
Exclusive access prevents any other thread from acquiring the mutex, just as with the normal mutex. It does not matter if the other thread tries to acquire shared or exclusive access.
Shared access allows multiple threads to acquire the mutex, but all of them only in shared mode. Exclusive access is not granted until all of the previous shared holders have returned the mutex (typically, as long as an exclusive request is waiting, new shared ones are queued to be granted after the exclusive access).
A typical scenario is a database: It does not matter if several threads read one and the same data simultaneously. But modification of the database is critical - if some thread reads data while another one is writing it might receive inconsistent data. So all reads must have finished before writing is allowed and new reading must wait until writing has finished. After writing, further reads can occur simultaneously again.
Edit: Sidenote:
Why do readers need a lock?
This is to prevent the writer from acquiring the lock while reading is still in progress. Additionally, it prevents new readers from acquiring the lock while it is held exclusively.

A shared mutex has two levels of access 'shared' and 'exclusive'.
Multiple threads can acquire shared access, but only one can hold 'exclusive' access (and only while no thread holds shared access).
The common scenario is a read/write lock. Recall that a data race can only occur when two threads access the same data and at least one of the accesses is a write.
The advantage is that data may be read by many readers concurrently, but a writer must obtain exclusive access to the data.
Why have both? On the one hand, the exclusive lock constitutes a normal mutex, so arguably only the shared mutex is needed. But there may be overheads in a shared lock implementation that can be avoided by using the less featured type.
Here's an example (adapted slightly from the example here http://en.cppreference.com/w/cpp/thread/shared_mutex).
#include <chrono>
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
std::mutex cout_mutex;//Not really part of the example...
void log(const std::string& msg){
std::lock_guard guard(cout_mutex);
std::cout << msg << std::endl;
}
class ThreadSafeCounter {
public:
ThreadSafeCounter() = default;
// Multiple threads/readers can read the counter's value at the same time.
unsigned int get() const {
std::shared_lock lock(mutex_);//NB: std::shared_lock will shared_lock() the mutex.
log("get()-begin");
std::this_thread::sleep_for(std::chrono::milliseconds(500));
auto result=value_;
log("get()-end");
return result;
}
// Only one thread/writer can increment/write the counter's value.
void increment() {
std::unique_lock lock(mutex_);
value_++;
}
// Only one thread/writer can reset/write the counter's value.
void reset() {
std::unique_lock lock(mutex_);
value_ = 0;
}
private:
mutable std::shared_mutex mutex_;
unsigned int value_ = 0;
};
int main() {
ThreadSafeCounter counter;
auto increment_and_print = [&counter]() {
for (int i = 0; i < 3; i++) {
counter.increment();
auto ctr=counter.get();
{
std::lock_guard guard(cout_mutex);
std::cout << std::this_thread::get_id() << ' ' << ctr << '\n';
}
}
};
std::thread thread1(increment_and_print);
std::thread thread2(increment_and_print);
std::thread thread3(increment_and_print);
thread1.join();
thread2.join();
thread3.join();
}
Possible partial output:
get()-begin
get()-begin
get()-end
140361363867392 2
get()-end
140361372260096 2
get()-begin
get()-end
140361355474688 3
//Etc...
Notice how the two consecutive get()-begin lines show that two threads are holding the shared lock at the same time during the read.

"Shared mutexes are usually used in situations when multiple readers can access the same resource at the same time without causing data races, but only one writer can do so."
cppreference.com
This is useful when you need read/writer lock: https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock

Upgrading read lock to write lock without releasing the first in C++11?

I know it's possible using boost::UpgradeLockable in C++14.
Is there anything similar for C++11?

An upgradeable lock can be written on top of simpler locking primitives.
#include <mutex>
#include <shared_mutex>

struct upgradeable_timed_mutex {
    // Exclusive (write) access: take the upgradable slot, then exclude readers.
    void lock() {
        upgradable_lock();
        upgrade_lock();
    }
    void unlock() {
        upgrade_unlock();
        upgradable_unlock();
    }
    // Shared (read) access.
    void shared_lock() { shared.lock_shared(); }
    void shared_unlock() { shared.unlock_shared(); }
    // Upgradable access: at most one holder at a time.
    void upgradable_lock() { unshared.lock(); }
    void upgradable_unlock() { unshared.unlock(); }
    // Upgrade from upgradable to exclusive, and back.
    void upgrade_lock() { shared.lock(); }
    void upgrade_unlock() { shared.unlock(); }
private:
    friend struct upgradable_lock;
    std::shared_timed_mutex shared;
    std::timed_mutex unshared;
};
and similar for the timed and try variants. Note that timed variants which access two mutexes in a row have to do some extra work to avoid spending up to 2x the requested time, and try_lock has to be careful about the first lock's state in case the 2nd fails.
Then you have to write upgradable_lock, with the ability to spawn a std::unique_lock upon request.
Naturally this is hand-written thread safety code, so it is unlikely to be correct.
In C++17 (C++1z at the time of writing) you can write an untimed version as well (with std::shared_mutex and std::mutex).
Less concretely, there can be exactly one upgradeable or write lock at a time. This is what the unshared mutex represents.
So long as you hold unshared, nobody else is writing to the guarded data, so you can read from it without holding the shared mutex at all.
When you want to upgrade, you can grab a unique lock on the shared mutex. This cannot deadlock so long as no readers try to upgrade to upgradable. Holding it excludes readers from reading, so you can write; then you release it and return to a read-only state (where you hold only the unshared mutex).
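The read-then-upgrade sequence described above might be used roughly as follows. This is a sketch only: UpgradableValue, raise_to and upgradeDemo are hypothetical names, and the class uses the two standard mutexes directly instead of the wrapper struct. Writers are assumed to always take the unshared mutex first.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical guarded value using the two-mutex scheme.
class UpgradableValue {
public:
    // Plain reader: shared access only.
    int read() const {
        std::shared_lock<std::shared_timed_mutex> lock(shared_);
        return value_;
    }

    // Upgradable reader: inspects the value and writes only if needed.
    // Returns true if the value was changed.
    bool raise_to(int candidate) {
        // Step 1: become the single upgradable holder. Every writer takes
        // this mutex first, so nobody can modify value_ while we hold it.
        std::lock_guard<std::mutex> upgradable(unshared_);
        if (value_ >= candidate)
            return false; // read-only path: plain readers were never blocked

        // Step 2: upgrade by excluding plain readers for the duration of the write.
        std::unique_lock<std::shared_timed_mutex> exclusive(shared_);
        value_ = candidate;
        return true;
    } // exclusive is released first, then upgradable

private:
    mutable std::shared_timed_mutex shared_;
    std::mutex unshared_;
    int value_ = 0;
};

// Usage sketch.
void upgradeDemo() {
    UpgradableValue v;
    assert(v.raise_to(5));   // 0 < 5, so the lock is upgraded and the value written
    assert(!v.raise_to(3));  // 5 >= 3, so the call stays on the read-only path
    assert(v.read() == 5);
}
```

Because the locks are destroyed in reverse order of construction, the exclusive lock is dropped before the upgradable one, matching unlock() in the struct above.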

std::lock_guard example, explanation on why it works

I've reached a point in my project that requires communication between threads on resources that very well may be written to, so synchronization is a must. However I don't really understand synchronization at anything other than the basic level.
Consider the last example in this link: http://www.bogotobogo.com/cplusplus/C11/7_C11_Thread_Sharing_Memory.php
#include <iostream>
#include <thread>
#include <list>
#include <algorithm>
#include <mutex>
using namespace std;
// a global variable
std::list<int>myList;
// a global instance of std::mutex to protect global variable
std::mutex myMutex;
void addToList(int max, int interval)
{
// the access to this function is mutually exclusive
std::lock_guard<std::mutex> guard(myMutex);
for (int i = 0; i < max; i++) {
if( (i % interval) == 0) myList.push_back(i);
}
}
void printList()
{
// the access to this function is mutually exclusive
std::lock_guard<std::mutex> guard(myMutex);
for (auto itr = myList.begin(), end_itr = myList.end(); itr != end_itr; ++itr ) {
cout << *itr << ",";
}
}
int main()
{
int max = 100;
std::thread t1(addToList, max, 1);
std::thread t2(addToList, max, 10);
std::thread t3(printList);
t1.join();
t2.join();
t3.join();
return 0;
}
The example demonstrates how three threads, two writers and one reader, access a common resource (the list).
Two global functions are used: one which is used by the two writer threads, and one being used by the reader thread. Both functions use a lock_guard to lock down the same resource, the list.
Now here is what I just can't wrap my head around: The reader uses a lock in a different scope than the two writer threads, yet still locks down the same resource. How can this work? My limited understanding of mutexes lends itself well to the writer function, there you got two threads using the exact same function. I can understand that, a check is made right as you are about to enter the protected area, and if someone else is already inside, you wait.
But when the scope is different? This would indicate that there is some sort of mechanism more powerful than the process itself, some sort of runtime environment blocking execution of the "late" thread. But I thought there were no such things in c++. So I am at a loss.
What exactly goes on under the hood here?

Let’s have a look at the relevant line:
std::lock_guard<std::mutex> guard(myMutex);
Notice that the lock_guard references the global mutex myMutex. That is, the same mutex for all three threads. What lock_guard does is essentially this:
Upon construction, it locks myMutex and keeps a reference to it.
Upon destruction (i.e. when the guard's scope is left), it unlocks myMutex.
The mutex is always the same one, it has nothing to do with the scope. The point of lock_guard is just to make locking and unlocking the mutex easier for you. For example, if you manually lock/unlock, but your function throws an exception somewhere in the middle, it will never reach the unlock statement. So, doing it the manual way you have to make sure that the mutex is always unlocked. On the other hand, the lock_guard object gets destroyed automatically whenever the function is exited – regardless how it is exited.

myMutex is global, which is what is used to protect myList. guard(myMutex) simply engages the lock and the exit from the block causes its destruction, dis-engaging the lock. guard is just a convenient way to engage and dis-engage the lock.
With that out of the way, mutex does not protect any data. It just provides a way to protect data. It is the design pattern that protects data. So if I write my own function to modify the list as below, the mutex cannot protect it.
void addToListUnsafe(int max, int interval)
{
for (int i = 0; i < max; i++) {
if( (i % interval) == 0) myList.push_back(i);
}
}
The lock only works if all pieces of code that need to access the data engage the lock before accessing it and disengage it after they are done. This design pattern of engaging and disengaging the lock before and after every access is what protects the data (myList in your case).
Now you might wonder why you should use a mutex at all and not, say, a bool. You can, but you will have to make sure that the bool variable exhibits certain characteristics, including but not limited to the following:
Writes to it must become visible to other threads instead of staying cached. (In C++, volatile alone does not guarantee this; std::atomic does.)
Reads and writes must be atomic operations.
The lock must handle situations where there are multiple execution pipelines (logical cores, etc.).
There are different synchronization mechanisms that provide "better locking" (across processes versus across threads, multiple processors versus a single processor, etc.) at the cost of slower performance, so you should always choose a locking mechanism that is just about enough for your situation.
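To illustrate what a "bool as a lock" needs in practice, here is a minimal spinlock sketch built on std::atomic_flag. The names SpinLock and countWithSpinLock are hypothetical. It is meant to show the required properties (atomic test-and-set, acquire/release ordering), not to replace std::mutex.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Minimal spinlock sketch: the flag plays the role of the "bool".
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous value.
        // Loop until this thread is the one that changed it from clear to set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield(); // let other threads run while waiting
        }
    }
    void unlock() {
        // Release ordering makes writes done under the lock visible to the next owner.
        flag_.clear(std::memory_order_release);
    }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Increments a plain int from several threads, protected only by the spinlock.
int countWithSpinLock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinLock> guard(lock); // works with any lock()/unlock() type
                ++counter;
            }
        });
    }
    for (auto& th : pool)
        th.join();
    return counter;
}
```

A plain bool with a separate "check, then set" would let two threads both see false and both enter; the single atomic test-and-set step is what closes that gap.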

Just to add onto what others here have said...
There is an idea in C++ called Resource Acquisition Is Initialization (RAII) which is this idea of binding resources to the lifetime of objects:
Resource Acquisition Is Initialization or RAII, is a C++ programming technique which binds the life cycle of a resource that must be acquired before use (allocated heap memory, thread of execution, open socket, open file, locked mutex, disk space, database connection—anything that exists in limited supply) to the lifetime of an object.
C++ RAII Info
The use of a std::lock_guard<std::mutex> class follows the RAII idea.
Why is this useful?
Consider a case where you don't use a std::lock_guard:
std::mutex m; // global mutex
void oops() {
m.lock();
doSomething();
m.unlock();
}
In this case, a global mutex is used and is locked before the call to doSomething(). Once doSomething() is complete, the mutex is unlocked.
One problem here is what happens if there is an exception? Now you run the risk of never reaching the m.unlock() line which releases the mutex to other threads.
So you need to cover the case where you run into an exception:
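std::mutex m; // global mutex
void oops() {
    m.lock(); // lock outside the try block so we never unlock a mutex we do not own
    try {
        doSomething();
        m.unlock();
    } catch(...) {
        m.unlock(); // now the exception path is covered
        throw;      // rethrow so the caller still sees the exception
    }
}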
This works but is ugly, verbose, and inconvenient.
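Now let's write our own simple lock guard.
class lock_guard {
private:
    std::mutex& m;
public:
    explicit lock_guard(std::mutex& m_) : m(m_) { m.lock(); } // lock on construction
    ~lock_guard() { m.unlock(); } // unlock on destruction
    lock_guard(const lock_guard&) = delete; // copying would unlock twice
    lock_guard& operator=(const lock_guard&) = delete;
};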
When the lock_guard object is destroyed, it will ensure that the mutex is unlocked.
Now we can use this lock_guard to handle the case from before in a better/cleaner way:
std::mutex m; // global mutex
void ok() {
lock_guard lk(m); // our simple lock guard, protects against exception case
doSomething();
} // when scope is exited our lock guard object is destroyed and the mutex unlocked
This is the same idea behind std::lock_guard.
Again this approach is used with many different types of resources which you can read more about by following the link on RAII.

This is precisely what a lock does. When a thread takes the lock, regardless of where in the code it does so, it must wait its turn if another thread holds the lock. When a thread releases a lock, regardless of where in the code it does so, another thread may acquire that lock.
Locks protect data, not code. They do it by ensuring all code that accesses the protected data does so while it holds the lock, excluding other threads from any code that might access that same data.

Thread safety of copy on write

I am trying to understand the proper way of developing thread-safe applications.
In my current project I have the following class:
class Test
{
public:
void setVal(unsigned int val)
{
mtx.lock();
testValue = val;
mtx.unlock();
}
unsigned int getVal()
{
unsigned int copy = testValue;
return copy;
}
private:
boost::mutex mtx;
unsigned int testValue;
};
My question: is the method Test::getVal() above thread-safe in a multithreaded environment, or must the mutex be locked before taking the copy?
I've read some articles about COW, and now I'm unsure.
Thanks!

If you have data which can be shared between multiple threads (such as the testValue member in your case), you must synchronise all accesses to that data. "Synchronise" has a broad meaning here: it could be done using a mutex, by making the data atomic, or by explicitly invoking a memory barrier.
But you cannot skip this. In a parallel world with multiple threads, CPU cores, CPUs and caches, there is no guarantee that a write by one thread will be visible to another thread if they don't "shake hands" on a synchronisation primitive. It is quite possible that thread T1's cache entry for testValue will not be updated when thread T2 writes into testValue, precisely because the HW cache management system sees "no synchronisation is happening, the threads don't access shared data, why should I torpedo performance by invalidating caches?"
The C++11 standard chapter [intro.multithread] goes into more detail than you'd like on this, but here's an informal Note from that chapter summarising the idea:
5 ... Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform a consume or an acquire operation on A. ...
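Applied to the class from the question, the mutex-based option could look like the sketch below. It swaps boost::mutex for std::mutex and renames the class to SafeTest, both of which are assumptions made for the illustration; writeThenRead is a hypothetical helper. The point is that getVal() takes the same lock as setVal().

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Sketch of the question's class with the read synchronised as well.
class SafeTest {
public:
    void setVal(unsigned int val) {
        std::lock_guard<std::mutex> guard(mtx); // unlocks automatically on scope exit
        testValue = val;
    }

    unsigned int getVal() const {
        std::lock_guard<std::mutex> guard(mtx); // same mutex as the writer
        return testValue;                       // the copy is made while the lock is held
    }

private:
    mutable std::mutex mtx;     // mutable so that the const getter can lock it
    unsigned int testValue = 0;
};

// One thread writes 1..last while another reads concurrently.
// Returns the value seen after both threads have finished.
unsigned int writeThenRead(unsigned int last) {
    SafeTest t;
    std::thread writer([&t, last] {
        for (unsigned int i = 1; i <= last; ++i)
            t.setVal(i);
    });
    std::thread reader([&t, last] {
        for (unsigned int i = 0; i < last; ++i)
            assert(t.getVal() <= last); // every read sees some value that was actually written
    });
    writer.join();
    reader.join();
    return t.getVal();
}
```

For a single unsigned int, declaring the member as std::atomic<unsigned int> and dropping the mutex would be the other option named above.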

Locking/unlocking mutex inside private functions

Imagine you have a big function that locks/unlocks a mutex inside and you want to break the function into smaller functions:
#include <pthread.h>
#include <vector>
class MyClass : public Uncopyable
{
public:
MyClass() : m_mutexBuffer(PTHREAD_MUTEX_INITIALIZER), m_vecBuffer() {}
~MyClass() {}
void MyBigFunction()
{
pthread_mutex_lock(&m_mutexBuffer);
if (m_vecBuffer.empty())
{
pthread_mutex_unlock(&m_mutexBuffer);
return;
}
// DoSomethingWithBuffer1();
unsigned char ucBcc = CalculateBcc(&m_vecBuffer[0], m_vecBuffer.size());
// DoSomethingWithBuffer2();
pthread_mutex_unlock(&m_mutexBuffer);
}
private:
void DoSomethingWithBuffer1()
{
// Use m_vecBuffer
}
void DoSomethingWithBuffer2()
{
// Use m_vecBuffer
}
private:
pthread_mutex_t m_mutexBuffer;
std::vector<unsigned char> m_vecBuffer;
};
How should I go about locking/unlocking the mutex inside the smaller functions?
Should I unlock the mutex first, then lock it straightaway and finally unlock it before returning?
void DoSomethingWithBuffer1()
{
pthread_mutex_unlock(&m_mutexBuffer);
pthread_mutex_lock(&m_mutexBuffer);
// Use m_vecBuffer
pthread_mutex_unlock(&m_mutexBuffer);
}

How should I go about locking/unlocking the mutex inside the smaller functions?
If your semantics require your mutex to be locked during the whole MyBigFunction() operation then you can't simply unlock it and relock it in the middle of the function.
My best bet would be to ignore the mutex in the smaller DoSomethingWithBuffer...() functions, and simply require that these functions are called with the mutex being already locked. This shouldn't be a problem since those functions are private.
On a side note, your mutex usage is incorrect: it is not exception safe, and you have code paths where you don't release the mutex. You should either use C++11's mutex and lock classes or boost's equivalents if you are using C++03. At worst if you can't use boost, write a small RAII wrapper to hold the lock.
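A rough sketch of that structure follows, using std::mutex and std::lock_guard instead of pthreads. The class and function names (Buffer, checksumAndClear, the *Locked helpers, bufferDemo) are hypothetical, and the XOR checksum merely stands in for CalculateBcc, whose definition is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

class Buffer {
public:
    void append(unsigned char byte) {
        std::lock_guard<std::mutex> guard(mutex_);
        data_.push_back(byte);
    }

    // Public entry point: takes the lock once for the whole operation.
    // Returns the XOR of all bytes and empties the buffer; returns 0 if it was empty.
    unsigned char checksumAndClear() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (data_.empty())
            return 0; // early return: the guard's destructor still unlocks
        unsigned char sum = xorAllLocked();
        clearLocked();
        return sum;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return data_.size();
    }

private:
    // Precondition for the *Locked helpers: the caller already holds mutex_.
    // They never lock or unlock themselves.
    unsigned char xorAllLocked() const {
        unsigned char sum = 0;
        for (unsigned char b : data_)
            sum = static_cast<unsigned char>(sum ^ b);
        return sum;
    }
    void clearLocked() { data_.clear(); }

    mutable std::mutex mutex_;
    std::vector<unsigned char> data_;
};

// Usage sketch.
void bufferDemo() {
    Buffer b;
    b.append(0x0F);
    b.append(0xF0);
    assert(b.checksumAndClear() == 0xFF);
    assert(b.size() == 0);
}
```

Suffixing the private helpers with "Locked" is just one convention for documenting the precondition; a comment or an assertion on lock ownership would serve the same purpose.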

In general, try to keep the regions of code within each lock to a minimum (to avoid contention), but avoid unlocking and immediately re-locking the same mutex. Thus, if the smaller functions are not mutually exclusive, they should each use their own independent mutexes, and only when they actually access the shared resource.
Another thing to consider is using RAII for locking and unlocking (as in C++11 with std::lock_guard<>), so that returning from a locked region (either directly or via an uncaught exception) does not leave you in a locked state.
