I am reading Item 16 in Scott Meyers's Effective Modern C++.
In the later part of the item, he says
For a single variable or memory location requiring synchronization, use of a std::atomic is adequate, but once you get to two or more variables or memory locations that require manipulation as a unit, you should reach for
a mutex.
But I still don't see why it is adequate in the case of a single variable or memory location. Take the polynomial example in this item:
class Polynomial {
public:
using RootsType = std::vector<double>;
RootsType roots() const
{
if (!rootsAreValid) { // if cache not valid
.... // **very expensive computation**, computing roots,
// store them in rootVals
rootsAreValid = true;
}
return rootVals;
}
private:
mutable std::atomic<bool> rootsAreValid{ false };
mutable RootsType rootVals{};
};
My question is:
If thread 1 is in the middle of the heavy computation of rootVals, before rootsAreValid is set to true, and thread 2 also calls roots() and sees rootsAreValid as false, then thread 2 will also step into the heavy computation of rootVals. So how is an atomic bool adequate in this case? I still think a std::lock_guard<std::mutex> is needed to protect the entry to the rootVals computation.
In your example there are two variables being synchronized : rootVals and rootsAreValid. That particular item is referring to the case where only the atomic value requires synchronization. For example :
#include <atomic>
class foo
{
public:
void work()
{
++times_called;
/* multiple threads call this to do work */
}
private:
// Counts the number of times work() was called
std::atomic<int> times_called{0};
};
times_called is the only variable in this case.
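To see why a lone atomic counter needs no mutex, here is a minimal sketch that calls work() from several threads at once. The calls() accessor and the run_workers helper are hypothetical additions for illustration and are not part of the example above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

class foo
{
public:
    void work()
    {
        ++times_called;  // atomic read-modify-write, no lock needed
    }
    // Hypothetical accessor, added only so the count can be inspected
    int calls() const { return times_called.load(); }
private:
    std::atomic<int> times_called{0};
};

// Hypothetical helper: starts `threads` threads that each call work() `per_thread` times
int run_workers(int threads, int per_thread)
{
    foo f;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&f, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                f.work();
        });
    for (auto& th : pool)
        th.join();
    return f.calls();  // no increment is lost, because each ++ is atomic
}
```

With a plain int instead of std::atomic<int>, the same test would be a data race and could lose increments.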
I suggest avoiding the unnecessary heavy computation with the following code:
class Polynomial {
public:
using RootsType = std::vector<double>;
RootsType roots() const
{
if (!rootsAreValid) { // Acquiring a mutex is usually not cheap, so check the state without locking first
std::lock_guard<std::mutex> lock_guard(sync);
if (!rootsAreValid) // The state could have changed, because the mutex was not held during the first check
{
.... // **very expensive computation**, computing roots,
// store them in rootVals
rootsAreValid = true; // publish only after rootVals has been written
}
}
return rootVals;
}
private:
mutable std::mutex sync;
mutable std::atomic<bool> rootsAreValid{ false };
mutable RootsType rootVals{};
};
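A self-contained sketch of the same check-lock-recheck idea follows, with the expensive computation replaced by a placeholder so it can actually run. The class name CachedRoots, the computations counter and the helper function are hypothetical. The explicit memory orders are one possible choice, and the default sequentially consistent operations would also work.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class CachedRoots  // hypothetical stand-in for Polynomial
{
public:
    std::vector<double> roots() const
    {
        if (!valid.load(std::memory_order_acquire)) {          // fast path, no lock
            std::lock_guard<std::mutex> lock(sync);
            if (!valid.load(std::memory_order_relaxed)) {      // re-check under the lock
                vals = {1.0, 2.0};                             // placeholder for the expensive computation
                ++computations;
                valid.store(true, std::memory_order_release);  // publish only after vals is written
            }
        }
        return vals;  // safe: vals is never modified again once valid is true
    }
    int times_computed() const { return computations.load(); }
private:
    mutable std::mutex sync;
    mutable std::atomic<bool> valid{false};
    mutable std::atomic<int> computations{0};
    mutable std::vector<double> vals;
};

// Hypothetical helper: many threads ask for the roots at the same time
int computations_after_concurrent_calls(int threads)
{
    CachedRoots p;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&p] { p.roots(); });
    for (auto& th : pool)
        th.join();
    return p.times_computed();
}
```

However the threads interleave, the placeholder computation runs exactly once, because the second check happens while the mutex is held.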
I am reading C++ concurrency in action.
It says that when you use std::execution::par, you can use a mutex per element, as below.
#include <algorithm>
#include <execution>
#include <mutex>
#include <vector>
class X {
mutable std::mutex m;
int data;
public:
X() : data(0) {}
int get_value() const {
std::lock_guard guard(m);
return data;
}
void increment() {
std::lock_guard guard(m);
++data;
}
};
void increment_all(std::vector<X>& v) {
std::for_each(std::execution::par, v.begin(), v.end(), [](X& x) { x.increment(); });
}
But it says that when you use std::execution::par_unseq, you have to replace this mutex with a whole-container mutex like below
#include <algorithm>
#include <execution>
#include <mutex>
#include <vector>
class Y {
int data;
public:
Y() : data(0) {}
int get_value() const { return data; }
void increment() { ++data; }
};
class ProtectedY {
std::mutex m;
std::vector<Y> v;
public:
void lock() { m.lock(); }
void unlock() { m.unlock(); }
std::vector<Y>& get_vec() { return v; }
};
void incremental_all(ProtectedY& data) {
std::lock_guard<ProtectedY> guard(data);
auto& v = data.get_vec();
std::for_each(std::execution::par_unseq, v.begin(), v.end(),
[](Y& y) { y.increment(); });
}
But even with the second version, it seems that y.increment() has a data race inside the parallel algorithm, because there is no lock among the threads running the algorithm.
How can this second version with std::execution::par_unseq be thread-safe?
It is only thread-safe because you do not access shared data in the parallel algorithm.
The only thing being executed in parallel are the calls to y.increment(). These can happen in any order, on any thread and be arbitrarily interleaved with each other, even within a single thread. But y.increment() only accesses private data of y, and each y is distinct from all the other vector elements. So there is no opportunity for data races here, because there is no "overlap" between the individual elements.
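The "no overlap" argument can be illustrated without the parallel algorithms library. The sketch below substitutes plain std::thread for the execution policy, which is an assumption made only to keep the example self-contained. The helpers increment_in_chunks and all_incremented_once are hypothetical. Each thread gets a disjoint slice of the vector, so no element is touched by two threads and no lock is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

class Y {
    int data;
public:
    Y() : data(0) {}
    int get_value() const { return data; }
    void increment() { ++data; }
};

// Hypothetical helper: splits v into at most `parts` disjoint slices, one thread per slice.
// Assumes parts > 0.
void increment_in_chunks(std::vector<Y>& v, std::size_t parts)
{
    std::vector<std::thread> pool;
    const std::size_t chunk = (v.size() + parts - 1) / parts;
    for (std::size_t begin = 0; begin < v.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, v.size());
        pool.emplace_back([&v, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                v[i].increment();  // each index belongs to exactly one thread
        });
    }
    for (auto& th : pool)
        th.join();
}

// Hypothetical check: every element was incremented exactly once
bool all_incremented_once(std::size_t n, std::size_t parts)
{
    std::vector<Y> v(n);
    increment_in_chunks(v, parts);
    return std::all_of(v.begin(), v.end(),
                       [](const Y& y) { return y.get_value() == 1; });
}
```

Distinct vector elements are distinct memory locations, so concurrent writes to different elements do not constitute a data race.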
A different example would be if the increment function also accesses some global state that is being shared between all the different elements of the vector. In that case, there is now a potential for a data race, so access to that shared global state needs to be synchronized. But because of the specific requirements of the parallel unsequenced policy, you can't just use a mutex for synchronizing here.
Note that if a mutex is being used in the context of parallel algorithms it may protect against different hazards: One use is using a mutex to synchronize among the different threads executing the for-each. This works for the parallel execution policy, but not for parallel-unsequenced. This is not the use case in your examples, as in your case no data is shared, so we don't need any synchronization. Instead in your examples the mutex only synchronizes the invocation of the for-each against any other threads that might still be running as part of a larger application, but there is no synchronization within the for-each itself. This is a valid use case for both parallel and parallel-unsequenced, but in the latter case, it cannot be achieved by using per-element mutexes.
I have a struct containing two elements.
struct MyStruct {
int first_element_;
std::string second_element_;
};
The struct is shared between threads and therefore requires locking. My use case requires locking access to the whole struct rather than just a specific member, for example:
// start of Thread 1's routine
<Thread 1 "locks" struct>
<Thread 1 gets first_element_>
<Thread 1 sets second_element_>
<Thread 2 wants to access struct -> blocks>
<Thread 1 sets first_element_>
// end of Thread 1's routine
<Thread 1 releases lock>
<Thread 2 reads/sets ....>
What's the most elegant way of doing that?
EDIT: To clarify, basically this question is about how to enforce any thread using this struct to lock a mutex (stored wherever) at the start of its routine and unlock the mutex at the end of it.
EDIT2: My current (ugly) solution is to have a mutex inside MyStruct and lock that mutex at the start of each thread's routine which uses MyStruct. However, if one thread "forgets" to lock that mutex, I run into synchronization problems.
You can have a class instead of the struct and implement getters and setters for first_element_ and second_element_. Besides those class members, you will also need a member of type std::mutex.
Eventually, your class could look like this:
class Foo {
public:
// ...
int get_first() const noexcept {
std::lock_guard<std::mutex> guard(my_mutex_);
return first_element_;
}
std::string get_second() const noexcept {
std::lock_guard<std::mutex> guard(my_mutex_);
return second_element_;
}
private:
int first_element_;
std::string second_element_;
mutable std::mutex my_mutex_; // mutable so that the const getters can lock it
};
Please note that the getters return copies of the data members. If you want to return references (like std::string const& get_second() const noexcept), you need to be careful: the code that receives the reference to the second element holds no lock guard, so a race condition is possible.
In any case, your way to go is using std::lock_guards and std::mutexes around code that can be used by more than one thread.
You could implement something like this that combines the lock with the data:
class Wrapper
{
public:
Wrapper(MyStruct& value, std::mutex& mutex)
:value(value), lock(mutex) {}
MyStruct& value;
private:
std::unique_lock<std::mutex> lock;
};
class Container
{
public:
Wrapper get()
{
return Wrapper(value, mutex);
}
private:
MyStruct value;
std::mutex mutex;
};
The mutex is locked when you call get and unlocked automatically when Wrapper goes out of scope.
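Another way to make the lock hard to forget, in the same spirit as the wrapper above, is to hide the data entirely and hand it out only inside a callback. This is a sketch under assumptions: the names Guarded, with_lock and run_routine are hypothetical, and MyStruct is given default member initializers for convenience.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct MyStruct {
    int first_element_ = 0;
    std::string second_element_;
};

template <typename T>
class Guarded  // hypothetical name
{
public:
    // Runs f(value) while holding the mutex and returns whatever f returns.
    template <typename F>
    decltype(auto) with_lock(F&& f)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::forward<F>(f)(value_);
    }
private:
    T value_{};
    std::mutex mutex_;
};

// Hypothetical demo: each thread runs the whole routine from the question as one locked unit.
// Returns first_element_ if both members stayed consistent, otherwise -1.
int run_routine(int threads)
{
    Guarded<MyStruct> shared;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&shared] {
            shared.with_lock([](MyStruct& s) {
                int first = s.first_element_;  // get first_element_
                s.second_element_ += 'x';      // set second_element_
                s.first_element_ = first + 1;  // set first_element_
            });
        });
    for (auto& th : pool)
        th.join();
    return shared.with_lock([](MyStruct& s) {
        return s.first_element_ == static_cast<int>(s.second_element_.size())
                   ? s.first_element_
                   : -1;
    });
}
```

Because value_ is private, the only route to the data goes through with_lock. A callback can still leak a reference or pointer to the data, so this reduces the chance of forgetting the lock rather than eliminating it.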
I often run into the design of a thread-safe structure like this. In version1 below, one thread rarely calls foo1::add_data(), while another thread often calls foo1::get_result(). As an optimization, I think an atomic could be used to apply the double-checked locking pattern (DCLP), as shown in version2. Is there a better design for this situation? Or could it be improved, for example by accessing the atomic with an explicit std::memory_order?
version1:
class data {};
class some_data {};
class some_result {};
class foo1
{
public:
foo1() : m_bNeedUpdate(false) {}
void add_data(data n)
{
std::lock_guard<std::mutex> lock(m_mut);
// ... store new data in m_SomeData
m_bNeedUpdate = true;
}
some_result get_result() const
{
{
std::lock_guard<std::mutex> lock(m_mut);
if (m_bNeedUpdate)
{
// ... process m_SomeData and update m_SomeResult
m_bNeedUpdate = false;
}
}
return m_SomeResult;
}
private:
mutable std::mutex m_mut;
mutable bool m_bNeedUpdate;
some_data m_SomeData;
mutable some_result m_SomeResult;
};
version2:
class foo2
{
public:
foo2() : m_bNeedUpdate(false) {}
void add_data(data n)
{
std::lock_guard<std::mutex> lock(m_mut);
// ... store new data in m_SomeData
m_bNeedUpdate.store(true);
}
some_result get_result() const
{
if (m_bNeedUpdate.load())
{
std::lock_guard<std::mutex> lock(m_mut);
if (m_bNeedUpdate.load())
{
// ... process m_SomeData and update m_SomeResult
m_bNeedUpdate.store(false);
}
}
return m_SomeResult;
}
private:
mutable std::mutex m_mut;
mutable std::atomic<bool> m_bNeedUpdate;
some_data m_SomeData;
mutable some_result m_SomeResult;
};
The problem is that version 2 isn't thread safe, at least
according to C++11 (and Posix, earlier); you're accessing
a variable which may be modified without the access being
protected. (The double checked locking pattern is known to be
broken, see
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf.)
It can be made to work in C++11 (or non-portably earlier) by
using atomic variables, but what you've written results in
undefined behavior.
I think a significant improvement (in terms of code size as well as in terms of simplicity and performance) could be achieved by using a 'read-write lock' which allows many threads to read in parallel. Boost provides shared_mutex for this purpose, but from a quick glance it appears that this blog article implements the same kind of lock in a portable manner without requiring Boost.
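As a rough sketch of the read-write lock idea, assuming C++17's std::shared_mutex is available in place of Boost's shared_mutex: because get_result() in the question also updates the cache, this sketch moves the recomputation into add_data(), so readers only ever read. That is a design change and not what the question's code does. The class name foo3, the int data and the running sum used as the "result" are placeholders.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class foo3  // hypothetical name, continuing the question's foo1/foo2
{
public:
    void add_data(int n)
    {
        std::unique_lock<std::shared_mutex> lock(m_mut);  // exclusive: blocks readers and writers
        m_SomeData.push_back(n);
        m_SomeResult += n;                                // placeholder for the real processing
    }
    long get_result() const
    {
        std::shared_lock<std::shared_mutex> lock(m_mut);  // shared: many readers at once
        return m_SomeResult;
    }
private:
    mutable std::shared_mutex m_mut;
    std::vector<int> m_SomeData;
    long m_SomeResult = 0;
};

// Hypothetical demo: one writer adds 1..count while a reader polls concurrently
long sum_with_concurrent_readers(int count)
{
    foo3 f;
    std::thread writer([&f, count] {
        for (int i = 1; i <= count; ++i)
            f.add_data(i);
    });
    std::thread reader([&f, count] {
        for (int i = 0; i < count; ++i)
            (void)f.get_result();
    });
    writer.join();
    reader.join();
    return f.get_result();
}
```

Doing the work in the rare writer keeps the frequent reader path to a shared lock plus a copy. Whether that beats a plain mutex depends on the platform and the amount of contention, so it is worth measuring.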
You said that you're calling get_average quite often. Have you considered calculating the average based only on the numbers you haven't 'seen' yet? It would be O(n) instead of O(n^2).
It would be something like
average = (last_average * last_size + static_cast<double>(
std::accumulate(m_vecData.begin() + last_size, m_vecData.end(), 0))) /
m_vecData.size();
It should give you satisfying results, depending on how big your vector is.
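The incremental formula above can be sketched as a small helper that remembers the previous average and size. The class name RunningAverage is hypothetical, and the sketch assumes the vector only ever grows between calls.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

class RunningAverage  // hypothetical helper
{
public:
    // Folds in only the elements appended since the previous call.
    // Assumes data never shrinks between calls.
    double update(const std::vector<int>& data)
    {
        if (data.empty())
            return 0.0;
        const auto first_new = data.begin() + static_cast<std::ptrdiff_t>(last_size_);
        const double new_sum = std::accumulate(first_new, data.end(), 0.0);
        average_ = (average_ * static_cast<double>(last_size_) + new_sum)
                   / static_cast<double>(data.size());
        last_size_ = data.size();
        return average_;
    }
private:
    double average_ = 0.0;
    std::size_t last_size_ = 0;
};
```

For example, after the values 2 and 4 the average is 3. Appending 9 gives (3 * 2 + 9) / 3 = 5 without revisiting the first two elements. In a multi-threaded setting the call would still need to happen under the same lock that protects the vector.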
Consider the following C++11 code where class B is instantiated and used by multiple threads. Because B modifies a shared vector, I have to lock access to it in the ctor and member function foo of B. To initialize the member variable id I use a counter that is an atomic variable because I access it from multiple threads.
struct A {
A(size_t id, std::string const& sig) : id{id}, signature{sig} {}
private:
size_t id;
std::string signature;
};
namespace N {
std::atomic<size_t> counter{0};
typedef std::vector<A> As;
std::vector<As> sharedResource;
std::mutex barrier;
struct B {
B() : id(++counter) {
std::lock_guard<std::mutex> lock(barrier);
sharedResource.push_back(As{});
sharedResource[id].push_back(A(id, "B()")); // argument order matches A(size_t, std::string const&)
}
void foo() {
std::lock_guard<std::mutex> lock(barrier);
sharedResource[id].push_back(A(id, "foo()")); // argument order matches A(size_t, std::string const&)
}
private:
const size_t id;
};
}
Unfortunately, this code contains a race condition and does not work like this (sometimes the ctor and foo() do not use the same id). If I move the initialization of id to the ctor body which is locked by a mutex, it works:
struct B {
B() {
std::lock_guard<std::mutex> lock(barrier);
id = ++counter; // counter does not have to be an atomic variable and id cannot be const anymore
sharedResource.push_back(As{});
sharedResource[id].push_back(A(id, "B()")); // argument order matches A(size_t, std::string const&)
}
};
Can you please help me understand why the latter example works (is it because it does not use the same mutex?)? Is there a safe way to initialize id in the initializer list of B without locking it in the body of the ctor? My requirements are that id must be const and that the initialization of id takes place in the initializer list.
First, there's still a fundamental logic problem in the posted code.
You use ++ counter as id. Consider the very first creation of B,
in a single thread. B will have id == 1; after the push_back of
sharedResource, you will have sharedResource.size() == 1, and the
only legal index for accessing it will be 0.
In addition, there's a clear race condition in the code. Even if you
correct the above problem (initializing id with counter ++), suppose
that both counter and sharedResource.size() are currently 0;
you've just initialized. Thread one enters the constructor of B,
increments counter, so:
counter == 1
sharedResource.size() == 0
It is then interrupted by thread 2 (before it acquires the mutex), which
also increments counter (to 2), and uses its previous value (1) as
id. After the push_back in thread 2, however, we have only
sharedResource.size() == 1, and the only legal index is 0.
In practice, I would avoid two separate variables (counter and
sharedResource.size()) which should have the same value. From
experience: two things that should be the same won't be—the only
time redundant information should be used is when it is used for
control; i.e. at some point, you have an assert( id ==
sharedResource.size() ), or something similar. I'd use something like:
B::B()
{
std::lock_guard<std::mutex> lock( barrier );
id = sharedResource.size();
sharedResource.push_back( As() );
// ...
}
Or if you want to make id const:
struct B
{
static int getNewId()
{
std::lock_guard<std::mutex> lock( barrier );
int results = sharedResource.size();
sharedResource.push_back( As() );
return results;
}
B() : id( getNewId() )
{
std::lock_guard<std::mutex> lock( barrier );
// ...
}
};
(Note that this requires acquiring the mutex twice. Alternatively, you
could pass the additional information necessary to complete updating
sharedResource to getNewId(), and have it do the whole job.)
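The single-lock alternative mentioned in the note can be sketched as follows. The helper name registerSelf, the getId accessor and the ids_match_slots check are hypothetical, and A's members are made public here only so the check can inspect them.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct A {
    A(std::size_t id, std::string const& sig) : id{id}, signature{sig} {}
    std::size_t id;
    std::string signature;
};

namespace N {
    typedef std::vector<A> As;
    std::vector<As> sharedResource;
    std::mutex barrier;

    struct B {
        B() : id(registerSelf("B()")) {}  // id can stay const
        void foo() {
            std::lock_guard<std::mutex> lock(barrier);
            sharedResource[id].push_back(A(id, "foo()"));
        }
        std::size_t getId() const { return id; }  // hypothetical accessor
    private:
        // Hypothetical helper: allocates the slot and records the first entry under one lock
        static std::size_t registerSelf(std::string const& sig) {
            std::lock_guard<std::mutex> lock(barrier);
            const std::size_t newId = sharedResource.size();
            sharedResource.push_back(As{});
            sharedResource[newId].push_back(A(newId, sig));
            return newId;
        }
        const std::size_t id;
    };
}

// Hypothetical check: every new slot holds two entries tagged with its own index
bool ids_match_slots(std::size_t threads)
{
    const std::size_t before = N::sharedResource.size();
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < threads; ++t)
        pool.emplace_back([] { N::B b; b.foo(); });
    for (auto& th : pool)
        th.join();
    std::lock_guard<std::mutex> lock(N::barrier);
    if (N::sharedResource.size() != before + threads)
        return false;
    for (std::size_t i = before; i < N::sharedResource.size(); ++i) {
        if (N::sharedResource[i].size() != 2)
            return false;
        for (const A& a : N::sharedResource[i])
            if (a.id != i)
                return false;
    }
    return true;
}
```

Since the index is derived from sharedResource.size() under the lock, there is no separate counter that could drift out of step with the vector.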
When an object is being initialized, it should be owned by a single thread. Then when it is done being initialized, it is made shared.
If there is such a thing as thread-safe initialization, it means ensuring that an object has not become accessible to other threads before being initialized.
Of course, we can discuss thread-safe assignment of an atomic variable. Assignment is different from initialization.
You are initializing id in the constructor's initializer list, outside the lock. That step and the vector update are not a single atomic operation, so in a multi-threaded system two threads can hit the constructor at the same time, which changes which id lines up with which vector slot. Welcome to thread safety 101!
Moving the initialization into the constructor body, surrounded by the lock, ensures that only one thread at a time can set id and access the vector.
The other way to fix this would be to move this into a singleton pattern. But then you pay for the lock every time you get the object.
Now you can get into things like double checked locking :)
http://en.wikipedia.org/wiki/Double-checked_locking
So, given a simple class
class mySafeData
{
public:
mySafeData() : myData(0), stateCounter(0) // stateCounter must start from a defined value
{
}
void Set(int i)
{
boost::mutex::scoped_lock lock(myMutex);
myData = i; // set the data
++stateCounter; // some int to track state changes
myCondvar.notify_all(); // notify all readers
}
void Get( int& i)
{
boost::mutex::scoped_lock lock(myMutex);
// copy the current state
int cState = stateCounter;
// waits for a notification and change of state
while (stateCounter == cState)
myCondvar.wait( lock );
i = myData; // hand the newly set value back to the caller
}
private:
int myData;
int stateCounter;
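boost::mutex myMutex;
boost::condition_variable myCondvar; // used in Set() and Get() but missing from the original listing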
};
and an array of threads in infinite loops, each calling one function:
Get()
Set()
Get()
Get()
Get()
Will they always call the functions in the same order, and only once per cycle? By cycle I mean: will all Boost threads run in the same order each time, so that each thread calls Get() only once after each Set()?
No. You can never make any assumptions of which order the threads will be served. This is nothing related to boost, it is the basics of multiprogramming.
The threads should acquire the lock in the same order that they reach the scoped_lock constructor (I think). But there's no guarantee that they will reach that point in any fixed order!
So in general: don't rely on it.
No, the mutex only prevents two threads from accessing the variable at the same time. It does not affect the thread scheduling order or execution time, which can for all intents and purposes be assumed to be random.
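If each reader needs to react to updates it has not yet seen, that has to be tracked explicitly rather than inferred from scheduling. The following sketch uses the standard library's std::condition_variable instead of Boost, which is an assumption made to keep it self-contained. The names SafeData, generation_ and read_one_update are hypothetical. Each caller passes in the last generation it has seen, so a Set() that happens before the Get() is not missed.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

class SafeData  // hypothetical std-based counterpart of mySafeData
{
public:
    void Set(int i)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            data_ = i;
            ++generation_;  // marks a state change
        }
        condvar_.notify_all();
    }
    // Blocks until the generation differs from `seen`, then returns the data
    // and updates `seen` so the caller can wait for the next change.
    int Get(unsigned& seen)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        condvar_.wait(lock, [&] { return generation_ != seen; });
        seen = generation_;
        return data_;
    }
private:
    int data_ = 0;
    unsigned generation_ = 0;
    std::mutex mutex_;
    std::condition_variable condvar_;
};

// Hypothetical demo: one reader waits for a single update
int read_one_update(int value)
{
    SafeData d;
    int received = -1;
    std::thread reader([&] {
        unsigned seen = 0;
        received = d.Get(seen);
    });
    d.Set(value);
    reader.join();
    return received;
}
```

This still says nothing about the order in which several readers wake up, and a slow reader can skip intermediate values when Set() is called more than once before it runs. It only ensures that a reader never returns the same generation twice.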