C++11's "const==mutable": how to implement copy efficiently?

As I introduced in this question and this question, it seems that the modern way to implement a thread-safe class with hidden state is this:
struct Widget {
int getValue() const{
std::lock_guard<std::mutex> guard{m}; // lock mutex
if (cacheValid) return cachedValue;
else {
cachedValue = expensiveQuery(); // write data mem
cacheValid = true; // write data mem
return cachedValue;
}
} // unlock mutex
...
private:
mutable std::mutex m;
mutable int cachedValue;
mutable bool cacheValid;
...
};
The logic seems solid and is explained here: https://herbsutter.com/2013/01/01/video-you-dont-know-const-and-mutable/
The question is how do I implement the rest of the class?
In particular the copy constructor.
One option is not to implement copy and that's it. (=delete).
But Widget can have bona fide state (in ...), and that state should be copyable; or one might want to make copies precisely to alleviate contention on a single mutex, balancing it against the expense of expensiveQuery.
So let's assume we want to make Widget copyable: it cannot be default-copyable because the mutex is not copyable, so something needs to be implemented manually.
This is already complicated because one needs to lock the source instance (other) before copying, and for that one seems to need a function body.
struct Widget {
Widget(Widget const& other){
std::lock_guard<std::mutex> guard{other.m}; // lock mutex
cachedValue = other.cachedValue;
cacheValid = other.cacheValid; // the member is named cacheValid, as declared above
}
...
};
This is getting very ugly.
Maybe one needs to implement a lock() function that takes the lock via RAII and returns the unprotected Widget_nonmt; this is possible, but you can see how the problem diverges.
Is this the right way to make the C++11 Widget copyable?
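One way to avoid needing a constructor body, sketched here as a possibility rather than an established answer, is to delegate to a private constructor that takes a lock guard as an extra parameter. The temporary guard lives until the delegated constructor returns, so the members are copied in the initializer list while other.m is held. The private two-argument constructor, the queryCalls counter and the stub expensiveQuery are hypothetical names introduced only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the question's expensive computation.
std::atomic<int> queryCalls{0};
int expensiveQuery() { ++queryCalls; return 42; }

struct Widget {
    Widget() = default;

    // Lock the source, then delegate. The temporary guard is destroyed only
    // after the delegated constructor has finished copying the members.
    Widget(Widget const& other)
        : Widget(other, std::lock_guard<std::mutex>(other.m)) {}

    int getValue() const {
        std::lock_guard<std::mutex> guard{m};
        if (!cacheValid) {
            cachedValue = expensiveQuery();
            cacheValid = true;
        }
        return cachedValue;
    }

private:
    // Hypothetical helper: the unnamed guard parameter only documents that
    // the caller holds other.m. The new object gets its own fresh mutex.
    Widget(Widget const& other, std::lock_guard<std::mutex> const&)
        : cachedValue(other.cachedValue), cacheValid(other.cacheValid) {}

    mutable std::mutex m;
    mutable int cachedValue = 0;
    mutable bool cacheValid = false;
};
```

Because the copy carries the cached value, calling getValue() on the copy does not run expensiveQuery again. A copy assignment operator would have to lock both objects (for example with std::lock), which is not shown here.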
Finally, a user may complain that they are now forced to use the mutex for copies even in contexts where they know it is unnecessary.
And this is where it gets crazy: continuing with this logic (the same as for the member function), one might think of introducing a non-const copy constructor.
struct Widget{
Widget(Widget& other) : cachedValue{other.cachedValue}, cacheValid{other.cacheValid} {}
Widget(Widget const& other) ... // same as above
};
Note that this is not a move constructor, it is something more strange.
(something like the doomed std::auto_ptr)
It is not unusable, but it is very non-standard; for example, the STL containers will never use it when copying containers of Widget.
Logic took me to this ugly place, starting from something solid.
Is this a problem that still waits for a more elegant and general solution?
Is this the real end of C++98 to C++11 transition saga of a class with protected internal state?

Related

Does a move constructor make any sense for a class that has std::threads and std::mutexes as its fields?

Suppose I have a class:
class WonnaBeMovedClass
{
public:
WonnaBeMovedClass(WonnaBeMovedClass&& other);
void start();
private:
void updateSharedData();
std::vector<int> _sharedData;
std::thread _threadUpdate;
std::mutex _mutexShaderData;
//other stuff
};
WonnaBeMovedClass::WonnaBeMovedClass(WonnaBeMovedClass&& other)
{
_sharedData = std::move(other._sharedData);
_threadUpdate = std::move(other._threadUpdate);
_mutexShaderData = std::move(other._mutexShaderData); //won't compile.
}
void WonnaBeMovedClass::start()
{
_threadUpdate = std::thread(&WonnaBeMovedClass::updateSharedData, this);
}
void WonnaBeMovedClass::updateSharedData()
{
std::lock_guard<std::mutex> lockSharedData(_mutexShaderData);
for (auto& value : _sharedData)
{
++value;
}
}
It won't compile because mutex cannot be moved. It doesn't make sense.
Then I thought that it is possible to work around this by using pointers instead of actual variables and came up with the following:
class WonnaBeMovedClass
{
public:
WonnaBeMovedClass(WonnaBeMovedClass&& other);
void start();
private:
void updateSharedData();
std::vector<int> _sharedData;
std::unique_ptr<std::thread> _threadUpdate; // pointer
std::unique_ptr<std::mutex> _mutexShaderData; // pointer
//other stuff
};
WonnaBeMovedClass::WonnaBeMovedClass(WonnaBeMovedClass&& other)
{
_sharedData = std::move(other._sharedData);
_threadUpdate = std::move(other._threadUpdate);
_mutexShaderData = std::move(other._mutexShaderData); // compiles now: only the pointer is moved
}
void WonnaBeMovedClass::start()
{
_threadUpdate = std::make_unique<std::thread>(&WonnaBeMovedClass::updateSharedData, this);
}
void WonnaBeMovedClass::updateSharedData()
{
std::lock_guard<std::mutex> lockSharedData(*_mutexShaderData);
for (auto& value : _sharedData)
{
++value;
}
}
So now when I:
WonnaBeMovedClass object1;
WonnaBeMovedClass object2;
//do stuff
object1 = std::move(object2);
I actually move addresses of both mutex and thread.
It makes more sense now... Or not?
The thread is still working with the data of object1, not object2, so it still doesn't make any sense.
I may have moved the mutex, but the thread is unaware of object2. Or is it?
I am unable to find the answer so I am asking you for help.
Am I doing something completely wrong and copying/moving threads and mutexes is just a bad design and I should rethink the architecture of the program?
Edit:
There was a question about actual purpose of the class. It is actually a TCP/IP client (represented as a class) that holds:
latest data from the server (several data tables, similar to std::vector).
contains methods that manage threads (update state, send/receive messages).
More than one connection could be established at a time, so somewhere in the code there is a std::vector<Client> field that represents all active connections.
Connections are determined by the configuration file.
//read configurations
...
//init clients
for (auto& configuration : _configurations)
{
Client client(configuration);
_activeClients.push_back(client); // this is where compiler reminded me that I am unable to move my object (aka WonnaBeMovedClass object).
}
I've changed _activeClients from std::vector<Client> to std::vector<std::unique_ptr<Client>> and modified initialization code to create pointer objects instead of objects directly and worked around my issues, but the question remained so I decided to post it here.
Let's break the issue in two.
Moving mutexes. This cannot be done because mutexes are normally implemented in terms of OS objects which must have fixed addresses. In other words, the OS (or the runtime library, which is the same as the OS for our purposes) keeps an address of your mutex. This can be worked around by storing (smart) pointers to mutexes in your code, and moving those instead. The mutex itself doesn't move. Thread objects can be moved so there's no issue.
Moving your own data while some active code (a thread or a running function or a std::function stored somewhere or whatever) has the address of your data and can access it. This is actually very similar to the previous case, only instead of the OS it's your own code that holds on to the data. The solution, as before, is in not moving your data. Store and move a (smart) pointer to the data instead.
To summarise,
class WonnaBeMovedClass
{
public:
WonnaBeMovedClass(WonnaBeMovedClass&& other);
void start();
private:
struct tdata {
std::vector<int> _sharedData;
std::thread _threadUpdate;
std::mutex _mutexShaderData;
};
std::shared_ptr<tdata> data;
static void updateSharedData(std::shared_ptr<tdata>);
};
void WonnaBeMovedClass::start()
{
data->_threadUpdate = std::thread(&updateSharedData, data); // the thread object lives inside tdata
}
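To make the idea concrete, here is a minimal self-contained sketch of the same shape. It assumes the thread object can stay in the handle while only the data and the mutex go into the shared state; the names MovableWorker, State, snapshot and join are hypothetical and not from the question. The thread receives a shared_ptr to the state instead of a this pointer, so moving the handle does not invalidate anything the thread is using.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical handle class: the state that the thread touches lives on the
// heap and never moves; only the handle (pointer + thread object) is moved.
class MovableWorker {
public:
    explicit MovableWorker(std::vector<int> initial)
        : state_(std::make_shared<State>()) {
        state_->sharedData = std::move(initial);
    }

    // Moves the pointer and the thread handle; the State itself stays put.
    MovableWorker(MovableWorker&&) = default;

    ~MovableWorker() { join(); }

    // Call at most once per worker.
    void start() { thread_ = std::thread(&MovableWorker::update, state_); }

    void join() {
        if (thread_.joinable()) thread_.join();
    }

    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> lock(state_->mutex);
        return state_->sharedData;
    }

private:
    struct State {
        std::vector<int> sharedData;
        std::mutex mutex;
    };

    // Static: the thread holds a shared_ptr to the state, not a `this` pointer.
    static void update(std::shared_ptr<State> state) {
        std::lock_guard<std::mutex> lock(state->mutex);
        for (auto& value : state->sharedData) ++value;
    }

    std::shared_ptr<State> state_;
    std::thread thread_;
};
```

With this layout a worker can be started, moved into another handle (or into a std::vector) and joined through the new handle, while the thread keeps working on the same State.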
It makes more sense now... Or not?
Not really.
If an std::mutex gets moved, the other threads will not be aware of the modification of memory address of that mutex! This discards thread safety.
However, a solution with std::unique_ptr exists in Copy or Move Constructor for a class with a member std::mutex (or other non-copyable object)?
Last, but not least C++14 seems to have something to bring into the play. Read more in How should I deal with mutexes in movable types in C++?
WonnaBeMovedClass is a handle holding a thread and a mutex, so it is not bad design to give it move semantics (but not copy semantics).
The second solution looks fine, but don't forget about proper resource management for your mutex (construction and destruction). I don't really understand the real-life purpose of the class, so depending on the overall design, it might be better to use shared_ptr instead of unique_ptr (in case multiple WonnaBeMovedClass objects can share the same mutex).
std::thread is itself a handle to a system thread, so it doesn't have to be wrapped in a pointer; the underlying resource (i.e. the OS thread handle) is managed by the standard library itself.
Note that mutexes are actually kernel objects (usually implemented as an opaque pointer, for example in the Windows API), and thus should not be modified or changed by user code in any way.

How can I create a smart pointer that locks and unlocks a mutex?

I have a threaded class from which I would like to occasionally acquire a pointer to an instance variable. I would like this access to be guarded by a mutex so that the thread is blocked from accessing this resource until the client is finished with its pointer.
My initial approach to this is to return a pair of objects: one a pointer to the resource and one a shared_ptr to a lock object on the mutex. This shared_ptr holds the only reference to the lock object so the mutex should be unlocked when it goes out of scope. Something like this:
pair<Resource*, shared_ptr<Lock> > A::getResource()
{
Lock* lock = new Lock(&mMutex);
return pair<Resource*, shared_ptr<Lock> >(
&mResource,
shared_ptr<Lock>(lock));
}
This solution is less than ideal because it requires the client to hold onto the entire pair of objects. Behaviour like this breaks the thread safety:
Resource* r = a.getResource().first;
In addition, my own implementation of this is deadlocking and I'm having difficulty determining why, so there may be other things wrong with it.
What I would like to have is a shared_ptr that contains the lock as an instance variable, binding it with the means to access the resource. This seems like something that should have an established design pattern but having done some research I'm surprised to find it quite hard to come across.
My questions are:
Is there a common implementation of this pattern?
Are there issues with putting a mutex inside a shared_ptr that I'm overlooking that prevent this pattern from being widespread?
Is there a good reason not to implement my own shared_ptr class to implement this pattern?
(NB I'm working on a codebase that uses Qt but unfortunately cannot use boost in this case. However, answers involving boost are still of general interest.)
I'm not sure if there are any standard implementations, but since I like re-implementing stuff for no reason, here's a version that should work (assuming you don't want to be able to copy such pointers):
template<class T>
class locking_ptr
{
public:
locking_ptr(T* ptr, mutex* lock)
: m_ptr(ptr)
, m_mutex(lock)
{
m_mutex->lock();
}
~locking_ptr()
{
if (m_mutex)
m_mutex->unlock();
}
locking_ptr(locking_ptr<T>&& ptr)
: m_ptr(ptr.m_ptr)
, m_mutex(ptr.m_mutex)
{
ptr.m_ptr = nullptr;
ptr.m_mutex = nullptr;
}
T* operator ->()
{
return m_ptr;
}
T const* operator ->() const
{
return m_ptr;
}
private:
// disallow copy/assignment
locking_ptr(locking_ptr<T> const& ptr)
{
}
locking_ptr& operator = (locking_ptr<T> const& ptr)
{
return *this;
}
T* m_ptr;
mutex* m_mutex; // whatever implementation you use
};
You're describing a variation of the EXECUTE AROUND POINTER pattern, described by Kevlin Henney in Executing Around Sequences.
I have a prototype implementation at exec_around.h but I can't guarantee it works correctly in all cases as it's a work in progress. It includes a function mutex_around which creates an object and wraps it in a smart pointer that locks and unlocks a mutex when accessed.
There is another approach here. It is far less flexible and less generic, but also far simpler, and it still seems to fit your exact scenario.
shared_ptr (both standard and Boost) offers means to construct it while providing another shared_ptr instance which will be used for usage counter and some arbitrary pointer that will not be managed at all. On cppreference.com it is the 8th form (the aliasing constructor).
Now, normally, this form is used for conversions - like providing a shared_ptr to base class object from derived class object. They share ownership and usage counter but (in general) have two different pointer values of different types. This form is also used to provide a shared_ptr to a member value based on shared_ptr to object that it is a member of.
Here we can "abuse" the form to provide lock guard. Do it like this:
auto A::getResource()
{
auto counter = std::make_shared<Lock>(&mMutex);
std::shared_ptr<Resource> result{ counter, &mResource };
return result;
}
The returned shared_ptr points to mResource and keeps mMutex locked for as long as it is used by anyone.
The problem with this solution is that it is now your responsibility to ensure that the mResource remains valid (in particular - it doesn't get destroyed) for that long as well. If locking mMutex is enough for that, then you are fine.
Otherwise, above solution must be adjusted to your particular needs. For example, you might want to have the counter a simple struct that keeps both the Lock and another shared_ptr to the A object owning the mResource.
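A rough sketch of that adjustment follows, assuming A is always owned by a shared_ptr so that enable_shared_from_this can be used. The Keeper struct is a hypothetical name for the "counter" object, and std::unique_lock stands in for the Lock class from the question.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct Resource { int value = 0; };

// Assumption: A objects are always created through std::make_shared.
class A : public std::enable_shared_from_this<A> {
public:
    std::shared_ptr<Resource> getResource() {
        auto keeper = std::make_shared<Keeper>();
        keeper->owner = shared_from_this();                   // keep A alive
        keeper->lock = std::unique_lock<std::mutex>(mMutex);  // keep mMutex locked
        // Aliasing constructor: shares ownership of keeper, points at mResource.
        return std::shared_ptr<Resource>(keeper, &mResource);
    }

private:
    // Members are destroyed in reverse order: the lock is released first,
    // and only then is the reference to the owning A dropped.
    struct Keeper {
        std::shared_ptr<A> owner;
        std::unique_lock<std::mutex> lock;
    };

    std::mutex mMutex;
    Resource mResource;
};
```

The declaration order inside Keeper matters: if the owner were released before the lock, the mutex could be destroyed while still locked.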
To add to Adam Badura's answer, for a more general case using std::mutex and std::lock_guard, this worked for me:
auto A::getResource()
{
auto counter = std::make_shared<std::lock_guard<std::mutex>>(mMutex);
std::shared_ptr<Resource> ptr{ counter, &mResource} ;
return ptr;
}
where the lifetimes of std::mutex mMutex and Resource mResource are managed by some class A.

Impossible to be const-correct when combining data and its lock?

I've been looking at ways to combine a piece of data which will be accessed by multiple threads with the lock provisioned for thread-safety. I think I've got to a point where I don't think it's possible to do this whilst maintaining const-correctness.
Take the following class for example:
template <typename TType, typename TMutex>
class basic_lockable_type
{
public:
typedef TMutex lock_type;
public:
template <typename... TArgs>
explicit basic_lockable_type(TArgs&&... args)
: data_(std::forward<TArgs>(args)...) {}
TType& data() { return data_; }
const TType& data() const { return data_; }
void lock() { mutex_.lock(); }
void unlock() { mutex_.unlock(); }
private:
TType data_;
mutable TMutex mutex_;
};
typedef basic_lockable_type<std::vector<int>, std::mutex> vector_with_lock;
In this I try to combine the data and lock, marking mutex_ as mutable. Unfortunately this isn't enough as I see it, because when used, vector_with_lock would have to be marked as mutable in order for a read operation to be performed from a const function, which isn't entirely correct (data_ should not be mutable from a const function).
void print_values() const
{
std::lock_guard<vector_with_lock> lock(values_);
for(const int val : values_.data())
{
std::cout << val << std::endl;
}
}
vector_with_lock values_;
Can anyone see any way around this such that const-correctness is maintained whilst combining data and lock? Also, have I made any incorrect assumptions here?
Personally, I'd prefer a design where you don't have to lock manually, and the data is properly encapsulated in a way that you cannot actually access it without locking first.
One option is to have a friend function apply or something that does the locking, grabs the encapsulated data and passes it to a function object that is run with the lock held within it.
//! Applies a function to the contents of a locker_box
/*! Returns the function's result, if any */
template <typename Fun, typename T, typename BasicLockable>
ResultOf<Fun(T&)> apply(Fun&& fun, locker_box<T, BasicLockable>& box) {
std::lock_guard<BasicLockable> lock(box.lock);
return std::forward<Fun>(fun)(box.data);
}
//! Applies a function to the contents of a locker_box
/*! Returns the function's result, if any */
template <typename Fun, typename T, typename BasicLockable>
ResultOf<Fun(T const&)> apply(Fun&& fun, locker_box<T, BasicLockable> const& box) {
std::lock_guard<BasicLockable> lock(box.lock);
return std::forward<Fun>(fun)(box.data);
}
Usage then becomes:
void print_values() const
{
apply([](std::vector<int> const& the_vector) {
for(const int val : the_vector) {
std::cout << val << std::endl;
}
}, values_);
}
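Since locker_box and ResultOf are not shown above, here is a minimal self-contained sketch of what such a box could look like. It is an assumption about the shape, not the answerer's actual implementation: it uses C++14 decltype(auto) instead of ResultOf, and the friend is named apply_locked (a hypothetical name) so that it cannot collide with std::apply through argument-dependent lookup.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

template <typename T, typename BasicLockable = std::mutex>
class locker_box {
public:
    explicit locker_box(T initial) : data_(std::move(initial)) {}

    // Non-const box: the function object receives T&.
    template <typename Fun>
    friend decltype(auto) apply_locked(Fun&& fun, locker_box& box) {
        std::lock_guard<BasicLockable> guard(box.lock_);
        return std::forward<Fun>(fun)(box.data_);
    }

    // Const box: the function object receives T const&. The lock is mutable,
    // so it can still be taken here.
    template <typename Fun>
    friend decltype(auto) apply_locked(Fun&& fun, locker_box const& box) {
        std::lock_guard<BasicLockable> guard(box.lock_);
        return std::forward<Fun>(fun)(box.data_);
    }

private:
    T data_;                      // reachable only through apply_locked
    mutable BasicLockable lock_;
};
```

The box exposes no other accessor, so every read or write goes through the lock. A function object that returns a reference into the data would still leak it past the lock, so callers should return values.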
Alternatively, you can abuse range-based for loop to properly scope the lock and extract the value as a "single" operation. All that is needed is the proper set of iterators [1]:
for(auto&& the_vector : box.open()) {
// lock is held in this scope
// do our stuff normally
for(const int val : the_vector) {
std::cout << val << std::endl;
}
}
I think an explanation is in order. The general idea is that open() returns a RAII handle that acquires the lock on construction and releases it upon destruction. The range-based for loop will ensure this temporary lives for as long as that loop executes. This gives the proper lock scope.
That RAII handle also provides begin() and end() iterators for a range with the single contained value. This is how we can get at the protected data. The range-based loop takes care of doing the dereferencing for us and binding it to the loop variable. Since the range is a singleton, the "loop" will actually always run exactly once.
The box should not provide any other way to get at the data, so that it actually enforces interlocked access.
Of course one can stow away a reference to the data once the box is open, in a way that the reference is available after the box closes. But this is for protecting against Murphy, not Machiavelli.
The construct looks weird, so I wouldn't blame anyone for not wanting it. On one hand I want to use this because the semantics are perfect, but on the other hand I don't want to because this is not what range-based for is for. On the gripping hand this range-RAII hybrid technique is rather generic and can be easily abused for other ends, but I will leave that to your imagination/nightmares ;) Use at your own discretion.
1 Left as an exercise for the reader, but a short example of such a set of iterators can be found in my own locker_box implementation.
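For readers who do not want to do the exercise, the handle can be sketched roughly as follows. This is an illustration under my own assumptions, not the locker_box implementation referred to above: locked_box and handle are hypothetical names, plain pointers serve as the iterators of the one-element range, and std::unique_lock is used so that the handle can be returned by value before C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

template <typename T>
class locked_box {
public:
    explicit locked_box(T value) : data_(std::move(value)) {}

    // RAII handle: holds the lock and exposes the data as a range of size 1.
    class handle {
    public:
        explicit handle(locked_box& box) : lock_(box.mutex_), data_(&box.data_) {}
        T* begin() const { return data_; }
        T* end() const { return data_ + 1; }  // one past the single element
    private:
        std::unique_lock<std::mutex> lock_;
        T* data_;
    };

    // The range-based for loop keeps the returned temporary alive, and thus
    // the lock held, until the loop finishes.
    handle open() { return handle(*this); }

private:
    T data_;
    std::mutex mutex_;
};
```

Used as for (auto&& v : box.open()) { ... }, the body runs exactly once with the lock held, and the lock is released when the loop ends.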
What do you understand by "const correct"? Generally, I think that there is a consensus for logical const, which means that if the mutex isn't part of the logical (or observable) state of your object, there's nothing wrong with declaring it mutable, and using it even in const functions.
In a sense, whether the mutex is locked or not is part of the observable state of the object -- you can observe it for example by accidentally creating a locking inversion.
That is a fundamental issue with self-locking objects, and I guess one aspect of it does relate to const-correctness.
Either you can change the "lockedness" of the object via a reference-to-const, or else you can't make synchronized accesses via reference-to-const. Pick one, presumably the first.
The alternative is to ensure that the object cannot be "observed" by the calling code while in a locked state, so that the lockedness isn't part of the observable state. But then there's no way for a caller to visit each element in their vector_with_lock as a single synchronized operation. As soon as you call the user's code with the lock held, they can write code containing a potential or guaranteed locking inversion, that "sees" whether the lock is held or not. So for collections this doesn't resolve the issue.

Should I use const to make objects thread-safe?

I wrote a class whose instances may be accessed by several threads. I used a trick to remind users that they have to lock the object before using it. It involves keeping only const instances. When they need to read or modify sensitive data, other classes should call a method (which is const, thus allowed) to get a non-const version of the locked object. Actually it returns a proxy object containing a pointer to the non-const object and a scoped_lock, so it unlocks the object when going out of scope. The proxy object also overloads operator-> so the access to the object is transparent.
This way, shooting oneself in the foot by accessing unlocked objects is harder (there is always const_cast).
"Clever tricks" should be avoided, and this smells bad anyway.
Is this design really bad?
What else can I or should I do?
Edit: Getters are non-const to enforce locking.
Basic problem: a non-const reference may exist elsewhere. If that gets written safely, it does not follow that it can be read safely -- you may look at an intermediate state.
Also, some const methods might (legitimately) modify hidden internal details in a thread-unsafe way.
Analyse what you're actually doing to the object and find an appropriate synchronisation mode.
If your clever container really does know enough about the objects to control all their synchronisation via proxies, then make those objects private inner classes.
This is clever, but unfortunately doomed to fail.
The problem, underlined by spraff, is that you protect against reads but not against writes.
Consider the following sequence:
unsigned getAverageSalary(Employee const& e) {
return e.paid() / e.hired_duration();
}
What happens if paid is incremented by another thread between the two function calls? We get an incoherent value.
The problem is that your scheme does not explicitly enforce locking for reads.
Consider the alternative of a Proxy pattern: The object itself is a bundle of data, all private. Only a Proxy class (friend) can read/write its data, and when initializing the Proxy it grabs the lock (on the mutex of the object) automatically.
class Data {
friend class Proxy;
Mutex _mutex;
int _bar;
};
class Proxy {
public:
Proxy(Data& data): _lock(data._mutex), _data(data) {}
int bar() const { return _data._bar; }
void bar(int b) { _data._bar = b; }
private:
Proxy(Proxy const&) = delete; // disable copy
Lock _lock;
Data& _data;
};
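Applied to the salary example, a sketch with standard types might look like this. Employee's members and the EmployeeProxy name are assumptions made for illustration, with std::mutex and std::lock_guard standing in for Mutex and Lock. The point is that both reads happen under one lock, so the pair of values is coherent.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical data bundle: everything is private, only the proxy gets in.
class Employee {
    friend class EmployeeProxy;
    std::mutex mutex_;
    unsigned paid_ = 0;
    unsigned hiredMonths_ = 1;
};

class EmployeeProxy {
public:
    // Constructing the proxy takes the lock; destroying it releases the lock.
    explicit EmployeeProxy(Employee& e) : lock_(e.mutex_), employee_(e) {}
    EmployeeProxy(EmployeeProxy const&) = delete;
    EmployeeProxy& operator=(EmployeeProxy const&) = delete;

    unsigned paid() const { return employee_.paid_; }
    unsigned hiredMonths() const { return employee_.hiredMonths_; }
    void pay(unsigned amount) { employee_.paid_ += amount; }
    void addMonths(unsigned n) { employee_.hiredMonths_ += n; }

private:
    std::lock_guard<std::mutex> lock_;
    Employee& employee_;
};

// Both reads happen while the same proxy, and therefore the lock, is alive.
unsigned getAverageSalary(Employee& e) {
    EmployeeProxy proxy(e);
    return proxy.paid() / proxy.hiredMonths();
}
```

Only one proxy per object should exist on a thread at a time; creating a second one while the first is alive would try to lock the same std::mutex twice.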
If I wanted to do what you are doing, I would do one of the following.
Method 1:
shared_mutex m; // somewhere outside the class
class A
{
private:
int variable;
public:
void lock() { m.lock(); }
void unlock() { m.unlock(); }
bool is_locked() { return m.is_locked(); }
bool write_to_var(int newvalue)
{
if (!is_locked())
return false;
variable = newvalue;
return true;
}
bool read_from_var(int *value)
{
if (!is_locked() || value == NULL)
return false;
*value = variable;
return true;
}
};
Method 2:
shared_mutex m; // somewhere outside the class
class A
{
private:
int variable;
public:
void write_to_var(int newvalue)
{
m.lock();
variable = newvalue;
m.unlock();
}
int read_from_var()
{
m.lock();
int to_return = variable;
m.unlock();
return to_return;
}
};
The first method is more efficient (not locking-unlocking all the time), however, the program may need to keep checking the output of every read and write to see if they were successful. The second method automatically handles the locking and so the programmer wouldn't even know the lock is there.
Note: This is not code for copy-paste. It shows a concept and sketches how it's done. Please don't comment saying you forgot some error checking somewhere.
This sounds a lot like Alexandrescu's idea with volatile. You're not using the actual semantics of const, but rather exploiting the way the type system uses it. In this regard, I would prefer Alexandrescu's use of volatile: const has very definite and well understood semantics, and subverting them will definitely cause confusion for anyone reading or maintaining the code. volatile is more appropriate, as it has no well defined semantics, and in the context of most applications, is not used for anything else.
And rather than returning a classical proxy object, you should return a smart pointer. You could actually use shared_ptr for this, grabbing the lock before returning the value, and releasing it in the deleter (rather than deleting the object); I rather fear, however, that this would cause some confusion amongst the readers, and I would probably go with a custom smart pointer (probably using shared_ptr with the custom deleter in the implementation). (From your description, I suspect that this is closer to what you had in mind anyway.)
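The shared_ptr variant can be sketched in a few lines. Guarded, Data and access are hypothetical names for illustration, and the sketch assumes the returned pointer is released on the same thread that called access(), since a std::mutex must be unlocked by the thread that locked it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct Data { int value = 0; };

class Guarded {
public:
    // Locks the mutex and returns a pointer whose "deleter" unlocks it.
    std::shared_ptr<Data> access() {
        mutex_.lock();
        // The deleter releases the lock instead of deleting the object.
        // If constructing the shared_ptr throws, the deleter still runs.
        return std::shared_ptr<Data>(&data_, [this](Data*) { mutex_.unlock(); });
    }

private:
    std::mutex mutex_;
    Data data_;
};
```

The lock is held for as long as any copy of the returned pointer exists, which is exactly the kind of hidden behaviour that might confuse readers, as noted above.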

Should mutexes be mutable?

Not sure if this is a style question, or something that has a hard rule...
If I want to keep the public method interface as const as possible, but make the object thread safe, should I use mutable mutexes? In general, is this good style, or should a non-const method interface be preferred? Please justify your view.
The hidden question is: where do you put the mutex protecting your class?
As a summary, let's say you want to read the content of an object which is protected by a mutex.
The "read" method should be semantically "const" because it does not change the object itself. But to read the value, you need to lock a mutex, extract the value, and then unlock the mutex, meaning the mutex itself must be modified, meaning the mutex itself can't be "const".
If the mutex is external
Then everything's ok. The object can be "const", and the mutex doesn't need to be:
Mutex mutex ;
int foo(const Object & object)
{
Lock<Mutex> lock(mutex) ;
return object.read() ;
}
IMHO, this is a bad solution, because anyone could reuse the mutex to protect something else. Including you. In fact, you will betray yourself because, if your code is complex enough, you'll just be confused about what this or that mutex is exactly protecting.
I know: I was victim of that problem.
If the mutex is internal
For encapsulation purposes, you should put the mutex as near as possible to the object it's protecting.
Usually, you'll write a class with a mutex inside. But sooner or later, you'll need to protect some complex STL structure, or whatever thing written by another without mutex inside (which is a good thing).
A good way to do this is to derive the original object with an inheriting template adding the mutex feature:
template <typename T>
class Mutexed : public T
{
public :
Mutexed() : T() {}
// etc.
void lock() { this->m_mutex.lock() ; }
void unlock() { this->m_mutex.unlock() ; }
private :
Mutex m_mutex ;
} ;
This way, you can write:
int foo(const Mutexed<Object> & object)
{
Lock<Mutexed<Object> > lock(object) ;
return object.read() ;
}
The problem is that it won't work because object is const, and the lock object is calling the non-const lock and unlock methods.
The Dilemma
If you believe const is limited to bitwise const objects, then you're screwed, and must go back to the "external mutex solution".
The solution is to admit const is more a semantic qualifier (as is volatile when used as a method qualifier of classes). You are hiding the fact that the class is not fully const, while still making sure to provide an implementation that keeps the promise that the meaningful parts of the class won't be changed when calling a const method.
You must then declare your mutex mutable, and the lock/unlock methods const:
template <typename T>
class Mutexed : public T
{
public :
Mutexed() : T() {}
// etc.
void lock() const { this->m_mutex.lock() ; }
void unlock() const { this->m_mutex.unlock() ; }
private :
mutable Mutex m_mutex ;
} ;
The internal mutex solution is a good one IMHO: having two objects declared next to each other on the one hand, and having them both aggregated in a wrapper on the other hand, is the same thing in the end.
But the aggregation has the following pros:
It's more natural (you lock the object before accessing it)
One object, one mutex. As the code style forces you to follow this pattern, it decreases deadlock risks because one mutex will protect one object only (and not multiple objects you won't really remember), and one object will be protected by one mutex only (and not by multiple mutexes that need to be locked in the right order)
The mutexed class above can be used for any class
So, keep your mutex as near as possible to the mutexed object (e.g. using the Mutexed construct above), and go for the mutable qualifier for the mutex.
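As a concrete sketch with standard types (assuming std::mutex in place of Mutex and std::lock_guard in place of Lock), the wrapper and a const reader could look like this. The IntList alias and countItems function are hypothetical names used only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

template <typename T>
class Mutexed : public T {
public:
    using T::T;           // reuse the wrapped type's constructors
    Mutexed() = default;

    // const, because locking does not change the logical state of T.
    void lock() const { m_mutex.lock(); }
    void unlock() const { m_mutex.unlock(); }

private:
    mutable std::mutex m_mutex;
};

using IntList = Mutexed<std::vector<int>>;

// A reader that only has a reference-to-const can still lock the object.
std::size_t countItems(const IntList& items) {
    std::lock_guard<const IntList> guard(items);
    return items.size();
}
```

std::lock_guard only needs lock() and unlock() on the type it is given, so instantiating it with const IntList works once both methods are const.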
Edit 2013-01-04
Apparently, Herb Sutter has the same viewpoint: His presentation about the "new" meanings of const and mutable in C++11 is very enlightening:
http://herbsutter.com/2013/01/01/video-you-dont-know-const-and-mutable/
[Answer edited]
Basically using const methods with mutable mutexes is a good idea (don't return references by the way, make sure to return by value), at least to indicate they do not modify the object. Mutexes should not be const, it would be a shameless lie to define lock/unlock methods as const...
Actually this (and memoization) are the only fair uses I see of the mutable keyword.
You could also use a mutex which is external to your object: arrange for all your methods to be reentrant, and have the user manage the lock herself : { lock locker(the_mutex); obj.foo(); } is not that hard to type, and
{
lock locker(the_mutex);
obj.foo();
obj.bar(42);
...
}
has the advantage it doesn't require two mutex locks (and you are guaranteed the state of the object did not change).