"Mutable is used to specify that the member does not affect the externally visible state of the class (as often used for mutexes, memo caches, lazy evaluation, and access instrumentation)." [Reference: cv (const and volatile) type qualifiers, mutable specifier]
This sentence made me wonder:
"Guideline: Remember the “M&M rule”: For a member-variable, mutable and mutex (or atomic) go together." [Reference: GotW #6a Solution: Const-Correctness, Part 1 (updated for C ++11/14)]
I understand why the “M&M rule” applies to a std::mutex data member: it allows const member functions to be thread-safe even though they lock/unlock the mutex data member. But does the “M&M rule” also apply to std::atomic data members?
You got it partly backwards. The article does not suggest making all atomic members mutable. Instead it says:
(1) For a member variable, mutable implies mutex (or equivalent): A
mutable member variable is presumed to be a mutable shared variable
and so must be synchronized internally—protected with a mutex, made
atomic, or similar.
(2) For a member variable, mutex (or similar synchronization type)
implies mutable: A member variable that is itself of a synchronization
type, such as a mutex or a condition variable, naturally wants to be
mutable, because you will want to use it in a non-const way (e.g.,
take a std::lock_guard) inside concurrent const member
functions.
(2) says that you want a mutex member to be mutable, because typically you also want to lock the mutex in const methods. (2) does not mention atomic members.
(1) on the other hand says that if a member is mutable, then you need to take care of synchronization internally, be it via a mutex or by making the member an atomic. That is because of the bullets the article mentions before:
If you are implementing a type, unless you know objects of the type can never be shared (which is generally impossible), this means that each of your const member functions must be either:
truly physically/bitwise const with respect to this object, meaning that they perform no writes to the object’s data; or else
internally synchronized so that if it does perform any actual writes to the object’s data, that data is correctly protected with a mutex or equivalent (or if appropriate are atomic<>) so that any possible concurrent const accesses by multiple callers can’t tell the difference.
A member that is mutable is not "truly const", hence you need to take care of synchronization internally (either via a mutex or by making the member atomic).
TL;DR: The article does not suggest making all atomic members mutable. Rather, it suggests making mutex members mutable and using internal synchronization for all mutable members.
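To make the two directions concrete, here is a minimal sketch (class and member names are illustrative, not from the article): the mutex member is mutable because const functions need to lock it, and the mutable hit counter is atomic so that const functions can update it without introducing a data race.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <utility>

class Widget {
    mutable std::mutex m_;              // (2): a mutex member wants to be mutable
    std::string name_;                  // observable state, protected by m_
    mutable std::atomic<int> hits_{0};  // (1): a mutable member must be synchronized
public:
    std::string name() const {
        std::lock_guard<std::mutex> lock(m_); // locking mutates m_, hence mutable
        ++hits_;                              // mutable write, safe because atomic
        return name_;
    }
    int hits() const { return hits_.load(); }
    void set_name(std::string n) {
        std::lock_guard<std::mutex> lock(m_);
        name_ = std::move(n);
    }
};
```

Both const member functions stay "logically const": callers cannot observe any difference besides the instrumentation counter, yet concurrent const calls are safe.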
Related
According to Herb Sutter (http://isocpp.org/blog/2012/12/you-dont-know-const-and-mutable-herb-sutter), in C++11 const methods must not alter the object bit-wise, or must perform internal synchronization (e.g. using a mutex) if they have mutable data members.
Suppose I have a global object that I'm accessing from multiple threads and suppose it has mutable members. For the sake of argument, let's assume that we cannot modify the source of the class (it's provided by a third-party).
In C++98 these threads would use a global mutex to synchronize access to this object. So, an access would require a single mutex lock/unlock.
However, in C++11, any const member function call on this object will invoke internal synchronization as well, so potentially, a single const function call on this object will cost 2 lock/unlock operations (or more, depending on how many functions you call from a single thread). Note that the global mutex is still needed, because const doesn't seem to do anything for writers (except possibly slowing them down as well, if one of the non-const methods calls a const method).
So, my question is: If all of our classes have to be this way in C++ (at least to be usable by STL), doesn't this lead to excessive synchronization measures?
Thanks
Edit: Some clarifications:
It seems that in C++11, you cannot use a class with the standard library unless its const member functions are internally synchronized (or do not perform any writes).
While C++11 doesn't automatically add any synchronization code itself, a standard-library-compliant class doesn't need synchronization in C++98, but needs it in C++11. So, in C++98 you can get away with not doing any internal synchronization for mutable members, but in C++11 you can't.
in C++11, any const member function call on this object will invoke internal synchronization as well
Why? That synchronisation doesn't just magically appear in the class, it's only there if someone adds it explicitly.
so potentially, a single const function call on this object will cost 2 lock/unlock operations
Only if someone has added an internal mutex to it and you also use an external one ... but why on earth would you do that?
Note that the global mutex is still needed, because const doesn't seem to do anything for writers (except possibly slowing them down as well, if one of the non-const methods calls a const method).
If the class has an internal mutex that's used to make the const members thread-safe then it could also be used for non-const members. If the class doesn't have an internal mutex, then the situation is identical to the C++98 one.
I think you're seeing a problem that doesn't exist.
Herb's "new meaning for const" is not enforced by the language or compiler, it's just design guidance, i.e. an idiom for good code. To follow that guidance you don't add mutexes to every class so const members are allowed to modify mutable members, you avoid mutable members! In the rare cases where you absolutely must have mutable members, either require users to do their own locking (and clearly document the class as requiring external synchronisation) or add internal synchronisation and pay the extra cost ... but those situations should be rare, so it's not true that "C++11 objects are slower because of the new const" because most well-designed objects don't have mutable members anyway.
Yes, you are absolutely correct. You should make your objects follow these guidelines, and therefore access to them will potentially be slower in C++11. If and only if:
The class has mutable members which const member functions modify.
The object is being accessed from multiple threads.
If you ensure that at least one of these is untrue, then nothing changes. The number of objects that are being accessed from multiple threads should always be as minimal as possible. And the number of classes that have mutable members should be minimal. So you're talking about a minimal set of a minimal set of objects.
And even then... all that is required is that data races will not be broken. Depending on what the mutable data even is, this could simply be an atomic access.
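As an example of the "simply an atomic access" case, a lazily computed cache can be a single atomic member (a sketch; the checksum is a toy function and the class name is illustrative). Racing threads may each compute the value, but they all store the same result, and atomic accesses never form a data race:

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

class Record {
    std::string data_;
    mutable std::atomic<long> checksum_{-1}; // -1 means "not yet computed"
public:
    explicit Record(std::string d) : data_(std::move(d)) {}
    long checksum() const {                  // logically const, mutates only the cache
        long c = checksum_.load(std::memory_order_relaxed);
        if (c == -1) {
            c = 0;
            for (unsigned char ch : data_) c += ch; // toy checksum, never -1 here
            // Concurrent callers may race to store, but they store the same value,
            // and each store is atomic, so there is no undefined behavior.
            checksum_.store(c, std::memory_order_relaxed);
        }
        return c;
    }
};
```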
I fail to see the problem here. Few of the standard library objects will have mutable members. I defy you to find a reasonable implementation of basic_string, vector, map, etc. that needs mutable members.
It seems that in C++11, you cannot use a class with the standard library unless its const member functions are internally synchronized (or do not perform any writes).
This is incorrect. You most certainly can. What you cannot do is attempt to access that class across multiple threads in a way that would "perform any writes" on those mutable members. If you never access that object through that C++11 class across threads in that particular way, you're fine.
So yes, you can use them. But you only get the guarantees that your own class provides. If you use your class through a standard library class in an unreasonable way (like your const member functions not being const or properly synchronized), then that's your fault, not the library's.
So, in C++98 you can get away with not doing any internal synchronization for mutable members, but in C++11 you can't.
That's like saying you can get away with computer crime back in the Roman Empire. Of course you can. They didn't have computers back then; so they didn't know what computer crime was.
C++98/03 did not have the concept of "threading". Thus, the standard had no concept of "internal synchronization", so what you could or could not "get away with" was neither defined nor undefined. It made no more sense to ask that question of the standard than to ask what the hacking laws were in Caesar's day.
Now that C++11 actually defines this concept and the idea of a race condition, C++11 is able to say when you can "get away with not doing any internal synchronization".
Or, to put it another way, here is how the two standards answer your question: What is the result of a potential data race on a mutable member when accessed via a member function declared const in the standard library?
C++11: There will be no data races on any internal members when accessed by a const function. All standard library implementations of such functions must be implemented in such a way that a data race cannot occur.
C++98/03: What's a data race?
I was surprised this didn't show up in my search results, I thought someone would've asked this before, given the usefulness of move semantics in C++11:
When do I have to (or is it a good idea for me to) make a class non-movable in C++11?
(Reasons other than compatibility issues with existing code, that is.)
Herb's answer (before it was edited) actually gave a good example of a type which shouldn't be movable: std::mutex.
The OS's native mutex type (e.g. pthread_mutex_t on POSIX platforms) might not be "location invariant", meaning the object's address is part of its value; for example, the OS might keep a list of pointers to all initialized mutex objects. If std::mutex contained a native OS mutex type as a data member and the native type's address had to stay fixed (because the OS maintains a list of pointers to its mutexes), then either std::mutex would have to store the native mutex type on the heap, so it would stay at the same location when moved between std::mutex objects, or std::mutex must not move. Storing it on the heap isn't possible, because a std::mutex has a constexpr constructor and must be eligible for constant initialization (i.e. static initialization), so that a global std::mutex is guaranteed to be constructed before the program's execution starts; its constructor therefore cannot use new. So the only option left is for std::mutex to be immovable.
The same reasoning applies to other types that contain something that requires a fixed address. If the address of the resource must stay fixed, don't move it!
There is another argument for not moving std::mutex: it would be very hard to do safely, because you'd need to know that no one is trying to lock the mutex at the moment it's being moved. Since mutexes are one of the building blocks you can use to prevent data races, it would be unfortunate if they weren't safe against races themselves! With an immovable std::mutex you know that the only things anyone can do to it, once it has been constructed and before it has been destroyed, are lock it and unlock it, and those operations are explicitly guaranteed to be thread-safe and not to introduce data races. This same argument applies to std::atomic<T> objects: unless they could be moved atomically it wouldn't be possible to move them safely; another thread might be trying to call compare_exchange_strong on the object right at the moment it's being moved. So another case where types should not be movable is where they are low-level building blocks of safe concurrent code and must ensure atomicity of all operations on them. If the object's value might be moved to a new object at any time, you'd need an atomic variable to protect every atomic variable so you know whether it's safe to use or has been moved... and an atomic variable to protect that atomic variable, and so on...
I think I would generalize to say that when an object is just a pure piece of memory, not a type which acts as a holder for a value or abstraction of a value, it doesn't make sense to move it. Fundamental types such as int can't move: moving them is just a copy. You can't rip the guts out of an int, you can copy its value and then set it to zero, but it's still an int with a value, it's just bytes of memory. But an int is still movable in the language terms because a copy is a valid move operation. For non-copyable types however, if you don't want to or can't move the piece of memory and you also can't copy its value, then it's non-movable. A mutex or an atomic variable is a specific location of memory (treated with special properties) so doesn't make sense to move, and is also not copyable, so it's non-movable.
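These claims are easy to check mechanically with type traits; the following compiles under C++11 and fails to compile if any of the properties did not hold:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <type_traits>

// int is "movable" in language terms because a copy is a valid move operation;
// mutexes and atomics are neither copyable nor movable.
static_assert(std::is_move_constructible<int>::value,
              "int moves via copy");
static_assert(!std::is_move_constructible<std::mutex>::value,
              "std::mutex is immovable");
static_assert(!std::is_copy_constructible<std::mutex>::value,
              "std::mutex is non-copyable");
static_assert(!std::is_move_constructible<std::atomic<int>>::value,
              "std::atomic is immovable");
static_assert(!std::is_copy_constructible<std::atomic<int>>::value,
              "std::atomic is non-copyable");
```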
Short answer: If a type is copyable, it should also be moveable. However, the reverse is not true: some types like std::unique_ptr are moveable yet it doesn't make sense to copy them; these are naturally move-only types.
Slightly longer answer follows...
There are two major kinds of types (among other more special-purpose ones such as traits):
Value-like types, such as int or vector<widget>. These represent values, and should naturally be copyable. In C++11, generally you should think of move as an optimization of copy, and so all copyable types should naturally be moveable... moving is just an efficient way of doing a copy in the often-common case that you don't need the original object any more and are just going to destroy it anyway.
Reference-like types that exist in inheritance hierarchies, such as base classes and classes with virtual or protected member functions. These are normally held by pointer or reference, often a base* or base&, and so do not provide copy construction to avoid slicing; if you do want to get another object just like an existing one, you usually call a virtual function like clone. These do not need move construction or assignment for two reasons: They're not copyable, and they already have an even more efficient natural "move" operation -- you just copy/move the pointer to the object and the object itself doesn't have to move to a new memory location at all.
Most types fall into one of those two categories, but there are other kinds of types too that are also useful, just rarer. In particular here, types that express unique ownership of a resource, such as std::unique_ptr, are naturally move-only types, because they are not value-like (it doesn't make sense to copy them) but you do use them directly (not always by pointer or reference) and so want to move objects of this type around from one place to another.
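A short sketch of the move-only category (the `transfer` function is just an illustration): std::unique_ptr can hand ownership from one object to another, but any attempt to copy it fails to compile.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Takes ownership of a heap int and returns it; demonstrates that a
// move-only type can still be passed and returned by value.
std::unique_ptr<int> transfer(std::unique_ptr<int> p) {
    // std::unique_ptr<int> q = p; // would not compile: copy is deleted
    return p;                      // implicitly moved on return
}
```

A moved-from unique_ptr is guaranteed to be empty, which is what makes the ownership transfer observable.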
Actually, when I searched around, I found quite a few types in C++11 that are not movable:
all mutex types (mutex, recursive_mutex, timed_mutex, recursive_timed_mutex)
condition_variable
type_info
error_category
locale::facet
random_device
seed_seq
ios_base
basic_istream<charT,traits>::sentry
basic_ostream<charT,traits>::sentry
all atomic types
once_flag
Apparently there was a discussion about this on comp.std.c++: https://groups.google.com/forum/?fromgroups=#!topic/comp.std.c++/pCO1Qqb3Xa4
Another reason I've found - performance.
Say you have a class 'a' which holds a value.
You want to expose an interface that allows a user to change the value for a limited time (for a scope).
A way to achieve this is by returning a 'scope guard' object from 'a' which sets the value back in its destructor, like so:
class a
{
    int value = 0;
public:
    struct change_value_guard
    {
        friend a;
    private:
        change_value_guard(a& owner, int value)
            : owner{ owner }
        {
            owner.value = value;
        }
        change_value_guard(change_value_guard&&) = delete;
        change_value_guard(const change_value_guard&) = delete;
    public:
        ~change_value_guard()
        {
            owner.value = 0;
        }
    private:
        a& owner;
    };

    change_value_guard changeValue(int newValue)
    {
        return { *this, newValue };
    }
};

int main()
{
    a a;
    {
        auto guard = a.changeValue(2);
    }
}
If I made change_value_guard movable, I'd have to add an 'if' to its destructor that would check if the guard has been moved from - that's an extra if, and a performance impact.
Yeah, sure, it can probably be optimized away by any sane optimizer, but it's still nice that the language does not require us to pay for that if when we're not going to move the guard anyway, other than returning it from the creating function (the don't-pay-for-what-you-don't-use principle). Note that this requires C++17: returning a non-movable type relies on guaranteed copy elision.
I am trying to use the std::atomic library.
What's the difference between specialized and non-specialized atomic
member functions?
What's the difference (if there is any) between the following functions?
operator= stores a value into an atomic object (public member function) v.s. store (C++11) atomically replaces the value of the atomic object with a non-atomic argument (public member function)
operator T() loads a value from an atomic object (public member function) v.s. load (C++11) atomically obtains the value of the atomic object (public member function).
operator+= v.s. fetch_add
operator-= v.s. fetch_sub
operator&= v.s. fetch_and
operator|= v.s. fetch_or
operator^= v.s. fetch_xor
What's the downside of declaring a variable as atomic vs. a non-atomic variable? For example, what's the downside of std::atomic<int> x vs. int x? In other words, how much is the overhead of an atomic variable?
Which one has more overhead: an atomic variable, or a normal variable protected by a mutex?
Here is the reference for my questions: http://en.cppreference.com/w/cpp/atomic/atomic
Not an expert, but I'll try:
The specializations (for built-in types such as int) contain additional operations such as fetch_add. Non-specialized forms (user defined types) will not contain these.
operator= returns its argument, store does not. Also, non-operators allow you to specify a memory order. The standard says operator= is defined in terms of store.
Same as above, although it returns the value of load.
Same as above
Same as above
Same as above
Same as above
Same as above
Same as above
They do different things. It's undefined behavior to use an int in the way you would use std::atomic_int.
You can assume the overhead ordering is int <= std::atomic<int> <= (int protected by std::mutex), where <= means 'less (or equal) overhead'. So an atomic is likely cheaper than locking with a mutex (especially for built-in types), but more expensive than a plain int.
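A small sketch of why the atomic is needed at all (the function name is illustrative): two threads increment a shared counter, and with std::atomic<int> the result is exact. Replacing the atomic with a plain int would make the same loop a data race, i.e. undefined behavior.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Increment a shared counter from two threads, per_thread times each.
// Each ++counter is a single atomic read-modify-write, so no increments
// are lost. With a plain int this would be a data race (UB).
int count_to(int per_thread) {
    std::atomic<int> counter{0};
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) ++counter;
    };
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    return counter.load();
}
```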
What's the difference between specialized and non-specialized atomic member functions?
As can be seen in the synopsis of these classes in the standard (§29.5), there are three different sets of member functions:
the most generic one provides only store, load, exchange, and compare-exchange operations;
the specializations for integral types provide atomic arithmetic and bitwise operations, in addition to the generic ones;
the specialization for pointers provides pointer arithmetic operations in addition to the generic ones.
What's the difference (if there is any) between the following functions?
operator= stores a value into an atomic object (public member function) v.s. store (C++11) atomically replaces the value of the atomic object with a non-atomic argument (public member function)
(...)
The main functional difference is that the non-operator versions (§29.6.5, paragraphs 9-17 and more) have an extra parameter for specifying the desired memory ordering (§29.3/1). The operator versions use the sequential consistency memory ordering:
void A::store(C desired, memory_order order = memory_order_seq_cst) volatile noexcept;
void A::store(C desired, memory_order order = memory_order_seq_cst) noexcept;
Requires: The order argument shall not be memory_order_consume, memory_order_acquire, nor memory_order_acq_rel.
Effects: Atomically replaces the value pointed to by object or by this with the value of desired. Memory is affected according to the
value of order.
C A::operator=(C desired) volatile noexcept;
C A::operator=(C desired) noexcept;
Effects: store(desired)
Returns: desired
The non-operator forms are advantageous because sequential consistency is not always necessary, and it is potentially more expensive than the other memory orderings. With careful analysis one can find out what are the minimal guarantees needed for correct operation and select one of the less restrictive memory orderings, giving more leeway to the optimizer.
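A sketch of that leeway in practice (the function names are illustrative): the operator forms always use memory_order_seq_cst, while the named functions let you request a cheaper release/acquire pairing when that is all the correctness argument requires.

```cpp
#include <atomic>
#include <cassert>

std::atomic<bool> ready{false};
std::atomic<int>  payload{0};

// Writer: the relaxed store of payload is made visible to readers by the
// release store of ready. Using operator= here would force seq_cst instead.
void publish(int value) {
    payload.store(value, std::memory_order_relaxed);
    ready.store(true, std::memory_order_release);
}

// Reader: the acquire load pairs with the release store above, so once
// ready is seen true, the payload value is guaranteed to be visible.
int consume() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    return payload.load(std::memory_order_relaxed);
}
```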
What's the downside of declaring a variable as atomic vs. a non-atomic variable? For example, what's the downside of std::atomic<int> x vs. int x? In other words, how much is the overhead of an atomic variable?
Using an atomic variable when a regular variable would suffice limits the number of possible optimizations, because atomic variables impose additional constraints of indivisibility and (possibly) memory ordering.
Using a regular variable when an atomic variable is needed may introduce data races, and that makes the behaviour undefined (§1.10/21):
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
The overhead of an atomic variable is a matter of quality of implementation. Ideally, an atomic variable has zero overhead when you need atomic operations. When you don't need atomic operations, whatever overhead it may have is irrelevant: you just use a regular variable.
Which one has more overhead: an atomic variable, or a normal variable protected by a mutex?
There's no reason for an atomic variable to have more overhead than a normal variable protected by a mutex: worst case scenario, the atomic variable is implemented just like that. But there is a possibility that the atomic variable is lock-free, which would involve less overhead. This property can be ascertained with the functions described in the standard in §29.6.5/7:
bool atomic_is_lock_free(const volatile A *object) noexcept;
bool atomic_is_lock_free(const A *object) noexcept;
bool A::is_lock_free() const volatile noexcept;
bool A::is_lock_free() const noexcept;
Returns: True if the object’s operations are lock-free, false otherwise.
I'm not an expert on this stuff, but if I understand correctly, the non-specialized operations in your reference do one thing atomically: load, store, exchange, etc.
The specialized functions (such as fetch_add) do two things atomically: they modify the atomic object and return its previous value, in such a way that both operations happen before any other thread could interfere.
From my understanding of the mutable keyword, one of its primary uses is caching data and computing it lazily when needed. Since such members can change (even though the object is const), wouldn't it be unsafe or pointless to use them? The caching part modifies the data, so there would need to be a lock, and from my understanding, when you write for multiple threads the data should NEVER change; copies should be made and returned/chained together.
So is it pointless or bad to use C++'s mutable keyword?
So is it pointless or bad to use C++'s mutable keyword?
No; the mutable keyword is A Good Thing. mutable can be used to separate the observable state of an object from the internal contents of the object.
With the "cached data" example that you describe (a very common use of mutable), it allows the class to perform optimizations "under the covers" that don't actually modify the observable state.
With respect to accessing an object from multiple threads, yes, you have to be careful. In general, if a class is designed to be accessed from multiple threads and it has mutable variables, it should synchronize modification of those variables internally. Note, however, that the problem is really more a conceptual one. It's easy to reason that:
All of my threads only call const member functions on this shared object
Const member functions do not modify the object on which they are called
If an object is not modified, I don't need to synchronize access to it
Therefore, I don't need to synchronize access to this object
This argument is wrong because (2) is false: const member functions can indeed modify mutable data members. The problem is that it's really, really easy to think that this argument is right.
The solution to this problem isn't easy: effectively, you just have to be extremely careful when writing multithreaded code and be absolutely certain that you understand either how objects being shared between threads are implemented or what concurrency guarantees they give.
On the opposite end, most of my multithreaded code requires the use of the mutable keyword:
class object {
    type m_data;
    mutable mutex m_mutex;
public:
    void set( type const & value ) {
        scoped_lock lock( m_mutex );
        m_data = value;
    }
    type get() const {
        scoped_lock lock( m_mutex );
        return m_data;
    }
};
The fact that the get method does not modify the state of the object is declared by means of the const keyword. But without the mutable modifier applied to the declaration of the mutex attribute, the code would not be able to lock or release the mutex: both operations clearly modify the mutex, even if they do not modify the object.
You can even make the data attribute mutable if it can be lazily evaluated and the cost is high, as long as you do lock the object. This is the cache usage that you refer to in the question.
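A sketch of that lazy-evaluation pattern (the class and member names are illustrative): the cached result and its validity flag are both mutable, and the same mutable mutex guards them inside the const accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>

class Report {
    std::string raw_;
    mutable std::mutex m_;
    mutable bool cached_ = false;        // mutable cache state...
    mutable std::size_t words_ = 0;      // ...guarded by the mutable mutex
public:
    explicit Report(std::string raw) : raw_(std::move(raw)) {}

    // Logically const: the observable result never changes, only the cache does.
    std::size_t word_count() const {
        std::lock_guard<std::mutex> lock(m_);
        if (!cached_) {
            bool in_word = false;        // count space-separated words once
            for (char c : raw_) {
                if (c != ' ' && !in_word) { ++words_; in_word = true; }
                else if (c == ' ') in_word = false;
            }
            cached_ = true;
        }
        return words_;
    }
};
```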
The mutable modifier is not a problem with multithreaded code, only when you try to do lock-less multithreading. And as with all lock-less programming, you must be very careful with what you do, regardless of const or mutable. You can write perfectly unsafe multithreaded code that calls const methods on objects with no mutable attributes. A simple example would be removing the mutex from the previous code and having N threads perform only get()s while another thread performs set()s. The fact that get() is const is no guarantee that you will not get invalid results while another thread is modifying the object.
No, the mutable keyword is so that you can have fields inside an object that can change even when the object is const, such as for metadata that isn't part of an object's properties but its management (such as counters, etc.). It has nothing to do with threading.
I want to execute a read-only method on an object marked as const, but in order to do this thread-safely, I need to lock a readers-writer mutex:
const Value Object::list() const {
    ScopedRead lock(children_);
    ...
}
But this breaks because the compiler complains about children_ being const and such. I went up to the ScopedRead class and then up to the RWMutex class (of which children_ is a subclass) to allow read_lock on a const object, but I had to write this:
inline void read_lock() const {
    pthread_rwlock_rdlock(const_cast<pthread_rwlock_t*>(&rwlock_));
}
I have always learned that const_cast is a code smell. Any way to avoid this ?
Make the lock mutable
mutable pthread_rwlock_t rwlock;
This is a common scenario in which mutable is used. A read-only query of an object is (as the name implies) an operation that should not require non-const access. mutable is considered good practice when you want to be able to modify parts of an object that aren't visible or don't have observable side-effects on the object. Your lock is used to ensure sequential access to the object's data, and changing it doesn't affect the data contained within the object nor have observable side-effects on later calls, so it is still honoring the const-ness of the object.
Make the lock mutable.
Yes, use mutable. It's designed for this very purpose: Where the entire context of the function is const (i.e. an accessor or some other logically read-only action.) but where some element of writable access is needed for a mutex or reference counter etc.
The function should be const, even if it does lock a mutex internally. Doing so makes the code thread-neutral without having to expose the details, which I presume is what you're trying to do.
There are very few places where const_cast<> needs to be legitimately used, and this isn't one of them. Using const_cast on an object, especially in a const function, is a code maintenance nightmare. Consider:
token = strtok_r( const_cast<char*>( ref_.c_str() ), ":", &saveptr );
In fact, I'd argue that when you see const_cast in a const function, you should start by making the function non-const (very soon after you should get rid of the const_cast and make the function const again though)
Well, if we are not allowed to modify the declaration of the variable, then const_cast comes to the rescue. If not, making it mutable is the solution.
To solve the actual problem, declare the lock as mutable.
The following is now my professional opinion:
The compiler is right to complain, and you are right to find this mildly offensive. If performing a read-only operation requires a lock, and locks must be writeable to lock, then you should probably make the read-only query require non-const access.
EDIT: Alright, I'll bite. I've seen this kind of pattern cause major perf hits in places you would not expect. Does anyone here know how tolower or toupper can become a major bottleneck if called frequently enough, even with the default ASCII locale? In one particular implementation of the C runtime library built for multithreading, there was a lock taken to query the current locale for that thread. Calling tolower on the order of 10000 times or more resulted in more of a perf hit than reading a file from disk.
Just because you want read-only access doesn't mean that you should hide the fact that you need to lock to get it.