I'm reading Functional Programming in C++ from Ivan Čukić, and I am having a hard time interpreting a point in the summary of Chapter 5:
When you make a member function const, you promise that the function won't change any data in the class (not a bit of the object will change), or that any changes to the object (to members declared as mutable) will be atomic as far as the users of the object are concerned.
If the part in italics were simply "are limited to members declared as mutable", I would have been happy with it. However, this rewording of mine seems to correspond to what the author put in parentheses. What is outside the parentheses is what puzzles me: what is the meaning of atomic in that sentence?
The author is making a claim about best practices, not about the rules of the language.
You can write a class in which const methods alter mutable members in ways that are visible to the user, like this:
struct S {
mutable int value = 0;
int get() const {
return value++;
}
};
const S s;
std::cout << s.get(); // prints 0
std::cout << s.get(); // prints 1
// etc
You can do that, and it wouldn't break any of the rules of the language. However, you shouldn't. It violates the user's expectation that the const method should not change the internal state in an observable way.
There are legitimate uses for mutable members, such as memoization that can speed up subsequent executions of a const member function.
The author suggests that, as a matter of best practices, such uses of mutable members by const member functions should be atomic, since users are likely to expect that two different threads can call const member functions on an object concurrently.
If you violate this guideline, then you're not directly breaking any rules of the language. However, it makes it likely that users will use your class in a way that will cause data races (which are undefined behaviour). It takes away the user's ability to use the const qualifier to reason about the thread-safety of your class.
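As a rough illustration of what "atomic as far as the users are concerned" could look like, here is a minimal sketch that assumes the mutable state is a simple read counter. The class name CountedValue and its members are hypothetical and do not come from the book; std::atomic is used so that concurrent calls to the const member functions do not race on the counter.

```cpp
#include <atomic>

// Hypothetical example: a const getter that updates a mutable counter atomically.
class CountedValue {
public:
    explicit CountedValue(int v) : value_(v) {}

    // Logically const: the returned value never changes.
    int get() const {
        reads_.fetch_add(1, std::memory_order_relaxed);  // atomic increment
        return value_;
    }

    // Diagnostic only: how many times get() has been called so far.
    int reads() const { return reads_.load(std::memory_order_relaxed); }

private:
    int value_;
    mutable std::atomic<int> reads_{0};
};
```

Two threads can call get() on the same const CountedValue at the same time without a data race, which is the expectation the guideline is trying to protect.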
or that any changes to the object (to members declared as mutable)
will be atomic as far as the users of the object are concerned.
I think the author (or editor) of the book worded that statement poorly. const and mutable make no guarantees about thread-safety; indeed, they were part of the language back when it had no support for multithreading (i.e. before multithreading was specified in the C++ standard, when anything you did with threads in a C++ program was technically undefined behavior).
I think what the author intended to convey is that changes to mutable member-variables from within a const-tagged method should be limited to changes that don't alter the object's state as far as the calling code can tell. The classic example of this would be memo-ization of an expensive computation for future reference, e.g.:
class ExpensiveResultGenerator
{
public:
ExpensiveResultGenerator()
: _cachedInputValue(-1)
{
}
float CalculateResult(int inputValue) const
{
if ((_cachedInputValue < 0)||(_cachedInputValue != inputValue))
{
_cachedInputValue = inputValue;
_cachedResult = ReallyCPUExpensiveCalculation(inputValue);
}
return _cachedResult;
}
private:
float ReallyCPUExpensiveCalculation(int inputValue) const
{
// Code that is really expensive to calculate the value
// corresponding to (inputValue) goes here....
[...]
return computedResult;
}
mutable int _cachedInputValue;
mutable float _cachedResult;
};
Note that as far as the code using the ExpensiveResultGenerator class is concerned, CalculateResult(int) const doesn't change the state of the ExpensiveResultGenerator object; it is simply computing a mathematical function and returning the result. But internally we are making a memo-ization optimization so that if the user calls CalculateResult(x) with the same value for x multiple times in a row, we can skip the expensive calculation after the first time and just return the _cachedResult instead, for a speedup.
Of course, making that memo-ization optimization can introduce race conditions in a multi-threaded environment, since now we are changing state variables even if the calling code can't see us doing it. So to do this safely in a multithreaded environment, you would need to employ a Mutex of some sort to serialize accesses to the two mutable variables -- either that, or require the calling code to serialize any calls to CalculateResult().
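One possible way to do the serialization inside the class is sketched below. It is only an assumption about how the mutex could be wired in: the class name LockedResultGenerator, the StandInCalculation placeholder and the CalculationCount helper are made up for illustration and are not part of the original class.

```cpp
#include <mutex>

// Hypothetical thread-safe variant of the memoizing class sketched above.
class LockedResultGenerator {
public:
    float CalculateResult(int inputValue) const {
        std::lock_guard<std::mutex> lock(_mutex);  // serialize access to the cache
        if (!_hasCachedResult || _cachedInputValue != inputValue) {
            _cachedInputValue = inputValue;
            _cachedResult = StandInCalculation(inputValue);
            _hasCachedResult = true;
            ++_calculationCount;
        }
        return _cachedResult;
    }

    // For demonstration: number of times the calculation actually ran.
    int CalculationCount() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _calculationCount;
    }

private:
    // Stand-in for the expensive calculation (assumption: any pure function).
    static float StandInCalculation(int inputValue) { return inputValue * 0.5f; }

    mutable std::mutex _mutex;  // the mutex itself must be mutable too
    mutable bool _hasCachedResult = false;
    mutable int _cachedInputValue = 0;
    mutable float _cachedResult = 0.0f;
    mutable int _calculationCount = 0;
};
```

Holding the lock for the whole calculation keeps the sketch simple, at the cost of blocking other callers while the expensive work runs.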
Related
Today I learned about the mutable keyword in C++ and would like to use it in my code.
I have a class with many const methods, and one of them should be able to modify some of the object's variables (preserving the logical state of the object). However, I don't want to let all the const methods modify the variables, only the selected one. Is there any way of doing that? Maybe with const_cast?
(The code I am talking about is an implementation of the Union-Find structure. The Find operation does not change the logical state of the structure (it only searches for a root of a tree), but changes the physical state by doing so-called path compression)
Thanks!
EDIT: I have added an excerpt from the code I am referring to:
class UnionFind {
public:
void Union(int a, int b) {...}
int Find(int x) const {
// logically, this method is const
while(x != parents[x]) {
// path compression
// the next three lines modify parents and sizes,
// but the logical state of the object is not changed
sizes[parents[x]] -= sizes[x];
sizes[parents[parents[x]]] += sizes[x];
parents[x] = parents[parents[x]];
x = parents[x];
}
return x;
}
int someOtherMethodThatAccessesParents() const {
// this method does access parents, but read only.
// I would prefer if parents behaved like if it was
// not 'mutable' inside this method
...
}
private:
// these have to be mutable if I want the Find method
// to be marked const (as it should be)
// but making them mutable then does not enforce
// the physical non-mutability in other const methods :(
mutable std::vector<int> parents;
mutable std::vector<int> sizes;
};
On first glance this can't be achieved unless you use a nasty const_cast. But don't do that since the behaviour on attempting to modify a variable following a const_cast that was originally declared as const is undefined.
However, it might be feasible to achieve what you want using friendship since that can be controlled on a function by function basis whereas mutability, as you correctly point out, cannot be.
Put the variable you want to modify in a base class and mark it private. Perhaps provide a "getter" function to that member. That function would be const and would probably return a const reference to the member. Then make your function a friend of that base class. That function will be able to change the value of that private member.
If you can afford to use mutable, that is the right way to do it.
Still, it's possible to do what you are asking for. Normally this is done via the “fake this” idiom:
MyClass *mutableThis = const_cast<MyClass*>(this);
Then access your field normally through the new pointer. This is also the way to do it if you have to support some old compiler with no mutable support.
Note however that this is generally a dangerous practice, as it can easily lead you into the dreaded realm of undefined behavior. If the original object was actually declared const (as opposed to just being accessed via a const pointer/reference), you're asking for trouble.
In short: use mutable when you can, use fake this when you can't, but only when you know what you're doing.
The main portion of this question is in regards to the proper and most computationally efficient method of creating a public read-only accessor for a private data member inside of a class. Specifically, utilizing a const type & reference to access the variables such as:
class MyClassReference
{
private:
int myPrivateInteger;
public:
const int & myIntegerAccessor;
// Assign myPrivateInteger to the constant accessor.
MyClassReference() : myIntegerAccessor(myPrivateInteger) {}
};
However, the current established method for solving this problem is to utilize a constant "getter" function as seen below:
class MyClassGetter
{
private:
int myPrivateInteger;
public:
int getMyInteger() const { return myPrivateInteger; }
};
The necessity (or lack thereof) for "getters/setters" has already been hashed out time and again in questions such as: Conventions for accessor methods (getters and setters) in C++. That, however, is not the issue at hand.
Both of these methods offer the same functionality using the syntax:
MyClassGetter a;
MyClassReference b;
int SomeValue = 5;
int A_i = a.getMyInteger(); // Allowed.
a.getMyInteger() = SomeValue; // Not allowed.
int B_i = b.myIntegerAccessor; // Allowed.
b.myIntegerAccessor = SomeValue; // Not allowed.
After discovering this, and finding nothing on the internet concerning it, I asked several of my mentors and professors for which is appropriate and what are the relative advantages/disadvantages of each. However, all responses I received fell nicely into two categories:
I have never even thought of that, but use a "getter" method as it is "Established Practice".
They function the same (They both run with the same efficiency), but use a "getter" method as it is "Established Practice".
While both of these answers were reasonable, as they both failed to explain the "why" I was left unsatisfied and decided to investigate this issue further. While I conducted several tests such as average character usage (they are roughly the same), average typing time (again roughly the same), one test showed an extreme discrepancy between these two methods. This was a run-time test for calling the accessor, and assigning it to an integer. Without any -OX flags (In debug mode), the MyClassReference performed roughly 15% faster. However, once a -OX flag was added, in addition to performing much faster both methods ran with the same efficiency.
My question thus has two parts.
How do these two methods differ, and what causes one to be faster/slower than the others only with certain optimization flags?
Why is it that established practice is to use a constant "getter" function, while using a constant reference is rarely known let alone utilized?
As comments pointed out, my benchmark testing was flawed, and irrelevant to the matter at hand. However, for context it can be located in the revision history.
The answer to question #2 is that sometimes, you might want to change class internals. If you made all your attributes public, they're part of the interface, so even if you come up with a better implementation that doesn't need them (say, it can recompute the value on the fly quickly and shave the size of each instance so programs that make 100 million of them now use 400-800 MB less memory), you can't remove it without breaking dependent code.
With optimization turned on, the getter function should be indistinguishable from direct member access when the code for the getter is just a direct member access anyway. But if you ever want to change how the value is derived to remove the member variable and compute the value on the fly, you can change the getter implementation without changing the public interface (a recompile would fix up existing code using the API without code changes on their end), because a function isn't limited in the way a variable is.
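A small sketch of that kind of change, using a hypothetical Rectangle class that is not from the question: an earlier version might have stored the area in a member, while this version computes it on demand, and code calling area() does not need to change.

```cpp
// Hypothetical class: a previous version could have stored the area in a
// member; this version computes it on the fly. Callers of area() are unaffected.
class Rectangle {
public:
    Rectangle(int width, int height) : width_(width), height_(height) {}

    int width() const { return width_; }
    int height() const { return height_; }

    // Was "return area_;" in the imagined earlier version.
    int area() const { return width_ * height_; }

private:
    int width_;
    int height_;
};
```

Had the area been exposed as a public const reference member instead, removing the stored value would have broken every user of the class.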
There are semantic/behavioral differences that are far more significant than your (broken) benchmarks.
Copy semantics are broken
A live example:
#include <iostream>
class Broken {
public:
Broken(int i): read_only(read_write), read_write(i) {}
int const& read_only;
void set(int i) { read_write = i; }
private:
int read_write;
};
int main() {
Broken original(5);
Broken copy(original);
std::cout << copy.read_only << "\n";
original.set(42);
std::cout << copy.read_only << "\n";
return 0;
}
Yields:
5
42
The problem is that when doing a copy, copy.read_only points to original.read_write. This may lead to dangling references (and crashes).
This can be fixed by writing your own copy constructor, but it is painful.
Assignment is broken
A reference cannot be reseated (you can alter the content of its referee but not switch it to another referee), leading to:
int main() {
Broken original(5);
Broken copy(4);
copy = original;
std::cout << copy.read_only << "\n";
original.set(42);
std::cout << copy.read_only << "\n";
return 0;
}
generating an error:
prog.cpp: In function 'int main()':
prog.cpp:18:7: error: use of deleted function 'Broken& Broken::operator=(const Broken&)'
copy = original;
^
prog.cpp:3:7: note: 'Broken& Broken::operator=(const Broken&)' is implicitly deleted because the default definition would be ill-formed:
class Broken {
^
prog.cpp:3:7: error: non-static reference member 'const int& Broken::read_only', can't use default assignment operator
This can be fixed by writing your own copy assignment operator, but it is painful.
Unless you fix it, Broken can only be used in very restricted ways; you may never manage to put it inside a std::vector for example.
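For completeness, here is roughly what the hand-written fix could look like. The class name Repaired is hypothetical; the idea is simply that each object binds its reference to its own member and that assignment copies the value rather than trying to reseat the reference.

```cpp
// Hypothetical repaired version of Broken with user-written copy operations.
class Repaired {
public:
    explicit Repaired(int i) : read_only(read_write), read_write(i) {}

    // Bind the reference to this object's own member, not the source's.
    Repaired(const Repaired& other)
        : read_only(read_write), read_write(other.read_write) {}

    // The reference cannot be reseated, so only the value is copied.
    Repaired& operator=(const Repaired& other) {
        read_write = other.read_write;
        return *this;
    }

    int const& read_only;

    void set(int i) { read_write = i; }

private:
    int read_write;
};
```

This is the boilerplate a getter avoids: every class that exposes a reference member this way needs both operations written by hand.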
Increased coupling
Giving away a reference to your internals increases coupling. You leak an implementation detail (the fact that you are using an int and not a short, long or long long).
With a getter returning a value, you can switch the internal representation to another type, or even elide the member and compute it on the fly.
This is only significant if the interface is exposed to clients expecting binary/source-level compatibility; if the class is only used internally and you can afford to change all users if it changes, then this is not an issue.
Now that semantics are out of the way, we can speak about performance differences.
Increased object size
While references can sometimes be elided, it is unlikely to ever happen here. This means that each reference member will increase the size of an object by at least sizeof(void*), plus potentially some padding for alignment.
The original class MyClassGetter has a size of 4 on x86 or x86-64 platforms with mainstream compilers.
The Broken class has a size of 8 on x86 and 16 on x86-64 platforms (the latter because of padding, as pointers are aligned on 8-byte boundaries).
An increased size can blow out CPU caches; with a large number of items you may quickly experience slowdowns because of it (not that it would be easy to have vectors of Broken, given its broken assignment operator).
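The size difference can be checked directly with sizeof. The two classes below are hypothetical stand-ins for the getter and reference variants; the exact numbers depend on the platform and compiler, so the sketch only relies on the reference variant being larger, which is what mainstream compilers produce.

```cpp
#include <cstddef>

// Hypothetical stand-in for the getter-based class: just one int.
class WithGetter {
public:
    int get() const { return value_; }
private:
    int value_ = 0;
};

// Hypothetical stand-in for the reference-based class: an int plus a reference.
class WithReference {
private:
    int value_ = 0;
public:
    WithReference() : accessor(value_) {}
    const int& accessor;
};

// Extra bytes paid per object for the reference member (platform dependent).
constexpr std::size_t kExtraBytes = sizeof(WithReference) - sizeof(WithGetter);
```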
Better performance in debug
As long as the implementation of the getter is inline in the class definition, then the compiler will strip the getter whenever you compile with a sufficient level of optimizations (-O2 or -O3 generally, -O1 may not enable inlining to preserve stack traces).
Thus, the performance of access should only vary in debug code, where performance is least necessary (and otherwise so crippled by plenty of other factors that it matters little).
In the end, use a getter. It's established convention for a good number of reasons :)
When you implement the accessor as a constant reference (or constant pointer), your object also stores a pointer, which makes it bigger. Accessor methods, on the other hand, exist only once in the program and are most likely optimized out (inlined), unless they are virtual or part of an exported interface.
By the way, a getter method can also be virtual.
To answer question 2:
const_cast<int&>(mcb.myIntegerAccessor) = 4;
Is a pretty good reason to hide it behind a getter function. It is a clever way to do a getter-like operation, but it completely breaks abstraction in the class.
As far as I know, making constant functions in a class is useful for read/write compiler optimizations.
A constant function within a class means that the class members will remain constant during the execution of the function.
However, you can bypass this by const-casting the implicit parameter (of course, this is a very bad practice).
My question is as follows:
What pitfalls can the following code cause (especially in terms of performance unrelated to thread synchronization)?
int myClass::getSomething() const
{
myClass* writableThis = const_cast<myClass*>(this);
writableThis->m_nMemberInt++;
...
return m_nSomeOtherUnchangedMember;
}
Another related question:
Is the behavior compiler/platform/OS specific?
I would also very much appreciate it if someone could explain the magic under the hood when such code is compiled/executed (I'm speculating that the CPU is making out-of-order optimizations based on the fact that the function is const, and that not respecting this during actual execution should have some side effects).
EDIT:
Thank you for clarifying this for me. After further research, all the received answers are correct, but I can accept only one :).
Regarding the const qualifier being used solely for syntax correctness, I believe this answer is both right and wrong; the correct way to state this (imho) would be that it is used mostly for syntax correctness (in a very limited number of scenarios it can produce different / better code). References: SO Related question, related article
The const_cast<T>(this) trick is potentially unsafe, because the user of your member function may run into undefined behavior without doing anything wrong on their side.
The problem is that casting away const-ness is allowed only when you start with a non-const object. If your object is constant, the function that casts away its const-ness and uses the resultant pointer to change object's state triggers undefined behavior:
struct Test {
int n;
Test() : n(0) {}
void potentiallyUndefinedBehavior() const {
Test *wrong = const_cast<Test*>(this);
wrong->n++;
}
};
int main() {
Test t1;
// This call is OK, because t1 is non-const
t1.potentiallyUndefinedBehavior();
const Test t2;
// This triggers undefined behavior, because t2 is const
t2.potentiallyUndefinedBehavior();
return 0;
}
The trick with const_cast<T>(this) was invented for caching values inside member functions with a const qualifier. However, it is no longer useful, because C++ added a special keyword for this sort of thing: by marking a member mutable you make that member writable inside const-qualified methods:
struct Test {
mutable int n;
Test() : n(0) {}
void wellDefinedBehavior() const {
n++;
}
};
Now the const member function will not trigger undefined behavior regardless of the context.
The CPU doesn't know anything about const, which is a C++ keyword. By the time the compiler has transformed the C++ code to assembly, there's not much left of that.
Of course, there's a real possibility that the generated code is entirely different because of the const keyword. For instance, the const version of some operator[] may return a T object by value whereas the non-const version must return a T&. A CPU doesn't even know what function it's in, or even assume the existence of functions.
My answer is to use the storage class mutable for anything that needs to be modified in const methods.
It's built into the language, so there are several benefits. It's a tighter control for how const methods modify data members. Other developers will know these data members will change in const methods. If there are any compiler optimizations, the compiler will know to do the right thing.
class myClass {
private:
int m_nSomeOtherUnchangedMember;
mutable int m_nMemberInt;
…
public:
int getSomething() const;
…
};
int myClass::getSomething() const
{
m_nMemberInt++;
…
return m_nSomeOtherUnchangedMember;
}
As far as I know, making constant functions in a class is useful for read/write compiler optimizations.
No. We use const methods to enforce semantic guarantees, not to allow optimizations (with the possible exception of avoiding copies).
What pitfalls can the following code cause
Firstly, it can break program semantics.
For example, std::map nodes store std::pair<const Key, T>, because the Key shouldn't mutate after it has been inserted. If the key changes value, the map sorting invariant is incorrect, and subsequent find/insert/rebalance operations will misbehave.
If you call a const-qualified method on this const key, and that method changes the Key in a way that affects how it compares, then you've cunningly broken the map.
Secondly, it can kill your program. If you have a const object overlaid on a genuinely read-only address range, or you have a statically-initialized const object in the read-only initialized data segment, then writing to it will cause some kind of protection error.
As others have stated, const-correctness was designed as a help for programmers and not as a help for the optimizer. You should remember 4 things:
1. const references and const methods are not faster
2. const references and const methods are not faster
3. const references and const methods are not faster
4. const references and const methods are not faster
More specifically the optimizer simply completely ignores const-ness of references or of methods because const doesn't really mean in that context what you are thinking.
A const reference to an object doesn't mean that for example during the execution of a method the object will remain constant. Consider for example:
struct MyObject {
int x;
void foo() const {
printf("%i\n", x);
char *p = new char[10];
printf("%i\n", x);
delete[] p;
}
};
the compiler cannot assume that the x member didn't change between the two calls to printf. The reason is that the global allocation function (::operator new[], which lives in the global namespace, not in std) could have been replaced, and the replacement could hold a regular non-const pointer to the instance. Therefore it's perfectly legal for the global allocator to change x during the execution of foo. The compiler cannot know this is not going to happen (the global allocator could be replaced in another compilation unit).
Calling any unknown code (i.e. basically any non-inlined function) can mutate any part of an object, being in a const method or not. The const method simply means that you cannot use this to mutate the object, not that the object is constant.
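The same effect can be shown without touching the allocator. In this minimal sketch the global pointer g_alias and the function unknownCode are hypothetical stand-ins for "some other code that holds a non-const pointer to the object"; foo() is const, yet the value of x it observes changes across the call.

```cpp
struct MyObject;

// Hypothetical non-const pointer to the object, held somewhere else in the program.
MyObject* g_alias = nullptr;

// Stand-in for any function whose body the compiler may not see.
void unknownCode();

struct MyObject {
    int x = 0;

    // Returns how much x changed while foo() was running.
    int foo() const {
        int before = x;
        unknownCode();      // may legally modify *this through g_alias
        return x - before;  // so x has to be re-read here
    }
};

void unknownCode() {
    if (g_alias != nullptr) {
        g_alias->x += 1;  // fine: the object itself was not declared const
    }
}
```

With a non-const MyObject obj and g_alias pointing at it, obj.foo() reports a change of 1 even though foo is a const member function.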
Whether const-correctness is really a help for programmers is another question, on which I personally have a rather heretical point of view, but that's another story...
If I have a class that has many int, float, and enum member variables, is it considered efficient and/or good practice to return them as references rather than copies, and return constant references where no changes should be made? Or is there a reason I should return them as copies?
There is no reason to return primitive types such as int and float by reference, unless you want to allow them to be changed. Returning them by reference is actually less efficient because it saves nothing (ints and pointers are usually the same size) while the dereferencing actually adds overhead.
If they are constant references, maybe it is OK. If they are not constant references, probably not.
As to efficiency - on a 64-bit machine, the references will be 64-bit quantities (pointers in disguise); int and float and enum will be smaller. If you return a reference, you are forcing a level of indirection; it is less efficient.
So, especially for built-in types as return values, it is generally better to return the value rather than a reference.
In some cases it is necessary:
Look at overloaded operator[] for any class. It usually has two versions. The mutating version has to return a reference.
int &operator[](int index); // by reference
int operator[](int index) const; // by value
In general, it is OK for a class to allow access to its members by trusted entities, e.g. friends. If these trusted entities also need to modify the state, references or pointers to the class members are the only options one has.
In many cases, references simplify syntax, e.g. where 'v' is an STL vector:
v.at(1) = 2 vs *(v.at(1)) = 2;
This is probably mostly a matter of style or preference. One reason to not return references is because you are using getters and setters to allow you to change the implementation of those members, If you changed a private member to another type, or removed it completely because it can be computed, then you no longer have the ability to return a reference, since there's nothing to reference.
On the other hand, returning references for non-trivial types (compound classes) can speed up your code a bit over making a copy, and you can allow those members to be assigned through the returned reference (if desired).
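A brief sketch of that trade-off, using a hypothetical Person class with a std::string member: the const-reference getter hands out the existing string, while the by-value getter constructs a new one on each call.

```cpp
#include <string>
#include <utility>

// Hypothetical class with a non-trivial member.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Returns a reference to the stored string: no copy is made.
    const std::string& name() const { return name_; }

    // Returns a fresh copy on every call.
    std::string nameCopy() const { return name_; }

private:
    std::string name_;
};
```

The reference stays valid only as long as the Person object does, so callers that need to keep the value beyond that should copy it.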
In most cases const references are better. For ints and such there's no point, either because you would want them to be changed or because they are the same size (or nearly) as a reference.
So yes it is a good idea. I prefer another language or to hack away at my own C++ stuff and just allow the var to be public (once again it just my own stuff)
This is mostly a performance question, but from a robustness point of view I would say it's preferable to return values instead of const references. The reason is that even const references weaken encapsulation. Consider this:
struct SomeClass
{
std::vector<int> const & SomeInts () const;
void AddAnInt (int i); // Adds an integer to the vector of ints.
private:
std::vector<int> m_someInts;
};
bool ShouldIAddThisInt(int i);
void F (SomeClass & sc)
{
auto const & someInts = sc.SomeInts (); // binds to the internal vector (plain auto would copy it)
auto end = someInts.end ();
for (auto iter = someInts.begin (); iter != end; ++iter)
{
if (ShouldIAddThisInt(*iter))
{
// oops invalidates the iterators
sc.AddAnInt (*iter);
}
}
}
So where it makes sense semantically and we can avoid excessive dynamic allocations, I prefer to return by value.
Getters are for emissions of a class say Exhaust Car.emit(), where the car has just created the Exhaust.
If you are bound to write const Seat& Car.get_front_seat()
to have later sit in the Driver, you can immediately notice that something is wrong.
Correctly, you'd rather write Car.get_in_driver(Driver)
which then calls directly seat.sit_into(Driver).
This second method easily avoids those awkward situations when you get_front_seat but the door is closed and you virtually push in the driver through the closed door. Remember, you have only asked for a seat! :)
All in all: always return by value (and rely on return value optimization), or realize it is time for changing your design.
The background: classes were created so that data can be coupled together with its accessor functionality, localizing bugs etc. Thus classes are never activity, but data oriented.
Further pitfalls: in C++, if you return something by const ref, you can easily forget it is only a ref, and once your object is destructed you can be left with an invalid ref. Otherwise, that object will be copied once it leaves the getter anyway. But unnecessary copies are avoided by the compiler, see Return Value Optimization.
I have a class member myMember that is a myType pointer. I want to assign this member in a function that is declared as const. I'm doing as follows:
void func() const
{
...
const_cast<myType*>(myMember) = new myType();
...
}
Doing this works fine in VC++, but GCC gives an error with the message "lvalue required as left operand of assignment".
Making the member mutable allows me to simply remove the const_cast and assign the value. However, I'm not entirely sure whether that comes with other side-effects.
Can I assign my member without having to make the member mutable? How? Are there any side-effects in making members mutable?
This scenario -- an encapsulated internal state change that does not impact external state (e.g. caching results) -- is exactly what the mutable keyword is for.
The code won't actually work in VC++: you're not updating the value (or at least it shouldn't), hence the error from GCC. The correct code is
const_cast<myType*&>(myMember) = new myType();
or [from other response, thanks :P]:
const_cast<ThisType*>(this)->myMember = new myType();
Making it mutable effectively means you get implicit const_casts in const member functions, which is generally what you should be steering towards when you find yourself doing loads of const_casts on this. There are no 'side-effects to using mutable' other than that.
As you can see from the vehement debates circling this question, willy-nilly usage of mutable and lots of const_casts can definitely be symptoms of bad smells in your code. From a conceptual point of view, casting away constness or using mutable can have much larger implications. In some cases, the correct thing to do may be to change the method to non-const, i.e., own up to the fact that it is modifying state.
It all depends on how much const-correctness matters in your context - you don't want to end up just sprinkling mutable around like pixie dust to make stuff work, but mutable is intended for use when the member isn't part of the observable state of the object. The most stringent view of const-correctness would hold that not a single bit of the object's state can be modified (e.g., this might be critical if your instance is in ROM...) - in those cases you don't want any constness to be lost. In other cases, you might have some external state stored somewhere outside of the object - e.g., a thread-specific cache, which also needs to be considered when deciding whether it is appropriate.
const_cast is nearly always a sign of design failure. In your example, either func() should not be const, or myMember should be mutable.
A caller of func() will expect her object not to change; but this means "not to change in a way she can notice", that is, not to change its external state. If changing myMember does not change the object's external state, that is what the mutable keyword is for; otherwise, func() should not be const, because you would be betraying your function's guarantees.
Remember that mutable is not a mechanism to circumvent const-correctness; it is a mechanism to improve it.
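Applied to the situation in the question, the mutable route might look like the sketch below. The class name Owner, the get() function and the body of myType are assumptions made for illustration, and std::unique_ptr is swapped in for the raw pointer so the example does not leak. Note that this lazy initialization is not thread-safe as written.

```cpp
#include <memory>

// Stand-in for the question's myType (its real contents are unknown).
struct myType {
    int value = 7;
};

// Hypothetical owner class: the pointer is created lazily inside a const method.
class Owner {
public:
    const myType& get() const {
        if (!myMember) {
            myMember.reset(new myType());  // allowed because myMember is mutable
        }
        return *myMember;
    }

private:
    mutable std::unique_ptr<myType> myMember;
};
```

From the caller's point of view get() always returns the same object, so the lazy creation is not part of the observable state.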
class Class{
int value;
void func()const{
const_cast<Class*>(this)->value=123;
}
};
As Steve Gilham wrote, mutable is the correct (and short) answer to your question. I just want to give you a hint in a different direction.
Maybe it's possible in your scenario to make use of an interface (or more than one)?
Perhaps you can grok it from the following example:
class IRestrictedWriter // may change only some members
{
public:
virtual void func() = 0;
};
class MyClass : virtual public IRestrictedWriter
{
public:
virtual void func()
{
mValueToBeWrittenFromEverybody = 123;
}
void otherFunctionNotAccessibleViaIRestrictedWriter()
{
mOtherValue1 = 123;
mOtherValue2 = 345;
}
...
};
So, if you pass to some function an IRestrictedWriter * instead of a const MyClass *, it can call func and thus change mValueToBeWrittenFromEverybody, whereas mOtherValue1 is kind of "const".
I always find mutable a bit of a hack (but use it sometimes).