I'd like to work out conventions on passing parameters to functions/methods. I know it's a common issue and it has been answered many times, but I searched a lot and found nothing that fully satisfies me.
Passing by value is obvious and I won't mention this. What I came up with is:
Passing by non-const reference means that the object is MODIFIED
Passing by const reference means that the object is USED
Passing by pointer means that a reference to the object is going to be STORED. Whether ownership is passed or not will depend on the context.
It seems to be consistent, but when I want to take a heap-allocated object and pass it to a parameter of the second kind (const reference), it looks like this:
void use(const Object &object) { ... }
//...
Object *obj = getOrCreateObject();
use(*obj);
or
Object &obj = *getOrCreateObject();
use(obj);
Both look weird to me. What would you advise?
PS I know that one should avoid raw pointers and use smart pointers instead (easier memory management and more expressive ownership), and that can be the next step in refactoring the project I work on.
You can use these conventions if you like. But keep in mind that you cannot assume conventions when dealing with code written by other people. You also cannot assume that people reading your code are aware of your conventions. You should document an interface with comments when it might be ambiguous.
Passing by pointer means that the object is going to be STORED. Who its owner is will depend on the context.
I can think of only one context where the ownership of a pointer argument should transfer to the callee: Constructor of a smart pointer.
Besides possible intention of storing, a pointer argument can alternatively have the same meaning as a reference argument, with the addition that the argument is optional. You typically cannot represent an optional argument with a reference since they cannot be null - although with custom types you could use a reference to a sentinel value.
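The "optional argument" reading of a pointer parameter can be sketched as follows. Logger and doubledLogged are hypothetical names invented for this illustration; the only point is that the callee must check for nullptr, which a reference parameter would never require.

```cpp
#include <cassert>
#include <string>

// Hypothetical type, used only for illustration.
struct Logger {
    std::string lines;
    void write(const std::string& msg) { lines += msg + "\n"; }
};

// Optional argument: callers may pass nullptr (or nothing) to mean "no logger".
int doubledLogged(int value, Logger* log = nullptr) {
    if (log != nullptr) {
        log->write("doubling " + std::to_string(value));
    }
    return value * 2;
}

void usageExample() {
    Logger log;
    assert(doubledLogged(3) == 6);        // no logger supplied
    assert(doubledLogged(4, &log) == 8);  // logger supplied
    assert(log.lines == "doubling 4\n");
}
```

Whether a null pointer is actually accepted is still something the interface has to document; the pointer type alone only says that it might be.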
Both look weird to me. What would you advise?
Neither looks weird to me, so my advice is to get accustomed to them.
The main problem with your conventions is that you make no allowance for the possibility of interfacing to code (e.g. written by someone else) that doesn't follow your conventions.
Generally speaking, I use a different set of conventions, and rarely find a need to work around them. (The main exception will be if there is a need to use a pointer to a pointer, but I rarely need to do that directly).
Passing by non-const reference is appropriate if ANY of the following MAY be true:
The object may be changed;
The object may be passed to another function by a non-const reference [relevant when using third party code by developers who choose to omit the const - which is actually something a lot of beginners or lazy developers do];
The object may be passed to another function by a non-const pointer [relevant when using third party code by developers who choose to omit the const, or when using legacy APIs];
Non-const member functions of the object are called (regardless of whether they change the object or not) [also often a consideration when using third-party code by developers who prefer to avoid using const].
Conversely, const references may be passed if ALL of the following are true:
No non-mutable members of the object are changed;
The object is only passed to other functions by const reference, by const pointer, or by value;
Only const member functions of the object are called (even if those members are able to change mutable members).
I'll pass by value instead of by const reference in cases where the function would copy the object anyway. (e.g. I won't pass by const reference, and then construct a copy of the passed object within the function).
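A minimal sketch of the pass-by-value case, assuming a hypothetical Person class that keeps its own copy of a name. Because the copy is needed anyway, taking the parameter by value and moving it into the member avoids a second copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration.
class Person {
public:
    // The constructor stores its own copy of the name, so it takes the
    // argument by value and moves it into the member.
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Read-only use of the argument: const reference, no copy kept.
    bool hasName(const std::string& other) const { return name_ == other; }

private:
    std::string name_;
};

void usageExample() {
    std::string n = "Ada";
    Person p(n);                 // n is copied into the parameter
    assert(p.hasName("Ada"));
    assert(n == "Ada");          // the caller's string is untouched
}
```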
Passing non-const pointers is relevant if it is appropriate to pass a non-const reference but there is also a possibility of passing no object (e.g. a nullptr).
Passing const pointers is relevant if it is appropriate to pass a const reference but there is also a possibility of passing no object (e.g. a nullptr).
I would not change the convention for either of the following:
Storing a reference or pointer to the object within the function for later use - it is possible to convert a pointer to a reference or vice versa. And either one can be stored (a pointer can be assigned, a reference can be used to construct an object);
Distinguishing between dynamically allocated and other objects - since I mostly either avoid using dynamic memory allocation at all (e.g. use standard containers, and pass them around by reference or simply pass iterators from them around) or - if I must use a new expression directly - store the pointer in another object that becomes responsible for deallocation (e.g. a smart pointer such as std::unique_ptr) and then pass the containing object around.
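As a rough sketch of that last point, assuming std::unique_ptr as the owning object and a hypothetical Object type: the functions keep their reference-based signatures, and the only visible trace of heap allocation is the dereference at the call site.

```cpp
#include <cassert>
#include <memory>

struct Object { int value = 0; };   // hypothetical type

// Only uses the object: const reference.
int use(const Object& object) { return object.value; }

// Modifies the object: non-const reference.
void modify(Object& object) { ++object.value; }

int usageExample() {
    // The smart pointer owns the allocation and releases it automatically.
    std::unique_ptr<Object> obj(new Object());
    modify(*obj);        // dereference at the call site
    return use(*obj);
}
```

The same use and modify functions accept stack objects and container elements unchanged, which is why the conventions do not need a special case for dynamic allocation.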
In my opinion, they are the same. In the first part of your post you are talking about the signature, but your example is about the function call.
Say I have the following member function:
void CFoo::regWrite( int addr, int data )
{
    reg_write( addr, data ); // driver call to e.g. write a firmware register
}
Clearly, calling this function doesn't modify the internal state of the object it is called on. However, it changes the state of whatever this CFoo instance represents.
In circumstances such as these, should CFoo::regWrite(int addr, int data) be a const function?
You have to decide what the meaning is of "logically const" for the class CFoo, and that depends what the class is for.
If CFoo is construed as referring to some data, then it might make sense to be able to modify that data via a const instance of CFoo, in which case your member function would be const. For examples of this consider other types that refer to some data -- you can modify the referand of a char *const or a const std::unique_ptr<char>.
If CFoo is construed as owning some data, then it might make sense to forbid modification via a const instance of CFoo. For examples of this consider containers, where the elements are logically "part of the object's state" even when they aren't physically part of the object. So vector::operator[] has a const overload that returns a const T& rather than a T&, the insert member function is non-const, etc.
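The standard types mentioned above show both behaviours. This is only an illustrative sketch; the two commented-out lines are the ones the compiler rejects.

```cpp
#include <cassert>
#include <memory>
#include <vector>

int usageExample() {
    // Referring types: the object is const, the data it refers to is not.
    char buffer[] = "abc";
    char* const p = buffer;
    p[0] = 'x';                                   // allowed

    const std::unique_ptr<char> q(new char('a'));
    *q = 'b';                                     // allowed

    // Owning type: elements are logically part of the vector's state.
    const std::vector<int> v = {1, 2, 3};
    // v[0] = 42;      // error: the const overload of operator[] returns const int&
    // v.push_back(4); // error: push_back is a non-const member function

    return (buffer[0] == 'x' && *q == 'b') ? v[0] : -1;
}
```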
It is up to the programmer to define what 'const' shall mean for a class. With the specifier mutable you can even have a const object with changing values in a member. When it comes to hardware one might consider the configuration as the target for const correctness: as long as the configuration does not change the object can be considered constant.
A similar issue arises if you have pointers to other objects in your class: your const method can then call non-const methods on the other object and thus modify it.
If you look at the hardware as some other object referenced by your class, it would be perfectly valid to modify firmware settings (since only a "referenced" object is changed). If you want your class to "represent" the hardware (or part of it), I would rather suggest not to mark the method as const.
So I think it mainly depends how you designed your class.
There are two ways of looking at this - the optimization angle, and the logic of this declaration. Which is more important is for you to decide.
Optimization
EDIT: I made some incorrect assumptions. It seems the compiler is not actually free to make the optimizations below, and will only make them by analyzing the body of the method to ensure no modifications occur (and even then only in simple cases).
Having this const will allow the compiler to optimize a little bit more. It knows that regWrite doesn't change any fields in the object, so it can keep them if it was storing them in registers, and do similar optimizations that rely on the object's fields not being changed.
This is really the only thing the compiler will depend on when you make a definition like this, so having this const is OK and can theoretically allow better performance.
Making logical sense
It feels unintuitive to have a const method whose whole purpose is a destructive change. The usual intuition a programmer has is that as long as I'm only calling const methods, the results of other const methods shouldn't change. If you violate this unwritten contract, expect people to be surprised - even if the compiler is OK with it.
I'm not sure if this will be violated here - it will depend on the other code in this class. However, if no other considerations are important (performance, etc.), const is (for me) mostly a marker on the interface which says "calling this does not change the state of this object", for a broad definition of "state".
This is murky ground however, and it is up to you what you consider a state change. If you think of your firmware object as representing a link to the internals, writing a register does not change anything about this link and is const. If you think of it as representing the state of the underlying registers, then writing to registers is a change of state.
About six years ago, a software engineer named Harri Porten wrote this article, asking the question, "When should a member function have a const qualifier and when shouldn't it?" I found it to be the best write-up I could find of the issue, which I've been wrestling with more recently and which I think is not well covered in most discussions I've found on const correctness. Since a software information-sharing site as powerful as SO didn't exist back then, I'd like to resurrect the question here.
The article seems to cover a lot of basic ground, but the author still has a question about const and non-const overloads of functions returning pointers. Last line of the article is:
Many will probably answer "It depends." but I'd like to ask "It depends on what?"
To be absolutely precise, it depends whether the state of the A object pointee is logically part of the state of this object.
For an example where it is, vector<int>::operator[] returns a reference to an int. The int referand is "part of" the vector, although it isn't actually a data member. So the const-overload idiom applies: change an element and you've changed the vector.
For an example where it isn't, consider shared_ptr. This has the member function T * operator->() const;, because it makes logical sense to have a const smart pointer to a non-const object. The referand is not part of the smart pointer: modifying it does not change the smart pointer. So the question of whether you can "reseat" a smart pointer to refer to a different object is independent of whether or not the referand is const.
I don't think I can provide any complete guidelines to let you decide whether the pointee is logically part of the object or not. However, if modifying the pointee changes the return values or other behaviour of any member functions of this, and especially if the pointee participates in operator==, then chances are it is logically part of this object.
I would err on the side of assuming it is part (and provide overloads). Then if a situation arose where the compiler complains that I'm trying to modify the A object returned from a const object, I'd consider whether I really should be doing that or not, and if so change the design so that only the pointer-to-A is conceptually part of the object's state, not the A itself. This of course requires ensuring that modifying the A doesn't do anything that breaks the expected behaviour of this const object.
If you're publishing the interface you may have to figure this out in advance, but in practice going back from the const overloads to the const-function-returning-non-const-pointer is unlikely to break client code. Anyway, by the time you publish an interface you hopefully have used it a bit, and probably got a feel for what the state of your object really includes.
Btw, I also try to err on the side of not providing pointer/reference accessors, especially modifiable ones. That's really a separate issue (Law of Demeter and all that), but the more times you can replace:
A *getA();
const A *getA() const;
with:
A getA() const; // or const A &getA() const; to avoid a copy
void setA(const A &a);
the fewer times you have to worry about the issue. Of course the latter has its own limitations.
One interesting rule of thumb I found while researching this came from here:
A good rule of thumb for LogicalConst is as follows: If an operation preserves LogicalConstness, then if the old state and the new state are compared with the EqualityOperator, the result should be true. In other words, the EqualityOperator should reflect the logical state of the object.
I personally use a very simple Rule Of Thumb:
If the observable state of an object does not change when calling a given method, this method ought to be const.
In general it is similar to the rule mentioned by SCFrench about Equality Comparison, except that most of my classes cannot be compared.
I would like to push the debate one step further though:
When requiring an argument, a function ought to take it by const handle (or copy) if the argument is left unchanged (for an external observer)
It is slightly more general, since after all the method of a class is nothing else than a free-standing function accepting an instance of the class as a first argument:
class Foo { void bar() const; };
is equivalent to:
class Foo { friend void bar(const Foo& self); }; // ALA Python
when it doesn't modify the object.
It simply makes this have the type const myclass*. This tells the caller that the object won't change, which may allow the compiler some optimizations and makes it easier for the programmer to know whether the function can be called without side effects (at least, without effects on the object).
General rule:
A member function should be const if it both compiles when marked as const and if it would still compile if const were transitive w.r.t pointers.
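To see what "transitive w.r.t. pointers" rules out, consider this sketch with hypothetical Car and Engine types. rev() compiles as const because only the pointer member becomes const, not the Engine it points to; under the rule above it should therefore not be marked const.

```cpp
#include <cassert>

struct Engine { int rpm = 0; };   // hypothetical types

class Car {
public:
    explicit Car(Engine* engine) : engine_(engine) {}

    // Compiles as const: inside a const member function engine_ has type
    // Engine* const, so the pointer cannot be reseated but the Engine it
    // points to can still be modified.
    void rev() const { engine_->rpm += 1000; }

    int rpm() const { return engine_->rpm; }

private:
    Engine* engine_;   // non-owning
};

void usageExample() {
    Engine e;
    const Car car(&e);
    car.rev();                    // modifies e through a const Car
    assert(car.rpm() == 1000);
}
```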
Exceptions:
Logically const operations; methods that alter internal state but that alteration is not detectable using the class's interface. Splay tree queries for example.
Methods where the const/non-const implementations differ only by return type (common with methods that return iterator/const_iterator). Calling the non-const version in the const version via a const_cast is acceptable to avoid repetition.
Methods interfacing to 3rd party C++ that isn't const correct, or to code written in a language that doesn't support const
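The overload-sharing exception can be sketched as below. Note that this sketch goes in the opposite direction from the one described in the list: the non-const overload calls the const one, which is the variant recommended in Scott Meyers' Effective C++, because the cast then only removes const from an object already known to be non-const. Samples and at are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Samples {   // hypothetical class
public:
    explicit Samples(std::size_t n) : data_(n, 0) {}

    // The const overload holds the real implementation.
    const int& at(std::size_t i) const { return data_.at(i); }

    // The non-const overload forwards to it and casts the result back.
    // Removing const here is safe because *this is known to be non-const.
    int& at(std::size_t i) {
        return const_cast<int&>(static_cast<const Samples&>(*this).at(i));
    }

private:
    std::vector<int> data_;
};

void usageExample() {
    Samples s(3);
    s.at(1) = 7;                  // non-const overload
    const Samples& cs = s;
    assert(cs.at(1) == 7);        // const overload
}
```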
Here are some good articles:
Herb Sutter's GotW #6
Herb Sutter & const for optimizations
More advice on const correctness
From Wikipedia
I use const method qualifiers when the method does not alter the class's data members, or when its common intent is not to modify them. One example is lazy initialization in a getter method that may have to initialize a data member (such as by retrieving it from a database). In this example, the method only modifies the data member once, during initialization; all other times it is constant.
I'm allowing the compiler to catch const errors at compile time rather than having me (or a user) catch them at run time.
I want to execute a read-only method on an object marked as const, but in order to do this thread-safely, I need to lock a readers-writer mutex:
const Value Object::list() const {
    ScopedRead lock(children_);
    ...
}
But this breaks because the compiler complains about "children_" being const and such. I went up to the ScopedRead class and up to the RWMutex class (of which children_ is a sub-class) to allow read_lock on a const object, but I have to write this:
inline void read_lock() const {
    pthread_rwlock_rdlock(const_cast<pthread_rwlock_t*>(&rwlock_));
}
I have always learned that const_cast is a code smell. Is there any way to avoid this?
Make the lock mutable
mutable pthread_rwlock_t rwlock;
This is a common scenario in which mutable is used. A read-only query of an object is (as the name implies) an operation that should not require non-const access. Mutable is considered good practice when you want to be able to modify parts of an object that aren't visible and have no observable side effects on the object. Your lock is used to ensure sequential access to the object's data, and changing it doesn't affect the data contained within the object nor have observable side effects on later calls, so it is still honoring the const-ness of the object.
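A minimal sketch of the idea, using std::mutex from the standard library in place of the pthread readers-writer lock from the question (an assumption made to keep the example portable). Object here is a simplified stand-in, not the asker's class.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

class Object {   // hypothetical stand-in for the class in the question
public:
    void add(int child) {
        std::lock_guard<std::mutex> lock(mutex_);
        children_.push_back(child);
    }

    // Logically read-only, so it stays const; locking modifies only the
    // mutable mutex, not the observable state.
    std::vector<int> list() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return children_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> children_;
};

void usageExample() {
    Object o;
    o.add(1);
    o.add(2);
    const Object& c = o;
    assert(c.list().size() == 2);   // no const_cast needed
}
```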
Make the lock mutable.
Yes, use mutable. It's designed for this very purpose: Where the entire context of the function is const (i.e. an accessor or some other logically read-only action.) but where some element of writable access is needed for a mutex or reference counter etc.
The function should be const, even if it does lock a mutex internally. Doing so makes the code thread-neutral without having to expose the details, which I presume is what you're trying to do.
There are very few places where const_cast<> needs to be legitimately used and this isn't one of them. Using const_cast on an object, especially in a const function, is a code maintenance nightmare. Consider:
token = strtok_r( const_cast<char*>( ref_.c_str() ), ":", &saveptr );
In fact, I'd argue that when you see const_cast in a const function, you should start by making the function non-const (very soon after you should get rid of the const_cast and make the function const again though)
Well, if we are not allowed to modify the declaration of the variable, then const_cast comes to the rescue. If not, making it mutable is the solution.
To solve the actual problem, declare the lock as mutable.
The following is now my professional opinion:
The compiler is right to complain, and you are right to find this mildly offensive. If performing a read-only operation requires a lock, and locks must be writeable to lock, then you should probably make the read-only query require non-const access.
EDIT: Alright, I'll bite. I've seen this kind of pattern cause major perf hits in places you would not expect. Does anyone here know how tolower or toupper can become a major bottleneck if called frequently enough, even with the default ASCII locale? In one particular implementation of the C runtime library built for multithreading, there was a lock taken to query the current locale for that thread. Calling tolower on the order of 10000 times or more resulted in more of a perf hit than reading a file from disk.
Just because you want read-only access doesn't mean that you should hide the fact that you need to lock to get it.
I'm adding some lazy initialization logic to a const method, which makes the method in fact not const. Is there a way for me to do this without having to remove the "const" from the public interface?
int MyClass::GetSomeInt() const
{
    // lazy logic
    if (m_bFirstTime)
    {
        m_bFirstTime = false;
        // Do something once
    }
    return some int...
}
EDIT: Does the "mutable" keyword play a role here?
Make m_bFirstTime mutable:
class MyClass
{
: :
mutable bool m_bFirstTime;
};
...but this is also very often an indication of a design flaw. So beware.
Actually, you said that you didn't want to change the header file. So your only option is to cast away the constness of the this pointer...
int MyClass::GetSomeInt() const
{
    MyClass* that = const_cast<MyClass*>(this);
    // lazy logic
    if (that->m_bFirstTime)
    {
        that->m_bFirstTime = false;
        // Do something once
    }
    return some int...
}
If using mutable raises a red flag, this launches a red flag store into orbit. Doing stuff like this is usually a really bad idea.
I think of this problem as involving two concepts: (1) "logically const" and (2) "bitwise const". By this I mean that getting some int from a class, does not logically change the class and in most cases it does not change the bits of the class members. However, in some cases, like yours, it does.
In these cases, where the method is logically const but not bitwise const, the compiler cannot know this. This is the reason for the existence of the mutable keyword. Use it as John Dibling shows, but it is not a design flaw. On the contrary, there are many cases where this is necessary. In your example, I presume that the calculation of the int is expensive, so we do not want to calculate it if it is not needed. In other cases, you may wish to cache results of methods for later use, etc.
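A method that is logically const but not bitwise const might look like the sketch below. ExpensiveComputation, m_cached and m_computations are hypothetical additions to the asker's class; the counter exists only to make the caching visible.

```cpp
#include <cassert>

class MyClass {
public:
    int GetSomeInt() const {
        if (m_bFirstTime) {            // lazy logic, runs once
            m_bFirstTime = false;
            m_cached = ExpensiveComputation();
            ++m_computations;
        }
        return m_cached;
    }

    int Computations() const { return m_computations; }

private:
    // Hypothetical stand-in for the expensive work.
    static int ExpensiveComputation() { return 42; }

    mutable bool m_bFirstTime = true;
    mutable int m_cached = 0;
    mutable int m_computations = 0;   // only here to make caching visible
};

void usageExample() {
    MyClass m;
    const MyClass& c = m;
    assert(c.GetSomeInt() == 42);     // computes
    assert(c.GetSomeInt() == 42);     // served from the cache
    assert(c.Computations() == 1);
}
```

As written, this is not safe to call from several threads at once; the mutable members would need a lock or another form of synchronization.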
BTW, even though you have accepted the "mutable" answer as correct, you do have to update the .h!
set the m_bFirstTime member to be mutable
As John Dibling said, mark the fields that are changed as mutable. The important part is in the comment by ypnos: 'don't really change the state of the object' (as perceived by the outside world). That is, any method call before and after the const method call must yield the same results. Else your design is flawed.
Some things that make sense to be mutable:
mutex or other lock types
cached results (that will not change)
Mutexes are not part of your object's state; they are only a blocking mechanism to guarantee data integrity. A method that retrieves a value from your class does need to change the mutex, but your class data and state will be exactly the same after the execution of the const method as it was before.
With caching, consider that it only makes sense for data that is expensive to retrieve and assumed not to change (a DNS result, for example). Otherwise you could be returning stale data to your user.
Some things that should not be changed inside const methods:
Anything that modifies the state of the object
Anything that affects this or other method results
Any user of your class that executes const methods will assume that your class (as seen from the outside world) will not change during the execution. It will be quite misleading and error prone if it were not the same. As an example, assume that a dump() method changes some internal variable -state, value- and that during debug the user of your class decides to dump() your object at a given point: your class will behave differently with traces than without: perfect debug nightmare.
Note that if you do lazy optimizations you must do them to access immutable data. That is, if your interface dictates that during construction you will retrieve an element from the database that can be later accessed through a constant method, then if you do lazy fetching of the data you can end up in a situation where the user constructs your object to keep a copy of the old data, modifies the database and later on decides to restore the previous data into the database. If you have performed a lazy fetch you will end up losing the original value. An opposite example would be configuration file parsing if the config file is not allowed to be modified during the execution of the program. You can avoid parsing the file up to the point where it is needed, knowing that performing the read in the beginning or at a later time will yield the same result.
In any case - make note that this is no longer going to be thread safe. You can often rely on an object to be thread safe if it only has const methods (or you use only the const methods after initialization). But if those const methods are only logically const, then you lose that benefit (unless of course you start locking it all the time).
The only compiler optimization that could cause havoc would be for the compiler to figure out that you're calling the method twice with the same arguments, and just reuse the first return value - which is fine as long as the function truly is logically const, and returns the same value for a given set of arguments (and object state). Even that optimization isn't valid if it's possible that anyone (including another thread) has had access to the object to call non-const methods on it.
mutable was added to the language specifically for this case. C++ is a pragmatic language, and is happy to allow corner cases like this to exist for when they are needed.