This question already has answers here:
Returning a const reference to an object instead of a copy
(12 answers)
Closed 8 years ago.
Now, this is highly conceptual. I don't know if I understand this correctly, so please help me understand the difference.
Let's assume that name is a private std::string data member that is accessed by the getName() accessor function:
const string& getName() const {
return name;
}
Now then, this returns a reference, which is just another word for alias, to name. So, an alias is being returned, i.e. the name data member is being returned. Is this allowed or will it defeat the whole purpose of data hiding?
In other words, how exactly is the above method different to the conventional:
string getName() const {
return name;
}
???
And finally, is it really worth implementing the former instead of the latter?
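To make the difference concrete, here is a minimal sketch. The Person class, its accessor names, and rename() are made up for illustration and are not from the question. It only shows that the reference is an alias to the member, while the by-value return is an independent copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class; all names here are assumptions for illustration.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    const std::string& nameRef() const { return name_; }  // alias to the member
    std::string nameCopy() const { return name_; }        // independent copy

    void rename(const std::string& n) { name_ = n; }

private:
    std::string name_;
};

// Returns true if the reference followed the rename and the copy did not.
bool aliasVersusCopy() {
    Person p("Ann");
    const std::string& alias = p.nameRef();
    std::string copy = p.nameCopy();

    p.rename("Bob");  // modifies the very object that `alias` refers to

    return alias == "Bob" && copy == "Ann" && &alias == &p.nameRef();
}
```

The caller cannot modify the member through the const reference, but it does observe every later change to it.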
First of all, the reference would be problematic indeed if the underlying value could change, particularly in the context of multi-threaded execution. So it's almost a basic assumption that the value of the data member doesn't change during the lifetime of the object. That it's effectively a constant.
Now, a main problem with the reference is that it exposes an implementation detail so that it gets difficult to change the implementation.
A more academic problem is that it can break code, if there earlier was a by-value return, or just because it's unusual. E.g.
const string& s = foo().name();
With foo() returning an object by value, and name() returning a string by reference, this gives you a dangling reference instead of the naïvely expected prolonged lifetime. I call it academic because I can't imagine anyone writing that. Still, Murphy's law and all that.
It will probably not be (significantly) more efficient than a value return, precisely because it's unlikely that it's used just to initialize a reference.
So:
probably not significantly more efficient,
prevents changing implementation easily,
also has an academic problem, yielding dangling references.
In sum, just don't.
This is premature optimization and complication.
The first allows callers somewhat direct access to your internal name variable. Granted it's constant, so they can only call const methods on it. But still, do you want external callers operating on your hidden, internal data? Even worse, what if some bozo decides to const_cast the internal data buffer of the string and hack on it?
The second returns a copy of your internal name variable. Perfectly safe for any callers to use.
I usually steer away from the first type, except for trivial, low level types. But then trivial low level types don't have much overhead for copying anyways. So that means I never write stuff like that.
The const reference return is better since it does not make a copy of the string. The reason I say this is because the interface is more flexible this way - you can always copy the const reference into another string if needed or you can use it as a reference - up to the caller. Return a member by value and you are always stuck with making a copy. If name is big or used often, then it will impact performance and I assume performance is one of the reasons you use C++ in the first place.
Now, the other answers raise some negative points about returning a const reference, which I do not think are valid.
The concern that you can cast away the const is valid, but casting away const is just one of the tools in the C++ developer's toolbox. Why take it away? If someone really wants to mess with your object, they can always do so in C++ by addressing memory directly, so designing your code to save your callers from themselves is pointless. Casting the const away shows intent to do so and in my opinion is perfectly OK. It means that the caller has some very specific reasons to do so and knows that the const being cast away is for a non-const object and therefore safe.
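For reference, this is roughly what the cast being discussed looks like. It is only a sketch: the Account class and tamper() are hypothetical names. The write is well defined here only because the underlying object was not declared const.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration.
class Account {
public:
    explicit Account(std::string owner) : owner_(std::move(owner)) {}
    const std::string& owner() const { return owner_; }

private:
    std::string owner_;
};

// Overwrites the private member through the const reference getter.
std::string tamper() {
    Account a("alice");  // non-const object, so writing through the cast is defined
    const_cast<std::string&>(a.owner()) = "mallory";
    return a.owner();    // the private member now holds the new value
}
```

With a by-value getter the same cast would only modify the caller's own copy.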
The academic example in the other answer is just silly:
const string& s = foo().name();
Again, designing your code to attempt to save the caller from themselves is limiting you from the power of C++. If one would really want to do the above, the proper way would be
string s = foo().name();
So that point is moot too.
The only valid point is that it exposes the implementation somewhat. The efficiency gains, however, outweigh this concern in my opinion.
What you really should ask yourself is this - what is the usual case of using name()?
By answering this question, you will answer which flavour you should use.
To me, the fact that it is called name implies that it will mostly be used for printing/logging and comparison. Therefore, the const reference is the clear winner here.
Also, look at the style guides out there. Most of them will have you pass by const reference and return members by const reference. There are very good reasons to do so as outlined above.
I got a question regarding to these two possibilities of setting a value:
Let's say I got a string of a class which I want to change. I am using this:
void setfunc(const std::string& st) { this->str = st; }
And another function which can do exactly the same thing, but which returns a string reference instead of void, so that the value can be set through it:
std::string& reffunc() { return this->str; }
now if I am going to set a value I can use:
std::string text("mytext");
setfunc(text);
//or
reffunc() = text;
Now my question is if it is considered bad at using the second form of setting the value.
There is no performance difference.
The reason to have getters and setters in the first place is that the class can protect its invariants and is easier to modify.
If you have only setters and getters that return by value, your class has the following freedoms, without breaking API:
Change the internal representation. Maybe the string is stored in a different format that is more appropriate for internal operations. Maybe it isn't stored in the class itself.
Validate the incoming value. Does the string have a maximum or minimum length? A setter can enforce this.
Preserve invariants. Is there a second member of the class that needs to change if the string changes? The setter can perform the change. Maybe the string is a URL and the class caches some kind of information about it. The setter can clear the cache.
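The three freedoms listed above can be sketched in one small class. Everything in it is an assumption for illustration: the Link class, the 2048-character limit, and the cached host are invented, and the host parsing is deliberately naive.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical class; the length limit and the cached host are assumptions.
class Link {
public:
    void setUrl(const std::string& url) {
        // Validate the incoming value.
        if (url.empty() || url.size() > kMaxLength) {
            throw std::invalid_argument("url length out of range");
        }
        url_ = url;
        hostCached_ = false;  // preserve the invariant: never serve a stale host
    }

    // Returned by value, so the internal representation stays free to change.
    std::string url() const { return url_; }

    // Naive host extraction, computed lazily and cached until the next setUrl().
    std::string host() {
        if (!hostCached_) {
            std::size_t start = url_.find("://");
            start = (start == std::string::npos) ? 0 : start + 3;
            host_ = url_.substr(start, url_.find('/', start) - start);
            hostCached_ = true;
        }
        return host_;
    }

private:
    static constexpr std::size_t kMaxLength = 2048;
    std::string url_;
    std::string host_;
    bool hostCached_ = false;
};
```

Because every change goes through setUrl(), the cache can never describe a URL that is no longer stored.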
If you change the getter to return a const reference, as is sometimes done to save a copy, you lose some freedom of representation. You now need an actual object of the return type that you can reference which lives long enough. You need to add lifetime guarantees to the return value, e.g. promising that the reference is not invalidated until a non-const member is used. You can still return a reference to an object that is not a direct member of the class, but maybe a member of a member (for example, returning a reference to the first name part of an internal name struct), or a dereferenced pointer.
But if you return by non-const reference, almost all bets are off. Since the client can change the value referenced, you can no longer rely on a setter being called and code controlled by the class being executed when the value changes. You cannot constrain the value, and you cannot preserve invariants. Returning by non-const reference makes the class little different from a simple struct with public members.
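A small sketch of how a non-const reference getter sidesteps the setter. The Label class and its cached length are hypothetical; the point is only that no code of the class runs when the caller assigns through the reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class; intended invariant: length_ == text_.size().
class Label {
public:
    void setText(const std::string& t) {
        text_ = t;
        length_ = t.size();  // the setter keeps the invariant
    }

    std::string& textRef() { return text_; }  // non-const reference getter

    bool invariantHolds() const { return length_ == text_.size(); }

private:
    std::string text_;
    std::size_t length_ = 0;
};

// Returns true if writing through the reference broke the invariant.
bool bypassBreaksInvariant() {
    Label l;
    l.setText("abc");
    bool before = l.invariantHolds();

    l.textRef() = "a much longer string";  // the class gets no chance to react

    return before && !l.invariantHolds();
}
```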
Which leads us to that last option, simply making the data member public. Compared to a getter returning a non-const reference, the only thing you lose is that the object returned can no longer be an indirect member; it has to be a direct, real member.
On the other side of that equation is performance and code overhead. Every getter and setter is additional code to write, with additional opportunities for errors (ever copy-pasted a getter/setter pair for one member to create the same for another and then forgot to change one of the variables in there?). The compiler will probably inline the accessors, but if it doesn't, there's call overhead. If you return something like a string by value, it has to be copied, which is expensive.
It's a trade-off, but most of the time, it's better to write the accessors because it gives you better encapsulation and flexibility.
We cannot see the definition of the member str.
If it's private, your reffunc() exposes a reference to a private member; you're writing a function to violate your design, so you should reconsider what you're doing.
Moreover, it's a reference, so you have to be sure that the object containing str still exists when you use that reference.
Moreover, you are showing outside implementation details, that could change in the future, changing the interface itself (if str becomes something different, setfunc()'s implementation could be adapted, reffunc()'s signature has to change).
It's not wrong what you wrote, but it could be used in the wrong way. You're reducing the encapsulation. It's a design choice.
It's fine. However, you have to watch out for these pitfalls:
the referenced object is modifiable. When you return a non-const reference, you expose data without protection against modifications. Obvious, but be aware of this anyway!
referenced objects can go out of scope. If the referenced object's lifetime ends, accessing it through the reference is undefined behavior. However, a const reference that is bound directly to a temporary extends that temporary's lifetime.
The way you used the reffunc() function is considered bad coding. But (as mentioned in the comments), generally speaking, returning references is not bad coding.
Here's why reffunc() = text; is considered bad coding:
People usually do not expect function calls on the left hand of an assignment, but on the right side. The natural expectation when seeing a function call is that it computes and returns a value (or rvalue, which is expected to be on the right hand side of assignment) and not a reference (or lvalue, which is expected to be on the left hand side of assignment).
So by putting a function call on the left hand side of the assignment, you are making your code more complicated, and therefore, less readable. Keeping in mind that you do not have any other motivations for it (as you say, performance is the same, and it usually is in these situations), good coding recommends that you use a "set" function.
Read the great book "Clean Code" for more issues on clean coding.
As for returning references from functions, which is the title of your question, it is not always bad coding and is sometimes required for cleaner and briefer code. Specifically, many operator overloading features in C++ only work properly if you return a reference (see operator[] in std::vector and the assignment operator, which usually help the code become more readable and less complex; see the comments).
I know that "best" is relative and varies with different situations, but why would one choose to implement say a getter by passing in a variable rather than a pointer to a variable. Since passing pointers is generally faster/less overhead, why not just use pointers/references all the time instead of passing variables? I can only see issues if the original variable is deleted, then you'll be left with null pointers, but in the case of class level variables that shouldn't be an issue right?
Example:
int getNum() { return num; }
vs
void getNum(int* toGet) { *toGet = num; }
Use T getter() or T getter() const unless the return type has no copy/move constructor. The only exception is a significant performance issue. As for pointers, I think the only reason to use void getter(T* pointer) is writing POD data into a pre-allocated buffer.
When you are about to choose void getter(T& value) for performance reasons, check whether the compiler performs Return Value Optimization for you. In most cases it does, so just let your compiler work for you.
When you're sure that RVO is not applicable in your case, check whether this code is called often (maybe performance doesn't matter in the calling function).
And when you can prove to anyone concerned that a reference or pointer is needed in your getter, use the reference/pointer alternative. As has been suggested above, pointers and references are much less obvious and harder to maintain than returning by value. Don't add potential sources of error to your code just because you can.
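Side by side, the two styles look roughly like this. This is a sketch with made-up names (Counter, getNum, getNumInto). Note that the pointer version must write through the pointer, and it adds a precondition the by-value version does not have.

```cpp
#include <cassert>

// Hypothetical class for illustration.
class Counter {
public:
    explicit Counter(int n) : num_(n) {}

    // By value: nothing for the caller to get wrong.
    int getNum() const { return num_; }

    // Out-parameter: the caller must pass a valid, non-null pointer.
    void getNumInto(int* out) const { *out = num_; }

private:
    int num_;
};

// Reads the same member both ways and adds the results.
int sumBothWays() {
    Counter c(21);
    int viaReturn = c.getNum();

    int viaPointer = 0;
    c.getNumInto(&viaPointer);

    return viaReturn + viaPointer;
}
```

For an int the pointer is no cheaper to pass than the value itself, so the second form buys nothing here.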
Closed 10 years ago.
Possible Duplicate:
Should accessors return values or constant references?
First of all, let's ignore the setters and getters are/aren't evil. :)
My question is, if I have a class that has some std:: container as a member, let's say string, what should the return type of the getter be? I kind of prefer const T& compared to T for performance reasons... I know that most of the time users will make a copy anyway, but I guess not all the time. Am I wrong?
So in general what is better:
std::string get_name() const;
OR
const std::string& get_name() const;
Return a constant reference. If the user wants to make a copy, no skin off your back.
Here's the consideration:
1) return a copy string. Then the documentation is simple ("returns the current value of...") and the function is slow (I doubt there are many circumstances where a compiler is smart enough to omit the copy, even where the return value is used only within a single expression. It's theoretically possible for the compiler to recognize string as a value type with side-effect-free copies, and also to prove that the referand cannot change during the period in which the caller uses the copy, and therefore use a reference instead. But will it do all that?).
2) return a reference const string&. Then the documentation is complex, ("returns a reference to a string object containing the current value of... This reference remains valid for the following time... It continues to contain the same value for the following subset of that time..."). The function is fast if the caller doesn't need a copy. The implementation of the class is pretty much constrained to always in future store that string as a string data member, because it will not otherwise have anything to return with suitable lifetime.
Aliasing is potentially fast (if it avoids copies) but complex (since referands can change or disappear), so functions that return references are potentially fast but complex. Furthermore, (1) is a "getter that returns a property of the object", but (2) is a "getter that returns a private member of the object". So if getters are evil then (2) is more evil than (1).
I would generally return the reference if the getter is essentially there as a hack for other tightly-coupled classes to get at the data, or if the class has very obvious semantics for when it will change, for example "never during the lifetime of the object", or if the string is expected to be so huge that it's reasonable to expose it by reference simply because taking a copy of it should be rare and so callers will be expecting view behavior rather than value behavior. I'd probably return the value if the interface is supposed to be compatible-forever, just to be safe, unless the class I'm writing is explicitly designed as "a thing that holds a big string and does X for you whilst still letting you see the string".
Immutable garbage-collected strings make this problem go away, which is probably one of the reasons they're attractive to designers of higher-level languages.
That depends on the most common use of the getter.
If that string is going to be used from the moment of the get operation until the end of the program, a copy might be advisable, as you don't want to "make a promise" to the user about the lifetime of the string.
If that string is going to be used only momentarily, use a reference.
If your entire system uses some string repository that ensures that all strings' lifetimes are known, you can safely return a reference from that repository.
In general a const reference is ideal. As Kerrek said, if they need a copy they can make it themselves. The only place this can potentially cause issues is entirely on the onus of the receiver of the reference. For example, when a const reference variable receives the result of an object member function returning a const reference, and that object is later destroyed independently of the lifetime of the reference variable, you've effectively "remembered" a pointer (loose term) that is no longer valid; i.e.:
const std::string& myref = myobj->getString();
...
delete myobj;
But you should be aware of this (since you're writing the code) and therefore should plan to avoid it in the first place. The caller should have made a copy rather than holding on to the reference; it is still best practice to return the reference regardless.
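The "make a copy" alternative, sketched with a hypothetical Holder class and std::unique_ptr in place of the raw new/delete from the snippet:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration.
class Holder {
public:
    explicit Holder(std::string s) : s_(std::move(s)) {}
    const std::string& getString() const { return s_; }

private:
    std::string s_;
};

// The caller copies because it needs the value beyond the owner's lifetime.
std::string copyOutlivesOwner() {
    std::unique_ptr<Holder> obj(new Holder("payload"));

    std::string kept = obj->getString();  // copy made from the const reference
    obj.reset();                          // Holder and its member are destroyed

    return kept;                          // still valid: it is an independent string
}
```

Had kept been declared as const std::string&, it would dangle after reset().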
I know why the following does not work correctly, so I am not asking why. What bothers me is that it seems to be a very big programming hindrance.
#include <iostream>
#include <string>
using namespace std;
string ss("hello");
const string& fun(const string& s) {
return s;
}
int main(){
const string& s = fun("hello");
cout<<s<<endl;
cout<<fun("hello")<<endl;
}
The first cout will not work. The second cout will.
My concern is the following:
Is it not possible to imagine a situation where a method implementor wants to return an argument that is a const reference and is unavoidable?
I think it is perfectly possible.
What would you do in C++ in this situation?
Thanks.
In C++, it is important to establish the lifetimes of objects. One common technique is to decide upon an "owner" for each object. The owner is responsible for ensuring that the object exists as long as it is needed, and deleting it when not needed.
Often, the owner is another object that holds the owned object in an instance variable. The other typical ways to deal with this are to make it a global, a static member of a class, a local variable, or use a reference-counted pointer.
In your example, there is no clear ownership of the string object. It is not owned by the main() function, because it is not a local variable, and there is no other owner.
I feel your pain. I've found other situations where returning a const reference seemed the right thing to do, but had other ugly issues.
Luckily, the subtle gotcha is solved in C++0x. Always return by value. The new move constructors will make things as fast as you could wish.
The technique is valid and is used all the time. However in your first example you are converting a const char* to a temporary std::string and attempting to return it, which is not the same as returning a const-reference to an object stored elsewhere. In the second example you are doing the same thing, but you are using the result before the temporary is destroyed, which in this case is legal but dangerous (see your first case.)
Update: Allow me to clarify my answer some. I'm saying the problem lies in the creation of the temporary and not correctly handling the lifetimes of the objects being created. The technique is a good one, but it (along with many other good techniques) requires the pre- and post-conditions of the functions be met. Part of this burden falls on the function programmer (who should document it) and partly on the client as well.
I think it is a slight weakness of C++. There's an unfortunate combination of two factors:
The function's return is only valid as long as its argument is.
Implicit conversion means that the function's argument is not the object it may appear to be.
I have no sympathy for people who fail to think about the lifetime of objects they have pointers/references to. But the implicit conversion, which certainly is a language feature with subtle pros and cons, is not making the analysis very easy here. Sometimes implicit conversion is bad news, which is why the explicit keyword exists. But the problem isn't that conversion to string is bad in general, it's just bad for this function, used in this incorrect way.
The author of the function can in effect disable implicit conversion, by defining an overload:
const char *fun(const char *s) { return s; }
That change alone means the code which previously was bad, works. So I think it's a good idea in this case to do that. Of course it doesn't help if someone defines a type which the author of fun has never heard of, and which has an operator std::string(). Also, fun is not a realistic function, and for more useful routines you might not want to provide an equivalent which operates on char*. In that case, void fun(const char *); at least forces the caller to explicitly cast to string, which might help them use the function correctly.
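A sketch of the overload at work, using the fun from the question plus the const char* version. The helper literalCallIsSafe() is a made-up name for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

const std::string& fun(const std::string& s) { return s; }

// Preferred for string literals: array-to-pointer conversion beats the
// user-defined conversion to std::string, so no temporary is created.
const char* fun(const char* s) { return s; }

// Returns true if the literal call yields a pointer that is safe to keep.
bool literalCallIsSafe() {
    const char* p = fun("hello");  // points at the literal, which lives for the whole program
    return std::strcmp(p, "hello") == 0;
}
```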
Alternatively, the caller could note that he's providing a char*, and getting back a reference to a string. That appears to me to be a free lunch, so alarm bells should be ringing where this string came from, and how long it's going to last.
Yes, I agree that there are situations where this is a relevant problem.
I would use a reference-counted pointer to "solve" it.
I think you are asking for trouble in C++98 :)
This can be solved in two ways. First, you could use a shared pointer. In this case, the memory would be managed automatically by the shared_ptr, and you are done! But, this is a bad solution in most cases. Because you are really not sharing the memory between many references. auto_ptr is the true solution for this problem, if you consider using the heap all the time. auto_ptr needs one little crucial improvement that is not there in C++98 to be really usable, that is : Move Semantic!
A better solution is to allow ownership to be moved between references, by using r-value references, which are available in C++0x. So, your piece of code would look like this (not sure if the syntax is correct):
string fun(const string& s) {
return s; // return a copy of s
}
....
string s = fun("Hello"); // the actual heap memory is transferred to s.
// the temporary is destroyed, but as we said
// it is empty, because 's' now owns the actual data!
Is it not possible to imagine a situation where a method implementor wants to return an argument that is a const reference and is unavoidable?
Wrong question to ask, really. All you have to do is include whether the returned reference might be to a parameter (passed by reference), and document that as part of the interface. (This is often obvious already, too.) Let the caller decide what to do, including making the temporary into an explicit object and then passing that.
It is common and required to document the lifetimes of returned pointers and references, such as for std::string::data.
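As an illustration of documenting such a contract, here is a sketch with a hypothetical function longer() whose comment states where the returned reference points. The caller then passes named objects so the result stays valid.

```cpp
#include <cassert>
#include <string>

// Hypothetical function. Contract: the result refers to one of the two
// arguments and is valid only as long as that argument is alive.
const std::string& longer(const std::string& a, const std::string& b) {
    return (b.size() > a.size()) ? b : a;
}

// The caller turns would-be temporaries into explicit objects first.
bool namedArgumentsKeepResultValid() {
    std::string x("hi");
    std::string y("hello");

    const std::string& r = longer(x, y);  // refers to y, which outlives r

    return &r == &y && r == "hello";
}
```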
What would you do in C++ in this situation?
Often you can pass and return by value instead. This is commonly done with things like std::copy (for the destination iterator in this case).
In the upcoming C++ standard, r-value references can be used to keep your temporary objects 'alive' and would fix the issue that you're having.
You may want to look up perfect forwarding and move constructors as well.
Whilst refactoring some code I came across some getter methods that returns a std::string. Something like this for example:
class foo
{
private:
std::string name_;
public:
std::string name()
{
return name_;
}
};
Surely the getter would be better returning a const std::string&? The current method is returning a copy which isn't as efficient. Would returning a const reference instead cause any problems?
The only way this can cause a problem is if the caller stores the reference, rather than copy the string, and tries to use it after the object is destroyed. Like this:
foo *pFoo = new foo;
const std::string &myName = pFoo->name();
delete pFoo;
cout << myName; // error! dangling reference
However, since your existing function returns a copy, then you would
not break any of the existing code.
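Edit: Modern C++ (i.e. C++11 and up) has move semantics, and compilers routinely apply Return Value Optimization, so returning things by value is no longer frowned upon. Note that a getter returning a data member by value still has to copy that member; what gets elided is the extra copy from the returned temporary into the caller's variable. One should still be mindful of returning extremely large objects by value, but in most cases it should be ok.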
Actually, another issue specifically with returning a string not by reference, is the fact that std::string provides access via pointer to an internal const char* via the c_str() method. This has caused me many hours of debugging headache. For instance, let's say I want to get the name from foo, and pass it to JNI to be used to construct a jstring to pass into Java later on, and that name() is returning a copy and not a reference. I might write something like this:
foo myFoo = getFoo(); // Get the foo from somewhere.
const char* fooCName = myFoo.name().c_str(); // Woops! myFoo.name() creates a temporary that's destructed as soon as this line executes!
jniEnv->NewStringUTF(fooCName); // No good, fooCName was released when the temporary was deleted.
If your caller is going to be doing this kind of thing, it might be better to use some type of smart pointer, or a const reference, or at the very least have a nasty warning comment header over your foo.name() method. I mention JNI because former Java coders might be particularly vulnerable to this type of method chaining that may seem otherwise harmless.
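One way to avoid the problem on the caller's side is to keep the returned string in a named variable for as long as the pointer is needed. In this sketch the consume() function is a stand-in for the JNI call and foo is a hypothetical reconstruction of the class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical reconstruction of the class with a by-value getter.
class foo {
public:
    std::string name() const { return name_; }

private:
    std::string name_ = "widget";
};

// Stand-in for an API that takes a C string (an assumption, not JNI itself).
std::size_t consume(const char* s) { return std::strlen(s); }

std::size_t safeUse() {
    foo myFoo;
    std::string kept = myFoo.name();  // the copy lives in a named variable
    const char* cName = kept.c_str(); // valid while kept is alive and unmodified
    return consume(cName);
}
```

Passing myFoo.name().c_str() directly as a function argument is also fine, because the temporary lives until the end of that full expression; the trouble only starts when the pointer is stored first.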
One problem for the const reference return would be if the user coded something like:
const std::string & str = myObject.getSomeString() ;
With a std::string return, the temporary object would remain alive and attached to str until str goes out of scope.
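The lifetime extension mentioned here can be observed with a small counting type. This is a sketch: Tracked and makeTracked() are invented for illustration and stand in for std::string and the getter.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

Tracked makeTracked() { return Tracked(); }  // returns by value

// Returns true if the temporary lived exactly as long as the reference.
bool temporaryLivesAsLongAsReference() {
    bool aliveInside = false;
    {
        const Tracked& r = makeTracked();  // lifetime extended to r's scope
        (void)r;
        aliveInside = (Tracked::alive == 1);
    }                                      // temporary destroyed here
    return aliveInside && Tracked::alive == 0;
}
```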
But what happens with a const std::string &? My guess is that we would have a const reference to an object that could die when its parent object deallocates it:
MyObject * myObject = new MyObject("My String") ;
const std::string & str = myObject->getSomeString() ;
delete myObject ;
// Use str... which references a destroyed object.
So my preference goes to the const reference return (because, anyway, I'm just more comfortable with sending a reference than hoping the compiler will optimize away the extra temporary), as long as the following contract is respected: "if you want it beyond my object's existence, then copy it before my object's destruction".
Some implementations of std::string share memory with copy-on-write semantics, so return-by-value can be almost as efficient as return-by-reference and you don't have to worry about the lifetime issues (the runtime does it for you).
If you're worried about performance, then benchmark it (<= can't stress that enough) !!! Try both approaches and measure the gain (or lack thereof). If one is better and you really care, then use it. If not, then prefer by-value for the protection it offers against the lifetime issues mentioned by other people.
You know what they say about making assumptions...
Okay, so the differences between returning a copy and returning the reference are:
Performance: Returning the reference may or may not be faster; it depends on how std::string is implemented by your compiler implementation (as others have pointed out). But even if you return the reference the assignment after the function call usually involves a copy, as in std::string name = obj.name();
Safety: Returning the reference may or may not cause problems (dangling reference). If the users of your function don't know what they are doing, storing the reference as reference and using it after the providing object goes out of scope then there's a problem.
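The performance point can be checked by counting copies instead of guessing. A sketch with invented names (Payload, Box): binding the returned reference costs no copy, while assigning it to a new object costs one either way.

```cpp
#include <cassert>

// Hypothetical type that counts copy constructions.
struct Payload {
    static int copies;
    Payload() = default;
    Payload(const Payload&) { ++copies; }
};
int Payload::copies = 0;

class Box {
public:
    const Payload& ref() const { return p_; }
    Payload value() const { return p_; }

private:
    Payload p_;
};

int copiesWhenBindingReference() {
    Box b;
    Payload::copies = 0;
    const Payload& r = b.ref();  // no copy, just an alias
    (void)r;
    return Payload::copies;
}

int copiesWhenCopyingFromReference() {
    Box b;
    Payload::copies = 0;
    Payload c = b.ref();         // the caller makes the copy
    (void)c;
    return Payload::copies;
}

int copiesWhenReturningByValue() {
    Box b;
    Payload::copies = 0;
    Payload c = b.value();       // the getter makes the copy
    (void)c;
    return Payload::copies;
}
```

With copy elision (guaranteed since C++17) the by-value case typically also ends up at a single copy, so the reference only wins when the caller does not need its own object.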
If you want it fast and safe use boost::shared_ptr. Your object can internally store the string as shared_ptr and return a shared_ptr. That way, there will be no copying of the object going on, and it's always safe (unless your users pull out the raw pointer with get() and do stuff with it after your object goes out of scope).
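A sketch of that idea, using std::shared_ptr (the standard counterpart of boost::shared_ptr since C++11) and a hypothetical Named class:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class that shares ownership of its string with callers.
class Named {
public:
    explicit Named(const std::string& n)
        : name_(std::make_shared<std::string>(n)) {}

    // Copies the pointer (one reference-count increment), not the string.
    std::shared_ptr<const std::string> name() const { return name_; }

private:
    std::shared_ptr<const std::string> name_;
};

// The string outlives the object that handed it out.
std::string nameSurvivesOwner() {
    std::shared_ptr<const std::string> kept;
    {
        Named n("gamma");
        kept = n.name();
    }              // n is destroyed; kept still owns the string
    return *kept;
}
```

The price is a heap allocation per string and atomic reference counting on each call to name().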
I'd change it to return const std::string&. The caller will probably make a copy of the result anyway if you don't change all the calling code, but it won't introduce any problems.
One potential wrinkle arises if you have multiple threads calling name(). If you return a reference, but then later change the underlying value, then the caller's value will change. But the existing code doesn't look thread-safe anyway.
Take a look at Dima's answer for a related potential-but-unlikely problem.
It is conceivable that you could break something if the caller really wanted a copy, because they were about to alter the original and wanted to preserve a copy of it. However it is far more likely that it should, indeed, just be returning a const reference.
The easiest thing to do is try it and then test it to see if it still works, provided that you have some sort of test you can run. If not, I'd focus on writing the test first, before continuing with refactoring.
Odds are pretty good that typical usage of that function won't break if you change to a const reference.
If all of the code calling that function is under your control, just make the change and see if the compiler complains.
Does it matter? As soon as you use a modern optimizing compiler, functions that return by value will not involve a copy unless they are semantically required to.
See the C++ FAQ Lite on this.
Depends what you need to do. Maybe you want to allow the caller to change the returned value without changing the class. If you return the const reference, that won't fly.
Of course, the next argument is that the caller could then make their own copy. But if you know how the function will be used and know that happens anyway, then maybe doing this saves you a step later in code.
I normally return const& unless I can't. QBziZ gives an example of where this is the case. Of course QBziZ also claims that std::string has copy-on-write semantics, which is rarely true today since COW involves a lot of overhead in a multi-threaded environment. By returning const & you put the onus on the caller to do the right thing with the string on their end. But since you are dealing with code that is already in use, you probably shouldn't change it unless profiling shows that the copying of this string is causing massive performance problems. Then if you decide to change it, you will need to test thoroughly to make sure you didn't break anything. Hopefully the other developers you work with don't do sketchy stuff like in Dima's answer.
Returning a reference to a member exposes the implementation of the class.
That could prevent you from changing the class later. It may be useful for private or protected methods in case the optimization is needed.