I'm still not quite sure when return-by-value is a good idea in C++ and when not. In the following case, is it ok?
vector<int> to_vec(const Eigen::MatrixXi& in){
vector<int> out;
// copy contents of in into out
return out;
}
Eigen::MatrixXi to_eigen(const vector<int>& in){
Eigen::MatrixXi out;
// copy contents of in into out
return out;
}
Depending on how vector and MatrixXi actually work, this could result in an expensive copy. On the other hand, I assume that they leverage C++'s move functionality to transfer the result inexpensively by reusing the underlying data.
Without exactly knowing the implementation, what can I assume?
In such a situation where you're declaring a local variable, initializing it and returning it by value, you can be pretty safe in assuming that your compiler will elide the copy.
This case is known as named return value optimization. Essentially, instead of allocating the return value in the function call, it'll be done at the call site and passed in as a reference. Returning by value is the best choice here, as you don't need to declare a variable at the call site to pass in, but the performance will be as if you had.
In C++17, copy elision will be mandatory in most cases involving prvalues (e.g. T t = get_t(); or return get_t()), but is still optional for NRVO.
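A minimal sketch of how one could check this, assuming a hypothetical instrumented type Tracer (not part of the question) that counts its own copies and moves. The factory has the same shape as to_vec and to_eigen: declare a local, fill it, return it by name.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    std::vector<int> data;

    Tracer() = default;
    Tracer(const Tracer& other) : data(other.data) { ++copies; }
    Tracer(Tracer&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Declare a local, fill it, return it by name.
Tracer make_tracer() {
    Tracer out;
    out.data.assign(100000, 42);
    return out;  // NRVO candidate; treated as an rvalue if the copy is not elided
}

void example_usage() {
    Tracer t = make_tracer();
    assert(t.data.size() == 100000);
    // The elements are never deep-copied, whether or not NRVO is applied.
    assert(Tracer::copies == 0);
    // Tracer::moves is 0 when the compiler elides everything; otherwise it
    // counts the cheap fallback moves.
}
```

Calling example_usage() from main and printing Tracer::moves shows what a particular compiler and set of flags actually does.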
The rules of thumb regarding return values in C++ are:
1. never return a reference to a local variable
2. never return a pointer to a local variable
3. don't return a named local using std::move; just return it by name
As for (3): this is a known concern with C++. We all learned that returning an object by value invokes the copy constructor. That is true in theory but rarely in practice: the compiler will typically apply copy elision, and wrapping the returned local in std::move prevents it.
Copy elision is an optimization that constructs the value directly in storage provided by the caller rather than in the callee's own frame, which avoids an expensive copy. The callee still performs all modifications on that object.
As for (1) and (2), there is also a corner case regarding coroutines and generators, but unless you know you're dealing with them, (1) and (2) are always valid.
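A small sketch of rule (3), assuming a hypothetical type Counted that counts move constructions. The two factories differ only in whether the local is wrapped in std::move; some compilers warn about the second form.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that counts move constructions.
struct Counted {
    static int moves;
    std::string payload;

    Counted() = default;
    Counted(const Counted&) = default;
    Counted(Counted&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
};
int Counted::moves = 0;

Counted plain_return() {
    Counted c;
    c.payload = "value";
    return c;             // names a local: eligible for NRVO
}

Counted moved_return() {
    Counted c;
    c.payload = "value";
    return std::move(c);  // no longer just a name: NRVO is off, a move is forced
}

// Runs one factory and reports how many move constructions it caused.
int moves_for(Counted (*factory)()) {
    Counted::moves = 0;
    Counted result = factory();
    assert(result.payload == "value");
    return Counted::moves;
}
```

moves_for(moved_return) is at least 1 on every conforming compiler, while moves_for(plain_return) is 0 whenever NRVO is applied.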
Related
Let's say I have a simple function returnString that returns a string by value:
std::string returnString() {
std::string s;
// Use s in such a way to defeat return value mandatory copy-elision
return s;
}
I have another function that wants to heap-allocate the result of this. Easy enough.
void caller() {
std::string* heap_allocated_string = new std::string(returnString());
}
Instead of std::string though, consider an arbitrary type T. I believe that by the language rules, the following two statements are true. Are they?
In C++14, I believe this is not ideal, since for some types if the move ctor is not free or not defined, we might be doing unnecessary work compared to just directly constructing on the heap.
In C++17, this triggers mandatory copy elision, so even if the type did not define a move constructor, no extra copy would be created, and the move constructor would not be called.
In general though, for a generic type, is there a better way to do this, without modifying the called function?
In C++14, I believe this is not ideal, since for some types if the move ctor is not free or not defined, we might be doing unnecessary work compared to just directly constructing on the heap.
Define "not ideal".
If the object has to be constructed via a factory function which returns by value, and you want to heap-allocate the object instead, and you have to do this "without modifying the called function," then that is as good as it's going to get.
Plus, you need not worry about the lack of copy/move in C++14. The reason being that it is (almost) impossible to return a non-copyable, non-moveable object by value in C++14. There is technically a way to do it (through the use of list-initialization syntax in the return statement), but if the function is written as you've stated it, then whatever type it returns must be copyable or moveable.
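To illustrate the list-initialization route mentioned above, here is a sketch with a hypothetical type Pinned whose copy and move constructors are deleted. It is only an example of the technique, not something the question's function does.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int x;
    int y;
    Pinned(int x_, int y_) : x(x_), y(y_) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

Pinned make_pinned() {
    return {1, 2};  // list-initializes the return object in place, no copy/move needed
}

int sum_of_pinned() {
    const Pinned& p = make_pinned();  // binding a reference needs no copy or move either
    return p.x + p.y;
}
```

Writing return Pinned(1, 2); instead would not compile before C++17, because that form formally requires an accessible copy or move constructor.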
Furthermore, the new expression on your end doesn't even require named RVO; this part is just eliding a temporary, and there's no reason why a compiler wouldn't be able to optimize that move away.
So basically, there has to be a copy/move constructor for the function to compile, and any copy/move will be optimized away on your end for all practical purposes. So there's nothing to be concerned about.
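A sketch of the heap-allocation case, assuming a hypothetical type Widget with counting constructors and a factory shaped like returnString. Wrapping the raw new in a std::unique_ptr is an addition for safety, not something the question asked for.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type with counting copy/move constructors.
struct Widget {
    static int copies;
    static int moves;
    std::string name;

    explicit Widget(std::string n) : name(std::move(n)) {}
    Widget(const Widget& other) : name(other.name) { ++copies; }
    Widget(Widget&& other) noexcept : name(std::move(other.name)) { ++moves; }
};
int Widget::copies = 0;
int Widget::moves = 0;

// Factory that returns a named local by value.
Widget make_widget() {
    Widget w("gadget");
    w.name += "-v2";
    return w;
}

std::unique_ptr<Widget> make_widget_on_heap() {
    // The factory call is a prvalue. In C++17 it initializes the heap object
    // directly; in C++14 the temporary may be elided or moved.
    return std::unique_ptr<Widget>(new Widget(make_widget()));
}
```

Note that std::make_unique<Widget>(make_widget()) is not equivalent: the argument is bound to a forwarding reference first, so the heap object is always move-constructed from it.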
I am fairly new to C++ and I know of three ways of returning a local variable and all have their downsides:
Person& getPerson()
{
Person bob;
return bob;
}
Clearly not a good idea.
Person getPerson()
{
Person bob;
return bob;
}
No chance of a null pointer or dangling reference but a performance hit.
Person* getPerson()
{
return new Person();
}
No chance of a null pointer but surely this violates the basic rules of OO design. Another object will have to delete this - but why should it have to? The implementation of the getPerson() method has nothing to do with it.
So, I am looking for an alternative. I have heard of shared pointers and smart pointers (standard and Boost) but I'm not sure whether any of them are designed to deal with this problem. What do you guys suggest?
Option #2: return by value.
Person getPerson()
{
Person bob;
return bob;
}
There is no performance hit here. This copy may be (and probably will be) elided by your compiler. In fact, even if you turn off your compiler's copy elision optimizations, with a C++11 compiler this will be considered as a move first.
In fact, even if you then do Person p = getPerson(), which would normally involve two copies, both may be elided.
See §12.8/31:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
And §12.8/32:
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
No chance of a null pointer or dangling reference but a performance hit.
Actually, no performance hit at all.
See for example here: Want Speed? Pass by Value.
Compiler can easily optimize that, with strategies called copy elision and the named return value optimization (check out the link for that).
You shouldn't worry too much about a performance hit here:
Person getPerson()
{
Person bob;
return bob;
}
The copy you are worried about will most likely be elided in what is called return value optimization (RVO). The C++ standard allows compilers to make this optimization, even if it breaks the as-if rule. I haven't come across a compiler that wouldn't elide a copy in this kind of expression for a long time:
Person p = getPerson();
In C++11, even in the absence of copy elision, this would be a candidate for a move construction. This could be an extremely cheap operation, but that really depends on the type in question. In any case, copy elision is hard to avoid.
See this related post.
See this demo.
As others have already pointed out, return value optimization helps minimize the performance hit from simply returning a value.
Move semantics (new with C++11) can also help in this regard -- a return expression is pretty much the canonical example of an "xvalue", which is eligible to have its value moved from the source to the destination, rather than copied. Especially for a type (e.g., vector) that mostly consists of a pointer to the real data, this can be extremely beneficial, as it allows essentially a shallow copy instead of a deep copy (i.e., instead of making a copy of the entire vector, it only ends up copying the pointer).
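The "shallow copy" effect can be observed directly. This sketch checks that a moved-to std::vector ends up owning the very buffer the source had; the function name is made up for illustration, and the pointer identity is what mainstream standard library implementations do with the default allocator.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move-constructing a vector hands over the existing buffer
// instead of allocating a new one and copying the elements.
bool move_reuses_buffer() {
    std::vector<int> source(1000000, 7);
    const int* buffer = source.data();            // address of the element storage
    std::vector<int> target = std::move(source);  // move constructor: constant time
    return target.data() == buffer && target.size() == 1000000;
}
```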
A shared_ptr or unique_ptr can work here as well. A shared_ptr is basically a reference counted pointer, so (pre-C++11) it lets you keep the object alive by just incrementing a reference count during the return process, then decrementing it again afterwards. At least in a single-threaded environment, this is generally pretty cheap -- often cheaper than making a copy of the data.
A unique_ptr does roughly similar things, but without the overhead of incrementing and decrementing a reference count. The basic difference is that instead of making copying cheap, it moves the pointer to avoid doing a copy at all.
Any of these can work, but pretty clearly the best of them in most cases is to just return the value (and if it makes sense, add a move constructor and/or move assignment operator to the type you're working with).
Local variables go out of scope - their lifetime ends - once the function execution is complete. Therefore, generally it is not a good idea to return references or pointers to local variables.
What you might want to do is to return references or pointers to class member variables, which maintain their lifetime as long as the class object is in scope or has a valid lifetime.
Should you ever need to return polymorphic objects, I recommend using unique pointers:
std::unique_ptr<Person> getPerson()
{
return std::unique_ptr<Person>(new Programmer);
}
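For completeness, a self-contained version of that idea. The class bodies and the role() member are assumptions made up for this sketch; the original only names Person and Programmer.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class hierarchy for illustration.
struct Person {
    virtual ~Person() = default;  // needed so deleting through Person* is well defined
    virtual std::string role() const { return "person"; }
};

struct Programmer : Person {
    std::string role() const override { return "programmer"; }
};

std::unique_ptr<Person> getPerson() {
    // In C++14 and later, std::make_unique<Programmer>() does the same job.
    return std::unique_ptr<Person>(new Programmer);
}
```

The caller owns the result, and the object is destroyed automatically when the unique_ptr goes out of scope, so nobody has to remember to call delete.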
I know I bang on about this a lot.
Another alternative is to not return anything.
Tell the object what to do:
display yourself, using this renderer
serialise yourself using this serialiser (implementation could be xml, database, json, network)
update your state for this time
decorate yourself with controls, using this control creator (creates sliders, dropdown lists, checkboxes, etc)
No need for getters on the whole. Make efforts to avoid them and you'll find your designs pleasantly changed, testable, reasonable.
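A rough sketch of the first item in that list, "display yourself, using this renderer". All names here (Renderer, draw_text, RecordingRenderer) are hypothetical and only meant to show the shape of the design.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical rendering interface.
struct Renderer {
    virtual ~Renderer() = default;
    virtual void draw_text(const std::string& text) = 0;
};

class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Instead of exposing a getter, the object is told to display itself.
    void display(Renderer& renderer) const {
        renderer.draw_text("Person: " + name_);
    }

private:
    std::string name_;
};

// Test double that records what was drawn.
struct RecordingRenderer : Renderer {
    std::vector<std::string> lines;
    void draw_text(const std::string& text) override { lines.push_back(text); }
};
```

Because Person never returns its internals, the question of how to return them does not come up, and a recording renderer like the one above makes the behaviour easy to test.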
In my current project I need to implement quite a few functions/methods that take some parameters and generate a collection of results (rather large). So in order to return this collection without copying, I can either create a new collection and return a smart pointer:
boost::shared_ptr<std::vector<Stuff> > generate();
or take a reference to a vector which will be populated:
void generate(std::vector<Stuff> &output);
Both approaches have benefits. The first clearly shows that the vector is the output of the function, it is trivial to use in a parallelized scenario, etc. The second might be more efficient when called in a loop (because we don't allocate memory every time), but then it is not that obvious that the parameter is the output, and someone needs to clean the old data from the vector...
Which would be more customary in real life (i.e. what is the best practise)? In C#/Java I would argue for the first one; what is the case in C++?
Also, is it possible to effectively return a vector by value using C++11? What would the pitfalls be?
do correctness first, then optimize if necessary
with both move semantics and Return Value Optimization conspiring to make an ordinary function result non-copying, you would probably have to work at it to make it sufficiently inefficient to be worth optimization work
so, just return the collection as a function result, then MEASURE if you feel that it's too slow
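One possible way to do the measuring, as a sketch: both variants from the question plus a tiny timing helper built on std::chrono. The int element type and the helper name time_us are assumptions for illustration; real measurements should use the actual Stuff type, realistic sizes, and optimized builds.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Variant 1: return the collection by value.
std::vector<int> generate(std::size_t n) {
    std::vector<int> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) result.push_back(static_cast<int>(i));
    return result;
}

// Variant 2: fill a caller-supplied vector.
void generate_into(std::size_t n, std::vector<int>& output) {
    output.clear();  // drops old elements but keeps the capacity for reuse
    for (std::size_t i = 0; i < n; ++i) output.push_back(static_cast<int>(i));
}

// Runs `work` once and returns the elapsed wall-clock time in microseconds.
template <class F>
long long time_us(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

For example, time_us([] { generate(1000000); }) can be compared against a loop that calls generate_into on one reused vector.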
You should return by value.
is it possible to effectively return a vector by value using C++11?
Yes, C++11 supports move semantics. You return a value, but the compiler knows it's a temporary, and therefore can invoke a special constructor (move constructor) that is especially designed to simply "steal the guts" of the returned object. After all, you won't use that temporary object anymore, so why copy it when you can just move its content?
Apart from this, it may be worth mentioning that most C++ compilers, even pre-C++11, implement (Named) Return Value Optimization, which would elide the copy anyway, incurring no overhead. Thus, you may want to actually measure the performance penalty you (possibly) get before optimizing.
I think you should pass by reference, or return a shared pointer, only when you need reference semantics. This does not seem to be your case.
There is an alternative approach. If you can make your functions template, make them take an output iterator (whose type is a template argument) as argument:
template<class OutputIterator>
void your_algorithm(OutputIterator out) {
while (/* condition */) {
*out++ = /* calculation */;  // write one result, then advance the iterator
}
}
This has the advantage that the caller can decide in what kind of collection he wants to store the result (the output iterator could for instance write directly to a file, or store the result in a std::vector, or filter it, etc.).
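A concrete sketch of the output-iterator approach. The algorithm (writing the first few squares) and all function names are made up for illustration; the point is that the same template feeds a std::vector or a stream without changes.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Writes 1*1, 2*2, ..., count*count to whatever `out` points at.
template <class OutputIterator>
void write_squares(int count, OutputIterator out) {
    for (int i = 1; i <= count; ++i) {
        *out++ = i * i;
    }
}

// Caller chooses a vector as the destination.
std::vector<int> squares_as_vector(int count) {
    std::vector<int> result;
    write_squares(count, std::back_inserter(result));
    return result;
}

// Caller chooses a text stream as the destination.
std::string squares_as_text(int count) {
    std::ostringstream os;
    write_squares(count, std::ostream_iterator<int>(os, " "));
    return os.str();
}
```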
The best practise will probably be surprising to you. I would recommend returning by value in both C++03 and C++11.
In C++03, if you create a std::vector local to generate and return it, the copy may be elided by the compiler (and almost certainly will be). See C++03 §12.8/15:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object with the same cv-unqualified type as the function return type, the copy operation can be omitted by constructing the automatic object directly into the function's return value
In C++11, if you create a std::vector local to generate and return it, the copy will first be considered as a move (which will already be very fast) and then that may be elided (and almost certainly will be). See C++11 §12.8/31:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
And §12.8/32:
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
So return by value!
Believe it or not, I'm going to suggest that instead of either of those approaches just take the obvious implementation and return by value! Compilers are very often able to optimize away the notional copy that would be induced, removing it completely. By writing the code in the most obvious manner you make it very clear to future maintainers what the intent is.
But let's say you try return by value and your program runs too slow and let's further suppose that your profiler shows that the return by value is in fact your bottleneck. In this case I would allocate the container on the heap and return it as an auto_ptr in C++03 or a unique_ptr in C++11, to clearly indicate that ownership is being transferred and that generate isn't keeping a copy of that pointer for its own purposes later.
Finally, the series at http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/ provides a great perspective on almost the exact same question.
I need to get this straight. With the code below here:
vector<unsigned long long int> getAllNumbersInString(string line){
vector<unsigned long long int> v;
string word;
stringstream stream(line);
unsigned long long int num;
while(getline(stream, word, ',')){
num = strtoull(word.c_str(), NULL, 10); // atol returns long, which may be too narrow for unsigned long long
v.push_back(num);
}
return v;
}
This sample code simply turns an input string into a series of unsigned long long int values stored in a vector.
In the case above, if another function calls this function and the vector holds about 100,000 elements, does returning it mean that a new vector is created with elements identical to those of the vector in the function, and that the original vector in the function is then destroyed on return? Is my understanding correct so far?
Normally I write code so that every function returns a pointer when it comes to containers. In terms of program design, and given my understanding above, should we always return a pointer when it comes to containers?
The std::vector will most likely (if your compiler optimizations are turned on) be constructed directly in the function's return value. This is known as copy/move elision and is an optimization the compiler is allowed to make:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
This quote is taken from the C++11 standard but is similar for C++03. It is important to note that copy/move elision does not have to occur at all - it is entirely up to the compiler. Most modern compilers will handle your example with no problems at all.
If elision does not occur, C++11 will still provide you with a further benefit over C++03:
In C++03, without copy elision, returning a std::vector like this would have involved, as you say, copying all of the elements over to the returned object and then destroying the local std::vector.
In C++11, the std::vector will be moved out of the function. Moving allows the returned std::vector to steal the contents of the std::vector that is about to be destroyed. This is much more efficient than copying the contents over.
You may have expected that the object would just be copied because it is an lvalue, but there is a special rule that makes copies like this first be considered as moves:
When the criteria for elision of a copy operation are met [...] and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
As for whether you should return a pointer to your container: the answer is almost certainly no. You shouldn't be passing around pointers unless it's completely necessary, and when it is necessary, you're much better off using smart pointers. As we've seen, in your case it's not necessary at all because there's little to no overhead in returning it by value.
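Putting that together, a by-value version of the function might look like the sketch below. The name parse_numbers, the const reference parameter, and the switch to std::stoull (C++11) are choices made for this example rather than part of the question.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a comma-separated line and returns the parsed numbers by value.
std::vector<unsigned long long> parse_numbers(const std::string& line) {
    std::vector<unsigned long long> numbers;
    std::istringstream stream(line);
    std::string word;
    while (std::getline(stream, word, ',')) {
        // std::stoull throws std::invalid_argument if `word` is not a number.
        numbers.push_back(std::stoull(word));
    }
    return numbers;  // elided or moved; the elements are not copied
}
```

The caller simply writes std::vector<unsigned long long> v = parse_numbers(line); with no pointer and no manual cleanup.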
It is safe, and I would say preferable, to return by value with any reasonable compiler. The C++ standard allows copy elision, in this case named return value optimization (NRVO), which means this extra copy you are worried about doesn't take place.
Note that this is a case of an optimization that is allowed to modify the observable behaviour of a program.
Note 2. As has been mentioned in other answers, C++11 introduces move semantics, which means that, in cases where RVO doesn't apply, you may still have a very cheap operation where the contents of the object being returned are transferred to the caller. In the case of std::vector, this is extremely cheap. But bear in mind that not all types can be moved.
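One common way a type ends up not being movable is sketched here: a hypothetical class LegacyBuffer with a user-declared copy constructor. Declaring it suppresses the implicit move constructor, so an attempted move quietly becomes a full copy.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical pre-C++11 style class: the user-declared copy constructor
// means the compiler does not generate a move constructor.
struct LegacyBuffer {
    static int copies;
    std::vector<int> data;

    LegacyBuffer() = default;
    LegacyBuffer(const LegacyBuffer& other) : data(other.data) { ++copies; }
};
int LegacyBuffer::copies = 0;

// Returns how many deep copies a std::move of a LegacyBuffer causes.
int copies_made_by_move() {
    LegacyBuffer::copies = 0;
    LegacyBuffer a;
    a.data.assign(1000, 1);
    LegacyBuffer b = std::move(a);  // no move constructor: the copy constructor runs
    assert(b.data.size() == 1000);
    assert(a.data.size() == 1000);  // the source was copied from, not emptied
    return LegacyBuffer::copies;
}
```

Adding LegacyBuffer(LegacyBuffer&&) = default; (and the matching assignment operator) would restore cheap moves for this type.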
Your understanding is correct.
But compilers can apply copy elision through RVO and NRVO and remove the extra copy being generated.
Should we always return a pointer when it comes to container?
If you can, of course you should avoid return by value, especially for non-POD types.
That depends on whether or not you need reference semantics.
In general, if you do not need reference semantics, I would say you should not use a pointer, because in C++11 container classes support move semantics, so returning a collection by value is fast. Also, the compiler can elide the call to the move constructor (this is called Named Return Value Optimization or NRVO), so that no overhead at all will be introduced.
However, if you do need to create separate, consistent views of your collection (i.e. aliases), so that for instance insertions into the returned vector will be "seen" in several places that share the ownership of that vector, then you should consider returning a smart pointer.
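The difference between the two semantics, as a short sketch with made-up function names: holders of a shared_ptr all see the same vector, while a vector returned by value is independent of any copy made from it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Reference semantics: every holder of the pointer sees the same vector.
std::shared_ptr<std::vector<int>> make_shared_numbers() {
    return std::make_shared<std::vector<int>>();
}

// Value semantics: each caller gets its own independent vector.
std::vector<int> make_numbers() {
    return std::vector<int>();
}

void example_usage() {
    std::shared_ptr<std::vector<int>> first = make_shared_numbers();
    std::shared_ptr<std::vector<int>> second = first;  // shares ownership
    first->push_back(1);
    assert(second->size() == 1);                       // the insertion is visible here too

    std::vector<int> original = make_numbers();
    std::vector<int> copy = original;                  // independent copy
    original.push_back(1);
    assert(copy.empty());                              // the copy is unaffected
}
```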
I have the following compare method.
The method compares two lists and returns the diff result.
I want to minimize the number of times the result list is copied (into a temporary and into the assigned variable).
One way to do this is to add an extra reference argument for the result, but I like utility functions to be closed (they don't change their arguments), so I prefer to avoid this.
One copy can be avoided by using a const& in the assignment:
const list<uint32>& diff = getDiffNewElements(...);
Is there a way to also avoid the local copy into the temporary?
The diff method:
list<uint32> getDiffNewElements(const list<Row>& src ,const list<Row>& dst) {
list<uint32> result;
// ... do some comparison
return result;
}
There are potentially two copies in the code that you present. One inside the function, from the variable result to the returned object. NRVO will take care of that if the complexity of the function allows for it. If as it seems, you have a single return statement, then the compiler will elide that copy (assuming that you are not disabling it with compiler flags and that you have some optimization level enabled).
The second potential copy is in the caller, from the returned value to the final storage. That copy is almost always elided by the compiler, and it is even simpler to do here than in the NRVO case: the calling convention (all calling conventions I know of) determines that the caller reserves the space for the returned object, and that it passes a hidden pointer to that location to the function. The function in turn uses that pointer as the destination of the first copy.
A function T f() is transformed into void f( uninitialized<T>* __ret ) (there is no such thing as uninitialized<>, but bear with me) and the implementation of the function uses that pointer as the destination when copying in the return statement. On the caller site, if you have T a = f(); the compiler will transform it into T a/*no construction here*/; f(&a);
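That transformation can be imitated by hand. In this sketch the caller reserves raw storage and the callee constructs the result there with placement new; the type Greeting and both function names are hypothetical, and a real compiler does all of this implicitly.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical result type.
struct Greeting {
    std::string text;
};

// Hand-written equivalent of `Greeting f();`: the caller passes the address
// of uninitialized storage and the callee constructs the result in it.
Greeting* make_greeting_into(void* return_slot) {
    return new (return_slot) Greeting{"hello from the callee"};
}

std::string example_usage() {
    // The caller reserves suitably aligned space for the result up front.
    alignas(Greeting) unsigned char slot[sizeof(Greeting)];
    Greeting* g = make_greeting_into(slot);  // no copy: built directly in `slot`
    std::string text = g->text;
    g->~Greeting();                          // manual storage requires manual destruction
    return text;
}
```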
There is an interesting bit of code in the question that seems to indicate that you have been misled in the past: const list<uint32>& diff = getDiffNewElements(...). Using a reference rather than storing the value directly (as in list<uint32> diff = getDiffNewElements(...)) has no impact at all on the number of copies that are made. The getDiffNewElements still needs to copy (or elide the copy) to the returned object and that object lives in the scope of the caller. By taking a const reference you are telling the compiler that you don't want to directly name that object, but rather keep it as an unnamed object in the scope and that you only want to use a reference to it. Semantically it can be transformed into:
T __unnamed = getDiffNewElements(...); // this copy can be elided
T const& diff = __unnamed;
The compiler is free to optimize the reference away, and probably will, using the identifier diff as an alias to __unnamed without requiring extra space. So in general it will not be worse than the alternative, but the code is slightly more complex and there is no advantage at all.
A long time ago, when I had time, I started a blog and wrote a couple of articles on value semantics, (N)RVO and copy elision. You might want to take a look.
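A sketch that makes the equivalence visible, using a hypothetical type Probe that tracks how many instances are alive. Whether the result is bound to a const reference or stored by value, exactly one object lives in the caller's scope, and it is destroyed when that scope ends.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances currently exist.
struct Probe {
    static int alive;
    Probe() { ++alive; }
    Probe(const Probe&) { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

Probe make_probe() {
    Probe p;
    return p;
}

int alive_while_bound_to_reference() {
    const Probe& ref = make_probe();  // the temporary lives as long as `ref`
    (void)ref;
    return Probe::alive;              // one object in this scope
}

int alive_while_held_by_value() {
    Probe value = make_probe();
    (void)value;
    return Probe::alive;              // also one object in this scope
}
```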
NRVO
Copy elision
You are looking in the wrong direction. Your getDiffNewElements method does not create unnecessary copies, because modern compilers do RVO (as was pointed out in the comments; and if your compiler doesn't do RVO, there is nothing you can do about it, your const& won't help). The way you can optimize this function is to return a vector instead of a list, since you can call reserve on a vector and avoid memory allocations every time you push_back a new element. A list does not preallocate memory with the default allocator and it has no reserve method to do that.
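A small sketch of the reserve effect, with a made-up function name: after reserve(n), pushing back up to n elements never reallocates, so the address of the element storage stays the same throughout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if n push_backs after reserve(n) all land in the same buffer.
// Assumes n >= 1.
bool reserve_prevents_reallocation(std::size_t n) {
    std::vector<unsigned> result;
    result.reserve(n);                        // one allocation up front
    result.push_back(0);
    const unsigned* first = result.data();    // address of the storage
    for (std::size_t i = 1; i < n; ++i) {
        result.push_back(static_cast<unsigned>(i));
    }
    return result.data() == first && result.size() == n;
}
```

A std::list, by contrast, allocates one node per element, so building a 1000-element list means on the order of 1000 separate allocations with the default allocator.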