I come from Java, where filling containers did not involve thinking. My problem now that i work with c++ is, that filling a container in a function with data which is declared in function scope can lead to errors, where the data doesn't exist any more when I want to access it.
I could not find tutorials addressing the problem, so I went the Java way and only made containers getting pointers declared with "new". But now I am forced to return a
std::list<Vertex<float> >
from a function and thought this might be a good point to learn how I would fill and return such a thing. Would this
{
std::list<Vertex<float> > myList;
Vertex<float> v(0.0, 0.1, 0.2);
myList.push_back(v);
myList.push_back(Vertex<float>(1,0, 1.1, 1.2));
return myList;
}
actually be fine as a sample function body? And if yes, why would v still exist outside of the scope? Does each insertion in a container also imply a copying?
This would work "fine" since every operation there creates a copy.
myList.push_back(v); creates a copy of v so that visiblity of v is now irrelevant.
return myList; returns a copy of the list to the calling function so the visiblity of myList is now irrelevant. The calling function should make a copy of this list to keep it in scope, else it will be destroyed at the end of execution of the line that calls this function.
The reason that fine is quoted is that copies are typically expensive. In your case they are quite small so it might be irrelevant, and in many cases they might be optimised away, but it is still something to keep in mind.
Old C++ way of optimizing is to pass a list by reference and use that to construct your list, instead of returning by value.
void MakeMeAList(std::list<Vertex<float> >& aList){
....
}
std::list<Vertex<float> > aList;
MakeMeAList(aList);
As #billz suggests, Return Value Optimization should optimize away the copies even if this is not done.
New C++ (c++11) -
The use of emplace_back to construct the list, would be more efficient that copying, as long as the input variables are not going to be used any more. ( thanks #Troy)
My C++11 is weak, I am almost sure even returning by value is OK since Move semantics would optimize it away, but I am only 95% sure.
To add to Karthik's reply, if you're using a relatively new compiler that implements r-value references (part of C++0x standard), your approach will work perfectly without the negative impact of the copy operation.
Check out http://en.wikipedia.org/wiki/C%2B%2B11#Rvalue_references_and_move_constructors
for a brief intro to r-values.
Note that before r-value references were adopted, many compilers eliminated the 'expensive copy operation' involved in returning collections through what is called Return Value Optimization. Again Wiki has more details:
http://en.wikipedia.org/wiki/Return_value_optimization
Visual Studio 2005 (and newer) implements RVO, and I believe GCC also has a similar feature. So effectively, your code, which is more readable than using arguments to return values, is the correct approach and ought to work very well if you're using the latest compilers.
Related
In his excellent book "C++ Concurrency in Action" (2nd edition including C++17) Anthony Williams discusses the implementation of a thread-safe stack.
In the course of this, he proposes an adapter implementation of std::stack, which, among other things, would combine the calls of top() and pop() into one. The separation into 2 separate functions, however, was done for a reason in std::stack, namely to avoid losing data in the case that a potential copy made when returning the popped element to the caller throws an exception inside the copy constructor. When returning, the element will have already been popped off and is consequentially lost.
Instead of having a function T pop(), he proposes other variations of pop that would be able to remove the element off the stack and provide it to the caller in one operation, all of which come with their own problems, though.
The 1st alternative he proposes has the signature void pop(T&). The caller passes in a reference to a T and gets the popped off object that way. This way of doing it, however, comes with the problem that a T need be constructed prior to the call to pop, which might be an expensive operation, or it might not be possible to construct a T beforehand at all because necessary data might not be available yet at the time. Another problem the author mentions is that T might not be assignable, which would be required for this solution, though.
Now my question: Wouldn't all of the mentioned problems be solved if we passed a std::optional<T>& instead of a T&?
In that case, no instance of T would need to be constructed prior to the call to pop. Furthermore, assignability would not be required anymore either, since the object to be returned could be constructed into the std::optional<T> instance directly using its emplace function.
Am I missing something crucial here or am I right? If I am indeed right, I would be curious to know why this was not considered (for a good reason or just plainly an oversight?).
std::optional does solve all of the mentioned problems, and using it to control lifetime can be quite valuable, although it would appear a bit strange in
std::optional<T> o;
st.pop(o);
to have o always engaged.
That said, with a stupid scope-guard trick it's possible in C++17 to safely return T even without requiring no-throw-movability:
T pop() {
struct pop_guard {
C &c;
int u=std::uncaught_exceptions();
~pop_guard() {if(std::uncaught_exceptions()==u) c.pop_back();}
} pg{c};
return std::move(c.back());
}
(We could of course test for a throwing move and just move (perhaps twice) in its absence.)
However, what wasn't mentioned is that separate top and pop allows a T that isn't even movable, so long as the underlying container supports it. std::stack<std::mutex> works (with emplace, not push!) because std::deque doesn't require movability.
When returning a container, I always had to determine if I should use return value or use output parameter. If the performance matter, I chose the second option, otherwise I always chose the first option because it is more intuitive.
Frankly, I personally have been strongly objected about output parameters, possibly because of my mathematical background, but it was kind of OK to use them when I had no other options.
However, the things have been completely changed when it comes to generic programming. There are situations I encountered where a function may not know whether or not the object it returns is a huge container or just a simple value.
Consistently using output parameters may be the solution that I want to know if I can avoid. It is just so awkward if I have to do
int a;
f(a, other_arguments);
compared to
auto a = f(other_arguments);
Furthermore, sometimes the return type of f() has no default constructor. If output parameters are used, there is no graceful way to deal with that case.
I wonder if it is possible to return a "modifier object", a functor taking output parameters to modify them appropriately. (Perhaps this is a kind of lazy evaluation?) Well, returning such objects is not a problem, but the problem is I can't insert an appropriate overload of the assignment operator (or constructor) that takes such an object and triggers it to do its job, when the return type belongs to a library that I can't touch, e.g., std::vector. Of course, conversion operators are not helpful as they have no access to existing resources prepared for the target object.
Some people might ask why not use assign(); define a "generator object" which has begin() & end(), and pass those iterators to std::vector::assign. It is not a solution. For the first reason, the "generator object" does not have the full access to the target object and this may limit what could be done. For the second and more important reason, the call site of my function f() may also be a template which does not know the exact return type of f(), so it cannot determine which of the assignment operator or the assign() member function should be used.
I think that the "modifier object" approach to modify containers should have been already discussed in the past as it is not at all a new idea.
To sum up,
Is it possible to use return values to simulate what would happen when output parameters are used instead, in particular, when outputs are containers?
If not, was adding those supports to the standard discussed before? If it was, what were the issues? Is it a terrible idea?
Edit
The code example I've put above is misleading. The function f() may be used to initialize a local variable, but it may be also used to modify existing variables defined elsewhere. For the first case, as Rakete1111 mentioned, there is no problem with return by value as copy elision comes into play. But for the second case, there may be unnecessary resource releasing/acquiring.
I don't think your "modifier object" was ever proposed (AFAIK). And it will never go into the standard. Why? Because we already have a way to get rid of the expensive copy, and that is return by value (plus compiler optimizations).
Before C++17, compilers were allowed to do basically almost the same thing. This optimization is known as (N)RVO, which optimizes away the copy when returning a (named) temporary from a function.
auto a = f(other_arguments);
Will not return a temporary, then copy it into a. The compiler will optimize the copy away entirely, it is not needed. In theory, you cannot assume that your compiler supports this, but the three major ones do (clang, gcc, MSVC) so no need to worry - I don't know about ICC and the others, so I can't say.
So, as there is no copy (or move) involved, there is no performance penalty of using return values instead of output parameters (most probably, if for some reason your compiler doesn't support it, you'll get a move most of the time). You should always use return parameters if possible, and only use output parameters or some other technique if you measure that you get significantly better performance otherwise.
(Edited, based on comments)
You are right you should avoid output parameters if possible, because the code using them is harder to read and to debug.
Since C++11 we have a feature called move constructors (see reference). You can use std::move on all primitive and STL containers types. It is efficient (question: Efficiency difference between copy and move constructor), because you don't actually copy the values of the variables. Only the pointers are swapped. For your own complex types, you can write your own move constructor.
On the other hand, the only thing you can do is to return a reference to a temporary, which is of undefined behavior, for example:
#include <iostream>
int &&add(int initial, int howMany) {
return std::move(initial + howMany);
}
int main() {
std::cout << add(5, 2) << std::endl;
return 0;
}
Its output could be:
7
You could avoid the problem of temporary variables using global or static ones, but it is bad either. #Rakete is right, there is no good way of achieving it.
In this talk (sorry about the sound) Chandler Carruth suggests not passing by reference, even const reference, in the vast majority of cases due to the way in which it limits the back-end to perform optimisation.
He claims that in most cases the copy is negligible - which I am happy to believe, most data structures/classes etc. have a very small part allocated on the stack - especially when compared with the back-end having to assume pointer aliasing and all the nasty things that could be done to a reference type.
Let's say that we have large object on the stack - say ~4kB and a function that does something to an instance of this object (assume free-standing function).
Classically I would write:
void DoSomething(ExpensiveType* inOut);
ExpensiveType data;
...
DoSomething(&data);
He's suggesting:
ExpensiveType DoSomething(ExpensiveType in);
ExpensiveType data;
...
data = DoSomething(data);
According to what I got from the talk, the second would tend to optimise better. Is there a limit to how big I make something like this though, or is the back-end copy-elision stuff just going to prefer the values in almost all cases?
EDIT: To clarify I'm interested in the whole system, since I feel that this would be a major change to the way I write code, I've had use of refs over values drilled into me for anything larger than integral types for a long time now.
EDIT2: I tested it too, results and code here. No competition really, as we've been taught for a long time, the pointer is a far faster method of doing things. What intrigues me now is why it was suggested during that talk that we move to pass by value, but as the numbers don't support it, it's not something I'm going to do.
I have now watched parts of Chandler's talk. I think the general discussion along the lines "should I now always pass by value" does not do his talk justice. Edit: And actually his talk has been discussed before, here value semantics vs output params with large data structures and in a blog from Eric Niebler, http://ericniebler.com/2013/10/13/out-parameters-vs-move-semantics/.
Back to Chandler. In the key note he specifically (around the 4x-5x minute mark mentioned elsewhere) mentions the following points:
If the optimizer cannot see the code of the called function you have much bigger problems than passing refs or values. It pretty much prevents optimization. (There is a follow-up question at that point about link time optimization which may be discussed later, I don't know.)
He recommends the "new classical" way of returning values using move semantics. Instead of the old school way of passing a reference to an existing object as an in-out parameter the value should be constructed locally and moved out. The big advantage is that the optimizer can be sure that no part of the object is alisased since only the function has access to it.
He mentions threads, storing a variable's value in globals, and observable behaviour like output as examples for unknowns which prevent optimization when only refs/pointers are passed. I think an abstract description could be "the local code can not assume that local value changes are undetected elsewhere, and it cannot assume that a value which is not changed locally has not changed at all". With local copies these assumptions could be made.
Obviously, when passing (and possibly, if objects cannot be moved, when returning) by value, there is a trade-off between the copy cost and the optimization benefits. Size and other things making copying costly will tip the balance towards reference strategies, while lots of optimizable work on the object in the function tips it towards value passing. (His examples involved pointers to ints, not to 4k sized objects.)
Based on the parts I watched I do not think Chandler promoted passing by value as a one-fits-all strategy. I think he dissed passing by reference mostly in the context of passing an out parameter instead of returning a new object. His example was not about a function which modified an existing object.
On a general note:
A program should express the programmer's intent. If you need a copy, by all means do copy! If you want to modify an existing object, by all means use references or pointers. Only if side effects or run time behavior become unbearable; really only then try do do something smart.
One should also be aware that compiler optimizations are sometimes surprising. Other platforms, compilers, compiling options, class libraries or even just small changes in your own code may all prevent the compiler from coming to the rescue. The run-time cost of the change would in many cases come totally unexpected.
Perhaps you took that part of the talk out of context, or something. For large objects, typically it depends on whether the function needs a copy of the object or not. For example:
ExpensiveType DoSomething(ExpensiveType in)
{
cout << in.member;
}
you wasted a lot of resource copying the object unnecessarily, when you could have passed by const reference instead.
But if the function is:
ExpensiveType DoSomething(ExpensiveType in)
{
in.member = 5;
do_something_else(in);
}
and we did not want to modify the calling function's object, then this code is likely to be more efficient than:
ExpensiveType DoSomething(ExpensiveType const &inr)
{
ExpensiveType in = inr;
in.member = 5;
do_something_else(in);
}
The difference comes when invoked with an rvalue (e.g. DoSomething( ExpensiveType(6) ); The latter creates a temporary , makes a copy, then destroys both; whereas the former will create a temporary and use that to move-construct in. (I think this can even undergo copy elision).
NB. Don't use pointers as a hack to implement pass-by-reference. C++ has native pass by reference.
I'm a newcomer to C++ and I ran into a problem recently returning a reference to a local variable. I solved it by changing the return value from std::string& to an std::string. However, to my understanding this can be very inefficient. Consider the following code:
string hello()
{
string result = "hello";
return result;
}
int main()
{
string greeting = hello();
}
To my understanding, what happens is:
hello() is called.
Local variable result is assigned a value of "hello".
The value of result is copied into the variable greeting.
This probably doesn't matter that much for std::string, but it can definitely get expensive if you have, for example, a hash table with hundreds of entries.
How do you avoid copy-constructing a returned temporary, and instead return a copy of the pointer to the object (essentially, a copy of the local variable)?
Sidenote: I've heard that the compiler will sometimes perform return-value optimization to avoid calling the copy constructor, but I think it's best not to rely on compiler optimizations to make your code run efficiently.)
The description in your question is pretty much correct. But it is important to understand that this is behavior of the abstract C++ machine. In fact, the canonical description of abstract return behavior is even less optimal
result is copied into a nameless intermediate temporary object of type std::string. That temporary persists after the function's return.
That nameless intermediate temporary object is then copied to greeting after function returns.
Most compilers have always been smart enough to eliminate that intermediate temporary in full accordance with the classic copy elision rules. But even without that intermediate temporary the behavior has always been seen as grossly suboptimal. Which is why a lot of freedom was given to compilers in order to provide them with optimization opportunities in return-by-value contexts. Originally it was Return Value Optimization (RVO). Later Named Return Value Optimization was added to it (NRVO). And finally, in C++11, move semantics became an additional way to optimize the return behavior in such cases.
Note that under NRVO in your example the initialization of result with "hello" actually places that "hello" directly into greeting from the very beginning.
So in modern C++ the best advice is: leave it as is and don't avoid it. Return it by value. (And prefer to use immediate initialization at the point of declaration whenever you can, instead of opting for default initialization followed by assignment.)
Firstly, the compiler's RVO/NRVO capabilities can (and will) eliminate the copying. In any self-respecting compiler RVO/NRVO is not something obscure or secondary. It is something compiler writers do actively strive to implement and implement properly.
Secondly, there's always move semantics as a fallback solution if RVO/NRVO somehow fails or is not applicable. Moving is naturally applicable in return-by-value contexts and it is much less expensive than full-blown copying for non-trivial objects. And std::string is a movable type.
I disagree with the sentence "I think it's best not to rely on compiler optimizations to make your code run efficiently." That's basically the compiler's whole job. Your job is to write clear, correct, and maintainable source code. For every performance issue I've ever had to fix, I've had to fix a hundred or more issues caused by a developer trying to be clever instead of doing something simple, correct, and maintainable.
Let's take a look at some of the things you could do to try to "help" the compiler and see how they affect the maintainability of the source code.
You could return the data via reference
For example:
void hello(std::string& outString)
Returning data using a reference makes the code at the call-site hard to read. It's nearly impossible to tell what function calls mutate state as a side effect and which don't. Even if you're really careful with const qualifying the references it's going to be hard to read at the call site. Consider the following example:
void hello(std::string& outString); //<-This one could modify outString
void out(const std::string& toWrite); //<-This one definitely doesn't.
. . .
std::string myString;
hello(myString); //<-This one maybe mutates myString - hard to tell.
out(myString); //<-This one certainly doesn't, but it looks identical to the one above
Even the declaration of hello isn't clear. Does it modify outString, or was the author just sloppy and forgot to const qualify the reference? Code that is written in a functional style is easier to read and understand and harder to accidentally break.
Avoid returning the data via reference
You could return a pointer to the object instead of returning the object.
Returning a pointer to the object makes it hard to be sure your code is even correct. Unless you use a unique_ptr you have to trust that anybody using your method is thorough and makes sure to delete the pointer when they're done with it, but that isn't very RAII. std::string is already a type of RAII wrapper for a char* that abstracts away the data lifetime issues associated with returning a pointer. Returning a pointer to a std::string just re-introduces the problems that std::string was designed to solve. Relying on a human being to be diligent and carefully read the documentation for your function and know when to delete the pointer and when not to delete the pointer is unlikely to have a positive outcome.
Avoid returning a pointer to the object instead of returning the object
That brings us to move constructors.
A move constructor will just transfer ownership of the pointed-to data from 'result' to its final destination. Afterwards, accessing the 'result' object is invalid but that doesn't matter - your method ended and the 'result' object went out of scope. No copy, just a transfer of ownership of the pointer with clear semantics.
Normally the compiler will call the move constructor for you. If you're really paranoid (or have specific knowledge that the compiler isn't going to help you) you can use std::move.
Use move constructors if at all possible
Finally modern compilers are amazing. With a modern C++ compiler, 99% of the time the compiler is going to do some sort of optimization to eliminate the copy. The other 1% of the time it's probably not going to matter for performance. In specific circumstances the compiler can re-write a method like std::string GetString(); to void GetString(std::string& outVar); automatically. The code is still easy to read, but in the final assembly you get all of the real or imagined speed benefits of returning by reference. Don't sacrifice readability and maintainability for performance unless you have specific knowledge that the solution doesn't meet your business requirements.
There are plenty of ways to achieve that:
1) Return some data by the reference
void SomeFunc(std::string& sResult)
{
sResult = "Hello world!";
}
2) Return pointer to the object
CSomeHugeClass* SomeFunc()
{
CSomeHugeClass* pPtr = new CSomeHugeClass();
//...
return(pPtr);
}
3) C++ 11 could utilize a move constructor in such cases. See this this and this for the additional info.
C++11 vectors have the new function emplace_back. Unlike push_back, which relies on compiler optimizations to avoid copies, emplace_back uses perfect forwarding to send the arguments directly to the constructor to create an object in-place. It seems to me that emplace_back does everything push_back can do, but some of the time it will do it better (but never worse).
What reason do I have to use push_back?
I have thought about this question quite a bit over the past four years. I have come to the conclusion that most explanations about push_back vs. emplace_back miss the full picture.
Last year, I gave a presentation at C++Now on Type Deduction in C++14. I start talking about push_back vs. emplace_back at 13:49, but there is useful information that provides some supporting evidence prior to that.
The real primary difference has to do with implicit vs. explicit constructors. Consider the case where we have a single argument that we want to pass to push_back or emplace_back.
std::vector<T> v;
v.push_back(x);
v.emplace_back(x);
After your optimizing compiler gets its hands on this, there is no difference between these two statements in terms of generated code. The traditional wisdom is that push_back will construct a temporary object, which will then get moved into v whereas emplace_back will forward the argument along and construct it directly in place with no copies or moves. This may be true based on the code as written in standard libraries, but it makes the mistaken assumption that the optimizing compiler's job is to generate the code you wrote. The optimizing compiler's job is actually to generate the code you would have written if you were an expert on platform-specific optimizations and did not care about maintainability, just performance.
The actual difference between these two statements is that the more powerful emplace_back will call any type of constructor out there, whereas the more cautious push_back will call only constructors that are implicit. Implicit constructors are supposed to be safe. If you can implicitly construct a U from a T, you are saying that U can hold all of the information in T with no loss. It is safe in pretty much any situation to pass a T and no one will mind if you make it a U instead. A good example of an implicit constructor is the conversion from std::uint32_t to std::uint64_t. A bad example of an implicit conversion is double to std::uint8_t.
We want to be cautious in our programming. We do not want to use powerful features because the more powerful the feature, the easier it is to accidentally do something incorrect or unexpected. If you intend to call explicit constructors, then you need the power of emplace_back. If you want to call only implicit constructors, stick with the safety of push_back.
An example
std::vector<std::unique_ptr<T>> v;
T a;
v.emplace_back(std::addressof(a)); // compiles
v.push_back(std::addressof(a)); // fails to compile
std::unique_ptr<T> has an explicit constructor from T *. Because emplace_back can call explicit constructors, passing a non-owning pointer compiles just fine. However, when v goes out of scope, the destructor will attempt to call delete on that pointer, which was not allocated by new because it is just a stack object. This leads to undefined behavior.
This is not just invented code. This was a real production bug I encountered. The code was std::vector<T *>, but it owned the contents. As part of the migration to C++11, I correctly changed T * to std::unique_ptr<T> to indicate that the vector owned its memory. However, I was basing these changes off my understanding in 2012, during which I thought "emplace_back does everything push_back can do and more, so why would I ever use push_back?", so I also changed the push_back to emplace_back.
Had I instead left the code as using the safer push_back, I would have instantly caught this long-standing bug and it would have been viewed as a success of upgrading to C++11. Instead, I masked the bug and didn't find it until months later.
push_back always allows the use of uniform initialization, which I'm very fond of. For instance:
struct aggregate {
int foo;
int bar;
};
std::vector<aggregate> v;
v.push_back({ 42, 121 });
On the other hand, v.emplace_back({ 42, 121 }); will not work.
Backwards compatibility with pre-C++11 compilers.
Some library implementations of emplace_back do not behave as specified in the C++ standard including the version that ship with Visual Studio 2012, 2013 and 2015.
In order to accommodate known compiler bugs, prefer usingstd::vector::push_back() if the parameters reference iterators or other objects which will be invalid after the call.
std::vector<int> v;
v.emplace_back(123);
v.emplace_back(v[0]); // Produces incorrect results in some compilers
On one compiler, v contains the values 123 and 21 instead of the expected 123 and 123. This is due to the fact that the 2nd call to emplace_back results in a resize at which point v[0] becomes invalid.
A working implementation of the above code would use push_back() instead of emplace_back() as follows:
std::vector<int> v;
v.emplace_back(123);
v.push_back(v[0]);
Note: The use of a vector of ints is for demonstration purposes. I discovered this issue with a much more complex class which included dynamically allocated member variables and the call to emplace_back() resulted in a hard crash.
Use push_back only for primitive/built-in types or raw pointers. Otherwise use emplace_back.
Consider what happens in Visual Studio 2019 with c++-17 compiler. We have emplace_back in a function with proper arguments set up. Then someone changes parameters of the constuctor called by emplace_back. There is no warning whatsover in VS, the code also compiles fine, then it crashes in runtime. I removed all emplace_back from the codebase after this.