std::string my_func(){
    return std::string("...");
}
std::string could be replaced by std::vector or any other type. In general, how do I know whether the return value was moved rather than copied?
According to cppreference,
In C++11, expressions that ...
do not have identity and can be moved from are called prvalue ("pure rvalue") expressions.
A temporary object like the one you are returning (std::string("...")) has no identity and can be moved from because the compiler can detect that its state will not be used before its lifetime ends. Thus, the compiler will prefer to move the object over copying it.
However, the temporary object will likely not get moved or copied at all because of return value optimization (RVO), a form of copy elision optimization that constructs the object in the storage that it would otherwise be moved to. So if you were to write std::string str = my_func();, the code would likely be optimized to construct str from "..." in-place, rather than constructing a string inside my_func() and then moving it out. RVO applies for any kind of object that is copyable and/or movable.
A move constructor takes an rvalue reference. An rvalue reference is preferred over an lvalue reference to const when binding to prvalues. So, as long as your type has a valid move constructor, the compiler will choose the move constructor and only fall back to the copy constructor if there isn't a move constructor.
That said, in this case most likely nothing will be copied or moved: RVO will kick in and construct the object directly at the call site. The only way to know for certain whether it did is to inspect the assembly.
Starting in C++17 with guaranteed copy elision you are actually guaranteed that no copy or move will happen here.
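An alternative to reading assembly is to instrument a type and count its constructor calls. The following is only a sketch: Tracer, make_tracer and the two helper functions are hypothetical names made up for the illustration, not anything from the question.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Same shape as my_func(): the return expression is a prvalue.
Tracer make_tracer() {
    return Tracer();
}

// Number of copy/move constructor calls needed to initialize a variable
// from make_tracer().
int constructions_for_prvalue_return() {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer t = make_tracer();
    (void)t;
    return Tracer::copies + Tracer::moves;
}

// For contrast: an explicit std::move from a named object really does
// call the move constructor.
int moves_for_explicit_move() {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer source;
    Tracer target = std::move(source);
    (void)target;
    assert(Tracer::copies == 0);
    return Tracer::moves;
}
```

Under C++17 the first helper has to report zero constructions. Before C++17 that result depends on the compiler performing the optional elision, which mainstream compilers do by default.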
C++17 guarantees copy elision for:
T funcReturningT() {
return T(...);
}
T t=funcReturningT();
Now if I wrap the return into a static_cast to the same type, like so:
T t=static_cast<T>(funcReturningT());
does the standard still guarantee copy elision or not?
RVO is dead. Long live RVO.
In modern C++, there are two cases corresponding to what used to be called RVO:
There's NRVO, in which the compiler is allowed to elide a local variable if it can see that that variable will just be returned eventually. If the compiler declines to elide the local variable, then it is obligated to treat it as an rvalue when it is returned, assuming certain conditions are met (thus converting a copy into a move). This is not what your question is asking about.
There's also guaranteed copy elision --- but to even use that name for it is to think about C++17 using a pre-C++17 mindset. What really changed in C++17 is that prvalues are no longer "objects without identity" but rather "recipes for creating objects". This is the reason why copy-elision-like behaviour is guaranteed.
So let's look at your statement:
T t=static_cast<T>(funcReturningT());
The expression funcReturningT() is a prvalue. In C++14, this would mean that immediately upon evaluating it, the implementation would have to instantiate a temporary T object (lacking identity), but non-guaranteed copy elision would allow the compiler (at its discretion) to elide such object. In C++17, it is ready to create a T object but doesn't do so immediately.
Then the static_cast is evaluated, and the result of it is also a prvalue of the same type. Because of this, no move constructor is required and no temporary object needs to be created. The result of the cast is just the original "recipe".
And finally, when t is initialized from the result of the static_cast, the move constructor is once again not required. The "recipe" that static_cast used is simply executed with t as the object that it creates.
The beauty of it is that there is nothing to elide, which is why "guaranteed copy elision" is a misnomer.
(Sometimes, temporary objects do need to be created: in particular, if any constructor or conversion function needs to be called at any point, then the prvalue has to be "materialized" in order for that call to actually have an object to work with. In your example, this is not necessary.)
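The claim can be checked empirically with a counting type. This is a sketch under assumptions: Counted and constructions_through_static_cast are hypothetical names, and Counted stands in for the T of the question.

```cpp
#include <cassert>

// Hypothetical stand-in for T that counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    int value;
    explicit Counted(int v) : value(v) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted(Counted&& other) noexcept : value(other.value) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted funcReturningT() {
    return Counted(42);
}

// Initializes t through a static_cast to the same type and reports how many
// copy/move constructor calls were made along the way.
int constructions_through_static_cast() {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted t = static_cast<Counted>(funcReturningT());
    assert(t.value == 42);
    return Counted::copies + Counted::moves;
}
```

With a C++17 compiler the function should return 0, because the prvalue is passed through the cast and only turned into an object when t is initialized.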
Why does this program call the copy constructor instead of the move constructor?
#include <iostream>
using namespace std;

class Qwe {
public:
    int x=0;
    Qwe(int x) : x(x){}
    Qwe(const Qwe& q) {
        cout<<"copy ctor\n";
    }
    Qwe(Qwe&& q) {
        cout<<"move ctor\n";
    }
};

Qwe foo(int x) {
    Qwe q=42;
    Qwe e=32;
    cout<<"return!!!\n";
    return q.x > x ? q : e;
}

int main(void)
{
    Qwe r = foo(50);
}
The result is:
return!!!
copy ctor
return q.x > x ? q : e; is used to disable NRVO. When I wrap it in std::move, it is indeed moved. But in "A Tour of C++" the author said that the move c'tor must be called when it is available.
What have I done wrong?
You did not write your function in a way that allows copy/move elision to occur. The requirements for a copy to be replaced by a move are as follows:
[class.copy.elision]/3:
In the following copy-initialization contexts, a move operation might
be used instead of a copy operation:
If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or
parameter-declaration-clause of the innermost enclosing function or lambda-expression
overload resolution to select the constructor for the copy is first
performed as if the object were designated by an rvalue. If the first
overload resolution fails or was not performed, or if the type of the
first parameter of the selected constructor is not an rvalue reference
to the object's type (possibly cv-qualified), overload resolution is
performed again, considering the object as an lvalue.
The above is from C++17, but the C++11 wording is pretty much the same. The conditional operator is not an id-expression that names an object in the scope of the function.
An id-expression would be something like q or e in your particular case. You need to name an object in that scope. A conditional expression doesn't qualify as naming an object, so it must perform a copy.
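To make the id-expression rule concrete, here is a sketch that compares the conditional return with a version where every return statement names a local object. The counters, foo_conditional, foo_branches, Counts and measure are hypothetical additions for the illustration. Qwe is also changed to copy x in its constructors so that the result can be checked.

```cpp
#include <cassert>

// Qwe from the question, with counters instead of stream output.
struct Qwe {
    static int copies;
    static int moves;
    int x;
    Qwe(int x) : x(x) {}
    Qwe(const Qwe& q) : x(q.x) { ++copies; }
    Qwe(Qwe&& q) noexcept : x(q.x) { ++moves; }
};
int Qwe::copies = 0;
int Qwe::moves = 0;

// The return expression is a conditional expression, not an id-expression,
// so the implicit-move rule does not apply and the copy constructor is used.
Qwe foo_conditional(int x) {
    Qwe q(42);
    Qwe e(32);
    return q.x > x ? q : e;
}

// Each return statement names a local object, so overload resolution first
// treats the operand as an rvalue and selects the move constructor.
Qwe foo_branches(int x) {
    Qwe q(42);
    Qwe e(32);
    if (q.x > x) return q;
    return e;
}

struct Counts { int copies; int moves; };

// Calls f(50) and reports the constructor calls it caused.
template <typename F>
Counts measure(F f) {
    Qwe::copies = 0;
    Qwe::moves = 0;
    Qwe r = f(50);
    assert(r.x == 32);
    return Counts{Qwe::copies, Qwe::moves};
}
```

measure(foo_conditional) reports one copy and no move. measure(foo_branches) reports no copy and at most one move, depending on whether the compiler manages to elide it.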
If you want to exercise your English comprehension abilities on a difficult wall of text, then this is how it's written in C++11. Takes some effort to see IMO, but it's the same as the clarified version above:
When certain criteria are met, an implementation is allowed to omit
the copy/move construction of a class object, even if the copy/move
constructor and/or destructor for the object have side effects. [...]
This elision of copy/move operations, called copy elision, is
permitted in the following circumstances (which may be combined to
eliminate multiple copies):
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other
than a function or catch-clause parameter) with the same
cv-unqualified type as the function return type, the copy/move
operation can be omitted by constructing the automatic object directly
into the function's return value
When the criteria for elision of a copy operation are met or would be
met save for the fact that the source object is a function parameter,
and the object to be copied is designated by an lvalue, overload
resolution to select the constructor for the copy is first performed
as if the object were designated by an rvalue. If overload resolution
fails, or if the type of the first parameter of the selected
constructor is not an rvalue reference to the object's type (possibly
cv-qualified), overload resolution is performed again, considering the
object as an lvalue.
StoryTeller didn't answer the question: Why is the move c'tor not called? (And not: Why is there no copy elision?)
Here's my go: The move c'tor will be called if and only if:
Copy elision (RVO) is not performed. Your use of the ternary operator is indeed a way to prevent copy elision. Let me point out though that return (0, q); is a simpler way to do this if you just want to return q while suppressing copy elision. This uses the (in-)famous comma operator. Possibly return ((q)); might work, too, but I am not enough of a language lawyer to tell for sure.
The argument to return is an rvalue. This could be a temporary (more precisely, a prvalue), but these are also eligible for copy elision. Therefore, the argument to return must be an xvalue, such as std::move(q) if you want to ensure the move c'tor is called.
See also: C++ value categories
Some more technicalities of your example:
q and e are objects of type Qwe.
q.x > x ? q : e is an lvalue expression of type Qwe. This is because the expressions q and e are lvalues of type Qwe. The ternary operator just selects either of them.
std::move(q.x > x ? q : e) is an xvalue expression of type Qwe. The std::move simply turns (casts) the lvalue into an xvalue. As an aside, q.x > x ? std::move(q) : std::move(e) would also work.
The copy c'tor gets called in return q.x > x ? q : e; because it can be called with an lvalue of type Qwe (constness is optional), while, on the other hand, the move c'tor cannot be called with an lvalue and is therefore eliminated from the candidate set.
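The value categories listed above can be verified at compile time, since decltype((expr)) yields T& for an lvalue, T&& for an xvalue and T for a prvalue. A small sketch follows; the simplified Qwe and the function name check_value_categories are assumptions made for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Simplified stand-in for the Qwe of the question.
struct Qwe { int x; };

bool check_value_categories(int x) {
    Qwe q{42};
    Qwe e{32};

    static_assert(std::is_same<decltype((q.x > x ? q : e)), Qwe&>::value,
                  "conditional of two lvalues is an lvalue");
    static_assert(std::is_same<decltype((std::move(q))), Qwe&&>::value,
                  "std::move yields an xvalue");
    static_assert(std::is_same<decltype((q.x > x ? std::move(q) : std::move(e))), Qwe&&>::value,
                  "conditional of two xvalues is an xvalue");
    static_assert(std::is_same<decltype((Qwe{0})), Qwe>::value,
                  "a temporary is a prvalue");

    // Because the conditional expression is an lvalue, its address can be
    // taken, and it is the address of q or e itself rather than of a copy.
    const Qwe* selected = &(q.x > x ? q : e);
    assert(selected != nullptr);
    return selected == (q.x > x ? &q : &e);
}
```

The function returns true for any x. The interesting part is that the static_asserts compile at all.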
UPDATE: Addressing the comments by going into more depth… this is a really confusing aspect of C++!
Conceptually, in C++98, returning an object by value meant returning a copy of the object, so the copy c'tor would be called. However, the standard's authors considered that a compiler should be free to perform an optimization such that this potentially expensive copy (e.g. of a container) could be elided under suitable circumstances.
This copy elision means that, instead of creating the object in one place and then copying it to a memory address controlled by the caller, the callee creates the object directly in the memory controlled by the caller. Therefore, only the "normal" constructor, e.g. a default c'tor, is called.
Therefore, they added a passage such that the compiler is required to check that the copy c'tor (whether generated or user-defined) exists and is accessible (there was no notion yet of deleted functions for that matter), and must ensure that the object is initialized as-if it had been first created in a different place and then copied (cf. as-if rule), but the compiler was not required to ensure that any side effects of the copy c'tor would be observable, such as the stream output in your example.
The reason why the c'tor was still required to be there was that they wanted to avoid a scenario where a compiler was able to accept code that another would have to reject, simply because the former implemented an optional optimization that the latter did not.
In C++11, move semantics were added, and the committee very much wanted to use this in such a manner that a lot of existing return-by-value functions e.g. involving strings or containers would become more efficient. This was done in such a way that conditions were given under which the compiler was actually required to perform a move instead of a copy. However, the idea of copy elision remained important, so basically there were now four different categories:
1. The compiler is required to check for a usable (see above) move c'tor, but is allowed to elide it.
2. The compiler is required to check for a usable move c'tor, and has to call it.
3. The compiler is required to check for a usable copy c'tor, but is allowed to elide it.
4. The compiler is required to check for a usable copy c'tor, and has to call it.
… which in turn led to four possible outcomes:
Compiler checks for move c'tor, but then elides it. (relates to 1. above)
Compiler checks for move c'tor and actually emits a call to it. (relates to 1. or 2. above)
Compiler checks for copy c'tor, but then elides it. (relates to 3. above)
Compiler checks for copy c'tor and actually emits a call to it. (relates to 3. or 4. above)
And the long optimization story doesn't end here, because, in C++17, the compiler is required to elide certain c'tor calls. In these cases, the compiler is not even allowed to demand that a copy or move c'tor is available.
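One way to see that C++17 no longer demands a copy or move c'tor is to return a type that has neither. This is a sketch: Pinned, make_pinned and pinned_value are hypothetical names, and the preprocessor guard is only there so that the file still compiles in pre-C++17 mode, where the by-value return would be rejected.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

#if __cplusplus >= 201703L
// C++17 and later: returning a prvalue does not require any copy or move
// constructor, so this is well-formed even though both are deleted.
Pinned make_pinned(int v) {
    return Pinned(v);
}

int pinned_value() {
    Pinned p = make_pinned(7);
    return p.value;
}
#else
// Before C++17 the function above is ill-formed, because a usable
// constructor must exist even if the call is elided. Construct in place.
int pinned_value() {
    Pinned p(7);
    return p.value;
}
#endif
```

Compiling the C++17 branch with an older -std setting produces an error about the deleted constructor. That is the "check, but possibly elide" behaviour from the list above.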
Note that a compiler has always been free to elide even such c'tor calls that do not meet the standard requirements, under the protection of the as-if rule, for instance by function inlining and the following optimization steps. Anyway, a function call, conceptually, does not have to be backed by the actual machine instruction for the execution of a subroutine. The compiler is just not allowed to remove observable, otherwise defined behavior.
By now you should have noticed that, at least prior to C++17, it is very well possible for the same well-formed program to behave differently, depending on the compiler used and even optimization settings, if the copy or move constructor has observable side effects. Also, a compiler that implements copy/move elision may do so for a subset of the conditions under which the standard allows it to happen. This makes your question almost impossible to answer in detail. Why is the copy/move c'tor called here, but not there? Well, it may be because of the requirements of the C++ standard, but it also may be the preference of your compiler. Maybe the compiler authors had time and leisure implementing the one optimization but not the other. Maybe they found it too difficult in the latter case. Maybe they just had more important stuff to do. Who knows?
What matters 99% of the time for me as a developer is to write my code in such a way that the compiler can apply the best optimizations. Sticking to common cases and standard practice is one thing. Another is knowing NRVO and RVO of temporaries, writing the code such that the standard allows (or, in C++17, requires) copy/move elision, and ensuring that a move c'tor is available where it is beneficial (in case elision does not occur). Don't rely on side effects such as writing a log message or incrementing a global counter. These are not what a copy or move c'tor should commonly do anyway, except possibly for debugging or scholarly interest.
I recently attended a C++11 seminar and the following tidbit of advice was given.
when you have && and you are unsure, you will almost always use std::move
Could anyone explain to me why you should use std::move as opposed to some alternatives, and some cases when you should not use std::move?
First, there's probably a misconception in the question I'll address:
Whenever you see T&& t in code (and T is an actual type, not a template type), keep in mind that the value category of t is an lvalue (reference), no longer an rvalue (temporary). It's very confusing. The T&& merely means that t is constructed from an object that was an rvalue (see note 1 below), but t itself is an lvalue, not an rvalue. If it has a name (in this case, t) then it's an lvalue and won't automatically move, but if it has no name (the result of 3+4) then it is an rvalue and will automatically move into its result if it can. The type (in this case T&&) has almost nothing to do with the value category of the variable (in this case, an lvalue).
That being said, if you have T&& t written in your code, that means you have a reference to a variable that was a temporary, and it is ok to destroy if you want to. If you need to access the variable multiple times, you do not want to std::move from it, or else it would lose its value. But the last time you access t it is safe to std::move its value to another T if you wish. (And 95% of the time, that's what you want to do). All of this also applies to auto&& variables.
1. if T is a template type, T&& is a forwarding reference instead, in which case you use std::forward<T>(t) instead of std::move(t) the last time. See this question.
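Overload resolution makes the point visible. In the following sketch, category and the three seen_* functions are hypothetical helpers that report which overload a named T&& parameter selects.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Reports which kind of argument overload resolution saw.
const char* category(const std::string&) { return "lvalue"; }
const char* category(std::string&&) { return "rvalue"; }

// s has type std::string&&, but the expression s is an lvalue.
std::string seen_without_move(std::string&& s) {
    return category(s);
}

// std::move casts s back to an rvalue. Nothing is moved here, because
// category() only binds a reference to its argument.
std::string seen_with_move(std::string&& s) {
    return category(std::move(s));
}

// With a deduced T, T&& is a forwarding reference: std::forward restores
// whatever value category the caller passed in.
template <typename T>
std::string seen_with_forward(T&& s) {
    return category(std::forward<T>(s));
}

void demo() {
    std::string named = "named";
    assert(seen_without_move(std::string("tmp")) == "lvalue");
    assert(seen_with_move(std::string("tmp")) == "rvalue");
    assert(seen_with_forward(named) == "lvalue");
    assert(seen_with_forward(std::string("tmp")) == "rvalue");
}
```

Note that seen_without_move can only be called with an rvalue, yet inside the function the parameter still selects the lvalue overload.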
I found this article to be pretty enlightening on the subject of rvalue references in general. He mentions std::move towards the end. This is probably the most relevant quote:
We need to use std::move, from <utility> -- std::move is a way of
saying, "ok, honest to God I know I have an lvalue, but I want it to
be an rvalue." std::move does not, in and of itself, move anything; it
just turns an lvalue into an rvalue, so that you can invoke the move
constructor.
Say you have a move constructor that looks like this:
MyClass::MyClass(MyClass&& other): myMember(other.myMember)
{
// Whatever else.
}
When you use the statement other.myMember, the value that's returned is an lvalue. Thus the code uses the copy constructor to initialize this->myMember. But since this is a move constructor, we know that other is a temporary object, and therefore so are its members. So we really want to use the more-efficient move constructor to initialize this->myMember. Using std::move makes sure that the compiler treats other.myMember like an rvalue reference and calls the move constructor, as you'd want it to:
MyClass::MyClass(MyClass&& other): myMember(std::move(other.myMember))
{
// Whatever else.
}
Just don't use std::move on objects you need to keep around - move constructors are pretty much guaranteed to muck up any objects passed into them. That's why they're only used with temporaries.
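The difference between the two constructors can be observed by checking whether the member's heap buffer is reused. This is a sketch under assumptions: CopyingHolder, MovingHolder and buffer_is_reused are hypothetical names, and a std::vector<int> member stands in for myMember.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Move constructor written like the first version above: other.items is an
// lvalue, so the vector is copied.
struct CopyingHolder {
    std::vector<int> items;
    explicit CopyingHolder(std::vector<int> v) : items(std::move(v)) {}
    CopyingHolder(CopyingHolder&& other) : items(other.items) {}
};

// Move constructor written like the second version: std::move turns the
// member into an rvalue, so the vector's own move constructor runs.
struct MovingHolder {
    std::vector<int> items;
    explicit MovingHolder(std::vector<int> v) : items(std::move(v)) {}
    MovingHolder(MovingHolder&& other) noexcept : items(std::move(other.items)) {}
};

// True if move-constructing a Holder hands the original element buffer to
// the new object instead of allocating a fresh one.
template <typename Holder>
bool buffer_is_reused() {
    Holder source(std::vector<int>(1000, 7));
    const int* original = source.items.data();
    Holder target(std::move(source));
    assert(target.items.size() == 1000);
    return target.items.data() == original;
}
```

buffer_is_reused<CopyingHolder>() is false because a second buffer is allocated while the source still owns the first. buffer_is_reused<MovingHolder>() is true.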
Hope that helps!
When you have an object of type T&&, an rvalue reference, it means that this object is safe to move from, as no one else will depend on its internal state later.
As moving should never be more expensive than copying, you will almost always want to move it. And to move it, you have to use the std::move function.
When should you avoid std::move, even if it would be safe? I wouldn't use it in trivial examples, for instance:
int x = 0;
int y = std::move(x);
Besides that, I see no downsides. If it does not complicate the code, moving should be done whenever possible IMHO.
Another case where you don't want to use std::move is return values. The language guarantees that return values are (at least) moved, so you should not write
return std::move(x); // not recommended
(If you are lucky, return value optimization hits, which is even better than a move operation.)
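A counting type shows why return std::move(x); is discouraged. The sketch below uses hypothetical names (Tracer, plain_return, moved_return, moves_for). Some compilers warn about the second function, for example via -Wpessimizing-move, and that warning is exactly the point.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Names the local directly: eligible for NRVO, and moved otherwise.
Tracer plain_return() {
    Tracer t;
    return t;
}

// std::move(t) is not the name of a local object, so NRVO is ruled out and
// the move constructor has to run.
Tracer moved_return() {
    Tracer t;
    return std::move(t);
}

// Number of move constructor calls caused by initializing from make().
int moves_for(Tracer (*make)()) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = make();
    (void)result;
    assert(Tracer::copies == 0);
    return Tracer::moves;
}
```

moves_for(moved_return) is always 1. moves_for(plain_return) is 0 when the compiler applies NRVO and never more than 1.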
You can use move when you need to "transfer" the content of an object somewhere else, without doing a copy. It's also possible for an object to take the content of a temporary object without doing a copy, with std::move.
Read more on rvalue references and move constructors on Wikipedia.
I need to get this straight. With the code below here:
vector<unsigned long long int> getAllNumbersInString(string line){
    vector<unsigned long long int> v;
    string word;
    stringstream stream(line);
    unsigned long long int num;
    while(getline(stream, word, ',')){
        // strtoull (from <cstdlib>) covers the full unsigned long long range; atol only returns a long
        num = strtoull(word.c_str(), NULL, 10);
        v.push_back(num);
    }
    return v;
}
This sample code simply turns an input string into a series of unsigned long long int values stored in a vector.
In the case above, if another function calls this function and the vector holds about 100,000 elements, does returning it mean that a new vector is created with elements identical to the ones in the function, and that the original vector in the function is then destroyed upon returning? Is my understanding correct so far?
Normally, I write my code so that all functions return a pointer when it comes to containers. However, program design-wise, and given my understanding above, should we always return a pointer when it comes to containers?
The std::vector will most likely (if your compiler optimizations are turned on) be constructed directly in the function's return value. This is known as copy/move elision and is an optimization the compiler is allowed to make:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
This quote is taken from the C++11 standard but is similar for C++03. It is important to note that copy/move elision does not have to occur at all - it is entirely up to the compiler. Most modern compilers will handle your example with no problems at all.
If elision does not occur, C++11 will still provide you with a further benefit over C++03:
In C++03, without copy elision, returning a std::vector like this would have involved, as you say, copying all of the elements over to the returned object and then destroying the local std::vector.
In C++11, the std::vector will be moved out of the function. Moving allows the returned std::vector to steal the contents of the std::vector that is about to be destroyed. This is much more efficient than copying the contents over.
You may have expected that the object would just be copied because it is an lvalue, but there is a special rule that makes copies like this first be considered as moves:
When the criteria for elision of a copy operation are met [...] and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
As for whether you should return a pointer to your container: the answer is almost certainly no. You shouldn't be passing around pointers unless it's completely necessary, and when it is necessary, you're much better off using smart pointers. As we've seen, in your case it's not necessary at all because there's little to no overhead in passing it by value.
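To check that no element-wise copy takes place, one can compare the address of the element buffer inside the function with the one the caller receives. The following is a sketch, not the asker's exact code: parse_numbers, last_local_buffer and same_buffer_after_return are hypothetical names, and std::stoull is swapped in for the conversion (it throws on fields that are not numbers).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical hook for the demonstration: remembers where the local vector
// kept its elements just before the function returned.
const unsigned long long* last_local_buffer = nullptr;

// Variant of getAllNumbersInString that returns the vector by value.
std::vector<unsigned long long> parse_numbers(const std::string& line) {
    std::vector<unsigned long long> v;
    std::stringstream stream(line);
    std::string word;
    while (std::getline(stream, word, ',')) {
        v.push_back(std::stoull(word));  // throws if word is not a number
    }
    last_local_buffer = v.data();
    return v;  // elided (NRVO) or moved, never copied element by element
}

// True if the caller's vector uses the very buffer the local vector filled.
bool same_buffer_after_return() {
    std::vector<unsigned long long> numbers = parse_numbers("10,20,30");
    assert(numbers.size() == 3);
    return numbers.data() == last_local_buffer;
}
```

The buffer address is the same whether the compiler elides the return or moves the vector, so the check holds for 100,000 elements just as it does for three.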
It is safe, and I would say preferable, to return by value with any reasonable compiler. The C++ standard allows copy elision, in this case named return value optimization (NRVO), which means this extra copy you are worried about doesn't take place.
Note that this is a case of an optimization that is allowed to modify the observable behaviour of a program.
Note 2. As has been mentioned in other answers, C++11 introduces move semantics, which means that, in cases where RVO doesn't apply, you may still have a very cheap operation where the contents of the object being returned are transferred to the caller. In the case of std::vector, this is extremely cheap. But bear in mind that not all types can be moved.
Your understanding is correct.
But compilers can apply copy elision through RVO and NRVO and remove the extra copy being generated.
Should we always return a pointer when it comes to container?
If you can, of course you should avoid return by value, especially for non-POD types.
That depends on whether or not you need reference semantics.
In general, if you do not need reference semantics, I would say you should not use a pointer, because in C++11 container classes support move semantics, so returning a collection by value is fast. Also, the compiler can elide the call to the move constructor (this is called Named Return Value Optimization or NRVO), so that no overhead at all will be introduced.
However, if you do need to create separate, consistent views of your collection (i.e. aliases), so that for instance insertions into the returned vector will be "seen" in several places that share the ownership of that vector, then you should consider returning a smart pointer.
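The difference between value semantics and shared ownership can be sketched as follows. make_shared_numbers, shared_size_after_alias_insert and shared_size_after_copy_insert are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returns a handle to a vector that several owners can share.
std::shared_ptr<std::vector<int>> make_shared_numbers() {
    auto numbers = std::make_shared<std::vector<int>>();
    numbers->push_back(1);
    numbers->push_back(2);
    numbers->push_back(3);
    return numbers;
}

// An insertion through one handle is seen through every other handle.
std::size_t shared_size_after_alias_insert() {
    std::shared_ptr<std::vector<int>> first = make_shared_numbers();
    std::shared_ptr<std::vector<int>> second = first;  // shares ownership
    second->push_back(4);
    assert(first.get() == second.get());
    return first->size();
}

// A by-value copy is independent: changing it leaves the original alone.
std::size_t shared_size_after_copy_insert() {
    std::shared_ptr<std::vector<int>> original = make_shared_numbers();
    std::vector<int> copy = *original;
    copy.push_back(4);
    return original->size();
}
```

The first helper returns 4 and the second returns 3. If that aliasing is not something you need, returning the vector by value remains the simpler choice.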
I have already asked a similar question a while ago, but I'm still unclear on some details.
1. Under what circumstances is the postblit constructor called?
2. What are the semantics of moving an object? Will it be postblitted and/or destructed?
3. What happens if I return a local variable by value? Will it implicitly be moved?
4. How do I cast an expression to an rvalue? For example, what would a generic swap look like?
A postblit constructor is called whenever the struct is copied - e.g. when passing a struct to a function.
A move is a bitwise copy. The postblit constructor is never called. The destructor is never called. The bits are simply copied. The original was "moved" and so nothing needs to be created or destroyed.
It will be moved. This is the prime example of a move.
There are a number of different situations that a swap function would have to worry about if you want to make it as efficient as possible. I would advise simply using the swap function in std.algorithm. The classic swap would result in copying and would thus call the postblit constructor and the destructor. Moves are generally done by the compiler, not the programmer. However, looking at the official implementation of swap, it looks like it plays some tricks to get move semantics out of the deal where it can. Regardless, moves are generally done by the compiler. They're an optimization that it will do where it knows that it can (RVO being the classic case where it can).
According to TDPL (p. 251), there are only 2 cases where D guarantees that a move will take place:
All anonymous rvalues are moved, not copied. A call to this(this)
is never inserted when the source is an anonymous rvalue (i.e., a
temporary as featured in the function hun above).
All named temporaries that are stack-allocated inside a function and
then returned elide a call to this(this).
There is no guarantee that other potential elisions are observed.
So, the compiler may use moves elsewhere, but there's no guarantee that it will.
As far as I understand:
1) When a struct is copied, as opposed to moved or constructed.
2) The point of move semantics is that neither of the two needs to happen. The new location of the struct is initialized with a bit-wise copy of the struct, and the old location goes out of scope and becomes inaccessible. Thus, the struct has "moved" from A to B.
3) That is the typical move situation:
S init(bool someFlag)
{
    S s;
    s.foo = someFlag? bar : baz;
    return s; // `s` can now be safely moved from here...
}
// call-site:
S s = init(flag);
//^ ... to here.