I'm confused as to what is going on in the following code snippet. Is move really necessary here? What would be the most optimal + safe way of returning the temporary set?
set<string> getWords()
{
set<string> words;
for (auto iter = wordIndex.begin(); iter != wordIndex.end(); ++iter)
{
words.insert(iter->first);
}
return move(words);
}
My calling code simply does set<string> words = foo.getWords()
First off, the set is not temporary, but local.
Second, the correct way to return it is via return words;.
Not only is this the only way you allow for return-value optimization, but moreover, the local variable will also bind to the move constructor of the returned object in the (unusual) case where the copy is not elided altogether. So it's a true triple-win scenario.
There's no need to use move here. Simply return "words". It will participate in the so-called "return value optimization".
Section 12.8 of the C++11 standard requires that, when a local variable is returned, overload resolution first treats it as an rvalue, so the move constructor is used if it exists. In essence, the compiler will take care of applying std::move for you.
No, the explicit move is not necessarily the most optimal way to move the set. Since the set is being returned by value, the compiler may perform named return value optimization on the set, meaning it may elide the copy and directly construct the set in-place at the call-site where the return value is to be stored. Explicit moving will inhibit this.
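A minimal sketch of the difference the answers describe, assuming a hypothetical Tracer type (not from the question) that counts its own copy and move constructions. The names plain_return, moved_return and moves_caused_by are likewise made up for illustration. The exact counts depend on the compiler and language version, so the code only relies on what holds in every case: neither form copies, and the explicit std::move always costs at least one move construction.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type (not from the question): counts how often
// it is copy- or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Plain return: eligible for NRVO; if the compiler does not elide,
// the local is moved implicitly.
Tracer plain_return() {
    Tracer t;
    return t;
}

// Explicit std::move: the operand is no longer just the name of a local,
// so NRVO cannot apply and the move constructor has to run.
Tracer moved_return() {
    Tracer t;
    return std::move(t);
}

// Calls f once and returns the number of move constructions it caused.
template <class F>
int moves_caused_by(F f) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = f();
    (void)result;
    assert(Tracer::copies == 0);  // neither form falls back to a copy
    return Tracer::moves;
}
```

Comparing moves_caused_by(plain_return) with moves_caused_by(moved_return) on your own compiler shows whether NRVO was applied; some compilers also warn about the second form (for example with a pessimizing-move diagnostic).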
Related
Say I have a std::vector that is declared in a loop's body and co_yielded:
some_generator<std::vector<int>> vector_sequence() {
while (condition()) {
std::vector<int> result = get_a_vector();
result.push_back(1);
co_yield std::move(result); // or just: co_yield result;
}
}
Quite obviously, result isn't going to be used again after being co_yielded (or I am horribly mistaken), so it would make sense to move it. I tried co_yielding a simple non-copyable type without std::move and it did not compile, so in generic code, one would have to use std::move. Does the compiler not recognize this (compiler bug?) or is it intended by the language that co_yield always copies an lvalue, so I have to std::move? I know that returning an lvalue that is a local variable issues a move or some other sort of copy elision, and this does not seem so much different from it.
I have read C++: should I explicitly use std::move() in a return statement to force a move?, which is related to this question, but does not answer it, and considered co_return vs. co_yield when the right hand side is a temporary, which as far as I understand, is not related to this question.
The implicit move rule ([class.copy.elision]/3) applies to return and co_return statements and to throw expressions. It doesn't apply to co_yield.
The reason is that, in the contexts enumerated in [class.copy.elision]/3, the execution of the return or co_return statement or throw expression ensures that the implicitly movable entity's lifetime ends. For example,
auto foo() {
std::string s = ...;
if (bar()) {
return s;
}
// return something else
}
Here, even though there is code after the return statement, it's guaranteed that if the return statement executes, then any code further down that can see s will not execute. This makes it safe to implicitly move s.
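The same point can be sketched with a move-only type. The function name make_value and its parameter are hypothetical; the only assumption is that std::unique_ptr cannot be copied, so the first return statement compiles only because of the implicit move.

```cpp
#include <memory>

// std::unique_ptr is move-only, so `return p;` compiles only because the
// local p is treated as an rvalue in the return statement (implicit move).
std::unique_ptr<int> make_value(bool wanted) {
    std::unique_ptr<int> p = std::make_unique<int>(42);
    if (wanted) {
        return p;  // no std::move needed; p cannot be observed afterwards
    }
    return nullptr;  // p is destroyed without having been returned
}
```

Writing co_yield p; in a coroutine would not get this treatment, which matches the compile error described in the question.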
In contrast, co_yield only suspends the coroutine and does not end it in the manner of co_return. Thus, in general, after co_yield result; is evaluated, the coroutine might later resume and use the very same result variable again. This means that in general, it's not safe to implicitly transform the copy into a move; therefore, the standard does not prescribe such behaviour. If you want a move, write std::move.
If the language were to allow implicit move in your example, it would have to have specific rules to ensure that, although the variable could be used again after co_yield, it is in fact not. In your case, it might indeed be that the loop will immediately end and thus the result variable's lifetime will end before its value can be observed again, but in general you would have to specify a set of conditions under which this can be guaranteed to be the case. Then, you could propose that an implicit move occur only under those conditions.
Motivation:
I'm trying to transfer a std::vector<std::unique_ptr<some_type>> to a different thread, via a lambda capture.
Since I need the vector to not be cleaned up when the function goes out of scope, I need to take it by value (and not by reference).
Since it's a vector of unique_ptrs, I need to move (and not copy) it into the capture.
I'm using a generalized lambda capture to move the vector while capturing.
Minimal program to illustrate the concept:
auto create_vector(){
std::vector<std::unique_ptr<int>> new_vector{};
new_vector.push_back(std::make_unique<int>(5));
return std::move(new_vector);
}
int main() {
const auto vec_const = create_vector();
[vec=std::move(vec_const)](){
std::cout << "lambda, vec size: " << vec.size() << std::endl;
}();
}
Issue:
If I'm using a const local vector, compilation fails due to attempting to copy the unique_ptrs.
However if I remove the const qualifier, the code compiles and runs well.
auto vec_const = create_vector();
Questions:
What's the reason for this? Does being const disable the "movability" of the vector? Why?
How would I ensure the constness of a vector in such a scenario?
Follow-up:
The comments and answers mention that a const type can't be moved from. Sounds reasonable, however the compiler errors fail to make it clear. In this case I would expect one of two things:
The std::move(vec_const) should throw an error regarding moving from const (casting it to rvalue) being impossible.
The vector move-constructor telling me that it refuses to accept const rvalues.
Why don't those happen? Why does the initialization instead seem to just try to copy the unique_ptrs inside the vector (which is what I'd expect from the vector's copy constructor)?
Moving is a disruptive operation: you conceptually change the content of the thing you move from.
So yes: a const object can (and should) not be moved from. That would change the original object, which makes its constness void.
In this case, vector has no vector(const vector&&), only vector(vector &&) (move constructor) and vector(const vector &) (copy constructor).
Overload resolution will only bind a call with const vector argument to the latter (lest const-correctness would be violated), so this will result in copying the contents.
I agree: error reporting sucks. It's hard to engineer an error report about vector when you hit a problem with unique_ptr. That's why the whole tail of required from ...., required from ... obliterates the view.
From your question, and your code, I can tell that you don't fully grasp the move semantics stuff:
you shouldn't move into a return value; a return value is already an rvalue, so there's no point.
std::move does not really move anything; it only casts its argument to an rvalue reference, so that the right receiver can be selected (using 'binding' rules). It is the receiving function that actually changes the contents of the original object.
When you are moving something from A to B, then the act of moving must necessarily mean that A gets modified, since after the move A may no longer have whatever was in A originally. This is the whole purpose of move semantics: to provide an optimal implementation, since the moved-from object is allowed to be modified: its contents get transferred in some fast and mysterious way into B, leaving A in some valid, but unspecified, state.
Consequently, by definition, A cannot be const.
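To make the follow-up question concrete, here is a small sketch with a hypothetical Tracer type (not from the question) that records which constructor was selected. It illustrates why neither expected error appears: std::move on a const object is well-formed and simply yields a const rvalue, and overload resolution then quietly picks the copy constructor. With std::vector<std::unique_ptr<int>>, that copy constructor cannot be instantiated, which is where the confusing error comes from.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper type (not from the question): records which
// constructor created the object.
struct Tracer {
    bool copied = false;
    bool moved = false;
    Tracer() = default;
    Tracer(const Tracer&) : copied(true) {}
    Tracer(Tracer&&) noexcept : moved(true) {}
};

// std::move on a const object compiles; the result is a const rvalue.
static_assert(
    std::is_same<decltype(std::move(std::declval<const Tracer&>())),
                 const Tracer&&>::value,
    "std::move preserves const");

// Tracer&& cannot bind to a const rvalue, so Tracer(const Tracer&) wins.
bool const_source_is_copied() {
    const Tracer source{};
    Tracer target = std::move(source);  // silently copies
    return target.copied && !target.moved;
}

// Without const, the same expression selects the move constructor.
bool non_const_source_is_moved() {
    Tracer source{};
    Tracer target = std::move(source);  // moves
    return target.moved && !target.copied;
}
```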
I have been wondering about that all day long and I can't find an answer to that specific case.
Main :
std::vector<MyObject*> myVector;
myVector.reserve(5);
myFunction(std::move(myVector));
myFunction :
void myFunction(std::vector<MyObject*> && givenVector){
std::vector<MyObject*> otherVector = givenVector;
std::cout << givenVector[0];
// do some stuff
}
My questions are :
In main, is myVector destroyed by myFunction() because it is treated as an rvalue, or does the compiler know that it is also an lvalue and therefore perform a copy before passing it to myFunction()? What happens if I try to use the vector after the call to myFunction()?
Inside myFunction(), is givenVector destroyed when it is assigned to otherVector? If so, what happens when I try to print it? If not, is it useful to take an rvalue in this function?
Looks like duplicate.
myVector is not destroyed by the function myFunction(). In general, it is unspecified what state an object is left in after its resources have been stolen.
givenVector is not destroyed when it is assigned to otherVector. In general, it is unspecified what state an object is left in after its resources have been stolen.
For this to compile, you have to apply std::move to your vector before you pass it to the function (at least if no further overloads exist):
myFunction(std::move(myVector));
Then, inside the function, by
std::vector<MyObject*> otherVector = std::move(givenVector);
the move constructor of std::vector is called which basically moves all the content out of the vector (note however again the std::move on the right-hand side -- otherwise you'll get a copy). By this, the vector is not "destroyed". Even after the move it is still alive, yet in an unspecified state.
That means that member functions which place no precondition on the state of the vector may still be called, such as the destructor, size(), and so on. A pop_back() or a dereference of a vector element, however, will likely fail.
See here for a more detailed explanation what you still can do with a moved-from object.
The code won't compile, since you try to bind an lvalue to an rvalue reference. You'll need to deliberately convert it to an rvalue:
myFunction(std::move(myVector));
Simply doing this won't "destroy" the object; what happens to it depends on what the function does. Generally, functions which take rvalue references do so in order to move from the argument, in which case they might leave it in some valid but "empty" state, but won't destroy it.
Your code moves the vector to the local otherVector, leaving it empty. Then you try to print the first element of an empty vector, giving undefined behaviour.
No copy is performed. What happens to myVector depends on what myFunction does with it. You should consider objects that have been moved from as either being the same or being empty. You can assign new values and keep using it or destroy it.
myVector is fine. It is an lvalue and otherVector makes a copy of it. You most likely wanted to write otherVector = std::move(myVector);, in which case myVector should be empty. If you have an old implementation of the STL (that does not know about move semantics) a copy is performed and myVector is not changed. If that makes sense is for you to decide. You moved a given vector to a new vector, which can be useful. Printing an empty vector is not so useful.
If a function gets an argument by rvalue-reference, that does not mean it will be destructively used, only that it can be.
Destructive use means that the passed object is thereafter in some unspecified but valid state, fit only for re-initializing, moving, copying or destruction.
In the function, the argument has a name and thus is an lvalue.
To mark the place(s) where you want to take advantage of the licence to ruthlessly plunder it, you have to convert it to an rvalue-reference on passing it on, for example with std::move or std::forward, the latter mostly for templates.
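A short sketch of the last two points, using std::vector<int> and two hypothetical functions (the names are made up for illustration). It assumes the default allocator, where a move construction takes over the source's buffer while a copy allocates a new one.

```cpp
#include <utility>
#include <vector>

// Inside the function the parameter has a name, so `given` is an lvalue
// and initializing from it selects the copy constructor.
bool copy_leaves_source_intact(std::vector<int>&& given) {
    const int* original_buffer = given.data();
    std::vector<int> other = given;  // copy: new buffer, source untouched
    return other.data() != original_buffer && given.size() == other.size();
}

// std::move turns the named parameter back into an rvalue, so the move
// constructor runs and takes over the existing buffer.
bool move_steals_buffer(std::vector<int>&& given) {
    const int* original_buffer = given.data();
    std::vector<int> other = std::move(given);  // move: buffer changes owner
    return other.data() == original_buffer;
}
```

In both cases the caller has to write std::move(myVector) to call the function at all; only the second function actually takes advantage of it.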
I have the following function:
void read_int(std::vector<int> &myVector)
This allows me to fill myVector through a reference. It is used like this:
std::vector<int> myVector;
read_int(myVector);
I want to refactor the code a bit (keeping the original function) so that in the end I have this:
auto myVector = read_int(); // auto is std::vector<int>
What would be the best intermediate function to achieve this?
It seems to me that the following straight-forward answer is suboptimal:
std::vector<int> read_int() {
std::vector<int> myVector_temp;
read_int(myVector_temp);
return myVector_temp;
}
The obvious answer is correct, and basically optimal.
void do_stuff(std::vector<int>& on_this); // (1)
std::vector<int> do_stuff_better() { // (2)
std::vector<int> myVector_temp; // (3)
do_stuff(myVector_temp); // (4)
return myVector_temp; // (5)
}
At (3) we create a named return value in automatic storage (on the stack).
At (5) we only ever return the named return value from the function, and we never return anything else but that named return value anywhere else in the function.
Because of (3) and (5), the compiler is allowed to (and most likely will) elide the existence of the myVector_temp object. It will directly construct the return value of the function, and call it myVector_temp. It still needs there to be an existing move or copy constructor, but it does not call it.
On the other end, when calling do_stuff_better, some compilers can also elide the assignment at call:
std::vector<int> bob = do_stuff_better(); // (6)
The compiler is allowed to effectively pass a "pointer to bob" and tell do_stuff_better() to construct its return value in bob's location, eliding this copy construction as well (well, it can arrange how the call occurs such that the location that do_stuff_better() is asked to construct its return value in is the same as the location of bob).
And in C++11, even if the requirements for both elisions are not met, or the compiler chooses not to use them, in both cases a move must be done instead of a copy.
At line (5) we are returning a locally declared automatic storage duration variable in a plain and simple return statement. This makes the return an implicit move if not elided.
At line (6), the function returns an unnamed object, which is an rvalue. When bob is constructed from it, it move-constructs.
Moving a std::vector consists of copying the value of ~3 pointers, and then zeroing the source, regardless of how big the vector is. No elements need be copied or moved.
Both of the above elisions, where we remove the named local variable within do_stuff_better(), and we remove the return value of do_stuff_better() and instead directly construct bob, are somewhat fragile. Learning the rules under which your compiler is allowed to do those elisions, and also the situations where your compiler actually does the elisions, is worthwhile.
As an example of how it is fragile, if you had a branch where you did a return std::vector<int>() in your do_stuff_better() after checking an error state, the in-function elision would probably be blocked.
Even if elision is blocked or your compiler doesn't implement it for a case, the fact that the container is move'd means that the run time costs are going to be minimal.
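The claim that no elements are copied or moved can be checked with a sketch like the following. Element, make_elements and element_transfers are hypothetical names, and the sketch assumes the default allocator. The element type counts every copy or move of an element, and the count stays at zero whether the return is elided or performed as a move.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type (not from the question): counts how often
// an element is copied or moved.
struct Element {
    static int transfers;
    Element() = default;
    Element(const Element&) { ++transfers; }
    Element(Element&&) noexcept { ++transfers; }
};
int Element::transfers = 0;

// Same shape as do_stuff_better(): fill a local vector, return it by name.
std::vector<Element> make_elements(std::size_t count) {
    std::vector<Element> result;
    result.reserve(count);      // no reallocation while filling
    for (std::size_t i = 0; i < count; ++i) {
        result.emplace_back();  // constructed in place
    }
    return result;
}

// Returns how many element copies/moves happened while the vector was
// returned by value and then explicitly moved once more (-1 on error).
int element_transfers(std::size_t count) {
    Element::transfers = 0;
    std::vector<Element> first = make_elements(count);
    std::vector<Element> second = std::move(first);
    return second.size() == count ? Element::transfers : -1;
}
```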
I think, you have to read more about move semantics (link to Google query, there are a lot of papers on this - just choose one).
In short, in C++11 and later all standard library containers are written in such a way that returning them from a function causes their contents to be moved from the returned value (an rvalue) to the variable you are assigning it to. In effect you'll only copy a few fields of the std::vector instead of its data. That's a lot faster than copying its contents.
A function needs to return two values to the caller. What is the best way to implement?
Option 1:
pair<U,V> myfunc()
{
...
return make_pair(getU(),getV());
}
pair<U,V> mypair = myfunc();
Option 1.1:
// Same defn
U u; V v;
tie(u,v) = myfunc();
Option 2:
void myfunc(U& u , V& v)
{
u = getU(); v= getV();
}
U u; V v;
myfunc(u,v);
I know that with Option 2 there are no copies/moves, but it looks ugly. Will any copies/moves occur in Option 1 or 1.1? Let's assume U and V are huge objects supporting both copy and move operations.
Q: Are any RVO/NRVO optimizations theoretically possible as per the standard? If yes, has gcc or any other compiler implemented them yet?
Will RVO happen when returning std::pair?
Yes it can.
Is it guaranteed to happen?
No it is not.
C++11 standard: Section 12.8/31:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
Copy elision is not a guaranteed feature. It is an optimization compilers are allowed to perform whenever they can. There is nothing special w.r.t std::pair. If a compiler is good enough to detect an optimization opportunity it will do so. So your question is compiler specific but yes same rule applies to std::pair as to any other class.
While RVO is not guaranteed, in C++11 the function as you have defined it I believe MUST move-return at the very least, so I would suggest leaving the clearer definition rather than warping it to accept output-variables (Unless you have a specific policy for using them).
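Also, your explicit use of make_pair creates a temporary pair that has to be moved into the return value whenever the compiler does not elide that move (elision is allowed here because the types match exactly, and it is guaranteed since C++17). Returning a brace-initialized expression constructs the return value directly: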
return { getU(), getV() };
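A rough way to compare the two forms on a given compiler, sketched with a hypothetical Big type standing in for U and V (all names here are made up for illustration). The exact numbers depend on language version and elision, so the code reports counts rather than assuming them; on a C++17 compiler both forms can be expected to give the same result.

```cpp
#include <utility>

// Hypothetical stand-in for the question's U and V: counts constructions.
struct Big {
    static int copies;
    static int moves;
    Big() = default;
    Big(const Big&) { ++copies; }
    Big(Big&&) noexcept { ++moves; }
};
int Big::copies = 0;
int Big::moves = 0;

Big get_big() { return Big(); }

// Option 1 from the question: build a pair with make_pair and return it.
std::pair<Big, Big> with_make_pair() {
    return std::make_pair(get_big(), get_big());
}

// Suggested alternative: initialize the return value from a braced list.
std::pair<Big, Big> with_braces() {
    return {get_big(), get_big()};
}

// Calls f once and returns the number of Big move constructions,
// or -1 if any copy construction happened.
template <class F>
int big_moves_caused_by(F f) {
    Big::copies = 0;
    Big::moves = 0;
    std::pair<Big, Big> result = f();
    (void)result;
    return Big::copies == 0 ? Big::moves : -1;
}
```

Either way, each element still has to be moved into the pair once, because getU() and getV() produce separate objects; neither form performs a copy.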
RVO or copy elision is dependent on the compiler, so if you want to have RVO and avoid the call to the copy constructor, the best option is to use pointers.
In our product we use pointers and Boost pointer containers to avoid the copy constructor, and this indeed gives a performance boost of around 10%.
Coming to your question,
In option 1, the copy constructors of U and V will not be called, because you are not returning U or V but a std::pair object; its copy constructor would be called instead, and most compilers will definitely use RVO here to avoid that.
Thanks
Niraj Rathi
If you need to do additional work on u and v after having created the pair, I find the following pattern pretty flexible in C++17:
pair<U,V> myfunc()
{
auto out = make_pair(getU(),getV());
auto& [u, v] = out;
// Work with u and v
return out;
}
This should be a pretty easy case for the compiler to use named return value optimization.
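A concrete instance of that pattern, as a sketch with made-up types and names (make_report, name, values), requiring C++17 for the structured binding. Because the bindings are references into out, the changes made through them are part of the pair that is returned.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical example: the pair is built first, then adjusted through
// reference bindings before being returned by name.
std::pair<std::string, std::vector<int>> make_report() {
    auto out = std::make_pair(std::string("primes"), std::vector<int>{2, 3});
    auto& [name, values] = out;  // name and values refer to out's members
    name += "!";
    values.push_back(5);
    return out;                  // a local returned by name
}
```

The caller can unpack the result the same way: auto [name, values] = make_report();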