My understanding of move semantics is that you should declare the function's parameters as by-value but make a call to std::move when you are interested in owning an object. This lets the compiler decide if the object should be copied or moved.
How do you allow the compiler to make a similar optimization when you are only interested in the object's contents, without making unnecessary copies of the container or its elements? So for example, if I have some function which takes an array of 3 objects and stores them in its own container, how do I let the compiler decide if it should move or copy the elements?
class foo {
std::vector<BigObject> objects;
public:
void set_objects(std::array<BigObject, 3> input) {
// Store the input's elements in the objects vector
}
};
I read about std::make_move_iterator but I am unsure whether this forces a move, how this affects the input, or how the input parameter should be defined.
For example if set_objects's body looked like so:
objects = std::vector<BigObject>(
std::make_move_iterator(input.begin()),
std::make_move_iterator(input.end())
);
I believe objects would now be a vector containing input's elements and input would now be empty. That's desired if input was an xvalue, but the signature declares input as by-value, so was the original container copied even if it was an xvalue? Nowhere do I say to move the actual container, just its elements.
I can declare input as by-reference to prevent the copying but then if std::make_move_iterator forces a move there could be side effects if input was an lvalue?
Having typed out my question, I believe the answer is to overload set_objects with an xvalue and a by-reference signature, the former moving input's elements and the latter copying input's elements, but I do not feel confident enough to say that is the correct solution.
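The overload idea from the last paragraph could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: BigObject is a hypothetical stand-in with a single std::string member, and the get() accessor is added purely so the result can be inspected.

```cpp
#include <array>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's BigObject.
struct BigObject {
    std::string payload;
};

class foo {
    std::vector<BigObject> objects;
public:
    // Lvalue callers: elements are copied, the caller's array is left untouched.
    void set_objects(const std::array<BigObject, 3>& input) {
        objects.assign(input.begin(), input.end());
    }

    // Rvalue callers: each element is moved into the vector.
    void set_objects(std::array<BigObject, 3>&& input) {
        objects.assign(std::make_move_iterator(input.begin()),
                       std::make_move_iterator(input.end()));
    }

    // Accessor added for illustration only.
    const std::vector<BigObject>& get() const { return objects; }
};
```

With these two overloads, set_objects(a) copies and set_objects(std::move(a)) moves the elements. Note that a std::array never becomes empty: after the rvalue overload runs it still holds three elements, each in a moved-from state.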
Let's say I have a buffer of chars in memory that holds a c_string, and I want to add an object of std::string with the content of that c_string to a standard container, such as std::list<std::string>, in an efficient way.
Example:
#include <list>
#include <string>
int main()
{
std::list<std::string> list;
char c_string_buffer[] {"For example"};
list.push_back(c_string_buffer); // 1
list.emplace_back(c_string_buffer); // 2
list.push_back(std::move(std::string(c_string_buffer))); // 3
}
I use ReSharper C++, and it complains about #1 (and suggests #2).
When I read push_back vs emplace_back, it says that when it is not an rvalue, the container will store a "copied" copy, not a moved copy. Meaning, #2 does the same as #1. Don’t blindly prefer emplace_back to push_back also talks about that.
Case 3: When I read What's wrong with moving?, it says that what std::move() does "is a cast to a rvalue reference, which may enable moving under some conditions".
--
Does #3 actually give any benefit? I assume that the constructor of std::string is called and creates a std::string object with the content of the c_string. I am not sure if later the container constructs another std::string and copies the 1st object to the 2nd object.
// 3 is fully equivalent to // 1. std::move does absolutely nothing here since std::string(c_string_buffer) is already an rvalue.
The problem with push_back is not related to move vs copy.
push_back is always a bad choice if you don't yet have an object of the element type because it always creates the new container element via copy or move construction from another object of the element type.
If you write list.push_back(c_string_buffer); // 1, then because push_back expects a std::string&& argument (or const std::string&), a temporary object of type std::string will be constructed from c_string_buffer and passed-by-reference to push_back. push_back then constructs the new element from this temporary.
With // 3 you are just making the temporary construction that would otherwise happen implicitly explicit.
The second step above can be avoided completely by emplace_back. Instead of taking a reference to an object of the target type, it takes arbitrary arguments by-reference and then constructs the new element directly from the arguments. No temporary std::string is needed.
Does 3 actually give any benefit?
No. #1 already does a move without std::move. #3 is just an unnecessarily explicit way to write #1.
#2 is potentially the most efficient of the three. That's why your static analyzer suggests it. But, as the article explains, the difference is not significant in this case, and emplace_back has a potential compile-time penalty. I believe you can avoid the potential compile-time cost by using list.emplace_back(+c_string_buffer);, which decays the array to a pointer, but that may be confusing to the reader.
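The difference between #1 and #2 can be made visible with a small counting type. This is a sketch under assumptions: Tracked, the g_ counters, and the two helper functions are hypothetical names introduced for illustration, standing in for std::string, whose constructor calls cannot be counted directly.

```cpp
#include <list>
#include <string>
#include <utility>

// Hypothetical counters recording which constructors run.
int g_converting = 0;
int g_copies = 0;
int g_moves = 0;

// Hypothetical element type, constructible from a C string like std::string.
struct Tracked {
    std::string text;
    Tracked(const char* s) : text(s) { ++g_converting; }
    Tracked(const Tracked& other) : text(other.text) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++g_moves; }
};

// Like #1: a temporary Tracked is built from s, then the element is
// move-constructed from that temporary.
void add_with_push_back(std::list<Tracked>& l, const char* s) {
    l.push_back(s);
}

// Like #2: the element is constructed in place from s, with no temporary.
void add_with_emplace_back(std::list<Tracked>& l, const char* s) {
    l.emplace_back(s);
}
```

Calling add_with_push_back once bumps both g_converting and g_moves, while add_with_emplace_back bumps only g_converting. For std::string the extra move is cheap, which matches the point that the difference is not significant here.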
I want to pass a std::list as a parameter to fn(std::list<int>), so I do fn({10, 21, 30}) and everybody is happy.
However, I've come to learn that one shouldn't pass list by value, cause it's costly. So, I redefine my fn as fn(std::list<int> &). Now, when I do the call fn({10, 21, 30}), I get an error: candidate function not viable: cannot convert initializer list argument to 'std::list<int> &'.
QUESTION TIME
Is the "you shall not pass a costly object by value" rule valid here? We aren't passing a list after all, but an initializer_list, no?
If the rule still applies, what's the easy fix here?
I guess my doubt comes from the fact that I don't know clearly what happens when one passes an initializer_list argument to a function that accepts a list.
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
However, I've come to learn that one shouldn't pass list by value, cause it's costly.
That's not entirely accurate. If you need to pass in a list that the function can modify, where the modifications shouldn't be externally visible, you do want to pass a list by value. This gives the caller the ability to choose whether to copy or move from an existing list, so gives you the most reasonable flexibility.
If the modifications should be externally visible, you should prevent temporary list objects from being passed in, since passing in a temporary list object would prevent the caller from being able to see the changes made to the list. The flexibility to silently pass in temporary objects is the flexibility to shoot yourself in the foot. Don't make it too flexible.
If you need to pass in a list that the function will not modify, then const std::list<T> & is the type to use. This allows either lvalues or rvalues to be passed in. Since there won't be any update to the list, there is no need for the caller to see any update to the list, and there is no problem passing in temporary list objects. This again gives the caller the most reasonable flexibility.
Is the "you shall not pass a costly object by value" rule valid here? We aren't passing a list after all, but an initializer_list, no?
You're constructing a std::list from an initializer list. You're not copying that std::list object, but you are copying the list items from the initializer list to the std::list. If copying the list items is cheap, you don't need to worry about it. If copying the list items is expensive, then it should be up to the caller to construct the list in some other way. Either way, it is not something to worry about inside your function.
If the rule still applies, what's the easy fix here?
Both passing std::list by value or by const & allow the caller to avoid pointless copies. Which of those you should use depends on the results you want to achieve, as explained above.
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
Passing the list by value constructs a new std::list object in the location of the function parameter, using the function argument to specify how to construct it. This may or may not involve a copy or a move of an existing std::list object, depending on what the caller specifies as the function argument.
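The two recommended signatures might look like this in practice. It is a minimal sketch: sum and sorted_copy are hypothetical function names chosen for illustration, not anything from the question.

```cpp
#include <list>
#include <numeric>
#include <utility>

// Read-only access: binds to lvalues and to temporaries such as {10, 21, 30}.
int sum(const std::list<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// The function needs its own modifiable list: take it by value and let the
// caller decide whether that parameter is built by copy, move, or from an
// initializer list.
std::list<int> sorted_copy(std::list<int> values) {
    values.sort();
    return values;
}
```

A call such as sum({10, 21, 30}) constructs one temporary list and binds the reference to it. sorted_copy(existing) copies the caller's list, whereas sorted_copy(std::move(existing)) and sorted_copy({3, 1, 2}) construct the parameter without copying an existing list.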
The expression {10, 21, 30} will construct an initializer_list<int>.
This in turn will be used to create a list<int>.
That list will be a temporary, and a temporary will not bind to a non-const reference.
One fix would be to change the prototype for your function to
fn(const std::list<int>&)
This means that you can't edit it inside the function, and you probably don't need to.
However, if you must edit the parameter inside the function, taking it by value would be appropriate.
Also note: don't optimize prematurely. You should always use idiomatic
constructs that clearly represent what you want to do, and for functions,
that almost always means parameters by const& and return by value.
This is easy to use right, hard to use wrong, and almost always fast enough.
Optimization should only be done after profiling, and only for the parts of the program that you have measured to need it.
Quoting the C++14 standard draft (emphasis mine):
18.9 Initializer lists [support.initlist]
2: An object of type initializer_list provides access to an array of
objects of type const E. [ Note: A pair of pointers or a pointer plus
a length would be obvious representations for initializer_list.
initializer_list is used to implement initializer lists as specified
in 8.5.4. Copying an initializer list does not copy the underlying
elements. —end note ]
std::list has a constructor which is used to construct from std::initializer_list. As you can see, it takes it by value.
list(initializer_list<T>, const Allocator& = Allocator());
If you are never going to modify your parameter, then fn(const std::list<int>&) will do just fine. Otherwise, fn(std::list<int>) will do.
To answer your questions:
Is the "you shall not pass a costly object by value" rule valid here?
We aren't passing a list after all, but an initializer_list, no?
std::initializer_list is not a costly object. But std::list<int> surely sounds like a costly object
If the rule still applies, what's the easy fix here?
Again, it's not costly
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
Yes, it is... your list object is created on the spot at run-time right before the program enters your function scope
However, I've come to learn that one shouldn't pass list by value, cause it's costly. So, I redefine my fn as fn(std::list<int> &). Now, when I do the call fn({10, 21, 30}), I get an error: candidate function not viable: cannot convert initializer list argument to 'std::list<int> &'.
A way to fix the problem would be:
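#include <iostream>
#include <list>
void fn(std::list<int>& v) {
std::cout << v.size();
}
void fn(std::list<int>&& v) {
fn(v); // v has a name, so it is an lvalue here: calls the overload above
}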
Now fn({1, 2, 3}); works as well (it will call the second overload, which accepts a list by rvalue reference, and then fn(v); calls the first one, which accepts lvalue references).
fn(std::list<int> v)
{
}
The problem with this function is that it can be called like:
list<int> biglist;
fn(biglist);
And it will make a copy. And it will be slow. That's why you want to avoid it.
I would give you the following solutions:
Overload your fn function to accept both rvalues and lvalues properly, as shown before.
Only use the second function (the one that accepts only rvalue references). The problem with this approach is that it produces a compile error when called with an lvalue, which is something you want to allow.
Like the other answers and comments you can use a const reference to the list.
void fn(const std::list<int>& l)
{
for (auto it = l.begin(); it != l.end(); ++it)
{
*it; //do something
}
}
If this fn function is heavily used and you are worried about the overhead of constructing and destructing the temporary list object, you can create a second function that receives the initializer_list directly that doesn't involve any copying whatsoever. Using a profiler to catch such a performance hot spot is not trivial in many cases.
void fn(const std::initializer_list<int>& l)
{
for (auto it = l.begin(); it != l.end(); ++it)
{
*it; //do something
}
}
You can take std::list<> by value, because in fact you're creating a temporary list anyway, and passing the initializer_list itself by value is cheap. Accessing that list later can also be faster than going through a reference, because you avoid an indirection.
You could instead take const std::list& as the parameter, or do it like this:
void foo( std::list<int> &list ) {}
int main() {
std::list<int> list{1,2,3};
foo( list );
}
The list is created in the function's scope, and this constructor is called:
list (initializer_list<value_type> il,
const allocator_type& alloc = allocator_type())
So no list is passed by value in that case. But if you call that function with an existing list as the argument, it will be passed by value.
Say I have a function which goes like:
void a_fct( std::vector<double> &some_vector )
{
std::vector<double> a_temporary_vector = some_vector;
// ... some stuff involving only a_temporary_vector ...
some_vector = a_temporary_vector;
}
which is of course a bit silly, but it is just intended as bringing the following general questions:
It seems to me that one should rather move a_temporary_vector into some_vector here, as it then goes out of scope. But is it safe to move some_vector into a_temporary_vector rather than copying ?
Imagine now that somewhere else in the code, I have a pointer to the vector given as argument here. Is this pointer still pointing to it if I move some_vector into a_temporary_vector ?
In both cases, can I expect it to be true for any STL implementation ?
Yes, all of the above is perfectly safe.
Moving into or from the argument vector does change it, but this is kind of what a non-const reference argument is meant to do anyway. The result of moving the temporary vector into the output parameter is equivalent to that of the copy method.
The pointer to the argument vector is no problem either. some_vector will still be the same object as before, you will just have changed its content.
Pointers, iterators and references to the original data of some_vector get invalidated of course, regardless of copy or move.
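A short sketch of the move-out, work, move-back pattern, assuming the "stuff" is a sort. The function name sort_via_temporary is hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

void sort_via_temporary(std::vector<double>& some_vector) {
    // Steal the contents; some_vector stays a valid object at the same address.
    std::vector<double> a_temporary_vector = std::move(some_vector);

    // Work only on the temporary.
    std::sort(a_temporary_vector.begin(), a_temporary_vector.end());

    // Hand the result back without copying the elements.
    some_vector = std::move(a_temporary_vector);
}
```

A pointer to the vector object taken before the call, such as std::vector<double>* p = &v;, still points to v afterwards. Pointers to individual elements taken before the call should be treated as invalid.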
The standard (in 17.6.5.15) says:
Objects of types defined in the C++ standard library may be moved from (12.8). Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
Thus, some_vector is required to have an (unspecified but) valid state after being moved from. You're allowed to move from it into your temporary and then move again from the temporary into some_vector.
std::vector<double> a_temporary_vector = std::move(some_vector);
// do stuff only on a_temporary_vector but not some_vector
some_vector = std::move(a_temporary_vector);
Pointers and references to some_vector itself are still valid but pointers or references to content of some_vector are not (as you would normally expect when passing an object to a function taking a non-constant reference).
Note: You could use swap in this case instead if you're not sure about whether to move from it or not.
void foo( std::vector<double> &some_vector )
{
// swap contents of some_vector into temporary
std::vector<double> a_temporary_vector;
a_temporary_vector.swap(some_vector);
// operate on temporary
// swap content back into some_vector
a_temporary_vector.swap(some_vector);
}
I have been wondering about that all day long and I can't find an answer to that specific case.
Main :
std::vector<MyObject*> myVector;
myVector.reserve(5);
myFunction(std::move(myVector));
myFunction :
void myFunction(std::vector<MyObject*> && givenVector){
std::vector<MyObject*> otherVector = givenVector;
std::cout << givenVector[0];
// do some stuff
}
My questions are :
In main, is myVector destroyed by the function myFunction() because it is considered an rvalue, or does the compiler know that it is also an lvalue and therefore perform a copy before sending it to myFunction? What happens if I try to use the vector after the call to myFunction()?
Inside the function myFunction(), is the vector givenVector destroyed when it is assigned to otherVector? If so, what happens when I try to print it? If not, is it useful to use an rvalue in this function?
Looks like duplicate.
myVector is not destroyed by the function myFunction(). In the general case, it is unspecified what state an object is left in after its resources have been stolen.
givenVector is not destroyed when it is assigned to otherVector. In the general case, it is unspecified what state an object is left in after its resources have been stolen.
In order to be compilable, you should apply a std::move to your vector before you pass it to the function (--at least if no further overloads exists):
myFunction(std::move(myVector));
Then, inside the function, by
std::vector<MyObject*> otherVector = std::move(givenVector);
the move constructor of std::vector is called which basically moves all the content out of the vector (note however again the std::move on the right-hand side -- otherwise you'll get a copy). By this, the vector is not "destroyed". Even after the move it is still alive, yet in an unspecified state.
That means that member functions which impose no precondition on the state of the vector may still be called, such as the destructor, size() and so on. A pop_back() or a dereferencing of a vector element, however, will likely fail.
See here for a more detailed explanation what you still can do with a moved-from object.
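As a rough illustration of what remains safe on a moved-from vector, the following sketch only calls operations that have no precondition before giving the vector a known state again. The function name reuse_after_move is hypothetical.

```cpp
#include <utility>
#include <vector>

std::vector<int> reuse_after_move() {
    std::vector<int> source{1, 2, 3};
    std::vector<int> sink = std::move(source); // source: valid but unspecified

    // clear() has no precondition, so it is safe to call here. Afterwards the
    // vector is in a known (empty) state and can be used normally again.
    source.clear();
    source.push_back(7);
    return source;
}
```

Reading source[0] or calling source.pop_back() directly after the move, without first checking size() or resetting the vector, would rely on the unspecified state.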
The code won't compile, since you try to bind an lvalue to an rvalue reference. You'll need to deliberately convert it to an rvalue:
myFunction(std::move(myVector));
Simply doing this won't "destroy" the object; what happens to it depends on what the function does. Generally, functions which take rvalue references do so in order to move from the argument, in which case they might leave it in some valid but "empty" state, but won't destroy it.
Your code moves the vector to the local otherVector, leaving it empty. Then you try to print the first element of an empty vector, giving undefined behaviour.
No copy is performed. What happens to myVector depends on what myFunction does with it. You should consider objects that have been moved from as either being the same or being empty. You can assign new values and keep using it or destroy it.
myVector is fine. It is an lvalue and otherVector makes a copy of it. You most likely wanted to write otherVector = std::move(myVector);, in which case myVector should be empty. If you have an old implementation of the STL (one that does not know about move semantics), a copy is performed and myVector is not changed. Whether that makes sense is for you to decide. You moved a given vector to a new vector, which can be useful. Printing an empty vector is not so useful.
If a function gets an argument by rvalue-reference, that does not mean it will be destructively used, only that it can be.
Destructive use means that the passed object is thereafter in some unspecified but valid state, fit only for re-initializing, moving, copying or destruction.
In the function, the argument has a name and thus is an lvalue.
To mark the place(s) where you want to take advantage of the licence to ruthlessly plunder it, you have to convert it to an rvalue-reference on passing it on, for example with std::move or std::forward, the latter mostly for templates.
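The point that a named rvalue-reference parameter is itself an lvalue can be sketched with two near-identical functions. This is a simplified illustration: it uses std::vector<int> instead of the question's std::vector<MyObject*>, and the names take_by_copy and take_by_move are hypothetical.

```cpp
#include <utility>
#include <vector>

// given has a name, so it is an lvalue inside the function:
// this line selects the copy constructor and leaves the caller's vector intact.
std::vector<int> take_by_copy(std::vector<int>&& given) {
    std::vector<int> other = given;
    return other;
}

// std::move casts given back to an rvalue:
// this line selects the move constructor and takes over the caller's buffer.
std::vector<int> take_by_move(std::vector<int>&& given) {
    std::vector<int> other = std::move(given);
    return other;
}
```

Both functions must be called with an rvalue, for example take_by_copy(std::move(v)), yet only take_by_move actually takes the contents away from the caller's vector.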
I have the following function:
void read_int(std::vector<int> &myVector)
Which allows me to fill myVector by reference. It is used like this:
std::vector<int> myVector;
read_int(myVector);
I want to refactor the code a bit (keeping the original function) so that in the end I have this:
auto myVector = read_int(); // auto is std::vector<int>
What would be the best intermediate function to achieve this?
It seems to me that the following straight-forward answer is suboptimal:
std::vector<int> read_int() {
std::vector<int> myVector_temp;
read_int(myVector_temp);
return myVector_temp;
}
The obvious answer is correct, and basically optimal.
void do_stuff(std::vector<int>& on_this); // (1)
std::vector<int> do_stuff_better() { // (2)
std::vector<int> myVector_temp; // (3)
do_stuff(myVector_temp); // (4)
return myVector_temp; // (5)
}
At (3) we create a named return value in automatic storage (on the stack).
At (5) we only ever return the named return value from the function, and we never return anything else but that named return value anywhere else in the function.
Because of (3) and (5), the compiler is allowed to (and most likely will) elide the existence of the myVector_temp object. It will directly construct the return value of the function, and call it myVector_temp. It still needs there to be an existing move or copy constructor, but it does not call it.
On the other end, when calling do_stuff_better, some compilers can also elide the assignment at call:
std::vector<int> bob = do_stuff_better(); // (6)
The compiler is allowed to effectively pass a "pointer to bob" and tell do_stuff_better() to construct its return value in bob's location, eliding this copy construction as well (well, it can arrange how the call occurs such that the location that do_stuff_better() is asked to construct its return value in is the same as the location of bob).
And in C++11, even if the requirements for both elisions are not met, or the compiler chooses not to use them, in both cases a move must be done instead of a copy.
At line (5) we are returning a locally declared automatic storage duration variable in a plain and simple return statement. This makes the return an implicit move if not elided.
At line (6), the function returns an unnamed object, which is an rvalue. When bob is constructed from it, it move-constructs.
Moving a std::vector consists of copying the value of ~3 pointers, and then zeroing the source, regardless of how big the vector is. No elements need be copied or moved.
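One way to observe this is to check that the heap buffer simply changes owner. The helper below is a hypothetical sketch written for this purpose, not part of the answer's code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns true if move-constructing a vector hands over the source's buffer
// instead of allocating a new one and transferring the elements.
bool move_keeps_buffer(std::size_t n) {
    std::vector<int> source(n, 42);
    const int* buffer_before = source.data();

    std::vector<int> target = std::move(source);

    return target.data() == buffer_before && target.size() == n;
}
```

With the default allocator this is expected to hold for any n, which is why the cost of the move does not depend on the number of elements.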
Both of the above elisions, where we remove the named local variable within do_stuff_better(), and we remove the return value of do_stuff_better() and instead directly construct bob, are somewhat fragile. Learning the rules under which your compiler is allowed to do those elisions, and also the situations where your compiler actually does the elisions, is worthwhile.
As an example of how it is fragile, if you had a branch where you did a return std::vector<int>() in your do_stuff_better() after checking an error state, the in-function elision would probably be blocked.
Even if elision is blocked or your compiler doesn't implement it for a case, the fact that the container is move'd means that the run time costs are going to be minimal.
I think you should read more about move semantics (link to Google query; there are a lot of articles on this, just choose one).
In short, since C++11 all standard library containers are written in such a way that returning them from a function causes their contents to be moved from the returned value (a so-called rvalue) into the variable you are initializing. In effect you'll only copy a few fields of the std::vector instead of its data. That's a lot faster than copying its contents.