I'm trying to change my code to take a vector by value using std::move instead of passing it by reference because I've gathered that would be more efficient. I've seen different ways of doing this though, one having the constructor pass by value and using std::move in the constructor, and the other way being to initialize the class with an std::move and having the constructor take an rvalue (did I get that right?). Some example below:
method 1:
Constructor:
StatisticsCompiler::StatisticsCompiler(std::vector<Wrapper<StatisticsMC>> Inner_) :Inner(std::move(Inner_))
{
}
in main:
vector<Wrapper<StatisticsMC>> gathererArray{ meanGatherer, absQuantileGatherer, relVaRGatherer, relESGatherer };
StatisticsCompiler gathererCombiner(gathererArray);
method 2:
Constructor:
StatisticsCompiler::StatisticsCompiler(std::vector<Wrapper<StatisticsMC>>&& Inner_) :Inner(Inner_)
{
}
main:
vector<Wrapper<StatisticsMC>> gathererArray{ meanGatherer, absQuantileGatherer, relVaRGatherer, relESGatherer };
StatisticsCompiler gathererCombiner(std::move(gathererArray));
Is there a difference between what's going on in the two, or are they the same thing? The first method "looks" nicer in main, but the second is how I would intuitively expect it to work from learning about rvalues. If they are exactly the same performance-wise, what is standard practice?
StatisticsCompiler::StatisticsCompiler(std::vector<Wrapper<StatisticsMC>> Inner_) :Inner(std::move(Inner_))
This constructor takes the argument by value. The parameter can either be copy constructed from an lvalue argument, or move constructed from an rvalue. This ability to choose between copying from lvalue and moving from rvalue, combined with the simplicity, is why this approach is recommended.
The member is always moved from that copied or moved argument.
StatisticsCompiler gathererCombiner(gathererArray);
You passed an lvalue; therefore the parameter is copied. You could use the move constructor instead by passing an rvalue:
StatisticsCompiler gathererCombiner(std::move(gathererArray));
Or you could even use a prvalue:
StatisticsCompiler gathererCombiner({
meanGatherer,
absQuantileGatherer,
relVaRGatherer,
relESGatherer,
});
StatisticsCompiler::StatisticsCompiler(std::vector<Wrapper<StatisticsMC>>&& Inner_) :Inner(Inner_)
This constructor accepts only rvalue arguments. This is not as flexible as the first suggestion which also can accept lvalues.
This approach always copies the parameter into the member. You might want to move instead:
Inner(std::move(Inner_))
Conclusion: When you want to store an object given as an argument, passing by value (the first example) is a good default choice. If you don't need the passed argument afterwards, you can choose to move from it.
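To make the copy/move counts discussed above concrete, here is a minimal sketch. `Tracer`, the two global counters, the three sink structs and `countOps` are hypothetical names introduced only for illustration; they stand in for the vector and the `StatisticsCompiler` constructors and are not part of the original code.

```cpp
#include <utility>

// Hypothetical counters and helper type, introduced only for illustration.
int g_copies = 0;
int g_moves = 0;

struct Tracer {
    Tracer() = default;
    Tracer(const Tracer&) { ++g_copies; }
    Tracer(Tracer&&) noexcept { ++g_moves; }
};

// Method 1: take by value, then move the parameter into the member.
struct ByValueSink {
    Tracer inner;
    explicit ByValueSink(Tracer t) : inner(std::move(t)) {}
};

// Method 2 as written in the question: the named parameter is an lvalue,
// so the member is copy-constructed from it.
struct RvalueSinkNoMove {
    Tracer inner;
    explicit RvalueSinkNoMove(Tracer&& t) : inner(t) {}
};

// Method 2 with std::move added: the member is move-constructed.
struct RvalueSinkWithMove {
    Tracer inner;
    explicit RvalueSinkWithMove(Tracer&& t) : inner(std::move(t)) {}
};

struct Counts { int copies; int moves; };

// Constructs one Sink from arg and reports the copies/moves that happened.
template <typename Sink, typename Arg>
Counts countOps(Arg&& arg) {
    g_copies = 0;
    g_moves = 0;
    Sink sink(std::forward<Arg>(arg));
    (void)sink;
    return {g_copies, g_moves};
}
```

Under these assumptions, `countOps<ByValueSink>(a)` for an lvalue `a` reports one copy and one move, `countOps<ByValueSink>(std::move(a))` reports two moves, `countOps<RvalueSinkNoMove>(std::move(a))` reports one copy, and `countOps<RvalueSinkWithMove>(std::move(a))` reports one move.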
Related
I'm currently learning about R-value references by reading tutorials. Many tutorials mention move constructors/assignment operators as the main use case of R-value references. So I'm wondering whether/how they should be used outside of the "Rule of 5".
Say I have a function
std::string foo();
which returns a potentially large string, and want to pass its output to
int bar (const std::string& s);
Consider 3 options of doing this.
1.
const std::string my_string = foo();
const int x = bar(my_string);
2.
std::string&& my_string = foo();
const int x = bar(my_string);
3.
const int x = bar(foo());
My understanding is that 2. and 3. are essentially the same (in both cases, there's never an l-value holding my_string), but 2. may be more readable (the string can be given a name, and the statement can be broken up into multiple lines). None of the 3 options does any unnecessary copies. The difference between 2. and 1. is that in 1., my_string will live till it goes out of scope, while in 2., I tell the compiler that I don't need my_string any more after passing it to bar, allowing to free its memory earlier.
Is the above correct? If so, is 2. the best option to use here?
My understanding is that 2. and 3. are essentially the same
No, in fact 1. and 2. are essentially the same, but 3. is different.
In both 1. and 2. my_string is an lvalue, since names of variables are always lvalues. In both cases the string object lives until the end of the scope. In case of 1. because that is the scope of the object and in 2. because of the lifetime extension rules on reference binding.
In 3. foo() is a prvalue (a kind of rvalue). The temporary object materialized from it and to which the reference parameter of the function binds lives until the end of the initialization of x.
So if bar had different overloads taking lvalue references or rvalue references, in 1. and 2. the lvalue reference overload would be chosen and in 3. the rvalue reference overload.
Since C++17, none of the three methods make unnecessary copies. There will be only one std::string object, either the temporary one in case of 2. and 3. or the named one in case of 1.
Before C++17, there could be an additional copy in case of 1. to copy from the return value of foo into my_string. However the compiler was allowed to elide this copy and that probably almost always happened in practice. Since C++17 this elision is mandatory.
while in 2., I tell the compiler that I don't need my_string any more after passing it to bar,
Aside from this applying to 3., not 2., the compiler actually doesn't care about this at all, the lifetime of the object is not affected.
It is the function you are passing to that cares. If a function takes an rvalue reference, by convention, that means that the function is allowed to use and modify the object's state in whatever way it suits it. Your function doesn't have an rvalue overload, so it doesn't matter how you pass the string to it. bar should never modify the state of the object.
So to make the distinction relevant to your case, you would either add an overload
int bar(std::string&&);
that will be called instead of the first overload for 3. and somehow makes use of the implied permission to put the string object into an unspecified state, or you would use a single overload
int bar(std::string);
in which case before C++17 the parameter will be constructed through the copy constructor for 1. and 2. or the move constructor for 3. The move constructor makes use of the rvalue convention and will reuse the passed string's allocations. Or in case of 3., since C++17 definitively and optionally before that, the copy/move is elided and the parameter is directly constructed by the function foo().
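The overload-resolution claim above can be checked with a small sketch. The bodies of `foo` and `bar` below are made up for illustration (the question does not show them), and `option1` to `option3` are hypothetical wrappers around the three variants; each `bar` overload simply returns a number identifying itself.

```cpp
#include <string>

// Hypothetical body: the question only declares foo().
std::string foo() { return std::string(64, 'x'); }

// Returns 1 when the const lvalue-reference overload is chosen,
// 2 when the rvalue-reference overload is chosen.
int bar(const std::string&) { return 1; }
int bar(std::string&&) { return 2; }

int option1() {
    const std::string my_string = foo();
    return bar(my_string);            // my_string is an lvalue
}

int option2() {
    std::string&& my_string = foo();  // the temporary's lifetime is extended
    return bar(my_string);            // a named rvalue reference is still an lvalue
}

int option3() {
    return bar(foo());                // foo() is a prvalue
}
```

With both overloads present, `option1()` and `option2()` select the `const std::string&` overload, and only `option3()` selects the `std::string&&` overload.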
I'm learning C++ at the moment and try to avoid picking up bad habits.
From what I understand, clang-tidy contains many "best practices" and I try to stick to them as best as possible (even though I don't necessarily understand why they are considered good yet), but I'm not sure if I understand what's recommended here.
I used this class from the tutorial:
class Creature
{
private:
std::string m_name;
public:
Creature(const std::string &name)
: m_name{name}
{
}
};
This leads to a suggestion from clang-tidy that I should pass by value instead of reference and use std::move.
If I do, I get the suggestion to make name a reference (to ensure it does not get copied every time) and the warning that std::move won't have any effect because name is a const so I should remove it.
The only way I don't get a warning is by removing const altogether:
Creature(std::string name)
: m_name{std::move(name)}
{
}
Which seems logical, as the only benefit of const was to prevent messing with the original string (which doesn't happen because I passed by value).
But I read on CPlusPlus.com:
Although note that -in the standard library- moving implies that the moved-from object is left in a valid but unspecified state. Which means that, after such an operation, the value of the moved-from object should only be destroyed or assigned a new value; accessing it otherwise yields an unspecified value.
Now imagine this code:
std::string nameString("Alex");
Creature c(nameString);
Because nameString gets passed by value, std::move will only invalidate name inside the constructor and not touch the original string. But what are the advantages of this? It seems like the content gets copied only once anyhow - if I pass by reference when I call m_name{name}, if I pass by value when I pass it (and then it gets moved). I understand that this is better than passing by value and not using std::move (because it gets copied twice).
So two questions:
Did I understand correctly what is happening here?
Is there any upside of using std::move over passing by reference and just calling m_name{name}?
/* (0) */
Creature(const std::string &name) : m_name{name} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to name, then is copied into m_name.
/* (1) */
Creature(std::string name) : m_name{std::move(name)} { }
A passed lvalue is copied into name, then is moved into m_name.
A passed rvalue is moved into name, then is moved into m_name.
/* (2) */
Creature(const std::string &name) : m_name{name} { }
Creature(std::string &&rname) : m_name{std::move(rname)} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to rname, then is moved into m_name.
As move operations are usually faster than copies, (1) is better than (0) if you pass a lot of temporaries. (2) is optimal in terms of copies/moves, but requires code repetition.
The code repetition can be avoided with perfect forwarding:
/* (3) */
template <typename T,
std::enable_if_t<
std::is_convertible_v<std::remove_cvref_t<T>, std::string>,
int> = 0
>
Creature(T&& name) : m_name{std::forward<T>(name)} { }
You might optionally want to constrain T in order to restrict the domain of types that this constructor can be instantiated with (as shown above). C++20 aims to simplify this with Concepts.
In C++17, prvalues are affected by guaranteed copy elision, which - when applicable - will reduce the number of copies/moves when passing arguments to functions.
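As a rough illustration of the forwarding constructor in (3), the sketch below swaps `std::string` for a hypothetical `Name` wrapper that records how it was constructed, so the copy-versus-move behaviour becomes observable. The constraint used here (`std::is_constructible_v` plus a guard against `Creature` itself) is one possible C++17 formulation and an assumption of this sketch, not the only reasonable choice.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper that remembers how it was constructed:
// 'd' = directly from a C string, 'c' = copied, 'm' = moved.
struct Name {
    std::string text;
    char how;
    Name(const char* s) : text(s), how('d') {}
    Name(const Name& o) : text(o.text), how('c') {}
    Name(Name&& o) noexcept : text(std::move(o.text)), how('m') {}
};

class Creature {
    Name m_name;
public:
    // Forwarding constructor, constrained so that it only accepts arguments
    // a Name can be built from and does not hijack Creature's own copy/move.
    template <typename T,
              std::enable_if_t<
                  std::is_constructible_v<Name, T&&> &&
                  !std::is_same_v<std::decay_t<T>, Creature>,
                  int> = 0>
    explicit Creature(T&& name) : m_name{std::forward<T>(name)} {}

    char how() const { return m_name.how; }
};
```

Passing an lvalue `Name` copies it into the member, passing `std::move(n)` moves it, and passing a string literal constructs the member in place without any intermediate `Name` object.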
Did I understand correctly what is happening here?
Yes.
Is there any upside of using std::move over passing by reference and just calling m_name{name}?
An easy to grasp function signature without any additional overloads. The signature immediately reveals that the argument will be copied - this saves callers from wondering whether a const std::string& reference might be stored as a data member, possibly becoming a dangling reference later on. And there is no need to overload on std::string&& name and const std::string& arguments to avoid unnecessary copies when rvalues are passed to the function. Passing an lvalue
std::string nameString("Alex");
Creature c(nameString);
to the function that takes its argument by value causes one copy and one move construction. Passing an rvalue to the same function
std::string nameString("Alex");
Creature c(std::move(nameString));
causes two move constructions. In contrast, when the function parameter is const std::string&, there will always be a copy, even when passing an rvalue argument. This is clearly an advantage as long as the argument type is cheap to move-construct (this is the case for std::string).
But there is a downside to consider: the reasoning doesn't work for functions that assign the function argument to another variable (instead of initializing it):
void setName(std::string name)
{
m_name = std::move(name);
}
will cause a deallocation of the resource that m_name refers to before it's reassigned. I recommend reading Item 41 in Effective Modern C++ and also this question.
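The setter trade-off can be sketched as follows. The member-function names `setNameByValue` and `setNameByRef` are hypothetical, chosen only to put the two variants side by side; whether copy assignment actually reuses the existing buffer depends on the standard library implementation and on the current capacity.

```cpp
#include <string>
#include <utility>

class Creature {
    std::string m_name;
public:
    // By value + move assignment: an lvalue argument is first copied into a
    // fresh parameter (which may allocate), then m_name's old buffer is released.
    void setNameByValue(std::string name) { m_name = std::move(name); }

    // By const reference: copy assignment may reuse m_name's existing buffer
    // when its capacity is already large enough (implementation-dependent).
    void setNameByRef(const std::string& name) { m_name = name; }

    const std::string& name() const { return m_name; }
};
```

Both setters leave the caller's string untouched when it is passed as an lvalue; they differ only in how many allocations may happen along the way, which matters mostly when the setter is called repeatedly with lvalues.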
How you pass is not the only variable here, what you pass makes the big difference between the two.
In C++, we have all kinds of value categories and this "idiom" exists for cases where you pass in an rvalue (such as "Alex-string-literal-that-constructs-temporary-std::string" or std::move(nameString)), which results in 0 copies of std::string being made (the type does not even have to be copy-constructible for rvalue arguments), and only uses std::string's move constructor.
Somewhat related Q&A.
There are several disadvantages of the pass-by-value-and-move approach compared with pass-by-(rvalue-)reference:
it causes 3 objects to be spawned instead of 2;
passing an object by value may lead to extra stack overhead, because even a regular string class is typically at least 3 or 4 times larger than a pointer;
construction of the argument objects is done on the caller side, causing code bloat.
Suppose I have the following class:
class foo {
std::unique_ptr<blah> ptr;
};
What's the difference between these two:
foo::foo(unique_ptr p)
: ptr(std::move(p))
{ }
and
foo::foo(unique_ptr&& p)
: ptr(std::move(p))
{ }
When called like
auto p = make_unique<blah>();
foo f(std::move(p));
Both compile, and I guess both must use unique_ptr's move constructors? I guess the first one it'll get moved twice, but the second one it'll only get moved once?
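Both versions call std::move twice, and each call is only a cast. The by-value version, however, runs unique_ptr's move constructor twice (once to initialize the parameter p, once to initialize ptr), while the rvalue-reference version runs it once, because there p merely binds to the caller's pointer.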
The only practical difference is the place where the code would break if you were to pass a std::unique_ptr by value.
Having foo::foo(unique_ptr p) the compiler would complain about the copy constructor being deleted.
Having foo::foo(unique_ptr&& p) it would say that there is no matching function for the set of arguments provided.
std::unique_ptr is a non-copyable class, this means that you either pass a constant reference (readonly access) or move it (rvalue) to express/give ownership.
The second case (getting explicit rvalue) is the proper implementation, as your intent is passing the ownership and std::unique_ptr has deleted copy assignment/constructor.
In the first case, copying is prohibited and the argument is an rvalue (you moved it before passing), so the compiler uses the move constructor. The ownership then belongs to the parameter, and you move it again into the internal field, passing the ownership once more, which is not a very nice solution. (Note that this declaration isn't restrictive: it allows any value category. If std::unique_ptr were copyable, the copy constructor would be used here, hiding the actual problem.)
In terms of resulting code, std::move is a simple cast to rvalue, so both do the same. In terms of correctness, second one must be used.
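One observable difference between the two signatures is when ownership actually leaves the caller. In the sketch below, the constructors deliberately ignore their parameter to expose this; `ByValue`, `ByRvalueRef` and the two helper functions are hypothetical names, and `blah` is an empty placeholder type.

```cpp
#include <memory>
#include <utility>

struct blah {};

// By value: the parameter is move-constructed from the argument, so the
// caller's pointer is empty afterwards even if the body does nothing with it.
struct ByValue {
    explicit ByValue(std::unique_ptr<blah> p) { (void)p; }
};

// By rvalue reference: the parameter only refers to the caller's pointer.
// Ownership is transferred only if the body actually moves from it.
struct ByRvalueRef {
    explicit ByRvalueRef(std::unique_ptr<blah>&& p) { (void)p; }
};

bool callerStillOwnsAfterByValue() {
    auto p = std::make_unique<blah>();
    ByValue f(std::move(p));
    (void)f;
    return p != nullptr;  // a moved-from unique_ptr is guaranteed to be null
}

bool callerStillOwnsAfterByRvalueRef() {
    auto p = std::make_unique<blah>();
    ByRvalueRef f(std::move(p));
    (void)f;
    return p != nullptr;  // nothing moved from p, so it still owns the object
}
```

So the by-value signature guarantees that the caller has given up ownership once the call returns, whereas the rvalue-reference signature only permits the callee to take it.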
I am trying to understand rvalue references and move semantics. In the following code, when I pass 10 to the Print function it calls the rvalue reference overload, which is expected. But what exactly happens: where is that 10 copied to (or where is it referred from)? Secondly, what does std::move actually do? Does it extract the value 10 from i and then pass it, or is it an instruction to the compiler to use an rvalue reference?
#include <iostream>
#include <utility> // std::move
using namespace std;
void Print(int& i)
{
cout<<"L Value reference "<<endl;
}
void Print(int&& i)
{
cout<<"R Value reference "<< endl;
}
int main()
{
int i = 10;
Print(i); // OK, understandable
Print(10); // is 10 copied? If not, where is it stored?
Print(std::move(i)); // what exactly does move do?
return 0;
}
Thanks.
In the case of a 10, there will probably be optimisations involved which will change the actual implementation, but conceptually, the following happens:
A temporary int is created and initialised with the value 10.
That temporary int is bound to the r-value reference function parameter.
So conceptually, there's no copying - the reference will refer to the temporary.
As for std::move(): there may be some tricky bits related to references etc., but principally, it's just a cast to r-value reference. std::move() does not actually move anything. It just turns its argument into an r-value, so that it can be moved from.
"Moving" is not really a defined operation, anyway. While it's convenient to think about moving, the important thing is l-value vs. r-value distinction.
"Moving" is normally implemented by move constructors, move assignment operators and functions taking r-value references (such as push_back()). It is their implementation that makes the move an actual move - that is, they are implemented so that they can "steal" the r-value's resources instead of copying them. That's because, being an r-value, it will no longer be accessible (or so you promise the compiler).
That's why std::move() enables "moving" - it turns its argument into an r-value, signalling, "hey, compiler, I will not be using this l-value any more, you can let functions (such as move ctors) treat it as an r-value and steal from it."
But what exactly happens, where that 10 will get copied (or from where it referred)
A temporary value is created, and a reference passed to the function. Temporaries are rvalues, so can be bound to rvalue references; so the second overload is chosen.
Secondly what std::move actually do?
It gives you an rvalue reference to its argument. It's equivalent (by definition) to static_cast<T&&>.
Despite the name, it doesn't do any movement itself; it just gives you a reference that can be used to move the value.
std::move casts an int to int&& via static_cast<int&&>.
If the type is a class or a struct, the move constructor (if it is defined, implicitly or explicitly) will then be invoked instead of the copy constructor.
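The "std::move is only a cast" point can be verified directly for an int. In this sketch, `which`, `moveIsJustACast` and `sameObject` are hypothetical helper names; `which` mirrors the two `Print` overloads from the question but returns a number instead of printing.

```cpp
#include <type_traits>
#include <utility>

// std::move applied to an int lvalue yields an expression of type int&&.
static_assert(std::is_same<decltype(std::move(std::declval<int&>())), int&&>::value,
              "std::move on an int lvalue yields int&&");

// Mirrors the Print overloads: 1 = lvalue reference, 2 = rvalue reference.
int which(int&) { return 1; }
int which(int&&) { return 2; }

int moveIsJustACast() {
    int i = 10;
    int&& r = std::move(i);  // r refers to i itself; nothing is copied or moved
    r = 20;                  // writes through the reference
    return i;                // i was modified through r
}

bool sameObject() {
    int i = 10;
    int&& a = std::move(i);
    int&& b = static_cast<int&&>(i);  // what std::move boils down to for an int
    return &a == &i && &b == &i;      // both references alias i
}
```

Calling `which(i)` selects the lvalue overload, while `which(10)` and `which(std::move(i))` select the rvalue overload; in none of these cases is the value of `i` changed by the call itself.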
std::shared_ptr::operator* returns by lvalue reference, and the answer given on overloading pointer like operations here says that the convention is to return by lvalue reference. However, when I'm using the following code, I get error C2664: 'AdjacencyList::addVertex' : cannot convert parameter 1 from 'AdjacencyList::vertex_type' to 'AdjacencyList::vertex_type &&': You cannot bind an lvalue to an rvalue reference:
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type&& v)
{
auto existingVertex(findVertex(v));
if (!existingVertex.isValid())
{
existingVertex = std::make_shared<vertex_type>(std::forward<vertex_type>(v));
m_vertices.push_back(existingVertex);
}
return existingVertex;
};
AdjacencyList minimumSpanningTree;
// startVertex is a shared_ptr to a vertex returned from a previous call of addVertex
// on another AdjacencyList object
const auto mstStartVertex(minimumSpanningTree.addVertex(*startVertex));
Should I provide AdjacencyList::addVertex(const vertex_type& v) or change the code at the bottom of the above block to make a copy of the vertex before passing to addVertex?
AdjacencyList minimumSpanningTree;
Vertex s(*startVertex);
const auto mstStartVertex(minimumSpanningTree.addVertex(std::move(s)));
I would think that you should return a copy from your operator*, as the semantics of std::weak_ptr suggest that you cannot guarantee that a returned reference stays valid. Since the returned copy is then given to a function which can move it somewhere else, it should also be efficient enough: addVertex looks like it requires a copy anyway, i.e., if you created a const-reference overload of addVertex, it would copy the passed object internally, wouldn't it?
The most efficient approach in terms of redundant copies is to provide rvalue and const reference overloads:
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type&&);
std::shared_ptr<vertex_type> AdjacencyList::addVertex(const vertex_type&);
To eliminate the redundant code, you can forward to a template method or to a concrete method taking a bool flag and performing const_cast as appropriate.
If the overhead of copying the Vertex object is minimal compared to the cost of increased code, and if the if block is usually or often entered, then the redundant copy will make your code clearer. Your second suggested call will work better if you just create a prvalue temporary that doesn't need to be moved:
const auto mstStartVertex(minimumSpanningTree.addVertex(Vertex{*startVertex}));
However in that case you might as well create the temporary in the call itself, by providing a single value overload (How to reduce redundant code when adding new c++0x rvalue reference operator overloads):
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type);
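One way to provide both reference overloads without duplicating the body is to forward to a shared template helper. The sketch below is a simplified stand-in: `Vertex` with a single `label` field, `addVertexImpl` and `size()` are assumptions made for illustration, and the `findVertex` lookup from the question is left out.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified vertex type.
struct Vertex { std::string label; };

class AdjacencyList {
    std::vector<std::shared_ptr<Vertex>> m_vertices;

    // Shared implementation: copies from lvalues, moves from rvalues.
    template <typename V>
    std::shared_ptr<Vertex> addVertexImpl(V&& v) {
        auto added = std::make_shared<Vertex>(std::forward<V>(v));
        m_vertices.push_back(added);
        return added;
    }

public:
    // Lvalue arguments (such as *startVertex) are copied.
    std::shared_ptr<Vertex> addVertex(const Vertex& v) { return addVertexImpl(v); }

    // Rvalue arguments (temporaries, std::move(x)) are moved.
    std::shared_ptr<Vertex> addVertex(Vertex&& v) { return addVertexImpl(std::move(v)); }

    std::size_t size() const { return m_vertices.size(); }
};
```

With these two overloads, a call such as `addVertex(*startVertex)` compiles and leaves the original vertex intact, while `addVertex(Vertex{"b"})` moves the temporary into the newly allocated vertex.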