The accepted answer to the post "Pass by value vs pass by rvalue reference" says:
For move-only types (as std::unique_ptr), pass-by-value seems to be the norm...
I'm a little bit doubtful about that. Let's say there is some non-copyable type, Foo, which is also not cheap to move; and some type Bar that has a member Foo.
class Foo {
public:
Foo(const Foo&) = delete;
Foo(Foo&&) { /* quite some work */ }
...
};
class Bar {
public:
Bar(Foo f) : f_(std::move(f)) {} // (1)
Bar(Foo&& f) : f_(std::move(f)) {} // (2)
// Assuming only one of (1) and (2) exists at a time
private:
Foo f_;
};
Then for the following code:
Foo f;
...
Bar bar(std::move(f));
Constructor (1) incurs 2 move constructions, while constructor (2) only incurs 1. I also remember reading about this in Scott Meyers's Effective Modern C++, but I can't immediately recall which item.
So my question is, for move-only types (or more generally, when we want to transfer the ownership of the argument), shouldn't we prefer pass-by-rvalue-reference for better performance?
UPDATE: I'm aware that the pass-by-value constructors/assignment operators (sometimes called unifying ctors/assignment operators) can help eliminate duplicate code. I should say I'm more interested in the case when (1) performance is important, and (2) the type is non-copyable and so there are no duplicate ctors/assignment operators which accept const lvalue reference arguments.
UPDATE 2: So I've found Scott Meyers's blog about the specific problem: http://scottmeyers.blogspot.com/2014/07/should-move-only-types-ever-be-passed.html. This blog discusses the reason that he advocates in Item 41 of his Effective Modern C++ that:
Consider pass by value only for copyable parameters...that are cheap to move...[and] always copied.
There is an extensive discussion in that item about pass by value vs. rvalue reference, too much to be quoted here. The point is, both ways have their own advantages and disadvantages, but for transferring the ownership of a move-only object, pass by rvalue reference seems to be preferable.
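To make the move counts concrete, here is a minimal sketch that counts move constructions for both constructor styles. Tracer, ByValue, ByRvalueRef and the two helper functions are hypothetical names introduced only for this illustration; they are not from the posts quoted above.

```cpp
#include <utility>

// Hypothetical move-only type that counts its move constructions.
struct Tracer {
    static int& moves() { static int n = 0; return n; }
    Tracer() = default;
    Tracer(const Tracer&) = delete;
    Tracer(Tracer&&) noexcept { ++moves(); }
};

// Style (1): take the argument by value, then move it into the member.
struct ByValue {
    explicit ByValue(Tracer t) : t_(std::move(t)) {}
    Tracer t_;
};

// Style (2): take an rvalue reference, then move it into the member.
struct ByRvalueRef {
    explicit ByRvalueRef(Tracer&& t) : t_(std::move(t)) {}
    Tracer t_;
};

// Move constructions when the argument is std::move(named object).
template <class Sink>
int moves_from_xvalue() {
    Tracer src;
    Tracer::moves() = 0;
    Sink sink(std::move(src));
    (void)sink;
    return Tracer::moves();
}

// Move constructions when the argument is a temporary.
template <class Sink>
int moves_from_prvalue() {
    Tracer::moves() = 0;
    Sink sink(Tracer{});
    (void)sink;
    return Tracer::moves();
}
```

Under these assumptions, moves_from_xvalue<ByValue>() should yield 2 and moves_from_xvalue<ByRvalueRef>() should yield 1. For a temporary argument both styles should yield 1, because the by-value parameter is initialized directly from the prvalue (guaranteed since C++17, and commonly elided before that), so the extra move only shows up when the caller passes std::move of a named object.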
In this case we can have our cake and eat it. A template constructor enabled only for Foo-like references gives us perfect forwarding plus a single implementation of a constructor:
#include <type_traits> // std::enable_if_t, std::is_same, std::decay_t
#include <utility>     // std::move, std::forward
class Foo {
public:
Foo() {}
Foo(const Foo&) = delete;
Foo(Foo&&) { /* quite some work */ }
};
class Bar {
public:
template<class T, std::enable_if_t<std::is_same<std::decay_t<T>, Foo>::value>* = nullptr>
Bar(T&& f) : f_(std::forward<T>(f)) {} // enabled only when T decays to Foo
private:
Foo f_;
};
int main()
{
Foo f;
Bar bar(std::move(f));
// this won't compile
// Foo f2;
// Bar bar2(f2);
}
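One caveat, offered as a sketch rather than a correction: when the constraint only checks that T decays to Foo, an lvalue argument is still accepted by overload resolution and fails later, inside the constructor's member initializer. As far as I can tell, that also means a trait such as std::is_constructible<Bar, Foo&> reports true. Adding std::is_constructible<Foo, T&&> to the constraint (my addition, not part of the answer) rejects lvalues at the signature instead:

```cpp
#include <type_traits>
#include <utility>

struct Foo {
    Foo() = default;
    Foo(const Foo&) = delete;
    Foo(Foo&&) noexcept { /* quite some work */ }
};

class Bar {
public:
    // Enabled only if T decays to Foo AND a Foo can actually be built
    // from T&&. For an lvalue (T = Foo&) the second condition is false,
    // so the constructor drops out of overload resolution.
    template <class T,
              std::enable_if_t<std::is_same<std::decay_t<T>, Foo>::value &&
                               std::is_constructible<Foo, T&&>::value>* = nullptr>
    explicit Bar(T&& f) : f_(std::forward<T>(f)) {}

private:
    Foo f_;
};

static_assert(std::is_constructible<Bar, Foo&&>::value,
              "rvalues are accepted");
static_assert(!std::is_constructible<Bar, Foo&>::value,
              "lvalues are rejected at the signature");
static_assert(!std::is_constructible<Bar, const Foo&>::value,
              "const lvalues are rejected at the signature");
```

Whether the extra condition is worth it depends on whether other code inspects Bar's constructibility; for a plain call such as Bar bar(std::move(f)) both versions behave the same.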
Background
It's hard to imagine a class that's expensive to move: move semantics come exactly from the need to give a fast alternative to copies, when semantics allow.
You bring up the example of std::string and SSO. However, that example is clearly flawed (I doubt they even turned on optimizations): copying 16 bytes through memcpy should take only a handful of CPU cycles, since it can be implemented as a single SIMD instruction that stores them all at once. Also, MSVC 10 is really old.
So my question is, for move-only types (or more generally, when we
want to transfer the ownership of the argument), shouldn't we prefer
pass-by-rvalue-reference for better performance?
I shan't talk about performance, because it's such a peculiar aspect and can't be analyzed "in general". We'd need concrete cases. Compiler optimizations also need to be considered, quite heavily in fact, and so does a thorough performance analysis.
std::unique_ptr is a bit different because (i) it can only be moved, due to its owning semantics, and (ii) it is cheap to move.
My view: if you have to provide a "faster" alternative as an API, provide both overloads, just like std::vector::push_back. It could, as you say, bring slight improvements.
Otherwise, even for move-only types, passing by const-reference still works and if you think it wouldn't, go for pass-by-value.
Related
Recently I find myself often in the situation of having a single function that takes some object as a parameter. The function will have to copy that object.
However, the parameter for that function may also quite frequently be a temporary, and thus I also want to provide an overload of that function that takes an rvalue reference instead of a const reference.
Both overloads tend to only differ in that they have different types of references as argument types. Other than that they are functionally equivalent.
For instance consider this toy example:
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
Now I was wondering whether there is a way to avoid this code-duplication by e.g. implementing one function in terms of the other.
For instance I was thinking of implementing the copy-version in terms of the move-one like this:
void foo(const MyObject &obj) {
MyObject copy = obj;
foo(std::move(copy));
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
However this still does not seem ideal since now there is a copy AND a move operation happening when calling the const ref overload instead of a single copy operation that was required before.
Furthermore, if the object does not provide a move-constructor, then this would effectively copy the object twice (afaik) which defeats the whole purpose of providing these overloads in the first place (avoiding copies where possible).
I'm sure one could hack something together using macros and the preprocessor but I would very much like to avoid involving the preprocessor in this (for readability purposes).
Therefore my question reads: Is there a possibility to achieve what I want (effectively only implementing the functionality once and then implement the second overload in terms of the first one)?
If possible I would also like to avoid using templates.
My opinion is that truly understanding how std::move and std::forward work, along with their similarities and differences, is the key to resolving your doubts. I suggest you read my answer to "What's the difference between std::move and std::forward", where I explain the two in detail.
In
void foo(MyObject &&obj) {
globalVec.push_back(obj); // Moves (no, it doesn't!)
}
there's no move. obj is the name of a variable, so the expression obj is an lvalue, and the overload of push_back that gets called is not the one that steals resources from its argument.
You would have to write
void foo(MyObject&& obj) {
globalVec.push_back(std::move(obj)); // Moves
}
if you want to make the move possible, because std::move(obj) says look, I know this obj here is a local variable, but I guarantee you that I don't need it later, so you can treat it as a temporary: steal its guts if you need.
As regards the code duplication you see in
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject&& /*rvalue reference -> std::move it */ obj) {
globalVec.push_back(std::move(obj)); // Moves (corrected)
}
what allows you to avoid it is std::forward, which you would use like this:
template<typename T>
void foo(T&& /* universal/forwarding reference -> std::forward it */ obj) {
globalVec.push_back(std::forward<T>(obj)); // moves conditionally
}
As regards the error messages of templates, be aware that there are ways to make things easier. For instance, you could use static_asserts at the beginning of the function to enforce that T is a specific type. That would certainly make the errors more understandable. For instance:
#include <type_traits>
#include <vector>
std::vector<int> globalVec{1,2,3};
template<typename T>
void foo(T&& obj) {
static_assert(std::is_same_v<int, std::decay_t<T>>,
"\n\n*****\nNot an int, aaarg\n*****\n\n");
globalVec.push_back(std::forward<T>(obj));
}
int main() {
int x = 0; // initialized, so that pushing it back does not read an indeterminate value
foo(x);
foo(3);
foo('c'); // errors at compile time with nice message
}
Then there's SFINAE, which is harder and I guess beyond the scope of this question and answer.
My suggestion
Don't be scared of templates and SFINAE! They do pay off :)
There's a beautiful library that leverages template metaprogramming and SFINAE heavily and successfully, but this is really off-topic :D
A simple solution is:
void foo(MyObject obj) {
globalVec.push_back(std::move(obj));
}
If the caller passes an lvalue, there is a copy (into the parameter) and a move (into the vector). If the caller passes an rvalue, there are two moves (one into the parameter and another into the vector). This can be slightly less optimal than the two overloads because of the extra move (slightly compensated by the lack of indirection), but in cases where moves are cheap it is often a decent compromise.
Another solution for templates is std::forward explored in depth in Enlico's answer.
If you cannot have a template and the potential cost of a move is too expensive, then you just have to be satisfied with some extra boilerplate of having two overloads.
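The trade-offs between the three approaches can be checked with a small counting sketch. Counted, Counts, sink, measure and the by_* functions are hypothetical names made up for this illustration; sink stands in for globalVec, and its capacity is reserved up front so that reallocation does not add moves of its own.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that counts its copy and move constructions.
struct Counted {
    static int& copies() { static int n = 0; return n; }
    static int& moves()  { static int n = 0; return n; }
    Counted() = default;
    Counted(const Counted&) { ++copies(); }
    Counted(Counted&&) noexcept { ++moves(); }
    Counted& operator=(const Counted&) = default;
    Counted& operator=(Counted&&) = default;
};

struct Counts { int copies; int moves; };

std::vector<Counted> sink;  // stands in for globalVec

// Two overloads: one copy for lvalues, one move for rvalues.
void by_overload(const Counted& c) { sink.push_back(c); }
void by_overload(Counted&& c)      { sink.push_back(std::move(c)); }

// Single by-value function: always one extra move.
void by_value(Counted c) { sink.push_back(std::move(c)); }

// Single forwarding template: same counts as the overload pair.
template <class T>
void by_forward(T&& c) { sink.push_back(std::forward<T>(c)); }

// Runs call() and reports how many copies and moves it caused.
template <class F>
Counts measure(F call) {
    sink.clear();
    sink.reserve(4);  // avoid reallocation during the measurement
    Counted::copies() = 0;
    Counted::moves() = 0;
    call();
    return Counts{Counted::copies(), Counted::moves()};
}
```

For example, with a local Counted x, measure([&] { by_value(x); }) should report one copy and one move, while measure([&] { by_overload(x); }) should report a single copy and no move.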
According to cppreference, std::copyable is defined as follows:
template <class T>
concept copyable =
std::copy_constructible<T> &&
std::movable<T> && // <-- !!
std::assignable_from<T&, T&> &&
std::assignable_from<T&, const T&> &&
std::assignable_from<T&, const T>;
I'm wondering why a copyable object should be also movable. Just think about a global variable that is accessed by several functions. While it makes sense to copy that variable (for example to save its state before calling another function) it makes no sense, and actually would be very bad, to move it since other functions might not know that that variable is currently in an unspecified state. So why exactly does std::copyable subsume std::movable ?
This comes from two facts. Firstly, even if you don't define a move constructor and a move assignment operator, you can still construct or assign an object from an rvalue as long as you define the copy operations. Just take a look at the example:
#include <utility>
struct foo {
foo() = default;
foo(const foo&) = default;
foo& operator=(const foo&) = default;
};
int main()
{
foo f;
foo b = std::move(f);
}
Secondly (and maybe more importantly), a copyable type can always be (or, according to the standard, now must be) movable in some way. If an object is copyable, then the worst-case scenario for a move is simply copying its internal data.
Note that in the example above, since I declared the copy constructor, the compiler did NOT generate a default move constructor.
While it makes sense to copy that variable (for example to save its state before calling another function) it makes no sense, and actually would be very bad, to move it since other functions might not know that that variable is currently in an unspecified state.
There's a strong, unstated presumption here of what moving actually means that is probably the source of confusion. Consider the type:
class Person {
std::string name;
public:
Person(std::string);
Person(Person const& rhs) : name(rhs.name) { }
Person& operator=(Person const& rhs) { name = rhs.name; return *this; }
};
What does moving a Person do? Well, an rvalue of type Person can bind to Person const&... and that'd be the only candidate... so moving would invoke the copy constructor. Moving does a copy! This isn't a rare occurrence either - moving doesn't have to be destructive or more efficient than copying, it just can be.
Broadly speaking, there are four sane categories of types:
1. Types for which move and copy do the same thing (e.g. int)
2. Types for which move can be an optimization of copy that consumes resources (e.g. string or vector<int>)
3. Types which can be moved but not copied (e.g. unique_ptr<int>)
4. Types which can be neither moved nor copied (e.g. mutex)
There are a lot of types that fall into group 1 there. And the kind of variable mentioned in OP should also fall into group 1.
Notably missing from this list is a type that is copyable but not movable, since that makes very little sense from an operational stand-point. If you can copy the type, and you don't want destructive behavior on moving, just make moving also copy the type.
As such, you can view these groups as a kind of hierarchy. (3) expands on (4), and (1) and (2) expand on (3) - you can't really differentiate (1) from (2) syntactically. Hence, copyable subsumes movable.
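A small sketch of the point above, using C++11 type traits rather than the C++20 concepts so that it compiles with older standards too. CopyOnly and MoveDeleted are hypothetical types introduced for this illustration only.

```cpp
#include <type_traits>
#include <utility>

// Declares only copy operations; "moving" it falls back to copying.
struct CopyOnly {
    int copies_made = 0;
    CopyOnly() = default;
    CopyOnly(const CopyOnly& other) : copies_made(other.copies_made + 1) {}
    CopyOnly& operator=(const CopyOnly&) = default;
};

// The odd case missing from the list: copyable, but moving is
// explicitly forbidden.
struct MoveDeleted {
    MoveDeleted() = default;
    MoveDeleted(const MoveDeleted&) = default;
    MoveDeleted(MoveDeleted&&) = delete;
};

static_assert(std::is_move_constructible<CopyOnly>::value,
              "an rvalue binds to const CopyOnly&, so move construction works");
static_assert(std::is_copy_constructible<MoveDeleted>::value,
              "copying is still allowed");
static_assert(!std::is_move_constructible<MoveDeleted>::value,
              "the deleted move constructor wins overload resolution");

// Returns how many copies were made when "moving" a CopyOnly.
int copies_after_move() {
    CopyOnly a;
    CopyOnly b = std::move(a);  // selects the copy constructor
    return b.copies_made;
}
```

Here copies_after_move() returns 1: the std::move only produces an rvalue, and the copy constructor is the one that runs. A type like MoveDeleted has to go out of its way to be copyable but not movable, which is the combination the hierarchy leaves out.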
I often see the following idiom in production code: A value argument (like a shared pointer) is handed into a constructor and shall be copied once. To ensure this, the argument is wrapped into a std::move application. Hating boilerplate code and formal noise I wondered if this is actually necessary. If I remove the application, at least gcc 7 outputs some different assembly.
#include <memory>
class A
{
std::shared_ptr<int> p;
public:
A(std::shared_ptr<int> p) : p(std::move(p)) {} //here replace with p(p)
int get() { return *p; }
};
int f()
{
auto p = std::make_shared<int>(42);
A a(p);
return a.get();
}
Compiler Explorer shows you the difference. While I am not certain which approach is the most efficient here, I wonder whether there is an optimization that allows the compiler to treat p as an rvalue at that particular location. It certainly is a named entity, but that entity is "dead" after that location anyway.
Is it valid to treat a "dead" variable as a rvalue reference? If not, why?
In the body of the constructor, there are two p objects: the ctor argument and this->p. Without the std::move, the member is a copy of the argument, which means ownership is shared between the two pointers. Sharing ownership requires updating the reference count in a thread-safe way, which is expensive.
But optimizing this out is quite hard. A compiler can't generally deduce that ownership is redundant. By writing std::move yourself, you make it unambiguously clear that the ctor argument p does not need to retain ownership.
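The difference can be observed through use_count(). The following is a minimal single-threaded sketch; CopyingA, MovingA and owners_seen are hypothetical names for this illustration, and the constructor parameter is called arg to avoid the shadowing in the question's code.

```cpp
#include <memory>
#include <utility>

// Variant that copies its by-value parameter into the member.
struct CopyingA {
    std::shared_ptr<int> p;
    long owners_during_ctor;  // use_count() while the parameter is still alive
    explicit CopyingA(std::shared_ptr<int> arg)
        : p(arg), owners_during_ctor(p.use_count()) {}
};

// Variant that moves its by-value parameter into the member.
struct MovingA {
    std::shared_ptr<int> p;
    long owners_during_ctor;
    explicit MovingA(std::shared_ptr<int> arg)
        : p(std::move(arg)), owners_during_ctor(p.use_count()) {}
};

// Builds a Holder from an lvalue shared_ptr and reports how many
// owners existed during construction.
template <class Holder>
long owners_seen() {
    auto sp = std::make_shared<int>(42);
    Holder h(sp);  // lvalue argument: the parameter is a copy of sp
    return h.owners_during_ctor;
}
```

With the copy there are three owners during construction (the caller's pointer, the parameter, and the member), so the count is incremented once more and decremented again when the parameter is destroyed. With the move there are only two, and the parameter is already empty when it is destroyed.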
I've been reading about rvalue references and std::move, and I have a performance question. For the following code:
template<typename T>
void DoSomething()
{
T x = foo();
bar(std::move(x));
}
Would it be better, from a performance perspective, to rewrite this using an rvalue reference, like so?
template<typename T>
void DoSomething()
{
T&& x = foo();
bar(x);
}
If I understand rvalue references correctly, this will simply act as if bar(foo()) had been called, as pointed out by commenters. But the intermediate value may be needed; is it useful to do this?
The best way to use rvalue references / move semantics is, for the most part, not to try too hard. Return function locals by value. And construct them by value from such functions:
X foo();
template<typename T>
void DoSomething()
{
T x = foo();
// ...
}
If you are the author of class X, then you probably want to ensure that X has efficient move construction and move assignment. If the copy constructor and copy assignment of X are already efficient, there is literally nothing more to do:
class X
{
int data_;
public:
// copy members just copy data_
};
If you find yourself allocating memory in X's copy constructor and/or copy assignment, then you should consider writing a move constructor and move assignment operator. Do your best to make them noexcept.
Although there are always exceptions to the rule, in general:
1. Don't return by reference, either lvalue or rvalue reference, unless you want the client to have direct access to a non-function-local variable.
2. Don't catch a return by reference, lvalue or rvalue. Yes, there are cases where it might save you something, without getting you into trouble. Don't even think about doing it as a rule. Only do something like this if your performance testing is highly motivating you to do so.
3. All of the things you (hopefully) learned not to do with lvalue references are still just as dangerous to do with rvalue references (such as returning a dangling reference from a function).
4. First program for correctness. This includes writing extensive and easy-to-run unit tests covering common and corner cases of your API. Then start optimizing. When you start out trying to write extremely optimized code (e.g. catching a return by reference) before you have a correct and well-tested solution, one inevitably writes code that is so brittle that the first bug fix that comes along just breaks the code further, but often in very subtle ways. This is a lesson that even experienced programmers (including myself) have to learn over and over.
5. In spite of what I say in (4), even on the first write, keep an eye on O(N) performance. I.e. if your first try is so grossly slow due to basic overall design deficiencies or poor algorithms that it can't perform its basic function in a reasonable time, then you need more than new code, you need a new design. In this bullet I am not talking about things like whether or not you've caught a return with an rvalue reference. I'm talking about whether or not you've created an O(N^2) algorithm, when O(N log N), or O(N) could've done the job. This is all too easy to do when the job that needs to get done is non-trivial, or when the code is so complicated that you can't tell what is going on.
Here std::move does not avoid constructing the local object x (though elision may help with that); it only lets bar move from x instead of copying it.
template<typename T>
void DoSomething()
{
T x = foo();
bar(std::move(x));
}
Here the rvalue reference is not helping, as the variable x is a named object, so the expression x is an lvalue (and is no longer treated as an rvalue).
template<typename T>
void DoSomething()
{
T&& x = foo();
bar(x);
}
Best to use:
template<typename T>
void DoSomething()
{
bar(foo());
}
But if you must use it locally:
template<typename T>
void DoSomething()
{
T&& x = foo();
// Do stuff with x
bar(std::move(x));
}
Though I am not 100% sure about the above and would love some feedback.
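As partial feedback on the variants above, here is a counting sketch. It assumes that bar takes its parameter by value and that foo returns by value; Probe, Counts and the helper names are hypothetical and introduced only for this illustration. The counts also assume the returned temporary is elided, which is guaranteed since C++17 and common before that.

```cpp
#include <utility>

// Hypothetical type that counts its copy and move constructions.
struct Probe {
    static int& copies() { static int n = 0; return n; }
    static int& moves()  { static int n = 0; return n; }
    Probe() = default;
    Probe(const Probe&) { ++copies(); }
    Probe(Probe&&) noexcept { ++moves(); }
};

struct Counts { int copies; int moves; };

Probe foo() { return Probe{}; }  // assumed: returns by value
void bar(Probe) {}               // assumed: takes by value

void reset_counts() { Probe::copies() = 0; Probe::moves() = 0; }
Counts snapshot() { return Counts{Probe::copies(), Probe::moves()}; }

// T x = foo(); bar(std::move(x));
Counts value_then_move() {
    reset_counts();
    Probe x = foo();
    bar(std::move(x));
    return snapshot();
}

// T&& x = foo(); bar(x);   (x is an lvalue here, so bar copies)
Counts ref_without_move() {
    reset_counts();
    Probe&& x = foo();
    bar(x);
    return snapshot();
}

// bar(foo());
Counts direct_call() {
    reset_counts();
    bar(foo());
    return snapshot();
}

// T&& x = foo(); bar(std::move(x));
Counts ref_then_move() {
    reset_counts();
    Probe&& x = foo();
    bar(std::move(x));
    return snapshot();
}
```

Under these assumptions, the direct call needs neither a copy nor a move, both std::move variants cost exactly one move, and the reference without std::move costs one copy. If bar instead took a reference parameter, none of the variants would copy or move at the call itself.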
In traditional C++, passing by value into functions and methods is slow for large objects, and is generally frowned upon. Instead, C++ programmers tend to pass references around, which is faster, but which introduces all sorts of complicated questions around ownership and especially around memory management (in the event that the object is heap-allocated).
Now, in C++11, we have rvalue references and move constructors, which means that it's possible to implement a large object (like a std::vector) that's cheap to pass by value into and out of a function.
So, does this mean that the default should be to pass by value for instances of types such as std::vector and std::string? What about for custom objects? What's the new best practice?
It's a reasonable default if you need to make a copy inside the body. This is what Dave Abrahams is advocating:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
In code this means don't do this:
void foo(T const& t)
{
auto copy = t;
// ...
}
but do this:
void foo(T t)
{
// ...
}
which has the advantage that the caller can use foo like so:
T lval;
foo(lval); // copy from lvalue
foo(T {}); // (potential) move from prvalue
foo(std::move(lval)); // (potential) move from xvalue
and only minimal work is done. You'd need two overloads to do the same with references, void foo(T const&); and void foo(T&&);.
With that in mind, I now write my value-taking constructors like this:
class T {
U u;
V v;
public:
T(U u, V v)
: u(std::move(u))
, v(std::move(v))
{}
};
Otherwise, passing by reference to const is still reasonable.
In almost all cases, your semantics should be either:
void bar(foo f); // want to obtain a copy of f
void bar(const foo& f); // want to read f
void bar(foo& f); // want to modify f
All other signatures should be used only sparingly, and with good justification. The compiler will now pretty much always work these out in the most efficient way. You can just get on with writing your code!
Pass parameters by value if inside the function body you need a copy of the object or only need to move the object. Pass by const& if you only need non-mutating access to the object.
Object copy example:
void copy_antipattern(T const& t) { // (Don't do this.)
auto copy = t;
copy.some_mutating_function(); // t itself is const and cannot be mutated
}
void copy_pattern(T t) { // (Do this instead.)
t.some_mutating_function();
}
Object move example:
std::vector<T> v;
void move_antipattern(T const& t) {
v.push_back(t);
}
void move_pattern(T t) {
v.push_back(std::move(t));
}
Non-mutating access example:
void read_pattern(T const& t) {
t.some_const_function();
}
For rationale, see these blog posts by Dave Abrahams and Xiang Fan.
The signature of a function should reflect its intended use. Readability is important, for the optimizer as well.
This is the best precondition for an optimizer to create the fastest code: in theory at least, and if not yet in reality, then in a few years' time.
Performance considerations are very often overrated in the context of parameter passing. Perfect forwarding is an example. Functions like emplace_back are mostly very short and inlined anyway.