Say I have this constructor in C++:
A::A( std::string const& name,
std::string const& type,
std::vector<B> const& b_vec,
bool unique )
: _name(name), _type(type), _b_vec(b_vec), _unique(unique)
{ };
I would like to overload this constructor for the case where the arguments are rvalues (I want to use move semantics there).
A::A( std::string && name,
std::string && type,
std::vector<B> && b_vec,
bool unique )
: _name(name), _type(type), _b_vec(b_vec), _unique(unique)
{ };
The above works fine when all of the arguments are rvalues, but suppose only some of them are, as in the next example:
// create some lvalues somehow
std::string name = "stack overflow";
std::vector<B> vec = { ... }; // implementation of B's constructor is not important
// call a mixed constructor
A new_A_instance(name, "cool-website", vec, true);
It is my understanding that since an lvalue cannot bind to a '&&' parameter but an rvalue can bind to a 'const&' parameter, the first (non-move) constructor would be used.
This seems sub-optimal, since two of the four arguments could be moved (because they are rvalue) instead of being copied (as is the case in the first constructor).
So I could overload the constructor for this specific case, but one could easily imagine a case where some arguments are rvalues and others are again lvalues. Should I overload the constructor for each of these cases? The number of overloads would grow combinatorially as the number of arguments increases...
I kind-of feel there is a better solution (perhaps using templates, but my template knowledge is shamefully low).
Note: this problem isn't tied to overloading pass-by-ref functions to move functions per se, but I found this a good example (especially since the overloads don't feel very different). Also note that I just used constructors as an example, but the overloaded function can be anything.
Pass by value, this is what move semantics are for:
A::A(std::string name, std::string type, std::vector<B> b_vec, bool unique )
: _name(std::move(name)), _type(std::move(type)), _b_vec(std::move(b_vec)),
_unique(unique)
{ };
This has the expected behaviour in every case. Passing a temporary by value allows the compiler to perform copy elision, which it pretty much always does.
Note that in your second code, copies are made, since you don't use std::move. Please realize that when you write
void foo(bar&& x)
{
...
}
then in the body of foo, x is an lvalue. Objects with names are always lvalues. Inside this body, you must use std::move(x) if you intend to pass x as an rvalue.
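As a rough illustration of the mixed lvalue/rvalue call from the question, here is a minimal sketch that counts copies and moves for a by-value sink constructor. Tracked, Holder, count_mixed_call and the two counters are hypothetical names introduced only for this example; the expected counts assume the temporary is constructed directly in the parameter (guaranteed since C++17, and done by mainstream compilers before that).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical counters, reset by the demo function below.
int g_copies = 0;
int g_moves = 0;

// Hypothetical helper type that records how it was constructed.
struct Tracked {
    std::string value;
    Tracked(const char* s) : value(s) {}
    Tracked(const Tracked& other) : value(other.value) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : value(std::move(other.value)) { ++g_moves; }
};

// Sink constructor in the style of the answer: take by value, then move.
struct Holder {
    Tracked first;
    Tracked second;
    Holder(Tracked a, Tracked b) : first(std::move(a)), second(std::move(b)) {}
};

// Returns {copies, moves} for one lvalue argument and one temporary.
std::pair<int, int> count_mixed_call() {
    g_copies = 0;
    g_moves = 0;
    Tracked name("stack overflow");           // lvalue
    Holder h(name, Tracked("cool-website"));  // lvalue + temporary
    (void)h;
    return {g_copies, g_moves};
}
```

Only the lvalue argument is copied (into the parameter); both parameters are then moved into the members, so a single constructor covers every lvalue/rvalue combination.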
Related
I would like to do something like the following:
class Foo
{
Foo(int &&a, int b, std::string s="");
// does not compile because a is not an rvalue:
// Foo(int &&a, std::string s) : Foo(a, 0, s) {}
Foo(int &&a, std::string s) : Foo(std::move(a), 0, s) {} // move a
};
Is this a valid way to overload a constructor in general?
And specifically one that takes an rvalue reference as a parameter?
Edited based on comments
To clarify, I'm new to move semantics (and an amateur programmer) and I'm simply not sure if this is a good way to handle this situation.
I added the first question based on a comment (now deleted) that suggested this is not the correct way to overload constructors.
Is this a valid way to overload a constructor in general?
And specifically one that takes an rvalue reference as a parameter?
It is valid to overload constructors in general, even ones that have rvalue reference parameters.
The commented-out line in your example, however, is ill-formed: as you point out in the comment, it fails to compile. To fix it, you must pass an rvalue to the constructor that you delegate to. The correct way to convert an rvalue reference variable into an rvalue is to use std::move. So, what you must do is what you already know:
The only option I can see is to std::move(a).
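To see why the std::move is needed, the following sketch uses overload resolution to report the value category of a named rvalue reference. The functions which, pass_by_name and pass_with_move are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two hypothetical overloads that report which one was selected.
inline const char* which(std::string&)  { return "lvalue"; }
inline const char* which(std::string&&) { return "rvalue"; }

// 'x' is declared as an rvalue reference, but the expression 'x' is an lvalue.
inline std::string pass_by_name(std::string&& x) { return which(x); }

// std::move casts the named reference back to an rvalue.
inline std::string pass_with_move(std::string&& x) { return which(std::move(x)); }
```

Calling pass_by_name(std::string("a")) selects the lvalue overload, while pass_with_move(std::string("a")) selects the rvalue one.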
I've already asked on code review and software engineering but the topic didn't fit the site, so I'm asking here hoping this is not opinion-based. I am an "old school" C++ developer (I've stopped at C++ 2003) but now I've read a few books on modern C++ 11/17 and I'm rewriting some libraries of mine.
The first thing I've made is adding move constructor/assignment operator where needed ( = classes that already had destructor + copy constructor and copy assignment). Basically I'm using the rule of five.
Most of my functions are declared like
func(const std::string& s);
Which is the common way to pass a reference avoiding a copy. However, there are also the new move semantics, and there's something that I wasn't able to find in my books/online. This code:
void fun(std::string& x) {
x.append(" world");
std::cout << x;
}
int main()
{
std::string s{"Hello "};
fun(s);
}
Can also be written as:
void fun(std::string&& x) {
x.append(" world");
std::cout << x;
}
int main()
{
std::string s{"Hello "};
fun(std::move(s));
//or fun("Hello ");
// or fun(std::string {"Hello" });
}
My question is: when should I declare functions that accept a parameter that is an rvalue reference?
I understand the usage of && semantics on constructors and assignment operators but not really on functions. In the example above (first function) I have a std::string& x, so the function cannot be called as fun("Hello "); for that I would have to declare the type as const std::string& x. But then the const doesn't allow me to change the string!
Yes, I could use a const cast but I rarely do casts (and if it's the case, they're dynamic casts). The power of the && is that I avoid copies, I don't have to do something like
std::string x = "...";
fun(x); //void fun(std::string& x) {}
and I can pass temporary values that will be moved. Should I declare functions with rvalue references when possible?
I have a library that I'm rewriting with modern C++ 17 and I have functions like:
//only const-ref
Type1 func(const type2& x);
Type3 function(const type4& x);
I am asking if it's worth rewriting all of them as
//const-ref AND rvalue reference
Type1 func(const type2& x);
Type3 function(const type4& x);
Type1 func(type2&& x);
Type3 function(type4&& x);
I don't want to create too many overloads that may be useless but if a user of my library wanted to use the move operation I should create the && param types. Of course I am not doing this for primitive types (int, double, char...) but for containers or classes. What do you suggest?
I am not sure if the latter scenario (with both versions) would be useful or not.
Let me comment on four scenarios in your question and examples.
std::string_view with pass-by-value is supposed to replace const std::string& parameters and whenever you can guarantee the necessary preconditions for a safe usage of std::string_view (lifetime, pointee doesn't change), it's a good candidate to start modernizing your function signatures.
const T& vs. T&& (where T is not subject to template type deduction) with known usage scenarios. The void fun function that appends to a given, modifiable string only makes sense as void fun(std::string&&) if calling code doesn't need the result after the call. In this case, the rvalue-reference signature documents this expectation nicely and is the way to go. But these cases are rather rare in my experience.
const T& vs. T&& (again, no type deduction) with unknown usage scenarios. A good reference here is std::vector::push_back, which is overloaded for both rvalue and lvalue references. The push_back operation is assumed to be cheap compared to move-constructing a T, that's why the overload makes sense. When a function is assumed to be more expensive than such a move-construction, passing the argument by value is a simplification that can make sense (see also Item 41 in EMC++).
const T& vs. T&& when type deduction takes place. Here, use universal references together with std::forward whenever possible and the parameters can't be const qualified. If they aren't modified in the function body, go with const T&.
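The last scenario could look roughly like the sketch below, which forwards a deduced argument into a container. Names, add and forwarding_demo are hypothetical names chosen for this illustration, not part of any library.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper; 'add' accepts lvalues and rvalues alike.
class Names {
public:
    template <typename S>
    void add(S&& s) {  // S is deduced, so S&& is a forwarding reference
        items_.emplace_back(std::forward<S>(s));
    }
    const std::vector<std::string>& items() const { return items_; }
private:
    std::vector<std::string> items_;
};

// Returns true if the lvalue argument is still intact and both items arrived.
inline bool forwarding_demo() {
    Names names;
    std::string kept = "kept";
    std::string given = "given away";
    names.add(kept);              // forwarded as an lvalue: copied
    names.add(std::move(given));  // forwarded as an rvalue: moved
    return kept == "kept"
        && names.items().size() == 2
        && names.items()[0] == "kept"
        && names.items()[1] == "given away";
}
```

After the second call, given is left in a valid but unspecified state, which is why the check only inspects kept and the stored items.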
You want to use rvalue references only if:
You might retain a copy and you need the extra performance (measure!)
Example for this would be writing a library type (e.g. std::vector) where performance matters to its users.
You want only temporaries to be passed to your function
Example for this is the move assignment operator: After the assignment, the original object's state will not exist anymore.
Forwarding references (T&& with T deduced) fall under the first option.
Rvalue reference (not to be confused with a forwarding reference!) in function arguments is used when there is a need to move ownership from one object to another.
It is true that it is often done in the context of move constructors/assignment operators, but this is not the only case. For example, a function accepting ownership of a std::unique_ptr could accept its argument by an rvalue reference.
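A minimal sketch of such an ownership-taking function, assuming C++14 for std::make_unique. Registry, adopt and ownership_demo are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical owner type; 'adopt' takes over ownership of the pointee.
class Registry {
public:
    void adopt(std::unique_ptr<int>&& p) { owned_.push_back(std::move(p)); }
    std::size_t size() const { return owned_.size(); }
private:
    std::vector<std::unique_ptr<int>> owned_;
};

inline bool ownership_demo() {
    Registry r;
    auto p = std::make_unique<int>(42);
    // r.adopt(p);                      // would not compile: 'p' is an lvalue
    r.adopt(std::move(p));              // ownership is handed over explicitly
    r.adopt(std::make_unique<int>(7));  // temporaries bind directly
    return p == nullptr && r.size() == 2;
}
```

The signature makes the transfer visible at the call site: a caller has to write std::move or pass a temporary.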
For class types it is possible to assign to temporary objects which is actually not allowed for built-in types. Further, the assignment operator generated by default even yields an lvalue:
int() = int(); // illegal: "expression is not assignable"
struct B {};
B& b = B() = B(); // compiles OK: yields an lvalue! ... but is wrong! (see below)
For the last statement the result of the assignment operator is actually used to initialize a non-const reference which will become stale immediately after the statement: the reference isn't bound to the temporary object directly (it can't, as temporary objects can only be bound to const or rvalue references) but to the result of the assignment, whose lifetime isn't extended.
Another problem is that the lvalue returned from the assignment operator doesn't look as if it can be moved although it actually refers to a temporary. If anything is using the result of the assignment to get hold of the value it will be copied rather than moved although it would be entirely viable to move. At this point it is worth noting that the problem is described in terms of the assignment operator because this operator is typically available for value types and returns an lvalue reference. The same problem exists for any function returning a reference to the objects, i.e., *this.
A potential fix is to overload the assignment operator (or other functions returning a reference to the object) to consider the kind of object, e.g.:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
The possibility to overload the assignment operators as above came with C++11 and would prevent the subtle object invalidation noted above and simultaneously allow moving the result of an assignment to a temporary. The implementation of these two operators is probably identical. Although the implementation is likely to be rather simple (essentially just a swap() of the two objects) it still means extra work, raising the question:
Should functions returning a reference to the object (e.g., the assignment operator) observe the rvalueness of the object being assigned to?
An alternative (mentioned by Simple in a comment) is to not overload the assignment operator but to qualify it explicitly with a & to restrict its use to lvalues:
class GG {
public:
// other members
GG& operator=(GG) & { /*...*/ return *this; }
};
GG g;
g = GG(); // OK
GG() = GG(); // ERROR
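The effect of the & qualifier can be checked at compile time with std::is_assignable. The sketch below uses two hypothetical types, LvalueOnly (same shape as GG) and Plain (implicitly generated assignment), purely for illustration.

```cpp
#include <type_traits>

// Same shape as GG above: assignment is restricted to lvalue targets.
struct LvalueOnly {
    LvalueOnly& operator=(LvalueOnly) & { return *this; }
};

// A type that relies on the implicitly generated, unqualified operator=.
struct Plain {};

static_assert(std::is_assignable<LvalueOnly&, LvalueOnly>::value,
              "assigning to an lvalue is allowed");
static_assert(!std::is_assignable<LvalueOnly, LvalueOnly>::value,
              "assigning to a temporary is rejected");
static_assert(std::is_assignable<Plain, Plain>::value,
              "the implicit operator= accepts a temporary on the left");
```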
IMHO, the original suggestion by Dietmar Kühl (providing overloads for & and && ref-qualifiers) is superior to Simple's one (providing it only for &).
The original idea is:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
and Simple has suggested to remove the second overload. Both solutions invalidate this line
G& g = G() = G();
(as wanted) but if the second overload is removed, then these lines also fail to compile:
const G& g1 = G() = G();
G&& g2 = G() = G();
and I see no reason why they shouldn't (there's no lifetime issue as explained in Yakk's post).
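A small sketch of this point, assuming a G with a single int member (the member value and the function extended_lifetime_demo are hypothetical additions for the example). With both overloads present, assigning to a temporary yields a new object by value, and binding that result to a const G& or G&& extends its lifetime.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct G {
    int value = 0;  // hypothetical payload, only here to observe the result
    G& operator=(G o) &  { value = o.value; return *this; }
    G  operator=(G o) && { value = o.value; return std::move(*this); }
};

// Assigning to an lvalue yields a reference; assigning to a temporary yields an object.
static_assert(std::is_same<decltype(std::declval<G&>() = std::declval<G>()), G&>::value, "");
static_assert(std::is_same<decltype(std::declval<G>() = std::declval<G>()), G>::value, "");

inline int extended_lifetime_demo() {
    G source;
    source.value = 5;
    const G& g1 = G() = source;  // bound to the returned object; lifetime is extended
    G&& g2 = G() = source;       // same for an rvalue reference
    return g1.value + g2.value;
}
```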
I can see only one situation where Simple's suggestion is preferable: when G doesn't have an accessible copy/move constructor. Since most types for which the copy/move assignment operator is accessible also have an accessible copy/move constructor, this situation is quite rare.
Both overloads take the argument by value and there are good reasons for that if G has an accessible copy/move constructor. Suppose for now that G does not have one. In this case the operators should take the argument by const G&.
Unfortunately the second overload (which, as it is, returns by value) should not return a reference (of any type) to *this because the expression to which *this binds is an rvalue and thus is likely to be a temporary whose lifetime is about to expire. (Recall that forbidding this from happening was one of the OP's motivations.)
In this case, you should remove the second overload (as per Simple's suggestion) otherwise the class doesn't compile (unless the second overload is a template that's never instantiated). Alternatively, we can keep the second overload and define it as deleted. (But why bother since the existence of the overload for & alone is already enough?)
A peripheral point.
What should be the definition of operator = for &&? (We assume again that G has an accessible copy/move constructor.)
As Dietmar Kühl has pointed out and Yakk has explored, the code of both overloads should be very similar and, in this case, it's better to implement the one for && in terms of the one for &. Since the performance of a move is expected to be no worse than a copy (and since RVO doesn't apply when returning *this) we should return std::move(*this). In summary, a possible one-line definition is:
G operator =(G o) && { return std::move(*this = std::move(o)); }
This is good enough if only G can be assigned to another G or if G has (non-explicit) converting constructors. Otherwise, you should instead consider giving G a (template) forwarding copy/move assignment operator taking a universal reference:
template <typename T>
G operator =(T&& o) && { return std::move(*this = std::forward<T>(o)); }
Although this is not a lot of boiler plate code it's still an annoyance if we have to do that for many classes. To decrease the amount of boiler plate code we can define a macro:
#define ASSIGNMENT_FOR_RVALUE(type) \
template <typename T> \
type operator =(T&& b) && { return std::move(*this = std::forward<T>(b)); }
Then inside G's definition one adds ASSIGNMENT_FOR_RVALUE(G).
(Notice that the relevant type appears only as the return type. In C++14 it can be automatically deduced by the compiler and thus, G and type in the last two code snippets can be replaced by auto. It follows that the macro can become an object-like macro instead of a function-like macro.)
Another way of reducing the amount of boiler plate code is defining a CRTP base class that implements operator = for &&:
template <typename Derived>
struct assignment_for_rvalue {
template <typename T>
Derived operator =(T&& o) && {
return std::move(static_cast<Derived&>(*this) = std::forward<T>(o));
}
};
The boiler plate becomes the inheritance and the using declaration as shown below:
class G : public assignment_for_rvalue<G> {
public:
// other members, possibly including assignment operator overloads for `&`
// but taking arguments of different types and/or value category.
G& operator=(G) & { /*...*/ return *this; }
using assignment_for_rvalue::operator =;
};
Recall that, for some types and contrary to using ASSIGNMENT_FOR_RVALUE, inheriting from assignment_for_rvalue might have some unwanted consequences on the class layout.
The first problem is that this is not actually ok in C++03:
B& b = B() = B();
in that b is bound to an expired temporary once the line is finished.
The only "safe" way to use this is in a function call:
void foo(B&);
foo( B()=B() );
or something similar, where the line-lifetime of the temporaries is sufficient for the lifetime of what we bind it to.
We can replace the probably inefficient B()=B() syntax with:
template<typename T>
typename std::decay<T>::type& to_lvalue( T&& t ) { return t; }
and now the call looks clearer:
foo( to_lvalue(B()) );
which does it via pure casting. Lifetime is still not extended (I cannot think of a way to manage that), but we don't construct two objects then pointlessly assign one to the other.
So now we sit down and examine these three options:
G operator=(G o) && { return std::move(o); }
G&& operator=(G o) && { *this = std::move(o); return std::move(*this); }
G operator=(G o) && { *this = std::move(o); return std::move(*this); }
which are, as an aside, complete implementations, assuming G& operator=(G o)& exists and is written properly. (Why duplicate code when you don't need to?)
The first and third allow for lifetime extension of the return value, the second uses the lifetime of *this. The second and third modify *this, while the first one does not.
I would claim that the first one is the right answer. Because *this is bound to an rvalue, the caller has stated that it will not be reused, and its state does not matter: changing it is pointless.
The lifetime of the first and third means that whoever uses it can extend the lifetime of the returned value, and not be tied to whatever *this's lifetime is.
About the only use the B operator=(B)&& has is that it allows you to treat rvalue and lvalue code relatively uniformly. As a downside, it lets you treat it relatively uniformly in situations where the result may be surprising.
std::forward<T>(t) = std::forward<U>(u);
should probably fail to compile instead of doing something surprising like "not modifying t" when T&& is an rvalue reference. And modifying t when T&& is an rvalue reference is equally wrong.
For move-enabled classes, is there a difference between these two?
struct Foo {
typedef std::vector<std::string> Vectype;
Vectype m_vec;
//this or
void bar(Vectype&& vec)
{
m_vec = std::move(vec);
}
//that
void bar(Vectype vec)
{
m_vec = std::move(vec);
}
};
int main()
{
Foo::Vectype myvec{"alpha","beta","gamma"}; // Vectype is a member typedef of Foo
Foo fool;
fool.bar(std::move(myvec));
}
My understanding is that if you use an lvalue myvec you are also required to introduce a const Vectype& version of Foo::bar(), since Vectype&& won't bind. That aside, in the rvalue case, Foo::bar(Vectype) will construct the vector using the move constructor or, better yet, elide the copy altogether seeing vec is an rvalue (would it?). So is there a compelling reason not to prefer the by-value declaration over lvalue and rvalue overloads?
(Consider I need to copy the vector to the member variable in any case.)
The pass-by-value version allows an lvalue argument and makes a copy of it. The rvalue-reference version can't be called with an lvalue argument.
Use const Type& when you don't need to change or copy the argument at all, use pass-by-value when you want a modifiable value but don't care how you get it, and use Type& and Type&& overloads when you want something slightly different to happen depending on the context.
The pass-by-value function is sufficient (and equivalent), as long as the argument type has an efficient move constructor, which is true in this case for std::vector.
Otherwise, using the pass-by-value function may introduce an extra copy-construction compared to using the pass-by-rvalue-ref function.
See the answer https://stackoverflow.com/a/7587151/1190077 to the related question "Do I need to overload methods accepting const lvalue reference for rvalue references explicitly?".
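The cost difference between the two signatures can be made visible with a counting type. This is only a sketch: Payload, ByValue, ByRvalue and move_ctors_for_xvalue are hypothetical names, and the counters exist purely for the demonstration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical payload that counts move constructions and move assignments.
struct Payload {
    std::vector<int> data;
    static int move_ctors;
    static int move_assigns;
    Payload() = default;
    Payload(const Payload&) = default;
    Payload(Payload&& o) noexcept : data(std::move(o.data)) { ++move_ctors; }
    Payload& operator=(const Payload&) = default;
    Payload& operator=(Payload&& o) noexcept {
        data = std::move(o.data);
        ++move_assigns;
        return *this;
    }
};
int Payload::move_ctors = 0;
int Payload::move_assigns = 0;

struct ByValue  { Payload m; void set(Payload p)   { m = std::move(p); } };
struct ByRvalue { Payload m; void set(Payload&& p) { m = std::move(p); } };

// Returns the number of move constructions triggered by set(std::move(x)).
template <typename Holder>
int move_ctors_for_xvalue() {
    Payload::move_ctors = 0;
    Holder h;
    Payload x;
    h.set(std::move(x));
    return Payload::move_ctors;
}
```

The by-value version pays one extra move construction (for the parameter) on top of the move assignment; for a type like std::vector that extra move is cheap, which is the condition stated above.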
Yes, the first one (Vectype&& vec) won't accept a const object or simply lvalue.
If you want to save the object inside like you do, it's best to copy (or move if you pass an rvalue) in the interface and then move, just like you did in your second example.
This is a follow-on question to
C++0x rvalue references and temporaries
In the previous question, I asked how this code should work:
void f(const std::string &); //less efficient
void f(std::string &&); //more efficient
void g(const char * arg)
{
f(arg);
}
It seems that the move overload should probably be called because of the implicit temporary, and this happens in GCC but not MSVC (or the EDG front-end used in MSVC's Intellisense).
What about this code?
void f(std::string &&); //NB: No const string & overload supplied
void g1(const char * arg)
{
f(arg);
}
void g2(const std::string & arg)
{
f(arg);
}
It seems, based on the answers to my previous question, that function g1 is legal (and is accepted by GCC 4.3-4.5, but not by MSVC). However, GCC and MSVC both reject g2 because of clause 13.3.3.1.4/3, which prohibits lvalues from binding to rvalue ref arguments. I understand the rationale behind this - it is explained in N2831 "Fixing a safety problem with rvalue references". I also think that GCC is probably implementing this clause as intended by the authors of that paper, because the original patch to GCC was written by one of the authors (Doug Gregor).
However, I don't think this is quite intuitive. To me, (a) a const string & is conceptually closer to a string && than a const char *, and (b) the compiler could create a temporary string in g2, as if it were written like this:
void g2(const std::string & arg)
{
f(std::string(arg));
}
Indeed, sometimes the copy constructor is considered to be an implicit conversion operator. Syntactically, this is suggested by the form of a copy constructor, and the standard even mentions this specifically in clause 13.3.3.1.2/4, where the copy constructor for derived-base conversions is given a higher conversion rank than other user-defined conversions:
A conversion of an expression of class type to the same class type is given Exact Match rank, and a conversion
of an expression of class type to a base class of that type is given Conversion rank, in spite of the fact that
a copy/move constructor (i.e., a user-defined conversion function) is called for those cases.
(I assume this is used when passing a derived class to a function like void h(Base), which takes a base class by value.)
Motivation
My motivation for asking this is something like the question asked in "How to reduce redundant code when adding new c++0x rvalue reference operator overloads".
If you have a function that accepts a number of potentially-moveable arguments, and would move them if it can (e.g. a factory function/constructor: Object create_object(string, vector<string>, string) or the like), and want to move or copy each argument as appropriate, you quickly start writing a lot of code.
If the argument types are movable, then one could just write one version that accepts the arguments by value, as above. But if the arguments are (legacy) non-movable-but-swappable classes a la C++03, and you can't change them, then writing rvalue reference overloads is more efficient.
So if lvalues did bind to rvalues via an implicit copy, then you could write just one overload like create_object(legacy_string &&, legacy_vector<legacy_string> &&, legacy_string &&) and it would more or less work like providing all the combinations of rvalue/lvalue reference overloads - actual arguments that were lvalues would get copied and then bound to the arguments, actual arguments that were rvalues would get directly bound.
Clarification/edit: I realize this is virtually identical to accepting arguments by value for movable types, like C++0x std::string and std::vector (save for the number of times the move constructor is conceptually invoked). However, it is not identical for copyable, but non-movable types, which includes all C++03 classes with explicitly-defined copy constructors. Consider this example:
class legacy_string { legacy_string(const legacy_string &); }; //defined in a header somewhere; not modifiable.
void f(legacy_string s1, legacy_string s2); //A *new* (C++0x) function that wants to move from its arguments where possible, and avoid copying
void g() //A C++0x function as well
{
legacy_string x(/*initialization*/);
legacy_string y(/*initialization*/);
f(std::move(x), std::move(y));
}
If g calls f, then x and y would be copied - I don't see how the compiler can move them. If f were instead declared as taking legacy_string && arguments, it could avoid those copies where the caller explicitly invoked std::move on the arguments. I don't see how these are equivalent.
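The difference described here can be sketched with a copy-only class. Legacy, by_value, by_rvalue_ref and the two counting functions are hypothetical names invented for this illustration; Legacy stands in for a C++03 class whose user-declared copy constructor suppresses the implicit move constructor.

```cpp
#include <cassert>
#include <utility>

// Hypothetical C++03-style class: copyable, but with no move constructor.
struct Legacy {
    static int copies;
    Legacy() {}
    Legacy(const Legacy&) { ++copies; }
};
int Legacy::copies = 0;

inline void by_value(Legacy) {}
inline void by_rvalue_ref(Legacy&&) {}

inline int copies_by_value() {
    Legacy::copies = 0;
    Legacy x;
    by_value(std::move(x));       // falls back to the copy constructor
    return Legacy::copies;
}

inline int copies_by_rvalue_ref() {
    Legacy::copies = 0;
    Legacy x;
    by_rvalue_ref(std::move(x));  // only binds a reference; nothing is constructed
    return Legacy::copies;
}
```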
Questions
My questions are then:
Is this a valid interpretation of the standard? It seems that it's not the conventional or intended one, at any rate.
Does it make intuitive sense?
Is there a problem with this idea that I'm not seeing? It seems like you could get copies being quietly created when that's not exactly expected, but that's the status quo in places in C++03 anyway. Also, it would make some overloads viable when they're currently not, but I don't see it being a problem in practice.
Is this a significant enough improvement that it would be worth making e.g. an experimental patch for GCC?
What about this code?
void f(std::string &&); //NB: No const string & overload supplied
void g2(const std::string & arg)
{
f(arg);
}
...However, GCC and MSVC both reject g2 because of clause 13.3.3.1.4/3, which prohibits lvalues from binding to rvalue ref arguments. I understand the rationale behind this - it is explained in N2831 "Fixing a safety problem with rvalue references". I also think that GCC is probably implementing this clause as intended by the authors of that paper, because the original patch to GCC was written by one of the authors (Doug Gregor)....
No, that's only half of the reason why both compilers reject your code. The other reason is that you can't initialize a reference to non-const with an expression referring to a const object. So, even before N2831 this didn't work. There is simply no need for a conversion because a string is already a string. It seems you want to use string&& like string. Then, simply write your function f so that it takes a string by value. If you want the compiler to create a temporary copy of a const string lvalue just so you can invoke a function taking a string&&, there wouldn't be a difference between taking the string by value or by rref, would it?
N2831 has little to do with this scenario.
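Which argument types are accepted by a lone f(std::string&&) can be checked at compile time. The detection trait can_call_f below is a hypothetical helper written for this sketch; it only asks whether the call expression is well-formed and never actually calls f.

```cpp
#include <string>
#include <type_traits>
#include <utility>

void f(std::string&&);  // only the rvalue-reference overload, as in the question

// Hypothetical detection trait: true when f(std::declval<Arg>()) is well-formed.
template <typename Arg, typename = void>
struct can_call_f : std::false_type {};

template <typename Arg>
struct can_call_f<Arg, decltype(f(std::declval<Arg>()))> : std::true_type {};

static_assert(can_call_f<const char*>::value, "an implicit temporary string binds");
static_assert(can_call_f<std::string>::value, "an rvalue string binds");
static_assert(!can_call_f<std::string&>::value, "a non-const lvalue is rejected");
static_assert(!can_call_f<const std::string&>::value, "a const lvalue is rejected");
```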
If you have a function that accepts a number of potentially-moveable arguments, and would move them if it can (e.g. a factory function/constructor: Object create_object(string, vector, string) or the like), and want to move or copy each argument as appropriate, you quickly start writing a lot of code.
Not really. Why would you want to write a lot of code? There is little reason to clutter all your code with const&/&& overloads. You can still use a single function with a mix of pass-by-value and pass-by-ref-to-const -- depending on what you want to do with the parameters. As for factories, the idea is to use perfect forwarding:
template<class T, class... Args>
unique_ptr<T> make_unique(Args&&... args)
{
T* ptr = new T(std::forward<Args>(args)...);
return unique_ptr<T>(ptr);
}
...and all is well. A special template argument deduction rule helps differentiate between lvalue and rvalue arguments and std::forward allows you to create expressions with the same "value-ness" as the actual arguments had. So, if you write something like this:
string foo();
int main() {
auto ups = make_unique<string>(foo());
}
the string that foo returned is automatically moved to the heap.
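To check that forwarding preserves the value category, here is a sketch with a counting type. Probe, make_boxed and factory_demo are hypothetical names; make_boxed simply mirrors the make_unique sketch above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct Probe {
    static int copies;
    static int moves;
    std::string text;
    explicit Probe(std::string t) : text(std::move(t)) {}
    Probe(const Probe& o) : text(o.text) { ++copies; }
    Probe(Probe&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// Hypothetical factory with perfect forwarding.
template <class T, class... Args>
std::unique_ptr<T> make_boxed(Args&&... args) {
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

inline bool factory_demo() {
    Probe::copies = 0;
    Probe::moves = 0;
    Probe original("lvalue");
    auto a = make_boxed<Probe>(original);            // Args = Probe&: copy constructor
    auto b = make_boxed<Probe>(Probe("temporary"));  // Args = Probe: move constructor
    return Probe::copies == 1 && Probe::moves == 1
        && a->text == "lvalue" && b->text == "temporary";
}
```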
So if lvalues did bind to rvalues via an implicit copy, then you could write just one overload like create_object(legacy_string &&, legacy_vector &&, legacy_string &&) and it would more or less work like providing all the combinations of rvalue/lvalue reference overloads...
Well, and it would be pretty much equivalent to a function taking the parameters by value. No kidding.
Is this a significant enough improvement that it would be worth making e.g. an experimental patch for GCC?
There's no improvement.
I don't quite see your point in this question. If you have a class that is movable, then you just need a T version:
struct A {
T t;
A(T t):t(move(t)) { }
};
And if the class is traditional but has an efficient swap you can write the swap version or you can fall back to the const T& way
struct A {
T t;
A(T t) { swap(this->t, t); }
};
Regarding the swap version, I would rather go with the const T& way instead of that swap. The main advantage of the swap technique is exception safety and is to move the copy closer to the caller so that it can optimize away copies of temporaries. But what do you have to save if you are just constructing the object anyway? And if the constructor is small, the compiler can look into it and can optimize away copies too.
struct A {
T t;
A(T const& t):t(t) { }
};
To me, it doesn't seem right to automatically convert a string lvalue to an rvalue copy of itself just to bind to an rvalue reference. An rvalue reference says it binds to rvalues. If you try binding it to an lvalue of the same type, it had better fail. Introducing hidden copies to allow that doesn't sound right to me, because when people see an X&& and you pass an X lvalue, I bet most will expect that there is no copy and that the binding is direct, if it works at all. Better to fail straight away so the user can fix his/her code.