For class types it is possible to assign to temporary objects, which is not allowed for built-in types. Further, the assignment operator generated by default even yields an lvalue:
int() = int(); // illegal: "expression is not assignable"
struct B {};
B& b = B() = B(); // compiles OK: yields an lvalue! ... but is wrong! (see below)
In the last statement the result of the assignment operator is used to initialize a non-const reference, which becomes stale immediately after the statement: the reference isn't bound to the temporary object directly (it can't be, as temporary objects can only be bound to a const lvalue reference or an rvalue reference) but to the result of the assignment, whose lifetime isn't extended.
Another problem is that the lvalue returned from the assignment operator doesn't look as if it can be moved although it actually refers to a temporary. If anything is using the result of the assignment to get hold of the value it will be copied rather than moved although it would be entirely viable to move. At this point it is worth noting that the problem is described in terms of the assignment operator because this operator is typically available for value types and returns an lvalue reference. The same problem exists for any function returning a reference to the objects, i.e., *this.
A potential fix is to overload the assignment operator (or other functions returning a reference to the object) to consider the kind of object, e.g.:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
The ability to overload the assignment operator as above arrived with C++11; it prevents the subtle object invalidation noted above and simultaneously allows moving the result of an assignment to a temporary. The implementation of these two operators is probably identical. Although the implementation is likely to be rather simple (essentially just a swap() of the two objects), it still means extra work, raising the question:
Should functions returning a reference to the object (e.g., the assignment operator) observe the rvalueness of the object being assigned to?
An alternative (mentioned by Simple in a comment) is to not overload the assignment operator but to qualify it explicitly with a & to restrict its use to lvalues:
class GG {
public:
// other members
GG& operator=(GG) & { /*...*/ return *this; }
};
GG g;
g = GG(); // OK
GG() = GG(); // ERROR
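The effect of the lone & qualifier can also be checked at compile time with std::is_assignable. The following is only a sketch; `LvalueOnly` and `Unrestricted` are hypothetical stand-ins for GG with and without the qualifier.

```cpp
#include <type_traits>

// Hypothetical stand-in for GG: assignment is restricted to lvalue targets.
struct LvalueOnly {
    LvalueOnly() = default;
    LvalueOnly(const LvalueOnly&) = default;
    LvalueOnly& operator=(LvalueOnly) & { return *this; }
};

// The same type without the ref-qualifier, for comparison.
struct Unrestricted {
    Unrestricted() = default;
    Unrestricted(const Unrestricted&) = default;
    Unrestricted& operator=(Unrestricted) { return *this; }
};

// Lvalue target: accepted.
static_assert(std::is_assignable<LvalueOnly&, LvalueOnly>::value, "");
// Rvalue target: rejected, just like GG() = GG().
static_assert(!std::is_assignable<LvalueOnly, LvalueOnly>::value, "");
// Without the qualifier the rvalue target is accepted.
static_assert(std::is_assignable<Unrestricted, Unrestricted>::value, "");
```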
IMHO, the original suggestion by Dietmar Kühl (providing overloads for the & and && ref-qualifiers) is superior to Simple's (providing one only for &).
The original idea is:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
and Simple has suggested to remove the second overload. Both solutions invalidate this line
G& g = G() = G();
(as wanted) but if the second overload is removed, then these lines also fail to compile:
const G& g1 = G() = G();
G&& g2 = G() = G();
and I see no reason why they shouldn't (there's no lifetime issue as explained in Yakk's post).
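A small sketch of why these two lines are harmless with both overloads in place: the && overload returns a prvalue, so the reference binds to a fresh temporary whose lifetime is extended. `G2`, its `value` member and `demo_lifetime_extension` are assumptions added for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical variant of G with a payload so the effect is observable.
struct G2 {
    int value;
    explicit G2(int v = 0) : value(v) {}
    G2(const G2&) = default;
    G2(G2&&) = default;

    G2& operator=(G2 o) &  { value = o.value; return *this; }
    G2  operator=(G2 o) && { return std::move(*this = std::move(o)); }
};

// Lvalue target yields a reference, rvalue target yields a new object.
static_assert(std::is_same<decltype(std::declval<G2&>() = std::declval<G2>()), G2&>::value, "");
static_assert(std::is_same<decltype(std::declval<G2>() = std::declval<G2>()), G2>::value, "");

void demo_lifetime_extension() {
    // Both references bind to the object returned by value, so they stay
    // valid until the end of this scope.
    const G2& g1 = G2(1) = G2(2);
    G2&& g2 = G2(3) = G2(4);
    assert(g1.value == 2);
    assert(g2.value == 4);
}
```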
I can see only one situation where Simple's suggestion is preferable: when G doesn't have an accessible copy/move constructor. Since most types for which the copy/move assignment operator is accessible also have an accessible copy/move constructor, this situation is quite rare.
Both overloads take the argument by value and there are good reasons for that if G has an accessible copy/move constructor. Suppose for now that G does not have one. In this case the operators should take the argument by const G&.
Unfortunately, the second overload (which, as written, returns by value) should not return a reference (of any kind) to *this, because the expression that *this binds to is an rvalue and is therefore likely to be a temporary whose lifetime is about to expire. (Recall that preventing this was one of the OP's motivations.)
In this case, you should remove the second overload (as per Simple's suggestion) otherwise the class doesn't compile (unless the second overload is a template that's never instantiated). Alternatively, we can keep the second overload and define it as deleted. (But why bother since the existence of the overload for & alone is already enough?)
A peripheral point.
What should be the definition of operator = for &&? (We assume again that G has an accessible copy/move constructor.)
As Dietmar Kühl has pointed out and Yakk has explored, the code of both overloads should be very similar and, in this case, it's better to implement the one for && in terms of the one for &. Since a move is expected to perform no worse than a copy (and since RVO doesn't apply when returning *this), we should return std::move(*this). In summary, a possible one-line definition is:
G operator =(G o) && { return std::move(*this = std::move(o)); }
This is good enough if only a G can be assigned to another G or if G has (non-explicit) converting constructors. Otherwise, consider instead giving G a forwarding (template) copy/move assignment operator taking a universal reference:
template <typename T>
G operator =(T&& o) && { return std::move(*this = std::forward<T>(o)); }
Although this is not a lot of boilerplate code, it's still an annoyance if we have to do it for many classes. To reduce the amount of boilerplate we can define a macro:
#define ASSIGNMENT_FOR_RVALUE(type) \
template <typename T> \
type operator =(T&& b) && { return std::move(*this = std::forward<T>(b)); }
Then inside G's definition one adds ASSIGNMENT_FOR_RVALUE(G).
(Notice that the relevant type appears only as the return type. In C++14 it can be automatically deduced by the compiler and thus, G and type in the last two code snippets can be replaced by auto. It follows that the macro can become an object-like macro instead of a function-like macro.)
Another way of reducing the amount of boilerplate code is to define a CRTP base class that implements operator = for &&:
template <typename Derived>
struct assignment_for_rvalue {
template <typename T>
Derived operator =(T&& o) && {
return std::move(static_cast<Derived&>(*this) = std::forward<T>(o));
}
};
The boilerplate is reduced to the inheritance and the using declaration, as shown below:
class G : public assignment_for_rvalue<G> {
public:
// other members, possibly including assignment operator overloads for `&`
// but taking arguments of different types and/or value category.
G& operator=(G) & { /*...*/ return *this; }
using assignment_for_rvalue::operator =;
};
Recall that, for some types and in contrast to using ASSIGNMENT_FOR_RVALUE, inheriting from assignment_for_rvalue might have unwanted consequences for the class layout.
The first problem is that this is not actually ok in C++03:
B& b = B() = B();
in that b is bound to an expired temporary once the line is finished.
The only "safe" way to use this is in a function call:
void foo(B&);
foo( B()=B() );
or something similar, where the line-lifetime of the temporaries is sufficient for the lifetime of what we bind it to.
We can replace the probably inefficient B()=B() syntax with:
template<typename T>
typename std::decay<T>::type& to_lvalue( T&& t ) { return t; }
and now the call looks clearer:
foo( to_lvalue(B()) );
which does it via pure casting. Lifetime is still not extended (I cannot think of a way to manage that), but we don't construct two objects and then pointlessly assign one to the other.
So now we sit down and examine these three options:
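For what it's worth, here is a self-contained sketch of the to_lvalue idea. `increment` and `demo_to_lvalue` are hypothetical names. The explicit static_cast is my own precaution: as far as I know, returning an rvalue reference parameter by name from a function that returns an lvalue reference is rejected under the implicit-move rules of C++23, while the cast works in every standard since C++11.

```cpp
#include <cassert>
#include <type_traits>

// Turns any expression into an lvalue reference without creating a new object.
template <typename T>
typename std::decay<T>::type& to_lvalue(T&& t) {
    return static_cast<typename std::decay<T>::type&>(t);
}

// Hypothetical function that only accepts non-const lvalue references.
int increment(int& x) { return ++x; }

void demo_to_lvalue() {
    // The temporary lives until the end of the full expression, which is
    // long enough for the call to complete.
    assert(increment(to_lvalue(41)) == 42);
}
```

As with B() = B(), the resulting reference must not outlive the full expression it appears in.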
G operator=(G o) && { return std::move(o); }
G&& operator=(G o) && { *this = std::move(o); return std::move(*this); }
G operator=(G o) && { *this = std::move(o); return std::move(*this); }
which are, as an aside, complete implementations, assuming G& operator=(G o)& exists and is written properly. (Why duplicate code when you don't need to?)
The first and third allow for lifetime extension of the return value; the second uses the lifetime of *this. The second and third modify *this, while the first one does not.
I would claim that the first one is the right answer. Because *this is bound to an rvalue, the caller has stated that it will not be reused, and its state does not matter: changing it is pointless.
The lifetime behavior of the first and third means that whoever uses them can extend the lifetime of the returned value and is not tied to whatever the lifetime of *this is.
About the only use the B operator=(B)&& has is that it allows you to treat rvalue and lvalue code relatively uniformly. As a downside, it lets you treat it relatively uniformly in situations where the result may be surprising.
std::forward<T>(t) = std::forward<U>(u);
should probably fail to compile instead of doing something surprising like "not modifying t" when T&& is an rvalue reference. And modifying t when T&& is an rvalue reference is equally wrong.
I have been writing the following code to support function calls on rvalues without having to std::move explicitly on the return value.
struct X {
X& do_something() & {
// some code
return *this;
}
X&& do_something() && {
// some code
return std::move(*this);
}};
But this results in having to repeat the code inside the function. Preferably, I would do something like
struct X {
X& do_something() & {
// some code
return *this;
}
X&& do_something() && {
return std::move(do_something());
}};
Is this a valid transformation? Why or why not?
Also, I can't help but feel that there's some knowledge gap w.r.t ref-qualifiers. Is there a general way (or a set of rules) of figuring out if code like this is valid or not?
Is this a valid transformation?
Yes. Inside a member function *this is always an lvalue. Even if the function is rvalue reference qualified. It's the same as
void foo(bar& b) { /* do things */ }
void foo(bar&& b) {
// b is an lvalue inside the function
foo(b); // calls the first overload
}
So you may use an lvalue ref qualified function to share an implementation.
And using std::move on the result is also no problem. The first overload can only return an lvalue reference, because as far as it knows, it was called on an lvalue. Meanwhile the second overload has an extra bit of information, it knows it was originally invoked on an rvalue. Therefore, it does an extra cast, based on the additional information.
std::move is just a named cast that turns lvalues into rvalues. Its purpose is to signal the designated object can be treated as though it's about to expire. Since you are doing this cast inside a context where you know this to be true (the member is originally called on an object that binds to an rvalue reference), it should not pose a problem.
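A compilable sketch of the delegation described above. The `calls` counter and `demo_shared_implementation` are hypothetical additions so that the behavior can be observed; the rest mirrors the struct from the question.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X {
    int calls = 0;  // hypothetical counter, added for illustration

    X& do_something() & {
        ++calls;    // the shared implementation lives here
        return *this;
    }
    X&& do_something() && {
        // *this is an lvalue expression here, so this calls the & overload.
        return std::move(do_something());
    }
};

// The value category of the object decides the return type.
static_assert(std::is_same<decltype(std::declval<X&>().do_something()), X&>::value, "");
static_assert(std::is_same<decltype(std::declval<X>().do_something()), X&&>::value, "");

void demo_shared_implementation() {
    X x;
    x.do_something();
    assert(x.calls == 1);
    // Each chained call on the temporary goes through the && overload,
    // which runs the shared body once.
    assert(X().do_something().do_something().calls == 2);
}
```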
I'm used to passing lambda functions (and other callables) to template functions, and using them, as follows
template <typename F>
auto foo (F && f)
{
// ...
auto x = std::forward<F>(f)(/* some arguments */);
// ...
}
I mean: I'm used to passing them through a forwarding reference and calling them through std::forward.
Another Stack Overflow user argues (see the comments on this answer) that calling the functor two or more times this way is semantically invalid and potentially dangerous (and maybe even undefined behavior) when the function is called with an rvalue reference.
I partially misunderstood what he meant (my fault), but the remaining doubt is whether the following bar() function (which undoubtedly applies std::forward to the same object multiple times) is correct code or is (maybe only potentially) dangerous.
template <typename F>
auto bar (F && f)
{
using A = typename decltype(std::function{std::forward<F>(f)})::result_type;
std::vector<A> vect;
for ( auto i { 0u }; i < 10u ; ++i )
vect.push_back(std::forward<F>(f)());
return vect;
}
Forward is just a conditional move.
Therefore, to forward the same thing multiple times is, generally speaking, as dangerous as moving from it multiple times.
Unevaluated forwards don't move anything, so those don't count.
Routing through std::function adds a wrinkle: that deduction only works on function pointers and on function objects with a single function call operator that is not &&-qualified. For these, rvalue and lvalue invocation are always equivalent if both compile.
I'd say the general rule applies in this case. You're not supposed to do anything with a variable after it was moved/forwarded from, except maybe assigning to it.
Thus...
How to correctly use a callable passed through a forwarding reference?
Only forward if you're sure it won't be called again (i.e. on last call, if at all).
If it's never called more than once, there is no reason to not forward.
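One way to read "forward only on the last call" is sketched below. `CallLog`, `Recorder`, `call_n` and `demo_forward_last` are hypothetical names, and the loop shape is just one possible arrangement.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

struct CallLog {
    int lvalue_calls = 0;
    int rvalue_calls = 0;
};

// Hypothetical callable that records which overload was invoked.
struct Recorder {
    CallLog* log;
    void operator()() &  { ++log->lvalue_calls; }
    void operator()() && { ++log->rvalue_calls; }
};

// Calls f exactly n times; only the final call forwards.
template <typename F>
void call_n(F&& f, std::size_t n) {
    for (std::size_t i = 0; i + 1 < n; ++i)
        f();                    // f is an lvalue here: nothing is given up
    if (n > 0)
        std::forward<F>(f)();   // last use, so forwarding is safe
}

void demo_forward_last() {
    CallLog log;
    call_n(Recorder{&log}, 3);
    assert(log.lvalue_calls == 2);
    assert(log.rvalue_calls == 1);
}
```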
As for why your snippet could be dangerous, consider the following functor:
template <typename T>
struct foo
{
T value;
const T &operator()() const & {return value;}
T &&operator()() && {return std::move(value);}
};
As an optimization, operator() when called on an rvalue allows caller to move from value.
Now, your template wouldn't compile if given this functor (because, as T.C. said, std::function wouldn't be able to determine return type in this case).
But if we changed it a bit:
template <typename A, typename F>
auto bar (F && f)
{
std::vector<A> vect;
for ( auto i { 0u }; i < 10u ; ++i )
vect.push_back(std::forward<F>(f)());
return vect;
}
then it would break spectacularly when given this functor.
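To see the breakage without relying on unspecified moved-from states, the sketch below uses std::unique_ptr, whose moved-from state is guaranteed to be null. `holder` has the same shape as the functor above; `bar_n` (with a count parameter) and `demo_repeated_forward` are hypothetical variations introduced for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
struct holder {  // same shape as the functor above
    T value;
    const T& operator()() const & { return value; }
    T&& operator()() && { return std::move(value); }
};

template <typename A, typename F>
std::vector<A> bar_n(F&& f, unsigned n) {
    std::vector<A> vect;
    for (unsigned i = 0; i < n; ++i)
        vect.push_back(std::forward<F>(f)());  // forwards the same object every time
    return vect;
}

void demo_repeated_forward() {
    using ptr = std::unique_ptr<int>;
    auto v = bar_n<ptr>(holder<ptr>{ptr(new int(42))}, 3);
    assert(v.size() == 3);
    assert(v[0] && *v[0] == 42);  // the first call stole the value
    assert(!v[1] && !v[2]);       // later calls saw a moved-from (null) pointer
}
```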
If you're either going to just forward the callable to another place or simply call the callable exactly once, I would argue that using std::forward is the correct thing to do in general. As explained here, this will sort of preserve the value category of the callable and allow the "correct" version of a potentially overloaded function call operator to be called.
The problem in the original thread was that the callable was being called in a loop, thus potentially invoked more than once. The concrete example from the other thread was
template <typename F>
auto map(F&& f) const
{
using output_element_type = decltype(f(std::declval<T>()));
auto sequence = std::make_unique<Sequence<output_element_type>>();
for (const T& element : *this)
sequence->push(f(element));
return sequence;
}
Here, I believe that calling std::forward<F>(f)(element) instead of f(element), i.e.,
template <typename F>
auto map(F&& f) const
{
using output_element_type = decltype(std::forward<F>(f)(std::declval<T>()));
auto sequence = std::make_unique<Sequence<output_element_type>>();
for (const T& element : *this)
sequence->push(std::forward<F>(f)(element));
return sequence;
}
would be potentially problematic. As far as my understanding goes, the defining characteristic of an rvalue is that it cannot explicitly be referred to. In particular, there is naturally no way for the same prvalue to be used in an expression more than once (at least I can't think of one). Furthermore, as far as my understanding goes, if you're using std::move or std::forward or whatever other way to obtain an xvalue, even on the same original object, the result will be a new xvalue every time. Thus, there also cannot possibly be a way to refer to the same xvalue more than once. Since the same rvalue cannot be used more than once, I would argue (see also comments underneath this answer) that it would generally be a valid thing for an overloaded function call operator to do something that can only be done once in case the call happens on an rvalue, for example:
class MyFancyCallable
{
public:
void operator()() & { /* do some stuff */ }
void operator()() && { /* do some stuff in a special way that can only be done once */ }
};
The implementation of MyFancyCallable may assume that a call that would pick the &&-qualified version cannot possibly happen more than once (on the given object). Thus, I would consider forwarding the same callable into more than one call to be semantically broken.
Of course, technically, there is no universal definition of what it actually means to forward or move an object. In the end, it's really up to the implementation of the particular types involved to assign meaning there. Thus, you may simply specify as part of your interface that potential callables passed to your algorithm must be able to deal with being called multiple times on an rvalue that refers to the same object. However, doing so pretty much goes against all the conventions for how the rvalue reference mechanism is generally used in C++, and I don't really see what there possibly would be to be gained from doing this…
C++11 makes it possible to overload member functions based on reference qualifiers:
class Foo {
public:
void f() &; // for when *this is an lvalue
void f() &&; // for when *this is an rvalue
};
Foo obj;
obj.f(); // calls lvalue overload
std::move(obj).f(); // calls rvalue overload
I understand how this works, but what is a use case for it?
I see that N2819 proposed limiting most assignment operators in the standard library to lvalue targets (i.e., adding "&" reference qualifiers to assignment operators), but this was rejected. So that was a potential use case where the committee decided not to go with it. So, again, what is a reasonable use case?
In a class that provides reference-getters, ref-qualifier overloading can activate move semantics when extracting from an rvalue. E.g.:
class some_class {
huge_heavy_class hhc;
public:
huge_heavy_class& get() & {
return hhc;
}
huge_heavy_class const& get() const& {
return hhc;
}
huge_heavy_class&& get() && {
return std::move(hhc);
}
};
some_class factory();
auto hhc = factory().get();
This does seem like a lot of effort to invest only to have the shorter syntax
auto hhc = factory().get();
have the same effect as
auto hhc = std::move(factory().get());
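A sketch that makes the difference countable. `Heavy` is a hypothetical stand-in for huge_heavy_class, and `Owner` mirrors some_class; neither name is from the original answer.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for huge_heavy_class that counts copies and moves.
struct Heavy {
    static int& copies() { static int n = 0; return n; }
    static int& moves()  { static int n = 0; return n; }
    Heavy() = default;
    Heavy(const Heavy&) { ++copies(); }
    Heavy(Heavy&&) noexcept { ++moves(); }
};

class Owner {
    Heavy h;
public:
    Heavy&       get() &      { return h; }
    Heavy const& get() const& { return h; }
    Heavy&&      get() &&     { return std::move(h); }
};

void demo_getter_overloads() {
    Owner o;
    Heavy a = o.get();        // lvalue object: the member is copied
    Heavy b = Owner().get();  // rvalue object: the member is moved out
    (void)a;
    (void)b;
    assert(Heavy::copies() == 1);
    assert(Heavy::moves() == 1);
}
```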
EDIT: I found the original proposal paper, it provides three motivating examples:
Constraining operator = to lvalues (TemplateRex's answer)
Enabling move for members (basically this answer)
Constraining operator & to lvalues. I suppose this is sensible to ensure that the "pointee" is more likely to be alive when the "pointer" is eventually dereferenced:
struct S {
T operator &() &;
};
int main() {
S foo;
auto p1 = &foo; // Ok
auto p2 = &S(); // Error
}
Can't say I've ever personally used an operator& overload.
One use case is to prohibit assignment to temporaries
// can only be used with lvalues
T& operator*=(T const& other) & { /* ... */ return *this; }
// not possible to do (a * b) = c;
T operator*(T lhs, T const& rhs) { lhs *= rhs; return lhs; } // lhs taken by value so that *= can modify it
whereas not using the reference qualifier would leave you the choice between two bad options
T operator*(T const& lhs, T const& rhs); // can be used on rvalues
const T operator*(T const& lhs, T const& rhs); // inhibits move semantics
The first choice allows move semantics, but acts differently on user-defined types than on builtins (doesn't do as the ints do). The second choice would stop the assignment but eliminate move semantics (possible performance hit for e.g. matrix multiplication).
The links by @dyp in the comments also provide an extended discussion on using the other (&&) overload, which can be useful if you want to assign to (either lvalue or rvalue) references.
If f() needs a temporary Foo that is a modified copy of *this, the && overload can modify *this itself (it is already a temporary) instead of making that copy, which is not possible otherwise.
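The cost of the const-returning alternative can be shown with a small sketch. `Matrix`, `make_plain`, `make_const` and the counters are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical type that counts which assignment operator runs.
struct Matrix {
    int copy_assigns = 0;
    int move_assigns = 0;
    Matrix() = default;
    Matrix(const Matrix&) = default;
    Matrix(Matrix&&) = default;
    Matrix& operator=(const Matrix&) & { ++copy_assigns; return *this; }
    Matrix& operator=(Matrix&&) &      { ++move_assigns; return *this; }
};

Matrix       make_plain() { return Matrix(); }
const Matrix make_const() { return Matrix(); }

void demo_const_return() {
    Matrix a, b;
    a = make_plain();  // binds to Matrix&&: move assignment
    b = make_const();  // a const rvalue cannot bind to Matrix&&: copy assignment
    assert(a.move_assigns == 1 && a.copy_assigns == 0);
    assert(b.copy_assigns == 1 && b.move_assigns == 0);
}
```

Because both operators are &-qualified, an expression such as make_plain() = a is rejected without having to make the return type const.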
On the one hand you can use them to prevent functions that are semantically nonsensical to call on temporaries from being called, such as operator= or functions that mutate internal state and return void, by adding & as a reference qualifier.
On the other hand you can use it for optimizations such as moving a member out of the object as a return value when you have an rvalue reference, for example a function getName could return either a std::string const& or std::string&& depending on the reference qualifier.
Another use case might be operators and functions that return a reference to the original object such as Foo& operator+=(Foo&) which could be specialized to return an rvalue reference instead, making the result movable, which would again be an optimization.
TL;DR: Use it to prevent incorrect usage of a function or for optimization.
Now that GCC 4.8.1 and Clang 2.9 and higher support them, reference qualifiers (also known as "rvalue references for *this") have become more widely available. They allow classes to behave even more like built-in types by, e.g., disallowing assignment to rvalues (which can otherwise cause an unwanted cast of an rvalue to an lvalue):
class A
{
// ...
public:
A& operator=(A const& o) &
{
// ...
return *this;
}
};
In general, it is sensible to call a const member function of an rvalue, so an lvalue reference qualifier would be out of place (unless the rvalue qualifier can be used for an optimization such as moving a member out of a class instead of returning a copy).
On the flip side, mutating operators such as the pre-decrement/increment operators should be lvalue-qualified, as they usually return an lvalue reference to the object. Hence also the question: Are there any reasons to ever allow mutating/non-const methods (including operators) to be called on rvalues, aside from conceptually const methods which are only not marked const because const-correctness (including proper application of mutable when using an internal cache, which may now include ensuring certain thread-safety guarantees) was neglected in the code base?
To clarify, I am not suggesting to forbid mutating methods on rvalues on the language level (at the very least this could break legacy code) but I believe that defaulting (as an idiom / coding style) to only allowing lvalues for mutating methods will generally lead to cleaner, safer APIs. However I am interested in examples where not doing so leads to cleaner, less astonishing APIs.
A mutator that operates on an R-value can be useful if the R-value is used to accomplish some task, but in the interim it maintains some state. For example:
struct StringFormatter {
StringFormatter &addString(string const &) &;
StringFormatter &&addString(string const &) &&;
StringFormatter &addNumber(int) &;
StringFormatter &&addNumber(int) &&;
string finish() &;
string finish() &&;
};
int main() {
string message = StringFormatter()
.addString("The answer is: ")
.addNumber(42)
.finish();
cout << message << endl;
}
By allowing either an L-value or an R-value, one can construct an object, pass it through some mutators, and use the result of the expression to accomplish some task without having to store it in an L-value, even if the mutators are member functions.
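The answer only declares the interface; one possible way to fill in the bodies is sketched below. The `buffer_` member, the function bodies and `demo_string_formatter` are my assumptions, not the author's implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>

class StringFormatter {
    std::string buffer_;  // assumed internal state
public:
    StringFormatter& addString(std::string const& s) & { buffer_ += s; return *this; }
    StringFormatter&& addString(std::string const& s) && { return std::move(addString(s)); }

    StringFormatter& addNumber(int n) & { buffer_ += std::to_string(n); return *this; }
    StringFormatter&& addNumber(int n) && { return std::move(addNumber(n)); }

    std::string finish() &  { return buffer_; }             // object stays usable: copy
    std::string finish() && { return std::move(buffer_); }  // object is expiring: move
};

void demo_string_formatter() {
    std::string message = StringFormatter()
        .addString("The answer is: ")
        .addNumber(42)
        .finish();
    assert(message == "The answer is: 42");
}
```

The && overloads delegate to their & counterparts, so the formatting logic exists only once.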
Also note that not all mutating operators return a reference to the self. User-defined mutators can implement any signature they need or want. A mutator may consume the state of the object to return something more useful, and by acting on an R-value, the fact that the object is consumed isn't a problem since the state would have otherwise been discarded. In fact, a member function that consumes the state of the object to produce something else useful will have to be marked as such, making it easier to see when l-values are consumed. For example:
MagicBuilder mbuilder("foo", "bar");
// Shouldn't compile (because it silently consumes mbuilder's state):
// MagicThing thing = mbuilder.construct();
// Good (the consumption of mbuilder is explicit):
MagicThing thing = move(mbuilder).construct();
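A sketch of how such a consuming member could look. The members of `MagicBuilder` and `MagicThing`, the `can_construct` detection trait and `demo_magic_builder` are all hypothetical; only the idea of an &&-only construct() comes from the text above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct MagicThing { std::string name; };

class MagicBuilder {
    std::string a_, b_;
public:
    MagicBuilder(std::string a, std::string b) : a_(std::move(a)), b_(std::move(b)) {}
    // Only callable on rvalues: consuming the builder has to be spelled out.
    MagicThing construct() && { return MagicThing{std::move(a_) + std::move(b_)}; }
};

// Detects whether construct() can be called on a T of the given value category.
template <typename T, typename = void>
struct can_construct : std::false_type {};
template <typename T>
struct can_construct<T, decltype(void(std::declval<T>().construct()))> : std::true_type {};

static_assert(!can_construct<MagicBuilder&>::value, "lvalue builder is rejected");
static_assert(can_construct<MagicBuilder>::value, "rvalue builder is accepted");

void demo_magic_builder() {
    MagicBuilder mbuilder("foo", "bar");
    MagicThing thing = std::move(mbuilder).construct();
    assert(thing.name == "foobar");
}
```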
I think it comes about in cases where the only way to retrieve some value is by mutating another value. For instance, iterators don't provide a "+1" or a "next" method. So suppose I'm constructing a wrapper for stl list iterators (perhaps to create an iterator for my own list-backed data-structure):
template <typename T>
class my_iter {
private:
typename std::list<T>::iterator iter;
void assign_to_next(typename std::list<T>::iterator&& rhs) {
iter = std::move(++rhs);
}
};
Here, the assign_to_next method takes an iterator and assigns this one to have the next position after that one. It's not too hard to imagine situations where this might be useful, but more importantly there is nothing surprising about this implementation. True, we could also say iter = std::move(rhs); ++iter; or ++(iter = std::move(rhs));, but I don't see any arguments for why those would be any cleaner or faster. I think this implementation is the most natural to me.
FWIW HIC++ agrees with you as far as assignment operators:
http://www.codingstandard.com/rule/12-5-7-declare-assignment-operators-with-the-ref-qualifier/
Should a non-const method ever apply to rvalues?
This question puzzles me. A more sensible question to me would be:
Should a const method ever apply exclusively to rvalues?
To which I believe the answer is no. I can't imagine a situation in which you would want to overload on const rvalue *this, just as I can't imagine a situation in which you would want to overload on const rvalue arguments.
You overload on rvalues because it's possible to handle them more efficiently when you know that you can steal their guts, but you can't steal the guts of a const object.
There are four possible ways to overload on *this:
struct foo {
void bar() &;
void bar() &&;
void bar() const &;
void bar() const &&;
};
The constness of the latter two overloads means that neither one can mutate *this, so there can be no difference between what the const & overload is allowed to do to *this and what the const && overload is allowed to do to *this. In the absence of the const && overload, the const & will bind to both lvalues and rvalues anyway.
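Which of the four overloads gets picked can be verified with a small probe. `Probe`, `ProbeNoConstRvalue` and `demo_overload_selection` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical probe: each overload returns a tag naming itself.
struct Probe {
    const char* bar() &        { return "&"; }
    const char* bar() &&       { return "&&"; }
    const char* bar() const &  { return "const &"; }
    const char* bar() const && { return "const &&"; }
};

// The same probe without the const && overload.
struct ProbeNoConstRvalue {
    const char* bar() &       { return "&"; }
    const char* bar() &&      { return "&&"; }
    const char* bar() const & { return "const &"; }
};

void demo_overload_selection() {
    Probe p;
    const Probe cp{};
    assert(std::string(p.bar()) == "&");
    assert(std::string(Probe().bar()) == "&&");
    assert(std::string(cp.bar()) == "const &");
    assert(std::string(std::move(cp).bar()) == "const &&");

    // Without a const && overload, const rvalues fall back to const &.
    const ProbeNoConstRvalue cq{};
    assert(std::string(std::move(cq).bar()) == "const &");
}
```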
Given that overloading on const && is useless and only really provided for completeness (prove me wrong!), we are left with only one remaining use case for ref-qualifiers: overloading on non-const rvalue *this. One can define a function body for a && overload, or one can = delete it (calls on rvalues are likewise rejected if only a & overload is provided). I can imagine plenty of cases in which defining a && function body might be useful.
A proxy object which implements pointer semantics by overloading operator-> and unary operator*, such as boost::detail::operator_arrow_dispatch, might find it useful to use ref-qualifiers on its operator*:
template <typename T>
struct proxy {
proxy(T obj) : m_obj(std::move(obj)) {}
T const* operator->() const { return &m_obj; }
T operator*() const& { return m_obj; }
T operator*() && { return std::move(m_obj); }
private:
T m_obj;
};
If *this is an rvalue then operator* can return by move instead of by copy.
I can imagine functions that move from the actual object to a parameter.