Some clarification on rvalue references

First: where are std::move and std::forward defined? I know what they do, but I can't find proof that any standard header is required to include them. In gcc44 sometimes std::move is available, and sometimes it's not, so a definitive include directive would be useful.
When implementing move semantics, the source is presumably left in an undefined state. Should this state necessarily be a valid state for the object? Obviously, you need to be able to call the object's destructor, and be able to assign to it by whatever means the class exposes. But should other operations be valid? I suppose what I'm asking is, if your class guarantees certain invariants, should you strive to enforce those invariants when the user has said they don't care about them anymore?
Next: when you don't care about move semantics, are there any limitations that would cause a non-const reference to be preferred over an rvalue reference when dealing with function parameters? void function(T&); over void function(T&&); From a caller's perspective, being able to pass functions temporary values is occasionally useful, so it seems as though one should grant that option whenever it is feasible to do so. And rvalue references are themselves lvalues, so you can't inadvertently call a move-constructor instead of a copy-constructor, or something like that. I don't see a downside, but I'm sure there is one.
Which brings me to my final question. You still can not bind temporaries to non-const references. But you can bind them to non-const rvalue references. And you can then pass along that reference as a non-const reference in another function.
void function1(int& r) { r++; }
void function2(int&& r) { function1(r); }
int main() {
function1(5); //bad
function2(5); //good
}
Besides the fact that it doesn't do anything, is there anything wrong with that code? My gut says of course not, since changing rvalue references is kind of the whole point to their existence. And if the passed value is legitimately const, the compiler will catch it and yell at you. But by all appearances, this is a runaround of a mechanism that was presumably put in place for a reason, so I'd just like confirmation that I'm not doing anything foolish.

First: where are std::move and std::forward defined?
See 20.3 Utility components, <utility>.
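As a definitive include directive, a minimal sketch that relies only on <utility> for both functions might look like this. The names take and relay are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>  // declares std::move and std::forward

// Hypothetical sink that takes ownership of its argument.
std::string take(std::string s) { return s; }

// Hypothetical wrapper that forwards its argument with the
// value category the caller supplied.
template <typename T>
std::string relay(T&& value) {
    return take(std::forward<T>(value));
}
```

Calling relay(a) with a named string copies it, while relay(std::move(a)) lets take move from it.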
When implementing move semantics, the source is presumably left in an undefined state. Should this state necessarily be a valid state for the object?
Obviously, the object should still be destructible. But further than that, I think it's a good idea for it to remain assignable. The Standard says for objects that satisfy "MoveConstructible" and "MoveAssignable":
[ Note: rv remains a valid object. Its state is unspecified. — end note ]
This would mean, I think, that the object can still participate in any operation that doesn't state any precondition. This includes CopyConstructible, CopyAssignable, Destructible and other things. Notice that this won't require anything for your own objects from a core language perspective. The requirements only take place once you touch Standard library components that state these requirements.
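A small sketch of what "valid but unspecified" permits, using std::string as the moved-from object. Only an operation without preconditions (assignment) is applied after the move; the helper name reuse_after_move is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Moves out of a string, then reuses the moved-from object.
std::string reuse_after_move() {
    std::string source = "payload";
    std::string sink = std::move(source);  // source is now valid but unspecified

    // Assignment states no precondition, so it is fine on a moved-from object.
    source = "fresh";
    assert(sink == "payload");
    return source;
}
```

Reading the contents of source between the move and the assignment would compile, but the result is whatever the implementation left behind.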
Next: when you don't care about move semantics, are there any limitations that would cause a non-const reference to be preferred over an rvalue reference when dealing with function parameters?
This, unfortunately, crucially depends on whether the parameter is in a function template and uses a template parameter:
void f(int const&); // takes all lvalues and const rvalues
void f(int&&); // can only accept nonconst rvalues
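To make the non-template case observable, here is a sketch in which the two overloads return a number instead of void (an assumption made only so the choice can be checked). The pick_for_* helpers are hypothetical.

```cpp
#include <cassert>
#include <utility>

int f(int const&) { return 1; }  // lvalues and const rvalues
int f(int&&)      { return 2; }  // non-const rvalues only

int pick_for_lvalue()       { int x = 0; return f(x); }
int pick_for_literal()      { return f(5); }
int pick_for_moved()        { int x = 0; return f(std::move(x)); }
int pick_for_const_rvalue() { const int c = 0; return f(std::move(c)); }
```

Only the literal and the std::move'd non-const int reach the int&& overload; the const rvalue falls back to int const&.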
However for a function template
template<typename T> void f(T const&);
template<typename T> void f(T&&);
You can't say that, because the second template will, after being called with an lvalue, have as parameter of the synthesized declaration the type U& for nonconst lvalues (and be a better match), and U const& for const lvalues (and be ambiguous). To my knowledge, there is no partial ordering rule to disambiguate that second ambiguity. However, this is already known.
-- Edit --
Despite that issue report, I don't think that the two templates are ambiguous. Partial ordering will make the first template more specialized, because after taking away the reference modifiers and the const, we will find that both types are the same, and then notice that the first template had a reference to const. The Standard says (14.9.2.4)
If, for a given type, deduction succeeds in both directions (i.e., the types are identical after the transformations above) and if the type from the argument template is more cv-qualified than the type from the parameter template (as described above) that type is considered to be more specialized than the other.
If for each type being considered a given template is at least as specialized for all types and more specialized for some set of types and the other template is not more specialized for any types or is not at least as specialized for any types, then the given template is more specialized than the other template.
This makes the T const& template the winner of partial ordering (and GCC is indeed correct to choose it).
-- Edit End --
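A sketch of how the two templates behave under the partial ordering reading given in the edit. As before, the return values are an assumption added so the selected template can be observed, and the pick_for_* helpers are hypothetical.

```cpp
#include <cassert>

template <typename T> int f(T const&) { return 1; }
template <typename T> int f(T&&)      { return 2; }

// Both templates synthesize int const& here; partial ordering
// prefers the T const& template.
int pick_for_const_lvalue() { const int cx = 0; return f(cx); }

// T&& deduces int&, which is a better match than int const&.
int pick_for_lvalue() { int x = 0; return f(x); }

// T&& deduces int&&, which binds an rvalue better than int const&.
int pick_for_rvalue() { return f(5); }
```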
Which brings me to my final question. You still can not bind temporaries to non-const references. But you can bind them to non-const rvalue references.
This is nicely explained in this article. The second call using function2 only takes nonconst rvalues. The rest of the program won't notice if they are modified, because they won't be able to access those rvalues afterwards anymore! And the 5 you pass is not a class type, so a hidden temporary is created and then passed to the int&& rvalue reference. The code calling function2 won't be able to access that hidden object here, so it won't notice any change.
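Here is a sketch of the question's two functions, with function2 changed to return the modified value so the effect on the hidden temporary can be seen (the return value and the helper bump_named are additions for illustration).

```cpp
#include <cassert>
#include <utility>

void function1(int& r) { r++; }

// Returns the modified value so the effect is observable.
int function2(int&& r) {
    function1(r);  // r is a named variable, hence an lvalue
    return r;
}

// With an explicit std::move the caller's own object is modified.
int bump_named() {
    int x = 1;
    function2(std::move(x));
    return x;
}
```

function2(5) increments a temporary that the caller can never name, whereas bump_named shows that an explicitly moved variable is changed in place.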
A different situation is if you do this one:
SomeComplexObject o;
function2(move(o));
You have explicitly requested that o is moved, so it will be modified according to its move specification. However moving is a logically non-modifying operation (see the article). This means whether you move or not shouldn't be observable from the calling code:
SomeComplexObject o;
moveit(o); // #1
o = foo;
If you erase the line that moves, behavior will still be the same, because it's overwritten anyway. This however means that code that uses the value of o after it has been moved from is bad, because it breaks this implicit contract between moveit and the calling code. Thus, the Standard makes no specification about the concrete value of a moved from container.

where are std::move and std::forward defined?
std::move and std::forward are declared in <utility>. See the synopsis at the beginning of section 20.3[utility].
When implementing move semantics, the source is presumably left in an undefined state.
It of course depends on how you implement the move-constructor and move-assignment operator. If you want to use your objects in standard containers, however, you have to follow the MoveConstructible and MoveAssignable concepts, which say that the object remains valid but is left in an unspecified state, i.e. you definitely can destroy it.

Both are made available by including <utility>.
Here is the article I read about rvalues.
I can't help you with the rest, sorry.

Related

Why is assigning rvalue to lvalue member reference allowed in c++?

After learning some Rust and its lifetime specifiers, borrowing semantics, etc, I came across a Rust sample which doesn't allow something like that which is allowed in C++. Why?
struct S {
std::string& str;
S(std::string&& value) : str(value) {}
};

It is not allowed and would cause an error if you actually did try that.
However, the name of a variable is always an lvalue, not a rvalue. What type the variable is doesn't matter at all. You need to call std::move on it to turn it into a rvalue. That's what std::move does.
When using a rvalue reference variable it behaves exactly like a lvalue reference variable. Both refer directly to the bound object as a lvalue. They only differ in how they can or cannot be initialized and how they affect overload resolution and template argument deduction.
The point is to make it explicit when you potentially move from an object. Even if value is a rvalue reference, you may still use it multiple times in the function. But, usually, when you move from the referenced object, you can't use it afterwards anymore or at least it will lose its state. Therefore it must be clear where a potential move can happen while still allowing non-move usage. So the rule makes sense to enforce you to write std::move explicitly everywhere where a move might happen.
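A sketch showing that nothing is moved in the question's struct: the named parameter value is an lvalue, so the member simply aliases the caller's string. The helper append_through_member is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct S {
    std::string& str;
    // value is a named variable, so the expression `value` is an lvalue
    // and can initialize the lvalue reference member.
    S(std::string&& value) : str(value) {}
};

// Appends through the member and returns the caller's original string.
std::string append_through_member() {
    std::string text = "hello";
    S s(std::move(text));  // nothing is moved; s.str now aliases text
    s.str += "!";
    return text;
}
```

Passing a real temporary instead of std::move(text) would compile just the same, but s.str would dangle once the temporary is destroyed.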
If the goal is to prevent a user from constructing S from a temporary, which is reasonable, then you should delete the constructor completely:
S(std::string&& value) = delete;
(There are technically some issues with this. A better, but complex approach is to follow what std::reference_wrapper does.)
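A sketch of the deleted-overload approach. It assumes S also gets a constructor taking std::string&, which the question's struct did not have; refers_to is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct S {
    std::string& str;
    S(std::string& value) : str(value) {}
    S(std::string&& value) = delete;  // reject temporaries and std::move'd strings
};

static_assert(std::is_constructible<S, std::string&>::value,
              "named strings are accepted");
static_assert(!std::is_constructible<S, std::string>::value,
              "rvalue strings are rejected at compile time");

// True when s refers to exactly the object t.
bool refers_to(const S& s, const std::string& t) { return &s.str == &t; }
```

This only rejects rvalue arguments; it does nothing about a named string that goes out of scope before S does.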
However, the user already has to think about lifetime of the argument they pass to S. So you only really catch a small subset of potential mistakes with this. There is no way to protect the user from not keeping the passed object alive long enough. If you need to ensure this through S, then S must take (shared) ownership of the object, e.g. by using std::unique_ptr or std::shared_ptr instead of a reference. There is no borrow checker in C++ like there is in Rust to verify that the borrowed/referenced object is kept alive long enough.
C++ lifetime and ownership management is mostly based on convention together with some utilities like the smart pointers. There is no intrinsic core language enforcement aside from automatic storage duration, although core language features like rvalue references are designed to support the conventions, e.g. as I described above.

Why C++ standard library does not pass predicates as && [duplicate]

I was looking at the various signatures for std::find_if on cppreference.com, and I noticed that the flavors that take a predicate function appear to accept it by value:
template< class InputIt, class UnaryPredicate >
InputIt find_if( InputIt first, InputIt last,
UnaryPredicate p );
If I understand them correctly, lambdas with captured variables allocate storage for either references or copies of their data, and so presumably a "pass-by-value" would imply that the copies of captured data are copied for the call.
On the other hand, for function pointers and other directly addressable things, the performance should be better if the function pointer is passed directly, rather than by reference-to-pointer (pointer-to-pointer).
First, is this correct? Is the UnaryPredicate above going to be a by-value parameter?
Second, is my understanding of passing lambdas correct?
Third, is there a reason for passing by value instead of by reference in this situation? And more to the point, is there not some sufficiently ambiguous syntax (hello, universal reference) that would let the compiler do whatever it wants to get the most performance out?

Is the UnaryPredicate above going to be a by-value parameter?
Yes, that's what it says in the function parameter list. It accepts a deduced value type.
Beyond that, lambda expressions are prvalues. Meaning, with c++17's guaranteed copy elision, that p is initialized directly from the lambda expression. No extra copies of the closure or the captured objects are being made when passing it into the function (the function may make more copies internally however, though that's not common).
If the predicate was passed by reference, a temporary object would need to be materialized. So for a lambda expression, nothing is gained by a switch to pass by reference.
If you have other sorts of predicates, which are expensive to copy, then you can pass in std::reference_wrapper to that predicate object, for a cheap "handle" to it. The wrapper's operator() will do the right thing.
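A sketch of the std::ref approach with a stateful predicate. CountingIsEven and the two helper functions are hypothetical; the call counter stands in for state that is expensive to copy or that the caller wants to inspect afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Predicate with internal state: counts how often it was invoked.
struct CountingIsEven {
    int calls = 0;
    bool operator()(int v) { ++calls; return v % 2 == 0; }
};

// The algorithm works on its own copy, so the caller's counter is untouched.
int calls_seen_by_value() {
    std::vector<int> v{1, 3, 5, 6, 7};
    CountingIsEven pred;
    auto it = std::find_if(v.begin(), v.end(), pred);
    assert(it != v.end() && *it == 6);
    return pred.calls;
}

// Only the wrapper is copied; every call goes to the caller's object.
int calls_seen_by_ref() {
    std::vector<int> v{1, 3, 5, 6, 7};
    CountingIsEven pred;
    auto it = std::find_if(v.begin(), v.end(), std::ref(pred));
    assert(it != v.end() && *it == 6);
    return pred.calls;
}
```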
The definition is mostly historic, but nowadays it's really a non-issue to do it with pass by value.
To elaborate on why referential semantics would suck, let's try to take it through the years. A simple lvalue reference won't do, since now we don't support binding to an rvalue. A const lvalue reference won't do either, since now we require the predicate to not modify any internal state, and what for?
So up to c++11, we don't really have an alternative. A pass by value would be better than a reference. With the new standard, we may revise our approach. To support rvalues, we may add an rvalue reference overload. But that is an exercise in redundancy, since it doesn't need to do anything different.
By passing a value, the caller has the choice in how to create it, and for prvalues, in c++17, it's practically free. If the caller so desires, they can provide referential semantics explicitly. So nothing is lost, and I think much is gained in terms of simplicity of usage and API design.

There are actually multiple reasons:
You can always turn deduced value arguments into reference semantics but not vice versa: just pass std::ref(x) instead of x. std::reference_wrapper<T> isn't entirely equivalent to passing a reference, but especially for function objects it does the Right Thing. That is, passing generic arguments by value is the more general approach.
Pass by reference (T&) doesn't work for temporary or const objects, and T const& doesn't allow non-const use of the argument, i.e., the only choice would be T&& (forwarding reference), which didn't exist pre-C++11, and the algorithm interfaces haven't changed since they were introduced with C++98.
Value parameters can be copy elided unlike any sort of reference parameters, including forwarding references.

Understanding lvalue/rvalue expression vs object type

I've read some of the prior top answers as well as Stroustrup's "The C++ Programming Language" and "Effective Modern C++" but I'm having trouble really understanding the distinction between the lvalue/rvalue aspect of an expression vs its type. In the introduction to "Effective Modern C++" it says:
A useful heuristic to determine whether an expression is an lvalue is to ask if you can take its address. If you can, it typically is. If you can't, it's usually an rvalue. A nice feature of this heuristic is that it helps you remember that the type of an expression is independent of whether the expression is an lvalue or rvalue ... It's especially important to remember this when dealing with a parameter of rvalue reference type, because the parameter itself is an lvalue.
I'm not understanding something because I don't understand why if you have an rvalue reference type parameter you need to actually cast it to an rvalue via std::move() to make it eligible to be moved. Even if the parameter (all parameters) is an lvalue the compiler knows its type is an rvalue reference so why the need to tell the compiler that it can be moved? It seems redundant but I guess I am not understanding the distinction between the type of an expression vs its lvalue/rvalue nature (not sure of the right terminology).
Edit:
To follow up on some of the answers/comments below, what's still not clear is why in doSomething() below I would need to wrap the parameter in std::move() to get it to bind to an rvalue reference and resolve to the 2nd version of doSomethingElse(). I understand that if this were to happen implicitly it would be bad, because the parameter would have been moved from and one could inadvertently use it after this. It seems like the rvalue reference type of the parameter is meaningless within the function, as its only purpose was to resolve to the right version of the function given that an rvalue was passed in as an argument.
Widget getWidget();
void doSomethingElse(Widget& rhs); // #1
void doSomethingElse(Widget&& rhs); // #2
void doSomething(Widget&& rhs) {
// will call #1
doSomethingElse(rhs);
// will call #2
doSomethingElse(std::move(rhs));
}
int main() {
doSomething(getWidget());
}

I don't understand why if you have an rvalue reference type parameter you need to actually cast it to an rvalue via std::move() to make it eligible to be moved.
As the quotes said, types and value categories are different things. A parameter is always an lvalue, even if its type is an rvalue reference; we have to use std::move to bind it to an rvalue reference. Suppose we allowed the compiler to do it implicitly, as in the following code snippet:
void foo(std::string&& s);
void bar(std::string&& s) {
foo(s);
// continue to use s...
// oops, s might have been moved
foo(std::string{}); // this is fine;
// the temporary will be destroyed after the full expression and won't be used later
}
So we have to use std::move explicitly, to tell the compiler that we know what we're trying to do.
void bar(std::string&& s) {
foo(std::move(s));
// we know that s might have been moved
}
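The question's doSomethingElse overloads can be made observable with a sketch like the following. The int return values and the names viaName and viaMove are assumptions added for illustration.

```cpp
#include <cassert>
#include <utility>

struct Widget {};

int doSomethingElse(Widget&)  { return 1; }  // #1
int doSomethingElse(Widget&&) { return 2; }  // #2

// rhs names an object, so the expression `rhs` is an lvalue: picks #1.
int viaName(Widget&& rhs) { return doSomethingElse(rhs); }

// std::move(rhs) is an rvalue expression: picks #2.
int viaMove(Widget&& rhs) { return doSomethingElse(std::move(rhs)); }
```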

RValue references exist to solve the forwarding problem. Existing type deduction rules in C++ made it impossible to have consistent and sensible move semantics. So, the type system was extended and new rules were introduced to make it more complicated but consistent.
This only makes sense if you look at it through the perspective of the problem that was being solved. Here is a good link dedicated just to explaining RValue references.

I think you've actually grasped the distinction between type and value category, so I'll focus on two specific claims/queries:
You need to actually cast it to an rvalue via std::move() to make it eligible to be moved
Sort of, but not really. Coercing the expression that names or refers to your object into an rvalue allows us to trigger, during overload resolution, the function overload that takes a Type&&. It is convention that we do this when we want to transfer ownership, but that's not quite the same as making it "eligible to be moved" because a move may not be what you end up doing. This is kind of nitpicking in a sense, though I think it's important to understand. Because:
Even if the parameter (all parameters) is an lvalue the compiler knows its type is an rvalue reference so why the need to tell the compiler that it can be moved?
Unless you write std::move(theThing), or the thing is a temporary (an rvalue already) then it's not an rvalue, and so it cannot bind to an rvalue reference. That's how it's all designed and defined. It's deliberately made that way so that an lvalue expression, an expression that names a thing, a thing that you haven't written std::move() around, will not bind to an rvalue reference. And so either your program will not compile or, if available, overload resolution will instead pick a version of the function that takes maybe a const Type& — and we know that there can be no ownership transfer involved with that.
tl;dr: the compiler doesn't know that its type is an rvalue reference, because it isn't one. Much like you can't do int& ref = 42, you can't do int x = 42; int&& ref = x;. Otherwise, it'd try to move everything! The whole point is to make certain kinds of references only work with certain kinds of expression, so that we can use this to trigger calls to copying/moving functions as appropriate with a minimum of machinery at the callsite.
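The binding rules in the tl;dr can be checked at compile time with a sketch like this one. It uses std::is_constructible as a stand-in for "this reference initialization is well-formed"; rebind_with_move is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// int& ref = 42;              ill-formed
static_assert(!std::is_constructible<int&, int>::value,
              "an rvalue cannot bind to a non-const lvalue reference");
// int x = 42; int&& ref = x;  ill-formed
static_assert(!std::is_constructible<int&&, int&>::value,
              "an lvalue cannot bind to an rvalue reference");
// int&& ref = 42;             fine
static_assert(std::is_constructible<int&&, int>::value,
              "an rvalue binds to an rvalue reference");

// With an explicit std::move the binding is allowed, and ref
// still refers to x itself rather than to a copy.
int rebind_with_move() {
    int x = 42;
    int&& ref = std::move(x);
    ref = 7;
    return x;
}
```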

Why does C++11 have implicit moves for value parameters, but not for rvalue parameters?

In C++11, value parameters (and other values) enjoy implicit move when returned:
A func(A a) {
return a; // uses A::A(A&&) if it exists
}
At least in MSVC 2010, rvalue reference parameters need std::move:
A func(A && a) {
return a; // uses A::A(A const&) even if A::A(A&&) exists
}
I would imagine that inside functions, an rvalue reference and a value behave similarly, with the only difference that in case of values, the function itself is responsible for destruction, while for rvalue references, the responsibility is outside.
What is the motivation for treating them differently in the standard?

The standardization committee expended great effort in creating wording so that moves would only ever happen in exactly two circumstances:
When it is clearly safe to do so.
When the user explicitly asks (via std::move or a similar cast).
A value parameter will unquestionably be destroyed at the end of the function. Therefore, returning it by move is clearly safe; it can't be touched by other code after the return (not unless you're deliberately trying to break things, in which case you probably triggered undefined behavior). Therefore, it can be moved from in the return.
A && variable could be referring to a temporary. But it could be referring to an lvalue (a named variable). It is therefore not clearly safe to move from it; the original variable could be lurking around. And since you didn't explicitly ask to move from it (ie: you didn't call std::move in this function), no movement can take place.
The only time a && will be implicitly moved from (ie: without std::move) is when it is the return value of a function call. std::move<T> returns a T&&. It is legal for that return value to invoke the move constructor, because it is a return value.
Now it is very difficult to call A func(A &&a) with an lvalue without calling std::move (or an equivalent cast). So technically, it should be fine for parameters of && type to be implicitly moved from. But the standards committee wanted moves to be explicit for && types, just to make sure that movement didn't implicitly happen within the scope of this function. That is, it can't use outside-of-function knowledge about where the && comes from.
In general, you should only take parameters by && in two cases: either you're writing a move constructor (or move assignment operator, but even that can be done by value), or you're writing a forwarding function. There may be a few other cases, but you shouldn't take && to a type unless you have something special in mind. If A is a moveable type, then just take it by value.

This was fixed for C++20 by P0527 and P1825. The only way to have a function parameter bind to an rvalue reference is for the source to either be a temporary or for the caller to explicitly cast a non-temporary to an rvalue (for instance, with std::move). Therefore, this "mandatory optimization" was deemed safe.
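A sketch for observing which constructor runs on return. Tracked is a hypothetical type that counts copies and moves. Only the two forms whose behavior does not depend on the language version are shown: returning a value parameter, and returning an rvalue reference parameter through an explicit std::move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records how it was constructed.
struct Tracked {
    int copies = 0;
    int moves = 0;
    Tracked() = default;
    Tracked(const Tracked& o) : copies(o.copies + 1), moves(o.moves) {}
    Tracked(Tracked&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// Value parameter: implicitly moved from on return.
Tracked by_value(Tracked t) { return t; }

// Rvalue reference parameter: the move is requested explicitly.
Tracked by_rvalue_ref(Tracked&& t) { return std::move(t); }
```

Writing return t; in by_rvalue_ref instead would copy under the C++11 rules described in the question and move under the C++20 rules, so that variant is deliberately left out.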

In your first case, the compiler knows that a is going away and nothing will be able to cling on to it: clearly, this object can be moved from and if it is not it will be destroyed. In the second case, the rvalue reference indicates that it is permissible to move from the object and the caller doesn't expect the object to stay around. However, it is the function's choice whether it takes advantage of this permission or not and there may be reasons why the function sometimes wants to move from the argument and sometimes it doesn't want to. If the compiler were given the liberty to move off this object, there would be no way to prevent the compiler from doing so. However, using std::move(a) there is already a way to indicate that it is desired to move from the object.
The general rule in the standard is that the compiler only ever moves objects implicitly which are known to go away. When an rvalue reference comes in, the compiler doesn't really know that the object is about to go away: if it was explicitly std::move()ed it actually stays around.

Use of rvalue reference members?

I was wondering what use an rvalue reference member has
class A {
// ...
// Is this one useful?
Foo &&f;
};
Does it have any benefits or drawbacks compared to an lvalue reference member? What is a prime use case for it?

I've seen one very motivating use case for rvalue reference data members, and it is in the C++0x draft:
template<class... Types>
tuple<Types&&...>
forward_as_tuple(Types&&... t) noexcept;
Effects: Constructs a tuple of
references to the arguments in t
suitable for forwarding as arguments
to a function. Because the result may
contain references to temporary
variables, a program shall ensure that
the return value of this function does
not outlive any of its arguments.
(e.g., the program should typically
not store the result in a named
variable).
Returns: tuple<Types&&...>(std::forward<Types>(t)...)
The tuple has rvalue reference data members when rvalues are used as arguments to forward_as_tuple, and otherwise has lvalue reference data members.
I've found forward_as_tuple subsequently helpful when needing to catch variadic arguments, perfectly forward them packed as a tuple, and re-expand them later at the point of forwarding to a functor. I used forward_as_tuple in this style when implementing an enhanced version of tuple_cat proposed in LWG 1385:
http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#1385
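A sketch of the reference members that std::forward_as_tuple produces, using the function as it was standardized in <tuple>. The consumer consume and the helper pack_and_consume are hypothetical; the tuple is used immediately, in line with the draft's warning about not outliving the arguments.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical consumer: increments the lvalue element, then sums both.
int consume(std::tuple<int&, int&&>&& args) {
    std::get<0>(args) += 1;
    return std::get<0>(args) + std::get<1>(args);
}

int pack_and_consume() {
    int x = 1;
    // An lvalue argument yields an int& member, an rvalue an int&& member.
    static_assert(std::is_same<decltype(std::forward_as_tuple(x, 41)),
                               std::tuple<int&, int&&>>::value,
                  "value categories are encoded in the tuple type");
    int result = consume(std::forward_as_tuple(x, 41));
    assert(x == 2);  // the lvalue was modified through the tuple
    return result;
}
```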

According to Stephan T. Lavavej, rvalue reference data members have no use.
[at 31:00] The thing I've seen programmers do when they get hold of rvalue references is that, they start to go a little crazy, because they're so powerful. They start saying "Oh, I'm gonna have rvalue reference data members, I'm gonna have rvalue reference local variables, I'm gonna have rvalue reference return values!" And then they write code like this: [...]

class A {
// ...
// Is this one useful?
Foo &&f;
};
In this specific case, there is no reason to use an rvalue reference. It doesn't buy you anything you couldn't have done before.
But you may want to define data members with parameterized types. std::tuple is going to support lvalue and rvalue reference data members, for example. This way it allows you to codify an expression's value category which might come in handy for "delayed perfect forwarding". The standard draft even includes a function template of the form
template<class... Args>
tuple<Args&&...> pack_arguments(Args&&...args);
But I'm honestly not sure about its usefulness.

Just thinking out loud here, but wouldn't it have a use in functors? The constructor is often used for "currying", binding some parameters in advance, before the actual function call.
So in this context, the class member is just a staging ground (or a manually implemented closure) for the upcoming function call, and I see no reason why a rvalue reference wouldn't be meaningful there.
But in "regular" non-functor classes, I don't see much point.
