Assume we write the following in C++: a=b=5; which basically means a=(b=5);
If we have a=5; I know that 5 is a literal and thus it is an R-Value. a is an L-Value. Same goes for b=5;
I'm wondering now, what happens, if we write a=b=5; respectively a=(b=5);
Can I now say the following?
For a, b=5 is an R-Value with a being an L-Value. Also, b is an L-Value and 5 is an R-Value.
What's the R-Value of a?
This depends. The built-in operator= yields an lvalue referring to the left-hand side of the assignment. So, if a and b are ints, then (b = 5) is an lvalue expression and you assign that lvalue to a, with a and b both being lvalues and 5 being a prvalue.
This is generally the same for an overloaded operator= as well, since most people return an lvalue reference, but it does not have to be.
If you want to cast an lvalue into an rvalue, then you use std::move.
I was reading Effective C++: 55 Specific Ways to Improve Your Programs and Designs by Scott Meyers and he stated:
Having a function return a constant value is generally inappropriate, but sometimes doing so can reduce the incidence of client errors without giving up safety or efficiency. For example, consider the declaration of the operator* function:
class Rational { ... };
const Rational operator*(const Rational& lhs, const Rational& rhs);
According to Meyers, doing this prevents "atrocities" like the following, which would be illegal if a and b were primitive types:
Rational a, b, c;
...
(a * b) = c;
This got me confused, and while trying to understand why the above assignment is illegal for primitive types but not for user-defined types, I came across rvalues and lvalues.
I still feel I don't have a strong grasp of what rvalues and lvalues are after looking through some SO questions, but here's my basic understanding: an lvalue references a location in memory and thus can be assigned to (it can appear on either side of the = operator); an rvalue, however, cannot be assigned to because it does not reference a memory location (e.g. temporary values from function returns, and literals).
My question is: why is assigning to a product of two numbers/objects legal for user-defined types (even though it does not make sense) but not for primitives? Does it have to do with return types? Does the overloaded * operator return an assignable value or a temporary value?
[expr.call]/14: A function call is an lvalue if the result type is an lvalue reference type or an rvalue reference to function type, an xvalue if the result type is an rvalue reference to object type, and a prvalue otherwise.
This makes sense, since the result doesn't "have a name". If you returned a reference, the implication would be that it is a reference to some object somewhere that does "have a name" (which is, generally but not always, true).
Then there's this:
[expr.ass]/1: The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
This is saying that an assignment requires an lvalue on the left hand side. So far so good; you've covered this yourself.
How come a non-const function call result works then?
By a special rule!
[over.oper]/8: [..] Some predefined operators, such as +=, require an operand to be an lvalue when applied to basic types; this is not required by operator functions.
… and = applied to an object of class type invokes an operator function.
I can't readily answer the "why": on the surface of it, it made sense to relax this restriction when dealing with classes, and the original (inherited) restriction on built-ins always seemed a little excessive (in my opinion) but would have had to be kept for compatibility reasons.
But then you have people like Meyers pointing out that it now becomes useful (sort of) to return const values to effectively "undo" this change.
Ultimately I wouldn't try too hard to find a strong rationale either way.
The prefix operators return the object itself as an lvalue. The postfix operators return a copy of the object’s original value as an rvalue.
So in a statement like *a++, a is incremented and a copy of the original value of a is returned as an rvalue. But the Microsoft C++ language reference on Lvalues and Rvalues says:
An rvalue is a temporary value that does not persist beyond the expression that uses it
and gives an example
// lvalues_and_rvalues1.cpp
// compile with: /EHsc
#include <iostream>
using namespace std;
int main()
{
int x = 3 + 4;
cout << x << endl;
}
In this example, x is an lvalue because it persists beyond the expression that defines it. The expression 3 + 4 is an rvalue because it evaluates to a temporary value that does not persist beyond the expression that defines it.
My questions:
1) What is the rvalue being returned from *a++ so that it can be dereferenced?
2) Did I misunderstand any concept?
Thanks in advance!
The prefix operators return the object itself as an lvalue. The postfix operators return a copy of the object’s original value as an rvalue.
Wrong! Well, mostly. If the quote is talking about all prefix/postfix operators, then it's completely wrong. However, if it's talking about the ++ and -- prefix/postfix pairs, then it's correct.
Now, taking that into account...
what is the rvalue returning from the *a++ so that it can be dereferenced?
Assuming a is a pointer of some kind, a++ increments a and yields an rvalue consisting of a's value before the increment. The increment and decrement operators, ++ and --, in both postfix and prefix forms, require an lvalue as their operand. This is because rvalues are temporary, that is, their scope is limited by the expression they occur in, so these operators make little or no sense on them. Remember, these operators not only inspect/read, but change/write to the variable itself.
The unary * operator takes a pointer(-like) object and dereferences it, yielding an lvalue found in there. It works for both rvalue and lvalue pointers. This is because * can be considered sort of a "passive" operator. It does not change/write to the pointer itself, but dereferences it and returns the lvalue object at the address stored by the pointer, whose address is of course that contained by the pointer. As all that * needs is the memory address contained in a pointer object, and the address of the pointer itself, if it has one at all, is useless here, * makes sense for both rvalues and lvalues.
You can think that * "requires an rvalue", and that "lvalues can be used as rvalues when necessary", if it clarifies (or confuses?) things a little bit more.
*a++ is equivalent to:
auto temp = *a;
a++;
// Use the value of temp here
except that you can only refer to the value once, whereas you could refer to temp multiple times.
First example
int a = 0;
auto && b = ++a;
++a;
cout << a << b << endl;
prints 22
Second example
int a = 0;
auto && b = a++;
++a;
cout << a << b << endl;
prints 20
Question:
Why in first example ++a in 3rd line also increments b, and why there is no such behavior in second example?
Update: A new question arose.
Because pre-increment (++a) first increments the value of a, stores the result, and then returns the reference to a. Now a and b effectively point to the same object.
Post-increment (a++), however, first stores the current value of a in a temporary, increments a, and returns this temporary - to which your rvalue ref points. a and b point to different objects, more specifically - b is a temporary holding the value of a prior to incrementing.
This is the reason why it's encouraged to use ++it over it++ for iterators and other complex objects that define increment / decrement: the latter creates a temporary copy and thus may be slower.
The difference is that ++a is an lvalue, however a++ is not. This is specified by C++14 [expr.pre.incr]/1:
The operand of prefix ++ is modified by adding 1 [...] The operand shall be a modifiable lvalue. [...] The result is the updated operand; it is an lvalue
and [expr.post.incr]/1:
[...] The result is a prvalue.
Now we consider auto && b = ++a; . ++a is an lvalue. auto&& is a forwarding reference. Forwarding references can actually bind to lvalues: the auto may itself deduce to a reference type. This code deduces to int &b = ++a;.
When a reference is bound to an lvalue of the same type, the reference binds directly, so b becomes another name for a.
In the second example, auto && b = a++;, a++ is a prvalue. This means it doesn't have an associated address and no longer has any relation to the variable a. This line has the same behaviour as auto && b = (a + 0); ++a; would.
Firstly, since a++ is a prvalue, auto&& deduces to int&&. (i.e. auto deduces to int). When a reference of non-class type is bound to a prvalue, a temporary object is copy-initialized from the value. This object has its lifetime extended to match the reference.
So b in the second case is bound to a different object from a, a "temporary" int (which is not really so temporary, since it lasts as long as b does).
The reference binding rules are in [dcl.init.ref].
In the second case (post-increment) b actually references the temporary created for (a++), so the increments do not affect b.
Given:
int& foo(); // don't care what the reference is to
int intVal;
In the following two cases the right hand side is the same function call
int& intRef = foo();
intVal = foo(); // a reference is returned... a value is assigned.
In the second case how is the returned reference "converted" into a value?
Is it done by the assignment operator for the int?
At the language level there's no such concept as "dereferencing a reference". A reference implements the concept of an lvalue. A variable and a reference are basically the same thing. The only difference between a variable and a reference is that the variable is bound to its location in storage automatically, by the compiler, while a reference is generally bound through user action at run time.
In your example, there's no conceptual difference between intRef and intVal. Both are lvalues of type int. And at the conceptual level both are accessed through the same mechanism. You can even think of all variables in your program as references, which were implicitly pre-bound for you by the compiler. This is basically what Bjarne Stroustrup means in TC++PL when he says (not verbatim) that one can think of references as just alternative names for existing variables.
The only moment when the difference between the two is perceptible is when you create these entities and initialize them. Initialization of a reference is an act of binding it to some location in storage. Initialization of a variable is an act of copying the initial value into the existing storage.
But once a reference is initialized, it acts as an ordinary variable: an act of reading/writing a reference is an act of reading/writing the storage location it is bound to. Taking the address of a reference evaluates to the address of the storage location it is bound to. And so on.
It is not a secret that in many cases a reference is implemented internally as a pointer in disguise, i.e. as an invisible pointer that is implicitly dereferenced for you every time you access it. In such cases (when it is really implemented through a pointer) the dereference is done, again, every time you access it. So, it is not the assignment operator that does it, as you ask in your question. It is the very fact that you mentioned the name of that reference in your code that causes the invisible pointer to get dereferenced.
However, an entity that implements "alternative name for existing variable" does not necessarily require storage for itself, i.e. in a compiled language it is not required to be represented by anything material, like a hidden pointer. This is why the language standard states in 8.3.2 that "It is unspecified whether or not a reference requires storage".
foo is returning some reference to an object of type "int". We won't care about where that "int" came from and we'll just assume it exists.
The first line, int& intRef = foo(), creates intRef which also refers to exactly the same object of type "int" as is referenced by the return value of foo.
In the second line, the value of intVal is replaced by the value of the object referred to by the returned reference.
In response to your comments:
You seem to be getting very confused between pointers and references. References are just like aliases for an object. Doing anything to a reference will actually affect the object it refers to.
There is no such thing as dereferencing a reference. You can only dereference pointers. Dereferencing is the act of using the unary * operator to get the object pointed at by a pointer. For example, if you have an int* p, you can do *p to get the object that it points at. This is dereferencing p.
The only time you can do * on a reference is if the object it refers to is a pointer (or if it overloads operator*). In your case, since foo returns an int&, we can't dereference it. The expression *foo() just won't compile. That's because the return value of foo has type "int" which is not a pointer and doesn't overload operator*.
For all intents and purposes, you can treat the reference returned from foo as simply being the object it refers to. Assigning this value to intVal is really no different to assigning x to intVal in the following code:
int intVal;
int x = 5;
intVal = x;
As I'm sure you understand, intVal is given the value of x. This is defined simply by the standard:
In simple assignment (=), the value of the expression replaces that of the object referred to by the left operand.
No conversion needs to occur at all because both sides of the operator are the same type.
This is really no different to your situation. You just have:
intVal = some_ref_to_int;
Where some_ref_to_int is the expression foo(). The fact that it's a reference doesn't matter. intVal receives the value of the object that the reference denotes.
Assigning to intVal is an assignment-expression defined in 5.17 [expr.ass] in the standard. The grammar rules for an assignment-expression are quite complicated, depending on several other grammar rules, but basically you need a modifiable lvalue on the left hand side of the = operator, and a prvalue expression on the right hand side.
In the case of
intVal = foo();
the expression on the RHS is an lvalue of type int, so the built-in lvalue-to-rvalue conversion takes place ... this is barely a conversion, in that the value doesn't change and neither does the type (except that for fundamental types cv-qualifiers are removed, so if the lvalue is type const int the prvalue will be type int). [conv.lval] says
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. [...] If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T. [...] the value contained in the object indicated by the glvalue is the prvalue result.
So the prvalue has type int and the same value as foo() i.e. the same value as the variable the returned reference is bound to.
The rules of assignment expressions say:
In simple assignment (=), the value of the expression replaces that of the object referred to by the left operand.
So the value of intVal will be replaced by the value of the prvalue. The rules continue:
If the left operand is not of class type, the expression is implicitly converted (Clause 4) to the cv-unqualified type of the left operand.
So because int is not a class type (and therefore has no overloaded operator=; it just uses the built-in assignment operator), the assignment will convert the RHS to int, which is the type it already has in your case.
So the value of intVal gets set to the value of the prvalue, which we said is the value of the glvalue expression foo(), i.e. the value of the variable the reference is bound to.
Note that the lvalue-to-rvalue conversion has nothing to do with the RHS being a reference. The same thing happens here:
int val = 0;
intVal = val;
val is an lvalue of type int so it's converted to a prvalue of type int and the value of intVal is set to the value of that prvalue.
The rules are expressed in terms of an expression's "value category" (i.e. lvalue or rvalue) not whether it's a reference or not. Any "dereferencing" of a reference that's needed is done implicitly and invisibly by the compiler in order to implement the required behaviour.
In C++, the pre-increment operator yields an lvalue because the incremented object itself is returned, not a copy.
But in C, it yields an rvalue. Why?
C doesn't have references. In C++, ++i returns a reference to i (an lvalue), whereas in C it returns a copy (of the incremented value).
C99 6.5.3.1/2
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to (E+=1).
‘‘value of an expression’’ <=> rvalue
As for the historical reason, I think "references not being part of C" could be a possible one.
C99 says in the footnote (of section §6.3.2.1),
The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described as the ‘‘value of an expression’’.
Hope that explains why ++i in C yields an rvalue.
As for C++, I would say it depends on the object being incremented. If the object's type is some user-defined type, then its operators may be written so that chaining always compiles. That means you can always write i++++++++ or ++++++i if the type of i is Index as defined here:
Undefined behavior and sequence points reloaded
Off the top of my head, I can't imagine any useful statements that could result from using a pre-incremented variable as an lvalue. In C++, due to the existence of operator overloading, I can. Do you have a specific example of something that you're prevented from doing in C, due to this restriction?