How does multiple assignment (a = b) = c syntax work? [closed] - c++

How does a statement like (a = b) = c; work in C++, assuming a,b and c are ints or any other primitive type?

The assignment expression a = b is not an lvalue in C, but it is in C++:
C11, 6.5.16 (Assignment operators):
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.
C++14, 5.18 [expr.ass] (Assignment and compound assignment operators):
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.
In the evolution of C++ from C, several expressions were made "lvalue-aware", as it were, because lvalues are much more important in C++ than in C. In C, everything is trivial (trivially copyable and trivially destructible, all in the words of C++), so lvalue-to-rvalue conversions (or "lvalue conversions", as C calls them) aren't painful. In C++, copying and destruction are non-trivial concepts, and by making expressions preserve lvalue-ness, a lot of copying and destructing can be avoided that was never necessary to begin with.
Another example is the conditional expression (a ? b : c), which is not an lvalue in C, but can be an lvalue in C++.
Another interesting artefact of this language evolution is that C has four well-defined storage durations (automatic, static, thread-local, dynamic), but in C++ this becomes more muddled, since temporary objects are a non-trivial concept in C++ that almost calls for its own storage duration. (E.g. Clang internally has a fifth, "full expression" storage duration.) Temporaries are of course the result of lvalue-to-rvalue conversion, so by avoiding the conversion, there's one less thing to worry about.
(Please note that all of this discussion only applies to the respective core language expressions. C++ also has the separate, unrelated feature of operator overloading, which produces function call expressions, which have all the usual semantics of function calls and have nothing to do with operators except for the syntax. For example, you can define an overloaded operator= that returns a prvalue or void if you so wish.)

Informally, in C++, for builtin types, the result of a = b is a reference to a; you can assign a value to that reference, just as with any other reference. So (a = b) = c assigns the value of b to a, and then assigns the value of c to a.
For user-defined types this may not apply, although the usual idiom is for an assignment operator to return a reference to the left-hand argument, so the behavior of user-defined types mimics the behavior of builtin types:
struct S {
    S& operator=(const S& rhs) {
        return *this;
    }
};
Now, S a, b, c; (a = b) = c; means call a.operator=(b), which returns a reference to a; then call S::operator= on that result and c, effectively calling a.operator=(c).

(a = b) = c is a valid statement in C++. Here '=' is working as an assignment operator. The parentheses force a = b to be evaluated first; its result is an lvalue referring to a, so b's value is assigned to a and then c's value is assigned to a.
For example:
#include <iostream>

int main() {
    int a = 5;
    int b = 2;
    int c = 7;
    int answer = (a = b) = c;  // a = b first, then a = c
    std::cout << answer << std::endl;
}
Output:
7

The following is a little speculation, so please correct me if I am wrong.
When they invented operator overloading, they had to come up with a standard-looking general form of an assignment operator for any class T. For example:
T& T::operator=(T);
T& T::operator=(const T&);
Here, it returns a reference to T, instead of just T to make three-part assignment like x = (y = z) efficient, not requiring a copy.
It could return a const reference to T, which would make unwanted assignment (a = b) = c an error. I guess that they didn't use this because of two reasons:
Shorter code - don't need to write all these consts all the time (the fine details of const-correctness were not clear at that time)
More flexibility - allows code like (a = b).print(), where print is a non-const method (because the programmer was lazy/ignorant of const-correctness)
The semantics for primitive types (which are not classes) were kind-of extrapolated, to give:
int& operator=(int&, int); // not real code; just a concept
The "return type" is not const int& so it matches the pattern with classes. So, if the buggy (a = b) = c code is valid for user-defined types, it should be valid also for built-in types, as required by C++ design principles. And once you document this kind of stuff, you cannot change it because of backward compatibility.


What type of value do overloaded operators return (for user-defined types): rvalue or lvalue?

I was reading Effective C++: 55 Specific Ways to Improve Your Programs and Designs by Scott Meyers and he stated:
Having a function return a constant value is generally inappropriate, but sometimes doing so can reduce the incidence of client errors without giving up safety or efficiency. For example, consider the declaration of the operator* function:
class Rational { ... };
const Rational operator*(const Rational& lhs, const Rational& rhs);
According to Meyers, doing this prevents "atrocities" like the following, which would be illegal if a and b were of primitive types:
Rational a, b, c;
...
(a * b) = c;
This confused me, and while trying to understand why the above assignment is illegal for primitive types but not for user-defined types, I came across rvalues and lvalues.
I still feel I don't have a strong grasp of what rvalues and lvalues are after looking through some SO questions, but here's my basic understanding: an lvalue refers to a location in memory and thus can be assigned to (it can appear on either side of the = operator); an rvalue, however, cannot be assigned to because it does not refer to a memory location (e.g. temporary values returned from functions, and literals).
My question is: why is assigning to a product of two numbers/objects legal for user-defined types (even though it does not make sense) but not for primitives? Does it have to do with return types? Does the overloaded * operator return an assignable value or a temporary value?
[expr.call]/14: A function call is an lvalue if the result type is an lvalue reference type or an rvalue reference to function type, an xvalue if the result type is an rvalue reference to object type, and a prvalue otherwise.
This makes sense, since the result doesn't "have a name". If you returned a reference, the implication would be that it is a reference to some object somewhere that does "have a name" (which is, generally but not always, true).
Then there's this:
[expr.ass]/1: The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
This is saying that an assignment requires an lvalue on the left hand side. So far so good; you've covered this yourself.
How come a non-const function call result works then?
By a special rule!
[over.oper]/8: [..] Some predefined operators, such as +=, require an operand to be an lvalue when applied to basic types; this is not required by operator functions.
… and = applied to an object of class type invokes an operator function.
I can't readily answer the "why": on the surface of it, it made sense to relax this restriction when dealing with classes, and the original (inherited) restriction on built-ins always seemed a little excessive (in my opinion) but would have had to be kept for compatibility reasons.
But then you have people like Meyers pointing out that it now becomes useful (sort of) to return const values to effectively "undo" this change.
Ultimately I wouldn't try too hard to find a strong rationale either way.

Why are multiple pre-increments allowed in C++ but not in C? [duplicate]

Why is
int main()
{
    int i = 0;
    ++++i;
}
valid C++ but not valid C?
C and C++ say different things about the result of prefix ++. In C++:
[expr.pre.incr]
The operand of prefix ++ is modified by adding 1. The operand shall be
a modifiable lvalue. The type of the operand shall be an arithmetic
type other than cv bool, or a pointer to a completely-defined object
type. The result is the updated operand; it is an lvalue, and it is a
bit-field if the operand is a bit-field. The expression ++x is
equivalent to x+=1.
So ++ can be applied on the result again, because the result is basically just the object being incremented and is an lvalue. In C however:
6.5.3 Unary operators
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
The value of the operand of the prefix ++ operator is incremented. The
result is the new value of the operand after incrementation.
The result is not an lvalue; it's just the pure value of the incrementation. So you can't apply any operator that requires an lvalue on it, including ++.
If you are ever told that one of C++ and C is a superset or subset of the other, know that it is not the case. There are many differences that make that assertion false.
In C, it's always been that way, possibly because a pre-increment can be compiled to a single machine instruction on many CPUs, including ones from the 1970s, which was when the ++ concept developed.
In C++ though there's the symmetry with operator overloading to consider. To match C, the canonical pre-increment ++ would need to return const &, unless you had different behaviour for user-defined and built-in types (which would be a smell). Restricting the return to const & is a contrivance. So the return of ++ gets relaxed from the C rules, at the expense of increased compiler complexity in order to exploit any CPU optimisations for built-in types.
I assume you understand why it's fine in C++ so I'm not going to elaborate on that.
For whatever it's worth, here's my test result:
t.c:6:2: error: lvalue required as increment operand
++ ++c;
^
According to cppreference:
Non-lvalue object expressions
Colloquially known as rvalues, non-lvalue object expressions are the expressions of object types that do not designate objects, but rather values that have no object identity or storage location. The address of a non-lvalue object expression cannot be taken.
The following expressions are non-lvalue object expressions:
all operators not specified to return lvalues, including
increment and decrement operators (note: pre- forms are lvalues in C++)
And Section 6.5.3.1 from n1570:
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation.
So in C, the result of the prefix increment and prefix decrement operators is not an lvalue, and thus cannot be incremented again. In effect, this wording can be read as "required to be an rvalue".
The other answers explain the way that the standards diverge in what they require. This answer provides a motivating example in the area of difference.
In C++, you can have a function like int& foo(int&);, which has no analog in C. It is useful (and not onerous) for C++ to have the option of foo(foo(x));.
Imagine for a moment that operations on basic types were defined somewhere, e.g. int& operator++(int&);. ++++x itself is not a motivating example, but it fits the pattern of foo above.

Why the Standard C++ Grammar for Assignment Expression Looks so Weird

From the C++ standard, the grammar for an assignment expression is like this:
assignment-expression:
    conditional-expression
    logical-or-expression assignment-operator assignment-expression
    throw-expression
assignment-operator: one of
    = *= /= %= += -= >>= <<= &= ^= |=
Notice that the left-hand side of the "assignment-operator" is a "logical-or-expression", i.e. something like (4 || localvar1) = 5; is a valid assignment expression according to the grammar. This doesn't make sense to me. Why did they choose a "logical-or-expression" instead of, say, an identifier or id-expression?
The grammar is a bit complex, but if you continue unrolling with the previous definitions you will see that assignment expressions are very generic and allow for mostly anything. While the snippet from the standard that you quote focuses on logical-or-expression, if you keep unrolling the definition of that you will find that both the left hand side and right hand side of an assignment can be almost any subexpression (although not literally any).
The reason as pointed out before is that assignment can be applied to any lvalue expression of enum or fundamental type or rvalue expression of a class type (where operator= is always a member). Many expressions, in a language that allows for operator overloading and that does not define what the type returned from the operator is, can potentially fulfill the needs of assignment, and the grammar must allow all of those uses.
Different rules in the standard will later limit which of the possible expressions that can be generated from the grammar are actually valid or not.
Your particular statement, (4 || localvar1) = 5; is invalid (unless operator|| is overloaded), because the built-in || yields a bool r-value, and you can't assign 5 to an r-value. You must have an l-value (something that can be assigned to) on the left, such as a reference returned by a function.
For example, say you have some function int& get_my_int() that returns the reference to an integer. Then, you can do this:
`get_my_int() = 5;`
This will set the integer returned by get_my_int() to 5.
Just like in your first post, this MUST be a reference to an integer (and not a value); otherwise, the above statement wouldn't compile.
There are actually two interesting things about the C++ grammar for assignment statements, neither of which has to do with the validity of:
(4 || localvar1) = 5;
That expression is syntactically valid (up to type-checking) because of the parentheses. Any parenthesized expression of reference type is syntactically correct on the left-hand-side of an assignment operator. (And, as has been pointed out, almost any expression which involves a user type or function can be of reference type, as a result of operator overloading.)
What's more interesting about the grammar is that it establishes the left precedence of assignment operators as being lower than almost all other operators, including logical-or, so that the above expression is semantically equivalent to
4 || localvar1 = 5;
even though many readers would interpret the above as 4 || (localvar1 = 5) (which would have been totally correct assuming that localvar1 is of a type which can be assigned to by an int, even though said assignment will never happen -- unless, of course, || is overloaded in this context).
So what has a lower precedence on the left hand side of an assignment operator? As I said, very little, but one important exception is ?::
// Replace the larger of a and b with c
a > b ? a = c : b = c;
is valid and conveniently parenthesis-less. (Many style guides insist on redundant parentheses here, but I personally rather like the unparenthesized version.) This is not the same as right-hand precedence, so that the following also works without parentheses:
// Replace c with the larger of a and b
c = a > b ? a : b;
The only other operators which bind less tightly on the left of an assignment operator than the assignment operator are the , operator and another assignment operator. (In other words, assignment is right-associative unlike almost all other binary operators.) Neither of these is surprising -- in fact, they are so necessary that it's easy to miss how important it is to design a grammar in this way. Consider the following unremarkable for clause:
for (first = p = vec.begin(), last = vec.end(); p < last; ++p)
Here the , is a comma operator, and it clearly needs to bind less tightly than either of the assignments which surround it. (C and C++ are only exceptional in this syntax by having a comma operator; in most languages, , is not considered an operator.) Also, it would obviously be undesirable for the first assignment expression to be parsed as (first = p) = vec.begin().
The fact that assignment operators associate to the right is unremarkable, but it's worth noting for one historical curiosity. When Bjarne Stroustrup was looking around for operators to overload for I/O streams, he settled on << and >> because, although an assignment operator might have been more natural [1], assignment binds to the right, and a streaming operator must bind to the left (std::cout << a << b must be (std::cout << a) << b). However, since << binds much more tightly than assignment, there are a number of gotchas when using the streaming operators. (The one which most recently caught me is that shift binds more tightly than the bitwise operators.)
[Note 1]: I don't have a citation for this, but I remember reading it many years ago in The C++ Programming Language. As I recall, there was not a consensus about assignment operators being natural, but it seems more natural than overloading shift operators to be something completely different from their normal semantics.

Using the result of compound assignment as an lvalue [duplicate]

I'm surprised that this works:
double x = 3;
double y = 2;
(x *= 2) += y;
std::cout << x << std::endl;
The result is 8, which is what it looks like the programmer is trying to achieve. But I thought assignment operators returned an rvalue - how is it that you can assign to the result of one?
The assignment operators for the built in types return an lvalue
in C++ (unlike in C). But you cannot use it to modify the
object without an intervening sequence point, so your example is
undefined behavior (in C++03—C++11 changed a lot here, and
I seem to remember that one of the results made your code
defined).
Regardless of the situation with regards to undefined behavior,
you would be better off writing:
x = 2 * x + y;
It's far more readable. The fact that the assignment operators
result in lvalues is really only usable when the results are
bound immediately to a reference:
T&
SomeClass::f()
{
    // ...
    return aTinSomeClass += 42;
}
And even then, I'd write it in two statements.
(The general rule in C++ is that if the result of an operator
corresponds to the value of an object in memory, then it is an
lvalue. There was no general rule in C.)
In C++ the result of an assignment operator, including the compound assignment operators (such as *=), is an l-value, and thus assignable.
In C it is an r-value, so your code is invalid C.
In C++, compound assignment operators return lvalues, as per §5.17/1:
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.
In this case, the use of *= returns an lvalue denoting the object x which is then used as the left operand of the += operator.
Perhaps you are thinking of the simple arithmetic operators like + and * which do return rvalues.
In C, these operators don't return lvalues and therefore the result can't be further assigned.

Low level details of C/C++ assignment operator implementation. What does it return?

I'm a total newbie to the C++ world (and C too), and I don't know all its details. But one thing really bothers me.
It is constructions like:
while (a=b) {...}. As I understand it, this magic works because the assignment operator in C and C++ returns something.
So the questions: what does it return? Is this a documented thing? Does it work the same in C and C++? Low-level details about the assignment operator and its implementation in both C and C++ (if there is a difference) would be much appreciated!
I hope that this question won't be closed, because I can't find a comprehensive explanation or good material on this topic, least of all from the low-level point of view.
For built-in types in C++ evaluating an assignment expression produces an lvalue that is the left hand side of the assignment expression. The assignment is sequenced before the result can be used, so when the result is converted to an rvalue you get the newly assigned value:
int a, b=5;
int &c = (a=b);
assert(&c==&a);
b=10;
assert(10==(a=b));
C is almost but not exactly the same. The result of an assignment expression in C is an rvalue the same as the value newly assigned to the left hand side of the assignment.
int *c = &(a=b); // not legal in C because you can only take the address of lvalues.
Usually if the result of an assignment is used at all it's used as an rvalue (e.g., a=b=c), so this difference between C++ and C largely goes unnoticed.
The assignment operator is defined (in C) as returning the value of the variable that was assigned to - i.e. the value of the expression (a=b) is the value of a after the expression has been evaluated.
It can be defined to be something different (of the same type) for user-defined operator overloads in C++, but I suspect most would consider this to be a very unpleasant use of operator overloading.
You can use this (non-boolean) value in a while (or an if, etc.) because of type conversion - using a value in a conditional context causes it to be implicitly converted to something which makes sense in a conditional context. In C++, this is bool, and you can define your own conversion (for your own type) by overloading operator bool(). In C, anything other than 0 is true.
To understand such expressions, you first have to understand that any non-zero value (negative values included) is considered 'true' and 0 is considered false.
An assignment evaluates to the value of the left-hand side of the operator = after the assignment. So, while(a=b) { } means while(1 /*true*/) if a, after being assigned the value of b, is non-zero. Otherwise, it is treated as while(0 /*false*/).
Similarly, with the conditional operator, the condition in (a=b)?1:0 is the value of a after being assigned the value of b: if it is non-zero, the condition is taken as true and the expression following ? is evaluated; otherwise the expression following : is evaluated.
Assignments usually evaluate to the value of the left-hand side of the operator =, whereas comparison and logical operators (such as ==, && etc.) evaluate to 1 or 0.
Note: in C++, this depends on whether or not a given operator is overloaded, and on the return type of the overloaded operator.
The assignment operators in C and C++ return the value of the variable being assigned to, i.e., their left operand. In your example of a = b, the value of this entire expression is the value that is assigned to a (which is the value of b converted into the type of a).
So you can say that the assignment operator "returns" the value of its left operand.
In C++ it's a little more complicated because you can overload the = operator with an actual user-defined function, and have it return something other than the value (and type) of the left operand.