Why the Standard C++ Grammar for Assignment Expression Looks so Weird


From the C++ standard, the grammar for an assignment expression is like this:
assignment-expression:
    conditional-expression
    logical-or-expression assignment-operator assignment-expression
    throw-expression
assignment-operator: one of
    = *= /= %= += -= >>= <<= &= ^= |=
Notice that the left-hand side of the assignment-operator is a logical-or-expression, i.e. something like (4 || localvar1) = 5; is a valid assignment expression according to the grammar. This doesn't make sense to me. Why did they choose a logical-or-expression instead of, say, an identifier or id-expression?

The grammar is a bit complex, but if you continue unrolling with the previous definitions you will see that assignment expressions are very generic and allow for mostly anything. While the snippet from the standard that you quote focuses on logical-or-expression, if you keep unrolling the definition of that you will find that both the left hand side and right hand side of an assignment can be almost any subexpression (although not literally any).
The reason, as pointed out before, is that assignment can be applied to any lvalue expression of enum or fundamental type, or to an rvalue expression of class type (where operator= is always a member function). In a language that allows operator overloading and does not constrain the type returned by an overloaded operator, many expressions can potentially satisfy the requirements of assignment, and the grammar must allow all of those uses.
Different rules in the standard will later limit which of the possible expressions that can be generated from the grammar are actually valid or not.
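As a rough illustration of why the grammar has to admit a logical-or-expression on the left, here is a sketch in which the expression from the question becomes well-formed through overloading. The type Cell and the operator|| overload are invented for this example, and returning a reference from operator|| is deliberately unusual, not something to imitate.

```cpp
#include <cassert>

// Hypothetical type, invented for this sketch.
struct Cell {
    int value = 0;
    Cell& operator=(int v) { value = v; return *this; }
};

// Deliberately odd overload: (int || Cell) yields a reference to the Cell.
Cell& operator||(int, Cell& rhs) { return rhs; }

int assign_through_logical_or() {
    Cell localvar1;
    (4 || localvar1) = 5;   // calls operator||, then Cell::operator=(int)
    assert(localvar1.value == 5);
    return localvar1.value;
}
```

With the built-in || on an int, the same line is rejected by the later semantic rules, not by the grammar.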

Your particular statement, (4 || localvar1) = 5;, is invalid (unless operator|| is overloaded), because the built-in || yields a bool rvalue and you can't assign to an rvalue of a fundamental type. You must have an lvalue (something that can be assigned to) on the left, such as a reference returned by a function.
For example, say you have some function int& get_my_int() that returns the reference to an integer. Then, you can do this:
`get_my_int() = 5;`
This will set the integer returned by get_my_int() to 5.
As with the example in your question, the function MUST return a reference to an integer (and not a value); otherwise, the above statement wouldn't compile.
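A minimal sketch of that difference, assuming a file-scope variable my_int and a second function get_my_copy(), both made up here for contrast:

```cpp
#include <cassert>

// Hypothetical storage, invented for this sketch.
static int my_int = 0;

int& get_my_int()  { return my_int; }  // the call expression is an lvalue
int  get_my_copy() { return my_int; }  // the call expression is an rvalue

int assign_via_reference() {
    get_my_int() = 5;       // OK: assigns to my_int
    // get_my_copy() = 5;   // error if uncommented: not an lvalue
    assert(my_int == 5);
    return my_int;
}
```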

There are actually two interesting things about the C++ grammar for assignment expressions, neither of which has to do with the validity of:
(4 || localvar1) = 5;
That expression is syntactically valid (up to type-checking) because of the parentheses. Any parenthesized expression of reference type is syntactically correct on the left-hand-side of an assignment operator. (And, as has been pointed out, almost any expression which involves a user type or function can be of reference type, as a result of operator overloading.)
What's more interesting about the grammar is that it establishes the left precedence of assignment operators as being lower than almost all other operators, including logical-or, so that the above expression is semantically equivalent to
4 || localvar1 = 5;
even though many readers would interpret the above as 4 || (localvar1 = 5) (which would have been totally correct assuming that localvar1 is of a type which can be assigned to by an int, even though said assignment will never happen -- unless, of course, || is overloaded in this context).
So what has a lower precedence on the left hand side of an assignment operator? As I said, very little, but one important exception is the conditional operator (?:). The following:
// Replace the larger of a and b with c
a > b ? a = c : b = c;
is valid and conveniently parenthesis-less. (Many style guides insist on redundant parentheses here, but I personally rather like the unparenthesized version.) This is not the same as right-hand precedence, so that the following also works without parentheses:
// Replace c with the larger of a and b
c = a > b ? a : b;
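Here is a small sketch of both forms wrapped in functions so the grouping can be checked. The function names are made up, and this relies on the C++ grammar described above.

```cpp
#include <cassert>

// Replace the larger of a and b with c (unparenthesized form from the text).
void replace_larger(int& a, int& b, int c) {
    a > b ? a = c : b = c;   // groups as a > b ? (a = c) : (b = c)
}

// Return the larger of a and b via assignment from a conditional.
int larger_of(int a, int b) {
    int c;
    c = a > b ? a : b;       // groups as c = ((a > b) ? a : b)
    assert(c == a || c == b);
    return c;
}
```

Note that when a and b are equal, replace_larger replaces b, since a > b is false.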
The only other operators which bind less tightly on the left of an assignment operator than the assignment operator are the , operator and another assignment operator. (In other words, assignment is right-associative unlike almost all other binary operators.) Neither of these is surprising -- in fact, they are so necessary that it's easy to miss how important it is to design a grammar in this way. Consider the following unremarkable for clause:
for (first = p = vec.begin(), last = vec.end(); p < last; ++p)
Here the , is a comma operator, and it clearly needs to bind less tightly than either of the assignments which surround it. (C and C++ are only exceptional in this syntax by having a comma operator; in most languages, , is not considered an operator.) Also, it would obviously be undesirable for the first assignment expression to be parsed as (first = p) = vec.begin().
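To see the for clause in a complete function, here is a sketch that sums a vector. The function name and the summing body are my own additions; only the loop header comes from the text.

```cpp
#include <cassert>
#include <vector>

int sum_with_chained_init(const std::vector<int>& vec) {
    std::vector<int>::const_iterator first, p, last;
    int sum = 0;
    // Groups as (first = (p = vec.begin())), (last = vec.end())
    for (first = p = vec.begin(), last = vec.end(); p < last; ++p)
        sum += *p;
    assert(first == vec.begin());   // first got the value assigned to p
    return sum;
}
```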
The fact that assignment operators associate to the right is unremarkable, but it's worth noting for one historical curiosity. When Bjarne Stroustrup was looking around for operators to overload for I/O streams, he settled on << and >> because, although an assignment operator might have been more natural [1], assignment binds to the right, and a streaming operator must bind to the left (std::cout << a << b must be (std::cout << a) << b). However, since << binds much more tightly than assignment, there are a number of gotchas when using the streaming operators. (The one which most recently caught me is that shift binds more tightly than the bitwise operators.)
[Note 1]: I don't have a citation for this, but I remember reading it many years ago in The C++ Programming Language. As I recall, there was not a consensus about assignment operators being natural, but it seems more natural than overloading shift operators to be something completely different from their normal semantics.
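The shift-versus-bitwise gotcha mentioned above can be sketched like this. The function name is made up, and std::ostringstream stands in for std::cout so the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::string low_bit_as_text(int flags) {
    std::ostringstream out;
    // out << flags & 1;   // would group as (out << flags) & 1 and not compile
    out << (flags & 1);    // parentheses needed: << binds more tightly than &
    assert(out.good());
    return out.str();
}
```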

Related

Is using compound assignment operator (+=, ...) on uninitialized variable NOT a UB in C++?

I am trying to create a simple tool to detect the use of uninitialized variables based on the Clang AST. What I know is that the thing that actually causes UB with uninitialized variables is an lvalue-to-rvalue conversion that happens implicitly.
Now what I noticed while examining the AST of a basic example program is that all compound assignment operators do not cause any such cast node to appear! Does this imply that no UB takes place?
int a;
int b = 10;
b += a; // this line is obviously UB...
a += b; // ... but is this one ok?
// Note: no ImplicitCastExpr of LvalueToRvalue type in AST !
That is also true for postfix / prefix increment / decrement operators (in particular, I have absolutely no clue how postfix operators save the 'value' of the variable without copying).
I managed to find some info about increment operators (Is it legal to increment non-initialized variable? - only one reference, unfortunately), but now struggle with comp. assignments. If possible, I'd love to know in particular what happens to increment as well.

Is using compound assignment operator (+=, …) on uninitialized variable NOT a UB in C++?
No, it is UB (except in cases where the standard says it's not).
Standard quotes (from latest draft):
[expr.ass] The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once.
So, we know that the left-hand operand is evaluated, and we also know that its value is used in the operation.
[basic.indet] If an indeterminate value is produced by an evaluation, the behavior is undefined ...
There you have it.
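To make the [expr.ass] quote a bit more concrete, here is a sketch that counts how often the left operand is evaluated. The names slot, calls and storage are made up, and the variable is initialized on purpose so that nothing here is undefined. Both forms read the old value of the left operand, which is exactly what goes wrong when that value is indeterminate.

```cpp
#include <cassert>

static int calls = 0;      // counts evaluations of slot()
static int storage = 10;   // initialized, so reading it is well-defined

int& slot() { ++calls; return storage; }

int calls_for_compound() {
    calls = 0;
    slot() += 1;           // E1 op= E2: slot() is evaluated once
    return calls;
}

int calls_for_expanded() {
    calls = 0;
    slot() = slot() + 1;   // E1 = E1 op E2: slot() is evaluated twice
    return calls;
}
```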

Why are multiple pre-increments allowed in C++ but not in C? [duplicate]

Why is
int main()
{
    int i = 0;
    ++++i;
}
valid C++ but not valid C?

C and C++ say different things about the result of prefix ++. In C++:
[expr.pre.incr]
The operand of prefix ++ is modified by adding 1. The operand shall be
a modifiable lvalue. The type of the operand shall be an arithmetic
type other than cv bool, or a pointer to a completely-defined object
type. The result is the updated operand; it is an lvalue, and it is a
bit-field if the operand is a bit-field. The expression ++x is
equivalent to x+=1.
So ++ can be applied on the result again, because the result is basically just the object being incremented and is an lvalue. In C however:
6.5.3 Unary operators
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
The value of the operand of the prefix ++ operator is incremented. The
result is the new value of the operand after incrementation.
The result is not an lvalue; it's just the pure value of the incrementation. So you can't apply any operator that requires an lvalue on it, including ++.
If you are ever told that C++ and C are a superset or subset of each other, know that this is not the case. There are many differences that make that assertion false.
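The value category difference can be checked at compile time in C++. This is a sketch assuming C++11 or later (for static_assert and decltype); the function name is made up.

```cpp
#include <cassert>
#include <type_traits>

int double_pre_increment() {
    int i = 0;
    // decltype((expr)) is T& exactly when expr is an lvalue of type T.
    // The operand of decltype is not evaluated, so i is unchanged here.
    static_assert(std::is_same<decltype((++i)), int&>::value,
                  "prefix ++ yields an lvalue in C++");
    static_assert(std::is_same<decltype((i++)), int>::value,
                  "postfix ++ yields a prvalue");
    ++++i;   // same as ++(++i)
    assert(i == 2);
    return i;
}
```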

In C, it's always been that way, possibly because prefix ++ can be compiled to a single machine instruction on many CPUs, including ones from the 1970s, which was when the ++ concept was developed.
In C++ though there's the symmetry with operator overloading to consider. To match C, the canonical pre-increment ++ would need to return const &, unless you had different behaviour for user-defined and built-in types (which would be a smell). Restricting the return to const & is a contrivance. So the return of ++ gets relaxed from the C rules, at the expense of increased compiler complexity in order to exploit any CPU optimisations for built-in types.

I assume you understand why it's fine in C++ so I'm not going to elaborate on that.
For whatever it's worth, here's my test result:
t.c:6:2: error: lvalue required as increment operand
++ ++c;
^
Regarding CppReference:
Non-lvalue object expressions
Colloquially known as rvalues, non-lvalue object expressions are the expressions of object types that do not designate objects, but rather values that have no object identity or storage location. The address of a non-lvalue object expression cannot be taken.
The following expressions are non-lvalue object expressions:
all operators not specified to return lvalues, including
increment and decrement operators (note: pre- forms are lvalues in C++)
And Section 6.5.3.1 from n1570:
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation.
So in C, the results of the prefix increment and prefix decrement operators are not required to be lvalues, and thus cannot be incremented again. In fact, that wording can be understood as "required to be an rvalue".

The other answers explain the way that the standards diverge in what they require. This answer provides a motivating example in the area of difference.
In C++, you can have a function like int& foo(int&);, which has no analog in C. It is useful (and not onerous) for C++ to have the option of foo(foo(x));.
Imagine for a moment that operations on basic types were defined somewhere, e.g. int& operator++(int&);. ++++x itself is not a motivating example, but it fits the pattern of foo above.
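A short sketch of that pattern follows. The body of foo and the Counter type are made up for illustration; the text only gives the signature int& foo(int&).

```cpp
#include <cassert>

// Hypothetical body for the int& foo(int&) shape from the text.
int& foo(int& x) { x += 1; return x; }

// Hypothetical counter type, invented for this sketch.
struct Counter {
    int n = 0;
    Counter& operator++() { ++n; return *this; }  // conventional prefix form
};

int nested_calls() {
    int x = 0;
    foo(foo(x));      // works because foo returns an lvalue reference
    Counter c;
    ++++c;            // same pattern through an overloaded operator
    assert(c.n == 2);
    return x;
}
```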

Compiles as C++ but not C (error: lvalue required as unary '&' operand)

This line compiles when I use C++, but not C:
gmtime(&(*(time_t *)alloca(sizeof(time_t)) = time(NULL))); //make an lvalue with alloca
I'm surprised by this difference. There is not even a warning for C++.
When I specify gcc -x c, the message is:
playground.cpp:25:8: error: lvalue required as unary '&' operand
gmtime(&(*(time_t *)alloca(sizeof(time_t)) = time(NULL)));
^
Isn't the & here just an address-of operator? Why is it different in C and C++?
Although I can use compound literals in C, still is it possible to modify my syntax to make it work in both C & C++?

In C11 6.5.16/3:
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.
In C++14 5.17/1:
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.
(Earlier versions of the language standards in each case specified the same thing).
Since the address-of operator can only operate on an lvalue, the code is correct in C++ but not in C.
Regarding the question "Is it possible to modify my syntax to make it work in both C & C++?". This is not a desirable goal; the two languages are different and you should decide what you're writing. This makes about as much sense as trying to stick to syntax that works in both C and Java.
As suggested by others, you could write:
time_t t = time(NULL);
gmtime(&t);
which has several benefits over your original code:
it is simpler, and therefore easier to understand and maintain
it does not depend on the non-standard alloca function
it has no potential alignment violation
it uses no more memory, and perhaps less
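For comparison, here is a sketch of both styles side by side, written as C++. The function names are made up, and the expectation that time_t value 0 maps to the year 1970 is an assumption that holds on common platforms but is not guaranteed by the language.

```cpp
#include <cassert>
#include <ctime>

// Portable style: name the object, then pass its address.
int utc_year_portable(std::time_t when) {
    std::time_t t = when;
    std::tm* utc = std::gmtime(&t);
    return utc ? utc->tm_year + 1900 : -1;
}

// C++-only style: the assignment expression is an lvalue, so & applies to it.
int utc_year_cpp_only(std::time_t when) {
    std::time_t t;
    std::time_t* p = &(t = when);   // valid C++, rejected by a C compiler
    assert(p == &t);
    std::tm* utc = std::gmtime(p);
    return utc ? utc->tm_year + 1900 : -1;
}
```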

Using the result of compound assignment as an lvalue [duplicate]

I'm surprised that this works:
double x = 3;
double y = 2;
(x *= 2) += y;
std::cout << x << std::endl;
The result is 8, which is what it looks like the programmer is trying to achieve. But I thought assignment operators returned an rvalue - how is it that you can assign to the result of one?

The assignment operators for the built in types return an lvalue
in C++ (unlike in C). But you cannot use it to modify the
object without an intervening sequence point, so your example is
undefined behavior (in C++03—C++11 changed a lot here, and
I seem to remember that one of the results made your code
defined).
Regardless of the situation with regards to undefined behavior,
you would be better off writing:
x = 2 * x + y;
It's far more readable. The fact that the assignment operators
result in lvalues is really only usable when the results are
bound immediately to a reference:
T&
SomeClass::f()
{
    // ...
    return aTinSomeClass += 42;
}
And even then, I'd write it in two statements.
(The general rule in C++ is that if the result of an operator
corresponds to the value of an object in memory, then it is an
lvalue. There was no general rule in C.)

In C++ the result of the assignment operators, including the compound assignment operators (such as *=), is an lvalue, and thus assignable.
In C it is an rvalue, so your code is invalid C code.

In C++, compound assignment operators return lvalues, as per §5.17/1:
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.
In this case, the use of *= returns an lvalue denoting the object x which is then used as the left operand of the += operator.
Perhaps you are thinking of the simple arithmetic operators like + and * which do return rvalues.
In C, these operators don't return lvalues and therefore the result can't be further assigned.
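The example from the question can be wrapped in a function with a compile-time check of the value category. This is a sketch assuming C++11 or later, where, as far as I understand the sequencing rules, the inner assignment is complete before its result is used by the outer one. The function name is made up.

```cpp
#include <cassert>
#include <type_traits>

double scale_then_add() {
    double x = 3;
    double y = 2;
    // decltype((expr)) is T& exactly when expr is an lvalue of type T.
    static_assert(std::is_same<decltype((x *= 2)), double&>::value,
                  "compound assignment yields an lvalue in C++");
    (x *= 2) += y;   // x becomes 6, then 8
    assert(x == 8);
    return x;
}
```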

Low level details of C/C++ assignment operator implementation. What does it return?

I'm a total newbie to the C++ world (and C too), and I don't know all its details. But one thing really bothers me: constructions like
while (a=b) {...}
As I understand it, this magic works because the assignment operator in C and C++ returns something.
So the questions are: what does it return? Is this documented? Does it work the same way in C and C++? Low-level details about the assignment operator and its implementation in both C and C++ (if there is a difference) would be much appreciated!
I hope this question won't be closed, because I can't find a comprehensive explanation or good material on this topic, least of all from the low-level point of view.

For built-in types in C++ evaluating an assignment expression produces an lvalue that is the left hand side of the assignment expression. The assignment is sequenced before the result can be used, so when the result is converted to an rvalue you get the newly assigned value:
int a, b=5;
int &c = (a=b);
assert(&c==&a);
b=10;
assert(10==(a=b));
C is almost but not exactly the same. The result of an assignment expression in C is an rvalue the same as the value newly assigned to the left hand side of the assignment.
int *c = &(a=b); // not legal in C because you can only take the address of lvalues.
Usually if the result of an assignment is used at all it's used as an rvalue (e.g., a=b=c), so this difference between C++ and C largely goes unnoticed.

The assignment operator is defined (in C) as returning the value of the variable that was assigned to - i.e. the value of the expression (a=b) is the value of a after the expression has been evaluated.
It can be defined to be something different (of the same type) for user-defined operator overloads in C++, but I suspect most would consider this to be a very unpleasant use of operator overloading.
You can use this (non-boolean) value in a while (or an if, etc.) because of type conversion - using a value in a conditional context causes it to be implicitly converted to something which makes sense in a conditional context. In C++, this is bool, and you can define your own conversion (for your own type) by overloading operator bool(). In C, anything other than 0 is true.
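The classic use of an assignment as a loop condition is the C-style string copy. Here is a sketch of it; the function name and the returned count are my own additions.

```cpp
#include <cassert>
#include <cstddef>

// Copies src into dst, including the terminator, and returns the number of
// characters copied before the terminator. dst must be large enough.
std::size_t copy_cstring(char* dst, const char* src) {
    assert(dst && src);
    std::size_t n = 0;
    // The assignment's value is the char just stored; '\0' converts to false.
    while ((*dst++ = *src++))
        ++n;
    return n;
}
```

The extra parentheses around the assignment are a common way to tell compilers (and readers) that = rather than == is intended.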

To understand such expressions, you first have to understand that any non-zero value (negative values included) is considered true and 0 is considered false.
An assignment evaluates to the value of the left hand side of the operator = after the assignment. So, while(a=b) { } behaves like while(1 /*true*/) if a, after b has been assigned to it, is non-zero. Otherwise, it behaves like while(0 /*false*/).
Similarly, with the conditional operator, (a=b)?1:0 tests the value of a after b has been assigned to it: if it is non-zero, the expression following ? is evaluated; otherwise the expression following : is evaluated.
Assignments evaluate to the value at the left hand side of the operator =, whereas comparison and logical operators (such as ==, &&, etc.) evaluate to 1 or 0.
Note: in C++, this depends on whether the operator is overloaded and, if so, on the return type of the overloaded operator.
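A small sketch of how the assigned value drives the condition, including a negative value, which also counts as true. The function name is made up; the static_assert reflects that comparisons yield bool in C++ (in C they yield an int that is 1 or 0).

```cpp
#include <cassert>
#include <type_traits>

// Returns 1 if the value assigned to a is treated as true, else 0.
int assignment_as_condition(int b) {
    int a = 0;
    int taken = 0;
    if ((a = b))          // the condition is the value of a after assignment
        taken = 1;
    assert(a == b);
    return taken;
}

static_assert(std::is_same<decltype(1 == 2), bool>::value,
              "a comparison yields bool in C++");
```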

The assignment operators in C and C++ return the value of the variable being assigned to, i.e., their left operand. In your example of a = b, the value of this entire expression is the value that is assigned to a (which is the value of b converted into the type of a).
So you can say that the assignment operator "returns" the value of its left operand.
In C++ it's a little more complicated because you can overload the = operator with an actual user-defined function, and have it return something other than the value (and type) of the left operand.
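To illustrate, here is a sketch with two made-up types: one whose operator= follows the usual convention of returning a reference to the left operand, and one that returns something unrelated. The second is shown only to demonstrate what the language permits.

```cpp
#include <cassert>

// Hypothetical types, invented for this sketch.
struct Chainable {
    int v = 0;
    Chainable& operator=(int x) { v = x; return *this; }  // conventional
};

struct Reporting {
    int v = 0;
    // Unconventional: returns whether the stored value changed.
    bool operator=(int x) { bool changed = (v != x); v = x; return changed; }
};

int chained_value() {
    Chainable a, b;
    a = b = 4;            // a = (b = 4): copies the Chainable that b = 4 returns
    assert(b.v == 4);
    return a.v;
}

bool second_store_changes() {
    Reporting r;
    r = 1;
    return r = 1;         // false: the expression's value is not r at all
}
```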
