Comma operator in C++11 (sequencing) - c++

The standard mentions f(a,(t=3,t+2),c); which would be an assignment-expression followed by an expression for the 2nd operator according to my understanding.
But the grammar lists it juxtaposed:
expression:
assignment-expression
expression, assignment-expression
Working Draft, Standard for Programming Language C++, Revision N4140 (November 2014)
Is someone so nice as to explain to me please what it is that I'm missing here?

When you see
expression:
assignment-expression
expression, assignment-expression
It means that there are two possibilities for expression. Either it is just an assignment-expression, which is defined earlier in the grammar, or it is defined recursively as expression, assignment-expression.
Expanding the recursion shows that an expression is a comma-separated list of one or more assignment-expressions.
In the sample you mention, the second argument is the expression (t=3,t+2), which consists of two comma-separated assignment-expressions. Since it appears "in contexts where comma is given a special meaning", it has to "appear only in parentheses".
To see why an assignment-expression can take the form t+2, walk down from its definition, always choosing the first alternative:
assignment-expression
-> conditional-expression
--> logical-or-expression
---> logical-and-expression
----> inclusive-or-expression
-----> exclusive-or-expression
------> and-expression
-------> equality-expression
--------> relational-expression
---------> shift-expression
----------> additive-expression - this is what you see

Note that since the definition of expression is
expression:
  assignment-expression
  expression , assignment-expression
the first line means that any assignment-expression can be considered an expression, and the second line lets a comma and another assignment-expression follow an expression, which is why t=3, t+2 is a valid expression.
So why is the grammar this way? First note that the grammar for expressions builds its way in steps from the most tightly bound category primary-expression to the least tightly bound category expression. (And then the fact that "( expression )" is a primary-expression brings the expression grammar full circle and lets us cause any expression to be more tightly bound than everything that surrounds it by adding parentheses.)
For example, the well-known fact that binary * binds tighter than binary + follows from these grammar pieces:
multiplicative-expression:
  pm-expression
  multiplicative-expression * pm-expression
  multiplicative-expression / pm-expression
  multiplicative-expression % pm-expression
additive-expression:
  multiplicative-expression
  additive-expression + multiplicative-expression
  additive-expression - multiplicative-expression
In the expression 2 + 3 * 4, the literals 2, 3, and 4 can be considered a pm-expression, or therefore also a multiplicative-expression or additive-expression. So you might say 2 + 3 would qualify as an additive-expression, but it is not a multiplicative-expression, so the full 2 + 3 * 4 can't work that way. Instead the grammar forces 3 * 4 to be considered a multiplicative-expression, so that 2 + 3 * 4 can be an additive-expression. Therefore 3 * 4 is a subexpression of the binary +.
Or in the expression 2 * 3 + 4, 3 + 4 might be considered an additive-expression, but then it is not a pm-expression, so that doesn't work. Instead the parser must recognize that 2 * 3 is a multiplicative-expression, which is also an additive-expression, so 2 * 3 + 4 is a valid additive-expression, with 2 * 3 as a subexpression of the binary +.
The recursive nature of most grammar definitions matters when the same operator is used twice, or two operators with the same precedence are used.
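As a small illustration of the point about recursion, the following sketch checks the groupings at compile time. The helper name left_grouped_difference is made up for this example; the static_assert lines only restate what the grammar rules quoted above imply.

```cpp
#include <cassert>

// additive-expression is left-recursive, so 10 - 4 - 3 groups as (10 - 4) - 3.
static_assert(10 - 4 - 3 == (10 - 4) - 3, "binary - groups left to right");
static_assert(10 - 4 - 3 != 10 - (4 - 3), "the other grouping gives a different value");

// multiplicative-expression sits below additive-expression in the grammar,
// so * binds tighter than + whichever side it appears on.
static_assert(2 + 3 * 4 == 2 + (3 * 4), "3 * 4 is the right operand of +");
static_assert(2 * 3 + 4 == (2 * 3) + 4, "2 * 3 is the left operand of +");

// Hypothetical helper so the same grouping can be observed at run time.
int left_grouped_difference(int a, int b, int c) { return a - b - c; }

void check_grouping() {
    assert(left_grouped_difference(10, 4, 3) == 3);  // (10 - 4) - 3
}
```

If the grammar were right-recursive for binary -, the first static_assert would still hold trivially, but the second one would fail, which is why it is the more telling check.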
Going back to the comma grammar, if we have the tokens "a, b, c", we might say b, c could be an expression, but it is not an assignment-expression, so b, c cannot be a subexpression of the whole. Instead the grammar requires recognizing a, b as an expression, which is allowed as a left subexpression of another comma operator, so a, b, c is also an expression with a, b as the left operand.
This doesn't make any difference for the built-in comma, since its meaning is associative: "evaluate and discard a, then the result value comes from evaluating (evaluate and discard b, then the result value comes from evaluating c)" does the same as "evaluate and discard (evaluate and discard a, then the result value comes from evaluating b), then the result value comes from evaluating c".
But it does give us a clearly-defined behavior in case of an overloaded operator,. Given:
struct X {};
X operator,(X, X);
X a, b, c;
X d = (a, b, c);
we know that the last line means
X d = operator,(operator,(a,b), c);
and not
X d = operator,(a, operator,(b,c));
(I'd consider it particularly evil to define a non-associative operator,, but it is allowed.)
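To make the grouping visible, here is a minimal sketch with a deliberately non-associative operator,. The type Trace and the function grouping_of_three are invented for this illustration and are not part of the answer above; the operator simply records how it was called.

```cpp
#include <cassert>
#include <string>

// Hypothetical type whose operator, records how its operands were grouped.
struct Trace {
    std::string text;
};

// Deliberately non-associative: the result spells out the call structure.
Trace operator,(const Trace& lhs, const Trace& rhs) {
    return Trace{"(" + lhs.text + "," + rhs.text + ")"};
}

std::string grouping_of_three() {
    Trace a{"a"}, b{"b"}, c{"c"};  // these commas are declarator separators
    Trace d = (a, b, c);           // parsed as operator,(operator,(a, b), c)
    return d.text;
}

void check_overloaded_grouping() {
    assert(grouping_of_three() == "((a,b),c)");
}
```

The string "((a,b),c)" shows a, b as the left operand of the outer call, which matches the left-recursive rule expression , assignment-expression.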

This is the syntax notation (see §1.6 of N4140).
It is mainly used to evaluate precedence, but the name can be misleading.
For example in [expr.ass] (§5.18) you have the following definition:
assignment-expression:
conditional-expression
logical-or-expression assignment-operator initializer-clause
throw-expression
assignment-operator: one of
= *= /= %= += -= >>= <<= &= ^= |=
So an assignment-expression can be a conditional-expression or a throw-expression even if neither performs any assignment.
This just states that a = b, throw 10 or cond ? c : d are expressions with the same precedence order.
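A short sketch of what that means in practice: because a throw-expression is one of the alternatives for assignment-expression, it can appear wherever the grammar asks for one, for instance as the last operand of ?:. The function names checked_half and throws_on_odd are assumptions made for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: the operand after ':' must be an assignment-expression,
// and a throw-expression is one of the alternatives for that category,
// even though it performs no assignment.
int checked_half(int n) {
    return n % 2 == 0 ? n / 2 : throw std::invalid_argument("n must be even");
}

// Reports whether checked_half throws for an odd argument.
bool throws_on_odd() {
    try {
        checked_half(3);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

void check_throw_operand() {
    assert(checked_half(8) == 4);
    assert(throws_on_odd());
}
```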

f(a,(t=3,t+2),c);
Here, 3 is first stored into the variable t, and then the function f() is called with three arguments. The second argument is the value of t+2, so 5 is passed to the function.
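The call can be tried out with a stand-in for f. In this sketch, second_of_three is a hypothetical function that just returns its second argument, and the initial values of a, c, and t are arbitrary.

```cpp
#include <cassert>

// Hypothetical stand-in for f: it just reports its second argument.
int second_of_three(int, int second, int) { return second; }

int demo_parenthesized_comma() {
    int a = 1, c = 9, t = 0;
    // The outer commas separate arguments; the inner one is the comma operator.
    return second_of_three(a, (t = 3, t + 2), c);
}

void check_parenthesized_comma() {
    assert(demo_parenthesized_comma() == 5);  // t = 3, then t + 2
}
```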

Related

In C and C++, is an expression using the comma operator like "a = b, ++a;" undefined?

Take these three snippets of C code:
1) a = b + a++
2) a = b + a; a++
3) a = b + a, a++
Everyone knows that example 1 is a Very Bad Thing, and clearly invokes undefined behavior. Example 2 has no problems. My question is regarding example 3. Does the comma operator work like a semicolon in this kind of expression? Are 2 and 3 equivalent or is 3 just as undefined as 1?
Specifically I was considering this regarding something like free(foo), foo = bar. This is basically the same problem as above. Can I be sure that foo is freed before it's reassigned, or is this a clear sequence point problem?
I am aware that both examples are largely pointless and it makes far more sense to just use a semicolon and be done with it. I'm just asking out of curiosity.
Case 3 is well defined.
First, let's look at how the expression is parsed:
a = b + a, a++
The comma operator , has the lowest precedence, followed by the assignment operator =, the addition operator + and the postincrement operator ++. So with the implicit parenthesis it is parsed as:
(a = (b + a)), (a++)
From here, section 6.5.17 of the C standard regarding the comma operator , says the following:
2 The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value.
Section 5.18 p1 of the C++11 standard has similar language:
A pair of expressions separated by a comma is evaluated left-to-right;
the left expression is a discarded-value expression.
Every value computation and side effect associated with the left
expression is sequenced before every value computation and side effect
associated with the right expression. The type and value of the result
are the type and value of the right operand; the result is of the same
value category as its right operand, and is a bit-field if its right
operand is a glvalue and a bit-field.
Because of the sequence point, a = b + a is guaranteed to be fully evaluated before a++ in the expression a = b + a, a++.
Regarding free(foo), foo = bar, this also guarantees that foo is free'ed before a new value is assigned.
a = b + a, a++; is well-defined, but a = (b + a, a++); can be undefined.
First of all, the operator precedence makes the expression equivalent to (a = (b+a)), a++;, where + has the highest precedence, followed by =, followed by ,. The comma operator includes a sequence point between the evaluation of its left and right operand. So the code is, uninterestingly, completely equivalent to:
a = b + a;
a++;
Which is of course well-defined.
Had we instead written a = (b + a, a++);, then the sequence point in the comma operator wouldn't save the day. Because then the expression would have been equivalent to
(void)(b + a);
a = a++;
In C and in C++14 or older, the two side effects on a in a = a++ are unsequenced relative to each other (see C11 6.5.16/3), meaning this is undefined behavior (per C11 6.5/2). Note that C++11 and C++14 were badly formulated and ambiguous.
In C++17 or later, the operands of the = operator are sequenced right to left, so a = a++ is well-defined.
All of this assuming no C++ operator overloading takes place. In that case, the parameters to the overloaded operator function will be evaluated, a sequence point takes place before the function is called, and what happens from there depends on the internals of that function.
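For the well-defined case, a small sketch can confirm that the comma form and the two-statement form agree. The function names are made up for this example, and only built-in operators on int are involved.

```cpp
#include <cassert>

// Hypothetical helper wrapping example 3 from the question.
int comma_sequenced(int a, int b) {
    a = b + a, a++;  // parsed as (a = (b + a)), (a++)
    return a;
}

// The two-statement form from example 2, for comparison.
int semicolon_sequenced(int a, int b) {
    a = b + a;
    a++;
    return a;
}

void check_comma_sequencing() {
    assert(comma_sequenced(1, 2) == 4);  // a becomes 3, then 4
    assert(comma_sequenced(1, 2) == semicolon_sequenced(1, 2));
}
```

The a = (b + a, a++) form is left out on purpose, since it is the one whose behavior depends on the language version.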

C++ nested conditional operator order of evaluation

For an expression like
x = a ? b : c ? d : e;
I understand that because the ?: operator has right associativity, the expression is grouped as
x = a ? b : (c ? d : e);
However, what about order of evaluation? Does associativity mean that the (c ? d : e) branch evaluated first, and then the answer of it passed as an argument to the left ?: operator? Or is a evaluated first, and then depending on that either b is returned or the (c ? d : e) branch is evaluated? Or is it undefined?
n3376 5.16/1
Conditional expressions group right-to-left. The first expression is
contextually converted to bool (Clause 4). It is evaluated and if it
is true, the result of the conditional expression is the value of the
second expression, otherwise that of the third expression. Only one of
the second and third expressions is evaluated. Every value computation
and side effect associated with the first expression is sequenced
before every value computation and side effect associated with the
second or third expression.
For the conditional operator:
the first operand is evaluated first;
either the second or the third (but not both) is evaluated depending on the value of the first.
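The two rules above can be observed with a sketch that logs which operands are evaluated. The tracer note and the function evaluated_operands are hypothetical names for this illustration; the values 1, 2, and 3 stand in for b, d, and e.

```cpp
#include <cassert>
#include <string>

// Hypothetical tracer: appends a label to the log and returns the given value.
int note(std::string& log, char label, int value) {
    log += label;
    return value;
}

// Evaluates a ? b : c ? d : e with every operand traced.
std::string evaluated_operands(bool a, bool c) {
    std::string log;
    int x = note(log, 'a', a) ? note(log, 'b', 1)
          : note(log, 'c', c) ? note(log, 'd', 2)
                              : note(log, 'e', 3);
    (void)x;  // only the order of evaluation matters here
    return log;
}

void check_conditional_order() {
    assert(evaluated_operands(true, true) == "ab");    // nested ?: never runs
    assert(evaluated_operands(false, true) == "acd");
    assert(evaluated_operands(false, false) == "ace");
}
```

When a is true the log is just "ab", so the nested (c ? d : e) is not evaluated first despite the right-to-left grouping.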

C/C++ How does compiler separate tokens according to operator's precedence and associativity?

Consider the following code:
int a = 3;
int b = 0;
b = a > 0 ? ++b, ++a : --a, b = 0;
After execution, I get the value of b to become 0 and the value of a to become 4.
That means the condition a > 0 evaluated to true and the expression ++a was executed, while the expression b = 0 after the , was executed too. In other words, the expression b = 0 is not an operand of the ternary operator, while ++b is. Otherwise, b = 0 would not have been executed, since the condition did not evaluate to false.
My question is "according to what rule does the compiler kick b = 0 out of the ternary operator's operand?"
The operators in the third statement include ++ and --, which have the highest precedence; >, which has the second highest; ?: and =, which have the third highest; and ,, which has the lowest. I know that operators with higher precedence should determine their operands earlier, so ++, --, and > are handled first. The statement is then equivalent to:
b = (a > 0) ? (++b), (++a) : (--a), b = 0;
Now it's the turn of = and ?:. The associativity of = and ?: is right-to-left, so I assume the compiler parses the statement from the right end. The first operator it meets is =, so b = 0 is grouped together. The second operator it meets is ,. Since its precedence is lower than that of the operators currently being analyzed, I assume the compiler just skips it. Then the compiler meets :, which is part of the ternary operator, so it keeps parsing. (Actually, I don't know how the compiler can know that : is part of ?: before parsing the whole ternary operator.) Here comes the problem. The next operator the compiler meets is ,, but the compiler hasn't finished determining the operands of ?: yet. The , has lower priority than ?:, so in theory it should be skipped. Surprisingly, in a practical test, (++b) and (++a) have been joined by the , operator at this point, and both are treated as one operand of ?:. That confuses me. Why is the last , left out of the operands of ?:, while the earlier , in the statement is kept inside an operand of the ternary operator?
Could someone clarify the concepts of precedence and associativity with this example? I was really confused by the result when I first looked at this piece of code. I had thought that the expression b = 0 was also part of the ternary operator's operand, so that b = 0 would only be executed if a > 0 were false.
Thanks in advance.
Precedence and associativity are different concepts, but technically the C and C++ standards specify neither. Instead, they give grammar rules from which the structure of an expression is deduced.
The relevant rules are:
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
expression:
assignment-expression
expression , assignment-expression
primary-expression:
( expression )
postfix-expression:
primary-expression
...
And so on...
The idea is that each type of expression can generate either a composite expression or the next, more tightly binding type of expression. You can only go back up to the root expression by using parentheses.
With that in mind, note that the conditional-expression that uses the ?: actually has different types of expressions in each of the three subexpressions. The middle one is expression so it will accept any kind of expression, even with , or = (no ambiguity here because of the ending :).
But note that the last one is assignment-expression, that is any kind of expression except the one with ,. If you want to use that, you will have to enclose it with () creating a primary-expression instead.
Bonus explanation: the first expression is logical-or-expression, and if you look carefully at the grammar you'll see that it excludes assignment operators, the conditional operator and the comma operator.
So your expression:
b = a > 0 ? ++b, ++a : --a, b = 0
is actually an expression , assignment-expression, where the first expression is b = a > 0 ? ++b, ++a : --a and the second assignment-expression is b = 0.
And so on...
Your expression is evaluated as (b = ((a > 0) ? (++b, ++a) : (--a))), (b = 0);.
As you say, ?: has higher precedence than the comma operator, so b = 0 does not belong to the ternary conditional. The difference between the left and the right part of the ternary operator is this: on the left side of the :, the compiler treats the complete string ++b, ++a as one expression (knowing that the part between ? and : must be an expression), while on the right side the compiler parses an expression only as far as it can. Operator precedence says the compiler must stop at the ,. On the left side the compiler does not stop at the , because it is a legal part of the expression.
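The parse can be checked by running the statement as written next to a parenthesized variant. In this sketch the function names are invented, and the second function is a hypothetical rewrite that pulls b = 0 into the third operand to show the contrast.

```cpp
#include <cassert>
#include <utility>

// The statement from the question, unparenthesized.
// It groups as (b = ((a > 0) ? (++b, ++a) : (--a))), (b = 0).
std::pair<int, int> as_written() {
    int a = 3;
    int b = 0;
    b = a > 0 ? ++b, ++a : --a, b = 0;
    return {a, b};
}

// Hypothetical variant: parentheses make b = 0 part of the third operand,
// so it only runs when a > 0 is false.
std::pair<int, int> third_operand_parenthesized() {
    int a = 3;
    int b = 0;
    b = a > 0 ? (++b, ++a) : (--a, b = 0);
    return {a, b};
}

void check_conditional_comma() {
    assert(as_written() == std::make_pair(4, 0));                  // b = 0 always runs
    assert(third_operand_parenthesized() == std::make_pair(4, 4)); // b keeps the value of ++a
}
```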

Comma operator precedence while used with ? : operator [duplicate]

I have no idea why the results of the two snippets below are different:
int a , b;
a = 13, b=12;
(a > b)? (a++,b--):(a--,b++); // Now a is 14 and b is 11
a = 13, b=12;
(a > b)? a++,b-- : a--,b++; // Now a is 14 but b is 12
However for these cases, the results are identical:
a = 13, b=12;
(a < b) ? a++,b-- : a--,b++; // Now a is 12 and b is 13
a = 13, b=12;
(a < b) ? (a++,b--) : (a--,b++); // Again a is 12 and b is 13
Why do parentheses make a difference for the expression after "?", but no difference for the expression after ":"? Do you have any idea?
This one:
(a > b)? a++,b-- : a--,b++;
is equivalent to:
((a > b) ? (a++, b--) : a--), b++;
so b is always incremented and only sometimes decremented. There is no way to parse the comma operator between ? and : other than as parenthesized in the 'equivalent to' expression. But after the :, the unparenthesized comma terminates the ternary ?: operator and leaves the increment as unconditionally executed. The precedence of the comma operator is very, very low.
The relevant parts of the C++ grammar are:
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
assignment-expression:
conditional-expression
logical-or-expression assignment-operator assignment-expression
throw-expression
expression:
assignment-expression
expression, assignment-expression
In summary, while the 'middle' of a conditional expression can be a full expression extending up to the :, the last sub-expression can only be an assignment-expression which excludes expressions using the comma operator (other than where the comma operator appears as part of valid sub-expression of an assignment-expression such as a parenthesized primary-expression or as the second operand of another conditional-expression).
In C, the last sub-expression is more restricted: it cannot even be an assignment-expression, although this is not a concern in your example.
conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-expression
In this case
(a > b)? a++,b-- : a--,b++;
It is equivalent to
((a > b)? a++,b-- : a--),b++;
I guess it's because x ? y cannot be considered a valid expression, so the comma can't split the operator there. x ? y : z is a valid expression, so the comma after the colon can split it into two expressions.
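The four cases from the question can be reproduced with a sketch like the following. The two function names are assumptions for this example; each takes a and b by value and returns them after the statement has run.

```cpp
#include <cassert>
#include <utility>

// Unparenthesized form: groups as ((a > b) ? (a++, b--) : a--), b++;
std::pair<int, int> unparenthesized_branches(int a, int b) {
    (a > b) ? a++, b-- : a--, b++;
    return {a, b};
}

// Fully parenthesized form: b++ belongs to the third operand.
std::pair<int, int> parenthesized_branches(int a, int b) {
    (a > b) ? (a++, b--) : (a--, b++);
    return {a, b};
}

void check_branches() {
    // a > b: the trailing b++ undoes b-- in the unparenthesized form.
    assert(unparenthesized_branches(13, 12) == std::make_pair(14, 12));
    assert(parenthesized_branches(13, 12) == std::make_pair(14, 11));
    // a < b: both forms run a-- and b++ exactly once.
    assert(unparenthesized_branches(12, 13) == std::make_pair(11, 14));
    assert(parenthesized_branches(12, 13) == std::make_pair(11, 14));
}
```

The forms differ only when the condition is true, because that is the only case in which the unconditional b++ is not already part of the branch that runs.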

Is comma operator free from side effect?

For example for such statement:
c += 2, c -= 1
Is it true that c += 2 will always be evaluated first, and that c in the second expression c -= 1 will always have the updated value from c += 2?
Yes, it is guaranteed by the standard, as long as that comma is a non-overloaded comma operator. Quoting n3290 §5.18:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5) [footnote 83]. Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.
And the corresponding footnote:
83) However, an invocation of an overloaded comma operator is an ordinary function call; hence, the evaluations of its argument expressions are unsequenced relative to one another (see 1.9).
So this holds only for the non-overloaded comma operator.
The , between arguments to a function are not comma operators. This rule does not apply there either.
For C++03, the situation is similar:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is discarded. The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are not applied to the left expression. All side effects (1.9) of the left expression, except for the destruction of temporaries (12.2), are performed before the evaluation of the right expression. The type and value of the result are the type and value of the right operand; the result is an lvalue if its right operand is.
Restrictions are the same though: does not apply to overloaded comma operators, or function argument lists.
Yes, the comma operator guarantees that its operands are evaluated in left-to-right order, and the returned value is that of the rightmost operand.
Be aware, however, that the comma in some contexts is not the comma operator. For example, the above is not guaranteed for function argument lists.
Yes, in C++ the comma operator is a sequence point and those expressions will be evaluated in the order they are written. See 5.18 in the current working draft:
[snip] is evaluated left-to-right. [snip]
I feel that your question is lacking some explanation as to what you mean by "side effects". Every statement in C++ is allowed to have a side effect and so is an overloaded comma operator.
Why is the statement you have written not valid in a function call?
It's all about sequence points. In C++ and C it is forbidden to modify a value twice between two sequence points. If your example truly uses operator,, the two compound assignments are separated by a sequence point. If you use it like this, foo(c += 2, c -= 2), the order of evaluation is unspecified. I'm actually unsure if the second case is undefined behaviour, as I do not know if an argument list is one or many sequence points. I ought to ask a question about this.
It should always be evaluated from left to right, as this is in the definition of the comma operator.
You've got two questions.
The first question: "Is comma operator free from side effect?"
The answer to this is no. The comma operator naturally facilitates writing expressions with side effects, and deliberately writing expressions with side effects is what the operator is commonly used for. E.g., in while (cin >> str, str != "exit") the state of the input stream is changed, which is an intentional side effect.
But maybe you don't mean side-effect in the computer science sense, but in some ad hoc sense.
Your second question: "For example for such statement: c += 2, c -= 1 Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?"
The answer to this is yes in the case of a statement or expression, except when the comma operator is overloaded (very unusual). However, sequences like c += 2, c -= 1 can also occur in argument lists, in which case, what you've got is not an expression, and the comma is not a sequence operator, and the order of evaluation is not defined. In foo(c += 2, c -= 1) the comma is not a comma operator, but in foo((c += 2, c -= 1)) it is, so it may pay to pay attention to the parentheses in function calls.
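The difference the parentheses make in a call can be seen with a sketch that counts arguments. The overload set argument_count is hypothetical, and the call uses plain literals rather than c += 2, c -= 1 so that the example does not depend on argument evaluation order.

```cpp
#include <cassert>

// Hypothetical overload set that reports how many arguments it received.
int argument_count(int) { return 1; }
int argument_count(int, int) { return 2; }

// Built-in comma operator: c += 2 completes before c -= 1 starts.
int sequenced_update() {
    int c = 0;
    c += 2, c -= 1;
    return c;
}

// Here the comma separates two arguments.
int separator_call() { return argument_count(1, 2); }

// Here the extra parentheses turn it into one comma expression, one argument.
int operator_call() { return argument_count((1, 2)); }

void check_comma_roles() {
    assert(sequenced_update() == 1);
    assert(separator_call() == 2);
    assert(operator_call() == 1);
}
```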