I read somewhere that the ?: operator in C is slightly different in C++, that there's some source code that works differently in both languages. Unfortunately, I can't find the text anywhere. Does anyone know what this difference is?
The conditional operator in C++ can return an lvalue, whereas C does not allow for similar functionality. Hence, the following is legal in C++:
(true ? a : b) = 1;
To replicate this in C, you would have to resort to if/else, or work with pointers explicitly:
*(true ? &a : &b) = 1;
Also in C++, ?: and = operators have equal precedence and group right-to-left, such that:
(true ? a = 1 : b = 2);
is valid C++ code, but is a compile error in C without parentheses around the last expression:
(true ? a = 1 : (b = 2));
The principal practical difference is that in C, evaluation of ?: can never result in an lvalue, whereas in C++ it can.
There are other differences in its definition which have few practical consequences. In C++ the first operand is converted to a bool; in C it is compared against 0. This is analogous to the difference in definition of ==, !=, etc. between C and C++.
There are also more complex rules in C++ for deducing the type of a ?: expression based on the types of the 2nd and 3rd operands. This reflects the possibility of user-defined implicit conversions in C++.
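A minimal sketch of how the result type and value category of ?: depend on the second and third operands in C++. The names a, b, d, Pair and assign_selected are made up for illustration; only built-in types are covered, not the user-defined conversion cases.

```cpp
#include <type_traits>

// Hypothetical variables, used only inside unevaluated decltype operands.
int a = 3, b = 5;
double d = 1.5;

// Both operands are lvalues of type int: the result is an lvalue (int&).
static_assert(std::is_same<decltype(true ? a : b), int&>::value,
              "two int lvalues give an int lvalue");

// One operand is a prvalue: the result is a prvalue of type int.
static_assert(std::is_same<decltype(true ? a : 1), int>::value,
              "lvalue and prvalue give a prvalue");

// int and double: the usual arithmetic conversions give a prvalue double.
static_assert(std::is_same<decltype(true ? a : d), double>::value,
              "int and double give double");

struct Pair { int a; int b; };

// Assigns 7 to whichever member the condition selects (C++ only).
Pair assign_selected(bool pick_first) {
    Pair p{3, 5};
    (pick_first ? p.a : p.b) = 7;
    return p;
}
```

Only the first case yields something that can appear on the left of an assignment, which is why (x ? a : b) = 7 is accepted in C++.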
Example code. Valid C++; invalid C.
extern int h(int p, int q);
int g(int x)
{
int a = 3, b = 5;
(x ? a : b) = 7;
return h( a, b );
}
gcc generates the error: "error: invalid lvalue in assignment" when compiling as C, but the code compiles without error when compiling as C++.
Edit:
Although ?: can't return an l-value in C, perhaps surprisingly the grammar for ?: is:
conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-expression
This means that a ? b : c = d parses as (a ? b : c) = d even though (due to the 'not an l-value' rule) this can't result in a valid expression.
C++ changes the grammar to this:
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
While the extension to allow conditional-expression to be an l-value in some situations would have made a ? b : c = d valid without the grammar change, the new grammar change means that the expression is now valid but with the different meaning of a ? b : (c = d).
Although I don't have any evidence for it, my supposition is that, since the grammar change couldn't break compatibility with existing C code, the new grammar was chosen because it was more likely to produce fewer surprises with expressions such as:
make_zero ? z = 0 : z = 1;
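The effect of the C++ grammar can be checked with a small sketch, compiled as C++ only. The function names and the Pair struct are hypothetical; the point is that an assignment after the : becomes the third operand rather than being applied to the whole conditional.

```cpp
// Returns z after the unparenthesized form has run.
// In C++ this parses as: make_zero ? (z = 0) : (z = 1);
int set_flag_value(bool make_zero) {
    int z = -1;
    make_zero ? z = 0 : z = 1;
    return z;
}

struct Pair { int b; int c; };

// In C++, cond ? b : c = 99 parses as cond ? b : (c = 99),
// so the assignment only happens when cond is false.
Pair assign_in_third_operand(bool cond) {
    Pair p{10, 20};
    cond ? p.b : p.c = 99;
    return p;
}
```

Had the expression grouped as (cond ? b : c) = 99, a true condition would have changed b; under the C++ grammar it leaves both members untouched.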
Take these three snippets of C code:
1) a = b + a++
2) a = b + a; a++
3) a = b + a, a++
Everyone knows that example 1 is a Very Bad Thing, and clearly invokes undefined behavior. Example 2 has no problems. My question is regarding example 3. Does the comma operator work like a semicolon in this kind of expression? Are 2 and 3 equivalent or is 3 just as undefined as 1?
Specifically I was considering this regarding something like free(foo), foo = bar. This is basically the same problem as above. Can I be sure that foo is freed before it's reassigned, or is this a clear sequence point problem?
I am aware that both examples are largely pointless and it makes far more sense to just use a semicolon and be done with it. I'm just asking out of curiosity.
Case 3 is well defined.
First, let's look at how the expression is parsed:
a = b + a, a++
The comma operator , has the lowest precedence, followed by the assignment operator =, the addition operator + and the postincrement operator ++. So with the implicit parentheses it is parsed as:
(a = (b + a)), (a++)
From here, section 6.5.17 of the C standard regarding the comma operator , says the following:
2 The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its
evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.
Section 5.18 p1 of the C++11 standard has similar language:
A pair of expressions separated by a comma is evaluated left-to-right;
the left expression is a discarded-value expression.
Every value computation and side effect associated with the left
expression is sequenced before every value computation and side effect
associated with the right expression. The type and value of the result
are the type and value of the right operand; the result is of the same
value category as its right operand, and is a bit-field if its right
operand is a glvalue and a bit-field.
Because of the sequence point, a = b + a is guaranteed to be fully evaluated before a++ in the expression a = b + a, a++.
Regarding free(foo), foo = bar, this also guarantees that foo is freed before a new value is assigned.
a = b + a, a++; is well-defined, but a = (b + a, a++); can be undefined.
First of all, the operator precedence makes the expression equivalent to (a = (b+a)), a++;, where + has the highest precedence, followed by =, followed by ,. The comma operator includes a sequence point between the evaluation of its left and right operand. So the code is, uninterestingly, completely equivalent to:
a = b + a;
a++;
Which is of course well-defined.
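As a small sketch of that equivalence (the function names are invented for illustration), the comma form and the two-statement form can be compared directly, and the same sequencing covers the free(foo), foo = bar case:

```cpp
#include <cstdlib>

// Comma form: (a = (b + a)), (a++); the comma sequences the two parts.
int with_comma(int a, int b) {
    a = b + a, a++;
    return a;
}

// The same thing written as two statements.
int with_semicolons(int a, int b) {
    a = b + a;
    a++;
    return a;
}

// free(foo) is fully evaluated before foo is reassigned.
int* replace_buffer(int* foo, int* bar) {
    std::free(foo), foo = bar;
    return foo;
}
```

Both functions return the same value for the same inputs; the comma buys nothing over the semicolon here beyond fitting into a single expression statement.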
Had we instead written a = (b + a, a++);, then the sequence point in the comma operator wouldn't save the day. Because then the expression would have been equivalent to
(void)(b + a);
a = a++;
In C and C++14 or older, a = a++ is unsequenced (see C11 6.5.16/3), meaning this is undefined behavior (per C11 6.5/2). Note that C++11 and C++14 were badly formulated and ambiguous.
In C++17 or later, the operands of the = operator are sequenced right to left, so this is well-defined.
All of this assuming no C++ operator overloading takes place. In that case, the parameters to the overloaded operator function will be evaluated, a sequence point takes place before the function is called, and what happens from there depends on the internals of that function.
There are a lot of differences between C and C++, and I got stuck on one of them.
The same code gives an error in C but compiles and runs fine in C++.
Please explain the reason.
int main(void)
{
int a=10,b;
a>=5?b=100:b=200;
}
The above code gives an error in C stating "lvalue required", while the same code compiles fine in C++.
Have a look at the operator precedence.
Without an explicit () your code behaves like
( a >= 5 ? b = 100 : b ) = 200;
The result of a ?: expression is not a modifiable lvalue [#] and hence we cannot assign any values to it.
Also worth mentioning: per the C syntax rules,
assignment is never allowed to appear on the right hand side of a conditional operator
Related reference: C precedence table.
On the other hand, in C++,
the conditional operator has the same precedence as assignment.
and are grouped right-to-left, essentially making your code behave like
a >= 5 ? (b = 100) : ( b = 200 );
So your code works fine in C++.
[ # ] -- As per chapter 6.5.15, footnote (12), C99 standard,
A conditional expression does not yield an lvalue.
Because C and C++ aren't the same language, and you are ignoring the assignment implied by the ternary. I think you wanted
b = a>=5?100:200;
which should work in both C and C++.
In C you can fix it by placing each assignment in parentheses, so that it becomes a valid operand of the conditional operator.
int main(void)
{
int a=10,b;
a>=5?(b=100):(b=200);
}
The error occurs because the original code does not account for operator precedence.
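The two fixes suggested in this thread can be put side by side in a short sketch. The function names are made up; both bodies should be accepted by C and C++ compilers alike.

```cpp
// Assign the result of the conditional: the usual idiom.
int pick_value(int a) {
    int b = 0;
    b = a >= 5 ? 100 : 200;
    return b;
}

// Keep the assignments inside the operands, with explicit parentheses
// so that C and C++ parse the expression the same way.
int pick_value_parenthesized(int a) {
    int b = 0;
    a >= 5 ? (b = 100) : (b = 200);
    return b;
}
```

The first form is generally easier to read, since the variable being assigned appears only once.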
I am addicted to "braceless" ifs, like this:
if (a) b++, c++, d = e;
But one annoying thing is that return cannot be part of the last part. Intuitively I sense why that is, but can anyone explain in programming language terms why this will not compile?
main() {
int a, b, c, d, e;
if (a) b = c, d = e, return;
}
If you care, please also explain why is that designed like that, it seems like a flaw to me. I can understand in C but in C++ it could have been redesigned without major compatibility loss with the existing C code.
Just for comparison: these will compile and do exactly what is expected:
while (a < 10) a++, b--, c += 2;
while (a < 10) if (a == 5) half = a, save();
The "comma" operator is exactly that, an operator. Its left and right sides must be expressions, and return is not an expression.
To elaborate, the comma operator evaluates its left-hand side first, and discards the value. Then, it evaluates its right-hand side, and the whole comma expression evaluates to the right-hand side's value.
It's similar to this:
template <typename T, typename U>
U operator,(T t, U u)
{
return u;
}
Therefore, you cannot put anything in a comma expression that is not an expression itself.
If you're looking to simultaneously execute a series of statements and group them together, that's exactly what ; and {} are for. There is no reason to duplicate that behavior in the comma operator.
It can be done the following way
if (a) return ( b = c, d = e, 0 );
Or, if there is no return value (returning a void expression like this is valid in C++, but not in C):
if (a) return ( b = c, d = e, ( void )0 );
It may be open to question whether this answers the question the OP was really asking, but in case anybody cares about why the comma operator was designed the way it was, I think it goes back to BCPL.
In BCPL, you could combine a series of assignments like:
L1 := R1
L2 := R2
...into a single statement (command) like:
L1, L2 := R1, R2
Much like in C and C++, these were executed in order from left to right. Unlike C and C++, this "comma operator" didn't produce a single expression (at least as C uses the term).
BCPL also had a resultis that let you make a block of statements into something almost like a function.
At least to me, it looks like in C, Dennis[1] decided to sort of combine these two concepts into a single one that was rather simpler: a comma operator that would allow evaluation of a number of expressions in succession, and yield a single result.
Reference: BCPL Reference Manual
[1] I suppose in fairness I should mention the possibility that this decision was actually made by Ken Thompson in the design of B. Little enough documentation on B has survived that it's almost impossible to even guess about that.
As already stated, return is not an expression; it is a keyword that introduces a statement. However, b = c, d = e is an expression. Therefore your intent is probably this:
if (a) return (b = c, d = e, 0);
b = c, d = e, return doesn't really make any sense, as it would be inconsistent with how the comma operator works in other contexts. Imagine if you could do this:
for (int i = 0, j = 0, return; ...
That would make absolutely no sense. It would also be redundant if return meant something in this context as the comma operator already returns its last operand. There would also be no point because the comma operator already evaluates its operands, how would return something be beneficial in this case?
Someone looking at your code might glance over it and say, "this should be: if (a) (b = c, d = e); return 0;", which is a trap because of the lack of braces. What they would really mean is if (a) { (b = c, d = e); return 0; }, but this problem would be avoided if you use the syntax mentioned at the top of this answer. It simply isn't readable as it makes no semantic sense.
Regardless, this would only make sense if b and d were global variables, for example something like errno, allowing you to assign to the variable and return in one statement.
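A sketch of that errno-like use, assuming a hypothetical global last_error and a made-up function checked_divide; neither comes from the question.

```cpp
// Hypothetical global status variable, standing in for something like errno.
int last_error = 0;

// Sets the status and returns in a single statement using the comma operator.
// The value of each comma expression is its right operand.
int checked_divide(int numerator, int denominator, int* out) {
    if (denominator == 0) return (last_error = 1, -1);
    *out = numerator / denominator;
    return (last_error = 0, 0);
}
```

Whether this reads better than a braced block with two statements is a matter of taste; the braced form is what most readers will expect.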
Why exactly this will not compile if (a) b = c, d = e, return;?
This is because the left and right operands of the comma (,) operator must be expressions. The return statement is not an expression. See the syntax defined for the , operator by the C and C++ standards:
C11: 6.5.17 Comma operator
Syntax
expression:
assignment-expression
expression , assignment-expression
The same syntax is defined by C++ standard
C++: 5.18 Comma operator [expr.comma]
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right;
Note that the standard speaks of expressions, and return is not an expression.
I can understand in C but in C++ it could have been redesigned without
major compatibility loss with the existing C code.
It could have been, but why would anyone, ever, want it to be? The language already contains a means to the end that you are looking for: braces. They are much more reliable and useful than abusing the comma operator as you have. For example, if you are using UDTs, then you are going to run into some nasty surprises when I overload the comma operator. Oops!
More to the point, having return as an expression doesn't make sense, because the function has already, well, returned, when it is evaluated, so there's no way anyone could possibly use any hypothetical return value.
Your entire question is predicated on your personal dislike of braces. Nobody else who designs the language really shares that feeling.
Consider the following code:
int a = 3;
int b = 0;
b = a > 0 ? ++b, ++a : --a, b = 0;
After execution, I get the value of b to become 0 and the value of a to become 4.
That means the condition a > 0 evaluated to true and the expression ++a was executed, while the expression b = 0 after the , was executed too. In other words, the expression b = 0 is not an operand of the ternary operator, while ++b is. Otherwise, b = 0 wouldn't be executed, since the condition didn't evaluate to false.
My question is "according to what rule does the compiler kick b = 0 out of the ternary operator's operand?"
The operators in the third statement include: ++ and --, which have the highest precedence; >, which has the second highest precedence; ?: and =, which have the third highest precedence; and , with the lowest precedence. I know that operators with higher precedence should determine their operands earlier, so ++, -- and > are handled first. Then the statement is equivalent to:
b = (a > 0) ? (++b), (++a) : (--a), b = 0;
Now it's the turn of = and ?:. Both associate right-to-left, so I assume the compiler parses the statement from the right end. The first operator it meets is =, so b = 0 is grouped together. The second operator it meets is ,. Since its precedence is lower than that of the operators currently being analyzed, I assume the compiler just skips it. Then the compiler meets :, which is part of the ternary operator, so it keeps parsing. (Actually, I don't know how the compiler can know that : is part of ?: before parsing the whole ternary operator.) Here the problem arises. The next operator the compiler meets is ,, but it hasn't finished determining the operands of ?: yet. The , has lower precedence than ?:, so in theory it should be skipped; surprisingly, in practice (++b) and (++a) are joined by the , operator at this point and together form one operand of ?:. That confuses me. Why is the last , not included in the operand of ?:, while the earlier , in the statement is kept inside the operand of the ternary operator?
Could someone clarify the concepts of precedence and associativity with this example? I was really confused by the result when I first saw this piece of code. I had thought that the expression b = 0 was also part of the ternary operator's operand, so that b = 0 would only be executed if a > 0 were false.
Thanks in advance.
Precedence and associativity are different concepts, but technically the C and C++ standards specify neither. Instead they give the grammar rules from which the structure of an expression is deduced.
The relevant rules are:
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
expression:
assignment-expression
expression , assignment-expression
primary-expression:
( expression )
postfix-expression:
primary-expression
...
And so on...
The idea is that each type of expression can generate a composite expression or another type of expression of lower precedence. You can only go back up to the root expression by using parentheses.
With that in mind, note that the conditional-expression that uses the ?: actually has different types of expressions in each of the three subexpressions. The middle one is expression so it will accept any kind of expression, even with , or = (no ambiguity here because of the ending :).
But note that the last one is assignment-expression, that is any kind of expression except the one with ,. If you want to use that, you will have to enclose it with () creating a primary-expression instead.
Bonus explanation: the first expression is logical-or-expression, and if you look carefully at the grammar you'll see that it excludes assignment operators, the conditional operator and the comma operator.
So your expression:
b = a > 0 ? ++b, ++a : --a, b = 0
is actually of the form expression , assignment-expression, where the first expression is b = a > 0 ? ++b, ++a : --a and the second assignment-expression is b = 0.
And so on...
Your expression is evaluated as (b = ((a > 0) ? (++b, ++a) : (--a))), (b = 0);.
As you say, ?: has higher precedence than the comma operator, so b = 0 does not belong to the ternary conditional. The difference between the left and the right side of the : is this: on the left side the compiler tries to evaluate the complete string ++b, ++a as an expression (knowing that the part between ? and : must be an expression), while on the right side the compiler parses an expression only as far as it can, and operator precedence says it must stop at the ,. On the left side the compiler does not stop at the , because it is a legal part of the expression.
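To make the grouping concrete, here is a sketch that runs the statement as written, the fully parenthesized equivalent, and a variant where the last comma is deliberately pulled inside the third operand. The State struct and the function names are invented for this illustration.

```cpp
struct State { int a; int b; };

// The statement exactly as written in the question.
State as_written() {
    int a = 3, b = 0;
    b = a > 0 ? ++b, ++a : --a, b = 0;
    return {a, b};
}

// The grouping the grammar actually produces.
State fully_parenthesized() {
    int a = 3, b = 0;
    (b = ((a > 0) ? (++b, ++a) : (--a))), (b = 0);
    return {a, b};
}

// What the asker expected: b = 0 only belongs to the false branch.
// The parentheses around the third operand are required for this.
State comma_inside_third_operand() {
    int a = 3, b = 0;
    b = a > 0 ? (++b, ++a) : (--a, b = 0);
    return {a, b};
}
```

The first two functions leave a at 4 and b at 0, matching the observed result. The third leaves b at 4, because b = 0 is then never evaluated when a > 0.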