Is comma operator free from side effect? - c++

For example for such statement:
c += 2, c -= 1
Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?

Yes, it is guaranteed by the standard, as long as that comma is a non-overloaded comma operator. Quoting n3290 §5.18:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5) [83]. Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.
And the corresponding footnote:
83 However, an invocation of an overloaded comma operator is an ordinary function call; hence, the evaluations of its argument
expressions are unsequenced relative to one another (see 1.9).
So this holds only for the non-overloaded comma operator.
The commas between the arguments of a function call are not comma operators, so this rule does not apply there either.
For C++03, the situation is similar:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is discarded. The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are not applied to the left expression. All side effects (1.9) of the left expression, except for the destruction of temporaries (12.2), are performed before the evaluation of the right expression. The type and value of the result are the type and value of the right operand; the result is an lvalue if its right operand is.
Restrictions are the same though: does not apply to overloaded comma operators, or function argument lists.

Yes, the comma operator guarantees that its operands are evaluated in left-to-right order, and the value of the whole expression is the value of the rightmost operand.
Be aware, however, that the comma in some contexts is not the comma operator. For example, the above is not guaranteed for function argument lists.

Yes, in C++ the comma operator introduces a sequence point, and those expressions will be evaluated in the order they are written. See 5.18 in the current working draft:
[snip] is evaluated left-to-right. [snip]
I feel that your question is lacking some explanation as to what you mean by "side effects". Every statement in C++ is allowed to have a side effect and so is an overloaded comma operator.
Why is the statement you have written not valid in a function call?
It's all about sequence points. In C and C++ it is forbidden to modify a value twice between two sequence points. If your example really uses operator, (the comma operator), the two compound assignments are separated by a sequence point. If you use it like this, foo(c += 2, c -= 2), the order of evaluation is unspecified. I'm actually unsure whether the second case is undefined behaviour, as I do not know if an argument list is one or many sequence points. I ought to ask a question about this.

It should always be evaluated from left to right, as this is in the definition of the comma operator:
Link

You've got two questions.
The first question: "Is comma operator free from side effect?"
The answer to this is no. The comma operator naturally facilitates writing expressions with side effects, and deliberately writing expressions with side effects is what the operator is commonly used for. E.g., in while (cin >> str, str != "exit") the state of the input stream is changed, which is an intentional side effect.
But maybe you don't mean side-effect in the computer science sense, but in some ad hoc sense.
Your second question: "For example for such statement: c += 2, c -= 1 Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?"
The answer to this is yes in the case of a statement or expression, except when the comma operator is overloaded (very unusual). However, sequences like c += 2, c -= 1 can also occur in argument lists, in which case, what you've got is not an expression, and the comma is not a sequence operator, and the order of evaluation is not defined. In foo(c += 2, c -= 1) the comma is not a comma operator, but in foo((c += 2, c -= 1)) it is, so it may pay to pay attention to the parentheses in function calls.

Related

Does the modification of the LHS always happen strictly after the RHS has been evaluated during assignment for built-in type? [duplicate]

map<int, int> mp;
printf("%d ", mp.size());
mp[10]=mp.size();
printf("%d\n", mp[10]);
This code yields an answer that is not very intuitive:
0 1
I understand why it happens - the left side of the assignment returns reference to mp[10]'s underlying value and at the same time creates aforementioned value, and only then is the right side evaluated, using the newly computed size() of the map.
Is this behaviour stated anywhere in C++ standard? Or is the order of evaluation undefined?
Result was obtained using g++ 5.2.1.
Yes, this is covered by the standard and it is unspecified behavior. This particular case is covered in a recent C++ standards proposal: N4228: Refining Expression Evaluation Order for Idiomatic C++ which seeks to refine the order of evaluation rules to make it well specified for certain cases.
It describes this problem as follows:
Expression evaluation order is a recurring discussion topic in the C++ community. In a nutshell, given an expression such as f(a, b, c), the order in which the sub-expressions f, a, b, c are evaluated is left unspecified by the standard. If any two of these sub-expressions happen to modify the same object without intervening sequence points, the behavior of the program is undefined. For instance, the expression f(i++, i) where i is an integer variable leads to undefined behavior, as does v[i] = i++. Even when the behavior is not undefined, the result of evaluating an expression can still be anybody’s guess. Consider the following program fragment:
#include <map>
int main() {
std::map<int, int> m;
m[0] = m.size(); // #1
}
What should the map object m look like after evaluation of the
statement marked #1? { {0, 0 } } or {{0, 1 } } ?
We know that unless specified the evaluations of sub-expressions are unsequenced, this is from the draft C++11 standard section 1.9 Program execution which says:
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced.[...]
and all the section 5.17 Assignment and compound assignment operators [expr.ass] says is:
[...]In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.[...]
So this section does not nail down the order of evaluation, but we know this is not undefined behavior since both operator[] and size() are function calls and section 1.9 tells us (emphasis mine):
[...]When calling a function (whether or not the function is inline), every value computation and side effect
associated with any argument expression, or with the postfix expression designating the called function, is
sequenced before execution of every expression or statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument expressions are unsequenced. —end note ]
Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.[...]
Note, I cover the second interesting example from the N4228 proposal in the question Does this code from “The C++ Programming Language” 4th edition section 36.3.6 have well-defined behavior?.
Update
It seems like a revised version of N4228 was accepted by the Evolution Working Group at the last WG21 meeting, but the paper (P0145R0) is not yet available. So this could possibly no longer be unspecified in C++17.
Update 2
Revision 3 of P0145 made this specified and updated [expr.ass]p1:
The assignment operator (=) and the compound assignment operators all group right-to-left.
All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
The result in all cases is a bit-field if the left operand is a bit-field.
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand. ...
From the C++11 standard (emphasis mine):
5.17 Assignment and compound assignment operators
1 The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Whether the left operand is evaluated first or the right operand is evaluated first is not specified by the language. A compiler is free to choose to evaluate either operand first. Since the final result of your code depends on the order of evaluation of the operands, I would say it is unspecified behavior rather than undefined behavior.
1.3.25 unspecified behavior
behavior, for a well-formed program construct and correct data, that depends on the implementation
I'm sure that the C++ standard does not specify, for an expression x = y;, the order in which x and y are evaluated (this is the reason why you can't do *p++ = *p++, for example, because the two p++ are not done in a defined order).
In other words, to guarantee that x = y; is evaluated in a defined order, you need to break it up into two statements, each ending in its own sequence point.
T tmp = y;
x = tmp;
(Of course, in this particular case, one might presume the compiler prefers to do operator[] before size() because it can then store the value directly into the result of operator[] instead of keeping it in a temporary place, to store it later after operator[] has been evaluated - but I'm pretty sure the compiler doesn't NEED to do it in that order)
Let's take a look at what your code breaks down to:
mp.operator[](10).operator=(mp.size());
which pretty much tells the story that in the first part an entry to 10 is created and in the second part the size of the container is assigned to the integer reference in position of 10.
But now you get into the order of evaluation problem which is unspecified. Here is a much simpler example .
When should map::size() get called, before or after map::operator[](int const &)?
Nobody really knows.

In C and C++, is an expression using the comma operator like "a = b, ++a;" undefined?

Take these three snippets of C code:
1) a = b + a++
2) a = b + a; a++
3) a = b + a, a++
Everyone knows that example 1 is a Very Bad Thing, and clearly invokes undefined behavior. Example 2 has no problems. My question is regarding example 3. Does the comma operator work like a semicolon in this kind of expression? Are 2 and 3 equivalent or is 3 just as undefined as 1?
Specifically I was considering this regarding something like free(foo), foo = bar. This is basically the same problem as above. Can I be sure that foo is freed before it's reassigned, or is this a clear sequence point problem?
I am aware that both examples are largely pointless and it makes far more sense to just use a semicolon and be done with it. I'm just asking out of curiosity.
Case 3 is well defined.
First, let's look at how the expression is parsed:
a = b + a, a++
The comma operator , has the lowest precedence, followed by the assignment operator =, the addition operator + and the postincrement operator ++. So with the implicit parentheses it is parsed as:
(a = (b + a)), (a++)
From here, section 6.5.17 of the C standard regarding the comma operator , says the following:
2 The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value.
Section 5.18 p1 of the C++11 standard has similar language:
A pair of expressions separated by a comma is evaluated left-to-right;
the left expression is a discarded-value expression.
Every value computation and side effect associated with the left
expression is sequenced before every value computation and side effect
associated with the right expression. The type and value of the result
are the type and value of the right operand; the result is of the same
value category as its right operand, and is a bit-field if its right
operand is a glvalue and a bit-field.
Because of the sequence point, a = b + a is guaranteed to be fully evaluated before a++ in the expression a = b + a, a++.
Regarding free(foo), foo = bar, this also guarantees that foo is freed before a new value is assigned.
a = b + a, a++; is well-defined, but a = (b + a, a++); can be undefined.
First of all, the operator precedence makes the expression equivalent to (a = (b+a)), a++;, where + has the highest precedence, followed by =, followed by ,. The comma operator includes a sequence point between the evaluation of its left and right operand. So the code is, uninterestingly, completely equivalent to:
a = b + a;
a++;
Which is of course well-defined.
Had we instead written a = (b + a, a++);, then the sequence point in the comma operator wouldn't save the day. Because then the expression would have been equivalent to
(void)(b + a);
a = a++;
In C and in C++14 or older, the operands of a = a++ are unsequenced (see C11 6.5.16/3), meaning this is undefined behavior (per C11 6.5/2). Note that C++11 and C++14 were badly formulated and ambiguous.
In C++17 or later, the operands of the = operator are sequenced right to left, so a = a++ is well-defined.
All of this assuming no C++ operator overloading takes place. In that case, the parameters to the overloaded operator function will be evaluated, a sequence point takes place before the function is called, and what happens from there depends on the internals of that function.

Why use this comma in this return statement? [duplicate]

I understand what this C++ function does, but I don't understand why the return statement is written this way:
int intDivide(int num, int denom){
return assert(denom!=0), num/denom;
}
There is only one statement here, because there is only one ; but the comma confuses me. Why not write:
int intDivide(int num, int denom){
assert(denom!=0);
return num/denom;
}
Aside from "elegance" is there something to be gained in the first version?
What exactly is that comma doing anyway? Does it break a single statement into 2 parts such that essentially the above 2 versions are identical?
Although the code didn't seem to use constexpr, C++11 constexpr functions were constrained to have only one statement which had to be a return statement. To do the non-functional assertion and return a value there would be no other option than using the comma operator. With C++14 this constraint was removed, though.
I could imagine that the function was rewritten from a macro which originally read something like this
#define INT_DIVIDE(nom,denom) (assert(denom != 0), nom/denom)
The built-in comma operator simply sequences two expressions. The result of the expression is the second operand. The two functions are, indeed, equivalent. Note, that the comma operator can be overloaded. If it is, the expressions are not sequenced and the result is whatever the overload defines.
In practice the comma operator sometimes comes in quite handy. For example, it is quite common to use the comma operator when expanding a parameter pack: in some uses each of the expansions is required to produce a value and to avoid void results messing things up, the comma operator can be used to have a value. For example:
template <typename... T>
void g(T const&... arg) {
// Each pack element becomes (f(arg), true), so every expansion yields a bool.
std::initializer_list<bool>{ (f(arg), true)... };
}
This is a sort of 'syntactic sugar', which is expanded on in a similar question.
Basically, e1, e2 means evaluate e1 and then evaluate e2, and the value of the entire expression is the result of e2. It's a short and obfuscated (in my opinion) way of writing what you suggest. Maybe the writer is cheap on code lines.
From the C++ standard:
5.19 Comma operator [expr.comma]
1 The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5) [87]. Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field. If the value of the right operand is a temporary (12.2), the result is that temporary.
Yes, the two versions are identical, except if the comma operator is overloaded, as @StoryTeller commented.

How comma (,) works in a for loop between two expressions in conditional part

In the initialization part of a for loop you can declare and initialize as many variables as you like, but of course they all have to be of the same type. In the conditional part you can use any expression built from operators such as AND (&&), OR (||), >, <, == etc.
But (,) is not an expression. How does it work here?
just a=1,2,3,4,5,6 and b=1,2,3,4,5,6,7,8,9,10
and a<6,b<9 returns a=1,2,3,4,5,6,7,8,9=b
for(int a=1,b=1,c=2,d=5;a<4,b<10;a++,b++)//initialize variables and using , between expression
{
cout<<a<<" "<<b<<endl;
}
Because...that's not really how things work at all.
The comma operator evaluates its left operand and discards the result (so in most cases its left operand will have side effects). After the left operand is evaluated (and any side effects from it have happened), the right operand is evaluated. The value yielded from this is the value of the right operand.
Actually, it does not behave like OR; the behavior of the comma operator can be described as:
In the C and C++ programming languages, the comma operator
(represented by the token ,) is a binary operator that evaluates its
first operand and discards the result, and then evaluates the second
operand and returns this value (and type).
From wiki: https://en.wikipedia.org/wiki/Comma_operator
So only the result of b<10 is taken into account.
