"x = ++x" is it really undefined? - c++

I am using Coverity Prevent on a project to find errors.
It reports an error for this expression (The variable names are of course changed):
x=
(a>= b) ?
++x: 0;
The message is:
EVALUATION_ORDER defect: In "x=(a>= b) ? ++x: 0;", "x" is written in "x" (the assignment LHS) and written in "(a>= b) ? ++x: 0;" but the order in which the side effects take place is undefined because there is no intervening sequence point. END OF MESSAGE
While I can understand that "x = x++" is undefined, this one is a bit harder for me. Is this one a false positive or not?

Conditional operator ?: has a sequence point between evaluation of the condition (first operand) and evaluation of second or third operand, but it has no dedicated sequence point after the evaluation of second or third operand. Which means that two modifications of x in this example are potentially conflicting (not separated by a sequence point). So, Coverity Prevent is right.
Your statement in that regard is virtually equivalent to
a >= b ? x = ++x : x = 0;
with the same problem as in x = ++x.
Now, the title of your question seems to suggest that you don't know whether x = ++x is undefined. It is indeed undefined. It is undefined for the very same reason x = x++ is undefined. In short, if the same object is modified more than once between a pair of adjacent sequence points, the behavior is undefined. In this case x is modified by assignment and by ++ an there's no sequence point to "isolate" these modifications from each other. So, the behavior is undefined. There's absolutely no difference between ++x and x++ in this regard.

Regardless of the accuracy of the message, replacing the code in question by x= (a>= b) ? x+1: 0; achieves the same end without any confusion. If the tool is confused then maybe the next person to look at this code will be too.
This does assume that x does not have an overloaded increment operator with side-effects that you rely on here.

The statement x = ++x; writes to the variable x twice before hitting the sequence point and hence the behavior is undefined.

It is hard to imagine a compiler producing code for "x = ++x;" which would in fact not work the same as "++x". If x is not volatile, however, it would be legal for a compiler to process the statement "y = ++x;" as
y=x+1;
x=x+1;
The statement "x = ++x;" would thus become
x=x+1;
x=x+1;
If having the destination of one an arithmetic assignment expression get used too quickly as a source operand for another would cause a pipeline delay, the former optimization might be reasonable. Obviously disastrous if the incremented and assigned variable are one and the same.
If variable 'x' is volatile, I can't think of any code sequence where a compiler that wasn't deliberately trying to be mean could legitimately regard "x = x++;" as having any effect other than reading all parts of 'x' exactly once and writing the same correct value to all parts of 'x' exactly twice.

I suppose that logically, if you write either "x=++x" or "x=x++", either way you would expect the end result to be that x is one more than it started as. But make it just a shade more complicated. What if you wrote "x=x+(++x)" ? Is this the same as "x=x+(x+1)" ? Or do we add one first and then add x to itself, i.e. "x=(x+1)+(x+1)"? What if we wrote "x=(x--)+(x++)" ?
Even if you could write a language spec that gave unambiguous meaning to these sort of constructs, and could then implement it cleanly, why would you want to? Putting a unary increment in an assignment to the same variable just doesn't make sense, and even if we forced sense out of it, it provides no useful functionality. Like, I'm sure we could write a compliler that would take an expression like "x+1=y-1" and figure out that that really means "x=y-2", but why bother? There's no gain.
Even if we wrote compilers that did something predictable with "x=x++-++x", programmers would have to know and understand the rules. Without any obvious benefit, it would just add complexity to programs for no purpose.

In order to understand this you need to have a basic understanding of sequence points. See this link: http://en.wikipedia.org/wiki/Sequence_point
For the = operator there is no sequence point, so there is no guarantee that the value of x will be modified before it is again assigned to x.

Related

Evaluating order 'and' for C++ in relation with null pointers

Please consider the following block of C++-code.
call * pCall; // Make a pointer to private variable call.
pCall = NULL; // We are sure that it is a null pointer.
aMemberFuntionThatMayChangeTheValueOfpCall();
if (pCall != NULL and pCall->isDelivered() == false){
doSomething();
}
I would like to execute some lines of code, here represented by doSomething(), if and only if pCall is not a null pointer and pCall->isDelivered() is false.
However, I read that the order of evaluation is unspecified. So theoretically the compiler may evaluate pCall->isDelivered() first, and run into a run time exception. However, in the debugging sessions it seems to evaluate the and-operator left to right. Can someone please shine a light on this? I don't want any failures of the code when it gets into a production environment or when it gets executed on another machine.
Of course it is possible to make it into two nested if-statements, but this makes the source far more unreadable because I need this kind of code multiple times.
Can anyone tell me how to do this evaluation in one if-statement such that there is no misevaluation?
Your and operator is more commonly written in C++ as &&, but they are both the same "logical and" operator. For this operator the order of evaluation is specified and evaluation is executed from left to right, guaranteed. (Otherwise millions of existing programs would fail).
The exception - not applicable to your code - is when the && operator is overloaded. In this case the left-to-right rule does not work. BTW it is a main reason why it is recommended not to overload this operator.
Your code below will work (I've changed 'and' to '&&' and added parentheses):
if ( (pCall != NULL) && (pCall->isDelivered() == false) ) {
doSomething();
}
because the logical-AND && will 'short circuit'; i.e. if the first expression is false, the second will not be evaluated. Evaluation order is from left to right in this case.
The following relates to the Miscosoft C++ compiler in VS2013:
Logical operators also guarantee evaluation of their operands from
left to right. However, they evaluate the smallest number of operands
needed to determine the result of the expression. This is called
"short-circuit" evaluation. Thus, some operands of the expression may
not be evaluated. For example, in the expression x && y++ the second
operand, y++, is evaluated only if x is true (nonzero). Thus, y is not
incremented if x is false (0).

Operator associativity and order of evaluation [duplicate]

The terms 'operator precedence' and 'order of evaluation' are very commonly used terms in programming and extremely important for a programmer to know. And, as far as I understand them, the two concepts are tightly bound; one cannot do without the other when talking about expressions.
Let us take a simple example:
int a=1; // Line 1
a = a++ + ++a; // Line 2
printf("%d",a); // Line 3
Now, it is evident that Line 2 leads to Undefined Behavior, since Sequence points in C and C++ include:
Between evaluation of the left and right operands of the && (logical
AND), || (logical OR), and comma
operators. For example, in the
expression *p++ != 0 && *q++ != 0, all
side effects of the sub-expression
*p++ != 0 are completed before any attempt to access q.
Between the evaluation of the first operand of the ternary
"question-mark" operator and the
second or third operand. For example,
in the expression a = (*p++) ? (*p++)
: 0 there is a sequence point after
the first *p++, meaning it has already
been incremented by the time the
second instance is executed.
At the end of a full expression. This category includes expression
statements (such as the assignment
a=b;), return statements, the
controlling expressions of if, switch,
while, or do-while statements, and all
three expressions in a for statement.
Before a function is entered in a function call. The order in which
the arguments are evaluated is not
specified, but this sequence point
means that all of their side effects
are complete before the function is
entered. In the expression f(i++) + g(j++) + h(k++),
f is called with a
parameter of the original value of i,
but i is incremented before entering
the body of f. Similarly, j and k are
updated before entering g and h
respectively. However, it is not
specified in which order f(), g(), h()
are executed, nor in which order i, j,
k are incremented. The values of j and
k in the body of f are therefore
undefined.3 Note that a function
call f(a,b,c) is not a use of the
comma operator and the order of
evaluation for a, b, and c is
unspecified.
At a function return, after the return value is copied into the
calling context. (This sequence point
is only specified in the C++ standard;
it is present only implicitly in
C.)
At the end of an initializer; for example, after the evaluation of 5
in the declaration int a = 5;.
Thus, going by Point # 3:
At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.
Line 2 clearly leads to Undefined Behavior. This shows how Undefined Behaviour is tightly coupled with Sequence Points.
Now let us take another example:
int x=10,y=1,z=2; // Line 4
int result = x<y<z; // Line 5
Now its evident that Line 5 will make the variable result store 1.
Now the expression x<y<z in Line 5 can be evaluated as either:
x<(y<z) or (x<y)<z. In the first case the value of result will be 0 and in the second case result will be 1. But we know, when the Operator Precedence is Equal/Same - Associativity comes into play, hence, is evaluated as (x<y)<z.
This is what is said in this MSDN Article:
The precedence and associativity of C operators affect the grouping and evaluation of operands in expressions. An operator's precedence is meaningful only if other operators with higher or lower precedence are present. Expressions with higher-precedence operators are evaluated first. Precedence can also be described by the word "binding." Operators with a higher precedence are said to have tighter binding.
Now, about the above article:
It mentions "Expressions with higher-precedence operators are evaluated first."
It may sound incorrect. But, I think the article is not saying something wrong if we consider that () is also an operator x<y<z is same as (x<y)<z. My reasoning is if associativity does not come into play, then the complete expressions evaluation would become ambiguous since < is not a Sequence Point.
Also, another link I found says this on Operator Precedence and Associativity:
This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order operators of equal precedence in an expression are applied.
So taking, the second example of int result=x<y<z, we can see here that there are in all 3 expressions, x, y and z, since, the simplest form of an expression consists of a single literal constant or object. Hence the result of the expressions x, y, z would be there rvalues, i.e., 10, 1 and 2 respectively. Hence, now we may interpret x<y<z as 10<1<2.
Now, doesn't Associativity come into play since now we have 2 expressions to be evaluated, either 10<1 or 1<2 and since the precedence of operator is same, they are evaluated from left to right?
Taking this last example as my argument:
int myval = ( printf("Operator\n"), printf("Precedence\n"), printf("vs\n"),
printf("Order of Evaluation\n") );
Now in the above example, since the comma operator has same precedence, the expressions are evaluated left-to-right and the return value of the last printf() is stored in myval.
In SO/IEC 9899:201x under J.1 Unspecified behavior it mentions:
The order in which subexpressions are evaluated and the order in which side effects
take place, except as specified for the function-call (), &&, ||, ?:, and comma
operators (6.5).
Now I would like to know, would it be wrong to say:
Order of Evaluation depends on the precedence of operators, leaving cases of Unspecified Behavior.
I would like to be corrected if any mistakes were made in something I said in my question.
The reason I posted this question is because of the confusion created in my mind by the MSDN Article. Is it in Error or not?
Yes, the MSDN article is in error, at least with respect to standard C and C++1.
Having said that, let me start with a note about terminology: in the C++ standard, they (mostly--there are a few slip-ups) use "evaluation" to refer to evaluating an operand, and "value computation" to refer to carrying out an operation. So, when (for example) you do a + b, each of a and b is evaluated, then the value computation is carried out to determine the result.
It's clear that the order of value computations is (mostly) controlled by precedence and associativity--controlling value computations is basically the definition of what precedence and associativity are. The remainder of this answer uses "evaluation" to refer to evaluation of operands, not to value computations.
Now, as to evaluation order being determined by precedence, no it's not! It's as simple as that. Just for example, let's consider your example of x<y<z. According to the associativity rules, this parses as (x<y)<z. Now, consider evaluating this expression on a stack machine. It's perfectly allowable for it to do something like this:
push(z); // Evaluates its argument and pushes value on stack
push(y);
push(x);
test_less(); // compares TOS to TOS(1), pushes result on stack
test_less();
This evaluates z before x or y, but still evaluates (x<y), then compares the result of that comparison to z, just as it's supposed to.
Summary: Order of evaluation is independent of associativity.
Precedence is the same way. We can change the expression to x*y+z, and still evaluate z before x or y:
push(z);
push(y);
push(x);
mul();
add();
Summary: Order of evaluation is independent of precedence.
When/if we add in side effects, this remains the same. I think it's educational to think of side effects as being carried out by a separate thread of execution, with a join at the next sequence point (e.g., the end of the expression). So something like a=b++ + ++c; could be executed something like this:
push(a);
push(b);
push(c+1);
side_effects_thread.queue(inc, b);
side_effects_thread.queue(inc, c);
add();
assign();
join(side_effects_thread);
This also shows why an apparent dependency doesn't necessarily affect order of evaluation either. Even though a is the target of the assignment, this still evaluates a before evaluating either b or c. Also note that although I've written it as "thread" above, this could also just as well be a pool of threads, all executing in parallel, so you don't get any guarantee about the order of one increment versus another either.
Unless the hardware had direct (and cheap) support for thread-safe queuing, this probably wouldn't be used in in a real implementation (and even then it's not very likely). Putting something into a thread-safe queue will normally have quite a bit more overhead than doing a single increment, so it's hard to imagine anybody ever doing this in reality. Conceptually, however, the idea is fits the requirements of the standard: when you use a pre/post increment/decrement operation, you're specifying an operation that will happen sometime after that part of the expression is evaluated, and will be complete at the next sequence point.
Edit: though it's not exactly threading, some architectures do allow such parallel execution. For a couple of examples, the Intel Itanium and VLIW processors such as some DSPs, allow a compiler to designate a number of instructions to be executed in parallel. Most VLIW machines have a specific instruction "packet" size that limits the number of instructions executed in parallel. The Itanium also uses packets of instructions, but designates a bit in an instruction packet to say that the instructions in the current packet can be executed in parallel with those in the next packet. Using mechanisms like this, you get instructions executing in parallel, just like if you used multiple threads on architectures with which most of us are more familiar.
Summary: Order of evaluation is independent of apparent dependencies
Any attempt at using the value before the next sequence point gives undefined behavior -- in particular, the "other thread" is (potentially) modifying that data during that time, and you have no way of synchronizing access with the other thread. Any attempt at using it leads to undefined behavior.
Just for a (admittedly, now rather far-fetched) example, think of your code running on a 64-bit virtual machine, but the real hardware is an 8-bit processor. When you increment a 64-bit variable, it executes a sequence something like:
load variable[0]
increment
store variable[0]
for (int i=1; i<8; i++) {
load variable[i]
add_with_carry 0
store variable[i]
}
If you read the value somewhere in the middle of that sequence, you could get something with only some of the bytes modified, so what you get is neither the old value nor the new one.
This exact example may be pretty far-fetched, but a less extreme version (e.g., a 64-bit variable on a 32-bit machine) is actually fairly common.
Conclusion
Order of evaluation does not depend on precedence, associativity, or (necessarily) on apparent dependencies. Attempting to use a variable to which a pre/post increment/decrement has been applied in any other part of an expression really does give completely undefined behavior. While an actual crash is unlikely, you're definitely not guaranteed to get either the old value or the new one -- you could get something else entirely.
1 I haven't checked this particular article, but quite a few MSDN articles talk about Microsoft's Managed C++ and/or C++/CLI (or are specific to their implementation of C++) but do little or nothing to point out that they don't apply to standard C or C++. This can give the false appearance that they're claiming the rules they have decided to apply to their own languages actually apply to the standard languages. In these cases, the articles aren't technically false -- they just don't have anything to do with standard C or C++. If you attempt to apply those statements to standard C or C++, the result is false.
The only way precedence influences order of evaluation is that it
creates dependencies; otherwise the two are orthogonal. You've
carefully chosen trivial examples where the dependencies created by
precedence do end up fully defining order of evaluation, but this isn't
generally true. And don't forget, either, that many expressions have
two effects: they result in a value, and they have side effects. These
two are no required to occur together, so even when dependencies
force a specific order of evaluation, this is only the order of
evaluation of the values; it has no effect on side effects.
A good way to look at this is to take the expression tree.
If you have an expression, lets say x+y*z you can rewrite that into an expression tree:
Applying the priority and associativity rules:
x + ( y * z )
After applying the priority and associativity rules, you can safely forget about them.
In tree form:
x
+
y
*
z
Now the leaves of this expression are x, y and z. What this means is that you can evaluate x, y and z in any order you want, and also it means that you can evaluate the result of * and x in any order.
Now since these expressions don't have side effects you don't really care. But if they do, the ordering can change the result, and since the ordering can be anything the compiler decides, you have a problem.
Now, sequence points bring a bit of order into this chaos. They effectively cut the tree into sections.
x + y * z, z = 10, x + y * z
after priority and associativity
x + ( y * z ) , z = 10, x + ( y * z)
the tree:
x
+
y
*
z
, ------------
z
=
10
, ------------
x
+
y
*
z
The top part of the tree will be evaluated before the middle, and middle before bottom.
Precedence has nothing to do with order of evaluation and vice-versa.
Precedence rules describe how an underparenthesized expression should be parenthesized when the expression mixes different kinds of operators. For example, multiplication is of higher precedence than addition, so 2 + 3 x 4 is equivalent to 2 + (3 x 4), not (2 + 3) x 4.
Order of evaluation rules describe the order in which each operand in an expression is evaluated.
Take an example
y = ++x || --y;
By operator precedence rule, it will be parenthesize as (++/-- has higher precedence than || which has higher precedence than =):
y = ( (++x) || (--y) )
The order of evaluation of logical OR || states that (C11 6.5.14)
the || operator guarantees left-to-right evaluation.
This means that the left operand, i.e the sub-expression (x++) will be evaluated first. Due to short circuiting behavior; If the first operand compares unequal to 0, the second operand is not evaluated, right operand --y will not be evaluated although it is parenthesize prior than (++x) || (--y).
It mentions "Expressions with higher-precedence operators are evaluated first."
I am just going to repeat what I said here. As far as standard C and C++ are concerned that article is flawed. Precedence only affects which tokens are considered to be the operands of each operator, but it does not affect in any way the order of evaluation.
So, the link only explains how Microsoft implemented things, not how the language itself works.
I think it's only the
a++ + ++a
epxression problematic, because
a = a++ + ++a;
fits first in 3. but then in the 6. rule: complete evaluation before assignment.
So,
a++ + ++a
gets for a=1 fully evaluated to:
1 + 3 // left to right, or
2 + 2 // right to left
The result is the same = 4.
An
a++ * ++a // or
a++ == ++a
would have undefined results. Isn't it?

Operator Precedence vs Order of Evaluation

The terms 'operator precedence' and 'order of evaluation' are very commonly used terms in programming and extremely important for a programmer to know. And, as far as I understand them, the two concepts are tightly bound; one cannot do without the other when talking about expressions.
Let us take a simple example:
int a=1; // Line 1
a = a++ + ++a; // Line 2
printf("%d",a); // Line 3
Now, it is evident that Line 2 leads to Undefined Behavior, since Sequence points in C and C++ include:
Between evaluation of the left and right operands of the && (logical
AND), || (logical OR), and comma
operators. For example, in the
expression *p++ != 0 && *q++ != 0, all
side effects of the sub-expression
*p++ != 0 are completed before any attempt to access q.
Between the evaluation of the first operand of the ternary
"question-mark" operator and the
second or third operand. For example,
in the expression a = (*p++) ? (*p++)
: 0 there is a sequence point after
the first *p++, meaning it has already
been incremented by the time the
second instance is executed.
At the end of a full expression. This category includes expression
statements (such as the assignment
a=b;), return statements, the
controlling expressions of if, switch,
while, or do-while statements, and all
three expressions in a for statement.
Before a function is entered in a function call. The order in which
the arguments are evaluated is not
specified, but this sequence point
means that all of their side effects
are complete before the function is
entered. In the expression f(i++) + g(j++) + h(k++),
f is called with a
parameter of the original value of i,
but i is incremented before entering
the body of f. Similarly, j and k are
updated before entering g and h
respectively. However, it is not
specified in which order f(), g(), h()
are executed, nor in which order i, j,
k are incremented. The values of j and
k in the body of f are therefore
undefined.3 Note that a function
call f(a,b,c) is not a use of the
comma operator and the order of
evaluation for a, b, and c is
unspecified.
At a function return, after the return value is copied into the
calling context. (This sequence point
is only specified in the C++ standard;
it is present only implicitly in
C.)
At the end of an initializer; for example, after the evaluation of 5
in the declaration int a = 5;.
Thus, going by Point # 3:
At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.
Line 2 clearly leads to Undefined Behavior. This shows how Undefined Behaviour is tightly coupled with Sequence Points.
Now let us take another example:
int x=10,y=1,z=2; // Line 4
int result = x<y<z; // Line 5
Now its evident that Line 5 will make the variable result store 1.
Now the expression x<y<z in Line 5 can be evaluated as either:
x<(y<z) or (x<y)<z. In the first case the value of result will be 0 and in the second case result will be 1. But we know, when the Operator Precedence is Equal/Same - Associativity comes into play, hence, is evaluated as (x<y)<z.
This is what is said in this MSDN Article:
The precedence and associativity of C operators affect the grouping and evaluation of operands in expressions. An operator's precedence is meaningful only if other operators with higher or lower precedence are present. Expressions with higher-precedence operators are evaluated first. Precedence can also be described by the word "binding." Operators with a higher precedence are said to have tighter binding.
Now, about the above article:
It mentions "Expressions with higher-precedence operators are evaluated first."
It may sound incorrect. But, I think the article is not saying something wrong if we consider that () is also an operator x<y<z is same as (x<y)<z. My reasoning is if associativity does not come into play, then the complete expressions evaluation would become ambiguous since < is not a Sequence Point.
Also, another link I found says this on Operator Precedence and Associativity:
This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order operators of equal precedence in an expression are applied.
So taking, the second example of int result=x<y<z, we can see here that there are in all 3 expressions, x, y and z, since, the simplest form of an expression consists of a single literal constant or object. Hence the result of the expressions x, y, z would be there rvalues, i.e., 10, 1 and 2 respectively. Hence, now we may interpret x<y<z as 10<1<2.
Now, doesn't Associativity come into play since now we have 2 expressions to be evaluated, either 10<1 or 1<2 and since the precedence of operator is same, they are evaluated from left to right?
Taking this last example as my argument:
int myval = ( printf("Operator\n"), printf("Precedence\n"), printf("vs\n"),
printf("Order of Evaluation\n") );
Now in the above example, since the comma operator has same precedence, the expressions are evaluated left-to-right and the return value of the last printf() is stored in myval.
In SO/IEC 9899:201x under J.1 Unspecified behavior it mentions:
The order in which subexpressions are evaluated and the order in which side effects
take place, except as specified for the function-call (), &&, ||, ?:, and comma
operators (6.5).
Now I would like to know, would it be wrong to say:
Order of Evaluation depends on the precedence of operators, leaving cases of Unspecified Behavior.
I would like to be corrected if any mistakes were made in something I said in my question.
The reason I posted this question is because of the confusion created in my mind by the MSDN Article. Is it in Error or not?
Yes, the MSDN article is in error, at least with respect to standard C and C++1.
Having said that, let me start with a note about terminology: in the C++ standard, they (mostly--there are a few slip-ups) use "evaluation" to refer to evaluating an operand, and "value computation" to refer to carrying out an operation. So, when (for example) you do a + b, each of a and b is evaluated, then the value computation is carried out to determine the result.
It's clear that the order of value computations is (mostly) controlled by precedence and associativity--controlling value computations is basically the definition of what precedence and associativity are. The remainder of this answer uses "evaluation" to refer to evaluation of operands, not to value computations.
Now, as to evaluation order being determined by precedence, no it's not! It's as simple as that. Just for example, let's consider your example of x<y<z. According to the associativity rules, this parses as (x<y)<z. Now, consider evaluating this expression on a stack machine. It's perfectly allowable for it to do something like this:
push(z); // Evaluates its argument and pushes value on stack
push(y);
push(x);
test_less(); // compares TOS to TOS(1), pushes result on stack
test_less();
This evaluates z before x or y, but still evaluates (x<y), then compares the result of that comparison to z, just as it's supposed to.
Summary: Order of evaluation is independent of associativity.
Precedence is the same way. We can change the expression to x*y+z, and still evaluate z before x or y:
push(z);
push(y);
push(x);
mul();
add();
Summary: Order of evaluation is independent of precedence.
When/if we add in side effects, this remains the same. I think it's educational to think of side effects as being carried out by a separate thread of execution, with a join at the next sequence point (e.g., the end of the expression). So something like a=b++ + ++c; could be executed something like this:
push(a);
push(b);
push(c+1);
side_effects_thread.queue(inc, b);
side_effects_thread.queue(inc, c);
add();
assign();
join(side_effects_thread);
This also shows why an apparent dependency doesn't necessarily affect order of evaluation either. Even though a is the target of the assignment, this still evaluates a before evaluating either b or c. Also note that although I've written it as "thread" above, this could also just as well be a pool of threads, all executing in parallel, so you don't get any guarantee about the order of one increment versus another either.
Unless the hardware had direct (and cheap) support for thread-safe queuing, this probably wouldn't be used in in a real implementation (and even then it's not very likely). Putting something into a thread-safe queue will normally have quite a bit more overhead than doing a single increment, so it's hard to imagine anybody ever doing this in reality. Conceptually, however, the idea is fits the requirements of the standard: when you use a pre/post increment/decrement operation, you're specifying an operation that will happen sometime after that part of the expression is evaluated, and will be complete at the next sequence point.
Edit: though it's not exactly threading, some architectures do allow such parallel execution. For a couple of examples, the Intel Itanium and VLIW processors such as some DSPs, allow a compiler to designate a number of instructions to be executed in parallel. Most VLIW machines have a specific instruction "packet" size that limits the number of instructions executed in parallel. The Itanium also uses packets of instructions, but designates a bit in an instruction packet to say that the instructions in the current packet can be executed in parallel with those in the next packet. Using mechanisms like this, you get instructions executing in parallel, just like if you used multiple threads on architectures with which most of us are more familiar.
Summary: Order of evaluation is independent of apparent dependencies
Any attempt at using the value before the next sequence point gives undefined behavior -- in particular, the "other thread" is (potentially) modifying that data during that time, and you have no way of synchronizing access with the other thread. Any attempt at using it leads to undefined behavior.
Just for a (admittedly, now rather far-fetched) example, think of your code running on a 64-bit virtual machine, but the real hardware is an 8-bit processor. When you increment a 64-bit variable, it executes a sequence something like:
load variable[0]
increment
store variable[0]
for (int i=1; i<8; i++) {
load variable[i]
add_with_carry 0
store variable[i]
}
If you read the value somewhere in the middle of that sequence, you could get something with only some of the bytes modified, so what you get is neither the old value nor the new one.
This exact example may be pretty far-fetched, but a less extreme version (e.g., a 64-bit variable on a 32-bit machine) is actually fairly common.
Conclusion
Order of evaluation does not depend on precedence, associativity, or (necessarily) on apparent dependencies. Attempting to use a variable to which a pre/post increment/decrement has been applied in any other part of an expression really does give completely undefined behavior. While an actual crash is unlikely, you're definitely not guaranteed to get either the old value or the new one -- you could get something else entirely.
1 I haven't checked this particular article, but quite a few MSDN articles talk about Microsoft's Managed C++ and/or C++/CLI (or are specific to their implementation of C++) but do little or nothing to point out that they don't apply to standard C or C++. This can give the false appearance that they're claiming the rules they have decided to apply to their own languages actually apply to the standard languages. In these cases, the articles aren't technically false -- they just don't have anything to do with standard C or C++. If you attempt to apply those statements to standard C or C++, the result is false.
The only way precedence influences order of evaluation is that it
creates dependencies; otherwise the two are orthogonal. You've
carefully chosen trivial examples where the dependencies created by
precedence do end up fully defining order of evaluation, but this isn't
generally true. And don't forget, either, that many expressions have
two effects: they result in a value, and they have side effects. These
two are no required to occur together, so even when dependencies
force a specific order of evaluation, this is only the order of
evaluation of the values; it has no effect on side effects.
A good way to look at this is to take the expression tree.
If you have an expression, lets say x+y*z you can rewrite that into an expression tree:
Applying the priority and associativity rules:
x + ( y * z )
After applying the priority and associativity rules, you can safely forget about them.
In tree form:
x
+
y
*
z
Now the leaves of this expression are x, y and z. What this means is that you can evaluate x, y and z in any order you want, and also it means that you can evaluate the result of * and x in any order.
Now since these expressions don't have side effects you don't really care. But if they do, the ordering can change the result, and since the ordering can be anything the compiler decides, you have a problem.
Now, sequence points bring a bit of order into this chaos. They effectively cut the tree into sections.
x + y * z, z = 10, x + y * z
after priority and associativity
x + ( y * z ) , z = 10, x + ( y * z)
the tree:
x
+
y
*
z
, ------------
z
=
10
, ------------
x
+
y
*
z
The top part of the tree will be evaluated before the middle, and middle before bottom.
Precedence has nothing to do with order of evaluation and vice-versa.
Precedence rules describe how an underparenthesized expression should be parenthesized when the expression mixes different kinds of operators. For example, multiplication is of higher precedence than addition, so 2 + 3 x 4 is equivalent to 2 + (3 x 4), not (2 + 3) x 4.
Order of evaluation rules describe the order in which each operand in an expression is evaluated.
Take an example
y = ++x || --y;
By operator precedence rule, it will be parenthesize as (++/-- has higher precedence than || which has higher precedence than =):
y = ( (++x) || (--y) )
The order of evaluation of logical OR || states that (C11 6.5.14)
the || operator guarantees left-to-right evaluation.
This means that the left operand, i.e the sub-expression (x++) will be evaluated first. Due to short circuiting behavior; If the first operand compares unequal to 0, the second operand is not evaluated, right operand --y will not be evaluated although it is parenthesize prior than (++x) || (--y).
It mentions "Expressions with higher-precedence operators are evaluated first."
I am just going to repeat what I said here. As far as standard C and C++ are concerned that article is flawed. Precedence only affects which tokens are considered to be the operands of each operator, but it does not affect in any way the order of evaluation.
So, the link only explains how Microsoft implemented things, not how the language itself works.
I think it's only the
a++ + ++a
epxression problematic, because
a = a++ + ++a;
fits first in 3. but then in the 6. rule: complete evaluation before assignment.
So,
a++ + ++a
gets for a=1 fully evaluated to:
1 + 3 // left to right, or
2 + 2 // right to left
The result is the same = 4.
An
a++ * ++a // or
a++ == ++a
would have undefined results. Isn't it?

Comma operator in a conditional

I have read in a lot of places but I really can't understand the specified behavior in conditionals.
I understand that in assignments it evaluates the first operand, discards the result, then evaluates the second operand.
But for this code, what it supposed to do?
CPartFile* partfile = (CPartFile*)lParam;
ASSERT( partfile != NULL );
bool bDeleted = false;
if (partfile,bDeleted)
partfile->PerformFileCompleteEnd(wParam);
The partfile in the IF was an unnecessary argument, or it have any meaning?
In this case, it is an unnecessary expression, and can be deleted without changing the meaning of the code.
The comma operator performs the expression of the first item, discards the results, then evaluates the result as the last expression.
So partfile,bDeleted would evaulate whatever partfile would, discard that result, then evaluate and return bDeleted
It's useful if you need to evaluate something which has a side-effect (for example, calling a method). In this case, though, it's useless.
For more information, see Wikipedia: Comma operator
bool bDeleted = false;
if (partfile,bDeleted)
partfile->PerformFileCompleteEnd(wParam);
Here, the if statement evaluates partfile,bDeleted, but bDelete is always false, so the expression fails to run. The key question is "what's that all about?". The probable answer is that someone temporarily wanted to prevent the partfile->PerformFileCompleteEnd(wParam); statement from running, perhaps because it was causing some problem or they wanted to ensure later code reported errors properly if that step wasn't performed. So that they're remember how the code used to be, they left the old "if (partfile)" logic there, but added a hardcoded bDeleted variable to document that the partfile->Perform... logic had effectively been "deleted" from the program.
A better way to temporarily disable such code is probably...
#if 0
if (partfile)
partfile->PerformFileCompleteEnd(wParam);
#endif
...though sometimes I try to document the reasoning too...
#ifndef DONT_BYPASS_FILE_COMPLETE_PROCESSING_DURING_DEBUGGING
if (partfile)
partfile->PerformFileCompleteEnd(wParam);
#endif
...or...
if (partFile, !"FIXME remove this after debugging")
partfile->PerformFileCompleteEnd(wParam);
The best choice depends on your tool set and existing habits (e.g. some editors highlight "FIXME" and "TODO" in reverse video so it's hard to miss or grey out #if 0 blocks; you might have particular strings your source-control checkin warns about; preprocessor defines only in debug vs release builds can prevent accidental distribution etc.).
partfile is evaluated, then bDeleted is evaluated and used as the test. Since evaluation of partfile does not have any side effects, removing it from the conditional has no effect.
The comma operator is a rather obscure feature of C/C++. It should not be confused with the comma in initialising lists (ie: int x, int y; ) nor with function call parameter separation comma (ie: func(x, y) ).
The comma operator has one single purpose: to give the programmer a guaranteed order of evaluation of an expression. For almost every operator in C/C++, the order of evaluation of expressions is undefined. If I write
result = x + y;
where x and y are subexpressions, then either x or y can be evaluated first. I cannot know which, it's up to the compiler. If you however write
result = x, y;
the order of evaluation is guaranteed by the standard: left first.
Of course, the uses of this in real world applications are quite limited...

An explanation about Sequence points

Lately, I have seen a lot of questions being asked about output for some crazy yet syntactically allowed code statements like like i = ++i + 1 and i=(i,i++,i)+1;.
Frankly realistically speaking hardly anyone writes any such code in actual programing.To be frank I have never encountered any such code in my professional experience. So I usually end up skipping such questions here on SO. But lately the sheer volume of such Q's being asked makes me think if I am missing out some important theory by skipping such Q's. I gather that the such Q's revolve around Sequence points. I hardly know anything about sequence points to be frank and I am just wondering if not knowing about it is a handicap in some way. So can someone please explain the theory /concept of Sequence points, or If possible point to a resource which explains about the concept. Also, is it worth to invest time in knowing about this concept/theory?
The simplest answer I can think of is:
C++ is defined in terms of an abstract machine. The output of a program executed on the abstract machine is defined ONLY in terms of the order that "side effects" are performed. And Side effects are defined as calls into IO library functions, and changes to variables marked volatile.
C++ compilers are allowed to do whatever they want internally to optimize code, but they cannot change the order of writes to volatile variables, and io calls.
Sequence points define the c/c++ program's heartbeat - side effects before the sequence point are "complete" and side effects after the sequence point have not yet taken place. But, side effects (or, code that can effect a side effect indirectly( within a sequence point can be re-ordered.
Which is why understanding them is important. Without that understanding, your fundamental understanding of what a c++ program is (And how it might be optimized by an agressive compiler) is flawed.
See http://en.wikipedia.org/wiki/Sequence_point.
It's a quite simple concept, so you don't need to invest much time :)
The exact technical details of sequence points can get hairy, yes. But following these guideline solves almost all the practical issues:
If an expression modifies a value, there must be a sequence point between the modification and any other use of that value.
If you're not sure whether two uses of a value are separated by a sequence point or not, break up your code into more statements.
Here "modification" includes assignment operations on the left-hand value in =, +=, etc., and also the ++x, x++, --x, and x-- syntaxes. (It's usually these increment/decrement expressions where some people try to be clever and end up getting into trouble.)
Luckily, there are sequence points in most of the "expected" places:
At the end of every statement or declaration.
At the beginning and end of every function call.
At the built-in && and || operators.
At the ? in a ternary expression.
At the built-in , comma operator. (Most commonly seen in for conditions, e.g. for (a=0, b=0; a<m && b<n; ++a, ++b).) A comma which separates function arguments is not the comma operator and is not a sequence point.
Overloaded operator&&, operator||, and operator, do not cause sequence points. Potential surprises from that fact is one reason overloading them is usually discouraged.
It is worth knowing that sequence points exist because if you don't know about them you can easily write code which seems to run fine in testing but actually is undefined and might fail when you run it on another computer or with different compile options. In particular if you write for example x++ as part of a larger expression that also includes x you can easily run into problems.
I don't think it is necessary to learn all the rules fully - but you need to know when you need to check the specification, or perhaps better - when to rewrite your code to make it so that you aren't relying on sequence points rules if a simpler design would work too.
int n,n_squared;
for(n=n_squared=0;n<100;n_squared+=n+ ++n)
printf("%i squared might or might not be %i\n",n,n_squared);
... doesn't always do what you think it will do. This can make debugging painful.
The reason is the ++n retrieves, modifies, and stores the value of n, which could be before or after n is retrieved. Therefore, the value of n_squared isn't clearly defined after the first iteration. Sequence points guarantee that the subexpressions are evaluated in order.