I've been debugging some SSE-optimised vector code and noticed some odd behaviour. To be fair, the code style is pretty bad, but what the compiler does still seems wrong to me. Here is the function in question:
inline void daxpy(int n, double alph, const double* x, int incx, double* y, int incy) {
__m128d sse_alph = _mm_load1_pd(&alph);
while (n >= 4) {
n -= 4;
__m128d y1 = _mm_load_pd(y+n), y2 = _mm_load_pd(y+n+2);
__m128d x1 = _mm_load_pd(x+n), x2 = _mm_load_pd(x+n+2);
y1 = _mm_add_pd(y1, _mm_mul_pd(x1, sse_alph));
y2 = _mm_add_pd(y2, _mm_mul_pd(x2, sse_alph));
_mm_store_pd(y+n, y1), _mm_store_pd(y+n+2, y2);
}
}
The function is that the array y = y + alph * x. We guarantee that the arrays both have the same length, n, which is a multiple of 4, and that x and y are aligned on 16-byte boundaries (I've omitted the relevant assertions for clarity).
The last line of the loop has been written with a comma operator so that it looks like the two load lines. The problem is that the first _mm_store_pd call is not executed. Isn't that wrong? I guess the compiler might have decided that only the second call is necessary to evaluate the expression, but it seems pretty obvious that the intrinsic function has a side effect.
Have I misunderstood what is going on here? I realise that using a comma operator like this is pretty poor style - my question is whether the compiler is wrong. The compiler in question is Visual C++ 2010 SP 1.
Building this code with Microsoft Visual Studio 2008, 2010, and 2012 shows they all eliminate the left operand of the comma operator. This happens only if optimization is enabled. When this code is built using gcc 4.8.1, the left operand of the comma operator is not eliminated, even when full optimization is used.
The C99 specification states, "The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation. Then the right operand is evaluated".
In my opinion, the Microsoft optimizer is incorrect to remove this code. This is because the language specification says both operands are evaluated. The only differences between the two operands of the comma operator are the order of their evaluation and which one provides the result for the comma operator. In this case the result is void.
Work-around: replace the comma with a semicolon.
Related
This question already has answers here:
Undefined behavior and sequence points
(5 answers)
Closed 6 years ago.
I have a friend who is getting different output than I do for the following program:
int main() {
int x = 20, y = 35;
x = y++ + x++;
y = ++y + ++x;
printf("%d%d", x, y);
return 0;
}
I am using Ubuntu, and have tried using gcc and clang. I get 5693 from both.
My friend is using Visual Studio 2015, and gets 5794.
The answer I get (5693) makes most sense to me, since:
the first line sets x = x + y (which is x = 20+35 = 55) (note: x was incremented, but assigned over top of, so doesn't matter)
y was incremented and is therefore 36
next line increments both, adds the result and sets it as y (which is y = 37 + 56 = 93)
which would be 56 and 93, so the output is 5693
I could see the VS answer making sense if the post-increment happened after the assignment. Is there some spec that makes one of these answers more right than the other? Is it just ambiguous? Should we fire anyone who writes code like this, making the ambiguity irrelevant?
Note: Initially, we only tried with gcc, however clang gives this warning:
coatedmoose#ubuntu:~/playground$ clang++ strange.cpp
strange.cpp:8:16: warning: multiple unsequenced modifications to 'x' [-Wunsequenced]
x = y++ + x++;
~ ^
1 warning generated.
The Clang warning is alluding to a clause in the standard, C++11 and later, that makes it undefined behaviour to execute two unsequenced modifications to the same scalar object. (In earlier versions of the standard the rule was different although similar in spirit.)
So the answer is that the spec makes all possible answers equally valid, including the program crashing; it is indeed inherently ambiguous.
By the way, Visual C++ actually does have somewhat consistent and logical behaviour in such cases, even though the standard does not require it: it performs all pre-increments first, then does arithmetic operations and assignments, and then finally performs all post-increments before moving on to the next statement. If you trace through the code you've given, you'll see that Visual C++'s answer is what you would expect from this procedure.
What's the order C++ does in chained multiplication ?
int a, b, c, d;
// set values
int m = a*b*c*d;
operator * has left to right associativity:
int m = ((a * b) * c) * d;
While in math it doesn't matter (multiplication is associative), in case of both C and C++ we may have or not have overflow depending on the order.
0 * INT_MAX * INT_MAX // 0
INT_MAX * INT_MAX * 0 // overflow
And things are getting even more complex if we consider floating point types or operator overloading. See comments of #delnan and #melpomene.
Yes, the order is from left to right.
int m = a * b * c * d;
If you are more interested in the topic of evaluation order or operator precedence, then you might be surprised how some ++ operations behave, even differently with the version of C.
http://en.cppreference.com/w/c/language/eval_order
http://en.cppreference.com/w/c/language/operator_precedence
The order is from left to right in this case
where
int m=a*b*c*d;
Here, first (a*b) is computed then the result is multiplied by c and then d as shown using parentheses:
int m=(((a*b)*c)*d);
The order is nominally left to right. But the optimizers in C++ compilers I've used feel free to change that order for data types they think they understand. If you overload operator* and the optimizer can't see through your overload, then it can't change the order. But when you multiply a sequence of things (variables, constants, function results, etc.) whose type is double, the optimizer might use the associative and commutative property of real number multiplication as if it were true in float or double multiplication. That can lead to some surprises when least significant bits matter to you.
So far as I understand, the standard allows that optimization, but I am far from a "language lawyer" so my best guess on the standard is not a pronouncement to be trusted (as compared with my experience on what compilers actually do).
There isn't any special "order" it multiplies as shown, left to right.
int m = a * b * c * d;
The order of operations comes into affect when using addition/ subtraction with dividing/ multiplying. Otherwise it is always left to right. Regardless, with the example, the solution would be the same no matter which order they were in.
Consider the following snippets:
C++:
#include <iostream>
using namespace std;
int main()
{
int x = 10, y = 20;
y = x + (x=y)*0;
cout << y;
return 0;
}
which gives a result of 20, because the value of y is assigned to x since the bracket is executed first according to the Operator Precedence Table.
VB.NET:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim x As Integer = 10
Dim y As Integer = 20
y = x + (x = y) * 0
MsgBox(y)
End Sub
which instead gives a result of 10.
What is the reason for this difference?
What is the order of execution of operators in VB.NET?
Unlike in C++, VB.NET's = is not always an assignment. It can also be the equality comparison operator (== in C++) if it appears inside an expression. Therefore your two expressions are not the same. They are not even equivalent. The VB.NET code does not do what you might think it does.
First to your C++ code: Like you're saying, the assignment x=y happens first; thus your code is roughly equivalent to this: (It seems that was incorrect; see Jens' answer.) Since you end up with y being 20, it is likely that your C++ compiler evaluated your code as if you had written this:
int x = 10, y = 20;
x = y;
y = x + x*0; // which is equivalent to `y = 20 + 20*0;`, or `y = 20 + 0;`, or `y = 20;`
In VB.NET however, because the = in your subexpression (x=y) is not actually interpreted as an assignment, but as a comparison, the code is equivalent to this:
Dim x As Integer = 10
Dim y As Integer = 20
y = 10 + False*0 ' which is equivalent to `y = 10 + 0*0`, or `y = 10` '
Here, operator precedence doesn't even come into play, but an implicit type conversion of the boolean value False to numeric 0.
(Just in case you were wondering: In VB.NET, assignment inside an expression is impossible. Assignments must always be full statements of their own, they cannot happen "inline". Otherwise it would be impossible to tell whether a = inside an expression meant assignment or comparison, since the same operator is used for both.)
Your C++ snippet is undefined behavior. There is no sequence point between using x as the first argument and assigning y to x, so the compiler can evaluate the sub-expressions in any order. Both
First evaluate the left side: y = 10 + (x=y) * 0 -> y = 10 + (x=20) * 0 -> y = 10 + 20*0
First evaluate the right side: y = x + (x=20) * 0 -> y = 20 + 20 * 0
It is also generally a very bad style to put assignments inside expressions.
This answer was intended as a comment, but its length quickly exceeded the limit. Sorry :)
You are confusing operator precedence with evaluation order. (This is a very common phenomenon, so don't feel bad). Let me try to explain with a simpler example involving more familiar operators:
If you have an expression like a + b * c then yes, the multiplication will always happen before the addition, because the * operator binds tighter than + operator. So far so good? The important point is that C++ is allowed to evaluate the operands a, b and c in any order it pleases. If one of those operands has a side effect which affects another operand, this is bad for two reasons:
It may cause undefined behavior (which in your code is indeed the case), and more importantly
It is guaranteed to give future readers of your code serious headaches. So please don't do it!
By the way, Java will always evaluate a first, then b, then c, "despite" the fact that multiplication happens before addition. The pseudo-bytecode will look like push a; push b; push c; mul; add;
(You did not ask about Java, but I wanted to mention Java to give an example where evaluating a is not only feasible, but guaranteed by the language specification. C# behaves the same way here.)
So I'm looking over C++ operator rules as I do when my programs start behaving wonkily. And I come across the comma operator. Now, I have known it was there for a while but never used it, so I began reading, and I come across this little gem:
if (int y = f(x), y > x)
{
// statements that use y
}
I had never thought about using commas' first arguments' side-effects to get locally-scoped variables without the need for bulky block-delimited code or repeated function calls. Naturally, this all excited me greatly, and I immediately ran off to try it.
test_comma.cpp: In function 'int main()':
test_comma.cpp:9:18: error: expected ')' before ',' token
if (int y = f(x), y > x) {
I tried this on both a C and C++ compiler, and neither of them liked it. I tried instead declaring y in the outer scope, and it compiled and ran just fine without the int in the if condition, but that defeats the purpose of the comma here. Is this just a GCC implementation quirk? The opinion of the Internet seems to be that this should be perfectly valid C (and ostensibly, to my eye, C++) code; there is no mention of this error on any GCC or C++ forum that I've seen.
EDIT: Some more information. I am using MinGW GCC 4.8.1-4 on Windows 7 64-bit (though obviously my binaries are 32-bit; I need to install mingw-w64 one of these days).
I also tried using this trick outside of a conditional statement, as below:
int y = (int z = 5, z);
This threw up two different errors:
test_comma.cpp: In function 'int main()':
test_comma.cpp:9:11: error: expected primary-expression before 'int'
int y = (int z = 5, z);
^
test_comma.cpp:9:11: error: expected ')' before 'int'
With creative use of parentheses in my if statement above, I managed to get the same errors there, too.
Contrary to what several other people have claimed, declarations inside the if conditional are perfectly valid. However, your code is not.
The first problem is that you're not actually using the comma operator, but [almost] attempting to declare multiple variables. That is not valid in an if conditional. And, even if it were possible, your second declaration would be entirely broken anyway since you try to redeclare y, and you do so with > instead of =. It all simply makes no sense.
The following code is sort of similar:
if (int y = (f(x), y > x))
Now at least it's half-valid, but you're using y uninitialised and yielding undefined behaviour.
Declarations and expressions are not the same thing, so the following is quite different code:
int y = 0;
if (y = f(x), y > x)
Now you don't have a problem with uninitialised variables, either (because I initialised y myself), and you're getting this "side-effect declaration" that doesn't change the resulting value of the if conditional. But it's about as clear as mud. Look how the precedence forms:
int y = 0;
if ((y = f(x)), (y > x))
That's not really very intuitive.
Hopefully this total catastrophe has been a lesson in avoiding this sort of cryptic code in entirety. :)
You cannot declare variable and apply operator , simultaneously either you are declaring variable (in case of if it would be only one 'cause result needs to be resolved to bool), either you are writing some statement (also resolving to bool) which may include operator , in it.
You need to declare y on the top of if condition:
int y;
if(y=f(x),y>x)
{
}
This will check the last condition defined in the if condition and rest others are executed as general statements.
I was writing a console application that would try to "guess" a number by trial and error, it worked fine and all but it left me wondering about a certain part that I wrote absentmindedly,
The code is:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int x,i,a,cc;
for(;;){
scanf("%d",&x);
a=50;
i=100/a;
for(cc=0;;cc++)
{
if(x<a)
{
printf("%d was too big\n",a);
a=a-((100/(i<<=1))?:1);
}
else if (x>a)
{
printf("%d was too small\n",a);
a=a+((100/(i<<=1))?:1);
}
else
{
printf("%d was the right number\n-----------------%d---------------------\n",a,cc);
break;
}
}
}
return 0;
}
More specifically the part that confused me is
a=a+((100/(i<<=1))?:1);
//Code, code
a=a-((100/(i<<=1))?:1);
I used ((100/(i<<=1))?:1) to make sure that if 100/(i<<=1) returned 0 (or false) the whole expression would evaluate to 1 ((100/(i<<=1))?:***1***), and I left the part of the conditional that would work if it was true empty ((100/(i<<=1))? _this space_ :1), it seems to work correctly but is there any risk in leaving that part of the conditional empty?
This is a GNU C extension (see ?: wikipedia entry), so for portability you should explicitly state the second operand.
In the 'true' case, it is returning the result of the conditional.
The following statements are almost equivalent:
a = x ?: y;
a = x ? x : y;
The only difference is in the first statement, x is always evaluated once, whereas in the second, x will be evaluated twice if it is true. So the only difference is when evaluating x has side effects.
Either way, I'd consider this a subtle use of the syntax... and if you have any empathy for those maintaining your code, you should explicitly state the operand. :)
On the other hand, it's a nice little trick for a common use case.
This is a GCC extension to the C language. When nothing appears between ?:, then the value of the comparison is used in the true case.
The middle operand in a conditional expression may be omitted. Then if the first operand is nonzero, its value is the value of the conditional expression.
Therefore, the expression
x ? : y
has the value of x if that is nonzero; otherwise, the value of y.
This example is perfectly equivalent to
x ? x : y
In this simple case, the ability to omit the middle operand is not especially useful. When it becomes useful is when the first operand does, or may (if it is a macro argument), contain a side effect. Then repeating the operand in the middle would perform the side effect twice. Omitting the middle operand uses the value already computed without the undesirable effects of recomputing it.