So I know that C++ has an Operator Precedence and that
int x = ++i + i++;
is undefined because pre++ and post++ are at the same level and thus there is no way to tell which one will get calculated first. But what I was wondering is if
int i = 1/2/3;
is undefined. The reason I ask is because there are multiple ways to look at that (1/2)/3 OR 1/(2/3).
My guess is that it is a undefined behavior but I would like to confirm it.
If you look at the C++ operator precedence and associativity, you'll see that the division operator is Left-to-right associative, which means this will be evaluated as (1/2)/3, since:
Operators that are in the same cell (there may be several rows of operators listed in a cell) are evaluated with the same precedence, in the given direction. For example, the expression a=b=c is parsed as a=(b=c), and not as (a=b)=c because of right-to-left associativity.
In your example the compiler is free to evaluate "1" "2" and "3" in any order it likes, and then apply the divisions left to right.
It's the same for the i++ + i++ example. It can evaluate the i++'s in any order and that's where the problem lies.
It's not that the function's precedence isn't defined, it's that the order of evaluation of its arguments is.
The first code snippet is undefined behaviour because variable i is being modified multiple times inbetween sequence points.
The second code snippet is defined behaviour and is equivalent to:
int i = (1 / 2) / 3;
as operator / has left-to-right associativity.
It is defined, it goes from left to right:
#include <iostream>
using namespace std;
int main (int argc, char *argv[]) {
int i = 16/2/2/2;
cout<<i<<endl;
return 0;
}
print "2" instead of 1 or 16.
It might be saying that it is undefined because you have chosen an int, which is the set of whole numbers.
Try a double or float which include fractions.
Related
#include <iostream>
using namespace std;
int main()
{
int arr[3] = { 10, 20, 30 };
cout << arr[-2] << endl;
cout << -2[arr] << endl;
return 0;
}
Output:
4196160
-30
Here arr[-2] is out of range and invalid, causing undefined behavior.
But -2[arr] evaluates to -30. Why?
Isn't arr[-2] equivalent to -2[arr]?
-2[arr] is parsed as -(2[arr]). In C (and in C++, ignoring overloading), the definition of X[Y] is *(X+Y) (see more discussion of this in this question), which means that 2[arr] is equal to arr[2].
The compiler parses this expression
-2
like
unary_minus decimal_integer_literal
That is definitions of integer literals do not include signs.
In turn the expression
2[arr]
is parsed by the compiler as a postfix expression.
Postfix expressions have higher precedence than unary expressions. Thus this expression
-2[arr]
is equivalent to
- ( 2[arr] )
So the unary minus is applied to the lvalue returned by the postfix expression 2[arr].
On the other hand if you wrote
int n = -2;
and then
n[arr]
then this expression would be equivalent to
arr[-2]
-2[arr] is equivalent to -(2[arr]), which is equivalent to -arr[2]. However, (-2)[arr] is equivalent to arr[-2].
This is because E1[E2] is identical to (*((E1)+(E2)))
The underlying problem is with operator precedence. In C++ the [], ie the Subscript operator hold more precedence (somewhat akin to preferance) than the - unary_minus operator.
So when one writes,
arr[-2]
The compiler first executes arr[] then - , but the unary_minus is enclosed within the bounds of the [-2] so the expression is decomposed together.
In the,
-2[arr]
The same thing happens but, the compiler executes 2[] first the n the - operator so it ends up being
-(2[arr]) not (-2)[arr]
Your understanding of the concept that,
arr[i] i[arr] and *(i+arr) are all the same is correct. They are all equivalent expressions.
If you want to write in that way, write it as (-2)[arr]. You will get the same value for sure.
Check this out for future referance :http://en.cppreference.com/w/cpp/language/operator_precedence
Consider the following c++ code for a simple binary tree DFS traversal:
#include <iostream>
#include <vector>
using namespace std;
int print_vector(vector<char> *vec){
for (auto &it: *vec)
cout<<it;
cout<<'\n';
return 0;
}
int btree_dfs_traversal(int max_depth, int cur_depth, vector<char> position){
if (cur_depth>max_depth)
return 0;
print_vector(&position);
vector<char>left = position;
vector<char>right = position;
left.push_back('l');
right.push_back('r');
return btree_dfs_traversal(max_depth, cur_depth+1, left)+btree_dfs_traversal(max_depth, cur_depth+1, right);
}
int main(int argc, const char * argv[]) {
vector<char> pos;
btree_dfs_traversal(4, 0, pos);
return 0;
}
The function(a minimal example) visits a binary tree, and prints the "position" of each node it visits. The only difference with a standard DFS is that (this part made the difference), most implementation uses a iteration for visiting the two nodes, while my the return statement returns the sum of two visits.
I expected the program to recurse from the left statement, i.e. the output starts with l, ll, lll, ... And indeed in my system(OSX) it is like this, and ideone has this output too.
However, in some friends' system, the output is different. The recursion starts from the right statement, i.e. r, rr... Unfortunately, I do not have exact information of their complier information at present.
My question is: is the sum of two recursion being a undefined behavior such that different compiler can produce different results? Or, it is just wrong to start from right?
The "problem" is that in f(1) + f(2) it is unspecified which function call comes first. Different compilers will pick different order and the order can depend on any number of unrelated factors. From the C++ reference:
Order of evaluation
Order of evaluation of the operands of almost all C++ operators (including the order of evaluation of function arguments in a function-call expression and the order of evaluation of the subexpressions within any expression) is unspecified. The compiler can evaluate operands in any order, and may choose another order when the same expression is evaluated again.
There are exceptions to this rule which are noted below.
Except where noted below, there is no concept of left-to-right or right-to-left evaluation in C++. This is not to be confused with left-to-right and right-to-left associativity of operators: the expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to left-to-right associativity of operator+, but the function call to f3 may be evaluated first, last, or between f1() or f2() at run time.
See http://en.cppreference.com/w/cpp/language/eval_order for exceptions and more info.
Does spaces have any meaning in these expressions:
assume:
int a = 1;
int b = 2;
1)
int c = a++ +b;
Or,
2)
int c = a+ ++b;
When I run these two in visual studio, I get different results. Is that the correct behavior, and what does the spec says?
In general, what should be evaluated first, post-increment or pre-increment?
Edit: I should say that
c =a+++b;
Does not compile on visual studio. But I think it should. The postfix++ seems to be evaluated first.
Is that the correct behavior
Yes, it is.
Postfix ++ first returns the current value, then increments it. so int c = a++ +b means compute the value of c as the sum between current a(take the current a value, and only after taking it, increment a) and b;
Prefix ++ first increments the current value, then returns the value already incremented, so in this case, int c = a+ ++b means compute c as the sum between a and the return of the next expression, ++b, which means b is first incremented, then returned.
In general, what should be evaluated first, post-increment or
pre-increment?
In this example, it is not about which gets evaluated first, it is about what each does - postfix first returns the value, then increments it; prefix first increments the value, then returns it.
Hth
Maybe it helps to understand the general architecture of how programs are parsed.
In a nutshell, there are two stages to parsing a program (C++ or others): lexer and parser.
The lexer takes the text input and maps it to a sequence of symbols. This is when spaces are handled because they tell where the symbols are. Spaces really matter at some places (like between int and c, to not confuse with the symbol intc) but not others (like between a and ++ because there is no ambiguity to separate them).
The first example:
int c = a++ +b;
gives the following symbols, each on its own row (implementations may do this in slightly different ways of course):
int
c
=
a
++
+
b
;
While in the other case:
int c = a+ ++b;
the symbols are instead:
int
c
=
a
+
++
b
;
The parser then builds a tree (Abstract Syntax Tree, AST) out of the symbols and according to some grammar. In particular, according to the C++ grammar, + as an addition has a lower precedence than the unary ++ operator (regardless of postfix or prefix). This means that the first example is semantically the same as (a++) + b while the second is like a+ (++b).
For your examples, the ASTs will be different, because the spaces already lead to a different output at the lexer phase.
Note that spaces are not required between ++ and +, so a+++b would theoretically be fine, but this is not recommended for readability. So, some spaces are important for technical reasons while others are important for us users to read the code.
Yes they should be different; the behaviour is correct.
There are a few possible sources for your confusion.
This question is not about "spaces in operators". You have different operators. If you were to remove the space, you would have a different question. See What is i+++ increment in c++
It's also not about "what should be evaluated first, post-increment or pre-increment". It's about understanding the difference between post-increment and pre-increment.
Both increment the variable to which they apply.
But the post-increment expression returns the value from before the increment.
Whereas the pre-increment expression returns the value after the increment.
I.e.
//Given:
int a = 1;
int b = 2;
//Post-increment
int c = a++ +b; =>
1 + 2; (and a == 2) =>
3;
//Pre-increment
int c = a+ ++b; =>
1 + 3; (and b == 3) =>
4;
Another thing that might be causing confusion. You wrote: a++ +b;. And you may be assuming that +b is the unary + operator. This would be an incorrect assumption because you have both left and right operands making that + a binary additive operator (as in x + y).
Final possible confusion. You may be wondering why:
in a++ +b the ++ is a post-increment operator applied to a.
whereas in a+ ++b it's a pre-increment operator applied to b.
The reason is that ++ has higher precedence than the binary additive +. And in both cases it would be impossible to apply ++ to +.
What's the order C++ does in chained multiplication ?
int a, b, c, d;
// set values
int m = a*b*c*d;
operator * has left to right associativity:
int m = ((a * b) * c) * d;
While in math it doesn't matter (multiplication is associative), in case of both C and C++ we may have or not have overflow depending on the order.
0 * INT_MAX * INT_MAX // 0
INT_MAX * INT_MAX * 0 // overflow
And things are getting even more complex if we consider floating point types or operator overloading. See comments of #delnan and #melpomene.
Yes, the order is from left to right.
int m = a * b * c * d;
If you are more interested in the topic of evaluation order or operator precedence, then you might be surprised how some ++ operations behave, even differently with the version of C.
http://en.cppreference.com/w/c/language/eval_order
http://en.cppreference.com/w/c/language/operator_precedence
The order is from left to right in this case
where
int m=a*b*c*d;
Here, first (a*b) is computed then the result is multiplied by c and then d as shown using parentheses:
int m=(((a*b)*c)*d);
The order is nominally left to right. But the optimizers in C++ compilers I've used feel free to change that order for data types they think they understand. If you overload operator* and the optimizer can't see through your overload, then it can't change the order. But when you multiply a sequence of things (variables, constants, function results, etc.) whose type is double, the optimizer might use the associative and commutative property of real number multiplication as if it were true in float or double multiplication. That can lead to some surprises when least significant bits matter to you.
So far as I understand, the standard allows that optimization, but I am far from a "language lawyer" so my best guess on the standard is not a pronouncement to be trusted (as compared with my experience on what compilers actually do).
There isn't any special "order" it multiplies as shown, left to right.
int m = a * b * c * d;
The order of operations comes into affect when using addition/ subtraction with dividing/ multiplying. Otherwise it is always left to right. Regardless, with the example, the solution would be the same no matter which order they were in.
I recently got confused by the following c++ snippet:
#include <cstdio>
int lol(int *k){
*k +=5;
return *k;
}
int main(int argc, const char *argv[]){
int k = 0;
int w = k + lol(&k);
printf("%d\n", w);
return 0;
}
Take a look at line:
int w = k + lol(&k);
Until now I thought that this expression would be evaluated from left to right: take current value of k (which before calll to lol function is 0) and then add it to the result of lol function. But compiler proves me I'm wrong, the value of w is 10. Even if I switch places to make it
int w = lol(&k) + k;
the result would be still 10. What am I doing wrong?
Tomek
This is because the parameters in an expression are not specified to be evaluated in any particular order.
The compiler is free to execute either parameter k or lol(&k) first. There are no sequence points in that expression. This means that the side-effects of the parameters can be executed in any order.
So in short, it's not specified whether the code prints 5 or 10. Both are valid outputs.
The exception to this is short-circuiting in boolean expressions because && and || are sequence points. (see comments)
This code either yields 5 or 10 depending on the choice of evaluation oder of the function call relative to that of the left side of +.
Its behavior is not undefined because a function call is surrounded by two sequence points.
Plus is by definition commutative, so the order in your example is totally implementation-defined.
Mysticial is right when mentioning sequence points. Citing Wikipedia article (don't have C++ standard at hand):
A sequence point in imperative programming defines any point in a
computer program's execution at which it is guaranteed that all side
effects of previous evaluations will have been performed, and no side
effects from subsequent evaluations have yet been performed. They are
often mentioned in reference to C and C++, because the result of some
expressions can depend on the order of evaluation of their
subexpressions. Adding one or more sequence points is one method of
ensuring a consistent result, because this restricts the possible
orders of evaluation.
The article also has a list of sequence point in C++.