Who defines C operator precedence and associativity? - c++

Introduction
In every textbook on C/C++, you'll find an operator precedence and associativity table such as the following:
http://en.cppreference.com/w/cpp/language/operator_precedence
One of the questions on StackOverflow asked something like this:
What order do the following functions execute:
f1() * f2() + f3();
f1() + f2() * f3();
Referring to the chart above, I confidently replied that function calls have left-to-right associativity, so in both statements the functions are evaluated in this order:
f1() -> f2() -> f3()
After the functions are evaluated you finish the evaluation like this:
(a1 * a2) + a3
a1 + (a2 * a3)
To my surprise, many people told me I was flat out wrong. Determined to prove them wrong, I decided to turn to the ANSI C11 standard. I was once again surprised to find out that very little is mentioned on operator precedence and associativity.
Questions
If my belief that functions are always evaluated from left-to-right is wrong, what does the table referring to function precedence and associativity really mean?
Who defines operator precedence and associativity if it's not ANSI? If it is ANSI who makes the definition, why is little mentioned about operator precedence and associativity? Is operator precedence and associativity inferred from the ANSI C standard or is it defined in Mathematics?

Operator precedence is defined in the appropriate standard. The standards for C and C++ are the One True Definition of what exactly C and C++ are. So if you look closely, the details are there. In fact, the details are in the grammar of the language. For example, take a look at the grammar production rule for + and - in C++ (collectively, additive-expressions):
additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression
additive-expression - multiplicative-expression
As you can see, a multiplicative-expression is a subrule of an additive-expression. This means that if you have something like x + y * z, the y * z expression is a subexpression of x + y * z. This defines the precedence between these two operators.
We can also see that the left operand of an additive-expression expands to another additive-expression, which means that with x + y + z, x + y is a subexpression of it. This defines the associativity.
Associativity determines how adjacent uses of the same operator will be grouped. For example, + is left-to-right associative, which means that x + y + z will be grouped like so: (x + y) + z.
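To see how a grammar alone can encode grouping, here is a minimal sketch of a recursive-descent parser that mirrors the additive-expression and multiplicative-expression rules. It is an illustration only, not how any particular compiler is written: the names Parser and group are hypothetical, operands are limited to single digits, and the left-recursive rules are written as loops, which is one common way to obtain left-to-right grouping.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical mini-parser: single-digit operands and binary + - * / only.
struct Parser {
    std::string src;
    std::size_t pos;

    // Stand-in for everything that binds tighter than * and /: one digit.
    std::string primary() { return std::string(1, src.at(pos++)); }

    // multiplicative-expression: operands are primaries.
    std::string multiplicative() {
        std::string left = primary();
        while (pos < src.size() && (src[pos] == '*' || src[pos] == '/')) {
            char op = src[pos++];
            std::string right = primary();
            left = "(" + left + op + right + ")";  // fold to the left
        }
        return left;
    }

    // additive-expression: operands are multiplicative-expressions,
    // which is exactly what gives * and / the higher precedence.
    std::string additive() {
        std::string left = multiplicative();
        while (pos < src.size() && (src[pos] == '+' || src[pos] == '-')) {
            char op = src[pos++];
            std::string right = multiplicative();
            left = "(" + left + op + right + ")";  // fold to the left
        }
        return left;
    }
};

// Returns the input with its grouping made explicit by parentheses.
std::string group(const std::string& text) {
    Parser parser{text, 0};
    std::string result = parser.additive();
    assert(parser.pos == text.size());  // the whole input was consumed
    return result;
}
```

Under these assumptions, group("1+2*3") yields "(1+(2*3))" and group("1-2-3") yields "((1-2)-3)". Nothing in the parser says when the operands would be computed; it only decides the shape of the tree.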
Don't mistake this for order of evaluation. There is absolutely no reason why the value of z could not be computed before x + y is. What matters is that it is x + y that is computed and not y + z.
For the function call operator, left-to-right associativity means that f()() (which could happen if f returned a function pointer, for example) is grouped like so: (f())() (of course, the other direction wouldn't make any sense).
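A small sketch of the f()() case, assuming a hypothetical f that returns a pointer to a hypothetical function answer:

```cpp
#include <cassert>

int answer() { return 42; }

using IntFn = int (*)();

// f returns a pointer to answer, so the result of f() is itself callable.
IntFn f() { return answer; }

int call_result() {
    int direct = f()();        // grouped as (f())()
    IntFn returned = f();      // the same thing spelled out in two steps
    int spelled_out = returned();
    assert(direct == spelled_out);
    return direct;
}
```

Here call_result() returns 42: the first call produces the pointer, and the second call is applied to that result.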
Now let's consider the example you were looking at:
f1() + f2() * f3()
The * operator has higher precedence than the + operator, so the expressions are grouped like so:
f1() + (f2() * f3())
We don't even have to consider associativity here, because we don't have any of the same operator adjacent to each other.
The function call expressions are, however, not ordered relative to one another. There's no reason f3 couldn't be called first, then f1, and then f2. The only requirement in this case is that the operands of an operator are evaluated before the operator itself is. That means f2 and f3 have to be called before the * is evaluated, and both the * must be evaluated and f1 must be called before the + is evaluated.
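The following sketch makes the distinction observable. The bodies of f1, f2 and f3 and the call_log string are assumptions made up for illustration; each function records its own call and returns a fixed value.

```cpp
#include <cassert>
#include <string>

// Records the order in which the three functions are called.
std::string call_log;

int f1() { call_log += '1'; return 2; }
int f2() { call_log += '2'; return 3; }
int f3() { call_log += '3'; return 4; }

int evaluate() {
    call_log.clear();
    int result = f1() + f2() * f3();  // grouped as f1() + (f2() * f3())
    // All three calls happened, but the language does not say in which order,
    // so the contents of call_log may differ between compilers.
    assert(call_log.size() == 3);
    return result;
}
```

The value returned by evaluate() is fixed by the grouping: 2 + (3 * 4) is 14, never (2 + 3) * 4. The contents of call_log, by contrast, could legitimately be any permutation of "123".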
Some operators do, however, impose a sequencing on the evaluation of their operands. For example, in x || y, x is always evaluated before y. This allows for short-circuiting, where y does not need to be evaluated if x is known already to be true.
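A minimal sketch of the short-circuit behaviour, using hypothetical helpers left and right and a counter that records how often the right operand runs:

```cpp
#include <cassert>

int right_calls = 0;  // how many times the right operand was evaluated

bool left(bool value) { return value; }
bool right() { ++right_calls; return true; }

int count_right_evaluations(bool left_value) {
    right_calls = 0;
    bool result = left(left_value) || right();
    assert(result);  // right() returns true, so the result is true either way
    return right_calls;
}
```

count_right_evaluations(true) returns 0 because the right operand is skipped, and count_right_evaluations(false) returns 1 because it has to be evaluated after the left one.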
The order of evaluation was previously defined in C and C++ with the use of sequence points, and both have changed terminology to define things in terms of a sequenced before relationship. For more information, see Undefined Behaviour and Sequence Points.

The precedence of operators in the C Standard is indicated by the syntax.
(C99, 6.5p3) "The grouping of operators and operands is indicated by the syntax. 74)"
74) "The syntax specifies the precedence of operators in the evaluation of an expression"
C99 Rationale also says
"The rules of precedence are encoded into the syntactic rules for each operator."
and
"The rules of associativity are similarly encoded into the syntactic rules."
Also note that associativity has nothing to do with evaluation order. In:
f1() * f2() + f3()
the function calls can be evaluated in any order. The C syntactic rules say that f1() * f2() + f3() means (f1() * f2()) + f3(), but the evaluation order of the operands in the expression is unspecified.

One way to think about precedence and associativity is to imagine that the language only allows statements containing an assignment and one operator, rather than multiple operators. So a statement like:
a = f1() * f2() + f3();
would not be allowed, since it has 5 operators: 3 function calls, multiplication, and addition. In this simplified language, you would have to assign everything to temporaries and then combine them:
temp1 = f1();
temp2 = f2();
temp3 = temp1 * temp2;
temp4 = f3();
a = temp3 + temp4;
Associativity and precedence specify that the last two statements must be performed in that order, since multiplication has higher precedence than addition. But it doesn't specify the relative order of the first 3 statements; it would be just as valid to do:
temp4 = f3();
temp2 = f2();
temp1 = f1();
temp3 = temp1 * temp2;
a = temp3 + temp4;
sftrabbit gave an example where associativity of function call operators is relevant:
a = f()();
When simplifying it as above, this becomes:
temp = f();
a = temp();

Precedence and associativity are defined in the standard, and they decide how to build the syntax tree. Precedence works by operator type (1+2*3 is 1+(2*3) and not (1+2)*3) and associativity works by operator position (1+2+3 is (1+2)+3 and not 1+(2+3)).
Order of evaluation is different - it does not define how to build the syntax tree - it defines how to evaluate the nodes of operators in the syntax tree. Order of evaluation is defined not to be defined - you can never rely on it because compilers are free to choose any order they see fit. This is done so compilers could try to optimize the code. The idea is that programmers write code that shouldn't be affected by order of evaluation, and yield the same results no matter the order.

Left-to-right associativity means that f() - g() - h() means (f() - g()) - h(), nothing more. Suppose f returns 1. Suppose g returns 2. Suppose h returns 3. Left-to-right associativity means the result is (1 - 2) - 3, or -4: a compiler is still permitted to first call g and h, that has nothing to do with associativity, but it is not allowed to give a result of 1 - (2 - 3), which would be something completely different.
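Because grouping is fixed by the language, it can be checked at compile time. The sketch below uses plain integer literals in place of the function results, with subtraction for the associativity check because there the grouping changes the value.

```cpp
// Precedence: * groups more tightly than +.
static_assert(1 + 2 * 3 == 1 + (2 * 3), "parsed as 1 + (2 * 3)");
static_assert(1 + 2 * 3 != (1 + 2) * 3, "not parsed as (1 + 2) * 3");

// Associativity: - groups from left to right.
static_assert(1 - 2 - 3 == (1 - 2) - 3, "parsed as (1 - 2) - 3");
static_assert(1 - 2 - 3 == -4, "so the value is -4");
static_assert(1 - 2 - 3 != 1 - (2 - 3), "not parsed as 1 - (2 - 3), which is 2");
```

These assertions hold on every conforming compiler, whatever order it picks for evaluating operands at run time.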

Related

what is the difference between operators associativity and order of evaluation in c++

What is the difference between operators associativity and order of evaluation?
I understood operator associativity to be the rule that orders operators within the same precedence group, but I cannot see how that differs from order of evaluation.
Associativity informs the order of evaluation of the components of an expression, but it does not entirely define it.
Consider this expression: a + b + c. Associativity of + is left-to-right, so we know that this is conceptually equivalent to ((a + b) + c). What this means is that the expression a + b gets evaluated before the addition of that result to c. That informs the order of evaluation.
But this does not entirely define ordering. That is, it does not mean that a or b gets evaluated before c. It is entirely possible for a compiler to evaluate c first, then a and then b, then add a and b, and finally add that to the result of c. Indeed, in C++14, it is possible for a compiler to partially evaluate c, then evaluate part of a, then some of b, then some of c, and so on. That all depends on how complex a, b, and c are.
This is particularly important if one of these is a function call: a(x + y, z) + b + c. The compiler can evaluate x, then c, then z, then y, then b, then x + y, then evaluate a, then call the result of that evaluation, then add that to b, then add that to c.
In C++, "associativity" refers to how adjacent operators of the same precedence are grouped with their operands, while "order of evaluation" refers to the order in which the operands are actually evaluated. For example, the addition operator (+) has left-to-right associativity, meaning that a chain of additions is grouped from the left. The order in which the operands themselves are evaluated, however, is not guaranteed, and it can affect the result of an expression whose operands have side effects. Understanding these concepts is important for writing correct and predictable code.

What is the difference between a sequence point and operator precedence?

Consider the classical sequence point example:
i = i++;
The C and C++ standards state that the behavior of the above expression is undefined because the = operator is not associated with a sequence point.
What confuses me is that ++ has a higher precedence than = and so, the above expression, based on precedence, must evaluate i++ first and then do the assignment. Thus, if we start with i = 0, we should always end up with i = 0 (or i = 1, if the expression was i = ++i) and not undefined behavior. What am I missing?
All operators produce a result. In addition, some operators, such as assignment operator = and compound assignment operators (+=, ++, >>=, etc.) produce side effects. The distinction between results and side effects is at the heart of this question.
Operator precedence governs the order in which operators are applied to produce their results. For instance, precedence rules require that * goes before +, + goes before &, and so on.
However, operator precedence says nothing about applying side effects. This is where sequence points (sequenced before, sequenced after, etc.) come into play. They say that in order for an expression to be well-defined, the application of side effects to the same location in memory must be separated by a sequence point.
This rule is broken by i = i++, because both ++ and = apply their side effects to the same variable i. First, ++ goes, because it has higher precedence. It computes its value by taking i's original value prior to the increment. Then = goes, because it has lower precedence. Its result is also the original value of i.
The crucial thing that is missing here is a sequence point separating the side effects of the two operators. This is what makes the behavior undefined.
Operator precedence (and associativity) state the order in which an expression is parsed and executed. However, this says nothing about the order of evaluation of the operands, which is a different term. Example:
a() + b() * c()
Operator precedence dictates that the results of b() and c() must be multiplied before the product is added to the result of a().
However, it says nothing about the order in which these functions should be executed. The order of evaluation of each operator specifies this. Most often, the order of evaluation is unspecified (unspecified behavior), meaning that the standard lets the compiler do it in any order it likes. The compiler need not document this order nor does it need to behave consistently. The reason for this is to give compilers more freedom in expression parsing, meaning faster compilation and possibly also faster code.
In the above example, I wrote a simple test program and my compiler executed the above functions in the order a(), b(), c(). The fact that the program needs to execute both b() and c() before it can multiply the results, doesn't mean that it must evaluate those operands in any given order.
This is where sequence points come in. It is a given point in the program where all previous evaluations (and operations) must be done. So sequence points are mostly related to order of evaluation and not so much operator precedence.
In the example above, the three operands are unsequenced in relation to each other, meaning that no sequence point dictates the order of evaluation.
Therefore it turns problematic when side effects are introduced in such unsequenced expressions. If we write i++ + i++ * i++, then we still don't know the order in which these operands are evaluated, so we can't determine what the result will be. This is because both + and * have unspecified/unsequenced order of evaluation.
Had we written i++ || i++ && i++, then the behavior would be well-defined, because the && and || operators specify left-to-right evaluation and there is a sequence point between the evaluation of the left and the right operand. Thus if(i++ || i++ && i++) is perfectly portable and safe (although unreadable) code.
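Since the sequencing is guaranteed, the outcome can be worked out and checked. The sketch below assumes C++14 or later (for constexpr functions with local variables); Outcome and run are hypothetical names, and the extra parentheses only make the existing grouping explicit.

```cpp
struct Outcome {
    bool condition;  // value of the whole condition
    int final_i;     // value of i afterwards
};

constexpr Outcome run(int i) {
    // Same grouping as i++ || i++ && i++, because && binds tighter than ||.
    bool condition = i++ || (i++ && i++);
    return Outcome{condition, i};
}

// Starting from 0: the first i++ yields 0, so both remaining operands run.
static_assert(run(0).condition && run(0).final_i == 3, "three increments");
// Starting from 1: the first i++ yields 1, so the rest is skipped.
static_assert(run(1).condition && run(1).final_i == 2, "one increment");
```

Each increment is complete before the next operand starts, which is exactly the guarantee that i++ + i++ * i++ lacks.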
As for the expression i = i++;, the problem here is that the = is defined as (6.5.16):
The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.
This expression is actually close to being well-defined, because the text says that the left operand should not be updated before the right operand is computed. The problem is the very last sentence: the order of evaluation of the operands is unspecified/unsequenced.
And since the expression contains the side effect of i++, it invokes undefined behavior, since we can't know if the operand i or the operand i++ is evaluated first.
(There's more to it, since the standard also says that an operand should not be used twice in an expression for unrelated purposes, but that's another story.)
Operator precedence and order of evaluation are two different things. Let's have a look at them one by one:
Operator precedence rule: In an expression, operands bind more tightly to the operators that have higher precedence.
For example
int a = 5;
int b = 10;
int c = 2;
int d;
d = a + b * c;
In the expression a + b * c, precedence of * is higher than that of + and therefore, b and c will bind to * and expression will be parsed as a + (b * c).
Order of evaluation rule: It describes how operands will be evaluated in an expression. In the statement
d = a>5 ? a : ++a;
the condition a>5 is guaranteed to be evaluated before either a or ++a, and only one of those two is evaluated at all.
But for the expression a + (b * c), although * has higher precedence than +, there is no guarantee that a is evaluated before or after b or c, nor that b and c are evaluated in any particular order relative to each other. a, b and c can be evaluated in any order.
The simple rule is that: operator precedence is independent from order of evaluation and vice versa.
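The conditional operator is one of the few operators that does impose an order, and this can be checked directly. A minimal sketch, assuming C++14 or later and using the hypothetical names Picked and pick:

```cpp
struct Picked {
    int d;  // value of the conditional expression
    int a;  // value of a afterwards
};

constexpr Picked pick(int a) {
    // The condition is evaluated first; then exactly one branch is evaluated.
    int d = a > 5 ? a : ++a;
    return Picked{d, a};
}

// a == 5: the condition is false, so ++a runs and both values become 6.
static_assert(pick(5).d == 6 && pick(5).a == 6, "increment branch taken");
// a == 7: the condition is true, so ++a never runs.
static_assert(pick(7).d == 7 && pick(7).a == 7, "increment branch skipped");
```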
In the expression i = i++, the higher precedence of ++ just tells the compiler to bind i to the ++ operator, and that's it. It says nothing about the order of evaluation of the operands or about which side effect (the one from = or the one from ++) takes place first. The compiler is free to do anything.
Let's rename the i on the left of the assignment to il and the one on the right (in the expression i++) to ir. The expression then looks like this:
il = ir++ // Note that suffix l and r are used for the sake of clarity.
// Both il and ir represents the same object.
Now the compiler is free to evaluate the expression il = ir++ either as
temp = ir; // i = 0
ir = ir + 1; // i = 1 side effect by ++ before assignment
il = temp; // i = 0 result is 0
or
temp = ir; // i = 0
il = temp; // i = 0 side effect by assignment before ++
ir = ir + 1; // i = 1 result is 1
This gives two different results, 0 and 1, depending on the order in which the side effects of the assignment and of ++ are applied, and hence the expression invokes UB.
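The two orderings can be written out as separate, individually well-defined functions. This is only a sketch of the reasoning above, assuming C++14 or later; the function names are hypothetical, and real compilers are not obliged to behave like either one where the behavior is undefined. (Since C++17, the right operand of = is sequenced before the left one, which corresponds to the first ordering; C and earlier C++ standards give no such guarantee.)

```cpp
// Ordering 1: the side effect of ++ lands before the side effect of =.
constexpr int increment_then_assign(int i) {
    int temp = i;  // value of i++ is the old value of i
    i = i + 1;     // side effect of ++
    i = temp;      // side effect of =
    return i;
}

// Ordering 2: the side effect of = lands before the side effect of ++.
constexpr int assign_then_increment(int i) {
    int temp = i;  // value of i++ is the old value of i
    i = temp;      // side effect of =
    i = i + 1;     // side effect of ++
    return i;
}

static_assert(increment_then_assign(0) == 0, "ordering 1 leaves i at 0");
static_assert(assign_then_increment(0) == 1, "ordering 2 leaves i at 1");
```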

Calling function with side effects inside expression

I thought I understood how sequence points work in C++, but this GeeksQuiz question puzzled me:
#include <iostream>
using namespace std;

int f(int &x, int c) {
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    return f(x, c) * x;  // the order of f(x, c) and x is the question here
}

int main() {
    int p = 5;
    cout << f(p, p) << endl;
    return 0;
}
The “correct” answer to this question says it prints 6561. Indeed, in VS2013 it does. But isn't it UB anyway, because there is no guarantee which will be evaluated first, f(x, c) or x? We get 6561 if f(x, c) is evaluated first: the whole thing turns into five recursive calls: the first four (c = 5, 4, 3, 2) continue on, the last one (c = 1) terminates and returns 1, which amounts to 9 ** 4 in the end.
However, if x was evaluated first, then we'd get 6 * 7 * 8 * 9 * 1 instead. The funny thing is, in VS2013 even replacing f(x, c) * x with x * f(x, c) doesn't change the result. Not that it means anything.
According to the standard, is this UB or not? If not, why?
This is UB.
n4140 §1.9 [intro.execution]/15
Except where noted, evaluations of
operands of individual operators and of subexpressions of individual
expressions are unsequenced. [...] If a side effect on a scalar object
is unsequenced relative to [...] value computation using the value of
the same scalar object [...] the behavior is undefined.
Multiplicative operators don't have sequencing explicitly noted.
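One way to make the result independent of the compiler, sketched here on the assumption that you know which order you want, is to split the expression with a local variable so that each statement finishes before the next begins. The names f_call_first and f_read_first and the two run_ helpers are hypothetical.

```cpp
#include <cassert>

// Variant 1: the recursive call completes before x is read.
int f_call_first(int& x, int c) {
    assert(c >= 1);  // the recursion only terminates for c >= 1
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int from_call = f_call_first(x, c);
    return from_call * x;
}

// Variant 2: x is read before the recursive call modifies it.
int f_read_first(int& x, int c) {
    assert(c >= 1);  // the recursion only terminates for c >= 1
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int current = x;
    return f_read_first(x, c) * current;
}

int run_call_first() { int p = 5; return f_call_first(p, p); }
int run_read_first() { int p = 5; return f_read_first(p, p); }
```

run_call_first() returns 6561 (9 * 9 * 9 * 9) and run_read_first() returns 3024 (6 * 7 * 8 * 9), matching the two outcomes described in the question.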
This is UB
Order of evaluation of the operands of almost all C++ operators (including the order of evaluation of function arguments in a function-call expression and the order of evaluation of the subexpressions within any expression) is unspecified. The compiler can evaluate operands in any order, and may choose another order when the same expression is evaluated again.
There are exceptions to this rule which are noted below.
Except where noted below, there is no concept of left-to-right or
right-to-left evaluation in C++. This is not to be confused with
left-to-right and right-to-left associativity of operators: the
expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to
left-to-right associativity of operator+, but the function call to f3
may be evaluated first, last, or between f1() or f2() at run time.

How is this Precedence operators working?

I know this is a silly question, but I can't work out which step I'm missing, so I don't understand why this code produces the output it does.
int i=2;
int c;
c = 2 * - ++ i << 1;
cout<< c;
I have trouble understanding this line:
c = 2 * - ++ i <<1;
I'm getting the result -12, but I can't see how operator precedence is working here.
Have a look at the C++ Operator Precedence table.
The ++i is evaluated, yielding 3.
The unary - is evaluated, yielding -3.
The multiplication is done (see the note below), yielding -6.
The bit shift is evaluated (shifting left by 1 is effectively multiplying by two), yielding -12.
The result -12 is assigned to the variable c.
If you used parentheses to see what operator precedence was doing, you'd get
c = ((2 * (-(++i))) << 1);
Plus that expression is a bit misleading due to the weird spacing between operators. It would be better to write it c = 2 * -++i << 1;
Note: the * here is not the unary *, which dereferences a pointer. It is the multiplication operator, which is a binary operator.
Operator precedence defines the grouping between the operators and their operands. In your example the grouping is as follows
c = ((2 * (-(++i))) << 1);
That's how "precedence of operator is working here" and that's the only thing it does.
The result of this expression is -6 shifted one bit to the left. This happens to be -12 on your platform.
According to your comment in another answer, you mistakenly believe that operator precedence somehow controls what is executed "first" and what is executed "next". This is totally incorrect. Operator precedence has absolutely nothing to do with the order of execution. The only thing operator precedence does, once again, is define the grouping between the operators and their operands. No more, no less.
The order of execution is a totally different thing entirely independent from operator precedence. In fact, C++ language does not define any "order of execution" for expressions containing no sequence points inside (the above one included).

operators computing direction

I encountered something that I can't understand.
I have this code:
cout << "f1 * f1 + f2 * f1 - f1 / f2 is: "<< f1 * f1 + f2 * f1 - f1 / f2 << endl;
All the "f"s are objects, and all the operators are overloaded.
The weird thing is that the first computation is the / operator,
then the second * and then the first *; after that come operator + and, finally, operator -.
So basically, the / and * operators worked from right to left,
and the + and - operators worked from left to right.
I made another test...
I checked this code:
cout << "f1 * f1 / f2 is: " << f1 * f1 / f2 << endl;
Now, the first operator was * and only then operator /.
So now, it worked from left to right.
Can someone help me understand why is there difference in the directions?
Thanks!
This is yet again a question of the order of evaluation of function parameters - C++ does not specify such an order. Your code is equivalent to:
(f1 * f1) + (f2 * f1) - (f1 / f2)
The three multiply and divide operations can be evaluated in any order. This is perhaps clearer with named functions:
add(f1*f1, f2*f1).minus(f1/f2);
The bottom line is that associativity and precedence have nothing to say about the order of evaluation of function parameters and/or sub-expressions. Given the simple expression:
a + b
the C++ (and C) compiler is free to evaluate a first, then b, or b first then a, whether or not the '+' is overloaded.
It is unspecified in what sequence operator arguments will be calculated.
C++ Standard 5/4:
Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual
expressions, and the order in which side effects take place, is unspecified.
Your expression is equivalent to (* and / are operators too, but leave them as is):
operator-( operator+(f1*f1, f2*f1), f1/f2 )
Operator precedence defines the order of operators that have different precedence, so that, e.g., * and / always evaluate before + and -. Then there is the left-to-right rule when multiple operators of the same precedence are concatenated.
However, there is (with the exception of logical and ternary operators) no rule about which of an operator's arguments should be evaluated first. The compiler is free to perform the multiplicative operations in any order it pleases before passing them to the additive operators.
In fact, with the expression f() + g() + h(), the compiler is free to call the functions in reverse order, h(), then g(), then f(), then add together the result of f() and g(), and finally add the result of h(). That wouldn't be a very sensible thing to do in most cases, but it's perfectly legal.
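The behaviour reported in the question can be reproduced with a small logging type. This is a sketch under assumptions: Num, op_log and evaluate are hypothetical stand-ins for the asker's classes, which the question does not show.

```cpp
#include <cassert>
#include <string>

std::string op_log;  // records each overloaded operator as it runs

struct Num {
    int value;
};

Num operator*(Num a, Num b) { op_log += '*'; return Num{a.value * b.value}; }
Num operator/(Num a, Num b) { op_log += '/'; return Num{a.value / b.value}; }
Num operator+(Num a, Num b) { op_log += '+'; return Num{a.value + b.value}; }
Num operator-(Num a, Num b) { op_log += '-'; return Num{a.value - b.value}; }

int evaluate(Num f1, Num f2) {
    op_log.clear();
    Num result = f1 * f1 + f2 * f1 - f1 / f2;
    // Guaranteed by the grouping ((f1 * f1) + (f2 * f1)) - (f1 / f2):
    assert(op_log.size() == 5);
    assert(op_log.back() == '-');  // the subtraction needs every other result
    return result.value;
}
```

With f1 = 6 and f2 = 3 the value is always 36 + 18 - 2 = 52. The log always ends in '-', and '+' always comes after both '*' entries, but where '/' appears relative to the multiplications is up to the compiler, which is what the question observed.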
User-defined operators follow the same precedence and associativity rules as built-in ones.
The precedence rule states that operators with higher precedence are applied before those with lower precedence when they are adjacent in an expression (separated by a single operand).
The associativity rule states the order in which operators are applied when an expression contains adjacent operators of the same precedence.
In your first example the precedence rule applies, but since associativity only concerns adjacent operators, the compiler chooses the order in which it executes the multiplications and the division.
In your second example the associativity rule applies.
Rules of thumb to avoid problems with these (somewhat complex) rules:
If unsure, use parentheses or local variables to force the order.
Avoid side effects when you call functions (or user-defined operators), because the result could be surprising.
When redefining operators, try to be consistent with mathematics.
http://www.cppreference.com/wiki/operator_precedence
Strange behavior, indeed.
But I would say that this should not matter, because a * b / c should be equal to a / c * b if you implemented the operators mathematically.
This just looks like simple operator precedence at work.
In your first example, all the multiplication and division must be done before the addition and subtraction. Since the results of the multiplications and divisions are independent, it really doesn't matter what order they're performed, just that the results are used from left to right in the addition and subtraction.
In the second example, the multiplication and division are not independent and must be performed left to right.
You're always getting the correct results based upon operator precedence. That's all that really matters. The compiler does not guarantee anything about the order of evaluation other than that operator precedence is honored.
Neil is right.
It is a matter of operator associativity and precedence.
The expression is grouped as (f1 * f1) + (f2 * f1) - (f1 / f2), as the rules require; the multiplications and the division are done first, and then the addition and subtraction take place from left to right.
The second example is simple: * and / have the same precedence, so the expression is evaluated according to the associativity rule, which is left to right, hence the order.