For example, we have
int p(void) {
return 4;
}
int q(void) {
return 5;
}
int main(void) {
int x = p() + q();
return 0;
}
What does the stack frame look like in this case? To be exact, are p and q evaluated simultaneously, or is p evaluated first (yielding 4) before the program proceeds to q?
From cppreference
Order of evaluation of the operands of almost all C++ operators
(including the order of evaluation of function arguments in a
function-call expression and the order of evaluation of the
subexpressions within any expression) is unspecified. The compiler can
evaluate operands in any order, and may choose another order when the
same expression is evaluated again.
There are exceptions to this rule which are noted below.
Except where noted below, there is no concept of left-to-right or
right-to-left evaluation in C++. This is not to be confused with
left-to-right and right-to-left associativity of operators: the
expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to
left-to-right associativity of operator+, but the function call to f3
may be evaluated first, last, or between f1() or f2() at run time
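To make the quoted rule concrete for p() + q(), here is a minimal sketch that records which call ran first. The call_log string and the add_once helper are hypothetical names added only for this illustration. The two calls never overlap, so the log always holds exactly two characters, but either order is allowed.

```cpp
#include <string>

// Hypothetical call log, added only for this illustration.
std::string call_log;

int p() { call_log += 'p'; return 4; }
int q() { call_log += 'q'; return 5; }

// Evaluates p() + q() once. The sum is always 9; call_log afterwards
// holds either "pq" or "qp", depending on the order the compiler chose.
int add_once() {
    call_log.clear();
    return p() + q();
}
```

Calling add_once() and then inspecting call_log shows the order a particular compiler picked. A different compiler or different flags may legitimately produce the other order.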
Related
I am trying to work out the order in which functions are executed inside a cout statement.
I tried this code:
#include <iostream>
using namespace std;
int i=0;
int sum(int a)
{
i++;
return a+i;
}
int main()
{
cout << sum(3) << sum(2) ;
return 0;
}
"I expected the output to be 44, but the actual output is 53"
As stated here: https://en.cppreference.com/w/cpp/language/eval_order
Order of evaluation of any part of any expression, including order of
evaluation of function arguments is unspecified (with some exceptions
listed below). The compiler can evaluate operands and other
subexpressions in any order, and may choose another order when the
same expression is evaluated again.
There is no concept of left-to-right or right-to-left evaluation in
C++. This is not to be confused with left-to-right and right-to-left
associativity of operators: the expression a() + b() + c() is parsed
as (a() + b()) + c() due to left-to-right associativity of operator+,
but the function call to c may be evaluated first, last, or between
a() or b() at run time
In your line
cout << sum(3) << sum(2)
the two operator<< calls are grouped left-to-right because << is left-associative, but that grouping says nothing about when the subexpressions sum(3) and sum(2) are evaluated. Before C++17 their relative order is unspecified, so the compiler may pick whichever it finds convenient (usually whatever optimizes best). Since C++17 the left operand of << is sequenced before the right operand, so a conforming C++17 compiler prints 44.
For reference, here is a list of operator precedence and associativity: https://en.cppreference.com/w/cpp/language/operator_precedence
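If the output must not depend on the compiler or language version, one option is to give each call its own statement. The sketch below reuses sum and i from the question; print_in_order is a hypothetical helper that returns the text instead of writing to cout, so the result is easy to check.

```cpp
#include <sstream>
#include <string>

int i = 0;

int sum(int a) {
    i++;
    return a + i;
}

// Each call sits in its own statement, so sum(3) always runs before sum(2).
std::string print_in_order() {
    i = 0;
    int first = sum(3);   // i becomes 1, result is 4
    int second = sum(2);  // i becomes 2, result is 4
    std::ostringstream out;
    out << first << second;
    return out.str();     // "44"
}
```

The same two locals can be streamed to cout directly. The point is only that the calls no longer share one expression.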
Why does the following print bD aD aB aA aC aU instead of aD aB aA aC bD aU? In other words, why is b-- evaluated before --++a--++?
#include <iostream>
using namespace std;
class A {
char c_;
public:
A(char c) : c_(c) {}
A& operator++() {
cout << c_ << "A ";
return *this;
}
A& operator++(int) {
cout << c_ << "B ";
return *this;
}
A& operator--() {
cout << c_ << "C ";
return *this;
}
A& operator--(int) {
cout << c_ << "D ";
return *this;
}
void operator+(A& b) {
cout << c_ << "U ";
}
};
int main()
{
A a('a'), b('b');
--++a-- ++ +b--; // the culprit
}
From what I gather, here's how the expression is parsed by the compiler:
Preprocessor tokenization: -- ++ a -- ++ + b --;
Operator precedence1: (--(++((a--)++))) + (b--);
+ is left-to-right associative, but nonetheless the compiler may choose to evaluate the expression on the right (b--) first.
I'm assuming the compiler chooses to do it this way because it leads to better optimized code (fewer instructions). However, it's worth noting that I get the same result when compiling with /Od (MSVC) and -O0 (GCC). This brings me to my question:
Since I was asked this on a test which should in principle be implementation/compiler-agnostic, is there something in the C++ standard that prescribes the above behavior, or is it truly unspecified? Can someone quote an excerpt from the standard which confirms either? Was it wrong to have such a question on the test?
1 I realize the compiler doesn't really know about operator precedence or associativity, rather it cares only about the language grammar, but this should get the point across either way.
The expression statement
--++a-- ++ +b--; // the culprit
can be rewritten step by step. First as
( --++a-- ++ ) + ( b-- );
then as
( -- ( ++ ( ( a-- ) ++ ) ) ) + ( b-- );
and finally as
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator + ( b.operator --( 0 ) );
Here is a demonstrative program.
#include <iostream>
using namespace std;
class A {
char c_;
public:
A(char c) : c_(c) {}
A& operator++() {
cout << c_ << "A ";
return *this;
}
A& operator++(int) {
cout << c_ << "B ";
return *this;
}
A& operator--() {
cout << c_ << "C ";
return *this;
}
A& operator--(int) {
cout << c_ << "D ";
return *this;
}
void operator+(A& b) {
cout << c_ << "U ";
}
};
int main()
{
A a('a'), b('b');
--++a-- ++ +b--; // the culprit
std::cout << std::endl;
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator + ( b.operator --( 0 ) );
return 0;
}
Its output is
bD aD aB aA aC aU
bD aD aB aA aC aU
You can imagine the last expression written in the functional form like a postfix expression of the form
postfix-expression ( expression-list )
where the postfix expression is
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator +
and the expression-list is
b.operator --( 0 )
The C++ Standard (5.2.2 Function call) says:
8 [Note: The evaluations of the postfix expression and of the arguments
are all unsequenced relative to one another. All side effects of
argument evaluations are sequenced before the function is entered (see
1.9). —end note]
So it is unspecified whether the argument or the postfix expression is evaluated first. According to the output shown, this compiler evaluates the argument first and only then the postfix expression.
I would say they were wrong to include such a question.
Except as noted, the following excerpts are all from §[intro.execution] of N4618 (and I don't think any of this stuff has changed in more recent drafts).
Paragraph 16 has the basic definition of sequenced before, indeterminately sequenced, etc.
Paragraph 18 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
In this case, you're (indirectly) calling some functions. The rules there are fairly simple as well:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.
Putting that into bullet points to more directly indicate order:
First, evaluate the function arguments and whatever designates the function being called.
Evaluate the body of the function itself.
Evaluate another (sub-)expression.
No interleaving is allowed unless something starts up a thread to allow something else to execute in parallel.
So, does any of this change because we're invoking the functions via operator overloads rather than directly? Paragraph 19 says "No":
The sequencing constraints on the execution of the called function (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be.
§[expr]/2 also says:
Uses of overloaded operators are transformed into function calls as described in 13.5. Overloaded operators obey the rules for syntax and evaluation order specified in Clause 5, but the requirements of operand type and value category are replaced by the rules for function call.
Individual operators
The only operators you've used that have somewhat unusual sequencing requirements are post-increment and post-decrement. The standard says (§[expr.post.incr]/1):
The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. —end note ]
In the end, however, this is pretty much just what you'd probably expect: if you pass x++ as a parameter to a function, the function receives the previous value of x, but if x is also in scope inside the function, x will have the incremented value by the time the body of the function starts to execute.
The + operator, however, does not specify ordering of the evaluation of its operands.
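The point about passing x++ as an argument can be checked with a small sketch. The names Seen, observe and demo are hypothetical, and x is made global only so that the called function can read it directly.

```cpp
int x = 0;  // global, so the callee can read it directly

struct Seen {
    int param;   // value the callee received as its argument
    int global;  // value of x as observed inside the callee
};

Seen observe(int param) {
    return Seen{param, x};
}

// The side effect of x++ is complete before the body of observe starts,
// so the callee receives 10 but sees x == 11.
Seen demo() {
    x = 10;
    return observe(x++);
}
```

This outcome is guaranteed by the function-call rule quoted earlier: every side effect of an argument expression is sequenced before the body of the called function.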
Summary
Using overloaded operators does not enforce any sequencing on the evaluation of sub-expressions within an expression, beyond the fact that evaluating an individual operator is a function call, and has the ordering requirements of any other function call.
More specifically, in this case, b-- is the operand to a function call, and --++a-- ++ is the expression that designates the function being called (or at least the object on which the function will be called--the -- designates the function within that object). As noted, ordering between these two is not specified (nor does operator + specify an order of evaluating its left vs. right operand).
Nothing in the C++ standard says things need to be evaluated in this way. C++ has the concept of sequenced-before, where some operations are guaranteed to happen before others. This is a partial order: some operations are sequenced before others, two operations can’t each be sequenced before the other, and if a is sequenced before b, and b is sequenced before c, then a is sequenced before c. However, many kinds of operations have no sequenced-before guarantees. Before C++11, there was instead the concept of a sequence point, which is similar but not quite the same.
Very few operators (only ,, &&, ?:, and ||, I believe) guarantee a sequence point between their arguments (and even then, until C++17, this guarantee doesn’t exist when the operators are overloaded). In particular, the addition does not guarantee any such thing. The compiler is free to evaluate the left-hand side first, to evaluate the right-hand side first, or (I think) even to evaluate them simultaneously.
Sometimes changing optimization options, or changing compilers, changes the results. Apparently you aren’t seeing that, but there are no guarantees here.
Operator precedence and associativity rules are only used to convert your expression from the original "operators in expression" notation to the equivalent "function call" format. After the conversion you end up with a bunch of nested function calls, which are processed in the usual way. In particular, order of parameter evaluation is unspecified, which means that there's no way to say which operand of the "binary +" call will get evaluated first.
Also, note that in your case binary + is implemented as a member function, which creates certain superficial asymmetry between its arguments: one argument is "regular" argument, another is this. Maybe some compilers "prefer" to evaluate "regular" arguments first, which is what leads to b-- being evaluated first in your tests (you might end up with different ordering from the same compiler if you implement your binary + as a freestanding function). Or maybe it doesn't matter at all.
Clang, for example, begins with evaluating the first operand, leaving b-- for later.
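If the order the asker expected (aD aB aA aC bD aU) is actually wanted, it can be forced by evaluating each operand in its own statement. The sketch below is a variant of class A from the question that appends to a hypothetical trace string instead of writing to cout; note and ordered_trace are likewise names invented for this illustration.

```cpp
#include <string>

std::string trace;  // hypothetical log used instead of cout

class A {
    char c_;
    void note(char tag) { trace += c_; trace += tag; trace += ' '; }
public:
    explicit A(char c) : c_(c) {}
    A& operator++()    { note('A'); return *this; }
    A& operator++(int) { note('B'); return *this; }
    A& operator--()    { note('C'); return *this; }
    A& operator--(int) { note('D'); return *this; }
    void operator+(A&) { note('U'); }
};

// Splitting the expression into statements fixes the order of the operands.
std::string ordered_trace() {
    trace.clear();
    A a('a'), b('b');
    A& lhs = --++a-- ++;  // logs aD aB aA aC
    A& rhs = b--;         // logs bD
    lhs + rhs;            // logs aU
    return trace;
}
```

Within the left operand the order is already fixed, because each operator call needs the result of the previous one. Only the relative order of the two operands of + was open.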
Take into account the precedence and associativity of operators in C++:
a++ a-- Suffix/postfix increment and decrement. Left-to-right
++a --a Prefix increment and decrement. Right-to-left
a+b a-b Addition and subtraction. Left-to-right
Keeping this list in mind, you can easily read the expression even without parentheses:
--++a--+++b--;//will follow with
--++a+++b--;//and so on
--++a+b--;
--++a+b;
--a+b;
a+b;
And don't forget the essential difference between prefix and postfix operators in terms of when the variable is updated relative to the value the expression yields.
I thought I understood how sequence points work in C++, but this GeeksQuiz question puzzled me:
int f(int &x, int c) {
c = c - 1;
if (c == 0) return 1;
x = x + 1;
return f(x, c) * x;
}
int main() {
int p = 5;
cout << f(p, p) << endl;
return 0;
}
The “correct” answer to this question says it prints 6561. Indeed, in VS2013 it does. But isn't it UB anyway, because there is no guarantee which will be evaluated first: f(x, c) or x? We get 6561 if f(x, c) is evaluated first: the whole thing turns into five recursive calls: the first four (c = 5, 4, 3, 2) continue on, the last one (c = 1) terminates and returns 1, which amounts to 9 ** 4 in the end.
However, if x was evaluated first, then we'd get 6 * 7 * 8 * 9 * 1 instead. The funny thing is, in VS2013 even replacing f(x, c) * x with x * f(x, c) doesn't change the result. Not that it means anything.
According to the standard, is this UB or not? If not, why?
This is UB.
n4140 §1.9 [intro.execution]/15
Except where noted, evaluations of
operands of individual operators and of subexpressions of individual
expressions are unsequenced. [...] If a side effect on a scalar object
is unsequenced relative to [...] value computation using the value of
the same scalar object [...] the behavior is undefined.
Multiplicative operators don't have sequencing explicitly noted.
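However the original expression is classified, its result depends on the order chosen. A minimal sketch of two rewrites that pin the order down, one for each outcome discussed in the question, is shown below. The function names f_call_first, f_read_first and the two run_ helpers are hypothetical.

```cpp
// Variant 1: finish the recursive call, then read x.
int f_call_first(int& x, int c) {
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int rest = f_call_first(x, c);  // all deeper increments happen here
    return rest * x;                // x is read after the call returns
}

// Variant 2: read x, then make the recursive call.
int f_read_first(int& x, int c) {
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int now = x;                    // snapshot before recursing
    return now * f_read_first(x, c);
}

int run_call_first() { int p = 5; return f_call_first(p, p); }  // 9*9*9*9 = 6561
int run_read_first() { int p = 5; return f_read_first(p, p); }  // 6*7*8*9 = 3024
```

In both variants the read of x and the recursive call are in separate statements, so neither result depends on the compiler.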
This is UB
Order of evaluation of the operands of almost all C++ operators (including the order of evaluation of function arguments in a function-call expression and the order of evaluation of the subexpressions within any expression) is unspecified. The compiler can evaluate operands in any order, and may choose another order when the same expression is evaluated again.
There are exceptions to this rule which are noted below.
Except where noted below, there is no concept of left-to-right or
right-to-left evaluation in C++. This is not to be confused with
left-to-right and right-to-left associativity of operators: the
expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to
left-to-right associativity of operator+, but the function call to f3
may be evaluated first, last, or between f1() or f2() at run time.
Introduction
In every textbook on C/C++, you'll find an operator precedence and associativity table such as the following:
http://en.cppreference.com/w/cpp/language/operator_precedence
One of the questions on StackOverflow asked something like this:
What order do the following functions execute:
f1() * f2() + f3();
f1() + f2() * f3();
Referring to the previous chart, I confidently replied that functions have left-to-right associativity, so in the previous statements they are evaluated like this in both cases:
f1() -> f2() -> f3()
After the functions are evaluated you finish the evaluation like this:
(a1 * a2) + a3
a1 + (a2 * a3)
To my surprise, many people told me I was flat out wrong. Determined to prove them wrong, I decided to turn to the ANSI C11 standard. I was once again surprised to find out that very little is mentioned on operator precedence and associativity.
Questions
If my belief that functions are always evaluated from left-to-right is wrong, what does the table referring to function precedence and associativity really mean?
Who defines operator precedence and associativity if it's not ANSI? If it is ANSI who makes the definition, why is little mentioned about operator precedence and associativity? Is operator precedence and associativity inferred from the ANSI C standard or is it defined in Mathematics?
Operator precedence is defined in the appropriate standard. The standards for C and C++ are the One True Definition of what exactly C and C++ are. So if you look closely, the details are there. In fact, the details are in the grammar of the language. For example, take a look at the grammar production rule for + and - in C++ (collectively, additive-expressions):
additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression
additive-expression - multiplicative-expression
As you can see, a multiplicative-expression is a subrule of an additive-expression. This means that if you have something like x + y * z, the y * z expression is a subexpression of x + y * z. This defines the precedence between these two operators.
We can also see that the left operand of an additive-expression expands to another additive-expression, which means that with x + y + z, x + y is a subexpression of it. This defines the associativity.
Associativity determines how adjacent uses of the same operator will be grouped. For example, + is left-to-right associative, which means that x + y + z will be grouped like so: (x + y) + z.
Don't mistake this for order of evaluation. There is absolutely no reason why the value of z could not be computed before x + y is. What matters is that it is x + y that is computed and not y + z.
For the function call operator, left-to-right associativity means that f()() (which could happen if f returned a function pointer, for example) is grouped like so: (f())() (of course, the other direction wouldn't make any sense).
Now let's consider the example you were looking at:
f1() + f2() * f3()
The * operator has higher precedence than the + operator, so the expressions are grouped like so:
f1() + (f2() * f3())
We don't even have to consider associativity here, because we don't have any of the same operator adjacent to each other.
Evaluation of the function call expressions is, however, completely unsequenced. There's no reason f3 couldn't be called first, then f1, and then f2. The only requirement in this case is that operands of an operator are evaluated before the operator is. So that would mean f2 and f3 have to be called before the * is evaluated, and the * must be evaluated and f1 must be called before the + is evaluated.
Some operators do, however, impose a sequencing on the evaluation of their operands. For example, in x || y, x is always evaluated before y. This allows for short-circuiting, where y does not need to be evaluated if x is known already to be true.
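The sequencing imposed by the built-in || can be observed with a short sketch. The names seen, lhs_operand, rhs_operand and probe are hypothetical and exist only to record which operands were evaluated.

```cpp
#include <string>

std::string seen;  // hypothetical trace of evaluated operands

bool lhs_operand(bool v) { seen += 'L'; return v; }
bool rhs_operand(bool v) { seen += 'R'; return v; }

// Returns the trace left behind by evaluating lhs || rhs once.
std::string probe(bool left_value) {
    seen.clear();
    bool result = lhs_operand(left_value) || rhs_operand(true);
    (void)result;  // only the trace matters here
    return seen;
}
```

probe(true) yields "L" because the right operand is skipped, and probe(false) yields "LR" because the left operand always runs first. This applies to the built-in operator on bool operands as used here.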
The order of evaluation was previously defined in C and C++ with the use of sequence points, and both have changed terminology to define things in terms of a sequenced before relationship. For more information, see Undefined Behaviour and Sequence Points.
The precedence of operators in the C Standard is indicated by the syntax.
(C99, 6.5p3) "The grouping of operators and operands is indicated by the syntax. 74)"
74) "The syntax specifies the precedence of operators in the evaluation of an expression"
C99 Rationale also says
"The rules of precedence are encoded into the syntactic rules for each operator."
and
"The rules of associativity are similarly encoded into the syntactic rules."
Also note that associativity has nothing to do with evaluation order. In:
f1() * f2() + f3()
function calls are evaluated in any order. The C syntactic rules say that f1() * f2() + f3() means (f1() * f2()) + f3() but the evaluation order of the operands in the expression is unspecified.
One way to think about precedence and associativity is to imagine that the language only allows statements containing an assignment and one operator, rather than multiple operators. So a statement like:
a = f1() * f2() + f3();
would not be allowed, since it has 5 operators: 3 function calls, multiplication, and addition. In this simplified language, you would have to assign everything to temporaries and then combine them:
temp1 = f1();
temp2 = f2();
temp3 = temp1 * temp2;
temp4 = f3();
a = temp3 + temp4;
Associativity and precedence specify that the last two statements must be performed in that order, since multiplication has higher precedence than addition. But it doesn't specify the relative order of the first 3 statements; it would be just as valid to do:
temp4 = f3();
temp2 = f2();
temp1 = f1();
temp3 = temp1 * temp2;
a = temp3 + temp4;
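Both orderings above can be written out as real code to confirm that they compute the same value while calling the functions in different orders. In this sketch the return values 2, 3 and 4 and the calls log are assumptions made for illustration, as are the names order_a and order_b.

```cpp
#include <string>

std::string calls;  // hypothetical log of the call order

int f1() { calls += '1'; return 2; }
int f2() { calls += '2'; return 3; }
int f3() { calls += '3'; return 4; }

// First ordering from the text: f1, f2, multiply, f3, add.
int order_a() {
    calls.clear();
    int temp1 = f1();
    int temp2 = f2();
    int temp3 = temp1 * temp2;
    int temp4 = f3();
    return temp3 + temp4;
}

// Second ordering from the text: f3, f2, f1, multiply, add.
int order_b() {
    calls.clear();
    int temp4 = f3();
    int temp2 = f2();
    int temp1 = f1();
    int temp3 = temp1 * temp2;
    return temp3 + temp4;
}
```

Both functions return 10, since (2 * 3) + 4 is fixed by precedence. Only the contents of calls differ, "123" versus "321".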
sftrabbit gave an example where associativity of function call operators is relevant:
a = f()();
When simplifying it as above, this becomes:
temp = f();
a = temp();
Precedence and associativity are defined in the standard, and they decide how to build the syntax tree. Precedence works by operator type (1+2*3 is 1+(2*3) and not (1+2)*3) and associativity works by operator position (1+2+3 is (1+2)+3 and not 1+(2+3)).
Order of evaluation is different - it does not define how to build the syntax tree - it defines how to evaluate the nodes of operators in the syntax tree. Order of evaluation is defined not to be defined - you can never rely on it because compilers are free to choose any order they see fit. This is done so compilers could try to optimize the code. The idea is that programmers write code that shouldn't be affected by order of evaluation, and yield the same results no matter the order.
Left-to-right associativity means that f() - g() - h() means (f() - g()) - h(), nothing more. Suppose f returns 1. Suppose g returns 2. Suppose h returns 3. Left-to-right associativity means the result is (1 - 2) - 3, or -4: a compiler is still permitted to first call g and h, that has nothing to do with associativity, but it is not allowed to give a result of 1 - (2 - 3), which would be something completely different.
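The separation between grouping and call order described above can be checked directly. In this sketch the order string and the grouped helper are hypothetical additions; f, g and h return 1, 2 and 3 as in the text.

```cpp
#include <string>

// Grouping is fixed by the grammar and can be verified at compile time.
static_assert(1 + 2 * 3 == 7, "precedence: * binds tighter than +");
static_assert(1 - 2 - 3 == -4, "associativity: (1 - 2) - 3");
static_assert((1 - 2) - 3 != 1 - (2 - 3), "the other grouping would give 2");

std::string order;  // hypothetical log of the call order

int f() { order += 'f'; return 1; }
int g() { order += 'g'; return 2; }
int h() { order += 'h'; return 3; }

// The result is always -4; the contents of order may be any
// permutation of "fgh".
int grouped() {
    order.clear();
    return f() - g() - h();
}
```

The value is guaranteed, the call order is not, which is why no particular permutation is stated here.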
I only know that i = i++; is undefined behavior. But what if two or more calls to the same function appear in one expression? Is that undefined? For example:
int func(int a)
{
std::cout << a << std::endl;
return 0;
}
int main()
{
std::cout << func(0) + func(1) << std::endl;
return 0;
}
The behavior of the expression func(0) + func(1) is defined in that the result will be the sum of the results obtained by calling func with a parameter of 0 and func with a parameter of 1.
However, the order in which the functions are called is unspecified, so it may differ between implementations. That is, the compiler could generate code equivalent to:
int a = func(0);
int b = func(1);
int result = a + b;
Or it could generate:
int a = func(1);
int b = func(0);
int result = a + b;
This normally won't be a problem unless func has side effects that depend on the order of calls.
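A case where the order of calls does change the result is sketched below. The counter variable and the functions next_id, unspecified_difference and ordered_difference are hypothetical names; they are not part of the question.

```cpp
int counter = 0;

// Side effect: every call returns a different value.
int next_id() { return ++counter; }

// Both calls share one expression, so the result is either
// 1 - 2 == -1 or 2 - 1 == 1, depending on the order chosen.
int unspecified_difference() {
    counter = 0;
    return next_id() - next_id();
}

// Separate statements fix the order: a is 1, b is 2.
int ordered_difference() {
    counter = 0;
    int a = next_id();
    int b = next_id();
    return a - b;  // always -1
}
```

The first function is still well defined in the sense that it returns one of two valid values. Code that needs a specific one should use the second form.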
std::cout << func(0) + func(1) << std::endl;
Whether the function call func(0) or func(1) executes first, is implementation dependent. After that, there is a sequence point, and func(0) + func(1) is output.
But by definition, it's not called undefined behavior.
The behavior of this program is not undefined, but it is unspecified. If we look at the draft C++ standard, section 1.9 Program execution, paragraph 15 says (emphasis mine):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
and if we check section 5.7 Additive operators, which covers + and -, that section does not specify an ordering, so the operands are unsequenced.
In this case func has a side effect since it is outputting to stdout and so the order of the output is going to depend on the implementation and it even could change for subsequent evaluations.
Note that the ; ends an expression statement and section 6.2 Expression statement says:
[...]All side effects from an expression statement are completed before the next statement is executed.[...]
so although the order of the function calls is unspecified, the side effects of each statement are completed before the next.