Consider the following C++ code for a simple binary tree DFS traversal:
#include <iostream>
#include <vector>
using namespace std;
int print_vector(vector<char> *vec){
    for (auto &it: *vec)
        cout<<it;
    cout<<'\n';
    return 0;
}
int btree_dfs_traversal(int max_depth, int cur_depth, vector<char> position){
    if (cur_depth>max_depth)
        return 0;
    print_vector(&position);
    vector<char>left = position;
    vector<char>right = position;
    left.push_back('l');
    right.push_back('r');
    return btree_dfs_traversal(max_depth, cur_depth+1, left)+btree_dfs_traversal(max_depth, cur_depth+1, right);
}
int main(int argc, const char * argv[]) {
    vector<char> pos;
    btree_dfs_traversal(4, 0, pos);
    return 0;
}
The function (a minimal example) visits a binary tree and prints the "position" of each node it visits. The only difference from a standard DFS, and the part that matters here, is that most implementations visit the two children in a loop, while my return statement returns the sum of the two recursive calls.
I expected the program to recurse into the left call first, i.e. the output starts with l, ll, lll, ... And indeed on my system (OS X) it does, and ideone produces this output too.
However, on some friends' systems the output is different: the recursion starts with the right call, i.e. r, rr, ... Unfortunately, I do not have exact information about their compilers at present.
My question is: is the sum of two recursive calls undefined behavior, such that different compilers can produce different results? Or is it simply wrong to start from the right?
The "problem" is that in f(1) + f(2) it is unspecified which function call comes first. Different compilers may pick different orders, and the order can depend on any number of unrelated factors. From the C++ reference:
Order of evaluation
Order of evaluation of the operands of almost all C++ operators (including the order of evaluation of function arguments in a function-call expression and the order of evaluation of the subexpressions within any expression) is unspecified. The compiler can evaluate operands in any order, and may choose another order when the same expression is evaluated again.
There are exceptions to this rule which are noted below.
Except where noted below, there is no concept of left-to-right or right-to-left evaluation in C++. This is not to be confused with left-to-right and right-to-left associativity of operators: the expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to left-to-right associativity of operator+, but the function call to f3 may be evaluated first, last, or between f1() or f2() at run time.
See http://en.cppreference.com/w/cpp/language/eval_order for exceptions and more info.
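If the left-first order matters, one way to pin it down is to give each recursive call its own statement, since one full statement is completed before the next begins. The following is only a sketch of that idea: the name visit_in_order, the visited log, and the choice to return a node count are assumptions made for illustration and are not part of the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of btree_dfs_traversal: it records each position in
// `visited` instead of printing it, and returns the number of nodes visited.
int visit_in_order(int max_depth, int cur_depth, const std::string& position,
                   std::vector<std::string>& visited) {
    if (cur_depth > max_depth)
        return 0;
    visited.push_back(position);
    // Two separate statements: the left call has finished completely
    // before the right call starts, on every conforming compiler.
    int left = visit_in_order(max_depth, cur_depth + 1, position + 'l', visited);
    int right = visit_in_order(max_depth, cur_depth + 1, position + 'r', visited);
    return 1 + left + right;
}
```

With max_depth = 2 and an empty starting position, the log comes out as the root, then l, ll, lr, r, rl, rr, regardless of the compiler.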
I am a C++ beginner.
Recently, I read a paragraph describing the evaluation of an expression.
The original text is as below:
"...Evaluation of an expression may generate side-effects, e.g. std::printf("%d", 4) prints the character '4' on the standard output...."
My question is: why is the character '4' printed by std::printf("%d", 4) a side effect?
Can anyone give me a more comprehensive explanation, or more examples, of side effects produced by evaluating expressions?
Thanks!
A side effect is any change in the system that is observable to the outside world.
Printing a number is clearly a visible change (internally, it also changes the state of stdout, among other things).
Another helpful notion is that of a pure function. It has two main characteristics:
A pure function is deterministic. This means, that given the same input, the function will always return the same output. ...
A pure function will not cause side effects. A side effect is any change in the system that is observable to the outside world.
Typical examples of functions violating these properties are:
static int n=0;
int foo_1(int m) // not deterministic, without side effect
{
    return m+n;
}
int foo_2(int m) // with side effect, but deterministic
{
    ++n;
    return m;
}
int foo_3(int m) // with side effect, not deterministic
{
    ++n;
    return m+n;
}
int foo_4(int m) // without side effect + deterministic = pure function
{
    return 2*m;
}
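To see the two properties in action, here is a small sketch along the same lines. The names counter, add_counter, bump_and_return and twice are made up for this illustration; they mirror n, foo_1, foo_2 and foo_4 above.

```cpp
// Hidden global state, playing the role of `n` in the example above.
static int counter = 0;

// Reads the global: no side effect, but the result depends on hidden state.
int add_counter(int m) { return m + counter; }

// Writes the global: the return value is deterministic, but the call
// leaves an observable change behind (a side effect).
int bump_and_return(int m) { ++counter; return m; }

// Touches nothing but its argument: a pure function.
int twice(int m) { return 2 * m; }
```

Calling add_counter(1) before and after a call to bump_and_return gives two different results for the same argument, while twice(3) gives 6 no matter what was called before it.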
I have recently taken up studying algorithms and data structures. I came across the Fibonacci problem and its solution using recursion. But that's the thing: I understand how recursive calls work when there is only one (as in the factorial of a number). The function calls keep stacking up until they hit the base case, and then they unwind one by one to the desired answer.
What I don't get is how recursion works when there are two recursive calls in one expression, like f(n)+f(n/2). Which call is resolved first, and when does the second call get resolved? Also, how is the sum calculated if this expression is assigned to a variable?
I wrote a small program to figure this out myself.
#include <iostream>
#include <string>
int main()
{
    int recursionTest(int n, char c);
    int x;
    std::cin>>x;
    std::cout<<"Ans:"<<recursionTest(x,'c');
}
int recursionTest(int n,char c)
{
    int result=0;
    if(n==0)
        return 1;
    std::cout<<n<<":"<<c<<std::endl;
    result=recursionTest(n/2,'L')+recursionTest(n/3,'R'); // I can't figure out the flow of control here!
    return result;
}
And I got the following output.
24
24:c
12:L
6:L
3:L
1:L
1:R
2:R
1:L
4:R
2:L
1:L
1:R
8:R
4:L
2:L
1:L
1:R
2:R
1:L
Ans:20
So I get it, it's a tree structure. But I still don't know how we get 20 as the answer (input = 24). How does the sum expression work and what is it summing? How can I look at the tree structure and produce the same output?
There is no defined order to how the two subexpressions of the + operator are evaluated. The compiler can emit code to evaluate either one first and the other one second. It can even interleave some computations of one side with computations of the other. For example, it could calculate n/3, then n/2, then the right function call, and finally the left function call.
The flow control is just like for a single case of recursion followed by another single case. So, your line:
result=recursionTest(n/2,'L')+recursionTest(n/3,'R');
is effectively the same as:
int left = recursionTest(n/2,'L');
int right = recursionTest(n/3,'R');
result = left + right;
except for the implication in my version that the left function call is guaranteed to be evaluated before the right function call.
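As for where the 20 comes from: every call that reaches n == 0 returns 1, and every other call returns the sum of its two children, so the final answer is simply the number of n == 0 leaves in the call tree. A minimal sketch of the same recurrence without the printing (the name count_leaves is made up here), with the calls sequenced left first:

```cpp
// Same recurrence as recursionTest, minus the output.
int count_leaves(int n) {
    if (n == 0)
        return 1;                     // each leaf of the call tree contributes 1
    int left = count_leaves(n / 2);   // finishes completely first
    int right = count_leaves(n / 3);  // only then does this call start
    return left + right;
}
```

Working upward from the leaves: f(1) = 2, f(2) = 3, f(3) = 4, f(4) = 5, f(6) = 7, f(8) = 8, f(12) = 12, and therefore f(24) = f(12) + f(8) = 20.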
Operator precedence and associativity only decide how the expression is grouped, not which operand is evaluated first.
f(n) + f(n/2);
In the above snippet many compilers will call f(n) first and then f(n/2), but the standard does not guarantee this: the operands of + may be evaluated in either order.
If you want to trace the code, put a print statement inside f(int) that prints the value of n. That shows the order your compiler actually chose.
#include <iostream>
using namespace std;
int f(int &x, int c)
{
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    return f(x, c) * x;
}
int main()
{
    int x = 5;
    cout << f(x,5);
}
In the example above the four possible answers to choose from are:
3024
6561
55440
161051
Function f(int &x, int c) is called four more times after the first call before it reaches the base case, and the final result is 6561. My guess was 3024, but I was wrong. Even though x, which is passed by reference, is incremented in each call of f(int &x, int c) and takes the values 6->7->8->9, the final result of this recursion is equal to 9^4.
So my question is: variable x is passed by reference and is equal to 9 when the base case is reached. Does that mean that all the stages of the recursion will see this value for x, even if it had a different value when they were called?
No, there are more than four answers to choose from.
The read of x on the right-hand side of the multiplication and the recursive call f(x, c), which modifies x, are not sequenced in any particular order relative to each other; the evaluation order is unspecified.
This doesn't mean that there is some particular evaluation order that just needs to be figured out. It means that the final result can:
Vary depending on the compiler.
Vary each time this program executes.
The evaluation order may also be different for each individual recursive call. Each recursive call can end up using a different evaluation order, too. "Unspecified" means "unspecified". Any possibility can happen. Each individual time.
I didn't bother to calculate all actual possibilities here. It's better to invest one's own time on something that should work properly, instead of on something that obviously can never work properly.
If you want a specific evaluation order, it's going to be either this:
int y=x;
return f(x, c) * y;
Or this:
int y=f(x, c);
return y * x;
This evaluation order is now specified.
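To make the difference concrete, here are both fixes as complete functions. This is only a sketch; the names f_read_first and f_call_first are made up for illustration.

```cpp
// Copies x BEFORE recursing, so each level multiplies by its own value.
int f_read_first(int& x, int c) {
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int y = x;                    // snapshot: 6, 7, 8, 9 at successive levels
    return f_read_first(x, c) * y;
}

// Recurses first, so every level reads x AFTER all increments are done.
int f_call_first(int& x, int c) {
    c = c - 1;
    if (c == 0) return 1;
    x = x + 1;
    int y = f_call_first(x, c);   // x is already 9 when this returns
    return y * x;
}
```

Starting from x = 5 and c = 5, the first version yields 9 * 8 * 7 * 6 = 3024 and the second yields 9^4 = 6561, which are exactly the two answers discussed in the question.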
So I know that C++ has an Operator Precedence and that
int x = ++i + i++;
is undefined because pre++ and post++ are at the same level and thus there is no way to tell which one will be calculated first. But what I was wondering is whether
int i = 1/2/3;
is undefined. The reason I ask is that there are multiple ways to read it: (1/2)/3 or 1/(2/3).
My guess is that it is undefined behavior, but I would like to confirm it.
If you look at the C++ operator precedence and associativity, you'll see that the division operator is Left-to-right associative, which means this will be evaluated as (1/2)/3, since:
Operators that are in the same cell (there may be several rows of operators listed in a cell) are evaluated with the same precedence, in the given direction. For example, the expression a=b=c is parsed as a=(b=c), and not as (a=b)=c because of right-to-left associativity.
In your example the compiler is free to evaluate "1" "2" and "3" in any order it likes, and then apply the divisions left to right.
It's the same for the ++i + i++ example. The compiler can evaluate the two increments in any order, and that's where the problem lies.
It's not that the operators' precedence isn't defined; it's that the order of evaluation of their operands isn't.
The first code snippet is undefined behaviour because variable i is modified more than once between sequence points.
The second code snippet is defined behaviour and is equivalent to:
int i = (1 / 2) / 3;
as operator / has left-to-right associativity.
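A quick way to convince yourself that the grouping is fixed by associativity is to compare the unparenthesised chain against an explicitly right-grouped version. A minimal sketch, with function names made up for this illustration:

```cpp
// No parentheses: the language groups this as (a / b) / c.
int divide_chain(int a, int b, int c) { return a / b / c; }

// Explicit right grouping, for comparison only. Note that b / c can
// truncate to 0 with ints (e.g. 2 / 3), which would make this a
// division by zero, so only call it with values where b / c != 0.
int divide_grouped_right(int a, int b, int c) { return a / (b / c); }
```

For example, divide_chain(16, 4, 2) is (16 / 4) / 2 = 2, while divide_grouped_right(16, 4, 2) is 16 / (4 / 2) = 8. And divide_chain(1, 2, 3) is 0, because integer division truncates 1 / 2 to 0.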
It is defined, it goes from left to right:
#include <iostream>
using namespace std;
int main (int argc, char *argv[]) {
    int i = 16/2/2/2;
    cout<<i<<endl;
    return 0;
}
This prints 2, i.e. ((16/2)/2)/2; grouping from the right, 16/(2/(2/2)), would give 8.
If the result of 1/2/3 looks wrong, that may be because you chose int: integer division truncates, so the result is 0.
Try a double or float, which can hold fractions.
I recently got confused by the following C++ snippet:
#include <cstdio>
int lol(int *k){
    *k +=5;
    return *k;
}
int main(int argc, const char *argv[]){
    int k = 0;
    int w = k + lol(&k);
    printf("%d\n", w);
    return 0;
}
Take a look at line:
int w = k + lol(&k);
Until now I thought that this expression would be evaluated from left to right: take the current value of k (which is 0 before the call to lol) and then add it to the result of lol. But the compiler proves me wrong: the value of w is 10. Even if I switch the operands to make it
int w = lol(&k) + k;
the result would be still 10. What am I doing wrong?
Tomek
This is because the operands in an expression are not specified to be evaluated in any particular order.
The compiler is free to evaluate either operand, k or lol(&k), first. The + operator introduces no sequence point between its operands. This means that the side effects of the operands can happen in either order.
So in short, it's not specified whether the code prints 5 or 10. Both are valid outputs.
The exception to this is short-circuiting in boolean expressions, because && and || are sequence points. (see comments)
This code yields either 5 or 10, depending on the order in which the function call is evaluated relative to the left side of +.
Its behavior is not undefined because a function call is surrounded by two sequence points.
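If you need one particular result, split the expression into separate statements so that the read of k and the call to lol happen in a known order. A minimal sketch (the wrapper names sum_read_k_first and sum_call_first are made up for this illustration; lol is the function from the question):

```cpp
int lol(int* k) {
    *k += 5;
    return *k;
}

// Read k first, then call lol: 0 + 5.
int sum_read_k_first() {
    int k = 0;
    int before = k;        // k is still 0 here
    int after = lol(&k);   // k becomes 5, lol returns 5
    return before + after;
}

// Call lol first, then read k: 5 + 5.
int sum_call_first() {
    int k = 0;
    int result = lol(&k);  // k becomes 5, lol returns 5
    return k + result;
}
```

The first wrapper always returns 5 and the second always returns 10, which are the two outcomes the single-expression version is allowed to produce.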
The standard does not fix the order in which the operands of + are evaluated, so the order in your example is unspecified: the compiler may choose either order and need not document its choice.
Mysticial is right to mention sequence points. Citing the Wikipedia article (I don't have the C++ standard at hand):
A sequence point in imperative programming defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed. They are often mentioned in reference to C and C++, because the result of some expressions can depend on the order of evaluation of their subexpressions. Adding one or more sequence points is one method of ensuring a consistent result, because this restricts the possible orders of evaluation.
The article also has a list of sequence points in C++.