How can I implement tail calls in a custom virtual machine?
I know that I need to pop off the original function's local stack, then its arguments, then push on the new arguments. But if I pop off the function's local stack, how am I supposed to push on the new arguments? They've just been popped off the stack.
I take it for granted that we're discussing a traditional "stack-based" virtual machine here.
You pop off the current function's local stack, preserving the still-relevant parts in non-stack "registers" (the "relevant parts" being the arguments for the forthcoming recursive tail call). Then, once all of the function's local stack and arguments are cleaned up, you push the arguments for the recursive call. E.g., suppose the function you're optimizing is something like:
def aux(n, tot):
    if n <= 1: return tot
    return aux(n-1, tot * n)
which without optimization might produce byte-code symbolically like:
AUX: LOAD_VAR N
LOAD_CONST 1
COMPARE
JUMPIF_GT LAB
LOAD_VAR TOT
RETURN_VAL
LAB: LOAD_VAR N
LOAD_CONST 1
SUBTRACT
LOAD_VAR TOT
LOAD_VAR N
MULTIPLY
CALL_FUN2 AUX
RETURN_VAL
Here CALL_FUN2 means "call a function with two arguments". With the optimization, the final CALL_FUN2 AUX / RETURN_VAL pair could become something like:
POP_KEEP 2
POP_DISCARD 2
PUSH_KEPT 2
JUMP AUX
Of course I'm making up my symbolic bytecodes as I go along, but I hope the intent is clear: POP_DISCARD n is the normal pop that just discards the top n entries from the stack, but POP_KEEP n is a variant that keeps them "somewhere" (e.g. in an auxiliary stack not directly accessible to the application but only to the VM's own machinery -- storage with such a character is sometimes called "a register" when discussing VM implementation) and a matching PUSH_KEPT n which empties the "registers" back into the VM's normal stack.
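The three made-up opcodes can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: `MiniVM`, its two vectors and the method names are hypothetical, and a real VM would also have to deal with frame pointers and return addresses.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical VM state: the normal stack plus a VM-private "kept" area
// (the "registers" of the answer above).
struct MiniVM {
    std::vector<int> stack;  // stack visible to the running program
    std::vector<int> kept;   // auxiliary storage used only by the VM

    // POP_KEEP n: move the top n entries into the kept area, preserving order.
    // Precondition: n <= stack.size().
    void popKeep(std::size_t n) {
        kept.insert(kept.end(),
                    stack.end() - static_cast<std::ptrdiff_t>(n), stack.end());
        stack.resize(stack.size() - n);
    }

    // POP_DISCARD n: drop the top n entries. Precondition: n <= stack.size().
    void popDiscard(std::size_t n) { stack.resize(stack.size() - n); }

    // PUSH_KEPT n: move the last n kept entries back onto the normal stack.
    // Precondition: n <= kept.size().
    void pushKept(std::size_t n) {
        stack.insert(stack.end(),
                     kept.end() - static_cast<std::ptrdiff_t>(n), kept.end());
        kept.resize(kept.size() - n);
    }
};
```

With the stack holding the old arguments followed by the freshly computed ones, say `{5, 1, 4, 5}`, the sequence `popKeep(2); popDiscard(2); pushKept(2);` leaves `{4, 5}`, ready for the jump back to the start of the function.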
I think you're looking at this the wrong way. Instead of popping the old variables off the stack and then pushing the new ones, simply reassign the ones already there (carefully). This is roughly the same optimization that would happen if you rewrote the code to be the equivalent iterative algorithm.
For this code:
int fact(int x, int total=1) {
    if (x == 1)
        return total;
    return fact(x-1, total*x);
}
would be
fact:
    jmpne x, 1, fact_cont   # if x!=1 jump to multiply
    retrn total             # return total
fact_cont:                  # update variables for "recursion"
    mul total,x,total       # total=total*x
    sub x,1,x               # x=x-1
    jmp fact                # "recurse"
There's no need to pop or push anything on the stack, merely reassign.
Clearly, this can be further optimized, by putting the exit condition second, allowing us to skip a jump, resulting in fewer operations.
fact_cont:                  # update variables for "recursion"
    mul total,x,total       # total=total*x
    sub x,1,x               # x=x-1
fact:
    jmpne x, 1, fact_cont   # if x!=1 jump to multiply
    retrn total             # return total
Looking again, this "assembly" better reflects the following C++, which avoids the recursive calls entirely:
int fact(int x, int total=1)
{
    for( ; x>1; --x)
        total*=x;
    return total;
}
I am talking with reference to C++.
I know that if an int is declared as static in recursion, its value is not reinitialized in stack-recursion call and the present value is used.
But if a stack becomes empty(or a recursion computation is complete) and then the recursion is called again, will it use the same static value as initialized in first stack call??
I will explain my problem in detail.
I am trying to code level order traversal in spiral form.
       1
     /   \
    2     3
   / \   / \
  7   6 5   4
Level order traversal in spiral form will give output 1 2 3 4 5 6 7.
void LevelSpiral(node* root, int level)
{
    static int k = level%2;
    if(root==NULL)
        return;
    if(level==1)
    {
        printf("%d ",root->val);
    }
    else
    {
        if(k==0)
        {
            LevelSpiral(root->left,level-1);
            LevelSpiral(root->right,level-1);
        }
        else
        {
            LevelSpiral(root->right,level-1);
            LevelSpiral(root->left,level-1);
        }
    }
}
void LevelOrderSpiral(node* root)
{
    for(int i=1;i<=maxheight;i++)
        LevelSpiral(root,i);
}
LevelOrderSpiral function makes separate LevelSpiral-call for each i. But throughout the code it always uses k=1(which is initialized in the first LevelSpiral-call with i=1) and prints the output as 1 3 2 4 5 6 7.
Shouldn't it print 1 2 3 4 5 6 7 as the function stack is reinitialized for every i?
You need a static variable for its value to be retained between calls, or from one call to the next recursive call.
Furthermore, recursion wouldn't be the first tool I reach for in a breadth-first traversal. I would use a queue of node (safe) pointers (or reference wrappers or whatever). Put the root node in the queue, then loop until the queue is empty: remove the front element, enqueue all of its child nodes, and do what you want with the element you just removed.
Regarding your implementation, you are alternating between going to the left first and going to the right first. level always equals 1 at the row before the one you want to print, so you always traverse your printing row from right to left. You'll see bigger shufflings of the nodes when you have a deeper tree. Draw a sample tree on paper and draw the navigations on it as you follow your code by hand.
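The queue-based approach described above could look roughly like the following sketch. The `node` layout (`val`, `left`, `right`) is assumed from the question's code, `levelOrderSpiral` is a hypothetical name, and the direction rule (reverse every other level, starting with the root's level) is chosen to match the output the question asks for.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Assumed node layout, modelled on the fields used in the question.
struct node {
    int val;
    node* left;
    node* right;
};

// Breadth-first traversal with a queue; every other level is reversed
// so that the directions alternate ("spiral" order).
std::vector<int> levelOrderSpiral(const node* root) {
    std::vector<int> out;
    if (root == nullptr) return out;

    std::queue<const node*> pending;
    pending.push(root);
    bool reverseLevel = true;  // root level (one node) counts as reversed

    while (!pending.empty()) {
        const std::size_t width = pending.size();  // nodes on this level
        const std::size_t start = out.size();      // where this level begins
        for (std::size_t k = 0; k < width; ++k) {
            const node* cur = pending.front();
            pending.pop();
            out.push_back(cur->val);
            if (cur->left) pending.push(cur->left);
            if (cur->right) pending.push(cur->right);
        }
        if (reverseLevel)
            std::reverse(out.begin() + static_cast<std::ptrdiff_t>(start),
                         out.end());
        reverseLevel = !reverseLevel;
    }
    return out;
}
```

No static variable is needed here: the direction flag is an ordinary local that flips once per level.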
I know that if an int is declared as const in recursion, its value is not reinitialized in stack-recursion call and the present value is used.
No, that’s wrong. const has got nothing to do with recursion or reentrancy.
But if a stack becomes empty(or a recursion computation is complete) and then the recursion is called again, will it use the same const value as initialized in first stack call??
A const is a normal (albeit unmodifiable) variable: it is reinitialised whenever the initialisation statement is executed, i.e. on every function call. This is the same for any non-static variable.
static local variables exhibit the behaviour you are describing: their initialisation is executed only once, at the first call of that function, and, importantly, they are not reinitialised even after the call stack is “emptied”. It makes no difference whether the function is called repeatedly from outside, or recursively.
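A small sketch of the difference, using two hypothetical functions that mirror the `level%2` initialiser from the question:

```cpp
// Ordinary (here const) local: initialised again on every call.
int freshParity(int level) {
    const int k = level % 2;
    return k;
}

// static local: the initialiser runs only on the very first call;
// later calls see the value stored back then, whatever `level` is.
int stickyParity(int level) {
    static int k = level % 2;
    return k;
}
```

Calling `stickyParity(1)` first and `stickyParity(2)` afterwards gives 1 both times, whereas `freshParity(2)` gives 0.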
I'm writing a function for calculating integrals recursively, using the trapezoid rule. For some f(x) on the interval (a,b), the method is to calculate the area of the big trapezoid with side (b-a) and then compare it with the sum of small trapezoids formed after dividing the interval into n parts. If the difference is larger than some given error, the function is called again for each small trapezoid and the results summed. If the difference is smaller, it returns the arithmetic mean of the two values.
The function takes two parameters, a function pointer to the function which is to be integrated and a constant reference to an auxiliary structure, which contains information such as the interval (a,b), the amount of partitions, etc:
struct Config{
double min,max;
int partitions;
double precision;
};
The problem arises when I want to change the amount of partitions with each iteration, for the moment let's say just increment by one. I see no way of doing this without resorting to calling the current depth of the recurrence:
double integrate(const Config &conf, funptr f){
    double a=conf.min,b=conf.max;
    int n=conf.partitions;
    //calculating the trapezoid areas here
    if(std::abs(bigTrapezoid-sumOfSmallTrapezoids) > conf.precision){
        double s=0.;
        Config *configs = new Config[n];
        int newpartitions = n+(calls);
        for(int i=0; i < n;++i){
            // pass precision along; a member left out of the braces would be zero-initialised
            configs[i]={ a+i*(b-a)/n , a+(i+1)*(b-a)/n , newpartitions, conf.precision};
            s+=integrate(configs[i],f);
        }
        delete [] configs;
        return s;
    }
    else{
        return 0.5*(bigTrapezoid+sumOfSmallTrapezoids);
    }
}
The part I'm missing here is of course a way to find (calls). I have tried doing something similar to this answer, but it does not work, in fact it freezes the pc until makefile kills the process. But perhaps I'm doing it wrong. I do not want to add an extra parameter to the function or an additional variable to the structure. How should I proceed?
You cannot "find" calls, but you can definitely pass it yourself, like this:
double integrate(const Config &conf, funptr f, int calls=0) {
    ...
    s+=integrate(configs[i],f, calls+1);
    ...
}
It seems to me that 'int newpartitions = n + 1;' would be enough, no? At every recursion level, the number of partitions increases by one. Say conf.partitions starts off at 1. If the routine needs to recurse down a new level, newpartitions is 2, and you will build 2 new Config instances each with '2' as the value for partitions. Recursing down another level, newpartitions is 3, and you build 3 Configs, each with '3' as 'partitions', and so on.
The trick here is to make sure your code is robust enough to avoid infinite recursion.
By the way, it seems inefficient to me to use dynamic allocation for Config instances that have to be destroyed after the loop. Why not build a single Config instance on the stack inside the loop? Your code should run much faster that way.
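Putting the suggestions from these answers together (a defaulted `calls` parameter, `n + 1` partitions per level, one `Config` on the stack per iteration) might look like the sketch below. Several things here are assumptions and not from the question: the `funptr` signature, the `trapezoid` helper, and the `kMaxDepth` cap, which is just one way to guard against infinite recursion.

```cpp
#include <cmath>

struct Config {
    double min, max;
    int partitions;
    double precision;
};

using funptr = double (*)(double);  // assumed signature

const int kMaxDepth = 6;  // hypothetical recursion cap

// Area of a single trapezoid under f between a and b.
static double trapezoid(funptr f, double a, double b) {
    return 0.5 * (b - a) * (f(a) + f(b));
}

double integrate(const Config& conf, funptr f, int calls = 0) {
    const double a = conf.min, b = conf.max;
    const int n = conf.partitions;
    const double h = (b - a) / n;

    const double bigTrapezoid = trapezoid(f, a, b);
    double sumOfSmallTrapezoids = 0.0;
    for (int i = 0; i < n; ++i)
        sumOfSmallTrapezoids += trapezoid(f, a + i * h, a + (i + 1) * h);

    // Accept when the two estimates agree, or when the depth cap is reached.
    if (std::abs(bigTrapezoid - sumOfSmallTrapezoids) <= conf.precision ||
        calls >= kMaxDepth)
        return 0.5 * (bigTrapezoid + sumOfSmallTrapezoids);

    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        // One Config on the stack per sub-interval; no new/delete needed.
        const Config sub{a + i * h, a + (i + 1) * h, n + 1, conf.precision};
        s += integrate(sub, f, calls + 1);
    }
    return s;
}
```

Note that all four members of `sub` are given explicitly, so `precision` is carried into every sub-interval unchanged.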
I am generalizing another problem I have that has a similar recursive call. In my case, the variables being used are strings, so I can't simply pass by value to avoid the code before and after the recursive call in the loop. Is there a way to turn this into an iterative loop? Please assume that the code before and after the recursive call in the loop cannot be changed to make this specific instance work.
This code tests whether the sum of any combination of ints from nums adds up to zero. The original value for index is 0, and max is the maximum number of numbers I want to add up in any given solution.
For further clarification, the numbers can be repeated, so I can't just try all possible combinations, because there are infinitely many.
void findSolution(const vector<int>& nums, vector<int>& my_list, int& mySum,
int index, const int max)
{
if(mySum == 0) {
/* print my_list and exit(0) */
}
if(index < max) {
for(int i = 0; i < nums.size(); ++i) {
my_list.push_back(nums[i]);
mySum += nums[i];
findSolution(nums, my_list, mySum, index+1, max);
mySum -= nums[i];
my_list.pop_back();
}
}
}
Maintain a manual stack.
In your case the only independent state of each recursive call is the value of i and the value of index. index is just the recursive call depth.
Create a std::vector<int> to use as your stack and reserve max elements. Replace the for loop with a while (true) loop.
Before entering the loop, push zero on the stack.
Break the code in your loop into three parts: A, B and C. A is the code before the recursive call, B is the recursive call itself, and C is the code after it. C also includes the loop increment (++i), which in the original code runs right after C.
The first thing you do in the loop is check whether the top of the stack is nums.size() or bigger. If so, pop the stack, then execute C and continue, unless the stack is now empty, in which case break.
Then execute A. Then push 0 on the stack and continue if the stack size is less than max. Then execute C. You can remove the C code duplication with a flag variable.
Remember that the top of the stack replaces any references to i. Basically we are replacing a recursive call, which automatically makes a stack for us, with manually maintaining the same stack, but only storing the absolute least amount we can get away with. The start of the loop does double duty as both the end of a recursive call and the start of the loop, so we can do away with goto.
Give it a try, and worry about the explanation after you have seen it. The stuff about the flag variable makes more sense after the code is in place.
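For reference, the steps above could be sketched like this. It is only one possible reading of the recipe: the function name and the bool return value (instead of print-and-exit) are assumptions, the zero check is done right after part A (which is where the recursive call would perform it on entry), and the C code is simply duplicated instead of using a flag variable.

```cpp
#include <cstddef>
#include <vector>

// Returns true (leaving the solution in my_list) if some sequence of at most
// max_count values from nums, repeats allowed, sums to zero.
bool findSolutionIterative(const std::vector<int>& nums,
                           std::vector<int>& my_list,
                           std::size_t max_count) {
    if (max_count == 0) return false;

    int mySum = 0;
    std::vector<std::size_t> stack;  // stack[d] is the loop index i at depth d
    stack.reserve(max_count);
    stack.push_back(0);

    while (true) {
        if (stack.back() >= nums.size()) {
            // This level's loop is done: "return" from the recursive call.
            stack.pop_back();
            if (stack.empty()) break;
            // Part C for the caller, followed by the caller's ++i.
            mySum -= my_list.back();
            my_list.pop_back();
            ++stack.back();
            continue;
        }

        // Part A.
        const int value = nums[stack.back()];
        my_list.push_back(value);
        mySum += value;

        // Part B: what the recursive call does on entry.
        if (mySum == 0) return true;
        if (stack.size() < max_count) {
            stack.push_back(0);  // descend; the callee's loop starts at i = 0
            continue;
        }

        // The callee would not loop (index == max), so run part C right away.
        mySum -= value;
        my_list.pop_back();
        ++stack.back();
    }
    return false;
}
```

The only state kept per level is the loop index, exactly as described: the depth is the stack size, and `my_list` and `mySum` are shared and undone in part C.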
Hey Guys. I need help understanding my hw assignment. I am starting out in C++ and don't know that much. I do know the basics of a stack and the Fibonacci sequence. However, I do not exactly understand the problem given to me. I don't need the code that solves the problem, just help clarifying some steps. Here's the hw:
"By completing this project you will become familiar with using recursion and creating ADTs in C++.
Create an integer stack ADT (you may Modify the IntStack ADT given to you in the lecture notes) such that it has a maximum capacity of at least 256 elements. Also add whatever is needed such that it will print out its contents (left-to-right, with the top of the stack on the right) if it is printed to a C++ ostream - such as cout). This stack should be designed such that it will only hold meaningful values greater than zero. Values less than or equal to zero should be printed out as a '?'.
Write a recursive implementation of the Fibonacci sequence discussed in class. Also - create an instance of your stack ADT that persists between calls (it can't be a local variable), and at each step, push a non-meaningful value into it until the value at that stage is determined, then pop it off, and push in the determined value and print the entire stack before returning.
Your program should ask for the position N in the Fibonacci sequence to be determined and then it should output the result of the function call. Example output (including the output from your recursive function) follows:
Enter the position in the Fibonacci sequence to determine: 5
?-?-?-1
?-?-?-1
?-?-2
?-?-1
?-3
?-?-1
?-?-1
?-2
5
Fibonacci(5) = 5
What exactly is the output here? Is it printing out the stack as it calculates the 5th position? also Any ideas on how to implement the Fibonacci into a stack in C++? Should these values be stored in array, list or it doesn't matter? I'm a noob so any help would be much appreciated. Thanks
Yes, it's calculating 5th fibonacci number (which happens to be 5, that's a bit confusing), but look at what you calculate when you call fibonacci(5), assuming the following code for fibonacci:
int fibonacci(int n) {
    if (n <= 1) return n;
    else if (n == 2) return 1;
    else return fibonacci(n-1) + fibonacci(n-2);
}
here are the function calls to calculate fibonacci(5):
f(5)
 -> f(4)
     -> f(3)
         -> f(2)
         -> f(1)
     -> f(2)
 -> f(3)
     -> f(2)
     -> f(1)
If you look at this as a binary tree, the output they gave you was a post-order tree traversal, with the amount of ? being the depth of that stack, and the number being the value of that node.
So just do what the function does and every time you see return, write what you return (with the ?'s before it):
The first function that returns is the first f(2), at depth 3: print ?-?-?-1
The second return is the f(1) below it, also at depth 3: print ?-?-?-1
The third return is the parent of f(2) and f(1), which has depth 2 and value f(2)+f(1)=2: print ?-?-2
And so on until you return f(5) at depth 0 and value 5
Break your entire problem into smaller parts which can be solved/implemented by themselves. In this case there are two main parts:
Stack -- Implement a standard stack class according to the details given in the question. It may help to list them all in point form.
Fibonacci -- Use the stack class to generate the Fibonacci series recursively. The stack is your storage mechanism for this exercise.
The example output ?-?-?-1 can be understood as the following stack operations:
push 0
push 0
push 0
push 1
print
I leave printing the stack to you and address the confusing part of using the stack to store marker ("the non meaningful" number) and result. Here is the partial pseudo code:
procedure fib(n)
    push a marker (say a zero) to a global stack
    if n is 1 or 2 then
        result = you_know_what
    else
        calculate fib(n-1)
        pop the stack ==> fib_n_minus_1
        calculate fib(n-2)
        pop the stack ==> fib_n_minus_2
        result = fib_n_minus_1 + fib_n_minus_2
    endif
    pop the marker off the stack and discard
    push the result into the stack
    print the stack
end fib
The key to note here is fib() does not return a value. Instead, it pushes the return value into a global stack.
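A rough C++ rendering of that pseudocode is sketched below. To keep it short, a global `std::vector<int>` stands in for the assignment's IntStack ADT (that substitution, and the names `g_stack` and `render`, are assumptions), and n is assumed to be at least 1.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

std::vector<int> g_stack;  // stand-in for the persistent IntStack instance

// Formats the stack left to right, top on the right, '?' for values <= 0.
std::string render(const std::vector<int>& s) {
    std::string line;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i != 0) line += '-';
        line += (s[i] > 0) ? std::to_string(s[i]) : "?";
    }
    return line;
}

// Leaves its result on top of g_stack instead of returning it.
void fib(int n) {
    g_stack.push_back(0);  // marker: value not yet determined
    int result;
    if (n <= 2) {
        result = 1;
    } else {
        fib(n - 1);
        const int fibNMinus1 = g_stack.back();
        g_stack.pop_back();
        fib(n - 2);
        const int fibNMinus2 = g_stack.back();
        g_stack.pop_back();
        result = fibNMinus1 + fibNMinus2;
    }
    g_stack.pop_back();          // discard the marker
    g_stack.push_back(result);   // push the determined value
    std::cout << render(g_stack) << '\n';
}
```

Tracing `fib(5)` by hand with this sketch reproduces the sample lines from the assignment, from `?-?-?-1` down to the final `5`.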
Can someone show me a simple tail-recursive function in C++?
Why is tail recursion better, if it even is?
What other kinds of recursion are there besides tail recursion?
A simple tail recursive function:
unsigned int f( unsigned int a ) {
    if ( a == 0 ) {
        return a;
    }
    return f( a - 1 ); // tail recursion
}
Tail recursion is basically when:
there is only a single recursive call
that call is the last statement in the function
And it's not "better", except in the sense that a good compiler can remove the recursion, transforming it into a loop. This may be faster and will certainly save on stack usage. The GCC compiler can do this optimisation.
Tail recursion in C++ looks the same as in C or any other language.
void countdown( int count ) {
    if ( count ) return countdown( count - 1 );
}
Tail recursion (and tail calling in general) requires clearing the caller's stack frame before executing the tail call. To the programmer, tail recursion is similar to a loop, with return reduced to working like goto first_line;. The compiler needs to detect what you are doing, though, and if it doesn't, there will still be an additional stack frame. Most compilers support it, but writing a loop or goto is usually easier and less risky.
Non-recursive tail calls can enable arbitrary branching (like a goto to the first line of some other function), which is a more distinctive facility.
Note that in C++, there cannot be any object with a nontrivial destructor in the scope of the return statement. The end-of-function cleanup would require the callee to return back to the caller, eliminating the tail call.
Also note (in any language) that tail recursion requires the entire state of the algorithm to be passed through the function argument list at each step. (This is clear from the requirement that the function's stack frame be eliminated before the next call begins… you can't be saving any data in local variables.) Furthermore, no operation can be applied to the function's return value before it's tail-returned.
int factorial( int n, int acc = 1 ) {
    if ( n == 0 ) return acc;
    else return factorial( n-1, acc * n );
}
Tail recursion is a special case of a tail call. A tail call is where the compiler can see that there are no operations that need to be done upon return from a called function -- essentially turning the called function's return into its own. The compiler can often do a few stack fix-up operations and then jump (rather than call) to the address of the first instruction of the called function.
One of the great things about this besides eliminating some return calls is that you also cut down on stack usage. On some platforms or in OS code the stack can be quite limited and on advanced machines like the x86 CPUs in our desktops decreasing the stack usage like this will improve data cache performance.
Tail recursion is where the called function is the same as the calling function. This can be turned into loops, which is exactly the same as the jump in the tail call optimization mentioned above. Since this is the same function (callee and caller) there are fewer stack fixups that need to be done before the jump.
The following shows a common way to do a recursive call which would be more difficult for a compiler to turn into a loop:
int sum(int a[], unsigned len) {
    if (len==0) {
        return 0;
    }
    return a[0] + sum(a+1,len-1);
}
This is simple enough that many compilers could probably figure it out anyway, but as you can see, an addition has to happen after the called sum returns its number, so a simple tail call optimization is not possible.
If you did:
static int sum_helper(int acc, unsigned len, int a[]) {
    if (len == 0) {
        return acc;
    }
    return sum_helper(acc+a[0], len-1, a+1);
}
int sum(int a[], unsigned len) {
    return sum_helper(0, len, a);
}
You would be able to take advantage of the calls in both functions being tail calls. Here the sum function's main job is to move a value and clear a register or stack position. The sum_helper does all of the math.
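To make the "tail recursion becomes a loop" point concrete, here is roughly what sum_helper amounts to once the tail call is replaced by reassigning the parameters and jumping back to the top. This is a hand-written sketch (`sum_loop` is a hypothetical name), not the output of any particular compiler.

```cpp
// Loop equivalent of sum_helper: each iteration plays the role of one
// tail call, updating acc, len and a in place instead of pushing a frame.
int sum_loop(const int a[], unsigned len) {
    int acc = 0;
    while (len != 0) {
        acc += a[0];  // acc + a[0]
        ++a;          // a + 1
        --len;        // len - 1
    }
    return acc;
}
```

For an array holding 1, 2, 3, 4 this returns 10 without the stack ever growing.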
Since you mentioned C++ in your question I'll mention some special things about that.
C++ hides some things from you which C does not. Of these, destructors are the main thing that will get in the way of tail call optimization.
int boo(yin * x, yang *y) {
    dharma z = x->foo() + y->bar();
    return z.baz();
}
In this example the call to baz is not really a tail call because z needs to be destructed after the return from baz. I believe that the rules of C++ may make the optimization more difficult even in cases where the variable is not needed for the duration of the call, such as:
int boo(yin * x, yang *y) {
    dharma z = x->foo() + y->bar();
    int u = z.baz();
    return qwerty(u);
}
z may have to be destructed after the return from qwerty here.
Another thing would be implicit type conversion, which can happen in C as well, but can be more complicated and more common in C++.
For instance:
static double sum_helper(double acc, unsigned len, double a[]) {
    if (len == 0) {
        return acc;
    }
    return sum_helper(acc+a[0], len-1, a+1);
}
int sum(double a[], unsigned len) {
    return sum_helper(0.0, len, a);
}
Here sum's call to sum_helper is not a tail call because sum_helper returns a double and sum will need to convert that into an int.
In C++ it is quite common to return an object reference which may have all kinds of different interpretations, each of which could be a different type conversion.
For instance:
bool write_it(int it) {
    return static_cast<bool>(cout << it);
}
Here the last statement calls cout.operator<<, which returns a reference to cout itself (which is why you can chain lots of things together separated by <<). You then force that reference to be evaluated as a bool, which ends up calling another of cout's methods, operator bool() (explicit since C++11, hence the cast). This cout.operator bool() could be called as a tail call in this case, but operator<< could not.
EDIT:
One thing worth mentioning is that a major reason tail call optimization is possible in C is that the compiler knows the called function will store its return value in the same place where the calling function would have to store its own return value.
Tail recursion is a trick to actually cope with two issues at the same time. The first is executing a loop when it is hard to know the number of iterations to do.
Though this can be worked out with simple recursion, the second problem arises which is that of stack overflow due to the recursive call being executed too many times. The tail call is the solution, when accompanied by a "compute and carry" technique.
In basic CS you learn that a computer algorithm needs to have an invariant and a termination condition. This is the base for building the tail recursion.
All computation happens in the argument passing.
All results must be passed onto function calls.
The tail call is the last call, and occurs at termination.
Simply put, no computation must happen on the return value of your function.
Take for example the computation of a power of 10, which is trivial and could equally be written as a loop.
In tail-recursive form it should look something like
template<typename T> T pow10(T const p, T const res = 1)
{
    return p ? pow10<T>(p - 1, 10 * res) : res;
}
For example, pow10(4) executes as follows:
ret,p,res
-,4,1
-,3,10
-,2,100
-,1,1000
-,0,10000
10000,-,-
It is clear that the compiler just has to copy values without changing the stack pointer and when the tail call happens just to return the result.
Tail recursion is also important because the same pattern carries over to compile-time evaluation. For example, the above can be rewritten as:
template<int N,int R=1> struct powc10
{
    int operator()() const
    {
        return powc10<N-1, 10*R>()();
    }
};
template<int R> struct powc10<0,R>
{
    int operator()() const
    {
        return R;
    }
};
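This can be used as powc10<9>()() to get the 9th power of 10, with the value itself worked out at compile time as the template argument R. (Larger exponents overflow int.)
Most compilers limit the depth of nested instantiations, so the tail call trick helps. There are no loops in template metaprogramming, so you have to use recursion.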
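Since C++11 the same tail-recursive shape can also be written as a constexpr function, which is one more way to get the value computed during compilation. The sketch below uses a hypothetical name, `pow10_tail`, and `long long` to leave more headroom than `int`.

```cpp
// Tail-recursive power of 10, usable in constant expressions (C++11).
constexpr long long pow10_tail(unsigned p, long long res = 1) {
    return p ? pow10_tail(p - 1, 10 * res) : res;
}

// Evaluated by the compiler; a wrong value would be a compile error.
static_assert(pow10_tail(4) == 10000, "10^4 should be 10000");
```

The accumulator `res` plays the same role as the template parameter R above.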
Tail recursion does not exist really at compiler level in C++.
Although you can write programs that use tail recursion, you do not get the inherent benefits of tail recursion implemented by supporting compilers/interpreters/languages. For instance, Scheme supports a tail recursion optimization so that it basically will change recursion into iteration. This makes it faster and invulnerable to stack overflows. C++ does not have such a thing (at least not in any compiler I've seen).
Apparently tail recursion optimizations exist in both MSVC++ and GCC. See this question for details.
Wikipedia has a decent article on tail recursion. Basically, tail recursion is better than regular recursion because it's trivial to optimize it into an iterative loop, and iterative loops are generally more efficient than recursive function calls. This is particularly important in functional languages where you don't have loops.
For C++, it's still good if you can write your recursive loops with tail recursion since they can be better optimized, but in such cases, you can generally just do it iteratively in the first place, so the gain is not as great as it would be in a functional language.