I'm trying to find the height of a Binary Search Tree in my program, and keep coming upon this recursive solution to find the height:
int maxHeight(BinaryTree *p) {
if (!p) return 0;
int left_height = maxHeight(p->left);
int right_height = maxHeight(p->right);
return (left_height > right_height) ? left_height + 1 : right_height + 1;
}
Can someone explain to me how this works? I don't understand how it adds up the height. It looks like it should just go through each side of the tree and return 0.
The algorithm works this way:
If the tree I am looking at does not exist, then the height of the tree is 0.
Otherwise, the height of the tree is the maximum height of its two sub-trees plus 1 (the plus 1 accounts for the node I am currently looking at).
E.g. if I have a tree with no branches (i.e. a stump), then it has a height of one, because it has two subtrees of height 0, and the max of these heights plus 1 is 1.
another example:
If I have a tree:
A - B - C - D
    |   |
    E   F
(where A is the root)
then, height is not 0, as A is not null
height = max(height(left), height(right)) + 1.
height of left at A is 0, because A has no left branch.
height of right branch is the height of B.
to work out the height of B, we consider B as a completely new tree:
B - C - D
|   |
E   F
now
height = max(height(left), height(right)) + 1.
to work out the height of left, we consider E as a completely new tree:
E
this exists so its height is not 0
however, its two branches do not exist, so its height is 1 (each branch has height of 0)
back to the parent tree again:
B - C - D
|   |
E   F
we were working out height, and found out the height of the left branch is 1.
so height = max(1, height(right) ) + 1
so, what is the height of right?
once again, we consider the right branch as its own tree:
C - D
|
F
the problem is the same as before
height = max(height(left), height(right)) + 1
to work out height(left), we consider F by itself
F
F has height of 1, because it has two null branches (i.e. max of two 0 heights plus 1)
now looking at right
D
D has height of 1 for same reason
back to parent of F and D:
C - D
|
F
height of C is:
max(height(F), height(D)) + 1
= max(1, 1) + 1
= 1 + 1
= 2.
So now we know height of C, we can go back to the parent:
B - C - D
|   |
E   F
recall, we worked out the height of B's left branch as 1, and then started working out its right branch height.
We now know that the right branch has a height of 2
Max(1, 2) is 2.
2 + 1 = 3
Therefore, height of B is 3.
Now we know this, we are finally back to our original tree:
A - B - C - D
    |   |
    E   F
we already worked out the left branch's height as 0, and then started working on the right branch.
we now know the right branch has a height of 3.
Therefore,
height(A) = Max(height(null), height(B)) + 1
= Max(0, 3) + 1
= 3 + 1
= 4
done. The height of A is 4.
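The walk-through above can be checked with a small sketch. The BinaryTree struct here is a hypothetical minimal stand-in for the question's type (only left and right pointers are assumed), and exampleHeight is a helper name introduced just for this illustration.

```cpp
#include <algorithm>

// Hypothetical minimal node type standing in for the question's BinaryTree.
struct BinaryTree {
    BinaryTree* left = nullptr;
    BinaryTree* right = nullptr;
};

// Same logic as the question's maxHeight: an empty tree has height 0,
// otherwise take the taller subtree and add 1 for the current node.
int maxHeight(const BinaryTree* p) {
    if (!p) return 0;
    return std::max(maxHeight(p->left), maxHeight(p->right)) + 1;
}

// Builds the example tree (A has right child B; B has left E and right C;
// C has left F and right D) and returns the height measured at A.
int exampleHeight() {
    BinaryTree a, b, c, d, e, f;
    a.right = &b;
    b.left = &e;
    b.right = &c;
    c.left = &f;
    c.right = &d;
    return maxHeight(&a);
}
```

Calling exampleHeight() gives 4, matching the hand calculation: E, F and D each report 1, C reports 2, B reports 3, and A reports 4.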
The function only returns 0 if the binary tree is null. That makes sense because if the tree is null, there are no nodes to count so the height is 0.
If it's not null, then the function will add 1 (the height of the current node) to the height of the left or the right child sub-tree, whichever is greater.
How does it know the height of the child sub-trees? By calling itself recursively passing in the left child or the right child so that the next recursion starts one level down in the tree.
What happens when you call the function for the first time passing in the root of the tree? The function first calls itself recursively travelling down the left-most children until it finds the leaf node. It calls itself one more time passing in the left child of the leaf node, which is null. This last call doesn't recurse any more and just returns 0. The function then recurses to the right child of the leaf which also returns 0. It then adds 1 for its own height and returns.
We are now at the parent of the left-most leaf and as before it will recurse to the right child (sibling to the leaf). That one might not exist (returns 0), be a leaf (returns 1) or have children (returns >1). Whatever the return value, it will be compared to the left-most leaf (height 1) and whichever is greater will be incremented (always adding the height of the current node) and returned as the height of the sub-tree rooted at the current node.
Notice that the recursion will keep "unrolling" as it travels back up to the root but at each level, it will first recurse further down the right child sub-tree. This is what's known as a depth first search. Eventually, the whole tree will be visited and the outstanding maximum height computed all the way back to the root.
Your code keeps both results in named local variables (left_height and right_height). They only live on the stack while each call is active, so the cost is small, but if you prefer you can return the expression directly without them, for instance
return max(maxHeight(p->left), maxHeight(p->right)) + 1;
void level_order_recursive(struct node *t , int h) //'h' is height of my binary tree
{ //'t' is address of root node
for(int i = 0 ; i <= h ; i++)
{
print_level(t , i);
}
}
After print_level() is called each time, I think the recursive function is called 2^i times. So 2^0 + 2^1 + 2^2 + ... + 2^h should give a time complexity of O(2^n). Where am I going wrong?
void print_level(struct node * t , int i)
{
if( i == 0)
cout << t -> data <<" ";
else
{
if(t -> left != NULL)
print_level(t -> left , i - 1); //recursive call
if(t -> right != NULL)
print_level(t -> right , i - 1); //recursive call
}
}
You are confusing h and n. h is the height of the tree; n is apparently the number of elements in the tree. So print_level takes worst case O(2^i), but that is also just n.
The worst case happens when you have a degenerate tree, where each node has only one successor. In that case you have n nodes, but the height of the tree is also h = n. Each call to print_level takes i steps in that case, and summing up i from 1 to h = n gives O(n^2).
You always start at the root of the tree t and increase the level by one each time (i) until you reach the height of the tree h.
You said it is a binary tree, but you did not mention any property, e.g. balanced or so. So I assume it can be an unbalanced binary tree and thus the height of the tree in worst case can be h = n where n is the number of nodes (that is a completely unbalanced tree that looks like a list actually).
So this means that level_order_recursive loops n times. I.e. the worst case is that the tree has n levels.
print_level receives the root node and the level to print. It calls itself recursively until it reaches that level and prints it out, i.e. it loops i times (each recursive call decreases i by one).
So you have 1 + 2 + 3 + ... + h iterations. And since h = n you get 1 + 2 + 3 ... + n steps. This is (n * (n+1))/2 (Gaussian sum formula) which is in O(n^2).
If you can ensure that the tree is balanced, then you would improve the worst-case scenario, because the height would be h = ld(n), where ld denotes the binary logarithm.
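The quadratic count for a degenerate tree can be checked with a sketch that mirrors print_level but counts calls instead of printing. The node layout, count_level_calls, count_all_calls and make_chain are names assumed for this illustration. Levels are numbered from 0 at the root, so a chain of n nodes has h = n - 1.

```cpp
#include <cstddef>
#include <vector>

// Assumed shape of the question's node type.
struct node {
    int data = 0;
    node* left = nullptr;
    node* right = nullptr;
};

// Same recursion as print_level, but returns the number of calls made
// (including this one) instead of printing anything.
std::size_t count_level_calls(const node* t, int i) {
    if (i == 0) return 1;
    std::size_t calls = 1;
    if (t->left != nullptr) calls += count_level_calls(t->left, i - 1);
    if (t->right != nullptr) calls += count_level_calls(t->right, i - 1);
    return calls;
}

// Total calls made by the loop in level_order_recursive for levels 0..h.
std::size_t count_all_calls(const node* root, int h) {
    std::size_t total = 0;
    for (int i = 0; i <= h; ++i) total += count_level_calls(root, i);
    return total;
}

// Builds a degenerate tree: n nodes, each with only a right child.
std::vector<node> make_chain(std::size_t n) {
    std::vector<node> nodes(n);
    for (std::size_t k = 0; k + 1 < n; ++k) nodes[k].right = &nodes[k + 1];
    return nodes;
}
```

For a chain of 4 nodes (h = 3) the total is 1 + 2 + 3 + 4 = 10 calls, i.e. n(n+1)/2, which grows as O(n^2) rather than O(2^n).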
Based on this or that, pages 3 and 4, binary search algorithm, which resembles our case, has a time complexity of T(n) = T(n/2) + c.
Except that, both left and right sub-trees are browsed, hence the 2T(n/2) in the formula below, since this is a traversal algorithm, rather than a search one.
Here, I will comply to the question and use 'h' instead of 'n'.
Using recurrence relation, you get the following proof:
In the worst case the time complexity will be O(n^2), but it cannot be O(2^n), because the per-level costs add up to O(n) + O(n-1) + O(n-2) + ... + O(1), which is at worst O(n^2).
Is there a way I can create a for loop such that, given any starting location the loop will subsequently iterate through the right-most element of each level? Given some heap:
If you were inside a make_heap function, you might start at (n-2)/2, which would be the node denoted by the red twelve.
Now, given some start location (n-2)/2, is it possible to iterate such that the subsequent values of the loop will be 6 -> 2 -> 0 (array location of the right most elements above the initial level, which is the red number minus one) which correspond to 14 -> 24 -> 25.
My initial implementation looks like
using std::size_t;
size_t n = last - first; // size of heap
for(size_t start = (n-2)/2;
start > 0;
start = (size_t)pow(2, (size_t)log2(start)-1))
{
std::cout << start << std::endl;
}
My thinking was that start is equal to 2^(log2(start)-1), which means the previous level.
However this only yields 11, 4, 2, 1 (add one for the corresponding node location in red). It in theory should be 11, 6, 2, 0. Any ideas?
Assuming your indices are 1-based, it is straightforward to compute the parent and the child nodes of a node given its index n:
you get the parent node using n / 2
you get the left child node using n * 2
you get the right child node using n * 2 + 1
Since arrays in C++ are 0-based, you may need to strategically add/subtract 1 to convert between node and array indices.
To get to the right children of the parent nodes you'd keep the index p of the current parent node, replacing it using p = p / 2 in each iteration and accessing the node at p * 2 + 1. Of course, if make_heap() is anything like std::make_heap() it doesn't need to do anything like that. It merely needs to "bubble up" the new node while it is out of heap order with respect to its parent.
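For the sequence the question asks for (11, 6, 2, 0), one possible sketch uses the fact that in a 0-based array heap, level k occupies indices 2^k - 1 through 2^(k+1) - 2. This is not the approach described in the answer above. It is an alternative built on the same index arithmetic, and rightmost_above is a hypothetical helper name. It assumes every level above the start index is completely filled, which holds for any array-backed heap.

```cpp
#include <cstddef>
#include <vector>

// Returns `start` followed by the right-most array index of every level
// above it, ending at the root (index 0). Indices are 0-based.
std::vector<std::size_t> rightmost_above(std::size_t start) {
    std::vector<std::size_t> out{start};

    // level = floor(log2(start + 1)), computed with integer shifts to
    // avoid the rounding issues of pow() and log2() on floating point.
    std::size_t level = 0;
    for (std::size_t v = start + 1; v > 1; v >>= 1) ++level;

    // The right-most index of the level just above `level` is 2^level - 2;
    // keep stepping up one level at a time until the root is reached.
    for (; level > 0; --level) {
        out.push_back((std::size_t{1} << level) - 2);
    }
    return out;
}
```

With start = 11 this yields 11, 6, 2, 0. Staying in integer arithmetic also sidesteps the truncation that made the pow/log2 loop in the question land on 4, 2, 1.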
I have these following methods to get the height of a red black tree and this works (I send the root). Now my question is, how is this working? I have drawn a tree and have tried following this step by step for each recursion call but I can't pull it off.
I know the general idea of what the code is doing, which is going through all the leaves and comparing them but can anyone give a clear explanation on this?
int RedBlackTree::heightHelper(Node * n) const{
if ( n == NULL ){
return -1;
}
else{
return max(heightHelper(n->left), heightHelper(n->right)) + 1;
}
}
int RedBlackTree::max(int x, int y) const{
if (x >= y){
return x;
}
else{
return y;
}
}
Well, the general algorithm to find the height of any binary tree (whether a BST,AVL tree, Red Black,etc) is as follows
For the current node:
if(node is NULL) return -1
else
h1=Height of your left child//A Recursive call
h2=Height of your right child//A Recursive call
Add 1 to max(h1,h2) to account for the current node
return this value to parent.
An illustration to the above algorithm is as follows:
(Image courtesy Wikipedia.org)
This code will return the height of any binary tree, not just a red-black tree. It works recursively.
I found this problem difficult to think about in the past, but if we imagine we have a function which returns the height of a sub-tree, we could easily use that to compute the height of a full tree. We do this by computing the height of each side, taking the max, and adding one.
The height of the tree either goes through the left or right branch, so we can take the max of those. Then we add 1 for the root.
Handle the base case of no tree (-1), and we're done.
This is a basic recursion algorithm.
Start at the base case, if the root itself is null the height of tree is -1 as the tree does not exist.
Now imagine at any node what will be the height of the tree if this node were its root?
It would be simply the maximum of the height of left subtree or the right subtree (since you are trying to find the maximum possible height, so you have to take the greater of the 2) and add a 1 to it to incorporate the node itself.
That's it, once you follow this, you're done!
As a recursive function, this computes the height of each child node, using that result to compute the height of the current node by adding + 1 to it. The height of any node is always the maximum height of the two children + 1. A single-node case is probably the easiest to understand, since it has a height of zero (0).
A
Here the call stack looks like this:
height(A) =
max(height(A->left), height(A->right)) + 1
Since both left and right are null, both return (-1), and therefore this reduces to
height(A) = max (-1, -1) + 1;
height(A) = -1 + 1;
height(A) = 0
A slightly more complicated version
      A
     / \
    B   C
   / \
  D   E
The recursive calls we care about are:
height(A) =
max(height(B), height(C)) + 1
height(B) =
max(height(D), height(E)) + 1
The single nodes D, E, and C we already know from our first example have a height of zero (they have no children). Therefore all of the above reduces to
height(A) = max( (max(0, 0) + 1), 0) + 1
height(A) = max(1, 0) + 1
height(A) = 1 + 1
height(A) = 2
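The same reduction can be reproduced in code. This is a minimal sketch using the -1 convention from the question's heightHelper. The Node struct and the height_of_example_tree helper are assumptions made for the illustration, not part of the original class.

```cpp
#include <algorithm>

// Hypothetical minimal node; the real RedBlackTree node has more fields.
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
};

// Height counted in edges: an empty tree is -1, a single node is 0.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return std::max(height(n->left), height(n->right)) + 1;
}

// Builds the five-node example (A has children B and C; B has children
// D and E) and returns the height measured at A.
int height_of_example_tree() {
    Node a, b, c, d, e;
    a.left = &b;
    a.right = &c;
    b.left = &d;
    b.right = &e;
    return height(&a);
}
```

Here height_of_example_tree() returns 2, and a lone node returns 0, in line with the two worked examples.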
I hope that makes at least a dent in the learning curve for you. Draw them out on paper with some sample trees to understand better if you still have doubts.
I am having trouble understanding this maxDepth code. Any help would be appreciated. Here is the snippet example I followed.
int maxDepth(Node *&temp)
{
if(temp == NULL)
return 0;
else
{
int lchild = maxDepth(temp->left);
int rchild = maxDepth(temp->right);
if(lchild <= rchild)
return rchild+1;
else
return lchild+1;
}
}
Basically, what I understand is that the function recursively calls itself (for the left and right cases) until it reaches the last node. Once it does, it returns 0, and then it does 0+1. Then the previous node is 1+1, and the next one is 2+1. If there is a BST with 3 left children, int lchild will return 3, and the extra +1 is the root. So my question is: where do all these +1s come from? It returns 0 at the last node, but why does it return 0+1 etc. when it goes back up the left/right child nodes? I know it does it, but why?
Consider this part (of a bigger tree):
A
\
B
Now we want to calculate the depth of this part of the tree, so we pass a pointer to A as its param.
Obviously pointer to A is not NULL, so the code has to:
call maxDepth for each of A's children (left and right branches). A->right is B, but A->left is obviously NULL (as A has no left branch)
compare these, choose the greatest value
return this chosen value + 1 (as A itself takes a level, doesn't it?)
Now we're going to look at how maxDepth(NULL) and maxDepth(B) are calculated.
The former is quite easy: the first check will make maxDepth return 0. If the other child were NULL too, both depths would be equal (0), and we have to return 0 + 1 for A itself.
But B is not empty; it has no branches, though, so (as we noticed) its depth is 1 (greatest of 0 for NULLs at both parts + 1 for B itself).
Now let's get back to A. maxDepth of its left branch (NULL) is 0, maxDepth of its right branch is 1. Maximum of these is 1, and we have to add 1 for A itself - so it's 2.
The point is the same steps are to be done when A is just a part of the bigger tree; the result of this calculation (2) will be used in the higher levels of maxDepth calls.
Depth is being calculated using the previous node + 1
All the ones come from this part of the code:
if(lchild <= rchild)
return rchild + 1;
else
return lchild + 1;
You add yourself +1 to the results obtained in the leaves of the tree. These ones keep adding up until you exit all the recursive calls of the function and get to the root node.
Remember in binary trees a node has at most 2 children (left and right)
It is a recursive algorithm, so it calls itself over and over.
If temp (the node being looked at) is null, it returns 0, as this node is nothing and should not count. That is the base case.
If the node being looked at is not null, it may have children. So it gets the max depth of the left subtree (and adds 1 for the level of the current node) and of the right subtree (and adds 1 for the level of the current node). It then compares the two and returns the greater of the two.
It dives down into the two subtrees (temp->left and temp->right) and repeats the operation until it reaches nodes without children. At that point it will call maxDepth on left and right, which will be null and return 0, and then start returning back up the chain of calls.
So if you have a chain of three nodes (say, root-left1-left2), it will get down to left2 and call maxDepth(left) and maxDepth(right). Each of those returns 0 (they are null). Then it is back at left2. It compares: both are 0, so the greater of the two is of course 0, and it returns 0+1. Then we are at left1, which repeats the process and finds that 1 is the greater of its left and right (perhaps they are the same, or it has no right child), so it returns 1+1. Now we are at root; same thing, it returns 2+1 = 3, which is the depth.
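To watch the +1s accumulate on that three-node chain, here is a sketch that records what each non-null node returns. The name field, the log parameter and trace_chain are additions for illustration only. The depth logic itself is the same as in the question.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical node with a label so the trace is readable.
struct Node {
    std::string name;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Same depth computation as the question's maxDepth, plus a log entry
// "name=value" written just before each non-null node returns.
int maxDepth(const Node* temp, std::vector<std::string>& log) {
    if (temp == nullptr) return 0;
    int lchild = maxDepth(temp->left, log);
    int rchild = maxDepth(temp->right, log);
    int depth = std::max(lchild, rchild) + 1;
    log.push_back(temp->name + "=" + std::to_string(depth));
    return depth;
}

// Builds the chain root -> left1 -> left2 and returns the trace.
std::vector<std::string> trace_chain() {
    Node left2, left1, root;
    left2.name = "left2";
    left1.name = "left1";
    root.name = "root";
    left1.left = &left2;
    root.left = &left1;
    std::vector<std::string> log;
    maxDepth(&root, log);
    return log;
}
```

The trace comes out as left2=1, left1=2, root=3: the deepest node returns first, and each caller adds its own 1 on the way back up.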
Because the depth is calculated with previous node+1
To find the maximum depth of a binary tree, keep going left and traverse the tree; basically, perform a DFS
or
We can find the depth of the binary search tree in three different recursive ways
– using instance variables to record current depth and total depth at every level
– without using instance variables in top-bottom approach
– without using instance variables in bottom-up approach
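As a rough sketch of the last two variants (no instance variables), the top-down version carries the depth reached so far as a parameter, while the bottom-up version builds the answer from the return values. The Node struct and both function names are assumptions for this example.

```cpp
#include <algorithm>

// Hypothetical minimal node type.
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
};

// Top-down: the depth accumulated so far is passed down the tree, and the
// value is only reported once a null link is reached.
int maxDepthTopDown(const Node* n, int depthSoFar = 0) {
    if (n == nullptr) return depthSoFar;
    return std::max(maxDepthTopDown(n->left, depthSoFar + 1),
                    maxDepthTopDown(n->right, depthSoFar + 1));
}

// Bottom-up: each call returns the depth of its own subtree, and the
// caller adds 1 for itself.
int maxDepthBottomUp(const Node* n) {
    if (n == nullptr) return 0;
    return 1 + std::max(maxDepthBottomUp(n->left), maxDepthBottomUp(n->right));
}
```

Both return the same value for any tree. The difference is only whether the counting happens on the way down or on the way back up.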
The code snippet can be reduced to just:
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return 0;
}
A good way of looking at this code is from the top down:
What would happen if the BST had no nodes? We would have root = NULL and the function would immediately return an expected depth of 0.
Now suppose the tree was populated with a number of nodes. Starting at the top, the if condition would be true for the root node. We then ask, what is the max depth of the LEFT SUB TREE and the RIGHT SUB TREE by passing the root of those sub trees to maxDepth. Both the LST and the RST of the root are one level deeper than the root, so we must add one to get the depth of the tree at root of the tree passed to the function.
I think this is the right answer
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return -1;
}
I am new to recursion and trying to understand this code snippet. I'm studying for an exam, and this is a "reviewer" I found from Stanford's CS Education Library (from Binary Trees by Nick Parlante).
I understand the concept, but when we're recursing INSIDE THE LOOP, it all blows! Please help me. Thank you.
countTrees() Solution (C/C++)
/*
For the key values 1...numKeys, how many structurally unique
binary search trees are possible that store those keys.
Strategy: consider that each value could be the root.
Recursively find the size of the left and right subtrees.
*/
int countTrees(int numKeys) {
if (numKeys <=1) {
return(1);
}
// there will be one value at the root, with whatever remains
// on the left and right each forming their own subtrees.
// Iterate through all the values that could be the root...
int sum = 0;
int left, right, root;
for (root=1; root<=numKeys; root++) {
left = countTrees(root - 1);
right = countTrees(numKeys - root);
// number of possible trees with this root == left*right
sum += left*right;
}
return(sum);
}
Imagine the loop being put "on pause" while you go in to the function call.
Just because the function happens to be a recursive call, it works the same as any function you call within a loop.
The new recursive call starts its for loop and again, pauses while calling the functions again, and so on.
For recursion, it's helpful to picture the call stack structure in your mind.
If a recursion sits inside a loop, the structure resembles (almost) an N-ary tree.
The loop controls horizontally how many branches are generated, while the recursion decides the height of the tree.
The tree is generated along one specific branch until it reaches a leaf (the base condition), then expands horizontally to obtain other leaves, returns to the previous height, and repeats.
I find this perspective generally a good way of thinking.
Look at it this way: There are 3 possible cases for the initial call:
numKeys = 0
numKeys = 1
numKeys > 1
The 0 and 1 cases are simple - the function simply returns 1 and you're done. For numKeys = 2, you end up with:
sum = 0
loop(root = 1 -> 2)
root = 1:
left = countTrees(1 - 1) -> countTrees(0) -> 1
right = countTrees(2 - 1) -> countTrees(1) -> 1
sum = sum + 1*1 = 0 + 1 = 1
root = 2:
left = countTrees(2 - 1) -> countTrees(1) -> 1
right = countTrees(2 - 2) -> countTrees(0) -> 1
sum = sum + 1*1 = 1 + 1 = 2
output: 2
for numKeys = 3:
sum = 0
loop(root = 1 -> 3):
root = 1:
left = countTrees(1 - 1) -> countTrees(0) -> 1
right = countTrees(3 - 1) -> countTrees(2) -> 2
sum = sum + 1*2 = 0 + 2 = 2
root = 2:
left = countTrees(2 - 1) -> countTrees(1) -> 1
right = countTrees(3 - 2) -> countTrees(1) -> 1
sum = sum + 1*1 = 2 + 1 = 3
root = 3:
left = countTrees(3 - 1) -> countTrees(2) -> 2
right = countTrees(3 - 3) -> countTrees(0) -> 1
sum = sum + 2*1 = 3 + 2 = 5
output 5
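and so on. This function is exponential rather than O(n^2): a call with n > 1 keys makes 2*n direct recursive calls (two per loop iteration), each of which recurses in turn, so the total number of calls triples with every extra key (5, 15, 45, ... calls for n = 2, 3, 4, ...) and the runtime grows very quickly.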
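To see how quickly the number of recursive calls grows, here is an instrumented sketch of countTrees. The extra calls parameter and the callsFor helper are additions for this illustration. The counting logic is otherwise the same as the original.

```cpp
// countTrees with a call counter threaded through by reference.
int countTrees(int numKeys, long long& calls) {
    ++calls;  // count this invocation
    if (numKeys <= 1) {
        return 1;
    }
    int sum = 0;
    for (int root = 1; root <= numKeys; ++root) {
        int left = countTrees(root - 1, calls);
        int right = countTrees(numKeys - root, calls);
        sum += left * right;
    }
    return sum;
}

// Returns how many times countTrees runs in total for a given numKeys.
long long callsFor(int numKeys) {
    long long calls = 0;
    countTrees(numKeys, calls);
    return calls;
}
```

callsFor gives 1, 5, 15 and 45 for numKeys = 1, 2, 3 and 4. From numKeys = 2 onward each extra key triples the count, which is why caching results (dynamic programming) pays off here.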
Just remember that all the local variables, such as numKeys, sum, left, right, root, are in stack memory. When you go to the n-th depth of the recursive function, there will be n copies of these local variables. When one depth finishes executing, one copy of these variables is popped off the stack.
In this way, you will understand that, the next-level depth will NOT affect the current-level depth local variables (UNLESS you are using references, but we are NOT in this particular problem).
For this particular problem, time-complexity should be carefully paid attention to. Here are my solutions:
/* Q: For the key values 1...n, how many structurally unique binary search
trees (BST) are possible that store those keys.
Strategy: consider that each value could be the root. Recursively
find the size of the left and right subtrees.
http://stackoverflow.com/questions/4795527/
how-recursion-works-inside-a-for-loop */
/* A: It seems that it's the Catalan numbers:
http://en.wikipedia.org/wiki/Catalan_number */
#include <iostream>
#include <vector>
using namespace std;
// Time Complexity: exponential (the number of calls grows like 3^n)
int CountBST(int n)
{
if (n <= 1)
return 1;
int c = 0;
for (int i = 0; i < n; ++i)
{
int lc = CountBST(i);
int rc = CountBST(n-1-i);
c += lc*rc;
}
return c;
}
// Time Complexity: O(n^2)
int CountBST_DP(int n)
{
vector<int> v(n+1, 0);
v[0] = 1;
for (int k = 1; k <= n; ++k)
{
for (int i = 0; i < k; ++i)
v[k] += v[i]*v[k-1-i];
}
return v[n];
}
/* Catalan numbers:
            C(2n, n)
    f(n) = ----------
             n + 1
              2*(2n+1)
    f(n+1) = ---------- * f(n)
               n + 2
Time Complexity: O(n)
Space Complexity: O(n) - but can be easily reduced to O(1). */
int CountBST_Math(int n)
{
vector<int> v(n+1, 0);
v[0] = 1;
for (int k = 0; k < n; ++k)
v[k+1] = v[k]*2*(2*k+1)/(k+2);
return v[n];
}
int main()
{
for (int n = 1; n <= 10; ++n)
cout << CountBST(n) << '\t' << CountBST_DP(n) <<
'\t' << CountBST_Math(n) << endl;
return 0;
}
/* Output:
1 1 1
2 2 2
5 5 5
14 14 14
42 42 42
132 132 132
429 429 429
1430 1430 1430
4862 4862 4862
16796 16796 16796
*/
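The ratio used in CountBST_Math follows from the closed form of the Catalan numbers. A short derivation, writing C_n for f(n):

```latex
\begin{align*}
C_n &= \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!},\\
\frac{C_{n+1}}{C_n}
  &= \frac{(2n+2)!}{(n+2)!\,(n+1)!}\cdot\frac{(n+1)!\,n!}{(2n)!}
   = \frac{(2n+2)(2n+1)}{(n+2)(n+1)}
   = \frac{2(2n+1)}{n+2}.
\end{align*}
```

Starting from C_0 = 1, this gives 1, 2, 5, 14, 42, ..., the same values as the program output above.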
You can think of it from the base case, working upward.
So, for the base case you have 1 (or fewer) nodes. There is only 1 structurally unique tree that is possible with 1 node -- that is the node itself. So, if numKeys is less than or equal to 1, just return 1.
Now suppose you have more than 1 key. Well, then one of those keys is the root, some items are in the left branch and some items are in the right branch.
How big are those left and right branches? Well it depends on what is the root element. Since you need to consider the total amount of possible trees, we have to consider all configurations (all possible root values) -- so we iterate over all possible values.
For each iteration i, we know that i is at the root, i - 1 nodes are on the left branch and numKeys - i nodes are on the right branch. But, of course, we already have a function that counts the total number of tree configurations given the number of nodes! It's the function we're writing. So, recursively call the function to get the number of possible tree configurations of the left and right subtrees. The total number of trees possible with i at the root is then the product of those two numbers (for each configuration of the left subtree, all possible right subtrees can happen).
After you sum it all up, you're done.
So, if you kind of lay it out there's nothing special with calling the function recursively from within a loop -- it's just a tool that we need for our algorithm. I would also recommend (as Grammin did) to run this through a debugger and see what is going on at each step.
Each call has its own variable space, as one would expect. The complexity comes from the fact that the execution of the function is "interrupted" in order to execute -again- the same function.
This code:
for (root=1; root<=numKeys; root++) {
left = countTrees(root - 1);
right = countTrees(numKeys - root);
// number of possible trees with this root == left*right
sum += left*right;
}
Could be rewritten this way in Plain C:
root = 1;
Loop:
if ( !( root <= numKeys ) ) {
goto EndLoop;
}
left = countTrees( root - 1 );
right = countTrees( numKeys - root );
sum += left * right;
++root;
goto Loop;
EndLoop:
// more things...
It is actually translated by the compiler to something like that, but in assembler. As you can see, the loop is controlled by a pair of variables, numKeys and root, and their values are not modified by the execution of another instance of the same procedure. When the callee returns, the caller resumes execution with the same values for all variables it had before the recursive call.
IMO, key element here is to understand function call frames, call stack, and how they work together.
In your example, you have a bunch of local variables which are initialised in the first call but whose final values are not yet known. It's important to observe those local variables to understand the whole idea. Each call updates its own local variables and finally returns its result to its caller (most likely via a register, before the call frame is popped off the stack), and this continues back up the chain until the result is added to the initial call's sum variable.
The important distinction here is where to return. If you need an accumulated sum like in your example, you cannot return inside the loop, as that would exit the function early. However, if you depend on a value reaching a certain state, you can check for that state inside the for loop and return immediately without running the remaining iterations.