Red-Black Tree Height using Recursion - c++

I have the following methods to get the height of a red-black tree, and they work (I pass in the root). My question is: how does this work? I have drawn a tree and tried to follow it step by step through each recursive call, but I can't pull it off.
I know the general idea of what the code is doing, which is going through all the leaves and comparing them, but can anyone give a clear explanation of this?
int RedBlackTree::heightHelper(Node * n) const{
    if ( n == NULL ){
        return -1;
    }
    else{
        return max(heightHelper(n->left), heightHelper(n->right)) + 1;
    }
}
int RedBlackTree::max(int x, int y) const{
    if (x >= y){
        return x;
    }
    else{
        return y;
    }
}

Well, the general algorithm to find the height of any binary tree (whether a BST, AVL tree, red-black tree, etc.) is as follows:
For the current node:
    if (node is NULL) return -1
    else
        h1 = height of your left child   // a recursive call
        h2 = height of your right child  // a recursive call
        add 1 to max(h1, h2) to account for the current node
        return this value to the parent
An illustration to the above algorithm is as follows:
(Image courtesy Wikipedia.org)

This code will return the height of any binary tree, not just a red-black tree. It works recursively.
I found this problem difficult to think about in the past, but if we imagine we have a function which returns the height of a sub-tree, we could easily use that to compute the height of a full tree. We do this by computing the height of each side, taking the max, and adding one.
The height of the tree either goes through the left or right branch, so we can take the max of those. Then we add 1 for the root.
Handle the base case of no tree (-1), and we're done.

This is a basic recursion algorithm.
Start at the base case: if the root itself is null, the height of the tree is -1, as the tree does not exist.
Now imagine you are at any node: what would the height of the tree be if this node were its root?
It would simply be the maximum of the heights of the left and right subtrees (you are looking for the longest path down, so you take the greater of the two), plus 1 to account for the node itself.
That's it, once you follow this, you're done!

As a recursive function, this computes the height of each child node and uses those results to compute the height of the current node. The height of any node is always the maximum height of its two children plus 1. A single-node case is probably the easiest to understand, since it has a height of zero (0).
A
Here the call stack looks like this:
height(A) =
max(height(A->left), height(A->right)) + 1
Since both left and right are null, both return (-1), and therefore this reduces to
height(A) = max (-1, -1) + 1;
height(A) = -1 + 1;
height(A) = 0
A slightly more complicated version:
      A
     / \
    B   C
   / \
  D   E
The recursive calls we care about are:
height(A) =
max(height(B), height(C)) + 1
height(B) =
max(height(D), height(E)) + 1
The single nodes D, E, and C each have a height of zero, as we know from the first example (they have no children). Therefore, all of the above reduces to
height(A) = max( (max(0, 0) + 1), 0) + 1
height(A) = max(1, 0) + 1
height(A) = 1 + 1
height(A) = 2
I hope that makes at least a dent in the learning curve for you. Draw them out on paper with some sample trees to understand better if you still have doubts.
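The five-node walkthrough above can be checked with a small sketch. The `Node` struct here is a hypothetical stand-in (the real red-black node presumably also holds a key and a colour, which the height computation never reads), and `heightOfExampleTree` is an illustrative helper name, not part of the original code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical minimal node: only the two child links matter for the height.
struct Node {
    Node* left;
    Node* right;
};

// Height measured in edges: an empty subtree is -1, a single node is 0.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return std::max(height(n->left), height(n->right)) + 1;
}

// Builds the tree A(B(D, E), C) from the walkthrough and returns height(A).
int heightOfExampleTree() {
    Node d{nullptr, nullptr};
    Node e{nullptr, nullptr};
    Node c{nullptr, nullptr};
    Node b{&d, &e};   // B has children D and E
    Node a{&b, &c};   // A has children B and C
    assert(height(&d) == 0);   // max(-1, -1) + 1
    assert(height(&b) == 1);   // max(0, 0) + 1
    return height(&a);         // max(1, 0) + 1
}
```

Calling `heightOfExampleTree()` from `main` should give the same 2 that the hand calculation arrives at.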

Related

Why time complexity of following code is O(n^2)?

void level_order_recursive(struct node *t , int h) //'h' is height of my binary tree
{                                                  //'t' is address of root node
    for(int i = 0 ; i <= h ; i++)
    {
        print_level(t , i);
    }
}
Every time print_level() is called, I think the recursive function is called (2^i) times. So 2^0 + 2^1 + 2^2 + ... + 2^h should give a time complexity of O(2^n). Where am I going wrong?
void print_level(struct node * t , int i)
{
    if( i == 0)
        cout << t -> data <<" ";
    else
    {
        if(t -> left != NULL)
            print_level(t -> left , i - 1); //recursive call
        if(t -> right != NULL)
            print_level(t -> right , i - 1); //recursive call
    }
}
You are confusing h and n. h is the height of the tree; n is apparently the number of elements in the tree. So print_level takes O(2^i) in the worst case, but that is also at most n, because a call can never visit more nodes than the tree contains.
The worst case happens when you have a degenerate tree, where each node has only one successor. In that case you have n nodes, but the height of the tree is also h = n. Each call to print_level takes i steps in that case, and summing up i from 1 to h = n gives O(n^2).
You always start at the root of the tree t and increase the level by one each time (i) until you reach the height of the tree h.
You said it is a binary tree, but you did not mention any property, e.g. balanced or so. So I assume it can be an unbalanced binary tree and thus the height of the tree in worst case can be h = n where n is the number of nodes (that is a completely unbalanced tree that looks like a list actually).
So this means that level_order_recursive loops n times. I.e. the worst case is that the tree has n levels.
print_level receives the root node and the level to print. It calls itself recursively until it reaches that level and prints it out, i.e. it descends i times (each recursive call decreases i by one).
So you have 1 + 2 + 3 + ... + h iterations. And since h = n you get 1 + 2 + 3 ... + n steps. This is (n * (n+1))/2 (Gaussian sum formula) which is in O(n^2).
If you can ensure that the tree is balanced, then you improve the worst case, because the height would be h = ld(n), where ld denotes the binary logarithm.
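To make the two cases concrete, here is a rough sketch that counts how many times print_level would be invoked instead of printing. The helper names (`visits_level`, `visits_all_levels`, `visits_on_chain`, `visits_on_perfect`) are made up for this illustration, and the height is taken in edges, so a chain of n nodes has h = n - 1 and the outer loop still runs n times.

```cpp
#include <cassert>
#include <vector>

struct node { int data; node* left; node* right; };

// Same recursion as print_level, but it returns the number of calls made
// (this call plus all calls below it) instead of printing anything.
long visits_level(const node* t, int i) {
    if (i == 0) return 1;
    long v = 1;
    if (t->left != nullptr)  v += visits_level(t->left,  i - 1);
    if (t->right != nullptr) v += visits_level(t->right, i - 1);
    return v;
}

// Total calls made by the loop in level_order_recursive(t, h).
long visits_all_levels(const node* t, int h) {
    long total = 0;
    for (int i = 0; i <= h; ++i) total += visits_level(t, i);
    return total;
}

// Degenerate tree: n nodes chained through their right pointers.
long visits_on_chain(int n) {
    std::vector<node> pool(n);
    for (int k = 0; k < n; ++k) {
        pool[k].data = k;
        pool[k].left = nullptr;
        pool[k].right = (k + 1 < n) ? &pool[k + 1] : nullptr;
    }
    long total = visits_all_levels(&pool[0], n - 1);
    assert(total == static_cast<long>(n) * (n + 1) / 2);  // 1 + 2 + ... + n
    return total;
}

// Perfect tree with levels 0..h, laid out heap-style:
// the children of index k are 2k + 1 and 2k + 2.
long visits_on_perfect(int h) {
    int n = (1 << (h + 1)) - 1;
    std::vector<node> pool(n);
    for (int k = 0; k < n; ++k) {
        pool[k].data = k;
        pool[k].left  = (2 * k + 1 < n) ? &pool[2 * k + 1] : nullptr;
        pool[k].right = (2 * k + 2 < n) ? &pool[2 * k + 2] : nullptr;
    }
    return visits_all_levels(&pool[0], h);
}
```

For a chain of 10 nodes this gives 55 calls, which is the Gaussian sum 10 * 11 / 2. For a perfect tree with h = 3 (15 nodes) it gives 1 + 3 + 7 + 15 = 26 calls, fewer than 2n, so the quadratic cost comes from the tall, thin shape rather than from the branching.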
Based on this or that, pages 3 and 4, binary search algorithm, which resembles our case, has a time complexity of T(n) = T(n/2) + c.
Except that, both left and right sub-trees are browsed, hence the 2T(n/2) in the formula below, since this is a traversal algorithm, rather than a search one.
Here, I will comply to the question and use 'h' instead of 'n'.
Using recurrence relation, you get the following proof:
In the worst case the time complexity will be O(n^2), and it cannot be O(2^n), because the per-level costs add up to O(n) + O(n-1) + O(n-2) + ... + O(1), which is at worst O(n^2).

How the double recursion works in C/C++ - for example depth of a binary tree?

Recursion doesn't come naturally to me. A few programs that I could understand were factorial, where the factorial of n is n * factorial(n-1), and similarly the Fibonacci series, Fn = Fn-1 + Fn-2, as well as BST insert and search. All these recursive functions have one thing in common: a condition that returns a concrete value. Otherwise, the function calls itself with a different parameter. Once the concrete value is returned, all the calls are unfolded. However, I am not able to understand programs where the recursive calls come one after the other. What happens there? How can I think along those lines naturally? For example, here is the program:
What is the significance of the following lines?
/* compute the depth of each subtree */
int lDepth = maxDepth(node->left);
int rDepth = maxDepth(node->right);
int maxDepth(struct node* node)
{
    if (node==NULL)
        return 0;
    else
    {
        /* compute the depth of each subtree */
        int lDepth = maxDepth(node->left);
        int rDepth = maxDepth(node->right);
        /* use the larger one */
        if (lDepth > rDepth)
            return(lDepth+1);
        else return(rDepth+1);
    }
}
While searching the tree, the condition that returns a concrete value is if (node==NULL) and the concrete value returned is 0, which is a tree of depth 0. Consider the following tree (from wikipedia)
Starting at node 8, the code will recurse to node 3, and then to node 1. When it tries to recurse to the left child of node 1, it will find NULL and return 0. Then it will try the right child of node 1, which will also return 0. At this point node 1 comes to the if statement
if (lDepth > rDepth)
return(lDepth+1);
else return(rDepth+1);
Since both lDepth and rDepth are 0, node 1 returns 1 to node 3. Then node 3 recurses to node 6, and so on.
Each call to maxDepth(node->left) will result in an immediate call to maxDepth(node->left), until there is nothing left (no pun intended) on the left-most side of the tree. Then the last call returns and there will be a call to maxDepth(node->right).
This is a so-called 'depth-first' traversal, in that we go as far down the tree as possible and then visit the leaves on each branch until we're done on the branch, and then we back up to the fork.
Perhaps the best way to understand this code is to draw a picture of a small binary tree, and step through the code to see what will happen.
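If stepping through on paper is hard to follow, one option is to let the program record the order in which calls return. The sketch below is the same maxDepth logic with a log added. The name `maxDepthTraced`, the log parameter, and the five-node sample (only the top of the tree that the walkthrough mentions: 8, 3, 1, 6, plus an assumed right child 10) are all choices made for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct node { int data; node* left; node* right; };

// Same logic as maxDepth, plus a log entry (node value, depth returned)
// appended at the moment each non-NULL call returns.
int maxDepthTraced(const node* n, std::vector<std::pair<int, int> >& log) {
    if (n == nullptr) return 0;
    int lDepth = maxDepthTraced(n->left, log);   // finishes completely first
    int rDepth = maxDepthTraced(n->right, log);  // only then does this start
    int depth = (lDepth > rDepth ? lDepth : rDepth) + 1;
    log.push_back(std::make_pair(n->data, depth));
    return depth;
}

// Assumed sample tree:
//       8
//      / \
//     3   10
//    / \
//   1   6
std::vector<std::pair<int, int> > traceSample() {
    node n1{1, nullptr, nullptr};
    node n6{6, nullptr, nullptr};
    node n10{10, nullptr, nullptr};
    node n3{3, &n1, &n6};
    node n8{8, &n3, &n10};
    std::vector<std::pair<int, int> > log;
    int depth = maxDepthTraced(&n8, log);
    assert(depth == 3);
    return log;
}
```

The log comes back in the order 1, 6, 3, 10, 8 with depths 1, 1, 2, 1, 3: node 1 returns before node 6 is even visited, and node 3 can only return once both of its children have.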
Mentally, it may help to separate the logic into two parts. If you have a pointer possibly to a node in a singly-linked list, and define the depth of a NULL pointer as being 0, then you can say the depth of any other node is one more than the depth of the node to which it links.
{node3 p_next_->}---->{node2 p_next_->}---->{node1 p_next_=nullptr}
So, the code for this would be:
int depth(Node* p)
{
    if (p == NULL) return 0;
    return depth(p->p_next_) + 1;
}
Then, your question is about a binary tree... the logic is similar but each time you work out the depth of a node, you're saying "well, this node may have a right and/or a left hierarchy of nodes under it... my depth is one more than the greater of their depths".
Alternatively, it might help to think of a family, say fred has two children sue and max, and so on....
          fred
         /    \
      sue      max
     /   \
 sally   charlie
         /
      june
Here, working out the depth of say sue is a bit like asking "is she a child (depth 1), a parent (depth 2), a grandparent (depth 3), a great-grandparent (depth 4)?"
We can see she's sally's mother (depth 2 on that side), but she's june's grandmother (depth 3) on that side.
We can work out that answer of 3 systematically by saying sue's depth is one more than the deeper of sally's and charlie's. sally's got no kids (their imaginary depth is 0), so she's depth 1, so sue's at least 2. But charlie's one more than june, who's one more than her imaginary kids (depth 0), i.e. june's 1, so charlie's 2, so sue's actually 3, that being more than the count from sally's side.

BST Find Height Recursively

I'm trying to find the height of a Binary Search Tree in my program, and keep coming upon this recursive solution to find the height:
int maxHeight(BinaryTree *p) {
    if (!p) return 0;
    int left_height = maxHeight(p->left);
    int right_height = maxHeight(p->right);
    return (left_height > right_height) ? left_height + 1 : right_height + 1;
}
Can someone explain to me how this works? I don't understand how it adds up the height. It looks like it should just go through each side of the tree and return 0.
The algorithm works this way:
If the tree I am looking at does not exist, then the height of the tree is 0.
Otherwise, the height of the tree is the maximum height of its two sub-trees plus 1 (the plus 1 is needed to include the node you are currently looking at).
E.g. if I have a tree with no branches (i.e. a stump), then I have a height of one, because I have two subtrees of height 0 and the max of these heights plus 1 is 1.
another example:
If I have a tree:
A - B - C - D
    |   |
    E   F
(where A is the root)
then, height is not 0, as A is not null
height = max(height(left), height(right)) + 1.
height of left at A is 0, because A has no left branch.
height of the right branch at A is the height of B.
to work out the height of B, we consider B as a completely new tree:
B - C - D
|   |
E   F
now
height = max(height(left), height(right)) + 1.
to work out the height of left, we consider E as a completely new tree:
E
this exists so its height is not 0
however, its two branches do not exist, so its height is 1 (each branch has height of 0)
back to the parent tree again:
B - C - D
|   |
E   F
we were working out height, and found out the height of the left branch is 1.
so height = max(1, height(right) ) + 1
so, what is the height of right?
once again, we consider the right branch as its own tree:
C - D
|
F
the problem is the same as before
height = max(height(left), height(right)) + 1
to work out height(left), we consider F by itself
F
F has height of 1, because it has two null branches (i.e. max of two 0 heights plus 1)
now looking at right
D
D has height of 1 for same reason
back to parent of F and D:
C - D
|
F
height of C is:
max(height(F), height(D)) + 1
= max(1, 1) + 1
= 1 + 1
= 2.
So now we know height of C, we can go back to the parent:
B - C - D
|   |
E   F
recall, we worked out the height of B's left branch as 1, and then started working out its right branch height.
We now know that the right branch has a height of 2
Max(1, 2) is 2.
2 + 1 = 3
Therefore, height of B is 3.
Now we know this, we are finally back to our original tree:
A - B - C - D
    |   |
    E   F
we already worked out the left branch's height as 0, and then started working on the right branch.
we now know the right branch has a height of 3.
Therefore,
height(A) = Max(height(null), height(B)) + 1
= Max( 0 , 3 ) + 1
= 3+1
=4
done. The height of A is 4.
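The whole walkthrough can be replayed in code. This is only a sketch: the `BinaryTree` struct is a hypothetical stand-in with just the two child pointers the function needs, and `heightOfWalkthroughTree` is a helper name invented here. In the diagram, a node drawn below another is its left child and a node drawn to the right is its right child.

```cpp
#include <cassert>

// Hypothetical minimal node type with the member names used by maxHeight.
struct BinaryTree {
    BinaryTree* left;
    BinaryTree* right;
};

int maxHeight(BinaryTree* p) {
    if (!p) return 0;
    int left_height = maxHeight(p->left);
    int right_height = maxHeight(p->right);
    return (left_height > right_height) ? left_height + 1 : right_height + 1;
}

// Builds A - B - C - D with E under B and F under C, then returns maxHeight(A).
int heightOfWalkthroughTree() {
    BinaryTree e{nullptr, nullptr};
    BinaryTree f{nullptr, nullptr};
    BinaryTree d{nullptr, nullptr};
    BinaryTree c{&f, &d};        // C: left F, right D
    BinaryTree b{&e, &c};        // B: left E, right C
    BinaryTree a{nullptr, &b};   // A: no left branch, right B
    assert(maxHeight(&e) == 1);  // two null branches: max(0, 0) + 1
    assert(maxHeight(&c) == 2);  // max(1, 1) + 1
    assert(maxHeight(&b) == 3);  // max(1, 2) + 1
    return maxHeight(&a);        // max(0, 3) + 1
}
```

The intermediate asserts mirror the intermediate results of the hand calculation, and the final value is 4.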
The function only returns 0 if the binary tree is null. That makes sense because if the tree is null, there are no nodes to count so the height is 0.
If it's not null, then the function will add 1 (the height of the current node) to the height of the left or the right child sub-tree, whichever is greater.
How does it know the height of the child sub-trees? By calling itself recursively passing in the left child or the right child so that the next recursion starts one level down in the tree.
What happens when you call the function for the first time passing in the root of the tree? The function first calls itself recursively travelling down the left-most children until it finds the leaf node. It calls itself one more time passing in the left child of the leaf node, which is null. This last call doesn't recurse any more and just returns 0. The function then recurses to the right child of the leaf which also returns 0. It then adds 1 for its own height and returns.
We are now at the parent of the left-most leaf and as before it will recurse to the right child (sibling to the leaf). That one might not exist (returns 0), be a leaf (returns 1) or have children (returns >1). Whatever the return value, it will be compared to the left-most leaf (height 1) and whichever is greater will be incremented (always adding the height of the current node) and returned as the height of the sub-tree rooted at the current node.
Notice that the recursion will keep "unrolling" as it travels back up to the root but at each level, it will first recurse further down the right child sub-tree. This is what's known as a depth first search. Eventually, the whole tree will be visited and the outstanding maximum height computed all the way back to the root.
Your code keeps two local variables, left_height and right_height, in every stack frame. They are released when each call returns, so the cost is small, but you can avoid them by returning the result directly, for instance
return max(maxHeight(p->left), maxHeight(p->right)) + 1;

Finding Depth of Binary Tree

I am having trouble understanding this maxDepth code. Any help would be appreciated. Here is the snippet example I followed.
int maxDepth(Node *&temp)
{
    if(temp == NULL)
        return 0;
    else
    {
        int lchild = maxDepth(temp->left);
        int rchild = maxDepth(temp->right);
        if(lchild <= rchild)
            return rchild+1;
        else
            return lchild+1;
    }
}
Basically, what I understand is that the function recursively calls itself (for each of the left and right cases) until it reaches the last node. Once it does, it returns 0, then it does 0+1. Then the previous node is 1+1, then the next one is 2+1. If there is a BST with 3 left children, int lchild will return 3, and the extra +1 is the root. So my question is, where do all these +1s come from? It returns 0 at the last node, but why does it return 0+1 etc. when it goes back up the left/right child nodes? I don't understand why. I know it does it, but why?
Consider this part (of a bigger tree):
A
 \
  B
Now we want to calculate the depth of this part of the tree, so we pass a pointer to A as its parameter.
Obviously pointer to A is not NULL, so the code has to:
call maxDepth for each of A's children (left and right branches). A->right is B, but A->left is obviously NULL (as A has no left branch)
compare these, choose the greatest value
return this chosen value + 1 (as A itself takes a level, doesn't it?)
Now we're going to look at how maxDepth(NULL) and maxDepth(B) are calculated.
The former is quite easy: the first check will make maxDepth return 0. If the other child were NULL too, both depths would be equal (0), and we have to return 0 + 1 for A itself.
But B is not empty; it has no branches, though, so (as we noticed) its depth is 1 (greatest of 0 for NULLs at both parts + 1 for B itself).
Now let's get back to A. maxDepth of its left branch (NULL) is 0, maxDepth of its right branch is 1. Maximum of these is 1, and we have to add 1 for A itself - so it's 2.
The point is the same steps are to be done when A is just a part of the bigger tree; the result of this calculation (2) will be used in the higher levels of maxDepth calls.
Depth is being calculated using the previous node + 1
All the ones come from this part of the code:
if(lchild <= rchild)
    return rchild + 1;
else
    return lchild + 1;
You add yourself +1 to the results obtained in the leaves of the tree. These ones keep adding up until you exit all the recursive calls of the function and get to the root node.
Remember in binary trees a node has at most 2 children (left and right)
It is a recursive algorithm, so it calls itself over and over.
If temp (the node being looked at) is null, it returns 0, as this node is nothing and should not count. That is the base case.
If the node being looked at is not null, it may have children. So it gets the max depth of the left subtree and of the right subtree, takes the greater of the two, and adds 1 for the level of the current node.
It dives down into the two subtrees (temp->left and temp->right) and repeats the operation until it reaches nodes without children. At that point it will call maxDepth on left and right, which are null and return 0, and then it starts returning back up the chain of calls.
So if you have a chain of three nodes (say, root-left1-left2), it will get down to left2 and call maxDepth(left) and maxDepth(right). Each of those returns 0 (they are null). Then it is back at left2. It compares, both are 0, so the greater of the two is of course 0, and it returns 0+1. Then we are at left1: it repeats, finds that 1 is the greater of its left and right (it has no right child, so that side is 0), and returns 1+1. Now we are at root: same thing, it returns 2+1 = 3, which is the depth.
Because the depth is calculated with previous node+1
To find the maximum depth of a binary tree, keep going left and traverse the tree; basically, perform a DFS.
or
We can find the depth of the binary search tree in three different recursive ways
– using instance variables to record current depth and total depth at every level
– without using instance variables in top-bottom approach
– without using instance variables in bottom-up approach
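The list above gives no code, so the following is only a guess at what the two approaches without instance variables might look like. `depthBottomUp`, `depthTopDown`, the `best` accumulator, and the four-node sample are all assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>

struct Node { Node* left; Node* right; };

// Bottom-up: each call returns the depth of its own subtree to its caller.
int depthBottomUp(const Node* n) {
    if (n == nullptr) return 0;
    return 1 + std::max(depthBottomUp(n->left), depthBottomUp(n->right));
}

// Top-down: the caller passes in how many nodes lie above this one, and the
// deepest count seen so far is accumulated in `best`.
void depthTopDown(const Node* n, int nodesAbove, int& best) {
    if (n == nullptr) return;
    best = std::max(best, nodesAbove + 1);
    depthTopDown(n->left, nodesAbove + 1, best);
    depthTopDown(n->right, nodesAbove + 1, best);
}

int depthTopDown(const Node* root) {
    int best = 0;
    depthTopDown(root, 0, best);
    return best;
}

// Sample: root with children mid and other; mid has a single left leaf.
int sampleDepthTopDown() {
    Node leaf{nullptr, nullptr};
    Node mid{&leaf, nullptr};
    Node other{nullptr, nullptr};
    Node root{&mid, &other};
    assert(depthBottomUp(&root) == 3);
    return depthTopDown(&root);
}
```

Both versions visit every node once and agree on the result (3 for the sample). The bottom-up form carries the answer in return values, while the top-down form carries the current depth in a parameter.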
The code snippet can be reduced to just:
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return 0;
}
A good way of looking at this code is from the top down:
What would happen if the BST had no nodes? We would have root = NULL and the function would immediately return an expected depth of 0.
Now suppose the tree was populated with a number of nodes. Starting at the top, the if condition would be true for the root node. We then ask, what is the max depth of the LEFT SUB TREE and the RIGHT SUB TREE by passing the root of those sub trees to maxDepth. Both the LST and the RST of the root are one level deeper than the root, so we must add one to get the depth of the tree at root of the tree passed to the function.
i think this is the right answer
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return -1;
}
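The two versions above differ only in the value returned for an empty tree, and that choice decides what is being counted. A minimal sketch, with `depthInNodes` and `heightInEdges` as names invented for the comparison:

```cpp
#include <algorithm>
#include <cassert>

struct Node { Node* left; Node* right; };

// Base case 0: counts the nodes on the longest root-to-leaf path.
int depthInNodes(const Node* root) {
    if (root) return 1 + std::max(depthInNodes(root->left), depthInNodes(root->right));
    return 0;
}

// Base case -1: counts the edges on that same path.
int heightInEdges(const Node* root) {
    if (root) return 1 + std::max(heightInEdges(root->left), heightInEdges(root->right));
    return -1;
}

// Chain root -> left1 -> left2: three nodes joined by two edges.
int edgesOnThreeNodeChain() {
    Node left2{nullptr, nullptr};
    Node left1{&left2, nullptr};
    Node root{&left1, nullptr};
    assert(depthInNodes(&root) == 3);
    return heightInEdges(&root);
}
```

For any tree, including the empty one, the edge count is exactly one less than the node count, so neither version is wrong; they answer slightly different questions.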

Finding Diameter of a Tree

I have written code for finding the diameter of a binary tree, but I am unable to figure out where it is going wrong. The two functions that I have written are as follows:
int btree::diameteroftree(node* leaf)
{
    if (leaf==NULL)
        return 0;
    int lheight = hieghtoftree(leaf->left);
    int rheight = hieghtoftree(leaf->right);
    int ldiameter = diameteroftree(leaf->left);
    int rdiameter = diameteroftree(leaf->right);
    return max(lheight + rheight + 1,max(ldiameter,rdiameter));
}
int btree::hieghtoftree(node* leaf)
{
    int left=0,right=0;
    if(leaf==NULL)
        return -1;
    else
    {
        left=hieghtoftree(leaf->left);
        right=hieghtoftree(leaf->right);
        if(left > right)
            return left +1;
        else
            return right+1;
    }
}
I am unable to figure out where I am going wrong here. Can someone let me know?
You want to return the number of nodes on the longest path. Therefore, the problem in your algorithm is this line:
return max(lheight + rheight + 1,max(ldiameter,rdiameter));
where
rootDiameter = lheight + rheight + 1
is the length of the path from the deepest node of the left tree to the deepest node of the right tree. However, this calculation is not correct. A single node returns a height of 0, so it will not be counted. You have two options:
– Change hieghtoftree to return the number of nodes on the deepest path and not the number of "hops"
– Address this problem in your summation:
return max(lheight + rheight + 3,max(ldiameter,rdiameter));
In a directed, rooted tree, there is always at most one path between any pair of nodes and the longest path to any node always starts at the root. It follows that the diameter is simply the height of the entire tree height(root), which can be computed with the recursion
height(leaf) = 0
height(node) = max(height(node.left), height(node.right)) + 1
EDIT: the page you link to in the comment describes the diameter of an undirected tree. You need a different tree representation, e.g. an adjacency matrix.
Consider a 3-node tree with root R and 2 leaves L1, L2. Each leaf has two NULL children, so hieghtoftree(L1) == hieghtoftree(L2) == 0, and diameteroftree(R) would therefore be 0 + 0 + 1 = 1, even though the path L1 - R - L2 has 2 edges.
I suggest keeping return -1; for NULL and changing
return max(lheight + rheight + 1,max(ldiameter,rdiameter)); --> return max(lheight + rheight + 2,max(ldiameter,rdiameter));
The result is then the number of edges on the path. If you want to count nodes instead, add one to the final result.
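As a cross-check on the edge and node counts, here is a sketch that computes the diameter in a single pass. It is a different structure from the two-function version in the question, not a patch of it: `heightAndDiameter`, `diameterInEdges`, and `diameterInNodes` are hypothetical names, and the height uses the same convention as hieghtoftree (-1 for NULL, 0 for a single node).

```cpp
#include <algorithm>
#include <cassert>

struct node { node* left; node* right; };

// Returns the height of n in edges (-1 for NULL) and raises bestEdges to the
// longest path, counted in edges, whose highest node lies in this subtree.
int heightAndDiameter(const node* n, int& bestEdges) {
    if (n == nullptr) return -1;
    int lheight = heightAndDiameter(n->left, bestEdges);
    int rheight = heightAndDiameter(n->right, bestEdges);
    // (lheight + 1) edges down the left side plus (rheight + 1) down the right.
    bestEdges = std::max(bestEdges, lheight + rheight + 2);
    return std::max(lheight, rheight) + 1;
}

int diameterInEdges(const node* root) {
    int best = 0;
    heightAndDiameter(root, best);
    return best;
}

// A non-empty path has one more node than it has edges.
int diameterInNodes(const node* root) {
    return root == nullptr ? 0 : diameterInEdges(root) + 1;
}

// Root R with two leaves L1 and L2: the longest path is L1 - R - L2.
int edgesInThreeNodeTree() {
    node l1{nullptr, nullptr};
    node l2{nullptr, nullptr};
    node r{&l1, &l2};
    assert(diameterInNodes(&r) == 3);
    return diameterInEdges(&r);
}
```

For the three-node tree this yields 2 edges and 3 nodes, and for a single node it yields 0 edges and 1 node. Because the height is returned alongside the running maximum, each node is visited once, whereas calling a separate height function at every node revisits subtrees repeatedly.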