I was going through a tutorial on binary trees,
and I am slightly stuck on the use of recursive functions. Say, for example, I need to count the number of nodes in a tree:
int countNodes( TreeNode *root )
{
    // Count the nodes in the binary tree to which
    // root points, and return the answer.
    if ( root == NULL )
        return 0;  // The tree is empty. It contains no nodes.
    else
    {
        int count = 1;   // Start by counting the root.
        count += countNodes(root->left);   // Add the number of nodes
                                           // in the left subtree.
        count += countNodes(root->right);  // Add the number of nodes
                                           // in the right subtree.
        return count;  // Return the total.
    }
} // end countNodes()
Now my doubt is: how would it count, say, root->left->left, or root->right->left->left?
Thanks
With recursive functions, you should think recursively! Here's how I would think of this function:
I start writing the signature of the function, that is
int countNodes( TreeNode *root )
So first, the cases that are not recursive. For example, if the given tree is NULL, then there are no nodes, so I return 0.
Then, I observe that the number of nodes in my tree is the number of nodes of the left sub-tree plus the number of nodes of the right sub-tree plus 1 (the root node). Therefore, I call the function for the left and right children and add the two results together, plus 1.
Note that I assume the function already works correctly!
Why did I do this? Simple: the function is supposed to work on any binary tree, right? Well, the left sub-tree of the root node is in fact a binary tree! The right sub-tree is also a binary tree. So I can safely assume that the same countNodes function can count the nodes of those trees. Once I have them, I just add left+right+1 and I get my result.
How does the recursive function really work? You could use a pen and paper to follow the algorithm, but in short it is something like this:
Let's say you call the function with this tree:
  a
 / \
b   c
   / \
  d   e
You see the root is not null, so you call the function for the left sub-tree:
b
and later the right sub-tree
  c
 / \
d   e
Before calling the right sub-tree though, the left sub-tree needs to be evaluated.
So, you are in the call of the function with input:
b
You see that the root is not null, so you call the function for the left sub-tree:
NULL
which returns 0, and the right sub-tree:
NULL
which also returns 0. You compute the number of nodes of the tree and it is 0+0+1 = 1.
Now, you got 1 for the left sub-tree of the original tree which was
b
and the function gets called for
  c
 / \
d   e
Here, you call the function again for the left sub-tree
d
which, similar to the case of b, returns 1, and then the right sub-tree
e
which also returns 1 and you evaluate the number of nodes in the tree as 1+1+1 = 3.
Now, you return to the first call of the function and you evaluate the number of nodes in the tree as 1+3+1 = 5.
So as you can see, for each left and right, you call the function again, and if they had left or right children, the function gets called again and again and each time it goes deeper in the tree. Therefore, root->left->left or root->right->left->left get evaluated not directly, but after subsequent calls.
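To watch the order of the calls on the five-node tree above, here is a minimal C++ sketch that adds tracing to countNodes. The name field, the depth parameter and countExampleTree are assumptions added for illustration; they are not part of the original function.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Same shape as the TreeNode in the question; `name` is an assumption added
// only so the trace can label each call.
struct TreeNode {
    char name;
    TreeNode* left;
    TreeNode* right;
};

// countNodes from the question, plus a hypothetical `depth` parameter that is
// used only to indent the trace by recursion level.
int countNodes(const TreeNode* root, int depth = 0) {
    const std::string indent(static_cast<std::string::size_type>(depth) * 2, ' ');
    if (root == nullptr) {
        std::cout << indent << "NULL -> 0\n";
        return 0;
    }
    std::cout << indent << "enter " << root->name << '\n';
    int count = 1;                                // the node itself
    count += countNodes(root->left, depth + 1);   // the whole left subtree
    count += countNodes(root->right, depth + 1);  // the whole right subtree
    std::cout << indent << "leave " << root->name << " -> " << count << '\n';
    return count;
}

// Builds the tree from the walk-through (a has children b and c, c has
// children d and e) and counts its nodes.
int countExampleTree() {
    TreeNode d{'d', nullptr, nullptr};
    TreeNode e{'e', nullptr, nullptr};
    TreeNode b{'b', nullptr, nullptr};
    TreeNode c{'c', &d, &e};
    TreeNode a{'a', &b, &c};
    const int total = countNodes(&a);
    assert(total == 5);  // 1 (b) + 3 (c, d, e) + 1 (a)
    return total;
}
```

Calling countExampleTree() prints one indented line per call, so the nested calls for root->right->left and friends show up as deeper indentation rather than as anything the top-level call does directly.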
That's basically what the recursion is doing: it adds 1 each time countNodes is called on an existing node (int count = 1;) and terminates when it tries to go to a child of a leaf node (since a leaf has no children). Each node recursively calls countNodes for each of its left and right children, and the count slowly increases and bubbles up to the top.
Try and look at it this way, where 1 is added for each node and 0 for a non-existing node where the recursion stops:
          1
        /   \
       1     1
      / \   / \
     1   0 0   0
    / \
   0   0
The 1's each add up, with the node's parent (the calling function at each level of recursion) adding 1 + left_size + right_size and returning that result. Therefore the values returned at each stage would be:
          4
        /   \
       2     1
      / \   / \
     1   0 0   0
    / \
   0   0
I'm not sure that made it any clearer but I hope it did.
Say you call countNodes(myTree);. Assuming myTree is not null, countNodes will eventually execute count += countNodes(root->left);, where root is myTree. It re-enters your countNodes function with the entire tree rooted at root->left (which is myTree->left). The logic then repeats itself; if there is no root->left, then the function returns 0. Otherwise, it will eventually call count += countNodes(root->left); again, but this time root will actually be myTree->left. That way it will count myTree->left->left. Later it does the same thing with the right nodes.
That's the beauty of recursive algorithms. The function is defined over the current node and its children. You only have to convince yourself that the current invocation is correct as long as the recursive calls to the left and right children are correct. Exactly the same reasoning applies to the children and their children, and so on ... it'll all just work.
It will start by root->left->(subnode->left) etc. until that branch returns 0, i.e. until it reaches a child that is not an actual node (the NULL child of a leaf in the tree);
Then the deepest node will check for root->right and repeat the same procedure. Try to visualize this with a small tree :
So in this case your function will go A->D->B then the right nodes will all return 0 and you will get a last +1 from your C node.
The algorithm you implemented is exhaustive. It visits the entire tree.
If the tree is empty, the count is zero.
If not, we add 1 to our count for the root and take its left child; let's call it L.
Since it is proven that a subtree of a tree is itself a tree, we perform the same algorithm again on the tree that has L as its root.
We then do the same for the tree whose root is the root's right child.
Now... this indeed works.
A subtree of a tree is a tree, also for empty or single nodes.
You should look at the definition of tree.
You can prove it using Mathematical Induction and formulate your problem in terms of inductive reasoning.
Recursive algorithms often use a structure very similar to inductive reasoning.
The trick with recursive functions is that there is a base case and an inductive step, just like mathematical induction.
The base case is how your recursive algorithm knows to stop. In this case it is if (root == NULL) -- this node doesn't represent a tree. This line is executed on every single node in your binary tree, even though it calls each one root at the time. It is false for all the nodes of the tree, but when you begin calling the recursive routine on the children of the leaf nodes -- which are all NULL -- then it will return 0 as the count of nodes.
The inductive step is how your recursive algorithm moves from one solved state to the next unsolved state by converting the unsolved problem into (one or more) already-solved problems. Your algorithm needs to count the number of nodes in a tree; you need 1 for the current node, and then you have two simpler problems -- the number of nodes in the tree on the left, and the number of nodes on the tree on the right. Get both of those, add them together, add the 1 for the current node, and return that as the count of nodes in this tree.
This concept really is fundamental to many algorithms in computer science, so it is well worth studying it until you fully understand it. See also quicksort and the Fibonacci sequence.
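The base case and the inductive step described above can also be written as a recurrence. The notation |T| for "number of nodes in tree T" is introduced here only for illustration:

```latex
\[
|T| =
\begin{cases}
0 & \text{if } T \text{ is empty (base case)},\\[4pt]
1 + |T_{\text{left}}| + |T_{\text{right}}| & \text{otherwise (inductive step)}.
\end{cases}
\]
```

For a single leaf this gives 1 + 0 + 0 = 1, and every larger tree reduces to strictly smaller subtrees, which is why the recursion always terminates.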
Think of the program as going to the deepest branches first. Then it works backwards, returning the count to the caller at each level.
      A
     / \
    B   C
   / \   \
  D   E   F
So first the program runs until
count += countNodes(root->left);
It pauses what it has done so far (apparently nothing special) and goes into B. The same happens there and it goes to D. At D it does the same. However, here we have a special case. The program sees, at the beginning, at the line
if ( root == NULL )
that the nonexistent left child of D is indeed NULL. Therefore you get back 0.
Then we go back at where we paused last time and we continue like this. Last time we were at B so we continue past the line
count += countNodes(root->left);
and run the next line
count += countNodes(root->right);
This goes on until you get back to A. But at A, again, we had paused just after searching the left branch of A. Therefore we continue with the right branch. Once we are done going through that whole branch, we get back to A.
At this point we don't have any unfinished business (pauses) left, so we just return the count that we gathered through this whole process.
Every subtree has its own root, just as the actual tree has a root. The counting is the same for each of the subtrees. So you just keep going until you reach the leaf nodes and stop that local recursion, returning and adding up the nodes as you go.
Draw the entire tree, then assign 1 to all leaf nodes (leaf nodes are at level N). After that you should be able to calculate the number of nodes generated by each node on the next higher level (N-1) just by summing: 1 + 1 if the node has two children, or 1 if the node has only one child. So for each node on level N-1, assign the value 1 + sum(left, right). Continuing upward, you should be able to calculate the number of nodes of the entire tree. The recursion that you posted does just that.
http://www.jargon.net/jargonfile/r/recursion.html
UPDATE:
The point is that both the data structure and the program are recursive.
For the data structure this means: a subtree is also a tree.
For the code it means: counting the tree := counting the subtrees (and adding one).
Related
I use the following method to traverse* a binary tree of 300 000 levels:
Node* find(int v){
    if(value==v)
        return this;
    else if(right && value<v)
        return right->find(v);
    else if(left && value>v)
        return left->find(v);
}
However I get a segmentation fault due to stack overflow.
Any ideas on how to traverse the deep tree without the overhead of recursive function calls?
* By "traverse" I mean "search for a node with given value", not full tree traversal.
Yes! For a 300 000 level tree avoid recursion. Traverse your tree and find the value iteratively using a loop.
Binary Search Tree representation
              25                  // Level 1
        20          36            // Level 2
     10    22    30    40         // Level 3
  ..  ..  ..  ..  ..  ..  ..
..  ..  ..  ..  ..  ..  ..  ..    // Level n
Just to clarify the problem further: your tree has a depth of n = 300,000 levels. A search in a Binary Search Tree (BST) follows a single path down from the root, so in the worst case it visits one node per level, which is O(n) in the depth of the tree. With the recursive version, every level adds one frame to the call stack, so the worst case needs about 300,000 nested calls. For comparison, a completely filled tree of that depth would have:
2^300,000 nodes = 9.9701e+90308 nodes (approximately).
No machine can store that many nodes, so your tree must be extremely unbalanced, close to a linked list. It is the depth of the recursion, not the number of nodes, that makes the call stack overflow.
Solution (iterative way):
I'm assuming your Node class/struct declaration is a classic standard integer BST one. Then you could adapt it and it will work:
struct Node {
    int data;
    Node* right;
    Node* left;
};

// root is assumed to be the tree's root pointer,
// e.g. a member of the class that find() belongs to.
Node* find(int v) {
    Node* temp = root; // work on a copy so the root pointer itself is never changed
    while (temp != nullptr) {
        if (temp->data == v) {
            return temp;
        }
        if (v > temp->data) {
            temp = temp->right;
        }
        else {
            temp = temp->left;
        }
    }
    return nullptr; // the value is not in the tree
}
Taking this iterative approach avoids recursion, so the search needs only a constant amount of call stack no matter how deep the tree is.
A simple loop where you have a variable of type Node* which you set to the next node, then loop again ...
Don't forget the case that the value you are searching for does not exist!
You could implement the recursion by not using the call stack but a user-defined stack or something similar; this could be done via the existing stack template. The approach would be to have a while loop which iterates until the stack is empty; as the existing implementation uses depth-first search, elimination of the recursive calls can be found here.
When the tree that you have is a Binary Search Tree, and all you want to do is search for a node in it that has a specific value, then things are simple: no recursion is necessary, you can do it using a simple loop as others have pointed out.
In the more general case of having a tree which is not necessarily a Binary Search Tree, and wanting to perform a full traversal of it, the simplest way is using recursion, but as you already understand, if the tree is very deep, then recursion will not work.
So, in order to avoid recursion, you have to implement a stack on the C++ heap. You need to declare a new StackElement class that will contain one member for each local variable that your original recursive function had, and one member for each parameter that your original recursive function accepted. (You might be able to get away with fewer member variables, you can worry about that after you have gotten your code to work.)
You can store instances of StackElement in a stack collection, or you can simply have each one of them contain a pointer to its parent, thus fully implementing the stack by yourself.
So, instead of your function recursively calling itself, it will simply consist of a loop. Your function enters the loop with the current StackElement being initialized with information about the root node of your tree. Its parent pointer will be null, which is another way of saying that the stack will be empty.
In every place where the recursive version of your function was calling itself, your new function will be allocating a new instance of StackElement, initializing it, and repeating the loop using this new instance as the current element.
In every place where the recursive version of your function was returning, your new function will be releasing the current StackElement, popping the one that was sitting on the top of the stack, making it the new current element, and repeating the loop.
When you find the node you were looking for, you simply break from the loop.
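A minimal sketch of that loop in C++, assuming a plain binary tree that is not necessarily ordered. The Node type, findAnywhere and findsDeepestInChain are hypothetical names. std::stack stands in for the hand-written StackElement chain, since the only state a pending call needs here is the node it should visit:

```cpp
#include <cassert>
#include <stack>

// Hypothetical node type for a binary tree that is NOT assumed to be ordered.
struct Node {
    int value;
    Node* left;
    Node* right;
};

// Depth-first search driven by an explicit stack instead of recursive calls.
Node* findAnywhere(Node* root, int v) {
    std::stack<Node*> pending;  // its storage is on the heap, not the call stack
    if (root != nullptr) pending.push(root);

    while (!pending.empty()) {
        Node* current = pending.top();
        pending.pop();                            // "return" from this element
        if (current->value == v) return current;  // found: leave the loop
        // The "recursive calls": push right first so left is visited first.
        if (current->right != nullptr) pending.push(current->right);
        if (current->left != nullptr) pending.push(current->left);
    }
    return nullptr;  // the value is not in the tree
}

// Builds a degenerate chain of `depth` nodes (right children only), searches
// for the deepest value, then frees the chain iteratively.
bool findsDeepestInChain(int depth) {
    assert(depth > 0);
    Node* root = new Node{0, nullptr, nullptr};
    Node* tail = root;
    for (int i = 1; i < depth; ++i) {
        tail->right = new Node{i, nullptr, nullptr};
        tail = tail->right;
    }
    const bool found = (findAnywhere(root, depth - 1) == tail);
    while (root != nullptr) {
        Node* next = root->right;
        delete root;
        root = next;
    }
    return found;
}
```

With this shape, a call such as findsDeepestInChain(300000) uses a single stack frame for the search; the pending nodes live in the std::stack's heap storage instead.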
Alternatively, if the node of your existing tree supports a) a link to its "parent" node and b) user data (where you can store a "visited" flag) then you don't need to implement your own stack, you can just traverse the tree in-place: in each iteration of your loop you first check if the current node is the node you were looking for; if not, then you enumerate through children until you find one which has not been visited yet, and then you visit it; when you reach a leaf, or a node whose children have all been visited, then you back-track by following the link to the parent. Also, if you have the freedom to destroy the tree as you are traversing it, then you do not even need the concept of "user data": once you are done with a child node, you free it and make it null.
Well, it can be made tail recursive at the cost of a single additional local variable and a few comparisons:
Node* find(int v){
    if(value==v)
        return this;
    else if(!right && value<v)
        return NULL;
    else if(!left && value>v)
        return NULL;
    else {
        Node *tmp = NULL;
        if(value<v)
            tmp = right;
        else if(value>v)
            tmp = left;
        return tmp->find(v);
    }
}
Walking through a binary tree is a recursive process, where you'll keep walking until you find that the node you're at currently points nowhere.
What you need is an appropriate base condition, something like:
if (treeNode == NULL)
return NULL;
In general, traversing a tree is accomplished this way (in C):
void traverse(treeNode *pTree){
    if (pTree==0)
        return;
    printf("%d\n",pTree->nodeData);
    traverse(pTree->leftChild);
    traverse(pTree->rightChild);
}
I just need help writing a recursive function that returns the rightmost node at the deepest level of a binary search tree (it's actually a node based heap). It is important to remember that the tree will always be complete. I have tried a few things, but nothing worked well enough to post here as a start.
I have found similar questions but those all relate to the leftmost node which I can do because it is always in the same place, but the rightmost node varies depending on how filled the tree is.
I have functions getLeftHeight and getRightHeight
void HeapClass::findDeleteNode( HeapNode *&workingPtr ){
}
I have the prototype like this, where I send in a pointer that I later take values out of and delete in a different function. I'm new to this website so sorry if this question is lacking information or posted incorrectly. Any help would be appreciated, thank you.
You'll need to compare the height of the left subtree with the height of the right one. If the height of the right one is greater than or equal to that of the left, you recurse on the right node, otherwise on the left:
void HeapClass::findDeleteNode( HeapNode *&workingPtr )
{
    bool isLeaf = getLeftHeight(workingPtr) == 0 && getRightHeight(workingPtr) == 0;
    if(isLeaf) {
        // found it
        return;
    }
    findDeleteNode(getLeftHeight(workingPtr) > getRightHeight(workingPtr) ? workingPtr->left : workingPtr->right);
}
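How getLeftHeight and getRightHeight measure height is not shown in the question, so here is a self-contained sketch under stated assumptions: HeapNode is a hypothetical node type, and height() measures a subtree by following left pointers only, which is valid because the tree is complete. It compares the heights of the two child subtrees rather than calling the asker's helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type; the question's HeapNode is not shown.
struct HeapNode {
    int value;
    HeapNode* left;
    HeapNode* right;
};

// Height of a COMPLETE tree: its leftmost path is always a longest path, so
// following left pointers is enough. An empty tree has height 0.
int height(const HeapNode* node) {
    int h = 0;
    for (; node != nullptr; node = node->left) ++h;
    return h;
}

// Returns the rightmost node on the deepest level of a complete tree.
HeapNode* findLastNode(HeapNode* node) {
    if (node == nullptr) return nullptr;
    const int leftHeight = height(node->left);
    const int rightHeight = height(node->right);
    if (leftHeight == 0 && rightHeight == 0) return node;  // leaf: found it
    // Equal heights: the bottom level reaches into the right subtree.
    // Otherwise the bottom level ends somewhere inside the left subtree.
    return findLastNode(leftHeight > rightHeight ? node->left : node->right);
}

// Builds a complete tree holding 1..n in level order (node i has children
// 2i and 2i+1) and returns the value that findLastNode reports.
int lastValueOfCompleteTree(int n) {
    assert(n >= 1);
    const std::size_t count = static_cast<std::size_t>(n);
    std::vector<HeapNode> nodes(count + 1);  // index 0 unused; pointers start null
    for (std::size_t i = 1; i <= count; ++i) {
        nodes[i].value = static_cast<int>(i);
        if (2 * i <= count) nodes[i].left = &nodes[2 * i];
        if (2 * i + 1 <= count) nodes[i].right = &nodes[2 * i + 1];
    }
    return findLastNode(&nodes[1])->value;
}
```

In level-order numbering the last node of an n-node complete tree is node n, which makes the sketch easy to check for small n.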
I have a balanced binary search tree of integers and I want to find the leftmost node which stores an integer greater than or equal to a fixed number a, using a function such as ask(a).
For example, suppose that I have added the following points to my tree: 8,10,3,6,1,4,7,14,13
Then the tree would be like this:
now ask(1) should be 1, ask(3) should be 3, ask(2) should be 3 and so on.
I think that I can use Inorder traversal to write my ask function, But I don't know how.
I've written this piece of code so far:
inorderFind(node->left, a);
if (node->key.getX() >= a)
return node;
inorderFind(node->right, a);
The first argument is the current tree node and a is the a that is described above. I know that I can use a bool variable like flag and set it to true when the if condition holds, and then it would prevent from walking through other nodes of the tree and returning a false node. Is there anything else that I can do?
Trees have the wonderful property of allowing queries through simple, recursive algorithms. So, let's try to find a recursive formulation of your query.
Say LEFTMOST(u) is a function which answers this question :
Given the binary search subtree rooted at node u, with(possibly null) left and
right children l and r, respectively, what is the left-most node
with a value >= a?
The relation is quite simple:
LEFTMOST(u) = LEFTMOST(l)   if it exists
              u             otherwise, if the value of u is >= a
              LEFTMOST(r)   otherwise
That's it. How you translate this to your problem and how you handle concepts like "null" and "does not exist" is a function of your representation.
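One possible translation into C++, as a sketch rather than the only way to do it. It assumes plain int keys (instead of the question's key.getX()), represents "does not exist" as nullptr, and uses the search-tree ordering to skip the left subtree whenever u itself is already too small. Node, leftmost, insert and buildExampleTree are names made up for this example.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical node type with plain int keys.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Leftmost (smallest) node whose key is >= a, or nullptr if there is none.
Node* leftmost(Node* u, int a) {
    if (u == nullptr) return nullptr;              // empty subtree: no answer
    if (u->key < a) return leftmost(u->right, a);  // u and its left subtree are too small
    Node* inLeft = leftmost(u->left, a);           // an answer further left wins
    return inLeft != nullptr ? inLeft : u;         // otherwise u itself is the answer
}

// Plain unbalanced insert, used only to build the example tree.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node{key, nullptr, nullptr};
    if (key < root->key) root->left = insert(root->left, key);
    else                 root->right = insert(root->right, key);
    return root;
}

// Builds a tree from the question's insertion order. Nodes are deliberately
// leaked to keep the sketch short.
Node* buildExampleTree() {
    Node* root = nullptr;
    for (int key : {8, 10, 3, 6, 1, 4, 7, 14, 13}) root = insert(root, key);
    assert(leftmost(root, 2)->key == 3);  // ask(2) from the question
    return root;
}
```

Because each call descends into only one child, the query follows a single root-to-leaf path and never needs the flag variable mentioned in the question.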
So, I'm trying to build a function for my 2-3 tree class, and this is the pseudocode that I've been following:
if T is empty replace it with a single node containing k
else if T is just 1 node m:
(a) create a new leaf node n containing k
(b) create a new internal node with m and n as its children,
and with the appropriate values for leftMax and middleMax
else call auxiliary method insert(T, k)
However, I don't understand what it does in the case where T is just one single node (one leaf). What is n? Is it a misprint? If m is already the root, since T is just the one node m, then how can it be created as a new internal node as (b) instructs? Any help would be greatly appreciated.
A drawing would make the concept much easier to understand, and I would really appreciate one. Thank you.
I have implemented a link-based BST (binary search tree) in C++ for one of my assignments. I have written my whole class and everything works well, but my assignment asks me to plot the run-times for:
a. A sorted list of 50000, 75000, and 100000 items
b. A random list of 50000, 75000, and 100000 items
That's fine, I can insert the numbers, but it also asks me to call the FindHeight() and CountLeaves() methods on the tree. My problem is that I've implemented the two functions using recursion. Since I have such a big list of numbers, I'm getting a stack overflow exception.
Here's my class definition:
template <class TItem>
class BinarySearchTree
{
public:
    struct BinarySearchTreeNode
    {
    public:
        TItem Data;
        BinarySearchTreeNode* LeftChild;
        BinarySearchTreeNode* RightChild;
    };
    BinarySearchTreeNode* RootNode;
    BinarySearchTree();
    ~BinarySearchTree();
    void InsertItem(TItem);
    void PrintTree();
    void PrintTree(BinarySearchTreeNode*);
    void DeleteTree();
    void DeleteTree(BinarySearchTreeNode*&);
    int CountLeaves();
    int CountLeaves(BinarySearchTreeNode*);
    int FindHeight();
    int FindHeight(BinarySearchTreeNode*);
    int SingleParents();
    int SingleParents(BinarySearchTreeNode*);
    TItem FindMin();
    TItem FindMin(BinarySearchTreeNode*);
    TItem FindMax();
    TItem FindMax(BinarySearchTreeNode*);
};
FindHeight() Implementation
template <class TItem>
int BinarySearchTree<TItem>::FindHeight()
{
    return FindHeight(RootNode);
}

template <class TItem>
int BinarySearchTree<TItem>::FindHeight(BinarySearchTreeNode* Node)
{
    if(Node == NULL)
        return 0;
    return 1 + max(FindHeight(Node->LeftChild), FindHeight(Node->RightChild));
}
CountLeaves() implementation
template <class TItem>
int BinarySearchTree<TItem>::CountLeaves()
{
    return CountLeaves(RootNode);
}

template <class TItem>
int BinarySearchTree<TItem>::CountLeaves(BinarySearchTreeNode* Node)
{
    if(Node == NULL)
        return 0;
    else if(Node->LeftChild == NULL && Node->RightChild == NULL)
        return 1;
    else
        return CountLeaves(Node->LeftChild) + CountLeaves(Node->RightChild);
}
I tried to think of how I can implement the two methods without recursion but I'm completely stumped. Anyone have any ideas?
Recursion on a tree with 100,000 nodes should not be a problem if it is balanced. The depth would only be maybe 17, which would not use very much stack in the implementations shown. (log2(100,000) = 16.61). So it seems that maybe the code that is building the tree is not balancing it correctly.
I found this page very enlightening because it talks about the mechanics of converting a function that uses recursion to one that uses iteration.
It has examples showing code as well.
Maybe you need to calculate this while doing the insert. Store the heights of nodes, i.e. add an integer field like height to the Node object. Also keep counters height and leaves for the tree. When you insert a node, if its parent is (was) a leaf, the leaf count doesn't change; if not, increase the leaf count by 1. Also, the height of the new node is its parent's height + 1, so if that is greater than the current height of the tree, update it. It's homework, so I won't help with the actual code.
Balance your tree occasionally. If your tree is getting stackoverflow on FindHeight(), that means your tree is way unbalanced. If the tree is balanced it should only have a depth of about 20 nodes for 100000 elements.
The easiest (but fairly slow) way of re-balancing an unbalanced binary tree is to allocate an array of TItem big enough to hold all of the data in the tree, insert all of your data into it in sorted order, and delete all of the nodes. Then rebuild the tree from the array recursively. The root is the node in the middle. root->left is the middle of the left half, root->right is the middle of the right half. Repeat recursively. This is the easiest way to rebalance, but it is slowish and temporarily takes a lot of memory. On the other hand, you only have to do this when you detect that the tree is very unbalanced (depth on insert is more than 100).
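A minimal sketch of the rebuild step, assuming the data has already been copied into a sorted std::vector<int>. Node, buildBalanced, height, destroy and balancedHeightFor are hypothetical names, and int replaces TItem for brevity. Recursion is safe here because the result is balanced, so the call depth stays around log2(n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Builds a balanced tree from sorted[lo, hi): the middle element becomes the
// root, and the two halves become the left and right subtrees.
Node* buildBalanced(const std::vector<int>& sorted, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= sorted.size());
    if (lo == hi) return nullptr;  // empty range: empty subtree
    const std::size_t mid = lo + (hi - lo) / 2;
    Node* root = new Node{sorted[mid], nullptr, nullptr};
    root->left = buildBalanced(sorted, lo, mid);
    root->right = buildBalanced(sorted, mid + 1, hi);
    return root;
}

// Recursive height; fine for a balanced tree because the depth is small.
int height(const Node* node) {
    if (node == nullptr) return 0;
    const int l = height(node->left);
    const int r = height(node->right);
    return 1 + (l > r ? l : r);
}

void destroy(Node* node) {
    if (node == nullptr) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}

// Builds a balanced tree over 0..n-1 and reports its height.
int balancedHeightFor(int n) {
    std::vector<int> sorted;
    for (int i = 0; i < n; ++i) sorted.push_back(i);
    Node* root = buildBalanced(sorted, 0, sorted.size());
    const int h = height(root);
    destroy(root);
    return h;
}
```

For 100,000 items this produces a tree of height 17, compared with a height of 100,000 when the same sorted items are inserted one by one into a plain BST.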
The other (better) option is to balance during inserts. The most intuitive way to do this is to keep track of how many nodes are beneath the current node. If the right child has more than twice as many "child" nodes as the left child, "rotate" left. And vice-versa. There are instructions on how to do tree rotations all over the internet. This makes inserts slightly slower, but then you don't have the occasional massive stalls that the first option creates. On the other hand, you have to constantly update all of the "children" counts as you do the rotations, which isn't trivial.
In order to count the leaves without recursion, use the concept of an iterator like the STL uses for the RB-tree underlying std::set and std::map ... Create a begin() and end() function for your tree that identifies the ordered first and last node (in this case the left-most node and then the right-most node). Then create a function called
BinarySearchTreeNode* increment(const BinarySearchTreeNode* current_node)
that for a given current_node, will return a pointer to the next node in the tree. Keep in mind for this implementation to work, you will need an extra parent pointer in your node type to aid in the iteration process.
Your algorithm for increment() would look something like the following:
1. Check to see if there is a right-child to the current node.
2. If there is a right-child, use a while-loop to find the left-most node of that right subtree. This will be the "next" node. Otherwise go to step #3.
3. If there is no right-child on the current node, then check to see if the current node is the left-child of its parent node.
4. If step #3 is true, then the "next" node is the parent node, so you can stop at this point, otherwise go the next step.
5. If the step #3 was false, then the current node is the right-child of the parent. Thus you will need to keep moving up to the next parent node using a while loop until you come across a node that is a left-child of its parent node. The parent of this left-child node will then be the "next" node, and you can stop.
6. Finally, if step #5 returns you to the root, then the current node is the last node in the tree, and the iterator has reached the end of the tree.
Finally you'll need a bool leaf(const BinarySearchTreeNode* current_node) function that will test whether a given node is a leaf node. Thus your counter function can simply iterate through the tree and find all the leaf nodes, returning a final count once it's done.
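The steps above can be sketched as follows. This is an illustration under assumptions, not the asker's class: Node is a hypothetical non-template node with the extra Parent pointer, nullptr plays the role of end(), and insert exists only to build test trees (it never frees them).

```cpp
#include <cassert>

// Hypothetical node type: the question's node plus the Parent pointer that
// this approach requires.
struct Node {
    int Data;
    Node* LeftChild;
    Node* RightChild;
    Node* Parent;
};

// begin(): the left-most node is the first one in sorted order.
const Node* leftmost(const Node* node) {
    while (node != nullptr && node->LeftChild != nullptr) node = node->LeftChild;
    return node;
}

// In-order successor of `current`, or nullptr if `current` is the last node.
const Node* increment(const Node* current) {
    assert(current != nullptr);
    if (current->RightChild != nullptr)       // steps 1-2
        return leftmost(current->RightChild);
    const Node* parent = current->Parent;     // steps 3-6
    while (parent != nullptr && current == parent->RightChild) {
        current = parent;                     // still a right-child: keep climbing
        parent = parent->Parent;
    }
    return parent;                            // nullptr once we climb past the root
}

bool leaf(const Node* node) {
    return node->LeftChild == nullptr && node->RightChild == nullptr;
}

// Counts leaves with a loop only; no recursion and no explicit stack.
int countLeaves(const Node* root) {
    int count = 0;
    for (const Node* n = leftmost(root); n != nullptr; n = increment(n))
        if (leaf(n)) ++count;
    return count;
}

// Iterative insert that also maintains the Parent pointers.
Node* insert(Node* root, int data) {
    Node* fresh = new Node{data, nullptr, nullptr, nullptr};
    if (root == nullptr) return fresh;
    Node* at = root;
    for (;;) {
        Node*& next = (data < at->Data) ? at->LeftChild : at->RightChild;
        if (next == nullptr) {
            next = fresh;
            fresh->Parent = at;
            return root;
        }
        at = next;
    }
}
```

Since every step is a loop, the same code works on a tree built from sorted input, where the depth equals the number of items.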
If you want to measure the maximum depth of an unbalanced tree without recursion, you will, in your tree's insert() function, need to keep track of the depth that a node was inserted at. This can simply be a variable in your node type that is set when the node is inserted in the tree. You can then iterate through the tree and find the maximum depth of a leaf node.
BTW, the complexity of this method is unfortunately going to be O(N) ... nowhere near as nice as O(log N).
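Here is a sketch of recording the depth at insertion time. It goes one step further than the text and keeps a running maximum, plus a leaf counter that follows the rule from the earlier answer, so neither value needs a later pass over the tree. CountingTree and fillSorted are hypothetical names, the tree holds int items only, and duplicates go to the right.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

class CountingTree {
public:
    CountingTree() = default;
    CountingTree(const CountingTree&) = delete;  // nodes hold raw pointers
    CountingTree& operator=(const CountingTree&) = delete;

    // Iterative insert that updates both counters on the way.
    void insert(int item) {
        if (root_ == nullptr) {
            root_ = makeNode(item, 1);
            height_ = 1;
            leaves_ = 1;
            return;
        }
        Node* parent = root_;
        for (;;) {
            Node*& child = (item < parent->data) ? parent->left : parent->right;
            if (child == nullptr) {
                const bool parentWasLeaf =
                    parent->left == nullptr && parent->right == nullptr;
                child = makeNode(item, parent->depth + 1);
                // A former leaf is replaced by the new leaf; otherwise the
                // tree gains one leaf.
                if (!parentWasLeaf) ++leaves_;
                height_ = std::max(height_, child->depth);
                return;
            }
            parent = child;
        }
    }

    int height() const { return height_; }
    int leaves() const { return leaves_; }

private:
    struct Node {
        int data;
        int depth;  // level at which the node was inserted (root = 1)
        Node* left;
        Node* right;
    };

    Node* makeNode(int item, int depth) {
        nodes_.push_back(Node{item, depth, nullptr, nullptr});
        return &nodes_.back();
    }

    std::deque<Node> nodes_;  // owns the nodes; push_back keeps addresses valid
    Node* root_ = nullptr;
    int height_ = 0;
    int leaves_ = 0;
};

// Inserts 0..n-1 in sorted order, the worst case for an unbalanced tree.
void fillSorted(CountingTree& tree, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) tree.insert(i);
}
```

With sorted input the tree degenerates into a chain, so height() equals the number of items and leaves() stays at 1, and both are read in constant time.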