I use the following method to traverse* a binary tree of 300 000 levels:
Node* find(int v){
if(value==v)
return this;
else if(right && value<v)
return right->find(v);
else if(left && value>v)
return left->find(v);
}
However I get a segmentation fault due to stack overflow.
Any ideas on how to traverse the deep tree without the overhead of recursive function calls?
* By "traverse" I mean "search for a node with given value", not full tree traversal.
Yes! For a 300 000 level tree avoid recursion. Traverse your tree and find the value iteratively using a loop.
Binary Search Tree representation
            25                  // Level 1
        20      36              // Level 2
     10   22  30   40           // Level 3
  .. .. .. .. .. .. ..
.. .. .. .. .. .. .. ..         // Level n
Just to clarify the problem further: your tree has a depth of n = 300,000 levels. A search follows a single path from the root downward, so in the worst case it visits one node per level, which is O(n) in the depth of the tree. For scale, a full tree of that depth could hold about:
2^300,000 nodes = 9.9701e+90308 nodes (approximately).
The search never visits anywhere near that many nodes, though. The call stack overflows because the recursive version needs one stack frame per level, and 300,000 nested frames can exceed the default stack size, which is what the segmentation fault shows.
Solution (iterative way):
I'm assuming your Node class/struct declaration is a classic integer BST node like the one below. If so, you can adapt the following and it will work:
struct Node {
int data;
Node* right;
Node* left;
};
Node* find(int v) {
Node* temp = root; // temp Node* value copy to not mess up tree structure by changing the root
while (temp != nullptr) {
if (temp->data == v) {
return temp;
}
if (v > temp->data) {
temp = temp->right;
}
else {
temp = temp->left;
}
}
return nullptr;
}
This iterative approach avoids recursion, so the search uses a constant amount of call stack no matter how deep the tree is.
A simple loop where you have a variable of type Node* which you set to the next node, then loop again ...
Don't forget the case that the value you are searching for does not exist!
You could implement the recursion without the call stack by using a user-defined stack or something similar; the existing stack template would do. The approach is a while loop that iterates until the stack is empty. As the existing implementation uses depth-first search, elimination of the recursive calls can be found here.
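A minimal sketch of that idea, assuming the "existing stack template" means std::stack. The TreeNode type and the name findWithStack are made up for illustration and are not from the question:

```cpp
#include <cassert>
#include <stack>

// Hypothetical node type; the field names are assumptions.
struct TreeNode {
    int value;
    TreeNode* left;
    TreeNode* right;
};

// Depth-first search driven by an explicit std::stack instead of the call
// stack. It works on any binary tree, ordered or not.
TreeNode* findWithStack(TreeNode* root, int v) {
    std::stack<TreeNode*> pending;
    if (root) pending.push(root);
    while (!pending.empty()) {
        TreeNode* cur = pending.top();
        pending.pop();
        if (cur->value == v) return cur;
        // Push right first so that the left subtree is explored first.
        if (cur->right) pending.push(cur->right);
        if (cur->left) pending.push(cur->left);
    }
    return nullptr;  // value not present
}
```

The stack lives on the heap, so its size is limited by available memory rather than by the thread's stack size. For a plain BST lookup the simple loop is still the better choice, since it needs no extra storage at all.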
When the tree that you have is a Binary Search Tree, and all you want to do is search for a node in it that has a specific value, then things are simple: no recursion is necessary, you can do it using a simple loop as others have pointed out.
In the more general case of having a tree which is not necessarily a Binary Search Tree, and wanting to perform a full traversal of it, the simplest way is using recursion, but as you already understand, if the tree is very deep, then recursion will not work.
So, in order to avoid recursion, you have to implement a stack on the C++ heap. You need to declare a new StackElement class that will contain one member for each local variable that your original recursive function had, and one member for each parameter that your original recursive function accepted. (You might be able to get away with fewer member variables, you can worry about that after you have gotten your code to work.)
You can store instances of StackElement in a stack collection, or you can simply have each one of them contain a pointer to its parent, thus fully implementing the stack by yourself.
So, instead of your function recursively calling itself, it will simply consist of a loop. Your function enters the loop with the current StackElement being initialized with information about the root node of your tree. Its parent pointer will be null, which is another way of saying that the stack will be empty.
In every place where the recursive version of your function was calling itself, your new function will be allocating a new instance of StackElement, initializing it, and repeating the loop using this new instance as the current element.
In every place where the recursive version of your function was returning, your new function will be releasing the current StackElement, popping the one that was sitting on the top of the stack, making it the new current element, and repeating the loop.
When you find the node you were looking for, you simply break from the loop.
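The steps above could look roughly like this. It is only a sketch: GNode, the stage field and findAnywhere are names invented here, and each StackElement links to its parent as described rather than living in a stack collection.

```cpp
#include <cassert>

struct GNode {               // hypothetical node type
    int value;
    GNode* left;
    GNode* right;
};

// One heap-allocated "frame" per pending call of the recursive version.
struct StackElement {
    GNode* node;             // the parameter of the recursive call
    int stage;               // local state: 0 = fresh, 1 = left done, 2 = right done
    StackElement* parent;    // the element below this one on the stack
};

GNode* findAnywhere(GNode* root, int v) {
    if (!root) return nullptr;
    GNode* found = nullptr;
    StackElement* cur = new StackElement{root, 0, nullptr};
    while (cur) {
        GNode* child = nullptr;
        if (cur->stage == 0) {
            if (cur->node->value == v) { found = cur->node; break; }
            child = cur->node->left;
        } else if (cur->stage == 1) {
            child = cur->node->right;
        }
        if (cur->stage < 2) {
            ++cur->stage;
            // Where the recursive version called itself: push a new element.
            if (child) cur = new StackElement{child, 0, cur};
        } else {
            // Where the recursive version returned: pop and release.
            StackElement* done = cur;
            cur = cur->parent;
            delete done;
        }
    }
    while (cur) {            // release whatever is left after an early break
        StackElement* done = cur;
        cur = cur->parent;
        delete done;
    }
    return found;
}
```

The stage member is the "local variable" that remembers where the recursive function would resume after a call returns.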
Alternatively, if the node of your existing tree supports a) a link to its "parent" node and b) user data (where you can store a "visited" flag) then you don't need to implement your own stack, you can just traverse the tree in-place: in each iteration of your loop you first check if the current node is the node you were looking for; if not, then you enumerate through children until you find one which has not been visited yet, and then you visit it; when you reach a leaf, or a node whose children have all been visited, then you back-track by following the link to the parent. Also, if you have the freedom to destroy the tree as you are traversing it, then you do not even need the concept of "user data": once you are done with a child node, you free it and make it null.
Well, it can be made tail recursive at the cost of a single additional local variable and a few comparisons:
Node* find(int v){
if(value==v)
return this;
else if(!right && value<v)
return NULL;
else if(!left && value>v)
return NULL;
else {
Node *tmp = NULL;
if(value<v)
tmp = right;
else if(value>v)
tmp = left;
return tmp->find(v);
}
}
Walking through a binary tree is a recursive process, where you keep walking until you find that the node you're currently at points nowhere.
What you need is an appropriate base condition, something like:
if (treeNode == NULL)
return NULL;
In general, traversing a tree is accomplished this way (in C):
void traverse(treeNode *pTree){
if (pTree==0)
return;
printf("%d\n",pTree->nodeData);
traverse(pTree->leftChild);
traverse(pTree->rightChild);
}
Related
I am reading about list traversal in the Algorithms book by Robert Sedgewick. The function definitions are shown below. The book mentions that traverse and remove can have iterative counterparts, but traverseR cannot. My question is: why can't traverseR have an iterative counterpart? Is it that when the recursive call is not at the end of the function, unlike in traverse, there is no iterative version? Is my understanding right?
Thanks for your time and help.
void traverse(link h, void visit(link))
{
if (h == 0) return;
visit(h);
traverse(h->next, visit);
}
void traverseR(link h, void visit(link))
{
if (h == 0) return;
traverseR(h->next, visit);
visit(h);
}
void remove(link& x, Item v)
{
while (x != 0 && x->item == v)
{ link t = x; x = x->next; delete t; }
if (x != 0) remove(x->next, v);
}
traverseR uses the call stack to store pointers to all the nodes of the list, so that they can be accessed in reverse order as the call stack unwinds.
In order to do this without a call stack (i.e. non-recursively), you'll need some other stack-like data structure to store these pointers in.
The other functions simply work on the current node and move on, with no need to store anything for use after the recursive function call returns. This means that the tail recursion can be replaced with a loop (either by modifying the code or, depending on the compiler, letting it determine that that's possible and make the transformation itself).
Assuming that the list is single-linked, it is not possible to visit it iteratively in the backward order because there's no pointer from a node to a previous node.
What the recursive implementation of traverseR essentially does is that it implicitly reverses the list and visits it in the forward order.
You could write an iterative version of traverseR using a stack: in a loop, iterate from one node to the next, pushing each node on the stack. When you get to the end of the list, pop and visit the nodes in a second loop.
But this is basically what the recursive version does.
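As a rough sketch of that two-loop idea: ListNode is a stand-in for the book's node type, the visitor is taken as a template parameter for convenience, and a std::vector plays the role of the stack. None of these names come from the book.

```cpp
#include <cassert>
#include <vector>

struct ListNode {            // hypothetical stand-in for the book's node type
    int item;
    ListNode* next;
};

// Visits the nodes last to first. The vector replaces the call stack.
template <typename Visit>
void traverseReverse(ListNode* h, Visit visit) {
    std::vector<ListNode*> pending;
    for (ListNode* p = h; p != nullptr; p = p->next)
        pending.push_back(p);               // first loop: remember every node
    for (auto it = pending.rbegin(); it != pending.rend(); ++it)
        visit(*it);                         // second loop: visit in reverse
}
```

It still needs O(n) extra memory, just like the recursion, but that memory comes from the heap instead of the call stack.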
It is possible to traverse a singly linked list in reverse order with only O(1) extra space -- i.e., without a stack of previously visited nodes. It is, however, a little tricky, and not at all thread safe.
The trick to this is to traverse the list from beginning to end, reversing it in place as you do so, then traverse it back to the beginning, reversing it again on the way back through.
Since it is a linked list, reversing it in place is fairly straightforward: as you get to a node, save the current value of its next pointer, and overwrite that with the address of the previous node in the list (see the code for more detail):
void traverseR(node *list, void (*visit)(node *)) {
    node *prev = nullptr;
    node *curr = list;
    // Traverse forwards, reversing the list in place as we go.
    while (curr) {
        node *next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
    // prev is now the head of the fully reversed list (the original tail).
    curr = prev;
    prev = nullptr;
    // Traverse the reversed list, visiting each node and reversing again,
    // which restores the original order.
    while (curr) {
        visit(curr);
        node *next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
}
Like almost anything dealing with linked lists, I feel obliged to add that (at least IMO) they should almost always be treated as a purely intellectual exercise. Using them in real code is usually a net loss. You typically end up with code that's slow, fragile, and hard to understand, as well as typically wasting quite a bit of memory (unless the data you store in each node is pretty big, the pointer can often use as much space as the data itself).
My question why traverseR cannot have iterative counter part? Is it that if recursive call is not end of function i.e., like in traverse then we cannot have iterative, Is my understanding right?
Correct. The functions traverse and remove end with a call to themselves. They are tail recursive functions. The call in traverseR to itself is not at the end of the function; traverseR is not tail recursive.
Recursion in general has an expense of creating and later destroying stack frames. This expense can be completely avoided with tail recursive functions by changing the recursion into iteration. Many optimizing compilers recognize tail recursive functions and convert the recursion to iteration when optimization is enabled, although C and C++ do not guarantee it.
It is possible to write an iterative version of traverseR, depending on what you mean by iterative. If you are limited to a single traversal through the list, it is not possible. But if you can sacrifice a lot of processing time, it can be done: the version below rescans the list from the head for every node, so it takes O(n^2) time but only O(1) extra memory, the classic speed vs. memory trade-off.
void traverseRI(link h, void visit(link))
{
if (h == 0) return;
link last = 0;
while (last != h)
{
link test = h;
while (test->next != last)
{
test = test->next;
}
visit(test);
last = test;
}
}
I have implemented a link-based BST (binary search tree) in C++ for one of my assignments. I have written the whole class and everything works well, but the assignment asks me to plot the run-times for:
a. A sorted list of 50000, 75000, and 100000 items
b. A random list of 50000, 75000, and 100000 items
That's fine, I can insert the numbers, but it also asks me to call the FindHeight() and CountLeaves() methods on the tree. My problem is that I've implemented the two functions using recursion. Since I have such a big list of numbers, I'm getting a stack overflow exception.
Here's my class definition:
template <class TItem>
class BinarySearchTree
{
public:
struct BinarySearchTreeNode
{
public:
TItem Data;
BinarySearchTreeNode* LeftChild;
BinarySearchTreeNode* RightChild;
};
BinarySearchTreeNode* RootNode;
BinarySearchTree();
~BinarySearchTree();
void InsertItem(TItem);
void PrintTree();
void PrintTree(BinarySearchTreeNode*);
void DeleteTree();
void DeleteTree(BinarySearchTreeNode*&);
int CountLeaves();
int CountLeaves(BinarySearchTreeNode*);
int FindHeight();
int FindHeight(BinarySearchTreeNode*);
int SingleParents();
int SingleParents(BinarySearchTreeNode*);
TItem FindMin();
TItem FindMin(BinarySearchTreeNode*);
TItem FindMax();
TItem FindMax(BinarySearchTreeNode*);
};
FindHeight() Implementation
template <class TItem>
int BinarySearchTree<TItem>::FindHeight()
{
return FindHeight(RootNode);
}
template <class TItem>
int BinarySearchTree<TItem>::FindHeight(BinarySearchTreeNode* Node)
{
if(Node == NULL)
return 0;
return 1 + max(FindHeight(Node->LeftChild), FindHeight(Node->RightChild));
}
CountLeaves() implementation
template <class TItem>
int BinarySearchTree<TItem>::CountLeaves()
{
return CountLeaves(RootNode);
}
template <class TItem>
int BinarySearchTree<TItem>::CountLeaves(BinarySearchTreeNode* Node)
{
if(Node == NULL)
return 0;
else if(Node->LeftChild == NULL && Node->RightChild == NULL)
return 1;
else
return CountLeaves(Node->LeftChild) + CountLeaves(Node->RightChild);
}
I tried to think of how I can implement the two methods without recursion but I'm completely stumped. Anyone have any ideas?
Recursion on a tree with 100,000 nodes should not be a problem if it is balanced. The depth would only be maybe 17, which would not use very much stack in the implementations shown. (log2(100,000) = 16.61). So it seems that maybe the code that is building the tree is not balancing it correctly.
I found this page very enlightening because it talks about the mechanics of converting a function that uses recursion to one that uses iteration.
It has examples showing code as well.
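For the two methods in the question, one possible conversion is sketched below. It assumes a simplified, non-template BstNode in place of BinarySearchTreeNode, and the function names are invented here. Height is counted with a breadth-first sweep and leaves with an explicit stack, so neither depends on the depth of the call stack.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <stack>

struct BstNode {             // simplified stand-in for BinarySearchTreeNode
    int Data;
    BstNode* LeftChild;
    BstNode* RightChild;
};

// Height = number of levels, counted by a breadth-first sweep.
// An empty tree has height 0, matching the recursive FindHeight.
int FindHeightIterative(BstNode* root) {
    if (!root) return 0;
    std::queue<BstNode*> level;
    level.push(root);
    int height = 0;
    while (!level.empty()) {
        ++height;
        // Drain exactly the nodes of the current level, queueing the next one.
        for (std::size_t n = level.size(); n > 0; --n) {
            BstNode* cur = level.front();
            level.pop();
            if (cur->LeftChild) level.push(cur->LeftChild);
            if (cur->RightChild) level.push(cur->RightChild);
        }
    }
    return height;
}

// Depth-first walk with an explicit stack; counts nodes with no children.
int CountLeavesIterative(BstNode* root) {
    int leaves = 0;
    std::stack<BstNode*> pending;
    if (root) pending.push(root);
    while (!pending.empty()) {
        BstNode* cur = pending.top();
        pending.pop();
        if (!cur->LeftChild && !cur->RightChild) ++leaves;
        if (cur->LeftChild) pending.push(cur->LeftChild);
        if (cur->RightChild) pending.push(cur->RightChild);
    }
    return leaves;
}
```

On a degenerate tree built from sorted input, the queue and the stack each hold only a node or two at a time, so the sorted case is the cheap one for these versions.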
Maybe you need to calculate this while doing the insert. Store the heights of nodes, i.e. add an integer field like height to the Node object. Also keep counters height and leaves for the tree. When you insert a node, if its parent is (was) a leaf, the leaf count doesn't change; if not, increase the leaf count by 1. The height of the new node is its parent's height + 1, so if that is greater than the current height of the tree, update it. It's homework, so I won't help with the actual code.
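For readers who are not doing the assignment, the bookkeeping might be sketched as follows. CountedNode and CountedTree are hypothetical names, the stored field is called depth (1 for the root), and deletion and cleanup are deliberately left out.

```cpp
#include <cassert>

struct CountedNode {         // hypothetical node that remembers its depth
    int data;
    int depth;               // 1 for the root
    CountedNode* left;
    CountedNode* right;
};

struct CountedTree {
    CountedNode* root = nullptr;
    int height = 0;          // deepest level inserted so far
    int leaves = 0;          // current number of leaf nodes

    // Iterative insert that keeps both counters up to date.
    // Nodes are never freed in this sketch.
    void insert(int v) {
        CountedNode* parent = nullptr;
        CountedNode** slot = &root;
        int depth = 1;
        while (*slot) {
            parent = *slot;
            slot = (v < parent->data) ? &parent->left : &parent->right;
            ++depth;
        }
        // A parent that was a leaf stops being one, so the count is unchanged.
        // Otherwise (no parent, or the parent already had a child) a leaf is gained.
        bool parentWasLeaf = parent && !parent->left && !parent->right;
        *slot = new CountedNode{v, depth, nullptr, nullptr};
        if (!parentWasLeaf) ++leaves;
        if (depth > height) height = depth;
    }
};
```

With this, FindHeight() and CountLeaves() would only need to return the stored counters, as long as nodes are never removed.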
Balance your tree occasionally. If your tree is getting stackoverflow on FindHeight(), that means your tree is way unbalanced. If the tree is balanced it should only have a depth of about 20 nodes for 100000 elements.
The easiest (but fairly slow) way of re-balancing unbalanced binary tree is to allocate an array of TItem big enough to hold all of the data in the tree, insert all of your data into it in sorted order, and delete all of the nodes. Then rebuild the tree from the array recursively. The root is the node in the middle. root->left is the middle of the left half, root->right is the middle of the right half. Repeat recursively. This is the easiest way to rebalance, but it is slowish and takes lots of memory temporarily. On the other hand, you only have to do this when you detect that the tree is very unbalanced, (depth on insert is more than 100).
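The rebuild step could be sketched like this, assuming the data has already been copied into a sorted std::vector. BalNode, buildBalanced and balancedHeight are illustrative names, and int stands in for TItem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BalNode {             // hypothetical node type
    int data;
    BalNode* left;
    BalNode* right;
};

// Builds a balanced subtree from sorted[lo, hi). The recursion depth is only
// about log2(n), so the call stack is not a concern here.
BalNode* buildBalanced(const std::vector<int>& sorted, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    std::size_t mid = lo + (hi - lo) / 2;    // the middle element becomes the root
    BalNode* root = new BalNode{sorted[mid], nullptr, nullptr};
    root->left = buildBalanced(sorted, lo, mid);        // middle of the left half
    root->right = buildBalanced(sorted, mid + 1, hi);   // middle of the right half
    return root;
}

// Recursive height is safe on the rebuilt tree because it is balanced.
int balancedHeight(const BalNode* n) {
    if (!n) return 0;
    int l = balancedHeight(n->left);
    int r = balancedHeight(n->right);
    return 1 + (l > r ? l : r);
}
```

Filling the array in sorted order from a very deep tree has to be done iteratively as well, otherwise the in-order walk would overflow the stack before the rebuild even starts.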
The other (better) option is to balance during inserts. The most intuitive way to do this is to keep track of how many nodes are beneath the current node. If the right child has more than twice as many "child" nodes as the left child, "rotate" left, and vice versa. There are instructions on how to do tree rotations all over the internet. This makes inserts slightly slower, but then you don't have the occasional massive stalls that the first option creates. On the other hand, you have to constantly update all of the "children" counts as you do the rotations, which isn't trivial.
In order to count the leaves without recursion, use the concept of an iterator like the STL uses for the RB-tree underlying std::set and std::map ... Create begin() and end() functions for your tree that identify the ordered first and last node (in this case the left-most node and then the right-most node). Then create a function called
BinarySearchTreeNode* increment(const BinarySearchTreeNode* current_node)
that for a given current_node, will return a pointer to the next node in the tree. Keep in mind for this implementation to work, you will need an extra parent pointer in your node type to aid in the iteration process.
Your algorithm for increment() would look something like the following:
1. Check to see if there is a right-child to the current node.
2. If there is a right-child, use a while-loop to find the left-most node of that right subtree. This will be the "next" node. Otherwise go to step #3.
3. If there is no right-child on the current node, then check to see if the current node is the left-child of its parent node.
4. If step #3 is true, then the "next" node is the parent node, so you can stop at this point, otherwise go to the next step.
5. If step #3 was false, then the current node is the right-child of the parent. Thus you will need to keep moving up to the next parent node using a while loop until you come across a node that is a left-child of its parent node. The parent of this left-child node will then be the "next" node, and you can stop.
6. Finally, if step #5 returns you to the root, then the current node is the last node in the tree, and the iterator has reached the end of the tree.
Finally, you'll need a bool leaf(const BinarySearchTreeNode* current_node) function that tests whether a given node is a leaf node. Your counter function can then simply iterate through the tree and find all the leaf nodes, returning a final count once it's done.
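Put together, the increment() steps and the leaf test might look like the sketch below. PNode is a hypothetical node type carrying the extra parent pointer, int stands in for TItem, and the pointers are non-const here for brevity.

```cpp
#include <cassert>

struct PNode {               // hypothetical node with the extra parent pointer
    int data;
    PNode* left;
    PNode* right;
    PNode* parent;
};

// Left-most node of a subtree: the in-order "begin".
PNode* leftmost(PNode* n) {
    while (n && n->left) n = n->left;
    return n;
}

// In-order successor; returns nullptr once the last node has been passed.
PNode* increment(PNode* cur) {
    if (cur->right) return leftmost(cur->right);
    // Climb while we are a right child. The first ancestor reached from its
    // left side is the next node; falling off the root means we are done.
    PNode* up = cur->parent;
    while (up && cur == up->right) {
        cur = up;
        up = up->parent;
    }
    return up;
}

bool leaf(const PNode* n) { return !n->left && !n->right; }

// Walks every node in order without recursion and counts the leaves.
int countLeaves(PNode* root) {
    int count = 0;
    for (PNode* n = leftmost(root); n; n = increment(n))
        if (leaf(n)) ++count;
    return count;
}
```

Here nullptr plays the role of end(), which avoids having to locate the right-most node up front.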
If you want to measure the maximum depth of an unbalanced tree without recursion, you will, in your tree's insert() function, need to keep track of the depth that a node was inserted at. This can simply be a variable in your node type that is set when the node is inserted in the tree. You can then iterate through the tree and find the maximum depth of a leaf node.
BTW, the complexity of this method is unfortunately going to be O(N) ... nowhere near as nice as O(log N).
I have two questions,
1) For any recursive algorithm, there exists an iterative algorithm, is that right? I think so, because you just have to use the stack explicitly. And it is confirmed in this question:
Way to go from recursion to iteration
2) Probably the same question as the one above: I really don't think the iterative solution is obvious or easy to write, even given the recursive algorithm. For example, for a postorder (LRN) or inorder (LNR) BST traversal, how could you write it iteratively? In these two cases, it's not easy to find the first object to push onto the stack. That's where I got stuck.
Any suggestions? Actually, my purpose is the same as the above question, try to find a general pattern to change recursive algorithm to iterative ones.
I feel you haven't asked the question properly. I will try to answer how one can think about implementing the iterative version of in-order traversal, given that one knows the recursive version. (I happen to have given this some thought and implemented it very recently, and I feel I will help myself too by writing it down.)
Each function call in the recursive version seeks to visit the node associated with that call. The function is coded such that the activation frame corresponding to a node is saved on the system stack (the stack area of the process) before it can do its main job, i.e. visit the node. This is because we want to visit the left subtree of the node before visiting the node itself.
After the left subtree has been visited, returning to the frame of our saved node makes the language environment pop that frame from the internal stack, and the visit to our node can now proceed.
We have to mimic this pushing and popping with an explicit stack.
template<class T>
void inorder(node<T> *root)
{
// The stack stores the parent nodes who have to be traversed after their
// left sub-tree has been traversed
stack<node<T>*> s;
// points to the currently processing node
node<T>* cur = root;
// Stack-not-empty implies that trees represented by nodes in the stack
// have their right sub-tree un-traversed
// cur-not-null implies that the tree represented by 'cur' has its root
// node and left sub-tree un-traversed
while (cur != NULL || !s.empty())
{
if (cur != NULL)
{
for (; cur->l != NULL; cur = cur->l) // traverse to the leftmost child because every other left child will have a left subtree
s.push(cur);
visit(cur); // visit him. At this point the left subtree and the parent is visited
cur = cur->r; // set course to visit the right sub-tree
}
else
{// the right sub-tree is empty. cur was set in the last iteration to the right subtree
node<T> *parent = s.top();
s.pop();
visit(parent);
cur = parent->r;
}
}
}
The best way to understand this is to draw the functioning of the internal stack on paper on each call and return of the recursive version.
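The post-order (LRN) case asked about in the question can be handled in the same spirit. The following is one common way to do it, not necessarily the only one: a single stack plus a pointer to the most recently visited node. TNode is a non-template stand-in for node<T>, and the visitor is passed in as a parameter.

```cpp
#include <cassert>
#include <stack>
#include <vector>

struct TNode {               // simplified stand-in for node<T>
    int data;
    TNode* l;
    TNode* r;
};

// Post-order: a node is visited only after both of its subtrees are finished.
// `last` remembers the most recently visited node so that, on returning to a
// parent, we can tell whether its right subtree has already been handled.
template <typename Visit>
void postorder(TNode* root, Visit visit) {
    std::stack<TNode*> s;
    TNode* cur = root;
    TNode* last = nullptr;
    while (cur != nullptr || !s.empty()) {
        if (cur != nullptr) {
            s.push(cur);          // defer the node and go left first
            cur = cur->l;
        } else {
            TNode* top = s.top();
            if (top->r != nullptr && top->r != last) {
                cur = top->r;     // right subtree still pending
            } else {
                visit(top);       // both subtrees are finished
                last = top;
                s.pop();
            }
        }
    }
}
```

The extra pointer stands in for the information the recursive version keeps implicitly in its return address, namely which of the two recursive calls has just returned.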
Here's the node definition:
struct node{
int data;
struct node * left;
struct node * right;
};
What I am trying to do is list all the nodes that point to an ancestor node. After posting the wrong solution and taking advice from the answers, my new solution is:
Recursively go through the binary tree. Add the current node to an array of nodes and then check if the children of the current node point to any of the previous ancestor nodes.
The default case is the node being NULL. If that happens the function returns.
How it is supposed to work:
Adds the node to the array
Checks if the left child is NULL.
If not, it compares the child to each of the previous nodes.
If it finds a fault, it reports it.
If not, it calls the function with the child as the argument.
Repeat until finished.
(Does same for rhs of binary tree)
Questions:
Is an array the best thing to store the nodes?
Does this work? for (i = 0; i < sizeof(arrOfNodes) / sizeof(node); i++)
Because the function is recursive, the array and the array index can't be initialized inside the function (or can they be?) so should they be global?
Would it be better to have two arrays? (one for the LHS and one for the RHS)
The code:
void findFault(node * root){
if (root == NULL){
return;
}
arrOfNodes[index++] == root; // array of nodes
if (root->left != NULL){
for (i = 0; i < sizeof(arrOfNodes) / sizeof(node); i++){
if (ar[i] == root->left){
printf("%d", root->left);
return;
}
}
findFault(root->left);
} else return;
if (root->right != NULL){
for (i = 0; i < sizeof(ar) / sizeof(node); i++){
if (ar[i] == root->right){
printf("%d", root->right);
return;
}
}
findFault(root->right);
} else return;
}
I don't know about recursion, but this:
if (&root->left->left == &root){
is wrong in more ways than I can possibly describe, but anyway here are three issues:
Why are you taking the address of root?
Why don't you test that the first left pointer is null?
You could simply use a std::map, but learning how to implement a binary tree is a good idea too.
This is an incorrect solution to the problem. Neil Butterworth has already commented on your code; I'll comment on the algorithm.
Your algorithm only checks a very specific case - whether a grandchild node points to its grandparent. What you should do is collect the parents along the way to a node and see that a node's child isn't one of its parents.
There are many ways to do this. One is to add a counter to your node struct and set all nodes' counters to zero before you begin traversing the tree. Whenever you reach a node you make sure the counter is zero and then increase it by one. This means that if you see a child whose counter isn't zero, you've already visited it and therefore the tree isn't valid.
Another way to accomplish this kind of check is to do a breadth-first sweep of the nodes, all the while keeping a vector of nodes you have visited already (which you can keep sorted by address). Each time you visit a node, assert it is not in the vector, then add it to the appropriate place to keep the visited list sorted.
The advantage to this kind of check is it can be performed without modifying the tree or node struct itself, though there is a bit of a performance penalty.
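A sketch of the breadth-first check, with one substitution stated plainly: a std::set of pointers takes the place of the manually sorted vector. The struct is named FNode here and the function name is also invented. It returns the first node found with a child pointer to an already-seen node.

```cpp
#include <cassert>
#include <queue>
#include <set>

struct FNode {               // hypothetical copy of the question's node struct
    int data;
    FNode* left;
    FNode* right;
};

// Breadth-first sweep. Returns the first node that has a child pointer to a
// node already seen, or nullptr if the structure is a proper tree.
FNode* findFaultBfs(FNode* root) {
    std::set<FNode*> seen;        // kept ordered by address by the container
    std::queue<FNode*> pending;
    if (root) {
        seen.insert(root);
        pending.push(root);
    }
    while (!pending.empty()) {
        FNode* cur = pending.front();
        pending.pop();
        FNode* children[2] = {cur->left, cur->right};
        for (FNode* child : children) {
            if (!child) continue;
            // insert() reports through .second whether the pointer was new.
            if (!seen.insert(child).second) return cur;
            pending.push(child);
        }
    }
    return nullptr;
}
```

Note that this flags any node reachable twice, which covers pointers back to an ancestor as well as two parents sharing one child.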
Notes:
An array would be a fine way to store the nodes. If you're avoiding STL (curious: why?) then you'll have to manage your own memory. Doable, but it's a brittle wheel to reinvent.
Your for loop check to get the size of the arrays will not work; if you use malloc/free or new/delete then you'll have to specify the size of the array you want beforehand; you should use that size instead of calculating it every time through the for loop.
The typical pattern for a recursive algorithm is to have an "outer" and "inner" function. The outer function is the one called by external code and does the initial setup, etc. The inner function is only called by the outer function, tends to have a more complicated parameter set (taking data set up by the outer function), and calls itself to perform the actual recursion.
You will need two arrays: one for the list of nodes you have visited, and one for the list of nodes you have yet to visit.
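To illustrate the outer/inner pattern from the notes, here is one possible shape. It is an assumption-laden sketch: it uses std::vector instead of raw arrays, tracks only the current chain of ancestors rather than separate visited and to-visit lists, and the names AncNode, findFaultInner and findFaults are invented.

```cpp
#include <cassert>
#include <algorithm>
#include <vector>

struct AncNode {             // hypothetical copy of the question's node struct
    int data;
    AncNode* left;
    AncNode* right;
};

// Inner function: `path` holds the ancestors of `cur`, root first. A node is
// appended to `faults` once for each child pointer that leads back to an
// ancestor or to the node itself.
void findFaultInner(AncNode* cur, std::vector<AncNode*>& path,
                    std::vector<AncNode*>& faults) {
    if (cur == nullptr) return;
    path.push_back(cur);
    AncNode* children[2] = {cur->left, cur->right};
    for (AncNode* child : children) {
        if (child == nullptr) continue;
        if (std::find(path.begin(), path.end(), child) != path.end())
            faults.push_back(cur);              // back edge: do not follow it
        else
            findFaultInner(child, path, faults);
    }
    path.pop_back();
}

// Outer function: sets up the working storage, so no globals are needed.
std::vector<AncNode*> findFaults(AncNode* root) {
    std::vector<AncNode*> path;
    std::vector<AncNode*> faults;
    findFaultInner(root, path, faults);
    return faults;
}
```

Because the vectors are created in the outer function and passed by reference, the question about global arrays and indices goes away, and path.size() replaces the sizeof arithmetic.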
I don't know if the algorithm that generates the binary tree is able to propagate a fault other than node's left/right child.
Anyway, this is a corrected version for your code:
void findFault(node * root){
if (root == NULL){
return;
}
if (root->left == root){
printf("left: %d", root->data);
} else findFault(root->left);
if (root->right == root){
printf("right: %d", root->data);
} else findFault(root->right);
}
int LinkedList::DoStuff()
{
Node *Current = next_;
while ( Current != NULL )
{
Current = Current->next_;
length_++;
}
// At the last iteration we have reached the end/tail/last node
return length_;
}
There are no more nodes beyond the last. How can I traverse from the tail end back to the head?
Unless your linked list is a doubly-linked one, this is difficult to do. Recursion is one way, assuming you don't have lists so big that you'll run out of stack space, something like this (pseudo-code):
DoStuffBackwards (currNode) {
if (currNode != NULL) {
DoStuffBackwards (currNode->next);
// Process currNode here.
}
}
DoStuffBackwards (firstNode);
This works because you keep calling DoStuffBackwards() for the next node until you exhaust the list then, as you roll back up the recursion stack, you process each node.
If you just want to go backwards from the last node to the current node, then Pax's answer (using recursion) is your best bet; also see my version below. If your current node is not the head of your non-circular singly linked list, and you want to go from the current node to the head node, it is impossible.
int LinkedList::DoStuff()
{
return DoStuffBackward(next_, 0);
}
int LinkedList::DoStuffBackward(Node* node, int n)
{
if (!node)
{
return n;
}
int len = DoStuffBackward(node->next_, n + 1);
std::cout << "doing stuff for node " << n << std::endl;
return len;
}
This has the smell of homework, so no code, but here's an overview of a solution that doesn't require recursion:
If you want to run through the list backward, one option is to relink the list to point backwards as you're traversing it to find the end. Then, as you re-traverse the list (which visits the nodes in the reverse order from the original list), you repeat the relinking same as before and the list ends up in its original order.
This is simple in concept, but handling the pointers and links correctly (especially at the start and end of the list) can be a bit tricky.
Recursion can work, as can building an auxiliary data structure, such as an array with one entry for each element of the original list. If you want a solution for a single-threaded list without requiring O(n) extra storage, the best bet is to reverse the list in place as Michael suggests. I wrote an example for this, [but I'll leave it out given the concern about homework]. One caution about reversing the list: if there are any other data structures that hold pointers to the original list, and you might be accessing them during your traversal, they won't work if they need to access the list while it's reversed, and this might lead to data corruption if they try to modify the list.
Update: Ok, here's the (C++) routine to reverse a list in place. It hasn't been tested, though. And I'll note that if this is homework, the poster still needs to figure out how to use this routine correctly to get a complete answer.
Node *ReverseList(Node *head) {
    // Reverse a single-threaded list in place, return new head.
    // An empty list (head == NULL) is returned unchanged.
    Node *prev = NULL;
    Node *cur = head;
    while (cur != NULL) {
        Node *next = cur->next_;
        cur->next_ = prev;
        prev = cur;
        cur = next;
    }
    return prev;
}
Push the list's nodes onto a stack and then pop them off.
Is your linked list class doubly-linked or singly-linked? If there is no previous pointer inside each node, you can't traverse backwards.
I also suggest you post more code and take the time to make your question readable.