what should be the structure of binary search tree node

I am trying to write a C++ program for a binary search tree with the following functionality (this is part of my college assignment):
A) CREATE Binary search tree.
B) Inorder, preorder, postorder traversals. ( non-recursive )
C) Search the Val in tree.
D) Breadth first traversal.
E) Depth first traversal
F) Count leaf nodes, non-leaf nodes.
G) Count no. of levels
My doubts are:
1. Usually a tree node has the following structure:
class node{
private:
node *lChild;
int info;
node *rChild;
};
If I want to perform depth-first or breadth-first traversal, can I change the node structure and add one more pointer to the parent, so that I can easily move back up the hierarchy?
class node{
private:
node *parent; // pointer to parent node
node *lChild;
int info;
node *rChild;
};
Is this considered normal practice or bad form when programming a binary tree? If it is not a good way to program a tree, is there another way, or do I have to use the method given in books: a stack (for depth-first) and a queue (for breadth-first) to store nodes (visited or unvisited accordingly)?
2. This is the first time I am learning data structures, so it would be a great help if someone could explain in simple words the difference between recursive and non-recursive traversal, with a binary tree in mind.

i change the node structure and add one more pointer pointing to the parent [...] is this considered as normal practice or bad form of programming a binary tree ?
It is not a normal practice (but not quite "bad form"). Each node is a collection of data and two pointers. If you add a third pointer to each node, you will have increased the overhead of each node by 50% (two pointers to three pointers per node) which for a large binary tree will be quite a lot.
This is first time i am learning data structures so it will be a great help if someone can explain in simple words that what is the difference between recursive and non-recursive traversal
A recursive implementation is a function that processes a single node, then calls itself for that node's children. This makes use of the application call-stack to process the nodes of the tree.
A non-recursive implementation uses a local stack to push non-processed nodes; then it loops as long as there is data on the stack and processes each entry.
Here's an example for printing to console, that shows difference between recursive and non-recursive ( the example is incomplete, as this is homework :] ):
#include <iostream>
#include <stack>

void recursive_print(node* n) {
std::cout << n->info << "\n";
if(n->lChild)
recursive_print(n->lChild); // recursive call
// n->rChild is processed the same
}
void non_recursive_print(node* n) {
std::stack<node*> stack;
stack.push(n);
while(!stack.empty()) { // behaves (more or less) the same as
// the call-stack in the recursive call
node* x = stack.top();
stack.pop();
std::cout << x->info << "\n";
if(x->lChild)
stack.push(x->lChild); // non-recursive: push to the stack
// x->rChild is processed the same way
}
}
// client code:
node *root; // initialized elsewhere
if(root) {
recursive_print(root);
non_recursive_print(root);
}
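The question also mentions using a queue for breadth-first traversal. As a minimal sketch of that idea, the loop below has the same shape as the non-recursive print above, with a `std::queue` in place of the stack. The `Node` struct and the `breadthFirst` name are assumptions made for illustration (a plain struct with the question's member names made public), not part of the original answer.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical node type: same member names as in the question, but public.
struct Node {
    int info;
    Node* lChild;
    Node* rChild;
    explicit Node(int v) : info(v), lChild(nullptr), rChild(nullptr) {}
};

// Visits nodes level by level and returns their values in visiting order.
std::vector<int> breadthFirst(const Node* root) {
    std::vector<int> out;
    if (!root) return out;

    std::queue<const Node*> pending;   // nodes seen but not yet processed
    pending.push(root);
    while (!pending.empty()) {
        const Node* x = pending.front();
        pending.pop();
        out.push_back(x->info);
        if (x->lChild) pending.push(x->lChild);   // enqueue, do not recurse
        if (x->rChild) pending.push(x->rChild);
    }
    return out;
}
```

Because the queue hands nodes back in the order they were added, a whole level is processed before the next one starts; swapping in a stack turns the same loop into a depth-first walk.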

You don't need a pointer to the parent node. Think about the cases when you would use it. The only way you can reach a node is through its parent, so you have already visited the parent.
Do you know what recursive means?

There's nothing to stop you adding a parent pointer if you want to. However, it's not usually necessary, and slightly increases the size and complexity.
The normal approach for traversing a tree is some kind of recursive function. You first call the function and pass in the root node of the tree. The function then calls itself, passing the child pointers (one at a time). This happens recursively all the way down the tree until there are no child nodes left.
The function does whatever processing you want on its own node after the recursive calls have returned. That means you're basically traversing down the tree with each call (making your call stack progressively deeper), and then doing the processing on the way back up as each function returns.
The function should never try to go back up the tree the same way it came down (i.e. passing in a parent pointer), otherwise you'll end up with an infinite recursion.
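A rough sketch of "processing on the way back up", assuming a plain `Node` struct with the question's member names (the struct and both function names are hypothetical): each function first recurses into the children and only combines the results once those calls have returned.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node type, mirroring the layout in the question.
struct Node {
    int info;
    Node* lChild;
    Node* rChild;
    explicit Node(int v) : info(v), lChild(nullptr), rChild(nullptr) {}
};

// Number of levels below and including n; an empty subtree has zero levels.
int countLevels(const Node* n) {
    if (!n) return 0;
    int left = countLevels(n->lChild);    // go down first...
    int right = countLevels(n->rChild);
    return 1 + std::max(left, right);     // ...then combine on the way back up
}

// Number of nodes that have no children.
int countLeaves(const Node* n) {
    if (!n) return 0;
    if (!n->lChild && !n->rChild) return 1;
    return countLeaves(n->lChild) + countLeaves(n->rChild);
}
```

Neither function needs a parent pointer: the call stack already remembers which node to return to.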

Typically you only need a parent pointer if you need to support iteration.
Imagine that you have found a leaf node and then want to find the next node (lowest key greater than current key), for example:
mytree::iterator it1=mytree_local.find(7);
if (it1 != mytree_local.end())
{
mytree::iterator it2=it1.next(); // it1 is a leaf node and next() needs to go up
}
Since you are starting at the bottom here and not at the top, you need to go up.
But your assignment only requires operations that start at the root node, so you shouldn't need an up pointer. Follow the other answers for approaches that avoid the need to go up.

I would suggest you look into the Visitor pattern - for its flavor, not specifically for its structure (it's very complex).
Essentially, it is a design pattern that disconnects traversal of a tree in such a way that you have only one set of code that does tree traversal, and you use that set of code to execute various functionality on each node. The traversal code is generally not part of the Node class.
Specifically, it means you do not have to write the traversal code more than once. For example, utnapistim's answer will force you to write traversal code for every piece of functionality you need; that example covers printing, but an outputXML() function would require another copy of the traversal code. Eventually, your Node class becomes a huge ungainly beast.
With Visitor, you would have your Tree and Node classes, a separate Traversal class, and numerous functional classes, such as PrintNode, NodeToXML, and possibly DeleteNode, to use with the Traversal class.
As for adding a Parent pointer, that would only be useful if you intended to park on a given node between calls to the Tree - i.e. you were going to do a relative search beginning on a pre-selected arbitrary node. This would probably mean that you had better not do any multi-threaded work with said tree. The Parent pointer will also be difficult to update as a red/black tree can easily insert a new node between the current node and its "parent".
I would suggest a BinaryTree class, with a method that instantiates a single Visitor class, and the visitor class accepts an implementation of a Traversal interface, which would be one of either Breadth, Width or Binary. Basically, when the Visitor is ready to move to the next node, it calls the Traversal interface implementation to get it (the next node).
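To give a flavor of the separation described above, here is a deliberately simplified sketch. It is not the full Visitor pattern: a `std::function` callback stands in for the visitor classes, and `Node`, `traverseInOrder`, `collectValues` and `countNodes` are all hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical node type with the question's member names.
struct Node {
    int info;
    Node* lChild;
    Node* rChild;
    explicit Node(int v) : info(v), lChild(nullptr), rChild(nullptr) {}
};

// The only piece of code that knows how to walk the tree (in-order here).
void traverseInOrder(const Node* n,
                     const std::function<void(const Node&)>& visit) {
    if (!n) return;
    traverseInOrder(n->lChild, visit);
    visit(*n);
    traverseInOrder(n->rChild, visit);
}

// Two independent pieces of functionality that reuse the same traversal.
std::vector<int> collectValues(const Node* root) {
    std::vector<int> out;
    traverseInOrder(root, [&out](const Node& n) { out.push_back(n.info); });
    return out;
}

int countNodes(const Node* root) {
    int count = 0;
    traverseInOrder(root, [&count](const Node&) { ++count; });
    return count;
}
```

Adding another operation (printing, XML output) then means writing one more callback rather than another copy of the traversal.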

Related

Is there a way to access non leaf nodes in a C++ Boost rtree

Sorry in advance, this a very specific question and I cannot provide any piece of code as this is for my job, thus confidential.
I am using Boost R-trees, and an algorithm that I need to implement requires access to the non-leaf nodes of the tree. With the Boost rtree library, I can only access leaf nodes in an easy way. I noticed that there is a function to print all the nodes, including the non-leaf nodes (which means they exist and are computed), with their position, their level in the tree, etc., but I cannot access them the same way as the leaf nodes.
For now, the best solution that I have is to implement a visitor for the tree and overload the operator () to gather the nodes (this is what the print method does to access the nodes).
My question is: does anybody know an easier way to access the non-leaf nodes? This one does not seem efficient, and I'm losing time each time I want to access a non-leaf node. Moreover, I need to replicate the structure of the tree without the points, and I cannot do that if I cannot access the non-leaf nodes.
Thank you in advance !

I don't know exactly what you would like to do, so this will be a general answer.
In order to access the tree nodes for the first time you have to traverse the tree structure. The Boost.Geometry rtree uses the visitor pattern for that. You could do it manually, but internally Boost.Variant is used to represent the nodes, so you'll end up with a variant visitor instead. At this point you have a few options, depending on what you are going to do with the nodes. Are you going to modify the r-tree? Will the rtree be moved in memory? Will the addresses of nodes change? How many nodes are you going to access? Do you want to store some kind of reference to a node and traverse the tree structure from that point? Do you want to traverse the structure downward or upward?
One option, as you noticed, is to traverse the tree structure each time. This is a good approach if the tree structure can change. The obvious drawback is that you have to check all child nodes at each node using some condition (whatever you do in order to pick the node of interest).
If the tree structure does not change but the tree is copied to a different place in memory, you can represent the node as a path from the root to the node of interest, stored as a list of child-node indexes. E.g. a list {1, 2, 3} means: traverse the tree using child node 1 of the root node, then at the next level pick child node 2, then your node will be child node 3 at the next level. In this case you still have to traverse the tree but don't have to check the conditions again.
If the tree does not change and the nodes stay in the same place in memory, you can simply use pointers or references.
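The path-of-indexes idea can be sketched on a generic tree. Note that `TreeNode` and `followPath` below are hypothetical and are not Boost.Geometry's internal node types or API; the sketch only shows that a list of child indexes keeps identifying the same node after the tree has been copied, where a raw pointer would not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical n-ary node; NOT the Boost.Geometry rtree node representation.
// (A vector of the enclosing type relies on C++17 incomplete-type support.)
struct TreeNode {
    int id;
    std::vector<TreeNode> children;
};

// Follows a list of child indexes starting at root.
// Returns nullptr if the path leaves the tree.
const TreeNode* followPath(const TreeNode& root,
                           const std::vector<std::size_t>& path) {
    const TreeNode* current = &root;
    for (std::size_t index : path) {
        if (index >= current->children.size()) return nullptr;
        current = &current->children[index];
    }
    return current;
}
```

The lookup still walks one node per level, but it does no searching among siblings: each step is a direct index into the child list.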

implement a Queue using a BST

How do I implement a queue using a BST?
Is this the way to do it: keep inserting nodes into the tree while maintaining a count value associated with each node, and, since deletion from the BST should work like a queue (FIFO), delete the node with the lowest count value in the tree?
Did I get the question and solution right? If not, please explain the question to me.

A BST is really an inappropriate data structure to use to back a queue. You really ought to use a linked list instead, because it would be way faster, less complicated, and plain old better.
However, if you insist on using a BST...
You would use the BST as a priority queue, and define a wrapper type that also holds a 'queue index', which is what the items would be sorted by. You would have to define the comparison to take into account the current queue index though, because otherwise you could only ever add as many items as the difference between the highest and lowest values of your index type.
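A minimal sketch of the "queue index" idea, using `std::map` as the ordered tree (it is commonly implemented as a balanced binary search tree, although the standard does not mandate that). `TreeQueue` is a hypothetical name, and the sketch sidesteps the wrap-around problem by assuming a 64-bit counter never overflows in practice, rather than implementing a wrap-aware comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// FIFO queue backed by an ordered tree keyed on an insertion counter.
class TreeQueue {
public:
    // Each new element gets the next, larger index, so it sorts last.
    void push(const std::string& value) {
        items_.emplace(nextIndex_++, value);
    }

    bool empty() const { return items_.empty(); }

    // Removes and returns the oldest element (smallest index).
    // Precondition: the queue is not empty.
    std::string pop() {
        assert(!items_.empty());
        auto oldest = items_.begin();
        std::string value = oldest->second;
        items_.erase(oldest);
        return value;
    }

private:
    std::map<std::uint64_t, std::string> items_;
    std::uint64_t nextIndex_ = 0;   // assumption: never wraps in practice
};
```

Both operations cost O(log n) here, compared with O(1) for a linked list, which is the first answer's point about a BST being a poor fit for a queue.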

You can have a Queue like this:
BST // to store data
pointer to head; // Points to the head of the Queue
pointer to tail // Points to the tail of the Queue
You also add to the BST's node struct a pointer to another node, which represents the order of insertion.
struct Node{
int x;
//left pointer
//right pointer
struct Node *next_queue_element;
};
During insertion
When you want to add an element, you first access the node that the tail pointer points to and make its next_queue_element point to the new element that you just inserted (the BST node). Then you update the tail pointer to point to the new element.
During deletion
When you remove an element, you first access the node that the head pointer points to, store its next_queue_element in a temporary variable, and remove the node. Finally, make the head pointer point to the node saved in the temporary variable.

I think a binary tree would be the desired data structure here, not a binary search tree. Using a binary tree to implement a queue might be useful when doing functional programming. You can do it using a binary tree which stays height-balanced after each push and pop operation, so they will always be O(log n). Push and pop look like:
a function to insert an element at the very left of the tree (the push function);
a function to delete an element from the very right of the tree (the pop function).
In both cases rebalancing won't violate the insertion order. Both are easy to implement as well. You are in fact using an AVL tree with altered insert functions. A bonus is that the elements do not need to be (totally) orderable.

Algorithm for creating Iterator for BinaryTree class

I want to add a bidirectional iterator (like the iterator exported by std::set) to my parametrized BinaryTree class, but I'm unable to come up with an algorithm.
The structure of the binary tree node is simple; it contains three pointers: left, right, and parent.

With the given structure you want to proceed like this:
To start the iteration you would find the left-most node.
To go to the next node the operation depends on where you currently are:
If your node has a right child, you go to this child and find its left-most descendant (if there is no left child, the right child itself is the next node).
If your node doesn't have a right child, you move up the chain of parents until you reach a parent whose left link you came up through: that parent becomes the next node.
To move in the other direction you reverse the roles of left and right.
Effectively, this implements a stack-less in-order traversal of the tree. If your tree isn't changed while iterating (an unlikely scenario), or you don't have a link to the parent node, you can maintain the stack explicitly in the iterator.
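The steps above might be sketched as follows. The `Node` struct and the `leftmost` and `successor` names are assumptions for illustration; a real iterator class would wrap these in `operator++` and add the mirrored logic for `operator--`.

```cpp
#include <cassert>

// Hypothetical node with the three pointers described in the question.
struct Node {
    int key;
    Node* left;
    Node* right;
    Node* parent;
    explicit Node(int k) : key(k), left(nullptr), right(nullptr), parent(nullptr) {}
};

// Start of the iteration: the left-most node of the subtree rooted at n.
Node* leftmost(Node* n) {
    while (n && n->left) n = n->left;
    return n;
}

// In-order successor of n, or nullptr if n is the last node.
Node* successor(Node* n) {
    if (!n) return nullptr;
    if (n->right) return leftmost(n->right);   // case 1: go right, then all the way left
    Node* p = n->parent;
    while (p && n == p->right) {                // case 2: climb while we arrive from the right
        n = p;
        p = p->parent;
    }
    return p;                                   // first parent reached from its left child
}
```

No stack is needed because the parent pointers record the way back up.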

A good approach to this issue may be to first write your recursive pre-order algorithm, without using templates, and then you can from that create a templated version and implement the correct Iterators.
Just a thought.

Without coroutines, you can't use recursion to implement an iterator in C++, because a recursive function has to finish all of its processing before it can return a result.
Languages such as C# and Python, which have a yield construct, can use recursion to create iterators; C++ only gained comparable support with C++20 coroutines (co_yield).
Your iterator needs to maintain a stack of yet-to-be-visited nodes.
Off the top of my head, I think the algorithm is something like:
Keep going down and to the left
Every time you come across a right branch, add it to the stack
If at any point you can't go left, pop the first branch off the stack and begin visiting that in the same way.
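A sketch of those three steps as a small cursor class. Read this way (popping the most recently saved branch), the steps yield a pre-order walk; `Node` and `PreorderCursor` are hypothetical names, and this is not a standard-conforming iterator, only the state machine such an iterator would need.

```cpp
#include <cassert>
#include <stack>

// Hypothetical node type with the member names used earlier on this page.
struct Node {
    int info;
    Node* lChild;
    Node* rChild;
    explicit Node(int v) : info(v), lChild(nullptr), rChild(nullptr) {}
};

class PreorderCursor {
public:
    explicit PreorderCursor(Node* root) : current_(root) {}

    bool done() const { return current_ == nullptr; }

    int value() const {
        assert(current_);
        return current_->info;
    }

    // Moves to the next node; all traversal state lives in the object.
    void advance() {
        assert(current_);
        if (current_->rChild) pending_.push(current_->rChild);  // remember the right branch
        if (current_->lChild) {
            current_ = current_->lChild;        // keep going down and to the left
        } else if (!pending_.empty()) {
            current_ = pending_.top();          // cannot go left: resume a saved branch
            pending_.pop();
        } else {
            current_ = nullptr;                 // nothing left to visit
        }
    }

private:
    Node* current_;
    std::stack<Node*> pending_;   // right branches not yet visited
};
```

A typical loop would be `for (PreorderCursor c(root); !c.done(); c.advance())`, reading `c.value()` at each step.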

How to indicate preorder of a spanning tree using the algorithm BFS

I'm implementing the BFS algorithm in C++ to find a spanning tree. The output for the spanning tree should be shown in preorder, but I have a doubt about the implementation: how can I build a tree if I do not know exactly how many children each node has? Considering a recursive tree structure, the data structure of the tree can be written as:
typedef struct node
{
int val;
struct node *left, *right;
}*tree; //tree has been typedefed as a node pointer.
But I do not think this implementation works, for the reason mentioned above.
This is my function to return the tree in preorder:
void preorder(tree t)
{
if(t == NULL)
return;
printf("%d ", t->val);
preorder(t->left);
preorder(t->right);
}
I also wonder if there is any way to do the preorder of the nodes without using a tree structure.

I have seen two concrete questions in the posting:
Is it possible to have a data structure using more than two children in a tree? Of course this is possible. Interestingly, it is even possible with the node structure you posted! Just consider the left pointer to be a pointer to the first child and the right pointer to point to the next sibling. Since breadth first search of a graph implicitly builds up a spanning tree, you can then walk this tree in preorder if you actually represent it somehow.
Can you do a preorder walk without using a tree structure? Yes, this is possible, too. Essentially, DFS and BFS are conceptually no different for this: you just have a data structure maintaining the nodes to be visited next. For DFS this is a stack, for BFS this is a queue. You get a preorder walk of the tree (i.e. you visit all children of a node after the parent) if you emit the node number when you insert it into the data structure maintaining the nodes to be visited.
To expand a bit on the second point: a preorder walk of a tree just means that each node is processed prior to its child nodes. When you do a graph search you want to traverse a connected component of a graph, visiting each node just once, and in doing so you effectively create an implicit tree. That is, your start node becomes the root node of the tree. Whenever you visit a node you search for adjacent nodes which haven't been visited, i.e. which aren't marked. If there is such a node, the incident edge becomes a tree edge and you mark the node. Since only one node is actively held at any time, you need to remember the nodes which aren't processed yet in some data structure, e.g. a stack or a queue (instead of using a stack explicitly you could use recursion, which creates the stack implicitly). Now, if you emit the node number the first time you see a node, you clearly process it prior to its children, i.e. you end up writing the node numbers in the order of a preorder walk.
If you don't understand this, please whip out a sheet of paper and draw a graph and a queue:
the nodes with hollow circles and their node number next to them
the edges with thin lines
the queue is just a row of rectangles which don't contain anything at the start
Now choose a node to become the start node of your search, which is the same as the root node of your tree. Write its number into the first empty position in the queue and mark the node, i.e. fill it. Now proceed with the search:
look at the node indicated by front of the queue and find an adjacent node which isn't filled:
append the node at the back of the queue (i.e. right behind the last node in the rectangle)
mark (i.e. fill) the node
make the line connecting the two nodes thicker: it is a tree edge now
if there are no further unmarked adjacent nodes tick the front node in the queue off (i.e. remove it from the queue) and move on to the next node until there are no further nodes
Now the queue rectangle contains a preorder walk of the spanning tree implied by a breadth first search of the graph. The spanning tree is visible using the thicker lines. The algorithm would also work if you treated the rectangle for the queue as a stack but it would be a bit messier because you end up with ticked off nodes between nodes still to be processed: instead of looking at the first unticked node you would look at the last unticked node.
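The paper exercise can be sketched in code. The adjacency-list representation, the `BfsResult` struct and the `bfsSpanningTree` name are assumptions made for this example; the `order` vector plays the role of the queue rectangles, and `parent` records the thick tree edges.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct BfsResult {
    std::vector<int> order;    // vertices in the order they were first seen
    std::vector<int> parent;   // parent[v] in the spanning tree, -1 for the root or unreached
};

// adj[u] lists the vertices adjacent to u; start must be a valid vertex.
BfsResult bfsSpanningTree(const std::vector<std::vector<int>>& adj, int start) {
    assert(start >= 0 && static_cast<std::size_t>(start) < adj.size());

    BfsResult result;
    result.parent.assign(adj.size(), -1);
    std::vector<bool> marked(adj.size(), false);
    std::queue<int> pending;

    marked[start] = true;                  // fill the start node
    result.order.push_back(start);         // emit it when it enters the queue
    pending.push(start);

    while (!pending.empty()) {
        int u = pending.front();
        pending.pop();                     // tick the front node off
        for (int v : adj[u]) {
            if (!marked[v]) {
                marked[v] = true;          // fill the node
                result.parent[v] = u;      // (u, v) becomes a tree edge
                result.order.push_back(v); // emit on insertion, before any of v's children
                pending.push(v);
            }
        }
    }
    return result;
}
```

Because a vertex is emitted when it is inserted, it always appears in `order` before any vertex whose `parent` entry points to it, which is the sense of "preorder" used in the answer above.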
When working with graph algorithms I found it quite helpful to visualize the algorithm. Although it would be nice to have the computer maintain the drawing, the low-tech alternative of drawing things on paper and possibly indicating active nodes by a number of labeled pencils works as well if not better.
Just a comment on the code: whenever you are reading any input, make sure that you successfully read the data. BTW, your code is clearly only C and not C++ code: variable length arrays are not available in C++. In C++ you would use std::vector<int> followOrder(vertexNumber) instead of int followOrder[vertexNumber]. Interestingly, the code isn't C either because it uses e.g. std::queue<int>.

Fastest way to traverse arbitrary depth tree for deletion?

For my own exercises I'm writing an XML-parser. To fill the tree I use a normal std::stack and push the current node on top after making it a child of the last top-node (should be depth-first?). So I now do the same for deletion of the nodes, and I want to know if there's a faster way.
Current code for deletion:
struct XmlNode{
// ignore the rest of the node implementation for now
std::vector<XmlNode*> children_;
};
XmlNode* root_ = new XmlNode;
// fill root_ with child nodes...
// and then those nodes with child nodes and so forth...
std::stack<XmlNode*> nodes_;
nodes_.push(root_);
while(!nodes_.empty()){
XmlNode* node = nodes_.top();
if(node->children_.size() > 0){
nodes_.push(node->children_.back());
node->children_.pop_back();
}else{
delete nodes_.top();
nodes_.pop();
}
}
Works totally fine but it kinda looks slow. So is there any faster / better / more common way to do this?

Don't go out of your way to do iteratively what can be easily done recursively, unless you can prove that the recursive version is either insufficient (e.g. stack overflows) or slower (which won't happen unless you start overflowing your stack, forcing the OS to either expand it or crash you).
In other words, in general, use iteration for linear structures, and recursion for tree structures.
Compared to recursion, an iterative method was around 3 times slower on my machine. If you can be sure that your XML depth won't exceed a few hundred nestings (which I've never seen inside real-world XML documents), then recursion won't be a problem.
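For comparison with the stack-based loop in the question, a recursive version might look like the sketch below. The `XmlNode` struct repeats only the `children_` member from the question; `deleteTree` is a hypothetical name, and returning the number of deleted nodes is an addition made here purely so the behavior can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct XmlNode {
    // rest of the node implementation omitted, as in the question
    std::vector<XmlNode*> children_;
};

// Deletes node and everything below it (children first, then the node itself).
// Returns how many nodes were deleted.
std::size_t deleteTree(XmlNode* node) {
    if (!node) return 0;
    std::size_t deleted = 1;
    for (XmlNode* child : node->children_) {
        deleted += deleteTree(child);   // recursion depth equals the nesting depth
    }
    delete node;
    return deleted;
}
```

If the children were stored as `std::vector<std::unique_ptr<XmlNode>>` instead, destroying the root would delete the whole tree through the destructors, with the same recursion depth.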
To iterate is human; to recurse, divine. :)
