I am working on implementing a binary search tree. One of the functions required to complete the implementation is a rebalance function.
According to the specifications the function works in the following way:
The rebalance() method should create a balanced tree and thereby reduce skewness to zero. A
balanced tree is one in which the size of the left and right subtrees differ by no more than 1,
throughout the tree (i.e., every subtree is also balanced).
To balance the tree, the rebalance() method should repeatedly move the root value to the smaller
subtree, and move the min/max value from the larger subtree to the root, until the tree is balanced.
It should then recursively balance both subtrees.
So far I have the following code:
struct treeNode {
Type value;
int count;
treeNode* left;
treeNode* right;
};
treeNode* root;
template <class Type>
void bstree<Type>::rebalance(treeNode* sroot){
if (root == NULL) {
throw new underflow_error("tree is empty");
}
while (skewness(sroot) != 0)
{
if (size(sroot->left) < size(sroot->right))
{
sroot->left.insert(sroot->value);
sroot->left.insert(max(sroot->right));
sroot->left.insert(min(sroot->right));
}
else
{
sroot->right.insert(sroot->value);
sroot->left.insert(max(sroot->left));
sroot->left.insert(min(sroot->left));
}
}
rebalance(sroot->left);
rebalance(sroot->right);
}
I can't tell whether I have followed the specifications correctly. Can I get some insight or pointers as to where I may have gone wrong?
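You have not followed the specifications correctly. For one thing, your rebalance operation increases the size of the tree: each pass inserts values into a subtree without removing them from where they came from. For another, the result is not always a binary search tree; for example, inserting max(sroot->right) into the left subtree puts a value larger than the root on the root's left side. (As written, sroot->left.insert(...) would also not compile, since sroot->left is a pointer.)
You must understand how these trees work before you attempt to implement them. There is just no way around that. I advise you to study your text (or Wikipedia) and try constructing and balancing some trees with pencil and paper.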
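To make the quoted specification concrete, here is a minimal sketch of what it seems to describe. It is not the asker's bstree class: BNode, subtreeSize, insert, removeMin and removeMax are hypothetical helpers, the values are plain ints, and it assumes all values are distinct (the count field from the question is left out).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical minimal node; the treeNode in the question also has a count field.
struct BNode {
    int value;
    BNode* left;
    BNode* right;
    explicit BNode(int v) : value(v), left(nullptr), right(nullptr) {}
};

int subtreeSize(const BNode* n) {
    return n ? 1 + subtreeSize(n->left) + subtreeSize(n->right) : 0;
}

// Ordinary BST insert; returns the (possibly new) subtree root.
BNode* insert(BNode* n, int v) {
    if (!n) return new BNode(v);
    if (v < n->value) n->left = insert(n->left, v);
    else              n->right = insert(n->right, v);
    return n;
}

// Detach the smallest node of a non-empty subtree; its value is written to out.
BNode* removeMin(BNode* n, int& out) {
    if (!n->left) {
        out = n->value;
        BNode* rest = n->right;
        delete n;
        return rest;
    }
    n->left = removeMin(n->left, out);
    return n;
}

// Detach the largest node of a non-empty subtree; its value is written to out.
BNode* removeMax(BNode* n, int& out) {
    if (!n->right) {
        out = n->value;
        BNode* rest = n->left;
        delete n;
        return rest;
    }
    n->right = removeMax(n->right, out);
    return n;
}

void rebalance(BNode* n) {
    if (!n) return;
    // Move values across the root until the subtree sizes differ by at most 1.
    while (std::abs(subtreeSize(n->left) - subtreeSize(n->right)) > 1) {
        int promoted = 0;
        if (subtreeSize(n->left) < subtreeSize(n->right)) {
            n->left = insert(n->left, n->value);       // root value goes to the smaller side
            n->right = removeMin(n->right, promoted);  // min of the larger side is removed...
        } else {
            n->right = insert(n->right, n->value);
            n->left = removeMax(n->left, promoted);
        }
        n->value = promoted;                           // ...and becomes the new root value
    }
    rebalance(n->left);
    rebalance(n->right);
}
```

The point to notice is that every value moved is removed from one place before (or as) it appears in another, so the tree never grows, and the value promoted to the root is always the min of the right subtree or the max of the left one, which keeps the ordering intact. Recomputing the sizes on every pass is slow; it is written this way only to stay close to the wording of the spec.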
I'm working through a Binary Search Tree tutorial, and I found this function destroy_tree(node* leaf). Its behaviour worries me: I can't picture what the call stack looks like. Can you explain it to me?
void btree::destroy_tree(node* leaf)
{
    if (leaf != NULL)
    {
        destroy_tree(leaf->left);
        destroy_tree(leaf->right);
        delete leaf;
    }
}
For questions about recursive functions, sometimes it helps to just think of or draw a simple tree and just map out on paper how the function goes through it.
First, it's been a while since I used C++, but for the sake of this example I'm going to change your code to:
void btree::destroy_tree(node* leaf)
{
    if (leaf != NULL)
    {
        if (leaf->left != NULL)
            destroy_tree(leaf->left);
        if (leaf->right != NULL)
            destroy_tree(leaf->right);
        delete leaf;
    }
}
just so there are fewer calls on the stack.
Think about how the logic of this function works recursively through a tree. Take the following example tree, which I took from Wikipedia (its shape, as implied by the order given below: 8 is the root; its left child 3 has children 1 and 6, and 6 has children 4 and 7; its right child 10 has a right child 14, whose left child is 13).
Let's say you call destroy_tree(root). The function calls destroy_tree(leaf->left) first, then destroy_tree(leaf->right). This means that the whole left subtree of a node is processed before any node in its right subtree. Using the numbers in the tree above, the calls are made in the order 8, 3, 1, 6, 4, 7, 10, 14, 13. No right child is visited while there is still an unvisited left child. Note that each node is deleted only after both of its recursive calls have returned, so the deletions themselves happen in the order 1, 4, 7, 6, 3, 13, 14, 10, 8.
The call stack mirrors this as the program runs: calling destroy_tree(leaf->left) calls destroy_tree() on every consecutive left node before any right node is reached, so at any moment the stack holds the chain of calls from the root down to the node currently being processed.
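If drawing it on paper is not convincing enough, a small instrumented version can record both orders. This is only a sketch: DNode, make and the two logging vectors are hypothetical additions, and the function is a free function rather than a btree member.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the tutorial's node type.
struct DNode {
    int value;
    DNode* left;
    DNode* right;
};

// Convenience helper for building a small tree by hand.
DNode* make(int value, DNode* left = nullptr, DNode* right = nullptr) {
    return new DNode{value, left, right};
}

// Same recursion as destroy_tree, but it records when each call starts
// (entered) and when each node is actually freed (deleted).
void destroyTree(DNode* leaf, std::vector<int>& entered, std::vector<int>& deleted) {
    if (leaf != nullptr) {
        entered.push_back(leaf->value);
        destroyTree(leaf->left, entered, deleted);
        destroyTree(leaf->right, entered, deleted);
        deleted.push_back(leaf->value);
        delete leaf;
    }
}
```

Building the tree with root 8 described above and calling destroyTree on it fills entered with 8, 3, 1, 6, 4, 7, 10, 14, 13 and deleted with 1, 4, 7, 6, 3, 13, 14, 10, 8. The maximum stack depth equals the height of the tree, not the number of nodes.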
The C++ STL class std::map implements O(log(n)) look-up using a binary tree. But with trees, it's not immediately obvious how an iterator would work. What does the ++ operator actually mean in a tree structure? Whereas the concept of "next element" has an obvious implementation in an array, for me it's not so obvious in a tree. How would one implement a tree iterator?
For an inorder traversal (probably works for others too), if you have a parent-pointer in your nodes you can do a non-recursive traversal. It should be possible to just store two pointers in your iterator: you need an indication of where you are, and you'll probably (I'm not doing the research now) need something like a "previous" pointer so you can figure out your current movement direction (i.e. do I need to go into the left subtree, or did I just come back from it).
"Previous" will probably be something like "parent", if we've just entered the node; "left" if we're coming back from the left subtree, "right" if we are coming back from the right subtree, and "self" if the last node we returned was our own.
I would like to add my two cents worth as a comment, but since I am not able to I shall have to add an answer. I have been googling and was frustrated because all the answers I found, these excepted, assumed a stack or some other variably-sized data structure. I did find some code. It shows that it can be done without a stack but I found it hard to follow and so decided to attack the problem from first principles.
The first thing to note is that the algorithm is "left-greedy". Thus, when we start at the root we immediately go as far left as possible, since the leftmost node is the one we need first. This means that we never need to consider the left-subtree. It has already been iterated over.
The order of iteration is left subtree, node, right subtree. So if we are positioned at a given node we know that its left subtree and the node itself have been visited and that we should next visit the right subtree, if any, going as far left as possible.
Otherwise, we must go up the tree. If we are going from a left child to its parent then the parent comes next. (Afterwards we will visit its right subtree, as already covered.)
The final case is when we are going from a right child to its parent. The parent has been visited already so we must go up again. In fact we must keep going up until we reach the root of the tree, or find ourselves moving to a parent from its left child. As we have already seen, the parent is the next node in this case. (The root may be indicated by a null parent pointer, as in my code, or by some special sentinel node.)
The following code could easily be adapted for an STL-style iterator:
#include <cassert>

// Assumed node layout (not shown in the original): the code relies on
// left, right and parent pointers; the root's parent is null.
struct Node
{
    Node* left;
    Node* right;
    Node* parent;
};

// Go as far left from this node as you can.
// i.e. find the minimum node in this subtree
Node* Leftmost(Node* node)
{
    if (node == nullptr)
        return nullptr;
    while (node->left != nullptr)
        node = node->left;
    return node;
}

// Start iterating from a root node
Node* First(Node* root)
{
    return Leftmost(root);
}

// The iteration is currently at node. Return the next node
// in value order, or nullptr when the iteration is finished.
Node* Next(Node* node)
{
    // Make sure that the caller hasn't failed to stop.
    assert(node != nullptr);
    // If we have a right subtree we must iterate over it,
    // starting at its leftmost (minimal) node.
    if (node->right != nullptr)
        return Leftmost(node->right);
    // Otherwise we must go up the tree
    Node* parent = node->parent;
    if (parent == nullptr)
        return nullptr;
    // A node comes immediately after its left subtree
    if (node == parent->left)
        return parent;
    // This must be the right subtree!
    assert(node == parent->right);
    // In which case we need to go up again, looking for a node that is
    // its parent's left child.
    while (parent != nullptr && node != parent->left)
    {
        node = parent;
        parent = node->parent;
    }
    // We should be at a left child!
    assert(parent == nullptr || node == parent->left);
    // And, as we know, a node comes immediately after its left subtree
    return parent;
}
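As a rough illustration of the "STL-style iterator" adaptation, here is a self-contained sketch. It is not how std::map is implemented; TNode, successor, InorderIterator and InorderRange are hypothetical names, the keys are plain ints, and the successor logic is a condensed form of the walk described above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node with a parent pointer; the root's parent is null.
struct TNode {
    int value;
    TNode* left;
    TNode* right;
    TNode* parent;
};

TNode* leftmost(TNode* n) {
    while (n && n->left) n = n->left;
    return n;
}

// In-order successor using parent pointers only, no stack.
TNode* successor(TNode* n) {
    if (n->right) return leftmost(n->right);
    TNode* p = n->parent;
    while (p && n == p->right) {  // climb while we arrive from a right child
        n = p;
        p = p->parent;
    }
    return p;                     // null once the last node has been passed
}

// Just enough of the iterator interface for a range-based for loop.
class InorderIterator {
public:
    explicit InorderIterator(TNode* n) : node_(n) {}
    int operator*() const { return node_->value; }
    InorderIterator& operator++() { node_ = successor(node_); return *this; }
    bool operator==(const InorderIterator& o) const { return node_ == o.node_; }
    bool operator!=(const InorderIterator& o) const { return node_ != o.node_; }
private:
    TNode* node_;
};

struct InorderRange {
    TNode* root;
    InorderIterator begin() const { return InorderIterator(leftmost(root)); }
    InorderIterator end() const { return InorderIterator(nullptr); }
};

// Iterative BST insert that also maintains the parent pointers.
TNode* insert(TNode* root, int v) {
    TNode* fresh = new TNode{v, nullptr, nullptr, nullptr};
    if (!root) return fresh;
    TNode* cur = root;
    for (;;) {
        TNode*& child = (v < cur->value) ? cur->left : cur->right;
        if (!child) {
            child = fresh;
            fresh->parent = cur;
            return root;
        }
        cur = child;
    }
}
```

With this, for (int v : InorderRange{root}) visits the keys in ascending order. A real container iterator would also need operator--, the iterator typedefs, and a way to step back from end(), which is one reason implementations often use a sentinel header node instead of a null pointer.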
Consider the set of all elements in the map that are greater than the current element (that is, not less than it and not the current element itself). The "next element" is the smallest element of that set.
In order to use a map, you must have a key. And that key must implement a "less than" operation. This determines the way the map is formed, such that the find, add, remove, increment, and decrement operations are efficient.
Generally the map internally uses a tree of some kind.
For the standard implementation of the map iterator's operator++, see stl_tree.h (libstdc++):
_Self&
operator++() _GLIBCXX_NOEXCEPT
{
_M_node = _Rb_tree_increment(_M_node);
return *this;
}
_Rb_tree_increment implementation is discussed here
I am working on a binary search tree in C++ at the moment and I have reached the stage where I have to write the remove/delete function (using a recursive approach, x = change(x)). I have two options:
1. to stop at the parent of the node to be deleted;
2. to get to the node to delete and then call a function that will return the parent.
Approach 1: less expensive, more code
Approach 2: less code, more expensive
Which approach is better according to you, and why?
I disagree that those are your only two options.
I think a simpler solution is to ask each node whether it should be deleted. If it decides yes, then it is deleted and returns the new node that should replace it. If it decides no, then it returns itself.
// pseudo code (doDelete is left to the reader).
Node* deleteNode(Node* node, int value)
{
    if (node == NULL) return node;
    if (node->value == value)
    {
        // This is the node I want to delete.
        // So delete it and return the node I want to replace it with.
        // Which may involve some shifting of things around.
        return doDelete(node);
    }
    else if (value < node->value)
    {
        // Not this node. But try deleting the node on the left.
        // Whatever happens, a value will be returned that
        // is assigned to left and the tree will be correct.
        node->left = deleteNode(node->left, value);
    }
    else
    {
        // Not this node. But try deleting the node on the right.
        // Whatever happens, a value will be returned that
        // is assigned to right and the tree will be correct.
        node->right = deleteNode(node->right, value);
    }
    // Since this node is not being deleted, return it
    // so it can be assigned back into the correct place.
    return node;
}
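For completeness, here is one way the pseudo code above could be fleshed out into something runnable. The doDelete shown is only one possible choice (replace a two-child node's value with its in-order successor); BstNode, insertNode and inorder are hypothetical helpers added so the example stands alone, and distinct int keys are assumed.

```cpp
#include <cassert>
#include <vector>

struct BstNode {
    int value;
    BstNode* left;
    BstNode* right;
};

BstNode* insertNode(BstNode* node, int value) {
    if (node == nullptr) return new BstNode{value, nullptr, nullptr};
    if (value < node->value) node->left = insertNode(node->left, value);
    else                     node->right = insertNode(node->right, value);
    return node;
}

BstNode* deleteNode(BstNode* node, int value);  // defined below

// Removes node from the tree and returns whatever should take its place.
BstNode* doDelete(BstNode* node) {
    if (node->left == nullptr || node->right == nullptr) {
        // Zero or one child: the child (possibly null) replaces the node.
        BstNode* child = node->left ? node->left : node->right;
        delete node;
        return child;
    }
    // Two children: copy the in-order successor's value into this node,
    // then delete the successor from the right subtree instead.
    BstNode* successor = node->right;
    while (successor->left != nullptr) successor = successor->left;
    node->value = successor->value;
    node->right = deleteNode(node->right, successor->value);
    return node;
}

BstNode* deleteNode(BstNode* node, int value) {
    if (node == nullptr) return nullptr;
    if (value == node->value) return doDelete(node);
    if (value < node->value) node->left = deleteNode(node->left, value);
    else                     node->right = deleteNode(node->right, value);
    return node;
}

// Collects the values in ascending order, for checking the result.
void inorder(const BstNode* node, std::vector<int>& out) {
    if (node == nullptr) return;
    inorder(node->left, out);
    out.push_back(node->value);
    inorder(node->right, out);
}
```

The caller always writes root = deleteNode(root, value); so the same code handles deleting the root, and no node ever needs to know its parent.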
The best approach would be to traverse up to the parent of the node to be deleted, and then delete that child node. With this approach you still always visit the child node, since you have to confirm that the child is the node you want to delete.
I find that the most efficient form for writing functions for tree data structures in general is the following pseudocode format.
function someActionOnTree() {
return someActionOnTree(root)
}
function someActionOnTree (Node current) {
if (current is null) {
return null
}
if (current is not the node I seek) {
//logic for picking the next node to move to
next node = ...
next node = someActionOnTree(next node)
}
else {
// do whatever you need to do with current
// i.e. give it a child, delete its memory, etc
current = ...
}
return current;
}
This function recurses over the nodes of a data structure. At each step it either picks a node to recurse into and overwrites the structure's reference to that node with the result of the recursive call, or it modifies the current node itself (possibly with some other logic). Finally, the function returns a reference to the node it was given, which is essential for the overwriting step.
This is generally the most effective form of code I've found for tree data structures in C++. The concept applies to other structures as well: you can use recursion of this form, where the return value is always a reference to a fixed point in the planar representation of your data structure (basically, always return whatever is supposed to be at the spot you're looking at).
Here's an application of this style to a binary search tree delete function to illustrate my point.
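function deleteNodeFromTreeWithValue( value ) {
    return deleteNodeFromTree(root, value)
}
function deleteNodeFromTree(Node current, value) {
    if (current is null) return null
    if (current does not represent value) {
        if (current is greater than my value) {
            current's leftNode = deleteNodeFromTree(current's leftNode, value)
        } else {
            current's rightNode = deleteNodeFromTree(current's rightNode, value)
        }
    }
    else {
        // Note: as written this is only correct when current is a leaf.
        // A node with children must instead be replaced by its child
        // or by its in-order successor, or its subtrees are lost.
        free current's memory
        current = null
    }
    return current
}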
Obviously, there are many other ways to write this code, but from my experience, this has turned out to be the most effective method. Note that performance isn't really hit by overwriting pointers, since the hardware already cached the nodes. If you're looking into improving performance of your search tree, I'd recommend looking into specialized trees, like self-balancing ones (AVL trees), B-trees, red-black trees, etc.
Suppose I am given an undirected tree and I need to find the path (the only path) between two nodes.
What is the best algorithm to do it? I could probably use Dijkstra's algorithm, but there is probably something better for trees.
A C++ example would be helpful but is not necessary.
Thank you
Assuming each node has a pointer to its parent, then simply back-track up the tree towards the root from each start node. Eventually, the two paths must intersect. Testing for intersection could be as simple as maintaining a std::map of node addresses.
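A minimal sketch of that parent-pointer idea follows. PNode and pathViaParents are hypothetical names, and a std::unordered_set of node addresses is used here in place of the std::map mentioned above, since only membership is needed.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical node: all this approach needs is the parent pointer.
struct PNode {
    PNode* parent;
};

// Returns the path from a to b (inclusive), or an empty vector if the
// two nodes are not in the same tree.
std::vector<PNode*> pathViaParents(PNode* a, PNode* b) {
    // Record a and all of its ancestors.
    std::unordered_set<PNode*> ancestorsOfA;
    for (PNode* n = a; n != nullptr; n = n->parent) ancestorsOfA.insert(n);

    // Walk up from b until we hit one of them: that is the meeting point.
    std::vector<PNode*> tail;  // b, b's parent, ... (meeting point excluded)
    PNode* meet = b;
    while (meet != nullptr && ancestorsOfA.count(meet) == 0) {
        tail.push_back(meet);
        meet = meet->parent;
    }
    std::vector<PNode*> path;
    if (meet == nullptr) return path;

    // a up to the meeting point, then back down to b.
    for (PNode* n = a; n != meet; n = n->parent) path.push_back(n);
    path.push_back(meet);
    path.insert(path.end(), tail.rbegin(), tail.rend());
    return path;
}
```

The cost is proportional to the depth of the two nodes rather than to the size of the whole tree, which is the advantage over a full traversal when parent pointers are available.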
UPDATE
As you've updated your question to specify undirected trees, then the above isn't valid. A simple approach is simply to perform a depth-first traversal starting at Node #1, eventually you'll hit Node #2. This is O(n) in the size of the tree. I'm not sure there's going to be a faster approach than that, assuming a completely general tree.
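For the undirected case, the depth-first traversal could look roughly like this. It is a sketch under the assumption that the tree is stored as an adjacency list indexed by integer node ids; addEdge, dfsPath and findTreePath are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Adds an undirected edge u - v to the adjacency list.
void addEdge(std::vector<std::vector<int>>& adj, int u, int v) {
    adj[u].push_back(v);
    adj[v].push_back(u);
}

// Depth-first search that keeps the current chain of nodes in path.
// In a tree the only way to walk in a circle is to step straight back
// to where we came from, so remembering the previous node is enough.
bool dfsPath(const std::vector<std::vector<int>>& adj,
             int current, int target, int cameFrom,
             std::vector<int>& path) {
    path.push_back(current);
    if (current == target) return true;
    for (int next : adj[current]) {
        if (next == cameFrom) continue;
        if (dfsPath(adj, next, target, current, path)) return true;
    }
    path.pop_back();  // dead end: target is not below this node
    return false;
}

// Returns the unique path from start to target (inclusive).
std::vector<int> findTreePath(const std::vector<std::vector<int>>& adj,
                              int start, int target) {
    std::vector<int> path;
    dfsPath(adj, start, target, -1, path);
    return path;
}
```

Each edge is looked at a constant number of times, so this stays O(n). For very deep trees the recursion could be replaced by an explicit stack.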
Breadth-first search and depth-first search are more efficient than Dijkstra's algorithm here.
Supposing you have
struct Node
{
std::vector<Node *> children;
};
then what could be done is traversing the whole tree starting at root keeping the whole chain during the traversal. If you find e.g. node1 then you save the current chain, if you find node2 then you check for the intersection... in code (UNTESTED):
bool findPath(std::vector<Node *>& current_path, // back() is node being visited
Node *n1, Node *n2, // interesting nodes
std::vector<Node *>& match, // if not empty back() is n1/n2
std::vector<Node *>& result) // where to store the result
{
if (current_path.back() == n1 || current_path.back() == n2)
{
// This is an interesting node...
if (match.size())
{
// Now is easy: current_path/match are paths from root to n1/n2
...
return true;
}
else
{
// This is the first interesting node found
match = current_path;
}
}
for (std::vector<Node *>::iterator i=current_path.back()->children.begin(),
e=current_path.back()->children.end();
i != e; ++i)
{
current_path.push_back(*i);
if (findPath(current_path, n1, n2, match, result))
return true;
current_path.pop_back(); // *i
}
return false;
}
I'm writing a program in C++ that uses genetic techniques to optimize an expression tree.
I'm trying to write a class Tree which has as a data member Node root. The node constructor generates a random tree of nodes with +,-,*,/ as nodes and the integers as leaves.
I've been working on this a while, and I'm not yet clear on the best structure. Because I need to access any node in the tree in order to mutate or crossbreed the tree, I need to keep a dictionary of the Nodes. An array would do, but it seems that vector is the recommended container.
vector<Node> dict;
So the Tree class would contain a vector dict with all the nodes of the tree (or pointers to same), the root node of the tree, and a variable to hold a fitness measure for the tree.
class Tree
{
public:
typedef vector<Node>dict;
dict v;
Node *root;
float fitness;
Tree(void);
~Tree();
};
class Node
{
public:
char *cargo;
Node *parent;
Node *left;
Node *right;
bool entry;
dict v;
Node(bool entry, int a_depth, dict v, Node *pparent = 0);
};
Tree::Tree()
{
Node root(true, tree_depth, v);
};
There seems to be no good place to put typedef vector<Node>dict;, because if it goes in the definition of Tree, it doesn't know about Node, and will give an error saying so. I haven't been able to find a place to typedef it.
But I'm not even sure if a vector is the best container. The Nodes just need to be indexed sequentially. The container would need to grow as there could be 200 to 500 Nodes.
I think a standard Binary Tree should do... here is an example of a (binary) expression tree node:
const int NUMBER = 0, // Values representing two kinds of nodes.
OPERATOR = 1;
struct ExpNode { // A node in an expression tree.
int kind; // Which type of node is this?
// (Value is NUMBER or OPERATOR.)
double number; // The value in a node of type NUMBER.
char op; // The operator in a node of type OPERATOR.
ExpNode *left; // Pointers to subtrees,
ExpNode *right; // in a node of type OPERATOR.
ExpNode( double val ) {
// Constructor for making a node of type NUMBER.
kind = NUMBER;
number = val;
left = right = 0; // a NUMBER node has no subtrees
}
ExpNode( char op, ExpNode *left, ExpNode *right ) {
// Constructor for making a node of type OPERATOR.
kind = OPERATOR;
this->op = op;
this->left = left;
this->right = right;
}
}; // end ExpNode
So when you're doing crossover or mutation and you want to select a random node you just do the following:
Count the number of nodes in the tree (you only need to do this once, in the constructor).
Select a random index from 0 to the size of the tree minus 1.
Visit each node, subtracting 1 from the random index each time, until it reaches zero.
Return the node at which the index is 0.
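The steps above can be sketched as follows. This uses a simplified, hypothetical ExprNode rather than the ExpNode struct shown earlier, visits the nodes in pre-order (any fixed order would do), and takes a std::mt19937 as the random source, which is my choice rather than something the original specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Simplified stand-in for an expression tree node.
struct ExprNode {
    char op;        // operator character, unused in number nodes
    double number;  // value, unused in operator nodes
    ExprNode* left;
    ExprNode* right;
};

std::size_t countNodes(const ExprNode* n) {
    return n ? 1 + countNodes(n->left) + countNodes(n->right) : 0;
}

// Pre-order walk that counts index down to zero; returns the node at
// which it reaches zero, or nullptr if the tree has too few nodes.
ExprNode* nodeAt(ExprNode* n, std::size_t& index) {
    if (n == nullptr) return nullptr;
    if (index == 0) return n;
    --index;
    if (ExprNode* found = nodeAt(n->left, index)) return found;
    return nodeAt(n->right, index);
}

// Picks every node with equal probability.
ExprNode* pickRandomNode(ExprNode* root, std::mt19937& rng) {
    std::size_t total = countNodes(root);
    if (total == 0) return nullptr;
    std::uniform_int_distribution<std::size_t> pick(0, total - 1);
    std::size_t index = pick(rng);
    return nodeAt(root, index);
}
```

If the tree changes size after a crossover or mutation, the count has to be refreshed before the next pick; here it is simply recomputed on every call.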
In this case you don't need to know anything about the parent of the node. So mating/mutation should look like this:
select nodeX
select nodeY
if( Rand(0,1) == 1 )
nodeY->left = nodeX;
else
nodeY->right = nodeX;
And that should be it...
I don't think the Node or the Tree are the first classes to write.
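I'd start with Expression. In your case you need at least a BinaryExpression, as well as an expression with no subnodes (constants or variables). Each BinaryExpression should contain auto_ptr<Expression> lhs and auto_ptr<Expression> rhs (with C++11 or later, use std::unique_ptr<Expression> instead, since std::auto_ptr was deprecated in C++11 and removed in C++17).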
You could then easily write a function to enumerate through the expression tree's members. If performance turns out to be relevant, you can cache the list of expressions in the tree, and invalidate it manually when you change the expression. Anything more advanced is likely to be slower and more error prone.
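A rough sketch of that design, using std::unique_ptr in place of auto_ptr. The class names Constant and BinaryExpression and the evaluate/collect members are illustrative assumptions, not part of the question's code.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Expression {
    virtual ~Expression() {}
    virtual double evaluate() const = 0;
    // Appends this expression and all of its sub-expressions to out.
    virtual void collect(std::vector<Expression*>& out) { out.push_back(this); }
};

// An expression with no subnodes.
struct Constant : Expression {
    double value;
    explicit Constant(double v) : value(v) {}
    double evaluate() const override { return value; }
};

// An operator node that owns its two operands.
struct BinaryExpression : Expression {
    char op;
    std::unique_ptr<Expression> lhs;
    std::unique_ptr<Expression> rhs;

    BinaryExpression(char o, std::unique_ptr<Expression> l,
                     std::unique_ptr<Expression> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {}

    double evaluate() const override {
        double a = lhs->evaluate();
        double b = rhs->evaluate();
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b;  // '/' (no check for division by zero)
        }
    }

    void collect(std::vector<Expression*>& out) override {
        out.push_back(this);
        lhs->collect(out);
        rhs->collect(out);
    }
};
```

The vector filled by collect is the "list of expressions" that can be cached and thrown away whenever the tree is edited. Ownership stays with the unique_ptr members, so the raw pointers in the list must not outlive the tree.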
I don't see why an expression needs to know its parent expression. It only makes life harder when you start editing expressions.
You may implement a list over nodes. Then, each node will have two additional pointers inside:
class Node{
...
Node* sequentialPrevious;
Node* sequentialNext;
...
};
And so will the tree:
class Tree{
...
Node* sequentialFirst;
Node* sequentialLast;
...
};
Then you will be able to move bidirectionally over the nodes just by jumping to sequentialFirst or sequentialLast and then iterating via sequentialNext or sequentialPrevious. Of course, the Node constructor and destructor must be properly implemented to keep those pointers up to date.