Binary Search Tree Remove - c++

I am working on a binary search tree in C++ at the moment and I have reached the stage where I have to write the remove/delete function (using the recursive approach, x = change(x)). I have two options:
to stop at the parent of the node to be deleted;
to get to the node to delete and then call a function that will return the parent.
Approach 1: less expensive, more code
Approach 2: less code, more expensive
Which approach is better according to you, and why?

I disagree that those are your only two options.
I think a simpler solution is to ask each node whether it should be deleted. If it decides yes, then it is deleted and returns the new node that should replace it. If it decides no, then it returns itself.
// pseudo code.
Node* deleteNode(Node* node, int value)
{
    if (node == NULL) return node;
    if (node->value == value)
    {
        // This is the node I want to delete.
        // So delete it and return the node I want to replace it with,
        // which may involve some shifting of things around.
        return doDelete(node);
    }
    else if (value < node->value)
    {
        // Not this node. But try deleting the node on the left.
        // Whatever happens, a value will be returned that
        // is assigned to left and the tree will be correct.
        node->left = deleteNode(node->left, value);
    }
    else
    {
        // Not this node. But try deleting the node on the right.
        // Whatever happens, a value will be returned that
        // is assigned to right and the tree will be correct.
        node->right = deleteNode(node->right, value);
    }
    // Since this node is not being deleted, return it
    // so it can be assigned back into the correct place.
    return node;
}
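To make the idea concrete, here is a minimal runnable sketch. The Node layout and the body of doDelete() are assumptions (the answer above leaves both open); this version replaces a node that has two children with its in-order successor, which is one common choice among several.

```cpp
// Hypothetical node type; the answer above does not define one.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Assumed implementation of doDelete(): frees `node` and returns the
// subtree that should take its place.
Node* doDelete(Node* node) {
    if (node->left == nullptr || node->right == nullptr) {
        // Zero or one child: the (possibly null) child replaces the node.
        Node* replacement = node->left ? node->left : node->right;
        delete node;
        return replacement;
    }
    // Two children: detach the smallest node of the right subtree
    // (the in-order successor) and put it where `node` was.
    Node* parent = node;
    Node* successor = node->right;
    while (successor->left != nullptr) {
        parent = successor;
        successor = successor->left;
    }
    if (parent != node) {
        parent->left = successor->right;   // unhook the successor
        successor->right = node->right;
    }
    successor->left = node->left;
    delete node;
    return successor;
}

// Same shape as the pseudo code: every call returns whatever should now
// sit at the position it was asked about.
Node* deleteNode(Node* node, int value) {
    if (node == nullptr) return nullptr;
    if (value == node->value) return doDelete(node);
    if (value < node->value)
        node->left = deleteNode(node->left, value);
    else
        node->right = deleteNode(node->right, value);
    return node;
}
```

The caller is expected to write root = deleteNode(root, value); so that deleting the root itself also updates the tree's root pointer.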

The best approach would be to traverse up to the parent of the node to be deleted, and then delete that child node. With this approach you always end up visiting the child node anyway, since you have to confirm that the child is the node you want to delete.

I find that the most efficient form for writing functions for tree data structures in general is the following pseudocode format.
function someActionOnTree() {
    root = someActionOnTree(root)
}
function someActionOnTree (Node current) {
    if (current is null) {
        return null
    }
    if (current is not the node I seek) {
        // logic for picking the next node to move to
        // (next node is a reference to one of current's child links)
        next node = ...
        next node = someActionOnTree(next node)
    }
    else {
        // do whatever you need to do with current
        // i.e. give it a child, delete its memory, etc
        current = ...
    }
    return current;
}
This recursive function recurses over the vertex set of a data structure. On each call, it either picks a node to recurse on and overwrites the data structure's reference to that node with the result of the recursive call, or it operates on the current node itself (possibly replacing it, or performing some other logic). Finally, the function returns a reference to the node it was given, or to that node's replacement, which is essential for the overwriting step.
This is generally the most efficient form of code I've found for tree data structures in C++. The concepts apply to other structures as well: you can use recursion of this form, where the return value is always a reference to a fixed point in the planar representation of your data structure (basically, always return whatever is supposed to be at the spot you're looking at).
Here's an application of this style to a binary search tree delete function to illustrate my point.
function deleteNodeFromTreeWithValue( value ) {
    root = deleteNodeFromTree(root, value)
}
function deleteNodeFromTree(Node current, value) {
    if (current is null) return null
    if (current does not represent value) {
        if (current is greater than my value) {
            current.leftNode = deleteNodeFromTree(current.leftNode, value)
        } else {
            current.rightNode = deleteNodeFromTree(current.rightNode, value)
        }
    }
    else {
        // simplest case shown: current is a leaf. A node with children
        // must hand back a replacement from its subtrees instead of null.
        free current's memory
        current = null
    }
    return current
}
Obviously, there are many other ways to write this code, but from my experience, this has turned out to be the most effective method. Note that performance isn't really hit by overwriting pointers, since the nodes being written to were just visited and are most likely still in cache. If you're looking into improving the performance of your search tree, I'd recommend looking into specialized trees, like self-balancing ones (AVL trees, red-black trees) or B-trees.

Related

Why Is My Binary Tree Overwriting The Leaves Of Its Root?

I've pinpointed my issue to this specific function, it's the helper function for my binary tree. Before this function call there is a node but instead of growing it seemingly just replaces that node. When I look at my code in my head it all makes sense but I can't figure out what I'm doing wrong.
Here is the function that calls add:
void BSTree::Insert(Client &newClient) {
if (isEmpty())
{
Node *newNode = new Node(newClient);
this->root = newNode;
}
else
add(this->root, newClient);
}
and here is my add() function:
BSTree::Node* BSTree::add(Node *node, Client &newClient) // helper function for Insert()
{
if (node == nullptr)
{
Node *newNode = new Node(newClient);
//node = newNode; // already tried adding this in
return newNode;
}
if (newClient.clientID < node->pClient->clientID)
return node->left = add(node->left, newClient); // already tried just returning add()
else
return node->right = add(node->right, newClient);
}
Since this is your question, I will explain what your code is doing. Imagine you have a mature binary tree already and you are adding a node to your tree. By the time you reach this line
return node->left = add(node->left, newClient);
Three separate instructions are carried out:
newClient is added to the left branch of node by add().
the left child of node is set to the return value of add().
the right hand side (RHS) of the assignment is returned by the parent function.
The issue is with number 2, in combination with number 3. add() returns the child link it has just assigned rather than node itself, so one level up the caller stores a grandchild (ultimately the newly created node) into its own left or right pointer. If the tree you are adding to is mature already, this replaces an existing subtree on every insert, which is the overwriting effect that you're observing. In fact, the problem goes beyond overwriting leaves. Since you use the new keyword, the overwritten nodes still have allocated heap space, are never deleted, and leak memory.
Here are some thoughts to get you headed in the right direction:
Your Insert() function ensures that the first time you call add(), you are not passing nullptr as the first argument. Take advantage of that and ensure nullptr is never passed into add() by checking for nullptr before you make the recursive call. Change the return type of add() to void. You then no longer need to check whether node is nullptr. Here's some pseudocode to guide you:
void add(node, val)
    if val < node.val
        if node.left exists
            add(node.left, val)
        else
            make a new object and set node.left to that object
    else
        if node.right exists
            add(node.right, val)
        else
            make a new object and set node.right to that object
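The pseudocode above could look roughly like this in C++. This is only a sketch: a plain int key stands in for the Client and its clientID, and the Node type here is hypothetical rather than the asker's BSTree::Node.

```cpp
#include <cassert>

// Hypothetical simplified node; an int key replaces Client/clientID.
struct Node {
    int val;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : val(v) {}
};

// Precondition: node is never null. The caller (Insert) handles the
// empty tree, and the recursion only descends into existing children.
void add(Node* node, int val) {
    assert(node != nullptr);
    if (val < node->val) {
        if (node->left != nullptr)
            add(node->left, val);
        else
            node->left = new Node(val);
    } else {
        if (node->right != nullptr)
            add(node->right, val);
        else
            node->right = new Node(val);
    }
}
```

Because nothing is returned and existing links are never reassigned, an insert can only ever fill an empty child slot.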
There is a problem with your logic. First of all, there is the insert() method which you should write like this for better understanding:
void BSTree::Insert(const Client &newClient) // use const to prevent modification
{
if (isEmpty()) { root = new Node(newClient); }
else { add(this->root, newClient); }
}
This way you are creating a new object at root directly with the help of 'root' pointer in BSTree.
Now, about the add() method. The 'node' you are passing as a parameter is a copy of the pointer variable, so assigning to it inside add() does not change the caller's pointer. See this:
BSTree::Node* BSTree::add(Node *node, Client &newClient) //logical error
You need to pass the Node* by reference like this using 'Node* &node':
BSTree::Node* BSTree::add(Node* &node, const Client &newClient)
Why is your binary tree overwriting the leaves of its root? Answer:
Your recursive call combined with the return statement is wrong.
return node->left = add(node->left, newClient);
add(node->left, newClient) returns the address of the child it assigned, not of node, and you return that value in turn. As the recursion unwinds, each caller therefore stores the newly created leaf in its own left or right pointer.
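A minimal sketch of the pass-by-reference variant, under some assumptions: the Node here stores a copy of the Client in a member named client (the asker's Node holds a pClient pointer instead), the Client type is reduced to its clientID, and add() returns void rather than a Node*.

```cpp
// Hypothetical stand-ins for the asker's types.
struct Client {
    int clientID;
};

struct Node {
    Client client;            // simplified: a copy instead of pClient
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const Client& c) : client(c) {}
};

// `node` is a reference to the link being examined: the tree's root
// pointer on the first call, then some node's left or right member.
void add(Node*& node, const Client& newClient) {
    if (node == nullptr) {
        node = new Node(newClient);   // writes straight into the parent's link
        return;
    }
    if (newClient.clientID < node->client.clientID)
        add(node->left, newClient);
    else
        add(node->right, newClient);
}
```

With this shape, calling add(root, newClient) also handles an empty tree, because the reference then refers to the root pointer itself.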
Conclusion: Since there are several bugs, I would suggest you carefully rewrite the logic.
I hope this helps! :-)

Inserting Before/After Node in Linked List

I am trying to insert nodes in a list based on the value of a data member. Basically, if the member isVip evaluates to true, that node gets precedence, and should be inserted ahead of any regular node (but behind any existing VIP nodes). Regular nodes simply get added at the end of the list.
I'm pretty sure I have a good idea of how to use two pointers to step through the list and insert elements for n > 2 where n is the number of current list members, but I'm sort of conceptually stuck for the case when there's only one node.
Here is my working version of code below:
void SelfStorageList::rentLocker(Locker e) {
int count = 0;
LockerNode *p = head;
if (isEmpty()) {
head = new LockerNode(e);
tail = head;
}
for(;p!=0;count++, p=p->next) {
if(count == 1) {
if (e.isVip) {
if(p->objLocker.isVip) {
LockerNode*p = new LockerNode(e, p->next);
}
}
}
}
As you can see, I'm checking to see if the passed in object is VIP, and then whether the current one is. Here, I've hit some trouble. Assuming both are VIP, will this line:
LockerNode*p = new LockerNode(e, p->next);
put the passed-in locker object in the correct place (i.e. after the current VIP one)? If so, would:
LockerNode*p = new LockerNode(e, p);
equivalently place it before? Is the use or absence of the 'next' member of the node what defines the placement location, or is it something entirely different?
Hope someone can clear my doubts, and sorry if it seems a foolish question! Thanks!
Simply iterate over the list while the next node has isVip set (current->next->objLocker.isVip). After the iteration, the last node visited will be the last one with isVip set, and you should insert the new node after that one.
It can be implemented in fewer lines, without the explicit isEmpty check, and without any counter. Even less than that if you use a standard container instead.
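One way this could look, as a sketch only: the Locker fields (including the id member), the LockerNode constructor, and the destructor are assumptions made to keep the example self-contained. Instead of tracking a "previous" node, it walks a pointer to the link that should end up pointing at the new node, which removes the empty-list special case and the counter.

```cpp
// Hypothetical minimal types modelled on the question.
struct Locker {
    bool isVip;
    int id;        // assumed field, only used to tell lockers apart
};

struct LockerNode {
    Locker objLocker;
    LockerNode* next;
    LockerNode(const Locker& l, LockerNode* n = nullptr) : objLocker(l), next(n) {}
};

struct SelfStorageList {
    LockerNode* head = nullptr;
    LockerNode* tail = nullptr;

    void rentLocker(const Locker& e) {
        // `link` is the pointer that will be made to refer to the new node.
        LockerNode** link = &head;
        if (e.isVip) {
            // Skip past the existing VIP nodes at the front.
            while (*link != nullptr && (*link)->objLocker.isVip)
                link = &(*link)->next;
        } else if (tail != nullptr) {
            link = &tail->next;   // regular lockers go to the end
        }
        *link = new LockerNode(e, *link);
        if ((*link)->next == nullptr)
            tail = *link;         // the new node became the last one
    }

    ~SelfStorageList() {
        while (head != nullptr) {
            LockerNode* rest = head->next;
            delete head;
            head = rest;
        }
    }
};
```

In terms of the original question: new LockerNode(e, p->next) only builds a node whose successor is p->next. The node ends up in the list once some existing pointer (head or a predecessor's next) is assigned to refer to it, which is what the write through link does here.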

Delete node from linked list (recursively)

I am currently working on the following recursive bool erase function that takes a list and an int as arguments and returns true if the int was found and deleted, and false if it was not found in the list. It seems to work, but the problem is that it deletes the next int in the list, and not the current one:
typedef struct E_Type * List;
struct E_Type
{
int data;
List next = 0;
};
bool erase(const List & l, int data){
List current = l;
if (current == 0)
{
return false;
}
else if (current->data == data)
{
List deleteNode = new E_Type;
deleteNode = current->next;//probably this causes the error, but how can I point it to the current without crashing the program
current->next = deleteNode->next;
delete deleteNode;
return true;
}
else if (current->data != data)
{
return erase(current->next, data);
}
}
There are two basic types of lists:
single-linked lists (each node knows its next node) and
double-linked lists (each node knows its next as well as its previous node).
If, like in your case, one has a single-linked list, you must not check the CURRENT node for equality to 'data', because at that point it is too late to change the next pointer of the previous node. So you always have to check the NEXT node for equality, like this:
bool erase(List & l, int data) // non-const reference, so the caller's pointer can be updated
{
    List current = l;
    if (current == 0)
        return false;
    // special case: node to be deleted is the first one
    if (current->data == data)
    {
        l = current->next; // unlink the node before freeing it
        delete current;
        return true;
    }
    if (current->next && current->next->data == data) // next exists and must be erased
    {
        List deleteNode = current->next; // Step 1: save ptr to next
        current->next = deleteNode->next; // Step 2: reassign current->next ptr
        delete deleteNode; // Step 3: delete the node
        return true;
    }
    return erase(current->next, data);
}
Note: I dropped your last 'else if' condition. The 'else' is unnecessary because the previous 'if' ends in a return, and the 'if' is unnecessary because its condition is just the negation of the previous one, which always holds if the program gets this far.
Regards
The only node you're considering is the current one, so you must have a provision for modifying l (which means l has to be passed as a non-const reference, List & l):
if (current->data == data)
{
l = current->next;
delete current;
return true;
}
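Put together, a complete version along these lines might look as follows. It is a sketch that assumes the signature is changed to take List& instead of const List&, so that l can be reassigned.

```cpp
struct E_Type;
typedef E_Type* List;
struct E_Type {
    int data;
    List next = nullptr;
};

// `l` refers to whichever pointer leads to the current node: the caller's
// head pointer on the first call, then each node's `next` member.
bool erase(List& l, int data) {
    if (l == nullptr)
        return false;              // reached the end without a match
    if (l->data == data) {
        List doomed = l;
        l = l->next;               // bypass the node in the owning pointer
        delete doomed;
        return true;
    }
    return erase(l->next, data);   // recurse on the next link itself
}
```

Because the recursion passes the next member by reference, no separate "previous node" bookkeeping or first-node special case is needed.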
Here are some pointers.
An iterative approach
When you're iterating over your list, maintaining a pointer to the current element is not enough. You also need to maintain a pointer to the previous element, since you will need to fix up previous->next if you delete the current element.
On top of that, deleting the first element of the list will require special handling.
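A sketch of the iterative approach just described, using a hypothetical Node type and function name, with the head passed by reference so the first-element case can update it:

```cpp
// Hypothetical node type for illustration.
struct Node {
    int data;
    Node* next;
};

// Removes the first node holding `data`; returns whether one was found.
bool eraseIterative(Node*& head, int data) {
    Node* previous = nullptr;
    for (Node* current = head; current != nullptr;
         previous = current, current = current->next) {
        if (current->data != data)
            continue;
        if (previous == nullptr)
            head = current->next;            // special case: first element
        else
            previous->next = current->next;  // fix up previous->next
        delete current;
        return true;
    }
    return false;
}
```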
A recursive approach
Write a recursive function that will take a pointer to the head of the list, find & delete the required element, and return a pointer to the new head of the list. To do this, you need to:
Define and implement the base case. Handling one-element lists seems like a natural candidate.
Define the recursion. There are two cases: either the head of the list is the element you're looking for, or it isn't. Figure out what you need to do in both cases, and take it from there.
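As a rough sketch of that recursive shape (the Node type and the name eraseFirst are hypothetical, and the empty list is used as the base case here rather than the one-element list, which is a choice of this sketch):

```cpp
// Hypothetical node type for illustration.
struct Node {
    int data;
    Node* next;
};

// Returns the head of the list that remains after removing the first
// node whose data matches.
Node* eraseFirst(Node* head, int data) {
    if (head == nullptr)
        return nullptr;                  // base case: nothing to delete
    if (head->data == data) {
        Node* rest = head->next;         // the head is the match
        delete head;
        return rest;
    }
    head->next = eraseFirst(head->next, data);  // otherwise fix up the tail
    return head;
}
```

The caller writes head = eraseFirst(head, value);. Note that this form no longer reports whether anything was deleted; the bool from the original signature would have to be returned some other way.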
If you have a list:
A --> B --> C --> D
And you want to delete C, you have to:
Store C in a temp variable
Change B->next=C->next
delete C.
So you need to find B to be able to modify it.
You should certainly not create any new instance of E_Type.
Your condition
else if (current->data == data)
will stop on the node which has the value data. You then go on to delete the node after this node in your code.
If you want to keep the rest of the code the same, then that line should be:
else if ((current->next)->data == data)
with an extra check, in case the first element is the only element in the list.
A simpler way would be to keep a pointer that points to the element before the current element, and then deleting the node which is referred by that pointer.
You will need to change the next pointer of the preceding entry. So everything is fine, but you have to check current->next->data against data, not current->data.
Be sure to check for NULL-pointers in case current is the last entry in the list!
When you delete a node from a list, you need to point the previous node to the next one. Since you have a singly linked list, there are 2 options:
Maintain a pointer to the previous node in your erase function. When encountering the desired node, link the previous node to current->next and delete the current node. Needs special treatment for the first node in the list.
When you encounter the desired node, copy the content of current->next into current, then delete current->next. This way you don't need an extra parameter in your function. Needs special treatment for the last node in the list.
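The second option can be sketched as follows. The Node type and the name eraseByCopy are hypothetical, and this sketch simply refuses to handle the last node, which is the case that needs special treatment.

```cpp
// Hypothetical node type for illustration.
struct Node {
    int data;
    Node* next;
};

// Removes the value held by `node` by copying its successor into it and
// freeing the successor. Returns false for a null node or the last node,
// since there is nothing to copy from in those cases.
bool eraseByCopy(Node* node) {
    if (node == nullptr || node->next == nullptr)
        return false;
    Node* successor = node->next;
    node->data = successor->data;   // current takes over the next node's content
    node->next = successor->next;
    delete successor;
    return true;
}
```

One consequence of this trick is that any outside pointer to the successor node becomes dangling, because the successor is the node that is actually freed.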

How does the std::map iterator work?

The C++ STL class std::map implements O(log(n)) look-up using a binary tree. But with trees, it's not immediately obvious how an iterator would work. What does the ++ operator actually mean in a tree structure? Whereas the concept of "next element" has an obvious implementation in an array, for me it's not so obvious in a tree. How would one implement a tree iterator?
For an inorder traversal (probably works for others too), if you have a parent-pointer in your nodes you can do a non-recursive traversal. It should be possible to just store two pointers in your iterator: you need an indication of where you are, and you'll probably (I'm not doing the research now) need something like a "previous" pointer so you can figure out your current movement direction (i.e. do I need to go into the left subtree, or did I just come back from it).
"Previous" will probably be something like "parent", if we've just entered the node; "left" if we're coming back from the left subtree, "right" if we are coming back from the right subtree, and "self" if the last node we returned was our own.
I would like to add my two cents worth as a comment, but since I am not able to I shall have to add an answer.  I have been googling and was frustrated because all the answers I found, these excepted, assumed a stack or some other variably-sized data structure. I did find some code.  It shows that it can be done without a stack but I found it hard to follow and so decided to attack the problem from first principles.
The first thing to note is that the algorithm is "left-greedy".  Thus, when we start at the root we immediately go as far left as possible, since the leftmost node is the one we need first.  This means that we never need to consider the left-subtree.  It has already been iterated over.
The order of iteration is left subtree, node, right subtree.  So if we are positioned at a given node we know that its left subtree and the node itself have been visited and that we should next visit the right subtree, if any, going as far left as possible.
Otherwise, we must go up the tree. If we are going from a left child to its parent then the parent comes next. (Afterwards we will visit its right subtree, as already covered.)
The final case is when we are going from a right child to its parent. The parent has been visited already so we must go up again. In fact we must keep going up until we reach the root of the tree, or find ourselves moving to a parent from its left child. As we have already seen, the parent is the next node in this case. (The root may be indicated by a null pointer, as in my code, or some special sentinel node.)
The following code could easily be adapted for an STL-style iterator
// Go as far left from this node as you can.
// i.e. find the minimum node in this subtree
Node* Leftmost(Node* node)
{
if (node == nullptr)
return nullptr;
while (node->left != nullptr)
node = node->left;
return node;
}
// Start iterating from a root node
Node* First(Node* root)
{
return Leftmost(root);
}
// The iteration is currently at node. Return the next node
// in value order.
Node* Next(Node* node)
{
// Make sure that the caller hasn't failed to stop.
assert(node != nullptr);
// If we have a right subtree we must iterate over it,
// starting at its leftmost (minimal) node.
if (node->right != nullptr)
return Leftmost(node->right);
// Otherwise we must go up the tree
Node* parent = node->parent;
if (parent == nullptr)
return nullptr;
// A node comes immediately after its left subtree
if (node == parent->left)
return parent;
// This must be the right subtree!
assert(node == parent->right);
// In which case we need to go up again, looking for a node that is
// its parent's left child.
while (parent != nullptr && node != parent->left)
{
node = parent;
parent = node->parent;
}
// We should be at a left child!
assert(parent == nullptr || node == parent->left);
// And, as we know, a node comes immediately after its left subtree
return parent;
}
Consider the set of all elements in the map that are greater than the current element (that is, not less than it and not the element itself). The "next element" is the smallest element of that set.
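This definition can be checked against std::map directly: upper_bound(key) returns an iterator to the first element whose key is greater than key, which is the same element that incrementing the iterator at key reaches. The helper name below is hypothetical and exists only for illustration.

```cpp
#include <iterator>
#include <map>
#include <string>

// Returns true when incrementing the iterator at `key` lands on the same
// position as upper_bound(key). Returns false if `key` is not in the map.
bool nextMatchesUpperBound(const std::map<int, std::string>& m, int key) {
    auto it = m.find(key);
    return it != m.end() && std::next(it) == m.upper_bound(key);
}
```

For the largest key both sides are end(), so the comparison still holds.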
In order to use a map, you must have a key. And that key must implement a "less than" operation. This determines the way the map is formed, such that the find, add, remove, increment, and decrement operations are efficient.
Generally the map internally uses a tree of some kind.
For a standard implementation of the map iterator's operator++, look in stl_tree.h (libstdc++):
_Self&
operator++() _GLIBCXX_NOEXCEPT
{
_M_node = _Rb_tree_increment(_M_node);
return *this;
}
_Rb_tree_increment implementation is discussed here

Exception free tree destruction in C++

I have recently managed to get a stack overflow when destroying a tree by deleting its root 'Node', while the Node destructor is similar to this:
Node::~Node(){
for(int i=0;i<m_childCount;i++)
delete m_child[i];
}
A solution that came to mind was to use my own stack, deleting the tree this way:
std::stack< Node* > toDelete;
if(m_root)
toDelete.push(m_root);
while(toDelete.size()){
Node *node = toDelete.top();
toDelete.pop();
for(int i=0;i<node->GetChildCount();i++)
toDelete.push(node->Child(i));
delete node;
}
But in there the std::stack::push() may throw an exception. Is it possible to write an exception free tree destruction? How?
EDIT:
If anybody is interested here is an exception free non-recursive code inspired by the algorithm pointed out by jpalecek:
Node *current = m_root;
while(current){
if(current->IsLeaf()){
delete current;
return;
}
Node *leftMostBranch = current;// used to attach right subtrees
// delete all right childs
for(size_t i=1; i<current->GetChildCount(); i++){
while(!leftMostBranch->Child(0)->IsLeaf())
leftMostBranch = leftMostBranch->Child(0);
delete leftMostBranch->Child(0);
leftMostBranch->Child(0) = current->Child(i);
}
// delete this node and advance to the left child
Node *tmp = current;
current = current->Child(0);
delete tmp;
}
note: Node::IsLeaf() is equivalent to Node::GetChildCount()==0.
I just had this as an interview question.
And I must admit this is one of the hardest things I had to solve on the fly.
Personally I don't think it's a good question, as you may know the trick (if you have read Knuth), in which case it becomes trivial to solve, but you can still fool the interviewer into thinking you have solved it on the fly.
This can be done assuming that the node stores child pointers in a static structure. If the node stores child pointers in a dynamic structure then it will not work, as you need to re-shape the tree on the fly (it may work but there is no guarantee).
Surprisingly the solution is O(n)
(Technically every node is visited exactly twice with no re-scanning of the tree).
This solution uses a loop (so no memory usage for the stack) and does not dynamically allocate memory to hold nodes that need to be deleted. So it is surprisingly efficient.
#include <algorithm> // std::max
#include <cstddef>   // NULL

struct Node
{
    // Value we do not care about.
    int   childCount;
    Node* children[MAX_CHILDREN]; // MAX_CHILDREN is defined elsewhere
};

// Declared before freeTree() so it can be called from there.
Node* findBottomLeft(Node* node)
{
    while ((node->childCount > 0) && (node->children[0] != NULL))
    {
        node = node->children[0];
    }
    return node;
}

void freeTree(Node* root)
{
    if (root == NULL)
    {
        return;
    }
    Node* bottomLeft = findBottomLeft(root);
    while (root != NULL)
    {
        // We want to use a loop not recursion.
        // Thus we need to make the tree into a list.
        // So as we hit a node move all children to the bottom left.
        for (int loop = 1; loop < root->childCount; ++loop)
        {
            bottomLeft->children[0] = root->children[loop];
            bottomLeft->childCount = std::max(1, bottomLeft->childCount);
            bottomLeft = findBottomLeft(bottomLeft);
        }
        // Now we have a root with a single child.
        // Now we can delete the node and move to the next node.
        Node* bad = root;
        root = root->children[0];
        delete bad; // Note the delete should no longer destroy the children.
    }
}
The above method will work as long as there is always a children[0] slot (even if it is NULL), that is, as long as you do not have to dynamically allocate space to hold children[0]. Alternatively, just add one more pointer to the node object to hold the delete list and use this to turn the tree into a list.
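The alternative with one extra pointer per node might be sketched like this. Everything here is an assumption made for illustration: the member name nextToDelete, the MAX_CHILDREN value, the returned count, and the premise that Node has no destructor that deletes its children.

```cpp
#include <cstddef>

const std::size_t MAX_CHILDREN = 4;   // hypothetical fan-out

struct Node {
    std::size_t childCount = 0;
    Node* children[MAX_CHILDREN] = {};
    Node* nextToDelete = nullptr;     // the extra pointer forming the delete list
};

// Frees every node reachable from root without recursion and without
// allocating anything. Returns the number of nodes freed.
std::size_t freeTree(Node* root) noexcept {
    std::size_t freed = 0;
    Node* head = root;   // next node to free
    Node* tail = root;   // last node appended to the delete list
    while (head != nullptr) {
        // Append the children of the node about to be freed to the list.
        for (std::size_t i = 0; i < head->childCount; ++i) {
            Node* child = head->children[i];
            if (child == nullptr)
                continue;
            tail->nextToDelete = child;
            tail = child;
        }
        Node* done = head;
        head = head->nextToDelete;
        delete done;
        ++freed;
    }
    return freed;
}
```

Each node is appended to the list once and freed once, so the work is linear in the number of nodes, at the cost of one pointer of storage per node.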
This is what all garbage collectors struggle with. However, the best thing you can do (IMHO) is to pray for enough memory for the stack, and your prayers will be heard 99.99999% of the time. Should it not happen, just abort().
BTW if you are interested, there is a solution to traverse long (and deep) trees without allocating much memory.
Why is the original code throwing an exception? I'm guessing you are doing something like using the same node object in multiple places in the tree. Stack overflows are rarely caused by normal expected situations. Stack overflows are not a problem, they are the symptom of a problem.
Rewriting the code differently won't fix that; you should just investigate & fix the error.
Is it possible to write an exception free tree destruction? How?
Perhaps this (untested code):
void destroy(Node* parent)
{
    while (parent)
    {
        //search down to find a leaf node, which has no children
        Node* leaf = parent;
        while (leaf->children.count != 0)
            leaf = leaf->children[0];
        //remember the leaf's parent
        parent = leaf->parent;
        //delete the leaf
        if (parent)
        {
            parent->children.remove(leaf);
        }
        delete leaf;
    } //while (parent)
}