Find nth smallest element in Binary Search Tree - c++

I have written an algorithm for finding the nth smallest element in a BST, but it returns the root node instead of the nth smallest one. If you insert nodes in the order 7 4 3 13 21 15, then find(root, 0) returns the node with value 7 instead of 3, and find(root, 1) returns 13 instead of 4. Any thoughts?
Binode* Tree::find(Binode* bn, int n) const
{
if(bn != NULL)
{
find(bn->l, n);
if(n-- == 0)
return bn;
find(bn->r, n);
}
else
return NULL;
}
and definition of Binode
class Binode
{
public:
int n;
Binode* l, *r;
Binode(int x) : n(x), l(NULL), r(NULL) {}
};

A plain binary search tree cannot retrieve the n-th smallest element efficiently: without extra information you have to walk the nodes in order, which takes O(n) time. It does become possible if each node stores an integer counting the nodes in its entire subtree. From my generic AVL tree implementation:
static BAVLNode * BAVL_GetAt (const BAVL *o, uint64_t index)
{
if (index >= BAVL_Count(o)) {
return NULL;
}
BAVLNode *c = o->root;
while (1) {
ASSERT(c)
ASSERT(index < c->count)
uint64_t left_count = (c->link[0] ? c->link[0]->count : 0);
if (index == left_count) {
return c;
}
if (index < left_count) {
c = c->link[0];
} else {
c = c->link[1];
index -= left_count + 1;
}
}
}
In the above code, node->link[0] and node->link[1] are the left and right children of node, and node->count is the number of nodes in the entire subtree of node.
The above algorithm has O(log n) time complexity, assuming the tree is balanced. Also, if you keep these counts, another operation becomes possible: given a pointer to a node, you can efficiently determine its index (the inverse of what you asked for). In the code I linked, this operation is called BAVL_IndexOf().
Be aware that the node counts need to be updated as the tree is changed; this can be done with no (asymptotic) change in time complexity.
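To illustrate how the counts can be kept up to date, here is a minimal sketch on a plain, unbalanced BST. CountedNode, insert, getAt and destroy are hypothetical names for this example and are not part of the BAVL code above; a balanced tree would also have to fix the counts during rotations, which is not shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type for illustration; not the BAVL structures above.
struct CountedNode {
    int value;
    std::size_t count;      // number of nodes in this subtree, including this one
    CountedNode* left;
    CountedNode* right;
    explicit CountedNode(int v) : value(v), count(1), left(nullptr), right(nullptr) {}
};

// Plain BST insert that increments the count of every node on the search path.
void insert(CountedNode*& root, int value) {
    if (root == nullptr) {
        root = new CountedNode(value);
        return;
    }
    ++root->count;
    insert(value < root->value ? root->left : root->right, value);
}

// Returns the node with the given zero-based rank, or nullptr if out of range.
const CountedNode* getAt(const CountedNode* node, std::size_t index) {
    if (node == nullptr || index >= node->count) {
        return nullptr;
    }
    while (true) {
        assert(node != nullptr);
        std::size_t leftCount = node->left ? node->left->count : 0;
        if (index == leftCount) {
            return node;
        }
        if (index < leftCount) {
            node = node->left;
        } else {
            index -= leftCount + 1;   // skip the left subtree and this node
            node = node->right;
        }
    }
}

// Frees the whole tree.
void destroy(CountedNode* node) {
    if (node == nullptr) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}
```

With the values from the question inserted in the order 7 4 3 13 21 15, getAt(root, 0) yields the node holding 3 and getAt(root, 1) the node holding 4.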

There are a few problems with your code:
1) find() returns a value (the correct node, assuming the function is working as intended), but you don't propagate that value up the call chain, so top-level calls don't know about the (possible) found element.
Binode* elem = NULL;
elem = find(bn->l, n);
if (elem) return elem;
if(n-- == 0)
return bn;
elem = find(bn->r, n);
return elem; // here we don't need to test: we need to return regardless of the result
2) Even though you decrement n at the right place, the change does not propagate upward in the call chain. You need to pass the parameter by reference (note the & after int in the function signature), so the change is made on the original value, not on a copy of it.
Binode* Tree::find(Binode* bn, int& n) const
I have not tested the suggested changes, but they should point you in the right direction.
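Putting both changes together, a minimal self-contained sketch could look like this. It uses a free function instead of the Tree member from the question, and findNth and insert are hypothetical helpers added only so the example can be exercised.

```cpp
#include <cassert>
#include <cstddef>

struct Binode {
    int n;
    Binode* l;
    Binode* r;
    explicit Binode(int x) : n(x), l(NULL), r(NULL) {}
};

// In-order search. n is shared by all recursive calls (passed by reference)
// and counts how many nodes still have to be skipped.
Binode* find(Binode* bn, int& n) {
    if (bn == NULL) return NULL;
    Binode* elem = find(bn->l, n);
    if (elem != NULL) return elem;   // found in the left subtree: pass it up
    if (n-- == 0) return bn;         // this node is the one we want
    return find(bn->r, n);           // otherwise the answer, if any, is on the right
}

// Hypothetical wrapper so callers can pass a literal index.
Binode* findNth(Binode* root, int index) {
    assert(index >= 0);
    int remaining = index;
    return find(root, remaining);
}

// Hypothetical plain BST insert, only needed to build a test tree.
void insert(Binode*& root, int x) {
    if (root == NULL) {
        root = new Binode(x);
    } else if (x < root->n) {
        insert(root->l, x);
    } else {
        insert(root->r, x);
    }
}
```

For the input order 7 4 3 13 21 15, findNth(root, 0) now returns the node with value 3 and findNth(root, 1) the node with value 4; an index past the end returns NULL.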

Related

Recursive function to return last node of a heap

I am trying to return the last node of a binary heap (implemented with pointers, not an array). Here, 'last' means the bottom-most node, counting from the left, that does not have two children, i.e. the node to which I am supposed to append the new node. The insert function binds data to a new node, then calls this function to get the last node, so I can attach the new node on the left or the right depending on which child nodes are present.
The problem is that the right side of the tree is always empty: insertion never gets past the level below the root on that side. It seems to stick to the leftmost side, because every recursive call reaches the exit condition first on the left.
The recursive function first checks the node with a helper, which returns 0 if the node has no children, 1 if it has only a left child, and 2 if it has two children. Here is the function:
template<typename T>
node<T> * heap<T>::find_insert_pos (node<T> *x, int &res) {
if(find_insert_poshelper(x, res) == 0) return x;
else if(find_insert_poshelper(x, res) == 1) return x;
else {
node<T> *a = find_insert_pos(x->left, res);
if(find_insert_poshelper(a, res) != 2) return a;
else return find_insert_pos(a, res);
node<T> *b = find_insert_pos(x->right, res);
if(find_insert_poshelper(b, res) != 2) return b;
else return find_insert_pos(b, res);
}
}
I've tried to figure it out, but insertion still goes wrong. The other functions used in insertion have been more than triple-checked.
(By the way, 'res' is always passed by reference in this chunk of code.)
I have changed the logic behind the function. Instead of only checking each node for children, I now check whether the node being evaluated has children; if it does, I go one step further and check each of those children, left and right, to see whether any of the grandchildren have children themselves.
If they don't, I repeat this for the next level, moving from the root level 0 to level 1 and so on, until one of the child nodes does not have two children, and then return x.left or x.right, depending on the case.
-- Final edit --
It is hard to come up with an MRE since this was more about logic. The question was posted by someone in need of practice with recursion, and that practice happened. All the logic changed, even for the sub-functions.
However, the first three levels must be filled manually (full meaning every node has two children) before calling this operation, because it checks three levels down. With that done properly, I get a beautiful heap.
I can show an attempt at an MRE of how I find the bottom node to append a new node to. It is not complete, since I leave out the code of the 'insert' function, which is part iterative (first three levels) and part recursive (the original question: finding the parent node for the new node). The insert operation works like this: I create a new node dynamically, then search for the parent node to append the new data to (the iterative part runs until the 8th node of the tree is reached, following a path similar to BFS). Once the position is retrieved (that is, the pointer itself), I test whether to insert left or right, following the rules of the heap. Starting at the 8th node, the parent node is found recursively as follows:
First, the recursive function itself:
node * function_a (node *x, int &res) {
node *temp = function_b (x, res);
if(temp != ptr_null) return temp;
else {
if(x->L != ptr_null) function_a (x->L, res);
if(x->R != ptr_null) function_a (x->R, res);
return temp;
}
}
A sub-function :
node * function_b (node *x, int &res) {
node *a = x->L;
node *b = x->R;
int aL = function_c (a->L, res);
int aR = function_c (a->R, res);
int bL = function_c (b->L, res);
int bR = function_c (b->R, res);
if(aL != 2) return a->L;
else if(aR != 2) return a->R;
else if(bL != 2) return b->L;
else if(bR != 2) return b->R;
else return ptr_null;
}
And another :
int & function_c (node *x, int &res) {
if(x->L == ptr_null && x->R == ptr_null) return res = 0;
else if(x->L != ptr_null && x->R == ptr_null) return res = 1;
else return res = 2;
}
Since this checks 3 levels down from the x passed to function_a (in this case the root), I can't make it 100% recursive that way or I get a segmentation fault. How can I improve my algorithm?
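One common alternative, not taken from the post, is to drop the fixed three-level lookahead and scan the tree level by level, returning the first node that lacks a child. The sketch below assumes a hypothetical HeapNode<T> with plain left and right pointers; it only maintains the shape of the heap (restoring the heap order by sifting the new value up is left out), and each lookup may visit every node, so an insertion costs O(n).

```cpp
#include <cassert>
#include <queue>

// Hypothetical node type for illustration.
template <typename T>
struct HeapNode {
    T data;
    HeapNode* left;
    HeapNode* right;
    explicit HeapNode(const T& d) : data(d), left(nullptr), right(nullptr) {}
};

// Returns the first node in level order that does not have two children,
// i.e. the parent of the next insertion slot, or nullptr for an empty tree.
template <typename T>
HeapNode<T>* findInsertParent(HeapNode<T>* root) {
    if (root == nullptr) return nullptr;
    std::queue<HeapNode<T>*> pending;
    pending.push(root);
    while (!pending.empty()) {
        HeapNode<T>* current = pending.front();
        pending.pop();
        if (current->left == nullptr || current->right == nullptr) {
            return current;
        }
        pending.push(current->left);
        pending.push(current->right);
    }
    return nullptr;   // not reached: a finite tree always has a node missing a child
}

// Attaches a new node at the next free slot (left before right).
// Heap ordering is NOT restored here.
template <typename T>
HeapNode<T>* appendInShapeOrder(HeapNode<T>*& root, const T& value) {
    HeapNode<T>* fresh = new HeapNode<T>(value);
    HeapNode<T>* parent = findInsertParent(root);
    if (parent == nullptr) {
        root = fresh;
    } else if (parent->left == nullptr) {
        parent->left = fresh;
    } else {
        parent->right = fresh;
    }
    return fresh;
}
```

Because the queue visits nodes in the same order in which a complete tree is filled, no level has to be handled as a special case and no null pointer is ever dereferenced.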

Build minimum height BST from a sorted std::list<float> with C++

I'm having trouble writing the code to build minimum height BST from a sorted std::list.
For the node class:
class cBTNode
{
private:
cBTNode* m_LeftChild;
cBTNode* m_RightChild;
float m_Data;
};
For the BST class:
class cBTNodeTree
{
private:
cBTNode* m_Root;
public:
void LoadBalancedMain(std::list<float>& ls);
void LoadBalanced(std::list<float>& ls, cBTNode* root);
};
Implementation: (basically my method is to find the middle element of the list ls, put that into the root, put all the elements smaller than the middle element into ls_left, and all the elements bigger than it into ls_right. Then recursively build up the left and right subtree by recursively calling the same function on ls_left and ls_right)
void cBTNodeTree::LoadBalancedMain(std::list<float>& ls)
{
LoadBalanced(ls, m_Root); // m_Root is the root of the tree
}
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode* root)
{
// Stopping Condition I:
if (ls.size() <= 0)
{
root = nullptr;
return;
}
// Stopping Condition II:
if (ls.size() == 1)
{
root = new cBTNode(ls.front());
return;
}
// When we have at least 2 elements in the list
// Step 1: Locate the middle element
if (ls.size() % 2 == 0)
{
// Only consider the case of even numbers for the moment
int middle = ls.size() / 2;
std::list<float> ls_left;
std::list<float> ls_right;
int index = 0;
// Obtain ls_left consisting elements smaller than the middle one
while (index < middle)
{
ls_left.push_back(ls.front());
ls.pop_front();
index += 1;
}
// Now we reach the middle element
root = new cBTNode(ls.front());
ls.pop_front();
// The rest is actually ls_right
while (ls.size() > 0)
{
ls_right.push_back(ls.front());
ls.pop_front();
}
// Now we have the root and two lists
cBTNode* left = root->GetLeftChild();
cBTNode* right = root->GetRightChild();
if (ls_left.size() > 0)
{
LoadBalanced(ls_left, left);
root->SetLeftChild(left);
}
else
{
left = nullptr;
}
if (ls_right.size() > 0)
{
LoadBalanced(ls_right, right);
root->SetRightChild(left);
}
else
{
right = nullptr;
}
}
}
My question: somehow none of the elements actually gets inserted into the tree. For example, if I check the value of m_Root, the root of the tree, I get an error because it is still nullptr. I'm not sure where I went wrong. I hope it's some stupid pointer mistake because I haven't slept well. (I'm pretty sure the 'new cBTNode(ls.front())' line works.)
BTW, although I have written a dozen functions for the BST, I'm still struggling with BST recursion. I noticed that in all the textbooks I read, for the linked version of a BST, insertion ALWAYS needs a helper function that returns a pointer to a node. I'm beginning to feel that I don't actually understand what is going on behind the recursion...
1:
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode* root)
Here cBTNode* root is passed by value.
Instead, you should pass by reference & or cBTNode** (pointer to a pointer).
Passing by reference would be simple, you won't need to change anything except the function signature.
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode*& root)
Notice the & before root in the signature above.
2:
if (ls_right.size() > 0)
{
LoadBalanced(ls_right, right);
root->SetRightChild(left);
}
You are setting the right child to left instead of right, which is not what you want. It should be root->SetRightChild(right).
3:
cBTNode* left = root->GetLeftChild();
cBTNode* right = root->GetRightChild();
These are unnecessary.
4:
if (ls.size() % 2 == 0)
No need for two separate cases.
You can achieve this by just appropriately setting middle:
int middle = (ls.size()-1) / 2;
You pass the pointer to the root by value. Pass it by reference instead by changing the signature of LoadBalanced() appropriately.
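As a rough sketch of how these points fit together, the version below takes the root as a reference to a pointer and uses a single middle computation for both even and odd sizes. It is not the original class: Node, loadBalanced, buildBalanced, height and destroy are hypothetical names, and instead of copying sublists it walks the list once with an iterator, which is a swapped-in technique rather than what the question does.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical minimal node; the original cBTNode hides its members behind accessors.
struct Node {
    float data;
    Node* left;
    Node* right;
    explicit Node(float d) : data(d), left(nullptr), right(nullptr) {}
};

// Consumes `count` sorted values starting at `it` and stores the subtree in `root`.
// `root` is a reference to the caller's pointer, so the assignment is visible outside.
void loadBalanced(std::list<float>::const_iterator& it, std::size_t count, Node*& root) {
    if (count == 0) {
        root = nullptr;
        return;
    }
    std::size_t leftCount = (count - 1) / 2;    // same middle as in point 4
    Node* left = nullptr;
    loadBalanced(it, leftCount, left);           // builds the left subtree, leaves `it` on the middle
    root = new Node(*it);
    ++it;
    root->left = left;
    loadBalanced(it, count - leftCount - 1, root->right);
}

Node* buildBalanced(const std::list<float>& values) {
    Node* root = nullptr;
    std::list<float>::const_iterator it = values.begin();
    loadBalanced(it, values.size(), root);
    assert(it == values.end());                  // every value was consumed exactly once
    return root;
}

int height(const Node* node) {
    if (node == nullptr) return 0;
    int l = height(node->left);
    int r = height(node->right);
    return 1 + (l > r ? l : r);
}

void destroy(Node* node) {
    if (node == nullptr) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}
```

For a sorted list of seven values the root ends up holding the fourth value and the tree has height 3, the minimum possible for seven nodes.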

replace each node of a tree with sum of all values in left side of the node without using extra integer pointer argument

Given a binary tree, I need to change the value in each node to the sum of all the values in the nodes to its left. Essentially, each node should hold the sum of the values of all nodes visited up to and including it in an in-order traversal of the tree (see the example output). The important point is that this has to be done without an integer pointer argument. I am able to solve it with an integer pointer argument that holds the sum, like this. Without this integer pointer variable, how do I hold the sum when I visit the right side of a node from its parent?
void modifyBST(struct node *root, int *sum) {
if (root == NULL) return;
// Recur for left subtree
modifyBST(root->left, sum);
// Now *sum has the sum of the nodes visited so far, add
// root->data to sum and update root->data
*sum = *sum + root->data;
root->data = *sum;
// Recur for right subtree
modifyBST(root->right, sum);
}
How do I modify this method so that int *sum can be removed?
My complete program is here click here
Example tree:
inorder: 4 2 5 1 6 3 7
preorder: 1 2 4 5 3 6 7
output : 4 6 11 12 18 21 28
int modify(struct node* node,int sum)
{
if (node == NULL)
return 0;
int l=modify(node->left,sum);
int r=modify(node->right,sum+l+node->data);
int x=node->data;
node->data=node->data+l+sum;
return x+l+r;
}
Call the function using statement:
modify(root,0);
Full implementation : http://ideone.com/A3ezlk
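To check the function against the example from the question, here is a small self-contained sketch. The modify function is the one above; the node layout, inorder and runExample are assumptions added for the test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed node layout; the question only shows the data, left and right members.
struct node {
    int data;
    node* left;
    node* right;
};

// sum: total of all values that precede this subtree in in-order.
// Returns the total of the ORIGINAL values in this subtree.
int modify(node* n, int sum) {
    if (n == NULL) return 0;
    int l = modify(n->left, sum);
    int r = modify(n->right, sum + l + n->data);
    int x = n->data;
    n->data = n->data + l + sum;
    return x + l + r;
}

void inorder(const node* n, std::vector<int>& out) {
    if (n == NULL) return;
    inorder(n->left, out);
    out.push_back(n->data);
    inorder(n->right, out);
}

// Builds the example tree (preorder 1 2 4 5 3 6 7), modifies it and
// returns the new values in in-order.
std::vector<int> runExample() {
    node n4 = {4, NULL, NULL}, n5 = {5, NULL, NULL};
    node n6 = {6, NULL, NULL}, n7 = {7, NULL, NULL};
    node n2 = {2, &n4, &n5}, n3 = {3, &n6, &n7};
    node n1 = {1, &n2, &n3};
    int total = modify(&n1, 0);
    assert(total == 28);   // 1 + 2 + ... + 7
    std::vector<int> out;
    inorder(&n1, out);
    return out;
}
```

runExample() returns 4 6 11 12 18 21 28, matching the expected output in the question.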
One possible solution:
Rephrasing the question, what you really want to do is set each node's sum to the previous node's sum plus the value associated with that node, where node_n is the n-th node in in-order, i.e.
node_n->sum = node_(n-1)->sum + node_n->value
Right now, you are setting node_n->sum when you visit node_n, but the issue you are running into is that you don't have easy access to node_(n-1)->sum without passing it as a parameter. To work around this, I would suggest setting node_n->sum when you DO have easy access to node_(n-1)->sum, i.e. when you visit node_(n-1).
Some C code to show what I mean
void modifyBST(struct node *curNode)
{
struct node* next = treeSuccessor(curNode);
if(next != NULL)
{
next->sum = curNode->sum + next->value;
modifyBST(next);
}
}
The assumption is that curNode->sum has been set before modifyBST is called for that node, which is valid for the first node (as its sum is just equal to its value) and inductively valid for all other nodes.
You would use this method by finding the first node (the one with value 4 in your example), if necessary setting sum equal to its value, and calling modifyBST with that node as the argument.
TreeSuccessor is a fairly well-known algorithm; you can find pseudocode for it in many places online if you need it.
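For completeness, here is a sketch of the textbook successor routine combined with the function above. It assumes each node carries a parent pointer in addition to value and sum; that member, along with treeMinimum, attachChildren and modifyTree, is an assumption of this example and does not appear in the question.

```cpp
#include <cassert>
#include <cstddef>

// Assumed node layout: value and sum as in the answer, plus a parent pointer.
struct node {
    int value;
    int sum;
    node* left;
    node* right;
    node* parent;
    explicit node(int v) : value(v), sum(0), left(NULL), right(NULL), parent(NULL) {}
};

// Leftmost node of a non-empty subtree.
node* treeMinimum(node* n) {
    assert(n != NULL);
    while (n->left != NULL) n = n->left;
    return n;
}

// Next node in in-order, or NULL if n is the last one.
node* treeSuccessor(node* n) {
    if (n->right != NULL) return treeMinimum(n->right);
    node* p = n->parent;
    while (p != NULL && n == p->right) {   // climb while we come from a right child
        n = p;
        p = p->parent;
    }
    return p;
}

// As in the answer: curNode->sum must already be set.
void modifyBST(node* curNode) {
    node* next = treeSuccessor(curNode);
    if (next != NULL) {
        next->sum = curNode->sum + next->value;
        modifyBST(next);
    }
}

// Hypothetical entry point: seeds the first node, then walks the successors.
void modifyTree(node* root) {
    if (root == NULL) return;
    node* first = treeMinimum(root);
    first->sum = first->value;
    modifyBST(first);
}

// Hypothetical helper to wire up a test tree.
void attachChildren(node* parent, node* l, node* r) {
    parent->left = l;
    parent->right = r;
    if (l != NULL) l->parent = parent;
    if (r != NULL) r->parent = parent;
}
```

Note that modifyBST recurses once per node, so for very large trees a loop over treeSuccessor would be the safer shape.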
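I think the code below is close. I have not tested it, only checked it by hand.
I hope this helps:
int modifyBST(struct node *root) {
if (root == NULL) return 0;
// Caveat: the running sum is only passed upward, so nodes in a right subtree
// do not receive the sum accumulated before them (in the example tree, node 5 stays 5 instead of 11).
root->data += modifyBST(root->left);
return (root->data + modifyBST(root->right));
}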
Since the pointer argument sum always points to the same variable, we can easily get rid of it by making that variable global.
int sum;
void modifyNode(struct node *root)
{
if (!root) return;
modifyNode(root->left);
sum = root->data += sum;
modifyNode(root->right);
}
void modifyTree(struct node *root)
{
sum = 0, modifyNode(root);
}

getting the number of nodes in a binary search tree

I'm working on a method that returns the number of nodes in a binary search tree. When the tree has 3 nodes it gives me 3, but with 5 nodes it gives me 4. What do I need to change?
int BinaryTree::size(int count, Node *leaf) const
{
if(leaf != NULL)//if we are not at a leaf
{
size(count + 1, leaf->getLeft());//recurisvly call the function and increment the count
size(count + 1, leaf->getRight());
}
else
{
return count;//return the count
}
}
int BinaryTree::size(Node *leaf) const
{
if(leaf == NULL) { //This node doesn't exist. Therefore there are no nodes in this 'subtree'
return 0;
} else { //Add the size of the left and right trees, then add 1 (which is the current node)
return size(leaf->getLeft()) + size(leaf->getRight()) + 1;
}
}
While this is a different approach, I find that it is easier to read through than what you had.
Other people have already chimed in with a correct algorithm. I'm just going to explain why your algorithm doesn't work.
The logic behind your algorithm seems to be: keep a running count value. If the leaf is null then it has no children, so return the count; if the leaf is not null, then recurse down the children.
This is backwards, though. You need to pass the int by reference rather than by value, increment it when the node is not null (not when it is null), and then recurse.
So your original idea would work with some modifications, but Nick Mitchinson and arrows have a better way. This is your algorithm fixed so it works:
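//count must be a variable initialized to 0 by the caller;
//a non-const reference parameter cannot default to the literal 0
int BinaryTree::size(Node *leaf, int& count) const
{
if(leaf != NULL)//if this node exists
{
count++;//count the current node
size(leaf->getLeft(), count);//recursively count the left subtree
size(leaf->getRight(), count);//recursively count the right subtree
}
return count;//return the count
}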
But again, there are better ways to write this, and the other answers show them.
int BinaryTree::size(Node *n) const
{
if(!n)
return 0;
else
return size(n->getLeft()) + 1 + size(n->getRight());
}
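If you do want to keep the running counter, one way to hide it from callers is a small wrapper that owns it. This is only a sketch: it uses a hypothetical free-standing Node with public members instead of the BinaryTree class and its getLeft()/getRight() accessors, and countNodes and treeSize are made-up names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal node for illustration.
struct Node {
    int value;
    Node* left;
    Node* right;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Adds the number of nodes in the subtree to `count`.
// `count` is passed by reference, so every recursive call updates the same variable.
void countNodes(const Node* node, int& count) {
    if (node == nullptr) return;   // nothing to count
    ++count;                       // count this node
    countNodes(node->left, count);
    countNodes(node->right, count);
}

// The wrapper owns the counter, so callers never have to pass one in.
int treeSize(const Node* root) {
    int count = 0;
    countNodes(root, count);
    return count;
}
```

For a five-node tree treeSize returns 5, and for an empty tree it returns 0.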

C++ return an object or change object inside function

I'm new to C++ and still learning. I am trying to write an algorithm that builds a tree recursively. I would usually write it as in Method 1 below; however, since returning from the function makes a (I hope deep) copy of the RandomTreeNode, I am concerned about calling it recursively and would therefore prefer Method 2. Am I correct in my thinking?
Method 1
RandomTreeNode build_tree(std::vector<T>& data, const std::vector<function_ptr>& functions){
if(data.size() == 0 || data_has_same_values(data)){
RandomTreeNode node = RandomTreeNode();
node.setData(data);
return node;
}
RandomTreeNode parent = RandomTreeNode();
vector<T> left_data = split_data_left(data);
vector<T> right_data = split_data_right(data);
parent.set_left_child(build_tree(left_data, functions));
parent.set_right_child(build_tree(right_data, functions));
return parent;
}
Method 2
void build_tree(RandomTreeNode& current_node, vector<T> data){
if(data.size() == 0 || data_has_same_values(data)){
current_node.setData(data);
return; // leaf reached: stop recursing
}
vector<T> left_data = split_data_left(data);
vector<T> right_data = split_data_right(data);
RandomTreeNode left_child = RandomTreeNode();
RandomTreeNode right_child = RandomTreeNode();
current_node.set_left_child(left_child);
current_node.set_right_child(right_child);
build_tree(left_child, left_data);
build_tree(right_child, right_data);
}
Several improvements are possible.
First, you're copying a vector. As I understand the names of your functions, you're splitting a vector into two contiguous blocks ([left|right] and not [l|r|lll|r|...]). So, instead of passing a vector each time, you can just pass indices to specify the ranges.
Method 2, if implemented well, will be more memory-efficient, so you should build on the idea behind it.
Last, you can use an auxiliary function, which is better suited to the problem (a mix between Method 1 and Method 2).
Here is some sample code:
// first is inclusive
// last is not inclusive
void build_tree_aux(RandomTreeNode& current_node, std::vector<T>& data, int first, int last)
{
if(last == first || data_has_same_values(data,first,last))
{
current_node.setData(data,first,last);
// ...
return; // leaf reached: stop here, otherwise the recursion never ends
}
// Find new ranges
int leftFirst = first;
int leftLast = split_data(data,first,last);
int rightFirst = leftLast;
int rightLast = last;
// Instead of copying an empty node, we create the children
// of current_node, and then process these nodes
current_node.build_left_child();
current_node.build_right_child();
// Recursion; left_child() and right_child() return references
build_tree_aux(current_node.left_child(),data,leftFirst,leftLast);
build_tree_aux(current_node.right_child(),data,rightFirst,rightLast);
/*
// left_child() and right_child() are not really breaking encapsulation,
// because you can consider that the child nodes are not really a part of
// a node.
// But if you want, you can do the following:
current_node.build_tree(data,leftFirst,leftLast);
// Where RandomTreeNode::build_tree simply calls build_tree_aux on the two children
*/
}
RandomTreeNode build_tree(std::vector<T>& data)
{
RandomTreeNode root;
build_tree_aux(root,data,0,data.size());
return root;
}
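Since the sample above relies on members that are not shown, here is a compilable sketch of the same index-range idea. Everything in it is an assumption made for illustration: RangeNode stands in for RandomTreeNode and merely records the half-open range it covers, the data is a std::vector<int>, and the split rule is simply the midpoint.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for RandomTreeNode: remembers the range [first, last) it covers.
struct RangeNode {
    std::size_t first = 0;
    std::size_t last = 0;
    std::unique_ptr<RangeNode> left;
    std::unique_ptr<RangeNode> right;
    bool isLeaf() const { return !left && !right; }
};

// True if all values in [first, last) are equal (also true for an empty range).
bool allSame(const std::vector<int>& data, std::size_t first, std::size_t last) {
    for (std::size_t i = first + 1; i < last; ++i) {
        if (data[i] != data[first]) return false;
    }
    return true;
}

// Fills `node` in place; the vector is never copied, only the indices change.
void buildTreeAux(RangeNode& node, const std::vector<int>& data,
                  std::size_t first, std::size_t last) {
    assert(first <= last && last <= data.size());
    node.first = first;
    node.last = last;
    if (last - first <= 1 || allSame(data, first, last)) {
        return;                                    // leaf: nothing left to split
    }
    std::size_t mid = first + (last - first) / 2;  // placeholder split rule
    node.left.reset(new RangeNode());
    node.right.reset(new RangeNode());
    buildTreeAux(*node.left, data, first, mid);
    buildTreeAux(*node.right, data, mid, last);
}

RangeNode buildTree(const std::vector<int>& data) {
    RangeNode root;
    buildTreeAux(root, data, 0, data.size());
    return root;   // moved (or elided), so the subtrees are not deep-copied
}

std::size_t countLeaves(const RangeNode& node) {
    if (node.isLeaf()) return 1;
    return countLeaves(*node.left) + countLeaves(*node.right);
}
```

Owning the children through std::unique_ptr means returning the root by value only transfers pointers, which addresses the deep-copy concern from the question.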