getting the number of nodes in a binary search tree - c++

I'm working on a method that returns the number of nodes in a binary search tree. With 3 nodes it gives me 3, but with 5 nodes it gives me 4. What do I need to change?
int BinaryTree::size(int count, Node *leaf) const
{
if(leaf != NULL)//if we are not at a leaf
{
size(count + 1, leaf->getLeft());//recursively call the function and increment the count
size(count + 1, leaf->getRight());
}
else
{
return count;//return the count
}
}

int BinaryTree::size(Node *leaf) const
{
if(leaf == NULL) { //This node doesn't exist. Therefore there are no nodes in this 'subtree'
return 0;
} else { //Add the size of the left and right trees, then add 1 (which is the current node)
return size(leaf->getLeft()) + size(leaf->getRight()) + 1;
}
}
While this is a different approach, I find that it is easier to read than what you had.

Other people have already chimed in with a correct algorithm. I'm just going to explain why your algorithm doesn't work.
The logic behind your algorithm seems to be: keep a running count value. If the leaf is null then it has no children, so return the count; if the leaf is not null, then recurse down the children.
This is backwards, though. As written, the results of the two recursive calls are discarded, and the non-null branch never returns a value. You need to pass your int by reference, not by value, and increment it when the node is not null (rather than when it is null) before recursing.
So your original idea would work with some modifications, but Nick Mitchinson and arrows have a better way. This is your algorithm fixed so it works:
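int BinaryTree::size(Node *leaf, int& count) const//count must be a variable set to 0 by the caller; a non-const reference cannot take a default value
{
if(leaf != NULL)//if this node exists
{
count++;//count the current node
size(leaf->getLeft(), count);//recursively count the left subtree
size(leaf->getRight(), count);//recursively count the right subtree
}
return count;//return the count
}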
But again, there are better ways to write this, and the other answers show them.

int BinaryTree::size(Node *n) const
{
if(!n)
return 0;
else
return size(n->getLeft()) + 1 + size(n->getRight());
}
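To make the recursive count concrete, here is a minimal, self-contained sketch. The `Node` struct and the free function `countNodes` are stand-ins assumed for illustration; the question's `BinaryTree` class and its `getLeft()`/`getRight()` accessors are not shown, so this is not the asker's actual class.

```cpp
// Hypothetical minimal node type; the question's Node class is not shown.
struct Node {
    int value;
    Node* left;
    Node* right;
};

// Count the nodes in the subtree rooted at n.
// An empty subtree contributes 0; otherwise count the current node
// plus everything below it on both sides.
int countNodes(const Node* n) {
    if (n == nullptr)
        return 0;
    return countNodes(n->left) + 1 + countNodes(n->right);
}
```

Because each call returns the size of its own subtree, no running counter has to be threaded through the recursion, which is what went wrong in the original version.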

Related

Recursive function to return last node of a heap

I am trying to return the last node of a binary heap (implemented with pointers, not an array). Here, 'last' means the bottom-most node, counting from the left, that does not yet have two children; that is, the node to which I am supposed to append the new node. The insert function binds data to a new node, then calls this function to get the last node, so that I can attach the new node as its left or right child depending on which children are already present.
The problem is that the right side of the tree is always empty, and the tree never gets past the level just below the root. The search seems to stick to the leftmost side, because every recursive call reaches its exit condition on the left first.
The recursive function first checks the node with a helper that returns 0 if it has no children, 1 if it has only a left child, and 2 if it has two children. Here is the function:
template<typename T>
node<T> * heap<T>::find_insert_pos (node<T> *x, int &res) {
if(find_insert_poshelper(x, res) == 0) return x;
else if(find_insert_poshelper(x, res) == 1) return x;
else {
node<T> *a = find_insert_pos(x->left, res);
if(find_insert_poshelper(a, res) != 2) return a;
else return find_insert_pos(a, res);
node<T> *b = find_insert_pos(x->right, res);
if(find_insert_poshelper(b, res) != 2) return b;
else return find_insert_pos(b, res);
}
}
I've tried to figure it out, but insertion still goes wrong. The other functions used in insertion have been checked more than three times.
(By the way, 'res' is always passed by reference in this chunk of code.)
I have changed the logic behind the function. Instead of only checking each node for children, I now check whether the node being evaluated has children; if it does, I go one step further and check each of those children, left and right, to see whether any of the grandchildren have children themselves.
If they don't, I repeat this for the next level, moving from the root at level 0 to level 1 and so on, until one of the child nodes does not have two children, and then return x.left or x.right, depending on the case.
-- Final edit --
It is hard to come up with an MRE, since this was more about logic. The question was posted by someone in need of practice with recursion, and it happened. All the logic changed, even for the sub-functions.
However, nodes must be assigned manually until three levels are full (full meaning every node has two children) before calling this operation, because it checks three levels down. With that done properly, I get a beautiful heap.
I can show an attempt at an MRE of how I find the bottom node to append a new node to. It is not complete, since I leave out the code of the 'insert' function, which is part iterative (the first three levels) and part recursive (the original question: finding the parent node for the new node). The insert operation works as follows: I create a new node dynamically, then search for the parent node to attach it to (the iterative part runs from here until the 8th node of the tree is reached, following a path similar to BFS). Once the position (that is, the pointer itself) is retrieved, I test whether to insert on the left or the right, according to the rules of the heap. Starting at the 8th node, the parent node is found recursively as follows:
First, the recursive function itself:
node * function_a (node *x, int &res) {
node *temp = function_b (x, res);
if(temp != ptr_null) return temp;
else {
if(x->L != ptr_null) function_a (x->L, res);
if(x->R != ptr_null) function_a (x->R, res);
return temp;
}
}
A sub-function:
node * function_b (node *x, int &res) {
node *a = x->L;
node *b = x->R;
int aL = function_c (a->L, res);
int aR = function_c (a->R, res);
int bL = function_c (b->L, res);
int bR = function_c (b->R, res);
if(aL != 2) return a->L;
else if(aR != 2) return a->R;
else if(bL != 2) return b->L;
else if(bR != 2) return b->R;
else return ptr_null;
}
And another:
int & function_c (node *x, int &res) {
if(x->L == ptr_null && x->R == ptr_null) return res = 0;
else if(x->L != ptr_null && x->R == ptr_null) return res = 1;
else return res = 2;
}
Since function_a checks three levels down from the given x (in this case the root), I can't make it fully recursive this way without getting a segmentation fault. How can I improve my algorithm?
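The post mentions a path "similar to BFS", so one way to avoid the fixed three-level lookahead is a plain level-order search for the first node that is missing a child. The sketch below is an illustration under assumptions, not the poster's code: the `node` struct is a hypothetical non-template stand-in that reuses the `L`/`R` member names, and `find_insert_parent` is a made-up name.

```cpp
#include <queue>

// Hypothetical node type reusing the L/R member names from the post.
struct node {
    int data;
    node* L;
    node* R;
};

// Walk the tree level by level, left to right, and return the first
// node that has fewer than two children. In a heap that is kept
// complete, that node is the parent of the next insertion slot.
node* find_insert_parent(node* root) {
    if (root == nullptr)
        return nullptr;              // empty tree: the new node becomes the root
    std::queue<node*> pending;
    pending.push(root);
    while (!pending.empty()) {
        node* cur = pending.front();
        pending.pop();
        if (cur->L == nullptr || cur->R == nullptr)
            return cur;              // free slot found on this node
        pending.push(cur->L);        // both children exist: examine them later
        pending.push(cur->R);
    }
    return nullptr;                  // not reached for a finite tree
}
```

The caller can then attach the new node on the left if `L` is null and on the right otherwise. This search costs time proportional to the number of nodes on every insertion, so it trades speed for simplicity.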

Build minimum height BST from a sorted std::list<float> with C++

I'm having trouble writing the code to build a minimum-height BST from a sorted std::list.
For the node class:
class cBTNode
{
private:
cBTNode* m_LeftChild;
cBTNode* m_RightChild;
float m_Data;
};
For the BST class:
class cBTNodeTree
{
private:
cBTNode* m_Root;
public:
void LoadBalancedMain(std::list<float>& ls);
void LoadBalanced(std::list<float>& ls, cBTNode* root);
};
Implementation: (basically, my method is to find the middle element of the list ls, put it into the root, put all the elements smaller than the middle element into ls_left, and all the elements bigger than it into ls_right. Then I build the left and right subtrees by recursively calling the same function on ls_left and ls_right.)
void cBTNodeTree::LoadBalancedMain(std::list<float>& ls)
{
LoadBalanced(ls, m_Root); // m_Root is the root of the tree
}
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode* root)
{
// Stopping Condition I:
if (ls.size() <= 0)
{
root = nullptr;
return;
}
// Stopping Condition II:
if (ls.size() == 1)
{
root = new cBTNode(ls.front());
return;
}
// When we have at least 2 elements in the list
// Step 1: Locate the middle element
if (ls.size() % 2 == 0)
{
// Only consider the case of even numbers for the moment
int middle = ls.size() / 2;
std::list<float> ls_left;
std::list<float> ls_right;
int index = 0;
// Obtain ls_left consisting elements smaller than the middle one
while (index < middle)
{
ls_left.push_back(ls.front());
ls.pop_front();
index += 1;
}
// Now we reach the middle element
root = new cBTNode(ls.front());
ls.pop_front();
// The rest is actually ls_right
while (ls.size() > 0)
{
ls_right.push_back(ls.front());
ls.pop_front();
}
// Now we have the root and two lists
cBTNode* left = root->GetLeftChild();
cBTNode* right = root->GetRightChild();
if (ls_left.size() > 0)
{
LoadBalanced(ls_left, left);
root->SetLeftChild(left);
}
else
{
left = nullptr;
}
if (ls_right.size() > 0)
{
LoadBalanced(ls_right, right);
root->SetRightChild(left);
}
else
{
right = nullptr;
}
}
}
My question: somehow, none of the elements are actually inserted into the tree. For example, if I check the value of m_Root, the root of the tree, I get an error because it is still nullptr. I'm not sure where I went wrong. I hope it's some stupid pointer mistake because I haven't slept well. (I'm pretty sure the 'new cBTNode(ls.front())' line works.)
BTW, although I have written a dozen functions for the BST, I'm still struggling with BST recursion. I noticed that in all the textbooks I have read, for the linked-list version of a BST, insertion ALWAYS needs a helper function that returns a pointer to a node. I am beginning to feel that I don't actually understand what is going on behind the recursion...
1:
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode* root)
Here cBTNode* root is passed by value, so assigning to root inside the function only changes a local copy; the caller's pointer (m_Root) stays null.
Instead, you should pass it by reference (cBTNode*&) or as a pointer to a pointer (cBTNode**).
Passing by reference is the simpler option: you won't need to change anything except the function signature.
void cBTNodeTree::LoadBalanced(std::list<float>& ls, cBTNode*& root)
Notice the & before root in the signature above.
2:
if (ls_right.size() > 0)
{
LoadBalanced(ls_right, right);
root->SetRightChild(left);
}
You are passing left to SetRightChild, so the right child is set to the left subtree, which is not what you want. It should be root->SetRightChild(right).
3:
cBTNode* left = root->GetLeftChild();
cBTNode* right = root->GetRightChild();
These are unnecessary.
4:
if (ls.size() % 2 == 0)
No need for two separate cases.
You can achieve this by just appropriately setting middle:
int middle = (ls.size()-1) / 2;
You pass the pointer to the root by value. Pass it by reference instead by changing the signature of LoadBalanced() appropriately.
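To see the pointer-by-reference fix and the single `middle` computation working together, here is a small self-contained sketch. It does not reuse the poster's classes: `TreeNode`, `buildBalanced`, `height`, and `destroyTree` are hypothetical names, and the function works on iterators over the list instead of copying sublists, which is a design choice of this sketch rather than something the answers prescribe.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical simplified node; the post's cBTNode keeps its members private.
struct TreeNode {
    float data;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    explicit TreeNode(float d) : data(d) {}
};

// Build a minimum-height BST from `count` sorted elements starting at `first`.
// `root` is a reference to the caller's pointer, so assigning to it here
// changes the caller's pointer instead of a local copy.
void buildBalanced(std::list<float>::const_iterator first,
                   std::size_t count, TreeNode*& root) {
    if (count == 0) {
        root = nullptr;
        return;
    }
    std::size_t middle = (count - 1) / 2;   // covers odd and even sizes
    auto mid = std::next(first, static_cast<std::ptrdiff_t>(middle));
    root = new TreeNode(*mid);
    buildBalanced(first, middle, root->left);                         // smaller half
    buildBalanced(std::next(mid), count - middle - 1, root->right);   // larger half
}

// Number of levels in the tree (0 for an empty tree).
int height(const TreeNode* n) {
    return n == nullptr ? 0 : 1 + std::max(height(n->left), height(n->right));
}

// Free every node of the tree.
void destroyTree(TreeNode* n) {
    if (n == nullptr) return;
    destroyTree(n->left);
    destroyTree(n->right);
    delete n;
}
```

A call such as `buildBalanced(ls.cbegin(), ls.size(), root)` with `TreeNode* root = nullptr;` leaves `root` pointing at the middle element. With the original pass-by-value signature, `root` would still be null afterwards.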

Checking for a cycle in an undirected graph using DFS?

So, I made the following code for DFS:
void dfs (graph * mygraph, int foo, bool arr[]) // here, foo is the source vertex
{
if (arr[foo] == true)
return;
else
{
cout<<foo<<"\t";
arr[foo] = true;
auto it = mygraph->edges[foo].begin();
while (it != mygraph->edges[foo].end())
{
int k = *it;
if (arr[k] == false)
{
//cout<<k<<"\n";
dfs(mygraph,k,arr);
//cout<<k<<"\t";
}
it++;
}
}
//cout<<"\n";
}
Now, I read that in an undirected graph, if a DFS returns to the same vertex again, there is a cycle. Therefore, I did this:
bool checkcycle( graph * mygraph, int foo, bool arr[] )
{
bool result = false;
if (arr[foo] == true)
{
result = true;
}
else
{
arr[foo] = true;
auto it = mygraph->edges[foo].begin();
while (it != mygraph->edges[foo].end())
{
int k = *it;
result = checkcycle(mygraph,k,arr);
it++;
}
}
return result;
}
But my checkcycle function returns true even if there is no cycle. Why is that? Is there something wrong with my function? There is no execution problem, otherwise I would have debugged it, but there seems to be something wrong in my logic.
Notice that your function doesn't quite do what you think it does. Let me step through what's happening here. Assume the following relationships: (1,2), (1,3), (2,3). I'm not assuming symmetry (that is, (1,2) does not imply (2,1)); the relationships are directed.
1. Start with node 1. Flag it as visited.
2. Iterate its children (2 and 3).
3. When in node 2, recursively call check cycle. At this point 2 is also flagged as visited.
4. The recursive call now visits 3 (DEPTH search). 3 is also flagged as visited.
5. Call for step 4 dies returning false.
6. Call for step 3 dies returning false.
7. We're back at step 2. Now we'll iterate node 3, which has already been flagged in step 4. It just returns true.
You need a stack of visited nodes or you ONLY search for the original node. The stack will detect sub-cycles as well (cycles that do not include the original node), but it also takes more memory.
Edit: the stack of nodes is not just a bunch of true/false values, but instead a stack of node numbers. A node has been visited in the current stack trace if it's present in the stack.
However, there's a more memory-friendly way: set arr[foo] = false; as the calls die. Something like this:
bool checkcycle( graph * mygraph, int foo, bool arr[], int previousFoo=-1 )
{
bool result = false;
if (arr[foo] == true)
{
result = true;
}
else
{
arr[foo] = true;
auto it = mygraph->edges[foo].begin();
while (it != mygraph->edges[foo].end())
{
int k = *it;
// This should prevent going back to the previous node
if (k != previousFoo) {
// Keep an earlier true result instead of overwriting it
result = checkcycle(mygraph,k,arr, foo) || result;
}
it++;
}
// Add this
arr[foo] = false;
}
return result;
}
I think it should be enough.
Edit: should now support undirected graphs.
Note: this code is not tested.
Edit: for more elaborate solutions see Strongly Connected Components.
Edit: this answer is marked as accepted although the concrete solution was given in the comments. Read the comments for details.
Are all of the bools in arr[] set to false before checkcycle begins?
Are you sure your iterator for the nodes isn't doubling back on edges it has already traversed (and thus seeing the starting node multiple times regardless of cycles)?
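The "doubling back" concern above is usually handled by remembering which vertex the search came from. The sketch below shows that idea under stated assumptions: the `Graph` alias, `hasCycleFrom`, and `hasCycle` are hypothetical names (the question's `graph` type is not shown), every undirected edge is stored in both adjacency lists, and there are no parallel edges between the same two vertices.

```cpp
#include <vector>

// Hypothetical adjacency-list graph: adj[u] lists the neighbours of u.
// Each undirected edge {u, v} appears in both adj[u] and adj[v].
using Graph = std::vector<std::vector<int>>;

// DFS from v, where `parent` is the vertex we arrived from (-1 for the start).
// Reaching an already visited vertex through any edge other than the one
// we came in on means there is a cycle.
bool hasCycleFrom(const Graph& adj, int v, int parent, std::vector<bool>& visited) {
    visited[v] = true;
    for (int next : adj[v]) {
        if (next == parent)
            continue;                 // skip the edge we arrived by
        if (visited[next])
            return true;              // back edge: cycle found
        if (hasCycleFrom(adj, next, v, visited))
            return true;              // propagate a cycle found deeper down
    }
    return false;
}

// Check every connected component of the graph.
bool hasCycle(const Graph& adj) {
    std::vector<bool> visited(adj.size(), false);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (!visited[v] && hasCycleFrom(adj, v, -1, visited))
            return true;
    }
    return false;
}
```

Unlike the version that resets `arr[foo]` when a call returns, this one never unmarks a vertex, so each vertex and edge is handled a bounded number of times.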

C++: Counting the duplicates in a BST

I have a BST as follows:
    8
   / \
  4   12
   \
    6
   /
  6
I have the following code to calculate the duplicate count, which here should be 1 (6 has a duplicate):
struct Node
{
int data;
Node *left, *right;
};
void inorder(Node *root, Node *previous, int count)
{
if(root != NULL)
{
if(root != previous && root->data == previous->data)
count++;
previous = root;
inorder(root->left, previous, count);
cout<<root->data<<" ";
inorder(root->right, previous, count);
}
}
I have to do this using constant extra space. I know it's nowhere close, but my idea is to keep track of the previous node, check for a duplicate, and at the end return the count. But I couldn't work out how to return an integer value while performing an in-order traversal of the BST. Besides that, would there be a better way to count the duplicates in a BST? I call it like this:
inorder(a, a, 0);
In a binary search tree, depending on how the insert is written, a duplicate will always be on the left or always on the right; it looks like left in your case. So all you need is one extra variable that keeps track of the count of the dupes. In your function, keep track of the last visited node, and if the current node is ever the same as the last visited one, increment the count.
Here's some code. Disclaimer: totally untested.
int count_dupes(Node * root, Node * last = nullptr) {
if (root == nullptr) return 0; // an empty subtree has no duplicates
// last is the parent of root; it is null only on the first call
int is_dupe = (last != nullptr && root->data == last->data) ? 1 : 0;
return is_dupe + count_dupes(root->right, root)
+ count_dupes(root->left, root);
}
By the way, I'm sensing this is an interview-type question, but Thomas Matthews is right: your tree should not have duplicates inserted.
Let's assume that in your BST a duplicate can only be on the left of a node (it is always the same side; we just have to choose a convention and stick to it). Just increment the duplicate count whenever you recurse left in your in-order traversal and the value does not change. Make sure you pass count by reference, not by value, and zero it out before starting. Nice interview question, by the way.
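A sketch of the in-order idea with the count passed by reference might look like the following. It reuses the `Node` struct from the question; the function names `countDuplicatesInOrder` and `countDuplicates` are hypothetical. It compares each node with the node visited just before it in sorted order, so it does not depend on which side duplicates are inserted.

```cpp
struct Node {
    int data;
    Node *left, *right;
};

// In-order traversal visits values in non-decreasing order, so equal
// values are visited back to back. `previous` and `count` are references,
// which lets every recursive call share and update the same state.
void countDuplicatesInOrder(const Node* root, const Node*& previous, int& count) {
    if (root == nullptr)
        return;
    countDuplicatesInOrder(root->left, previous, count);
    if (previous != nullptr && previous->data == root->data)
        ++count;                      // same value as the node visited just before
    previous = root;                  // this node is now the last one visited
    countDuplicatesInOrder(root->right, previous, count);
}

// Convenience wrapper that zeroes the state before starting.
int countDuplicates(const Node* root) {
    const Node* previous = nullptr;
    int count = 0;
    countDuplicatesInOrder(root, previous, count);
    return count;
}
```

The only extra variables are one pointer and one integer, although the recursion itself still uses stack space proportional to the height of the tree.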

Find nth smallest element in Binary Search Tree

I have written an algorithm for finding the nth smallest element in a BST, but it returns the root node instead of the nth smallest one. If you insert nodes in the order 7 4 3 13 21 15, then the call find(root, 0) returns the node with value 7 instead of 3, and the call find(root, 1) returns 13 instead of 4. Any thoughts?
Binode* Tree::find(Binode* bn, int n) const
{
if(bn != NULL)
{
find(bn->l, n);
if(n-- == 0)
return bn;
find(bn->r, n);
}
else
return NULL;
}
And the definition of Binode:
class Binode
{
public:
int n;
Binode* l, *r;
Binode(int x) : n(x), l(NULL), r(NULL) {}
};
It is not possible to efficiently retrieve the n-th smallest element in a binary search tree by itself. However, this does become possible if you keep in each node an integer indicating the number of nodes in its entire subtree. From my generic AVL tree implementation:
static BAVLNode * BAVL_GetAt (const BAVL *o, uint64_t index)
{
if (index >= BAVL_Count(o)) {
return NULL;
}
BAVLNode *c = o->root;
while (1) {
ASSERT(c)
ASSERT(index < c->count)
uint64_t left_count = (c->link[0] ? c->link[0]->count : 0);
if (index == left_count) {
return c;
}
if (index < left_count) {
c = c->link[0];
} else {
c = c->link[1];
index -= left_count + 1;
}
}
}
In the above code, node->link[0] and node->link[1] are the left and right child of node, and node->count is the number of nodes in the entire subtree of node.
The above algorithm has O(log n) time complexity, assuming the tree is balanced. Also, if you keep these counts, another operation becomes possible: given a pointer to a node, you can efficiently determine its index (the inverse of what you asked for). In the code I linked, this operation is called BAVL_IndexOf().
Be aware that the node counts need to be updated as the tree is changed; this can be done with no (asymptotic) change in time complexity.
There are a few problems with your code:
1) find() returns a value (the correct node, assuming the function is working as intended), but you don't propagate that value up the call chain, so the top-level call never learns about the (possibly) found element:
Binode* elem = NULL;
elem = find(bn->l, n);
if (elem) return elem;
if(n-- == 0)
return bn;
elem = find(bn->r, n);
return elem; // here we don't need to test: we need to return regardless of the result
2) Even though you decrement n at the right place, the change does not propagate upward in the call chain. You need to pass the parameter by reference (note the & after int in the function signature), so that the change is made to the original value, not to a copy of it:
Binode* Tree::find(Binode* bn, int& n) const
I have not tested the suggested changes, but they should point you in the right direction.
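Putting the two suggested changes together gives something like the sketch below. It is written as a free function named `findNth` (a hypothetical name) rather than the `Tree::find` member, since the `Tree` class is not shown, and the counter is renamed `remaining` to avoid clashing with the `n` data member of `Binode`.

```cpp
class Binode
{
public:
    int n;
    Binode* l, *r;
    Binode(int x) : n(x), l(nullptr), r(nullptr) {}
};

// Return the node holding the k-th smallest value (k counted from 0),
// or nullptr if the tree has too few nodes. `remaining` starts at k and
// is passed by reference, so every call decrements the same counter.
Binode* findNth(Binode* bn, int& remaining)
{
    if (bn == nullptr)
        return nullptr;
    // Search the left subtree first and pass any hit straight up.
    if (Binode* found = findNth(bn->l, remaining))
        return found;
    // Visit the current node: it is the answer when the counter hits 0.
    if (remaining-- == 0)
        return bn;
    // Otherwise the answer, if any, lies in the right subtree.
    return findNth(bn->r, remaining);
}
```

Because the counter is modified by the call, the caller needs a named variable, for example `int k = 1; Binode* hit = findNth(root, k);`. For the insertion order 7 4 3 13 21 15 from the question, that call yields the node with value 4.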