counting number of elements less than X in a BST - c++

I implemented a BST for a multiset using the C++ code below. Each node stores a distinct value data and its number of occurrences num. I am trying to find the number of elements less than a given value x using the order function below.
It works, but it is inefficient in terms of execution time.
Is there a method with better time complexity?
struct Node {
int data;
int height;
int num;
Node *left;
Node *right;
};
int order(Node *n, int x) {
int sum = 0;
if (n != NULL) {
if (n->data < x) {
sum += n->num;
sum += order(n->right, x);
}
sum += order(n->left, x);
}
return sum;
}

You can bring the algorithm down to O(log N) time by storing in each node the number of elements in the subtree rooted at it. Then you only have to recurse into one of the two children of each node (go left if x < node->data, right if x > node->data), which takes only logarithmic time if the tree is balanced.
struct Node {
int data;
int height;
int num;
int size; // number of elements in the subtree starting at this node
Node *left;
Node *right;
};
int order(Node *n, int x) {
if(n == NULL) return 0;
// elements less than n->data make up the whole left subtree
if (x == n->data) {
return n->left ? n->left->size : 0;
}
// even smaller? just recurse left
else if (x < n->data) {
return order(n->left, x);
}
// bigger? take the whole left subtree, this node's own occurrences, and part of the right one
else {
return (n->left ? n->left->size : 0) + n->num + order(n->right, x);
}
}
Of course, now you have to keep track of the size, but this can be done very efficiently when updating the tree: after an insertion or deletion, simply recalculate the size of each modified node as the size of its left subtree plus the size of its right subtree plus n->num, counting a missing child as 0.
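To make the bookkeeping concrete, here is a minimal sketch of the idea, assuming a plain unbalanced insert (the height field and any rebalancing are left out for brevity). The helper sizeOf and this particular insert are illustrative names, not part of the question's code; with rotations, size would also have to be recomputed for every node a rotation touches.

```cpp
#include <cassert>

struct Node {
    int data;
    int num;    // occurrences of data
    int size;   // total occurrences stored in the subtree rooted here
    Node* left;
    Node* right;
};

// Size of a possibly empty subtree.
int sizeOf(const Node* n) { return n ? n->size : 0; }

// Unbalanced insert that refreshes size on the way back up the recursion.
Node* insert(Node* n, int x) {
    if (n == nullptr) return new Node{x, 1, 1, nullptr, nullptr};
    if (x == n->data)     ++n->num;
    else if (x < n->data) n->left = insert(n->left, x);
    else                  n->right = insert(n->right, x);
    n->size = sizeOf(n->left) + sizeOf(n->right) + n->num;
    assert(n->size >= n->num);
    return n;
}

// Number of stored elements strictly less than x; visits one node per level.
int order(const Node* n, int x) {
    int count = 0;
    while (n != nullptr) {
        if (x <= n->data) {
            n = n->left;                          // this node and its right side are >= x
        } else {
            count += sizeOf(n->left) + n->num;    // left subtree and this node are < x
            n = n->right;
        }
    }
    return count;
}
```

Inserting 5, 3, 8, 3, 1 and then calling order(root, 4) counts the elements 1, 3, 3.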

If you can add the size to your structure, I highly recommend using Dario Petrillo's answer.
If you have to stick to your structure, you can reduce the number of instructions and recursions.
int count_all(Node* n) {
int acc = n->num;
if (n->left != NULL) acc += count_all(n->left);
if (n->right != NULL) acc += count_all(n->right);
return acc;
}
int order(Node *n, int x) {
if (n == NULL) return 0;
// Find the first left node which is < x
while (n->data >= x) {
n = n->left;
if (n == NULL) return 0;
}
assert(n != NULL && n->data < x);
int sum = n->num;
// Grab everything left because all of them are < x
if (n->left != NULL) sum += count_all(n->left);
// Some of the right nodes may be < x, some may not
// Repeat the algorithm to find out
if (n->right != NULL) sum += order(n->right, x);
return sum;
}
This reduces the number of recursions when the root is bigger than x and you want to quickly find the next left node that satisfies n->data < x. It also removes a ton of unnecessary comparisons to x for the left side of a tree where you already know that everything is < x.

Related

Codeforces question on Binary Lifting and LCA

Recently, I was solving this problem on Codeforces, but I failed the test cases. I can't see the test cases, nor can I see others' solutions, because it is a gym question. Since I cannot view the test cases where my code is failing, I need some help. I promise that the code is quite readable. The question is mainly about binary lifting and LCA calculation.
As you probably know, sloths live on trees. Accordingly, David has a pet sloth which he lets play on his unweighted trees when he solves programming problems. Occasionally, David will notice that his sloth is located on a particular node 𝑎 in the tree, and ask it to move to some other node 𝑏.
Of course, the sloth is as well-intentioned as can be, but alas, it only has enough energy to move across at most 𝑐 edges. If the sloth needs to cross fewer than 𝑐 edges to get to node 𝑏, it will get there and then take a nap. Otherwise, it will get as close as possible before it retires and hangs idly awaiting further digestion.
Where will the sloth end up? Also, since this happens quite often, David would like you to answer 𝑞 queries, each one of the similar form.
Input
The first line will contain a single integer 𝑛, the number of nodes in the tree. The following 𝑛−1 lines will contain two integers 𝑢 and 𝑣, describing the edges in the tree. These edges will form a tree.
After that, there will be a single line containing an integer 𝑞, the number of times David will motivate his sloth to move. 𝑞 lines follow, each containing three integers 𝑎, 𝑏, and 𝑐: the node the sloth starts on, the node David asks the sloth to move to, and the energy the sloth has when starting.
int LOGN = 19;
int n;
int timer = 0;
vector<int> adj[300005];
bool vis[300005];
int p[300005], tin[300005], tout[300005];
vector<int> d(300005,0);
int dp[300005][20];
void dfs(int v, int parent)
{
tin[v] = ++timer;
if (parent != -1)
d[v] = d[parent] + 1;
p[v] = parent;
dp[v][0] = parent;
for (int i = 1; i < LOGN; i++){
if (dp[v][i - 1] == -1)
dp[v][i] = -1;
else
dp[v][i] = dp[dp[v][i - 1]][i - 1];
}
for (auto u : adj[v]){
if (u != parent)
dfs(u, v);
}
tout[v] = ++timer;
}
bool isAncestor(int p, int c){ // check if p is ancestor of c
return (tin[p] <= tin[c] && tout[p] >= tout[c]);
}
int lift(int v, int steps) // find the steps-th ancestor of v
{
for (int i = 0; i < 19; ++i){
if ((1 << i) & steps){
v = dp[v][i];
if (v == -1)
return -1;
steps -= (1 << i);
}
}
return v;
}
int LCA(int u, int v)
{
if (isAncestor(u, v))
return u;
if (isAncestor(v, u))
return v;
for (int i = 18; i >= 0; i--)
{
if (dp[u][i] == -1)// -1 should not be passed
continue;
if (!isAncestor(dp[u][i], v))
u = dp[u][i];
}
return dp[u][0];
}
void solve()
{
cin>>n;
timer=0;
for(int i=0;i<n-1;i++){// Input
int x,y;
cin>>x>>y;
x--;y--;
adj[x].pb(y);
adj[y].pb(x);
}
dfs(0,-1);
tout[0]=++timer;
int q;
cin>>q;
for(int z=0;z<q;z++){
int u,v,energy;
cin>>u>>v>>energy;
u--;v--;
if(isAncestor(u,v))// u is ancestor of v
{
if(d[v]-d[u]<=energy)// reachable
cout<<v+1<<endl;
else{// midway: lift
int res=lift(v,energy);
cout<<res+1<<endl;
}
}
else if(isAncestor(v,u))// v is ancestor of u
{
if(d[u]-d[v]<=energy)// reachable
cout<<v+1<<endl;
else{/// midway: lift
int res=lift(u,d[u]-d[v]-energy);
cout<<res+1<<endl;
}
}
else{
int lca=LCA(u,v);
int dist=d[u]+d[v]-2*d[lca];
if(energy>=dist)
cout<<v+1<<endl;
else{
// find vertex of the path from u to v with distance of energy from u
int dist1=d[u]-d[lca];
int dist2=d[v]-d[lca];
// lies on the path from u to LCA
if(energy<=dist1){
int res=lift(u,energy)+1;
cout<<res<<endl;
}
// lies on the path from LCA to v
else{
int rem=energy-dist1;
dist2-=rem;
int res=lift(v,dist2)+1;
cout<<res<<endl;
}
}
}
}
}
signed main()
{
FAST;
solve();
return 0;
}
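For comparison, here is a hedged sketch of one way to answer a single query with binary lifting. It assumes 0-based node ids, a tree rooted at node 0, and made-up names (Lifting, moveToward); it is not a reference solution to the gym problem. The idea is that the path from a to b goes up from a to the LCA and then down to b, so the node at distance c from a is either the c-th ancestor of a, or the (dist - c)-th ancestor of b. The two ancestor branches in the code above appear to use each other's lift distance, so that is worth checking against this.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Lifting {
    int n, LOG;
    std::vector<int> depth;
    std::vector<std::vector<int>> up;  // up[j][v] = 2^j-th ancestor of v (root maps to itself)

    Lifting(int n_, const std::vector<std::pair<int, int>>& edges)
        : n(n_), LOG(1), depth(n_, 0) {
        assert(n >= 1);
        while ((1 << LOG) < n) ++LOG;
        up.assign(LOG, std::vector<int>(n, 0));
        std::vector<std::vector<int>> adj(n);
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        // Iterative DFS from node 0, so deep trees do not overflow the call stack.
        std::vector<bool> seen(n, false);
        std::vector<int> stack{0};
        seen[0] = true;
        while (!stack.empty()) {
            int v = stack.back();
            stack.pop_back();
            for (int u : adj[v]) {
                if (seen[u]) continue;
                seen[u] = true;
                depth[u] = depth[v] + 1;
                up[0][u] = v;
                stack.push_back(u);
            }
        }
        for (int j = 1; j < LOG; ++j)
            for (int v = 0; v < n; ++v)
                up[j][v] = up[j - 1][up[j - 1][v]];
    }

    // k-th ancestor of v; requires 0 <= k <= depth[v].
    int lift(int v, int k) const {
        assert(k >= 0 && k <= depth[v]);
        for (int j = 0; j < LOG; ++j)
            if ((k >> j) & 1) v = up[j][v];
        return v;
    }

    int lca(int a, int b) const {
        if (depth[a] < depth[b]) std::swap(a, b);
        a = lift(a, depth[a] - depth[b]);
        if (a == b) return a;
        for (int j = LOG - 1; j >= 0; --j)
            if (up[j][a] != up[j][b]) { a = up[j][a]; b = up[j][b]; }
        return up[0][a];
    }

    // Node reached when walking from a toward b across at most c edges.
    int moveToward(int a, int b, long long c) const {
        int l = lca(a, b);
        int upLen = depth[a] - depth[l];    // edges from a up to the LCA
        int downLen = depth[b] - depth[l];  // edges from the LCA down to b
        if (c >= upLen + downLen) return b;
        if (c <= upLen) return lift(a, static_cast<int>(c));
        return lift(b, upLen + downLen - static_cast<int>(c));
    }
};
```

Because the ancestor cases are just upLen == 0 or downLen == 0, they need no separate branches here.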

btree program crashing possibly due to pointers

I'm trying to print a B-tree in level order, but it keeps crashing. I'm not sure what the real reason is, but I think it's crashing because of the pointers. I'm trying to use a function I found online that goes through each level, puts it in a queue, and prints it, but I've run into this problem. If anyone has another way of doing it, please let me know.
// C++ program for B-Tree insertion
#include<iostream>
#include <queue>
using namespace std;
int ComparisonCount = 0;
// A BTree node
class BTreeNode
{
int *keys; // An array of keys
int t; // Minimum degree (defines the range for number of keys)
BTreeNode **C; // An array of child pointers
int n; // Current number of keys
bool leaf; // Is true when node is leaf. Otherwise false
public:
BTreeNode(int _t, bool _leaf); // Constructor
// A utility function to insert a new key in the subtree rooted with
// this node. The assumption is, the node must be non-full when this
// function is called
void insertNonFull(int k);
// A utility function to split the child y of this node. i is index of y in
// child array C[]. The Child y must be full when this function is called
void splitChild(int i, BTreeNode *y);
// A function to traverse all nodes in a subtree rooted with this node
void traverse();
// A function to search a key in subtree rooted with this node.
BTreeNode *search(int k); // returns NULL if k is not present.
// Make BTree friend of this so that we can access private members of this
// class in BTree functions
friend class BTree;
};
// A BTree
class BTree
{
BTreeNode *root; // Pointer to root node
int t; // Minimum degree
public:
// Constructor (Initializes tree as empty)
BTree(int _t)
{
root = NULL; t = _t;
}
// function to traverse the tree
void traverse()
{
if (root != NULL) root->traverse();
}
// function to search a key in this tree
BTreeNode* search(int k)
{
return (root == NULL) ? NULL : root->search(k);
}
// The main function that inserts a new key in this B-Tree
void insert(int k);
};
// Constructor for BTreeNode class
BTreeNode::BTreeNode(int t1, bool leaf1)
{
// Copy the given minimum degree and leaf property
t = t1;
leaf = leaf1;
// Allocate memory for maximum number of possible keys
// and child pointers
keys = new int[2 * t - 1];
C = new BTreeNode *[2 * t];
// Initialize the number of keys as 0
n = 0;
}
// Function to traverse all nodes in a subtree rooted with this node
/*void BTreeNode::traverse()
{
// There are n keys and n+1 children, travers through n keys
// and first n children
int i;
for (i = 0; i < n; i++)
{
// If this is not leaf, then before printing key[i],
// traverse the subtree rooted with child C[i].
if (leaf == false)
{
ComparisonCount++;
C[i]->traverse();
}
cout << " " << keys[i];
}
// Print the subtree rooted with last child
if (leaf == false)
{
ComparisonCount++;
C[i]->traverse();
}
}*/
// Function to search key k in subtree rooted with this node
BTreeNode *BTreeNode::search(int k)
{
// Find the first key greater than or equal to k
int i = 0;
while (i < n && k > keys[i])
i++;
// If the found key is equal to k, return this node
if (i < n && keys[i] == k)
{
ComparisonCount++;
return this;
}
// If key is not found here and this is a leaf node
if (leaf == true)
{
ComparisonCount++;
return NULL;
}
// Go to the appropriate child
return C[i]->search(k);
}
// The main function that inserts a new key in this B-Tree
void BTree::insert(int k)
{
// If tree is empty
if (root == NULL)
{
ComparisonCount++;
// Allocate memory for root
root = new BTreeNode(t, true);
root->keys[0] = k; // Insert key
root->n = 1; // Update number of keys in root
}
else // If tree is not empty
{
// If root is full, then tree grows in height
if (root->n == 2 * t - 1)
{
ComparisonCount++;
// Allocate memory for new root
BTreeNode *s = new BTreeNode(t, false);
// Make old root as child of new root
s->C[0] = root;
// Split the old root and move 1 key to the new root
s->splitChild(0, root);
// New root has two children now. Decide which of the
// two children is going to have new key
int i = 0;
if (s->keys[0] < k)
{
ComparisonCount++;
i++;
}s->C[i]->insertNonFull(k);
// Change root
root = s;
}
else // If root is not full, call insertNonFull for root
root->insertNonFull(k);
}
}
// A utility function to insert a new key in this node
// The assumption is, the node must be non-full when this
// function is called
void BTreeNode::insertNonFull(int k)
{
// Initialize index as index of rightmost element
int i = n - 1;
// If this is a leaf node
if (leaf == true)
{
ComparisonCount++;
// The following loop does two things
// a) Finds the location of new key to be inserted
// b) Moves all greater keys to one place ahead
while (i >= 0 && keys[i] > k)
{
keys[i + 1] = keys[i];
i--;
}
// Insert the new key at found location
keys[i + 1] = k;
n = n + 1;
}
else // If this node is not leaf
{
// Find the child which is going to have the new key
while (i >= 0 && keys[i] > k)
i--;
// See if the found child is full
if (C[i + 1]->n == 2 * t - 1)
{
ComparisonCount++;
// If the child is full, then split it
splitChild(i + 1, C[i + 1]);
// After split, the middle key of C[i] goes up and
// C[i] is splitted into two. See which of the two
// is going to have the new key
if (keys[i + 1] < k)
i++;
}
C[i + 1]->insertNonFull(k);
}
}
// A utility function to split the child y of this node
// Note that y must be full when this function is called
void BTreeNode::splitChild(int i, BTreeNode *y)
{
// Create a new node which is going to store (t-1) keys
// of y
BTreeNode *z = new BTreeNode(y->t, y->leaf);
z->n = t - 1;
// Copy the last (t-1) keys of y to z
for (int j = 0; j < t - 1; j++)
z->keys[j] = y->keys[j + t];
// Copy the last t children of y to z
if (y->leaf == false)
{
ComparisonCount++;
for (int j = 0; j < t; j++)
z->C[j] = y->C[j + t];
}
// Reduce the number of keys in y
y->n = t - 1;
// Since this node is going to have a new child,
// create space of new child
for (int j = n; j >= i + 1; j--)
C[j + 1] = C[j];
// Link the new child to this node
C[i + 1] = z;
// A key of y will move to this node. Find location of
// new key and move all greater keys one space ahead
for (int j = n - 1; j >= i; j--)
keys[j + 1] = keys[j];
// Copy the middle key of y to this node
keys[i] = y->keys[t - 1];
// Increment count of keys in this node
n = n + 1;
}
void BTreeNode::traverse()
{
std::queue<BTreeNode*> queue;
queue.push(this);
while (!queue.empty())
{
BTreeNode* current = queue.front();
queue.pop();
int i;
for (i = 0; i < n; i++)
{
if (leaf == false)
queue.push(current->C[i]);
cout << " " << current->keys[i] << endl;
}
if (leaf == false)
queue.push(current->C[i]);
}
}
// Driver program to test above functions
int main()
{
BTree t(4); // A B-Tree with minimum degree 4
srand(29324);
for (int i = 0; i<200; i++)
{
int p = rand() % 10000;
t.insert(p);
}
cout << "Traversal of the constructed tree is ";
t.traverse();
int k = 6;
(t.search(k) != NULL) ? cout << "\nPresent" : cout << "\nNot Present";
k = 28;
(t.search(k) != NULL) ? cout << "\nPresent" : cout << "\nNot Present";
cout << "There are " << ComparisonCount << " comparison." << endl;
system("pause");
return 0;
}
Your traversal code uses the field values for this as though they were the values for the current node in the loop body.
You need to stick current-> in front of the member references in the loop body like this (in the lines marked with "//*"):
while (!queue.empty())
{
BTreeNode* current = queue.front();
queue.pop();
int i;
for (i = 0; i < current->n; i++) //*
{
if (current->leaf == false) //*
queue.push(current->C[i]);
cout << " " << current->keys[i] << endl;
}
if (current->leaf == false) //*
queue.push(current->C[i]);
}
This is a strong indicator that all the stuff qualified with current-> in reality wants to live in a function where it is this and thus does not need to be named explicitly.
Your code is better organised and more pleasant to read than most debug requests we get here, but it is still fairly brittle and it contains quite a few smelly bits like if (current->leaf == false) instead of if (not current->is_leaf).
You may want to post it over on Code Review when you have got it into working shape; I'm certain that the experienced coders hanging out there can give you lots of valuable advice on how to improve your code.
In order to ease prototyping and development I would strongly advise the following:
- use std::vector<> instead of naked arrays during the prototype phase
- invalidate invalid entries during development/prototyping (set keys to -1 and pointers to 0)
- use assert() for documenting - and checking - local invariants
- write functions that verify the structural invariants exactly and call them before/after every function that modifies the structure
- compile your code with -Wall -Wextra (GCC/Clang) or /W4 (MSVC) and clean it up so that it always compiles without warnings
Also, don't use int indiscriminately; the basic type for things that cannot become negative is unsigned (node degree, current key count etc.).
P.S.: it would be easier to build a conforming B-tree by pinning the order on the number of keys (i.e. number of keys can vary between K and 2*K for some K). Pinning the order on the number of pointers makes things more difficult, and one consequence is that the number of keys for 'order' 2 (where a node is allowed to have between 2 and 4 pointers) can vary between 1 and 3. For most folks dealing with B-trees that will be a rather unexpected sight!
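As a rough illustration of how the traversal fix combines with the std::vector and assert() suggestions, here is a small sketch. MultiwayNode and levelOrder are hypothetical names for this example only, and the node type is a simplified stand-in for BTreeNode (no insertion or splitting logic). Every field is read through current, and an assert documents the invariant that an inner node has one more child than it has keys.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Simplified stand-in for a B-tree node: a leaf has no children.
struct MultiwayNode {
    std::vector<int> keys;
    std::vector<MultiwayNode*> children;
    bool isLeaf() const { return children.empty(); }
};

// Returns all keys in level order (breadth-first), one node at a time.
std::vector<int> levelOrder(const MultiwayNode* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    std::queue<const MultiwayNode*> pending;
    pending.push(root);
    while (!pending.empty()) {
        const MultiwayNode* current = pending.front();
        pending.pop();
        // Structural invariant: inner nodes have keys.size() + 1 children.
        assert(current->isLeaf() ||
               current->children.size() == current->keys.size() + 1);
        out.insert(out.end(), current->keys.begin(), current->keys.end());
        for (const MultiwayNode* child : current->children)
            pending.push(child);
    }
    return out;
}
```

Returning the keys instead of printing them keeps the traversal easy to check in a unit test.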

How do I do the following recursive function?

Ok, so I have a regular Node list, with members info and next.
I need to use a function, recursively, to calculate the average, and then compare if each node is bigger than the average or not.
int Acount(NodeType* Node, int sum, int& avg){
if (Node == NULL){//last call
avg = sum / avg;
return 0;
}
else {
return (Acount(Node->next, sum + Node->info, ++avg) + (Node->info > avg ? 1 : 0));
}
}
Which is quite simple. Problem is the value returned is always 0.
The problem appears to be with
(Node->info > avg ? 1 : 0));
I've done the tests and when I do the following:
return Acount(Node->next, sum + Node->info, ++avg) + Node->info;
or
return Acount(Node->next, sum + Node->info, ++avg) + avg;
Results meet expectations. As in, I'm getting the sum of the Node->info in the first case, and I'm getting average*number of nodes in the second case.
The point is that this shows the recursion itself is working.
Yet when it comes to
(Node->info > avg ? 1 : 0));
it appears to be problematic, which is quite peculiar. If I place, for example:
(Node->info == 5 ? 1 : 0));
And there is only one 5 in the nodes, then the function returns 1. So everything is working as intended, yet I keep getting a 0.
The following are the main functions and additional functions for the Node.
#include <iostream>
using std::cout;
using std::cin;
using std::endl;
struct NodeType{
int info;
NodeType *next;
};
//pre: first node passed is not NULL
int Acount(NodeType* Node, int sum, int& avg){
if (Node == NULL){//last call
avg = sum / avg;
return 0;
}
else {
return (Acount(Node->next, sum + Node->info, ++avg) + (Node->info > avg ? 1 : 0));
}
}
void fill(NodeType*& Node){
NodeType *temp;
Node = new NodeType;
Node->info = 0;
Node->next = NULL;
temp = Node;
for (int i = 1; i < 10; i++){
temp->next = new NodeType;
temp = temp->next;
temp->info = i;
temp->next = NULL;
}
}
void print(NodeType* Node){
NodeType *temp = Node;
while (temp != NULL){
cout << temp->info << " ";
temp = temp->next;
}
cout << endl;
}
void Delete(NodeType* Node){
NodeType *temp;
while (Node != NULL){
temp = Node;
Node = Node->next;
delete temp;
}
}
int main(){
int sum = 0, avg = 0;
NodeType *Node;
fill(Node);
print(Node);
cout << Acount(Node, sum, avg) << endl;
Delete(Node);
}
C++ does not specify the order in which the two operands of + are evaluated. Operator precedence and associativity determine how an expression is grouped, not the order in which its parts are evaluated, so in f1() + f2() there is no guarantee that f1() is invoked before f2() (or vice versa). The order may differ between compilers, or even between builds.
My suggestion is to split the expression into 2 distinct statements as follows:
int tmp = Acount(Node->next, sum + Node->info, ++avg);
return tmp + (Node->info > avg ? 1 : 0);
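Here is a self-contained sketch of the split version, so the effect can be checked in isolation. The helpers makeList, freeList and countAboveAverage are made-up names for this example; the list holds 0 through 9 as in the question, and the average uses integer division, so it comes out as 45 / 10 = 4.

```cpp
#include <cassert>

struct NodeType {
    int info;
    NodeType* next;
};

// avg is used as a node counter on the way down and is replaced by the
// (integer) average in the base case; comparisons happen on the way back up.
int Acount(NodeType* node, int sum, int& avg) {
    if (node == nullptr) {
        if (avg != 0) avg = sum / avg;  // guard against an empty list
        return 0;
    }
    // The recursive call is a full statement, so it finishes before avg is read below.
    int tmp = Acount(node->next, sum + node->info, ++avg);
    return tmp + (node->info > avg ? 1 : 0);
}

// Builds the list 0, 1, ..., n-1 (n >= 1).
NodeType* makeList(int n) {
    assert(n >= 1);
    NodeType* head = new NodeType{0, nullptr};
    NodeType* tail = head;
    for (int i = 1; i < n; ++i) {
        tail->next = new NodeType{i, nullptr};
        tail = tail->next;
    }
    return head;
}

void freeList(NodeType* node) {
    while (node != nullptr) {
        NodeType* next = node->next;
        delete node;
        node = next;
    }
}

// Convenience wrapper that supplies the initial sum and counter.
int countAboveAverage(NodeType* head) {
    int avg = 0;
    return Acount(head, 0, avg);
}
```

For the list 0 through 9 the nodes 5, 6, 7, 8 and 9 exceed the integer average of 4, so countAboveAverage returns 5.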
I am not sure whether your code has defined behaviour. In any case, this line
return (Acount(Node->next, sum + Node->info, ++avg) + (Node->info > avg ? 1 : 0));
depends on whether the left summand or the right summand is evaluated first.
Note that avg is passed by reference, so every level of the recursion sees the same variable. If the right summand is evaluated before the recursive call returns, avg still holds the running count of nodes visited so far rather than the average. In that case this term
Node->info > avg
will never be true, because the fill routine gives each node a value that does not exceed the number of nodes before it. If the left summand is evaluated first, the recursion runs to the end of the list, the base case replaces avg with the average, and the comparison behaves as intended.
I don't think your method will work.
In this statement:
return (Acount(Node->next, sum + Node->info, ++avg) + (Node->info > avg ? 1 : 0))
You don't know when the second term is evaluated relative to the recursive call. C++ does not specify that order.

Problems with Internal path length function

I've been asked to compute the average depth of a node in both a binary search tree, and an AVL tree. Through some research, I found that the average depth of a tree is the internal path length divided by the number of nodes in a tree, and that the internal path length (the sum of the path lengths of every node in the tree) is given by this recurrence:
D(1) = 0, D(N) = D(i) + D(N - i - 1) + N - 1
where D(N) is the IPL of a tree with N nodes, D(i) is the IPL of the left subtree (which has i nodes), and D(N-i-1) is the IPL of the right subtree.
Using that, I wrote this function:
int internalPathLength(Node *t, int& sum) const{
if(t == nullptr || (t->left == nullptr && t->right == nullptr)) {
return 0;
}
else {
int a = 0;
sum += internalPathLength(t->left, sum) + internalPathLength(t->right, sum) + (countNodes(t,a)-1);
cout << sum << endl;
return sum;
}
}
This function gives me, with a binary search tree of 565 nodes, an IPL of 1,264,875,230 and an average depth of 2,238,717, a preposterously high number. Using it on an AVL tree of similar size gives me an IPL of -1,054,188,525 and an average depth of -1,865,820, which is negative on top of being preposterously large in magnitude. Is there something wrong with my interpretation/implementation of the recurrence? What else can I try? Or are the values I'm getting in the normal range for this computation after all?
The problem is that you pass sum by reference, so it gets incremented way too many times. You don't really need this sum at all. This should work:
int internalPathLength(Node *t) const{
if(t == nullptr || (t->left == nullptr && t->right == nullptr)) {
return 0;
}
else {
return internalPathLength(t->left) + internalPathLength(t->right) + countNodes(t) - 1;
}
}
This is not optimal, because your count function is probably also recursive.
You can count the nodes in each subtree in the same recursion and then use it. Like this:
int internalPathLength(Node *t, int &count) const{
if(t == nullptr) {
count = 0;
return 0;
}
else if(t->left == nullptr && t->right == nullptr){
count = 1;
return 0;
}
else {
count = 1;
int leftCount;
int rightCount;
int sum = internalPathLength(t->left, leftCount) + internalPathLength(t->right, rightCount);
count += leftCount + rightCount;
return sum + count - 1;
}
}
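A handy cross-check is a different but equivalent formulation: the internal path length is simply the sum of the depths of all nodes, so you can pass the current depth down the recursion instead of counting subtree sizes. The sketch below assumes a bare Node struct with left and right pointers and uses free functions rather than class members; the names are illustrative only.

```cpp
#include <cassert>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Sum of the depths of all nodes in the subtree rooted at t,
// where t itself sits at the given depth (the root is at depth 0).
long long internalPathLength(const Node* t, int depth = 0) {
    assert(depth >= 0);
    if (t == nullptr) return 0;
    return depth
         + internalPathLength(t->left, depth + 1)
         + internalPathLength(t->right, depth + 1);
}

int countNodes(const Node* t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// Average depth = IPL / number of nodes (0 for an empty tree).
double averageDepth(const Node* t) {
    int n = countNodes(t);
    return n == 0 ? 0.0 : static_cast<double>(internalPathLength(t)) / n;
}
```

For a four-node tree whose nodes have depths 0, 1, 1 and 2, both this version and the recurrence give an IPL of 4 and an average depth of 1. For a tree of 565 nodes the IPL cannot exceed 565 * 564 / 2, so values in the billions point to a bug rather than to a normal result.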

How to heapify the minheap using an array in C++?

This program should work correctly, but it doesn't! Assume you are building a min-heap by inserting numbers into an array. Each insertion should be followed by a call to Heapify to make sure that the order of the numbers does not violate the min-heap rule. This is what I wrote, but there is something wrong with it and I couldn't get it to work!
int P(int i) //returning the index of parent
{
if (i % 2 == 0) { i = ((i - 2) / 2); }
else { i = ((i - 1) / 2); }
return i;
}
void Heapify(double A[], int i)//putting the smallest value in the root because we have a min heap
{
if (P(i) != NULL && A[i] < A[P(i)])
{
temp = A[P(i)];
A[P(i)] = A[i];
A[i] = temp;
Heapify(A, P(i));
}
}
Generally speaking, your heapify function doesn't seem to take a minimum of both left and right branches into consideration. Let me show you an ideal, working implementation (object-oriented, so you might want to pass the heap as a parameter). You can find the exact pseudocode all over the internet, so I'm not really presenting anything unique.
void Heap::Heapify (int i)
{
int l = left(i);
int r = right(i);
int lowest;
if (l < heap_size && heap[l] -> value < heap[i] -> value )
lowest = l;
else
lowest = i;
if (r < heap_size && heap[r] -> value < heap[lowest] -> value)
lowest = r;
if (lowest != i)
{
swap (heap[i], heap[lowest]);
Heapify(lowest);
}
}
where (for a 0-based array, which is what the l < heap_size and r < heap_size checks assume)
int left ( int i ) { return 2 * i + 1; }
int right ( int i ) { return 2 * i + 2; }
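As you can see, the algorithm first determines which of the current node and its left and right children has the lowest value. If that is one of the children, it is swapped with the current node and the procedure continues in that child's subtree. That is everything there is to it.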
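Since the question inserts one number at a time, the operation it needs after each insertion is the upward counterpart of the routine above (often called sift-up). Below is a minimal sketch under the assumption of a 0-based std::vector<double>; siftUp and heapInsert are illustrative names. Note that in a 0-based array the parent of index i is (i - 1) / 2 for every i > 0 and the root is index 0, so a test like P(i) != NULL would skip exactly the comparisons against the root.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Moves heap[i] up until its parent is no larger (min-heap, 0-based array).
void siftUp(std::vector<double>& heap, std::size_t i) {
    assert(i < heap.size());
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[i] >= heap[parent]) break;  // heap property already holds
        std::swap(heap[i], heap[parent]);
        i = parent;
    }
}

// Appends value and restores the min-heap property.
void heapInsert(std::vector<double>& heap, double value) {
    heap.push_back(value);
    siftUp(heap, heap.size() - 1);
}
```

Inserting 5, 3, 8, 1, 4 in that order leaves 1 at index 0, with every parent no larger than its children.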