Returning Kth smallest element in a BST

I wrote the following code snippet to return the Kth smallest element in a BST:
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
int kthSmallest(TreeNode* root, int k) {
if(root==NULL)
return -2;
cout<<"Root value "<<root->val<<" and k value "<<k<<"\n";
kthSmallest(root->left, k-1);
if((k)==0)
return root->val;
kthSmallest(root->right, k-1);
}
};
I basically just do an inorder traversal of the tree and decrement the value of k during each recursive call. So, in this way, when the value of k equals 0, I have found the node and just return it.
Following is the output of my debugging statements:
Root value 1 and k value 3
Root value 2 and k value 2
Root value 4 and k value 1
Root value 8 and k value 0
Root value 9 and k value 0
Root value 5 and k value 1
Root value 10 and k value 0
Root value 11 and k value 0
Root value 3 and k value 2
Root value 6 and k value 1
Root value 12 and k value 0
Root value 7 and k value 1
I am unable to understand why the program keeps on executing even after k has become 0. What have I missed? I appreciate your help.
Edit: I don't think the question description is required, but if needed, it can be found here: LeetCode: Find Kth smallest element in a BST. Also, please note that I cannot edit the function prototype. Also, the question says that the k would be a valid number between 1 and BST's total number of elements.

Variables are not shared between (recursive or not) calls. Each call will have k being a normal local variable whose value is promptly forgotten once the function returns.
If you have k == 1 and then do kthSmallest(root->left, k-1), it's only in the recursive call that k == 0. In the caller the value of k is still 1.
This algorithm will never work as coded, because you rely on arguments and local variables being shared.
You also don't propagate the kth smallest value back up from the bottom of the call tree. You just throw it away, and in the end the very first call to kthSmallest falls off the end of the function without returning anything, which leads to undefined behavior.

Your code will not work as is, for the reasons given in the previous answer.
You could do something along the lines of:
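#include <cstdint>  // INT32_MAX

// Helper: *i counts the nodes visited so far in in-order sequence.
// Declared before the public overload so the call below resolves to it.
int kthSmallest(TreeNode* root, int *i, int k) {
    if (root == nullptr)
        return INT32_MAX;  // sentinel: nothing found in this subtree
    int left = kthSmallest(root->left, i, k);
    if (left != INT32_MAX)
        return left;       // already found in the left subtree
    if (++*i == k)
        return root->val;
    return kthSmallest(root->right, i, k);  // the result must be returned
}
int kthSmallest(TreeNode* root, int k) {
    int i = 0;
    return kthSmallest(root, &i, k);
}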

Related

How does a BST height algorithm count?

int BianaryTree<T>::height(Node<T>* A){
root = A;
if(root==nullptr){
return 0;
}
else {
int lHeight=height(root->left); //how is int counting here?
int rHeight=height(root->right);
return max(lHeight, rHeight)+1;
}
}
So from what I understand this is a standard binary search tree height algorithm. My main question is how storing the result of the recursive call in an int variable is "counting" the height of the tree. As far as I can tell all this function is returning is 0.

Each recursive branch will eventually result in a return value of zero, but it then returns the maximum of the left and right heights plus 1. That plus 1 is critical, without that the return value of the entire recursion would indeed be zero but with it you get the height you expect.
Imagine a tree like the following:
    3
  2   4
1
with the missing children of 1, 2 and 4 (not shown) assumed to be null. height(node_1->left) and height(node_1->right) would both be 0, but height(node_1) would be max(height(node_1->left), height(node_1->right)) + 1, that is 1. The rest follows from there.

Appropriate data structure for add and find queries

I have two types of queries.
1 X Y
Add element X, Y times, to the collection.
2 N
Find the Nth element in the sorted collection.
Constraints:
Number of queries < 5 * 10^5
X < 10^9
Y < 10^9
I tried STL set but it did not work.
I think we need a balanced tree with each node containing two data values.
The first value will be the element X. The other will be the prefix sum of the Ys of all elements smaller than or equal to X.
When we add an element X, find the predecessor of X in the tree and add the second value associated with that predecessor to Y.
When finding the Nth element, search the tree (by second value) for the value immediately lower than N.
How can this data structure be implemented efficiently?

This can easily be done with a segment tree, with a total complexity of O(Q*log(10^9)).
We should use a so-called "sparse" segment tree, which creates nodes only when they are needed instead of allocating all of them up front.
In every node we store the count of elements whose value lies in that node's range [L, R].
Adding an element Y times is then done by walking from the root to the leaf for X and updating the counts along the path (creating nodes that do not exist yet).
Since the height of the segment tree is logarithmic, this takes O(log N) time, where N is the length of the initial interval (10^9).
Finding the k-th element is a binary search on the segment tree: every node knows the count of elements in its range, so at each step we go left if the left child holds at least k elements, and otherwise subtract the left count from k and go right, until we reach the leaf that contains the k-th element.
Sample code (C++):
#include <bits/stdc++.h>
using namespace std;
#define ll long long
const int sz = 31*5*100000; // each of the up to 5*10^5 updates creates at most about 31 nodes
ll seg[sz];
int L[sz],R[sz];
int nxt = 2;
void IncNode(int c, int l, int r, int idx, int val)
{
if(l==r)
{
seg[c]+=val;
return;
}
int m = (l+r)/2;
if(idx <= m)
{
if(!L[c])L[c]=nxt++;
IncNode(L[c],l,m,idx,val);
}
else
{
if(!R[c])R[c]=nxt++;
IncNode(R[c],m+1,r,idx,val);
}
seg[c] = seg[L[c]] + seg[R[c]];
}
int FindKth(int c, int l, int r, ll k)
{
if(l==r)return r;
int m = (l+r)/2;
if(seg[L[c]] >= k)return FindKth(L[c],l,m,k);
return FindKth(R[c],m+1,r,k-seg[L[c]]);
}
int main()
{
ios::sync_with_stdio(0);cin.tie(0);cout.tie(0);
int Q;
cin>>Q;
int L = 0, R = 1e9;
while(Q--)
{
int type;
cin>>type;
if(type==1)
{
int x,y;
cin>>x>>y;
IncNode(1,L,R,x,y);
}
else
{
ll k; // the total count can reach about 5*10^5 * 10^9, which does not fit in an int
cin>>k;
cout<<FindKth(1,L,R,k)<<"\n";
}
}
}

Maintaining a prefix sum in each node is not practical. It would mean that every time you add a new node, you have to update the prefix sum in every node succeeding it in the tree. Instead, you need to maintain subtree sums: each node should contain the sum of Y-values for its own key and the keys of all descendants. Maintaining subtree sums when the tree is updated should be straightforward.
When you answer a query of type 2, at each node, you would descend into the left subtree if N is less than or equal to the subtree sum value S of the left child (I'm assuming N is 1-indexed). Otherwise, subtract S from N; if N is now at most the Y-count stored for the current node's own key, that key is the answer. If not, subtract that count as well and descend into the right subtree.
By the way, if the entire set of X values is known in advance, then instead of a balanced BST, you could use a range tree or a binary indexed tree.

Finding kthSmallestElement in the BST

I am trying to solve the following question from LeetCode:
https://leetcode.com/problems/kth-smallest-element-in-a-bst/description/
The aim is, given a BST, we have to find out the Kth-smallest element in it and return its value.
I could come up with a O(n) time and space solution myself. But another solution which I wrote with online help is far better:
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
int kthSmallestUtil(TreeNode* root, int& k) {
if(!root) return -1;
int value=kthSmallestUtil(root->left, k);
if(!k) return value;
k--;
if(k==0) return root->val;
return kthSmallestUtil(root->right, k);
}
int kthSmallest(TreeNode* root, int k) {
return kthSmallestUtil(root, k);
}
};
I have understood the above solution. I also debugged it (https://onlinegdb.com/BJnoIkrLM) by inserting break points at 29, 30, 33 and 37. However, I still feel a bit uneasy because of the following reason:
In case of the call kthSmallestUtil(root->left, k);, we pass the original value of k; we then (understandably) decrement the value of k for the current root (since we are doing in order traversal). But, when we again recurse for kthSmallestUtil(root->right, k);, why don't we pass the original value of k? Why does the right child get a 'preferential' treatment - a decremented value of k?
I know because of debugging how the values of k change and we get the final answer.. But I am seeking some intuition behind using the original value of k for the left child and the decremented value of k for the right child.

This solution assumes a binary search tree, that is, an ordered tree.
That means the left branch of a node contains only values smaller than the current node's val. So the function first recurses into the left branch, decrementing k along the way; then, if k is not 0, k is decremented for the current element. If k is still not 0, the values in the right branch, all greater than the current node's value, are considered.
What you need to understand is that the k being decremented in the k--; line is not the original value of k but the value of k after the traversal of the entire left branch.
The recursive calls all modify the same k because k is passed by reference and not by value.

The code works more or less this way: go as deep as you can in the left branch of the BST. When you reach the leftmost node (the smallest value), decrement k and continue searching in the remaining part of the BST. Because we have already visited the smallest value in the whole tree and we are looking for the kth smallest value, we must now look for the (k-1)th smallest value in the rest of the tree (the leftmost node no longer counts). So, if k reaches zero after the decrement, the current node holds the kth smallest value. Otherwise it is necessary to search the right subtree as well.

Convert linked list into binary search tree, do stuff and return tree as list

I have the following problem:
I have a line with numbers that I have to read. The first number from the line is the amount of operations I will have to perform on the rest of the sequence.
There are two types of operations I will have to do:
Remove - we remove the number after the current one, then we move forward X steps in the sequence, where X = value of the removed element
Insert - we insert a new number after the current one with a value of (current element's value - 1), then we move forward by X steps in the sequence, where X = value of the current element (i.e. not the new one)
We do "Remove" if the current number's value is even, and "Insert" if the value is odd.
After the amount of operations we have to print the whole sequence, starting from the number we ended the operations.
Properly working example:
Input: 3 1 2 3
Output: 0 0 3 1
3 is the first number and it becomes the OperCount value.
First operation:
Sequence: 1 2 3, first element: 1
1 is odd, so we insert 0 (currNum's value-1)
We move forward by 1(currNum's value)
Output sequence: 1 0 2 3, current position: 0
Second operation:
0 is even so we remove the next value (2)
Move forward by the removed element's value(2):
From 0 to 3
From 3 to 1
Output sequence: 1 0 3, current position: 1
Third operation:
1 is odd, so once again we insert a new element with a value of 0
Move by current element's value(1), onto the created 0.
Output sequence: 1 0 0 3, current position: first 0
Now here is the deal, we have reached the final condition and now we have to print whole sequence, but starting from the current position.
Final Output:
0 0 3 1
I have a working version, but it uses a linked list, and because of that it doesn't pass all the tests. Linked list traversal takes too long, that's why I need to use a binary tree, but I don't really know how to start with it. I would appreciate any help.

First redefine the operations to put most (but not all) the work into a container object: We want 4 operations supported by the container object:
1) Construct from a [first,limit) pair of input random access iterators
2) insert(K) finds the value X at position K, inserts a X-1 after it and returns X
3) remove(K) finds the value X at position K, deletes it and returns X
4) size() reports the size of the contents
The work outside the container would just keep track of incremental changes to K:
K += insert(K); K %= size();
or
K += remove(K); K %= size();
Notice the importance of a sequence point before reading size()
The container data is just a root pointing to a node.
struct node {
unsigned weight;
unsigned value;
node* child[2];
unsigned cweight(unsigned s)
{ return child[s] ? child[s]->weight : 0; }
};
The container member functions insert and remove would be wrappers around recursive static insert and remove functions that each take a node*& in addition to K.
The first thing each of either recursive insert or remove must do is:
if (K<cweight(0)) recurse passing (child[0], K);
else if ((K-=cweight(0))>0) recurse passing (child[1], K-1);
else do the basic operation (read the result, create or destroy a node)
After doing that, you fix the weight at each level up the recursive call stack (starting where you did the work for insert or the level above that for remove).
After incrementing or decrementing the weight at the current level, you may need to re-balance, remembering which side you recursively changed. Insert is simpler: If child[s]->weight*4 >= This->weight*3 you need to re-balance. The re-balance is one of the two basic tree rotations and you select which one based on whether child[s]->cweight(s)<child[s]->cweight(1-s). rebalance for remove is the same idea but different details.
This system does a lot more worst-case re-balancing than a red-black or AVL tree, but it is still entirely O(log N). Maybe there is a better algorithm for a weight-semi-balanced tree, but a few Google searches turned up neither one nor the real name or other details of what I just arbitrarily called a "weight-semi-balanced tree".
Getting the nearly 2X speed up of strangely mixing the read operation into the insert and remove operations, means you will need yet another recursive version of insert that doesn't mix in the read, and is used for the portion of the path below the point you read from (so it does the same recursive weight changes and re-balancing but with different input and output).
Given random access input iterators, the construction is a more trivial recursive function. Grab the middle item from the range of iterators and make a node of it with the total weight of the whole range, then recursively pass the sub ranges before and after the middle one to the same recursive function to create child subtree.
I haven't tested any of this, but I think the following is all the code you need for remove as well as the rebalance needed for both insert and remove. Functions taking node*& are static member function of tree and those not taking node*& are non static.
unsigned tree::remove(unsigned K)
{
node* removed = remove(root, K);
unsigned result = removed->value;
delete removed;
return result;
}
// static
node* tree::remove( node*& There, unsigned K) // Find, unlink and return the K'th node
{
node* result;
node* This = There;
unsigned s=0; // Guess at child NOT removed from
This->weight -= 1;
if ( K < This->cweight(0) )
{
s = 1;
result = remove( This->child[0], K );
}
else
{
K -= This->cweight(0);
if ( K > 0 )
{
result = remove( This->child[1], K-1 );
}
else if ( ! This->child[1] )
{
// remove This replacing it with child[0]
There = This->child[0];
return This; // Nothing here/below needs a re-balance check
}
else
{
// remove This replacing it with the leftmost descendent of child[1]
result = This;
There = This = remove( This->child[1], 0 );
This->child[0] = result->child[0];
This->child[1] = result->child[1];
This->weight = result->weight;
}
}
rebalance( There, s );
return result;
}
// static
void tree::rebalance( node*& There, unsigned s)
{
node* This = There;
node* c = This->child[s];
if ( c && c->weight*4 >= This->weight*3 )
{
node* b = c->child[s];
node* d = c->child[1-s];
unsigned bweight = b ? b->weight : 0;
if ( d && bweight < d->weight )
{
// inner rotate: d becomes top of subtree
This->child[s] = d->child[1-s];
c->child[1-s] = d->child[s];
There = d;
d->child[s] = c;
d->child[1-s] = This;
d->weight = This->weight;
c->weight = bweight + c->cweight(1-s) + 1;
This->weight -= c->weight + 1;
}
else
{
// outer rotate: c becomes top of subtree
There = c;
c->child[1-s] = This;
c->weight = This->weight;
This->child[s] = d;
This->weight -= bweight+1;
}
}
}

You can use std::set, which is typically implemented as a balanced binary search tree. Its constructor accepts an iterator range, so you shouldn't have a problem transforming the list into a set.

Modify lazy propagation in segment tree

I recently read about lazy propagation in segment trees and coded it too. But I got stuck on this: suppose that instead of adding a value (= val) I need to divide by a value. How do I do that?
Please help.
My update function is as follows:
void update_tree(int node, int a, int b, int i, int j, int value) {
if(lazy[node] != 0) { // This node needs to be updated
tree[node] += lazy[node]; // Update it
if(a != b) {
lazy[node*2] += lazy[node]; // Mark child as lazy
lazy[node*2+1] += lazy[node]; // Mark child as lazy
}
lazy[node] = 0; // Reset it
}
if(a > b || a > j || b < i) // Current segment is not within range [i, j]
return;
if(a >= i && b <= j) { // Segment is fully within range
tree[node] += value;
if(a != b) { // Not leaf node
lazy[node*2] += value;
lazy[node*2+1] += value;
}
return;
}
update_tree(node*2, a, (a+b)/2, i, j, value); // Updating left child
update_tree(1+node*2, 1+(a+b)/2, b, i, j, value); // Updating right child
tree[node] = max(tree[node*2], tree[node*2+1]); // Updating root with max value
}

HINTS
Suppose you need to divide by a fixed value of K.
One possibility would be to convert your numbers to base K and in each node maintain an array of numbers A[], where A[i] is the total in all lower nodes of all digits in position i (when thought of as a base K number).
So, for example, if K was 10, then A[0] would store the total of all the units, while A[1] would store the total of all the tens.
The reason to do this is that it then becomes easy to divide lazily by K, all you need to do is set A[i]=A[i+1] and you can use the same lazy update trick as in your code.
EXAMPLE
Suppose we had an array 5,11,20,100 and K was 10
We would construct a node for element 5,11 containing the value:
Total = A[1]*10+A[0]*1 with A[1]=1 and A[0]=5+1 (the sum of the unit values)
we would also have a node for 20,100 containing the value:
Total = A[2]*100+A[1]*10+A[0]*1 with A[2]=1,A[1]=2,A[0]=0
and a node for the entire 5,11,20,100 array with:
Total = A[2]*100+A[1]*10+A[0]*1 with A[2]=1,A[1]=2+1,A[0]=5+1
If we then wanted to divide the whole array by 10, we would simply change the array elements for the top node:
A=[1,3,6] changes to [0,1,3]
and then we could query the sum over the whole array by computing:
Total = A[2]*100+A[1]*10+A[0]*1 = 0*100+1*10+3*1=13
which is the same as
(5/10=0)+(11/10=1)+(20/10=2)+(100/10=10)
