Finding kthSmallestElement in the BST - c++

I am trying to solve the following question from LeetCode:
https://leetcode.com/problems/kth-smallest-element-in-a-bst/description/
The aim is, given a BST, we have to find out the Kth-smallest element in it and return its value.
I could come up with a O(n) time and space solution myself. But another solution which I wrote with online help is far better:
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
int kthSmallestUtil(TreeNode* root, int& k) {
if(!root) return -1;
int value=kthSmallestUtil(root->left, k);
if(!k) return value;
k--;
if(k==0) return root->val;
return kthSmallestUtil(root->right, k);
}
int kthSmallest(TreeNode* root, int k) {
return kthSmallestUtil(root, k);
}
};
I have understood the above solution. I also debugged it (https://onlinegdb.com/BJnoIkrLM) by inserting break points at 29, 30, 33 and 37. However, I still feel a bit uneasy because of the following reason:
In case of the call kthSmallestUtil(root->left, k);, we pass the original value of k; we then (understandably) decrement the value of k for the current root (since we are doing in order traversal). But, when we again recurse for kthSmallestUtil(root->right, k);, why don't we pass the original value of k? Why does the right child get a 'preferential' treatment - a decremented value of k?
I know because of debugging how the values of k change and we get the final answer.. But I am seeking some intuition behind using the original value of k for the left child and the decremented value of k for the right child.

This solutions seems to assume an ordered binary search tree.
That means the left branch of the tree contains only smaller values than the current nodes val. Thus it first recurses into the left branch, decrementing k along the way, then if k is not 0 k is decremented for the current element. If k is still not 0 then the values in the right branch, all greater than the current nodes value, are considered.
What you need to understand is that the k being decremented in the k--; line is not the original value of k but the value of k after the traversal of the entire left branch.
The recursive calls all modify the same k because k is passed by reference and not by value

The code works more less this way - go as deep as you can in the left branch of the BST. When you reach the leftmost leaf - the smallest value - decrement k value and start seraching in the ramaining part of the BST. Because we already visited smallest value in the whole tree and we are searching for kth smallest value, we must search for k-1th smallest value in the rest of the tree (as we no longer take into account this leftmost leaf). And so, if k is equal to zero it means current node has the kth smallest value. Otherwise it is necessary to also search the right subtrees.

Related

Appropriate data structure for add and find queries

I have two types of queries.
1 X Y
Add element X ,Y times in the collection.
2 N
Number of queries < 5 * 10^5
X < 10^9
Y < 10^9
Find Nth element in the sorted collection.
I tried STL set but it did not work.
I think we need balanced tree with each node containing two data values.
First value will be element X. And another will be prefix sum of all the Ys of elements smaller than or equal to value.
When we are adding element X find preprocessor of that first value.Add second value associated with preprocessor to Y.
When finding Nth element. Search in tree(second value) for value immediately lower than N.
How to efficiently implement this data structure ?
This can easily be done using segment tree data structure with complexity of O(Q*log(10^9))
We should use so called "sparse" segment tree so that we only create nodes when needed, instead of creating all nodes.
In every node we will save count of elements in range [L, R]
Now additions of some element y times can easily be done by traversing segment tree from root to leaf and updating the values (also creating nodes that do not exist yet).
Since the height of segment tree is logarithmic this takes log N time where N is our initial interval length (10^9)
Finding k-th element can easily be done using binary search on segment tree, since on every node we know the count of elements in some range, we can use this information to traverse left or right to the element which contains the k-th
Sample code (C++):
#include <bits/stdc++.h>
using namespace std;
#define ll long long
const int sz = 31*4*5*100000;
ll seg[sz];
int L[sz],R[sz];
int nxt = 2;
void IncNode(int c, int l, int r, int idx, int val)
{
if(l==r)
{
seg[c]+=val;
return;
}
int m = (l+r)/2;
if(idx <= m)
{
if(!L[c])L[c]=nxt++;
IncNode(L[c],l,m,idx,val);
}
else
{
if(!R[c])R[c]=nxt++;
IncNode(R[c],m+1,r,idx,val);
}
seg[c] = seg[L[c]] + seg[R[c]];
}
int FindKth(int c, int l, int r, ll k)
{
if(l==r)return r;
int m = (l+r)/2;
if(seg[L[c]] >= k)return FindKth(L[c],l,m,k);
return FindKth(R[c],m+1,r,k-seg[L[c]]);
}
int main()
{
ios::sync_with_stdio(0);cin.tie(0);cout.tie(0);
int Q;
cin>>Q;
int L = 0, R = 1e9;
while(Q--)
{
int type;
cin>>type;
if(type==1)
{
int x,y;
cin>>x>>y;
IncNode(1,L,R,x,y);
}
else
{
int k;
cin>>k;
cout<<FindKth(1,L,R,k)<<"\n";
}
}
}
Maintaining a prefix sum in each node is not practical. It would mean that every time you add a new node, you have to update the prefix sum in every node succeeding it in the tree. Instead, you need to maintain subtree sums: each node should contain the sum of Y-values for its own key and the keys of all descendants. Maintaining subtree sums when the tree is updated should be straightforward.
When you answer a query of type 2, at each node, you would descend into the left subtree if N is less than or equal to the subtree sum value S of the left child (I'm assuming N is 1-indexed). Otherwise, subtract S + 1 from N and descend into the right subtree.
By the way, if the entire set of X values is known in advance, then instead of a balanced BST, you could use a range tree or a binary indexed tree.

Returning Kth smallest element in a BST

I wrote the following code snippet to return the Kth smallest element in a BST:
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
int kthSmallest(TreeNode* root, int k) {
if(root==NULL)
return -2;
cout<<"Root value "<<root->val<<" and k value "<<k<<"\n";
kthSmallest(root->left, k-1);
if((k)==0)
return root->val;
kthSmallest(root->right, k-1);
}
};
I basically just do an inorder traversal of the tree and decrement the value of k during each recursive call. So, in this way, when the value of k equals 0, I have found the node and just return it.
Following is the output of my debugging statements:
Root value 1 and k value 3
Root value 2 and k value 2
Root value 4 and k value 1
Root value 8 and k value 0
Root value 9 and k value 0
Root value 5 and k value 1
Root value 10 and k value 0
Root value 11 and k value 0
Root value 3 and k value 2
Root value 6 and k value 1
Root value 12 and k value 0
Root value 7 and k value 1
I am unable to understand why the program keeps on executing even after k has become 0. What have I missed? I appreciate your help.
Edit: I don't think the question description is required, but if needed, it can be found here: LeetCode: Find Kth smallest element in a BST. Also, please note that I cannot edit the function prototype. Also, the question says that the k would be a valid number between 1 and BST's total number of elements.
Variables are not shared between (recursive or not) calls. Each call will have k being a normal local variable whose value is promptly forgotten once the function returns.
If you have k == 1 and then do kthSmallest(root->left, k-1) it's only in the recursive call that k == 0. In the callee the value of k is still 1.
This algorithm will never work as coded, because you rely on arguments and local variables being shared.
You also don't propagate the K:th smallest value from the bottom of the call tree. You just throw it away, and in the end (the very first call to kthSmallest) don't return anything which leads to undefined behavior.
You code would not work as is, as in the previous comment.
You could do something along the lines of:
int kthSmallest(TreeNode* root, int k) {
int i = 0;
return kthSmallest(root, &i, k);
}
int kthSmallest(TreeNode* root, int *i, int k) {
if(root == nullptr)
return INT32_MAX;
int left = kthSmallest(root->left, i, k);
if (left != INT32_MAX)
return left;
if (++*i == k)
return root->val;
kthSmallest(root->right, i, k);
}

Count the number of nodes in an AVL tree in a given range

I'm required to write a C++ function that, given a range (a,b], returns the number of nodes in an AVL tree that are in that given range, specifically in log(n) time complexity.
I can add more fields to the tree's nodes if I need to do so.
I should point out that a,b will not necessarily appear in the tree. For example, if the tree's nodes are: 1,2,5,7,9,10, then running the function using the parameters (3,9] should return 3.
Which algorithm should I use to achieve this?
This is a famous problem - dynamic order statistcs by tree augmentation.
You basically need to augment your nodes so that when you look at a child pointer, you know how many children are in the child's subtree at time O(1). It's easy to see that this can be done without affecting the complexity.
Once you have that, you can answer any query (between this and that, inclusive/exclusive - all possibilities) by performing two traversals from node to roots. The exact traversals depend on the details (check the functions lower_bound and upper_bound in C++ for example).
First you could implement a split by key operation. That is, given a tree, to perform split(tree, key, ts, tg) splits the key in two trees; ts contains the keys less than key; t2 the greater or equal ones. This operation can be done in O(lg n).
Then, with two splits, the first on a and the second on b you can obtain the desired subset range in O(lg n).
The split could be implemented as follows (pseudo code):
void split(Node * root, const Key & key, Node *& ts, Node *& tg) noexcept
{
if (root == Node::NullPtr)
return;
if (key < KEY(root))
{
Node * r = RLINK(root), * tgaux = Node::NullPtr;
split(LLINK(root), key, ts, tgaux);
insert(tgaux, root); // insert root in tgaux
tg = join_ex(tgaux, r);
}
else
{ // ket greater or equal than key to tg
Node * l = LLINK(root), *tsaux = Node::NullPtr;
split(RLINK(root), key, tsaux, tg));
insert(tsaux, root); // insert root in tsaux
ts = join_ex(l, tsaux);
}
}
The join_ex(t1, t2) joins two exclusive trees; that is, all the keys of t1 are lesser that any key of tree t2. This join can be implemented in O(lg n) in a similar way to the concatenation described by Knuth in TAOCP V3 6.2.3.
Grosso modo if you want to join l and r, then suppose h(l) > h(r). You remove from r the leftmost node (the minimum). Let j this join node and r' the resulting tree (r - j). Now you descend by the right side of r until reaching a node p such that h(p) - h(r') equals 0 or 1. At this moment you do
And you treat p as if this was inserted.
EDIT: I was wrong in interpreting the question. Sorry. I did not see that it was to count not to calculate a set. The following would be my answer. I do not erase what I've written because I think it is useful anyway.
Ami Tavory was right.
If you use extended trees, that is to store the subtree cardinality in each node, then you could easily compute the inorder positios of a key. I usually call to this operation position(key). If key is not in the set then it returns the position that key had if it was inserted in the tree.
The inorder position of root is the cardinality of left tree.
Now, in order to count the cardinality of [a, b) set you perform position(b) - position(a). You could require to do some adjustments if a or b are not present in the tree. But basically is thus.
position(key) is, I think, "naturally" simple. Supposing that the node cardinality is accessed with COUNT(node):
long position(Node * root, const Key & key) noexcept
{
if (r == Node::NullPtr)
return 0;
if (key < KEY(root))
return position(LLINK(r), key, p);
else if (KEY(r) < key)
return position(RLINK(r), key) + COUNT(LLINK(r)) + 1;
else // the root contains key
return COUNT(LLINK(r));
}
Since an avl tree is balanced, position takes O(lg n). So two calls take O(lg n). A non recursive version is simple.
I hope you know to forgive my mistake

Binary tree interview: implement follow operation

I was asked to implement a binary search tree with follow operation for each node v - the complexity should be O(1). The follow operation should return a node w (w > v).
I proposed to do it in O(log(n)) but they wanted O(1)
Upd. It should be next greater node
just keep the maximum element for the tree and always return it for nodes v < maximum.
You can get O(1) if you store pointers to the "next node" (using your O(log(n) algorithm), given you are allowed to do that.
How about:
int tree[N];
size_t follow(size_t v) {
// First try the right child
size_t w = v * 2 + 1;
if(w >= N) {
// Otherwise right sibling
w = v + 1;
if(w >= N) {
// Finally right parent
w = (v - 1) / 2 + 1;
}
}
return w;
}
Where tree is a complete binary tree in array form and v/w are represented as zero-based indices.
One idea is to literally just have a next pointer on each node.
You can update these pointers in O(height) after an insert or remove (O(height) is O(log n) for a self-balancing BST), which is as long as an insert or remove takes, so it doesn't add to the time complexity.
Alternatively, you can also have a previous pointer in addition to the next pointer. If you do this, you can update these pointers in O(1).
Obviously, in either case, if you have a node, you also have its next pointer, and you can simply get this value in O(1).
Pseudo-code
For only a next pointer, after the insert, you'd do:
if inserted as a right child:
newNode.next = parent.next
parent.next = newNode
else // left child
predecessor(newNode)
For both next and previous pointers:
if inserted as a right child:
parent.next.previous = newNode
newNode.next = parent.next
parent.next = newNode
else // left child
parent.previous.next = newNode
newNode.previous = parent.previous
parent.previous = newNode
(some null checks are also required).

Finding Depth of Binary Tree

I am having trouble understanding this maxDepth code. Any help would be appreciated. Here is the snippet example I followed.
int maxDepth(Node *&temp)
{
if(temp == NULL)
return 0;
else
{
int lchild = maxDepth(temp->left);
int rchild = maxDepth(temp->right);
if(lchild <= rchild)
return rchild+1;
else
return lchild+1;
}
}
Basically, what I understand is that the function recursively calls itself (for each left and right cases) until it reaches the last node. once it does, it returns 0 then it does 0+1. then the previous node is 1+1. then the next one is 2+1. if there is a bst with 3 left childs, int lchild will return 3. and the extra + 1 is the root. So my question is, where do all these +1 come from. it returns 0 at the last node but why does it return 0+1 etc. when it goes up the left/right child nodes? I don't understand why. I know it does it, but why?
Consider this part (of a bigger tree):
A
\
B
Now we want to calculate the depth of this treepart, so we pass pointer to A as its param.
Obviously pointer to A is not NULL, so the code has to:
call maxDepth for each of A's children (left and right branches). A->right is B, but A->left is obviously NULL (as A has no left branch)
compare these, choose the greatest value
return this chosen value + 1 (as A itself takes a level, doesn't it?)
Now we're going to look at how maxDepth(NULL) and maxDepth(B) are calculated.
The former is quite easy: the first check will make maxDepth return 0. If the other child were NULL too, both depths would be equal (0), and we have to return 0 + 1 for A itself.
But B is not empty; it has no branches, though, so (as we noticed) its depth is 1 (greatest of 0 for NULLs at both parts + 1 for B itself).
Now let's get back to A. maxDepth of its left branch (NULL) is 0, maxDepth of its right branch is 1. Maximum of these is 1, and we have to add 1 for A itself - so it's 2.
The point is the same steps are to be done when A is just a part of the bigger tree; the result of this calculation (2) will be used in the higher levels of maxDepth calls.
Depth is being calculated using the previous node + 1
All the ones come from this part of the code:
if(lchild <= rchild)
return rchild + 1;
else
return lchild + 1;
You add yourself +1 to the results obtained in the leaves of the tree. These ones keep adding up until you exit all the recursive calls of the function and get to the root node.
Remember in binary trees a node has at most 2 children (left and right)
It is a recursive algorithm, so it calls itself over and over.
If the temp (the node being looked at) is null, it returns 0, as this node is nothing and should not count. that is the base case.
If the node being looked at is not null, it may have children. so it gets the max depth of the left sub tree (and adds 1, for the level of the current node) and the right subtree (and adds 1 for the level of the current node). it then compares the two and returns the greater of the two.
It dives down into the two subtrees (temp->left and temp->right) and repeats the operation until it reaches nodes without children. at that point it will call maxDepth on left and right, which will be null and return 0, and then start returning back up the chain of calls.
So if you you have a chain of three nodes (say, root-left1-left2) it will get down to left2 and call maxDepth(left) and maxDepth(right). each of those return 0 (they are null). then it is back at left2. it compares, both are 0, so the greater of the two is of course 0. it returns 0+1. then we are at left1 - repeats, finds that 1 is the greater of its left n right (perhaps they are the same or it has no right child) so it returns 1+1. now we are at root, same thing, it returns 2+1 = 3, which is the depth.
Because the depth is calculated with previous node+1
To find Maximum depth in binary tree keep going left and Traveres the tree, basically perform a DFS
or
We can find the depth of the binary search tree in three different recursive ways
– using instance variables to record current depth and total depth at every level
– without using instance variables in top-bottom approach
– without using instance variables in bottom-up approach
The code snippet can be reduced to just:
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return 0;
}
A good way of looking at this code is from the top down:
What would happen if the BST had no nodes? We would have root = NULL and the function would immediately return an expected depth of 0.
Now suppose the tree was populated with a number of nodes. Starting at the top, the if condition would be true for the root node. We then ask, what is the max depth of the LEFT SUB TREE and the RIGHT SUB TREE by passing the root of those sub trees to maxDepth. Both the LST and the RST of the root are one level deeper than the root, so we must add one to get the depth of the tree at root of the tree passed to the function.
i think this is the right answer
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return -1;
}