How to implement BST functions insert() and split() based upon rank? - c++

I am trying to figure out how to implement insert() (which inserts an element into the tree) and split() (which splits the tree on rank r into two trees L and R, with L containing ranks < r and R containing ranks >= r). I am dumbfounded by this assignment. I believe the code for my insert is correct, since it appears to work:
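Node *insert(Node *T, int v, int r)
{
    if (T == nullptr)
    {
        return new Node(v);
    }
    int rank = T->left ? T->left->size : 0; // rank of the root within T
    if (r <= rank)
    {
        // The new element goes before the root: recurse left with the same rank.
        T->left = insert(T->left, v, r);
    }
    else
    {
        // The new element goes after the root: skip the left subtree and the root.
        T->right = insert(T->right, v, r - rank - 1);
    }
    fix_size(T);
    return T;
}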
For my split() function, I barely have anything that works. Can someone explain the algorithm of how to complete these two functions? Thank you!

I'm assuming we're dealing with unbalanced BSTs here (balance makes this harder).
Splitting the null tree is trivial (return (null, null)).
Given a non-null tree T, compare the rank of the root to r. If it's less, then recursively split the right child to get (R<, R≥) and return (T′, R≥) where T′ is T with the right child of the root replaced by R<. Note that ranks inside the right child are relative to that subtree, so the recursive call splits on r minus the root's rank minus one. Similarly, if the rank of the root is greater than or equal to r, then recursively split the left child into (L<, L≥) and return (L<, T′′) where T′′ is T with the left child of the root replaced by L≥.
This has a slick iterative implementation with Node** pointers that I will leave as an exercise.
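A minimal recursive sketch of the split described above. The Node layout, size_of, and fix_size are assumptions, since the question does not show its own definitions; ranks are taken to be zero-based in-order positions.

```cpp
#include <utility>

// Hypothetical node layout; the question does not show its Node definition.
struct Node {
    int value;
    int size;      // number of nodes in this subtree
    Node* left;
    Node* right;
    explicit Node(int v) : value(v), size(1), left(nullptr), right(nullptr) {}
};

int size_of(const Node* t) { return t ? t->size : 0; }

void fix_size(Node* t) { t->size = 1 + size_of(t->left) + size_of(t->right); }

// Splits T into (L, R): L holds the nodes with rank < r, R those with rank >= r.
std::pair<Node*, Node*> split(Node* T, int r) {
    if (T == nullptr) return std::pair<Node*, Node*>(nullptr, nullptr);
    int root_rank = size_of(T->left);
    if (root_rank < r) {
        // Root and left subtree belong to L; split the right child.
        // Ranks in the right child are relative to it, hence the adjustment.
        std::pair<Node*, Node*> p = split(T->right, r - root_rank - 1);
        T->right = p.first;
        fix_size(T);
        return std::pair<Node*, Node*>(T, p.second);
    }
    // Root and right subtree belong to R; split the left child.
    std::pair<Node*, Node*> p = split(T->left, r);
    T->left = p.second;
    fix_size(T);
    return std::pair<Node*, Node*>(p.first, T);
}
```

Each call does constant work and descends one level, so the cost is proportional to the height of the tree.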

Related

Count the number of nodes in an AVL tree in a given range

I'm required to write a C++ function that, given a range (a,b], returns the number of nodes in an AVL tree that are in that given range, specifically in log(n) time complexity.
I can add more fields to the tree's nodes if I need to do so.
I should point out that a,b will not necessarily appear in the tree. For example, if the tree's nodes are: 1,2,5,7,9,10, then running the function using the parameters (3,9] should return 3.
Which algorithm should I use to achieve this?
This is a famous problem: dynamic order statistics by tree augmentation.
You basically need to augment your nodes so that when you look at a child pointer, you know in O(1) time how many nodes are in that child's subtree. It's easy to see that this can be maintained without affecting the complexity of the other operations.
Once you have that, you can answer any query (between this and that, inclusive/exclusive - all possibilities) by performing two traversals from the root down to a node. The exact traversals depend on the details (check the functions lower_bound and upper_bound in C++ for example).
First you could implement a split by key operation. That is, given a tree, split(tree, key, ts, tg) splits the tree into two trees: ts contains the keys less than key; tg the greater or equal ones. This operation can be done in O(lg n).
Then, with two splits, the first on a and the second on b you can obtain the desired subset range in O(lg n).
The split could be implemented as follows (pseudo code):
void split(Node * root, const Key & key, Node *& ts, Node *& tg) noexcept
{
    if (root == Node::NullPtr)
    {
        ts = tg = Node::NullPtr; // an empty tree splits into two empty trees
        return;
    }
    if (!(KEY(root) < key))
    { // root's key is greater than or equal to key: root goes to tg
        Node * r = RLINK(root), * tgaux = Node::NullPtr;
        split(LLINK(root), key, ts, tgaux);
        insert(tgaux, root); // insert root in tgaux
        tg = join_ex(tgaux, r);
    }
    else
    { // root's key is less than key: root goes to ts
        Node * l = LLINK(root), * tsaux = Node::NullPtr;
        split(RLINK(root), key, tsaux, tg);
        insert(tsaux, root); // insert root in tsaux
        ts = join_ex(l, tsaux);
    }
}
The join_ex(t1, t2) joins two exclusive trees; that is, all the keys of t1 are less than any key of tree t2. This join can be implemented in O(lg n) in a similar way to the concatenation described by Knuth in TAOCP V3 6.2.3.
Roughly speaking, if you want to join l and r, suppose h(l) > h(r). You remove from r the leftmost node (the minimum). Let j be this join node and r' the resulting tree (r - j). Now you descend along the right side of l until reaching a node p such that h(p) - h(r') equals 0 or 1. At this moment you do
And you treat p as if it had just been inserted.
EDIT: I misinterpreted the question. Sorry. I did not see that it asked for a count rather than for computing a set. The following is my answer. I am not erasing what I wrote above because I think it is useful anyway.
Ami Tavory was right.
If you use extended trees, that is, trees that store the subtree cardinality in each node, then you can easily compute the inorder position of a key. I usually call this operation position(key). If key is not in the set, it returns the position that key would have if it were inserted in the tree.
The inorder position of the root is the cardinality of its left subtree.
Now, in order to count the cardinality of the set [a, b) you compute position(b) - position(a). You may need some adjustments if a or b are not present in the tree. But that is basically it.
position(key) is, I think, "naturally" simple. Supposing that the node cardinality is accessed with COUNT(node):
long position(Node * root, const Key & key) noexcept
{
    if (root == Node::NullPtr)
        return 0;
    if (key < KEY(root))
        return position(LLINK(root), key);
    else if (KEY(root) < key)
        return position(RLINK(root), key) + COUNT(LLINK(root)) + 1;
    else // the root contains key
        return COUNT(LLINK(root));
}
Since an avl tree is balanced, position takes O(lg n). So two calls take O(lg n). A non recursive version is simple.
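For the half-open range (a, b] asked about in the question, one way to sketch the counting is to count the keys that are less than or equal to a bound and subtract. The Node layout and the names count_le and count_in_range below are illustrative assumptions, not part of any library, and the balancing itself is left out.

```cpp
#include <cstddef>

// Hypothetical augmented node: `count` is the number of nodes in the subtree.
struct Node {
    int key;
    std::size_t count;
    Node* left;
    Node* right;
};

std::size_t count_of(const Node* n) { return n ? n->count : 0; }

// Number of keys <= x, found with a single walk from the root downwards.
std::size_t count_le(const Node* n, int x) {
    std::size_t total = 0;
    while (n != nullptr) {
        if (x < n->key) {
            n = n->left;                      // node and its right side are > x
        } else {
            total += count_of(n->left) + 1;   // left subtree and node are <= x
            n = n->right;
        }
    }
    return total;
}

// Number of keys in the half-open range (a, b].
std::size_t count_in_range(const Node* root, int a, int b) {
    if (b <= a) return 0;
    return count_le(root, b) - count_le(root, a);
}
```

Neither bound has to be present in the tree. With the keys 1, 2, 5, 7, 9, 10 from the question, count_le gives 5 for 9 and 2 for 3, so the range (3, 9] contains 3 keys.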
I hope you can forgive my mistake.

How is the R-B tree's find function implemented in SGI STL?

Recently, I have been reading the SGI STL source code. When I read the R-B tree's find function, I could not understand its code. Here is the code, followed by an example. Could anyone explain how find proceeds? Thanks.
template <class Key, class Value, class KeyOfValue, class Compare, class Alloc>
typename rb_tree<Key, Value, KeyOfValue, Compare, Alloc>::iterator
rb_tree<Key, Value, KeyOfValue, Compare, Alloc>::find(const Key& k) {
  link_type y = header; // Last node which is not less than k.
  link_type x = root(); // Current node.
  while (x != 0)
    if (!key_compare(key(x), k))
      y = x, x = left(x); // key of x is not less than k
    else
      x = right(x); // key of x is less than k
  iterator j = iterator(y);
  return (j == end() || key_compare(k, key(j.node))) ? end() : j;
}
Here is one example:
I want to find the nodes with values 70 and 90. Could anyone walk me through the process? Thanks.
What confuses me is this code: [else x=right(x);] and the return statement.
Thanks, I worked out the answer. Solved; here is an example that finds 70.
First, [x=root()=30, y=header]; 30<70, so [x=x->right=60];
Second, 60<70: [x=x->right=70];
Then, 70>=70, so [y=x=70, x=x->left=65];
Last, 65<70: [x=x->right=NULL];
iterator j = iterator(y);
return j;
In red-black trees, as in all binary search trees, the keys in the left subtree of any node are less than the node's key and the keys in the right subtree are greater. To find a node with a given key, we first compare the key with the root node. If it is greater than the root's key, we go to the root's right child and compare the key to it; otherwise we go to the left child. Thus we can find 90 by visiting nodes 30->60->70->85->90. Since the height of an RB tree is at most 2log(n+1), we can find any node in O(log(n)) time, where n is the number of nodes.
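The same search can be sketched on a plain, uncoloured BST. This is not the SGI STL code: the Node struct is a simplified assumption and a null pointer stands in for the header/end() sentinel. It keeps the same idea of remembering in y the last node whose key is not less than k.

```cpp
// Simplified BST node, assumed for illustration only.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the node holding k, or nullptr if k is absent.
const Node* find(const Node* root, int k) {
    const Node* y = nullptr;   // last node whose key is not less than k
    const Node* x = root;      // current node
    while (x != nullptr) {
        if (!(x->key < k)) {   // key(x) >= k: remember x, look for a smaller match
            y = x;
            x = x->left;
        } else {               // key(x) < k: the answer can only be to the right
            x = x->right;
        }
    }
    // y is the smallest key >= k, if any; it is a hit only if it is not > k.
    return (y == nullptr || k < y->key) ? nullptr : y;
}
```

On a tree shaped like the walkthrough (30 -> 60 -> 70, with 65 as the left child of 70 and 85 -> 90 on its right), searching for 70 leaves y at 70 after the loop falls off below 65, which is why the else branch and the final comparison are both needed.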

Binary tree interview: implement follow operation

I was asked to implement a binary search tree with a follow operation for each node v, with O(1) complexity. The follow operation should return a node w (w > v).
I proposed doing it in O(log(n)), but they wanted O(1).
Update: it should be the next greater node.
Just keep the maximum element of the tree and always return it for nodes v < maximum.
You can get O(1) if you store pointers to the "next node" (computed using your O(log(n)) algorithm), provided you are allowed to do that.
How about:
int tree[N];
size_t follow(size_t v) {
    // First try the right child
    size_t w = v * 2 + 1;
    if(w >= N) {
        // Otherwise right sibling
        w = v + 1;
        if(w >= N) {
            // Finally right parent
            w = (v - 1) / 2 + 1;
        }
    }
    return w;
}
Where tree is a complete binary tree in array form and v/w are represented as zero-based indices.
One idea is to literally just have a next pointer on each node.
You can update these pointers in O(height) after an insert or remove (O(height) is O(log n) for a self-balancing BST), which is as long as an insert or remove takes, so it doesn't add to the time complexity.
Alternatively, you can also have a previous pointer in addition to the next pointer. If you do this, you can update these pointers in O(1).
Obviously, in either case, if you have a node, you also have its next pointer, and you can simply get this value in O(1).
Pseudo-code
For only a next pointer, after the insert, you'd do:
if inserted as a right child:
    newNode.next = parent.next
    parent.next = newNode
else // left child
    newNode.next = parent
    predecessor(newNode).next = newNode
For both next and previous pointers:
if inserted as a right child:
    parent.next.previous = newNode
    newNode.next = parent.next
    newNode.previous = parent
    parent.next = newNode
else // left child
    parent.previous.next = newNode
    newNode.previous = parent.previous
    newNode.next = parent
    parent.previous = newNode
(some null checks are also required).
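The variant with both pointers could look roughly like this in C++. It is a sketch on an unbalanced BST; the Node layout and the insert and follow names are assumptions made for the example, and removal and rebalancing are left out.

```cpp
// Hypothetical node with threading pointers in addition to the child links.
struct Node {
    int key;
    Node* left;
    Node* right;
    Node* next;      // in-order successor
    Node* previous;  // in-order predecessor
    explicit Node(int k)
        : key(k), left(nullptr), right(nullptr), next(nullptr), previous(nullptr) {}
};

// Unbalanced BST insert that also splices the new node into the
// next/previous chain with O(1) extra work. Equal keys go to the right.
Node* insert(Node* root, Node* fresh) {
    if (root == nullptr) return fresh;
    Node* parent = root;
    while (true) {
        Node*& child = (fresh->key < parent->key) ? parent->left : parent->right;
        if (child == nullptr) { child = fresh; break; }
        parent = child;
    }
    if (parent->right == fresh) {
        // New right leaf: it comes immediately after its parent.
        fresh->next = parent->next;
        fresh->previous = parent;
        if (parent->next != nullptr) parent->next->previous = fresh;
        parent->next = fresh;
    } else {
        // New left leaf: it comes immediately before its parent.
        fresh->previous = parent->previous;
        fresh->next = parent;
        if (parent->previous != nullptr) parent->previous->next = fresh;
        parent->previous = fresh;
    }
    return root;
}

// O(1): the next greater node, or nullptr for the maximum.
Node* follow(const Node* v) { return v->next; }
```

Inserting 50, 30, 70, 40, 60 in that order yields the chain 30 -> 40 -> 50 -> 60 -> 70, so follow on the node holding 40 returns the node holding 50.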

printing all binary trees from inorder traversal

Came across this question in an interview.
Given inorder traversal of a binary tree. Print all the possible binary trees from it.
Initial thought:
If say we have only 2 elements in the array. Say 2,1.
Then two possible trees are
2        1
 \      /
  1    2
If 3 elements Say, 2,1,4. Then we have 5 possible trees.
2        1        4      2        4
 \      / \      /        \      /
  1    2   4    1          4    2
   \           /          /      \
    4         2          1        1
So, basically if we have n elements, then we have n-1 branches (children, / or \).
We can arrange these n-1 branches in any order.
For n=3, n-1 = 2. So, we have 2 branches.
We can arrange the 2 branches in these ways:
  /    \      \    /     /\
 /      \     /     \
Initial attempt:
struct node *findTree(int *A,int l,int h)
{
    node *root = NULL;
    if(h < l)
        return NULL;
    for(int i=l;i<h;i++)
    {
        root = newNode(A[i]);
        root->left = findTree(A,l,i-1);
        root->right = findTree(A,i+1,h);
        printTree(root);
        cout<<endl;
    }
}
This problem breaks down quite nicely into subproblems. Given an inorder traversal, after choosing a root we know that everything before it is the left subtree and everything after it is the right subtree (either is possibly empty).
So to enumerate all possible trees, we just try all possible values for the root and recursively solve for the left & right subtrees (the number of such trees grows quite quickly though!)
antonakos provided code that shows how to do this, though that solution may use more memory than desirable. That could be addressed by adding more state to the recursion so it doesn't have to save lists of the answers for the left & right and combine them at the end; instead nesting these processes, and printing each tree as it is found.
I'd write one function for constructing the trees and another for printing them.
The construction of the trees goes like this:
#include <vector>
#include <iostream>

struct Tree {
    int value;
    Tree* left;
    Tree* right;
    Tree(int value, Tree* left, Tree* right) :
        value(value), left(left), right(right) {}
};

typedef std::vector<Tree*> Seq;

// Builds every tree whose inorder traversal is xs[from..to).
Seq all_trees(const std::vector<int>& xs, int from, int to)
{
    Seq result;
    if (from >= to) result.push_back(nullptr); // the single empty tree
    else {
        for (int i = from; i < to; i++) {
            // xs[i] is the root; combine every left tree with every right tree.
            const Seq left = all_trees(xs, from, i);
            const Seq right = all_trees(xs, i + 1, to);
            for (Tree* tl : left) {
                for (Tree* tr : right) {
                    result.push_back(new Tree(xs[i], tl, tr));
                }
            }
        }
    }
    return result;
}

Seq all_trees(const std::vector<int>& xs)
{
    return all_trees(xs, 0, (int)xs.size());
}
Observe that for each root value there are multiple trees that can be constructed from the values to the left and to the right of the root value. All combinations of these left and right trees are included.
Writing the pretty-printer is left as an exercise (a boring one), but we can test that the function indeed constructs the expected number of trees:
int main()
{
    const std::vector<int> xs(3, 0); // 3 values gives 5 trees.
    const Seq result = all_trees(xs);
    std::cout << "Number of trees: " << result.size() << "\n";
}
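As a stand-in for the pretty-printer, a simple option is to emit each tree as a nested string rather than a drawing. The sketch below uses the same enumeration but builds strings directly; the function name all_shapes and the "(left value right)" notation with "." for an empty subtree are choices made for this example only.

```cpp
#include <string>
#include <vector>

// Enumerates every binary tree whose inorder traversal is xs[from..to)
// and serialises each one as "(left value right)", using "." for an
// empty subtree.
std::vector<std::string> all_shapes(const std::vector<int>& xs, int from, int to) {
    std::vector<std::string> result;
    if (from >= to) {
        result.push_back(".");   // exactly one empty tree
        return result;
    }
    for (int i = from; i < to; ++i) {
        // xs[i] is the root; pair every left shape with every right shape.
        const std::vector<std::string> left = all_shapes(xs, from, i);
        const std::vector<std::string> right = all_shapes(xs, i + 1, to);
        for (const std::string& l : left)
            for (const std::string& r : right)
                result.push_back("(" + l + " " + std::to_string(xs[i]) + " " + r + ")");
    }
    return result;
}
```

For the inorder sequence 2, 1, 4 this produces five strings, the first of which is "(. 2 (. 1 (. 4 .)))", i.e. the right-leaning chain.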

Binary tree where value of each node holds the sum of child nodes

This question was asked of me in an interview. How can we convert a BT such that every node in it has a value which is the sum of its child nodes?
Give each node an attached value. When you construct the tree, the value of a leaf is set; construct interior nodes to have the value leaf1.value + leaf2.value.
If you can change the values of the leaf nodes, then the operation has to go "back up" the tree updating the sum values.
This will be a lot easier if you either include back links in the nodes, or implement the tree as a "threaded tree".
Here is a solution that can help you: (the link explains it with tree-diagrams)
Convert an arbitrary Binary Tree to a tree that holds Children Sum Property
/* This function changes a tree to hold the children sum
   property */
void convertTree(struct node* node)
{
    int left_data = 0, right_data = 0, diff;
    /* If the tree is empty or it's a leaf node then
       there is nothing to do */
    if(node == NULL ||
       (node->left == NULL && node->right == NULL))
        return;
    else
    {
        /* convert left and right subtrees */
        convertTree(node->left);
        convertTree(node->right);
        /* If the left child is not present then 0 is used
           as data of the left child */
        if(node->left != NULL)
            left_data = node->left->data;
        /* If the right child is not present then 0 is used
           as data of the right child */
        if(node->right != NULL)
            right_data = node->right->data;
        /* get the diff of the children sum and the node's data */
        diff = left_data + right_data - node->data;
        /* If the node's data is smaller then increment the node's data
           by diff */
        if(diff > 0)
            node->data = node->data + diff;
        /* THIS IS TRICKY --> If the node's data is greater then increment
           the left subtree by -diff (increment() is defined in the
           linked solution) */
        if(diff < 0)
            increment(node->left, -diff);
    }
}
See the link to see the complete solution and explanation!
Well as Charlie pointed out, you can simply store the sum of respective subtree sizes in each inner node, and have leaves supply constant values at construction (or always implicitly use 1, if you're only interested in the number of leaves in a tree).
This is commonly known as an Augmented Search Tree.
What's interesting is that through this kind of augmentation, i.e., storing additional per-node data, you can derive other kinds of aggregate information for items in the tree as well. Any information you can express as a monoid you can store in an augmented tree, and for this, you'll need to specify:
the data type M; in your example, integers
a binary operation "op" to combine elements, with M op M -> M; in your example, the common "plus" operator
So besides subtree sizes, you can also express stuff like:
priorities (by way of the "min" or "max" operators), for efficient queries on min/max priorities;
rightmost elements in a subtree (i.e., an "op" operator that simply returns its second argument), provided that the elements you store in a tree are ordered somehow. Note that this allows us to view even regular search trees (aka. dictionaries -- "store this, retrieve that key") as augmented trees with a corresponding monoid.
(This concept is rather reminiscent of heaps, or more explicitly treaps, which store random priorities with inner nodes for probabilistic balancing. It's also quite commonly described in the context of Finger Trees, although these are not the same thing.)
If you also provide a neutral element for your monoid, you can then walk down such a monoid-augmented search tree to retrieve specific elements (e.g., "find me the 5th leaf" for your size example; "give me the leaf with the highest priority").
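To make the monoid idea a little more concrete, here is a small sketch in which the monoid is a type supplying a value type, a neutral element, and a combine operation, and every node caches the combined summary of the leaves below it. All names (SumMonoid, MaxMonoid, AugNode, leaf, inner, kth_leaf) are invented for this illustration; balancing and updates are left out.

```cpp
#include <algorithm>
#include <limits>

// Two hypothetical monoids over int: (+, 0) and (max, smallest int).
struct SumMonoid {
    typedef int value_type;
    static int identity() { return 0; }
    static int combine(int a, int b) { return a + b; }
};

struct MaxMonoid {
    typedef int value_type;
    static int identity() { return std::numeric_limits<int>::min(); }
    static int combine(int a, int b) { return std::max(a, b); }
};

// A node caches the monoid summary of all leaves in its subtree.
template <class M>
struct AugNode {
    typename M::value_type summary;
    const AugNode* left;
    const AugNode* right;
};

// An empty subtree contributes the neutral element.
template <class M>
typename M::value_type summary_of(const AugNode<M>* n) {
    return n ? n->summary : M::identity();
}

template <class M>
AugNode<M> leaf(typename M::value_type v) {
    AugNode<M> n = { v, nullptr, nullptr };
    return n;
}

// The children must outlive the returned node; only pointers are stored.
template <class M>
AugNode<M> inner(const AugNode<M>* l, const AugNode<M>* r) {
    AugNode<M> n = { M::combine(summary_of(l), summary_of(r)), l, r };
    return n;
}

// With SumMonoid and every leaf set to 1, the summary is the leaf count,
// which lets us walk down to the k-th leaf (zero-based).
const AugNode<SumMonoid>* kth_leaf(const AugNode<SumMonoid>* n, int k) {
    if (n == nullptr || k < 0 || k >= n->summary) return nullptr;
    while (n->left != nullptr || n->right != nullptr) {
        int in_left = summary_of(n->left);
        if (k < in_left) {
            n = n->left;
        } else {
            k -= in_left;
            n = n->right;
        }
    }
    return n;
}
```

Swapping SumMonoid for MaxMonoid changes what the cached summary means (largest priority below a node instead of leaf count) without touching the node or construction code.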
Uhm, anyways. Might have gotten carried away a bit there.. I just happen to find that topic quite interesting. :)
Here is the code for the sum problem. It works; I have tested it.
int sum_of_left_n_right_nodes(tree* local_tree){
    int left_sum = 0;
    int right_sum = 0;
    if(NULL == local_tree){
        return 0;
    }
    if((NULL == local_tree->left)&&(NULL == local_tree->right)){
        local_tree->sum = 0; /* a leaf has no children to add up */
        return 0;
    }
    /* fill in the sums of both subtrees first */
    sum_of_left_n_right_nodes(local_tree->left);
    sum_of_left_n_right_nodes(local_tree->right);
    if(NULL != local_tree->left)
        left_sum = local_tree->left->data +
                   local_tree->left->sum;
    if(NULL != local_tree->right)
        right_sum = local_tree->right->data +
                    local_tree->right->sum;
    local_tree->sum = right_sum + left_sum;
    return local_tree->sum;
}
With a recursive function you can do so by making the value of each node equal to the sum of the values of its children if it has two children, or to the value of its single child if it has one. If it has no children (a leaf), that is the base case of the recursion, and the value never changes.
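That recursion can be sketched as a post-order pass. The TreeNode struct and the name to_sum_tree are assumptions made for the example; the interpretation is the one in the paragraph above, where leaves keep their values and inner nodes are overwritten.

```cpp
// Minimal node type assumed for this sketch.
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
};

// Post-order pass: the children are fixed up first, then the node is
// overwritten with the sum of its children's (already final) values.
// Leaves are left unchanged. Returns the node's final value, or 0 for
// an empty tree.
int to_sum_tree(TreeNode* n) {
    if (n == nullptr) return 0;
    if (n->left == nullptr && n->right == nullptr) return n->data;  // leaf
    n->data = to_sum_tree(n->left) + to_sum_tree(n->right);
    return n->data;
}
```

For a root with an inner left child over the leaves 3 and 4 and a right leaf 5, the inner node becomes 7 and the root becomes 12, whatever values they held before.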