As part of an AVL template I am working on (C++ templates), I was trying to merge 2 AVL trees in O(n1+n2), where n1+n2 is the total number of elements in both trees.
I thought of the following algorithm.
1. In-order traversal of the 1st tree to build an array/list - O(n1)
2. In-order traversal of the 2nd tree to build an array/list - O(n2)
3. Merge those 2 sorted arrays/lists into a final sorted array/list of size n1+n2 - O(n1+n2)
4. Build an empty almost complete binary tree with n1+n2 vertices - O(n1+n2)
5. In-order traversal of that almost complete binary tree, filling the vertices with the elements of the merged array/list
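For concreteness, the first three steps might look roughly like this (a sketch only, assuming a bare node type; the real AVL template's node will of course differ):

#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical minimal node, only for illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Steps 1-2: an in-order traversal appends the keys in sorted order.
void collectInOrder(const Node* n, std::vector<int>& out) {
    if (!n) return;
    collectInOrder(n->left, out);
    out.push_back(n->key);
    collectInOrder(n->right, out);
}

// Step 3: merge the two sorted arrays in O(n1 + n2).
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(merged));
    return merged;
}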
My question is how do I actually build the empty almost complete binary tree with n1+n2 vertices?
If the nodes produced by the merge are stored in a vector, this can be done relatively easily. Your nodes are already sorted, so you can "insert" them in the following fashion:
1. Build your root node from the element at 1/2 of the array;
2. Build the root's child nodes using the elements at 1/4 and 3/4 of the array;
3. Repeat step 2 recursively.
This should feel like working with a binary tree that happens to be represented as a sorted array: the array is exactly the in-order sequence of the tree you are building.
Note that for this to work, you need to build the tree with balancing "turned off". This is most likely going to require you to make this a private method of your class, probably a special constructor.
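As a rough sketch of that construction (reusing the same minimal node type as in the question's sketch above; remember that balancing must stay switched off while this runs):

#include <cstddef>
#include <vector>

// Same hypothetical minimal node as above.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Build a height-balanced tree from sorted[lo, hi): the middle element becomes
// the root, then the two halves are built recursively, exactly as the steps
// above describe.
Node* buildFromSorted(const std::vector<int>& sorted, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    std::size_t mid = lo + (hi - lo) / 2;
    Node* root = new Node{sorted[mid], nullptr, nullptr};
    root->left = buildFromSorted(sorted, lo, mid);
    root->right = buildFromSorted(sorted, mid + 1, hi);
    return root;
}

// Usage: Node* root = buildFromSorted(merged, 0, merged.size());

Because the two halves at every node differ in size by at most one, the subtree heights differ by at most one as well, so the result already satisfies the AVL balance condition.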
How to find the k largest elements in a binary search tree faster than in O(logN + k)
I implemented the algorithm with those asymptotics, but how can I make it faster?
Extend your tree data structure with the following:
Make your tree threaded, i.e. add a parent reference to each node.
Maintain a reference to the node that has the maximum value (the "rightmost" node). Keep it up to date as nodes are added/removed.
With that information you can avoid the initial descent from the root to the rightmost node and start collecting values immediately. If the binary tree is well balanced, the rightmost node will be on (or near) the bottom layer of the tree. The reverse in-order walk that collects the k greatest-valued nodes then traverses a number of edges that is O(k).
Alternative structures, such as a B+ tree or a skip list, can also provide O(k) access to the k greatest values they store.
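A sketch of what the collection step could look like with those two extensions in place (the node layout and field names here are assumptions, not your actual code):

#include <cstddef>
#include <vector>

// Hypothetical node of the extended tree: each node carries a parent link.
struct Node {
    int key;
    Node* left;
    Node* right;
    Node* parent;
};

// In-order predecessor via the parent links; no descent from the root is needed.
Node* predecessor(Node* n) {
    if (n->left) {                       // rightmost node of the left subtree
        n = n->left;
        while (n->right) n = n->right;
        return n;
    }
    Node* p = n->parent;                 // otherwise climb until we arrive from a right child
    while (p && n == p->left) { n = p; p = p->parent; }
    return p;
}

// Collect the k greatest keys: start at the cached rightmost node and walk the
// tree in reverse in-order, as described above.
std::vector<int> kLargest(Node* rightmost, std::size_t k) {
    std::vector<int> out;
    for (Node* cur = rightmost; cur && out.size() < k; cur = predecessor(cur))
        out.push_back(cur->key);
    return out;
}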
This C++ assignment requires us to create a binary tree and check to see if it is a binary search tree. If it isn't, then we need an algorithm to repair it WITHOUT using extra space or other data structures. My friends and I are all stuck on figuring out a proper algorithm because tons of searching online has come up with little to nothing. We don't care much about runtime, we are mostly focused on figuring out how to repair a BT with the given requirements.
Linked lists are used to create the trees but we're stumped on how to implement some kind of algorithm to convert it into a BST.
Any tips would be greatly appreciated!
Follow the steps given below:
1. Check whether your binary tree T1 is a binary search tree. If it is, you are done; otherwise go to step 2.
2. Do a post-order traversal of the given binary tree and create another tree T2 (a binary search tree, initially pointing to NULL, i.e. initially empty).
3. While doing the post-order traversal of the binary tree, delete the nodes one by one, create a copy of each node, and insert that copy into the BST T2. (NOTE: the insertion must follow the BST ordering; see the sketch below.)
Complexity:
Worst-case time complexity: O(n²).
Space complexity: O(1) auxiliary (constant).
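A rough sketch of these steps (node layout assumed, keys assumed distinct; unlike the description above, this version re-links the existing nodes into T2 instead of copying them, which is what keeps the auxiliary space constant apart from the recursion stack):

#include <climits>

// Hypothetical minimal node, only for illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Step 1: is the tree already a BST? Every key must lie inside the range
// imposed by its ancestors.
bool isBST(const Node* n, long long low = LLONG_MIN, long long high = LLONG_MAX) {
    if (!n) return true;
    if (n->key <= low || n->key >= high) return false;
    return isBST(n->left, low, n->key) && isBST(n->right, n->key, high);
}

// Ordinary BST insertion of an already-allocated node.
Node* bstInsert(Node* root, Node* n) {
    if (!root) return n;
    if (n->key < root->key) root->left = bstInsert(root->left, n);
    else                    root->right = bstInsert(root->right, n);
    return root;
}

// Steps 2-3: post-order traversal of T1 that detaches each node and re-inserts
// it into the growing BST T2.
Node* rebuildAsBST(Node* n, Node* t2) {
    if (!n) return t2;
    t2 = rebuildAsBST(n->left, t2);
    t2 = rebuildAsBST(n->right, t2);
    n->left = n->right = nullptr;        // detach before re-inserting
    return bstInsert(t2, n);
}

// Usage: if (!isBST(t1)) t1 = rebuildAsBST(t1, nullptr);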
You can follow these steps to convert your binary tree into a binary search tree in place:
First, convert the binary tree into a doubly linked list: do an in-order traversal and change the left and right pointers (left can be used as next and right as previous) to form a linked list in place.
Then sort the linked list in place (use merge sort here, as it sorts a linked list in place).
Then you can convert this linked list back into a tree, again in place: start with the middle element of the list as the root and use a recursive function to build the left and right halves accordingly (sketched below).
This will create a balanced tree.
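Here is a sketch of the first and last steps under an assumed node layout; the linked-list merge sort in step 2 is standard and omitted (mergeSortList below is a hypothetical helper):

#include <cstddef>

// Hypothetical node; left/right are temporarily reused as next/previous while
// the tree is flattened into a doubly linked list, as described above.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Step 1: in-order traversal that rewires the tree into a doubly linked list
// in place (left = next, right = previous).
void treeToList(Node* n, Node*& prev, Node*& head) {
    if (!n) return;
    Node* l = n->left;            // save the children before overwriting the pointers
    Node* r = n->right;
    treeToList(l, prev, head);
    n->right = prev;              // "previous"
    n->left = nullptr;            // "next"; fixed up when the successor is visited
    if (prev) prev->left = n;
    else head = n;                // the first node visited becomes the head
    prev = n;
    treeToList(r, prev, head);
}

std::size_t listLength(Node* head) {
    std::size_t n = 0;
    for (; head; head = head->left) ++n;   // left means "next" here
    return n;
}

// Step 3: rebuild a balanced BST from the (now sorted) list, bottom-up.
// `head` advances through the list as nodes are consumed left to right.
Node* listToTree(Node*& head, std::size_t n) {
    if (n == 0) return nullptr;
    Node* leftSub = listToTree(head, n / 2);  // build the left subtree first
    Node* root = head;                        // the middle element becomes the root
    head = head->left;                        // advance to the next list node
    root->left = leftSub;
    root->right = listToTree(head, n - n / 2 - 1);
    return root;
}

// Usage sketch:
//   Node* head = nullptr; Node* prev = nullptr;
//   treeToList(root, prev, head);
//   head = mergeSortList(head);              // step 2 (hypothetical helper, not shown)
//   std::size_t n = listLength(head);
//   root = listToTree(head, n);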
I was once interviewed by "one well known company", and the interviewer asked me to find the median of a BST.
int median(treeNode* root)
{
}
I started with the first brute-force solution that I came up with: fill a std::vector<int> via an in-order traversal (so everything in the vector is sorted) and take the middle element.
So my algorithm is O(N) for inserting every element into the vector and O(1) for querying the middle element, plus O(N) memory.
Is there a more efficient way (in terms of memory or complexity) to do the same thing?
Thanks in advance.
It can be done in O(n) time and O(log n) space by doing an in-order traversal and stopping when you reach the n/2-th node; just carry a counter that tells you how many nodes have already been traversed - no need to actually populate any vector.
If you can modify your tree into a rank tree (each node also stores the number of nodes in the subtree it roots), you can easily solve it in O(log n) time by simply descending toward the n/2-th element.
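A minimal sketch of the counting approach from the first paragraph, assuming a bare treeNode with data, left and right members (matching the prototype in the question):

#include <cstddef>

// Assumed node layout; the interviewer's actual treeNode may differ.
struct treeNode {
    int data;
    treeNode* left;
    treeNode* right;
};

// One extra O(n) pass to get the node count if the tree does not store it.
std::size_t countNodes(const treeNode* n) {
    return n ? 1 + countNodes(n->left) + countNodes(n->right) : 0;
}

// In-order traversal that stops as soon as the target rank is reached.
// `seen` counts how many nodes have been visited so far.
bool kthInOrder(const treeNode* n, std::size_t target, std::size_t& seen, int& out) {
    if (!n) return false;
    if (kthInOrder(n->left, target, seen, out)) return true;
    if (++seen == target) { out = n->data; return true; }
    return kthInOrder(n->right, target, seen, out);
}

// Median as the prototype expects; for an even node count this sketch simply
// returns the lower middle element.
int median(treeNode* root) {
    std::size_t n = countNodes(root);
    std::size_t seen = 0;
    int out = 0;                          // 0 if the tree is empty
    kthInOrder(root, (n + 1) / 2, seen, out);
    return out;
}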
Since you know that the median is the middle element of a sorted list of elements, you can just take the middle element of your inorder traversal and stop there, without storing the values in a vector. You might need two traversals if you don't know the number of nodes, but it will make the solution use less memory (O(h) where h is the height of your tree; h = O(log n) for balanced search trees).
If you can augment the tree, you can use the solution I gave here to get an O(log n) algorithm.
The binary search tree offers a sorted view of your data, but in order to take advantage of it you need to know how many elements are in each subtree. Without this knowledge, your algorithm is about as fast as it can be.
If you know the size of each subtree, you can choose at each step whether to visit the left or the right subtree, which gives an O(log n) algorithm if the binary tree is balanced.
I need a data structure in the C++ STL for performing insertion, searching, and retrieval of the k-th element in O(log n).
(Note: k is a variable and not a constant)
I have a class like
class myClass{
int id;
//other variables
};
and my comparator is based only on this id; no two elements will have the same id.
Is there a way to do this using the STL, or do I have to write the O(log n) functions manually to maintain the array in sorted order at all times?
AFAIK, there is no such data structure. Of course, std::set is close to this, but not quite: it is a red-black tree. If each node of this red-black tree were annotated with its subtree weight (the number of nodes in the subtree rooted at that node), then a retrieve(k) query would be possible. As there is no such weight annotation (it takes valuable memory and makes insert/delete more complex, since the weights have to be updated), it is impossible to answer such a query efficiently with the standard search-tree containers.
If you want to build such a data structure, use a conventional search tree implementation (red-black, AVL, B-tree, ...) and add a weight field to each node that counts the number of entries in its subtree. Then searching for the k-th entry is quite simple:
Sketch:
Check the weight of the child nodes, and find the child c which has the largest weight (accumulated from left) that is not greater than k
Subtract from k all weights of children that are left of c.
Descend down to c and call this procedure recursively.
In the case of a binary search tree, the algorithm is quite simple, since each node has only two children. For a B-tree (which is likely more efficient), you have to account for as many children as the node contains.
Of course, you must update the weights on insert/delete: go up the tree from the insert/delete position and increment/decrement the weight of each node up to the root. You must also adjust the weights of the affected nodes when you do rotations (or splits/merges in the B-tree case).
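For the binary case, the k-th entry search could look like this (field names are assumptions; weight counts the entries stored in the node's subtree, as described above):

#include <cstddef>

// Hypothetical node of a weight-annotated binary search tree.
struct WNode {
    int id;
    std::size_t weight;   // number of entries in the subtree rooted here
    WNode* left;
    WNode* right;
};

std::size_t weightOf(const WNode* n) { return n ? n->weight : 0; }

// Return the k-th smallest entry (1-based), or nullptr if k is out of range.
// At each node, the left subtree's weight tells us whether the answer lies to
// the left, at this node, or to the right (with k reduced accordingly).
const WNode* kth(const WNode* n, std::size_t k) {
    while (n) {
        std::size_t leftW = weightOf(n->left);
        if (k <= leftW)           n = n->left;
        else if (k == leftW + 1)  return n;
        else { k -= leftW + 1;    n = n->right; }
    }
    return nullptr;
}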
Another idea would be a skip-list where the skips are annotated with the number of elements they skip. But this implementation is not trivial, since you have to update the skip length of each skip above an element that is inserted or deleted, so adjusting a binary search tree is less hassle IMHO.
Edit: I found a C implementation of a 2-3-4 tree (B-tree), check out the links at the bottom of this page: http://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtree.html
You cannot achieve what you want with a simple array or any other of the built-in containers. You can use a more advanced data structure, for instance a skip list or a modified red-black tree (the backing data structure of std::set).
You can get the k-th element of an arbitrary array in linear time, and if the array is sorted you can do it in constant time, but an insert will still require shifting all the subsequent elements, which is linear in the worst case.
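For reference, the linear-time selection on an unsorted array mentioned here already exists in the standard library as std::nth_element (linear on average):

#include <algorithm>
#include <cstddef>
#include <vector>

// k is 0-based; std::nth_element partially reorders the copy it is given.
int kthSmallest(std::vector<int> v, std::size_t k) {
    std::nth_element(v.begin(), v.begin() + k, v.end());
    return v[k];
}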
As for std::set, you would need additional data stored at each node to get the k-th element efficiently, and unfortunately you cannot modify its node structure.
In the book I'm using for my class (and from what I've seen in a few other places), it seems like the algorithm for creating a Huffman tree boils down to:
(1) Building a minheap based on the frequency of each character in whatever file or string is being read in.
(2) Popping off the 2 smallest values from the minheap and combining their weights into a new node.
(3) Re-inserting the new node back into the same minheap.
I'm confused about step 3. Most Huffman trees I've seen have attributes more similar to a max-heap than a min-heap (although they are not complete trees). That is to say, the root contains the maximum weight (or rather, a combination of weights), while all of its children have lesser weights. How does this implementation give a Huffman tree when the combined nodes are put back into a min-heap? I've been struggling with this for a while now.
A similar question has already been posted here (with the same book as me): I don't understand this Huffman algorithm implementation
In case you wanted to see the exact function described in (3).
Thanks for any help!
A Huffman tree is often not a complete binary tree, and so is not a min-heap.
The Huffman algorithm is easily understood as a list of frequencies from which a tree is built. Small branches are constructed first, which will eventually all be merged into a single tree. Each list item starts off as a symbol, and later may be a symbol or a sub-tree that has been built. Each list item always has a frequency (an integer count usually).
Take the two smallest frequencies out of the list (ties don't matter -- any choice will result in an optimal code, though there may be more than one optimal code). Construct a single-level binary tree from those two, where the two leaves are the symbols for those frequencies. Add the frequencies to make a new frequency representing the tree. Put that frequency back in the list. The list now has one less frequency in it.
Repeat. Now the binary tree constructed at each step may have symbol leaves on each branch, or one leaf and a previously constructed tree, or two trees (at earliest in the third step).
Keep going until there is only one frequency left in the list. That will be the sum of all the original frequencies. That frequency has the complete Huffman tree associated with it.
Now you can (arbitrarily) assign a 0 and a 1 to each binary branch. You build codes or decode them by traversing the tree from the root to a symbol. The bits from the branches along that traversal are, in order, the Huffman code for that symbol.
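Putting the steps above into code with a std::priority_queue configured as a min-heap (a sketch only; node layout, ownership and error handling are simplified):

#include <queue>
#include <utility>
#include <vector>

// One list item: a leaf (symbol) or an internal node built from two smaller
// frequencies. `freq` is the integer count described above.
struct HuffNode {
    long freq;
    char symbol;              // only meaningful for leaves
    HuffNode* left = nullptr;
    HuffNode* right = nullptr;
};

// Order the priority queue so the smallest frequency comes out first.
struct ByFreq {
    bool operator()(const HuffNode* a, const HuffNode* b) const {
        return a->freq > b->freq;
    }
};

// Build the Huffman tree from (symbol, frequency) pairs by repeatedly taking
// the two smallest items and putting their combined parent back in the queue.
HuffNode* buildHuffman(const std::vector<std::pair<char, long>>& freqs) {
    std::priority_queue<HuffNode*, std::vector<HuffNode*>, ByFreq> pq;
    for (const auto& p : freqs)
        pq.push(new HuffNode{p.second, p.first});
    while (pq.size() > 1) {
        HuffNode* a = pq.top(); pq.pop();   // two smallest frequencies
        HuffNode* b = pq.top(); pq.pop();
        pq.push(new HuffNode{a->freq + b->freq, '\0', a, b});   // their parent
    }
    return pq.empty() ? nullptr : pq.top();
}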