How to locate the immediate predecessor with O(log n) time complexity - C++

First of all, I would like to mention that this is an assignment. I've finished locating the immediate predecessor in O(n), but I would like to do it in O(log n); I know it's possible since the tree is an AVL tree.
The way I've done it in O(n) is to divide the tree into two parts based on the key (record), then do a max search in the left tree and a min search in the right tree. I know it's not O(log n): after narrowing down the solution, I still have to process all the nodes in the left or the right tree, so at best it's still n/2.
I can see the pattern of the solutions but still can't wrap my mind around it. I'm thinking about using a root and a node pointer, but I'm still not sure how to implement it.
Any pointers would be appreciated; I've googled and tried to solve this problem for several days now, to no avail.

Given a node N in an AVL tree, there are three cases:
N has a left child L. Then the immediate predecessor of N must be the right-most descendant of L. To locate it, descend into L's subtree, taking the right branch at each node. There can be at most log n levels, so this is O(log n).
N has no left child, but is itself the right child of a parent P. Then P must be the immediate predecessor, located in O(1) time.
N has no left child and is the left child of a parent P. Then walk up the tree towards the root until you find a node that is the right child of an ancestor A. If there is no such A, N does not have a predecessor; otherwise A is the immediate predecessor of N. Again, there can be at most log n parent levels to check, so this is also O(log n).
Determining which of the three applies can obviously be done in O(1) time, so the total time complexity is O(log n).
Example AVL tree for reference (this is the same example as given on the Wikipedia page for AVL tree, but I've recreated the graph rather than copying the image; the source can be forked from here if anybody would like to make modifications):
Nodes 17 and 50 are examples of case 1; node 76 is an example of case 2; node 9 is an example of case 3 with no predecessor; node 19 is an example of case 3 with predecessors. If you think through each of the cases looking at examples from the tree above, you'll be able to confirm that the statements are true. This may be easier than going through a formal proof (which nevertheless could be given).
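To make the three cases concrete, here is a minimal C++ sketch, assuming a node layout with key, left, right and parent pointers (the struct and the names are mine, not from the original post):

struct Node {
    int key;
    Node* left;
    Node* right;
    Node* parent;
};

// Returns the immediate predecessor of n, or nullptr if n has none.
Node* predecessor(Node* n) {
    if (n->left != nullptr) {                   // case 1: right-most node of the left subtree
        Node* cur = n->left;
        while (cur->right != nullptr)
            cur = cur->right;
        return cur;
    }
    Node* child = n;                            // cases 2 and 3: walk up until we arrive from a right child
    Node* p = n->parent;
    while (p != nullptr && child == p->left) {  // still coming up from the left side
        child = p;
        p = p->parent;
    }
    return p;                                   // case 2: the parent itself; case 3: the ancestor A, or nullptr
}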

I actually figured out an easier way to solve this problem without using parent or child pointers.
Here's what I did:
Traverse the tree recursively from the root, keeping a temp pointer to the most recent node seen whose record is less than the target.
When a leaf is reached, return that temp pointer to the caller.
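A minimal sketch of that idea in C++, assuming the usual key/left/right node layout (the names are mine); it is written as a loop here, but a recursive version just carries the same temp pointer down the calls:

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the node holding the largest key strictly less than target, or nullptr if none exists.
Node* predecessor(Node* root, int target) {
    Node* candidate = nullptr;            // best "less than target" node seen so far
    for (Node* cur = root; cur != nullptr; ) {
        if (cur->key < target) {          // cur qualifies; anything better lies to its right
            candidate = cur;
            cur = cur->right;
        } else {                          // cur->key >= target, so look to the left
            cur = cur->left;
        }
    }
    return candidate;
}

Each step descends one level, so on an AVL tree this is O(log n).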

Related

How to know that I'm at the end of a tree when iterating?

I'm writing an (inorder) iterator for a tree structure (left child pointer, right child pointer, parent pointer) and I'm stuck, because I can't think of a way to stop iterating once I've visited all the nodes. How can I check whether the node I'm currently at is the last node of the tree?
EDIT
The tree structure here is supposed to be a binary trie; I need inorder traversal to get lexicographical order of the node "keys", and I already have the recursive version done. I'm trying to do the iterative version, because a lot of other functions traverse the tree and I'm not sure how to write the recursive version generically enough to support all of those uses.
I'm sorry if my initial question was inaccurate; downvote as you see fit.
If you're doing it recursively, this is inherent in the algorithm; you don't need to check manually, as per the following pseudo-code:
def processTree(node):
    if node == null: return
    processTree(node.left)
    print(node.value)
    processTree(node.right)

processTree(rootNode)
Consider the point where you process the very last node, say 7 in the following tree:
       __1__
      /     \
     2       3
    / \     / \
   4   5   6   7
At that point, you will have already processed everything to the left and all parents so you will simply step up the tree and exit.
One approach is to calculate the total number of nodes before traversing the tree, say N. Then you can use a size_t counter: pass a reference to it into the recursive calls, and do ++counter each time a node is processed. When the counter reaches N, the node you have just processed is the last one.
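A rough C++ sketch of that counter idea (the node layout and names are mine, not from the question):

#include <cstddef>
#include <iostream>

struct Node {
    int value;
    Node* left;
    Node* right;
};

// Inorder traversal that counts processed nodes; when count reaches total,
// the node just printed is the last one in iteration order.
void processTree(Node* node, std::size_t& count, std::size_t total) {
    if (node == nullptr) return;
    processTree(node->left, count, total);
    std::cout << node->value;
    if (++count == total)
        std::cout << "  <- last node";   // no extra walk needed to detect the end
    std::cout << '\n';
    processTree(node->right, count, total);
}

Usage would be something like: std::size_t count = 0; processTree(root, count, N);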

Binary Tree Questions

Currently studying for an exam, and whilst reading through some notes, I had a few questions.
I know that the height of a Binary Search Tree is Log(n). Does this mean the depth is also Log(n)?
What is the maximum depth of a node in a full binary tree with n nodes? This is related to the first question: if the height of a Binary Tree is Log(n), would the maximum depth also be Log(n)?
I know that the time complexity of searching for a node in a binary search tree is O(Log(n)), which I understand. However, I read that the worst case time complexity is O(N). In what scenario would it take O(N) time to find an element?
THIS IS A PRIORITY QUEUE/ HEAP QUESTION. In my lecture notes, it says the following statement:
If we use an array for Priority Queues, en-queuing takes O(1) and de-queuing takes O(n). In a sorted Array, en-queue takes O(N) and de-queue takes O(1).
I'm having a hard time understanding this. Can anyone explain?
Sorry for all the questions, really need some clarity on a few of these topics.
Caveat: I'm a little rusty, but here goes ...
Height and depth of a binary tree are synonymous--more or less. Height is the maximum depth along any path from root to leaf. But when you traverse a tree, you have a concept of current depth: the root node has depth 0, its children depth 1, its grandchildren depth 2. If we stop here, the tree has 3 levels, and the maximum depth [we visited] is 2; whether you call the height 2 (counting edges) or 3 (counting levels) is a matter of convention. Otherwise, the two terms are often interchanged when talking about the tree overall.
Before we get to more of your questions, it's important to note that binary trees come in various flavors: balanced or unbalanced. In a perfectly balanced tree, all nodes except those at the bottom level have both their left and right links non-null. For example, with n nodes in the tree, let n = 1024. Perfectly balanced, the height is log2(n), which is 10 (since 1024 == 2^10).
When you search a perfectly balanced tree, the search is O(log2(n)), because starting from the root node you choose to follow either the left or the right link, and each time you do, you eliminate half of the remaining nodes. In such a tree with 1024 elements, the depth is 10, and you make 10 such left/right decisions.
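In code, that is just one comparison per level. A plain C++ search over such a tree might look like this (the node layout is assumed, not taken from the question):

struct Node {
    int key;
    Node* left;
    Node* right;
};

// One left/right decision per level: O(height), i.e. O(log2(n)) when the tree is balanced.
bool contains(const Node* root, int key) {
    for (const Node* cur = root; cur != nullptr;
         cur = (key < cur->key) ? cur->left : cur->right) {
        if (cur->key == key) return true;   // each step discards roughly half of what's left (if balanced)
    }
    return false;
}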
Most tree algorithms, when you add a new node, will rebalance the tree on the fly (e.g. AVL or RB (red black)) trees. So, you get a perfectly balanced tree, all the time, more or less.
But ...
Let's consider a really bad algorithm. When you add a new node, it just appends it to the left link of the node with the greatest depth [or the new node becomes the new root]. The idea is a fast append, and "we'll rebalance later".
If we search this "bad" tree after adding n nodes, the tree looks like a doubly linked list using the parent link and the left link [remember, all right links are NULL]. Searching it takes linear time, or O(n).
We did this deliberately, but it can still happen with some tree algorithms and/or combinations of data. That is, the data is such that it gets naturally placed on the left link, because that's where the algorithm's placement function says it belongs.
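The everyday way to hit this in practice is feeding already-sorted keys into a plain, non-rebalancing BST insert; you end up with the mirror image of the tree described above, a pure right-link chain. A tiny C++ sketch (my own code, not from the question):

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Textbook BST insert with no rebalancing.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else                 root->right = insert(root->right, key);
    return root;
}

// Inserting 1, 2, 3, ..., n in order makes every node the right child of the
// previous one, so searching for n visits all n nodes: O(n).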
Priority queues are like regular queues except each piece of data has a priority number associated with it.
In an ordinary queue, you just push/append onto the end. When you dequeue, you shift/pop from the front. You never need to insert anything in the middle. Thus, enqueue and dequeue are both O(1) operations.
The O(n) comes from the fact that if you have to do an insertion into the middle of an array, you have to "part the waters" to make space for the element you want to insert. For example, if you need to insert after the first element [which is array[0]], you will be placing the new element at array[1], but first you have to move array[1] to array[2], array[2] to array[3], ... For an array of n, this is O(n) effort.
When removing an element from an array, it is similar, but in reverse. If you want to remove array[1], you grab it, then you must "close the gap" left by your removal by array[1] = array[2], array[2] = array[3], ... Once again, an O(n) operation.
In a sorted array, you just pop off the end; it's the one you want already, hence O(1). To add an element, it's an insertion into the correct place. If your array is 1,2,3,7,9,12,17 and you want to add 6, that's a new value for array[3], and you have to move 7,9,12,17 out of the way as above.
An unsorted-array priority queue just appends to the array, hence enqueue is O(1). But to find the correct element to dequeue, you scan the array: array[0], array[1], ..., remembering the position of the first element for a given priority; if you find a better priority, you remember that position instead. When you hit the end, you know which element you need, say it's at index j. Now you have to remove index j from the array, and that's an O(n) operation as above.
It's slightly more complex than all that, but not by too much.
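To make the unsorted-array flavour concrete, here is a bare-bones C++ sketch (the types and names are mine): enqueue is a plain append, O(1); dequeue scans for the best priority and then closes the gap, which is O(n). Here a smaller priority number is treated as more urgent, and dequeue assumes the queue is non-empty.

#include <cstddef>
#include <vector>

struct Item {
    int priority;   // smaller number = more urgent (an assumption for this sketch)
    int value;
};

struct UnsortedArrayPQ {
    std::vector<Item> items;

    // O(1): just append to the end.
    void enqueue(const Item& it) { items.push_back(it); }

    // O(n): scan for the best priority, then close the gap left by the removal.
    Item dequeue() {
        std::size_t best = 0;
        for (std::size_t i = 1; i < items.size(); ++i)
            if (items[i].priority < items[best].priority)
                best = i;
        Item result = items[best];
        items.erase(items.begin() + best);   // shifts the later elements down: O(n)
        return result;
    }
};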

finding the Lowest Common Ancestor in a general tree having the parent Ids

My problem is to find the LCA for a general tree that is created from a list in a txt file. I am looking for the most efficient implementation. The data is in the form of:
Id, info, ParentId
The data is not sorted in any way. I was thinking about creating a tree, but that would take at least O(n log n) (the log base is not 2; it depends on the average number of children, I suppose).
Instead, if I store the nodes in a hashtable, then finding the LCA would be better than O(n log n), right? Start from the source node, walk up to the root, and mark every parent on the way as visited; then, for each parent of the destination node, check whether it has been visited by the source node, which takes O(log n). Since we only check the parents, it would be better than O(n log n).
Any better idea?
Supposing your tree is somehow balanced, i.e. O(log n) in height, your hashtable data structure should give an O(n) algorithm.
First, trace from both the source and the destination to the root. You will have two paths of length O(log n), e.g. SXYZR and DWYZR, where S and D are the source and destination and R is the root. This takes O(log n) time.
Then find the longest common postfix, which is YZR; Y will be the LCA. This also takes O(log n) time.
Remember that you need O(n) time to read the input and build the hashtable.
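A rough C++ sketch of the per-query part, assuming the hashtable maps each node id to its parent id and the root maps to itself (the types and names are mine); it marks the source path with a set rather than comparing postfixes, which is the same idea the question describes:

#include <unordered_map>
#include <unordered_set>

// parentOf maps each node id to its parent id; the root maps to itself.
// Assumes both ids exist in the same tree.
int lowestCommonAncestor(const std::unordered_map<int, int>& parentOf,
                         int source, int destination) {
    std::unordered_set<int> onSourcePath;

    // Walk from the source to the root, remembering every node on the way.
    int cur = source;
    onSourcePath.insert(cur);
    while (parentOf.count(cur) && parentOf.at(cur) != cur) {
        cur = parentOf.at(cur);
        onSourcePath.insert(cur);
    }

    // Walk from the destination upward; the first node already marked is the LCA.
    cur = destination;
    while (!onSourcePath.count(cur))
        cur = parentOf.at(cur);
    return cur;
}

With O(log n) height, each walk touches O(log n) nodes, so a query is O(log n) after the O(n) table build.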

Remove an element from unbalanced binary search tree

I have been wanting to write a remove() method for my Binary Search Tree (which happens to be an array representation). Before writing it, I must consider all the cases. Omitting the easy cases, and focusing on the one where the node has two children: in almost all the explanations I have read so far, the element is removed from an already balanced binary search tree. In the few cases where I have seen an element removed from an unbalanced binary search tree, they first balance it through zigs and zags, and then remove the element.
Is there a way that I can possibly remove an element from an unbalanced binary search tree without having to balance it beforehand?
If not, would it be easier to write an AVL tree (in array representation)?
You don't need to balance it, but you do need to recursively go down the tree performing some swaps here and there so you actually end up with a valid BST.
Deletion of a node with 2 children in an (unbalanced) BST: (from Wikipedia)
Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Copy the value of R to N, then recursively call delete on R until reaching one of the first two cases.
Deleting a node with two children from a binary search tree. First the rightmost node in the left subtree, the inorder predecessor 6, is identified. Its value is copied into the node being deleted. The inorder predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the inorder successor labelled 9.
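A compact C++ sketch of that deletion, using the in-order predecessor as in the quoted description (the node layout and names are mine, and this uses a pointer-based tree rather than the array representation from the question):

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the new root of the subtree after deleting `key` from it.
Node* removeKey(Node* root, int key) {
    if (root == nullptr) return nullptr;
    if (key < root->key) {
        root->left = removeKey(root->left, key);
    } else if (key > root->key) {
        root->right = removeKey(root->right, key);
    } else if (root->left != nullptr && root->right != nullptr) {
        // Two children: copy the in-order predecessor's value, then delete that node instead.
        Node* pred = root->left;
        while (pred->right != nullptr) pred = pred->right;
        root->key = pred->key;
        root->left = removeKey(root->left, pred->key);
    } else {
        // Zero or one child: splice the node out.
        Node* child = (root->left != nullptr) ? root->left : root->right;
        delete root;
        return child;
    }
    return root;
}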
Although, why do you want an unbalanced tree? All operations on it take longer (or at least as long), and the additional overhead of balancing doesn't change the asymptotic complexity of any operation. And if you're using the array representation, where the node at index i has children at indices 2i and 2i+1, an unbalanced tree may end up fairly sparse, i.e. there will be quite a bit of wasted memory.

Concatenating/Merging/Joining two AVL trees

Assume that I have two AVL trees and that each element from the first tree is smaller than any element from the second tree. What is the most efficient way to concatenate them into one single AVL tree? I've searched everywhere but haven't found anything useful.
Assuming you may destroy the input trees:
remove the rightmost element from the left tree, and use it to construct a new root node, whose left child is the left tree and whose right child is the right tree: O(log n)
determine and set that node's balance factor: O(log n). In (temporary) violation of the invariant, the balance factor may be outside the range {-1, 0, 1}
rotate to get the balance factor back into range: O(log n) rotations, each O(1), so O(log n) in total
Thus, the entire operation can be performed in O(log n).
Edit: On second thought, it is easier to reason about the rotations in the following algorithm. It is also quite likely faster:
Determine the height of both trees: O(log n).
Assuming that the right tree is taller (the other case is symmetric):
remove the rightmost element from the left tree (rotating and adjusting its computed height if necessary). Let n be that element. O(log n)
In the right tree, navigate left until you reach a node whose subtree is at most 1 taller than the left tree. Let r be that node. O(log n)
replace that node with a new node with value n, and subtrees left and r. O(1)
By construction, the new node is AVL-balanced, and its subtree is 1 taller than r was.
increment its parent's balance accordingly. O(1)
and rebalance like you would after inserting. O(log n)
One ultra simple solution (that works without any assumptions in the relations between the trees) is this:
Do a merge sort of both trees into one merged array (concurrently iterate both trees).
Build an AVL tree from the array - take the middle element to be the root, and apply recursively to left and right halves.
Both steps are O(n). The major issue with it is that it takes O(n) extra space.
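For reference, a compact C++ sketch of that approach on plain BST nodes (my own naming; it flattens both trees with in-order traversals rather than concurrent iterators, which keeps it short but costs the same O(n)):

#include <algorithm>
#include <iterator>
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
    Node(int k, Node* l, Node* r) : key(k), left(l), right(r) {}
};

// In-order traversal into a sorted vector: O(n).
void flatten(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;
    flatten(t->left, out);
    out.push_back(t->key);
    flatten(t->right, out);
}

// Build a height-balanced tree from a sorted range: the middle element becomes the root.
Node* build(const std::vector<int>& v, int lo, int hi) {
    if (lo > hi) return nullptr;
    int mid = lo + (hi - lo) / 2;
    return new Node(v[mid], build(v, lo, mid - 1), build(v, mid + 1, hi));
}

Node* mergeTrees(const Node* a, const Node* b) {
    std::vector<int> va, vb, merged;
    flatten(a, va);
    flatten(b, vb);
    std::merge(va.begin(), va.end(), vb.begin(), vb.end(), std::back_inserter(merged));
    return build(merged, 0, static_cast<int>(merged.size()) - 1);
}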
The best solution I have read for this problem can be found here. It is very close to meriton's answer if you correct this issue:
In the third step, the algorithm navigates left until it reaches the node whose subtree has the same height as the left tree. This is not always possible (see the counterexample image). The right way to do this step is to look for a subtree with height h or h+1, where h is the height of the left tree.
I suspect that you'll just have to walk one tree (hopefully the smaller one) and individually add each of its elements to the other tree. The AVL insert/delete operations are not designed to handle adding a whole subtree at a time.