There is a point I haven't understood while learning the BST deletion mechanism. Could you explain why there is an assignment (p->rchild =, p->lchild =) each time Delete(Node* p, int key) is called? I thought that Delete(Node* p, int key) just keeps returning without any mutation, so the tree doesn't change.
While looking for an explanation, I stumbled upon this sentence:
We have to make assignments after deletion else we will end up having
duplicate nodes.
If you agree with this statement, could you please explain it to me?
Node* BST::Delete(Node *p, int key) {
    Node* q;
    if (p == nullptr){
        return nullptr;
    }
    if (p->lchild == nullptr && p->rchild == nullptr){
        if (p == root){
            root = nullptr;
        }
        delete p;
        return nullptr;
    }
    if (key < p->data){
        p->lchild = Delete(p->lchild, key);
    } else if (key > p->data){
        p->rchild = Delete(p->rchild, key);
    } else {
        if (Height(p->lchild) > Height(p->rchild)){
            q = InPre(p->lchild);
            p->data = q->data;
            p->lchild = Delete(p->lchild, q->data);
        } else {
            q = InSucc(p->rchild);
            p->data = q->data;
            p->rchild = Delete(p->rchild, q->data);
        }
    }
    return p;
}
why there is an assignment (p->rchild =, p->lchild =) each time the Delete(Node* p, int key) is called?
If the data is found, then the goal is to have a tree that has one node less. The algorithm uses a value-copying mechanism to ensure that the node that is actually removed is always a leaf node. The deletion of a leaf node consists of two actions:
The removal from memory;
The update of its parent so that the child pointer that references this deleted node is set to a null pointer.
When the algorithm has recursed down to that leaf node, it cannot set its parent's child pointer to null, because at this stage it has no reference to the parent node. The caller has to do that, since the caller does have a reference to the parent.
So when the recursive traversal arrives at the leaf node, it needs to convey to the caller that this node should be detached. It does so by returning a null pointer, and the agreement is that the caller (whose current node p is the parent) assigns the returned pointer to the relevant child pointer. That way the deleted node is really detached from the rest of the tree.
Actually, I thought that the Delete(Node* p, int key) method just keeps returning without any mutation so the tree doesn't change.
Surely the tree must change somehow to have a node deleted from it. The change happens in the assignment to p->lchild or p->rchild.
I stumbled into this sentence :
We have to make assignments after deletion else we will end up having duplicate nodes.
This is true. Let's take an example tree:
        7
       / \
      3   8
     / \   \
    1   5   9
   /   / \
  0   4   6
Now let's see what happens if we call Delete(root, 3). p points to the node with value 7. We go to the left with a recursive call:
p->lchild = Delete(p->lchild, key);
In the recursive execution context, we get a new p which points to the node with value 3. This is the value we're looking for, so we get into the outer else block. As the heights of the subtrees below that node are equal, we get into the inner else block. There we assign:
q = InSucc(p->rchild);
This q will reference the node with value 4. And now a duplication happens. We copy the data from q to p. That comes down to deleting the value 3 from the tree:
p->data = q->data;
But now we have twice the value 4 in the tree.
        7
       / \
      4*  8
     / \   \
    1   5   9
   /   / \
  0   4*  6
So the algorithm descends to the (right) child, and now seeks to delete the node with value 4 in that subtree:
p->rchild = Delete(p->rchild, q->data);
In this new recursive call we get a new p again, which now refers to the node with value 5. We move left -- this assignment will play an important role later:
p->lchild = Delete(p->lchild, key);
This final recursive call has a new p that refers to the node with value 4 -- the one we were looking for.
This time we end up in the if block that has the delete, because this node is a leaf node. The node is freed and a null pointer is returned to the caller. From here on we start backtracking up the tree.
So one level up, at the node with value 5, we get the return value from the recursive call (which is a null pointer) and assign it:
p->lchild = Delete(p->lchild, key);
This important assignment detaches the duplicate node (with value 4) from the tree. You can see that if this assignment had not been made, there would still be a reference to that node with a duplicate value, even though it would be pointing to freed memory.
The tree is now in its final shape:
        7
       / \
      4   8
     / \   \
    1   5   9
   /     \
  0       6
Backtracking still continues, all the way back to the root. Assignments to child pointers are made there as well, but they do not change the tree: in all these cases the recursive call executed return p;, which is the original value of the caller's child pointer.
Bug
As mentioned in the comments, the code has a bug. When deleting a leaf node, it does not verify that this node actually has the value to delete. So if you call this method with a value that does not occur in the tree, you'll end up deleting a leaf node with another value. In the example tree above: if you were to call Delete(root, 10), the node with value 9 would be deleted.
To correct this bug, move the following if block:
if (p->lchild == nullptr && p->rchild == nullptr){
... inside the outer else block, as the first statement there.
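With that change applied, the function could look like the sketch below. It is a free-function variant, not the class from the question: the Node struct and the helpers Insert, Height, InPre and InSucc are assumptions standing in for the members the question does not show, and the root bookkeeping is dropped because the caller assigns the returned pointer.

```cpp
#include <algorithm>
#include <cassert>

// Assumed node layout, mirroring the field names used in the question.
struct Node {
    int data;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Height in nodes; an empty subtree has height 0.
int Height(Node* p) {
    return p ? 1 + std::max(Height(p->lchild), Height(p->rchild)) : 0;
}

// Rightmost node of the given (left) subtree: the in-order predecessor.
Node* InPre(Node* p) {
    while (p && p->rchild) p = p->rchild;
    return p;
}

// Leftmost node of the given (right) subtree: the in-order successor.
Node* InSucc(Node* p) {
    while (p && p->lchild) p = p->lchild;
    return p;
}

// Plain BST insert, only here so the sketch can build a tree.
Node* Insert(Node* p, int key) {
    if (!p) return new Node(key);
    if (key < p->data) p->lchild = Insert(p->lchild, key);
    else if (key > p->data) p->rchild = Insert(p->rchild, key);
    return p;
}

// Returns the root of the subtree after removing key from it.
Node* Delete(Node* p, int key) {
    if (p == nullptr) return nullptr;        // key is not in the tree
    if (key < p->data) {
        p->lchild = Delete(p->lchild, key);
    } else if (key > p->data) {
        p->rchild = Delete(p->rchild, key);
    } else if (!p->lchild && !p->rchild) {
        delete p;                            // leaf test runs only on a match
        return nullptr;                      // the caller detaches the node
    } else if (Height(p->lchild) > Height(p->rchild)) {
        Node* q = InPre(p->lchild);
        p->data = q->data;
        p->lchild = Delete(p->lchild, q->data);
    } else {
        Node* q = InSucc(p->rchild);
        p->data = q->data;
        p->rchild = Delete(p->rchild, q->data);
    }
    return p;
}
```

A caller would write root = Delete(root, key); so that removing the last remaining node also resets the root.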
I'm trying to make a complete tree from scratch in C++:
1st node = root
2nd node = root->left
3rd node = root->right
4th node = root->left->left
5th node = root->left->right
6th node = root->right->left
7th node = root->right->right
where the tree would look something like this:
              NODE
             /    \
         NODE      NODE
        /   \      /   \
     NODE  NODE  NODE  NODE
      /
NEXT NODE HERE
How would I go about detecting where the next node would go so that I can just use one function to add new nodes? For instance, the 8th node would be placed at root->left->left->left
The goal is to fit 100 nodes into the tree with a simple for loop with insert(Node *newnode) in it rather than doing one at a time. It would turn into something ugly like:
100th node = root->right->left->left->right->left->left
Use a queue data structure to accomplish building a complete binary tree. STL provides std::queue.
Example code, where the function would be used in a loop as you request. I assume that the queue is already created (i.e. memory is allocated for it):
// Pass double pointer for root, to preserve changes
void insert(struct node **root, int data, std::queue<node*>& q)
{
    // New 'data' node
    struct node *tmp = createNode(data);
    // Empty tree, initialize it with 'tmp'
    if (!*root)
        *root = tmp;
    else
    {
        // Get the front node of the queue.
        struct node* front = q.front();
        // If the left child of this front node doesn’t exist, set the
        // left child as the new node.
        if (!front->left)
            front->left = tmp;
        // If the right child of this front node doesn’t exist, set the
        // right child as the new node.
        else if (!front->right)
            front->right = tmp;
        // If the front node has both the left child and right child, pop it.
        if (front && front->left && front->right)
            q.pop();
    }
    // Enqueue() the new node for later insertions
    q.push(tmp);
}
Suppose root is node#1, root's children are node#2 and node#3, and so on. Then the path to node#k can be found with the following algorithm:
Represent k as a binary value, k = { k_{n-1}, ..., k_0 }, where each k_i is 1 bit, i = {n-1} ... 0.
It takes n-1 steps to move from root to node#k, directed by the values of k_{n-2}, ..., k_0, where
if k_i = 0 then go left
if k_i = 1 then go right
For example, to insert node#11 (binary 1011) in a complete tree, you would insert it as root->left->right->right (as directed by 011 of the binary 1011).
Using the algorithm above, it should be straightforward to write a function that, given any k, inserts node#k at the right location in a complete tree. The nodes don't even need to be inserted in order, as long as new nodes are attached properly (i.e. as the correct left or right child of an existing parent).
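The bit-walking rule above can be sketched roughly like this. The names pathTo and insertKth and the Node layout are made up for the illustration, and insertKth assumes the parent of node#k is already in the tree.

```cpp
#include <cassert>
#include <string>

// Hypothetical node type; id stores the node number for checking.
struct Node {
    int id;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int i) : id(i) {}
};

// Path from the root to node #k (1-based) as a string of 'L'/'R' steps.
// The leading 1 bit of k is skipped; each remaining bit is one step.
std::string pathTo(unsigned k) {
    std::string path;
    for (; k > 1; k /= 2)                       // peel bits, lowest first
        path.insert(path.begin(), (k & 1) ? 'R' : 'L');
    return path;
}

// Inserts node #k, assuming its parent (node #k/2) already exists.
void insertKth(Node*& root, unsigned k) {
    Node** slot = &root;
    for (char step : pathTo(k))
        slot = (step == 'R') ? &(*slot)->right : &(*slot)->left;
    *slot = new Node(static_cast<int>(k));
}
```

Calling insertKth(root, k) for k = 1 to 100 in a loop fills the tree level by level, and node 100 ends up at root->right->left->left->right->left->left, as in the question.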
Assuming the tree is always complete, we can use the following recursion. It does not give the best performance, but it is easy to understand:
Node* root;
Node*& getPtr(int index){
    if(index==0){
        return root;
    }
    if(index%2==1){
        return (getPtr( (index-1)/2))->left;
    }
    else{
        return (getPtr( (index-2)/2))->right;
    }
}
and then you use it like
for(int i = 0; i<100; ++i){
    getPtr(i) = new Node( generatevalue(i) );
}
Node* addRecursive(Node* current, int value) {
    if (current == nullptr) {
        return new Node(value);
    }
    if (value < current->value) {
        current->left = addRecursive(current->left, value);
    } else if (value > current->value) {
        current->right = addRecursive(current->right, value);
    }
    // otherwise the value already exists and nothing is inserted
    return current;
}
I don't know whether your nodes have a value member, but:
With this code you can build a sorted binary tree, starting from the root.
If the new node’s value is lower than the current node’s, we go to the left child. If the new node’s value is greater than the current node’s, we go to the right child. When the current node is null, we have stepped past a leaf node and can insert the new node in that position.
Is it even possible to implement a binary heap using pointers rather than an array? I have searched around the internet (including SO) and no answer can be found.
The main problem here is that, how do you keep track of the last pointer? When you insert X into the heap, you place X at the last pointer and then bubble it up. Now, where does the last pointer point to?
And also, what happens when you want to remove the root? You exchange the root with the last element, and then bubble the new root down. Now, how do you know what's the new "last element" that you need when you remove root again?
Solution 1: Maintain a pointer to the last node
In this approach a pointer to the last node is maintained, and parent pointers are required.
When inserting, starting at the last node navigate to the node below which a new last node will be inserted. Insert the new node and remember it as the last node. Move it up the heap as needed.
When removing, starting at the last node navigate to the second-to-last node. Remove the original last node and remember the new last node just found. Move the original last node into the place of the deleted node and then move it up or down the heap as needed.
It is possible to navigate to the mentioned nodes in O(log(n)) time and O(1) space. Here is a description of the algorithms; the code is available below:
For insert: If the last node is a left child, proceed with inserting the new node as the right child of the parent. Otherwise... Start at the last node. Move up as long as the current node is a right child. If the root was not reached, move to the sibling node at the right (which necessarily exists). Then (whether or not the root was reached), move down to the left as long as possible. Proceed by inserting the new node as the left child of the current node.
For remove: If the last node is the root, proceed by removing the root. Otherwise... Start at the last node. Move up as long as the current node is a left child. If the root was not reached, move to the sibling left node (which necessarily exists). Then (whether or not the root was reached), move down to the right as long as possible. We have arrived at the second-to-last node.
However, there are some things to be careful about:
When removing, there are two special cases: when the last node is being removed (unlink the node and change the last node pointer), and when the second-to-last node is being removed (not really special but the possibility must be considered when replacing the deleted node with the last node).
When moving nodes up or down the heap, if the move affects the last node, the last-node pointer must be corrected.
I made an implementation of this long ago. In case it helps someone, here is the code. Algorithmically it should be correct (it has also been subjected to stress testing with verification), but there is no warranty of course.
Solution 2: Reach the last node from the root
This solution requires maintaining the node count (but not parent pointers or the last node). The last (or second-to-last) node is found by navigating from the root towards it.
Assume the nodes are numbered starting from 1, as per the typical notation for binary heaps. Pick any valid node number and represent it in binary. Ignore the first (most significant) 1 bit. The remaining bits define the path from the root to that node; zero means left and one means right.
For example, to reach node 11 (=1011b), start at the root then go left (0), right (1), right (1).
This algorithm can be used in insert to find where to place the new node (follow the path for node node_count+1), and in remove to find the second-to-last-node (follow the path for node node_count-1).
This approach is used in libuv for timer management; see their implementation of the binary heap.
Usefulness of Pointer-based Binary Heaps
Many answers here and even literature say that an array-based implementation of a binary heap is strictly superior. However I contest that because there are situations where the use of an array is undesirable, typically because the upper size of the array is not known in advance and on-demand reallocations of an array are not deemed acceptable, for example due to latency or possibility of allocation failure.
The fact that libuv (a widely used event loop library) uses a binary heap with pointers only further speaks for this.
It is worth noting that the Linux kernel uses (pointer-based) red-black trees as a priority queue in a few cases, for example for CPU scheduling and timer management (for the same purpose as in libuv). I find it likely that changing these to use a pointer-based binary heap will improve performance.
Hybrid Approach
It is possible to combine Solution 1 and Solution 2 into a hybrid approach which dynamically picks whichever of the two algorithms (for finding the last or second-to-last node) has the lower cost, measured in the number of edges that need to be traversed. Assume we want to navigate to node number N, and highest_bit(X) means the 0-based index of the highest-order set bit in X (0 means the LSB).
The cost of navigating from the root (Solution 2) is highest_bit(N).
The cost of navigating from the previous node which is on the same level (Solution 1) is: 2 * (1 + highest_bit((N-1) xor N)).
Note that in the case of a level change the second equation will yield a wrong (too large) result, but in that case traversal from the root is more efficient anyway (for which the estimate is correct) and will be chosen, so there is no need for special handling.
Some CPUs have an instruction for highest_bit, allowing a very efficient implementation of these estimates. An alternative approach is to maintain the highest bit as a bit mask and do these calculations with bit masks instead of bit indices. (Consider, for example, that 1 followed by N zeroes, squared, is equal to 1 followed by 2N zeroes.)
In my testing, Solution 1 turned out to be faster on average than Solution 2, and the hybrid approach appeared to have about the same average performance as Solution 2. Therefore the hybrid approach is only useful if one needs to minimize the worst-case time, which is about twice as good in Solution 2, since Solution 1 will in the worst case traverse the entire height of the tree up and then down.
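As a rough illustration of the two cost estimates, here is a small sketch. The function names are invented for this example, and the loop-based highest_bit is a portable stand-in for a hardware instruction.

```cpp
#include <cassert>

// 0-based index of the highest set bit of x (x must be non-zero).
unsigned highest_bit(unsigned x) {
    unsigned index = 0;
    while (x >>= 1) ++index;
    return index;
}

// Edges traversed when walking from the root to node n (Solution 2).
unsigned cost_from_root(unsigned n) {
    return highest_bit(n);
}

// Estimated edges when walking from node n-1 to node n (Solution 1).
// The estimate is too large when n is the first node of a new level.
unsigned cost_from_previous(unsigned n) {
    return 2 * (1 + highest_bit((n - 1) ^ n));
}

// Hybrid choice: true if starting from the root is at least as cheap.
bool prefer_root(unsigned n) {
    return cost_from_root(n) <= cost_from_previous(n);
}
```

For node 11 the walk from node 10 costs 2 edges against 3 from the root, so the neighbour walk wins. For node 12 the walk from node 11 costs 6 edges against 3 from the root, so the root walk wins.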
Code for Solution 1
Note that the traversal code in insert is slightly different from the algorithm described above but still correct.
struct Node {
    Node *parent;
    Node *link[2];
};
struct Heap {
    Node *root;
    Node *last;
};
void init (Heap *h)
{
    h->root = NULL;
    h->last = NULL;
}
void insert (Heap *h, Node *node)
{
    // If the heap is empty, insert root node.
    if (h->root == NULL) {
        h->root = node;
        h->last = node;
        node->parent = NULL;
        node->link[0] = NULL;
        node->link[1] = NULL;
        return;
    }
    // We will be finding the node to insert below.
    // Start with the current last node and move up as long as the
    // parent exists and the current node is its right child.
    Node *cur = h->last;
    while (cur->parent != NULL && cur == cur->parent->link[1]) {
        cur = cur->parent;
    }
    if (cur->parent != NULL) {
        if (cur->parent->link[1] != NULL) {
            // The parent has a right child. Attach the new node to
            // the leftmost node of the parent's right subtree.
            cur = cur->parent->link[1];
            while (cur->link[0] != NULL) {
                cur = cur->link[0];
            }
        } else {
            // The parent has no right child. This can only happen when
            // the last node is a right child. The new node can become
            // the right child.
            cur = cur->parent;
        }
    } else {
        // We have reached the root. The new node will be at a new level,
        // the left child of the current leftmost node.
        while (cur->link[0] != NULL) {
            cur = cur->link[0];
        }
    }
    // This is the node below which we will insert. It has either no
    // children or only a left child.
    assert(cur->link[1] == NULL);
    // Insert the new node, which becomes the new last node.
    h->last = node;
    cur->link[cur->link[0] != NULL] = node;
    node->parent = cur;
    node->link[0] = NULL;
    node->link[1] = NULL;
    // Restore the heap property.
    while (node->parent != NULL && value(node->parent) > value(node)) {
        move_one_up(h, node);
    }
}
void remove (Heap *h, Node *node)
{
    // If this is the only node left, remove it.
    if (node->parent == NULL && node->link[0] == NULL && node->link[1] == NULL) {
        h->root = NULL;
        h->last = NULL;
        return;
    }
    // Locate the node before the last node.
    Node *cur = h->last;
    while (cur->parent != NULL && cur == cur->parent->link[0]) {
        cur = cur->parent;
    }
    if (cur->parent != NULL) {
        assert(cur->parent->link[0] != NULL);
        cur = cur->parent->link[0];
    }
    while (cur->link[1] != NULL) {
        cur = cur->link[1];
    }
    // Disconnect the last node.
    assert(h->last->parent != NULL);
    h->last->parent->link[h->last == h->last->parent->link[1]] = NULL;
    if (node == h->last) {
        // Deleting last, set new last.
        h->last = cur;
    } else {
        // Not deleting last, move last to node's place.
        Node *srcnode = h->last;
        replace_node(h, node, srcnode);
        // Set new last unless node=cur; in this case it stays the same.
        if (node != cur) {
            h->last = cur;
        }
        // Restore the heap property.
        if (srcnode->parent != NULL && value(srcnode) < value(srcnode->parent)) {
            do {
                move_one_up(h, srcnode);
            } while (srcnode->parent != NULL && value(srcnode) < value(srcnode->parent));
        } else {
            while (srcnode->link[0] != NULL || srcnode->link[1] != NULL) {
                bool side = srcnode->link[1] != NULL && value(srcnode->link[0]) >= value(srcnode->link[1]);
                if (value(srcnode) > value(srcnode->link[side])) {
                    move_one_up(h, srcnode->link[side]);
                } else {
                    break;
                }
            }
        }
    }
}
Two other functions are used: move_one_up moves a node one step up in the heap, and replace_node moves an existing node (srcnode) into the place held by the node being deleted. Both work only by adjusting the links to and from the other nodes; there is no actual moving of data involved. These functions should not be hard to implement, and the mentioned link includes my implementations.
A pointer-based binary heap is considerably harder to implement than an array-based one, but it is fun to code. The basic idea is that of a binary tree. The biggest challenge is keeping it left-filled, because it is not obvious where the next node must be inserted.
For that, you need binary traversal. Suppose the heap size is 6. Take the size + 1 and convert it to bits: the binary representation of 7 is "111". Always omit the first bit, which leaves "11". Read the remaining bits from left to right. The first bit is '1', so go to the right child of the root node. The string left is "1"; since only one bit remains, this bit tells you where to insert the new node, and as it is '1', the new node becomes the right child of the current node. In short: convert the heap size + 1 into bits, omit the first bit, and then for each remaining bit go to the right child of the current node if it is '1' and to the left child if it is '0'.
After inserting the new node, you bubble it up the heap, which means you need parent pointers. So you go once down the tree and once up the tree, and the insertion takes O(log N).
For deletion, the challenge is again to find the last node. I assume you are familiar with deletion in a heap, where the root is swapped with the last node and then heapified. To find the last node, use the same technique as for finding the insertion position, with a small twist: use the binary representation of HeapSize itself, not HeapSize + 1. This takes you to the last node, so deletion also costs O(log N).
I'm having trouble posting the source code here, but you can refer to my blog for it. The code includes heap sort too, which is very simple: we just keep deleting the root node. The blog has an explanation with figures, but I think this explanation should do.
I hope my answer has helped you. If it did, let me know...! ☺
For those saying this is a useless exercise, there are a couple of (admittedly rare) use cases where a pointer-based solution is better. If the max size of the heap is unknown, then an array implementation will need to stop-and-copy into fresh storage when the array fills. In a system (e.g. embedded) where there are fixed response time constraints and/or where free memory exists, but not a big enough contiguous block, this may not be acceptable. The pointer tree lets you allocate incrementally in small, fixed-size chunks, so it doesn't have these problems.
To answer the OP's question, parent pointers and/or elaborate tracking aren't necessary to determine where to insert the next node or find the current last one. You only need the bits in the binary representation of the heap's size to determine the left and right child pointers to follow.
Edit: Just saw Vamsi Sangam's explanation of this algorithm. Nonetheless, here's a demo in code:
#include <stdio.h>
#include <stdlib.h>
typedef struct node_s {
    struct node_s *lft, *rgt;
    int data;
} NODE;
typedef struct heap_s {
    NODE *root;
    size_t size;
} HEAP;
// Add a new node at the last position of a complete binary tree.
void add(HEAP *heap, NODE *node) {
    size_t mask = 0;
    size_t size = ++heap->size;
    // Initialize the mask to the high-order 1 of the size.
    for (size_t x = size; x; x &= x - 1) mask = x;
    NODE **pp = &heap->root;
    // Advance pp right or left depending on size bits.
    while (mask >>= 1) pp = (size & mask) ? &(*pp)->rgt : &(*pp)->lft;
    *pp = node;
}
void print(NODE *p, int indent) {
    if (!p) return;
    for (int i = 0; i < indent; i++) printf(" ");
    printf("%d\n", p->data);
    print(p->lft, indent + 1);
    print(p->rgt, indent + 1);
}
int main(void) {
    HEAP h[1] = { NULL, 0 };
    for (int i = 0; i < 16; i++) {
        NODE *p = malloc(sizeof *p);
        p->lft = p->rgt = NULL;
        p->data = i;
        add(h, p);
    }
    print(h->root, 0);
}
As you'd hope, it prints:
0
 1
  3
   7
    15
   8
  4
   9
   10
 2
  5
   11
   12
  6
   13
   14
Sift-down can use the same kind of iteration. It's also possible to implement the sift-up without parent pointers using either recursion or an explicit stack to "save" the nodes in the path from root to the node to be sifted.
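The explicit-stack idea might look something like the following C++ sketch. It assumes a min-heap and sifts up by swapping the data fields rather than relinking nodes; the names Heap and push are invented for the illustration, while lft and rgt follow the demo above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Node {
    int data;
    Node* lft = nullptr;
    Node* rgt = nullptr;
    explicit Node(int d) : data(d) {}
};

struct Heap {
    Node* root = nullptr;
    std::size_t size = 0;
};

// Min-heap insert without parent pointers: follow the size bits down to the
// new slot while saving the ancestors, then sift up along the saved path.
void push(Heap& heap, int value) {
    std::size_t size = ++heap.size;
    std::size_t mask = 1;
    while ((mask << 1) <= size) mask <<= 1;    // highest set bit of size

    std::vector<Node*> path;                   // ancestors, root first
    Node** pp = &heap.root;
    while (mask >>= 1) {
        path.push_back(*pp);
        pp = (size & mask) ? &(*pp)->rgt : &(*pp)->lft;
    }
    Node* node = new Node(value);
    *pp = node;

    // Walk the saved path bottom-up, swapping data while the parent is larger.
    for (std::size_t i = path.size(); i-- > 0 && path[i]->data > node->data;) {
        std::swap(path[i]->data, node->data);
        node = path[i];
    }
}
```

The vector holds at most one pointer per tree level, so the extra space is O(log n) per insertion.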
A binary heap is a complete binary tree obeying the heap property. That's all. The fact that it can be stored using an array is just nice and convenient. But sure, you can implement it using a linked structure. It's a fun exercise! As such, it is mostly useful as an exercise or in more advanced data structures (meldable, addressable priority queues, for example), as it is quite a bit more involved than doing the array version. For example, think about the siftup/siftdown procedures, and all the edge cutting/sewing you'll need to get right. Anyway, it's not too hard, and once again, good fun!
There are a number of comments pointing out that by a strict definition it is possible to implement a binary heap as a tree and still call it a binary heap.
Here is the problem -- there is never a reason to do so since using an array is better in every way.
If you do searches to try to find information on how to work with a heap using pointers you are not going to find any -- no one bothers since there is no reason to implement a binary heap in this way.
If you do searches on trees you will find lots of helpful materials. This was the point of my original answer. There is nothing that stops people from doing it this way but there is never a reason to do so.
You say -- I have to do so, I've got a legacy system and I have pointers to nodes that I need to put in a heap.
Make an array of those pointers and work with them in this array as you would with a standard array-based heap; when you need the contents, dereference them. This will work better than any other way of implementing your system.
I can think of no other reason to implement a heap using pointers.
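The array-of-pointers suggestion can be sketched with the standard library like this. LegacyNode and its priority field are made-up placeholders for whatever the existing nodes look like, and the comparator makes it a min-heap.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical legacy node type; only `priority` matters to the heap.
struct LegacyNode {
    int priority;
};

// Orders pointers by the priority of the node they point to (min-heap).
struct ByPriority {
    bool operator()(const LegacyNode* a, const LegacyNode* b) const {
        return a->priority > b->priority;
    }
};

// Array-based binary heap whose elements are pointers to existing nodes.
using PointerHeap =
    std::priority_queue<LegacyNode*, std::vector<LegacyNode*>, ByPriority>;
```

The nodes stay where they are; only the pointers move inside the vector, and top() returns the pointer to the node with the smallest priority.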
Original Answer:
If you implement it with pointers then it is a tree. A heap is a heap because of how you can calculate the location of the children as a location in the array (2 * node index +1 and 2 * node index + 2).
So no, you can't implement it with pointers; if you do, you've implemented a tree.
Implementing trees is well documented; if you search, you will find your answers.
I have searched around the internet (including SO) and no answer can be found.
Funny, because I found an answer on SO within moments of googling it. (Same Google search led me here.)
Basically:
The node should have pointers to its parent, left child, and right child.
You need to keep pointers to:
the root of the tree (root) (duh)
the last node inserted (lastNode)
the leftmost node of the lowest level (leftmostNode)
the rightmost node of the next-to-lowest level (rightmostNode)
Now, let the node to be inserted be nodeToInsert. Insertion algorithm in pseudocode:
void insertNode(Data data) {
    Node* parentNode = NULL;
    Node* nodeToInsert = new Node(data);
    if(root == NULL) { // empty tree
        root = nodeToInsert;
        leftmostNode = root;
        rightmostNode = NULL;
    } else if(lastNode->parent == rightmostNode && lastNode->isRightChild()) {
        // level full
        parentNode = leftmostNode;
        leftmostNode = nodeToInsert;
        parentNode->leftChild = nodeToInsert;
        rightmostNode = lastNode;
    } else if(lastNode->isLeftChild()) {
        parentNode = lastNode->parent;
        parentNode->rightChild = nodeToInsert;
    } else if(lastNode->isRightChild()) {
        // this shortcut is only valid when lastNode's parent is a left child
        parentNode = lastNode->parent->parent->rightChild;
        parentNode->leftChild = nodeToInsert;
    }
    nodeToInsert->parent = parentNode;
    lastNode = nodeToInsert;
    heapifyUp(nodeToInsert);
}
Pseudocode for deletion:
Data deleteNode() {
    // check for an empty heap before touching root
    if(root == NULL) throw EmptyHeapException();
    Data result = root->data;
    if(lastNode == root) { // the root is the only node
        free(root);
        root = NULL;
    } else {
        Node* newRoot = lastNode;
        if(lastNode == leftmostNode) {
            newRoot->parent->leftChild = NULL;
            lastNode = rightmostNode;
            rightmostNode = rightmostNode->parent;
        } else if(lastNode->isRightChild()) {
            newRoot->parent->rightChild = NULL;
            lastNode = newRoot->parent->leftChild;
        } else if(lastNode->isLeftChild()) {
            newRoot->parent->leftChild = NULL;
            // this shortcut is only valid when lastNode's parent is a right child
            lastNode = newRoot->parent->parent->leftChild->rightChild;
        }
        newRoot->leftChild = root->leftChild;
        newRoot->rightChild = root->rightChild;
        newRoot->parent = NULL;
        free(root);
        root = newRoot;
        heapifyDown(root);
    }
    return result;
}
heapifyUp() and heapifyDown() shouldn’t be too hard, though of course you’ll have to make sure those functions don’t make leftmostNode, rightmostNode, or lastNode point at the wrong place.
TL;DR Just use a goddamn array.
This is my successor function:
int
BalancedTree::successor( TreeNode *node ) // successor is the left-most child of its right subtree,
{
    TreeNode *tmp = node;
    int successorVal = -1;
    tmp = tmp->m_RChild;
    if( NULL != tmp )
    {
        while( NULL != tmp->m_LChild )
            tmp = tmp->m_LChild;
        // now at left most child of right subtree
        successorVal = tmp->m_nodeData;
    }
    return successorVal;
} // successor()
My instructor gave us a file filled with random data. I place all this data into the tree and the insert method works, but once the remove method starts, the successor function at some point returns the same value as the node I'm looking for a successor for. This shouldn't be able to happen, correct? Is my successor function correct? If you want to see the remove method, just mention it.
Your definition of successor is already flawed: if the node doesn't have a right child, the successor is one of its ancestors, namely the first one whose left child is the node or one of its ancestors. Only if no such ancestor exists is there no successor. Personally I would return an iterator to the node, but otherwise the code seems to be OK.
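A sketch of the full successor rule described above, assuming each node also stores a parent pointer (the question's TreeNode does not show one, so the struct and the attach helper below are hypothetical):

```cpp
#include <cassert>

// Hypothetical node type with a parent link, needed for the ancestor walk.
struct TreeNode {
    int value;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    TreeNode* parent = nullptr;
    explicit TreeNode(int v) : value(v) {}
};

// Returns the in-order successor of node, or nullptr if node is the maximum.
TreeNode* successor(TreeNode* node) {
    if (node->right) {                      // leftmost node of the right subtree
        TreeNode* cur = node->right;
        while (cur->left) cur = cur->left;
        return cur;
    }
    // No right subtree: climb until we leave a left child behind us.
    TreeNode* cur = node;
    while (cur->parent && cur == cur->parent->right) cur = cur->parent;
    return cur->parent;                     // nullptr when no such ancestor exists
}

// Helper for building small trees: links child under parent on the given side.
void attach(TreeNode* parent, TreeNode* child, bool asLeft) {
    (asLeft ? parent->left : parent->right) = child;
    child->parent = parent;
}
```

Returning a node pointer instead of an int also avoids using -1 as a sentinel, which would be ambiguous if -1 can occur in the data.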
Can someone please help me understand the following Morris inorder tree traversal algorithm, which works without using stacks or recursion? I was trying to understand how it works, but it's just escaping me.
1. Initialize current as root
2. While current is not NULL
   If current does not have left child
      a. Print current’s data
      b. Go to the right, i.e., current = current->right
   Else
      a. In current's left subtree, make current the right child of the rightmost node
      b. Go to this left child, i.e., current = current->left
I understand that the tree is modified so that the current node is made the right child of the max node in its left subtree, and that this property is used for the inorder traversal. But beyond that, I'm lost.
EDIT:
Found this accompanying C++ code. I was having a hard time understanding how the tree is restored after it is modified. The magic lies in the else clause, which is hit once the right leaf is modified. See the code for details:
/* Function to traverse binary tree without recursion and
   without stack */
void MorrisTraversal(struct tNode *root)
{
    struct tNode *current,*pre;
    if(root == NULL)
        return;
    current = root;
    while(current != NULL)
    {
        if(current->left == NULL)
        {
            printf(" %d ", current->data);
            current = current->right;
        }
        else
        {
            /* Find the inorder predecessor of current */
            pre = current->left;
            while(pre->right != NULL && pre->right != current)
                pre = pre->right;
            /* Make current as right child of its inorder predecessor */
            if(pre->right == NULL)
            {
                pre->right = current;
                current = current->left;
            }
            // MAGIC OF RESTORING the Tree happens here:
            /* Revert the changes made in if part to restore the original
               tree i.e., fix the right child of predecessor */
            else
            {
                pre->right = NULL;
                printf(" %d ",current->data);
                current = current->right;
            } /* End of if condition pre->right == NULL */
        } /* End of if condition current->left == NULL*/
    } /* End of while */
}
If I am reading the algorithm right, this should be an example of how it works:
      X
    /   \
   Y     Z
  / \   / \
 A   B C   D
First, X is the root, so it is initialized as current. X has a left child, so X is made the rightmost right child of X's left subtree -- the immediate predecessor to X in an inorder traversal. So X is made the right child of B, then current is set to Y. The tree now looks like this:
    Y
   / \
  A   B
       \
        X
       / \
     (Y)  Z
         / \
        C   D
(Y) above refers to Y and all of its children, which are omitted because drawing them again would repeat the diagram endlessly. The important part is shown anyway.
Now that the tree has a link back to X, the traversal continues...
  A
   \
    Y
   / \
 (A)  B
       \
        X
       / \
     (Y)  Z
         / \
        C   D
Then A is output, because it has no left child, and current returns to Y, which was made A's right child in the previous iteration. On the next iteration, Y has both children. However, the dual condition of the inner loop makes it stop when it reaches Y itself, which is an indication that Y's left subtree has already been traversed. So Y prints itself and continues with its right subtree, which is B.
B prints itself, and then current becomes X, which goes through the same checking process as Y did, also realizing that its left subtree has been traversed, continuing with the Z. The rest of the tree follows the same pattern.
No recursion is necessary, because instead of relying on backtracking through a stack, a link back to the root of the (sub)tree is moved to the point at which it would be accessed in a recursive inorder tree traversal algorithm anyway -- after its left subtree has finished.
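The walkthrough above can be replayed with a short Python sketch. The `Node` class and the `morris_inorder` name are assumptions made for this illustration, not part of the code in the question; the tree is the X/Y/Z example from the diagrams.

```python
class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def morris_inorder(root):
    """Return labels in in-order, using temporary right links instead of a stack."""
    visited = []
    current = root
    while current is not None:
        if current.left is None:
            visited.append(current.label)
            current = current.right
            continue
        # Find the in-order predecessor: the rightmost node of the left
        # subtree, stopping early if it already links back to current.
        pre = current.left
        while pre.right is not None and pre.right is not current:
            pre = pre.right
        if pre.right is None:
            pre.right = current      # first arrival: leave a link back
            current = current.left
        else:
            pre.right = None         # second arrival: remove the link
            visited.append(current.label)
            current = current.right
    return visited


# The example tree from the walkthrough.
root = Node("X",
            Node("Y", Node("A"), Node("B")),
            Node("Z", Node("C"), Node("D")))
order = morris_inorder(root)
print(order)  # ['A', 'Y', 'B', 'X', 'C', 'Z', 'D']
```

After the call, B's right pointer is null again, so the temporary link back to X leaves no trace.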
The recursive in-order traversal is: in-order(left) -> key -> in-order(right). (This is similar to DFS.)
When we do a DFS, we need to know where to backtrack to (that's why we normally keep a stack).
As we pass through a parent node that we will later need to backtrack to, we find the node we will backtrack from and update its right link to point to the parent.
When do we backtrack? When we cannot go any further. When can we not go further? When there is no left child.
Where do we backtrack to? Notice: to the SUCCESSOR!
So, as we follow nodes along the left-child path, we set the predecessor of each node to point to that node. This way, the predecessors have links to their successors (links for backtracking).
We follow left while we can, until we need to backtrack. When we need to backtrack, we print the current node and follow the right link to the successor.
If we have just backtracked, we need to follow the right child (we are done with the left child).
How do we tell whether we have just backtracked? Get the predecessor of the current node and check whether it has a right link to this node. If it has, then we followed it; remove the link to restore the tree.
If there was no such link, we did not backtrack and should proceed along the left children.
Here's my Java code (Sorry, it is not C++)
public static <T> List<T> traverse(Node<T> bstRoot) {
Node<T> current = bstRoot;
List<T> result = new ArrayList<>();
Node<T> prev = null;
while (current != null) {
// 1. we backtracked here. follow the right link as we are done with left sub-tree (we do left, then right)
if (weBacktrackedTo(current)) {
assert prev != null;
// 1.1 clean the backtracking link we created before
prev.right = null;
// 1.2 output this node's key (we backtracked from the left, so we are finished with the left sub-tree; we print this node and go to the right sub-tree: inOrder(left)->key->inOrder(right))
result.add(current.key);
// 1.3 move to the right sub-tree (as we are done with the left sub-tree).
prev = current;
current = current.right;
}
// 2. we are still tracking -> going deep in the left
else {
// 3. reached the sink (the leftmost element in the current subtree) and need to backtrack
if (needToBacktrack(current)) {
// 3.1 output the leftmost element as it's the current min
result.add(current.key);
// 3.2 backtrack:
prev = current;
current = current.right;
}
// 4. can go deeper -> go as deep as we can (this is like dfs!)
else {
// 4.1 set backtracking link for future use (this is one of parents)
setBacktrackLinkTo(current);
// 4.2 go deeper
prev = current;
current = current.left;
}
}
}
return result;
}
private static <T> void setBacktrackLinkTo(Node<T> current) {
Node<T> predecessor = getPredecessor(current);
if (predecessor == null) return;
predecessor.right = current;
}
private static <T> boolean needToBacktrack(Node<T> current) {
return current.left == null;
}
private static <T> boolean weBacktrackedTo(Node<T> current) {
Node<T> predecessor = getPredecessor(current);
if (predecessor == null) return false;
return predecessor.right == current;
}
private static <T> Node<T> getPredecessor(Node<T> current) {
// predecessor of current is the rightmost element in left sub-tree
Node<T> result = current.left;
if (result == null) return null;
while(result.right != null
// this check is for the case when we have already found the predecessor and set the successor of it to point to current (through right link)
&& result.right != current) {
result = result.right;
}
return result;
}
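To check the claim that every backtracking link is removed again, here is a Python sketch that follows the same three cases as the Java code (it is a port for illustration, not the Java itself). The `Node` class, the `shape` helper and the sample keys are assumptions; `shape` is recursive, but it is only used to compare the tree before and after, not for the traversal.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def get_predecessor(current):
    """Rightmost node of current's left subtree, or None without a left subtree.
    Stops early at a node whose right link already points back to current."""
    result = current.left
    if result is None:
        return None
    while result.right is not None and result.right is not current:
        result = result.right
    return result


def traverse(root):
    result = []
    current = root
    while current is not None:
        pred = get_predecessor(current)
        if pred is not None and pred.right is current:
            # 1. we backtracked here: remove the link, output, go right
            pred.right = None
            result.append(current.key)
            current = current.right
        elif current.left is None:
            # 3. leftmost node of this subtree: output and backtrack
            result.append(current.key)
            current = current.right
        else:
            # 4. can go deeper: set the backtracking link, then go left
            pred.right = current
            current = current.left
    return result


def shape(node):
    """Nested-tuple snapshot of the tree, used only for checking."""
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
before = shape(root)
keys = traverse(root)
after = shape(root)
print(keys)             # [1, 2, 3, 4, 5, 6, 7]
print(before == after)  # True
```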
I've made an animation for the algorithm here:
https://docs.google.com/presentation/d/11GWAeUN0ckP7yjHrQkIB0WT9ZUhDBSa-WR0VsPU38fg/edit?usp=sharing
This should hopefully help to understand. The blue circle is the cursor and each slide is an iteration of the outer while loop.
Here's code for Morris traversal (copied and modified from GeeksforGeeks):
def MorrisTraversal(root):
    # Set cursor to root of binary tree
    cursor = root
    while cursor is not None:
        if cursor.left is None:
            print(cursor.value)
            cursor = cursor.right
        else:
            # Find the inorder predecessor of cursor
            pre = cursor.left
            while True:
                if pre.right is None:
                    # First arrival at cursor: thread the predecessor back to it
                    pre.right = cursor
                    cursor = cursor.left
                    break
                if pre.right is cursor:
                    # Second arrival: remove the thread, visit cursor, go right
                    pre.right = None
                    print(cursor.value)
                    cursor = cursor.right
                    break
                pre = pre.right
# And now for some tests. Try "pip3 install binarytree" to get the needed package, which will visually display random binary trees
import binarytree as b
for _ in range(10):
    print()
    print("Example #", _)
    tree = b.tree()
    print(tree)
    MorrisTraversal(tree)
I found a very good pictorial explanation of Morris Traversal.
public static void morrisInOrder(Node root) {
Node cur = root;
Node pre;
while (cur!=null){
if (cur.left==null){
System.out.println(cur.value);
cur = cur.right; // move to next right node
}
else { // has a left subtree
pre = cur.left;
while (pre.right!=null){ // find rightmost
pre = pre.right;
}
pre.right = cur; // put cur after the pre node
Node temp = cur; // store cur node
cur = cur.left; // move cur to the top of the new tree
temp.left = null; // original cur left be null, avoid infinite loops
}
}
}
I think this code is easier to understand: it just sets the left pointer to null to avoid infinite loops, so the "magic" else branch is not needed. Note that this version does not restore the tree afterwards. It can easily be modified for preorder.
I hope the pseudo-code below is more revealing:
node = root
while node != null
    if node.left == null
        visit the node
        node = node.right
    else
        let pred_node be the inorder predecessor of node
        if pred_node.right == null /* create threading in the binary tree */
            pred_node.right = node
            node = node.left
        else /* remove threading from the binary tree */
            pred_node.right = null
            visit the node
            node = node.right
Referring to the C++ code in the question, the inner while loop finds the in-order predecessor of the current node. In a standard binary tree, the right child of the predecessor must be null, while in the threaded version the right child must point to the current node. If the right child is null, it is set to the current node, effectively creating the thread, which serves as the return point that would otherwise have to be stored, usually on a stack. If the right child is not null, the algorithm resets it to null so that the original tree is restored, and then continues the traversal in the right subtree (in this case it is known that the left subtree has already been visited).
Python Solution
Time Complexity : O(n)
Space Complexity : O(1)
Excellent Morris Inorder Traversal Explanation
class Solution(object):
    def inorderTraversal(self, current):
        soln = []
        while current is not None:  # The loop ends once we step past the rightmost node, i.e. the end of the LDR traversal
            if current.left is not None:  # If a left subtree exists, traverse it first
                pre = current.left  # Goal: find the node visited just before current (its predecessor); if current is D in LDR, this is the last node of L
                while pre.right is not None and pre.right is not current:  # Walk to the predecessor
                    pre = pre.right
                if pre.right is None:  # Predecessor found and not yet linked: link it to current so the way back is not lost
                    pre.right = current
                    current = current.left
                else:  # The link already exists, so everything left of current has been traversed (L of LDR is done)
                    soln.append(current.val)
                    pre.right = None  # Remove the link; the tree is restored to its original shape here
                    current = current.right
            else:  # No left subtree: L is trivially done, so visit D and move to R
                soln.append(current.val)
                current = current.right
        return soln
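The O(n) time bound can look surprising given the nested loops. The reason is that each real right-child edge is walked by the predecessor search at most twice: once when the thread is created and once when it is removed. The following sketch counts those inner-loop steps; the `Node` class, the function name and the sample tree are assumptions chosen so that the inner loop does as much work as possible.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def morris_inorder_with_count(root):
    """Return (values in in-order, number of steps taken by the predecessor search)."""
    values = []
    steps = 0
    current = root
    while current is not None:
        if current.left is None:
            values.append(current.val)
            current = current.right
            continue
        pre = current.left
        while pre.right is not None and pre.right is not current:
            pre = pre.right
            steps += 1
        if pre.right is None:
            pre.right = current
            current = current.left
        else:
            pre.right = None
            values.append(current.val)
            current = current.right
    return values, steps


# Root 10 whose left child heads a right-leaning chain 1 -> 2 -> 3 -> 4 -> 5.
chain = None
for v in (5, 4, 3, 2, 1):
    chain = Node(v, right=chain)
root = Node(10, left=chain)

values, steps = morris_inorder_with_count(root)
print(values)  # [1, 2, 3, 4, 5, 10]
print(steps)   # 8
```

With n = 6 nodes the search takes 8 steps (4 to create the thread from 5 to 10, 4 to find and remove it), which stays within 2(n - 1) = 10.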
Please find below an explanation of Morris in-order traversal, as commented C# code.
using System.Collections.Generic;
public class TreeNode
{
public int val;
public TreeNode left;
public TreeNode right;
public TreeNode(int val = 0, TreeNode left = null, TreeNode right = null)
{
this.val = val;
this.left = left;
this.right = right;
}
}
class MorrisTraversal
{
public static IList<int> InOrderTraversal(TreeNode root)
{
IList<int> list = new List<int>();
var current = root;
while (current != null)
{
//When there exist no left subtree
if (current.left == null)
{
list.Add(current.val);
current = current.right;
}
else
{
//Get Inorder Predecessor
//In Order Predecessor is the node which will be printed before
//the current node when the tree is printed in inorder.
//Example:- {1,2,3,4} is inorder of the tree so inorder predecessor of 2 is node having value 1
var inOrderPredecessorNode = GetInorderPredecessor(current);
//If the predecessor's right already points to the current node, the left subtree has already been visited.
//So we need to break the thread and visit the current node.
if (inOrderPredecessorNode.right == current)
{
inOrderPredecessorNode.right = null;
list.Add(current.val);
current = current.right;
}//Otherwise, create a thread from the in-order predecessor to the current node.
else
{
inOrderPredecessorNode.right = current;
current = current.left;
}
}
}
return list;
}
private static TreeNode GetInorderPredecessor(TreeNode current)
{
var inOrderPredecessorNode = current.left;
//Finding Extreme right node of the left subtree
//inOrderPredecessorNode.right != current check is added to detect loop
while (inOrderPredecessorNode.right != null && inOrderPredecessorNode.right != current)
{
inOrderPredecessorNode = inOrderPredecessorNode.right;
}
return inOrderPredecessorNode;
}
}