In https://www.techiedelight.com/print-all-paths-from-root-to-leaf-nodes-binary-tree/, the code for printing the root-to-leaf path for every leaf node is provided below.
They state the algorithm is O(n), but I think it should be O(n log n), where n is the number of nodes. A standard DFS is O(V + E), which is O(n) on a tree, but printing the paths seems to add a factor of log n. Suppose h is the height of a perfect binary tree. There are n/2 nodes on the last level, hence n/2 paths that we need to print. Each path has h + 1 nodes (let's just say h for mathematical simplicity). So we end up printing h * n/2 nodes when printing all the paths. We know h = log2(n), so h * n/2 = O(n log n)?
Is their answer wrong, or is there something wrong with my analysis here?
#include <iostream>
#include <vector>
using namespace std;

// Data structure to store a binary tree node
struct Node
{
    int data;
    Node *left, *right;

    Node(int data)
    {
        this->data = data;
        this->left = this->right = nullptr;
    }
};

// Function to check if a given node is a leaf node or not
bool isLeaf(Node* node) {
    return (node->left == nullptr && node->right == nullptr);
}

// Recursive function to print the path from the root node to every leaf node
void printRootToleafPaths(Node* node, vector<int> &path)
{
    // base case
    if (node == nullptr) {
        return;
    }

    // include the current node in the path
    path.push_back(node->data);

    // if a leaf node is found, print the path
    if (isLeaf(node))
    {
        for (int data: path) {
            cout << data << " ";
        }
        cout << endl;
    }

    // recur for the left and right subtrees
    printRootToleafPaths(node->left, path);
    printRootToleafPaths(node->right, path);

    // backtrack: remove the current node once both subtrees are done
    path.pop_back();
}

// The main function to print paths from the root node to every leaf node
void printRootToleafPaths(Node* node)
{
    // vector to store a root-to-leaf path
    vector<int> path;
    printRootToleafPaths(node, path);
}

int main()
{
    /* Construct the following tree
              1
            /   \
           /     \
          2       3
         / \     / \
        4   5   6   7
               /     \
              8       9
    */
    Node* root = new Node(1);
    root->left = new Node(2);
    root->right = new Node(3);
    root->left->left = new Node(4);
    root->left->right = new Node(5);
    root->right->left = new Node(6);
    root->right->right = new Node(7);
    root->right->left->left = new Node(8);
    root->right->right->right = new Node(9);

    // print all root-to-leaf paths
    printRootToleafPaths(root);

    return 0;
}
int main()
{
/* Construct the following tree
1
/ \
/ \
2 3
/ \ / \
4 5 6 7
/ \
8 9
*/
Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
root->left->left = new Node(4);
root->left->right = new Node(5);
root->right->left = new Node(6);
root->right->right = new Node(7);
root->right->left->left = new Node(8);
root->right->right->right = new Node(9);
// print all root-to-leaf paths
printRootToleafPaths(root);
return 0;
}
The time complexity of finding the paths is O(n): the traversal visits every node once.
The time complexity of printing one path is O(log n) for a balanced tree.
To print all paths (up to n/2 leaves), it takes O(n log n).
You then need to compare the node-traversal cost with the printing cost.
On most modern systems, the printing cost is much greater than the node-traversal cost.
So the actual time complexity is O(n log n) (dominated by printing).
I assume the website ignored the printing cost, which is why it claims the time complexity is O(n).
The complexity is O(n log n) for a balanced binary tree, but for an arbitrary binary tree the worst case is O(n^2).
Consider a tree consisting of:
n/2 nodes in a linked list on their right-child pointers; and
at the end of that, n/2 nodes arranged into a tree with n/4 leaves.
Since the n/4 leaves are all more than n/2 nodes deep, there are more than n^2/8 total nodes across all the paths, and that is O(n^2).
The algorithm traverses O(n) nodes. The total number of values it prints is O(n lg n) for a balanced tree, or O(n^2) for an arbitrary tree.
It depends on what operations have cost.
For example, storing or incrementing an n-bit number or pointer is often treated as an O(1) operation. Yet in any physical computer, if you have 2^100 nodes you'll need lg(2^100)-bit pointers (or node names), which take longer to copy than 64-bit or 32-bit node names. In a certain sense, copying a pointer should take O(lg n) time!
But we don't care. We implicitly set the price of operations, and give O-notation costs in terms of those operations.
Here, it is plausible they counted printing the entire path as an O(1) operation, and counted node traversals, to get an O(n) cost. Maybe they didn't even notice, no more than you noticed the maximum node count implied by 32- or 64-bit pointers. They simply failed to tell you how they were pricing things.
The same thing happens in the specification of standard-library algorithms: they guarantee a maximum number of calls to a predicate.
A Binary Tree:
class BinaryTree {
public:
    int value;
    BinaryTree *left;
    BinaryTree *right;

    BinaryTree(int value) {
        this->value = value;
        left = nullptr;
        right = nullptr;
    }
};
A function:
vector<int> myFunc(BinaryTree *root) {
    vector<int> results;
    if (root->left == NULL && root->right == NULL) {
        results.push_back(root->value);
    }
    if (root->left != NULL) {
        auto lResults = myFunc(root->left);
        for (auto& result : lResults) {
            results.push_back(root->value + result);
        }
    }
    if (root->right != NULL) {
        auto rResults = myFunc(root->right);
        for (auto& result : rResults) {
            results.push_back(root->value + result);
        }
    }
    return results;
}
As you can see, the space complexity of the function depends on the number of leaf nodes in the tree.
So what is the space complexity of this function?
The answer depends on the actual structure of your binary tree. If you actively balance the tree, or if it tends to be balanced through use, the number of leaves is close to n/2, where n is the total number of nodes. Think of a tree with 31 nodes: one root at depth 0, 2 nodes at depth 1, and 2^i nodes at depth i, with all 2^4 = 16 leaves at depth 4. Note that a binary tree filled with random numbers typically tends to be roughly balanced.
However, if you insert a sorted array of numbers and do not actively balance the tree, it will have only a single leaf node. In that case the depth of the tree is O(n) and your function recurses n times, resulting in O(n) space complexity.
So in conclusion, the space complexity is O(n) either way.
The following lowestCommonAncestor function finds the Lowest Common Ancestor of two nodes, p and q, in a Binary Tree (assuming both nodes exist and all node values are unique).
class Solution {
public:
    bool inBranch(TreeNode* root, TreeNode* node)
    {
        if (root == NULL)
            return false;
        if (root->val == node->val)
            return true;
        return (inBranch(root->left, node) || inBranch(root->right, node));
    }

    TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) {
        if (root == NULL)
            return NULL;
        if (root->val == p->val || root->val == q->val)
            return root;

        bool pInLeftBranch = inBranch(root->left, p);
        bool qInLeftBranch = inBranch(root->left, q);

        // Both nodes are on the left
        if (pInLeftBranch && qInLeftBranch)
            return lowestCommonAncestor(root->left, p, q);
        // Both nodes are on the right
        else if (!pInLeftBranch && !qInLeftBranch)
            return lowestCommonAncestor(root->right, p, q);
        else
            return root;
    }
};
Every time you call inBranch(root, node), you add O(number of descendants of root) work (see the time complexity of searching a binary tree). I will assume the binary tree is balanced for the first calculation of time complexity, then look at the worst-case scenario where the tree is unbalanced.
Scenario 1: Balanced Tree
The number of descendants of a node will be roughly half that of its parent. Therefore, each time we recurse, the calls to inBranch will be half as expensive. Let's say N is the total number of nodes in the tree. In the first call to lowestCommonAncestor, we search all N nodes. In subsequent calls we search the left or right half and so on.
O(N + N/2 + N/4 ...) is still the same as O(N).
Scenario 2: Unbalanced Tree
Say this tree is very unbalanced in such a way that all children are on the left side:
      A
     /
    B
   /
  C
 /
etc...
The number of descendants decreases by only 1 for each level of recursion. Therefore, our time complexity looks something like:
O(N + (N-1) + (N-2) + ... + 2 + 1) which is equivalent to O(N^2).
Is there a better way?
If this is a Binary Search Tree, we can do as well as O(path_length(p) + path_length(q)). If not, you will always need to traverse the entire tree at least once: O(N) to find the paths to each node, but you can still improve your worst case scenario. I'll leave it to you to figure out the actual algorithm!
In the worst case the binary tree is a list of length N, where each node has at most one child, and p and q are both the deepest leaf node. In that case you will have a running time of O(N^2): you call inBranch (runtime O(N)) on each node on the way down the tree.
If the binary tree is balanced, this becomes O(N log(N)) with N nodes as you can fit O(2^K) nodes into a tree of depth K (and recurse at most K times): Finding each node is O(N), but you only do it a maximum of log(N) times. Check Ben Jones' answer!
Note that a much better algorithm would locate each node once and store a list of the paths down the tree, then compare the paths. Finding each node in the tree (if it is unsorted) is necessarily worst case O(N), the list comparison is also O(N) (unbalanced case) or O(log(N)) (balanced case) so total running time is O(N). You could do even better on a sorted tree, but that's apparently not a given here.
Path Sum: Given a binary tree and a sum, find all root-to-leaf paths where each path's sum equals the given sum.
For example: sum = 11.
      5
     / \
    4   8
   /   / \
  2  -2   1
The answer is :
[
[5, 4, 2],
[5, 8, -2]
]
Personally I thought the time complexity was O(2^n), where n is the number of nodes in the given binary tree.
Thanks to Vikram Bhat and David Grayson, the tight time complexity is O(n log n), where n is the number of nodes in the given binary tree:
The algorithm checks each node once, which costs O(n).
vector<int> one_result(subList); copies the entire path from subList into one_result each time, which costs O(log n), because the height is O(log n).
So finally, the time complexity = O(n * log n) = O(n log n).
The idea of this solution is DFS (C++).
/**
 * Definition for binary tree
 * struct TreeNode {
 *     int val;
 *     TreeNode *left;
 *     TreeNode *right;
 *     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
 * };
 */
#include <vector>
using namespace std;

class Solution {
public:
    vector<vector<int>> pathSum(TreeNode *root, int sum) {
        vector<vector<int>> list;
        // Input validation.
        if (root == NULL) return list;
        vector<int> subList;
        int tmp_sum = 0;
        helper(root, sum, tmp_sum, list, subList);
        return list;
    }

    void helper(TreeNode *root, int sum, int tmp_sum,
                vector<vector<int>> &list, vector<int> &subList) {
        // Base case.
        if (root == NULL) return;
        if (root->left == NULL && root->right == NULL) {
            // Have a try.
            tmp_sum += root->val;
            subList.push_back(root->val);
            if (tmp_sum == sum) {
                vector<int> one_result(subList);
                list.push_back(one_result);
            }
            // Roll back.
            tmp_sum -= root->val;
            subList.pop_back();
            return;
        }
        // Have a try.
        tmp_sum += root->val;
        subList.push_back(root->val);
        // Do recursion.
        helper(root->left, sum, tmp_sum, list, subList);
        helper(root->right, sum, tmp_sum, list, subList);
        // Roll back.
        tmp_sum -= root->val;
        subList.pop_back();
    }
};
Though it seems that the time complexity is O(N), if you need to print all the paths it is O(N*logN). Suppose you have a complete binary tree: there will be N/2 paths, and each path will have logN nodes, so the total is O(N*logN) in the worst case.
Your algorithm looks correct, and the complexity should be O(n) because your helper function will run once for each node, and n is the number of nodes.
Update: Actually, it would be O(N*log(N)) because each time the helper function runs, it might print a path to the console consisting of O(log(N)) nodes, and it will run O(N) times.
TIME COMPLEXITY
The time complexity of the algorithm is O(N^2), where 'N' is the total number of nodes in the tree. This is because we traverse each node once (which takes O(N)), and for every leaf node we might have to store its path, which can take up to O(N).
We can calculate a tighter time complexity of O(NlogN) from the space complexity discussion below.
SPACE COMPLEXITY
If we ignore the space required for all paths list, the space complexity of the above algorithm will be O(N) in the worst case. This space will be used to store the recursion stack. The worst-case will happen when the given tree is a linked list (i.e., every node has only one child).
How can we estimate the space used for the all paths list? Take the example of the following balanced tree:
1
/ \
2 3
/ \ / \
4 5 6 7
Here we have seven nodes (i.e., N = 7). Since, for binary trees, there exists only one path to reach any leaf node, the total number of root-to-leaf paths can't be more than the number of leaves. As there can't be more than N/2 leaves in a binary tree, the maximum number of paths in the all-paths list is O(N/2) = O(N).
Now, each of these paths can have many nodes in it. For a balanced binary tree (like the one above), each leaf node is at maximum depth. Since the depth (or height) of a balanced binary tree is O(logN), each path can have at most logN nodes. This means that the total size of the all-paths list is O(N*logN). If the tree is not balanced, we will still have the same worst-case space complexity.
From the above discussion, we can conclude that the overall space complexity of our algorithm is O(N*logN).
Also from the above discussion, since for each leaf node, in the worst case, we have to copy log(N) nodes to store its path, therefore the time complexity of our algorithm will also be O(N*logN).
The worst case time complexity is not O(nlogn), but O(n^2).
to visit every node, we need O(n) time
to generate all paths, we have to add the nodes to the path for every valid path.
So the time taken is the sum of len(path) over all paths. To estimate an upper bound on that sum: the number of paths is bounded by n, and the length of each path is also bounded by n, so O(n^2) is an upper bound. Both worst cases can be reached at the same time if the top half of the tree is a linear chain and the bottom half is a complete binary tree, like this:
1
 \
  1
   \
    1
     \
      1
       \
        1
       / \
      1   1
     / \ / \
    1  1 1  1
The number of paths is n/4, and the length of each path is n/2 + log(n/2) ~ n/2.
I was asked to implement a binary search tree with a follow operation for each node v; the complexity should be O(1). The follow operation should return a node w (w > v).
I proposed doing it in O(log(n)), but they wanted O(1).
Upd. It should be the next greater node.
Just keep the maximum element of the tree and always return it for nodes v < maximum.
You can get O(1) if you store pointers to the "next node" (computed with your O(log(n)) algorithm), given you are allowed to do that.
How about:
int tree[N];

size_t follow(size_t v) {
    // First try the right child
    size_t w = v * 2 + 1;
    if (w >= N) {
        // Otherwise right sibling
        w = v + 1;
        if (w >= N) {
            // Finally right parent
            w = (v - 1) / 2 + 1;
        }
    }
    return w;
}
Where tree is a complete binary tree in array form and v/w are represented as zero-based indices.
One idea is to literally just have a next pointer on each node.
You can update these pointers in O(height) after an insert or remove (O(height) is O(log n) for a self-balancing BST), which is as long as an insert or remove takes, so it doesn't add to the time complexity.
Alternatively, you can also have a previous pointer in addition to the next pointer. If you do this, you can update these pointers in O(1).
Obviously, in either case, if you have a node, you also have its next pointer, and you can simply get this value in O(1).
Pseudo-code
For only a next pointer, after the insert, you'd do:
if inserted as a right child:
    newNode.next = parent.next
    parent.next = newNode
else: // left child
    newNode.next = parent
    predecessor(parent).next = newNode  // finding the predecessor is an O(height) walk
For both next and previous pointers:
if inserted as a right child:
    parent.next.previous = newNode
    newNode.next = parent.next
    newNode.previous = parent
    parent.next = newNode
else: // left child
    parent.previous.next = newNode
    newNode.previous = parent.previous
    newNode.next = parent
    parent.previous = newNode
(Some null checks are also required.)
This question was asked to me in an interview:
Let's say we have the above binary tree. How can I produce output like the following?
2 7 5 2 6 9 5 11 4
I answered that maybe we can keep a level-count variable and print all the elements sequentially by checking the level count of each node.
Probably I was wrong.
Can anybody give any idea as to how we can achieve that?
You need to do a breadth-first traversal of the tree. Here it is described as follows:
Breadth-first traversal: Depth-first is not the only way to go through the elements of a tree. Another way is to go through them level-by-level.
For example, each element exists at a certain level (or depth) in the tree:
tree
----
        j         <-- level 0
      /   \
     f     k      <-- level 1
   /  \     \
  a    h     z    <-- level 2
   \
    d             <-- level 3
(Computer people like to number things starting with 0.)
So, if we want to visit the elements level-by-level (and left-to-right, as usual), we would start at level 0 with j, then go to level 1 for f and k, then go to level 2 for a, h and z, and finally go to level 3 for d.
This level-by-level traversal is called a breadth-first traversal because we explore the breadth, i.e., the full width of the tree at a given level, before going deeper.
The traversal in your question is called a level-order traversal and this is how it's done (very simple/clean code snippet I found).
You basically use a queue and the order of operations will look something like this:
enqueue F
dequeue F
enqueue B G
dequeue B
enqueue A D
dequeue G
enqueue I
dequeue A
dequeue D
enqueue C E
dequeue I
enqueue H
dequeue C
dequeue E
dequeue H
For this tree (straight from Wikipedia):
The term for that is level-order traversal. Wikipedia describes an algorithm for that using a queue:
levelorder(root)
    q = empty queue
    q.enqueue(root)
    while not q.empty do
        node := q.dequeue()
        visit(node)
        if node.left ≠ null
            q.enqueue(node.left)
        if node.right ≠ null
            q.enqueue(node.right)
BFS:
std::queue<Node const *> q;
q.push(&root);
while (!q.empty()) {
    Node const *n = q.front();
    q.pop();
    std::cout << n->data << std::endl;
    if (n->left)
        q.push(n->left);
    if (n->right)
        q.push(n->right);
}
Iterative deepening would also work and saves memory use, but at the expense of computing time.
If we are able to fetch the next element at the same level, we are done. We already know we can access these elements using a breadth-first traversal.
Now the only problem is how to check whether we are at the last element of a level. For that reason, we append a delimiter (NULL in this case) to mark the end of a level.
Algorithm:
1.  Put root in queue.
2.  Put NULL in queue.
3.  While queue is not empty:
4.      x = fetch first element from queue
5.      If x is not NULL:
6.          x->rpeer = top element of queue
7.          put left and right child of x in queue
8.      else:
9.          if queue is not empty:
10.             put NULL in queue
11.         end if
12. end while
13. return
#include <iostream>
#include <queue>
using namespace std;

void print(tree* root)
{
    queue<tree*> que;
    if (!root)
        return;

    tree *tmp, *l, *r;
    que.push(root);
    que.push(NULL);

    while (!que.empty())
    {
        tmp = que.front();
        que.pop();
        if (tmp != NULL)
        {
            cout << tmp->val << " ";  // print value
            l = tmp->left;
            r = tmp->right;
            if (l) que.push(l);
            if (r) que.push(r);
        }
        else
        {
            if (!que.empty())
                que.push(NULL);
        }
    }
    return;
}
I would use a collection, e.g. std::list, to store all elements of the currently printed level:
Collect pointers to all nodes in the current level in the container
Print the nodes listed in the container
Make a new container, add the subnodes of all nodes in the container
Overwrite the old container with the new container
repeat until container is empty
As an example of what you can do at an interview if you don't remember/don't know the "official" algorithm: my first idea was to traverse the tree in regular pre-order, dragging a level counter along and maintaining a vector of linked lists of pointers to nodes per level, e.g.
levels[level].push_back(&node);
and at the end print the list of each level.