What is the time complexity of in-order, post-order, and pre-order traversal of a binary tree? Is it O(n), O(log n), or O(n^2)?
In-order, Pre-order, and Post-order traversals are Depth-First traversals.
For a Graph, the complexity of a Depth First Traversal is O(n + m), where n is the number of nodes, and m is the number of edges.
Since a Binary Tree is also a Graph, the same applies here.
The complexity of each of these Depth-first traversals is O(n+m).
Since the number of edges that can originate from a node is limited to 2 in the case of a Binary Tree, the maximum number of total edges in a Binary Tree is n-1, where n is the total number of nodes.
The complexity then becomes O(n + n-1), which is O(n).
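To make the "constant work per node" point concrete, here is a minimal in-order traversal sketch in C++ (the bare-bones Node struct is an assumption for illustration, not taken from the question):

#include <iostream>

struct Node {
    int value;
    Node* left;
    Node* right;
};

// Each node is visited exactly once, and only constant work is done per
// visit, so the whole traversal is O(n).
void inorder(const Node* root) {
    if (root == nullptr) return;      // base case: empty subtree
    inorder(root->left);              // visit the left subtree
    std::cout << root->value << ' ';  // constant work at this node
    inorder(root->right);             // visit the right subtree
}

Moving the print statement before or after the recursive calls turns this into a pre-order or post-order traversal; the complexity is unchanged.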
O(n), because you traverse each node once. Or rather - the amount of work you do for each node is constant (does not depend on the rest of the nodes).
Introduction
Hi
I was asked this question today in class, and it is a good question! I will explain here and hopefully get my more formal answer reviewed or corrected where it is wrong. :)
Previous Answers
The observation by @Assaf is also correct: a binary tree traversal recursively visits each node exactly once.
But, since it is a recursive algorithm, you often have to use more advanced methods to analyze run-time performance. When dealing with a sequential algorithm or one that uses for-loops, summations will often be enough. So, what follows is a more detailed explanation of this analysis for those who are curious.
The Recurrence
As previously stated,
T(n) = 2*T(n/2) + 1
where T(n) is the number of operations executed in your traversal algorithm (in-order, pre-order, or post-order makes no difference).
Explanation of the Recurrence
There are two T(n/2) terms because in-order, pre-order, and post-order traversals all call themselves on both the left and the right child node. So, think of each recursive call as a T(n/2). In other words, **left T(n/2) + right T(n/2) = 2T(n/2)**. The "1" comes from any other constant-time operations within the function, like printing the node value, et cetera. (It could be 1 or any other constant; the asymptotic run-time still computes to the same value, as explained below.)
Analysis
This recurrence can be analyzed with Big-Theta using the Master Theorem, so I will apply it here.
T(n) = 2*T(n/2) + constant
where constant is some constant (could be 1 or any other constant).
Using the Master Theorem, we have T(n) = a*T(n/b) + f(n).
So, a = 2, b = 2, and f(n) is a constant; writing f(n) = n^c = 1, it follows that c = 0.
From here, we can see that a = 2 and b^c = 2^0 = 1. So, a > b^c, or 2 > 2^0. Equivalently, c < logb(a), or 0 < log2(2).
From here we have T(n) = BigTheta(n^(logb(a))) = BigTheta(n^1) = BigTheta(n).
If you're not familiar with BigTheta(n), it is "similar" (please bear with me :) ) to O(n), but it is a "tighter bound", or tighter approximation, of the run-time. So, BigTheta(n) means the run-time is both worst-case O(n) and best-case BigOmega(n).
I hope this helps. Take care.
O(n), I would say.
I am deriving it for a balanced tree, but the result is applicable to all trees.
Assuming that you use recursion,
T(n) = 2*T(n/2) + 1 ----------> (1)
T(n/2) for the left sub-tree, T(n/2) for the right sub-tree, and '1' for checking the base case.
On simplifying (1), you can prove that the traversal (whether in-order, pre-order, or post-order) is of order O(n).
Traversal is O(n) for any order, because you are hitting each node once. Lookup is where it can be less than O(n) IF the tree has some sort of organizing schema (i.e., a binary search tree).
T(n) = 2T(n/2) + c
T(n/2) = 2T(n/4) + c => T(n) = 4T(n/4) + 2c + c
Similarly, T(n) = 8T(n/8) + 4c + 2c + c
...
Last step: T(n) = nT(1) + c * (sum of powers of 2 from 0 to h, where h is the height of the tree)
So the complexity is O(2^(h+1) - 1),
but h = log(n),
so O(2n - 1) = O(n).
Depth first traversal of a binary tree is of order O(n).
Algo --
PreOrderTrav(): ------------------ T(n)
  if root is null ---------------- O(1)
    return null ------------------ O(1)
  else: -------------------------- O(1)
    print(root) ------------------ O(1)
    PreOrderTrav(root.left) ------ T(n/2)
    PreOrderTrav(root.right) ----- T(n/2)
If the time complexity of the algo is T(n), then it can be written as T(n) = 2*T(n/2) + O(1).
If we apply back substitution, we get T(n) = O(n).
Consider a skewed binary tree with the 3 nodes 7, 3, 2. For any operation, such as searching for 2, we have to traverse 3 nodes; for deleting 2, we also have to traverse 3 nodes; and for inserting 1, we again have to traverse 3 nodes. So, a binary tree has a worst-case complexity of O(n).
Related
I have been solving a problem but couldn't get to an efficient solution.
Problem Statement:
Given a tree with N vertices and N-1 edges, where each vertex v has a value C[v] (C[ ] is an array). On this tree we have to perform Q queries of the form given below:
A query is given by two integers u,v. Let us define a value A which is given by the product of the values of all the nodes which lie on the simple path between u and v.
More formally, if the simple path between u and v is [u, a, b, ..., v], then
A = C[u] * C[a] * C[b] * ... * C[v].
For this query we need to output the number of divisors of A.
Constraints:
1<=N,Q<=100000, 1<=C[i]<=1000000 for all 1<=i<=N.
My approach: Since the product could be very large, I am storing the prime factors and their counts as the answer.
I have first precomputed the LCA for the tree using binary lifting. Then I have defined a map<int, int> for each node, which stores the prime divisors, with their counts, of the product from the root up to the current node. This can be computed with a simple DFS and a separate function for merging the maps.
(Note: I am finding the prime factorization of a node's value using a sieve, in O(LogN).)
Then for each query of the type [u, v], I find the LCA (let's say L). I then subtract the map of L from the map of u (note: the map of L will always be a subset of the map of u), and similarly for node v.
Now I have all the prime factors and their counts for the product.
Now, simply using the result that for a number K = a^p * b^q * c^r * ..., the number of divisors is D = (p+1) * (q+1) * (r+1) * ..., we get our answer.
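The last step, turning a prime-exponent map into a divisor count, is simple; here is a minimal sketch (the map layout matches the one described above, and the modulus is the required 1e9+7):

#include <map>

const long long MOD = 1000000007;  // answers are reported modulo 1e9+7

// Given the factorization of A as {prime -> exponent}, the number of
// divisors is the product of (exponent + 1) over all primes.
long long countDivisors(const std::map<int, int>& factors) {
    long long divisors = 1;
    for (const auto& entry : factors) {
        divisors = (divisors * (entry.second + 1)) % MOD;
    }
    return divisors;
}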
Time Complexity Analysis:
Let's define M as the number of distinct primes up to 1000000.
DFS will run in O(N*M) time:
For each node, the worst case while combining the maps is when all M distinct primes are present in the map.
LCA pre-computation: O(NLogN) time
Sieve pre-computation: O(NLogLogN) time
Each query will run in O(M + LogN) time: O(LogN) for finding the LCA and O(M) for subtracting the maps to find the prime factors of the product.
So the time complexity is O(NLogN + NM + NLogLogN + Q(M + LogN)).
Now the issue is that in the worst case M is about 78,000 (the number of distinct primes below 1000000), so this blows up the time complexity. Is there any other efficient method?
Answer must be reported modulo 1e9+7
I'm in dire need of some guidance with calculating Big-O runtime for the following C++ function:
Fraction Polynomial::solve(const Fraction& x) const {
    Fraction rc;
    auto it = poly_.begin();
    while (it != poly_.end()) {
        Term t = *it;
        // find x^exp
        Fraction curr(1, 1);
        for (int i = 0; i < t.exponent_; i++) {
            curr = curr * x;
        }
        rc += t.coefficient_ * curr;
        it++;
    }
    return rc;
}
This is still a new concept to me, so I'm having a bit of trouble getting it right. I'm assuming that there are at least two operations that happen once (auto it = poly_.begin(), and the return rc at the end), but I am not sure how to count the number of operations in the while loop. According to my professor, the correct runtime is not O(n). If anyone could offer any guidance, it would be greatly appreciated. I want to understand how to answer this question, but I couldn't find anything else like this function online, so here I am. Thank you.
I assume you want to evaluate a certain polynomial (let us say A_n*X^n + ... + A_0) at a given point (a rational value, since it is given as a Fraction).
The first while loop will iterate through all the individual components of your polynomial. For an n-degree polynomial, that will yield n + 1 iterations, so the outer loop alone takes O(n) time.
However, for every term (let us say of rank i) of the polynomial, you have to compute the value of X^i, and that is what your inner for loop does. It computes X^i by repeated multiplication, yielding linear complexity: O(i).
Since the two loops are nested, the overall complexity is obtained by multiplying their worst-case time complexities: O(n) * O(n) = O(n^2). (The first factor is the complexity of the while loop; the second is the worst-case complexity of computing X^i, which is O(n) when i == n. More precisely, the total number of multiplications is 0 + 1 + ... + n = n(n+1)/2, which is still O(n^2).)
Assuming this is an n-th order polynomial (the highest term is raised to the power of n):
In the outer while loop, you will iterate through n+1 terms (0 to n, inclusive on both sides).
For each term, in the inner for loop, you are going to perform the multiplication m times, where m is the power of the current term. Since this is an n-th order polynomial, m ranges from 0 to n, so on average you are going to perform the multiplication n/2 times.
The overall complexity will be O((n+1) * (n/2)) = O(n^2)
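To see that sum directly, here is a standalone sketch (a hypothetical stand-in for the original class, representing the polynomial only by its exponents) that counts the inner-loop multiplications:

#include <iostream>

int main() {
    const int n = 100;  // degree of the polynomial
    long long multiplications = 0;
    // One term per exponent 0..n, mirroring the outer while loop in solve().
    for (int exponent = 0; exponent <= n; ++exponent) {
        // The inner for loop multiplies 'exponent' times to build x^exponent.
        multiplications += exponent;
    }
    std::cout << multiplications << '\n';  // prints 5050 = n(n+1)/2, i.e. O(n^2) growth
    return 0;
}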
I'm trying to understand recurrence relations. I've found a way to determine the maximum element in an array of integers through recursion. Below is the function. The first time it is called, n is the size of the array.
int ArrayMax(int array[], int n) {
    if (n == 1)
        return array[0];
    int result = ArrayMax(array, n - 1);
    if (array[n - 1] > result)
        return array[n - 1];
    else
        return result;
}
Now I want to understand the recurrence relation and how to get to big-O notation from there. I know that T(n) = aT(n/b) + f(n), but I don't see how to get what a and b should be.
a is "how many recursive calls there are", and b is "how many pieces you split the data into", intuitively. Note that the parameter inside the recursive call doesn't have to be n divided by something, in general it's any function of n that describes how the magnitude of your data has been changed.
For example binary search does one recursive call at each layer, splits the data into 2, and does constant work at each layer, so it has T(n) = T(n/2) + c. Merge sort splits the data in two each time (the split taking work proportional to n) and recurses on both subarrays - so you get T(n) = 2T(n/2) + cn.
In your example, you'd have T(n) = T(n-1) + c, as you're making one recursive call and "splitting the data" by reducing its size by 1 each time.
To get the big O notation from this, you just make substitutions or expand. With your example it's easy:
T(n) = T(n-1) + c = T(n-2) + 2c = T(n-3) + 3c = ... = T(0) + nc
If you assume T(0) = c0, some "base constant", then you get T(n) = nc + c0, which means the work done is in O(n).
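You can sanity-check this by counting calls; a quick sketch that adds a call counter to the function from the question:

#include <iostream>

int calls = 0;  // counts how many times ArrayMax is invoked

int ArrayMax(int array[], int n) {
    ++calls;
    if (n == 1)
        return array[0];
    int result = ArrayMax(array, n - 1);
    return (array[n - 1] > result) ? array[n - 1] : result;
}

int main() {
    int data[8] = {3, 1, 4, 1, 5, 9, 2, 6};
    std::cout << ArrayMax(data, 8) << '\n';  // prints 9
    std::cout << calls << '\n';              // prints 8: exactly n calls, so O(n)
    return 0;
}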
The binary search example is similar, but you've got to make a substitution - try letting n = 2^m, and see where you can get with it. Finally, deriving the big O notation of e.g. T(n) = T(sqrt(n)) + c is a really cool exercise.
Edit: There are other ways to solve recurrence relations - the Master Theorem is a standard method. But the proof isn't particularly nice and the above method works for every recurrence I've ever applied it to. And... well, it's just more fun than plugging values into a formula.
In your case the recurrence relation is:
T(n) = T(n-1) + constant
And the Master theorem applies to recurrences of the form:
T(n) = aT(n/b) + f(n), where a >= 1 and b > 1
Here the Master theorem cannot be applied, because it requires
b to be greater than 1 (b > 1),
and in your case the input shrinks by subtraction rather than division, i.e., b = 1.
I am having difficulty figuring out the running time of this simple recursive function.
void myRecur(int n)
{
    if (n < 1) return;
    cout << n << " ";
    myRecur(n / 2);
    cout << n << " ";
    myRecur(n / 2);
}
I figured that it prints: 4 2 1 1 2 1 1 4 2 1 1 2 1 1 for myRecur(4).
Also, is this function similar to the tree traversal function in terms of time complexity?
Any advice on understanding recursion and detailed explanation of the running time of this particular problem are much appreciated.
This is a great spot to use a recurrence relation! Let's let T(n) be the amount of time the algorithm takes to run on an input of size n. Then
T(0) = 1, since the base case does a constant amount of work, and
T(n) = 2T(⌊n/2⌋) + 1, since each other case makes two recursive calls to problems of size n/2 and does an additional constant amount of work.
The goal now is to find some sort of expression that describes T(n) non-recursively. There are a lot of ways to do this. Look up the iteration method and the recursion-tree method for some examples of this. The fastest way to do this is to use the wonderfully-named Master Theorem, which lets you directly determine the time complexity from the recurrence relation. In this case, the master theorem says that this solves to T(n) = Θ(n).
I think you are interested in the worst case, so the code complexity is O(n). For a given n, your function will run at most 2n - 1 times.
For a better understanding, try to build a call tree:
          n
      n/2   n/2
   n/4 n/4 n/4 n/4
         ...
There are ⌊log2(n)⌋ + 1 levels. Level lvl has 2^lvl items. The total number of items is 2n - 1.
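You can confirm the 2n - 1 count empirically; a small sketch that instruments the function from the question with a counter (for n a power of two):

#include <iostream>
using namespace std;

int calls = 0;  // counts invocations that get past the base case

void myRecur(int n) {
    if (n < 1) return;
    ++calls;  // only count calls that do real work
    cout << n << " ";
    myRecur(n / 2);
    cout << n << " ";
    myRecur(n / 2);
}

int main() {
    myRecur(4);                     // prints 4 2 1 1 2 1 1 4 2 1 1 2 1 1
    cout << '\n' << calls << '\n';  // prints 7 = 2*4 - 1
    return 0;
}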
Path Sum: Given a binary tree and a sum, find all root-to-leaf paths where each path's sum equals the given sum.
For example: sum = 11.
      5
     / \
    4   8
   /   / \
  2   -2  1
The answer is :
[
[5, 4, 2],
[5, 8, -2]
]
Personally I think the time complexity = O(2^n), where n is the number of nodes of the given binary tree.
Thank you Vikram Bhat and David Grayson; the tight time complexity = O(nlogn), where n is the number of nodes in the given binary tree:
The algorithm checks each node once, which causes O(n).
"vector<int> one_result(subList);" will copy the entire path from subList to one_result each time, which causes O(logn), because the height is O(logn).
So finally, the time complexity = O(n * logn) = O(nlogn).
The idea of this solution is DFS [C++].
/**
 * Definition for binary tree
 * struct TreeNode {
 *     int val;
 *     TreeNode *left;
 *     TreeNode *right;
 *     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
 * };
 */
#include <vector>
using namespace std;

class Solution {
public:
    vector<vector<int>> pathSum(TreeNode *root, int sum) {
        vector<vector<int>> list;
        // Input validation.
        if (root == NULL) return list;
        vector<int> subList;
        int tmp_sum = 0;
        helper(root, sum, tmp_sum, list, subList);
        return list;
    }

    void helper(TreeNode *root, int sum, int tmp_sum,
                vector<vector<int>> &list, vector<int> &subList) {
        // Base case.
        if (root == NULL) return;
        if (root->left == NULL && root->right == NULL) {
            // Have a try.
            tmp_sum += root->val;
            subList.push_back(root->val);
            if (tmp_sum == sum) {
                vector<int> one_result(subList);
                list.push_back(one_result);
            }
            // Roll back.
            tmp_sum -= root->val;
            subList.pop_back();
            return;
        }
        // Have a try.
        tmp_sum += root->val;
        subList.push_back(root->val);
        // Do recursion.
        helper(root->left, sum, tmp_sum, list, subList);
        helper(root->right, sum, tmp_sum, list, subList);
        // Roll back.
        tmp_sum -= root->val;
        subList.pop_back();
    }
};
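A quick usage sketch for the example tree from the question; it assumes the TreeNode struct from the comment block is actually defined and the Solution class above is in scope:

#include <iostream>

int main() {
    // Build the example tree from the question (target sum = 11).
    TreeNode* root = new TreeNode(5);
    root->left = new TreeNode(4);
    root->right = new TreeNode(8);
    root->left->left = new TreeNode(2);
    root->right->left = new TreeNode(-2);
    root->right->right = new TreeNode(1);

    Solution solution;
    vector<vector<int>> paths = solution.pathSum(root, 11);
    for (const auto& path : paths) {  // prints "5 4 2" and "5 8 -2"
        for (int val : path) std::cout << val << ' ';
        std::cout << '\n';
    }
    return 0;
}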
Though it seems that the time complexity is O(N), if you need to record all paths then it is O(N*logN). Suppose you have a complete binary tree: the total number of paths will be N/2 and each path will have logN nodes, so the total is O(N*logN) in the worst case.
Your algorithm looks correct, and the complexity should be O(n) because your helper function will run once for each node, and n is the number of nodes.
Update: Actually, it would be O(N*log(N)), because each time the helper function runs it might copy a path consisting of O(log(N)) nodes into the result list, and it will run O(N) times.
TIME COMPLEXITY
The time complexity of the algorithm is O(N^2), where ‘N’ is the total number of nodes in the tree. This is due to the fact that we traverse each node once (which will take O(N)), and for every leaf node we might have to store its path, which will take O(N).
We can calculate a tighter time complexity of O(NlogN) from the space complexity discussion below.
SPACE COMPLEXITY
If we ignore the space required for all paths list, the space complexity of the above algorithm will be O(N) in the worst case. This space will be used to store the recursion stack. The worst-case will happen when the given tree is a linked list (i.e., every node has only one child).
How can we estimate the space used for the all paths list? Take the example of the following balanced tree:
        1
      /   \
     2     3
    / \   / \
   4   5 6   7
Here we have seven nodes (i.e., N = 7). Since, for binary trees, there exists only one path to reach any leaf node, we can easily say that total root-to-leaf paths in a binary tree can’t be more than the number of leaves. As we know that there can’t be more than N/2 leaves in a binary tree, therefore the maximum number of elements in all paths list will be O(N/2) = O(N). Now, each of these paths can have many nodes in them. For a balanced binary tree (like above), each leaf node will be at maximum depth. As we know that the depth (or height) of a balanced binary tree is O(logN) we can say that, at the most, each path can have logN nodes in it. This means that the total size of the all paths list will be O(N*logN). If the tree is not balanced, we will still have the same worst-case space complexity.
From the above discussion, we can conclude that the overall space complexity of our algorithm is O(N*logN).
Also from the above discussion, since for each leaf node, in the worst case, we have to copy log(N) nodes to store its path, therefore the time complexity of our algorithm will also be O(N*logN).
The worst case time complexity is not O(nlogn), but O(n^2).
to visit every node, we need O(n) time;
to generate all paths, we have to add the nodes of every valid path to its path list.
So the time taken is the sum of len(path) over all paths. To estimate an upper bound on this sum: the number of paths is bounded by n, and the length of each path is also bounded by n, so O(n^2) is an upper bound. Both worst cases can be reached at the same time if the top half of the tree is a linear chain and the bottom half is a complete binary tree, like this:
      1
      1
      1
      1
      1
     1 1
   1 1 1 1
The number of paths is n/4, and the length of each path is n/2 + log(n/2) ≈ n/2.