what is the complexity of a loop which goes this
for (int i = 0; i < n; i++)
{
for (int j = 0; j < log(i); j++)
{
// do something
}
}
According to me the inner loop will be running log(1)+log(2)+log(3)+...+log(n) times so how do i calculate its complexity?
So, you have a sum log(1) + log(2) + log(3) + ... + log(n) = log(n!). By using Stirling's approximation and the fact that ln(x) = log(x) / log(e) one can get
log(n!) = log(e) * ln(n!) = log(e) (n ln(n) - n + O(ln(n)))
which gives the same complexity O(n ln(n)) as in the other answer (with slightly better understanding of the constants involved).
Without doing this in a formal manner, such complexity calculations can be "guessed" using integrals. The integrand is the complexity of do_something, which is assumed to be O(1), and combined with the interval of log N, this then becomes log N for the inner loop. Combined with the outer loop, the overall complexity is O(N log N). So between linear and quadratic.
Note: this assumes that "do something" is O(1) (in terms of N, it could be of a very high constant of course).
Lets start with log(1)+log(2)+log(3)+...+log(n). Roughly half of the elements of this sum are greater than or equal to log(n/2) = log(n) - log(2). Hence the lower bound of this sum is n / 2 * (log(n) - log(2)) = Omega(nlog(n)). To get upper bound simply multiply n by the largest element which is log(n), hence O(nlog(n)).
Related
I need to predict the algorithm's average case efficiency with respect to the size of its inputs using summation/sigma notation to arrive at the final answer. Many resources use summation to predict worst-case, and I couldn't find someone explaining how to predict average case so step-by-step answers are appreciated.
The algorithm contains a nested for loop, with the basic operation inside the innermost loop:
[code redacted]
EDIT: The execution of the basic operation it will always execute inside the second for loop if the second for loop has been entered, and has no break or return statements. HOWEVER: the end of the first for loop has the return statement which is dependent on the value produced in the basic operation, so the contents of the array do affect how many total times the basic operation will be executed for each run of the algorithm.
The array passed to the algorithm has randomly generated contents
I think the predicted average case efficiency is (n^2)/2, making it n^2 order of growth/big Theta of n^2, but I don't know how to theoretically prove this using summation.
Answers are very appreciated!
TL;DR: Your code complexity in average case is Θ(n²) if "basic operation" complexity is Θ(1) and it has no return, break or goto operators.
Explanation: the average-case complexity is just an expectation of the number of operations in your code given the size of the input.
Let's say T(A, n) is a number of operations your code performs given array A of size n. It's easy to see that
T(A, n) = 1 + // int k = ceil(size/2.0);
n * 2 + 1 + // for (int i = 0; i < size; i++){
n * (n * 2 + 1) + // for(int j = 0; j < size; j++){
n * n * X + // //Basic operation
1 // return (some int);
Where X is a number of operations in your "basic operation". As we can see, T(A, n) does not depend on actual contents of the array A. Thus, the expected number of operations given size of the array (which is simply the arithmetical mean of T(A, n) for all possible A for given n) is exactly equal to each of them:
T(n) = T(A, n) = 3 + n * 2 + n * n * (2 + X)
If we assume that X = Θ(1), this expression is Θ(n²).
Even without this assumption we can have an estimate: if X = Θ(f(n)), then your code complexity is T(n) = Θ(f(n)n²). For example, if X is Θ(log n), T(n) = Θ(n² log n)
First, here's my Shell sort code (using Java):
public char[] shellSort(char[] chars) {
int n = chars.length;
int increment = n / 2;
while(increment > 0) {
int last = increment;
while(last < n) {
int current = last - increment;
while(current >= 0) {
if(chars[current] > chars[current + increment]) {
//swap
char tmp = chars[current];
chars[current] = chars[current + increment];
chars[current + increment] = tmp;
current -= increment;
}
else { break; }
}
last++;
}
increment /= 2;
}
return chars;
}
Is this a correct implementation of Shell sort (forgetting for now about the most efficient gap sequence - e.g., 1,3,7,21...)? I ask because I've heard that the best-case time complexity for Shell Sort is O(n). (See http://en.wikipedia.org/wiki/Sorting_algorithm). I can't see this level of efficiency being realized by my code. If I added heuristics to it, then yeah, but as it stands, no.
That being said, my main question now - I'm having difficulty calculating the Big O time complexity for my Shell sort implementation. I identified that the outer-most loop as O(log n), the middle loop as O(n), and the inner-most loop also as O(n), but I realize the inner two loops would not actually be O(n) - they would be much less than this - what should they be? Because obviously this algorithm runs much more efficiently than O((log n) n^2).
Any guidance is much appreciated as I'm very lost! :P
The worst-case of your implementation is Θ(n^2) and the best-case is O(nlogn) which is reasonable for shell-sort.
The best case ∊ O(nlogn):
The best-case is when the array is already sorted. The would mean that the inner if statement will never be true, making the inner while loop a constant time operation. Using the bounds you've used for the other loops gives O(nlogn). The best case of O(n) is reached by using a constant number of increments.
The worst case ∊ O(n^2):
Given your upper bound for each loop you get O((log n)n^2) for the worst-case. But add another variable for the gap size g. The number of compare/exchanges needed in the inner while is now <= n/g. The number of compare/exchanges of the middle while is <= n^2/g. Add the upper-bound of the number of compare/exchanges for each gap together: n^2 + n^2/2 + n^2/4 + ... <= 2n^2 ∊ O(n^2). This matches the known worst-case complexity for the gaps you've used.
The worst case ∊ Ω(n^2):
Consider the array where all the even positioned elements are greater than the median. The odd and even elements are not compared until we reach the last increment of 1. The number of compare/exchanges needed for the last iteration is Ω(n^2).
Insertion Sort
If we analyse
static void sort(int[] ary) {
int i, j, insertVal;
int aryLen = ary.length;
for (i = 1; i < aryLen; i++) {
insertVal = ary[i];
j = i;
/*
* while loop exits as soon as it finds left hand side element less than insertVal
*/
while (j >= 1 && ary[j - 1] > insertVal) {
ary[j] = ary[j - 1];
j--;
}
ary[j] = insertVal;
}
}
Hence in case of average case the while loop will exit in middle
i.e 1/2 + 2/2 + 3/2 + 4/2 + .... + (n-1)/2 = Theta((n^2)/2) = Theta(n^2)
You saw here we achieved (n^2)/2 even though divide by two doesn't make more difference.
Shell Sort is nothing but insertion sort by using gap like n/2, n/4, n/8, ...., 2, 1
mean it takes advantage of Best case complexity of insertion sort (i.e while loop exit) starts happening very quickly as soon as we find small element to the left of insert element, hence it adds up to the total execution time.
n/2 + n/4 + n/8 + n/16 + .... + n/n = n(1/2 + 1/4 + 1/8 + 1/16 + ... + 1/n) = nlogn (Harmonic Series)
Hence its time complexity is some thing close to n(logn)^2
Here is the algorithm:
void heapSort(int * arr, int startIndex, int endIndex)
{
minHeap<int> h(endIndex + 1);
for (int i = 0; i < endIndex + 1; i++)
h.insert(arr[i]);
for (int i = 0; i < endIndex + 1; i++)
arr[i] = h.deleteAndReturnMin();
}
The methods insert() and deleteAndReturnMin() are both O(log n). endIndex + 1 can be referred to as n number of elements. So given that information, am I correct in saying the first and the second loop are both O(n log n) and thus the time complexity for the whole algorithm is O(n log n)? And to be more precise, would the total time complexity be O(2(n log n)) (not including the initializations)? I'm learning about big O notation and time complexity so I just want to make sure I'm understanding it correctly.
Your analysis is correct.
Given that the two methods you have provided are logarithmic time, your entire runtime is logarithmic time iterated over n elements O(n log n) total. You should also realize that Big-O notation ignores constant factors, so the factor of 2 is meaningless.
Note that there is a bug in your code. The inputs seem to suggest that the array starts from startIndex, but startIndex is completely ignored in the implementation.
You can fix this by changing the size of the heap to endIndex + 1 - startIndex and looping from int i = startIndex.
I'm busy doing an assignment and I'm struggling with a question. I know I'm not supposed to ask assignment questions outright so I understand if I don't get straight answers. But here goes anyway.
We must calculate the run time complexity of different algorithms, the one I'm stuck on is this.
for(int i = 1 ; i < n ; i++)
for(int j = 0 ; j < i ; j +=2)
sum++;
Now with my understanding, my first thought would be less than O(n2), because the nested loop isn't running the full n times, and still the j variable is incrementing by 2 each loop rather than iterating like a normal for loop. Although, when I did some code simulations with N=10, N=100, N=1000, etc. I got the following results when I outputted the sum variable.
N = 10 : 25,
N = 100 : 2500,
N = 1000 : 250000,
N = 10000 : 25000000
When I look at these results, the O Notations seems like it should be much larger than just O(n).
The 4 options we have been given in the assignment are : O(1), O(n2), O(n) and O(logn). As I said earlier, I cannot see how it can be as large as O(n2), but the results are pointing to that. So I just think I don't fully understand this, or I'm missing some link.
Any help would be appreciated!
Big O notation does not give you the number of operations. It just tells you how fast it will grow with growing input. And this is what you observe.
When you increased input c times, the total number of operations grows c^2.
If you calculated (nearly) exact number of operations precisely you would get (n^2)/4.
Of course you can calculate it with sums, but since I dunno how to use math on SO I will give an "empirical" explanation. Simple loop-within-a-loop with the same start and end conditions gives n^2. Such loop produces a matrix of all possible combinations for "i" and "j". So if start is 1 and end is N in both cases you get N*N combinations (or iterations effectively).
However, yours inner loop is for i < j. This basically makes a triangle out of this square, that is the 1st 0.5 factor, and then you skip every other element, this is another 0.5 factor; multiplied you get 1/4.
And O(0.25 * n^2) = O(n^2). Sometimes people like to leave the factor in there because it lets you compare two algorithms with the same complexity. But it does not change the ratio of growth in respect to n.
Bear in mind that big-O is asymptotic notation. Constants (additive or multiplicative) have zero impact on it.
So, the outer loop runs n times, and on the ith time, the inner loop runs i / 2 times. If it weren't for the / 2 part, it would be the sum of all numbers 1 .. n, which is the well known n * (n + 1) / 2. That expands to a * n^2 + b * n + c for a non-zero a, so it's O(n^2).
Instead of summing n numbers, we're summing n / 2 numbers. But that's still somewhere around (n/2) * ((n/2) + 1) / 2. Which still expands to d * n^2 + e * n + f for a non-zero d, so it's still O(n^2).
From your output it seems like:
sum ~= (n^2)/4.
This is obviously O(n^2) (actually you can replace the O with teta).
You should recall the definition for Big-O notation. See http://en.wikipedia.org/wiki/Big_O_notation.
The thing is that number of operations here is dependant on the square of n, even though the overall number is less than n². Nevertheless, the scaling is what matters for Big-O notation, thus it's O(n²)
With:
for (int i = 1 ; i < n ; i++)
for (int j = 0 ; j < i ; j +=2)
sum++;
We have:
0+2+4+6+...+2N == 2 * (0+1+2+3+...+N) == 2 * (N * (N+1) / 2) == N * (N+1)
so, with n == 2N, we have (n / 2) * (n / 2 + 1) ~= (n * n) / 4
so O(n²)
Your understanding regarding time complexity is not appropriate.Time Complexity is not only the matter of 'sum' variable.'sum' only calculates how many times the inner loop iterates,but you also have to consider total number of outer loop iterations.
now consider your program:
for(int i = 1 ; i < n ; i++)
for(int j = 0 ; j < i ; j +=2)
sum++;
Time complexity means running time of your program with respect to input values(here n).Here running time does not mean actual required time to execute your program in your computer .Actual required time varies from machine to machine.so to get a machine independent running time, Big O notation is very useful.Bog O actually comes from mathematics and it describes the running time in terms of mathematical functions.
The outer loop is executed total (n-1) times.for each of these (n-1) values (starting from i=1), the inner loop iterates i/2 times.so total number of inner loop iterations=1+1+2+2+3+3+...+(n/2)+(n/2)=2(1+2+3+...+n/2)=2*(n/2(n/2+1))/2=n^2/4+n/2.
similarly 'sum++' also executed total n^2/4+n/2 times.Now consider cost of line 1 of the program=c1,cost of line 2=c2 and cost of line 3=c3 .These casts can be different for different machine. so total time required for executing the program =c1*(n-1)+c2*(n^2/4+n/2)+c3*(n^2/4+n/2)=(c2+c3)n^2/4+(c2+c3)n/2+c1*n-c1.Thus the required time can be expressed in terms of mathematical function.In Big O notation you can say it is O((c2+c3)n^2/4+(c2+c3)n/2+c1*n-c1).In case of Big O notation, lower order terms and coefficient of highest order term can be ignored. because for large value of n ,n^2 is much greater than n. so you can say it is O((c1+c2)n^2/4).Also as for any value of n , n^2 is greater than (c1+c2)n^2/4 by a constant factor, so you can say it is O(n^2).
So, I really don't get Big O notation. I have been tasked with determining "O value" for this code segment.
for (int count =1; count < n; count++) // Runs n times, so linear, or O(N)
{
int count2 = 1; // Declares an integer, so constant, O(1)
while (count2 < count) // Here's where I get confused. I recognize that it is a nested loop, but does that make it O(N^2)?
{
count2 = count2 * 2; // I would expect this to be constant as well, O(N)
}
}
O(f(n))=g(n)
This implies that for some value k, f(n)>g(n) where n>k. This gives the upper bound for the function g(n).
When you are asked to find Big O for some code,
1) Try to count the number of computations being performed in terms of n and thus getting g(n).
2) Now try estimating the upper bound function of g(n). That will be your answer.
Lets apply this procedure to your code.
Lets count the number of computations made. The statements declaring and multiply by 2 take O(1) time. But these are executed repeatedly. We need to find how many times they are executed.
The outer loop executes for n times. Hence the first statement executes for n times. Now the number of times inner loop gets executed depends on value of n. For a given value of n it executes for logn times.
Now lets count the total number of computations performed,
log(1) + log(2) + log(3) +.... log(n) + n
Note that the last n is for the first statement. Simplifying the above series we get:
= log(1*2*3*...n) + n
= log(n!) + n
We have
g(n)=log(n!) + n
Lets guess the upper bound for log(n!).
Since,
1.2.3.4...n < n.n.n...(n times)
Hence,
log(n!) < log(n^n) for n>1
which implies
log(n!) = O(nlogn).
If you want a formal proof for this, check this out. Since nlogn increases faster than n , we therefore have:
O(nlogn + n) = O(nlogn)
Hence your final answer is O(nlogn).