I am having trouble deciding between N^2 and NlogN as the Big O? Whats throwing me off is the third nested for loop from k <=j. How do I reconcile this?
int Max_Subsequence_Sum( const int A[], const int N )
{
int This_Sum = 0, Max_Sum = 0;
for (int i=0; i<N; i++)
{
for (int j=i; j<N; j++)
{
This_Sum = 0;
for (int k=i; k<=j; k++)
{
This_Sum += A[k];
}
if (This_Sum > Max_Sum)
{
Max_Sum = This_Sum;
}
}
}
return Max_Sum;
}
This can be done with estimation or analysis. Looking at the inner most loop there are j-i operations inside the second loop. To get the total number of operations one would sum to get :
(1+N)(2 N + N^2) / 6
Making the algorithm O(N^3). To estimate one can see that there are three loops which at some point have O(N) calls thus it's O(N^3).
Let us analyze the most inner loop first:
for (int k=i; k <= j; k++) {
This_Sum += A[k];
}
Here the counter k iterates from i (inclusive) to j (inclusive), this thus means that the body of the for loop is performed j-i+1 times. If we assume that fetching the k-th number from an array is done in constant time, and the arithmetic operations (incrementing k, calculating the sum of This_Sum and A[k], and comparking k with j), then this thus runs in O(j-i).
The initialization of This_Sum and the if statement is not significant:
This_Sum = 0;
// ...
if (This_Sum > Max_Sum) {
Max_Sum = This_Sum;
}
indeed, if we can compare two numbers in constant time, and set one variable to the value hold by another value in constant time, then regardless whether the condition holds or not, the number of operations is fixed.
Now we can take a look at the loop in the middle, and abstract away the most inner loop:
for (int j=i; j < N; j++) {
// constant number of oprations
// j-i+1 operations
// constant number of operations
}
Here j ranges from i to N, so that means that the total number of operations is:
N
---
\
/ j - i + 1
---
j=i
This sum is equivalent to:
N
---
\
(N-j) * (1 - i) + / j
---
j=i
This is an arithmetic sum [wiki] and it is equivalent to:
(N - i + 1) × ((1 - i) + (i+N) / 2) = (N - i + 1) × ((N-i) / 2 + 1)
or when we expand this:
i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
So that means that we can now focus on the outer loop:
for (int i=0; i<N; i++) {
// i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
}
So now we can again calculate the number of operations with:
N
---
\
/ i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
---
i=0
We can use Faulhaber's formula [wiki] here to solve this sum, and obtain:
(N+1)×(N2+5×N+6)/6
or in expanded form:
N3/6 + N2 + 11×N/6 + 1
which is thus an O(n3) algorithm.
Related
The first loop runs O(log n) time but the second loop's runtime depends on the counter of the first loop, If we examine it more it should run like (1+2+4+8+16....+N) I just couldn't find a reasonable answer to this series...
for (int i = 1; i < n; i = i * 2)
{
for (int j = 1; j < i; j++)
{
//const time
}
}
Just like you said. If N is power of two, then 1+2+4+8+16....+N is exactly 2*N-1 (sum of geometric series) . This is same as O(N) that can be simplified to N.
It is like :
1 + 2 + 4 + 8 + 16 + ....+ N
= 2 ^ [O(log(N) + 1] - 1
= O(N)
I want to design an algorithm with O(n(log(2)n)^2) time complexity. I wrote this:
for(int i=1; i<=n; i++){
j=i;
while(j != 1)
j=j/2;
j=i;
while(j !=1)
j=j/2;
}
Does it have O(n(log(2)n)^2) time complexity? If not, where I am going wrong and how can I fix it so that its time complexity is O(n(log(2)n)^2)?
Slight digression:
As the guys said in the comments, the algorithm is indeed O(n log n). This is coincidentally identical to the result obtained by multiplying the complexity of the inner loop by the outer loop, i.e. O(log i) x O(n).
This may lead you to believe that we can simply add another iteration of the inner loop to obtain the (log n)2 part:
for (int i = 1; i < n; i++) {
int k = i;
while (k >= 1)
k /= 2;
int j = i;
while (j >= 1)
j /= 2;
}
}
But let's look at how the original complexity is derived:
(Using Sterling's approximation)
Therefore the proposed modification would give:
Which is not what we want.
An example I can think of from a recent personal project is semi-naive KD-tree construction. The pseudocode is given below:
def kd_recursive_cons (list_points):
if length(list_points) < predefined_threshold:
return new leaf(list_points)
A <- random axis (X, Y, Z)
sort list_points by their A-coordinate
mid <- find middle element in list_points
list_L, list_R <- split list_points at mid
node_L <- kd_recursive_cons(list_L)
node_R <- kd_recursive_cons(list_R)
return new node (node_L, node_R)
end
The time complexity function is therefore given by:
Where the n log n part is from sorting. We can obviously ignore the Dn linear part, and also the constant C. Thus:
Which is what we wanted.
Now to write a simpler piece of code with the same time complexity. We can make use of the summation we obtained in the above derivation...
And noting that the parameter passed to the log function is divided by two in every loop, we can thus write the code:
for (int i = 1; i < n; i++) {
for (int k = n; k >= 1; k /= 2) {
int j = k;
while (j >= 1)
j /= 2;
}
}
This looks like the "naive" but incorrect solution mentioned at the beginning, with the difference being that the nested loops there had different bounds (j did not depend on k, but k depended on i instead of directly on n).
EDIT: some numerical tests to confirm that the complexity is as intended:
Test function code:
int T(int n) {
int o = 0;
for (int i = 1; i < n; i++)
for (int j = n; j >= 1; j /= 2)
for (int k = j; k >= 1; k /= 2, o++);
return o;
}
Numerical results:
n T(n)
-------------------
2 3
4 18
8 70
16 225
32 651
64 1764
128 4572
256 11475
512 28105
1024 67518
2048 159666
4096 372645
8192 860055
16384 1965960
32768 4456312
65536 10026855
131072 22413141
262144 49807170
524288 110100270
Then I plotted sqrt(T(n) / n) against n. If the complexity is correct this should give a log(n) graph, or a straight line if plotted with a log-scale horizontal axis.
And this is indeed what we get:
I have a situation where I am using openMP for the Xeon Phi Coprocessor, and I have an opportunity to parallelize an "embarrassingly parallel" double for loop.
However, the for loop is looping through the upper triangle (including the diagonal):
for (int i = 0; i < n; i++)
// access some arrays with the value of i
for (int j = i; j < n; j++)
// do some stuff with the values of j and i
So, I've got the total size of the loop,
for (int q = 0; q < ((n*n - n)/2)+n; q++)
// Do stuff
But where I am struggling is:
How do I calculate i and j from q? Is it possible?
In the meantime, I'm just doing the full matrix, calculating i and j from there, and only doing my stuff when j >= i...but that still leaves a hefty amount of thread overhead.
If I restate your problem, to find i and j from q, you need the greatest i such that
q >= i*n - (i-1)*i/2
to define j as
j = i + (q - i*n - (i-1)*i/2)
If you have such a greatest i, then
(i+1)*n - i*(i+1)/2 > q >= i*n - (i-1)*i/2
n-i > (q - i*n - (i-1)*i/2) >= 0
n > j = i + (q - i*n - (i-1)*i/2) >= i
Here is a first iterative method to find i:
for (i = 0; q >= i*n - (i-1)*i/2; ++i);
i = i-1;
If you iterate over q, the computation of i is likely to exploit the iterative process.
A second method could use sqrt since
i*n - i²/2 + i/2 ~ q
i²/2 - i(n+1/2) + q ~ 0
delta = (n+0.5)² - 2q
i ~ (n+0.5) - sqrt(delta)
i could be defined as floor((n+0.5) - sqrt((n+0.5)² - 2q))
OK, this isn't an answer as far as I can tell. But, it is a workaround (for now).
I was thinking about it, and creating an array (or corresponding pointer), a[(n*n + n)/2 + n][2], and reading in the corresponding i and j values in my calling code, and passing this to my function would allow for the speed up.
You can make your loop to iterate as it is iterating on the whole matrix.
And just to keep track on your current line and each time you are entering a new line increment the index with the value of that line.
Then: i == line and j == i%n.
See this code:
int main() {
int n = 10;
int line = 0;
for (int i=0; i<n*n; i++){
if (i%n == 0 && i!=0){
line++;
i += line;
cout << endl;
}
cout << "("<<line<<","<<i%n<<")";
}
return 0;
}
Running Example
Trying to calculate the big o of the function by counting the steps. I think those are how to count each step by following how they did it in the examples, but not sure how to calculate the total.
int function (int n){
int count = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
for (int j = 0; j < n; j++) // 1 + 1 + n * (2 steps)
for (int k = 0; k < n; k++) // 1 + 1 + n * (2 steps)
for (int m = 0; m <= n; m++) // 1 + 1 + n * (2 steps)
count++; // 1 step
return count; // 1 step
}
I want to say this function is O(n^2), but I dont understand how that was calculated.
Examples I've been looking at
int func1 (int n){
int sum = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
sum += i; // 1 step
return sum; // 1 step
} //total steps: 4 + 3n
and
int func2 (int n){
int sum = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
for (int j = 0; j <= n; j++) // 1 + 1 + n * (2 steps)
sum ++; // 1 step
for (int k = 0; k <= n; k++) // 1 + 1 + n * (2 steps)
sum--; // 1 step
return sum; // 1 step
}
//total steps: 3n^2 + 7n + 6
What you've just proposed here are quite simple examples.
In my opinion you just need to understand how the complexity in a cycle works, in order to understand your examples.
In short (very briefly) a cycle has to be considered in asymptotic complexity as following:
loop (condition) :
// loop body
end loop
The condition of the loop should tell you how many times the loop will be executed compared to the size of the input.
The complexity of the body (you can consider the body as an sub-function and compute the complexity as well) has to be multiplied by the complexity of the loop.
The reason is quite intuitive: what you have in the body will be executed repetitively until the condition is verified, that is the number of times the loop (and so the body) will be executed.
Just some example:
// Array linear assignment
std::vector<int> array(SIZE_ARRAY);
for (int i = 0; i < SIZE_ARRAY; ++i) {
array[i] = i;
}
Let's analyse that simple loop:
First of all, we need to select the input relative to our complexity function will be computed. That case is pretty trivial: the variable is the size of the array. That's because we want to know how our program act respect the growing of the size of the input array.
The loop will be repeat SIZE_ARRAY times. So number of times the body will be executed is SIZE_ARRAY times (note: that values is variable, is not constant value).
Now consider the loop body. The instruction array[i] = i does not depend on how big is the array. It takes an unknown number of CPU cycles, but that number is always the same, that is constant.
Summarizing, we repeat SIZE_ARRAY times an instruction which takes a constant number of CPU clocks (let's say k is that value, is constant).
So, mathematically the number of CPU clocks will be executed for that simple program will be SIZE_ARRAY * k.
With the O Big notation we can describe the limiting behaviour. That is the behaviour a function will assume when the independent variable goes to infinity.
We can write:
O(SIZE_ARRAY * k) = O(SIZE_ARRAY)
That's because k is a constant value and by definition of Big O Notation the constant does not grown at infinity (is constant ever).
If we call SIZE_ARRAY as N (the size of the input) we can say that our function is a O(N) in time complexity.
The last ("more complicate") example:
for (int i = 0; i < SIZE_ARRAY; ++i) {
for (int j = 0; j < SIZE_ARRAY; ++j) {
array[j] += j;
}
}
As before our problem size is compared to the SIZE_ARRAY.
Shortly:
The first cycle will be execute SIZE_ARRAY times, that is O(SIZE_ARRAY).
The second cycle will be execute SIZE_ARRAY times.
The body of the second cycle is an instruction which will take a constant number of CPU cycle, let's say that number is k.
We take the number of time the first loop will be executed and we multiply it by its body complexity.
O(SIZE_ARRAY) * [first_loop_body_complexity].
But the body of the first loop is:
for (int j = 0; j < SIZE_ARRAY; ++j) {
array[j] += j;
}
Which is a single loop as the previous example, and we've just computed is complexity. It is an O(SIZE_ARRAY). So we can see that:
[first_loop_body_complexity] = O(SIZE_ARRAY)
Finally, our entire complexity is:
O(SIZE_ARRAY) * O(SIZE_ARRAY) = O(SIZE_ARRAY * SIZE_ARRAY)
That is
O(SIZE_ARRAY^2)
Using N instead of SIZE_ARRAY.
O(N^2)
Disclaimer: this is not a mathematical explication. It's a dumb down version which I think can help someone who is introduced to the world of complexities and is as clueless as I was when I first met this concept. Also I don't give you the answers. Just try to help you get there.
Moral of the story: don't count steps. Complexity is not about how many instructions (I will use this instead of "steps") are executed. That in itself is (almost) completely irelevant. In layman terms (time) complexity is about how does the execution time grow depending on how the input grows - that's how I finally understood complexity.
Let's take it step by step with some of the most encountered complexity:
constant complexity: O(1)
this represents an algorithm whose execution time does not depend on the input. The execution time doesn't grow when the input grows.
For instance:
auto foo_o1(int n) {
instr 1;
instr 2;
instr 3;
if (n > 20) {
instr 4;
instr 5;
instr 6;
}
instr 7;
instr 8;
};
The execution time of this function doesn't depend on the value of n. Notice how I can say that even if some instructions get executed or not depending on the value of n. Mathematically this is because O(constant) == O(1). Intuitively it's because the growth of the number of instructions it's not proportional with n. In the same ideea, it's irrelevant if the function has 10 instr or 1k instructions. It's still O(1) - constant complexity.
Linear complexity: O(n)
this represents an algorithm whose execution time is proportional with the input. When given a small input it takes a certain amount. When increasing the input the execution time grows proportionally:
auto foo1_on(int n)
{
for (i = 0; i < n; ++i)
instr;
}
This function is O(n). This means that when the input doubles, the execution time grows by a factor. This is true for any input. E.g when you double the input from 10 to 20 and when you double the input from 1000 to 2000 there is more or less the same factor in the growth of the execution time of the algorithm.
In line with the ideea of ignoring what doesn't contribute much comparatively with the "fastest" growth, all the next functions still have O(n) complexity. Mathematically O complexities are upper-bounded. This leads to O(c1*n + c0) = O(n)
auto foo2_on(int n)
{
for (i = 0; i < n / 2; ++i)
instr;
}
here: O(n / 2) = O(n)
auto foo3_on(int n)
{
for (i = 0; i < n; ++i)
instr 1;
for (i = 0; i < n; ++i)
instr 2;
}
here O(n) + O(n) = O(2*n) = O(n)
polynomial order 2 complexity: O(n^2)
This tells you that as you grow the input, the execution time grows by factor bigger and bigger. For instance the next is a valid behavior of an O(n^2) algorithm:
Read: When you double the input from .. to .. you could get an increase of execution time of .. times
from 100 to 200 : 1.5 times
from 200 to 400 : 1.8 times
from 400 to 800 : 2.2 times
from 800 to 1600 : 6 times
from 1600 to 3200 : 500 times
Try this!. Write an O(n^2) algorithm. And double the input. At first you will see small increases of computation time. At one time it just blows and you have to wait few minutes when at the previous steps it just took mere seconds.
This can be easily understand once you look over a n^2 graph.
auto foo_on2(int n)
{
for (i = 0; i < n; ++i)
for (j = 0; j < n; ++j)
instr;
}
How is this function O(n)? Simple: first loop executes n times. (I don't care if it's n times plus 3 or 4*n. Then, for each step of the first loop the second loop executes n times. There are n iterations of the i loop. For each i iteration there are n j iterations. So in total we have n * n = n^2 j iterations. Thus O(n^2)
There are other interesting complexities like logarithmic, exponential etc etc. Once you understand the concept behind the math, it gets very interesting. For instance a logarithmic complexity O(log(n)) has an execution time that grows slower and slower as the input grows. You can clearly see that when you look over a log graph.
There are a lot of resources on the net about complexities. Search. Read. Don't understand! Search again. Read. Take paper and pen. Understand!. Repeat.
To keep it simple:
O(N) means less than or equal to N. Therefore, and within a one code snippet, we ignore all and focus on the code that takes the highest number of steps (Highest power) to solve the problem / finish the execution.
Following your examples:
int function (int n){
int count = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
for (int j = 0; j < n; j++) // This looks interesting
for (int k = 0; k < n; k++) // This looks interesting
for (int m = 0; m <= n; m++) // This looks interesting
count++; // This is what we are looking for.
return count; // ignore
}
For that statement to finish we will need to "wait" or "cover" or "step" (n + 1) * n * n * (n + 1) => O(~N^4).
Second example:
int func1 (int n){
int sum = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
sum += i; // This is what we are looking for.
return sum; // ignore
}
For that to finish it needs n + 1 steps => O(~n).
Third example:
int func2 (int n){
int sum = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
for (int j = 0; j <= n; j++) // This looks interesting
sum ++; // This is what we are looking for.
for (int k = 0; k <= n; k++) // ignore
sum--; // ignore
return sum; // ignore
}
For that to finish we will need (n + 1) * (n + 1) steps => O(~N^2)
In these simple cases you can determine time complexity by finding the instruction that is executed most often and then find out how this number depends on n.
In example 1, count++ is executed n^4 times => O(n^4)
In example 2, sum += i; is executed n times => O(n)
In examples 3, sum ++; is executed n^2 times => O(n^2)
Well, actually that is not correct because some of your loops are executed n+1 times, but that doesn't matter at all. In example 1, the instruction is actually executed (n+1)^2*n^2 times which is the same as n^4 + 2 n^3 + n^2. For time complexity only the largest power counts.
1) for (i = 1; i < n; i++) { > n
2) SmallPos = i; > n-1
3) Smallest = Array[SmallPos]; > n-1
4) for (j = i+1; j <= n; j++) > n*(n+1 -i-1)??
5) if (Array[j] < Smallest) { > n*(n+1 -i-1 +1) ??
6) SmallPos = j; > n*(n+1 -i-1 +1) ??
7) Smallest = Array[SmallPos] > n*(n+1 -i-1 +1) ??
}
8) Array[SmallPos] = Array[i]; > n-1
9) Array[i] = Smallest; > n-1
}
i know the big O notation is n^2 ( my bad its not n^3)
i am not sure between line 4-7 anyone care to help out?
im not sure how to get the out put for the second loop since j = i +1 as i changes so does j
also for line 4 the ans suppose to be n(n+1)/2 -1 i want to know why as i can never get that
i am not really solving for the big O i am trying to do the steps that gets to big O as constant and variables are excuded in big O notations.
I would say this is O(n^2) (although as Fred points out above, O(n^2) is a subset of O(n^3), so it's not wrong to say that it's O(n^3)).
Note that it's generally not necessary to compute the number of executions of every single line; as Big-O notation discards low-order terms, it's sufficient to focus only on the most-executed section (which will typically be inside the innermost loop).
So in your case, none of the loops are affected by the values in Array, so we can safely ignore all that. The innermost loop runs (n-1) + (n-2) + (n-3) + ... times; this is an arithmetic series, and so has a term in n^2.
Is this an algorithm given to you, or one you wrote?
I think your loop indexes are wrong.
for (i = 1; i < n; i++) {
should be either
for (i = 0; i < n; i++) {
or
for (i = 1; i <= n; i++) {
depending on whether your array indexes start at 0 or 1 (it's 0 in C and Java).
Assuming we correct it to:
for (i = 0; i < n; i++) {
SmallPos = i;
Smallest = Array[SmallPos];
for (j = i+1; j < n; j++)
if (Array[j] < Smallest) {
SmallPos = j;
Smallest = Array[SmallPos];
}
Array[SmallPos] = Array[i];
Array[i] = Smallest;
}
Then I think the complexity is n2-3/2n = O(n2).
Here's how...
The most costly operation in the innermost loop (my lecturer called this the "basic operation") is key comparison at line 5. It is done once per loop.
So now, you create a summation:
Sum(i=0 to n-1) of Sum(j=i+1 to n-1) of 1.
Now expand the innermost (rightmost) Sum to get:
Sum(i=0 to n-1) of (n-1)-(i+1)+1
and then:
Sum(i=0 to n-1) of n-i-1
and then:
[Sum(i=0 to n-1) of n] - [Sum(i=0 to n-1) of i] - [Sum (i=0 to n-1) of 1]
and then:
n[Sum(i=0 to n-1) of 1] - [(n-1)(n)/2] - [(n-1)-0+1]
and then:
n[(n-1)-0+1] - [(n^2-n)/2] - [n]
and then:
n^2 - [(n^2/2) - n/2] - n
equals:
1/2n^2 - 1/2n
is in:
O(n^2)
If you're asking why it's not O(n3)...
Consider the worst case. if (Array[j] < Smallest) will be true the most times if Array is reverse sorted.
Then you have an inner loop that looks like this:
Array[j] < Smallest;
SmallPos = j;
Smallest = Array[SmallPos];
Now we've got a constant three operations for every inner for (j...) loop.
And O(3) = O(1).
So really, it's i and j that determine how much work we do. Nothing in the inner if loop changes anything.
You can think of it as you should only count while and for loops.
As to why for (j = i+1; j <= n; j++) is n(n+1)/2. It's called an arithmetic series.
You're doing n-1 passes of the for (j...) loop when i==0, n-2 passes when i==1, n-3, etc, until 0.
So the summation is
n-1 + n-2 + n-3 + ... 3 + 2 + 1
now, you sum pairs from outside in, re-writing it as:
n-1+1 + n-2+2 + n-3+3 + ...
equals:
n + n + n + ...
and there are n/2 of these pairs, so you have:
n*(n/2)
Two for() loops, the outer loop from 1 to n, the inner loop runs between 1..n, to n. This makes it O(n^2).
If you 'draw this out', it'll be triangular, rather than rectangular, so O(n^2), while true, is hiding the fact that the constant factor term is smaller than if the inner loop also iterated from 1 to n.
It is O(n^2).
For each of the n iterations of the outer loop you have n iterations in the inner loop.