I want to design an algorithm with O(n(log(2)n)^2) time complexity. I wrote this:
for(int i=1; i<=n; i++){
j=i;
while(j != 1)
j=j/2;
j=i;
while(j !=1)
j=j/2;
}
Does it have O(n(log(2)n)^2) time complexity? If not, where I am going wrong and how can I fix it so that its time complexity is O(n(log(2)n)^2)?
Slight digression:
As the guys said in the comments, the algorithm is indeed O(n log n). This is coincidentally identical to the result obtained by multiplying the complexity of the inner loop by the outer loop, i.e. O(log i) x O(n).
This may lead you to believe that we can simply add another iteration of the inner loop to obtain the (log n)2 part:
for (int i = 1; i < n; i++) {
int k = i;
while (k >= 1)
k /= 2;
int j = i;
while (j >= 1)
j /= 2;
}
}
But let's look at how the original complexity is derived:
(Using Sterling's approximation)
Therefore the proposed modification would give:
Which is not what we want.
An example I can think of from a recent personal project is semi-naive KD-tree construction. The pseudocode is given below:
def kd_recursive_cons (list_points):
if length(list_points) < predefined_threshold:
return new leaf(list_points)
A <- random axis (X, Y, Z)
sort list_points by their A-coordinate
mid <- find middle element in list_points
list_L, list_R <- split list_points at mid
node_L <- kd_recursive_cons(list_L)
node_R <- kd_recursive_cons(list_R)
return new node (node_L, node_R)
end
The time complexity function is therefore given by:
Where the n log n part is from sorting. We can obviously ignore the Dn linear part, and also the constant C. Thus:
Which is what we wanted.
Now to write a simpler piece of code with the same time complexity. We can make use of the summation we obtained in the above derivation...
And noting that the parameter passed to the log function is divided by two in every loop, we can thus write the code:
for (int i = 1; i < n; i++) {
for (int k = n; k >= 1; k /= 2) {
int j = k;
while (j >= 1)
j /= 2;
}
}
This looks like the "naive" but incorrect solution mentioned at the beginning, with the difference being that the nested loops there had different bounds (j did not depend on k, but k depended on i instead of directly on n).
EDIT: some numerical tests to confirm that the complexity is as intended:
Test function code:
int T(int n) {
int o = 0;
for (int i = 1; i < n; i++)
for (int j = n; j >= 1; j /= 2)
for (int k = j; k >= 1; k /= 2, o++);
return o;
}
Numerical results:
n T(n)
-------------------
2 3
4 18
8 70
16 225
32 651
64 1764
128 4572
256 11475
512 28105
1024 67518
2048 159666
4096 372645
8192 860055
16384 1965960
32768 4456312
65536 10026855
131072 22413141
262144 49807170
524288 110100270
Then I plotted sqrt(T(n) / n) against n. If the complexity is correct this should give a log(n) graph, or a straight line if plotted with a log-scale horizontal axis.
And this is indeed what we get:
Related
Consider this function:
void func()
{
int n;
std::cin >> n;
int var = 0;
for (int i = n; i > 0; i--)
for (int j = 1; j < n; j *= 2)
for (int k = 0; k < j; k++)
var++;
}
I think that the time complexity is O(n^2 * log n)
but when n is 2^m, I have a hard time thinking what complexity it is.
How can I analyze the complexity of this function?
n isn't a constant, it's dynamic, so that's the variable in your analysis. If it was constant, your complexity would be O(1) regardless of its value because all constants are discarded in complexity analysis.
Similarly, "n is 2^m" is sort of nonsensical because m isn't a variable in the code, so I'm not sure how to analyze that. Complexity analysis is done relative to the size of the input; you don't have to introduce any more variables.
Let's break down the loops, then multiply them together:
for (int i = n; i > 0; i--) // O(n)
for (int j = 1; j < n; j *= 2) // O(log(n))
for (int k = 0; k < j; k++) // O(n / log(n))
Total time complexity: O(n * log(n) * (n / log(n))) => O(n^2).
The first two loops are trivial (if the second one isn't obvious, it's logarithmic because of repeated multiplication by 2, the sequence is 1, 2, 4, 8, 16...).
The third loop is tougher to analyze because it runs on j, not n. We can simplify matters by disregarding the outermost loop completely, analyzing the inner loops, then multiplying whatever complexity we get for the two inner loops by the outermost loop's O(n).
The trick is to look at the shape of the enclosing loop; as the j loop approaches n, k is running from 0..n linearly, giving a baseline of O(n) for the k loop. This is scaled by a logarithmic factor j, O(log(n)). The logarithmic factors cancel and we're left with O(n) for the inner loops.
Here's some empirical evidence of the inner loop complexity:
import math
from matplotlib import pyplot
def f(N):
count = 0
j = 1
while j < N:
j *= 2
count += j
return count
def linear(N):
return N
def linearithmic(N):
return N * math.log2(N) if N else 0
def plot_fn(fn, n_start, n_end, n_step):
x = []
y = []
for N in range(n_start, n_end, n_step):
x.append(N)
y.append(fn(N))
pyplot.plot(x, y, "-", label=fn.__name__)
def main():
max_n = 10 ** 10
step = 10 ** 5
for fn in [f, linear, linearithmic]:
plot_fn(fn, 0, max_n, step)
pyplot.legend()
pyplot.show()
if __name__ == "__main__":
main()
The plot this produces is:
This shows that the innermost two loops (in blue) are linear, not linearithmic, confirming the overall quadratic complexity once the outermost linear loop is re-introduced.
Hexagonal grid is represented by a two-dimensional array with R rows and C columns. First row always comes "before" second in hexagonal grid construction (see image below). Let k be the number of turns. Each turn, an element of the grid is 1 if and only if the number of neighbours of that element that were 1 the turn before is an odd number. Write C++ code that outputs the grid after k turns.
Limitations:
1 <= R <= 10, 1 <= C <= 10, 1 <= k <= 2^(63) - 1
An example with input (in the first row are R, C and k, then comes the starting grid):
4 4 3
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0
Simulation: image, yellow elements represent '1' and blank represent '0'.
This problem is easy to solve if I simulate and produce a grid each turn, but with big enough k it becomes too slow. What is the faster solution?
EDIT: code (n and m are used instead R and C) :
#include <cstdio>
#include <cstring>
using namespace std;
int old[11][11];
int _new[11][11];
int n, m;
long long int k;
int main() {
scanf ("%d %d %lld", &n, &m, &k);
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) scanf ("%d", &old[i][j]);
}
printf ("\n");
while (k) {
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
int count = 0;
if (i % 2 == 0) {
if (i) {
if (j) count += old[i-1][j-1];
count += old[i-1][j];
}
if (j) count += (old[i][j-1]);
if (j < m-1) count += (old[i][j+1]);
if (i < n-1) {
if (j) count += old[i+1][j-1];
count += old[i+1][j];
}
}
else {
if (i) {
if (j < m-1) count += old[i-1][j+1];
count += old[i-1][j];
}
if (j) count += old[i][j-1];
if (j < m-1) count += old[i][j+1];
if (i < n-1) {
if (j < m-1) count += old[i+1][j+1];
count += old[i+1][j];
}
}
if (count % 2) _new[i][j] = 1;
else _new[i][j] = 0;
}
}
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) old[i][j] = _new[i][j];
}
k--;
}
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
printf ("%d", old[i][j]);
}
printf ("\n");
}
return 0;
}
For a given R and C, you have N=R*C cells.
If you represent those cells as a vector of elements in GF(2), i.e, 0s and 1s where arithmetic is performed mod 2 (addition is XOR and multiplication is AND), then the transformation from one turn to the next can be represented by an N*N matrix M, so that:
turn[i+1] = M*turn[i]
You can exponentiate the matrix to determine how the cells transform over k turns:
turn[i+k] = (M^k)*turn[i]
Even if k is very large, like 2^63-1, you can calculate M^k quickly using exponentiation by squaring: https://en.wikipedia.org/wiki/Exponentiation_by_squaring This only takes O(log(k)) matrix multiplications.
Then you can multiply your initial state by the matrix to get the output state.
From the limits on R, C, k, and time given in your question, it's clear that this is the solution you're supposed to come up with.
There are several ways to speed up your algorithm.
You do the neighbour-calculation with the out-of bounds checking in every turn. Do some preprocessing and calculate the neighbours of each cell once at the beginning. (Aziuth has already proposed that.)
Then you don't need to count the neighbours of all cells. Each cell is on if an odd number of neighbouring cells were on in the last turn and it is off otherwise.
You can think of this differently: Start with a clean board. For each active cell of the previous move, toggle the state of all surrounding cells. When an even number of neighbours cause a toggle, the cell is on, otherwise the toggles cancel each other out. Look at the first step of your example. It's like playing Lights Out, really.
This method is faster than counting the neighbours if the board has only few active cells and its worst case is a board whose cells are all on, in which case it is as good as neighbour-counting, because you have to touch each neighbours for each cell.
The next logical step is to represent the board as a sequence of bits, because bits already have a natural way of toggling, the exclusive or or xor oerator, ^. If you keep the list of neigbours for each cell as a bit mask m, you can then toggle the board b via b ^= m.
These are the improvements that can be made to the algorithm. The big improvement is to notice that the patterns will eventually repeat. (The toggling bears resemblance with Conway's Game of Life, where there are also repeating patterns.) Also, the given maximum number of possible iterations, 2⁶³ is suspiciously large.
The playing board is small. The example in your question will repeat at least after 2¹⁶ turns, because the 4×4 board can have at most 2¹⁶ layouts. In practice, turn 127 reaches the ring pattern of the first move after the original and it loops with a period of 126 from then.
The bigger boards may have up to 2¹⁰⁰ layouts, so they may not repeat within 2⁶³ turns. A 10×10 board with a single active cell near the middle has ar period of 2,162,622. This may indeed be a topic for a maths study, as Aziuth suggests, but we'll tacke it with profane means: Keep a hash map of all previous states and the turns where they occurred, then check whether the pattern has occurred before in each turn.
We now have:
a simple algorithm for toggling the cells' state and
a compact bitwise representation of the board, which allows us to create a hash map of the previous states.
Here's my attempt:
#include <iostream>
#include <map>
/*
* Bit representation of a playing board, at most 10 x 10
*/
struct Grid {
unsigned char data[16];
Grid() : data() {
}
void add(size_t i, size_t j) {
size_t k = 10 * i + j;
data[k / 8] |= 1u << (k % 8);
}
void flip(const Grid &mask) {
size_t n = 13;
while (n--) data[n] ^= mask.data[n];
}
bool ison(size_t i, size_t j) const {
size_t k = 10 * i + j;
return ((data[k / 8] & (1u << (k % 8))) != 0);
}
bool operator<(const Grid &other) const {
size_t n = 13;
while (n--) {
if (data[n] > other.data[n]) return true;
if (data[n] < other.data[n]) return false;
}
return false;
}
void dump(size_t n, size_t m) const {
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
std::cout << (ison(i, j) ? 1 : 0);
}
std::cout << '\n';
}
std::cout << '\n';
}
};
int main()
{
size_t n, m, k;
std::cin >> n >> m >> k;
Grid grid;
Grid mask[10][10];
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
int x;
std::cin >> x;
if (x) grid.add(i, j);
}
}
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
Grid &mm = mask[i][j];
if (i % 2 == 0) {
if (i) {
if (j) mm.add(i - 1, j - 1);
mm.add(i - 1, j);
}
if (j) mm.add(i, j - 1);
if (j < m - 1) mm.add(i, j + 1);
if (i < n - 1) {
if (j) mm.add(i + 1, j - 1);
mm.add(i + 1, j);
}
} else {
if (i) {
if (j < m - 1) mm.add(i - 1, j + 1);
mm.add(i - 1, j);
}
if (j) mm.add(i, j - 1);
if (j < m - 1) mm.add(i, j + 1);
if (i < n - 1) {
if (j < m - 1) mm.add(i + 1, j + 1);
mm.add(i + 1, j);
}
}
}
}
std::map<Grid, size_t> prev;
std::map<size_t, Grid> pattern;
for (size_t turn = 0; turn < k; turn++) {
Grid next;
std::map<Grid, size_t>::const_iterator it = prev.find(grid);
if (1 && it != prev.end()) {
size_t start = it->second;
size_t period = turn - start;
size_t index = (k - turn) % period;
grid = pattern[start + index];
break;
}
prev[grid] = turn;
pattern[turn] = grid;
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
if (grid.ison(i, j)) next.flip(mask[i][j]);
}
}
grid = next;
}
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
std::cout << (grid.ison(i, j) ? 1 : 0);
}
std::cout << '\n';
}
return 0;
}
There is probably room for improvement. Especially, I'm not so sure how it fares for big boards. (The code above uses an ordered map. We don't need the order, so using an unordered map will yield faster code. The example above with a single active cell on a 10×10 board took significantly longer than a second with an ordered map.)
Not sure about how you did it - and you should really always post code here - but let's try to optimize things here.
First of all, there is not really a difference between that and a quadratic grid. Different neighbor relationships, but I mean, that is just a small translation function. If you have a problem there, we should treat this separately, maybe on CodeReview.
Now, the naive solution is:
for all fields
count neighbors
if odd: add a marker to update to one, else to zero
for all fields
update all fields by marker of former step
this is obviously in O(N). Iterating twice is somewhat twice the actual run time, but should not be that bad. Try not to allocate space every time that you do that but reuse existing structures.
I'd propose this solution:
at the start:
create a std::vector or std::list "activated" of pointers to all fields that are activated
each iteration:
create a vector "new_activated"
for all items in activated
count neighbors, if odd add to new_activated
for all items in activated
set to inactive
replace activated by new_activated*
for all items in activated
set to active
*this can be done efficiently by putting them in a smart pointer and use move semantics
This code only works on the activated fields. As long as they stay within some smaller area, this is far more efficient. However, I have no idea when this changes - if there are activated fields all over the place, this might be less efficient. In that case, the naive solution might be the best one.
EDIT: after you now posted your code... your code is quite procedural. This is C++, use classes and use representation of things. Probably you do the search for neighbors right, but you can easily make mistakes there and therefore should isolate that part in a function, or better method. Raw arrays are bad and variables like n or k are bad. But before I start tearing your code apart, I instead repeat my recommendation, put the code on CodeReview, having people tear it apart until it is perfect.
This started off as a comment, but I think it could be helpful as an answer in addition to what has already been stated.
You stated the following limitations:
1 <= R <= 10, 1 <= C <= 10
Given these restrictions, I'll take the liberty to can represent the grid/matrix M of R rows and C columns in constant space (i.e. O(1)), and also check its elements in O(1) instead of O(R*C) time, thus removing this part from our time-complexity analysis.
That is, the grid can simply be declared as bool grid[10][10];.
The key input is the large number of turns k, stated to be in the range:
1 <= k <= 2^(63) - 1
The problem is that, AFAIK, you're required to perform k turns. This makes the algorithm be in O(k). Thus, no proposed solution can do better than O(k)[1].
To improve the speed in a meaningful way, this upper-bound must be lowered in some way[1], but it looks like this cannot be done without altering the problem constraints.
Thus, no proposed solution can do better than O(k)[1].
The fact that k can be so large is the main issue. The most anyone can do is improve the rest of the implementation, but this will only improve by a constant factor; you'll have to go through k turns regardless of how you look at it.
Therefore, unless some clever fact and/or detail is found that allows this bound to be lowered, there's no other choice.
[1] For example, it's not like trying to determine if some number n is prime, where you can check all numbers in the range(2, n) to see if they divide n, making it a O(n) process, or notice that some improvements include only looking at odd numbers after checking n is not even (constant factor; still O(n)), and then checking odd numbers only up to √n, i.e., in the range(3, √n, 2), which meaningfully lowers the upper-bound down to O(√n).
for (int i = 0; i < n; ++i ) { //n
for (int j = 0; j < i; ++j) { //n
cout<< i* j<<endl;
cout<< ("j = " + j);
}
for (int k = 0; k < n * 3; ++k) //n?
cout<<"k = " + k);
}
In this loop I see that the first for loop is O(n), the second loop is also O(n) but the 3rd for loop is confusing for me. K being less than something expanding would this also be O(n) for this loop? If so, what does two loops within another loop's time complexity come out to be in this context?
I am assuming O(n^2) due to the two n's in the middle not being multiplied in any way. Is this correct? Also if I'm correct and the second loop is O(n), what would the time complexity be if it was O(logn)?
(Not homework, simply for understanding purposes)
A good rule of thumb for big-O notation is the following:
When in doubt, work inside-out!
Here, let's start by analyzing the two inner loops and then work outward to get the overall time complexity. The two inner loops are shown here:
for (int j = 0; j < i; ++j) {
cout<< i* j<<endl;
cout<< (”j = ” + j);
}
for (int k = 0; k < n * 3; ++k)
cout<<”k = ” + k);
The first loop runs O(i) times and does O(1) work per iteration, so it does O(i) total work. That second loop runs O(n) times (it runs 3n times, and since big-O notation munches up constants, that's O(n) total times) and does O(1) work per iteration, so it does O(n) total work. This means that your overall loop can be rewritten as
for (int i = 0; i < n; ++i) {
do O(i) work;
do O(n) work;
}
If you do O(i) work and then do O(n) work, the total work done is O(i + n), so we can rewrite this even further as
for (int i = 0; i < n; ++i) {
do O(i + n) work;
}
If we look at the loop bounds here, we can see that i ranges from 0 up to n-1, so i is never greater than n. As a result, the O(i + n) term is equivalent to an O(n) term, since i + n = O(n). This makes our overall loop
for (int i = 0; i < n; ++i) {
do O(n) work;
}
From here, it should be a bit clearer that the overall runtime is O(n2), so we do O(n) iterations, each of which does O(n) total work.
You asked in a comment in another answer about what would happen if the second of the nested loops only ran O(log n) times instead of O(n) times. That's a great exercise, so let's see what happens if we try that out!
Imagine the code looked like this:
for (int i = 0; i < n; ++i) {
for (int j = 0; j < i; ++j) {
cout<< i* j<<endl;
cout<< ("j = " + j);
}
for (int k = 0; k < n; k *= 2)
cout<<"k = " + k);
}
Here, the second loop runs only O(log n) times because k grows geometrically. Let's again apply the idea of working from the inside out. The inside now consists of these two loops:
for (int j = 0; j < i; ++j) {
cout<< i* j<<endl;
cout<< ("j = " + j);
}
for (int k = 0; k < n; k *= 2)
cout<<"k = " + k);
Here, that first loop runs in time O(i) (as before) and the new loop runs in time O(log n), so the total work done per iteration is O(i + log n). If we rewrite our original loops using this, we get something like this:
for (int i = 0; i < n; ++i) {
do O(i + log n) work;
}
This one is a bit trickier to analyze, because i changes from one iteration of the loop to the next. In this case, it often helps to approach the analysis not by multiplying the work done per iteration by the number of iterations, but rather by just adding up the work done across the loop iterations. If we do this here, we'll see that the work done is proportional to
(0 + log n) + (1 + log n) + (2 + log n) + ... + (n-1 + log n).
If we regroup these terms, we get
(0 + 1 + 2 + ... + n - 1) + (log n + log n + ... + log n) (n times)
That simplifies to
(0 + 1 + 2 + ... + n - 1) + n log n
That first part of the summation is Gauss's famous sum 0 + 1 + 2 + ... + n - 1, which happens to be equal to n(n-1) / 2. (It's good to know this!) This means we can rewrite the total work done as
n(n - 1) / 2 + n log n
= O(n2) + O(n log n)
= O(n2)
with that last step following because O(n log n) is dominated by the O(n2) term.
Hopefully this shows you both where the result comes from and how to come up with it. Work from the inside out, working out how much work each loop does and replacing it with a simpler "do O(X) work" statement to make things easier to follow. When you have some amount of work that changes as a loop counter changes, sometimes it's easiest to approach the problem by bounding the value and showing that it never leaves some range, and other times it's easiest to solve the problem by explicitly working out how much work is done from one loop iteration to the next.
When you have multiple loops in sequence, the time complexity of all of them is the worst complexity of any of them. Since both of the inner loops are O(n), the worst is also O(n).
So since you have O(n) code inside an O(n) loop, the total complexity of everything is O(n2).
O n squared; calculate the area of a triangle.
We get 1+2+3+4+5+...+n, which is the nth triangular number. If you graph it, it is basically a triangle of height and width n.
A triangle with base n and height n has area 1/2 n^2. O doesn't care about constants like 1/2.
Trying to calculate the big o of the function by counting the steps. I think those are how to count each step by following how they did it in the examples, but not sure how to calculate the total.
int function (int n){
int count = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
for (int j = 0; j < n; j++) // 1 + 1 + n * (2 steps)
for (int k = 0; k < n; k++) // 1 + 1 + n * (2 steps)
for (int m = 0; m <= n; m++) // 1 + 1 + n * (2 steps)
count++; // 1 step
return count; // 1 step
}
I want to say this function is O(n^2), but I dont understand how that was calculated.
Examples I've been looking at
int func1 (int n){
int sum = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
sum += i; // 1 step
return sum; // 1 step
} //total steps: 4 + 3n
and
int func2 (int n){
int sum = 0; // 1 step
for (int i = 0; i <= n; i++) // 1 + 1 + n * (2 steps)
for (int j = 0; j <= n; j++) // 1 + 1 + n * (2 steps)
sum ++; // 1 step
for (int k = 0; k <= n; k++) // 1 + 1 + n * (2 steps)
sum--; // 1 step
return sum; // 1 step
}
//total steps: 3n^2 + 7n + 6
What you've just proposed here are quite simple examples.
In my opinion you just need to understand how the complexity in a cycle works, in order to understand your examples.
In short (very briefly) a cycle has to be considered in asymptotic complexity as following:
loop (condition) :
// loop body
end loop
The condition of the loop should tell you how many times the loop will be executed compared to the size of the input.
The complexity of the body (you can consider the body as an sub-function and compute the complexity as well) has to be multiplied by the complexity of the loop.
The reason is quite intuitive: what you have in the body will be executed repetitively until the condition is verified, that is the number of times the loop (and so the body) will be executed.
Just some example:
// Array linear assignment
std::vector<int> array(SIZE_ARRAY);
for (int i = 0; i < SIZE_ARRAY; ++i) {
array[i] = i;
}
Let's analyse that simple loop:
First of all, we need to select the input relative to our complexity function will be computed. That case is pretty trivial: the variable is the size of the array. That's because we want to know how our program act respect the growing of the size of the input array.
The loop will be repeat SIZE_ARRAY times. So number of times the body will be executed is SIZE_ARRAY times (note: that values is variable, is not constant value).
Now consider the loop body. The instruction array[i] = i does not depend on how big is the array. It takes an unknown number of CPU cycles, but that number is always the same, that is constant.
Summarizing, we repeat SIZE_ARRAY times an instruction which takes a constant number of CPU clocks (let's say k is that value, is constant).
So, mathematically the number of CPU clocks will be executed for that simple program will be SIZE_ARRAY * k.
With the O Big notation we can describe the limiting behaviour. That is the behaviour a function will assume when the independent variable goes to infinity.
We can write:
O(SIZE_ARRAY * k) = O(SIZE_ARRAY)
That's because k is a constant value and by definition of Big O Notation the constant does not grown at infinity (is constant ever).
If we call SIZE_ARRAY as N (the size of the input) we can say that our function is a O(N) in time complexity.
The last ("more complicate") example:
for (int i = 0; i < SIZE_ARRAY; ++i) {
for (int j = 0; j < SIZE_ARRAY; ++j) {
array[j] += j;
}
}
As before our problem size is compared to the SIZE_ARRAY.
Shortly:
The first cycle will be execute SIZE_ARRAY times, that is O(SIZE_ARRAY).
The second cycle will be execute SIZE_ARRAY times.
The body of the second cycle is an instruction which will take a constant number of CPU cycle, let's say that number is k.
We take the number of time the first loop will be executed and we multiply it by its body complexity.
O(SIZE_ARRAY) * [first_loop_body_complexity].
But the body of the first loop is:
for (int j = 0; j < SIZE_ARRAY; ++j) {
array[j] += j;
}
Which is a single loop as the previous example, and we've just computed is complexity. It is an O(SIZE_ARRAY). So we can see that:
[first_loop_body_complexity] = O(SIZE_ARRAY)
Finally, our entire complexity is:
O(SIZE_ARRAY) * O(SIZE_ARRAY) = O(SIZE_ARRAY * SIZE_ARRAY)
That is
O(SIZE_ARRAY^2)
Using N instead of SIZE_ARRAY.
O(N^2)
Disclaimer: this is not a mathematical explication. It's a dumb down version which I think can help someone who is introduced to the world of complexities and is as clueless as I was when I first met this concept. Also I don't give you the answers. Just try to help you get there.
Moral of the story: don't count steps. Complexity is not about how many instructions (I will use this instead of "steps") are executed. That in itself is (almost) completely irelevant. In layman terms (time) complexity is about how does the execution time grow depending on how the input grows - that's how I finally understood complexity.
Let's take it step by step with some of the most encountered complexity:
constant complexity: O(1)
this represents an algorithm whose execution time does not depend on the input. The execution time doesn't grow when the input grows.
For instance:
auto foo_o1(int n) {
instr 1;
instr 2;
instr 3;
if (n > 20) {
instr 4;
instr 5;
instr 6;
}
instr 7;
instr 8;
};
The execution time of this function doesn't depend on the value of n. Notice how I can say that even if some instructions get executed or not depending on the value of n. Mathematically this is because O(constant) == O(1). Intuitively it's because the growth of the number of instructions it's not proportional with n. In the same ideea, it's irrelevant if the function has 10 instr or 1k instructions. It's still O(1) - constant complexity.
Linear complexity: O(n)
this represents an algorithm whose execution time is proportional with the input. When given a small input it takes a certain amount. When increasing the input the execution time grows proportionally:
auto foo1_on(int n)
{
for (i = 0; i < n; ++i)
instr;
}
This function is O(n). This means that when the input doubles, the execution time grows by a factor. This is true for any input. E.g when you double the input from 10 to 20 and when you double the input from 1000 to 2000 there is more or less the same factor in the growth of the execution time of the algorithm.
In line with the ideea of ignoring what doesn't contribute much comparatively with the "fastest" growth, all the next functions still have O(n) complexity. Mathematically O complexities are upper-bounded. This leads to O(c1*n + c0) = O(n)
auto foo2_on(int n)
{
for (i = 0; i < n / 2; ++i)
instr;
}
here: O(n / 2) = O(n)
auto foo3_on(int n)
{
for (i = 0; i < n; ++i)
instr 1;
for (i = 0; i < n; ++i)
instr 2;
}
here O(n) + O(n) = O(2*n) = O(n)
polynomial order 2 complexity: O(n^2)
This tells you that as you grow the input, the execution time grows by factor bigger and bigger. For instance the next is a valid behavior of an O(n^2) algorithm:
Read: When you double the input from .. to .. you could get an increase of execution time of .. times
from 100 to 200 : 1.5 times
from 200 to 400 : 1.8 times
from 400 to 800 : 2.2 times
from 800 to 1600 : 6 times
from 1600 to 3200 : 500 times
Try this!. Write an O(n^2) algorithm. And double the input. At first you will see small increases of computation time. At one time it just blows and you have to wait few minutes when at the previous steps it just took mere seconds.
This can be easily understand once you look over a n^2 graph.
auto foo_on2(int n)
{
for (i = 0; i < n; ++i)
for (j = 0; j < n; ++j)
instr;
}
How is this function O(n)? Simple: first loop executes n times. (I don't care if it's n times plus 3 or 4*n. Then, for each step of the first loop the second loop executes n times. There are n iterations of the i loop. For each i iteration there are n j iterations. So in total we have n * n = n^2 j iterations. Thus O(n^2)
There are other interesting complexities like logarithmic, exponential etc etc. Once you understand the concept behind the math, it gets very interesting. For instance a logarithmic complexity O(log(n)) has an execution time that grows slower and slower as the input grows. You can clearly see that when you look over a log graph.
There are a lot of resources on the net about complexities. Search. Read. Don't understand! Search again. Read. Take paper and pen. Understand!. Repeat.
To keep it simple:
O(N) means less than or equal to N. Therefore, and within a one code snippet, we ignore all and focus on the code that takes the highest number of steps (Highest power) to solve the problem / finish the execution.
Following your examples:
int function (int n){
int count = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
for (int j = 0; j < n; j++) // This looks interesting
for (int k = 0; k < n; k++) // This looks interesting
for (int m = 0; m <= n; m++) // This looks interesting
count++; // This is what we are looking for.
return count; // ignore
}
For that statement to finish we will need to "wait" or "cover" or "step" (n + 1) * n * n * (n + 1) => O(~N^4).
Second example:
int func1 (int n){
int sum = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
sum += i; // This is what we are looking for.
return sum; // ignore
}
For that to finish it needs n + 1 steps => O(~n).
Third example:
int func2 (int n){
int sum = 0; // ignore
for (int i = 0; i <= n; i++) // This looks interesting
for (int j = 0; j <= n; j++) // This looks interesting
sum ++; // This is what we are looking for.
for (int k = 0; k <= n; k++) // ignore
sum--; // ignore
return sum; // ignore
}
For that to finish we will need (n + 1) * (n + 1) steps => O(~N^2)
In these simple cases you can determine time complexity by finding the instruction that is executed most often and then find out how this number depends on n.
In example 1, count++ is executed n^4 times => O(n^4)
In example 2, sum += i; is executed n times => O(n)
In examples 3, sum ++; is executed n^2 times => O(n^2)
Well, actually that is not correct because some of your loops are executed n+1 times, but that doesn't matter at all. In example 1, the instruction is actually executed (n+1)^2*n^2 times which is the same as n^4 + 2 n^3 + n^2. For time complexity only the largest power counts.
1) for (i = 1; i < n; i++) { > n
2) SmallPos = i; > n-1
3) Smallest = Array[SmallPos]; > n-1
4) for (j = i+1; j <= n; j++) > n*(n+1 -i-1)??
5) if (Array[j] < Smallest) { > n*(n+1 -i-1 +1) ??
6) SmallPos = j; > n*(n+1 -i-1 +1) ??
7) Smallest = Array[SmallPos] > n*(n+1 -i-1 +1) ??
}
8) Array[SmallPos] = Array[i]; > n-1
9) Array[i] = Smallest; > n-1
}
i know the big O notation is n^2 ( my bad its not n^3)
i am not sure between line 4-7 anyone care to help out?
im not sure how to get the out put for the second loop since j = i +1 as i changes so does j
also for line 4 the ans suppose to be n(n+1)/2 -1 i want to know why as i can never get that
i am not really solving for the big O i am trying to do the steps that gets to big O as constant and variables are excuded in big O notations.
I would say this is O(n^2) (although as Fred points out above, O(n^2) is a subset of O(n^3), so it's not wrong to say that it's O(n^3)).
Note that it's generally not necessary to compute the number of executions of every single line; as Big-O notation discards low-order terms, it's sufficient to focus only on the most-executed section (which will typically be inside the innermost loop).
So in your case, none of the loops are affected by the values in Array, so we can safely ignore all that. The innermost loop runs (n-1) + (n-2) + (n-3) + ... times; this is an arithmetic series, and so has a term in n^2.
Is this an algorithm given to you, or one you wrote?
I think your loop indexes are wrong.
for (i = 1; i < n; i++) {
should be either
for (i = 0; i < n; i++) {
or
for (i = 1; i <= n; i++) {
depending on whether your array indexes start at 0 or 1 (it's 0 in C and Java).
Assuming we correct it to:
for (i = 0; i < n; i++) {
SmallPos = i;
Smallest = Array[SmallPos];
for (j = i+1; j < n; j++)
if (Array[j] < Smallest) {
SmallPos = j;
Smallest = Array[SmallPos];
}
Array[SmallPos] = Array[i];
Array[i] = Smallest;
}
Then I think the complexity is n2-3/2n = O(n2).
Here's how...
The most costly operation in the innermost loop (my lecturer called this the "basic operation") is key comparison at line 5. It is done once per loop.
So now, you create a summation:
Sum(i=0 to n-1) of Sum(j=i+1 to n-1) of 1.
Now expand the innermost (rightmost) Sum to get:
Sum(i=0 to n-1) of (n-1)-(i+1)+1
and then:
Sum(i=0 to n-1) of n-i-1
and then:
[Sum(i=0 to n-1) of n] - [Sum(i=0 to n-1) of i] - [Sum (i=0 to n-1) of 1]
and then:
n[Sum(i=0 to n-1) of 1] - [(n-1)(n)/2] - [(n-1)-0+1]
and then:
n[(n-1)-0+1] - [(n^2-n)/2] - [n]
and then:
n^2 - [(n^2/2) - n/2] - n
equals:
1/2n^2 - 1/2n
is in:
O(n^2)
If you're asking why it's not O(n3)...
Consider the worst case. if (Array[j] < Smallest) will be true the most times if Array is reverse sorted.
Then you have an inner loop that looks like this:
Array[j] < Smallest;
SmallPos = j;
Smallest = Array[SmallPos];
Now we've got a constant three operations for every inner for (j...) loop.
And O(3) = O(1).
So really, it's i and j that determine how much work we do. Nothing in the inner if loop changes anything.
You can think of it as you should only count while and for loops.
As to why for (j = i+1; j <= n; j++) is n(n+1)/2. It's called an arithmetic series.
You're doing n-1 passes of the for (j...) loop when i==0, n-2 passes when i==1, n-3, etc, until 0.
So the summation is
n-1 + n-2 + n-3 + ... 3 + 2 + 1
now, you sum pairs from outside in, re-writing it as:
n-1+1 + n-2+2 + n-3+3 + ...
equals:
n + n + n + ...
and there are n/2 of these pairs, so you have:
n*(n/2)
Two for() loops, the outer loop from 1 to n, the inner loop runs between 1..n, to n. This makes it O(n^2).
If you 'draw this out', it'll be triangular, rather than rectangular, so O(n^2), while true, is hiding the fact that the constant factor term is smaller than if the inner loop also iterated from 1 to n.
It is O(n^2).
For each of the n iterations of the outer loop you have n iterations in the inner loop.