What's the time complexity of for (int i = 2; i < n; i = i*i)?

What would be the time complexity of the following loop?
for (int i = 2; i < n; i = i * i) {
    ++a;
}
While practicing runtime complexities, I came across this code and can't find the answer. I thought this would be sqrt(n), though it doesn't seem correct, since the loop has the sequence of 2, 4, 16, 256, ....

To understand the answer you must understand that the inverse of an exponential is not SQRT; the logarithm is.
This loop multiplies i by itself (i.e. squares it, so the exponent doubles on every pass) and stops only when i >= n. After k iterations i = 2^(2^k), so the loop runs Θ(log(log(n))) times (logs to base 2, to be precise, because i = 2 at initialization).
To illustrate this: [table comparing sqrt and log2 step counts] SQRT gives the correct number of steps only when i is an even power of 2, whereas the iterated log2 gives the accurate number of steps every time.

Each time through the loop, i is squared. Hence, if A(k) denotes the value of i at step k (with A(1) = 2), it can be written in recursive form like the following:
A(k) = A(k-1)^2
Now, you can expand it to find a pattern:
A(k) = A(k-2)^4 = A(k-3)^8 = ... = A(1)^(2^(k-1)) = 2^(2^(k-1))
Hence, the loop iterates k steps, where k satisfies n = 2^(2^(k-1)). Therefore, this loop iterates Theta(log(log(n))) times.
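A quick empirical check of that count (a sketch; the helper count_iterations and the sample values of n are my own):

#include <cmath>
#include <cstdio>

// Count the actual iterations of the loop and compare with log2(log2(n)).
int count_iterations(long long n) {
    int steps = 0;
    for (long long i = 2; i < n; i = i * i)
        ++steps;
    return steps;
}

int main() {
    const long long ns[] = {16, 256, 65536, 4294967296LL};
    for (long long n : ns)
        std::printf("n = %lld: %d iterations, log2(log2(n)) = %.1f\n",
                    n, count_iterations(n), std::log2(std::log2((double)n)));
    return 0;
}

For n = 2^32 this counts 5 iterations and log2(log2(n)) = 5.0, as the Theta(log(log(n))) analysis predicts.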

What is the time complexity of this for loop (be related to `n`)?

What is the time complexity of this for loop (be related to n)?
for(int i = 1, j; i <= n; i = j + 1)
{
    j = n / (n / i);
}
Please note that i, j and n are integer variables and they follow integer arithmetic. In particular, the expression n / (n / i) inside the loop uses truncating integer division, i.e. it should be read as ⌊n / ⌊n / i⌋⌋.
If we use j = i; instead of j = n / (n / i);, the time complexity is O(n).
Now it's j = n / (n / i);. Suppose that n = i*k + r, where k and r are integers and r = n % i (so k = n / i). Thus j = (i*k+r)/((i*k+r)/i) = (i*k+r)/k = i + r/k >= i, which means i increments at least as fast as in the case where you use j = i;. So the time complexity is at most O(n).
Besides the big-O notation, there are two other notations (Θ and Ω), which give a tight bound and a lower bound respectively; you can pin the complexity down by finding matching lower and upper bounds. And there's another rule, if I remember correctly: O(k*n) = O(n); a constant coefficient k doesn't matter, no matter how big it is.
As elaborated by taotsi, the increment for i in each iteration is
inc = 1 + r/k
where r = n%i and k = n/i. Since r < i, the increment is 1 as long as i < sqrt(n) (because then r/k < i*i/n < 1, which becomes 0 in integer division). Thereafter, the increment is (typically) 2 as long as i < sqrt(2*n). This continues similarly to a geometric series, giving a factor of 2 over sqrt(n), i.e. about 2*sqrt(n) iterations.
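To see these increments in action, here is a minimal trace for n = 100 (a sketch of mine; the printed format is arbitrary):

#include <cstdio>

// Trace successive values of i for n = 100: the increment stays 1
// below sqrt(n) = 10 and then grows roughly geometrically.
int main() {
    const int n = 100;
    for (int i = 1, j; i <= n; i = j + 1) {
        j = n / (n / i);
        std::printf("i = %3d -> next i = %3d (increment %d)\n",
                    i, j + 1, j + 1 - i);
    }
    return 0;
}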
If we write n = a*a+b with integers 0 <= b <= 2*a (i.e. a=int(sqrt(n)) and b=n-a*a), then the total number of iterations in simple experiments is always
b < a? 2*a-1 : 2*a
Thus, the complexity is O(√n) (provided some useful work is done inside the loop, for example counting the number of total iterations, such that the compiler is not allowed to elide the whole loop).
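This count is easy to check empirically (a sketch; total_iterations is my own helper name):

#include <cmath>
#include <cstdio>

// Count the loop's iterations and compare with the predicted
// b < a ? 2*a-1 : 2*a, where n = a*a + b.
long long total_iterations(long long n) {
    long long iterations = 0;
    for (long long i = 1, j; i <= n; i = j + 1) {
        j = n / (n / i);
        ++iterations;
    }
    return iterations;
}

int main() {
    const long long ns[] = {10, 100, 12345, 1000000};
    for (long long n : ns) {
        long long a = (long long)std::sqrt((double)n);
        long long b = n - a * a;
        long long predicted = (b < a) ? 2 * a - 1 : 2 * a;
        std::printf("n = %lld: counted %lld, predicted %lld\n",
                    n, total_iterations(n), predicted);
    }
    return 0;
}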
As @Walter has already offered a proof, I am too late for that part, but here is a Python 3 version of your code and a plot of the number of iterations as a function of n vs. the 2*sqrt(n) function. They look approximately the same (up to n = 1e9).
import matplotlib.pyplot as plt
from numba import jit
import math

@jit
def weird_increment_loop(n):
    i = 1
    j = 0
    iterations = 0
    while i <= n:
        j = n // (n // i)
        i = j + 1
        iterations = iterations + 1
    return iterations

iterations = []
func_2sqrt = []
domain = range(0, 1000000001, 1000000)
for n in domain:
    iterations.append(weird_increment_loop(n))
    func_2sqrt.append(math.sqrt(n) * 2)

plt.plot(domain, iterations)
plt.plot(domain, func_2sqrt)
plt.xlabel("n")
plt.ylabel("iterations(n) and 2*sqrt(n)")
plt.show()
[Plot: iterations(n) and 2*sqrt(n) over 0 <= n <= 1e9; the two curves overlap.]
If you see no difference, it is because there is close to none :D Of course, one should always trust Mathematics ;)
Strictly by the rules of C++, it's O(1). Either the loop terminates after some finite amount of doing no observable work, or it loops forever (which is undefined behaviour). A conforming implementation may assume that undefined behaviour is not encountered, so we may assume it terminates.
Because the observable effects of the program do not depend on what happens inside the loop, an implementation is allowed to "as-if" the whole loop into nothingness.

Predict an algorithm's theoretical average-case efficiency and order of growth using summation

I need to predict the algorithm's average case efficiency with respect to the size of its inputs using summation/sigma notation to arrive at the final answer. Many resources use summation to predict worst-case, and I couldn't find someone explaining how to predict average case so step-by-step answers are appreciated.
The algorithm contains a nested for loop, with the basic operation inside the innermost loop:
[code redacted]
EDIT: The basic operation always executes inside the second for loop whenever that loop is entered, and there are no break or return statements inside it. HOWEVER: the end of the first for loop has a return statement that depends on the value produced by the basic operation, so the contents of the array do affect how many times in total the basic operation executes per run of the algorithm.
The array passed to the algorithm has randomly generated contents
I think the predicted average case efficiency is (n^2)/2, making it n^2 order of growth/big Theta of n^2, but I don't know how to theoretically prove this using summation.
Answers are very appreciated!
TL;DR: Your code complexity in average case is Θ(n²) if "basic operation" complexity is Θ(1) and it has no return, break or goto operators.
Explanation: the average-case complexity is just an expectation of the number of operations in your code given the size of the input.
Let's say T(A, n) is a number of operations your code performs given array A of size n. It's easy to see that
T(A, n) = 1 +               // int k = ceil(size/2.0);
          n * 2 + 1 +       // for (int i = 0; i < size; i++){
          n * (n * 2 + 1) + // for (int j = 0; j < size; j++){
          n * n * X +       // basic operation
          1                 // return (some int);
where X is the number of operations in your "basic operation". As we can see, T(A, n) does not depend on the actual contents of the array A. Thus, the expected number of operations given the size of the array (which is simply the arithmetic mean of T(A, n) over all possible A for a given n) is exactly equal to each of them:
T(n) = T(A, n) = 3 + n * 3 + n * n * (2 + X)
If we assume that X = Θ(1), this expression is Θ(n²).
Even without this assumption we can get an estimate: if X = Θ(f(n)), then your code's complexity is T(n) = Θ(f(n)n²). For example, if X is Θ(log n), then T(n) = Θ(n² log n).
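Since the original code was redacted, here is a hypothetical stand-in (my assumption: a plain doubly nested loop with a Θ(1) basic operation and no early return) that makes the counting argument concrete:

#include <cstdio>

// Hypothetical reconstruction: the "basic operation" runs once per
// (i, j) pair, independent of the array contents, so counting its
// executions shows the Theta(n^2) growth directly.
long long count_basic_ops(int n) {
    long long ops = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            ++ops; // stands in for the basic operation
    return ops;
}

int main() {
    const int ns[] = {10, 100, 1000};
    for (int n : ns)
        std::printf("n = %d: basic operation executed %lld times (n*n = %lld)\n",
                    n, count_basic_ops(n), (long long)n * n);
    return 0;
}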

How to sum sequence?

How can I sum the following sequence:
⌊n/1⌋ + ⌊n/2⌋ + ⌊n/3⌋ + ... + ⌊n/n⌋
Here is a simple O(n) solution in C++:
#include <iostream>

int main()
{
    int n;
    std::cin >> n;
    unsigned long long res = 0;
    for (int i = 1; i <= n; i++)
    {
        res += n / i;
    }
    std::cout << res << std::endl;
    return 0;
}
Do you know any better solution than this? I mean O(1) or O(log(n)). Thank you for your time and solutions :)
Edit:
Thank you for all your answers. If someone wants the O(sqrt(n)) solution:
Python:
import math

def seq_sum(n):
    sqrtn = int(math.sqrt(n))
    return sum(n // k for k in range(1, sqrtn + 1)) * 2 - sqrtn ** 2

n = int(input())
print(seq_sum(n))
C++:
#include <iostream>
#include <cmath>

int main()
{
    int n;
    std::cin >> n;
    int sqrtn = (int)(std::sqrt(n));
    long long res2 = 0;
    for (int i = 1; i <= sqrtn; i++)
    {
        res2 += 2 * (n / i);
    }
    res2 -= sqrtn * sqrtn;
    std::cout << res2 << std::endl;
    return 0;
}
This is Dirichlet's divisor summatory function D(x). Using the following formula (source):
D(x) = 2*Σ(k=1->u)(⌊x/k⌋) - u², where u = ⌊√x⌋,
gives the following O(sqrt(n)) pseudo-code (that happens to be valid Python):
import math

def seq_sum(n):
    sqrtn = int(math.sqrt(n))
    return sum(n // k for k in range(1, sqrtn + 1)) * 2 - sqrtn ** 2
Notes:
The // operator in Python is integer (that is, truncating) division.
math.sqrt() is used as an illustration. Strictly speaking, this should use an exact integer square root algorithm instead of floating-point maths.
Taken from the Wikipedia article on the Divisor summatory function:
D(x) = 2*Σ(k=1->u)(⌊x/k⌋) - u², where u = ⌊√x⌋. That should provide an O(√n) time solution.
EDIT: the integer square root problem can also be solved in square root or even logarithmic time too - just in case that isn't obvious.
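For instance, an exact integer square root in logarithmic time can be done with binary search (a sketch; isqrt is my own helper, not a standard function):

#include <cstdio>

// Exact integer square root: the largest r with r*r <= n, found by
// binary search in O(log n), with no floating-point rounding issues.
unsigned long long isqrt(unsigned long long n) {
    unsigned long long lo = 0, hi = 4294967295ULL; // 2^32-1 bounds sqrt of any 64-bit n
    while (lo < hi) {
        unsigned long long mid = lo + (hi - lo + 1) / 2;
        if (mid <= n / mid) // same as mid*mid <= n, but cannot overflow
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

int main() {
    std::printf("%llu\n", isqrt(999999999999999999ULL)); // prints 999999999
    return 0;
}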
The Polymath project sketches an algorithm for computing this function in time O(n^(1/3 + o(1))), see section 2.1 on pages 8-9 of:
http://arxiv.org/abs/1009.3956
The algorithm involves slicing the region into sufficiently thin intervals and estimating the value on each, where the intervals are chosen to be thin enough that the estimate will be exact when rounded to the nearest integer. So you compute up to some range directly (they suggest 100n^(1/3) but you could modify this with some care) and then do the rest in these thin slices.
See the OEIS entry for more information on this sequence.
Edit: I now see that Kerrek SB mentions this algorithm in the comments. In fairness, though, I added the comment to the OEIS 5 years ago so I don't feel bad for posting 'his' answer. :)
I should also mention that no O(1) algorithm is possible, since the answer is around n log n and hence even writing it out takes time > log n.
Let's divide all numbers {1, 2, 3, ..., n} into 2 groups: those less than or equal to sqrt(n) and those greater than sqrt(n). For the first group, we can compute the sum by simple iteration. For the second group, we can use the following observation: if a > sqrt(n), then n / a < sqrt(n). That's why we can iterate over the value d = [n / i] (from 1 to sqrt(n)) and compute the number of i such that [n / i] = d. This count can be found in O(1) for a fixed d using the fact that [n / i] = d means i * d <= n and i * (d + 1) > n, which gives [n / (d + 1)] < i <= [n / d].
The first and the second groups are processed in O(sqrt(n)), which gives O(sqrt(n)) time in total.
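Here is a C++ sketch of this idea (variable names are mine; it walks the quotient blocks directly instead of splitting into two explicit groups, which is equivalent):

#include <iostream>

// Sum floor(n/1) + ... + floor(n/n) in O(sqrt(n)) by grouping all i
// that share the same quotient d = n / i into one block.
long long divisor_sum(long long n) {
    long long res = 0;
    for (long long i = 1; i <= n; ) {
        long long d = n / i;        // common quotient on this block
        long long last = n / d;     // largest i with n / i == d
        res += (last - i + 1) * d;  // the block has (last - i + 1) terms, each d
        i = last + 1;               // jump to the next block
    }
    return res;
}

int main() {
    long long n;
    std::cin >> n;
    std::cout << divisor_sum(n) << std::endl; // same value as the O(n) loop
    return 0;
}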
For large n, use the asymptotic formula:
D(n) ≈ n*ln(n) + (2γ - 1)*n
where γ ≈ 0.57722 is the Euler–Mascheroni constant.
See the Euler-Mascheroni constant article for more information.
You can notice that there are only O(n^(1/2)) unique values in the set S = {⌊n/1⌋, ⌊n/2⌋, ..., ⌊n/(n-1)⌋, ⌊n/n⌋}. Therefore you can calculate the function in O(n^(1/2)).
Also, since the quotients are symmetric around n^(1/2), you can calculate it 2x faster by using this formula: D(n) = 2*Σ(x=1->u)(⌊n/x⌋) - u^2 for u = ⌊n^(1/2)⌋.
Even more complex but faster: using the method that Richard Sladkey described in this paper, you can calculate the function in O(n^(1/3)).

Big O Notation for Algorithm

I'm busy doing an assignment and I'm struggling with a question. I know I'm not supposed to ask assignment questions outright so I understand if I don't get straight answers. But here goes anyway.
We must calculate the run time complexity of different algorithms, the one I'm stuck on is this.
for(int i = 1 ; i < n ; i++)
    for(int j = 0 ; j < i ; j +=2)
        sum++;
Now with my understanding, my first thought would be less than O(n^2), because the nested loop isn't running the full n times, and the j variable increments by 2 each pass rather than by 1 like a normal for loop. However, when I ran some code simulations with N=10, N=100, N=1000, etc., I got the following results when I outputted the sum variable:
N = 10 : 25,
N = 100 : 2500,
N = 1000 : 250000,
N = 10000 : 25000000
When I look at these results, the O notation seems like it should be much larger than just O(n).
The 4 options we have been given in the assignment are: O(1), O(n^2), O(n) and O(log n). As I said earlier, I cannot see how it can be as large as O(n^2), but the results point to that. So I just think I don't fully understand this, or I'm missing some link.
Any help would be appreciated!
Big O notation does not give you the number of operations; it just tells you how fast the count grows with growing input. And this is what you observe: when you increase the input c times, the total number of operations grows c^2 times. If you calculated the (nearly) exact number of operations, you would get n^2/4.
Of course you can calculate it with sums, but since I dunno how to use math on SO I will give an "empirical" explanation. A simple loop-within-a-loop with the same start and end conditions gives n^2: such a loop produces a matrix of all possible combinations for "i" and "j", so if the start is 1 and the end is N in both cases, you get N*N combinations (or iterations, effectively).
However, your inner loop only runs while j < i. This basically cuts a triangle out of this square, which is the first 0.5 factor; and then you skip every other element, which is another 0.5 factor; multiplied, you get 1/4.
And O(0.25 * n^2) = O(n^2). Sometimes people like to leave the factor in there, because it lets you compare two algorithms with the same complexity, but it does not change the rate of growth with respect to n.
Bear in mind that big-O is asymptotic notation. Constants (additive or multiplicative) have zero impact on it.
So, the outer loop runs n times, and on the ith time, the inner loop runs i / 2 times. If it weren't for the / 2 part, it would be the sum of all numbers 1 .. n, which is the well known n * (n + 1) / 2. That expands to a * n^2 + b * n + c for a non-zero a, so it's O(n^2).
Instead of summing n numbers, we're summing n / 2 numbers. But that's still somewhere around (n/2) * ((n/2) + 1) / 2. Which still expands to d * n^2 + e * n + f for a non-zero d, so it's still O(n^2).
From your output it seems like:
sum ~= (n^2)/4.
This is obviously O(n^2) (actually, you can replace the O with Θ).
You should recall the definition for Big-O notation. See http://en.wikipedia.org/wiki/Big_O_notation.
The thing is that the number of operations here depends on the square of n, even though the overall number is less than n². Nevertheless, the scaling is what matters for Big-O notation, thus it's O(n²).
With:
for (int i = 1 ; i < n ; i++)
    for (int j = 0 ; j < i ; j +=2)
        sum++;
We have:
0+2+4+6+...+2N == 2 * (0+1+2+3+...+N) == 2 * (N * (N+1) / 2) == N * (N+1)
so, with n == 2N, we have (n / 2) * (n / 2 + 1) ~= (n * n) / 4
so O(n²)
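A quick check (a sketch) that the counted sum indeed tracks n*n/4:

#include <cstdio>

// Run the loop from the question and compare sum with n*n/4.
long long run(int n) {
    long long sum = 0;
    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j += 2)
            sum++;
    return sum;
}

int main() {
    const int ns[] = {10, 100, 1000, 10000};
    for (int n : ns)
        std::printf("n = %d: sum = %lld, n*n/4 = %lld\n",
                    n, run(n), (long long)n * n / 4);
    return 0;
}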
Your understanding of time complexity is not quite right. Time complexity is not only a matter of the 'sum' variable: 'sum' only counts how many times the inner loop iterates, but you also have to consider the total number of outer loop iterations.
now consider your program:
for(int i = 1 ; i < n ; i++)
    for(int j = 0 ; j < i ; j +=2)
        sum++;
Time complexity means the running time of your program with respect to the input values (here n). Running time here does not mean the actual time required to execute your program on your computer; actual time varies from machine to machine. To get a machine-independent running time, Big O notation is very useful. Big O comes from mathematics and describes the running time in terms of mathematical functions.
The outer loop is executed (n-1) times in total. For each of these (n-1) values (starting from i=1), the inner loop iterates about i/2 times, so the total number of inner loop iterations is 1+1+2+2+3+3+...+(n/2)+(n/2) = 2*(1+2+3+...+n/2) = 2*((n/2)*(n/2+1))/2 = n^2/4 + n/2.
Similarly, 'sum++' is also executed n^2/4 + n/2 times in total. Now let the cost of line 1 of the program be c1, the cost of line 2 be c2 and the cost of line 3 be c3; these costs can differ between machines. The total time required to execute the program is then c1*(n-1) + c2*(n^2/4 + n/2) + c3*(n^2/4 + n/2) = (c2+c3)*n^2/4 + (c2+c3)*n/2 + c1*n - c1. Thus the required time can be expressed as a mathematical function, and in Big O notation you can write it as O((c2+c3)*n^2/4 + (c2+c3)*n/2 + c1*n - c1). In Big O notation, lower-order terms and the coefficient of the highest-order term can be ignored, because for large n the n^2 term dominates; the constant factor (c2+c3)/4 doesn't change the growth rate, so you can say it is O(n^2).

Confusion with determining Big-O notation?

So, I really don't get Big O notation. I have been tasked with determining "O value" for this code segment.
for (int count = 1; count < n; count++)  // Runs n times, so linear, or O(N)
{
    int count2 = 1;  // Declares an integer, so constant, O(1)
    while (count2 < count)  // Here's where I get confused. I recognize that it is a nested loop, but does that make it O(N^2)?
    {
        count2 = count2 * 2;  // I would expect this to be constant as well, O(N)
    }
}
g(n) = O(f(n))
This means that there exist positive constants c and k such that g(n) <= c*f(n) for all n > k. f(n) gives an upper bound for the function g(n).
When you are asked to find Big O for some code:
1) Try to count the number of computations being performed in terms of n, thus obtaining g(n).
2) Now estimate an upper-bound function f(n) for g(n). That will be your answer.
Let's apply this procedure to your code and count the number of computations made. The declaration of count2 and the multiplication by 2 each take O(1) time, but they are executed repeatedly; we need to find how many times.
The outer loop executes n times, hence the first statement (the declaration) executes n times. The number of times the inner loop executes depends on the current value of count: for counter value i, count2 doubles from 1 until it reaches i, i.e. about log(i) times.
Now let's count the total number of computations performed:
log(1) + log(2) + log(3) + .... + log(n) + n
Note that the last n is for the first statement. Simplifying the above series we get:
= log(1*2*3*...n) + n
= log(n!) + n
We have
g(n)=log(n!) + n
Let's find an upper bound for log(n!).
Since
1*2*3*4*...*n < n*n*n*...*n (n times),
we have
log(n!) < log(n^n) for n > 1
which implies
log(n!) = O(n log n).
If you want a formal proof for this, check this out. Since n log n increases faster than n, we therefore have:
O(n log n + n) = O(n log n)
Hence your final answer is O(n log n).
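An empirical check of this bound (a sketch; count_doublings is my own helper):

#include <cmath>
#include <cstdio>

// Count how many times the doubling statement runs in total and
// compare with n*log2(n), the dominant term of the bound above.
long long count_doublings(int n) {
    long long ops = 0;
    for (int count = 1; count < n; count++) {
        int count2 = 1;
        while (count2 < count) {
            count2 = count2 * 2;
            ++ops;
        }
    }
    return ops;
}

int main() {
    const int ns[] = {100, 10000, 1000000};
    for (int n : ns)
        std::printf("n = %d: %lld doublings, n*log2(n) = %.0f\n",
                    n, count_doublings(n), n * std::log2((double)n));
    return 0;
}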