Calculating complexity of sorting - C++

std::sort performs approximately N*log2(N) comparisons of elements, where N is the distance between the iterators (source: http://www.cplusplus.com/), so its complexity is N*log2(N).
Please, help me to calculate complexity for the next code:
void func(std::vector<float> & Storage)
{
    for (int i = 0; i < Storage.size() - 1; ++i)
    {
        std::sort(Storage.begin() + i, Storage.end());
        Storage[i + 1] += Storage[i];
    }
}
Is the complexity N^2*log2(N), or 2*log2(2) + 3*log2(3) + ... + N*log2(N)?
Thank you.

The proper way to compute the complexity is to evaluate the cost of repeated O(K Log K) subproblems of linearly increasing sizes K = 1 ... N. This can be done either by computing the sum directly, or by evaluating the integral
Integrate[K Log[K], {K, 0, N}]
with e.g. Mathematica, which gives
1/4 N^2 (-1 + 2 Log[N])
and is of order O(N^2 Log N).
Although this works for polynomial and logarithmic cost functions, it is not true in general that the sum over K = 1 ... N of subproblems of complexity f(K) is of order N f(N). For example, the sum over K = 1 ... N of subproblems of complexity Exp[K] is simply O(Exp[N]), not N Exp[N].
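As a quick numerical sanity check of that estimate, a small self-contained C++ program (illustrative only, not from the original answers) can compare the comparison-count model sum over K = 1..N of K*log2(K) against the integral's leading term N^2*log2(N)/2:

#include <cmath>
#include <cstdio>
#include <initializer_list>

int main() {
    for (long long N : {1000LL, 10000LL, 100000LL}) {
        double sum = 0.0;
        for (long long K = 2; K <= N; ++K)
            sum += K * std::log2(K);                  // modelled cost of sorting K elements
        double estimate = 0.5 * N * N * std::log2(N); // leading term of the integral
        std::printf("N=%lld sum=%.3e estimate=%.3e ratio=%.4f\n",
                    N, sum, estimate, sum / estimate);
    }
    return 0;
}

The ratio slowly tends to 1 as N grows (the correction term is of order 1/log N), which is consistent with the O(N^2 log N) result.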

I would agree with N^2*log2(N), since the sort algorithm is run about N times and each run costs at most N*log2(N) comparisons. In Big-O terms, where c is a constant:
c * N * N*log2(N) => O(N^2*log2(N))

It will be asymptotically O(N^2 * log2(N)).
We need the sum of k*log2(k) for k from 1 to N.

You are summing up logarithmic functions:
complexity <- 0
for i = 1..N
    complexity += i Log(i)
Resulting in the summation:
Log(1) + 2 Log(2) + ... + N Log(N)
From http://en.wikipedia.org/wiki/Logarithm: the logarithm of a product is the sum of the logarithms of the factors. Thus the summation becomes:
Log(1) + Log(2^2) + ... + Log(N^N)
further simplifying:
Log(1*2^2*3^3*...*N^N)
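One way to see that this last expression is Theta(N^2 log N) is to bound the sum on both sides:

$$
\frac{N^{2}}{4}\log\frac{N}{2}
\;\le\; \sum_{k=N/2}^{N} k\log k
\;\le\; \sum_{k=1}^{N} k\log k
\;=\; \log\!\left(1\cdot 2^{2}\cdot 3^{3}\cdots N^{N}\right)
\;\le\; \log N \sum_{k=1}^{N} k
\;\le\; N^{2}\log N .
$$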


Find Euler function of binomial coefficient

I've been trying to solve this problem:
Find Euler's totient function of binomial coefficient C(n, m) = n! / (m! (n - m)!) modulo 10^9 + 7, m <= n < 2 * 10^5.
One of my ideas was that first, we can precalculate the values of phi(i) for all i from 1 to n in linear time; we can also calculate all inverses of the numbers from 1 to n modulo 10^9 + 7 using, for example, Fermat's little theorem. After that we know that, in general, phi(m * n) = phi(m) * phi(n) * (d / phi(d)), where d = gcd(m, n). Because gcd((x - 1)!, x) = 1 if x is prime, 2 if x = 4, and x in all other cases, we can calculate phi(x!) modulo 10^9 + 7 in linear time. However, in the last step we need to calculate phi(n! / (m! (n - m)!)) (if we already know the function for the factorials), so with this method we have to know gcd(C(n, m), m! (n - m)!), and I don't know how to find it.
I've also been thinking about factorizing the binomial coefficient, but there seems no efficient way to do this.
Any help would be appreciated.
First, factorize all numbers 1..(2*10^5) as products of prime powers.
Now, factorize n!/k! = n(n-1)(n-2)...(k+1) as a product of prime powers by multiplying together the factors of the individual parts. Factorize (n-k)! as a product of prime powers. Subtract the latter powers from the former (to account for the division).
Now you've got C(n, k) as a product of prime powers. Use the formula phi(N) = N * prod(1 - 1/p for p|N) to calculate phi(C(n, k)), which is straightforward given that you've computed a list of all the prime powers that divide C(n, k) in the second step.
For example:
C(9, 4) = 9*8*7*6*5 / (5*4*3*2*1)
9*8*7*6*5 = 3*3 * 2*2*2 * 7 * 3*2 * 5 = 7*5*3^3*2^4
5*4*3*2*1 = 5 * 2*2 * 3 * 2 * 1 = 5*3*2^3
9*8*7*6*5/(5*4*3*2*1) = 7*3^2*2
phi(C(9, 4)) = 7*3^2*2 * (1 - 1/7) * (1 - 1/3) * (1 - 1/2) = 36
I've done it in integers rather than integers mod M, but it seems like you already know how division works in the modulo ring.
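As a rough C++ sketch of this procedure (plain 64-bit integers rather than the modular version the asker needs, trial division instead of a precomputed smallest-prime-factor table, and illustrative names such as add_factors and phi_binom):

#include <cstdio>
#include <map>

// Add the prime factorization of m into the exponent map e with sign sgn.
void add_factors(long long m, int sgn, std::map<long long, long long>& e) {
    for (long long p = 2; p * p <= m; ++p)
        while (m % p == 0) { e[p] += sgn; m /= p; }
    if (m > 1) e[m] += sgn;
}

long long phi_binom(long long n, long long k) {
    std::map<long long, long long> e;                               // prime -> exponent in C(n, k)
    for (long long x = k + 1; x <= n; ++x) add_factors(x, +1, e);   // n! / k!
    for (long long x = 2; x <= n - k; ++x) add_factors(x, -1, e);   // divide by (n-k)!
    long long phi = 1;
    for (auto [p, q] : e) {
        if (q == 0) continue;
        long long pq = 1;
        for (long long i = 0; i < q; ++i) pq *= p;                  // p^q
        phi *= pq / p * (p - 1);                                    // phi(p^q) = p^(q-1) * (p-1)
    }
    return phi;
}

int main() { std::printf("%lld\n", phi_binom(9, 4)); }              // prints 36, matching the example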

Efficient Algorithm to Answer Subarray Queries Fast

The other day I encountered a problem related to queries, but I can't solve it.
Given an array with N integers and a positive integer M, you must answer Q queries. Each query is characterized as ( i , j ), where i and j are each indices of the array. In each query you must answer how many pairs ( r , s ) exist such that
i <= r <= s <= j
the sum of the array elements with indices in [ r , s ] is divisible by M.
Limits:
N <= 50,000
Q <= 50,000
M <= 100
I have a dynamic programming solution that handles every query (i, j) in O(N^2), but that is not fast enough. Is there a more efficient solution? I have some ideas with Mo's algorithm, or with segment trees, but I can't get it to work.
Calculate the prefix sums Sum[i] of the original array modulo M (assuming the array is 1-based) for every i = 1..N, with Sum[0] = 0.
If Sum[r] and Sum[s] are equal for two indices r < s, then the sum of the array elements with indices in [r+1, s] is divisible by M, so we need to count such equal-remainder pairs within the query interval. The time complexity of this step is O(N).
Precalculate the array Count for every i = 0..N, j = 0..M-1:
Count[i][j] stores the number of indices len <= i for which Sum[len] equals j. The time complexity of this step is O(N*M).
For every query (i, j) the answer is computed as follows:
for every possible value of the remainder k we find D(k), the number of indices len in the interval [i-1, j] for which Sum[len] equals k. Then we add to the result the number of possible pairs among those indices, which is D(k)*(D(k)-1)/2. Time complexity: O(M) per query.
Overall complexity: O(N) + O(N*M) + O(Q*M) = O((N+Q)*M), which is fine for the given constraints.
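A minimal C++ sketch of this idea (names such as DivisibleSubarrays, pre and cnt are mine, and the input is assumed to be 1-based in a[1..N]):

#include <vector>
using namespace std;

// cnt[t][k] = how many prefix indices u in [0, t] have pre[u] == k,
// where pre[u] = (a[1] + ... + a[u]) mod M and pre[0] = 0.
struct DivisibleSubarrays {
    int M;
    vector<vector<int>> cnt;                       // (N+1) x M
    DivisibleSubarrays(const vector<long long>& a, int M) : M(M) {
        int N = (int)a.size() - 1;                 // a[0] is unused, data in a[1..N]
        cnt.assign(N + 1, vector<int>(M, 0));
        long long pre = 0;
        cnt[0][0] = 1;                             // pre[0] = 0
        for (int t = 1; t <= N; ++t) {
            pre = ((pre + a[t]) % M + M) % M;      // keep the remainder non-negative
            cnt[t] = cnt[t - 1];                   // O(M) copy -> O(N*M) preprocessing
            cnt[t][pre] += 1;
        }
    }
    // number of subarrays [r, s] with i <= r <= s <= j whose sum is divisible by M
    long long query(int i, int j) const {
        long long res = 0;
        for (int k = 0; k < M; ++k) {
            long long d = cnt[j][k] - (i >= 2 ? cnt[i - 2][k] : 0);   // D(k) over prefix indices [i-1, j]
            res += d * (d - 1) / 2;
        }
        return res;
    }
};

For example, with the array 3, 1, 2 stored in a[1..3] and M = 3, query(1, 3) returns 3: the subarrays [1,1], [2,3] and [1,3].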
First note that for any subarray (r, s) that sums to a multiple of M:
sum(r, s) == sum(i, s) - sum(i, r - 1)
== (qa * M + ra) - (qb * M + rb)
where ra and rb are both less than M and greater than or equal to 0 (i.e. the respective remainders after dividing by M).
Now sum(r, s) is divisible by M, so its remainder is 0 after dividing by M. Therefore:
ra == rb
If we calculate all the remainders after dividing the sums of the subarrays (i, i), (i, i + 1), ... , (i, j) by M as r1, r2, ... , rj, and store the counts of these in an array R of size M so that R[k] is the number of remainders equal to k, then:
R[0] == the number of subarrays starting at i that are divisible by M
and for every k >= 0 and k < M such that R[k] > 1 we can count R[k] choose 2:
(R[k] * (R[k] - 1)) / 2
subarrays not starting at i that are divisible by M.
Creating and summing all these values gives us the answer in O(N + M) for each (i, j) query.

How to prove Big O notation

In my algorithm class we are discussing big O notation and I am stuck proving this example problem:
Prove f(n) = 3n lg n + 10n + lg n + 20 = O(n lg n)
Details will be appreciated.
All you need to prove is that for some M and X0:
M n lg n >= 3n lg n + 10n + lg n + 20 for all n greater than X0
M = 4 is an easy choice.
I'm sure you can compute some X0 for which the above inequality holds, and then easily show that it remains true for all n greater than X0.
It helps to simplify the inequality, after substituting in M = 4, to
(n-1) lg n >= 10n + 20
Once n is big enough that lg n > 10, each increase of n by 1 raises the right side by exactly 10 and the left side by more than 10, so the inequality keeps holding.
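Concretely, one valid choice of constants is M = 4 and X0 = 2048:

$$
4n\lg n \;\ge\; 3n\lg n + 10n + \lg n + 20 \quad\Longleftrightarrow\quad (n-1)\lg n \;\ge\; 10n + 20,
$$

and at n = 2048 the right-hand form reads 2047 * 11 = 22517 >= 10 * 2048 + 20 = 20500. For larger n the left side grows by at least lg n >= 11 per unit step while the right side grows by exactly 10, so the bound holds for all n >= 2048.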
Big O notation is an asymptotic notation: it describes how the running time grows for large n, up to constant factors.
In your example, n lg n grows faster than both n and lg n, and the constant terms are irrelevant in such an approximation, so they can be ignored.
Because of that, the complexity is O(n lg n).

Most efficient algorithm for finding this LCM summation

Problem: for a given N, find the sum of lcm(i, N) / i over i = 1..N.
Range of N: 1 <= N <=
The main challenge is handling the number of queries (Q), which can be large. 1 <= Q <=
Methods I have used so far :
Brute Force
while (Q--)
{
    int N;
    cin >> N;
    for (int i = 1; i <= N; i++)
        ans += lcm(i, N) / i;
}
Complexity: O(Q * N * log(N)), since each term needs one gcd/lcm computation.
Preprocessing in O(N) and handling queries in O(sqrt(N))
First I build a table which holds the value of the Euler totient function for every N.
This can be done in O(N).
void sieve()
{
    // phi table holds the Euler totient function values
    // lp holds the lowest prime factor for a number
    // pr is a vector which contains the prime numbers
    phi[1] = 1;
    for (int i = 2; i <= MAX; i++)
    {
        if (lp[i] == 0)
        {
            // i is prime
            lp[i] = i;
            phi[i] = i - 1;
            pr.push_back(i);
        }
        else
        {
            if (lp[i] == lp[i / lp[i]])
                phi[i] = phi[i / lp[i]] * lp[i];
            else
                phi[i] = phi[i / lp[i]] * (lp[i] - 1);
        }
        for (int j = 0; j < (int)pr.size() && pr[j] <= lp[i] && i * pr[j] <= MAX; j++)
            lp[i * pr[j]] = pr[j];
    }
}
For each query, enumerate the divisors d of N and add d*phi[d] to the result:
for (int i = 1; i * i <= n; i++)
{
    if (n % i == 0)
    {
        // i is a factor
        sum += (n / i) * phi[n / i];
        if (i * i != n)
        {
            // n/i is a factor too
            sum += i * phi[i];
        }
    }
}
This takes O(sqrt(N)).
Complexity: O(Q*sqrt(N))
Handling queries in O(1)
To the sieve method I described above I add a part which precalculates the answer for every value up to MAX in O(N log N):
for (int i = 1; i <= MAX; ++i)
{
    // MAX is 10^7
    for (int j = i; j <= MAX; j += i)
    {
        ans[j] += i * phi[i];
    }
}
This unfortunately times out for the given constraints and the time limit (1 second).
I think this involves some clever idea regarding the prime factorization of N.
I can prime-factorize a number in O(log N) using the lp (lowest prime) table built above, but I can't figure out how to arrive at the answer using the factorization.
You can try the following approach:
lcm(i, n) / i = i * n / (i * gcd(i, n)) = n / gcd(i, n)
Now we need the sum of the numbers n / gcd(i, n).
Let n = p1^i1 * p2^i2 * ... * pk^ik, where p1, p2, ..., pk are prime.
The number of items n / gcd(i, n) with gcd(i, n) == 1 is phi[n] = n*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk), and each such item equals n, so add n*phi[n] to the sum.
The number of items n / gcd(i, n) with gcd(i, n) == p1 is phi[n/p1] = (n/p1)*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk), so add (n/p1)*phi[n/p1] to the sum.
The number of items n / gcd(i, n) with gcd(i, n) == p1*p2 is phi[n/(p1*p2)] = (n/(p1*p2))*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk), so add (n/(p1*p2))*phi[n/(p1*p2)] to the sum. (When a prime's exponent drops to zero in the divisor, its (p-1)/p factor is omitted, as in the general formula below.)
Now the answer is the sum of
n/(p1^j1*p2^j2*...*pk^jk) * phi[n/(p1^j1*p2^j2*...*pk^jk)]
over all
j1 = 0, ..., i1
j2 = 0, ..., i2
....
jk = 0, ..., ik
that is, over all divisors of n. The total number of items in this sum is (i1+1)*(i2+1)*...*(ik+1), which is significantly less than n.
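For a concrete check, take n = 12 = 2^2 * 3. The divisors of 12 are 1, 2, 3, 4, 6, 12, and the sum is
1*phi(1) + 2*phi(2) + 3*phi(3) + 4*phi(4) + 6*phi(6) + 12*phi(12) = 1 + 2 + 6 + 8 + 12 + 48 = 77,
which matches summing lcm(i, 12)/i = 12/gcd(i, 12) directly over i = 1..12.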
To calculate this sum you can use a recursive function whose arguments are the number n, the initial prime-exponent representation, the current representation, and a set of already visited representations:
initial = {p1: i1, p2: i2, ... , pk: ik}
current = {p1: i1, p2: i2, ... , pk: ik}
visited = {}
int calc(n, initial, current, visited):
    if current in visited:
        return 0
    add a copy of current to visited
    int sum = 0
    for pj in keys of current:
        if current[pj] == 0:
            continue
        current[pj]--
        sum += calc(n, initial, current, visited)
        current[pj]++
    mult1 = n
    for pj in keys of current:
        mult1 /= pj^current[pj]            // mult1 is the divisor d = n / prod(pj^current[pj])
    mult2 = mult1
    for pj in keys of current:
        if initial[pj] == current[pj]:
            continue
        mult2 = mult2 * (pj - 1) / pj      // mult2 becomes phi(d)
    sum += mult1 * mult2                    // add d * phi(d) for this divisor
    return sum
It's possible to determine the sum quickly if you know the prime factorization of the number N. Working from the same approach as the existing answer (the totient function times N divided by a factor), but applying some algebra to simplify terms, factoring the expression into sums of prime powers, and substituting the formula for a geometric series, we arrive at a much simpler solution.
Given the prime factorization of N in primes ps to powers qs, we can compute the result of the original equation for N via:
result = 1
for p, q in prime_factors
    result *= p * (p-1) * (p**(2*q) - 1) / (p**2 - 1) + 1
Note that ** denotes exponentiation in the above pseudo-code.
If one sieves for primes up to MAX as precomputation, storing at least one prime divisor for each composite discovered (as mentioned in the original problem), it's possible to then factor the subsequent N values in log(N) time by referencing the factor table. If one also precomputes a prime power table, the above algorithm can run in log(N) time per query, for an overall complexity of O(MAX*log(MAX)) precomputation time, O(Q*log(MAX)) query time, and O(MAX) space.
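A compact C++ sketch of this formula (plain 64-bit arithmetic with no modulus, and trial-division factorization instead of the precomputed lp table, so it only illustrates the algebra rather than being a drop-in solution):

#include <cstdio>

// sum_{i=1..n} lcm(i, n) / i  =  product over prime powers p^q exactly dividing n of
//   1 + p*(p-1)*(p^(2q) - 1) / (p^2 - 1)
long long lcm_sum(long long n) {
    long long result = 1;
    for (long long p = 2; p * p <= n; ++p) {
        if (n % p != 0) continue;
        int q = 0;
        while (n % p == 0) { n /= p; ++q; }
        long long p2q = 1;
        for (int i = 0; i < 2 * q; ++i) p2q *= p;              // p^(2q)
        result *= 1 + p * (p - 1) * (p2q - 1) / (p * p - 1);   // the division is exact here
    }
    if (n > 1)                                                 // one leftover prime, q = 1
        result *= 1 + n * (n - 1);
    return result;
}

int main() { std::printf("%lld\n", lcm_sum(12)); }             // prints 77, the sum for N = 12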

Big-O complexity of this algorithm

CODE:
void fun(int n){
    if(n > 2){
        for(int i = 0; i < n; i++){
            int j = 0;
            while(j < n){
                cout << j;
                j++;
            }
        }
        fun(n / 2);
    }
}
Here's what I think:
The recursive part runs log(n) times?
During each recursive call the nested loops run n^2 times, with n halving on each recursive call.
So is it n^2 + (n^2)/4 + (n^2)/16 + ... + 1?
You are right, so the Big-O is O(n^2), since the sum of the series n^2 + (n^2)/4 + (n^2)/16 + ... + 1 never exceeds 2n^2.
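Spelling out the geometric-series bound behind that claim:

$$
\sum_{k=0}^{\lfloor\log_2 n\rfloor}\left(\frac{n}{2^k}\right)^{2}
= n^{2}\sum_{k=0}^{\lfloor\log_2 n\rfloor}\frac{1}{4^{k}}
< n^{2}\sum_{k=0}^{\infty}\frac{1}{4^{k}}
= \frac{4}{3}\,n^{2}
< 2n^{2}.
$$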
The number of writes to cout is given by the following recurrence:
T(N) = N² + T(N/2).
By educated guess, T(N) can be a quadratic polynomial. Hence
T(N) = aN²+bN+c = N² + T(N/2) = N² + aN²/4+bN/2+c.
By identification, we have
3a/4 = 1
b/2 = 0
c = c.
and
T(N) = 4N²/3 + c.
With T(2) = 0,
T(N) = 4(N²-4)/3
which is obviously O(N²).
This is simple mathematics. The complexity is n^2 + (n^2)/4 + (n^2)/16 + ... + 1, which is n^2 * (1 + 1/4 + 1/16 + ...). The infinite series converges to 4/3 (by the formula 1 / (1 - 1/4)).
So it is indeed O(n^2).