Most efficient algorithm for finding this LCM summation - c++

Problem : Find the sum of lcm(i, N) / i over i = 1..N
Range of n : 1<= n <=
The main challenge is handling queries(Q) which can be large . 1 <= Q <=
Methods I have used so far :
Brute Force
while(Q--)
{
    int N;
    cin>>N;
    for(int i=1;i<=N;i++)
        ans += lcm(i,N)/i ;
}
Complexity :
Preprocessing and handling queries in O(sqrt(N))
First I build a table which holds the value of euler totient function for every N.
This can be done in O(N).
void sieve()
{
    // phi table holds euler totient function value
    // lp holds the lowest prime factor for a number
    // pr is a vector which contains prime numbers
    phi[1]=1;
    for(int i=2;i<=MAX;i++)
    {
        if(lp[i]==0)
        {
            lp[i]=i;
            phi[i]=i-1;
            pr.push_back(i);
        }
        else
        {
            // if lp[i] still divides i/lp[i], multiply by lp[i]; otherwise by lp[i]-1
            if(lp[i]==lp[i/lp[i]])
                phi[i] = phi[i/lp[i]]*lp[i];
            else phi[i] = phi[i/lp[i]]*(lp[i]-1);
        }
        for(int j=0;j<(int)pr.size()&&pr[j]<=lp[i]&&i*pr[j]<=MAX;j++)
            lp[i*pr[j]] = pr[j];
    }
}
For each query, enumerate the divisors d of N and add d*phi[d] to the result.
for(int i=1;i*i<=n;i++)
{
    if(n%i==0)
    {
        // i is a factor
        sum += (n/i)*phi[n/i];
        if(i*i!=n)
        {
            // n/i is a factor too
            sum += i*phi[i];
        }
    }
}
This takes O(sqrt(N)) .
Complexity : O(Q*sqrt(N))
Handling queries in O(1)
To the sieve method I described above I add a part which calculates the answer we need in O(NLogN)
for(int i=1;i<=MAX;++i)
{
    //MAX is 10^7
    for(int j=i;j<=MAX;j+=i)
    {
        ans[j] += i*phi[i];
    }
}
This unfortunately times out for the given constraints and the time limit (1 second).
I think this involves some clever idea regarding the prime factorization of N.
I can prime factorize a number in O(LogN) using the lp (lowest prime) table built above, but I can't figure out how to arrive at the answer using the factorization.

You can try the following algorithm:
lcm(i,n) / i = (i * n) / (i * gcd(i, n)) = n / gcd(i, n)
So we need to find the sum of the numbers n / gcd(i, n) for i = 1..n.
Let n = p1^i1 * p2^i2 * ... * pk^ik where p1, p2, ..., pk are prime.
The number of items n / gcd(i, n) where gcd(i, n) == 1 is phi[n] = n*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk), so add n*phi[n] to the sum.
The number of items n / gcd(i, n) where gcd(i, n) == p1 is phi[n/p1] (which equals (n/p1)*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk) as long as p1 still divides n/p1), so add n/p1*phi[n/p1] to the sum.
The number of items n / gcd(i, n) where gcd(i, n) == p1*p2 is phi[n/(p1*p2)] (which equals (n/(p1*p2))*(p1-1)*(p2-1)*...*(pk-1)/(p1*p2*...*pk) as long as p1 and p2 still divide n/(p1*p2)), so add n/(p1*p2)*phi[n/(p1*p2)] to the sum.
Now answer is the sum
n/(p1^j1*p2^j2*...*pk^jk) phi[n/(p1^j1*p2^j2*...*pk^jk)]
over all
j1=0,...,i1
j2=0,...,i2
....
jk=0,...,ik
The total number of items in this sum is (i1+1)*(i2+1)*...*(ik+1), the number of divisors of n, which is significantly less than n.
To calculate this sum you can use a recursive function that takes the initial number, the initial representation, the current representation and the set of representations already visited:
initial = {p1:i1, p2:i2, ... ,pk:ik}
current = {p1:i1, p2:i2, ... ,pk:ik}
visited = {}
int calc(n, initial, current, visited):
    if(current in visited):
        return 0
    visited add current
    int sum = 0
    for pj in keys of current:
        if current[pj] == 0:
            continue
        current[pj]--
        sum += calc(n, initial, current, visited)
        current[pj]++
    // mult1 = n / gcd, where gcd is the divisor described by current
    mult1 = n
    for pj in keys of current:
        mult1 /= pj^current[pj]
    // mult2 = phi[mult1]
    mult2 = mult1
    for pj in keys of current:
        if initial[pj] == current[pj]:
            continue
        mult2 = mult2*(pj -1)/pj
    sum += mult1 * mult2
    return sum

It's possible to quickly determine the sum if you know the prime factorization of the number N. Working off the same approach (totient function times N divided by a factor) as the existing answer, but applying some algebra to simplify terms, factoring the expression into sums of prime powers, and substituting the formula for a geometric series, we arrive at a much simpler solution.
Given the prime factorization of N in primes ps to powers qs, we can compute the result of the original equation for N via:
result = 1
for p, q in prime_factors
    result *= p * (p-1) * (p**(2*q) - 1) / (p**2 - 1) + 1
Note that ** denotes exponentiation in the above pseudo-code.
If one sieves for primes up to MAX, storing at least one prime divisor for each composite discovered (as mentioned in the original problem) as precomputation, it's possible to then factor the subsequent N values in log(N) time by referencing the factor table. If one also pre-computes a prime power table, the above algorithm can then run in log(N) time, for an overall complexity of O(MAX*log(MAX)) pre-computing time and O(Q*log(MAX)) query time, and O(MAX) space.
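As a rough illustration of the formula above, here is a minimal C++ sketch. The function names `build_lowest_prime` and `lcm_sum` are made up for this example, and it assumes N fits in an `int` and is no larger than the sieve limit. It factors N with a lowest-prime table and multiplies the per-prime terms; it does not use the precomputed prime power table mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Lowest-prime-factor table for 2..limit, built with a linear sieve.
std::vector<int> build_lowest_prime(int limit) {
    assert(limit >= 1);
    std::vector<int> lp(limit + 1, 0);
    std::vector<int> primes;
    for (int i = 2; i <= limit; ++i) {
        if (lp[i] == 0) {          // i is prime
            lp[i] = i;
            primes.push_back(i);
        }
        for (int p : primes) {
            if (p > lp[i] || 1LL * i * p > limit) break;
            lp[i * p] = p;
        }
    }
    return lp;
}

// Sum of lcm(i, n) / i for i = 1..n, computed as the product over the prime
// powers p^q dividing n of  p * (p - 1) * (p^(2q) - 1) / (p^2 - 1) + 1.
std::uint64_t lcm_sum(int n, const std::vector<int>& lp) {
    assert(n >= 1 && n < static_cast<int>(lp.size()));
    std::uint64_t result = 1;
    while (n > 1) {
        const int prime = lp[n];
        const std::uint64_t p = static_cast<std::uint64_t>(prime);
        std::uint64_t p2q = 1;     // becomes p^(2q)
        while (n % prime == 0) {
            n /= prime;
            p2q *= p * p;
        }
        // (p^(2q) - 1) is a multiple of (p^2 - 1), so this division is exact.
        result *= p * (p - 1) * ((p2q - 1) / (p * p - 1)) + 1;
    }
    return result;
}
```

For example, with `auto lp = build_lowest_prime(100);`, `lcm_sum(6, lp)` gives 21, which matches summing 6 / gcd(i, 6) for i = 1..6 by hand (6 + 3 + 2 + 3 + 6 + 1).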


Hyperbolic sine without math.h

I'm new to coding and C++. For a homework assignment I'm to create code for sinh without the math header. I understand the math behind sinh, but I have no idea how to code it; any help would be highly appreciated.
According to Wikipedia, there is a Taylor series for sinh:
sinh(x) = x + (pow(x, 3) / 3!) + (pow(x, 5) / 5!) + pow(x, 7) / 7! + ...
One challenge is that you are not allowed to use the pow function. The other is calculating the factorial.
The series is a sum of terms, so you'll need a loop:
double sum = 0.0;
for (unsigned int i = 0; i < NUMBER_OF_TERMS; ++i)
{
    sum += Term(i);
}
You could implement Term as a separate function, but you may want to take advantage of declaring and using variables in the loop (that the function may not have access to).
Consider that pow(x, N) expands to x * x * x...
This means that in each iteration the previous value is multiplied by the present value. (This will come in handy later.)
Consider that N! expands to 1 * 2 * 3 * 4 * 5 * ...
This means that in each iteration, the previous value is multiplied by the iteration number.
Let's revisit the loop:
double sum = 0.0;
double power = 1.0;
double factorial = 1.0;
for (unsigned int i = 1; i <= NUMBER_OF_TERMS; ++i)
{
    // Calculate pow(x, i)
    power = power * x;
    // Calculate i!
    factorial = factorial * i;
}
One issue with the above loop is that the power and factorial need to be calculated for each iteration, but the Taylor Series terms use only the odd iterations. This is solved by calculating the terms for odd iterations only:
for (unsigned int i = 1; i <= NUMBER_OF_TERMS; ++i)
{
    // Calculate pow(x, i)
    power = power * x;
    // Calculate i!
    factorial = factorial * i;
    // Calculate sum for odd iterations
    if ((i % 2) == 1)
    {
        // Calculate the term.
        sum += //...
    }
}
In summary, the pow and factorial functions are broken down into iterative pieces. The iterative pieces are placed into a loop. Since the Taylor Series terms are calculated with odd iteration values, a check is placed into the loop.
The actual calculation of the Taylor Series term is left as an exercise for the OP or reader.
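For readers who want to check their own solution, here is one possible sketch. It is a variation on the loop above rather than a completion of it: instead of tracking `power` and `factorial` separately, it keeps a single running term and derives each term from the previous one. The name `taylor_sinh` and the default of 20 terms are choices made for this example.

```cpp
#include <cassert>

// Approximates sinh(x) with the first `terms` terms of the Taylor series
// x + x^3/3! + x^5/5! + ...
double taylor_sinh(double x, unsigned int terms = 20) {
    assert(terms > 0);
    double term = x;   // current term x^(2k+1) / (2k+1)!, starting at k = 0
    double sum = x;
    for (unsigned int k = 1; k < terms; ++k) {
        // Next term = previous term * x^2 / ((2k) * (2k + 1)).
        term *= (x * x) / ((2.0 * k) * (2.0 * k + 1.0));
        sum += term;
    }
    return sum;
}
```

Folding the power and the factorial into one ratio keeps the intermediate values small, whereas a separately stored factorial grows very quickly. For large |x| more terms would be needed than the default used here.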

How to calculate (n!)%1000000009

I need to find n!%1000000009.
n is of the form 2^k for k in range 1 to 20.
The function I'm using is:
#define llu unsigned long long
#define MOD 1000000009
llu mulmod(llu a,llu b) // This function calculates (a*b)%MOD caring about overflows
{
    llu x=0,y=a%MOD;
    while(b > 0)
    {
        if(b%2 == 1)
        {
            x = (x+y)%MOD;
        }
        y = (y*2)%MOD;
        b /= 2;
    }
    return (x%MOD);
}
llu fun(int n) // This function returns answer to my query ie. n!%MOD
{
    llu ans=1;
    for(int j=1; j<=n; j++)
    {
        ans=mulmod(ans,j);
    }
    return ans;
}
My demand is such that I need to call the function 'fun', n/2 times. My code runs too slow for values of k around 15. Is there a way to go faster?
EDIT:
Actually I'm calculating 2*[(i-1)C(2^(k-1)-1)]*[((2^(k-1))!)^2] for all i in range 2^(k-1) to 2^k. My program demands (nCr)%MOD caring about overflows.
EDIT: I need an efficient way to find nCr%MOD for large n.
The mulmod routine can be sped up by a large factor.
1) '%' is overkill, since a and b are both less than N (here N is the modulus), so a + b is less than 2N.
- It's enough to evaluate c = a+b; if (c>=N) c-=N;
2) Multiple bits can be processed at once; see optimization to "Russian peasant's algorithm"
3) a * b is actually small enough to fit in a 64-bit unsigned long long without overflow, because both operands are already reduced below MOD < 2^30
Since the actual problem is about nCr mod M, the high level optimization requires using the recurrence
(n+1)Cr mod M = (n+1) * nCr / (n+1-r) mod M.
Because the reduced product ((nCr) mod M)*(n+1) is in general not divisible by (n+1-r), the division needs to be implemented as multiplication with the modular inverse: (n+1-r)^(-1). The modular inverse b^(-1) is b^(M-2) mod M, for M being prime. (Otherwise it's b^(phi(M)-1) for b coprime to M, where phi is Euler's Totient function.)
The modular exponentiation is most commonly implemented with repeated squaring, which requires in this case ~45 modular multiplications per divisor.
If you can use the recurrence
nC(r+1) mod M = nCr * (n-r) / (r+1) mod M
It's only necessary to calculate (r+1)^(M-2) mod M once.
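A minimal sketch of the first recurrence, assuming the prime modulus 1000000009 from the question. The names `kMod`, `pow_mod`, `inv_mod` and `binomials_fixed_r` are invented for this illustration. It computes the inverse by Fermat's little theorem with repeated squaring and walks n upward for a fixed r.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kMod = 1000000009ULL;  // prime modulus from the question

// (base^exp) mod kMod by repeated squaring. Operands stay below 2^30,
// so every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= kMod;
    while (exp > 0) {
        if (exp & 1) result = result * base % kMod;
        base = base * base % kMod;
        exp >>= 1;
    }
    return result;
}

// Modular inverse b^(kMod - 2); requires b not divisible by kMod.
std::uint64_t inv_mod(std::uint64_t b) {
    assert(b % kMod != 0);
    return pow_mod(b, kMod - 2);
}

// Returns C(n, r) mod kMod for n = r, r+1, ..., n_max using
// C(n+1, r) = C(n, r) * (n+1) / (n+1-r).
std::vector<std::uint64_t> binomials_fixed_r(std::uint64_t r, std::uint64_t n_max) {
    assert(r <= n_max && n_max < kMod);
    std::vector<std::uint64_t> out;
    std::uint64_t c = 1;  // C(r, r)
    out.push_back(c);
    for (std::uint64_t n = r; n < n_max; ++n) {
        c = c * ((n + 1) % kMod) % kMod * inv_mod(n + 1 - r) % kMod;
        out.push_back(c);
    }
    return out;
}
```

This version pays one modular exponentiation per step. For example, `binomials_fixed_r(2, 6)` yields 1, 3, 6, 10, 15.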
Since you are looking for nCr for multiple sequential values of n you can make use of the following:
(n+1)Cr = (n+1)! / ((r!)*(n+1-r)!)
(n+1)Cr = n!*(n+1) / ((r!)*(n-r)!*(n+1-r))
(n+1)Cr = n! / ((r!)*(n-r)!) * (n+1)/(n+1-r)
(n+1)Cr = nCr * (n+1)/(n+1-r)
This saves you from explicitly calling the factorial function for each i.
Furthermore, to save that first call to nCr you can use:
nC(n-1) = n //where n in your case is 2^(k-1).
EDIT:
As Aki Suihkonen pointed out, (a/b) % m != a%m / b%m. So the method above won't work right out of the box. There are two different solutions to this:
1000000009 is prime, this means that a/b % m == a*c % m where c is the inverse of b modulo m. You can find an explanation of how to calculate it here and follow the link to the Extended Euclidean Algorithm for more on how to calculate it.
The other option which might be easier is to recognize that since nCr * (n+1)/(n+1-r) must give an integer, it must be possible to write n+1-r == a*b where a | nCr and b | n+1 (the | here means divides, you can rewrite that as nCr % a == 0 if you like). Without loss of generality, let a = gcd(n+1-r,nCr) and then let b = (n+1-r) / a. This gives (n+1)Cr == (nCr / a) * ((n+1) / b) % MOD. Now your divisions are guaranteed to be exact, so you just calculate them and then proceed with the multiplication as before. EDIT As per the comments, I don't believe this method will work.
Another thing I might try is a fast path in your llu mulmod(llu a,llu b):
llu mulmod(llu a,llu b)
{
    llu q = a * b;
    if(a != 0 && q / a != b) // Overflow: the product wrapped around
    {
        llu x=0,y=a%MOD;
        while(b > 0)
        {
            if(b%2 == 1)
            {
                x = (x+y)%MOD;
            }
            y = (y*2)%MOD;
            b /= 2;
        }
        return (x%MOD);
    }
    else
    {
        return q % MOD;
    }
}
That could also save some precious time.

Calculate n where a^n mod m = 1?

What is fastest way to calculate the first n satisfying the equation
a^n mod m = 1
Here a,n,m can be prime or composite
mod : is the modulus operator
What is wrong with the direct way:
int mod_order(int m, int a) {
    for(int n = 1, an = a; n != m; n++, an = an * a % m)
        if(an % m == 1) return n;
    return -1;
}
If gcd(a,m)>1, then there is no such n. (Obvious)
Otherwise, if m is prime, n=m-1. (Proof)
Otherwise (and as more general case), n=ф(m), where ф is Euler's totient function. (Proof)
As you can see, computing ф(m) is essentially the same as factorization of m. This can be done in sqrt(m) time or faster, depending on how sophisticated the algorithm you use is. A simple one:
int phi(int m){
    if(m==1) return 1;
    for(int d=2; d*d<=m; ++d){
        if(m%d != 0) continue;
        // acc becomes the highest power of d dividing m
        long acc=1;
        while(m%(acc*d)==0) acc*=d;
        // phi(d^k) = d^(k-1)*(d-1), and phi is multiplicative
        return phi(m/acc)*(acc/d)*(d-1);
    }
    return m-1;
}
Upd: My bad. a^(ф(m)) = 1 (mod m), but there can be a smaller value of n (for a=1, n=1 regardless of m; for a=14, m=15, n=2). n is a divisor of ф(m), but efficiently computing the least possible n seems to be tricky. The task can be divided by using this theorem (the minimal n is the least common multiple of the degrees for the respective remainders). But when m is prime or has a big enough prime divisor, and there is only one a (as opposed to computing n for many different a with the same m), we're kind of out of options. You may want to look at 1, 2.
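Since the minimal n divides ф(m), one common approach (not the one given in this answer) is to start from ф(m) and strip prime factors while the power still equals 1. The sketch below assumes m fits in 32 bits so that products fit in 64 bits. All function names are hypothetical, and trial division is used for factoring, so it is only practical for moderate m.

```cpp
#include <cassert>
#include <cstdint>

// (base^exp) mod m by repeated squaring; safe for m < 2^32.
std::uint64_t pow_mod_general(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b) {
    while (b != 0) { const std::uint64_t t = a % b; a = b; b = t; }
    return a;
}

// Euler's totient by trial division.
std::uint64_t totient(std::uint64_t m) {
    std::uint64_t result = m;
    for (std::uint64_t d = 2; d * d <= m; ++d) {
        if (m % d != 0) continue;
        while (m % d == 0) m /= d;
        result -= result / d;
    }
    if (m > 1) result -= result / m;
    return result;
}

// Smallest n >= 1 with a^n == 1 (mod m), or 0 if no such n exists.
std::uint64_t multiplicative_order(std::uint64_t a, std::uint64_t m) {
    assert(m >= 2 && m < (1ULL << 32));
    if (gcd_u64(a % m, m) != 1) return 0;
    std::uint64_t n = totient(m);
    std::uint64_t rest = n;  // part of phi(m) that is not yet factored
    for (std::uint64_t d = 2; d * d <= rest; ++d) {
        if (rest % d != 0) continue;
        while (rest % d == 0) rest /= d;
        // Remove factors of d from n while a^(n/d) is still 1.
        while (n % d == 0 && pow_mod_general(a, n / d, m) == 1) n /= d;
    }
    if (rest > 1) {
        while (n % rest == 0 && pow_mod_general(a, n / rest, m) == 1) n /= rest;
    }
    return n;
}
```

With the example from the update, `multiplicative_order(14, 15)` returns 2. The cost is dominated by factoring m and ф(m), which is the part that stays hard for large m.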

calculating complexity of sorting

std::sort performs approximately N*log2(N) (where N is distance) comparisons of elements(source - http://www.cplusplus.com/), so its complexity is N*log2(N).
Please, help me to calculate complexity for the next code:
void func(std::vector<float> & Storage)
{
    for(int i = 0; i < Storage.size() - 1; ++i)
    {
        std::sort(Storage.begin()+i, Storage.end());
        Storage[i+1] += Storage[i];
    }
}
complexity = N^2*log2(N) or 2log2(2)+3log2(3)+...+(N)log2(N)?
Thank you.
The proper way to compute the complexity is to evaluate the complexity of repeated O(K Log K) problems of linearly increasing sizes K = 1 ... N. This can be done either by computing the sum, or by just computing the integral
Integrate[K Log[K], {K, 0, N}]
with e.g. Mathematica, and you get
1/4 N^2 (-1 + 2 Log[N])
which is of O(N^2 Log N).
Even though it holds for polynomial and logarithmic functions, in general it is not true that the sum over K = 1 ... N of subproblems of complexity f(K) is of order N f(N). E.g. the sum of K = 1 ... N subproblems of complexity Exp[K] is of order Exp[N], not N Exp[N].
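For the K Log K case, the same bound can also be checked without an integral. The following is a short sketch of the standard sandwich argument, not something taken from the answer above: every term is at most N log N, and the upper half of the terms are each at least (N/2) log(N/2).

```latex
\sum_{K=1}^{N} K \log K \;\le\; \sum_{K=1}^{N} N \log N \;=\; N^{2} \log N,
\qquad
\sum_{K=1}^{N} K \log K \;\ge\; \sum_{K=\lceil N/2 \rceil}^{N} \frac{N}{2} \log \frac{N}{2}
\;\ge\; \frac{N^{2}}{4} \log \frac{N}{2}.
```

Both bounds are of order N^2 Log N, so the sum is as well.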
I would agree with N^2*log2(N) as the sort algorithm is run N times. In Big-O, where c is a constant:
c*N * N*log2(N) => O(N^2*log2(N))
It will be asymptotically O((N^2)*(log2(N)))
we need sum of k*log2(k) k from 1 to N
You are summing up logarithmic functions:
complexity <- 0
for i = 1..N
    complexity += i Log(i)
Resulting in the summation:
Log(1) + 2 Log(2) + ... + N Log(N)
from http://en.wikipedia.org/wiki/Logarithm:
the logarithm of a product is the sum of the logarithms of the factors:
thus each term i Log(i) can be written as Log(i^i), and
the summation becomes:
Log(1) + Log(2^2) + .. + Log(N^N)
further simplifying:
Log(1*2^2*3^3*...*N^N)

%mod compatible ways of generating Binomial Coefficients

I would like to optimize a part of my program where I'm calculating the sum of Binomial Coefficients up to K. i.e.
C(N,0) + C(N,1) + ... + C(N,K)
Since the values go beyond what the data type (long long) can support, I need to calculate the values mod M and was looking for procedures to do that. Currently, I've done it with Pascal's Triangle but it seems to be quite expensive, so I was wondering if there's any other efficient way to do this. I've considered Lucas' Theorem, although M I have is already large enough so that C(N,k) goes out of hand!
Any pointers as to how I can do this differently, maybe by calculating the whole sum at once with some other neat expression of the sum? If not I'll leave it with the Pascal's Triangle method itself.
Thank you,
Here is what I have so far O(N^2) :
#define MAX 1000000007
long long NChooseK_Sum(int N, int K){
    vector<long long> prevV, V;
    prevV.push_back(1); prevV.push_back(1);
    V = prevV; // row for N == 1, so the final loop also works when N < 2
    for(int i=2;i<=N;++i){
        V.clear();
        V.push_back(1);
        for(int j=0;j<(i-1);++j){
            long long val = prevV[j] + prevV[j+1];
            if(val >= MAX)
                val %= MAX;
            V.push_back(val);
        }
        V.push_back(1);
        prevV = V;
    }
    long long res=0;
    for(int i=0;i<=K;++i){
        res+=V[i];
        if(res >= MAX)
            res %= MAX;
    }
    return res;
}
An algorithm that performs a linear number of arithmetic bignum operations is
def binom(n):
    nck = 1
    for k in range(n + 1): # 0..n
        yield nck
        nck = (nck * (n - k)) // (k + 1)
This uses division, but modulo a prime p, you can accomplish much the same thing by multiplying by the solution i to the equation i * (k + 1) = 1 mod p. The value i can be found in a logarithmic number of arithmetic ops via the extended Euclidean algorithm.
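A possible C++ rendering of this idea, assuming the prime modulus 1000000007 from the question's code and 0 <= K <= N < modulus. The names `kPrime`, `inverse_mod` and `binomial_prefix_sum` are made up for this sketch. It replaces the division with a multiplication by the inverse found through the extended Euclidean algorithm.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kPrime = 1000000007LL;  // modulus from the question's code

// Solves i * b == 1 (mod kPrime) with the extended Euclidean algorithm.
std::int64_t inverse_mod(std::int64_t b) {
    assert(b % kPrime != 0);
    std::int64_t r0 = kPrime, r1 = ((b % kPrime) + kPrime) % kPrime;
    std::int64_t t0 = 0, t1 = 1;  // invariant: t * b == r (mod kPrime)
    while (r1 != 0) {
        const std::int64_t q = r0 / r1;
        const std::int64_t r2 = r0 - q * r1;
        const std::int64_t t2 = t0 - q * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }
    // r0 is now the gcd (1 for a prime modulus); t0 may be negative.
    return ((t0 % kPrime) + kPrime) % kPrime;
}

// C(n,0) + C(n,1) + ... + C(n,k) mod kPrime, assuming 0 <= k <= n < kPrime.
std::int64_t binomial_prefix_sum(std::int64_t n, std::int64_t k) {
    assert(0 <= k && k <= n && n < kPrime);
    std::int64_t nck = 1;  // C(n, 0)
    std::int64_t sum = 1;
    for (std::int64_t j = 0; j < k; ++j) {
        // C(n, j+1) = C(n, j) * (n - j) / (j + 1), with the division done
        // as multiplication by the inverse of (j + 1).
        nck = nck * ((n - j) % kPrime) % kPrime * inverse_mod(j + 1) % kPrime;
        sum = (sum + nck) % kPrime;
    }
    return sum;
}
```

For instance, `binomial_prefix_sum(5, 2)` is 1 + 5 + 10 = 16. The loop performs K inverse computations, so the total is roughly K log(modulus) arithmetic operations instead of the N^2 of the Pascal's Triangle version.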