How to find divisor to maximise remainder? - c++

Given two numbers n and k, find x, 1 <= x <= k that maximises the remainder n % x.
For example, n = 20 and k = 10 the solution is x = 7 because the remainder 20 % 7 = 6 is maximum.
My solution to this is :
int n, k;
cin >> n >> k;
int max = 0;
for(int i = 1; i <= k; ++i)
{
int xx = n - (n / i) * i; // or int xx = n % i;
if(max < xx)
max = xx;
}
cout << max << endl;
But my solution is O(k). Is there any more efficient solution to this?

Not asymptotically faster, but faster, simply by going backwards and stopping when you know that you cannot do better.
Assume k is less than n (if k is greater than n, the answer is simply n, obtained with x = n + 1).
int max = 0;
for(int i = k; i > 0 ; --i)
{
int xx = n - (n / i) * i; // or int xx = n % i;
if(max < xx)
max = xx;
if (i < max)
break; // all remaining values will be smaller than max, so break out!
}
cout << max << endl;
(This can be further improved by doing the for loop as long as i > max, thus eliminating one conditional statement, but I wrote it this way to make it more obvious)
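The `i > max` variant mentioned above can be sketched as follows. The function names `maxRemainder` and `maxRemainderBrute` are hypothetical, and the `k > n` shortcut is an added assumption based on the fact that x = n + 1 leaves n itself as the remainder.

```cpp
#include <algorithm>
#include <cassert>

// Largest value of n % x over 1 <= x <= k (hypothetical helper name).
int maxRemainder(int n, int k) {
    assert(n >= 1 && k >= 1);
    if (k > n) return n;              // x = n + 1 leaves n itself as the remainder
    int best = 0;
    // n % i is always smaller than i, so once i <= best nothing can improve.
    for (int i = k; i > best; --i)
        best = std::max(best, n % i);
    return best;
}

// Reference implementation that tries every divisor, for cross-checking.
int maxRemainderBrute(int n, int k) {
    int best = 0;
    for (int i = 1; i <= k; ++i)
        best = std::max(best, n % i);
    return best;
}
```

For the example from the question, `maxRemainder(20, 10)` yields 6. The worst case is still O(k), but the loop usually stops after a few iterations when k is close to n.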
Also, check Garey and Johnson's Computers and Intractability book to make sure this is not NP-Complete (I am sure I remember some problem in that book that looks a lot like this). I'd do that before investing too much effort on trying to come up with better solutions.

This problem is equivalent to finding the maximum of the function f(x)=n%x in the given range. Let's see what this function looks like:
It is obvious that we could get the maximum sooner if we start with x=k and then decrease x while it makes any sense (until x=max+1). Also this diagram shows that for x larger than sqrt(n) we don't need to decrease x sequentially. Instead we could jump immediately to the preceding local maximum.
int maxmod(const int n, int k)
{
int max = 0;
while (k > max + 1 && k > 4.0 * std::sqrt(n))
{
max = std::max(max, n % k);
k = std::min(k - 1, 1 + n / (1 + n / k));
}
for (; k > max + 1; --k)
max = std::max(max, n % k);
return max;
}
The magic constant 4.0 improves performance by reducing the number of iterations of the first (expensive) loop.
Worst case time complexity could be estimated as O(min(k, sqrt(n))). But for large enough k this estimate is probably too pessimistic: we could find the maximum much sooner, and if k is significantly greater than sqrt(n) we need only 1 or 2 iterations to find it.
I did some tests to determine how many iterations are needed in the worst case for different values of n:
n              max. iterations
               both   loop1   loop2
10^1..10^2       11       2      11
10^2..10^3       20       3      20
10^3..10^4       42       5      42
10^4..10^5       94      11      94
10^5..10^6      196      23     196
up to 10^7      379      43     379
up to 10^8      722      83     722
up to 10^9     1269     157    1269
Growth rate is noticeably better than O(sqrt(n)).

For k > n the problem is trivial (take x = n+1).
For k < n, think about the graph of remainders n % x. It looks the same for all n: the remainders fall to zero at the harmonics of n: n/2, n/3, n/4, after which they jump up, then smoothly decrease towards the next harmonic.
The solution is the rightmost local maximum below k. As a formula x = n//((n//k)+1)+1 (where // is integer division).

waves hands around
No value of x which is a factor of n can produce the maximum n%x, since if x is a factor of n then n%x=0.
Therefore, you would like a procedure which avoids considering any x that is a factor of n. But this means you want an easy way to know if x is a factor. If that were possible you would be able to do an easy prime factorization.
Since there is not a known easy way to do prime factorization there cannot be an "easy" way to solve your problem (I don't think you're going to find a single formula, some kind of search will be necessary).
That said, the prime factorization literature has cunning ways of getting factors quickly relative to a naive search, so perhaps it can be leveraged to answer your question.

Nice little puzzle!
Starting with the two trivial cases.
for n < k: any x s.t. n < x <= k solves.
for n = k: x = floor(k / 2) + 1 solves.
My attempts.
for n > k:
x = n
while (x > k) {
x = ceil(n / 2)
}
^---- Did not work.
x = floor(float(n) / (floor(float(n) / k) + 1)) + 1
x = ceil(float(n) / (floor(float(n) / k) + 1)) - 1
^---- "Close" (whatever that means), but did not work.
My pride inclines me to mention that I was first to utilize the greatest k-bounded harmonic, given by 1.
Solution.
In line with the other answers, I simply check the harmonics (term courtesy of @ColonelPanic) of n less than k, limiting by the present maximum value (courtesy of @TheGreatContini). This is the best of both worlds and I've tested it with random integers between 0 and 10000000 with success.
int maximalModulus(int n, int k) {
if (n < k) {
return n;
}
else if (n == k) {
return n % (k / 2 + 1);
}
else {
int max = -1;
int i = (n / k) + 1;
int x = 1;
while (x > max + 1) {
x = (n / i) + 1;
if (n%x > max) {
max = n%x;
}
++i;
}
return max;
}
}
Performance tests:
http://cpp.sh/72q6
Sample output:
Average number of loops:
bruteForce: 516
theGreatContini: 242.8
evgenyKluev: 2.28
maximalModulus: 1.36 // My solution

I'm wrong for sure, but it looks to me that it depends on if n < k or not.
I mean, if n < k, n%(n+1) gives you the maximum, so x = (n+1).
Well, on the other hand, you can start from j = k and go back evaluating n%j until it's equal to n, thus x = j is what you are looking for and you'll get it in max k steps... Too much, is it?

Okay, we want to know divisor that gives maximum remainder;
let n be a number to be divided and i be the divisor.
we are interested to find the maximum remainder when n is divided by i, for all i<n.
we know that, remainder = n - (n/i) * i //equivalent to n%i
If we observe the above equation to get maximum remainder we have to minimize (n/i)*i
minimum of n/i for any i<n is 1.
Note that, n/i == 1, for i<n, if and only if i>n/2
now we have, i>n/2.
The least possible value greater than n/2 is n/2+1.
Therefore, the divisor that gives maximum remainder, i = n/2+1
Here is the code in C++
#include <iostream>
using namespace std;
int maxRemainderDivisor(int n){
n = n>>1;
return n+1;
}
int main(){
int n;
cin>>n;
cout<<maxRemainderDivisor(n)<<endl;
return 0;
}
Time complexity: O(1)

Related

Speed problem for summation (sum of divisors)

I should implement this summation in C++. I have tried with this code, but with very high numbers up to 10^12 it takes too long.
The summation is d(1) + d(2) + ... + d(n), defined as follows.
For any positive integer k, let d(k) denote the number of positive divisors of k (including 1 and k itself).
For example, for the number 4: 1 has 1 divisor, 2 has two divisors, 3 has two divisors, and 4 has three divisors. So the result would be 8.
This is my code:
#include <iostream>
#include <algorithm>
using namespace std;
int findDivisors(long long n)
{
int c=0;
for(int j=1;j*j<=n;j++)
{
if(n%j==0)
{
c++;
if(j!=(n/j))
{
c++;
}
}
}
return c;
}
long long compute(long long n)
{
long long sum=0;
for(int i=1; i<=n; i++)
{
sum += (findDivisors(i));
}
return sum;
}
int main()
{
int n, divisors;
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
cin >> n;
cout << compute(n);
}
I think it's not just a simple optimization problem, but maybe I should change the algorithm entirely.
Would anyone have any ideas to speed it up? Thank you.
largest_prime_is_463035818's answer shows an O(N) solution, but the OP is trying to solve this problem
with very high numbers up to 10^12.
The following is an O(N^(1/2)) algorithm, based on some observations about the sum
n/1 + n/2 + n/3 + ... + n/n
In particular, we can count the number of terms with a specific value.
Consider all the terms n/k where k > n/2. There are n/2 of those and all are equal to 1 (integer division), so that their sum is n/2.
Similar considerations hold for the other dividends, so that we can write the following function
long long count_divisors(long long n)
{
auto sum{ n };
for (auto i{ 1ll }, k_old{ n }, k{ n }; i < k ; ++i, k_old = k)
{ // ^^^^^ it goes up to sqrt(n)
k = n / (i + 1);
sum += (k_old - k) * i;
if (i == k)
break;
sum += k;
}
return sum;
}
Here it is tested against the O(N) algorithm, the only difference in the results being the corner cases n = 0 and n = 1.
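The idea of counting how many terms share the same quotient can also be written as a loop over blocks of equal n / i. This is only a sketch of that grouping idea, not the answer's exact function; the name `divisorSumByBlocks` is hypothetical.

```cpp
#include <cassert>

// Sum of floor(n / i) for i = 1..n, which equals d(1) + d(2) + ... + d(n).
// Consecutive i with the same quotient are handled in one step.
long long divisorSumByBlocks(long long n) {
    assert(n >= 0);
    long long sum = 0;
    for (long long lo = 1; lo <= n; ) {
        long long q  = n / lo;        // common quotient of this block
        long long hi = n / q;         // last i for which n / i is still q
        sum += q * (hi - lo + 1);     // the block contributes q once per term
        lo = hi + 1;
    }
    return sum;
}
```

There are at most about 2 * sqrt(n) distinct quotients, so the loop runs O(sqrt(n)) times. For the test cases used in this thread, `divisorSumByBlocks(12)` gives 35 and `divisorSumByBlocks(100)` gives 482.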
Edit
Thanks again to largest_prime_is_463035818, who linked the Wikipedia page about the divisor summatory function, where both an O(N) and an O(sqrt(N)) algorithm are mentioned.
An implementation of the latter may look like this
auto divisor_summatory(long long n)
{
auto sum{ 0ll };
auto k{ 1ll };
for ( ; k <= n / k; ++k )
{
sum += n / k;
}
--k;
return 2 * sum - k * k;
}
They also add this statement:
Finding a closed form for this summed expression seems to be beyond the techniques available, but it is possible to give approximations. The leading behavior of the series is given by
D(x) = x log x + x(2γ - 1) + Δ(x)
where γ is the Euler–Mascheroni constant, and the error term is Δ(x) = O(sqrt(x)).
I used your brute force approach as reference to have test cases. The ones I used are
compute(12) == 35
compute(100) == 482
Don't get confused by computing factorizations. There are some tricks one can play when factorizing numbers, but you actually don't need any of that. The solution is a plain simple O(N) loop:
#include <iostream>
#include <limits>
long long compute(long long n){
long long sum = n+1;
for (long long i=2; i < n ; ++i){
sum += n/i;
}
return sum;
}
int main()
{
std::cout << compute(12) << "\n";
std::cout << compute(100) << "\n";
}
Output:
35
482
Why does this work?
The key is in Marc Glisse's comment:
As often with this kind of problem, this sum actually counts pairs x,
y where x divides y, and the sum is arranged to count first all x
corresponding to a fixed y, but nothing says you have to keep it that
way.
I could stop here, because the comment already explains it all. Though, if it didn't click yet...
The trick is to realize that it is much simpler to count divisors of all numbers up to n rather than n-times counting divisors of individual numbers and take the sum.
You don't need to care about factorizations of eg 123123123 or 52323423 to count all divisors up to 10000000000. All you need is a change of perspective. Instead of trying to factorize numbers, consider the divisors. How often does the divisor 1 appear up to n? Simple: n-times. How often does the divisor 2 appear? Still simple: n/2 times, because every second number is divisible by 2. Divisor 3? Every 3rd number is divisible by 3. I hope you can see the pattern already.
You could even reduce the loop to only loop till n/2, because bigger numbers obviously appear only once as divisor. Though I didn't bother to go further, because the biggest change is from your O(N * sqrt(N)) to O(N).
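The "loop only till n/2" remark can be sketched like this. The name `divisorSumHalfLoop` is hypothetical; the sketch assumes n >= 1 and relies on the observation that every divisor larger than n/2 occurs exactly once.

```cpp
#include <cassert>

// d(1) + ... + d(n), looping over candidate divisors only up to n / 2.
long long divisorSumHalfLoop(long long n) {
    assert(n >= 1);
    long long half = n / 2;
    long long sum = n - half;         // divisors in (n/2, n] each appear once
    for (long long i = 1; i <= half; ++i)
        sum += n / i;                 // divisor i appears n / i times
    return sum;
}
```

This halves the work but is still O(N), so it does not help for inputs near 10^12.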
Let's start off with some math and reduce the O(n * sqrt(n)) factorization to O(n * log(log(n))); for counting the sum of divisors the overall complexity is then O(n * log(log(n)) + n * n^(1/3)).
For instance:
In Codeforces himanshujaju explains how we can optimize the solution of finding divisors of a number.
I am simplifying it a little bit.
Let n be the product of three numbers p, q, and r,
so assume p * q * r = n, where p <= q <= r.
The maximum value of p = n^(1/3).
Now we can loop over all prime numbers in a range [2, n^(1/3)]
and try to reduce the time complexity of prime factorization.
We will split our number n into two numbers x and y => x * y = n.
And x contains prime factors up to n^(1/3) and y deals with higher prime factors greater than n^(1/3).
Thus gcd(x, y) = 1.
Now define F(n) as the number of divisors of n.
From multiplicative rules, we can say that
F(x * y) = F(x) * F(y), if gcd(x, y) = 1.
For finding F(n) => F(x * y) = F(x) * F(y)
So first find F(x); then F(y) is F(n/x).
And there will be 3 cases to cover for y:
1. y is a prime number: F(y) = 2.
2. y is the square of a prime number: F(y) = 3.
3. y is a product of two distinct prime numbers: F(y) = 4.
So once we are done with finding F(x) and F(y), we are also done with finding F(x * y) or F(n).
On CP-Algorithms there is also a nice explanation of how to count the number of divisors of a number. GeeksForGeeks also has a nice coding example of how to count the number of divisors of a number efficiently. One can check these articles to work out a solution to this problem.
C++ implementation
#include <bits/stdc++.h>
using namespace std;
const int maxn = 1e6 + 11;
bool prime[maxn];
bool primesquare[maxn];
int table[maxn]; // for storing primes
void SieveOfEratosthenes()
{
for(int i = 2; i < maxn; i++){
prime[i] = true;
}
for(int i = 0; i < maxn; i++){
primesquare[i] = false;
}
// 1 is not a prime number
prime[1] = false;
for(int p = 2; p * p < maxn; p++){
// If prime[p] is not changed, then
// it is a prime
if(prime[p] == true){
// Update all multiples of p
for(int i = p * 2; i < maxn; i += p){
prime[i] = false;
}
}
}
int j = 0;
for(int p = 2; p < maxn; p++) {
if (prime[p]) {
// Storing primes in an array
table[j] = p;
// Update value in primesquare[p * p],
// if p is prime.
if(p < maxn / p) primesquare[p * p] = true;
j++;
}
}
}
// Function to count divisors
int countDivisors(int n)
{
// If number is 1, then it will have only 1
// as a factor. So, total factors will be 1.
if (n == 1)
return 1;
// ans will contain total number of distinct
// divisors
int ans = 1;
// Loop for counting factors of n
for(int i = 0;; i++){
// table[i] is not less than cube root n
if(table[i] * table[i] * table[i] > n)
break;
// Calculating power of table[i] in n.
int cnt = 1; // cnt is (power of prime table[i] in n) + 1
while (n % table[i] == 0){ // if table[i] is a factor of n
n = n / table[i];
cnt = cnt + 1; // incrementing power
}
// Calculating the number of divisors
// If n = a^p * b^q then total divisors of n
// are (p+1)*(q+1)
ans = ans * cnt;
}
// if table[i] is greater than cube root of n
// First case
if (prime[n])
ans = ans * 2;
// Second case
else if (primesquare[n])
ans = ans * 3;
// Third case
else if (n != 1)
ans = ans * 4;
return ans; // Total divisors
}
int main()
{
SieveOfEratosthenes();
int sum = 0;
int n = 5;
for(int i = 1; i <= n; i++){
sum += countDivisors(i);
}
cout << sum << endl;
return 0;
}
Output
n = 4 => 8
n = 5 => 10
Complexity
Time complexity: O(n * log(log(n)) + n * n^(1/3))
Space complexity: O(n)
Thanks, @largest_prime_is_463035818, for pointing out my mistake.

Task on the number of iterations

There is a number N.
On every iteration it becomes equal to (N*2)-1.
I need to find out after how many steps the number becomes a multiple of the original N
(1 ≤ N ≤ 2 · 10^9).
For example:
N = 7; count = 0
N_ = 7*2-1 = 13; count = 1; N_ % N != 0
N_ = 13*2-1 = 25; count = 2; N_ % N != 0
N_ = 25*2-1 = 49; count = 3; N_ % N == 0
Answer is 3
If this is impossible, output -1.
#include <iostream>
using namespace std;
int main(){
int N,M,c;
cin >> N;
if (N%2==0) {
cout << -1;
return 0;
}
M = N*2-1;
c = 1;
while (M%N!=0){
c+=1;
M=M*2-1;
}
cout << c;
return 0;
}
It does not fit within the 1 second time limit. How can I optimize the algorithm?
P.S. All the answers given are optimized, but they still don't fit in 1 second, because the algorithm needs to change fundamentally. The solution was to use Euler's theorem.
The problem, as other answers have suggested, is equivalent to finding c such that pow(2, c) = 1 mod N. This is impossible if N is even, and possible otherwise (as your code suggests you know).
A linear-time approach is:
int c = 1;
uint64_t m = 2;
while (m != 1){
c += 1;
m = (2*m)%N;
}
printf("%d\n", c);
To solve this in 1 second, I don't think you can use a linear-time algorithm. The worst cases will be when N is prime and large. For example 1999999817 for which the above code runs in around 10 seconds on my laptop.
Instead, factor N into its prime factors. Solve 2^c = 1 mod p^k for each prime power p^k that appears in the prime factorization of N. Then combine the results by taking their least common multiple (by the Chinese Remainder theorem, 2^c = 1 mod N exactly when it holds modulo every p^k).
When finding the c for a given prime power, if k=1, the solution is a divisor of p-1 (by Fermat's little theorem), so only the divisors of p-1 need to be tested. When k is larger, the details are quite messy, but you can find a written solution here: https://math.stackexchange.com/questions/1863037/discrete-logarithm-modulo-powers-of-a-small-prime
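A minimal sketch of a sub-linear approach, under stated assumptions. Rather than solving per prime power, it uses a closely related standard technique: Euler's theorem guarantees 2^φ(N) = 1 mod N for odd N, so the smallest c divides φ(N) and can be found by stripping prime factors from φ(N). All function names here are hypothetical, and trial division is assumed to be fast enough for N up to 2 · 10^9.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// (a * b) % m; safe from overflow because all operands stay below 2^32.
static std::uint64_t mulMod(std::uint64_t a, std::uint64_t b, std::uint64_t m) {
    return (a * b) % m;
}

// base^exp mod m by repeated squaring.
static std::uint64_t powMod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = mulMod(result, base, m);
        base = mulMod(base, base, m);
        exp >>= 1;
    }
    return result;
}

// Distinct prime factors of v, found by trial division.
static std::vector<std::uint64_t> primeFactors(std::uint64_t v) {
    std::vector<std::uint64_t> primes;
    for (std::uint64_t p = 2; p * p <= v; ++p) {
        if (v % p == 0) {
            primes.push_back(p);
            while (v % p == 0) v /= p;
        }
    }
    if (v > 1) primes.push_back(v);
    return primes;
}

// Smallest c >= 1 with 2^c == 1 (mod n), or -1 if n is even.
long long stepsViaOrder(std::uint64_t n) {
    assert(n >= 1 && n <= 2000000000ULL);
    if (n == 1) return 1;             // every number is a multiple of 1
    if (n % 2 == 0) return -1;        // 2 is not invertible modulo an even n
    std::uint64_t phi = n;
    for (std::uint64_t p : primeFactors(n)) phi = phi / p * (p - 1);
    std::uint64_t order = phi;        // Euler: 2^phi(n) == 1 (mod n)
    for (std::uint64_t q : primeFactors(phi))
        while (order % q == 0 && powMod(2, order / q, n) == 1)
            order /= q;               // drop q while the congruence still holds
    return static_cast<long long>(order);
}
```

For the example in the question, `stepsViaOrder(7)` returns 3. The cost is dominated by the two trial-division factorizations, roughly O(sqrt(N)) each.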
The problem is that you're overflowing: the int data type only has 32 bits and overflows past 2^31-1. In this problem you don't need to keep the actual value of M; you can just keep it modulo N.
while (M%N!=0){
c+=1;
M=M*2-1;
M%=N;
}
Edit: In addition, you don't actually need more than N iterations to check whether a 0 remainder exists, as there are only N different remainders modulo N and the sequence just keeps cycling. So you also need to keep that in mind in case there is no 0 remainder.
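Both points (keep only the remainder, and stop after N iterations) can be combined in a short sketch. The function name `stepsUntilMultiple` is hypothetical, and 64-bit arithmetic is assumed so that 2 * m + n - 1 cannot overflow for N up to 2 · 10^9.

```cpp
#include <cassert>

// Number of steps until the sequence M -> 2*M - 1 (starting at N) hits a
// multiple of N, or -1 if that never happens. Only M mod N is tracked.
long long stepsUntilMultiple(long long n) {
    assert(n >= 1);
    long long m = 0;                  // N itself is 0 modulo N
    for (long long c = 1; c <= n; ++c) {
        m = (2 * m + n - 1) % n;      // (2*M - 1) mod N, kept non-negative
        if (m == 0) return c;
    }
    return -1;                        // N steps without reaching 0: it never will
}
```

This removes the overflow and the endless loop, but it is still linear in N, so it does not by itself meet the 1 second limit for large prime N.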
There is no doubt that the main problem with your code is signed integer overflow.
I added a print of M whenever M was changed (i.e. cout << M << endl;) and gave it the input 29. This is what I got:
57
113
225
449
897
1793
3585
7169
14337
28673
57345
114689
229377
458753
917505
1835009
3670017
7340033
14680065
29360129
58720257
117440513
234881025
469762049
939524097
1879048193
-536870911
-1073741823
-2147483647
1
1
1
1
... endless loop
As you see you have signed integer overflow. That is undefined behavior in C so anything may happen!! On my machine I ended up with a nasty endless loop. That must be fixed before considering performance.
The simple fix is to add a line like
M = M % N;
whenever M is changed. See the answer from @Malek.
Besides that, you should also use an unsigned integer type, i.e. use uint32_t for all variables.
However, that will not improve performance.
If you still have performance issue after the above fix, you can try this instead:
uint32_t N;
cin >> N;
if (N%2==0) {
cout << -1;
return 0;
}
// Alternative algorithm
uint32_t R,c;
R = 1;
c = 1;
while (R != N){
R = 2*R + 1;
if (R > N) R = R - N;
++c;
}
cout << c;
On my laptop this algorithm is 2.5 times faster when testing on all odd numbers in the range 1..100000. However, it might not be sufficient for all numbers in the range 1..2*10^9.
Also notice the use of uint32_t to avoid integer overflow.

Count the number of proper fractions

A fraction p/q (p and q are positive integers) is proper if p/q < 1. Given 3 <= n <= 50 000 000, write a program to count the number of proper fractions p/q such that p + q = n, and p, q are relatively prime (their greatest common divisor is 1).
This is my code
bool prime_pairs(int x, int y) {
int t = 0;
while (y != 0) {
t = y;
y = x % y;
x = t;
}
return (x == 1);
}
void proer_fractions(int n) {
int num = n % 2 == 0 ? n / 2 - 1 : n / 2, den = 0, count = 0;
for (; num > 0; --num) {
den = n - num;
if (prime_pairs(num, den))
++count;
}
printf("%d", count);
}
I wonder if I did it correctly. Is there any way to speed up the algorithm? It took my laptop (i7-4700mq) 2.97 seconds to run in Release mode when n = 50 000 000.
Thank you very much.
The key fact is that if p + q = n and gcd(p, q) = k then k must divide n. Conversely, if p is coprime to n then q = n - p must be coprime to p.
Hence the problem of counting coprime pairs (p, q) that sum to n effectively reduces to counting the numbers that are coprime to n (Euler's totient, a.k.a. phi) and dividing that count by 2.
There is plenty of code for computing the totient of a number out there on the 'net, like in the GeeksForGeeks article Euler's Totient Function. It essentially boils down to factoring the number, which should be quite a bit faster than your current algorithm (about 5 orders of magnitude). Have fun!
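A minimal sketch of the totient approach, assuming trial-division factoring is acceptable for n up to 50 000 000. The names `eulerPhi` and `countProperFractions` are hypothetical.

```cpp
#include <cassert>

// Euler's totient via trial division: multiply n by (1 - 1/p) for each
// distinct prime factor p of n.
long long eulerPhi(long long n) {
    long long result = n;
    for (long long p = 2; p * p <= n; ++p) {
        if (n % p == 0) {
            while (n % p == 0) n /= p;
            result -= result / p;
        }
    }
    if (n > 1) result -= result / n;  // one prime factor above sqrt may remain
    return result;
}

// Number of coprime pairs (p, q) with p < q and p + q = n.
long long countProperFractions(long long n) {
    assert(n >= 3);
    return eulerPhi(n) / 2;           // each pair is counted once with p < q
}
```

For n = 50 000 000 = 2^7 * 5^8 the totient is 20 000 000, so `countProperFractions(50000000)` returns 10 000 000 after only a few thousand loop iterations.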

Sum of Greatest Common Divisor of all numbers till n with n

There are n numbers from 1 to n. I need to find the
∑gcd(i,n) where i=1 to i=n
for n of the range 10^7. I used euclid's algorithm for gcd but it gave TLE. Is there any efficient method for finding the above sum?
#include<bits/stdc++.h>
using namespace std;
typedef long long int ll;
int gcd(int a, int b)
{
return b == 0 ? a : gcd(b, a % b);
}
int main()
{
ll n,sum=0;
scanf("%lld",&n);
for(int i=1;i<=n;i++)
{
sum+=gcd(i,n);
}
printf("%lld\n",sum);
return 0;
}
You can do it via bulk GCD calculation.
You should find all the prime divisors of n and their powers. This can be done in O(sqrt(N)) time.
After that, build the GCD table.
My code snippet is in C#; it is not difficult to convert to C++.
int[] gcd = new int[x + 1];
for (int i = 1; i <= x; i++) gcd[i] = 1;
for (int i = 0; i < p.Length; i++)
for (int j = 0, h = p[i]; j < c[i]; j++, h *= p[i])
for (long k = h; k <= x; k += h)
gcd[k] *= p[i];
long sum = 0;
for (int i = 1; i <= x; i++) sum += gcd[i];
p is the array of prime divisors and c holds the power of each divisor.
For example if n = 125
p = [5]
c = [3]
125 = 5^3
if n = 12
p = [2,3]
c = [2,1]
12 = 2^2 * 3^1
I've just implemented the GCD algorithm between two numbers, which is quite easy, but I can't get what you are trying to do there.
What I read there is that you are trying to sum up a series of GCDs; but a GCD is the result of a series of mathematical operations, between two or more numbers, which result in a single value.
I'm no mathematician, but I think that "sigma" as you wrote it means that you are trying to sum up the GCD of the numbers between 1 and 10.000.000; which doesn't make sense at all, for me.
What are the values you are trying to find the GCD of? All the numbers between 1 and 10.000.000? I doubt that's it.
Anyway, here's a very basic (and hurried) implementation of Euclid's GCD algorithm:
int num1=0, num2=0;
cout << "Insert the first number: ";
cin >> num1;
cout << "\n\nInsert the second number: ";
cin >> num2;
cout << "\n\n";
fflush(stdin);
while ((num1 > 0) && (num2 > 0))
{
if ((num1 - num2) > 0)
{
//cout << "..case1\n";
num1 -= num2;
}
else if ((num2 - num1) > 0)
{
//cout << "..case2\n";
num2 -= num1;
}
else if (num1 == num2)
{
cout << ">>GCD = " << num1 << "\n\n";
break;
}
}
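For comparison, here is a sketch of the remainder-based form of Euclid's algorithm together with the sum the question actually asks for, gcd(1, n) + gcd(2, n) + ... + gcd(n, n). The names `gcdEuclid` and `gcdSumBrute` are hypothetical; the brute-force sum is only useful as a reference for small n.

```cpp
#include <cassert>

// Euclid's algorithm using remainders instead of repeated subtraction.
long long gcdEuclid(long long a, long long b) {
    assert(a >= 0 && b >= 0);
    while (b != 0) {
        long long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Reference value of the requested sum, computed term by term.
long long gcdSumBrute(long long n) {
    long long sum = 0;
    for (long long i = 1; i <= n; ++i)
        sum += gcdEuclid(i, n);
    return sum;
}
```

For n = 12 the terms are 1, 2, 3, 4, 1, 6, 1, 4, 3, 2, 1, 12, which add up to 40.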
A good place to start looking at this problem is here at the Online Encyclopedia of Integer Sequences as what you are trying to do is compute the sum of the sequence A018804 between 1 and N. As you've discovered approaches that try to use simple Euclid GCD function are too slow so what you need is a more efficient way to calculate the result.
According to one paper linked from the OEIS it's possible to rewrite the sum in terms of Euler's function. This changes the problem into one of prime factorisation - still not easy but likely to be much faster than brute force.
I had occasion to study the computation of GCD sums because the problem cropped up in a HackerEarth tutorial named GCD Sum. Googling turned up some academic papers with useful formulas, which I'm reporting here since they aren't mentioned in the MathOverflow article linked by deviantfan.
For coprime m and n (i.e. gcd(m, n) == 1) the function is multiplicative:
gcd_sum[m * n] = gcd_sum[m] * gcd_sum[n]
Powers e of primes p:
gcd_sum[p^e] = (e + 1) * p^e - e * p^(e - 1)
If only a single sum is to be computed then these formulas could be applied to the result of factoring the number in question, which would still be way faster than repeated gcd() calls or going through the rigmarole proposed by Толя.
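Applying the two formulas to a single number can be sketched as below, assuming trial-division factoring and 64-bit results. The name `gcdSum` is hypothetical.

```cpp
#include <cassert>

// Sum of gcd(i, n) for i = 1..n, using multiplicativity and the
// prime-power formula (e + 1) * p^e - e * p^(e - 1).
long long gcdSum(long long n) {
    assert(n >= 1);
    long long result = 1;
    for (long long p = 2; p * p <= n; ++p) {
        if (n % p != 0) continue;
        long long e = 0, pe = 1;      // pe holds p^e
        while (n % p == 0) { n /= p; ++e; pe *= p; }
        result *= (e + 1) * pe - e * (pe / p);
    }
    if (n > 1) result *= 2 * n - 1;   // leftover prime with exponent 1
    return result;
}
```

For n = 12 = 2^2 * 3 this gives (3 * 4 - 2 * 2) * (2 * 3 - 1) = 8 * 5 = 40, and for n = 125 = 5^3 it gives 4 * 125 - 3 * 25 = 425.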
However, the formulas could just as easily be used to compute whole tables of the function efficiently. Basically, all you have to do is plug them into the algorithm for linear time Euler totient calculation and you're done - this computes all GCD sums up to a million much faster than you can compute the single GCD sum for the number 10^6 by way of calls to a gcd() function. Basically, the algorithm efficiently enumerates the least factor decompositions of the numbers up to n in a way that makes it easy to compute any multiplicative function - Euler totient (a.k.a. phi), the sigmas or, in fact, GCD sums.
Here's a bit of hashish code that computes a table of GCD sums for smallish limits - ‘small’ in the sense that sqrt(N) * N does not overflow a 32-bit signed integer. IOW, it works for a limit of 10^6 (plenty enough for the HackerEarth task with its limit of 5 * 10^5) but a limit of 10^7 would require sticking (long) casts in a couple of strategic places. However, such hardening of the function for operation at higher ranges is left as the proverbial exercise for the reader... ;-)
static int[] precompute_Pillai (int limit)
{
var small_primes = new List<ushort>();
var result = new int[1 + limit];
result[1] = 1;
int n = 2, small_prime_limit = (int)Math.Sqrt(limit);
for (int half = limit / 2; n <= half; ++n)
{
int f_n = result[n];
if (f_n == 0)
{
f_n = result[n] = 2 * n - 1;
if (n <= small_prime_limit)
{
small_primes.Add((ushort)n);
}
}
foreach (int prime in small_primes)
{
int nth_multiple = n * prime, e = 1, p = 1; // 1e6 * 1e3 < INT_MAX
if (nth_multiple > limit)
break;
if (n % prime == 0)
{
if (n == prime)
{
f_n = 1;
e = 2;
p = prime;
}
else break;
}
for (int q; ; ++e, p = q)
{
result[nth_multiple] = f_n * ((e + 1) * (q = p * prime) - e * p);
if ((nth_multiple *= prime) > limit)
break;
}
}
}
for ( ; n <= limit; ++n)
if (result[n] == 0)
result[n] = 2 * n - 1;
return result;
}
As promised, this computes all GCD sums up to 500,000 in 12.4 ms, whereas computing the single sum for 500,000 via gcd() calls takes 48.1 ms on the same machine. The code has been verified against an OEIS list of the Pillai function (A018804) up to 2000, and up to 500,000 against a gcd-based function - an undertaking that took a full 4 hours.
There's a whole range of optimisations that could be applied to make the code significantly faster, like replacing the modulo division with a multiplication (with the inverse) and a comparison, or to shave some more milliseconds by way of stepping the ‘prime cleaner-upper’ loop modulo 6. However, I wanted to show the algorithm in its basic, unoptimised form because (a) it is plenty fast as it is, and (b) it could be useful for other multiplicative functions, not just GCD sums.
P.S.: modulo testing via multiplication with the inverse is described in section 9 of the Granlund/Montgomery paper Division by Invariant Integers using Multiplication but it is hard to find info on efficient computation of inverses modulo powers of 2. Most sources use the Extended Euclid's algorithm or similar overkill. So here comes a function that computes multiplicative inverses modulo 2^32:
static uint ModularInverse (uint n)
{
uint x = 2 - n;
x *= 2 - x * n;
x *= 2 - x * n;
x *= 2 - x * n;
x *= 2 - x * n;
return x;
}
That's effectively five iterations of Newton-Raphson, in case anyone cares. ;-)
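A C++ rendering of the same idea, with the divisibility test it enables, might look like the sketch below. The names `modularInverse` and `isDivisible` are hypothetical, the divisor is assumed to be odd, and `std::uint32_t` arithmetic is assumed to wrap modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd n modulo 2^32.
std::uint32_t modularInverse(std::uint32_t n) {
    assert(n % 2 == 1);
    std::uint32_t x = 2u - n;         // n * x == 1 modulo 4 (2 correct bits)
    for (int i = 0; i < 4; ++i)
        x *= 2u - x * n;              // each step doubles the correct bits
    return x;
}

// True when the odd divisor d divides n; dInverse must be modularInverse(d).
bool isDivisible(std::uint32_t n, std::uint32_t d, std::uint32_t dInverse) {
    // Multiples of d map to the quotients 0..UINT32_MAX/d, everything else
    // maps to a larger value.
    return n * dInverse <= UINT32_MAX / d;
}
```

The inverse is computed once per divisor, after which each divisibility test costs one multiplication and one comparison instead of a division.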
You can use a sieve to store the lowest prime factor of every number less than or equal to 10^7,
and then calculate your answer directly from the prime factorization of the given number.
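A sketch of that suggestion, assuming the prime-power formula for the GCD sum, (e + 1) * p^e - e * p^(e - 1), quoted in an earlier answer. The names `smallestPrimeFactors` and `gcdSumFromSieve` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// spf[v] is the smallest prime factor of v for 2 <= v <= limit.
std::vector<int> smallestPrimeFactors(int limit) {
    std::vector<int> spf(limit + 1, 0);
    for (int i = 2; i <= limit; ++i) {
        if (spf[i] != 0) continue;    // already marked, so i is composite
        for (int j = i; j <= limit; j += i)
            if (spf[j] == 0) spf[j] = i;
    }
    return spf;
}

// Sum of gcd(i, n) for i = 1..n, factoring n by repeated table lookups.
long long gcdSumFromSieve(int n, const std::vector<int>& spf) {
    assert(n >= 1 && n < static_cast<int>(spf.size()));
    long long result = 1;
    while (n > 1) {
        int p = spf[n];
        long long e = 0, pe = 1;      // pe holds p^e
        while (n % p == 0) { n /= p; ++e; pe *= p; }
        result *= (e + 1) * pe - e * (pe / p);
    }
    return result;
}
```

Building the table once makes each later query cost only O(log n) divisions, which pays off when many values of n are queried.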

to optimize the nested loops

for( a=1; a <= 25; a++){
num1 = m[a];
for( b=1; b <= 25; b++){
num2 = m[b];
for( c=1; c <= 25; c++){
num3 = m[c];
for( d=1; d <= 25; d++){
num4 = m[d];
for( e=1; e <= 25; e++){
num5 = m[e];
for( f=1; f <= 25; f++){
num6 = m[f];
for( g=1; g <= 25; g++){
num7 = m[g];
for( h=1; h <= 25; h++){
num8 = m[h];
for( i=1; i <= 25; i++){
num = num1*100000000 + num2*10000000 +
num3* 1000000 + num4* 100000 +
num5* 10000 + num6* 1000 +
num7* 100 + num8* 10 + m[i];
check_prime = 1;
for ( y=2; y <= num/2; y++)
{
if ( num % y == 0 )
check_prime = 0;
}
if ( check_prime != 0 )
{
array[x++] = num;
}
num = 0;
}}}}}}}}}
The above code takes a hell lot of time to finish executing.. In fact it doesn't even finish executing, What can i do to optimize the loop and speed up the execution?? I am newbie to cpp.
Replace this code with code using a sensible algorithm, such as the Sieve of Eratosthenes. The most important "optimization" is choosing the right algorithm in the first place.
If your algorithm for sorting numbers is to swap them randomly until they're in order, it doesn't matter how much you optimize the selecting of the random entries, swapping them, or checking if they're in order. A bad algorithm will mean bad performance regardless.
You're checking 25^9 = 3,814,697,265,625 numbers whether they're prime. That's a lot of prime tests and will always take long. Even in the best case (for performance) when all array entries (in m) are 0 (never mind that the test considers 0 a prime), so that the trial division loop never runs, it will take hours to run. When all entries of m are positive, the code as is will run for hundreds or thousands of years, since then each number will be trial-divided by more than 50,000,000 numbers.
Looking at the prime check,
check_prime = 1;
for ( y = 2; y <= num/2; y++)
{
if ( num % y == 0 )
check_prime = 0;
}
the first glaring inefficiency is that the loop continues even after a divisor has been found and the compositeness of num established. Break out of the loop as soon as you know the outcome.
check_prime = 1;
for ( y = 2; y <= num/2; y++)
{
if ( num % y == 0 )
{
check_prime = 0;
break;
}
}
In the unfortunate case that all numbers you test are prime, that won't change a thing, but if all (or almost all, for sufficiently large values of almost) the numbers are composite, it will cut the running time by a factor of at least 5000.
The next thing is that you divide up to num/2. That is not necessary. Why do you stop at num/2, and not at num - 1? Well, because you figured out that the largest proper divisor of num cannot be larger than num/2 because if (num >) k > num/2, then 2*k > num and num is not a multiple of k.
That's good, not everybody sees that.
But you can pursue that train of thought further. If num/2 is a divisor of num, that means num = 2*(num/2) (using integer division, with the exception of num = 3). But then num is even, and its compositeness was already determined by the division by 2, so the division by num/2 will never be tried if it succeeds.
So what's the next possible candidate for the largest divisor that needs to be considered? num/3 of course. But if that's a divisor of num, then num = 3*(num/3) (unless num < 9) and the division by 3 has already settled the question.
Going on, if k < √num and num/k is a divisor of num, then num = k*(num/k) and we see that num has a smaller divisor, namely k (possibly even smaller ones).
So the smallest nontrivial divisor of num is less than or equal to √num. Thus the loop needs only run for y <= √num, or y*y <= num. If no divisor has been found in that range, num is prime.
Now the question arises whether to loop
for(y = 2; y*y <= num; ++y)
or
root = floor(sqrt(num));
for(y = 2; y <= root; ++y)
The first needs one multiplication for the loop condition in each iteration, the second one computation of the square root outside the loop.
Which is faster?
That depends on the average size of num and whether many are prime or not (more precisely, on the average size of the smallest prime divisor). Computing a square root takes much longer than a multiplication, to compensate that cost, the loop must run for many iterations (on average) - whether "many" means more than 20, more than 100 or more than 1000, say, depends. With num larger than 10^8, as is probably the case here, probably computing the square root is the better choice.
Now we have bounded the number of iterations of the trial division loop to √num whether num is composite or prime and reduced the running time by a factor of at least 5000 (assuming that all m[index] > 0, so that always num >= 10^8) regardless of how many primes are among the tested numbers. If most values num takes are composites with small prime factors, the reduction factor is much larger, to the extent that normally, the running time is almost completely used for testing primes.
Further improvement can be obtained by reducing the number of divisor candidates. If num is divisible by 4, 6, 8, ..., then it is also divisible by 2, so num % y never yields 0 for even y > 2. That means all these divisions are superfluous. By special casing 2 and incrementing the divisor candidate in steps of 2,
if (num % 2 == 0)
{
check_prime = 0;
} else {
root = floor(sqrt(num));
for(y = 3; y <= root; y += 2)
{
if (num % y == 0)
{
check_prime = 0;
break;
}
}
}
the number of divisions to perform and the running time is roughly halved (assuming enough bad cases that the work for even numbers is negligible).
Now, whenever y is a multiple of 3 (other than 3 itself), num % y will only be computed when num is not a multiple of 3, so these divisions are also superfluous. You can eliminate them by also special-casing 3 and letting y run through only the odd numbers that are not divisible by 3 (start with y = 5, increment by 2 and 4 alternatingly). That chops off roughly a third of the remaining work (if enough bad cases are present).
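Special-casing 2 and 3 and then stepping by 2 and 4 alternately can be sketched as a standalone function. The name `isPrimeWheel` is hypothetical, and the loop uses the y * y <= num form of the square-root bound to stay in integer arithmetic.

```cpp
// Trial division that skips all multiples of 2 and 3.
bool isPrimeWheel(long long num) {
    if (num < 2) return false;
    if (num < 4) return true;         // 2 and 3 are prime
    if (num % 2 == 0 || num % 3 == 0) return false;
    // Remaining candidates have the form 6k - 1 and 6k + 1: 5, 7, 11, 13, ...
    for (long long y = 5; y * y <= num; y += 6)
        if (num % y == 0 || num % (y + 2) == 0) return false;
    return true;
}
```

Testing y and y + 2 in each iteration and then adding 6 is the same as incrementing by 2 and 4 alternately.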
Continuing that elimination process, we need only divide num by the primes not exceeding √num to find whether it's prime or not.
So usually it would be a good idea to find the primes not exceeding the square root of the largest num you'll check, store them in an array and loop
root = floor(sqrt(num));
for(k = 0, y = primes[0]; k < prime_count && (y = primes[k]) <= root; ++k)
{
if (num % y == 0)
{
check_prime = 0;
break;
}
}
If the largest value num can take is small enough, for example if you'll always have num < 2^31, then you should instead find the primes up to that limit in a bit-sieve so that you can look up whether num is prime in constant time (a sieve of 2^31 bits takes 256 MB; if you only keep flags for the odd numbers [which needs special-casing to check whether num is even], you only need 128 MB to check the primality of numbers < 2^31 in constant time; further reduction of the required space for the sieve is possible).
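A small sketch of an odd-only sieve with constant-time lookup follows. It assumes `std::vector<bool>` as the bit container (it is typically stored one bit per entry); the names `buildOddSieve` and `isPrimeLookup` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// composite[i] describes the odd number 2*i + 1; even numbers are not stored.
std::vector<bool> buildOddSieve(std::size_t limit) {
    std::vector<bool> composite((limit + 1) / 2, false);
    for (std::size_t p = 3; p * p <= limit; p += 2) {
        if (composite[p / 2]) continue;
        // Start at p*p and step by 2*p so that only odd multiples are visited.
        for (std::size_t m = p * p; m <= limit; m += 2 * p)
            composite[m / 2] = true;
    }
    return composite;
}

// Constant-time primality lookup; n must not exceed the sieve's limit.
bool isPrimeLookup(const std::vector<bool>& composite, std::size_t n) {
    if (n == 2) return true;
    if (n < 2 || n % 2 == 0) return false;   // even numbers are special-cased
    assert(n / 2 < composite.size());
    return !composite[n / 2];
}
```

In real use the sieve would be built once and kept around; rebuilding it for each query would defeat the purpose.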
So far for the prime test itself.
If the m array contains numbers divisible by 2 or by 5, it may be worthwhile to reorder the loops, have the loop for i the outermost, and skip the inner loops if m[i] is divisible by 2 or by 5 - all the other numbers are multiplied by powers of 10 before adding, so then num would be a multiple of 2 resp. 5 and not prime.
But, despite all that, it will still take long to run the code. Nine nested loops reek of a wrong design.
What is it that you try to do? Maybe we can help finding the correct design.
We can eliminate a lot of redundant calculations by calculating each part of the number as it becomes available. This also shows the trial division test for primality on 2-3 wheel up to the square root of a number:
// array m[] is assumed sorted in descending order NB!
// a macro to skip over the duplicate digits
#define I(x) while( x<25 && m[x+1]==m[x] ) ++x;
for( a=1; a <= 25; a++) {
num1 = m[a]*100000000;
for( b=1; b <= 25; b++) if (b != a) {
num2 = num1 + m[b]*10000000;
for( c=1; c <= 25; c++) if (c != b && c != a) {
num3 = num2 + m[c]*1000000;
for( d=1; d <= 25; d++) if (d!=c && d!=b && d!=a) {
num4 = num3 + m[d]*100000;
for( e=1; e <= 25; e++) if (e!=d && e!=c && e!=b && e!=a) {
num5 = num4 + m[e]*10000;
for( f=1; f <= 25; f++) if (f!=e&&f!=d&&f!=c&&f!=b&&f!=a) {
num6 = num5 + m[f]*1000;
limit = floor( sqrt( num6+1000 )); ///
for( g=1; g <= 25; g++) if (g!=f&&g!=e&&g!=d&&g!=c&&g!=b&&g!=a) {
num7 = num6 + m[g]*100;
for( h=1; h <= 25; h++) if (h!=g&&h!=f&&h!=e&&h!=d&&h!=c&&h!=b&&h!=a) {
num8 = num7 + m[h]*10;
for( i=1; i <= 25; i++) if (i!=h&&i!=g&&i!=f&&i!=e&&i!=d
&&i!=c&&i!=b&&i!=a) {
num = num8 + m[i];
if( num % 2 != 0 && num % 3 != 0 ) {
is_prime = 1;
for ( y=5; y <= limit; y+=6) {
if ( num % y == 0 ) { is_prime = 0; break; }
if ( num % (y+2) == 0 ) { is_prime = 0; break; }
}
if ( is_prime ) { return( num ); } // largest prime found
}I(i)}I(h)}I(g)}I(f)}I(e)}I(d)}I(c)}I(b)}I(a)}
This code also eliminates the duplicate indices. As you've indicated in the comments, you pick your numbers out of a 5x5 grid. That means that you must use all different indices. This will bring down the count of numbers to test from 25^9 = 3,814,697,265,625 to 25*24*23*...*17 = 741,354,768,000.
Since you've now indicated that all entries in the m[] array are less than 10, there are certain to be duplicates, which need to be skipped when searching. As Daniel points out, when searching from the top, the first prime found will be the biggest. This is achieved by pre-sorting the m[] array in descending order.