Here is the question:
The sum of the primes below 10 is 2+3+5+7=17.
Find the sum of all the primes not greater than given N.
Input Format :
The first line contains an integer T i.e. number of the test cases.
Each of the next T lines contains an integer N.
Output Format :
Print the value corresponding to each test case on a separate line.
Constraints :
1≤T≤10^4
1≤N≤10^6
https://www.hackerrank.com/contests/projecteuler/challenges/euler010
This is the link to the question.
So, I attempted to solve this question using the sieve of Eratosthenes.
I precalculated all primes below 10^6, which is the given limit for N.
6 out of the 7 test cases were accepted, but the last test case gave a timeout (TLE).
I read the discussion forum, and there they say that in order to solve the question we also need to precalculate the sums of the primes.
So, I tried making an array of long long ints and storing all the sums in it, but this gives me a segmentation fault.
So, how am I supposed to precalculate the sums of the primes?
Here is my code:
#include "header.h" //MAX is defined to be 1000000
bool sieve[MAX + 1]; // false = prime, true = composite
int main(void){
//0 and 1 are not primes
sieve[0] = sieve[1] = true;
//input limiting value
int n = MAX;
//cross out even numbers
for(int i = 4; i <= n; i += 2){
sieve[i] = true;
}
//use sieve of eratosthenes
for(int i = 3; i <= static_cast<int>(sqrt(n)); i += 2){
if(sieve[i] == false){
for(int j = i * i; j <= n; j += i)
sieve[j] = true;
}
}
long long p, ans = 0;
int t;
std::cin >> t;
while(t--){
std::cin >> p;
for(int i = 0; i <= p; ++i)
if(sieve[i] == false)
ans += i;
std::cout << ans << std::endl;
ans = 0;
}
return 0;
}
Given an array of primes prime[N], precomputing sums of primes can be done in a single for loop like this:
long long sum[N]; // long long: the sum of all primes up to 10^6 does not fit in a 32-bit int
sum[0] = prime[0];
for (int i = 1 ; i < N ; i++) {
sum[i] = prime[i]+sum[i-1];
}
You can use this array together with prime[] by running a binary search on prime[], and picking the sum at the same position if the number being searched for is prime, or at the prior position if it is not.
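A minimal sketch of that idea, under stated assumptions: the names `PrimeSums`, `build_prime_sums` and `sum_primes_up_to` are made up for illustration, the primes come from a plain sieve, and `std::upper_bound` plays the role of the binary search. Storing the tables in `std::vector` keeps them off the stack.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: the primes up to some limit plus their running sums.
struct PrimeSums {
    std::vector<int> primes;        // primes in increasing order
    std::vector<long long> sums;    // sums[i] = primes[0] + ... + primes[i]
};

// Sieve once, collecting each prime and the running total as we go.
PrimeSums build_prime_sums(int limit) {
    assert(limit >= 0);
    std::vector<bool> composite(limit + 1, false);
    PrimeSums ps;
    long long running = 0;          // long long: too large for 32 bits when limit = 10^6
    for (int i = 2; i <= limit; ++i) {
        if (composite[i]) continue;
        running += i;
        ps.primes.push_back(i);
        ps.sums.push_back(running);
        for (long long j = 1LL * i * i; j <= limit; j += i)
            composite[j] = true;
    }
    return ps;
}

// Sum of all primes <= n, found by binary search on the prime list.
long long sum_primes_up_to(const PrimeSums& ps, int n) {
    // upper_bound returns the position of the first prime greater than n.
    auto it = std::upper_bound(ps.primes.begin(), ps.primes.end(), n);
    if (it == ps.primes.begin()) return 0;   // no prime <= n
    return ps.sums[(it - ps.primes.begin()) - 1];
}
```

With the table built once via `build_prime_sums(1000000)`, each test case reduces to one call such as `sum_primes_up_to(ps, 10)`, which yields 17 as in the problem statement.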
I need to implement this summation in C++. I have tried this code, but for very large inputs, up to 10^12, it takes too long.
The summation is:
For any positive integer k, let d(k) denote the number of positive divisors of k (including 1 and k itself).
For example, for the number 4: 1 has 1 divisor, 2 has two divisors, 3 has two divisors, and 4 has three divisors. So the result would be 8.
This is my code:
#include <iostream>
#include <algorithm>
using namespace std;
int findDivisors(long long n)
{
int c=0;
for(int j=1;j*j<=n;j++)
{
if(n%j==0)
{
c++;
if(j!=(n/j))
{
c++;
}
}
}
return c;
}
long long compute(long long n)
{
long long sum=0;
for(int i=1; i<=n; i++)
{
sum += (findDivisors(i));
}
return sum;
}
int main()
{
int n, divisors;
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
cin >> n;
cout << compute(n);
}
I think it's not just a simple optimization problem, but maybe I should change the algorithm entirely.
Would anyone have any ideas to speed it up? Thank you.
largest_prime_is_463035818's answer shows an O(N) solution, but the OP is trying to solve this problem
with very high numbers up to 10^12.
The following is an O(N^(1/2)) algorithm, based on some observations about the sum
n/1 + n/2 + n/3 + ... + n/n
In particular, we can count the number of terms with a specific value.
Consider all the terms n/k where k > n/2. There are n/2 of those and all are equal to 1 (integer division), so that their sum is n/2.
Similar considerations hold for the other quotient values, so that we can write the following function
long long count_divisors(long long n)
{
auto sum{ n };
for (auto i{ 1ll }, k_old{ n }, k{ n }; i < k ; ++i, k_old = k)
{ // ^^^^^ it goes up to sqrt(n)
k = n / (i + 1);
sum += (k_old - k) * i;
if (i == k)
break;
sum += k;
}
return sum;
}
Tested against the O(N) algorithm, the only differences in the results are the corner cases n = 0 and n = 1.
Edit
Thanks again to largest_prime_is_463035818, who linked the Wikipedia page about the divisor summatory function, where both an O(N) and an O(sqrt(N)) algorithm are mentioned.
An implementation of the latter may look like this
auto divisor_summatory(long long n)
{
auto sum{ 0ll };
auto k{ 1ll };
for ( ; k <= n / k; ++k )
{
sum += n / k;
}
--k;
return 2 * sum - k * k;
}
They also add this statement:
Finding a closed form for this summed expression seems to be beyond the techniques available, but it is possible to give approximations. The leading behavior of the series is given by
D(x) = x log x + x(2γ - 1) + Δ(x)
where γ is the Euler–Mascheroni constant, and the error term is Δ(x) = O(sqrt(x)).
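For readers wondering where the expression `2 * sum - k * k` in divisor_summatory comes from, here is a short derivation sketch (the Dirichlet hyperbola argument). The symbol u is introduced here for the integer square root of n and corresponds to the value of k after the `--k` step.

```latex
% D(n) counts the pairs (a, b) of positive integers with ab <= n.
D(n) \;=\; \sum_{k=1}^{n} d(k) \;=\; \#\{(a,b) : a \ge 1,\ b \ge 1,\ ab \le n\},
\qquad u = \lfloor \sqrt{n} \rfloor .

% Pairs with a <= u: for each such a there are floor(n/a) choices of b.
\#\{ab \le n,\ a \le u\} \;=\; \sum_{a=1}^{u} \left\lfloor \frac{n}{a} \right\rfloor
\;=\; \#\{ab \le n,\ b \le u\} \quad \text{(by symmetry)} .

% Pairs with a <= u and b <= u: all u^2 of them satisfy ab <= u^2 <= n.
% Every pair has a <= u or b <= u, since a, b >= u + 1 would give ab >= (u+1)^2 > n.
% Inclusion-exclusion therefore yields
D(n) \;=\; 2 \sum_{a=1}^{u} \left\lfloor \frac{n}{a} \right\rfloor \;-\; u^{2} .

% Check with n = 12: u = 3, so D(12) = 2(12 + 6 + 4) - 9 = 35.
```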
I used your brute force approach as reference to have test cases. The ones I used are
compute(12) == 35
compute(100) == 482
Don't get confused by computing factorizations. There are some tricks one can play when factorizing numbers, but you actually don't need any of that. The solution is a plain simple O(N) loop:
#include <iostream>
#include <limits>
long long compute(long long n){
long long sum = n+1;
for (long long i=2; i < n ; ++i){
sum += n/i;
}
return sum;
}
int main()
{
std::cout << compute(12) << "\n";
std::cout << compute(100) << "\n";
}
Output:
35
482
Why does this work?
The key is in Marc Glisse's comment:
As often with this kind of problem, this sum actually counts pairs x,
y where x divides y, and the sum is arranged to count first all x
corresponding to a fixed y, but nothing says you have to keep it that
way.
I could stop here, because the comment already explains it all. Though, if it didn't click yet...
The trick is to realize that it is much simpler to count divisors of all numbers up to n rather than n-times counting divisors of individual numbers and take the sum.
You don't need to care about factorizations of eg 123123123 or 52323423 to count all divisors up to 10000000000. All you need is a change of perspective. Instead of trying to factorize numbers, consider the divisors. How often does the divisor 1 appear up to n? Simple: n-times. How often does the divisor 2 appear? Still simple: n/2 times, because every second number is divisible by 2. Divisor 3? Every 3rd number is divisible by 3. I hope you can see the pattern already.
You could even reduce the loop to only loop till n/2, because bigger numbers obviously appear only once as divisor. Though I didn't bother to go further, because the biggest change is from your O(N * sqrt(N)) to O(N).
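The reduction to n/2 mentioned above could look roughly like this. It is only a sketch: the name `compute_half` is made up here, and it assumes n >= 1. Every divisor larger than n/2 divides exactly one number in the range (itself), so those n - n/2 divisors are added in one step.

```cpp
#include <cassert>

// Sum of d(1) + ... + d(n), looping only over the divisors up to n / 2.
long long compute_half(long long n) {
    assert(n >= 1);
    long long half = n / 2;
    long long sum = n - half;       // each divisor in (n/2, n] divides exactly one number <= n
    for (long long i = 1; i <= half; ++i)
        sum += n / i;               // divisor i divides n / i of the numbers 1..n
    return sum;
}
```

This halves the work but stays O(N), so it does not change the picture for inputs near 10^12.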
Let's start off with some math and reduce the O(n * sqrt(n)) factorization to O(n * log(log(n))); for counting the sum of divisors, the overall complexity is then O(n * log(log(n)) + n * n^(1/3)).
For instance:
In Codeforces himanshujaju explains how we can optimize the solution of finding divisors of a number.
I am simplifying it a little bit.
Let n be the product of three numbers p, q, and r,
so assume p * q * r = n, where p <= q <= r.
The maximum value of p = n^(1/3).
Now we can loop over all prime numbers in a range [2, n^(1/3)]
and try to reduce the time complexity of prime factorization.
We will split our number n into two numbers x and y => x * y = n.
And x contains prime factors up to n^(1/3) and y deals with higher prime factors greater than n^(1/3).
Thus gcd(x, y) = 1.
Now define F(n) as the number of divisors of n.
From multiplicative rules, we can say that
F(x * y) = F(x) * F(y), if gcd(x, y) = 1.
For finding F(n) => F(x * y) = F(x) * F(y)
So first find F(x); then F(y) is F(n/x).
And there are 3 cases to cover for y:
1. y is a prime number: F(y) = 2.
2. y is the square of a prime number: F(y) = 3.
3. y is a product of two distinct prime numbers: F(y) = 4.
So once we are done with finding F(x) and F(y), we are also done with finding F(x * y) or F(n).
CP-Algorithms also has a nice explanation of how to count the number of divisors of a number, and GeeksForGeeks has a nice coding example of how to count the divisors of a number efficiently. One can check these articles to build a solution to this problem.
C++ implementation
#include <bits/stdc++.h>
using namespace std;
const int maxn = 1e6 + 11;
bool prime[maxn];
bool primesquare[maxn];
int table[maxn]; // for storing primes
void SieveOfEratosthenes()
{
for(int i = 2; i < maxn; i++){
prime[i] = true;
}
for(int i = 0; i < maxn; i++){
primesquare[i] = false;
}
// 1 is not a prime number
prime[1] = false;
for(int p = 2; p * p < maxn; p++){
// If prime[p] is not changed, then
// it is a prime
if(prime[p] == true){
// Update all multiples of p
for(int i = p * 2; i < maxn; i += p){
prime[i] = false;
}
}
}
int j = 0;
for(int p = 2; p < maxn; p++) {
if (prime[p]) {
// Storing primes in an array
table[j] = p;
// Update value in primesquare[p * p],
// if p is prime.
if(p < maxn / p) primesquare[p * p] = true;
j++;
}
}
}
// Function to count divisors
int countDivisors(int n)
{
// If number is 1, then it will have only 1
// as a factor. So, total factors will be 1.
if (n == 1)
return 1;
// ans will contain total number of distinct
// divisors
int ans = 1;
// Loop for counting factors of n
for(int i = 0;; i++){
// table[i] is not less than cube root n
if(table[i] * table[i] * table[i] > n)
break;
// Calculating power of table[i] in n.
int cnt = 1; // cnt is power of prime table[i] in n.
while (n % table[i] == 0){ // if table[i] is a factor of n
n = n / table[i];
cnt = cnt + 1; // incrementing power
}
// Calculating the number of divisors
// If n = a^p * b^q then total divisors of n
// are (p+1)*(q+1)
ans = ans * cnt;
}
// if table[i] is greater than cube root of n
// First case
if (prime[n])
ans = ans * 2;
// Second case
else if (primesquare[n])
ans = ans * 3;
// Third case
else if (n != 1)
ans = ans * 4;
return ans; // Total divisors
}
int main()
{
SieveOfEratosthenes();
int sum = 0;
int n = 5;
for(int i = 1; i <= n; i++){
sum += countDivisors(i);
}
cout << sum << endl;
return 0;
}
Output
n = 4 => 8
n = 5 => 10
Complexity
Time complexity: O(n * log(log(n)) + n * n^(1/3))
Space complexity: O(n)
Thanks, @largest_prime_is_463035818, for pointing out my mistake.
Peter wants to generate some prime numbers for his cryptosystem. Help him! Your task is to generate all prime numbers between two given numbers!
Input
The input begins with the number t of test cases in a single line (t<=10). In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.
Output
For every test case print all prime numbers p such that m <= p <= n, one number per line, test cases separated by an empty line.
Example
Input:
2
1 10
3 5
Output:
2
3
5
7
3
5
Warning: large Input/Output data, be careful with certain languages (though most should be OK if the algorithm is well designed)
I searched Google for an optimized solution to the above problem, and here's the code.
#include <iostream>
#include <cmath>
#include <vector>
#include <set>
using namespace std;
int main() {
vector<int> primes;
primes.push_back(2);
for (int i = 3; i <= 32000; i+=2) {
bool isprime = true;
int cap = sqrt(i) + 1;
vector<int>::iterator p;
for (p = primes.begin(); p != primes.end(); p++) {
if (*p >= cap) break;
if (i % *p == 0) {
isprime = false;
break;
}
}
if (isprime) primes.push_back(i);
}
int T,N,M;
cin >> T;
for (int t = 0; t < T; t++) {
if (t) cout << endl;
cin >> M >> N;
if (M < 2) M = 2;
int cap = sqrt(N) + 1;
set<int> notprime;
notprime.clear();
vector<int>::iterator p;
for (p = primes.begin(); p != primes.end(); p++) {
if (*p >= cap) break;
int start;
if (*p >= M) start = (*p)*2;
else start = M + ((*p - M % *p) % *p); //not able to understand this logic.
for (int j = start; j <= N; j += *p) {
notprime.insert(j);
}
}
for (int i = M; i <= N; i++) {
if (notprime.count(i) == 0) {
cout << i << endl;
}
}
}
return 0;
}
I am not able to understand the above code. Please, help me in understanding it. I am just not getting the logic behind this program(I know STL, just want to understand the logic).
It's pretty simple really. You precalculate all "small" primes (up to 32000, which covers the square root of the largest possible N). Then for each multiple of such a prime, except the prime itself, you mark the number as "not prime".
The line you marked just calculates the first occurrence of a particular prime's multiple in the range M to N.
Edit: More explanations.
This method finds primes by first searching for all non-primes. What is left are the primes.
To do so, as a first step it calculates all "small" primes. Then for each small prime it marks all its multiples that fit in the target range. For that, you first need to calculate the first occurrence of a multiple of this prime in your range; this is what the "start" variable is. Basically it is the first multiple of the prime that is >= M.
When you have "start", you simply mark all multiples by adding the prime to the current number until you pass N.
If you are still confused about what "start" is and how it is calculated, try to think about how you would find "x" such that "x = A * y" and "x >= M", where you know A and M but don't know "y".
Also, I think there is probably an error in this algorithm, because it should complete this cycle for each value in the "notprime" set. But maybe it doesn't matter if the first unaccounted-for prime multiple is always > N.
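To make the "start" computation concrete, here is a small sketch of the same segmented-sieve idea using a flag array instead of a `set`. The function names `first_multiple_at_least` and `primes_in_range` are assumptions for illustration, not part of the code in the question, and starting at p * p (instead of 2 * p) is a common variant rather than what the original does.

```cpp
#include <cassert>
#include <vector>

// Smallest multiple of p that is >= m; same value as m + ((p - m % p) % p).
long long first_multiple_at_least(long long m, long long p) {
    assert(m >= 0 && p > 0);
    return ((m + p - 1) / p) * p;
}

// All primes in [m, n]; assumes 1 <= m <= n and that n - m + 1 flags fit in memory.
std::vector<long long> primes_in_range(long long m, long long n) {
    assert(1 <= m && m <= n);
    if (m < 2) m = 2;
    std::vector<long long> result;
    if (m > n) return result;

    // Integer square root of n: the largest root with root * root <= n.
    long long root = 1;
    while ((root + 1) * (root + 1) <= n) ++root;

    std::vector<bool> small_composite(root + 1, false);  // ordinary sieve on [0, root]
    std::vector<bool> composite(n - m + 1, false);       // composite[j - m] describes j
    for (long long p = 2; p <= root; ++p) {
        if (small_composite[p]) continue;                // p is not prime
        for (long long j = p * p; j <= root; j += p)
            small_composite[j] = true;
        // Cross out multiples of p inside [m, n], starting no lower than p * p
        // so that p itself is never marked.
        long long start = first_multiple_at_least(m, p);
        if (start < p * p) start = p * p;
        for (long long j = start; j <= n; j += p)
            composite[j - m] = true;
    }
    for (long long j = m; j <= n; ++j)
        if (!composite[j - m]) result.push_back(j);
    return result;
}
```

For the sample input, `primes_in_range(1, 10)` gives 2, 3, 5, 7 and `primes_in_range(3, 5)` gives 3, 5.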
I need to find all the prime numbers from 2 to n using the Sieve of Eratosthenes. I looked on Wikipedia(Sieve of Eratosthenes) to find out what the Sieve of Eratosthenes was, and it gave me this pseudocode:
Input: an integer n > 1
Let A be an array of Boolean values, indexed by integers 2 to n,
initially all set to true.
for i = 2, 3, 4, ..., not exceeding √n:
    if A[i] is true:
        for j = i^2, i^2+i, i^2+2i, i^2+3i, ..., not exceeding n:
            A[j] := false
Output: all i such that A[i] is true.
So I used this and translated it to C++. It looks fine to me, but I have a couple errors. Firstly, if I input 2 or 3 into n, it says:
terminate called after throwing an instance of 'Range_error'
what(): Range_error: 2
Also, whenever I enter a 100 or anything else (4, 234, 149, 22, anything), it accepts the input for n, and doesn't do anything. Here is my C++ translation:
#include "std_lib_facilities.h"
int main()
{
/* this program will take in an input 'n' as the maximum value. Then it will calculate
all the prime numbers between 2 and n. It follows the Sieve of Eratosthenes with
the algorithms from Wikipedia's pseudocode translated by me into C++*/
int n;
cin >> n;
vector<string>A;
for(int i = 2; i <= n; ++i) // fills the whole table with "true" from 0 to n-2
A.push_back("true");
for(int i = 2; i <= sqrt(n); ++i)
{
i -= 2; // because I built the vector from 0 to n-2, i need to reflect that here.
if(A[i] == "true")
{
for(int j = pow(i, 2); j <= n; j += i)
{
A[j] = "false";
}
}
}
//print the prime numbers
for(int i = 2; i <= n; ++i)
{
if(A[i] == "true")
cout << i << '\n';
}
return 0;
}
The issue comes from the fact that the indexes are not in line with the values they represent, i.e., they are shifted down by 2. After this shift, they no longer have the same mathematical properties.
Basically, the value 3 is at position 1 and the value 4 is at position 2. When you are testing for division, you are using the positions as if they were values. So instead of testing if 4%3==0, you are testing that 2%1==0.
In order to make your program work, you have to remove the -2 shifting of the indexes:
int main()
{
int n;
cin >> n;
vector<string>A;
for(int i = 0; i <= n; ++i) // fills the whole table with "true" from 0 to n
A.push_back("true");
for(int i = 2; i <= sqrt(n); ++i)
{
if(A[i] == "true")
{
for(int j = pow(i, 2); j <= n; j += i)
{
A[j] = "false";
}
}
}
//print the prime numbers
for(int i = 2; i <= n; ++i)
{
if(A[i] == "true")
cout << i << '\n';
}
return 0;
}
I agree with the other comments: you could use a vector of bools, and directly initialize it with the right size and value:
std::vector<bool> A(n + 1, true);
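As a rough sketch of how the whole sieve might look with a vector of bools whose indices match the values they stand for (the function name `sieve_primes` is hypothetical, and the table needs n + 1 entries, all initially true, so that index n exists):

```cpp
#include <cassert>
#include <vector>

// Returns all primes in [2, n]; is_prime[i] describes the value i itself, no offset.
std::vector<int> sieve_primes(int n) {
    assert(n >= 0);
    std::vector<bool> is_prime(n + 1, true);   // n + 1 entries so that is_prime[n] exists
    std::vector<int> primes;
    for (int i = 2; i <= n; ++i) {
        if (!is_prime[i]) continue;            // already crossed out by a smaller prime
        primes.push_back(i);
        // long long avoids overflow of i * i for large n.
        for (long long j = 1LL * i * i; j <= n; j += i)
            is_prime[j] = false;
    }
    return primes;
}
```

Because the loop runs over every i up to n, collecting and sieving happen in one pass and inputs such as 2 or 3 need no special handling.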
Here you push back n-1 elements
vector<string>A;
for(int i = 2; i <= n; ++i) // fills the whole table with "true" from 0 to n-2
A.push_back("true");
but here you access your vector from A[2] to A[n].
//print the prime numbers
for(int i = 2; i <= n; ++i)
{
if(A[i] == "true")
cout << i << '\n';
}
A has elements at positions A[0] to A[n-2]. You might correct this defect by initializing your vector differently. For example as
vector<string> A(n+1, "true");
This creates a vector A with n+1 strings, each with the default value "true", which can be accessed through A[0] to A[n]. With this your code should run, even if it has other deficiencies. But I think you will learn the most if you first try to implement the sieve successfully yourself and then look for (good) alternatives on the internet.
This is painful. Why are you using a string array to store boolean values, and not, let's say, an array of boolean values? Why are you leaving out the first two array elements, forcing you to do some adjustment of all indices? Which you then forget half the time, totally breaking your code? At least you should change this line:
i -= 2; // because I built the vector from 0 to n-2, i need to reflect that here.
to:
i -= 2; // because I left the first two elements out, I adjust that here.
// But only here, doing it everywhere is too annoying.
As a result of that design decision, when you execute this line:
for(int j = pow(i, 2); j <= n; j += i)
i is actually zero which means j will stay zero forever.
While trying to find prime numbers in a range (see problem description), I came across the following code:
(Code taken from here)
// For each prime in sqrt(N) we need to use it in the segmented sieve process.
for (i = 0; i < cnt; i++) {
p = myPrimes[i]; // Store the prime.
s = M / p;
s = s * p; // The closest number less than M that is composite number for this prime p.
for (int j = s; j <= N; j = j + p) {
if (j < M) continue; // Because composite numbers less than M are of no concern.
/* j - M = index in the array primesNow, this is as max index allowed in the array
is not N, it is DIFF_SIZE so we are storing the numbers offset from.
while printing we will add M and print to get the actual number. */
primesNow[j - M] = false;
}
}
// In this loop the first prime numbers for example say 2, 3 are also set to false.
for (int i = 0; i < cnt; i++) { // Hence we need to print them in case they're in range.
if (myPrimes[i] >= M && myPrimes[i] <= N) // Without this loop you will see that for a
// range (1, 30), 2 & 3 doesn't get printed.
cout << myPrimes[i] << endl;
}
// primesNow[] = false for all composite numbers, primes found by checking with true.
for (int i = 0; i < N - M + 1; ++i) {
// i + M != 1 to ensure that for i = 0 and M = 1, 1 is not considered a prime number.
if (primesNow[i] == true && (i + M) != 1)
cout << i + M << endl; // Print our prime numbers in the range.
}
However, I didn't find this code intuitive and it was not easy to understand.
Can someone explain the general idea behind the above algorithm?
What alternative algorithms are there to mark non-prime numbers in a range?
That's overly complicated. Let's start with a basic Sieve of Eratosthenes, in pseudocode, that outputs all the primes less than or equal to n:
function primes(n)
    sieve := makeArray(2..n, True)
    for p from 2 to n
        if sieve[p]
            output(p)
            for i from p*p to n step p
                sieve[i] := False
This function calls output on each prime p; output can print the primes, or sum the primes, or count them, or do whatever you want to do with them. The outer for loop considers each candidate prime in turn; The sieving occurs in the inner for loop where multiples of the current prime p are removed from the sieve.
Once you understand how that works, go here for a discussion of the segmented Sieve of Eratosthenes over a range.
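One possible C++ rendering of that pseudocode, offered as a sketch rather than a reference implementation: the name `for_each_prime` and the callback parameter are assumptions made here to mirror the `output` call, so the caller decides whether primes are printed, summed or counted.

```cpp
#include <cassert>
#include <vector>

// Calls output(p) for each prime p <= n, in increasing order.
template <typename Output>
void for_each_prime(int n, Output output) {
    assert(n >= 0);
    std::vector<bool> sieve(n + 1, true);      // sieve[i] stays true while i may be prime
    for (int p = 2; p <= n; ++p) {
        if (!sieve[p]) continue;
        output(p);
        // Cross out the multiples of p; long long avoids overflow of p * p.
        for (long long i = 1LL * p * p; i <= n; i += p)
            sieve[i] = false;
    }
}
```

For example, `for_each_prime(10, [&](int p) { sum += p; });` with a `long long sum = 0` in scope accumulates 2 + 3 + 5 + 7 = 17.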
Have you considered a sieve at the bit level? It can cover a larger range of numbers in the same amount of memory, and with a buffer you could modify it to find, for example, the primes between 2 and 2^60 using 64-bit ints, by reusing the same buffer while preserving the offsets of the primes already discovered. The following uses an array of integers.
Declarations
#include <math.h> // sqrt(), the upper limit need to eliminate
#include <stdio.h> // for printing, could use <iostream>
Macros to manipulate bits; the following use 32-bit ints (note that BIT_FLIP clears the bit rather than toggling it)
#define BIT_SET(d, n) (d[n>>5]|=1<<(n-((n>>5)<<5)))
#define BIT_GET(d, n) (d[n>>5]&1<<(n-((n>>5)<<5)))
#define BIT_FLIP(d, n) (d[n>>5]&=~(1<<(n-((n>>5)<<5))))
unsigned int n = 0x80000; // the upper limit 1/2 mb, with 32 bits each
// will get the 1st primes upto 16 mb
int *data = new int[n]; // allocate
unsigned int r = n * 0x20; // the actual number of bits available
We could use zeros to save time, but "on" (1) for prime is a bit more intuitive
for(int i=0;i<n;i++)
data[i] = 0xFFFFFFFF;
unsigned int seed = 2; // the seed starts at 2
unsigned int uLimit = sqrt(r); // the upper limit for checking off the sieve
BIT_FLIP(data, 1); // one is not prime
Time to discover the primes; this took under half a second
// until uLimit is reached
while(seed < uLimit) {
// don't include the seed itself when eliminating candidates
for(int i=seed+seed;i<r;i+=seed)
BIT_FLIP(data, i);
// find the next bit still active (set to 1), don't include the current seed
for(int i=seed+1;i<r;i++) {
if (BIT_GET(data, i)) {
seed = i;
break;
}
}
}
Now for the output; this will consume the most time
unsigned long bit_index = 0; // the current bit
int w = 8; // the width of a column
unsigned pc = 0; // prime, count, to assist in creating columns
for(int i=0;i<n;i++) {
unsigned long long int b = 1; // double width, so there is no overflow
// if a bit is still set, include that as a result
while(b < 0xFFFFFFFF) {
if (data[i]&b) {
printf("%8lu ", bit_index); // %lu matches the unsigned long argument
if(((pc++) % w) == 0)
putchar('\n'); // add a new row
}
bit_index++;
b<<=1; // multiply by 2, to check the next bit
}
}
Clean up
delete [] data;
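The same bit-packing idea can be sketched without macros or manual memory management. The class name `BitSieve` and its interface are hypothetical, chosen only for this illustration; it stores one bit per number in a `std::vector<std::uint32_t>`, uses `n >> 5` for the word index and `n & 31` for the bit position, and, unlike the loop above, starts crossing out at p * p.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: one bit per number, bit set = "still a prime candidate".
class BitSieve {
public:
    // Covers the numbers 0 .. limit - 1.
    explicit BitSieve(std::uint32_t limit)
        : limit_(limit),
          words_((static_cast<std::uint64_t>(limit) + 31) / 32, 0xFFFFFFFFu) {
        assert(limit >= 2);
        clear(0);                                   // 0 and 1 are not prime
        clear(1);
        for (std::uint64_t p = 2; p * p < limit_; ++p) {
            if (!test(static_cast<std::uint32_t>(p))) continue;
            for (std::uint64_t m = p * p; m < limit_; m += p)
                clear(static_cast<std::uint32_t>(m));
        }
    }

    bool is_prime(std::uint32_t n) const { return n < limit_ && test(n); }

private:
    // Word index is n / 32 (n >> 5), bit position is n % 32 (n & 31).
    bool test(std::uint32_t n) const { return (words_[n >> 5] >> (n & 31)) & 1u; }
    void clear(std::uint32_t n) { words_[n >> 5] &= ~(std::uint32_t{1} << (n & 31)); }

    std::uint32_t limit_;
    std::vector<std::uint32_t> words_;
};
```

A caller would construct it once, for instance `BitSieve sieve(0x1000000);`, and then query `sieve.is_prime(n)` for any n below the limit.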
Given a sequence of n positive integers we need to count consecutive sub-sequences whose sum is divisible by k.
Constraints : N is up to 10^6 and each element up to 10^9 and K is up to 100
EXAMPLE : Let N=5 and K=3 and array be 1 2 3 4 1
Here answer is 4
Explanation : there exists, 4 sub-sequences whose sum is divisible by 3, they are
3
1 2
1 2 3
2 3 4
My Attempt :
long long int count=0;
for(int i=0;i<n;i++){
long long int sum=0;
for(int j=i;j<n;j++)
{
sum=sum+arr[j];
if(sum%k==0)
{
count++;
}
}
}
But obviously it's a poor approach. Can there be a better approach for this question? Please help.
Complete Question: https://www.hackerrank.com/contests/w6/challenges/consecutive-subsequences
Here is a fast O(n + k) solution:
1) Let's compute prefix sums pref[i] (for 0 <= i < n).
2) Now we can compute count[i], the number of prefixes with sum i modulo k (0 <= i < k).
This can be done by iterating over all the prefixes and doing count[pref[i] % k]++.
Initially, count[0] = 1 (the empty prefix has sum 0) and count[i] = 0 for i != 0.
3) The answer is the sum of count[i] * (count[i] - 1) / 2 over all i.
4) It is better to compute prefix sums modulo k to avoid overflow.
Why does it work? Let's take a closer look at a subarray divisible by k. Let's say that it starts at position L and ends at position R. It is divisible by k if and only if pref[L - 1] == pref[R] (modulo k), because their difference is zero modulo k (by definition of divisibility). So for each fixed residue, we can pick any two prefixes with this prefix sum modulo k (and there are exactly count[i] * (count[i] - 1) / 2 ways to do it).
Here is my code:
long long get_count(const vector<int>& vec, int k) {
//Initialize count array.
vector<int> cnt_mod(k, 0);
cnt_mod[0] = 1;
int pref_sum = 0;
//Iterate over the input sequence.
for (int elem : vec) {
pref_sum += elem;
pref_sum %= k;
cnt_mod[pref_sum]++;
}
//Compute the answer.
long long res = 0;
for (int mod = 0; mod < k; mod++)
res += (long long)cnt_mod[mod] * (cnt_mod[mod] - 1) / 2;
return res;
}
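As a worked check of the counting argument, here is the example from the question (array 1 2 3 4 1 with k = 3) traced by hand. The notation c_r for the number of prefixes with residue r is introduced only for this illustration.

```latex
% Prefix sums, including the empty prefix:
%   0,\ 1,\ 3,\ 6,\ 10,\ 11
% Their residues modulo 3:
%   0,\ 1,\ 0,\ 0,\ 1,\ 2
c_0 = 3, \qquad c_1 = 2, \qquad c_2 = 1 .

% Each pair of prefixes with equal residue marks one subarray divisible by 3:
\sum_{r=0}^{2} \binom{c_r}{2}
  \;=\; \frac{3 \cdot 2}{2} + \frac{2 \cdot 1}{2} + \frac{1 \cdot 0}{2}
  \;=\; 3 + 1 + 0 \;=\; 4 .
```

This matches the expected answer of 4 (the subarrays 3, 1 2, 1 2 3 and 2 3 4).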
This should make your calculations easier:
//Now we will move all numbers to [0..K-1]
long long int count=0;
for(int i=0;i<n;i++){
arr[i] = arr[i]%K;
}
//Now we will calculate the count of all shortest subsequences.
long long int sum=0;
int first(0);
std::vector<int> beg;
std::vector<int> end;
for(int i=0;i<n;i++){
if (arr[i] == 0)
{
count++;
continue;
}
sum += arr[i];
if (sum == K)
{
beg.push_back(first);
end.push_back(i);
count++;
}
else
{
while (sum > K)
{
sum -= arr[first];
first++;
}
if (sum == K)
{
beg.push_back(first);
end.push_back(i);
count++;
}
}
}
//This way we have found all short subsequences. Now we need to count all subsequences that consist of several short subsequences.
int party(0);
for (std::size_t i = 0; i + 1 < beg.size(); ++i) // i + 1 < size() avoids unsigned underflow when beg is empty
{
if (end[i] == beg[i+1])
{
count += party + 1;
party++;
}
else
{
party = 0;
}
}
So, with a max array size of 10^6 and a max remainder of 99, you will not get an overflow even if you need to sum all the numbers in a simple int32.
And the time you spend will be around O(n+n).