Generate n-digit prime numbers given their first m digits - c++

While browsing the site I came across this question. It is clear to me that generating prime numbers is hard, and some of the solutions given there to make the problem easier are good and ingenious. It occurs to me that giving the program some leading digits of a large prime might make the search easier. Is this correct? For example, finding the 10-digit primes that start with 111 is perhaps easier than generating all 10-digit primes (and the more leading digits are provided, the smaller the search becomes).
Searching on the net (I must clarify that I am neither a mathematician nor a computer scientist), I found the following code to generate primes of n digits:
#include <bits/stdc++.h>
using namespace std;
const int sz = 1e5;
bool isPrime[sz + 1];
// Function for Sieve of Eratosthenes
void sieve() {
memset(isPrime, true, sizeof(isPrime));
isPrime[0] = isPrime[1] = false;
for (int i = 2; i * i <= sz; i++) {
if (isPrime[i]) {
for (int j = i * i; j <= sz; j += i) {
isPrime[j] = false;
}
}
}
}
// Function to print all the prime
// numbers with d digits
void findPrimesD(int d) {
// Range to check integers
int left = pow(10, d - 1);
int right = pow(10, d) - 1;
// For every integer in the range
for (int i = left; i <= right; i++) {
// If the current integer is prime
if (isPrime[i]) {
cout << i << " ";
}
}
}
// Driver code
int main() {
// Generate primes
sieve();
int d = 1;
findPrimesD(d);
return 0;
}
My question is, how to take advantage of this code to give it the first m digits and thus make the search easier and smaller?
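One possible way to use the leading digits: once the first m digits are fixed, the candidates form a single contiguous range, so a segmented sieve only needs the base primes up to the square root of the upper bound plus one flag per candidate. The sketch below is only an illustration under stated assumptions. The name primesWithPrefix is hypothetical, the prefix is passed as a number, n is assumed to be at most 18 so everything fits in 64 bits, and the range of 10^(n-m) candidates is assumed to fit in memory.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns every n-digit prime whose leading digits are
// `prefix`. Assumes prefix >= 1, prefix has at most n digits, and n <= 18.
std::vector<std::uint64_t> primesWithPrefix(std::uint64_t prefix, int n) {
    // Count the digits of the prefix (m in the question).
    int m = 0;
    for (std::uint64_t t = prefix; t > 0; t /= 10) ++m;

    // With the prefix fixed, the candidates are exactly the range [lo, hi].
    std::uint64_t scale = 1;
    for (int i = 0; i < n - m; ++i) scale *= 10;
    const std::uint64_t lo = prefix * scale;
    const std::uint64_t hi = lo + scale - 1;

    // Integer square root of hi, corrected for floating-point rounding.
    std::uint64_t root =
        static_cast<std::uint64_t>(std::sqrt(static_cast<long double>(hi)));
    while (root * root > hi) --root;
    while ((root + 1) * (root + 1) <= hi) ++root;

    std::vector<bool> small(root + 1, true);       // base sieve over [0, root]
    std::vector<bool> inRange(hi - lo + 1, true);  // one flag per candidate

    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!small[p]) continue;
        for (std::uint64_t q = p * p; q <= root; q += p) small[q] = false;
        // First multiple of p inside [lo, hi], never p itself.
        const std::uint64_t first = std::max(p * p, (lo + p - 1) / p * p);
        for (std::uint64_t q = first; q <= hi; q += p) inRange[q - lo] = false;
    }

    std::vector<std::uint64_t> result;
    for (std::uint64_t v = lo; v <= hi; ++v)
        if (v >= 2 && inRange[v - lo]) result.push_back(v);
    return result;
}
```

For example, primesWithPrefix(111, 10) would sieve only the 10^7 numbers from 1110000000 to 1119999999 instead of all 9 * 10^9 ten-digit numbers, using base primes up to about 33,500.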

Related

Improving optimization of nested loop

I'm making a simple program to count the pairs in an array whose sum is divisible by 3. The array length and values are entered by the user.
My code works fine. However, I want to know whether there is a faster way to compute the result, so that it takes less running time.
When the array length is 10^4 or less, the program takes less than 100 ms. When it grows to 10^5, the time spikes to 1000 ms. Why is this, and how can I improve the speed?
#include <iostream>
#include <vector>
using namespace std;
int main()
{
int N, i, b;
b = 0;
cin >> N;
unsigned int j = 0;
std::vector<unsigned int> a(N);
for (j = 0; j < N; j++) {
cin >> a[j];
if (j == 0) {
}
else {
for (i = j - 1; i >= 0; i = i - 1) {
if ((a[j] + a[i]) % 3 == 0) {
b++;
}
}
}
}
cout << b;
return 0;
}
Your algorithm has O(N^2) complexity. There is a faster way.
(a[i] + a[j]) % 3 == ((a[i] % 3) + (a[j] % 3)) % 3
Thus you do not need the exact numbers, only their remainders modulo 3. The sum has remainder zero either for two numbers that both have remainder 0 (0 + 0) or for one number with remainder 1 and one with remainder 2 (1 + 2).
The result is therefore r[1]*r[2] + r[0]*(r[0]-1)/2, where r[i] is the count of numbers with remainder i.
long long r[3] = {}; // 64-bit: the pair count can exceed the int range for N = 10^5
for (unsigned int x : a) {
r[x % 3]++;
}
std::cout << r[1]*r[2] + (r[0]*(r[0]-1)) / 2;
The complexity of this algorithm is O(N).
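As a self-contained illustration of the counting idea, here is a minimal sketch wrapped in a function. The name countPairsDivisibleBy3 is made up for this example, and a 64-bit result type is assumed so the count does not overflow for large arrays.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: counts pairs (i < j) with (a[i] + a[j]) % 3 == 0
// in one pass, using only the remainders of the elements modulo 3.
std::int64_t countPairsDivisibleBy3(const std::vector<unsigned int>& a) {
    std::int64_t r[3] = {0, 0, 0};  // r[k] = how many elements have remainder k
    for (unsigned int x : a) ++r[x % 3];
    // Pairs (1, 2) plus pairs chosen among the elements with remainder 0.
    return r[1] * r[2] + r[0] * (r[0] - 1) / 2;
}
```

For the array {1, 2, 3, 6} the qualifying pairs are (1, 2) and (3, 6), so the function returns 2.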
I've encountered this problem before, and while I can't find my particular solution, you could improve the running time by hashing.
The code would look something like this:
// A C++ program to check if arr[0..n-1] can be divided
// in pairs such that every pair is divisible by k.
#include <bits/stdc++.h>
using namespace std;
// Returns true if arr[0..n-1] can be divided into pairs
// with sum divisible by k.
bool canPairs(int arr[], int n, int k)
{
// An odd length array cannot be divided into pairs
if (n & 1)
return false;
// Create a frequency array to count occurrences
// of all remainders when divided by k.
map<int, int> freq;
// Count occurrences of all remainders
for (int i = 0; i < n; i++)
freq[arr[i] % k]++;
// Traverse input array and use freq[] to decide
// if given array can be divided in pairs
for (int i = 0; i < n; i++)
{
// Remainder of current element
int rem = arr[i] % k;
// If remainder with current element divides
// k into two halves.
if (2*rem == k)
{
// Then there must be even occurrences of
// such remainder
if (freq[rem] % 2 != 0)
return false;
}
// If remainder is 0, then there must be two
// elements with 0 remainder
else if (rem == 0)
{
if (freq[rem] & 1)
return false;
}
// Else number of occurrences of remainder
// must be equal to number of occurrences of
// k - remainder
else if (freq[rem] != freq[k - rem])
return false;
}
return true;
}
/* Driver program to test above function */
int main()
{
int arr[] = {92, 75, 65, 48, 45, 35};
int k = 10;
int n = sizeof(arr)/sizeof(arr[0]);
canPairs(arr, n, k)? cout << "True": cout << "False";
return 0;
}
That works for any k (in your case 3).
Then again, this is not my code; it comes from the following link, which has a proper explanation. I didn't just paste the link, since I think that is bad practice.

Duplicates in X element array

I have an interval (m,n) and I have to print all the numbers in it whose digits are all different. I wrote the code below, but it only works for 2-digit numbers, and I do not know how to make it work for anything else. I imagine it would work if I added as many for loops as my number has digits, but the interval (m,n) isn't fixed, so I need something reliable. I've been trying to solve this problem on my own for 6 damn hours and I'm absolutely fed up.
Input: 97,113
Output: 97,98,102,103,104,105,106,107,108,109
Numbers 99, 100, 101 and 110+ don't get printed, because they contain two digits that are the same.
#include<conio.h>
#include<math.h>
#include<stdio.h>
int main()
{
int m,n,test,checker=0;
scanf("%d%d",&m,&n);
if(m>n)
{
int holder=n;
n=m;
m=holder;
}
for(int start=m;start<=n;start++)
{
int itemCount=floor(log10(abs(start)))+1;
int nums[itemCount];
int index=0;
test=start;
do
{
int nextVal = test % 10;
nums[index++]=nextVal;
test = test / 10;
}while(test>0);
for (int i = 0; i < itemCount - 1; i++)
{ // read comment by @nbro
for (int j = i + 1; j < itemCount; j++)
{
if (nums[i] == nums[j])
{
checker++;
}
}
if(checker==0)printf("%d ",start);
}
checker=0;
}
}
Since you tagged this as C++, here is a very simple solution using simple modulus and division in a loop. No conversion to string is done.
#include <iostream>
#include <bitset>
bool is_unique_digits(int num)
{
std::bitset<10> numset = 0;
while (num > 0)
{
// get last digit
int val = num % 10;
// if bit is already on, then this digit is a repeat
if (numset[val])
return false;
// turn bit on and remove last digit from number
numset.set(val);
num /= 10;
}
return true;
}
int main()
{
for (int i = 97; i <= 113; ++i)
{
if (is_unique_digits(i))
std::cout << i << "\n";
}
}
The is_unique_digits function repeatedly extracts the last digit of the number and tests whether that digit is already recorded in the bitset. If it is, false is returned immediately.
If the digit is not in the bitset, then the bit that corresponds to that digit is turned "on" and the number is divided by 10 (effectively removing the last digit from the number). If the loop completes, then true is returned.
As an idea for a design:
print the number to a string, if it isn't a string already;
declare an array int d[10] and set it to all zeroes;
for each ASCII digit c of the string:
if (d[c-'0']==1) return 0; // this digit exists already in the number
else d[c-'0']= 1;
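The design above can be sketched roughly as follows. This is only one way to write it; the function name hasDistinctDigits is an assumption made for the example, and std::to_string stands in for "print the number to a string".

```cpp
#include <string>

// Hypothetical helper: true when no decimal digit of `number` repeats.
bool hasDistinctDigits(int number) {
    bool seen[10] = {false};  // seen[d] is true once digit d was encountered
    const std::string text = std::to_string(number);
    for (char c : text) {
        if (c == '-') continue;  // ignore the sign of negative numbers
        const int d = c - '0';
        if (seen[d]) return false;  // this digit already exists in the number
        seen[d] = true;
    }
    return true;
}
```

Calling it for every value in the interval and printing the numbers for which it returns true gives the requested output for any number of digits.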
Just move if(checker==0)printf("%d ",start); outside of both loops,
like this:
for (int i = 0; i < itemCount - 1; i++)
{
for (int j = i + 1; j < itemCount; j++)
{
if (nums[i] == nums[j])
{
checker++;
break;
}
}
}
if(checker==0)
printf("%d ",start);
checker=0;
However, instead of using two nested for loops you can use a count array, which is more efficient.
To check one number, you can do:
int X = 10; // number to analyze
char counts[10];
for (int i = 0; i < 10; i++) counts[i] = 0;
char number[12]; // room for the digits of an int plus the terminator
sprintf(number, "%d", X);
bool bad = false;
for (size_t i = 0; i < strlen(number); i++)
{
if (++counts[number[i]-'0'] > 1) { bad = true; break; }
}

Finding Sum of Square of Digits Beginner Bug C++

So, I started learning C++ recently. This code is trying to compute the sum of the squares of a number's digits. For example, 243: 2*2 + 4*4 + 3*3 = 29.
int sumOfSquareDigits(int n) //BUG WITH INPUT OF 10, 100, 1000, etc.
{
int digits = findDigits(n);
int number;
int remainder;
int *allDigits = new int[digits];
for (int i = 0; i < digits; i++) { //assigns digits to array
if (i + 1 == digits){ //sees if there is a ones value left
allDigits[i] = n;
}
else {
remainder = (n % findPower10(digits - (i + 1)));
number = ((n - remainder) / findPower10(digits - (i + 1)));
allDigits[i] = number; //records leftmost digit
n = n - (allDigits[i] * findPower10(digits - (i + 1))); //gets rid of leftmost number and starts over
}
}
int result = 0;
for (int i = 0; i < digits; i++) { //finds sum of squared digits
result = result + (allDigits[i] * allDigits[i]);
}
delete [] allDigits;
return result;
}
int findDigits(int n) //finds out how many digits the number has
{
int digits = 0;
int test;
do {
digits++;
test = findPower10(digits);
} while (n > test);
return digits;
}
int findPower10(int n) { //function for calculating powers of 10
int result = 1;
for (int i = 0; i < n; i++)
result = result * 10;
return result;
}
And after running the code, I've figured out that it mostly (barely) works. Whenever the user inputs 10, 100, 1000, etc., it always returns 100. I'd like to solve this using only the iostream header.
Sorry if my code isn't too readable or organized! It would also be helpful if there are any shortcuts to my super long code, thanks!
The problem is in the findDigits function. For the values 10, 100, 1000, etc., it returns the number of digits minus one. This happens because of the comparison in the loop: the loop stops as soon as n is less than or equal to test, but in these cases n equals test and you should run one more iteration.
So, you should change the last line of the do-while loop in findDigits:
} while (n > test);
to:
} while (n >= test);
Now it should work just fine. It will not work for negative numbers, though (I don't know whether that is required, but the solution below handles that case too).
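To make the off-by-one concrete, here is a small sketch of the two helper functions with the corrected comparison. It assumes non-negative inputs below 10^9, so that the powers of 10 still fit in an int.

```cpp
// Power of 10, as in the question.
int findPower10(int n) {
    int result = 1;
    for (int i = 0; i < n; i++)
        result = result * 10;
    return result;
}

// Number of decimal digits of n, with the corrected loop condition:
// keep going while n is greater than OR EQUAL to the current power of 10.
int findDigits(int n) {
    int digits = 0;
    int test;
    do {
        digits++;
        test = findPower10(digits);
    } while (n >= test);
    return digits;
}
```

With `n > test`, findDigits(10) stops after one iteration and reports 1 digit; with `n >= test` it runs a second iteration and reports 2.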
I came up with a much simpler solution:
int sumOfSquareDigits(int n)
{
// Variable to maintain the total sum of the squares
int sum = 0;
// This loop will change n until it is zero
while (n != 0) {
// The current digit to square is the rightmost digit,
// so we just get its value using the mod operator
int current = n % 10;
// Add its square to the sum
sum += current*current;
// You divide n by 10, this 'removes' one digit of n
n = n / 10;
}
return sum;
}
I found the problem challenging and managed to reduce your code to the following lines:
long long sumOfSquareDigits(long long i) {
long long sum(0L);
do {
long long r = i % 10;
sum += (r * r);
} while(i /= 10);
return sum;
}
I haven't tested it thoroughly, but I think it works OK.

Intuitive method to find prime numbers in a range

While trying to find prime numbers in a range (see problem description), I came across the following code:
(Code taken from here)
// For each prime in sqrt(N) we need to use it in the segmented sieve process.
for (i = 0; i < cnt; i++) {
p = myPrimes[i]; // Store the prime.
s = M / p;
s = s * p; // The closest number less than M that is composite number for this prime p.
for (int j = s; j <= N; j = j + p) {
if (j < M) continue; // Because composite numbers less than M are of no concern.
/* j - M = index in the array primesNow, this is as max index allowed in the array
is not N, it is DIFF_SIZE so we are storing the numbers offset from.
while printing we will add M and print to get the actual number. */
primesNow[j - M] = false;
}
}
// In this loop the first prime numbers for example say 2, 3 are also set to false.
for (int i = 0; i < cnt; i++) { // Hence we need to print them in case they're in range.
if (myPrimes[i] >= M && myPrimes[i] <= N) // Without this loop you will see that for a
// range (1, 30), 2 & 3 doesn't get printed.
cout << myPrimes[i] << endl;
}
// primesNow[] = false for all composite numbers, primes found by checking with true.
for (int i = 0; i < N - M + 1; ++i) {
// i + M != 1 to ensure that for i = 0 and M = 1, 1 is not considered a prime number.
if (primesNow[i] == true && (i + M) != 1)
cout << i + M << endl; // Print our prime numbers in the range.
}
However, I didn't find this code intuitive and it was not easy to understand.
Can someone explain the general idea behind the above algorithm?
What alternative algorithms are there to mark non-prime numbers in a range?
That's overly complicated. Let's start with a basic Sieve of Eratosthenes, in pseudocode, that outputs all the primes less than or equal to n:
function primes(n)
    sieve := makeArray(2..n, True)
    for p from 2 to n
        if sieve[p]
            output(p)
            for i from p*p to n step p
                sieve[i] := False
This function calls output on each prime p; output can print the primes, or sum the primes, or count them, or do whatever you want to do with them. The outer for loop considers each candidate prime in turn; the sieving occurs in the inner for loop, where multiples of the current prime p are removed from the sieve.
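A direct C++ translation of that pseudocode might look like the sketch below. Collecting the primes in a vector takes the place of the output call, and the name primesUpTo is an assumption made for this example.

```cpp
#include <vector>

// Basic Sieve of Eratosthenes: returns all primes less than or equal to n.
std::vector<int> primesUpTo(int n) {
    std::vector<int> found;
    if (n < 2) return found;
    std::vector<bool> sieve(n + 1, true);  // sieve[k] == true: k not crossed off
    for (int p = 2; p <= n; ++p) {
        if (!sieve[p]) continue;  // p was crossed off, so it is composite
        found.push_back(p);       // plays the role of output(p)
        // Cross off the multiples of p, starting at p*p; 64-bit index so
        // that p*p cannot overflow for large n.
        for (long long i = 1LL * p * p; i <= n; i += p) sieve[i] = false;
    }
    return found;
}
```

Starting the inner loop at p*p is safe because every smaller multiple of p has a smaller prime factor and was already crossed off.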
Once you understand how that works, go here for a discussion of the segmented Sieve of Eratosthenes over a range.
Have you considered a sieve at the bit level? It can hold a larger number of candidates in the same memory, and by reusing the same buffer while preserving the offsets of the primes already discovered, you could modify it to find, for example, the primes between 2 and 2^60 using 64-bit ints. The following uses an array of 32-bit integers.
Declarations
#include <math.h> // sqrt(), for the upper limit needed when eliminating
#include <stdio.h> // for printing, could use <iostream>
Macros to manipulate bits; the following use 32-bit ints
#define BIT_SET(d, n) ((d)[(n)>>5] |= 1u<<((n)&31))
#define BIT_GET(d, n) ((d)[(n)>>5] & 1u<<((n)&31))
#define BIT_FLIP(d, n) ((d)[(n)>>5] &= ~(1u<<((n)&31))) // despite the name, this clears the bit
unsigned int n = 0x80000; // number of 32-bit words: half a mega-word (2 MB of memory)
// covers the numbers below 16M (0x1000000)
unsigned int *data = new unsigned int[n]; // allocate
unsigned int r = n * 0x20; // the actual number of bits available
Could use zeros to save time, but using on (1) for prime is a bit more intuitive
for(int i=0;i<n;i++)
data[i] = 0xFFFFFFFF;
unsigned int seed = 2; // the seed starts at 2
unsigned int uLimit = sqrt(r); // the upper limit for checking off the sieve
BIT_FLIP(data, 0); // zero is not prime
BIT_FLIP(data, 1); // one is not prime
Time to discover the primes; this took under half a second
// until uLimit is reached
while(seed < uLimit) {
// don't include the seed itself when eliminating candidates
for(int i=seed+seed;i<r;i+=seed)
BIT_FLIP(data, i);
// find the next bit still active (set to 1), don't include the current seed
for(int i=seed+1;i<r;i++) {
if (BIT_GET(data, i)) {
seed = i;
break;
}
}
}
Now for the output; this will consume the most time
unsigned long bit_index = 0; // the current bit
int w = 8; // the number of primes printed per row
unsigned pc = 0; // prime count, to assist in creating columns
for(unsigned int i=0;i<n;i++) {
unsigned long long int b = 1; // double width, so there is no overflow
// if a bit is still set, include that as a result
while(b < 0xFFFFFFFF) {
if (data[i]&b) {
printf("%8lu ", bit_index);
if((++pc % w) == 0)
putchar('\n'); // start a new row after every w primes
}
bit_index++;
b<<=1; // multiply by 2, to check the next bit
}
}
Clean up
delete [] data;

My Sieve of Eratosthenes takes too long

I have implemented Sieve of Eratosthenes to solve the SPOJ problem PRIME1. Though the output is fine, my submission exceeds the time limit. How can I reduce the run time?
int main()
{
vector<int> prime_list;
prime_list.push_back(2);
vector<int>::iterator c;
bool flag=true;
unsigned int m,n;
for(int i=3; i<=32000;i+=2)
{
flag=true;
float s = sqrt(static_cast<float>(i));
for(c=prime_list.begin();c<=prime_list.end();c++)
{
if(*c>s)
break;
if(i%(*c)==0)
{
flag=false;
break;
}
}
if(flag==true)
{
prime_list.push_back(i);
}
}
int t;
cin>>t;
for (int times = 0; times < t; times++)
{
cin>> m >> n;
if (t) cout << endl;
if (m < 2)
m=2;
unsigned int j;
vector<unsigned int> req_list;
for(j=m;j<=n;j++)
{
req_list.push_back(j);
}
vector<unsigned int>::iterator k;
flag=true;
int p=0;
for(j=m;j<=n;j++)
{
flag=true;
float s = sqrt(static_cast<float>(j));
for(c=prime_list.begin();c<=prime_list.end();c++)
{
if((*c)!=j)
{
if((*c)>s)
break;
if(j%(*c)==0)
{
flag=false;
break;
}
}
}
if(flag==false)
{
req_list.erase (req_list.begin()+p);
p--;
}
p++;
}
for(k=req_list.begin();k<req_list.end();k++)
{
cout<<*k;
cout<<endl;
}
}
}
Your code is slow because you did not implement the Sieve of Eratosthenes. The algorithm works this way:
1) Create an array of size n-1 representing the numbers 2 to n, and fill it with the boolean value true (true means that the number is prime; do not forget that we start counting from 2, i.e. array[0] is the number 2).
2) Current_number = 2.
3) Starting at the index of 2*Current_number, step through the array in increments of Current_number and set every visited entry to false.
4) Search for the first entry after Current_number that still holds true.
5) Current_number = index + 2.
6) Repeat steps 3-5 until the search runs past the end of the array.
This algorithm takes O(n log log n) time.
What you do actually takes a lot more time (O(n^2)).
By the way, in the second step (where you search for prime numbers between m and n) you do not have to check whether those numbers are prime again; ideally you will have computed them in the first phase of the algorithm.
As I see on the site you linked, the main problem is that you can't actually create an array of size n-1, because the maximum n is 10^9, which causes memory problems if you do it this naive way. That problem is yours to solve :)
I'd throw out what you have and start over with a really simple implementation of a sieve, and only add more complexity if really needed. Here's a possible starting point:
#include <vector>
#include <iostream>
int main() {
int number = 32000;
std::vector<bool> sieve(number,false);
sieve[0] = true; // Not used for now,
sieve[1] = true; // but you'll probably need these later.
for(int i = 2; i<number; i++) {
if(!sieve[i]) {
std::cout << "\t" << i;
for (int temp = 2*i; temp<number; temp += i)
sieve[temp] = true;
}
}
return 0;
}
For the given range (up to 32000), this runs in well under a second (with output directed to a file -- to the screen it'll generally be slower). It's up to you from there though...
I am not really sure that you have implemented the sieve of Eratosthenes. Anyway, a couple of things could speed up your algorithm to some extent. Avoid multiple reallocations of the vector contents by preallocating space (look up std::vector<>::reserve). The sqrt operation is expensive, and you can probably avoid it altogether by modifying the tests (stop when x*x > y instead of checking x > sqrt(y)).
Then again, you will get a much better improvement by revising the actual algorithm. From a cursory look it seems as if you are iterating over all candidates and, for each one of them, trying to divide by all the known primes that could be factors. The sieve of Eratosthenes takes a single prime and discards all multiples of that prime in a single pass.
Note that the sieve does not perform any operation to test whether a number is prime: if it was not discarded before, then it is a prime. Each composite number is visited only once for each of its distinct prime factors. Your algorithm, on the other hand, processes every number many times (against the existing primes).
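The two micro-optimizations mentioned above (preallocating with reserve and comparing squares instead of calling sqrt) could look like this when applied to the trial-division approach from the question. This is a hedged sketch, not a replacement for a real sieve; the name smallPrimes and the capacity guess of limit / 8 are assumptions made for the example.

```cpp
#include <vector>

// Trial division against the primes found so far, without sqrt.
std::vector<int> smallPrimes(int limit) {
    std::vector<int> primes;
    if (limit < 2) return primes;
    // Rough capacity guess (an assumption); if it is too small the vector
    // simply reallocates, so correctness does not depend on it.
    primes.reserve(limit / 8 + 8);
    for (int n = 2; n <= limit; ++n) {
        bool isPrime = true;
        for (int p : primes) {
            if (1LL * p * p > n) break;  // same cutoff as p > sqrt(n)
            if (n % p == 0) { isPrime = false; break; }
        }
        if (isPrime) primes.push_back(n);
    }
    return primes;
}
```

This is still trial division, so it only trims constant factors; crossing off multiples as the sieve does remains the larger win.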
I think one way to slightly speed up your sieve is to avoid the mod operator in this line:
if(i%(*c)==0)
Instead of the (relatively) expensive mod operation, you could maybe iterate forward in your sieve with addition.
Honestly, I don't know if this is correct. Your code is difficult to read without comments and with single letter variable names.
The way I understand the problem is that you have to generate all primes in a range [m,n].
A way to do this without having to compute all primes from [0,n], because this is most likely what's slowing you down, is to first generate all the primes in the range [0,sqrt(n)].
Then use the result to sieve in the range [m,n]. To generate the initial list of primes, implement a basic version of the sieve of Eratosthenes (Pretty much just a naive implementation from the pseudo code in the Wikipedia article will do the trick).
This should enable you to solve the problem in very little time.
Here's a simple sample implementation of the sieve of Eratosthenes:
std::vector<unsigned> sieve( unsigned n ) {
std::vector<bool> v( n, true ); //Will be used for testing numbers
std::vector<unsigned> p; //Will hold the prime numbers
for( unsigned i = 2; i < n; ++i ) {
if( v[i] ) { //Found a prime number
p.push_back(i); //Stuff it into our list
for( unsigned j = i + i; j < n; j += i ) {
v[j] = false; //Isn't a prime/Is composite
}
}
}
return p;
}
It returns a vector containing only the primes below n. You can then use this to implement the method I mentioned. I won't provide the implementation for you, but you basically have to do the same thing as in the sieve of Eratosthenes; instead of using all the integers in [2,n], you just use the primes you found. Not sure if this is giving away too much?
Since the SPOJ problem in the original question doesn't specify that it has to be solved with the Sieve of Eratosthenes, here's an alternative based on this article. On my six-year-old laptop it runs in about 15 ms for the worst single test case (n-m=100,000). Note that the formula produces primes in no particular order and with many repeats, so it does not list every prime in a given range.
#include <set>
#include <iostream>
using namespace std;
int gcd(int a, int b) {
while (true) {
a = a % b;
if(a == 0)
return b;
b = b % a;
if(b == 0)
return a;
}
}
/**
* Here is Rowland's formula. We define a(1) = 7, and for n >= 2 we set
*
* a(n) = a(n-1) + gcd(n,a(n-1)).
*
* Here "gcd" means the greatest common divisor. So, for example, we find
* a(2) = a(1) + gcd(2,7) = 8. The prime generator is then a(n) - a(n-1),
* the so-called first differences of the original sequence.
*/
void find_primes(int start, int end, set<int>* primes) {
int an; // a(n)
int anm1 = 7; // a(n-1)
int diff;
for (int n = start; n < end; n++) {
an = anm1 + gcd(n, anm1);
diff = an - anm1;
if (diff > 1)
primes->insert(diff);
anm1 = an;
}
}
int main() {
const int end = 100000;
const int start = 2;
set<int> primes;
find_primes(start, end, &primes);
cout << "Found " << primes.size() << " primes:" << endl;
set<int>::iterator iter = primes.begin();
for (; iter != primes.end(); ++iter)
cout << *iter << endl;
}
Profile your code, find hotspots, eliminate them. Windows, Linux profiler links.