Trouble sieving primes from a large range

#include <cstdio>
#include <algorithm>
#include <cmath>
using namespace std;
int main() {
int t,m,n;
scanf("%d",&t);
while(t--)
{
scanf("%d %d",&m,&n);
int rootn=sqrt(double(n));
bool p[10000]; //finding prime numbers from 1 to square_root(n)
for(int j=0;j<=rootn;j++)
p[j]=true;
p[0]=false;
p[1]=false;
int i=rootn;
while(i--)
{
if(p[i]==true)
{
int c=i;
do
{
c=c+i;
p[c]=false;
}while(c+p[i]<=rootn);
}
};
i=0;
bool rangep[10000]; //used for finding prime numbers between m and n by eliminating multiple of primes in between 1 and squareroot(n)
for(int j=0;j<=n-m+1;j++)
rangep[j]=true;
i=rootn;
do
{
if(p[i]==true)
{
for(int j=m;j<=n;j++)
{
if(j%i==0&&j!=i)
rangep[j-m]=false;
}
}
}while(i--);
i=n-m;
do
{
if(rangep[i]==true)
printf("%d\n",i+m);
}while(i--);
printf("\n");
}
return 0;
system("PAUSE");
}
Hello, I'm trying to use the Sieve of Eratosthenes to find the prime numbers in a range from m to n, where m>=1 and n<=100000000. When I give an input of 1 to 10000, the result is correct. But for a wider range the stack overflows, even if I increase the array sizes.

A simple and more readable implementation
void Sieve(int n) {
int sqrtn = (int)sqrt((double)n);
std::vector<bool> sieve(n + 1, false);
for (int m = 2; m <= sqrtn; ++m) {
if (!sieve[m]) {
cout << m << " ";
for (int k = m * m; k <= n; k += m)
sieve[k] = true;
}
}
for (int m = sqrtn + 1; m <= n; ++m) // start after sqrtn, which the first loop already handled
if (!sieve[m])
cout << m << " ";
}

Reason of getting error
You are declaring an enormous array as a local variable. When the stack frame of main is pushed, it needs so much memory that a stack overflow exception is generated. Visual Studio is clever enough to analyze the code for projected run-time stack usage and raise the exception when needed.
Use this compact implementation instead. You can also declare bs inside the function if you want, as long as it is static so that it does not live on the stack. Don't make implementations complex.
Implementation
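#include <bitset>
#include <vector>
using namespace std;
typedef long long ll;
typedef vector<int> vi;
vi primes;
bitset<100000001> bs; // global, so it is not placed on the stack; valid indices are 0..10^8
ll _sieve_size; // was used below without being declared
void sieve(ll upperbound) { // requires upperbound <= 100000000
_sieve_size = upperbound + 1;
bs.set();
bs[0] = bs[1] = 0;
for (ll i = 2; i < _sieve_size; i++) // i <= _sieve_size would index one past upperbound
if (bs[i]) { //if not marked
for (ll j = i * i; j < _sieve_size; j += i) //cross out all the multiples
bs[j] = 0; // they are surely not prime :-)
primes.push_back((int)i); // this is prime
} }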
Call sieve(10000); from main(). Afterwards the list of primes is in the vector primes.
Note: as mentioned in a comment, a stack overflow is a rather unexpected error here. You are implementing a sieve, but it will be more memory-efficient if you use a bitset instead of an array of bool.
One more thing: if n=10^8 then sqrt(n)=10^4, and your bool array is p[10000]. Its valid indices are 0 to 9999, so there is a chance of accessing the array out of bounds.

I agree with the other answers,
saying that you should basically just start over. 
Do you even care why your code doesn’t work?  (You didn’t actually ask.)
I’m not sure that the problem in your code
has been identified accurately yet. 
First of all, I’ll add this comment to help set the context:
// For any int aardvark;
// p[aardvark] = false means that aardvark is composite (i.e., not prime).
// p[aardvark] = true means that aardvark might be prime, or maybe we just don’t know yet.
Now let me draw your attention to this code:
int i=rootn;
while(i--)
{
if(p[i]==true)
{
int c=i;
do
{
c=c+i;
p[c]=false;
}while(c+p[i]<=rootn);
}
};
You say that n≤100000000 (although your code doesn’t check that), so,
presumably, rootn≤10000, which is the dimensionality (size) of p[]. 
The above code is saying that, for every integer i
(no matter whether it’s prime or composite),
2×i, 3×i, 4×i, etc., are, by definition, composite. 
So, for c equal to 2×i, 3×i, 4×i, …,
we set p[c]=false because we know that c is composite.
But look closely at the code. 
It sets c=c+i and says p[c]=false
before checking whether c is still in range
to be a valid index into p[]. 
Now, if n≤25000000, then rootn≤5000. 
If i≤ rootn, then i≤5000, and, as long as c≤5000, then c+i≤10000. 
But, if n>25000000, then rootn>5000,†
and the sequence i=rootn;, c=i;, c=c+i;
can set c to a value greater than 10000. 
And then you use that value to index into p[]. 
That’s probably where the stack overflow occurs.
Oh, BTW; you don’t need to say if(p[i]==true); if(p[i]) is good enough.
To add insult to injury, there’s a second error in the same block:
while(c+p[i]<=rootn). 
c and i are ints,
and p is an array of bools, so p[i] is a bool —
and yet you are adding c + p[i]. 
We know from the if that p[i] is true,
which is numerically equal to 1 —
so your loop termination condition is while (c+1<=rootn);
i.e., while c≤rootn-1. 
I think you meant to say while(c+i<=rootn).
Oh, also, why do you have executable code
immediately after an unconditional return statement? 
The system("PAUSE"); statement cannot possibly be reached.
(I’m not saying that those are the only errors;
they are just what jumped out at me.)
______________
† OK, splitting hairs, n has to be ≥ 25010001
(i.e., 5001²) before rootn>5000.
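For what it's worth, here is a minimal sketch of how the two phases could look with both fixes applied (the bound is checked before every write, and the buffers are sized from the input and live on the heap). The function name primesInRange and the use of long long are my own assumptions, not something taken from the question:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns all primes in [m, n], assuming 1 <= m <= n.
std::vector<long long> primesInRange(long long m, long long n) {
    assert(1 <= m && m <= n);

    // Integer square root of n, computed without floating point.
    long long root = 1;
    while ((root + 1) * (root + 1) <= n) ++root;

    // Phase 1: ordinary sieve for the base primes up to sqrt(n).
    std::vector<bool> small(root + 1, true);   // heap storage, sized from the input
    std::vector<long long> basePrimes;
    for (long long i = 2; i <= root; ++i) {
        if (!small[i]) continue;
        basePrimes.push_back(i);
        // The bound is tested before each write, so c never leaves the vector.
        for (long long c = i * i; c <= root; c += i) small[c] = false;
    }

    // Phase 2: cross out multiples of the base primes inside [m, n].
    std::vector<bool> inRange(n - m + 1, true);
    for (long long p : basePrimes) {
        long long first = ((m + p - 1) / p) * p;   // first multiple of p that is >= m
        if (first < p * p) first = p * p;          // never cross out p itself
        for (long long c = first; c <= n; c += p) inRange[c - m] = false;
    }

    std::vector<long long> result;
    for (long long v = m; v <= n; ++v)
        if (v >= 2 && inRange[v - m]) result.push_back(v);
    return result;
}
```

Note that the second buffer has n-m+1 entries, so its memory use depends on the width of the range, not on n itself.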

Related

How to deal with large sizes of data such as array or just number that causing stack in Cpp?

It's my first time dealing with large numbers or arrays, and I can't avoid overflowing the stack. I tried using long long to avoid it, but the error is reported at the int main line:
CODE:
#include <iostream>
using namespace std;
int main()
{
long long n=0, city[100000], min[100000] = {10^9}, max[100000] = { 0 };
cin >> n;
for (int i = 0; i < n; i++) {
cin >> city[i];
}
for (int i = 0; i < n; i++)
{//min
for (int s = 0; s < n; s++)
{
if (city[i] != city[s])
{
if (min[i] >= abs(city[i] - city[s]))
{
min[i] = abs(city[i] - city[s]);
}
}
}
}
for (int i = 0; i < n; i++)
{//max
for (int s = 0; s < n; s++)
{
if (city[i] != city[s])
{
if (max[i] <= abs(city[i] - city[s]))
{
max[i] = abs(city[i] - city[s]);
}
}
}
}
for (int i = 0; i < n; i++) {
cout << min[i] << " " << max[i] << endl;
}
}
**ERROR:**
Severity Code Description Project File Line Suppression State
Warning C6262 Function uses '2400032' bytes of stack: exceeds /analyze:stacksize '16384'. Consider moving some data to heap.
then it opens chkstk.asm and shows error in :
test dword ptr [eax],eax ; probe page.

Small optimistic remark:
100,000 is not a large number for your computer! (you're also not dealing with that many arrays, but arrays of that size)
Error message describes what goes wrong pretty well:
You're creating arrays on your current function's "scratchpad" (the stack). That has very limited size!
This is C++, so you really should do things the (modern-ish) C++ way and avoid manually handling large data objects when you can.
So, replace
long long n=0, city[100000], min[100000] = {10^9}, max[100000] = { 0 };
with the following. (I don't see any case where you'd want to use long long; presumably, you want a 64-bit variable? Also note that 10^9 is "10 XOR 9", not "10 to the power of 9".)
constexpr size_t size = 100000;
constexpr int64_t default_min = 1'000'000'000;
uint64_t n = 0;
std::vector<int64_t> city(size);
std::vector<int64_t> min_value(size, default_min);
std::vector<int64_t> max_value(size, 0);
Additional remarks:
Notice how I took your 100000 and your 10⁹ and made them constexpr constants? Do that! Whenever some non-zero "magic constant" appears in your code, it's a good time to ask yourself "will I ever need that value somewhere else, too?" and "Would it make sense to give this number a name explaining what it is?". And if you answer one of them with "yes": make a new constexpr constant, even just directly above where you use it! The compiler will just deal with that as if you had the literal number where you use it, it's not any extra memory, or CPU cycles, that this will cost.
As a matter of fact, even that fixed size is bad! Pre-allocating not-really-large-but-still-unnecessarily-large arrays is just a bad idea. Instead, read n first, then use that n to make std::vectors of that size.
Do not use using namespace std;, for multiple reasons, chief among them that your min and max variables now shadow std::min and std::max, so when you call something you never know whether you're actually calling what you mean to, or just the function of the same name from the std:: namespace. Instead, using std::cout; using std::cin; would do for you here!
This might be beyond your current learning level (that's fine!), but
for (int i = 0; i < n; i++) {
cin >> city[i];
}
is inelegant, and with the std::vector approach, if you make your std::vector really have length n, can be written nicely as:
for (auto &value: city) {
cin >> value;
}
This will also make sure you're not accidentally reading more values than you mean when changing the length of that city storage one day.
It looks as if you're trying to find the minimum and maximum absolute distance between city values. But you do it in an incredibly inefficient way, needing multiple loops over 10⁵·10⁵=10¹⁰ iterations.
Start with the maximum distance: assume your city vector, array (whatever!) were sorted. What are the two elements with the greatest absolute distance?
If you had a sorted array/vector: how would you find the two elements with the smallest distance?
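In case the hints above are not enough, here is one possible sketch of the sorted approach. It assumes all city values are distinct and that there are at least two of them; the function name minMaxDistances is made up for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: for each value, the smallest and the largest absolute
// distance to any other value. Assumes distinct values and size >= 2.
std::vector<std::pair<std::int64_t, std::int64_t>>
minMaxDistances(const std::vector<std::int64_t>& city) {
    assert(city.size() >= 2);
    std::vector<std::int64_t> sorted(city);
    std::sort(sorted.begin(), sorted.end());

    std::vector<std::pair<std::int64_t, std::int64_t>> out;
    out.reserve(city.size());
    for (std::int64_t x : city) {
        // Position of x in sorted order, found by binary search.
        const std::size_t pos = static_cast<std::size_t>(
            std::lower_bound(sorted.begin(), sorted.end(), x) - sorted.begin());

        // The nearest value is always one of the two sorted neighbours.
        std::int64_t nearest = std::numeric_limits<std::int64_t>::max();
        if (pos > 0) nearest = std::min(nearest, x - sorted[pos - 1]);
        if (pos + 1 < sorted.size()) nearest = std::min(nearest, sorted[pos + 1] - x);

        // The farthest value is always one of the two ends.
        const std::int64_t farthest = std::max(x - sorted.front(), sorted.back() - x);
        out.emplace_back(nearest, farthest);
    }
    return out;
}
```

This needs one sort plus one binary search per element, so roughly n·log n steps instead of n².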

While loop task in c++

I am a beginner in c++ and I am having problems with making this code work the way I want it to. The task is to write a program that multiplies all the natural numbers up to the loaded number n.
To make it print the correct result, I divided x by n (see code below). How can I make it print x and not have to divide it by n to get the correct answer?
#include<iostream>
using namespace std;
int main(){
int n,x=1;
int i=0;
cout<<"Enter a number bigger than 0:"<<endl;
cin>>n;
while(i<n){
i++;
x=i*x;
};
cout<<"The result is: "<<x/n<<endl;
return 0;
}

First of all, a principle you'd best get used to as quickly as possible: always check user input for correctness!
cin >> n;
if(cin && n > 0)
{
// valid
}
else
{
// appropriate error handling
}
I'm not sure why you need a while loop. A for loop is nicer in this case:
int x = 1;
for(int i = 2; i < n; ++i)
x *= i;
If you still want the while loop: Start with i == 2 (1 is neutral anyway) and increment afterwards:
i = 2;
while(i < n)
{
x *= i;
++i;
}
In case of n == 1, the loop (either variant) simply won't be entered and you are fine...

You already have two very good options, but here is another one you might want to take a look at once you are more at ease with programming:
unsigned factorial(unsigned value)
{
if (value <= 1)
{
return 1;
}
else
{
return value * factorial(value - 1);
}
}
It's a recursive function, which is kind of neat when used at the proper moments. This may not be one of them, unfortunately: every call adds a frame to the execution stack, so a deep enough recursion fills the stack before it finishes. But you can check it out to learn more about recursive functions.
When the stack is full, your app crashes with what is called a stack overflow.

How can I make it so that in the last cout I can only put x and not have to divide x by n to get the correct answer?
It will be better to use a for loop.
// This stops when i reaches n.
// That means, n is not multiplied to the result when the loop breaks.
for (int i = 1; i < n; ++i )
{
x *= i;
}
cout << "The result is: " << x <<endl;
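If you would rather keep the while loop, the same idea could be sketched as a small function. The name productBelow is only for illustration, and I am assuming an unsigned 64-bit result; the value still wraps around once the product no longer fits (for n above 21):

```cpp
#include <cassert>

// Hypothetical helper: product of all natural numbers strictly below n, for n >= 1.
unsigned long long productBelow(int n) {
    assert(n >= 1);
    unsigned long long x = 1;
    int i = 1;
    while (i < n) {   // multiply first, increment afterwards, so n itself is never multiplied in
        x *= i;
        ++i;
    }
    return x;
}
```

With this, main only has to read n, check it, and print productBelow(n), with no division at the end.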

Count number of ways for choosing two numbers in efficient algorithm

I solved this problem, but I got TLE (Time Limit Exceeded) on the online judge.
The output of the program is right, but I think it can be made more efficient!
The problem:
Given n integer numbers, count the number of ways in which we can choose two elements such
that their absolute difference is less than 32.
In a more formal way, count the number of pairs (i, j) (1 ≤ i < j ≤ n) such that
|V[i] - V[j]| < 32. |X|
is the absolute value of X.
Input
The first line of input contains one integer T, the number of test cases (1 ≤ T ≤ 128).
Each test case begins with an integer n (1 ≤ n ≤ 10,000).
The next line contains n integers (1 ≤ V[i] ≤ 10,000).
Output
For each test case, print the number of pairs on a single line.
my code in c++ :
int main() {
int T,n,i,j,k,count;
int a[10000];
cin>>T;
for(k=0;k<T;k++)
{ count=0;
cin>>n;
for(i=0;i<n;i++)
{
cin>>a[i];
}
for(i=0;i<n;i++)
{
for(j=i;j<n;j++)
{
if(i!=j)
{
if(abs(a[i]-a[j])<32)
count++;
}
}
}
cout<<count<<endl;
}
return 0;
}
I need help: how can I solve it with a more efficient algorithm?

Despite my previous (silly) answer, there is no need to sort the data at all. Instead you should count the frequencies of the numbers.
Then all you need to do is keep track of the number of viable numbers to pair with while iterating over the possible values. Sorry, no C++, but Java should be readable as well:
int solve (int[] numbers) {
int[] frequencies = new int[10001];
for (int i : numbers) frequencies[i]++;
int solution = 0;
int inRange = 0;
for (int i = 0; i < frequencies.length; i++) {
if (i > 32) inRange -= frequencies[i - 32];
solution += frequencies[i] * inRange;
solution += frequencies[i] * (frequencies[i] - 1) / 2;
inRange += frequencies[i];
}
return solution;
}

#include <bits/stdc++.h>
using namespace std;
int a[10010];
int N;
int search (int x){
int low = 0;
int high = N;
while (low < high)
{
int mid = (low+high)/2;
if (a[mid] >= x) high = mid;
else low = mid+1;
}
return low;
}
int main() {
cin >> N;
for (int i=0 ; i<N ; i++) cin >> a[i];
sort(a,a+N);
long long ans = 0;
for (int i=0 ; i<N ; i++)
{
int t = search(a[i]+32);
ans += (t -i - 1);
}
cout << ans << endl;
return 0;
}

You can sort the numbers, and then use a sliding window. Starting with the smallest number, populate a std::deque with the numbers so long as they are no larger than the smallest number + 31. Then in an outer loop for each number, update the sliding window and add the new size of the sliding window to the counter. Update of the sliding window can be performed in an inner loop, by first pop_front every number that is smaller than the current number of the outer loop, then push_back every number that is not larger than the current number of the outer loop + 31.

One faster solution would be to first sort the array, then iterate through the sorted array and for each element only visit the elements to the right of it until the difference exceeds 31.
Sorting can probably be done via count sort (since you have 1 ≤ V[i] ≤ 10,000). So you get linear time for the sorting part. It might not be necessary though (maybe quicksort suffices in order to get all the points).
Also, you can do a trick for the inner loop (the "going to the right of the current element" part). Keep in mind that if S[i+k]-S[i]<32, then S[i+k]-S[i+1]<32, where S is the sorted version of V. With this trick the whole algorithm turns linear.
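A minimal sketch of that trick, assuming std::sort for the sorting step (so the total is O(n log n); swapping in a counting sort would make it linear). The name countClosePairs and the limit parameter are my own:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of pairs (i, j), i < j, with |v[i] - v[j]| < limit.
long long countClosePairs(std::vector<int> v, int limit = 32) {
    assert(limit > 0);
    std::sort(v.begin(), v.end());
    long long pairs = 0;
    std::size_t left = 0;   // start of the current window
    for (std::size_t right = 0; right < v.size(); ++right) {
        // Advance left until v[right] - v[left] < limit; left never moves backwards,
        // and it cannot pass right because v[right] - v[right] is 0.
        while (v[right] - v[left] >= limit) ++left;
        // Every element in [left, right) forms a valid pair with v[right].
        pairs += static_cast<long long>(right - left);
    }
    return pairs;
}
```

Because left only ever moves forward, the loop after the sort does at most 2n steps in total.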

This can be done in a constant number of passes over the data, and it can actually be done without being affected by the value of the "interval" (in your case, 32).
This is done by populating an array where a[i] = a[i-1] + number_of_times_i_appears_in_the_data - informally, a[i] holds the total number of elements that are smaller/equals to i.
Code (for a single test case):
constexpr int UPPER_LIMIT = 10001; // a compile-time constant, so it is a valid array size
constexpr int K = 32;
int frequencies[UPPER_LIMIT] = {0}; // O(U)
int n;
std::cin >> n;
for (int i = 0; i < n; i++) { // O(n)
int x;
std::cin >> x;
frequencies[x] += 1;
}
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
frequencies[i] += frequencies[i-1];
}
int count = 0;
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
int low_idx = std::max(i-K, 0);
int number_of_elements_with_value_i = frequencies[i] - frequencies[i-1];
if (number_of_elements_with_value_i == 0) continue;
int number_of_elements_with_value_K_close_to_i =
(frequencies[i-1] - frequencies[low_idx]);
// debug output; sent to std::cerr so that it does not mix with the answer on std::cout
std::cerr << "i: " << i << " number_of_elements_with_value_i: " << number_of_elements_with_value_i << " number_of_elements_with_value_K_close_to_i: " << number_of_elements_with_value_K_close_to_i << std::endl;
count += number_of_elements_with_value_i * number_of_elements_with_value_K_close_to_i;
// Finally, add "duplicates" of i, this is basically sum of arithmetic
// progression with d=1, a0=0, n=number_of_elements_with_value_i
count += number_of_elements_with_value_i * (number_of_elements_with_value_i-1) /2;
}
std::cout << count;
Working full example on IDEone.

You can sort and then use break to end the loop whenever the difference goes out of range.
int main()
{
int t;
cin>>t;
while(t--){
int n,c=0;
cin>>n;
int ar[n];
for(int i=0;i<n;i++)
cin>>ar[i];
sort(ar,ar+n);
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
if(ar[j]-ar[i] < 32)
c++;
else
break;
}
}
cout<<c<<endl;
}
}
Or you can use a frequency array over the value range, mark the occurrences of each element, and then, for each value x, count the elements whose value lies within 31 of x.

A good approach here is to split the numbers into separate buckets:
constexpr int limit = 10000;
constexpr int diff = 32;
constexpr int bucket_num = (limit/diff)+1;
std::array<std::vector<int>,bucket_num> buckets;
cin>>n;
int number;
for(i=0;i<n;i++)
{
cin >> number;
buckets[number/diff].push_back(number%diff);
}
Obviously the numbers that are in the same bucket are close enough to each other to fit the requirement, so we can just count all the pairs:
int result = std::accumulate(buckets.begin(), buckets.end(), 0,
[](int s, vector<int>& v){ return s + (v.size()*(v.size()-1))/2; });
The numbers that are in non-adjacent buckets cannot form any acceptable pairs, so we can just ignore them.
This leaves the last corner case - adjacent buckets - which can be solved in many ways:
for(int i=0;i<bucket_num-1;i++)
if(buckets[i].size() && buckets[i+1].size())
result += adjacent_buckets(buckets[i], buckets[i+1]);
Personally I like the "occurrence frequency" approach on the one bucket scale, but there may be better options:
int adjacent_buckets(const vector<int>& bucket1, const vector<int>& bucket2)
{
std::array<int,diff> pairs{};
for(int number : bucket1)
{
for(int i=0;i<number;i++)
pairs[i]++;
}
return std::accumulate(bucket2.begin(), bucket2.end(), 0,
[&pairs](int s, int n){ return s + pairs[n]; });
}
This function first builds an array of "numbers from lower bucket that are close enough to i", and then sums the values from that array corresponding to the upper bucket numbers.
In general this approach has O(N) complexity, in the best case it will require pretty much only one pass, and overall should be fast enough.
Working Ideone example

This solution can be considered O(N) to process the N input numbers, plus a constant-time pass over the fixed value range:
#include <iostream>
using namespace std;
void solve()
{
int a[10001] = {0}, N, n, X32 = 0, ret = 0;
cin >> N;
for (int i=0; i<N; ++i)
{
cin >> n;
a[n]++;
}
for (int i=0; i<10001; ++i)
{
if (i >= 32)
X32 -= a[i-32];
if (a[i])
{
ret += a[i] * X32;
ret += a[i] * (a[i]-1)/2;
X32 += a[i];
}
}
cout << ret << endl;
}
int main()
{
int T;
cin >> T;
for (int i=0 ; i<T ; i++)
solve();
}
run this code on ideone
Solution explanation: a[i] represents how many times i appeared in the input series.
Then you go over the entire array, and X32 keeps track of the number of elements that are within range of i. The only tricky part really is to count correctly when some i is repeated multiple times: a[i] * (a[i]-1)/2. That's it.

You should start by sorting the input.
Then if your inner loop detects the distance grows above 32, you can break from it.

Thanks, everyone, for the effort and time spent on this problem.
I appreciate all the attempts to solve it.
After testing the answers on the online judge, I found that the correct and most efficient solutions are Stef's answer and AbdullahAhmedAbdelmonem's answer. pavel's solution is also right, but it is exactly the same as Stef's solution in a different language (C++).
Stef's code was accepted with an execution time of 358 ms on the Codeforces online judge.
AbdullahAhmedAbdelmonem's code was also accepted, with an execution time of 421 ms.
If they add a detailed explanation of their algorithm, the bounty will go to one of them.
You can try your solution by submitting it to the Codeforces online judge at this link, after choosing problem E, "Time Limit Exceeded?".
I also found a great and more understandable solution that uses a frequency array; its complexity is O(n).
In this algorithm you only need to update a specific range for each element inserted into the array, which is:
begin = element - 31
end = element + 31
and then count the number of pairs in this range for each inserted element in the frequency array:
int main() {
int T,n,i,j,k,b,e,count;
int v[10000];
int freq[10001];
cin>>T;
for(k=0;k<T;k++)
{
count=0;
cin>>n;
for(i=1;i<=10000;i++)
{
freq[i]=0;
}
for(i=0;i<n;i++)
{
cin>>v[i];
}
for(i=0;i<n;i++)
{
count=count+freq[v[i]];
b=v[i]-31;
e=v[i]+31;
if(b<=0)
b=1;
if(e>10000)
e=10000;
for(j=b;j<=e;j++)
{
freq[j]++;
}
}
cout<<count<<endl;
}
return 0;
}
Finally, I think the best approach to this kind of problem is to use a frequency array and count the number of pairs in a specific range, because its time complexity is O(n).

Improvements for isPrime function

There are many problems on the internet that require you to find prime numbers, so I decided to write a set of functions to find them. I used the Sieve of Eratosthenes for generating the primes as it was fast and easy to implement compared to other algorithms. However, I'm wondering if my code rather than my method is inefficient. Am I using STL containers/iterators right? Is there any section in my code slowing down the program?
In other words it does calculate the results correctly, but what I wonder about is whether its efficiency can be improved by some algorithmic improvement as opposed to just some code tweaking.
Any help is truly appreciated.
Here's my code
(I apologize if it's hard to read)
#include <iostream>
#include <set>
#include <vector>
#include <algorithm>
#include <cmath>
using namespace std;
#define initial_prime_barrier 100
bool isFlagged(int i) { return i == 0; }
bool isNextStart(int i) { return i != 0; }
vector<int> generatePrimesBelow(int limit)
{
vector<int> primes;
for (int i = 2; i < limit; i++)
{
primes.push_back(i);
}
vector<int>::iterator currentStart = primes.begin();
do
{
int numberAtStart = *currentStart;
vector<int>::iterator currentNumber = currentStart + numberAtStart;
do
{
*currentNumber = 0;
advance(currentNumber, numberAtStart);
} while (currentNumber < primes.end());
currentStart = find_if(currentStart + 1, primes.end(), isNextStart);
} while ((*currentStart) * (*currentStart) < limit);
vector<int>::iterator newEnd = remove_if(primes.begin(), primes.end(), isFlagged);
primes.erase(newEnd, primes.end());
return primes;
}
bool isPrime(int number)
{
static vector<int> primes = generatePrimesBelow(initial_prime_barrier);
static int numPrimes = primes.size();
static int largestPrime = primes[numPrimes-1];
static int halfwayPrime = primes[numPrimes/2];
if (number == largestPrime)
{
return true;
}
else if (number < largestPrime)
{
if (number == halfwayPrime)
{
return true;
}
else if (number > halfwayPrime)
{
for (int i = numPrimes/2; i < numPrimes; i++)
{
if (number == primes[i])
{
return true;
}
}
}
else if (number < halfwayPrime)
{
for (int i = numPrimes/2; i >= 0; i--)
{
if (number == primes[i])
{
return true;
}
}
}
}
else if (number > largestPrime)
{
primes = generatePrimesBelow(number + number);
numPrimes = primes.size();
largestPrime = primes[numPrimes-1];
halfwayPrime = primes[numPrimes/2];
return isPrime(number);
}
return false;
}
int main (int argc, char * const argv[])
{
const int number = 123123;
cout << (isPrime(number) ? "YES" : "NO") << endl;
}

Yes, it is your method. Several things. You don't need your array to hold numbers: each entry's index in the array is the number itself. You just need the entries to hold two values, true and false. So make your array a vector<bool>; it will be much more compact. Then, in your inner loop you start from x+x and advance by steps of x. You should start from x*x and advance by steps of 2*x. That will work for all x except 2. Make it a special case, or mark the even numbers in the initialization loop. Or treat an entry at i as representing the number 2*i+1 and dispense with handling evens altogether. This should speed up your sieve code. Lastly, you don't need a special find_if call with all its machinery; you can just check the current entry that comes up in the loop.
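To make that concrete, here is a rough sketch of the vector<bool> variant, with the even numbers marked during initialization and the odd x handled from x*x in steps of 2*x. The name buildPrimeTable is hypothetical, and this is only one way to arrange it:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: table[i] is true exactly when i is prime, for 0 <= i <= limit.
std::vector<bool> buildPrimeTable(int limit) {
    assert(limit >= 0);
    std::vector<bool> table(static_cast<std::size_t>(limit) + 1, true);
    table[0] = false;
    if (limit >= 1) table[1] = false;

    // Special case for 2: mark the even numbers above 2 during initialization.
    for (long long k = 4; k <= limit; k += 2) table[k] = false;

    // Odd x only: start at x*x and step by 2*x, which skips the even multiples.
    for (long long x = 3; x * x <= limit; x += 2) {
        if (!table[x]) continue;   // just check the current entry, no find_if needed
        for (long long k = x * x; k <= limit; k += 2 * x) table[k] = false;
    }
    return table;
}
```

A primality test for any number up to limit is then a single lookup in the returned table.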
(edit:) In your isPrime you perform a binary search by hand, but there is already a binary_search algo in STL. And you won't need it at all, if you keep your vector<bool> sieve array as is, without compressing. Then isPrime(i) needs just to check whether the array's value at the index i is still true.
(edit2:) Now, about efficiency. You recalculate up to n+n, probably in anticipation of more numbers to test. If you only test few, simple trial division on odds will be faster. If the numbers to test are all in a narrow-ish upper region, your best option is offset sieve with the lower sieve done up to the sqrt of the test region's upper limit. And if the numbers are widely distributed, then your current whole array approach can be used.
The key facts to use here are that there are approximately n ~= m/log m primes below m in value, that to sieve an array from 0 to m takes O(m*log (log m)) time, and that to sieve the upper region between a and b, i.e. with width d=b-a, by all the primes below r=sqrt b, it'd take time proportional to d*log (log r).
Also, when growing your sieve array it is best to expand, and not to recalculate the whole anew. The primes are all there. To sieve the appendage it will be necessary to loop through all the primes in the sieve array up to the sqrt of its new upper edge. This is reminiscent of segmented sieve, although there each new segment comes instead of, or in any case separately from a previous one.

Prime numbers program

I'm currently trying out some questions just to practice my programming skills. (I'm not taking it in school or anything yet; I'm self-taught.) I came across this problem, which requires me to read in a number from a given txt file. This number is N. I'm then supposed to find the Nth prime number for N <= 10 000. After I find it, I'm supposed to print it out to another txt file. For most parts of the question I'm able to understand and devise a method to get the answer. The problem is that I'm using an array to save previously found prime numbers so that I can check future numbers against them. Even when my array was size 100, the program crashes as soon as the input integer is roughly > 15.
#include <cstdio>
#include <iostream>
#include <cstdlib>
#include <fstream>
using namespace std;
int main() {
ifstream trial;
trial.open("C:\\Users\\User\\Documents\\trial.txt");
int prime;
trial >> prime;
ofstream write;
write.open("C:\\Users\\User\\Documents\\answer.txt");
int num[100], b, c, e;
bool check;
b = 0;
switch (prime) {
case 1:
{
write << 2 << endl;
break;
}
case 2:
{
write << 3 << endl;
break;
}
case 3:
{
write << 5 << endl;
break;
}
case 4:
{
write << 7 << endl;
break;
}
default:
{
for (int a = 10; a <= 1000000; a++) {
check = false;
if (((a % 2) != 0) && ((a % 3) != 0) && ((a % 5) != 0) && ((a % 7) != 0)) // first filter
{
for (int d = 0; d <= b; d++) {
c = num[d];
if ((a % c) == 0) {
check = true; // second filter based on previous recorded primes in array
break;
}
}
if (!check) {
e = a;
if (b <= 100) {
num[b] = a;
}
b = b + 1;
}
}
if ((b) == (prime - 4)) {
write << e << endl;
break;
}
}
}
}
trial.close();
write.close();
return 0;
}
I did this entirely based on my dummies guide and on my own, so do forgive some code inefficiency and the general newbie-ness of my algorithm.
Also for up to 15 it displays the prime numbers correctly.
Could anyone tell me how I should go about improving this current code? I'm thinking of using a txt file in place of the array. Is that possible? Any help is appreciated.

Since your question is about programming rather than math, I will try to keep my answer that way too.
My first glance at your code makes me wonder what on earth you are doing here... If you read the answers, you will realize that some of them didn't bother to understand your code, and some just dumped your code into a debugger to see what's going on. Is it that we are that impatient? Or is it simply that your code is too difficult to understand for a relatively easy problem?
To improve your code, try asking yourself some questions:
What are a, b, c, etc.? Wouldn't it be better to give them more meaningful names?
What exactly is your algorithm? Can you write down a clearly written paragraph in English about what you are doing (in an exact way)? Can you modify the paragraph into a series of steps that you can mentally carry out on any input and can be sure that it is correct?
Are all steps necessary? Can we combine or even eliminate some of them?
What are the steps that are easy to express in English but require, say, more than 10 lines in C/C++?
Does your list of steps have any structures? Loops? Big (probably repeated) chunks that can be put as a single step with sub-steps?
After you have gone through the questions, you will probably have clearly laid out pseudo-code that solves the problem and is easy to explain and understand. After that you can implement your pseudo-code in C/C++, or, in fact, any general purpose language.

There are two approaches to testing for primality you might want to consider:
The problem domain is small enough that just looping over the numbers until you find the Nth prime would probably be an acceptable solution and take less than a few milliseconds to complete. There are a number of simple optimizations you can make to this approach. For example, you only need to test for divisibility by 2 once, after which you only have to check against the odd numbers, and you only have to check divisors less than or equal to the square root of the number being tested.
The Sieve of Eratosthenes is very effective, easy to implement, and incredibly light on the math end of things.
As for why your code is crashing, I suspect changing the line that reads
for( int d=0; d<=b; d++)
to
for( int d=0; d<b; d++)
will fix the problem because you are trying to read from a potentially uninitialized element of the array which probably contains garbage.
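For the first approach, a minimal sketch might look like the following. It stores the primes found so far in a std::vector instead of a fixed array; the name nthPrime is only an illustration, and the file handling from the question is left out:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the n-th prime (1-based), so nthPrime(1) is 2.
int nthPrime(int n) {
    assert(n >= 1);
    std::vector<int> primes;   // grows as needed, so there is no fixed-size array to overrun
    for (int candidate = 2; static_cast<int>(primes.size()) < n; ++candidate) {
        bool composite = false;
        for (int p : primes) {                 // only initialized entries are ever read
            if (p * p > candidate) break;      // no divisor found up to sqrt(candidate)
            if (candidate % p == 0) { composite = true; break; }
        }
        if (!composite) primes.push_back(candidate);
    }
    return primes.back();
}
```

Dividing only by the primes found so far, and stopping at the square root, keeps this fast enough for N up to 10 000.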

I haven't looked at your code, but your array must be large enough to contain all the values you will store in it. 100 certainly isn't going to be enough for most input for this problem.
E.g. this code..
int someArray[100];
someArray[150] = 10;
writes to a location beyond the end of the array (150 > 100). This is known as a memory overwrite. Depending on what happened to be at that memory location, your program may crash immediately, later, or never at all.
A good practice when using arrays is to assert in some way that the element you are writing to is within the bounds of the array. Or use an array-type class that performs this checking.
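As a small illustration of the second option, std::vector::at() performs the bounds check for you and throws std::out_of_range on a bad index. The wrapper tryStore below is a made-up name, just to show the behaviour:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores value at index only if the index is inside the vector.
// Returns false instead of overwriting unrelated memory.
bool tryStore(std::vector<int>& values, std::size_t index, int value) {
    try {
        values.at(index) = value;   // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

In contrast, operator[] does no checking, so an out-of-range index there is undefined behaviour, just as with a plain array.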
For your problem the easiest approach would be to use the STL vector class. While you must add elements (vector::push_back()) you can later access elements using the array operator []. Vector will also give you the best iterative performance.
Here's some sample code of adding the numbers 0-100 to a vector and then printing them. Note in the second loop we use the count of items stored in the vector.
#include <vector> // std::vector
...
const int MAX_ITEMS = 100;
std::vector<int> intVector;
intVector.reserve(MAX_ITEMS); // allocates all memory up-front
// add items
for (int i = 0; i < MAX_ITEMS; i++)
{
intVector.push_back(i); // this is how you add a value to a vector;
}
// print them
for (int i = 0; i < intVector.size(); i++)
{
int elem = intVector[i]; // this access the item at index 'i'
printf("element %d is %d\n", i, elem);
}

I'm trying to improve my functional programming at the moment so I just coded up the sieve quickly. I figure I'll post it here. If you're still learning, you might find it interesting, too.
#include <iostream>
#include <list>
#include <math.h>
#include <functional>
#include <algorithm>
using namespace std;
class is_multiple : public binary_function<int, int, bool>
{
public:
bool operator()(int value, int test) const
{
if(value == test) // do not remove the first value
return false;
else
return (value % test) == 0;
}
};
int main()
{
list<int> numbersToTest;
int input = 500;
// add all candidates to the list, starting at 2 because 1 is not prime
for(int x = 2; x < input; x++)
numbersToTest.push_back(x);
// starting at 2 go through the list and remove all multiples until you reach the square root
// of input
for(list<int>::iterator itr = numbersToTest.begin(); itr != numbersToTest.end() && *itr <= sqrt((float) input); itr++)
{
// list::remove_if only invalidates iterators to the elements it erases, and is_multiple
// never erases *itr itself, so itr stays valid and there is no need to find it again
numbersToTest.remove_if(bind2nd(is_multiple(), *itr));
}
// output primes (every remaining element, including the last one)
for(list<int>::iterator itr = numbersToTest.begin(); itr != numbersToTest.end(); itr++)
cout << *itr << "\t";
system("PAUSE");
return 0;
}
Any advice on how to improve this would be welcome by the way.
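One possible direction, offered as a sketch rather than a definitive answer: with C++11 a lambda can take the place of the functor class and bind2nd, and a std::vector with the erase-remove idiom can take the place of the list. The function name primesBelow is made up for this example, and repeated filtering like this is still slower than a flag-array sieve.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns all primes below limit by repeatedly filtering out multiples.
std::vector<int> primesBelow(int limit)
{
    assert(limit >= 0);
    std::vector<int> numbers;
    for (int x = 2; x < limit; ++x)
        numbers.push_back(x);

    // Use an index rather than an iterator, because erase() invalidates vector iterators.
    // Everything before position i is a smaller prime, so position i still holds p afterwards.
    for (std::size_t i = 0; i < numbers.size(); ++i)
    {
        const int p = numbers[i];
        if (static_cast<long long>(p) * p >= limit)
            break; // every composite below limit has a factor whose square is below limit
        numbers.erase(std::remove_if(numbers.begin(), numbers.end(),
                                     [p](int value) { return value != p && value % p == 0; }),
                      numbers.end());
    }
    return numbers;
}
```

Calling primesBelow(500) covers the same range as the program above.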

Here is my code. It finds all the prime numbers up to the number you enter, but it is very slow for large inputs!
#include <iostream>
#include <fstream>
#include <cmath>
using namespace std;
int main()
{
int m;
int n=0;
char ch;
fstream fp;
cout<<"Find all prime numbers up to which number? ";
if(!(cin>>m)) // the stream converts to false when the read fails
{
cout<<"Bad input! Please try again!\n";
return 1;
}
if(m<2)
{
cout<<"There are no prime numbers within "<<m<<endl;
return 0;
}
else if(m==2)
{
fp.open("prime.txt",ios::in|ios::out|ios::trunc);//create a file that can be written and read. If the file exists, it will be overwritten.
fp<<"There is only 1 prime number within 2.\n";
fp<<"2\n";
fp.close();
cout<<"Congratulations! It has worked out!\n";
return 0;
}
else
{
int j;
int sq;
fp.open("prime.txt",ios::in|ios::out|ios::trunc);
fp<<"2\t\t";
n++;
for(int i=3;i<=m;i+=2)
{
sq=static_cast<int>(sqrt(static_cast<double>(i)))+1; // cast to double so the sqrt overload is unambiguous
fp.seekg(0,ios::beg);
fp>>j;
for(;j<sq;)
{
if(i%j==0)
{
break;
}
else
{
if(!(fp>>j)) // the read failed; comparing a stream with NULL does not compile in C++11
{
j=3;
}
}
}
if(j>=sq)
{
fp.seekg(0,ios::end);
fp<<i<<"\t\t";
n++;
if(n%4==0)
fp<<'\n';
}
}
fp.seekg(0,ios::end);
fp<<"\nThere are "<<n<<" prime numbers within "<<m<<".\n";
fp.close();
cout<<"Congratulations! It has worked out!\n";
return 0;
}
}
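A likely reason the program above is slow is that it seeks through prime.txt and re-parses the stored primes for every candidate. A minimal sketch of the same trial-division idea with the primes kept in memory instead (the function name primesUpTo is made up here; writing the result to a file afterwards is left out):

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns every prime <= m, testing each odd candidate only against
// the primes already found, which are kept in a vector rather than in a file.
std::vector<int> primesUpTo(int m)
{
    assert(m < std::numeric_limits<int>::max() - 1); // keeps i += 2 from overflowing
    std::vector<int> primes;
    if (m < 2)
        return primes;
    primes.push_back(2);
    for (int i = 3; i <= m; i += 2)
    {
        bool isPrime = true;
        for (std::size_t k = 0; k < primes.size(); ++k)
        {
            const long long p = primes[k];
            if (p * p > i)
                break; // no divisor up to sqrt(i), so i is prime
            if (i % p == 0)
            {
                isPrime = false;
                break;
            }
        }
        if (isPrime)
            primes.push_back(i);
    }
    return primes;
}
```

The count that the original program prints at the end corresponds to primesUpTo(m).size().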

For one, you'd have less code (which is always a good thing!) if you didn't have special cases for 3, 5 and 7.
Also, you can avoid the special case for 2 if you just set num[b] = 2 and only test for divisibility by things in your array.

It looks like as you go around the main for() loop, the value of b increases.
Then, this results in a crash because you access memory off the end of your array:
for (int d = 0; d <= b; d++) {
c = num[d];
I think you need to get the algorithm clearer in your head and then approach the code again.

Running your code through a debugger, I've found that it crashes with a floating point exception at "if ((a % c) == 0)". The reason for this is that you haven't initialized anything in num, so you're doing "a % 0".

From what I know, the C and C++ standards only guarantee that int is at least 16 bits wide. A 16-bit signed int tops out at 2^15-1 = 32767, so 1 million would not fit in it. If that is the case on your platform, try declaring "a" as long.
The standard says that int is at least as large as short and at most as large as long.
In practice int is 4 bytes on most current platforms, so it can hold numbers between -2^31 and 2^31-1.
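To check rather than guess how wide int is on a given compiler, std::numeric_limits reports the actual range. A small sketch follows; the names fitsInInt and requireIntHolds are made up for this example.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// True when a plain int on this platform can hold the given value.
bool fitsInInt(long long value)
{
    return value >= std::numeric_limits<int>::min()
        && value <= std::numeric_limits<int>::max();
}

// Hypothetical guard for a prime-search loop: fails loudly in a debug build
// if int is too narrow for the largest candidate the program will test.
void requireIntHolds(long long largestCandidate)
{
    assert(fitsInInt(largestCandidate));
}

// long is guaranteed to be at least 32 bits wide, so 1000000 always fits in it.
static_assert(LONG_MAX >= 2147483647L, "long is at least 32 bits wide");
```

On a compiler with a 4-byte int, fitsInInt(1000000) is true; on one with a 16-bit int it would be false, and long would be the safer choice.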

Since this is for pedagogical purposes, I would suggest implementing the Sieve of Eratosthenes.
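For reference, a sieve adapted to this thread's task of finding the n-th prime might look roughly like the sketch below. The function name nthPrimeBySieve, the explicit limit parameter, and the -1 return convention are all choices made for this example, not something from the original posts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the n-th prime (1-based) if it is <= limit, or -1 otherwise.
int nthPrimeBySieve(int n, int limit)
{
    assert(n >= 1 && limit >= 2);
    // composite[k] becomes true once k is known to be a multiple of a smaller prime.
    std::vector<bool> composite(static_cast<std::size_t>(limit) + 1, false);
    int found = 0;
    for (long long candidate = 2; candidate <= limit; ++candidate)
    {
        if (composite[static_cast<std::size_t>(candidate)])
            continue;
        if (++found == n)
            return static_cast<int>(candidate);
        // Cross off multiples, starting at candidate^2 (smaller ones are already marked).
        for (long long multiple = candidate * candidate; multiple <= limit; multiple += candidate)
            composite[static_cast<std::size_t>(multiple)] = true;
    }
    return -1; // fewer than n primes exist up to limit
}
```

The caller has to pick a limit large enough to contain the n-th prime; if -1 comes back, retry with a larger limit.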

This should also be of interest to you: http://en.wikipedia.org/wiki/Primality_test

for(int currentInt=2; currentInt<=1000000; currentInt++)
{
// Main loop: test each integer in turn. check (a bool declared above) is reset to false for every candidate.
check = false;
// Test currentInt against the previously found primes, which are stored in the num array.
for( int arrayPrime=0; arrayPrime<currentPrime; arrayPrime++)
{
c=num[arrayPrime];
if ((currentInt%c)==0)
{
// Divisible by a stored prime, so currentInt is not prime.
check = true;
break;
}
}
// Since I preset num[0] = 2, I make an exception for the number 2.
if ( currentInt == 2) { check = false; }
if (!check)
{
// currentInt is prime: remember it and store it in the array.
e=currentInt;
if(currentPrime <= 100) { num[currentPrime]= currentInt; }
currentPrime = currentPrime+1; // currentPrime was previously called b
}
// Once currentPrime == prime, write the current prime into a txt file and break out of the loop, ending the program.
if(currentPrime==prime)
{
write<<e<<endl;
break;
}
}
Thanks for the advice polythinker =)

#include <cstdio>
#include <iostream>
#include <cstdlib>
#include <fstream>
using namespace std;
int main()
{
ifstream trial;
trial.open("C:\\Users\\User\\Documents\\trial.txt");
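int prime, e = 2; // e holds the latest prime; starting at 2 keeps the output defined when the 1st prime is requested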
trial>>prime;
ofstream write;
write.open("C:\\Users\\User\\Documents\\answer.txt");
int num[10000], currentPrime, c, primePrint;
bool check;
currentPrime=0;
num[currentPrime] = 2;
currentPrime=1;
for(int currentInt=2; currentInt<=1000000; currentInt++)
{check = false;
for( int arrayPrime=0; arrayPrime<currentPrime; arrayPrime++)
{ c=num[arrayPrime];
if ((currentInt%c)==0) { check = true;// second filter based on previous recorded primes in array
break;}
}
if (!check)
{ e=currentInt;
if( currentInt!= 2 ) {
num[currentPrime]= currentInt;}
currentPrime = currentPrime+1;}
if(currentPrime==prime)
{
write<<e<<endl;
break;}
}
trial.close();
write.close();
return 0;
}
This is the finalized version, based on my original code. It works perfectly, and if you want to increase the range of prime numbers, simply increase the array size. Thanks for the help =)

Since you will need larger prime number values for later questions, I suggest you follow dreeves' advice and do a sieve. It is a very useful arrow to have in your quiver.
