Fixing my implementation of Sieve of Eratosthenes in C++

My algorithm runs correctly up to the first 100 primes but then something goes wrong. Please have a look at my code below. I tried to follow the pseudo-code given here: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
#include <iostream>
#include <vector>
#include <cmath>
using namespace std;
int main()
{
int n = 1000; //compute primes up to this number
vector<bool> p(true,n); //all values set to true, from 0 to n
for(int i = 2; i < sqrt(n)+1; i++){
if( p[i-1] == true ){
for(int j = i*i; j < n; j += i) //start looking for multiples of prime i at i*i (optimized)
p[j-1] = false;
}
}
for(int i = 2; i < n; i++){
if( p[i-1] == true )
cout << i << "\n";
}
return 0;
}
The output is:
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
193
199

I'm absolutely amazed the program runs at all. It has an absolute truckload of undefined behaviour!
Unless I'm very much mistaken (in which case please reward the serenity of my Friday afternoon with downvotes), vector<bool> p(true, n) is creating a vector of size true, with the elements initialised to n.
You have the constructor arguments the wrong way round. This is effectively reversing the sieve for most of the values.
Have you absolutely clobbered your compiler's warning level?
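The effect of the swapped arguments can be checked directly. This is a minimal sketch; the two function names are made up for illustration. With (true, n), true converts to the size 1 and n converts to the element value true, so every later access such as p[i-1] with i >= 3 reads or writes out of bounds.

```cpp
#include <cstddef>
#include <vector>

// Arguments in the order used in the question: (true, n).
// true becomes the size (1) and n becomes the fill value (true for any non-zero n).
std::size_t size_with_swapped_arguments(int n) {
    std::vector<bool> p(true, n);
    return p.size();
}

// Arguments in the intended order: (n, true) gives n elements, all true.
std::size_t size_with_correct_arguments(int n) {
    std::vector<bool> p(n, true);
    return p.size();
}
```

For n = 1000 the first function reports a single element and the second reports 1000.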

First of all, you do not need to store a boolean value for each and every number. This way, you are wasting memory. You should store only the found primes instead, unless you have a very good reason not to do so.
I will not implement the code, since it would spoil the joy of learning. You should implement the following:
Initialize p as a vector of integers.
Store 2 as first value inside p
Iterate all odd numbers starting from 3 and ending at the end number
For each number, calculate its square root and store it into a variable
Iterate over the previous elements of p, starting from the second element (even numbers larger than 2 are never tested), until you reach a divisor or the value of the vector at the given index exceeds the square root
If you find a divisor inside your inner loop, the number is composite, so get out of the inner loop. If you pass the square root without finding a divisor, the number is prime, so store it into your vector
At the end you will have a vector of primes, indexes will mean the prime index and the values will be actual primes. Each element will be a prime of its own.

Your vector construction is wrong. It must be
vector<bool> p(n, true); //all values set to true, from 0 to n
instead of
vector<bool> p(true, n); //all values set to true, from 0 to n

Your numbering is easy to mess up. Since p[k] holds the primeness of number k+1, your for loop
for(int j = i*i; j < n; j += i)
is only correct because its body writes to p[j-1]. If j were the index rather than the number, it would have to be
for(int j = i*i - 1; j < n; j += i)
My advice would be to use more informative variable names and avoid those sources of confusion like p[k] giving information about integer k+1.
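A minimal sketch of a sieve that follows both pieces of advice, assuming one extra element is allocated so that is_prime[k] describes the number k itself. The function name sieve_primes and the choice to return the primes in a vector are illustrative assumptions, not part of the original code.

```cpp
#include <vector>

// Returns all primes <= n using the Sieve of Eratosthenes.
// is_prime[k] describes the number k itself, so no index offset is needed.
std::vector<int> sieve_primes(int n) {
    std::vector<int> primes;
    if (n < 2) return primes;

    std::vector<bool> is_prime(n + 1, true); // size first, then the fill value
    is_prime[0] = false;
    is_prime[1] = false;

    for (int i = 2; i * i <= n; ++i) {
        if (!is_prime[i]) continue;
        // Multiples of i below i*i were already crossed out by smaller primes.
        for (int j = i * i; j <= n; j += i)
            is_prime[j] = false;
    }

    for (int k = 2; k <= n; ++k)
        if (is_prime[k]) primes.push_back(k);
    return primes;
}
```

Calling sieve_primes(1000) and printing each element would replace the two loops in main() from the question.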

Related

Use of counter in C++ code for finding primes

I am working on producing C++ code to list all primes between 1 and 100 say. In order to present my question I need to provide some background.
The basic idea of what I want to do is the following:
Introduce a vector to hold all the primes in ascending order. If the first j elements of this vector are given, the j+1 element is then given as the smallest integer larger than the j'th element which is not divisible by any of the first j elements. The first element is moreover given to be 2.
Thus if v denotes the vector of primes, I want to produce code which implements the following math-type argument;
v[1]=2;
for(2<i<=100)
if i % v[j] !=0 FOR ALL 0<j< v.size()
v.push_back(i)
else
do nothing
The problem I am facing however is that C++ doesn't seem to have a for all type language construct. In order to get around this I introduced a counter variable. More precisely:
int main() {
const int max=100;
vector<int>primes; // vector holding list of primes up to some number.
for(int i=2; i<=max;++i){
if(i==2)
primes.push_back(i); // inserts 2 as first prime
else{
double counter=primes.size(); // counter to be used in following loop.
for(int j=0;j<primes.size();++j){
if(i% primes[j]==0){
break; // breaks loop if prime divisor found!
}
else{
counter-=1; //counter starts at the current size of the primes vector, and 1 is deducted each time an entry is not a prime divisor.
}
}
if (counter==0) // if the counter reaches 0 then i has no prime divisors so must be prime.
primes.push_back(i);
}
}
for(int i=0; i<primes.size(); ++i){
cout << primes[i] << '\t';
}
return 0;
}
The questions I would like to ask are then as follows:
Is there a for-all type language construct in C++?
If not, is there a more appropriate way to implement the above idea? In particular is my use of the counter variable frowned upon?
(Bonus) Is anyone aware of a more efficient way to find all the primes? The above works relatively well up to 1,000,000 but poorly up to 1 billion.
Note I am beginner to C++ and coding in general (currently working through the book of Stroustrup) so answers provided with that in mind would be appreciated.
Thanks in advance!
EDIT:
Hello all,
Thank you for your comments. From them I learned that both the use of a counter and a for-all type statement are unnecessary. Instead one can assign a true or false value to each integer indicating whether a number is prime, with only integers having a true value added to the vector. Setting things up in this way also allows the process of checking whether a number is prime given the "currently known" primes to be independent of the process of updating the "currently known" primes. This consequently addresses another criticism of my code, which was that it was trying to do too many things at once.
Finally it was pointed out to me that there are some basic ways of improving the efficiency of the prime divisor algorithm for finding primes by, for instance, discounting all even numbers greater than 2 in the search (implemented by starting the appropriate loop at 3 and then increasing by 2 at each stage). More generally it was noted that algorithms such as the sieve of Eratosthenes are much faster for finding primes, as these are based on multiplication, not division. Here is the final code:
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;
vector<int> primes; // vector holding list of primes up to some number.
bool is_prime(int n) {// Given integer n and a vector of primes this boolean valued function returns false if any of the primes is a prime divisor of n, and true otherwise. In the context of the main function, the list of primes will be all those that precede n, hence a return of a true value means that n is itself prime. Hence the function name.
for (int p = 0; p < primes.size(); ++p)
if (n % primes[p] == 0) {
return false;
break; // Breaks loop as soon as the first prime divisor is found.
}
return true;
}
int main() {
const int max=100;
primes.push_back(2);
for (int i = 3; i <= max; i+=2)
if (is_prime(i) == true) primes.push_back(i);
for(int i=0; i<primes.size(); ++i)
cout << primes[i] << '\t';
return 0;
}
I just have one additional question: I checked how long the algorithm takes up to 1,000,000 and the presence of the break in the is_prime function (which stops the search for a prime divisor as soon as one is found) doesn't seem to have an effect on the time. Why is this so?
thanks for all the help!
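On the for-all question, the standard library's std::all_of (from <algorithm>) is probably the closest match: it returns true when a predicate holds for every element of a range, and it stops at the first element that fails. The sketch below is one possible way to use it here; the names no_prime_divides and primes_up_to are hypothetical. As for the timing question, in the is_prime function above the break sits after return false, and return already leaves the function, so the break is never reached and removing it cannot change the running time.

```cpp
#include <algorithm>
#include <vector>

// Returns true if no element of known_primes divides n.
// std::all_of plays the role of the "for all" in the pseudocode.
bool no_prime_divides(int n, const std::vector<int>& known_primes) {
    return std::all_of(known_primes.begin(), known_primes.end(),
                       [n](int p) { return n % p != 0; });
}

// Hypothetical helper: collects the primes up to max by trial division
// against the primes found so far.
std::vector<int> primes_up_to(int max) {
    std::vector<int> primes;
    for (int i = 2; i <= max; ++i)
        if (no_prime_divides(i, primes)) // an empty range counts as "all true", so 2 is accepted
            primes.push_back(i);
    return primes;
}
```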

In C++, how do you find all composite numbers up to a given integer using for loops?

The basic idea I am after is this. I have an integer variable called n that the user inputs a value into.
int main()
{
int n;
std::cin >> n;
Then from this point onward, I created a for loop that replicates how you would normally find out if the integer created is indeed prime or not. However, what I'm trying to find isn't whether the number is prime but to find all the composites from a range of 2, to the number that was inputted. So if the input is 10. I should be getting composites 4 6 8 9 10 from that given range.
I do know that the first thing to do is to create a for loop like this
for (int i = 2; i <= 10; i++)
Then nest another for loop with a conditional to test if each number inside the given range is a prime or composite.
for (int i = 2; i <= n; i++)
{
for (int j = 2; j <= i; j++)
{
if (i % j == 0)
{
std::cout << i << " ";
}
}
}
However, this approach isn't really cutting it. What's really going on inside this nested for loop approach is an output beginning with 2 3 2 4 5 2 and a bunch of numbers that aren't making much sense. What is it about this approach that's causing this wacky sequence of numbers, and what can I do to fix this?
After you've printed a composite number, you want the inner "for j" loop to terminate and the outer "for i" loop to advance, so after std::cout add break;. Additionally, you know that if you let j == i you'll deem any number a composite, so change the "for j" loop termination condition from j <= i to j < i.
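The two changes can be sketched as follows. To keep the example easy to check, it collects the composites in a vector instead of printing them; the function name composites_up_to is an assumption made for illustration.

```cpp
#include <vector>

// Collects every composite number in the range 2..n.
std::vector<int> composites_up_to(int n) {
    std::vector<int> result;
    for (int i = 2; i <= n; ++i) {
        for (int j = 2; j < i; ++j) { // j < i: every number divides itself
            if (i % j == 0) {
                result.push_back(i);
                break;                // record i once, then move on to the next i
            }
        }
    }
    return result;
}
```

With n = 10 this yields 4 6 8 9 10, the sequence the question asks for.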

Having trouble with bubble sort C++

I'm having trouble with my bubble sort code. I am trying to sort a vector of strings containing numbers. It seems to work and then halfway through comparing numbers it starts to compare everything wrong (ex: it thinks that 4 > 35).
I read in the number from a text file while running the .o file
Here is the .txt file
6
89
-9
4
718
-60
35
92
1
Here is what I have:
using namespace std;
void bubbleSort(vector<string>&); //declare sort function
int main()
{
vector<string> v; //Initialize vector
string s; //Initialize string
while (cin >> s)
v.push_back(s);
bubbleSort(v); //call sort function
}
void bubbleSort(vector<string>& v){
for(int i = 0; i <= v.size(); i++) //start first loop through vector
for(int j = i+1; j < v.size(); j++){ //start second loop through vector
if(v[i] > v[j]){ //compare i-th element to i-th+1 (j-th) element
swap(v[i],v[j]); //swap elements if i-th element is greater than j-th element
}
}
for (int k = 0; k != v.size(); ++k) //loop through vector and print out binomials one per line
cout << v[k] << endl;
}
And this is what it outputs:
-60
-9
1
35
4
6
718
89
92
If someone could please tell me where I am going wrong it would be greatly appreciated! I don't understand why it works all the way up until it tries to compare 4 to 35 and then incorrectly compares them and throws everything off.
As strings, "4" is actually bigger than "35". If you want to compare them as numbers, you should convert each string to an int. Then you get the answer you want.
You can do that by simply changing the if condition in the bubbleSort function to if( atoi(v[i].c_str()) > atoi(v[j].c_str()) ) (atoi is declared in <cstdlib>).
So, the final code is:
void bubbleSort(vector<string>& v)
{
for(int i = 0; i < v.size(); i++) //start first loop through vector
for(int j = i+1; j < v.size(); j++) //start second loop through vector
{
if( atoi(v[i].c_str()) > atoi(v[j].c_str()) )
{
swap(v[i],v[j]); //swap elements if i-th element is greater than j-th element
}
}
for (int k = 0; k != v.size(); ++k) //loop through vector and print out binomials one per line
cout << v[k] << endl;
}
Output:
-60
-9
1
4
6
35
89
92
718
You should notice that although 33 is a bigger int than 4, this isn't the case if you compare them as strings. String comparison checks the first char against the first char, and only if they are equal does it move on to check the next char. So when comparing 4 with 33 you get that 33 is lesser because 3 is less than 4.
Solution: use atoi to convert each string to an int and then check which is bigger.
Basically what's going wrong is that strings do a lexicographical compare by default. The easiest fix is to change your vector-of-strings into a vector-of-int and fill that vector with ints converted from the strings (assuming that the conversion succeeds). How to convert a string to an int is something else to search the web or SO for.
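A minimal sketch of the vector-of-int approach, assuming every input token is a valid integer. Note that it swaps the hand-written bubble sort for std::sort and uses std::stoi rather than atoi, since std::stoi throws on bad input instead of silently returning 0. The function name sort_as_numbers is made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Converts each string to an int and returns the values in ascending order.
std::vector<int> sort_as_numbers(const std::vector<std::string>& words) {
    std::vector<int> numbers;
    numbers.reserve(words.size());
    for (const std::string& w : words)
        numbers.push_back(std::stoi(w)); // throws std::invalid_argument if w is not a number
    std::sort(numbers.begin(), numbers.end());
    return numbers;
}
```

The same comparison idea also works inside the original bubbleSort: compare std::stoi(v[i]) with std::stoi(v[j]) instead of comparing the strings themselves.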

Segmentation Fault with Pointer

I am having a truly bizarre problem with my code here. It works (seemingly) when I use manual print statements to output the values of int *primesArr, but if I try to do so with a for loop it fails. I ran it through gdb and found that it crashes right around where I set the next cell in the array to the value 'k', which only occurs when a number is prime. The first iteration is successful (i.e. 2 is set to primesArr[0]) and then the program segfaults when trying to increment the array. But this only happens when using a for loop. When I create individual print statements, my program works as expected. I'm not sure how/why I am accessing memory that hasn't been allocated when using a for loop. I'm sure I've made some amateur mistake somewhere, and it probably has something to do with how I'm passing my pointer, but I cannot determine its exact root. I'd appreciate any help and thank you in advance.
#include<stdio.h>
int genPrimes(int seed, int *test){
int inNumPrimes=0;
for(int k=0; k<=seed; k++){//k is number being checked for primeness
int cnt=0;
for(int i=1; i<=k; i++){//'i' is num 'k' is divided by
if(k%i == 0){
cnt++;
if(cnt > 2){
break;
}
}else{
}
}
if(cnt == 2){
printf("%i IS PRIME\n",k);
*test=k;
test++;//according to gdb, the problem is somewhere between here
inNumPrimes++;//and here. I'd wager I messed up my pointer somehow
}
//printf("%i\n",k);
}
return inNumPrimes;
}
int main(){
int outNumPrimes=0;
int *primesArr;
int n = 0;
n=20;
outNumPrimes=genPrimes(n, primesArr);
printf("Congratulations! There are %i Prime Numbers.\n",outNumPrimes);
//If you print using this for loop, the SEGFAULT occurs. Note that it does not matter how high the limit is; its been set to many values other than 5. It will eventually be set to 'outNumPrimes'
//for(int a=0; a<5; a++){
//printf("%i\n",primesArr[a]);
//}
//If you print the array elements individually, the correct value--a prime number--is printed. No SEGFAULT.
printf("%i\n",primesArr[0]);
printf("%i\n",primesArr[1]);
printf("%i\n",primesArr[2]);
printf("%i\n",primesArr[3]);
printf("%i\n",primesArr[4]);
printf("%i\n",primesArr[5]);
printf("%i\n",primesArr[6]);
printf("%i\n",primesArr[7]);
//
return 0;
}
Output with manual statements:
$ ./a.out
2 IS PRIME
3 IS PRIME
5 IS PRIME
7 IS PRIME
11 IS PRIME
13 IS PRIME
17 IS PRIME
19 IS PRIME
Congratulations! There are 8 Prime Numbers.
2
3
5
7
11
13
17
19
Now with the for loop:
$ ./a.out
2 IS PRIME
Segmentation fault
You are passing an uninitialized pointer into your primes function. The behavior you get is undefined, which is why this seems so mysterious. The variable primesArr could be pointing anywhere.
For a simple case like this, it'd probably be better to use a std::vector<int>.
The line
int *primesArr;
declares primesArr as a pointer variable but doesn't allocate any memory for it. Since the genPrimes() function expects to treat it as an empty array that will be filled with primes, you can allocate memory in main() before calling genPrimes():
int primesArr[MAX_PRIMES];
or
int *primesArr = malloc(MAX_PRIMES * sizeof(int));
In both cases, however, you must guarantee that MAX_PRIMES is large enough to hold all of the primes that genPrimes() finds; otherwise the code writes past the end of the array, which is undefined behaviour just like the crash you see now.
Other hints:
1: Complexity
The only reason cnt is necessary is that k is divisible by 1 and k. If you change
for (int i=1; i<=k; i++) { // 'i' is the number 'k' is divided by
to
for (int i=2; i<k; ++i) { // 'i' is the number 'k' is divided by
then both of those cases are eliminated, and the loop can exit as soon as it finds a value of i for which k%i == 0.
2: Efficiency
The test
for (int i=2; i<k; ++i) { // 'i' is the number 'k' is divided by
is still quite inefficient for two reasons. First, there's no need to test every even number; if k > 2 and (k % 2) == 0, then k cannot be prime. So you can eliminate half of the tests by checking explicitly for 2 (prime) or divisibility by 2 (not prime), and then using
for (int i = 3; i < k; i += 2) { // 'i' is the number 'k' is divided by
But you can make this still more efficient, because you can stop after reaching sqrt(k). Why? If k is divisible by some number i, then it must also be divisible by k/i (because i * k/i=k). And if i > sqrt(k), then k/i < sqrt(k) and the loop would already have exited. So you need only
int r = (int) sqrt(k);
for (int i = 3; i <= r; i += 2) { // 'i' is the number 'k' is divided by
If sqrt() isn't available, you can use
for (int i = 3; i*i <= k; i += 2) { // 'i' is the number 'k' is divided by
3: Style
Just a simple thing, but instead of
int n = 0;
n=20;
you can simply write
int n = 20;
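Taken together, hints 1 and 2 suggest a primality test along the following lines. This is only a sketch; the function name is_prime is an assumption, and the loop condition is written as i <= k / i, which is equivalent to i*i <= k for positive values but cannot overflow.

```cpp
// Trial division: returns true if k is prime.
bool is_prime(int k) {
    if (k < 2) return false;       // 0, 1 and negatives are not prime
    if (k == 2) return true;       // the only even prime
    if (k % 2 == 0) return false;  // every other even number is composite
    for (int i = 3; i <= k / i; i += 2) { // odd candidates up to sqrt(k)
        if (k % i == 0) return false;     // found a divisor, stop immediately
    }
    return true;
}
```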
Your primesArr variable is uninitialized.
Declaring a pointer as
int *ptr;
Just declares a pointer to an int. However, the pointer itself does not point to anything. Much like declaring
int val;
does not initialize val. Therefore, you'll need to allocate memory for your primesArr pointer to point to (with new, or on the stack like int primesArr[N] where N is some large number).
However, since you don't know how many primes you'll get a priori from your genPrimes function and you haven't said that STL is out of the question, I'd consider using a std::vector<int> as the input to your genPrimes function:
int genPrimes(int seed, std::vector<int>& test)
And, from within the function, you could do:
test.push_back(k);
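A sketch of how genPrimes might look with that signature, assuming the caller passes in an empty vector. The divisor loop is simplified compared with the question (it stops at the square root of k), so treat it as one possible variant rather than the original logic.

```cpp
#include <vector>

// Appends every prime in 2..seed to test and returns how many were added.
// The vector grows as needed, so no size has to be guessed in advance.
int genPrimes(int seed, std::vector<int>& test) {
    int inNumPrimes = 0;
    for (int k = 2; k <= seed; ++k) {
        bool has_divisor = false;
        for (int i = 2; i <= k / i; ++i) { // same as i*i <= k, without overflow
            if (k % i == 0) {
                has_divisor = true;
                break;
            }
        }
        if (!has_divisor) {
            test.push_back(k);
            ++inNumPrimes;
        }
    }
    return inNumPrimes;
}
```

In main(), declaring std::vector<int> primesArr; and passing it to genPrimes(n, primesArr) then allows the printing loop to run up to the returned count without touching unallocated memory.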

Finding composite numbers

I have a range of random numbers. The range is actually determined by the user but it will be up to 1000 integers. They are placed in this:
vector<int> v
and the values are inserted like this:
srand(1);
for (i = 0; i < n; i++)
v[i] = rand() % n;
I'm creating a separate function to find all the non-prime values. Here is what I have now, but I know it's completely wrong as I get both prime and composite in the series.
void sieve(vector<int> v, int n)
{
int i,j;
for(i = 2; i <= n; i++)
{
cout << i << " % ";
for(j = 0; j <= n; j++)
{
if(i % v[j] == 0)
cout << v[j] << endl;
}
}
}
This method typically worked when I just had a series of numbers from 0-1000, but it doesn't seem to be working now when I have numbers out of order and duplicates. Is there a better method to find non-prime numbers in a vector? I'm tempted to just create another vector, fill it with n numbers and just find the non-primes that way, but would that be inefficient?
Okay, since the range is from 0-1000 I am wondering if it's easier to just create vector with 0-n sorted, and then using a sieve to find the primes, is this getting any closer?
void sieve(vector<int> v, BST<int> t, int n)
{
vector<int> v_nonPrime(n);
int i,j;
for(i = 2; i < n; i++)
v_nonPrime[i] = i;
for(i = 2; i < n; i++)
{
for(j = i + 1; j < n; j++)
{
if(v_nonPrime[i] % j == 0)
cout << v_nonPrime[i] << endl;
}
}
}
In this code:
if(i % v[j] == 0)
cout << v[j] << endl;
You are testing your index to see if it is divisible by v[j]. I think you meant to do it the other way around, i.e.:
if(v[j] % i == 0)
Right now, you are printing random divisors of i. You are not printing out random numbers which are known not to be prime. Also, you will have duplicates in your output, perhaps that is ok.
First off, I think Knuth said it first: premature optimization is the cause of many bugs. Make the slow version first, and then figure out how to make it faster.
Second, for your outer loop, you really only need to go to sqrt(n) rather than n.
Basically, you have a lot of unrelated numbers, so for each one you will have to check if it's prime.
If you know the range of the numbers in advance, you can generate all prime numbers that can occur in that range (or the sqrt thereof), and test every number in your container for divisibility by any one of the generated primes.
Generating the primes is best done with the Sieve of Eratosthenes; many examples of that algorithm can be found.
You should try using a prime sieve. You need to know the maximal number for creating the sieve (O(n)), and then you can build a set of primes in that range (O(max_element), or, as the problem states, O(1000) == O(1)) and check whether each number is in the set of primes.
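The sieve-then-lookup idea can be sketched like this, assuming the maximum possible value is known (1000 in the question) and is not negative. The function name non_primes_in and its parameters are illustrative. Note that 0 and 1 are reported as well, since they are not prime, and duplicates in the input are kept.

```cpp
#include <vector>

// Returns the elements of values that are not prime, in their original order.
// Assumes max_value >= 0; values outside 0..max_value are skipped.
std::vector<int> non_primes_in(const std::vector<int>& values, int max_value) {
    // Build the sieve once: is_prime[k] is true exactly when k is prime.
    std::vector<bool> is_prime(max_value + 1, true);
    is_prime[0] = false;
    if (max_value >= 1) is_prime[1] = false;
    for (int i = 2; i * i <= max_value; ++i)
        if (is_prime[i])
            for (int j = i * i; j <= max_value; j += i)
                is_prime[j] = false;

    // Each input value is now a single table lookup.
    std::vector<int> result;
    for (int x : values)
        if (x >= 0 && x <= max_value && !is_prime[x])
            result.push_back(x);
    return result;
}
```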
Your code is just plain wrong. First, you're testing i % v[j] == 0, which is backwards and also explains why you get all numbers. Second, your output will contain duplicates as you're testing and outputting each input number every time it fails the (broken) divisibility test.
Other suggestions:
Using n as the maximum value in the vector and the number of elements in the vector is confusing and pointless. You don't need to pass in the number of elements in the vector - you just query the vector's size. And you can figure out the max fairly quickly (but if you know it ahead of time you may as well pass it in).
As mentioned above, you only need to test to sqrt(n) [where n is the max value in the vector]
You could use a sieve to generate all primes up to n and then just remove those values from the input vector, as also suggested above. This may be quicker and easier to understand, especially if you store the primes somewhere.
If you're going to test each number individually (using, I guess, an inverse sieve) then I suggest testing each number individually, in order. IMHO it'll be easier to understand than the way you've written it, which tests each number for divisibility by k < n for ever-increasing k.
The idea of the sieve that you try to implement depends on the fact that you start at a prime (2) and cross out multiples of that number, so all numbers that have the prime 2 as a factor are ruled out beforehand.
That's because all non-primes can be factorized down to primes, whereas a prime leaves a remainder of 0 only when divided by 1 or by itself.
So, if you want to rely on this algorithm, you will need some means to actually restore this property of the algorithm.
Your code seems to have many problems:
If you want to test if your number is prime or non-prime, you would need to check for v[j] % i == 0, not the other way round
You did not check whether your number is being divided by itself
You keep on checking your numbers again and again. That's very inefficient.
As other guys suggested, you need to do something like the Sieve of Eratosthenes.
So a pseudo C code for your problem would be (I haven't run this through compilers yet, so please ignore syntax errors. This code is to illustrate the algorithm only)
vector<int> inputNumbers;
// First, find all the prime numbers from 1 to n
vector<bool> isPrime(n + 1, true); // every flag starts true; an array initialised with {true} would set only element 0
isPrime[0]= false;
isPrime[1]= false;
for (int i = 2; i <= sqrt(n); i++)
{
if (!isPrime[i])
continue;
for (int j = 2; j <= n/i; j++)
isPrime[i*j] = false;
}
// Check the input array for non-prime numbers
for (int i = 0; i < inputNumbers.size(); i++)
{
int thisNumber = inputNumbers[i];
// Vet the input to make sure we won't blow our isPrime array
if ((0<= thisNumber) && (thisNumber <=n))
{
// Prints out non-prime numbers
if (!isPrime[thisNumber])
cout << thisNumber << " ";
}
}
Sorting the numbers first might be a good start; you can do that in n log n time. That is a small addition (I think) to your other problem, that of finding whether a number is prime.
(Actually, with a small set of numbers like that you can do a sort much faster with a copy of the size of the vector/set and do a hash/bucket sort/whatever.)
I'd then find the highest number in the set (I assume the numbers can be unbounded, with no known upper limit until your sort, or do a single pass to find the max),
then go with a sieve, as others have said.
Jeremy is right, the basic problem is your i % v[j] instead of v[j] % i.
Try this:
void sieve(vector<int> v, int n) {
int i,j;
for(j = 0; j < n; j++) { // j <= n would read one element past the end of v
cout << v[j] << ": ";
for(i = 2; i < v[j]; i++) {
if(v[j] % i == 0) {
cout << "is divisible by " << i << endl;
break;
}
}
if (i == v[j]) {
cout << "is prime." << endl;
}
}
}
It's not optimal, because it's attempting to divide by all numbers less than v[j] instead of just up to the square root of v[j]. And it is attempting division by all numbers instead of only primes.
But it will work.