Finding composite numbers - c++

I have a range of random numbers. The range is actually determined by the user but it will be up to 1000 integers. They are placed in this:
vector<int> v
and the values are inserted like this:
srand(1);
for (i = 0; i < n; i++)
v[i] = rand() % n;
I'm creating a separate function to find all the non-prime values. Here is what I have now, but I know it's completely wrong as I get both prime and composite in the series.
void sieve(vector<int> v, int n)
{
int i,j;
for(i = 2; i <= n; i++)
{
cout << i << " % ";
for(j = 0; j <= n; j++)
{
if(i % v[j] == 0)
cout << v[j] << endl;
}
}
}
This method typically worked when I just had a series of numbers from 0-1000, but it doesn't seem to be working now when I have numbers out of order and duplicates. Is there a better method to find non-prime numbers in a vector? I'm tempted to just create another vector, fill it with n numbers and just find the non-primes that way, but would that be inefficient?
Okay, since the range is from 0-1000 I am wondering if it's easier to just create vector with 0-n sorted, and then using a sieve to find the primes, is this getting any closer?
void sieve(vector<int> v, BST<int> t, int n)
{
vector<int> v_nonPrime(n);
int i,j;
for(i = 2; i < n; i++)
v_nonPrime[i] = i;
for(i = 2; i < n; i++)
{
for(j = i + 1; j < n; j++)
{
if(v_nonPrime[i] % j == 0)
cout << v_nonPrime[i] << endl;
}
}
}

In this code:
if(i % v[j] == 0)
cout << v[j] << endl;
You are testing your index to see if it is divisible by v[j]. I think you meant to do it the other way around, i.e.:
if(v[j] % i == 0)
Right now, you are printing random divisors of i. You are not printing out random numbers which are known not to be prime. Also, you will have duplicates in your output, perhaps that is ok.

First off, I think Knuth said it first: premature optimization is the root of all evil, and the cause of many bugs. Make the slow version first, and then figure out how to make it faster.
Second, for your outer loop, you really only need to go to sqrt(n) rather than n.
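A minimal sketch of the "slow version first" idea, testing each value on its own by trial division up to its square root. The names is_composite and composites_in are made up for this illustration, and 0 and 1 are assumed to count as neither prime nor composite.

```cpp
#include <vector>

// Returns true if x is composite (greater than 1 and not prime).
// 0, 1 and negative values are treated as "not composite" (an assumption).
bool is_composite(int x) {
    if (x < 4) return false;              // 2 and 3 are prime; smaller values don't count
    for (int d = 2; d <= x / d; ++d) {    // d <= x / d means d <= sqrt(x), without overflow
        if (x % d == 0) return true;      // found a divisor other than 1 and x
    }
    return false;
}

// Collects the composite values of v in their original order (duplicates kept).
std::vector<int> composites_in(const std::vector<int>& v) {
    std::vector<int> out;
    for (int x : v) {
        if (is_composite(x)) out.push_back(x);
    }
    return out;
}
```

Note that the square-root bound applies to each value being tested, so unsorted input and duplicates are not a problem here.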

Basically, you have a lot of unrelated numbers, so for each one you will have to check if it's prime.
If you know the range of the numbers in advance, you can generate all prime numbers that can occur in that range (or the sqrt thereof), and test every number in your container for divisibility by any one of the generated primes.
Generating the primes is best done with the Sieve of Eratosthenes; many examples of that algorithm can be found.

You should try using a prime sieve. You need to know the maximum value in order to create the sieve (finding it is O(n)); then you can build the set of primes in that range (O(max_element), or, since the problem caps the values at 1000, O(1000) == O(1)) and check whether each number is in the set of primes.

Your code is just plain wrong. First, you're testing i % v[j] == 0, which is backwards and also explains why you get all numbers. Second, your output will contain duplicates as you're testing and outputting each input number every time it fails the (broken) divisibility test.
Other suggestions:
Using n as the maximum value in the vector and the number of elements in the vector is confusing and pointless. You don't need to pass in the number of elements in the vector - you just query the vector's size. And you can figure out the max fairly quickly (but if you know it ahead of time you may as well pass it in).
As mentioned above, you only need to test up to sqrt(n) [where n is the max value in the vector].
You could use a sieve to generate all primes up to n and then just remove those values from the input vector, as also suggested above. This may be quicker and easier to understand, especially if you store the primes somewhere.
If you're going to test each number individually (using, I guess, an inverse sieve), then I suggest testing each number individually, in order. IMHO it'll be easier to understand than the way you've written it, which tests every number for divisibility by k < n for ever-increasing k.
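The "sieve, then remove the primes from the input" suggestion could look roughly like this. It is only a sketch: prime_table and remove_primes are hypothetical names, all values are assumed to be non-negative, and the maximum is found with std::max_element rather than passed in.

```cpp
#include <algorithm>
#include <vector>

// Sieve of Eratosthenes: result[x] is true exactly when x is prime, for 0 <= x <= limit.
// Assumes limit >= 0.
std::vector<bool> prime_table(int limit) {
    std::vector<bool> is_prime(limit + 1, true);
    is_prime[0] = false;
    if (limit >= 1) is_prime[1] = false;
    for (int i = 2; i <= limit / i; ++i) {
        if (!is_prime[i]) continue;
        for (int j = i * i; j <= limit; j += i) is_prime[j] = false;  // cross out multiples
    }
    return is_prime;
}

// Removes every prime from v in place; 0, 1 and the composites stay, in their original order.
void remove_primes(std::vector<int>& v) {
    if (v.empty()) return;
    const int max_value = *std::max_element(v.begin(), v.end());
    const std::vector<bool> is_prime = prime_table(max_value);
    v.erase(std::remove_if(v.begin(), v.end(),
                           [&](int x) { return is_prime[x]; }),
            v.end());
}
```

Building the table once and looking each value up keeps the per-element work constant, which is why duplicates and ordering no longer matter.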

The sieve you are trying to implement depends on the fact that you start at a prime (2) and cross out the multiples of that number, so all numbers that have the prime 2 as a factor are ruled out beforehand.
That works because every non-prime can be factorized into primes, whereas a prime leaves a remainder of 0 only when divided by 1 or by itself.
So, if you want to rely on this algorithm, you will need some means of restoring that property, which is lost once the numbers are out of order and contain duplicates.

Your code seems to have many problems:
If you want to test if your number is prime or non-prime, you would need to check for v[j] % i == 0, not the other way round
You did not exclude the case where the number is divided by itself, which always leaves a remainder of 0
You keep on checking your numbers again and again. That's very inefficient.
As other guys suggested, you need to do something like the Sieve of Eratosthenes.
So a pseudo C code for your problem would be (I haven't run this through compilers yet, so please ignore syntax errors. This code is to illustrate the algorithm only)
vector<int> inputNumbers;
// First, find all the prime numbers from 0 to n (n is the largest value expected)
vector<bool> isPrime(n + 1, true); // a plain array with = {true} would set only element 0
isPrime[0] = false;
if (n >= 1)
    isPrime[1] = false;
for (int i = 2; i * i <= n; i++)
{
    if (!isPrime[i])
        continue;
    // Cross out the multiples of i, starting at i * i
    for (int j = i * i; j <= n; j += i)
        isPrime[j] = false;
}
// Check the input array for non-prime numbers
for (size_t i = 0; i < inputNumbers.size(); i++)
{
    int thisNumber = inputNumbers[i];
    // Vet the input to make sure we won't blow our isPrime array
    if ((0 <= thisNumber) && (thisNumber <= n))
    {
        // Prints out non-prime numbers (0 and 1 count as non-prime here)
        if (!isPrime[thisNumber])
            cout << thisNumber << endl;
    }
}

Sorting the numbers first might be a good start; you can do that in O(n log n) time. That is a small addition (I think) to your other problem, that of finding whether a number is prime.
(Actually, with a small set of numbers like that, you can sort much faster with a hash/bucket/counting sort into a copy the size of the vector/set.)
I'd then find the highest number in the set (I assume the numbers can be unbounded, with no known upper limit until you sort, or do a single pass to find the max),
then go with a sieve, as others have said.

Jeremy is right, the basic problem is your i % v[j] instead of v[j] % i.
Try this:
void sieve(vector<int> v, int n) {
    int i, j;
    for (j = 0; j < n; j++) { // j < n, not j <= n: v holds n elements
        cout << v[j] << ": ";
        for (i = 2; i < v[j]; i++) {
            if (v[j] % i == 0) {
                cout << "is divisible by " << i << endl;
                break;
            }
        }
        if (v[j] < 2) {
            cout << "is neither prime nor composite." << endl;
        } else if (i == v[j]) {
            cout << "is prime." << endl;
        }
    }
}
It's not optimal, because it attempts to divide by all numbers less than v[j] instead of just up to the square root of v[j]. And it attempts division by all numbers instead of only primes.
But it will work.

Related

Use of counter in C++ code for finding primes

I am working on producing C++ code to list all primes between 1 and 100 say. In order to present my question I need to provide some background.
The basic idea of what I want to do is the following:
Introduce a vector to hold all the primes in ascending order. If the first j elements of this vector are given, the j+1 element is then given as the smallest integer larger than the j'th element which is not divisible by any of the first j elements. The first element is moreover given to be 2.
Thus if v denotes the vector of primes, I want to produce code which implements the following math-type argument;
v[1]=2;
for(2<i<=100)
if i % v[j] !=0 FOR ALL 0<j< v.size()
v.push_back(i)
else
do nothing
The problem I am facing however is that C++ doesn't seem to have a for all type language construct. In order to get around this I introduced a counter variable. More precisely:
int main() {
const int max=100;
vector<int>primes; // vector holding list of primes up to some number.
for(int i=2; i<=max;++i){
if(i==2)
primes.push_back(i); // inserts 2 as first prime
else{
double counter=primes.size(); // counter to be used in following loop.
for(int j=0;j<primes.size();++j){
if(i% primes[j]==0){
break; // breaks loop if prime divisor found!
}
else{
counter-=1; //counter starts at the current size of the primes vector, and 1 is deducted each time an entry is not a prime divisor.
}
}
if (counter==0) // if the counter reaches 0 then i has no prime divisors so must be prime.
primes.push_back(i);
}
}
for(int i=0; i<primes.size(); ++i){
cout << primes[i] << '\t';
}
return 0;
}
The questions I would like to ask are then as follows:
Is there a for-all type language construct in C++?
If not, is there a more appropriate way to implement the above idea? In particular is my use of the counter variable frowned upon?
(Bonus) Is anyone aware of a more efficient way to find all the primes? The above works relatively well up to 1,000,000 but poorly up to 1 billion.
Note I am beginner to C++ and coding in general (currently working through the book of Stroustrup) so answers provided with that in mind would be appreciated.
Thanks in advance!
EDIT:
Hello all,
Thank you for your comments. From them I learned that both use of a counter and a for all type statement are unnecessary. Instead one can assign a true or false value to each integer indicating whether a number is prime with only integers having a true value added to the vector. Setting things up in this way also allows the process of checking whether a number is prime given the "currently known" primes to be independent of the process of updating the "currently known" primes. This consequently addresses another criticism of my code which was that it was trying to do too many things at once.
Finally it was pointed out to me that there are some basic ways of improving the efficiency of the prime divisor algorithm for finding primes by, for instance, discounting all even numbers greater than 2 in the search (implemented by starting the appropriate loop at 3 and then increasing by 2 at each stage). More generally it was noted that algorithms such as the Sieve of Eratosthenes are much faster for finding primes, as these are based on multiplication not division. Here is the final code:
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;
vector<int> primes; // vector holding list of primes up to some number.
bool is_prime(int n) {// Given integer n and a vector of primes this boolean valued function returns false if any of the primes is a prime divisor of n, and true otherwise. In the context of the main function, the list of primes will be all those that precede n, hence a return of a true value means that n is itself prime. Hence the function name.
for (int p = 0; p < primes.size(); ++p)
if (n % primes[p] == 0) {
return false;
break; // Breaks loop as soon as the first prime divisor is found.
}
return true;
}
int main() {
const int max=100;
primes.push_back(2);
for (int i = 3; i <= max; i+=2)
if (is_prime(i) == true) primes.push_back(i);
for(int i=0; i<primes.size(); ++i)
cout << primes[i] << '\t';
return 0;
}
I just have one additional question: I checked how long the algorithm takes up to 1,000,000 and the presence of the break in the is_prime function (which stops the search for a prime divisor as soon as one is found) doesn't seem to have an effect on the time. Why is this so?
thanks for all the help!
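On the "for all" question raised above: the standard library offers std::all_of, std::any_of and std::none_of in <algorithm>, which come close to that construct. The following is a sketch only; no_known_prime_divides and primes_up_to are names invented here, and the list of known primes is passed as a parameter instead of living in a global.

```cpp
#include <algorithm>
#include <vector>

// True when no element of known_primes divides n, i.e.
// "n % p != 0 FOR ALL p in known_primes".
bool no_known_prime_divides(int n, const std::vector<int>& known_primes) {
    return std::none_of(known_primes.begin(), known_primes.end(),
                        [n](int p) { return n % p == 0; });
}

// Builds the primes up to limit in ascending order, following the idea in the question.
std::vector<int> primes_up_to(int limit) {
    std::vector<int> primes;
    for (int i = 2; i <= limit; ++i) {
        // For i == 2 the list is empty, so the "for all" condition holds trivially.
        if (no_known_prime_divides(i, primes)) primes.push_back(i);
    }
    return primes;
}
```

As for the timing question: in is_prime above, return false already leaves the function, so the break after it is never reached, and removing it cannot change the running time.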

In C++, how do you find all composite numbers up to a given integer using for loops?

The basic idea I am after is this. I have an integer variable called n whose value the user inputs.
int main()
{
int n;
std::cin >> n;
Then from this point onward, I created a for loop that replicates how you would normally find out if the integer created is indeed prime or not. However, what I'm trying to find isn't whether the number is prime but to find all the composites from a range of 2, to the number that was inputted. So if the input is 10. I should be getting composites 4 6 8 9 10 from that given range.
I do know that the first thing to do is to create a for loop like this
for (int i = 2; i <= 10; i++)
Then nest another for loop with a conditional to test if each number inside the given range is a prime or composite.
for (int i = 2; i <= n; i++)
{
for (int j = 2; j <= i; j++)
{
if (i % j == 0)
{
std::cout << i << " ";
}
}
}
However, this approach isn't really cutting it. What actually comes out of this nested for loop is output beginning with 2 3 2 4 5 2, followed by a bunch of numbers that don't make much sense. What is it about this approach that causes this wacky sequence, and what can I do to fix it?
After you've printed a composite number, you want the inner "for j" loop to terminate and the outer "for i" loop to advance, so add break; after the std::cout statement. Additionally, if you let j == i, every number will be deemed composite, so change the "for j" loop condition from j <= i to j < i.
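Applying both changes gives something like the following sketch. To keep it easy to check, it collects the composites in a vector instead of printing them; composites_up_to is a name chosen for this example.

```cpp
#include <vector>

// Returns the composite numbers in [2, n] in ascending order.
std::vector<int> composites_up_to(int n) {
    std::vector<int> result;
    for (int i = 2; i <= n; i++) {
        for (int j = 2; j < i; j++) {   // j < i: never test i against itself
            if (i % j == 0) {
                result.push_back(i);    // i has a divisor other than 1 and i
                break;                  // record each composite only once
            }
        }
    }
    return result;
}
```

For an input of 10 this yields 4 6 8 9 10, the sequence the question asks for.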

What is the fastest way to sort these n^2 numbers?

Given a number 'n', I want to return a sorted array of n^2 numbers containing all the values of k1*k2 where k1 and k2 can range from 1 to n.
For example for n=2 it would return : {1,2,2,4}.(the number are basically 1*1,1*2,2*1,2*2).
and for n=3 it would return : {1,2,2,3,3,4,6,6,9}.
(the numbers being : 1*1, 2*1, 1*2, 2*2, 3*1, 1*3, 3*2, 2*3, 3*3)
I tried it using sort function from c++ standard library, but I was wondering if it could be further optimized.
Well, first of all, you get n^2 entries, the largest of which will be n^2, and of the possible value range, only a tiny amount of values is used for large n. So, I'd suggest a counting approach:
Initialize an array counts[] of size n^2 with zeros.
Iterate through your array of values values[], and do counts[values[i]-1]++.
Reinitialize the values array by iterating through the counts array, dropping as many values of i+1 into the values array as counts[i] gives you.
That's all. It's O(n^2), so you'll hardly find a more performant solution.
vector<int> count(n*n+1);
for (int i = 1; i <= n; ++i)
for (int j = 1; j <= n; ++j)
++count[i*j];
for (int i = 1; i <= n*n; ++i)
for (int j = 0; j < count[i]; ++j)
cout << i << " ";
This is in essence the O(n*n) solution as described in cmaster's answer.

N choose k for large n and k

I have n elements stored in an array and a number k, the size of the subsets to draw from the n elements (n choose k).
I have to find all the possible combinations of k elements in the array of length n and, for each set (of length k), do some calculations on the chosen elements.
I have written a recursive algorithm (in C++) that works fine, but for large numbers it crashes, running out of heap space.
How can I fix the problem? How can I calculate all the sets of n choose k for large n and k?
Is there any library for C++ that can help me?
I know it is an NP problem, but I would like to write the best code possible in order to handle the biggest numbers I can.
Roughly what are the largest values of n and k beyond which it becomes unfeasible?
I am only asking for the best algorithm, not for unfeasible space/work.
Here is my code:
vector<int> people;
vector<int> combination;
void pretty_print(const vector<int>& v)
{
static int count = 0;
cout << "combination no " << (++count) << ": [ ";
for (int i = 0; i < v.size(); ++i) { cout << v[i] << " "; }
cout << "] " << endl;
}
void go(int offset, int k)
{
if (k == 0) {
pretty_print(combination);
return;
}
for (int i = offset; i <= people.size() - k; ++i) {
combination.push_back(people[i]);
go(i+1, k-1);
combination.pop_back();
}
}
int main() {
int n = #, k = #;
for (int i = 0; i < n; ++i) { people.push_back(i+1); }
go(0, k);
return 0;
}
Here is non recursive algorithm:
const int n = ###;
const int k = ###;
int currentCombination[k];
for (int i=0; i<k; i++)
currentCombination[i]=i;
currentCombination[k-1] = k-1-1; // fill initial combination is real first combination -1 for last number, as we will increase it in loop
do
{
if (currentCombination[k-1] == (n-1) ) // if last number is just before overwhelm
{
int i = k-1-1;
while (currentCombination[i] == (n-k+i))
i--;
currentCombination[i]++;
for (int j=(i+1); j<k; j++)
currentCombination[j] = currentCombination[i]+j-i;
}
else
currentCombination[k-1]++;
for (int i=0; i<k; i++)
_tprintf(_T("%d "), currentCombination[i]);
_tprintf(_T("\n"));
} while (! ((currentCombination[0] == (n-1-k+1)) && (currentCombination[k-1] == (n-1))) );
Your recursive algorithm might be blowing the stack. If you make it non-recursive, then that would help, but it probably won't solve the problem if your case is really 100 choose 10. You have two problems. Few, if any, computers in the world have 17+ terabytes of memory. Going through 17 trillion+ iterations to generate all the combinations will take way too long. You need to rethink the problem and either come up with an N choose K case that is more reasonable, or process only a certain subset of the combinations.
You probably do not want to be processing more than a billion or two combinations at the most - and even that will take some time. That translates to around 41 choose 10 to about 44 choose 10. Reducing either N or K will help. Try editing your question and posting the problem you are trying to solve and why you think you need to go through all of the combinations. There may be a way to solve it without going through all of the combinations.
If it turns out you do need to go through all those combinations, then maybe you should look into using a search technique like a genetic algorithm or simulated annealing. Both of these hill climbing search techniques provide the ability to search a large space in a relatively small time for a close to optimal solution, but neither guarantee to find the optimal solution.
You can use next_permutation() from <algorithm> to generate all possible combinations.
Here is some example code:
vector<bool> is_chosen(n, false);
fill(is_chosen.begin() + n - k, is_chosen.end(), true);
do
{
for(int i = 0; i < n; i++)
{
if(is_chosen[i])
cout << some_array[i] << " ";
}
cout << endl;
} while( next_permutation(is_chosen.begin(), is_chosen.end()) );
Don't forget to include <algorithm>.
As I said in a comment, it's not clear what you really want.
If you want to compute (n choose k) for relatively small values, say n,k < 100 or so, you may want to use a recursive method based on Pascal's triangle.
If n,k are large (say n=1000000, k=500000), you may be happy with an approximate result using Stirling's formula for the factorial: (n choose k) = exp(loggamma(n+1)-loggamma(k+1)-loggamma(n-k+1)), computing loggamma(x) via Stirling's formula.
If you want (n choose k) for all or many k but the same n, you can simply iterate over k and use (n choose k+1) = ((n choose k)*(n-k))/(k+1).
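Two of these options can be sketched briefly. The code below is an illustration under assumptions: the function names are invented, choose_exact assumes every intermediate product fits in 64 bits, and the approximation uses the standard library's std::lgamma instead of a hand-written Stirling series.

```cpp
#include <cmath>
#include <cstdint>

// Exact n choose k using (n choose i+1) = ((n choose i) * (n - i)) / (i + 1).
// Assumes every intermediate product fits in 64 bits.
std::uint64_t choose_exact(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;          // symmetry keeps the loop short
    std::uint64_t c = 1;               // (n choose 0)
    for (unsigned i = 0; i < k; ++i) {
        c = c * (n - i) / (i + 1);     // the division is always exact at this point
    }
    return c;
}

// Natural logarithm of n choose k; lgamma(x + 1) equals log(x!).
double log_choose(double n, double k) {
    return std::lgamma(n + 1) - std::lgamma(k + 1) - std::lgamma(n - k + 1);
}

// Approximate n choose k; overflows to infinity once the result exceeds the range of double.
double choose_approx(double n, double k) {
    return std::exp(log_choose(n, k));
}
```

For very large arguments such as n=1000000, k=500000 the value itself does not fit in a double, so working with log_choose directly is probably the more useful form. choose_exact(100, 10) is the "17 trillion+" figure quoted above.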

C++ Prime number task from the book

I'm a C++ beginner ;)
How good is the code below as a way of finding all prime numbers between 2-1000:
int i, j;
for (i=2; i<1000; i++) {
for (j=2; j<=(i/j); j++) {
if (! (i%j))
break;
if (j > (i/j))
cout << i << " is prime\n";
}
}
You stop when j = i.
A first simple optimization is to stop when j > sqrt(i) (since a composite number always has a factor no greater than its square root).
A much faster implementation is, for example, the Sieve of Eratosthenes.
Edit: the code looks somewhat mysterious, so here's how it works:
The terminating condition on the inner for is i/j, equivalent to j<=i (which is much clearer), since once j exceeds i, integer division gives i/j==0 and the for ends.
The next check if(j>(i/j)) is really nasty. Basically it just checks whether the loop hit the for's end condition (therefore we have a prime) or if we hit the explicit break (no prime). If we hit the for's end, then j==i+1 (think about it) => i/j==0 => it's a prime. If we hit a break, it means j is a factor of i,but not just any factor, the smallest in fact (since we exit at the first j that divides i)!
Since j is the smallest factor,the other factor (or product of remaining factors, given by i/j) will be greater or equal to j, hence the test. If j<=i/j,we hit a break and j is the smallest factor of i.
That's some unreadable code!
Not very good. In my humble opinion, the indentation and spacing is hideous (no offense). To clean it up some:
int i, j;
for (i=2; i<1000; i++) {
for (j=2; i/j; j++) {
if (!(i % j))
break;
if (j > i/j)
cout << i << " is prime\n";
}
}
This reveals a bug: the if (j > i/j) ... needs to be on the outside of the inner loop for this to work. Also, I think that the i/j condition is more confusing (not to mention slower) than just saying j < i (or even nothing, because once j reaches i, i % j will be 0). After these changes, we have:
int i, j;
for (i=2; i<1000; i++) {
for (j=2; j < i; j++) {
if (!(i % j))
break;
}
if (j > i/j)
cout << i << " is prime\n";
}
This works. However, the j > i/j confuses the heck out of me. I can't even figure out why it works (I suppose I could figure it out if I spent a while looking like this guy). I would write if (j == i) instead.
What you have implemented here is called trial division. A better algorithm is the Sieve of Eratosthenes, as posted in another answer. A couple things to check if you implement a Sieve of Eratosthenes:
It should work.
It shouldn't use division or modulus. Not that these are "bad" (granted, they tend to be an order of magnitude slower than addition, subtraction, negation, etc.), but they aren't needed, and if they're present, it probably means the implementation isn't really that efficient.
It should be able to compute the primes less than 10,000,000 in about a second (depending on your hardware, compiler, etc.).
First off, your code is both short and correct, which is very good for a beginner. ;-)
This is what I would do to improve the code:
1) Define the variables inside the loops, so they don't get confused with something else. I would also make the bound a parameter or a constant.
#define MAX 1000
for(int i=2;i<MAX;i++){
    int j; // declared here because it is still needed after the inner loop
    for(j=2;j<=i/j;j++)
        if(!(i%j)) break;
    if(j>(i/j)) cout<<i<<" is prime\n";
}
2) I would use the Sieve of Eratosthenes, as Joey Adams and Mau have suggested. Notice how I don't have to write the bound twice, so the two usages will always be identical.
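#define MAX 1000
bool prime[MAX];
memset(prime, true, sizeof(prime)); // memset (from <cstring>) takes pointer, value, byte count
prime[0] = prime[1] = false;
for(int i=4;i<MAX;i+=2) prime[i] = false;
for(int i=3;i*i<MAX;i+=2)
    if (prime[i])
        for(int j=i*i;j<MAX;j+=i)
            prime[j] = false;
// Print only after sieving, so primes above sqrt(MAX) are reported too
for(int i=2;i<MAX;i++)
    if (prime[i]) cout<<i<<" is prime\n";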
The bounds are also worth noting. i*i<MAX is a lot faster than j > i/j and you also don't need to mark any numbers < i*i, because they will already have been marked, if they are composite. The most important thing is the time complexity though.
3) If you really want to make this algorithm fast, you need to cache optimize it. The idea is to first find all the primes < sqrt(MAX) and then use them to find the rest of the primes. Then you can use the same block of memory to find all primes from 1024-2047, say, and then 2048-3071. This means that everything will be kept in L1-cache. I once measured a ~12x speedup by using this optimization on the Sieve of Eratosthenes.
You can also cut the space usage in half by not storing the even numbers, which means that you don't have to perform the calculations to begin working on a new block as often.
If you are a beginner you should probably just forget about the cache for the moment though.
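For readers who do want to see the block-wise idea, here is a rough sketch of a segmented sieve. It is not the implementation that was measured above: segmented_primes and its segment_size parameter are invented for this example, the even-number trick is left out, and limit is assumed to be small enough that the int arithmetic cannot overflow.

```cpp
#include <algorithm>
#include <vector>

// Returns all primes below limit using a segmented Sieve of Eratosthenes.
// segment_size must be positive; 1024 is only an illustrative default.
std::vector<int> segmented_primes(int limit, int segment_size = 1024) {
    std::vector<int> primes;
    if (limit <= 2) return primes;

    // Step 1: plain sieve up to root, the smallest value with root * root >= limit.
    int root = 1;
    while (static_cast<long long>(root) * root < limit) ++root;
    std::vector<bool> base(root + 1, true);
    base[0] = base[1] = false;
    for (int i = 2; i * i <= root; ++i)
        if (base[i])
            for (int j = i * i; j <= root; j += i) base[j] = false;
    for (int i = 2; i <= root; ++i)
        if (base[i]) primes.push_back(i);
    const std::size_t base_count = primes.size();

    // Step 2: sieve [low, high) block by block, reusing one small buffer.
    std::vector<char> block(segment_size);
    for (int low = root + 1; low < limit; low += segment_size) {
        const int high = std::min(low + segment_size, limit);
        std::fill(block.begin(), block.end(), 1);
        for (std::size_t k = 0; k < base_count; ++k) {
            const int p = primes[k];
            const int start = ((low + p - 1) / p) * p;  // first multiple of p in the block
            for (int j = start; j < high; j += p) block[j - low] = 0;
        }
        for (int n = low; n < high; ++n)
            if (block[n - low]) primes.push_back(n);
    }
    return primes;
}
```

Because every block is marked using only the base primes, the working set is the base list plus one buffer of segment_size bytes, which is the part meant to stay in cache.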
The one simple answer to the whole bunch of text we posted up here is : Trial division!
If someone mentioned mathematical basis that this task was based on, we'd save plenty of time ;)
#include <stdio.h>
#define N 1000
int main()
{
bool primes[N];
for(int i = 0 ; i < N ; i++) primes[i] = false;
primes[2] = true;
for(int i = 3 ; i < N ; i+=2) { // Check only odd integers
bool isPrime = true;
for(int j = i/2 ; j > 2 ; j-=2) { // Check only from largest possible multiple of current number
if ( j%2 == 0 ) { j = j-1; } // Check only with previous odd divisors
if(!primes[j]) continue; // Check only with previous prime divisors
if ( i % j == 0 ) {
isPrime = false;
break;
}
}
primes[i] = isPrime;
}
return 0;
}
This is working code. I also included many of the optimizations mentioned by previous posters. If there are any other optimizations that can be done, it would be informative to know.
This function is a more efficient way to check whether a number is prime.
bool isprime(const unsigned long n)
{
if (n<2) return false;
if (n<4) return true;
if (n%2==0) return false;
if (n%3==0) return false;
unsigned long r = (unsigned long) sqrt(n);
r++;
for(unsigned long c=6; c<=r; c+=6)
{
if (n%(c-1)==0) return false;
if (n%(c+1)==0) return false;
}