Efficiency of Sieve of Eratosthenes algorithm in C++

(Keep in mind, I'm a complete beginner to C++.)
I have tried to write a Sieve of Eratosthenes function in C++, and it currently looks as follows:
#include <iostream>
#include <unordered_set>
#include <vector>
#include <numeric>
#include <chrono>
int main() {
auto start = std::chrono::high_resolution_clock::now();
int max_val = 100000;
std::vector<int> primes(max_val);
std::iota(primes.begin(), primes.end(), 0);
for (int i = 2; i <= primes.size(); i++) {
int j = i+i;
while (j < primes.size()) {
primes[j] = 0;
j += i;
}
}
std::unordered_set<int> set_primes(primes.begin(), primes.end());
set_primes.erase(1);
set_primes.erase(0);
std::cout << set_primes.size() << "\n";
auto stop = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
std::cout << duration.count() << "\n|";
}
However, this function is very inefficient: generating the set of primes below 10 million takes around 11 seconds, whereas my Python program does it in about 4. I'm guessing the issue might lie in iterating over millions of zeros in my vector when creating the unordered_set. What could I do to improve the efficiency of my program?

First, every composite number has a divisor (other than 1) that is at most its square root, so the outer loop only needs to run from 2 to sqrt(n). Second, the inner loop can start at i * i instead of i + i: any smaller multiple k * i with k < i has a prime factor smaller than i, so it was already crossed off in an earlier pass. So the final code is going to be:
int n;
cin >> n;
vector<bool> prime (n+1, true);
prime[0] = prime[1] = false;
int i = 2;
while (i * i <= n) {
if (prime[i])
for (int j = i * i; j <= n; j += i)
prime[j] = false;
i++;
}
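For reference, here is a minimal self-contained sketch of the same idea wrapped in a function. The name count_primes_up_to and the choice to return a count are assumptions made for illustration, not part of the answer above; long long is used for the loop variables only to keep i * i from overflowing for large n.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): counts the primes <= n
// with a Sieve of Eratosthenes that starts crossing off at i * i.
std::size_t count_primes_up_to(int n) {
    if (n < 2) return 0;
    std::vector<bool> prime(static_cast<std::size_t>(n) + 1, true);
    prime[0] = prime[1] = false;
    for (long long i = 2; i * i <= n; ++i) {
        if (!prime[i]) continue;  // multiples of a composite are already marked
        for (long long j = i * i; j <= n; j += i)
            prime[j] = false;
    }
    std::size_t count = 0;
    for (bool is_prime : prime)
        if (is_prime) ++count;
    return count;
}
```

Counting directly from the vector<bool> avoids building an unordered_set at all, which is where the question suspected most of the time was going.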

Related

Finding maximum

Given an integer n and an array a, find the maximum of (a[i]+a[j])*(j-i) with 1<=i<=n-1 and i+1<=j<=n.
Example:
Input
5
1 3 2 5 4
Output
21
Explanation: with i=2 and j=5, the maximum of (a[i]+a[j])*(j-i) is (3+4)*(5-2)=21
Constraints:
n<=10^6
a[i]>0 with 1<=i<=n
I can solve this problem with n<=10^4, but what should I do if n is too large, like the constraints?
First, let's reference the "brute force" algorithm. This will have some issues, which I will call out below, but it is a correct solution.
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result findBestBruteForce(const vector<int>& a)
{
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
for (size_t i = 0; i < a.size(); i++)
{
for (size_t j = i + 1; j < a.size(); j++)
{
// do the math in 64-bit space to avoid overflow
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
The problem with the above code is that it runs in O(N²) time. More precisely, for the N iterations of the outer for-loop (where i goes from 0 to N), there are an average of N/2 iterations of the inner for-loop. If N is small, this isn't a problem.
On my PC, with full optimizations turned on, the run time is less than a second when N is under 20000. Once N approaches 100000, it takes several seconds to process the 5 billion iterations. Let's just go with "a billion operations per second" as an expected rate. If N were 1000000, the maximum the OP outlined, it would probably take 500 seconds. Such is the nature of an N-squared algorithm.
So how can we speed it up? Here's an interesting observation. Let's say our array was this:
10 5 4 15 13 100 101 6
On the first iteration of the outer loop above, where i=0, we'd be computing this on each iteration of the inner loop:
for each j: (a[0]+a[j])*(j-0)
for each j: (10+a[j])*(j-0)
for each j: [15*1, 14*2, 25*3, 23*4, 110*5, 111*6, 16*7]
= [15, 28, 75, 92, 550, 666, 112]
Hence, for i=0, a[i] = 10 and the largest value computed from that set is 666.
Since a[0] is 10, and we're tracking a current "best" value, there's no incentive to iterate over all the values again for i=1, since a[1]==5 is less than 10. There's no index j that would produce a value of (a[1]+a[j])*(j-1) larger than what's already been found, because (5+a[j])*(j-1) will always be less than (10+a[j])*(j-1), which in turn is less than (10+a[j])*(j-0). (This assumes all values in the array are non-negative.)
So to generalize, the outer loop can skip over any index of i where A[best_i] > A[i]. And that's a real simple alteration to our above code:
Result findBestOptimized(const std::vector<int>& a)
{
if (a.size() < 2)
{
return {0,0,INT64_MIN};
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
minimum = a[i];
}
}
}
return { besti, bestj, bestvalue };
}
Above, we introduce minimum: the value that a[i] must exceed before the full inner loop enumeration is worth doing.
I benchmarked this with build optimizations on. On a random array of a million items, it runs in under a second.
But wait... there's another optimization!
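If the inner loop fails to find an index j such that value > bestvalue, we still know that the current a[i] is greater than minimum (otherwise it would have been skipped), and no later index holding a value at or below a[i] can beat what this i could reach. Hence, we can raise minimum to a[i] at the end of the inner loop regardless.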
Now, I'll present the final solution:
Result findBestOptimizedEvenMore(const std::vector<int>& a)
{
if (a.size() < 2)
{
return { 0,0,INT64_MIN };
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
minimum = a[i]; // since we know a[i] > minimum, we can do this
}
return { besti, bestj, bestvalue };
}
I benchmarked the above solution on different array sizes from N=100 to N=1000000. It does all iterations in under 25 milliseconds.
In the above solution, there's likely a worst case runtime of O(N²) again when all the items in the array are in ascending order. But I believe the average case should be on the order of O(N lg N) or better. I'll do some more analysis later if anyone is interested.
Note: Some notation for variables and the Result class in the code have been copied from @selbie's excellent answer.
Here's another O(n^2) worst-case solution with (likely provable) O(n) expected performance on random permutations and room for optimization.
Suppose [i, j] are our array bounds for an optimal pair. By the problem definition, this means all elements left of i must be strictly less than A[i], and all elements right of j must be strictly less than A[j].
This means we can compute the left-maxima of A: all elements strictly greater than all previous elements, as well as the right-maxima of A. Then, we only need to consider left endpoints from the left-maxima and right endpoints from the right-maxima.
I don't know the expectation of the product of the sizes of left and right maxima sets, but we can get an upper bound. The size of left maxima is at most the size of the longest increasing subsequence (LIS) of A. The right maxima are at most the size of the longest decreasing subsequence. These aren't independent, but I'm taking as an (unproven) assumption that the LIS and LDS lengths are inversely correlated with each other for random permutations. The right-maxima must start after the left-maxima end, so this seems like a safe assumption.
The length of the LIS of a random permutation, suitably rescaled, follows the Tracy-Widom distribution: its mean is about 2*sqrt(N) and its standard deviation is on the order of N^(1/6). The expected square of the length is therefore about 4N + O(N^(1/3)), so O(N). This isn't exactly the proof we wanted, since you'd need to integrate over the probability density function to be rigorous, but the LIS is already an upper bound on the left-maxima size, so I think the conclusion is still true.
C++ code (Result class and some variable names taken from selbie's post, as mentioned):
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result find_best_sum_size_product(const std::vector<int>& nums)
{
/* Given: list of positive integers nums
Returns: Tuple with (best_i, best_j, best_product)
where best_i and best_j maximize the product
(nums[i]+nums[j])*(j-i) over 0 <= i < j < n
Runtime: O(n^2) worst case,
O(n) average on random permutations.
*/
int n = nums.size();
if (n < 2)
{
return {0,0,INT64_MIN};
}
std::vector<int> left_maxima_indices;
left_maxima_indices.push_back(0);
for (int i = 1; i < n; i++){
if (nums.at(i) > nums.at(left_maxima_indices.back())) {
left_maxima_indices.push_back(i);
}
}
std::vector<int> right_maxima_indices;
right_maxima_indices.push_back(n-1);
for (int i = n-1; i >= 0; i--){
if (nums.at(i) > nums.at(right_maxima_indices.back())) {
right_maxima_indices.push_back(i);
}
}
size_t best_i = 0;
size_t best_j = 0;
int64_t best_product = INT64_MIN;
int i = 0;
int j = 0;
for (size_t left_idx = 0;
left_idx < left_maxima_indices.size();
left_idx++)
{
i = left_maxima_indices.at(left_idx);
for (size_t right_idx = 0;
right_idx < right_maxima_indices.size();
right_idx++)
{
j = right_maxima_indices.at(right_idx);
if (i == j) continue;
int64_t value = (nums.at(i) + (int64_t)nums.at(j)) * (j - i);
if (value > best_product)
{
best_product = value;
best_i = i;
best_j = j;
}
}
}
return { best_i, best_j, best_product };
}
I started from the two excellent answers by @selbie and @kcsquared.
Their solutions gave impressive results for random inputs. What was not clear was the worst-case behavior.
What sequence would correspond to the worst case?
I finally found a critical sequence for these two answers, a triangle sequence: it increases slowly up to a maximum and then decreases slowly. With such a sequence and n=10^5, for example, these answers take more than 10 s.
My solution starts from @selbie's solution and adds two improvements:
I add @kcsquared's trick: to the right of j, there can only be lower elements
When considering a new left element a[i], it is useless to start from i + 1 to look for the second element. We can start from the current best_j
With these tricks, I was able to improve on the performance of the two posted answers a little. However, it still fails to solve the triangle sequence issue: about 10 s for n = 10^5.
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
#include <ctime>
#include <chrono>
#include <climits>  // INT_MIN
#include <cstdint>  // int64_t, INT64_MIN
struct Result {
size_t i;
size_t j;
int64_t value;
};
void print (const Result& res, const std::string& prefix = "") {
std::cout << prefix;
std::cout << "(" << res.i << ", " << res.j << ") -> " << res.value << std::endl;
}
Result findBest(const std::vector<int>& a) {
if (a.size() < 2) {
return { 0, 0, INT64_MIN };
}
int n = a.size();
std::vector<int> next_max(n, -1);
int current_max = n-1;
for (int i = n-1; i >= 0; --i) {
if (a[i] > a[current_max]) {
current_max = i;
}
next_max[i] = current_max;
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++) {
if (a[i] <= minimum) {
continue;
}
minimum = a[i];
size_t jmin = (bestj > i) ? bestj : i+1;
for (size_t j = jmin; j < a.size(); j++) {
j = next_max[j];
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue) {
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
int main() {
int n = 1000000;
int vmax = 100000000;
std::vector<int> A (n);
std::srand(std::time(0));
for (int i = 0; i < n; ++i) {
A[i] = rand() % vmax + 1;
}
std::cout << "n = " << n << std::endl;
auto t0 = std::chrono::high_resolution_clock::now();
auto res = findBest (A);
auto t1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Random: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
int i_max = n/2;
for (int i = 0; i < i_max; ++i) A[i] = i+1;
A[i_max] = 10 * i_max;
for (int i = i_max+1; i < n; ++i) {
A[i] = 2*i_max - i;
}
t0 = std::chrono::high_resolution_clock::now();
res = findBest (A);
t1 = std::chrono::high_resolution_clock::now();
duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Triangle sequence: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
return 0;
}

read/write to large array using large loop - execution time concerns

So recently I ran into a problem that I thought was interesting and I couldn't fully explain. I've highlighted the nature of the problem in the following code:
#include <cstring>
#include <chrono>
#include <iostream>
#define NLOOPS 10
void doWorkFast(int total, int *write, int *read)
{
for (int j = 0; j < NLOOPS; j++) {
for (int i = 0; i < total; i++) {
write[i] = read[i] + i;
}
}
}
void doWorkSlow(int total, int *write, int *read, int innerLoopSize)
{
for (int i = 0; i < NLOOPS; i++) {
for (int j = 0; j < total/innerLoopSize; j++) {
for (int k = 0; k < innerLoopSize; k++) {
write[j*k + k] = read[j*k + k] + j*k + k;
}
}
}
}
int main(int argc, char *argv[])
{
int n = 1000000000;
int *heapMemoryWrite = new int[n];
int *heapMemoryRead = new int[n];
for (int i = 0; i < n; i++)
{
heapMemoryRead[i] = 1;
}
std::memset(heapMemoryWrite, 0, n * sizeof(int));
auto start1 = std::chrono::high_resolution_clock::now();
doWorkFast(n,heapMemoryWrite, heapMemoryRead);
auto finish1 = std::chrono::high_resolution_clock::now();
auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(finish1 - start1);
for (int i = 0; i < n; i++)
{
heapMemoryRead[i] = 1;
}
std::memset(heapMemoryWrite, 0, n * sizeof(int));
auto start2 = std::chrono::high_resolution_clock::now();
doWorkSlow(n,heapMemoryWrite, heapMemoryRead, 10);
auto finish2 = std::chrono::high_resolution_clock::now();
auto duration2 = std::chrono::duration_cast<std::chrono::microseconds>(finish2 - start2);
std::cout << "Small inner loop:" << duration1.count() << " microseconds.\n" <<
"Large inner loop:" << duration2.count() << " microseconds." << std::endl;
delete[] heapMemoryWrite;
delete[] heapMemoryRead;
}
Looking at the two doWork* functions, for every iteration, we are reading the same addresses adding the same value and writing to the same addresses. I understand that in the doWorkSlow implementation, we are doing one or two more operations to resolve j*k + k, however, I think it's reasonably safe to assume that relative to the time it takes to do the load/stores for memory read and write, the time contribution of these operations is negligible.
Nevertheless, doWorkSlow takes about twice as long (46.8s) compared to doWorkFast (25.5s) on my i7-3700 using g++ --version 7.5.0. While things like cache prefetching and branch prediction come to mind, I don't have a great explanation as to why doWorkFast is much faster than doWorkSlow. Does anyone have insight?
Thanks
"Looking at the two doWork* functions, for every iteration, we are reading the same addresses adding the same value and writing to the same addresses."
This is not true!
In doWorkFast, you index each integer incrementally, as array[i].
array[0]
array[1]
array[2]
array[3]
In doWorkSlow, you index each integer as array[j*k + k], which jumps around and repeats.
When j is 10, for example, and you iterate k from 0 onwards, you are accessing
array[0] // 10*0+0
array[11] // 10*1+1
array[22] // 10*2+2
array[33] // 10*3+3
This will prevent your optimizer from using instructions that can operate on many adjacent integers at once.
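If the intent was for both functions to touch the same elements, the index in the innermost loop was probably meant to be j * innerLoopSize + k. That is an assumption about what the question intended; the function name doWorkBlocked below is made up for illustration, and only a single pass over the array is shown.

```cpp
// Sketch under the assumption above: visits indices 0 .. total-1 in order,
// split into blocks of innerLoopSize (total is assumed to be a multiple of it).
void doWorkBlocked(int total, int *write, const int *read, int innerLoopSize)
{
    for (int j = 0; j < total / innerLoopSize; j++) {
        for (int k = 0; k < innerLoopSize; k++) {
            int idx = j * innerLoopSize + k;  // contiguous, unlike j*k + k
            write[idx] = read[idx] + idx;
        }
    }
}
```

With this indexing the access pattern is the same sequential sweep as in doWorkFast, so a timing comparison would measure the loop structure rather than a different set of memory accesses.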

Algorithm to compute mode

I'm trying to devise a function that accepts two parameters, an array and the size of the array. I want it to return the mode of the array and, if there are multiple modes, their average. My strategy was to first sort the array and then count the occurrences of each number: while that number keeps occurring, add one to a counter, then store that count in an array m. So m holds all the counts and another array q holds the value each count belongs to.
For example, if my list is {1, 1, 1, 1, 2, 2, 2}
then I would have m[0] = 4, q[0] = 1
and then m[1] = 3 and q[1] = 2,
so the mode is q[0] = 1.
Unfortunately I have had no success thus far. Hoping someone could help.
float mode(int x[],int n)
{
//Copy array and sort it
int y[n], temp, k = 0, counter = 0, m[n], q[n];
for(int i = 0; i < n; i++)
y[i] = x[i];
for(int pass = 0; pass < n - 1; pass++)
for(int pos = 0; pos < n; pos++)
if(y[pass] > y[pos]) {
temp = y[pass];
y[pass] = y[pos];
y[pos] = temp;
}
for(int i = 0; i < n;){
for(int j = 0; j < n; j++){
while(y[i] == y[j]) {
counter++;
i++;
}
}
m[k] = counter;
q[k] = y[i];
i--; //i should be 1 less since it is referring to an array subscript
k++;
counter = 0;
}
}
Even though you have some good answers already, I decided to post another. I'm not sure it really adds a lot that's new, but I'm not at all sure it doesn't either. If nothing else, I'm pretty sure it uses more standard headers than any of the other answers. :-)
#include <vector>
#include <algorithm>
#include <unordered_map>
#include <map>
#include <iostream>
#include <utility>
#include <functional>
#include <numeric>
int main() {
std::vector<int> inputs{ 1, 1, 1, 1, 2, 2, 2 };
std::unordered_map<int, size_t> counts;
for (int i : inputs)
++counts[i];
std::multimap<size_t, int, std::greater<size_t> > inv;
for (auto p : counts)
inv.insert(std::make_pair(p.second, p.first));
auto e = inv.upper_bound(inv.begin()->first);
double sum = std::accumulate(inv.begin(),
e,
0.0,
[](double a, std::pair<size_t, int> const &b) {return a + b.second; });
std::cout << sum / std::distance(inv.begin(), e);
}
Compared to @Dietmar's answer, this should be faster if you have a lot of repetition in the numbers, but his will probably be faster if the numbers are mostly unique.
Based on the comment, it seems you need to find the values which occur most often and, if there are multiple values occurring the same number of times, produce the average of these. This can easily be done with std::sort() followed by a traversal that finds where values change and keeps a few running counts:
template <int Size>
double mode(int const (&x)[Size]) {
std::vector<int> tmp(x, x + Size);
std::sort(tmp.begin(), tmp.end());
int size(0); // size of the largest set so far
int count(0); // number of largest sets
double sum(0); // sum of largest sets
for (auto it(tmp.begin()); it != tmp.end(); ) {
auto end(std::upper_bound(it, tmp.end(), *it));
if (size == std::distance(it, end)) {
sum += *it;
++count;
}
else if (size < std::distance(it, end)) {
size = std::distance(it, end);
sum = *it;
count = 1;
}
it = end;
}
return sum / count;
}
If you simply wish to count the number of occurrences then I suggest you use a std::map or std::unordered_map.
If you're mapping a counter to each distinct value then it's an easy task to count occurrences using std::map, as each key can only be inserted once. To list the distinct numbers in your list, simply iterate over the map.
Here's an example of how you could do it:
#include <cstddef>
#include <map>
#include <algorithm>
#include <iostream>
#include <iterator>  // std::begin, std::end
#include <numeric>   // std::accumulate
std::map<int, int> getOccurences(const int arr[], const std::size_t len) {
std::map<int, int> m;
for (std::size_t i = 0; i != len; ++i) {
m[arr[i]]++;
}
return m;
}
int main() {
int list[7]{1, 1, 1, 1, 2, 2, 2};
auto occurences = getOccurences(list, 7);
for (auto e : occurences) {
std::cout << "Number " << e.first << " occurs ";
std::cout << e.second << " times" << std::endl;
}
auto average = std::accumulate(std::begin(list), std::end(list), 0.0) / 7;
std::cout << "Average is " << average << std::endl;
}
Output:
Number 1 occurs 4 times
Number 2 occurs 3 times
Average is 1.42857
Here's a working version of your code. m stores the values in the array and q stores their counts. At the end it runs through all the values to get the maximal count, the sum of the modes, and the number of distinct modes.
float mode(int x[],int n)
{
//Copy array and sort it
int y[n], temp, j = 0, k = 0, m[n], q[n];
for(int i = 0; i < n; i++)
y[i] = x[i];
for(int pass = 0; pass < n - 1; pass++)
for(int pos = pass + 1; pos < n; pos++) // compare only with later elements so the result is sorted ascending
if(y[pass] > y[pos]) {
temp = y[pass];
y[pass] = y[pos];
y[pos] = temp;
}
for(int i = 0; i < n;){
j = i;
while (j < n && y[j] == y[i]) { // stop at the end of the array
j++;
}
m[k] = y[i];
q[k] = j - i;
k++;
i = j;
}
int max = 0;
int modes_count = 0;
int modes_sum = 0;
for (int i=0; i < k; i++) {
if (q[i] > max) {
max = q[i];
modes_count = 1;
modes_sum = m[i];
} else if (q[i] == max) {
modes_count += 1;
modes_sum += m[i];
}
}
return (float)modes_sum / modes_count; // avoid integer division when averaging the modes
}

lexicographically smallest string after rotation

I am trying to solve this problem in spoj
I need to find the number of rotations of a given string that will make it lexicographically smallest among all the rotations.
For example:
Original: ama
First rotation: maa
Second rotation: aam
This is the lexicographically smallest rotation, so the answer is 2.
Here's my code:
string s,tmp;
char ss[100002];
scanf("%s",ss);
s=ss;
tmp=s;
int i,len=s.size(),ans=0,t=0;
for(i=0;i<len;i++)
{
string x=s.substr(i,len-i)+s.substr(0,i);
if(x<tmp)
{
tmp=x;
t=ans;
}
ans++;
}
cout<<t<<endl;
I am getting "Time Limit Exceeded" for this solution. I don't understand what optimizations can be made. How can I increase the speed of my solution?
You can use a modified suffix array. I say modified because you must not stop at the end of the word: comparisons wrap around to the beginning.
Here is the code for a similar problem I solved (SA is the suffix array):
//719
//Glass Beads
//Misc;String Matching;Suffix Array;Circular
#include <iostream>
#include <iomanip>
#include <cstring>
#include <string>
#include <cmath>
#include <algorithm>  // std::max
#define MAX 10050
using namespace std;
int RA[MAX], tempRA[MAX];
int SA[MAX], tempSA[MAX];
int C[MAX];
void suffix_sort(int n, int k) {
memset(C, 0, sizeof C);
for (int i = 0; i < n; i++)
C[RA[(i + k)%n]]++;
int sum = 0;
for (int i = 0; i < max(256, n); i++) {
int t = C[i];
C[i] = sum;
sum += t;
}
for (int i = 0; i < n; i++)
tempSA[C[RA[(SA[i] + k)%n]]++] = SA[i];
memcpy(SA, tempSA, n*sizeof(int));
}
void suffix_array(string &s) {
int n = s.size();
for (int i = 0; i < n; i++)
RA[i] = s[i];
for (int i = 0; i < n; i++)
SA[i] = i;
for (int k = 1; k < n; k *= 2) {
suffix_sort(n, k);
suffix_sort(n, 0);
int r = tempRA[SA[0]] = 0;
for (int i = 1; i < n; i++) {
int s1 = SA[i], s2 = SA[i-1];
bool equal = true;
equal &= RA[s1] == RA[s2];
equal &= RA[(s1+k)%n] == RA[(s2+k)%n];
tempRA[SA[i]] = equal ? r : ++r;
}
memcpy(RA, tempRA, n*sizeof(int));
}
}
int main() {
int tt; cin >> tt;
while(tt--) {
string s; cin >> s;
suffix_array(s);
cout << SA[0]+1 << endl;
}
}
I took this implementation mostly from this book. There is an easier-to-write O(n log²n) version, but it may not be efficient enough for your case (n=10^5). This version is O(n log n), though it's not the most efficient algorithm. The Wikipedia article lists some O(n) algorithms, but I find most of them too complex to write during a programming contest. This O(n log n) version is usually enough for most problems.
You can find some slides explaining suffix array concept (from the author of the book I mentioned) here.
I know this comes very late, but I stumbled across this question from Google while searching for an even faster variant of this algorithm. It turns out a good implementation can be found on GitHub: https://gist.github.com/MaskRay/8803371
It uses the Lyndon factorization. That means it repeatedly splits the string into lexicographically non-increasing Lyndon words. A Lyndon word is a string that is (one of) the minimal rotations of itself. Doing this in a circular way yields the lexicographically minimal rotation of the string as the last Lyndon word found.
int lyndon_word(const char *a, int n)
{
int i = 0, j = 1, k;
while (j < n) {
// Invariant: i < j and indices in [0,j) \ i cannot be the first optimum
for (k = 0; k < n && a[(i+k)%n] == a[(j+k)%n]; k++);
if (a[(i+k)%n] <= a[(j+k)%n]) {
// if k < n
// foreach p in [j,j+k], s_p > s_{p-(j-i)}
// => [j,j+k] are all suboptimal
// => indices in [0,j+k+1) \ i are suboptimal
// else
// None of [j,j+k] is the first optimum
j += k+1;
} else {
// foreach p in [i,i+k], s_p > s_{p+(j-i)}
// => [i,i+k] are all suboptimal
// => [0,j) and [0,i+k+1) are suboptimal
// if i+k+1 < j
// j < j+1 and indices in [0,j+1) \ j are suboptimal
// else
// i+k+1 < i+k+2 and indices in [0,i+k+2) \ (i+k+1) are suboptimal
i += k+1;
if (i < j)
i = j++;
else
j = i+1;
}
}
// j >= n => [0,n) \ i cannot be the first optimum
return i;
}
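For comparison, here is a compact sketch of the commonly taught two-pointer formulation of the same linear-time idea. The function name least_rotation is an assumption made for illustration and the code is not taken from the gist; it returns a start index of the lexicographically smallest rotation.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: i and j are two candidate start positions and k is the
// length of the prefix on which the rotations starting at i and j agree.
int least_rotation(const std::string& s) {
    const int n = static_cast<int>(s.size());
    int i = 0, j = 1, k = 0;
    while (i < n && j < n && k < n) {
        const char a = s[(i + k) % n];
        const char b = s[(j + k) % n];
        if (a == b) { ++k; continue; }
        // The candidate with the larger character loses, and so do the
        // k positions after it, so skip past all of them at once.
        if (a > b) i += k + 1; else j += k + 1;
        if (i == j) ++j;  // keep the two candidates distinct
        k = 0;
    }
    return std::min(i, j);
}
```

For the string "ama" from the question this returns 2, matching the expected number of rotations.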

Sieve of Eratosthenes algorithm

I am currently reading "Programming: Principles and Practice Using C++", in Chapter 4 there is an exercise in which:
I need to make a program to calculate prime numbers between 1 and 100 using the Sieve of Eratosthenes algorithm.
This is the program I came up with:
#include <vector>
#include <iostream>
using namespace std;
//finds prime numbers using Sieve of Eratosthenes algorithm
vector<int> calc_primes(const int max);
int main()
{
const int max = 100;
vector<int> primes = calc_primes(max);
for(int i = 0; i < primes.size(); i++)
{
if(primes[i] != 0)
cout<<primes[i]<<endl;
}
return 0;
}
vector<int> calc_primes(const int max)
{
vector<int> primes;
for(int i = 2; i < max; i++)
{
primes.push_back(i);
}
for(int i = 0; i < primes.size(); i++)
{
if(!(primes[i] % 2) && primes[i] != 2)
primes[i] = 0;
else if(!(primes[i] % 3) && primes[i] != 3)
primes[i]= 0;
else if(!(primes[i] % 5) && primes[i] != 5)
primes[i]= 0;
else if(!(primes[i] % 7) && primes[i] != 7)
primes[i]= 0;
}
return primes;
}
Not the best or fastest, but I am still early in the book and don't know much about C++.
Now the problem: as long as max is no bigger than 500, all the values print on the console; if max > 500, not everything gets printed.
Am I doing something wrong?
P.S.: Also any constructive criticism would be greatly appreciated.
I have no idea why you're not getting all the output, as it looks like you should get everything. What output are you missing?
The sieve is implemented wrongly. Something like
vector<int> sieve;
vector<int> primes;
for (int i = 1; i < max + 1; ++i)
sieve.push_back(i); // you'll learn more efficient ways to handle this later
sieve[0]=0;
for (int i = 2; i < max + 1; ++i) { // there are lots of brace styles, this is mine
if (sieve[i-1] != 0) {
primes.push_back(sieve[i-1]);
for (int j = 2 * sieve[i-1]; j < max + 1; j += sieve[i-1]) {
sieve[j-1] = 0;
}
}
}
would implement the sieve. (Code above written off the top of my head; not guaranteed to work or even compile. I don't think it's got anything not covered by the end of chapter 4.)
Return primes as usual, and print out the entire contents.
Think of the sieve as a set.
Go through the set in order. For each value in the sieve, remove all numbers that are divisible by it.
#include <set>
#include <algorithm>
#include <iterator>
#include <iostream>
typedef std::set<int> Sieve;
int main()
{
static int const max = 100;
Sieve sieve;
for(int loop=2;loop < max;++loop)
{
sieve.insert(loop);
}
// A set is ordered.
// So going from beginning to end will give all the values in order.
for(Sieve::iterator loop = sieve.begin();loop != sieve.end();++loop)
{
// prime is the next item in the set
// It has not been deleted so it must be prime.
int prime = *loop;
// deleter will iterate over all the items from
// here to the end of the sieve and remove any
// that are divisible by this prime.
Sieve::iterator deleter = loop;
++deleter;
while(deleter != sieve.end())
{
if (((*deleter) % prime) == 0)
{
// If it is exactly divisible then it is not a prime
// So delete it from the sieve. Note the use of post
// increment here. This increments deleter but returns
// the old value to be used in the erase method.
sieve.erase(deleter++);
}
else
{
// Otherwise just increment the deleter.
++deleter;
}
}
}
// This copies all the values left in the sieve to the output.
// i.e. It prints all the primes.
std::copy(sieve.begin(),sieve.end(),std::ostream_iterator<int>(std::cout,"\n"));
}
From Algorithms and Data Structures:
void runEratosthenesSieve(int upperBound) {
int upperBoundSquareRoot = (int)sqrt((double)upperBound);
bool *isComposite = new bool[upperBound + 1];
memset(isComposite, 0, sizeof(bool) * (upperBound + 1));
for (int m = 2; m <= upperBoundSquareRoot; m++) {
if (!isComposite[m]) {
cout << m << " ";
for (int k = m * m; k <= upperBound; k += m)
isComposite[k] = true;
}
}
for (int m = upperBoundSquareRoot + 1; m <= upperBound; m++) // start past the root so it is not printed twice
if (!isComposite[m])
cout << m << " ";
delete [] isComposite;
}
Interestingly, nobody seems to have answered your question about the output problem. I don't see anything in the code that should affect the output depending on the value of max.
For what it's worth, on my Mac, I get all the output. It's wrong of course, since the algorithm isn't correct, but I do get all the output. You don't mention what platform you're running on, which might be useful if you continue to have output problems.
Here's a version of your code, minimally modified to follow the actual Sieve algorithm.
#include <vector>
#include <iostream>
using namespace std;
//finds prime numbers using Sieve of Eratosthenes algorithm
vector<int> calc_primes(const int max);
int main()
{
const int max = 100;
vector<int> primes = calc_primes(max);
for(int i = 0; i < primes.size(); i++)
{
if(primes[i] != 0)
cout<<primes[i]<<endl;
}
return 0;
}
vector<int> calc_primes(const int max)
{
vector<int> primes;
// fill vector with candidates
for(int i = 2; i < max; i++)
{
primes.push_back(i);
}
// for each value in the vector...
for(int i = 0; i < primes.size(); i++)
{
//get the value
int v = primes[i];
if (v!=0) {
//remove all multiples of the value
int x = i+v;
while(x < primes.size()) {
primes[x]=0;
x = x+v;
}
}
}
return primes;
}
In the code fragment below, the numbers are filtered before they are inserted into the vector. The divisors come from the vector.
I'm also passing the vector by reference. This means that the huge vector won't be copied from the function to the caller. (Large chunks of memory take long times to copy)
vector<unsigned int> primes;
void calc_primes(vector<unsigned int>& primes, const unsigned int MAX)
{
// If MAX is less than 2, return an empty vector
// because 2 is the first prime and can't be placed in the vector.
if (MAX < 2)
{
return;
}
// 2 is the initial and unusual prime, so enter it without calculations.
primes.push_back(2);
for (unsigned int number = 3; number < MAX; number += 2)
{
bool is_prime = true;
for (unsigned int index = 0; index < primes.size(); ++index)
{
if ((number % primes[index]) == 0)
{
is_prime = false;
break;
}
}
if (is_prime)
{
primes.push_back(number);
}
}
}
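This is not the most efficient algorithm (strictly speaking it is trial division by the primes found so far rather than a true sieve), but it follows the same idea of eliminating multiples of known primes.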
Below is my version, which uses a std::vector<bool> as a bit vector, walks through the odd numbers, and uses repeated addition to find the multiples to set to false. At the end, a vector of the prime values is constructed and returned to the caller.
std::vector<int> getSieveOfEratosthenes ( int max )
{
std::vector<bool> primes(max, true);
int sz = primes.size();
for ( int i = 3; i < sz ; i+=2 )
if ( primes[i] )
for ( int j = i * i; j < sz; j+=i)
primes[j] = false;
std::vector<int> ret;
ret.reserve(primes.size());
ret.push_back(2);
for ( int i = 3; i < sz; i+=2 )
if ( primes[i] )
ret.push_back(i);
return ret;
}
Here is a concise, well explained implementation using bool type:
#include <iostream>
#include <cmath>
void find_primes(bool[], unsigned int);
void print_primes(bool [], unsigned int);
//=========================================================================
int main()
{
const unsigned int max = 100;
bool sieve[max + 1]; // the loops below use indices 0..max inclusive
find_primes(sieve, max);
print_primes(sieve, max);
}
//=========================================================================
/*
Function: find_primes()
Use: find_primes(bool_array, size_of_array);
It marks all the prime numbers till the
number: size_of_array, in the form of the
indexes of the array with value: true.
It implements the Sieve of Eratosthenes,
consisting of:
a loop through the first "sqrt(size_of_array)"
numbers starting from the first prime (2).
a loop through all the indexes < size_of_array,
marking the ones satisfying the relation i^2 + n * i
as false, i.e. composite numbers, where i - known prime
number starting from 2.
*/
void find_primes(bool sieve[], unsigned int size)
{
// by definition 0 and 1 are not prime numbers
sieve[0] = false;
sieve[1] = false;
// all numbers <= max are potential candidates for primes
for (unsigned int i = 2; i <= size; ++i)
{
sieve[i] = true;
}
// loop through the first prime numbers < sqrt(max) (suggested by the algorithm)
unsigned int first_prime = 2;
for (unsigned int i = first_prime; i <= std::sqrt(double(size)); ++i)
{
// only cross out multiples of numbers that are still marked prime
if (sieve[i])
{
// mark as composite: i^2 + n * i
for (unsigned int j = i * i; j <= size; j += i)
{
sieve[j] = false;
}
}
}
}
/*
Function: print_primes()
Use: print_primes(bool_array, size_of_array);
It prints all the prime numbers,
i.e. the indexes with value: true.
*/
void print_primes(bool sieve[], unsigned int size)
{
// all the indexes of the array marked as true are primes
for (unsigned int i = 0; i <= size; ++i)
{
if (sieve[i] == true)
{
std::cout << i <<" ";
}
}
}
This covers the array case. A std::vector implementation needs only minor changes: each function takes a single parameter, the vector passed by reference, and the loops use the vector's size() member function instead of the removed size parameter.
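A std::vector version along those lines might look like the sketch below. I am assuming the caller sizes the vector to max + 1 so that index max is valid; that convention, and the exact signatures, are my own reading of the description rather than code from the answer.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Assumed convention: sieve.size() == max + 1, so indexes 0..max are valid.
void find_primes(std::vector<bool>& sieve)
{
    const std::size_t size = sieve.size();
    // Treat every index as a candidate, then rule out 0 and 1.
    sieve.assign(size, true);
    if (size > 0) sieve[0] = false;
    if (size > 1) sieve[1] = false;
    // Only numbers up to sqrt(size) can still have uncrossed multiples.
    for (std::size_t i = 2; i * i < size; ++i)
    {
        if (sieve[i])
        {
            for (std::size_t j = i * i; j < size; j += i)
            {
                sieve[j] = false; // j is a multiple of the prime i
            }
        }
    }
}

// Prints every index that is still marked true, i.e. every prime found.
void print_primes(const std::vector<bool>& sieve)
{
    for (std::size_t i = 0; i < sieve.size(); ++i)
    {
        if (sieve[i])
        {
            std::cout << i << " ";
        }
    }
}
```

To use it, create `std::vector<bool> sieve(101);`, call `find_primes(sieve);` and then `print_primes(sieve);` to list the primes up to 100.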
Here is another version of the Sieve of Eratosthenes that I implemented, this time using std::set:
#include <iostream>
#include <cmath>
#include <set>
using namespace std;
void sieve(int n){
set<int> primes;
primes.insert(2);
for(int i=3; i<=n ; i+=2){
primes.insert(i);
}
int p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
int maxRoot = sqrt(*(primes.rbegin()));
while(primes.size()>0){
if(p>maxRoot){
while(primes.size()>0){
p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
}
break;
}
int i=p*p;
int temp = (*(primes.rbegin()));
while(i<=temp){
primes.erase(i);
i+=p;
i+=p;
}
p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
}
}
int main(){
int n;
n = 1000000;
sieve(n);
return 0;
}
Here's my implementation; I'm not sure it is 100% correct, though:
http://pastebin.com/M2R2J72d
#include<iostream>
#include <stdlib.h>
using namespace std;
void listPrimes(int x);
int main() {
listPrimes(5000);
}
void listPrimes(int x) {
bool *not_prime = new bool[x + 1](); // x + 1 flags, all false, so index x is valid
unsigned j = 0, i = 0;
for (i = 0; i <= x; i++) {
if (i < 2) {
not_prime[i] = true;
} else if (i % 2 == 0 && i != 2) {
not_prime[i] = true;
}
}
while (j <= x) {
for (i = j; i <= x; i++) {
if (!not_prime[i]) {
j = i;
break;
}
}
for (i = (j * 2); i <= x; i += j) {
not_prime[i] = true;
}
j++;
}
for ( i = 0; i <= x; i++) {
if (!not_prime[i])
cout << i << ' ';
}
delete[] not_prime; // release the array allocated with new[]
return;
}
I am following the same book now. I have come up with the following implementation of the algorithm.
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<numeric> // std::iota
#include<cmath>
using namespace std;
inline void keep_window_open() { char ch; cin>>ch; }
int main ()
{
int max_no = 100;
vector <int> numbers (max_no - 1);
iota(numbers.begin(), numbers.end(), 2);
for (unsigned int ind = 0; ind < numbers.size(); ++ind)
{
for (unsigned int index = ind+1; index < numbers.size(); )
{
if (numbers[index] % numbers[ind] == 0)
{
// erase shifts the next element into this slot, so do not advance
numbers.erase(numbers.begin() + index);
}
else
{
++index;
}
}
}
cout << "The primes are\n";
for (int primes: numbers)
{
cout << primes << '\n';
}
}
Here is my version:
#include "std_lib_facilities.h"
//helper function:check an int prime, x assumed positive.
bool check_prime(int x) {
bool check_result = true;
for (int i = 2; i < x; ++i){
if (x%i == 0){
check_result = false;
break;
}
}
return check_result;
}
//helper function: return the largest prime not exceeding n (n >= 2).
int near_prime(int n) {
for (int i = n; i > 0; --i) {
if (check_prime(i)) { return i; }
}
return 0; //only reached for n < 1; gives every path a return value
}
vector<int> sieve_primes(int max_limit) {
vector<int> num;
vector<int> primes;
int stop = near_prime(max_limit);
for (int i = 2; i < max_limit+1; ++i) { num.push_back(i); }
int step = 2;
primes.push_back(2);
//stop when finding the last prime
while (step!=stop){
for (int i = step; i < max_limit+1; i+=step) {num[i-2] = 0; }
//the multiples are set to 0; the first non-zero element is a prime and becomes the next step
for (int j = step; j < max_limit+1; ++j) {
if (num[j-2] != 0) { step = num[j-2]; break; }
}
primes.push_back(step);
}
return primes;
}
int main() {
int max_limit = 1000000;
vector<int> primes = sieve_primes(max_limit);
for (int i = 0; i < primes.size(); ++i) {
cout << primes[i] << ',';
}
}
Here is a classic method for doing this:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
int max = 500;
vector<int> array(max); // max entries, all 0; 0 means "not crossed out"
for (int i = 2; i < array.size(); ++i) // every candidate from 2 to max - 1
{
// cross out the multiples of i, starting at i * i
for (int j = i * i; j < array.size(); j += i)
array[j] = 1; // 1 marks j as composite
}
for (int i = 2; i < array.size(); ++i) // scan the same range again
if (array[i] == 0) // still 0, so i is prime
cout << i << '\n';
return 0;
}
I think I'm late to this party, but I'm reading the same book as you, and this is the solution I came up with. Feel free to make suggestions (you or anyone else). From what I'm seeing here, a couple of us extracted the check for whether one number is a multiple of another into a function.
#include "../../std_lib_facilities.h"
bool numIsMultipleOf(int n, int m) {
return n%m == 0;
}
int main() {
vector<int> rawCollection = {};
vector<int> numsToCheck = {2,3,5,7};
// Prepare raw collection
for (int i=2;i<=100;++i) {
rawCollection.push_back(i);
}
// Check multiples
for (int m: numsToCheck) {
vector<int> _temp = {};
for (int n: rawCollection) {
if (!numIsMultipleOf(n,m)||n==m) _temp.push_back(n);
}
rawCollection = _temp;
}
for (int p: rawCollection) {
cout<<"N("<<p<<")"<<" is prime.\n";
}
return 0;
}
Try this code from a Java question bank; it may be useful to you:
import java.io.*;
class Sieve
{
public static void main(String[] args) throws IOException
{
int n = 0, primeCounter = 0;
double sqrt = 0;
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter the n value : ");
n = Integer.parseInt(br.readLine());
sqrt = Math.sqrt(n);
boolean[] prime = new boolean[n];
System.out.println("\n\nThe primes up to " + n + " are : ");
for (int i = 2; i<n; i++)
{
prime[i] = true;
}
for (int i = 2; i <= sqrt; i++)
{
for (int j = i * 2; j<n; j += i)
{
prime[j] = false;
}
}
for (int i = 0; i<prime.length; i++)
{
if (prime[i])
{
primeCounter++;
System.out.print(i + " ");
}
}
prime = new boolean[0];
}
}