Sieve of Eratosthenes algorithm

Sieve of Eratosthenes algorithm - c++

I am currently reading "Programming: Principles and Practice Using C++". Chapter 4 has an exercise that asks for a program to calculate the prime numbers between 1 and 100 using the Sieve of Eratosthenes algorithm.
This is the program I came up with:
#include <vector>
#include <iostream>
using namespace std;
//finds prime numbers using Sieve of Eratosthenes algorithm
vector<int> calc_primes(const int max);
int main()
{
const int max = 100;
vector<int> primes = calc_primes(max);
for(int i = 0; i < primes.size(); i++)
{
if(primes[i] != 0)
cout<<primes[i]<<endl;
}
return 0;
}
vector<int> calc_primes(const int max)
{
vector<int> primes;
for(int i = 2; i < max; i++)
{
primes.push_back(i);
}
for(int i = 0; i < primes.size(); i++)
{
if(!(primes[i] % 2) && primes[i] != 2)
primes[i] = 0;
else if(!(primes[i] % 3) && primes[i] != 3)
primes[i]= 0;
else if(!(primes[i] % 5) && primes[i] != 5)
primes[i]= 0;
else if(!(primes[i] % 7) && primes[i] != 7)
primes[i]= 0;
}
return primes;
}
It is not the best or fastest, but I am still early in the book and don't know much about C++.
Now the problem: as long as max is no bigger than 500, all the values print on the console, but if max > 500, not everything gets printed.
Am I doing something wrong?
P.S.: Any constructive criticism would also be greatly appreciated.

I have no idea why you're not getting all the output, as it looks like you should get everything. What output are you missing?
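The sieve is implemented incorrectly, though: the code only tests divisibility by 2, 3, 5 and 7 instead of crossing out the multiples of each prime it finds. Something like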
vector<int> sieve;
vector<int> primes;
for (int i = 1; i < max + 1; ++i)
sieve.push_back(i); // you'll learn more efficient ways to handle this later
sieve[0]=0;
for (int i = 2; i < max + 1; ++i) { // there are lots of brace styles, this is mine
if (sieve[i-1] != 0) {
primes.push_back(sieve[i-1]);
for (int j = 2 * sieve[i-1]; j < max + 1; j += sieve[i-1]) {
sieve[j-1] = 0;
}
}
}
would implement the sieve. (Code above written off the top of my head; not guaranteed to work or even compile. I don't think it's got anything not covered by the end of chapter 4.)
Return primes as usual, and print out the entire contents.
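For reference, the fragment above could be wrapped into a complete function roughly as follows. This is only a sketch, not the answerer's tested code: the name sieve_primes and the use of std::vector<bool> as the crossing-out table (instead of a vector of ints zeroed in place) are assumptions made here for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): returns all primes <= max.
std::vector<int> sieve_primes(int max)
{
    std::vector<int> primes;
    if (max < 2)
        return primes;                          // there are no primes below 2

    const std::size_t n = static_cast<std::size_t>(max);
    // is_composite[k] becomes true once k is known to be a multiple of a smaller prime
    std::vector<bool> is_composite(n + 1, false);

    for (std::size_t i = 2; i <= n; ++i) {
        if (is_composite[i])
            continue;                           // already crossed out
        primes.push_back(static_cast<int>(i));  // never crossed out, so i is prime
        for (std::size_t j = 2 * i; j <= n; j += i)
            is_composite[j] = true;             // cross out every multiple of i
    }
    return primes;
}
```

With this shape, the caller can print every element of the returned vector directly, with no check for zero entries.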

Think of the sieve as a set.
Go through the set in order. For each value in the sieve, remove all numbers that are divisible by it.
#include <set>
#include <algorithm>
#include <iterator>
#include <iostream>
typedef std::set<int> Sieve;
int main()
{
static int const max = 100;
Sieve sieve;
for(int loop=2;loop < max;++loop)
{
sieve.insert(loop);
}
// A set is ordered.
// So going from beginning to end will give all the values in order.
for(Sieve::iterator loop = sieve.begin();loop != sieve.end();++loop)
{
// prime is the next item in the set
// It has not been deleted so it must be prime.
int prime = *loop;
// deleter will iterate over all the items from
// here to the end of the sieve and remove any
// that are divisible by this prime.
Sieve::iterator deleter = loop;
++deleter;
while(deleter != sieve.end())
{
if (((*deleter) % prime) == 0)
{
// If it is exactly divisible then it is not a prime
// So delete it from the sieve. Note the use of post
// increment here. This increments deleter but returns
// the old value to be used in the erase method.
sieve.erase(deleter++);
}
else
{
// Otherwise just increment the deleter.
++deleter;
}
}
}
// This copies all the values left in the sieve to the output.
// i.e. It prints all the primes.
std::copy(sieve.begin(),sieve.end(),std::ostream_iterator<int>(std::cout,"\n"));
}

From Algorithms and Data Structures:
void runEratosthenesSieve(int upperBound) {
int upperBoundSquareRoot = (int)sqrt((double)upperBound);
bool *isComposite = new bool[upperBound + 1];
memset(isComposite, 0, sizeof(bool) * (upperBound + 1));
for (int m = 2; m <= upperBoundSquareRoot; m++) {
if (!isComposite[m]) {
cout << m << " ";
for (int k = m * m; k <= upperBound; k += m)
isComposite[k] = true;
}
}
for (int m = upperBoundSquareRoot + 1; m <= upperBound; m++) // start past the root; the first loop already printed it if prime
if (!isComposite[m])
cout << m << " ";
delete [] isComposite;
}

Interestingly, nobody seems to have answered your question about the output problem. I don't see anything in the code that should affect the output depending on the value of max.
For what it's worth, on my Mac, I get all the output. It's wrong of course, since the algorithm isn't correct, but I do get all the output. You don't mention what platform you're running on, which might be useful if you continue to have output problems.
Here's a version of your code, minimally modified to follow the actual Sieve algorithm.
#include <vector>
#include <iostream>
using namespace std;
//finds prime numbers using Sieve of Eratosthenes algorithm
vector<int> calc_primes(const int max);
int main()
{
const int max = 100;
vector<int> primes = calc_primes(max);
for(int i = 0; i < primes.size(); i++)
{
if(primes[i] != 0)
cout<<primes[i]<<endl;
}
return 0;
}
vector<int> calc_primes(const int max)
{
vector<int> primes;
// fill vector with candidates
for(int i = 2; i < max; i++)
{
primes.push_back(i);
}
// for each value in the vector...
for(int i = 0; i < primes.size(); i++)
{
//get the value
int v = primes[i];
if (v!=0) {
//remove all multiples of the value
int x = i+v;
while(x < primes.size()) {
primes[x]=0;
x = x+v;
}
}
}
return primes;
}

In the code fragment below, the numbers are filtered before they are inserted into the vector. The divisors come from the vector.
I'm also passing the vector by reference, so a potentially huge vector doesn't have to be copied from the function back to the caller. (Large chunks of memory take a long time to copy.)
vector<unsigned int> primes;
void calc_primes(vector<unsigned int>& primes, const unsigned int MAX)
{
// If MAX is less than 2, return an empty vector
// because 2 is the first prime and can't be placed in the vector.
if (MAX < 2)
{
return;
}
// 2 is the initial and unusual prime, so enter it without calculations.
primes.push_back(2);
for (unsigned int number = 3; number < MAX; number += 2)
{
bool is_prime = true;
for (unsigned int index = 0; index < primes.size(); ++index)
{
if ((number % primes[index]) == 0)
{
is_prime = false;
break;
}
}
if (is_prime)
{
primes.push_back(number);
}
}
}
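This is not the most efficient algorithm. Strictly speaking, it is trial division by the primes found so far rather than a sieve that crosses out multiples, but it produces the same list of primes.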

Below is my version, which uses a std::vector<bool> as a bit vector. It goes through the odd numbers and uses repeated addition to find the multiples to set to false. At the end, a vector of the prime values is constructed and returned to the caller.
std::vector<int> getSieveOfEratosthenes ( int max )
{
std::vector<bool> primes(max, true);
int sz = primes.size();
for ( int i = 3; i < sz ; i+=2 )
if ( primes[i] )
for ( int j = i * i; j < sz; j+=i)
primes[j] = false;
std::vector<int> ret;
ret.reserve(primes.size());
ret.push_back(2);
for ( int i = 3; i < sz; i+=2 )
if ( primes[i] )
ret.push_back(i);
return ret;
}

Here is a concise, well explained implementation using bool type:
#include <iostream>
#include <cmath>
void find_primes(bool[], unsigned int);
void print_primes(bool [], unsigned int);
//=========================================================================
int main()
{
const unsigned int max = 100;
bool sieve[max + 1]; // the functions below index 0..max inclusive
find_primes(sieve, max);
print_primes(sieve, max);
}
//=========================================================================
/*
Function: find_primes()
Use: find_primes(bool_array, size_of_array);
It marks all the prime numbers till the
number: size_of_array, in the form of the
indexes of the array with value: true.
It implements the Sieve of Eratosthenes,
consisting of:
a loop through the first "sqrt(size_of_array)"
numbers starting from the first prime (2).
a loop through all the indexes < size_of_array,
marking the ones satisfying the relation i^2 + n * i
as false, i.e. composite numbers, where i - known prime
number starting from 2.
*/
void find_primes(bool sieve[], unsigned int size)
{
// by definition 0 and 1 are not prime numbers
sieve[0] = false;
sieve[1] = false;
// all numbers <= max are potential candidates for primes
for (unsigned int i = 2; i <= size; ++i)
{
sieve[i] = true;
}
// loop through the first prime numbers < sqrt(max) (suggested by the algorithm)
unsigned int first_prime = 2;
for (unsigned int i = first_prime; i <= std::sqrt(double(size)); ++i)
{
// find multiples of primes till < max
if (sieve[i] == true)
{
// mark as composite: i^2 + n * i
for (unsigned int j = i * i; j <= size; j += i)
{
sieve[j] = false;
}
}
}
}
/*
Function: print_primes()
Use: print_primes(bool_array, size_of_array);
It prints all the prime numbers,
i.e. the indexes with value: true.
*/
void print_primes(bool sieve[], unsigned int size)
{
// all the indexes of the array marked as true are primes
for (unsigned int i = 0; i <= size; ++i)
{
if (sieve[i] == true)
{
std::cout << i <<" ";
}
}
}
The code above covers the array case. A std::vector implementation would need only minor changes: each function takes a single parameter, the vector passed by reference, and the loops use the vector's size() member function instead of the separate size parameter.
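As a rough illustration of that std::vector variant, the marking function might look like the sketch below. It is an assumption about what the author had in mind rather than their code; the single-parameter signature and the convention that sieve[i] == true means "i is prime" are carried over from the array version.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the vector variant: the size travels with the vector,
// so one parameter is enough. sieve[i] == true means "i is prime".
void find_primes(std::vector<bool>& sieve)
{
    // every index starts as a candidate...
    sieve.assign(sieve.size(), true);
    // ...except 0 and 1, which are not prime by definition
    if (sieve.size() > 0) sieve[0] = false;
    if (sieve.size() > 1) sieve[1] = false;

    // only primes i with i * i inside the range have multiples left to mark
    for (std::size_t i = 2; i * i < sieve.size(); ++i) {
        if (sieve[i]) {
            for (std::size_t j = i * i; j < sieve.size(); j += i)
                sieve[j] = false;   // j is a multiple of i, hence composite
        }
    }
}
```

A caller that wants the primes up to and including 100 would construct the vector with 101 elements, so that index 100 exists, and then pass it to find_primes.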

Here is a more efficient version of the Sieve of Eratosthenes algorithm that I implemented.
#include <iostream>
#include <cmath>
#include <set>
using namespace std;
void sieve(int n){
set<int> primes;
primes.insert(2);
for(int i=3; i<=n ; i+=2){
primes.insert(i);
}
int p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
int maxRoot = sqrt(*(primes.rbegin()));
while(primes.size()>0){
if(p>maxRoot){
while(primes.size()>0){
p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
}
break;
}
int i=p*p;
int temp = (*(primes.rbegin()));
while(i<=temp){
primes.erase(i);
i+=p;
i+=p;
}
p=*primes.begin();
cout<<p<<"\n";
primes.erase(p);
}
}
int main(){
int n;
n = 1000000;
sieve(n);
return 0;
}

Here's my implementation; I'm not sure if it is 100% correct, though:
http://pastebin.com/M2R2J72d
#include<iostream>
#include <stdlib.h>
using namespace std;
void listPrimes(int x);
int main() {
listPrimes(5000);
}
void listPrimes(int x) {
bool *not_prime = new bool[x + 1](); // x + 1 zero-initialized flags, so index x is valid
unsigned j = 0, i = 0;
for (i = 0; i <= x; i++) {
if (i < 2) {
not_prime[i] = true;
} else if (i % 2 == 0 && i != 2) {
not_prime[i] = true;
}
}
while (j <= x) {
for (i = j; i <= x; i++) {
if (!not_prime[i]) {
j = i;
break;
}
}
for (i = (j * 2); i <= x; i += j) {
not_prime[i] = true;
}
j++;
}
for ( i = 0; i <= x; i++) {
if (!not_prime[i])
cout << i << ' ';
}
delete [] not_prime; // release the array allocated at the top of the function
return;
}

I am following the same book now. I have come up with the following implementation of the algorithm.
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<cmath>
#include<numeric> // for iota
using namespace std;
inline void keep_window_open() { char ch; cin>>ch; }
int main ()
{
int max_no = 100;
vector <int> numbers (max_no - 1);
iota(numbers.begin(), numbers.end(), 2);
for (unsigned int ind = 0; ind < numbers.size(); ++ind)
{
for (unsigned int index = ind+1; index < numbers.size(); )
{
if (numbers[index] % numbers[ind] == 0)
{
// erase shifts the next element into this slot, so do not advance
numbers.erase(numbers.begin() + index);
}
else
{
++index;
}
}
}
cout << "The primes are\n";
for (int primes: numbers)
{
cout << primes << '\n';
}
}

Here is my version:
#include "std_lib_facilities.h"
//helper function:check an int prime, x assumed positive.
bool check_prime(int x) {
bool check_result = true;
for (int i = 2; i < x; ++i){
if (x%i == 0){
check_result = false;
break;
}
}
return check_result;
}
//helper function:return the largest prime not exceeding n (n >= 2).
int near_prime(int n) {
for (int i = n; i > 0; --i) {
if (check_prime(i)) { return i; }
}
return 0; // not reached for n >= 1; keeps every path returning a value
}
vector<int> sieve_primes(int max_limit) {
vector<int> num;
vector<int> primes;
int stop = near_prime(max_limit);
for (int i = 2; i < max_limit+1; ++i) { num.push_back(i); }
int step = 2;
primes.push_back(2);
//stop when finding the last prime
while (step!=stop){
for (int i = step; i < max_limit+1; i+=step) {num[i-2] = 0; }
//the multiples set to 0, the first none zero element is a prime also step
for (int j = step; j < max_limit+1; ++j) {
if (num[j-2] != 0) { step = num[j-2]; break; }
}
primes.push_back(step);
}
return primes;
}
int main() {
int max_limit = 1000000;
vector<int> primes = sieve_primes(max_limit);
for (int i = 0; i < primes.size(); ++i) {
cout << primes[i] << ',';
}
}

Here is a classic method for doing this:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
int max = 500;
vector<int> array(max); // one flag per number below max, all initialized to 0 (not crossed out)
for (int i = 2; i < max; ++i) // for each number from 2 up to max - 1
{
// cross out the multiples of i, starting at i * i; smaller multiples were handled by smaller i
for (int j = i * i; j < max; j += i)
array[j] = 1; // 1 marks j as composite
}
for (int i = 2; i < max; ++i) // scan the same range again
if (array[i] == 0) // a flag still at 0 means i is prime
cout << i << '\n';
return 0;
}

I think I'm late to this party, but I'm reading the same book as you, and this is the solution I came up with. Feel free to make suggestions (you or anyone else). From what I'm seeing here, a couple of us extracted the check for whether one number is a multiple of another into a function.
#include "../../std_lib_facilities.h"
bool numIsMultipleOf(int n, int m) {
return n%m == 0;
}
int main() {
vector<int> rawCollection = {};
vector<int> numsToCheck = {2,3,5,7};
// Prepare raw collection
for (int i=2;i<=100;++i) {
rawCollection.push_back(i);
}
// Check multiples
for (int m: numsToCheck) {
vector<int> _temp = {};
for (int n: rawCollection) {
if (!numIsMultipleOf(n,m)||n==m) _temp.push_back(n);
}
rawCollection = _temp;
}
for (int p: rawCollection) {
cout<<"N("<<p<<")"<<" is prime.\n";
}
return 0;
}

Try this code from a Java question bank; it may be useful to you:
import java.io.*;
class Sieve
{
public static void main(String[] args) throws IOException
{
int n = 0, primeCounter = 0;
double sqrt = 0;
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter the n value : ");
n = Integer.parseInt(br.readLine());
sqrt = Math.sqrt(n);
boolean[] prime = new boolean[n];
System.out.println("\n\nThe primes up to " + n + " are : ");
for (int i = 2; i<n; i++)
{
prime[i] = true;
}
for (int i = 2; i <= sqrt; i++)
{
for (int j = i * 2; j<n; j += i)
{
prime[j] = false;
}
}
for (int i = 0; i<prime.length; i++)
{
if (prime[i])
{
primeCounter++;
System.out.print(i + " ");
}
}
prime = new boolean[0];
}
}

Related

Selection Sort doesn't catch duplicate numbers

What I'm trying to do is implement a simple selection sort algorithm that uses the function minButGreaterThan to find the next smallest number in the array. My problem is that if the array has a duplicate number, it gets passed over and left at the end. I've tried changing the controlling if statements to account for this, but nothing seems to work. Any advice?
double GradeBook::minButGreaterThan(double x) // - NEEDS TESTING
{
double minButGreaterThan = -1;
for (int i = 0; i < classSize; i++)
{
if (grades[i] > x)
{
minButGreaterThan = grades[i];
break;
}
}
for (int i = 0; i < classSize; i++)
{
if (grades[i] > x && grades[i] <= minButGreaterThan)
minButGreaterThan = grades[i];
}
return minButGreaterThan;
}
void GradeBook::selectionSort() //ascending order -- *DOES NOT WORK WITH DUPLICATE SCORES* - RETEST
{
double min = absoluteMin();
for (int i = 0; i < classSize; i++)
{
if (grades[i] == min)
{
double temp = grades[0];
grades[0] = grades[i];
grades[i] = temp;
break;
}
}
for (int i = 0; i < classSize-1; i++)
{
double next = minButGreaterThan(grades[i]);
for (int n = 1; n <= classSize; n++)
if (grades[n] == next)
{
double temp = grades[n];
grades[n] = grades[i+1];
grades[i+1] = temp;
}
}
}

This should work with duplicates: a selection sort just takes the minimum of the unsorted portion and moves it to the left, into the "sorted" portion of the array.
This is my implementation:
#include <algorithm>
#include <vector>
using std::swap;
using std::vector;
using std::min_element;
void selectionSort(vector<int> &v) {
for (unsigned int i = 0; i + 1 < v.size(); i++) { // i + 1 avoids unsigned wrap-around when v is empty
auto minElement = min_element(v.begin() + i, v.end());
auto minIndex = minElement - v.begin();
swap(v[i], v[minIndex]);
}
}
You might need to modify it to work with floating-point values. Also, double precision (double) seems like more than a grade needs; I think a regular float is OK.
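One way to make that modification is to template the function on the element type, as in the sketch below. The name selection_sort and the use of std::iter_swap are choices made here for illustration; the approach is otherwise the same as in the function above, and duplicates need no special handling because each pass simply picks the smallest remaining element.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Generic variant of the function above; works for double, float, int, ...
template <typename T>
void selection_sort(std::vector<T>& v)
{
    // i + 1 < v.size() avoids unsigned wrap-around when v is empty
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        // smallest element of the unsorted tail [i, end)
        auto min_it = std::min_element(v.begin() + i, v.end());
        // move it to the front of the unsorted tail
        std::iter_swap(v.begin() + i, min_it);
    }
}
```

Called on a std::vector<double> of grades that contains repeated scores, this leaves the repeated values next to each other in ascending order.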

How can I sort array elements by number of divisors?

I hit an obstacle while solving some exercises.
I have to write a program that sorts an array in descending order by the number of divisors of each element; when two elements have the same number of divisors, those values should be sorted in ascending order.
My code so far:
#include <iostream>
#include <fstream>
using namespace std;
int cntDiv(int n) //get number of divisors
{
int lim = n;
int c = 0;
if(n == 1)
return 1;
for(int i = 1; i < lim; i++)
{
if(n % i == 0)
{
lim = n / i;
if(lim != i)
c++;
c++;
}
}
return c;
}
int main()
{
ifstream fin("in.txt");
int n, i, j;
fin >> n;
int v[n];
for(i = 0; i < n; i++)
fin >> v[i];
int div[n];
for(i = 0; i < n; i++)
div[i] = cntDiv(v[i]);
for(i = 0; i < n - 1; i++)
{
for(j = i + 1; j < n; j++)
{
if(div[i] < div[j] && div[i] != div[j]) //if the number of divisors are different
{
int t = v[i];
v[i] = v[j];
v[j] = t;
t = div[i];
div[i] = div[j];
div[j] = t;
}
if(div[i] == div[j] && v[i] > v[j]) //if the number of divisors are the same
{
int t = v[i];
v[i] = v[j];
v[j] = t;
}
}
}
for(i = 0; i < n; i++)
{
cout << v[i] << " ";
}
return 0;
}
In.txt:
5
12 20 4 100 13
Output:
100 12 20 4 13
It works fine with this input and many others, but for bigger inputs it exceeds the time limit, which is 0.1 s. Any advice on how I should rewrite the sorting? (I wrote a bubble sort because I could not work out how to sort an array by a property with quicksort.)

Use an array of structures. The structure would contain the original value and a container of divisors:
struct Number_Attributes
{
int number;
std::list<int> divisors;
};
You can then write a custom comparator function and pass to std::sort:
bool Order_By_Divisors(const Number_Attributes& a,
const Number_Attributes& b)
{
return a.divisors.size() < b.divisors.size();
}
The sorting then becomes:
#define ARRAY_CAPACITY (20U)
Number_Attributes the_array[ARRAY_CAPACITY];
//...
std::sort(the_array, the_array + ARRAY_CAPACITY, Order_By_Divisors);
The generation of divisors is left as an exercise for the OP.
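For readers who want to see the pieces together, here is one possible sketch. It departs from the structure above in two labeled ways: it stores only the divisor count rather than a list of divisors, and its comparator orders by descending count with ascending value as the tie-breaker, which is what the question asks for. The names NumberInfo, count_divisors, more_divisors_first and sort_by_divisors are all hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record for this sketch: a value and how many divisors it has.
struct NumberInfo {
    int number;
    int divisor_count;
};

// Counts the divisors of n >= 1 by pairing each divisor d <= sqrt(n) with n / d.
int count_divisors(int n)
{
    int count = 0;
    for (long long d = 1; d * d <= n; ++d) {
        if (n % d == 0)
            count += (d * d == n) ? 1 : 2;   // a square root pairs with itself
    }
    return count;
}

// More divisors first; equal counts fall back to ascending value.
bool more_divisors_first(const NumberInfo& a, const NumberInfo& b)
{
    if (a.divisor_count != b.divisor_count)
        return a.divisor_count > b.divisor_count;
    return a.number < b.number;
}

std::vector<int> sort_by_divisors(const std::vector<int>& values)
{
    // compute each divisor count once, before sorting
    std::vector<NumberInfo> infos;
    infos.reserve(values.size());
    for (int v : values)
        infos.push_back({v, count_divisors(v)});

    std::sort(infos.begin(), infos.end(), more_divisors_first);

    std::vector<int> result;
    result.reserve(infos.size());
    for (const NumberInfo& info : infos)
        result.push_back(info.number);
    return result;
}
```

For the sample input 12 20 4 100 13 this gives 100 12 20 4 13, matching the expected output in the question. Counting divisors only up to the square root and sorting with std::sort both reduce the work compared with the original loops, although whether that is enough for the 0.1 s limit depends on the input sizes.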

Reworking your code with std::sort:
std::vector<std::pair<int, int>> customSort(const std::vector<int>& v)
{
std::vector<std::pair<int, int>> ps;
ps.reserve(v.size());
// We don't have zip sort :/
// So building the pair
for (auto e : v)
{
ps.emplace_back(e, cntDiv(e));
}
std::sort(ps.begin(), ps.end(), [](const auto&lhs, const auto& rhs) {
// descending number of divisors, increasing value
return std::make_tuple(-lhs.second, lhs.first)
< std::make_tuple(-rhs.second, rhs.first);
});
return ps;
}
int main()
{
const std::vector<int> v = {12, 20, 4, 100, 13};
const auto res = customSort(v);
for(const auto& p : res)
{
std::cout << p.first << " ";
}
}

Finding number of prime numbers in an array

I'm trying to write a function that finds the number of prime numbers in an array.
int countPrimes(int a[], int size)
{
int numberPrime = 0;
int i = 0;
for (int j = 2; j < a[i]; j++)
{
if(a[i] % j == 0)
numbPrime++;
}
return numPrime;
}
I think what I'm missing is I have to redefine i after every iteration, but I'm not sure how.

You need 2 loops: 1 over the array, 1 checking all possible divisors. I'd suggest separating out the prime check into a function. Code:
bool primeCheck(int p) {
if (p<2) return false;
// Really slow way to check, but works
for(int d = 2; d<p; ++d) {
if (0==p%d) return false; // found a divisor
}
return true; // no divisors found
}
int countPrimes(const int *a, int size) {
int numberPrime = 0;
for (int i = 0; i < size; ++i) {
// For each element in the input array, check it,
// and increment the count if it is prime.
if(primeCheck(a[i]))
++numberPrime;
}
return numberPrime;
}
You can also use std::count_if like this:
std::count_if(std::begin(input), std::end(input), primeCheck)
See it live here.
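A fuller version of the std::count_if call might look like the sketch below. It assumes the same pointer-plus-size interface as countPrimes above; the names is_prime and count_primes are chosen here, and the divisor loop stops at the square root, which is a small speed-up over the deliberately slow check shown earlier.

```cpp
#include <algorithm>

// Trial division up to sqrt(p); a somewhat faster stand-in for the check above.
bool is_prime(int p)
{
    if (p < 2) return false;
    for (long long d = 2; d * d <= p; ++d) {
        if (p % d == 0) return false;   // found a divisor
    }
    return true;                        // no divisor up to sqrt(p)
}

// Counts the primes among the first `size` elements of `a` using std::count_if.
int count_primes(const int* a, int size)
{
    return static_cast<int>(std::count_if(a, a + size, is_prime));
}
```

The pointer pair a and a + size plays the role of std::begin(input) and std::end(input) when only a pointer and a length are available.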

Is there any input for which selection sort outperforms bubble sort?

I mean inputs such as partially sorted, fully sorted, or reverse-sorted arrays.
I have already tried the following: random, fully sorted, almost sorted, partially sorted, and reverse sorted. The comparison count for bubble sort is lower when the array is fully sorted; in all other cases, the counts are the same.
int selectionSort(int a[], int l, int r) {
int count = 0;
for (int i = l; i < r; i++) {
int min = i;
for (int j = i + 1; j <= r; j++) {
if (a[j] < a[min]) min = j;
count++;
}
if (i != min) swap(a[i], a[min]);
}
return count;
}
int bubbleSort(int a[], int l, int r) {
int count = 0;
bool flag = false;
for (int i = l; i < r; i++) {
for (int j = r; j > i; j--) {
if (a[j-1] > a[j]) {
if (flag == false) flag = true;
swap(a[j - 1], a[j]);
}
count++;
}
if (flag == false) break;
}
return count;
}
The count returns the number of comparisons BTW.

Among simple average-case Θ(n²) algorithms, selection sort almost always outperforms bubble sort.
Source: Wikipedia

I hinted at this already in comments, but here's some updated code for you that counts both comparisons and exchanges/swaps, and illustrates that for some random input the number of exchanges/swaps is where selection sort outperforms bubble sort.
#include <iostream>
#include <vector>
#include <utility>
#include <cassert>
using namespace std;
struct Stats { int swaps_ = 0, compares_ = 0; };
std::ostream& operator<<(std::ostream& os, const Stats& s)
{
return os << "{ swaps " << s.swaps_
<< ", compares " << s.compares_ << " }";
}
Stats selectionSort(std::vector<int>& a, int l, int r) {
Stats stats;
for (int i = l; i < r; i++) {
int min = i;
for (int j = i + 1; j <= r; j++) {
if (a.at(j) < a.at(min)) min = j;
++stats.compares_;
}
if (i != min) {
swap(a.at(i), a.at(min));
++stats.swaps_;
}
}
return stats;
}
Stats bubbleSort(std::vector<int>& a, int l, int r) {
Stats stats;
bool flag = false;
for (int i = l; i < r; i++) {
for (int j = r; j > i; j--) {
if (a.at(j-1) > a.at(j)) {
if (flag == false) flag = true;
swap(a.at(j - 1), a.at(j));
++stats.swaps_;
}
++stats.compares_;
}
if (flag == false) break;
}
return stats;
}
int main()
{
std::vector<int> v1{ 4, 8, 3, 8, 10, -1, 3, 20, 5 };
std::vector<int> v1s = v1;
std::cout << "sel " << selectionSort(v1s, 0, v1s.size() - 1);
std::vector<int> v1b = v1;
std::cout << ", bub " << bubbleSort(v1b, 0, v1b.size() - 1) << '\n';
assert(v1s == v1b);
// always a good idea to check the code's doing what you expect...
for (int i : v1s) std::cout << i << ' ';
std::cout << '\n';
}
Output:
sel { swaps 6, compares 36 }, bub { swaps 15, compares 36 }
-1 3 3 4 5 8 8 10 20
You can observe / copy / fork-and-edit / run the code online here.

lexicographically smallest string after rotation

I am trying to solve this problem on SPOJ.
I need to find the number of rotations of a given string that will make it lexicographically smallest among all the rotations.
For example:
Original: ama
First rotation: maa
Second rotation: aam This is the lexicographically smallest rotation so the answer is 2.
Here's my code:
string s,tmp;
char ss[100002];
scanf("%s",ss);
s=ss;
tmp=s;
int i,len=s.size(),ans=0,t=0;
for(i=0;i<len;i++)
{
string x=s.substr(i,len-i)+s.substr(0,i);
if(x<tmp)
{
tmp=x;
t=ans;
}
ans++;
}
cout<<t<<endl;
I am getting "Time Limit Exceeded" for this solution. I don't understand what optimizations can be made. How can I increase the speed of my solution?

You can use a modified suffix array. I say modified because you must not stop at the end of the word; the comparison has to wrap around.
Here is the code for a similar problem I solved (SA is the suffix array):
//719
//Glass Beads
//Misc;String Matching;Suffix Array;Circular
#include <iostream>
#include <iomanip>
#include <cstring>
#include <string>
#include <cmath>
#define MAX 10050
using namespace std;
int RA[MAX], tempRA[MAX];
int SA[MAX], tempSA[MAX];
int C[MAX];
void suffix_sort(int n, int k) {
memset(C, 0, sizeof C);
for (int i = 0; i < n; i++)
C[RA[(i + k)%n]]++;
int sum = 0;
for (int i = 0; i < max(256, n); i++) {
int t = C[i];
C[i] = sum;
sum += t;
}
for (int i = 0; i < n; i++)
tempSA[C[RA[(SA[i] + k)%n]]++] = SA[i];
memcpy(SA, tempSA, n*sizeof(int));
}
void suffix_array(string &s) {
int n = s.size();
for (int i = 0; i < n; i++)
RA[i] = s[i];
for (int i = 0; i < n; i++)
SA[i] = i;
for (int k = 1; k < n; k *= 2) {
suffix_sort(n, k);
suffix_sort(n, 0);
int r = tempRA[SA[0]] = 0;
for (int i = 1; i < n; i++) {
int s1 = SA[i], s2 = SA[i-1];
bool equal = true;
equal &= RA[s1] == RA[s2];
equal &= RA[(s1+k)%n] == RA[(s2+k)%n];
tempRA[SA[i]] = equal ? r : ++r;
}
memcpy(RA, tempRA, n*sizeof(int));
}
}
int main() {
int tt; cin >> tt;
while(tt--) {
string s; cin >> s;
suffix_array(s);
cout << SA[0]+1 << endl;
}
}
I took this implementation mostly from this book. There is an easier-to-write O(n log² n) version, but it may not be efficient enough for your case (n = 10^5). This version is O(n log n), though it is not the most efficient algorithm. The Wikipedia article lists some O(n) algorithms, but I find most of them too complex to write during a programming contest. This O(n log n) version is usually enough for most problems.
You can find some slides explaining suffix array concept (from the author of the book I mentioned) here.

I know this comes very late, but I stumbled across this question from Google while searching for an even faster variant of this algorithm. It turns out a good implementation can be found on GitHub: https://gist.github.com/MaskRay/8803371
It uses the Lyndon factorization. That means it repeatedly splits the string into lexicographically non-increasing Lyndon words. Lyndon words are strings that are (one of) the minimal rotations of themselves. Doing this in a circular way yields the lms of the string as the last Lyndon word found.
int lyndon_word(const char *a, int n)
{
int i = 0, j = 1, k;
while (j < n) {
// Invariant: i < j and indices in [0,j) \ i cannot be the first optimum
for (k = 0; k < n && a[(i+k)%n] == a[(j+k)%n]; k++);
if (a[(i+k)%n] <= a[(j+k)%n]) {
// if k < n
// foreach p in [j,j+k], s_p > s_{p-(j-i)}
// => [j,j+k] are all suboptimal
// => indices in [0,j+k+1) \ i are suboptimal
// else
// None of [j,j+k] is the first optimum
j += k+1;
} else {
// foreach p in [i,i+k], s_p > s_{p+(j-i)}
// => [i,i+k] are all suboptimal
// => [0,j) and [0,i+k+1) are suboptimal
// if i+k+1 < j
// j < j+1 and indices in [0,j+1) \ j are suboptimal
// else
// i+k+1 < i+k+2 and indices in [0,i+k+2) \ (i+k+1) are suboptimal
i += k+1;
if (i < j)
i = j++;
else
j = i+1;
}
}
// j >= n => [0,n) \ i cannot be the first optimum
return i;
}
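For comparison, the same problem is often solved with a simpler two-pointer scan that also runs in linear time. The sketch below is not the code from the gist; it is a commonly taught formulation, and the name min_rotation is chosen here for illustration.

```cpp
#include <algorithm>
#include <string>

// Returns a start index whose rotation of s is lexicographically minimal.
int min_rotation(const std::string& s)
{
    const int n = static_cast<int>(s.size());
    int i = 0, j = 1, k = 0;   // i, j: candidate starts; k: length matched so far
    while (i < n && j < n && k < n) {
        // compare as unsigned char, the same way std::string comparison does
        const unsigned char a = static_cast<unsigned char>(s[(i + k) % n]);
        const unsigned char b = static_cast<unsigned char>(s[(j + k) % n]);
        if (a == b) { ++k; continue; }
        // the candidate with the larger character loses, along with
        // the k starts that follow it
        if (a > b) i += k + 1; else j += k + 1;
        if (i == j) ++j;       // keep the two candidates distinct
        k = 0;
    }
    return std::min(i, j);
}
```

For the example in the question, min_rotation("ama") returns 2, which is the number of rotations needed to reach "aam".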
