This is a C++ program that finds primes using the sieve of Eratosthenes. It is then supposed to store the time the calculation takes, repeat the calculation 100 times, and store the time of each run. There are two things I need help with in this program:
Firstly, I can only test numbers up to about 480 million; I would like to get higher than that.
Secondly, when I time the program it only gets the first timing and then prints zeros as the time. This is not correct, and I don't know what the problem with the clock is. Thanks for the help.
Here is my code.
#include <iostream>
#include <ctime>
#include <algorithm> // for std::fill_n

using namespace std;

int main ()
{
    int long MAX_NUM = 1000000;
    int long MAX_NUM_ARRAY = MAX_NUM+1;
    int long sieve_prime = 2;
    int time_store = 0;

    while (time_store <= 100)
    {
        int long sieve_prime_constant = 0;
        int *Num_Array = new int[MAX_NUM_ARRAY];
        std::fill_n(Num_Array, MAX_NUM_ARRAY, 3);
        Num_Array[0] = 1;
        Num_Array[1] = 1;

        clock_t time1, time2;
        time1 = clock();

        while (sieve_prime_constant <= MAX_NUM_ARRAY)
        {
            if (Num_Array[sieve_prime_constant] == 1)
            {
                sieve_prime_constant++;
            }
            else
            {
                Num_Array[sieve_prime_constant] = 0;
                sieve_prime = sieve_prime_constant;
                while (sieve_prime <= MAX_NUM_ARRAY - sieve_prime_constant)
                {
                    sieve_prime = sieve_prime + sieve_prime_constant;
                    Num_Array[sieve_prime] = 1;
                }
                if (sieve_prime_constant <= MAX_NUM_ARRAY)
                {
                    sieve_prime_constant++;
                    sieve_prime = sieve_prime_constant;
                }
            }
        }

        time2 = clock();
        delete[] Num_Array;

        cout << "It took " << (float(time2 - time1)/(CLOCKS_PER_SEC)) << " seconds to execute this loop." << endl;
        cout << "This loop has already been executed " << time_store << " times." << endl;

        float Time_Array[100];
        Time_Array[time_store] = (float(time2 - time1)/(CLOCKS_PER_SEC));
        time_store++;
    }
    return 0;
}
I think the problem is that you don't reset the starting prime:
int long sieve_prime = 2;
Currently that is outside your loop. On second thoughts... That's not the problem. Has this code been edited to incorporate the suggestions in Mats Petersson's answer? I just corrected the bad indentation.
Anyway, for the other part of your question, I suggest you use char instead of int for Num_Array. There is no use using int to store a boolean. By using char you should be able to store about 4 times as many values in the same amount of memory (assuming your int is 32-bit, which it probably is).
That means you could handle numbers up to almost 2 billion. Since you are using signed long as your type instead of unsigned long, that is approaching the numeric limits for your calculation anyway.
If you want to use even less memory, you could use std::bitset, but be aware that performance could be significantly impaired.
By the way, you should declare your timing array at the top of main:
float Time_Array[100];
Putting it inside the loop just before it is used is a bit whack.
Oh, and just in case you're interested, here is my own implementation of the sieve which, personally, I find easier to read than yours....
std::vector<char> isPrime( N, 1 );

for( int i = 2; i < N; i++ )
{
    if( !isPrime[i] ) continue;
    for( int x = i*2; x < N; x+=i ) isPrime[x] = 0;
}
This section of code is supposed to go inside your loop:
int *Num_Array = new int[MAX_NUM_ARRAY];
std::fill_n(Num_Array, MAX_NUM_ARRAY, 3);
Num_Array [0] = 1;
Num_Array [1] = 1;
Edit: and this one needs to be in the loop too:
int long sieve_prime_constant = 0;
When I run this on my machine, it takes 0.2 s per loop. If I add two zeros to MAX_NUM_ARRAY, it takes 4.6 seconds per iteration (I stopped after the 20th loop; I got bored of waiting longer than 1.5 minutes).
Agree with the earlier comments. If you really want to juice things up you don't store an array of all possible values (as int, or char), but only keep the primes. Then you test each subsequent number for divisibility through all primes found so far. Now you are only limited by the number of primes you can store. Of course, that's not really the algorithm you wanted to implement any more... but since it would be using integer division, it's quite fast. Something like this:
int myPrimes[MAX_PRIME];
int pCount, ii, jj;

myPrimes[0] = 2;
ii = 3;
for (pCount = 1; pCount < MAX_PRIME; pCount++) {
    for (jj = 0; jj < pCount; jj++) {
        if (ii % myPrimes[jj] == 0) {
            // not a prime: advance and restart the divisibility tests
            ii += 2;  // never test even numbers...
            jj = -1;  // the loop increment brings this back to 0
        }
    }
    myPrimes[pCount] = ii; // ii survived every test: store it...
    ii += 2;               // ...and move on to the next odd candidate
}
Not really what you were asking for, but maybe it is useful.
Related
My first attempt at creating a header file. The exercise is nonsense and nothing more than practice. It receives two numbers from the main file and is supposed to return a random entry from the vector. When I call it from a loop in the main file, it increments by 3 instead of returning random values (diagnosed by returning the value of getEntry). The Randomizer code works correctly if I pull it out of the header file and run it directly as a program.
int RandomNumber::Randomizer(int a, int b){
    std::vector<int> vecArray{};
    int range = (b - a) + 1;

    time_t nTime;
    srand((unsigned)time(&nTime));

    for (int i = a - 1; i < b + 1; i++) {
        vecArray.push_back(i);
    }

    int getEntry = rand() % range + 1;
    int returnValue = vecArray[getEntry];
    vecArray.clear();

    return returnValue;
}
From what I read, header files should generally not contain function and variable definitions. I suspect rand(), being a function, is the source of the problem.
How, if possible, can I get my header file to create random numbers?
#include <iostream>
#include <random>

void random(){
    double rangeMin = 1;
    double rangeMax = 10;
    size_t numSamples = 10;

    thread_local std::mt19937 mt(std::random_device{}());
    std::uniform_real_distribution<double> dist(rangeMin, rangeMax);

    for (size_t i = 1; i <= numSamples; ++i) {
        std::cout << dist(mt) << std::endl;
    }
}
This method gives you the opportunity to generate random numbers between two bounds; to use it you have to include the <random> header.
There are many cases where you will want to generate a random number. There are really only two functions you need to know about for random number generation. The first is rand(); this function only returns a pseudo-random number. The way to fix this is to first call the srand() function.
Here is an example:
#include <iostream>
#include <ctime>
#include <cstdlib>

using namespace std;

int main () {
    int i, j;

    srand( (unsigned)time( NULL ) );
    for( i = 0; i < 10; i++ ) {
        j = rand();
        cout << " Random Number : " << j << endl;
    }
    return 0;
}
In srand( (unsigned)time( NULL ) ), pass NULL to time() to use the current time as the seed instead of supplying your own value.
You can also go here for more info.
I hope I answered your question! Have a nice day!
Ted Lyngmo gave me the idea that fixed the problem. Using random appears to work correctly in a header file.
I removed/changed the following:
time_t nTime;
srand((unsigned)time(&nTime));
int getEntry = rand() % range + 1;
and replaced them with:
std::random_device rd;
std::mt19937 gen(rd());
int getEntry = gen() % range + 1;
Issue resolved. Thank you everybody for your suggestions and comments!
As an experiment, I removed the vector and focused on the randomizer `srand(T)`, where `T` is the system time (`volatile time_t T = time(NULL)`). It turns out the system time does NOT change during the program's run (execution is simply too fast).
The function `rand()` generates a pseudo-random integer using a congruential generator: essentially, it multiplies the seed by a larger unsigned integer and truncates the result to the finite bits of the seed. The call `srand(T)` initializes the seed from the system time, or from any fixed number, e.g. `srand(12345);`. A given seed produces a fixed sequence of random numbers. Without calling `srand(T)`, the seed is determined by whatever garbage is initially in memory. The seed is then advanced by every call to `rand()`.
In your code, you call `srand(T)` and reset the seed to the system time on every run. But the system time hasn't changed, so you are resetting the `seed` to the same number each time.
Run this test.
#include <cstdlib>
#include <iostream>
#include <ctime>

int Randomizer(int a, int b){
    volatile time_t T = time(NULL);
    std::cout << "time = " << T << std::endl;
    srand(T);
    std::cout << "rand() = " << rand() << std::endl;
    return rand();
}

int main()
{
    int n1 = 1, n2 = 8;
    for(int i=0; i<5; ++i)
    {
        std::cout << Randomizer(n1, n2) << std::endl;
    }
}
The seed is reset to the system time, which does not change during execution, so the program produces the same random numbers every time:
$ ./a.exe
time = 1608049336
rand() = 9468
15874
time = 1608049336
rand() = 9468
15874
time = 1608049336
rand() = 9468
15874
time = 1608049336
rand() = 9468
15874
time = 1608049336
rand() = 9468
15874
In order to see the change of system time, we add a pause in the main():
int main()
{
    int n1 = 1, n2 = 8;
    for(int i=0; i<5; ++i)
    {
        std::cout << Randomizer(n1, n2) << std::endl;
        system("pause");
    }
}
We can observe the system time moving on...
$ ./a.exe
time = 1608050805
rand() = 14265
11107
Press any key to continue . . .
time = 1608050809
rand() = 14279
21332
Press any key to continue . . .
time = 1608050815
rand() = 14298
20287
Press any key to continue . . .
Because the system times are only a few seconds apart, the first rand() value of each sequence is also close to the previous one, but the numbers that follow will look random. The principle for a congruential generator is: once you set the seed, don't change it until you need a different series of random numbers. Therefore, call srand(T) just once, in main() or somewhere else that executes only once.
int main()
{
    srand(time(NULL)); // >>>> just this once <<<<

    int n1 = 1, n2 = 8;
    for(int i=0; i<5; ++i)
    {
        std::cout << Randomizer(n1, n2) << std::endl;
    }
}
I am trying to benchmark linear and binary search as a part of an assignment. I have written the necessary search and randomizer functions. But when I try to benchmark them I get 0 delay even for higher array sizes.
The code:
#include <iostream>
#include <time.h>
#include <windows.h>

using namespace std;

double getTime()
{
    LARGE_INTEGER t, f;
    QueryPerformanceCounter(&t);
    QueryPerformanceFrequency(&f);
    return (double)t.QuadPart/(double)f.QuadPart;
}

int linearSearch(int arr[], int len, int target){
    int resultIndex = -1;
    for(int i = 0; i < len; i++){
        if(arr[i] == target){
            resultIndex = i;
            break;
        }
    }
    return resultIndex;
}

void badSort(int arr[], int len){
    for(int i = 0; i < len; i++){
        int indexToSwapWith = i;
        for(int j = i+1; j < len; j++){
            if(arr[j] < arr[indexToSwapWith])
                indexToSwapWith = j;
        }
        if(indexToSwapWith != i){
            int t = arr[i];
            arr[i] = arr[indexToSwapWith];
            arr[indexToSwapWith] = t;
        }
    }
}

int binSearch(int arr[], int len, int target){
    int resultIndex = -1;
    int first = 0;
    int last = len;
    int mid = first;
    while(first <= last){
        mid = (first + last)/2;
        if(target < arr[mid])
            last = mid-1;
        else if(target > arr[mid])
            first = mid+1;
        else
            break;
    }
    if(arr[mid] == target)
        resultIndex = mid;
    return resultIndex;
}

void fillArrRandomly(int arr[], int len){
    srand(time(NULL));
    for(int i = 0; i < len; i++){
        arr[i] = rand();
    }
}

void benchmarkRandomly(int len){
    float startTime = getTime();
    int arr[len];
    fillArrRandomly(arr, len);
    badSort(arr, len);
    /*
    for(auto i : arr)
        cout << i << "\n";
    */
    float endTime = getTime();
    float timeElapsed = endTime - startTime;
    cout << "prep took " << timeElapsed << endl;

    int target = rand();
    startTime = getTime();
    int result = linearSearch(arr, len, target);
    endTime = getTime();
    timeElapsed = endTime - startTime;
    cout << "linear search result for " << target << ":" << result << " after " << startTime << " to " << endTime << ":" << timeElapsed << "\n";

    startTime = getTime();
    result = binSearch(arr, len, target);
    endTime = getTime();
    timeElapsed = endTime - startTime;
    cout << "binary search result for " << target << ":" << result << " after " << startTime << " to " << endTime << ":" << timeElapsed << "\n";
}

int main(){
    benchmarkRandomly(30000);
}
Sample output:
prep took 0.9375
linear search result for 29445:26987 after 701950 to 701950:0
binary search result for 29445:26987 after 701950 to 701950:0
I have tried using clock_t as well, but the result was the same. Do I need an even larger array size, or am I benchmarking the wrong way?
In the course I have to implement most of the stuff myself; that's why I'm not using the STL. I'm not sure if using std::chrono is allowed, but I'd first like to make sure the problem does not lie elsewhere.
Edit: In case it isn't clear, I can't include the time for sorting and random generation in the benchmark.
One problem is that you set startTime = getTime() before you pack your test array with random values. If the random number generation is slow, it might dominate the returned results. The main effort is sorting your array; the search time will be extremely low compared to this.
It is probably too coarse an interval, as you suggest. For a binary search on 30k items we are talking about only around 15 iterations, so on a modern machine roughly 20 / 1000000000 seconds at most. This is approximately zero ms.
Increasing the number of array entries won't help much, but you could try increasing the array size until you get near the memory limit. But now your problem will be that the preparatory random number generation and sorting will take forever.
I would suggest either :-
A. Checking for a very large number of items :-
unsigned int total;
startTime = getTime();
for (i = 0; i < 10000000; i++)
    total += binSearch(arr, len, rand());
endTime = getTime();
B. Modify your code to count the number of times you compare elements and use that information instead of timing.
It looks like you're using the search result (by printing it with cout outside the timed region; that's good). And the data + key are randomized, so the search shouldn't be getting optimized away at compile time. (Benchmarking with optimization disabled is pointless, so you need tricks like this.)
Have you looked at timeElapsed with a debugger? Maybe it's a very small float that prints as 0 with default cout settings?
Or maybe float endTime - float startTime actually is equal to 0.0f because rounding to the nearest float made them equal. Subtracting two large nearby floating-point numbers produces "catastrophic cancellation".
Remember that float only has 24 bits of significand, so regardless of the frequency you divide by, if the PerformanceCounter values differ in less than 1 part in 2^24, you'll get zero. (If that function returns raw counts from x86 rdtsc, then that will happen if your system's last reboot was more than 2^24 times longer ago than the time interval. x86 TSC starts at zero when the system boots, and (on CPUs in the last ~10 years) counts at a "reference frequency" that's (approximately) equal to your CPU's rated / "sticker" frequency, regardless of turbo or idle clock speeds. See Get CPU cycle count?)
double might help, but much better to subtract in the integer domain before dividing. Also, rewriting that part will take QueryPerformanceFrequency out of the timed interval!
As #Jon suggests, it's often better to put the code under test into a repeat loop inside one longer timed interval, so (code) caches and branch prediction can warm up.
But then you have the problem of making sure repeated calls aren't optimized away, and of randomizing the search key inside the loop. (Otherwise a smart compiler might hoist the search out of the loop).
Something like volatile int result = binSearch(...); can help, because assigning to (or initializing) a volatile is a visible side-effect that can't be optimized away. So the compiler needs to actually materialize each search result in a register.
For some compilers, e.g. ones that support GNU C inline asm, you can use inline asm to require the compiler to produce a value in a register without adding any overhead of storing it anywhere. AFAIK this isn't possible with MSVC inline asm.
I have written following code in C++:
#include <cmath>
#include <iostream>

using namespace std;

int main()
{
    double sum, containers, n, c, max_cap, temp;
    unsigned int j = 1;

    cin >> n >> c;
    sum = containers = n;

    for (unsigned int i = 2; i <= c; ++i)
    {
        max_cap = i * n;
        if (max_cap - sum > 0)
        {
            temp = ceil((max_cap - sum)/i);
            containers += temp;
            sum += i * temp;
        }
    }
    cout << containers << '\n';
}
When the input given to this code is "728 1287644555" it takes about 5 seconds to compute the answer, but when the input is roughly three times larger, i.e. "763 3560664427", it does not finish at all (I waited around half an hour). As can be seen, the algorithm is of linear order, so it should take roughly 15 seconds. Why is this happening? Is it because the input is too large in the second case? If so, how does that affect the time so much?
My guess would be unsigned integer overflow.
for (unsigned int i = 2 ; i <= c; ++i)
i increases until it is > c, but c is a double whereas i is an unsigned int. It reaches the maximum (UINT_MAX) and wraps to 0 before it reaches the value of c.
I.e. 1287644555 is less than UINT_MAX, so it completes. But 3560664427 is greater than UINT_MAX, so it loops forever. Which only raises the question of what strange architecture you are running this on :)
On my own machine (UINT_MAX = 4294967295) the first input takes 16 seconds to process while the second takes 43.5 seconds, pretty much what you'd expect.
I have this very simple function that checks the value of (N^(N-1))^(N-2):
#include <cmath>
#include <iostream>

using namespace std;

int main() {
    // Declare variables
    double n;
    double answer;

    // Function
    cout << "Please enter a double number >= 3: ";
    cin >> n;

    answer = pow(n, (n-1)*(n-2));
    cout << "(n to the n-1) to the n-2 for doubles is " << answer << endl;
}
Based on this formula, it is evident the result will reach infinity, but I am curious at what value of n it first hits infinity. Using a loop seems extremely inefficient, but that's all I can think of: basically, a loop that lets n range over 1 to 100 and iterates until the result == inf.
Is there a more efficient approach to this problem?
I think you are approaching this the wrong way.
Let F(N) be the function N^((N-1)*(N-2)).
Now, you actually know the largest finite value that can be stored in a double type variable:
DBL_MAX, roughly 1.8 × 10^308 (bit pattern 0x7FEF FFFF FFFF FFFF in IEEE 754 double precision).
So now you have F(N) = DBL_MAX.
Just solve for N.
Does this answer your question?
Two things: the first is that (N^(N-1))^(N-2) can be written as N^((N-1)*(N-2)). So this removes one pow call, making your code faster:
pow(n, (n-1)*(n-2));
The second is that to know what practical limits you hit, testing all N will literally take a fraction of a second, so there really is no reason to find another practical way.
You could compute it by hand knowing variable size limits and all, but testing it is definitely faster. An example for code (C++11, since I use std::isinf):
#include <iostream>
#include <cmath>
#include <iomanip>

int main() {
    double N = 1.0, diff = 10.0;
    const unsigned digits = 10;
    unsigned counter = digits;

    while ( true ) {
        double X = std::pow( N, (N-1.0) * (N-2.0) );
        if ( std::isinf(X) ) {
            --counter;
            if ( !counter ) {
                std::cout << std::setprecision(digits) << N << "\n";
                break;
            }
            N -= diff;
            diff /= 10;
        }
        N += diff;
    }
    return 0;
}
This example takes less than a millisecond on my computer, and prints 17.28894235
I have programmed a sieve of Eratosthenes algorithm in C++, and it works fine for smaller numbers that I have tested it with. However, when I use large numbers, i.e. 2 000 000 as the upper limit, the program begins giving wrong answers. Can anyone clarify why?
Your help is appreciated.
#include <iostream>
#include <time.h>

using namespace std;

int main() {
    clock_t a, b;
    a = clock();

    int n = 0, k = 2000000; // n = sum of primes, k = upper limit
    bool r[k - 2];          // r = all numbers below k and above 1 (if true, it has been marked as a non-prime)

    for(int i = 0; i < k - 2; i++)  // Check all numbers
        if(!r[i]) {                 // If it hasn't been marked as a non-prime yet ...
            n += i + 2;             // Add the prime to the total sum (+2 because of the shift - index 0 is 2, index 1 is 3, etc.)
            for(int j = 2 * i + 2; j < k - 2; j += i + 2) // Go through all multiples of the prime under the limit
                r[j] = true;        // Mark the multiple as a non-prime
        }

    b = clock();
    cout << "Final Result: " << n << endl;
    cout << b - a << "ms runtime achieved." << endl;
    return 0;
}
EDIT: I just did some debugging and found that it works with the limit at around 400. At 500, however, it is off - it should be 21536, but is 21499
EDIT 2: Ah, I found two errors and those seem to have fixed the problem.
The first was found by others who answered, and is that n is overflowing - upon being made a long long data type, it has begun working.
The second, rather facepalm-worthy mistake was that the booleans in r had to be initialized. After running a loop that sets them all to false before checking for primes, the right answer is produced. Does anyone know why this occurred?
You simply get an integer overflow. The C++ type int has a limited range: on a 32-bit system, usually from -2^31 to 2^31 - 1, so the maximum is typically 2147483647. (The specific maximum on your setup can be found by #including the <limits> header and evaluating std::numeric_limits<int>::max().) Even when k is smaller than the maximum, your code will sooner or later cause an overflow in the expressions n += i + 2 or int j = 2 * i + 2.
You will have to choose a better (read: more appropriate) type like unsigned which does not support negative numbers and can thus can represent numbers twice as large as int. You can also try unsigned long or even unsigned long long.
Also note that variable length arrays (VLAs; that's what bool r[k - 2] is) are not standard C++. You might want to use std::vector instead. You also did not initialize the array to false (std::vector would do this automatically), which could also be the problem, especially if you say that it does not work even at k=500.
In C++, you should also use <ctime> instead of <time.h> (then clock_t and clock() are defined in the std namespace, but since you are using namespace std, this won't make a difference for you); this is more or less a matter of style.
I found a working example in my "code archive". Although it is not based on yours, you might find it useful:
#include <vector>
#include <iostream>

int main()
{
    typedef std::vector<bool> marked_t;
    typedef marked_t::size_type number_t; // The type used for indexing marked_t.

    const number_t max = 500;
    static const number_t iDif = 2; // Account for the numbers 1 and 2.

    marked_t marked(max - iDif);
    number_t i = iDif;
    while (i*i <= max) {
        while (marked[i - iDif] == true)
            ++i;
        for (number_t fac = iDif; i * fac < max; ++fac)
            marked[i * fac - iDif] = true;
        ++i;
    }

    for (marked_t::size_type i = 0; i < marked.size(); ++i) {
        if (!marked[i])
            std::cout << i + iDif << ',';
    }
}