Arriving at a close approximation of the probability using code - c++

I was given a math question on probability. It goes like this:
There are 1000 lotteries and each has 1000 tickets. You decide to buy 1 ticket per lottery. What is the probability that you win at least one lottery?
I was able to do it mathematically on paper (arrived at 1 - (999/1000)^1000), but then it occurred to me to run a large number of iterations of the random experiment on my computer. So I wrote some code, three versions of it to be exact, and all of them malfunction.
Code 1:
#include<iostream>
#include <stdlib.h>
using namespace std;
int main() {
int p2 = 0;
int p1 = 0;
srand(time(NULL));
for (int i = 0; i<100000; i++){
for(int j = 0; j<1000; j++){
int s = 0;
int x = rand()%1000;
int y = rand()%1000;
if(x == y)
s = 1;
p1 += s;
}
if(p1>0)
p2++;
}
cout<<"The final probability is = "<< (p2/100000);
return 0;
}
Code 2:
#include<iostream>
#include <stdlib.h>
using namespace std;
int main() {
int p2 = 0;
int p1 = 0;
for (int i = 0; i<100000; i++){
for(int j = 0; j<1000; j++){
int s = 0;
srand(time(NULL));
int x = rand()%1000;
srand(time(NULL));
int y = rand()%1000;
if(x == y)
s = 1;
p1 += s;
}
if(p1>0)
p2++;
}
cout<<"The final probability is = "<< (p2/100000);
return 0;
}
Code 3 (based on some advanced text that I referred to, but I don't understand most of it):
#include<iostream>
#include <random>
using namespace std;
int main() {
int p2 = 0;
int p1 = 0;
random_device rd;
mt19937 gen(rd());
for (int i = 0; i<100000; i++){
for(int j = 0; j<1000; j++){
uniform_int_distribution<> dis(1, 1000);
int s = 0;
int x = dis(gen);
int y = dis(gen);
if(x == y)
s = 1;
p1 += s;
}
if(p1>0)
p2++;
}
cout<<"The final probability is = "<< (p2/100000);
return 0;
}
Now, all of these codes output the same text:
The final probability is = 1
Process finished with exit code 0
It seems that the rand() function has been outputting the same value over all the 100000 iterations of the loop. I haven't been able to fix this.
I also tried using randomize() function instead of the srand() function, but it doesn't seem to work and gives weird errors like:
error: ‘randomize’ was not declared in this scope
randomize();
^
I think that randomize() has been discontinued in the later versions of C++.
I know that I am wrong on many levels. I would really appreciate it if you could patiently explain my mistakes to me and suggest some possible corrections.

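You should reset your count (p1) at the beginning of the outer loop; otherwise, once a single match has occurred, p1 stays positive and every later test is counted as a win. Also, be aware of the final integer division p2/100000: any value of p2 < 100000 results in 0.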
Look at this modified version of your code:
#include <cmath>
#include <iostream>
#include <random>
int main()
{
const int number_of_tests = 100000;
const int lotteries = 1000;
const int tickets_per_lottery = 1000;
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> lottery(1, tickets_per_lottery);
int winning_cases = 0;
for (int i = 0; i < number_of_tests; ++i )
{
int wins = 0; // <- reset when each test starts
for(int j = 0; j < lotteries; ++j )
{
int my_ticket = lottery(gen);
int winner = lottery(gen);
if( my_ticket == winner )
++wins;
}
if ( wins > 0 )
++winning_cases;
}
// use the correct type to perform these calculations
double expected = 1.0 - std::pow((lotteries - 1.0)/lotteries, lotteries);
double probability = static_cast<double>(winning_cases) / number_of_tests;
std::cout << "Expected: " << expected
<< "\nCalculated: " << probability << '\n';
return 0;
}
A typical run would output something like:
Expected: 0.632305
Calculated: 0.63125

Seed the pseudorandom number generator with srand() only once, at the beginning of your program. When you seed it over and over again, you reset the generator to the initial state for that seed. time() typically has a granularity of one second, so odds are that all 1000 iterations, or most of them, run within a single second and therefore get the same seed.

See this answer to someone else's question for a general description of how pseudorandom number generators work.
This means that you should create one instance of a PRNG in your program and seed it one time. Don't do either of those tasks inside loops, or inside functions that get called multiple times, unless you really know what you're doing and are trying something sophisticated, such as correlation induction strategies (common random numbers, antithetic variates) to achieve "variance reduction".

Related

Averaging Coin Tosses with Accumulator C++

This is the problem I am working with
Using a loop and rand(), simulate a coin toss 10000 times
Calculate the difference between heads and tails.
Wrap the above two lines in another loop, which loops 1000 times.
Use an accumulator to sum up the differences
Calculate and display the average difference between the number of heads and tails.
The accumulator is not working the way I want it to. I'm very much a C++ noob and this is for homework, lol. Can anyone help, please?
Why am I using rand()????
second part of the assignment has us using the newer method (mt19937), just trying to tackle this bit first before moving on.
#include <iostream>
using namespace std;
int main() {
int heads = 0, tails = 0, num, total = 0;
srand(time(NULL));
for (int h = 0; h < 1000; h++) // Loop Coin Toss
{
for (int i = 0; i < 10000; i++) // COIN TOSS
{
int random = rand() % 2;
if (random == 0)
{
heads++;
}
else
{
tails++;
}
}
cout << abs((heads++ - tails++));
cin >> num;
total =+ num;
}
cout << "The average distance between is " << total / 1000 << endl;
cin.get();
return 0;
}
With your code, you never actually save the values that you need. And there's some unnecessary arithmetic that would throw off your results. This line:
cout << abs((heads++ - tails++)); increments the heads and tails variables, but they shouldn't be.
The next two lines make no sense. Why do you need to get a number from the user, and why do you add that number to your total?
Finally, this expression: total / 1000 performs integer division, which will throw off your results.
Those are the immediate issues I can spot in your code.
Next, we move on to your problem statement. What is an accumulator? To me, it sounds like you're supposed to have a class? It also reminds me of std::accumulate, but if that's what you intended, it would have said as much. Also, std::accumulate would require storing results, and that's not really necessary for this program. The code below performs the main task, i.e. it runs the necessary simulations and tracks results.
You'll notice I don't bother counting tails. The big average is also calculated as it goes since the total number of simulations is known ahead of time.
#include <cmath>
#include <iostream>
#include <random>
int flip_coin() {
static std::mt19937 prng(std::random_device{}());
static std::uniform_int_distribution<int> flip(0, 1);
return flip(prng);
}
int main() {
constexpr int tosses = 10'000;
constexpr int simulations = 1'000;
double diffAvg = 0.0;
for (int i = 0; i < simulations; ++i) {
int heads = 0;
for (int j = 0; j < tosses; ++j) {
if (flip_coin()) {
++heads;
}
}
diffAvg +=
std::abs(heads - (tosses - heads)) / static_cast<double>(simulations);
}
std::cout << "The average heads/tails diff is: " << diffAvg << '\n';
return 0;
}
What I ended up doing, which seems to work for this version with rand() (I'll use the newer method later):
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;
int main()
{
int total = 0;
srand(time(NULL));
for (int h = 0; h < 1000; h++) // Loop Coin Toss
{
int heads = 0, tails = 0; // reset the counts for every round of tosses
for (int i = 0; i < 10000; ++i) // COIN TOSS
{
if (rand() % 2 == 0)
++heads;
else
++tails;
}
total += abs(heads - tails); // accumulate this round's difference
}
cout << "The average distance between is " << total / 1000.0 << '\n';
cin.get();
return 0;
}

Why does vc++ compiler cause this statistical pattern?

I'm running the following program:
#include <iostream>
#include <vector>
#include <cmath>
#include <cstdlib>
#include <chrono>
using namespace std;
const int N = 200; // Number of tests.
const int M = 2000000; // Number of pseudo-random values generated per test.
const int VALS = 2; // Number of possible values (values from 0 to VALS-1).
const int ESP = M / VALS; // Expected number of appearances of each value per test.
int main() {
for (int i = 0; i < N; ++i) {
unsigned seed = chrono::system_clock::now().time_since_epoch().count();
srand(seed);
vector<int> hist(VALS, 0);
for (int j = 0; j < M; ++j) ++hist[rand() % VALS];
int Y = 0;
for (int j = 0; j < VALS; ++j) Y += abs(hist[j] - ESP);
cout << Y << endl;
}
}
This program performs N tests. In each test we generate M numbers between 0 and VALS-1 while we keep counting their appearances in a histogram. Finally, we accumulate in Y the errors, which correspond to the difference between each value of the histogram and the expected value. Since the numbers are generated randomly, each of them would ideally appear M/VALS times per test.
After running my program I analysed the resulting data (i.e., the 200 values of Y) and I realised that some things were happening which I cannot explain. I saw that, if the program is compiled with vc++ and given some N and VALS (N = 200 and VALS = 2 in this case), we get different data patterns for different values of M. For some tests the resulting data follows a normal distribution, and for some tests it doesn't. Moreover, these two types of result seem to alternate as M (the number of pseudo-random values generated in each test) increases:
M = 10K, data is not normal:
M = 100K, data is normal:
and so on:
As you can see, depending on the value of M the resulting data follows a normal distribution or otherwise follows a non-normal distribution (bimodal, dog food or kind of uniform) in which more extreme values of Y have greater presence.
This diversity of results doesn't occur if we compile the program with other C++ compilers (gcc and clang). In this case, it looks like we always obtain a half-normal distribution of Y values:
What are your thoughts on this? What is the explanation?
I carried out the tests through this online compiler: http://rextester.com/l/cpp_online_compiler_visual
The program will generate poorly distributed random numbers (not uniform, independent).
The function rand is a notoriously poor one.
The use of the remainder operator % to bring the numbers into range effectively discards all but the low-order bits.
The RNG is re-seeded every time through the loop.
[edit] I just noticed const int ESP = M / VALS;. You want a floating point number instead.
Try the code below and report back. Using the new <random> is a little tedious. Many people write some small library code to simplify its use.
#include <iostream>
#include <vector>
#include <cmath>
#include <random>
#include <chrono>
using namespace std;
const int N = 200; // Number of tests.
const int M = 2000000; // Number of pseudo-random values generated per test.
const int VALS = 2; // Number of possible values (values from 0 to VALS-1).
const double ESP = (1.0*M)/VALS; // Expected number of appearances of each value per test.
static std::default_random_engine engine;
static void seed() {
std::random_device rd;
engine.seed(rd());
}
static int rand_int(int lo, int hi) {
std::uniform_int_distribution<int> dist (lo, hi - 1);
return dist(engine);
}
int main() {
seed();
for (int i = 0; i < N; ++i) {
vector<int> hist(VALS, 0);
for (int j = 0; j < M; ++j) ++hist[rand_int(0, VALS)];
int Y = 0;
for (int j = 0; j < VALS; ++j) Y += abs(hist[j] - ESP);
cout << Y << endl;
}
}

linear search for number vector in c++

I am trying to output 9 random non repeating numbers. This is what I've been trying to do:
#include <iostream>
#include <cmath>
#include <vector>
#include <ctime>
using namespace std;
int main() {
srand(time(0));
vector<int> v;
for (int i = 0; i<4; i++) {
v.push_back(rand() % 10);
}
for (int j = 0; j<4; j++) {
for (int m = j+1; m<4; m++) {
while (v[j] == v[m]) {
v[m] = rand() % 10;
}
}
cout << v[j];
}
}
However, I often get repeating numbers. Any help would be appreciated. Thank you.
With a true random number generator, the probability of drawing a particular number is not conditional on any previous numbers drawn. I'm sure you've attained the same number twice when rolling dice, for example.
rand(), which roughly approximates a true generator, can therefore give you back the same number more than once, perhaps even consecutively. Restricting the output to ten possible values with % 10 makes repeats far more likely.
If you don't want repeats, then instantiate a vector containing all the numbers you want potentially, then shuffle them. std::shuffle can help you do that.
See http://en.cppreference.com/w/cpp/algorithm/random_shuffle
When j=0, you'll be checking it with m={1, 2, 3}
But when j=1, you'll be checking it with just m={2, 3}.
You are not checking it with the 0th index again. There, you might be getting repetitions.
Also, to reduce (though not eliminate) the chance of getting repeated numbers, you could draw from a larger range of values, say 100 instead of 10.
Please look at the following code to get distinct random values by constantly checking the used values in a std::set:
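#include <cstdlib>
#include <iostream>
#include <set>
#include <vector>
int main() {
int n = 4;
std::vector <int> values(n);
std::set <int> used_values;
for (int i = 0; i < n; i++) {
int temp = rand() % 10;
// redraw until we get a value that has not been used yet
while (used_values.find(temp) != used_values.end())
temp = rand() % 10;
used_values.insert(temp); // remember this value so it is not picked again
values[i] = temp;
}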
for(int i = 0; i < n; i++)
std::cout << values[i] << std::endl;
return 0;
}

c++ random engines not really random

I'm playing a bit with c++ random engines, and something upsets me.
Having noticed that the values I had were roughly of the same order, I did the following test:
#include <random>
#include <functional>
#include <iostream>
int main()
{
auto res = std::random_device()();
std::ranlux24 generator(res);
std::uniform_int_distribution<uint32_t> distribution;
auto roll = std::bind(distribution, generator);
for(int j = 0; j < 30; ++j)
{
double ssum = 0;
for(int i = 0; i< 300; ++i)
{
ssum += std::log10(roll());
}
std::cout << ssum / 300. << std::endl;
}
return 0;
}
and the values I printed were all about 9.2 looking more like a normal distribution, whatever the engine I used.
Is there something I have not understood correctly?
Thanks,
Guillaume
"Having noticed that the values I had were roughly of the same order"
This is exactly what you'd expect with a uniform random number generator. There are 9 times as many integers in the range [10^(n-1),10^n) as there are in the range [0,10^(n-1)).

Vector inside vector (creating chromosomes)

I'm attempting to build a genetic algorithm that can take a certain amount of variables (say 4), and use these in a way so that you could have 2a + 3b + c*c + d = 16. I realise there are more efficient ways to calculate this, but I want to try and build a genetic algorithm to expand later.
I'm starting by trying to create "organisms" that can compete later. What I've done is this:
#include "stdafx.h"
#include <iostream>
#include <vector>
#include <random>
// Set population size
const int population_size = 10;
const int number_of_variables = 4;
int main()
{
// Generate random number
std::random_device rd;
std::mt19937 rng(rd()); // random-number engine (Mersenne-Twister in this case)
std::uniform_int_distribution<int> uni(-10, 10);
// Set gene values.
std::vector<int>chromosome;
std::vector<int>variables;
for (int i = 0; i < number_of_variables; ++i)
{
double rand_num = uni(rng);
variables.push_back (rand_num);
std::cout << variables[i] << "\n";
}
return 0;
}
What happens is it will fill up the number_of_variables vector, and output these just because that makes it clear for me that it's actually doing what I intend for it to do. What I want it to do however is to fill up each "chromosome" with one variables vector, so that for example chromosome 0 would have the values {1, 5, -5, 9} etc.
The following code obviously isn't working, but this is what I'd like it to do:
for (int j = 0; j < population_size; ++j)
{
for (int i = 0; i < number_of_variables; ++i)
{
double rand_num = uni(rng);
variables.push_back(rand_num);
}
chromosome.push_back(variables[j]);
std::cout << chromosome[j] << "\n";
}
Meaning it'd fill up the variables randomly, then chromosome1 would take those 4 values that "variables" took, and repeat. What actually happens is that (I think) it only takes the first value from "variables" and copies that into "chromosome" rather than all 4.
If anyone could help it'd be very much appreciated. I realise this might be a rookie mistake that is laughably simple in the eyes of someone more experienced with vectors (which would probably be 99% of the people on this website, hah).
Anyway, thanks :)
#include <iostream>
#include <vector>
#include <random>
// Set population size
const int population_size = 10;
const int number_of_variables = 4;
int main()
{
// Generate random number
std::random_device rd;
std::mt19937 rng(rd()); // random-number engine (Mersenne-Twister in this case)
std::uniform_int_distribution<int> uni(-10, 10);
// Set gene values.
std::vector< std::vector<int>>chromosome;
for( int kp = 0; kp < population_size; kp++ )
{
std::vector<int>variables;
for (int i = 0; i < number_of_variables; ++i)
{
int rand_num = uni(rng); // the distribution already yields int
variables.push_back (rand_num);
}
chromosome.push_back( variables );
}
// display entire population
for( auto c : chromosome )
{
for( auto v : c )
{
std::cout << v << " ";
}
std::cout << "\n";
}
// display 4th member of population
for( auto v : chromosome[ 3 ] )
{
std::cout << v << " ";
}
std::cout << "\n";
return 0;
}
http://ideone.com/2jastJ
You can place a vector inside a vector with the syntax:
std::vector<std::vector<int>>
but you will need to size each inner vector so that it can hold number_of_variables elements before indexing into it.
#include <vector>
#include <cstdlib>
using Individual = std::vector<int>;
using Population = std::vector<Individual>;
// short for std::vector<std::vector<int>>;
const size_t number_of_variables = 8;
int main() {
Population population(10);
for (auto& individual : population) {
individual.resize(number_of_variables);
for (size_t j = 0; j < number_of_variables; ++j) {
individual[j] = j; // replace with random number
}
}
}
Live demo: http://ideone.com/pfufGt