C++ <random> distribution is not random even with seed - c++

I want to generate random numbers outside the main function, but even when using the <random> library and seeding the random number generator, the output is not random. Any help is appreciated.
#include <iostream>
#include <random>
#include <time.h>
int foo(std::mt19937 rng)
{
std::uniform_int_distribution<int> distr(0, 9);
return distr(rng);
}
int main()
{
std::random_device rd;
std::mt19937 rng(rd());
for (int j=0; j<10; j++)
{
std::cout << foo(rng) << " ";
}
return 0;
}
With output
5 5 5 5 5 5 5 5 5 5

int foo(std::mt19937 rng)
You are passing the std::mt19937 generator by value, so every call to foo receives a copy of the generator in main. Only that copy is modified inside the function; the generator in main is never advanced, so each call starts from the same state and produces the same number.
You should pass it by reference, so it modifies the one in main, and in each call the generator will create different numbers:
int foo(std::mt19937& rng)

Short version: Change foo to take a reference.
int foo(std::mt19937& rng);
When a function parameter is an object type, not a reference, that parameter is a different object from the argument object passed to it. Here since the argument type and parameter type are the same, you're using the copy constructor.
When a random number engine like mt19937 is passed to a distribution like uniform_int_distribution, the operator() of the distribution calls the operator() of the engine. The operator() of the engine both returns a pseudo-random value and advances the engine's internal state, so that its next call will return a different value.
But back in your main, the original object rng has not been used with a distribution. It has only been copied, and then that copy was used. So next time through the loop, another copy of rng is made. But since this fresh object is essentially identical to the unused rng object, using it once is just going to produce the same results again.
A reference parameter will fix all this, since then the reference is just another name for the original mt19937 object, so then every call to foo is actually using and changing that original object.
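The difference can be made visible with a small sketch. The function names below are hypothetical, and a fixed seed is used only so that the behaviour is repeatable; the point is that the by-value version leaves the caller's engine untouched, while the by-reference version advances it.

```cpp
#include <random>
#include <vector>

// Takes the engine by value: each call works on a private copy.
int draw_by_value(std::mt19937 rng) {
    std::uniform_int_distribution<int> distr(0, 9);
    return distr(rng);
}

// Takes the engine by reference: each call advances the caller's engine.
int draw_by_reference(std::mt19937& rng) {
    std::uniform_int_distribution<int> distr(0, 9);
    return distr(rng);
}

// Hypothetical helper: collects n draws made through the by-value function.
// Because the caller's engine never changes, all n values are identical.
std::vector<int> collect_by_value(std::mt19937& rng, int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(draw_by_value(rng));
    }
    return out;
}
```

Comparing the engine with an identically seeded one (std::mt19937 supports operator==) before and after a call is a quick way to check whether a function really advanced it.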

Related

Why is this random number generator generating same numbers?

The first one works, but the second one always returns the same value. Why would this happen and how am I supposed to fix this?
int main() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
for(int i = 0; i < 10; i++) {
std::cout << dis(gen) << std::endl;
}return 0;
}
The one that doesn't work:
double generateRandomNumber() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
return dis(gen);
}
int main() {
for(int i = 0; i < 10; i++) {
std::cout << generateRandomNumber() << std::endl;
}return 0;
}
What platform are you working on? std::random_device is allowed to be a pseudo-RNG if hardware or OS functionality to generate random numbers doesn't exist. It might initialize using the current time, in which case the intervals at which you're calling it might be too close apart for the 'current time' to take on another value.
Nevertheless, as mentioned in the comments, it is not meant to be used this way. A simple fix will be to declare rd and gen as static. A proper fix would be to move the initialization of the RNG out of the function that requires the random numbers, so it can also be used by other functions that require random numbers.
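Both fixes could look roughly like the sketch below. The function names are made up for illustration. Note that the static version shares one engine between all callers, which is not safe to call from several threads at once.

```cpp
#include <random>

// Simple fix: function-local statics are constructed on the first call only,
// so the engine is seeded once and keeps its state between calls.
double generateRandomNumberStatic() {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    std::uniform_real_distribution<> dis(0.0, 1.0);
    return dis(gen);
}

// More flexible fix: the caller owns the engine and passes it by reference,
// so other functions can use the same engine too.
double generateRandomNumber(std::mt19937& gen) {
    std::uniform_real_distribution<> dis(0.0, 1.0);
    return dis(gen);
}
```

With the second version, main would create and seed one std::mt19937 and pass it to every function that needs random numbers.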
The first one uses the same generator for all the numbers, the second creates a new generator for each number.
Let's compare your two cases and see why this is happening.
Case 1:
int main() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
for(int i = 0; i < 10; i++) {
std::cout << dis(gen) << std::endl;
}return 0;
}
In your first case the program executes the main function, and the first thing that happens is that you create an instance of std::random_device, std::mt19937 and std::uniform_real_distribution<> on the stack, all belonging to main()'s scope. Your Mersenne Twister gen is initialized once with the result from your random device rd. You have also initialized your distribution dis to cover the range of values from 0 to 1. Each of these exists only once per run of your application.
Now you create a for loop that runs from index 0 to 9, and on each iteration you print the value produced by the distribution's operator(), passing it your already seeded generator gen. Each time through the loop dis(gen) produces a different value, because the same gen, seeded only once, advances its internal state on every call.
Case 2:
double generateRandomNumber() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
return dis(gen);
}
int main() {
for(int i = 0; i < 10; i++) {
std::cout << generateRandomNumber() << std::endl;
}return 0;
}
In this version of the code let's see what's similar and what's different. Here the program starts and enters the main() function. This time the first thing it encounters is a for loop from 0 to 9, as in the main above, but now the loop is the first thing on main's stack. Inside it there is a call to cout to display the result of a user-defined function named generateRandomNumber(). This function is called a total of 10 times, and on each iteration of the for loop it gets its own stack frame, which is created on entry and destroyed on return.
Now let's jump execution into this user-defined function named generateRandomNumber().
The code looks almost exactly the same as it did when it was in main() directly, but these variables live on generateRandomNumber()'s stack and have the lifetime of its scope instead. They are created and destroyed each time the function is entered and left. The other difference is that this function also returns dis(gen).
Note: the function returns a double by value, so main receives a copy of the result; the generator and the distribution themselves are not returned.
Finally, the call to std::uniform_real_distribution<>'s operator() runs in its own stack frame and returns its result to generateRandomNumber(), which in turn returns to main. At that point rd, gen and dis are destroyed.
-Visualizing The Differences-
As you can see, these two programs are quite different. If you want more visual proof, you can paste each program into any online compiler that shows the generated assembly and compare the two versions.
Another way to visualize the difference between these two programs is not only to see their assembly equivalents but to step through each program line by line with a debugger and keep an eye on the stack calls and the winding and unwinding of them and keep an eye of all values as they become initialized, returned and destroyed.
-Assessment and Reasoning-
The reason the first one works as expected is that your random device, your generator and your distribution all have the lifetime of main, your generator is seeded only once with your random device, and you have only one distribution that you use each time in the for loop.
In your second version main doesn't know anything about any of that; all it knows is that it is going through a for loop and sending the value returned from a user function to cout. Each time it goes through the for loop this function is called, and its stack frame, as I said, is created and destroyed each time, so all of its variables are created and destroyed too. So in this case you are creating and destroying 10 instances each of rd, gen(rd()) and dis(0,1).
-Conclusion-
There is more to this than what I have described above and the other part that pertains to the behavior of your random number generators is what was mentioned by user Kane in his statement to you from his comment to your question:
From en.cppreference.com/w/cpp/numeric/random/random_device:
"std::random_device may be implemented in terms of an
implementation-defined pseudo-random number engine [...].
In this case each std::random_device object may generate
the same number sequence."
Each time you create and destroy these objects you are seeding a new generator with a new random_device. However, if your particular machine or OS doesn't support a real random_device, the implementation can end up using some arbitrary fixed value as its seed, or it could end up using the system clock to generate a seed value.
So let's say it does end up using the system clock. The for loop in main() runs so fast that all 10 calls to generateRandomNumber() have finished before even a few milliseconds have passed. The time difference between calls is so small that the same seed value is generated on each pass, and therefore the same values come out of the distribution.
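The consequence of repeated seeding can be sketched without relying on any particular random_device implementation. The helper below is hypothetical; it takes the seed as a parameter to stand in for whatever value a clock-based or fixed-seed random_device would hand out.

```cpp
#include <random>

// Hypothetical helper: builds a fresh engine from the given seed and returns
// the first value drawn from it, mimicking one call to generateRandomNumber().
double first_value_for_seed(unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<> dis(0.0, 1.0);
    return dis(gen);
}
```

Calling this twice with the same seed returns the same value twice, which is exactly what happens when every call to generateRandomNumber() ends up with an identical seed.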
Note that std::mt19937 gen(rd()) is very problematic. See this question, which says:
rd() returns a single unsigned int. This has at least 16 bits and probably 32. That's not enough to seed [this generator's huge state].
Using std::mt19937 gen(rd());gen() (seeding with 32 bits and looking at the first output) doesn't give a good output distribution. 7 and 13 can never be the first output. Two seeds produce 0. Twelve seeds produce 1226181350. (Link)
std::random_device can be, and sometimes is, implemented as a simple PRNG with a fixed seed. It might therefore produce the same sequence on every run. (Link)
Furthermore, random_device's approach to generating "nondeterministic" random numbers is "implementation-defined", and random_device allows the implementation to "employ a random number engine" if it can't generate "nondeterministic" random numbers due to "implementation limitations" ([rand.device]). (For example, under the C++ standard, an implementation might implement random_device using timestamps from the system clock, or using fast-moving cycle counters, since both are nondeterministic.)
An application should not blindly call random_device's generator (rd()) without also, at a minimum, calling the entropy() method, which gives an estimate of the implementation's entropy in bits.
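One commonly used mitigation for the small-seed problem is to draw several words from random_device and feed them through std::seed_seq. The sketch below is only that, a sketch: the helper names are hypothetical, the choice of 8 words is arbitrary, and it does nothing about a random_device that is itself deterministic.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper: seeds mt19937 from several random_device outputs
// instead of a single 32-bit value. The count of 8 words is an assumption.
std::mt19937 make_seeded_engine() {
    std::random_device rd;
    std::array<std::uint32_t, 8> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}

// Returns the implementation's own entropy estimate in bits.
// A result of 0 suggests a deterministic random_device.
double reported_entropy() {
    std::random_device rd;
    return rd.entropy();
}
```

The value returned by entropy() is only an estimate from the implementation, so it is best treated as a hint rather than a guarantee.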

Random normal distribution by Gaussian in C++

I have a Python function that draws samples from a normal distribution. I need to convert it to C++ and I am not familiar with the language.
Here is my Python:
def calculation(value):
    sigma = 0.5
    size = 10000
    x = 200
    x_distribution = np.random.normal(value, sigma, size)
    for i in x_distribution:
        x.append(i)
    return x
And it works as expected. I am trying to rewrite the same thing in C++, and all I found was the Link, where "std::normal_distribution<> d{5,2};" is supposed to do the magic. But I could not figure out how to implement it.
Here is what I have tried, and it is failing.
# include frame.distribution
Frame DistributionModel(x_mu, x_sigma)
{
// Motion model;ignore it
model = std::normal_distribution<> d{x_mu,x_sigma};
return model;
}
Please, help me. Looking for any hints. Thanks.
Well, trouble without end...
# include frame.distribution
Syntax for inclusion is:
#include <name_of_header_file>
// or:
#include "name_of_header_file"
(The space in between # and include does not harm, but is absolutely uncommon...)
Frame DistributionModel(x_mu, x_sigma)
C++ is a statically typed language, i.e. you cannot just give variables a name as in Python; you need to give them a type as well!
Frame DistributionModel(double x_mu, double x_sigma)
Same for local variables; type must match what you actually assign to (unless using auto)
std::normal_distribution<double> nd(x_mu, x_sigma);
This is a bit special about C++: You define a local variable, e. g.
std::vector<int> v;
If the type is a class, the variable is constructed right away using its default constructor. If you want to call a constructor with arguments, you just append the call to the variable name:
std::vector<int> v(10); // vector with 10 elements.
What you saw in the sample is a feature called "uniform initialisation", using braces instead of parentheses. I personally strongly oppose its usage, though, so you won't ever see it in code I have written (see how I construct the std::normal_distribution above...).
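One place where the two forms behave differently is std::vector, which may be part of the reason for caution. A minimal sketch (the function names are made up for illustration):

```cpp
#include <cstddef>
#include <vector>

// Parentheses call the size constructor: a vector with 10 elements.
std::size_t size_with_parentheses() {
    std::vector<int> v(10);
    return v.size();
}

// Braces prefer the initializer-list constructor: a vector with the
// single element 10.
std::size_t size_with_braces() {
    std::vector<int> v{10};
    return v.size();
}
```

For std::normal_distribution both forms construct the same object, since it has no initializer-list constructor.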
std::normal_distribution is defined in header random, so you need to include it (before your function definition):
#include <random>
About the return value: You can only return Frame if that data type is defined somewhere. Before trying to define a new class, we can just use an existing one: std::vector (it's a class template, though). A vector is quite similar to a Python list; it is a container class storing a number of objects in contiguous memory. Unlike Python lists, though, all elements stored must have the same type. We can use such a vector to collect the results:
std::vector<double> result;
Such a vector can grow dynamically, however, this can result in necessity to re-allocate the internal storage memory. Costly. If you know the number of elements in advance, you can tell the vector to allocate sufficient memory in advance, too:
result.reserve(max);
The vector is what we are going to return, so we need to adjust the function signature (I took the liberty of giving it a different name and adding another parameter):
std::vector<double> getDistribution(double x_mu, double x_sigma, size_t numberOfValues)
It would be possible to let the compiler deduce the return type by using the auto keyword for it. While auto brings quite a lot of benefits, I do not recommend it for this purpose: with an explicit return type, users of the function see right from the signature what kind of result to expect and do not have to look into the function body to find out.
std::normal_distribution is a number generator; it does not deliver the entire sequence at once as the Python equivalent does. You need to draw the values one by one explicitly:
while(numberOfValues-- > 0)
{
auto value = nd(gen);
result.push_back(value);
}
nd(gen): std::normal_distribution provides a function call operator, operator(), so objects of this type can be called just like functions (such objects are called "functors" in C++ terminology). The function call, however, requires a random number generator as argument, so we need to provide one, as in the example you saw. Putting it all together:
#include <random>
#include <vector>
std::vector<double> getDistribution
(
double x_mu, double x_sigma, size_t numberOfValues
)
{
// shortened compared to your example:
std::mt19937 gen((std::random_device())());
// create temporary (anonymous) ^^
// instance and call it immediately ^^
// afterwards
std::normal_distribution<double> nd(x_mu, x_sigma);
std::vector<double> result;
result.reserve(numberOfValues);
while(numberOfValues-- > 0)
{
// shorter than above: using result of previous
// function (functor!) call directly as argument to next one
result.push_back(nd(gen));
}
// finally something familiar from python:
return result;
}
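A possible variation, sketched here under the assumption that the caller wants to control seeding, takes the generator by reference instead of creating one inside the function. The name sampleNormal and its parameter order are hypothetical. This keeps one engine alive across calls and makes results reproducible when a fixed seed is used.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical variant of getDistribution: the caller owns the engine,
// so repeated calls continue the same random sequence.
std::vector<double> sampleNormal(double mu, double sigma,
                                 std::size_t count, std::mt19937& gen) {
    std::normal_distribution<double> nd(mu, sigma);
    std::vector<double> result;
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back(nd(gen));  // one draw per element
    }
    return result;
}
```

A call such as sampleNormal(value, 0.5, 10000, gen) would then roughly correspond to np.random.normal(value, 0.5, 10000) in the Python original.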
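#include <chrono>
#include <cmath>
#include <iostream>
#include <random>
int main() {
// Seed from the clock; the cast makes the narrowing to unsigned explicit.
unsigned seed = static_cast<unsigned>(std::chrono::system_clock::now().time_since_epoch().count());
std::default_random_engine generator(seed);
std::normal_distribution<double> distribution(0.0, 3.0);
// std::abs from <cmath> picks the floating-point overload; plain abs may truncate to int.
double number = std::abs(distribution(generator));
std::cout << number;
std::cin.get();
return 0;
}
This may help: it draws a random number from a Gaussian with mean 0.0 and standard deviation 3.0 and prints its absolute value.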

Is it correct to pass the random number generator mt19937 by reference to helper functions?

int helper( mt19937& generator ){
// do stuff
return 0;
}
#include "helper.h"
// helper function defined in separate source file and declared in header
mt19937 generator(time(NULL));
int main( ) {
helper(generator);
}
Is it correct to create and seed the mt19937 random number generator, then pass it by reference to a function for use?
I am doing this because I know I am supposed to seed mt19937 only once. But I have a lot of helper functions in separate source files that need to use a random number generator, e.g. with the shuffle function.
Yes it is correct to pass the generator around by reference. The mt19937 has internal state that needs to be modified to get the next random number. If you were to pass the generator by value then you would make a copy of that state and then multiple functions would wind up getting the same random number. This is also why you cannot pass it by const& since it would not be able to modify that internal state if it was const.
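For the shuffle case mentioned in the question, a helper might look like the following sketch. The function name is hypothetical; std::shuffle itself is the standard algorithm from <algorithm>.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical helper: shuffles in place using the caller's engine.
// Taking the engine by non-const reference lets std::shuffle advance it.
void shuffleValues(std::vector<int>& values, std::mt19937& generator) {
    std::shuffle(values.begin(), values.end(), generator);
}
```

Every helper that receives the same engine this way continues from the state the previous one left behind, so no two helpers replay the same numbers.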

Boost C++: Seeding random numbers in a function

I have a function that should simulate a new random exponential variable every time it is called:
#include <boost/random.hpp>
//Simulates a single exponential random variable
double generateExponential(double lambda) {
boost::mt19937 rng; //Mersenne Twister Generator
rng.seed(time(0));
boost::variate_generator< boost::mt19937&, boost::exponential_distribution<> > rndm(rng, boost::exponential_distribution<>(lambda));
return rndm();
}
for example,
double newExp = generateExponential(10);
However, each time I call the function, it generates the same random number. I want it to generate a different random number each time the function is called. I thought "rng.seed(time(0))" might fix it but it hasn't. How could I get it to do this?
If you can't change the signature of your function, then you could use a static instance of your generator. No need to re-seed.
#include <boost/random.hpp>
#include <ctime>
typedef boost::mt19937 G;
typedef boost::exponential_distribution<> D; // a typedef needs a concrete type, hence the <>
double generateExponential(double lambda)
{
static G rng(std::time(0)); // initialized and seeded once
boost::variate_generator<G &, D> rndm(rng, D(lambda));
return rndm();
}
Generally speaking, a source of random numbers should be a resource whose lifespan is that of your entire program, not that of an individual function call.
Consequently, you need the object representing said source of random numbers to have an appropriate lifespan. Making it a variable local to your function is a Bad Idea. It should be an object passed into your function, or maybe a global object.
(also, frequently reseeding a random number generator is another well known Bad Idea)
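Passing the generator into the function could look like the sketch below. It deliberately swaps in the standard library's std::mt19937 and std::exponential_distribution for the Boost types so that it compiles without Boost, and the extra parameter changes the signature from the one in the question.

```cpp
#include <random>

// Sketch with standard-library types standing in for the Boost ones:
// the caller owns the engine, seeds it once, and passes it in.
double generateExponential(double lambda, std::mt19937& rng) {
    std::exponential_distribution<double> dist(lambda);
    return dist(rng);
}
```

The caller would create one engine at program start, for example in main, and hand it to every call, so each call continues the same sequence instead of restarting it.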

Can boost/random/uniform_int.hpp and boost/random/uniform_int_distribution.hpp be used interchangeably?

There are two random integer generators in Boost, boost::uniform_int<> and boost::random::uniform_int_distribution<>, the latter being available only since Boost 1.47.
I would like to know if there is any difference in their performance (i.e. the quality of the random numbers they generate).
Also, with boost::uniform_int<> you need to couple it with a random engine through variate_generator, but according to Boost's official website it seems that you can use
boost::random::mt19937 rng;
boost::random::uniform_int_distribution<> six(1,6);
int x = six(rng);
without the variate_generator.
Can these two usages be used interchangeably?
boost::uniform_int<> inherits from boost::random::uniform_int_distribution<> and if you look at the header for uniform_int<>, you can see that it basically just calls the base class functions.
Since uniform_int<> just calls uniform_int_distribution<>'s functions, there is no difference in the numbers generated. Boost does explicitly state, however, that uniform_int<> is deprecated, and that uniform_int_distribution<> should be used for all new code.
To answer your second question, neither uniform_int<> nor uniform_int_distribution<> require a boost::random::variate_generator<> to function. The variate_generator<> simply associates a random number generator (like boost::random::mt19937) with a random number distribution (like uniform_int_distribution<>) as a convenience. If you don't use variate_generator<>, then you need to pass a random number generator each time you wish to generate a random number. Here's an example:
#include <boost/random/uniform_int.hpp>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/variate_generator.hpp>
#include <iostream>
#include <ctime>
int main()
{
boost::mt19937 rand_generator(std::time(NULL));
boost::random::uniform_int_distribution<> int_distribution(0, 100);
//Need to pass generator
std::cout << int_distribution(rand_generator) << std::endl;
//Associate generator with distribution
boost::random::variate_generator<boost::mt19937&,
boost::random::uniform_int_distribution<>
> int_variate_generator(rand_generator, int_distribution);
//No longer need to pass generator
std::cout << int_variate_generator() << std::endl;
}
Note that the first call is to uniform_int_distribution<> operator() whereas the second call is to variate_generator<> operator(). Associating a generator with a distribution does not change the original generator or distribution objects.
Please let me know if anything is unclear.
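For readers using the standard library rather than Boost: <random> has no variate_generator, but a lambda can play a similar role. The sketch below is an analogue under that assumption, not Boost's API, and the helper name is hypothetical. It requires C++14 for the deduced return type.

```cpp
#include <random>

// Hypothetical helper: binds a distribution to an engine, similar in spirit
// to variate_generator<Engine&, Distribution>. The engine is captured by
// reference, so it must outlive the returned callable.
auto makeDie(std::mt19937& rng) {
    std::uniform_int_distribution<int> dist(1, 6);
    return [&rng, dist]() mutable { return dist(rng); };
}
```

After auto die = makeDie(rng);, each call to die() draws from the bound distribution and advances rng, without the generator being passed again.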