I'm using the following function to generate gaussian random numbers:
double r_norm(double mean, double sigma){
random_device rd;
mt19937 gen(rd());
normal_distribution<double> d(mean, sigma);
return d(gen);
}
However, when I call this in main() with cout:
for (int k = 0; k < 10; k++){
cout << r_norm(2,0.5) <<endl;
}
It outputs the same number 10 times. Ideally I need to be able to call this function wherever, in order to receive a newly generated number each time.
Update: I managed to fix this by declaring the random device and mersenne twister outside of scope as global variables, but is there a neater way to do this?
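The problem is that you need to reuse the random_device and the generator: as written, the function builds and seeds a brand-new mt19937 on every call instead of continuing one sequence.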
If you really want to keep the exact same function signature, the easiest way would be to use a global variable:
random_device rd;
mt19937 gen(rd());
double r_norm(double mean, double sigma){
normal_distribution<double> d(mean, sigma);
return d(gen);
}
That being said: the standard library's random distributions are stateful, so you also need to reuse d if you want an actual valid distribution.
At this point, maintaining your interface would require a static std::map<std::pair<double, double>, std::normal_distribution<double>> so that you reuse them properly as well.
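One possible way to keep the r_norm(mean, sigma) signature without a global or a map is sketched below. It relies on the operator() overload that takes a param_type, so a single distribution object is reused while the parameters change per call. The use of thread_local is my own assumption (it gives each thread its own engine); plain static would also work in single-threaded code.

```cpp
#include <random>

// Sketch: same signature as in the question, but the engine and the
// distribution object are created once and reused on every call.
double r_norm(double mean, double sigma) {
    // Seeded once, on the first call made by each thread.
    thread_local std::mt19937 gen(std::random_device{}());
    // Default-constructed; the actual parameters are passed per call below.
    thread_local std::normal_distribution<double> d;
    using param_t = std::normal_distribution<double>::param_type;
    return d(gen, param_t(mean, sigma));
}
```

Calling r_norm(2, 0.5) in a loop should then print a different value on each iteration, because the engine's state carries over between calls.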
I've read some similar questions to the one I'm asking but the answers don't seem complete or completely clear to me.
I'm trying to parallelize a parameter scan that requires the repeated generation of a set of random numbers. With only one thread I currently do something like this:
int main() {
//Get random number generators
typedef std::mt19937 MyRNG;
std::random_device rd;
//seed generator
MyRNG rng;
rng.seed(rd());
//make my uniform distributions for each parameter
std::uniform_real_distribution<> param1(-1,1);
std::uniform_real_distribution<> param2(-1,1);
double x,y;
//Do my scan
for (int i = 0; i < N; i++) {
x = param1(rng);
y = param2(rng);
//Do things with x and y*
}
}
In this way I get a new x and y for every scan. Now I want to utilize multiple cores to do this in parallel. So I define a function void scan() which essentially has the same contents as my main function. I then create multiple threads and have each of them run scan(). But I'm not sure if this is thread safe using std::thread. Will the random number generation in each thread, as it currently is, be independent? Can I save myself time by creating my RNGs outside of my void function? Thanks.
I would probably generate the seeds in main, and pass a seed to each thread function. I wouldn't use the output of std::random_device directly either--I'd put numbers into something like an std::set or std::unordered_set until I got as many seeds as I wanted, to assure that I didn't give two threads the same seed (which would obviously be a waste of time).
Something along this general line:
void do_work(unsigned long long seed) {
//Get random number generators
typedef std::mt19937 MyRNG;
//seed generator
MyRNG rng(seed);
//make my uniform distributions for each parameter
std::uniform_real_distribution<> param1(-1,1);
std::uniform_real_distribution<> param2(-1,1);
double x,y;
//Do my scan
for (int i = 0; i < N; i++) {
x = param1(rng);
y = param2(rng);
//Do things with x and y*
}
}
static const int num_threads = 4;
int main() {
std::set<unsigned long long> seeds;
while (seeds.size() < num_threads)
seeds.insert(std::random_device()());
std::vector<std::thread> threads;
for (auto const seed: seeds)
threads.emplace_back(do_work, seed);
for (auto &t : threads)
t.join();
}
As an aside, using a single result from random_device to seed an std::mt19937 restricts the generator quite a bit: you're giving it only 32 (or possibly 64) bits of seed, but it actually has 19937 bits of seed material. std::seed_seq attempts to ameliorate this to at least some degree (among other things, you can use a number of outputs from std::random_device to create the seed).
Oh, and given that your two instances of uniform_real_distribution use the same parameters, there's probably not a whole lot of need for two separate distribution objects either.
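For the std::seed_seq remark above, a minimal sketch could look like the following. The helper name make_engine and the choice of 8 seed words are my own assumptions for illustration; nothing in the question prescribes them.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Sketch: seed std::mt19937 from several random_device outputs
// instead of a single 32-bit value.
std::mt19937 make_engine() {
    std::random_device rd;
    std::array<std::uint32_t, 8> words{};  // 8 words is an arbitrary choice
    // std::ref avoids copying rd (random_device is not copyable).
    std::generate(words.begin(), words.end(), std::ref(rd));
    std::seed_seq seq(words.begin(), words.end());
    return std::mt19937(seq);
}
```

Each thread function could then call make_engine() itself, or main could build one seed_seq per thread and pass the resulting engines in. Either way this only widens the seed; it does not by itself guarantee that the streams are statistically independent.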
Those three lines for generating a random number look a bit tricky, and they are hard to remember. Could someone please shed some light on them to make them easier to understand?
#include <random>
#include <iostream>
int main()
{
std::random_device rd; //1st line: Will be used to obtain a seed for the random number engine
std::mt19937 gen(rd()); //2nd line: Standard mersenne_twister_engine seeded with rd()
std::uniform_int_distribution<> dis(1, 6);
for (int n=0; n<10; ++n)
std::cout << dis(gen) << ' '; //3rd line: Use dis to transform the random unsigned int generated by gen into an int in [1, 6]
std::cout << '\n';
}
Here are some questions I can think of:
1st line of code:
random_device is a class, as described by the documentation for random_device, so does this line declare an object rd? If yes, why do we pass rd() in the 2nd line to construct mt19937 instead of using the object rd (without parentheses)?
3rd line of code:
Why do we call the uniform_int_distribution<> object as dis()? Is dis() a function? Why do we pass the gen object into dis()?
random_device is slow but is intended to be genuinely random (where the platform provides such a source); it's used to generate the 'seed' for the random number sequence.
mt19937 is fast but only 'pseudo random'. It needs a 'seed' to start generating a sequence of numbers. That seed can be random (as in your example) so you get a different sequence of random numbers each time. But it could be a constant, so you get the same sequence of numbers each time.
uniform_int_distribution is a way of mapping random numbers (which could have any values) to the numbers you're actually interested in, in this case a uniform distribution of integers from 1 to 6.
As is often the case with OO programming, this code is about division of responsibilities. Each class contributes a small piece to the overall requirement (the generation of dice rolls). If you wanted to do something different it's easy because you've got all the pieces in front of you.
If this is too much then all you need to do is write a function to capture the overall effect, for instance
int dice_roll()
{
static std::random_device rd;
static std::mt19937 gen(rd());
static std::uniform_int_distribution<> dis(1, 6);
return dis(gen);
}
dis is an example of a function object or functor. It's an object which overloads operator() so it can be called as if it was a function.
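To make the functor idea concrete, here is a small sketch. Doubler and roll are hypothetical names made up for this illustration; they are not part of the standard library.

```cpp
#include <random>

// Hypothetical functor: an ordinary object that overloads operator(),
// so it can be "called" with function syntax.
struct Doubler {
    int operator()(int x) const { return 2 * x; }
};

// The distribution works the same way: dis(gen) is shorthand for
// dis.operator()(gen). The engine is passed by reference because
// drawing a number advances the engine's internal state.
template <class Engine>
int roll(Engine& gen) {
    std::uniform_int_distribution<> dis(1, 6);
    return dis(gen);
}
```

With this, Doubler twice; twice(21) evaluates to 42, which is the same mechanism as rd() in the 2nd line: rd is an object, and rd() calls its operator() to obtain one random number that is then used as the seed.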
std::random_device rd; // create access to truly random numbers
std::mt19937 gen{rd()}; // create pseudo random generator.
// initialize its seed to truly random number.
std::uniform_int_distribution<> dis{1, 6}; // define distribution
...
auto x = dis(gen); // generate pseudo random number from `gen`
// and transform its result to desired distribution `dis`.
The first one works, but the second one always returns the same value. Why would this happen and how am I supposed to fix this?
int main() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
for(int i = 0; i < 10; i++) {
std::cout << dis(gen) << std::endl;
}
return 0;
}
The one that doesn't work:
double generateRandomNumber() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
return dis(gen);
}
int main() {
for(int i = 0; i < 10; i++) {
std::cout << generateRandomNumber() << std::endl;
}
return 0;
}
What platform are you working on? std::random_device is allowed to be a pseudo-RNG if hardware or OS functionality to generate random numbers doesn't exist. It might initialize using the current time, in which case your calls might be too close together for the 'current time' to take on another value.
Nevertheless, as mentioned in the comments, it is not meant to be used this way. A simple fix will be to declare rd and gen as static. A proper fix would be to move the initialization of the RNG out of the function that requires the random numbers, so it can also be used by other functions that require random numbers.
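Both fixes might look roughly like this. The second overload, which takes the engine as a parameter, is just one way to "move the initialization out of the function"; its signature is my own assumption.

```cpp
#include <random>

// Simple fix: static locals are initialized once, on the first call,
// so the engine keeps its state between calls.
double generateRandomNumber() {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    static std::uniform_real_distribution<> dis(0.0, 1.0);
    return dis(gen);
}

// Alternative sketch: the caller owns the engine and passes it in,
// so other functions can share the same generator.
double generateRandomNumber(std::mt19937& gen) {
    std::uniform_real_distribution<> dis(0.0, 1.0);
    return dis(gen);
}
```

In main you would then create std::mt19937 gen once (seeded however you like) and call generateRandomNumber(gen) inside the loop.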
The first one uses the same generator for all the numbers, the second creates a new generator for each number.
Let's compare the differences between your two cases and see why this happening.
Case 1:
int main() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
for(int i = 0; i < 10; i++) {
std::cout << dis(gen) << std::endl;
}
return 0;
}
In your first case the program executes the main function, and the first thing that happens is that you create an instance of a std::random_device, a std::mt19937 and a std::uniform_real_distribution<> on the stack, all belonging to main()'s scope. Your Mersenne twister gen is initialized once with the result from your random device rd. You have also initialized your distribution dis to have the range of values from 0 to 1. These exist only once per run of your application.
Next, a for loop runs from index 0 to 9, and on each iteration you print the value produced by the distribution dis's operator()(), passing it your already seeded generator gen. Each call to dis(gen) produces a different value because gen was seeded only once and its internal state advances with every call.
Case 2:
double generateRandomNumber() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<> dis(0, 1);
return dis(gen);
}
int main() {
for(int i = 0; i < 10; i++) {
std::cout << generateRandomNumber() << std::endl;
}
return 0;
}
In this version of the code, let's see what's similar and what's different. Here the program executes and enters the main() function. This time the first thing it encounters is a for loop from 0 to 9, similar to the one in the main above, except that this loop is the first thing on main's stack. Inside it there is a call to cout to display the result of a user-defined function named generateRandomNumber(). This function is called a total of 10 times, and on each iteration of the for loop it gets its own stack memory, which is created on entry and destroyed on exit.
Now let's jump execution into this user defined function named generateRandomNumber().
The code looks almost exactly the same as it did before when it was in main() directly, but these variables live on generateRandomNumber()'s stack and have the lifetime of its scope instead. These variables are created and destroyed each time this function is entered and left. The other difference here is that this function also returns dis(gen).
Note: the function returns a double by value, so the caller receives a copy of the generated number; the compiler may optimize how that copy is made, but that does not change the result.
Finally, just before generateRandomNumber() returns, std::uniform_real_distribution<>'s operator()() is called; it runs in its own stack frame and scope, returns to generateRandomNumber() ever so briefly, and then control goes back to main.
-Visualizing The Differences-
As you can see, these two programs are quite different. If you want more visual proof, you can paste each program into any online compiler that shows the generated assembly and compare the two assembly versions to see their ultimate differences.
Another way to visualize the difference between these two programs is not only to see their assembly equivalents but to step through each program line by line with a debugger and keep an eye on the stack calls and the winding and unwinding of them and keep an eye of all values as they become initialized, returned and destroyed.
-Assessment and Reasoning-
The reason the first one works as expected is because your random device, your generator and your distribution all have the life time of main and your generator is seeded only once with your random device and you only have one distribution that you are using each time in the for loop.
In your second version main doesn't know anything about any of that; all it knows is that it is going through a for loop and sending returned data from a user function to cout. Each time it goes through the for loop this function is called, and its stack, as I said, is created and destroyed each time, so all of its variables are created and destroyed too. So in this instance you are creating and destroying 10 instances each of rd, gen(rd()), and dis(0,1).
-Conclusion-
There is more to this than what I have described above and the other part that pertains to the behavior of your random number generators is what was mentioned by user Kane in his statement to you from his comment to your question:
From en.cppreference.com/w/cpp/numeric/random/random_device:
"std::random_device may be implemented in terms of an
implementation-defined pseudo-random number engine [...].
In this case each std::random_device object may generate
the same number sequence."
Each time these objects are created and destroyed, you seed the generator again with a new random_device. However, if your particular machine or OS doesn't support random_device, it can end up using either some arbitrary value or the system clock as its seed value.
So let's say it does end up using the system clock: main()'s for loop executes so fast that all of the work done by the 10 calls to generateRandomNumber() has finished before a few milliseconds have passed. The time delta is so small that the same seed value is generated on each pass, and so the distributions return the same values.
Note that std::mt19937 gen(rd()) is very problematic. See this question, which says:
rd() returns a single unsigned int. This has at least 16 bits and probably 32. That's not enough to seed [this generator's huge state].
Using std::mt19937 gen(rd());gen() (seeding with 32 bits and looking at the first output) doesn't give a good output distribution. 7 and 13 can never be the first output. Two seeds produce 0. Twelve seeds produce 1226181350.
std::random_device can be, and sometimes is, implemented as a simple PRNG with a fixed seed. It might therefore produce the same sequence on every run.
Furthermore, random_device's approach to generating "nondeterministic" random numbers is "implementation-defined", and random_device allows the implementation to "employ a random number engine" if it can't generate "nondeterministic" random numbers due to "implementation limitations" ([rand.device]). (For example, under the C++ standard, an implementation might implement random_device using timestamps from the system clock, or using fast-moving cycle counters, since both are nondeterministic.)
An application should not blindly call random_device's generator (rd()) without also, at a minimum, calling the entropy() method, which gives an estimate of the implementation's entropy in bits.
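A rough sketch of such a check follows. The helper name make_engine_checked and the clock fallback are my own assumptions, not something the standard mandates. Also note that some standard library versions are reported to return 0 from entropy() even when a real source is available, so the value is better treated as a hint than as proof.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Sketch: consult entropy() before trusting random_device on its own.
std::mt19937 make_engine_checked() {
    std::random_device rd;
    std::uint32_t a = rd();
    std::uint32_t b = rd();
    std::uint32_t extra = 0;
    if (rd.entropy() == 0.0) {
        // The implementation reports a deterministic source. Mixing in a
        // clock reading is a weak fallback (assumption: acceptable only for
        // non-cryptographic use).
        extra = static_cast<std::uint32_t>(
            std::chrono::high_resolution_clock::now().time_since_epoch().count());
    }
    std::seed_seq seq{a, b, extra};
    return std::mt19937(seq);
}
```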
I created a function that is supposed to generate a set of normal random numbers from 0 to 1. However, it seems that each time I run the function the output is the same. I am not sure what is wrong.
Here is the code:
MatrixXd generateGaussianNoise(int n, int m){
MatrixXd M(n,m);
normal_distribution<double> nd(0.0, 1.0);
random_device rd;
mt19937 gen(rd());
for(int i = 0; i < n; i++){
for(int j = 0; j < m; j++){
M(i,j) = nd(gen);
}
}
return M;
}
The output when n = 4 and m = 1 is
0.414089
0.225568
0.413464
2.53933
I used the Eigen library for this, I am just wondering why each time I run it produces the same numbers.
From:
http://en.cppreference.com/w/cpp/numeric/random/random_device
std::random_device may be implemented in terms of an implementation-defined pseudo-random number engine if a non-deterministic source (e.g. a hardware device) is not available to the implementation. In this case each std::random_device object may generate the same number sequence.
Thus, I think you should look into what library stack you are actually using here, and what's known about random_device in your specific implementation.
I realize that this then might in fact be a duplicate of "Why do I get the same sequence for every run with std::random_device with mingw gcc4.8.1?".
Furthermore, it at least used to be the case that initializing a new mt19937 instance was kind of expensive. Thus, you have performance reasons, in addition to quality of randomness, not to re-initialize both your random_device and mt19937 instance on every function call. I would go for some kind of singleton here, unless you have very clear constraints (building in a library, unclear concurrency) that would make that an unsuitable choice.
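A singleton-style accessor could be sketched as below. To keep the example self-contained I use a std::vector<double> as a stand-in for Eigen's MatrixXd (that substitution and the name shared_engine are my assumptions). Note that this only avoids re-creating the engine on each call; if random_device itself is deterministic on your platform, the numbers will still repeat from run to run.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// One engine for the whole program, created and seeded on first use.
std::mt19937& shared_engine() {
    static std::mt19937 gen(std::random_device{}());
    return gen;
}

// Stand-in for the Eigen version: returns n*m samples in a flat buffer.
std::vector<double> generateGaussianNoise(int n, int m) {
    std::normal_distribution<double> nd(0.0, 1.0);
    std::vector<double> M(static_cast<std::size_t>(n) * static_cast<std::size_t>(m));
    for (double& v : M) {
        v = nd(shared_engine());  // every call continues the same sequence
    }
    return M;
}
```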
I am an experienced C programmer that is occasionally forced to use a little bit of C++.
I need to generate random numbers from a normal distribution with a variety of means and variances. If I had a C function that did this called normal(float mean, float var) then I could write the following code:
int i;
float sample;
for(i = 0;i < 1000;i++)
{
sample = normal(mean[i],variance[i]);
do_something_with_this_value(sample);
}
Note that there is a different mean and variance for each value of i.
C does not contain a function called normal, but C++ does; well, actually it's called std::normal_distribution. Unfortunately my C++ is not good enough to understand the syntax in the documentation. Can anyone tell me how to achieve the functionality of my C code using std::normal_distribution?
std::normal_distribution isn't a function but a class template.
You can use it like this:
#include <random>
int main(int, char**)
{
// random device class instance, source of 'true' randomness for initializing random seed
std::random_device rd;
// Mersenne twister PRNG, initialized with seed from previous random device instance
std::mt19937 gen(rd());
int i;
float sample;
for(i = 0; i < 1000; ++i)
{
// instance of class std::normal_distribution with specific mean and stddev
// (note: the second argument is the standard deviation, i.e. sqrt(variance), not the variance)
std::normal_distribution<float> d(mean[i], stddev[i]);
// get random number with normal distribution using gen as random source
sample = d(gen);
// profit
do_something_with_this_value(sample);
}
return 0;
}
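If you would rather keep the C-style call from the question, a thin wrapper along these lines might do. This is a sketch: the name normal and the variance parameter come from the question, while the function-local static engine is my own choice.

```cpp
#include <cmath>
#include <random>

// Sketch of the normal(mean, var) function from the question.
// std::normal_distribution expects a standard deviation, so the
// variance is converted with sqrt.
float normal(float mean, float var) {
    // Created and seeded once, then reused on every call.
    static std::mt19937 gen(std::random_device{}());
    std::normal_distribution<float> d(mean, std::sqrt(var));
    return d(gen);
}
```

The loop from the question then works unchanged: sample = normal(mean[i], variance[i]);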