What kind of random number generator does std::normal_distribution use?
Is it suitable for scientific simulation applications?
Regards
std::normal_distribution doesn't do any random number generation. It is a random number distribution. Random number distributions only map values returned by a random number engine to some kind of distribution. They don't do any generation themselves. So it is the random number engine that you care about.
One of the random number engines provided by the standard, std::mersenne_twister_engine, is of very high quality. You can use it to generate normally distributed random numbers like so:
// requires #include <random> and #include <iostream>
std::random_device rd;
std::mt19937 gen(rd()); // Create and seed the generator
std::normal_distribution<> d(mean, deviation); // Create the distribution with your mean and standard deviation
std::cout << d(gen) << std::endl; // Generate a random number according to the distribution
Note that std::mt19937 is a typedef for a particular specialization of std::mersenne_twister_engine.
The whole point of the <random> standard library is to separate distributions from random number generators. You supply a random number generator that generates uniform integers, and the distribution takes care of transforming that random, uniform integer sequence into a sample of the desired distribution.
Fortunately, the <random> library also contains a collection of random number generators. The Mersenne Twister (std::mt19937) in particular is a relatively good (i.e. fast and statistically high quality) one.
(You also need to provide a seed for the generator.)
I know the post is old; however, I hope my answer is beneficial. I use normal_distribution to generate Gaussian noise for a sensor, which is useful for simulating sensors. For example, say you have a sensor that gives you the position of a robot in 2D. Every time you move the robot, the sensor gives you some readings of its position. In OpenGL, you can simulate this: track the position of the mouse and add some Gaussian noise to the real position. You then have a sensor that tracks the position of the mouse, but its readings have uncertainty due to the noise.
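As a rough sketch of this idea (the class name, seed, and noise level are illustrative assumptions, not from any particular framework), the "sensor" below just adds independent Gaussian noise to the true 2D position it is given:

#include <random>

struct NoisySensor2D
{
    std::mt19937 gen;                        // self-contained engine for this sensor
    std::normal_distribution<double> noise;  // zero-mean Gaussian noise

    NoisySensor2D(unsigned seed, double stddev)
        : gen(seed), noise(0.0, stddev) {}

    // Return a "measured" position: the true position plus independent noise on each axis.
    void measure(double trueX, double trueY, double& measX, double& measY)
    {
        measX = trueX + noise(gen);
        measY = trueY + noise(gen);
    }
};

Feeding measure() the current mouse coordinates each frame would give the kind of noisy readings described above.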
According to http://www.cplusplus.com/reference/cstdlib/rand/:
In C, the generation algorithm used by rand is guaranteed to only be advanced by calls to this function. In C++, this constraint is relaxed, and a library implementation is allowed to advance the generator on other circumstances (such as calls to elements of <random>).
But then over here it says:
The function accesses and modifies internal state objects, which may cause data races with concurrent calls to rand or srand. Some libraries provide an alternative function that explicitly avoids this kind of data race: rand_r (non-portable). C++ library implementations are allowed to guarantee no data races for calling this function.
Ideally I would like to have some kind of "instance" of rand, so that for that instance and a given seed, I always generate the same sequence of numbers for calls to THAT instance. With the current versions it seems that on some platforms, calls to rand() from other functions (perhaps even on different threads) could affect the sequence of numbers generated in my thread by my code.
Is there an alternative, where I can hold on to some kind of "instance", where I am guaranteed to generate a particular sequence given a seed, and where calls to other "instances" do not affect it?
EDIT: For clarity - my code is going to run on multiple different platforms (iOS, Android, Windows 8.1, Windows 10, Linux etc), and it isn't possible for me to test every implementation at present. I would just like to implement things based on what is guaranteed by the standard...
You can make use of std::uniform_int_distribution and std::mt19937 to keep a generator with your own seed (both from the <random> library).
#include <random>

std::mt19937 gen(SEED); // Engine seeded with your chosen seed
std::uniform_int_distribution<> dis(MIN, MAX); // Produces integers in the inclusive range [MIN, MAX]
auto random_number = dis(gen);
Here, SEED is the seed number you want to specify. You can also set another seed later with the seed() member function:
std::mt19937 gen{};
gen.seed(SEED);
If you need to generate one, you can use std::random_device for that:
std::random_device rd{};
std::mt19937 gen(rd());
The (MIN, MAX) arguments set the range of values the distribution can come up with, which means it will never generate a value bigger than MAX or smaller than MIN.
Finally, you can use your generator with this distribution to produce the random values you want, like so: dis(gen). The distribution can take any generator, so if you want other distributions to draw from the same sequence of random numbers, you can make a copy of gen, or construct two or more generators from the same seed (see the sketch below).
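Here is a minimal sketch of that per-instance guarantee (the seed value 42 is just an example): two engines constructed with the same seed produce exactly the same sequence, and advancing one never disturbs the other.

#include <cassert>
#include <random>

int main()
{
    std::mt19937 a(42);
    std::mt19937 b(42);

    std::mt19937::result_type first_from_a = a(); // advance 'a'
    a();                                          // advance 'a' again; 'b' is untouched

    std::mt19937::result_type first_from_b = b(); // 'b' still starts its own sequence
    assert(first_from_a == first_from_b);         // identical seeds, identical sequences
    return 0;
}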
Use random() instead of rand().
https://www.securecoding.cert.org/confluence/display/c/MSC30-C.+Do+not+use+the+rand%28%29+function+for+generating+pseudorandom+numbers
https://www.securecoding.cert.org/confluence/display/c/CON33-C.+Avoid+race+conditions+when+using+library+functions
Why is std::uniform_real_distribution better than rand() as the random number generator? Can someone give an example please?
First, it should be made clear that the proposed comparison is nonsensical.
uniform_real_distribution is not a random number generator. You cannot produce random numbers from a uniform_real_distribution without having a random number generator that you pass to its operator(). uniform_real_distribution "shapes" the output of that random number generator into a uniform real distribution. You can plug various kinds of random number generators into a distribution.
I don't think this makes for a decent comparison, so I will be comparing the use of uniform_real_distribution with a C++11 random number generator against rand() instead.
Another obvious difference that makes the comparison even less useful is the fact that uniform_real_distribution is used to produce floating point numbers, while rand() produces integers.
That said, there are several reasons to prefer the new facilities.
rand() relies on global state, while the facilities from <random> involve no global state at all: you can have as many generators and distributions as you want, and they are all independent of each other.
rand() has no specification about the quality of the sequence generated. The random number generators from C++11 are all well-specified, and so are the distributions. rand() implementations can be, and in practice have been, of very poor quality, and not very uniform.
rand() provides a random number within a predefined range. It is up to the programmer to adjust that range to the desired range. This is not a simple task. No, it is not enough to use % something. Doing this kind of adjustment in such a naive manner will most likely destroy whatever uniformity was there in the original sequence. uniform_real_distribution does this range adjustment for you, correctly.
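As a small illustration of the range-adjustment point, here is a hedged sketch contrasting the naive modulo mapping with letting a distribution do the work (the ranges chosen are illustrative):

#include <cstdlib>
#include <random>

int main()
{
    // Naive and biased: when 100 does not evenly divide RAND_MAX + 1,
    // some values in [0, 99] are slightly more likely than others.
    int biased = std::rand() % 100;

    // Correct: the distribution maps the engine's output uniformly onto the range.
    std::mt19937 gen(std::random_device{}());
    std::uniform_int_distribution<int> ints(0, 99);            // uniform integers in [0, 99]
    std::uniform_real_distribution<double> reals(10.0, 20.0);  // uniform reals in [10, 20)

    int ok_int = ints(gen);
    double ok_real = reals(gen);
    (void)biased; (void)ok_int; (void)ok_real;
    return 0;
}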
The real comparison is between rand and one of the random number engines provided by the C++11 standard library. std::uniform_real_distribution just distributes the output of an engine according to some parameters (for example, real values between 10 and 20). You could just as well make an engine that uses rand behind the scenes.
Now the difference between the standard library random number engines and using plain old rand is in guarantee and flexibility. rand provides no guarantee for the quality of the random numbers - in fact, many implementations have shortcomings in their distribution and period. If you want some high quality random numbers, rand just won't do. However, the quality of the random number engines is defined by their algorithms. When you use std::mt19937, you know exactly what you're getting from this thoroughly tested and analysed algorithm. Different engines have different qualities that you may prefer (space efficiency, time efficiency, etc.) and are all configurable.
This is not to say you should fall back on rand when you don't care too much about quality: you might as well just start using the random number generation facilities from C++11 right away. There's no downside.
The reason is actually in the name of the class: the uniformity of the distribution of random numbers is better with std::uniform_real_distribution than with the roughly uniform numbers that rand() provides.
The distribution for std::uniform_real_distribution is, of course, over a given interval [a,b).
Essentially, this means that when you ask std::uniform_real_distribution for a random number between 1 and 10, every value in that interval is equally likely, whereas if you do it with rand() (calling it several times and mapping the results into that range), the probability of getting 5 rather than 9 may be different.
I use random numbers in several places and usually construct a random number generator whenever I need it. Currently I use the Marsaglia Xorshift algorithm seeding it with the current system time.
Now I have some doubts about this strategy:
If I use several generators, the independence (randomness) of the numbers between the generators depends on the seeds (the same seed gives the same sequence). Since I use the time (in ns) as the seed, and since this time changes, this works, but I am wondering whether it would be better to use only one single generator and, e.g., make it available as a singleton. Would this increase the random number quality?
Edit: Unfortunately C++11 is not an option yet.
Edit: To be more specific: I am not suggesting that the singleton itself would increase the random number quality, but rather the fact that only one generator is used and seeded. Otherwise I have to be sure that the seeds of the different generators are independent (random) from one another.
Extreme example: I seed two generators with exactly the same number -> no randomness between them
Suppose you have several variables, each of which needs to be random, independent from the others, and will be regularly reassigned with a new random value from some random generator. This happens quite often with Monte Carlo analysis, and in games (although the rigor for games is much less than it is for Monte Carlo). If a perfect random number generator existed, it would be fine to use a single instantiation of it: assign the nth pseudorandom number from the generator to variable x1, the next to variable x2, the next to x3, and so on, eventually coming back to x1 on the next cycle around. There's a problem here: far too many PRNGs fail the independence test when used this way, and some even fail randomness tests on the individual sequences.
My approach is to use a single PRNG as a seed generator for a set of N instances of self-contained PRNGs. Each instance of these latter PRNGs feeds a single variable. By self-contained, I mean that the PRNG is an object, with state maintained in instance members rather than in static members or global variables. The seed generator doesn't even need to be from the same family as those other N PRNGs. It just needs to be reentrant in case multiple threads are simultaneously trying to use it. However, in my uses I find that it is best to set up the PRNGs before threading starts so as to guarantee repeatability. That's one run, one execution. Monte Carlo techniques typically need thousands of executions, maybe more, maybe a lot more. With Monte Carlo, repeatability is essential. So yet another random seed generator is needed: this one seeds the seed generator used to create the N generators for the variables. A sketch of this setup follows below.
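A minimal sketch of this setup, assuming a Marsaglia xorshift32 step and 32-bit unsigned ints (class and variable names are illustrative, not from any library): one master engine hands out seeds, and each variable draws from its own self-contained engine.

#include <cstddef>
#include <vector>

class Xorshift32
{
public:
    explicit Xorshift32(unsigned int seed) : state_(seed ? seed : 1u) {} // state must be nonzero

    unsigned int next()
    {
        // Marsaglia's xorshift32 step
        state_ ^= state_ << 13;
        state_ ^= state_ >> 17;
        state_ ^= state_ << 5;
        return state_;
    }

private:
    unsigned int state_;
};

int main()
{
    Xorshift32 seeder(12345u);           // master seed generator (record this for repeatability)
    std::vector<Xorshift32> perVariable; // one independent engine per variable
    for (std::size_t i = 0; i < 8; ++i)
        perVariable.push_back(Xorshift32(seeder.next()));

    unsigned int x1 = perVariable[0].next(); // each variable draws only from its own engine
    unsigned int x2 = perVariable[1].next();
    (void)x1; (void)x2;
    return 0;
}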
Repeatability is important, at least in the Monte Carlo world. Suppose run number 10234 of a long Monte Carlo simulation results in some massive failure. It would be nice to see what in the world happened. It might have been a statistical fluke, it might have been a problem. The problem is that in a typical MC setup, only the bare minimum of data are recorded, just enough for computing statistics. To see what happened in run number 10234, one needs to repeat that particular case but now record everything.
You should use the same instance of your random generator class whenever the clients are interrelated and the code needs "independent" random numbers.
You can use different objects of your random generator class when the clients do not depend on each other and it does not matter whether they receive the same numbers or not.
Note that for testing and debugging it is very useful to be able to create the same sequence of random numbers again. Therefore you should not "randomly seed" too much.
I don't think it increases the randomness, but it does save the memory and overhead of creating a new object every time you want to use the random generator. If the generator doesn't have any instance-specific settings, you can make it a singleton.
Since I use the time (ns) as seed and since this time changes this works but I am wondering whether it would not be better to use only one singular generator and e.g. to make it available as a singleton.
This is a good example of a case where the singleton is not an anti-pattern. You could also use some kind of inversion of control.
Would this increase the random number quality ?
No. The quality depends on the algorithm that generates the random numbers. How you use it is irrelevant (assuming it is used correctly).
To your edit: you could create some kind of container that holds objects of your RNG classes (or use existing containers). Something like this:
#include <vector>

struct Rng
{
    void SetSeed( const int seed );
    int GenerateNumber() const;
    //...
};

std::vector< Rng > & RngSingleton()
{
    static std::vector< Rng > allRngs( 2 );
    return allRngs;
}

// ...

RngSingleton().at(0).SetSeed( 55 );
RngSingleton().at(1).SetSeed( 55 );
//...
const auto value1 = RngSingleton().at(0).GenerateNumber();
const auto value2 = RngSingleton().at(1).GenerateNumber();
Factory pattern to the rescue.
A client should never have to worry about the instantiation rules of its dependencies.
It allows for swapping creation methods, and the other way around: if you decide to use a different algorithm, you can swap the generator class and the clients need no refactoring.
http://www.oodesign.com/factory-pattern.html
EDIT: Added pseudocode (sorry, it's not C++; it's been waaaaaay too long since I last worked in it).
interface PRNG{
    function generateRandomNumber():Number;
}

interface Seeder{
    function getSeed() : Number;
}

interface PRNGFactory{
    function createPRNG():PRNG;
}

class MarsagliaPRNG implements PRNG{
    constructor( seed : Number ){
        //store seed
    }
    function generateRandomNumber() : Number{
        //do your magic
    }
}

class SingletonMarsagliaPRNGFactory implements PRNGFactory{
    var seeder : Seeder;
    static var prng : PRNG;
    function createPRNG() : PRNG{
        return prng ||= new MarsagliaPRNG( seeder.getSeed() );
    }
}

class TimeSeeder implements Seeder{
    function getSeed():Number{
        return now();
    }
}

//usage:
seeder : Seeder = new TimeSeeder();
prngFactory : PRNGFactory = new SingletonMarsagliaPRNGFactory();

clientA.prng = prngFactory.createPRNG();
clientB.prng = prngFactory.createPRNG();
//both clients got the same instance.
The big advantage is now that if you want/need to change any of the implementation details, nothing has to change in the clients. You can change seeding method, RNG algorithm and the instantiation rule w/o having to touch any client anywhere.
I'm looking for a way to generate a set of integers with a specified mean and std. deviation.
Using the random library, it is possible to generate a set of random doubles distributed in Gaussian fashion; it would look something like this:
#include <tr1/random>

const int N = 1000;            // number of samples (example size)
double gaussiannums[N];

std::tr1::normal_distribution<double> normal(mean, stdDev);
std::tr1::ranlux64_base_01 eng;
eng.seed(1000);
for (int i = 0; i < N; i++)
{
    gaussiannums[i] = normal(eng);
}
However, for my application, I need integers instead of doubles. So my question is, how would you generate the equivalent of the above but for integers instead of doubles? One possible path to take is to convert the doubles into integers in some fashion, but I don't know enough about how the random library works to know whether this can be done in a fashion that really preserves the bell shape and the mean/std. deviation.
I should mention that the goal here is not so much randomness, as it is to get a set of integers of a specific size, with the correct mean and std. deviation.
Ideally I would also like to specify the minimum and maximum values that can be produced, but I have not found any way to do this even for doubles, so any suggestions on this are also welcome.
This isn't possible.
The Gaussian distribution is continuous; the set of integers is discrete.
The Gaussian pdf also has unbounded support: if you specify a minimum and maximum, you end up with a different distribution.
What are you really trying to do? Is it only the mean and standard deviation that count? Other distributions have a well-defined mean and standard-deviation, including several discrete distributions.
For example, you could use a binomial distribution.
Solve the equations for the mean and variance (mean = np, variance = np(1-p)) simultaneously to get p and n. Then generate samples from this distribution.
If n doesn't come out integer, you can use a multinomial distribution instead.
Although wikipedia describes methods for sampling from a binomial or multinomial distribution, they aren't particularly efficient. There's a method for efficiently generating samples from an arbitrary discrete distribution which you can use here.
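For illustration, here is a hedged sketch of the binomial suggestion, written against the C++11 <random> interface (TR1 provides a similar binomial_distribution). Since a binomial(n, p) distribution has mean np and variance np(1-p), a target mean m and standard deviation s give p = 1 - s*s/m and n = m/p; the target values below are examples, n is rounded, and the approach requires s*s < m.

#include <iostream>
#include <random>

int main()
{
    const double m = 50.0; // desired mean
    const double s = 5.0;  // desired standard deviation

    const double p = 1.0 - (s * s) / m;             // p = 1 - variance/mean
    const int    n = static_cast<int>(m / p + 0.5); // round n to the nearest integer

    std::mt19937 gen(std::random_device{}());
    std::binomial_distribution<int> dist(n, p);

    for (int i = 0; i < 10; ++i)
        std::cout << dist(gen) << ' ';              // integers in [0, n], roughly mean 50, stddev 5
    std::cout << '\n';
    return 0;
}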
In the comments, you clarified that you want a bell-shaped distribution with specific mean and standard deviation and bounded support. So we'll use the Gaussian as a starting point:
compute a gaussian CDF across the range of integers you're interested in
offset and scale it slightly to account for the missing tails (so it varies from 0 to 1)
store it in an array
To sample from this distribution:
generate uniform reals in the range [0:1]
use binary search to invert the CDF
As the truncation step will reduce the standard deviation slightly (and affect the mean also, if the minimum and maximum aren't equidistant from the chosen mean) you may have to tweak the Gaussian parameters slightly beforehand.
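A minimal sketch of this recipe (all parameter values are illustrative): tabulate a Gaussian CDF over the integer range, rescale it so it spans [0, 1] despite the truncated tails, and invert it with a binary search on a uniform draw.

#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

int main()
{
    const int    lo = 0, hi = 100;          // desired integer range
    const double mean = 50.0, sigma = 10.0; // Gaussian parameters (tweak as discussed above)

    // CDF evaluated at the upper edge of each integer bin.
    std::vector<double> cdf;
    for (int k = lo; k <= hi; ++k)
        cdf.push_back(0.5 * (1.0 + std::erf((k + 0.5 - mean) / (sigma * std::sqrt(2.0)))));

    // Offset and scale so the table spans exactly [0, 1], compensating for the missing tails.
    const double low  = 0.5 * (1.0 + std::erf((lo - 0.5 - mean) / (sigma * std::sqrt(2.0))));
    const double high = cdf.back();
    for (double& c : cdf)
        c = (c - low) / (high - low);

    std::mt19937 gen(std::random_device{}());
    std::uniform_real_distribution<double> uni(0.0, 1.0);

    // Sample: draw u in [0, 1) and binary-search for the first bin whose CDF is >= u.
    double u = uni(gen);
    int sample = lo + static_cast<int>(std::lower_bound(cdf.begin(), cdf.end(), u) - cdf.begin());
    (void)sample;
    return 0;
}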
I am using the random number generator provided with the C++ standard library. How do we bias it so that it produces smaller random numbers with a greater probability than larger random numbers?
One simple way would be to take every random number generated in the range [0,1) and raise it to a power greater than 1, depending on how skewed you want the results (see the sketch below).
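A minimal sketch of that idea (the exponent 2.0 and the final scaling to [0, 100) are illustrative): u is uniform on [0, 1), and raising it to a power greater than 1 pushes the probability mass toward 0, so smaller results become more likely.

#include <cmath>
#include <random>

int main()
{
    std::mt19937 gen(std::random_device{}());
    std::uniform_real_distribution<double> uni(0.0, 1.0);

    double u = uni(gen);
    double skewed = std::pow(u, 2.0); // smaller values are now more likely than larger ones
    double value = skewed * 100.0;    // map the result into the range you actually need
    (void)value;
    return 0;
}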
Well, in this case you probably want a certain probability distribution. You can generate any distribution from a uniform random number generator; the question is only what it should look like. Rejection sampling is a common way of generating distributions that are hard to describe otherwise, but in your case something simpler might suffice.
You can take a look at this article for many common distribution functions. Chi, Chi-Square and Exponential look like good candidates.
Use std::discrete_distribution to calculate random numbers with a skewed probability distribution. See example here:
http://www.cplusplus.com/reference/random/discrete_distribution/
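For example, a sketch along those lines (the weights below are illustrative; pick whatever skew you need): each weight is the relative probability of the corresponding index, so 0 comes up five times as often as 4.

#include <iostream>
#include <random>

int main()
{
    std::mt19937 gen(std::random_device{}());
    std::discrete_distribution<int> dist({5.0, 4.0, 3.0, 2.0, 1.0});

    for (int i = 0; i < 10; ++i)
        std::cout << dist(gen) << ' '; // values 0..4, with smaller values more frequent
    std::cout << '\n';
    return 0;
}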