Do std::random_device and std::mt19937 follow a uniform distribution? - c++

I'm trying to convert this line of MATLAB to C++: rp = randperm(p);
Following the randperm documentation:
randperm uses the same random number generator as rand
And in rand page:
rand returns a single uniformly distributed random number
So rand follows a uniform distribution. My C++ code is based on:
std::random_device rd;
std::mt19937 g(rd());
std::shuffle(... , ... ,g);
My question is: does the code above follow a uniform distribution? If not, how can I make it do so?

The different classes from the C++ random number library roughly work as follows:
std::random_device is a uniformly-distributed random number generator that may access a hardware device in your system, or something like /dev/random on Linux. It is usually just used to seed a pseudo-random generator, since the underlying device will usually run out of entropy quickly.
std::mt19937 is a fast pseudo-random number generator using the Mersenne Twister engine which, according to the original authors' paper title, is also uniform. It generates pseudo-random 32-bit unsigned integers (std::mt19937_64 is the 64-bit variant). Since std::random_device is only used to seed this generator, it does not have to be uniform itself (e.g., you often seed the generator using a current time stamp, which is definitely not uniformly distributed).
Typically, you use a generator such as std::mt19937 to feed a particular distribution, e.g. a std::uniform_int_distribution or std::normal_distribution, which then shapes the generator's output into the desired distribution.
std::shuffle, according to the documentation,
Reorders the elements in the given range [first, last) such that each possible permutation of those elements has equal probability of appearance.
In your code example, you use the std::mt19937 PRNG to feed std::shuffle. So, std::mt19937 is uniform, and std::shuffle should also behave uniformly. So, everything is as uniform as it can be.
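For reference, here is a minimal sketch of a randperm-like function built from the pieces above. The name randperm and the 1-based numbering are my own choices to mirror MATLAB, not something the standard library provides. The exact permutation will not match MATLAB's output, and it can differ between standard library implementations, because the algorithm behind std::shuffle is not specified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper mirroring MATLAB's randperm(p):
// returns the integers 1..p in a random order.
template <class Urbg>
std::vector<int> randperm(int p, Urbg& gen) {
    assert(p >= 0);                         // a permutation needs a non-negative length
    std::vector<int> v(static_cast<std::size_t>(p));
    std::iota(v.begin(), v.end(), 1);       // fill with 1, 2, ..., p (MATLAB is 1-based)
    std::shuffle(v.begin(), v.end(), gen);  // every permutation is equally likely
    return v;
}
```

Typical use would be std::random_device rd; std::mt19937 g(rd()); auto rp = randperm(10, g);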

Related

Match pseudo random numbers between MT19937 CPU and GPU

I am studying the behaviour of CURAND_RNG_PSEUDO_MT19937, specifically in order to match numbers generated by the standard CPU implementation of Mersenne Twister (std::mt19937 or boost::random::mt19937).
I read in the documentation that cuRand MT19937 “has the same parameters as CPU version, but ordering is different […] Output is generated by 8192 independent generators. Each generator generates consecutive subsequence of the original sequence. […] Results are permuted differently than originally to achieve higher performance.”
Checking the unsigned int output sequences of std::mt19937 and cuRand MT19937 with the same seed, only the first number is equal and immediately the two generators diverge.
Consider that in my CPU environment, I have a distributed computation that instantiates n std::mt19937 engines with incremental seeds (s+1, s+2, etc.). Do you know if there is a way to modify the cuRand MT19937 generators in order to match my workflow?
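To make the CPU side of this workflow concrete, here is a rough sketch of it. The helper names make_engines and first_outputs are hypothetical, and this only produces the reference sequences on the CPU; it does not touch cuRand. One fixed point that can be used to sanity-check any MT19937 implementation is that the 10000th output of a default-constructed std::mt19937 is required by the standard to be 4123659995.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: builds n engines seeded with s+1, s+2, ..., s+n.
std::vector<std::mt19937> make_engines(std::uint32_t s, std::size_t n) {
    std::vector<std::mt19937> engines;
    engines.reserve(n);
    for (std::size_t i = 1; i <= n; ++i) {
        engines.emplace_back(s + static_cast<std::uint32_t>(i));
    }
    assert(engines.size() == n);
    return engines;
}

// Hypothetical helper: the first `count` raw 32-bit outputs of one engine,
// taken from a copy so the caller's engine state is left untouched.
std::vector<std::uint32_t> first_outputs(std::mt19937 engine, std::size_t count) {
    std::vector<std::uint32_t> out(count);
    for (auto& x : out) x = static_cast<std::uint32_t>(engine());
    return out;
}
```

Dumping first_outputs(engines[k], m) for each k gives the sequences that a GPU implementation would have to reproduce.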
Thanks

Distribution of random seeds generated by std::seed_seq class

While I understand that std::seed_seq should be used to generate a random seed for a C++ random number generator to generate e.g. uniformly distributed random numbers, I am curious how uniform the random seed generated by std::seed_seq is, given different sequences of 32-bit integers to construct a seed_seq instance.
The documentation says it will "produce values distributed over the entire 32-bit range even if the consumed values are close" but doesn't say anything about the distribution. I want to use std::seed_seq directly as an integer random number generator, using the sequence as a seed, and it would be nice to know the distribution.
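For context, this is roughly what using std::seed_seq directly looks like. The helper name expand_seed is hypothetical. Note that std::seed_seq has no operator(), min() or max(), so it cannot be handed to a distribution like an engine can; the only way to get numbers out of it is generate(). The sketch below shows the mechanics only and says nothing about how uniform the output is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: expands a short list of 32-bit inputs into `count`
// 32-bit values using std::seed_seq::generate.
std::vector<std::uint32_t> expand_seed(const std::vector<std::uint32_t>& input,
                                       std::size_t count) {
    std::seed_seq seq(input.begin(), input.end());
    std::vector<std::uint32_t> out(count);
    seq.generate(out.begin(), out.end());   // deterministic for a given input
    assert(out.size() == count);
    return out;
}
```

The same input always yields the same output, so one way to study the distribution empirically would be to call this for many nearby inputs and histogram the results.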

Boost Mersenne Twister / 53 bit precision double random value

The Boost library has a Mersenne Twister random number generator, and using the Boost Random library I can convert that into a double value.
boost::random::mt19937 rng; // produces randomness out of thin air
// see pseudo-random number generators
boost::random::uniform_real_distribution<> dblvals(0,1);
// distribution that maps to 0..1
// see random number distributions
double x = dblvals(rng); // get the number
Internally it looks like it is using an acceptance / rejection method to generate the random number.
Since the underlying integer used to create the double is 32 bits wide, I think this means I get a random number with 32-bit resolution, in other words 32 bits' worth of randomness.
The original mt19937ar.c had a function called genrand_res53() which generated a random number with 53-bit resolution (using two 32-bit integers). Is there a way to do this in Boost?
If you have to use boost you can use boost::random::mt19937_64 to get 64 bits of randomness. If you have access to C++11 or higher you can also use std::mt19937_64, which will also give you 64 random bits.
I will note that per boost's listing boost::random::mt19937_64 runs about 2.5 times slower than boost::random::mt19937, and that is probably mirrored in its standard equivalent. If speed is a factor then this could come into play.
Similarly to what C++ now offers (since C++11), there is mt19937_64 in boost, take a look here.
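If you want the 53-bit behaviour explicitly, here is a small sketch of both routes using the standard engines. The first function follows the arithmetic of genrand_res53() from mt19937ar.c (two 32-bit draws); the second keeps the top 53 bits of one 64-bit draw. The function names are my own, and I'm assuming the boost engines can be swapped in since they expose the same operator(), but I have not checked that here.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Two 32-bit draws combined into a double in [0, 1) with 53-bit resolution,
// following the arithmetic of genrand_res53() in mt19937ar.c.
double res53(std::mt19937& gen) {
    const std::uint32_t a = static_cast<std::uint32_t>(gen()) >> 5;  // keep top 27 bits
    const std::uint32_t b = static_cast<std::uint32_t>(gen()) >> 6;  // keep top 26 bits
    // (a * 2^26 + b) / 2^53
    const double x = (a * 67108864.0 + b) * (1.0 / 9007199254740992.0);
    assert(x >= 0.0 && x < 1.0);
    return x;
}

// Alternative: one 64-bit draw, keep the top 53 bits and scale by 2^-53.
double res53_from64(std::mt19937_64& gen) {
    return static_cast<double>(gen() >> 11) * (1.0 / 9007199254740992.0);
}
```

Both return values in [0, 1); the largest possible result is (2^53 - 1) / 2^53.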

Is it possible to set a deterministic seed for boost::random::uniform_int_distribution<>?

I'm using boost::random::uniform_int_distribution<boost::multiprecision::uint256_t> to generate some unit tests. Notice that I'm using multiprecision, which is why I need to use boost and not the standard library. For my periodic tests, I need to generate deterministic results from a nondeterministic seed, but in such a way that I can reproduce the results later in case the tests fail.
So, I would generate a true random number, use it as a seed, and inject that into uniform_int_distribution. The purpose is that if this fails, I'll be able to reproduce the problem with the same seed that made the tests fail.
Does this part of boost support generating seed-based random numbers in its interface? If not, is there any other way to do this?
The way I generate random numbers currently is:
boost::random::random_device gen;
boost::random::uniform_int_distribution<boost::multiprecision::uint256_t> dist{100, 1000};
auto random_num = dist(gen);
PS: Please be aware that the primary requirement is to support multiprecision. I require numbers that range from 16 bits to 512 bits. This is for tests, so performance is not really a requirement. I'm OK with generating large random numbers in other ways and converting them to boost::multiprecision.
The boost::random::random_device is a Non-deterministic Uniform Random Number Generator, a true random number generator. Unless you need real non-deterministic random numbers, you could use a Pseudo-Random Number Generator (at least for testing purposes), which can be seeded. One well-known Pseudo-Random Number Generator is the Mersenne Twister, boost::random::mt19937.
This generator usually gets seeded by a real random number, which you could print for reproducibility in your unit tests:
auto seed = boost::random::random_device{}();
std::cout << "Using seed: " << seed << '\n';
boost::random::mt19937 gen{ seed };
boost::random::uniform_int_distribution<boost::multiprecision::uint256_t> dist{100, 1000};
auto random_num = dist(gen);
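The replay side of this pattern can be sketched as follows. To keep the example free of dependencies it uses the standard library engine and an int distribution instead of the boost multiprecision types, so treat it as an illustration of the seed handling only. The helper names pick_seed and make_test_values are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: reuse a recorded seed when replaying a failed run,
// otherwise draw a fresh non-deterministic one.
std::uint32_t pick_seed(bool replay, std::uint32_t recorded) {
    if (replay) return recorded;
    return static_cast<std::uint32_t>(std::random_device{}());
}

// Hypothetical helper: all test inputs are derived from the seed alone,
// so the same seed reproduces the same inputs on the same implementation.
std::vector<int> make_test_values(std::uint32_t seed, std::size_t count) {
    std::mt19937 gen{seed};
    std::uniform_int_distribution<int> dist{100, 1000};
    std::vector<int> values(count);
    for (int& v : values) v = dist(gen);
    for (int v : values) assert(v >= 100 && v <= 1000);
    return values;
}
```

On a normal run you would log the value returned by pick_seed(false, 0); to reproduce a failure, pass the logged value back in with pick_seed(true, logged_seed).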

how to generate uncorrelated random sequences using c++

I'd like to generate two sequences of uncorrelated normal distributed random numbers X1, X2.
As normal distributed random numbers come from uniform numbers, all I need is two uncorrelated uniform sequences. But how to do it using:
srand (time(NULL));
I guess I need to seed twice or do something similar?
Since the random numbers generated by a high-quality random-number generator are uniform and independent, you can generate as many independent sequences from it as you like.
You do not need to, and should not, seed two different generators.
In C++11 and later, you should use a pseudo-random number generator from the header <random>. Here's a minimal example that can serve as a template for an actual implementation:
std::random_device seed;
std::mt19937 gen{seed()};
std::normal_distribution<> dist1{mean1, sd1};
std::normal_distribution<> dist2{mean2, sd2};
Now you can generate independent sequences of numbers by calling dist1(gen) and dist2(gen). The random_device is used to seed the actual generator, which in my code is a Mersenne Twister generator. This type of generator is efficient and has good statistical properties. It should be considered the default choice for a (non cryptographically secure) generator.
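As a fuller sketch of the same idea, the following fills two sequences from a single generator and computes their sample correlation as a sanity check. The means and standard deviations (0, 1 and 5, 2) are placeholder values I picked for illustration, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Draws two normal sequences of length n from ONE shared generator.
// The distribution parameters are illustrative placeholders.
std::pair<std::vector<double>, std::vector<double>>
two_normal_sequences(std::mt19937& gen, std::size_t n) {
    std::normal_distribution<> dist1{0.0, 1.0};
    std::normal_distribution<> dist2{5.0, 2.0};
    std::vector<double> x1(n), x2(n);
    for (std::size_t i = 0; i < n; ++i) {
        x1[i] = dist1(gen);
        x2[i] = dist2(gen);
    }
    return {x1, x2};
}

// Pearson sample correlation coefficient of two equally long sequences.
double sample_correlation(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size() && !a.empty());
    const double n = static_cast<double>(a.size());
    double ma = 0.0, mb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) { ma += a[i]; mb += b[i]; }
    ma /= n;
    mb /= n;
    double sab = 0.0, saa = 0.0, sbb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - ma, db = b[i] - mb;
        sab += da * db;
        saa += da * da;
        sbb += db * db;
    }
    return sab / std::sqrt(saa * sbb);
}
```

For long sequences the sample correlation should come out close to zero, which is what you would expect from uncorrelated draws.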
rand doesn't support generating more than a single sequence. It stores its state in a global variable. On some systems (namely POSIX-compliant ones) you can use rand_r to stay close to that approach. You'd simply use some initial seed as internal state for each. But since your question is tagged C++, I suggest you use the random number facilities introduced in C++11. Or, if C++11 is not an option, use the random module from boost.
A while ago I asked a similar question, Random numbers for multiple threads, the answers to which might be useful for you as well. They discuss various aspects of how to ensure that sequences are not interrelated, or at least not in an obvious way.
Use two random_devices (possibly with some use of engine) with a normal_distribution from <random>:
std::random_device rd1, rd2;
std::normal_distribution<> d; // the empty <> is required before C++17
double v1 = d(rd1);
double v2 = d(rd2);
...
See also example code at http://en.cppreference.com/w/cpp/numeric/random/normal_distribution