Boost Mersenne Twister / 53 bit precision double random value - c++

The Boost library has a Mersenne Twister random number generator, and using the Boost Random library I can convert that into a double value.
boost::random::mt19937 rng; // produces randomness out of thin air
// see pseudo-random number generators
boost::random::uniform_real_distribution<> dblvals(0,1);
// distribution that maps to 0..1
// see random number distributions
double x = dblvals(rng); // get the number
Internally it looks like it is using an acceptance / rejection method to generate the random number.
Since the underlying integer used to create the double is 32 bits, I think this means I get a random number with 32 bits of resolution, in other words 32 bits' worth of randomness.
The original mt19937ar.c had a function called genrand_res53() which generated a random number with 53-bit resolution (using two 32-bit integers). Is there a way to do this in Boost?

If you have to use Boost you can use boost::random::mt19937_64 to get 64 bits of randomness. If you have access to C++11 or later you can also use std::mt19937_64, which will likewise give you 64 random bits.
I will note that, per Boost's performance listing, boost::random::mt19937_64 runs about 2.5 times slower than boost::random::mt19937, and that is probably mirrored in its standard equivalent. If speed is a factor, this could come into play.

Similar to what C++ now offers (since C++11), there is mt19937_64 in Boost; take a look at the Boost.Random documentation.

Related

How do I use boost::random_device to generate a cryptographically secure 64 bit integer?

I would like to do something like this:
boost::random_device rd;
boost::random::mt19937_64 gen(rd());
boost::random::uniform_int_distribution<unsigned long long> dis;
uint64_t value = dis(gen);
But I've read that a Mersenne Twister is not cryptographically secure. However, I've also read that a random_device could be, if it's pulling data from /dev/urandom, which is likely on a Linux platform (my main platform). So if the random_device is non-deterministically random and is used to seed the Mersenne Twister (as shown above), doesn't that also make the Mersenne Twister cryptographically secure (even though by itself it isn't)?
I'm a bit of a novice in this arena so any advice is appreciated.
So, how can I generate a cryptographically secure 64 bit number that can be stored in a uint64_t?
Thanks,
Ben.
Analyzing your question is harder than it might seem:
You seed the Mersenne Twister with rd(), which returns an unsigned int and therefore (on most platforms) contains at most 32 random bits.
Everything the Mersenne Twister does from that point on is determined by those 32 bits.
This means the value can take on only 2**32 different values, which can be a problem if any attack vector exists that brute-forces whatever you do with this number. In fact, the Mersenne Twister's seeding routine may even reduce the number of possible values for the first result, since it distributes the 32 random bits over its complete state (to rule this out you would have to analyse the seeding routine Boost uses).
The primary weakness of the Mersenne Twister (its state can be derived after seeing 624 outputs) is not even relevant here, however, since the sequence you generate is so short (one value).
Generating 64 cryptographically secure bits
Assuming that unsigned int is equivalent to uint32_t on your platform, you can easily generate 64 cryptographically secure random bits by using boost::random_device:
boost::random_device rd;
std::uint64_t value = rd();
value = (value << 32) | rd();
This is fairly secure, since the implementations for both linux and windows use the operating system's own cryptographically secure randomness sources.
Generating cryptographically secure values with arbitrary distributions
While the previous approach works well enough, you may wish for a more flexible solution. This is easy once you realize that you can use the random distributions Boost provides with random_device as well. A simple example would be to rewrite the previous solution like this:
boost::random_device rd;
boost::random::uniform_int_distribution<std::uint64_t> dis;
std::uint64_t value = dis(rd);
(In theory this is also the more robust solution, in case rd() does not actually return values covering the full [0, 2**32) range; in practice that is not a problem.)
Binding distribution to generator
To improve usability you will often find usage of boost::bind to bind distribution and generator together. Since boost::bind copies its arguments, and the copy ctor is deleted for boost::random_device, you need to use a little trick:
boost::random_device rd;
boost::random::uniform_int_distribution<std::uint64_t> dis;
boost::function<std::uint64_t()> gen = boost::bind(dis, boost::ref(rd));
std::uint64_t value = gen();
Using the random device just for seeding isn't really cryptographically secure. The attacker's problem is then reduced to figuring out the initial seed, which is a much smaller problem. Instead, use the random device directly.
val = dis(rd);
For greater security, initialize the random device with /dev/random rather than /dev/urandom. /dev/random will block if there isn't enough "entropy", until some random things have happened. However, it may be much, much slower.
BTW, assuming that you have a high-quality C++11 implementation that doesn't return bogus values for the entropy function, using C++11 might be a better idea if you are trying to remove dependencies.
EDIT: Apparently there is some debate about whether or not /dev/random is any better than /dev/urandom. I refer you to this.

how to generate uncorrelated random sequences using c++

I'd like to generate two sequences of uncorrelated normal distributed random numbers X1, X2.
Since normally distributed random numbers are generated from uniform ones, all I need is two uncorrelated uniform sequences. But how do I do that using:
srand (time(NULL));
I guess I need to seed twice or do something similar?
Since the random numbers generated by a high-quality random-number generator are uniform and independent, you can generate as many independent sequences from it as you like.
You do not need, and should not seed two different generators.
In C++(11), you should use a pseudo-random number generator from the header <random>. Here’s a minimal example that can serve as a template for an actual implementation:
std::random_device seed;
std::mt19937 gen{seed()};
std::normal_distribution<> dist1{mean1, sd1};
std::normal_distribution<> dist2{mean2, sd2};
Now you can generate independent sequences of numbers by calling dist1(gen) and dist2(gen). The random_device is used to seed the actual generator, which in my code is a Mersenne Twister generator. This type of generator is efficient and has good statistical properties. It should be considered the default choice for a (non cryptographically secure) generator.
rand doesn't support generating more than a single sequence. It stores its state in a global variable. On some systems (namely POSIX-compliant ones) you can use rand_r to stay close to that approach. You'd simply use some initial seed as internal state for each. But since your question is tagged C++, I suggest you use the random number facilities introduced in C++11. Or, if C++11 is not an option, use the random module from boost.
A while ago I've asked a similar question, Random numbers for multiple threads, the answers to which might be useful for you as well. They discuss various aspects of how to ensure that sequences are not interrelated, or at least not in an obvious way.
Use two random_devices (possibly with some use of engine) with a normal_distribution from <random> :
std::random_device rd1, rd2;
std::normal_distribution<> d;
double v1 = d(rd1);
double v2 = d(rd2);
...
See also example code at http://en.cppreference.com/w/cpp/numeric/random/normal_distribution

generating a normal distribution on gmp arbitrary precision

So, I'm trying to use GMP for some calculations I'm doing, and at some point I need to generate a pseudo-random number (PRN) from a normal distribution.
Since GMP has a uniform random variable, that already helps a lot. However, I'm finding it difficult to choose which method I should use to generate the normal distribution from a uniform one. In practice, my problem is that GMP only has simple operations, so for instance I cannot use cos or erf evaluations, since I would have to implement them all by myself.
My question is to what extent I can generate PRNs from a normal distribution with GMP, and, if it is very difficult, whether there is any arbitrary-precision lib which already has a normal distribution implemented.
As two examples of methods that do not work (retrieved from this question):
Ziggurat algorithm uses evaluation of f, which in this case is an non-integer exponential and thus not supported by gmp.
Box–Muller Transform uses cos and sin, which are not supported by gmp.
The Marsaglia polar method would work, if your library has a ln.
Combine a library able to generate random numbers from an N(0,1) distribution as doubles with the uniform generator of GMP.
For instance, suppose your normal generator produced 0x8.F67E33Ap-1
Probably only a few of those digits are really random, so truncate the number to a fixed number of binary digits (e.g. truncating to 16 bits, 0x8.F67E33Ap-1 => 0x8.F67p-1) and generate a number uniformly in the range [0x8.F67p-1, 0x8.F68p-1).
For a better approximation, instead of using a uniform distribution, you may like to calculate the values of the density function at the interval extremes (double precision is enough here) and generate a random number with the distribution associated to the trapezoid defined by those two values.
Another way to solve the problem is to generate a table of 1000, 10000 or 100000 mpf values where the CDF N(x) equals 1/n, 2/n, etc. Then use the uniform random generator to select one of those intervals and, again, compute a random number inside the selected interval using a uniform or linear distribution.
I ended up using mpfr which is essentially gmp with some more functionality. It already has a normal distribution implemented.

Just how random is std::random_shuffle?

I'd like to generate a random number of reasonably arbitrary length in C++. By "reasonably arbitrary" I mean limited by the speed and memory of the host computer.
Let's assume:
I want to sample a decimal number (base 10) of length ceil(log10(MY_CUSTOM_RAND_MAX)) from 0 to 10^(ceil(log10(MY_CUSTOM_RAND_MAX))+1)-1
I have a vector<char>
The length of vector<char> is ceil(log10(MY_CUSTOM_RAND_MAX))
Each char is really an integer, a random number between 0 and 9, picked with rand() or similar methods
If I use std::random_shuffle to shuffle the vector, I could iterate through it from the end, multiplying each element by increasing powers of ten to convert it to an unsigned long long (or whatever), which then gets mapped to my final range.
I don't know if there are problems with std::random_shuffle in terms of how random it is or isn't, particularly when also picking a sequence of rand() results to populate the vector<char>.
How sketchy is std::random_shuffle for generating a random number of arbitrary length in this manner, in a quantifiable sense?
(I realize that there is a library in Boost for making random int numbers. It's not clear what the range limitations are, but it looks like INT_MAX. That said, I realize that said library exists. This is more of a general question about this part of the STL in the generation of an arbitrarily large random number. Thanks in advance for focusing your answers on this part.)
I'm slightly unclear as to the focus of this question, but I'll try to answer it from a few different angles:
The quality of the standard library rand() function is typically poor. However, it is very easy to find replacement random number generators which are of a higher quality (you mentioned Boost.Random yourself, so clearly you're aware of other RNGs). It is also possible to boost (no pun intended) the quality of rand() output by combining the results of multiple calls, as long as you're careful about it: http://www.azillionmonkeys.com/qed/random.html
If you don't want the decimal representation in the end, there's little to no point in generating it and then converting to binary. You can just as easily stick multiple 32-bit random numbers (from rand() or elsewhere) together to make an arbitrary bit-width random number.
If you're generating the individual digits (binary or decimal) randomly, there is little to no point in shuffling them afterwards.

Better random algorithm?

I'm making a game in C++ that involves filling tiles with random booleans (either yes or no); whether it is yes or no is decided by rand() % 2. It doesn't feel very random.
I'm seeding with srand and ctime at startup, but the same patterns seem to keep coming up.
Are there any algorithms that will create very random numbers? Or any suggestions on how I could improve rand()?
True randomness often doesn't seem very random. Do expect to see odd runs.
But at least one immediate thing you can do to help is to avoid using just the lowest-order bit. To quote Numerical Recipes in C:
If you want to generate a random integer between 1 and 10, you should always do it by using high-order bits, as in
j = 1 + (int) (10.0 * (rand() / (RAND_MAX + 1.0)));
and never by anything resembling
j = 1 + (rand() % 10);
(which uses lower-order bits).
Also, you might consider using a different RNG with better properties instead. The Xorshift algorithm is a nice alternative. It's speedy and compact at just a few lines of C, and should be good enough statistically for nearly any game.
The low order bits are not very random.
By using %2 you are only checking the bottom bit of the random number.
Assuming you are not needing crypto strength randomness.
Then the following should be OK.
bool tile = rand() > (RAND_MAX / 2);
The easiest thing you can do, short of writing another PRNG or using a library, would be to just use all bits that a single call to rand() gives you. Most random number generators can be broken down to a stream of bits which has certain randomness and statistical properties. Individual bits, spaced evenly on that stream, need not have the same properties. Essentially you're throwing away between 14 and 31 bits of pseudo-randomness here.
You can just cache the number generated by a call to rand() and use each bit of it (depending on the number of bits rand() gives you, of course, which will depend on RAND_MAX). So if your RAND_MAX is 32768 you can use the lowest-order 15 bits of that number in sequence. Especially if RAND_MAX is that small you are not dealing with the low-order bits of the generator, so taking bits from the high end doesn't gain you much. For example the Microsoft CRT generates random numbers with the equation
x[n+1] = x[n] * 214013 + 2531011
and then shifts away the lowest-order 16 bits of that result and restricts it to 15 bits. So no low-order bits from the generator there. This largely holds true for generators where RAND_MAX is as high as 2^31, but you can't always count on that (so maybe restrict yourself to 16 or 24 bits there, taken from the high-order end).
So, generally, just cache the result of a call to rand() and use the bits of that number in sequence for your application, instead of rand() % 2.
Many pseudo-random number generators suffer from cyclical lower bits, especially linear congruential algorithms, which are typically the most common implementations. Some people suggest shifting out the least significant bits to solve this.
C++11 has the following way of using the Mersenne Twister algorithm. From cppreference.com:
#include <random>
#include <iostream>

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 6);

    for (int n = 0; n < 10; ++n)
        std::cout << dis(gen) << ' ';
    std::cout << '\n';
}
This produces random numbers suitable for simulations without the disadvantages of many other random number generators. It is not suitable for cryptography; but cryptographic random number generators are more computationally intensive.
There is also the WELL (Well Equidistributed Long-period Linear) algorithm, with many example implementations available.
Boost Random Number Library
I have used the Mersenne Twister random number generator successfully for many years. Its source code is available from the maths department of Hiroshima University here. (Direct link so you don't have to read Japanese!)
What is great about this algorithm is that:
Its 'randomness' is very good
Its state vector is a vector of unsigned ints and an index, so it is very easy to save its state, reload its state, and resume a pseudo-random process from where it left off.
I'd recommend giving it a look for your game.
The perfect way to alternate yes and no is simply to toggle a boolean; you may not need a random function at all.
The lowest bits of standard random number generators aren't very random, this is a well known problem.
I'd look into the boost random number library.
A quick thing that might make your numbers feel a bit more random would be to re-seed the generator each time the condition if(rand() % 50==0) is true.
Knuth suggests random number generation by the subtractive method. It is believed to be quite random. For a sample implementation in the Scheme language, see here.
People say lower-order bits are not random, so try something from the middle. This will get you the 14th bit:
(rand() >> 13) % 2
With random numbers, to get good results you really need a generator that combines several generators' results. Just discarding the bottom bit is a pretty silly answer.
Multiply-with-carry is simple to implement and gives good results on its own, and if you have several of them and combine their results you will get extremely good results. It also doesn't require much memory and is very fast.
Also if you reseed too fast then you will get the exact same number. Personally I use a class that updates the seed only when the time has changed.