Encapsulate C++ Random Number Generator

I'm building something that requires me to
template <typename D>
class DistributionAdapter {
public:
    /**
     * @return number generated by the distribution function.
     */
    virtual D operator()(RANDOM_NUMBER_GENERATOR& rng) = 0;
};
RANDOM_NUMBER_GENERATOR is supposed to represent the class of random number generators in C++, either std::random_device or a pseudo-random number generator. Can someone tell me how I should approach this? I don't know whether the random number generators in C++ have a common base type.

Section 26.5.1.3 of the standard describes the requirements for random number generators.
In particular, a generator must support the function call operator:
Expression: g(); return type: T; returns a value in the closed interval [G::min(), G::max()]; complexity: amortized constant.
So, although there is no base class shared by every generator, the standard guarantees that operator() is present in each of them: you can call rng() in your function.
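Since there is no common base class, one way to accept "any generator" is a member template rather than a virtual function (a virtual function cannot itself be a template). The following is only a minimal sketch under that assumption; the adapter's layout and the helper name roll_die are hypothetical and not part of the question's code.

```cpp
#include <cassert>
#include <random>

// Hypothetical adapter: Dist is any standard-style distribution type.
template <typename Dist>
class DistributionAdapter {
public:
    typedef typename Dist::result_type result_type;

    explicit DistributionAdapter(Dist dist) : dist_(dist) {}

    // Member template instead of a virtual function: any generator type
    // that provides operator(), min() and max() can be passed in.
    template <typename Generator>
    result_type operator()(Generator& rng) {
        return dist_(rng);
    }

private:
    Dist dist_;
};

// Hypothetical helper: rolls a six-sided die with whatever generator it is given.
template <typename Generator>
int roll_die(Generator& rng) {
    DistributionAdapter<std::uniform_int_distribution<int> > die(
        std::uniform_int_distribution<int>(1, 6));
    const int value = die(rng);
    assert(value >= 1 && value <= 6);  // sanity check on the distribution's range
    return value;
}
```

The same roll_die call compiles with a std::mt19937, a std::minstd_rand or a std::random_device object. If runtime polymorphism is really required, the virtual interface would have to fix one concrete generator type instead.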

It's not entirely clear what you're asking, but here is a fairly easy to use function returning uniformly distributed random integers within a particular range.
#include <random>
// random number generator from Stroustrup:
// http://www.stroustrup.com/C++11FAQ.html#std-random
int rand_int(int low, int high)
{
static std::default_random_engine re {};
using Dist = std::uniform_int_distribution<int>;
static Dist uid {};
return uid(re, Dist::param_type{low,high});
}

Related

How to correctly implement a function that will generate pseudo-random integers with C++20

I want to note that in C++ the generation of pseudo-random numbers is overcomplicated. Older languages like Pascal had the function Random(n), where n is an integer and the generated range is from 0 to n-1. Going back to modern C++, I want a similar interface, but with a function random_int(a,b) that generates numbers in the closed interval [a,b].
Consider the following example:
#include <random>
namespace utils
{
namespace implementation_details
{
struct eng_wrap {
std::mt19937 engine;
eng_wrap()
{
std::random_device device;
engine.seed(device());
}
std::mt19937& operator()()
{
return engine;
}
};
eng_wrap rnd_eng;
}
template <typename int_t, int_t a, int_t b> int_t random_int()
{
static_assert(a <= b);
static std::uniform_int_distribution<int_t> distr(a, b);
return distr(implementation_details::rnd_eng());
}
}
You can see that the distr is marked with the static keyword. Due to this, repeated calls with the same arguments will not cause the construction of the type std::uniform_int_distribution.
In some cases, at the compilation time we do not know the generation boundaries.
Therefore, we have to rewrite this function:
template <typename int_t> int_t random_int2(int_t a, int_t b)
{
std::uniform_int_distribution<int_t> distr(a, b);
return distr(implementation_details::rnd_eng());
}
Next, suppose the second version of this function is called many times:
int a, b;
std::cin>>a>>b;
for (int i=1;i!=1000000;++i)
std::cout<<utils::random_int2(a,b)<<' ';
Question
What is the cost of creating std::uniform_int_distribution in each
iteration of the loop?
Can you suggest a more optimized function that returns a pseudo-random number in the passed range for a normal desktop application?
If you want to use the same a and b repeatedly, use a class with a member function—that’s what they’re for. If you don’t want to expose your rnd_eng (choosing instead to preclude useful multithreaded clients), write the class to use it:
template<class T>
struct random_int {
    random_int(T a, T b) : d(a, b) {}
    // Not const: the distribution's operator() is a non-const member function.
    T operator()() { return d(implementation_details::rnd_eng()); }
private:
    std::uniform_int_distribution<T> d;
};
IMO, for most simple programs such as games, graphics, and Monte Carlo simulations, the API you actually want is
static xoshiro256ss g;
// Generate a random number between 0 and n-1.
// For example, randint0(2) flips a coin; randint0(6) rolls a die.
int randint0(int n) {
return g() % n;
}
// This version is useful for games like NetHack, where you often
// want to express an ad-hoc percentage chance of something happening.
bool pct(int n) {
return randint0(100) < n;
}
(or substitute std::mt19937 for xoshiro256ss but be aware you're trading away performance in exchange for... something. :))
The % n above is mathematically dubious when n is astronomically large (e.g. if you're rolling a 12297829382473034410-sided die with a 64-bit generator, you'll find that values between 0 and 6148914691236517205 come up twice as often as the remaining values). So you may prefer to use C++11's uniform_int_distribution:
int randint0(int n) {
return std::uniform_int_distribution<int>(0, n-1)(g);
}
However, again be aware you're gaining mathematical perfection at the cost of raw speed. uniform_int_distribution is more for when you don't already trust your random number engine to be sane (e.g. if the engine's output range might be 0 to 255 but you want to generate numbers from 1 to 1000), or when you're writing template code to work with any arbitrary integer distribution (e.g. binomial_distribution, geometric_distribution) and need a uniform distribution object of that same general "shape" to plug into your template.
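The bias of % n is easier to see at a small scale. The sketch below assumes a hypothetical 8-bit generator whose outputs 0..255 are all equally likely, and simply counts how many of those outputs land on each residue; the function name modulo_histogram is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// For a hypothetical 8-bit generator (outputs 0..255, each equally likely),
// count how many outputs map to each value of x % n.
std::vector<int> modulo_histogram(int n) {
    assert(n > 0 && n <= 256);
    std::vector<int> counts(n, 0);
    for (int x = 0; x < 256; ++x) {
        counts[x % n] += 1;
    }
    return counts;
}
```

With n = 171 (about two thirds of 256, mirroring the 64-bit example above), residues 0 to 84 are each hit by two generator outputs while residues 85 to 170 are hit by only one. With n = 128 every residue is hit exactly twice, so there is no bias.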
The answer to your question #1 is "The cost is free." You will not gain anything by stashing the result of std::uniform_int_distribution<int>(0, n-1) into a static variable. A distribution object is very small, trivially copyable, and basically free to construct. In fact, the cost of constructing the uniform_int_distribution in this case is orders of magnitude cheaper than the cost of thread-safe static initialization.
(There are special cases such as std::normal_distribution where not-stashing the distribution object between calls can result in your doing twice as much work as needed; but uniform_int_distribution is not one of those cases.)
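One way to convince yourself that nothing is lost by constructing the distribution on every call is to compare the two approaches with identically seeded engines. This is only a sketch: the function names are hypothetical, and the expectation that both sequences match relies on uniform_int_distribution keeping no state beyond its parameters, which holds on the common implementations but is not something the standard promises.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Constructs a new distribution object for every draw.
std::vector<int> draw_fresh(unsigned seed, int low, int high, int count) {
    assert(low <= high);
    std::mt19937 engine(seed);
    std::vector<int> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(std::uniform_int_distribution<int>(low, high)(engine));
    }
    return out;
}

// Constructs the distribution once and reuses it for every draw.
std::vector<int> draw_reused(unsigned seed, int low, int high, int count) {
    assert(low <= high);
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> dist(low, high);
    std::vector<int> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(dist(engine));
    }
    return out;
}
```

Comparing draw_fresh(7, 1, 6, 1000) with draw_reused(7, 1, 6, 1000) is a quick check that the per-call construction does not change the results; timing the two is left to a profiler.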

uniform_int_distribution with zero range goes to infinite loop

For unit tests I implemented a mock random number generator. I believe that this is a valid implementation of UniformRandomBitGenerator (the mock actually uses google mock to set the return of operator(), but it behaves the same).
struct RNG
{
using result_type = size_t;
static result_type min() { return 0; }
static result_type max() { return std::numeric_limits<result_type>::max(); }
result_type operator()() { return max(); }
};
Now I use this mock to sample from std::uniform_int_distribution in the range [a, b], a == b. I believe this is allowed, the only restriction I have found here on the parameters of the distribution is b >= a. So I would expect the following program to print 5.
int main()
{
auto rng = RNG();
auto dist = std::uniform_int_distribution<>(5, 5);
printf("%d\n", dist(rng));
return 0;
}
Instead it goes into an infinite loop inside the STL, repeatedly drawing numbers from the generator but failing to find a number within the specified range. I tested different (current) compilers (including clang, gcc, icc) in different versions. RNG::max can return other values (e.g. 42) as well, doesn't change anything.
The real code I'm testing draws a random index into a container which may contain only one element. It would be easy to check this condition but it's a rare case and I would like to avoid it.
Am I missing something in the specification of RNGs in the STL? I'd be surprised to find a bug in ALL compilers ...
A uniform distribution is usually achieved with rejection sampling. You keep requesting random numbers until you get one that meets the criteria. You've set up a situation where the criteria can't be met, because your random number generator is very non-random, so it results in an infinite loop.
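To illustrate why a constant generator can starve rejection sampling, here is a simplified sketch with an attempt cap added so that it gives up instead of spinning forever. It is not the code of any particular standard library; the names ConstantMax and sample_with_cap are hypothetical, and the arithmetic assumes a generator with a 32-bit output range.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical mock, like the one in the question: always returns max().
struct ConstantMax {
    typedef std::uint32_t result_type;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return 0xFFFFFFFFu; }
    result_type operator()() { return max(); }
};

// Simplified rejection sampling for a value in [0, span]. Gives up after
// max_attempts draws and returns false instead of looping forever.
template <typename Gen>
bool sample_with_cap(Gen& gen, std::uint32_t span, int max_attempts,
                     std::uint32_t& out) {
    assert(max_attempts > 0);
    const std::uint32_t gen_range =
        static_cast<std::uint32_t>(Gen::max() - Gen::min());
    // Width of each bucket; draws past the last full bucket are rejected.
    const std::uint32_t bucket =
        (gen_range <= span) ? 1 : gen_range / (span + 1);
    for (int i = 0; i < max_attempts; ++i) {
        const std::uint32_t candidate =
            static_cast<std::uint32_t>(gen() - Gen::min()) / bucket;
        if (candidate <= span) {
            out = candidate;
            return true;
        }
    }
    return false;
}
```

With span = 0 (the a == b case) the bucket width equals the generator's maximum, so the only rejected draw is max() itself. ConstantMax produces exactly that draw every time and never succeeds, while a std::mt19937 is accepted almost immediately.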
The standard says ([rand.dist.uni.int]):
A uniform_int_distribution random number distribution produces random integers i,
a ≤ i ≤ b, distributed according to the constant discrete probability function
  P(i | a, b) = 1/(b − a + 1)
. . .
explicit uniform_int_distribution(IntType a = 0, IntType b = numeric_limits<IntType>::max());
  Requires: a ≤ b.
So uniform_int_distribution<>(5,5) should return 5 with probability 1/1.
Implementations that go into an infinite loop instead have a bug.
However, your mock RNG, which always generates the same value, doesn't satisfy the uniform random bit generator requirements:
A uniform random bit generator g of type G is a function object returning unsigned integer values such that each value in the range of possible results has (ideally) equal probability of being returned. [ Note: The degree to which g's results approximate the ideal is often determined statistically.  — end note ]
See [rand.req.genl]/p1.b:
Throughout this subclause [rand], the effect of instantiating a template:
b) that has a template type parameter named URBG is undefined unless the corresponding template argument is cv-unqualified and satisfies the requirements of uniform random bit generator.
Sure enough, with a standard RNG it just works:
#include <iostream>
#include <random>
int main() {
std::mt19937_64 rng;
std::uniform_int_distribution<> dist(5, 5);
std::cout << dist(rng) << "\n";
}
Prints:
5
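For unit tests that still need deterministic output, one option is a mock that steps through its range instead of returning a constant. The sketch below is an assumption-laden testing aid, not an endorsed pattern: SteppingGenerator and pick_in_range are hypothetical names, and whether a fixed sequence like this formally meets the uniform random bit generator requirements is debatable.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical deterministic test generator: advances a 64-bit counter by a
// large odd constant, so successive outputs are spread over the whole range.
struct SteppingGenerator {
    typedef std::uint64_t result_type;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return ~static_cast<result_type>(0); }

    result_type state;
    explicit SteppingGenerator(result_type start = 0) : state(start) {}

    result_type operator()() {
        state += 0x9E3779B97F4A7C15ULL;  // odd increment: full 2^64 period
        return state;
    }
};

// Draws one value in [low, high] using the stepping generator.
int pick_in_range(SteppingGenerator& gen, int low, int high) {
    assert(low <= high);
    std::uniform_int_distribution<int> dist(low, high);
    return dist(gen);
}
```

Because the outputs change from call to call, a rejection loop inside the distribution is able to find an acceptable value, including in the single-element case pick_in_range(gen, 5, 5).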

Can I generate seed for random numbers as static variable of a function?

In a really schematic way, my aim is to generate good-quality random numbers inside a function. I would like to seed the generator of random numbers with a static variable so I don't have to seed every time that I call the function.
I am generating the random numbers using the GSL library (https://www.gnu.org/software/gsl/doc/html/rng.html). It is supposed to have better quality than the ones generated with rand() and in a more efficient way than the ones generated with the Mersenne Twister engine std::mt19937.
gsl_rng* Initialize() { //INITIALIZE
int rand_seed = 77711; //any integer
srand(time(NULL));
const gsl_rng_type* gsl_rng_T;
gsl_rng* r; //The random variable
gsl_rng_env_setup();
gsl_rng_default_seed = rand_seed;
gsl_rng_T = gsl_rng_default;
r = gsl_rng_alloc(gsl_rng_T);
return r;
}
int random_int(int n) { //Generate integer random variable in [0,n[
static gsl_rng* r2 = Initialize(); //Initialize as static
return gsl_rng_uniform_int(r2, n);
}
void Calculations(/*Variables that have nothing to do with the random numbers*/) {
//Stuff
int position = random_int(Info.I);
//Info.I is an integer member of the class "Info", its value changes
// with each call of the function "Calculations".
//.
//.
//.
return;
}
I have to call the function Calculations() many times, and the values of position across calls are highly correlated (not really random). I basically always obtain the same output for position.
I am quite new to C++; I am used to FORTRAN, and I apologize for the terrible coding!
In related questions I've seen that people define a class for the seed. Is there any benefit to doing this? Can anyone recommend a different method or random number generator?
Thank you very much :)
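For comparison, the function-local static idea can be sketched with the standard library alone. This swaps std::mt19937 in for the GSL generator used above, which is an assumption made purely for illustration, and the helper name shared_engine is hypothetical. The engine is seeded from std::random_device rather than from a fixed constant, so each run of the program starts from a different seed on platforms where random_device is non-deterministic.

```cpp
#include <cassert>
#include <random>

// Returns a reference to a single engine that is seeded on first use and
// then kept alive for the rest of the program (function-local static).
std::mt19937& shared_engine() {
    static std::mt19937 engine(std::random_device{}());
    return engine;
}

// Uniform integer in [0, n), mirroring the question's random_int(n).
int random_int(int n) {
    assert(n > 0);
    std::uniform_int_distribution<int> dist(0, n - 1);
    return dist(shared_engine());
}
```

Every call to random_int advances the same engine, so nothing is reseeded between calls.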

Generating pseudo-random 16-bit integers

I need to generate 16-bit pseudo-random integers and I am wondering what the best choice is.
The obvious way that comes to mind is something like the following:
std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 generator(seq);
std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(),
std::numeric_limits<short>::max());
short n = dis(generator);
The problem I see here is that std::mt19937 produces 32-bit unsigned integers since it's defined as this:
using mt19937 = mersenne_twister_engine<unsigned int,
32, 624, 397,
31, 0x9908b0df,
11, 0xffffffff,
7, 0x9d2c5680,
15, 0xefc60000,
18, 1812433253>;
That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution. So I am wondering how good are these series of pseudo-random shorts and I don't have the mathematical expertise to answer that.
I expect that a better solution would be to use your own defined mersenne_twister_engine engine for 16-bit integers. However, I haven't found any mentioned set for the template arguments (requirements can be found here for instance). Are there any?
UPDATE: I updated the code sample with proper initialization for the distribution.
Your way is indeed the correct way.
The mathematical arguments are complex (I'll try to dig out a paper), but taking the least significant bits of the Mersenne Twister, as implemented by the C++ standard library, is the correct thing to do.
If you're in any doubt as to the quality of the sequence, then run it through the diehard tests.
There may be a misconception, considering this quote from OP's question (emphasis mine):
The problem I see here is that std::mt19937 produces 32-bit unsigned integers […].
That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution.
That's not how it works.
The following are quotes from https://en.cppreference.com/w/cpp/numeric/random
The random number library provides classes that generate random and
pseudo-random numbers. These classes include:
Uniform random bit generators (URBGs), […];
Random number distributions (e.g. uniform, normal, or poisson distributions) which convert the output of URBGs into various statistical distributions
URBGs and distributions are designed to be used together to produce random values.
So a uniform random bit generator, like mt19937 or random_device
is a function object returning unsigned integer values such that each value in the range of possible results has (ideally) equal probability of being returned.
While a random number distribution, like uniform_int_distribution
post-processes the output of a URBG in such a way that resulting output is distributed according to a defined statistical probability density function.
The way it's done uses all the bits from the source to produce an output. As an example, we can look at the implementation of std::uniform_int_distribution in libstdc++ (starting at line 824), which can be roughly simplified as
template <typename Type>
class uniform_distribution
{
Type a_ = 0, b_ = std::numeric_limits<Type>::max();
public:
uniform_distribution(Type a, Type b) : a_{a}, b_{b} {}
template<typename URBG>
Type operator() (URBG &gen)
{
using urbg_type = std::make_unsigned_t<typename URBG::result_type>;
using u_type = std::make_unsigned_t<Type>;
using max_type = std::conditional_t<(sizeof(urbg_type) > sizeof(u_type))
, urbg_type, u_type>;
urbg_type urbg_min = gen.min();
urbg_type urbg_max = gen.max();
urbg_type urbg_range = urbg_max - urbg_min;
max_type urange = b_ - a_;
max_type udenom = urbg_range <= urange ? 1 : urbg_range / (urange + 1);
max_type ret; // wide enough to hold a candidate before it is range-checked
// Note that the calculation may require more than one call to the generator
do
ret = (urbg_type(gen()) - urbg_min ) / udenom;
// which is 'ret = gen / 65535' with OP's parameters
// not a simple cast or bit shift
while (ret > urange);
return static_cast<Type>(ret + a_);
}
};
This could be tested HERE.
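A small experiment can show how many generator calls the distribution makes per 16-bit result. The sketch below wraps std::mt19937 in a counting adapter; CountingEngine and calls_for_shorts are hypothetical names, and the exact call count depends on the standard library implementation.

```cpp
#include <cassert>
#include <limits>
#include <random>

// Hypothetical wrapper: forwards to std::mt19937 and counts how many
// values the distribution actually requests from it.
struct CountingEngine {
    typedef std::mt19937::result_type result_type;
    static constexpr result_type min() { return std::mt19937::min(); }
    static constexpr result_type max() { return std::mt19937::max(); }

    std::mt19937 engine;
    unsigned long calls;

    explicit CountingEngine(unsigned seed) : engine(seed), calls(0) {}

    result_type operator()() {
        ++calls;
        return engine();
    }
};

// Draws `samples` shorts over the full range of short and returns the
// number of generator calls that were needed.
unsigned long calls_for_shorts(unsigned seed, int samples) {
    assert(samples >= 0);
    CountingEngine gen(seed);
    std::uniform_int_distribution<short> dist(
        std::numeric_limits<short>::min(), std::numeric_limits<short>::max());
    for (int i = 0; i < samples; ++i) {
        (void)dist(gen);  // the value itself is not needed here
    }
    return gen.calls;
}
```

The returned count is at least the number of samples; any excess corresponds to draws that the distribution rejected.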

Boost C++: Seeding random numbers in a function

I have a function that should simulate a new random exponential variable every time it is called:
#include <boost/random.hpp>
//Simulates a single exponential random variable
double generateExponential(double lambda) {
boost::mt19937 rng; //Mersenne Twister Generator
rng.seed(time(0));
boost::variate_generator< boost::mt19937&, boost::exponential_distribution<> > rndm(rng, boost::exponential_distribution<>(lambda));
return rndm();
}
for example,
double newExp = generateExponential(10);
However, each time I call the function, it generates the same random number. I want it to generate a different random number each time the function is called. I thought "rng.seed(time(0))" might fix it but it hasn't. How could I get it to do this?
If you can't change the signature of your function, then you could use a static instance of your generator. No need to re-seed.
#include <boost/random.hpp>
#include <ctime>
typedef boost::mt19937 G;
typedef boost::exponential_distribution<> D; // the class template needs its argument list here
double generateExponential(double lambda)
{
    static G rng(std::time(0)); // initialized and seeded once
    boost::variate_generator<G &, D> rndm(rng, D(lambda));
    return rndm();
}
Generally speaking, a source of random numbers should be a resource whose lifespan is that of your entire program, not that of an individual function call.
Consequently, you need the object representing said source of random numbers to have an appropriate lifespan. Making it a variable local to your function is a Bad Idea. It should be an object passed into your function, or maybe a global object.
(also, frequently reseeding a random number generator is another well known Bad Idea)
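The "pass the generator in" variant might look like the sketch below. It uses the standard library's std::exponential_distribution instead of Boost, which is a substitution made here for illustration only; the changed signature is likewise an assumption, since the question's function takes only lambda.

```cpp
#include <cassert>
#include <random>

// The generator is owned by the caller and passed in by reference, so it
// outlives any single call and is never reseeded here.
template <typename Generator>
double generateExponential(Generator& rng, double lambda) {
    assert(lambda > 0.0);
    std::exponential_distribution<double> dist(lambda);
    return dist(rng);
}
```

A caller would create one engine for the whole program, for example a std::mt19937 seeded once at startup, and hand that same object to every generateExponential call, so successive calls continue the same sequence instead of restarting it.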