thread safe random number with same seed for intel TBB threads - c++

I have a function object for parallelizing a for_each() algorithm using Threading Building Blocks.
The function object uses a random number generator RND whose operator() generates a random number.
Problem: I need a random number generator that 1) is initialized only once in the function object, 2) is thread safe, and 3) can be given the same seed so that the same results can be reproduced.
I don't know much about making random number generators thread safe inside function objects. I tried writing my own random generator class with the Boost libraries (engine, distribution and generator),
but I need something simple, such as erand() or something like that, for which we don't need to write separate code.
struct func {
public:
func() { }
func(int t_) : t(t_) { }
template <typename house_t>
void operator()(house_t house) const {
if ( RND() )
{
// do something with the house
}
} //operator
private:
int t;
};
//Parallel for_each
tbb::parallel_for_each( house.begin(), house.end(), func(t) );
Please suggest
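One possible approach, offered only as a minimal sketch: derive a fresh engine from the user-supplied seed plus a per-element index, so that no generator state is shared between threads and the result does not depend on how TBB schedules the work. The helper name draw_for and the idea that each house has a stable index are assumptions, not part of the code above. Constructing an engine per element costs time, so this trades speed for reproducibility.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: returns a reproducible coin flip for element `index`.
// The engine is local to the call, so no state is shared between threads.
inline bool draw_for(std::uint64_t seed, std::uint64_t index) {
    // Mix the user seed and the element index into one seed sequence.
    std::seed_seq seq{
        static_cast<std::uint32_t>(seed), static_cast<std::uint32_t>(seed >> 32),
        static_cast<std::uint32_t>(index), static_cast<std::uint32_t>(index >> 32)};
    std::mt19937 engine(seq);
    return std::bernoulli_distribution(0.5)(engine);
}
```

Inside operator() the call would replace RND(), provided an index is available, for example by iterating over an index range with tbb::parallel_for instead of over iterators.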

Related

Proper design of RNG helper in C++

In my project, I want some kind of helper functions or class for working with a random number generator. The main topic of the project is Monte Carlo simulation, so I will use it very often and in many places. Hence, I'm looking for ideas on how to design such a wrapper around the C++ random library, so that I can draw e.g. a probability by calling a simple function such as generateProbability(). I've come up with two ideas. One is a class with the needed functions. This solution looks nice, but I would have to create a separate RNG object in every file or place where it is needed. The other is a separate namespace with global variables for the pseudo-random engine and distributions, plus helper functions. I've prepared example code for both cases:
#pragma once
#include <random>
namespace rng {
std::mt19937_64 rng_engine(std::random_device{}());
std::uniform_int_distribution<uint8_t> zeroToNineDistrib(0, 9);
inline auto generateNumber() { return zeroToNineDistrib(rng_engine); }
} // namespace rng
class RNG {
public:
RNG() : rng_engine(std::random_device{}()), zeroToNineDistrib(0, 9){};
~RNG() = default;
auto generateNumber() { return zeroToNineDistrib(rng_engine); }
private:
std::mt19937_64 rng_engine;
std::uniform_int_distribution<uint8_t> zeroToNineDistrib;
};
What do you think about these solutions? Which one is 'better' in the sense of being more professional and 'cleaner'? Or do you have other ideas on how to do this better?
I'd welcome a discussion, because I can see pros and cons of both solutions and can't decide which one to pick.
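A third variant, shown here only as a sketch under my own assumptions, keeps the namespace-style call site but hides the engine behind an inline function with a thread_local local variable. This avoids defining global variables in a header and gives each thread its own engine. The function name engine() is hypothetical. The distribution uses int because the standard does not list 8-bit types such as uint8_t among the types permitted for std::uniform_int_distribution.

```cpp
#include <random>

namespace rng {

// One engine per thread, created on first use. A function-local variable
// avoids defining a global object in a header.
inline std::mt19937_64& engine() {
    thread_local std::mt19937_64 eng(std::random_device{}());
    return eng;
}

// Returns an integer in [0, 9].
inline int generateNumber() {
    std::uniform_int_distribution<int> zeroToNine(0, 9);
    return zeroToNine(engine());
}

}  // namespace rng
```

Callers would simply write rng::generateNumber() in any file that includes the header.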

How to use random number in user defined tensorflow op?

I am writing an op in C++ which needs a random number in the Compute function,
but it seems I should not use the C++ random library directly, since that cannot be controlled by tf.set_random_seed.
My current code is something like the following. What should I do in the function some_interesting_random_function?
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/common_shape_fns.h"
#include <iostream>
#include <typeinfo>
#include <random>
using namespace tensorflow;
REGISTER_OP("MyRandom")
.Output("random: int32")
.SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
c->set_output(0, c->Scalar());
return Status::OK();
});
int some_interesting_random_function(){
return 10;
}
class MyRandomOp : public OpKernel {
public:
explicit MyRandomOp(OpKernelConstruction* context) : OpKernel(context) {}
void Compute(OpKernelContext* context) override {
Tensor* res;
TensorShape shape;
int dims[] = {};
TensorShapeUtils::MakeShape(dims, 0, &shape);
OP_REQUIRES_OK(context, context->allocate_output(0, shape,
&res));
auto out1 = res->flat<int32>();
out1(0) = some_interesting_random_function();
}
};
REGISTER_KERNEL_BUILDER(Name("MyRandom").Device(DEVICE_CPU), MyRandomOp);
The core of all random number generation in TensorFlow is PhiloxRandom, generally accessed through its wrapper GuardedPhiloxRandom. As explained in tf.set_random_seed, there are graph-level and op-level seeds, both of which may or may not be set. If you want to have this in your op too, you need to do a couple of things. First, your op should be declared with two optional attributes, seed and seed2; see the existing ops in random_ops.cc. Then, in Python, you have some user API wrapping your op that makes these two values using tensorflow.python.framework.random_seed, which you have to import with from tensorflow.python.framework import random_seed, and then call seed1, seed2 = random_seed.get_seed(seed); this will correctly create the two seed values using the graph's seed and an optional seed parameter to the function (see random_ops.py). These seed1 and seed2 values are then passed as the seed and seed2 attributes to your op. If you do all that, then GuardedPhiloxRandom will take care of properly initializing the random number generator using the right seeds.
Now, to the kernel implementation. In addition to the things I mentioned above, you will need to combine two things: the struct template FillPhiloxRandom, declared in core/kernels/random_op.h, which will help you fill a tensor with random data; and a Distribution, which is just an object that can be called with a random number generator to produce a value (see existing implementations in core/lib/random/random_distributions.h). Now it is mostly a matter of looking at how it is done in core/kernels/random_op.cc, and copy the bits you need. Most kernels in there are based on PhiloxRandomOp (which is not publicly declared, but you can copy or adapt). This essentially holds a random number generator, allocates space in the output tensor (it assumes the first input is the desired shape) and calls FillPhiloxRandom to do the work. If this is the kind of op you are trying to create (generate some data according to some distribution), then you are all set! Your code could look something like this:
// Required for thread pool device
#define EIGEN_USE_THREADS
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_shape.h"
#include "tensorflow/core/kernels/random_op.h"
#include "tensorflow/core/util/guarded_philox_random.h"
// Helper function to convert a 32-bit integer to a float between [0..1).
// Copied from core/lib/random/random_distributions.h
PHILOX_DEVICE_INLINE float Uint32ToFloat(uint32 x) {
// IEEE754 floats are formatted as follows (MSB first):
// sign(1) exponent(8) mantissa(23)
// Conceptually construct the following:
// sign == 0
// exponent == 127 -- an excess 127 representation of a zero exponent
// mantissa == 23 random bits
const uint32 man = x & 0x7fffffu; // 23 bit mantissa
const uint32 exp = static_cast<uint32>(127);
const uint32 val = (exp << 23) | man;
// Assumes that endian-ness is same for float and uint32.
float result;
memcpy(&result, &val, sizeof(val));
return result - 1.0f;
}
// Template class for your custom distribution
template <class Generator, typename RealType>
class MyDistribution;
// Implementation for tf.float32
template <class Generator>
class MyDistribution<Generator, float> {
public:
// The number of elements that will be returned (see below).
static const int kResultElementCount = Generator::kResultElementCount;
// Cost of generation of a single element (in cycles) (see below).
static const int kElementCost = 3;
// Indicate that this distribution may take variable number of samples
// during the runtime (see below).
static const bool kVariableSamplesPerOutput = false;
typedef Array<float, kResultElementCount> ResultType;
typedef float ResultElementType;
PHILOX_DEVICE_INLINE
ResultType operator()(Generator* gen) {
typename Generator::ResultType sample = (*gen)();
ResultType result;
for (int i = 0; i < kResultElementCount; ++i) {
float r = Uint32ToFloat(sample[i]);
// Example distribution logic: produce 1 or 0 with 50% probability
result[i] = 1.0f * (r < 0.5f);
}
return result;
}
};
// Could add implementations for other data types...
// Base kernel
// Copied from core/kernels/random_op.cc
static Status AllocateOutputWithShape(OpKernelContext* ctx, const Tensor& shape,
int index, Tensor** output) {
TensorShape tensor_shape;
TF_RETURN_IF_ERROR(ctx->op_kernel().MakeShape(shape, &tensor_shape));
return ctx->allocate_output(index, tensor_shape, output);
}
template <typename Device, class Distribution>
class PhiloxRandomOp : public OpKernel {
public:
typedef typename Distribution::ResultElementType T;
explicit PhiloxRandomOp(OpKernelConstruction* ctx) : OpKernel(ctx) {
OP_REQUIRES_OK(ctx, generator_.Init(ctx));
}
void Compute(OpKernelContext* ctx) override {
const Tensor& shape = ctx->input(0);
Tensor* output;
OP_REQUIRES_OK(ctx, AllocateOutputWithShape(ctx, shape, 0, &output));
auto output_flat = output->flat<T>();
tensorflow::functor::FillPhiloxRandom<Device, Distribution>()(
ctx, ctx->eigen_device<Device>(),
// Multiplier 256 is the same as in FillPhiloxRandomTask; do not change
// it just here.
generator_.ReserveRandomOutputs(output_flat.size(), 256),
output_flat.data(), output_flat.size(), Distribution());
}
private:
GuardedPhiloxRandom generator_;
};
// Register kernel
typedef Eigen::ThreadPoolDevice CPUDevice;
template struct functor::FillPhiloxRandom<
CPUDevice, MyDistribution<tensorflow::random::PhiloxRandom, float>>;
REGISTER_KERNEL_BUILDER(
Name("MyDistribution")
.Device(DEVICE_CPU)
.HostMemory("shape")
.TypeConstraint<float>("dtype"),
PhiloxRandomOp<CPUDevice, MyDistribution<tensorflow::random::PhiloxRandom, float>>);
// Register kernels for more types, can use macros as in core/kernels/random_op.cc...
There are a few extra bits and pieces here. First you need to understand that PhiloxRandom generally produces four unsigned 32-bit integers on each step, and you have to make your random values from these. Uint32ToFloat is a helper to get a float between zero and one from one of these numbers. There are a few constants in there too. kResultElementCount is the number of values your distribution produces on each step. If you produce one value per random number from the generator, you can set it to Generator::kResultElementCount, like here (which is 4). However, if for example you want to produce double values (that is, tf.float64), you may want to use two 32-bit integers per value, so maybe you would produce Generator::kResultElementCount / 2 in that case. kElementCost is supposed to indicate how many cycles it takes your distribution to produce an element. I do not know how this is measured by the TensorFlow team, but it is just a hint to distribute the generation work among tasks (used by FillPhiloxRandom), so you can just guess something, or copy it from a similarly expensive distribution. kVariableSamplesPerOutput determines whether each call to your distribution may produce a different number of outputs; again, when this is false (which should be the common case), FillPhiloxRandom will make the value generation more efficient. PHILOX_DEVICE_INLINE (defined in core/lib/random/philox_random.h) is a compiler hint to inline the function. You can then add additional implementations and kernel registrations for other data types and, if you are supporting it, for DEVICE_GPU (with typedef Eigen::GpuDevice GPUDevice) or even DEVICE_SYCL (with typedef Eigen::SyclDevice SYCLDevice), if you want. And about that, EIGEN_USE_THREADS is just to enable the thread pool execution device in Eigen, to make the CPU implementation multi-threaded.
If your use case is different, though (for example, you want to generate some random numbers and do some other computation in addition to that), FillPhiloxRandom may not be useful to you (or it may be, but then you also need to do something else). Having a look at core/kernels/random_op.cc and the headers of the different classes should help you figure out how to use them for your problem.
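For the tf.float64 case mentioned above, where two 32-bit integers feed one value, the same bit trick as Uint32ToFloat can be extended to doubles. The following is a standalone sketch with no TensorFlow dependency; the name TwoUint32ToDouble is hypothetical, and it assumes IEEE 754 doubles with the same endianness as 64-bit integers.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds a double in [0, 1) from two 32-bit integers.
inline double TwoUint32ToDouble(std::uint32_t x0, std::uint32_t x1) {
    // IEEE754 doubles are laid out as sign(1) exponent(11) mantissa(52).
    const std::uint64_t man_hi = x0 & 0xfffffu;  // upper 20 mantissa bits
    const std::uint64_t man_lo = x1;             // lower 32 mantissa bits
    const std::uint64_t exp = 1023;              // biased exponent for 2^0
    const std::uint64_t bits = (exp << 52) | (man_hi << 32) | man_lo;
    double result;
    std::memcpy(&result, &bits, sizeof(bits));   // value is now in [1, 2)
    return result - 1.0;
}
```

A distribution for double built on this would consume two entries of the generator's sample per output, which is why kResultElementCount would be halved.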

How to create a collection of heterogeneous objects

I would like to create a collection of heterogeneous objects, i.e. objects of different types.
This would be useful when these objects have similar functionality (including members), but don't derive from the same parent class.
A perfect example of this is random number engines: minstd_rand, mt19937 and ranlux24 are all engines. They have the same members (such as the call operator ()), but don't derive from a common "Engine" class and so are of different types.
The same is the case with random number distributions.
Had there been a common root class 'Engine', I could easily have created a vector of these objects as follows:
vector<Engine> engines {minstd_rand, mt19937, ranlux24};
Having done this, I could then invoke a function in a loop, as follows:
/// Generate 10 random numbers.
void gen(vector<Engine>& engines)
{
for (auto& e : engines)
for (int i = 0; i < 10; i++)
cout << e() << endl;
}
int main()
{
gen(engines); /// Invocation
}
However, I can't do this.
If I use a tuple to wrap each engine, each object would have a different type:
tuple<type1>, tuple<type2>, ....
Again, the types would be heterogeneous and I couldn't create a collection of them.
So the question is, is it possible to create a collection of heterogeneous objects and if so, how?
You can use vector<function<size_t ()>> to hold these engines.
using Engine = function<size_t ()>;
vector<Engine> engines = {minstd_rand{}, mt19937{}, ranlux24{}};
for (auto &e : engines) {
cout << e() << endl;
}
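A slightly fuller sketch of this std::function approach, with includes and explicit seeding so the sequences are reproducible. The helpers make_engines and draw_all are hypothetical names added for illustration. Each engine is copied into its std::function wrapper, which then owns that copy.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Engine = std::function<std::size_t()>;

// Hypothetical helper: builds the three engines from one seed.
inline std::vector<Engine> make_engines(unsigned seed) {
    return {std::minstd_rand(seed), std::mt19937(seed), std::ranlux24(seed)};
}

// Draws one value from every engine in the collection.
inline std::vector<std::size_t> draw_all(std::vector<Engine>& engines) {
    std::vector<std::size_t> out;
    for (auto& e : engines) out.push_back(e());
    return out;
}
```

Two collections built from the same seed should yield identical values from draw_all, since every wrapped engine starts in the same state.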
You can simply create your own polymorphic hierarchy to wrap your different separately typed pseudo random number generators in. This is made easier by the fact that the different standard generators have a common interface even though they do not derive from a common base type.
Something a bit like this:
// Base interface class
class prng
{
public:
using dist_type = std::uniform_int_distribution<int>;
virtual ~prng() = default;
virtual int operator()(int min, int max) = 0;
protected:
dist_type dist;
template<typename PRNG>
static PRNG& eng()
{
thread_local static PRNG eng{std::random_device{}()};
return eng;
}
};
// smart pointers because polymorphism
using prng_uptr = std::unique_ptr<prng>;
// Generic class takes advantage of the different PRNG's
// similar interfaces
template<typename PRNG>
class typed_prng
: public prng
{
public:
int operator()(int min, int max) override
{ return dist(eng<PRNG>(), dist_type::param_type(min, max)); }
};
// Some nice names
using prng_minstd_rand = typed_prng<std::minstd_rand>;
using prng_mt19937 = typed_prng<std::mt19937>;
using prng_ranlux24 = typed_prng<std::ranlux24>;
int main()
{
// A vector of smart base pointers to typed instances
std::vector<prng_uptr> prngs;
// Add whatever generators you want
prngs.push_back(std::make_unique<prng_minstd_rand>());
prngs.push_back(std::make_unique<prng_mt19937>());
prngs.push_back(std::make_unique<prng_ranlux24>());
// numbers between 10 and 50
for(auto const& prng: prngs)
std::cout << (*prng)(10, 50) << '\n';
}
[ver 1] @Galik's post effectively demonstrated the basic technique of creating a polymorphic hierarchy, which I explain below.
The post was intended as a demonstration (rather than an implementation) of the technique involved. As such, it doesn't compile:
http://coliru.stacked-crooked.com/a/0465c2a11d3a0558
I have corrected the underlying issues and the following version works [ver 2]:
http://coliru.stacked-crooked.com/a/9bb0f47251e6dfed
The technique involved is important. I have voted for @Galik's post.
However, there was one issue with @Galik's solution: the random_device() engine was hard-coded in the base class itself. As such, it was always used, regardless of which engine was passed as argument in the sub-class. In fact, the engine passed as argument to the sub-class should have been used as the source of random numbers.
I have corrected this and also changed some of the names in the following version [ver 3]:
http://coliru.stacked-crooked.com/a/350eadb55a4bafe7
#include <vector>
#include <memory> /// unique_ptr
#include <random>
#include <iostream>
/// Declarations ...
class RndBase; /// Random number base class
/// Generic Random number sub-class
/// takes advantage of the different Engines' similar interfaces
template<typename Eng>
class RndSub;
template<typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args);
/// Implementation ...
/// Random number base class
class RndBase
{
public:
using dist_type = std::uniform_int_distribution<int>;
virtual ~RndBase() = default;
virtual int operator() (int min, int max) = 0;
protected:
dist_type dist;
/// factory method
template<typename Eng>
static Eng& eng()
{
static Eng eng {Eng {}};
return eng;
}
}; // RndBase
/// Generic Random number sub-class
/// takes advantage of the different Engines' similar interfaces
template<typename Eng>
class RndSub : public RndBase
{
public:
/// Generate a random number.
int operator() (int min, int max) override
{
return dist(eng<Eng>(), dist_type::param_type(min, max));
}
};
int main()
{
using Eminstd_rand = RndSub<std::minstd_rand>;
using Emt19937 = RndSub<std::mt19937>;
using Eranlux24 = RndSub<std::ranlux24>;
/// smart pointers because of polymorphism
using pRndBase = std::unique_ptr<RndBase>;
/// A vector of smart base pointers to typed sub-classes
std::vector<pRndBase> prndbases;
/// Add whatever generators you want
prndbases.push_back(make_unique<Eminstd_rand> ());
prndbases.push_back(make_unique<Emt19937> ());
prndbases.push_back(make_unique<Eranlux24> ());
/// random numbers between 10 and 50
for(auto const& prb : prndbases)
std::cout << (*prb) (10, 50) << std::endl;
}
template<typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args)
{
return std::unique_ptr<T> {new T {args...}};
} // make_unique()
The explanation of the "Polymorphic Hierarchy" pattern is as follows:
1) We know all the engines have different types, though the same interface.
2) We create an abstract base class (RndBase), which includes this interface. It also defines a parameterized (static) factory method named eng() that creates an object of the parameter Eng and returns a reference to it. It is expected that an engine would be used as the parameter.
3) We create a parameterized sub-class named RndSub that derives from the base class RndBase. This class defines a call operator which returns the random number obtained by invoking the distribution.
4) Effectively, what we have done is as follows:
a) The heterogeneous engines are abstracted by the parameterized sub-class RndSub. Each sub-class is different.
b) However, they now have a single common base-class RndBase.
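c) Since there is only a single base class (RndBase), we can now create a vector of std::unique_ptr<RndBase>, which is homogeneous. (A plain vector<RndBase> would not work, because RndBase is abstract and storing by value would slice the sub-classes.) The sub-classes of RndBase that the pointers refer to are heterogeneous.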
d) Since the interface is common, we can use the interface defined in the base class to invoke the implementation in the sub-class. This implementation invokes the factory method eng() in the base class to obtain an engine, which is passed as argument to the distribution. This returns the random number.
This solution is specifically for Random numbers. I am trying to make a solution for any classes (that have a similar interface).

using bind vs using pointers with random number generators

I had the following implementation in my code:
// first solution
//random.h
class Random{
public:
Random();
double next();
std::mt19937* gen;
std::uniform_real_distribution<double>* dis;
};
//random.cpp
Random::Random()
{
std::mt19937_64::result_type seed = chrono::high_resolution_clock::now().time_since_epoch().count();
gen = new std::mt19937(seed);
dis = new std::uniform_real_distribution<double>(0.0,1.0);
}
double Random::next()
{
double rand = 0;
rand = (*dis)(*gen);
return rand;
}
On the other hand someone else in the company did a different implementation, where he used the bind feature from c++11 as follows:
// second solution
//random.h
class Random{
public:
Random();
double next();
std::function<double()> real_rand;
};
//random.cpp
Random::Random()
{
std::mt19937_64::result_type seed = chrono::high_resolution_clock::now().time_since_epoch().count();
real_rand = std::bind(std::uniform_real_distribution<double>(0.0,1.0), std::mt19937_64(seed));
}
double Random::next()
{
double rand = 0;
rand = real_rand();
return rand;
}
As I understand it, you are supposed to have only one PRNG object and seed it once; you then call that object every time you need a new random number, since the seed determines the series of numbers the PRNG produces. I can clearly see this being the case for the first solution.
The question is, how is bind() working behind the scenes? Is it creating a new object on every call? Is it calling the constructor or the call operator ()? How can it tell which one to call? Are there any differences between the solutions?
std::bind generates a function object which encapsulates the arguments provided to it. In effect your colleague's code generates the following object:
struct random_call
{
random_call(unsigned seed)
: _mt19937_64(seed)
, _uniform_real(0.0, 1.0)
{}
double operator()() {
return _uniform_real(_mt19937_64);
}
std::mt19937_64 _mt19937_64;
std::uniform_real_distribution<double> _uniform_real;
};
so it looks ok (and actually pretty clever) to me!
One caveat is that you probably wouldn't want to make any copies of the binder object - even if it turns out to be copyable, copying it and then calling operator() on the original and the copy will yield the same numbers.
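The behaviour described above can be checked with a small sketch. The helper name make_real_rand is hypothetical; the point is that std::bind stores one copy of the engine and one of the distribution at construction, and each later call advances that stored engine rather than creating a new one.

```cpp
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper: binds a distribution to an engine that is seeded once.
inline std::function<double()> make_real_rand(std::uint64_t seed) {
    // std::bind keeps its own copies of both arguments; every call
    // invokes the stored distribution on the stored engine.
    return std::bind(std::uniform_real_distribution<double>(0.0, 1.0),
                     std::mt19937_64(seed));
}
```

Copying the returned object also copies the engine state, so the original and the copy will produce the same next value, which is the caveat mentioned above.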

Factory class driven by a map for determining the object type

I'm having a brain freeze and can't figure out how to best solve this issue. I'm creating objects from my factory class by calling
CreateEnvironment<T>(ARGS);
Now let's say that I want to save a lot of class types into a map, iterate through the map, and call the method at runtime like this:
ITERATION:
CreateEnvironment<(*it)>(world);
(*it) should be the class type, which could be FOO or BAR for example. How do I achieve this without a lot of if statements?
Best regards
For each class you could have a function that serves as a generator: it creates a new object and returns a pointer to it (or better, a shared_ptr).
In your container you could then store the generator function pointers.
Step by step explanations:
Suppose you have these classes to populate your world:
struct GO { virtual void say_hello()=0; }; // Game Object
struct A:GO { void say_hello() { cout<<"I'm a werewolf\n";} };
struct B:GO { void say_hello() { cout<<"I'm a soldier\n";}};
You can then define a generic GO generator:
template <class T>
shared_ptr<GO> generator() {
return make_shared<T>();
};
This would serve as a substitute for your "type" container (for the simplicity of the example I've used a vector, but you could easily opt for a map):
typedef shared_ptr<GO> (*gen_fn)();
vector <gen_fn> generators{generator<A>, generator<B>};
You could then populate your universe like this, without any if:
vector<shared_ptr<GO>> universe;
default_random_engine generator;
uniform_int_distribution<int> distribution(0,generators.size()-1);
for (int i=0; i<10; i++) {
int mytype = distribution(generator);
universe.push_back(generators[mytype]());
}
for (auto x: universe)
x->say_hello();
And here an online demo.
Statistical remark: As the distribution is uniform, you will with high probability get roughly the same proportion of each type of object. If you'd like a different distribution, you could add generators of the same type several times. For example, with generators{generator<A>, generator<B>, generator<B>}; you'd have around 67% soldiers and 33% werewolves.
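Since the question asked for a map, here is a sketch of the same generator idea keyed by name. The functions registry and create, the string keys, and the name() member are all assumptions made for illustration; they replace say_hello() only so that the result can be inspected without printing.

```cpp
#include <map>
#include <memory>
#include <string>

struct GO {  // Game Object
    virtual ~GO() = default;
    virtual std::string name() const = 0;
};
struct A : GO { std::string name() const override { return "werewolf"; } };
struct B : GO { std::string name() const override { return "soldier"; } };

template <class T>
std::shared_ptr<GO> generator() { return std::make_shared<T>(); }

using gen_fn = std::shared_ptr<GO> (*)();

// Hypothetical registry: maps a type name to its generator function.
inline const std::map<std::string, gen_fn>& registry() {
    static const std::map<std::string, gen_fn> r{
        {"A", &generator<A>}, {"B", &generator<B>}};
    return r;
}

// Creates the object registered under `key`, or returns nullptr if unknown.
inline std::shared_ptr<GO> create(const std::string& key) {
    auto it = registry().find(key);
    if (it == registry().end()) return nullptr;
    return it->second();
}
```

Iterating over registry() and calling it->second() for each entry would create one object of every registered type without any if chain.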