How to generate a massive amount of high quality Random Numbers? - c++

I'm working on a random walk simulation of particles moving in a lattice. For that reason I must create a massive amount of random numbers, about 10^12 and above. Currently I'm using the facilities C++11 provides with <random>. When profiling my program, I see that a major amount of time is spent in <random>. The vast majority of those numbers are between 0 and 1, evenly distributed. Here and there I need a number from a binomial distribution. But the focus lies on the 0..1 numbers.
The question is: What can I do to reduce the CPU time needed to generate these numbers and what would the impact be on their quality?
As you can see, I tried different engines, but that had no big effect on CPU time. Further, what is the difference between my uniform01(gen) and generate_canonical<double,numeric_limits<double>::digits>(gen) anyhow?
Edit: Reading through the answers I conclude that there is no single ideal solution for my problem. Thus I decided to first make my program capable of multi-threading and run multiple RNGs in different threads (seeded with one random_device number + a thread-specific increment). For the time being this seems to be the most unavoidable step (multi-threading would be required anyhow). As a further step, depending on the exact requirements, I consider switching to the suggested Intel RNG or to Thrust. This means that my RNG implementation should not be too complex, which it currently is not. But for now I'd like to focus on the physical correctness of my model and not on programming details; those come as soon as the output of my program is physically correct.
Thrust
Concerning Intel RNG
Here is what I do currently:
#include <limits>
#include <random>

class Generator {
public:
    Generator();
    virtual ~Generator();
    double rand01();               // random number in [0,1)
    int binomial(int n, double p); // binomial distribution with n samples with probability p
private:
    std::random_device randev;     // seed source
    /* Engines */
    std::mt19937_64 gen;
    //std::mt19937 gen;
    //std::default_random_engine gen;
    /* Distributions */
    std::uniform_real_distribution<double> uniform01;
    std::binomial_distribution<> binomialdist;
};
Generator::Generator() : randev(), gen(randev()), uniform01(0., 1.), binomialdist(1, 1.) {
}
Generator::~Generator() { }
double Generator::rand01() {
    //return uniform01(gen);
    return std::generate_canonical<double, std::numeric_limits<double>::digits>(gen);
}
int Generator::binomial(int n, double p) {
    // Re-parameterise the stored distribution, then draw from it.
    binomialdist.param(std::binomial_distribution<>::param_type(n, p));
    return binomialdist(gen);
}

You can pre-generate random numbers and use them when you need them.
If you need true random numbers, I suggest using a service like http://www.random.org/, which derives its random numbers from environmental noise rather than from an algorithm.
And, speaking about random numbers, you must also check this:

If you need a massive amount of random numbers, and I mean MASSIVE, do a careful search on the internet for IBM's floating point random number generator, published maybe ten years ago. You'll have to buy either a PowerPC machine, or a newer Intel machine with fused multiply-add. They achieved random numbers at a rate of one per cycle per core. So if you bought a new Mac Pro, you could achieve probably 50 billion random numbers per second.

Perhaps instead of using a CPU you could use a GPU to generate many numbers concurrently?
Efficient Random Number Generation and Application Using CUDA

On my i3, the following program runs in about five seconds:
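#include <cstdio>
#include <random>
std::mt19937_64 foo;
// Relies on IEEE 754 doubles and on reading the inactive union member,
// which is formally undefined in C++ but supported by GCC as an extension.
double drand() {
    union {
        double d;
        long long l;
    } x;
    x.d = 1.0;
    x.l |= foo() & ((1LL << 52) - 1); // fill the 52 mantissa bits: value in [1,2)
    return x.d - 1;
}
int main() {
    double d = 0;
    for (int i = 0; i < 1e9; i++)
        d += drand();
    std::printf("%g\n", d);
}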
whereas replacing the drand() call with the following results in a program that runs in about ten seconds:
double drand2() {
return std::generate_canonical<double,
std::numeric_limits<double>::digits>(foo);
}
Using the following instead of drand() also results in a program that runs in about ten seconds:
std::uniform_real_distribution<double> uni;
double drand3() {
return uni(foo);
}
Perhaps the hacky drand() above suits your purposes better than the standard solutions.
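For comparison, here is a minimal sketch of a more portable way to turn raw engine output into a double in [0, 1). It is a different technique from the union trick above, not the answerer's code: keep the top 53 bits of a 64-bit draw and scale by 2^-53. The name drand_portable is made up for illustration, and the code assumes double has a 53-bit significand (IEEE 754 binary64).

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: maps one 64-bit engine output to a double in [0, 1).
// Assumes double has a 53-bit significand (IEEE 754 binary64).
inline double drand_portable(std::mt19937_64& engine) {
    const std::uint64_t top53 = engine() >> 11;  // keep the 53 high bits
    // Multiply by 2^-53; the largest result is (2^53 - 1) / 2^53 < 1.
    return static_cast<double>(top53) * (1.0 / 9007199254740992.0);
}
```

Whether this is faster or slower than the alternatives above on a given machine would need measuring.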

Task Definition
The OP asks for answers on both:
1. Speed of generation -- taking a set of 10^12 random numbers to be "massive"
and
2. Quality of the generator -- with the weak assumption that numbers merely evenly distributed over some range of values are also random
However, there are more fundamental aspects that must be addressed and solved for the real system:
A. Decide whether your simulation must guarantee a repeatable sequence of random numbers for future re-runs of an experiment.
If it need not, re-runs of the simulated experiment will yield different results in principle, but the randomizer process (or pre-randomizer and randomized-selector) need not worry about re-entrant, stateful operation and becomes much simpler to implement.
B. Decide to what level you need to prove the quality of randomness of the generated numbers (or whether the generated sets must follow some specific statistical law: some known synthetic distribution, or truly random with maximal Kolmogorov complexity of the resulting set). One need not be an NSA expert to state that numerically generating true-random sequences is a very hard problem, and that high-randomness output has computational costs attached.
Hyper-chaotic and true-random sequences are computationally extremely expensive. Using low- or poor-randomness generators is not an option for randomness-quality-sensitive applications (whatever the marketing papers may say, no MIL-STD- or NSA-graded system will ever accept this compromised quality in environments where the results really matter, so why settle for less in scientific simulations? Perhaps it is not a problem if you do not mind missing many "unvisited" states of the simulated phenomena).
C. Verify how many random numbers your simulation needs to "consume per [usec]", and whether this design parameter is constant or may scale up when moving to a multi-threaded, vectorised, or Grid-/Cloud-based distributed computation framework.
D. Decide whether your simulation requires global, per-thread, or per-Grid/Cloud-node access management for the pool of randomized numbers under a vectorised or Grid/Cloud-based computational strategy.
Task Solution Approach
The fastest [1] and best [2] solution, with [A] and [B] solved and options left open for [D], is to pre-generate numbers of the highest randomness quality into an adequately sized access pool (and to pay an acceptable cost for [C] and [D] in access-policy and access-management controls in order to re-read from the pool rather than re-generate).
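The pre-generated pool idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name RandomPool, the single-threaded refill policy, and the choice of std::mt19937_64 are mine, and the capacity is assumed to be greater than zero. Whether re-reading from a pool actually beats generating on demand depends on memory traffic and would need profiling.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical pool: generates numbers in bulk, then serves them one by one.
// Assumes capacity > 0 and single-threaded use.
class RandomPool {
public:
    RandomPool(std::size_t capacity, std::uint64_t seed)
        : engine_(seed), dist_(0.0, 1.0), buffer_(capacity), next_(capacity) {}

    // Returns the next pooled number, refilling the buffer when it runs dry.
    double next() {
        if (next_ == buffer_.size()) refill();
        return buffer_[next_++];
    }

private:
    // Overwrites the whole buffer with fresh draws and rewinds the cursor.
    void refill() {
        for (double& x : buffer_) x = dist_(engine_);
        next_ = 0;
    }

    std::mt19937_64 engine_;
    std::uniform_real_distribution<double> dist_;
    std::vector<double> buffer_;
    std::size_t next_;  // index of the next unread entry
};
```

Because the seed is fixed at construction, two pools built with the same seed hand out the same sequence regardless of their capacity, which covers requirement [A].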

Related

Is the seed of the mersenne_twister_engine instance invariant? [duplicate]

Inspired by this and similar questions, I want to learn how the mt19937 pseudo-random number generator in C++11 behaves when it is seeded with the same input on two separate machines.
In other words, say we have the following code:
std::mt19937 gen{ourSeed};
std::uniform_int_distribution<int> dist{0, 10000};
int randNumber = dist(gen);
If we try this code on different machines at different times, will we get the same sequence of randNumber values or a different sequence each time?
And in either case, why is this the case?
A further question:
Regardless of the seed, will this code generate random numbers indefinitely? I mean, for example, if we use this block of code in a program that runs for months without stopping, will there be any problem in the generation of the numbers or in their uniformity?
The generator will generate the same values.
The distributions may not, at least with different compilers or library versions. The standard did not specify their behaviour to that level of detail. If you want stability between compilers and library versions, you have to roll your own distribution.
Barring library/compiler changes, that will return the same values in the same sequence. But if you care, write your own distribution.
...
All PRNGs have patterns and periods. mt19937 is named after its period of 2^19937-1, which is unlikely to be a problem. But other patterns can develop. MT PRNGs are robust against many statistical tests, but they are not cryptographically secure PRNGs.
So it being a problem if you run for months will depend on specific details of what you'd find to be a problem. However, mt19937 is going to be a better PRNG than anything you are likely to write yourself. But assume attackers can predict its future behaviour from past evidence.
Regardless of the seed, will this code generate randomly numbers infinitely ? I mean for example, if we use this block of code in a program that runs for months without stopping, will there be any problem in the generation of the number or in the uniformity of the numbers ?
The RNGs we deal with in standard C++ are called pseudo-random RNGs. By definition, such an RNG is a purely computational device with a multi-bit state (you could think of the state as a large bit vector) and three functions:
state seed2state(seed);
state next_state(state);
uint(32|64)_t state2output(state);
and that is it. Obviously, the state has finite size, 19937 bits in the case of MT19937, so the total number of states is 2^19937 and thus the MT19937 next_state() function is periodic, with a period of at most 2^19937. This number is really HUGE, and most likely more than enough for a typical simulation.
But the output is at most 64 bits wide, so the output space is 2^64. It means that during a large run any particular output appears quite a few times. What matters is when not only some 64-bit number appears again, but also the number after that, and after that, and after that: this is when you know the RNG period has been reached.
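The distinction between state space and output space is easier to see on a toy scale. The sketch below is purely illustrative and has nothing to do with how MT19937 works internally: ToyRng and period_of are made-up names, the state is 8 bits, and only the top 4 bits are exposed as output. The update x -> 5x + 1 (mod 256) is a linear congruential step that cycles through all 256 states.

```cpp
#include <cstdint>

// Toy generator with an 8-bit state, for illustration only.
struct ToyRng {
    std::uint8_t state;

    explicit ToyRng(std::uint8_t seed) : state(seed) {}  // plays the role of seed2state

    // next_state: x -> 5x + 1 (mod 256), which visits all 256 states.
    void advance() { state = static_cast<std::uint8_t>(5u * state + 1u); }

    // state2output: only the top 4 bits are exposed, so outputs repeat often.
    unsigned next() { advance(); return state >> 4; }
};

// Counts the steps until the state returns to its starting value.
inline unsigned period_of(ToyRng rng) {
    const std::uint8_t start = rng.state;
    unsigned steps = 0;
    do { rng.advance(); ++steps; } while (rng.state != start);
    return steps;
}
```

Each of the 16 possible outputs shows up many times within one cycle, yet the sequence as a whole only starts over after 256 steps, when the state itself recurs.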
If we try this code on different machines at different times, will we get the same sequence of randNumber values or a different sequence each time?
Generators are defined rather strictly, and you'll get the same bit stream. For example for MT19937 from C++ standard (https://timsong-cpp.github.io/cppwp/rand)
class mersenne_twister_engine {
...
static constexpr result_type default_seed = 5489u;
...
and function seed2state described as (https://timsong-cpp.github.io/cppwp/rand#eng.mers-6)
Effects: Constructs a mersenne_twister_engine object. Sets X_{-n} to value mod 2^w. Then, iteratively for i=-n,…,-1, sets X_i to ...
Function next_state is described as well together with test value at 10000th invocation. Standard says (https://timsong-cpp.github.io/cppwp/rand#predef-3)
using mt19937 = mersenne_twister_engine<uint_fast32_t,32,624,397,31,0x9908b0df,11,0xffffffff,7,0x9d2c5680,15,0xefc60000,18,1812433253>;
Required behavior: The 10000th consecutive invocation of a default-constructed object of type mt19937 shall produce the value 4123659995.
The big four compilers (GCC, Clang, VC++, Intel C++) that I used all produced the same MT19937 output.
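The required value quoted above is easy to check on any toolchain. A minimal sketch, where nth_output is a made-up helper name and n is assumed to be at least 1:

```cpp
#include <random>

// Returns the value produced by the n-th call (1-based) of a
// default-constructed mt19937. Assumes n >= 1.
inline std::mt19937::result_type nth_output(unsigned n) {
    std::mt19937 gen;     // default seed 5489u
    gen.discard(n - 1);   // skip the first n-1 values
    return gen();
}
```

On a conforming implementation, nth_output(10000) should equal 4123659995.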
Distributions, on the other hand, are not specified that precisely, and therefore vary between compilers and libraries. If you need portable distributions, you either roll your own or use something from Boost or similar libraries.
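As a rough idea of what rolling your own might look like for the integer case, here is a sketch of a uniform integer draw built only on raw mt19937 output, using rejection sampling to avoid modulo bias. The name portable_uniform is hypothetical, lo <= hi is assumed, and this is one possible approach rather than what any particular standard library does.

```cpp
#include <cstdint>
#include <random>

// Hypothetical portable replacement for uniform_int_distribution on [lo, hi].
// Uses only raw mt19937 output, whose values the standard pins down exactly.
// Assumes lo <= hi.
inline std::uint32_t portable_uniform(std::mt19937& gen,
                                      std::uint32_t lo, std::uint32_t hi) {
    const std::uint64_t span = static_cast<std::uint64_t>(hi) - lo + 1; // number of outcomes
    const std::uint64_t total = std::uint64_t(1) << 32;                 // raw outputs available
    const std::uint64_t limit = total - total % span;                   // largest multiple of span
    std::uint64_t r;
    do {
        r = gen() & 0xFFFFFFFFu;  // mt19937 yields 32-bit values
    } while (r >= limit);         // reject the tail to avoid modulo bias
    return lo + static_cast<std::uint32_t>(r % span);
}
```

Since every step is plain integer arithmetic on engine output, the same seed should give the same sequence on any conforming implementation.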
Any pseudo RNG which takes a seed will give you the same sequence for the same seed every time, on every machine. This happens since the generator is just a (complex) mathematical function, and has nothing actually random about it. Most times when you want to randomize, you take the seed from the system clock, which constantly changes so each run will be different.
It is useful to have the same sequence in computer games for example when you have a randomly generated world and want to generate the exact same one, or to avoid people cheating using save games in a game with random chances.

C++ fast normal random number generator

I'm using the mt19937 generator to generate normal random numbers as shown below:
normal_distribution<double> normalDistr(0, 1);
mt19937 generator(123);
vector<double> randNums(1000000);
for (size_t i = 0; i != 1000000; ++i)
{
randNums[i] = normalDistr(generator);
}
The above code works, however since I'm generating more than 100 million normal random numbers in my code, the above is very slow.
Is there a faster way to generate normal random numbers?
The following is some background on how the code would be used:
Quality of the random numbers is not that important
Precision of the numbers is not that important, either double or float is OK
The normal distribution always has mean = 0 and sigma = 1
EDIT:
@Dúthomhas, Andrew:
After profiling, the following function is taking up more than 50% of the time:
std::normal_distribution<double>::_Eval<std::mersenne_twister_engine<unsigned int,32,624,397,31,2567483615,11,4294967295,7,2636928640,15,4022730752,18,1812433253> >
Most importantly, do you really need 100,000,000 random numbers simultaneously? The writing to and subsequent reading from RAM of all these data unavoidably requires significant time. If you only need the random numbers one at a time, you should avoid that.
Assuming that you do need all of these numbers in RAM, then you should first
profile your code if you really want to know where the CPU time is spent/lost.
Second, you should avoid unnecessary re-allocation and initialisation of the data. This is most easily done by using std::vector::reserve(final_size) in conjunction with std::vector::push_back().
Third, you could use a faster RNG than std::mt19937. That RNG is recommended when the quality of the numbers is of importance. The online documentation says that the lagged Fibonacci generator (implemented in std::subtract_with_carry_engine) is fast, but it may not have a long enough recurrence period -- you must check this. Alternatively, you may want to use std::minstd_rand (which uses the linear congruential generator).
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>
std::vector<double> make_normal_random(std::size_t number,
                                       std::uint_fast32_t seed)
{
    std::normal_distribution<double> normalDistr(0, 1);
    std::minstd_rand generator(seed);
    std::vector<double> randNums;
    randNums.reserve(number); // one allocation up front
    while (number--)
        randNums.push_back(normalDistr(generator));
    return randNums;
}
You also will want to look into std::vector reserve rather than resize. It will allow you get all the memory you will need in 1 shot. I am assuming you don't need all 100 million doubles at once?
If it really is the generator that causes the performance degradation, then use the ordinary rand function (you need to draw numbers in pairs), transform each to a float or double in the interval (0, 1), then apply the Box-Muller transformation.
That will be hard to beat in terms of time, but note that the statistical properties are no better than those of rand.
The Numerical Recipes routine gasdev does this - you should be able to download a copy.
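A minimal sketch of the Box-Muller idea follows. It uses std::minstd_rand in place of rand() so that the example can be seeded without global state; that substitution is mine, as is the function name box_muller_normals. It is the basic trigonometric form of the transform, and no claim is made that it matches gasdev line for line.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Fills a vector with approximately standard-normal values using the basic
// Box-Muller transform on pairs of uniform draws.
inline std::vector<double> box_muller_normals(std::size_t count,
                                              std::uint_fast32_t seed) {
    const double two_pi = 6.283185307179586;
    std::minstd_rand engine(seed);
    // minstd_rand returns values in [1, max], so u = value / (max + 1) lies in (0, 1).
    const double scale = 1.0 / (static_cast<double>(std::minstd_rand::max()) + 1.0);
    std::vector<double> out;
    out.reserve(count + 1);
    while (out.size() < count) {
        const double u1 = engine() * scale;  // never 0, so log(u1) is finite
        const double u2 = engine() * scale;
        const double r = std::sqrt(-2.0 * std::log(u1));
        out.push_back(r * std::cos(two_pi * u2));  // each pair of uniforms
        out.push_back(r * std::sin(two_pi * u2));  // yields two normals
    }
    out.resize(count);  // drop the spare value when count is odd
    return out;
}
```

As the answer warns, the statistical quality is bounded by the underlying uniform generator, so this only makes sense where quality is not critical.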

How to properly choose rng seed for parallel processes

I'm currently working on a C/C++ project where I'm using a random number generator (gsl or boost). The whole idea can be simplified to a non-trivial stochastic process which receives a seed and returns results. I'm computing averages over different realisations of the process.
So, the seed is important: the processes must be with different seeds or it will bias the averages.
So far, I'm using time(NULL) to give a seed. However, if two processes start at the same second, the seed is the same. That happens because I'm using parallelisation (using openMP).
So, my question is: how to implement a "seed giver" in C/C++ which gives independent seeds?
For instance, I thought of using the thread number (thread_num), seed = time(NULL)*thread_num. However, this means that the seeds are correlated: they are multiples of each other. Does that pose any problem for the "pseudo-random" generator, or is it as good as sequential seeds?
The requirements are that it must work on both Mac OS (my PC) and a Linux distribution similar to CentOS (the cluster) (and naturally give independent realisations).
A commonly used scheme for this is to have a "master" RNG used to generate seeds for each process-specific RNG.
The advantage of such a scheme is that the whole computation is determined by only one seed, which you can record somewhere to be able to replay any simulation (this might be useful to debug nasty bugs).
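A sketch of the master-RNG scheme under some assumptions: make_worker_engines is a made-up name, std::mt19937_64 stands in for whichever gsl or boost generator is actually used, and the engines are created up front on one thread before being handed to the workers. Note that this gives reproducibility from one recorded seed; it does not by itself prove that the worker streams are statistically independent.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Builds one engine per worker, each seeded from a single master engine.
inline std::vector<std::mt19937_64> make_worker_engines(std::uint64_t master_seed,
                                                        std::size_t workers) {
    std::mt19937_64 master(master_seed);  // the only seed that needs recording
    std::vector<std::mt19937_64> engines;
    engines.reserve(workers);
    for (std::size_t i = 0; i < workers; ++i)
        engines.emplace_back(master());   // next master output becomes worker i's seed
    return engines;
}
```

In an OpenMP loop, each thread would then draw only from the engine at its own thread index.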
We ran into a similar problem on a Beowulf computing grid, the solution we used was to incorporate the pid of the process into the RNG seed, like so:
time(NULL)*thread_num*getpid()
Of course, you could just read from /dev/urandom or /dev/random into an integer.
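Reading a seed from /dev/urandom could look roughly like this. The function name read_urandom_seed is made up, and the sketch assumes a Unix-like system where that device exists; on failure it reports false and leaves the decision about a fallback to the caller.

```cpp
#include <cstdint>
#include <fstream>

// Reads one 64-bit seed from /dev/urandom.
// Returns false if the device cannot be opened or read.
inline bool read_urandom_seed(std::uint64_t& seed) {
    std::ifstream urandom("/dev/urandom", std::ios::in | std::ios::binary);
    if (!urandom) return false;
    urandom.read(reinterpret_cast<char*>(&seed), sizeof seed);
    return static_cast<bool>(urandom);
}
```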
When faced with this problem I often use seed_rng from Boost.Uuid. It uses time, clock and random data from /dev/urandom to calculate a seed. You can use it like
#include <boost/uuid/seed_rng.hpp>
#include <iostream>
int main() {
int seed = boost::uuids::detail::seed_rng()();
std::cout << seed << std::endl;
}
Note that seed_rng comes from a detail namespace, so it can go away without further notice. In that case writing your own implementation based on seed_rng shouldn't be too hard.
Mac OS is Unix too, so it probably has /dev/random. If so, that's the
best solution for obtaining the seeds. Otherwise, if the generator is
good, taking time( NULL ) once, and then incrementing it for the seed
of each generator, should give reasonably good results.
If you are on x86 and don't mind making the code non-portable then you could read the Time Stamp Counter (TSC) which is a 64-bit counter that increments at the CPU (max) clock rate (about 3 GHz) and use that as a seed.
#include <stdint.h>
static inline uint64_t rdtsc()
{
uint64_t tsc;
asm volatile
(
"rdtsc\n\t"
"shl\t$32,%%rdx\n\t" // rdx = TSC[ 63 : 32 ] : 0x00000000
"add\t%%rdx,%%rax\n\t" // rax = TSC[ 63 : 0 ]
: "=a" (tsc) : : "%rdx"
);
return tsc;
}
When you compare two infinite sequences produced by the same pseudo-random number generator with different seeds, you can see that they are the same sequence delayed by some time tau. Usually this time scale is much bigger than that of your problem, which ensures that the two random walks are uncorrelated.
If your stochastic process is in a high dimensional phase space, I think that one good suggestion could be:
seed = MAXIMUM_INTEGER/NUMBER_OF_PARALLEL_RW*thread_num + time(NULL)
Notice that with this scheme you are not guaranteeing that the time tau is big!
If you have some knowledge of your system's time scale, you can call your random number generator some number of times in order to generate seeds that are equidistant by some time interval.
Maybe you could try std::chrono high resolution clock from C++11:
Class std::chrono::high_resolution_clock represents the clock with the
smallest tick period available on the system. It may be an alias of
std::chrono::system_clock or std::chrono::steady_clock, or a third,
independent clock.
http://en.cppreference.com/w/cpp/chrono/high_resolution_clock
But to be honest, I'm not sure that there is anything wrong with srand(0), srand(1), srand(2), ..., though my knowledge of rand is very basic. :/
For crazy safety consider this:
Note that all pseudo-random number generators described below are
CopyConstructible and Assignable. Copying or assigning a generator
will copy all its internal state, so the original and the copy will
generate the identical sequence of random numbers.
http://www.boost.org/doc/libs/1_51_0/doc/html/boost_random/reference.html#boost_random.reference.generators
Since most of the generators have crazy long cycles you could generate one, copy it as first generator, generate X numbers with original, copy it as second, generate X numbers with original, copy it as third...
If your users each call their own generator fewer than X times, the streams will not overlap.
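The copy-and-advance scheme just described can be sketched with the standard engines, which are copyable in the same way as the Boost ones. The name make_spaced_streams and the choice of std::mt19937_64 are assumptions for illustration; stride plays the role of X. Be aware that discard on this engine typically costs time proportional to the number of skipped values, so a very large stride makes setup slow.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Returns `count` copies of one engine, where copy k starts k * stride
// draws into the original sequence.
inline std::vector<std::mt19937_64> make_spaced_streams(std::uint64_t seed,
                                                        std::size_t count,
                                                        unsigned long long stride) {
    std::mt19937_64 original(seed);
    std::vector<std::mt19937_64> streams;
    streams.reserve(count);
    for (std::size_t k = 0; k < count; ++k) {
        streams.push_back(original);  // the copy carries the full internal state
        original.discard(stride);     // advance the original by `stride` draws
    }
    return streams;
}
```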
The way I understand your question, you have multiple processes using the same pseudo-random number generation algorithm, and you want each "stream" of random numbers (in each process) to be independent of the others. Am I correct?
In that case, you are right to suspect that giving different (correlated) seeds does not guarantee you anything unless the RNG algorithm says so. You basically have two solutions:
Simple version
Use a single source of random numbers, with a single seed. Then feed random numbers in a round-robin fashion to each process.
This solution is slow but provides some guarantee that the numbers you give to your processes are OK.
You can do the same thing by generating all the random numbers you need at once, and then splitting this set into as many slices as you have processes.
Use a RNG designed for that
You can find in papers and on the web several algorithms specifically designed to provide independent streams of random numbers from a single initial state. They are complicated but most provide source code. The idea is generally to "split" the RNG space (values you can obtain from the initial state) into various chunks like above. They are just faster because the algorithm used makes it possible to compute easily what would be the state of the RNG if you skipped a given number of values.
These generators are generally called "parallel random number generators".
The most popular ones are probably these two:
RngStreams: http://statmath.wu.ac.at/software/RngStreams/
SPRNG: http://sprng.cs.fsu.edu/
Check their manuals to fully understand what they do, how they do it, and if it really is what you need.