I need a 'good' way to initialize the pseudo-random number generator in C++. I've found an article that states:
In order to generate random-like numbers, srand is usually initialized to some distinctive value, like those related with the execution time. For example, the value returned by the function time (declared in header ctime) is different each second, which is distinctive enough for most randoming needs.
Unixtime isn't distinctive enough for my application. What's a better way to initialize this? Bonus points if it's portable, but the code will primarily be running on Linux hosts.
I was thinking of doing some pid/unixtime math to get an int, or possibly reading data from /dev/urandom.
Thanks!
EDIT
Yes, I am actually starting my application multiple times a second and I've run into collisions.
This is what I've used for small command line programs that can be run frequently (multiple times a second):
unsigned long seed = mix(clock(), time(NULL), getpid());
Where mix is:
// Robert Jenkins' 96 bit Mix Function
unsigned long mix(unsigned long a, unsigned long b, unsigned long c)
{
a=a-b; a=a-c; a=a^(c >> 13);
b=b-c; b=b-a; b=b^(a << 8);
c=c-a; c=c-b; c=c^(b >> 13);
a=a-b; a=a-c; a=a^(c >> 12);
b=b-c; b=b-a; b=b^(a << 16);
c=c-a; c=c-b; c=c^(b >> 5);
a=a-b; a=a-c; a=a^(c >> 3);
b=b-c; b=b-a; b=b^(a << 10);
c=c-a; c=c-b; c=c^(b >> 15);
return c;
}
The best answer is to use <random>. If you are using a pre C++11 version, you can look at the Boost random number stuff.
But if we are talking about rand() and srand()
The best simplest way is just to use time():
int main()
{
srand(time(nullptr));
...
}
Be sure to do this at the beginning of your program, and not every time you call rand()!
Side Note:
NOTE: There is a discussion in the comments below about this being insecure (which is true, but ultimately not relevant (read on)). So an alternative is to seed from the random device /dev/random (or some other secure real(er) random number generator). BUT: Don't let this lull you into a false sense of security. This is rand() we are using. Even if you seed it with a brilliantly generated seed it is still predictable (if you have any value you can predict the full sequence of next values). This is only useful for generating "pseudo" random values.
If you want "secure" you should probably be using <random> (Though I would do some more reading on a security informed site). See the answer below as a starting point: https://stackoverflow.com/a/29190957/14065 for a better answer.
Secondary note: Using the random device actually solves the issues with starting multiple copies per second better than my original suggestion below (just not the security issue).
Back to the original story:
Every time you start up, time() will return a unique value (unless you start the application multiple times a second). Truncated to a 32-bit seed, the value only repeats after about 136 years (2^32 seconds).
I know you don't think time is unique enough but I find that hard to believe. But I have been known to be wrong.
If you are starting a lot of copies of your application simultaneously you could use a timer with a finer resolution. But then you run the risk of a shorter time period before the value repeats.
OK, so if you really think you are starting multiple applications a second.
Then use a finer grain on the timer.
#include <cstdlib>    // srand
#include <sys/time.h> // gettimeofday, struct timeval (POSIX)
int main()
{
struct timeval time;
gettimeofday(&time,NULL);
// microsecond has 1 000 000
// Assuming you did not need quite that accuracy
// Also do not assume the system clock has that accuracy.
srand((time.tv_sec * 1000) + (time.tv_usec / 1000));
// The trouble here is that the seed will repeat every
// 24 days or so.
// If you use 100 (rather than 1000) the seed repeats every 248 days.
// Do not make the MISTAKE of using just the tv_usec
// This will mean your seed repeats every second.
}
If you need a better random number generator, don't use the libc rand. Instead, read from /dev/random or /dev/urandom directly (for example, read an int straight from it).
The only real benefit of the libc rand is that given a seed, it is predictable which helps with debugging.
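Reading a seed straight from /dev/urandom, as suggested above, could look roughly like the sketch below. The function names read_urandom_seed and seed_rand are made up for illustration, and falling back to time() when the device cannot be read is an assumed policy, not something the answer prescribes.

```cpp
#include <cstdlib>
#include <ctime>
#include <fstream>

// Reads sizeof(unsigned int) bytes from /dev/urandom into `seed`.
// Returns true on success, false if the device is missing or the read is short.
bool read_urandom_seed(unsigned int& seed) {
    std::ifstream urandom("/dev/urandom", std::ios::in | std::ios::binary);
    if (!urandom) {
        return false;
    }
    urandom.read(reinterpret_cast<char*>(&seed), sizeof seed);
    return urandom.gcount() == static_cast<std::streamsize>(sizeof seed);
}

// Seeds rand() once. Falls back to time() (assumed policy) when
// /dev/urandom is unavailable, e.g. on non-Unix hosts.
void seed_rand() {
    unsigned int seed = 0;
    if (!read_urandom_seed(seed)) {
        seed = static_cast<unsigned int>(std::time(nullptr));
    }
    std::srand(seed);
}
```

Call seed_rand() once at the start of main, before the first call to rand().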
On windows:
srand(GetTickCount());
provides a better seed than time() since it's in milliseconds.
C++11 random_device
If you need reasonable quality then you should not be using rand() in the first place; you should use the <random> library. It provides lots of great functionality like a variety of engines for different quality/size/performance trade-offs, re-entrancy, and pre-defined distributions so you don't end up getting them wrong. It may even provide easy access to non-deterministic random data, (e.g., /dev/random), depending on your implementation.
#include <random>
#include <iostream>
int main() {
std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 eng(seed);
std::uniform_int_distribution<> dist{1,100};
for (int i=0; i<50; ++i)
std::cout << dist(eng) << '\n';
}
eng is a source of randomness, here a built-in implementation of the Mersenne Twister. We seed it using random_device, which in any decent implementation will be a non-deterministic RNG, and seed_seq to combine more than 32 bits of random data. For example, in libc++ random_device accesses /dev/urandom by default (though you can give it another file to access instead).
Next we create a distribution such that, given a source of randomness, repeated calls to the distribution will produce a uniform distribution of ints from 1 to 100. Then we proceed to using the distribution repeatedly and printing the results.
The best way is to use another pseudorandom number generator.
The Mersenne Twister (and Wichmann-Hill) are my recommendations.
http://en.wikipedia.org/wiki/Mersenne_twister
I suggest you look at the unix_random.c file in the Mozilla code (I guess it is under mozilla/security/freebl/ ...); it should be in the freebl library.
It uses system call info (like pwd, netstat ...) to generate noise for the random number. It is written to support most platforms (which can gain me bonus points :D).
The real question you must ask yourself is what randomness quality you need.
libc's rand is typically an LCG (linear congruential generator).
The quality of randomness will be low whatever input you provide srand with.
If you simply need to make sure that different instances will have different initializations, you can mix process id (getpid), thread id and a timer. Mix the results with xor. Entropy should be sufficient for most applications.
Example :
struct timeb tp;
ftime(&tp);
srand(static_cast<unsigned int>(getpid()) ^
static_cast<unsigned int>(pthread_self()) ^
static_cast<unsigned int >(tp.millitm));
For better random quality, use /dev/urandom. You can make the above code portable in using boost::thread and boost::date_time.
The C++11 version of the top voted post by Jonathan Wright:
#include <ctime>
#include <random>
#include <thread>
...
const auto time_seed = static_cast<size_t>(std::time(0));
const auto clock_seed = static_cast<size_t>(std::clock());
const size_t pid_seed =
std::hash<std::thread::id>()(std::this_thread::get_id());
std::seed_seq seed_value { time_seed, clock_seed, pid_seed };
...
// E.g seeding an engine with the above seed.
std::mt19937 gen;
gen.seed(seed_value);
#include <stdio.h>
#include <sys/time.h>
int main()
{
struct timeval tv;
gettimeofday(&tv,NULL);
/* tv_usec is a suseconds_t; cast to long so the printf format matches */
printf("%ld\n", (long)tv.tv_usec);
return 0;
}
tv.tv_usec is in microseconds. This should be an acceptable seed.
As long as your program is only running on Linux (and your program is an ELF executable), you are guaranteed that the kernel provides your process with a unique random seed in the ELF aux vector. The kernel gives you 16 random bytes, different for each process, which you can get with getauxval(AT_RANDOM). To use these for srand, use just an int of them, as such:
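#include <stdlib.h>   /* srand */
#include <string.h>   /* memcpy */
#include <sys/auxv.h> /* getauxval, AT_RANDOM */
void initrand(void)
{
unsigned int seed;
/* AT_RANDOM is the address of 16 random bytes supplied by the kernel */
const void *bytes = (const void *)getauxval(AT_RANDOM);
if (bytes == NULL) return; /* entry not present: leave rand() unseeded */
/* copy instead of dereferencing a cast pointer (avoids alignment issues) */
memcpy(&seed, bytes, sizeof seed);
srand(seed);
}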
It may be possible that this also translates to other ELF-based systems. I'm not sure what aux values are implemented on systems other than Linux.
Suppose you have a function with a signature like:
int foo(char *p);
An excellent source of entropy for a random seed is a hash of the following:
Full result of clock_gettime (seconds and nanoseconds) without throwing away the low bits - they're the most valuable.
The value of p, cast to uintptr_t.
The address of p, cast to uintptr_t.
At least the third, and possibly also the second, derive entropy from the system's ASLR, if available (the initial stack address, and thus current stack address, is somewhat random).
I would also avoid using rand/srand entirely, both for the sake of not touching global state, and so you can have more control over the PRNG that's used. But the above procedure is a good (and fairly portable) way to get some decent entropy without a lot of work, regardless of what PRNG you use.
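A minimal sketch of the procedure described above, under some assumptions: std::chrono::high_resolution_clock stands in for clock_gettime to keep the example portable, FNV-1a is used as the hash purely for illustration (the answer does not name one), and the names fnv1a_mix and make_seed are hypothetical.

```cpp
#include <chrono>
#include <cstdint>

// Folds the 8 bytes of `value` into a running FNV-1a hash.
std::uint64_t fnv1a_mix(std::uint64_t hash, std::uint64_t value) {
    for (int i = 0; i < 8; ++i) {
        hash ^= (value >> (8 * i)) & 0xffu;
        hash *= 1099511628211ULL;  // FNV-1a 64-bit prime
    }
    return hash;
}

// Hashes three inputs: the full clock reading (low bits included),
// the value of p, and the address of the parameter p itself.
std::uint64_t make_seed(char* p) {
    const auto since_epoch =
        std::chrono::high_resolution_clock::now().time_since_epoch();
    const auto ticks = static_cast<std::uint64_t>(since_epoch.count());

    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    hash = fnv1a_mix(hash, ticks);
    hash = fnv1a_mix(hash, static_cast<std::uint64_t>(
                               reinterpret_cast<std::uintptr_t>(p)));
    hash = fnv1a_mix(hash, static_cast<std::uint64_t>(
                               reinterpret_cast<std::uintptr_t>(&p)));
    return hash;
}
```

The result can be passed to whatever PRNG is in use; how much entropy the two pointer inputs contribute depends on whether ASLR is active on the system.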
For those using Visual Studio here's yet another way:
#include "stdafx.h"
#include <time.h>
#include <windows.h>
const __int64 DELTA_EPOCH_IN_MICROSECS= 11644473600000000;
struct timezone2
{
__int32 tz_minuteswest; /* minutes W of Greenwich */
bool tz_dsttime; /* type of dst correction */
};
struct timeval2 {
__int32 tv_sec; /* seconds */
__int32 tv_usec; /* microseconds */
};
int gettimeofday(struct timeval2 *tv/*in*/, struct timezone2 *tz/*in*/)
{
FILETIME ft;
__int64 tmpres = 0;
TIME_ZONE_INFORMATION tz_winapi;
int rez = 0;
ZeroMemory(&ft, sizeof(ft));
ZeroMemory(&tz_winapi, sizeof(tz_winapi));
GetSystemTimeAsFileTime(&ft);
tmpres = ft.dwHighDateTime;
tmpres <<= 32;
tmpres |= ft.dwLowDateTime;
/*converting file time to unix epoch*/
tmpres /= 10; /*convert into microseconds*/
tmpres -= DELTA_EPOCH_IN_MICROSECS;
tv->tv_sec = (__int32)(tmpres * 0.000001);
tv->tv_usec = (tmpres % 1000000);
//_tzset(),don't work properly, so we use GetTimeZoneInformation
rez = GetTimeZoneInformation(&tz_winapi);
tz->tz_dsttime = (rez == 2) ? true : false;
tz->tz_minuteswest = tz_winapi.Bias + ((rez == 2) ? tz_winapi.DaylightBias : 0);
return 0;
}
int main(int argc, char** argv) {
struct timeval2 tv;
struct timezone2 tz;
ZeroMemory(&tv, sizeof(tv));
ZeroMemory(&tz, sizeof(tz));
gettimeofday(&tv, &tz);
unsigned long seed = tv.tv_sec ^ (tv.tv_usec << 12);
srand(seed);
}
Maybe a bit overkill but works well for quick intervals. gettimeofday function found here.
Edit: upon further investigation rand_s might be a good alternative for Visual Studio, it's not just a safe rand(), it's totally different and doesn't use the seed from srand. I had presumed it was almost identical to rand just "safer".
To use rand_s just don't forget to #define _CRT_RAND_S before stdlib.h is included.
Assuming that the randomness of srand() + rand() is enough for your purposes, the trick is in selecting the best seed for srand. time(NULL) is a good starting point, but you'll run into problems if you start more than one instance of the program within the same second. Adding the pid (process id) is an improvement as different instances will get different pids. I would multiply the pid by a factor to spread them more.
But let's say you are using this for some embedded device and you have several in the same network. If they are all powered at once and you are launching the several instances of your program automatically at boot time, they may still get the same time and pid and all the devices will generate the same sequence of "random" numbers. In that case, you may want to add some unique identifier of each device (like the CPU serial number).
The proposed initialization would then be:
srand(time(NULL) + 1000 * getpid() + (uint) getCpuSerialNumber());
In a Linux machine (at least in the Raspberry Pi where I tested this), you can implement the following function to get the CPU Serial Number:
// Gets the CPU Serial Number as a 64 bit unsigned int. Returns 0 if not found.
uint64_t getCpuSerialNumber() {
FILE *f = fopen("/proc/cpuinfo", "r");
if (!f) {
return 0;
}
char line[256];
uint64_t serial = 0;
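while (fgets(line, sizeof line, f)) {
if (strncmp(line, "Serial", 6) == 0) {
// Guard against a "Serial" line without a colon; strtoull skips the leading spaces.
const char *colon = strchr(line, ':');
if (colon) {
serial = strtoull(colon + 1, NULL, 16);
}
}
}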
fclose(f);
return serial;
}
Include the headers that declare srand and time (<cstdlib> and <ctime>) at the top of your program, and write:
srand(time(NULL));
in your program before you generate your random number. Here is an example of a program that prints a random number between one and ten:
#include <cstdlib>  // srand, rand
#include <ctime>    // time
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
//Initialize srand
srand(time(NULL));
//Create random number
int n = rand() % 10 + 1;
//Print the number
cout << n << endl; //End the line
//The main function is an int, so it must return a value
return 0;
}
I like to learn by screwing around with code, recently I copied and pasted a random number generator code. Then I removed all the lines of code that were not "necessary" to make the executable work to generate a random number. The final straw was me deleting "time" from srand.
srand((unsigned) time(0));
What is the point of "time(0)" here?
Does it use the time that the program is opened to generate the seed for the random number? Is that why removing it (time) makes it not work? Because then it doesn't have a seed?
Also...
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
int main()
{
srand((unsigned) time(0));
printf("Your dice has been rolled! You got:");
int result = 1 + (rand() % 20);
printf("%d", result);
}
that's the whole code and I noticed it used the "rand" result for output. Does the "rand" pull the seed from "srand"?
If you don’t “seed” the random number generator (or if you use the same seed value), you’ll get the same set of pseudorandom numbers.
Using the current time is an easy way to get a different seed every time.
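The point about repeated seeds can be checked with a small sketch. The helper name first_values is made up for this example; it reseeds rand() and collects the first few values.

```cpp
#include <cstdlib>
#include <vector>

// Reseeds rand() with `seed` and returns the first `count` values it produces.
std::vector<int> first_values(unsigned int seed, int count) {
    std::srand(seed);
    std::vector<int> values;
    for (int i = 0; i < count; ++i) {
        values.push_back(std::rand());
    }
    return values;
}
```

Calling first_values(42, 5) twice returns the same five numbers both times, which is why a seed that changes between runs, such as the current time, is needed to get a different sequence.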
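The effect of srand may not carry across threads: on some implementations (MSVC's C runtime, for example) the state of rand() is kept per thread, so the seed should be set once on each thread. @Buddy said that using time(0) is the most convenient way to do this, although threads started within the same second will then get the same seed. Alternatively, you can use an atomic variable so that each thread gets a distinct seed: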
std::atomic<int> seek(2374213); //init whatever you like
void thread1fun()
{
srand(++seek);
//...
int rand_num = rand();
}
void thread2fun()
{
srand(++seek);
//...
int rand_num = rand();
}
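With C++11 <random>, a different approach from the srand one above is to give each thread its own engine. The sketch below is only one possible arrangement: the names thread_engine and random_in_range are hypothetical, and mixing the thread id into the seed is an assumption made here as a safeguard in case std::random_device is deterministic on some platform.

```cpp
#include <cstdint>
#include <functional>
#include <random>
#include <thread>

// Returns an engine private to the calling thread. It is seeded once, on
// first use in that thread, from std::random_device plus the thread id.
std::mt19937& thread_engine() {
    thread_local std::mt19937 engine = [] {
        std::random_device device;
        const std::uint64_t id_hash =
            std::hash<std::thread::id>()(std::this_thread::get_id());
        std::seed_seq seq{static_cast<std::uint32_t>(device()),
                          static_cast<std::uint32_t>(id_hash),
                          static_cast<std::uint32_t>(id_hash >> 32)};
        return std::mt19937(seq);
    }();
    return engine;
}

// Draws a uniformly distributed int in [low, high] from the thread's engine.
int random_in_range(int low, int high) {
    std::uniform_int_distribution<int> dist(low, high);
    return dist(thread_engine());
}
```

Each thread simply calls random_in_range(1, 6); no shared state is touched, so no locking or atomic counter is needed.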
I am a C++ student and I am working on creating a random number generator.
In fact, I should say my algorithm selects a number within a defined range.
I am writing this just because of my curiosity.
I am not challenging existing library functions.
I always use library functions when writing applications based on randomness but I am again stating that I just want to make it because of my curiosity.
I would also like to know if there is something wrong with my algorithm or with my approach.
I googled how PRNGs work, and some sites said that there is a mathematical algorithm and a predefined series of numbers; a seed just sets the pointer to a different point in the series, and after some interval the sequence repeats itself.
My algorithm just moves to and fro in the array of possible values, and the seed breaks the loop at a different value each time. I don't think this approach is wrong. I got answers suggesting a different algorithm, but they didn't explain what's wrong with my current algorithm.
Yes, there was a problem with my seed, as it was not precise and made the results a little predictable, as here:
cout << rn(50,100);
The results in running four times are 74,93,56,79.
See the pattern of "increasing order".
For large ranges, patterns can be seen easily. I got an answer on getting good seeds, but that too recommended a new algorithm (and didn't say why).
An alternative could be to shuffle my array randomly, generating a new sequence every time, so that the pattern of increasing order goes away. Any help with that rearranging would also be welcome. The code is below. If my function cannot work, please let me know.
Thanking you in anticipation.
int rn(int lowerlt, int upperlt)
{
/* Over short ranges, results are satisfactory.
* I want to make it effective for big ranges.
*/
const int size = upperlt - lowerlt; // Constant size of the integer array.
int ar[size]; // Array to store all possible values within defined range.
int i, x, ret; // Variables to control loops and return value.
long pointer = 0; //pointer variable. The one which breaks the main loop.
// Loop to initialize the array with possible values..
for (i=0, x=lowerlt; x <= upperlt; i++, x++)
ar[i]=x;
long seed = time(0);
//Main loop . To find the random number.
for (i=0; pointer <= seed; i++, pointer++)
{
ret = ar[i];
if (i == size-1)
{
// Reverse loop.
for (; i >= 0; i--)
{
ret=ar[i];
}
}
}
return ret;
}
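The shuffling idea mentioned in the question could be sketched with the standard library as below. This is an illustration rather than a fix of rn itself; the name shuffled_range is made up, and std::mt19937 is an arbitrary choice of engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns every integer in [lower, upper] exactly once, in shuffled order.
std::vector<int> shuffled_range(int lower, int upper, std::mt19937& engine) {
    assert(lower <= upper);
    // The closed range holds upper - lower + 1 values.
    std::vector<int> values(static_cast<std::size_t>(upper - lower) + 1);
    std::iota(values.begin(), values.end(), lower);      // lower, lower+1, ...
    std::shuffle(values.begin(), values.end(), engine);  // uniform permutation
    return values;
}
```

Taking elements from the front of the returned vector yields numbers in the range without repeats until the vector is exhausted. The vector is sized at run time, so no variable-length array is needed.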
Caveat: From your post, aside from your random generator algorithm, one of your problems is getting a good seed value, so I'll address that part of it.
You could use /dev/random to get a seed value. That would be a great place to start [and would be sufficient on its own], but might be considered "cheating" from some perspective.
So, here are some other sources of "entropy":
Use a higher resolution time of day clock source: gettimeofday or clock_gettime(CLOCK_REALTIME,...) call it "cur_time". Use only the microsecond or nanosecond portion respectively, call it "cur_nano". Note that cur_nano is usually pretty random all by itself.
Do a getpid(2). This has a few unpredictable bits because between invocations other programs are starting and we don't know how many.
Create a new temp file and get the file's inode number [then delete it]. This varies slowly over time. It may be the same on each invocation [or not]
Get the high resolution value for the system's time of day clock when the system was booted, call it "sysboot".
Get the high resolution value for the start time of your "session": When your program's parent shell was started, call it "shell_start".
If you were using Linux, you could compute a checksum of /proc/interrupts as that's always changing. For other systems, get some hash of the number of interrupts of various types [should be available from some type of syscall].
Now, create some hash of all of the above (e.g.):
dev_random * cur_nano * (cur_time - sysboot) * (cur_time - shell_start) *
getpid * inode_number * interrupt_count
That's a simple equation. You could enhance it with some XOR and/or sum operations. Experiment until you get one that works for you.
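One weakness of a plain product is that a single zero term (an inode number or interrupt count of 0, say) zeroes the whole seed. A sketch of an XOR-and-scramble combination that avoids this is shown below. The function names are hypothetical, and the mixing step is the SplitMix64 finalizer, picked here only as a convenient, well-known scrambler.

```cpp
#include <cstdint>
#include <initializer_list>

// SplitMix64 finalizer: a bijective scrambling of a 64-bit value.
std::uint64_t scramble(std::uint64_t x) {
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Folds any number of entropy sources into one 64-bit seed.
// A zero input changes the state like any other value instead of erasing it.
std::uint64_t combine_entropy(std::initializer_list<std::uint64_t> sources) {
    std::uint64_t state = 0x9e3779b97f4a7c15ULL;  // arbitrary nonzero start
    for (std::uint64_t value : sources) {
        state = scramble(state ^ value) + 0x9e3779b97f4a7c15ULL;
    }
    return state;
}
```

Usage would be along the lines of combine_entropy({cur_nano, pid, inode_number, interrupt_count}), with each quantity gathered as described in the list above.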
Note: This only gives you the seed value for your PRNG. You'll have to create your PRNG from something else (e.g. earl's linear algorithm)
unsigned int Random::next() {
s = (1664525 * s + 1013904223);
return s;
}
's' is growing with every call of that function.
Correct is
unsigned int Random::next() {
s = (1664525 * s + 1013904223) % xxxxxx;
return s;
}
Maybe use this function
long long Factor = 279470273LL, Divisor = 4294967291LL;
long long seed; // must be set to a value in [1, Divisor - 1] before use
long long next()
{
seed = (seed * Factor) % Divisor;
return seed;
}
The C++11 standard specifies a number of different engines for random number generation: linear_congruential_engine, mersenne_twister_engine, subtract_with_carry_engine and so on. Obviously, this is a large change from the old usage of std::rand.
Obviously, one of the major benefits of (at least some) of these engines is the massively increased period length (it's built into the name for std::mt19937).
However, the differences between the engines are less clear. What are the strengths and weaknesses of the different engines? When should one be used over the other? Is there a sensible default that should generally be preferred?
From the explanations below, a linear engine seems to be faster but less random while the Mersenne Twister has a higher complexity and randomness. Subtract-with-carry random number engine is an improvement to the linear engine and it is definitely more random. In the last reference, it is stated that Mersenne Twister has higher complexity than the Subtract-with-carry random number engine.
Linear congruential random number engine
A pseudo-random number generator engine that produces unsigned integer numbers.
This is the simplest generator engine in the standard library. Its state is a single integer value, with the following transition algorithm:
x = (ax+c) mod m
Where x is the current state value, a and c are their respective template parameters, and m is its respective template parameter if this is greater than 0, or numeric_limits<UIntType>::max() + 1, otherwise.
Its generation algorithm is a direct copy of the state value.
This makes it an extremely efficient generator in terms of processing and memory consumption, but producing numbers with varying degrees of serial correlation, depending on the specific parameters used.
The random numbers generated by linear_congruential_engine have a period of m.
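As a small illustration of the transition rule, the sketch below instantiates the engine with the multiplier 1664525 and increment 1013904223 and compares it with the same step written by hand. The alias Lcg32 and the function lcg_step are names invented for this example; passing m = 0 with a 32-bit result type selects a modulus of 2^32.

```cpp
#include <cstdint>
#include <random>

// x = (a*x + c) mod m with a = 1664525, c = 1013904223 and m = 2^32
// (m = 0 means "numeric_limits<UIntType>::max() + 1").
using Lcg32 =
    std::linear_congruential_engine<std::uint32_t, 1664525u, 1013904223u, 0u>;

// The same transition written by hand. The cast truncates to 32 bits,
// which performs the "mod 2^32" step.
std::uint32_t lcg_step(std::uint32_t x) {
    return static_cast<std::uint32_t>(1664525u * x + 1013904223u);
}
```

Seeding Lcg32 with a value s and calling it once returns lcg_step(s), since the engine's output is a direct copy of its new state.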
Mersenne twister random number engine
A pseudo-random number generator engine that produces unsigned integer numbers in the closed interval [0,2^w-1].
The algorithm used by this engine is optimized to compute large series of numbers (such as in Monte Carlo experiments) with an almost uniform distribution in the range.
The engine has an internal state sequence of n integer elements, which is filled with a pseudo-random series generated on construction or by calling member function seed.
The internal state sequence becomes the source for n elements: When the state is advanced (for example, in order to produce a new random number), the engine alters the state sequence by twisting the current value using xor mask a on a mix of bits determined by parameter r that come from that value and from a value m elements away (see operator() for details).
The random numbers produced are tempered versions of these twisted values. The tempering is a sequence of shift and xor operations defined by parameters u, d, s, b, t, c and l applied on the selected state value (see operator()).
The random numbers generated by mersenne_twister_engine have a period equal to the Mersenne number 2^(n*w-r)-1 (2^19937-1 for mt19937, where n=624, w=32 and r=31).
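For the standard Mersenne Twister typedefs, the exponent in the period can be recomputed from the engine's own parameters as n*w - r, which is where the 19937 in the name mt19937 comes from. The helper name period_exponent is made up for this sketch.

```cpp
#include <cstddef>
#include <random>

// Exponent e of the period 2^e - 1, computed as n*w - r from the
// engine's public parameters (state_size, word_size, mask_bits).
template <class Engine>
constexpr std::size_t period_exponent() {
    return Engine::state_size * Engine::word_size - Engine::mask_bits;
}

static_assert(period_exponent<std::mt19937>() == 19937, "624*32 - 31");
static_assert(period_exponent<std::mt19937_64>() == 19937, "312*64 - 31");
```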
Subtract-with-carry random number engine
A pseudo-random number generator engine that produces unsigned integer numbers.
The algorithm used by this engine is a lagged fibonacci generator, with a state sequence of r integer elements, plus one carry value.
Lagged Fibonacci generators have a maximum period of (2^k - 1)*2^(M-1) if addition or subtraction is used. The initialization of LFGs is a very complex problem. The output of LFGs is very sensitive to initial conditions, and statistical defects may appear initially but also periodically in the output sequence unless extreme care is taken. Another potential problem with LFGs is that the mathematical theory behind them is incomplete, making it necessary to rely on statistical tests rather than theoretical performance.
And finally from the documentation of random:
The choice of which engine to use involves a number of tradeoffs: the linear congruential engine is moderately fast and has a very small storage requirement for state. The lagged Fibonacci generators are very fast even on processors without advanced arithmetic instruction sets, at the expense of greater state storage and sometimes less desirable spectral characteristics. The Mersenne Twister is slower and has greater state storage requirements but with the right parameters has the longest non-repeating sequence with the most desirable spectral characteristics (for a given definition of desirable).
I think that the point is that random generators have different properties, which can make them more suitable or not for a given problem.
The period length is one of the properties.
The quality of the random numbers can also be important.
The performance of the generator can also be an issue.
Depending on your need, you might take one generator or another one. E.g., if you need fast random numbers but do not really care for the quality, an LCG might be a good option. If you want better quality random numbers, the Mersenne Twister is probably a better option.
To help you make your choice, there are some standard tests and results (I definitely like the table p.29 of this paper).
EDIT: From the paper,
The LCG (LCG(***) in the paper) family are the fastest generators, but with the poorest quality.
The Mersenne Twister (MT19937) is a little bit slower, but yields better random numbers.
The subtract with carry (SWB(***), I think) are way slower, but can yield better random properties when well tuned.
As the other answers forget about ranlux, here is a small note by an AMD developer that recently ported it to OpenCL:
https://community.amd.com/thread/139236
RANLUX is also one of very few (the only one I know of actually) PRNGs that has an underlying theory explaining why it generates "random" numbers, and why they are good. Indeed, if the theory is correct (and I don't know of anyone who has disputed it), RANLUX at the highest luxury level produces completely decorrelated numbers down to the last bit, with no long-range correlations as long as we stay well below the period (10^171). Most other generators can say very little about their quality (like Mersenne Twister, KISS etc.) They must rely on passing statistical tests.
Physicists at CERN are fans of this PRNG. 'nuff said.
Some of the information in these other answers conflicts with my findings. I've run tests on Windows 8.1 using Visual Studio 2013, and consistently I've found mersenne_twister_engine to be both higher quality and significantly faster than either linear_congruential_engine or subtract_with_carry_engine. This leads me to believe, when the information in the other answers is taken into account, that the specific implementation of an engine has a significant impact on performance.
This is of great surprise to nobody, I'm sure, but it's not mentioned in the other answers where mersenne_twister_engine is said to be slower. I have no test results for other platforms and compilers, but with my configuration, mersenne_twister_engine is clearly the superior choice when considering period, quality, and speed performance. I have not profiled memory usage, so I cannot speak to the space requirement property.
Here's the code I'm using to test with (to make portable, you should only have to replace the windows.h QueryPerformanceXxx() API calls with an appropriate timing mechanism):
// compile with: cl.exe /EHsc
#include <random>
#include <iostream>
#include <windows.h>
using namespace std;
void test_lc(const int a, const int b, const int s) {
/*
typedef linear_congruential_engine<unsigned int, 48271, 0, 2147483647> minstd_rand;
*/
minstd_rand gen(1729);
uniform_int_distribution<> distr(a, b);
for (int i = 0; i < s; ++i) {
distr(gen);
}
}
void test_mt(const int a, const int b, const int s) {
/*
typedef mersenne_twister_engine<unsigned int, 32, 624, 397,
31, 0x9908b0df,
11, 0xffffffff,
7, 0x9d2c5680,
15, 0xefc60000,
18, 1812433253> mt19937;
*/
mt19937 gen(1729);
uniform_int_distribution<> distr(a, b);
for (int i = 0; i < s; ++i) {
distr(gen);
}
}
void test_swc(const int a, const int b, const int s) {
/*
typedef subtract_with_carry_engine<unsigned int, 24, 10, 24> ranlux24_base;
*/
ranlux24_base gen(1729);
uniform_int_distribution<> distr(a, b);
for (int i = 0; i < s; ++i) {
distr(gen);
}
}
int main()
{
int a_dist = 0;
int b_dist = 1000;
int samples = 100000000;
cout << "Testing with " << samples << " samples." << endl;
LARGE_INTEGER ElapsedTime;
double ElapsedSeconds = 0;
LARGE_INTEGER Frequency;
QueryPerformanceFrequency(&Frequency);
double TickInterval = 1.0 / ((double) Frequency.QuadPart);
LARGE_INTEGER StartingTime;
LARGE_INTEGER EndingTime;
QueryPerformanceCounter(&StartingTime);
test_lc(a_dist, b_dist, samples);
QueryPerformanceCounter(&EndingTime);
ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
cout << "linear_congruential_engine time: " << ElapsedSeconds << endl;
QueryPerformanceCounter(&StartingTime);
test_mt(a_dist, b_dist, samples);
QueryPerformanceCounter(&EndingTime);
ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
cout << " mersenne_twister_engine time: " << ElapsedSeconds << endl;
QueryPerformanceCounter(&StartingTime);
test_swc(a_dist, b_dist, samples);
QueryPerformanceCounter(&EndingTime);
ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
cout << "subtract_with_carry_engine time: " << ElapsedSeconds << endl;
}
Output:
Testing with 100000000 samples.
linear_congruential_engine time: 10.0821
mersenne_twister_engine time: 6.11615
subtract_with_carry_engine time: 9.26676
I just saw this answer from Marnos and decided to test it myself. I used std::chono::high_resolution_clock to time 100000 samples 100 times to produce an average. I measured everything in std::chrono::nanoseconds and ended up with different results:
std::minstd_rand had an average of 28991658 nanoseconds
std::mt19937 had an average of 29871710 nanoseconds
ranlux48_base had an average of 29281677 nanoseconds
This is on a Windows 7 machine. Compiler is Mingw-Builds 4.8.1 64bit. This is obviously using the C++11 flag and no optimisation flags.
When I turn on -O3 optimisations, std::minstd_rand and ranlux48_base actually run faster than the implementation of high_resolution_clock can measure; however, std::mt19937 still takes 730045 nanoseconds, or about 3/4 of a millisecond.
So, as he said, it's implementation specific, but at least in GCC the average time seems to stick to what the descriptions in the accepted answer say. Mersenne Twister seems to benefit the least from optimizations, whereas the other two really just throw out the random numbers unbelievably fast once you factor in compiler optimizations.
As an aside, I'd been using Mersenne Twister engine in my noise generation library (it doesn't precompute gradients), so I think I'll switch to one of the others to really see some speed improvements. In my case, the "true" randomness doesn't matter.
Code:
#include <iostream>
#include <iomanip>   // setprecision
#include <chrono>
#include <random>
using namespace std;
using namespace std::chrono;
int main()
{
minstd_rand linearCongruentialEngine;
mt19937 mersenneTwister;
ranlux48_base subtractWithCarry;
uniform_real_distribution<float> distro;
int numSamples = 100000;
int repeats = 100;
long long int avgL = 0;
long long int avgM = 0;
long long int avgS = 0;
cout << "results:" << endl;
for(int j = 0; j < repeats; ++j)
{
cout << "start of sequence: " << j << endl;
auto start = high_resolution_clock::now();
for(int i = 0; i < numSamples; ++i)
distro(linearCongruentialEngine);
auto stop = high_resolution_clock::now();
auto L = duration_cast<nanoseconds>(stop-start).count();
avgL += L;
cout << "Linear Congruential:\t" << L << endl;
start = high_resolution_clock::now();
for(int i = 0; i < numSamples; ++i)
distro(mersenneTwister);
stop = high_resolution_clock::now();
auto M = duration_cast<nanoseconds>(stop-start).count();
avgM += M;
cout << "Mersenne Twister:\t" << M << endl;
start = high_resolution_clock::now();
for(int i = 0; i < numSamples; ++i)
distro(subtractWithCarry);
stop = high_resolution_clock::now();
auto S = duration_cast<nanoseconds>(stop-start).count();
avgS += S;
cout << "Subtract With Carry:\t" << S << endl;
}
cout << setprecision(10) << "\naverage:\nLinear Congruential: " << (long double)(avgL/repeats)
<< "\nMersenne Twister: " << (long double)(avgM/repeats)
<< "\nSubtract with Carry: " << (long double)(avgS/repeats) << endl;
}
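The loops above discard every generated value. One possible explanation for timings that fall below the clock's resolution at -O3 (an assumption, not verified against the original setup) is that the optimizer removed work whose result was never used. A sketch of a timing helper that keeps the work observable by accumulating a checksum follows; the names Timing and time_engine are hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

struct Timing {
    std::uint64_t checksum;  // sum of all generated values (wraps on overflow)
    long long nanoseconds;   // elapsed time on a monotonic clock
};

// Draws `samples` raw values from a copy of `engine` and times the loop.
// Every value feeds the checksum, so the loop has a visible result.
template <class Engine>
Timing time_engine(Engine engine, int samples) {
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < samples; ++i) {
        checksum += engine();
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return Timing{checksum, static_cast<long long>(elapsed.count())};
}
```

Printing the returned checksum together with the time, for example after time_engine(std::mt19937(1729), 100000), gives the compiler a reason to keep the loop. This measures raw engine output only, not the cost of a distribution.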
It's a trade-off really. A PRNG like Mersenne Twister is better because it has an extremely large period and other good statistical properties.
But a large period PRNG takes up more memory (for maintaining the internal state) and also takes more time for generating a random number (due to complex transitions and post processing).
Choose a PRNG depending on the needs of your application. When in doubt use Mersenne Twister, it's the default in many tools.
In general, Mersenne Twister is the best general-purpose RNG and among the fastest, but it requires some space (about 2.5 kilobytes). Which one suits your needs depends on how many times you need to instantiate the generator object. (If you need to instantiate it only once, or a few times, then MT is the one to use. If you need to instantiate it millions of times, then perhaps something smaller.)
Some people report that MT is slower than some of the others. According to my experiments, this depends a lot on your compiler optimization settings. Most importantly, the -march=native setting may make a huge difference, depending on your host architecture.
I ran a small program to test the speed of different generators, and their sizes, and got this:
std::mt19937 (2504 bytes): 1.4714 s
std::mt19937_64 (2504 bytes): 1.50923 s
std::ranlux24 (120 bytes): 16.4865 s
std::ranlux48 (120 bytes): 57.7741 s
std::minstd_rand (4 bytes): 1.04819 s
std::minstd_rand0 (4 bytes): 1.33398 s
std::knuth_b (1032 bytes): 1.42746 s
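The program behind these figures is not shown, so here is a minimal sketch of how such a comparison could be made. EngineReport and measure are hypothetical names introduced for illustration, not part of <random>. Sizes and timings depend on the compiler, flags, and standard library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

// Result of one measurement (hypothetical helper type, not part of <random>).
struct EngineReport {
    std::size_t   bytes;     // sizeof the engine object
    double        seconds;   // wall-clock time for all draws
    std::uint64_t checksum;  // sum of the outputs; using it keeps the loop from being optimized away
};

// Times `draws` raw outputs of a default-constructed Engine.
template <class Engine>
EngineReport measure(std::size_t draws)
{
    assert(draws > 0);
    Engine engine;  // default seed, so every call draws the same sequence
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < draws; ++i)
        checksum += engine();
    const auto stop = std::chrono::steady_clock::now();
    return {sizeof(Engine), std::chrono::duration<double>(stop - start).count(), checksum};
}
```

For example, measure<std::mt19937>(100000000) and measure<std::minstd_rand>(100000000) can be compared side by side. Build with the optimization flags you actually ship with, since they change the ranking.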
The title says it all, I am looking for something preferably stand-alone because I don't want to add more libraries.
Performance should be good since I need it in a tight high-performance loop. I guess that will come at a cost of the degree of randomness.
Any particular pseudo-random number generation algorithm will behave like this. The problem with rand is that it's not specified how it is implemented. Different implementations will behave in different ways and even have varying qualities.
However, C++11 provides the new <random> standard library header that contains lots of great random number generation facilities. The random number engines defined within are well-defined and, given the same seed, will always produce the same set of numbers.
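For example, a popular high-quality random number engine is std::mt19937, which is the Mersenne Twister algorithm configured in a specific way. No matter which machine you're on, the engine below always yields the same raw sequence for a given seed. Note that the standard specifies the engines exactly but leaves the algorithms of distributions such as std::uniform_real_distribution to the implementation, so the real numbers between 0 and 1 printed here are only guaranteed to match when the same standard library is used: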
std::mt19937 engine(0); // Fixed seed of 0
std::uniform_real_distribution<> dist;
for (int i = 0; i < 100; i++) {
std::cout << dist(engine) << std::endl;
}
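One way to check the engine guarantee yourself: the C++ standard requires that the 10000th consecutive invocation of a default-constructed std::mt19937 produces 4123659995. The sketch below relies on that; the helper name nth_default_mt19937 is made up for illustration.

```cpp
#include <cassert>
#include <random>

// Returns the n-th value (1-based) produced by a default-constructed std::mt19937.
std::mt19937::result_type nth_default_mt19937(unsigned long long n)
{
    assert(n >= 1);
    std::mt19937 engine;    // default seed
    engine.discard(n - 1);  // skip the first n-1 values
    return engine();        // the n-th value
}
```

On a conforming implementation, nth_default_mt19937(10000) returns 4123659995 regardless of platform.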
Here's a Mersenne Twister
Here is another PRNG implementation in C.
You may find a collection of PRNG here.
Here's the simple classic PRNG:
#include <iostream>
using namespace std;
unsigned int PRNG()
{
// our initial starting seed is 5323
static unsigned int nSeed = 5323;
// Take the current seed and generate a new value from it.
// The large constants and unsigned wrap-around make the output look
// scrambled, but this is a plain linear congruential generator, so it
// is predictable and must not be used where unpredictability matters.
nSeed = (8253729 * nSeed + 2396403);
// Take the seed and return a value between 0 and 32766
return nSeed % 32767;
}
int main()
{
// Print 100 random numbers
for (int nCount=0; nCount < 100; ++nCount)
{
cout << PRNG() << "\t";
// If we've printed 5 numbers, start a new row
if ((nCount+1) % 5 == 0)
cout << endl;
}
}
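The generator above hard-codes its seed in a static variable, so a program gets only one stream. A minimal sketch of a seedable variant follows. SimpleLcg is a hypothetical name, and the multiplier and increment are simply the ones used above. It returns the high bits of the state rather than a remainder, because the low bits of a linear congruential generator with a power-of-two modulus have short periods.

```cpp
#include <cassert>
#include <cstdint>

// Seedable linear congruential generator (illustrative only, not for security use).
struct SimpleLcg {
    std::uint32_t state;

    explicit SimpleLcg(std::uint32_t seed) : state(seed) {}

    // Advances the state and returns a value in [0, 32767].
    std::uint32_t next()
    {
        state = 8253729u * state + 2396403u;  // unsigned arithmetic wraps mod 2^32
        const std::uint32_t value = (state >> 16) & 0x7FFFu;  // keep 15 of the high bits
        assert(value <= 32767u);  // documents the output range
        return value;
    }
};
```

Two objects constructed with the same seed, e.g. SimpleLcg a(5323) and SimpleLcg b(5323), produce identical sequences, which is what makes runs reproducible.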
I am trying to produce true random numbers in C++ with C++ TR1.
However, when I run my program again, it produces the same random numbers. The code is below.
I need true random numbers for each run, as random as possible.
std::tr1::mt19937 eng;
std::tr1::uniform_real<double> unif(0, 1);
unif(eng);
You have to initialize the engine with a seed, otherwise the default seed is going to be used:
eng.seed(static_cast<unsigned int >(time(NULL)));
However, true randomness is something you cannot achieve on a deterministic machine without additional input. Every pseudo-random number generator is periodic in some way, which is something you wouldn't expect from a non-deterministic sequence. For example, std::mt19937 has a period of 2^19937-1 iterations. True randomness is hard to achieve, as you would have to monitor something that doesn't seem deterministic (user input, atmospheric noise). See Jerry's and Handprint's answers.
If you don't want a time based seed you can use std::random_device as seen in emsr's answer. You could even use std::random_device as generator, which is the closest you'll get to true randomness with standard library methods only.
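As a rough sketch of the second option, std::random_device can be passed straight to a distribution, since it meets the same generator interface as the engines. The function name roll_die is made up for illustration, and whether the device is truly non-deterministic depends on the platform.

```cpp
#include <cassert>
#include <random>

// Rolls a six-sided die, drawing bits directly from std::random_device.
int roll_die()
{
    static std::random_device device;  // opened once and reused across calls
    std::uniform_int_distribution<int> die(1, 6);
    const int result = die(device);
    assert(result >= 1 && result <= 6);
    return result;
}
```

This is usually much slower than drawing from a seeded engine, so it fits occasional draws better than tight loops.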
These are pseudo-random number generators. They can never produce truly random numbers. For that, you typically need special hardware (e.g., something that measures noise in a thermal diode or radiation from a radioactive source).
To get different sequences from pseudo-random generators in different runs, you typically seed the generator based on the current time.
That produces fairly predictable results, though (i.e., somebody else can figure out the seed you used fairly easily). If you need to prevent that, most systems do provide some source of at least fairly random numbers: on Linux, /dev/random, and on Windows, CryptGenRandom.
Those latter tend to be fairly slow, though, so you usually want to use them as a seed, not just retrieve all your random numbers from them.
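A minimal sketch of that approach on Linux, assuming /dev/urandom is readable: read a few bytes once and use them only as the seed. The helper names os_seed and make_seeded_engine are hypothetical, and the fallback to std::random_device is my own addition for systems without the device file.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <random>

// Returns 32 bits from /dev/urandom, or from std::random_device if the file cannot be read.
std::uint32_t os_seed()
{
    std::uint32_t seed = 0;
    std::ifstream urandom("/dev/urandom", std::ios::in | std::ios::binary);
    if (urandom && urandom.read(reinterpret_cast<char*>(&seed), sizeof seed)) {
        assert(urandom.gcount() == static_cast<std::streamsize>(sizeof seed));
        return seed;
    }
    return std::random_device{}();  // fallback (assumption: acceptable when the file is missing)
}

// Seeds a fast PRNG once; all later numbers come from the PRNG, not the OS.
std::mt19937 make_seeded_engine()
{
    return std::mt19937(os_seed());
}
```

Call make_seeded_engine() once at startup and keep the engine around. Reopening the device for every number would bring back the slowness mentioned above.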
If you want true hardware random numbers then the standard library offers access to this through the random_device class:
I use it to seed another generator:
#include <random>
...
std::mt19937_64 re;
std::random_device rd;
re.seed(rd());
...
std::cout << re();
If your system has /dev/urandom or /dev/random then this will be used. Otherwise the implementation is free to use one of its pseudorandom generators. On G++ mt19937 is used as a fallback.
I'm pretty sure TR1 has this as well, but as others noted I think it's best to use the standard C++11 utilities at this point.
Ed
This answer is a wiki. I'm working on a library and examples in .NET, feel free to add your own in any language...
Without external 'random' input (such as monitoring street noise), as a deterministic machine, a computer cannot generate truly random numbers: Random Number Generation.
Since most of us don't have the money and expertise to use special equipment that provides chaotic input, there are ways to utilize the somewhat unpredictable nature of your OS, task scheduler, process manager, and user inputs (e.g. mouse movement) to generate improved pseudo-randomness.
Unfortunately, I do not know enough about C++ TR1 to know if it has the capability to do this.
Edit
As others have pointed out, you get different number sequences (which eventually repeat, so they aren't truly random), by seeding your RNG with different inputs. So you have two options in improving your generation:
Periodically reseed your RNG with some sort of chaotic input OR make the output of your RNG unreliable based on how your system operates.
The former can be accomplished by creating algorithms that explicitly produce seeds by examining the system environment. This may require setting up some event handlers, delegate functions, etc.
The latter can be accomplished by poor parallel computing practice, i.e. setting many RNG threads/processes to compete in an 'unsafe manner' to create each subsequent random number (or number sequence). This implicitly adds chaos from the sum total of activity on your system, because every minute event will have an impact on which thread's output ends up being written and eventually read when a 'GetNext()' type method is called. Below is a crude proof of concept in .NET 3.5. Note two things: 1) Even though the RNG is seeded with the same number every time, 24 identical rows are not created; 2) There is a noticeable hit on performance and an obvious increase in resource consumption, which is a given when improving random number generation:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace RandomParallel
{
class RandomParallel
{
static int[] _randomRepository;
static Queue<int> _randomSource = new Queue<int>();
static void Main(string[] args)
{
InitializeRepository(0, 1, 40);
FillSource();
for (int i = 0; i < 24; i++)
{
for (int j = 0; j < 40; j++)
Console.Write(GetNext() + " ");
Console.WriteLine();
}
Console.ReadLine();
}
static void InitializeRepository(int min, int max, int size)
{
_randomRepository = new int[size];
var rand = new Random(1024);
for (int i = 0; i < size; i++)
_randomRepository[i] = rand.Next(min, max + 1);
}
static void FillSource()
{
Thread[] threads = new Thread[Environment.ProcessorCount * 8];
for (int j = 0; j < threads.Length; j++)
{
threads[j] = new Thread((myNum) =>
{
int i = (int)myNum * _randomRepository.Length / threads.Length;
int max = (((int)myNum + 1) * _randomRepository.Length / threads.Length) - 1;
for (int k = i; k <= max; k++)
{
_randomSource.Enqueue(_randomRepository[k]);
}
});
threads[j].Priority = ThreadPriority.Highest;
}
for (int k = 0; k < threads.Length; k++)
threads[k].Start(k);
}
static int GetNext()
{
if (_randomSource.Count > 0)
return _randomSource.Dequeue();
else
{
FillSource();
return _randomSource.Dequeue();
}
}
}
}
As long as there is user(s) input/interaction during the generation, this technique will produce an uncrackable, non-repeating sequence of 'random' numbers. In such a scenario, knowing the initial state of the machine would be insufficient to predict the outcome.
Here's an example of seeding the engine (using C++11 instead of TR1)
#include <chrono>
#include <random>
#include <iostream>
int main() {
std::mt19937 eng(std::chrono::high_resolution_clock::now()
.time_since_epoch().count());
std::uniform_real_distribution<> unif;
std::cout << unif(eng) << '\n';
}
Seeding with the current time can be relatively predictable and is probably not something that should be done. The above at least does not limit you just to one possible seed per second, which is very predictable.
If you want to seed from something like /dev/random instead of the current time you can do:
std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 eng(seed);
(This depends on your standard library implementation. For example, libc++ uses /dev/urandom by default, but in VS11 random_device is deterministic)
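Eight 32-bit words are far fewer than the 624 words of state inside std::mt19937. If that matters to you, one common variation is to draw state_size words from the device, sketched below. The function names are made up for illustration, and the result is only as good as your platform's random_device.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <random>

// Builds an mt19937 seeded with as many random_device words as the engine has state words.
std::mt19937 fully_seeded_mt19937()
{
    std::random_device device;
    std::array<std::random_device::result_type, std::mt19937::state_size> words;
    std::generate(words.begin(), words.end(), std::ref(device));  // one device call per word
    std::seed_seq seed(words.begin(), words.end());
    return std::mt19937(seed);
}

// Draws one real number in [0, 1) from the given engine.
double draw_unit(std::mt19937& engine)
{
    std::uniform_real_distribution<> unif;
    const double x = unif(engine);
    assert(x >= 0.0 && x < 1.0);
    return x;
}
```

Usage is the same as before: create the engine once with fully_seeded_mt19937() and pass it to draw_unit or to any other distribution.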
Of course nothing you get out of mt19937 is going to meet your requirement of a "true random number", and I suspect that you don't really need true randomness.