Random number generation with C++ or Python - c++

I heard that computation results can be very sensitive to choice of random number generator.
1 I wonder whether it is worthwhile to write my own Mersenne Twister or other pseudo-random routine to get a good generator. Also, I don't see why I should not trust native or library generators such as random.uniform() in numpy or rand() in C++. I understand that I can build generators on my own for distributions other than uniform (inverse CDF method, polar method). But is it evil to use a built-in generator for uniform sampling?
2 What is wrong with the default 'time' seed? Should one re-seed and how frequently in a code sample (and why)?
3 Maybe you have some good links on these topics!
--edit More precisely, I need random numbers for multistart optimization routines, and for uniform sampling of a space to initialize some other optimization routine parameters. I also need random numbers for Monte Carlo methods (sensitivity analysis). I hope these details help clarify the scope of the question.

Well, I can't speak about C++, but Python uses the Mersenne Twister. So there's no need to implement your own in Python. Also, Python only uses the system time as a seed if there's no other source of randomness; this is a system-dependent issue. See also the os.urandom docs about this.
It's fun to write your own, though. The pseudocode on the MT Wikipedia page is clear and easy-to-implement.
Of course the usual caveats apply. This is not a cryptographic random number generator. Not all permutations of a largish list can be generated by random.shuffle, and so on. But for the uses you specify, it's likely that the Mersenne Twister is fine.

In C++ the <random> library probably provides all you need. It has 3 different PRNG template algorithms including mersenne twister, 3 adaptors for use on top of those, 9 concrete random number generators, plus access to your system's non-deterministic random number source.
On top of that it has 20 random number distributions that include uniform, normal, bernoulli, poisson, and sampling distributions.
Here's a (slightly modified) example from Stroustrup's C++11 FAQ.
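#include <functional>
#include <iostream>
#include <random>
#include <string>
#include <vector>
int main()
{
    // Bind a normal distribution (mean 15, std dev 4) to a default-seeded Mersenne Twister.
    auto rand = std::bind(
        std::normal_distribution<>(15.0, 4.0),
        std::mt19937());
    std::vector<int> output(32);
    for (int i = 0; i < 400; ++i) {
        const int idx = static_cast<int>(rand());
        if (idx >= 0 && idx < 32)  // ignore rare draws outside the histogram range
            ++output[idx];
    }
    for (std::size_t i = 0; i < output.size(); ++i)
        std::cout << i << '\t' << std::string(output[i], '*') << '\n';
}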
0
1
2 *
3 **
4 **
5 **
6 ***
7 ***
8 ******
9 ***************
10 **************************
11 ******************
12 ************************************************
13 ******************************************
14 ****************************************
15 *******************************
16 ***************************************
17 **************************************
18 *************************
19 *****************
20 ************
21 ************
22 *****
23 *******
24 ***
25 **
26
27 *
28
29
30
31

At least in C++, rand is sometimes of rather poor quality, so code should rarely use it for anything except things like rolling dice or shuffling cards in children's games. In C++11, however, a set of good-quality random number generator classes was added, so you should generally prefer them.
Seeding based on time can work fine under some circumstances, but not if you want to make it difficult for somebody else to duplicate the same series of numbers (e.g., if you're generating nonces for encryption). Normally, you want to seed only once at the beginning of the program, at least in a single-threaded program. With multithreading, you frequently want a separate seed for each thread, in which case you need each one to start out unique to prevent generating the same sequences in all threads.
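As a rough illustration of the per-thread advice above, here is a minimal sketch that derives one engine per thread from a single base seed. The helper name make_thread_engine and the choice of mixing a thread index through std::seed_seq are assumptions made for this example, not something the answer prescribes.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: build an engine for one thread from a shared base seed
// and that thread's index, so each thread starts from a different state.
std::mt19937 make_thread_engine(std::uint32_t base_seed, std::uint32_t thread_index) {
    std::seed_seq seq{base_seed, thread_index};  // mixes both values into the initial state
    return std::mt19937(seq);
}
```

Each thread would call make_thread_engine once at start-up with its own index and then keep drawing from the engine it got back, without re-seeding.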

Python's random.uniform() is fine. Actually it already uses the Mersenne Twister.
However, you'd better avoid C and C++'s rand(), since it often produces bad random numbers (See also What common algorithms are used for C's rand()?). Even worse, on Windows RAND_MAX is only 0x7fff, so you can't get more than 32768 distinct values. If you can use C++11, check the new <random> library, which contains many random number generators, including MT-19937. Otherwise, you could still use Boost.Random.
Seeding a random number generator with the time is fine, as long as (1) you're not working with serious cryptography (you shouldn't use the Mersenne Twister in cryptography anyway), and (2) you can guarantee that no two runs are seeded with the same time value, which would cause the same sequence to be generated.

Related

Deterministic random numbers from STL [duplicate]

Inspired by this and similar questions, I want to learn how the mt19937 pseudo-random number generator in C++11 behaves when it is seeded with the same input on two separate machines.
In other words, say we have the following code;
std::mt19937 gen{ourSeed};
std::uniform_int_distribution<int> dist{0, 10000};
int randNumber = dist(gen);
If we try this code on different machines at different times, will we get the same sequence of randNumber values or a different sequence each time ?
And in either case, why is this the case?
A further question:
Regardless of the seed, will this code generate random numbers indefinitely? I mean, for example, if we use this block of code in a program that runs for months without stopping, will there be any problem in the generation of the numbers or in their uniformity?
The generator will generate the same values.
The distributions may not, at least with different compilers or library versions. The standard did not specify their behaviour to that level of detail. If you want stability between compilers and library versions, you have to roll your own distribution.
Barring library/compiler changes, that code will return the same values in the same sequence. But if you care about that guarantee, write your own distribution.
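To make "roll your own distribution" a little more concrete, here is a minimal sketch of a uniform integer draw that relies only on the raw mt19937 output, which the standard does pin down. The function name portable_uniform and the rejection-sampling approach are choices made for this illustration; they are not taken from the answer.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: uniform integer in the closed range [lo, hi], built only
// from raw engine output so the result does not depend on a library's
// std::uniform_int_distribution implementation. Assumes lo <= hi.
std::uint32_t portable_uniform(std::mt19937& gen, std::uint32_t lo, std::uint32_t hi) {
    const std::uint64_t range = std::uint64_t(hi) - lo + 1;  // number of possible outcomes
    const std::uint64_t span = std::uint64_t(1) << 32;       // mt19937 yields 32-bit values
    const std::uint64_t limit = span - span % range;         // largest multiple of range <= 2^32
    std::uint64_t x;
    do {
        x = gen();            // raw 32-bit output
    } while (x >= limit);     // reject the tail that would bias the result
    return lo + static_cast<std::uint32_t>(x % range);
}
```

With the same seed, two engines passed through this function should produce the same values on any conforming implementation, since only the engine itself is involved.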
...
All PRNGs have patterns and periods. mt19937 is named after its period of 2^19937-1, which is unlikely to be a problem. But other patterns can develop. MT PRNGs are robust against many statistical tests, but they are not cryptographically secure PRNGs.
So it being a problem if you run for months will depend on specific details of what you'd find to be a problem. However, mt19937 is going to be a better PRNG than anything you are likely to write yourself. But assume attackers can predict its future behaviour from past evidence.
Regardless of the seed, will this code generate random numbers indefinitely? I mean, for example, if we use this block of code in a program that runs for months without stopping, will there be any problem in the generation of the numbers or in their uniformity?
The RNGs we deal with in standard C++ are pseudo-random number generators. By definition, such a generator is a purely computational device with a multi-bit state (you can think of the state as a large bit vector) and three functions:
state seed2state(seed);
state next_state(state);
uint(32|64)_t state2output(state);
and that is it. Obviously, the state has finite size, 19937 bits in the case of MT19937, so the total number of states is 2^19937 and thus the MT19937 next_state() function is periodic, with a period of no more than 2^19937. This number is really HUGE, and most likely more than enough for a typical simulation.
But the output is at most 64 bits, so the output space is 2^64. It means that during a large run any particular output appears quite a few times. What matters is not that some 64-bit number appears again, but that the number after it, and the one after that, and so on, repeat as well; that is when you know the RNG period has been reached.
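The state/output distinction above can be seen on a toy generator small enough to walk through completely. The sketch below uses a 16-bit linear congruential generator with an 8-bit output; the constants and the helper name period are picked purely for this illustration and have nothing to do with MT19937 itself.

```cpp
#include <cstdint>

using State = std::uint16_t;  // toy state: 16 bits, so at most 2^16 distinct states

State seed2state(std::uint16_t seed) { return seed; }

// Toy linear congruential step modulo 2^16 (constants chosen for illustration).
State next_state(State s) { return static_cast<State>(25173u * s + 13849u); }

// Only the top 8 bits are exposed, so individual outputs repeat long before the state does.
std::uint8_t state2output(State s) { return static_cast<std::uint8_t>(s >> 8); }

// Count how many steps it takes for the state to return to its starting value.
std::uint32_t period(std::uint16_t seed) {
    const State start = seed2state(seed);
    State s = next_state(start);
    std::uint32_t n = 1;
    while (s != start) {
        s = next_state(s);
        ++n;
    }
    return n;
}
```

With these constants the state visits all 65536 values before repeating, while each of the 256 possible outputs necessarily shows up many times within one period.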
If we try this code on different machines at different times, will we get the same sequence of randNumber values or a different sequence each time?
Generators are defined rather strictly, and you'll get the same bit stream. For example for MT19937 from C++ standard (https://timsong-cpp.github.io/cppwp/rand)
class mersenne_twister_engine {
...
static constexpr result_type default_seed = 5489u;
...
and function seed2state described as (https://timsong-cpp.github.io/cppwp/rand#eng.mers-6)
Effects: Constructs a mersenne_twister_engine object. Sets X_{-n} to value mod 2^w. Then, iteratively for i=-n,…,-1, sets X_i to ...
Function next_state is described as well together with test value at 10000th invocation. Standard says (https://timsong-cpp.github.io/cppwp/rand#predef-3)
using mt19937 = mersenne_twister_engine<uint_fast32_t,32,624,397,31,0x9908b0df,11,0xffffffff,7,0x9d2c5680,15,0xefc60000,18,1812433253>;
Required behavior: The 10000th consecutive invocation of a default-constructed object of type mt19937 shall produce the value 4123659995.
Big four compilers (GCC, Clang, VC++, Intel C++) I used produced same MT19937 output.
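The required value quoted from the standard is easy to check locally. A minimal sketch (the function name ten_thousandth_mt19937 is made up for this example):

```cpp
#include <cstdint>
#include <random>

// Returns the value produced by the 10000th invocation of a default-constructed mt19937.
std::uint_fast32_t ten_thousandth_mt19937() {
    std::mt19937 gen;    // default-constructed, i.e. seeded with default_seed (5489)
    gen.discard(9999);   // skip the first 9999 outputs
    return gen();        // the 10000th invocation
}
```

On a conforming implementation this should return 4123659995, the value the standard requires.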
Distributions, on the other hand, are not specified that well, and therefore vary between compilers and libraries. If you need portable distributions, you either roll your own or use something from Boost or similar libraries.
Any pseudo RNG which takes a seed will give you the same sequence for the same seed every time, on every machine. This happens since the generator is just a (complex) mathematical function, and has nothing actually random about it. Most times when you want to randomize, you take the seed from the system clock, which constantly changes so each run will be different.
It is useful to have the same sequence in computer games for example when you have a randomly generated world and want to generate the exact same one, or to avoid people cheating using save games in a game with random chances.

Do you have a function (else than random) to find random dice(from 1 to 6) numbers(in C++)

I am writing code for a game with seven dice and I have a problem. If I use a random function (dice = rand()%6 + 1), I find that a sequence such as 123456 (a sequence that scores points in my game) comes up much more often than it should.
Mathematically, this sequence has a 1.54% probability of showing up, but when I use a random function with 100 million iterations it appears up to 5.4% of the time!
That leads me to my question. Do you know another way I could randomize the dice so that they would respect the probability? Or a way to fix that problem anyway?
Thanks in advance!
The problem you are facing is well known and a very natural result of using the modulo operator with random.
C++11 solves these problems by providing not only uniformly distributed random numbers but several different types of distributions like the Bernoulli distribution, the normal distribution and the Poisson distribution.
The new header providing all these generators and distributions is <random>.
Let's do an example: we want to have a random number generator that gives us some numbers, and we want a distribution that shapes these numbers the way we want them (uniform, Bernoulli, ...).
#include <iostream>
#include <random>
int main(){
    std::mt19937 mt(6473); // The random number generator using a deterministic seed
    std::uniform_int_distribution<int> dist(1,6); // The distribution that gives us random numbers in [1, 6] (both ends included)
    for(int i=0;i<10;i++){
        std::cout << dist(mt) << std::endl;
    }
}
This gives us pseudo-random numbers uniformly distributed into an interval we chose! But C++11 provides even more! It provides a real random number generator (see implementations for more details) which we can use as follows:
#include <iostream>
#include <random>
int main(){
    std::random_device rd;
    std::mt19937 mt(rd()); // The random number generator seeded from a non-deterministic random device
    std::uniform_int_distribution<int> dist(1,6); // The distribution that gives us random numbers in [1, 6] (both ends included)
    for(int i=0;i<10;i++){
        std::cout << dist(mt) << std::endl;
    }
}
It is this easy to produce high-quality random numbers, distributed as you want over an interval you want, using C++11. I got my knowledge about this topic from a talk by Stephan T. Lavavej (STL) held at GoingNative 2013 that you can watch on Channel 9 and that is called rand() Considered Harmful.
Fun fact: The title is a reference to an essay by the great Edsger Wybe Dijkstra called "Go To Statement Considered Harmful", in which Dijkstra explained why no programmer should use the goto statement.
If you want some good random numbers using the boost library random function is a good place to start. Also has the example of working with dice like you are asking. http://www.boost.org/doc/libs/1_61_0/doc/html/boost_random.html

rand() not giving me a random number (even when srand() is used)

Okay I'm starting to lose my mind. All I want to do is random a number between 0 and 410, and according to this page, my code should do that. And since I want a random number and not a pseudo-random number, I'm using srand() as well, in a way that e.g. this thread told me to do. But this isn't working. All I get is a number that is depending on how long it was since my last execution. If I e.g. execute it again as fast as I can, the number is usually 6 numbers higher than the last number, and if I wait longer, it's higher, etc. When it reaches 410 it goes back to 0 and begins all over again. What am I missing?
Edit: And oh, if I remove the srand(time(NULL)); line I just get the same number (41) every time I run the program. That's not even pseudo random, that's just a static number. Just copying the first line of code from the article I linked to above still gives me number 41 all the time. Am I the star in a sequel to "The Number 23", or have I missed something?
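#include <cstdlib>
#include <ctime>
#include <iostream>
int main(void) {
    srand(time(NULL)); // seed from the current time in whole seconds
    int number = rand() % 410;
    std::cout << number << std::endl;
    system("pause");
}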
That is what you get for using outdated random number generation.
rand produces a fixed sequence of numbers (which by itself is fine), and does that very, very badly.
You tell rand via srand where in the sequence to start. Since your "starting point" (called the seed, by the way) depends on the number of seconds since 1.1.1970 0:00:00 UTC, your output is obviously time dependent.
The correct way to do what you want to do is using the C++11 <random> library. In your concrete example, this would look somewhat like this:
std::mt19937 rng (std::random_device{}());
std::uniform_int_distribution<> dist (0, 409);
auto random_number = dist(rng);
For more information on the evils of rand and the advantages of <random> have a look at this.
As a last remark, seeding std::mt19937 like I did above is not quite optimal because the MT's state space is much larger than the 32 bit you get out of a single call to std::random_device{}(). This is not a problem for toy programs and your standard school assignments, but for reference: Here is my take at seeding the MT's entire state space, plus some helpful suggestions in the answers.
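For reference, one commonly seen way to feed more than 32 bits of entropy into the engine is to fill a std::seed_seq with as many words as the engine has state words. This is only a sketch of that idiom; the helper name make_fully_seeded_engine is an assumption of this example, and the linked answer may well do it differently.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper: seed mt19937 with one random_device word per state word
// (std::mt19937::state_size is 624) instead of a single 32-bit value.
std::mt19937 make_fully_seeded_engine() {
    std::random_device rd;
    std::array<std::uint32_t, std::mt19937::state_size> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));  // draw 624 words
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}
```

The engine returned here is used exactly like the rng in the snippet above, for example dist(rng) with a std::uniform_int_distribution.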
From manual:
time() returns the time as the number of seconds since the Epoch,
1970-01-01 00:00:00 +0000 (UTC).
Which means that if you start your program twice both times at the same second you will initialize srand with same value and will get same state of PRNG.
And if you remove initialization via call to srand you will always get exactly same sequence of numbers from rand.
I'm afraid you can't get truly random numbers there. Built-in functions are meant to provide just pseudo-random numbers. Using srand does not change that, because it only sets the starting point of the same pseudo-random algorithm that rand uses. If you want true random numbers, you must find a proper source of entropy, for example atmospheric noise, which is the approach of www.random.org.
The problem here consists in the seed used by the randomness algorithm: if it's a number provided by a machine, it can't be unpredictable. A normal solution for this is using external hardware.
Unfortunately you can't get a real random number from a computer without specific hardware (which is often too slow to be practical).
Therefore you need to make do with a pseudo generator. But you need to use them carefully.
The function rand is designed to return a number between 0 and RAND_MAX in a way that, broadly speaking, satisfies the statistical properties of a uniform distribution. At best you can expect the mean of the drawn numbers to be 0.5 * RAND_MAX and the variance to be RAND_MAX * RAND_MAX / 12.
Typically the implementation of rand is a linear congruential generator which basically means that the returned number is a function of the previous number. That can give surprisingly good results and allows you to seed the generator with a function srand.
But repeated use of srand ruins the statistical properties of the generator, which is what is happening to you: your use of srand is correlated with your system clock time. The behaviour you're observing is completely expected.
What you should do is to only make one call to srand and then draw a sequence of numbers using rand. You cannot easily do this in the way you've set things up. But there are alternatives; you could switch to a random number generator (say mersenne twister) which allows you to draw the (n)th term and you could pass the value of n as a command line argument.
As a final remark, I'd avoid using a modulus when drawing a number. This creates a statistical bias unless RAND_MAX + 1 is a multiple of your modulus.
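The size of the modulo bias can be worked out by counting how many raw values land on each remainder. A small sketch, assuming a generator whose outputs are the integers 0 through max_value, each equally likely (the helper name residue_count is invented for this example):

```cpp
// Number of values x in [0, max_value] with x % modulus == r.
// Assumes modulus > 0 and r < modulus.
unsigned long residue_count(unsigned long max_value, unsigned long modulus, unsigned long r) {
    const unsigned long total = max_value + 1;            // how many raw values exist
    return total / modulus + (r < total % modulus ? 1 : 0);  // low remainders get one extra
}
```

With RAND_MAX equal to 0x7fff and a modulus of 410, there are 32768 raw values and 32768 = 79 * 410 + 378, so the remainders 0 to 377 are each hit by 80 raw values while 378 to 409 are hit by only 79.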
Try replacing the NULL in time(NULL) with 0, i.e. time(0) (that will give you the current system time). If that doesn't work, you could try converting time(0) into ms by doing time(0)*1000.

C++ Simple 0-10 multiplication flashcard using rand()

I am having trouble grasping the concept of rand() and srand() in c++. I need to create a program that displays two random numbers, have the user enter a response, then match the response with a message and do this for 5 times.
My question is how do I use it, the instructions say I can't use the time() function and that seems to be in every tutorial online about rand().
this is what I have so far.
#include <iostream>
#include <cmath>
#include <cstdlib>
using namespace std;
int main()
{
int seed;
int response;
srand(1969);
seed=(rand()%10+1);
cout<<seed<<" * "<<seed<<" = ";
cin>>response;
cout<<response;
if(response==seed*seed)
cout<<"Correct!. you have correctly answered 1 out of 1."<<endl;
else
cout<<"Wrong!. You have correctly answered 0 out of 1."<<endl;
return 0;
}
This just outputs something like 6*6 or 7*7. I thought the two numbers would not necessarily be different, but surely not the same every time?
This is what the output should look like:
3 * 5 =
34
Wrongo. You have correctly answered 0 out of 1.
8 * 1 =
23
Wrongo. You have correctly answered 0 out of 2.
7 * 1 =
7
Correct! You have correctly answered 1 out of 3.
2 * 0 =
2
Wrongo. You have correctly answered 1 out of 4.
8 * 1 =
8
Correct! You have correctly answered 2 out of 5.
Final Results: You have correctly answered 2 out of 5 for a 40% average.
and these are the requirements:
Your program should use rand() to generate pseudo-random numbers as needed. You may use srand() to initialize the random number generator, but please do not use any 'automatic' initializer (such as the time() function), as those are likely to be platform dependent. Your program should not use any loops.
By the way, since this is C++, you should really seek to use std::uniform_int_distribution, e.g.
#include <functional>
#include <random>
...
auto rand = std::bind(std::uniform_int_distribution<unsigned>(0, 10),
std::default_random_engine());
Now, you can just use rand() to generate a number in the desired interval.
The way you are using it now seems fine. The reason why all the tutorials use time() is because the numbers will be different every time you run your program. So, if you use a fixed number, every time your program runs, the output (number generation) will be the same. However, according to your requirements this doesn't seem to be a problem (if you need the random generation to be different every time you run your program, please specify that in your question).
However, rand()%10+1 is a range from 1 to 10 and not 0 to 10 like you want.
AFTER EDITS
To get the desired output, all you need is to make two seeds like so:
seed1=(rand()%11);
seed2=(rand()%11);
cout<<seed1<<" * "<<seed2<<" = ";
Also, you can ask the user for a seed and then pass that to srand to make each run more random.
About the requirements:
please do not use any 'automatic' initializer (such as the time()
function), as those are likely to be platform dependent
std::time is a standard C++ function in the <ctime> header. I do not understand why it matters if the result is platform dependent.
Your program should not use any loops.
This is also a very strange requirement. Loops are fundamental building blocks of any program. The requirements seem very strange to me, I would ask your professor or teacher for clarification.
On Windows you can use GetTickCount() instead of time().
You could use rand_s which doesn't need to be seeded.
On *nix systems you can utilize /dev/random.
(How to use /dev/random)
You use srand() to seed the random function. This is necessary; otherwise you'd get the same sequence of numbers from rand() on every run.
You can seed rand with whatever you please. You'll find most tutorials use the current time as a seed as the number returned is usually different with each run of the program.
If you truly can't use the time() functionality, I would pass the seed as a command line argument.
#include <cstdlib>
int main(int argc, char* argv[])
{
    if (argc < 2) return 1; // No seed was given on the command line.
    srand(atoi(argv[1])); // Seed with command line argument.
}

Same random numbers every time I run the program

My random numbers that output, output in the same sequence every time I run my game. Why is this happening?
I have
#include <cstdlib>
and am using this to generate the random numbers
randomDiceRollComputer = 1 + rand() % 6;
You need to seed your random number generator:
Try putting this at the beginning of the program:
srand ( time(NULL) );
Note that you will need to #include <ctime>.
The idea here is to seed the RNG with a different number each time you launch the program. By using time as the seed, you get a different number each time you launch the program.
You need to give the random number generator a seed. This can be done by taking the current time, as it is hopefully different on each run.
#include <cstdlib>
#include <ctime>
using namespace std;
int main()
{
int r;
srand(time(0));
r = rand();
return 0;
}
The rand() function is specifically required to produce the same sequence of numbers when seeded with a given seed (by calling srand()); each possible seed value specifies a sequence. And if you never call srand(), you get the same sequence you would have gotten by calling srand(1) before any call to rand().
(This doesn't apply across different C or C++ implementations.)
This can be useful for testing purposes. If there's a bug in your program, for example, you can reproduce it by re-running it with the same seed, guaranteeing that (barring other unpredictable behaviors) you'll get the same sequence of pseudo-random numbers.
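The reproducibility described above can be checked directly. A minimal sketch, with the helper name first_draws made up for this example:

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: reseed rand() and return the first n values it produces.
std::vector<int> first_draws(unsigned seed, int n) {
    std::srand(seed);                   // restart the sequence selected by this seed
    std::vector<int> values;
    for (int i = 0; i < n; ++i)
        values.push_back(std::rand());
    return values;
}
```

Within one implementation, calling first_draws twice with the same seed should return identical vectors, which is what makes a logged seed useful for replaying a failing run.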
Calling srand(time(NULL)) is the usual recommended way to get more or less unpredictable pseudo-random numbers. But it's not perfect. If your program runs twice within the same second, you'll probably get the same sequence, because time() (typically) has a resolution of 1 second. And typical rand() implementations are not good enough for cryptographic use; it's too easy for an attacker to guess what numbers you're going to get.
There are a number of other random number implementations. Linux systems have two pseudo-devices, /dev/random and /dev/urandom, from which you can read reasonably high-quality pseudo-random byte values. Some systems might have functions like random(), drand48(), and so forth. And there are numerous algorithms; I've heard good things about the Mersenne Twister.
For something like a game, where you don't expect or care about players trying to cheat, srand(time(NULL)) and rand() is probably good enough. For more serious purposes, you should get advice from someone who knows more about this stuff than I do.
Section 13 of the comp.lang.c FAQ has some very good information about pseudo-random number generation.
Pseudorandom number generators take a starting number, or seed, and then generate the next number in the sequence from this. That's why they're called pseudorandom, because if they always use the same starting value, they will generate the same sequence of numbers like the C standard lib generator does. This can be fixed by giving the generator a starting value that will change the next time the program is run like the current time.
Anyway, the code you're looking for like others have said is:
srand(time(0)); //Seed the generator, give it a starting value