implementation of the random number generator in C/C++ [duplicate] - c++

This question already has answers here:
can rand() be used to generate predictable data?
(4 answers)
Closed 8 years ago.
I am a bit confused by the implementation of the random number generator in C, which is also apparently different from that in C++.
If I understand correctly, a call to 'srand(seed)' somehow initializes a hidden variable (the seed) that is accessible by 'rand()', which in turn points the function to a pre-generated sequence, like for example this one. Each successive call to 'rand()' advances the sequence (and apparently there are other ways to advance in C++), which also suggests the use of an internal hidden pointer or counter to keep track of the advance.
I have found many discussions on how the algorithms for pseudo-random number generation work and the documentation of the functions rand() and srand(), but haven't been able to find information about these hidden parameters and their behavior, except for the fact that according to this source, they are not thread-safe.
Could anybody here please shed some light on how these parameters are defined and what their behavior should be according to the standards, or whether their behavior is implementation-defined?
Are they expected to be local to the function/method that calls rand() and srand()? If so, is there a way to communicate them to another function/method?
If your answer is specific to either C or C++, please be so kind to point it out. Any information will be much appreciated. Please bear in mind that this question is not about the predictability of data generated by rand() and srand(), but about the requirements, status and functioning of their internal variables as well as their accessibility and scope.

The requirements on rand are:
Generates pseudo-random numbers.
Range is 0 to RAND_MAX (minimum of 32767).
The seed set by srand() determines the sequence of pseudo-random numbers returned.
It need not be thread-safe or even reentrant; the state can be stored in a static variable.
The standard does not define any way to recover the internal state for reseeding or anything else.
There is no requirement on which PRNG is implemented, so every implementation can have its own, though linear congruential generators are a favorite.
A conforming (though arguably useless) implementation is presented in this dilbert strip:
http://dilbert.com/strips/comic/2001-10-25/
Or for those who like XKCD (It's a perfect drop-in for any C or C++ library ;-)):
For completeness, the standard quotes:
7.22.2.1 The rand function
The rand function computes a sequence of pseudo-random integers in the range 0 to
RAND_MAX.
The rand function is not required to avoid data races with other calls to pseudo-random
sequence generation functions. The implementation shall behave as if no library function
calls the rand function.
[...]
The value of the RAND_MAX macro shall be at least 32767.
7.22.2.2 The srand function
The srand function uses the argument as a seed for a new sequence of pseudo-random
numbers to be returned by subsequent calls to rand. If srand is then called with the
same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is
called before any calls to srand have been made, the same sequence shall be generated
as when srand is first called with a seed value of 1.
The srand function is not required to avoid data races with other calls to pseudo-random
sequence generation functions. The implementation shall behave as if no library function
calls the srand function.
C++ includes rand, srand and RAND_MAX without change by reference from the C standard.
There are a few C++ library functions/classes which are explicitly documented to use the C random number generator though.

The following answer is for C; specifically, the 1999 standard.
The C99 standard is very light on actual implementation details for rand & srand. It simply states that the argument to srand is used "as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand."
In practice, the way it usually works is:
The C library defines an integer variable that rand and srand use to keep track of the PRNG's state.
srand sets the state variable to the supplied value.
rand takes the value of the state variable and performs some mathematical magic on it to produce two new integers: one is the pseudo-random number that it returns, and the other becomes the new value for the state variable, thus influencing the next call to rand (assuming srand isn't called before then).
The C standard gives an example of a possible implementation of rand and srand that exhibits this behavior:
static unsigned long int next = 1;

int rand(void) // RAND_MAX assumed to be 32767
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next/65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}

Related

Equivalent of srand() and rand() using post-C++11 std library

I have old code that predates C++11 and it uses rand() for generating random ints.
However, rand() has a shortcoming: you can't save and later restore the state of the generator, since there is no object I can save and no way to extract the state.
Therefore, I want to refactor to use C++11's solution <random>.
However, I do not want a behaviour change - I hope to get exactly the sequence rand() gives me but with <random>.
Do you guys know whether this is achievable?
You can't even assure that you get the same sequence if you use rand() on another compiler. And no, you can't get random to produce the same sequence as whoever's rand() it was you were using. (Thank goodness. rand() is notorious for being one of the worst pseudo-random number generators of all time.)
It is possible for you to restore the state of rand(), simply by using srand() to set the initial state and counting how many times you called rand(). You can later repeat that to bring rand() back to that same state.
But don't use rand()!
What you want is not possible. The C-style random number generator is implementation-defined. The C++ random engines are all very well specified as to their particular algorithms (except random_device, which varies due to potentially being a more "true" random generator). None of its engines are defined to have the same algorithm as rand.

rand() and srand() functions in c++

I have recently been learning how to program games in C++ from a beginner book, and I reached a lesson where I have to make a game in which I guess the computer's randomly picked number. I have to use these lines of code:
srand(static_cast<unsigned int>(time(0)));
variable=rand();
I include iostream, cstdlib and ctime, obviously. I don't really understand how this works. How is it picking the time and date, and by what rules is it converted into an unsigned int? Basically, how do these functions work?
Thank you!
1. About time()
time (or better std::time in C++) is a function that returns some integer or floating point number that represents the current time in some way.
Which arithmetic type it actually returns and how it represents the current time is unspecified; most commonly, however, you will get some integer type that holds the number of seconds since the beginning of the Unix epoch.
2. About srand()
srand is a function that uses its argument (which is of type unsigned int), the so-called seed, to set the internal state of the pseudo-random number generator behind rand. When I write random in the rest of this answer, read pseudo-random.
Using a different seed will in general result in a different sequence of random numbers produced by subsequent calls to rand, while using the same seed again will result in the exactly same sequence of random numbers.
3. Using time() to seed rand()
If we do not want to get the same random numbers every time we run the program, we need some seed that is different on each run. The current time is a widely used source for such a seed as it changes constantly.
This integer (or whatever else time returned) representing the current time is now converted to unsigned int with a static_cast. This explicit cast is not actually needed as all arithmetic types convert to unsigned int implicitly, but the cast may silence some warnings. As time goes by, we can expect the resulting unsigned int and thus the sequence of random numbers produced by rand to change.
4. Pitfalls
If, as is common, time returns the number of seconds since the beginning of the Unix epoch, there are three important things to note:
The sequence you produce will be different only if at least a second has passed between two invocations.
Depending on the actual implementation, the resulting sequences may start off somewhat similar if the time points used to seed rand are close to each other (compared to the time since the epoch). Afaik, this is the case in MSVC's implementation. If that is problematic, just discard the first couple of hundred or thousand values of the sequence. (As I have learned by now, this does not really help much for poor RNGs as commonly used for rand. So if that is problematic, use <random> as described below.)
Your numbers are not very random in the end: if someone knows when your call to srand occurred, they can derive the entire sequence of random numbers from that. This has actually led to a decryption tool for a piece of ransomware that used srand(time(0)) to generate its "random" encryption key.
Also, the sequence generated by rand tends to have poor statistical properties even if the seed was good. For a toy program like yours, that is probably fine, however, for real world use, one should be aware of that.
5. The new <random>
C++11 introduced new random number facilities that are in many ways superior to the old rand-based stuff. They are provided in the standard header <random>. It includes std::random_device, which provides a way to get actually random seeds, powerful pseudo-random number generators like std::mt19937, and facilities to map the resulting random sequences to integer or float ranges without introducing unnecessary bias.
Here is an example how to randomly roll a die in C++11:
#include <random>
#include <iostream>

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 6);
    for (int n=0; n<10; ++n)
        std::cout << dis(gen) << ' ';
    std::cout << '\n';
}
(Code from cppr) Note: std::random_device does not work properly with MinGW, at least in the version (Nuwen MinGW5.3) I tested!
It should also be noted that the state space of a mt19937 is much larger than the 32 bit we (commonly) get out of a single call to random_device. Again, this will most likely not matter for toy programs and homework, but for reference: Here is my attempt to properly seed the entire state space, plus some helpful suggestions in the answers.
If you are interested in more details about rand vs <random>, this is an interesting watch.
First line:
srand() seeds the pseudo-random number generator. In your case it is seeded with the current time (the time of execution) on your system.
Second line:
After the pseudo-random number generator is configured, you can retrieve random numbers by calling rand().

How does calling srand more than once affect the quality of randomness?

This comment, which states:
srand(time(0)); I would put this line as the first line in main()
instead if calling it multiple times (which will actually lead to less
random numbers).
...and I've bolded the line which I'm having an issue with... repeats common advice to call srand once in a program. Questions like srand() — why call only once? reiterate that, because time(0) returns the current time in seconds, multiple calls to srand within the same second will produce the same seed. A common workaround is to use milliseconds or nanoseconds instead.
However, I don't understand why this means that srand should or can only be called once, or how it leads to less random numbers.
cppreference:
Generally speaking, the pseudo-random number generator should only be
seeded once, before any calls to rand(), and the start of the program.
It should not be repeatedly seeded, or reseeded every time you wish to generate a new batch of pseudo-random numbers.
phoxis's answer to srand() — why call only once?:
Initializing once the initial state with the seed value will generate
enough random numbers as you do not set the internal state with srand,
thus making the numbers more probable to be random.
Perhaps they're simply using imprecise language, but none of the explanations seem to explain why calling srand multiple times is bad (aside from producing the same sequence of random numbers) or how it affects the "randomness" of the numbers. Can somebody clear this up for me?
Look at the source of srand() from this question: Rand Implementation
Also, example implementation from this thread:
static unsigned long int next = 1;

int rand(void) // RAND_MAX assumed to be 32767
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next/65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}
As you can see, when you call srand(time(0)), the numbers you then get from rand() depend on the seed. The numbers will repeat after some millions of calls, and calling srand again gives you a different sequence. In any case, the sequence must repeat after some number of cycles, but the order depends on the argument to srand. This is why C's rand isn't good for cryptography: you can predict the next number when you know the seed.
If you have a fast loop, calling srand on every iteration makes no sense: you will keep getting the same number until time() returns a different seed (one second is a very long time for a modern CPU).
There is no reason to call srand multiple times in a simple app. This generator is weak by design, and if you want real random numbers you must use another one (the best I know is Blum Blum Shub).
For me, there are no more or less random numbers: they always depend on the seed, and they repeat if you use the same seed. Using the time is a good solution because it's easy to implement, but you must seed only once (at the beginning of main()), or only when you are sure that you are calling srand(time(0)) in a different second.
The numbers rand() returns are not actually random but "pseudo-random." What this means is that rand() generates a stream of numbers that look random for given values of "look" and "random" from an internal state that changes with each call.
As a rule, rand() is what is called a linear congruential generator, which means that it uses a mechanism roughly like this:
int state; // persistent state

int rand() {
    state = (a * state + b) % c; // a, b and c are fixed constants of the generator
    return state;
}
with carefully chosen constants a, b and c. c tends to be a power of two in practice because that makes it faster to calculate.
The "randomness" of this sequence depends in part on the persistence of the state. If the sequence is constantly reseeded with predictable values, the return values of rand() become predictable in turn. How critical this is depends on the application, but it is not a purely academic consideration. Consider, for example, the case
a = 69069
b = 1
c = 2^32
which was used, for example, by old versions of glibc. Granted that I picked this example for the obviousness of the pattern, but the point remains in less obvious cases. Imagine this RNG were seeded with a sequence of incrementing numbers n, n+1, n+2 and so forth -- you will get from rand() a sequence of numbers, each 69069 larger than the last (modulo 2^32). The pattern will be plainly visible. Starting with 0, we would get
1
69070
138139
207208
...
rising until a bit over 4 billion in steady increments. And to make matters worse, some implementations actually returned the seed value in the first call of rand after a call to srand, in which case you'd just get your seeds back.
A pseudo-random generator is an engine which produces numbers that look almost random. However, they are completely deterministic. In other words, given a seed x0, they are produced by repeated application of some injective function f to x0, so that f^m(x0) is quite different from f^{m-1}(x0) or f^{m+1}(x0), where the notation f^m denotes the composition of f with itself m times. In other words, f(x) has huge jumps, almost uncorrelated with the previous ones.
If you use srand(time(0)) multiple times in a second, you may get the same seed, as the clock is not as fast as you may imagine. So the resulting sequence of random numbers will be the same. And this may be a (huge) problem, especially in cryptography applications (anyway, in the latter case, people buy good number generators based on real-time physical processes such as temperature differences in atmospheric data etc., or, recently, on measuring quantum bits, e.g. superposition of polarized photons, the latter being truly random, as long as quantum mechanics is correct).
There are also other serious issues with rand. One of them is that the distribution is biased. See e.g. http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx for some discussion; I remember seeing something similar on SO, but cannot find it now.
If you plan to use rand in crypto applications, just don't do it. Use <random> and a serious random engine like the Mersenne Twister, std::mt19937, combined with std::random_device (bearing in mind that std::mt19937 is itself not a cryptographically secure generator).
If you seed your random number generator twice using srand and get different seeds, then you will get two sequences that will be quite different. This may be satisfactory for you. However, each sequence per se will not be a good random distribution due to the issues I mentioned above. On the other hand, if you seed your RNG too often, you may get the same seed, and THIS IS BAD, as you'll generate the same numbers over and over again.
PS: I have seen it claimed in the comments that it is bad for pseudo-random numbers to depend on a seed. That dependence is the definition of pseudo-random numbers, and it is not a bad thing, as it allows you to repeat numerical experiments with the same sequence. The idea is that each different seed should produce a sequence of (almost) random numbers, different from a previous sequence (technically, you shouldn't be able to distinguish them from a perfect random sequence).
The seed determines what random numbers will be generated, in order, i.e. srand(1), will always generate the same number on the first call to rand(), the same on the second call to rand() and so on.
In other words, if you re-seeded with the same seed before each rand() invocation, you'd generate the same random number every single time.
So successive seeding with time(0), during a single second, will mean all your random numbers after re-seeding are actually the same number.
Most of the other answers are saying exactly what the question already stated: multiple calls to srand within the same second will produce the same seed. I believe the actual question is the same one that I had, which is: why would it be bad to call srand multiple times, even if it was with a different seed every time?
I can think of three reasons:
People are not clear in their language and they actually mean srand should not be called multiple times with time() if you want different sequences of random numbers.
It's cryptographically bad because every seed passed to srand is not itself a random number (well, it's probably not). Meaning, every srand is injecting a chance for someone to guess that seed and therefore predict your stream of pseudo-random numbers.
It can mess up the distribution of pseudo-random numbers. vsoftco's answer gave me a clue. If you call srand once, rand can be designed to give you a uniform distribution of pseudo-random numbers over its lifetime. If you call srand in the middle, however, you'll throw off that uniform distribution because it would "start over" with a new seed.
So, if you don't care about any of that, I would think it's okay to call srand more than once. In my case, I want to call it at the start of my program, but call it again after a fork() because the seed is apparently shared across child processes, and I want each child process to have its own sequence of pseudo-random numbers.
Going back to why it's cryptographically bad, it's easier to guess a seed if it's something like time() because a bad actor can try to guess the time it was seeded. That is why calling srand at the start of a program might be better, because it could be less likely that someone would guess that time as well as, say, when a server request was initiated.
But I would surmise that even passing nanoseconds would be cryptographically dangerous if there's a chance the underlying clock doesn't have that kind of precision. Imagine, for example, that you call srand(get_time_in_ns()) and the underlying clock only returns time to the nearest millisecond.
Now, I'm no crypto expert in any way, but this leads me to wonder if it would be safer than current-time to pass the output of a different pseudo-random generator as seeds to multiple srand calls? For example, can you call each srand with a number from Linux's /dev/random? (I imagine you might want to do that if you want a safer seed than the current time but still want to use rand() so you don't have the overhead of reading from the kernel every time.)

Calling srand() twice in the same program [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 9 years ago.
Why is it that when I call srand() at two very different points, the numbers stop being random? Once I remove one of the calls, it goes back to normal.
It depends on how you call it. The purpose of srand() is to seed the pseudo-random number generator used by rand(). So when you call srand(i), it will initialise rand() to a fixed sequence which depends on i. So when you re-seed with the same seed, you start getting the same sequence.
The most common use case is to seed the generator just once, and with a suitable "random" value (such as the idiomatic time(NULL)). This makes it likely that you'll get different sequences of pseudo-random numbers in different program executions.
However, occasionally you might want to make the pseudo-random sequence "replayable." Imagine you're testing several sorting algorithms on random data. To get fair comparisons, you should test each algorithm on the exact same data - so you'll re-seed the generator with the same seed before each run.
In other words: if you want the numbers simply pseudo-random, seed once, and with a value as random as possible. If you want some control & replayability, seed as necessary.
srand (seed);
Two different initializations with the same seed will generate the
same succession of results in subsequent calls to rand.
If seed is set to 1, the generator is reinitialized to its initial
value and produces the same values as before any call to rand or
srand.
Each time rand() is seeded with srand(), it must produce the same
sequence of values.
http://www.cplusplus.com/reference/cstdlib/srand/
http://en.cppreference.com/w/cpp/numeric/random/srand
Are you initializing srand? You have to call it at the beginning of your function/code like this:
srand(time(NULL));
It should work :)
You may want to read about pseudo-random number generators; the standard library functions srand and rand are an implementation of one of them.
The core idea is that a pseudo-random generator is initialized with a special number, the seed.
srand() is used to set the seed. For a given seed, the pseudo-random generator always generates exactly the same sequence of numbers. By using different seeds you'll get different sequences of numbers.
So if you want to get different random numbers every time you start your program, you need to set a new seed every time.
One of the simplest ways to do this is to use the time as the seed.
#include <time.h>
srand((unsigned int)time(0));

C++: seeding random number generator outside of main()

I was creating a simple program that simulates a coin toss for my class. (Actually, class is over this term and I'm just working through the rest of the projects that weren't required.) It involves creating and calling a function that generates a random number between 1 and 2. Originally, I tried to seed the random number generator within the function that would be using it (coinToss); however, it did not produce a random number. Each time the program was run it was the same number, as though I had only used
rand()
instead of
unsigned seed = time(0);
srand(seed);
rand();
Yet, when i moved the above within
int main()
it worked fine.
My question is (1) why did it not work when set up within the function that called it, and (2) how does rand() have access to what was done by srand() if they do not both occur in the same function?
Obviously, i'm a beginner so please forgive me if i didn't formulate the question correctly. Also, my book has only briefly touched on rand() and srand() so that's all i really know.
thanks for any help!
Pertinent code:
First attempt that didn't work:
int main()
{
    //...........
    coinToss();
    //...........
}

int coinToss()
{
    unsigned seed = time(0);
    srand(seed);
    return 1 + rand() % 2;
}
Second attempt which did work:
int main()
{
    unsigned seed = time(0);
    srand(seed);
    coinToss();
}

int coinToss()
{
    return 1 + rand() % 2;
}
You probably only want to seed the random number generator once. rand() returns the next pseudo-random number from its internal generator. Every time you call rand() you will get the next number from the internal generator.
srand() however sets the initial conditions of the random number generator. You can think of it as setting the 'starting-out point' for the internal random number generator (in reality it's a lot more complicated than that, but it's a useful cognitive model to follow).
So, you should be calling srand(time(0)) exactly once in your application - somewhere near the beginning. After that, you can call rand() as many times as you want!
However
To answer your actual question: the first version doesn't work because time() returns the number of seconds since the epoch. So if you call coinToss() several times in a second (say, if you wanted to simulate 100 coin tosses), then you'd be constantly seeding the random number generator with the same number, thereby resetting its internal state (and thus the next number you get) every time.
Anyway, using time() as a seed to srand() is somewhat crappy for this very reason: time() doesn't change very often, and worse, it's predictable. If you know the current time, you can work out what rand() will return. The internet has many, many examples of better srand() seeds.
Pseudo-random number generators (like rand) work by taking a single starting number (the seed) and performing a numeric transformation on it each time you request a new number. You want to seed the generator just once, or it will continually get reset, which is not what you want.
As you discovered, you should call srand just once in main. Also note that a number of rand implementations have pretty short cycles on the low 4 bits or so. In practice this means you might get an easily predictable repeating cycle of numbers. You might want to shift the return from rand right by 4-8 bits before you take the % 2.
EDIT: The call would look something like:
return 1 + (rand() >> 6) % 2;
Seed only once per program, not every time you call coinToss()
To expand on Mark B's answer: it is not so much that the random number generator is reset as that it sets a new variable to be used in calculating random numbers. However, your program doesn't do that much work between calls to srand. Therefore every time you call srand(time(0)) it is using the same seed, so you are resetting the internal state of the random number generator. If you put a sleep in there so that time(0) changed, you would not get the same number every time.
As for how data passes from srand to rand, it is fairly simple: a global variable is used. All names that start with an underscore and a capital letter or two underscores are reserved for variables used by your compiler. More than likely this variable has been declared static so it isn't visible outside of the translation unit (aka the library file that contains your compiler's standard library). This is done so that #define STUFF 5 doesn't break your standard library.
For simple simulations, you must not change the seed at all during the simulation. Your simulation will be "worse" in that case.
To understand this, you should see pseudo-random sequences as a big wheel of fortune. When you change the seed, it is like changing the position, and then each call to rand will give you a different number. If you roll again, you are more likely to find yourself repeating numbers.