I have old code that predates C++11 and it uses rand() for generating random ints.
However, rand() has a shortcoming: you can't save and later restore the state of the generator, since there is no object I can save and no way to extract the state.
Therefore, I want to refactor to use C++11's solution <random>.
However, I do not want a behaviour change - I hope to get exactly the sequence rand() gives me but with <random>.
Do you guys know whether this is achievable?
You can't even be sure that you get the same sequence if you use rand() on another compiler. And no, you can't get <random> to produce the same sequence as whoever's rand() it was you were using. (Thank goodness. rand() is notorious for being one of the worst pseudo-random number generators of all time.)
It is possible for you to restore the state of rand(), simply by using srand() to set the initial state and counting how many times you called rand(). You can later repeat that to bring rand() back to that same state.
But don't use rand()!
What you want is not possible. The C-style random number generator is implementation-defined. The C++ random engines are all very well specified as to their particular algorithms (except random_device, which varies due to potentially being a more "true" random generator). None of its engines are defined to have the same algorithm as rand.
I'm implementing an algorithm. Because the calculations take time and I need to repeat them multiple times, I'm also saving the seed values to the output file. The idea was that I could rerun the same instance of the program if I need more information about what was happening (additional values, some percentage, anything that does not interfere with the algorithm itself).
Unfortunately, even though I thought everything worked as intended, about 20% of the seeded instances gave different values in at least one of the outputted values.
My question is - what type of changes in the code affects how srand() / rand() works in C++? Each class is compiled separately and all are linked together at the end. Can I implement functions and everything will be fine? Does it break only when I change the size of any class in the program by adding/removing class fields? Is it connected with heap/stack allocation?
Until now, I thought that if I seed with srand() I will get the same order of rand() values no matter what (e.g. for srand(123) I'll always get first rand() == 5, second rand() == 8, etc.), and that I can break it only by putting more rand() calls in between.
I hope you could find where I'm thinking wrong, or you could link something that will help me.
Cheers
mrozo
Your understanding about srand is correct: seeding with a specific value should be enough to generate a reproducible sequence of random numbers. You should debug your application to discover why it behaves in a non-reproducible way.
One reason for such behavior is a race condition on the hidden RNG state. Quoting from the C++ rand wiki:
It is implementation-defined whether rand() is thread-safe.
...
It is recommended to use C++11's random number generation facilities to replace rand().
I have recently been learning how to program games in C++ from a beginner book, and I reached a lesson where I have to make a game in which I guess the computer's randomly picked number. I have to use these lines of code:
srand(static_cast<unsigned int>(time(0)));
variable=rand();
I obviously include iostream, cstdlib and ctime. I don't really understand how this works. How is it picking up the time and date, and by what rules is that converted into an unsigned int? Basically, how do those functions work?
Thank you!
1. About time()
time (or better std::time in C++) is a function that returns some integer or floating point number that represents the current time in some way.
Which arithmetic type it actually returns and how it represents the current time is unspecified. Most commonly, however, you will get some integer type that holds the number of seconds since the beginning of the Unix epoch.
2. About srand()
srand is a function that uses its argument (which is of type unsigned int), the so-called seed, to set the internal state of the pseudo-random number generator behind rand. When I write random in the rest of this answer, read pseudo-random.
Using a different seed will in general result in a different sequence of random numbers produced by subsequent calls to rand, while using the same seed again will result in the exactly same sequence of random numbers.
3. Using time() to seed rand()
If we do not want to get the same random numbers every time we run the program, we need some seed that is different on each run. The current time is a widely used source for such a seed as it changes constantly.
This integer (or whatever else time returned) representing the current time is now converted to unsigned int with a static_cast. This explicit cast is not actually needed as all arithmetic types convert to unsigned int implicitly, but the cast may silence some warnings. As time goes by, we can expect the resulting unsigned int and thus the sequence of random numbers produced by rand to change.
4. Pitfalls
If, as is common, time returns the number of seconds since the beginning of the Unix epoch, there are three important things to note:
The sequence you produce will be different only if at least a second has passed between two invocations.
Depending on the actual implementation, the resulting sequences may start off looking similar if the time points used to seed rand are close to each other (compared to the time since the epoch). Afaik, this is the case in MSVC's implementation. If that is problematic, just discard the first couple of hundred or thousand values of the sequence. (As I have learned by now, this does not really help much for poor RNGs as commonly used for rand. So if that is problematic, use <random> as described below.)
Your numbers are not very random in the end: if someone knows when your call to srand occurred, they can derive the entire sequence of random numbers from that. This has actually led to a decryption tool for a piece of ransomware that used srand(time(0)) to generate its "random" encryption key.
Also, the sequence generated by rand tends to have poor statistical properties even if the seed was good. For a toy program like yours, that is probably fine, however, for real world use, one should be aware of that.
5. The new <random>
C++11 introduced new random number facilities that are in many ways superior to the old rand-based stuff. They are provided in the standard header <random>. It includes std::random_device, which provides a way to get actually random seeds, powerful pseudo-random number generators like std::mt19937, and facilities to map the resulting random sequences to integer or floating-point ranges without introducing unnecessary bias.
Here is an example how to randomly roll a die in C++11:
#include <random>
#include <iostream>

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 6);

    for (int n = 0; n < 10; ++n)
        std::cout << dis(gen) << ' ';
    std::cout << '\n';
}
(Code from cppr) Note: std::random_device does not work properly with MinGW, at least in the version (Nuwen MinGW5.3) I tested!
It should also be noted that the state space of a mt19937 is much larger than the 32 bit we (commonly) get out of a single call to random_device. Again, this will most likely not matter for toy programs and homework, but for reference: Here is my attempt to properly seed the entire state space, plus some helpful suggestions in the answers.
If you are interested in more details about rand vs <random>, this is an interesting watch.
First line:
srand() seeds (initializes) the pseudo-random number generator. In your case it is seeded with the current time (the time of execution) on your system.
Second line:
After the pseudo-random number generator is configured, you can retrieve random numbers by calling rand().
my rand number is rand()%6+1 aka dice rolling, when its based on "time", is it possible to make a console app that foresees the future numbers in the time I want to? for example predict a number on time 14:40:32 on a certain day in future?
Yes, provided that you use the same implementation of rand, i.e. link with the same version of the standard library. All you need is to get the time_t value for the time you are interested in, pass it to srand, and call rand to get the value.
For example, if time_t holds the number of seconds since the epoch (which is the case for most implementations), then you can do the following to get the value returned by rand with a 10-second-in-the-future seed:
std::srand(std::time(nullptr) + 10);
std::cout << std::rand();
(Leaving aside the questions of whether it's a good idea to use rand at all.)
... for example predict a number on time 14:40:32 on a certain day in future?
It's possible if you know exactly how rand() generates its pseudo-random numbers from a given seed (the source code is available for most open-source standard library implementations).
You have a certain seed number given from your date and time, thus you can just inspect the sequence of random numbers generated consecutively.
Yes and no. If you have a value of time_t, then just run the same library version of srand() on that value, and rand() will definitely yield the same sequence.
But you need to be sure that
the random libraries in the two applications use the same implementation (rand() is implementation-defined and in practice is often a linear congruential generator rather than Mersenne Twister, so this needs checking)
the clock of the two applications is synchronised. If you think that the master application's clock is 14:30:17, but it's really 14:30:18, then entering 14:30:17 in the monitor application will (of course) get different values.
the sequence of calls to rand() in both applications is the same, i.e., the number of calls between the srand() and the rand() you are interested in is known by you.
The last point might be a showstopper.
Say that you know that the app was initialised with srand(T) and you know T. Now yes, you know all the future extractions of its rand(). But you still need to know at which point in the sequence you are.
The number extracted at 19:30:17 GMT will not depend on the '19:30:17 GMT', but on how many numbers have been extracted before since the call to srand().
TL;DR even if you know the value that time(0) passed to srand(), you cannot predict the output of the rand() call made at a given time. You can predict the output of the n-th call to rand() for any given n.
This comment, which states:
srand(time(0)); I would put this line as the first line in main()
instead if calling it multiple times (which will actually lead to less
random numbers).
...and I've bolded the line which I'm having an issue with... repeats common advice to call srand once in a program. Questions like srand() — why call only once? re-iterate that because time(0) returns the current time in seconds, that multiple calls to srand within the same second will produce the same seed. A common workaround is to use milliseconds or nanoseconds instead.
However, I don't understand why this means that srand should or can only be called once, or how it leads to less random numbers.
cppreference:
Generally speaking, the pseudo-random number generator should only be
seeded once, before any calls to rand(), and the start of the program.
It should not be repeatedly seeded, or reseeded every time you wish to generate a new batch of pseudo-random numbers.
phoxis's answer to srand() — why call only once?:
Initializing once the initial state with the seed value will generate
enough random numbers as you do not set the internal state with srand,
thus making the numbers more probable to be random.
Perhaps they're simply using imprecise language, but none of the explanations seem to explain why calling srand multiple times is bad (aside from producing the same sequence of random numbers) or how it affects the "randomness" of the numbers. Can somebody clear this up for me?
Look at the source of srand() from this question: Rand Implementation
Also, example implementation from this thread:
static unsigned long int next = 1;

int rand(void) // RAND_MAX assumed to be 32767
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next/65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}
As you can see, calling srand(time(0)) makes the numbers that rand() returns depend on the seed. The sequence repeats after some millions of numbers, and calling srand again with a different seed moves you to a different sequence. Either way it must repeat after some number of cycles, but the order depends on the argument to srand. This is why C's rand isn't good for cryptography: you can predict the next number once you know the seed.
If you have a fast loop, calling srand on every iteration makes no sense: you will keep getting the same number until time() returns a different seed (one second is a very long time for a modern CPU).
There is no reason to call srand multiple times in a simple app. These generators are weak by design, and if you want better random numbers you must use another one (the best I know of is Blum Blum Shub).
For me, there are no more or less random numbers: they always depend on the seed, and they repeat if you use the same seed. Using time is a good solution because it's easy to implement, but you should call srand only once (at the beginning of main()), or only when you are sure that srand(time(0)) is being called in a different second.
The numbers rand() returns are not actually random but "pseudo-random." What this means is that rand() generates a stream of numbers that look random for given values of "look" and "random" from an internal state that changes with each call.
As a rule, rand() is what is called a linear congruential generator, which means that it uses a mechanism roughly like this:
int state; // persistent state

int rand() {
    state = (a * state + b) % c;
    return state;
}
with carefully chosen constants a, b and c. c tends to be a power of two in practice because that makes it faster to calculate.
The "randomness" of this sequence depends in part on the persistence of the state. If the sequence is constantly reseeded with predictable values, the return values of rand() become predictable in turn. How critical this is depends on the application, but it is not a purely academic consideration. Consider, for example, the case
a = 69069
b = 1
c = 2^32
which was used, for example, by old versions of glibc. Granted that I picked this example for the obviousness of the pattern, but the point remains in less obvious cases. Imagine this RNG were seeded with a sequence of incrementing numbers n, n+1, n+2 and so forth -- you will get from rand() a sequence of numbers, each 69069 larger than the last (modulo 2^32). The pattern will be plainly visible. Starting with 0, we would get
1
69070
138139
207208
...
rising until a bit over 4 billion in steady increments. And to make matters worse, some implementations actually returned the seed value in the first call of rand after a call to srand, in which case you'd just get your seeds back.
A pseudo-random generator is an engine which produces numbers that look almost random. However, they are completely deterministic. In other words, given a seed x0, they are produced by repeated application of some injective function on x0, call it f(x0), so that f^m(x0) is quite different from f^{m-1}(x0) or f^{m+1}(x0), where the notation f^m denotes the function composed with itself m times. In other words, f(x) has huge jumps, almost uncorrelated with the previous ones.
If you call srand(time(0)) multiple times in a second, you may get the same seed, as the clock is not as fast as you may imagine. So the resulting sequence of random numbers will be the same. And this may be a (huge) problem, especially in cryptography applications (anyway, in the latter case, people buy good number generators based on real-time physical processes such as temperature difference in atmospheric data etc, or, recently, on measuring quantum bits, e.g. superposition of polarized photons, the latter being truly random, as long as quantum mechanics is correct.)
There are also other serious issues with rand. One of them is that the distribution is biased. See e.g. http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx for some discussion, although I remember I've seen something similar on SO, although cannot find it now.
If you plan to use rand in crypto applications, just don't do it. For general use, prefer <random> and a serious random engine like the Mersenne Twister std::mt19937 combined with std::random_device. Note that std::mt19937 is not cryptographically secure either, so cryptographic code needs a dedicated cryptographic generator.
If you seed your random number generator twice using srand, and get different seeds, then you will get two sequences that will be quite different. This may be satisfactory for you. However, each sequence per se will not be a good random distribution due to the issues I mentioned above. On the other hand, if you seed your rng too many times, you will get the same seed, and THIS IS BAD, as you'll generate the same numbers over and over again.
PS: seen in the comments that pseudo-numbers depend on a seed, and this is bad. This is the definition of pseudo-numbers, and it is not a bad thing as it allows you to repeat numerical experiments with the same sequence. The idea is that each different seed should produce a sequence of (almost) random numbers, different from a previous sequence (technically, you shouldn't be able to distinguish them from a perfect random sequence).
The seed determines what random numbers will be generated, in order, i.e. srand(1), will always generate the same number on the first call to rand(), the same on the second call to rand() and so on.
In other words, if you re-seeded with the same seed before each rand() invocation, you'd generate the same random number every single time.
So successive seeding with time(0), during a single second, will mean all your random numbers after re-seeding are actually the same number.
Most of the other answers are saying exactly what the question already stated: multiple calls to srand within the same second will produce the same seed. I believe the actual question is the same one that I had, which is: why would it be bad to call srand multiple times, even if it was with a different seed every time?
I can think of three reasons:
People are not clear in their language and they actually mean srand should not be called multiple times with time() if you want different sequences of random numbers.
It's cryptographically bad because every seed passed to srand is not itself a random number (well, it's probably not). Meaning, every srand is injecting a chance for someone to guess that seed and therefore predict your stream of pseudo-random numbers.
It can mess up the distribution of pseudo-random numbers. @vsoftco's answer gave me a clue. If you call srand once, rand can be designed to give you a uniform distribution of pseudo-random numbers over its lifetime. If you call srand in the middle, however, you'll throw off that uniform distribution because it would "start over" with a new seed.
So, if you don't care about any of that, I would think it's okay to call srand more than once. In my case, I want to call it at the start of my program, but call it again after a fork() because the seed is apparently shared across child processes, and I want each child process to have its own sequence of pseudo-random numbers.
Going back to why it's cryptographically bad, it's easier to guess a seed if it's something like time() because a bad actor can try to guess the time it was seeded. That is why calling srand at the start of a program might be better, because it could be less likely that someone would guess that time as well as, say, when a server request was initiated.
But I would surmise that even passing nanoseconds would be cryptographically dangerous if there's a chance the underlying clock doesn't have that kind of precision. Imagine, for example, that you call srand(get_time_in_ns()) and the underlying clock only returns time to the nearest millisecond.
Now, I'm no crypto expert in any way, but this leads me to wonder if it would be safer than current-time to pass the output of a different pseudo-random generator as seeds to multiple srand calls? For example, can you call each srand with a number from Linux's /dev/random? (I imagine you might want to do that if you want a safer seed than the current time but still want to use rand() so you don't have the overhead of reading from the kernel every time.)