A problem with random number generation - c++

I am taking a course on programming, and we're using C++.
We had an assignment where, at some point, we needed to code a function that would return a random number in a [lower, upper] interval. I used the following:
lower + (int) (upper * (rand() / (RAND_MAX + 1.0)));
I did not forget to seed the generator by calling srand((unsigned int) time(0)).
However, I get the same value every time! I asked my professor for help and he, after some investigation, found out that the first number generated by rand() isn't that random... The higher order bits remained unchanged, and since this implementation uses them, the end result isn't quite what I expected.
Is there a more elegant, yet simple solution than to discard the first value or use remainders to achieve what I want?
Thanks a lot for your attention!
~Francisco
EDIT: Thank you all for your input. I had no idea rand() was such a sucky RNG :P

Given that rand() is not a very strong random number generator, the small amount of bias added by the standard approach is probably not an issue (higher-lower needs to be smaller than RAND_MAX, of course):
lower + rand() % (higher-lower+1);
fixed off by one error.

rand() is not a good random-number generator. In addition to the problem you observed, its period can be very short.
Consider using one of the gsl random number generators.

Depending on what OS you are using, you may have random() available in addition to rand(). This generates much better pseudo-random numbers than rand(). Check <stdlib.h> and/or man 3 random.

Your code is good, but you should substitute
lower + (int) (upper * (rand() / (RAND_MAX + 1.0)));
with
lower + (int) ((upper - lower + 1) * (rand() / (RAND_MAX + 1.0)));
The following code works nicely on my machine:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define lower 10
#define upper 20
int main(void)
{
    int i;
    int number;
    srand((unsigned int) time(0));
    for(i=0; i<10; i++)
    {
        /* scale the [0,1) fraction first, then truncate to int */
        number = lower + (int) ((upper - lower + 1) * (rand() / (RAND_MAX + 1.0)));
        printf ("%d\n", number);
    }
    return 0;
}
Of course, since time(0) gives the current time in seconds, two executions within the same second give the same result.

C++0x random number library (also available in TR1 and Boost) finally solves some nasty issues of rand. It allows getting real randomness (random_device) that you can use for proper seeding, then you can use a fast and good pseudo random generator (mt19937), and you may apply a suitable distribution to that (e.g. uniform_int for min-max range with equal probability for each value).
It also does not use global hidden state like rand() does, so there won't be any issues in multi-threaded programs.
Due to all the modularity it is a bit more difficult to use than simply calling rand, but still the benefits greatly outweigh the steeper learning curve.
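As a rough illustration of that modularity, here is a minimal sketch using the C++11 names (std::uniform_int_distribution is what uniform_int became in the final standard). The helper names random_in_range and make_seeded_engine are made up for this example, and std::random_device is only non-deterministic where the platform supports it.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: returns a uniformly distributed integer in [lower, upper].
// The engine is passed by reference, so there is no hidden global state.
int random_in_range(std::mt19937& engine, int lower, int upper)
{
    assert(lower <= upper);  // the distribution requires lower <= upper
    std::uniform_int_distribution<int> dist(lower, upper);
    return dist(engine);
}

// Hypothetical helper: builds a Mersenne Twister engine seeded from random_device.
std::mt19937 make_seeded_engine()
{
    std::random_device device;
    return std::mt19937(device());
}
```

Typical use would be to create one engine with make_seeded_engine() at startup (or one per thread) and pass it to random_in_range(engine, lower, upper) whenever a value is needed.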

Related

Why does rand() produce the same value when seeded with 1 and UINT_MAX?

Here's some code:
#include <iostream>
int main() {
srand(1);
std::cout << rand() << "\n";
srand(UINT_MAX);
std::cout << rand() << "\n";
}
This produces the following output:
16807
16807
Why do these two seeds produce the same result? The entire sequence of values they produce on successive rand() calls is also identical. The range of possible values is too large for this to be a pure coincidence. Is it:
An accident of the implementation of rand() (and if so, I'm curious what that might be)
By design (and if so, why?)
(Possibly related: the seeds 10, 100, 1000, 10000, and 100000 produce 168070, 1680700, 16807000, 168070000, and 1680700000 respectively.)
A very simple usable random number generator is a Lehmer number generator. It is perhaps the simplest RNG to implement in software that is still usable, so it probably has the most issues with randomness and is the easiest to analyze.
The number 16807 (aka the fifth power of 7) is associated with Lehmer RNG, because it was used in one of the earliest implementations in 1988 - apparently still in use today!
The formula for Nth random number is (using ^ for exponentiation):
R(n) = (seed * (16807 ^ n)) mod (2 ^ 31 - 1)
If you set seed = 1, n = 1:
R(1) = 16807 mod (2 ^ 31 - 1) = 16807
If you set seed = 2 ^ 32 - 1:
R(1) =
(2 ^ 32 - 1) * 16807 ≡ (expressing 2^32 = 2^31 * 2)
((2 ^ 31 - 1) * 2 + 1) * 16807 ≡ (distributive law)
(2 ^ 31 - 1) * 2 * 16807 + 1 * 16807 ≡ (modulo 2^31-1)
16807
Here the equality of the first number in the random sequence is because the modulo-number in the Lehmer RNG is almost a power of 2 (2^31-1), and your seed is also almost a power of 2 (2^32-1).
The same would happen for seed = 2^31.
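The arithmetic above can be checked with a small sketch. It assumes the rand() in question behaves like a plain Lehmer generator with multiplier 16807 and modulus 2^31 - 1, which the observed outputs suggest but which is not confirmed for any particular library; the function name lehmer is made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of R(n) = (seed * 16807^n) mod (2^31 - 1), computed step by step.
std::uint64_t lehmer(std::uint64_t seed, unsigned n)
{
    const std::uint64_t modulus = 2147483647ULL;  // 2^31 - 1
    const std::uint64_t multiplier = 16807ULL;    // 7^5
    assert(n >= 1);
    // Seeds that differ by a multiple of the modulus collapse to the same state here.
    std::uint64_t state = seed % modulus;
    for (unsigned i = 0; i < n; ++i) {
        // Both factors are below 2^31, so the product fits in 64 bits.
        state = (state * multiplier) % modulus;
    }
    return state;
}
```

With this sketch, lehmer(1, 1), lehmer(4294967295, 1) and lehmer(2147483648, 1) all evaluate to 16807, and lehmer(10, 1) evaluates to 168070, matching the values reported in the question.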
tl;dr: rand() is known to be that bad.
The actual values are implementation defined. I get the following values on my platform:
seed: 1 : 41
seed: 4294967295 : 35
seed: 10 : 71
seed: 100 : 365
seed: 1000 : 3304
seed: 10000 : 32694
rand() can be relied on to look somewhat random to a casual user in a hurry. It is not portably suitable for anything else.
Implementations usually use a low-quality generator (most often a linear congruential generator with bad constants).
The minimum required numeric range is 0...32767, and while implementations may exceed that, they often don't, so you can expect many seeds to result in the same value.
For modern C++, see <random> for reliable options.
This depends on the implementation of your random number generator.
See What common algorithms are used for C's rand()?
for common implementations.
Usually the space of possible seed values is much smaller than your UINT_MAX.
It could be that 1 and UINT_MAX are mapped to the same internal seed.
Often a linear congruential generator is used for rand(); the first generated random number then depends like
first_random_number = (seed * const + another_const) % third_constant
on the seed. This explains the dependence you found.
I don't see a good reason why the unfortunate correlation that you observed would be designed into the implementation of rand that you're using. It's most likely an accident of the implementation as you suggest. That said, I would also consider it a coincidence, that you can produce correlation with exactly those inputs. Another implementation could have other sets of unfortunate inputs.
if so, I'm curious what that might be
If your implementation is open source, then you can find out by reading the source. If it's proprietary you could still find a mention of the algorithm in some documentation, or if you're a customer, you could ask the implementer.
As stated here, If seed is set to 1, the generator is reinitialized to its initial value and produces the same values as before any call to rand or srand.
Also note that Two different initializations with the same seed will generate the same succession of results in subsequent calls to rand.

Trying to take 1 random digit at a time

I am new to cpp programing, and new to stackoverflow.
I have a simple situation and a problem that is taking more time than reasonable to solve, so I thought I'd ask it here.
I want to take one digit from a rand() at a time. I have managed to strip off the digit, but I can't convert it to an int, which I need because it's used as an array index.
Can anyone help? I'd be appreciative.
Also if anyone has a good solution to get evenly-distributed-in-base-10 random numbers, I'd like that too... of course with a rand max that isn't all 9s we don't have that.
KTM
Well you can use the modulus operator to get the digit of what number rand returns.
int digit = rand()%10;
As for your first question, if you have a character digit you can subtract the value of '0' to get the digit.
int digit = char_digit - '0';
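To take every digit of one rand() result in turn, the same modulus idea can be repeated with a division by 10 between steps. This is only a sketch; decimal_digits is a made-up helper name, and the digits of a single rand() value are not evenly distributed (the leading digit in particular is limited by RAND_MAX).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: splits a non-negative value into its decimal digits,
// least significant digit first.
std::vector<int> decimal_digits(int value)
{
    assert(value >= 0);
    std::vector<int> digits;
    do {
        digits.push_back(value % 10);  // already an int, usable directly as an array index
        value /= 10;                   // drop the digit that was just taken
    } while (value > 0);
    return digits;
}
```

Calling decimal_digits(rand()) then gives a sequence of ints in the range 0 to 9 that can be used one at a time.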
If one wants a pedantic even distribution of 0 to 9 one could
Assume rand() itself is evenly distributed. (Not always a good assumption.)
Call rand() again as needed.
int ran10(void) {
const static int r10max = RAND_MAX - (RAND_MAX % 10);
int r;
while ((r = rand()) >= r10max);
return r%10;
}
Example:
If RAND_MAX was 32767, r10max would have the value of 32760. Any rand value in the range 32760 to 32767 would get tossed and a new random value would be fetched.
While not as fast as modulo-arithmetic implementations like rand() % 10, this will return evenly distributed integers and avoid the periodicity in the least significant bits that occurs in some pseudo-random number generators (see http://www.gnu.org/software/gsl/manual/html_node/Other-random-number-generators.html).
int rand_integer(int exclusive_upperbound)
{
    // divide by RAND_MAX + 1.0 so the result can never equal exclusive_upperbound
    return int((double)exclusive_upperbound * rand() / (RAND_MAX + 1.0));
}

Rand generating same numbers

I have a problem with the small game that I made.
#include "stdafx.h"
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
int main()
{
int span = 100;
srand(time(0));
int TheNumber = static_cast<double> (rand()) /RAND_MAX * (span -1) +1;
cout << "You need to guess the number between 1 and " << span << endl;
int mynumber;
int numberofAttempts = 0;
do {
cout << ++numberofAttempts <<" Attempt: ";
cin >> mynumber;
if (mynumber > TheNumber)
cout <<"Lower!" << endl;
else if (mynumber < TheNumber)
cout <<"Higher!" << endl;
} while (mynumber != TheNumber);
cout << "SUCESS!!!" << endl;
return 0;
}
The game is supposed to generate a random number between 0-100 and you are supposed to guess it. After running this code 15-20 times, the same numbers were generated repeatedly, some even 8 times (the number 2 in my case).
I know that there is no absolute random number and that it uses some math formula or something to get one. I know that using srand(time(0)) makes it dependent on the current time. But how would I make it "more" random, since I don't want the stuff to happen that I mentioned above?
First time I ran it the result was 11, after running it again (after guessing the right number) , it was still 11, even though the time changed.
[ADDITION1]
If you DO truly wish to look into better random number generation, then this is a good algorithm to begin with:
http://en.wikipedia.org/wiki/Mersenne_twister
Remember though that any "Computer Generated" (i.e. mathematically generated) random number is ONLY pseudo-random. Pseudo-random means that while the outputs from the algorithm look randomly distributed, they are fully deterministic if one knows the input seed. True random numbers are completely non-deterministic.
[ORIGINAL]
Try simply one of the following lines:
rand() % (span + 1); // This will give 0 - 100
rand() % span; // this will give 0 - 99
rand() % span + 1; // This will give 1 - 100
Instead of:
(rand()) /RAND_MAX * (span -1) +1
Also, don't cast the result of that to a double, then place into an int.
Look here also:
http://www.cplusplus.com/reference/clibrary/cstdlib/rand/
In Response to the comment!!!
If you use:
rand() / (span + 1);
then in order to get values between 0 and 100, then the output values from rand would indeed have to be between 0 and (100 * 100), and this nature would have to be guaranteed. This is because of simple division. A value of 1 will essentially pop out when rand() produces a 101 - 201, a 2 will pop out of the division when the rand() outputs a value of 202 - 302, etc...
In this case, you may be able to get away with it, as 100 * 100 is only 10000, and there are definitely integers larger than this in the 32 bit space, but in general doing a divide will not let you take advantage of the full number space provided!!!
There are a number of problems with rand(). You've run into one of them, which is that the first several values aren't "random". If you must use rand(), it is always a good idea to discard the first four or so results from rand().
srand (time(0));
rand();
rand();
rand();
rand();
Another problem with rand() is that the low order bits are notoriously non-random, even after the above hack. On some systems, the lowest order bit alternates 0,1,0,1,0,1,... It's always better to use the high order bits such as by using the quotient rather than the remainder.
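The alternating lowest bit is easy to reproduce with a toy generator. The sketch below is not any particular library's rand(); ToyLcg and lowest_bit_alternates are made-up names, and the constants are just a commonly quoted LCG pair used with a power-of-two modulus. With an odd multiplier and an odd increment, each step flips the lowest bit of the state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy LCG: state = (state * a + c) mod 2^31, returning the whole
// state, low-order bits included.
struct ToyLcg {
    std::uint32_t state;
    explicit ToyLcg(std::uint32_t seed) : state(seed) {}
    std::uint32_t next()
    {
        state = (state * 1103515245u + 12345u) & 0x7fffffffu;  // modulus 2^31
        return state;
    }
};

// Returns true if the lowest bit of `count` successive outputs strictly alternates.
bool lowest_bit_alternates(ToyLcg generator, int count)
{
    assert(count >= 2);
    std::uint32_t previous = generator.next() & 1u;
    for (int i = 1; i < count; ++i) {
        std::uint32_t current = generator.next() & 1u;
        if (current == previous) {
            return false;
        }
        previous = current;
    }
    return true;
}
```

For this toy generator, lowest_bit_alternates returns true for any seed, so next() % 2 would simply alternate between 0 and 1.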
Other problems: Non-randomness (most implementations of rand() fail a number of tests of randomness) and short cycle. With all these problems, the best advice is to use anything but rand().
First, rand() / RAND_MAX does not give a number between 0 and 1; it returns 0 (or 1 in the rare case that rand() returns exactly RAND_MAX). This is because RAND_MAX fits 0 times in the result of rand(). Both are integers, so integer division is used and no floating point number is produced.
Second, RAND_MAX may well be equal to INT_MAX. Multiplying RAND_MAX by anything greater than 1 will then overflow.

Extend rand() max range

I created a test application that generates 10k random numbers in a range from 0 to 250 000. Then I calculated MAX and min values and noticed that the MAX value is always around 32k...
Do you have any idea how to extend the possible range? I need a range with MAX value around 250 000!
This is according to the definition of rand(), see:
http://cplusplus.com/reference/clibrary/cstdlib/rand/
http://cplusplus.com/reference/clibrary/cstdlib/RAND_MAX/
If you need larger random numbers, you can use an external library (for example http://www.boost.org/doc/libs/1_49_0/doc/html/boost_random.html) or calculate large random numbers out of multiple small random numbers by yourself.
But pay attention to the distribution you want to get. If you just sum up the small random numbers, the result will not be equally distributed.
If you just scale one small random number by a constant factor, there will be gaps between the possible values.
Taking the product of random numbers also doesn't work.
A possible solution is the following:
1) Take two random numbers a,b
2) Calculate a*(RAND_MAX+1)+b
So you get equally distributed random values up to (RAND_MAX+1)^2-1
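The two steps above might look like the following sketch. It assumes rand() itself is uniform and that successive calls are independent, which is far from guaranteed; wide_rand and wide_rand_up_to are made-up names, and the second function adds a rejection step to map the wide value onto a smaller range such as [0, 250000] without modulo bias.

```cpp
#include <cassert>
#include <cstdlib>

// Combines two rand() results into one value in [0, (RAND_MAX + 1)^2 - 1].
// 64-bit arithmetic is used so that RAND_MAX + 1 cannot overflow.
unsigned long long wide_rand()
{
    const unsigned long long base = static_cast<unsigned long long>(RAND_MAX) + 1ULL;
    unsigned long long a = static_cast<unsigned long long>(std::rand());
    unsigned long long b = static_cast<unsigned long long>(std::rand());
    return a * base + b;
}

// Hypothetical helper: value in [0, upper], rejecting the incomplete last block
// of wide_rand() values so that every result is equally likely.
unsigned long long wide_rand_up_to(unsigned long long upper)
{
    const unsigned long long base = static_cast<unsigned long long>(RAND_MAX) + 1ULL;
    const unsigned long long span = base * base;  // number of distinct wide_rand() values
    assert(upper < span);
    const unsigned long long buckets = upper + 1ULL;
    const unsigned long long limit = span - span % buckets;  // largest multiple of buckets <= span
    unsigned long long value = wide_rand();
    while (value >= limit) {
        value = wide_rand();
    }
    return value % buckets;
}
```

For the question's case, wide_rand_up_to(250000) would be called after seeding with srand as usual.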
Presumably, you also want an equal distribution over this extended
range. About the only way you can effectively do this is to generate a
sequence of smaller numbers, and scale them as if you were working in a
different base. For example, for 250000, you might generate 4 random numbers
in the range [0,10) and one in range [0,25), along the lines:
int
random250000()
{
return randomInt(10) + 10 * randomInt(10)
+ 100 * randomInt(10) + 1000 * randomInt(10)
+ 10000 * randomInt(25);
}
For this to work, your random number generator must be good; many
implementations of rand() aren't (or at least weren't—I've not
verified the situation recently). You'll also want to eliminate the
bias you get when you map RAND_MAX + 1 different values into 10 or
25 different values. Unless RAND_MAX + 1 is an exact multiple of
10 and 25 (e.g. is an exact multiple of 50), you'll need something
like:
int
randomInt( int upperLimit )
{
int const limit = (RAND_MAX + 1) - (RAND_MAX + 1) % upperLimit;
int result = rand();
    while ( result >= limit ) {
        result = rand();
    }
    return result % upperLimit;
}
(Attention when doing this: there are some machines where RAND_MAX + 1
will overflow; if portability is an issue, you'll need to take
additional precautions.)
All of this, of course, supposes a good quality generator, which is far
from a given.
You can just manipulate your number bitwise by generating smaller random numbers.
For instance, if you need a 32-bit random number:
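uint32_t x = 0;                      // fixed-width types come from <cstdint>
for (int i = 0; i < 4; ++i) {        // 4 == 32/8
    uint8_t tmp = random_byte();     // placeholder for your 8-bit random number generator
    x = (x << 8) | tmp;              // shift by 8 each time (not 8*i), then append the new byte
}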
If you don't need good randomness in your numbers, you can just use rand() & 0xff for the 8-bit random number generator. Otherwise, something better will be necessary.
Are you using short ints? If so, you will see 32,767 as your max number because anything larger will overflow the short int.
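Scale your numbers up by N / RAND_MAX, where N is your desired maximum. The multiplication has to be done in a type wide enough to hold RAND_MAX * N, so convert before multiplying:
unsigned long long int r = static_cast<unsigned long long>(rand()) * N / RAND_MAX;
With N = 250000 the product fits easily in 64 bits, whereas a plain int multiplication would overflow even when RAND_MAX is only 32K, as it is on many popular platforms. Note that scaling cannot produce more than RAND_MAX + 1 distinct values, so there will be gaps in the result.
More generally, to get a random number approximately uniformly in the interval [A, B], use:
A + static_cast<long long>(rand()) * (B - A) / RAND_MAX;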
Of course you should probably use the proper C++-style <random> library; search this site for many similar questions explaining how to use it.
Edit: In the hope of preventing an escalation of comments, here's yet another copy/paste of the Proper C++ solution for truly uniform distribution on an interval [A, B]:
#include <random>
typedef std::mt19937 rng_type;
typedef unsigned long int int_type; // anything you like
const int_type A = 0, B = 250000;   // example bounds, pick your own
std::uniform_int_distribution<int_type> udist(A, B);
rng_type rng;
int main()
{
    // seed rng first (std::random_device is one possible seed source):
    rng_type::result_type const seedval = std::random_device{}();
    rng.seed(seedval);
    int_type random_number = udist(rng);
    // use random_number
    (void) random_number;
}
Don't forget to seed the RNG! If you store the seed value, you can replay the same random sequence later on.

C++ generating random numbers

My output is 20 1's, not random numbers between 1 and 10. Can anyone explain why this is happening?
#include <iostream>
#include <ctime>
#include <cstdlib>
using namespace std;
int main()
{
srand((unsigned)time(0));
int random_integer;
int lowest=1, highest=10;
int range=(highest-lowest)+1;
for(int index=0; index<20; index++){
random_integer = lowest+int(range*rand()/(RAND_MAX + 1.0));
cout << random_integer << endl;
}
}
output:
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
Because, on your platform, RAND_MAX == INT_MAX.
The expression range*rand() can never take on a value greater than INT_MAX. If the mathematical expression is greater than INT_MAX, then integer overflow reduces it to a number between INT_MIN and INT_MAX. Dividing that by RAND_MAX will always yield zero.
Try this expression:
random_integer = lowest+int(range*(rand()/(RAND_MAX + 1.0)))
It's much easier to use the <random> library correctly than rand (assuming you're familiar enough with C++ that the syntax doesn't throw you).
#include <random>
#include <iostream>
int main() {
std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 eng(seed);
std::uniform_int_distribution<> dist(1, 10);
for(int i = 0; i < 20; ++i)
std::cout << dist(eng) << " ";
}
random_integer = (rand() % 10) + 1
That should give you a pseudo-random number between 1 & 10.
A somewhat late answer, but it should provide some additional
information if the quality of the generation is important. (Not all
applications need this—a slight bias is often not a problem.)
First, of course, the problem in the original code is the fact that
range * rand() has precedence over the following division, and is done
using integer arithmetic. Depending on RAND_MAX, this can easily
result in overflow, which is formally undefined behavior for signed integers; on all
implementations that I know, if it does result in overflow (because
RAND_MAX > INT_MAX / range), the actual results will almost certainly
be smaller than RAND_MAX + 1.0, and the division will result in a
value less than 1.0. There are several ways of avoiding this: the
simplest and most reliable is simply rand() % range + lowest.
Note that this supposes that rand() is of reasonable quality. Many
earlier implementations weren't, and I've seen at least one where
rand() % 6 + 1 to simulate a dice throw alternated odd and even. The
only correct solution here is to get a better implementation of
rand(); it has led to people trying alternative solutions, such as
(range * (rand() / (RAND_MAX + 1.0))) + lowest. This masks the
problem, but it won't change a bad generator into a good one.
A second issue, if the quality of the generation is important, is
that when generating random integers, you're discretizing: if you're
simulating the throw of a die, for example, you have six possible
values, which you want to occur with equal probability. The random
generator will generate RAND_MAX + 1 different values, with equal
probability. If RAND_MAX + 1 is not a multiple of 6, there's no
possible way of distributing the values equally among the 6 desired
values. Imagine the simple case where RAND_MAX + 1 is 10. Using the
% method above, the values 1–4 are twice as likely as the
values 5 and 6. If you use the more complicated formula 1 + int(6 *
(rand() / (RAND_MAX + 1.0))) (in the case where RAND_MAX + 1 == 10),
it turns out that 3 and 6 are only half as likely as the other values.
Mathematically, there's simply no way of distributing 10 different
values into 6 slots with an equal number of elements in each slot.
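The counts in this small example can be checked by enumerating every raw value once. The sketch below is only an illustration; counts_with_modulo and counts_with_scaling are made-up names, and the parameter outcomes stands in for RAND_MAX + 1.

```cpp
#include <array>
#include <cassert>

// Counts how often each die face 1..6 appears when `outcomes` equally likely
// raw values 0..outcomes-1 are mapped with raw % 6 + 1.
std::array<int, 6> counts_with_modulo(int outcomes)
{
    assert(outcomes > 0);
    std::array<int, 6> counts{};
    for (int raw = 0; raw < outcomes; ++raw) {
        ++counts[raw % 6];  // index 0 corresponds to face 1
    }
    return counts;
}

// Same, for the scaling formula 1 + int(6 * (raw / double(outcomes))).
std::array<int, 6> counts_with_scaling(int outcomes)
{
    assert(outcomes > 0);
    std::array<int, 6> counts{};
    for (int raw = 0; raw < outcomes; ++raw) {
        int face = 1 + static_cast<int>(6 * (raw / static_cast<double>(outcomes)));
        ++counts[face - 1];
    }
    return counts;
}
```

With outcomes set to 10, the modulo mapping gives faces 1 to 4 two raw values each and faces 5 and 6 one each, while the scaling mapping gives faces 3 and 6 one raw value each and the others two.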
Of course, RAND_MAX will always be considerably larger than 10, and
the bias introduced will be considerably less; if the range is
significantly less than RAND_MAX, it could be acceptable. If it's
not, however, the usual procedure is something like:
int limit = (RAND_MAX + 1LL) - (RAND_MAX + 1LL) % range;
// 1LL will prevent overflow on most machines.
int result = rand();
while ( result >= limit ) {
result = rand();
}
return result % range + lowest;
(There are several ways of determining the values to throw out. This
happens to be the one I use, but I remember Andy Koenig using something
completely different—but which resulted in the same values being
thrown out in the end.)
Note that most of the time, you won't enter the loop; the worst case is
when range is (RAND_MAX + 1) / 2 + 1, in which case, you'll still
average just under one time through the loop.
Note that these comments only apply when you need a fixed number of
discrete results. For the (other) common case of generating a random
floating point number in the range of [0,1), rand() / (RAND_MAX +
1.0) is about as good as you're going to get.
Visual studio 2008 has no trouble with that program at all and happily generates a swathe of random numbers.
What I would be careful of is the /(RAND_MAX +1.0) as this will likely fall foul of integer problems and end up with a big fat zero.
Cast to double before dividing and then cast back to int afterwards
I suggest you replace range*rand()/(RAND_MAX + 1.0) with range*double(rand())/(RAND_MAX + 1.0). Since my solution seems to give headaches ...
possible combinations of arguments:
range*rand() is an integer and overflows.
double(range*rand()) overflows before you convert it to double.
range*double(rand()) is not overflowing and yields expected results.
My original post had two braces but they did not change anything (results are the same).
(rand() % (highest - lowest + 1)) + lowest
Probably "10 * rand()" is smaller than "RAND_MAX + 1.0", so the value of your calculation is 0.
You are generating a random number (ie (range*rand()/(RAND_MAX + 1.0))) whose value is between -1 and 1 (]-1,1[) and then casting it to an integer. The integer value of such number is always 0 so you end up with the lower + 0
EDIT: added the formula to make my answer clearer
What about using a condition to check if the last number is the same as the current one? If the condition is met then generate another random number. This solution works but it will take more time though.
This is one of the simplest approaches; I got it from a blog. It limits the random numbers with the modulus (%) operator inside the for loop. It's just a copy and paste from that blog, but check it out anyway:
// random numbers generation in C++ using builtin functions
#include <iostream>
using namespace std;
#include <iomanip>
using std::setw;
#include <cstdlib> // contains function prototype for rand
int main()
{
// loop 20 times
for ( int counter = 1; counter <= 20; counter++ ) {
// pick random number from 1 to 6 and output it
cout << setw( 10 ) << ( 1 + rand() % 6 );
// if counter divisible by 5, begin new line of output
if ( counter % 5 == 0 )
cout << endl;
}
return 0; // indicates successful termination
} // end main
Source: http://www.programmingtunes.com/generation-of-random-numbers-c/