random_device vs default_random_engine - c++

#include <algorithm>
#include <random>
#include <vector>
using namespace std;
int main()
{
    vector<int> coll{1, 2, 3, 4};
    shuffle(coll.begin(), coll.end(), random_device{});
    default_random_engine dre{random_device{}()};
    shuffle(coll.begin(), coll.end(), dre);
}
Question 1: What's the difference between
shuffle(coll.begin(), coll.end(), random_device{});
and
shuffle(coll.begin(), coll.end(), dre);?
Question 2: Which is better?

Question 1: What's the difference between...
std::random_device conceptually produces true random numbers. Some implementations will stall if you exhaust the system's source of entropy, so this version may not perform as well.
std::default_random_engine is a pseudo-random engine. Once seeded with a random number, it would be extremely difficult (but not impossible) to predict the next number.
There is another subtle difference. std::random_device::operator() will throw an exception if it fails to come up with a random number.
Question 2: Which is better?
It depends. For most cases, you probably want the performance and temporal-determinism of the pseudorandom engine seeded with a random number, so that would be the second option.
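As a rough illustration of the second option, the sketch below shuffles with an engine built from an explicit seed. The helper name shuffled_copy and the choice of std::mt19937 are assumptions made for the example (the question uses default_random_engine). With the same seed and the same standard library, the resulting order is reproducible.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical helper: returns a shuffled copy of `values`, driven by a
// pseudo-random engine constructed from `seed`.
std::vector<int> shuffled_copy(std::vector<int> values, unsigned seed) {
    std::mt19937 engine{seed};  // seed once, then let the engine do the work
    std::shuffle(values.begin(), values.end(), engine);
    return values;
}
```

Calling shuffled_copy(coll, std::random_device{}()) gives a different order on each run where random_device is non-deterministic, while a fixed seed such as 42 is handy for tests.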

Both random_device and default_random_engine are implementation defined. random_device should provide a nondeterministic source of randomness if available, but it may also be a prng in some implementations. Use random_device if you want unpredictable random numbers (most machines nowadays have hardware entropy sources). If you want pseudo random numbers you'd probably use one of the specific algorithms, like the mersenne_twister_engine.
I guess default_random_engine is what you'd use if you don't care about the particulars of how you get your random numbers. I'd suspect it'd just use rand under the hood or a linear_congruential_engine most of the time.
I don't think the question "which is better" can be answered objectively. It depends on what you're trying to do with your random numbers. If they're supposed to be random sources for some cryptographic process, I suspect default_random_engine is a terrible choice, although I'm not a security expert, so I'm not sure if even random_device is good enough.

Related

Better alternatives to random_device in C++? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
I have been using random_device rd{} to generate seeds for my Mersenne Twister pseudo-random number generator mt19937 RNG{rd()}, as has been suggested here. However, the documentation notes (in a comment in its example code) that "the performance of many implementations of random_device degrades sharply once the entropy pool is exhausted. For practical use random_device is generally only used to seed a PRNG such as mt19937". I tried testing how big this "entropy pool" is, and in 10^6 calls random_device returned more than 10^2 repeated numbers (see my example code and output below). In other words, if I use random_device to seed my Mersenne Twister PRNG, it will generate a solid fraction of repeating seeds.
Question: do people still use random_device in C++ to generate seeds for PRNG or are there already better alternatives?
My code:
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <iostream>
#include <random>
#include <vector>
using namespace std;
int main(){
    auto begin = std::chrono::high_resolution_clock::now();
    random_device rd{};
    mt19937 RNG{ rd() };
    int total_n_of_calls = 1e6;
    vector<int> seeds;
    for(auto call = 0; call < total_n_of_calls; call++){
        int call_rd = rd();
        seeds.push_back(call_rd);
    }
    int count_repeats = 0;
    sort(seeds.begin(), seeds.end());
    // Count adjacent equal values in the sorted list.
    for(size_t i = 0; i + 1 < seeds.size(); i++) {
        if (seeds[i] == seeds[i + 1]) {
            count_repeats++;
        }
    }
    printf("Number of times random_device have been called: %i\n", total_n_of_calls);
    printf("Number of repeats: %i\n", count_repeats);
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    printf("Duration: %.3f seconds.\n", elapsed.count() * 1e-9);
    return 0;
}
The output:
Number of times random_device have been called: 1000000
Number of repeats: 111
Duration: 0.594 seconds.
TL;DR: No, there's nothing better. You just need to stop abusing it.
The point of random_device is that it asks your platform for bits that are actually random, not just pseudorandom from some deterministic seed.
If the platform / OS thinks the entropy it had was expended, then it cannot offer you this. But honestly, it uses true sources of randomness, from actual randomness hardware in your CPU to timing of disk access, to modify the internal state of a PRNG. That's all there is to it – to someone external, the bits you get are still unpredictable.
So, the answer is this:
you use random_device because you need actually random seeds. There's no algorithmic shortcut to randomness – the word "algorithm" already says that it's deterministic. And software, universally, is deterministic, unless it gets random data externally. So, all you can do is ask the operating system, which actually deals with any source of randomness there is in your system. And that's already exactly what random_device does.
So, no, you cannot use something else but actual external entropy, which is exactly what you get most efficiently from random_device (unless you buy an expensive dedicated random generator card and write a driver for it).
As the OS uses the random external source to change the internal state of a PRNG, it can produce more random things securely than random events happen – but it needs to keep track of how many bits were taken out of the PRNG, so that it never becomes possible for an attacker to reconstruct prior state with a high probability of being right. Thus, it slows down your consumption of randomness when there's not enough external randomness to modify the internal state.
Thus, 10⁶ calls to generate a seed in a short time sound like you're doing something wrong; doubly so if these are used to feed a Mersenne Twister, an algorithm that is overly complex and slow, but not cryptographically secure. You're not using this much actual randomness, ever! Don't reseed; keep using your seeded PRNG, unless you need the cryptographic guarantee that these seeds are independent.
And that's exactly the thing: if you're in a situation where you need to generate 10⁶ independent cryptographically secure keys in less than a few seconds, you're a bit special. Are you working for someone who does CDNs, where a single operating system would serve millions of new TLS connections per second? If not, reduce your usage of random_device to what it's actually useful for.
If you want to understand more about the way true randomness ends up in your program, I recommend reading this answer. In short, if you're actually in need of more random bytes per second than the default random_device offers, try constructing it with "/dev/urandom" as a ctor parameter. It's still going to be secure, for any assumable definition of what that means in the context in which you're asking this (which means I assume you're not writing a cryptographic library for extremely high throughput of key generation).
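For context on the 111 repeats reported in the question: a perfectly uniform 32-bit source is expected to produce about n²/(2·2³²) repeated values in n draws (the birthday effect). The sketch below computes that expectation. The function name expected_repeats is made up for illustration, and it assumes rd() yields uniformly distributed 32-bit values.

```cpp
#include <cmath>

// Expected number of repeated values when drawing `draws` samples uniformly
// from 2^bits possible values: draws minus the expected number of distinct values.
double expected_repeats(double draws, int bits) {
    const double space = std::ldexp(1.0, bits);  // 2^bits
    // E[distinct] = space * (1 - (1 - 1/space)^draws), evaluated with
    // log1p/expm1 to avoid cancellation for very large `space`.
    const double distinct = -space * std::expm1(draws * std::log1p(-1.0 / space));
    return draws - distinct;
}
```

expected_repeats(1e6, 32) comes out at roughly 116, so the observed count is in line with what a uniform 32-bit source would give and does not by itself indicate an exhausted entropy pool.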

C++ need a good technique for seeding rand() that does not use time()

I have a bash script that starts many client processes. These are AI game players that I'm using to test a game with many players, on the order of 400 connections.
The problem I'm having is that the AI player uses
srand( time(nullptr) );
But if all the players start at approximately the same time, they will frequently receive the same time() value, which will mean that they are all on the same rand() sequence.
Part of the testing process is to ensure that if lots of clients try to connect at approximately the same time, the server can handle it.
I had considered using something like
srand( (int) this );
Or similar, banking on the idea that each instance has a unique memory address.
Is there another better way?
Use a random seed to a pseudorandom generator.
std::random_device is expensive random data. (expensive as in slow)
You use that to seed a prng algorithm. mt19937 is the last prng algorithm you will ever need.
You can optionally follow that up by feeding it through a distribution if your needs require it. i.e. if you need values in a certain range other than what the generator provides.
std::random_device rd;
std::mt19937 generator(rd());
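A minimal sketch of the optional distribution step mentioned above, assuming a die-roll range of 1 to 6; the function name roll_die is hypothetical.

```cpp
#include <random>

// Hypothetical helper: maps raw engine output onto the closed range [1, 6].
int roll_die(std::mt19937& generator) {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(generator);
}
```

The generator is passed by reference so that its state advances between calls; copying it would repeat the same values.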
These days rand() and srand() are obsolete.
The generally accepted method is to seed a pseudo random number generator from the std::random_device. On platforms that provide non-deterministic random sources the std::random_device is required to use them to provide high quality random numbers.
However it can be slow or even block while gathering enough entropy. For this reason it is generally only used to provide the seed.
A high quality but efficient random engine is the mersenne twister provided by the standard library:
#include <random>
#include <type_traits>
inline
std::mt19937& random_generator()
{
    // One engine per thread, seeded once from std::random_device.
    thread_local static std::mt19937 mt{std::random_device{}()};
    return mt;
}
template<typename Number>
Number random_number(Number from, Number to)
{
    static_assert(std::is_integral<Number>::value||std::is_floating_point<Number>::value,
        "Parameters must be integer or floating point numbers");
    using Distribution = typename std::conditional
    <
        std::is_integral<Number>::value,
        std::uniform_int_distribution<Number>,
        std::uniform_real_distribution<Number>
    >::type;
    thread_local static Distribution dist;
    return dist(random_generator(), typename Distribution::param_type{from, to});
}
You use a random number seed if and only if you want reproducible results. This can be handy for things like map generation where you want the map to be randomized, but you want it to be predictably random based on the seed.
For most cases you don't want that, you want actually random numbers, and the best way to do that is through the Standard Library generator functions:
#include <random>
std::random_device rd;
std::map<int, int> hist;
std::uniform_int_distribution<int> dist(0, 5);
int random_die_roll = dist(rd);
No seed is required or recommended in this case. The distribution draws straight from the random device, which relies on the platform's entropy source where one is available, so the results are unpredictable.
Again, DO NOT use srand(time(NULL)) because it's a very old, very bad method for initializing random numbers and it's highly predictable. Spinning through a million possible seeds to find matching output is trivial on modern computers.
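I'm trying to seed the random function with errno:
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
int main(void){
    /* Seed from the address of errno, then reseed from the address of the
       strerror() string; the second call replaces the first seed. */
    srand((unsigned)(uintptr_t)&errno);
    srand((unsigned)(uintptr_t)strerror(0));
    return rand();
}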

How should I properly seed C++11 std::default_random_engine?

According to this post, intuitive seeding with std::random_device may not produce the expected results. In particular, if the Mersenne Twister engine is used, not all the initialization states can be reached. Using seed_seq doesn't help either, since it is not a bijection.
All of this, as far as I understand, means that std::uniform_int_distribution will not really be uniform, because not all seed values are possible.
I'd like to simply generate a couple of random numbers. While this is a really interesting topic to which I will certainly devote some of my free time, many people may not have this possibility.
So the question is: how should I properly seed the std::default_random_engine so that it simply does what I expect?
A uniform_int_distribution will still be uniform however you seed it. But better seeding can reduce chances of getting the same sequence of uniformly distributed values.
I think for most purposes using a std::seed_seq with about 8 random 32-bit ints from std::random_device should be sufficient. It is not perfect, for the reasons given in the post you linked, but if you need really secure numbers for cryptographic purposes you shouldn't really be using a pseudo random number generator anyway:
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>
constexpr std::size_t SEED_LENGTH = 8;
std::array<std::uint_fast32_t, SEED_LENGTH> generateSeedData() {
    std::array<std::uint_fast32_t, SEED_LENGTH> random_data;
    std::random_device random_source;
    // std::ref avoids copying random_device, which is not copyable.
    std::generate(random_data.begin(), random_data.end(), std::ref(random_source));
    return random_data;
}
std::mt19937 createEngine() {
    auto random_data = generateSeedData();
    std::seed_seq seed_seq(random_data.begin(), random_data.end());
    return std::mt19937{ seed_seq };
}
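If 8 words feel too few, one variation is to supply as many 32-bit words as the engine has state (std::mt19937::state_size, which is 624). This is only a sketch and does not remove the seed_seq limitation raised in the question, since seed_seq still mixes its input and is not a bijection. The function name create_engine_from and its template parameter are assumptions; any callable source returning unsigned integers, such as std::random_device, can be passed.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <random>

// Hypothetical helper: builds an mt19937 from state_size words drawn from `source`.
template <typename Source>
std::mt19937 create_engine_from(Source& source) {
    std::array<std::uint32_t, std::mt19937::state_size> seed_data;
    // Keep the low 32 bits of each value produced by the source.
    std::generate(seed_data.begin(), seed_data.end(),
                  [&source] { return static_cast<std::uint32_t>(source()); });
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937{seq};
}
```

Typical use would be std::random_device rd; auto engine = create_engine_from(rd); passing a deterministic source instead makes the result reproducible for testing.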

Achieve same random number sequence on different OS with same seed

Is there any way to achieve same random int numbers sequence in different operating system with same seed?
I have tried this code:
std::default_random_engine engine(seed);
std::uniform_int_distribution<int> dist(0, N-1);
If I ran this code on one machine multiple times with same seed, sequence of dist(engine) is the same, but on different operating system sequence is different.
Yes there is, but you need a different PRNG or, to put it exactly, the same PRNG on each platform. std::default_random_engine is an implementation-defined PRNG. That means you may not get the same PRNG on every platform. If you do not have the same one, then your chances of getting the same sequence are pretty low.
What you need is something like std::mt19937, which is required to give the same output for the same seed. In fact, all of the defined generators in <random> besides std::default_random_engine will produce the same output when using the same seed.
The other thing you need to know is that std::uniform_int_distribution is also implementation defined. The distribution it has to produce is defined, but the way it achieves that is left up to the implementor. That means you may not get the exact same output. If you need portability you will need to roll your own distribution or get a third-party one that will always be the same regardless of platform.
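One way to sketch the "roll your own distribution" advice: std::mt19937 output is fixed by the standard (the 10000th value from a default-constructed engine must be 4123659995), so a mapping onto [0, n) written purely with integer arithmetic should give the same sequence on every platform. The helper name portable_uniform is an assumption, and rejection sampling is just one possible approach.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: uniform integer in [0, n) using only the engine's raw
// 32-bit output and integer arithmetic. Requires n > 0.
std::uint32_t portable_uniform(std::mt19937& engine, std::uint32_t n) {
    const std::uint64_t range = std::uint64_t{1} << 32;  // engine yields [0, 2^32)
    const std::uint64_t limit = range - (range % n);      // largest multiple of n <= 2^32
    std::uint64_t x;
    do {
        x = engine();       // reject values in the incomplete last bucket
    } while (x >= limit);
    return static_cast<std::uint32_t>(x % n);
}
```

With std::mt19937 engine(seed); the call portable_uniform(engine, N) plays the role of dist(engine) for the range 0 to N-1 from the question.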

Random number generator repeats every time?

I'm trying to find a random number generator that will give me a single random number each time I run it. I have spent a week trying dozens of different ones, both from this site and others. Every time I run it, it gives me the same number! The only time it changes is if I change the range, and then it just gives me the new number over and over.
I am running Code::Blocks ver. 16.01 on Windows 7. Can anyone help?? I'm at my wits' end!
This code gives me a decently random string of numbers, but still the same string each time!
#include <iostream>
#include <random>
int main()
{
    std::random_device rd;
    std::mt19937 eng(rd());
    std::uniform_int_distribution<> distr(0, 10);
    for(int n=0; n<100; ++n)
        std::cout << distr(eng) << '\t';
}
I have tried the code on my compiler app on my phone as well.
Every pseudo random number generator will return the same sequence of numbers for the same initial seed value.
What you want to do is to use a different seed every time you run the program. Otherwise you'll just be using the same default seed every time and get the same values.
Picking good seeds is not as easy as you might think. Using the output from time(nullptr), for example, still gives the same results if two copies of the program run within the same second. Using the value of getpid() is also bad, since pid values wrap and thus sometimes you'll get the same value for different runs. Luckily you have other options. std::seed_seq lets you combine multiple bad sources and returns a good (or rather, pretty good) seed value you can use. There is also std::random_device, which (on all sane implementations) returns raw entropy and is perfect for seeding a pseudo-random generator; you can also use it directly if it is fast enough for your purpose. If you are worried that it might be implemented as a PRNG on your implementation, you can combine it with std::seed_seq and the bad sources to seed a generator.
I would advise you to read this page: http://en.cppreference.com/w/cpp/numeric/random for an overview of how to deal with random number generation in modern C++.
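A sketch of the combined approach described above, for setups where std::random_device may be deterministic. The helper name make_mixed_engine and the particular choice of extra sources (a clock reading and a stack address) are assumptions for illustration, not a vetted recipe.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper: mixes several seed sources through std::seed_seq so that
// a deterministic std::random_device does not by itself fix the sequence.
std::mt19937 make_mixed_engine() {
    std::random_device rd;
    const auto ticks =
        std::chrono::high_resolution_clock::now().time_since_epoch().count();
    int local = 0;  // its address may differ between runs (for example under ASLR)
    std::seed_seq seq{
        static_cast<std::uint32_t>(rd()),
        static_cast<std::uint32_t>(ticks),                               // low clock bits
        static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) >> 32),
        static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(&local))
    };
    return std::mt19937{seq};
}
```

In the program from the question, std::mt19937 eng = make_mixed_engine(); would replace the line that seeds eng from rd() alone.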
The standard allows std::random_device to be implemented in terms of a pseudo-random number generator if there is no real random source on the system.
You may need to find a different entropy source, such as the time, or user touch co-ordinates.