Is std::random_device cryptographically secure? - c++

I see many people talk about security and std::random_device together.
For example, here slide 22.
According to cppreference, std::random_device :
std::random_device is a uniformly-distributed integer random number generator that produces non-deterministic random numbers.
It does not talk about security explicitly.
Is there any valid reference that explicitly mentions std::random_device is secure for cryptography?

No, because that's not what std::random_device is designed for; it's designed to generate random numbers, not to be secure.
In the context of security, randomness is something that is useful for key generation, but randomness is not something that is absolutely needed. For example, AES does not use any randomness, yet AES-256 is what is used to encrypt top secret information in the US.
One area where randomness and security cross, is when a random key is generated and used; if I can guess the seed and know the random protocol used, there's a good chance I can then use that same seed value to generate the same "random" value and thus the same key.
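To make the guessable-seed problem concrete, here is a minimal sketch. The helper names (weak_key_from_seed, attacker_recovers_key) are made up for illustration, and std::mt19937 stands in for "the random protocol used"; real keys should never be generated this way.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical helper, for illustration only: derives a 16-byte "key" from a
// 32-bit seed using std::mt19937. Do NOT generate real keys this way.
std::array<std::uint8_t, 16> weak_key_from_seed(std::uint32_t seed) {
    std::mt19937 engine(seed);
    std::array<std::uint8_t, 16> key{};
    for (auto& byte : key) {
        // Keep only the low 8 bits of each engine output.
        byte = static_cast<std::uint8_t>(engine() & 0xFFu);
    }
    return key;
}

// An attacker who guesses the seed and knows the algorithm gets the same key.
bool attacker_recovers_key(std::uint32_t victim_seed, std::uint32_t guessed_seed) {
    return weak_key_from_seed(victim_seed) == weak_key_from_seed(guessed_seed);
}
```

With only 32 bits of seed, an attacker can simply try every possible seed, which is why the quality of the seed source matters as much as the algorithm.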
std::random_device will use a hardware module (like a hardware TPM) if one is available, otherwise it will use whatever the OS has as a RNG (like CryptGenRandom in Windows, or /dev/random in *nix systems), which might even be a PRNG (pseudo-random number generator), which might generate the same number depending on the random number algorithm used. As a side note: much like how the AES instruction set was incorporated into chipsets to speed up encryption and decryption, hardware RNG's help to give a larger entropy pool and faster random number generation as the algorithms are moved into hardware.
So if you are using std::random_device in any sort of cryptographic key generation, you'll need to be aware of which random number generator is used on the system you deploy to; otherwise you can get collisions, and your encrypted system can become susceptible to duplicate-key attacks.
Hope that can help.

TL;DR: only use std::random_device to generate seeds for the PRNG's defined within this library. Otherwise use a cryptographic library such as Crypto++, Botan, OpenSSL etc. to generate secure random numbers.
To have an idea why std::random_device is required it is important to see it in the light of the context for which it was defined.
std::random_device is part of a set of classes and methods that are used to generate deterministic/pseudo random number sequences fast. One example - also shown in the slides - is the Mersenne twister algorithm, which is certainly not cryptographically secure.
Now this is all very nice, but as the defined algorithms are all deterministic, this is arguably not what users are after: they want a fast random number generator that doesn't produce the same stream every time. Some kind of entropy source is required to seed the insecure PRNG. This is where std::random_device comes into play: it is used to seed the Mersenne twister (as shown in the slides referred to in the question).
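A minimal sketch of that seeding pattern, using only the standard library. The function names are mine, and the choice of 8 seed words in the second variant is an arbitrary assumption rather than a recommendation. Neither variant makes the Mersenne twister suitable for cryptography.

```cpp
#include <array>
#include <random>

// Simplest form: one word from std::random_device as the seed.
std::mt19937 make_engine_simple() {
    std::random_device device;
    return std::mt19937{device()};
}

// Variant: gather several words and feed them through std::seed_seq,
// so the engine's state is not derived from a single 32-bit value.
std::mt19937 make_engine_seed_seq() {
    std::random_device device;
    std::array<std::random_device::result_type, 8> words{};
    for (auto& w : words) {
        w = device();
    }
    std::seed_seq seq(words.begin(), words.end());
    return std::mt19937{seq};
}

// Example use: a six-sided die driven by the seeded engine.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> die(1, 6);
    return die(engine);
}
```

The slow system source is touched only once, at construction; every later draw comes from the fast local engine.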
The slides show a speed difference of about 250 times for the Mersenne twister and the slow system provided non-deterministic random number generator. This clearly demonstrates why local, deterministic PRNG's can help to speed up random number generation.
Also note that local PRNG's won't slow down much when used from multiple threads. The system generator could be speedy when accessed by multiple threads, but this is certainly not a given. Sometimes system RNG's may even block or have latency or related issues.
As mentioned in the comments below the question, the contract for std::random_device is rather weak. It talks about using a "deterministic" generator if a system generator is not available. Of course, on most desktop / server configurations such a device (e.g. /dev/random or the non-blocking /dev/urandom device) is available. In that case std::random_device is rather likely to return a secure random number generator. However, you cannot rely on this to happen on all system configurations.
If you require a relatively fast secure random number generator, I would recommend using a cryptographic library such as OpenSSL or Crypto++ instead of either an insecure fast generator or the relatively slow system random generator. OpenSSL, for instance, will use the system random generator (as well as other entropy sources) to seed a more secure algorithm.

Related

Is there a C++11 CSPRNG?

As we know, the Mersenne Twister is not cryptographically secure:
Mersenne Twister is not cryptographically secure. (MT is based on a
linear recursion. Any pseudorandom number sequence generated by a
linear recursion is insecure, since from a sufficiently long subsequence
of the outputs, one can predict the rest of the outputs.)
But many sources, like Stephan T. Lavavej and even this website, recommend the Mersenne Twister anyway. The advice is almost always (verbatim) to use it like this:
auto engine = mt19937{random_device{}()};
They come in different flavors, like using std::seed_seq or complicated ways of manipulating std::tm, but this is the simplest approach.
Even though std::random_device is not always reliable:
std::random_device may be implemented in terms of an
implementation-defined pseudo-random number engine if a
non-deterministic source (e.g. a hardware device) is not available to
the implementation. In this case each std::random_device object may
generate the same number sequence.
The /dev/urandom vs /dev/random debate rages on.
But while the standard library provides a good collection of PRNGs, it doesn't seem to provide any CSPRNGs. I prefer to stick to the standard library rather than using POSIX, Linux-only headers, etc. Can the Mersenne Twister be manipulated to make it cryptographically secure?
Visual Studio guarantees that random_device is cryptographically secure and non-deterministic:
https://msdn.microsoft.com/en-us/library/bb982250.aspx
If you want something faster or cross platform, you could for example use GnuTLS: http://gnutls.org/manual/html_node/Random-number-generation.html
It provides random numbers of adjustable quality. GNUTLS_RND_RANDOM is what you want I think.
As several people already said, please forget about MT in cryptographic contexts.

Should I use std::default_random_engine or should I use std::mt19937?

When I want to generate random numbers using the C++11 <random> facilities, which engine should I prefer: std::default_random_engine or std::mt19937? What are the differences?
For lightweight randomness (e.g. games), you could certainly consider default_random_engine. But if your code depends heavily on the quality of randomness (e.g. simulation software), you shouldn't use it, as it gives only minimal guarantees:
It is the library implementation's selection of a generator that
provides at least acceptable engine behavior for relatively casual,
inexpert, and/or lightweight use.
The 32-bit Mersenne Twister mt19937 (or its 64-bit counterpart mt19937_64) is, on the other hand, a well-known algorithm that performs very well in statistical randomness tests. So it's ideal for scientific applications.
However, you should consider neither of them if your random numbers are meant for security (e.g. cryptographic) purposes.
The question currently has one close vote as primarily opinion-based. I would argue against that and say that std::default_random_engine is objectively a bad choice, since you don't know what you get, and switching standard libraries can change the quality of the randomness you receive.
You should pick whatever random number generator gives you the kind of qualities you are looking for. If you have to pick between the two, go with std::mt19937 as it gives you predictable and defined behaviour.
They address different needs. The first is an implementation-defined alias for some generator, whilst the latter specifically uses the Mersenne Twister algorithm with a 32-bit word size.
If you don't have particular requirements, std::default_random_engine should be ok.
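To see what "implementation-defined alias" means on a given toolchain, a small sketch like the following can help. The function name is mine, and the list of candidates is an assumption about which engines are commonly chosen; anything else falls through to "other".

```cpp
#include <random>
#include <string>
#include <type_traits>

// Reports which standard engine std::default_random_engine aliases on the
// current implementation, checked against a few common candidates.
std::string default_engine_name() {
    using D = std::default_random_engine;
    if (std::is_same<D, std::minstd_rand0>::value) return "minstd_rand0";
    if (std::is_same<D, std::minstd_rand>::value)  return "minstd_rand";
    if (std::is_same<D, std::mt19937>::value)      return "mt19937";
    if (std::is_same<D, std::mt19937_64>::value)   return "mt19937_64";
    return "other (implementation-specific)";
}
```

If the answer differs between two of your build environments, the same seed will give different sequences on each, which is the portability concern raised above.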

How to create the same random numbers on different computers with Armadillo?

I am using the Armadillo C++ library, which allows high-performance computation on matrices and vectors. This library has built-in functions to populate its objects with random numbers. I use it in the context of procedural random generation of an object. The object creation is random, but no matter how often I recreate the object, it remains the same as long as the seed remains the same.
The issue is that, although I can set the seed to a determined value, and thus recreate the same run on my machine... I lose the coherence of the randomness when going to a different computer. I come from the enchanted land of Matlab where I can specify the function used for the generation of pseudo-random numbers. So, this generation can be cross platform if one chooses the function well. But how do I specify the RNG function for Armadillo?
My research has led me to this source documentation, which "details" the process of random number generation:
http://arma.sourceforge.net/internal_docs_4300/a01181_source.html
http://arma.sourceforge.net/internal_docs_4300/a00087.html
But I have no clue what to do here: this code is much more advanced than what I can write. I would appreciate any help!
Thank you guys!
Remarks:
- I do not care how good the random function used is. I just want a fast cross-platform cross-architecture generator. Deterministic randomness is my goal anyway.
- In detail, in case it matters, the machines to consider have Intel processors, Windows or Mac, 32-bit or 64-bit.
- I have read the several posts mentioning the use of seeds for randomness, but it seems that the problem here is the cross-platform context and the fact that the random generator is buried (to my untrained eyes at least) within Armadillo's code.
In C++98 / C++03 mode, Armadillo will internally use std::rand() for generating random numbers (there's more to it, but that's a good approximation of what's happening).
If you move from one operating system to the next (or across two versions of the same operating system), there is no guarantee that the system provided random number generator will be the same.
If you use Armadillo in C++11 mode, you can use any random number generator you like, with the help of the .imbue() function. Example:
std::mt19937 engine; // Mersenne twister random number engine with default parameters
std::uniform_real_distribution<double> distr(0.0, 1.0);
mat A(123,456);
A.imbue( [&]() { return distr(engine); } ); // fill with random numbers provided by the engine
The Mersenne twister random number engine is provided as standard functionality in C++11. The default parameters should be stable across compiler vendors and versions, and are independent of the operating system.
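A quick way to check the engine's stability on your own machines: the C++ standard requires that the 10000th consecutive output of a default-constructed std::mt19937 be 4123659995. The sketch below (function name is mine) reproduces that check. One caveat: this guarantee covers the engine only. The algorithms behind distributions such as std::uniform_real_distribution are not specified by the standard, so their output may differ between standard libraries even when the engine's raw output is identical.

```cpp
#include <cstdint>
#include <random>

// Advances a default-constructed std::mt19937 and returns its n-th output
// (1-based, so n must be at least 1). The standard fixes the engine's
// algorithm and default seed, so this value should not depend on the
// compiler or the operating system.
std::uint_fast32_t nth_default_mt19937_output(unsigned long long n) {
    std::mt19937 engine;    // default-constructed: standard default seed
    engine.discard(n - 1);  // skip the first n-1 outputs
    return engine();
}
```

If nth_default_mt19937_output(10000) returns 4123659995 on every target platform, the raw engine stream is consistent; for fully reproducible matrices you may want to map raw outputs to doubles yourself instead of relying on a library distribution.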

How to find the "true" entropy of std::random_device?

I want to check whether my implementation of std::random_device
has non-zero entropy (i.e. is non-deterministic), using std::random_device::entropy() function. However, according
to cppreference.com
"This function is not fully implemented in some standard libraries.
For example, gcc and clang always return zero even though the device
is non-deterministic. In comparison, Visual C++ always returns 32,
and boost.random returns 10."
Is there any way of finding the real entropy? In particular, do modern
computers (MacBook Pro/iMac etc) have a non-deterministic source of randomness, e.g. using heat dissipation monitors?
I recommend reading this article:
Myths about /dev/urandom
§ 26.5.6
A random_device uniform random number generator produces non-deterministic random numbers.
If implementation limitations prevent generating non-deterministic random numbers, the implementation may employ a random number engine.
So basically it will try to use the system's internal "true" random number generator: /dev/{u}random on Linux, or RtlGenRandom on Windows.
A different matter is whether you trust those sources of randomness, since they depend on internal noise or are closed implementations.
There is also the question of how you measure the quality of the entropy; as you know, that is one of the biggest problems in trying to find good random number generators.
One estimation could report extremely good entropy while another reports not-so-good entropy.
Entropy Estimation
In various science/engineering applications, such as independent
component analysis, image analysis, genetic analysis, speech
recognition, manifold learning, and time delay estimation it is useful
to estimate the differential entropy of a system or process, given
some observations.
As it says, you must rely on final observations, and those can be wrong.
If you think the internal RNG is not good enough, you can always try to buy hardware devices for that purpose. This list on Wikipedia has a list of vendors; you can check reviews about them on the internet.
Performance
One point you must consider is the performance of your application when using real random number generators. One common technique is to seed a Mersenne twister with a number obtained from /dev/random.
If the user can't access your system physically, you will need to balance reliability with availability: a system with security holes is as bad as one that doesn't work. In the end, you must have your important data encrypted.
Edit 1: As suggested, I have moved the article to the top of my answer; it is a good read. Thanks for the hint :-).
All the standard gives you is what you've already seen. You would need to know something about how a given standard library implements random_device in order to answer this question. For example, in Visual Studio 2013 Update 4, random_device forwards to rand_s which forwards to RtlGenRandom, which may actually be (always?) a cryptographically secure pseudorandom number generator depending on your Windows version and the hardware available.
If you don't trust the platform to provide a good source of entropy, then you should use your own cryptographically secure PRNG, such as one based on AES. That said, platform vendors have strong incentives for their random numbers to actually be random, and embedding the PRNG into your app means that the PRNG can't be updated as easily in the event it is found to be insecure. Only you can decide on that tradeoff for yourself :)
Entropy is just one measure of RNG quality (and true, exact entropy is impossible to measure). For a practical and reasonably-accurate measurement of your std::random_device's random number quality, consider using a standard randomness test suite such as TestU01, diehard, or its successor dieharder. These run a battery of statistical tests designed to stress your RNG, ensuring it produces statistically random data.
Note that statistical randomness by itself does not certify that the RNG is suitable for cryptographic applications.
Many modern computers have easily-accessible sources of hardware randomness, namely the analog-to-digital converters found in the audio input, camera, and various sensors. These exhibit low-level thermal or electrical noise which can be exploited to produce high-quality random data. However, no OS that I know of actually uses these sensors to supply their system random-number sources (such as /dev/[u]random), since the bitrate of such physical random number sources tends to be very low.
Instead, OS-provided random number sources tend to be seeded by hardware counters and events, such as page faults, device driver events, and other sources of unpredictability. In theory, these events might be fully predictable given the precise hardware state (since they aren't based on e.g. quantum or thermal noise), but in practice they are sufficiently unpredictable that they produce good random data.
Entropy as a scientific term is misused when describing random numbers. Complexity might be a better term. Entropy in physics is defined as the logarithm of the number of available quantum states (not useful in RNG), and entropy in information theory is defined by the Shannon entropy, but that is geared towards the other extreme - how to put as much information into a noisy bit stream, not how to minimize the information.
For example, the digits of Pi look random, but the actual entropy of the digits is zero once you know that they derive from Pi. Increasing "Entropy" in RNG is basically a question of making the source of the numbers as obscure as possible.

Is there a way to check if std::random_device is in fact random?

Quoting from cppreference:
std::random_device is a non-deterministic random number engine, although implementations are allowed to implement std::random_device using a pseudo-random number engine if there is no support for non-deterministic random number generation.
Is there a way to check whether current implementation uses PRNG instead of RNG (and then say exit with an error) and if not, why not?
Note that a little bit of googling shows that at least MinGW implements std::random_device in this way, and thus this is a real danger if std::random_device is to be used.
---edit---
Also, if the answer is no and someone could give some insight as to why there is no such function/trait/something I would be quite interested.
Is there a way to check whether current implementation uses PRNG instead of RNG (and then say exit with an error) and if not, why not?
There is a way: std::random_device::entropy will return 0.0 if it is implemented in terms of a random number engine (that is, it's deterministic).
From the standard:
double entropy() const noexcept;
Returns: If the implementation employs a random number engine, returns 0.0. Otherwise, returns an entropy estimate for the random numbers returned by operator(), in the range min() to log_2(max() + 1).
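A minimal sketch of that check (function names are mine). Treat a zero as inconclusive rather than as proof of a deterministic implementation: some standard libraries have been known to return 0.0 from entropy() even when the underlying device is non-deterministic, and others return a fixed constant.

```cpp
#include <random>

// Returns the entropy estimate reported by std::random_device.
double reported_entropy() {
    std::random_device device;
    return device.entropy();
}

// True when the implementation claims a non-deterministic source.
// A false result does not prove the source is deterministic.
bool device_reports_entropy() {
    return reported_entropy() > 0.0;
}
```

A program that wants to "exit with an error" could test device_reports_entropy() at startup, with the caveat that this may reject perfectly good implementations.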
There is no 100% safe way to determine real randomness for sure. With a black box approach, the best you can do is to find evidence that it is not fully random:
First, you could verify that the distribution seems random by generating a lot of random numbers and computing statistics on their distribution (e.g. generate 1 million random numbers between 0 and 1000). If some numbers come out significantly more often than others, then obviously it's not really random.
Next, you can run a program several times that generates random numbers from the same initial seed. If you obtain the same sequence of random numbers, then it's definitely a PRNG and not real randomness. However, if you don't obtain the same sequence, it does not prove anything: the library could use some kind of auto-seed (using clock ticks or something else) to hide/improve the pseudo-randomness.
If your application depends heavily on randomness quality (e.g. cryptographic quality), you should consider some more tests, such as those recommended by NIST SP 800-22.
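The two black-box checks described above could be sketched roughly as follows. All names here are mine, the modulo bucketing is a simplification, and the sample sizes are just the ones from the example in the text. This is a rough screening aid, not a substitute for a proper test suite.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Chi-square statistic for how evenly `next()` fills `buckets` bins.
// Values are mapped with a modulo, which is slightly biased unless the
// generator's range is a multiple of `buckets`; fine for a rough check.
template <class Next>
double chi_square_uniform(Next next, std::size_t buckets, std::size_t samples) {
    std::vector<std::size_t> counts(buckets, 0);
    for (std::size_t i = 0; i < samples; ++i) {
        ++counts[static_cast<std::size_t>(next()) % buckets];
    }
    const double expected =
        static_cast<double>(samples) / static_cast<double>(buckets);
    double chi2 = 0.0;
    for (std::size_t c : counts) {
        const double diff = static_cast<double>(c) - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}

// True if two generators produce identical values for the next n draws.
template <class Gen>
bool same_sequence(Gen& a, Gen& b, int n) {
    for (int i = 0; i < n; ++i) {
        if (a() != b()) return false;
    }
    return true;
}

// Check 1 applied to std::random_device: 1 million draws into 1000 buckets.
double random_device_chi_square() {
    std::random_device device;
    return chi_square_uniform([&device] { return device(); }, 1000, 1000000);
}

// Check 2 applied to std::random_device: do two fresh objects agree?
bool random_device_repeats(int n) {
    std::random_device a, b;
    return same_sequence(a, b, n);
}
```

For a uniform source, the chi-square statistic should land near the number of buckets minus one; a value far above that hints at a skewed distribution. If random_device_repeats returns true, the implementation is almost certainly backed by a fixed-seed PRNG; as noted above, a false result proves nothing.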
Xarn stated above:
However, said pessimism also precludes this method from differentiating between RNG and PRNG based implementation, making it rather unhelpful. Also VC++ could be realistic, but to check that would probably require a lot of insider knowledge about Windows.
If you debug into the Windows implementation, then you will find that you end up in RtlGenRandom, which is one of the better sources of cryptographically random bytes. If you debug into the Linux implementation, then you should end up reading from /dev/urandom, which is also OK. The fact that they don't tell us that we're not using something awful, like rand, is annoying.
PS - you don't have to have internal Windows knowledge, you just need to attach the symbols to the debugger.