The randint proposal is supposed to provide an intuitive interface for getting random integral numbers. The problem however is that it uses default_random_engine and so far has no way of letting you specify the engine it uses. libstdc++ and libc++ both use minstd_rand which is a LCG, which I've read creates poor quality numbers. MSVC uses mt19937 which seems to be the most common on this site.
Is this a big negative of randint?
Related
As we know, the Mersenne Twister is not crytographically secure:
Mersenne Twister is not cryptographically secure. (MT is based on a
linear recursion. Any pseudorandom number sequence generated by a
linear recursion is insecure, since from sufficiently long subsequencje
of the outputs, one can predict the rest of the outputs.)
But many sources, like Stephan T. Lavavej and even this website. The advice is almost always (verbatim) to use the Mersenne Twister like this:
auto engine = mt19937{random_device{}()};
They come in different flavors, like using std::seed_seq or complicated ways of manipulating std::tm, but this is the simplest approach.
Even though std::random_device is not always reliable:
std::random_device may be implemented in terms of an
implementation-defined pseudo-random number engine if a
non-deterministic source (e.g. a hardware device) is not available to
the implementation. In this case each std::random_device object may
generate the same number sequence.
The /dev/urandom vs /dev/random debate rages on.
But while the standard library provides a good collection of PRNGs, it doesn't seem to provide any CSPRNGs. I prefer to stick to the standard library rather than using POSIX, Linux-only headers, etc. Can the Mersenne Twister be manipulated to make it cryptographically secure?
Visual Studio guarantees that random_device is cryptographically secure and non-deterministic:
https://msdn.microsoft.com/en-us/library/bb982250.aspx
If you want something faster or cross platform, you could for example use GnuTLS: http://gnutls.org/manual/html_node/Random-number-generation.html
It provides random numbers of adjustable quality. GNUTLS_RND_RANDOM is what you want I think.
As several people already said, please forget about MT in cryptographic contexts.
when I want to generate random numbers using std::random, which engine should I prefer? the std::default_random_engine or the std::mt19937? what are the differences?
For lightweight randomnes (e.g. games), you could certainly consider default_random_engine. But if your code depends heavily on quality of randomness (e.g. simulation software), you shouldn't use it, as it gives only minimalistic garantees:
It is the library implemention's selection of a generator that
provides at least acceptable engine behavior for relatively casual,
inexpert, and/or lightweight use.
The mt19937 32 bits mersene twister (or its 64 bit counterpart mt19937_64) is on the other side a well known algorithm that passes very well statistical randomness tests. So it's ideal for scientific applications.
However, you shall consider neither of them, if your randomn numbers are meant for security (e.g. cryptographic) purpose.
The question is currently having one close vote as primary opinion based. I would argue against that and say that std::default_random_engine is objectively a bad choice, since you don't know what you get and switching standard libraries can give you different results in the quality of the randomness you receive.
You should pick whatever random number generator gives you the kind of qualities you are looking for. If you have to pick between the two, go with std::mt19937 as it gives you predictable and defined behaviour.
They address different needs. The first is an implementation-defined alias for a certain generator whilst the latter specifically uses the Mersenne-Twister algorithm with a 32 bit seed.
If you don't have particular requirements, std::default_random_engine should be ok.
I am using the Armadillo c++ library, that allows high-perfomance computation of matrices and vectors. This library has built-in functions to populate its objects with random numbers. I use it in the context of a procedurial random generation of an object. The object creation is random, but no matter how often I recreate the object, it remains the same as long as the seed remains the same.
The issue is that, although I can set the seed to a determined value, and thus recreate the same run on my machine... I lose the coherence of the randomness when going to a different computer. I come from the enchanted land of Matlab where I can specify the function used for the generation of pseudo-random numbers. So, this generation can be cross platform if one chooses the function well. But how do I specify the RNG function for Armadillo?
My research has led me to this source documentation, that "detail" the process of random number generation:
http://arma.sourceforge.net/internal_docs_4300/a01181_source.html
http://arma.sourceforge.net/internal_docs_4300/a00087.html
But i have no clue on what to do here: this code is much more advanced than what I can write. I would appreciate any help!
Thank you guys!
Remarks:
- I do not care how good the random function used is. I just want a fast cross-platform cross-architecture generator. Deterministic randomness is my goal anyway.
- In details, in case it matters, the machines to consider should be intel processors, windows or mac, 32b or 64b.
- I have read the several posts mentionning the use of seeds for randomness but it seems that the problem here is the cross-platform context and the fact that the random generator is buried (to my untrained eyes at least) within Armadillo's code.
In C++98 / C++03 mode, Armadillo will internally use std::rand() for generating random numbers (there's more to it, but that's a good approximation of what's happening).
If you move from one operating system to the next (or across two versions of the same operating system), there is no guarantee that the system provided random number generator will be the same.
If you use Armadillo in C++11 mode, you can use any random number generator you like, with the help of the .imbue() function. Example:
std::mt19937 engine; // Mersenne twister random number engine with default parameters
std::uniform_real_distribution<double> distr(0.0, 1.0);
mat A(123,456);
A.imbue( [&]() { return distr(engine); } ); // fill with random numbers provided by the engine
The Mersenne twister random number engine is provided as standard functionality in C++11. The default parameters should be stable across compiler vendors and versions, and are independent of the operating system.
Quoting from cppreference:
std::random_device is a non-deterministic random number engine, although implementations are allowed to implement std::random_device using a pseudo-random number engine if there is no support for non-deterministic random number generation.
Is there a way to check whether current implementation uses PRNG instead of RNG (and then say exit with an error) and if not, why not?
Note that little bit of googling shows that at least MinGW implements std::random_device in this way, and thus this is real danger if std::random_device is to be used.
---edit---
Also, if the answer is no and someone could give some insight as to why there is no such function/trait/something I would be quite interested.
Is there a way to check whether current implementation uses PRNG instead of RNG (and then say exit with an error) and if not, why not?
There is a way: std::random_device::entropy will return 0.0 if it is implemented in terms of a random number engine (that is, it's deterministic).
From the standard:
double entropy() const noexcept;
Returns: If the implementation employs a random number engine, returns 0.0. Otherwise, returns an entropy estimate for the random numbers returned by operator(), in the range min() to log_2(max() + 1).
There is no 100% safe way to determine real randomness for sure. With a black box approach the best you could do do is to show evidence if it's not fully random:
first you could verify that the distribution seems random, by generating a lot of random munmbers and making statistics about their distribution (e.g. generate 1 million random numbers between 0 and 1000). If it appears that some numbers come out significantly more often than other, then obviously it's not really random.
THe next you can is to run several time a programme generating random numbers after the same initial seed. If you obtain the same sequence of random numbers then it's definitively PRNG and not real randmness. However, if you don't obtain the same sequence it does not proove tanything: the library could use some kind of auto-seed (using clock ticks or something else) to hide/improve the pseudo-randmness.
If your application highly depends on randomness quality (e.g. cryptographic quality) you should consider some more tests, such as those recommended by NIST SP 800-22
Xarn stated above:
However, said pessimism also precludes this method from differentiating between RNG and PRNG based implementation, making it rather unhelpful. Also VC++ could be realistic, but to check that would probably require a lot of insider knowledge about Windows.
If you debug into the Windows implementation, then you will find that you end up in RtlGenRandom, which is one of the better sources of cryptographically random bytes. If you debug into the Linux implementation, then you should end up reading from dev/urandom, which is also OK. The fact that they don't tell us that we're not using something awful, like rand, is annoying.
PS - you don't have to have internal Windows knowledge, you just need to attach the symbols to the debugger.
I want to supply a number, and then receive a set of random numbers. However, I want those numbers to be the same regardless of which computer I run it on (assuming I supply the same seed).
Basically my question is: in C++, if I make use of rand(), but supply srand() with a user-defined seed rather than the current time, will I be able to generate the same random number stream on any computer?
There are dozens of PRNGs available as libraries. Pick one. I tend to use Mersenne Twister.
By using an externally supplied library, you bypass the risk of a weird or buggy implementation of your language's library rand(). As long as your platforms all conform to the same mathematical semantics, you'll get consistent results.
MT is a favorite of mine because I'm a physicist, and I use these things for Monte Carlo, where the guarantee of equal-distribution to high dimensions is important. But don't use MT as a cryptographic PRNG!
srand() & rand() are not part of the STL. They're actually part of the C runtime.
Yes, they will produce the same results as long as it's the same implementation of srand()/rand().
Depending on your needs, you might want to consider using Boost.Random. It provides several high-quality random number generators.
Assuming the implementations of rand() are the same, yes.
The easiest way to ensure this is to include a known rand() implementation with your program - either included in your project's source code or in the form of a library you can manage.
No, the ANSI C standard only specifies that rand() must produce a stream of random integers between 0 and RAND_MAX, which must be at least 32767 (source). This stream must be deterministic only in that, for a given implementation on a given machine, it must produce the same integer stream given the same seed.
You want a portable PRNG. Mersenne Twister (many implementations linked at the bottom) is pretty portable, as is Ben Pfaff's homegrown C99-compliant PRNG. Boost.Random should be fine too; as you're writing your code in C++, using Boost doesn't limit your choice of platforms much (although some "lesser" (i.e. non-compliant) compilers may have trouble with its heavy use of template metaprogramming). This is only really a problem for low-volume embedded platforms and perhaps novel research architectures, so if by "any computer" you mean "any x86/PPC/ARM/SPARC/Alpha/etc. platform that GCC targets", any of the above should do just fine.
Write your own pseudorandom number routine. There are a lot of algorithms documented on the internet, and they have a number of applications where rand isn't good enough (e.g. Perlin Noise).
Try these links for starters:
http://en.wikipedia.org/wiki/Linear_congruential_generator
http://en.wikipedia.org/wiki/Pseudorandom_number_generator
I believe if you supply with srand with the same seed, you will get the same results. That's pretty much the definition of a seed in terms of pseudo random number generators.
Yes. For a given seed (starting value), the sequence of numbers that rand() returns will always be the same.