I'm looking to encrypt license keys on an audio software plugin. The biggest risk to the integrity of the license keys is small-time crackers decompiling the code and looking for the encryption key. My solution is to store an arbitrary number in the code and feed it to an algorithm that will obfuscate the encryption key while still allowing me to vary the key between projects (I'm a freelancer).
My question is: will seeding the C++ random number generator create the same pseudo-random encryption key every time, or will it differ between runs, libraries, etc.? It's fine if it differs between operating systems; I just need it to not differ between SDKs and host applications on the same computer.
srand and rand will produce the same sequence of numbers when you use the same implementation. Change compilers, even to a newer version of the same compiler, and there are no guarantees.
But the new random number engines, introduced in C++11 and defined in <random>, are required to generate the same sequence of numbers on all implementations.
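As a minimal sketch of that idea, assuming all you need is a repeatable byte sequence expanded from a number stored in the code: the helper name derive_bytes and the seed value are hypothetical, not part of any library. The sketch uses only the raw output of std::mt19937, because it is the engines that are specified exactly. Keep in mind that this is obfuscation only; the Mersenne Twister is not cryptographically secure.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: expands a hard-coded seed into `count` bytes.
// std::mt19937 is fully specified by the standard, so the same seed should
// yield the same bytes on every conforming implementation.
std::vector<std::uint8_t> derive_bytes(std::uint32_t seed, std::size_t count) {
    std::mt19937 engine(seed);
    std::vector<std::uint8_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Use the raw engine output directly and keep only the low byte.
        out.push_back(static_cast<std::uint8_t>(engine() & 0xFFu));
    }
    return out;
}
```

Calling derive_bytes(1234u, 16) twice returns the same 16 bytes. The standard itself pins the engine down with a check value: the 10000th output of a default-constructed std::mt19937 must be 4123659995.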
Related
I'm trying to re-implement the RSA key generation in C++ (as a hobby/learning playground) and by far my biggest problem seems to be generating a random number in range x,y which is also cryptographically secure (the primes p and q, for example).
I suppose using mt19937 or std::rand with a secure random seed (e.g. /dev/urandom or OpenSSL RAND_bytes etc) would not be considered 'cryptographically secure' in this case (RSA)?
ISAAC looked promising but I have zero clue on how to use it since I wasn't able to find any documentation at all.
Notably, this is also my first C++ project (I've done some C, Rust etc before... So C++ at least feels somewhat familiar and I'm not a complete newbie, mind you).
I suppose using mt19937 or std::rand with a secure random seed (e.g. /dev/urandom or OpenSSL RAND_bytes etc) would not be considered 'cryptographically secure' in this case (RSA)?
No, those are not cryptographically secure for basically any purpose.
ISAAC looked promising but I have zero clue on how to use it since I wasn't able to find any documentation at all.
Well, it has stood the test of time, I suppose. But I'd simply use a C++ library such as Crypto++ or Botan or something similar and then just implement the RSA key pair generation bit, borrowing one of their secure random generators. With a bit of luck they also have a bignum library so that you don't have to implement that either.
I see many people talk about security and std::random_device together.
For example, here slide 22.
According to cppreference, std::random_device :
std::random_device is a uniformly-distributed integer random number generator that produces non-deterministic random numbers.
It does not talk about security explicitly.
Is there any valid reference that explicitly mentions std::random_device is secure for cryptography?
No, because that's not what std::random_device is designed for; it's designed to generate random numbers, not to be secure.
In the context of security, randomness is something that is useful for key generation, but randomness is not something that is absolutely needed. For example, AES does not use any randomness, yet AES-256 is what is used to encrypt top secret information in the US.
One area where randomness and security cross, is when a random key is generated and used; if I can guess the seed and know the random protocol used, there's a good chance I can then use that same seed value to generate the same "random" value and thus the same key.
std::random_device will use a hardware module (like a hardware TPM) if one is available, otherwise it will use whatever the OS has as a RNG (like CryptGenRandom in Windows, or /dev/random in *nix systems), which might even be a PRNG (pseudo-random number generator), which might generate the same number depending on the random number algorithm used. As a side note: much like how the AES instruction set was incorporated into chipsets to speed up encryption and decryption, hardware RNG's help to give a larger entropy pool and faster random number generation as the algorithms are moved into hardware.
So if you are using std::random_device in any sort of cryptographic key generation, you'll need to be aware what random number generator is being used on the system being deployed to, otherwise you can have collisions and thus your encrypted system can be susceptible to duplicate key types of attack.
Hope that can help.
TL;DR: only use std::random_device to generate seeds for the defined PRNG's within this library. Otherwise use a cryptographic library such as Crypto++, Botan, OpenSSL etc. to generate secure random numbers.
To have an idea why std::random_device is required it is important to see it in the light of the context for which it was defined.
std::random_device is part of a set of classes and methods that are used to generate deterministic/pseudo random number sequences fast. One example - also shown in the slides - is the Mersenne twister algorithm, which is certainly not cryptographically secure.
Now this is all very nice, but as the defined algorithms are all deterministic, this is arguably not what the users may be after: they want a fast random number generator that doesn't produce the same stream all of the time. Some kind of entropy source is required to seed the insecure PRNG. This is where std::random_device comes into action, it is used to seed the Mersenne twister (as shown in the slides referred to in the answer).
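A minimal sketch of that seeding pattern follows. The helper name make_seeded_engine and the choice of eight seed words are assumptions made for illustration; the result is a fast, non-repeating generator, not a secure one.

```cpp
#include <array>
#include <random>

// Hypothetical helper: builds a Mersenne Twister seeded from std::random_device.
std::mt19937 make_seeded_engine() {
    std::random_device rd;
    // Collect several words from the device rather than a single 32-bit value.
    std::array<std::random_device::result_type, 8> seed_data{};
    for (auto& word : seed_data) {
        word = rd();
    }
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}
```

The engine returned by make_seeded_engine() can then be handed to any distribution, while the slow device is touched only once at start-up.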
The slides show a speed difference of about 250 times for the Mersenne twister and the slow system provided non-deterministic random number generator. This clearly demonstrates why local, deterministic PRNG's can help to speed up random number generation.
Also note that local PRNG's won't slow down much when used from multiple threads. The system generator could be speedy when accessed by multiple threads, but this is certainly not a given. Sometimes system RNG's may even block or have latency or related issues.
As mentioned in the comments below the question, the contract for std::random_device is rather weak. It talks about using a "deterministic" generator if a system generator is not available. Of course, on most desktop / server configurations such a device (e.g. /dev/random or the non-blocking /dev/urandom device) is available. In that case std::random_device is rather likely to return a secure random number generator. However, you cannot rely on this to happen on all system configurations.
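Where the device is known to exist, one option is to read it directly, so that a missing device is an error instead of a silent fallback. This is only a sketch, assuming a POSIX-like system that exposes /dev/urandom; the function name read_urandom is hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads `count` bytes from /dev/urandom.
// Throws instead of silently falling back to a deterministic generator.
std::vector<unsigned char> read_urandom(std::size_t count) {
    std::ifstream dev("/dev/urandom", std::ios::in | std::ios::binary);
    if (!dev) {
        throw std::runtime_error("cannot open /dev/urandom");
    }
    std::vector<unsigned char> buf(count);
    dev.read(reinterpret_cast<char*>(buf.data()),
             static_cast<std::streamsize>(count));
    if (dev.gcount() != static_cast<std::streamsize>(count)) {
        throw std::runtime_error("short read from /dev/urandom");
    }
    return buf;
}
```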
If you require a relatively fast secure random number generator, I would recommend you use a cryptographic library such as OpenSSL or Crypto++ instead of using either an insecure fast one or the relatively slow system random generator. OpenSSL, for instance, will use the system random generator (as well as other entropy sources) to seed a more secure algorithm.
I'm working on a test suite for my package, and as part of the tests I would like to run my algorithm on a block of data. However, it occurred to me that instead of hardcoding a particular block of data, I could use an algorithm to generate it. I'm wondering if the C++11 <random> facilities would be appropriate for this purpose.
From what I understand, the C++11 random number engines are required to implement specific algorithms. Therefore, given the same seed they should produce the same sequence of random integers in the range defined by the algorithm parameters.
However, as far as distributions are concerned, the standard specifies that:
The algorithms for producing each of the specified distributions are implementation-defined.
(26.5.8.1 Random number distribution class templates / In general)
Which — unless I'm mistaken — means that the output of a distribution is pretty much undefined. And from what I've tested, the distributions in GNU libstdc++ and LLVM project's libc++ produce different results given the same random engines.
The question would therefore be: what would be the most correct way of producing pseudo-random data that would be completely repeatable across different platforms?
what would be the most correct way of producing pseudo-random data that would be completely repeatable across different platforms?
That would be obvious: write your own distribution. As you yourself pointed out, the engines are cross-platform since they implement a specific algorithm. It's the distributions that are implementation-defined.
So write the distributions yourself.
Please see this answer: https://stackoverflow.com/a/34962942/1151329
I had exactly this problem and writing my own distributions worked perfectly. I got the same sequences across Linux, OS X, Windows, x86 and ARM.
I am using the Armadillo C++ library, which allows high-performance computation of matrices and vectors. This library has built-in functions to populate its objects with random numbers. I use it in the context of procedural random generation of an object. The object creation is random, but no matter how often I recreate the object, it remains the same as long as the seed remains the same.
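What "writing the distributions yourself" can look like is sketched below. These are not the distributions from the linked answer; the function names are hypothetical, and the real-valued version assumes IEEE-754 doubles. Both use only raw engine output and integer arithmetic (plus one exact scaling by a power of two), so nothing is left to the implementation.

```cpp
#include <cstdint>
#include <random>

// Uniform integer in [lo, hi] from raw std::mt19937 output, using rejection
// sampling so that every value is equally likely. Requires lo <= hi.
std::uint32_t portable_uniform_int(std::mt19937& engine,
                                   std::uint32_t lo, std::uint32_t hi) {
    const std::uint64_t range = static_cast<std::uint64_t>(hi) - lo + 1; // 1 .. 2^32
    const std::uint64_t span = std::uint64_t{1} << 32;  // engine yields 32 bits
    const std::uint64_t limit = span - (span % range);  // largest multiple of range
    std::uint64_t x;
    do {
        x = engine();  // discard values that would bias the result
    } while (x >= limit);
    return lo + static_cast<std::uint32_t>(x % range);
}

// Uniform double in [0, 1) built from the top 53 bits of std::mt19937_64 output.
double portable_uniform_real(std::mt19937_64& engine) {
    return static_cast<double>(engine() >> 11) * (1.0 / 9007199254740992.0); // 2^-53
}
```

Two engines constructed with the same seed and passed through these functions produce identical sequences, which is the property a repeatable test suite needs.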
The issue is that, although I can set the seed to a determined value, and thus recreate the same run on my machine... I lose the coherence of the randomness when going to a different computer. I come from the enchanted land of Matlab where I can specify the function used for the generation of pseudo-random numbers. So, this generation can be cross platform if one chooses the function well. But how do I specify the RNG function for Armadillo?
My research has led me to this source documentation, which "details" the process of random number generation:
http://arma.sourceforge.net/internal_docs_4300/a01181_source.html
http://arma.sourceforge.net/internal_docs_4300/a00087.html
But I have no clue what to do here: this code is much more advanced than what I can write. I would appreciate any help!
Thank you guys!
Remarks:
- I do not care how good the random function used is. I just want a fast cross-platform cross-architecture generator. Deterministic randomness is my goal anyway.
- In detail, in case it matters, the machines to consider are Intel processors, Windows or Mac, 32-bit or 64-bit.
- I have read several posts mentioning the use of seeds for randomness, but it seems that the problem here is the cross-platform context and the fact that the random generator is buried (to my untrained eyes at least) within Armadillo's code.
In C++98 / C++03 mode, Armadillo will internally use std::rand() for generating random numbers (there's more to it, but that's a good approximation of what's happening).
If you move from one operating system to the next (or across two versions of the same operating system), there is no guarantee that the system provided random number generator will be the same.
If you use Armadillo in C++11 mode, you can use any random number generator you like, with the help of the .imbue() function. Example:
std::mt19937 engine; // Mersenne twister random number engine with default parameters
std::uniform_real_distribution<double> distr(0.0, 1.0);
mat A(123,456);
A.imbue( [&]() { return distr(engine); } ); // fill with random numbers provided by the engine
The Mersenne twister random number engine is provided as standard functionality in C++11. The default parameters should be stable across compiler vendors and versions, and are independent of the operating system.
I want to check whether my implementation of std::random_device
has non-zero entropy (i.e. is non-deterministic), using std::random_device::entropy() function. However, according
to cppreference.com
"This function is not fully implemented in some standard libraries.
For example, gcc and clang always return zero even though the device
is non-deterministic. In comparison, Visual C++ always returns 32,
and boost.random returns 10."
Is there any way of finding the real entropy? In particular, do modern
computers (MacBook Pro/iMac etc) have a non-deterministic source of randomness, e.g. using heat dissipation monitors?
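For reference, the query in question is a single member call. The wrapper name reported_entropy below is hypothetical; the sketch only shows what the library reports, which is not necessarily the real entropy of the underlying source.

```cpp
#include <random>

// Returns whatever the standard library reports: an estimate in bits that,
// as quoted above, may be 0 even when the device is non-deterministic.
double reported_entropy() {
    std::random_device rd;
    return rd.entropy();
}
```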
I recommend reading this article:
Myths about /dev/urandom
§ 26.5.6
A random_device uniform random number generator produces non-deterministic random numbers.
If implementation limitations prevent generating non-deterministic random numbers, the implementation may employ a random number engine.
So basically it will try to use the system's internal "true" random number generator: on Linux /dev/{u}random, on Windows RtlGenRandom.
A different matter is if you don't trust those sources of randomness because they depend on internal noise or are closed implementations.
An additional question is how you measure the quality of the entropy; as you know, that is one of the biggest problems in trying to find good random number generators.
One estimation could be extremely good and other estimation could report not so good entropy.
Entropy Estimation
In various science/engineering applications, such as independent
component analysis, image analysis, genetic analysis, speech
recognition, manifold learning, and time delay estimation it is useful
to estimate the differential entropy of a system or process, given
some observations.
As it says, you must rely on final observations, and those can be wrong.
If you think the internal RNG is not good enough, you can always try to buy hardware devices for that purpose. Wikipedia has a list of vendors, and you can check reviews of them on the internet.
Performance
One point you must consider is the performance of your application when using real random number generators. One common technique is to seed a Mersenne twister with a number obtained from /dev/random.
If the user can't access your system physically, you will need to balance reliability with availability: a system with security holes is as bad as one that doesn't work. In the end, your important data must be encrypted.
Edit 1: As suggested, I have moved the article to the top of my comment; it is a good read. Thanks for the hint :-).
All the standard gives you is what you've already seen. You would need to know something about how a given standard library implements random_device in order to answer this question. For example, in Visual Studio 2013 Update 4, random_device forwards to rand_s which forwards to RtlGenRandom, which may actually be (always?) a cryptographically secure pseudorandom number generator depending on your Windows version and the hardware available.
If you don't trust the platform to provide a good source of entropy, then you should use your own cryptographically secure PRNG, such as one based on AES. That said, platform vendors have strong incentives for their random numbers to actually be random, and embedding the PRNG into your app means that the PRNG can't be updated as easily in the event it is found to be insecure. Only you can decide on that tradeoff for yourself :)
Entropy is just one measure of RNG quality (and true, exact entropy is impossible to measure). For a practical and reasonably-accurate measurement of your std::random_device's random number quality, consider using a standard randomness test suite such as TestU01, diehard, or its successor dieharder. These run a battery of statistical tests designed to stress your RNG, ensuring it produces statistically random data.
Note that statistical randomness by itself does not certify that the RNG is suitable for cryptographic applications.
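To give a feel for what such suites check, here is a sketch of one very simple test, a frequency (monobit) check that compares the number of set and clear bits. It is an illustration only, assuming 32-bit outputs; the function names are hypothetical, and passing it says far less than passing TestU01 or dieharder.

```cpp
#include <bitset>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper: draws `samples` 32-bit values from `next` and returns
// |ones - zeros| / sqrt(total bits). For unbiased bits this is rarely above
// about 3 to 4; a heavily biased source gives a much larger value.
template <typename Gen>
double monobit_statistic(Gen&& next, std::size_t samples) {
    long long balance = 0;  // +1 per set bit, -1 per clear bit
    for (std::size_t i = 0; i < samples; ++i) {
        const std::uint32_t v = static_cast<std::uint32_t>(next());
        const int ones = static_cast<int>(std::bitset<32>(v).count());
        balance += 2 * ones - 32;
    }
    const double total_bits = 32.0 * static_cast<double>(samples);
    return std::fabs(static_cast<double>(balance)) / std::sqrt(total_bits);
}

// Convenience wrapper for the device discussed above.
double monobit_of_random_device(std::size_t samples) {
    std::random_device rd;
    return monobit_statistic(rd, samples);
}
```

A generator that always returns zero scores the square root of the total bit count, so it is flagged immediately, whereas a good generator stays in the low single digits.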
Many modern computers have easily-accessible sources of hardware randomness, namely the analog-to-digital converters found in the audio input, camera, and various sensors. These exhibit low-level thermal or electrical noise which can be exploited to produce high-quality random data. However, no OS that I know of actually uses these sensors to supply their system random-number sources (such as /dev/[u]random), since the bitrate of such physical random number sources tends to be very low.
Instead, OS-provided random number sources tend to be seeded by hardware counters and events, such as page faults, device driver events, and other sources of unpredictability. In theory, these events might be fully predictable given the precise hardware state (since they aren't based on e.g. quantum or thermal noise), but in practice they are sufficiently unpredictable that they produce good random data.
Entropy as a scientific term is misused when describing random numbers. Complexity might be a better term. Entropy in physics is defined as the logarithm of the number of available quantum states (not useful in RNG), and entropy in information theory is defined by the Shannon entropy, but that is geared towards the other extreme - how to put as much information into a noisy bit stream, not how to minimize the information.
For example, the digits of Pi look random, but the actual entropy of the digits is zero once you know that they derive from Pi. Increasing "Entropy" in RNG is basically a question of making the source of the numbers as obscure as possible.