OpenWall BCrypt: Example of Hashing Input using crypt_r and explanation of inputs and best practice

OpenWall BCrypt: Example of Hashing Input using crypt_r and explanation of inputs and best practice - c++

I am struggling with finding examples for OpenWall's bcrypt implementation that use crypt_gensalt_rn and crypt_r that also explain in depth exactly what is happening in terms of input, settings etc and more generally the cryptographic component. crypt and crypt_gensalt are not really viable due to them not being MT-Safe so I am trying to better understand the rn, ra, and r implementations.
Summary: I want to better understand what the
the parameters to the function are and what their purposes are.
What are the best practices cryptographically for password hashing using these re-entrant ones and how to use them safely in a MT environment so I am not one of those developers who just uses crypto functions without understanding the implications of them and pitfalls.
when generating random bytes for the salt generator, what is a cryptographically strong source for random bytes?
I am also open to recommendations to other libraries etc. but they need to be production ready.

Tried to solve this on my own. Here is what I found out:
1a. crypt_gensalt_rn:
prefix = Prefix code in the form of $<Algorithim>$ ex. $2a$
count : The number of rounds you want the hashing algorithim to run. Set this value by response time. (ie. if you want to finish a hash in 1s, then run a benchmark and figure out the # of rounds needed to respond in 1s)
rbytes, nrbytes : rbytes is a set of random bytes and nrbytes is the length of said char array of randombytes. You can pass NULL to rbytes and 0 to nrbytes to have the OS get them for you (best practice typically unless OS doesn't have random bytes hardware needed which can result in a security risk)
output, output_size : This is defined by each algorithm individually. In the case of bcrypt it is 32 or the length of the setting string for crypt_ra. This stores in the format of $<Algorithim>$<rounds>$<salt> for bcrypt and many others.
1b. crypt_ra(const char *phrase, const char *setting, void **data, int *size);
phrase : the text you want to hash
setting: the setting string (ie. char* output) made in crypt_gensalt_rn.
data : a pointer to a generic crypt_struct used by most linux libraries. This is where crypt_ra will allocate a struct you need to free.
size : A pointer to an integer that crypt_ra will set to the length in bytes of the crypt struct.
Ra and RN are safe in Multithreaded environments. Make sure if your server doesn't support Random Byte Generation via hardware there can be a security risk (this was reported). Set rounds to a time you want it to take to verify a password not a # of rounds.
You can use the OS if it has the appropriate hardware. Otherwise you can use RNG like mersenne twister.

Related

Generate nonce c++

I am wondering if there is a way to generate a Cryptographic Nonce using OpenSSL or Crypto++ libraries. Is there anything more to it than just generating a set of random bytes using autoseeded pools?

I am wondering if there is a way to generate a cryptographic nonce using OpenSSL or Crypto++ libraries.
Crypto++:
SecByteBlock nonce(16);
AutoSeededRandomPool prng;
prng.GenerateBlock(nonce, nonce.size());
OpenSSL:
unsigned char nonce[16];
int rc = RAND_bytes(nonce, sizeof(nonce));
unsigned long err = ERR_get_error();
if(rc != 1) {
/* RAND_bytes failed */
/* `err` is valid */
}
/* OK to proceed */
Is there anything more to it than just generating a set of random bytes using autoseeded pools?
A nonce is basically an IV. Its usually considered a public parameter, like an IV or a Salt.
A nonce must be unique within a security context. You may need a nonce to be unpredictable, too.
Uniqueness and unpredictability are two different properties. For example, a counter starting at 0000000000000000 is unique, but its also predictable.
When you need both uniqueness and unpredictability, you can partition the nonce into a random value and a counter. The random value will take up 8 bytes of a 16 byte nonce; while the counter will take up the remaining 8 bytes of a 16 byte nonce. Then you use an increment function to basically perform i++ each time you need a value.
You don't need an 8-8 split. 12-4 works, as does 4-12. It depends on the application and the number of nonces required before rekeying. Rekeying is usually driven by plain text byte counts.
16-0 also works. In this case, you're using random values, avoiding the counter, and avoiding the increment function. (The increment function is basically a cascading add).
NIST SP800-38C and SP800-38D offer a couple of methods for creating nonces because CCM and GCM uses them.
Also see What are the requirements of a nonce? on the Crypto Stack Exchange.

You need a unique number for each nonce. You can use either a serial number or a random number. To help ensure uniqueness, it is common, though not required, to add a timestamp to the nonce. Either passing the timestamp as a separate field or concatenating it with the nonce. Sometimes information such as IP addresses and process IDs are also added.
When you use a serial number, you don't need to worry about skipping numbers. That's fine. Just make sure you never repeat. It must be unique across restarts of your software. This is one place where adding a timestamp can help. Because time-in-millis+serial-number is almost certainly unique across restarts of the server.
For the pseudo random number generator, anyone should be fine. Just make sure that you use a sufficiently large space to make the chances of getting a duplicate effectively impossible. Again, adding time will reduce the likelihood of you getting duplicates as you'll need to get the same random number twice in the same millisecond.
You may wish to hash the nonce to obscure the data in it (eg: process ID) though the hash will only be secure if you include a secure random number in the nonce. Otherwise it may be possible for a viewer of the nonce to guess the components and validate by redoing the hash (ie: they guess the time and try all possible proc IDs).

No. If the nonce is large enough then an autoseeded DRBG (deterministic random bit generator - NIST nomenclature) is just fine. I would suggest a nonce of about 12 bytes. If the nonce needs to be 16 bytes then you can leave the least significant bits - most often the rightmost bytes - set to zero for maximum compatibility.
Just using the cryptographically secure random number generators provided by the API should be fine - they should be seeded using information obtained from the operating system (possibly among other data). It never hurts to add the system time to the seed data just to be sure.
Alternatively you could use a serial number, but that would require you to keep some kind of state which may be hard across invocations. Beware that there are many pitfalls that may allow a clock to repeat itself (daylight saving, OS changes, dead battery etc. etc.).
It never hurts to double check that the random number generator doesn't repeat for a large enough output. There have been issues just with programming or system configuration mistakes, e.g. when a fix after a static code analysis for Debian caused the OpenSSL RNG not to be seeded at all.

Is there a library that would produce a string that would hash (SHA1) to a given input?

I'm wondering if it's possible to find a block of text that would hash to a known value. In particular, I'm looking for a function CreateDataFromHash() that could be called as follows:
unsigned char myHash[] = "da39a3ee5e6b4b0d3255bfef95601890afd80709";
unsigned int length = 10000;
CreateDataFromHash(myHash, length);
Here CreateDataFromHash would return the string of the length 10000 containing arbitrary data, which would hash to myHash using SHA1.
Thanks.

There's no known easy or even moderately difficult way to do this in general.
The entire point of hashes (or so-called one-way functions), is that it's easy to compute them, but next to impossible to reverse their computation (find input values based on output). That said, for some hash functions, there are known methods that may allow computing inputs for a given hash value in reasonable time.
For example, this MD5 sum technique will find collisions (but not input for a given output) in about 8 hours on a 1.6GHz computer.
For SHA-1 in particular you may be interested in reading this.

One of the purposes of SHA1 is that this should be very hard to do.

hashing is a one way function. you can't get input from the output.

This would be a "preimage attack". No such thing is publicly known against SHA-1.
The only attack known against SHA-1 is a collision attack. That means I find two inputs that produce the same result, but neither of them is pre-ordained, so to speak. Even so, this attack isn't really feasible for most people -- based on the amount of computation involved, the closest I can figure is that you'd have to spend somewhere in the range of a few million dollars to build a machine that would give you about one colliding pair of keys per week (assuming it ran, doing nothing else 24/7).

You have to brute force it. See
PHP brute force password generator
Get string, do hash, compare, repeat

Two-way "Hashing" of string

I want to generate int from a string and be able to generate it back.
Something like hash function but two-way function.
I want to use ints as ID in my application, but want to be able to convert it back in case of logging or debugging.
Like:
int id = IDProvider::getHash("NameOfMyObject");
object * a = createObject(id);
...
if(error)
{
LOG(IDProvider::getOriginalString(a->getId()), "some message");
}
I have heard of slightly modified CRC32 to be fast and 100% reversible, but I can not find it and I am not able to write it by myself.
Any hints what should I use?
Thank you!
edit
I have just founded the source I have the whole CRC32 thing from:
Jason Gregory : Game Engine Architecture
quotation:
"As with any hashing system, collisions are a possibility (i.e., two different strings might end up with the same hash code). However, with a suitable hash function, we can all but guarantee that collisions will not occur for all reasonable input strings we might use in our game. After all, a 32-bit hash chode represents more than four billion possible values. So if our hash function does a good job of distributing strings evently throughout this very large range, we are unlikely to collide. At Naughty Dog, we used a variant of the CRC-32 algorithm to hash our strings, and we didn't encounter a single collision in over two years of development on Uncharted: Drake's Fortune."

Reducing an arbitrary length string to a fixed size int is mathematically impossible to reverse. See Pidgeonhole principle. There is a near infinite amount of strings, but only 2^32 32 bit integers.
32 bit hashes(assuming your int is 32 bit) can have collisions very easily. So it's not a good unique ID either.
There are hashfunctions which allow you to create a message with a predefined hash, but it most likely won't be the original message. This is called a pre-image.
For your problem it looks like the best idea is creating a dictionary that maps integer-ids to strings and back.
To get the likelyhood of a collision when you hash n strings check out the birthday paradox. The most important property in that context is that collisions become likely once the number of hashed messages approaches the squareroot of the number of available hash values. So with a 32 bit integer collisions become likely if you hash around 65000 strings. But if you're unlucky it can happen much earlier.

I have exactly what you need. It is called a "pointer". In this system, the "pointer" is always unique, and can always be used to recover the string. It can "point" to any string of any length. As a bonus, it also has the same size as your int. You can obtain a "pointer" to a string by using the & operand, as shown in my example code:
#include <string>
int main() {
std::string s = "Hai!";
std::string* ptr = &s; // this is a pointer
std::string copy = *ptr; // this retrieves the original string
std::cout << copy; // prints "Hai!"
}

What you need is encryption. Hashing is by definition one way. You might try simple XOR Encryption with some addition/subtraction of values.
Reversible hash function?
How come MD5 hash values are not reversible?
checksum/hash function with reversible property
http://groups.google.com/group/sci.crypt.research/browse_thread/thread/ffca2f5ac3093255
... and many more via google search...

You could look at perfect hashing
http://en.wikipedia.org/wiki/Perfect_hash_function
It only works when all the potential strings are known up front. In practice what you enable by this, is to create a limited-range 'hash' mapping that you can reverse-lookup.
In general, the [hash code + hash algorithm] are never enough to get the original value back. However, with a perfect hash, collisions are by definition ruled out, so if the source domain (list of values) is known, you can get the source value back.
gperf is a well-known, age old program to generate perfect hashes in c/c++ code. Many more do exist (see the Wikipedia page)

Is it not possible. Hashing is not-returnable function - by definition.

As everyone mentioned, it is not possible to have a "reversible hash". However, there are alternatives (like encryption).
Another one is to zip/unzip your string using any lossless algorithm.
That's a simple, fully reversible method, with no possible collision.

Deterministically Generating cryptographically secure keys and IVEC's

Background
I am designing a system which enables the development of dynamic authentication schemes for a user of static web content. The motivation is to pre-generate large amounts of complex-to-generate, yet sensitive web-content, and then serve it statically with cookie-based (embedding reversably encrypted information) authentication in place, enforced by the web-server inately. Using an AEAD-mode encryption primitive.
The Problem
I need to generate IVEC's and keys that are valid for a duration of time, say one week (the current-valid pair). and that past IVECs/Keys are also valid for say 2 weeks(historically-valid) and any data encrypted with the historically valid secrets will just be re-encrypted with the current-valid IVEC/KEY.
What I need is a deterministic CSPRNG that seeds of a random number and a passphrase and that can produce in an indexed fashion 64-bit or 128-bit blocks of numbers. If I use a weeks-since-"jan 1 1970" as one of the index element of my hypothetical CSPRNG I should be able to build a system that innately changes keys automatically as time goes by.
Approach I am Considering
Now I don't see such functionality in cryptopp, or I do now know the terminology well enough, and as cryptopp is the most advanced of the encryption libraries out there, I don't have confidence I will find another one. So, If I can't find an implementation out there, I should roll my own. Will generating a static string structure out of the concatinated data and then hashing it (shown below) do the trick ?
RIPEMD160(RandomPreGeneratedFixedNonce:PassPhrase:UInt64SinceEpoch:128BitBlockIndexNumber);
Note: The blocknumbers will be assigned and have a regular structure, so for example for a 128-bit digest, the first 64-bits of block 0 will be for the ivec, and all of element 1 for the 128-bit key.
Is this a sound approach (--.i.e, cryptographically secure) ?
-- edit: post accept comment --
After some reflection, I have decided to merge what I originally considered the passphrase and the nonce/salt into a 16-byte (cryptographicall strong) key, and use the techniques outlined in the PKCS #5 to derive multiple time-based keys. There isn't a need for a salt, as passphrases aren't used.

Interesting question.
First, your Initial Vectors don't have to be cryptographically strong random quantities, but they should be unique per-message. The IV is really just a kind of salt value that ensures that similar messages encrypted using the same key don't look similar once they're encrypted. You can use any quick pseudo-random generator to generate the IV, and then send it (preferably encrypted) along with the encrypted data.
The keys, of course, should be as strong as you can practically make them.
Your proposal to hash a text string containing a nonce, passphrase, and validity data seems to me to be very reasonable -- it's broadly in line with what is done by other system that use a passphrase to generate a key. You should hash more many times -- not just once -- to make the key generation computationally expensive (which will be a bigger problem for anyone trying to brute-force the key than it will for you).
You might also want to have a look at the key-generation scheme set out in PKCS#5 (e.g. at http://www.faqs.org/rfcs/rfc2898.html) which is implemented in cryptopp as PasswordBasedKeyDerivationFunction. This mechanism is already widely used and known to be reasonable secure (note that PKCS#5 recommends hashing the passphrase data at least 1000 times). You could just append your validity period and index data to the passphrase and use PasswordBasedKeyDerivationFunction as it stands.
You don't say what encryption algorithm you propose to use to encrypt the data, but I would suggest that you should pick something widely-used and known to be secure ... and in particular I'd suggest that you use AES. I'd also suggest using one of the SHA digest functions (maybe as an input to PasswordBasedKeyDerivationFunction). SHA-2 is current, but SHA-1 is sufficient for key generation purposes.
You also don't say what key length you're looking to generate, but you should be aware that the amount of entropy in your keys depends on the length of the passphrase that you use, and unless the passphrase is very long that will be much less than the keylength ideally requires.
The weakest link in this scheme is the passphrase itself, and that's always going to limit the level of security you can achive. As long as you salt your data (as you are doing) and make the key-generation expensive to slow down brute-force attacks you should be fine.

What I need is a deterministic CSPRNG that seeds of a random number and a passphrase and that can produce in an indexed fashion 64-bit or 128-bit blocks of numbers. If I use a weeks-since-"jan 1 1970" as one of the index element of my hypothetical CSPRNG I should be able to build a system that innately changes keys automatically as time goes by.
Well, I think part of the solution is to use a non-time based generator. That way, if both sides start with the same seed, then they both produce the same random stream. You can layer your "weeks since Week 1, 1970" logic on top of that.
To do that, you would use OFB_mode<T>::Encryption. It can be used as a generator because OFB mode uses AdditiveCipherTemplate<T>, which derives from RandomNumberGenerator.
In fact, Crpyto++ uses the generator in test.cpp so that results can be reproduced if something fails. Here's how you would use OFB_mode<T>::Encryption. It also applies to CTR_Mode<T>::Encryption:
SecByteBlock seed(32 + 16);
OS_GenerateRandomBlock(false, seed, seed.size());
for(unsigned int i = 0; i < 10; i++)
{
OFB_Mode<AES>::Encryption prng;
prng.SetKeyWithIV(seed, 32, seed + 32, 16);
SecByteBlock t(16);
prng.GenerateBlock(t, t.size());
string s;
HexEncoder hex(new StringSink(s));
hex.Put(t, t.size());
hex.MessageEnd();
cout << "Random: " << s << endl;
}
The call to OS_GenerateRandomBlock fetches bytes from /dev/{u|s}random and then uses that as a simulated shared seed. Each run of the program will be different. Within each run of the program, it prints similar to the following:
$ ./cryptopp-test.exe
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
Random: DF3D3F8E8A21C39C0871B375013AA2CD
There's another generator available that does the same, but its not part of the Crypto++ library. Its called AES_RNG, and its based on AES-256. Its a header only implementation, and you can find it at the Crypto++ wiki under RandomNumberGenerator.
Also see the topic Reproducibility for RandomNumberGenerator class on the Crypto++ wiki.

Decryption with AES and CryptoAPI? When you know the KEY/SALT

Okay so i have a packed a proprietary binary format. That is basically a loose packing of several different raster datasets. Anyways in the past just reading this and unpacking was an easy task. But now in the next version the raster xml data is now to be encrypted using AES-256(Not my choice nor do we have a choice).
Now we basically were sent the AES Key along with the SALT they are using so we can modify our unpackager.
NOTE THESE ARE NOT THE KEYS JUST AN EXAMPLE:
They are each 63 byte long ASCII characters:
Key: "QS;x||COdn'YQ#vs-`X\/xf}6T7Fe)[qnr^U*HkLv(yF~n~E23DwA5^#-YK|]v."
Salt: "|$-3C]IWo%g6,!K~FvL0Fy`1s&N<|1fg24Eg#{)lO=o;xXY6o%ux42AvB][j#/&"
We basically want to use the C++ CryptoAPI to decrypt this(I also am the only programmer here this week, and this is going live tomorrow. Not our fault). I've looked around for a simple tutorial of implementing this. Unfortunately i cannot even find a tutorial where they have both the salt and key separately. Basically all i have really right now is a small function that takes in an array of BYTE. Along with its length. How can i do this?
I've spent most of the morning trying to make heads/tails of the cryptoAPI. But its not going well period :(
EDIT
So i asked for how they encrypt it. They use C#, and use RijndaelManaged, which from my knowledge is not equivalent to AES.
EDIT2
Okay finally got exactly what was going on, and they sent us the wrong keys.
They are doing the following:
Padding = PKCS7
CipherMode = CBC
The Key is defined as a set of 32 Bytes in hex.
The IV is defined as a set of 32 bytes in hex too.
They took away the salt when i asked them.
How hard is it to set these things in CryptoAPI using the wincrypt.h header file.?

AES-256 uses 256 bit keys. Ideally, each key in your system should be equally likely. A 63 byte string would be 504 bits. You first need to figure out how the string of 63 characters needs to be converted to 256 bits (The sample ones you gave are not base64 encoded). Next, "salt" isn't an intrinsic part of AES. You might be referring to either an initialization vector (IV) in Cipher-Block-Chaining mode or you could be referring to somehow updating the key.
If I were to guess, I'm assuming that by "SALT" you mean IV and specifically CBC mode.
You will need to know all of this when using CAPI functions (e.g. decrypt).
If all of this sounds confusing, then it might be best to change your design so that you don't have to worry about getting all of this right. Crypto is hard. One bad step could invalidate all the security. Consider looking at this comment on my Stick Figure Guide to AES.
UPDATE: You can look at this for a rough starting point for C++ CAPI. You'll need a 64 character hex string to get 256 bits ( 256 bits / (4 bits / char) == 64 chars). You can convert the chars to bits yourself.
Again, I must caution that playing fast and loose with IV's and keys can have disastrous consequences. I've studied AES/Rijndael in depth down to the math and gate level and have even written my own implementation. However, in my production code, I stick to using a well-tested TLS implementation if at all possible for data in transit. Even for data at rest, it'd be better to use a higher level library.

Rijndael is the algorithm name for AES

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js