Encryption Initialization Vector - c++

Using the aes_cfb_encrypt and aes_cfb_decrypt functions, I have the following questions.
What is unsigned char *iv (Initialization Vector) in an encryption.
Is it required to preserve the *iv for decryption.
Each time i encrypt a block of data the *iv is modified, What i have to do with this modified *iv.
I am encrypting a large file around 100mb, and passing a random *iv for the very first time, do i have to use the same *iv for the rest of the loop, or i have to use the updated *iv from the last call of encrypt block.
Lastly, I am dealing with a structured file, so do i had to use Sizeof(struct) as the length of the buffer or have to use sizeof(struct)*8 as length of the buffer for encryption or decryption.
Please guide..
AES_RETURN aes_cfb_encrypt(const unsigned char *ibuf, unsigned char *obuf, int len, unsigned char *iv, aes_encrypt_ctx cx[1]);
AES_RETURN aes_cfb_decrypt(const unsigned char *ibuf, unsigned char *obuf, int len, unsigned char *iv, aes_encrypt_ctx cx[1]);

In answering your questions, please note the following:
PT(x) = Plain Text representation of 'x'
CT(x) = Cipher Text representation of 'x'
Bn = Logical Data Block 'n' in a sequence of multiple blocks.
1. What is an IV?
IV is short notation for Initialization Vector. It is used in symmetric block-encryption algorithms that perform their encryption in what are called chained or feedback modes. In either, the previous block of encrypted data is used as a piece of functional data "goo" to alter the next block of data to be encrypted. Each successive block of data that is encrypted is fed the prior already-encrypted data block as their blob of goo to use. But what about the first block of plaintext? What does it use for its special sauce? Answer: the IV provided to the function. Pictorially, it looks like the following:
CT(B1) = Encrypt(IV + PT(B1))
CT(B2) = Encrypt(CT(B1) + PT(B2))
CT(B3) = Encrypt(CT(B2) + PT(B3))
...
CT(Bn) = Encrypt(CT(Bn-1) + PT(Bn))
Note: '+' in the above denotes the application of the prior cipher block to the next plaintext block. It is not to be thought of as mathematical-addition. Think of it as "combined with".
The size of the IV must be the same as the block size of the symmetric algorithm being used. Both AES-128-CFB and AES-256-CFB use a 128-bit block size (16 bytes). Therefore, your IV should be 16 bytes of random goo for your purposes in this question, and should be generated on the encryption-side using a secure FIPS-compliant random-source algorithm.
2. Is it required to preserve the IV for decryption?
Yes, but not necessarily in the fashion you may first think. The first IV (provided by you) must be retained somehow. Traditionally, it is sent right where you would think it should be; as the first block of encrypted data. This often freaks people out, they think "But if I send the IV with the data, it isn't as secure, is it?" Think about it this way. How many "IV's" are you sending, anyway? Remember, each data block is encrypted using the prior block of encrypted data as its IV. Therefore, you're actually sending an entire stream of IVs, each encrypted block the IV for the next encrypted block, etc. Where the initial IV is in your output ciphertext is a data-representation question, but where it goes is ultimately irrelevant to the question. It must be preserved. It is possible your API does this for you as part of its output stream (it is not uncommon at all, in fact).
3. Each time I encrypt a block of data the *iv is modified, What i have to do with this modified *iv?
I'm not familiar with the API you're using, but it sounds like you're given the IV to use for the next encryption, which makes perfect sense when you consider how feedback or chaining works for block-mode encryption. You should NOT use the same IV repeatedly. Use the one returned last as the next one. Since your API is modifying the IV in place, it appears the only thing you may need to do is preserve the initial IV somewhere else before sending. I would compare the first ciphertext block against your IV. if they are not the same, you probably need to send your IV, then the cipher text chain in your data stream, and have the receiver aware that the first block is the IV of the decryption.
4. I am encrypting a large file around 100mb, and passing a random *iv for the very first time, do i have to use the same *iv for the rest of the loop, or i have to use the updated *iv from the last call of encrypt block.
See (3). Use the updated IV for each successive block.
5. Lastly, I am dealing with a structured file, so do i Sizeof(struct) as the length of the buffer or have to use sizeof(struct)*8 as length of the buffer for encryption or decryption.?
Use the size of your structure in bytes (not bits). The C/C++ sizeof(yourstruct) should compute this for you, but note if you're encrypting each structure as an independent entity (and not the entire file in one mass), each encryption will carry with it a minimum amount of data added to account for (a) the IV used for that structure, and (b) padding the last block out to an even block boundary, assuming you're using PKCS5 padding. The exact size of an encrypted structure, therefore, would be:
IV + ((sizeof(struct) + 15)/16)*16) bytes.
Again, this is if you're independently encrypting, and storing, each structure as a singular encryption, and again, your API may account for some of this for you.
For more information on symmetric AES, see the AES entry on Wiki. For information on the CFB block cipher mode, see the Block Cipher Modes of Operation article on the same site.
I hope this helps. Do some homework and above all, learn exactly how your API works, which is something I cannot, unfortunately, help you with.

The initialization vector (IV) in a cryptographic system is a random value that is included as part of the encryption system's initialization to ensure that if the same data is encrypted multiple times, it always comes back looking different. This is a requirement of secure cryptographic systems to ensure that an attacker looking at multiple different encrypted messages cannot easily determine whether any two of those messages are the same. Ideally, the IV should be chosen completely randomly.
You should not need to preserve the IV for decryption. Typically, the IV is sent in plaintext along with the encrypted data. That's not a security concern - it's by design.
The IV is changed on each iteration of encryption because internally the cryptographic system is iteratively applying a block cipher to the data, then using the output of that cipher, combined with some extra data, as the new IV for the next application of the block cipher. This process is then iterated as many times as necessary. I suspect (but am not sure) that the IV is handed back to you so that you can encrypt more data where you left off from before. You should definitely double-check this!
As for whether to use the size of your structure or eight times that - I can't say without seeing more of your code. However, you should probably be providing the total number of bytes to encrypt, so if you're encrypting eight copies of the struct, pass in eight times the sizeof the struct.
Hope this helps!

Related

Generate nonce c++

I am wondering if there is a way to generate a Cryptographic Nonce using OpenSSL or Crypto++ libraries. Is there anything more to it than just generating a set of random bytes using autoseeded pools?
I am wondering if there is a way to generate a cryptographic nonce using OpenSSL or Crypto++ libraries.
Crypto++:
SecByteBlock nonce(16);
AutoSeededRandomPool prng;
prng.GenerateBlock(nonce, nonce.size());
OpenSSL:
unsigned char nonce[16];
int rc = RAND_bytes(nonce, sizeof(nonce));
unsigned long err = ERR_get_error();
if(rc != 1) {
/* RAND_bytes failed */
/* `err` is valid */
}
/* OK to proceed */
Is there anything more to it than just generating a set of random bytes using autoseeded pools?
A nonce is basically an IV. Its usually considered a public parameter, like an IV or a Salt.
A nonce must be unique within a security context. You may need a nonce to be unpredictable, too.
Uniqueness and unpredictability are two different properties. For example, a counter starting at 0000000000000000 is unique, but its also predictable.
When you need both uniqueness and unpredictability, you can partition the nonce into a random value and a counter. The random value will take up 8 bytes of a 16 byte nonce; while the counter will take up the remaining 8 bytes of a 16 byte nonce. Then you use an increment function to basically perform i++ each time you need a value.
You don't need an 8-8 split. 12-4 works, as does 4-12. It depends on the application and the number of nonces required before rekeying. Rekeying is usually driven by plain text byte counts.
16-0 also works. In this case, you're using random values, avoiding the counter, and avoiding the increment function. (The increment function is basically a cascading add).
NIST SP800-38C and SP800-38D offer a couple of methods for creating nonces because CCM and GCM uses them.
Also see What are the requirements of a nonce? on the Crypto Stack Exchange.
You need a unique number for each nonce. You can use either a serial number or a random number. To help ensure uniqueness, it is common, though not required, to add a timestamp to the nonce. Either passing the timestamp as a separate field or concatenating it with the nonce. Sometimes information such as IP addresses and process IDs are also added.
When you use a serial number, you don't need to worry about skipping numbers. That's fine. Just make sure you never repeat. It must be unique across restarts of your software. This is one place where adding a timestamp can help. Because time-in-millis+serial-number is almost certainly unique across restarts of the server.
For the pseudo random number generator, anyone should be fine. Just make sure that you use a sufficiently large space to make the chances of getting a duplicate effectively impossible. Again, adding time will reduce the likelihood of you getting duplicates as you'll need to get the same random number twice in the same millisecond.
You may wish to hash the nonce to obscure the data in it (eg: process ID) though the hash will only be secure if you include a secure random number in the nonce. Otherwise it may be possible for a viewer of the nonce to guess the components and validate by redoing the hash (ie: they guess the time and try all possible proc IDs).
No. If the nonce is large enough then an autoseeded DRBG (deterministic random bit generator - NIST nomenclature) is just fine. I would suggest a nonce of about 12 bytes. If the nonce needs to be 16 bytes then you can leave the least significant bits - most often the rightmost bytes - set to zero for maximum compatibility.
Just using the cryptographically secure random number generators provided by the API should be fine - they should be seeded using information obtained from the operating system (possibly among other data). It never hurts to add the system time to the seed data just to be sure.
Alternatively you could use a serial number, but that would require you to keep some kind of state which may be hard across invocations. Beware that there are many pitfalls that may allow a clock to repeat itself (daylight saving, OS changes, dead battery etc. etc.).
It never hurts to double check that the random number generator doesn't repeat for a large enough output. There have been issues just with programming or system configuration mistakes, e.g. when a fix after a static code analysis for Debian caused the OpenSSL RNG not to be seeded at all.

Are blowfish and twofish encrypted byte by byte?

I am receiving packets through network. But some of those packets have dynamic length, so second byte has a 2 bytes long WORD that contains length. So I receive the packet number first, then receive all according to the length. Everything is okay here, when there is no encryption. Will it be same if i use twofish or blowfish encryption ? What i mean is, 'A' is encrypted as 'B' but will 'AA' be encrypted as 'BB' ? Can i extract a byte and decrypt it from a packet encrypted with TF/BF as whole ?
What i mean is, 'A' is encrypted as 'B' but will 'AA' be encrypted as 'BB' ?
A sensible encryption algorithm will never do this, otherwise the encrypted info can be easily broken by frequency analysis. (This is known as substitution cipher, by the way). This is of course true* for blowfish and twofish.
Even if you want to extract a byte in the middle, you have to decrypt the whole packet first.
*: unless you use the weak ECB mode, which only reduces the two encryption algorithms into substitution ciphers over 64-bit/128-bit blocks).
Generally the answer is to pad the encrypted data. Don't just pad by adding 0's until you get to the block length, however; padding can give away a bit too much information.
As far as extracting a byte, depending on the cipher mode used - how the cipher is changed between blocks - you should not be able to do this. You'll need to decrypt all the way up the byte byte you'd like to read. It is general practice for the encryption to be "transparent" - i.e. you do your network programming, then slap SSL onto it, so that SSL handles encrypting everything, dealing with variable lengths, etc. and you just get to deal with plain old data.
As to whether throwing SSL at it is a good idea with your design, I have no idea, but you can use the concept.
Twofish, at its base level, encodes 16-byte blocks. So the minimum piece of Twofish-encrypted data you can have is 16 bytes long. If your data contains the length, then you can decrypt it then throw away any extra bytes in the last block.
So to encrypt 'A' you need to pad it (somehow - there are various ways - all zeroes is apparently not the best way) to 16 bytes, then encrypt your one byte of data and your 15 unwanted bytes. You get a 16-byte encrypted block. On decryption you can throw away the extra bytes.
I suggest half an hour reading these Wikipedia articles:
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
http://en.wikipedia.org/wiki/Padding_%28cryptography%29
Both of them have been helpful to me.

Is it fair for us to conclude XOR string encryption is less secure than well known encryption (Say Blowfish)

I was wondering, is it fair to conclude, XOR string encryption is less secure than other encryption method, say Blowfish
This is because for both methods, their input are
Unencrypted string
A secret key
string XOR(string value,string key)
{
string retval(value);
short unsigned int klen=key.length();
short unsigned int vlen=value.length();
short unsigned int k=0;
short unsigned int v=0;
for(v;v<vlen;v++)
{
retval[v]=value[v]^key[k];
k=(++k<klen?k:0);
}
return retval;
}
Is there any proof that XOR encryption method is more easy to be "broken" than Blowfish if the same key is being chosen?
If your key is (a) truly random, (b) at least as long as the plaintext, and (c) never re-used then XOR encryption is proveably unbreakable.
If you can't meet those stringent criteria then XOR encryption is proveably weaker than proper encryption algorithms like Blowfish, though I'm not in a position to prove it myself.
XOR encryption is wildly unsafe if you try encoding the same string more than once with it because it allows malicious users to recover the XOR of two messages by XORing their encryption. If they can somehow trick you into XORing a known string, then they can recover your key and break the encryption entirely.
More advanced encryption algorithms like Blowfish and AES have much stronger security guarantees and are assumed to be so strong that even if the attacker lets you encrypt known data, they can't recover your key or recover any individual bits. They should always be used instead of your own custom encryption in any scenario where security is important.
As an interesting aside, XOR encryption where on each invocation you use a new (randomly-chosen) key is cryptographically unbreakable from an information-theoretic perspective. This is called a "one-time pad" and could in theory be useful in some settings.
Actually, "XOR encryption" is proven to be perfectly secure against eavesdropping (you are still in trouble when someone can change your stream), given an encryption key that is
as long as the string to be encrypted
perfectly random (bits are uncorrelated and have equal probabilities of being 1 and 0)
used only once.
Otherwise, you are in deep trouble. If your bits aren't random, you can extract some information about the plaintext bits. If you reuse your key, you can XOR two ciphertexts to get the XOR of the plaintext, or, if by chance someone gets you to post a message with a plaintext known to him, he can extract the key (they actually did that for ENIGMA in WW2, so that's not unrealistic!). If the key is shorter than the message, you can employ statistics - say they key is 10 bytes long, you can divide the ciphertext into 10 parts which are all encoded with the same byte, and employ frequency analysis ... just to name a few attacks.
I don't know why it so rarely gets mentioned but the one time pad (OTP) has one major short-coming which is avoided by using the likes of RSA. Imagine that it is known that you will send one of two messages regarding stock ABC to your broker based on some news you will get at 2:00pm. You will either send BUY or SELL. With OTP, it doesn't take much crypto-analysis to figure out what you sent, because the length alone gives it away.
IMHO, it's better to use something that changes the length when converting to ciphertext as most common encryption systems do.

two AES implementations generated different encryption results

I have an application that uses an opensource "libgcrypt" to encrypt/decrypt a data block (32 bytes). Now I am going to use Microsoft CryptAPI to replace it. My problem is that the libgcrypt and cryptApi approaches generate different ciphertext contents as I use the same AES-256 algoritjm in CFB mode, same key, and same IV, although the ciphertext can be decrypted by their own correspndingly.
Could some tell me what is the problem? Thanks.
Do the two assume different endianness, or assign the bytes in the key/IV in different orders?
If the endianness assumptions are different, you may need to re-order the bytes in the key, IV and/or plaintext to get matching results. For example, if you are supplying bytes in the order abcdefgh, you may need to switch this to 'dcbahgfe' to get things to work.
There is an additional parameter for CFB, namely the "shift amount" at each iteration. The Wikipedia page on CFB has some information. Namely, you encrypt x bits for every block encryption, where x is any value between 1 and the block size (128 for AES). I suspect that in your code, the Microsoft CryptoAPI and libgcrypt do not use the same value for x.
As explained in the documentation for CryptSetKeyParam(), Windows defaults to x=8 (i.e. one byte at a time). This is the KP_MODE_BITS parameter. On the other hand, libgcrypt defaults to x=n for a n-bit block cipher (i.e. x=128 for AES). I am not sure libgcrypt can be convinced to use another value.
i think the problem is with block size .as you said you are using 32 byte as block size make sure if block size of both are same and supports as well .because some of library block size is fixed for Aes as 16 byte .
What is the length of your key and IV?
Are ciphertexts different if the length of opentext is exactly 256 bit?
I have same problem, but with a different library. I noticed one thing in this library; If I pass input byte less than 32 bytes, in that case it's showing me both are the same encrypted data.
Is that what's happening in your case? If so, it means the problem is with the padding mechanism.

Decryption with AES and CryptoAPI? When you know the KEY/SALT

Okay so i have a packed a proprietary binary format. That is basically a loose packing of several different raster datasets. Anyways in the past just reading this and unpacking was an easy task. But now in the next version the raster xml data is now to be encrypted using AES-256(Not my choice nor do we have a choice).
Now we basically were sent the AES Key along with the SALT they are using so we can modify our unpackager.
NOTE THESE ARE NOT THE KEYS JUST AN EXAMPLE:
They are each 63 byte long ASCII characters:
Key: "QS;x||COdn'YQ#vs-`X\/xf}6T7Fe)[qnr^U*HkLv(yF~n~E23DwA5^#-YK|]v."
Salt: "|$-3C]IWo%g6,!K~FvL0Fy`1s&N<|1fg24Eg#{)lO=o;xXY6o%ux42AvB][j#/&"
We basically want to use the C++ CryptoAPI to decrypt this(I also am the only programmer here this week, and this is going live tomorrow. Not our fault). I've looked around for a simple tutorial of implementing this. Unfortunately i cannot even find a tutorial where they have both the salt and key separately. Basically all i have really right now is a small function that takes in an array of BYTE. Along with its length. How can i do this?
I've spent most of the morning trying to make heads/tails of the cryptoAPI. But its not going well period :(
EDIT
So i asked for how they encrypt it. They use C#, and use RijndaelManaged, which from my knowledge is not equivalent to AES.
EDIT2
Okay finally got exactly what was going on, and they sent us the wrong keys.
They are doing the following:
Padding = PKCS7
CipherMode = CBC
The Key is defined as a set of 32 Bytes in hex.
The IV is defined as a set of 32 bytes in hex too.
They took away the salt when i asked them.
How hard is it to set these things in CryptoAPI using the wincrypt.h header file.?
AES-256 uses 256 bit keys. Ideally, each key in your system should be equally likely. A 63 byte string would be 504 bits. You first need to figure out how the string of 63 characters needs to be converted to 256 bits (The sample ones you gave are not base64 encoded). Next, "salt" isn't an intrinsic part of AES. You might be referring to either an initialization vector (IV) in Cipher-Block-Chaining mode or you could be referring to somehow updating the key.
If I were to guess, I'm assuming that by "SALT" you mean IV and specifically CBC mode.
You will need to know all of this when using CAPI functions (e.g. decrypt).
If all of this sounds confusing, then it might be best to change your design so that you don't have to worry about getting all of this right. Crypto is hard. One bad step could invalidate all the security. Consider looking at this comment on my Stick Figure Guide to AES.
UPDATE: You can look at this for a rough starting point for C++ CAPI. You'll need a 64 character hex string to get 256 bits ( 256 bits / (4 bits / char) == 64 chars). You can convert the chars to bits yourself.
Again, I must caution that playing fast and loose with IV's and keys can have disastrous consequences. I've studied AES/Rijndael in depth down to the math and gate level and have even written my own implementation. However, in my production code, I stick to using a well-tested TLS implementation if at all possible for data in transit. Even for data at rest, it'd be better to use a higher level library.
Rijndael is the algorithm name for AES