Are blowfish and twofish encrypted byte by byte? - c++

I am receiving packets through network. But some of those packets have dynamic length, so second byte has a 2 bytes long WORD that contains length. So I receive the packet number first, then receive all according to the length. Everything is okay here, when there is no encryption. Will it be same if i use twofish or blowfish encryption ? What i mean is, 'A' is encrypted as 'B' but will 'AA' be encrypted as 'BB' ? Can i extract a byte and decrypt it from a packet encrypted with TF/BF as whole ?

What i mean is, 'A' is encrypted as 'B' but will 'AA' be encrypted as 'BB' ?
A sensible encryption algorithm will never do this, otherwise the encrypted info can be easily broken by frequency analysis. (This is known as substitution cipher, by the way). This is of course true* for blowfish and twofish.
Even if you want to extract a byte in the middle, you have to decrypt the whole packet first.
*: unless you use the weak ECB mode, which only reduces the two encryption algorithms into substitution ciphers over 64-bit/128-bit blocks).

Generally the answer is to pad the encrypted data. Don't just pad by adding 0's until you get to the block length, however; padding can give away a bit too much information.
As far as extracting a byte, depending on the cipher mode used - how the cipher is changed between blocks - you should not be able to do this. You'll need to decrypt all the way up the byte byte you'd like to read. It is general practice for the encryption to be "transparent" - i.e. you do your network programming, then slap SSL onto it, so that SSL handles encrypting everything, dealing with variable lengths, etc. and you just get to deal with plain old data.
As to whether throwing SSL at it is a good idea with your design, I have no idea, but you can use the concept.

Twofish, at its base level, encodes 16-byte blocks. So the minimum piece of Twofish-encrypted data you can have is 16 bytes long. If your data contains the length, then you can decrypt it then throw away any extra bytes in the last block.
So to encrypt 'A' you need to pad it (somehow - there are various ways - all zeroes is apparently not the best way) to 16 bytes, then encrypt your one byte of data and your 15 unwanted bytes. You get a 16-byte encrypted block. On decryption you can throw away the extra bytes.
I suggest half an hour reading these Wikipedia articles:
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
http://en.wikipedia.org/wiki/Padding_%28cryptography%29
Both of them have been helpful to me.

Related

CBC MAC: message length and length prepending

I want to use CBC MAC in C++. First I hope to find some implementation of block cipher which I will use in CBC mode, which I understand is CBC MAC. But I have two questions:
1) If the length of the message to be authenticated is not multiple of block cipher block length, what shall I do?
2) To strengthen CBC MAC one recommended way as mentioned on Wiki is to put the length of the message in the first block. But how should I encode the length, as string? Or in binary? If block length of cipher is say 64 bits, do I encode the number as 64 bit number? e.g. if message length is 230, I should use following value as first block:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 ‭11100110‬
?
It depends on the 2nd question. You must "pad" the message with something until it is a multiple of the block size. The pad bytes are added to the message before the MAC is calculated but only the original message is transmitted/stored/etc.
For a MAC, the easiest thing to do is pad with zeros. However this has a vulnerability - if the message part ends in one or more zeros, an attacker could add or remove zeros and not change the MAC. But if you do step 2, this and another attack are both mitigated.
If you prepend the length of the message before the message (e.g. not just in the first block, but the first thing in the first block), it mitigates the ability to sometimes add/remove zeros. It also mitigates an attackers ability to forge a message with an entire arbitrary extra block added. So this is a good thing to do. And it is a good idea for completely practical reasons too - you know how many bytes the message is without relying on any external means.
It does not matter what the format of the length is - some encoded ASCII version or binary. However as a practical matter it should always be simple binary.
There is no reason that the number of bits in the length must match the cipher block size. The size of the length field must be large enough to represent the message sizes. For example, if the message size could range from 0 to 1000 bytes long, you could prepend an unsigned 16 bit integer.
This is done first before the MAC is calculated on both the sender and receiver. In essence the length is verified at the same time as the rest of the message, eliminating the ability for an attacker to forge a longer or shorter message.
There are many open source C implementations of block ciphers like AES that would be easy to find and get working.
Caveat
Presumably the purpose of the question is just for learning. Any serious use should consider a stronger MAC such as suggested by other comments and a good crypto library. There are other weaknesses and attacks that can be very subtle so you should never attempt to implement your own crypto. Neither of us are crypto experts and this should be done only for learning.
BTW, I recommend both of the following books by Bruce Schneier:
http://www.amazon.com/Applied-Cryptography-Protocols-Algorithms-Source/dp/0471117099/ref=asap_bc?ie=UTF8
http://www.amazon.com/Cryptography-Engineering-Principles-Practical-Applications/dp/0470474246/ref=asap_bc?ie=UTF8

Encryption Initialization Vector

Using the aes_cfb_encrypt and aes_cfb_decrypt functions, I have the following questions.
What is unsigned char *iv (Initialization Vector) in an encryption.
Is it required to preserve the *iv for decryption.
Each time i encrypt a block of data the *iv is modified, What i have to do with this modified *iv.
I am encrypting a large file around 100mb, and passing a random *iv for the very first time, do i have to use the same *iv for the rest of the loop, or i have to use the updated *iv from the last call of encrypt block.
Lastly, I am dealing with a structured file, so do i had to use Sizeof(struct) as the length of the buffer or have to use sizeof(struct)*8 as length of the buffer for encryption or decryption.
Please guide..
AES_RETURN aes_cfb_encrypt(const unsigned char *ibuf, unsigned char *obuf, int len, unsigned char *iv, aes_encrypt_ctx cx[1]);
AES_RETURN aes_cfb_decrypt(const unsigned char *ibuf, unsigned char *obuf, int len, unsigned char *iv, aes_encrypt_ctx cx[1]);
In answering your questions, please note the following:
PT(x) = Plain Text representation of 'x'
CT(x) = Cipher Text representation of 'x'
Bn = Logical Data Block 'n' in a sequence of multiple blocks.
1. What is an IV?
IV is short notation for Initialization Vector. It is used in symmetric block-encryption algorithms that perform their encryption in what are called chained or feedback modes. In either, the previous block of encrypted data is used as a piece of functional data "goo" to alter the next block of data to be encrypted. Each successive block of data that is encrypted is fed the prior already-encrypted data block as their blob of goo to use. But what about the first block of plaintext? What does it use for its special sauce? Answer: the IV provided to the function. Pictorially, it looks like the following:
CT(B1) = Encrypt(IV + PT(B1))
CT(B2) = Encrypt(CT(B1) + PT(B2))
CT(B3) = Encrypt(CT(B2) + PT(B3))
...
CT(Bn) = Encrypt(CT(Bn-1) + PT(Bn))
Note: '+' in the above denotes the application of the prior cipher block to the next plaintext block. It is not to be thought of as mathematical-addition. Think of it as "combined with".
The size of the IV must be the same as the block size of the symmetric algorithm being used. Both AES-128-CFB and AES-256-CFB use a 128-bit block size (16 bytes). Therefore, your IV should be 16 bytes of random goo for your purposes in this question, and should be generated on the encryption-side using a secure FIPS-compliant random-source algorithm.
2. Is it required to preserve the IV for decryption?
Yes, but not necessarily in the fashion you may first think. The first IV (provided by you) must be retained somehow. Traditionally, it is sent right where you would think it should be; as the first block of encrypted data. This often freaks people out, they think "But if I send the IV with the data, it isn't as secure, is it?" Think about it this way. How many "IV's" are you sending, anyway? Remember, each data block is encrypted using the prior block of encrypted data as its IV. Therefore, you're actually sending an entire stream of IVs, each encrypted block the IV for the next encrypted block, etc. Where the initial IV is in your output ciphertext is a data-representation question, but where it goes is ultimately irrelevant to the question. It must be preserved. It is possible your API does this for you as part of its output stream (it is not uncommon at all, in fact).
3. Each time I encrypt a block of data the *iv is modified, What i have to do with this modified *iv?
I'm not familiar with the API you're using, but it sounds like you're given the IV to use for the next encryption, which makes perfect sense when you consider how feedback or chaining works for block-mode encryption. You should NOT use the same IV repeatedly. Use the one returned last as the next one. Since your API is modifying the IV in place, it appears the only thing you may need to do is preserve the initial IV somewhere else before sending. I would compare the first ciphertext block against your IV. if they are not the same, you probably need to send your IV, then the cipher text chain in your data stream, and have the receiver aware that the first block is the IV of the decryption.
4. I am encrypting a large file around 100mb, and passing a random *iv for the very first time, do i have to use the same *iv for the rest of the loop, or i have to use the updated *iv from the last call of encrypt block.
See (3). Use the updated IV for each successive block.
5. Lastly, I am dealing with a structured file, so do i Sizeof(struct) as the length of the buffer or have to use sizeof(struct)*8 as length of the buffer for encryption or decryption.?
Use the size of your structure in bytes (not bits). The C/C++ sizeof(yourstruct) should compute this for you, but note if you're encrypting each structure as an independent entity (and not the entire file in one mass), each encryption will carry with it a minimum amount of data added to account for (a) the IV used for that structure, and (b) padding the last block out to an even block boundary, assuming you're using PKCS5 padding. The exact size of an encrypted structure, therefore, would be:
IV + ((sizeof(struct) + 15)/16)*16) bytes.
Again, this is if you're independently encrypting, and storing, each structure as a singular encryption, and again, your API may account for some of this for you.
For more information on symmetric AES, see the AES entry on Wiki. For information on the CFB block cipher mode, see the Block Cipher Modes of Operation article on the same site.
I hope this helps. Do some homework and above all, learn exactly how your API works, which is something I cannot, unfortunately, help you with.
The initialization vector (IV) in a cryptographic system is a random value that is included as part of the encryption system's initialization to ensure that if the same data is encrypted multiple times, it always comes back looking different. This is a requirement of secure cryptographic systems to ensure that an attacker looking at multiple different encrypted messages cannot easily determine whether any two of those messages are the same. Ideally, the IV should be chosen completely randomly.
You should not need to preserve the IV for decryption. Typically, the IV is sent in plaintext along with the encrypted data. That's not a security concern - it's by design.
The IV is changed on each iteration of encryption because internally the cryptographic system is iteratively applying a block cipher to the data, then using the output of that cipher, combined with some extra data, as the new IV for the next application of the block cipher. This process is then iterated as many times as necessary. I suspect (but am not sure) that the IV is handed back to you so that you can encrypt more data where you left off from before. You should definitely double-check this!
As for whether to use the size of your structure or eight times that - I can't say without seeing more of your code. However, you should probably be providing the total number of bytes to encrypt, so if you're encrypting eight copies of the struct, pass in eight times the sizeof the struct.
Hope this helps!

Is it fair for us to conclude XOR string encryption is less secure than well known encryption (Say Blowfish)

I was wondering, is it fair to conclude, XOR string encryption is less secure than other encryption method, say Blowfish
This is because for both methods, their input are
Unencrypted string
A secret key
string XOR(string value,string key)
{
string retval(value);
short unsigned int klen=key.length();
short unsigned int vlen=value.length();
short unsigned int k=0;
short unsigned int v=0;
for(v;v<vlen;v++)
{
retval[v]=value[v]^key[k];
k=(++k<klen?k:0);
}
return retval;
}
Is there any proof that XOR encryption method is more easy to be "broken" than Blowfish if the same key is being chosen?
If your key is (a) truly random, (b) at least as long as the plaintext, and (c) never re-used then XOR encryption is proveably unbreakable.
If you can't meet those stringent criteria then XOR encryption is proveably weaker than proper encryption algorithms like Blowfish, though I'm not in a position to prove it myself.
XOR encryption is wildly unsafe if you try encoding the same string more than once with it because it allows malicious users to recover the XOR of two messages by XORing their encryption. If they can somehow trick you into XORing a known string, then they can recover your key and break the encryption entirely.
More advanced encryption algorithms like Blowfish and AES have much stronger security guarantees and are assumed to be so strong that even if the attacker lets you encrypt known data, they can't recover your key or recover any individual bits. They should always be used instead of your own custom encryption in any scenario where security is important.
As an interesting aside, XOR encryption where on each invocation you use a new (randomly-chosen) key is cryptographically unbreakable from an information-theoretic perspective. This is called a "one-time pad" and could in theory be useful in some settings.
Actually, "XOR encryption" is proven to be perfectly secure against eavesdropping (you are still in trouble when someone can change your stream), given an encryption key that is
as long as the string to be encrypted
perfectly random (bits are uncorrelated and have equal probabilities of being 1 and 0)
used only once.
Otherwise, you are in deep trouble. If your bits aren't random, you can extract some information about the plaintext bits. If you reuse your key, you can XOR two ciphertexts to get the XOR of the plaintext, or, if by chance someone gets you to post a message with a plaintext known to him, he can extract the key (they actually did that for ENIGMA in WW2, so that's not unrealistic!). If the key is shorter than the message, you can employ statistics - say they key is 10 bytes long, you can divide the ciphertext into 10 parts which are all encoded with the same byte, and employ frequency analysis ... just to name a few attacks.
I don't know why it so rarely gets mentioned but the one time pad (OTP) has one major short-coming which is avoided by using the likes of RSA. Imagine that it is known that you will send one of two messages regarding stock ABC to your broker based on some news you will get at 2:00pm. You will either send BUY or SELL. With OTP, it doesn't take much crypto-analysis to figure out what you sent, because the length alone gives it away.
IMHO, it's better to use something that changes the length when converting to ciphertext as most common encryption systems do.

two AES implementations generated different encryption results

I have an application that uses an opensource "libgcrypt" to encrypt/decrypt a data block (32 bytes). Now I am going to use Microsoft CryptAPI to replace it. My problem is that the libgcrypt and cryptApi approaches generate different ciphertext contents as I use the same AES-256 algoritjm in CFB mode, same key, and same IV, although the ciphertext can be decrypted by their own correspndingly.
Could some tell me what is the problem? Thanks.
Do the two assume different endianness, or assign the bytes in the key/IV in different orders?
If the endianness assumptions are different, you may need to re-order the bytes in the key, IV and/or plaintext to get matching results. For example, if you are supplying bytes in the order abcdefgh, you may need to switch this to 'dcbahgfe' to get things to work.
There is an additional parameter for CFB, namely the "shift amount" at each iteration. The Wikipedia page on CFB has some information. Namely, you encrypt x bits for every block encryption, where x is any value between 1 and the block size (128 for AES). I suspect that in your code, the Microsoft CryptoAPI and libgcrypt do not use the same value for x.
As explained in the documentation for CryptSetKeyParam(), Windows defaults to x=8 (i.e. one byte at a time). This is the KP_MODE_BITS parameter. On the other hand, libgcrypt defaults to x=n for a n-bit block cipher (i.e. x=128 for AES). I am not sure libgcrypt can be convinced to use another value.
i think the problem is with block size .as you said you are using 32 byte as block size make sure if block size of both are same and supports as well .because some of library block size is fixed for Aes as 16 byte .
What is the length of your key and IV?
Are ciphertexts different if the length of opentext is exactly 256 bit?
I have same problem, but with a different library. I noticed one thing in this library; If I pass input byte less than 32 bytes, in that case it's showing me both are the same encrypted data.
Is that what's happening in your case? If so, it means the problem is with the padding mechanism.

Decryption with AES and CryptoAPI? When you know the KEY/SALT

Okay so i have a packed a proprietary binary format. That is basically a loose packing of several different raster datasets. Anyways in the past just reading this and unpacking was an easy task. But now in the next version the raster xml data is now to be encrypted using AES-256(Not my choice nor do we have a choice).
Now we basically were sent the AES Key along with the SALT they are using so we can modify our unpackager.
NOTE THESE ARE NOT THE KEYS JUST AN EXAMPLE:
They are each 63 byte long ASCII characters:
Key: "QS;x||COdn'YQ#vs-`X\/xf}6T7Fe)[qnr^U*HkLv(yF~n~E23DwA5^#-YK|]v."
Salt: "|$-3C]IWo%g6,!K~FvL0Fy`1s&N<|1fg24Eg#{)lO=o;xXY6o%ux42AvB][j#/&"
We basically want to use the C++ CryptoAPI to decrypt this(I also am the only programmer here this week, and this is going live tomorrow. Not our fault). I've looked around for a simple tutorial of implementing this. Unfortunately i cannot even find a tutorial where they have both the salt and key separately. Basically all i have really right now is a small function that takes in an array of BYTE. Along with its length. How can i do this?
I've spent most of the morning trying to make heads/tails of the cryptoAPI. But its not going well period :(
EDIT
So i asked for how they encrypt it. They use C#, and use RijndaelManaged, which from my knowledge is not equivalent to AES.
EDIT2
Okay finally got exactly what was going on, and they sent us the wrong keys.
They are doing the following:
Padding = PKCS7
CipherMode = CBC
The Key is defined as a set of 32 Bytes in hex.
The IV is defined as a set of 32 bytes in hex too.
They took away the salt when i asked them.
How hard is it to set these things in CryptoAPI using the wincrypt.h header file.?
AES-256 uses 256 bit keys. Ideally, each key in your system should be equally likely. A 63 byte string would be 504 bits. You first need to figure out how the string of 63 characters needs to be converted to 256 bits (The sample ones you gave are not base64 encoded). Next, "salt" isn't an intrinsic part of AES. You might be referring to either an initialization vector (IV) in Cipher-Block-Chaining mode or you could be referring to somehow updating the key.
If I were to guess, I'm assuming that by "SALT" you mean IV and specifically CBC mode.
You will need to know all of this when using CAPI functions (e.g. decrypt).
If all of this sounds confusing, then it might be best to change your design so that you don't have to worry about getting all of this right. Crypto is hard. One bad step could invalidate all the security. Consider looking at this comment on my Stick Figure Guide to AES.
UPDATE: You can look at this for a rough starting point for C++ CAPI. You'll need a 64 character hex string to get 256 bits ( 256 bits / (4 bits / char) == 64 chars). You can convert the chars to bits yourself.
Again, I must caution that playing fast and loose with IV's and keys can have disastrous consequences. I've studied AES/Rijndael in depth down to the math and gate level and have even written my own implementation. However, in my production code, I stick to using a well-tested TLS implementation if at all possible for data in transit. Even for data at rest, it'd be better to use a higher level library.
Rijndael is the algorithm name for AES