Cryptopp: Get the padded string of input to the cipher - c++

I am currently struggling with the Crypto++ library for C++. All I want is to get the padded input to the cipher. I have the following snippet to encrypt a string "plain":
CryptoPP::StringSource(plain, true, new CryptoPP::StreamTransformationFilter(e, new CryptoPP::StringSink(cipher), CryptoPP::BlockPaddingSchemeDef::DEFAULT_PADDING));
where e is a cipher as:
CryptoPP::CBC_Mode<CryptoPP::SPECK128 >::Encryption e;
I can output the cipher text with the following snippet:
CryptoPP::StringSource(cipher, true, new CryptoPP::HexEncoder(new CryptoPP::FileSink(std::cout)));
What I need is to get the padded version of the string "plain". Is there anybody who can give me a hint to use the StreamTransformationFilter to get the output of the padded input string?

Use CryptoPP::BlockPaddingSchemeDef::NO_PADDING during decryption instead of leaving the padding scheme argument out. The filter will then not expect padding, so it leaves the padding bytes in place and returns them as part of the plaintext.
The padding is always within the final 16 bytes (presuming a block size of 128 bits), at the very end of the output. Beware that the padding bytes are usually not printable characters: their values are below 0x20, so they are usually interpreted as control characters.
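If all you need is to see what the padded input looks like, it can also be reproduced by hand. The following is a minimal sketch, assuming DEFAULT_PADDING resolves to PKCS #7-style padding for CBC mode (my understanding, worth verifying against your Crypto++ version). The name pkcs7Pad is hypothetical and not part of Crypto++.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Crypto++ API): append PKCS #7-style padding so
// the result is a whole number of blocks. Assumes 0 < blockSize < 256.
std::string pkcs7Pad(const std::string& plain, std::size_t blockSize = 16)
{
    // Always 1..blockSize bytes; a full extra block is added when the input
    // is already block-aligned.
    const std::size_t padLen = blockSize - (plain.size() % blockSize);

    std::string padded = plain;
    // Every padding byte carries the padding length as its value.
    padded.append(padLen, static_cast<char>(padLen));
    return padded;
}
```

For example, pkcs7Pad("plain") returns 16 bytes: the five characters of "plain" followed by eleven bytes of value 0x0B.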

Related

Why is my decrypted data formatted like this?

I am currently working on a side project to learn how to use Crypto++ for encryption/decryption. For testing my project I was given the following values to help setup and validate that my project is working:
original string: "100000"
encrypted value: "f3q2PYciHlwmS0S1NFpIdA=="
key and iv: empty byte array
key size: 24 bytes
iv size: 16 bytes
The project runs and decrypts the encrypted value okay, but instead of returning
"100000"
it returns
"1 0 0 0 0 0 "
where each space is really "\0". Here is my minimal code that I use for decryption:
#include "modes.h"
#include "aes.h"
#include "base64.h"
using namespace CryptoPP;
void main()
{
    string strEncoded = "f3q2PYciHlwmS0S1NFpIdA==";
    string strDecrypted;
    string strDecoded;
    byte abKey[24];
    byte abIV[AES::BLOCKSIZE];
    memset(abKey, 0, sizeof(abKey));
    memset(abIV, 0, AES::BLOCKSIZE);
    AES::Decryption cAESDecryption(abKey, sizeof(abKey));
    CBC_Mode_ExternalCipher::Decryption cCBCDecryption(cAESDecryption, abIV);
    StringSource(strEncoded, true, new Base64Decoder(new StringSink(strDecoded)));
    StreamTransformationFilter cDecryptor(cCBCDecryption, new StringSink(strDecrypted));
    cDecryptor.Put(reinterpret_cast<const byte*>(strDecoded.c_str()), strDecoded.size());
    cDecryptor.MessageEnd();
}
I am okay with using the decrypted value as is, but what I need help understanding is why the decrypted value is showing "1 0 0 0 0 0 " instead of "100000"? By the way, this is built in VS2005 as a Windows Console Application with Crypto++ as a static library and I am using Debug mode to look at the values.
Add a strHex string, and add the following line after you decrypt the text:
StringSource ss2(strDecrypted, true, new HexEncoder(new StringSink(strHex)));
cout << strHex << endl;
You should see something similar to:
$ ./cryptopp-test.exe
310030003000300030003000
As @Maarten said, it looks like UTF-16 LE without the BOM. My guess is the sample was created in .Net, and they are asking you to decrypt in C++/Crypto++. I'm guessing .Net because it's UTF-16 and little endian, while Java is UTF-16 and big endian by default (IIRC).
You could also ask that they provide you with strings produced by Encoding.UTF8.GetBytes. That will sidestep the issue, too.
So the value in strDecrypted is not narrow-character text. It's just a binary string (a.k.a. a Rope) that needs to be converted. For the conversion to UTF-8 (or another narrow character set), I believe you can use iconv. iconv is built into glibc on GNU/Linux (IIRC), and libiconv can be found in the lib directory of the BSDs.
If you are on Windows, then use the WideCharToMultiByte function.
It's very probably just text that is encoded using the UTF-16LE or UCS-2LE character-encoding, apparently without Byte Order Mark (BOM). So to display the text you have to decode it first.
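For illustration, the decoding step can also be done by hand without iconv or WideCharToMultiByte. This is only a sketch, assuming the decrypted bytes really are UTF-16LE without a BOM. The name utf16leToUtf8 is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-16LE bytes (no BOM) and re-encode as UTF-8.
// Sketch only: a trailing odd byte is ignored and unpaired surrogates are
// passed through without validation.
std::string utf16leToUtf8(const std::string& bytes)
{
    std::string out;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // Little endian: low byte first, then high byte.
        std::uint32_t cp = static_cast<unsigned char>(bytes[i]) |
                           (static_cast<unsigned char>(bytes[i + 1]) << 8);

        // Combine a high surrogate with the low surrogate that follows it.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 3 < bytes.size()) {
            const std::uint32_t low = static_cast<unsigned char>(bytes[i + 2]) |
                                      (static_cast<unsigned char>(bytes[i + 3]) << 8);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i += 2; // the low surrogate has been consumed as well
            }
        }

        // Emit the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Applied to the twelve decrypted bytes 31 00 30 00 30 00 30 00 30 00 30 00, this yields the six-character string "100000".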

IV value read from Binary file, is not proper

I have an encrypted binary file of size 256*N bytes.
The last 16 bytes of the first page (256 bytes long) contain the IV value needed for decryption.
If I fetch it using the code below:
infile.seek(240,0)
iv = infile.read(16)
(infile is the input file), the IV value does not match the one in the bin file.
Also, is it fine if I just pass this "iv" to AES.new, as in the code below?
decryptor = AES.new(key, AES.MODE_CBC, iv)
Also, if I have to pass a hard-coded IV value to AES.new, in what format do I need to pass it? I have a 16-byte hex value and I need to convert it into a byte string, right?
Please let me know how to do it.
First question
Yes, that seems to be the proper method to read the IV. Make sure you opened the file in binary mode though, not in text mode.
Second question
Yes, if the IV is a 16 byte binary value that would be correct.
Third question
Using a constant IV would defeat the purpose of the IV altogether. But if you must use one that is specified in hexadecimal, you should convert it to bytes first, for example with binascii.unhexlify.

I get "Invalid utf 8 error" when checking string, but it seems correct when i use std::cout

I am writing some code that must read utf 8 encoded text files, and send them to OpenGL.
Also using a library which i downloaded from this site: http://utfcpp.sourceforge.net/
When i write down this i can show the right images on OpenGL window:
std::string somestring = "abcçdefgğh";
// Convert string to utf32 encoding..
// I also set local on program startup.
But when i read the utf8 encoded string from file:
The library warns me about that the string has not a valid utf encoding
I can't send the 'read from file' string to OpenGL. It crashes.
But i can still use std::cout for the string that i read from file (it looks right).
I use this code to read from file:
void something(){
    std::ifstream ifs("words.xml");
    std::string readd;
    if(ifs.good()){
        while(!ifs.eof()){
            std::getline(ifs, readd);
            // do something..
        }
    }
}
Now the question is:
If the string which is read from file is not correct, how does it look as expected when i check it with std::cout?
How can i get this issue solved?
Thanks in advance:)
The shell to which you write output is probably rather robust against characters it doesn't understand. It seems not all of the software involved is. It should, however, be relatively straightforward to verify whether your byte sequence is valid UTF-8, because the encoding is simple:
each code point starts with a byte that encodes the number of bytes to be read along with the first few bits of the code point:
if the high bit is 0, the code point consists of one byte and is represented by the 7 lower bits
otherwise the number of leading 1 bits gives the total number of bytes; those are followed by a zero bit (obviously), and the remaining bits become the high bits of the code point
since the first byte is already accounted for, the bytes that follow are continuation bytes, which have the high bit set and the next bit not set: their lower 6 bits are part of the representation of the code point
Based on these rules, there are two things which can go wrong and make the UTF-8 invalid:
a continuation byte is encountered at a point where a start byte is expected
a start byte announces more continuation bytes than actually follow
I don't have code around which could indicate where things are going wrong, but it should be fairly straightforward to write such code.
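A check along those lines could look like the sketch below. It tests only the two structural rules listed above and does not reject overlong encodings or surrogate code points, so it is not a full validator. The name firstInvalidUtf8 is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the offset of the first sequence that breaks
// the structural rules, or std::string::npos if none does.
// Sketch only: overlong forms and surrogate code points are not rejected.
std::size_t firstInvalidUtf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);

        // Number of continuation bytes announced by the start byte.
        std::size_t extra;
        if (lead < 0x80)                extra = 0; // 0xxxxxxx
        else if ((lead & 0xE0) == 0xC0) extra = 1; // 110xxxxx
        else if ((lead & 0xF0) == 0xE0) extra = 2; // 1110xxxx
        else if ((lead & 0xF8) == 0xF0) extra = 3; // 11110xxx
        else return i; // continuation byte (or longer form) where a start byte is expected

        for (std::size_t k = 1; k <= extra; ++k) {
            // Each announced byte must exist and look like 10xxxxxx.
            if (i + k >= s.size() ||
                (static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) {
                return i; // fewer continuation bytes than the start byte announced
            }
        }
        i += extra + 1;
    }
    return std::string::npos;
}
```

Running this on each line read from the file would show where the first offending byte sits. One case it would catch is a file saved in a single-byte encoding such as Latin-1, where "ç" is the lone byte 0xE7 rather than the two-byte UTF-8 sequence C3 A7.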

How to perform unpadding after decryption of stream using CryptoPP

I've got a stream to decrypt. I divide it into blocks and pass each block to the method below. The data I need to decrypt is encrypted in 16-byte blocks, and if the last block is shorter than 16 bytes, the remaining bytes are filled with padding. On decryption, the result for the last block then includes these additional padding bytes. How can I determine the length of the original data and return only that, or identify the padding bytes and remove them, considering that different paddings could be used?
void SymmetricAlgorithm::Decrypt(byte* buffer, size_t dataBytesSize) {
    MeterFilter meter(new ArraySink(buffer, dataBytesSize));
    CBC_Mode<CryptoPP::Rijndael>::Decryption dec(&Key.front(), Key.size(), &IV.front());
    StreamTransformationFilter* filter = new StreamTransformationFilter(dec, new Redirector(meter), PKCS_PADDING);
    ArraySource(buffer, dataBytesSize, true, filter);
    dec.Resynchronize(&IV.front());
}
Now I'm trying with PKCS_PADDING and Rijndael, but in general I might need to work with any algorithm and any padding.
"I divide it into blocks and pass each block to the method below"
In this case, you might consider calling ProcessBlock directly:
CBC_Mode<Rijndael>::Decryption dec(...);
// Assume 'b' is a 16-byte block
dec.ProcessBlock(b);
The block is processed in place, so it's destructive. You will also be responsible for processing the last block, including the removal of padding.
By blocking and removing padding, you are doing the work of the StreamTransformationFilter (and friends).
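If you do take on the last block yourself, the padding removal for PKCS #7-style padding could be sketched as follows. This covers only that one scheme; other paddings need their own rules. The names pkcs7UnpaddedLength and kBadPadding are hypothetical and not part of Crypto++.

```cpp
#include <cstddef>

// Hypothetical sentinel returned when the padding is malformed.
const std::size_t kBadPadding = static_cast<std::size_t>(-1);

// Hypothetical helper: given fully decrypted data that still carries
// PKCS #7-style padding, return the length of the plaintext without it.
std::size_t pkcs7UnpaddedLength(const unsigned char* data, std::size_t size,
                                std::size_t blockSize = 16)
{
    // Padded data is never empty and is always a whole number of blocks.
    if (size == 0 || size % blockSize != 0) return kBadPadding;

    // The last byte states how many padding bytes were appended.
    const std::size_t padLen = data[size - 1];
    if (padLen == 0 || padLen > blockSize) return kBadPadding;

    // Every padding byte must carry that same value.
    for (std::size_t i = size - padLen; i < size; ++i) {
        if (data[i] != padLen) return kBadPadding;
    }
    return size - padLen;
}
```

For a 16-byte final block holding "hello" followed by eleven 0x0B bytes, the function returns 5.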
As it happens, I found what I needed by chance in the example from this question.
I appreciate your help, Gabriel L., but I didn't want to make my method stop using padding altogether. Sorry for the unclear explanation: I wanted to extract the plain data from the decrypted data, which includes padding bytes. The line that calls meter.GetTotalBytes() in the code below shows how to find the plain data byte count.
void SymmetricAlgorithm::Decrypt(byte* buffer, size_t dataBytesSize) {
    MeterFilter meter(new ArraySink(buffer, dataBytesSize));
    CBC_Mode<CryptoPP::Rijndael>::Decryption dec(&Key.front(), Key.size(), &IV.front());
    StreamTransformationFilter* filter = new StreamTransformationFilter(dec, new Redirector(meter), PKCS_PADDING);
    ArraySource(buffer, dataBytesSize, true, filter);
    int t = meter.GetTotalBytes(); //plain data bytes count
    dec.Resynchronize(&IV.front());
}

Incorrect string length after Blowfish encryption

I did a little testing with Blowfish encoding and noticed something. The encoded string is not always as long as the source string; sometimes it is shorter.
If I want to decode an encoded string, I need the length to decode in the OpenSSL function:
BF_cfb64_encrypt(encoded_input,decoded_output,length_to_decode,&key,iv,&num,BF_DECRYPT);
The problem here is that I don't have length_to_decode if I don't know the length of the source string. If I use the length of the decoded string as length_to_decode, it may be too short.
If I use a bigger length, the decoded string is not correct. So do I need to know the length in order to decode with Blowfish?
In all the example code on the internet, encoding and decoding always take place in one function, and the decoding examples use a hard-coded length. But in real life I don't know the length of the encoded string, so what then?
here is an example:
source string: sdcfgssssssss
source length: 13
encryption key: s
encrypted output: :‹(
encrypted length: 4
I init my key like this:
BF_KEY key;
const char * keyStr = "s";
BF_set_key(&key, strlen(keyStr), (const unsigned char *)keyStr);
In CFB mode, the length of the output is identical to the length of the input. Do note that the output data may contain NUL bytes, so you should not use strlen on the data; keep track of the length separately instead.
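To illustrate the difference, here is a small sketch that carries the ciphertext around with an explicit length and compares that with what strlen would report. The helper names are hypothetical, and the bytes in the usage note are made up for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy ciphertext into a std::string using an explicit
// length, so embedded NUL bytes are preserved.
std::string toByteString(const unsigned char* data, std::size_t length)
{
    return std::string(reinterpret_cast<const char*>(data), length);
}

// What strlen reports for the same bytes: it stops at the first NUL byte.
std::size_t lengthSeenByStrlen(const std::string& bytes)
{
    return std::strlen(bytes.c_str());
}
```

With a 13-byte buffer whose fourth byte is 0x00, toByteString(buf, 13).size() is 13 while lengthSeenByStrlen reports 3, which is the kind of mismatch behind the "shorter" encrypted string.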