Incorrect string length after Blowfish encryption - C++

I did a little testing with Blowfish encryption and noticed something: the encrypted string is not always as long as the source string; sometimes it is shorter.
If I want to decrypt an encrypted string, I need to pass the length to decrypt to the OpenSSL function:
BF_cfb64_encrypt(encoded_input, decoded_output, length_to_decode, &key, iv, &num, BF_DECRYPT);
The problem is that I don't have length_to_decode if I don't know the length of the source string. If I use the (apparent) length of the encrypted string as length_to_decode, it may be too short.
If I use a bigger length, the decrypted string is not correct. So do I need to know the length in order to decrypt with Blowfish?
In all the example code on the internet, encryption and decryption take place in one function, and the decryption examples use a hard-coded length. But in real life I don't know the length of the encrypted string, so what then?
here is an example:
source string: sdcfgssssssss
source length: 13
encryption key: s
encrypted output: :‹(
encrypted length: 4
I initialize my key like this:
BF_KEY key;
const char * keyStr = "s";
BF_set_key(&key, strlen(keyStr), (const unsigned char *)keyStr);

The length of the output is identical to the length of the input. Do note that the output data may contain NUL characters, so you should not use strlen on it. That is also why your "encrypted length" came out as 4: strlen stops at the first NUL byte.
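Here is a minimal sketch of that idea (not your exact code; the all-zero IV and the buffer sizes are assumptions for illustration). CFB64 is a stream mode, so the ciphertext has exactly as many bytes as the plaintext; keep that byte count around and pass it back in for decryption instead of calling strlen on the ciphertext.
#include <openssl/blowfish.h>
#include <cstring>
#include <cstdio>

int main() {
    BF_KEY key;
    const char *keyStr = "s";
    BF_set_key(&key, (int)strlen(keyStr), (const unsigned char *)keyStr);

    const unsigned char plain[] = "sdcfgssssssss";
    size_t len = strlen((const char *)plain);   // 13; known before encrypting

    unsigned char ivec[8] = {0};                // assumed IV; must be the same for encrypt and decrypt
    int num = 0;
    unsigned char cipher[64] = {0};
    BF_cfb64_encrypt(plain, cipher, (long)len, &key, ivec, &num, BF_ENCRYPT);
    // cipher now holds exactly len bytes; store len alongside it (length prefix,
    // file header, ...) instead of relying on strlen, which stops at NUL bytes.

    // Reset the IV and counter, then decrypt using the stored length.
    memset(ivec, 0, sizeof(ivec));
    num = 0;
    unsigned char decoded[64] = {0};
    BF_cfb64_encrypt(cipher, decoded, (long)len, &key, ivec, &num, BF_DECRYPT);

    printf("decoded: %.*s\n", (int)len, decoded);
    return 0;
}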

Related

Cryptopp: Get the padded string of input to the cipher

I am currently struggling with the Crypto++ library for C++. All I want is to get the padded input to the cipher. I have the following snippet to encrypt a string "plain":
CryptoPP::StringSource(plain, true, new CryptoPP::StreamTransformationFilter(e, new CryptoPP::StringSink(cipher), CryptoPP::BlockPaddingSchemeDef::DEFAULT_PADDING));
where e is a cipher as:
CryptoPP::CBC_Mode<CryptoPP::SPECK128 >::Encryption e;
I can output the cipher text with the following snippet:
CryptoPP::StringSource(cipher, true, new CryptoPP::HexEncoder(new CryptoPP::FileSink(std::cout)));
What I need is the padded version of the string "plain". Can anybody give me a hint on how to use the StreamTransformationFilter to get the padded input string as output?
Just use CryptoPP::BlockPaddingSchemeDef::NO_PADDING during decryption instead of the default padding scheme. The filter will then not expect padding, so it interprets the padding bytes as plaintext and returns them as part of the output.
The padding is always within the final 16 bytes (presuming a block size of 128 bits), starting at the right. Beware that the returned bytes are usually not printable characters, as their values are below 0x20, which means they are usually interpreted as control characters.
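A sketch along those lines (assuming Crypto++ 6.0 or later for SPECK128; the all-zero key and IV are dummies for illustration): encrypt with DEFAULT_PADDING as you already do, then decrypt the ciphertext with NO_PADDING so the padding bytes stay in the output.
#include <cryptopp/speck.h>
#include <cryptopp/modes.h>
#include <cryptopp/filters.h>
#include <cryptopp/hex.h>
#include <cryptopp/files.h>
#include <iostream>
#include <string>

int main() {
    using namespace CryptoPP;

    byte key[SPECK128::DEFAULT_KEYLENGTH] = {0};  // dummy key for the sketch
    byte iv[SPECK128::BLOCKSIZE] = {0};           // dummy IV for the sketch

    std::string plain = "plain", cipher, padded;

    CBC_Mode<SPECK128>::Encryption e;
    e.SetKeyWithIV(key, sizeof(key), iv);
    StringSource(plain, true,
        new StreamTransformationFilter(e, new StringSink(cipher),
            BlockPaddingSchemeDef::DEFAULT_PADDING));

    // Decrypt with NO_PADDING: the padding bytes are not stripped, so
    // "padded" is exactly the block-aligned input the cipher saw.
    CBC_Mode<SPECK128>::Decryption d;
    d.SetKeyWithIV(key, sizeof(key), iv);
    StringSource(cipher, true,
        new StreamTransformationFilter(d, new StringSink(padded),
            BlockPaddingSchemeDef::NO_PADDING));

    StringSource(padded, true, new HexEncoder(new FileSink(std::cout)));
    std::cout << std::endl;
    return 0;
}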

Find whether a hex string is UTF-8 or UTF-16

I am new to C++.
I have a hex string from a file.
Example: 657374696E65, which, if it is UTF-8, converts to "estine".
Sometimes I get UTF-16 encoded data in the string instead.
I need to find out programmatically whether the string is encoded in UTF-8 or UTF-16.
std::string input = "657374696E65";
std::string extract = input.substr(0, 4);
unsigned int x;
std::stringstream ss;
ss << std::hex << extract;
ss >> x;
Initially I take each 4-character substring, convert it to ASCII, and then to a wide string.
Sometimes I get UTF-8 too.
Can anyone help me find out whether I have to convert each 2 characters or each 4 characters of the string to ASCII?
The first thing you should do before further processing is undoing the hex encoding, by putting raw bytes into an std::string or std::vector<unsigned char>. Then you can post-process your collection of bytes by UTF-8 or UTF-16 decoding into the string type your application needs.
There is no safe way to detect whether a string is UTF-8 or UTF-16. Microsoft tried to do so in a quite clever way in their IsTextUnicode function. The result was the misinterpretation of files containing the string "bush hid the facts" (without newline) in Notepad (e.g. on Windows XP).
If you can ensure that all UTF-16 strings you receive start with a byte order mark (BOM), use the BOM as indicator for UTF-16.
If you are sure that your strings always contain (amongst other characters) US-ASCII characters, take the appearance of NUL bytes ('\x00') as an indicator for UTF-16.
This is one of the better heuristics Windows used: if the pattern \x0D\x0A (CR/LF) appears, detect the string as UTF-8. This prevents the "bush hid the facts" issue if there is a line break in the string.
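As a sketch of that approach (the helper names are made up for illustration): first decode the hex into raw bytes, then apply the BOM and NUL-byte heuristics to guess whether the data is UTF-16; otherwise treat it as UTF-8.
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<unsigned char> hexToBytes(const std::string &hex) {
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd-length hex string");
    std::vector<unsigned char> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2)
        bytes.push_back(static_cast<unsigned char>(
            std::stoi(hex.substr(i, 2), nullptr, 16)));
    return bytes;
}

bool looksLikeUtf16(const std::vector<unsigned char> &bytes) {
    // BOM: FF FE (little endian) or FE FF (big endian).
    if (bytes.size() >= 2 &&
        ((bytes[0] == 0xFF && bytes[1] == 0xFE) ||
         (bytes[0] == 0xFE && bytes[1] == 0xFF)))
        return true;
    // Heuristic: US-ASCII text encoded as UTF-16 contains NUL bytes.
    for (unsigned char b : bytes)
        if (b == 0x00)
            return true;
    return false;
}

// Example: hexToBytes("657374696E65") yields the bytes of "estine";
// looksLikeUtf16 returns false for them, so the data would be treated as UTF-8.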

Do we need to consider encoding (UTF-8) while constructing a string from char* buffer

I am working on an HTTP client module which receives information from the server in a character buffer that is UTF-8 encoded. I want to create a std::string object from this character buffer.
Can I create a string object directly by passing the character buffer like this?
std::string receivedstring(receievedbuffer,bufferlength);
Here receievedbuffer is a char[] array containing the data received from the TCP/IP connection, and bufferlength is the number of bytes received. I am really confused by the term UTF-8. I understand that it is a Unicode encoding; do I need to take any steps before the conversion?
std::string receivedstring(receievedbuffer,bufferlength);
It does not do any conversion, it just copies from receievedbuffer to receivedstring.
If your receievedbuffer was UTF-8 encoded, then the exact same bytes will be stored in receivedstring.
std::string is just a storage format and does not reflect the encoding of the data stored in it.
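A small illustration of that point (the "café" literal is just an example, not from the question): the constructor copies bufferlength bytes verbatim, multi-byte UTF-8 sequences included, and size() reports bytes rather than characters.
#include <cassert>
#include <cstddef>
#include <string>

int main() {
    const char receievedbuffer[] = "caf\xC3\xA9";   // "café" encoded as UTF-8 (5 bytes)
    std::size_t bufferlength = sizeof(receievedbuffer) - 1;

    std::string receivedstring(receievedbuffer, bufferlength);
    assert(receivedstring.size() == 5);             // byte count, not character count
    return 0;
}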

IV value read from binary file is not correct

I have an encrypted binary file of size 256*N bytes.
The last 16 bytes of the first page (256 bytes long) contain the IV needed for decryption.
I fetch it using the code below:
infile.seek(240,0)
iv = infile.read(16)
(infile is the input file.) The IV value does not match the one in the bin file.
Also, is it fine if I just pass this "iv" to AES.new, as in the code below?
decryptor = AES.new(key, AES.MODE_CBC, iv)
Also, if I have to pass a hard-coded IV value to the AES.new function, in what format do I need to pass it? I have a 16-byte hex value and I need to convert it into a byte string, right?
Please let me know how to do it.
First question
Yes, that seems to be the proper method to read the IV. Make sure you opened the file in binary mode though, not in text mode.
Second question
Yes, if the IV is a 16-byte binary value, that would be correct.
Third question
Using a constant IV would defeat the purpose of the IV altogether. But if you must use one that is specified in hexadecimal, you should unhexlify it first (in Python, binascii.unhexlify).

LPBYTE data to CString in MFC

I am encrypting data using the CryptProtectData function and getting the encrypted data as an LPBYTE buffer. I want to save that data to a file and then read it back for decryption.
In order to write string in file, I used following one to convert LPBYTE data to CString:
CString strEncrUName = (wchar_t *)encryptedUN;
I even tried "How to convert from BYTE array to CString in MFC?" but it still does not work.
The character set used is Unicode.
Thanks in advance
The encrypted data is a buffer of raw bytes, not characters. If you want to convert it to a string, you'll have to encode it somehow, such as by converting it to Hex chars.
e.g. byte 0xD5 becomes 2 chars: "D5"
Looping through each byte and converting it to hex chars is an easy exercise left to the reader.
Of course, you'll have to convert it back to binary after you read the file.
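A sketch of that hex round-trip (the helper names are illustrative, not from your code): encode the encrypted bytes as hex text for the file, and parse the hex back into bytes before handing it to CryptUnprotectData.
#include <afx.h>        // CString (MFC); <atlstr.h> also provides CString outside MFC
#include <vector>

// Encode a raw byte buffer as hex text, e.g. byte 0xD5 becomes "D5".
CString BytesToHex(const BYTE *data, DWORD len) {
    CString hex, byteStr;
    for (DWORD i = 0; i < len; ++i) {
        byteStr.Format(_T("%02X"), data[i]);
        hex += byteStr;
    }
    return hex;
}

// Parse the hex text back into the original bytes after reading the file.
std::vector<BYTE> HexToBytes(const CString &hex) {
    std::vector<BYTE> bytes;
    for (int i = 0; i + 1 < hex.GetLength(); i += 2)
        bytes.push_back((BYTE)_tcstoul(hex.Mid(i, 2), nullptr, 16));
    return bytes;
}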
Are you sure you want to save it to a text file? Your other option is to save the binary encrypted data to a binary file: no need to convert to/from a string.
If your pointer represents a zero-terminated string:
LPBYTE pByte;
CString str(LPCSTR(pByte));