Length of null terminated string in C/C++


I am using OpenSSL to encrypt a string, and I get back what I treat as a null-terminated string. I now have the encrypted string and want to send it over the network with base64 encoding. I only need to send the encrypted data; how can I calculate the length of the string on the other side before decryption?
unsigned char *plaintext = (unsigned char*) "The quick brown fox jumps sover thes lazy dog";
unsigned char ciphertext[256] = {};
// After using openssl envelope_seal(), with EVP_aes_128_cbc()
strlen((const char*)ciphertext) // length is different each time due to null terminated binary string
sizeof(ciphertext) // length is always 256 (the size of the array)
int envelope_seal( ) // gives me the length of that string as 48.
Kindly help me calculate the length of the ciphertext, which is 48; none of the methods I know gives me the correct output.

AES is a block cipher with a block size of 16 bytes, which means the data you encrypt must have a length in bytes that is a multiple of 16 (if it does not, you need padding such as PKCS7).
After you encrypt a string with AES (say the string is 32 bytes long), you can't use strlen anymore to get the length of the result, because the result isn't a string anymore; it is a byte array that may contain zero bytes anywhere. You don't actually need to measure the length anyway: if no padding is added, the ciphertext is the same size as the plaintext, 32 bytes in our case.
So I think you don't have issues with calculating the length anymore. If the other side needs to know the length of the ciphertext, you can send the length (say 32 in our case) in advance in the packet. The other side can then reconstruct the plaintext (and remove the padding bytes if padding was used).
Note: after you have performed the encryption and have the ciphertext, you can apply base64 encoding to it and send it over, but you could just as well send the byte array representing the ciphertext directly.
In regard to the comments, I will try to briefly highlight how this process goes. Say you have the string char * str = "hello", which is 5 bytes, and you need to encrypt it. As I said, you can't encrypt it directly; you need to pad it to a multiple of 16. For this you can use PKCS7 padding (the generalization of PKCS5 to other block sizes, including 16 bytes). When you apply the padding, e.g., char * paddedText = PKCS7(str), you end up with a byte array that is 16 bytes long.
Now, there is no more problem. You can encrypt this 16 bytes of plaintext. Result will also be 16 bytes cipher text.
If you now want to decrypt these 16 bytes of ciphertext, you can decrypt them directly (because the length is a multiple of 16). Because of the way PKCS7 padding works, you can easily tell from the result that only the first 5 bytes were the original string, so you remove the 11 redundant bytes (PKCS5 works the same way for 8-byte blocks) and obtain "hello".
Finally, I don't see where the problem is: when you send from client to server, you can just encode the message length, e.g., 16, in the packet, so that the receiver knows that 16 bytes represent the ciphertext. As I said, after decrypting those 16 bytes the receiver can determine that only the first 5 bytes were the original text (thanks to the PKCS7 padding), so there is no need to send anything except 16.
P.S. After some discussion with the OP it seems padding was not the main issue, because the OpenSSL method that was used seems to take care of it.

If the valid data in the array isn't terminated, then there's no way to tell its length by looking at the array.
If envelope_seal told you the length, then use that, passing it wherever the length is needed.

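AES is a block cipher. Therefore the length of the ciphertext will be the length of your plaintext rounded up to the next multiple of the block size. With PKCS7 padding at least one padding byte is always added, so a plaintext that is already a multiple of the block size grows by one full block. For the 45-byte plaintext in the question and a 16-byte block, that gives 48.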
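As a rough illustration of that rounding, the helper below computes the ciphertext length for a block cipher used with PKCS7 padding. The name padded_length is hypothetical (it is not an OpenSSL function), and the sketch assumes padding is always applied.

```cpp
#include <cstddef>

// Hypothetical helper: ciphertext length for a block cipher with PKCS7
// padding. At least one padding byte is always added, so an input that is
// already a multiple of the block size grows by one full block.
constexpr std::size_t padded_length(std::size_t plain_len, std::size_t block = 16) {
    return (plain_len / block + 1) * block;
}
```

For the 45-byte plaintext in the question, padded_length(45) evaluates to 48, matching the value returned by envelope_seal.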

Related

Building a fast PNG encoder issues

I am trying to build a fast 8-bit greyscale PNG encoder. Unfortunately I must be misunderstanding part of the spec. Smaller image sizes seem to work, but the larger ones will only open in some image viewers. This image (with multiple DEFLATE blocks) gives a "Decompression error in IDAT" error in my image viewer but opens fine in my browser:
This image has just one DEFLATE block but also gives an error:
Below I will outline what I put in my IDAT chunk in case you can easily spot any mistakes (note, images and steps have been modified based on answers, but there is still a problem):
1. IDAT length
2. "IDAT" in ASCII (literally the bytes 0x49 0x44 0x41 0x54)
3. Zlib header 0x78 0x01
Steps 4-7 are for every deflate block, as the data may need to be broken up:
4. The byte 0x00 or 0x01, depending on whether it is a middle block or the last block.
5. Number of bytes in the block (up to 2^16-1) stored as a little-endian 16-bit integer
6. The 1's complement of this integer representation.
7. Image data (each scanline starts with a zero byte for the "no filter" option in PNG, and is followed by width bytes of greyscale pixel data)
8. An Adler-32 checksum of all the image data
9. A CRC of all the IDAT data
I've tried pngcheck on Linux, and it does not spot any errors. If nobody can see what is wrong, can you point me in the right direction for a debugging tool?
My last resort is to use the libpng library to make my own decoder, and debug from there.
Some people have suggested it may be my adler-32 function calculation:
static uint32_t adler32(uint32_t height, uint32_t width, char** pixel_array)
{
uint32_t a=1,b=0,w,h;
for(h=0;h<height;h++)
{
b+=a;
for(w=0;w<width;w++)
{
a+=pixel_array[h][w];
b+=a;
}
}
return (uint32_t)(((b%65521)*65536)|(a%65521));
}
Note that because the pixel_array passed to the function does not contain the zero-byte at the beginning of each scanline (needed for PNG) there is an extra b+=a (and implicit a+=0) at the beginning of each iteration of the outer loop.

I do get an error with pngcheck: "zlib: inflate error = -3 (data error)". As your PNG scaffolding structure looks okay, it's time to take a low-level look into the IDAT block with a hex viewer. (I'm going to type this up while working through it.)
The header looks alright; the IDAT length is okay. Your zlib flags are 78 01 ("No/low compression", see also What does a zlib header look like?), where one of my own tools uses 78 9C ("Default compression"), but then again, these flags are only informative.
Next: zlib's internal blocks (per RFC1950).
Directly after the compression flags (CMF in RFC1950) it expects FLATE compressed data, which is the only compression scheme zlib supports. And that is in another RFC: RFC1951.
Each separately compressed block is preceded by a header byte:
3.2.3. Details of block format
    Each block of compressed data begins with 3 header bits
    containing the following data:
        first bit       BFINAL
        next 2 bits     BTYPE
    ...
    BFINAL is set if and only if this is the last block of the data
    set.
    BTYPE specifies how the data are compressed, as follows:
        00 - no compression
        01 - compressed with fixed Huffman codes
        10 - compressed with dynamic Huffman codes
        11 - reserved (error)
So this value can be set to 00 for 'not last block, uncompressed' and to 01 for 'last block, uncompressed', immediately followed by the length (2 bytes) and its bitwise inverse, per 3.2.4. Non-compressed blocks (BTYPE=00):
3.2.4. Non-compressed blocks (BTYPE=00)
    Any bits of input up to the next byte boundary are ignored.
    The rest of the block consists of the following information:
          0   1   2   3   4...
        +---+---+---+---+================================+
        |  LEN  | NLEN  |... LEN bytes of literal data...|
        +---+---+---+---+================================+
    LEN is the number of data bytes in the block. NLEN is the
    one's complement of LEN.
They are the final 4 bytes in your IDAT segment. Why do small images work, and larger ones not? Because you only have 2 bytes for the length [1]. You need to break up your image into blocks no larger than 65,535 bytes (in my own PNG creator I seem to have used 32,768, probably "for safety"). If it is the last block, write out 01, else 00. Then add the two LEN bytes and the two NLEN bytes, properly encoded (least significant byte first), followed by exactly LEN data bytes. Repeat until done.
The Adler-32 checksum is not part of this Flate-compressed data, and should not be counted in the blocks of LEN data. (It is still part of the IDAT block, though.)
After re-reading your question to verify I addressed all of your issues (and confirming I spelled "Adler-32" correctly), I realized you describe all of the steps right -- except that the 'last block' indicator is 01, not 80 (later edit: uh, perhaps you are right about that!) -- but that it does not show in this sample PNG. See if you can get it to work following all of the steps by the letter.
Kudos for doing this 'by hand'. It's a nice exercise in 'following the specs', and if you get this to work, it may be worthwhile to try and add proper compression. I shun pre-made libraries as much as possible; the only allowance I made for my own PNG encoder/decoder was to use Rich Geldreich's miniz.c, because implementing proper Flate encoding/decoding is beyond my ken.
[1] That's not the whole story. Browsers are particularly forgiving of HTML errors; it seems they are just as forgiving of PNG errors. Safari displays your image just fine, and so does Preview. But they may all just be sharing OS X's PNG decoder, because Photoshop rejects the file.
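The block-splitting procedure described in this answer can be sketched roughly as follows. The function name stored_deflate is hypothetical; it only produces the sequence of stored (BTYPE=00) blocks, so the two zlib header bytes before it and the Adler-32 after it still have to be written by the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: wrap raw bytes in uncompressed DEFLATE blocks.
// Each block is: header byte (0x01 if last, else 0x00), LEN and NLEN as
// little-endian 16-bit values, then LEN bytes of literal data.
std::vector<uint8_t> stored_deflate(const std::vector<uint8_t>& raw) {
    const std::size_t kMaxBlock = 65535;  // LEN is only 16 bits wide
    std::vector<uint8_t> out;
    std::size_t pos = 0;
    do {
        const std::size_t len = std::min(kMaxBlock, raw.size() - pos);
        const bool last = (pos + len == raw.size());
        out.push_back(last ? 0x01 : 0x00);                      // BFINAL in the low bit
        out.push_back(static_cast<uint8_t>(len & 0xFF));        // LEN, low byte
        out.push_back(static_cast<uint8_t>(len >> 8));          // LEN, high byte
        out.push_back(static_cast<uint8_t>(~len & 0xFF));       // NLEN, low byte
        out.push_back(static_cast<uint8_t>((~len >> 8) & 0xFF)); // NLEN, high byte
        out.insert(out.end(), raw.begin() + pos, raw.begin() + pos + len);
        pos += len;
    } while (pos < raw.size());
    return out;
}
```

An empty input still yields one final block with LEN = 0, so the stream is always terminated.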

The byte 0x00 or 0x80, depending on if it is a middle or the last block.
Change the 0x80 to 0x01 and all will be well.
The 0x80 is appearing as a stored block that is not the last block. All that's being looked at is the low bit, which is zero, indicating a middle block. All of the data is in that "middle" block, so a decoder will recover the full image. Some liberal PNG decoders may then ignore the errors they get when they try to decode the next block, which isn't there, and then ignore the missing check values (Adler-32 and CRC-32), etc. That's why it shows up OK in browsers, even though it is an invalid PNG file.
There are two things wrong with your Adler-32 code. First, you are accessing the data from a char array. char is signed on your platform, so your 0xff bytes are being added not as 255, but rather as -1. You need to make the array unsigned char or cast it to that before extracting byte values from it.
Second, you are doing the modulo operation too late. You must do the % 65521 before the uint32_t overflows. Otherwise you don't get the modulo of the sum as required by the algorithm. A simple fix would be to do the % 65521 to a and b right after the width loop, inside the height loop. This will work so long as you can guarantee that the width will be less than 5551 bytes. (Why 5551 is left as an exercise for the reader.) If you cannot guarantee that, then you will need to embed another loop to consume bytes from the line until you get to 5551 of them, do the modulo, and then continue with the line. Or, a smidge slower, just run a counter and do the modulo when it gets to the limit.
Here is an example of a version that works for any width:
static uint32_t adler32(uint32_t height, uint32_t width, unsigned char ** pixel_array)
{
uint32_t a = 1, b = 0, w, h, k;
for (h = 0; h < height; h++)
{
b += a;
w = k = 0;
while (k < width) {
k += 5551;
if (k > width)
k = width;
while (w < k) {
a += pixel_array[h][w++];
b += a;
}
a %= 65521;
b %= 65521;
}
}
return (b << 16) | a;
}
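When debugging a checksum like this, it can help to compare against a deliberately slow reference that works on a flat byte buffer. The sketch below is such a cross-check, not a replacement for the function above; adler32_reference is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Slow but simple reference: reduce modulo 65521 after every byte, so the
// running sums can never overflow. Bytes are read as unsigned values.
uint32_t adler32_reference(const unsigned char* data, std::size_t len) {
    uint32_t a = 1, b = 0;
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}
```

To compare it with the per-scanline version, feed it the same byte stream the PNG decoder sees, that is, each scanline preceded by its zero filter byte.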

Sending a fixed-length header

I'm trying to send data with a fixed-length header that tells the server how many bytes of data it will need to have available before it reads them. I'm having trouble doing this, though. The maximum number of bytes of data I want to be able to send at once is 65536, so I'm sending a uint16_t type variable as the header of my data (the largest value it can represent is 65535).
The problem is, a uint16_t takes up two bytes, but numbers less than 255 only require one byte. So I have this code on the client side:
uint16_t messageSize = clientSendBuf.size(); //clientSendBuf is the data I want to send
char *bytes((char*)&messageSize);
clientSendBuf.prepend(bytes);
client.write(clientSendBuf);
And on the server, I handle receiving messages like this:
char serverReceiveBuf[65536];
uint16_t messageSize;
client->read((char*)&messageSize, sizeof(uint16_t));
client->read(serverReceiveBuf, messageSize);
I'm going to change this around a bit later because it's not the best solution (particularly for when all of the data isn't available yet), but I want to get this fixed first. My problem is that when clientSendBuf.size() is too small (in my test case it was 16 bytes, I assume this happens for every value under 255) reading data with
client->read((char*)&messageSize, sizeof(uint16_t));
reads a second byte that isn't part of the header, giving an incorrect value for messageSize and crashing the server. If I replace sizeof(uint16_t) with 1, then the server reads the data fine as I'd expect, although then I have a messageSize maximum of 255, which is much lower than I want. How do I make it so that the messageSize prepended to clientSendBuf is always two bytes, even for numbers <255?

Your
clientSendBuf.prepend(bytes);
should also be told that it needs to send 2 bytes. As written, it treats bytes as a zero-terminated string, so it stops after the first byte: on your platform the second byte of 0x0010 is zero (little-endian byte order: 0x10, 0x00).
The prepend(char*, int) method will do the trick:
// use this instead:
clientSendBuf.prepend(bytes, sizeof(messageSize));
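A two-byte header that does not depend on the host's byte order can be sketched like this. The helpers frame_message and read_length are hypothetical and use std::vector instead of the buffer class from the question; the sketch assumes both ends agree on most-significant-byte-first order, which differs from the raw pointer cast above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: prefix a payload with its length as exactly two
// bytes, most significant byte first, regardless of host byte order.
std::vector<uint8_t> frame_message(const std::vector<uint8_t>& payload) {
    // The caller is assumed to keep payload.size() <= 65535.
    const uint16_t size = static_cast<uint16_t>(payload.size());
    std::vector<uint8_t> out;
    out.reserve(payload.size() + 2);
    out.push_back(static_cast<uint8_t>(size >> 8));
    out.push_back(static_cast<uint8_t>(size & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Hypothetical helper: decode the two header bytes back into a length.
uint16_t read_length(const uint8_t* header) {
    return static_cast<uint16_t>((header[0] << 8) | header[1]);
}
```

A 16-byte payload then always produces the header bytes 0x00 0x10, so the receiver can read two bytes unconditionally.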

DES encryption/decryption

I'm using a DES algorithm I found on the web. It works fine but I have a problem.
As you know, DES encrypts/decrypts blocks of 64 bytes each. But what happens if in a big file the last block doesn't end at a 64 byte block boundary? I know, there would be errors.
I'm using the following code:
Des d1;
char *plaintext=new char[64];
char *chyphertext=new char[64];
h.open("requisiti.txt",ios::in|ios::binary);
k.open("requisiti2.txt",ios::out|ios::binary);
while(!h.eof())
{
h.read(plaintext,64);
chyphertext=d1.Encrypt(plaintext);
//decryption is the same.just change Encrypt to Decrypt
k.write(chyphertext,64);
}
h.close();
k.close();
remove("requisiti.txt");
rename("requisiti2.txt","requisiti.txt");
So I need a solution like "padding", but I don't know a simple algorithm for it. Please help me to encrypt/decrypt file in a good way.

First, I'd like to point out that DES works on 64-bit chunks (that is 8 bytes, not 64), as you can see in http://en.wikipedia.org/wiki/Data_Encryption_Standard (check the data block size).
Now you're looking for some padding (and unpadding when deciphering). You can look at http://en.wikipedia.org/wiki/Padding_(cryptography)
I personally like PKCS#7 because it's easy and usually adds little overhead compared to the original size.
For encryption:
check the size of the chunk you just read from file
if it's 64bits, add a new chunk [8,8,8,...8], otherwise, pad it with the number of missing bytes (see example below)
encrypt
note that LAST packet is always containing padding with that algorithm (worst case is 8 bytes of padding)
Example:
read 0a 0b 0c, missing 5 bytes to fit in 8 bytes
padded packet :0a 0b 0c 05 05 05 05 05
For decryption :
read packet
decrypt
if it's the last packet, check value of the last byte (say it's n)
remove n bytes at the end of your packet
Hope this makes it more clear and helps you
EDIT
If your input file is pure text, you can pad with 0; if it's binary (and it must be, since you're opening it as binary), PKCS#7 is better.
Think about a file created like this: dd if=/dev/zero of=temp.zero count=100
It contains nothing but zero bytes: what is padding and what isn't?
Implementation is really easy:
think memset
don't forget to add a last chunk if the file size is a multiple of 8
By the way, DES is nowadays seriously broken; you should think about using a decent cipher if you are concerned with security (AES at least, check http://en.wikipedia.org/wiki/Data_Encryption_Standard#Replacement_algorithms )
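The PKCS#7 steps described in this answer can be sketched as follows for 8-byte blocks. The function names pkcs7_pad and pkcs7_unpad are hypothetical, and the sketch works on a whole in-memory buffer; when reading a file chunk by chunk, only the final chunk would be padded.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append n bytes of value n, where n is the number of
// bytes missing to reach the next block boundary (a full block if none).
std::vector<uint8_t> pkcs7_pad(std::vector<uint8_t> data, std::size_t block = 8) {
    const std::size_t n = block - data.size() % block;  // always 1..block
    data.insert(data.end(), n, static_cast<uint8_t>(n));
    return data;
}

// Hypothetical helper: strip the padding in place. Returns false if the
// buffer does not end in valid padding.
bool pkcs7_unpad(std::vector<uint8_t>& data, std::size_t block = 8) {
    if (data.empty() || data.size() % block != 0) return false;
    const uint8_t n = data.back();
    if (n == 0 || n > block) return false;
    for (std::size_t i = data.size() - n; i < data.size(); ++i)
        if (data[i] != n) return false;  // every padding byte must equal n
    data.resize(data.size() - n);
    return true;
}
```

With the example from the answer, padding 0a 0b 0c gives 0a 0b 0c 05 05 05 05 05, and an input of exactly 8 bytes gains a full extra block of 08 bytes.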

First of all: never use eof() as the loop condition to check whether the end of file is reached, since it doesn't predict the end of file; it only reports that a previous read already hit it.
// read() reports failure on a short final block, so also test gcount() to catch it
while (h.read(plaintext, 64) || h.gcount() > 0)
{
    // gcount() returns the number of characters extracted by the last unformatted input operation
    for (std::streamsize i = h.gcount(); i < 64; i++)
        plaintext[i] = 0; // zero-pad the last block
    chyphertext = d1.Encrypt(plaintext);
    // decryption is the same, just change Encrypt to Decrypt
    k.write(chyphertext, 64);
}

I'm not sure what you are using DES for, but you should really use something else if you are actually trying to protect the data you are encrypting. DES is not secure anymore.
Also, I would imagine that a good library would do the padding for you.

First of all, do not use DES! DES is broken and can be brute-forced quite fast. Secondly, you are using ECB mode; you can read on the wiki why you should avoid it. Your data can also be tampered with without you knowing about it, so use an authenticated encryption (AE) mode like GCM. As someone mentioned earlier, DES has a block size of 64 bits, not bytes, which is 8 bytes.

Will TripleDes alter the Datasize

I have code that encrypts and decrypts data using Triple DES.
Everything works fine with the code.
I have a question about Triple DES.
Will Triple DES alter the data size during the encryption process?
I googled and was totally confused by the answers I got.
Will it alter the size? If yes, how do I find the size of the encrypted data?
Here is the code :
unsigned char initVector[8];
unsigned char* block;
int j;
memset(initVector, 0xEE, sizeof(initVector));
nBlocks = dwDataSize / 8;
for (i=0; i < nBlocks; i++)
{
block = (unsigned char*) pData + i*8;
memset(initVector, 0xEE, sizeof(initVector));
des_ede3_cbc_encrypt((unsigned char *)block,(unsigned char *)block, 8,
m_Schedule1 , m_Schedule2, m_Schedule3, (C_Block *)initVector, DES_ENCRYPT);
I saw in another one discussion that the size will change.
Here is the link.
Length of Encrypted String
Regards,
Siva./

TripleDES is a block cipher primitive. Block ciphers work by creating a permutation of a block of input data (which is supposed to be indistinguishable from random data) based on a key, which can only be reversed if the key is known.
As such, the encrypted data occupies exactly the same amount of space as the input data (except perhaps for padding of the final block). Typical block sizes are 8 bytes (DES, Triple DES) or 16 bytes (AES).
(A thought experiment: It would be impossible for the cipher text to be shorter than the input, because then two distinct inputs would have to map to the same cipher text, which is impossible. Conversely, if the cipher text were longer, then there would be certain cipher texts that can never be the result of an encryption, thus not being "indistinguishable from random data".)

It depends. To be more precise, it depends on the following elements:
the encoding of the cipher text and plain text
the encryption mode
the padding mode & block size
the NONCE or IV
the (optional) authentication tag
3DES is a block cipher. It is a seemingly random permutation on bits (mostly using bytes as the minimum element). A single 3DES operation takes 64 bits (8 bytes) as input and generates output of the same size.
To start with the first one: if you encrypt a piece of text (a character string) then you need to encode the string to bytes first. If you expect the cipher text to be stored in a string, you will need to convert the result into a string.
Next is the encryption mode: if this is a mode that converts the 3DES block cipher into a stream cipher (e.g. CTR) then the input size is identical to the output size, excluding the NONCE.
Then there is padding mode. If you use ECB or CBC mode encryption then you must pad if the plain text has length x, x % n != 0 and n is the block size in bytes. If you can distinguish the plain text from the padding, then you can add 0 to n - 1 bytes of padding. If you cannot, then you need to always pad, adding 1 to n bytes of padding. PKCS#5 padding (the most common one) always pads.
Normally you need to transfer the IV or NONCE as well. Both of them are normally about the same as the block size. A common option is to prepend the IV to the cipher text. This is often performed for CBC mode encryption which you apply. The only time you should not create a new (random) IV is when you use the key only a single time.
Most of the time you should add integrity protection to cipher text. If you use e.g. GCM mode encryption, then you need some additional space for the authentication tag. If you use a MAC or HMAC then this should be included on top of the cipher text.
There is also such a thing as cipher text stealing, which can be used to do away with padding. Finally, you may not need an IV for certain modes of single block encryption.
In your case:
If you work with bytes, use CBC mode encryption, prepend the IV and use PKCS#5 padding then the calculation would be (n) + ((x) + (n - x % n)). For 3DES, n = 8.
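As a quick check of that calculation, the formula can be written as a small function. The name encrypted_size is hypothetical; it assumes exactly the layout described above (IV of one block prepended, CBC mode, padding always applied).

```cpp
#include <cstddef>

// Hypothetical helper: total message size for IV || CBC ciphertext with
// always-on PKCS#5/PKCS#7 padding. x is the plaintext length in bytes,
// n is the block size in bytes (8 for 3DES).
constexpr std::size_t encrypted_size(std::size_t x, std::size_t n = 8) {
    return n + (x + (n - x % n));
}
```

For a 5-byte plaintext this gives 8 + (5 + 3) = 16 bytes, and for an 8-byte plaintext 8 + (8 + 8) = 24 bytes, because a full block of padding is added when the input is already aligned.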

Arduino Ethernet Byte size problem

I'm using an Arduino (duemilanove) with the official Ethernet shield to send data to the controller for controlling an LED matrix. I am trying to send some raw 32-bit unsigned int values (unix timestamps) to the controller by taking the 4 bytes in the 32-bit value on the desktop and sending it to the arduino as 4 consecutive bytes. However, whenever a byte value is larger than 127, the returned value by the ethernet client library is 63.
The following is a basic example of what I'm doing on the arduino side of things. Some things have been removed for neatness.
byte buffer[32];
memset(buffer, 0, 32);
int data;
int i=0;
data = client.read();
while(data != -1 && i < 32)
{
buffer[i++] = (byte)data;
data = client.read();
}
So, whenever the input byte is bigger than 127 the variable "data" will end up getting set to 63! At first I thought the problem was further down the line (buffer used to be char instead of byte) but when I print out "data" right after the read, it's still 63.
Any ideas what could be causing this? I know client.read() is supposed to output int and internally reads data from the socket as uint8_t which is a full byte and unsigned, so I should be able to at least go to 255...
EDIT: Right you are, Hans. Didn't realize that Encoding.ASCII.GetBytes only supported the first 7 bits and not all 8.

I'm more inclined to suspect the transmit side. Are you positive the transmit side is working correctly? Have you verified with a wireshark capture or some such?

63 is the ASCII code for ?. That value is significant: ASCII doesn't have character codes for values over 127, and an ASCII encoder commonly replaces invalid codes like this with a question mark. That is the default behavior of the .NET Encoding.ASCII encoder, for example.
It isn't exactly clear where that might happen. Definitely not in your snippet. Probably on the other end of the wire. Write bytes, not characters.
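"Write bytes, not characters" could look roughly like this for a 32-bit timestamp. The helper names pack_u32 and unpack_u32 are hypothetical, and the most-significant-byte-first order is an assumption; any order works as long as sender and receiver agree.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (sender side): split a 32-bit value into four raw
// bytes, most significant first. No text encoding is involved, so byte
// values above 127 pass through unchanged.
std::array<uint8_t, 4> pack_u32(uint32_t v) {
    return {{ static_cast<uint8_t>(v >> 24), static_cast<uint8_t>(v >> 16),
              static_cast<uint8_t>(v >> 8),  static_cast<uint8_t>(v) }};
}

// Hypothetical helper (receiver side): rebuild the value from four bytes,
// e.g. the first four entries of the buffer filled from client.read().
uint32_t unpack_u32(const uint8_t* b) {
    return (static_cast<uint32_t>(b[0]) << 24) |
           (static_cast<uint32_t>(b[1]) << 16) |
           (static_cast<uint32_t>(b[2]) << 8)  |
            static_cast<uint32_t>(b[3]);
}
```

On the desktop side the four bytes would be written to the socket as a byte array rather than passed through a string encoder.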

+1 for Hans Passant and Karl Bielefeldt.
Can you just send the data without encoding? How is the data being sent? TCP/UDP/IP/Ethernet definitely support sending binary data without restriction. If this isn't possible, perhaps converting the data to hex will solve the problem. Base64 will also work (better) but is considerably more work. For small amounts of data, hex is probably the easiest and fastest solution.
+1 again to Karl and Ben for mentioning wireshark. Invaluable for debugging network problems like this.
