Using reinterpret_cast to convert char* to vector<byte> - C++

I am using a library called Botan for encryption, but the problem here does not seem to be related to the library; it looks like a C++ casting issue. Using the library, a 16-byte vector is created as below:
SecureVector<byte> salt = rng.random_vec(16);
Then it is converted to a string as:
std::string salt_string ((const char*)salt.begin() , salt.size());
Using Qt I can just display the string as:
ui->textEdit->append("Salt is : "+ QString::fromStdString(salt_string));
Now I need to write this to a file and regenerate the vector at a later time.
It is written to a file as,
ofstream outfile ("salt.txt" , std::ios::binary);
outfile.write((const char*)salt.begin(), salt.size());
Up to this point the code seems to work fine; the problem occurs when reading the data back and regenerating the vector.
Here is how I read the data into a char* array:
ifstream infile ("salt.txt" , std::ios::binary );
char* salt = new char[16];
infile.read(salt , 16 );
Now I need to recreate the SecureVector<byte> as salt2. I tried to do it using reinterpret_cast as below:
SecureVector<byte> salt2 = reinterpret_cast<byte> (salt);
This compiles without errors but produces an empty string when I try to display it the way I displayed salt above. What am I doing wrong, and how do I do the conversion correctly? Any help or advice will be highly appreciated.

reinterpret_cast doesn't magically convert one type to another, even if it appears to do so. Frankly, unless and until you understand what it does do, you should never use it.
To make a vector contain the bytes from an array, create the vector and then add the bytes to it. You can't do this using a cast.
SecureVector<byte> salt2(salt, salt + 16);

The problem here is your assignment:
SecureVector<byte> salt2 = reinterpret_cast<byte>(salt);
You are converting the char* into a byte, so a pointer is being converted to a single byte. (I assume you meant to convert it to a byte*, note the extra *, but that did not compile, so you took off the * to see what would happen.) What that does is undefined, and a very bad idea, but you end up with a byte.
It compiles because SecureVector<byte> has a constructor that takes a size_t as a parameter. A size_t is an integer, as is a byte, so the compiler applied an implicit conversion and constructed your vector using the byte as its size.
What you actually want to do is use the constructor that takes a pointer to byte and a size. See: SecureVector.
SecureVector<byte> salt2(reinterpret_cast<byte*>(salt), 16);
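For completeness, here is a minimal round-trip sketch built around that constructor. It assumes, as in the question, that SecureVector<byte>::begin() yields a raw pointer and that the (pointer, length) constructor shown above is available in your Botan version:
#include <fstream>

// Write the 16-byte salt out as raw bytes.
void save_salt(const SecureVector<byte>& salt)
{
    std::ofstream outfile("salt.txt", std::ios::binary);
    outfile.write(reinterpret_cast<const char*>(salt.begin()), salt.size());
}

// Read the raw bytes back and rebuild the vector with the (pointer, length) constructor.
SecureVector<byte> load_salt()
{
    std::ifstream infile("salt.txt", std::ios::binary);
    char buffer[16];
    infile.read(buffer, 16);
    return SecureVector<byte>(reinterpret_cast<const byte*>(buffer), 16);
}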

It's ugly, but due to the type conversion you may have to just do a for loop here:
for (int i = 0; i < 16; ++i)
    salt2.push_back(static_cast<byte>(salt[i]));
(Converting a single char value to a byte is a static_cast, not a reinterpret_cast.)
I don't think casting the pointer directly can work, because a vector isn't laid out the same way an array is in memory; it has to contain other information, such as its size.

Binary writing in C++, how to implement it?

Update: Still no one has given me an answer to my main question: how do I save a string, char by char or as a whole (I want to ignore the trailing null)?
Today I learned something new, which wasn't that clear to me.
I know how to save data as binary to a file. First I open it like this:
std::ofstream outfile(filename, std::ios_base::binary);
and then if I want to write a number I do the following:
outfile.write((const char*)&num, sizeof(int));
But what about writing a string, how do I do that? Char by char, or is there a faster method? And what should its size be?
You can use the c_str() method of std::string to get the character array stored inside the string object; it returns a const char*, which is the type file.write() expects. For the size, use the length() method of std::string. The code can look like this:
string mystr = "hello world";
outfile.write(mystr.c_str(), mystr.length());
As for
outfile.write((const char*)&num1, sizeof(unsigned int));
it does write something, but it is the raw bytes of the integer rather than readable text, so don't expect to see the number if you open the file in an editor. The equivalent C++-style cast is:
outfile.write(reinterpret_cast<char*>(&num1), sizeof(num1));
If you instead want the number saved as readable text, you can convert your int to characters with char* _itoa(int value, char* str, int base) and write that array to the file; for the str buffer, allocate as many chars as the integer has digits (plus one for the terminating null).
PS: _itoa is a C-style function (non-standard), so using it in C++ may require defining _CRT_SECURE_NO_WARNINGS in the preprocessor settings.
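Tying the two together, here is a small sketch of one common layout: a length-prefixed string followed by the raw characters with no terminating null. The write_record/read_record names are just illustrative:
#include <cstdint>
#include <fstream>
#include <string>

void write_record(const std::string& filename, std::uint32_t num, const std::string& text)
{
    std::ofstream outfile(filename, std::ios_base::binary);
    outfile.write(reinterpret_cast<const char*>(&num), sizeof(num));  // raw bytes of the integer
    std::uint32_t len = static_cast<std::uint32_t>(text.length());
    outfile.write(reinterpret_cast<const char*>(&len), sizeof(len));  // store the length first...
    outfile.write(text.c_str(), len);                                 // ...then the characters, no null
}

void read_record(const std::string& filename, std::uint32_t& num, std::string& text)
{
    std::ifstream infile(filename, std::ios_base::binary);
    infile.read(reinterpret_cast<char*>(&num), sizeof(num));
    std::uint32_t len = 0;
    infile.read(reinterpret_cast<char*>(&len), sizeof(len));
    text.resize(len);
    infile.read(&text[0], len);                                       // fill the string in place
}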

Dynamic memory in a function: new char[size] vs char[size]

So I have this function that builds a string in a pre-defined buffer (the buffer size is determined when calling the function).
My question is: why doesn't the compiler throw an error when I do the following (without the new operator)?
int crc32test(unsigned char *write_string, int buffer_size){
    // Append CRC32 to string
    int CRC_NBYTES = 4;
    int new_buffer_size = buffer_size + CRC_NBYTES; // Current buffer size + CRC
    // HERE (DECLARATION OF THE STRING)
    unsigned char appendedcrc_string[new_buffer_size];
    return 0;
}
Isn't THIS the correct way to do it?
int crc32test(unsigned char *write_string, int buffer_size){
    // Append CRC32 to string
    int CRC_NBYTES = 4;
    int new_buffer_size = buffer_size + CRC_NBYTES; // Current buffer size + CRC
    // HERE (DECLARATION OF THE STRING USING NEW)
    unsigned char * appendedcrc_string = new unsigned char[new_buffer_size+1];
    delete[] appendedcrc_string;
    return 0;
}
And I actually compiled both, and both worked. Why isn't the compiler throwing me any error?
And is there a reason to use the new operator if apparently the former function works too?
There are a few answers here already, and I'm going to repeat several things that have been said. The first form you use is not valid C++, but it will work in certain versions of GCC and Clang; it is decidedly non-portable.
There are a few options that you have as alternatives:
Use std::basic_string<unsigned char> for your input and s.append(reinterpret_cast<const unsigned char*>(&crc), 4);
Similarly, you can use std::vector<unsigned char>
If your need is just for a simple resizable buffer, you can use std::unique_ptr<unsigned char[]> and use memcpy & std::swap, etc to move the data into a resized buffer and then free the old buffer.
As a non-portable alternative for temporary buffer creation, the alloca() function carves out a buffer by twiddling the stack pointer. It doesn't play very well with C++ features but it can be used if extremely careful about ensuring that the function will never have an exception thrown from it.
Store the CRC with the buffer in a structure like
struct input {
    std::unique_ptr<unsigned char[]> buffer;
    uint32_t crc;
};
And deal with the concatenation of the CRC and buffer somewhere else in your code (i.e. on output), as sketched below. This, I believe, is the best method.
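As a rough sketch of what "concatenate on output" could look like (the length field and the write_with_crc function are hypothetical additions, not part of the original code):
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <memory>

struct input {
    std::unique_ptr<unsigned char[]> buffer;
    std::size_t                      length;  // hypothetical: how many bytes buffer holds
    std::uint32_t                    crc;
};

void write_with_crc(std::ofstream& out, const input& in)
{
    out.write(reinterpret_cast<const char*>(in.buffer.get()),
              static_cast<std::streamsize>(in.length));                 // the data...
    out.write(reinterpret_cast<const char*>(&in.crc), sizeof(in.crc));  // ...with the CRC appended only here
}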
The first code is ill-formed; however, some compilers default to a mode where non-standard extensions are accepted.
You should be able to specify compiler switches for standard conformance. For example, in gcc, -std=c++17 -pedantic.
The second code is "correct", although it is not the preferred way either; you should use a container that frees the memory when execution leaves the scope, instead of a manual delete. For example, std::vector<unsigned char> buf(new_buffer_size + 1);.
The first example uses a C99 feature called variable length arrays (VLAs), which e.g. g++ supports by default as a C++ language extension. It's non-standard code.
Instead of the second example and similar, you should preferably use std::vector.
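A sketch of the same function rewritten with std::vector, so the buffer is released automatically when it goes out of scope (the memcpy is only illustrative; the original function never actually filled the buffer):
#include <cstring>
#include <vector>

int crc32test(unsigned char* write_string, int buffer_size)
{
    const int CRC_NBYTES = 4;
    std::vector<unsigned char> appendedcrc_string(buffer_size + CRC_NBYTES);  // freed automatically
    std::memcpy(appendedcrc_string.data(), write_string, buffer_size);        // copy the input bytes
    // ... compute the CRC and place it in the last 4 bytes ...
    return 0;
}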

The size of my buffer changes when I convert an unsigned char* to a string

I'm having issues with a type conversion that I can't explain.
Here is what I would like to do:
I have a buffer that I dynamically allocate, and I need to convert it to a string in order to use a parsing function from an external library.
My implementation:
unsigned char* msg_data;
msg_data = (unsigned char*)malloc(msg_data_length);
string msg_data_str = std::string(reinterpret_cast<const char*>(msg_data));
SomeObject myObject;
myObject.ParseFromString(msg_data_str);
But here is the thing: my parsing function fails because it receives data of the wrong size.
Let's say that I have a buffer of size msg_data_length = 10; the size of my string is msg_data_str.size() = 14.
I can work around the problem with msg_data_str.resize(msg_data_length),
but I would like to understand why the size of msg_data_str is not msg_data_length in the first place.
Thanks for your help!
I assume that the message data is not actually zero-terminated like a C-style string, which leads to undefined behavior when the std::string constructor goes out of bounds looking for the terminator.
To fix this, use the constructor taking two arguments, the string and the length.
See e.g. this std::string constructor reference.
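A minimal sketch of that constructor, assuming msg_data_length is the number of valid bytes in msg_data:
#include <string>

// The length is given explicitly, so no terminator is searched for.
std::string msg_data_str(reinterpret_cast<const char*>(msg_data), msg_data_length);
// msg_data_str.size() == msg_data_length; embedded zero bytes are preserved.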

UCHAR* to std::string

I am using the WinAPI for one of the first times, and I have a function that returns a UCHAR*, but I need it as a std::string, because when I try printing it as a UCHAR* it prints a lot of gibberish. There must be some easy way to fix this problem. I Googled this but could not find anything. I don't even know what a UCHAR* is, although it seems to act as some kind of string. I heard that it is a pointer to an unsigned string, but I am not quite sure what that means.
This should work
char temp[5];
memcpy(temp, battery_info.Chemistry, 4);
temp[4] = '\0'; // add nul terminator
std::string s = temp; // convert to string
Because your source data does not necessarily have the usual nul terminator, I've copied the data to a temporary char array, added a nul terminator to make sure, then converted to a std::string.
Since the members of that structure are not null terminated:
std::string chemistry(battery_info.Chemistry, battery_info.Chemistry + 4);
will get you the behavior you want without having to do a memcpy.
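For reference, the (pointer, length) constructor gives an equivalent one-liner (assuming Chemistry is the 4-byte array from BATTERY_INFORMATION, as in the question):
std::string chemistry(reinterpret_cast<const char*>(battery_info.Chemistry), 4);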

Why do they want an 'unsigned char*' and not just a normal string or 'char*'

EDIT: After taking the advice I have rearranged the parameters and types, but the application now crashes when I call the digest() function. Any ideas what's going wrong?
const std::string message = "to be encrypted";
unsigned char* hashMessage;
SHA256::getInstance()->digest( message, hashMessage ); // crash occurs here, what am I doing wrong?
printf("AFTER: n"); //, hashMessage); // line never reached
I am using an open source implementation of the SHA256 algorithm in C++. My problem is understanding how to pass an unsigned char* version of my string so it can be hashed.
This is the function that takes an unsigned char* version of my string:
void SHA256::digest(const std::string &buf, unsigned char *dig) {
    init();
    update(reinterpret_cast<const unsigned char *>(buf.c_str()), static_cast<unsigned int>(buf.length()));
    final();
    digest(dig);
}
How can I convert my string (which I want hashed) to an unsigned char*?
The following code I have made causes a runtime error when I go to print out the string contents:
const std::string hashOutput;
char message[] = "to be encrypted";
printf("BEFORE: %s bb\n", hashOutput.c_str());
SHA256::getInstance()->digest( hashOutput, reinterpret_cast<unsigned char *>(message) );
printf("AFTER: %s\n", hashOutput.c_str()); // CRASH occurs here
PS: I have been looking at many implementations of SHA256 & they all take an unsigned char* as the message to be hashed. Why do they do that? Why not a char* or a string instead?
You have the parameters the wrong way around: buf is the input (the data to be hashed) and dig is the output digest (the hash).
Furthermore, a hash is binary data. You will have to convert said binary data into some string representation prior to printing it to screen. Normally, people choose to use a hexadecimal string for this.
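A small sketch of such a hex encoding, assuming SHA256_DIGEST_SIZE is the library's digest length in bytes (typically 32):
#include <cstddef>
#include <cstdio>
#include <string>

// Hex-encode a raw digest: two hex characters per byte.
std::string to_hex(const unsigned char* digest, std::size_t len)
{
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < len; ++i) {
        std::snprintf(buf, sizeof(buf), "%02x", digest[i]);
        out += buf;
    }
    return out;
}

// Usage: printf("%s\n", to_hex(hashMessage, SHA256_DIGEST_SIZE).c_str());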
The reason that unsigned char is used is that it has guaranteed behaviours under bitwise operations, shifts, and overflow.
char (when it corresponds to signed char) does not give any of these guarantees, and so is far less usable for operations intended to act directly on the underlying bits in a string.
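A tiny illustration of the difference, assuming a typical implementation where plain char is signed and 8 bits wide:
#include <cstdio>

int main()
{
    unsigned char u = 0xFF;
    char          c = static_cast<char>(0xFF);  // -1 where plain char is signed

    std::printf("%d\n", u >> 4);  // always 15: shifting an unsigned value is fully defined
    std::printf("%d\n", c >> 4);  // typically -1: the value is sign-extended before the shift
    return 0;
}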
The answer to the question: "why does it crash?" is "you got lucky!". Your code has undefined behaviour. In short, you are writing through a pointer hashMessage that has never been initialised to point to any memory. A short investigation of the source code for the library that you are using reveals that it requires the digest pointer to point to a block of valid memory that is at least SHA256_DIGEST_SIZE chars long.
To fix this problem, all that you need to do is to make sure that the pointer that you pass in as the digest argument (hashMessage) is properly initialised, and points to a block of memory of sufficient size. In code:
const std::string message("to be encrypted");
unsigned char hashMessage[SHA256_DIGEST_SIZE];
SHA256::getInstance()->digest( message, hashMessage );
//hashMessage should now contain the hash of message.
I don't know how a SHA256 hash is produced, but maybe it involves some sort of arithmetic that needs to be done on an unsigned data type.
Why does it matter? Get a char* from your string object by calling the c_str() method, then cast it to unsigned char*.