C++ String to byteArray Conversion and Addition - c++

I have a string which I want to convert to a byteArray, and then I want this byteArray to be added to another byteArray, but at the beginning of that byteArray.
Let us say this is the string I have
string suffix = "$PMARVD";
And this is the existing byteArray that I have (ignore the object there, it is a .proto object which is irrelevant now):
int size = visionDataMsg.ByteSize(); // see how big is it
char* byteArray = new char[size]; //create a bytearray of that size
visionDataMsg.SerializeToArray(byteArray, size); // serialize it
So what I want to do is something like this:
char* byteArrayforSuffix = suffix.convertToByteArray();
char* byteArrayforBoth = byteArrayforSuffix + byteArray;
Is there any way of doing this in C++?
Edit: I should add that after the concatenation operation, the complete byteArrayforBoth is to be processed in:
// convert bytearray to vector
vector<unsigned char> byteVector(byteArrayforBoth, byteArrayforBoth + size);

The whole idea behind std::string is to wrap C-style strings (null-terminated character/byte arrays) in a class that manages everything.
You can access the inner character array with the std::string::data method. Example:
std::string hello ("hello") , world(" world");
auto helloWorld = hello + world;
const char* byteArray = helloWorld.data();
EDIT:
A byte array in C++ is just a built-in char[] or unsigned char[]; unlike in Java or C#, you can't simply "append" one built-in array to another. As you suggested, what you really want is a vector of unsigned characters. In that situation I would write a small utility function that uses push_back:
void appendBytes(std::vector<unsigned char>& dest,const char* characterArray,size_t size){
dest.reserve(dest.size() + size);
for (size_t i=0;i<size;i++){
dest.push_back(characterArray[i]);
}
}
now , with the objects you provided:
std::vector<unsigned char> dest;
appendBytes(dest, suffix.data(),suffix.size());
visionDataMsg.SerializeToArray(byteArray, size); // returns a bool, not a pointer; the bytes are written into byteArray
appendBytes(dest,byteArray,size);

Scrap built-in arrays. You have vectors. Here is the fully working, type-safe solution which took me 3 minutes to type:
int size = visionDataMsg.ByteSize(); // see how big is it
std::vector<char> byteArray(size);
visionDataMsg.SerializeToArray(&byteArray[0], size); // serialize it
std::string str("My String");
byteArray.reserve(byteArray.size() + str.size());
std::copy(str.begin(), str.end(), std::back_inserter(byteArray));
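Note that the snippet above appends the string after the serialized bytes, while the question asks for the string to come first. A minimal sketch of the prefix-first order follows. It assumes the payload is already serialized into a plain buffer; prependBytes is a hypothetical helper name, not something from protobuf or the standard library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answers): returns a vector
// holding the bytes of `prefix` followed by `payloadSize` bytes of `payload`.
std::vector<unsigned char> prependBytes(const std::string& prefix,
                                        const char* payload,
                                        std::size_t payloadSize)
{
    std::vector<unsigned char> result;
    result.reserve(prefix.size() + payloadSize);
    // The prefix goes first...
    result.insert(result.end(), prefix.begin(), prefix.end());
    // ...followed by the already serialized payload bytes.
    result.insert(result.end(), payload, payload + payloadSize);
    return result;
}
```

With the objects from the question this would be called as prependBytes(suffix, byteArray, size). The resulting vector's size() is suffix.size() + size, so there is no separate length to keep in sync when the bytes are processed later.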

Related

Copy a part of an std::string in a char* pointer

Let's suppose I've this code snippet in C++
char* str;
std::string data = "This is a string.";
I need to copy the string data (except the first and the last characters) in str.
My solution that seems to work is creating a substring and then performing the std::copy operation like this
std::string substring = data.substr(1, size - 2);
str = new char[size - 1];
std::copy(substring.begin(), substring.end(), str);
str[size - 2] = '\0';
But maybe this is a bit overkill because I create a new string. Is there a simpler way to achieve this goal? Maybe by working with offsets in the std::copy calls?
Thanks
As mentioned above, you should consider keeping the sub-string as a std::string and use c_str() method when you need to access the underlying chars.
However-
If you must create the new string as a dynamic char array via new you can use the code below.
It checks whether data is long enough, and if so allocates memory for str and uses std::copy similarly to your code, but with adapted iterators.
Note: there is no need to allocate a temporary std::string for the sub-string.
The Code:
#include <algorithm> // std::copy
#include <string>
#include <iostream>
int main()
{
std::string data = "This is a string.";
auto len = data.length();
char* str = nullptr;
if (len > 2)
{
auto new_len = len - 2;
str = new char[new_len+1]; // add 1 for zero termination
std::copy(data.begin() + 1, data.end() - 1, str); // copy from 2nd char till one before the last
str[new_len] = '\0'; // add zero termination
std::cout << str << std::endl;
// ... use str
delete[] str; // must be released eventually
}
}
Output:
his is a string
There is:
int length = data.length() - 1;
memcpy(str, data.c_str() + 1, length);
str[length - 1] = 0;
This will copy the string in data, starting at position [1] (instead of [0]), and keep copying until length() - 1 bytes have been copied (-1 because you want to omit the first character). str must already point to a buffer of at least that many bytes.
The last copied character then gets overwritten with the terminating \0, which finalizes the string and drops the final character.
Of course this approach will cause problems if the string does not have at least 2 characters, so you should check for that beforehand.
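Both answers depend on a length check, and the first one recommends staying with std::string when possible. A minimal sketch of that variant is below; stripEnds is a hypothetical helper name, and returning an empty string for inputs shorter than 3 characters is an assumption about the desired behavior.

```cpp
#include <string>

// Hypothetical helper: returns `data` without its first and last characters,
// or an empty string when `data` has fewer than 3 characters.
std::string stripEnds(const std::string& data)
{
    if (data.size() < 3) {
        return std::string();
    }
    // Construct directly from the iterator range; no intermediate substring.
    return std::string(data.begin() + 1, data.end() - 1);
}
```

When a C-style pointer is needed, stripEnds(data).c_str() can be used for as long as the returned string object is kept alive.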

Convert from vector<unsigned char> to char* includes garbage data

I'm trying to base64 decode a string, then convert that value to a char array for later use. The decode works fine, but then I get garbage data when converting.
Here's the code I have so far:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
decodedChar[decodedData.size() + 1] = 0; // terminate the string
for (size_t i = 0; i < decodedData.size(); ++i) {
decodedChar[i] = decodedData[i];
}
BYTE is a typedef of unsigned char, as taken from this SO answer. The base64 code is also from this answer (the most upvoted answer, not the accepted answer).
When I run this code, I get the following value in the VisualStudio Text Visualiser:
TestStringÍ
I've also tried other conversion methods, such as:
char* decodedChar = reinterpret_cast< char *>(&decodedData[0]);
Which gives the following:
TestStringÍÍÍýýýýÝÝÝÝÝÝÝ*b4d“
Why am I getting the garbage data at the end of the string? What am I doing wrong?
EDIT: clarified which answer in the linked question I'm using
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
Why would you manually allocate a buffer and then copy to it when you have std::string available that does this for you?
Just do:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
std::string decodedString { decodedData.begin(), decodedData.end() };
std::cout << decodedString << '\n';
If you need a char * out of this, just use .c_str()
const char* cstr = decodedString.c_str();
If you need to pass this on to a function that takes char* as input, for example:
void someFunc(char* data);
//...
//call site
someFunc( &decodedString[0] );
We have a TON of functions, abstractions and containers in C++ that were made to improve upon the C language, so that programmers wouldn't have to write things by hand and make the same mistakes every time they code. It is best to use those facilities wherever we can, instead of raw loops, for simple conversions like this.
You are writing beyond the last element of your allocated array, which can cause literally anything to happen (according to the C++ standard). You need decodedChar[decodedData.size()] = 0;
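To make the off-by-one concrete: a buffer of size() + 1 elements has valid indices 0 through size(), so the terminator belongs at index size(). A minimal sketch, assuming BYTE is unsigned char as described in the question; toTerminatedBuffer is a hypothetical helper name, and it uses a std::vector<char> instead of new[] so nothing has to be freed by hand.

```cpp
#include <cstddef>
#include <vector>

using BYTE = unsigned char; // assumption: mirrors the typedef described in the question

// Hypothetical helper: copies `bytes` into a char buffer that has one extra
// element for the terminator. Valid indices are 0 .. bytes.size().
std::vector<char> toTerminatedBuffer(const std::vector<BYTE>& bytes)
{
    std::vector<char> buffer(bytes.size() + 1); // elements start out as 0
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        buffer[i] = static_cast<char>(bytes[i]);
    }
    buffer[bytes.size()] = '\0'; // last valid index, not size() + 1
    return buffer;
}
```

buffer.data() then yields a char* that stays valid for as long as the vector does.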

Subsetting char array without copying it in C++

I have a long array of char (coming from a raster file via GDAL), all composed of 0 and 1. To compact the data, I want to convert it to an array of bits (thus dividing the size by 8), 4 bytes at a time, writing the result to a different file. This is what I have come up with by now:
uint32_t bytes2bits(char b[33]) {
b[32] = 0;
return strtoul(b,0,2);
}
const char data[36] = "00000000000000000000000010000000101"; // 101 is to be ignored
char word[33];
strncpy(word,data,32);
uint32_t byte = bytes2bits(word);
printf("Data: %d\n",byte); // 128
The code is working, and the result is going to be written in a separate file. What I'd like to know is: can I do that without copying the characters to a new array?
EDIT: I'm using a const variable here just to make a minimal, reproducible example. In my program it's a char *, which is continually changing value inside a loop.
Yes, you can, as long as you can modify the source string (in your example code you can't because it is a constant, but I assume in reality you have the string in writable memory):
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
// You would need to make sure that the `data` argument always has
// at least 33 characters in length (the null terminator at the end
// of the original string counts)
char temp = data[32];
data[32] = 0;
uint32_t byte = bytes2bits(data);
data[32] = temp;
printf("Data: %d\n",byte); // 128
}
In this example, because the long data is held in a writable char* buffer, it is not necessary to copy each part into a temporary buffer before converting it to a number.
Just use a pointer to step through the buffer one 32-character period at a time; the only requirement is that a 0 terminator follows the 32nd character while the period is being converted.
So your code would look like:
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
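size_t dataLen = strlen(data);
const size_t periodLen = 32;
char* periodStr = data; // start of the current period
size_t periodPos = periodLen; // index just past the current period
char tmp;
uint32_t byte;
while(periodPos < dataLen)
{
tmp = data[periodPos];
data[periodPos] = 0; // temporarily terminate the current period
byte = bytes2bits(periodStr);
printf("Data: %u\n",(unsigned)byte); // 128 for the first period of the example data
data[periodPos] = tmp; // restore the overwritten character
periodStr = data + periodPos;
periodPos += periodLen;
}
if(periodStr < data + dataLen)
{
// last period: may be shorter than 32 characters and is already null-terminated
byte = bytes2bits(periodStr);
printf("Data: %u\n",(unsigned)byte);
}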
}
Please be careful with the last period, which can be shorter than 32 characters.
const char data[36]
You are in violation of your contract with the compiler if you declare something as const and then modify it.
Generally speaking, the compiler won't let you modify it...so to even try to do so with a const declaration you'd have to cast it (but don't)
char *sneaky_ptr = (char*)data;
sneaky_ptr[0] = 'U'; /* the U is for "undefined behavior" */
See: Can we change the value of an object defined with const through pointers?
So if you wanted to do this, you'd have to be sure the data was legitimately non-const.
The right way to do this in modern C++ is to use std::string to hold your string and std::string_view to process parts of that string without copying them.
You can use string_view with the char array you already have, though. It is commonly used to modernize code built on classical null-terminated const char* strings.
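A minimal sketch of the string_view approach, assuming C++17 is available. It pairs std::string_view with std::from_chars, which parses a character range with an explicit end and therefore needs neither a null terminator nor a writable buffer. bitsToWord is a hypothetical helper name, and returning 0 on a parse failure is an assumption made for brevity.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses a run of '0'/'1' characters as a binary number.
// Returns 0 if the chunk is empty, does not start with a binary digit,
// or does not fit into 32 bits.
std::uint32_t bitsToWord(std::string_view chunk)
{
    std::uint32_t value = 0;
    auto result = std::from_chars(chunk.data(), chunk.data() + chunk.size(), value, 2);
    return result.ec == std::errc() ? value : 0;
}
```

For the data in the question, std::string_view(data).substr(0, 32) selects the first 32 characters without copying them, and the same call with an increasing offset steps through the rest of the buffer.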

concatenate string and integer to produce byte array c++

I have a std::string and a long that I want to concatenate to produce a byte array (unsigned char *). I have no clue how to do it in C++; I struggled trying to do it with raw memory.
In Java, System.arraycopy does the trick.
Here are my inputs:
unsigned long part1 = 0x0100000002;
std::string part2 = "some_str";
What I want to have is unsigned char * combined = part2 + part1
Any hints?
There are lots of ways to do this, but here's one using a std::vector to hold the destination buffer (and manage all the memory allocation and deallocation associated with it) and std::memcpy (which is similar to System.arraycopy) to do the copying.
unsigned long part1 = 0x0100000002;
std::string part2 = "some_str";
// create a vector big enough to hold both components
std::vector<char> buffer(sizeof(part1) + part2.size());
// copy the string into the beginning of the buffer
std::memcpy(&buffer[0], &part2[0], part2.size());
// copy the int into the space after the string
std::memcpy(&buffer[part2.size()], &part1, sizeof(part1));
std::cout.write(&buffer[0], buffer.size());
std::cout << "\n";
You can get a plain old char* pointer from a std::vector<char> by doing things like &buffer[0], which gets a pointer to the first element in the underlying array that makes up the vector. You may need to handle your own null termination, if you wanted to use this like a string (which is why I used std::cout.write instead of std::cout << in my example).
An alternative that avoids using memcpy and having to handle the buffer size yourself is to use a stream:
std::stringstream ss;
ss.write(&part2[0], part2.size());
ss.write(reinterpret_cast<const char*>(&part1), sizeof(part1));
std::string buf = ss.str();
std::cout.write(buf.c_str(), buf.size());
std::cout << "\n";
As output on windows from either version, I get this:
some_str☻
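Both versions above copy the integer's in-memory representation, so the resulting bytes depend on the platform's byte order and on sizeof(unsigned long), which differs between common platforms. If the layout has to be fixed, the integer can be written out byte by byte. A minimal sketch, assuming a 64-bit value stored most significant byte first; combine is a hypothetical helper name and the byte order is an assumption, not something the question specifies.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns the bytes of `text` followed by the 8 bytes of
// `number`, most significant byte first (big-endian; an assumption).
std::vector<unsigned char> combine(const std::string& text, std::uint64_t number)
{
    std::vector<unsigned char> out(text.begin(), text.end());
    for (int shift = 56; shift >= 0; shift -= 8) {
        out.push_back(static_cast<unsigned char>((number >> shift) & 0xFF));
    }
    return out;
}
```

combine(part2, 0x0100000002ULL) yields 16 bytes: the 8 characters of "some_str" followed by 00 00 00 01 00 00 00 02. An unsigned char* for a C-style API is available through data() on the returned vector.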

std::string to BYTE[]

My goal is to get this:
BYTE Data1[] = {0x6b,0x65,0x79};
BYTE Data2[] = {0x6D,0x65,0x73,0x73,0x61,0x67,0x65};
But my starting point is:
std::string msg = "message";
std::string key = "key";
I am not able to get from std::string to BYTE[].
I tried the following:
std::vector<BYTE> msgbytebuffer(msg.begin(), msg.end());
BYTE* Data1 = &msgbytebuffer[0];
This didn't cause compile or run time error. However, the end result (I feed this to a winapi function - crypto api) was not the same as when I used the actual byte array like in top most ({0x6D,0x65,0x73,0x73,0x61,0x67,0x65}).
You can use the std::string::c_str() function, which returns a pointer to a C-style string that can be passed to WinAPI functions, like:
foo(string.c_str());
What it actually does is that it returns a pointer to an array that contains a null-terminated sequence of characters.
BYTE is normally a typedef for unsigned char, so strcpy (which takes a char*) will not accept a BYTE array directly. You can copy your std::string into a BYTE array with memcpy instead:
std::string str = "hello";
BYTE byte[6]; // 5 characters plus the terminating 0
memcpy(byte, str.c_str(), str.length() + 1); // copy from str to byte[], including the 0
If you want to copy the str without the 0 at the end, copy only length() bytes:
BYTE byte[5];
memcpy(byte, str.c_str(), str.length());
It seems to me that the WinAPI function is expecting a null-terminated C string. You can get one with:
msg.c_str();
or, using your BYTE type, something like this:
std::vector<BYTE> msgbytebuffer(msg.length() + 1, 0);
std::copy(msg.begin(), msg.end(), msgbytebuffer.begin());
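One detail worth checking in the question: the target Data1 holds the bytes of "key" (0x6b,0x65,0x79), while the attempted Data1 was built from msg, which alone could explain the different result. A minimal sketch that converts each string separately is below. It assumes BYTE is unsigned char and that the API takes a pointer plus a length, which the question does not state; toBytes is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

using BYTE = unsigned char; // assumption: matches the Windows SDK typedef

// Hypothetical helper: copies the characters of `text` into a byte vector.
// No terminating zero is added; the length is carried by size().
std::vector<BYTE> toBytes(const std::string& text)
{
    return std::vector<BYTE>(text.begin(), text.end());
}
```

toBytes(key) gives the three bytes 0x6b, 0x65, 0x79 and toBytes(msg) the seven bytes of "message". Each vector's data() and size() can then be passed to the API as long as the vector outlives the call.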