Concatenate string and integer to produce byte array in C++

I have a std::string and a long that I want to concatenate to produce a byte array (unsigned char *). I have no clue how to do it in C++; I struggled trying to do it with memory.
In Java, System.arraycopy does the trick.
Here are my inputs:
unsigned long part1 = 0x0100000002;
std::string part2 = "some_str";
What I want to have is unsigned char * combined = part2 + part1
Any hint?

There are lots of ways to do this, but here's one using a std::vector to hold the destination buffer (and manage all the memory allocation and deallocation associated with it) and std::memcpy (which is similar to System.arraycopy) to do the copying.
unsigned long part1 = 0x0100000002;
std::string part2 = "some_str";
// create a vector big enough to hold both components
std::vector<char> buffer(sizeof(part1) + part2.size());
// copy the string into the beginning of the buffer
std::memcpy(&buffer[0], &part2[0], part2.size());
// copy the int into the space after the string
std::memcpy(&buffer[part2.size()], &part1, sizeof(part1));
std::cout.write(&buffer[0], buffer.size());
std::cout << "\n";
You can get a plain old char* pointer from a std::vector<char> with &buffer[0], which points to the first element of the vector's underlying array. You may need to handle your own null termination if you want to use this like a string (which is why I used std::cout.write instead of std::cout << in my example).
An alternative that avoids using memcpy and having to handle the buffer size yourself is to use a stream:
std::stringstream ss;
ss.write(&part2[0], part2.size());
ss.write(reinterpret_cast<const char*>(&part1), sizeof(part1));
std::string buf = ss.str();
std::cout.write(buf.c_str(), buf.size());
std::cout << "\n";
As output on Windows from either version, I get this:
some_str☻
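The single ☻ is consistent with unsigned long being 32 bits on Windows, where 0x0100000002 is truncated to 2; the byte order of the copied integer also follows the host CPU. If the layout needs to be the same everywhere, one option is a fixed-width integer written out in an explicit byte order. A minimal sketch, assuming a hypothetical helper named concat_bytes and big-endian order (neither comes from the question):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: the bytes of `text` followed by the 8 bytes of
// `value`, most significant byte first (big-endian is an assumption here).
std::vector<unsigned char> concat_bytes(const std::string& text, std::uint64_t value)
{
    std::vector<unsigned char> out(text.begin(), text.end());
    for (int shift = 56; shift >= 0; shift -= 8) {
        out.push_back(static_cast<unsigned char>((value >> shift) & 0xFF));
    }
    return out;
}
```

The vector's data() member then yields the unsigned char* the question asks for, valid for as long as the vector is alive and not resized.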

Related

Convert from vector<unsigned char> to char* includes garbage data

I'm trying to base64 decode a string, then convert that value to a char array for later use. The decode works fine, but then I get garbage data when converting.
Here's the code I have so far:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
decodedChar[decodedData.size() + 1] = 0; // terminate the string
for (size_t i = 0; i < decodedData.size(); ++i) {
decodedChar[i] = decodedData[i];
}
BYTE is a typedef of unsigned char, as taken from this SO answer. The base64 code is also from this answer (the most upvoted answer, not the accepted answer).
When I run this code, I get the following value in the VisualStudio Text Visualiser:
TestStringÍ
I've also tried other conversion methods, such as:
char* decodedChar = reinterpret_cast< char *>(&decodedData[0]);
Which gives the following:
TestStringÍÍÍýýýýÝÝÝÝÝÝÝ*b4d“
Why am I getting the garbage data at the end of the string? What am I doing wrong?
EDIT: clarified which answer in the linked question I'm using
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
Why would you manually allocate a buffer and then copy to it when you have std::string available that does this for you?
Just do:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
std::string decodedString { decodedData.begin(), decodedData.end() };
std::cout << decodedString << '\n';
If you need a char * out of this, just use .c_str()
const char* cstr = decodedString.c_str();
If you need to pass this on to a function that takes char* as input, for example:
void someFunc(char* data);
//...
//call site
someFunc( &decodedString[0] );
We have a TON of functions, abstractions and containers in C++ that were made to improve upon the C language, so that programmers don't have to write things by hand and make the same mistakes every time they code. It is best to use them wherever we can, rather than writing raw loops for simple conversions like this.
You are writing beyond the last element of your allocated array, which can cause literally anything to happen (according to the C++ standard). The array has decodedData.size() + 1 elements, so its last valid index is decodedData.size(). You need decodedChar[decodedData.size()] = 0;
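The index arithmetic can be avoided altogether by letting a container append the terminator. A minimal sketch, assuming BYTE is unsigned char as in the question; the function name to_c_buffer is made up for illustration:

```cpp
#include <vector>

typedef unsigned char BYTE;

// Copies the decoded bytes and appends a single terminating 0.
// The terminator lands at index bytes.size(), the last valid slot.
std::vector<char> to_c_buffer(const std::vector<BYTE>& bytes)
{
    std::vector<char> out(bytes.begin(), bytes.end());
    out.push_back('\0');
    return out;
}
```

&out[0] can then be passed wherever a char* is expected, and the memory is released automatically when the vector goes out of scope.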

C++ String to byteArray Conversion and Addition

I have a string which I want to convert to a byteArray, and then I want this byteArray to be added to another byteArray, but at the beginning of that byteArray.
Let us say this is the string I have
string suffix = "$PMARVD";
And this is the existing byteArray that I have (ignore the object there, it is a .proto object which is irrelevant now):
int size = visionDataMsg.ByteSize(); // see how big is it
char* byteArray = new char[size]; //create a bytearray of that size
visionDataMsg.SerializeToArray(byteArray, size); // serialize it
So what I want to do is something like this:
char* byteArrayforSuffix = suffix.convertToByteArray();
char* byteArrayforBoth = byteArrayforSuffix + byteArray;
Any way of doing this in C++?
Edit: I should add that after the concatenation operation, the complete byteArrayforBoth is to be processed in:
// convert bytearray to vector
vector<unsigned char> byteVector(byteArrayforBoth, byteArrayforBoth + size);
The whole idea behind std::string is to wrap C-style strings (null-terminated character/byte arrays) in a class that manages everything.
You can access the inner character array with the std::string::data method. Example:
std::string hello ("hello") , world(" world");
auto helloWorld = hello + world;
const char* byteArray = helloWorld.data();
EDIT:
A byte array is a built-in array of char or unsigned char; unlike in Java or C#, you can't just "append" one built-in array to another. As you suggested, you simply want a vector of unsigned chars. In this situation I would create a utility function that uses push_back:
void appendBytes(vector<unsigned char>& dest, const char* characterArray, size_t size){
    dest.reserve(dest.size() + size);
    for (size_t i = 0; i < size; i++){
        dest.push_back(characterArray[i]);
    }
}
Now, with the objects you provided:
std::vector<unsigned char> dest;
appendBytes(dest, suffix.data(), suffix.size());
// SerializeToArray returns a bool (success), so append the filled byteArray itself
visionDataMsg.SerializeToArray(byteArray, size);
appendBytes(dest, byteArray, size);
Scrap built-in arrays. You have vectors. Here is the fully working, type-safe solution which took me 3 minutes to type:
int size = visionDataMsg.ByteSize(); // see how big is it
std::vector<char> byteArray(size);
visionDataMsg.SerializeToArray(&byteArray[0], size); // serialize it
std::string str("My String");
byteArray.reserve(byteArray.size() + str.size());
std::copy(str.begin(), str.end(), std::back_inserter(byteArray));
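Since the question asks for the string to come first, the order of the two copies matters. A minimal sketch that puts the prefix in front of an already-serialized payload; prepend_prefix is a hypothetical name, and the payload stands in for whatever SerializeToArray wrote into byteArray:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds prefix + payload in a single vector of bytes.
// `payload` may contain zero bytes, so its length is passed explicitly.
std::vector<unsigned char> prepend_prefix(const std::string& prefix,
                                          const char* payload,
                                          std::size_t payload_size)
{
    std::vector<unsigned char> out;
    out.reserve(prefix.size() + payload_size);
    out.insert(out.end(), prefix.begin(), prefix.end());      // string first
    out.insert(out.end(), payload, payload + payload_size);   // then the bytes
    return out;
}
```

With the question's variables this would be called as prepend_prefix(suffix, byteArray, size), and the result already has the vector<unsigned char> type needed for the later processing step.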

how to read a particular string from a buffer

I have a buffer
char buffer[size];
which I am using to store the contents of a stream (suppose pStream here):
HRESULT hr = pStream->Read(buffer, size, &cbRead );
Now I have all the contents of this stream in buffer, which is size bytes long. I know that two strings,
"<!doctortype html" and ".html>"
are present somewhere inside the stored contents of this buffer (we don't know their locations), and I want to store just the contents of the buffer from
"<!doctortype html" to ".html>"
in another buffer, buffer2[SizeWeDontKnow].
How can I do that? (The contents between these two locations are the contents of an HTML file, and I want to store only that HTML file's contents.) Any ideas?
You can use the strnstr function (a BSD extension; where it is unavailable, strstr works on a null-terminated buffer) to find the right position in your buffer. After you've found the starting and ending tags, you can extract the text in between using strncpy, or use it in place if performance is an issue.
You can calculate needed size from the positions of the tags and the length of the first tag nLength = nPosEnd - nPosStart - nStartTagLength
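Note that a buffer filled by Read is not necessarily null-terminated, which strstr requires. A sketch of the same start-tag/end-tag search that relies on the byte count instead, using std::search; extract_between is an illustrative name, and the caller is assumed to pass the number of bytes actually read (cbRead in the question):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the text strictly between start_tag and end_tag,
// or an empty string if either tag is missing.
std::string extract_between(const char* buffer, std::size_t size,
                            const char* start_tag, const char* end_tag)
{
    const char* buf_end = buffer + size;
    const std::size_t start_len = std::strlen(start_tag);
    const std::size_t end_len = std::strlen(end_tag);

    // Locate the start tag anywhere in the buffer.
    const char* start = std::search(buffer, buf_end, start_tag, start_tag + start_len);
    if (start == buf_end) return std::string();

    // Look for the end tag only after the start tag.
    const char* content = start + start_len;
    const char* stop = std::search(content, buf_end, end_tag, end_tag + end_len);
    if (stop == buf_end) return std::string();

    return std::string(content, stop);
}
```

Returning a std::string sidesteps the buffer2[SizeWeDontKnow] problem, since the string sizes itself to nPosEnd - nPosStart - nStartTagLength bytes.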
Look for HTML parsers for C/C++.
Another way is to walk a char pointer from the start of the buffer and check each char thereafter to see whether it matches your requirement.
If that's the only operation on HTML code in your app, then you could use the solution below. However, if you are going to do more complicated parsing, then I suggest using an external library.
#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;
int main()
{
    const char* beforePrefix = "asdfasdfasdfasdf";
    const char* prefix = "<!doctortype html";
    const char* suffix = ".html>";
    const char* postSuffix = "asdasdasd";
    const unsigned size = 1024; // const, so the array sizes below are compile-time constants
    char buf[size];
    snprintf(buf, size, "%s%sTHE STRING YOU WANT TO GET%s%s", beforePrefix, prefix, suffix, postSuffix);
    cout << "Before: " << buf << endl;
    const char* firstOccurenceOfPrefixPtr = strstr(buf, prefix);
    // search for the suffix only after the prefix, so an earlier ".html>" is not matched
    const char* firstOccurenceOfSuffixPtr = firstOccurenceOfPrefixPtr
        ? strstr(firstOccurenceOfPrefixPtr + strlen(prefix), suffix) : NULL;
    if (firstOccurenceOfPrefixPtr && firstOccurenceOfSuffixPtr)
    {
        // length of the text between the end of the prefix and the start of the suffix
        unsigned textLen = (unsigned)(firstOccurenceOfSuffixPtr - firstOccurenceOfPrefixPtr - strlen(prefix));
        char newBuf[size];
        strncpy(newBuf, firstOccurenceOfPrefixPtr + strlen(prefix), textLen);
        newBuf[textLen] = 0; // strncpy does not terminate the copy here
        cout << "After: " << newBuf << endl;
    }
    return 0;
}
EDIT
I get it now :). You should use strstr to find the first occurrence of the prefix then. I edited the code above, and updated the link.
Are you limited to C, or can you use C++?
In the C library reference there are plenty of useful ways of tokenising strings and comparing for matches (string.h):
http://www.cplusplus.com/reference/cstring/
Using C++ I would do the following (using buffer and size variables from your code):
// copy char array to std::string
std::string text(buffer, buffer + size);
// define what we're looking for
std::string begin_text("<!doctortype html");
std::string end_text(".html>");
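// find the start and end of the text we need to extract
size_t begin_pos = text.find(begin_text);
size_t end_pos = (begin_pos == std::string::npos) ? begin_pos : text.find(end_text, begin_pos + begin_text.length());
// create a substring from the positions; substr takes (start, length), not (start, end)
std::string extract;
if (end_pos != std::string::npos) {
    begin_pos += begin_text.length();
    extract = text.substr(begin_pos, end_pos - begin_pos);
}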
// test that we got the extract
std::cout << extract << std::endl;
If you need C string compatibility you can use:
const char* tmp = extract.c_str();

How to serialize numeric data into char*

I have a need to serialize int, double, long, and float
into a character buffer and this is the way I currently do it
int value = 42;
char* data = new char[64];
std::sprintf(data, "%d", value);
// check
printf( "%s\n", data );
First I am not sure if this is the best way to do it but my immediate problem is determining the size of the buffer. The number 64 in this case is purely arbitrary.
How can I know the exact size of the passed numeric so I can allocate exact memory; not more not less than is required?
Either a C or C++ solution is fine.
EDIT
Based on John's answer (allocate a large enough buffer) below, I am thinking of doing this
char *data = 0;
int value = 42;
char buffer[999];
std::sprintf(buffer, "%d", value);
data = new char[strlen(buffer)+1];
memcpy(data,buffer,strlen(buffer)+1);
printf( "%s\n", data );
This avoids waste, perhaps at a cost of speed, but it does not entirely solve the potential overflow. Or could I just use the maximum size sufficient to represent the type?
In C++ you can use a string stream and stop worrying about the size of the buffer:
#include <sstream>
...
std::ostringstream os;
int value=42;
os<<value; // you use string streams as regular streams (cout, etc.)
std::string data = os.str(); // now data contains "42"
(If you want you can get a const char * from an std::string via the c_str() method)
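The same stream approach covers all four types in the question with one function template. A minimal sketch; to_text is a hypothetical name, and floating-point values are written with the stream's default precision (6 significant digits), so they may not round-trip exactly:

```cpp
#include <sstream>
#include <string>

// Formats any streamable value as text; the stream grows as needed,
// so no buffer size has to be chosen up front.
template <typename T>
std::string to_text(const T& value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}
```

If exact round-tripping of double or float matters, the precision can be raised on the stream (for example to std::numeric_limits<T>::max_digits10 in C++11) before writing.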
In C, instead, you can use snprintf to "fake" the write and get the size of the buffer to allocate; in fact, if you pass 0 as the second argument of snprintf, you can pass NULL as the target string, and the return value is the number of characters that would have been written. So in C you can do:
int value = 42;
char * data;
size_t bufSize=snprintf(NULL, 0, "%d", value)+1; /* +1 for the NUL terminator */
data = malloc(bufSize);
if(data==NULL)
{
// ... handle allocation failure ...
}
snprintf(data, bufSize, "%d", value);
// ...
free(data);
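The same two-pass snprintf idea carries over to C++, where a container can own the buffer so nothing has to be freed by hand. A sketch assuming C++11 (for std::snprintf and vector::data); format_int is an illustrative name:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// First call measures, second call writes into an exactly sized buffer.
std::string format_int(int value)
{
    const int needed = std::snprintf(nullptr, 0, "%d", value);
    if (needed < 0) {
        return std::string();  // encoding error reported by snprintf
    }
    std::vector<char> buf(static_cast<std::size_t>(needed) + 1);  // +1 for the NUL
    std::snprintf(buf.data(), buf.size(), "%d", value);
    return std::string(buf.data(), static_cast<std::size_t>(needed));
}
```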
I would serialize to a 'large enough' buffer then copy to an allocated buffer. In C
char big_buffer[999], *small_buffer;
sprintf(big_buffer, "%d", some_value);
small_buffer = malloc(strlen(big_buffer) + 1);
strcpy(small_buffer, big_buffer);
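For integer types, "large enough" can be computed rather than guessed: std::numeric_limits<T>::digits10 + 1 is the largest number of decimal digits the type can need, plus one character for a sign and one for the terminator. A sketch under that assumption (it does not cover float or double); the names are illustrative:

```cpp
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Upper bound on the characters needed to print an integer of type Int
// in decimal: digits10 + 1 digits, an optional '-', and the NUL.
template <typename Int>
constexpr std::size_t max_decimal_chars()
{
    return static_cast<std::size_t>(std::numeric_limits<Int>::digits10) + 3;
}

std::string int_to_text(int value)
{
    char buf[max_decimal_chars<int>()];
    std::snprintf(buf, sizeof buf, "%d", value);  // cannot truncate for an int
    return std::string(buf);
}
```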

Combining std::string and std::vector<char>

This is not the actual code, but this represents my problem.
std::string str1 = "head";
char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(buffer, buffer + strlen(buffer));
I want to put str1 and str2 to mainStr in an order:
headbody\0bodyfoot
So the binary data is maintained. Is this possible to do this?
PS: Thanks for telling the strlen part is wrong. I just used it to represent buffer's length. :)
There should be some way of defining length of data in "buffer".
Usually character 0 is used for this and most of standard text functions assume this. So if you use character 0 for other purposes, you have to provide another way to find out length of data.
Just for example:
char buffer[]="body\0body";
std::vector<char> mainStr(buffer,buffer+sizeof(buffer)/sizeof(buffer[0])-1); // -1 drops the literal's terminating NUL
Here we use an array because it provides more information than a pointer: the size of the stored data.
You cannot use strlen as it uses '\0' to determine the end of string. However, the following will do what you are looking for:
std::string head = "header";
std::string foot = "footer";
const char body[] = "body\0body";
std::vector<char> v;
v.assign(head.begin(), head.end());
std::copy(body, body + sizeof(body)/sizeof(body[0]) - 1, std::back_inserter<std::vector<char> >(v));
std::copy(foot.begin(), foot.end(), std::back_inserter<std::vector<char> >(v));
Because the string literal adds a NUL character at the end of the array, you'll want to ignore it (hence the -1 on the last iterator).
btw. strlen will not work if there are nul bytes in your string!
The code to insert into the vector is:
front:
mainStr.insert(mainStr.begin(), str1.begin(), str1.end());
back:
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
With your code above (using strlen), this will print
headbodyfoot
EDIT: just changed the copy to insert as copy requires the space to be available I think.
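Putting the pieces together, the length of the binary body has to travel with the pointer. A minimal sketch of the insert-based approach with an explicit length; wrap is a hypothetical name and the caller is assumed to know the body's real size:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds head + body + foot, where body may contain zero bytes.
std::vector<char> wrap(const std::string& head,
                       const char* body, std::size_t body_len,
                       const std::string& foot)
{
    std::vector<char> out(body, body + body_len);        // binary-safe copy
    out.insert(out.begin(), head.begin(), head.end());   // front
    out.insert(out.end(), foot.begin(), foot.end());     // back
    return out;
}
```

For an array declared as const char body[] = "body\0body", sizeof(body) - 1 gives the 9 payload bytes without the literal's terminator, so the result holds all 17 bytes of headbody\0bodyfoot.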
You could use std::vector<char>::insert to append the data you need into mainStr.
Something like this:
std::string str1 = "head";
char buffer[] = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(str1.begin(), str1.end());
mainStr.insert(mainStr.end(), buffer, buffer + sizeof(buffer)/sizeof(buffer[0]) - 1); // -1 skips the terminating NUL
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
Disclaimer: I didn't compile it.
You can use IO streams.
std::string str1 = "head";
const char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::stringstream ss;
ss.write(str1.c_str(), str1.length())
.write(buffer, 9) // insert real length here
.write(str2.c_str(), str2.length());
std::string result = ss.str();
std::vector<char> vec(result.c_str(), result.c_str() + result.length());
str1 and str2 are string objects that write the text.
I wish compilers would fail on statements like the declaration of buffer and I don't care how much legacy code it breaks. If you're still building it you can still fix it and put in a const.
You would need to change your declaration of the vector, because strlen will stop at the first null character. If you did
char buffer[] = "body\0body";
then sizeof(buffer) would actually give you close to what you want although you'll get the end null-terminator too.
Once your vector mainStr is then set up correctly you could do:
std::string strConcat;
strConcat.reserve( str1.size() + str2.size() + mainStr.size() );
strConcat.assign(str1);
strConcat.append(mainStr.begin(), mainStr.end());
strConcat.append(str2);
This assumes the vector was set up using buffer, buffer+sizeof(buffer)-1.
mainStr.resize(str1.length() + str2.length() + strlen(buffer));
memcpy(&mainStr[0], &str1[0], str1.length());
memcpy(&mainStr[str1.length()], buffer, strlen(buffer));
memcpy(&mainStr[str1.length()+strlen(buffer)], &str2[0], str2.length());