Converting character array with null characters to string


I have a character buffer which I'm using to send messages over a network. I've serialized an integer using memcpy and so there are some null characters in the buffer. Currently, I'm declaring the buffer as a char array and packing the data from the structure m into it using my serialization routine
sendline = new char[256];
serialize(m, sendline);
void serialize(myMsg &m, char* out)
{
uint16_t seq = htons(m.seq);
memcpy(out, &m.type, sizeof(m.type));
memcpy(out + sizeof(m.type), &seq, sizeof(seq));
memcpy(out + sizeof(seq) + sizeof(m.type), m.data.c_str(), m.data.length());
}
My question is: can I use a string here instead of a char array? I tried making sendline a string, but it terminates the buffer at the first null character.
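A minimal sketch of one possible approach, not a definitive answer. It assumes a hypothetical myMsg layout (a one-byte type, a 16-bit seq and a std::string payload), because the question does not show the struct. std::string stores its length explicitly, so embedded null bytes survive as long as the bytes are added with an explicit count (push_back, or append(ptr, n)) rather than through a bare const char* that is treated as a C string. The two push_back calls write seq in network byte order, which is assumed to be equivalent to the htons plus memcpy pair in the question.

```cpp
#include <cstdint>
#include <string>

// Hypothetical message layout; the original struct is not shown in the question.
struct myMsg {
    std::uint8_t type;
    std::uint16_t seq;
    std::string data;
};

// Packs the message into a std::string. Every byte is added with an explicit
// count, so null bytes in the serialized integer are kept.
std::string serialize(const myMsg &m)
{
    std::string out;
    out.reserve(sizeof(m.type) + sizeof(m.seq) + m.data.size());
    out.push_back(static_cast<char>(m.type));
    out.push_back(static_cast<char>(m.seq >> 8));    // high byte first (network order)
    out.push_back(static_cast<char>(m.seq & 0xFF));  // then the low byte
    out.append(m.data);                              // length comes from the string itself
    return out;
}
```

When sending, the byte count should come from out.size() and the pointer from out.data(); anything based on strlen would stop at the first null byte again.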

Related

How is memset working in this snippet of code?

I think this snippet of code is enough to get the idea of what I'm doing.
I'm using getline to read input data from a text file that has lines that might look something like this: The cat is fat/And likes to sing
From searching around the internet I was able to get it working, but I'd like to better understand WHY it is working. My primary question is how the
memcpy(id, buffer, temp - buffer);
line is working. I read what memcpy() does but do not understand how the temp - buffer part is working.
So from my understanding I'm setting *temp to the '/' in that line. Then I'm copying the line up until the '/' into it. But how does the temp, which is at '/' minus the buffer (which is the whole line from getline) work out to just be The cat is fat?
Hopefully that made some sense.
#define MAX_SIZE 255
char buffer[MAX_SIZE + 1] = { 0 };
cin.getline(buffer, MAX_SIZE);
memset(id, 0, 256);
memset(title, 0, 256);
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
temp++;
strcpy(title, temp);
Also, if I can double dip, why would MAX_SIZE be defined at 255 but MAX_SIZE+1 is often used. Does this have to do with a delimiter or white space at the end of a line?
Thanks for the help.

In my opinion it is simply bad code. :)
I would write it like
const size_t MAX_SIZE = 256;
char buffer[MAX_SIZE] = {};
std::cin.getline( buffer, MAX_SIZE );
id[0] = '\0';
title[0] = '\0';
if ( char* temp = strchr( buffer, '/' ) )
{
std::memcpy( id, buffer, temp - buffer );
id[temp - buffer] = '\0';
std::strcpy( title, temp + 1 );
}
else
{
std::strcpy( id, buffer );
}
As for memcpy in this statement
memcpy(id, buffer, temp - buffer);
then it copies temp - buffer bytes from buffer to id. As id was previously set to zeroes then after memcpy it will contain a string with terminating zero.

Your question concerns pointer-difference calculation, one of the family of operations that make up pointer arithmetic.
Most beginners don't have too much trouble grasping how pointer-addition works. Given this:
char buffer[256];
char *p = buffer + 10;
it is usually clear that p points to index 10 of the buffer char array. But you need to remember that the pointer type is important. The same construct you see above also works for more complicated data types:
struct Something
{
char name[128];
int ident;
int supervisor;
} people[64];
struct Something *p = people+10; // NOTE: same line, different types
Just as before, p points to element 10 of the array, but note the arithmetic: the size of the underlying type is used to calculate the relevant memory offset. You don't need to do it yourself. No sizeof is required here.
So why do you care? Because just like regular math, pointer math has certain properties, one of them being the following:
char buffer[256];
char *p = buffer+10; // p addresses index 10 in the array
size_t len = p-buffer; // len is the typed difference between p and buffer.
In this case, len will be 10, the same as the offset of p. So how does this relate to your question? Well...
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
The horrid nature of this code aside (if there is no '/' in the buffer array, temp will be NULL, and the ensuing memcpy will all but guarantee a segfault), this code finds the location in the string where '/' resides. Once it has that, the calculation temp - buffer uses pointer arithmetic (specifically pointer differencing) to calculate the distance between the address in temp and the address at the base of the array. The result is the element count, not including the slash itself. Therefore this code copies up to, but not including, the discovered slash into the id buffer. The rest of the id buffer retains the 0 values populated by the memset, and therefore the string is terminated (which is way more work than you need to do, btw).
After that line, the remainder:
temp++;
strcpy(title, temp);
post-increments the temp pointer, which says "move to the next element in the array". Then the strcpy copies the remaining chars of the null-terminated buffer string into title. Worth noting this could have simply been:
strcpy(title, ++temp);
And likewise:
strcpy(title, temp+1);
which retains temp at the '/' position. In all of the above, the result in title will be the same: all chars after the slash, but not including it.
I hope that explains what is going on. Best of luck.
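A small sketch of the typed-difference rule described above, reusing the Something struct from this answer. The helper names element_distance and byte_distance are made up for illustration. Note that the result type of a pointer subtraction is std::ptrdiff_t; assigning it to size_t as in the snippet above works for non-negative distances only.

```cpp
#include <cstddef>

struct Something {
    char name[128];
    int ident;
    int supervisor;
};

// Distance between two pointers into the same array, counted in elements.
std::ptrdiff_t element_distance(const Something *first, const Something *p)
{
    return p - first;  // the compiler divides the byte distance by sizeof(Something)
}

// The same distance counted in bytes, obtained by viewing both pointers as char*.
std::ptrdiff_t byte_distance(const Something *first, const Something *p)
{
    return reinterpret_cast<const char *>(p) - reinterpret_cast<const char *>(first);
}
```

For a char array the two functions would return the same number, which is why temp - buffer can be handed directly to memcpy as a byte count.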

MAX_SIZE+1 is reserving space for the null terminator at the end of the string ('\0')
memcpy(id, buffer, temp - buffer)
This is copying (temp-buffer) bytes from buffer to id. Since strchr finds the '/' character in the input, temp is pointing inside buffer (assuming it's found). So for example assume buffer points to a location in memory:
buffer = 0x781230001
and the third byte is the '/', after strchr, you have
temp = 0x781230003
temp - buffer therefore is 2.
HOWEVER: If the '/' is not found, strchr returns NULL, so temp - buffer is meaningless and the code will most likely crash. You should check the result of strchr before doing the pointer arithmetic.
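A hedged sketch of that check, using std::string for the outputs so that no manual zero-filling or termination is needed. The function name split_at_slash is hypothetical, and treating a missing '/' as "everything is the id" is an assumption that mirrors the rewritten code in the earlier answer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Splits line at the first '/'. Returns false when no '/' is present,
// in which case id receives the whole line and title is left empty.
bool split_at_slash(const char *line, std::string &id, std::string &title)
{
    const char *slash = std::strchr(line, '/');
    if (slash == nullptr) {
        id = line;
        title.clear();
        return false;
    }
    // slash - line is the number of chars in front of the '/'.
    id.assign(line, static_cast<std::size_t>(slash - line));
    title = slash + 1;  // everything after the '/', up to the terminating '\0'
    return true;
}
```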

There you calculate position of first / in buffer.
char* temp = strchr(buffer, '/');
Now temp points to / in buffer. If you want to copy this part of buffer, it's enough to have a pointer to the start and the length of the string. So temp - buffer evaluates to the length.
=================================
The cat is fat/And likes to sing
=================================
^             ^
buffer        temp
|   length    | = temp - buffer
The end of a null-terminated string is marked by \0 (or simply 0). So if you need to store N chars, you need to allocate a buffer of size N+1.

Replace method changes size of QByteArray

I want to manipulate a 32 bit write command which I have stored in a QByteArray. But the thing that confuses me is that my QByteArray changes size and I cannot figure out why that happens.
My code:
const char CMREFCTL[] = {0x85,0x00,0x00,0x0B};
QByteArray test = QByteArray::fromRawData(CMREFCTL, sizeof(CMREFCTL));
qDebug()<<test.toHex();
const char last1 = 0x0B;
const char last2 = 0x0A;
test.replace(3,1,&last2);
qDebug()<<test.toHex();
test.replace(3,1,&last1);
qDebug()<<test.toHex();
Generates:
"0x8500000b"
"0x8500000a0ba86789"
"0x8500000ba867890ba86789"
I expected the following output:
"0x8500000b"
"0x8500000a"
"0x8500000b"
Using test.replace(3,1,&last2,1) works, but I don't see why my code above doesn't give the same result.
Best regards!

Here is the documentation for the relevant method:
QByteArray & QByteArray::replace ( int pos, int len, const char * after )
This is an overloaded function.
Replaces len bytes from index position pos with the zero terminated
string after.
Notice: this can change the length of the byte array.
You are not giving the byte array a zero-terminated string, but a pointer to a single char. So it will scan forward in memory from that pointer until it hits a 0, and treat all that memory as the string to replace with.
If you just want to change a single character, test[3] = last2; should do what you want.

How to read x characters from position y in a char * buffer?

I am reading through a buffer (char *) and I have a cursor that tracks my starting position in the buffer. Is there a way to copy characters 7-64 out of the buffer, or is my best bet to just loop over the buffer from position x to position y?
The size of the destination buffer is the result of another function dynamically computed
Initializing this returns
variable-sized object 'version' may not be initialized
Relevant code parts:
int32_t size = this->getObjectSizeForMarker(cursor, length, buffer);
cursor = cursor + 8; //advance cursor past marker and size
char version[size] = this->getObjectForSizeAndCursor(size, cursor, buffer);
-
char* FileReader::getObjectForSizeAndCursor(int32_t size, int cursor, char *buffer) {
char destination[size];
memcpy(destination, buffer + cursor, size);
}
-
int32_t FileReader::getObjectSizeForMarker(int cursor, int eof, char * buffer) {
//skip the marker and read next 4 bytes
cursor = cursor + 4; //skip marker and read 4
unsigned char *ptr = (unsigned char *)buffer + cursor;
int32_t objSize = (ptr[0] << 24) | (ptr[1] << 16) | (ptr[2] << 8) | ptr[3];
return objSize;
}

Move the pointer to buffer six units ahead (to get to the seventh index), and then memcpy 64-7 (57) bytes, e.g.:
const char *buffer = "foo bar baz...";
char destination[SOME_MAX_LENGTH];
memcpy(destination, buffer + 6, 64-7);
You may want to terminate the destination array so that you can work with it using standard C string functions. Note that we're adding the null character at index 57 (the 58th byte), after the 57 bytes that were copied over:
/* terminate the destination string at the 58th byte, if desired */
destination[64-7] = '\0';
If you need to work with a dynamically sized destination, use a pointer instead of an array:
const char *buffer = "foo bar baz...";
char *destination = NULL;
/* note we do not multiply by sizeof(char), which is unnecessary */
/* we should cast the result, if we're in C++ */
destination = (char *) malloc(58);
/* error checking */
if (!destination) {
fprintf(stderr, "ERROR: Could not allocate space for destination\n");
return EXIT_FAILURE;
}
/* copy bytes and terminate */
memcpy(destination, buffer + 6, 57);
*(destination + 57) = '\0';
...
/* don't forget to free malloc'ed variables at the end of your program, to prevent memory leaks */
free(destination);
Honestly, if you're in C++, you should really probably be using the C++ strings library and std::string class. Then you can call the substr substring method on your string instance to get the 57-character substring of interest. It would involve fewer headaches and less re-inventing the wheel.
But the above code should be useful for both C and C++ applications.
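For the C++ route mentioned above, a minimal sketch could look like this. The free function copy_range is a hypothetical stand-in for the question's getObjectForSizeAndCursor; it returns a std::string instead of a pointer to a local array, so the copied bytes stay valid after the function returns. It assumes the caller has already checked that cursor + size lies inside the buffer.

```cpp
#include <cstddef>
#include <string>

// Copies `size` bytes starting at offset `cursor` out of `buffer`.
// std::string(ptr, n) copies exactly n bytes (null bytes included)
// and owns its storage, so nothing dangles after the return.
std::string copy_range(const char *buffer, std::size_t cursor, std::size_t size)
{
    return std::string(buffer + cursor, size);
}
```

Because the size is known only at run time, this also avoids the "variable-sized object may not be initialized" error from the question: the string allocates whatever length it is given.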

How to serialize numeric data into char*

I have a need to serialize int, double, long, and float
into a character buffer and this is the way I currently do it
int value = 42;
char* data = new char[64];
std::sprintf(data, "%d", value);
// check
printf( "%s\n", data );
First I am not sure if this is the best way to do it but my immediate problem is determining the size of the buffer. The number 64 in this case is purely arbitrary.
How can I know the exact size of the passed numeric so I can allocate exact memory; not more not less than is required?
Either a C or C++ solution is fine.
EDIT
Based on John's answer (allocate a large enough buffer ...) below, I am thinking of doing this
char *data = 0;
int value = 42;
char buffer[999];
std::sprintf(buffer, "%d", value);
data = new char[strlen(buffer)+1];
memcpy(data,buffer,strlen(buffer)+1);
printf( "%s\n", data );
This avoids waste, perhaps at a cost of speed, but it does not entirely solve the potential overflow. Or could I just use the maximum size sufficient to represent the type?

In C++ you can use a string stream and stop worrying about the size of the buffer:
#include <sstream>
...
std::ostringstream os;
int value=42;
os<<value; // you use string streams as regular streams (cout, etc.)
std::string data = os.str(); // now data contains "42"
(If you want you can get a const char * from an std::string via the c_str() method)
In C, instead, you can use snprintf to "fake" the write and get the size of the buffer to allocate; in fact, if you pass 0 as the second argument of snprintf, you can pass NULL as the target string, and the return value is the number of characters that would have been written. So in C you can do:
int value = 42;
char * data;
size_t bufSize=snprintf(NULL, 0, "%d", value)+1; /* +1 for the NUL terminator */
data = malloc(bufSize);
if(data==NULL)
{
// ... handle allocation failure ...
}
snprintf(data, bufSize, "%d", value);
// ...
free(data);
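The same two-pass snprintf idea can be sketched in C++ with a std::string as the owner of the memory, so there is no malloc/free pair to keep in sync. The function name int_to_string is made up for this example; since C++11, std::to_string covers the plain integer case directly.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats an int into a string sized exactly for its decimal representation.
std::string int_to_string(int value)
{
    // First pass: with a null buffer and size 0, snprintf only reports the length.
    const int len = std::snprintf(nullptr, 0, "%d", value);
    if (len < 0) {
        return std::string();  // snprintf reported an error
    }
    // +1 because snprintf always writes a terminating '\0'.
    std::string out(static_cast<std::size_t>(len) + 1, '\0');
    std::snprintf(&out[0], out.size(), "%d", value);
    out.resize(static_cast<std::size_t>(len));  // drop the terminator from the logical length
    return out;
}
```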

I would serialize to a 'large enough' buffer then copy to an allocated buffer. In C
char big_buffer[999], *small_buffer;
sprintf(big_buffer, "%d", some_value);
small_buffer = malloc(strlen(big_buffer) + 1);
strcpy(small_buffer, big_buffer);

Using istringstream to process a memory block of variable length

I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:
1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
1 byte separator (the space character)
n bytes which are the wchars which make up the string, where wchar_t's are 2-bytes each.
For example, the byte sequence:
21 36 20 66 00 6f 00 6f 00
is "!6 f.o.o." (using dots to represent char 0)
All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?
I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.
std::string s((const char*)pData);
but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:
std::string s((const char*)pData, len);
which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.
This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?

Try setting your stringstream's rdbuf:
char* buffer = something;
std::stringbuf *pbuf;
std::stringstream ss;
pbuf=ss.rdbuf();
pbuf->sputn(buffer, bufferlength);
// use your ss
Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.

Is it possible to modify how you encode the length, and make that a fixed size?
unsigned long size = 6; // known string length
char* buffer = new char[1 + sizeof(unsigned long) + 1 + size];
buffer[0] = '!';
memcpy(buffer+1, &size, sizeof(unsigned long));
buffer should hold the start indicator (1 byte), the actual size (size of unsigned long), the delimiter (1 byte) and the text itself (size).
This way, you could get the size "pretty" easy, then set the pointer to point beyond the overhead, and then use the len variable in the string constructor.
unsigned long len;
memcpy(&len, pData+1, sizeof(unsigned long)); // +1 to avoid the start indicator
// len now contains 6
char* actualData = pData + 1 + sizeof(unsigned long) + 1;
std::string s(actualData, len);
It's low level and error prone :) (for instance if you read anything that isn't encoded the way that you expect it to be, the len can get pretty big), but you avoid dynamically reading the length of the string.

It seems like something on this order should work:
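std::wstring make_string(char const *input) {
if (*input != '!')
return std::wstring(); // a narrow "" does not convert to std::wstring
char length = *++input;
// std::wstring needs wchar_t data, so the char pointer has to be cast
return std::wstring(reinterpret_cast<wchar_t const *>(++input), length);
}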
The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.
As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
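A sketch of how the variable-length prefix and the pointer update could be handled by hand, without streams. Several assumptions are made here that the question does not confirm: the function name decode_record is invented, the caller knows where the readable memory ends (the end parameter), the decimal prefix counts bytes as in the example "21 36 20 66 00 6f 00 6f 00", the payload is little-endian on a little-endian host, and char16_t is used in place of a 2-byte wchar_t to keep the sketch portable. Overflow of a very long digit run is not handled.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Decodes one "!<decimal byte count> <payload>" record starting at p.
// On success stores the text in out, advances p past the record and returns true.
// On failure p is left unchanged.
bool decode_record(const char *&p, const char *end, std::u16string &out)
{
    const char *cur = p;
    if (cur == end || *cur != '!') return false;
    ++cur;

    // Read decimal digits; the space separator tells us where the length stops.
    std::size_t bytes = 0;
    bool any_digit = false;
    while (cur != end && *cur >= '0' && *cur <= '9') {
        bytes = bytes * 10 + static_cast<std::size_t>(*cur - '0');
        any_digit = true;
        ++cur;
    }
    if (!any_digit || cur == end || *cur != ' ') return false;
    ++cur;

    // The payload must consist of whole 2-byte units and fit in the buffer.
    if (bytes % sizeof(char16_t) != 0) return false;
    if (static_cast<std::size_t>(end - cur) < bytes) return false;

    out.resize(bytes / sizeof(char16_t));
    if (bytes != 0) std::memcpy(&out[0], cur, bytes);  // memcpy avoids alignment problems
    p = cur + bytes;  // the pointer is passed by reference, so the caller sees the move
    return true;
}
```

Passing the pointer by reference is the design point made above: after a successful call, p already addresses the first byte past the encoded data, so records can be decoded in a loop.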

It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:
// Maximum size to read is
// 1 for the exclamation mark
// Digits for the character count (digits10() + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;
std::istrstream s(buf, max_size);
if (std::istream::traits_type::to_char_type(s.get()) != '!'){
throw "missing exclamation";
}
std::size_t size;
s >> size;
if (std::istream::traits_type::to_char_type(s.get()) != ' '){
throw "missing space";
}
std::wstring(reinterpret_cast<wchar_t*>(s.rdbuf()->str()), size/sizeof(wchar_t));
