Typecasting from byte[] to struct - c++

I'm currently working on a small C++ project where I use a client-server model someone else built. Data gets sent over the network and in my opinion it's in the wrong order. However, that's not something I can change.
Example data stream (simplified):
0x20 0x00 (C++: short with value 32)
0x10 0x35 (C++: short with value 13584)
0x61 0x62 0x63 0x00 (char*: abc)
0x01 (bool: true)
0x00 (bool: false)
I can represent this specific stream as :
struct test {
short sh1;
short sh2;
char abc[4];
bool bool1;
bool bool2;
};
And I can typecast it with test *t = (test*)stream; However, the char* has a variable length. It is, however, always null terminated.
I understand that there's no way of actually casting the stream to a struct, but I was wondering whether there would be a better way than struct test() { test(char* data) { ... }} (convert it via the constructor)

This is called marshalling or serialization (in this direction, unmarshalling or deserialization).
What you must do is read the stream one byte at a time (or put all in a buffer and read from that), and as soon as you have enough data for a member in the structure you fill it in.
When it comes to the string, you simply read until you hit the terminating zero, and then allocate memory and copy the string to that buffer and assign it to a pointer in the struct.
Reading strings this way is simplest and most effective if you have all of the message in a buffer already, because then you don't need a temporary buffer for the string.
Remember though, that with this scheme you have to manually free the memory containing the string when you are done with the structure.
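The steps above can be sketched roughly as follows. This is only a sketch under assumptions that are not in the question: the whole message is already in a buffer, the two shorts are 16-bit little-endian values (as in the example stream), and the names Message, read_le16 and parse_message are made up for illustration. It also swaps the manually allocated char* for a std::string, so there is nothing to free by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical in-memory form of the message; the string is owned by std::string.
struct Message {
    std::int16_t sh1;
    std::int16_t sh2;
    std::string abc;
    bool bool1;
    bool bool2;
};

// Decode a 16-bit little-endian value (assumed wire format).
static std::int16_t read_le16(const unsigned char* p) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(p[0] | (p[1] << 8)));
}

// Fill `out` from `buf`; returns false if the buffer is too short
// or the string is not null-terminated inside it.
bool parse_message(const unsigned char* buf, std::size_t len, Message& out) {
    if (len < 4) return false;                      // need both shorts
    out.sh1 = read_le16(buf);
    out.sh2 = read_le16(buf + 2);
    std::size_t pos = 4;

    // Read until the terminating zero, without running past the buffer.
    const void* nul = std::memchr(buf + pos, 0, len - pos);
    if (nul == nullptr) return false;
    std::size_t strLen =
        static_cast<std::size_t>(static_cast<const unsigned char*>(nul) - (buf + pos));
    out.abc.assign(reinterpret_cast<const char*>(buf + pos), strLen);
    pos += strLen + 1;                              // skip the terminator

    if (len - pos < 2) return false;                // need both bools
    out.bool1 = buf[pos] != 0;
    out.bool2 = buf[pos + 1] != 0;
    return true;
}
```

Fed the ten bytes of the example stream, this should give sh1 == 32, sh2 == 13584, abc == "abc", bool1 == true and bool2 == false.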

Just add a member function that takes the character buffer (a char * input parameter) and populates the test structure by parsing it.
This makes the code clearer and more readable as well.
If you provide an implicit conversion constructor, you create a menace that will do the conversion when you least expect it.
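A minimal sketch of that idea, assuming only the two leading shorts for brevity and that the bytes are already in host byte order. The name Header is made up for illustration. If you still want a constructor, marking it explicit stops the compiler from converting a char buffer to the struct behind your back.

```cpp
#include <cstring>

// Hypothetical struct holding only the fixed-size start of the message.
struct Header {
    short sh1 = 0;
    short sh2 = 0;

    Header() = default;

    // explicit: "Header h = buffer;" and silent conversions in
    // function calls no longer compile; "Header h(buffer);" still does.
    explicit Header(const char* data) { parse(data); }

    // Named member function that does the actual parsing.
    // Assumes data points at at least 2 * sizeof(short) bytes in host byte order.
    void parse(const char* data) {
        std::memcpy(&sh1, data, sizeof sh1);
        std::memcpy(&sh2, data + sizeof sh1, sizeof sh2);
    }
};
```

The memcpy calls avoid reading the buffer through a misaligned short pointer, which a plain cast could do.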

When reading variable-length data from a sequence of bytes,
you shouldn't try to fit everything into a single fixed-layout structure or variable.
Store the variable-length part behind a pointer, together with its length.
The following suggestion is not tested:
// data is stored in memory,
// in a different way,
// NOT as sequence of bytes,
// as provided
struct data {
short sh1;
short sh2;
int abclength;
// a pointer, maybe variable in memory !!!
char* abc;
bool bool1;
bool bool2;
};
// reads a single byte
bool readByte(byte* MyByteBuffer)
{
// your reading code goes here,
// character by character, from stream,
// file, pipe, whatever.
// The result should be true if there was no error,
// false if nothing more can be read
}
// used for reading several variables,
// with different sizes in bytes
int readBuffer(byte* Buffer, int BufferSize)
{
int RealCount = 0;
byte* p = Buffer;
// check the bound first so we never write past the end of Buffer
while (RealCount < BufferSize && readByte(p))
{
RealCount++;
p++;
}
return RealCount;
}
void read()
{
// real data here:
data MyData;
// long enough; holds the variable-length string temporarily
char temp[64000];
// fill buffer for string with null values
memset(temp, '\0', sizeof(temp));
int RealCount = 0;
// try read "sh1" field
RealCount = readBuffer((byte*)&MyData.sh1, sizeof(short));
if (RealCount == sizeof(short))
{
// try read "sh2" field
RealCount = readBuffer((byte*)&MyData.sh2, sizeof(short));
if (RealCount == sizeof(short))
{
// read the string byte by byte, up to and including its '\0'
RealCount = 0;
while (RealCount < (int)sizeof(temp) && readByte((byte*)&temp[RealCount]))
{
if (temp[RealCount++] == '\0')
break;
}
if (RealCount > 0 && temp[RealCount - 1] == '\0')
{
// store real bytes count (terminator included)
MyData.abclength = RealCount;
// allocate dynamic memory block for variable length data;
// C++ needs the cast, and the block must be free()d later
MyData.abc = (char*)malloc(RealCount);
if (MyData.abc != NULL)
{
// arrays decay to pointers, so no "&" operator is needed:
memcpy(MyData.abc, temp, RealCount);
// continue with rest of data
RealCount = readBuffer((byte*)&MyData.bool1, sizeof(bool));
if (RealCount > 0)
{
// continue with rest of data
RealCount = readBuffer((byte*)&MyData.bool2, sizeof(bool));
}
}
}
}
}
} // void read()
Cheers.


`istream` to array of floats (4-bytes each item)

I have the following function (so far):
void read_binary_file(std::istream is,
ByteArray arr)
{
int length = is.tellg();
char *buffer = new char[length];
is.read(buffer, length);
// What to do next?
// The goal is to place istream buffer in my `ByteArray` class `values`class,
// ByteArray - an array of `float`, each item should be 4 bytes from the buffer
}
My goal is to place each 4 bytes from the buffer inside my ByteArray->values class. Each item should contain 4 bytes from the buffer.
ByteArray definition:
class ByteArray
{
....
float *values;
}
Limitations: I don't want to use stl/ vector classes.
I couldn't find an example with my current limitations.
Any idea how I can do that?
If I understand correctly, you want to create a ByteArray object and copy bytes from buffer to ByteArray::values[] as floats. Assuming that the file is opened in binary mode and contains floats dumped in the correct format and endianness, and that the total data in the file is a multiple of sizeof(float):
class ByteArray
{
private:
float* values;
public:
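void set(const char* buffer, int len)
{
int count = len / sizeof(float);
values = new float[count];
// memcpy (from <cstring>) avoids the alignment and aliasing problems
// of reading the buffer through a float* cast
memcpy(values, buffer, count * sizeof(float));
}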
};
...
arr.set(buffer, length);
Note that (i) smarter code is possible, but I kept it as simple as possible for clarity; (ii) Ulrich is right: you should pass the istream by reference (as well as the ByteArray, for most practical purposes):
void read_binary_file(std::istream& is,
ByteArray& arr)
...
If you want to use istream to send bytes byte by byte you can say
arr.values=(float*)buffer;
or
arr.values=new float[length/4];
memcpy(arr.values,buffer,length);
delete[] buffer;
This works as long as the stream is opened in binary mode. In text mode a float can, by accident, contain a byte that the stream treats specially (for example 0x1A, which is read as end-of-file on Windows), and the stream then stops at that byte. So I recommend not sending raw float bytes through text streams or stringstreams. Send them another way, e.g. in hexadecimal (that way you don't lose precision).
What generated the file you want to read?

Subsetting char array without copying it in C++

I have a long array of char (coming from a raster file via GDAL), all composed of 0 and 1. To compact the data, I want to convert it to an array of bits (thus dividing the size by 8), 4 bytes at a time, writing the result to a different file. This is what I have come up with by now:
uint32_t bytes2bits(char b[33]) {
b[32] = 0;
return strtoul(b,0,2);
}
const char data[36] = "00000000000000000000000010000000101"; // 101 is to be ignored
char word[33];
strncpy(word,data,32);
uint32_t byte = bytes2bits(word);
printf("Data: %d\n",byte); // 128
The code is working, and the result is going to be written in a separate file. What I'd like to know is: can I do that without copying the characters to a new array?
EDIT: I'm using a const variable here just to make a minimal, reproducible example. In my program it's a char *, which is continually changing value inside a loop.
Yes, you can, as long as you can modify the source string (in your example code you can't because it is a constant, but I assume in reality you have the string in writable memory):
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
// You would need to make sure that the `data` argument always has
// at least 33 characters in length (the null terminator at the end
// of the original string counts)
char temp = data[32];
data[32] = 0;
uint32_t byte = bytes2bits(data);
data[32] = temp;
printf("Data: %d\n",byte); // 128
}
In this example, because a char* buffer holds the long data, there is no need to copy each part into a temporary buffer before converting it to a number.
Just step through the buffer in periods of 32 characters; after the 32nd character of each period there has to be a terminating 0 byte, which is written temporarily and then restored.
So your code would look like:
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
size_t dataLen = strlen(data);
const size_t periodLen = 32;
char* periodStr = data; // start of the current 32-character period
char tmp;
uint32_t byte;
// full periods: terminate each one in place, convert, then restore
while((size_t)(periodStr - data) + periodLen <= dataLen)
{
tmp = periodStr[periodLen];
periodStr[periodLen] = 0;
byte = bytes2bits(periodStr);
printf("Data: %u\n",(unsigned)byte); // 128
periodStr[periodLen] = tmp;
periodStr += periodLen;
}
// last period, shorter than 32 characters; the string's own '\0' ends it
if(*periodStr != 0)
{
byte = bytes2bits(periodStr);
printf("Data: %u\n",(unsigned)byte);
}
}
Please then be careful with the last period, which can be shorter than 32 characters.
const char data[36]
You are in violation of your contract with the compiler if you declare something as const and then modify it.
Generally speaking, the compiler won't let you modify it...so to even try to do so with a const declaration you'd have to cast it (but don't)
char *sneaky_ptr = (char*)data;
sneaky_ptr[0] = 'U'; /* the U is for "undefined behavior" */
See: Can we change the value of an object defined with const through pointers?
So if you wanted to do this, you'd have to be sure the data was legitimately non-const.
The right way to do this in modern C++ is by using std::string to hold your string and std::string_view to process parts of that string without copying it.
You can use string_view with that char array you have, though. It's commonly used to modernize the classical null-terminated const char* string.
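A sketch of how that could look for the 0/1 data in the question. It assumes C++17 and swaps strtoul for std::from_chars, because from_chars takes an explicit end pointer and so needs neither a copy nor a temporary '\0' written into the data. The function name bits_to_word is made up for illustration.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Parse a view of '0'/'1' characters (at most 32) as a binary number.
// Nothing is copied and the underlying array is never modified.
// Returns false if the view is empty, has other characters, or overflows.
bool bits_to_word(std::string_view bits, std::uint32_t& out) {
    const char* first = bits.data();
    const char* last = first + bits.size();
    std::from_chars_result res = std::from_chars(first, last, out, 2);  // base 2
    return res.ec == std::errc() && res.ptr == last;
}
```

With the const char data[36] from the question, bits_to_word(std::string_view(data).substr(0, 32), word) should set word to 128, and the array can stay const.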

How to parse captured packet from socket in cpp?

I'm using RAW socket to capture udp packets. After capturing I want to parse the packet and see what's inside.
The input I get from the socket is an unsigned char* buffer and its length. I tried to put the buffer into a string, but I guess I did it wrong because when I checked the string it was empty.
Any advice?
I don't know what you want to parse, but you have the buffer and its length, so you can do anything you want with this memory. Look into pointer arithmetic. If you want to make a C string out of the content, simply add a '\0' to the end of the memory block. But this assumes that no other 0x00 bytes are inside the buffer, so you may have to check for that, as πάντα ῥεῖ said.
Steps:
1: receive UDP package
2: cast like:
unsigned char* buffer;
char* cString = (char*) buffer;
3: check the cast cString for a '\0' that occurs before the buffer size is reached. If there is one, create a new char* pointer to the byte after the '\0', but be aware of the buffer size. Save the pointer in a vector.
I made a code example, but haven't checked whether it runs!
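char* firstPtr = (char*) buffer;
size_t indexer = 0;
std::vector<char*> pointerVec;
pointerVec.push_back(firstPtr);
while(indexer < bufferSize) {
if(*(buffer + indexer) == '\0') {
if(indexer + 1 < bufferSize) {
// the next string starts on the byte after the '\0'
char* cString = (char*) (buffer + indexer + 1);
pointerVec.push_back(cString);
}
}
indexer++; // without this the loop never ends
} // end while
// note: the last string is only terminated if the buffer ends with '\0'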
After that, the vector holds pointers to the start of each string. You can now hand them to a copy mechanism that takes every C-string pointer and saves its content to one C string or std::string.
I hope this is what you were looking for, because your question was unclear.
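As a rough alternative to juggling raw pointers, here is a sketch that copies the payload into std::string objects using the known length. The string in the question may have looked empty because it was built from the pointer alone, which stops at the first 0x00 byte; that is a guess, since the original code isn't shown. Both function names are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Copy the whole payload, embedded zero bytes included.
// Passing the length keeps std::string from stopping at the first '\0'.
std::string to_string_with_length(const unsigned char* buffer, std::size_t length) {
    return std::string(reinterpret_cast<const char*>(buffer), length);
}

// Split the payload on '\0' bytes into separate strings.
// A trailing piece without a terminator is kept as well.
std::vector<std::string> split_on_nul(const unsigned char* buffer, std::size_t length) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (buffer[i] == 0) {
            parts.emplace_back(reinterpret_cast<const char*>(buffer + start), i - start);
            start = i + 1;
        }
    }
    if (start < length)
        parts.emplace_back(reinterpret_cast<const char*>(buffer + start), length - start);
    return parts;
}
```

Because the strings own copies of the data, they stay valid after the receive buffer is reused for the next packet.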

How to read in only a particular number of characters

I have a small query regarding reading a set of characters from a structure. For example: A particular variable contains a value "3242C976*32" (char - type). How can I get only the first 8 bits of this variable. Kindly help.
Thanks.
Edit:
I'm trying to read in a signal:
For Ex: $ASWEER,2,X:3242C976*32
into this structure:
struct pg
{
char command[7]; // saves as $ASWEER,2,X:3242C976*32
char comma1[1]; // saves as ,2,X:3242C976*32
char groupID[1]; // saves as 2,X:3242C976*32
char comma2[1]; // etc
char handle[2]; // this is the problem, need it to save specifically each part, buts its not
char canID[8];
char checksum[3];
}m_pg;
...
When memcopying buffer into a structure, it works but because there is no carriage returns it saves the rest of the signal in each char variable. So, there is always garbage at the end.
You could convert the hex value in canID to a float (depending on how you want to display it), e.g.
float value1 = HexToFloat(m_pg.canID); // find a conversion script for HexToFloat
CString val;
val.Format("%0.3f",value1);
The garbage values aren't actually stored in the structure; they only appear when it is displayed, because the members have no terminating character. So format the message however you want and display it using the CString val.
If "3242C976*3F" is a c-string or std::string, you can just do:
const char* str = "3242C976*3F";
char first_byte = str[0];
Or with an arbitrary memory block you can do:
SomeStruct memoryBlock;
char firstByte;
memcpy(&firstByte, &memoryBlock, 1);
Both copy the first 8bits or 1 byte from the string or arbitrary memory block just as well.
After the edit (original answer below)
Just copy by parts. In C, something like this should work (could also work in C++ but may not be idiomatic)
strncpy(m_pg.command, value, 7); // m_pg.command[7] = 0; // oops
strncpy(m_pg.comma1, value+7, 1); // m_pg.comma1[1] = 0; // oops
strncpy(m_pg.groupID, value+8, 1); // m_pg.groupID[1] = 0; // oops
strncpy(m_pg.comma2, value+9, 1); // m_pg.comma2[1] = 0; // oops
// etc
Also, you don't have space for the string terminator in the members of the structure (hence the "oops" comments above: those writes would be out of bounds). They are NOT strings. Do not printf them with %s!
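One way around the missing terminators is to give every member one extra byte and terminate each field while copying. The following is only a sketch: the struct ParsedSignal and both function names are made up, the separator commas are skipped instead of stored, and the fixed offsets assume the exact layout of the example $ASWEER,2,X:3242C976*32.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical variant of the struct with room for a '\0' in every member.
struct ParsedSignal {
    char command[8];   // "$ASWEER"  + '\0'
    char groupID[2];   // "2"        + '\0'
    char handle[3];    // "X:"       + '\0'
    char canID[9];     // "3242C976" + '\0'
    char checksum[4];  // "*32"      + '\0'
};

// Copy exactly n characters and terminate the destination.
static void copy_field(char* dst, const char* src, std::size_t n) {
    std::memcpy(dst, src, n);
    dst[n] = '\0';
}

// Fixed offsets: 7 command, 1 comma, 1 group, 1 comma, 2 handle, 8 id, 3 checksum.
bool parse_signal(const char* msg, ParsedSignal& out) {
    if (std::strlen(msg) < 23) return false;   // message too short
    copy_field(out.command,  msg,      7);
    copy_field(out.groupID,  msg + 8,  1);
    copy_field(out.handle,   msg + 10, 2);
    copy_field(out.canID,    msg + 12, 8);
    copy_field(out.checksum, msg + 20, 3);
    return true;
}
```

After this, every member is a real C string and can be printed with %s.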
Don't read more than 8 characters. In C, something like
char value[9]; /* 8 characters and a 0 terminator */
int ch;
scanf("%8s", value);
/* optionally ignore further input */
while (((ch = getchar()) != '\n') && (ch != EOF)) /* void */;
/* input terminated with ch (either '\n' or EOF) */
I believe the above code also "works" in C++, but it may not be idiomatic in that language
If you have a char pointer, you can just set str[8] = '\0'; Be careful though, because if the buffer is less than 8 (EDIT: 9) bytes, this could cause problems.
(I'm just assuming that the name of the variable that already is holding the string is called str. Substitute the name of your variable.)
It looks to me like you want to split at the comma, and save up to there. This can be done with strtok(), to split the string into tokens based on the comma, or strchr() to find the comma, and strcpy() to copy the string up to the comma.
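The strchr() variant could be sketched like this. The helper name copy_until_comma is made up, and it uses a bounded memcpy rather than strcpy so that the destination cannot overflow.

```cpp
#include <cstddef>
#include <cstring>

// Copy the text before the first comma in src into dst (capacity dstSize).
// dst is always null-terminated; the text is truncated if it does not fit.
// Returns the number of characters copied, not counting the terminator.
std::size_t copy_until_comma(const char* src, char* dst, std::size_t dstSize) {
    if (dstSize == 0) return 0;
    const char* comma = std::strchr(src, ',');
    std::size_t n = comma ? static_cast<std::size_t>(comma - src)
                          : std::strlen(src);      // no comma: take everything
    if (n >= dstSize) n = dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Unlike strtok(), this leaves the source string untouched, so it also works on a const buffer.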

How to serialize numeric data into char*

I have a need to serialize int, double, long, and float
into a character buffer and this is the way I currently do it
int value = 42;
char* data = new char[64];
std::sprintf(data, "%d", value);
// check
printf( "%s\n", data );
First I am not sure if this is the best way to do it but my immediate problem is determining the size of the buffer. The number 64 in this case is purely arbitrary.
How can I know the exact size of the passed numeric so I can allocate exact memory; not more not less than is required?
Either a C or C++ solution is fine.
EDIT
Based on Johns answer ( allocate large enough buffer ..) below, I am thinking of doing this
char *data = 0;
int value = 42;
char buffer[999];
std::sprintf(buffer, "%d", value);
data = new char[strlen(buffer)+1];
memcpy(data,buffer,strlen(buffer)+1);
printf( "%s\n", data );
This avoids waste, perhaps at a cost of speed, and it does not entirely solve the potential overflow. Or could I just use the maximum size sufficient to represent the type?
In C++ you can use a string stream and stop worrying about the size of the buffer:
#include <sstream>
...
std::ostringstream os;
int value=42;
os<<value; // you use string streams as regular streams (cout, etc.)
std::string data = os.str(); // now data contains "42"
(If you want you can get a const char * from an std::string via the c_str() method)
In C, instead, you can use snprintf to "fake" the write and get the size of the buffer to allocate; in fact, if you pass 0 as the second argument of snprintf, you can pass NULL as the target string, and the return value is the number of characters that would have been written. So in C you can do:
int value = 42;
char * data;
size_t bufSize=snprintf(NULL, 0, "%d", value)+1; /* +1 for the NUL terminator */
data = malloc(bufSize);
if(data==NULL)
{
// ... handle allocation failure ...
}
snprintf(data, bufSize, "%d", value);
// ...
free(data);
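For C++ callers, the same two-pass snprintf trick can be wrapped so that it covers int, long, float and double with one helper. This is a sketch only: the name format_exact is made up, and the caller is responsible for passing a format string that matches the type of the value.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// First pass measures, second pass writes into a string of exactly that size.
// The format must match T (e.g. "%d" for int, "%ld" for long, "%f" for double;
// a float argument is promoted to double, so "%f" works for it too).
template <typename T>
std::string format_exact(const char* fmt, T value) {
    int needed = std::snprintf(nullptr, 0, fmt, value);   // measure only
    if (needed < 0) return std::string();                 // encoding error
    std::string out(static_cast<std::size_t>(needed) + 1, '\0');  // +1 for the NUL
    std::snprintf(&out[0], out.size(), fmt, value);       // real write
    out.resize(static_cast<std::size_t>(needed));         // drop the NUL again
    return out;
}
```

The std::string owns the memory, so there is no matching free() to remember; c_str() gives a const char* when one is needed.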
I would serialize to a 'large enough' buffer then copy to an allocated buffer. In C
char big_buffer[999], *small_buffer;
sprintf(big_buffer, "%d", some_value);
small_buffer = malloc(strlen(big_buffer) + 1);
strcpy(small_buffer, big_buffer);