copying a buffer in a struct in C++ style - c++

Say i have a struct like
typedef struct {
unsigned char flag, type;
unsigned short id;
uint32 size;
} THdr;
and a buffer of data coming from a UDP communication: I have a buffer of bytes and its size (data and data_size). data_size is bigger than sizeof(THdr).
The struct is at the beginning of the buffer, and I want to copy it to a struct defined in the code, THdr Struct_copy.
I know that I can use memcpy(&Struct_copy, data, sizeof(Struct_copy)); but I would like to use a "C++ style" way, like using std::copy.
Any clue?

There is no "clever" way to do this in C++. If it doesn't involve memcpy, then you need std::copy and you will have to use casts to const unsigned char * to make the datatype of the source and/or destination match up.
There are various other ways to "copy some data", such as type-punning, but again, it's not a "better" solution than memcpy - just different.
C++ was designed to accept C-code, so if there's nothing wrong with your current code, then I don't see why you should change it.

How about something like
THdr hdr;
std::copy(reinterpret_cast<const THdr*>(data), reinterpret_cast<const THdr*>(data) + 1, &hdr);
That should copy one THdr structure from data to hdr. (Note the argument order: std::copy(first, last, destination) — the source range comes first.)
It can also be much simpler:
THdr hdr;
hdr = *reinterpret_cast<const THdr*>(data);
This last works in C as well:
THdr hdr;
hdr = *(THdr *) data;
Or why not create a constructor that takes the data as input? Like
struct THdr
{
explicit THdr(const unsigned char* data)
{
const THdr* other = reinterpret_cast<const THdr*>(data);
*this = *other;
}
// ...
};
Then it can be used such as
THdr hdr(data);

Related

convert struct to uint8_t array in C++

I have a typedef struct with different data types in it. The number array has negative and non-negative values. How do I convert this struct into a uint8_t array in C++ on the Linux platform? Appreciate some help on this. Thank you. The reason I am trying to do the conversion is to send this uint8_t buffer as a parameter to a function.
typedef struct
{
int enable;
char name;
int numbers[5];
float counter;
};
Appreciate any example on doing this. Thank you.
For a plain old data structure like the one you show this is trivial:
You know the size of the structure in bytes (from the sizeof operator).
That means you can create a vector of bytes of that size.
Once you have that vector you can copy the bytes of the structure object into the vector.
Now you have, essentially, an array of bytes representation of the structure object.
In code it would be something like this:
struct my_struct_type
{
int enable;
char name;
int numbers[5];
float counter;
};
my_struct_type my_struct_object = {
// TODO: Some valid initialization here...
};
// Depending on the compiler you're using, and its C++ standard used
// you might need to use `std::uint8_t` instead of `std::byte`
std::vector<std::byte> bytes(sizeof my_struct_object);
std::memcpy(bytes.data(), reinterpret_cast<void*>(&my_struct_object), sizeof my_struct_object);
Suggestion: use char*. C++ allows you to cast any object pointer to that type, so you can use reinterpret_cast<char*>(&obj) (assuming Obj obj;).
If you need to use uint8_t* (and it's not a typedef to char*), you can allocate sufficient memory and do memcpy:
uint8_t buffer[sizeof(Obj)];
memcpy(buffer, &obj, sizeof(obj));

Is there a better way to modify a char array in a struct? C++

I am trying to read in a CString from an edit control box in MFC, then put it into a char array in a struct, but since I cannot do something like clientPacket->path = convertfunction(a); I had to create another char array to store the string and then store it element by element.
That felt like a band-aid solution; is there a better way to approach this? I'd like to learn how to clean up the code.
CString stri;//Read text from edit control box and convert it to std::string
GetDlgItem(IDC_EDIT1)->GetWindowText(stri);
string a;
a = CT2A(stri);
char holder[256];
strcpy_s(holder,a.c_str());
int size = sizeof(holder);
struct packet {
char caseRadio;
char path[256];
};
packet* clientPacket = new packet;
for (int t = 0; t < size; t++) {
clientPacket->path[t] = holder[t] ;
}
EDIT:This is currently what I went with:
CString stri;//Read text from edit control box and convert it to std::string
GetDlgItem(IDC_EDIT1)->GetWindowText(stri);
string a = CT2A(stri);
struct packet {
char caseRadio;
char path[CONSTANT];//#define CONSTANT 256
};
packet* clientPacket = new packet;
a = a.substr(0, sizeof(clientPacket->path) - 1);
strcpy_s(clientPacket->path, a.c_str());
I ran into a problem where I got "1path" instead of "path"; it turned out caseRadio='1' had been read in as well, and I fixed it by reading out caseRadio first on the server.
I don't see the need to create the intermediate 'holder' char array.
I think you can just directly do
strcpy(clientPacket->path, a.c_str());
You may want to do this:
a= a.substr(0, sizeof(clientPacket->path)-1);
before the strcpy to avoid buffer overrun depending on whether the edit text is size limited or not.
You can copy directly into a user-provided buffer when using the Windows API call GetWindowTextA. The following illustrates how to do this:
struct packet {
char caseRadio;
char path[512];
} p;
::GetWindowTextA(GetDlgItem(IDC_EDIT1)->GetSafeHwnd(), &p.path[0],
static_cast<int>(sizeof(p.path)));
This does an implicit character encoding conversion using the CP_ACP code page. This is not generally desirable, and you may wish to perform the conversion using a known character encoding (such as CP_UTF8).
Use the CString::GetBuffer function to get a pointer to the string. In your struct, store the path as a char* instead of a char array. (Be aware the pointer is only valid while the CString is alive and unmodified.)
struct packet {
char caseRadio;
char* path;
};
packet* clientPacket = new packet;
clientPacket->path = stri.GetBuffer();
Like this, maybe? strncpy(clientPacket->path, CT2A(stri), 255); — CT2A converts implicitly to a char pointer.
Also, better make the 256 bytes a constant and use that name, just in case you change this in 10 years.

The right way to work with network buffer in modern GCC/C++ without breaking strict-aliasing rules

The program - some sort of old-school network messaging:
// Common header for all network messages.
struct __attribute__((packed)) MsgHeader {
uint32_t msgType;
};
// One of network messages.
struct __attribute__((packed)) Msg1 {
MsgHeader header;
uint32_t field1;
};
// Network receive buffer.
uint8_t rxBuffer[MAX_MSG_SIZE];
// Receive handler. The received message is already in the rxBuffer.
void onRxMessage() {
// Detect message type
if ( ((const MsgHeader*)rxBuffer)->msgType == MESSAGE1 ) { // Breaks strict-aliasing!
// Process Msg1 message.
const Msg1* msg1 = (const Msg1*)rxBuffer;
if ( msg1->field1 == 0 ) { // Breaks strict-aliasing!
// Some code here;
}
return;
}
// Process other message types.
}
This code violates strict aliasing in modern GCC (and is undefined behaviour in modern C++).
What is the correct way to solve the problem (i.e. to write the code so that it doesn't trigger the strict-aliasing warning)?
P.S. If rxBuffer is defined as:
union __attribute__((packed)) {
uint8_t rawData[MAX_MSG_SIZE];
} rxBuffer;
and then I cast &rxBuffer to other pointers it doesn't cause any warnings. But is it safe, right and portable way?
Define rxBuffer as a pointer to a union of uint8_t[MAX_SIZE], MsgHeader, Msg1 and whatever type you plan to cast to. Note that this would still break the strict aliasing rules, but in GCC it is guaranteed to work as a non-standard extension.
EDIT: if such a method would lead to a too complicated declaration, a fully portable (if slower) way is to keep the buffer as a simple uint8_t[] and memcpy it into the appropriate message struct as soon as it has to be reinterpreted. The feasibility of this method obviously depends on your performance and efficiency needs.
EDIT 2: a third solution (if you are working on "normal" architectures) is to use char or unsigned char instead of uint8_t, since those types are guaranteed to alias everything. Not valid after all: the subsequent conversion to the message type might still not work.
By working with the individual bytes, you can avoid all pointer casting and eliminate portability issues with endianness and alignment:
uint32_t decodeUInt32(uint8_t *p) {
// Decode big-endian, which is network byte order.
return (uint32_t(p[0])<<24) |
(uint32_t(p[1])<<16) |
(uint32_t(p[2])<< 8) |
(uint32_t(p[3]) );
}
void onRxMessage() {
// Detect message type
if ( decodeUInt32(rxBuffer) == MESSAGE1 ) {
// Process Msg1 message.
if ( decodeUInt32(rxBuffer+4) == 0 ) {
// Some code here;
}
return;
}
// Process other message types.
}
Like Alberto M wrote, you can change the type of your buffer and how you receive into it:
union {
uint8_t rawData[MAX_MSG_SIZE];
struct MsgHeader msgHeader;
struct {
struct MsgHeader dummy;
struct Msg1 msg;
} msg1;
} rxBuffer;
receiveBuffer(&rxBuffer.rawData);
if (rxBuffer.msgHeader.msgType == MESSAGE1) {
if (rxBuffer.msg1.msg.field1) {
// ...
or directly receive into the struct, if your receive uses chars (uint8_t only aliases uint8_t unlike char, which may always alias):
struct {
struct MsgHeader msgHeader;
union {
struct Msg1 msg1;
struct Msg2 msg2;
} msg;
} rxBuffer;
recv(fd, (char *)&rxBuffer, MAX_MSG_SIZE, 0);
// handle errors and insufficient recv length
if (rxBuffer.msgHeader.msgType == MESSAGE1) {
// ...
Btw. type punning through a union is standard C and doesn't break strict aliasing — see C99-TC3 6.5 (7), and also search for "type punning". The question is about C++, though, not C, so Alberto M is right about it being non-standard there, but a GCC extension.
Using memcpy for this works kind of in the same manner like above, but is standard: bytes are copied on per character basis, effectively reinterpreting them as a struct when accessing the destination location, like you would do when you're type punning through a union:
struct MsgHeader msgHeader;
memcpy(&msgHeader, rxBuffer, sizeof(msgHeader));
if (msgHeader.msgType == MESSAGE1) {
struct Msg1 msg;
memcpy(&msg, rxBuffer + sizeof(msgHeader), sizeof(msg));
if (msg.field1 == 0) {
// Some code here;
}
}
Or like Vaughn Cato wrote, you can unpack (and should then probably also pack) the received and sent network buffers yourself. Again it's standard compliant and this way you also work around padding and byte order in a portable way:
uint8_t *buf= rxBuffer;
struct MsgHeader msgHeader;
msgHeader.msgType = (buf[3]<<0) | (buf[2]<<8) | (buf[1]<<16) | (buf[0]<<24); // read uint32_t in big endian
if (msgHeader.msgType == MESSAGE2) {
struct Msg2 msg;
buf += sizeof(MsgHeader);
msg.field1 = (buf[1]<<0) | (buf[0]<<8); // read uint16_t in big endian
if (msg.field1 == 0) {
// ...
Note: struct Msg1 and struct Msg2 don't contain a struct MsgHeader in the above snippets and are like this:
struct Msg1 {
uint32_t field1;
};
struct Msg2 {
uint16_t field1;
};
It boils down to this:
((const MsgHeader*)rxBuffer)->msgType
rxBuffer is of one type, but we wish to treat it as if it were of another type. I suggest the following "alias-cast":
MsgHeader * msg_header_p = (MsgHeader *) rxBuffer;
memmove(msg_header_p, rxBuffer, sizeof(MsgHeader));
auto msg_type = msg_header_p->msgType;
memmove (like its less flexible cousin memcpy) effectively says that the bit pattern that was available at the source (rxBuffer) will, after the call to memmove be available at the destination (msg_header_p). Even if the types are different.
You might argue that memmove does "nothing", because the source and destination are identical. But that's exactly the point. Logically, it serves the purpose of making msg_header_p an alias for rxBuffer, even though in practice a good compiler will optimize it out.
(This answer is potentially a bit controversial. I may be pushing memmove too far. I guess my logic is: first, memcpy to a new location is clearly acceptable as an answer to this question; second, memmove is just a more general (but maybe slower) version of memcpy; third, if memcpy allows you to look at the same bit pattern via a different type, then why shouldn't memmove allow the same idea to "change" the type of a particular bit pattern? If we memcpy to a temporary area, then memcpy back to the original position, would that be OK also?)
If you want to build a full answer out of this, you'll need to alias-cast back again at some point, memmove(rxBuffer, msg_header_p, sizeof(MsgHeader));, but I guess I should await feedback on my "alias cast" first!

Dereferencing Variable Size Arrays in Structs

Structs seem like a useful way to parse a binary blob of data (i.e. a file or network packet). This is fine and dandy until you have variable-size arrays in the blob. For instance:
struct nodeheader{
int flags;
int data_size;
char data[];
};
This allows me to find the last data character:
nodeheader b;
cout << b.data[b.data_size-1];
Problem being, I want to have multiple variable length arrays:
struct nodeheader{
int friend_size;
int data_size;
char data[];
char friend[];
};
I'm not manually allocating these structures. I have a file like so:
char file_data[1024];
nodeheader* node = (nodeheader*)&file_data[10];
As I'm trying to parse a binary file (more specifically a class file), I've written an implementation in Java (which was my class assignment); now I'm doing a personal version in C++ and was hoping to get away without having to write 100 lines of code. Any ideas?
Thanks,
Stefan
You cannot have multiple variable-sized arrays. How would the compiler know at compile time where friend[] is located? The location of friend depends on the size of data[], and the size of data is unknown at compile time.
This is a very dangerous construct, and I'd advise against it. You can only include a variable-length array in a struct when it is the LAST element, and when you do so, you have to make sure you allocate enough memory, e.g.:
nodeheader *nh = (nodeheader *)malloc(sizeof(nodeheader) + max_data_size);
What you want to do is just use regular dynamically allocated arrays:
struct nodeheader
{
char *data;
size_t data_size;
char *friend_buf; // 'friend' is a C++ keyword, so use another name
size_t friend_size;
};
nodeheader AllocNodeHeader(size_t data_size, size_t friend_size)
{
nodeheader nh;
nh.data = (char *)malloc(data_size); // check for NULL return
nh.data_size = data_size;
nh.friend_buf = (char *)malloc(friend_size); // check for NULL return
nh.friend_size = friend_size;
return nh;
}
void FreeNodeHeader(nodeheader *nh)
{
free(nh->data);
nh->data = NULL;
free(nh->friend_buf);
nh->friend_buf = NULL;
}
You can't - at least not in the simple way that you're attempting. The unsized array at the end of a structure is basically an offset to the end of the structure, with no built-in way to find the end.
All the fields are converted to numeric offsets at compile time, so they need to be calculable at that time.
The answers so far are seriously over-complicating a simple problem. Mecki is right about why it can't be done the way you are trying to do it, however you can do it very similarly:
struct nodeheader
{
int friend_size;
int data_size;
};
struct nodefile
{
nodeheader *header;
char *data;
char *friend_buf; // 'friend' is a C++ keyword, so use another name
};
char file_data[1024];
// .. file in file_data ..
nodefile file;
file.header = (nodeheader *)&file_data[0];
file.data = (char *)&file.header[1];
file.friend_buf = &file.data[file.header->data_size];
For what you are doing you need an encoder/decoder for the format. The decoder takes the raw data and fills out your structure (in your case allocating space for the copy of each section of the data), and the encoder writes raw binary.
(Was 'Use std::vector')
Edit:
On reading feedback, I suppose I should expand my answer. You can effectively fit two variable length arrays in your structure as follows, and the storage will be freed for you automatically when file_data goes out of scope:
struct nodeheader {
std::vector<unsigned char> data;
std::vector<unsigned char> friend_buf; // 'friend' is a keyword!
// etc...
};
nodeheader file_data;
Now file_data.data.size(), etc gives you the length and and &file_data.data[0] gives you a raw pointer to the data if you need it.
You'll have to fill file_data from the file piecemeal - read the length of each buffer, call resize() on the destination vector, then read in the data. (There are ways to do this slightly more efficiently. In the context of disk file I/O, I'm assuming it doesn't matter.)
Incidentally OP's technique is incorrect even for his 'fine and dandy' cases, e.g. with only one VLA at the end.
char file_data[1024];
nodeheader* node = (nodeheader*)&file_data[10];
There's no guarantee that file_data is properly aligned for the nodeheader type. Prefer to obtain file_data by malloc() - which guarantees to return a pointer aligned for any type - or else (better) declare the buffer to be of the correct type in the first place:
struct biggestnodeheader {
int flags;
int data_size;
char data[ENOUGH_SPACE_FOR_LARGEST_HEADER_I_EVER_NEED];
};
biggestnodeheader file_data;
// etc...

How to read in specific sizes and store data of an unknown type in c++?

I'm trying to read data in from a binary file and then store it in a data structure for later use. The issue is I don't want to have to identify exactly what type it is when I'm just reading it in and storing it. I just want to store information about what type of data it is and how much data of this certain type there is (information easily obtained in the first couple of bytes of this data).
But how can I read in just a certain amount of data, disregarding what type it is and still easily be able to cast (or something similar) that data into a readable form later?
My first idea would be to use characters, since all the data I will be looking at will be in byte units.
But if I did something like this:
ifstream fileStream;
fileStream.open("fileName.tiff", ios::binary);
//if I had to read in 4 bytes of data
char memory[4];
fileStream.read(memory, 4);
But how could I cast these 4 bytes if I later wanted to read them and knew they were a double?
What's the best way to read in data of an unknown type but know size for later use?
I think a reinterpret_cast will give you what you need. If you have a char * to the bytes you can do the following:
double * x = reinterpret_cast<double *>(dataPtr);
Check out Type Casting on cplusplus.com for a more detailed description of reinterpret_cast.
You could copy it to the known data structure which makes life easier later on:
double x;
memcpy (&x,memory,sizeof(double));
or you could just refer to it as a cast value:
if (*((double*)(memory)) == 4.0) {
// blah blah blah
}
I believe a char* is the best way to read it in, since the size of a char is guaranteed to be 1 (one byte, though a byte is not necessarily an octet), and all other data types are defined in terms of that unit, so that, if sizeof(double) == 27, you know it will fit into a char[27]. So, if you have a known size, that's the easiest way to do it.
You could store the data in a class that provides functions to cast it to the possible result types, like this:
enum data_type {
TYPE_DOUBLE,
TYPE_INT
};
class data {
public:
data_type type;
size_t len;
char *buffer;
data(data_type a_type, char *a_buffer, size_t a_len)
: type(a_type), len(a_len), buffer(NULL) {
buffer = new char[a_len];
memcpy(buffer, a_buffer, a_len);
}
~data() {
delete[] buffer;
}
double as_double() {
assert(TYPE_DOUBLE == type);
assert(len >= sizeof(double));
return *reinterpret_cast<double*>(buffer);
}
int as_int() {...}
};
Later you would do something like this:
data d = ...;
switch (d.type) {
case TYPE_DOUBLE:
something(d.as_double());
break;
case TYPE_INT:
something_else(d.as_int());
break;
...
}
That's at least how I'm doing these kind of things :)
You can use structures and anonymous unions:
struct Variant
{
size_t size;
enum
{
TYPE_DOUBLE,
TYPE_INT,
} type;
union
{
char raw[0]; // Copy to here. *
double asDouble;
int asInt;
};
};
Optional: Create a table of type => size, so you can find the size given the type at runtime. This is only needed when reading.
static unsigned char typeSizes[2] =
{
sizeof(double),
sizeof(int),
};
Usage:
Variant v;
v.type = Variant::TYPE_DOUBLE;
v.size = typeSizes[v.type];
fileStream.read(v.raw, v.size);
printf("%f\n", v.asDouble);
You will probably receive warnings about type punning. Read: Doing this is not portable and against the standard! Then again, so is reinterpret_cast, C-style casting, etc.
Note: First edit, I did not read your original question. I only had the union, not the size or type part.
*This is a neat trick I learned a long time ago. Basically, raw doesn't take up any bytes (thus doesn't increase the size of the union), but provides a pointer to a position in the union (in this case, the beginning). It's very useful when describing file structures:
struct Bitmap
{
// Header stuff.
uint32_t dataSize;
RGBPixel data[0];
};
Then you can just fread the data into a Bitmap. =]
Be careful. In most environments I'm aware of, doubles are 8 bytes, not 4; reinterpret_casting memory to a double will result in junk, based on what the four bytes following memory contain. If you want a 32-bit floating point value, you probably want a float (though I should note that the C++ standard does not require that float and double be represented in any way and in particular need not be IEEE-754 compliant).
Also, your code will not be portable unless you take endianness into account in your code. I see that the TIFF format has an endianness marker in its first two bytes that should tell you whether you're reading in big-endian or little-endian values.
So I would write a function with the following prototype:
template<typename VALUE_TYPE> VALUE_TYPE convert(char* input);
If you want full portability, specialize the template and have it actually interpret the bits in input. Otherwise, you can probably get away with e.g.
template<typename VALUE_TYPE> VALUE_TYPE convert(char* input) {
return *reinterpret_cast<VALUE_TYPE*>(input);
}