Copying struct with bitfields & dynamic data into a Char array buffer - c++

I have a struct like the following
struct Struct {
int length; //dynamicTest length
unsigned int b: 1;
unsigned int a: 1;
unsigned int padding: 10;
int* dynamicTest;
int flag;
};
I want to copy this into a char array buffer (to send over a socket). I'm curious how I would do that.

To be precise, you do this with memcpy, e.g.:
#include <string.h>
/* ... */
Struct s = /*... */;
char buf[1024];
memcpy(buf, &s, sizeof(s));
/* now [buf, buf + sizeof(s)) holds the needed data */
Alternatively, you can avoid copying altogether and view an instance of the struct as an array of char (since everything in computer memory is a sequence of bytes, this approach works).
Struct s = /* ... */;
const char* buf = (char*)(&s);
/* now [buf, buf + sizeof(s)) holds the needed data */
If you are going to send it over the network, you need to take care of byte order, the size of int, and many other details.
Copying bit fields presents no problem, but for dynamic fields, such as your int* dynamicTest, this naive approach won't work: it copies the pointer value, not the data it points to. The more general solution, which works with any type, is serialization.
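To make the serialization idea concrete, here is a minimal sketch. The wire format (field order, one byte for the two flag bits, 32-bit big-endian integers) and the helper names put_u32 and serialize are my own assumptions for illustration, not something the question prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same fields as the struct in the question.
struct Struct {
    int length;                // number of ints behind dynamicTest
    unsigned int b : 1;
    unsigned int a : 1;
    unsigned int padding : 10;
    int* dynamicTest;
    int flag;
};

// Append a 32-bit value in big-endian (network) byte order.
void put_u32(std::vector<unsigned char>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Hypothetical wire format: length, one flags byte (bit 0 = b, bit 1 = a),
// flag, then `length` ints. The pointer itself is never written, only
// the data it points to.
std::vector<unsigned char> serialize(const Struct& s) {
    assert(s.length >= 0);
    std::vector<unsigned char> out;
    put_u32(out, static_cast<std::uint32_t>(s.length));
    out.push_back(static_cast<unsigned char>(s.b | (s.a << 1)));
    put_u32(out, static_cast<std::uint32_t>(s.flag));
    for (int i = 0; i < s.length; ++i)
        put_u32(out, static_cast<std::uint32_t>(s.dynamicTest[i]));
    return out;
}
```

The resulting vector can be handed to send() via out.data() and out.size(); the receiver has to apply the same format in reverse.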

Related

How can I declare a structure of an unknown size?

I have this structure :
struct __attribute__((packed)) BabelPacket
{
unsigned senderId;
unsigned dataLength;
unsigned char data[0];
};
And to declare it I do :
BabelPacket *packet = reinterpret_cast<BabelPacket *>(new char[sizeof(BabelPacket) + 5]);
packet->senderId = 1;
packet->data = "kappa";
packet->dataLength = 5;
But when I compile I have this error :
error: incompatible types in assignment of ‘const char [6]’ to ‘unsigned char [0]’
packet->data = "kappa";
^
Have you an idea how I can do that ?
And I need to send this structure through a socket, to get the object back in my server, so I can use only C types.
If this was a C program, the error you get is because you try to assign to an array, which is not possible. You can only copy to an array:
memcpy(packet->data, "kappa", 5);
Also note that if you want the data to be a C string, you need to allocate an extra character for the string terminator '\0'. Then you can use strcpy instead of memcpy above. Or strncpy to copy at most a specific amount of characters, but then you might need to manually terminate the string.
However, this should not work in C++ at all, unless your compiler has it as an extension.
You can't assign a literal string that way. You'll need to allocate additional memory for the string, then copy to the data pointer.
struct A {
size_t datasize;
char data[0]; // flexible member must appear last.
};
A* create_A(const char* str)
{
size_t datasize = strlen(str) + 1; // null terminated (?)
A* p = reinterpret_cast<A*>(new char[sizeof(A) + datasize]);
memcpy(p->data, str, datasize);
p->datasize = datasize;
return p;
}
A* p = create_A("data string");
This solution is only applicable in environments supporting zero-length or flexible arrays. In fact, a better solution may be to write the sockets code in C and export that interface for use in C++.
If you are willing/allowed to change the unsigned char to a regular char, you can use strcpy:
#include <iostream>
#include <stdio.h>
#include <string.h>
struct __attribute__((packed)) BabelPacket
{
unsigned senderId;
unsigned dataLength;
char data[0]; // I changed this to char in order to use strcpy
};
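int main(){
// Allocate room for the header, the 5 characters and the terminating '\0'
BabelPacket *packet = reinterpret_cast<BabelPacket *>(new char[sizeof(BabelPacket) + 6]);
packet->senderId = 1;
// Copy the string. The literal already ends with '\0',
// and strcpy copies that terminator as well
strcpy(packet->data, "kappa");
packet->dataLength = 5;
// Verify that the string is copied properly
for (unsigned i=0;i<packet->dataLength;++i){
std::cout<<packet->data[i];
}
std::cout<<std::endl;
// Release the buffer with the same form of delete it was allocated with
delete[] reinterpret_cast<char *>(packet);
return 0;
}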
Note that this will only work if data is at the end of the struct, otherwise there is no contiguous memory to allocate data. If I swap the order of the elements to:
struct __attribute__((packed)) BabelPacket
{
unsigned senderId;
char data[0]; // I changed this to char in order to use strcpy
unsigned dataLength;
};
the output of the code above (instead of "kappa"), would be "a".
A more reliable way if you are determined to use C-arrays would be to assume a maximum number of elements and preallocate the array, i.e.:
#include <iostream>
#include <stdio.h>
#include <string.h>
#define MAX_NUMBER_OF_CHARACTERS 5 // Many ways to do this, I defined the macro for the purposes of this example
struct __attribute__((packed)) BabelPacket
{
unsigned senderId;
// I changed this to char in order to use strcpy. Allocate the
// max number + 1 element for the termination string
char data[MAX_NUMBER_OF_CHARACTERS+1];
unsigned dataLength;
};
int main(){
// The array lives inside the struct, so no extra bytes are needed
BabelPacket *packet = new BabelPacket;
packet->senderId = 1;
const char *text = "kappa";
packet->dataLength = strlen(text);
// Refuse to copy anything that does not fit into data
if (packet->dataLength>MAX_NUMBER_OF_CHARACTERS){
std::cout<<"String greater than the maximum number of characters"<<std::endl;
delete packet;
return 1;
}
// Copy the string, including its terminating '\0'
strcpy(packet->data, text);
// Verify that the string is copied properly
for (unsigned i=0;i<packet->dataLength;++i){
std::cout<<packet->data[i];
}
std::cout<<std::endl;
delete packet;
return 0;
}
This code produces the correct output, and protects you against violations. As you can see, it can get messy pretty quickly, which is why I would recommend to use std::vector for this. The dataLength may then be retrieved automatically as the size of the vector, and you are always protected against overflows.
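For comparison, here is a rough sketch of the std::vector variant. The struct is a reshaped, hypothetical version of BabelPacket (dataLength becomes data.size()), and to_wire with its big-endian layout is an assumption of mine about what you might want to put on the socket.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of the packet: dataLength is not stored,
// it is simply data.size().
struct BabelPacket {
    std::uint32_t senderId;
    std::vector<unsigned char> data;
};

// Append a 32-bit value in big-endian (network) byte order.
void put_u32(std::vector<unsigned char>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Flatten the packet into [senderId][dataLength][data bytes].
std::vector<unsigned char> to_wire(const BabelPacket& p) {
    assert(p.data.size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
    std::vector<unsigned char> out;
    out.reserve(8 + p.data.size());
    put_u32(out, p.senderId);
    put_u32(out, static_cast<std::uint32_t>(p.data.size()));
    out.insert(out.end(), p.data.begin(), p.data.end());
    return out;
}
```

Filling the packet from a std::string text is then BabelPacket p{1, std::vector<unsigned char>(text.begin(), text.end())}; and there is no fixed-size array left to overflow.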

Nice representation of byte array and its size

How would you represent a byte array and its size nicely? I'd like to store (in main memory or within a file) raw byte arrays (unsigned chars) in which the first 2/4 bytes represent the size. But operations on such an array do not look good:
void func(unsigned char *bytearray)
{
int size;
memcpy(&size, bytearray, sizeof(int));
//rest of operation when we know bytearray size
}
How can I avoid that? I think about a simple structure:
struct bytearray
{
int size;
unsigned char *data;
};
bytearray *b = reinterpret_cast<bytearray*>(new unsigned char[10]);
b->data = reinterpret_cast<unsigned char*>(&(b->size) + 1);
And I've got access to the size and data parts of the bytearray. But it still looks ugly. Could you recommend another approach?
Unless you have some overwhelming reason to do otherwise, just do the idiomatic thing and use std::vector<unsigned char>.
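As a sketch of what that could look like: keep the bytes in a std::vector<unsigned char> while you work with them, and convert to and from the size-prefixed layout only at the boundary. The names encode and decode are made up here, and I'm assuming a 4-byte prefix in host byte order, as the memcpy in the question implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Build [4-byte size][payload] from a vector (size in host byte order).
std::vector<unsigned char> encode(const std::vector<unsigned char>& payload) {
    const std::uint32_t size = static_cast<std::uint32_t>(payload.size());
    std::vector<unsigned char> raw(sizeof(size));
    std::memcpy(raw.data(), &size, sizeof(size));
    raw.insert(raw.end(), payload.begin(), payload.end());
    return raw;
}

// Recover the payload from a size-prefixed buffer. The asserts stand in
// for real error handling of truncated input.
std::vector<unsigned char> decode(const unsigned char* raw, std::size_t available) {
    std::uint32_t size = 0;
    assert(available >= sizeof(size));
    std::memcpy(&size, raw, sizeof(size));
    assert(available - sizeof(size) >= size);
    const unsigned char* first = raw + sizeof(size);
    return std::vector<unsigned char>(first, first + size);
}
```

Everything between the two calls works on an ordinary vector, so the size never has to be extracted by hand in functions like func above.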
You're effectively re-inventing the "Pascal string". However
b->data = reinterpret_cast<unsigned char*>(&(b->size) + 1);
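won't work at all: data ends up pointing at the memory right after size, which is where the data pointer itself is stored (or the padding in front of it), so writing through it overwrites the pointer.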
You should be able to use an array with unspecified size for the last element of a structure:
struct bytearray
{
int size;
unsigned char data[];
};
bytearray *b = reinterpret_cast<bytearray*>(::operator new(sizeof (bytearray) + 10));
b->size = 10;
//...
::operator delete(b);
Unlike std::vector, this actually stores the size and data together, so you can, for example, write it to a file in one operation. And memory locality is better.
Still, the fact that std::vector is already tested and many useful algorithms are implemented for you makes it very attractive.
I would use std::vector<unsigned char> to manage the memory, and write a conversion function to create some iovec like structure for you at the time that you need such a thing.
iovec make_iovec (std::vector<unsigned char> &v) {
iovec iv = { &v[0], v.size() };
return iv;
}
Using iovec, if you need to write both the length and data in a single system call, you can use the writev call to accomplish it.
ssize_t write_vector(int fd, std::vector<unsigned char> &v) {
uint32_t len = htonl(v.size());
iovec iv[2] = { { &len, sizeof(uint32_t) }, make_iovec(v) };
return writev(fd, iv, 2);
}

Variable class array

I have the following packet layout:
struct PacketLayout
{
int GateCode;
BYTE StringLen;
char String[StringLen];
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
}
The class is this:
class ServerInfo
{
short PacketSize; //Size of the whole packet
BYTE TotalServers; //total of PacketLayout structs
PacketLayout Server[TotalServers];
int GlobalSecCode;
short EncryptedPacketSize; //Same as the first member, just xored
}
So the problem I have is making a variable-size array inside a class or a struct whose size depends on the preceding member: BYTE StringLen (for the struct) and BYTE TotalServers (for the class).
I don't know what the solution to this is. Maybe implement a template? If so, can I see an example (I am not familiar with templates yet)? I also want to reference my members by name without calculating the pointer position myself (as I am currently doing).
Thanks.
Doing this with a template is possible, for example:
template <int StringSize>
struct PacketLayout
{
int GateCode;
BYTE StringLen;
char String[StringSize];
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
};
lets you use this as:
PacketLayout<100> pkt;
Normally what you would want to do is simpler though. For example if you reorder the packet and know the upper limit on size you can do simply:
struct PacketLayout
{
int GateCode;
BYTE StringLen;
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
char String[MAX_POSSIBLE_SIZE];
};
Or alternatively:
struct PacketLayout
{
int GateCode;
BYTE StringLen;
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
char *String;
};
and allocate/set String during reading.
Personally, though, I'd skip all these messy low-level details and use something like protobuf to do the work for you, leaving you free to concentrate on the more important higher-level things that add value to your project.
There's a common but dirty trick used sometimes too:
struct PacketLayout
{
int GateCode;
BYTE StringLen;
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
char String[1];
};
Where people define the size of the variable part at the end to be 1 and then deliberately allocate more memory than needed for the struct so they can write past the end of it. This is evil and very much not recommended though.
Using templates is definitely the way to go:
template <size_t TotalServers>
class ServerInfo
{
PacketLayout Server[TotalServers];
int GlobalSecCode;
};
The disadvantage is that ServerInfo instantiations with different TotalServers values are distinct types and cannot be assigned to one another; if that matters to you, use a std::vector instead.
There is no nice way in C++ to achieve this.
This is how you can create variable size array in PacketLayout:
struct PacketLayout
{
int GateCode;
BYTE StringLen;
BYTE ServerStatus;
BYTE ServerStorage;
BYTE ServerNumber;
char String[1];
};
then you allocate an instance:
PacketLayout* createPacketLayout(BYTE stringLen)
{
PacketLayout* packetLayout = (PacketLayout*)new char[sizeof(PacketLayout) - 1 + stringLen];
packetLayout->StringLen = stringLen;
return packetLayout;
}
in this case ServerInfo could hold array of pointers.
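If named members matter more than matching the wire layout byte for byte, another option is to decode each record into an ordinary struct holding a std::string. The following is only a sketch: parse_packet is a hypothetical helper, BYTE is assumed to be unsigned char, and GateCode is assumed to be a 4-byte integer in host byte order with no padding between fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

typedef unsigned char BYTE;  // assumed to match the BYTE used in the question

// Decoded form of one record; StringLen is simply String.size().
struct PacketLayout {
    std::int32_t GateCode;
    std::string String;
    BYTE ServerStatus;
    BYTE ServerStorage;
    BYTE ServerNumber;
};

// Decode one record starting at p. Returns the number of bytes consumed,
// or 0 if the buffer is too short to hold the whole record.
std::size_t parse_packet(const BYTE* p, std::size_t available, PacketLayout& out) {
    assert(p != nullptr);
    const std::size_t fixed = 4 + 1;          // GateCode + StringLen
    if (available < fixed) return 0;
    const std::size_t len = p[4];             // StringLen
    const std::size_t total = fixed + len + 3;
    if (available < total) return 0;
    std::memcpy(&out.GateCode, p, 4);
    out.String.assign(reinterpret_cast<const char*>(p + fixed), len);
    out.ServerStatus  = p[fixed + len];
    out.ServerStorage = p[fixed + len + 1];
    out.ServerNumber  = p[fixed + len + 2];
    return total;
}
```

ServerInfo could then hold a std::vector<PacketLayout>, filled by calling parse_packet TotalServers times and advancing the pointer by the returned count each time.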

How do i push different datatypes into a void buffer?

I have the following data i need to add in the void buffer:
MyStruct somedata; // some struct containing ints or floats etc.
string somestring;
How do i do this?
This is my buffer allocation:
void *buffer = (void *)malloc(datasize);
How do i add first the somedata into the buffer (, which takes lets say 20 bytes), and then after 20 bytes comes the string which is variable size. I was thinking to read the structs byte by byte and add to buffer, but that feels stupid, there must be some easier way...?
Edit: i want this to equal to: fwrite( struct1 ); fwrite( struct2 ); which are called sequentially, but instead of writing to file, i want to write to a void buffer.
Edit 2: Made it working, heres the code:
char *data = (char *)malloc(datasize);
unsigned int bufferoffset = 0;
for(...){
MyStruct somedata; // some POD struct containing ints or floats etc.
string somestring;
... stuff ...
// add to buffer:
memcpy(data+bufferoffset, &somedata, sizeof(MyStruct));
bufferoffset += sizeof(MyStruct);
memcpy(data+bufferoffset, somestring.c_str(), str_len);
bufferoffset += str_len;
}
Anything to fix?
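memcpy(buffer, &somedata, sizeof(MyStruct));
// buffer is a void*, so cast it to char* before doing pointer arithmetic
strcpy(static_cast<char*>(buffer) + sizeof(MyStruct), somestring.c_str());
This copies the string as a C string, including its terminating '\0'.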
In general you should avoid doing this for classes which have custom copy-constructors etc.
But if you have to and you know what you're doing, use memcpy
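A sketch of how this might look with a growable buffer instead of a malloc'd block of guessed size. The helpers append_pod and append_string are hypothetical, MyStruct here is just a stand-in for your POD struct, and writing a 32-bit length in front of the string is my own assumption so that a reader can tell where the string ends.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical POD standing in for the MyStruct of the question.
struct MyStruct {
    int id;
    float value;
};

// Append the raw bytes of a trivially copyable object to the buffer.
template <typename T>
void append_pod(std::vector<char>& buffer, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copying is only valid for trivially copyable types");
    const char* bytes = reinterpret_cast<const char*>(&value);
    buffer.insert(buffer.end(), bytes, bytes + sizeof(T));
}

// Append a string as a 32-bit length followed by its characters
// (no terminating '\0' is written).
void append_string(std::vector<char>& buffer, const std::string& s) {
    assert(s.size() <= UINT32_MAX);
    append_pod(buffer, static_cast<std::uint32_t>(s.size()));
    buffer.insert(buffer.end(), s.begin(), s.end());
}
```

Calling append_pod(buffer, somedata); append_string(buffer, somestring); in the loop gives the same sequential effect as the two fwrite calls, and buffer.data() is the contiguous block at the end. The static_assert rejects types like std::string, for which a byte-wise copy would be wrong.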
In C, I'd do a bit like this:
MyStruct somedata;
string somestring;
void *buffer = (void *)malloc(datasize);
memmove(buffer, &somedata, 20);
strcpy(buffer + 20, somestring);
But there's LOTS of bad smell in the first 3 lines of this C code:
MyStruct is either a typedef (why? I hate typedefs) or it should be struct MyStruct
string is either a typedef (why? I hate typedefs) or it should be struct string; and identifiers starting with "str" are reserved and should not be used by programmers
Casting the return value of malloc is redundant and may hide errors
Edit after noticing (thanks Newbie) operations on void *
char *buffer = malloc(datasize);
In C, void* and any other pointer type are assignment compatible in both directions, so there is no need to cast char * to void * when passing it to memmove() and friends.
memmove(buffer, &somedata, 20);

Dereferencing Variable Size Arrays in Structs

Structs seem like a useful way to parse a binary blob of data (ie a file or network packet). This is fine and dandy until you have variable size arrays in the blob. For instance:
struct nodeheader{
int flags;
int data_size;
char data[];
};
This allows me to find the last data character:
nodeheader b;
cout << b.data[b.data_size-1];
Problem being, I want to have multiple variable length arrays:
struct nodeheader{
int friend_size;
int data_size;
char data[];
char friend[];
};
I'm not manually allocating these structures. I have a file like so:
char file_data[1024];
nodeheader* node = &(file_data[10]);
I'm trying to parse a binary file (more specifically, a class file). I've written an implementation in Java (which was my class assignment); now I'm doing a personal version in C++ and was hoping to get away without having to write 100 lines of code. Any ideas?
Thanks,
Stefan
You cannot have multiple variable sized arrays. How should the compiler at compile time know where friend[] is located? The location of friend depends on the size of data[] and the size of data is unknown at compile time.
This is a very dangerous construct, and I'd advise against it. You can only include a variable-length array in a struct when it is the LAST element, and when you do so, you have to make sure you allocate enough memory, e.g.:
nodeheader *nh = (nodeheader *)malloc(sizeof(nodeheader) + max_data_size);
What you want to do is just use regular dynamically allocated arrays:
struct nodeheader
{
char *data;
size_t data_size;
char *friend_data; // 'friend' is a keyword in C++, so it cannot be a member name
size_t friend_size;
};
nodeheader AllocNodeHeader(size_t data_size, size_t friend_size)
{
nodeheader nh;
nh.data = (char *)malloc(data_size); // check for NULL return
nh.data_size = data_size;
nh.friend_data = (char *)malloc(friend_size); // check for NULL return
nh.friend_size = friend_size;
return nh;
}
void FreeNodeHeader(nodeheader *nh)
{
free(nh->data);
nh->data = NULL;
free(nh->friend_data);
nh->friend_data = NULL;
}
You can't - at least not in the simple way that you're attempting. The unsized array at the end of a structure is basically an offset to the end of the structure, with no built-in way to find the end.
All the fields are converted to numeric offsets at compile time, so they need to be calculable at that time.
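What the compiler cannot do at compile time you can still do at run time. Below is a sketch, assuming the blob is laid out as the two int sizes followed by the data bytes and then the friend bytes; nodeview and make_view are names I made up, and the header is copied out with memcpy so the blob's alignment does not matter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fixed-size part of the record, as in the question.
struct nodeheader {
    int friend_size;
    int data_size;
};

// Non-owning view: the pointers refer into the original blob.
struct nodeview {
    nodeheader header;
    const char* data;
    const char* friends;  // 'friend' itself is a C++ keyword
};

// Compute where each variable-length part starts. Returns false if the
// blob is too short for the sizes its header claims.
bool make_view(const char* blob, std::size_t available, nodeview& out) {
    assert(blob != nullptr);
    if (available < sizeof(nodeheader)) return false;
    std::memcpy(&out.header, blob, sizeof(nodeheader));
    if (out.header.data_size < 0 || out.header.friend_size < 0) return false;
    const std::size_t data_len = static_cast<std::size_t>(out.header.data_size);
    const std::size_t friend_len = static_cast<std::size_t>(out.header.friend_size);
    const std::size_t rest = available - sizeof(nodeheader);
    if (rest < data_len || rest - data_len < friend_len) return false;
    out.data = blob + sizeof(nodeheader);
    out.friends = out.data + data_len;  // offset known only at run time
    return true;
}
```

With the buffer from the question this would be called as make_view(&file_data[10], sizeof(file_data) - 10, view).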
The answers so far are seriously over-complicating a simple problem. Mecki is right about why it can't be done the way you are trying to do it, however you can do it very similarly:
struct nodeheader
{
int friend_size;
int data_size;
};
struct nodefile
{
nodeheader *header;
char *data;
char *friend_data; // 'friend' is a keyword in C++, so it cannot be a member name
};
char file_data[1024];
// .. file in file_data ..
nodefile file;
file.header = (nodeheader *)&file_data[0];
file.data = (char *)&file.header[1];
file.friend_data = &file.data[file.header->data_size];
For what you are doing you need an encoder/decoder for the format. The decoder takes the raw data and fills out your structure (in your case allocating space for the copy of each section of the data), and the encoder writes raw binary.
(Was 'Use std::vector')
Edit:
On reading feedback, I suppose I should expand my answer. You can effectively fit two variable length arrays in your structure as follows, and the storage will be freed for you automatically when file_data goes out of scope:
struct nodeheader {
std::vector<unsigned char> data;
std::vector<unsigned char> friend_buf; // 'friend' is a keyword!
// etc...
};
nodeheader file_data;
Now file_data.data.size(), etc. gives you the length, and &file_data.data[0] gives you a raw pointer to the data if you need it.
You'll have to fill file data from the file piecemeal - read the length of each buffer, call resize() on the destination vector, then read in the data. (There are ways to do this slightly more efficiently. In the context of disk file I/O, I'm assuming it doesn't matter).
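Roughly, the piecemeal read could look like the sketch below. I'm assuming each buffer is stored as a 4-byte length in host byte order followed by that many bytes, which is not the real class-file format; read_section, read_node, read_node_from_bytes and the kMaxSection cap are all hypothetical.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct nodeheader {
    std::vector<unsigned char> data;
    std::vector<unsigned char> friend_buf;
};

// Hypothetical sanity cap so a corrupt length cannot trigger a huge allocation.
const std::uint32_t kMaxSection = 1u << 20;

// Read one [4-byte length][bytes] section into dest.
bool read_section(std::istream& in, std::vector<unsigned char>& dest) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) return false;
    if (len > kMaxSection) return false;
    dest.resize(len);
    if (len == 0) return true;
    return static_cast<bool>(in.read(reinterpret_cast<char*>(dest.data()), len));
}

// Read the two buffers in the order they are stored.
bool read_node(std::istream& in, nodeheader& node) {
    return read_section(in, node.data) && read_section(in, node.friend_buf);
}

// Convenience wrapper for bytes that are already in memory.
bool read_node_from_bytes(const std::string& bytes, nodeheader& node) {
    std::istringstream in(bytes);
    return read_node(in, node);
}
```

For a disk file, open a std::ifstream in binary mode and pass it to read_node.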
Incidentally OP's technique is incorrect even for his 'fine and dandy' cases, e.g. with only one VLA at the end.
char file_data[1024];
nodeheader* node = &(file_data[10]);
There's no guarantee that file_data is properly aligned for the nodeheader type. Prefer to obtain file_data by malloc() - which guarantees to return a pointer aligned for any type - or else (better) declare the buffer to be of the correct type in the first place:
struct biggestnodeheader {
int flags;
int data_size;
char data[ENOUGH_SPACE_FOR_LARGEST_HEADER_I_EVER_NEED];
};
biggestnodeheader file_data;
// etc...