Dereferencing Variable Size Arrays in Structs - c++

Structs seem like a useful way to parse a binary blob of data (i.e., a file or network packet). This is fine and dandy until you have variable size arrays in the blob. For instance:
struct nodeheader{
int flags;
int data_size;
char data[];
};
This allows me to find the last data character:
nodeheader b;
cout << b.data[b.data_size-1];
Problem being, I want to have multiple variable length arrays:
struct nodeheader{
int friend_size;
int data_size;
char data[];
char friend[];
};
I'm not manually allocating these structures. I have a file like so:
char file_data[1024];
nodeheader* node = &(file_data[10]);
I'm trying to parse a binary file (more specifically, a class file). I've written an implementation in Java (which was my class assignment); now I'm doing a personal version in C++ and was hoping to get away without having to write 100 lines of code. Any ideas?
Thanks,
Stefan

You cannot have multiple variable sized arrays. How should the compiler at compile time know where friend[] is located? The location of friend depends on the size of data[] and the size of data is unknown at compile time.

This is a very dangerous construct, and I'd advise against it. You can only include a variable-length array in a struct when it is the LAST element, and when you do so, you have to make sure you allocate enough memory, e.g.:
nodeheader *nh = (nodeheader *)malloc(sizeof(nodeheader) + max_data_size);
What you want to do is just use regular dynamically allocated arrays:
struct nodeheader
{
    char *data;
    size_t data_size;
    char *friend_data; // 'friend' is a C++ keyword and cannot be used as a name
    size_t friend_size;
};
nodeheader AllocNodeHeader(size_t data_size, size_t friend_size)
{
    nodeheader nh;
    nh.data = (char *)malloc(data_size); // check for NULL return
    nh.data_size = data_size;
    nh.friend_data = (char *)malloc(friend_size); // check for NULL return
    nh.friend_size = friend_size;
    return nh;
}
void FreeNodeHeader(nodeheader *nh)
{
    free(nh->data);
    nh->data = NULL;
    free(nh->friend_data);
    nh->friend_data = NULL;
}

You can't - at least not in the simple way that you're attempting. The unsized array at the end of a structure is basically an offset to the end of the structure, with no built-in way to find the end.
All the fields are converted to numeric offsets at compile time, so they need to be calculable at that time.
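To make the point about offsets concrete, the second array's position can be computed at run time instead. This is only a minimal sketch, assuming the blob is a fixed header followed by data_size bytes and then friend_size bytes; NodeView and parse_node are hypothetical names that do not appear in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fixed-size part only: every field here has a compile-time offset.
struct NodeHeader {
    int friend_size;
    int data_size;
};

// Non-owning view into the blob; the pointers are computed at run time.
struct NodeView {
    NodeHeader header;
    const char* data;         // data_size bytes
    const char* friend_data;  // friend_size bytes ('friend' is a keyword)
};

// Returns false if the blob is too short for the sizes it claims.
inline bool parse_node(const char* blob, std::size_t blob_size, NodeView& out) {
    assert(blob != nullptr);
    if (blob_size < sizeof(NodeHeader)) return false;
    // memcpy sidesteps alignment problems for an arbitrary char pointer.
    std::memcpy(&out.header, blob, sizeof(NodeHeader));
    if (out.header.data_size < 0 || out.header.friend_size < 0) return false;
    const std::size_t data_size = static_cast<std::size_t>(out.header.data_size);
    const std::size_t friend_size = static_cast<std::size_t>(out.header.friend_size);
    const std::size_t remaining = blob_size - sizeof(NodeHeader);
    if (data_size > remaining || friend_size > remaining - data_size) return false;
    out.data = blob + sizeof(NodeHeader);
    out.friend_data = out.data + data_size;  // depends on a run-time value
    return true;
}
```

The offset of friend_data is only known once data_size has been read, which is exactly why the compiler cannot lay out two unsized arrays in one struct.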

The answers so far are seriously over-complicating a simple problem. Mecki is right about why it can't be done the way you are trying to do it, however you can do it very similarly:
struct nodeheader
{
    int friend_size;
    int data_size;
};
struct nodefile
{
    nodeheader *header;
    char *data;
    char *friend_data; // 'friend' is a C++ keyword, so it cannot be a member name
};
char file_data[1024];
// .. file in file_data ..
nodefile file;
file.header = (nodeheader *)&file_data[0];
file.data = (char *)&file.header[1]; // first byte after the header
file.friend_data = &file.data[file.header->data_size]; // first byte after data

For what you are doing you need an encoder/decoder for the format. The decoder takes the raw data and fills out your structure (in your case allocating space for the copy of each section of the data), and the encoder writes raw binary.

(Was 'Use std::vector')
Edit:
On reading feedback, I suppose I should expand my answer. You can effectively fit two variable length arrays in your structure as follows, and the storage will be freed for you automatically when file_data goes out of scope:
struct nodeheader {
std::vector<unsigned char> data;
std::vector<unsigned char> friend_buf; // 'friend' is a keyword!
// etc...
};
nodeheader file_data;
Now file_data.data.size(), etc. gives you the length, and &file_data.data[0] gives you a raw pointer to the data if you need it.
You'll have to fill file data from the file piecemeal - read the length of each buffer, call resize() on the destination vector, then read in the data. (There are ways to do this slightly more efficiently. In the context of disk file I/O, I'm assuming it doesn't matter).
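The piecemeal read described above could look roughly like this. It is a sketch under assumptions: each buffer is taken to be stored as a 32-bit length in host byte order followed by that many bytes, which is not necessarily the real file layout, and Node, read_buffer, read_node, and read_node_from_bytes are made-up names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Node {
    std::vector<unsigned char> data;
    std::vector<unsigned char> friend_buf;
};

// Reads one length-prefixed buffer: a 32-bit length, then that many bytes.
inline bool read_buffer(std::istream& in, std::vector<unsigned char>& out,
                        std::uint32_t max_size) {
    assert(max_size > 0);
    std::uint32_t size = 0;
    if (!in.read(reinterpret_cast<char*>(&size), sizeof size)) return false;
    if (size > max_size) return false;  // refuse implausible lengths
    out.resize(size);
    if (size == 0) return true;
    return static_cast<bool>(in.read(reinterpret_cast<char*>(out.data()), size));
}

inline bool read_node(std::istream& in, Node& node) {
    const std::uint32_t kMaxSize = 1u << 20;  // arbitrary cap for this sketch
    return read_buffer(in, node.data, kMaxSize) &&
           read_buffer(in, node.friend_buf, kMaxSize);
}

// Convenience wrapper for a blob that is already in memory.
inline bool read_node_from_bytes(const std::string& bytes, Node& node) {
    std::istringstream in(bytes);
    return read_node(in, node);
}
```

The size cap is there because a corrupt length field would otherwise make resize() try to allocate an arbitrary amount of memory.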
Incidentally OP's technique is incorrect even for his 'fine and dandy' cases, e.g. with only one VLA at the end.
char file_data[1024];
nodeheader* node = &(file_data[10]);
There's no guarantee that file_data is properly aligned for the nodeheader type. Prefer to obtain file_data by malloc() - which guarantees to return a pointer aligned for any type - or else (better) declare the buffer to be of the correct type in the first place:
struct biggestnodeheader {
int flags;
int data_size;
char data[ENOUGH_SPACE_FOR_LARGEST_HEADER_I_EVER_NEED];
};
biggestnodeheader file_data;
// etc...
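If you want to see whether a particular address is usable for a type, a check along these lines is a common idiom. It is a sketch, not a guarantee: converting a pointer to std::uintptr_t and taking it modulo the alignment works on mainstream platforms but is implementation-defined, and is_aligned_for is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

struct nodeheader {
    int flags;
    int data_size;
};

// True if p is suitably aligned to be accessed as a T.
template <typename T>
bool is_aligned_for(const void* p) {
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// alignas makes the start of the buffer suitable for nodeheader,
// but an arbitrary offset into it (such as 10) still may not be.
alignas(nodeheader) char file_data[1024];
```

On a platform where int needs 4-byte alignment, &file_data[0] passes this check and &file_data[10] does not, which is the problem with the original cast.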

Related

convert struct to uint8_t array in C++

I have a typedef struct with different data types in it. The number array has negative and non-negative values. How do I convert this struct into a uint8_t array in C++ on the Linux platform? I'd appreciate some help on this. Thank you. The reason I am trying to do the conversion is to send this uint8_t buffer as a parameter to a function.
typedef struct
{
int enable;
char name;
int numbers[5];
float counter;
};
appreciate any example on doing this. thank you
For a plain old data structure like the one you show this is trivial:
You know the size of the structure in bytes (from the sizeof operator).
That means you can create a vector of bytes of that size.
Once you have that vector you can copy the bytes of the structure object into the vector.
Now you have, essentially, an array of bytes representation of the structure object.
In code it would be something like this:
struct my_struct_type
{
int enable;
char name;
int numbers[5];
float counter;
};
my_struct_type my_struct_object = {
// TODO: Some valid initialization here...
};
// Depending on the compiler you're using, and its C++ standard used
// you might need to use `std::uint8_t` instead of `std::byte`
std::vector<std::byte> bytes(sizeof my_struct_object);
std::memcpy(bytes.data(), reinterpret_cast<void*>(&my_struct_object), sizeof my_struct_object);
Suggestion: use char*. C++ allows you to cast a pointer to any object type to char*, so you can use reinterpret_cast<char*>(&obj). (Assuming Obj obj;.)
If you need to use uint8_t* (and it's not a typedef for a character type), you can allocate sufficient memory and do a memcpy:
uint8_t buffer[sizeof(Obj)];
memcpy(buffer, &obj, sizeof(obj));
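For completeness, here is a sketch of the round trip in both directions, wrapped in two small templates. The names to_bytes and from_bytes are invented for this example. It assumes both sides run on the same platform, since padding, type sizes, and byte order are all copied as-is.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct my_struct_type {
    int enable;
    char name;
    int numbers[5];
    float counter;
};

// Copies the object representation (including any padding) into a byte array.
template <typename T>
std::array<std::uint8_t, sizeof(T)> to_bytes(const T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::array<std::uint8_t, sizeof(T)> bytes{};
    std::memcpy(bytes.data(), &obj, sizeof(T));
    return bytes;
}

// Rebuilds an object from bytes produced by to_bytes on the same platform.
template <typename T>
T from_bytes(const std::uint8_t* bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(bytes != nullptr);
    T obj;
    std::memcpy(&obj, bytes, sizeof(T));
    return obj;
}
```

The array's data() pointer can then be passed to any function that expects a uint8_t buffer, with size() as the length.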

Is there a better way to modify a char array in a struct? C++

I am trying to read in a CString from an edit control box in MFC, then put it into a char array in a struct, but since I cannot do something like clientPacket->path = convertfuntion(a); I had to create another char array to store the string and then copy it element by element.
That felt like a band-aid solution; is there a better way to approach this? I'd like to learn how to clean up the code.
CString stri;//Read text from edit control box and convert it to std::string
GetDlgItem(IDC_EDIT1)->GetWindowText(stri);
string a;
a = CT2A(stri);
char holder[256];
strcpy_s(holder,a.c_str());
int size = sizeof(holder);
struct packet {
char caseRadio;
char path[256];
};
packet* clientPacket = new packet;
for (int t = 0; t < size; t++) {
clientPacket->path[t] = holder[t] ;
}
EDIT:This is currently what I went with:
CString stri;//Read text from edit control box and convert it to std::string
GetDlgItem(IDC_EDIT1)->GetWindowText(stri);
string a = CT2A(stri);
struct packet {
char caseRadio;
char path[CONSTANT];//#define CONSTANT 256
};
packet* clientPacket = new packet;
a = a.substr(0, sizeof(clientPacket->path) - 1);
strcpy_s(clientPacket->path, a.c_str());
I ran into a problem where I got "1path" instead of "path". It turned out the server was reading caseRadio ('1') as part of the string; I fixed it by reading out caseRadio first on the server.
I don't see the need to create the intermediate 'holder' char array.
I think you can just directly do
strcpy(clientPacket->path, a.c_str());
You may want to do this:
a= a.substr(0, sizeof(clientPacket->path)-1);
before the strcpy to avoid buffer overrun depending on whether the edit text is size limited or not.
You can copy directly into a user-provided buffer when using the Windows API call GetWindowTextA. The following illustrates how to do this:
struct packet {
char caseRadio;
char path[512];
} p;
::GetWindowTextA(GetDlgItem(IDC_EDIT1)->GetSafeHwnd(), &p.path[0],
static_cast<int>(sizeof(p.path)));
This does an implicit character encoding conversion using the CP_ACP code page. This is not generally desirable, and you may wish to perform the conversion using a known character encoding (such as CP_UTF8).
Use the CString.GetBuffer function to get a pointer to the string. In your struct, store the path as a char* instead of a char array.
struct packet {
char caseRadio;
char* path;
};
packet* clientPacket = new packet;
clientPacket->path = stri.GetBuffer();
Like this, maybe? strncpy(clientPacket->path, CT2A(stri).c_str(), 255);.
Also, better make the 256 bytes a constant and use that name, just in case you change this in 10 years.
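Putting the truncation and named-constant suggestions together, a portable helper might look like the sketch below. It leaves MFC out entirely and starts from a std::string; kPathSize and copy_to_field are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

const std::size_t kPathSize = 256;  // hypothetical constant name

struct packet {
    char caseRadio;
    char path[kPathSize];
};

// Copies at most N - 1 characters and always null-terminates.
// Returns true if the whole string fit without truncation.
template <std::size_t N>
bool copy_to_field(char (&dest)[N], const std::string& src) {
    static_assert(N > 0, "destination must hold at least the terminator");
    const std::size_t n = src.size() < N - 1 ? src.size() : N - 1;
    assert(n < N);
    std::memcpy(dest, src.data(), n);
    dest[n] = '\0';
    return n == src.size();
}
```

Because the array size is deduced from the destination itself, changing kPathSize later does not require touching any of the copy calls.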

How do you read from a memory buffer c++

I am fairly new at C++ and am trying to understand how memory manipulation works. I am used to Java and Python and haven't really been exposed to this.
I am working on a project that has the following structure that doesn't quite make sense to me.
typedef struct
{
size_t size;
char *data;
} data_buffer;
This structure basically acts as a buffer, with a pointer to the data stored within the buffer and the size of the buffer to allow the program to know how large the buffer is when reading from it.
An example of how the program uses the buffer:
data_buffer buffer = {0};
//Manipulate data here so it contains pertinent information
CFile oFile;
oFile.Write(buffer.data, buffer.size);
The program mostly uses 3rd party code to read the data found within the buffer, so I am having trouble finding an example of how this is done. My main question is how do I read the contents of the buffer, given only a pointer to a character and a size? However, I would also like to understand how this actually works. From what I understand, memory is written to, with a pointer to where it starts and the size of the memory, so I should be able to just iterate through the memory locations, grabbing each character from memory and tagging it onto whatever structure I choose to use, like a CString or a string. Yet, I don't understand how to iterate through memory. Can someone help me understand this better? Thanks.
There is no reason you cannot use a std::string or CString to manipulate that data. (Use higher level constructs when they are available to you.)
To get the data into a std::string, use the constructor or assignment operator:
std::string s( buffer.data, buffer.size );
You can even stick it in a std::stringstream so you can treat the data buffer like a file:
std::istringstream ss( s );
int n;
ss >> n;
Things work similarly for the MFC string class.
To get the data from a string, you'll need to copy it over. Ideally, you'll be able to allocate the data's memory. Assuming you have data written into a stringstream
std::ostringstream ss;
ss << name << "," << employee_number;
You can then allocate the space you need using the function that creates the data_buffer object:
function_that_creates_a_data_buffer( buffer, ss.str().size() );
If there is no such function (there ought to be!) you must malloc() or new it yourself, as appropriate:
buffer.size = ss.str().size();
buffer.data = (char*)malloc( buffer.size );
Now just copy it:
ss.str().copy( buffer.data, buffer.size );
If your buffer needs a null-terminator (I have so far assumed it doesn't), make sure to add one to the size you allocate and set the last character to zero.
buffer.size = ss.str().size() + 1;
buffer.data = new char[ buffer.size ];
ss.str().copy( buffer.data, buffer.size );
buffer.data[ buffer.size-1 ] = 0;
Make sure to look at the documentation for the various classes you will use.
Hope this helps.
A variable of type char* is actually a pointer to memory. Your struct contains data which is of type char* so it is a pointer to memory. (I suggest writing char* data instead of char *data, to help keep this clear.)
So you can use it as a starting point to look at your data. You can use another pointer to walk over the buffer.
char* bufferInspectorPointer;
bufferInspectorPointer = buffer.data;
bufferInspectorPointer will now point to the first byte of the buffer's data and
*bufferInspectorPointer
will return the contents of the byte.
bufferInspectorPointer++
will advance the pointer to the next byte in the buffer.
You can do arithmetic with pointers in C++, so
bufferInspectorPointer - buffer.data
will tell you how many bytes you have covered. You can compare it to buffer.size to see how far you have left to go.
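Put together, walking the buffer is a plain loop. The sketch below counts how often one byte value occurs; count_byte is a made-up helper, and the data_buffer struct is repeated from the question so the example stands alone.

```cpp
#include <cassert>
#include <cstddef>

typedef struct {
    std::size_t size;
    char* data;
} data_buffer;

// Counts how many bytes in the buffer equal `wanted` by walking a pointer
// from buffer.data up to (but not including) buffer.data + buffer.size.
inline std::size_t count_byte(const data_buffer& buffer, char wanted) {
    assert(buffer.data != nullptr || buffer.size == 0);
    std::size_t count = 0;
    const char* end = buffer.data + buffer.size;
    for (const char* p = buffer.data; p != end; ++p) {
        if (*p == wanted) ++count;  // *p reads the byte p points at
    }
    return count;
}
```

Note that the loop is bounded by size, not by a terminating zero, so embedded null bytes in the data are handled like any other value.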
Since you tagged this as C++ I'd recommend using algorithms. You can get your iterators by using buffer.data as start and buffer.data + buffer.size as end. So to copy the memory into a std::string you'd do something like so:
std::string str(buffer.data, buffer.data + buffer.size);
Or perhaps to append onto a string:
str.reserve(str.size() + buffer.size);
std::copy(buffer.data, buffer.data + buffer.size, std::back_inserter(str));
Of course you can always choose a different end so long as it's not past buffer.data + buffer.size.
They are using a char array so that you can access each byte of the data buffer since size of char is usually 1 byte.
Reading the contents of the data buffer depends on the application. If you know how the internal data is encoded, you can write an unpacking function which selects chunks of the char array and convert/typecast it to the target variables.
E.g., let's say the data buffer is actually a list of integers of size 4 bytes.
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char const* argv[])
{
//how the data buffer was probably filled
int *a = (int *)malloc(10*sizeof(int));
int i;
for(i=0;i<10;i++) {
a[i] = i;
}
char *data = (char *)a;
//how we could read from the data buffer
int *b = (int *)malloc(10*sizeof(int));
char *p = data;
for(i=0;i<10;i++) {
b[i] = *(int *)p; // read a whole int; (int)*p would read only a single char
printf("got value %d\n",b[i]);
p += sizeof(int);
}
free(a);
free(b);
return 0;
}
Note: That being said, since this is C++, it would be much safer if we could avoid using char pointers and work with strings or vectors. Other answers have explored other options of how to handle such buffers properly in C++.
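As one possible C++ take on the same unpacking idea, the bytes can be copied into a std::vector<int> with memcpy. This is a sketch, assuming the buffer holds ints in the host's own size and byte order; unpack_ints is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Interprets the buffer as a packed sequence of ints in host byte order.
inline std::vector<int> unpack_ints(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<int> values(size / sizeof(int));  // a trailing partial int is ignored
    if (!values.empty()) {
        std::memcpy(values.data(), data, values.size() * sizeof(int));
    }
    return values;
}
```

Copying with memcpy avoids dereferencing a char pointer as an int pointer, so it also works when the buffer is not aligned for int.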

copying a buffer in a struct in C++ style

Say I have a struct like
typedef struct {
unsigned char flag, type;
unsigned short id;
uint32 size;
} THdr;
and a buffer of data coming from a UDP communication: I have a buffer of bytes and its size (data and data_size). data_size is bigger than sizeof(THdr).
The struct is at the beginning of the buffer, and I want to copy it to a struct defined in the code, THdr Struct_copy.
I know that I can use memcpy(&Struct_copy, &data[0], sizeof(Struct_copy)); but I would like to use a "C++ style" way, like using std::copy.
any clue?
There is no "clever" way to do this in C++. If it doesn't involve memcpy, then you need std::copy and you will have to use casts to const unsigned char * to make the datatype of the source and/or destination match up.
There are various other ways to "copy some data", such as type-punning, but again, it's not a "better" solution than memcpy - just different.
C++ was designed to accept C-code, so if there's nothing wrong with your current code, then I don't see why you should change it.
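For reference, the std::copy variant with byte-pointer casts might look like the following sketch. It assumes the question's uint32 is a 32-bit unsigned type and that the sender uses the same struct layout and byte order; read_header is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct THdr {
    unsigned char flag, type;
    unsigned short id;
    std::uint32_t size;  // the question's uint32 is assumed to be 32 bits wide
};

// Copies sizeof(THdr) bytes from the start of the packet into a THdr.
// Returns false if the packet is too short.
inline bool read_header(const unsigned char* data, std::size_t data_size, THdr& out) {
    assert(data != nullptr);
    if (data_size < sizeof(THdr)) return false;
    std::copy(data, data + sizeof(THdr), reinterpret_cast<unsigned char*>(&out));
    return true;
}
```

This does the same work as memcpy; the only thing it adds is the explicit length check, which memcpy could have just as well.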
How about something like
THdr hdr;
std::copy(reinterpret_cast<THdr*>(data), reinterpret_cast<THdr*>(data) + 1, &hdr);
Should copy one THdr structure from data to hdr.
It can also be much simpler:
THdr hdr;
hdr = *reinterpret_cast<THdr*>(data);
This last works in C as well:
THdr hdr;
hdr = *(THdr *) data;
Or why not create a constructor that takes the data as input? Like
struct THdr
{
explicit THdr(const unsigned char* data)
{
const THdr* other = reinterpret_cast<const THdr*>(data); // keep const; reinterpret_cast cannot remove it
*this = *other;
}
// ...
};
Then it can be used such as
THdr hdr(data);

How to read in specific sizes and store data of an unknown type in c++?

I'm trying to read data in from a binary file and then store in a data structure for later use. The issue is I don't want to have to identify exactly what type it is when I'm just reading it in and storing it. I just want to store the information regarding what type of data it is and how much data of this certain type there is (information easily obtained in the first couple bytes of this data)
But how can I read in just a certain amount of data, disregarding what type it is and still easily be able to cast (or something similar) that data into a readable form later?
My first idea would be to use characters, since all the data I will be looking at will be in byte units.
But if I did something like this:
ifstream fileStream;
fileStream.open("fileName.tiff", ios::binary);
//if I had to read in 4 bytes of data
char memory[4];
fileStream.read((char *)&memory, 4);
But how could I cast these 4 bytes if I later wanted to read this and knew it was a double?
What's the best way to read in data of an unknown type but known size for later use?
I think a reinterpret_cast will give you what you need. If you have a char * to the bytes you can do the following:
double * x = reinterpret_cast<double *>(dataPtr);
Check out Type Casting on cplusplus.com for a more detailed description of reinterpret_cast.
You could copy it to the known data structure which makes life easier later on:
double x;
memcpy (&x,memory,sizeof(double));
or you could just refer to it as a cast value:
if (*((double*)(memory)) == 4.0) {
// blah blah blah
}
I believe a char* is the best way to read it in, since the size of a char is guaranteed to be 1 unit (not necessarily a byte, but all other data types are defined in terms of that unit, so that, if sizeof(double) == 27, you know that it will fit into a char[27]). So, if you have a known size, that's the easiest way to do it.
You could store the data in a class that provides functions to cast it to the possible result types, like this:
enum data_type {
TYPE_DOUBLE,
TYPE_INT
};
class data {
public:
data_type type;
size_t len;
char *buffer;
data(data_type a_type, char *a_buffer, size_t a_len)
: type(a_type), buffer(NULL), len(a_len) {
buffer = new char[a_len];
memcpy(buffer, a_buffer, a_len);
}
~data() {
delete[] buffer;
}
double as_double() {
assert(TYPE_DOUBLE == type);
assert(len >= sizeof(double));
return *reinterpret_cast<double*>(buffer);
}
int as_int() {...}
};
Later you would do something like this:
data d = ...;
switch (d.type) {
case TYPE_DOUBLE:
something(d.as_double());
break;
case TYPE_INT:
something_else(d.as_int());
break;
...
}
That's at least how I'm doing these kind of things :)
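One variation worth considering: if the bytes live in a std::vector, copying a data object (as in data d = ...;) no longer risks a double delete[], and memcpy removes the alignment assumption in as_double(). This is a sketch of that variation, not the answerer's code; typed_data is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

enum data_type { TYPE_DOUBLE, TYPE_INT };

// Same idea as the class above, but std::vector owns the bytes, so copies
// and destruction need no hand-written code.
class typed_data {
public:
    typed_data(data_type type, const char* bytes, std::size_t len)
        : type_(type), buffer_(bytes, bytes + len) {}

    data_type type() const { return type_; }

    double as_double() const {
        assert(type_ == TYPE_DOUBLE);
        assert(buffer_.size() >= sizeof(double));
        double value;
        std::memcpy(&value, buffer_.data(), sizeof value);  // no alignment assumptions
        return value;
    }

    int as_int() const {
        assert(type_ == TYPE_INT);
        assert(buffer_.size() >= sizeof(int));
        int value;
        std::memcpy(&value, buffer_.data(), sizeof value);
        return value;
    }

private:
    data_type type_;
    std::vector<char> buffer_;
};
```

The switch on the type tag shown above works unchanged if d.type is replaced by d.type().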
You can use structures and anonymous unions:
struct Variant
{
size_t size;
enum
{
TYPE_DOUBLE,
TYPE_INT,
} type;
union
{
char raw[0]; // Copy to here. *
double asDouble;
int asInt;
};
};
Optional: Create a table of type => size, so you can find the size given the type at runtime. This is only needed when reading.
static unsigned char typeSizes[2] =
{
sizeof(double),
sizeof(int),
};
Usage:
Variant v;
v.type = Variant::TYPE_DOUBLE;
v.size = typeSizes[v.type]; // typeSizes is declared outside Variant above
fileStream.read(v.raw, v.size);
printf("%f\n", v.asDouble);
You will probably receive warnings about type punning. Read: Doing this is not portable and against the standard! Then again, so is reinterpret_cast, C-style casting, etc.
Note: First edit, I did not read your original question. I only had the union, not the size or type part.
*This is a neat trick I learned a long time ago. Basically, raw doesn't take up any bytes (thus doesn't increase the size of the union), but provides a pointer to a position in the union (in this case, the beginning). It's very useful when describing file structures:
struct Bitmap
{
// Header stuff.
uint32_t dataSize;
RGBPixel data[0];
};
Then you can just fread the data into a Bitmap. =]
Be careful. In most environments I'm aware of, doubles are 8 bytes, not 4; reinterpret_casting memory to a double will result in junk, based on what the four bytes following memory contain. If you want a 32-bit floating point value, you probably want a float (though I should note that the C++ standard does not require that float and double be represented in any way and in particular need not be IEEE-754 compliant).
Also, your code will not be portable unless you take endianness into account in your code. I see that the TIFF format has an endianness marker in its first two bytes that should tell you whether you're reading in big-endian or little-endian values.
So I would write a function with the following prototype:
template<typename VALUE_TYPE> VALUE_TYPE convert(char* input);
If you want full portability, specialize the template and have it actually interpret the bits in input. Otherwise, you can probably get away with e.g.
template<typename VALUE_TYPE> VALUE_TYPE convert(char* input) {
    // Reinterpret the bytes at input as a VALUE_TYPE and return a copy.
    return *reinterpret_cast<VALUE_TYPE*>(input);
}
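A somewhat more portable sketch of such a convert function copies the bytes with memcpy and reverses them when the file's byte order differs from the host's. The extra file_is_little_endian parameter and the host_is_little_endian helper are assumptions added for this example. For floating-point types it further assumes that file and host use the same representation apart from byte order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// True when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Copies sizeof(VALUE_TYPE) bytes out of input, reversing them first when the
// file's byte order differs from the host's.
template <typename VALUE_TYPE>
VALUE_TYPE convert(const char* input, bool file_is_little_endian) {
    static_assert(std::is_trivially_copyable<VALUE_TYPE>::value,
                  "VALUE_TYPE must be trivially copyable");
    assert(input != nullptr);
    char bytes[sizeof(VALUE_TYPE)];
    std::memcpy(bytes, input, sizeof bytes);
    if (file_is_little_endian != host_is_little_endian()) {
        std::reverse(bytes, bytes + sizeof bytes);
    }
    VALUE_TYPE value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

For a TIFF file, the flag would come from the byte-order marker in the first two bytes of the file mentioned above.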