The server sends the data as packed structures, so all I need to do to decode it is overlay a struct pointer on the buffer. However, one of the structures contains a dynamically sized array, and I learned that a flexible array member is not a standard C++ feature. How can I do this in standard C++, without copying the data into something like a vector?
// on wire format: | field a | length | length of struct b |
// the structs are defined packed
__pragma(pack(1))
struct B {
//...
};
struct Msg {
int32_t a;
uint32_t length;
B *data; // how to declare this?
};
__pragma(pack())
char *buf = readIO();
// overlay, without copy and assignments of each field
const Msg *m = reinterpret_cast<const Msg *>(buf);
// access m->data[i] from 0 to length
The common way to do this in C was to declare data as an array of length one as the last struct member. You then allocate the space needed as if the array was larger.
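This works in practice with common compilers in C++ as well, although indexing past the declared bound of data is not sanctioned by the C++ standard. You should perhaps wrap access to the data in a span or equivalent, so the implementation details don't leak outside your class.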
#include <cstddef>
#include <span>
struct B {
float x;
float y;
};
struct Msg {
int a;
std::size_t length;
B data[1];
};
char* readIO()
{
constexpr int numData = 3;
char* out = new char[sizeof(Msg) + sizeof(B) * (numData - 1)];
return out;
}
int main(){
char *buf = readIO();
// overlay, without copy and assignments of each field
const Msg *m = reinterpret_cast<const Msg *>(buf);
// access m->data[i] from 0 to length
std::span<const B> data(m->data, m->length);
for(auto& b: data)
{
// do something
}
return 0;
}
https://godbolt.org/z/EoMbeE8or
A standard solution is to not represent the array as a member of the message, but rather as a separate object.
struct Msg {
int a;
size_t length;
};
const Msg& m = *reinterpret_cast<const Msg*>(buf);
span<const B> data = {
reinterpret_cast<const B*>(buf + sizeof(Msg)),
m.length,
};
Note that reinterpretation / copying of bytes is not portable between systems with different representations (byte endianness, integer sizes, alignments, subobject packing etc.), and same representation is typically not something that can be assumed in network communication.
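To illustrate the portability point, here is a minimal sketch of decoding an integer byte by byte instead of reinterpreting memory. It assumes the wire format sends 32-bit values most significant byte first (the question does not state the byte order), and read_u32_be / write_u32_be are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the wire carries 32-bit integers most significant byte first.
// Building the value with shifts gives the same result on any host byte order.
inline std::uint32_t read_u32_be(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Inverse operation: split a value into four bytes, most significant first.
inline void write_u32_be(std::uint32_t v, unsigned char* p) {
    p[0] = static_cast<unsigned char>(v >> 24);
    p[1] = static_cast<unsigned char>(v >> 16);
    p[2] = static_cast<unsigned char>(v >> 8);
    p[3] = static_cast<unsigned char>(v);
}
```

This does copy each field, but only a few bytes at a time, and it needs neither packing pragmas nor aligned buffers.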
// on wire format: | field a | length | length of struct b |
You can't overlay the struct, because you can't guarantee that the binary representation of Msg will match the on-wire format. Also, int is only guaranteed to be at least 16 bits wide and may be wider, and the size of size_t depends on the architecture.
Write actual accessors to the data. Use fixed-width integer types. This will only work if the data actually points to a properly aligned region. This method allows you to write assertions and throw exceptions when things go wrong (for example, you can throw on out-of-bounds access to the array).
struct Msg {
constexpr static size_t your_required_alignment = alignof(uint32_t);
char *buf;
Msg (char *buf) : buf(buf) {
assert((uintptr_t)buf % your_required_alignment == 0);
}
int32_t& get_a() { return *reinterpret_cast<int32_t*>(buf); }
uint32_t& length() { return *reinterpret_cast<uint32_t *>(buf + sizeof(int32_t)); }
struct Barray {
char *buf;
Barray(char *buf) : buf(buf) {}
int16_t &operator[](size_t idx) {
return *reinterpret_cast<int16_t*>(buf + idx * sizeof(int16_t));
}
};
Barray data() {
return buf + sizeof(int32_t) + sizeof(uint32_t);
}
};
int main() {
Msg msg(readIO());
std::cout << msg.get_a() << msg.length();
msg.data()[1] = 5;
// or maybe even implement straight operator[]:
// msg[1] = 5;
}
If the data does not point to a properly aligned region, you have to copy it; there is no way to access it through types other than char.
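A minimal sketch of that copying fallback, assuming the layout from the question (a 4-byte a, a 4-byte length, then length packed elements) and a sender with the same byte order as the host. MsgView and load are hypothetical names, and the 2-byte element type simply mirrors the int16_t used in the accessor example above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>

// Copy sizeof(T) bytes out of the buffer into a properly aligned T.
// memcpy is well defined for any alignment of `p`.
template <typename T>
T load(const unsigned char* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Hypothetical read-only view over one message; it never owns the buffer.
class MsgView {
public:
    MsgView(const unsigned char* buf, std::size_t size) : buf_(buf), size_(size) {
        if (size_ < header_size) throw std::length_error("truncated header");
        if ((size_ - header_size) / sizeof(std::int16_t) < length())
            throw std::length_error("truncated payload");
    }
    std::int32_t a() const { return load<std::int32_t>(buf_); }
    std::uint32_t length() const {
        return load<std::uint32_t>(buf_ + sizeof(std::int32_t));
    }
    // Bounds-checked element access; each call copies one element.
    std::int16_t at(std::size_t i) const {
        if (i >= length()) throw std::out_of_range("element index");
        return load<std::int16_t>(buf_ + header_size + i * sizeof(std::int16_t));
    }
private:
    static constexpr std::size_t header_size =
        sizeof(std::int32_t) + sizeof(std::uint32_t);
    const unsigned char* buf_;
    std::size_t size_;
};
```

Only the field being read is copied, so the buffer itself stays where readIO() put it. Compilers commonly turn these small fixed-size memcpy calls into plain loads.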
I have a class that holds audio data bytes:
class clsAudioData
{
private:
unsigned char *m_content;
long m_size;
public:
clsAudioData();
~clsAudioData();
void Load(string file);
long Size();
unsigned char *Content();
void LoadContent(long size, FILE *f);
};
void clsAudioData::LoadContent(long size, FILE *f)
{
m_size =size;
m_content = new unsigned char[m_size];
fread(m_content, sizeof(unsigned char), m_size,f);
}
I'm trying to printf values at certain positions.
To do that, I tried:
for (int i = 0; i < 20; i++)
{
printf("audio data = %d\n", nAudioData.Content[i]);
}
The compiler tells me:
clsAudioData::Content Function doesn't accept 1 argument
How could I access an "element" at a certain index to printf it?
Thank you.
You'll have to call the Content function: nAudioData.Content()[i].
Aside: please make m_content a std::vector<unsigned char>. You'll have a lot less chance of leaking memory.
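A minimal sketch of that std::vector suggestion. AudioData is a hypothetical stand-in for clsAudioData, and it reads from a std::istream rather than a FILE* only so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical vector-backed variant of clsAudioData.
class AudioData {
public:
    // Read up to `size` bytes from the stream. The vector releases its
    // memory automatically, so no destructor or delete[] is needed.
    void LoadContent(std::size_t size, std::istream& in) {
        m_content.resize(size);
        in.read(reinterpret_cast<char*>(m_content.data()),
                static_cast<std::streamsize>(size));
        // Keep only the bytes that were actually read.
        m_content.resize(static_cast<std::size_t>(in.gcount()));
    }
    std::size_t Size() const { return m_content.size(); }
    // Returned by reference, so Content()[i] indexes without copying.
    const std::vector<unsigned char>& Content() const { return m_content; }
private:
    std::vector<unsigned char> m_content;
};
```

With this version the loop body becomes printf("audio data = %d\n", nAudioData.Content()[i]); and the loop bound can use Size() instead of a fixed 20.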
I'm having some trouble working with memory: I have to keep a copy of some data in a new class. The main problem is that the first 9 bytes of this data should be thrown away. Whenever the object gets deleted, though, I either get a segmentation fault or SIGABRT (it's not even consistent).
class Frame
{
public:
Frame();
~Frame();
void setFirstData(uint8_t *data, size_t dataLength);
void setSecondData(uint8_t *data, size_t dataLength);
void setThirdData(uint8_t *data, size_t dataLength);
void setFourthData(uint8_t *data, size_t dataLength);
...
private:
unsigned char *_firstData;
bool _firstDataSet;
size_t _firstDataLength;
unsigned char *_secondData;
bool _secondDataSet;
size_t _secondDataLength;
unsigned char *_thirdData;
bool _thirdDataSet;
size_t _thirdDataLength;
unsigned char *_fourthData;
bool _fourthDataSet;
size_t _fourthDataLength;
};
Frame::Frame()
{
_firstDataSet = false;
_secondDataSet = false;
_thirdDataSet = false;
_fourthDataSet = false;
}
Frame::~Frame()
{
if (_firstDataSet)
delete [] _firstData;
if (_secondDataSet)
delete[] _secondData;
if (_thirdDataSet)
delete[] _thirdData;
if (_fourthDataSet)
delete[] _fourthData;
}
void Frame::setFirstData(uint8_t *data, size_t dataLength)
{
//copy all the data in a unsigned char*, except for the first 9 bytes
_firstDataLength = dataLength - 9;
_firstData = new unsigned char[_firstDataLength];
memcpy(_firstData, data + 9, _firstDataLength*sizeof(*_firstData));
/*for (int i = 0; i < dataLength - 9; i++)
{
_firstData[i] = (unsigned char) data[i + 9];
}*/
_firstDataSet = true;
}
The other setData functions are identical to setFirstData, but with the correct arrays.
Am I supposed to use something other than memcpy? Or is the usage wrong? The commented-out for loop was my original method of 'copying' the data, but I don't think it actually copies the data (the original array will be deleted while the copied data still has to be available).
EDIT: I added the qt tag because i'm working in a Qt environment and using some Qt classes for GUI. I don't think qt has anything to do with these basic C++ functions.
What with setting firstData:
_firstData = new unsigned char[_dataLength];
memcpy(_firstData, data + 9, _dataLength*sizeof(*_firstData));
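The calling code is not shown, so the actual cause of the crash can only be guessed at; two possibilities are a Frame being copied (both copies would then delete[] the same pointers) and a dataLength below 9 (dataLength - 9 would wrap around to a huge size_t). A minimal sketch that sidesteps both by letting std::vector own the bytes follows; copy_payload and firstData() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy everything after the first `skip` bytes.
// Returns an empty vector when the input is too short to hold a payload.
inline std::vector<unsigned char> copy_payload(const std::uint8_t* data,
                                               std::size_t dataLength,
                                               std::size_t skip = 9) {
    if (data == nullptr || dataLength <= skip) return {};
    return std::vector<unsigned char>(data + skip, data + dataLength);
}

class Frame {
public:
    void setFirstData(const std::uint8_t* data, std::size_t dataLength) {
        _firstData = copy_payload(data, dataLength);
    }
    const std::vector<unsigned char>& firstData() const { return _firstData; }
private:
    // An empty vector means "not set", so no separate flag or length
    // member is needed, and the default copy and destructor are correct.
    std::vector<unsigned char> _firstData;
};
```

The other three setters would follow the same pattern with their own vectors.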
I came across this code for developing a class for GA/GP but failed to understand it and hence unable debug the program.
typedef struct {
void *dataPointer;
int length;
} binary_data;
typedef struct {
organism *organisms; //This must be malloc'ed
int organismsCount;
int (*fitnessTest)(organism org);
int orgDnaLength;
unsigned int desiredFitness;
void (*progress)(unsigned int fitness);
} evolutionary_algorithm;
The above is straightforward. Then we try to initialize the organisms before testing their fitness etc...
int main(int argc, char *argv[])
{
srand(time(NULL));
int i;
evolutionary_algorithm ea;
ea.progress = progressDisplayer;
ea.organismsCount = 50;
ea.orgDnaLength = sizeof(unsigned int);
organism *orgs =(organism *) malloc(sizeof(organism) * ea.organismsCount);
for (i = 0; i < 50; i++)
{
organism newOrg;
binary_data newOrgDna;
newOrgDna.dataPointer = malloc(sizeof(unsigned int));
memset(newOrgDna.dataPointer, i, 1);
newOrgDna.length = sizeof(unsigned int);
newOrg.dna = newOrgDna;
orgs[i] = newOrg;
}
As far as I understand, memset() writes a binary value into the memory location that the void pointer (newOrgDna.dataPointer) refers to, and so on. But I can't figure out how to reassemble those binary values to get the integer value assigned to the "dna" variable of newOrg, so that I can check the integer value assigned to an individual organism, and eventually to the entire population residing in the memory assigned to "orgs".
As you can guess from the above, I'm not very familiar with memory management at this level of detail, so your help is very much appreciated.
Thank you so much
This code looks a bit strange. This line:
newOrgDna.dataPointer = malloc(sizeof(unsigned int));
will allocate sizeof(unsigned int) bytes, which is typically 4, even on most 64-bit machines. The strange part is that the memset on the line just below sets only the first byte.
To get actual value you might do:
char val = *((char*) newOrgDna.dataPointer);
But, as I said, this code looks a bit off. I would rewrite it as:
for (i = 0; i < 50; i++)
{
organism newOrg;
binary_data newOrgDna;
unsigned int * data = (unsigned int*) malloc(sizeof(unsigned int));
*data = i;
newOrgDna.length = sizeof(*data);
newOrgDna.dataPointer = (void*) data; // I think that cast can be dropped
newOrg.dna = newOrgDna;
orgs[i] = newOrg;
}
Then everywhere you want to get data from organism * you can do:
void f( organism * o )
{
assert( sizeof(unsigned int) == o->dna.length );
unsigned int data = *((unsigned int*) o->dna.dataPointer);
}
Also this is rather a C question not C++.
I've a class that consists basically of a matrix of vectors: vector< MyFeatVector<T> > m_vCells, where the outer vector represents the matrix. Each element in this matrix is then a vector (I extended the stl vector class and named it MyFeatVector<T>).
I'm trying to code an efficient method to store objects of this class in binary files.
Up to now, I require three nested loops:
foutput.write( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
where this->at(dy,dx,dz) retrieves the dz element of the vector at position [dy,dx].
Is there any possibility to store the m_vCells private member without using loops? I tried something like: foutput.write(reinterpret_cast<char*>(&(this->m_vCells[0])), (this->m_vCells.size())*sizeof(CFeatureVector<T>)); which seems not to work correctly. We can assume that all the vectors in this matrix have the same size, although a more general solution is also welcomed :-)
Furthermore, with my nested-loop implementation, storing objects of this class in binary files seems to require more physical space than storing the same objects in plain-text files, which is a bit weird.
I was trying to follow the suggestion under http://forum.allaboutcircuits.com/showthread.php?t=16465 but couldn't arrive into a proper solution.
Thanks!
Below a simplified example of my serialization and unserialization methods.
template < typename T >
bool MyFeatMatrix<T>::writeBinary( const string & ofile ){
ofstream foutput(ofile.c_str(), ios::out|ios::binary);
foutput.write(reinterpret_cast<char*>(&this->m_nHeight), sizeof(int));
foutput.write(reinterpret_cast<char*>(&this->m_nWidth), sizeof(int));
foutput.write(reinterpret_cast<char*>(&this->m_nDepth), sizeof(int));
//foutput.write(reinterpret_cast<char*>(&(this->m_vCells[0])), nSze*sizeof(CFeatureVector<T>));
for(register int dy=0; dy < this->m_nHeight; dy++){
for(register int dx=0; dx < this->m_nWidth; dx++){
for(register int dz=0; dz < this->m_nDepth; dz++){
foutput.write( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
}
}
}
foutput.close();
return true;
}
template < typename T >
bool MyFeatMatrix<T>::readBinary( const string & ifile ){
ifstream finput(ifile.c_str(), ios::in|ios::binary);
int nHeight, nWidth, nDepth;
finput.read(reinterpret_cast<char*>(&nHeight), sizeof(int));
finput.read(reinterpret_cast<char*>(&nWidth), sizeof(int));
finput.read(reinterpret_cast<char*>(&nDepth), sizeof(int));
this->resize(nHeight, nWidth, nDepth);
for(register int dy=0; dy < this->m_nHeight; dy++){
for(register int dx=0; dx < this->m_nWidth; dx++){
for(register int dz=0; dz < this->m_nDepth; dz++){
finput.read( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
}
}
}
finput.close();
return true;
}
The most efficient method is to store the objects in an array (or other contiguous space), then blast the buffer to the file. An advantage is that the disk platters don't have to waste time ramping up, and the writing can be performed contiguously instead of in random locations.
If this is your performance bottleneck, you may want to consider using multiple threads, with one extra thread to handle the output. Dump the objects into a buffer, set a flag, and the writing thread will handle the output, relieving your main task to perform more important work.
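As a minimal sketch of the contiguous-buffer idea applied to the matrix from the question: if all cells live in one flat vector, the whole payload can be written with a single write call. FlatMatrix, write_binary and read_binary are hypothetical names, and the sketch assumes T is trivially copyable and that the file is read back on a machine with the same representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Hypothetical flat matrix: height*width*depth cells in one contiguous vector.
template <typename T>
struct FlatMatrix {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte I/O needs a trivially copyable T");
    std::uint32_t height = 0, width = 0, depth = 0;
    std::vector<T> cells;

    T& at(std::size_t y, std::size_t x, std::size_t z) {
        return cells[(y * width + x) * depth + z];
    }
};

template <typename T>
bool write_binary(const FlatMatrix<T>& m, std::ostream& out) {
    out.write(reinterpret_cast<const char*>(&m.height), sizeof m.height);
    out.write(reinterpret_cast<const char*>(&m.width), sizeof m.width);
    out.write(reinterpret_cast<const char*>(&m.depth), sizeof m.depth);
    // One call for the whole payload instead of three nested loops.
    out.write(reinterpret_cast<const char*>(m.cells.data()),
              static_cast<std::streamsize>(m.cells.size() * sizeof(T)));
    return static_cast<bool>(out);
}

template <typename T>
bool read_binary(FlatMatrix<T>& m, std::istream& in) {
    in.read(reinterpret_cast<char*>(&m.height), sizeof m.height);
    in.read(reinterpret_cast<char*>(&m.width), sizeof m.width);
    in.read(reinterpret_cast<char*>(&m.depth), sizeof m.depth);
    if (!in) return false;
    m.cells.resize(static_cast<std::size_t>(m.height) * m.width * m.depth);
    in.read(reinterpret_cast<char*>(m.cells.data()),
            static_cast<std::streamsize>(m.cells.size() * sizeof(T)));
    return static_cast<bool>(in);
}
```

The same functions work with std::ofstream and std::ifstream opened with std::ios::binary. A vector of vectors cannot be written this way, because each inner vector keeps its elements in a separate heap allocation.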
Edit 1: Serializing Example
The following code has not been compiled and is for illustrative purposes only.
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
using std::ofstream;
using std::fill;
class binary_stream_interface
{
public:
virtual void load_from_buffer(const unsigned char *& buf_ptr) = 0;
virtual size_t size_on_stream(void) const = 0;
virtual void store_to_buffer(unsigned char *& buf_ptr) const = 0;
};
struct Pet
: public binary_stream_interface
{
// Fixed width of the name field on the stream, in bytes.
enum {max_name_length = 32};
std::string name;
unsigned int age;
void load_from_buffer(const unsigned char *& buf_ptr)
{
age = *((unsigned int *) buf_ptr);
buf_ptr += sizeof(unsigned int);
name = std::string((char *) buf_ptr);
buf_ptr += max_name_length;
return;
}
size_t size_on_stream(void) const
{
return sizeof(unsigned int) + max_name_length;
}
void store_to_buffer(unsigned char *& buf_ptr) const
{
*((unsigned int *) buf_ptr) = age;
buf_ptr += sizeof(unsigned int);
std::fill(buf_ptr, buf_ptr + max_name_length, 0); // pad the name field with '\0'
strncpy((char *) buf_ptr, name.c_str(), max_name_length);
buf_ptr += max_name_length;
return;
}
};
int main(void)
{
Pet dog;
dog.name = "Fido";
dog.age = 5;
ofstream data_file("pet_data.bin", std::ios::binary);
// Determine size of buffer
size_t buffer_size = dog.size_on_stream();
// Allocate the buffer
unsigned char * buffer = new unsigned char [buffer_size];
unsigned char * buf_ptr = buffer;
// Write / store the object into the buffer.
dog.store_to_buffer(buf_ptr);
// Write the buffer to the file / stream.
data_file.write((char *) buffer, buffer_size);
data_file.close();
delete [] buffer;
return 0;
}
Edit 2: A class with a vector of strings
class Many_Strings
: public binary_stream_interface
{
enum {MAX_STRING_SIZE = 32};
size_t size_on_stream(void) const
{
return m_string_container.size() * MAX_STRING_SIZE // Total size of strings.
+ sizeof(size_t); // with room for the quantity variable.
}
void store_to_buffer(unsigned char *& buf_ptr) const
{
// Treat the vector<string> as a variable length field.
// Store the quantity of strings into the buffer,
// followed by the content.
size_t string_quantity = m_string_container.size();
*((size_t *) buf_ptr) = string_quantity;
buf_ptr += sizeof(size_t);
for (size_t i = 0; i < string_quantity; ++i)
{
// Each string is a fixed length field.
// Pad with '\0' first, then copy the data.
std::fill(buf_ptr, buf_ptr + MAX_STRING_SIZE, 0);
strncpy((char *) buf_ptr, m_string_container[i].c_str(), MAX_STRING_SIZE);
buf_ptr += MAX_STRING_SIZE;
}
}
void load_from_buffer(const unsigned char *& buf_ptr)
{
// The actual coding is left as an exercise for the reader.
// Pseudo code:
// Clear / empty the string container.
// load the quantity variable.
// increment the buffer variable by the size of the quantity variable.
// for each new string (up to the quantity just read)
// load a temporary string from the buffer via buffer pointer.
// push the temporary string into the vector
// increment the buffer pointer by the MAX_STRING_SIZE.
// end-for
}
std::vector<std::string> m_string_container;
};
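The load_from_buffer pseudo code above could be sketched roughly as follows. This is only one possible reading of it, written as a hypothetical free function load_strings rather than a member, and it assumes the buffer was produced by the matching store_to_buffer on the same platform. It does no bounds checking, since the interface passes no buffer length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

enum { MAX_STRING_SIZE = 32 };

// Hypothetical free-function version of Many_Strings::load_from_buffer.
// Advances buf_ptr past everything it consumes.
inline std::vector<std::string> load_strings(const unsigned char*& buf_ptr) {
    // Load the quantity variable (memcpy avoids an unaligned read).
    std::size_t quantity = 0;
    std::memcpy(&quantity, buf_ptr, sizeof quantity);
    buf_ptr += sizeof quantity;

    std::vector<std::string> result;
    for (std::size_t i = 0; i < quantity; ++i) {
        const char* field = reinterpret_cast<const char*>(buf_ptr);
        // Stop at the first '\0', or at the field boundary when the
        // string filled all MAX_STRING_SIZE bytes and has no terminator.
        const char* end = std::find(field, field + MAX_STRING_SIZE, '\0');
        result.emplace_back(field, end);
        buf_ptr += MAX_STRING_SIZE;
    }
    return result;
}
```

Searching for the terminator within the field matters because strncpy does not write a '\0' when the source string is MAX_STRING_SIZE characters or longer.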
I'd suggest you read the C++ FAQ on Serialization; you can choose what best fits your needs.
When you're working with structures and classes, you have to take care of two things:
Pointers inside the class
Padding bytes
Both of these can produce nasty results in your output. IMO, the object itself must implement its serialization and deserialization. The object knows its own structure, pointer data etc., so it can decide which format can be implemented efficiently.
You will have to iterate anyway, or wrap the iteration somewhere. Once you have implemented the serialization and deserialization functions (either as operators or as plain functions), the object is easy to pass around; especially when you're working with stream objects, overloading the << and >> operators makes this simple.
Regarding your question about using the underlying pointer of a vector: it might work if it's a single vector, but it's not a good idea otherwise.
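A minimal sketch of the operator-overloading approach in text format. Point is a hypothetical element type used only for illustration, and the "count first, then elements" layout is an assumption, not something taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical element type used only for illustration.
struct Point { int x; int y; };

std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << p.x << ' ' << p.y;
}
std::istream& operator>>(std::istream& is, Point& p) {
    return is >> p.x >> p.y;
}

// Container format: element count first, then the elements.
std::ostream& operator<<(std::ostream& os, const std::vector<Point>& v) {
    os << v.size();
    for (const Point& p : v) os << ' ' << p;
    return os;
}
std::istream& operator>>(std::istream& is, std::vector<Point>& v) {
    std::size_t n = 0;
    if (!(is >> n)) return is;
    std::vector<Point> tmp;
    for (std::size_t i = 0; i < n; ++i) {
        Point p{};
        if (!(is >> p)) return is;  // leave v untouched on a short read
        tmp.push_back(p);
    }
    v.swap(tmp);
    return is;
}
```

The iteration is still there, but it is wrapped once inside the operators, so callers can simply write file << cells and file >> cells.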
Update according to the question update.
There are a few things you should keep in mind before inheriting from STL containers. They're not really good candidates for inheritance because they don't have virtual destructors. If you're using basic data types and POD-like structures, it won't cause many issues. But if you use them in a truly object-oriented way, you may face some unpleasant behavior.
Regarding your code
Why are you typecasting it to char*?
The way you serialize the object is your choice. IMO, what you did is a basic file write operation in the name of serialization.
Serialization is down to the object, i.e. the parameter 'T' in your template class. If you're using POD or basic types, there's no need for special handling. Otherwise you have to choose carefully how to write the object.
Choosing text or binary format is your choice. Text format always has a cost, but it's easier to manipulate than binary format.
For example, the following code performs a simple write and read operation (in text format).
fstream fr("test.txt", ios_base::out | ios_base::binary );
for( int i =0;i <_countof(arr);i++)
fr << arr[i] << ' ';
fr.close();
fstream fw("test.txt", ios_base::in| ios_base::binary);
int j = 0;
// Stop when the output array is full or when extraction fails (e.g. at end of file).
while( j < _countof(arrout) && fw >> arrout[j] )
{
j++;
}
It seems to me that the most direct route to generating a binary file containing a vector is to memory-map the file and place the vector in the mapped region. As pointed out by sarat, you need to worry about how pointers are used within the class. The Boost.Interprocess library has a tutorial on how to do this using its shared memory regions, which include memory-mapped files.
First off, have you looked at Boost.multi_array? Always good to take something ready-made rather than reinventing the wheel.
That said, I'm not sure if this is helpful, but here's how I would implement the basic data structure, and it'd be fairly easy to serialize:
#include <array>
template <typename T, size_t DIM1, size_t DIM2, size_t DIM3>
class ThreeDArray
{
typedef std::array<T, DIM1 * DIM2 * DIM3> array_t;
array_t m_data;
public:
inline size_t size() const { return m_data.size(); }
inline size_t byte_size() const { return sizeof(T) * m_data.size(); }
inline T & operator()(size_t i, size_t j, size_t k)
{
return m_data[i + j * DIM1 + k * DIM1 * DIM2];
}
inline const T & operator()(size_t i, size_t j, size_t k) const
{
return m_data[i + j * DIM1 + k * DIM1 * DIM2];
}
inline const T * data() const { return m_data.data(); }
};
You can serialize the data buffer directly:
ThreeDArray<int, 4, 6, 11> arr;
/* ... */
std::ofstream outfile("file.bin", std::ios::binary);
outfile.write(reinterpret_cast<const char*>(arr.data()), arr.byte_size());