C++ binary file reading

I'm trying to store objects, including vectors of objects, in a binary file.
Here's a bit of the load-from-file code:
template <class T> void read(T* obj, std::ifstream* file) {
    file->read((char*)(obj), sizeof(*obj));
    file->seekg(int(file->tellg()) + sizeof(*obj));
}
void read_db(DB* obj, std::ifstream* file) {
    read<DB>(obj, file);
    for (int index = 0; index < obj->Arrays.size(); index++) {
        std::cin.get();             // debugging
        obj->Arrays[0].Name = "hi"; // debugging
        std::cin.get();             // debugging
        std::cout << obj->Arrays[0].Name;
        read<DB_ARRAY>(&obj->Arrays[index], file);
        for (int row_index = 0; row_index < obj->Arrays[index].Rows.size(); row_index++) {
            read<DB_ROW>(&obj->Arrays[index].Rows[row_index], file);
            for (int int_index = 0; int_index < obj->Arrays[index].Rows[row_index].i_Values.size(); int_index++) {
                read<DB_VALUE<int>>(&obj->Arrays[index].Rows[row_index].i_Values[int_index], file);
            }
        }
    }
}
And here's the DB/DB_ARRAY classes
class DB {
public:
    std::string Name;
    std::vector<DB_ARRAY> Arrays;
    DB_ARRAY* operator[](std::string);
    DB_ARRAY* Create(std::string);
};
class DB_ARRAY {
public:
    DB* Parent;
    std::string Name;
    std::vector<DB_ROW> Rows;
    DB_ROW* operator[](int);
    DB_ROW* Create();
    DB_ARRAY(DB*, std::string);
    DB_ARRAY();
};
So the first argument to the read_db function does end up with correct values, and the vector Arrays on the object has the correct size. However, if I index any value of any object from obj->Arrays, it throws an access violation exception.
std::cout << obj->Arrays[0].Name; // error
std::cout << &obj->Arrays[0]; // no error
The latter always prints the same address, so when I save an object cast to char*, does it also save its address?

As various commenters pointed out, you cannot simply serialize a (non-POD) object by saving / restoring its memory.
The usual way to implement serialization is to implement a serialization interface on the classes. Something like this:
struct ISerializable {
    virtual std::ostream& save(std::ostream& os) const = 0;
    virtual std::istream& load(std::istream& is) = 0;
};
You then implement this interface in your serializable classes, recursively calling save and load on any members referencing other serializable classes, and writing out any POD members. E.g.:
class DB_ARRAY : public ISerializable {
public:
    DB* Parent;
    std::string Name;
    std::vector<DB_ROW> Rows;
    DB_ROW* operator[](int);
    DB_ROW* Create();
    DB_ARRAY(DB*, std::string);
    DB_ARRAY();
    virtual std::ostream& save(std::ostream& os) const
    {
        // serialize out members
        return os;
    }
    virtual std::istream& load(std::istream& is)
    {
        // unserialize members
        return is;
    }
};
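For concreteness, here is a rough sketch (not the poster's code) of what those two placeholder bodies might contain, assuming DB_ROW implements the same interface, strings are stored with a length prefix, and the Parent pointer is deliberately left out because raw pointers don't survive a round trip to disk:
virtual std::ostream& save(std::ostream& os) const
{
    // Name: length prefix, then the raw characters.
    std::size_t len = Name.size();
    os.write(reinterpret_cast<const char*>(&len), sizeof(len));
    os.write(Name.data(), len);
    // Rows: element count, then each row serializes itself.
    std::size_t rows = Rows.size();
    os.write(reinterpret_cast<const char*>(&rows), sizeof(rows));
    for (const DB_ROW& row : Rows)
        row.save(os);
    return os;
}
virtual std::istream& load(std::istream& is)
{
    std::size_t len = 0;
    is.read(reinterpret_cast<char*>(&len), sizeof(len));
    Name.resize(len);
    is.read(&Name[0], len);
    std::size_t rows = 0;
    is.read(reinterpret_cast<char*>(&rows), sizeof(rows));
    Rows.resize(rows);
    for (DB_ROW& row : Rows)
        row.load(is);
    return is;
}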
As count0 pointed out, boost::serialization is also a great starting point.

What is the format of the binary data in the file? Until you specify
that, we can't tell you how to write it. Basically, you have to specify
a format for all of your data types (except char), then write the code
to write out that format, byte by byte (or generate it into a buffer);
and on the other side, to read it in byte by byte, and reconstruct it.
The C++ standard says nothing (or very little) about the size and
representation of the data types, except that sizeof(char) must be
1, and that unsigned char must be a pure binary representation over
all of the bits. And on the machines I have access to today (Sun Sparc and
PCs), only the character types have a common representation. As for
the more complex types, the memory used in the value representation
might not even be contiguous: the bitwise representation of an
std::vector, for example, is usually three pointers, with the actual
values in the vector being found somewhere else entirely.
The functions istream::read and ostream::write are
designed for reading data into a buffer for manual parsing, and writing
a pre-formatted buffer. The fact that you need to use a
reinterpret_cast to use them otherwise should be a good indication
that it won't work.
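To make the "specify a format, then write it byte by byte" point concrete, here is a small sketch of my own (assuming, purely for illustration, a format where 32-bit unsigned integers are stored most-significant byte first):
#include <cstdint>
#include <istream>
#include <ostream>
// Write a 32-bit unsigned value most-significant byte first,
// regardless of the host's native representation.
void writeU32(std::ostream& os, std::uint32_t value)
{
    unsigned char bytes[4] = {
        static_cast<unsigned char>((value >> 24) & 0xFF),
        static_cast<unsigned char>((value >> 16) & 0xFF),
        static_cast<unsigned char>((value >> 8) & 0xFF),
        static_cast<unsigned char>(value & 0xFF)
    };
    os.write(reinterpret_cast<const char*>(bytes), 4);
}
// Read the same format back and reconstruct the value byte by byte.
std::uint32_t readU32(std::istream& is)
{
    unsigned char bytes[4];
    is.read(reinterpret_cast<char*>(bytes), 4);
    return (std::uint32_t(bytes[0]) << 24) | (std::uint32_t(bytes[1]) << 16)
         | (std::uint32_t(bytes[2]) << 8)  |  std::uint32_t(bytes[3]);
}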

Related

What's the most simple way to read and write data from a struct to and from a file in c++ without serialization library?

I am writing a program that regularly stores and reads structs of the form below.
struct Node {
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
};
How would I read and write nodes to a file? I would like to use the fstream class with seekg and seekp to do the serialization manually, but I'm not sure how it works from the documentation and am struggling to find decent examples.
Edit: specified that I do not want to use a serialization library.
This problem is known as serialization. Use a serializing library like e.g. Google's Protocol Buffers or Flatbuffers.
To serialize objects, you will need to stick to the concept that the object is writing its members to the stream and reading members from the stream. Also, member objects should write themselves to the stream (as well as read).
I implemented a scheme using three member functions, and a buffer:
void load_from_buffer(uint8_t * & buffer_pointer);
void store_to_buffer(uint8_t * & buffer_pointer) const;
unsigned int size_on_stream() const;
The size_on_stream would be called first in order to determine the buffer size for the object (or how much space it occupied in the buffer).
The load_from_buffer function loads the object's members from a buffer using the given pointer. The function also increments the pointer appropriately.
The store_to_buffer function stores the object's members to a buffer using the given pointer. The function also increments the pointer appropriately.
This can be applied to POD types by using templates and template specializations.
These functions also allow you to pack the output into the buffer, and load from a packed format.
The reason for I/O to the buffer is so you can use the more efficient block stream methods, such as write and read.
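This is not the answerer's actual code, but a minimal sketch of how that three-function scheme could look for a simple struct (the type and packing are made up for illustration):
#include <cstdint>
#include <cstring>
struct Point
{
    std::int32_t x = 0;
    std::int32_t y = 0;
    // How many bytes this object occupies in the (packed) stream.
    unsigned int size_on_stream() const
    {
        return sizeof(x) + sizeof(y);
    }
    // Copy the members into the buffer and advance the pointer.
    void store_to_buffer(std::uint8_t*& p) const
    {
        std::memcpy(p, &x, sizeof(x)); p += sizeof(x);
        std::memcpy(p, &y, sizeof(y)); p += sizeof(y);
    }
    // Read the members back from the buffer and advance the pointer.
    void load_from_buffer(std::uint8_t*& p)
    {
        std::memcpy(&x, p, sizeof(x)); p += sizeof(x);
        std::memcpy(&y, p, sizeof(y)); p += sizeof(y);
    }
};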
Edit 1: Writing a node to a stream
The problem with writing or serializing a node (such as a linked list or tree node) is that pointers don't translate to a file. There is no guarantee that the OS will place your program in the same memory location or give you the same area of memory each time.
You have two options: 1) Only store the data. 2) Convert the pointers to file offsets. Option 2) is very complicated as it may require repositioning the file pointer because file offsets may not be known ahead of time.
Also, be aware of variable-length records like strings. You can't directly write a string object to a file. Unless you use a fixed string width, the string size will change. You will either need to prefix the string with its length or use some kind of terminating character, such as '\0'. Writing the length first is preferred because you don't have to search for the end of the string; you can use a block read to read in the text.
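As a small illustration of the length-prefix approach (a sketch, not part of the original answer):
#include <cstdint>
#include <fstream>
#include <string>
// Write the character count first, then the raw characters.
void writeString(std::ofstream& out, const std::string& s)
{
    std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(s.data(), len);
}
// Read the count, size the string, then block-read the text.
void readString(std::ifstream& in, std::string& s)
{
    std::uint32_t len = 0;
    in.read(reinterpret_cast<char*>(&len), sizeof(len));
    s.resize(len);
    in.read(&s[0], len);
}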
If you replace the std::string by a char buffer, you can use fwrite and fread to write/read your structure to and from disk as a fixed size block of information. Within a single program that should work ok.
The big bug-a-boo is the fact that compilers will insert padding between fields in order to keep the data aligned. That makes the code less portable: if a module is compiled with different alignment requirements, the structure can literally be a different size, throwing your fixed-size assumption out the door.
I would lean toward a well worn in serialization library of some sort.
Another approach would be to overload the operator<< and operator>> for the structure so that it knows how to save/load itself. That would reduce the problem to knowing where to read/write the node. In theory, your left and right child fields could be seek addresses to where the nodes actually reside, while a new field could hold the seek location of the current node.
When implementing your own serialization method, the first decision you'll have to make is whether you want the data on disk to be in binary format or textual format.
I find it easier to implement the ability to save to a binary format. The number of functions needed to implement that is small. You need to implement functions that can write the fundamental types, arrays of known size at compile time, dynamic arrays and strings. Everything else can be built on top of those.
Here's something very close to what I recently put into production code.
#include <cstring>
#include <fstream>
#include <cstddef>
#include <stdexcept>
// Class to write to a stream
struct Writer
{
std::ostream& out_;
Writer(std::ostream& out) : out_(out) {}
// Write the fundamental types
template <typename T>
void write(T number)
{
out_.write(reinterpret_cast<char const*>(&number), sizeof(number));
if (!out_ )
{
throw std::runtime_error("Unable to write a number");
}
}
// Write arrays whose size is known at compile time
template <typename T, uint64_t N>
void write(T (&array)[N])
{
for(uint64_t i = 0; i < N; ++i )
{
write(array[i]);
}
}
// Write dynamic arrays
template <typename T>
void write(T array[], uint64_t size)
{
write(size);
for(uint64_t i = 0; i < size; ++i )
{
write(array[i]);
}
}
// Write strings
void write(std::string const& str)
{
write(str.c_str(), str.size());
}
void write(char const* str)
{
write(str, std::strlen(str));
}
};
// Class to read from a stream
struct Reader
{
std::ifstream& in_;
Reader(std::ifstream& in) : in_(in) {}
template <typename T>
void read(T& number)
{
in_.read(reinterpret_cast<char*>(&number), sizeof(number));
if (!in_ )
{
throw std::runtime_error("Unable to read a number.");
}
}
template <typename T, uint64_t N>
void read(T (&array)[N])
{
for(uint64_t i = 0; i < N; ++i )
{
read(array[i]);
}
}
template <typename T>
void read(T*& array)
{
uint64_t size;
read(size);
array = new T[size];
for(uint64_t i = 0; i < size; ++i )
{
read(array[i]);
}
}
void read(std::string& str)
{
    uint64_t size;
    read(size);
    str.resize(size);
    for (uint64_t i = 0; i < size; ++i)
    {
        read(str[i]); // read the characters written by Writer::write(std::string const&)
    }
}
};
// Test the code.
#include <iostream>
void writeData(std::string const& file)
{
std::ofstream out(file, std::ios::binary); // binary mode: Writer stores raw bytes
Writer w(out);
w.write(10);
w.write(20.f);
w.write(200.456);
w.write("Test String");
}
void readData(std::string const& file)
{
std::ifstream in(file, std::ios::binary);
Reader r(in);
int i;
r.read(i);
std::cout << "i: " << i << std::endl;
float f;
r.read(f);
std::cout << "f: " << f << std::endl;
double d;
r.read(d);
std::cout << "d: " << d << std::endl;
std::string s;
r.read(s);
std::cout << "s: " << s << std::endl;
}
void testWriteAndRead(std::string const& file)
{
writeData(file);
readData(file);
}
int main()
{
testWriteAndRead("test.bin");
return 0;
}
Output:
i: 10
f: 20
d: 200.456
s: Test String
The ability to write and read a Node is very easily implemented.
void write(Writer& w, Node const& n)
{
w.write(n.leftChild);
w.write(n.rightChild);
w.write(n.value);
w.write(n.count);
w.write(n.balanceFactor);
}
void read(Reader& r, Node& n)
{
r.read(n.leftChild);
r.read(n.rightChild);
r.read(n.value);
r.read(n.count);
r.read(n.balanceFactor);
}
The process you are referring to is known as serialization. I'd recommend Cereal at http://uscilab.github.io/cereal/
It supports JSON, XML, and binary serialization and is very easy to use (with good examples).
(Unfortunately it does not support my favourite format, YAML.)
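For reference (not part of the original answer), a minimal sketch of how the Node from the question might be serialized with cereal, assuming the library's usual archive interface:
#include <fstream>
#include <string>
#include <cereal/archives/binary.hpp>
#include <cereal/types/string.hpp>  // teaches cereal how to handle std::string
struct Node {
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
    // cereal calls this single function for both saving and loading.
    template <class Archive>
    void serialize(Archive& ar)
    {
        ar(leftChild, rightChild, value, count, balanceFactor);
    }
};
void saveNode(const Node& n, const std::string& path)
{
    std::ofstream os(path, std::ios::binary);
    cereal::BinaryOutputArchive ar(os);
    ar(n);
}
void loadNode(Node& n, const std::string& path)
{
    std::ifstream is(path, std::ios::binary);
    cereal::BinaryInputArchive ar(is);
    ar(n);
}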

Calculate serialization size of objects in Qt

How can I determine the size of Qt data types in bytes, including QString objects, as they would be written to a QFile? I have to implement a sizeOf() function in the Student class below, something like sizeof(struct student) in C.
Student class
#include<QtCore>
class Student
{
public:
QString name,fname;
quint8 age,weight,clss;
Student(){
}
/*the following function should return the size of this object in bytes;
I will use QDataStream & operator<<(QDataStream &out, Student &f)
function to write data to a QFile */
qint16 sizeOf()
{
// needs implementation;
}
};
QDataStream & operator<<(QDataStream &out, Student &f)
{
    out << f.name << f.fname << f.age << f.weight << f.clss;
    return out;
}
QDataStream & operator>>(QDataStream &in, Student &f)
{
    in >> f.name >> f.fname >> f.age >> f.weight >> f.clss;
    return in;
}
I know that data can be read with QDataStream & operator>>(QDataStream &in, Student &f), but I also want to know the size for some other cases.
This does not give me a valid size on file. It seems Qt adds some extra bytes while serializing, possibly for endianness independence on different platforms. The actual size is always more than what the sizeOf() function returns:
qint16 sizeOf()
{
qint16 size=0;
size+=sizeof(quint8)*3; // size of age, weight and clss
//variables all have type quint8
size+=name.size()*16; // number of characters in string
// multiply with 16 bit QChar
size+=fname.size()*16;
return size;
}
I am using QFile, QDataStream api. Qt version 4.8 on Windows 8.
The size which sizeof will give you does not reflect the actual size an object might have. For example, the sizeof a QString in a 32-bit build will always be 4 bytes, regardless of how long the actual string is.
The sizeof operator also includes stuff that doesn't need to be serialized, like the object's vtable pointer, and does not account for the size of dynamically allocated resources for that object.
You can easily determine the serialized size: just use a QDataStream, get the pos() of its device() before, write the object into the stream, and compare with pos() afterwards.
Also, this line is clearly wrong: size += sizeof(quint8*3). It will not give you three times the size of a byte; it will give you the size of an int, which is how the result is promoted after the multiplication.
Here is a nifty little class you can use for the task:
class SerialSize {
public:
SerialSize() : stream(&data) { data.open(QIODevice::WriteOnly); }
template <typename T>
quint64 operator ()(const T & t) {
data.seek(0);
stream << t;
return data.pos();
}
private:
QBuffer data;
QDataStream stream;
};
Then use it:
SerialSize size;
qDebug() << size(QString("a")); // 6
qDebug() << size(QString("aa")); // 8
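Applying the same idea directly to the question's class, sizeOf() can stream the object into a QBuffer and report how far the write position moved. A sketch (not part of the original answers; it assumes the body is moved out of line so that the operator<< for Student is already declared):
qint16 Student::sizeOf()
{
    QBuffer buffer;
    buffer.open(QIODevice::WriteOnly);
    QDataStream stream(&buffer);
    stream << *this;                          // reuse the operator<< defined for Student
    return static_cast<qint16>(buffer.pos()); // exactly what would be written to the QFile
}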
In the implementation of the sizeOf function, you should
try changing size+=sizeof(quint8*3) to size+=sizeof(quint8) * 3.

Saving a game state using serialization C++

I have a class called Game which contains the following:
vector<shared_ptr<A>> attr; // attributes
D diff; // differences
vector<shared_ptr<C>> change; // change
My question is, how can I write these (save) to a file and read/load it up later?
I thought about using a struct with these in it, and simply saving the struct but I have no idea where to start.
This is my attempt so far, with just trying to save change. I've read up a lot on the issue, and my problem (well, one of them, anyway) seems to be that I am storing pointers, which would be invalid after closing the program (compounded by the fact that I also free them before exiting).
/* Saves state to file */
void Game::saveGame(string toFile) {
ofstream ofs(toFile, ios::binary);
ofs.write((char *)&this->change, sizeof(C));
/* Free memory code here */
....
exit(0);
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
ifstream ifs(fromFile, ios::binary);
ifs.read((char *)&this->change, sizeof(C));
this->change.toString(); // display load results
};
Can anyone guide me in the right direction for serializing this data? I'd like to use only standard packages, so no boost.
Thanks.
I have no idea how classes A, C or D are implemented, but that is the first question: how to serialize an object of each class. For the C case, you need to implement something like this:
std::ostream& operator <<(std::ostream& os, const C& c) {
// ... code to serialize c to an output stream
return os;
}
std::istream& operator >>(std::istream& is, C& c) {
// ... code to populate c contents from the input stream
return is;
}
or, if you prefer, create a write() and read() function for that class.
Well, if you want to serialize a vector<shared_ptr<C>>, it seems obvious that you don't want to serialize the pointers, but the contents. So you need to dereference each of those pointers and serialize what they point to. If the size of the vector is not known before loading it (i.e., it is not always the same), you'll need to store that information. Then, you can create a pair of functions to serialize the complete vector:
std::ostream& operator <<(std::ostream& os, const std::vector<std::shared_ptr<C>>& vc) {
    os << vc.size() << std::endl;    // serialize the size of the vector using the << operator
    for (const auto& pc : vc) {      // for each element of the vector
        os << *pc << std::endl;      // store the element pointed to by the pointer, not the pointer
    }
    return os;
}
std::istream& operator >>(std::istream& is, std::vector<std::shared_ptr<C>>& vc) {
    std::size_t size = 0;
    is >> size;                      // read the size of the vector using the >> operator
    vc.resize(size);                 // set the size of the vector
    for (auto& pc : vc) {            // pc is a reference to the i-th element of the vector
        C c;                         // temporary object
        is >> c;                     // read the object stored in the stream
        pc = std::make_shared<C>(c); // construct the shared pointer, assuming C has a copy constructor
    }
    return is;
}
And then,
/* Saves state to file */
void Game::saveGame(string toFile) {
ofstream ofs(toFile);
ofs << change;
....
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
ifstream ifs(fromFile);
ifs >> change;
};
I know there are a lot of things you still need to resolve. I suggest you investigate them yourself so that you understand well how to solve your problem.
Not only are you saving pointers, you're trying to save a shared_ptr but using the wrong size.
You need to write serialization functions for all your classes, taking care to never just write the raw bits of a non-POD type. It's safest to always implement member-by-member serialization for everything, because you never know what the future will bring.
Then handling collections of them is just a matter of also storing how many there are.
Example for the Cs:
void Game::save(ofstream& stream, const C& data)
{
// Save data as appropriate...
}
void Game::saveGame(string toFile) {
    ofstream ofs(toFile, ios::binary);
    size_t count = change.size();
    ofs.write((char *)&count, sizeof(count));
    for (vector<shared_ptr<C>>::const_iterator c = change.begin(); c != change.end(); ++c)
    {
        save(ofs, **c);
    }
};
shared_ptr<C> Game::loadC(ifstream& stream)
{
shared_ptr<C> data(new C);
// load the object...
return data;
}
void Game::loadGame(string fromFile) {
change.clear();
size_t count = 0;
ifstream ifs(fromFile, ios::binary);
ifs.read((char *)&count, sizeof(count));
change.reserve(count);
for (size_t i = 0; i < count; ++i)
{
change.push_back(loadC(ifs));
}
};
All the error handling is missing of course - you would need to add that.
It's actually a good idea to at least start with text storage (using << and >>) instead of binary. It's easier to find bugs, or mess around with the saved state, when you can just edit it in a text editor.
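For instance, a text-based pair of operators for the Node from the question might look like this (a sketch of the idea, not code from the answer; it assumes value contains no newline, so the string can occupy a line of its own):
#include <iostream>
#include <string>
struct Node {                 // the Node from the question
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
};
// Human-readable output: the numbers on one line, the string on the next.
std::ostream& operator<<(std::ostream& os, const Node& n)
{
    os << n.leftChild << ' ' << n.rightChild << ' '
       << n.count << ' ' << n.balanceFactor << '\n'
       << n.value << '\n';
    return os;
}
std::istream& operator>>(std::istream& is, Node& n)
{
    is >> n.leftChild >> n.rightChild >> n.count >> n.balanceFactor;
    is.ignore();               // skip the newline before the string
    std::getline(is, n.value); // the value occupies the rest of the line
    return is;
}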
Writing your own serialization is quite a challenge. Even if you do not use boost serialization, I would recommend you learn how to use it and understand how it works rather than discovering it yourself.
When serializing, you finally end up with a buffer of data of whose content you have only a vague idea. You have to save everything you need to be able to restore it, and you read it back chunk by chunk. Example (not compiled, not tested, and not stylish):
void save(ostream& out, const string& s)
{
    out << s.size() << ' ';          // length first, then a separator
    out.write(s.c_str(), s.size());  // then the raw characters
}
void load(istream& in, string& s)
{
    unsigned len;
    in >> len;
    in.ignore();                     // skip the separator
    s.resize(len);
    in.read(&s[0], len);             // read directly into the string's buffer
}
struct Game
{
void save(ostream& out)
{
player.save(out);
};
void load(istream& in)
{
player.load(in);
}
};
struct Player
{
void save(ostream& out)
{
// save in the same order as loading, serializing everything you need to read it back
save(out, name);
save(out, experience);
}
void load(istream& in)
{
load(in, name);
load(in, experience); //
}
};
I do not know why you would do this to yourself instead of using boost, but these are some of the cases you should consider:
- type - you must figure out a way to know what "type of change" you actually have there.
- a string (vector, whatever) - size + data (so the first thing you read back for the string is the length; you resize it and copy the "length" number of characters)
- a pointer - save the data pointed to by the pointer; upon deserialization you have to allocate it, construct it (usually default-construct), read back the data, and reset the members to their respective values. Note: you have to avoid memory leaks.
- a polymorphic pointer - ouch, you have to know what type the pointer actually points to, construct the derived type, and save the values of the derived type... so you have to save type information (see the sketch below)
- a null pointer - you have to distinguish the null pointer so you know that you do not need to read further data from the stream.
- versioning - you have to be able to read data after you have added/removed a field
There is too much of it for you to get a complete answer.
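Not from the original answer, but here is a minimal sketch of the polymorphic-pointer case mentioned above: write a one-byte type tag before the object and switch on it when loading (the Shape/Circle/Square names are made up for illustration):
#include <iostream>
#include <memory>
struct Shape {
    virtual ~Shape() = default;
    virtual char tag() const = 0;                  // one byte identifying the concrete type
    virtual void save(std::ostream& out) const = 0;
    virtual void load(std::istream& in) = 0;
};
struct Circle : Shape {
    double radius = 0;
    char tag() const override { return 'C'; }
    void save(std::ostream& out) const override { out.write((const char*)&radius, sizeof(radius)); }
    void load(std::istream& in) override { in.read((char*)&radius, sizeof(radius)); }
};
struct Square : Shape {
    double side = 0;
    char tag() const override { return 'S'; }
    void save(std::ostream& out) const override { out.write((const char*)&side, sizeof(side)); }
    void load(std::istream& in) override { in.read((char*)&side, sizeof(side)); }
};
// A null pointer is written as the tag '0' with no payload.
void savePtr(std::ostream& out, const Shape* p)
{
    char t = p ? p->tag() : '0';
    out.write(&t, 1);
    if (p) p->save(out);
}
std::unique_ptr<Shape> loadPtr(std::istream& in)
{
    char t;
    in.read(&t, 1);
    std::unique_ptr<Shape> p;
    if (t == 'C') p.reset(new Circle);
    else if (t == 'S') p.reset(new Square);
    if (p) p->load(in);
    return p;                                      // stays empty for the null-pointer tag
}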

Serializing Struct of vectors of vector

I'm in a bit of a complicated situation here.
I want to save a struct of vectors to a file and read it back after some time.
But the problem is with the reading. I don't know how I can fill all the vectors in the structure from the saved file.
struct TDNATable {
    std::vector<String> GenomOne;
    std::vector<String> GenomeL;
    std::vector<String> GenomeER;
    std::vector<String> GenomeRET;
    std::vector<String> GenomeSEL;
};
std::vector<TDNATable> DnaTbl;
//PSEUDO CODE:
//For example simple writing could be
ofstream file("C:\\Users\\User11\\Desktop\\SNF_TBL.INI", ios::binary);
file.write((char*)&DnaTbl, sizeof(DnaTbl));
//Problem comes with reading
// impl
ifstream file("C:\\Users\\User11\\Desktop\\SNF_TBL.INI",
std::ifstream::binary);
// get pointer to associated buffer object
std::filebuf* pbuf = file.rdbuf();
// get file size using buffer's members
std::size_t size = pbuf->pubseekoff(0, file.end, file.in);
pbuf->pubseekpos(0, file.in);
// allocate memory to contain file data
char* buffer = new char[size];
// get file data
pbuf->sgetn(buffer, size);
file.close();
for (int i = 0; i < SnfTbl.size(); i++) {
//Back inserter can be used only with 1D vector
std::copy(buffer, buffer + sizeof(buffer),
std::back_inserter(SnfTbl[i].GenomeL));
std::copy(buffer, buffer +sizeof(buffer),
std::back_inserter(SnfTbl[i].GenomeER));
}
RefreshDFSGrid();
delete[]buffer;
file.close();
I tried with boost/serialize but without success.
Do you have any idea how I can save/load this data structure in an elegant way?
Thanks!
Boost could be overkill for easy tasks. In my own code, I solved that problem with a sort of stream class. I did it this way:
Declare an abstract base class with virtual Read(buffer, byteCount) = 0 and virtual Write(buffer, byteCount) = 0. In the illustration below, IArchiveI and IArchiveO are such base classes.
For built-in types, provide operator << and operator >> that simply call Read() and Write() as appropriate.
For library types such as vector / string / ..., provide non-member template operators built on the base-type operators (so you no longer call raw Read / Write).
For instance, here's how I handle a vector:
template <class T>
IArchiveO& operator << (IArchiveO& a_Stream, const std::vector<T>& a_Vector)
{
a_Stream << a_Vector.size();
for (size_t i = 0; i < a_Vector.size(); i++)
{
a_Stream << a_Vector[i];
}
return a_Stream;
}
template <class T>
IArchiveI& operator >> (IArchiveI& a_Stream, std::vector<T>& a_Vector)
{
a_Vector.clear();
size_t contSize = 0;
a_Stream >> contSize;
a_Vector.resize(contSize);
for (size_t i = 0; i < contSize; i++)
{
a_Stream >> a_Vector[i];
}
return a_Stream;
}
For non-library types of your own, provide operators the same way.
For instance, here's what your code would look like:
IArchiveI& operator >> (IArchiveI& a_Stream, TDNATable& a_Value)
{
a_Stream >> a_Value.GenomOne;
a_Stream >> a_Value.GenomeL;
a_Stream >> a_Value.GenomeER;
a_Stream >> a_Value.GenomeRET;
a_Stream >> a_Value.GenomeSEL;
return a_Stream;
}
Inherit from base classes and make classes that provide storage, for instance, reading/writing to file. You will only need to overload virtual Read(buffer, byteCount) and virtual Write(buffer, byteCount).
Finally, you construct an instance of the storage class and serialize your entire array in one go (in this code, CFileArchiveO is inherited from IArchiveO, overloading Write()):
CFileArchiveO ar(...);
ar << DnaTbl;
The trick is that once the compiler has operators for each type, it automatically builds the code for whatever nesting you have, even if it's a vector<vector<vector<string>>>.
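The storage classes themselves are not shown above; here is one way the output side might look (a sketch under the same naming, backed by std::ofstream; the input side, CFileArchiveI, mirrors it with Read() calling the stream's read()):
#include <cstddef>
#include <fstream>
#include <string>
// Abstract output archive: everything above is built on this one function.
struct IArchiveO {
    virtual ~IArchiveO() {}
    virtual void Write(const void* buffer, std::size_t byteCount) = 0;
};
// Built-in types go straight through Write(); the other built-ins look the same.
inline IArchiveO& operator << (IArchiveO& a_Stream, std::size_t v)
{
    a_Stream.Write(&v, sizeof(v));
    return a_Stream;
}
// File-backed implementation: the only thing it has to provide is Write().
class CFileArchiveO : public IArchiveO {
public:
    explicit CFileArchiveO(const std::string& path)
        : m_File(path.c_str(), std::ios::binary) {}
    void Write(const void* buffer, std::size_t byteCount) override
    {
        m_File.write(static_cast<const char*>(buffer), byteCount);
    }
private:
    std::ofstream m_File;
};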
std::vector allocates memory when inserting data.
So
file.write((char*)&DnaTbl, sizeof(DnaTbl));
only saves the "metadata" of the std::vector<TDNATable> and leaves out the data you inserted, which is stored somewhere else in memory. You would have to iterate over the vector and save the element count and the element data manually.
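A minimal sketch of what that iteration could look like for the struct in the question, using std::string in place of String and plain fstream, purely for illustration:
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>
// Write one count-prefixed vector of length-prefixed strings.
void writeStrings(std::ofstream& out, const std::vector<std::string>& v)
{
    std::uint32_t n = static_cast<std::uint32_t>(v.size());
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    for (const std::string& s : v) {
        std::uint32_t len = static_cast<std::uint32_t>(s.size());
        out.write(reinterpret_cast<const char*>(&len), sizeof(len));
        out.write(s.data(), len);
    }
}
void readStrings(std::ifstream& in, std::vector<std::string>& v)
{
    std::uint32_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof(n));
    v.resize(n);
    for (std::string& s : v) {
        std::uint32_t len = 0;
        in.read(reinterpret_cast<char*>(&len), sizeof(len));
        s.resize(len);
        in.read(&s[0], len);
    }
}
// Repeat for each of the five members of TDNATable, then loop over DnaTbl itself.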

On what platforms will this crash, and how can I improve it?

I've written the rudiments of a class for creating dynamic structures in C++. Dynamic structure members are stored contiguously with (as far as my tests indicate) the same padding that the compiler would insert in the equivalent static structure. Dynamic structures can thus be implicitly converted to static structures for interoperability with existing APIs.
Foremost, I don't trust myself to be able to write Boost-quality code that can compile and work on more or less any platform. What parts of this code are dangerously in need of modification?
I have one other design-related question: Is a templated get accessor the only way of providing the compiler with the requisite static type information for type-safe code? As it is, the user of dynamic_struct must specify the type of the member they are accessing, whenever they access it. If that type should change, all of the accesses become invalid, and will either cause spectacular crashes—or worse, fail silently. And it can't be caught at compile time. That's a huge risk, and one I'd like to remedy.
Example of usage:
struct Test {
char a, b, c;
int i;
Foo object;
};
void bar(const Test&);
int main(int argc, char** argv) {
dynamic_struct<std::string> ds(sizeof(Test));
ds.append<char>("a") = 'A';
ds.append<char>("b") = '2';
ds.append<char>("c") = 'D';
ds.append<int>("i") = 123;
ds.append<Foo>("object");
bar(ds);
}
And the code follows:
//
// dynamic_struct.h
//
// Much omitted for brevity.
//
/**
* For any type, determines the alignment imposed by the compiler.
*/
template<class T>
class alignment_of {
private:
struct alignment {
char a;
T b;
}; // struct alignment
public:
enum { value = sizeof(alignment) - sizeof(T) };
}; // class alignment_of
/**
* A dynamically-created structure, whose fields are indexed by keys of
* some type K, which can be substituted at runtime for any structure
* with identical members and packing.
*/
template<class K>
class dynamic_struct {
public:
// Default maximum structure size.
static const int DEFAULT_SIZE = 32;
/**
* Create a structure with normal inter-element padding.
*/
dynamic_struct(int size = DEFAULT_SIZE) : max(size) {
data.reserve(max);
} // dynamic_struct()
/**
* Copy structure from another structure with the same key type.
*/
dynamic_struct(const dynamic_struct& structure) :
members(structure.members), max(structure.max) {
data.reserve(max);
for (iterator i = members.begin(); i != members.end(); ++i)
i->second.copy(&data[0] + i->second.offset,
&structure.data[0] + i->second.offset);
} // dynamic_struct()
/**
* Destroy all members of the structure.
*/
~dynamic_struct() {
for (iterator i = members.begin(); i != members.end(); ++i)
i->second.destroy(&data[0] + i->second.offset);
} // ~dynamic_struct()
/**
* Get a value from the structure by its key.
*/
template<class T>
T& get(const K& key) {
iterator i = members.find(key);
if (i == members.end()) {
std::ostringstream message;
message << "Read of nonexistent member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
return *reinterpret_cast<T*>(&data[0] + i->second.offset);
} // get()
/**
* Append a member to the structure.
*/
template<class T>
T& append(const K& key, int alignment = alignment_of<T>::value) {
iterator i = members.find(key);
if (i != members.end()) {
std::ostringstream message;
message << "Add of already existing member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
const int modulus = data.size() % alignment;
const int delta = modulus == 0 ? 0 : sizeof(T) - modulus;
if (data.size() + delta + sizeof(T) > max) {
std::ostringstream message;
message << "Attempt to add " << delta + sizeof(T)
<< " bytes to struct, exceeding maximum size of "
<< max << ".";
throw dynamic_struct_size_error(message.str());
} // if
data.resize(data.size() + delta + sizeof(T));
new (static_cast<void*>(&data[0] + data.size() - sizeof(T))) T;
std::pair<iterator, bool> j = members.insert
({key, member(data.size() - sizeof(T), destroy<T>, copy<T>)});
if (j.second) {
return *reinterpret_cast<T*>(&data[0] + j.first->second.offset);
} else {
std::ostringstream message;
message << "Unable to add member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
} // append()
/**
* Implicit checked conversion operator.
*/
template<class T>
operator T&() { return as<T>(); }
/**
* Convert from structure to real structure.
*/
template<class T>
T& as() {
// This naturally fails more frequently if changed to "!=".
if (sizeof(T) < data.size()) {
std::ostringstream message;
message << "Attempt to cast dynamic struct of size "
<< data.size() << " to type of size " << sizeof(T) << ".";
throw dynamic_struct_size_error(message.str());
} // if
return *reinterpret_cast<T*>(&data[0]);
} // as()
private:
// Map from keys to member offsets.
map_type members;
// Data buffer.
std::vector<unsigned char> data;
// Maximum allowed size.
const unsigned int max;
}; // class dynamic_struct
There's nothing inherently wrong with this kind of code. Delaying type-checking until runtime is perfectly valid, although you will have to work hard to defeat the compile-time type system. I wrote a homogenous stack class, where you could insert any type, which functioned in a similar fashion.
However, you have to ask yourself- what are you actually going to be using this for? I wrote a homogenous stack to replace the C++ stack for an interpreted language, which is a pretty tall order for any particular class. If you're not doing something drastic, this probably isn't the right thing to do.
In short, you can do it, and it's not illegal or bad or undefined and you can make it work - but you only should if you have a very desperate need to do things outside the normal language scope. Also, your code will die horrendously when C++0x becomes Standard and now you need to move and all the rest of it.
The easiest way to think of your code is actually a managed heap of a miniature size. You place on various types of object.. they're stored contiguously, etc.
Edit: Wait, you didn't manage to enforce type safety at runtime either? You just blew compile-time type safety but didn't replace it? Let me post some far superior code (that is somewhat slower, probably).
Edit: Oh wait. You want to convert your dynamic_struct, as the whole thing, to arbitrary unknown other structs, at runtime? Oh. Oh, man. Oh, seriously. What. Just no. Just don't. Really, really, don't. That's so wrong, it's unbelievable. If you had reflection, you could make this work, but C++ doesn't offer that. You can enforce type safety at runtime per each individual member using dynamic_cast and type erasure with inheritance. Not for the whole struct, because given a type T you can't tell what the types or binary layout is.
I think the type-checking could be improved. Right now it will reinterpret_cast itself to any type with the same size.
Maybe create an interface to register client structures at program startup, so they may be verified member-by-member — or even rearranged on the fly, or constructed more intelligently in the first place.
#define REGISTER_DYNAMIC_STRUCT_CLIENT( STRUCT, MEMBER ) \
do dynamic_struct::registry< STRUCT >() // one registry obj per client type \
.add( # MEMBER, &STRUCT::MEMBER, offsetof( STRUCT, MEMBER ) ) while(0)
// ^ name as str ^ ptr to memb ^ check against dynamic offset
I have one question: what do you get out of it?
I mean it's a clever piece of code, but:
you're fiddling with memory, so the chances of a blow-up are huge
it's quite complicated too; I didn't get everything, and I would certainly have to ponder it longer...
What I am really wondering is what you actually want...
For example, using Boost.Fusion
struct a_key { typedef char type; };
struct object_key { typedef Foo type; };
typedef boost::fusion::map<
    boost::fusion::pair<a_key, a_key::type>,
    boost::fusion::pair<object_key, object_key::type>
> data_type;
int main(int argc, char* argv[])
{
data_type data;
boost::fusion::at_key<a_key>(data) = 'a'; // compile time checked
}
Using Boost.Fusion you get compile-time reflection as well as correct packing.
I don't really see the need for "runtime" selection here (using a value as key instead of a type) when you need to pass the right type to the assignment anyway (char vs Foo).
Finally, note that this can be automated, thanks to preprocessor programming:
DECLARE_ATTRIBUTES(
mData,
(char, a)
(char, b)
(char, c)
(int, i)
(Foo, object)
)
Not much wordier than a typical declaration, though a, b, etc. will be inner types rather than attribute names.
This has several advantages over your solution:
compile-time checking
perfect compliance with default generated constructors / copy constructors / etc...
much more compact representation
no runtime lookup of the "right" member