How to load a serialized boost::variant? - c++

I'm not able to use boost::serialization because it has library dependencies so I'm trying to figure out a way to do it myself. It doesn't matter if that means copying from boost::serialization.
After reading this answer to a similar question, I had a look at boost/serialization/variant.hpp and found the save() function, which is straightforward and understandable to me.
However the load() function looks more complicated: There is a recursion involving load() and variant_impl<types>::load() and a decremented which parameter.
So apparently the code iterates over each type of the variant in order to convert the int which into a type.
The rest is beyond me.
I know that Boost has a lot of code to make it portable, so maybe there is a less portable but easier way to do this?

If you were to remove the serialization stuff from a copy of boost/serialization/variant.hpp (apart from the Archive template parameter), i.e. throw your own exception types and change e.g.
ar >> BOOST_SERIALIZATION_NVP(which);
// to:
ar >> which;
Then it looks like you should be able to replace Archive with std::ostream or std::istream in the save/load functions, respectively.
Not tried it, but at a glance it looks like it should work.
I guess it does depend on what you are actually using to serialize the data, if not boost::serialization?
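The recursion that the question finds confusing can be sketched in a few lines. This is a minimal sketch, not the Boost implementation: it uses std::variant (C++17) as a stand-in for boost::variant so that it compiles without Boost, writes a whitespace-separated text format, and the names save_variant, load_variant and load_alternative are hypothetical. It assumes every alternative is default-constructible and streamable with << and >>.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Save: write the index of the active alternative, then the value itself.
template <class... Ts>
void save_variant(std::ostream& os, const std::variant<Ts...>& v) {
    os << v.index() << ' ';
    std::visit([&os](const auto& value) { os << value << ' '; }, v);
}

// Load helper: compare the stored index against alternative I.
// On a match, read a value of that type; otherwise recurse to I + 1.
template <std::size_t I = 0, class... Ts>
void load_alternative(std::istream& is, std::size_t which, std::variant<Ts...>& v) {
    if constexpr (I < sizeof...(Ts)) {
        if (which == I) {
            std::variant_alternative_t<I, std::variant<Ts...>> value{};
            is >> value;
            v.template emplace<I>(std::move(value));
            assert(v.index() == I);
        } else {
            load_alternative<I + 1>(is, which, v);
        }
    } else {
        // Ran past the last alternative: the stored index was invalid.
        throw std::runtime_error("variant index out of range");
    }
}

// Load: read the index first, then let the helper turn it into a type.
template <class... Ts>
void load_variant(std::istream& is, std::variant<Ts...>& v) {
    std::size_t which = 0;
    if (!(is >> which)) throw std::runtime_error("missing variant index");
    load_alternative(is, which, v);
}
```

The runtime integer can only be mapped to a compile-time type by testing it against each candidate in turn, which is why the loader has to walk the type list. Note that a std::string containing spaces would not survive this simple text format.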

Related

Setting a property that I can access in the serialize functions when using Cereal serialization library

I'm using the 'Cereal' serialization library (uscilab.github.io/cereal/) to serialize objects that can have millions of numbers, and the meta-data that describes the numbers. In some instances, I do not need the numbers to be serialized, just the meta-data; other times I would like both in the archive.
The only way I could think of to achieve this was to add a boolean property to the OutputArchive class defined in the cereal.hpp file. My thinking is that when I construct the archive, I set this value. Then, when the serialization code runs, any object could access this property and serialize the appropriate values. Most objects would ignore this property, but the objects holding the (potentially) millions of numbers could either ignore the numbers or not, based on the value of this property.
Here is some pseudocode to help explain (derived from the examples on the Cereal website). Creating an archive would look like this:
int main()
{
std::stringstream ss;
{
cereal::BinaryOutputArchive oarchive(ss, true); // I modified the constructor to accept a boolean parameter, and set the property
...
}
...
Then, within the function that serializes my data object (the object that holds metadata and the millions of numbers):
template<class Archive>
void save(Archive& ar) const
{
ar(metadata);
ar(more_meta_data);
bool bArchiveEverything = ar.ArchiveNumbers(); //<<-- this is what I don't know how to accomplish
ar(bArchiveEverything); // put this into the archive, so I know what to expect when deserializing
if (bArchiveEverything) {
ar(bigVectorOfNumbers);
}
}
My questions:
1) Am I going about this all wrong? Is there a simpler more elegant way I'm missing?
2) If not, and this seems reasonable, I'm not sure how I can access my property in the OutputArchive through the 'Archive&' parameter that gets passed into the template functions that Cereal needs for serializing.
Thanks in advance for any help.
I still don't know if this was the best way, so I can't answer my first question.
However, accessing the property didn't end up being that difficult. It turns out that, as long as every class that gets passed into the 'save' function as 'ar' has the same member function, I can call that function just like my pseudo-code function "ArchiveNumbers()". So all I had to do was add that function to the 'OutputArchive' class in Cereal and have it return my property.
I didn't think that would even compile, but I was wrong about that. I'm still trying to wrap my head around template programming. While I got this to work, I certainly can't say this is a 'best practice'.
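The reason this compiles is that a function template only requires its Archive argument to provide the members the body actually uses. A minimal sketch of that idea follows. ToyOutputArchive is a hypothetical stand-in written for this example, not Cereal's real class, and it only counts the fields it receives instead of writing them anywhere.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an output archive (NOT Cereal's API).
class ToyOutputArchive {
public:
    explicit ToyOutputArchive(bool archiveNumbers) : archiveNumbers_(archiveNumbers) {}

    // The flag that save() queries, as in the pseudocode above.
    bool ArchiveNumbers() const { return archiveNumbers_; }

    // "Serialize" a field by counting it.
    void operator()(const std::string&) { ++fields_; }
    void operator()(bool) { ++fields_; }
    void operator()(const std::vector<double>&) { ++fields_; }

    std::size_t fields() const { return fields_; }

private:
    bool archiveNumbers_;
    std::size_t fields_ = 0;
};

struct DataObject {
    std::string metadata;
    std::vector<double> bigVectorOfNumbers;

    // Works with any Archive type that has operator() and ArchiveNumbers().
    template <class Archive>
    void save(Archive& ar) const {
        ar(metadata);
        const bool archiveEverything = ar.ArchiveNumbers();
        ar(archiveEverything);  // stored so that loading knows what to expect
        if (archiveEverything) {
            ar(bigVectorOfNumbers);
        }
    }
};
```

With the flag set, three fields reach the archive; with it cleared, only two. Passing an archive type that lacks ArchiveNumbers() would fail at compile time, at the point where save() is instantiated for that type.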

C++ serialization of data-structures

I'm studying serializations in C++. What's the advantage/difference of boost::serialization if compared to something like:
ifstream_obj.read(reinterpret_cast<char *>(&obj), sizeof(obj)); // read
// or
ofstream_obj.write(reinterpret_cast<char *>(&obj), sizeof(obj)); // write
// ?
and, which one is better to use?
The big advantages of Boost Serialization are:
it actually works for non-trivial (non-POD) data types (C++ is not C)
it allows you to decouple serialization code from archive backend, thereby giving you text, xml, binary serialization
If you use the proper archive you can even have portability (try that with your sample). This means you can send on one machine/OS/version and receive on another without problems.
Lastly, it adds (a) layer(s) of abstraction which make things a lot less error prone. Granted, you could have done the same for your suggested serialization approach without much issue.
Here's an answer that does the kind of serialization you suggest but safely:
How to pass class template argument to boost::variant?
Note that Boost Serialization is fully aware of bitwise serializable types and you can tell it about your own too:
Boost serialization bitwise serializability
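To make the limitation of the raw read/write approach concrete, here is a minimal sketch that restricts it to the types for which it is valid. The helper names write_raw and read_raw and the Point struct are made up for this example. The static_assert rejects types such as std::string or std::vector at compile time, and the resulting bytes are still not portable across machines with a different endianness, padding or type sizes.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Raw byte dump: only meaningful for trivially copyable types.
template <class T>
void write_raw(std::ostream& os, const T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw write is only valid for trivially copyable types");
    os.write(reinterpret_cast<const char*>(&obj), sizeof(obj));
}

// Returns true only if all sizeof(T) bytes were read.
template <class T>
bool read_raw(std::istream& is, T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw read is only valid for trivially copyable types");
    is.read(reinterpret_cast<char*>(&obj), sizeof(obj));
    return is.gcount() == static_cast<std::streamsize>(sizeof(obj));
}

// Example of a type that qualifies.
struct Point {
    int x;
    int y;
};
```

Calling write_raw with a std::string would fail to compile, which is the kind of guard the bare reinterpret_cast version does not give you.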

boost serialization omit version for a wrapper

How can I tell boost that for a particular structure it should not write/read a class "version" identifier?
I am writing some wrapper classes for serializing some types in a smaller fashion (like a variable length integer). If the wrapper gets a class version written the whole point of the size reduction is lost (it'll end up bigger in most cases).
For example, given integer a I'll be replacing this code:
ar & a;
with this:
ar & wrapper(a);
I see the is_wrapper trait, but I can't really find any docs on what that does, or if it might help.
Add
BOOST_CLASS_IMPLEMENTATION(wrapper, boost::serialization::object_serializable)
It's the documented way.
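For context on why the per-class version matters here, this is a minimal sketch of the kind of variable-length integer encoding the question mentions. It is not taken from the question's code: the function names are hypothetical and the format is the common unsigned scheme of 7 payload bits per byte with a continuation bit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode an unsigned value using 7 bits per byte; the high bit of each
// byte signals that more bytes follow.
std::vector<unsigned char> encode_varint(std::uint64_t value) {
    std::vector<unsigned char> out;
    while (value >= 0x80) {
        out.push_back(static_cast<unsigned char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<unsigned char>(value));
    return out;
}

// Decode the value from the start of a byte sequence.
std::uint64_t decode_varint(const std::vector<unsigned char>& bytes) {
    assert(!bytes.empty());
    std::uint64_t value = 0;
    int shift = 0;
    for (unsigned char b : bytes) {
        if (shift >= 64) break;  // malformed input: too many bytes
        value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) break;  // last byte of this value
        shift += 7;
    }
    return value;
}
```

Small values fit in a single byte with this scheme, so any extra per-object bookkeeping written next to them would cancel out the saving.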

What is the best design pattern to register data "chunks"?

I have a library which can save/load "chunks" on disk. Chunks are POD structs with a constant size and a unique static CHUNK_ID field. So load looks something like this.
void Load(int docId, char* ptr, int type, size_t& size)...
If you want to add a new chunk, you just add a struct with a new CHUNK_ID and use the Save/Load functions on it.
What I want is to force all "chunks" to have functions like PrintHumanReadable, CompareThisTypeOfChunk, etc. (ideally the program should not compile without such functions). I also want to mark/register/enumerate all chunk structs.
I have a few ideas but all of them have problems.
Create base class with pure virtual functions PrintHumanReadable, CompareThisTypeOfChunk.
Problem: breaks the POD type and requires rewriting the library.
Implement factory which creates chunk struct from CHUNK_ID. Problem: compiles when I add new chunk without required functions.
Could you recommend an elegant design solution for my problem?
Implement a simple code generator. You can use something like Mako or Cheetah (both Python libraries). Make a text file containing all the class names, then have the generator build the factory method and a series of methods which aren't really used but which refer to the desired methods in all the classes. This will also make it straightforward to enumerate the classes (again, using generated code).
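What such generated code could look like is sketched below. This is only an illustration under assumptions: the two chunk structs, their fields and the helper names RequireChunkInterface, CheckAllChunks and PrintChunk are invented for the example, and the required functions are assumed to be non-virtual members so that the chunks stay POD.

```cpp
#include <cassert>
#include <cstring>
#include <ostream>
#include <sstream>

// Hand-written chunks: plain structs with non-virtual member functions.
struct HeaderChunk {
    static constexpr int CHUNK_ID = 1;
    int version;
    void PrintHumanReadable(std::ostream& os) const { os << "HeaderChunk version=" << version; }
    bool CompareThisTypeOfChunk(const HeaderChunk& o) const { return version == o.version; }
};

struct SizeChunk {
    static constexpr int CHUNK_ID = 2;
    int width;
    int height;
    void PrintHumanReadable(std::ostream& os) const { os << "SizeChunk " << width << "x" << height; }
    bool CompareThisTypeOfChunk(const SizeChunk& o) const { return width == o.width && height == o.height; }
};

// Generated: refers to the required members so that a chunk lacking them
// (or declaring them with another signature) fails to compile.
template <class Chunk>
void RequireChunkInterface() {
    void (Chunk::*print)(std::ostream&) const = &Chunk::PrintHumanReadable;
    bool (Chunk::*compare)(const Chunk&) const = &Chunk::CompareThisTypeOfChunk;
    (void)print;
    (void)compare;
}

// Generated from the list of chunk names; it only has to compile.
inline void CheckAllChunks() {
    RequireChunkInterface<HeaderChunk>();
    RequireChunkInterface<SizeChunk>();
}

// Generated dispatch from a runtime CHUNK_ID to the matching struct.
inline bool PrintChunk(int chunkId, const char* ptr, std::ostream& os) {
    assert(ptr != nullptr);
    switch (chunkId) {
        case HeaderChunk::CHUNK_ID: { HeaderChunk c; std::memcpy(&c, ptr, sizeof c); c.PrintHumanReadable(os); return true; }
        case SizeChunk::CHUNK_ID:   { SizeChunk c;   std::memcpy(&c, ptr, sizeof c); c.PrintHumanReadable(os); return true; }
        default: return false;  // unknown chunk id
    }
}
```

Because the switch and the CheckAllChunks body both come from the same text file of names, a chunk cannot be registered without also being checked, and the list doubles as the enumeration of all chunk types.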
The proper design pattern for this is called "use Boost.Serialization". It's really the best tool for writing objects to a format and then reading them back later. It can write in text, binary, and even XML formats (and others if you write a proper stream for them). It can be non-intrusive, so you don't need to modify the objects to serialize them. And so forth.
Once you're using the proper tool for this job, you can then use whatever class hierarchy or other method you like to ensure that the proper functions for an object exist.
If you can't/won't use Boost.Serialization, then you're pretty much stuck with a runtime solution. And since the solution is runtime rather than compile time, there's no way to ensure at compile time that any particular chunk ID has the requisite functions.

Best way to take a snapshot of an object to a file

What's the best way to output the public contents of an object to a human-readable file? I'm looking for a way to do this that would not require me to know of all the members of the class, but rather use the compiler to tell me what members exist, and what their names are. There have to be macros or something like that, right?
Contrived example:
class Container
{
public:
Container() {/*initialize members*/}
int stuff;
int otherStuff;
};
Container myCollection;
I would like to be able to do something to see output along the lines of "myCollection: stuff = value, otherStuff = value".
But then if another member is added to Container,
class Container
{
public:
Container() {/*initialize members*/}
int stuff;
string evenMoreStuff;
int otherStuff;
};
Container myCollection;
This time, the output of this snapshot would be "myCollection: stuff = value, evenMoreStuff = value, otherStuff = value"
Is there a macro that would help me accomplish this? Is this even possible? (Also, I can't modify the Container class.)
Another note: I'm most interested in potential macros in VS, but other solutions are welcome too.
What you're looking for is "[reflection](http://en.wikipedia.org/wiki/Reflection_(computer_science)#C.2B.2B)".
I found two promising links with a Google search for "C++ reflection":
http://www.garret.ru/cppreflection/docs/reflect.html
http://seal-reflex.web.cern.ch/seal-reflex/index.html
Boost has a serialization library that can serialize into text files. You will, however, not be able to get around knowing what members the class contains. To avoid that you would need reflection, which C++ does not have.
Take a look at this library .
What you need is object serialization or object marshalling. A recurring theme on Stack Overflow.
I'd highly recommend taking a look at Google's Protocol Buffers.
There's unfortunately no macro that can do this for you. What you're looking for is a reflective type library. These can vary from fairly simple to home-rolled monstrosities that have no place in a work environment.
There's no real simple way of doing this, and though you may be tempted to simply dump the memory at an address like so:
char *buffer = new char[sizeof(Container)];
memcpy(buffer, containerInstance, sizeof(Container));
I'd really suggest against it unless all you have are simple types.
If you want something really simple but not complete, I'd suggest writing your own
printOn(ostream &) member method.
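A minimal sketch of that idea follows. Since the question says Container cannot be modified, this version is written as a free function rather than a member, which is an adaptation of the suggestion and not what the answer literally proposes. The members still have to be listed by hand, so adding evenMoreStuff to the class means adding one line here too.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the Container class from the question.
struct Container {
    int stuff;
    std::string evenMoreStuff;
    int otherStuff;
};

// Writes the public members in "name: member = value, ..." form.
// Every member is named explicitly; C++ cannot enumerate them for us.
void printOn(std::ostream& os, const std::string& name, const Container& c) {
    os << name << ": "
       << "stuff = " << c.stuff << ", "
       << "evenMoreStuff = " << c.evenMoreStuff << ", "
       << "otherStuff = " << c.otherStuff;
}
```

Passing a std::ofstream instead of a string stream would write the same snapshot to a human-readable file.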
XDR is one way to do this in a platform independent way.