C++ serialization of data-structures - c++

I'm studying serialization in C++. What are the advantages of boost::serialization, and how does it differ from something like:
ifstream_obj.read(reinterpret_cast<char *>(&obj), sizeof(obj)); // read
// or
ofstream_obj.write(reinterpret_cast<char *>(&obj), sizeof(obj)); // write
// ?
And which one is better to use?

The big advantages of Boost Serialization are:
it actually works for non-trivial (non-POD) data types (C++ is not C)
it allows you to decouple serialization code from archive backend, thereby giving you text, xml, binary serialization
If you use the proper archive you can even have portability (try that with your sample). This means you can send on one machine/OS/version and receive on another without problems.
Lastly, it adds (a) layer(s) of abstraction which make things a lot less error prone. Granted, you could have done the same for your suggested serialization approach without much issue.
Here's an answer that does the kind of serialization you suggest but safely:
How to pass class template argument to boost::variant?
Note that Boost Serialization is fully aware of bitwise serializable types and you can tell it about your own too:
Boost serialization bitwise serializability
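As a rough illustration of what "the same approach, but safely" could look like, here is a minimal sketch that wraps the raw read/write calls from the question and refuses to compile for types that are not trivially copyable. The names write_raw, read_raw, and point are made up for this example; this is not the code from the linked answer, and the bytes it writes are still not portable across machines or compilers.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical helper: dump the object representation of a trivially copyable type.
template <class T>
void write_raw(std::ostream& os, const T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps are only valid for trivially copyable types");
    os.write(reinterpret_cast<const char*>(&obj), sizeof obj);
}

// Hypothetical helper: read the bytes back; returns false if the stream ran short.
template <class T>
bool read_raw(std::istream& is, T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps are only valid for trivially copyable types");
    return static_cast<bool>(is.read(reinterpret_cast<char*>(&obj), sizeof obj));
}

// Example type for the sketch: plain data, no pointers, no virtual functions.
struct point {
    int x;
    int y;
};
```

Calling write_raw on a std::string or on any class with a user-defined copy constructor fails at compile time instead of silently writing pointers to disk, which is the failure mode the first bullet above warns about.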

Related

Generate operator== using Boost Serialization?

Problem: I have a set of classes for which I have already implemented boost serialization methods. Now, I want to add an operator== to one class that contains many of the other classes as its members. This comparison should be straightforward: a deep, member-wise comparison.
Idea: Since the existing serialization methods already tell the compiler everything it needs to know, I wonder if this can be used to generate efficient comparison operators.
Approach 1: The simplest thing would be to compare strings containing serializations of the objects to be compared. The runtime of this approach is probably much slower than handcrafted operator== implementations.
Approach 2: Implement a specialized boost serialization archive for comparisons. However, implementing this is much more complicated and time consuming than implementing either handcrafted operators or approach 1.
I did a similar thing recently for hashing of serializable types:
Hash an arbitrary precision value (boost::multiprecision::cpp_int)
Here I "abused" boost serialization to get a hash function for any Boost Multi Precision number type (that has a serializable backend).
The approach given in the linked answer is strictly more lightweight and much easier to implement than writing a custom archive type. Instead, I wrote a custom Boost Iostreams sink to digest the raw data.
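#include <boost/functional/hash.hpp>       // boost::hash_combine, boost::hash_range
#include <boost/iostreams/categories.hpp>  // boost::iostreams::sink_tag
#include <cstddef>
#include <ios>
namespace io = boost::iostreams;
// Sink that folds every chunk written to it into an externally owned hash seed.
struct hash_sink {
    hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
    typedef char char_type;
    typedef io::sink_tag category;
    std::streamsize write(const char* s, std::streamsize n) {
        boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
        return n;
    }
  private:
    size_t* _ptr;
};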
This is highly efficient because it doesn't store the archive anywhere in the process. (So it sidesteps the dreadful inefficiency of Approach 1)
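The same idea can be sketched with nothing but the standard library, for readers who want to see the moving parts: a std::streambuf that folds every byte into a running hash and stores nothing. This is an illustration under assumptions, not the code from the linked answer: hashing_streambuf and hash_written are hypothetical names, and FNV-1a is swapped in for boost::hash_combine. Because it is fed byte by byte, the digest does not depend on how the writes are chunked.

```cpp
#include <cstdint>
#include <ostream>
#include <streambuf>

// Stream buffer that hashes whatever is written to it and keeps no copy.
class hashing_streambuf : public std::streambuf {
public:
    std::uint64_t digest() const { return hash_; }

protected:
    // Called for single characters (there is no put area, so every sputc lands here).
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            mix(static_cast<unsigned char>(traits_type::to_char_type(ch)));
        return traits_type::not_eof(ch);
    }

    // Called for bulk writes.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        for (std::streamsize i = 0; i < n; ++i)
            mix(static_cast<unsigned char>(s[i]));
        return n;
    }

private:
    // One FNV-1a step (64-bit offset basis and prime).
    void mix(unsigned char byte) {
        hash_ ^= byte;
        hash_ *= 1099511628211ULL;
    }

    std::uint64_t hash_ = 14695981039346656037ULL;
};

// Hypothetical convenience wrapper: run any callable that writes to a
// std::ostream and return the hash of the bytes it produced.
template <class Writer>
std::uint64_t hash_written(Writer write) {
    hashing_streambuf buf;
    std::ostream os(&buf);
    write(os);
    return buf.digest();
}
```

An output archive constructed over such an ostream would feed its bytes straight into the hash, in the same way the hash_sink above does through Boost Iostreams.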
Applying to Equality Comparison
Sadly, you can't easily compare for equality in streaming mode (unless you can be sure that both streams can be consumed in tandem, which is a bit of a finicky assumption).
Instead you would probably use something like boost::iostreams::array_sink or boost::iostreams::back_insert_device to receive the raw data.
If memory usage is a concern you might want to compress it in memory (Boost Iostreams comes with the required filters for zlib/gzip/bzip2 too). But I guess this is not your worry, as you would likely not be trying to avoid duplicating code in that case. You could just implement the comparison directly.
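A minimal sketch of the buffered variant (Approach 1 from the question), using plain std::ostream output as a stand-in for a Boost archive. The types inner and outer, their operator<< overloads, and equal_by_serialization are all assumptions made for the example; with Boost you would stream the object through an archive into the buffer instead of using operator<<.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Example types for the sketch.
struct inner {
    int id;
    std::string name;
};
struct outer {
    inner first;
    std::vector<int> values;
};

// Stand-in serializers. Strings and vectors are length-prefixed so that
// field boundaries stay unambiguous in the byte stream.
std::ostream& operator<<(std::ostream& os, const inner& x) {
    return os << x.id << ' ' << x.name.size() << ':' << x.name;
}
std::ostream& operator<<(std::ostream& os, const outer& x) {
    os << x.first << ' ' << x.values.size();
    for (int v : x.values) os << ' ' << v;
    return os;
}

// Serialize into an in-memory buffer.
template <class T>
std::string serialized(const T& obj) {
    std::ostringstream os;
    os << obj;
    return os.str();
}

// Deep comparison by comparing the two buffers.
template <class T>
bool equal_by_serialization(const T& a, const T& b) {
    return serialized(a) == serialized(b);
}
```

This only gives a meaningful answer if everything that should take part in the comparison is actually serialized, and if equal objects always serialize to identical bytes.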

framework/library for property-tree-like data structure with generic get/set-implementation?

I'm looking for a data structure which behaves similar to boost::property_tree but (optionally) leaves the get/set implementation for each value item to the developer.
You should be able to do something like this:
std::function<int(void)> f_foo = ...;
my_property_tree tree;
tree.register<int>("some.path.to.key", f_foo);
auto v1 = tree.get<int>("some.path.to.key"); // <-- calls f_foo
auto v2 = tree.get<int>("some.other.path"); // <-- some fallback or throws exception
I guess you could abuse property_tree for this but I haven't looked into the implementation yet and I would have a bad feeling about this unless I knew that this is an intended use case.
Writing a class that handles requests like val = tree.get("some.path.to.key") by calling a provided function doesn't look too hard in the first place but I can imagine a lot of special cases which would make this quite a bulky library.
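To make the basic request concrete, here is a minimal sketch of such a class, covering only terminal keys with registered getters. Everything in it is hypothetical (callback_tree, register_getter, the error handling); it is not an existing library. Since register is a reserved word in C++, the sketch names the method register_getter, and it assumes the value type is default-constructible.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

class callback_tree {
public:
    // Associate a path with a getter returning T.
    template <class T>
    void register_getter(const std::string& path, std::function<T()> getter) {
        entries_.erase(path);
        entries_.emplace(path, entry{std::type_index(typeid(T)),
                                     [getter](void* out) { *static_cast<T*>(out) = getter(); }});
    }

    // Look up the path and call its getter.
    // Throws std::out_of_range for unknown paths and std::bad_cast on a type mismatch.
    template <class T>
    T get(const std::string& path) const {
        auto it = entries_.find(path);
        if (it == entries_.end())
            throw std::out_of_range("no getter registered for " + path);
        if (it->second.type != std::type_index(typeid(T)))
            throw std::bad_cast();
        T value{};
        it->second.read(&value);
        return value;
    }

private:
    struct entry {
        std::type_index type;             // the T the getter was registered with
        std::function<void(void*)> read;  // type-erased call that writes into a T*
    };
    std::map<std::string, entry> entries_;
};
```

With this, tree.register_getter<int>("some.path.to.key", f_foo) followed by tree.get<int>("some.path.to.key") calls f_foo, while an unregistered path throws. Subtree forwarding, searching, and automatic type conversion are exactly the parts this sketch leaves out.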
Some extra features might be:
subtree-handling: not only handle terminal keys but forward certain subtrees to separate implementations. E.g.
tree.register("some.path.config", some_handler);
// calls some_handler.get<int>("network.hostname")
v = tree.get<int>("some.path.config.network.hostname");
search among values / keys
automatic type casting (like in boost::property_tree)
"path overloading", e.g. defaulting to a property_tree-implementation for paths without registered callback.
Is there a library that comes close to what I'm looking for? Does anyone have experience using boost::property_tree for this purpose? (E.g. by subclassing or putting special objects into the tree as described here)
After years of coding my own container classes I ended up just adopting QVariantMap. This way it behaves pretty much like Python (and is just as flexible). Just one interface. Not for performance code though.
If you care to know, I really caved in for Qt as my de facto STL because:
Industry standard - used even in avionics and satellite software
It has been around for decades with little interface change (think about long term support)
It has excellent performance, awesome documentation and enormous user base.
Extensive feature set, way beyond the STL
Would an std::map do the job you are interested in?
Have you tried this approach?
I don't quite understand what you are trying to do. So please provide a domain example.
Cheers.
I have some home-cooked code on GitHub that lets you register custom callbacks for each type. It is quite basic and still missing most of the features you would like to have. I'm working on the second version, though. I'm finishing a helper structure that will do most of the job of making callbacks. Tell me if you're interested. Also, you could implement some of those features yourself, as the code to register callbacks is already done. It shouldn't be so difficult.
Using only provided data structures:
First, getters and setters are not native features of C++; you need to call the method one way or another. To get such behaviour you can overload the assignment operator. I assume you want to store POD data in your data structure as well.
So without knowing the type of the data you're "get"ting, the only option I can think of is to use boost::variant. But still, you have some overloading to do, and you need at least one assignment.
You can check out the documentation. It's pretty straight-forward and easy to understand.
http://www.boost.org/doc/libs/1_61_0/doc/html/variant/tutorial.html
Making your own data structures:
Alternatively, as Dani mentioned, you can come up with your own implementation and keep a register of overloaded methods and so on.
Best

Creating a std::iostream adapter

I'd like to create an iostream adapter class which lets me modify the data written to or read from a stream on-the-fly.
The adapter itself should be an iostream to allow true transparency towards third-party code.
Example for a StreamEncoder class derived from std::ostream:
// External algorithm, creates large amounts of log data
int foo(int bar, std::ostream& logOutput);
int main()
{
// The target file
std::ofstream file("logfile.lzma");
// A StreamEncoder compressing the output via LZMA
StreamEncoder lzmaEncoder(file, &encodeLzma);
// A StreamEncoder converting the UTF-8 log data to UTF-16
StreamEncoder utf16Encoder(lzmaEncoder, &utf8ToUtf16);
// Call foo(), but write the log data to an LZMA-compressed UTF-16 file
std::cout << foo(42, utf16Encoder);
}
As far as I know, I need to create a new class derived from basic_streambuf and embed it in a basic_ostream subclass, but that seems to be pretty complex.
Is there any easier way to accomplish this?
Oddly enough, at least as things are really intended to work, none of this should directly involve iostreams and/or streambufs at all.
I would think of an iostream as a match-maker class. An iostream has a streambuf which provides a buffered interface to some sort of external source/sink of data. It also has a locale, which handles all the formatting. The iostream is little more than the playground supervisor that keeps those two playing together nicely (so to speak). Since you're dealing with data formatting, all of this is (or should be) handled in the locale.
A locale isn't monolithic though -- it's composed of a number of facets, each devoted to one particular part of data formatting. In this case, the part you probably care about is the codecvt facet, which is used (almost exclusively) to translate between the external and internal representations of data being read from/written to iostreams.
For better or worse, however, a locale can only contain one codecvt facet at a time, not a chain of them like you're contemplating. As such, what you really need/want is a wrapper class that provides a codecvt as its external interface, but allows you to chain some arbitrary set of transforms to be done to the data during I/O.
For the utf-to-utf conversion, Boost.Locale provides a utf_to_utf function, and codecvt wrapper code, so doing this part of the conversion is simple and straightforward.
Lest anybody suggest that such things be done with ICU, I'll add that Boost.Locale is pretty much a wrapper around ICU, so this is more or less the same answer, but in a form that's much more friendly to C++ (whereas ICU by itself is rather Java-like, and all but overtly hostile to C++).
The other side of things is that writing a codecvt facet adds a great deal of complexity to a fairly simple task. A filtering streambuf (for one example) is generally a lot simpler to write. It's still not as easy as you'd like, but not nearly as bad as a codecvt facet. As @Flexo already mentioned, the Boost iostreams library already includes a filtering streambuf that does zip compression. Doing roughly the same with lzma (or lzh, arithmetic, etc. compression) is relatively easy, at least assuming you have compression functions that are easy to use (you basically just supply them with a buffer of input, and they supply a buffer of results).
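To give a feel for the size of a hand-written filtering streambuf, here is a minimal sketch using only the standard library. It is deliberately simplified: char_filter_buf is a hypothetical name, and it applies a stateless per-character transform, whereas LZMA compression or UTF-8 to UTF-16 conversion would need a buffering filter whose output size differs from its input. The chaining, however, works the same way as in the StreamEncoder example from the question.

```cpp
#include <cctype>
#include <functional>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <utility>

// Output-only stream buffer that transforms each character and forwards it
// to another stream buffer.
class char_filter_buf : public std::streambuf {
public:
    using transform = std::function<char(char)>;

    char_filter_buf(std::streambuf* dest, transform fn)
        : dest_(dest), fn_(std::move(fn)) {}

protected:
    // No put area is set, so every character written arrives here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        return dest_->sputc(fn_(traits_type::to_char_type(ch)));
    }

    // Pass flushes through to the destination.
    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;
    transform fn_;
};

// Two example transforms for the sketch.
inline char to_upper(char c) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}
inline char space_to_underscore(char c) { return c == ' ' ? '_' : c; }
```

Two filters are chained by giving the second one the address of the first as its destination, then wrapping the outermost buffer in a std::ostream that third-party code can write to without knowing about the filters.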

How to load a serialized boost::variant?

I'm not able to use boost::serialization because it has library dependencies so I'm trying to figure out a way to do it myself. It doesn't matter if that means copying from boost::serialization.
After reading this answer to a similar question, I had a look at boost/serialization/variant.hpp and found the save() function, which is straightforward and understandable to me.
However the load() function looks more complicated: There is a recursion involving load() and variant_impl<types>::load() and a decremented which parameter.
So apparently the code iterates over each type of the variant in order to convert the int which into a type.
The rest is beyond me.
I know that boost has lot of code to make it portable so maybe there is a less-portable but easier way to do this?
If you were to remove the serialization stuff from a copy of boost/serialization/variant.hpp (apart from the Archive template parameter), i.e. throw your own exception types and change e.g.
ar >> BOOST_SERIALIZATION_NVP(which);
// to:
ar >> which;
Then it looks like you should be able to replace Archive with std::ostream or std::istream in the save/load functions, respectively.
Not tried it, but at a glance it looks like it should work.
I guess it does depend on what you are actually using to serialize the data if not using boost::serialization?
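For what it's worth, the recursion the question describes can be sketched in a few lines. The following is a simplified illustration and not the Boost code: it assumes C++17 and uses std::variant as a stand-in for boost::variant, plain std::ostream/std::istream instead of an archive, and a whitespace-separated text format (so strings containing spaces would not round-trip). The names save, load, and load_alternative are chosen for the example.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Write the index of the active alternative, then the value itself.
template <class... Ts>
void save(std::ostream& os, const std::variant<Ts...>& v) {
    os << v.index() << ' ';
    std::visit([&os](const auto& value) { os << value << ' '; }, v);
}

// Base case: the type list is exhausted, so the stored index was invalid.
template <class Variant>
void load_alternative(std::istream&, std::size_t, Variant&) {
    throw std::runtime_error("stored variant index out of range");
}

// Recursive case: peel off the first type. When the counter reaches zero,
// First is the type that was saved; otherwise decrement and try the rest.
template <class Variant, class First, class... Rest>
void load_alternative(std::istream& is, std::size_t which, Variant& v) {
    if (which == 0) {
        First value{};
        is >> value;
        v = std::move(value);
    } else {
        load_alternative<Variant, Rest...>(is, which - 1, v);
    }
}

// Read the index, then let the recursion turn it into a type.
template <class... Ts>
void load(std::istream& is, std::variant<Ts...>& v) {
    std::size_t which = 0;
    is >> which;
    load_alternative<std::variant<Ts...>, Ts...>(is, which, v);
}
```

The decremented which parameter is simply a counter: the compiler generates one load_alternative per alternative type, and at run time the chain is walked until the counter hits zero.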

How to mix std::stream and Delphi TStream?

I'm using C++Builder, and am trying to slowly migrate code to using C++ standard library in preference to the Delphi VCL.
The VCL has a streaming architecture based around the TStream class, and I'm switching to the standard library's iostreams (std::istream/std::ostream) instead. However, in the short term, I still need a way of 'mixing' use of the two stream types.
I can do this using intermediate std::stringstream/TStringStream objects, but this seems a little inefficient and cumbersome. Does anyone have a better suggestion?
Edit:
TStream provides similar functionality to std::streams, but is not derived from it. You can create different kinds of streams (TFileStream, TMemoryStream, TStringStream) and read/write data to/from them. See the Embarcadero docwiki TStream reference.
Edit:
Example - Imagine I have a std::ostream that I have written some stuff to, and I now want to append a JPEG Image to it using TJPEGImage.SaveToStream(str : TStream). And, I'll want to read it from a std::istream later...
Maybe you can write an adapter/proxy class similar to the VCL TStreamAdapter which implements an IStream interface for a TStream.
Well, I don't know too much about C++, but I do know how to mix two incompatible classes with similar functionality, and that's with a wrapper class. It looks to me like the base stream classes in the C++ hierarchy are abstract classes that define methods but leave it to the descendants to implement them in different ways. So create a class that descends from iostream (most Delphi streams are two-way) and takes a TStream object in its constructor, and then implements the iostream methods by calling the equivalent methods on the internal TStream object.
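A rough sketch of that wrapper idea follows. Two caveats: in the C++ library the usual extension point is std::streambuf rather than iostream itself, so the sketch derives from std::streambuf and hands it to an ordinary std::iostream; and since the VCL is not available outside C++Builder, ByteStream and MemoryByteStream are hypothetical stand-ins for the Read/Write part of TStream. In real code the buffer would hold a TStream and call its Read and Write methods.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for the Read/Write part of a TStream-like class.
struct ByteStream {
    virtual ~ByteStream() = default;
    virtual int Read(void* buffer, int count) = 0;         // returns bytes read
    virtual int Write(const void* buffer, int count) = 0;  // returns bytes written
};

// In-memory implementation used only to exercise the adapter.
// Writes append at the end; reads start at the beginning (a simplification).
struct MemoryByteStream : ByteStream {
    std::vector<char> data;
    std::size_t pos = 0;

    int Read(void* buffer, int count) override {
        std::size_t n = std::min(static_cast<std::size_t>(count), data.size() - pos);
        if (n > 0) std::memcpy(buffer, data.data() + pos, n);
        pos += n;
        return static_cast<int>(n);
    }
    int Write(const void* buffer, int count) override {
        const char* p = static_cast<const char*>(buffer);
        data.insert(data.end(), p, p + count);
        return count;
    }
};

// Adapter: a std::streambuf that forwards to the wrapped stream.
class byte_stream_buf : public std::streambuf {
public:
    explicit byte_stream_buf(ByteStream& stream) : stream_(stream) {}

protected:
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        return stream_.Write(s, static_cast<int>(n));
    }
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        char c = traits_type::to_char_type(ch);
        return stream_.Write(&c, 1) == 1 ? ch : traits_type::eof();
    }
    // Refill the small get area from the wrapped stream.
    int_type underflow() override {
        int n = stream_.Read(buf_, static_cast<int>(sizeof buf_));
        if (n <= 0) return traits_type::eof();
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(buf_[0]);
    }

private:
    ByteStream& stream_;
    char buf_[256];
};
```

A std::iostream constructed over a byte_stream_buf can then be passed to any code expecting a std::ostream or std::istream, while the bytes end up in (or come from) the wrapped stream. The opposite direction, a TStream descendant that forwards to a std::streambuf, would be needed for calls such as SaveToStream and is not shown here.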