Using boost::archive with boost::iostreams to compress data - c++

I want to write a serialize function for a class that can optionally compress the data. I would like to use the compression facilities provided in boost::iostreams. Does anyone know how to do this?
struct X
{
    X() {}
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & compression;
        if(compression == 0)
        {
            ar & data;
        }
        else if(compression == 1)
        {
            // use boost::iostreams compression
            // facilities to serialize data
        }
    }
    int compression;
    std::vector<int> data;
};

The only way I can see to do that is to compress the data first and then use ar.save_binary when writing and ar.load_binary when reading. To compress the data, you could use a filtering_stream with a std::ostringstream as the sink and an appropriate compression filter.
Any reason you don't want to push the compression down the stack (that is, build your archive over a compressing stream)?

Related

Creating custom boost serialization output archive

I'm trying to use Boost serialization to serialize objects into a buffer. The objects are large (hundreds of MB), so I don't want to use binary_oarchive to serialize them into a std::stringstream and then copy them into their final destination. I have a Buffer class which I would like to use instead.
The problem is, binary_oarchive takes a std::ostream as parameter, and this class's stream operator is not virtual, so I can't make my Buffer class inherit from it to be used by binary_oarchive. Similarly, I haven't found a way to inherit from binary_oarchive_impl that would let me use something other than std::ostream to serialize into.
So I looked into how to create an archive from scratch, here: https://www.boost.org/doc/libs/1_79_0/libs/serialization/doc/archive_reference.html, which I'm putting back here for reference:
#include <cstddef> // std::size_t
#include <boost/archive/detail/common_oarchive.hpp>

/////////////////////////////////////////////////////////////////////////
// class complete_oarchive
class complete_oarchive :
    public boost::archive::detail::common_oarchive<complete_oarchive>
{
    // permit serialization system privileged access to permit
    // implementation of inline templates for maximum speed.
    friend class boost::archive::save_access;

    // member template for saving primitive types.
    // Specialize for any types/templates that require special treatment
    template<class T>
    void save(T & t);

public:
    //////////////////////////////////////////////////////////
    // public interface used by programs that use the
    // serialization library

    // archives are expected to support this function
    void save_binary(void *address, std::size_t count);
};
But unless I misunderstood something, it looks like by defining my archive this way, I need to overload the save method for every single type I want to store, in particular STL types, which kind of defeats the point of using Boost serialization altogether. If I don't define save at all, I'm getting compilation errors indicating that a save function could not be found, in particular for things like std::string and for boost-specific types like boost::archive::version_type.
So my question is: how would you make it possible, with Boost serialization, to serialize in binary format into a custom Buffer object, while retaining all the power of Boost (i.e. not having to redefine how every single STL container and boost type is serialized)?
This is something I've done pretty easily in the past with the Cereal library; unfortunately, I'm stuck with Boost for this particular code base.
Don't create your own archive class. Like the commenter said, use the streambuf interface to your advantage. The upside is that things will work for any of the archive implementations, including binary archives, and perhaps more interestingly things like the EOS Portable Archive implementation.
The streambuf interface can be quite flexible. E.g. I've used it to implement hashing/equality operations:
Generate operator== using Boost Serialization?
In that answer I used Boost Iostreams with its Device concept to make implementing things simpler.
Now if your Buffer type (which you might have shown) has an interface that resembles most buffers (i.e. one or more (void*,size) pairs), you could use existing adapters present in Boost Iostreams. E.g.
Boost: Re-using/clearing text_iarchive for de-serializing data from Asio:receive()
where I show how to use Serialization with a re-usable fixed buffer. Here's the Proof Of Concept:
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/array.hpp>
#include <boost/type_traits/is_integral.hpp>
#include <boost/type_traits/is_pod.hpp>
#include <cassert>
#include <iostream>
#include <sstream>
#include <vector>

namespace bar = boost::archive;
namespace bio = boost::iostreams;

struct Packet {
    int i;
    template <typename Ar> void serialize(Ar& ar, unsigned) { ar & i; }
};

namespace Reader {
    // Deserialize a Packet from a raw byte range
    template <typename T>
    Packet deserialize(T const* data, size_t size) {
        static_assert(boost::is_pod<T>::value     , "T must be POD");
        static_assert(boost::is_integral<T>::value, "T must be integral");
        static_assert(sizeof(T) == sizeof(char)   , "T must be byte-sized");

        bio::stream<bio::array_source> stream(bio::array_source(data, size));
        bar::text_iarchive ia(stream);
        Packet result;
        ia >> result;
        return result;
    }

    // Convenience overloads for common buffer types
    template <typename T, size_t N>
    Packet deserialize(T (&arr)[N]) {
        return deserialize(arr, N);
    }

    template <typename T>
    Packet deserialize(std::vector<T> const& v) {
        return deserialize(v.data(), v.size());
    }

    template <typename T, size_t N>
    Packet deserialize(boost::array<T, N> const& a) {
        return deserialize(a.data(), a.size());
    }
}

// Serialize directly into a fixed, re-usable buffer
template <typename MutableBuffer>
void serialize(Packet const& data, MutableBuffer& buf)
{
    bio::stream<bio::array_sink> s(buf.data(), buf.size());
    bar::text_oarchive ar(s);
    ar << data;
}

int main() {
    boost::array<char, 1024> arr;
    for (int i = 0; i < 100; ++i) {
        serialize(Packet { i }, arr);
        Packet roundtrip = Reader::deserialize(arr);
        assert(roundtrip.i == i);
    }
    std::cout << "Done\n";
}
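If your Buffer cannot be exposed as one contiguous array, the streambuf route can also be written by hand. Here is a minimal sketch using only the standard library; the Buffer struct with its append() member is a hypothetical stand-in for the class in the question, and BufferStreambuf is just an illustrative name. The assumption is that a std::ostream built on such a streambuf can be handed to any archive that accepts a std::ostream.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <vector>

// Hypothetical stand-in for the question's Buffer class.
struct Buffer {
    std::vector<char> bytes;
    void append(const char* p, std::size_t n) {
        bytes.insert(bytes.end(), p, p + n);
    }
};

// A streambuf that forwards everything written to it into a Buffer.
// No put area is set up, so nothing is staged in an intermediate copy.
class BufferStreambuf : public std::streambuf {
public:
    explicit BufferStreambuf(Buffer& target) : target_(target) {}

protected:
    // Bulk writes (e.g. ostream::write) land here.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        target_.append(s, static_cast<std::size_t>(n));
        return n;
    }

    // Single-character writes land here because there is no put area.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            const char c = traits_type::to_char_type(ch);
            target_.append(&c, 1);
        }
        return traits_type::not_eof(ch);
    }

private:
    Buffer& target_;
};
```

Usage would be along the lines of `BufferStreambuf sb(buf); std::ostream os(&sb);` and then constructing the archive over `os`.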

Cereal std::vector serialization

I want to give the cereal serialization library a shot. I am at beginner level in C++ and retired level in Java :)
I'm trying to serialize a vector with it but can't.
Any suggestions?
I have the code below, and it gives an error. What am I missing here?
struct personv1 {
    matrix<float, 0, 1> _descriptor;
    string person_name;
    string notes;
    matrix<rgb_pixel> s_chips;

    template<class Archive>
    void serialize(Archive & archive)
    {
        // serialize things by passing them to the archive
        archive( _descriptor, person_name, notes, s_chips );
    }
};
std::vector<personv1> person;
and:
std::stringstream ss;
{
    cereal::BinaryOutputArchive oarchive(ss); // Create an output archive
    oarchive(person); // Write the data to the archive
} // archive goes out of scope, ensuring all contents are flushed
{
    cereal::BinaryInputArchive iarchive(ss); // Create an input archive
    iarchive(person); // Read the data from the archive
}
But it gives error:
---GUI-master-MAC/cereal/cereal.hpp:462: error: static_assert failed "cereal could not find any output serialization functions for the provided type and archive combination.
Types must either have a serialize function, load/save pair, or load_minimal/save_minimal pair (you may not mix these).
Serialize functions generally have the following signature:
template<class Archive>
void serialize(Archive & ar)
{
ar( member1, member2, member3 );
}
static_assert(traits::detail::count_output_serializers<T, ArchiveType>::value != 0,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Boost Serialization: Transition from versioned class to object_serializable

TLDR: I would like to transition a class serialisation from implementation level object_class_info to object_serializable, keeping compatibility with old data files while always writing files in the new format.
Rationale: When I started using the Boost Serialization library, I didn’t realise that it already came with a header (complex.hpp) to serialise std::complex<double>. Instead, I wrote my own, free-standing serialisation function:
namespace boost { namespace serialization {
    template<class Archive, typename T>
    void serialize(Archive& ar, std::complex<T>& comp, const unsigned int version) {
        ar & reinterpret_cast<T(&)[2]>(comp)[0];
        ar & reinterpret_cast<T(&)[2]>(comp)[1];
    }
} }
This by default enables version and class info tracking, which slows code down quite a bit. The serialization function which comes with Boost is a fair bit faster.
I would now like to transition to using the Boost version always when writing out new data files, but still be able to read in old data files. Reading in new files with an old binary is not an issue.
The problem is that the new serialisation is not versioned (obviously). Further, I don’t even see how I could attempt to read in an archive using the old version of the code and immediately write it out again using the new version, as the deserialisation/serialisation traits are global properties.
What would be the best way to either a) transparently read in old and new files while always writing new files or b) read in an old file and immediately write it out in the new format?
You can keep the old implementation and use it if the file version is "old".
Use the boost version of complex serialization only when saving or when the file version is "new".
This should be trivial if you have an object containing the complex data to serialize, because you can bump the version on the containing object to achieve this.
UPDATE
Sample, using a simple wrapper to invoke the old style of serialization:
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/complex.hpp>
#include <complex>
#include <iostream>
#include <sstream>

// Thin wrapper that selects the old serialization format
template <typename T> struct old_format_wrapper {
    T& wrapped;
    old_format_wrapper(T& w) : wrapped(w) {}
};

template <typename T>
old_format_wrapper<T> old_format(T& w) { return {w}; }

namespace boost { namespace serialization {
    template<class Archive, typename T>
    void serialize(Archive& ar, old_format_wrapper<std::complex<T> >& comp, unsigned) {
        ar & reinterpret_cast<T(&)[2]>(comp.wrapped)[0];
        ar & reinterpret_cast<T(&)[2]>(comp.wrapped)[1];
    }
} }

struct IHaveComplexData {
    std::complex<double> data;

    template <typename Ar> void serialize(Ar& ar, unsigned version) {
        switch(version) {
            case 0: { // old
                    auto wrap = old_format(data);
                    ar & wrap;
                }
                break;
            case 1: // new
            default:
                ar & data; // uses boost serialization
                break;
        }
    }
};

int main() {
    {
        boost::archive::text_oarchive oa(std::cout);
        IHaveComplexData o { { 2, 33 } };
        oa << o;
    }
    {
        std::istringstream iss("22 serialization::archive 13 0 0 0 0 2.00000000000000000e+00 3.30000000000000000e+01");
        boost::archive::text_iarchive ia(iss);
        IHaveComplexData o;
        ia >> o;
        std::cout << o.data;
    }
}
Prints (depending on your boost version):
22 serialization::archive 13 0 0 0 0 2.00000000000000000e+00 3.30000000000000000e+01
(2,33)
Of course, you can now set BOOST_CLASS_VERSION(IHaveComplexData, 1)

How to iterate over archive in boost::serialization

I saved multiple records into a boost::archive::text_oarchive; now I need to extract them.
But because the archive contains multiple records, I would need an iterator.
Something like:
//input archive
boost::archive::text_iarchive iarch(ifs);
//read until the end of file
while (!iarch.eof()){
    //read current value
    iarch >> temp;
    ...//do something with temp
}
Is there any standard way to iterate over the elements of the archive?
I found only iarchive.iterator_type, but is it what I need, and how do I use it?
The iterator type you are looking at actually comes from
class shared_ptr_helper {
    ...
    typedef std::set<
        boost::shared_ptr<const void>,
        collection_type_compare
    > collection_type;
    typedef collection_type::const_iterator iterator_type;
which, I think, is used while loading the archive rather than being an iterator for external use.
If you look at the link http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/index.html under tutorial -> STL Collection you will see the following example:
#include <boost/serialization/list.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};
If that isn't quite what you need then you would probably need to look at overriding load and save as per http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/tutorial.html#splitting
and adding handling as required.
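Serializing a collection essentially gives you a count followed by the elements, so the reader knows how many items to expect and does not have to probe for end-of-file. Here is a minimal sketch of that idea, using plain standard streams as a stand-in for the archives. The names save_records and load_records are hypothetical helpers for illustration, not Boost functions, and int stands in for whatever record type you store.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Write the number of records first, then each record.
void save_records(std::ostream& out, const std::vector<int>& records) {
    out << records.size() << ' ';
    for (int r : records) {
        out << r << ' ';
    }
}

// Read the count back and loop exactly that many times,
// so no end-of-file check on the stream is needed.
std::vector<int> load_records(std::istream& in) {
    std::size_t count = 0;
    in >> count;

    std::vector<int> records;
    records.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        int temp = 0;
        in >> temp;
        records.push_back(temp);
    }
    return records;
}
```

With the archives, the same pattern would be to save the count (or the whole container) with the output archive and read it back first on the loading side.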

Boost Serialization of classes with private data

Is it possible to non-intrusively serialize a class with private data but with public get/set methods using the Boost Serialization library? If not, are there other libraries that are capable of doing this?
Thanks
You can deserialise/serialise to temporary variables, if you have to (the archive doesn't magically know that the variables being serialised into are fields of the class). Adapting the serialise function from the tutorial to assume no direct access to data:
template<class Archive>
void serialize(Archive & ar, gps_position & g, const unsigned int version)
{
    int degrees = g.getDegrees();
    int minutes = g.getMinutes();
    float seconds = g.getSeconds();
    ar & degrees;
    ar & minutes;
    ar & seconds;
    g.setDegrees(degrees);
    g.setMinutes(minutes);
    g.setSeconds(seconds);
}
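To see why the temporaries work in both directions, here is a minimal sketch with mock archives. MockSaver and MockLoader are hypothetical stand-ins that only mimic operator&; they are not Boost classes. The gps_position class is assumed to look roughly like this, with accessor names taken from the snippet above.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Assumed shape of the class: private data, public accessors only.
class gps_position {
public:
    int getDegrees() const { return degrees_; }
    int getMinutes() const { return minutes_; }
    float getSeconds() const { return seconds_; }
    void setDegrees(int d) { degrees_ = d; }
    void setMinutes(int m) { minutes_ = m; }
    void setSeconds(float s) { seconds_ = s; }

private:
    int degrees_ = 0;
    int minutes_ = 0;
    float seconds_ = 0.0f;
};

// Mock archives (NOT Boost classes): they only mimic operator&.
struct MockSaver {
    std::ostream& out;
    template <class T>
    MockSaver& operator&(T& value) { out << value << ' '; return *this; }
};

struct MockLoader {
    std::istream& in;
    template <class T>
    MockLoader& operator&(T& value) { in >> value; return *this; }
};

// Same pattern as above: copy out through getters, pass the copies to
// the archive, then write them back through setters.
template <class Archive>
void serialize(Archive& ar, gps_position& g, const unsigned int /*version*/) {
    int degrees = g.getDegrees();
    int minutes = g.getMinutes();
    float seconds = g.getSeconds();
    ar & degrees;
    ar & minutes;
    ar & seconds;
    // When saving, the temporaries are unchanged, so this is a no-op.
    // When loading, the archive has overwritten them with stored values.
    g.setDegrees(degrees);
    g.setMinutes(minutes);
    g.setSeconds(seconds);
}
```

The cost of this approach is that the setters also run when saving; if they have side effects, splitting serialization into separate save and load functions is probably the cleaner choice.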