Standard serialization protocol to serialize a set of objects to disk - C++

I am looking for a standard protocol that provides the ability to serialize a set of objects (of the same type) to a file, but also provides an easy way to align to an object boundary if the reader/deserializer starts reading from a random byte offset.
After googling I found that Apache Avro provides this functionality using sync markers, but its C++ library does not provide seek functionality, and there is also no native Windows library support for C++.
Are there any other well-known protocols that meet the above requirements?
Possible candidates are Protobuf and Thrift, but after googling it looks like they don't provide seeking capabilities (I might be wrong).
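For what it's worth, the sync-marker technique Avro uses is simple enough to sketch by hand: the writer picks a random 16-byte marker once per file, records it in the file header, and emits it between blocks of records; a reader dropped at an arbitrary byte offset scans forward for the marker and resumes at the next block boundary. A minimal illustration of the idea (the framing below is invented for the sketch, not Avro's actual file format):

```cpp
#include <array>
#include <cstdint>
#include <istream>
#include <ostream>
#include <string>
#include <vector>

// 16-byte marker chosen randomly once per file and stored in the
// header; record payloads are assumed unlikely to contain it.
using SyncMarker = std::array<unsigned char, 16>;

// Write one block of length-prefixed records, then the marker.
void write_block(std::ostream& out, const SyncMarker& sync,
                 const std::vector<std::string>& records) {
    for (const std::string& r : records) {
        std::uint32_t len = static_cast<std::uint32_t>(r.size());
        out.write(reinterpret_cast<const char*>(&len), sizeof len); // toy framing
        out.write(r.data(), static_cast<std::streamsize>(r.size()));
    }
    out.write(reinterpret_cast<const char*>(sync.data()),
              static_cast<std::streamsize>(sync.size()));
}

// After seeking to a random offset, scan forward until the whole
// marker matches; the stream is then positioned at the start of the
// next block. (A KMP-style scan would handle self-overlapping markers
// exactly; with a random marker this simple restart is fine in practice.)
bool resync(std::istream& in, const SyncMarker& sync) {
    std::size_t matched = 0;
    for (int c; (c = in.get()) != std::char_traits<char>::eof(); ) {
        if (static_cast<unsigned char>(c) == sync[matched]) {
            if (++matched == sync.size())
                return true;
        } else {
            matched = (static_cast<unsigned char>(c) == sync[0]) ? 1 : 0;
        }
    }
    return false;
}
```

Protobuf or Thrift can then supply the per-record encoding inside the length-prefixed slots; the resynchronization they lack lives entirely in this outer framing.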

Related

How to define a class and method in DDS idl file?

I am new to DDS; so far I have a little experience with OpenDDS and CycloneDDS.
Is it possible to define a class inside the IDL file, with member variables and member methods? Or are only structures and primitive data types supported by the DDS standards?
The IDL language is defined in the OMG IDL specification. It consists of a number of building blocks that include Core Data Types, like the structures and primitive data types that you mentioned, and Interfaces, which include the methods you asked about.
However, only a subset of those building blocks is used by DDS. For the current version 4.2, section 9.3 DDS Profiles defines which of them are relevant for three different levels of support by DDS: Plain DDS, Extensible DDS and DDS over RPC.
You will see that the latter indeed includes Building Block Interfaces - Basic, as you might expect from RPC. However, not all DDS implementations support RPC. Plain DDS and Extensible DDS are more commonly supported and interfaces are not part of that functionality.
Since you asked about this in another question: note that the interface functionality as captured in DDS over RPC is not for the purpose of distributing objects with their methods, but for invoking methods on objects remotely -- as the name Remote Procedure Call implies.
Another answer to your question is that you are, perhaps, asking the follow-up question as if it is the initial one. There are many different ways of building distributed systems, and given your question, three examples seem appropriate:
those designed around remote procedure calls/remote method invocation: in this context, CORBA is the perfect reference, but there are many (RPC, gRPC, DCOM, you name it);
those designed around shipping objects with their implementation across: one example is Java/JINI, but there are many others (JavaScript in a browser could be considered one);
those designed around shipping state (a.k.a. plain old data) and adding/transforming that state: SPLICE in the ancient history, DDS today.
Your question suggests that you are looking for middleware for doing distributed object computing. If that's indeed what you are looking for, DDS is a very suboptimal choice. Yes, RPC can be built on top of it (RPC-over-DDS simply makes that a bit easier), and in a system predominantly built around distributing state it makes sense to do so.
If you can serialise objects with their methods then of course you can use DDS to distribute them in the network (there are fun things you can do that way). However, that's more a function of the programming language you use than of the middleware and IDL won't help you with that.

Serialize/Deserialize C++ classes

I'm looking for a way to send a C++ class between two client applications.
All I can find is that I need to write serialize/deserialize functions (to JSON, for example) for each class and send the result over TCP/IP.
The main problem I'm facing is that I have ~600 classes (some containing instances of others) that I need to pass, which means I would need to spend the foreseeable future writing serialize/deserialize functions.
Is there any generic way of writing serialize/deserialize functions?
Is there any other way of sending C++ classes?
Thanks,
Guy Ergas.
Are you using a framework at all? Qt and MFC, for example, have built-in serialization that would make your task easier. Otherwise I would guess you'd need to spend at least some effort on each of the 600 classes.
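If Qt is an option, QDataStream plus a pair of stream operators per class is the built-in route; a rough sketch (the Stats class and the version pinning are illustrative, not prescribed):

```cpp
#include <QByteArray>
#include <QDataStream>
#include <QIODevice>

struct Stats {              // stand-in for one of the ~600 classes
    qint32 id = 0;
    double value = 0.0;
};

QDataStream& operator<<(QDataStream& s, const Stats& v) {
    return s << v.id << v.value;
}

QDataStream& operator>>(QDataStream& s, Stats& v) {
    return s >> v.id >> v.value;
}

int main() {
    QByteArray bytes;
    QDataStream out(&bytes, QIODevice::WriteOnly);
    out.setVersion(QDataStream::Qt_5_15); // pin the format between builds
    out << Stats{42, 3.14};

    Stats back;
    QDataStream in(bytes);                // read-only stream over the buffer
    in.setVersion(QDataStream::Qt_5_15);
    in >> back;
    return back.id == 42 ? 0 : 1;
}
```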
As recommended above, Boost Serialization is probably a good way to go; you can send the serialized class over TCP using Boost Asio too:
http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio.html
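The pattern is one serialize() template per class rather than separate read and write functions, which roughly halves the per-class work; a minimal sketch (the Stats class is just an example):

```cpp
#include <sstream>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>

struct Stats {
    int id = 0;
    double value = 0.0;

    // One member handles both saving and loading; Boost picks the
    // direction from the archive type.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & id;
        ar & value;
    }
};

int main() {
    const Stats original{42, 3.14};
    std::ostringstream os;
    {
        boost::archive::text_oarchive oa(os);
        oa << original;
    }

    Stats back;
    std::istringstream is(os.str());
    {
        boost::archive::text_iarchive ia(is);
        ia >> back;
    }
    return back.id == 42 ? 0 : 1;
}
```

Nested classes compose: a member whose type also has a serialize() is written with `ar & member` just the same, which is what makes the ~600-class case tractable.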
Alternatively, there is a C++ API for Google Protocol Buffers (protobuf):
https://developers.google.com/protocol-buffers/docs/reference/cpp/
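With protobuf you describe each message once in a .proto file and protoc generates the C++ class with its encode/decode methods; roughly (assuming a hypothetical stats.proto defining `message Stats { int32 id = 1; double value = 2; }`):

```cpp
#include <string>
#include "stats.pb.h"  // hypothetical header generated by protoc

int main() {
    Stats out;                       // generated class
    out.set_id(42);
    out.set_value(3.14);

    std::string bytes;
    out.SerializeToString(&bytes);   // compact binary wire format

    Stats in;
    in.ParseFromString(bytes);       // readable from any protobuf language
    return in.id() == 42 ? 0 : 1;
}
```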
Boost Serialization
Although I haven't used it myself, it is very popular among my peers at work.
More info about it can be found in "Boost (1.54.00) Serialization"
Thrift
Thrift has fairly limited serialization functionality, which I don't think fits your requirements. But it can help you "move" the data from one client to another even if they are using different languages.
More info about it can be found in "Thrift: The Missing Guide"
try s11n or nosjob
s11n (an abbreviation for serialization) is an Open Source project focused on the generic serialization of objects (i.e., object persistence) in the C++ programming language.
nosjob is a C++ library for generating and consuming JSON data.
You may be interested in ASN.1. It's not necessarily the easiest to use and tools/libraries are a little hard to come by (Objective Systems at http://www.obj-sys.com/index.php is worth a look, though not free).
However, the big advantage is that it is very heavily standardised (so no trouble with library version incompatibilities) and most languages are supported one way or another. That's handy if you need support across multiple platforms. It also does binary encodings, so it's far less bloated than XML (which it also supports). I chose it for these reasons and didn't regret it.
If you are on a Linux platform, you can directly use the json.h library for serialization.
Here is sample code I have come across :)
Json Serializer

Thrift or Protocol Buffers as a cross-language serialization solution?

I've already chosen to use Thrift as the RPC framework in a project. This project has a lot of serialization/deserialization operations (e.g., storing the data to disk), and the serialized format should be accessible from at least C++/Java/Python. It seems that Thrift's serialization solution is more complicated than Protobuf's (e.g., it needs to create a protocol before serializing an object).
So my question is: is it worth using Protobuf for the serialization/deserialization part even if Thrift is capable of this task?
I would agree that Thrift is a better choice for cross-language RPC than Protobuf RPC (see http://pjklauser.wordpress.com/2013/02/27/why-googles-protobuf-rpc-will-not-reach-widespread-adoption/ ). If you're using Thrift already, it's difficult to justify using a different library for serialization to file/storage: you'd need to write endless mapping code, and the two libraries would have different maintenance cycles that you'd have to track independently, adding future effort. The cost of writing a line or two more code, or saving a byte or two of space, or a microsecond of CPU time, is nothing compared to that additional effort.
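For completeness, the "create a protocol first" step the question mentions amounts to only a few lines in C++; a sketch assuming a Thrift-generated Stats struct:

```cpp
#include <memory>
#include <string>
#include <thrift/protocol/TBinaryProtocol.h>
#include <thrift/transport/TBufferTransports.h>
#include "gen-cpp/stats_types.h"  // hypothetical thrift-generated header

using apache::thrift::protocol::TBinaryProtocol;
using apache::thrift::transport::TMemoryBuffer;

int main() {
    auto buffer   = std::make_shared<TMemoryBuffer>();
    auto protocol = std::make_shared<TBinaryProtocol>(buffer);

    Stats out;                 // generated struct
    out.id = 42;
    out.write(protocol.get()); // serialize through the protocol

    std::string bytes = buffer->getBufferAsString(); // store to disk, etc.

    Stats in;
    in.read(protocol.get());   // deserialize from the same buffer
    return in.id == 42 ? 0 : 1;
}
```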

C/C++ FLAC tagging library

Is there any C/C++ FLAC tagging library that works on streams? Wherever I look I only find ones that work on files. That seems odd to me: why use something as limited as a file instead of a more abstract stream? Well, maybe I'm just spoiled by managed languages' neatness (I'm more of a Java guy, but this time I need an unmanaged-code solution).
I'm not familiar with any FLAC libraries, but the reference FLAC library supports an interface for custom I/O. This allows you to write a small stub that will convert I/O calls to a custom data source, which needn't be a file.
It seems to require the capacity to seek, though. If that is the case, then you might not be able to wrap a socket without a higher-level protocol that allows you to seek.
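Concretely, libFLAC's metadata interface accepts a FLAC__IOCallbacks struct (from &lt;FLAC/callback.h&gt;) whose members mirror fread/fseek and pull bytes from whatever handle you supply. An abbreviated sketch; the memory-backed source is invented for illustration, and the seek/tell/eof stubs elided here would need filling in, given the seeking requirement noted above:

```cpp
#include <FLAC/metadata.h>
#include <cstring>

struct MemStream {                 // custom non-file data source
    const unsigned char* data;
    std::size_t size;
    std::size_t pos;
};

// fread-style callback: libFLAC pulls bytes through this stub.
static std::size_t mem_read(void* ptr, std::size_t size, std::size_t nmemb,
                            FLAC__IOHandle handle) {
    MemStream* s = static_cast<MemStream*>(handle);
    std::size_t want = size * nmemb;
    if (want > s->size - s->pos)
        want = s->size - s->pos;
    std::memcpy(ptr, s->data + s->pos, want);
    s->pos += want;
    return size ? want / size : 0;
}

bool read_tags(MemStream* s) {
    FLAC__IOCallbacks cb = {};
    cb.read = mem_read;
    // cb.seek / cb.tell / cb.eof need equivalent stubs over MemStream.

    FLAC__Metadata_Chain* chain = FLAC__metadata_chain_new();
    FLAC__bool ok = FLAC__metadata_chain_read_with_callbacks(chain, s, cb);
    // ...walk the chain with a FLAC__Metadata_Iterator for VORBIS_COMMENT...
    FLAC__metadata_chain_delete(chain);
    return ok;
}
```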

Is it safe to use boost serialization to serialize objects in C++ to a binary format for use over a socket?

I know that you can use Boost serialization to serialize to a text format and then push it over a socket, but I'd like to serialize a class of statistics data into a binary format (both for size and encoding/decoding overhead reasons). Is it safe to use Boost serialization for this?
My specific worries are:
Differences between integer type sizes on different platforms (mainly 32-bit vs 64-bit).
Though I can largely get around this by using exactly-sized integers from <cstdint>, I'd still like to understand the behavior.
Differences in endianness between systems: does Boost serialize into a standard endianness (e.g., network byte order) and then deserialize using the host's endianness?
It's a very nice library, but unfortunately the documentation on its binary capabilities is somewhat limited, so I just want to make sure that using it this way would be safe.
No, in general boost binary serialization is not machine-independent. See here.
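To make the concern concrete: the default binary archives essentially write the host's native representation, so when you need a portable binary format over sockets, the usual workaround if you control both ends is to pin sizes and byte order yourself. A minimal sketch of that manual route (not Boost-specific):

```cpp
#include <cstdint>
#include <vector>

// Append a 32-bit value in big-endian ("network") byte order,
// independent of the host's endianness or native int width.
void put_u32(std::vector<unsigned char>& out, std::uint32_t v) {
    out.push_back(static_cast<unsigned char>(v >> 24));
    out.push_back(static_cast<unsigned char>(v >> 16));
    out.push_back(static_cast<unsigned char>(v >> 8));
    out.push_back(static_cast<unsigned char>(v));
}

std::uint32_t get_u32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}
```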
I've been hearing a lot about Google's protobuf. It has C and C++ bindings.
You should check out Apache Thrift. It was designed by Facebook for cross-platform serialization/deserialization.