Serialization doubts - c++

I have to create an application layer protocol for a C++ application, but I have some doubts about how I can do it, especially about the serialization:
My idea is to create a class for describing the header, something like this:
class Header {
int type;
int length;
char[] message;
}
Now, in order to serialize it and to pass it through a socket, I'm thinking about using Boost Serialization. But my question is: is it "cross-platform"? In the sense that if I want to receive the data into a Python/Ruby/any-other-language server with its own class, can I do it or not (since I've serialized a C++ class)?
If not, is useful to serialize the class data into a JSON/XML file and transmit it?
If I want to serialize an object into a string, Does I have to pay attention to the big/little-endian and/or the string encoding and/or other details?
Since, not all the machines using the same number of bytes to define, for example, the primitive data types, is it necessary to use something like uint32_t data types to force the system to use a certain amount of bytes?
Thank you very much!

I think you should check this answer first.
Serializing into a C-style string is fine as far as big/little endian woes go, but is not as good performance-wise.
Using C-style string serialization will mostly solve this problem as well. On read side you should make sure that when parsing numbers they don't exceed local data size.
If this is a serious project then maybe consider using JSON or XML.

Related

Serialize a mapi message in c++

What I am trying to do is to convert a mapimessage object (LPMESSAGE) into a transferable format let's say to serialize it into bytes ( I prefer this approach), or to xml format. What is the best practice in this case? And how to do it? Is there a library to do so?
NOTE: I can convert lpmessage into mime and back, but I noticed it loses much of its properties when I use the iconversionsession.
If you have access to these classes' internals (and I belive you do), I would advise you to go through this FAQ on serialization: https://isocpp.org/wiki/faq/serialization. There's everything you would want to know on this topic. (hint: if you don't then you can always derive from them and extend their interface with additional, serialization method.)
If you are not interested in implementing your own solution, you could try 3rd party, like boost serialization lib: http://www.boost.org/doc/libs/1_60_0/libs/serialization/doc/index.html
You can convert the IMessage to a .MSG file. While not a perfect process, most properties are preserved, most of the time, and Outlook can open these serialized messages so they're easier to use (and verify). Search for code samples for OpenIMsgOnIStg.
If you don't want to use MSG as your serialization format, then you'll have to roll your own. IMAPIProp objects are just property bags with numeric ID's, but all of the different types of property values will have to be persisted differently.

Best way of serializing/deserializing a simple protocol in C++

I want to build a simple application protocol using Berkeley sockets in C++ using on Linux. The transport layer should be UDP, and the protocols will contain the following two parts:
The first part:
It is a fixed part which is the protocol Header with the following fields:
1. int HeaderType
2. int TransactionID
3. unsigned char Source[4]
4. unsigned char Destination[4]
5. int numberoftlvs
The second part
It will contain variable number of TLVs, each TLV will contain the following fields:
1. int type
2. int length
3. unsigned char *data "Variable length"
My question is for preparing the message to be sent over the wire,what's the best way to do serialization and deserialization, to be portable on all the systems like little Endian and big Endian?
Should I prepare a big buffer of "unsigned char", and start copying the fields one by one to it? And after that, just call send command?
If I am going to follow the previous way, how can I keep tracking the pointer to where to copy my fields, my guess would be to build for each datatype a function which will know how many bytes to move the pointer, correct?
If someone can provide me with a well explained example ,it will much appreciated.
some ideas... in no particular order... and probably not making sense all together
You can have a Buffer class. This class contains raw memory pointer where you are composing your message and it can contains counter or pointers to keep track of how much have you written, where you're writing and how far can you go.
Probably you would like to have one instance of the Buffer class for each thread reading/writing. No more, because you don't want to have expensive buffers like this around. Bound to a specific thread because you don't want to share them without locking (and locking is expensive)
Probably you would like to reuse a Buffer from one message to the next, avoiding the cost of creating and destroying it.
You might want to explore the idea of a Decorator a class that inherits or contains each of your data classes. In this case they idea for this decorator is to contain the methods to serialize and deserialize each of your data types.
One option for this is to make the Decorator a template and use class template specialization to provide the different formats.
Combining the Decorator methods and Buffer methods you should have all the control you need.
You can have in the Buffer class magical templated methods that have an arbitary object as parameter and automatically creates a Decorator for it and serializes.
Connversely, deserializing should get you a Decorator that should be made convertible to the decorated type.
I'm sorry I don't have the time right now to give you a full blown example, but I hope the above ideas can get you started.
As an example I shamelessly plug my own (de)serialization library that's packing into msgpackv5 format:
flurry

C++ way to serialize a message?

In my current project I have a few different interfaces that require me to serialize messages into byte buffers. I feel like I'm probably not doing it in a way that would make a true C++ programmer happy (and I'd like to).
I would typically do something like this:
struct MyStruct {
uint32_t x;
uint64_t y;
uint8_t z[80];
};
uint8_t* serialize(const MyStruct& s) {
uint8_t* buffer = new uint8_t[sizeof(s)];
uint8_t* temp = buffer;
memcpy(temp, &s.x, sizeof(s.x));
temp += sizeof(s.x);
//would also have put in network byte order...
... etc ...
return buffer;
}
Excuse any typos, that was just an example off the top of my head. Obviously it can get more complex if the structure I'm serializing has internal pointers.
So, I have two questions that are closely related:
Is there any problem in the specific scenario above with serializing by casting the struct directly to a char buffer assuming I know that the destination systems are in the same endianness?
Main question: Is there a better... erm... C++? way to do this aside from the addition of smart pointers? I feel like this is such a common problem that the STL probably handles it - and if it doesn't, I'm sure there's a better way to do it anyway using C++ mechanisms.
EDIT Bonus points if you can do a clean example of serializing this structure in a better way using standard C++/STL without added libraries.
You might want to take a look at Google Protocol Buffers (also known as protobuf). You define your data in a language neutral IDL and then run it through a generator to generate your C++ classes. It'll take care of byte ordering issues and can provide a very compact binary form.
By using this you'll not only be able to save your C++ data but it'll be useable in other languages (C#, Java, Python etc) as there are protobuf implementation available for them.
You should probable use either Boost::serialization or streams directly. More information in either of the links to the right.
Is it possible to serialize and deserialize a class in C++?
just voted AzPs reply as the answer, checking Boost first is the way to go.
in addition about your code sample:
1 - changing the signature of your serialization function to a method taking a file:
void MyStruct::serialize(FILE* file) // or stream
{
int size = sizeof(this);
fwrite(&size, sizeof(int), 1, file); // write size
fwrite(this, 1, size, file); // write raw bytes of struct
}
reduces the necessity to copy the struct.
2 - yes, your code makes the serialized bytes dependent on your platform, compiler and compiler settings. this is not good or bad, if the same binary writes and reads the serialized bytes, this might be beneificial because of simplicity and performance. But it is not only endianness, also packing and struct layout affects compatibility. For instance a 32bit or a 64bit version of your app will change the layout of our struct quite for sure. Finally serializing the raw footprint also serializes padding bytes - the bytes the compiler might put between struct fields, an overhead undesirable for high traffic network streams (see google protocol buffers as they hunt every bit they can save).
EDIT:
i see you added "embedded". yes, then such simple serialize / deserialize (mirror implementation of the above serialize) methods might be a good and simple choice.

Passing raw data in C++

Up until now, whenever I wanted to pass some raw data to a function (like a function that loads an image from a buffer), I would do something like this:
void Image::load(const char* buffer, std::size_t size);
Today I took a look at the Boost libraries, more specifically at the property_tree/xml_parser.hpp header, and I noticed this function signature:
template<typename Ptree>
void read_xml(std::basic_istream<typename Ptree::key_type::value_type>&,
Ptree &, int = 0);
This actually made me curious: is this the correct way to pass around raw data in C++, by using streams? Or am I misinterpreting what the function is supposed to be used for?
If it's the former, could you please point me to some resource where I can learn how to use streams for this? I haven't found much myself (mainly API references), and I have't been able to find the Boost source code for the XML parser either.
Edit: Some extra details
Seems there's been some confusion as to what I want. Given a data buffer, how can I convert it to a stream such that it is compatible with the read_xml function I posted above? Here's my specific use case:
I'm using the SevenZip C library to read an XML file from an archive. The library will provide me with a buffer and its size, and I want to put that in stream format such that it is compatible with read_xml. How can I do that?
Well, streams are quite used in C++ because of their conveniences:
- error handling
- they abstract away the data source, so whether you are reading from a file, an audio source, a camera, they are all treated as input streams
- and probably more advantages I don't know of
Here is an overview of the IOstream library, perhaps that might better help you understand what's going on with streams:
http://www.cplusplus.com/reference/iostream/
Understanding what they are exactly will help you understand how and when to use them.
There's no single correct way to pass around data buffers. A combination of pointer and length is the most basic way; it's C-friendly. Passing a stream might allow for sequential/chunked processing - i. e. not storing the whole file in memory at the same time. If you want to pass a mutable buffer (that might potentially grow), a vector<char>& would be a good choice.
Specifically on Windows, a HGLOBAL or a section object handle might be used.
The C++ philosophy explicitly allows for many different styles, depending on context and environment. Get used to it.
Buffers of raw memory in C++ can either be of type unsigned char*, or you can create a std::vector<unsigned char>. You typically don't want to use just a char* for your buffer since char is not guaranteed by the standard to use all the bits in a single byte (i.e., this will end up varying by platform/compiler). That being said, streams have some excellent uses as well, considering that you can use a stream to read bytes from a file or some other input, etc., and from there, store that data in a buffer.
Seems there's been some confusion as to what I want. Given a data buffer, how can I convert it to a stream such that it is compatible with the read_xml function I posted above?
Easily (I hope PTree::Key_type::value_type would be something like char):
istringstream stream(string(data, len));
read_xml(stream, ...);
More on string streams here.
This is essentially using a reference to pass the stream contents. So behind the scene's it's essentially rather similar to what you did so far and it's essentially the same - just using a different notation. Simplified, the reference just hides the pointer aspect, so in your boost example you're essentially working with a pointer to the stream.
References got the advantage avoiding all the referencing/dereferencing and are therefore easier to handle in most situations. However they don't allow you multiple levels of (de-)referencing.
The following two example functions do essentially the same:
void change_a(int &var, myclass &cls)
{
var = cls.convert();
}
void change_b(int *var, myclass *cls)
{
*var = cls->convert();
}
Talking about the passed data itself: It really depends on what you're trying to achieve and what's more effective. If you'd like to modify a string, utilizing an object of class std::string might be more convenient than using a classic pointer to a buffer (char *). Streams got the advantage that they can represent several different things (e.g. data stream on the network, a compressed stream or simply a file or memory stream). This way you can write single functions or methods that accept a stream as input and will instantly work without worrying about the actual stream source. Doing this with classic buffers can be more complicated. On the other side you shouldn't forget that all objects will add some overhead, so depending on the job to be done a simple pointer to a character string might be perfectly fine (and the most effective solution). There's no "the one way to do it".

How Do I Serialize and Deserialize an Object Containing a container of abstract objects in c++?

Im trying to text serialize and deserialize an object containing a container of abstract objects in c++,does somebody know of a code example of the above?
Take a look at boost::serialize.
It contains methods to assist in the serialization of containers (link loses frame on left).
Of course, don't just skip to that page, you'll want to read the whole thing. :)
Unlike other languages, C++ doesn't come with this kind of serialization "baked in." You're going to want to use a library. Such as Boost.Serialization, Google Protocol Buffers (can be a file format) or Apache Thrift.
You could create a method for your abstract class called:
virtual void serialize(char *out, int outLen) = 0;
.. and in turn a static deserializer:
AbstractClass deserialize(char *serializedString, int strLen);
In your deserializer, you could have different strategies to deserialize the right subclass of the abstract class.
Hey I asked a similar question a little while back. Have a look at dribeas's answer it was particularly good. This method allows the addition of new objects of the abstract type will little manipulation of existing code (ie. we can serialize them without adding additional switch/else if options to our deserializer).
Best Practice For List of Polymorphic Objects in C++