c++ serialization options


I'm working in a project where I'm writing a plugin for a particular package.
This package implements a "new" method in one of its headers, and as such, I am unable to include <strstream> as it also implements "new".
The package SDK also includes a stripped-down and very old version of Boost, which means that I can't use the Boost serialization classes. The SDK is built on Qt for VS2008, but this project is required to use VS2005, so I can't include Qt either.
I need to receive data from an externally running application that sends it over TCP/IP. What is the best way to serialize the data at the source and read it back in, given these limitations?
I'm currently tempted to make a struct which could contain all possible data that might be sent over, and then just copying the memory of that struct into a block of bytes which gets sent over, but this sounds like a bad approach to me.
Thanks,
Liron

Google Protobuf

Boost.Serialization is the way to go here. It is the most comprehensive serialization library for C++ that I know of, and it comes with support for standard containers.

You can extract the bytes from your data and pass them around. Here is a basic example:
QByteArray vectorToBin(const QVector<qint32> &vec)
{
    const int size = sizeof(qint32);
    QByteArray result;
    foreach (qint32 e, vec) {
        // Append the bytes of each element, least significant byte first.
        for (int n = 0; n < size * 8; n += 8) {
            result.append(static_cast<char>((e >> n) & 0xff));
        }
    }
    return result;
}
QVector<qint32> binToVector(const QByteArray &bytes)
{
    const int size = sizeof(qint32);
    QVector<qint32> result;
    // Rebuild each element from its bytes; an incomplete trailing group is ignored.
    for (int i = 0; i + size <= bytes.size(); i += size) {
        quint32 e = 0;
        for (int b = 0; b < size; ++b) {
            e |= static_cast<quint32>(static_cast<unsigned char>(bytes[i + b])) << (8 * b);
        }
        result << static_cast<qint32>(e);
    }
    return result;
}
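Since Qt is not available in the asker's setup, the same idea can be sketched with only the standard library. This is an illustration rather than part of the answer above: the names pack_int32s and unpack_int32s are made up, the little-endian byte order is an arbitrary choice that both ends must agree on, and a compiler that ships <cstdint> is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append each value as 4 bytes, least significant byte first.
std::vector<unsigned char> pack_int32s(const std::vector<std::int32_t>& values)
{
    std::vector<unsigned char> out;
    out.reserve(values.size() * 4);
    for (std::size_t i = 0; i < values.size(); ++i) {
        const std::uint32_t u = static_cast<std::uint32_t>(values[i]);
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<unsigned char>((u >> shift) & 0xffu));
        }
    }
    return out;
}

// Hypothetical helper: rebuild the values from groups of 4 bytes.
// Precondition (checked in debug builds only): the input holds whole groups.
// Validate the length separately before calling this on untrusted input.
std::vector<std::int32_t> unpack_int32s(const std::vector<unsigned char>& bytes)
{
    assert(bytes.size() % 4 == 0);
    std::vector<std::int32_t> out;
    out.reserve(bytes.size() / 4);
    for (std::size_t i = 0; i + 4 <= bytes.size(); i += 4) {
        std::uint32_t u = 0;
        for (int b = 0; b < 4; ++b) {
            u |= static_cast<std::uint32_t>(bytes[i + b]) << (8 * b);
        }
        out.push_back(static_cast<std::int32_t>(u));
    }
    return out;
}
```

Because the byte order is fixed by the shifts rather than by the machine, the same bytes decode to the same values on little-endian and big-endian hosts.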

Related

Serialize openDDS topic to a `std::string`

I'm using OpenDDS in a project. Now, for interoperability, we also need to send topics to other machines through a custom framework. Since this custom framework can send strings, I'd like to serialize the topics into a string and then send them.
I was using boost::serialization, but then it occurred to me that, in order to send a topic, OpenDDS must already be able to serialize it in some way, so I should be able to find the corresponding function and use it to serialize the data.
Inspecting the code I was able to find the overload of >>= and <<= operators:
void
operator<<= (
::CORBA::Any &_tao_any,
BasicType::LocalForceDataDataReader_ptr _tao_elem)
{
BasicType::LocalForceDataDataReader_ptr _tao_objptr =
BasicType::LocalForceDataDataReader::_duplicate (_tao_elem);
_tao_any <<= &_tao_objptr;
}
/// Non-copying insertion.
void
operator<<= (
::CORBA::Any &_tao_any,
BasicType::LocalForceDataDataReader_ptr *_tao_elem)
{
TAO::Any_Impl_T<BasicType::LocalForceDataDataReader>::insert (
_tao_any,
BasicType::LocalForceDataDataReader::_tao_any_destructor,
BasicType::_tc_LocalForceDataDataReader,
*_tao_elem);
}
It serializes the topic into a CORBA::Any. It seems to work, but now I need to send the content of the CORBA::Any. Is there a way to put the content of a CORBA::Any into a string and retrieve its data from a string? In other words, how can I serialize and deserialize a CORBA::Any?
Or is there a better way to serialize an OpenDDS topic to a string?

It's possible to use TAO's serialization system to do this, but it's probably better to use what OpenDDS is using: https://github.com/objectcomputing/OpenDDS/blob/master/dds/DCPS/Serializer.h (or at least it's easier for me to write an example for since I know it much better)
These are some functions that will serialize types to and from std::strings:
const OpenDDS::DCPS::Encoding encoding(OpenDDS::DCPS::Encoding::KIND_XCDR2);
template <typename IdlType>
std::string serialize_to_string(const IdlType& idl_value)
{
const size_t xcdr_size = OpenDDS::DCPS::serialized_size(encoding, idl_value);
ACE_Message_Block mb(xcdr_size);
OpenDDS::DCPS::Serializer serializer(&mb, encoding);
if (!(serializer << idl_value)) {
throw std::runtime_error("failed to serialize");
}
return std::string(mb.base(), mb.length());
}
template <typename IdlType>
IdlType deserialize_from_string(const std::string& xcdr)
{
ACE_Message_Block mb(xcdr.size());
mb.copy(xcdr.c_str(), xcdr.size());
OpenDDS::DCPS::Serializer serializer(&mb, encoding);
IdlType idl_value;
if (!(serializer >> idl_value)) {
throw std::runtime_error("failed to deserialize");
}
return idl_value;
}
Also be careful when using std::string for any binary data like CDR to make sure it's not interpreted as a null-terminated string.
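To illustrate that last point with only the standard library (no OpenDDS involved): binary data such as CDR routinely contains zero bytes, and anything that takes a const char* without a length stops at the first one. The function name below is made up for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical demo: the same four bytes, handled with and without a length.
void embedded_nul_demo()
{
    const char raw[] = {'\x01', '\0', '\0', '\x2a'};  // 4 bytes, two of them zero

    const std::string with_length(raw, sizeof(raw));  // keeps all 4 bytes
    const std::string from_c_string(raw);             // stops at the first '\0'

    assert(with_length.size() == 4);
    assert(from_c_string.size() == 1);

    // Copying through data() + size() is safe, going through c_str() alone is not.
    assert(std::string(with_length.data(), with_length.size()) == with_length);
    assert(std::string(with_length.c_str()) != with_length);
}
```

So when the serialized std::string is handed to another API, pass data() together with size() rather than relying on c_str() alone.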

Boost Serialization : How To Predict The Size Of The Serialized Result?

I use Boost serialization this way:
Header H(__Magic, SSP_T_REQUEST, 98, 72, 42, Date(), SSP_C_NONE);
Header Z;
std::cout << H << std::endl;
std::cout << std::endl;
char serial_str[4096];
std::memset(serial_str, 0, 4096);
boost::iostreams::basic_array_sink<char> inserter(serial_str, 4096);
boost::iostreams::stream<boost::iostreams::basic_array_sink<char> > s(inserter);
boost::archive::binary_oarchive oa(s);
oa & H;
s.flush();
std::cout << serial_str << std::endl;
boost::iostreams::basic_array_source<char> device(serial_str, 4096);
boost::iostreams::stream<boost::iostreams::basic_array_source<char> > s2(device);
boost::archive::binary_iarchive ia(s2);
ia >> Z;
std::cout << Z << std::endl;
And it works perfectly fine.
Nevertheless, I need to send those packets over a socket. My problem is: how do I know on the other side how many bytes I need to read? The size of the serialized result is not constant and, by the way, is bigger than the sizeof of my struct.
How can I be sure that the data is complete on the other side? I use a circular buffer, but how do I handle that with serialization?
Thanks, all.

In general it's impossible to predict. It depends (a lot) on the archive format. With object tracking, complete subgraphs might be elided, and with dynamic type information a lot of data could be added.
If you can afford scratch buffers for serialized data, you can serialize to a buffer first, and then send the size (now that you know it) before sending the payload.
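A minimal sketch of that size-then-payload idea, independent of Boost: the serialized archive is treated as an opaque std::string, and the receiver collects incoming bytes in a buffer until a whole frame is present. The function names and the 4-byte big-endian length prefix are assumptions made for this example, not something the answer prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Prefix the payload with its size as a 4-byte big-endian integer.
std::string frame_message(const std::string& payload)
{
    assert(payload.size() <= 0xffffffffu);  // size must fit in the 4-byte prefix
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<char>((n >> 24) & 0xff));
    out.push_back(static_cast<char>((n >> 16) & 0xff));
    out.push_back(static_cast<char>((n >> 8) & 0xff));
    out.push_back(static_cast<char>(n & 0xff));
    out += payload;
    return out;
}

// If rx holds at least one complete frame, copy its payload into `payload`,
// drop the frame from rx and return true. Otherwise leave rx untouched and
// return false, so the caller can append more received bytes and retry.
bool try_extract_frame(std::string& rx, std::string& payload)
{
    if (rx.size() < 4) return false;            // length prefix incomplete
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n = (n << 8) | static_cast<unsigned char>(rx[i]);
    }
    if (rx.size() - 4 < n) return false;        // payload incomplete
    payload.assign(rx, 4, n);
    rx.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

On the receiving side, everything read from the socket is appended to rx, and try_extract_frame is called in a loop until it returns false. A real protocol would also cap the accepted length so that a corrupt prefix cannot make the receiver wait for gigabytes.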
There will be overhead for
object tracking (serializing through pointers/references)
dynamic polymorphism (serializing through (smart) pointer-to-base)
versioning (unless you disable it for the types involved)
archive header (unless disabled)
code conversion (unless disabled)
Here are some answers that give you more information about these tweak points:
Boost C++ Serialization overhead
Boost Serialization Binary Archive giving incorrect output
Boost Serialization of vector<char>
Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)
If all your data is POD, it's easy to predict the size.
Out of the box
If you share IPC on the same machine, and you're already using circular buffers, consider putting the circular buffer into shared memory.
I have lots of answers (search for managed_shared_memory or managed_mapped_file) with examples of this.
A concrete example, focusing on a lock-free single-producer/single-consumer scenario is here: Shared-memory IPC synchronization (lock-free)
Even if you choose to/need to stream messages (e.g. over the network) you can still employ e.g. Managed External Buffers. Hereby you avoid the need to do any serialization even without requiring all data to be POD. (The trick is that internally, offset_ptr<> is used instead of raw pointers, making all references relative).

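Create your own stream buffer class and override the xsputn method.
class counter_streambuf : public std::streambuf {
public:
using std::streambuf::streambuf;
size_t size() const { return m_size; }
protected:
// Count the bytes instead of storing them.
std::streamsize xsputn(const char_type* s, std::streamsize n) override
{ this->m_size += n; return n; }
// Single characters written via sputc() arrive here because there is no put area.
int_type overflow(int_type ch) override
{ if (!traits_type::eq_int_type(ch, traits_type::eof())) ++m_size; return traits_type::not_eof(ch); }
private:
size_t m_size = 0;
};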
Usage:
Header H(__Magic, SSP_T_REQUEST, 98, 72, 42, Date(), SSP_C_NONE);
counter_streambuf csb;
boost::archive::binary_oarchive oa(csb, boost::archive::no_header);
oa & H;
std::cout << "Size: " << csb.size() << std::endl;

C++ regex on partial data

I have a callback function that provides a pointer to data and its size. I don't know what the size will be next time or which call will be the last. I need to match the incoming data against a regex and save the matches.
Something like this:
class data_filter
{
public:
data_filter(const std::string& re)
: re_(re)
{}
public:
// callback func. It will be called many times with data parts
void process(const char* data, const size_t len)
{
re_.match(data, len, m_); // if found match, add it to matches
}
public:
void print_matches()
{
for(size_t i = 0; i < m_.size(); ++i)
{
std::cout << m_[i] << std::endl;
}
}
private:
some_cool_regex re_;
cool_regex_matches m_;
};
If absolutely necessary, I can provide a fixed buffer for regex backtracking, but I would like to avoid it.
I have already had a brief look at boost::regex with the partial match option. As far as I understood at first glance, it can provide such functionality, but the user has to manage a temporary buffer manually.
So, should I stick with Boost, or are there libraries that match my needs more closely?
Thanks.

Since, indeed, there could be a need for backtracking, your options for streaming are limited or non-existent.
Boost Spirit "solves" the same issue by using the multi_pass_iterator<> adapter around input iterators. The adapter is able to maintain a buffer of previously read data for backtracking, freeing it as soon as it is no longer required (e.g. due to an expectation point).
If you shared some details about "some cool regex" then I could probably show you how to do this.
UPDATE Just found this library: https://github.com/openresty/sregex
libsregex - A non-backtracking regex engine library for large data streams
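For comparison, here is a rough sketch of the manual-buffer approach the question mentions, using std::regex instead of Boost or sregex. It is not what either library does internally. It assumes the caller can state an upper bound on the length of any match (the hypothetical max_match_len parameter), and it does not handle anchors or lookaround across chunk boundaries. The class name is also made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

class buffered_filter {
public:
    buffered_filter(const std::string& re, std::size_t max_match_len)
        : re_(re), max_len_(max_match_len) { assert(max_match_len > 0); }

    // Callback: may be called many times with consecutive parts of the data.
    void process(const char* data, std::size_t len) { buf_.append(data, len); scan(false); }

    // Call once after the last part so that buffered matches are reported too.
    void finish() { scan(true); }

    const std::vector<std::string>& matches() const { return matches_; }

private:
    void scan(bool at_end)
    {
        // A match starting before `undecided` already has max_len_ bytes
        // available after its start, so later data cannot change it.
        const std::size_t undecided =
            at_end ? buf_.size()
                   : (buf_.size() >= max_len_ ? buf_.size() - max_len_ + 1 : 0);
        std::size_t pos = 0;
        std::smatch m;
        while (pos < buf_.size() &&
               std::regex_search(buf_.cbegin() + pos, buf_.cend(), m, re_)) {
            const std::size_t start = pos + static_cast<std::size_t>(m.position(0));
            if (start >= undecided) break;                        // may still grow or change
            if (m.length(0) == 0) { pos = start + 1; continue; }  // skip empty matches
            matches_.push_back(m.str(0));
            pos = start + static_cast<std::size_t>(m.length(0));
        }
        // Keep only the tail that could still take part in a future match.
        buf_.erase(0, at_end ? buf_.size() : std::max(pos, undecided));
    }

    std::regex re_;
    std::size_t max_len_;
    std::string buf_;
    std::vector<std::string> matches_;
};
```

The buffer never holds more than one chunk plus max_match_len - 1 bytes after a call returns, which is the "fixed buffer" trade-off from the question: memory stays bounded, but matches longer than the stated bound can be cut short or missed.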

COM interop: how to use ICustomMarshaler to call 3rd party component

I want to call a method in a COM component from C# using COM interop. This is the method's signature:
long GetPrecursorInfoFromScanNum(long nScanNumber,
LPVARIANT pvarPrecursorInfos,
LPLONG pnArraySize)
and this is sample code (which I checked is really working) to call it in C++:
struct PrecursorInfo
{
double dIsolationMass;
double dMonoIsoMass;
long nChargeState;
long nScanNumber;
};
void CTestOCXDlg::OnOpenParentScansOcx()
{
VARIANT vPrecursorInfos;
VariantInit(&vPrecursorInfos);
long nPrecursorInfos = 0;
m_Rawfile.GetPrecursorInfoFromScanNum(m_nScanNumber,
&vPrecursorInfos,
&nPrecursorInfos);
// Access the safearray buffer
BYTE* pData;
SafeArrayAccessData(vPrecursorInfos.parray, (void**)&pData);
for (int i=0; i < nPrecursorInfos; ++i)
{
// Copy the scan information from the safearray buffer
PrecursorInfo info;
memcpy(&info,
pData + i * sizeof(PrecursorInfo),
sizeof(PrecursorInfo));
}
SafeArrayUnaccessData(vPrecursorInfos.parray);
}
And here's the corresponding C# signature after importing the typelib of the COM component:
void GetPrecursorInfoFromScanNum(int nScanNumber, ref object pvarPrecursorInfos, ref int pnArraySize);
If I'm not mistaken, I need to pass in null for pvarPrecursorInfos to make COM interop marshal it as the expected VT_EMPTY variant. When I do that, I get a SafeArrayTypeMismatchException, which is not really surprising given how the result is expected to be handled in the sample. So I tried to use a custom marshaler. Since I cannot alter the component itself, I tried to introduce it this way:
[Guid("06F53853-E43C-4F30-9E5F-D1B3668F0C3C")]
[TypeLibType(4160)]
[ComImport]
public interface IInterfaceNew : IInterfaceOrig
{
[DispId(130)]
int GetPrecursorInfoFromScanNum(int nScanNumber, [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(MyMarshaler))] ref object pvarPrecursorInfos, ref int pnArraySize);
}
The TypeLibType and DispId attributes are the same as in the original version. This works insofar as the MyMarshaler.GetInstance() method is called, but I do not get a callback in MyMarshaler.NativeToManaged. Instead, an access violation is reported. So is this a reliable approach? If yes, how can I make it work? If not, are there any alternatives?
(Just a footnote: in theory I could try to use managed C++ to call the component natively. However, there are lots of other methods in it that work fine with COM interop, so I would very much like to stick with C# if there is any way.)

Since someone asked for it, here's my solution in Managed C++.
array<PrecursorInfo^>^ MSFileReaderExt::GetPrecursorInfo(int scanNumber)
{
VARIANT vPrecursorInfos;
VariantInit(&vPrecursorInfos);
long nPrecursorInfos = -1;
//call the COM component
long rc = pRawFile->GetPrecursorInfoFromScanNum(scanNumber, &vPrecursorInfos, &nPrecursorInfos);
//read the result
//vPrecursorInfos.parray points to a byte sequence
//that can be seen as array of MS_PrecursorInfo instances
//(MS_PrecursorInfo is a struct defined within the COM component)
MS_PrecursorInfo* pPrecursors;
SafeArrayAccessData(vPrecursorInfos.parray, (void**)&pPrecursors);
//now transform into a .NET object
array<PrecursorInfo^>^ infos = gcnew array<PrecursorInfo^>(nPrecursorInfos);
MS_PrecursorInfo currentPrecursor;
for (int i=0; i < nPrecursorInfos; ++i)
{
currentPrecursor = pPrecursors[i];
infos[i] = safe_cast<PrecursorInfo^>(Marshal::PtrToStructure(IntPtr(&currentPrecursor), PrecursorInfo::typeid));
}
SafeArrayUnaccessData(vPrecursorInfos.parray);
return infos;
}

I looked at the mzLib code on GitHub, which I believe is related to this topic. The code looks good where it calls
pin_ptr<const wchar_t> wch = PtrToStringChars(path);
I think this may cause problems; it may be better to use
pin_ptr<const wchar_t> pathChar = static_cast<wchar_t*>(System::Runtime::InteropServices::Marshal::StringToHGlobalUni(path).ToPointer());
The code then seemed to work fine once compiled. However, it might run into problems when imported as a DLL. I worked around that by adding a constructor, such as
public ref class ThermoDLLClass
{
public:
ThermoDLLClass();
PrecursorInfo GetPrecursorInfo(int scanNum, String^ path);
};
With that, it seems to retrieve the PrecursorInfo and the parameters correctly.

How to serialize an object to send over network

I'm trying to serialize objects to send over a network through a socket using only the STL. I can't find a way to preserve the objects' structure so that they can be deserialized on the other host. I tried converting to string and to char*, and I've spent a long time searching for tutorials on the internet, but so far I have found nothing.
Is there a way to do it only with the STL?
Are there any good tutorials?
I am close to trying Boost, but if there is a way to do it with the STL I'd like to learn it.

You can serialize with anything. All serialization means is that you are converting the object to bytes so that you can send it over a stream (like a std::ostream) and read it with another (like a std::istream). Just overload operator <<(std::ostream&, const T&) and operator >>(std::istream&, T&) where T is each of your types, and do the same for all the types contained in your types.
However, you should probably just use an existing library (Boost is pretty nice). There are tons of things that a library like Boost does for you, like byte-ordering, taking care of common objects (like arrays and all the stuff from the standard library), providing a consistent means of performing serialization and tons of other stuff.
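To give an idea of what the hand-written route involves, here is a small sketch of the two operators for a made-up Record type. It fixes the byte order (big-endian) and length-prefixes the string, which are two of the details a library would otherwise handle. The type, the helper names and the wire layout are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct Record {            // hypothetical example type
    std::uint32_t id;
    std::string name;
};

// Write a 32-bit value as 4 bytes, most significant byte first.
void write_u32(std::ostream& out, std::uint32_t v)
{
    const char bytes[4] = {
        static_cast<char>((v >> 24) & 0xff), static_cast<char>((v >> 16) & 0xff),
        static_cast<char>((v >> 8) & 0xff), static_cast<char>(v & 0xff)};
    out.write(bytes, 4);
}

// Read 4 bytes back into a 32-bit value; returns false if the stream ran dry.
bool read_u32(std::istream& in, std::uint32_t& v)
{
    char bytes[4];
    if (!in.read(bytes, 4)) return false;
    v = 0;
    for (int i = 0; i < 4; ++i) v = (v << 8) | static_cast<unsigned char>(bytes[i]);
    return true;
}

std::ostream& operator<<(std::ostream& out, const Record& r)
{
    assert(r.name.size() <= 0xffffffffu);  // length must fit in the 4-byte prefix
    write_u32(out, r.id);
    write_u32(out, static_cast<std::uint32_t>(r.name.size()));
    out.write(r.name.data(), static_cast<std::streamsize>(r.name.size()));
    return out;
}

std::istream& operator>>(std::istream& in, Record& r)
{
    std::uint32_t id = 0, len = 0;
    if (read_u32(in, id) && read_u32(in, len)) {
        // A real protocol would cap len before allocating.
        std::string name(len, '\0');
        if (len == 0 || in.read(&name[0], static_cast<std::streamsize>(len))) {
            r.id = id;
            r.name.swap(name);
        }
    }
    return in;  // failbit is set if any read came up short
}
```

Writing a Record into a std::stringstream and reading it back yields the same id and name, including names that contain spaces or zero bytes. Every additional member type needs the same care, which is the work an existing library saves.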

My first question would be: do you want serialization or messaging?
It might seem stupid at first, since you asked for serialization, but I have always distinguished the two terms.
Serialization is about taking a snapshot of your memory and restoring it later on. Each object is represented as a separate entity (though they might be composed).
Messaging is about sending information from one point to another. The message usually has its own grammar and may not reflect the organization of your Business Model.
Too often I've seen people using Serialization where Messaging should have been used. It does not mean that Serialization is useless, but it does mean that you should think ahead of time. It's quite difficult to alter the BOM once you have decided to serialize it, especially if you decide to relocate some part of the information (move it from one object to another)... because how then are you going to decode the "old" serialized version?
Now that that's been cleared up...
... I will recommend Google's Protocol Buffers.
You could perfectly well write your own using the STL, but you would end up doing work that has already been done, and unless you wish to learn from it, it's quite pointless.
One great thing about protobuf is that it's language agnostic in a way: i.e. you can generate the encoder/decoder of a given message for C++, Java or Python. The use of Python is nice for message injection (testing) or message decoding (to check the output of a logged message). It's not something that would come easily were you to use the STL.

Serializing C++ Objects over a Network Socket
This is six years late, but I recently had this problem, and this was one of the threads I came across while searching for how to serialize an object through a network socket in C++. This solution uses just two or three lines of code. Many of the answers I found work, but the easiest approach was to use reinterpret_cast<char*>(&obj) to view the class or structure as an array of characters and feed that through the socket. Note that this only works for trivially copyable types (no pointers, strings or virtual functions) and when both ends use the same object layout and byte order. Here's an example.
Class to be serialized:
/* myclass.h */
#ifndef MYCLASS_H
#define MYCLASS_H
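class MyClass
{
public:
    int A;
    int B;
    MyClass(){A=1;B=2;}
    // No user-defined destructor: the class has to stay trivially copyable
    // for its raw bytes to be sent and copied back safely.
};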
#endif
Server Program:
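/* server.cpp */
#include <cstring>    // std::memcpy
#include <iostream>
#include <unistd.h>   // read
#include "myclass.h"
int main (int argc, char** argv)
{
    // Open socket connection and accept a client as newsockfd.
    // ...
    // Loop until the client disconnects or an error occurs.
    while(1)
    {
        // Read serialized data from socket. read() may deliver fewer bytes
        // than requested, so keep reading until a whole object has arrived.
        char buf[sizeof(MyClass)];
        size_t got = 0;
        while (got < sizeof(MyClass))
        {
            ssize_t n = read(newsockfd, buf + got, sizeof(MyClass) - got);
            if (n <= 0) return 1; // connection closed or read error
            got += n;
        }
        // Copy the bytes into a real object instead of dereferencing a cast
        // pointer, which would violate alignment and aliasing rules.
        MyClass msg;
        std::memcpy(&msg, buf, sizeof(MyClass));
        std::cout << "A = " << msg.A << std::endl;
        std::cout << "B = " << msg.B << std::endl;
    }
    // Close socket connection.
    // ...
    return 0;
}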
Client Program:
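/* client.cpp */
#include <cstdio>     // printf, fgets
#include <cstring>    // std::memset
#include <unistd.h>   // write
#include "myclass.h"
int main(int argc, char *argv[])
{
    // Open socket connection as sockfd.
    // ...
    char buffer[256];
    while(1)
    {
        // The typed text is only used to pace the loop; it is not sent.
        printf("Please enter the message: ");
        std::memset(buffer, 0, sizeof(buffer));
        if (fgets(buffer, sizeof(buffer) - 1, stdin) == NULL) break;
        MyClass msg;
        msg.A = 1;
        msg.B = 2;
        // Write the object's raw bytes to the socket.
        const char* tmp = reinterpret_cast<const char*>(&msg);
        if (write(sockfd, tmp, sizeof(MyClass)) != static_cast<ssize_t>(sizeof(MyClass))) break;
    }
    // Close socket connection.
    // ...
    return 0;
}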
Compile both server.cpp and client.cpp using g++ with -std=c++11 as an option. You can then open two terminals and run both programs, however, start the server program before the client so that it has something to connect to.
Hope this helps.
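One way to make the limits of this raw-copy approach explicit is to wrap it in helpers that refuse, at compile time, any type that cannot safely be copied byte by byte. The sketch below is an assumption-laden illustration: to_bytes, from_bytes and the Pair struct (a stand-in for MyClass without a user-defined destructor) are made up, and it still relies on both hosts sharing the same layout and byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy the object representation of a trivially copyable value into a byte vector.
template <typename T>
std::vector<char> to_bytes(const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies are only valid for trivially copyable types");
    std::vector<char> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Copy received bytes back into an existing object; rejects a wrong size.
template <typename T>
bool from_bytes(const char* data, std::size_t size, T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies are only valid for trivially copyable types");
    assert(data != nullptr);
    if (size != sizeof(T)) return false;
    std::memcpy(&value, data, sizeof(T));
    return true;
}

// Hypothetical stand-in for MyClass: plain data members only.
struct Pair {
    int A;
    int B;
};
```

With these helpers, a class holding a std::string or a virtual function fails to compile instead of silently sending pointers over the wire.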

I got it!
I used stringstream to serialize objects and sent the result as a message using the stringstream's str() method and then the string's c_str().
Look.
class Object {
public:
    int a;
    string b;
    void methodSample1 ();
    void methodSample2 ();
    friend ostream& operator<< (ostream& out, const Object& object) {
        out << object.a << " " << object.b; // The space (" ") is necessary to separate the elements
        return out;
    }
    friend istream& operator>> (istream& in, Object& object) {
        in >> object.a;
        in >> object.b;
        return in;
    }
};
/* Server side */
int main () {
    Object o;
    stringstream ss;
    o.a = 1;
    o.b = "two"; // b is a string; with this format it must not contain whitespace
    ss << o; //serialize
    const string data = ss.str();
    write (socket, data.c_str(), data.size()); //send exactly the serialized bytes
}
/* Client side */
int main () {
    Object o2;
    stringstream ss2;
    char buffer[20];
    ssize_t n = read (socket, buffer, sizeof(buffer)); //receive - the buffer size must be adjusted, it's a sample
    if (n > 0) {
        ss2.write(buffer, n); // buffer is not null-terminated, so pass the length explicitly
        ss2 >> o2; //unserialize
    }
}
I'm not sure whether it is necessary to convert the received data to a string before deserializing; it may be possible to feed the char buffer to the stream directly.
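One limitation of the space-separated format is that a string containing whitespace is cut at the first space when read back. If C++14 is available, std::quoted is one way around that. The sketch below uses a made-up Item type with the same two members and is only an illustration of the idea.

```cpp
#include <cassert>
#include <iomanip>   // std::quoted (C++14)
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct Item {        // hypothetical type with the same members as Object
    int a;
    std::string b;
};

// Write b in double quotes, with embedded quotes and backslashes escaped.
std::ostream& operator<<(std::ostream& out, const Item& item)
{
    return out << item.a << ' ' << std::quoted(item.b);
}

// Read a, then the quoted string, undoing the escaping.
std::istream& operator>>(std::istream& in, Item& item)
{
    return in >> item.a >> std::quoted(item.b);
}

// Round trip through a stringstream, as the sockets would do with the text.
void quoted_round_trip_demo()
{
    Item original;
    original.a = 1;
    original.b = "two words";
    std::stringstream ss;
    ss << original;                 // serialize
    Item copy;
    copy.a = 0;
    ss >> copy;                     // deserialize
    assert(!ss.fail());
    assert(copy.a == 1 && copy.b == "two words");
}
```

The text still has to be framed on the wire (for example by sending its length first), since the receiver otherwise cannot tell where one object ends.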

I think you should use Google Protocol Buffers in your project. For network transport, Protocol Buffers have many advantages over XML for serializing structured data. Protocol buffers:
are simpler
are 3 to 10 times smaller
are 20 to 100 times faster
are less ambiguous
generate data access classes that are easier to use programmatically
and so on. I suggest reading https://developers.google.com/protocol-buffers/docs/overview to learn about protobuf.
