Partial deserialization of std::map

Is there any way to partially deserialize a std::map that was serialized with boost::archive::text_oarchive and then saved to a file?
For example, we have a big serialized map saved on disk, where the key is an integer and the value is some structure, and now we need to load it back in parts: load the first 100 records, then the next 100 records, and so on.
Are there any libraries, Boost classes, or other solutions for this?

Normally the same serialize() function is called both to serialize and to deserialize. If you want to get it back in parts, you should serialize it in parts in the first place.
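A minimal sketch of what "serialize it in parts" could look like. It uses plain text streams instead of a Boost archive (an assumption, to keep the example dependency-free), and `Record`, `write_chunked`, `read_chunk`, and `count_chunks` are hypothetical names. Each chunk starts with its own record count, so the reader can stop after any chunk. With Boost.Serialization the same layout should be reachable by writing one small container per chunk into the archive and reading them back one at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical value type standing in for "some structure".
struct Record {
    int score = 0;
    std::string name;  // assumed to contain no whitespace in this sketch
};

// Write the map as a sequence of chunks. Each chunk is a record count
// followed by that many "key score name" lines.
void write_chunked(std::ostream& out, const std::map<int, Record>& data,
                   std::size_t chunk_size) {
    assert(chunk_size > 0);
    auto it = data.begin();
    std::size_t remaining = data.size();
    while (remaining > 0) {
        const std::size_t n = remaining < chunk_size ? remaining : chunk_size;
        out << n << '\n';
        for (std::size_t i = 0; i < n; ++i, ++it) {
            out << it->first << ' ' << it->second.score << ' '
                << it->second.name << '\n';
        }
        remaining -= n;
    }
}

// Read the next chunk into `into`. Returns false when no chunk is left
// or the input is malformed.
bool read_chunk(std::istream& in, std::map<int, Record>& into) {
    std::size_t n = 0;
    if (!(in >> n)) return false;
    for (std::size_t i = 0; i < n; ++i) {
        int key = 0;
        Record r;
        if (!(in >> key >> r.score >> r.name)) return false;
        into[key] = r;
    }
    return true;
}

// Usage example: walk a serialized text chunk by chunk and count the chunks.
std::size_t count_chunks(const std::string& text) {
    std::istringstream in(text);
    std::map<int, Record> part;
    std::size_t chunks = 0;
    while (read_chunk(in, part)) {
        part.clear();  // a real caller would process the records here
        ++chunks;
    }
    return chunks;
}
```

Calling `read_chunk` once on a file stream loads only the first chunk; calling it again continues from where the previous call stopped.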

Related

Pre-serializing some fields of a proto message

Suppose I have a proto structure that looks like the following:
message TMessage {
    optional TDictionary dictionary = 1;
    optional int32 specificField1 = 2;
    optional TOtherMessage specificField2 = 3;
    ...
}
Suppose I am using C++. This is the message stub that the master process uses to send information to a bunch of nodes over the network. In particular, the dictionary field is 1) pretty heavy and 2) common to all the serialized messages, while the specific fields that follow are filled with a relatively small amount of information specific to the destination node.
Of course, the dictionary is built only once, but it turns out that most of the running time is spent serializing the common dictionary part again and again for each new node.
The obvious optimization would be to pre-serialize the dictionary into a byte string and put it into TMessage as a bytes field, but this looks a bit nasty to me.
Am I right that there is no built-in way to pre-serialize a message field without ruining the message structure? It sounds like an idea for a good plugin for the proto compiler.

Protobuf is designed such that concatenation === composition, at least for the root message. That means that you can serialize an object with just the dictionary, and snapshot the bytes somewhere. Now for each of the real messages you can paste down that snapshot, and then serialize an object with just the other fields - just whack it straight after: no additional syntax is required. This is semantically identical to serializing them all at the same time. In fact, since it will retain the field order, it should actually be identical bytes too.
It helps that you used "optional" throughout :)
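To make the concatenation argument concrete, here is a small hand-rolled encoder for just the two wire types involved. It is a sketch, not the protobuf library: `put_varint`, `put_length_delimited`, `put_varint_field`, and the `encode_*` functions are hypothetical helpers, and the dictionary payload is treated as an opaque, already-serialized string. In real code the generated classes would produce these bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append an unsigned base-128 varint, as used by the protobuf wire format.
void put_varint(std::string& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Wire type 2 (length-delimited): tag, payload length, payload bytes.
void put_length_delimited(std::string& out, std::uint32_t field,
                          const std::string& payload) {
    assert(field >= 1);  // field numbers start at 1
    put_varint(out, (static_cast<std::uint64_t>(field) << 3) | 2);
    put_varint(out, payload.size());
    out += payload;
}

// Wire type 0 (varint): tag, then the value (non-negative in this sketch).
void put_varint_field(std::string& out, std::uint32_t field,
                      std::uint64_t value) {
    assert(field >= 1);
    put_varint(out, (static_cast<std::uint64_t>(field) << 3) | 0);
    put_varint(out, value);
}

// A message with only field 1 (the dictionary) set: the reusable snapshot.
std::string encode_dictionary_only(const std::string& dictionary_payload) {
    std::string out;
    put_length_delimited(out, 1, dictionary_payload);
    return out;
}

// A message with only field 2 (the per-node value) set.
std::string encode_specific_only(std::uint64_t specific1) {
    std::string out;
    put_varint_field(out, 2, specific1);
    return out;
}

// The whole message serialized in one go, fields in numeric order.
std::string encode_full(const std::string& dictionary_payload,
                        std::uint64_t specific1) {
    std::string out;
    put_length_delimited(out, 1, dictionary_payload);
    put_varint_field(out, 2, specific1);
    return out;
}
```

Here `encode_dictionary_only(d) + encode_specific_only(x)` yields the same bytes as `encode_full(d, x)`, so the snapshot can be computed once and reused for every node.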

Marc's answer is perfect for your use case. Here is just another option:
The field must be a submessage, like your TDictionary is.
Have another variant of the outer message, with bytes in place of the submessage you want to preserialize:
message TMessage_preserialized {
    optional bytes dictionary = 1;
    ...
}
Now you can serialize the TDictionary separately and put the resulting data in the bytes field. In the protobuf wire format, submessages and bytes fields are written out the same way. This means you can serialize as TMessage_preserialized and still deserialize as a normal TMessage.

Qt splitting data structure into groups

I have a problem I'm trying to solve, but I'm at a standstill because I'm still learning Qt, which makes me doubt what the 'Qt' way of solving it is, while also being the most efficient in terms of time complexity. I read a file line by line (the number of lines ranges from 10 to 2,000,000). At the moment my approach is to dump every line into a QVector.
QVector<QString> lines;
lines.append("id,name,type");
lines.append("1,James,A");
lines.append("2,Mark,B");
lines.append("3,Ryan,A");
Assuming the above structure, I would like to present the user with three views that show the data based on the type field. The data is comma-delimited in its original form. My question is: what's the most elegant and possibly most efficient way to achieve this?
Note: as a visual aid, the end result more or less emulates Microsoft Access. There will be a list of tables on the left side. In my case these table names will be the values of the grouping field (A, B). When I switch between those two list items, the central view (a table) will refill to contain that particular group's data.
Should I split the data into x separate structures? Or would that cause unnecessary overhead?
Would really appreciate any help

In the end, you'll want to have some sort of a data model that implements QAbstractItemModel that exposes the data, and one or more views connected to it to display it.
If the data doesn't have to be editable, you could implement a custom table model derived from QAbstractTableModel that maps the file in memory (using QFile::map), and incrementally parses it on the fly (implement canFetchMore and fetchMore).
If the data is to be editable, you might be best off loading it all into a temporary SQLite table as you parse the file, attaching a QSqlTableModel to it, and attaching some views to that model.
When the user wants to save the changes, you simply iterate over the model and dump it out to a text file.
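Whichever model class ends up exposing the data, the grouping itself does not require copying the lines into separate structures. A framework-independent sketch, using std::string and std::vector in place of QString and QVector so that it stands alone (`field_at`, `build_group_index`, and `lines_of_group` are hypothetical helpers): keep the lines once and build an index from each type value to the row numbers that carry it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Return the n-th comma-separated field of a line (no quoting support).
// Returns an empty string if the line has fewer fields.
std::string field_at(const std::string& line, std::size_t n) {
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t comma = line.find(',', start);
        if (comma == std::string::npos) return std::string();
        start = comma + 1;
    }
    const std::size_t end = line.find(',', start);
    return line.substr(start,
                       end == std::string::npos ? std::string::npos : end - start);
}

// Map each distinct value of the grouping column to the rows holding it.
// Row 0 is assumed to be the header and is skipped.
std::map<std::string, std::vector<std::size_t>>
build_group_index(const std::vector<std::string>& lines, std::size_t column) {
    std::map<std::string, std::vector<std::size_t>> index;
    for (std::size_t row = 1; row < lines.size(); ++row) {
        index[field_at(lines[row], column)].push_back(row);
    }
    return index;
}

// Collect the lines of one group, e.g. to refill the central table view.
std::vector<std::string> lines_of_group(
    const std::vector<std::string>& lines,
    const std::map<std::string, std::vector<std::size_t>>& index,
    const std::string& group) {
    std::vector<std::string> result;
    const auto it = index.find(group);
    if (it == index.end()) return result;
    for (std::size_t row : it->second) {
        assert(row < lines.size());  // index must come from the same `lines`
        result.push_back(lines[row]);
    }
    return result;
}
```

The keys of the index give the entries for the list on the left, and each row list is what a per-group view would display. Building the index is a single pass over the lines.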

Complex and interrelated data structure in the Client Server scenario

I need to know an efficient mechanism for handling data structures in socket programming. Let's consider the example of car manufacturing on an assembly line.
Initially the conveyor is empty, then I start adding different parts dynamically. How can I transmit my data to the server using TCP/UDP? What can I do so that my server recognizes when I add a new part dynamically? After its calculation, the server should return the data to the client in the same structure, so that the client can put the calculated data at the exact position of each component.
Is it possible to arrange this data using a B-tree or B+ tree structure? Is it possible to reconstruct the same tree on the server side? What other approaches could be used?

You need to serialize whatever data you send to the server into some text or binary blob. Yes, it's possible to serialize an interrelated data structure, e.g. by assigning an ID to each item and then referencing items by that ID. For C++ serialization I would recommend having a look at Boost.Serialization.
The simplest ID is the memory address on the serializer (sender) side, which is a unique identifier ready to use. Of course, on the deserializer side it must be treated as just an ID and not as a memory address.
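A minimal sketch of the ID approach, without Boost. `Part`, `serialize`, and `deserialize` are hypothetical names, and the ID used here is the part's position in the container rather than its memory address; either works as long as the receiver treats it as an opaque identifier. Links are written as IDs and turned back into pointers after all parts have been recreated.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical part on the conveyor; `next` links to another part or is null.
struct Part {
    std::string name;  // assumed to contain no whitespace in this sketch
    Part* next = nullptr;
};

// Write "<count>" followed by one "<id> <name> <next-id>" line per part.
// IDs are 1-based positions in `parts`; 0 means "no link".
std::string serialize(const std::vector<std::unique_ptr<Part>>& parts) {
    std::unordered_map<const Part*, std::size_t> id_of;
    for (std::size_t i = 0; i < parts.size(); ++i) id_of[parts[i].get()] = i + 1;

    std::ostringstream out;
    out << parts.size() << '\n';
    for (const auto& p : parts) {
        std::size_t next_id = 0;
        if (p->next) {
            const auto it = id_of.find(p->next);
            assert(it != id_of.end());  // links must stay inside `parts`
            next_id = it->second;
        }
        out << id_of[p.get()] << ' ' << p->name << ' ' << next_id << '\n';
    }
    return out.str();
}

// Recreate the parts, then resolve the IDs back into pointers.
std::vector<std::unique_ptr<Part>> deserialize(const std::string& text) {
    std::istringstream in(text);
    std::size_t count = 0;
    in >> count;

    std::vector<std::unique_ptr<Part>> parts;
    std::vector<std::size_t> next_ids;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t id = 0, next_id = 0;
        std::unique_ptr<Part> p(new Part);
        if (!(in >> id >> p->name >> next_id)) break;
        parts.push_back(std::move(p));
        next_ids.push_back(next_id);
    }
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (next_ids[i] != 0 && next_ids[i] <= parts.size()) {
            parts[i]->next = parts[next_ids[i] - 1].get();
        }
    }
    return parts;
}
```

The string returned by `serialize` is what would go over the socket. Because the server can send results back keyed by the same IDs, the client can place each result on the right component.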

C++ persistence in database

I would like to persist some of my objects in a database (this could be relational, such as PostgreSQL or MariaDB, or MongoDB). I have found a number of libraries that seem potentially useful, but I am missing the overall picture.
I have used boost::serialization to serialize C++ objects to XML / binary, but it is not clear to me how to get the result into the database. Do I use the binary or the XML format?
How do I get this into MongoDB or PostgreSQL?

You'd serialize to binary, as it is smaller and much faster. Also, the XML format isn't really pretty/easy to use outside of Boost Serialization anyways.
WARNING: Boost's own binary archive is not portable between platforms. Use a portable archive such as EOS Portable Archive (EPA, http://epa.codeplex.com/) if you need to use the format across different machines.
You'd usually store it in a column of one of these types:
text or CLOB (character large object), by encoding the data in base64 and storing that in the database's native charset (base64 is safe even for ASCII)
BLOB (binary large object), which needs no encoding and could be more efficient storage-wise.
Note: if you need to index, store the index properties in normal database columns.
Finally, if you like, I have recently made a streambuffer that allows you to stream data directly into a Sqlite BLOB column. Perhaps you can glean some ideas from this you could use:
How to boost::serialize into a sqlite::blob?
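For the text/CLOB option, the binary archive has to be base64-encoded before it is stored. A minimal encoder sketch, assuming the archive has already been written into a std::string (for example through a std::ostringstream); `base64_encode` is a hypothetical helper and not part of Boost.Serialization or of any database API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode arbitrary bytes as base64 (RFC 4648 alphabet, '=' padding).
std::string base64_encode(const std::string& bytes) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);

    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        const std::size_t remaining = bytes.size() - i;
        // Pack up to three input bytes into one 24-bit group.
        const unsigned b0 = static_cast<unsigned char>(bytes[i]);
        const unsigned b1 =
            remaining > 1 ? static_cast<unsigned char>(bytes[i + 1]) : 0u;
        const unsigned b2 =
            remaining > 2 ? static_cast<unsigned char>(bytes[i + 2]) : 0u;
        const unsigned triple = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, padding where input bytes were missing.
        out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
        out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
        out.push_back(remaining > 1 ? kAlphabet[(triple >> 6) & 0x3F] : '=');
        out.push_back(remaining > 2 ? kAlphabet[triple & 0x3F] : '=');
    }
    assert(out.size() % 4 == 0);  // padded output is a multiple of 4 chars
    return out;
}
```

The encoded text is about one third larger than the input, which is the storage cost the BLOB option avoids.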

MFC treeview control : looking for a foolproof way to deal with data

Maybe I am doing something wrong here. I am using a treeview control, which I populate with data. The data (mainly integers) is converted to CStrings for that purpose. When the user clicks on an item, I can read the CString, but then I have to parse it in order to get the data back.
Several times I have changed the way the data appears on the screen, and then everything breaks and I need to rewrite the parsing function. I wonder if there is a better way to do this...
EDIT: The treeview is populated with items from a std::vector. If I could get the treeview to return an index into the vector instead of a CString, that would suit me perfectly.

You can use CTreeCtrl::SetItemData to associate an arbitrary data value with a tree item, and CTreeCtrl::GetItemData to retrieve this value. Typically you use SetItemData to store a pointer to an object, but in your case you could use this to store the integer values directly.
I hope this helps!

If you change the way you set/get your data in the tree, then you will have to change the way you format and parse it.
Normally, you should only have 2 functions, the setter and the parser, so it should not be a big issue.
I don't think there is a way to make it really faster or cleaner.
