How to create a dynamic message with Protocol Buffers in C++?

Say we want to create our message without using any preexisting .proto files or the cpp/cxx/h files compiled from them. We want to use protobuf strictly as a library. For example, we have (in some format known only to us) a message description: a message called MyMessage has to have a MyIntField and an optional MyStringField. How do we create such a message, fill it with simple data, save it to a .bin file, and read its contents back from that binary?
I looked all over the dynamic_message.h documentation, DescriptorPool, and so on, but I do not see how to add/remove fields in a message, nor any way to add a message described on the fly to a DescriptorPool.
Can anyone please explain?

Short answer: it can't be used that way.
The overview page of Protobuf says:
XML is also – to some extent – self-describing. A protocol buffer is only meaningful if you have the message definition (the .proto file).
Meaning the whole point of Protobuf is to throw out self-describability in favor of parsing speed: it is simply not its purpose to create self-describing messages.
Consider using XML, JSON, or any other serialization format. If protection is needed, you can add symmetric encryption and/or lzip compression.

Related

How to read Bazel's binary build event protocol file?

I want to implement fetching of compiler warnings in a Bazel-based build. I know that there are files which can already be used for this. These files are located at:
$PROJECT_ROOT/bazel-out/_tmp/action_outs/
and are named stderr-XY.
Bazel has the ability to save all build events in a designated file. Note that currently (Bazel 0.14) there are 3 supported formats for that designated file, and those are: text file, JSON file and binary file. This question is related only to binary file.
If I have understood Google's protocol buffers correctly, the workflow for them to be implemented and to work is:
You specify how you want the information you're serializing to be structured by defining protocol buffer message types in .proto files.
Once you've defined your messages, you run the protocol buffer compiler (protoc) for your application's language on your .proto file to generate data access classes.
Include the generated files in your project and use the generated classes in your code. By "use" it is meant populating, serializing, and retrieving protocol buffer messages (in C++, the language I use, the SerializeToOstream and ParseFromIstream methods can be used for such tasks).
To conclude this question:
As it is stated here:
"Have Bazel serialize the protocol buffer messages to a file by specifying the option --build_event_binary_file=/path/to/file. The file will contain serialized protocol buffer messages with each message being length delimited."
I do not see a way around the fact that a developer who wants to use Bazel's functionality to write build events to a binary file needs to know the "format", or more precisely the class structure, to read that binary file. Am I missing something here? Can all of this be done, and how?
Also, I have tried to use protoc --decode_raw < bazelbepbinary.bin and it says:
Failed to parse input.
All of this was done on Ubuntu 16.04; I am not sure at the moment what the GCC version is, but I will add it to the question when I have access to that information.
My side question is: is it possible to capture only those build events which reflect build warnings (without applying some kind of filter, e.g. grep, to the generated file)? I have read the documentation and used:
bazel help build --long | grep "relevant_build_event_protocol_keywords"
and was unable to find anything like that in the API.

Real-time parsing

I am quite new to parsing text files. While googling a bit, I found out that a parser usually builds a tree structure out of a text file. Most of the examples consist of parsing files, which in my view is quite static: you load the file into the parser and get the output.
My problem is somewhat different from parsing files. I have a stream of JSON data coming from a server socket on TCP port 6000, and I need to parse the incoming data. I have some questions in mind:
1) Do I need to save the incoming JSON data on the client side in some sort of buffer? Answer: I think yes, I need to save it, but are there any parsers which can do this directly, e.g. by passing the JSON object as an argument to the parse function?
2) What would the structure of a real-time parser look like? Answer: searching only turns up static parse-tree structures. In my view, each object is parsed into some sort of parse tree and then deleted from memory; otherwise it would cause memory overflow, because the data is continuous.
There are some parser libraries available, like JSON-C and JSON lib. One more thing that comes to mind: can we save a JSON object in a C/C++ array? I thought of that but could not work out how to do it.

Transmit raw vertex information to XTK?

We're using XTK to display data processed and created on a server. In our particular case, it's a parallel isocontouring application. As it currently stands, we're converting to the (textual) VTK format and passing the entire (imaginary) VTK file over the wire to the client, where XTK renders it. This introduces substantial overhead, as the text format outweighs the in-memory format by a considerable amount.
Is there a recommended mechanism available for transmitting binary data directly, either through an alternate format that is well-described or by constructing XTK primitives inside the JavaScript code itself?
Parsing an X.object from JSON should be supported. So you could generate the JSON on the server side and use the X.object(jsonobject) copy constructor to safely down-cast it. This should also give the advantage that the objects can be 'webgl-ready' and do not require any client-side parsing, which should result in instant loading.
I was planning to play with that myself soon but if you get anything to work, please let us know.
Just have in mind that you need to match the X.object structure even in JSON. The best way to see what is expected by xtk is to JSON.stringify a webgl-ready X.object.
XMLHttpRequest, in its second specification (the latest one), allows cross-domain HTTP requests (but you must have control of the headers sent from the server side).
In addition, it allows sending ArrayBuffers, Blobs, or Documents (look here). Then, on the client side, you can write your own parser for that Blob or (I think it fits your case better) that ArrayBuffer, using typed-array views (see the doc here). However, XMLHttpRequest goes from client to server; also look at HTML5 WebSocket, which it seems can transfer binary arrays too (they say it here: ).
In every case you will need a parser on the client side to transform the binary data into strings or X.objects.
I hope this helps.

Store a file metadata in an extra file

I have a bunch of image files (mostly .jpg). I would like to store metadata about these files (e.g. dominant color, color distribution, maximum gradient flow field, interest points, ...). These data fields are not fixed and are not available in all images.
Right now I am storing the metadata for each file as a separate file with the same name but a different extension. The format is just text:
metadataFieldName1 metadataFieldValue1
metadataFieldName2 metadataFieldValue2
This got me wondering: is there a better/easier way to store this metadata? I thought of Protocol Buffers, since I need to be able to read and write this information in both C++ and Python. But how do I support the case where some metadata are not available?
I would suggest that you store such metadata within the image files themselves.
Most image formats support storing metadata; .jpeg supports it through Exif.
If you're on Windows you can use the WIC to store and retrieve metadata in a unified manner.
Why protocol buffers and not XML, INI files, or whatever text-ish format? Just choose some format...
And what do you mean by "metadata not available"? It is up to your application to respond to such error situations... what does this have to do with the storage format?
Look at http://www.yaml.org. YAML is less verbose than XML and more human-friendly to read.
There are YAML libraries for C++, Python, and many other languages.
Example:
import yaml

data = {"field1": "value1",
        "field2": "value2"}
serializedData = yaml.dump(data, default_flow_style=False)
with open("datafile", "w") as f:
    f.write(serializedData)
I thought long about this matter and went with Protocol Buffers to store metadata for my images. For each image, e.g. Image00012.jpg, I store the metadata in Image00012.jpg.pbmd. Once I had my .proto file set up, the Python and C++ classes were auto-generated. It works very well and requires me to spend little time on parsing (clearly better than writing a custom reader for YAML files).
RestRisiko brings up a good point about how I should handle unavailable metadata. The good thing about Protocol Buffers is that they support optional/required fields, which solves my problem on this front.
The reason I think XML and INI are not good for this purpose is that much of my metadata is complex (color distribution, ...) and requires a bit of storage customization. Protocol Buffers let me nest proto declarations. Plus, the size of the metadata file and the parsing speed are clearly superior to my hand-rolled XML reading/writing.
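To illustrate the nesting and optional fields described above, such a .proto schema might look roughly like this (the message and field names here are invented for illustration, not taken from the actual project):

```proto
syntax = "proto2";

// Complex metadata get their own nested message type.
message ColorDistribution {
  repeated float bin = 1;
}

message ImageMetadata {
  optional string dominant_color = 1;
  optional ColorDistribution color_distribution = 2;  // simply absent when not computed
  repeated string interest_points = 3;
}
```

Fields that were never computed for an image are simply left unset, and the readers can check their presence with the generated has_*() accessors.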

XML Serialization/Deserialization in C++

I am using C++ with MinGW, the Windows port of GNU C++.
What I want to do is serialize a C++ object into an XML file and deserialize objects from the XML file on the fly. I checked TinyXML. It's pretty useful, and (please correct me if I misunderstand it) it basically adds all the nodes during processing and finally puts them into a file in one chunk using the TiXmlDocument::SaveFile(filename) function.
I am working on real-time processing; how can I write to a file on the fly, appending each result to the file as it is produced?
Thanks.
Boost has a very nice serialization/deserialization library, Boost.Serialization.
If you stream your objects to a Boost XML archive, it will stream them in XML format.
If XML is too big or too slow, you only need to swap the archive for a text or binary archive to change the streaming format.
Here is a better example of C++ object serialization:
http://www.codeproject.com/KB/XML/XMLFoundation.aspx
I notice that each TiXmlBase Class has a Print method and also supports streaming to strings and streams.
You could walk the new parts of the document in sequence and output those parts as they are added, maybe?
Give it a try.....
Tony
I've been using gSOAP for this purpose. It is probably too powerful for just XML serialization, but knowing it can do much more means I do not have to consider other solutions for more advanced projects since it also supports WSDL, SOAP, XML-RPC, and JSON. Also suitable for embedded and small devices, since XML is simply a transient wire format and not kept in a DOM or something memory intensive.