How to get a message request from its sequence number? - c++

Given a sequence number, I need to find the corresponding request message string.
I can't find an easy way to do that with the QuickFIX library.
In short, my idea was to use the FileStore "body" file to retrieve the request message string from a sequence number, since the FileStore class exposes a convenient method:
get(int begin, int end, std::vector<std::string>& result)
But I am facing an issue: those files are already held open by another FileStore instance (the one owned by the Initiator), so under Windows they are inaccessible from any other part of my application, because Windows forbids a second owner on those files.
Do I need to write my own mechanism to get request message strings from their sequence numbers?

I'm not sure why you are trying to get the 'message string' based on a sequence number.
Is this during trading? Can you modify your application code? Your application gets the messages from the server/client, so you can just dump each message as a string (in C++ the message classes have a toString() method).
You could keep the string in a dictionary with the sequence number as the key. The library lets you peek at the outgoing messages as well.
If it is after trading, you can set the engine to create data files and then just process the data file; it has all the messages received and sent.
Sorry, I just can't figure out what exactly you are trying to use.
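The dictionary idea can be sketched without touching the store files at all. This is a minimal sketch, not QuickFIX code: SentMessageLog and extractSeqNum are hypothetical names, and it assumes you hand it the raw string of every outgoing message (for example the result of toString() taken inside your application callbacks). It relies only on two FIX conventions: fields are separated by SOH (0x01) and MsgSeqNum is tag 34.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <map>
#include <string>

// Extracts MsgSeqNum (tag 34) from a raw FIX string whose fields are
// separated by SOH (0x01). Returns false if the tag is missing or not numeric.
bool extractSeqNum(const std::string& raw, int& seqNum) {
    const char SOH = '\x01';
    std::size_t pos = 0;
    while (pos < raw.size()) {
        std::size_t end = raw.find(SOH, pos);
        if (end == std::string::npos) end = raw.size();
        // Only match at the start of a field, so "134=" is not mistaken for "34=".
        if (raw.compare(pos, 3, "34=") == 0) {
            try {
                seqNum = std::stoi(raw.substr(pos + 3, end - pos - 3));
                return true;
            } catch (const std::exception&) {
                return false;
            }
        }
        pos = end + 1;
    }
    return false;
}

// Hypothetical helper: remembers each raw message under its sequence number.
class SentMessageLog {
public:
    // Returns false if the message carries no usable tag 34.
    bool record(const std::string& raw) {
        int seq = 0;
        if (!extractSeqNum(raw, seq)) return false;
        bySeq_[seq] = raw;
        return true;
    }

    // Returns the stored string, or nullptr if nothing was recorded for seq.
    const std::string* find(int seq) const {
        std::map<int, std::string>::const_iterator it = bySeq_.find(seq);
        return it == bySeq_.end() ? nullptr : &it->second;
    }

private:
    std::map<int, std::string> bySeq_;
};
```

If your callbacks can run on more than one thread, the map would also need a mutex around record and find.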

Related

modify raw protobuf stream

Let's say I have compiled an application (Receiver) with the following proto file:
syntax = "proto3";
message Control {
bytes version = 1;
uint32 id = 2;
bytes color = 3;
}
and I have another application (Transmitter) which initially has the same proto file but after an update a new field is added like:
syntax = "proto3";
message Control {
bytes name = 1;
uint32 id = 2;
bytes color = 3;
uint32 color_id = 4;
}
I have seen that if the Receiver app parses the message, changes some data and then serializes it back, the added fields coming from the Transmitter app are removed.
I need a way to change the id field by accessing the raw bytes directly, without having to parse/serialize the message. Is it possible?
This is needed because I have some "header" fields in the Control message that I know will never change, while others can be added or changed in the Transmitter's copy of the proto file after an app update.
I have seen: https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream
but I was not able to modify an existing byte stream, and ReadString is not able to work out the string length.
Thanks in advance
I don't think there is an official way to do it. You could do this by hand following the encoding guidelines by protobuf (https://developers.google.com/protocol-buffers/docs/encoding#structure).
Basically you should do this:
start decoding at the very first byte
decode field by field until you reach the field number of the id
identify the bytes representing the id and replace them with your new (encoded!) id
This is bad for several reasons. Most importantly, your code has to know details about the message structure and content (field number and data type of your id), and this is exactly what you want to avoid when using protocol buffers (you always need some info from the .proto files).
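The three steps can be sketched with nothing but the wire-format rules. This is a hedged illustration rather than an official API: readVarint, encodeVarint and patchVarintField are hypothetical helpers, the code assumes the message is a top-level Control (not embedded in a parent whose length prefix would also have to change), and it patches only the first occurrence of the field on the assumption that the serializer wrote it once. Note that proto3 does not write an id of 0 at all, in which case there is nothing to patch and the function returns false.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Reads a base-128 varint starting at buf[pos] and advances pos past it.
// Returns false on truncated or over-long input.
bool readVarint(const std::string& buf, std::size_t& pos, std::uint64_t& out) {
    out = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        std::uint8_t b = static_cast<std::uint8_t>(buf[pos++]);
        out |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return true;
    }
    return false;
}

// Encodes v as a base-128 varint (7 payload bits per byte, low group first).
std::string encodeVarint(std::uint64_t v) {
    std::string out;
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
    return out;
}

// Replaces the value of the first top-level varint field `fieldNumber`.
// Returns false if the field is absent or the bytes are malformed.
bool patchVarintField(std::string& msg, std::uint32_t fieldNumber,
                      std::uint64_t newValue) {
    std::size_t pos = 0;
    while (pos < msg.size()) {
        std::uint64_t tag = 0;
        if (!readVarint(msg, pos, tag)) return false;
        const std::uint32_t wireType = static_cast<std::uint32_t>(tag & 7);
        const std::uint64_t number = tag >> 3;
        const std::size_t valueStart = pos;
        std::uint64_t tmp = 0;
        switch (wireType) {
            case 0:  // varint
                if (!readVarint(msg, pos, tmp)) return false;
                break;
            case 1:  // fixed 64-bit
                pos += 8;
                break;
            case 2:  // length-delimited: skip the length and the payload
                if (!readVarint(msg, pos, tmp)) return false;
                if (tmp > msg.size() - pos) return false;
                pos += static_cast<std::size_t>(tmp);
                break;
            case 5:  // fixed 32-bit
                pos += 4;
                break;
            default:  // groups are not handled in this sketch
                return false;
        }
        if (pos > msg.size()) return false;
        if (number == fieldNumber && wireType == 0) {
            // The new encoding may be shorter or longer than the old one.
            msg.replace(valueStart, pos - valueStart, encodeVarint(newValue));
            return true;
        }
    }
    return false;
}
```

Fields the Receiver does not know about (such as color_id = 4) are simply skipped over, so they survive untouched.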
In proto2 syntax, the protobuf C++ library used to preserve unknown fields so that when you re-encoded the message, they would remain. Unfortunately this feature (like many others) was removed in the proto3 syntax.
One workaround could be to do it this way:
Set only the new id value in the Receiver message and encode it.
Append this data after the original binary data.
This relies on the protobuf merge rule: when serialized messages are concatenated, the parser reads them as one message, and for a non-repeated scalar field the last value it sees replaces the earlier one.
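A minimal sketch of the append approach at the byte level, assuming id is field 2 with a varint wire type as in the proto file from the question. encodeVarint and appendVarintField are hypothetical helper names; with generated code the same bytes would come from serializing a Control in which only id is set.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encodes v as a base-128 varint (7 payload bits per byte, low group first).
std::string encodeVarint(std::uint64_t v) {
    std::string out;
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
    return out;
}

// Returns `original` followed by one extra varint field. A parser reading the
// result sees the field twice and keeps the later value.
std::string appendVarintField(const std::string& original,
                              std::uint32_t fieldNumber, std::uint64_t value) {
    std::string out = original;
    // Tag = (field number << 3) | wire type; wire type 0 means varint.
    out += encodeVarint(static_cast<std::uint64_t>(fieldNumber) << 3);
    out += encodeVarint(value);
    return out;
}
```

The original bytes are never decoded, so unknown fields cannot be lost. The cost is that the message grows a little on every hop that overrides the id.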
Hmm, actually reading the issue report linked above, it seems that unknown field preservation came back in protobuf version 3.5 and newer: unknown fields are kept when parsing and written out again when serializing.
Just deserialize the entire message and map it on the new message. It is the cleanest way. You do not have a lot of data and probably no real time requirements. Create a mapper and do not overthink the problem.

How to send large, frequent xml data from javascript to a c++ http server

In my project I want to send possibly large and frequent XML data to a custom server written in C++. I don't want to use Apache and CGI because the data is too frequent to start a CGI process for every request. I would prefer the data to be received directly in the C++ code that will process it and send a reply.
I started out by using libmicrohttpd for the c++ server but now I believe it won't be possible because it doesn't give access to the raw POST data. I started looking for another library but I can't seem to find a c++ library that does this. Can anyone suggest a c++ http server library that has access to the raw post data?
Here is the code I intended to start with. It is one of the example files provided in the source code of libmicrohttpd. Post Example from libmicrohttpd library
Edit:
A little more context.
From what I understand, to access the POST data in libmicrohttpd you create an MHD_PostProcessor whose iterator function gets called incrementally as the POST data is received in chunks. But the example below only shows how to get POST data in the form of key-value pairs; I can't see how to get the raw data from a POST.
The example implements the MHD_PostProcessor as post_iterator. See the definition of
static int post_iterator(void *cls,
enum MHD_ValueKind kind,
const char *key,
const char *filename,
const char *content_type,
const char *transfer_encoding,
const char *data, uint64_t off, size_t size) {
...
in the example posted above. You will see it only shows how to iterate the key value pairs.
MHD does give you access to the raw POST data, just grab it from "upload_data" directly instead of passing it to the MHD_PostProcessor. MHD will give you the uploaded POST stream incrementally by calling your main request processing callback repeatedly with more and more POST data being given to you raw, unprocessed in "upload_data".
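The accumulation pattern can be sketched as follows. To stay self-contained, this is a library-free model and does not link against libmicrohttpd: RequestBody and onRequestCall are hypothetical names, and only the three parameters that matter for the body (upload_data, upload_data_size, con_cls) are mirrored from MHD's request callback. In a real handler the same logic would sit inside the callback you pass to MHD_start_daemon, and the state would be released when the request completes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Per-request state; one instance lives behind *con_cls for each request.
struct RequestBody {
    std::string data;
    bool complete = false;
};

// Models the body-handling part of the request callback.
// Returns true once the whole body has arrived.
bool onRequestCall(const char* upload_data, std::size_t* upload_data_size,
                   void** con_cls) {
    if (*con_cls == nullptr) {
        // First call for this request: only the headers are known so far.
        *con_cls = new RequestBody();
        return false;
    }
    RequestBody* body = static_cast<RequestBody*>(*con_cls);
    if (*upload_data_size != 0) {
        // A raw, unparsed chunk of the POST body.
        body->data.append(upload_data, *upload_data_size);
        *upload_data_size = 0;  // report the chunk as fully consumed
        return false;
    }
    // A call with no upload data left means the body is finished.
    body->complete = true;
    return true;
}
```

Once the function returns true, body->data holds the complete XML document and can be handed to whatever parser you use before queuing the response.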

Protobuf ParseDelimitedFrom implementation in C++

A C# publisher is publishing continuous market data messages in a custom protobuf format over a socket using the "writeDelimitedTo" API. I have to read all the messages in C++ and deserialize them. Below is my code. Since C++ doesn't have "parseDelimitedFrom", I have coded something like the below after going through multiple suggestions in this forum.
Now my question: referring to the code below, if the first message is smaller than 1024 bytes, then in the first iteration I will have the full stream of the first message and part of the stream of the second message. After deserializing the first message, how can I read the remaining bytes of the second message from the socket and merge them with the part I read in the previous iteration?
EDIT: Support for "delimited" format is now part of the official protobuf library. The post below predates it being added.
I've written optimally-efficient versions of parseDelimitedFrom and writeDelimitedTo in C++ here (the read and write methods of Uncompressed):
https://github.com/capnproto/capnproto/blob/06a7136708955d91f8ddc1fa3d54e620eacba13e/c%2B%2B/src/benchmark/protobuf-common.h#L101
Feel free to copy.
These implementations read from / write to a ZeroCopyInputStream / ZeroCopyOutputStream. (Hmm, for some reason my write is declared to use FileOutputStream, but you should be able to just change that to ZeroCopyOutputStream.)
So, you'll need to create a ZeroCopyInputStream which reads from your StreamSocket, then pass it to my read().
It looks like StreamSocket is a classic copying-read interface. You should therefore use CopyingInputStreamAdaptor as your ZeroCopyInputStream, wrapping an implementation of CopyingInputStream which reads from your StreamSocket.
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.zero_copy_stream_impl_lite#CopyingInputStreamAdaptor
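If you would rather keep your own read loop, the leftover-bytes problem from the question can also be handled with a small reassembly buffer. This is a sketch under stated assumptions, not part of protobuf: DelimitedFrameBuffer is a hypothetical class, and it assumes the delimited format is a varint length (at most 5 bytes) followed by that many payload bytes. Every chunk read from the socket is appended, and complete payloads are peeled off the front; whatever is left stays buffered until the next read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

class DelimitedFrameBuffer {
public:
    // Appends bytes exactly as they came off the socket.
    void feed(const char* data, std::size_t size) { buf_.append(data, size); }

    // Moves the next complete payload into `payload` and returns true,
    // or returns false if more bytes are needed first.
    bool next(std::string& payload) {
        std::uint32_t len = 0;
        std::size_t pos = 0;
        for (int shift = 0;; shift += 7) {
            if (shift > 28) throw std::runtime_error("length prefix too long");
            if (pos == buf_.size()) return false;  // prefix not complete yet
            std::uint8_t b = static_cast<std::uint8_t>(buf_[pos++]);
            len |= static_cast<std::uint32_t>(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
        }
        if (buf_.size() - pos < len) return false;  // payload not complete yet
        payload.assign(buf_, pos, len);
        buf_.erase(0, pos + len);  // keep only the bytes of later messages
        return true;
    }

private:
    std::string buf_;
};
```

After each socket read, call feed once and then call next in a loop until it returns false, passing every payload to your message's ParseFromString.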

Protocol Buffers - Reading header (nested message) common across all messages

I am currently evaluating Protocol Buffers for use in a project (no code written as of yet). One of the things I'm unclear on is how you would read part of an encoded message, for example say I have a common header:
message Header {
required uint16 msg_type = 1;
required uint16 length = 2;
}
And say I deliver multiple different messages to a queue. How would the consumer work out how much data to read per message and what message type it should be constructed as?
There should be no need for a Header message here; the most common approach is to follow the "streaming" advice from here. Within that, you could either treat it as a sequence of identical union type messages, or (my preference) when writing, instead of just writing a length-prefix before each, include a varint that indicates the message type, then the length (as a varint). The number that indicates the message type is some arbitrary map you invent (so 1 = Foo, 2 = Bar, 3 = Blap, etc.). If you left-shift the message-type by 3 bits then "or" 2, then it will also be a well-formed protobuf stream itself, 100% identical to a repeated YourUnionType.
Basically, this is exactly the same as this answer, but instead of being field 1 each time, the number varies per message-type. Most implementations have a reader/writer API that make it possible to read and write raw varints, and to length-restrict the reader API. Some implementations have helper mechanisms to support streams of heterogeneous messages directly (basically, doing all the above for you).
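A minimal sketch of that framing, written without the protobuf library so the byte layout is visible. appendFrame, readFrame and the two varint helpers are hypothetical names, payloads are assumed to be already-serialized messages, and the type numbers are the arbitrary map mentioned above (they must be at least 1 to be valid field numbers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends v to out as a base-128 varint.
void writeVarint(std::string& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Reads a varint at in[pos], advancing pos. Returns false if it is truncated.
bool readVarint(const std::string& in, std::size_t& pos, std::uint64_t& v) {
    v = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        std::uint8_t b = static_cast<std::uint8_t>(in[pos++]);
        v |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return true;
    }
    return false;
}

// Appends one frame: varint((type << 3) | 2), varint(length), payload.
void appendFrame(std::string& stream, std::uint32_t type,
                 const std::string& payload) {
    writeVarint(stream, (static_cast<std::uint64_t>(type) << 3) | 2);
    writeVarint(stream, payload.size());
    stream += payload;
}

// Reads the frame starting at pos and advances pos past it. Returns false
// (leaving pos unchanged) at end of stream or on truncated/unexpected input.
bool readFrame(const std::string& stream, std::size_t& pos,
               std::uint32_t& type, std::string& payload) {
    std::size_t p = pos;
    std::uint64_t tag = 0;
    std::uint64_t len = 0;
    if (!readVarint(stream, p, tag) || (tag & 7) != 2) return false;
    if (!readVarint(stream, p, len) || len > stream.size() - p) return false;
    type = static_cast<std::uint32_t>(tag >> 3);
    payload.assign(stream, p, static_cast<std::size_t>(len));
    pos = p + static_cast<std::size_t>(len);
    return true;
}
```

The consumer switches on the returned type to decide which message class should parse the payload.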
In a recent project, I used Protocol Buffers like this:
We had one 'container' message that included all the actual messages as optional members:
message ContainerMessage {
optional Message1 message_1 = 1;
optional Message2 message_2 = 2;
//...
optional MessageN message_N = N;
}
Inside an application, you could just use ContainerMessage as a discriminated union of the real Messages.
Between applications, we serialized/deserialized the ContainerMessage and sent the serialized content, prefixed with a simple header containing the length of the serialized content.
That will depend on the protocol you are using.
Note that e.g. a lot of protocols go via serial interfaces, where you might have extra lines telling when a message starts and stops.
Often, messages will have their length at a fixed offset after the message start.
In other cases, you might need to parse the message element by element to find out how much of the message is left. So a string embedded in the message may be of fixed length, or have the length at the beginning, or might have \0 as end marker.
Mostly, when you store messages in a queue for further processing, you will want to add some more information to make your life easier - like when you just have an extra signal telling you when the message stops, you might store the message internally with its length.

What's the most efficient way to parse incomplete XML messages over a stream?

I have a TCP connection that sends me XML messages over a stream.
The first message I receive is the <?xml version="1.0" encoding="utf-8"?> message.
The second is a authentication request message, which provides a seed to use when hashing my credentials to send back to the server - <session seed="VJAWKBJXJO">.
At this point I should send a <session user="admin" password_hash="123456789"> message back to authenticate myself.
Once authenticated I will receive the desired data in the form of <Msg>data</Msg>.
If I do not authenticate in time with the server, I receive a </session> message, to indicate the session has been closed.
The problem is that I can't use a DOM parser because attempting to parse the <session> tag with no end tag always throws an error, so I'm attempting to use the Xerces-c SAX parser, to perform progressive parsing of the XML.
When I receive each message I want to ideally append it to a MemBufInputSource which contains all XML which has currently been received, then perform a parseNext on the buffer to parse the new XML that has been received, but I can't figure out how to get it working correctly.
Is there a better way around this problem? Perhaps just using a special case for the <session></session> messages?
Thanks
Have you tried using a different parser? If not, I'm using libxml2 (http://xmlsoft.org/), it's incredibly simple and it allows you to handle errors at your leisure.
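You can create an xmlTextReaderPtr from a buffer holding the bytes received from your connection (xmlReaderForFd and xmlReaderForIO read from a file descriptor or from your own read callback instead):
#include <libxml/xmlreader.h>
xmlTextReaderPtr reader = xmlReaderForMemory(...);
Then iterate through the nodes until you find your data:
int result;
/* xmlTextReaderRead returns 1 for each node, 0 at end of input, -1 on a parse error */
while ( (result = xmlTextReaderRead(reader)) == 1 )
{
    int nodetype = xmlTextReaderNodeType(reader);
    if ( nodetype == XML_READER_TYPE_ELEMENT )
    {
        const xmlChar* name = xmlTextReaderConstName(reader);
        /* now name is the name of the element, like "session" */
        if ( xmlStrEqual(name, BAD_CAST "session") )
        {
            /* read the "seed" attribute; the returned copy must be released with xmlFree */
            xmlChar* seed = xmlTextReaderGetAttribute(reader, BAD_CAST "seed");
            /* ... use seed to hash your credentials ... */
            xmlFree(seed);
        }
    }
}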
They have a simple example, as well, for parsing out values:
http://xmlsoft.org/examples/reader1.c
It does have a bunch of features in there, though I can only speak for the basic reading, writing, and xinclude features.
Hope that helps!