Send metadata along with Akka stream - akka

Here is my previous question: Send data from InputStream over Akka/Spring stream
I have managed to send a compressed and encrypted file over an Akka stream. Now I am looking for a way to transport metadata along with the data, mainly the filename and a hash (checksum).
My current idea is to use the Flow.prepend function and insert the metadata before the data this way:
filename, that can vary in size but always ends with null byte
fixed size hash (checksum)
data
Then, on the receiving end, I would have to use Flow.takeWhile twice: once to read the filename and a second time to read the hash, and then just read the data. This doesn't look like an elegant solution, and if I want to add more metadata in the future it will become even worse.
I have noticed the method Flow.named, but the documentation only says:
Add a ``name`` attribute to this Flow.
and I do not know how to use it (or whether it is possible to transport the filename with it).
The question is: is there a better way to transport metadata along with the data over an Akka stream than the above?
EDIT: Attaching my drawing with idea.

I think prepending the metadata makes sense. A simple approach could be to prepend the metadata using the same framing you use to send the data.
The receiving end will need to know how many metadata blocks there are and use this information to split the stream. See the example below.
// client end
filenameSrc
  .concat(hashSrc)   // metadata frames first: filename, then hash
  .concat(dataSrc)   // followed by the data itself
  .via(Framing.delimiter(ByteString("\n"), Int.MaxValue, allowTruncation = true))
  .via(Tcp().outgoingConnection(???, ???))
  .runForeach { ??? }
// server end
val printMetadata =
  Flow.fromGraph(GraphDSL.create() { implicit builder: GraphDSL.Builder[NotUsed] =>
    import GraphDSL.Implicits._
    val metadataSink = Sink.foreach(println)
    val bcast = builder.add(Broadcast[ByteString](2))
    // the first two frames are the metadata blocks
    bcast.out(0).take(2) ~> metadataSink
    // everything after them is data and flows on downstream
    FlowShape(bcast.in, bcast.out(1).drop(2).outlet)
  })
val handler =
  Framing.delimiter(ByteString("\n"), Int.MaxValue)
    .via(printMetadata)
    .via(???)
This is only one of the many possible approaches to solve this. But whatever solution you choose, the receiver will need to have knowledge of how to extract the metadata from the raw stream of bytes it reads over TCP.
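For comparison, here is a minimal C++ sketch of the layout proposed in the question (NUL-terminated filename, fixed-size hash, then data), showing what the receiver has to know in order to split the header from the payload. The names Transfer, packTransfer and unpackTransfer, and the 32-byte hash size, are assumptions made for illustration; they are not part of Akka or of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed layout: <filename> '\0' <kHashSize raw hash bytes> <data...>
constexpr std::size_t kHashSize = 32;  // assumption, e.g. a SHA-256 digest

struct Transfer {
    std::string filename;  // must not contain a NUL byte
    std::string hash;      // exactly kHashSize raw bytes
    std::string data;      // everything after the header
};

// Sender side: prepend the metadata to the data.
std::string packTransfer(const Transfer& t) {
    assert(t.hash.size() == kHashSize);
    std::string out = t.filename;
    out.push_back('\0');  // filename terminator
    out += t.hash;
    out += t.data;
    return out;
}

// Receiver side: returns false if the header is not complete yet.
bool unpackTransfer(const std::string& bytes, Transfer& out) {
    const std::size_t nul = bytes.find('\0');
    if (nul == std::string::npos) return false;  // filename not terminated
    const std::size_t hashStart = nul + 1;
    if (bytes.size() < hashStart + kHashSize) return false;  // hash incomplete
    out.filename = bytes.substr(0, nul);
    out.hash = bytes.substr(hashStart, kHashSize);
    out.data = bytes.substr(hashStart + kHashSize);
    return true;
}
```

Adding another metadata field to this layout means changing both functions in lockstep, which is the maintenance cost the question worries about. A length-prefixed or self-describing header would avoid that.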

Related

get the binary data transferred from grpc client

I am new to gRPC framework, and I have created a sample client-server on my PC (referring to this).
In my client-server application I have implemented a simple RPC
service NameStudent {
  rpc GetRoll(RollNo) returns (Details) {}
}
The client sends a RollNo and receives his/her details which are name, age, gender, parent name, and roll no.
message RollNo {
  int32 roll = 1;
}
message Details {
  string name = 1;
  string gender = 2;
  int32 age = 3;
  string parent = 4;
  RollNo rollid = 5;
}
The actual server and client codes are adaptation of the sample code explained here
Now my server is able to listen to "0.0.0.0:50051(address:port)" and client is able to send the roll no on "localhost:50051" and receive the details.
I want to see the actual binary data that is transferred between client and server. I have tried using Wireshark, but I don't understand what I am seeing here.
Here is the screenshot of the Wireshark capture
And here are the details of the highlighted entry from the above screenshot.
I need help understanding the Wireshark output, or any other way to see the binary data.
Wireshark uses the port to determine how to decode the communication, and it doesn't know any protocol associated with 50051. So you need to configure it to treat this as HTTP.
Right click on a row and select "Decode As..." in the context menu.
Then set "Current" to "HTTP" or "HTTP2" (HTTP will generally auto-detect HTTP2) and hit "OK".
Then the HTTP/2 frames should be decoded. And if using a recent version of Wireshark, you may also see the gRPC frames decoded.
The whole idea of gRPC is to HIDE that. Let's say we ignore that and you know what you're doing.
Look at https://en.wikipedia.org/wiki/Protocol_Buffers. gRPC uses Protocol Buffers for its data representation. That might give you a hint about the data you're seeing.
Two good starting points for a reverse-engineering exercise are:
Start simple: compile a program that sends an integer. Understand it. Sniff it. Then compile a program that sends a string. Try several values. Once you understand that, move on to tackling how Google is sending your structure.
Use known data and make small variations: knowing what 505249... means is easier if you start from data you know you are sending (for example, send the string "Hello world", then change it to "Hella world" and see what changes in the sniffed bytes; also check that sending the same data several times produces the same sniffed output). Apply the previous point: start simple, first an empty string, then " ", then "a", then "b", etc., and then move on to longer and more complex strings. Don't be afraid to start simple.
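To have known bytes to look for in the capture, it can help to hand-encode a tiny message. The sketch below encodes a message with a single int32 field number 1 (like RollNo.roll in the question) using the protobuf varint wire format, then adds the 5-byte gRPC message prefix (a 1-byte compression flag followed by a 4-byte big-endian length). The function names and the value 150 are arbitrary choices for illustration, and only non-negative values are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Protobuf base-128 varint: 7 payload bits per byte, least significant
// group first, high bit set on every byte except the last one.
std::string encodeVarint(std::uint64_t value) {
    std::string out;
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
    return out;
}

// One non-negative int32 field: tag = (fieldNumber << 3) | wireType 0.
std::string encodeInt32Field(std::uint32_t fieldNumber, std::uint32_t value) {
    const std::uint64_t tag = static_cast<std::uint64_t>(fieldNumber) << 3;
    return encodeVarint(tag) + encodeVarint(value);
}

// gRPC length-prefixed message: flag byte (0 = uncompressed) + 4-byte
// big-endian length + the serialized protobuf message.
std::string grpcFrame(const std::string& message) {
    const std::uint32_t n = static_cast<std::uint32_t>(message.size());
    std::string out;
    out.push_back('\0');
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    return out + message;
}

// Space-separated lowercase hex, convenient for comparing with a capture.
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out.push_back(' ');
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}
```

With roll set to 150, toHex(grpcFrame(encodeInt32Field(1, 150))) gives "00 00 00 00 03 08 96 01", which is the byte sequence to look for inside an HTTP/2 DATA frame. As for the 505249... mentioned above: 0x50 0x52 0x49 are the ASCII codes for "PRI", the first bytes of the HTTP/2 connection preface that a client sends.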

Serialize and deserialize the message using google protobuf in socket programming in C++

The message format sent to the server side is as below:
package test;
message Test {
  required int32 id = 1;
  required string name = 2;
}
Server.cpp does the encoding:
string buffer;
test::Test original;
original.set_id(0);
original.set_name("original");
original.AppendToString(&buffer);
send(acceptfd, buffer.c_str(), buffer.size(), 0);
This send call should send the data to the client, I hope; I am not getting any errors from this code.
My question is: how do I decode the above message on the client side using Google Protocol Buffers, so that I can see/print the message?
You should send more than just the protobuf message to be able to decode it on the client side.
A simple solution would be to send the value of buffer.size() over the socket as a 4-byte integer in network byte order, and then send the buffer itself.
The client should first read the buffer's size from the socket and convert it from network to host byte order. Let's denote the resulting value s. The client must then preallocate a buffer of size s and read s bytes from the socket into it. After that, just use MessageLite::ParseFromString to reconstruct your protobuf.
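A minimal sketch of that size-prefix scheme, without the socket and protobuf calls so it stands alone. frameMessage and unframeMessage are hypothetical helper names; the payload is assumed to be the serialized protobuf bytes (for example the buffer string from the question), and the header is written big-endian by hand, which is the same byte order htonl would produce.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Sender: 4-byte big-endian (network order) length, then the payload.
std::string frameMessage(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFull);  // must fit the 4-byte header
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Receiver: returns false until `bytes` holds the header and the whole
// payload; on success `payload` receives exactly the framed bytes.
bool unframeMessage(const std::string& bytes, std::string& payload) {
    if (bytes.size() < 4) return false;  // header incomplete
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(bytes[i]);
    if (bytes.size() - 4 < n) return false;  // payload incomplete
    payload = bytes.substr(4, n);
    return true;
}
```

On the client, the string returned through payload is what would be handed to ParseFromString on a test::Test object.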
See here for more info on protobuf message methods.
Also, this document discourages the usage of required:
You should be very careful about marking fields as required. If at
some point you wish to stop writing or sending a required field, it
will be problematic to change the field to an optional field – old
readers will consider messages without this field to be incomplete and
may reject or drop them unintentionally. You should consider writing
application-specific custom validation routines for your buffers
instead. Some engineers at Google have come to the conclusion that
using required does more harm than good; they prefer to use only
optional and repeated. However, this view is not universal.

How to read complete data in QTcpSocket?

Now the server (implemented in Java) sends some stream data to me. My code is as follows:
connect(socket, SIGNAL(readyRead()), this, SLOT(read_from_server()));
In read_from_server():
{
    while (socket->bytesAvailable())
    {
        QString temp = socket->readAll();
    }
}
But I find that even when the server sends me a string of only a few characters, the data is truncated and my function is called twice, so temp never holds the complete data that I want.
If the server sends me a longer string, my function may be called three or more times, which makes it difficult to know when the data has been transferred completely.
Can anyone tell me how to receive the complete data easily, without so many cumbersome steps? I'm sorry if this duplicates other questions; I couldn't get their answers to work for me. Many thanks!
What you're seeing is normal for client-server communication. Data is sent in packets and the readyRead signal is informing your program that there is data available, but has no concept of what or how much data there is, so you have to handle this.
To read the data correctly, you will need a buffer, as mentioned by @ratchetfreak, to append the bytes as they're read from the stream. It is important that you know the format of the data being sent, in order to know when you have a complete message. I have previously used at least two methods to do this:
1) Ensure that sent messages begin with the size, in bytes, of the message being sent. On receiving data, you start by reading the size and keep appending to your buffer until it totals the size to expect.
2) Send all data in a known format, such as JSON or XML, which can be checked for the end of the message. For example, in the case of JSON, every message will begin with an opening brace '{' and end with a closing brace '}', so you could count braces and match up the data, or use QJsonDocument::fromJson (checking the QJsonParseError it reports) to verify that the data is complete.
Having used both of these methods, I recommend using the first; include the size of a message that is being sent.
You can use a buffer field to hold the unfinished data temporarily and handle packets as they complete:
{
    while (socket->bytesAvailable())
    {
        buffer.append(socket->readAll());
        // getPacketSize: size of the first complete packet in buffer, or 0 if none yet
        int packetSize = getPacketSize(buffer);
        while (packetSize > 0)
        {
            handlePacket(buffer.left(packetSize));
            buffer.remove(0, packetSize);
            packetSize = getPacketSize(buffer);
        }
    }
}
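getPacketSize is not defined in the snippet above. One possible shape for it, assuming every packet starts with a 4-byte big-endian length of its body (an assumption about the protocol, not something the answer specifies), is sketched here in plain C++ with std::string standing in for QByteArray. The feed helper is likewise hypothetical and mirrors the loop above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kHeaderSize = 4;  // assumed: 4-byte big-endian body length

// Total size (header + body) of the first complete packet in `buffer`,
// or 0 if the header or the body has not fully arrived yet.
std::size_t getPacketSize(const std::string& buffer) {
    if (buffer.size() < kHeaderSize) return 0;
    std::uint32_t bodySize = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i)
        bodySize = (bodySize << 8) | static_cast<unsigned char>(buffer[i]);
    const std::size_t total = kHeaderSize + bodySize;
    return buffer.size() >= total ? total : 0;
}

// Append one received chunk and return the bodies of all packets that
// are now complete; leftover bytes stay in `buffer` for the next call.
std::vector<std::string> feed(std::string& buffer, const std::string& chunk) {
    buffer += chunk;
    std::vector<std::string> bodies;
    for (std::size_t n = getPacketSize(buffer); n > 0; n = getPacketSize(buffer)) {
        bodies.push_back(buffer.substr(kHeaderSize, n - kHeaderSize));
        buffer.erase(0, n);
    }
    return bodies;
}
```

Because the leftover bytes persist between calls, it does not matter how the network splits the stream: a packet is handled exactly once, as soon as its last byte arrives.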
If all of the data has not yet arrived then your while loop will exit prematurely. You need to use a message format that will let the receiving code determine when the complete message has been received. For example, the message could begin with a length element, or if you are dealing with text the message could end with some character used as a terminator.
The problem is that during a TCP transfer, data is sent in chunks of undefined size. If you are trying to read a block of a defined size, you have to know the expected size in advance or have a way to determine where your block ends (something like a zero-terminated C string).
Check whether this answer helps you (it shows a trick for waiting for the expected data block).

parsing an XMPP stream with libxml2

I'm a beginner when it comes to libxml2, so here is my question:
I'm working on a small XMPP client. I have a stream that I receive from the network; the received buffer is fed into my Parser class, chunk by chunk, as the data is received. I may receive incomplete fragments of XML data:
<stream><presence from='user1@dom
and at the next read from socket I should get the rest:
ain.com to='hatter@wonderland.lit/'/>
The parser should report an error in this case.
I'm only interested in elements having depth 0 and depth 1, like stream and presence in my example above. I need to parse this kind of stream and, for each of these elements at depth 0 or 1, create an xmlNodePtr (I have classes representing stream and presence elements that take an xmlNodePtr as input). This means I must be able to create an xmlNodePtr from only a start element like <stream>, because the associated end element (</stream> in this case) is received only when the communication is finished.
I would like to use a pull parser.
What are the best functions to use in this case? xmlReaderForIO, xmlReaderForMemory, etc.?
Thank you !
You probably want a push parser using xmlCreatePushParserCtxt and xmlParseChunk. Even better would be to choose one of the existing open source C libraries for XMPP. For example, here is the code from libstrophe that does what you want already.

synchronizing between send/recv in sockets

I have a server that's sending out data records as strings of varying length (e.g. 79, 80, 81, 82).
I want to be able to receive exactly one record at a time. I've delimited records with a (r), but because I don't know how many bytes I have to receive, it sometimes merges records, which makes them difficult for me to process.
I have two ideas for you:
Use XML for the protocol. This way you know exactly when each message ends.
Send in the header of each "packet" the packet size, this way you know how much to read from the socket for this specific packet.
Edit:
Look at this dummy code for (2)
int buffer_size;
char* buffer;
// read the size header first, then allocate exactly that many bytes
read(socket, &buffer_size, sizeof(buffer_size));
buffer = (char*) malloc(buffer_size);
read(socket, buffer, buffer_size);
// do something
free(buffer);
EDIT:
I recommend looking at the comments here, as they note that the content might not all be delivered by a single "read()"; you need to keep "read()"ing until you have received the full buffer size.
Also, you might not need to read the size. Basically you need to look for the closing top-level tag of the XML. This can be done by parsing the whole XML, or by partly parsing the XML you get from the stream until you have 0 nodes "open".
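The "keep reading until you have the full size" loop could look like the sketch below. To keep it testable without a socket, the actual read call is passed in as a callback with the same return convention as POSIX read() (bytes read, 0 at end of stream, negative on error). ReadFn and readExact are hypothetical names.

```cpp
#include <cstddef>
#include <functional>

// Same contract as POSIX read(): number of bytes read, 0 at end of
// stream, negative on error. May return fewer bytes than requested.
using ReadFn = std::function<long(char* dst, std::size_t maxBytes)>;

// Keep calling readSome until exactly `wanted` bytes are stored in dst.
// Returns false if the stream ends or fails before that.
bool readExact(const ReadFn& readSome, char* dst, std::size_t wanted) {
    std::size_t got = 0;
    while (got < wanted) {
        const long n = readSome(dst + got, wanted - got);
        if (n <= 0) return false;  // EOF or error before the data was complete
        got += static_cast<std::size_t>(n);
    }
    return true;
}
```

With a real socket, readSome would be a small lambda wrapping read(socket, dst, maxBytes). readExact would be called once for the size header and once more for the body.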
You should delimit with a null byte. Show us your code, and we may be able to help you.
Stream sockets do not natively support an idea of a "record" - the abstraction they provide is that of a continuous stream.
You must implement a layer on top of them to provide "records". It sounds like you are already part way there, with the end-of-record delimiter. The pseudo-code to complete it is:
create empty buffer;
forever {
    recv data and append to buffer;
    while (buffer contains end-of-record marker) {
        remove first record from buffer and process it;
        move remaining data to beginning of buffer;
    }
}
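One way to turn that pseudo-code into C++ is a small class that owns the buffer. RecordSplitter is a hypothetical name, and the delimiter is passed in because the question does not pin down which byte is used.

```cpp
#include <string>
#include <vector>

// Accumulates received bytes and hands back complete records, split on
// a single-byte end-of-record marker.
class RecordSplitter {
public:
    explicit RecordSplitter(char delimiter) : delimiter_(delimiter) {}

    // Append a freshly received chunk; return every record that is now
    // complete (without the delimiter). Partial data stays buffered.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> records;
        std::string::size_type end;
        while ((end = buffer_.find(delimiter_)) != std::string::npos) {
            records.push_back(buffer_.substr(0, end));
            buffer_.erase(0, end + 1);  // drop the record and its delimiter
        }
        return records;
    }

    // Bytes received so far that do not yet form a complete record.
    const std::string& pending() const { return buffer_; }

private:
    char delimiter_;
    std::string buffer_;
};
```

Each recv() result is passed to feed(), so two records merged into one read are separated again, and a record split across two reads is returned only once its delimiter has arrived.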
Are you sending your data as a stream?
You can send it as a structure, which is easier to parse and retrieve the data from.
struct Message
{
    int dataSize;
    char data[256];
};