How to split long messages into short messages in gRPC C++

Context: I have a long message (~1 GB) to send over gRPC. (Why do I use gRPC rather than plain HTTP? 95% of the messages are shorter than 4 MB.)
Since gRPC's default maximum receive message size is 4 MB, I have to split the message into multiple messages on the server side and reassemble them on the client side.
This is how I do it:
1. Get the Response and serialize it into a std::string with response.SerializeToString(&str).
2. Split the long string into multiple short strings, wrap each with metadata (e.g., an index), and send them one by one.
3. The client side receives all of them and concatenates them.
4. Get the Response back with message.ParseFromString(concated_str).
I assume these steps copy the 1 GB of data four times. Is there a way to avoid any of these copies?
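
The split and reassemble steps above can be sketched roughly as follows. This is a minimal sketch and not gRPC code: the Chunk struct and the functions split_into_chunks and reassemble are hypothetical names, and std::string_view (C++17) is used on the assumption that the serialized string stays alive until every chunk has been sent. Under that assumption the split itself copies nothing; each view would still have to be copied into whatever message type is actually sent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical chunk wrapper: a position index plus a non-owning view of the payload.
struct Chunk {
    std::size_t index;
    std::string_view data;  // points into the serialized buffer, no copy is made
};

// Split `serialized` into views of at most `chunk_size` bytes each.
// The buffer behind `serialized` must outlive the returned chunks.
std::vector<Chunk> split_into_chunks(std::string_view serialized, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<Chunk> chunks;
    chunks.reserve((serialized.size() + chunk_size - 1) / chunk_size);
    std::size_t index = 0;
    for (std::size_t offset = 0; offset < serialized.size(); offset += chunk_size) {
        // substr clamps the length, so the last chunk may be shorter.
        chunks.push_back({index++, serialized.substr(offset, chunk_size)});
    }
    return chunks;
}

// Reassemble chunks that arrive in index order. Reserving the total size up
// front means the destination buffer is allocated only once.
std::string reassemble(const std::vector<Chunk>& chunks, std::size_t total_size) {
    std::string out;
    out.reserve(total_size);
    std::size_t expected_index = 0;
    for (const Chunk& c : chunks) {
        assert(c.index == expected_index);  // this sketch assumes in-order delivery
        ++expected_index;
        out.append(c.data);
    }
    return out;
}
```

Sending the total size along with the first chunk (an assumption, not something the question states) is what would let the receiver reserve the buffer once instead of growing it repeatedly.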

It looks OK.
But it is rare to send such a huge message over RPC. Why don't you use a database or a messaging component such as Kafka or RabbitMQ? RPC is simply Remote Procedure Call, which means you should use it just like calling a function: you pass a few arguments to a function, not 1 GB of data. gRPC supports retries and timeouts, and you should set both explicitly rather than rely on the defaults. In your case, what if a retry happens while the third piece of data is being sent? That can happen because the network is not always reliable. You are going to have to wait a very long time to finish a single logical call. How would you set the gRPC timeout? 10 seconds? 20 seconds? What if the first 1 GB request hasn't finished when the second one arrives? The stack may explode!
So your design may work as expected, but it is not a good design.
Do NOT try to use one technique to solve every problem. gRPC is just one technique, not a silver bullet. Sending huge data calls for a different technique. In fact, that's why people developed so many kinds of messaging components.
I suggest using a message queue such as Kafka. 1 GB of data can't be generated in one shot, right? So as soon as some data is generated, write it to Kafka immediately, and on the other side, read from the Kafka queue.
Here is the architecture:
GRPC ----> Kafka writer ----> Kafka ----> Kafka reader ----> GRPC
You see, that's why people invented the word "stream".
Do not regard the 1 GB as a piece of static data; regard it as a stream instead.

Related

Boost ASIO - process data while still receiving

I am new to Boost ASIO and have the following use case:
A client sends 1 MB of data to a server. The server is able to process each byte of the data independently of the remaining data. My current solution uses the read_some and write_some methods for the server and client, respectively. This works well, but I would like to speed up my implementation by letting the server process the data directly while it is still receiving it. I have already looked at the documented examples but could not find one that fits my requirements.
I also wonder how I can keep track of how many bytes have been received so far. The client always sends the same amount of data.
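
One possible shape for the synchronous case, processing each piece as soon as it arrives and counting the bytes so far, is sketched below. It deliberately does not use Boost.Asio: ReadSome is a hypothetical callable with the same shape as a read_some call (fill up to max bytes, return how many arrived), and receive_and_process and the 4096-byte buffer size are assumptions for illustration. With Asio the callable would presumably wrap socket.read_some(boost::asio::buffer(buf, max)).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical callable shaped like read_some: writes up to `max` bytes into
// `buf` and returns how many bytes arrived (0 means the peer closed).
using ReadSome = std::function<std::size_t(char* buf, std::size_t max)>;
// Called once per received piece, before the next read is issued.
using ProcessChunk = std::function<void(const char* data, std::size_t len)>;

// Read until `expected_total` bytes have arrived, handing each piece to
// `process` immediately. Returns the number of bytes actually received.
std::size_t receive_and_process(const ReadSome& read_some,
                                const ProcessChunk& process,
                                std::size_t expected_total) {
    std::array<char, 4096> buf{};
    std::size_t received = 0;  // running count of bytes received so far
    while (received < expected_total) {
        const std::size_t want = std::min(buf.size(), expected_total - received);
        const std::size_t n = read_some(buf.data(), want);
        if (n == 0) break;  // connection closed early
        assert(n <= want);
        process(buf.data(), n);
        received += n;
    }
    return received;
}
```

This still alternates between reading and processing on one thread. Overlapping the two would need asynchronous reads or a second thread, which this sketch does not attempt.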
Thank you in advance! Best regards.

How to send and receive data up to SO_SNDTIMEO and SO_RCVTIMEO without corrupting connection?

I am currently planning a man-in-the-middle network application for a TCP server that would transfer data between server and client. It would behave as a regular client toward the server and as a server toward the remote client, without modifying any data. Optionally, it will be used to detect and measure how long the server or client is unable to receive data that is ready to be received while the connection is inactive.
I am planning to use the blocking send and recv functions. Before any data transfer I would call setsockopt to set SO_SNDTIMEO and SO_RCVTIMEO to about 10-20 milliseconds, assuming this forces blocking send and recv calls to return early so that data on another active connection can be routed. Running a thread per connection looks too expensive. I would not use async sockets here because I cannot find a guarantee that they will complete within a fraction of a second, especially when a large amount of data is being sent or received. Long data delays do not look good. I would use very small buffers, but calling a function for each received byte looks like overkill.
My next assumption is that it is safe to call send or recv again after a previous call was terminated by the timeout and transferred less data than requested.
But I am confused by contradictory information on MSDN.
send function
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740149%28v=vs.85%29.aspx
If no error occurs, send returns the total number of bytes sent, which
can be less than the number requested to be sent in the len parameter.
SOL_SOCKET Socket Options
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740532%28v=vs.85%29.aspx
SO_SNDTIMEO - The timeout, in milliseconds, for blocking send calls.
The default for this option is zero, which indicates that a send
operation will not time out. If a blocking send call times out, the
connection is in an indeterminate state and should be closed.
Are my assumptions correct that I can use these functions like this? Is there perhaps a more effective way to do this?
Thanks for answers

While you MIGHT implement something along the ideas you have given in your question, there are preferable alternatives on all major systems.
Namely:
kqueue on FreeBSD and family, and on Mac OS X.
epoll on Linux and related operating systems.
IO completion ports on Windows.
Using those technologies allows you to process traffic on multiple sockets in an efficient, reactive manner, without timeout logic and polling. They can all be considered successors of the ancient select() function in the socket API.
As for the quoted documentation for send() in your question, it is not really confusing or contradictory. Useful network protocols implement a mechanism to create "backpressure" for situations where a sender tries to send more data than the receiver (and/or the transport channel) can accommodate. So an application can only hand more data to send() if the network stack has buffer space ready for it.
If, for example, an application tries to send 3 KB of data and the TCP/IP stack only has room for 800 bytes, send() might succeed and report that it took 800 of the 3 KB offered.
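
The usual way to cope with such partial sends is to loop until everything has been accepted. A minimal sketch follows, assuming a hypothetical SendFn callable that behaves like send() (returns the number of bytes accepted, possibly fewer than offered, or -1 on error); send_all is an illustrative name, not a socket API function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical callable shaped like send(): offers `len` bytes and returns how
// many were accepted (possibly fewer than `len`), or -1 on error.
using SendFn = std::function<long(const char* data, std::size_t len)>;

// Keep offering the remaining bytes until all of them have been accepted.
// Returns false on an error or if the callee accepts nothing at all.
bool send_all(const SendFn& send_fn, const char* data, std::size_t len) {
    std::size_t sent = 0;
    while (sent < len) {
        const long n = send_fn(data + sent, len - sent);
        if (n <= 0) {
            return false;  // error, or no progress: stop instead of spinning
        }
        assert(static_cast<std::size_t>(n) <= len - sent);
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

With a stack that accepts at most 800 bytes per call, 3072 bytes would go out in four calls (800 + 800 + 800 + 672). In a forwarding proxy this loop would normally be driven by readiness notifications rather than run as a tight blocking loop.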
The basic approach to forwarding the data on a connection is: Do not read from the incoming socket until you know you can send that data to the outgoing socket. If you read greedily (and buffer on application layer), you deprive the communication channel of its backpressure mechanism.
So basically, the "send capability" should drive the receive actions.
As for using timeouts for this "middle man", there are 2 major scenarios:
You know the sending behavior of the sender application. I.e. if it has some intent on sending any data within your chosen receive timeout at any time. Some applications only send sporadically and any chosen value for a receive timeout could be wrong. Even if it is supposed to send at a specific time interval, your timeouts will cause trouble once someone debugs the sending application.
You want the "middle man" to work for unknown applications (which must not use some encryption for middle man to have a chance, of course). There, you cannot pick any "adequate" timeout value because you know nothing about the sending behavior of the involved application(s).

As a previous poster has suggested, I strongly urge you to reconsider the design of your server so that it employs an asynchronous I/O strategy. This may very well require that you spend significant time learning about each operating system's preferred approach. It will be time well spent.
For anything other than a toy application, using blocking I/O in the manner that you suggest will not perform well. Even with short timeouts, it sounds to me as though you won't be able to service new connections until you have completed the work for the current connection. You may also find (with short timeouts) that you're burning more CPU time spinning waiting for work to do than actually doing work.
A previous poster wisely suggested taking a look at Windows I/O completion ports. Take a look at this article I wrote in 2007 for Dr. Dobb's. It's not perfect, but I try to do a decent job of explaining how you can design a simple server that uses a small thread pool to handle potentially large numbers of connections:
Windows I/O Completion Ports
http://www.drdobbs.com/cpp/multithreaded-asynchronous-io-io-comple/201202921
If you're on Linux/FreeBSD/MacOSX, take a look at libevent:
Libevent
http://libevent.org/
Finally, a good, practical book on writing TCP/IP servers and clients is "TCP/IP Sockets in C: Practical Guide for Programmers" by Michael Donahoo and Kenneth Calvert. You could also check out the W. Richard Stevens texts (which cover the topic completely for UNIX).
In summary, I think you should take some time to learn more about asynchronous socket I/O and the established, best-of-breed approaches for developing servers.
Feel free to private message me if you have questions down the road.

How to avoid dropping messages zeromq pub sub

I have seen several questions about this, but none have answers I found satisfactory. The question "zeromq pattern: pub/sub with guaranteed delivery" in particular is similar, though I am open to using any other ZeroMQ mechanism to achieve the same effect.
My question is, is there any way to send messages in a fan-out pattern like publisher-subscriber in ZeroMQ with the assurance that the messages will be delivered? It seems as though a Dealer with zero-copy could do this okay, but it would be much messier than pub-sub. Is there a better option? What are the drawbacks of doing it this way besides having to write more code?
Reason for needing this:
I am writing code to analyze data coming from instrumentation. The module that connects to the instrumentation needs to be able to broadcast data to other modules for them to analyze. They, in turn, need to broadcast their analyzed data to output modules.
At first glance pub-sub with ZeroMQ seemed perfect for the job, but messages get dropped if any subscriber slows down and hits the high watermark. In the case of this system, it is not acceptable for messages to be dropped at only a fraction of the modules because of event continuity. All the modules need to analyze an event for the output to be meaningful. However, if no modules received the messages for an event, that would be fine. For this reason, it would be okay to block the publisher (the instrumentation module) if one of the analysis modules hit the high watermark.
I suppose another alternative is to deal with missed messages after the fact, but that just wastes processing time on events that would be discarded later.
EDIT:
I guess thinking about this further, I currently expect a message sent = message delivered because I'm using inproc and communicating between threads. However, if I were to send messages over TCP there is a chance that the message could be lost even if ZeroMQ does not drop it on purpose. Does this mean I probably need to deal with dropped messages even if I use a blocking send? Are there any guarantees about message delivery with inproc?

In general, I think there's no way of providing a guarantee for pub/sub on its own with 0MQ. If you really need completely reliable messaging, you're going to have to roll your own.
Networks are inherently unreliable, which is why TCP does so much handshaking just to get packets across.
As ever, it's a balance between latency and throughput. If you're prepared to sacrifice throughput, you can do message handshaking yourself - perhaps using REQ/REP - and handle the broadcasting yourself.
The 0MQ guide has some ideas on how to go about at least part of what you want here.

I agree with SteveL. If you truly need 100% reliability (or close to it), ZeroMQ is probably not your solution. You're better off going with commercial messaging products where guaranteed message delivery and persistence are addressed; otherwise, you'll be coding reliability features on top of ZeroMQ and likely pulling your hair out in the process. Would you implement your own app server if you required ACID compliance between your application and database? Unless you want to implement your own transaction manager, you'd buy WebLogic, WebSphere, or JBoss to do it for you.
Does this mean I probably need to deal with dropped messages even if I
use a blocking send?
I'd stay away from explicitly blocking anything, it's too brittle. A synchronous sender could hang indefinitely if something goes wrong on the consumption side. You could address this using polling and timeouts, but again, it's brittle and messy code; stick with asynchronous.
Are there any guarantees about message delivery with inproc?
Well, one thing is guaranteed; you're not dealing with physical sockets, so any network issues are eliminated.

This question comes up on search engines, so I just wanted to update.
You can stop ZeroMQ from silently dropping messages when using PUB sockets. You can set the ZMQ_XPUB_NODROP option, which makes a non-blocking send fail with an EAGAIN error when the send buffer is full instead of dropping the message.
With that information, you could create something like a dead letter queue, as mentioned here, and keep trying to resend with sleeps in between.
Efficiently handling this problem may not be possible currently, as there does not appear to be a way to be notified when the send buffer in ZeroMQ is no longer full, which means timed sleeps / polling may be the only way to find out if the send queue has room again so the messages can be published.
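
The "keep trying to resend with sleeps in between" idea could look roughly like the sketch below. It is not ZeroMQ code: TrySend is a hypothetical callable that returns true when the message was accepted and false when the send buffer was full, and send_with_retry, the attempt limit, and the delay are assumptions for illustration. In the ZeroMQ case the callable would presumably wrap a non-blocking zmq_send and treat EAGAIN as "buffer full".

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical non-blocking publish attempt: returns true if the message was
// accepted, false if the send buffer was full and the caller should retry.
using TrySend = std::function<bool()>;

// Retry a non-blocking send, sleeping between attempts. Returns false once
// `max_attempts` attempts have failed, so the caller can park the message
// (for example in a dead letter queue) instead of blocking forever.
bool send_with_retry(const TrySend& try_send, int max_attempts,
                     std::chrono::milliseconds delay) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_send()) {
            return true;
        }
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);  // timed polling; no readiness signal is assumed
        }
    }
    return false;
}
```

Bounding the number of attempts is a design choice of this sketch: it keeps a stalled subscriber from hanging the publisher indefinitely, at the cost of having to decide what to do with the message that could not be sent.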

Looking for best approach to sending the same data to multiple destinations using sockets

Looking for the best approach to sending the same message to multiple destinations using TCP/IP sockets. I'm working with an existing VS 2010 C++ application on Windows. Hoping to use a standard library/design pattern approach that has many of the complexities already worked out if possible.
Here's one approach I'm thinking about. One main thread retrieves messages from a database and adds them to some sort of thread-safe queue. The application also has one thread for each client socket connection to a destination server. Each of these threads would read from the thread-safe queue and send the message over a TCP/IP socket.
There may be better, simpler, or more robust approaches than this one, though.
The issue I am mostly concerned about is latency. The destinations could be anywhere, and there may be significant differences in latency between one socket connection and another.
The messages must go in exact FIFO order to all the destinations.
Also, one destination will be considered the primary destination: all messages must get to this destination, no exceptions. For the other, non-primary destinations, the messages are just copies, and it's not absolutely critical if they miss a few messages. At any point, one of the non-primary destinations could become the primary destination. If one of the destinations falls too far behind, its thread would need to catch up to the primary destination by skipping some messages.
Looking for any suggestions. From preliminary research so far, my situation appears to be something akin to a single-producer, multiple-consumer pattern, or possibly the master-worker pattern in Java.
I need to implement this in C++ on Windows, and the application must use tcp/ip sockets using an existing defined protocol.
Any help at all would be greatly appreciated.

You need exactly two threads, one that saturates the IO channel to the database and another that saturates the IO channel to the network leading to the 12 servers. Unless you have multiple network interfaces (which you should think about!) you don't send things faster by using multiple threads. Also, since you don't have multiple threads taking care of the network, you don't have to sync them.
What you definitely need to know about is select(). In the case of WinSock, also take a look at WSAEventSelect/WaitForMultipleObjects. Basically, you take a message from the queue and then send it to all clients when they're ready. select() tells you when one of a set of sockets is ready to accept data, so you don't waste time waiting or block trying to send data. What you need to come up with is a scheme for reconnecting after broken connections, for deciding when to drop messages to lagging clients, and so on. Also, in case the throughput to the different targets varies a lot, you need to think about handling multiple messages in parallel. If the messages are small (less than a network packet's payload), it makes sense to combine them anyway to avoid overhead.
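
One possible policy for "when to drop messages to lagging clients" is a bounded backlog per destination, sketched below without any socket code. The Destination struct, the fan_out function, and the max_backlog limit are hypothetical; the sketch assumes the primary destination keeps every message while a non-primary one discards its oldest pending message once it falls too far behind. Draining each pending queue when select() reports the socket writable is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical per-destination outbound state.
struct Destination {
    bool primary = false;             // the primary must receive every message
    std::size_t max_backlog = 0;      // backlog limit, enforced for non-primary only
    std::deque<std::string> pending;  // messages not yet sent, in FIFO order
    std::size_t dropped = 0;          // messages skipped to catch up
};

// Queue one message for every destination while preserving FIFO order.
// A non-primary destination whose backlog is full skips its oldest message.
void fan_out(std::vector<Destination>& dests, const std::string& msg) {
    for (Destination& d : dests) {
        if (!d.primary && d.pending.size() >= d.max_backlog) {
            assert(d.max_backlog > 0);  // a non-primary needs room for at least one message
            d.pending.pop_front();      // lagging copy: drop the oldest to catch up
            ++d.dropped;
        }
        d.pending.push_back(msg);
    }
}
```

Because every destination has its own queue, promoting a non-primary destination to primary only means flipping its flag; whether messages it already skipped must be replayed is a protocol question this sketch does not answer.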
I hope this short overview helps getting you started, otherwise I can elaborate on the details.

Webservice protection against big messages

I am developing a WebService in Java upon the jax-ws stack and glassfish.
Now I am a bit concerned about a couple of things.
I need to pass in an unknown amount of binary data that will be processed by an MDB. It is written this way to be asynchronous (so the user does not have to wait for the calculation to take place), somewhat fault tolerant, and very scalable.
The input message can, however, be split into chunks and sent to the MDB, or split in the client and sent to the WS itself in chunks.
What I am looking for is a way to specify the maximum size of the input so I won't blow the heap even if someone deliberately tries to send a message that is too big. I have noticed that things tend to become a bit unstable once you hit the ceiling, and I must be able to keep running.
Is it possible to be safe against big messages, or should I use another method instead of WS? Which options do I have?
Well I am rather new to Java EE..

If you're passing binary data, take a look at enabling MTOM for the endpoint. It uses streaming and has a 'threshold' parameter.
