zmq DEALER socket zmq_recv_msg call always times out - c++

I am using zmq ROUTER and DEALER sockets in my application (C++).
One process is listening on a zmq ROUTER socket for clients to connect (a Service).
Clients connect to this service using a zmq DEALER socket. From the client I make synchronous (blocking) requests to the service. To avoid waiting
indefinitely for a response, I set RCVTIMEO on the DEALER socket to, let's say, 5 ms. After setting this timeout I observe unexpected
behaviour on the client.
Here are the details:
Case 1: No RCVTIMEO is set on DEALER (client) socket
In this case, let's say the client sends 1000 requests to the service. For around 850 of these requests, the client receives a response within 5 ms.
For the remaining 150 requests it takes more than 5 ms for the response to arrive.
Case 2: RCVTIMEO is set for 5 ms on DEALER (client) socket
In this case, for the first 150-200 requests I see a valid response received within the RCVTIMEO period. For all remaining requests the RCVTIMEO timeout fires, which is not expected. The requests in both cases are the same.
The expected behaviour would be: for the ~850 requests whose responses arrive within RCVTIMEO we should receive a valid response, and for the remaining ~150
requests we should see a timeout.
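For context, here is a minimal sketch of the client loop described above (a sketch only, not the exact code; the endpoint, payload, and request count are placeholders):

#include <zmq.h>
#include <cerrno>
#include <cstdio>

int main() {
    void *ctx = zmq_ctx_new();
    void *dealer = zmq_socket(ctx, ZMQ_DEALER);

    int timeout_ms = 5;                                    // Case 2: 5 ms receive timeout
    zmq_setsockopt(dealer, ZMQ_RCVTIMEO, &timeout_ms, sizeof(timeout_ms));
    zmq_connect(dealer, "tcp://127.0.0.1:5555");           // placeholder endpoint

    for (int i = 0; i < 1000; ++i) {
        zmq_send(dealer, "request", 7, 0);                 // synchronous (blocking) request

        zmq_msg_t reply;
        zmq_msg_init(&reply);
        int rc = zmq_msg_recv(&reply, dealer, 0);          // waits at most ZMQ_RCVTIMEO
        if (rc == -1 && zmq_errno() == EAGAIN)
            printf("request %d timed out\n", i);           // timeout observed on the client
        zmq_msg_close(&reply);
    }

    zmq_close(dealer);
    zmq_ctx_destroy(ctx);
    return 0;
}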
To get the timeout behaviour I also tried zmq_poll() (see the sketch below) instead of setting RCVTIMEO, but the results are the same: most of the requests time out.
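A sketch of that zmq_poll() variant, reusing the context and dealer socket from the snippet above (again only a sketch; the 5 ms value mirrors the RCVTIMEO case):

zmq_send(dealer, "request", 7, 0);

zmq_pollitem_t items[] = { { dealer, 0, ZMQ_POLLIN, 0 } };
int rc = zmq_poll(items, 1, 5);                // wait at most 5 ms for the reply
if (rc > 0 && (items[0].revents & ZMQ_POLLIN)) {
    zmq_msg_t reply;
    zmq_msg_init(&reply);
    zmq_msg_recv(&reply, dealer, 0);           // a reply is ready, receive it
    zmq_msg_close(&reply);
} else {
    // rc == 0: no reply within 5 ms, treated as a timeout for this request
}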
I went through the zmq documentation for details, but didn't find anything.
Can someone please explain the reason for this behaviour?

Related

AkkaHttpClient: Equivalent of socketTimeout

We have 3 timeouts in Apache-HttpClient:
HttpClients.custom()
    .setConnectionManager(cm)
    .setDefaultRequestConfig(RequestConfig.custom()
        .setConnectTimeout(...)
        .setSocketTimeout(...)
        .setConnectionRequestTimeout(...)
        .build())
    .build();
Where:
Connection Timeout: the time to establish the connection with the remote host.
Socket Timeout: the time waiting for data after the connection is established; the maximum time of inactivity between two data packets.
But AkkaHttpClient only has connecting-timeout and doesn't have any configuration property for a socket timeout. Is there any equivalent property or way to set a default socket timeout for requests?
In general for timeouts in the client beyond the connecting-timeout, the recommendation is to use the various Akka Streams operators (e.g. idleTimeout), which give you far more control.
There is also a general idle timeout which will close connections if nothing is sent or received: this is intended as a global safety feature, so it can't be configured per-request.

How to gracefully handle auto disconnect of Daphne websockets

Daphne has a parameter --websocket_timeout. As mentioned in the docs:
--websocket_timeout WEBSOCKET_TIMEOUT
Maximum time to allow a websocket to be connected. -1 for infinite.
After this timeout the socket is disconnected and no further communication can be done. However, the client does not receive a disconnect event, and hence can't handle it gracefully. How does my client get to know whether the socket is disconnected or not? I don't want to keep a timer at the client, nor do I want to keep re-checking the connection.
This is how I deploy my app
daphne -b 0.0.0.0 -p 8000 --websocket_timeout 1800 app.asgi:application
The socket gets auto-disconnected after 30 minutes, but the client never gets to know about this.
What's the right way to go about it?
Update
I'm trying to send an event before the connection is closed, by overriding my websocket_disconnect handler to send JSON before disconnecting. However, it does not send the event.
class Consumer(AsyncJsonWebsocketConsumer):
    async def websocket_disconnect(self, message):
        """Over-riding."""
        print('Inside websocket_disconnect consumer')
        await self.send_json(
            {"event": "disconnecting..."}
        )
        await super().websocket_disconnect(message)
I'm not sure it's a problem that needs a solution. The client has a certainty that after X minutes of inactivity it will get disconnected, where X is determined by the server. It has no certainty it won't happen before that. So you need connectivity handling code regardless.
While it seems dirty to keep an idling connection around, I can't imagine it costing a lot of resources.
Your premise that the client doesn't get to know about it is wrong. When you register the onclose handler, the client receives a disconnect event and can act accordingly.

What notification is provided for a lost connection in a C++ gRPC async server

I have an async gRPC server for Windows written in C++. I’d like to detect the loss of connection to a client – whether a network connection is lost, or the client crashes, etc. I see references to the keepalive channel arguments, and I’ve tried various combinations of those settings, such as:
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIME_MS, 10000);
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 10000);
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);
builder.AddChannelArgument(GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS, 9000);
builder.AddChannelArgument(GRPC_ARG_HTTP2_BDP_PROBE, 1);
I've done some testing with a streaming RPC method. If I kill the client process and then try to send data to the client, the lost connection is detected. I don't actually even have to send data. I can set an Alarm object to trigger immediately and that causes the call handler to be cancelled. However, if I don't try to send data (or set an alarm) after killing the client process then there's no notification or callback that I've been able to find/enable. I must not have a complete understanding. So:
How does the detection of a lost connection manifest itself for the server? Is there a callback method, or notification of some type? My server doesn’t receive any errors; the completion queue’s ‘Next()’ method never returns, etc.
Does this detection work for both unary (call/response) and streaming methods?
Does the server detection of a lost connection work whether or not the client has implemented lost connection / keepalive logic?
Is there some method besides the keepalive channel arguments that is preferred?
Thanks - any help is appreciated.
You can use ServerContext::AsyncNotifyWhenDone() to get a notification when the request has been cancelled.
https://grpc.github.io/grpc/cpp/classgrpc__impl_1_1_server_context_base.html#a0f1289f31257e6dbef57bc901bd7b5f2
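For illustration, a minimal sketch of wiring AsyncNotifyWhenDone() into an async call handler (the service/message names EchoService, EchoRequest, EchoReply and the tag scheme are hypothetical; only AsyncNotifyWhenDone() and IsCancelled() are the API in question):

#include <grpcpp/grpcpp.h>

// Hypothetical proto-generated types: EchoService, EchoRequest, EchoReply.
class CallData {
public:
    CallData(EchoService::AsyncService* service, grpc::ServerCompletionQueue* cq)
        : service_(service), cq_(cq), responder_(&ctx_) {
        // Register for the "done" notification before starting the call, so a lost
        // or cancelled client delivers done_tag_ on the completion queue.
        ctx_.AsyncNotifyWhenDone(&done_tag_);
        service_->RequestEcho(&ctx_, &request_, &responder_, cq_, cq_, &request_tag_);
    }

    // Called by the completion-queue loop when done_tag_ comes back from cq->Next().
    void OnDone() {
        if (ctx_.IsCancelled()) {
            // Client disconnected, deadline exceeded, or the call was cancelled.
        }
    }

private:
    EchoService::AsyncService* service_;
    grpc::ServerCompletionQueue* cq_;
    grpc::ServerContext ctx_;
    EchoRequest request_;
    grpc::ServerAsyncResponseWriter<EchoReply> responder_;
    int request_tag_ = 0;   // placeholder tag objects; a real server would dispatch
    int done_tag_ = 1;      // on the tag pointer returned by cq->Next()
};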

Timeout on 3rd attempt

I've set up two services with Grapevine in a client-server schema, and I'm having the following problem: after the second request, the server or the client hits a timeout.
I call RestResponse response = this.client.Execute(request); to execute the request, and I can see that the request never arrives at the server.
This always happens on the 3rd call I send.

Netty file transfer proxy suffers big connection delay under high concurrency

I am building a file transfer proxy using Netty which should efficiently handle high concurrency.
Here is my structure:
Back Server: a normal file server, just like the Http (File Server) example on netty.io, which receives and confirms a request and sends out a file, either using ChunkedBuffer or zero-copy.
Proxy: has both a NioServerSocketChannelFactory and a NioClientSocketChannelFactory, both using cachedThreadPool, listens for clients' requests and fetches the file from the Back Server back to the clients. Once a new client is accepted, a new Channel (channel1) is created by the NioServerSocketChannelFactory and waits for the request. Once the request is received, the Proxy establishes a new connection to the Back Server using the NioClientSocketChannelFactory, and the new Channel (channel2) sends the request to the Back Server and delivers the response to the client. channel1 and channel2 each use their own pipeline.
More simply, the procedure is
channel1 accepted
channel1 receives the request
channel2 connected to Back Server
channel2 send request to Back Server
channel2 receive response(including file) from Back Server
channel1 send the response got from channel2 to the client
once transferring is done, channel2 closes and channel1 closes on flush (each client only sends one request)
Since the requested file can be big (10 MB), the proxy stops channel2 from being readable when channel1 is NOT writable, just like the Proxy Server example on netty.io.
With the above structure, each client has one accepted Channel, and once it sends a request it also corresponds to one client Channel until the transfer is done.
Then I use ab (Apache Bench) to fire thousands of requests at the proxy and measure the request time. Proxy, Back Server and Client are three boxes on one rack with no other traffic.
The results are weird:
File size 10 MB: when concurrency is 1, the connection delay is very small, but when concurrency increases from 1 to 10, the top 1% connection delay becomes incredibly high, up to
3 seconds. The other 99% are very small. When concurrency increases to 20, the top 1% goes to 8 seconds, and ab even times out if concurrency is higher than 100. The 90th-percentile processing delay is usually linear with the concurrency, but the top 1% can abnormally go very high at a random level of concurrency (it varies across test runs).
File size 1 KB: everything is fine, at least with concurrency below 100.
With all three on a single local machine, there is no connection delay.
Can anyone explain this issue and tell me which part is wrong? I have seen many benchmarks online, but they are pure ping-pong tests rather than large file transfers through a proxy. I hope this is interesting to you :)
Thank you!
========================================================================================
After some source-code reading today, I found one place that may prevent new sockets from being accepted. In NioServerSocketChannelSink.bind(), the boss executor calls Boss.run(), which contains a for loop for accepting incoming sockets. In each iteration of this loop, after getting the accepted channel, AbstractNioWorker.register() is called, which is supposed to add the new socket to the selector running in the worker executor. However, in
register(), a mutex called startStopLock has to be acquired before the worker executor is invoked. This startStopLock is also used in AbstractNioWorker.run() and AbstractNioWorker.executeInIoThread(), both of which acquire the mutex before they invoke the worker thread. In other words, startStopLock is used in three functions. If it is held when AbstractNioWorker.register() runs, the for loop in Boss.run() will be blocked, which can cause delay in accepting incoming connections. Hope this helps.