Client doesn't detect server disconnection - C++

In my application (C++) I have a service exposed as:
rpc foo(stream Request) returns (Reply) { }
The issue is that when the server goes down (CTRL-C), the stream on the client side keeps going; indeed,
grpc::ClientWriter::Write
doesn't return false. I can confirm with netstat that there is no connection between the client and the server (apart from a TIME_WAIT one that goes away after a while), yet the client keeps calling that Write without errors.
Is there a way to see if the underlying connection is still up instead of relying on the Write return value? I use gRPC version 1.12.
Update
I discovered that the underlying channel goes into the IDLE state, but ClientWriter::Write still doesn't report an error; I don't know if this is intended. During streaming I am now trying to re-establish a connection to the server whenever the channel state is not GRPC_CHANNEL_READY, along the lines of the sketch below.
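For illustration, a minimal sketch (not the original code) of that state-checking approach; channel is assumed to be the std::shared_ptr<grpc::Channel> the stream was created on, and on gRPC 1.12 the header may still be <grpc++/grpc++.h>:

#include <grpcpp/grpcpp.h>
#include <chrono>
#include <memory>

// Ask the channel to (re)connect if it is IDLE and wait briefly for READY.
bool EnsureChannelReady(const std::shared_ptr<grpc::Channel>& channel,
                        std::chrono::seconds timeout = std::chrono::seconds(5)) {
    grpc_connectivity_state state = channel->GetState(/*try_to_connect=*/true);
    if (state == GRPC_CHANNEL_READY) return true;
    // Block (bounded by the deadline) until the state changes, then re-check it.
    channel->WaitForStateChange(state, std::chrono::system_clock::now() + timeout);
    return channel->GetState(false) == GRPC_CHANNEL_READY;
}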

This could happen in a few scenarios but the most common element is a connection issue. We have KEEPALIVE support in gRPC to tackle exactly this issue. For C++, please refer to https://github.com/grpc/grpc/blob/master/doc/keepalive.md on how to set this up. Essentially, endpoints would send pings at certain intervals and expect a reply within a certain timeframe.
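A minimal C++ sketch of the setup described in that document (the interval and timeout values are only examples, and the insecure credentials and target are placeholders):

#include <grpcpp/grpcpp.h>
#include <memory>
#include <string>

// Build a channel that pings the server every 10 s and gives up on the
// connection if a ping is not acknowledged within 5 s.
std::shared_ptr<grpc::Channel> MakeChannelWithKeepalive(const std::string& target) {
    grpc::ChannelArguments args;
    args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, 10000);           // ping interval
    args.SetInt(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 5000);         // wait for the ping ack
    args.SetInt(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);  // ping even without active RPCs
    args.SetInt(GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA, 0);    // don't limit pings without data
    return grpc::CreateCustomChannel(target, grpc::InsecureChannelCredentials(), args);
}

Once a ping goes unanswered for GRPC_ARG_KEEPALIVE_TIMEOUT_MS, the transport is closed and the in-flight stream fails, so Write should start returning false instead of silently succeeding.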

Related

Detect half-open websockets with PING/PONG

I'm using Jetty 9.2.24 as a WebSocket server. I want to detect half-open connections so that no more messages are sent over such a connection and they are buffered instead.
I know PING/PONG frames are used for this, so I tried sending PINGs periodically and set a low maxIdleTimeout. I modified my client to NOT return a PONG to see if Jetty would regard this as a failed connection since the RFC-6455 spec dictates that the remote endpoint MUST respond with a PONG. Apparently Jetty does not detect missing PONGs or I am doing something wrong.
What is the best way to continue? Should I implement the PING/PONG timeouts myself by explicitly receiving all PONG messages and detecting a timeout? I would think this would be the responsibility of the underlying WebSocket framework.
Note that Jetty 9.2.x is EOL (End of Life); you should consider upgrading.
Setting Max Idle Timeout and then causing the connection to not be idle by sending ping/pong isn't ideal.
The spec says that when you receive a PING you must send a PONG, and Jetty indeed does that.
It does not say that receiving a PONG, not receiving a PONG, or receiving an unsolicited PONG has any particular meaning or prescribed behavior, as you seem to think it should.
Jetty 9.4 websocket will only keep a half-open connection open long enough to complete the current message (no matter how many frames it takes) then respond to the CLOSE frame it received (that caused the half-open connection). So half-open is only for the duration of the active message, then CLOSED. If no message is active, then the CLOSE happens immediately.
On Jetty 9.4 you can also add a WebSocketFrameListener and respond accordingly based on the frames received (eg: make the server end the conversation immediately, either via a CLOSE frame, or harsh disconnect)

Check if server received data after timeout

I made a program that uses several REST APIs of Bitcoin exchanges, e.g. Bitstamp.
There is a function that allows me to do a trade: sell or buy Bitcoin for a specific price. Simplified, you have to call a URL with parameters like this:
https://www.bitstamp.net/api/trade?price=100&amount=1&type=sell
The server then answers in JSON. Example:
{"error":"","message":"Sold 1 BTC # 100$"}
If the trade was successful, my program continues. If it was not, it tries again (depending on the error message).
However, there is one problem. I'm using libcurl for the communication with the server and I set the CURLOPT_TIMEOUT to two seconds. It almost always works, but sometimes I get the following error:
Code #28: Operation timed out after 2000 milliseconds with 0 bytes received
When this happens, my program tries to trade again. But sometimes, despite the timeout, the trade was already made, which means it is done multiple times because my code tries again.
Can I somehow find out if the server at least received all the data? The thing is, if I increase CURLOPT_TIMEOUT to say 10 seconds and the server does not answer, I have the same problem. So this is not a solution.
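For reference, the call described above amounts to something like this libcurl sketch (the URL is the example from the question; the write callback and error handling are illustrative, not the actual program):

#include <curl/curl.h>
#include <iostream>
#include <string>

static size_t CollectBody(char* data, size_t size, size_t nmemb, void* userp) {
    static_cast<std::string*>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    std::string body;
    curl_easy_setopt(curl, CURLOPT_URL,
                     "https://www.bitstamp.net/api/trade?price=100&amount=1&type=sell");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CollectBody);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
    curl_easy_setopt(curl, CURLOPT_TIMEOUT_MS, 2000L);  // the two-second timeout
    CURLcode res = curl_easy_perform(curl);
    if (res == CURLE_OPERATION_TIMEDOUT) {
        // Code #28: the request may or may not have reached the server;
        // blindly retrying here is exactly what can produce a duplicate trade.
        std::cerr << curl_easy_strerror(res) << "\n";
    } else if (res == CURLE_OK) {
        std::cout << body << "\n";  // JSON reply from the exchange
    }
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}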
I do not know the details of Bitstamp, but here is how HTTP works. The client sends a request to a server and receives a response. The response describes success or failure (using HTTP status codes). However, if a timeout occurs, the client has no information about its request:
was it sent to the server;
did the server receive it;
if the server received the request, did it manage to process it;
maybe the server processed the request, but sending the response back failed due to network issues.
For that reason, one should not assume that the request was successful and should resend it. The problem you have described is certainly possible - the server received the request and processed it, but did not manage to send back the response. For that reason, other more complex protocols should be used; unfortunately, HTTP is not one of them because of its request-response nature.
Perhaps you should check if the given REST API gives some status for the transactions.
You are supposed to wait for the HTTP response to be a little more sure whether your request was successfully processed or not.
If you can access the file descriptor, you can call ioctl() with SIOCOUTQ (Linux) or FIONWRITE (BSD) -- I don't know the equivalent for Windows -- to check for unacknowledged sent data at the socket level before totally aborting your connection (a sketch follows below).
The problem is that it wouldn't be totally error-free either. Even though TCP is stateful at the transport level, HTTP is stateless at the application level. If your application needs transactional behavior (you are dealing with currency, after all, aren't you?), it should provide a means for that.
All that said, I think two seconds might be too little. If you need speed because of multiple operations or something like that, consider parallelizing your connections.
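A rough sketch of the ioctl() check mentioned above, assuming Linux and a libcurl new enough to expose the socket through CURLINFO_ACTIVESOCKET; the function name and error handling are illustrative:

#include <curl/curl.h>
#include <linux/sockios.h>  // SIOCOUTQ
#include <sys/ioctl.h>

// Returns the number of bytes still queued in the socket's send buffer
// (sent but unacknowledged, or not yet sent), or -1 if it cannot be determined.
long UnackedBytes(CURL* curl) {
    curl_socket_t sockfd = CURL_SOCKET_BAD;
    if (curl_easy_getinfo(curl, CURLINFO_ACTIVESOCKET, &sockfd) != CURLE_OK ||
        sockfd == CURL_SOCKET_BAD) {
        return -1;
    }
    int pending = 0;
    if (ioctl(sockfd, SIOCOUTQ, &pending) == -1) {
        return -1;
    }
    return pending;
}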

Winsock select function returning different values

I am working on a project with a client-server architecture. The select function returns different values in different scenarios. Following are the details:
Scenario 1:
When I install my server on my machine and stop all the corresponding services, my client goes into a disconnected state; the return value of select is 1 and read_mask.fd_count is also 1.
Scenario 2:
When I connect to a remote server (abc.com) and then disconnect my wireless connection, the same function returns 0 and read_mask.fd_count is 0. I tried changing the timeout value from ten ms to 50 sec but can't figure out the problem.
Any help will be appreciated.
When you shut down the server you cause the network stack to shut down the connection; further connection requests are refused. The select indicates that there is something to read, and the recv() returns 0 to indicate the connection was closed by the peer.
When you drop the wireless connection, the client gets neither the shutdown nor the refused connection request. You have to wait for some timeout to detect that the server is not available.
In a real-world application you should implement a kind of heartbeat in your protocol that allows you to detect the "disconnected" state in the second scenario.
Edit: If your Winsock implementation supports SIO_KEEPALIVE_VALS, you can also configure this to detect the lost connectivity. See also: SO_KEEPALIVE.
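For example, a minimal sketch of that second approach (the idle time and interval values are only examples):

#include <winsock2.h>
#include <mstcpip.h>

// Enable per-socket keepalive probes: first probe after 10 s of idle time,
// then one probe per second until the stack declares the connection dead.
bool EnableKeepalive(SOCKET s) {
    tcp_keepalive ka;
    ka.onoff = 1;
    ka.keepalivetime = 10000;     // milliseconds of idle time before probing
    ka.keepaliveinterval = 1000;  // milliseconds between probes
    DWORD bytesReturned = 0;
    return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                    nullptr, 0, &bytesReturned, nullptr, nullptr) == 0;
}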

How to handle SSL connection premature closure

I am writing a proxy server that proxies SSL connections, and it is all working perfectly fine for normal traffic. However when there is a large file transfer (Anything over 20KB) like an email attachment, then the connection is reset on the TCP level before the file is finished being written. I am using non-blocking IO, and am spawning a thread for each specific connection.
When a connection comes in I do the following:
Spawn a thread
Connect to the client (unencrypted) and read the connect request (all other requests are ignored)
Create a secure connection (SSL using openssl api) to the server
Tell the client that we contacted the server (unencrypted)
Create secure connection to client, and start proxying data between the two using a select loop to determine when reading and writing can occur
Once the underlying sockets are closed, or there is an error, the connection is closed, and thread is terminated.
Like I said, this works great for normal-sized data (regular web pages and other things) but fails as soon as a file is too large, with either an error code (depending on the web app being used) or an "Error: Connection Interrupted".
I have no idea what is causing the connection to close, whether it's something TCP, HTTP, or SSL specific, and I can't find any information on it at all. In some browsers it will start to work if I put a sleep statement immediately after the SSL_write, but this seems to cause other issues in other browsers. The sleep doesn't have to be long, really just a delay. I currently have it set to 4 ms per write and 2 ms per read, and this fixes it completely in older Firefox, in Chrome with HTTP uploads, and in Opera.
Any leads would be appreciated, and let me know if you need any more information. Thanks in advance!
-Sam
If the web app thinks an uploaded file is too large, what does it do? If it's entitled to just close the connection, that will cause an ECONNRESET ("connection reset") at the sender. Whatever it does, as you're writing a proxy, and assuming there are no bugs in your code that are causing this, your mission is to mirror whatever happens to your upstream connection back down the downstream connection. In this case the answer is to just do what you're doing: close the upstream and downstream sockets. If you got an incoming close_notify from the server, do an orderly SSL close to the client; if you got ECONNRESET, just close the client socket directly, bypassing SSL.
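As an illustration, a rough sketch of that mirroring logic with the OpenSSL API; server_ssl, client_ssl and client_fd are placeholders for whatever your proxy holds for the two sides, and read_result is the value returned by the failed SSL_read on the upstream connection:

#include <openssl/ssl.h>
#include <cerrno>
#include <unistd.h>

// Mirror the way the upstream (server) side closed onto the downstream (client) side.
void MirrorUpstreamClose(SSL* server_ssl, SSL* client_ssl, int client_fd, int read_result) {
    int err = SSL_get_error(server_ssl, read_result);
    if (err == SSL_ERROR_ZERO_RETURN) {
        // The server sent close_notify: perform an orderly SSL close to the client.
        SSL_shutdown(client_ssl);
        close(client_fd);
    } else if (err == SSL_ERROR_SYSCALL && errno == ECONNRESET) {
        // The server connection was reset: close the client socket directly,
        // bypassing the SSL layer.
        close(client_fd);
    }
}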

What is the best way to implement a heartbeat in C++ to check for socket connectivity?

Hey gang. I have just written a client and server in C++ using sys/socket. I need to handle a situation where the client is still active but the server is down. One suggested way to do this is to use a heartbeat to periodically assert connectivity, and if there is none, to try to reconnect every X seconds for a period of Y and then to time out.
Is this "heartbeat" the best way to check for connectivity?
The socket I am using might have data on it; is there a way to check that there is a connection without disturbing the buffer?
If you're using TCP sockets over an IP network, you can use the TCP protocol's keepalive feature, which will periodically check the socket to make sure the other end is still there. (This also has the advantage of keeping the forwarding record for your socket valid in any NAT routers between your client and your server.)
Here's a TCP keepalive overview which outlines some of the reasons you might want to use TCP keepalive; this Linux-specific HOWTO describes how to configure your socket to use TCP keepalive at runtime.
It looks like you can enable TCP keepalive in Windows sockets by setting SIO_KEEPALIVE_VALS using the WSAIoctl() function.
If you're using UDP sockets over IP you'll need to build your own heartbeat into your protocol.
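On Linux, the runtime configuration described in that HOWTO looks roughly like the following sketch (the idle time, interval, and probe count are just example values):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enable TCP keepalive on a connected socket and tune the Linux-specific knobs.
bool EnableTcpKeepalive(int fd) {
    int on = 1;
    int idle = 60;      // seconds of idle time before the first probe
    int interval = 10;  // seconds between probes
    int count = 5;      // failed probes before the connection is dropped
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == 0;
}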
Yes, this heartbeat is the best way. You'll have to build it into the protocol the server and client use to communicate.
The simplest solution is to have the client send data periodically and the server close the connection if it hasn't received any data from the client in a particular period of time. This works perfectly for query/response protocols where the client sends queries and the server sends responses.
For example, you can use the following scheme:
The server responds to every query. If the server does not receive a query for two minutes, it closes the connection.
The client sends queries and keeps the connection open after each one.
If the client has not sent a query for one minute, it sends an "are you there" query. The server responds with "yes I am". This resets the server's two-minute timer and confirms to the client that the connection is still available.
It may be simpler to just have the client close the connection if it hasn't needed to send a query for the past minute. Since all operations are initiated by the client, it can always just open a new connection if it needs to perform a new operation. That reduces it to just this:
The server closes the connection if it hasn't received a query in two minutes.
The client closes the connection if it hasn't needed to send a query in one minute.
However, this doesn't assure the client that the server is present and ready to accept a query at all times. If you need this capability, you will have to implement an "are you there" "yes I am" query/response into your protocol.
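A minimal sketch of the client-side bookkeeping for that "are you there" / "yes I am" scheme; the send callback and the one/two minute intervals are placeholders, TickHeartbeat would be called from the client's event loop, and last_received would be updated whenever any data arrives:

#include <chrono>

using Clock = std::chrono::steady_clock;

struct HeartbeatState {
    Clock::time_point last_sent     = Clock::now();
    Clock::time_point last_received = Clock::now();
};

// Returns false once the connection should be considered dead.
template <typename SendFn>
bool TickHeartbeat(HeartbeatState& hb, SendFn send_are_you_there) {
    const auto now = Clock::now();
    if (now - hb.last_received > std::chrono::minutes(2)) {
        return false;                 // nothing heard for two minutes: give up
    }
    if (now - hb.last_sent > std::chrono::minutes(1)) {
        send_are_you_there();         // e.g. write an "are you there?" query on the socket
        hb.last_sent = now;
    }
    return true;
}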
If the other side has gone away (i.e. the process has died, the machine has gone down, etc.), attempting to receive data from the socket should result in an error. However if the other side is merely hung, the socket will remain open. In this case, having a heartbeat is useful. Make sure that whatever protocol you are using (on top of TCP) supports some kind of "do-nothing" request or packet - each side can use this to keep track of the last time they received something from the other side, and can then close the connection if too much time elapses between packets.
Note that this is assuming you're using TCP/IP. If you're using UDP, then that's a whole other kettle of fish, since it's connectionless.
Ok, I don't know what your program does or anything, so maybe this isn't feasible, but I suggest that you avoid trying to always keep the socket open. It should only be open when you are using it, and should be closed when you are not.
If you are between reads and writes waiting on user input, close the socket. Design your client/server protocol (assuming you're doing this by hand and not using any standard protocols like http and/or SOAP) to handle this.
Sockets will error if the connection is dropped; write your program such that you don't lose any information in the case of such an error during a write to the socket and that you don't gain any information in the case of an error during a read from the socket. Transactionality and atomicity should be rolled into your client/server protocol (again, assuming you're designing it yourself).
Maybe this will help you: TCP Keepalive HOWTO
or this: SO_SOCKET