How to keep an HTTP long-polling connection open? - c++

I want to implement long polling in a web service. I can set a sufficiently long time-out on the client. Can I give a hint to intermediate networking components to keep the response open? I mean NATs, virus scanners, reverse proxies or surrounding SSH tunnels that may sit between the client and the server and that are not under my control.
A download may last for hours but an idle connection may be terminated in less than a minute. This is what I want to prevent. Can I inform the intermediate network that an idle connection is what I want here, and not because the server has disconnected?
If so, how? I have been searching for around four hours now but haven't found any information on this.
Should I send 200 OK, maybe some headers, and then nothing?
Do I have to respond 102 Processing instead of 200 OK, and everything is fine then?
Should I send 0x16 (synchronous idle) bytes every now and then? If so, before or after the initial HTTP status code, before or after the header? Do they make it into the transferred file, and may break it?
The web service / server is in C++ using Boost and the content file being returned is in Turtle syntax.

You can't force proxies to extend their idle timeouts, at least not without having administrative access to them.
The good news is that you can design your long polling solution in such a way that it can recover from a connection being suddenly closed.
One such design would be as follows:
Since long polling is normally used for event notifications (think the Observer pattern), you associate a serial number with each event.
The client makes a GET request carrying the serial number of the last event it has seen, either as part of the URL or in a cookie.
The server maintains a buffer of recent events. Upon receiving a GET request from the client, it checks if any of the buffered events need to be sent to the client, based on their serial numbers and the serial number provided by the client. If so, all such events are sent in one HTTP response. The response finishes at that point, in case there is a proxy that wants to buffer the whole response before relaying it further.
If the client is up to date, that is, it didn't miss any of the buffered events, the server delays its response until another event is generated. When that happens, the event is sent as one complete HTTP response.
When the client receives a response, it immediately sends a new request. When it detects that the connection was closed, it opens a new connection and makes a new request.
When using cookies to convey the serial number of the last event seen by the client, the client side implementation becomes really simple. Essentially you just enable cookies on the client side and that's it.
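The server-side buffer described above could look roughly like the following. This is only a minimal sketch: the names `Event`, `EventBuffer`, `publish` and `events_after` are made up for illustration, and the HTTP handling, locking and the actual waiting for the next event are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// One notification; `serial` increases by one per event.
struct Event {
    std::uint64_t serial;
    std::string payload;
};

// Bounded buffer of the most recent events (hypothetical helper, not thread-safe).
class EventBuffer {
public:
    explicit EventBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Record a new event and return the serial number assigned to it.
    std::uint64_t publish(std::string payload) {
        events_.push_back(Event{++last_serial_, std::move(payload)});
        if (events_.size() > capacity_) events_.pop_front();  // drop the oldest
        return last_serial_;
    }

    // Events the client has not seen yet, oldest first. An empty result means
    // the client is up to date, so the server would hold the request open
    // until the next publish().
    std::vector<Event> events_after(std::uint64_t last_seen) const {
        std::vector<Event> missed;
        for (const Event& e : events_) {
            if (e.serial > last_seen) missed.push_back(e);
        }
        return missed;
    }

private:
    std::size_t capacity_;
    std::uint64_t last_serial_ = 0;
    std::deque<Event> events_;
};
```

A request handler would call `events_after()` with the serial number taken from the URL or cookie and answer immediately if the result is non-empty. If a proxy drops the idle connection, the client's next request carries the same serial number, so nothing is lost as long as the event is still in the buffer.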

Related

Check if server received data after timeout

I made a program that uses several REST APIs of Bitcoin exchanges, e.g. Bitstamp.
There is a function that allows me to do a trade: sell or buy Bitcoin for a specific price. Simplified, you have to call a URL with parameters like this:
https://www.bitstamp.net/api/trade?price=100&amount=1&type=sell
The server then answers in JSON. Example:
{"error":"","message":"Sold 1 BTC # 100$"}
If the trade was successful, my program continues. If it was not, it tries again (depending on the error message).
However, there is one problem. I'm using libcurl for the communication with the server and I set the CURLOPT_TIMEOUT to two seconds. It almost always works, but sometimes I get the following error:
Code #28: Operation timed out after 2000 milliseconds with 0 bytes received
When this happens, my program tries to trade again. But sometimes, despite the timeout, the trade was already made, which means it is done multiple times because my code tries again.
Can I somehow find out if the server at least received all the data? The thing is, if I increase CURLOPT_TIMEOUT to, say, 10 seconds and the server does not answer, I have the same problem. So this is not a solution.
I do not know the details of Bitstamp, but here is how HTTP works. The client sends a request to a server and receives a response. The response describes success or failure (by using HTTP status codes). However, if the request times out, the client has no information about what happened to its request:
is it sent to the server;
did server receive it;
if server received the request, did it manage to process;
maybe server processed the request, but sending back the response failed due to the network issues.
For that reason, one should not assume that the request was successful, and should resend it. The problem you have described is certainly possible: the server received the request and processed it but did not manage to send back the response. Handling that reliably takes a more complex protocol; unfortunately plain HTTP is not one, because of its request-response nature.
Perhaps you should check if the given REST API gives some status for the transactions.
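If the API does offer such a status query, the retry loop could ask before resending. The sketch below only illustrates that idea: `send_order` and `order_exists` are hypothetical callbacks (nothing here is taken from the Bitstamp API), and in a real program the status query can itself time out or race with a trade that is still being processed.

```cpp
#include <cassert>
#include <functional>

enum class SendResult { Confirmed, Rejected, TimedOut };
enum class TradeOutcome { Executed, Failed };

// `send_order` submits the trade once; `order_exists` asks the server whether
// that trade was recorded. Both are assumptions made for this sketch.
TradeOutcome trade_once(const std::function<SendResult()>& send_order,
                        const std::function<bool()>& order_exists,
                        int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        switch (send_order()) {
            case SendResult::Confirmed:
                return TradeOutcome::Executed;
            case SendResult::Rejected:
                break;  // the server answered and refused: safe to try again
            case SendResult::TimedOut:
                // Outcome unknown: check before sending a second order.
                if (order_exists()) return TradeOutcome::Executed;
                break;
        }
    }
    return TradeOutcome::Failed;
}
```

The important part is the `TimedOut` branch: a timeout is treated as "unknown", not as "failed", so the order is only resent after the server has said it does not know about it.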
You are supposed to wait for the HTTP response to be a little more sure whether your request was successfully processed or not.
If you have access to the file descriptor, you can call ioctl() with SIOCOUTQ (Linux) or FIONWRITE (BSD) to check for unacknowledged sent data at the socket level before aborting your connection entirely. I don't know the equivalent for Windows.
The problem is that it wouldn't be totally error-free either. Even though TCP is stateful at the transport level, HTTP is stateless at the application level. If your application needs transactional behavior (you're dealing with currency, after all, aren't you?), it should provide a means for that.
All that said, I think two seconds might be too little. If you need speed because of multiple operations or something like that, consider parallelizing your connections.

Reusing sock_fd For UDP Server Response vs New sock_fd

If I have a UDP server that handles incoming requests with recvfrom, processes the requests that come in (possibly time consuming), possibly sends back a response, and then calls recvfrom again, is it better to create a new sock_fd with the information in sockaddr* from to send the response back with or to use the server's sock_fd to send a response?
Basically, the question is do I want the overhead of having to create a new sock_fd, or do I want my server to be able to handle requests without having to wait to send the previous request a response.
I can't decide based on the application's needs, because this will be used in a library (hence I don't know whether there will need to be a response or not, and how long it will take to process the request).
I fail to see how this is not a real question. The question is clearly asked in the bolded section above, and in the last part of the first sentence
There is no need to create a new sock_fd, as the existing one has already been bound because it's a server.
You also have to make sure that the clients are not waiting for a response in a blocking recvfrom.
Most servers send an error code if they cannot give a proper response, and the clients repeat the request or react in some other way depending on that error code. Maybe you need to design the protocol in a request-response way.
If processing is a problem then you can always keep the data + struct sockaddr of the client in a queue and defer processing by signalling a thread to wake up. That way your listening thread can get back to recvfrom quickly, and you can send the response from the processing thread to the saved struct sockaddr of the client when you are finished.
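A minimal sketch of such a queue, assuming POSIX sockets and standard C++ threading primitives. `PendingRequest` and `RequestQueue` are names invented here; the recvfrom() loop and the worker thread itself are not shown.

```cpp
#include <sys/socket.h>

#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// A received datagram plus the address it came from.
struct PendingRequest {
    sockaddr_storage from{};
    socklen_t from_len = 0;
    std::string data;
};

class RequestQueue {
public:
    // Called by the receiving thread right after recvfrom() returns.
    void push(PendingRequest req) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            queue_.push(std::move(req));
        }
        ready_.notify_one();
    }

    // Called by a worker thread. Blocks until a request is available or
    // close() was called; returns false only when closed and drained.
    bool pop(PendingRequest& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

    // Wakes all workers so they can exit once the queue is empty.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<PendingRequest> queue_;
    bool closed_ = false;
};
```

After processing, the worker would reply on the server's original socket with something like `sendto(sock_fd, reply.data(), reply.size(), 0, reinterpret_cast<const sockaddr*>(&req.from), req.from_len)`, so no second socket is needed.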
do I want the overhead of having to create a new sock_fd
No.
or do I want my server to be able to handle requests without having to wait to send the previous request a response.
Nobody has to wait to send a message over a UDP socket. You can handle every incoming request on a separate thread if you like, and they can all call sendmsg(), simultaneously if necessary.
You definitely only want to use one socket. For one thing, it will mean that the reply will get back to the client with the same source-address information that they sent it to, which will be less confusing all round.

Sockets in Linux - how do I know the client has finished?

I am currently trying to implement my own webserver in C++ - not for productive use, but for learning.
I basically open a socket, listen, wait for a connection and open a new socket from which I read the data sent by the client. So far so good. But how do I know the client has finished sending data and not simply temporarily stopped sending more because of some other reason?
My current example: When the client sends a POST-request, it first sends the headers, then two times "\r\n" in a row and then the request body. Sometimes the body does not contain any data. So if the client is temporarily unable to send anything after it sent the headers - how do I know it is not yet finished with its request?
Does this solely depend on the used protocol (HTTP) and it is my task to find this out on the basis of the data I received, or is there something like an EOF for sockets?
If I cannot get the necessary Information from the socket, how do I protect my program from faulty clients? (Which I guess I must do regardless of this, since it might be an attacker and not a faulty client sending wrong data.) Is my only option to keep reading until the request is complete by definition of the protocol or a timeout (defined by me) is reached?
I hope this makes sense.
Btw: Please don't tell me to use some library - I want to learn the basics.
The protocol (HTTP) tells you when the client has stopped sending data. You can't get the info from the socket as the client will leave it open waiting for a response.
As you say, you must guard against errant clients not sending proper requests. Typically in the case of an incomplete request a timeout is applied to the read. If you haven't received anything in 30 seconds, say, then close the socket and ignore it.
For an HTTP POST, there should be a header (Content-Length) saying how many bytes to expect after the end of the headers. If it's a POST and there is no Content-Length (and the request does not use chunked transfer encoding), then reject it.
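As a rough illustration of deciding "is the request finished?" from the received bytes alone, here is a small sketch. The function name `check_request` and the `RequestState` enum are invented for this example, and the parsing is deliberately simplified: it ignores chunked transfer encoding, duplicate headers and integer overflow in the length.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class RequestState { NeedMoreData, Complete, Malformed };

// Inspects everything received so far on one connection.
RequestState check_request(const std::string& received) {
    const std::size_t header_end = received.find("\r\n\r\n");
    if (header_end == std::string::npos) return RequestState::NeedMoreData;

    // Lower-case the header block so the field-name lookup is case-insensitive.
    std::string headers = received.substr(0, header_end);
    for (char& c : headers) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    const bool is_post = headers.compare(0, 5, "post ") == 0;
    const std::string field = "\r\ncontent-length:";
    const std::size_t pos = headers.find(field);
    if (pos == std::string::npos) {
        // No declared body: reject a POST, accept anything else as complete.
        return is_post ? RequestState::Malformed : RequestState::Complete;
    }

    // Parse the decimal value that follows the colon.
    std::size_t i = pos + field.size();
    while (i < headers.size() && headers[i] == ' ') ++i;
    if (i == headers.size() || !std::isdigit(static_cast<unsigned char>(headers[i]))) {
        return RequestState::Malformed;
    }
    std::size_t length = 0;
    while (i < headers.size() && std::isdigit(static_cast<unsigned char>(headers[i]))) {
        length = length * 10 + static_cast<std::size_t>(headers[i] - '0');
        ++i;
    }

    const std::size_t body_received = received.size() - (header_end + 4);
    return body_received >= length ? RequestState::Complete
                                   : RequestState::NeedMoreData;
}
```

The server would append each recv() result to a per-connection buffer and call this after every read; `NeedMoreData` means "keep reading, subject to the timeout".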
"Does this solely depend on the used protocol (HTTP) and it is my task to find this out on the basis of the data I received,"
Correct. You can find the HTTP spec via google;
http://www.w3.org/Protocols/rfc2616/rfc2616.html
"or is there something like an EOF for sockets?"
There is as it behaves just like a file ... but that's not applicable here because the client isn't closing the connection; you're sending the reply ON that connection.
With text based protocols like HTTP you are at the mercy of the client. Most well formatted POST will have a content-length so you know how much data is coming. However the client can just delay sending the data, or it may have had its Ethernet cable removed or just hang, in which case that socket is sitting there indefinitely. If it disconnects nicely then you will get a socket closed event/response from the recv().
Most well designed servers in that case will have a receive timeout, and if the socket is idle for more than say 30 seconds it will close that socket, so resources are not leaked by misbehaving clients.
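One way to get such a receive timeout on a blocking POSIX socket is the SO_RCVTIMEO option. The sketch below assumes Linux-style behaviour (recv() fails with EAGAIN or EWOULDBLOCK when the timeout expires); the helper names `set_receive_timeout`, `read_some` and `ReadStatus` are made up for the example.

```cpp
#include <sys/socket.h>
#include <sys/time.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

enum class ReadStatus { Data, PeerClosed, TimedOut, Error };

// Makes blocking recv() calls on `fd` give up after `milliseconds`.
bool set_receive_timeout(int fd, long milliseconds) {
    assert(milliseconds >= 0);
    timeval tv{};
    tv.tv_sec = milliseconds / 1000;
    tv.tv_usec = (milliseconds % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0;
}

// One recv() call with its result translated into the cases discussed above.
ReadStatus read_some(int fd, char* buf, std::size_t capacity, std::size_t& received) {
    received = 0;
    const ssize_t n = recv(fd, buf, capacity, 0);
    if (n > 0) {
        received = static_cast<std::size_t>(n);
        return ReadStatus::Data;
    }
    if (n == 0) return ReadStatus::PeerClosed;  // orderly shutdown by the peer
    if (errno == EAGAIN || errno == EWOULDBLOCK) return ReadStatus::TimedOut;
    return ReadStatus::Error;  // includes EINTR; a real server may want to retry
}
```

With a 30 second limit the server would call `set_receive_timeout(client_fd, 30000)` once after accept() and close the socket when `read_some` reports `TimedOut` or `PeerClosed`.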

Is it normal for WSASend to fail during big file transfers?

I need a little help if someone's got a minute.
I've written a web server using IO completion ports, but I am having some trouble sending out large files. Web pages seem to load fine, but during large file transfers, WSASend() fails after a few minutes with error "The specified network name is no longer available."
Right now, my server just closes the associated connection when any overlapped operation fails. Is this the right thing to do? or should I retry failed overlapped operations a few times before I close the socket? I am using tcp/stream sockets.
(fixed) I am also receiving what seems like random 0 byte packets from WSARecv. I am not sure what to make of this, or if the problem is related.(/fixed)
Thanks for any help
edit: now that the server properly handles connections, and has a much more comprehensive log, it seems like Len is right. The client is closing the connection for some reason.
The log:
Initializing Windows Sockets...
Forwarding port 80...
Starting server...
Waiting for incoming connections...
Socket 1128: Client connected.
Socket 1128: Request received
Socket 1128: Sent response
Socket 1128: Error 64: SendChunk() failed. //WSASend()
Socket 1128: Closing connection - GetQueueCompletionStatus == FALSE
so the question is now, why would the client close the connection? It takes anywhere from 2-5 minutes to happen. I have decreased the buffer size to 4098 bytes per send, and only send the next chunk when the first has completed.
Thanks again for any ideas on this.
p.s. I even just implemented a retry function so that it will retry a failed overlapped IO operation five times before giving up....still no luck =(
A zero-length read returned from recv indicates that the client on the other end has closed the connection.
Which answers why your subsequent send to the client failed.
http://www.opengroup.org/onlinepubs/009695399/functions/recv.html
If no messages are available to be received and the peer has performed an orderly shutdown, recv() shall return 0.
Are you doing anything to impose some form of flow control on your data transmission?
If not then you are probably using up resources which is causing the send to fail.
For example, if you are simply issuing LOTS of WSASend() calls one after the other rather than pacing them based on when they complete, then each one will use system resources (non-paged pool and/or locked pages, which count towards the 'locked pages limit'). You'll then likely eventually fail with ENOBUFS or similar errors.
What you need to do is build a flow control system that works off of the send completions so that you only ever have a known number of sends outstanding at a time.
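The bookkeeping for such a scheme can be quite small. The following is a platform-neutral sketch, not Winsock code: `SendThrottle` is an invented class, and `issue_send` is a hypothetical callback that would start one overlapped send (for example by calling WSASend()). It is not thread-safe, so with several IOCP worker threads the calls would need a lock around them.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

class SendThrottle {
public:
    SendThrottle(std::size_t max_outstanding,
                 std::function<void(const std::string&)> issue_send)
        : max_outstanding_(max_outstanding), issue_send_(std::move(issue_send)) {
        assert(max_outstanding_ > 0);
    }

    // Queue a chunk; it is sent right away only if the window has room.
    void submit(std::string chunk) {
        waiting_.push_back(std::move(chunk));
        pump();
    }

    // Call once for every send completion reported by the OS.
    void on_send_complete() {
        assert(outstanding_ > 0);
        --outstanding_;
        pump();
    }

    std::size_t outstanding() const { return outstanding_; }
    std::size_t queued() const { return waiting_.size(); }

private:
    // Start queued sends until the limit of outstanding sends is reached.
    void pump() {
        while (outstanding_ < max_outstanding_ && !waiting_.empty()) {
            std::string chunk = std::move(waiting_.front());
            waiting_.pop_front();
            ++outstanding_;
            issue_send_(chunk);
        }
    }

    std::size_t max_outstanding_;
    std::function<void(const std::string&)> issue_send_;
    std::size_t outstanding_ = 0;
    std::deque<std::string> waiting_;
};
```

In a real server the chunk buffer must stay alive until its completion arrives; the sketch ignores buffer lifetime and only shows how completions drive the next send.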
See these questions for more detail:
Implement a good performing "to-send" queue with TCP
Limiting TCP sends with a "to-be-sent" queue and other design issues
Finally figured it out.
from Rogers Internet Terms of Service:
Without limitation, you may not use (or allow anyone else to use) our Services to:
(xvi) operate a server in connection with the Services, including, without limitation, mail, news, file, gopher, telnet, chat, Web, or host configuration servers, multimedia streamers or multi-user interactive forums;
how lame is that? O_o
good news: server works fine =)
edit- called Rogers. They verified that they are cutting me off, and told me that I need a business account to run a web server.

What is the best way to implement a heartbeat in C++ to check for socket connectivity?

Hey gang. I have just written a client and server in C++ using sys/socket. I need to handle a situation where the client is still active but the server is down. One suggested way to do this is to use a heartbeat to periodically assert connectivity. And if there is none to try to reconnect every X seconds for Y period of time, and then to time out.
Is this "heartbeat" the best way to check for connectivity?
The socket I am using might have information on it, is there a way to check that there is a connection without messing with the buffer?
If you're using TCP sockets over an IP network, you can use the TCP protocol's keepalive feature, which will periodically check the socket to make sure the other end is still there. (This also has the advantage of keeping the forwarding record for your socket valid in any NAT routers between your client and your server.)
Here's a TCP keepalive overview which outlines some of the reasons you might want to use TCP keepalive; this Linux-specific HOWTO describes how to configure your socket to use TCP keepalive at runtime.
It looks like you can enable TCP keepalive in Windows sockets by setting SIO_KEEPALIVE_VALS using the WSAIoctl() function.
If you're using UDP sockets over IP you'll need to build your own heartbeat into your protocol.
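For the TCP case on Linux, enabling keepalive at runtime could look like the sketch below. The function name `enable_keepalive` is made up; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux socket options, so other platforms may need different option names or only support SO_KEEPALIVE itself.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#include <cassert>

// Turns on keepalive probes for one TCP socket (Linux-specific options).
//   idle_seconds:     idle time before the first probe is sent
//   interval_seconds: time between unanswered probes
//   probe_count:      unanswered probes before the connection is dropped
bool enable_keepalive(int fd, int idle_seconds, int interval_seconds, int probe_count) {
    assert(idle_seconds > 0 && interval_seconds > 0 && probe_count > 0);
    const int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_seconds,
                      sizeof idle_seconds) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_seconds,
                      sizeof interval_seconds) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count,
                      sizeof probe_count) == 0;
}
```

With `enable_keepalive(fd, 60, 10, 5)` a dead peer would be noticed after roughly 60 + 5 * 10 seconds of silence, at which point reads and writes on the socket start failing.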
Yes, this heartbeat is the best way. You'll have to build it into the protocol the server and client use to communicate.
The simplest solution is to have the client send data periodically and the server close the connection if it hasn't received any data from the client in a particular period of time. This works perfectly for query/response protocols where the client sends queries and the server sends responses.
For example, you can use the following scheme:
The server responds to every query. If the server does not receive a query for two minutes, it closes the connection.
The client sends queries and keeps the connection open after each one.
If the client has not sent a query for one minute, it sends an "are you there" query. The server responds with "yes I am". This resets the server's two-minute timer and confirms to the client that the connection is still available.
It may be simpler to just have the client close the connection if it hasn't needed to send a query for the past minute. Since all operations are initiated by the client, it can always just open a new connection if it needs to perform a new operation. That reduces it to just this:
The server closes the connection if it hasn't received a query in two minutes.
The client closes the connection if it hasn't needed to send a query in one minute.
However, this doesn't assure the client that the server is present and ready to accept a query at all times. If you need this capability, you will have to implement an "are you there" "yes I am" query/response into your protocol.
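Both sides of this scheme need the same small piece of state: the time of the last activity and a limit. A sketch with an invented `IdleTimer` class follows; the current time is passed in explicitly so the logic can be exercised without waiting, and the actual socket I/O is not shown.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Remembers when the connection was last active and reports when the
// configured idle limit has been reached.
class IdleTimer {
public:
    IdleTimer(Clock::duration limit, Clock::time_point now)
        : limit_(limit), last_activity_(now) {
        assert(limit > Clock::duration::zero());
    }

    // Server: call when a query arrives. Client: call when a query is sent.
    void touch(Clock::time_point now) { last_activity_ = now; }

    // Server: close the connection when true.
    // Client: send "are you there" (or close, in the simpler variant) when true.
    bool expired(Clock::time_point now) const {
        return now - last_activity_ >= limit_;
    }

private:
    Clock::duration limit_;
    Clock::time_point last_activity_;
};
```

The server would create one `IdleTimer` per connection with a two minute limit, the client one with a one minute limit, and both would check `expired(Clock::now())` from their event loop or a periodic timer.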
If the other side has gone away (i.e. the process has died, the machine has gone down, etc.), attempting to receive data from the socket should result in an error. However if the other side is merely hung, the socket will remain open. In this case, having a heartbeat is useful. Make sure that whatever protocol you are using (on top of TCP) supports some kind of "do-nothing" request or packet - each side can use this to keep track of the last time they received something from the other side, and can then close the connection if too much time elapses between packets.
Note that this is assuming you're using TCP/IP. If you're using UDP, then that's a whole other kettle of fish, since it's connectionless.
Ok, I don't know what your program does or anything, so maybe this isn't feasible, but I suggest that you avoid trying to always keep the socket open. It should only be open when you are using it, and should be closed when you are not.
If you are between reads and writes waiting on user input, close the socket. Design your client/server protocol (assuming you're doing this by hand and not using any standard protocols like http and/or SOAP) to handle this.
Sockets will error if the connection is dropped; write your program such that you don't lose any information in the case of such an error during a write to the socket and that you don't gain any information in the case of an error during a read from the socket. Transactionality and atomicity should be rolled into your client/server protocol (again, assuming you're designing it yourself).
maybe this will help you, TCP Keepalive HOWTO
or this SO_SOCKET