C++ "HTTP" Server - Chunked Data Transfer - c++

UPDATE: Thank you for the help so far. I've just tested the program by connecting to it directly from the browser instead of through an XMLHttpRequest. Going straight from the browser works flawlessly.
However, this connection must be handled via an XMLHttpRequest. According to FireBug, it receives the full response (31 bytes in this case), closes the connection, and sets readyState to 4, but responseText is completely empty.
I'm creating a C++ app that accepts connections and responds as if it were an HTTP Server. My goal is to create a real-time chat server by opening connections to this C++ app, and responding with a "page" that continues to load as new messages are sent. I am currently sending the following back:
HTTP/1.1 200 OK\r\n
Transfer-Encoding: chunked\r\n
Content-Type: text/plain\r\n
\r\n
Up to this point, everything works. Using FireBug, I can see that it is properly receiving and interpreting headers. However, I cannot figure out how to forward response text. I know that in plain text, it would be read as follows:
5
Hello
8
Good bye
But every iteration I've tried (with \r\n, without \r\n, counting \r\n as 2 additional bytes) so far does not get properly read by the browser as response text. Can somebody help with crafting a proper string to send as response text?

You should end the transfer with a zero-length chunk:
5
Hello
8
Good bye
0
Otherwise the browser does not know you are finished.

You're trying to implement "HTTP push", also known as HTTP streaming. The issue is that not all browsers support this correctly. For browsers such as Firefox and Opera, you could try the MIME type multipart/x-mixed-replace: as long as you keep the connection alive and keep sending data, Firefox should read it. This will not work in IE, though.

"Each chunk starts with the number of octets of the data it embeds expressed in hexadecimal followed by optional parameters (chunk extension) and a terminating CRLF (carriage return and line feed) sequence, followed by the chunk data"
Are you using hex for your lengths? The \r\n after the chunk length should not be counted in the length.
Also, try closing out the page with a 0 length. That will let you know if the browser is just buffering before parsing.
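To make the framing described above concrete, here is a minimal sketch of how the chunks could be built before being written to the socket. The helper names encode_chunk and last_chunk are hypothetical (they are not from the original post), and the sketch assumes the headers with Transfer-Encoding: chunked have already been sent.

```cpp
#include <cstdio>
#include <string>

// Frame one piece of body data as an HTTP/1.1 chunk:
//   <size in hex>\r\n<data>\r\n
// The two CRLF pairs are framing only and are NOT counted in the size.
// Do not pass an empty string: a size of 0 marks the end of the body.
// (encode_chunk is a hypothetical helper name used for illustration.)
std::string encode_chunk(const std::string& data) {
    char size_line[32];
    std::snprintf(size_line, sizeof size_line, "%zx\r\n", data.size());
    return std::string(size_line) + data + "\r\n";
}

// The zero-length chunk that ends the body, followed by the blank line
// that closes the (empty) trailer section.
std::string last_chunk() {
    return "0\r\n\r\n";
}
```

With these helpers, the example from the question would go out as encode_chunk("Hello"), then encode_chunk("Good bye"), then last_chunk() once the stream is finished. Note that a 26-byte chunk is announced as 1a, not 26.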

Related

boost::asio::read() blocking in custom web service

I love seeing that there's a cross-platform standard for TCP/IP sockets emerging for C++ in boost. And so far I've been able to find help for all topics I've run into. But now I'm stuck on an odd behavior. I'm developing using Xcode 7.3.1 on a late-2013 iMac.
I'm developing a simple web server for a special purpose. The code below is a pared down version that demonstrates the bad behavior:
#include <boost/asio.hpp>
#include <boost/bind.hpp>
using namespace std;
using namespace boost;
using namespace boost::asio;
using namespace boost::asio::ip;
int main(int argc, const char * argv[]) {
    static asio::io_service ioService;
    static tcp::acceptor tcpAcceptor(ioService, tcp::endpoint(tcp::v4(), 2080));
    while (true) {
        // creates a socket
        tcp::socket* socket = new tcp::socket(ioService);
        // wait and listen
        tcpAcceptor.accept(*socket);
        asio::streambuf inBuffer;
        istream headerLineStream(&inBuffer);
        char buffer[1];
        asio::read(*socket, asio::buffer(buffer, 1)); // <--- Yuck!
        asio::write(*socket, asio::buffer((string) "HTTP/1.1 200 OK\r\n\r\nYup!"));
        socket->shutdown(asio::ip::tcp::socket::shutdown_both);
        socket->close();
        delete socket;
    }
    return 0;
}
When I access this service, under a certain set of conditions, the browser will choke for upwards of 20 seconds. If I pause the program running in debug mode, I can see that the asio::read() call is blocking. It's literally waiting for even a single character to appear from the browser. Why is this?
Let me clarify, because what I have to do to reproduce this on my machine is strange. Once I start the program (for debugging), I open the "page" from Chrome (as http://localhost:2080/). I can hit Refresh many times and it works just fine. But then I use Firefox (or Safari) and it hangs for maybe 20 seconds, after which the page shows up as expected. Now get this: if, during that delay in Firefox, I hit Refresh in Chrome, the Firefox page shows up immediately, too. In another experiment, I hit Refresh in Chrome (works fine) and then hit Refresh in both Firefox and Safari. Both of them hang. I hit Refresh in Chrome and all 3 show up immediately.
In a change to this experiment, as soon as I start this program, I hit Refresh in either Firefox or Safari and they work just fine. No matter how many times I refresh. And going back and forth between them. I'm literally holding down CMD-R to rapid-fire refresh these browsers. But as soon as I refresh Chrome on the same page and then try refreshing the other two browsers, they hang again.
Having done web programming since around 1993, I know the HTTP standard well. The most basic workflow is that the browser initiates a TCP connection. As soon as the web server accepts the connection, the client sends an HTTP header. Something like "GET /\r\n\r\n" for the root page ("/"). The server typically reads all the header lines and stops when it gets to the first blank line, which signals the end of the headers and the beginning of the uploaded content (e.g., POSTed form content), which the web application is free to consume or ignore. The server responds when it is ready with its own HTTP headers, starting typically with "HTTP/1.1 200 OK\r\n", followed by the actual page content (or binary file contents, etc).
In my app, I'm actually using asio::read_until(*socket, inBuffer, "\r\n\r\n") to read the entire HTTP header. Since that was hanging, I thought maybe those other browsers were sending corrupt headers or something. Hence my trimming down of the sample to just reading a single character (should be the "G" in "GET /"). One single character. Nope.
As a side note, I know I'm doing this synchronously, but I really wanted a simple, linear demo to show this bad behavior. I'm assuming that's not what's causing this problem, but I know it's possible.
Any thoughts here? In my use case, this is sufferable, since the server does eventually respond, but I'd really rather understand and eliminate this bad behavior.
It seems this results from a design quirk in Chrome. See this post:
server socket receives 2 http requests when I send from chrome and receives one when I send from firefox
I see what's happening now. Chrome makes 2 connection requests. The first is for the desired page and contains a proper request HTTP header. The second connection, once accepted, does not contain even a single byte of input data. So my attempt to read that first byte goes unrewarded. Fortunately, the read attempt times out. That's easy enough to recover from with a try/catch.
This appears to be a greedy optimization to speed up Chrome's performance. That is, it holds the next connection open until the browser needs something from the site, at which point it sends the HTTP request on that open socket. It then immediately opens a new connection, again in anticipation of a future request. Although I get how this speeds up Chrome's experience, this seems a dubious design because of the added burden it places on the server.
This is a good argument for opening a separate thread to handle each accepted socket. A thread could patiently hang out waiting for the never-forthcoming request while other threads handle other requests. To that end, I wrapped up everything after tcpAcceptor.accept(*socket); in a new thread so the loop can continue waiting for the next request.

Resume Ability for a simple Download Manager (C++ - WinInet)

I'm writing a very simple download manager which can just Download - Pause - Resume.
How is it possible to resume the download from the exact point in the file where it stopped before? Actually, the only thing I'm looking for is how to set the file pointer on the server side, so that I can then download from the exact point I want with InternetReadFile (any other functions are welcome if you know a better way).
However, InternetSetFilePointer never works for me :) and I don't want to use BITS.
I think this can be done by sending a header, but I don't know what to send or how to send it.
You are looking for the Range header. Use HttpAddRequestHeaders() to add a custom Range request header telling the server what range of bytes you want. See RFC 2616 Section 14.35 for syntax.
If the server supports ranges (use HttpQueryInfo(HTTP_QUERY_ACCEPT_RANGES) to verify), it will send a 206 status code instead of a 200 status code (use HttpQueryInfo(HTTP_QUERY_STATUS_CODE) to verify).
If 206 is received, simply seek your existing file to the resume position and then read the response data as-is to your file.
If 200 is received, the file is starting over from the beginning, so either:
truncate the existing file and start writing it fresh, or
seek the file, read and discard the response data until it reaches the desired position, then read the remaining data into your file.
Treat any other status code as a download failure.
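The decision logic above can be sketched independently of WinInet. The names make_range_header, ResumeAction and decide_resume below are hypothetical and only illustrate the steps. The sketch assumes the open-ended "bytes=N-" form of the Range header and assumes the header string should end with CRLF before being handed to HttpAddRequestHeaders().

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: build the request header asking the server to send
// everything from byte `offset` to the end of the file.
// Assumption: the caller passes this string to HttpAddRequestHeaders().
std::string make_range_header(std::uint64_t offset) {
    return "Range: bytes=" + std::to_string(offset) + "-\r\n";
}

// What to do with the local file once the status code is known.
enum class ResumeAction { AppendAtOffset, RestartFromZero, Fail };

// Map the HTTP status code (e.g. obtained via HttpQueryInfo with
// HTTP_QUERY_STATUS_CODE) to the action described in the answer.
ResumeAction decide_resume(int status_code) {
    switch (status_code) {
        case 206: return ResumeAction::AppendAtOffset;   // range honoured
        case 200: return ResumeAction::RestartFromZero;  // full file is coming
        default:  return ResumeAction::Fail;             // treat as failure
    }
}
```

For a partial file of 1024 bytes, make_range_header(1024) yields "Range: bytes=1024-" followed by CRLF. On RestartFromZero the caller still chooses between truncating the file and discarding the first 1024 bytes of the response.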

Unable to send binary data over WebSockets

I am developing a viewer application in which the server captures an image, performs some image processing operations, and sends the result to be shown at the client end on an HTML5 canvas. The server that I've written is in VC++ and uses http://www.codeproject.com/Articles/371188/A-Cplusplus-Websocket-server-for-realtime-interact.
So far I've implemented the needed functionality. Now all I need to do is optimization. The reference was a chat application meant to send strings, so I was encoding data into a 7-bit format, which causes overhead. I need binary data transfer capability, so I modified the encoding and framing (the first header byte is now 130 for binary messages instead of 129) and I can say that the server part is all right. I've observed the outgoing frame, and it follows the protocol. I'm facing a problem on the client side.
Whenever the client receives an incoming message whose bytes are all within limits (0 to 127), it calls onMessage() and I can successfully decode the message. However, even a single byte that is >127 causes the client to call onClose(). The connection gets closed and I am unable to find the cause. Please help me out.
PS: I'm using chrome 22.0 and Firefox 17.0
It looks like your problem is related to how you assemble your frames. Since you have an established connection that terminates when the onmessage event is about to fire, I assume that it is frame related.
What do you see if you study the frames of your connection in Google Chrome (Network -> WebSocket -> Frames)?
This may be out of scope for you, but I'm one of the developers of the XSockets.NET (C#) framework. We have binary support there; if you are interested, there is an example that I happened to publish just recently at https://github.com/MagnusThor/XSockets.Binary.Controller.Example
How did you observe the outgoing frame and what were the header bytes that you observed? It sounds like you may not actually be setting the binary opcode successfully, and this is triggering UTF-8 validation in the browser which fails.
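For comparison with the observed header bytes, here is a minimal sketch of what an unmasked, unfragmented server-to-client binary frame looks like under RFC 6455. The function name make_binary_frame is hypothetical and is not taken from the CodeProject server. The sketch assumes a single-frame message.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build one unmasked, unfragmented binary frame (server -> client).
// (make_binary_frame is a hypothetical name used for illustration.)
std::vector<std::uint8_t> make_binary_frame(const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint8_t> frame;
    frame.push_back(0x82);  // FIN = 1, opcode = 0x2 (binary); decimal 130

    const std::uint64_t len = payload.size();
    if (len <= 125) {
        // Short payload: the length fits in the 7-bit field (mask bit = 0).
        frame.push_back(static_cast<std::uint8_t>(len));
    } else if (len <= 0xFFFF) {
        frame.push_back(126);  // 16-bit length follows, big-endian
        frame.push_back(static_cast<std::uint8_t>(len >> 8));
        frame.push_back(static_cast<std::uint8_t>(len & 0xFF));
    } else {
        frame.push_back(127);  // 64-bit length follows, big-endian
        for (int shift = 56; shift >= 0; shift -= 8)
            frame.push_back(static_cast<std::uint8_t>((len >> shift) & 0xFF));
    }

    // Payload bytes are copied as-is; values above 127 are legal here.
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}
```

If the first byte on the wire is 0x81 (129) rather than 0x82 (130), the browser treats the payload as text and validates it as UTF-8, which would explain a close on the first byte above 127.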

HTTP keep-alive with C++ recv winsocket2

I'm coding my own HTTP fetcher socket. I use C++ in MS VC++ with winsock2.h.
I was able to program the socket to connect to the required website's server and send an HTTP GET request.
Now the problem is that after I send an HTTP GET request with a keep-alive connection, I call the recv function, and it works fine, except that after it retrieves the website it keeps lingering, waiting for a timeout from the server or for the connection to close!
This takes a few seconds or less, depending on the keep-alive timeout the server has.
Therefore, I can't benefit from the HTTP keep-alive setting.
How can I tell the recv function to stop after retrieving the website and give control back to me, so I can send another HTTP request while avoiding another handshake?
When I use non-blocking sockets it works faster, but I don't know when to stop. I set a str.rfind("",-1,7) to stop retrieving data.
However, it is not very efficient.
Does anybody know a way to do it, or what the last character sent by the HTTP server is when the connection is kept alive, so I can use it as a stopping condition?
Best,
Moe
Check for a Content-Length: xxxxx header, and only read xxxxx bytes after the header, which is terminated by a blank line (CR-LF-CR-LF in stream).
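As a rough sketch of that check, the functions below look at everything received so far and report whether the whole response has arrived. The names content_length and response_complete are hypothetical. The sketch assumes the response carries a well-formed numeric Content-Length header (std::stoll throws otherwise) and that the recv loop appends each received block to one std::string.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the Content-Length announced in the header
// block of `response`, or -1 if the headers are not complete yet or the
// header is absent.
long long content_length(const std::string& response) {
    const std::size_t header_end = response.find("\r\n\r\n");
    if (header_end == std::string::npos) return -1;  // blank line not seen yet

    // Lower-case a copy of the header block: header names are case-insensitive.
    std::string headers = response.substr(0, header_end);
    std::transform(headers.begin(), headers.end(), headers.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "\r\ncontent-length:";
    const std::size_t pos = headers.find(key);
    if (pos == std::string::npos) return -1;
    return std::stoll(headers.substr(pos + key.size()));  // skips leading spaces
}

// True once `response` holds the full header block plus the whole body,
// i.e. the point at which the recv loop can stop.
bool response_complete(const std::string& response) {
    const long long length = content_length(response);
    if (length < 0) return false;
    const std::size_t body_start = response.find("\r\n\r\n") + 4;
    return response.size() - body_start >= static_cast<std::size_t>(length);
}
```

The receive loop would call response_complete() after each recv and stop as soon as it returns true, leaving the connection open for the next request.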
update
If the data is chunked:
Chunked Transfer-Encoding (reference)
...
A chunked message body contains a series of chunks, followed by a line with "0" (zero), followed by optional footers (just like headers), and a blank line. Each chunk consists of two parts:
a line with the size of the chunk data, in hex, possibly followed by a semicolon and extra parameters you can ignore (none are currently standard), and ending with CRLF.
the data itself, followed by CRLF.
Also, the W3C copy of RFC 2616 describes Chunked Transfer-Encoding in section 3.6.1: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html.
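A decoder following that description could look roughly like the sketch below. The name decode_chunked is hypothetical. The sketch assumes the header block has already been stripped, ignores chunk extensions, and does not parse footers. It returns false when the body is still incomplete, so the same function can serve as the "have I received everything yet?" test in a recv loop.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a chunked message body into `out`.
// Returns true once the terminating zero-size chunk has been seen,
// false if the data is truncated or malformed.
bool decode_chunked(const std::string& body, std::string& out) {
    std::size_t pos = 0;
    out.clear();
    for (;;) {
        // The size line ends with CRLF.
        const std::size_t line_end = body.find("\r\n", pos);
        if (line_end == std::string::npos) return false;  // size line incomplete

        std::size_t size = 0;
        try {
            // Base 16; parsing stops at the first non-hex character,
            // so a ";extension" after the size is ignored.
            size = std::stoul(body.substr(pos, line_end - pos), nullptr, 16);
        } catch (const std::exception&) {
            return false;  // no hex digits on the size line
        }
        pos = line_end + 2;

        if (size == 0) return true;  // last chunk; footers are not parsed here

        // Need `size` data bytes plus the CRLF that follows them.
        const std::size_t remaining = body.size() - pos;
        if (remaining < size || remaining - size < 2) return false;
        out.append(body, pos, size);
        if (body.compare(pos + size, 2, "\r\n") != 0) return false;
        pos += size + 2;
    }
}
```

Given the body "5\r\nHello\r\n8\r\nGood bye\r\n0\r\n\r\n", the function fills out with "HelloGood bye" and returns true.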
Set the non-blocking I/O flag on the socket, so that recv will return immediately with only as much data as has already been received. Combine this with select, WSAEventSelect, WSAAsyncSelect, or completion ports to be informed when data arrives (instead of busy-waiting).

How to buffer and process chunked data before sending headers in IIS7 Native Module

So I've been working on porting an IIS6 ISAPI module to IIS7. One problem that I have is that I need to be able to parse and process responses, and then change/delete/add some HTTP headers based on the content. I got this working fine for most content, but it appears to break down when chunked encoding is being used on the response body.
It looks like CHttpModule::OnSendResponse is being called once for each chunk. I've been able to determine when a chunked response is being sent, and to buffer the data until all of the chunks have been passed in, and set the entity count to 0 to prevent it from sending that data out, but after the first OnSendResponse is called the headers are sent to the client already, so I'm not able to modify them later after I've already processed the chunked data.
I realize that doing this is going to eliminate the benefits of the chunked encoding, but in this case it is necessary.
The only example code I can find for IIS Native Modules is very simplistic and doesn't demonstrate performing any filtering of response data. Any tips or links on this would be great.
Edit: Okay, I found IHttpResponse::SuppressHeaders, which will prevent the headers from being sent after the first OnSendResponse. However, now it will not send the headers at all. So what I did was when it's a chunked response I set it to suppress headers, and then later after I process the response, I check to see if the headers were suppressed, and if they were I read all of the headers from raw response structure (HTTP_RESPONSE), and insert them at the beginning of the response entity chunks myself. This seems to work okay so far.
Still open to other ideas if anybody has any better option.