Resume Ability for a simple Download Manager (C++ - WinInet)

I'm writing a very simple download manager that can only Download, Pause, and Resume.
How is it possible to resume a download from the exact point in the file where it stopped before? Actually, the only thing I'm looking for is how to set the file pointer on the server side so I can download from the exact point I want with InternetReadFile (other functions are also acceptable if you know a better way).
InternetSetFilePointer never works for me :) and I don't want to use BITS.
I think this can be done by sending a header, but I don't know what to send or how to send it.

You are looking for the Range header. Use HttpAddRequestHeaders() to add a custom Range request header telling the server which range of bytes you want; see RFC 2616 Section 14.35 for the syntax.
If the server supports ranges (use HttpQueryInfo(HTTP_QUERY_ACCEPT_RANGES) to verify), it will send a 206 status code instead of a 200 status code (use HttpQueryInfo(HTTP_QUERY_STATUS_CODE) to check).
If 206 is received, simply seek your existing file to the resume position and then write the response data to it as-is.
If 200 is received, the file is starting over from the beginning, so either:
- truncate the existing file and start writing it fresh, or
- keep the existing data: read and discard response data until you reach the resume position, then read the remaining data into your file.
Treat any other status code as a download failure.
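Putting that together, here is a rough, untested sketch. hConnect is assumed to be an open connection handle from InternetConnect(), and ResumeDownload and the FILE* handling are illustrative, not a canonical API:

    // Hedged sketch: resume a download at byte offset resumeFrom.
    #include <windows.h>
    #include <wininet.h>
    #include <cstdio>

    bool ResumeDownload(HINTERNET hConnect, const wchar_t* path,
                        DWORD resumeFrom, FILE* out)
    {
        HINTERNET hRequest = HttpOpenRequestW(hConnect, L"GET", path, NULL,
                                              NULL, NULL, INTERNET_FLAG_RELOAD, 0);
        if (!hRequest) return false;

        // Ask for everything from resumeFrom to the end of the file.
        wchar_t range[64];
        swprintf(range, 64, L"Range: bytes=%lu-\r\n", resumeFrom);
        HttpAddRequestHeadersW(hRequest, range, (DWORD)-1,
                               HTTP_ADDREQ_FLAG_ADD | HTTP_ADDREQ_FLAG_REPLACE);

        if (!HttpSendRequestW(hRequest, NULL, 0, NULL, 0)) {
            InternetCloseHandle(hRequest);
            return false;
        }

        DWORD status = 0, size = sizeof(status);
        HttpQueryInfoW(hRequest, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER,
                       &status, &size, NULL);

        if (status == 206) {
            fseek(out, (long)resumeFrom, SEEK_SET);  // server honored the range
        } else if (status == 200) {
            fseek(out, 0, SEEK_SET);                 // range ignored: start over
            // (alternatively, discard resumeFrom response bytes and append)
        } else {
            InternetCloseHandle(hRequest);
            return false;                            // treat as download failure
        }

        char buf[8192];
        DWORD read = 0;
        while (InternetReadFile(hRequest, buf, sizeof(buf), &read) && read > 0)
            fwrite(buf, 1, read, out);

        InternetCloseHandle(hRequest);
        return true;
    }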

Related

How to stream a file continuously with boost::beast

I have a local file which some process continuously appends to. I would like to serve that file with boost::beast.
So far I'm using boost::beast::http::response<boost::beast::http::file_body> and boost::beast::http::async_write to send the file to the client. That works very well and it is nice that boost::beast takes care of everything. However, when the end of the file is reached, it stops the asynchronous writing. I assume that is because is_done of the underlying serializer returns true at this point.
Is it possible to keep the asynchronous writing ongoing so new contents are written to the client as the local file grows (similar to how tail -f would keep writing the file's contents to stdout)?
I've figured that I might need to use boost::beast::http::response_serializer<boost::beast::http::file_body> for that kind of customization but I'm not sure how to use it correctly. And do I need to use chunked encoding for that purpose?
Note that keeping the HTTP connection open is not the problem, only writing further output as soon as the file grows.
After some research, this problem does not seem easily solvable, at least not under GNU/Linux, which I'm currently focusing on.
It is possible to use chunked encoding as described in boost::beast's documentation. I've implemented serving chunks asynchronously from file contents that are also read asynchronously with the help of boost::asio::posix::stream_descriptor. That works quite well. However, it also stops with an end-of-file error as soon as the end of the file is reached. When using async_wait via the descriptor I get the error "Operation not supported".
So it simply seems impossible to asynchronously wait for more bytes to be written to a file. That's strange, considering tail -f does exactly that. So I straced tail -f, and it turns out that it calls inotify_add_watch(4, "path_to_file", IN_MODIFY). Hence I assume one actually needs to use inotify to implement this.
For me it seems easier and more efficient to take control of the process that writes to the file, and have it print to stdout instead. Then I can stream from the pipe (similar to how I attempted streaming the file) and write the file myself.
However, if one wanted to go down that road, I suppose using inotify together with boost::asio::posix::stream_descriptor is the answer to the question, at least under GNU/Linux.
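For reference, a minimal Linux-only sketch of that road (untested; the file path and the handler body are assumptions). inotify delivers an event whenever the file is modified, and boost::asio::posix::stream_descriptor lets you await those events asynchronously alongside the HTTP writes:

    // Sketch: asynchronously waiting for a file to grow via inotify + asio.
    #include <boost/asio.hpp>
    #include <sys/inotify.h>
    #include <unistd.h>
    #include <array>
    #include <functional>
    #include <iostream>

    int main()
    {
        boost::asio::io_context io;

        int fd = inotify_init1(IN_NONBLOCK);
        inotify_add_watch(fd, "/tmp/growing.log", IN_MODIFY);

        // Wrap the inotify descriptor so events can be awaited asynchronously.
        boost::asio::posix::stream_descriptor watcher(io, fd);

        std::array<char, 4096> buf;
        std::function<void()> arm = [&] {
            watcher.async_read_some(boost::asio::buffer(buf),
                [&](boost::system::error_code ec, std::size_t n) {
                    if (ec) return;
                    // One or more inotify_event records arrived: the file
                    // grew. Here one would read the newly appended bytes and
                    // emit them as the next HTTP chunk to the client.
                    std::cout << "file modified (" << n << " event bytes)\n";
                    arm(); // keep watching
                });
        };
        arm();

        io.run();
    }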

Python: reading from rfile in SimpleHTTPRequestHandler

While subclassing SimpleHTTPRequestHandler, my handler blocks on self.rfile.read(). How do I find out whether there is any data in rfile before reading? Alternatively, is there a non-blocking read call that returns in the absence of data?
For the record, the solution is to read only as many bytes as specified in the Content-Length header, i.e. something like:

    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        payload = self.rfile.read(content_length).decode('utf-8')
I just solved a case like this.
I'm pretty sure the gotcha here is an endless stream of bytes being written to your socket by the client you're connected to. This is often called "pre-connect", and it happens because http/tcp/HTTPServer doesn't make a significant distinction between "chunks" and single bytes being fed slowly to the connection. If you see that the response header contains Transfer-Encoding: chunked, you are a candidate for this happening. Google Chrome works this way and is a good test: if Firefox and IE work but Chrome doesn't when you receive a response from the same website, this is probably what's happening.
Possible solutions:
- In CPython 3.7, http.server has options that can help here; look at ThreadingHTTPServer in the docs.
- Put the request handling in a separate thread, and time that thread out when the read operation takes too long.
Both of these are potentially painful solutions, but at least they should get you started.

Unable to send binary data over WebSockets

I am developing a viewer application in which the server captures an image, performs some image processing operations, and the result needs to be shown at the client end on an HTML5 canvas. The server is written in VC++ and is based on http://www.codeproject.com/Articles/371188/A-Cplusplus-Websocket-server-for-realtime-interact.
So far I've implemented the needed functionality; now all I need to do is optimization. The reference was a chat application meant to send strings, so I was encoding data into 7-bit format, which causes overhead. I need binary data transfer capability, so I modified the encoding and framing (the opcode is now 130 for binary messages instead of 129), and I can say the server part is alright: I've observed the outgoing frame and it follows the protocol. The problem is on the client side.
Whenever the client receives an incoming message whose bytes are all within limits (0 to 127), it calls onMessage() and I can successfully decode the message. However, the introduction of even a single byte >127 causes the client to call onClose(). The connection gets closed and I am unable to find the cause. Please help me out.
PS: I'm using Chrome 22.0 and Firefox 17.0.
It looks like your problem is related to how you assemble your frames. Since you have an established connection that terminates when the onmessage event is about to fire, I assume it is frame related.
What if you study Network -> WebSocket -> Frames for your connection in Google Chrome's developer tools? What does it say?
It may be out of scope for you, but I'm one of the developers of the XSockets.NET (C#) framework, which has binary support. If you are interested, there is an example I happened to publish just recently: https://github.com/MagnusThor/XSockets.Binary.Controller.Example
How did you observe the outgoing frame and what were the header bytes that you observed? It sounds like you may not actually be setting the binary opcode successfully, and this is triggering UTF-8 validation in the browser which fails.
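To make that concrete, here is a minimal, hedged sketch of a correct unmasked server-to-client binary frame per RFC 6455 (makeBinaryFrame is an illustrative name, not part of any library). If the first byte is 0x81 (text) rather than 0x82 (binary), the browser runs UTF-8 validation on the payload, which fails on bytes >127 and closes the connection, matching the symptom described:

    // Build an unmasked server-to-client WebSocket frame with the binary opcode.
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    std::vector<uint8_t> makeBinaryFrame(const uint8_t* data, size_t len)
    {
        std::vector<uint8_t> frame;
        frame.push_back(0x82);               // FIN=1, opcode=0x2 (binary)

        if (len <= 125) {
            frame.push_back((uint8_t)len);   // 7-bit length, mask bit clear
        } else if (len <= 0xFFFF) {
            frame.push_back(126);            // 16-bit extended length follows
            frame.push_back((uint8_t)(len >> 8));
            frame.push_back((uint8_t)(len & 0xFF));
        } else {
            frame.push_back(127);            // 64-bit extended length follows
            for (int shift = 56; shift >= 0; shift -= 8)
                frame.push_back((uint8_t)(((uint64_t)len >> shift) & 0xFF));
        }
        frame.insert(frame.end(), data, data + len);  // payload, no masking
        return frame;
    }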

How to buffer and process chunked data before sending headers in IIS7 Native Module

So I've been working on porting an IIS6 ISAPI module to IIS7. One problem I have is that I need to be able to parse and process responses, and then change/delete/add some HTTP headers based on the content. I got this working fine for most content, but it appears to break down when chunked encoding is used on the response body.
It looks like CHttpModule::OnSendResponse is called once for each chunk. I've been able to determine when a chunked response is being sent, to buffer the data until all of the chunks have been passed in, and to set the entity count to 0 to prevent that data from being sent out. But after the first OnSendResponse call the headers have already been sent to the client, so I'm not able to modify them later, after I've processed the chunked data.
I realize that doing this is going to eliminate the benefits of the chunked encoding, but in this case it is necessary.
The only example code I can find for IIS Native Modules is very simplistic and doesn't demonstrate any filtering of response data. Any tips or links on this would be great.
Edit: Okay, I found IHttpResponse::SuppressHeaders, which prevents the headers from being sent after the first OnSendResponse. However, then it will not send the headers at all. So when it's a chunked response, I set it to suppress headers; later, after I process the response, I check whether the headers were suppressed, and if they were, I read all of the headers from the raw response structure (HTTP_RESPONSE) and insert them at the beginning of the response entity chunks myself. This seems to work okay so far.
Still open to other ideas if anybody has any better option.
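For reference, a rough sketch of the approach described in the edit above (not production code; the buffering itself is elided, and the class name is made up):

    // Detect a chunked response in OnSendResponse and suppress its headers so
    // they can be modified and re-emitted after all chunks are processed.
    #include <httpserv.h>
    #include <string.h>

    class CBufferingModule : public CHttpModule
    {
    public:
        REQUEST_NOTIFICATION_STATUS OnSendResponse(
            IHttpContext* pHttpContext, ISendResponseProvider* pProvider)
        {
            IHttpResponse* pResponse = pHttpContext->GetResponse();

            USHORT cch = 0;
            PCSTR te = pResponse->GetHeader(HttpHeaderTransferEncoding, &cch);
            if (te != NULL && _stricmp(te, "chunked") == 0)
            {
                // Keep IIS from flushing headers on this first OnSendResponse;
                // they are re-emitted manually once all chunks are processed.
                pResponse->SuppressHeaders();

                // ... copy the entity chunks from GetRawHttpResponse() aside,
                // then set EntityChunkCount to 0 so nothing is sent yet ...
            }
            return RQ_NOTIFICATION_CONTINUE;
        }
    };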

C/C++ Determine Whether Files have been completely written

I have a directory (DIR_A) being dumped from Server A to Server B, which is expected to take a few weeks. DIR_A has a normal tree structure, i.e. a directory can contain subfolders, files, etc.
Aim:
While DIR_A is being dumped to Server B, I have to go through DIR_A and search for certain files within it (I do not know the exact name of each file, because Server A changes the names of all the files being sent). I cannot wait weeks to process some of the files within DIR_A, so I want to start manipulating some of the files as soon as I receive them at Server B.
Brief:
Server A sends DIR_A to Server B; this is expected to take weeks. I have to start processing the files at B before the upload is complete.
Attempt Idea:
I decided to write a program that lists the contents of DIR_A, checking for files within its folders and subfolders. I thought I might look for the EOF of each file within DIR_A: if it is not present, the file has not yet been completely uploaded, and I should wait until the EOF is found. So I keep looping, calculating the size of the file and verifying whether EOF is present; if it is, I start processing that file.
To simulate the above, I wrote a program that writes to a text file and stopped it in the middle, without waiting for completion. I then tried to use the program below to determine whether the EOF could be found. I assumed that since I abruptly ended the program writing to the text file, the EOF would not be present, and hence the output "EOF FOUND" should never be reached. I was wrong, since it was reached. I also tried with feof() and fseek().
    #include <fstream>
    #include <iostream>

    std::ifstream file(name_of_file.c_str(), std::ios::binary);
    // go to the end of the file, then try to read past it
    char character;
    file.seekg(0, std::ios::end);
    while (!file.eof()) {
        file.read(&character, sizeof(char));
    }
    file.close();
    std::cout << "EOF FOUND" << std::endl;

Could anyone provide an idea of how to determine whether a file has been completely written or not?
EOF is simply C++'s way of telling you there is no more data. There's no EOF "character" that you can use to check if the file is completely written.
The way this is typically accomplished is to transfer the file under a temporary name, e.g. myfile.txt.transferring, and once the transfer is complete, rename the file on the target host (back to something like myfile.txt). You could do the same using separate directories.
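A minimal sketch of that rename convention (C++17; the suffix and paths are arbitrary assumptions):

    // The sender writes to "myfile.txt.transferring" and renames on completion;
    // the receiver only processes files without the ".transferring" suffix.
    #include <filesystem>
    namespace fs = std::filesystem;

    void finish_transfer(const fs::path& tmp)   // e.g. "myfile.txt.transferring"
    {
        fs::path final_name = tmp;
        final_name.replace_extension();         // strips ".transferring"
        fs::rename(tmp, final_name);            // atomic within one filesystem
    }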
Neither C nor C++ has a standard way to determine whether a file is still open for writing by another process. We have a similar situation: a server sends us files, and we have to pick them up and handle them as soon as possible. For that we use Linux's inotify subsystem, with a watch configured for IN_CLOSE_WRITE events (file was closed after having been opened for writing), wrapped in boost::asio::posix::stream_descriptor for convenient asynchronicity.
Depending on the OS, you may have a similar facility. Or just lsof as already suggested.
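A short Linux-only sketch of that inotify approach (the directory path is an assumption, and error handling is omitted):

    // Block until a file in the incoming directory is closed after writing,
    // i.e. the sender has finished it.
    #include <sys/inotify.h>
    #include <unistd.h>
    #include <cstdio>

    int main()
    {
        int fd = inotify_init();
        inotify_add_watch(fd, "/incoming/DIR_A", IN_CLOSE_WRITE);

        alignas(inotify_event) char buf[4096];
        ssize_t n = read(fd, buf, sizeof(buf));      // blocks until an event
        if (n > 0) {
            inotify_event* ev = (inotify_event*)buf;
            printf("finished file: %s\n", ev->len ? ev->name : "(unnamed)");
        }
        close(fd);
    }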
All finite files have an end. If a file is being written by one process and (assuming the OS allows it) simultaneously read, faster than it is being written, by another process, then the reading process will see EOF once it has read all the characters that have been written so far.
What would probably work better is this: if you can determine a length of time during which you are guaranteed to receive a significant number of bytes and have them written to the file (beware of OS buffering), then you can walk the directory once per period, and any file that has changed size can be considered unfinished.
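A rough sketch of that polling idea (C++17; the interval, path, and "size stable across two passes" heuristic are assumptions):

    // Snapshot file sizes each pass; a file whose size is unchanged across two
    // passes is assumed complete (beware OS buffering and transfer stalls).
    #include <chrono>
    #include <cstdint>
    #include <filesystem>
    #include <iostream>
    #include <map>
    #include <thread>
    namespace fs = std::filesystem;

    int main()
    {
        std::map<fs::path, std::uintmax_t> last_size;
        for (;;) {
            for (auto& entry : fs::recursive_directory_iterator("/incoming/DIR_A")) {
                if (!entry.is_regular_file()) continue;
                auto size = entry.file_size();
                auto it = last_size.find(entry.path());
                if (it != last_size.end() && it->second == size)
                    std::cout << "stable, likely complete: " << entry.path() << '\n';
                last_size[entry.path()] = size;
            }
            std::this_thread::sleep_for(std::chrono::seconds(30));
        }
    }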
Another approach would require OS support: check what files are open by the receiving process, with a tool like lsof. Any file open by the receiver is unfinished.
In C, and I think it's the same in C++, EOF is not a character; it is a condition a file is (or is not) in. Just like media removed or network down is not a character.