Some Problems of Indy 10 IdHTTP Implementation - c++

In regard to Indy 10 of IdHTTP, many things have been running perfectly, but there are a few things that don't work so well here. That is why, once again, I need your help.
Download button has been running perfectly. I'm using the following code :
void __fastcall TForm1::DownloadClick(TObject *Sender)
{
MyFile = SaveDialog->FileName;
TFileStream* Fist = new TFileStream(MyFile, fmCreate | fmShareDenyNone);
Download->Enabled = false;
Urlz = Edit1->Text;
Url->Caption = Urlz;
try
{
IdHTTP->Get(Edit1->Text, Fist);
IdHTTP->Connected();
IdHTTP->Response->ResponseCode = 200;
IdHTTP->ReadTimeout = 70000;
IdHTTP->ConnectTimeout = 70000;
IdHTTP->ReuseSocket;
Fist->Position = 0;
}
__finally
{
delete Fist;
Form1->Updated();
}
}
However, a "Cancel Resume" button is still can't resume interrupted downloads. Meant, it is always sending back the entire file every time I call Get() though I've used IdHTTP->Request->Ranges property.
I use the following code:
void __fastcall TForm1::CancelResumeClick(TObject *Sender)
{
MyFile = SaveDialog->FileName;;
TFileStream* TFist = new TFileStream(MyFile, fmCreate | fmShareDenyNone);
if (IdHTTP->Connected() == true)
{
IdHTTP->Disconnect();
CancelResume->Caption = "RESUME";
IdHTTP->Response->AcceptRanges = "Bytes";
}
else
{
try {
CancelResume->Caption = "CANCEL";
// IdHTTP->Request->Ranges == "0-100";
// IdHTTP->Request->Range = Format("bytes=%d-",ARRAYOFCONST((TFist->Position)));
IdHTTP->Request->Ranges->Add()->StartPos = TFist->Position;
IdHTTP->Get(Edit1->Text, TFist);
IdHTTP->Request->Referer = Edit1->Text;
IdHTTP->ConnectTimeout = 70000;
IdHTTP->ReadTimeout = 70000;
}
__finally {
delete TFist;
}
}
Meanwhile, by using the FormatBytes function, found here, has been able to shows only the size of download files. But still unable to determine the speed of download or transfer speed.
I'm using the following code:
void __fastcall TForm1::IdHTTPWork(TObject *ASender, TWorkMode AWorkMode, __int64 AWorkCount)
{
__int64 Romeo = 0;
Romeo = IdHTTP->Response->ContentStream->Position;
// Romeo = AWorkCount;
Download->Caption = FormatBytes(Romeo) + " (" + IntToStr(Romeo) + " Bytes)";
ForSpeed->Caption = FormatBytes(Romeo);
ProgressBar->Position = AWorkCount;
ProgressBar->Update();
Form1->Updated();
}
Please advise and give an example. Any help would sure be appreciated!

In your DownloadClick() method:
Calling Connected() is useless, since you don't do anything with the result. Nor is there any guarantee that the connection will remain connected, as the server could send a Connection: close response header. I don't see anything in your code that is asking for HTTP keep-alives. Let TIdHTTP manage the connection for you.
You are forcing the Response->ResponseCode to 200. Don't do that. Respect the response code that the server actually sent. The fact that no exception was raised means the response was successful whether it is 200 or 206.
You are reading the ReuseSocket property value and ignoring it.
There is no need to reset the Fist->Position property to 0 before closing the file.
Now, with that said, your CancelResumeClick() method has many issues.
You are using the fmCreate flag when opening the file. If the file already exists, you will overwrite it from scratch, thus TFist->Position will ALWAYS be 0. Use fmOpenReadWrite instead so an existing file will open as-is. And then you have to seek to the end of the file to provide the correct Position to the Ranges header.
You are relying on the socket's Connected() state to make decisions. DO NOT do that. The connection may be gone after the previous response, or may have timed out and been closed before the new request is made. The file can still be resumed either way. HTTP is stateless. It does not matter if the socket remains open between requests, or is closed in between. Every request is self-contained. Use information provided in the previous response to govern the next request. Not the socket state.
You are modifying the value of the Response->AcceptRanges property, instead of using the value provided by the previous response. The server tells you if the file supports resuming, so you have to remember that value, or query it before then attempting to resumed download.
When you actually call Get(), the server may or may not respect the requested Range, depending on whether the requested file supports byte ranges or not. If the server responds with a response code of 206, the requested range is accepted, and the server sends ONLY the requested bytes, so you need to APPEND them to your existing file. However, if the server response with a response code of 200, the server is sending the entire file from scratch, so you need to REPLACE your existing file with the new bytes. You are not taking that into account.
In your IdHTTPWork() method, in order to calculate the download/transfer speed, you have to keep track of how many bytes are actually being transferred in between each event firing. When the event is fired, save the current AWorkCount and tick count, and then the next time the event is fired, you can compare the new AWorkCount and current ticks to know how much time has elapsed and how many bytes were transferred. From those value, you can calculate the speed, and even the estimated time remaining.
As for your progress bar, you can't use AWorkCount alone to calculate a new position. That only works if you set the progress bar's Max to AWorkCountMax in the OnWorkBegin event, and that value is not always know before a download begins. You need to take into account the size of the file being downloaded, whether it is being downloaded fresh or being resumed, how many bytes are being requested during a resume, etc. So there is lot more work involved in displaying a progress bar for a HTTP download.
Now, to answer your two questions:
How to retrieve and save the download file to a disk by using its original name?
It is provided by the server in the filename parameter of the Content-Disposition header, and/or in the name parameter of the Content-Type header. If neither value is provided by the server, you can use the filename that is in the URL you are requesting. TIdHTTP has a URL property that provides the parsed version of the last requested URL.
However, since you are creating the file locally before sending your download request, you will have to create a local file using a temp filename, and then rename the local file after the download is complete. Otherwise, use TIdHTTP.Head() to determine the real filename (you can also use it to determine if resuming is supported) before creating the local file with that filename, then use TIdHTTP.Get() to download to that local file. Otherwise, download the file to memory using TMemoryStream instead of TFileStream, and then save with the desired filename when complete.
when I click http://get.videolan.org/vlc/2.2.1/win32/vlc-2.2.1-win32.exe then the server will process requests to its actual url. http://mirror.vodien.com/videolan/vlc/2.2.1/win32/vlc-2.2.1-win32.exe. The problem is that IdHTTP will not automatically grab through it.
That is because VideoLan is not using an HTTP redirect to send clients to the real URL (TIdHTTP supports HTTP redirects). VideoLan is using an HTML redirect instead (TIdHTTP does not support HTML redirects). When a webbrowser downloads the first URL, a 5 second countdown timer is displayed before the real download then begins. As such, you will have to manually detect that the server is sending you an HTML page instead of the real file (look at the TIdHTTP.Response.ContentType property for that), parse the HTML to determine the real URL, and then download it. This also means that you cannot download the first URL directly into your target local file, otherwise you will corrupt it, especially during a resume. You have to cache the server's response first, either to a temp file or to memory, so you can analyze it before deciding how to act on it. It also means you have to remember the real URL for resuming, you cannot resume the download using the original countdown URL.
Try something more like the following instead. It does not take into account for everything mentioned above (particularly speed/progress tracking, HTML redirects, etc), but should get you a little closer:
void __fastcall TForm1::DownloadClick(TObject *Sender)
{
Urlz = Edit1->Text;
Url->Caption = Urlz;
IdHTTP->Head(Urlz);
String FileName = IdHTTP->Response->RawHeaders->Params["Content-Disposition"]["filename"];
if (FileName.IsEmpty())
{
FileName = IdHTTP->Response->RawHeaders->Params["Content-Type"]["name"];
if (FileName.IsEmpty())
FileName = IdHTTP->URL->Document;
}
SaveDialog->FileName = FileName;
if (!SaveDialog->Execute()) return;
MyFile = SaveDialog->FileName;
TFileStream* Fist = new TFileStream(MyFile, fmCreate | fmShareDenyWrite);
try
{
try
{
Download->Enabled = false;
Resume->Enabled = false;
IdHTTP->Request->Clear();
//...
IdHTTP->ReadTimeout = 70000;
IdHTTP->ConnectTimeout = 70000;
IdHTTP->Get(Urlz, Fist);
}
__finally
{
delete Fist;
Download->Enabled = true;
Updated();
}
}
catch (const EIdHTTPProtocolException &)
{
DeleteFile(MyFile);
throw;
}
}
void __fastcall TForm1::ResumeClick(TObject *Sender)
{
TFileStream* Fist = new TFileStream(MyFile, fmOpenReadWrite | fmShareDenyWrite);
try
{
Download->Enabled = false;
Resume->Enabled = false;
IdHTTP->Request->Clear();
//...
Fist->Seek(0, soEnd);
IdHTTP->Request->Ranges->Add()->StartPos = Fist->Position;
IdHTTP->Request->Referer = Edit1->Text;
IdHTTP->ConnectTimeout = 70000;
IdHTTP->ReadTimeout = 70000;
IdHTTP->Get(Urlz, Fist);
}
__finally
{
delete Fist;
Download->Enabled = true;
Updated();
}
}
void __fastcall TForm1::IdHTTPHeadersAvailable(TObject*Sender, TIdHeaderList *AHeaders, bool &VContinue)
{
Resume->Enabled = ( ((IdHTTP->Response->ResponseCode == 200) || (IdHTTP->Response->ResponseCode == 206)) && TextIsSame(AHeaders->Values["Accept-Ranges"], "bytes") );
if ((IdHTTP->Response->ContentStream) && (IdHTTP->Request->Ranges->Count > 0) && (IdHTTP->Response->ResponseCode == 200))
IdHTTP->Response->ContentStream->Size = 0;
}

#Romeo:
Also, you can try a following function to determine the real download filename.
I've translated this to C++ based on the RRUZ'function. So far so good, I'm using it on my simple IdHTTP download program, too.
But, this translation result is of course still need value improvement input from Remy Lebeau, RRUZ, or any other master here.
String __fastcall GetRemoteFileName(const String URI)
{
String result;
try
{
TIdHTTP* HTTP = new TIdHTTP(NULL);
try
{
HTTP->Head(URI);
result = HTTP->Response->RawHeaders->Params["Content-Disposition"]["filename"];
if (result.IsEmpty())
{
result = HTTP->Response->RawHeaders->Params["Content-Type"]["name"];
if (result.IsEmpty())
result = HTTP->URL->Document;
}
}
__finally
{
delete HTTP;
}
}
catch(const Exception &ex)
{
ShowMessage(const_cast<Exception&>(ex).ToString());
}
return result;
}

Related

How to get download file size before download using C/C++ in Linux environment [duplicate]

I want to get the size of an http:/.../file before I download it. The file can be a webpage, image, or a media file. Can this be done with HTTP headers? How do I download just the file HTTP header?
Yes, assuming the HTTP server you're talking to supports/allows this:
public long GetFileSize(string url)
{
long result = -1;
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
req.Method = "HEAD";
using (System.Net.WebResponse resp = req.GetResponse())
{
if (long.TryParse(resp.Headers.Get("Content-Length"), out long ContentLength))
{
result = ContentLength;
}
}
return result;
}
If using the HEAD method is not allowed, or the Content-Length header is not present in the server reply, the only way to determine the size of the content on the server is to download it. Since this is not particularly reliable, most servers will include this information.
Can this be done with HTTP headers?
Yes, this is the way to go. If the information is provided, it's in the header as the Content-Length. Note, however, that this is not necessarily the case.
Downloading only the header can be done using a HEAD request instead of GET. Maybe the following code helps:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://example.com/");
req.Method = "HEAD";
long len;
using(HttpWebResponse resp = (HttpWebResponse)(req.GetResponse()))
{
len = resp.ContentLength;
}
Notice the property for the content length on the HttpWebResponse object – no need to parse the Content-Length header manually.
Note that not every server accepts HTTP HEAD requests. One alternative approach to get the file size is to make an HTTP GET call to the server requesting only a portion of the file to keep the response small and retrieve the file size from the metadata that is returned as part of the response content header.
The standard System.Net.Http.HttpClient can be used to accomplish this. The partial content is requested by setting a byte range on the request message header as:
request.Headers.Range = new RangeHeaderValue(startByte, endByte)
The server responds with a message containing the requested range as well as the entire file size. This information is returned in the response content header (response.Content.Header) with the key "Content-Range".
Here's an example of the content range in the response message content header:
{
"Key": "Content-Range",
"Value": [
"bytes 0-15/2328372"
]
}
In this example the header value implies the response contains bytes 0 to 15 (i.e., 16 bytes total) and the file is 2,328,372 bytes in its entirety.
Here's a sample implementation of this method:
public static class HttpClientExtensions
{
public static async Task<long> GetContentSizeAsync(this System.Net.Http.HttpClient client, string url)
{
using (var request = new System.Net.Http.HttpRequestMessage(System.Net.Http.HttpMethod.Get, url))
{
// In order to keep the response as small as possible, set the requested byte range to [0,0] (i.e., only the first byte)
request.Headers.Range = new System.Net.Http.Headers.RangeHeaderValue(from: 0, to: 0);
using (var response = await client.SendAsync(request))
{
response.EnsureSuccessStatusCode();
if (response.StatusCode != System.Net.HttpStatusCode.PartialContent)
throw new System.Net.WebException($"expected partial content response ({System.Net.HttpStatusCode.PartialContent}), instead received: {response.StatusCode}");
var contentRange = response.Content.Headers.GetValues(#"Content-Range").Single();
var lengthString = System.Text.RegularExpressions.Regex.Match(contentRange, #"(?<=^bytes\s[0-9]+\-[0-9]+/)[0-9]+$").Value;
return long.Parse(lengthString);
}
}
}
}
WebClient webClient = new WebClient();
webClient.OpenRead("http://stackoverflow.com/robots.txt");
long totalSizeBytes= Convert.ToInt64(webClient.ResponseHeaders["Content-Length"]);
Console.WriteLine((totalSizeBytes));
HttpClient client = new HttpClient(
new HttpClientHandler() {
Proxy = null, UseProxy = false
} // removes the delay getting a response from the server, if you not use Proxy
);
public async Task<long?> GetContentSizeAsync(string url) {
using (HttpResponseMessage responce = await client.GetAsync(url))
return responce.Content.Headers.ContentLength;
}

How to read the length of audio files using Juce "C++." Without playing the file

I'm trying to display the length of audio files in a Playlist component for an application.
I've not used Juce or C++ before, and I can't understand how to do that from Juce documentation.
I want to make a function that takes an audio file's URL and returns the length in seconds of that audio without playing that file or doing anything else with that file.
I've tried a lot of things, and all of them didn't work, and this is the last thing I've tried:
void PlaylistComponent::trackStats(URL audioURL)
{
AudioFormatManager formatManager;
std::unique_ptr<AudioFormatReaderSource> readerSource;
AudioTransportSource transportSource;
auto* reader = formatManager.createReaderFor(audioURL.createInputStream(false));
if (reader != nullptr)
{
std::unique_ptr<AudioFormatReaderSource> newSource(new AudioFormatReaderSource(reader, true));
transportSource.setSource(newSource.get(), 0, nullptr, reader->sampleRate);
readerSource.reset(newSource.release());
DBG("PlaylistComponent::trackStats(URL audioURL): " << transportSource.getLengthInSeconds());
}
else
{
DBG("Something went wrong loading the file");
}
}
And this is the PlaylistComponent header file:
class PlaylistComponent : public juce::Component,
public juce::TableListBoxModel,
public Button::Listener,
public FileDragAndDropTarget
{
...
}
juce::AudioFormatReaderSource has a method called getTotalLength() which returns the total amount of samples.
Divide that by the sample rate of the file and you have the total length in seconds. Something like this:
if (auto* reader = audioFormatReaderSource->getAudioFormatReader())
double lengthInSeconds = static_cast<double> (audioFormatReaderSource->getTotalLength()) / reader->sampleRate;
You can do this very early on in the audio file opening procedure. You only need an AudioFormatReader instance (no need to create an AudioFormatReaderSource):
// create juce::File from a path juce::String (from a drag & drop event etc).
File file{filePath};
// make sure it is a file and not a directory, etc.
if (!file.existsAsFile()) return;
// create the AudioFormatReader that contains the data
AudioFormatReader *reader = formatManagerInstance.createReaderFor(file);
// make sure a valid reader can be created (not an unsupported file)
if (reader == nullptr) return;
// log the length in seconds
std::cout << reader->lengthInSamples / reader->sampleRate << "\n";
Note: for this to work you will need to
have access to your AudioFormatManager instance and
have already registered the format(s) of the file type (usually through a .registerBasicFormats() call on your AudioFormatManager instance.
IMPORTANT: if successful the createReaderFor() method uses new to create a new AudioFormatReaderInstance, so make sure to use delete on it when you are finished using it to avoid memory leaks

Calling a Web Service (containg multiple pages) does not load all the pages (without an added sleep delay)

My question is about a strange behavious I notice both on my iPhone device and the codenameone simulator (NetBeans).
I invoke the following code below which calls a google web service to provide a list of food places around a GPS coordinate:
The web service that is called is as follows (KEY OBSCURED):
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXXXXXXXXXXXXXXXXX
Each result contains the next page token and thus, the second call (for the subsequent page) is as follows:
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXXXXXXXXXXXXXXXXX&pagetoken=YYYYYYYYYYYYYYYYYY
public static byte[] getWSResponseData(String urlString, boolean usePost)
{
ConnectionRequest r = new ConnectionRequest();
r.setUrl(urlString);
r.setPost(usePost);
InfiniteProgress prog = new InfiniteProgress();
Dialog dlg = prog.showInifiniteBlocking();
r.setDisposeOnCompletion(dlg);
NetworkManager.getInstance().addToQueueAndWait(r);
try
{
Thread.sleep(2000);
}
catch (InterruptedException ex)
{
}
byte[] responseData = r.getResponseData();
return responseData;
}
public static void getLocationsList(double lat, double lng)
{
boolean done = false;
while (!done)
{
byte[] responseData = getWSResponseData(finalURL,false);
result = Result.fromContent(parser.parseJSON(new InputStreamReader(new ByteArrayInputStream(responseData))));
String venueNames[] = result.getAsStringArray("/results/name");
nextToken = result.getAsString("/next_page_token");
if ( nextToken == null || nextToken.equals(""))
done = true;
else
finalURL = completeURL + "&pagetoken=" + nextToken;
}
.....
}
This code works fine with the sleep timer, but when I remove the Thread.sleep, only the first page gets called.
Any help would be appreciated.
Using the debugger does not help as this is a timing issue and the issue does not occur when using the debugger.
Also when I put some print statements into the code
while (!done)
{
String nextToken = null;
**System.out.println(finalURL);**
...
}
System.out.println("Total Number of entries returned: " + itemCount);
I get the following output:
First Run (WITHOUT SLEEP):
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXX
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXX&pagetoken=CqQCF...
Total Number of entries returned: 20
Using the network monitor I see that the response to the second WS call returns:
{
"html_attributions" : [],
"results" : [],
"status" : "INVALID_REQUEST"
}
Which is strange as when I cut and paste the WS URL into my browser, it works fine...
Second Run (WITH SLEEP):
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXXX
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXXX&pagetoken=CqQCFQEAA...
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=40.714353,-74.00597299999998&radius=200&types=food&key=XXXXXXXXX&pagetoken=CsQDtQEAA...
Total Number of entries returned: 60
Well it seems to be a google API issue as indicated here:
Paging on Google Places API returns status INVALID_REQUEST
I still could not get it to work by changing the WS URL with a random parameter as they suggested, but I will keep trying and post something here if I get it to work. For now I will just keep a 2 second delay between the calls which seems to work.
Well gave up on using the google WS for this and switched to Yelp, works very well:
https://api.yelp.com/v3/businesses/search?.....

Ghostscript api requesting "press <return> to continue"

I am using GhostScript API to test if a PDF is user password protected (You can only open the document if you have the password, what is different from owner protection that only protects the content against unauthorized copying or printing)
I am using this code in Qt:
const bool PDFTools::isUserProtected(const QString &iFilePath)
{
void *minst;
const QString filePath = iFilePath.toUtf8();
const std::string inputFile = QString( "%1" ).arg(filePath).toStdString();
int initiationCode = 0;
int executionCode = 0;
// GhostScript arguments
const int gsArgumentCount = 2;
char * gsArgumentValues[gsArgumentCount];
gsArgumentValues[0] = "-dNODISPLAY";
gsArgumentValues[1] = const_cast<char*>( inputFile.c_str() );
initiationCode = gsapi_new_instance(&minst, NULL);
if(initiationCode < 0)
return true;
initiationCode = gsapi_set_arg_encoding(minst, GS_ARG_ENCODING_UTF8);
if (initiationCode < 0)
{
gsapi_delete_instance(minst);
return true;
}
executionCode = gsapi_init_with_args(minst, gsArgumentCount, gsArgumentValues);
if(executionCode < 0)
{
gsapi_exit(minst);
gsapi_delete_instance(minst);
return true;
}
gsapi_exit(minst);
gsapi_delete_instance(minst);
return false;
}
When the document is protected it returns true because it can't open without password and Ghostscript returns a execution error, this is OK.
The problem is when the document it's not protected it opens the first page and keeps waiting for the return key to be pressed so it can pass to the next page.
>>showpage, press <return> to continue<<
You can add the "-dNOPAUSE" and "-dBATCH" to the gsArgumentValues but this will cause another problem. If I have really big PDF with big number of pages this will take to much time to pass through all pages only after that it will return false.
Anyone knows how can I exit the process when the Ghostscript prompts for return? At that stage I already know that the PDF is not protected and I can exit and return false.
I have tried with the callback function but with no success.
Thanks in advance.
Try using -dFirstPage=1 -dLastPage=1 and -dNOPAUSE -dBATCH. Then it only processes the first page.
Otherwise you'll have to intercept the press any key and interrupt the process, or modify the device to throw an error after the first page or something.
NOTE although you can use these switches with other types of input than PDF, they still have to process the entire input file because, unlike PDF, there is no easy way to find the n'th page in a PostScript or PCL file.

Verifying download box pop up using selenium

I have a website where I am using selenium for integration testing.
I have link there that is generated using multiple variables from page.
I would like to verify the download pop up box is displaying if at all possible, when i am simulating click on link to download the file.
I know i can have JsUnit that will do that for me.
Any ideas?
Selenium is lousy when working with downloads. Clicking the link will get you into trouble.
You could, however, make a request using HttpURLConnection, Apache HttpComponents (or maybe just a file get through URL) for the link specified and assert a 200 OK response. Or try to get the file - this is my favourite tool for this with Selenium.
Thnx to Slanec i have took up your examples.
Ok after investigation I have decided that best solution will be
something along this line.
public int GetFileLenghtFromUrlLocation(string location)
{
int len = 0;
int timeoutInSeconds = 5;
// paranoid check for null value
if (string.IsNullOrEmpty(location)) return 0;
// Create a web request to the URL
HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(location);
myRequest.Timeout = timeoutInSeconds * 1000;
myRequest.AddRange(1024);
try
{
// Get the web response
HttpWebResponse myResponse = (HttpWebResponse)myRequest.GetResponse();
// Make sure the response is valid
if (HttpStatusCode.OK == myResponse.StatusCode)
{
// Open the response stream
using (Stream myResponseStream = myResponse.GetResponseStream())
{
if (myResponseStream == null) return 0;
using (StreamReader rdr = new StreamReader(myResponseStream))
{
len = rdr.ReadToEnd().Length;
}
}
}
}
catch (Exception err)
{
throw new Exception("Error saving file from URL:" + err.Message, err);
}
return len;
}