HttpQueryInfo to get File Size - c++

Why does this function work on a direct URL to a download but fail on a PHP page that echoes out a file for download? (GetLastError returns 0.)

Not all HTTP responses include a Content-Length header. Dynamic pages generated by PHP scripts might not know in advance how large the content will be.
In these cases you just need to read a little bit at a time until the server returns no more data.
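A minimal sketch of that read-until-empty loop, written in Python for brevity. The function name count_body_bytes and the 8 KiB chunk size are illustrative assumptions; with WinINet, the same pattern would presumably mean calling InternetReadFile repeatedly until it reports zero bytes read.

```python
def count_body_bytes(response, chunk_size=8192):
    """Read a response body piece by piece and return its total size in bytes.

    `response` is any object with a read(n) method that returns b"" at the
    end of the body (hypothetical parameter name, chosen for this sketch).
    """
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # an empty read means the server has no more data
            break
        total += len(chunk)
    return total
```

With the standard library, the object returned by urllib.request.urlopen(url) could be passed in directly; its headers.get("Content-Length") returns None when the server omits the header.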

Related

How does one determine the filetype on an AWS S3 hosted file without the extension?

As an example, I'm currently uploading items directly to an S3 bucket using a form. While I was testing, I didn't specify any expected filenames or extensions.
I uploaded a .png which produced this direct link:
https://s3-us-west-2.amazonaws.com/easyhighlighting2/2015-07-271438019663927upload94788
When I place this inside an img tag, it displays on a web page properly.
My question is, without an extension, how would my browser know what type of file it's loading? Inside the bucket, the file's metadata isn't even filled out.
Is there any way to get that file extension, programmatically?
I'm ready to try any clientside methods available; my server-side language is ColdFusion which is somewhat limiting, but I'm open to suggestions for that as well.
Okay, so after some more extensive digging, I found a method of retrieving the file's type that was only added in CF10, which would explain the lack of documentation.
The answer lies in the FileGetMimeType function.
<cfset someVar = "https://s3-us-west-2.amazonaws.com/easyhighlighting2/2015-07-271438019663927upload94788">
<cfset FileType = FileGetMimeType(someVar)>
<cfoutput>#FileType#</cfoutput>
This code would output image/png - which is correct and has worked for every filetype I have tested thus far.
I'm surprised this kind of question hasn't popped up before, but this appears to be the best answer, at least for users of CFML.
Edit:
ColdFusion accomplishes this either by reading the contents of the file or by trusting its extension. The function takes an optional 'strict' argument: if true, it reads the file's contents; if false, it uses the provided extension.
True is the default.
Link:
https://wikidocs.adobe.com/wiki/display/coldfusionen/FileGetMimeType
Check the Content-Type HTTP response header returned by Amazon S3.
For example, curl -I https://s3.amazonaws.com/path/to/file fetches only the headers.

How to retrieve codepage from cURL HTTP response?

I'm using libcurl as an HTTP client to retrieve various pages (it can be any URL, for that matter).
Usually the data comes as a UTF-8 string, so I just call "MultiByteToWideChar" and it works well.
However, some web pages still use a code-page encoding, and I see gibberish if I try to convert those pages the same way.
Is there an easy way to retrieve the code page from the data, or will I have to scan it manually (for "encoding=") and then translate it accordingly?
If so, how do I get the code-page ID from the name (Code Page Identifiers)?
Thanks,
Omer
There are several locations where a document can state its encoding:
* the Content-Type HTTP header
* the (optional) XML declaration
* the Content-Type meta tag inside the document header
* for HTML5 documents, the charset meta tag
There are probably even more that I've forgotten.
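As a rough illustration of checking those locations in turn, here is a small Python sketch. The function name declared_encoding, the two regular expressions, and the 2048-byte scan window are assumptions made for this example; it only reports what the document claims about itself and does not detect the encoding from the bytes.

```python
import re
from email.message import Message

# Assumed patterns for this sketch; real pages can be messier than this.
XML_DECL = re.compile(rb'\s*<\?xml[^>]*encoding\s*=\s*["\']([\w.:-]+)', re.IGNORECASE)
META_CHARSET = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)


def declared_encoding(content_type, body):
    """Return the lowercased encoding name a response declares, or None."""
    # 1. charset parameter of the Content-Type HTTP header
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            return charset
    head = body[:2048]  # declarations are expected near the top of the document
    # 2. XML declaration, then 3./4. either form of the meta tag
    match = XML_DECL.match(head) or META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii").lower()
    return None
```

The returned name can then be handed to bytes.decode, which raises LookupError for names Python does not know. Mapping the name to a Windows code-page ID is a separate step that this sketch does not cover.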
In the end, detecting the actual encoding is rather hard. You really shouldn't do this yourself; use a high-level library for retrieving and parsing HTML content instead. I'm sure they are available even for C++, even if they have to be borrowed from a browser environment. :)
I used DetectInputCodepage from the IMultiLanguage2 interface and it worked great!

How to Check if a website contains Flash

I've created a web browser using MFC, and I'm using IHhmlReader to read the contents of the HTML once the user enters a URL in the browser and the page is completely loaded. Now I want to check whether the web page has any Flash in it.
Any help would be highly appreciated.
Thank You.
I think this is a bit difficult to do, just reading from the HTML source, unless you try to instantiate the page and see if it's making a call to the Flash object. I have listed some options you can try, but you'll need to make sure that the code element is not commented out and check include files and iframes to see if Flash is called from there.
* Look for the OBJECT and EMBED tags (see http://kb2.adobe.com/cps/127/tn_12701.html)
* In the page's JavaScript, look for a SWFObject() call
* Look for references to a .swf file (could even be in an img tag)
Good luck...
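The checks listed above could be sketched roughly like this in Python. The function name flash_hints and the pattern labels are made up for this example, and a hit is only a hint: object and embed tags are also used for non-Flash content, and this version does not follow include files or iframes.

```python
import re

# Hypothetical labels mapped to the three checks from the list above.
FLASH_PATTERNS = {
    "object_or_embed_tag": re.compile(r"<\s*(object|embed)\b", re.IGNORECASE),
    "swfobject_call": re.compile(r"\bswfobject\b", re.IGNORECASE),
    "swf_reference": re.compile(r"\.swf\b", re.IGNORECASE),
}
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def flash_hints(html):
    """Return the sorted labels of all Flash indicators found in the markup."""
    # Drop commented-out markup first so disabled code is not counted.
    visible = HTML_COMMENT.sub("", html)
    return sorted(
        label for label, pattern in FLASH_PATTERNS.items() if pattern.search(visible)
    )
```

An empty result means none of the three indicators appeared in the page source itself.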

How to get input from web?

I am trying to find out how to get input from HTML inputs using C++. In Windows you can send WM_GETTEXT to a window and it returns the text you wanted. Is there any way to do the same thing with a web interface?
I am not interested in sniffing packets for now.
For example, some site has an HTML input that expects a name. I type a name into the input, and then I want to capture it with my program.
If I understood correctly what you want to do, you have to set up a web server that calls your C++ application via CGI. So, you'll have an HTML page (static or generated by your program) that will contain a form, that refers to the URL of your application. So, when the user will click Submit, the browser will issue a request to the webserver, which in turn will call your application, passing to it the various POST/GET parameters related to the form.
Your application then can process the data, extracting such parameters from the environment variables (if the data is passed using the GET method) or from the standard input (if the POST method is used). To generate the output page (along with the output HTTP header) you'll simply have to write it to the standard output.
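To make the CGI flow concrete, here is a minimal Python sketch of the parameter handling described above. The names read_form and render and the "name" field are hypothetical, and the sketch assumes the form is submitted as application/x-www-form-urlencoded; a C++ CGI program would read the same environment variables and standard input.

```python
import html
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the submitted form fields as a dict of first values.

    `environ` is the CGI environment (os.environ in a real script) and
    `stdin` is a binary stream (sys.stdin.buffer in a real script).
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST data arrives on standard input; CONTENT_LENGTH gives its size.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("utf-8")
    else:
        # GET data arrives in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return {key: values[0] for key, values in parse_qs(raw).items()}


def render(form):
    """Build the HTTP header plus a tiny HTML body for standard output."""
    name = html.escape(form.get("name", ""))
    return "Content-Type: text/html\r\n\r\n<p>Hello, " + name + "</p>"
```

In an actual CGI script the last line would be something like sys.stdout.write(render(read_form(os.environ, sys.stdin.buffer))).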
One thing I can think of (if you're using Linux) is using wget via system() from within your C++ app.
Wget to fetch the html page and output it to a file, parse the file for the URL of the form and data that it needs, pass the response as POST / GET via wget and so on.
That is, if I understood what you meant by "do it from existing page" correctly.

Django return large file

I am trying to find the best (most efficient) way to return large files from Django to an HTTP client:
* receive an HTTP GET request
* read a large file from disk
* return the content of that file
I don't want to read the whole file and then return it with HttpResponse, because the file content is first stored in RAM, if I am correct. How can I do this efficiently?
Laurent
Look into mod_xsendfile on Apache (or the equivalents for nginx, etc.) if you'd like to use Django for authentication. Otherwise, there's no need to hit Django at all; just serve the file straight from Apache.
There is a ticket that aims to deal with this problem here: http://code.djangoproject.com/ticket/2131
It adds an HttpResponseSendFile class that uses sendfile() to send the file, which transparently sends the file as it's read.
However, the standard HttpResponse is implemented as an iterator, so if you pass it a file-like object, it will follow its iteration semantics, so presumably you could create a file-like object wrapper that chunks the file in small enough pieces before sending them out.
I believe the semantics of iterating over a standard file object in python is that it reads line-by-line, which most likely won't solve your problem if you're dealing with binary files.
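A minimal sketch of such a chunking wrapper, written as a plain generator so that binary files are read in fixed-size blocks instead of line by line. The name file_chunks and the 64 KiB default are assumptions for this example.

```python
def file_chunks(path, chunk_size=64 * 1024):
    """Yield the file at `path` in blocks of at most `chunk_size` bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # empty read: end of file reached
                break
            yield chunk
```

In Django versions that provide StreamingHttpResponse, the generator could be passed to it directly, e.g. StreamingHttpResponse(file_chunks(path)); treat that as an assumption about your Django version rather than part of the answers above.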
Of course, you could always put the static files in another location and serve them with a normal web server, unless you require intricate control (like access control that needs knowledge of the Django database).
My preference for all of this is to integrate Django with your HTTP server so that when you want to serve static files, you simply refer them to a path that will never reach Django. The strategy will look something like this:
* Configure the HTTP server so that some requests go to Django and some go to a static document root.
* Link to static documents from any web pages that obviously need them (e.g. CSS, JavaScript, etc.).
* For any non-obvious return of a static document, use an HttpResponseRedirect("/web-path/to/doc").
If you need to include the static document inside a dynamic document (maybe a page-viewer wrapping a large text or binary file), then return a wrapper page that populates a div with an ajax call to your static document.