Get the MIME content from an HTTP reply file - C++

I have successfully made a request and received a response which begins with:
HTTP/1.1 200 OK
Date: Tue, 30 Jul 2013 02:11:13 GMT
RallyRequestID: qs-app-061t71gj9w60m7graiwwkcihbo3.qs-app-061391284
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: image/jpeg;charset=UTF-8
Content-Disposition: attachment; filename=iteration_burndown.jpg
Set-Cookie: XSESSIONID=o44i037e364313anetibup7ux;Path=/analytics;HttpOnly
Vary: Accept-Encoding
P3P: CP="NON DSP COR CURa PSAa PSDa OUR NOR BUS PUR COM NAV STA"
Cache-Control: no-cache,no-store,max-age=0,must-revalidate
Transfer-Encoding: chunked
The rest is the JPEG binary data.
Is there an existing library in Python or C++ that, given the length of the whole file and the length of the header, can extract the JPEG?
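The response above carries Transfer-Encoding: chunked, so the bytes after the headers are probably not the raw JPEG but a sequence of size-prefixed chunks. The following is a minimal hand-rolled sketch, not a library: body_offset and decode_chunked are hypothetical helper names, and it assumes the whole reply has already been read into a std::string.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Offset of the first body byte (just past the blank line that ends the
// headers), or std::string::npos if the header block is incomplete.
std::size_t body_offset(const std::string& raw) {
    const std::size_t pos = raw.find("\r\n\r\n");
    return pos == std::string::npos ? std::string::npos : pos + 4;
}

// Reassemble a body sent with "Transfer-Encoding: chunked".
// Each chunk is: <hex size>[;extensions]\r\n<data>\r\n, ended by a 0-size chunk.
std::string decode_chunked(const std::string& body) {
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t eol = body.find("\r\n", pos);
        if (eol == std::string::npos)
            throw std::runtime_error("truncated chunk size line");
        // stoul stops at the first non-hex character, so chunk extensions
        // after ';' are ignored.
        const std::size_t size =
            std::stoul(body.substr(pos, eol - pos), nullptr, 16);
        pos = eol + 2;
        if (size == 0) break;  // last chunk; any trailers are ignored
        if (pos + size > body.size())
            throw std::runtime_error("truncated chunk data");
        out.append(body, pos, size);
        pos += size + 2;  // skip the data and its trailing CRLF
    }
    return out;
}
```

With the reply held in a string named raw, the image bytes would be decode_chunked(raw.substr(body_offset(raw))), which can then be written to a file opened in binary mode.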

Related

Getting GSSException: Defective token detected error while calling HDFS API on a kerberised cluster

I have a kerberised CDH v5.14 cluster with 3 nodes. I am trying to call the HDFS API using Python as below:
baseurl = "http://<host_name>:50070/webhdfs/v1/prod/?op=LISTSTATUS"
__, krb_context = kerberos.authGSSClientInit("HTTP/<host_name>")
#kerberos.authGSSClientStep(krb_context, "")
negotiate_details = kerberos.authGSSClientResponse(krb_context)
headers = {"Authorization": "Negotiate " + str(negotiate_details)}
r = requests.get(baseurl, headers=headers)
print r.status_code
The below error is returned
GSSException: Defective token detected (Mechanism level: GSSHeader did not find the right tag)
HTTP ERROR 403
But the same works fine when I run it using curl
curl -i --negotiate -u: http://<host_name>:50070/webhdfs/v1/prod/?op=LISTSTATUS
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1409
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Wed, 30 May 2018 02:50:04 GMT
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Expires: Wed, 30 May 2018 02:50:04 GMT
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate YGYGCSqGSIb3EgECAgIAb1cwVaADAgEFoQMCAQ+iSTBHoAMCAReiQAQ+6Seu0SSYGmoqN4hdykSQ55ZcP+juBO/jk8/BGjoK5NCmdlBRFPMSbCZXvVjNHLg9iPACGvM8V0jqXTM5UfQ=
Set-Cookie: hadoop.auth="u=XXXX&p=XXXX#HOSTNAME&t=kerberos&e=1527684604664&s=tVsrEsDMBGV0To8hOPp8mLxyiSo="; Path=/; HttpOnly
Transfer-Encoding: chunked
and it gives the correct response. What am I missing here? Any help is appreciated.

Can't adjust buffer to fit data

I am trying to make an HTTP request with the EtherCard library, then get the full response. Using the code from the examples, I'm only able to capture the headers, which are then abruptly cut off. The issue seems to be that I can't make the buffer big enough to store the data, which is why it's cut off. But the data is only 292 bytes.
Here is another question I asked trying to understand what the example code was doing: What is happening in this C/Arduino code?
Here is the data I'm trying to GET: http://jsonplaceholder.typicode.com/posts/1
String response;
byte Ethernet::buffer[800]; // if i raise this to 1000, response will be blank
static void response_handler (byte status, word off, word len) {
Serial.println("Response:");
Ethernet::buffer[off + 400] = 0; // if i raise 400 much higher, response will be blank
response = String((char*) Ethernet::buffer + off);
Serial.println(response);
}
See the comments above for what I've attempted.
Here is the output from the code above:
Response:
HTTP/1.1 404 Not Found
Date: Fri, 20 Jan 2017 12:15:19 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2
Connection: close
Set-Cookie: __cfduid=d9714bd94284b999ceb0e87bc91705d501484914519; expires=Sat, 20-Jan-18 12:15:19 GMT; path=/; domain=.typicode.com; HttpOnly
X-Powered-By: Express
Vary: Origin, Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: no
As you can see, it's not the complete data, only some of the headers.
There are several problems here:
1) You get an HTTP 404 response, which means the resource was not found on the server. So you need to check your request.
2) You are cutting off the string at pos 400:
Ethernet::buffer[off + 400] = 0; // if i raise 400 much higher, response will be blank
That's why it stops after Cache-Control: no, which is exactly 400 bytes (byte 0-399).
You probably want Ethernet::buffer[off + len] = 0;, but you also need to check if that is not out of bounds (i.e. larger than your buffer size - that's probably why you get a 'blank' response).
For example, a 404 response from that server looks like this:
HTTP/1.1 404 Not Found
Date: Mon, 23 Jan 2017 07:00:00 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2
Connection: keep-alive
x-powered-by: Express
Vary: Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: no-cache
Pragma: no-cache
Expires: -1
x-content-type-options: nosniff
Etag: W/"2-mZFLkyvTelC5g8XnyQrpOw"
Via: 1.1 vegur
CF-Cache-Status: MISS
Server: cloudflare-nginx
CF-RAY: 32595301c275445d-xxx
{}
and the 200 response headers (from a browser):
HTTP/1.1 200 OK
Date: Mon, 23 Jan 2017 07:00:00 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
x-powered-by: Express
Vary: Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: public, max-age=14400
Pragma: no-cache
Expires: Mon, 23 Jan 2017 10:59:01 GMT
x-content-type-options: nosniff
Etag: W/"124-yv65LoT2uMHrpn06wNpAcQ"
Via: 1.1 vegur
CF-Cache-Status: HIT
Server: cloudflare-nginx
CF-RAY: 32595c4ff39b445d-xxx
Content-Encoding: gzip
So your buffer needs to be big enough to hold both the response headers and the data.
3) In the 200 response we see two things: the transfer is chunked and gzipped (but the latter only happens when there is an Accept-Encoding: gzip header in the request).
The easiest way to handle this is to send an HTTP/1.0 request instead of HTTP/1.1: chunked transfer encoding does not exist in HTTP/1.0, and the body should not be gzipped as long as the request carries no Accept-Encoding: gzip header.
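For illustration, this is roughly what such a request looks like on the wire. build_http10_get is a hypothetical helper, not part of EtherCard, and how the request text is handed to the network library is left open; the point is only the HTTP/1.0 version token and the absence of an Accept-Encoding header.

```cpp
#include <string>

// Build a minimal HTTP/1.0 GET request. HTTP/1.0 has no chunked transfer
// encoding, and with no Accept-Encoding header the server has no reason
// to compress the body.
std::string build_http10_get(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}
```

For the URL in the question this would be build_http10_get("jsonplaceholder.typicode.com", "/posts/1").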

Response headers returned by Google's NaCL are incomplete

The response headers returned by Google's NaCl are incomplete. I can see in the Network tab of Chrome developer tools that the browser actually received all the headers, but NaCl returns incomplete headers to my C++ code.
Headers I see in the Network tab of Chrome:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 3312
Connection: keep-alive
Date: Wed, 14 Sep 2016 02:12:37 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Last-Modified: Wed, 14 Sep 2016 02:12:37 GMT
ETag: "644c53b76e0fd646aad2f4aaf313067d"
x-amz-version-id: rMFIbU5ptnyKgwn1_YRNr5OkEkTT8cEV
Accept-Ranges: bytes
Server: AmazonS3
Vary: Origin
X-Cache: Miss from cloudfront
Via: 1.1 a75bbd1dd9f3f983d073b0972494851d.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 41xr_GlfJLkZ_SszRBmy62jCVjQgCl9sgyHtusJjc95Tb91BISksyg==
Headers I receive in C++ code from NaCl:
Content-Type: application/octet-stream
Last-Modified: Wed, 14 Sep 2016 02:12:37 GMT
This is how I access and print the headers received from NaCl:
//Following is inside callback method passed to pp::URLLoader::Open method as second argument.
auto response = _pLoader->GetResponseInfo();
auto headers = response.GetHeaders().AsString();
std::cout<< headers << std::endl;
Question: How can I get all the headers and print them?

Qt- QNetworkReply not showing Content-Length header

For some download URLs the QNetworkReply object does not contain the Content-Length header and reports the file size as -1. I tested with the following URL:
http://download-cf.jetbrains.com/webide/PhpStorm-EAP-141.332.tar.gz
The headers shown by Live HTTP Headers in Firefox are as follows:
HTTP/1.1 200 OK
Content-Type: application/x-tar
Content-Length: 135144452
Connection: keep-alive
Date: Mon, 30 Mar 2015 17:49:03 GMT
Content-Encoding: gzip
x-amz-meta-s3cmd-attrs: uid:572/gname:cds/uname:cds/gid:574/mode:33188/mtime:1427282503/atime:1427282968/md5:a2ccadce9ae0f356e9c11a6d5dd5a4f0/ctime:1427282503
Last-Modified: Wed, 25 Mar 2015 11:36:03 GMT
Etag: "db9a27ca51b84bac23080028b3e267ef-9"
Accept-Ranges: bytes
Server: AmazonS3
Age: 313
X-Cache: Miss from cloudfront
Via: 1.1 f94856caaa8ad33df4ddf975899fadd2.cloudfront.net (CloudFront)
X-Amz-Cf-Id: GFsaZTTMQ5eQ54JOUzBfJmIHL6AolKkXknb2HAcfbCKsbIYgdJng_Q==
And when I do the following:
qDebug()<<reply->rawHeaderList();
The output is:
("Content-Type", "Connection", "Date", "Content-Encoding",
"x-amz-meta-s3cmd-attrs", "Last-Modified",
"ETag", "Accept-Ranges", "Server", "Age", "X-Cache",
"Via", "X-Amz-Cf-Id")
Clearly, Content-Length is missing. Is there any solution for this?
I have logged a bug report for this. It can be tracked at the following URL:
https://bugreports.qt.io/browse/QTBUG-45322

Writing proper "HEAD" and "GET" requests in Winsock C++

I am writing code to download files over HTTP using Winsock in C++, and to get the file details I sent a "HEAD" request.
(This is what I actually sent.)
HEAD /files/ODBC%20Programming%20in%20C%2B%2B.pdf HTTP/1.0
Host: devmentor-unittest.googlecode.com
Response was:
HTTP/1.0 404 Not Found
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=feeed8106df5e5f1:TM=1370157208:LM=1370157208:S=10bN4nrXqkcCDN5n; expires=Tue, 02-Jun-2015 07:13:28 GMT; path=/; domain=devmentor-unittest.googlecode.com
X-Content-Type-Options: nosniff
Date: Sun, 02 Jun 2013 07:13:28 GMT
Server: codesite_downloads
Content-Length: 974
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
But if I do:
GET /files/ODBC%20Programming%20in%20C%2B%2B.pdf HTTP/1.0
Host: devmentor-unittest.googlecode.com
The file successfully gets downloaded.
After the download, if I send the HEAD request again, it returns the following:
HTTP/1.0 200 OK
Content-Length: 320381
Content-Type: application/pdf
Content-Disposition: attachment; filename="ODBC Programming in C++.pdf"
Accept-Ranges: bytes
Date: Sun, 02 Jun 2013 05:47:11 GMT
Last-Modified: Sun, 11 Nov 2007 03:17:59 GMT
Expires: Sun, 09 Jun 2013 05:47:11 GMT
Cache-Control: public, max-age=604800
Server: DFE/largefile
//something like this.....
Question: why does "HEAD" return a false "404 Not Found" at first, although the file downloads fine with "GET", and why does "HEAD" return the details I need only after the download? Where have I gone wrong?
The file I am trying to download here is "http://devmentor-unittest.googlecode.com/files/ODBC%20Programming%20in%20C%2B%2B.pdf" (just for example)
The problem is not on your end. Google Code simply does not implement HEAD correctly. This was reported 5 years ago and is still an open issue:
Issue 660: support HTTP HEAD method for file download urls
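Since HEAD cannot be relied on for these URLs, one possible workaround is to send the GET and read the status code and Content-Length from the response headers before consuming the body. The following is a minimal sketch under that assumption; status_code and header_value are hypothetical helpers that operate on the header block already received from the socket.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Status code from a status line such as "HTTP/1.0 200 OK"; -1 if malformed.
int status_code(const std::string& head) {
    const std::size_t sp = head.find(' ');
    if (sp == std::string::npos || sp + 4 > head.size()) return -1;
    int code = 0;
    for (std::size_t i = sp + 1; i < sp + 4; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(head[i]))) return -1;
        code = code * 10 + (head[i] - '0');
    }
    return code;
}

// Case-insensitive lookup of one header; returns "" if it is absent.
std::string header_value(const std::string& head, std::string name) {
    auto lower = [](std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    };
    name = lower(name) + ":";
    std::size_t pos = head.find("\r\n");  // end of the status line
    while (pos != std::string::npos) {
        pos += 2;  // start of the next header line
        const std::size_t eol = head.find("\r\n", pos);
        const std::string line = head.substr(
            pos, eol == std::string::npos ? std::string::npos : eol - pos);
        if (line.empty()) break;  // blank line: end of headers
        if (lower(line.substr(0, name.size())) == name) {
            std::size_t v = name.size();
            while (v < line.size() && (line[v] == ' ' || line[v] == '\t')) ++v;
            return line.substr(v);
        }
        pos = eol;
    }
    return "";
}
```

If status_code returns 200, header_value(head, "Content-Length") gives the file size as text, and the connection can be closed early if only the details were wanted.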