cURL connection closes on HTTP status 200 (OK) - C++

I'm trying to connect to a server and get the response with curl, but when the data starts transferring, curl drops the connection. Can anyone tell me what's going wrong?
< HTTP/1.1 200 OK
* Server nginx is not blacklisted
< Server: nginx
< Date: Sat, 15 Feb 2014 02:08:27 GMT
< Content-Type: application/json; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Status: 200 OK
< X-UA-Compatible: IE=Edge,chrome=1
< Cache-Control: max-age=0, private, must-revalidate
< ETag: "e283c3fc75aa2172d77a717ecbb49b41"
< Set-Cookie: remember_token=BAhbB2kCyVsiRTUxMmI2MzQwOTk5MGU2ZDU3NmMxMmRjY2UxZTI0ODgzNzJmOGRlNzFlNzZjMWExZWM3NTA3OWJjY2E0NmQ5Mjc%3D--1e016eb43ab4803ac65211ddd383ddeeaf9b53f2; path=/; expires=Wed, 15-Feb-2034 02:08:26 GMT
< Set-Cookie: _gameserver2250_session=BAh7BiIPc2Vzc2lvbl9pZCIlYzIxNGNlNTExNjYwNDg0MTU0YWJkOGQ2ZWI1ZDI3ZTk%3D--f9c25e0cd8d94283b01056e9aeff46a1feb48d6b; path=/; HttpOnly
< X-Runtime: 1.249006
< Content-Encoding: gzip
<
* Recv failure: Connection reset by peer
* Closing connection 0
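For reference, a minimal libcurl sketch in C++ of the kind of request that produces a trace like the one above (the URL is a placeholder; CURLOPT_ACCEPT_ENCODING is set so libcurl decompresses the gzip-encoded body shown in the headers):

#include <curl/curl.h>
#include <iostream>
#include <string>

// Collect the response body into a std::string.
static size_t write_cb(char *data, size_t size, size_t nmemb, void *userp) {
    static_cast<std::string *>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;
    std::string body;
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/endpoint"); // placeholder
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);          // same -v style trace as above
    curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, ""); // accept and decode gzip
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        std::cerr << "curl: " << curl_easy_strerror(res) << "\n";
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}

A Recv failure: Connection reset by peer at that point means the server (or something in between) tore down the TCP connection mid-body, so it is worth checking whether the plain curl command line against the same URL shows the same reset.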

Related

download large file from Jetty (ambari webhdfs) is slow

I have a file of about 5G; downloading it from HDFS with a Python client runs at 12M/s, but my network can reach 500M/s, and smaller files work fine. Then I reproduced the problem with curl.
Here is curl debug log:
curl -v -X GET http://x.x.x.x/file
> GET /webhdfs/v1/user/sohuvideo/online/srcFile/188/718/188718791/dat1_188718791_2020_4_11_17_4_172647e6e60.mp4?op=OPEN&user.name=sohuvideo&namenoderpcaddress=sotocyon&offset=0 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: x.x.x.com:50075
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Expires: Tue, 21 Apr 2020 03:01:26 GMT
< Date: Tue, 21 Apr 2020 03:01:26 GMT
< Pragma: no-cache
< Expires: Tue, 21 Apr 2020 03:01:26 GMT
< Date: Tue, 21 Apr 2020 03:01:26 GMT
< Pragma: no-cache
< Content-Type: application/octet-stream
< Access-Control-Allow-Methods: GET
< Access-Control-Allow-Origin: *
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26)
<
{ [data not shown]
100 119M 0 119M 0 0 13.0M 0 --:--:-- 0:00:09 --:--:-- 12.1M^C
After some digging, I found that attaching the header Connection: close to the request makes it much faster.
curl -v -H "Connection: close" -X GET http://x.x.x.x/file
> GET /webhdfs/v1/user/sohuvideo/online/srcFile/188/718/188718791/dat1_188718791_2020_4_11_17_4_172647e6e60.mp4?op=OPEN&user.name=sohuvideo&namenoderpcaddress=sotocyon&offset=0 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: x.x.x.com:50075
> Accept: */*
> Connection: close
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Expires: Tue, 21 Apr 2020 03:00:13 GMT
< Date: Tue, 21 Apr 2020 03:00:13 GMT
< Pragma: no-cache
< Expires: Tue, 21 Apr 2020 03:00:13 GMT
< Date: Tue, 21 Apr 2020 03:00:13 GMT
< Pragma: no-cache
< Content-Type: application/octet-stream
< Access-Control-Allow-Methods: GET
< Access-Control-Allow-Origin: *
< Connection: close
< Server: Jetty(6.1.26)
<
{ [data not shown]
100 4517M 0 4517M 0 0 138M 0 --:--:-- 0:00:32 --:--:-- 153M
* Closing connection 0
I think this is probably caused by the server using Transfer-Encoding: chunked when the file is large: the server chooses it because the file size has not yet been decided when the transfer starts, and a chunked stream can add a lot of overhead. If Connection: close is given, the server does not need Transfer-Encoding: chunked to mark the end of the stream; it just closes the connection instead.
Is there any way to fix this from the server side?
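Not an answer to the server-side question, but for completeness, the client-side workaround above can be expressed with libcurl as well; a minimal sketch, assuming the same placeholder URL as in the curl command:

#include <curl/curl.h>

int main() {
    CURL *curl = curl_easy_init();
    if (!curl) return 1;
    // Ask the server to close the connection after the response, which (per the
    // observation above) lets it skip chunked encoding and stream until EOF.
    struct curl_slist *headers = curl_slist_append(nullptr, "Connection: close");
    curl_easy_setopt(curl, CURLOPT_URL, "http://x.x.x.x/file"); // placeholder from the question
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    // Note: CURLOPT_FORBID_REUSE only closes the connection locally after the
    // transfer; it does not send the Connection: close request header.
    curl_easy_perform(curl);
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return 0;
}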

Getting GSSException: Defective token detected error while calling HDFS API on a kerberised cluster

I have a kerberised CDH v5.14 cluster with 3 nodes. I am trying to call the HDFS API using Python as below:
import kerberos
import requests

baseurl = "http://<host_name>:50070/webhdfs/v1/prod/?op=LISTSTATUS"
__, krb_context = kerberos.authGSSClientInit("HTTP/<host_name>")
#kerberos.authGSSClientStep(krb_context, "")
negotiate_details = kerberos.authGSSClientResponse(krb_context)
headers = {"Authorization": "Negotiate " + str(negotiate_details)}
r = requests.get(baseurl, headers=headers)
print r.status_code
The error below is returned:
GSSException: Defective
token detected (Mechanism level: GSSHeader did not find the right tag)
HTTP ERROR 403
But the same works fine when I run it using curl:
curl -i --negotiate -u: http://<host_name>:50070/webhdfs/v1/prod/?op=LISTSTATUS
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1409

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Wed, 30 May 2018 02:50:04 GMT
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Expires: Wed, 30 May 2018 02:50:04 GMT
Date: Wed, 30 May 2018 02:50:04 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate YGYGCSqGSIb3EgECAgIAb1cwVaADAgEFoQMCAQ+iSTBHoAMCAReiQAQ+6Seu0SSYGmoqN4hdykSQ55ZcP+juBO/jk8/BGjoK5NCmdlBRFPMSbCZXvVjNHLg9iPACGvM8V0jqXTM5UfQ=
Set-Cookie: hadoop.auth="u=XXXX&p=XXXX#HOSTNAME&t=kerberos&e=1527684604664&s=tVsrEsDMBGV0To8hOPp8mLxyiSo="; Path=/; HttpOnly
Transfer-Encoding: chunked
and it gives the correct response. What am I missing here? Any help is appreciated.

Can't adjust buffer to fit data

I am trying to make an HTTP request with the EtherCard library, then get the full response. Using the code from the examples, I'm only able to capture the headers, which are then abruptly cut off. The issue seems to be that I can't make the buffer big enough to store the data, which is why it's cut off. But the data is only 292 bytes.
Here is another question I asked trying to understand what the example code was doing: What is happening in this C/Arduino code?
Here is the data I'm trying to GET: http://jsonplaceholder.typicode.com/posts/1
String response;
byte Ethernet::buffer[800]; // if i raise this to 1000, response will be blank

static void response_handler (byte status, word off, word len) {
  Serial.println("Response:");
  Ethernet::buffer[off + 400] = 0; // if i raise 400 much higher, response will be blank
  response = String((char*) Ethernet::buffer + off);
  Serial.println(response);
}
See the comments above for what I've attempted.
Here is the output from the code above:
Response:
HTTP/1.1 404 Not Found
Date: Fri, 20 Jan 2017 12:15:19 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2
Connection: close
Set-Cookie: __cfduid=d9714bd94284b999ceb0e87bc91705d501484914519; expires=Sat, 20-Jan-18 12:15:19 GMT; path=/; domain=.typicode.com; HttpOnly
X-Powered-By: Express
Vary: Origin, Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: no
As you can see, it's not the complete data, only some of the headers.
There are several problems here:
1) You get an HTTP 404 response, which means the resource was not found on the server. So you need to check your request.
2) You are cutting off the string at position 400:
Ethernet::buffer[off + 400] = 0; // if i raise 400 much higher, response will be blank
That's why it stops after Cache-Control: no, which is exactly 400 bytes (bytes 0-399).
You probably want Ethernet::buffer[off + len] = 0;, but you also need to check that this is not out of bounds (i.e. larger than your buffer size; that's probably why you get a 'blank' response). A bounds-checked version is sketched below.
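A minimal sketch of such a bounds check, reusing the handler and the 800-byte buffer declared in the question:

static void response_handler (byte status, word off, word len) {
  // Clamp the terminator so it never writes past the end of the buffer.
  word end = off + len;
  if (end >= sizeof(Ethernet::buffer))
    end = sizeof(Ethernet::buffer) - 1;
  Ethernet::buffer[end] = 0; // terminate at the real end of the received data
  Serial.println((char*) Ethernet::buffer + off);
}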
For example, a 404 response from that server looks like this:
HTTP/1.1 404 Not Found
Date: Mon, 23 Jan 2017 07:00:00 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2
Connection: keep-alive
x-powered-by: Express
Vary: Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: no-cache
Pragma: no-cache
Expires: -1
x-content-type-options: nosniff
Etag: W/"2-mZFLkyvTelC5g8XnyQrpOw"
Via: 1.1 vegur
CF-Cache-Status: MISS
Server: cloudflare-nginx
CF-RAY: 32595301c275445d-xxx
{}
and the 200 response headers (from a browser):
HTTP/1.1 200 OK
Date: Mon, 23 Jan 2017 07:00:00 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
x-powered-by: Express
Vary: Accept-Encoding
Access-Control-Allow-Credentials: true
Cache-Control: public, max-age=14400
Pragma: no-cache
Expires: Mon, 23 Jan 2017 10:59:01 GMT
x-content-type-options: nosniff
Etag: W/"124-yv65LoT2uMHrpn06wNpAcQ"
Via: 1.1 vegur
CF-Cache-Status: HIT
Server: cloudflare-nginx
CF-RAY: 32595c4ff39b445d-xxx
Content-Encoding: gzip
So your buffer needs to be big enough to hold both the response headers and the data.
3) In the 200 response we see two things: the transfer is chunked and gzipped (the latter only happens when there is an Accept-Encoding: gzip header in the request).
The easiest way to handle this is to send an HTTP/1.0 request instead of HTTP/1.1 (chunked transfer and gzip are not allowed/available in HTTP/1.0).
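For illustration, here is what such a request could look like as a C++ string constant (host and path are taken from the URL in the question; whether your HTTP library lets you send a raw request like this is an assumption):

// Hypothetical raw request; the HTTP/1.0 response comes back unchunked and
// uncompressed, and the server closes the connection at the end of the body.
const char request[] =
    "GET /posts/1 HTTP/1.0\r\n"
    "Host: jsonplaceholder.typicode.com\r\n"
    "\r\n";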

Django: Strange HTTP status codes for non-existing URLs

I have a Django website with translations activated (django.middleware.locale.LocaleMiddleware).
If someone requests a non-existing page:
https://example.com/nonexisting
Django then responds with:
HTTP/1.1 302 FOUND
Date: Fri, 02 Sep 2016 09:15:45 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Cookie,Host
Location: https://example.com/de/nonexisting
Content-Type: text/html; charset=utf-8

HTTP/1.1 301 MOVED PERMANENTLY
Date: Fri, 02 Sep 2016 09:15:45 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Cookie,Host
X-Frame-Options: SAMEORIGIN
Content-Language: de
Set-Cookie: django_language=de; expires=Sat, 02-Sep-2017 09:15:45 GMT; Max-Age=31536000; Path=/
Location: https://example.com/de/nonexisting/
Content-Type: text/html; charset=utf-8

HTTP/1.1 404 NOT FOUND
Date: Fri, 02 Sep 2016 09:15:45 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Cookie,Host
X-Frame-Options: SAMEORIGIN
Content-Language: de
Set-Cookie: django_language=de; expires=Sat, 02-Sep-2017 09:15:45 GMT; Max-Age=31536000; Path=/
Content-Type: text/html; charset=utf-8
The user receives, in this order:
302, 301, 404
How can I make the user get the 404 directly?

What is the HeadBucket operation in Amazon S3

I have been looking at the usage reports for Amazon's S3 service and noticed that there is a DataTransfer-out-bytes charge for GetObject operations (OK, I understand this one) and also a DataTransfer-out-bytes charge for HeadBucket operations.
What is HeadBucket, and when is this request made?
Cheers
That's a HEAD request to a bucket:
HEAD /my-s3-bucket
which will basically just tell you whether the bucket exists (200 OK) or not (404 Not Found).
For Example:
# curl -v -X HEAD http://s3.amazonaws.com/fooXXXX
* About to connect() to s3.amazonaws.com port 80 (#0)
* Trying 72.21.211.144... connected
* Connected to s3.amazonaws.com (72.21.211.144) port 80 (#0)
> HEAD /fooXXXX HTTP/1.1
> User-Agent: curl/7.18.2 (i486-pc-linux-gnu) libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.10
> Host: s3.amazonaws.com
> Accept: */*
>
< HTTP/1.1 404 Not Found
< x-amz-request-id: A21BF750F080A267
< x-amz-id-2: SPQ7yX6Ln0Zgp0YULT/64ag9077nNnN25jH8PMLGMm/SbXPZ+FF3qFuiOyBfiktP
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Thu, 23 Apr 2009 13:39:50 GMT
< Server: AmazonS3
Vs.
# curl -v -X HEAD http://s3.amazonaws.com/s3hub
* About to connect() to s3.amazonaws.com port 80 (#0)
* Trying 72.21.207.135... connected
* Connected to s3.amazonaws.com (72.21.207.135) port 80 (#0)
> HEAD /s3hub HTTP/1.1
> User-Agent: curl/7.18.2 (i486-pc-linux-gnu) libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.10
> Host: s3.amazonaws.com
> Accept: */*
>
< HTTP/1.1 200 OK
< x-amz-id-2: E6OvrEMD35HpJjlBg0kB90H/uaQDX8qk0oXb+baOtDKIoMXmNwgIRSX2rDE5Urlb
< x-amz-request-id: DAAAA11524A4A557
< Date: Thu, 23 Apr 2009 13:43:01 GMT
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Server: AmazonS3
<
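For completeness, the same check from C++ with libcurl; a sketch using CURLOPT_NOBODY (libcurl's equivalent of curl --head) and the bucket name from the example above:

#include <curl/curl.h>
#include <cstdio>

int main() {
    CURL *curl = curl_easy_init();
    if (!curl) return 1;
    curl_easy_setopt(curl, CURLOPT_URL, "http://s3.amazonaws.com/fooXXXX");
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L); // HEAD: status and headers only, no body
    long code = 0;
    if (curl_easy_perform(curl) == CURLE_OK) {
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &code);
        std::printf("%ld\n", code); // per the answer above: 200 = exists, 404 = not
    }
    curl_easy_cleanup(curl);
    return 0;
}

Note that on the command line, -I/--head is the proper way to send a HEAD request; -X HEAD as used in the logs above only rewrites the method string, so curl may keep waiting for a body that never arrives.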