Unable to execute AWS API gateway url from cURL - amazon-web-services

I have created a simple API Gateway endpoint which triggers a Lambda function and returns a welcome message to the caller.
If I call the API URL from a browser, it works for both HTTP (internally it redirects to HTTPS) and HTTPS calls.
If I call it using cURL, the HTTP call is not redirected to HTTPS and I am getting the following error:
HTTP/1.1 301 Moved Permanently
Server: CloudFront
Date: Wed, 28 Feb 2018 11:27:48 GMT
Content-Type: text/html
Content-Length: 183
Connection: keep-alive
Location: https://f1234567.execute-api.ap-southeast-1.amazonaws.com/test
X-Cache: Redirect from cloudfront
Via: 1.1 werfdbdcc66044d560a313352d21.cloudfront.net (CloudFront)
X-Amz-Cf-Id: sfdgfgsdqr7WzN85pe0AD6565bEKZoT4x56gN24dfgdf
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<center>CloudFront</center>
</body>
</html>
As a next step, I want to call this API URL from my hardware device.

It is not an error. What you're seeing is the exact behaviour you wanted -- a 301 redirect. However, curl does not follow redirects (unlike browsers, for instance) unless explicitly told to do so.
Adding the -L switch to your curl call will enable following redirects.
You can also enable the verbose flag -v and observe that curl first hits the 301 before being redirected to HTTPS.
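For example, using the endpoint from the question:

curl -L -v http://f1234567.execute-api.ap-southeast-1.amazonaws.com/test

With -L, curl reads the Location header from the 301 response and automatically re-issues the request against the HTTPS URL.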

Related

How to send a valid HTTP request to a Google Apps Script Web App?

We sent an HTTP request from a C++ app (an Arduino sketch) to a Google Apps Script web app, but we got the HTTP response HTTP/1.1 302 Moved Temporarily. The URL with the HTTP request works fine from a browser.
The same code also works fine with other websites, like www.google.com. It does not work with script.google.com.
The Google Apps Script web app is published as public; anyone, even anonymous, can access it.
Here is the code we used:
client.println("GET /macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7 HTTP/1.1");
client.println("Host: script.google.com");
client.println("Connection: close");
client.println();
The response was:
HTTP/1.1 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Wed, 03 Feb 2021 09:29:02 GMT
Location: https://script.google.com/macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: frame-ancestors 'self'
X-XSS-Protection: 1; mode=block
Server: GSE
Accept-Ranges: none
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked
11e
<HTML>
<HEAD>
<TITLE>Moved Permanently</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<H1>Moved Permanently</H1>
The document has moved here.
</BODY>
</HTML>
0
disconnecting from server.
The URL is correct (http://script.google.com/macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7), but it seems the Google Apps Script web app redirects the request (to the same URL, using the HTTPS protocol).
Using the same code, we made other HTTP requests from the Arduino, and they worked fine.
For example we did:
client.println("GET /search?q=arduino HTTP/1.1");
client.println("Host: www.google.com");
client.println("Connection: close");
client.println();
And we got the response HTTP/1.1 200 OK, and the HTML body contains the search results matching the query q=arduino.
Any suggestions on how we can send a valid HTTP/HTTPS request to a Google Apps Script web app?
Thanks.
As you have noticed, the Google Apps Script app is redirecting you from HTTP to HTTPS. Some Google sites are accessible via HTTP; they don't have to redirect to HTTPS if they don't want to. In your example, http://www.google.com/search?q=arduino does redirect in a browser, to https://www.google.com/search?q=arduino&gws_rd=ssl. But your client is not sending a User-Agent header in the request, so Google knows your client is not a browser and might not be issuing the redirect in your case. In a real browser, it does.
Putting the URL http://script.google.com/macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7 into a browser does redirect to https://script.google.com/macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7. A real browser will follow that redirect automatically; a user might not even notice the difference.
But your client will have to follow the redirect manually. That means extracting the Location header from the response, closing the existing connection (to script.google.com on port 80), connecting to the specified server (script.google.com on port 443), and initiating an SSL/TLS encrypted session with the server before you can finally send the HTTP request.
SSL/TLS is complex, and HTTP has a lot of rules to it. Don't try to implement them manually. You are best off using an existing HTTP library that has HTTPS support built in. Let it handle all of these details for you.
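To illustrate what such a library buys you, here is a minimal Python sketch using the requests library (Python rather than Arduino code, purely for illustration; the URL is the one from the question):

import requests

# requests speaks TLS and follows the HTTP -> HTTPS redirect automatically.
r = requests.get(
    "http://script.google.com/macros/s/AKfycbyQnmHekk4_NNy3Bl5ILzuSRkykMWaXQ7Rtojk7fFieDUbVqNM/exec?valore=7"
)
print(r.status_code)  # 200 after the redirects are followed
print(r.url)          # the final https:// URL
print(r.history)      # the intermediate redirect responses

On Arduino, an HTTP client library appropriate to your board can play the same role.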

How to redirect to https://example.com/ instead of https://example.com:443/ on AWS ALB?

I have configured the following redirect rule to my AWS Application Load Balancer to redirect all HTTP traffic to HTTPS:
The issue is that when I now curl the domain (or visit it in a browser), I get this ugly and redundant Location header in the response (domain changed to example.com):
~ $ curl -I http://www.example.com
HTTP/1.1 301 Moved Permanently
Server: awselb/2.0
Date: Mon, 14 Sep 2020 18:28:48 GMT
Content-Type: text/html
Content-Length: 150
Connection: keep-alive
Location: https://www.example.com:443/
I know that https://www.example.com:443/ is in practice just fine, and I know that it will not be shown in the end user's browser URL field. But it will still be shown under 'Response headers' in the browser's network tab, and to me it just looks unprofessional compared to a redirect without the port, e.g.:
~ $ curl -I http://www.apple.com
HTTP/1.1 301 Moved Permanently
Server: AkamaiGHost
Content-Length: 0
Location: https://www.apple.com/
Cache-Control: max-age=0
Expires: Mon, 14 Sep 2020 18:33:23 GMT
Date: Mon, 14 Sep 2020 18:33:23 GMT
Connection: keep-alive
strict-transport-security: max-age=31536000
Set-Cookie: geo=FI; path=/; domain=.apple.com
Set-Cookie: ccl=izHQtSPVGso4jrdTGqyAkA==; path=/; domain=.apple.com
It would seem logical to just drop the port from the URL, but unfortunately it's a required field in the rule editor. The 'Switch to full URL' option doesn't really help either: even though the port can be cleared there, it reappears after saving.
Is there any way to make this work?
Edit:
My domain is managed through AWS Route 53.
UPDATE: maybe it's worth mentioning that this is nice as an exercise and to show what's possible. However, between a Lambda redirect and a pure load-balancer redirect, I'd still suggest you go with the ALB-native one. It performs better (no cold start, no extra hop) and is less error-prone (no coding required), at the cost of mere aesthetics in a network trace or the developer tools. End users would never notice (you probably browse lots of such sites every day without knowing). So if you ask me, I'd advise against going to production with something like this.
If you want something very simple as a redirect, you can just create a Target Group pointing to a Lambda function that returns a redirect. It doesn't require the overhead of managing a CloudFront distribution or of trying to make CloudFront work behind a load balancer -- that would require DNS changes as well, since you cannot simply place CloudFront behind a load balancer (not that I know of, at least).
For your case, I created a Lambda function (Python 3.8) with the following code:
def lambda_handler(event, context):
    response = {}
    # The ALB sets x-forwarded-proto; only redirect plain-HTTP requests.
    if event.get('headers', {}).get('x-forwarded-proto', '') == 'http':
        print(event)
        response["statusCode"] = 302
        # Note: this keeps the path but does not preserve any query string.
        response["headers"] = {"Location": f"https://{event['headers']['host']}{event['path']}"}
        response["body"] = ""
    return response
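For context, the relevant parts of the event an ALB passes to a Lambda target look roughly like this (trimmed to the fields used above; values are illustrative):

# Trimmed ALB target event (illustrative values):
event = {
    "httpMethod": "GET",
    "path": "/",
    "headers": {
        "host": "www.example.com",
        "x-forwarded-proto": "http",  # "https" once the client has been redirected
    },
}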
Then I created a new Target Group with that Lambda function as the backend, and configured my listener on port 80 to forward to that redirect Target Group.
With this implementation, the 302 redirect shows up as you expected, and the URL still displays in your browser's address bar without the port, if that is what you're worried about.
By default, clients assume that if the protocol is HTTPS then the port is 443, so you do not need to specify anything; when the redirect occurs, no port number is appended to the URL unless you specify a non-standard port such as 4430.
You have to specify the protocol (HTTPS) and port (443) simply because the configuration requires these fields; they will not be displayed to the user. The same applies to other AWS configurations, such as when configuring an ELB.
You could put CloudFront in front of your server and have it perform the HTTPS redirect. If you are only using the ELB for SSL termination, you could remove it entirely and have CloudFront terminate your SSL. This would save you some money, as an ELB is pretty expensive if you're only using it for one origin server.
https://aws.amazon.com/cloudfront/
You can use CloudFront and set the origin to the load balancer, then force HTTPS redirection on CloudFront with the Viewer Protocol Policy set to Redirect HTTP to HTTPS.
With this approach, all the ELB rules will still be applied.

Access Denied from Cloudfront with Secure Cookies returns no CORS headers preventing reading error information from a XHR request

I'm using CloudFront secure cookies to keep some files private. When cookie auth succeeds and the origin is hit, CloudFront returns the proper CORS headers (Access-Control-Allow-Origin) from the origin. But how do I make CloudFront return CORS headers on a 403/Access Denied? This validation happens entirely in CloudFront, before any request reaches the origin; is there a setting to enable it? I want to be able to make an XHR request to CloudFront and know why the request failed. Since CloudFront doesn't return CORS headers on a 403, most modern browsers will prevent reading any information about the response, including the status code, so it's tough to determine why the request failed.
Thanks!
As you know, CloudFront doesn't spontaneously emit CORS headers -- they need to come from the origin server -- so in order to see CORS headers in the response, the request needs to be allowed by CloudFront... but, of course, it can't be allowed, because the condition you're trying to catch is 403 Forbidden.
So, what we need in order to allow your unauthorized responses to be CORS-friendly is an additional origin that can provide us with an alternate error response, and that origin needs to be CORS-aware. The solution seems to be something we can accomplish with a little help from CloudFront Custom Error Responses and an otherwise-empty S3 bucket, created for the purpose.
Custom error responses allow you to configure CloudFront to fetch the custom error response from another origin, rather than generating it internally. As part of that process, some headers from the original request are included in the upstream fetch, and the response headers from the error document are returned.
S3 makes a handy origin, since it has configurable CORS support.
Create a new, empty bucket.
Enable CORS for the bucket, and configure CORS with the appropriate parameters (a sample configuration is shown after these steps). The default configuration may be fine for this purpose.
Create a simple file that your CloudFront distribution will be using instead of its built in response for a 403. For test purposes, that can just be a text file that says "Access denied."
Upload the file to the bucket with whatever name you like, such as 403.txt. Select the option to make the object publicly-readable. Set metadata Cache-Control: no-cache, no-store, private, must-revalidate and Content-Type: text/plain (or text/html, depending on what exactly you put in the error file).
In CloudFront, create a new Origin. For the Origin Domain Name, select the bucket from the list of buckets.
Create a new Cache Behavior, matching path /403.txt (or whatever you named the file). Whitelist the Origin, Access-Control-Request-Headers, and Access-Control-Request-Method headers for forwarding. Set Restrict Viewer Access to No, because for this one path, we don't require signed credentials. Note that this path needs to be exactly the same as the filename in the bucket (except the leading slash, which isn't shown in the bucket but should be included, here).
In CloudFront Custom Error Responses, choose Create Custom Error Response. Select error code 403, set Error Caching Minimum TTL to 0, choose Customize Error Response Yes, set Response Page Path /403.txt and set HTTP Response code to 403.
Profit!
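For reference, a minimal S3 CORS configuration along the lines of the CORS step above might look like this (shown in the JSON form newer S3 consoles accept; the classic console used an equivalent XML document):

[
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET", "HEAD"],
        "AllowedOrigins": ["*"],
        "MaxAgeSeconds": 3000
    }
]

Narrow AllowedOrigins if you need the response to echo a specific Access-Control-Allow-Origin, as discussed below.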
Test:
$ curl -v dzczcnnnnexample.cloudfront.net -H 'Origin: http://example.com'
* Rebuilt URL to: dzczcnnnnexample.cloudfront.net/
* Trying 203.0.113.208...
* Connected to dzczcnnnnexample.cloudfront.net (203.0.113.208) port 80 (#0)
> GET / HTTP/1.1
> Host: dzczcnnnnexample.cloudfront.net
> User-Agent: curl/7.47.0
> Accept: */*
> Origin: http://example.com
>
< HTTP/1.1 403 Forbidden
< Content-Type: text/plain
< Content-Length: 16
< Connection: keep-alive
< Date: Sun, 08 Apr 2018 14:01:25 GMT
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, HEAD
< Access-Control-Max-Age: 3000
< Last-Modified: Sun, 08 Apr 2018 13:29:19 GMT
< ETag: "fd9e8f7be7b65381c4acc272b6afc858"
< x-amz-server-side-encryption: AES256
< Cache-Control: private, no-cache, no-store, must-revalidate
< Accept-Ranges: bytes
< Server: AmazonS3
< Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
< X-Cache: Error from cloudfront
< Via: 1.1 1234567890a26beddeac6bfc77b2d348.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: ExAmPlEIbQBtaqExamPLEQs4VwcxhvtU1YXBi47uUzUgami0Hj0MwQ==
<
Access denied.
* Connection #0 to host dzczcnnnnexample.cloudfront.net left intact
Here, Access denied. is what I put in the text file I created. You may want to get a little more creative, after confirming that this works for you, as it does for me. The content of this new file in S3 will be returned whenever CloudFront throws a 403 error. It will also be returned whenever your origin throws a 403, because custom error responses are designed to replace all errors with a given HTTP status code.
You'll note, above, that we see Access-Control-Allow-Origin: *. This is the default behavior of S3 CORS. If you provide explicit origins in the S3 CORS config, you get a response like this...
Access-Control-Allow-Origin: http://example.com
...but for GET requests, I assume this level of specificity would not be necessary and the wildcard would suffice. Note that the scenario described here doesn't set CORS for the entire CloudFront distribution -- just for the error response.

Getting 301 Error when making HTTP GET Requests using C++ sockets

I am trying to make GET requests from a C++ program, and every time I get a 301 Moved Permanently error. I am using an API that uses sockets and cannot figure out why this error always comes up.
Here is the request that is getting made:
GET https://www.quandl.com/api/v3/datasets/EOD/AAPL.csv?sort_order=asc&auth_token=YZffVEztoepdzHNAMexz HTTP/1.1
Host: www.quandl.com
Connection: close
And here is the response to the request:
HTTP/1.1 301 Moved Permanently
Date: Sun, 12 Nov 2017 03:58:41 GMT
Content-Type: text/html
Content-Length: 182
Connection: close
Set-Cookie: __cfduid=d51b8e22f5239ed65b480d8ec37cad8251510459121; expires=Mon, 12-Nov-18 03:58:41 GMT; path=/; domain=.quandl.com; HttpOnly
Location: https://www.quandl.com/api/v3/datasets/EOD/AAPL.csv?sort_order=asc&auth_token=YZffVEztoepdzHNAMexz
Server: cloudflare-nginx
CF-RAY: 3bc6930581840ed9-EWR
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>openresty</center>
</body>
</html>
I think it may have to do with the HttpOnly part of the Set-Cookie header, but I am not 100% sure about that and don't know how to get rid of it. I think the URL after Location in the response is where the page has "moved to"; however, it is exactly the same as the one I am requesting, so I don't understand why I am getting the error.
GET https://www.quandl.com/api/v3/datasets/EOD/AAPL.csv?sort_order=asc&auth_token=YZffVEztoepdzHNAMexz HTTP/1.1
Host: www.quandl.com
Connection: close
That's not a valid request for an https:// resource. Instead, you have to create a TLS connection to the server (instead of only a TCP connection) and send the request with the path only instead of the full URL:
GET /api/v3/datasets/EOD/AAPL.csv?sort_order=asc&auth_token=YZffVEztoepdzHNAMexz HTTP/1.1
Host: www.quandl.com
Connection: close
301 isn't an error; it means the resource has changed URLs.
When you get this response code, you can issue another request to the Location URL specified in the response.
Be careful to limit the number of times you follow a redirect, because you could wind up in an infinite loop. A lot of HTTP client libraries have an option to handle this automatically.
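For illustration, a minimal Python sketch of such a capped redirect loop (the function name and the cap of 5 are arbitrary choices, not from any particular library):

import http.client
from urllib.parse import urlparse

def fetch(url, max_redirects=5):
    # GET a URL, following up to max_redirects redirects.
    for _ in range(max_redirects):
        parts = urlparse(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc)
        else:
            conn = http.client.HTTPConnection(parts.netloc)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        if resp.status in (301, 302, 303, 307, 308):
            # Assumes an absolute Location URL; hop to it and try again.
            url = resp.getheader("Location")
            conn.close()
            continue
        return resp.status, resp.read()
    raise RuntimeError("too many redirects")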

URL forbidden 403 when using a tool but fine from browser

I have some images for which I need to do an HttpRequestMethod.HEAD request in order to find out some details about them.
When I go to the image url on a browser it loads without a problem.
When I attempt to get the header info via my code or via online tools, it fails.
An example URL is http://www.adorama.com/images/large/CHHB74P.JPG
As mentioned, I have used the online tool Hurl.It to try to obtain the HEAD response, but I am getting the same 403 Forbidden message that I get in my code.
I have tried adding many different headers to the HEAD request (User-Agent, Accept, Accept-Encoding, Accept-Language, Cache-Control, Connection, Host, Pragma, Upgrade-Insecure-Requests), but none of this seems to work.
It also fails on a normal GET request via Hurl.it. Same 403 error.
If it is relevant, my code is a C# web service running on the AWS cloud (just in case the Adorama servers have something against AWS that I don't know about). To test this I have also spun up an EC2 instance (Linux box) and run curl, which also returned the 403 error. Running curl locally on my personal computer returns the binary image, which is presumably just the image data.
And just to head off the obvious suggestions: my code works successfully for many, many other websites; it is just this one where there is an issue.
Any idea what is required for me to download the image headers and not get the 403?
Same problem here.
Locally it works smoothly. Doing it from an AWS instance I get the very same problem.
I thought it was a DNS resolution problem (redirecting to a malfunctioning node). I therefore tried to specify the same IP address as resolved by my client, but that didn't fix the problem.
My guess is that Akamai (the site is served by an Akamai CDN in this case) is blocking AWS. It is somewhat understandable: customers pay for the CDN by traffic, and by abusing it people can generate huge bills.
Connecting to www.adorama.com (www.adorama.com)|104.86.164.205|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 403 Forbidden
Server: AkamaiGHost
Mime-Version: 1.0
Content-Type: text/html
Content-Length: 301
Cache-Control: max-age=604800
Date: Wed, 23 Mar 2016 09:34:20 GMT
Connection: close
2016-03-23 09:34:20 ERROR 403: Forbidden.
I tried that URL from Amazon and it didn't work for me either. wget did work from other servers that weren't on Amazon EC2, however. Here is the wget output on EC2:
wget -S http://www.adorama.com/images/large/CHHB74P.JPG
--2016-03-23 08:42:33-- http://www.adorama.com/images/large/CHHB74P.JPG
Resolving www.adorama.com... 23.40.219.79
Connecting to www.adorama.com|23.40.219.79|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.0 403 Forbidden
Server: AkamaiGHost
Mime-Version: 1.0
Content-Type: text/html
Content-Length: 299
Cache-Control: max-age=604800
Date: Wed, 23 Mar 2016 08:42:33 GMT
Connection: close
2016-03-23 08:42:33 ERROR 403: Forbidden.
But from another Linux host it did work. Here is the output:
wget -S http://www.adorama.com/images/large/CHHB74P.JPG
--2016-03-23 08:43:11-- http://www.adorama.com/images/large/CHHB74P.JPG
Resolving www.adorama.com... 23.45.139.71
Connecting to www.adorama.com|23.45.139.71|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.0 200 OK
Content-Type: image/jpeg
Last-Modified: Wed, 23 Mar 2016 08:41:57 GMT
Server: Microsoft-IIS/8.5
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
ServerID: C01
Content-Length: 15131
Cache-Control: private, max-age=604800
Date: Wed, 23 Mar 2016 08:43:11 GMT
Connection: keep-alive
Set-Cookie: 1YDT=CT; expires=Wed, 20-Apr-2016 08:43:11 GMT; path=/; domain=.adorama.com
P3P: CP="NON DSP ADM DEV PSD OUR IND STP PHY PRE NAV UNI"
Length: 15131 (15K) [image/jpeg]
Saving to: “CHHB74P.JPG”
100%[=====================================>] 15,131 --.-K/s in 0s
2016-03-23 08:43:11 (460 MB/s) - “CHHB74P.JPG” saved [15131/15131]
I would guess that the image provider is deliberately blocking requests from EC2 address ranges.
The reason the outgoing wget IP address differs between the two examples is DNS resolution by the CDN provider that Adorama is using.
Web servers may check particular fingerprint attributes to prevent automated bots. Here are a few of the things they can check:
GeoIP / IP address
Browser headers
User agents
Plugin info
Browser fonts
You can simulate the browser headers and learn some fingerprinting "attributes" here: https://panopticlick.eff.org
You can try to replicate how a browser behaves and inject similar headers and a similar user agent. Plain curl/wget are not likely to satisfy those conditions; even tools like PhantomJS occasionally get blocked. There is a reason some people prefer tools like Selenium WebDriver that launch an actual browser.
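If you want to experiment with that from code, here is a minimal Python sketch (the header values are illustrative, and as noted above, header spoofing may still not help if the CDN blocks EC2 IP ranges outright):

import requests

# Hypothetical browser-like headers; adjust to match what a real browser sends.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0",
    "Accept": "image/webp,image/*,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}

# HEAD request for the image URL from the question.
r = requests.head("http://www.adorama.com/images/large/CHHB74P.JPG", headers=headers)
print(r.status_code)
print(r.headers.get("Content-Type"), r.headers.get("Content-Length"))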
I found that another URL also protected by AkamaiGHost was being blocked due to certain parts of the user agent. In particular, a user agent containing a link with a protocol was blocked:
Using curl -H 'User-Agent: some-user-agent' https://some.website I found the following results for different user agents:
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0: okay
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php): 403
https ://bar: okay
https://bar: 403
All I could find for now is this (downvoted) answer https://stackoverflow.com/a/48137940/230422 stating that colons (:) are not allowed in header values. That is clearly not the only thing happening here, as the Mozilla example also contains a colon (in rv:70.0), only not as part of a link.
I guess that at least most web servers don't care and do allow Facebook's bot and other bots that carry a contact URL in their user agent. But apparently AkamaiGHost does block it.