Here is a schema of CDN to resize images and serve them via AWS CloudFront:
If an image is not found in the S3 bucket, it issues a 307 Temporary Redirect (instead of 404) to access Lambda via API Gateway. Lambda resizes the image (based on the original from the S3 bucket) and uploads it into the S3 bucket. The browser gets once again permanently redirected to the S3 bucket with the newly generated image.
When I want to access the same image via CloudFront, I am receiving a 403 Forbidden error. It comes either from the S3 or CloudFront. As the status indicates, this may have something to do with access rights.
Why does adding CloudFront into the working request chain cause the 403 error?
What works:
https://{bucket}.s3-website-{region}.amazonaws.com/100x100/image.jpg
HTTP/1.1 307 Temporary Redirect
x-amz-id-2: xxxx
x-amz-request-id: xxxx
Date: Sat, 19 Aug 2017 15:37:12 GMT
Location: https://{gateway}.execute-api.{region}.amazonaws.com/prod/resize?key=100x100/image.jpg
Content-Length: 0
Server: AmazonS3
https://{gateway}.execute-api.{region}.amazonaws.com/prod/resize?key=100x100/image.jpg
HTTP/1.1 301 Moved Permanently
Content-Type: application/json
Content-Length: 0
Connection: keep-alive
Date: Sat, 19 Aug 2017 15:37:16 GMT
x-amzn-RequestId: xxxx
location: http://{bucket}.s3-website-eu-west-1.amazonaws.com/100x100/image.jpg
X-Amzn-Trace-Id: xxxx
X-Cache: Miss from cloudfront
Via: 1.1 {distribution}.cloudfront.net (CloudFront)
X-Amz-Cf-Id: xxxx
http://{bucket}.s3-website-{region}.amazonaws.com/100x100/image.jpg
HTTP/1.1 200 OK
x-amz-id-2: xxxx
x-amz-request-id: xxxx
Date: Sat, 19 Aug 2017 15:37:18 GMT
Last-Modified: Sat, 19 Aug 2017 15:37:17 GMT
x-amz-version-id: null
ETag: xxxx
Content-Type: image/png
Content-Length: 20495
Server: AmazonS3
What doesn't work:
https://{distribution}.cloudfront.net/100x100/image.jpg
HTTP/1.1 403 Forbidden
Content-Type: application/xml
Transfer-Encoding: chunked
Connection: keep-alive
Date: Sat, 19 Aug 2017 15:38:24 GMT
Server: AmazonS3
X-Cache: Error from cloudfront
Via: 1.1 {distribution}.cloudfront.net (CloudFront)
X-Amz-Cf-Id: xxxx
I've added the S3 bucket as origin into CloudFront
The error was caused by using a REST endpoint (e.g. s3.amazonaws.com) for website-like functionality (redirects, html error messages, and index documents). These features are only provided by the web site endpoints (e.g. bucketname.s3-website-us-east-1.amazonaws.com).
http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteEndpoints.html
It confused me because the REST endpoint was offered via autocomplete in the console, when creating the CloudFront distribution. The correct endpoint has to be entered manually.
CloudFront also caches 40x 50x status codes coming from S3 (doc.: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/HTTPStatusCodes.html#HTTPStatusCodes-cached-errors ).
You should invalidate the Cloudfront cache for the resized img path. You can do it by calling the CreateInvalidation API from your Lambda function.
Doc:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html#invalidating-objects-api
Related
I have just created a custom error page, but for some reason I cannot set several headers on the file(s).
Currently my headers look like this:
X-Firefox-Spdy h2
accept-ranges bytes
age 432
content-length 1931
content-type text/html
date Fri, 06 Oct 2017 10:55:47 GMT
etag "6fc24050256bab8cec351de1c6c74a4f"
last-modified Fri, 06 Oct 2017 10:55:33 GMT
server AmazonS3
via 1.1 a57f85bbf89c6dasdasdasddcddasd9687e0.cloudfront.net (CloudFront)
x-amz-cf-id JZAiF7gZnnUVrorerfasusQu84gQVGwV0UU4h3mjaw4E-CKL2_Xm6zOg==
x-cache Error from cloudfront
but should really look like this:
X-Firefox-Spdy h2
age 1512
content-encoding gzip
content-type text/html
date Fri, 14 Jul 2017 06:42:03 GMT
last-modified Sat, 17 Jan 2015 17:35:49 GMT
server AmazonS3
vary Accept-Encoding
via 1.1 a57f85bbf89c6dasdasdasddcddasd9687e0.cloudfront.net (CloudFront)
x-amz-cf-id JZAiF7gZnnUVrorerfasusQu84gQVGwV0UU4h3mjaw4E-CKL2_Xm6zOg==
x-cache Error from cloudfront
There is an option in metadata to enter Content-Encoding, but when I enter gzip, I keep getting an error and page is not being displayed. In addition to this, header Accept-Encoding cannot be set and when I am trying to delete accept-ranges header, it keeps coming back again and again.
What should I do or what should not do to make it right.
Here is the documentation on how to setup S3 or custom origins to serve compressed data transfer through cloudfront,
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ServingCompressedFiles.html
One does not need to compress file and store it in S3. Cloudfront will automatically handle that for you.
Hope it helps.
I went through numerous articles and forums online on how to create a distribution group but all of them are using S3 as origin domain name.
I created a distribution group using origin domain name as rails server e.g assets.abcd.efgh.com I can access the asset if i do assets.abcd.efgh.com/assets/abcdefghti-ieajife.css but i can not access the asset using distribution domain name as 1234test.cloudfront.net/assets/abcdefghti-ieajife.css. i am getting error:
Failed to contact the origin
The result i get using curl is
curl -I -s -X GET -H "Origin: https://assets.abcd.efgh.com" 1234test.cloudfront.net/assets/abcdefghti-ieajife.css
HTTP/1.1 503 Service Unavailable
Content-Type: text/html
Content-Length: 507
Connection: keep-alive
Server: CloudFront
Date: Tue, 25 Oct 2016 16:48:17 GMT
Expires: Tue, 25 Oct 2016 16:48:17 GMT
X-Cache: Error from cloudfront
Via: 1.1 8f18deab0e501ffbd2fa94cfd46e4785.cloudfront.net (CloudFront)
X-Amz-Cf-Id: PLAjGN5UuFEEFZSRYu_fGfsMDBcjH1w7Ruy1x1fv9bWiftWak3k1QA==
can someone guide me what other settings i need to do while creating distribution group or what i am missing?
Found out that the origin needed to be updated to receive public requests. It was receiving private requests only
I have a PHP app on AWS Elastic Beanstalk, I have created an assets bucket on S3. I'm trying to setup a Cloudfront distribution with behaviors to send requests for assets/* to S3 with a default behavior to send requests to EB. The domain points to Cloudfront.
All requests are going to EB which returns a 404 since there is no assets diretory in the EB environment.
I have created 2 Cloudfront origins, one for EB and one for the S3 bucket. This is what my behaviors look like:
Precedence Path Pattern Origin Protocol Policy Fwd Query Strings
0 assets/* S3-example-bucket HTTP and HTTPS No
1 Default (*) Custom-example.us-east-1.elasticbeanstalk.com HTTP and HTTPS Yes
It seems as though this should be pretty straight forward so I assume I'm missing something basic. Any help is greatly appreciated.
Edit:
Request header:
GET /assets/images/10waysaudiobook.png HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: wordpress_logged_in_8a27500b7747be1e4fbad7f473f238e5=snickerspixy%7C1466021823%7Cr7rE5moINjanjHEqb1TGbsSkn9F7OCZLfX69IbcnGJu%7C28fc452885f3fe6e954243abab585a188f6511cdd6eeec6fa5ec5c50b9f3d393; wp-settings-7674=m4%3Do%26m5%3Do%26m9%3Do%26m6%3Do%26editor%3Dhtml%26m10%3Do%26m0%3Do%26m3%3Do%26hidetb%3D1%26m2%3Dc%26m1%3Do%26m8%3Do%26m12%3Do%26m7%3Do%26m11%3Do%26urlbutton%3Dnone%26m13%3Do%26tml1%3D1%26imgsize%3Dfull%26align%3Dcenter%26libraryContent%3Dbrowse%26ed_size%3D569%26unfold%3D1%26wplink%3D1%26mfold%3Do%26post_dfw%3Doff%26advImgDetails%3Dshow%26posts_list_mode%3Dlist; wp-settings-time-7674=1464816549; AWSELB=1FCB85F51606EBAFF15FEADB01C8069AEDE17E2A043407E615EF1A0E1ABF24607545A45D3DC206631F7AAE4503ADA423788B5E6B5B48FAE93EE916DE068509E64F92AC10FF; PHPSESSID=cpi2su7s967phu87rlpjgneel6; wordpress_test_cookie=WP+Cookie+check
Connection: keep-alive
Response header:
HTTP/1.1 404 Not Found
Cache-Control: no-cache, must-revalidate, max-age=0
Content-Type: text/html; charset=UTF-8
Date: Sun, 05 Jun 2016 00:54:23 GMT
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Link: <http://example.com/wp-json/>; rel="https://api.w.org/"
Pragma: no-cache
Server: Apache
Transfer-Encoding: chunked
Connection: keep-alive
The response headers indicate that this request wasn't served by CloudFront at all, because there are headers that should be present... but are absent.
CloudFront adds Via:, X-Cache:, and x-amz-cf-id: headers to every response, and sometimes Age: (on cache hits and errors) or Vary: if you're forwarding the CloudFront-Is-*-Viewer: headers to the origin.
The absence of these headers suggest that the DNS for the site hasn't been pointed to CloudFront and may still be pointing directly to the EB environment, or if the change was recent, that the former TTL for the DNS entry may not yet have expired.
I am switching to Amazon Cloudfront for serving images on my website. To reduce load when we finally make it live, I thought of warming up the cache by hitting image URLs (I am making these request from India and expect majority of users to request from the same region so no need to have a copy of object on all edge locations worldwide).
The problem is that script uses curl to request image and when I access the same URL in browser I get MISS from Cloudfront. So Cloudfront is making two copies of object for these two request.
My current Cloudfront configuration forwards Content-Type request Header to origin.
How should I configure Cloudfront so that it doesn't care about request headers at all and once I made a request (whether curl or using browser) it should serve all future request for same resource from edge and not origin.
Request/Response headers-
I am afraid that the Cloudfront url won't be accessible from outside (until we go live) but I am posting request/response headers, this should give you fair idea. Also you can check out caching headers at origin - https://origin.ixigo.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
Response after two successive request using browser
Remote Address:54.230.156.66:443
Request URL:https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
Request Method:GET
Status Code:200 OK
Response Headers
view source
Accept-Ranges:bytes
Age:23
Cache-Control:public, max-age=31557600
Connection:keep-alive
Content-Length:8708
Content-Type:image/jpg
Date:Fri, 27 Nov 2015 09:16:03 GMT
ETag:"-170562206"
Last-Modified:Sun, 29 Jun 2014 03:44:59 GMT
Vary:Accept-Encoding
Via:1.1 7968275877e438c758292828c0593684.cloudfront.net (CloudFront)
X-Amz-Cf-Id:fcbGLv8uBOP89qfR52OWa-NlqWkEREJPpZpy9ix0jdq8-a4oTx7lNw==
X-Backend:image6_40
X-Cache:Hit from cloudfront
X-Cache-Hits:0
X-Device:pc
X-DeviceType:pc
X-Powered-By:xyz
Now same url requested using curl but gave me miss
curl manu-mdc:cache manuc$ curl -I https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
HTTP/1.1 200 OK
Content-Type: image/jpg
Content-Length: 8708
Connection: keep-alive
Age: 0
Cache-Control: public, max-age=31557600
Date: Fri, 27 Nov 2015 09:16:47 GMT
ETag: "-170562206"
Last-Modified: Sun, 29 Jun 2014 03:44:59 GMT
X-Backend: image6_40
X-Cache-Hits: 0
X-Device: pc
X-DeviceType: pc
X-Powered-By: xyz
Vary: Accept-Encoding
X-Cache: Miss from cloudfront
Via: 1.1 4d42171c56a4c8b5c627040e6aa0938d.cloudfront.net (CloudFront)
X-Amz-Cf-Id: fY0LXhp7NlqB-I8F5-1TIMnA6bONjPD3CEp7dsyVdykP-7N2mbffvw==
Now this will give HIT
manu-mdc:cache manuc$ curl -I https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
HTTP/1.1 200 OK
Content-Type: image/jpg
Content-Length: 8708
Connection: keep-alive
Cache-Control: public, max-age=31557600
Date: Fri, 27 Nov 2015 09:16:47 GMT
ETag: "-170562206"
Last-Modified: Sun, 29 Jun 2014 03:44:59 GMT
X-Backend: image6_40
X-Cache-Hits: 0
X-Device: pc
X-DeviceType: pc
X-Powered-By: xyz
Age: 3
Vary: Accept-Encoding
X-Cache: Hit from cloudfront
Via: 1.1 6877899d48ba844a34ea4378ce336f06.cloudfront.net (CloudFront)
X-Amz-Cf-Id: qpPhbLX_5t2Xj0XZuZdjWD2w-BI80DUVyL496meQkLfSEn3ikt7hNg==
This is similar to this issue: Why are two requests with different clients from the same computer cache misses on cloudfront?
Depending on whether you provide the "Accept-Encoding: gzip" header or not, CloudFront edge server caches the object separately. Since browsers provides this header by default, and your site is likely to be accessed majorly via browser, I will suggest changing your curl call to include this header.
I was facing the same problem, after making the change in my curl call, I started to get a Hit from the browser on my first try via browser (after making a curl call).
Another thing I noticed is that CloudFront requires the full requested object to be downloaded before it will be cached. If you try to download the file partially by specifying the byte range in the curl, the intended object does not get cached, only the downloaded part gets cached as a different object. Same goes for a curl that was terminated in between. The other options I tried were wget call with spider option, but that internally does a HEAD call only and thus does not get the content cached on the edge server.
I am setting up video delivery for video files to tv set-top boxes.
I want to use Amazon Cloudfront.
The video files are requested as usual http requets that may contain a range header to request partial resources (to enable the user on the box to jump into any position within the video).
My problem is that it is working on 2 of 3 boxes, one makes problems.
The request looks like this (sample data):
GET /path/file.mp4 HTTP/1.1
User-Agent: My User Agent
Host:myhost.com
Accept:*/*
Range: bytes=100-200
So if I do a request to cloudfront using telnet I see that the response is HTTP 1.0:
joe#flimmit-joe:~$ telnet d2zf9fl0izzsf6.cloudfront.net 80
Trying 216.137.61.164...
Connected to d2zf9fl0izzsf6.cloudfront.net.
Escape character is '^]'.
GET /skin/frontend/default/flimmit/images/headerbanners/02_green.png HTTP/1.1
User-Agent: My User Agent
Host:d2zf9fl0izzsf6.cloudfront.net
Accept:*/*
Range: bytes=100-200
HTTP/1.0 206 Partial Content
Date: Sun, 12 Feb 2012 18:42:15 GMT
Server: Apache/2.2.16 (Ubuntu)
Last-Modified: Tue, 26 Jul 2011 10:37:54 GMT
ETag: "1e0b8a-2d2b-4a8f6863ac11a"
Accept-Ranges: bytes
Cache-Control: max-age=2592000
Expires: Tue, 13 Mar 2012 18:42:15 GMT
Content-Type: image/png
Age: 351213
Content-Range: bytes 100-200/11563
Content-Length: 101
X-Cache: Hit from cloudfront
X-Amz-Cf-Id: W2fzPeBSWb8_Ha_UzvIepZH-Z9xibXyRddoHslJZ3TDXyFfjwE3UMQ==,CwiKc8-JGfE77KBVTTOyE9g-OYf7P-bCJZEWGwef9Es5rzhUBYKE8A==
Via: 1.0 972e3ba2f91fd0a38ea062d0cc03be37.cloudfront.net (CloudFront)
Connection: close
q�]#��ĥM�oӘ�i��i��������Y�.��/��ib���&
���
�Ⱦ�00�>�����Y`��X���r���s�=�n�s�b���7MConnection closed by foreign host.
joe#flimmit-joe:~$
Unfortunately I have only limited access to the box for testing purposes.
However this behavior by cloud-front seems strange to me so I wanted to ask whether it is even valid.
It is absolutely "valid" to answer with Http 1.0 to an Http 1.1 request.
I'll cite the Appendix 19.6 to the RFC2068 "It is beyond the scope of a protocol specification to mandate compliance with previous versions. HTTP/1.1 was deliberately designed, however, to make supporting previous versions easy."
http://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.6
The important part is basically that the RFC does not force an Http 1.1 answer, so it's up to the server.