Cloudfront not sending If-None-Match to my origin - amazon-web-services

I'm trying to configure Cloudfront to send an If-None-Match to my custom origin when a resource expires, so that I can respond with a 304 if nothing has changed. For some reason, I'm unable to get Cloudfront to do so.
My origin responds with these headers:
HTTP/2 200
content-type: application/json; charset=utf-8
content-length: 181691
access-control-allow-origin: *
cache-control: max-age=5
date: Fri, 17 Feb 2023 21:16:49 GMT
x-content-type-options: nosniff
x-frame-options: DENY
etag: W/"15-mbAPvGdFm9PuCZHJFTtrwm#3"
vary: Accept-Encoding
So, sending cache-control of 5 seconds and a weak e-tag.
My cloudfront cache policy has min ttl of 1, forwards headers Origin and a few x- ones, forwards all query strings. No cookies. Compression is turned on.
My origin request policy is "AllViewer".
The request is traveling to Cloudfront, which goes through a classic AWS load balancer, which hits a kubernetes pod that handles and sends the response.
For some reason, Cloudfront never sends an If-None-Match header to my origin when the resource expires. If I manually specify an If-None-Match header in a curl request to Cloudfront, my origin does see it and responds correctly. So no intermediate hop is removing the If-None-Match header; it must be that Cloudfront is not sending it in the first place.
Any ideas what could be wrong? I've been poring over the documentation but have not found anything that worked.
Thanks!
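For reference, a minimal sketch of the check the origin would run once a conditional request does arrive. It assumes the weak comparison that HTTP specifies for If-None-Match (the W/ prefix is ignored and only the quoted value is compared). The function names are hypothetical and not taken from any framework mentioned above.

```python
import re

# Matches one entity-tag, e.g. "abc" or W/"abc"; group 2 is the opaque value.
_ETAG_RE = re.compile(r'(W/)?"([^"]*)"')


def opaque_tags(header_value):
    """Return the set of opaque tag values found in an ETag-style header."""
    return {match.group(2) for match in _ETAG_RE.finditer(header_value)}


def is_not_modified(if_none_match, current_etag):
    """True when the origin may answer 304 instead of resending the body."""
    if not if_none_match:
        return False  # unconditional request: send the full 200 response
    if if_none_match.strip() == "*":
        return True  # "*" matches any current representation
    # Weak comparison: equal opaque values match, with or without W/.
    return bool(opaque_tags(current_etag) & opaque_tags(if_none_match))


etag = 'W/"15-mbAPvGdFm9PuCZHJFTtrwm#3"'  # the ETag from the response above
print(is_not_modified(etag, etag))  # True
print(is_not_modified(None, etag))  # False
```

Logging the result of a check like this at the origin makes it easy to see whether a revalidation request arrived without the header or arrived with a value that did not match.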

Related

CloudFront is removing Access-Control-* headers from my Origin

I have a CloudFront distribution with a custom origin to an APIGateway that forwards calls to a Lambda which is my API code. I have a separate CloudFront distribution for my static single-page website. My website is not working because it is getting CORS errors when calling my API on a separate subdomain. It is my Lambda that is currently responsible for sending back CORS headers.
Looking into it, it seems CloudFront is removing CORS headers from the responses from the APIGateway and I cannot figure out how to get it to allow the headers. I can make the same call directly to my APIGateway and I get the correct response headers.
Request:
OPTIONS https://api.mywebsite.com/some/endpoint
User-Agent: ...snip...
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Access-Control-Request-Method: GET
Access-Control-Request-Headers: authorization
Referer: https://www.mywebsite.com/
Origin: https://www.mywebsite.com
Connection: keep-alive
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
APIGateway Response:
200 OK
Date: Fri, 27 Jan 2023 03:47:55 GMT
Content-Type: application/json
Content-Length: 0
Connection: keep-alive
x-amzn-RequestId: ...snip...
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: https://www.mywebsite.com
Access-Control-Allow-Headers: authorization
X-Frame-Options: DENY
x-amz-apigw-id: ...snip...
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Expires: 0
X-Content-Type-Options: nosniff
Access-Control-Allow-Methods: GET
Pragma: no-cache
Access-Control-Max-Age: 3600
CloudFront Response:
200 OK
Content-Type: application/json
Content-Length: 0
Connection: keep-alive
Date: Fri, 27 Jan 2023 03:51:58 GMT
x-amzn-RequestId: ...snip...
X-XSS-Protection: 1; mode=block
Accept-Patch:
Access-Control-Allow-Origin: https://www.cicerone.development.loesoft.com
Allow: GET,HEAD,OPTIONS
X-Frame-Options: DENY
x-amz-apigw-id: ...snip...
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Expires: 0
X-Content-Type-Options: nosniff
Pragma: no-cache
X-Cache: Miss from cloudfront
Via: 1.1 ...snip....cloudfront.net (CloudFront)
X-Amz-Cf-Pop: DFW56-P2
X-Amz-Cf-Id: ...snip...
The browser is rejecting the desired GET call because the preflight call didn't return the expected information. I suspect this is because one or more of the Access-Control-* headers are missing.
I've tried configuring CloudFront a few different ways with no success. In the original configuration, the default (and only) behavior had a Cache policy and no assigned Origin Request policy or Response Headers policy. I tried adding the managed "All Viewer" Origin Request policy, which should send all inbound request headers to my APIGateway, in case any headers were being removed. This made no difference. I then added a Response Headers policy that set generic values for the various CORS headers and made sure the "override origin" flag was off so that the "Access-Control-*" headers coming from my origin would be used. This also did not solve the issue. I've tried various other configurations for all the policies but I'm not having much luck.
Additionally, if I have my UI bypass CloudFront and access my API directly, the API calls from the browser work w/o issue.
Is there a way to configure CloudFront to solve my CORS issue or even just to not filter any headers coming from my origin?
Thank you in advance.
The issue turned out to have two parts. First, without an assigned Origin Request policy, CF was stripping many of the CORS request headers before sending the request to the origin, so my backend Lambda never generated the appropriate CORS response headers. Next, adding the AllViewer Origin Request policy resulted in all responses returning 403 without ever reaching my backend Lambda. It appears that this policy causes the Host header to be sent with the downstream request, and APIGateway was rejecting the call.
I ended up creating my own Origin Request policy that included all the viewer headers except the Host header and then my downstream Lambda started getting the headers and returning the correct response headers that were then being echoed back by CF.
I did not need a Response Headers policy in place for this to work.
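A minimal sketch of that kind of policy, written as plain data so it runs without AWS access. The dictionary follows the shape boto3's CloudFront `create_origin_request_policy` call accepts as `OriginRequestPolicyConfig`. The policy name, the example header list, and the cookie and query string settings are assumptions for illustration, not the configuration actually used above.

```python
def build_origin_request_policy_config(name, viewer_headers):
    """Build a policy config that forwards the given headers except Host."""
    # Drop Host (case-insensitively) so the origin sees its own host name.
    forwarded = [h for h in viewer_headers if h.lower() != "host"]
    return {
        "Name": name,
        "HeadersConfig": {
            "HeaderBehavior": "whitelist",
            "Headers": {"Quantity": len(forwarded), "Items": forwarded},
        },
        # Assumed values: the post does not say how cookies and query
        # strings were handled.
        "CookiesConfig": {"CookieBehavior": "none"},
        "QueryStringsConfig": {"QueryStringBehavior": "all"},
    }


config = build_origin_request_policy_config(
    "viewer-headers-without-host",  # hypothetical policy name
    ["Host", "Origin", "Access-Control-Request-Method",
     "Access-Control-Request-Headers", "Referer"],
)
print(config["HeadersConfig"]["Headers"]["Items"])
```

AWS also publishes a managed origin request policy named AllViewerExceptHostHeader, which may make a custom policy unnecessary for this case.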

CloudFront CORS request using signed cookies and withCredentials, not sending back Access-Control-Allow-Credentials unless I include some extra header

I'm having a very strange issue that I can't seem to crack. I configured a private CloudFront distribution to serve content from a private S3 bucket. I am using signed cookies to grant access to the files. I am also making cross origin requests from a browser for the files, so I need to allow credentials to send the cookie. I configured a custom response headers policy to do this (I set it to set Access-Control-Allow-Credentials to true, explicitly set Access-Control-Allow-Origin to my intended domain, and set Access-Control-Allow-Methods / Access-Control-Max-Age appropriately, and it is set to origin override), and I also set up a custom cache policy to cache based on origin and access-control headers.
This cURL command does not give the correct response:
curl -v -H "origin: https://my-subdomain.my-domain.com" -H "cookie: CloudFront-Key-Pair-Id=MyKeyPairID; CloudFront-Policy=Base64EncodedPolicy; CloudFront-Signature=SignedPolicy" https://my-other-subdomain.my-domain.com/key/to/my/private/file.txt
it yields the following:
< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< Content-Length: 576
< Connection: keep-alive
< Date: Fri, 18 Feb 2022 18:34:09 GMT
< Last-Modified: Thu, 16 Dec 2021 14:45:12 GMT
< ETag: "a50884915242f9876bea4bb633963191"
< Accept-Ranges: bytes
< Server: AmazonS3
< Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
< Access-Control-Allow-Origin: https://my-subdomain.my-domain.com
< Vary: Origin
< X-Cache: Hit from cloudfront
< Via: 1.1 redacted.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: EWR50-C1
< X-Amz-Cf-Id: JxMbPWHeQr0a9AAlf9PI5ksF6xGKVWL1LvpEC9XEoR_PVuVgiJ5zGA==
< Age: 626
Notice the missing Access-Control-Allow-Credentials header.
However, this command yields the correct response:
curl -v -H "X-some-header: nonsense" -H "origin: https://my-subdomain.my-domain.com" -H "cookie: CloudFront-Key-Pair-Id=MyKeyPairID; CloudFront-Policy=Base64EncodedPolicy; CloudFront-Signature=SignedPolicy" https://my-other-subdomain.my-domain.com/key/to/my/private/file.txt
returns:
< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< Content-Length: 576
< Connection: keep-alive
< Date: Fri, 18 Feb 2022 18:34:09 GMT
< Last-Modified: Thu, 16 Dec 2021 14:45:12 GMT
< ETag: "a50884915242f9876bea4bb633963191"
< Accept-Ranges: bytes
< Server: AmazonS3
< Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: https://my-subdomain.my-domain.com
< Vary: Origin
< X-Cache: Hit from cloudfront
< Via: 1.1 redacted.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: EWR50-C1
< X-Amz-Cf-Id: pUSouCDwLH5Zu6-NBUZqKrb5kY407GLqXXtH4EK2-Th0Z9zZNb54ag==
< Age: 693
this time, with the correct Access-Control-Allow-Credentials header. I have no idea what I might have misconfigured to cause this or why this could be happening. Any insights would be greatly appreciated, any configuration or test output needed, just let me know.
Thank you
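When comparing two long `curl -v` dumps like the ones above, a small diff of the header names can make the discrepancy obvious. This is only a sketch: the function names are hypothetical and the sample strings are shortened copies of the two responses.

```python
def response_headers(curl_verbose_output):
    """Collect lower-cased response header names from `curl -v` text."""
    names = set()
    for line in curl_verbose_output.splitlines():
        # Response lines start with "< "; the status line has no colon.
        if not line.startswith("< ") or ":" not in line:
            continue
        name, _, _ = line[2:].partition(":")
        names.add(name.strip().lower())
    return names


def missing_headers(broken_output, working_output):
    """Header names present in the working response but not the broken one."""
    return sorted(response_headers(working_output) - response_headers(broken_output))


broken = """< HTTP/1.1 200 OK
< Server: AmazonS3
< Access-Control-Allow-Origin: https://my-subdomain.my-domain.com
< Vary: Origin
"""
working = broken + "< Access-Control-Allow-Credentials: true\n"

print(missing_headers(broken, working))  # ['access-control-allow-credentials']
```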
EDIT:
After some trial and error, I've determined the origin override setting on the Response Header Policy is causing the problem. When that is set to true, it will not send the Access-Control-Allow-Credentials header unless you send some extraneous header with your request. This is an issue as it also causes unwanted preflight requests in the browser.
Turning that setting off and then configuring my S3 Bucket's CORS to look like the below fixed it:
[
{
"AllowedHeaders": [
"*"
],
"AllowedMethods": [
"GET",
"HEAD"
],
"AllowedOrigins": [
"https://*",
"http://*"
],
"ExposeHeaders": [
"ETag"
]
}
]
However, I'm still curious if I was misunderstanding the origin override setting and there's a way to do that correctly, or if this is a bug of some kind in CloudFront
EDIT 2:
Origin Request Policy: AWS Managed CORS-S3Origin (I tried with this and with no policy, same result)
Cache Policy: Custom Policy to cache based on Origin and access control headers, also tried with standard managed CacheOptimized policy, and the NoCache policy to make sure I wasn't having some issue with non credentialed requests getting stuck in the cache. Also tried invalidating the cache manually and seeing if hits or misses made a difference, they do not.
Response Headers Policy: Custom to allow credentials, this is the original configuration. I eventually set origin override to false, and things started working once I reconfigured my S3 CORS policy to set the headers. I have a random value under Access-Control-Allow-Headers because I was not allowed to leave that field blank for whatever reason. The random header sent does not have to match the header set here for the credentials header to be returned, but it does have to match for the browser's preflight check to pass. I also did some fiddling with the expose headers setting; nothing helped.
Further note, once I had S3 setting the CORS headers correctly, I was able to remove the Response headers policy entirely, but I did have to keep the custom cache policy or else different origins could get the wrong headers. This is also less than ideal as I will have users accessing these files from different origins, and I believe that if the response headers policy worked correctly, it would be setting the headers after it's pulled from the cache rather than caching the headers (but I may be wrong on that). Seems my only other option is some CF function running on the responses, but that incurs additional cost and overhead, while a functioning response headers policy would be free and more efficient.
But what's very strange is that even if S3 is setting the CORS headers correctly, if I use the Response Headers Policy with origin override true, it still breaks the response without the random header attached.

Access Denied from Cloudfront with Secure Cookies returns no CORS headers preventing reading error information from a XHR request

I'm using cloudfront secure cookies to keep some files private. When cookie auth succeeds and the origin is hit, cloudfront returns the proper CORS headers (Access-Control-Allow-Origin) from the origin, but how do I make cloudfront return CORS headers during a 403/Access Denied? This validation happens entirely in cloudfront before the request to the origin, but is there a setting to enable it? I want to be able to make an XHR request to cloudfront and know why the request failed. Since cloudfront doesn't return CORS headers on a 403, most modern browsers will prevent reading any information on the request, including the status code, and it's tough to determine why the request failed.
Thanks!
As you know, CloudFront doesn't spontaneously emit CORS headers -- they need to come from the origin server -- so in order to see CORS headers in the response, the request needs to be allowed by CloudFront... but, of course, it can't be allowed, because the condition you're trying to catch is 403 Forbidden.
So, what we need in order to allow your unauthorized responses to be CORS-friendly is an additional origin that can provide us with an alternate error response, and that origin needs to be CORS-aware. The solution seems to be something we can accomplish with a little help from CloudFront Custom Error Responses and an otherwise-empty S3 bucket, created for the purpose.
Custom error responses allow you to configure CloudFront to fetch the custom error response from another origin, rather than generating it internally. As part of that process, some headers from the original request are included in the upstream fetch, and the response headers from the error document are returned.
S3 makes a handy origin, since it has configurable CORS support.
1. Create a new, empty bucket.
2. Enable CORS for the bucket, and configure CORS with the appropriate parameters. The default configuration may be fine for this purpose.
3. Create a simple file that your CloudFront distribution will be using instead of its built in response for a 403. For test purposes, that can just be a text file that says "Access denied."
4. Upload the file to the bucket with whatever name you like, such as 403.txt. Select the option to make the object publicly-readable. Set metadata Cache-Control: no-cache, no-store, private, must-revalidate and Content-Type: text/plain (or text/html, depending on what exactly you put in the error file).
5. In CloudFront, create a new Origin. For the Origin Domain Name, select the bucket from the list of buckets.
6. Create a new Cache Behavior, matching path /403.txt (or whatever you named the file). Whitelist the Origin, Access-Control-Request-Headers, and Access-Control-Request-Method headers for forwarding. Set Restrict Viewer Access to No, because for this one path, we don't require signed credentials. Note that this path needs to be exactly the same as the filename in the bucket (except the leading slash, which isn't shown in the bucket but should be included, here).
7. In CloudFront Custom Error Responses, choose Create Custom Error Response. Select error code 403, set Error Caching Minimum TTL to 0, choose Customize Error Response Yes, set Response Page Path /403.txt and set HTTP Response code to 403.
8. Profit!
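For anyone scripting this instead of using the console, the custom error response from the steps above might be expressed roughly as follows. This is a sketch of the `CustomErrorResponses` fragment of a CloudFront distribution config as plain data. The helper name is hypothetical, and the bucket, origin, and cache behavior from the other steps are not covered here.

```python
def build_custom_error_responses(page_path="/403.txt"):
    """Return a CustomErrorResponses fragment that serves page_path on 403."""
    return {
        "Quantity": 1,
        "Items": [
            {
                "ErrorCode": 403,              # the error to intercept
                "ResponsePagePath": page_path, # must match the cache behavior path
                "ResponseCode": "403",         # keep reporting 403 to the viewer
                "ErrorCachingMinTTL": 0,       # do not cache the error
            }
        ],
    }


fragment = build_custom_error_responses()
print(fragment["Items"][0]["ResponsePagePath"])  # /403.txt
```

The fragment would be merged into the full distribution config before an update; it does nothing on its own.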
Test:
$ curl -v dzczcnnnnexample.cloudfront.net -H 'Origin: http://example.com'
* Rebuilt URL to: dzczcnnnnexample.cloudfront.net/
* Trying 203.0.113.208...
* Connected to dzczcnnnnexample.cloudfront.net (203.0.113.208) port 80 (#0)
> GET / HTTP/1.1
> Host: dzczcnnnnexample.cloudfront.net
> User-Agent: curl/7.47.0
> Accept: */*
> Origin: http://example.com
>
< HTTP/1.1 403 Forbidden
< Content-Type: text/plain
< Content-Length: 16
< Connection: keep-alive
< Date: Sun, 08 Apr 2018 14:01:25 GMT
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, HEAD
< Access-Control-Max-Age: 3000
< Last-Modified: Sun, 08 Apr 2018 13:29:19 GMT
< ETag: "fd9e8f7be7b65381c4acc272b6afc858"
< x-amz-server-side-encryption: AES256
< Cache-Control: private, no-cache, no-store, must-revalidate
< Accept-Ranges: bytes
< Server: AmazonS3
< Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
< X-Cache: Error from cloudfront
< Via: 1.1 1234567890a26beddeac6bfc77b2d348.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: ExAmPlEIbQBtaqExamPLEQs4VwcxhvtU1YXBi47uUzUgami0Hj0MwQ==
<
Access denied.
* Connection #0 to host dzczcnnnnexample.cloudfront.net left intact
Here, Access denied. is what I put in the text file I created. You may want to get a little more creative, after confirming that this works for you, as it does for me. The content of this new file in S3 will always be returned whenever CloudFront throws a 403 error. Additionally, it will also be returned whenever your origin throws a 403, because custom error responses are designed to replace all errors with a given HTTP status code.
You note, above, that we see Access-Control-Allow-Origin: *. This is the default behavior of S3 CORS. If you provide explicit origins in the S3 CORS config, you get a response like this...
Access-Control-Allow-Origin: http://example.com
...but for GET requests, I assume this level of specificity would not be necessary and the wildcard would suffice. The scenario described here isn't setting CORS for the entire CloudFront distribution -- just for the error response.

AWS Cloudfront behaviors not working as expected

I have a PHP app on AWS Elastic Beanstalk, I have created an assets bucket on S3. I'm trying to setup a Cloudfront distribution with behaviors to send requests for assets/* to S3 with a default behavior to send requests to EB. The domain points to Cloudfront.
All requests are going to EB, which returns a 404 since there is no assets directory in the EB environment.
I have created 2 Cloudfront origins, one for EB and one for the S3 bucket. This is what my behaviors look like:
Precedence | Path Pattern | Origin | Protocol Policy | Fwd Query Strings
0 | assets/* | S3-example-bucket | HTTP and HTTPS | No
1 | Default (*) | Custom-example.us-east-1.elasticbeanstalk.com | HTTP and HTTPS | Yes
It seems as though this should be pretty straightforward, so I assume I'm missing something basic. Any help is greatly appreciated.
Edit:
Request header:
GET /assets/images/10waysaudiobook.png HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: wordpress_logged_in_8a27500b7747be1e4fbad7f473f238e5=snickerspixy%7C1466021823%7Cr7rE5moINjanjHEqb1TGbsSkn9F7OCZLfX69IbcnGJu%7C28fc452885f3fe6e954243abab585a188f6511cdd6eeec6fa5ec5c50b9f3d393; wp-settings-7674=m4%3Do%26m5%3Do%26m9%3Do%26m6%3Do%26editor%3Dhtml%26m10%3Do%26m0%3Do%26m3%3Do%26hidetb%3D1%26m2%3Dc%26m1%3Do%26m8%3Do%26m12%3Do%26m7%3Do%26m11%3Do%26urlbutton%3Dnone%26m13%3Do%26tml1%3D1%26imgsize%3Dfull%26align%3Dcenter%26libraryContent%3Dbrowse%26ed_size%3D569%26unfold%3D1%26wplink%3D1%26mfold%3Do%26post_dfw%3Doff%26advImgDetails%3Dshow%26posts_list_mode%3Dlist; wp-settings-time-7674=1464816549; AWSELB=1FCB85F51606EBAFF15FEADB01C8069AEDE17E2A043407E615EF1A0E1ABF24607545A45D3DC206631F7AAE4503ADA423788B5E6B5B48FAE93EE916DE068509E64F92AC10FF; PHPSESSID=cpi2su7s967phu87rlpjgneel6; wordpress_test_cookie=WP+Cookie+check
Connection: keep-alive
Response header:
HTTP/1.1 404 Not Found
Cache-Control: no-cache, must-revalidate, max-age=0
Content-Type: text/html; charset=UTF-8
Date: Sun, 05 Jun 2016 00:54:23 GMT
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Link: <http://example.com/wp-json/>; rel="https://api.w.org/"
Pragma: no-cache
Server: Apache
Transfer-Encoding: chunked
Connection: keep-alive
The response headers indicate that this request wasn't served by CloudFront at all, because there are headers that should be present... but are absent.
CloudFront adds Via:, X-Cache:, and x-amz-cf-id: headers to every response, and sometimes Age: (on cache hits and errors) or Vary: if you're forwarding the CloudFront-Is-*-Viewer: headers to the origin.
The absence of these headers suggests that the DNS for the site hasn't been pointed to CloudFront and may still be pointing directly to the EB environment, or if the change was recent, that the former TTL for the DNS entry may not yet have expired.
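A quick way to apply that check to a pasted header block is sketched below. The helper names are hypothetical, and the list of expected headers is just the three named above.

```python
# Headers that, per the explanation above, CloudFront adds to every response.
CLOUDFRONT_HEADERS = ("via", "x-cache", "x-amz-cf-id")


def parse_header_block(text):
    """Parse 'Name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # lines without a colon (e.g. the status line) are skipped
            headers[name.strip().lower()] = value.strip()
    return headers


def missing_cloudfront_headers(headers):
    """Return the expected CloudFront headers that are absent."""
    return [name for name in CLOUDFRONT_HEADERS if name not in headers]


response = """HTTP/1.1 404 Not Found
Cache-Control: no-cache, must-revalidate, max-age=0
Content-Type: text/html; charset=UTF-8
Server: Apache
Connection: keep-alive"""

print(missing_cloudfront_headers(parse_header_block(response)))
# ['via', 'x-cache', 'x-amz-cf-id']
```

An empty list would mean the response carries all three headers and did pass through CloudFront.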

MISS from Cloudfront after HIT from Cloudfront

I am switching to Amazon Cloudfront for serving images on my website. To reduce load when we finally make it live, I thought of warming up the cache by hitting image URLs (I am making these requests from India and expect the majority of users to request from the same region, so there is no need to have a copy of each object on all edge locations worldwide).
The problem is that the script uses curl to request each image, and when I access the same URL in a browser I get a MISS from Cloudfront. So Cloudfront is keeping two copies of the object for these two requests.
My current Cloudfront configuration forwards the Content-Type request header to the origin.
How should I configure Cloudfront so that it doesn't care about request headers at all, and once I have made a request (whether with curl or a browser) it serves all future requests for the same resource from the edge and not the origin?
Request/Response headers-
I am afraid that the Cloudfront url won't be accessible from outside (until we go live) but I am posting request/response headers, this should give you fair idea. Also you can check out caching headers at origin - https://origin.ixigo.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
Response after two successive requests using a browser
Remote Address:54.230.156.66:443
Request URL:https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
Request Method:GET
Status Code:200 OK
Response Headers
Accept-Ranges:bytes
Age:23
Cache-Control:public, max-age=31557600
Connection:keep-alive
Content-Length:8708
Content-Type:image/jpg
Date:Fri, 27 Nov 2015 09:16:03 GMT
ETag:"-170562206"
Last-Modified:Sun, 29 Jun 2014 03:44:59 GMT
Vary:Accept-Encoding
Via:1.1 7968275877e438c758292828c0593684.cloudfront.net (CloudFront)
X-Amz-Cf-Id:fcbGLv8uBOP89qfR52OWa-NlqWkEREJPpZpy9ix0jdq8-a4oTx7lNw==
X-Backend:image6_40
X-Cache:Hit from cloudfront
X-Cache-Hits:0
X-Device:pc
X-DeviceType:pc
X-Powered-By:xyz
Now the same URL requested using curl, which gave me a miss
manu-mdc:cache manuc$ curl -I https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
HTTP/1.1 200 OK
Content-Type: image/jpg
Content-Length: 8708
Connection: keep-alive
Age: 0
Cache-Control: public, max-age=31557600
Date: Fri, 27 Nov 2015 09:16:47 GMT
ETag: "-170562206"
Last-Modified: Sun, 29 Jun 2014 03:44:59 GMT
X-Backend: image6_40
X-Cache-Hits: 0
X-Device: pc
X-DeviceType: pc
X-Powered-By: xyz
Vary: Accept-Encoding
X-Cache: Miss from cloudfront
Via: 1.1 4d42171c56a4c8b5c627040e6aa0938d.cloudfront.net (CloudFront)
X-Amz-Cf-Id: fY0LXhp7NlqB-I8F5-1TIMnA6bONjPD3CEp7dsyVdykP-7N2mbffvw==
Now this will give HIT
manu-mdc:cache manuc$ curl -I https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg
HTTP/1.1 200 OK
Content-Type: image/jpg
Content-Length: 8708
Connection: keep-alive
Cache-Control: public, max-age=31557600
Date: Fri, 27 Nov 2015 09:16:47 GMT
ETag: "-170562206"
Last-Modified: Sun, 29 Jun 2014 03:44:59 GMT
X-Backend: image6_40
X-Cache-Hits: 0
X-Device: pc
X-DeviceType: pc
X-Powered-By: xyz
Age: 3
Vary: Accept-Encoding
X-Cache: Hit from cloudfront
Via: 1.1 6877899d48ba844a34ea4378ce336f06.cloudfront.net (CloudFront)
X-Amz-Cf-Id: qpPhbLX_5t2Xj0XZuZdjWD2w-BI80DUVyL496meQkLfSEn3ikt7hNg==
This is similar to this issue: Why are two requests with different clients from the same computer cache misses on cloudfront?
Depending on whether you provide the "Accept-Encoding: gzip" header or not, the CloudFront edge server caches the object separately. Since browsers provide this header by default, and your site is likely to be accessed mostly via browsers, I suggest changing your curl call to include this header.
I was facing the same problem; after making the change in my curl call, I started to get a Hit on my first try via the browser (after making a curl call).
Another thing I noticed is that CloudFront requires the full requested object to be downloaded before it will be cached. If you try to download the file partially by specifying a byte range in the curl call, the intended object does not get cached; only the downloaded part gets cached, as a different object. The same goes for a curl call that was terminated partway through. I also tried wget with the spider option, but that internally makes only a HEAD call and thus does not get the content cached on the edge server.
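Putting those two observations together, a warm-up script might look roughly like this. It is a sketch using only Python's standard library: the function names are hypothetical, and it assumes a plain GET with Accept-Encoding: gzip and a fully read body is enough, as described above.

```python
import urllib.request


def build_warm_request(url):
    """Build a full GET (no Range header) that sends Accept-Encoding: gzip."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def warm(url, timeout=30):
    """Fetch the whole object and return CloudFront's X-Cache header, if any."""
    with urllib.request.urlopen(build_warm_request(url), timeout=timeout) as resp:
        # Read to the end so the download is never cut short.
        while resp.read(64 * 1024):
            pass
        return resp.headers.get("X-Cache")


# Example (performs a real network request, so it is left commented out):
# print(warm("https://youcannotaccess.com/image/upload/t_thumb,f_auto/r7y6ykuajvlumkp4lk2a.jpg"))
```

Calling `warm` twice on the same URL and seeing a Miss followed by a Hit would mirror the curl results shown in the question.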