How does one find real Host when using Cloudfront? - amazon-web-services

In what seems to be a very weird choice, Cloudfront sets the Host header to the origin server host that you specify when forwarding a request.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html
Why is this? More importantly, when using wildcard subdomains, how do I know which subdomain the request is from, when they all forward to the same origin?

Good news: CloudFront supports Host header forwarding now. It's buried deep in the documentation:
Host [header]: CloudFront sets the value to the domain name of the origin that is associated with the requested object.
Presumably, all you need to do is ensure that forwarding of the Host header is enabled in the Default Cache Behavior Settings.
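Once the viewer's Host header reaches the origin, working out which wildcard subdomain was requested is a small string operation. The following is a minimal sketch only: the function name subdomain_from_host and the base domain example.com are illustrative assumptions, and bracketed IPv6 literals in the Host header are not handled.

```python
def subdomain_from_host(host_header, base_domain):
    """Return the subdomain part of host_header relative to base_domain.

    Returns "" for the bare domain and None when the host does not
    belong to base_domain at all.
    """
    # Drop an optional ":port" and normalise case; hostnames are case-insensitive.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    base = base_domain.lower().rstrip(".")
    if host == base:
        return ""
    suffix = "." + base
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None


if __name__ == "__main__":
    print(subdomain_from_host("api-42.example.com", "example.com"))
```

A None result is worth logging at the origin: it usually means the request still carries the origin's own host name, which suggests the Host header is not being forwarded.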

Related

How to pass UserAgent header to origin but not cache on it

We use CloudFront for caching user requests on CDN level.
The problem is that we want to pass the User-Agent header to the origin so that we can track it, but we do not want to use it as part of the cache key (there are simply too many user agents, and the value of caching would drop to zero).
When using the Whitelist strategy for headers, AWS CloudFront strips every header that is neither whitelisted nor a default header, and doesn't send it to the origin.
Since we currently use the Whitelist forwarding strategy, one solution would be to switch to the All strategy, but we want to keep the other headers that we are using as cache keys.
Any ideas?
I checked similar questions, but have found no answer to this question so far:
Does Amazon pass custom headers to origin?
Pass cookie to CloudFront origin but prevent from caching
(similar but for cookies)

Akamai CDN - Whitelist service by Request header or User agent

We are using Akamai CDN as our load balancer, and it also serves as a gatekeeper for requests.
We usually consume third-party services and, in those cases, whitelist their IP addresses so that they can access our servers. The service we are currently using cannot share an IP address, since it runs in the cloud and its address keeps changing. They can provide a host name, a custom request header, or a user agent instead.
I tried adding a host entry, but that did not work. Any idea how to add a custom request header or user agent?
Depending on the product you're using to deliver with Akamai, the solution is to add a "Modify Outgoing Request Header" behavior.
This is what the "Add Behavior" tool looks like when you filter behaviors on "header".
By default, this allows you to specify a User-Agent header to go to your origin with the request. You can also specify a custom header name and value using that Behavior.
If you make it part of your Default Rule, then every single request that goes through that property will have the custom header.

How can I change TTL values for Cloudfront object caching for my default cache behaviour pattern?

I have a legacy site and a new site, and I'm using Cloudfront to route traffic to the two different server groups based on URL path (eg. /new goes to the new servers, everything else to the old ones).
In AWS Cloudfront, my Default (*) path pattern captures all traffic not caught by the other rules, and routes these requests to the legacy site. This site explicitly prevents caching with its headers and I want to override this.
From the AWS Console, though, it looks like I can't do this.
All the other cache behaviours (eg. for the /new path pattern) allow me to set these options. Does this mean that Cloudfront doesn't allow customisation of TTLs for the default path? If so, the only way I can fix these cache headers will be to manually change them at the source origin, which I'd prefer to avoid.
Is there a way I can change these settings for my default route?

Mapping Host Header to custom header for secondary origin

I'm looking for a way to pass the requesting host header onto either the API Gateway or a custom endpoint (outside of amazon) from a cloudfront origin.
Essentially I have multiple domains mapped to a cloudfront catchall and I'm trying to pre-render based off the index request on the server while letting all other resources through.
If this is not possible, would Lambda@Edge be able to achieve such a thing?
Thanks!
Until such time as Lambda@Edge leaves preview, here's your workaround:
For each domain name, create a separate CloudFront distribution, and add a unique custom origin header.
If you've configured more than one CloudFront distribution to use the same origin, you can specify different custom headers for the origins in each distribution and use the logs for your web server to distinguish between the requests that CloudFront forwards for each distribution.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
It should go without saying that "use the logs for your web server" is only one possible use for this value. You can also use it to identify which domain the request is for, by inspecting the inserted request header.
For example, for the site api-42.example.com, add a custom origin header X-Forwarded-Host with the static value the same as the hostname, api-42.example.com.
CloudFront adds the custom origin header to each request when sending it to the origin server.
If the client, for whatever reason, sends the same header, CloudFront discards what the client sent, before adding your header and value to each request.
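On the origin side, identifying the site then comes down to reading that injected header. Here is a minimal sketch, assuming a Python WSGI origin (the answer does not say what the origin runs); the function name requested_site and the set of known sites are hypothetical.

```python
def requested_site(environ, known_sites):
    """Return the site named by the injected X-Forwarded-Host header, or None.

    WSGI exposes the request header X-Forwarded-Host as HTTP_X_FORWARDED_HOST.
    The value is only accepted if it is one of the sites we configured.
    """
    value = environ.get("HTTP_X_FORWARDED_HOST", "").strip().lower()
    return value if value in known_sites else None


if __name__ == "__main__":
    sites = {"api-42.example.com"}
    print(requested_site({"HTTP_X_FORWARDED_HOST": "api-42.example.com"}, sites))
```

Checking the value against a known list is a defensive choice: it keeps requests that reach the origin without passing through one of your distributions from selecting an arbitrary site.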
Since the actual CloudFront distributions themselves are free, there's no real harm in this solution. If you need to create a lot of them, that's easily scripted with aws-cli. By default, accounts can create 200 different distributions, but you can submit a free support request to increase that limit.
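If you script the distributions, the only part that differs per site is the custom origin header. The sketch below only builds that per-site Origins fragment as plain dictionaries; it makes no AWS calls. The field names follow the CloudFront DistributionConfig shape as I understand it and should be checked against the API reference before use. The origin ID shared-origin, the origin domain, and both function names are hypothetical, and the rest of the distribution configuration (cache behaviours, aliases, and so on) is omitted.

```python
def origin_with_site_header(origin_id, origin_domain, site_hostname):
    """Build one origin entry that tags every forwarded request with the site name."""
    return {
        "Id": origin_id,
        "DomainName": origin_domain,
        "CustomHeaders": {
            "Quantity": 1,
            "Items": [
                {"HeaderName": "X-Forwarded-Host", "HeaderValue": site_hostname},
            ],
        },
        "CustomOriginConfig": {
            "HTTPPort": 80,
            "HTTPSPort": 443,
            "OriginProtocolPolicy": "https-only",
        },
    }


def origins_for_sites(origin_domain, site_hostnames):
    """Map each site hostname to the Origins block for its own distribution."""
    return {
        site: {
            "Quantity": 1,
            "Items": [origin_with_site_header("shared-origin", origin_domain, site)],
        }
        for site in site_hostnames
    }


if __name__ == "__main__":
    blocks = origins_for_sites("origin.example.com", ["api-42.example.com"])
    print(blocks["api-42.example.com"]["Items"][0]["CustomHeaders"])
```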
You may now be contemplating the impact of this on your cache hit rate, since the different sites wouldn't share a common cache. That's a valid concern, but the impact may not be as substantial as you expect, for a variety of reasons -- not the least of which is that CloudFront's cache is not monolithic. If you have viewers hitting a single distribution but from two different parts of the world, those users are almost certainly connecting to different CloudFront edge locations, thus hitting different cache instances anyway.

Cloudfront not reached when integrated with Route 53

I'm trying to make Cloudfront work on my solution. I'm using Route 53 + CloudFront + ELB.
Consider the following:
1. Route 53 is pointing to CloudFront through a record set alias.
2. CloudFront is pointing to the ELB through an origin domain name.
3. CloudFront has an Alternate Domain Name set to my custom domain (mysite.com)
If I make a request using the CloudFront domain name (d1ngxxxx.cloudfront.net) or the custom domain (mysite.com), the initial request goes to CloudFront, which responds with an HTTP 302. All the subsequent requests (for resources like images, css, js..) are made directly to the ELB domain name, bypassing CloudFront.
What should I do to make all requests go through CloudFront?
Thanks in advance!
I can't come up with a circumstance where Cloudfront would issue these redirects.
It seems likely that what's happening is that your server itself is issuing the 302 redirect, because it doesn't like the Host: header it's getting from Cloudfront.
Host: CloudFront sets the value to the domain name of the origin that is associated with the requested object.
— http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html
Cloudfront is then returning the redirect to the browser.
Cloudfront can also cache such a redirect, so be mindful of that as you're troubleshooting. The response headers should indicate whether cloudfront went to the origin for the particular response:
X-Cache: Miss from cloudfront
...or whether cloudfront served the request from cache.
X-Cache: Hit from cloudfront
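While troubleshooting, it can help to classify responses by this header automatically. The following is a small sketch that works on any mapping of response headers; the function name served_from_cache is an assumption, and only the two values quoted above are recognised, with anything else reported as unknown.

```python
def served_from_cache(response_headers):
    """Return True for a cache hit, False for a miss, None if undetermined."""
    value = ""
    for name, header_value in response_headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-cache":
            value = header_value.strip().lower()
            break
    if value.startswith("hit from cloudfront"):
        return True
    if value.startswith("miss from cloudfront"):
        return False
    return None


if __name__ == "__main__":
    print(served_from_cache({"X-Cache": "Hit from cloudfront"}))
```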
Two possible approaches to resolve this:
If your legacy code is reacting to the Host: header in a negative way, you might be able to reconfigure the web server to modify that value before the code is able to see it, so the redirection wouldn't occur.
Alternately, you could use something outboard, a reverse-proxying engine like Varnish or HAProxy (which I have touched on elsewhere). In HAProxy, for a simple example:
reqirep ^Host:\ .* Host:\ expected-domain.example.com if { hdr(host) -i unexpected-domain.example.com }
A rule in form similar to this would replace the Host: unexpected-domain.example.com header with Host: expected-domain.example.com in all incoming requests where that header was present, which should keep your legacy code happy and avoid the redirects. Running HAProxy in front of your legacy system doesn't impose a significant load, since the code is very tight. All of my legacy web systems are now fronted with these systems, to give me the ability to manipulate and modify behavior much more easily than might otherwise be possible.
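The same rewrite can also be done inside the application server, which corresponds to the first approach above. This is a hedged sketch, assuming the legacy code happens to run as a Python WSGI application (nothing in the question says so); the class name HostRewrite is hypothetical, and it is an analogue of the HAProxy rule, not a replacement for it.

```python
class HostRewrite:
    """WSGI middleware that swaps one Host value for another before the app sees it."""

    def __init__(self, app, unexpected, expected):
        self.app = app
        self.unexpected = unexpected.lower()
        self.expected = expected

    def __call__(self, environ, start_response):
        # WSGI stores the Host request header under HTTP_HOST.
        # The comparison is case-insensitive, like the -i flag in the HAProxy rule.
        if environ.get("HTTP_HOST", "").lower() == self.unexpected:
            environ["HTTP_HOST"] = self.expected
        return self.app(environ, start_response)


def echo_host(environ, start_response):
    """Tiny demo app that returns the Host value it was given."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["HTTP_HOST"].encode()]


if __name__ == "__main__":
    app = HostRewrite(echo_host, "unexpected-domain.example.com", "expected-domain.example.com")
    print(app({"HTTP_HOST": "unexpected-domain.example.com"}, lambda status, headers: None))
```

Requests carrying any other Host value pass through unchanged, so the wrapper can sit in front of the legacy application without affecting traffic that already uses the expected name.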