If I'm in the northeast US and trying to talk to a site on AWS, behind CloudFront, I'm generally getting routed to servers in us-east. For a particular hostname (say, myfico.com), is there any way to force e.g. curl to talk with servers in a different region, presumably by routing my traffic to a different CloudFront edge server?
Since amazon publishes the IP ranges of all their edge servers, I suspect it might be possible to do something like overriding some of my DNS records and forcing it to use an IP prefix from a different region. IT might also be possible to do a more general location spoofing by modifying the edns-client-subnet that's passed along in the DNS request, but I think this might not be possible.
AWS regions are not really related to the CloudFront CDN. Generally speaking AWS region is the region where the server is physically located while the CDN offers a termination point to the client so that it can get a reply from the server quickly. You can possibly hit a CDN node in West US but the request may still be routed to us-east if the website is configured that way.
Related
I have a custom origin i.e. a web app on an EC2 instance. How do I decide whether I should go for:
a Cloudfront CDN
or,
deploy multiple instances in different regions and configure a Geolocation/proximity based routing policy
The confusion arises from the fact that both aim at routing the request to the nearest location (edge location in case of Cloudfront and region specific EC2 instance when it comes to multi-region deployments with Geolocation based policy with Route 53) based on where the request originates from.
There is no reason why you can't do both.
CloudFront automatically routes requests to an edge location nearest the viewer, and when a request can't be served from that location or the nearest regional cache, CloudFront does a DNS lookup for the origin domain name and fetches the content from the origin.
So far, I've only really stated the obvious. But up next is a subtle but important detail:
CloudFront does that origin server DNS lookup from a location that is near the viewer -- which means that if the origin domain name is a latency-based record set in Route 53, pointing to deployments in two or more EC2 regions, then the request CloudFront makes to "find" the origin will be routed to the origin deployment nearest the edge, which is also by definition going to be near to the viewer.
So a single, global CloudFront deployment can automatically and transparently select the best origin, using latency-based configuration for the backend's DNS configuration.
If the caching and transport optimizations provided by CloudFront do not give you the global performance you require, then you can deploy in multiple regions, behind CloudFront... being mindful, always, that a multi-region deployment is almost always a more complex environment, depending on the databases that are backing your application and how they are equipped to handle cross-region replication for reads and/or writes.
Including CloudFront as the front-end is also a better solution for fault tolerance among multiple regional deployments, because CloudFront correctly honors the DNS TTL on your origin server's DNS record, and if you have Route 53 health checks configured to take an unhealthy region out of the DNS response on the origin domain name, CloudFront will quickly stop sending further requests to it. Browsers are notoriously untrustworthy in this regard, sometimes caching a DNS answer until all tabs/windows are closed.
And if CloudFront is your front-end, you can offload portions of your logic to Lambda#Edge if desired.
You can use multi region for lot reasons mainly,
Proximity
Failover (incase if first region fails, requests can be sent to another region)
Multi region lambda deployment is clearly documented here. You can apply the same logic to all of the AWS Resources too. (DynamoDB, S3)
https://aws.amazon.com/blogs/compute/building-a-multi-region-serverless-application-with-amazon-api-gateway-and-aws-lambda/
You can also run Lambda#Edge to force all your requests / splits to one region on the edge.
Hope it helps.
I have attended an AWS training, and they explained to us that a good practice is to have cache all dynamic content via Cloudfront, setting TTL to 0, even if you have an LB in front on the Load Balancer. So it could be like:
Route 53 -> CloudFront -> Application LB
I can not see any advantage of this architecture, instead of having directly (only for dynamic content):
Route 53 -> Application LB
I do not see the point since Cloudfront will send all traffic always to the LB, so you will have:
Two HTTPS negotiation (client <-> Cloudfront, and Cloudfront <-> LB)
No caching at all (it is dynamic content, it should not be cached, since that is the meaning of "dynamic")
You will not have the client IP since your LB will see only the Cloudfront IP (I know this can be fixed, to have the client IP, but then you will have issues with the next bullet).
As an extra work, you need to be able to update your LB security groups often, to match the CloudFront IPs (for this region), as I guess you want to get traffic only from your Cloudfront, and not directly from the LB public endpoint.
So, probably, I am missing something important about this Route 53 -> CloudFront -> Application LB architecture.
Any ideas?
Thanks!
Here are some of the benefits of having cloudfront on top of your ALB
For a web application or other content that's served by an ALB in Elastic Load Balancing, CloudFront can cache objects and serve them
directly to users (viewers), reducing the load on your ALB.
CloudFront can also help to reduce latency and even absorb some distributed denial of service (DDoS) attacks. However, if users can
bypass CloudFront and access your ALB directly, you don't get these
benefits. But you can configure Amazon CloudFront and your Application
Load Balancer to prevent users from directly accessing the Application
Load Balancer (Doc).
Outbound data transfer charges from AWS services to CloudFront is $0/GB. The cost coming out of CloudFront is typically half a cent less
per GB than data transfer for the same tier and Region. What this
means is that you can take advantage of the additional performance and
security of CloudFront by putting it in front of your ALB, AWS Elastic
Beanstalk, S3, and other AWS resources delivering HTTP(S) objects for
next to no additional cost (Doc).
The CloudFront global network, which consists of over 100 points of presence (POP), reduces the time to establish viewer-facing
connections because the physical distance to the viewer is shortened.
This reduces overall latency for serving both static and dynamic
content (Doc).
CloudFront maintains a pool of persistent connections to the origin, thus reducing the overhead of repeatedly establishing new
connections to the origin. Over these connections, traffic between
CloudFront and AWS origins are routed over a private backbone network
for reliability and performance. This reduces overall latency for
serving both static and dynamic content (Doc).
You can use geo restriction, also known as geo blocking, to prevent users in specific geographic locations from accessing content that
you're distributing through a CloudFront distribution (Doc).
In other words you can use the benefits of ClodFront to add new features to your source (ALB, Elastic Beanstalk, S3, EC2) but if you don't need these features it is better not to do this configuration in your architecture.
Cloudfront enables you deliver content faster because Cloudfront Edge locations closer to the user requesting and are connected to the AWS Regions through the AWS network backbone.
You can terminate SSL at cloudfront and make the load balancer listen at port 80
Cloudfront allows to apply geo location restriction easily in 2 clicks.
I think another reason you may want to use CF in front of an ALB is that you could have a better experience with WAF (if you are already using (or planning to) WAF, of course).
Even though WAF is available for both ALB and CF, ALB and CF use different services for WAF. The reason is that Cloudfront is a global service and ALB is one per region.
That may bring more complex management and duplication of ACL (and probably more costs).
Cloudfront is really an amazing CDN content delivery network service like Akamai etc . Now if your web applications having lots of dynamic content like media files even you static code you can put them into a S3 bucket another object storage service by AWS .
Once you have your dynamic content to you S3 bucket you can create a Cloudfront distribution by considering that bucket as a origin this phenomena will cached your dynamic content across AWS multiple edge locations. And it will become fast to deliver on client side.
Now if we talk Load balancer . So it have it’s own purpose to serve image you are using a Application where you get an unpredictable traffic so here your Load balancer which we are considering an Application or classic Load balancer which is accepting request from Route 53 and passing it to your servers.
For high availability and scalability we consider such architecture of Application.
we create a autoscaling group of our ec2 instances and put them behind a load balancer and as per our scaling policy example: if my cpu or memory utilization goes more that 70% launch another instance or similar.
You can set a request policy as well on load balancer to serve traffic to your ec2 server maybe in round Robbin or on availability.
I just shared the best practices recommended of AWS fault tolerant and highly available architecture . I hope this may help you to get a better idea to decide now .
Please let me know if I can help you with more suggestions on it.
Thanks and Happy Leanings!!
Amazon released new feature - to support regional api endpoints
Does it mean I can deploy my same API code in two regions which sends request to Lambda micro-services? (It will be two different Https endpoints)
And have CloudFront distribute the traffic for me?
Any code snippets?
Does it mean I can deploy my same API code in two regions which sends request to Lambda micro-services? (It will be two different Https endpoints)
This was already possible. You can already deploy the same API code in multiple regions and create different HTTPS endpoints using API Gateway.
What you couldn't do, before, was configure API Gateway API endpoints in different regions to expect the same hostname -- and this is a critical capability that was previously unavailable, if you wanted to have a geo-routing or failover scenario using API Gateway.
With the previous setup -- which has now been renamed "Edge-Optimized Endpoints" -- every API Gateway API had a regional endpoint hostname but was automatically provisioned behind CloudFront. Accessing your API from anywhere meant you were accessing it through the CloudFront, which meant optimized connections and transport from the API client -- anywhere on the globe -- back to your API's home region via the AWS Edge Network, which is the network that powers CloudFront, Route 53, and S3 Transfer Acceleration.
Overall, this was good, but in some cases, it can be better.
The new configuration offering, called a Regional API Endpoint, does not use CloudFront or the Edge Network... but your API is still only in one region (but keep reading).
Regional API Endpoints are advantageous in cases like these:
If your traffic is from EC2 within the region, this avoids the necessity of jumping onto the Edge Network and back off again, which will optimize performance of API requests from inside the same EC2 region.
If you wanted to deploy an API Gateway endpoint behind a CloudFront distribution that you control (for example, to avoid cross-origin complications, or otherwise integrate API Gateway into a larger site), this previously required that you point your CloudFront distribution to the CloudFront distribution managed by API Gateway, thus looping through CloudFront twice, which meant transport latency and some loss of flexibility.
Creating a Regional API Endpoint allows you to then point your own CloudFront distribution directly at the API endpoint.
If you have a single API in a single region, and it's being accessed from points all over the globe, and you aren't using CloudFront yourself, the Edge-Optimized endpoint is still almost certainly the best way to go.
But Regional API Endpoints get interesting when it comes to custom domain names. Creating APIs with the same custom domain name (e.g. api.example.com) in multiple AWS regions was not previously possible, because of API Gateway's dependency on CloudFront. CloudFront is a global service, so the hostname namespace is also global -- only one CloudFront distribution, worldwide, can respond to a specific incoming request hostname. Since Regional API Endpoints don't depend on CloudFront, provisioning APIs with the same custom domain name in multiple AWS regions becomes possible.
So, assuming you wanted to serve api.example.com out of both us-east-2 and us-west-2, you'd deploy your individual APIs and then in each region, create a custom domain name configuration in each region for api.example.com with a Regional API Endpoint, selecting an ACM certificate for each deployment. (This requires ACM certs in the same region as the API, rather than always in us-east-1.)
This gives you two different hostnames, one in each region, that you use for your DNS routing. They look like this:
d-aaaaaaaaaa.execute-api.us-east-2.amazonaws.com
d-bbbbbbbbbb.execute-api.us-west-2.amazonaws.com
So, what next?
You use Route 53 Latency-Based routing to create a CNAME record for api.example.com with two targets -- one from us-east-2, one from us-west-2 -- pointing to the two respective names, along with health checks on the targets. Route 53 will automatically resolve DNS queries to whichever regional endpoint is closer to the requester. If, for example, you try to reach the API from us-east-1, your DNS query goes to Route 53 and there's no record there for us-east-1, so Route 53 determines that us-east-2 is the closer of the two regions, and -- assuming the us-east-2 endpoint has passed its healthcheck -- Route 53 returns the DNS record pointing to d-aaaaaaaaaa.execute-api.us-east-2.amazonaws.com.
So, this feature creates the ability to deploy an API in multiple AWS regions that will respond to the same hostname, which is not possible with Edge Optimized API Endpoints (as all endpoints were, prior to the announcement of this new feature).
And have CloudFront distribute the traffic for me?
Not exactly. Or, at least, not directly. CloudFront doesn't make origin determinations based on the requester's region, but Lambda#Edge dynamic origin selection could be used to modify the origin server based on the requester's general location (by evaluating which API region is nearest to the CloudFront edge that happens to be serving a specific request).
However, as you can see, above, Route 53 Latency-Based routing can do that for you. Yet, there's still a compelling reason to put this configuration behind a CloudFront distribution, anyway... two reasons, actually...
This is in essence a DNS failover configuration, and that is notoriously unreliable when the access is being made by a browser or by a Java programmer who hasn't heard that Java seems to cache DNS lookups indefinitely. Browsers are bad about that, too. With CloudFront in front of your DNS failover configuration, you don't have to worry about clients caching your DNS lookup, because CloudFront does it correctly. The TTL of your Route 53 records -- which used as an origin server behind CloudFront -- behaves as expected, so regional failover occurs correctly.
The second reason to place this configuration behind CloudFront would be if you want the traffic to be transported on the Edge Network. If the requests are only coming from the two AWS regions where the APIs are hosted, this might not be helpful, but otherwise it should improve responsiveness overall.
Note that geo-redundancy across regions is not something that can be done entirely transparently with API Gateway in every scenario -- it depends on how you are using it. One problematic case that comes to mind would involve a setup where you require IAM authentication against the incoming requests. The X-Amz-Credential includes the target region, and the signature of course would differ because the signing keys in Signature V4 are based on the secret/date/region/service/signing-key paradigm (which is a brilliant design, but I digress). This would complicate the setup since the caller would not know the target region. There may be other complications. Cognito may have similar complications. But for a straightforward API where the authentication is done by your own mechanism of application tokens, cookies, etc., this new capability is very much a big deal.
Somewhat amusingly, before this new capability was announced, I was actually working on the design of a managed service that would handle failover and geo-routing of requests to redundant deployments of API Gateway across regions, including a mechanism that had the capability to compensate for the differing region required in the signature. The future prospects of what I was working on are a bit less clear at the moment.
It means you can deploy your API based on the region which reduces latency.
One use case would be, Say you have a Lambda function which is invoking an API request. If both Lambda and API is in the same region then you can expect high performance
Please have a look on https://docs.aws.amazon.com/apigateway/latest/developerguide/create-regional-api.html
Does AWS allow usage of Cloudfront for websites usage, eg:- caching web pages.
Website should be accessible within corporate VPN only. Is it a good idea to cache webpages on cloudfront when using Application restricted within one network?
As #daxlerod points out, it is possible to use the relatively new Web Application Firewall service with CloudFront, to restrict access to the content, for example, by IP address ranges.
And, of course, there's no requirement that the web site actually be hosted inside AWS in order to use CloudFront in front of it.
However, "will it work?" and "are all the implications of the required configuration acceptable from a security perspective?" are two different questions.
In order to use CloudFront on a site, the origin server (the web server where CloudFront fetches content that isn't in the cache at the edge node where the content is being requested) has to be accessible from the Internet, in order for CloudFront to connect to it, which means your private site has to be exposed, at some level, to the Internet.
The CloudFront IP address ranges are public information, so you could partially secure access to the origin server with the origin server's firewall, but this only prevents access from anywhere other than through CloudFront -- and that isn't enough, because if I knew the name of your "secured" server, I could create my own CloudFront distribution and access it through CloudFront, since the IP addresses would be in the same range.
The mechanism CloudFront provides for ensuring that requests came from and through an authorized CloudFront distribution is custom origin headers, which allows CloudFront to inject an unknown custom header and secret value into each request it sends to your origin server, to allow your server to authenticate the fact that the request not only came from CloudFront, but from your specific CloudFront distribution. Your origin server would reject requests not accompanied by this header, without explanation, of course.
See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html#forward-custom-headers-restrict-access.
And, of course, you need https between the browser and CloudFront and https between CloudFront and the origin server. It is possible to configure CloudFront to use (or require) https on the front side or the back side separately, so you will want to ensure it's configured appropriately for both, if the security considerations addressed above make it a viable solution for your needs.
For information that is not highly sensitive, this seems like a sensible approach if caching or other features of CloudFront would be beneficial to your site.
Yes, you CloudFront is designed as a caching layer in front of a web site.
If you want to restrict access to CloudFront, you can use the Web Application Firewall service.
Put your website into public network > In CloudFront distribution attach WAF rules > In WAF rules whitelist range of your company's IP's and blacklist everything else
I am hosting my website using AWS.
The website is on 2 ec2 instances, with a load balancer (ELB) balancing traffic between them.
Currently, I am using my DNS (Route 53) to restrict the access to the website by using Route 53's geolocation routing:
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html#routing-policy-geo
(The geolocation restriction is just to limit the initial release of my website. It is not for security reasons. Meaning the restriction just needs to work for the general public)
This worries me a little because my load balancer is still exposed to access from everywhere. So I am concerned that my load balancer will get indexed by google or something and then people outside of my region will be able to access the site.
Are there any fixes for this? Am I restricting access by location the wrong way? Is there a way perhaps to specify in the ELB's security group that it only receive inbound traffic from my DNS (of course then I would also have to specify that inbound traffic from edge locations be allowed as well for my static content but this is not a problem)?
Note: There is an option when selecting inbound rules for a security group, under "type" to select "DNS(UDP)" or "DNS(TCP)". I tried adding two rules for both DNS types (and IP Address="anywhere") for my ELB but this did not limit access to the ELB to be solely through my DNS.
Thank you.
The simple solution, here, is found in CloudFront. Two solutions, actually:
CloudFront can use its GeoIP database to do the blocking for you...
When a user requests your content, CloudFront typically serves the requested content regardless of where the user is located. If you need to prevent users in specific countries from accessing your content, you can use the CloudFront geo restriction feature[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/georestrictions.html
You can configure CloudFront with which countries are allowed, or which are denied. You can also configure static pages, stored in S3, which are displayed to denied users. (You can also configure static custom error pages for other CloudFront errors that might occur, and store those pages in S3 as well, where CloudFront will fetch them if it ever needs them).
...or...
CloudFront can pass the location information back to your server using the CloudFront-Viewer-Country: header, and your application code, based on the contents accompanying that header, can do the blocking. The incoming request looks something like this (some headers munged or removed for clarity):
GET / HTTP/1.1
Host: example.com
X-Amz-Cf-Id: 3fkkTxKhNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
Via: 1.1 cb76b079000000000000000000000000.cloudfront.net (CloudFront)
CloudFront-Viewer-Country: US
CloudFront-Forwarded-Proto: https
Accept-Encoding: gzip
CloudFront caches the responses against the combination of the requested page and the viewer's country, and any other whitelisted headers, so it will correctly cache your denied responses as well as your allowed responses, independently.
Here's more about how you enable the CloudFront-Viewer-Country: header:
If you want CloudFront to cache different versions of your objects based on the country that the request came from, configure CloudFront to forward the CloudFront-Viewer-Country header to your origin. CloudFront automatically converts the IP address that the request came from into a two-letter country code.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html#header-caching-web-location
Or, of course, you can enable both features, letting CloudFront do the blocking, while still giving your app a heads-up on the country codes for the locations that were allowed through.
But how do you solve the issue with the fact that your load balancer is still open to the world?
CloudFront has recently solved this one, too, with Custom Origin Headers. These are secret custom headers sent to your origin server, by CloudFront, with each request.
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
So, let's say you added a custom header to CloudFront:
X-Yes-This-Request-Is-Legit: TE9MIHdoYXQgd2VyZSB5b3UgZXhwZWN0aW5nIHRvIHNlZT8=
What's all that line noise? Nothing, really, just a made up secret value that only your server and CloudFront know about. Configure your web server so that if this header and value are not present in the incoming request, then access is denied -- this is a request that didn't pass through CloudFront.
Don't use the above secret, of course... make up your own. It's entirely arbitrary.
Caveat applicable to any GeoIP-restricting strategy: it isn't perfect. CloudFront claims 99.8% accuracy.
The most reliable way to implement geographic IP restrictions, is to use a geographic location database or service API, and implement it at the application level.
For example, for a web site in practically any language, it is very simple to add test at the start of each page request, and compare the client IP against the geo ip database or service, and handle the response from there.
At the application level, it is easier to manage the countries you accept/deny, and log those events, as needed, than at the network level.
IP based geo location data is generally reliable, and there are many sources for this data. While you may trust AWS for many things, I do think that there are many reliable 3rd party sources for geo IP dat, that focus on this data.
freegeoip.net provides a public HTTP API to search the geolocation of IP addresses. You're allowed up to 10,000 queries per hour.
ip2location.com LITE is a free IP geolocation database for personal or commercial use.
If your application uses a database, these geo databases are quite easy to import and reference in your app.
I have a post explaining in detail how to whitelist / blacklist locations with Route53: https://www.devpanda.me/2017/10/07/DNS-Blacklist-of-locations-countries-using-AWS-Route53/.
In terms of your ELB being exposed to public that shouldn't be a problem since the Host header on any requests to the ELB over port 80 / 443 won't match your domain name, which means for most web servers a 404 will be returned or similar.
There is a way using AWS WAF
You can select - Resource type to associate with web ACL as ELB.
Select your ELB and create conditions like Geo Match, IP Address etcetera.
You can also update anytime if anything changes in future.
Thanks