Does AWS allow CloudFront to be used for websites, e.g. for caching web pages?
The website should be accessible only from within the corporate VPN. Is it a good idea to cache web pages on CloudFront when the application is restricted to one network?
As @daxlerod points out, it is possible to use the relatively new Web Application Firewall service with CloudFront, to restrict access to the content, for example, by IP address ranges.
And, of course, there's no requirement that the web site actually be hosted inside AWS in order to use CloudFront in front of it.
However, "will it work?" and "are all the implications of the required configuration acceptable from a security perspective?" are two different questions.
In order to use CloudFront on a site, the origin server (the web server where CloudFront fetches content that isn't in the cache at the edge node where the content is being requested) has to be accessible from the Internet, in order for CloudFront to connect to it, which means your private site has to be exposed, at some level, to the Internet.
The CloudFront IP address ranges are public information, so you could partially secure access to the origin server with the origin server's firewall, but this only prevents access from anywhere other than through CloudFront -- and that isn't enough, because if I knew the name of your "secured" server, I could create my own CloudFront distribution and access it through CloudFront, since the IP addresses would be in the same range.
The mechanism CloudFront provides for ensuring that requests came from and through an authorized CloudFront distribution is custom origin headers, which allows CloudFront to inject an unknown custom header and secret value into each request it sends to your origin server, to allow your server to authenticate the fact that the request not only came from CloudFront, but from your specific CloudFront distribution. Your origin server would reject requests not accompanied by this header, without explanation, of course.
See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html#forward-custom-headers-restrict-access.
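As a rough illustration of the origin-side check described above, here is a minimal Python sketch. The header name and secret value are hypothetical placeholders (you would configure your own in CloudFront), and the function only shows the comparison, not how it plugs into any particular web server.

```python
import hmac

# Hypothetical header name and secret; both are placeholders for illustration.
SECRET_HEADER = "X-Origin-Verify"
SECRET_VALUE = "replace-with-a-long-random-string"


def request_is_from_my_distribution(headers: dict) -> bool:
    """Return True only if the secret header is present and matches."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get(SECRET_HEADER.lower(), "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), SECRET_VALUE.encode())
```

A request that fails this check would be rejected with a generic error and no explanation.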
And, of course, you need https between the browser and CloudFront and https between CloudFront and the origin server. It is possible to configure CloudFront to use (or require) https on the front side or the back side separately, so you will want to ensure it's configured appropriately for both, if the security considerations addressed above make it a viable solution for your needs.
For information that is not highly sensitive, this seems like a sensible approach if caching or other features of CloudFront would be beneficial to your site.
Yes, CloudFront is designed as a caching layer in front of a web site.
If you want to restrict access to CloudFront, you can use the Web Application Firewall service.
Put your website on a public network, attach WAF rules to the CloudFront distribution, and in those rules whitelist your company's IP ranges and block everything else.
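The allowlist decision that such a WAF rule expresses can be sketched in a few lines of Python. This is not WAF configuration, just the logic. It assumes VPN users reach the Internet through a fixed set of public egress ranges; the ranges shown are documentation-reserved placeholders.

```python
import ipaddress

# Hypothetical corporate egress ranges (documentation-reserved addresses).
ALLOWED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_allowed(client_ip: str) -> bool:
    """Allow a request only if its source address is in an allowed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in ALLOWED_RANGES)
```

Everything outside the listed ranges is denied by default, which matches the "whitelist, then block everything else" ordering.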
I have a load balancer (ALB) that handles my app's API requests (alb.domain.com). I also have an S3 bucket from which my static site is served via CloudFront (domain.com). I have configured a distribution so that /api goes to the ALB and the rest goes to S3, and all works fine. However, instead of routing my API requests through CloudFront (domain.com/api), I can also route them directly to the ALB (alb.domain.com/api), which also works fine. I don't need caching on my API requests, so I disabled it on the distribution anyway. What would be the point of making the requests via CloudFront, given that (1) I am introducing an extra connection in the route and (2) my requests are now billed at CloudFront pricing? Are there any disadvantages associated with just routing requests directly to the ALB and using CloudFront only for serving static files?
There are a number of benefits to using CloudFront in your scenario (and generally).
Firstly, routing everything through CloudFront gives the impression of a single domain, which removes the need to implement CORS and helps reduce complexity in security configurations (such as CSP). Additionally, you can cache static requests, or those that change infrequently, in the future if you need to.
You also gain access to the edge network for routing, which benefits clients that are farther away from the region. The user hits the edge location closest to them, which then routes and establishes a connection with the origin over the private AWS network (which will be faster than traversing the public internet).
Additionally, security evaluations happen at the edge: both WAF and Lambda@Edge are evaluated closer to the user on AWS edge infrastructure rather than on your own resources.
I'm new to AWS WAF and am stuck setting it up for an application hosted on a dedicated server. I didn't find any information on how to set it up without migrating to AWS servers, but I found that WAF is integrated with CloudFront. Even so, I found very little information explaining how to integrate this CDN with my web application. So, the main question is:
Is it possible to use AWS WAF with an application hosted on a dedicated server? And if it is possible, can you provide some guides and/or docs for setting it up?
Yes, you can use WAF with a server outside AWS.
WAF works with CloudFront, and CloudFront does not require the origin server to be in the AWS ecosystem.
When you create a distribution, you specify where CloudFront sends requests for the files. CloudFront supports using several AWS resources as origins. For example, you can specify an Amazon S3 bucket or a MediaStore container, a MediaPackage channel, or a custom origin, such as an Amazon EC2 instance or your own HTTP web server. (emphasis added)
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/DownloadDistS3AndCustomOrigins.html
Configuring CloudFront to work with your external server is no different than configuring it to work with a server in EC2. Your DNS entry (e.g. www.example.com) changes to point to CloudFront, and CloudFront connects to your server using a new name that you create (e.g. origin.example.com). CloudFront proxies requests through to your server, unless the edge location handling a given request happens to have a copy of the same resource that it cached while handling a previous request for the same page -- that's how CloudFront gets your content, by caching it as it handles requests that are passing through. (You don't pre-load any content into CloudFront.) If CloudFront has a cached copy, your server sees nothing, and CloudFront returns the object to the browser from its cache. But CloudFront isn't strictly a CDN, even though they market it that way. It is a global network of reverse proxies and high-reliability/low-latency transport.
You'll want to take steps to ensure that the web server rejects requests that don't come through CloudFront. See Using Custom Headers to Restrict Access to Your Content on a Custom Origin as well as the list of CloudFront IP Addresses which you could use on your web server's firewall.
Once you have your site working through CloudFront, all you do is activate WAF on the distribution. CloudFront is very tightly integrated with WAF so that is a very simple change, once you have your WAF rules set up.
I configured CDN in front of my website. Everything works well when you access the website via my custom DNS or CDN DNS. My problem is when I want to use an IP instead of DNS.
When I do a nslookup on my CDN DNS name, I get a list of IP's. If I grab one IP address from there and try to access the website, I get a 403 Forbidden request.
Why is CDN only accepting DNS request and not IP's?
What if I have a proxy in front of my CDN and try to access my website by using the proxy IP, how can I access the website using the proxy IP which points to CDN?
It's a wired requirement and time consuming, I've been looking for the correct answer. No one seems to show a solution.
Cheers!
Why is CDN only accepting DNS request and not IP's?
CloudFront is not designed to work this way. It is a massive, globally-distributed system. When you look up the IP addresses of your CloudFront distribution, you are receiving the list of addresses where CloudFront expects to receive traffic:
for your web site, and
for potentially hundreds or thousands of other sites, and
from browsers in the same geographic area as you
You need a way to identify which distribution you expect CloudFront to use when processing your request.
In HTTP mode, this uses the Host: HTTP header, sent by the browser. In HTTPS mode, this uses the TLS SNI value and the Host: header.
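To make the idea concrete, here is a toy Python model of name-based selection on a shared edge address. CloudFront's internals are not public, so this is only an illustration of the principle; the hostname-to-distribution table is entirely hypothetical.

```python
from typing import Optional

# Hypothetical mapping of hostnames to distributions served by one shared edge.
DISTRIBUTIONS = {
    "dzczcexample.cloudfront.net": "distribution-A",
    "www.example.com": "distribution-A",  # an alternate domain name
    "d111111abcdef8.cloudfront.net": "distribution-B",
}


def select_distribution(host_header: Optional[str]) -> Optional[str]:
    """Pick a distribution from the Host header, or None if nothing matches."""
    if not host_header:
        return None
    # Drop an optional port and ignore case, since hostnames are case-insensitive.
    host = host_header.split(":")[0].strip().lower()
    return DISTRIBUTIONS.get(host)


def status_for(host_header: Optional[str]) -> int:
    """A bare IP address in the Host header matches no distribution: 403."""
    return 200 if select_distribution(host_header) else 403
```

A request sent to the raw IP carries that IP as its Host value, so the lookup finds nothing and the edge has no way to know which site was meant.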
If you were using a proxy to access CloudFront, you would need for the proxy to inject a Host header for HTTP and to set the SNI correctly, too, for HTTPS.
In HAProxy, for example, set the host header, overwriting any such header that's already present.
http-request set-header Host dzczcexample.cloudfront.net
Of course, you might use any one of the Alternate Domain Name values configured for your distribution, as well.
For SNI:
backend my-cloudfront-backend
server my-cloudfront dzczcexample.cloudfront.net:443 ssl verify none sni str(dzczcexample.cloudfront.net)
(Source: https://serverfault.com/a/830327/153161)
But this is only a minimum baseline working configuration, because CloudFront has features that this simple setup overlooks.
As noted above, CloudFront is returning a list of IP addresses that should be used to access (1) your site, (2) from where you are, (3) right now. The list of addresses can and will vary. CloudFront appears to be able to dynamically manage and distribute its workload and mitigate DDoS by moving traffic from one set of servers to another, from one edge location to another, etc., by modifying the DNS responses... so your proxy needs to be using the multiple addresses returned, and needs to be refreshing its DNS values so that it always connects to where CloudFront wishes for it to connect, for optimum behavior and performance.
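A proxy-side helper that re-resolves the distribution hostname and keeps every returned address might look like the following Python sketch. The function name and the injectable `resolver` parameter are illustrative choices, not part of any proxy's API; the default resolver is the standard library's `socket.getaddrinfo`.

```python
import socket


def resolve_all(hostname, port=443, resolver=socket.getaddrinfo):
    """Return every distinct address currently published for hostname."""
    results = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling this on a short timer and replacing the backend address list with the result keeps the proxy following whatever DNS answers CloudFront is currently handing out.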
Also, don't overlook the fact that a proxy server will connect to CloudFront via an edge near the proxy, not near the browser, so this is not something you would routinely use in production, though it absolutely does have some valid use cases. (I have used HAProxy on both sides of CloudFront for several years, for certain applications -- some of which have now been obviated by Lambda@Edge, but I digress).
It's a wired [weird?] requirement
Not really. Name-based virtual hosting has been the standard practice for many years. It is -- in my opinion -- almost an accident of history that when you set up a web server, it will commonly respond on the IP address in the Host header, as well. A well-configured web server will not do this -- if you (the web browser) don't know what host you are asking for and are just sending a request to my IP, then I (the web server) should tell you I have no idea what you want from me, because you are more likely than not to be arriving for malicious reasons, or benign but annoying reasons (scanning), or as the result of a misconfiguration. You also don't want search engine spiders finding your content at an IP address. Bad for listings, bad for SEO.
The problem I have with this proxy is that I cannot configure the headers. I was thinking of adding another proxy that I control and then modifying the headers with the values I want. But this is not really a good solution; it is like jumping from one proxy to another.
I think I should rely more on DNS and hostnames, rather than IPs. Which is fine with me, I prefer using a proper DNS name.
Thanks for your thorough explanation, you clarify a lot of things.
I need to use AWS WAF for my web application hosted on AWS to add rule-based security to it. I couldn't find any way to use WAF directly with ELB; WAF needs CloudFront in order to attach a web ACL that blocks requests based on rules.
So, I added my Application ELB CNAME to CloudFront. Only the domain name, the web ACL with an IP block rule, and the HTTPS protocol were set in CloudFront; everything else was left at the defaults. Once both WAF and CloudFront with the ELB CNAME were set up, I tried to access the ELB CNAME from one of the IP addresses in the WAF IP block rule. I am still able to access my web application from that IP address. Also, I checked the CloudWatch metrics for the web ACL I created, and I see it's not even being hit.
First, is there a good way to achieve what I am doing, and second, is there a specific way to add the ELB CNAME to CloudFront?
Thanks and Regards,
Jay
Service update: The original, extended answer below was correct at the time it was written, but is now primarily applicable to Classic ELB, because -- as of 2016-12-07 -- Application Load Balancers (elbv2) can be directly integrated with Web Application Firewall (AWS WAF).
Starting [2016-12-07] AWS WAF (Web Application Firewall) is available on the Application Load Balancer (ALB). You can now use AWS WAF directly on Application Load Balancers (both internal and external) in a VPC, to protect your websites and web services. With this launch customers can now use AWS WAF on both Amazon CloudFront and Application Load Balancer.
https://aws.amazon.com/about-aws/whats-new/2016/12/AWS-WAF-now-available-on-Application-Load-Balancer/
It seems like you do need some clarification on how these pieces fit together.
So let's say your actual site that you want to secure is app.example.com.
It sounds as if you have a CNAME elb.example.com pointing to the assigned hostname of the ELB, which is something like example-123456789.us-west-2.elb.amazonaws.com. If you access either of these hostnames, you're connecting directly to the ELB -- regardless of what's configured in CloudFront or WAF. These machines are still accessible over the Internet.
The trick here is to route the traffic to CloudFront, where it can be firewalled by WAF, which means a couple of additional things have to happen: first, this means an additional hostname is needed, so you configure app.example.com in DNS as a CNAME (or Alias, if you're using Route 53) pointing to the dxxxexample.cloudfront.net hostname assigned to your distribution.
You can also access your site using the assigned CloudFront hostname, directly, for testing. Accessing this endpoint from the blocked IP address should now indeed result in the request being denied.
So, the CloudFront endpoint is where you need to send your traffic -- not directly to the ELB.
Doesn't that leave your ELB still exposed?
Yes, it does... so the next step is to plug that hole.
If you're using a custom origin, you can use custom headers to prevent users from bypassing CloudFront and requesting content directly from your origin.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
The idea here is that you will establish a secret value known only to your servers and CloudFront. CloudFront will send this in the headers along with every request, and your servers will require that value to be present or else they will play dumb and throw an error -- such as 503 Service Unavailable or 403 Forbidden or even 404 Not Found.
So, you make up a header name, like X-My-CloudFront-Secret-String and a random string, like o+mJeNieamgKKS0Uu0A1Fqk7sOqa6Mlc3 and configure this as a Custom Origin Header in CloudFront. The values shown here are arbitrary examples -- this can be anything.
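If you'd rather not invent the random string by hand, Python's standard `secrets` module can generate one. This is just one convenient way to produce a value; the function name is an illustrative choice.

```python
import secrets


def new_origin_secret(num_bytes: int = 32) -> str:
    """Generate a URL-safe random string suitable for a shared-secret header value."""
    return secrets.token_urlsafe(num_bytes)
```

Generate the value once, then configure the same string both in the CloudFront Custom Origin Header and on your web server.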
Then configure your application web server to deny any request where this header and the matching value are not present -- because this is how you know the request came from your specific CloudFront distribution. Anything else (other than ELB health checks, for which you need to make an exception) is not from your CloudFront distribution, and is therefore unauthorized by definition, so your server needs to deny it with an error, but without explaining too much in the error message.
This header and its expected value remains a secret because it will not be sent back to the browser by CloudFront -- it's only sent in the forward direction, in the requests that CloudFront sends to your ELB.
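For a Python application, the deny-unless-secret rule with a health-check exception could be sketched as WSGI middleware. The health-check path and the secret value here are hypothetical, and exempting by path is only one possible way to let ELB health checks through; the exempted endpoint should reveal nothing sensitive.

```python
import hmac

# WSGI exposes the X-My-CloudFront-Secret-String request header under this key.
SECRET_HEADER_ENV = "HTTP_X_MY_CLOUDFRONT_SECRET_STRING"
EXPECTED_SECRET = "replace-with-your-own-random-value"  # placeholder
HEALTH_CHECK_PATH = "/healthz"  # hypothetical ELB health-check target


class RequireCloudFrontSecret:
    """Reject requests that lack the shared secret, except health checks."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # ELB health checks do not pass through CloudFront, so exempt their path.
        if environ.get("PATH_INFO") == HEALTH_CHECK_PATH:
            return self.app(environ, start_response)
        supplied = environ.get(SECRET_HEADER_ENV, "")
        if hmac.compare_digest(supplied.encode(), EXPECTED_SECRET.encode()):
            return self.app(environ, start_response)
        # Deny without explaining why.
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden\n"]
```

Wrapping the application as `RequireCloudFrontSecret(app)` applies the check to every request before any application code runs.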
Note that you should get an SSL cert for your ELB (for the elb.example.com hostname) and configure CloudFront to forward all requests to your ELB using HTTPS. The likelihood of interception of traffic between CloudFront and ELB is low, but this is a protection you should consider implementing.
You can optionally also reduce (but not eliminate) unauthorized access by allowing only the CloudFront IP address ranges in the ELB security group, which blocks requests that don't arrive from CloudFront. The CloudFront address ranges are documented (search the JSON for blocks designated as CLOUDFRONT, and allow only these in the ELB security group). Note that if you do this, you still need to set up the custom origin header configuration discussed above, because if you only block at the IP level, you're still technically allowing anybody's CloudFront distribution to access your ELB. Your CloudFront distribution shares IP addresses in a pool with other CloudFront distributions, so the fact that a request arrives from CloudFront is not a sufficient guarantee that it is from your CloudFront distribution. Note also that you need to sign up for change notifications so that if new address ranges are added to CloudFront, you'll know to add them to your security group.
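Picking the CLOUDFRONT blocks out of the published JSON can be sketched in Python as below. The function works on the already-parsed document (for example, the result of `json.load` on a downloaded copy of ip-ranges.json); the function name is an illustrative choice, and the sample data in any test would be made up.

```python
def cloudfront_prefixes(ip_ranges: dict) -> list:
    """Return the CIDR blocks tagged CLOUDFRONT in a parsed ip-ranges.json document."""
    v4 = [
        entry["ip_prefix"]
        for entry in ip_ranges.get("prefixes", [])
        if entry.get("service") == "CLOUDFRONT"
    ]
    v6 = [
        entry["ipv6_prefix"]
        for entry in ip_ranges.get("ipv6_prefixes", [])
        if entry.get("service") == "CLOUDFRONT"
    ]
    # De-duplicate and sort so the output is stable between runs.
    return sorted(set(v4 + v6))
```

Re-running this whenever a change notification arrives, and comparing the result with what the security group currently allows, shows which ranges need to be added.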
I am hosting my website using AWS.
The website is on 2 ec2 instances, with a load balancer (ELB) balancing traffic between them.
Currently, I am using my DNS (Route 53) to restrict the access to the website by using Route 53's geolocation routing:
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html#routing-policy-geo
(The geolocation restriction is just to limit the initial release of my website. It is not for security reasons. Meaning the restriction just needs to work for the general public)
This worries me a little because my load balancer is still exposed to access from everywhere. So I am concerned that my load balancer will get indexed by google or something and then people outside of my region will be able to access the site.
Are there any fixes for this? Am I restricting access by location the wrong way? Is there a way perhaps to specify in the ELB's security group that it only receive inbound traffic from my DNS (of course then I would also have to specify that inbound traffic from edge locations be allowed as well for my static content but this is not a problem)?
Note: There is an option when selecting inbound rules for a security group, under "type" to select "DNS(UDP)" or "DNS(TCP)". I tried adding two rules for both DNS types (and IP Address="anywhere") for my ELB but this did not limit access to the ELB to be solely through my DNS.
Thank you.
The simple solution, here, is found in CloudFront. Two solutions, actually:
CloudFront can use its GeoIP database to do the blocking for you...
When a user requests your content, CloudFront typically serves the requested content regardless of where the user is located. If you need to prevent users in specific countries from accessing your content, you can use the CloudFront geo restriction feature[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/georestrictions.html
You can configure CloudFront with which countries are allowed, or which are denied. You can also configure static pages, stored in S3, which are displayed to denied users. (You can also configure static custom error pages for other CloudFront errors that might occur, and store those pages in S3 as well, where CloudFront will fetch them if it ever needs them).
...or...
CloudFront can pass the location information back to your server using the CloudFront-Viewer-Country: header, and your application code, based on the contents accompanying that header, can do the blocking. The incoming request looks something like this (some headers munged or removed for clarity):
GET / HTTP/1.1
Host: example.com
X-Amz-Cf-Id: 3fkkTxKhNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
Via: 1.1 cb76b079000000000000000000000000.cloudfront.net (CloudFront)
CloudFront-Viewer-Country: US
CloudFront-Forwarded-Proto: https
Accept-Encoding: gzip
CloudFront caches the responses against the combination of the requested page and the viewer's country, and any other whitelisted headers, so it will correctly cache your denied responses as well as your allowed responses, independently.
Here's more about how you enable the CloudFront-Viewer-Country: header:
If you want CloudFront to cache different versions of your objects based on the country that the request came from, configure CloudFront to forward the CloudFront-Viewer-Country header to your origin. CloudFront automatically converts the IP address that the request came from into a two-letter country code.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html#header-caching-web-location
Or, of course, you can enable both features, letting CloudFront do the blocking, while still giving your app a heads-up on the country codes for the locations that were allowed through.
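For the second option, the application-side decision might look like this Python sketch. The set of allowed countries is a hypothetical example, and the function assumes CloudFront has been configured to forward the CloudFront-Viewer-Country header.

```python
# Hypothetical launch regions, as two-letter country codes.
ALLOWED_COUNTRIES = {"US", "CA"}


def country_decision(headers: dict) -> int:
    """Return an HTTP status: 200 for allowed countries, 403 otherwise."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    country = normalized.get("cloudfront-viewer-country", "").upper()
    # A missing header is treated as a denial in this sketch.
    return 200 if country in ALLOWED_COUNTRIES else 403
```

Denying when the header is absent is a design choice here: a request without it either bypassed CloudFront or the header is not being forwarded.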
But how do you solve the issue with the fact that your load balancer is still open to the world?
CloudFront has recently solved this one, too, with Custom Origin Headers. These are secret custom headers sent to your origin server, by CloudFront, with each request.
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
So, let's say you added a custom header to CloudFront:
X-Yes-This-Request-Is-Legit: TE9MIHdoYXQgd2VyZSB5b3UgZXhwZWN0aW5nIHRvIHNlZT8=
What's all that line noise? Nothing, really, just a made up secret value that only your server and CloudFront know about. Configure your web server so that if this header and value are not present in the incoming request, then access is denied -- this is a request that didn't pass through CloudFront.
Don't use the above secret, of course... make up your own. It's entirely arbitrary.
Caveat applicable to any GeoIP-restricting strategy: it isn't perfect. CloudFront claims 99.8% accuracy.
The most reliable way to implement geographic IP restrictions is to use a geographic location database or service API, and implement it at the application level.
For example, for a web site in practically any language, it is very simple to add a test at the start of each page request that compares the client IP against the geo IP database or service, and handle the response from there.
At the application level, it is easier to manage the countries you accept/deny, and log those events, as needed, than at the network level.
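A per-request test of that kind could be sketched in Python as follows. The two-row table is a hypothetical stand-in for an imported geo IP database, and the addresses are documentation-reserved placeholders; a real deployment would query the actual database or service instead.

```python
import ipaddress
from typing import Optional

# Hypothetical rows standing in for an imported geo IP database.
GEO_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "US"),
    (ipaddress.ip_network("198.51.100.0/24"), "DE"),
]
DENIED_COUNTRIES = {"DE"}  # hypothetical deny list


def lookup_country(client_ip: str) -> Optional[str]:
    """Return the country code for client_ip, or None if it is not in the table."""
    addr = ipaddress.ip_address(client_ip)
    for network, country in GEO_TABLE:
        if addr in network:
            return country
    return None


def should_serve(client_ip: str) -> bool:
    """Serve the page unless the client's country is on the deny list."""
    return lookup_country(client_ip) not in DENIED_COUNTRIES
```

In this sketch an address that is not in the table is served; flipping that default to a denial is a one-line change, and either branch is a natural place to log the event.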
IP-based geolocation data is generally reliable, and there are many sources for this data. While you may trust AWS for many things, I do think that there are many reliable third-party sources for geo IP data that focus on this data.
freegeoip.net provides a public HTTP API to search the geolocation of IP addresses. You're allowed up to 10,000 queries per hour.
ip2location.com LITE is a free IP geolocation database for personal or commercial use.
If your application uses a database, these geo databases are quite easy to import and reference in your app.
I have a post explaining in detail how to whitelist / blacklist locations with Route53: https://www.devpanda.me/2017/10/07/DNS-Blacklist-of-locations-countries-using-AWS-Route53/.
As for your ELB being exposed to the public, that shouldn't be a problem, since the Host header on any request sent to the ELB's own hostname over port 80 / 443 won't match your domain name, which means most web servers will return a 404 or similar.
There is a way to do this using AWS WAF.
When creating the web ACL, you can select ELB as the resource type to associate it with.
Select your ELB and create conditions such as Geo Match, IP Address, etc.
You can also update these at any time if anything changes in the future.
Thanks