I am hosting my website using AWS.
The website is on 2 ec2 instances, with a load balancer (ELB) balancing traffic between them.
Currently, I am using my DNS (Route 53) to restrict the access to the website by using Route 53's geolocation routing:
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html#routing-policy-geo
(The geolocation restriction is just to limit the initial release of my website. It is not for security reasons. Meaning the restriction just needs to work for the general public)
This worries me a little because my load balancer is still exposed to access from everywhere. So I am concerned that my load balancer will get indexed by google or something and then people outside of my region will be able to access the site.
Are there any fixes for this? Am I restricting access by location the wrong way? Is there a way perhaps to specify in the ELB's security group that it only receive inbound traffic from my DNS (of course then I would also have to specify that inbound traffic from edge locations be allowed as well for my static content but this is not a problem)?
Note: There is an option when selecting inbound rules for a security group, under "type" to select "DNS(UDP)" or "DNS(TCP)". I tried adding two rules for both DNS types (and IP Address="anywhere") for my ELB but this did not limit access to the ELB to be solely through my DNS.
Thank you.
The simple solution, here, is found in CloudFront. Two solutions, actually:
CloudFront can use its GeoIP database to do the blocking for you...
When a user requests your content, CloudFront typically serves the requested content regardless of where the user is located. If you need to prevent users in specific countries from accessing your content, you can use the CloudFront geo restriction feature[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/georestrictions.html
You can configure CloudFront with which countries are allowed, or which are denied. You can also configure static pages, stored in S3, which are displayed to denied users. (You can also configure static custom error pages for other CloudFront errors that might occur, and store those pages in S3 as well, where CloudFront will fetch them if it ever needs them).
...or...
CloudFront can pass the location information back to your server using the CloudFront-Viewer-Country: header, and your application code, based on the contents accompanying that header, can do the blocking. The incoming request looks something like this (some headers munged or removed for clarity):
GET / HTTP/1.1
Host: example.com
X-Amz-Cf-Id: 3fkkTxKhNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
Via: 1.1 cb76b079000000000000000000000000.cloudfront.net (CloudFront)
CloudFront-Viewer-Country: US
CloudFront-Forwarded-Proto: https
Accept-Encoding: gzip
CloudFront caches the responses against the combination of the requested page and the viewer's country, and any other whitelisted headers, so it will correctly cache your denied responses as well as your allowed responses, independently.
Here's more about how you enable the CloudFront-Viewer-Country: header:
If you want CloudFront to cache different versions of your objects based on the country that the request came from, configure CloudFront to forward the CloudFront-Viewer-Country header to your origin. CloudFront automatically converts the IP address that the request came from into a two-letter country code.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html#header-caching-web-location
Or, of course, you can enable both features, letting CloudFront do the blocking, while still giving your app a heads-up on the country codes for the locations that were allowed through.
But how do you solve the issue with the fact that your load balancer is still open to the world?
CloudFront has recently solved this one, too, with Custom Origin Headers. These are secret custom headers sent to your origin server, by CloudFront, with each request.
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
So, let's say you added a custom header to CloudFront:
X-Yes-This-Request-Is-Legit: TE9MIHdoYXQgd2VyZSB5b3UgZXhwZWN0aW5nIHRvIHNlZT8=
What's all that line noise? Nothing, really, just a made up secret value that only your server and CloudFront know about. Configure your web server so that if this header and value are not present in the incoming request, then access is denied -- this is a request that didn't pass through CloudFront.
Don't use the above secret, of course... make up your own. It's entirely arbitrary.
Caveat applicable to any GeoIP-restricting strategy: it isn't perfect. CloudFront claims 99.8% accuracy.
The most reliable way to implement geographic IP restrictions, is to use a geographic location database or service API, and implement it at the application level.
For example, for a web site in practically any language, it is very simple to add test at the start of each page request, and compare the client IP against the geo ip database or service, and handle the response from there.
At the application level, it is easier to manage the countries you accept/deny, and log those events, as needed, than at the network level.
IP based geo location data is generally reliable, and there are many sources for this data. While you may trust AWS for many things, I do think that there are many reliable 3rd party sources for geo IP dat, that focus on this data.
freegeoip.net provides a public HTTP API to search the geolocation of IP addresses. You're allowed up to 10,000 queries per hour.
ip2location.com LITE is a free IP geolocation database for personal or commercial use.
If your application uses a database, these geo databases are quite easy to import and reference in your app.
I have a post explaining in detail how to whitelist / blacklist locations with Route53: https://www.devpanda.me/2017/10/07/DNS-Blacklist-of-locations-countries-using-AWS-Route53/.
In terms of your ELB being exposed to public that shouldn't be a problem since the Host header on any requests to the ELB over port 80 / 443 won't match your domain name, which means for most web servers a 404 will be returned or similar.
There is a way using AWS WAF
You can select - Resource type to associate with web ACL as ELB.
Select your ELB and create conditions like Geo Match, IP Address etcetera.
You can also update anytime if anything changes in future.
Thanks
Related
I configured CDN in front of my website. Everything works well when you access the website via my custom DNS or CDN DNS. My problem is when I want to use an IP instead of DNS.
When I do a nslookup on my CDN DNS name, I get a list of IP's. If I grab one IP address from there and try to access the website, I get a 403 Forbidden request.
Why is CDN only accepting DNS request and not IP's?
What if I have a proxy in front of my CDN and try to access my website by using the proxy IP, how can I access the website using the proxy IP which points to CDN?
It's a wired requirement and time consuming, I've been looking for the correct answer. No one seems to show a solution.
Cheers!
Why is CDN only accepting DNS request and not IP's?
CloudFront is not designed to work this way. It is a massive, globally-distributed system. When you look up the IP addresses of your CloudFront distribution, you are receiving the list of addresses where CloudFront expects to receive traffic:
for your web site, and
for potentially hundreds or thousands of other sites, and
from browsers in the same geographic area as you
You need a way to identify which distribution you expect CloudFront to use when processing your request.
In HTTP mode, this uses the Host: HTTP header, sent by the browser. In HTTPS mode, this uses the TLS SNI value and the Host: header.
If you were using a proxy to access CloudFront, you would need for the proxy to inject a Host header for HTTP and to set the SNI correctly, too, for HTTPS.
In HAProxy, for example, set the host header, overwriting any such header that's already present.
http-request set-header Host dzczcexample.cloudfront.net
Of course, you might use any one of the Alternate Domain Name values configured for your distribution, as well.
For SNI:
backend my-cloudfront-backend
server my-cloudfront dzczcexample.cloudfront.net:443 ssl verify none sni str(dzczcexample.cloudfront.net)
(Source: https://serverfault.com/a/830327/153161)
But this is only a minimum baseline working configuration, because CloudFront has features that this simple setup overlooks.
As noted above, CloudFront is returning a list of IP addresses that should be used to access (1) your site, (2) from where you are, (3) right now. The list of addresses can and will vary. CloudFront appears to be able to dynamically manage and distribute its workload and mitigate DDoS by moving traffic from one set of servers to another, from one edge location to another, etc., by modifying the DNS responses... so your proxy needs to be using the multiple addresses returned, and needs to be refreshing its DNS values so that it always connects to where CloudFront wishes for it to connect, for optimum behavior and performance.
Also, don't overlook the fact that a proxy server will connect to CloudFront via an edge near the proxy, not near the browser, so this is not something you would routinely use in production, though it absolutely does have some valid use cases. (I have used HAProxy on both sides of CloudFront for several years, for certain applications -- some of which have now been obviated by Lambda#Edge, but I digress).
It's a wired [weird?] requirement
Not really. Name-based virtual hosting has been the standard practice for many years. It is -- in my opinion -- almost an accident of history that when you set up a web server, it will commonly respond on the IP address in the Host header, as well. A well-configured web server will not do this -- if you (the web browser) don't know what host you are asking for and are just sending a request to my IP, then I (the web server) should tell you I have no idea what you want from me, because you are more likely than not to be arriving for malicious reasons, or benign but annoying reasons (scanning), or as the result of a misconfiguration. You also don't want search engine spiders finding your content at an IP address. Bad for listings, bad for SEO.
The problem I have with this proxy is I cannot configure the headers. I was thinking to add another proxy where I have control and then modify the headers with the values I want. But this is not really a good solution, is like jumping from one proxy to another.
I think I should rely more on DNS and hostnames, rather than IP's. Which is fine with me, I prefer using proper DNS name.
Thanks for your thorough explanation, you clarify a lot of things.
I need to use AWS WAF for my web application hosted on AWS to provide additional rule based security to it. I couldnt find any way to directly use WAF with ELB and WAF needs Cloudfront to add WEB ACL to block actions based on rules.
So, I added my Application ELB CNAME to cloudfront, only the domain name, WebACL with an IP block rule and HTTPS protocol was updated with cloudfront. Rest all has been left default. once both WAF and Cloudfront with ELB CNAME was added, i tried to access the CNAME ELB from one of the ip address that is in the block ip rule in WAF. I am still able to access my web application from that IP address. Also, I tried to check cloudwatch metrics for Web ACL created and I see its not even being hit.
First, is there any good way to achieve what I am doing and second, is there a specific way to add ELB CNAME on cloudfront.
Thanks and Regards,
Jay
Service update: The orignal, extended answer below was correct at the time it was written, but is now primarily applicable to Classic ELB, because -- as of 2016-12-07 -- Application Load Balancers (elbv2) can now be directly integrated with Web Application Firewall (Amazon WAF).
Starting [2016-12-07] AWS WAF (Web Application Firewall) is available on the Application Load Balancer (ALB). You can now use AWS WAF directly on Application Load Balancers (both internal and external) in a VPC, to protect your websites and web services. With this launch customers can now use AWS WAF on both Amazon CloudFront and Application Load Balancer.
https://aws.amazon.com/about-aws/whats-new/2016/12/AWS-WAF-now-available-on-Application-Load-Balancer/
It seems like you do need some clarification on how these pieces fit together.
So let's say your actual site that you want to secure is app.example.com.
It sounds as if you have a CNAME elb.example.com pointing to the assigned hostname of the ELB, which is something like example-123456789.us-west-2.elb.amazonaws.com. If you access either of these hostnames, you're connecting directly to the ELB -- regardless of what's configured in CloudFront or WAF. These machines are still accessible over the Internet.
The trick here is to route the traffic to CloudFront, where it can be firewalled by WAF, which means a couple of additional things have to happen: first, this means an additional hostname is needed, so you configure app.example.com in DNS as a CNAME (or Alias, if you're using Route 53) pointing to the dxxxexample.cloudfront.net hostname assigned to your distribution.
You can also access your sitr using the assigned CloudFront hostname, directly, for testing. Accessing this endpoint from the blocked IP address should indeed result in the request being denied, now.
So, the CloudFront endpoint is where you need to send your traffic -- not directly to the ELB.
Doesn't that leave your ELB still exposed?
Yes, it does... so the next step is to plug that hole.
If you're using a custom origin, you can use custom headers to prevent users from bypassing CloudFront and requesting content directly from your origin.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
The idea here is that you will establish a secret value known only to your servers and CloudFront. CloudFront will send this in the headers along with every request, and your servers will require that value to be present or else they will play dumb and throw an error -- such as 503 Service Unavailable or 403 Forbidden or even 404 Not Found.
So, you make up a header name, like X-My-CloudFront-Secret-String and a random string, like o+mJeNieamgKKS0Uu0A1Fqk7sOqa6Mlc3 and configure this as a Custom Origin Header in CloudFront. The values shown here are arbitrary examples -- this can be anything.
Then configure your application web server to deny any request where this header and the matching value are not present -- because this is how you know the request came from your specific CloudFront distribution. Anything else (other than ELB health checks, for which you need to make an exception) is not from your CloudFront distribution, and is therefore unauthorized by definition, so your server needs to deny it with an error, but without explaining too much in the error message.
This header and its expected value remains a secret because it will not be sent back to the browser by CloudFront -- it's only sent in the forward direction, in the requests that CloudFront sends to your ELB.
Note that you should get an SSL cert for your ELB (for the elb.example.com hostname) and configure CloudFront to forward all requests to your ELB using HTTPS. The likelihood of interception of traffic between CloudFront and ELB is low, but this is a protection you should consider implenting.
You can optionally also reduce (but not eliminate) most unauthorized access by blocking all requests that don't arrive from CloudFront by only allowing the CloudFront IP address ranges in the ELB security group -- the CloudFront address ranges are documented (search the JSON for blocks designated as CLOUDFRONT, and allow only these in the ELB security group) but note that if you do this, you still need to set up the custom origin header configuration, discussed above, because if you only block at the IP level, you're still technically allowing anybody's CloudFront distribution to access your ELB. Your CloudFront distribution shares IP addresses in a pool with other CloudFront distribution, so the fact that the request arrives from CloudFront is not a sufficient guarantee that it is from your CloudFront distribution. Note also that you need to sign up for change notifications so that if new address ranges are added to CloudFront, then you'll know to add them to your security group.
Does AWS allow usage of Cloudfront for websites usage, eg:- caching web pages.
Website should be accessible within corporate VPN only. Is it a good idea to cache webpages on cloudfront when using Application restricted within one network?
As #daxlerod points out, it is possible to use the relatively new Web Application Firewall service with CloudFront, to restrict access to the content, for example, by IP address ranges.
And, of course, there's no requirement that the web site actually be hosted inside AWS in order to use CloudFront in front of it.
However, "will it work?" and "are all the implications of the required configuration acceptable from a security perspective?" are two different questions.
In order to use CloudFront on a site, the origin server (the web server where CloudFront fetches content that isn't in the cache at the edge node where the content is being requested) has to be accessible from the Internet, in order for CloudFront to connect to it, which means your private site has to be exposed, at some level, to the Internet.
The CloudFront IP address ranges are public information, so you could partially secure access to the origin server with the origin server's firewall, but this only prevents access from anywhere other than through CloudFront -- and that isn't enough, because if I knew the name of your "secured" server, I could create my own CloudFront distribution and access it through CloudFront, since the IP addresses would be in the same range.
The mechanism CloudFront provides for ensuring that requests came from and through an authorized CloudFront distribution is custom origin headers, which allows CloudFront to inject an unknown custom header and secret value into each request it sends to your origin server, to allow your server to authenticate the fact that the request not only came from CloudFront, but from your specific CloudFront distribution. Your origin server would reject requests not accompanied by this header, without explanation, of course.
See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html#forward-custom-headers-restrict-access.
And, of course, you need https between the browser and CloudFront and https between CloudFront and the origin server. It is possible to configure CloudFront to use (or require) https on the front side or the back side separately, so you will want to ensure it's configured appropriately for both, if the security considerations addressed above make it a viable solution for your needs.
For information that is not highly sensitive, this seems like a sensible approach if caching or other features of CloudFront would be beneficial to your site.
Yes, you CloudFront is designed as a caching layer in front of a web site.
If you want to restrict access to CloudFront, you can use the Web Application Firewall service.
Put your website into public network > In CloudFront distribution attach WAF rules > In WAF rules whitelist range of your company's IP's and blacklist everything else
How does cloudfront work with Route53 routing policies?
So as I understand it CF is supposed to route requests to the nearest server, which is in effect the Route53 latency policy. So if you have an R53 hosted zone entry for your CF domain name is this done by default if you leave the routing policy as simple or do you neec to explicitly set this yourself? And if you chose another policy type (failover, geo-location etc) would that overwrite it?
You leave it as simple.
You don't have access to the necessary information to actually configure it yourself -- CloudFront returns an appropriate DNS response based on the location of the requester, from a single, simple DNS record. The functionality and configuration is managed transparently by the logic that powers the cloudfront.net domain, you set it and forget it, because there are no user-serviceable parts inside.
This is true whether you use an A-record Alias or a CNAME.
Any other configuration would not really make sense, because talking of failover or geolocation imply that you'd want to send traffic somewhere other than where CloudFront's algorithm would send it.
Now... there are cases when, behind CloudFront, you might want to use some of Route 53's snazzier options. Let's say you had app servers in multiple regions serving exactly the same content. Latency-based routing for the origin hostname (the one where CloudFront sends cache misses) would allow CloudFront to magically send requests to the app server closest to the CloudFront edge that serves each individual request. This would be unrelated to the routing from the browser to the edge, though.
I am having cname(abc.com) pointed to my elastic IP and need to create three EC2 instances(e.g. Instance1, Instance2, Instance3) for three different applications.
Now I want to achieve following results:
If user hits "abc.com/App1", request should be redirected to Instance1.If user hits "abc.com/App2", request should be redirected to Instance2.If user hits "abc.com/App3", request should be redirected to Instance3.
All these Instances should work independently. And, If any of these goes down, it should not impact others.
We can't use subdomains. I am trying to find out something in ELB.
ELB does not offer path-based routing. All instances connected to an ELB receive a share of incoming requests.
CloudFront, however, does support path-based routing. You can configure each instance as a "custom origin" and configure which path patterns to route to it.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesPathPattern
Granted, this is not the "primary purpose" of CloudFront, but it works quite nicely in this application.
CloudFront is actually a caching reverse proxy CDN service, so if you go this route, you can also potentially relieve your back-end machines of some workload, or you can disable caching entirely by forwarding all the request headers to the origin and returning an appropriate Cache-Control: header from your instances.
A CloudFront distribution can be associated with a domain name in Route 53 in exactly the same way that an ELB can -- using Alias records.
Bonus: you can also easily pluck additional paths and route them directly to S3 to serve up static assets from an S3 bucket.