I configured a CDN in front of my website. Everything works well when you access the website via my custom DNS name or the CDN's DNS name. My problem is when I want to use an IP instead of DNS.
When I do an nslookup on my CDN DNS name, I get a list of IPs. If I grab one IP address from there and try to access the website, I get a 403 Forbidden response.
Why is the CDN only accepting DNS requests and not IPs?
What if I have a proxy in front of my CDN and try to access my website by using the proxy IP? How can I access the website using the proxy IP, which points to the CDN?
It's a weird requirement and time-consuming; I've been looking for the correct answer, but no one seems to show a solution.
Cheers!
Why is the CDN only accepting DNS requests and not IPs?
CloudFront is not designed to work this way. It is a massive, globally-distributed system. When you look up the IP addresses of your CloudFront distribution, you are receiving the list of addresses where CloudFront expects to receive traffic:
for your web site, and
for potentially hundreds or thousands of other sites, and
from browsers in the same geographic area as you
You need a way to identify which distribution you expect CloudFront to use when processing your request.
In HTTP mode, this uses the Host: HTTP header, sent by the browser. In HTTPS mode, this uses the TLS SNI value and the Host: header.
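To make the role of the Host header concrete, here is a minimal sketch (Python; the hostname is the example distribution name used later in this answer) of the raw request a client or proxy must send when it dials one of the resolved IPs directly -- CloudFront routes on the Host value, not on the address you connected to:

```python
# Sketch: build the raw HTTP/1.1 request a proxy would need to send when
# connecting to a CloudFront IP directly. The hostname and path are
# illustrative placeholders, not real values.
def build_request(path: str, host: str) -> bytes:
    """Return a minimal HTTP/1.1 GET request that names the distribution."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",   # CloudFront selects the distribution from this
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

req = build_request("/", "dzczcexample.cloudfront.net")
```

Send this to any of the distribution's IPs and CloudFront can identify your site; omit the Host header (or send a bare IP in it) and you get the 403 described in the question.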
If you were using a proxy to access CloudFront, you would need the proxy to inject a Host header for HTTP, and to set the SNI correctly as well for HTTPS.
In HAProxy, for example, set the Host header, overwriting any such header that's already present:
http-request set-header Host dzczcexample.cloudfront.net
Of course, you might use any one of the Alternate Domain Name values configured for your distribution, as well.
For SNI:
backend my-cloudfront-backend
server my-cloudfront dzczcexample.cloudfront.net:443 ssl verify none sni str(dzczcexample.cloudfront.net)
(Source: https://serverfault.com/a/830327/153161)
But this is only a minimum baseline working configuration, because CloudFront has features that this simple setup overlooks.
As noted above, CloudFront is returning a list of IP addresses that should be used to access (1) your site, (2) from where you are, (3) right now. The list of addresses can and will vary. CloudFront appears to be able to dynamically manage and distribute its workload and mitigate DDoS by moving traffic from one set of servers to another, from one edge location to another, etc., by modifying the DNS responses... so your proxy needs to be using the multiple addresses returned, and needs to be refreshing its DNS values so that it always connects to where CloudFront wishes for it to connect, for optimum behavior and performance.
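A proxy that honors this needs to keep re-resolving the distribution's name and use every address returned, not just the first one it saw at startup. A minimal sketch of that resolution step (Python standard library only):

```python
import socket

def resolve_all(hostname: str, port: int = 443) -> set:
    """Return every address currently advertised for hostname.

    A proxy in front of CloudFront should call this periodically (roughly
    on the DNS TTL) and spread its connections across all returned
    addresses, rather than pinning a single IP resolved once at startup.
    """
    return {
        info[4][0]
        for info in socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    }
```

HAProxy can do the equivalent natively with a `resolvers` section and `server ... resolvers` options, so the backend addresses track DNS instead of being frozen at parse time.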
Also, don't overlook the fact that a proxy server will connect to CloudFront via an edge near the proxy, not near the browser, so this is not something you would routinely use in production, though it absolutely does have some valid use cases. (I have used HAProxy on both sides of CloudFront for several years, for certain applications -- some of which have now been obviated by Lambda@Edge, but I digress.)
It's a weird requirement
Not really. Name-based virtual hosting has been the standard practice for many years. It is -- in my opinion -- almost an accident of history that when you set up a web server, it will commonly also respond to requests that carry its IP address in the Host header. A well-configured web server will not do this: if you (the web browser) don't know what host you are asking for and are just sending a request to my IP, then I (the web server) should tell you I have no idea what you want from me, because you are more likely than not arriving for malicious reasons, for benign but annoying reasons (scanning), or as the result of a misconfiguration. You also don't want search engine spiders finding your content at an IP address. Bad for listings, bad for SEO.
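The well-configured behavior described above boils down to a dispatch on the Host header. A sketch (Python; the hostnames and site labels are made up for illustration):

```python
# Sketch of name-based virtual hosting: dispatch on the Host header and
# refuse requests that don't name a site we actually serve. Hostnames and
# site labels here are illustrative placeholders.
VHOSTS = {
    "www.example.com": "site-a",
    "api.example.com": "site-b",
}

def dispatch(host_header: str):
    """Return (status, body) for a request carrying this Host header."""
    # Normalize: Host may carry a port ("example.com:443") and mixed case.
    name = host_header.lower().split(":")[0]
    site = VHOSTS.get(name)
    if site is None:
        # Bare-IP or unknown-host requests: likely scanners or misconfiguration.
        return 403, "Forbidden"
    return 200, site
```

A request for `www.example.com` is served; a request addressed to a bare IP (which puts the IP in the Host header) is refused, which is exactly the 403 the original question observed from CloudFront.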
The problem I have with this proxy is that I cannot configure the headers. I was thinking of adding another proxy where I have control and then modifying the headers with the values I want. But this is not really a good solution; it's like jumping from one proxy to another.
I think I should rely more on DNS and hostnames rather than IPs, which is fine with me; I prefer using a proper DNS name.
Thanks for your thorough explanation; you clarified a lot of things.
Related
I have a website that depends on multiple auxiliary services, to which it makes http requests. (These services e.g. provide information from data probes to update the website via ajax in real-time, and are written in a different language to the website so that I can't use RMI or similar.)
I'm trying to secure the website with SSL so that it displays as secure in browsers, but due to the http requests to the auxiliary services I'm getting mixed content warnings. I'm hosting on AWS which only seems to allow https requests to the single port 443, on which I have my website itself listening; how can I set things up so that I can access my auxiliary services securely if I need them to listen on a different port to the website?
EDIT: I should add that this is for our test website, so there's no load balancing enabled...
This seems very easily solved with CloudFront.
Forget all about caching and the fact that CloudFront is a CDN. It isn't just those things. It has a number of other useful tricks up its sleeve.
CloudFront is also a reverse proxy that can route requests to multiple destinations ("origin servers"), including multiple ports on a single instance, based on the request path (cache behaviors)... meanwhile, the browser thinks everything is coming from a single web site and is speaking HTTPS to CloudFront, yet CloudFront can optionally speak ordinary HTTP to the back-end services if that is a security policy you allow. A single CloudFront distribution can have up to 25 path mappings (including the default catchall mapping for the main site), referencing up to 25 different backend IP/Hostname+port combinations.
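A simplified sketch of that path-based routing (Python; CloudFront actually evaluates its cache-behavior path patterns in order, with a default catch-all, and all host/port values below are illustrative):

```python
# Simplified sketch of CloudFront cache-behavior routing: each path pattern
# maps to an origin (host, port). Patterns are checked in order; the final
# "*" entry is the default behavior. All names and ports are placeholders.
from fnmatch import fnmatch

BEHAVIORS = [
    ("/api/*",    ("backend.internal", 8081)),  # auxiliary service 1
    ("/probes/*", ("backend.internal", 8082)),  # auxiliary service 2
    ("*",         ("www.internal", 80)),        # default: the main site
]

def route(path: str):
    """Return the origin (host, port) CloudFront would forward this path to."""
    for pattern, origin in BEHAVIORS:
        if fnmatch(path, pattern):
            return origin
    return None  # unreachable while a "*" default is present
```

The browser sees one HTTPS site; the distribution quietly fans requests out to different back-end ports, which is what resolves the mixed-content problem here.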
I’m using an Amazon EC2 instance to host my web application. The EC2 instance is in the APAC region. I wanted to use an SSL certificate from Amazon Certificate Manager.
For the above scenario, I have to go for either Elastic Load Balancing option or CloudFront.
Since my instance is in APAC region, I cannot go for Elastic Load Balancing, as load balancing is available only for instances in US East (N. Virginia) region.
The other option is to go for CloudFront. CloudFront option would have been easier if I was hosting my web application using Amazon S3 bucket. But I’m using an EC2 instance.
I requested and got an ACM certificate in the US East (N. Virginia) region.
I went ahead with CloudFront and gave my domain name (example.com) in the origin field; in the origin path, I gave the location of the application directory (/application); and I filled in the HTTP and HTTPS ports.
When the CloudFront distribution was deployed, I could only see the default self-signed certificate for the web application, and not the ACM certificate.
Your comments and suggestions are welcome to solve this issue. Thank you.
I went ahead with CloudFront, and gave in my domain name (example.com) in the origin field,
This is incorrect. The origin needs to be a hostname that CloudFront can use to contact the EC2 instance. It can't be your domain name, because once you finish this setup, your domain name will point directly at CloudFront, so CloudFront can't use this to access the instance.
Here, use the public DNS hostname for your instance, which you'll find in the console. It will look something like ec2-x-x-x-x.aws-region.compute.amazonaws.com.
in the origin path; I gave the location of the application directory (/application),
This, too, is incorrect. The origin path should be left blank. The origin path is a string you want CloudFront to prepend to every request. If you set it to /foo and the browser requests /bar, then your web server will see the request come in for the page /foo/bar. That's probably not what you want.
and filled in the http and https ports.
Here, you will need to set the origin protocol policy to HTTP Only. CloudFront will not make a back-end connection to your server using HTTPS unless you have a certificate on the server that is valid and not self-signed. The connection between the browser and CloudFront can still be HTTPS, but without a valid certificate on the instance, CloudFront will refuse to make an HTTPS connection on the back side.
Also, under the Cache Behaviors, you will need to configure CloudFront to either forward all request headers to the origin server (which also disables caching, so you may not want this) or you at least need to whitelist the Host: header so your origin server recognizes the request. Add any other headers you need to see, such as Referer.
Enable query string forwarding if you need it. Otherwise CloudFront will strip ?your=query&strings=off_the_requests and your server will never see them.
If your site uses cookies, configure the cookies you need CloudFront to forward, or forward all cookies.
That should have your CloudFront distribution configured, but it is not yet live on your site.
When the CloudFront distribution was deployed,
This only means that CloudFront has deployed your settings to all of its edge locations around the world, and is ready for traffic, not that it is actually going to receive any.
I could only see the default self-signed certificate for the web application, and not the ACM certificate.
Right, because you didn't actually change the DNS for "example.com" to point to CloudFront instead of to your web server.
Once the distribution is ready, you need to send traffic to it. In Route 53, find the A record for your site, which will have the EC2 instance's IP address in the box, and the "Alias" radio button set to "No." Change this to Yes, and then select the CloudFront distribution from the list of alias targets that appears. Save the changes.
Now... after the old DNS entry's time to live (TTL) timer expires, close your browser (all browser windows), close your eyes, cross your fingers, open your eyes, open your browser, and hit your site.
...which should be served via CloudFront, with the ACM certificate.
This probably sounds complicated, but it should be something you can do in far less time than it took me to type this all out.
Elastic Load Balancing is available in all regions; the assumption that it is available only in US East is wrong. Check it out -- maybe this alone solves your issue.
As for SSL termination, you can enable the service on the ELB.
If you are running a single node, you can terminate SSL on the web server itself, which is a cheaper solution.
Firstly, thank you so much for taking the time to help me with this query. I proceeded with your suggestions.
'Also, under the Cache Behaviors, you will need to configure CloudFront to either forward all request headers to the origin server (which also disables caching, so you may not want this) or you at least need to whitelist the Host: header so your origin server recognizes the request. Add any other headers you need to see, such as Referer.'
I don't know what you mean by whitelisting the host. Under the whitelist box, what value do I have to give?
Since I wasn't sure about whitelisting the header, I proceeded with forwarding all headers. And I forwarded all cookies.
In the origin settings, I don’t know what to give in the header name and header value. So, I gave the header name as ‘header1’, and value as my domain name ‘www(.)example(.)com’.
I have made the DNS change in Route 53 as you have suggested.
Now when I click www(.)example(.)com, I’m able to see https://www.example(.)com with a valid ACM certificate.
However, when I tried to access my application, https://www(.)example(.)com/application, the webpage is navigating to https://ec2-x-x-x-x.ap-southeast-1.compute.amazonaws.com/application/, and it’s showing the self-signed cert again.
I’m guessing there is some problem with DNS configuration in Amazon Route 53. Can you please tell me what changes do I have to do so that when I hit my application I can see a valid certificate?
Also, when I hit my application, my URL changes to show ec2-x-x-x-x instead of my domain name. Can you please tell me how to correct that?
Thank you so much.
Does AWS allow the use of CloudFront for websites, e.g. caching web pages?
The website should be accessible within the corporate VPN only. Is it a good idea to cache web pages on CloudFront when the application is restricted to one network?
As @daxlerod points out, it is possible to use the relatively new Web Application Firewall service with CloudFront, to restrict access to the content, for example, by IP address ranges.
And, of course, there's no requirement that the web site actually be hosted inside AWS in order to use CloudFront in front of it.
However, "will it work?" and "are all the implications of the required configuration acceptable from a security perspective?" are two different questions.
In order to use CloudFront on a site, the origin server (the web server where CloudFront fetches content that isn't in the cache at the edge node where the content is being requested) has to be accessible from the Internet, in order for CloudFront to connect to it, which means your private site has to be exposed, at some level, to the Internet.
The CloudFront IP address ranges are public information, so you could partially secure access to the origin server with the origin server's firewall, but this only prevents access from anywhere other than through CloudFront -- and that isn't enough, because if I knew the name of your "secured" server, I could create my own CloudFront distribution and access it through CloudFront, since the IP addresses would be in the same range.
The mechanism CloudFront provides for ensuring that requests came from and through an authorized CloudFront distribution is custom origin headers, which allows CloudFront to inject an unknown custom header and secret value into each request it sends to your origin server, to allow your server to authenticate the fact that the request not only came from CloudFront, but from your specific CloudFront distribution. Your origin server would reject requests not accompanied by this header, without explanation, of course.
See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html#forward-custom-headers-restrict-access.
And, of course, you need https between the browser and CloudFront and https between CloudFront and the origin server. It is possible to configure CloudFront to use (or require) https on the front side or the back side separately, so you will want to ensure it's configured appropriately for both, if the security considerations addressed above make it a viable solution for your needs.
For information that is not highly sensitive, this seems like a sensible approach if caching or other features of CloudFront would be beneficial to your site.
Yes -- CloudFront is designed as a caching layer in front of a web site.
If you want to restrict access to CloudFront, you can use the Web Application Firewall service.
Put your website on a public network > in the CloudFront distribution, attach WAF rules > in the WAF rules, whitelist your company's IP range and blacklist everything else.
I am hosting my website using AWS.
The website is on 2 ec2 instances, with a load balancer (ELB) balancing traffic between them.
Currently, I am using my DNS (Route 53) to restrict the access to the website by using Route 53's geolocation routing:
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html#routing-policy-geo
(The geolocation restriction is just to limit the initial release of my website. It is not for security reasons. Meaning the restriction just needs to work for the general public)
This worries me a little because my load balancer is still exposed to access from everywhere. So I am concerned that my load balancer will get indexed by google or something and then people outside of my region will be able to access the site.
Are there any fixes for this? Am I restricting access by location the wrong way? Is there a way perhaps to specify in the ELB's security group that it only receive inbound traffic from my DNS (of course then I would also have to specify that inbound traffic from edge locations be allowed as well for my static content but this is not a problem)?
Note: There is an option when selecting inbound rules for a security group, under "type" to select "DNS(UDP)" or "DNS(TCP)". I tried adding two rules for both DNS types (and IP Address="anywhere") for my ELB but this did not limit access to the ELB to be solely through my DNS.
Thank you.
The simple solution, here, is found in CloudFront. Two solutions, actually:
CloudFront can use its GeoIP database to do the blocking for you...
When a user requests your content, CloudFront typically serves the requested content regardless of where the user is located. If you need to prevent users in specific countries from accessing your content, you can use the CloudFront geo restriction feature[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/georestrictions.html
You can configure CloudFront with which countries are allowed, or which are denied. You can also configure static pages, stored in S3, which are displayed to denied users. (You can also configure static custom error pages for other CloudFront errors that might occur, and store those pages in S3 as well, where CloudFront will fetch them if it ever needs them).
...or...
CloudFront can pass the location information back to your server using the CloudFront-Viewer-Country: header, and your application code, based on the contents accompanying that header, can do the blocking. The incoming request looks something like this (some headers munged or removed for clarity):
GET / HTTP/1.1
Host: example.com
X-Amz-Cf-Id: 3fkkTxKhNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
Via: 1.1 cb76b079000000000000000000000000.cloudfront.net (CloudFront)
CloudFront-Viewer-Country: US
CloudFront-Forwarded-Proto: https
Accept-Encoding: gzip
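The application-side check driven by that header is then a few lines. A sketch (Python; the allowed-country set is illustrative):

```python
# Sketch: application-side blocking driven by the CloudFront-Viewer-Country
# header. The allowed set is an illustrative placeholder.
ALLOWED_COUNTRIES = {"US", "CA"}

def check_viewer(headers: dict) -> int:
    """Return the HTTP status to serve for a request with these headers."""
    country = headers.get("CloudFront-Viewer-Country")
    if country is None:
        # Header absent: the request didn't come through CloudFront, or the
        # header isn't whitelisted for forwarding. Choose your own policy.
        return 403
    return 200 if country in ALLOWED_COUNTRIES else 403
```

Because CloudFront caches per country once the header is whitelisted, both the allowed and denied responses produced this way cache correctly at the edge.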
CloudFront caches the responses against the combination of the requested page and the viewer's country, and any other whitelisted headers, so it will correctly cache your denied responses as well as your allowed responses, independently.
Here's more about how you enable the CloudFront-Viewer-Country: header:
If you want CloudFront to cache different versions of your objects based on the country that the request came from, configure CloudFront to forward the CloudFront-Viewer-Country header to your origin. CloudFront automatically converts the IP address that the request came from into a two-letter country code.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html#header-caching-web-location
Or, of course, you can enable both features, letting CloudFront do the blocking, while still giving your app a heads-up on the country codes for the locations that were allowed through.
But how do you solve the issue with the fact that your load balancer is still open to the world?
CloudFront has recently solved this one, too, with Custom Origin Headers. These are secret custom headers sent to your origin server, by CloudFront, with each request.
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront[...]
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
So, let's say you added a custom header to CloudFront:
X-Yes-This-Request-Is-Legit: TE9MIHdoYXQgd2VyZSB5b3UgZXhwZWN0aW5nIHRvIHNlZT8=
What's all that line noise? Nothing, really, just a made up secret value that only your server and CloudFront know about. Configure your web server so that if this header and value are not present in the incoming request, then access is denied -- this is a request that didn't pass through CloudFront.
Don't use the above secret, of course... make up your own. It's entirely arbitrary.
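On the origin side, the check is a single comparison -- done in constant time so the secret can't be guessed through timing. A sketch (Python; header name and secret reuse the made-up values above):

```python
import hmac

# Sketch: origin-side verification of a CloudFront custom origin header.
# The header name and secret are the made-up example values from the text;
# hmac.compare_digest gives a constant-time comparison.
SECRET = "TE9MIHdoYXQgd2VyZSB5b3UgZXhwZWN0aW5nIHRvIHNlZT8="

def came_through_cloudfront(headers: dict) -> bool:
    """True only if the request carries the shared secret header."""
    supplied = headers.get("X-Yes-This-Request-Is-Legit", "")
    return hmac.compare_digest(supplied, SECRET)
```

Requests failing this check would get a terse 403 with no explanation, as suggested earlier in the thread.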
Caveat applicable to any GeoIP-restricting strategy: it isn't perfect. CloudFront claims 99.8% accuracy.
The most reliable way to implement geographic IP restrictions is to use a geographic location database or service API and implement it at the application level.
For example, for a web site in practically any language, it is very simple to add a test at the start of each page request, compare the client IP against the geo IP database or service, and handle the response from there.
At the application level, it is easier to manage the countries you accept/deny, and to log those events as needed, than at the network level.
IP-based geolocation data is generally reliable, and there are many sources for it. While you may trust AWS for many things, I do think there are many reliable third-party sources that focus specifically on geo IP data.
freegeoip.net provides a public HTTP API to search the geolocation of IP addresses. You're allowed up to 10,000 queries per hour.
ip2location.com LITE is a free IP geolocation database for personal or commercial use.
If your application uses a database, these geo databases are quite easy to import and reference in your app.
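An application-level lookup against such an imported database amounts to a longest-prefix check per request. A sketch (Python standard library; the two ranges are documentation prefixes used purely as placeholders, not real geo data):

```python
import ipaddress

# Sketch of an application-level geo check against a locally imported
# database. Real deployments would load ranges from e.g. an IP2Location
# import; the networks below are RFC 5737 documentation prefixes used as
# placeholders.
GEO_DB = [
    (ipaddress.ip_network("192.0.2.0/24"), "US"),
    (ipaddress.ip_network("198.51.100.0/24"), "DE"),
]

def country_of(client_ip: str):
    """Return the two-letter country code for an IP, or None if unknown."""
    addr = ipaddress.ip_address(client_ip)
    for net, country in GEO_DB:
        if addr in net:
            return country
    return None
```

A production table would be indexed (e.g. stored as integer ranges in SQL) rather than scanned linearly, but the accept/deny decision at the top of each request looks the same.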
I have a post explaining in detail how to whitelist / blacklist locations with Route53: https://www.devpanda.me/2017/10/07/DNS-Blacklist-of-locations-countries-using-AWS-Route53/.
In terms of your ELB being exposed to the public: that shouldn't be a problem, since the Host header on any request to the ELB over port 80/443 won't match your domain name, which means most web servers will return a 404 or similar.
There is a way using AWS WAF.
You can select the resource type to associate with the web ACL as ELB.
Select your ELB and create conditions like Geo Match, IP Address, etc.
You can also update it anytime if anything changes in the future.
Thanks
I was starting to set up two regions for my Heroku app, then distribute the workload with the Amazon Route 53 GeoDNS service.
Solution 1 fail
a = api.mydomain.com, Europe, myApp-EU.herokuapp.com
b = api.mydomain.com, US, myApp-US.herokuapp.com
a,b fail:
since Heroku doesn't know "api.mydomain.com".
Solution 2 fail
a = api.mydomain.com, Europe, CNAME api-eu.mydomain.com
b = api.mydomain.com, US, CNAME api-us.mydomain.com
c = api-eu.mydomain.com, Europe, myApp-EU.herokuapp.com
d = api-us.mydomain.com, US, myApp-US.herokuapp.com
c,d work since Heroku knows "api-eu.mydomain.com".
a,b don't work since Heroku doesn't know "api.mydomain.com".
At this point I would conclude that this is not possible with Heroku?
In an http/s environment, two factors have to come together. (It is apparent from the question that you understand these two issues, but please indulge me while I clarify).
The request hostname must be resolvable to an endpoint IP address, either directly, with a DNS A record, or indirectly, through CNAME records, and
The destination must be expecting requests with the original hostname in the Host: http header.
Heroku evidently understands the concept that an app should support a custom hostname, e.g., eu-app.example.com or api.example.com.
If their system allowed you to assign the same host/domain name to both apps, the issue solves itself, because you'd configure the geographic DNS maps to the appropriate regional CNAME targets and it would work as expected. Since you have asked this question, I assume this is not the case.
Depending on how useful it is to you to host on Heroku, given this apparent issue, a simple and relatively low cost solution would be an EC2 instance in each destination region, whose sole purpose is to receive those geographically routed requests, rewrite the Host: header to what Heroku expects to see, and reverse proxy the request.
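The rewrite step such a proxy performs is simple to state. A sketch (Python; the vanity-to-platform hostname mapping reuses the names from the question and is illustrative):

```python
# Sketch of the Host-rewriting step a regional reverse proxy performs:
# requests arrive for the vanity hostname, but are forwarded upstream with
# the hostname the platform actually recognizes. Mapping is illustrative,
# reusing the hostnames from the question; each region's proxy would map
# to its own regional app.
REWRITES = {
    "api.mydomain.com": "myApp-EU.herokuapp.com",
}

def rewrite_host(headers: dict) -> dict:
    """Return a copy of the request headers with Host rewritten if mapped."""
    out = dict(headers)
    upstream = REWRITES.get(out.get("Host", "").lower())
    if upstream:
        out["Host"] = upstream
    return out
```

In HAProxy this is the one-liner shown earlier in the thread, `http-request set-header Host myApp-EU.herokuapp.com`, applied in the backend for that region.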
I have a similar application where I have a static site with one hostname, stored in an S3 bucket with a different name. For static web site hosting, S3 requires that the bucket name equal the value sent in the Host: header by the browser, which prevents a CNAME from a different hostname from working, for exactly the same reason you are seeing here.
The proxy server runs HAProxy on a t2.micro instance, < $10/month, and serves many thousands of hits each day, barely using any CPU on the t2.micro (my credit balance is always over 100), since HAProxy is very well written and lean on resource usage. Since the EC2 machine is in the same region as the S3 bucket, the additional latency added to each request is insignificant. As an additional bonus, of course, I get real-time logging, header capture, and an external perspective on the performance and behavior of the back-end service. Nginx or Varnish or several other proxy servers could probably be used for the same purpose. You would put these in the same regions where your application is deployed, though for one region you could skip it if the Heroku platform were expecting your api.example.com directly; you'd technically only need this in one location.
The reason this works is that HAProxy doesn't care what the incoming Host: header contains, unless you configure it to make routing decisions based on that header.
You can also load your api.example.com SSL certificate on all of your HAProxy instances, and it will terminate the SSL and forward the request using a new SSL connection to the endpoint, which can be using any valid SSL cert, including any wildcard/non-vanity cert the destination endpoint may support.
After looking over different solutions, I figured out how to handle it.
Akamai and Fastly are the two CDN services that allow you to override the HTTP Host header. This way Heroku knows the request is for "api-eu.mydomain.com", since Akamai is able to override the "api.mydomain.com" Host header.