Does a reverse proxy defeat the geographical benefits of a CDN?

Does a reverse proxy defeat the geographical benefits of a CDN? - amazon-web-services

This question has a great explanation of a reverse proxy, but if all requests go though a single server in front of a CDN, doesn't this defeat the purpose of having many CDN locations?
For context, I have a completely static website on a CDN but need to separate mobile traffic by user agent and handle those requests with a another server.
Edit: the site is hosted on Netlify, which I believe uses Cloudfront
Will routing all requests through a reverse proxy affect performance?

Yes, using a reverse proxy in a single location in front will defeat the purpose of using a CDN.
Your best bet is to use your reverse proxy as your origin in cloudfront and forward the appropriate application headers then use your reverse proxy to decide what page to render based on those headers

Related

Is there some equivalent of x-sendfile or x-accell-redirect for S3?

I'm building an API and for some responses it will stream the content of S3 objects back to the requester. I would prefer to serve the content directly rather than redirect to send a 302 (e.g. to redirect to a cloudfront distro).
The default is that I read the file into the application and then stream it back out.
If I were using apache or nginx with a local file system I could ask the reverse proxy to stream the content directly from disk with X-Sendfile or X-Accel-Redirect.
Is there an AWS-native mechanism for doing this, so I can avoid loading the file into the application and serving back out again?

I’m not entirely sure I understand your scenario correctly, but I’m thinking in the following direction:
Generally, Cloudfront works like a reverse proxy with a cache attached. (Unlike other vendor’s products where you would “deploy on” the CDN.)
You can attach different types of origins to Cloudfront, it has native support for S3 buckets, but basically everything that speaks HTTP can be attached as a custom origin.
So, in the most trivial scenario, you would place your S3 bucket behind the Cloudfront, add an Origin Access Policy (OAI) and a bucket policy which permits the OAI to access your content.
In order to benefit from caching on the Cloudfront edge, you will need to configure it appropriately, otherwise it will just be a proxy. Make sure to set the Cloudfront TTLs for your content. Check how min/max/default TTL work.
But also don’t forget to set headers for your clients to cache (Cache-Control etc); this may save you a lot of money if the same clients need the same content over and over again.
As we know, caching and cach invalidation in particular, are tricky. Make sure to understand how Cloudfront handles caching to not run into problems. For example: cache busting with query parameters does work, but you need to make Cloudfront aware that the query sting is significant.
Now here comes the exciting part: If you need to react dynamically to the request of the client, you have Lambda#Edge and Cloudfront Functions at your disposal.
Lambda#Edge is basically what it says; Lambda functions on the edge. They can work in four modes: Client request, origin request, origin response, client response. Depends what you need to modify; incoming vs. outgoing data and client-Cloudfront vs. Cloudfront-origin communication.
CF Functions are pretty limited (ES5 only, no XHR or anything, only works on viewer request/response) but very cheap at the same time. Check the AWS docs to determine what you need.
FWIW, Cloudfront also supports signed cookies and signed URLs in case you need to restrict the content to particular viewers.

Redirect the user to another URL using Pivotal Cloud Foundry

Is there a way to redirect the incoming URL to another URL?
EX - when user hits https://www.website1.com, user should be redirected to https://www.website2.com

CloudFoundry itself does not provide a way to redirect arbitrary requests prior to requests hitting an application (i.e. at the Gorouter layer).
You have some options though:
Perform redirects on your load balancer. If you need to redirect the request as early as possible, doing it on your load balancers would be the earliest possible place. This is good for example if you want to redirect all requests from HTTP to HTTPS. It does assume that your load balancer will allow this, and not all do.
You can redirect from an application. So in your example, the application to which you have mapped the route www.website1.com would be listening and issue the redirect response to send requests to www.website2.com.
This app could be something as simple as an nginx.conf pushed using the Nginx Buildpack, or you could develop a custom application to issue the redirects. The beauty of this is that you can map any number of routes (including wildcards) to the app and a single Nginx server, configured correctly, could issue redirects for all of them. If you are redirecting a lot of separate URLs to one central URL, this works well.
If you have an existing application on www.website1.com and cannot replace it with an Nginx app to do the redirects, you could modify your custom application to perform the redirects. You don't have to use Nginx, it is just a convenient and very low overhead way to do redirects.
You could use a route service. Route Services are able to filter requests and so you could use one to intercept requests and issue redirects for certain routes/domains.
A route service can make some sense if you have complicated routing logic or if you have many applications to which you need to apply the logic. The reason is that a route service requires a custom application to be the route service. So if you have one application and you're trying to redirect requests for that one application to somewhere else, it doesn't make sense to add a second application into the mix just to perform some redirects. Now, if you have 500 applications and they all need some sort of redirect logic, it could make a little more sense to use a route service.

Mapping Host Header to custom header for secondary origin

I'm looking for a way to pass the requesting host header onto either the API Gateway or a custom endpoint (outside of amazon) from a cloudfront origin.
Essentially I have multiple domains mapped to a cloudfront catchall and I'm trying to pre-render based off the index request on the server while letting all other resources through.
IF this is not possible, would lambda edge be able to achieve such a thing?
Thanks!

Until such time as Lambda#Edge leaves preview, here's your workaround:
For each domain name, create a separate CloudFront distribution, and add a unique custom origin header.
If you've configured more than one CloudFront distribution to use the same origin, you can specify different custom headers for the origins in each distribution and use the logs for your web server to distinguish between the requests that CloudFront forwards for each distribution.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/forward-custom-headers.html
It should go without saying that "use the logs for your web server" is only one possible use for this value. You can also use it to identify which domain the request is for, by inspecting the inserted request header.
For example, for the site api-42.example.com, add a custom origin header X-Forwarded-Host with the static value the same as the hostname, api-42.example.com.
CloudFront adds the custom origin header to each request when sending it to the origin server.
If the client, for whatever reason, sends the same header, CloudFront discards what the client sent, before adding your header and value to each request.
Since the actual CloudFront distributions themselves are free, there's no real harm in this solution. If you need to create a lot of them, that's easily scripted with aws-cli. By default, accounts can create 200 different distributions, but you can submit a free support request to increase that limit.
You may now be contemplating the impact of this on your cache hit rate, since the different sites wouldn't share a common cache. That's a valid concern, but the impact may not be as substantial as you expect, for a variety of reasons -- not the least of which is that CloudFront's cache is not monolithic. If you have viewers hitting a single distribution but from two different parts of the world, those users are almost certainly connecting to different CloudFront edge locations, thus hitting different cache instances anyway.

DNS hosting public and web application on different hosts

Here is my setup.
Public site hosted by squarespace.com (www.example-domain.com)
Web application (AWS EC2/ELB), i would like to be available via the same domain. (my.example-domain.com)
Custom profile pages available as www.example-domain.com/username
My question is how can i setup the DNS to achieve this? If can't do it just through DNS, any suggestions? The problem i am facing is that if squarespace.com is handling the www.example-domain.com traffic how can i have it only partially handle it for certain urls. Maybe i am going about this in the wrong was all together though.

The two first are ok. As you mention, (1) is not compatible with (3) for a pure DNS config as www of example-domain.com has to be configured to a single end-point.
Some ideas of non-DNS workaround:
Having the squarespace.com domain on sqsp.example-domain.com and configure your www domain to a custom web server on which you configure the root (/) to redirect (HTTP 300) to sqsp.example-domain.com. It will be quite transparent for the user, except in his browser address.
The same but setting on / a full page HTML iframe containing sqsp.example-domain.com.
The iframe approach is a "less clean", Google the solutions to build your opinion.
EDIT:
As #mike-ryan mentioned, there is the proxy solution as well where you configure you web server to request another server to get the content to return to your user. If you are already using AWS, a smart way to do this is to use CloudFront: you can setup CloudFront to proxy one server on one URL and proxy another server on other URL. Actually, this is maybe the faster to way to implement you need. Of course, a proxy is one more "hop", so it may add more delay.

If you really want to have content served from different servers while only using a single domain name, you'll need to set up a proxy server to handle the request routing for you. I am assuming your custom profile pages must be served from your EC2 instance.
Nginx will receive all requests, and will then decide whether they should be sent to Square Space or your web app. Requests will be reverse proxied to Square Space or to your app, depending on the URL.
This is similar to #smad's answer, except it will all be invisible to the users which IMHO is better than redirecting the user to a new domain name.
Example steps:
Set up an Nginx server, create two virtual hosts - one for my.example.com, and one for www.example.com
Create two upstreams in your Nginx config - one for Square Space, and one for your app
Configure the www.example.com virtual host to reverse proxy connections to the Square Space upstream, if the URL is "/". Otherwise, traffic should be proxied to your app upstream [0]
Configure the my.example.com virtual host to proxy all traffic to your app upstream
[0] how to reverse proxy via nginx a specific url?

CDN only works assigned domain

For example:
About www.abc.com , and en.abc.com
I want to know how to configure the CDN or somethings, make the CDN only works in www.abc.com,
for en.abc.com don't works.
I am using aliyun.com as my cdn provider.
How about the NGINX or Django or Domain or CDN settings?

CDN systems always 'only' work on their configured hostnames. Basically, a CDN is a reverse proxy with a set of rules on it. For any request coming in, it has to know
where to fetch the content from
which additional logic to apply to the content when delivering it
If you want to use a different hostname on the CDN, you will have to make the CDN work, all other components in your web site delivery will not be reached if the CDN configuration doesn't proxy the request to your web server.
I am not familiar with aliyun.com specifically, but there might be a chance to have them set up a wildcard/regex hostname (like *.example.com). You will have to get suport from aliyun to understand if this is possible.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js