Canary Release and Blue Green Deployment on AWS

I am currently implementing Canary Release and Blue Green Deployment for my static website on AWS S3. Basically, I created two S3 buckets (v1 and v2) and two CloudFront distributions (I didn't append the CNAME). Then I created two A alias records in Route 53 with a weighted routing policy of 50% each. However, I was routed to v1 only, using both my laptop and my mobile to access the domain. I even asked my colleagues to open my domain and they were routed to v1 as well.
It really puzzles me: why is no user being routed to v2?

The assigned dyyyexample.cloudfront.net and dzzzexample.cloudfront.net hostnames that route traffic to your CloudFront distributions go to the same place. CloudFront can't see your DNS alias entries, so it is unaware of which alias was followed.
Instead, it looks at the TLS SNI and the HTTP Host header the browser sends, and uses this information to match the request against the Alternate Domain Name configured on your distributions -- with no change to the DNS.
Your site's hostname, example.com, is only configured as the Alternate Domain Name on one of your distributions, because CloudFront does not allow you to provision the same value on more than one distribution.
If you swap that Alternate Domain Name entry to the other distribution, all traffic will move to the other distribution.
In short, CloudFront does not directly and natively support Blue/Green or Canary.
The workaround is to use a Lambda@Edge trigger and a cookie to latch each viewer to one origin or the other. A Lambda@Edge origin request trigger allows the origin to be changed while the request is in flight.
There is an A/B testing example in the docs, but that example swaps out the path. See the Dynamic Origin Selection examples for how to swap out the origin. Combining the logic of these two allows A/B testing across two buckets (or any two alternate back-ends).
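A minimal sketch of that combination, written as a Python origin request trigger: the bucket hostnames, cookie name, and canary weight are all hypothetical, and setting the latching cookie on the response is left to a separate response trigger.

```python
import random

# Hypothetical bucket endpoints -- replace with your own.
ORIGINS = {
    "v1": "my-site-v1.s3.us-east-1.amazonaws.com",
    "v2": "my-site-v2.s3.us-east-1.amazonaws.com",
}
CANARY_WEIGHT = 0.5  # fraction of new viewers sent to v2
COOKIE_NAME = "origin-group"

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # Check whether the viewer is already latched to a variant.
    group = None
    for cookie_header in headers.get("cookie", []):
        for cookie in cookie_header["value"].split(";"):
            name, _, value = cookie.strip().partition("=")
            if name == COOKIE_NAME and value in ORIGINS:
                group = value

    # New viewer: assign a variant by weight. (The cookie itself must
    # be set on the response, e.g. by a viewer-response trigger, so the
    # viewer stays latched on subsequent requests.)
    if group is None:
        group = "v2" if random.random() < CANARY_WEIGHT else "v1"

    # Point the request at the chosen bucket while it is in flight,
    # keeping the Host header consistent with the new origin.
    domain = ORIGINS[group]
    request["origin"]["s3"]["domainName"] = domain
    headers["host"] = [{"key": "Host", "value": domain}]
    return request
```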

What you are describing should work if you make use of "overlapping aliases" in CloudFront: configure one distribution to listen on app.example.com and the other on *.example.com, and use Route 53 weighted routing for app.example.com.
However, weighted routing might not be an ideal solution for canary releases, due to DNS propagation/caching and the fact that it is not sticky.
Like Michael suggests, you might want to look into having one CloudFront distribution and routing to bucket A/B using Lambda@Edge or CloudFront Functions.
Here is an example.
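For the weighted half of that setup, a rough boto3 sketch might look like this. The hosted zone ID and record names are hypothetical; Z2FDTNDATAQYW2 is the fixed alias zone ID used for all CloudFront targets.

```python
import boto3

route53 = boto3.client("route53")

# Fixed hosted-zone ID that all CloudFront alias targets use.
CLOUDFRONT_ZONE_ID = "Z2FDTNDATAQYW2"

def weighted_alias(name, set_id, cf_domain, weight):
    # One weighted A-alias record pointing at a CloudFront distribution.
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "AliasTarget": {
                "HostedZoneId": CLOUDFRONT_ZONE_ID,
                "DNSName": cf_domain,
                "EvaluateTargetHealth": False,
            },
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="ZEXAMPLE12345",  # hypothetical hosted zone
    ChangeBatch={"Changes": [
        weighted_alias("app.example.com.", "blue", "dyyyexample.cloudfront.net.", 50),
        weighted_alias("app.example.com.", "green", "dzzzexample.cloudfront.net.", 50),
    ]},
)
```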

Related

AWS CloudFront failover

I have the following business continuity (BC) solution:
Static website that is deployed in S3 in two identical buckets in regions A (primary) and B (secondary).
CloudFront distribution that uses Origin Group to failover from the bucket in region A to the other one in region B.
The distribution is configured with the custom domain and certificates to provide access to the website.
Route 53 has "A" record connecting the custom domain to the CloudFront distribution.
The above solution works as intended, i.e. it provides the failover in case of a failure of the primary site (e.g. an S3 failure).
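For reference, the origin group portion of such a distribution's config looks roughly like this -- a sketch in boto3/CloudFront API shape, with hypothetical origin IDs and failover status codes:

```python
# Fragment of a CloudFront DistributionConfig showing an origin group
# that fails over from the region-A bucket to the region-B bucket.
origin_group_config = {
    "OriginGroups": {
        "Quantity": 1,
        "Items": [
            {
                "Id": "s3-failover-group",
                "FailoverCriteria": {
                    # Status codes from the primary that trigger failover.
                    "StatusCodes": {"Quantity": 3, "Items": [500, 502, 503]}
                },
                "Members": {
                    "Quantity": 2,
                    "Items": [
                        {"OriginId": "bucket-region-a"},  # primary
                        {"OriginId": "bucket-region-b"},  # secondary
                    ],
                },
            }
        ]
    }
}
```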
What I am trying to figure out is the best way to ensure the availability of the CloudFront distribution, i.e. what happens if the region where the distribution is configured becomes compromised.
Originally I was thinking of creating another distribution, identical to the first one but in another region, and using Route 53 to fail over between them. Unfortunately this is not possible due to the one-to-one relationship between a CloudFront distribution and its custom domains (CNAMEs).
I would appreciate anyone's experience with creating a BC solution for the CloudFront distribution that uses custom domains.
Thank you,
gen

Multiple region deployment with Geolocation based policy vs CloudFront

I have a custom origin i.e. a web app on an EC2 instance. How do I decide whether I should go for:
a CloudFront CDN
or,
deploy multiple instances in different regions and configure a Geolocation/proximity based routing policy
The confusion arises from the fact that both aim at routing the request to the nearest location (an edge location in the case of CloudFront, and a region-specific EC2 instance in the case of a multi-region deployment with a Route 53 geolocation-based policy) based on where the request originates.
There is no reason why you can't do both.
CloudFront automatically routes requests to an edge location nearest the viewer, and when a request can't be served from that location or the nearest regional cache, CloudFront does a DNS lookup for the origin domain name and fetches the content from the origin.
So far, I've only really stated the obvious. But up next is a subtle but important detail:
CloudFront does that origin server DNS lookup from a location that is near the viewer -- which means that if the origin domain name is a latency-based record set in Route 53, pointing to deployments in two or more EC2 regions, then the request CloudFront makes to "find" the origin will be routed to the origin deployment nearest the edge, which is also by definition going to be near to the viewer.
So a single, global CloudFront deployment can automatically and transparently select the best origin, using latency-based configuration for the backend's DNS configuration.
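As a concrete illustration of that backend DNS configuration, here is a hedged boto3 sketch of latency-based records for the origin hostname that CloudFront uses on cache misses. The hostnames, regions, and zone ID are all placeholders.

```python
import boto3

route53 = boto3.client("route53")

# Latency-based CNAMEs for the origin hostname. Route 53 answers each
# query with the record whose region has the lowest latency from the
# resolver -- here, the CloudFront edge doing the origin lookup.
changes = []
for region, target in [
    ("us-east-2", "origin-use2.example.com"),
    ("eu-west-2", "origin-euw2.example.com"),
]:
    changes.append({
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "origin.example.com.",
            "Type": "CNAME",
            "SetIdentifier": region,
            "Region": region,
            "TTL": 60,
            "ResourceRecords": [{"Value": target}],
        },
    })

route53.change_resource_record_sets(
    HostedZoneId="ZEXAMPLE12345",  # hypothetical hosted zone
    ChangeBatch={"Changes": changes},
)
```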
If the caching and transport optimizations provided by CloudFront do not give you the global performance you require, then you can deploy in multiple regions, behind CloudFront... being mindful, always, that a multi-region deployment is almost always a more complex environment, depending on the databases that are backing your application and how they are equipped to handle cross-region replication for reads and/or writes.
Including CloudFront as the front-end is also a better solution for fault tolerance among multiple regional deployments, because CloudFront correctly honors the DNS TTL on your origin server's DNS record, and if you have Route 53 health checks configured to take an unhealthy region out of the DNS response on the origin domain name, CloudFront will quickly stop sending further requests to it. Browsers are notoriously untrustworthy in this regard, sometimes caching a DNS answer until all tabs/windows are closed.
And if CloudFront is your front-end, you can offload portions of your logic to Lambda@Edge if desired.
You can use multi-region deployments for many reasons, mainly:
Proximity
Failover (in case the first region fails, requests can be sent to another region)
Multi-region Lambda deployment is clearly documented here. You can apply the same logic to other AWS resources too (DynamoDB, S3).
https://aws.amazon.com/blogs/compute/building-a-multi-region-serverless-application-with-amazon-api-gateway-and-aws-lambda/
You can also run Lambda@Edge to force all your requests/splits to one region at the edge.
Hope it helps.

Are Route 53 failover policy records only useful for non-AWS alias-able resources?

If all my endpoints are AWS services like ELB or S3, "Evaluate Target Health" can be used instead of failover records, correct? I can use multiple weighted, geo, or latency records, and if I enable "Evaluate Target Health" it also serves the purpose of failover: if one of the resources a record is pointing to is not healthy, Route 53 will not send traffic to it.
The only use I see for failover records with custom health checks is for non-AWS resources, OR if you have a more complex decision you want DNS to make instead of just ELB/S3/etc. service health.
EDIT: So it seems that while I can get active-active with "Evaluate Target Health" (on alias endpoints), if I want active-passive I have to use a failover policy -- is this correct?
Essentially, yes. Evaluating target health makes the records viable candidates for generating responses, only when healthy. Without a failover policy, they're all viable when they're all healthy.
If you do something like latency-based routing and you had two targets, let's say Ohio and London, then you'd essentially have a dual active/passive configuration with reversed roles -- Ohio active and London passive for viewers in North America, and the roles reversed for viewers in Europe. But if you want global active/passive, you'd need a failover policy.
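To make the distinction concrete, here is a rough boto3 sketch of the global active/passive case using a failover policy on alias records. All names and zone IDs are hypothetical.

```python
import boto3

route53 = boto3.client("route53")

def failover_alias(role, dns_name, target_zone_id):
    # Alias to an AWS resource (e.g. an ELB); EvaluateTargetHealth
    # stands in for an explicit health check on the alias target.
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com.",
            "Type": "A",
            "SetIdentifier": role.lower(),
            "Failover": role,  # "PRIMARY" or "SECONDARY"
            "AliasTarget": {
                "HostedZoneId": target_zone_id,
                "DNSName": dns_name,
                "EvaluateTargetHealth": True,
            },
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="ZEXAMPLE12345",  # hypothetical hosted zone
    ChangeBatch={"Changes": [
        # Ohio answers while healthy; London answers only when Ohio is not.
        failover_alias("PRIMARY", "primary-alb.us-east-2.elb.amazonaws.com.", "ZLBZONEA"),
        failover_alias("SECONDARY", "standby-alb.eu-west-2.elb.amazonaws.com.", "ZLBZONEB"),
    ]},
)
```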
Note that if you are configuring any kind of high-availability design using Route 53 and target health, your best bet is to do all of this behind CloudFront -- where the viewer always connects to CloudFront and CloudFront does the DNS lookup against Route 53 to find the correct origin based on whatever rules you've created. The reason for this is that CloudFront always respects the DNS TTL values. Browsers, for performance reasons, do not. Your viewers can find themselves stuck with DNS records for a dead target because their browsers don't flush their cached DNS lookups until all tabs in all windows are closed. For users like me, that almost never happens.
This also works with latency-based routes in Route 53 behind CloudFront, because CloudFront has already routed the viewer to its optimal edge, and when that edge does a lookup on a latency-based route in Route 53, it receives the answer that has the lowest latency from the CloudFront edge that's handling the request... so both viewer to CloudFront and CloudFront to origin routes are thus optimal.
Note also that failover routing to S3 with only DNS is not possible, because S3 expects the hostname to match the bucket name, and bucket names are global. An S3 failure is a rare event, but it has happened at least once. When it happened, the impact was limited to a single region, as designed. For a site to survive an S3 regional failure requires additional heroics involving either CloudFront and Lambda@Edge triggers, or EC2-based proxies that can modify the request as needed and send it to the alternate bucket in an alternate region.
Latency-based routing to buckets with Route 53 is also not possible, for the same reason, but can be accomplished with Lambda@Edge origin request triggers. These triggers are aware of the AWS region where a given invocation is running, and thus can swap origin servers based on location.
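As a sketch of that region-awareness, a Python Lambda@Edge origin request trigger might look roughly like this. The bucket hostnames are hypothetical, and the region-to-bucket mapping is deliberately simplistic.

```python
import os

# Map the executing region to the nearest bucket (hypothetical hostnames).
BUCKETS_BY_AREA = {
    "us": "my-site-us.s3.us-east-1.amazonaws.com",
    "eu": "my-site-eu.s3.eu-west-1.amazonaws.com",
}

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]

    # Lambda@Edge replicates the function into regions near the edges;
    # AWS_REGION identifies where this particular invocation is running.
    region = os.environ.get("AWS_REGION", "us-east-1")
    area = "eu" if region.startswith("eu-") else "us"

    # Swap the S3 origin, keeping the Host header consistent with it.
    domain = BUCKETS_BY_AREA[area]
    request["origin"]["s3"]["domainName"] = domain
    request["headers"]["host"] = [{"key": "Host", "value": domain}]
    return request
```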

Regional API Gateway with CloudFront

Amazon released a new feature to support regional API endpoints.
Does it mean I can deploy the same API code in two regions, which send requests to Lambda microservices? (They will be two different HTTPS endpoints.)
And have CloudFront distribute the traffic for me?
Any code snippets?
Does it mean I can deploy the same API code in two regions, which send requests to Lambda microservices? (They will be two different HTTPS endpoints.)
This was already possible. You can already deploy the same API code in multiple regions and create different HTTPS endpoints using API Gateway.
What you couldn't do, before, was configure API Gateway API endpoints in different regions to expect the same hostname -- and this is a critical capability that was previously unavailable, if you wanted to have a geo-routing or failover scenario using API Gateway.
With the previous setup -- which has now been renamed "Edge-Optimized Endpoints" -- every API Gateway API had a regional endpoint hostname but was automatically provisioned behind CloudFront. Accessing your API from anywhere meant you were accessing it through CloudFront, which meant optimized connections and transport from the API client -- anywhere on the globe -- back to your API's home region via the AWS Edge Network, which is the network that powers CloudFront, Route 53, and S3 Transfer Acceleration.
Overall, this was good, but in some cases, it can be better.
The new configuration offering, called a Regional API Endpoint, does not use CloudFront or the Edge Network... but your API is still only in one region (but keep reading).
Regional API Endpoints are advantageous in cases like these:
If your traffic is from EC2 within the region, this avoids the necessity of jumping onto the Edge Network and back off again, which will optimize performance of API requests from inside the same EC2 region.
If you wanted to deploy an API Gateway endpoint behind a CloudFront distribution that you control (for example, to avoid cross-origin complications, or otherwise integrate API Gateway into a larger site), this previously required that you point your CloudFront distribution to the CloudFront distribution managed by API Gateway, thus looping through CloudFront twice, which meant transport latency and some loss of flexibility.
Creating a Regional API Endpoint allows you to then point your own CloudFront distribution directly at the API endpoint.
If you have a single API in a single region, and it's being accessed from points all over the globe, and you aren't using CloudFront yourself, the Edge-Optimized endpoint is still almost certainly the best way to go.
But Regional API Endpoints get interesting when it comes to custom domain names. Creating APIs with the same custom domain name (e.g. api.example.com) in multiple AWS regions was not previously possible, because of API Gateway's dependency on CloudFront. CloudFront is a global service, so the hostname namespace is also global -- only one CloudFront distribution, worldwide, can respond to a specific incoming request hostname. Since Regional API Endpoints don't depend on CloudFront, provisioning APIs with the same custom domain name in multiple AWS regions becomes possible.
So, assuming you wanted to serve api.example.com out of both us-east-2 and us-west-2, you'd deploy your individual APIs and then in each region, create a custom domain name configuration in each region for api.example.com with a Regional API Endpoint, selecting an ACM certificate for each deployment. (This requires ACM certs in the same region as the API, rather than always in us-east-1.)
This gives you two different hostnames, one in each region, that you use for your DNS routing. They look like this:
d-aaaaaaaaaa.execute-api.us-east-2.amazonaws.com
d-bbbbbbbbbb.execute-api.us-west-2.amazonaws.com
So, what next?
You use Route 53 Latency-Based routing to create a CNAME record for api.example.com with two targets -- one from us-east-2, one from us-west-2 -- pointing to the two respective names, along with health checks on the targets. Route 53 will automatically resolve DNS queries to whichever regional endpoint is closer to the requester. If, for example, you try to reach the API from us-east-1, your DNS query goes to Route 53 and there's no record there for us-east-1, so Route 53 determines that us-east-2 is the closer of the two regions, and -- assuming the us-east-2 endpoint has passed its healthcheck -- Route 53 returns the DNS record pointing to d-aaaaaaaaaa.execute-api.us-east-2.amazonaws.com.
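Since code snippets were requested, here is a rough boto3 sketch of the per-region custom domain setup described above. The certificate ARNs and account number are hypothetical, and attaching the actual API stages (base path mappings) is omitted.

```python
import boto3

# Create the same custom domain in each region. Each call returns that
# region's d-xxxxxxxxxx.execute-api.<region>.amazonaws.com target.
# The certificate ARNs are hypothetical and must be issued by ACM in
# the same region as the API (not necessarily us-east-1).
targets = {}
for region, cert_arn in [
    ("us-east-2", "arn:aws:acm:us-east-2:123456789012:certificate/aaaa"),
    ("us-west-2", "arn:aws:acm:us-west-2:123456789012:certificate/bbbb"),
]:
    apigw = boto3.client("apigateway", region_name=region)
    resp = apigw.create_domain_name(
        domainName="api.example.com",
        regionalCertificateArn=cert_arn,
        endpointConfiguration={"types": ["REGIONAL"]},
    )
    targets[region] = resp["regionalDomainName"]

# 'targets' now holds the two regional hostnames. Publish them as
# latency-based CNAMEs for api.example.com, with health checks, as in
# the Route 53 sketches earlier on this page.
print(targets)
```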
So, this feature creates the ability to deploy an API in multiple AWS regions that will respond to the same hostname, which is not possible with Edge Optimized API Endpoints (as all endpoints were, prior to the announcement of this new feature).
And have CloudFront distribute the traffic for me?
Not exactly. Or, at least, not directly. CloudFront doesn't make origin determinations based on the requester's region, but Lambda@Edge dynamic origin selection could be used to modify the origin server based on the requester's general location (by evaluating which API region is nearest to the CloudFront edge that happens to be serving a specific request).
However, as you can see, above, Route 53 Latency-Based routing can do that for you. Yet, there's still a compelling reason to put this configuration behind a CloudFront distribution, anyway... two reasons, actually...
This is in essence a DNS failover configuration, and that is notoriously unreliable when the access is being made by a browser or by a Java programmer who hasn't heard that Java seems to cache DNS lookups indefinitely. Browsers are bad about that, too. With CloudFront in front of your DNS failover configuration, you don't have to worry about clients caching your DNS lookup, because CloudFront does it correctly. The TTL of your Route 53 records -- which are used as an origin server behind CloudFront -- behaves as expected, so regional failover occurs correctly.
The second reason to place this configuration behind CloudFront would be if you want the traffic to be transported on the Edge Network. If the requests are only coming from the two AWS regions where the APIs are hosted, this might not be helpful, but otherwise it should improve responsiveness overall.
Note that geo-redundancy across regions is not something that can be done entirely transparently with API Gateway in every scenario -- it depends on how you are using it. One problematic case that comes to mind would involve a setup where you require IAM authentication against the incoming requests. The X-Amz-Credential includes the target region, and the signature of course would differ because the signing keys in Signature V4 are based on the secret/date/region/service/signing-key paradigm (which is a brilliant design, but I digress). This would complicate the setup since the caller would not know the target region. There may be other complications. Cognito may have similar complications. But for a straightforward API where the authentication is done by your own mechanism of application tokens, cookies, etc., this new capability is very much a big deal.
Somewhat amusingly, before this new capability was announced, I was actually working on the design of a managed service that would handle failover and geo-routing of requests to redundant deployments of API Gateway across regions, including a mechanism that had the capability to compensate for the differing region required in the signature. The future prospects of what I was working on are a bit less clear at the moment.
It means you can deploy your API per region, which reduces latency.
One use case would be: say you have a Lambda function that invokes an API. If both the Lambda function and the API are in the same region, then you can expect high performance.
Please have a look at https://docs.aws.amazon.com/apigateway/latest/developerguide/create-regional-api.html

AWS CloudFront and Route 53

How does CloudFront work with Route 53 routing policies?
So as I understand it, CloudFront is supposed to route requests to the nearest server, which is in effect the Route 53 latency policy. So if you have a Route 53 hosted zone entry for your CloudFront domain name, is this done by default if you leave the routing policy as simple, or do you need to explicitly set this yourself? And if you chose another policy type (failover, geolocation, etc.), would that overwrite it?
You leave it as simple.
You don't have access to the necessary information to actually configure it yourself -- CloudFront returns an appropriate DNS response based on the location of the requester, from a single, simple DNS record. The functionality and configuration are managed transparently by the logic that powers the cloudfront.net domain; you set it and forget it, because there are no user-serviceable parts inside.
This is true whether you use an A-record Alias or a CNAME.
Any other configuration would not really make sense, because talking of failover or geolocation imply that you'd want to send traffic somewhere other than where CloudFront's algorithm would send it.
Now... there are cases when, behind CloudFront, you might want to use some of Route 53's snazzier options. Let's say you had app servers in multiple regions serving exactly the same content. Latency-based routing for the origin hostname (the one where CloudFront sends cache misses) would allow CloudFront to magically send requests to the app server closest to the CloudFront edge that serves each individual request. This would be unrelated to the routing from the browser to the edge, though.