The Problem
Suppose I have a simple CRUD web application. The application is containerized, developed locally, and set up with main/staging/develop branches on GitHub. CI/CD is configured with GitHub Actions. Merges to main trigger a deployment to AWS App Runner.
Generally, we need three main AWS services here: CloudFormation, ECR, and App Runner.
Where does the IaC with AWS CDK belong?
Approaches
Have a separate repository for the IaC. Run this repository once to set up the App Runner service and a dedicated ECR repository. Treat the ECR repository URI as an environment variable in the application repository. On merge to main in the application repository, GitHub Actions rebuilds and pushes the image to the ECR repo. The App Runner service detects the new image and redeploys per this documentation.
Pros: Separation of responsibilities between repositories. One is for infrastructure, one is for application code. CDK only runs once. Deployments are far simpler and easier to diagnose.
Cons: Significant manual overhead. More work to change infrastructure.
AWS CDK supports declaring an App Runner service with a local Docker image per this documentation. Create an AWS CDK project directly in the application repository. Upon merge to main, the pipeline re-runs the CDK deployment with the new image.
Pros: One repository. Almost entirely automated with no manual overhead from infrastructure/DevOps team.
Cons: Developers may have to worry about IaC. Potential compute overhead with constant re-runs of CDK.
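To make approach 1 concrete, here is a minimal sketch of the only infrastructure detail the application repository needs to know: which image references to push. It assumes the App Runner service was configured to watch the `latest` tag (an assumption, not something stated above), and the function name is hypothetical.

```python
def image_tags(repository_uri: str, git_sha: str, tracked_tag: str = "latest") -> list[str]:
    """Return the image references CI should push for one commit.

    repository_uri is the ECR repository URI handed over by the IaC repo.
    tracked_tag is the tag the App Runner service is assumed to watch.
    """
    if not repository_uri or not git_sha:
        raise ValueError("repository URI and commit SHA are both required")
    base = repository_uri.rstrip("/")
    # An immutable tag for traceability, plus the moving tag that triggers the redeploy.
    return [f"{base}:{git_sha[:12]}", f"{base}:{tracked_tag}"]
```

In a GitHub Actions job, `git_sha` could come from the built-in `GITHUB_SHA` variable, and the repository URI from a repository variable or secret whose name you choose.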
The Actual Questions
Which approach is best for websites/API that might rely on multiple backend services?
Which one fits the best in a development culture that relies heavily on microservices? Is there another approach I'm not thinking of? Am I asking the wrong questions?
I personally prefer approach 2 because I hate manual overhead.
I have less experience in microservices, so I was hoping some people with more industry experience could present some insight.
If this is the wrong place to ask this question or if I need to be more specific, please comment below and I'll adjust accordingly.
Have a separate repository for the IaC
The most compelling reason to do this is to decouple CI/CD for IaC from the app repo. For example, if the application codebase is not continuously delivered, and the IaC is committed to the same repo as the codebase, you end up in a scenario where the IaC is versioned along with the code. If you deploy an old version, or a branch, does that version or branch get IaC from the same ref in the repo as the code itself? If so, you're in a position where you have to merge IaC changes across branches to make them deployable, which is a huge headache.
Most development teams as of 2022 do want to continuously deliver their code and "fail forward" rather than rolling back to old versions. In this scenario it doesn't really matter - because the main branch of the code is also the main branch of the infrastructure. But in this case, rolling back to an old version of the code inherently means rolling back to the matching version of the infrastructure, so you can't do it without looking very carefully at what infrastructure changes were made between the two versions and whether it's safe to roll back infra changes or not.
On the other hand, if the IaC repo is separate from the code repo, the IaC can be made to accommodate multiple versions of the code. Dependencies still exist - new features in the app that require new infrastructure are inherently dependent on the Infra as Code changes, and you don't have the shared repo to make sure those dependencies are deployed before the app.
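With separate repos, that dependency has to be checked explicitly somewhere. One way to picture it is a pre-deploy gate that compares what an app release needs against what the environment already provides. This is only a sketch: the `AppRelease` manifest and the capability names are hypothetical, not part of any AWS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRelease:
    version: str
    required_infra: frozenset  # names of infrastructure capabilities this release needs


def missing_capabilities(release: AppRelease, deployed_infra: set) -> list:
    """List the capabilities the release needs that the environment lacks."""
    return sorted(release.required_infra - set(deployed_infra))


def can_deploy(release: AppRelease, deployed_infra: set) -> bool:
    """True when every infrastructure dependency is already deployed."""
    return not missing_capabilities(release, deployed_infra)
```

Extra capabilities in the environment are fine, which is what lets one version of the infrastructure serve several versions of the code.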
It usually comes down to a question of ownership. If the infrastructure is primarily managed by a distinct group, then putting the infra in a separate repository makes a lot of sense, because commingling infrastructure changes with code changes makes it hard for these groups to operate independently. Pushing out an infrastructure change from a different repo is essentially an isolated step. Pushing out an infrastructure change from the same repo requires merging a PR into the code base and deploying that. If the infra change is the only thing being deployed, that's pretty straightforward. But if the CI branch of the codebase is in a messy state, then the infrastructure becomes undeployable because the code is undeployable. If the infrastructure is owned by the team whose job it is to also keep the code repo clean and deployable, then splitting the repos apart doesn't do much good.
Having spent the last 12 years or so doing DevOps, I'm pretty attracted to putting IaC in a separate repo for messy applications whose teams struggle with continuous delivery. That way when I want to make infrastructure changes, I can consider them in relative isolation and can deploy them to all environments regardless of which version of the code is deployed there. It really sucks to be trying to migrate database hosting, for example, if you need to work with the product team to get your IaC into each version of the code deployed into each environment. But it's not a free lunch - I still have to make sure to coordinate dependencies between the infra and the code, of course.
The smaller the service, the more the development team also handles IaC, and the more disciplined the development team's approach to CI/CD, the less it matters. If the same code goes out to dev/prod anyway and code merge and deploy is a frequent, comfortable thing, then the IaC may as well be in the app codebase. But you have to be ready to be limited to a fail-forward, continuously integrated approach, and accept that infrastructure and code deployments are coupled at the repo layer.
Most microservice dev teams tend to own their own IaC, continuously deliver their application, and put their IaC in the same repo as their code.
Related
We have a CodePipeline that runs on every GitHub commit/merge to the main branch, building the application and releasing it to a staging environment where we can manually test the application. Every now and then, ad-hoc, weekly, etc, depending on the project, we'd release to production manually. To implement this I added a ManualApprovalStep to my CodePipeline between staging and production but that means that my pipeline is never green. It's always stuck in blue:
This makes me think that I'm using the wrong tool here.
My mental model is coming from Heroku (ignore the review apps, I'm not tackling that challenge yet):
In Heroku there's a Tests tab that's green if the tests pass and there's a pipeline that's green if it gets deployed to staging. Lack of promotion to production in Heroku is not a non-green state in Heroku but it would be in ManualApprovalStep.
Is there another tool that AWS gives me to model this way of working that I'm missing?
Update: another big difference. The ManualApprovalStep seems to queue up each change and release them one by one, rather than releasing whatever was last released to staging, so clearly it's not analogous to the release to production that Heroku has.
You are right that the ManualApprovalStep is not a natural "promotion" mechanism. It is meant for yes-no approvals and results in execution failure if the approval is rejected or not given within 7 days. Disabled Stage Transitions also sit awkwardly with your use case.
pipelines.CodePipeline executions are (a) triggered on a change to a source and (b) meant to run all stages from start to finish. Executions are hard to interrupt. A consequence of a requirement to deploy environments independently is that environments are best modelled as independent pipelines, not stages within a single pipeline.
Simple Option: 2 GitHub branches, 2 Pipelines
Clone your pipeline setup. A staging pipeline is tied to a staging branch source. A prod pipeline is triggered on changes to the main branch. This setup is easy to reason about and has the advantage that deploys always match your source. But it does not replicate the Heroku "promotion" concept.
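The simple option boils down to a fixed branch-to-pipeline mapping. The sketch below only illustrates that routing; the pipeline names are hypothetical, and in practice each pipeline's source configuration does this for you.

```python
from typing import Optional

# Hypothetical pipeline names; each pipeline deploys exactly one environment.
PIPELINE_BY_BRANCH = {
    "staging": "staging-pipeline",
    "main": "prod-pipeline",
}


def pipeline_for_ref(git_ref: str) -> Optional[str]:
    """Return the pipeline a pushed ref should trigger, or None for all other refs."""
    prefix = "refs/heads/"
    if not git_ref.startswith(prefix):
        return None  # tags and other refs never deploy
    return PIPELINE_BY_BRANCH.get(git_ref[len(prefix):])
```

Releasing to production then means merging staging into main, so what is deployed always matches a branch head.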
Complex Option: 1 GitHub branch, 2 Pipelines?
You could probably get something closer to the "promotion" pattern by having a pipelines.CodePipeline deployment for staging (tied to GitHub) and a separate codepipeline.Pipeline pipeline for prod. The latter can be triggered by EventBridge events. Asset handling would be complex in this scenario.
[Edit:] Amplify CI/CD for the Front-end, CodePipeline for Back-end
AWS Amplify CI/CD gives you automatic feature branch deploys, PR review approvals etc. for front-end apps. Manual deploys require a workaround, but are possible. See this related SO question. The CDK supports Amplify build configurations. The catch is that these CI/CD goodies work for front-end apps, but not for arbitrary infrastructure stacks. To get the best of both worlds, split the app in two. Use Amplify for the high-velocity front-end and stick with CodePipeline for the back-end deploys.
Hi, I am using the Serverless Framework to develop my application, and I need to set it up in a local environment. I am using API Gateway, Lambda, VPC, SNS, and SQS, and the DB is connected via VPC peering. Currently I have to deploy every time I want to test my code, which is a tedious process that takes 5 minutes per deploy. Is there any way to set up a local environment to have everything in one place?
It should be possible in theory, but it is not an easy thing to do. There are products like LocalStack that offer exactly this.
But I would not recommend going that route. Ultimately, by design this will always be a huge cat-and-mouse game. AWS introduces a new feature or changes some minor detail of their implementation, and products like LocalStack need to catch up. Furthermore, you will only ever get an "approximation" of the "actual cloud". It will never be a 100% match.
I would think there is a lot of work involved to get products like LocalStack working properly with your setup and have it running well.
Therefore, I would propose to invest the same time into proper developer experience within the "actual cloud". That is what we do: every developer deploys their version of the project to AWS.
This is also not trivial, but the end result is not a "fake version" of the cloud that might or might not reflect the "real cloud".
The key to achieving this is Infrastructure as Code and as much automation as possible. We use Terraform and Makefiles, which works very well for us. If done properly, we only ever build and deploy what we changed. The result is that changes can be deployed to AWS in seconds, and the developer can test the result either through the Makefile itself or using the AWS console.
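As a rough illustration of "only build and deploy what we changed", the sketch below fingerprints each component's source directory and compares it with the fingerprint recorded at the last deploy. This is not how Make or Terraform decide what to rebuild (Make compares timestamps, Terraform diffs state); it is a hash-based stand-in for the same idea, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def directory_digest(root: Path) -> str:
    """Hash every file under root (relative path plus contents) into one fingerprint."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def components_to_deploy(components: dict[str, Path], last_deployed: dict[str, str]) -> list[str]:
    """Return the names of components whose sources differ from the last deployed fingerprint."""
    return sorted(
        name
        for name, root in components.items()
        if directory_digest(root) != last_deployed.get(name)
    )
```

A component with no recorded fingerprint counts as changed, so a first deploy builds everything.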
Another upside is that, in theory, you need to do all the same work anyway for your continuous deployment, so ultimately you are reducing work by not having to maintain both local deployments and cloud deployments.
The problem
I'm approaching AWS, and the first test project will be a website, but I'm struggling with how to approach the resources and the tools to accomplish this.
AWS documentation is not really beginner-friendly, so to me it is like being punched in the face at the first boxing training session.
First attempt
I've installed both the AWS and SAM CLI tools. I expected to be able to create an empty stack first and add resources one by one as the specifications are outlined. Instead, I need to give the tool a template to create the new stack, which means I need to know how to write it beforehand, and therefore the template specification for each resource type.
Second attempt
This led me to create the stack and the related resources from the online console to get the final stack template. But then I need to test every new or updated resource locally, so I have to copy the template from the online console to my machine and run the CLI tools against it, which is obviously not the desired development flow.
What I expected
Coming from a standard/classical web development I would expect to be able to create the project locally, test the related resources locally, version it, and delegate the deployment to the pipeline.
So what?
All this made me understand that I'm "probably" missing something about how to use the AWS CLI tools and how development for an AWS-hosted application is meant to be done.
I'm not looking for a guide on specific resource types like every single tutorial I've found online, but something higher level on how to handle project development on AWS, best practices and the like. I can then dig deeper into any resource later when needed.
AWS's Cloud Development Kit ticks the boxes on your specific criteria.
Caveat: the CDK has a learning curve in line with its power and flexibility. There are much easier ways to deploy a web app on AWS, like the higher-level AWS Amplify framework, with abstractions tailored to front-end devs who want to minimise the mental energy spent on the underlying infrastructure.
Each of the squillion AWS and 3rd Party deploy tools is great for somebody. Nevertheless, looking at your explicit requirements in "What I expected", we can get close to the CDK as an objective answer:
Coming from a standard/classical web development
So you know JS/Python. With the CDK, you code infrastructure as functions and classes, rather than 500 lines of YAML as with SAM. The CDK's reference implementation is in TypeScript. JS/Python are also supported. There are step-by-step AWS online workshops for these and the other supported languages.
create the project locally
Most of your work will be done locally in your language of choice, with a cdk deploy CLI command to bundle the deployment artefacts and send them up to the cloud.
test the related resources locally
The CDK has built-in testing and assertion support.
version it
"Deterministic deploy" is a CDK design goal. Commit your code and the generated deployment artefacts so you have change control over your infrastructure.
delegate the deployment to the pipeline
The CDK has good pipeline support: i.e. a push to the remote main branch can kick off a deploy.
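To give a feel for what testing infrastructure locally means, here is a library-free sketch of the idea behind template assertions: synthesize a CloudFormation template, then assert that it contains a resource with certain properties. The helper below is a hypothetical stand-in written for illustration; the CDK ships its own assertions module for this, which you would use in a real project.

```python
def has_resource_properties(template: dict, resource_type: str, expected: dict) -> bool:
    """True if some resource of resource_type contains all expected properties (partial match)."""

    def matches(actual, wanted) -> bool:
        if isinstance(wanted, dict):
            return isinstance(actual, dict) and all(
                key in actual and matches(actual[key], value)
                for key, value in wanted.items()
            )
        return actual == wanted

    return any(
        resource.get("Type") == resource_type
        and matches(resource.get("Properties", {}), expected)
        for resource in template.get("Resources", {}).values()
    )


# A hand-written template standing in for what a synth step would produce.
template = {
    "Resources": {
        "SiteBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "BucketName": "example-site",
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        }
    }
}
```

A unit test can then check, for instance, that versioning is enabled on the bucket without deploying anything.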
AWS SAM is actually a good option if you are just trying to get your feet wet with AWS. SAM is an open-source framework built on top of CloudFormation, which lets you define AWS resources like Lambda in roughly 10 lines of template versus roughly 100 lines if you were to wire everything up by hand. Yes, you'll need to learn SAM-specific things like the SAM template and the SAM CLI, but it is pretty straightforward using this doc.
Once you get the hang of it, it would be easier to start looking under the hood of what/how SAM is doing things and get into the weeds with aws-cli if you wanted. Which will then allow you to build out custom solutions (using aws-cli) for your complex use cases that SAM may not support. Caveat: SAM is still pretty new and has open issues that could be a blocker for advanced features/complex use cases.
I am on a project which is about to release first version. I want to setup bitbucket pipeline when deploying to AWS. When doing so, I am afraid that users on website might be affected while we are deploying. What is the best practice for deploying new feature to the live server without affecting users on the website?
One possible option might be that put maintenance page on the web and deploy new codes when not many users are using the website. is there other way to deploy?
As mentioned in the comment, it is something that depends on the underlying tools and technology, but I will focus on your last question.
One possible option might be that put maintenance page on the web and
deploy new codes when not many users are using the website. is there
other way to deploy?
First, you should not deploy a new feature without proper testing. The pipeline must include automated tests, because untested code can break the complete application.
You should not put the application under maintenance during deployment; that is what a CI/CD pipeline is for. Design your pipeline so that you are confident the latest code and features will work in production as expected. Many AWS services support blue/green deployment, and the interesting part of blue/green deployment is rollback. You can explore further in the links below.
AWS_Blue_Green_Deployments
using-bitbucket-pipeline-for-aws-ecs-deployments
deploy-to-ec2-with-aws-codedeploy-from-bitbucket-pipelines
continuous-deployment-pipeline
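The blue/green idea itself is small enough to sketch. The model below is a toy, not any AWS API: the class and method names are hypothetical, and in practice a load balancer or a deployment service performs the traffic switch.

```python
from typing import Callable, Optional


class BlueGreenRouter:
    """Toy model: one environment serves traffic while the other is staged."""

    def __init__(self, live_version: str) -> None:
        self.live = live_version
        self.previous: Optional[str] = None

    def deploy(self, new_version: str, health_check: Callable[[str], bool]) -> bool:
        """Stage new_version and switch traffic only if it passes the health check."""
        if not health_check(new_version):
            return False  # users never saw the broken version
        self.previous, self.live = self.live, new_version
        return True

    def rollback(self) -> None:
        """Point traffic back at the environment that was live before the last switch."""
        if self.previous is None:
            raise RuntimeError("no previous version to roll back to")
        self.live, self.previous = self.previous, self.live
```

Because the old environment stays up until the switch, users are served throughout the deployment, and rollback is just another switch.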
We are a relatively inexperienced development team trying to do things 'the right way'. We are using Github along with AWS and CodeDeploy for multiple PHP based web applications. We are utilising Github's auto-deployment with CodeDeploy when the master branch is updated.
We have two production EC2 web servers in separate AZ's along with a single EC2 staging server.
It currently works as follows:
We write code in a branch, we push to GitHub, we merge into 'master' which then kicks off CodeDeploy to write to our staging server where we can test it. Once we have tested it we then manually kick off CodeDeploy to write to production (with the same commit ID).
The problem is, if testing brings up issues, and we have another branch waiting to be merged and tested, everything becomes backed up.
We are obviously doing something wrong. We are writing to the master branch to utilise GitHub's autodeploy, but I assumed master was only to be written to when it was ready to be deployed?
Can someone please help us and put us straight?
Thanks
Make another branch called 'livecandidate'. This branch will have each of the new feature branches merged into it.
Each time a feature branch is merged into 'livecandidate', pull 'livecandidate' into your CodeDeploy process and install to the test machine.
If the tests pass, then merge 'livecandidate' into 'master' and kick off the install to production.
If the tests do not pass, then unwind the merge into 'livecandidate' (assuming no dependencies on chains of changes etc).
After doing a production install or an un-merge, try the next feature.
The general idea is to never, ever have a broken master.
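The steps above can be sketched as a small simulation. Branches are modelled as lists of merged feature names and `tests_pass` stands in for whatever you run on the test machine; both are simplifications for illustration, not a Git or CodeDeploy integration.

```python
from typing import Callable, Iterable


def process_features(
    master: list[str],
    features: Iterable[str],
    tests_pass: Callable[[list[str]], bool],
) -> tuple[list[str], list[str]]:
    """Try features one at a time via a candidate branch; return (new master, rejected features)."""
    master = list(master)  # work on a copy, leave the caller's list alone
    rejected = []
    for feature in features:
        candidate = master + [feature]  # merge the feature into 'livecandidate'
        if tests_pass(candidate):
            master = candidate  # promote 'livecandidate' to 'master'
        else:
            rejected.append(feature)  # unwind the merge and move on
    return master, rejected
```

A failing feature never reaches master and does not block the features queued behind it.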
All problems in computer science can be solved by another level of indirection - David Wheeler