On a merge to the master branch, a Jenkins job is kicked off to update an AWS Task Definition with a new Docker image carrying a new versioned tag.
I'm trying to adopt IaC as much as possible, but this deployment job creates drift between the actual state of AWS and what Pulumi has recorded.
I've considered pushing a latest tag on every deployment from master and simply restarting the service (which would use the latest tag rather than a versioned one), but then I lose strict version control and rollback becomes more difficult.
What's the best way to achieve an automated CI/CD process while maintaining IaC?
I am not quite clear on the best practices for using CDK to deploy private GitHub repos to AWS. I understand that a pipeline should be created by CDK and that the pipeline should invoke CodeDeploy to deploy the assets, but beyond that the details are murky.
I also want to understand whether, for this use case, it would make sense to have a separate CDK repo responsible for the infrastructure of my project's entire backend, or whether it would make more sense to include CDK code in each individual component repo. As I will be using a microservice/cell-based approach for building out components, the overhead of adding CDK configuration to each component might be substantial.
You can think of CDK as a compiler that takes a given language and 'compiles' it down to CloudFormation templates. Those templates are then uploaded to CloudFormation by the CDK framework and run for you when you execute the deploy command.
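In concrete terms, the two phases map onto two CLI commands. A minimal sketch (the stack name MyServiceStack is a made-up example):

```sh
# 'Compile': emit the CloudFormation template without touching AWS.
cdk synth MyServiceStack > template.yaml

# 'Run': upload the template and let CloudFormation create/update the resources.
cdk deploy MyServiceStack
```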
So, to answer your question more directly: if you can figure out how to do it in CloudFormation, you can do it. That may involve spinning up a CodeBuild project and running a script that executes some API calls, or prepares a package for an EC2 server that then uses that package in the next step, or any number of things.
But remember that CDK synthesizes its CloudFormation template all at once, and it only creates the template. It does not run any scripts that may be part of your CodeBuild projects, and it does not 'wait' for certain things to complete, because it isn't doing anything like that. If you have a sequence of events that need to occur, you want CodePipeline to orchestrate those events for you; and you can certainly set up your CodePipeline with CDK!
As for overhead: maybe at first. But trust me when I say that, with experience, it becomes very quick and easy to generate a CDK stack for a given microservice, and it's super handy. Being able to spin up an ad hoc, on-demand testing environment is very useful. Being able to deploy individual stacks on demand and make quick changes with just a line of code is handy as all get out. And having a single source of code for both your prod and development environments means that a change made once is automatically reflected in each environment on its next deployment.
CDK is a very powerful tool, but it is a very low-level one. It creates the template that will create your resources. That's it. If your resources need to do something after being created in order for something else to happen, you have to use other services to orchestrate that (CodePipeline, Step Functions, CloudWatch Events, etc.).
I am trying to run the latest task definition image built from a GitHub deployment (CD). On AWS, each build creates a new task definition revision, for example "task-api:1", "task-api:2", but my cluster keeps running "task-api:1" even though a newer image has been built. So far I have to manually stop the old task and start a new one. How can I automate this?
You must wrap your tasks in a service and use rolling updates for automated deployments.
When the rolling update (ECS) deployment type is used for your service and a new service deployment is started, the Amazon ECS service scheduler replaces the currently running tasks with new tasks.
Read: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-type-ecs.html
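As a rough sketch of what that looks like with the CLI (all names here are placeholders, not from the question):

```sh
# Create a service that uses the rolling update (ECS) deployment type.
# minimumHealthyPercent/maximumPercent control how the replacement rolls out.
aws ecs create-service \
  --cluster my-cluster \
  --service-name task-api-service \
  --task-definition task-api \
  --desired-count 2 \
  --deployment-controller type=ECS \
  --deployment-configuration "maximumPercent=200,minimumHealthyPercent=50"
```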
This is DevOps, so you need a CI/CD pipeline that will do the rolling updates for you. Look at CodeBuild, CodeDeploy, and CodePipeline (and CodeCommit if you integrate your code repository in AWS with your CI/CD).
Read: https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials-ecs-ecr-codedeploy.html
This is a complex topic, but it pays off in the end.
Judging from what you have said in the comments:
I created my task via the AWS console, I am running just the task definition on its own without service plus service with task definition launched via the EC2 not target both of them, so in the task definition JSON file on my Github both repositories they are tied to a revision of a task (could that be a problem?).
It's difficult to understand exactly how you have this set up, and it'd probably be a good idea for you to go back and understand the services you are using a little better, using the guide you are following or the AWS documentation. Pushing a new task definition does not automatically update services to use the new definition.
That said, my guess is that you need to update the service in ECS to use the latest task definition. You can do that in many ways:
Through the console (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/update-service-console-v2.html).
Through the CLI (https://docs.aws.amazon.com/cli/latest/reference/ecs/update-service.html).
Through IaC, such as the CDK (https://docs.aws.amazon.com/cdk/api/latest/docs/aws-ecs-readme.html).
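For example, the CLI route could look roughly like this (cluster, service, and taskdef.json are placeholders; passing just the family name resolves to the latest ACTIVE revision):

```sh
# Register the new revision (taskdef.json is a hypothetical task definition file).
aws ecs register-task-definition --cli-input-json file://taskdef.json

# Point the service at it; a bare family name picks the newest ACTIVE revision.
aws ecs update-service --cluster my-cluster --service my-service \
  --task-definition task-api
```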
This can be automated but you would need to set up a process to automate it.
I would recommend reading some guides on how you could automate deployment and updates using the CDK. Amazon provides a good guide to get you started: https://docs.aws.amazon.com/cdk/latest/guide/ecs_example.html.
I am programmatically updating the task definition in an ECS service (the service has one task) using the update-service API. What is the best way to track the status of this update? The API returns immediately and does not wait for the update to complete. One way would be to continuously describe the tasks. Is there a better way?
Is there a better way?
It depends, but an alternative would be to create a dedicated CI/CD pipeline to update your service, especially if this happens regularly. This could be done with AWS CodePipeline, following for example:
Tutorial: Amazon ECS Standard Deployment with CodePipeline
Initial setup can be slow, especially the first time, but once done, you gain a lot of benefits. They include:
automated deployments
notifications if deployment fails or succeeds
clear history of all deployments, and more.
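If you would rather stick with plain API calls, note that the AWS CLI also ships a built-in waiter that does the describe-polling for you (names below are placeholders):

```sh
# Polls DescribeServices until the service is stable (runningCount equals
# desiredCount and only one deployment remains), giving up after roughly
# ten minutes with a non-zero exit code.
aws ecs wait services-stable --cluster my-cluster --services my-service
```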
I know that there are already countless questions in this direction, but unfortunately I have not been able to find the right answer yet. If such a post already exists, please just share the link here.
I have several GitLab CI/CD pipelines. The first pipeline uses Terraform to build the complete infrastructure for an ECS cluster based on Fargate. The second and third pipelines create nightly builds of the frontend and the backend and push the Docker images with the tag "latest" into the ECR of the (staging) AWS account.
What I now want to achieve is that the corresponding ECS tasks are redeployed so that the latest Docker images are used. I had assumed there would be a way to do this via CloudWatch Events or the like, but I can't find a good starting point. A workaround would be to install the AWS CLI in the CI/CD pipeline and then do a service update with "force new deployment", but that doesn't seem very elegant to me. Is there a better way?
Conditions:
The solution must be fully automated (either in AWS or in GitLab CI/CD)
Switching to AWS CodePipeline is out of the question
Ideally it should stay as close as possible to AWS standards; I would like to avoid extensive Lambda functions that perform numerous actions, for maintainability reasons.
Thanks a lot!
OK, for everybody who is interested in an answer: I solved it this way.
I execute the following AWS CLI command in the CI/CD pipeline:
aws ecs update-service --cluster <<cluster-name>> --service <<service-name>> --force-new-deployment --region <<region>>
Not the solution I was looking for but it works.
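One possible follow-up (my suggestion, not part of the original pipeline): after the forced deployment you can inspect the rollout state, so the CI job can fail when a rollout does not complete:

```sh
# Returns IN_PROGRESS, COMPLETED, or FAILED for the most recent deployment.
aws ecs describe-services --cluster <<cluster-name>> --services <<service-name>> \
  --query 'services[0].deployments[0].rolloutState' --output text
```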
As a general comment, it is not recommended to always push the same container tag, because rolling back to a previous version in case of failure then becomes really difficult.
One suitable option would be to use git tags.
Let's say you are deploying version v0.0.1.
You can create a file app-version.tf containing the variable backend-version = v0.0.1, which you can then reference in the task definition of the ECS service.
The same can be done for the container build using git describe.
So, you get a new task definition for every git tag and the possibility of rolling back just by changing a value in the terraform configuration.
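A minimal sketch of that flow in the pipeline, assuming the ECR repo URL is in $ECR_REPO and the Terraform variable is named as above:

```sh
# Derive the version from the current git tag, e.g. v0.0.1.
VERSION=$(git describe --tags)

# Build and push the image under that immutable tag.
docker build -t "$ECR_REPO:$VERSION" .
docker push "$ECR_REPO:$VERSION"

# Feed the same version into Terraform so the task definition references it.
terraform apply -auto-approve -var "backend-version=$VERSION"
```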
It is beneficial to refer to images using either digests or unique immutable tags. After the pipeline pushes the image, it could:
Grab the image's digest/unique tag
Create a new revision of the task definition
Trigger an ECS deployment with the new task definition.
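In CLI terms, those three steps might look roughly like this (repository, cluster, and service names and the taskdef.json template are all placeholders):

```sh
# 1) Grab the digest of the image the pipeline just pushed.
DIGEST=$(aws ecr describe-images --repository-name my-api \
  --image-ids imageTag=latest \
  --query 'imageDetails[0].imageDigest' --output text)

# 2) Register a new task definition revision pinned to that digest
#    (taskdef.json is a hypothetical template with an <IMAGE> placeholder).
sed "s|<IMAGE>|123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api@$DIGEST|" \
  taskdef.json > taskdef.rendered.json
aws ecs register-task-definition --cli-input-json file://taskdef.rendered.json

# 3) Point the service at the new revision; ECS starts a rolling deployment.
aws ecs update-service --cluster my-cluster --service my-api \
  --task-definition my-api
```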
As sgramo93 mentions, the big benefit is that rolling back your application can be done by deploying an older revision of the task definition.
We have a long running EMR cluster that has multiple libraries installed on it using bootstrap actions. Some of these libraries are under continuous development and their codebase is on GitHub.
I've been looking to plug Travis CI into AWS EMR, in a similar way to Travis with CodeDeploy. The idea is to get the code on GitHub tested and deployed automatically to EMR, while using bootstrap actions to install the updated libraries on all of EMR's nodes.
A solution I came up with is to use an EC2 instance in the middle, where Travis and CodeDeploy can first be used to deploy the code onto the instance. After that, a launch script on the instance is triggered to create a new EMR cluster with the updated libraries.
However, the above solution means we need to create a new EMR cluster every time we deploy a new version of the system.
Any other suggestions?
You definitely don't want to maintain an EC2 instance to orchestrate a CI/CD process like that. First of all, it introduces a number of challenges because then you need to deal with an entire server instance, keep it maintained, deal with networking, apply monitoring and alerts to deal with availability issues, and even then, you won't have availability guarantees, which may cause other issues. Most of all, maintaining an EC2 instance for a purpose like that simply is unnecessary.
I recommend that you investigate using AWS CodePipeline together with AWS Step Functions and Lambda.
A Step Functions state machine can be used to orchestrate the provisioning of your EMR cluster in a fully serverless environment. With CodePipeline, you can set up a webhook into your GitHub repo to pull your code and spin up a new deployment automatically whenever changes are committed to your master branch (or whatever branch you specify). You can use EMRFS to sync an S3 bucket or folder to your EMR file system and then get the security benefits of IAM, as well as the additional consistency guarantees that come with EMRFS. With Lambda, you also get seamless integration with other services, such as Kinesis, DynamoDB, and CloudWatch, among many others, which simplifies many administrative and development tasks and enables more sophisticated automation with minimal effort.
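As a rough illustration, the provisioning step that the state machine (or a Lambda task within it) ultimately performs boils down to an EMR RunJobFlow call, something like this with the CLI (every name, version, and path here is a placeholder):

```sh
# Recreate the cluster with the updated bootstrap script that installs
# the freshly built libraries on every node.
aws emr create-cluster \
  --name "nightly-build-cluster" \
  --release-label emr-5.36.0 \
  --applications Name=Spark \
  --instance-type m5.xlarge --instance-count 3 \
  --use-default-roles \
  --bootstrap-actions Path=s3://my-bucket/bootstrap/install-libs.sh
```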
There are some great resources and tutorials for using CodePipeline with EMR, as well as in general. Here are some examples:
https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-apache-spark-applications-using-aws/
https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials-ecs-ecr-codedeploy.html
https://chalice-workshop.readthedocs.io/en/latest/index.html
There are also great tutorials for orchestrating applications with Step Functions and Lambda, including the use of EMR. Here are some examples:
https://aws.amazon.com/blogs/big-data/orchestrate-apache-spark-applications-using-aws-step-functions-and-apache-livy/
https://aws.amazon.com/blogs/big-data/orchestrate-multiple-etl-jobs-using-aws-step-functions-and-aws-lambda/
https://github.com/DavidWells/serverless-workshop/tree/master/lessons-code-complete/events/step-functions
https://github.com/aws-samples/lambda-refarch-imagerecognition
https://github.com/aws-samples/aws-serverless-workshops
In the very worst case, if all of those options fail, for example if you need very strict control over the startup process after the EMR cluster completes its bootstrapping, you can always create a Java JAR that is loaded as a final step, and use it either to execute a shell script or to call the various Amazon Java libraries to run your provisioning commands. Even in this case, you still have no need to maintain your own EC2 instance for orchestration purposes (which, in my opinion, would be hard to justify even if it were running in a Docker container in Kubernetes), because you can easily maintain that deployment process too with a fully serverless approach.
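A hedged sketch of that worst-case pattern (cluster ID, bucket, and JAR are placeholders):

```sh
# Run a custom JAR as a final step once bootstrapping has completed;
# the JAR can shell out or call the AWS Java SDK to finish provisioning.
aws emr add-steps --cluster-id j-XXXXXXXXXXXXX \
  --steps Type=CUSTOM_JAR,Name=post-bootstrap-provisioning,ActionOnFailure=CONTINUE,Jar=s3://my-bucket/steps/provision.jar,Args=--run-provisioning
```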
There are many great videos from the Amazon re:Invent conferences that you may want to watch to get a jump start before you dive into the workshops. For example:
https://www.youtube.com/watch?v=dCDZ7HR7dms
https://www.youtube.com/watch?v=Xi_WrinvTnM&t=1470s
Many more such videos are available on YouTube.
Travis CI also supports Lambda deployment, as mentioned here: https://docs.travis-ci.com/user/deployment/lambda/