Old ECS tasks still referenced in new deploy - amazon-web-services

I have an application deployed to Fargate on ECS using CDK. The first time I deployed, the stack worked fine. The task passed all the health checks and everything was good.
I then tried to use the GitLab CD integration which failed.
I destroyed the stack (with cdk destroy), and after cdk deploy the task isn't passing the health checks (even though I see in the logs that the app is responding just fine to the requests). Moreover, I noticed that the new cluster somehow contains the old references to the failed tasks of the previous stack.
Any idea why this behaviour is happening, and/or how to solve it?

Related

AWS CodeDeploy detected that the replacement task set is unhealthy?

I have an ECS Fargate app up and running on AWS, and I was deploying a newer version of my code through CodeDeploy blue-green deployment. I have been using this method for quite some time, and I have never encountered any problems before unless there was actually a problem with the app itself. As always, I initiated the deployment and waited until all the tasks were running, and checked that the traffic has been rerouted to the newer version of task sets. I tested the app on a couple of devices and made sure that it was working correctly. However, after around 20 minutes or so, my service was down for a few minutes and I get an error message like this on CodeDeploy : CodeDeploy detected that the replacement task set is unhealthy. I expected codedeploy to automatically roll-back the deployment, but it was still the newer version of task set that was receiving traffic, and it was working fine! I did see a couple of stopped tasks but I do not have access to their logs anymore since stopped tasks somehow evaporate and are not accessible after some time. I re-ran the deployment with the exact same task definition, and that worked fine too. Does anyone have any idea what might cause a task set to be in an unhealthy state? Thanks so much!

Force configuration update on Amazon Elastic Beanstalk

I'm building a simple web app on Elastic Beanstalk (Dockerized Python/Flask). I had it running successfully on one AWS account and wanted to migrate it to a new AWS account, so I'm recreating the Beanstalk app on the AWS console and trying to deploy the same code via eb deploy.
I noticed that when pushing a configuration update, Beanstalk will attempt the change but then roll it back if the app fails to start with the new change. This is true for several different kinds of changes, and I need to make multiple to get my app fully working (basically I'm just recreating the Beanstalk settings I already have on my other AWS account):
1. Need to set a few environment variables
2. Need to set up a new RDS instance
3. Need to deploy my code (the new application version has been uploaded, but the deployed application version is still the old "sample application" that it started with)
All 3 must be done before this app will fully start. However, whenever I try one of these on its own, Beanstalk will attempt the change, then notice that the app fails to startup (it throws an exception on startup) and then beanstalk rolls back the change. The config rollback occurs even though I have "Ignore Health Check: true" under the deployment settings. (I would think at the very least it let me force update #3 above but apparently not.)
So I'm basically stuck because I can't do all of them at once. Is there a way to --force a configuration update, so that Beanstalk doesn't roll back no matter what happens?
My other thought was that I could potentially make all the edits at once to the JSON config, but I figured that there must be a way to force config changes so people can respond quickly in a crisis without these well-intentioned guardrails.
Thanks for your help!

AWS ECS Blue/Green deployment loses my code

I have a python3 project which runs in a docker container environment.
My Python project uses AWS access keys and secret, but using a credentials file stored on the computer which is added to the container using ADD.
I deployed my project to EC2. The server has one task running which works fine. I am able to go through port 8080 to the webserver (Airflow).
When I do a new commit and push to the master branch in GitHub, the hook downloads the content and deploys it without a build stage.
The new code is on the EC2 server (I checked it using ssh), but the container that runs in the task gets "stuck" and the bind volumes disappear, and they do not work until I start a new task. The volumes are applied again from 0, and those reference the new code. This action is fully manual.
Then, to fix it, I heard about AWS ECS Blue/Green deployment, so I implemented it. In this case the Codepipeline adds a build stage, but here starts the problem. If in the build I try to push a docker image to the ECR, which my task definition makes reference to, it fails. It fails because on the server, and in the repo (to which I commit and push my new code), there is no credentials file.
I tried building the latest docker image from my localhost, avoiding the build stage in Codepipeline, and it works fine, but then when I go to the 8080 port on both working IPs I am able to get into the webserver, but the code is not there. If I click anywhere it says code not found.
So, as a general review, I would like to understand what I am doing wrong and how to fix it in general guidelines, and on the other hand ask why my EC2 instance in the AWS ECS Blue/Green cluster has 3 IPs.
The first one is the one that I use to reach the server through port 22. And if I run docker ps there, I see one or two containers running, depending on whether I am in the middle of a new deployment. If I search for my new code here, it's not here...
The other two ip's are changing after every deployment (I guess its blue and green) and both work fine until Codepipeline destroys the green one (5 minutes wait time), but the code is not there. I know it because when I click in any of the links in the webserver it fails saying the Airflow Dag hasn't been found.
So my problem is that I have a fully working AWS ECS Blue/Green deployment but without my code. Then my webserver doesn't have anything to run.

How to deploy code on multiple instances in an Amazon EC2 Autoscaling group?

So we are launching an ecommerce store built on magento. We are looking to deploy it on Amazon EC2 instance using RDS as database service and using amazon auto-scaling and elastic load balancer to scale the application when needed.
What I don't understand is this:
I have installed and configured my production magento environment on an EC2 instance (database is in RDS). This much is working fine. But now when I want to dynamically scale the number of instances:
how will I deploy the code on the dynamically generated instances each time?
Will aws copy the whole instance, assign it a new ip and spawn it as a new instance, or will I have to write some code to automate this process?
Plus will it not be an overhead to pull code from git and deploy every time a new instance is spawned?
A detailed explanation or direction towards some resources on the topic will be greatly appreciated.
You do this in the AutoScalingGroup Launch Configuration. There is a UserData section in the LaunchConfiguration in CloudFormation where you would write a script that is run whenever the ASG scales up and deploys a new instance.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig.html#cfn-as-launchconfig-userdata
This is the same as the UserData section in an EC2 Instance. You can use LifeCycle hooks that will tell the ASG not to put the EC2 instance into load until everything you want to have configured is set up.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-as-lifecyclehook.html
I linked all CloudFormation pages, but you may be using some other CI/CD tool for deploying your infrastructure, but hopefully that gets you started.
To start, do check AWS CloudFormation. You will be creating templates to design how the infrastructure of your application works ~ infrastructure as code. With these templates in place, you can roll out an update to your infrastructure by pushing changes to your templates and/or to your application code.
In my current project, we have a github repository dedicated to these infrastructure templates and a separate repository for our application code. Create a pipeline for creating AWS resources that would roll out an update to AWS every time you push to the repository on a specific branch.
Create an infrastructure pipeline
have your first stage of the pipeline to trigger build whenever there's code changes to your infrastructure templates. See AWS CodePipeline and also see AWS CodeBuild. These aren't the only AWS resources you'll be needing but those are probably the main ones, of course aside from this being done in cloudformation template as mentioned earlier.
how will I deploy the code on the dynamically generated instances each time?
Check how containers work, it would be better and will greatly supplement on your learning on how launching new version of application work. To begin, see docker, but feel free to check any resources at your disposal
Continuation with my current project: We do have a separate pipeline dedicated for our application, but will also get triggered after our infrastructure pipeline update. Our application pipeline is designed to build a new version of our application via AWS Codebuild, this will create an image that will become a container ~ from the docker documentation.
we have two triggers or two sources that will trigger an update rollout to our application pipeline, one is when there's changes to infrastructure pipeline and it successfully built and second when there's code changes on our github repository connected via AWS CodeBuild.
Check AWS AutoScaling , this areas covers the dynamic launching of new instances, shutting down instances when needed, replacing unhealthy instances when needed. See also AWS CloudWatch, you can design criteria with it to trigger scaling down/up and/or in/out.
Will aws copy the whole instance assign it a new ip and spawn it as a new instance or will I have to write some code to automate this process?
See AWS ElasticLoadBalancing and also check out more on AWS AutoScaling. On the automation process, if ever you'll push through with CloudFormation, instance and/or containers(depending on your design) will be managed gracefully.
Plus will it not be an overhead to pull code from git and deploy every time a new instance is spawned?
As mentioned, earlier having a pipeline for rolling out new versions of your application via CodeBuild, this will create an image with the new code changes and when everything is ready, it will be deployed ~ becomes a container. The old EC2 instance or the old container( depending on how you want your application be deployed) will be gracefully shut down after a new version of your application is up and running. This will give you zero downtime.

Codedeploy with AWS ASG

I have configured an aws asg using ansible to provision new instances and then install the codedeploy agent via "user_data" script in a similar fashion as suggested in this question:
Can I use AWS code Deploy for pulling application code while autoscaling?
CodeDeploy works fine and I can install my application onto the asg once it has been created. When new instances are triggered in the ASG via one of my rules (e.g. high cpu usage), the codedeploy agent is installed correctly. The problem is, CodeDeploy does not install the application on these new instances. I suspect it is trying to run before the user_data script has finished. Has anyone else encountered this problem? Or know how to get CodeDeploy to automatically deploy the application to new instances which are spawned as part of the ASG?
AutoScaling tells CodeDeploy to start the deployment before the user data is started. To get around this CodeDeploy gives the instance up to an hour to start polling for commands for the first lifecycle event instead of 5 minutes.
Since you are having problems with automatic deployments but not manual ones and assuming that you didn't make any manual changes to your instances you forgot about, there is most likely a dependency specific to your deployment that's not available yet at the time the instance launches.
Try listing out all the things that your deployment needs to succeed and make sure that each of those is available before you install the host agent. If you can log onto the instance fast enough (before AutoScaling terminates the instance), you can try and grab the host agent logs and your application's logs to find out where the deployment is failing.
If you think the host agent is failing to install entirely, make sure you have Ruby2.0 installed. It should be there by default on AmazonLinux, but Ubuntu and RHEL need to have it installed as part of the user data before you can install the host agent. There is an installer log in /tmp that you can check for problems in the initial install (again you have to be quick to grab the log before the instance terminates).
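
The bootstrap ordering the answers above describe (user data in the launch configuration, a launch lifecycle hook that keeps the instance out of load, and Ruby installed before the CodeDeploy agent), as a CloudFormation example added here; it is not from any of the posters. Resource names, parameters, sizes and timeouts are assumptions, the AMI is assumed to be Ubuntu with the AWS CLI preinstalled, and the hook is declared inline on the group rather than as the separate resource the answer links, so that it exists before the first instance launches.

```yaml
# Added example: every name, parameter, size and timeout here is an assumption.
Parameters:
  AmiId: {Type: 'AWS::EC2::Image::Id'}            # assumed: Ubuntu with AWS CLI
  SubnetIds: {Type: 'List<AWS::EC2::Subnet::Id>'}
  InstanceProfileArn: {Type: String}  # role needs autoscaling:CompleteLifecycleAction
Resources:
  AppLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: !Ref AmiId
      InstanceType: t3.small
      IamInstanceProfile: !Ref InstanceProfileArn
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          set -euo pipefail   # any failed step aborts; the hook then times out
          # 1. Dependencies first: the CodeDeploy agent needs Ruby.
          apt-get update -y && apt-get install -y ruby-full wget
          # 2. Then the agent itself, from the regional CodeDeploy bucket.
          wget -q -O /tmp/install \
            https://aws-codedeploy-${AWS::Region}.s3.${AWS::Region}.amazonaws.com/latest/install
          chmod +x /tmp/install && /tmp/install auto
          # 3. Only now release the instance from Pending:Wait (IMDSv2 lookup).
          TOKEN=$(curl -sf -X PUT http://169.254.169.254/latest/api/token \
            -H 'X-aws-ec2-metadata-token-ttl-seconds: 300')
          IID=$(curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" \
            http://169.254.169.254/latest/meta-data/instance-id)
          aws autoscaling complete-lifecycle-action --region ${AWS::Region} \
            --auto-scaling-group-name app-asg \
            --lifecycle-hook-name wait-for-bootstrap \
            --instance-id "$IID" --lifecycle-action-result CONTINUE
  AppGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AutoScalingGroupName: app-asg
      LaunchConfigurationName: !Ref AppLaunchConfig
      MinSize: '1'
      MaxSize: '4'
      VPCZoneIdentifier: !Ref SubnetIds
      LifecycleHookSpecificationList:
        - LifecycleHookName: wait-for-bootstrap
          LifecycleTransition: autoscaling:EC2_INSTANCE_LAUNCHING
          HeartbeatTimeout: 900    # seconds allowed for user data to finish
          DefaultResult: ABANDON   # a failed bootstrap is terminated, never put in load
```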