Ecs run vs ecs deploy - amazon-web-services

For example, for a migration task we do ecs run, and to deploy any long-running service we do ecs deploy. Why so?
What is the fundamental difference between these two? ecs run doesn't give back the status of the task it ran (it always returns a non-zero status code when running the service), so we have to poll to get the status of the deployment. So why can't we use ecs deploy instead of ecs run, given that ecs deploy also returns the status of the deployment?

What is the fundamental difference between these two?
aws ecs run-task starts a single task, while aws ecs deploy deploys a new task definition to a service.
Thus the difference is that a single service can run many long-running tasks. Since you are running many tasks in a service, you need a deployment strategy (e.g. rolling or blue/green) for how you roll out new versions of your task definitions.
So the choice of which to use depends on your specific use case. For ad-hoc, short-running jobs, a single task can be sufficient. For hosting business-critical containers, a service is the right choice.
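As a concrete sketch (cluster, task definition, subnet, and security-group names below are placeholders, not taken from the question): a one-off migration can be launched with run-task and its exit code read back once it stops, while a long-running service gets a new task definition rolled out as a deployment:
# one-off job: launch the task, wait for it to stop, then read the container exit code
TASK_ARN=$(aws ecs run-task \
  --cluster my-cluster \
  --task-definition migrate-taskdef \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc],securityGroups=[sg-abc]}" \
  --query 'tasks[0].taskArn' --output text)
aws ecs wait tasks-stopped --cluster my-cluster --tasks "$TASK_ARN"
aws ecs describe-tasks --cluster my-cluster --tasks "$TASK_ARN" --query 'tasks[0].containers[0].exitCode'
# long-running service: roll out a new task definition revision as a deployment
aws ecs update-service --cluster my-cluster --service my-service --task-definition app-taskdef
The wait call still polls under the hood, but it turns "did my migration succeed?" into a single exit code you can act on.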

Related

AWS ECS: Is it possible to make a scheduled ecs task accessible via ALB?

my current ECS infrastructure works as follows: ALB -> ECS Fargate -> ECS service -> ECS task.
Now I would like to replace the normal ECS task with a Scheduled ECS task. But nowhere do I find a way to connect the Scheduled ECS task to the service and thus make it accessible via the ALB. Isn't that possible?
Thanks in advance for answers.
A scheduled task is really meant for something that runs, completes a given job, and then exits.
If you want to connect your ECS task to a load balancer you should run it as part of a Service. ECS will handle connecting the task to the load balancer for you when it runs as a Service.
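For reference, wiring tasks to a target group happens at service creation time; a minimal sketch with placeholder names (the target group ARN, container name, and port are assumptions and must match your task definition):
aws ecs create-service \
  --cluster my-cluster \
  --service-name my-web-service \
  --task-definition my-taskdef \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc],securityGroups=[sg-abc]}" \
  --load-balancers "targetGroupArn=<target-group-arn>,containerName=web,containerPort=8080"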
You mentioned in comments that your end goal is to run a dev environment for a specific time each day. You can do this with an ECS service and scheduled auto-scaling. This feature isn't available through the AWS Web console for some reason, but you can configure it via the AWS CLI or one of the AWS SDKs. You would configure it to scale to 0 during the time you don't want your app running, and scale up to 1 or more during the time you do want it running.
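A minimal sketch of that scheduled scaling via the CLI, assuming a hypothetical service my-dev-service in cluster my-cluster (the cron expressions are just examples for an 08:00-18:00 UTC window):
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-dev-service \
  --min-capacity 0 --max-capacity 1
# scale up to one task in the morning...
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-dev-service \
  --scheduled-action-name dev-scale-up \
  --schedule "cron(0 8 * * ? *)" \
  --scalable-target-action MinCapacity=1,MaxCapacity=1
# ...and back down to zero in the evening
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-dev-service \
  --scheduled-action-name dev-scale-down \
  --schedule "cron(0 18 * * ? *)" \
  --scalable-target-action MinCapacity=0,MaxCapacity=0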
A scheduled ECS task is a one-off task launched with the RunTask API, and it has no ties to an ALB (because it's not part of an ECS service). You could probably make this work, but you'd need to build the wiring yourself by finding out the details of the task and adding it to the target group. I believe what you need to do (if you want ECS to deal with the wiring) is to schedule a Lambda that increments the desired number of tasks in the service. I am also wondering what the use case is for this (as maybe there are other ways to achieve it). Scheduled tasks are usually batch jobs of some sort, not web services that need to be wired to a load balancer. What is the scenario / end goal you have?
UPDATE: I missed the non-UI support for scheduling the desired number of tasks so the Lambda isn't really needed.

Update ECS Task definition for background service from Codepipeline - without a load balancer

Introduction
I am deploying a Django app using Celery for background tasks on Amazon ECS, and we're using CodePipeline for CI/CD. I would like to be able to split this up into three ECS Services, each running only one task - this is so they can be scaled independently. It is proving hard to do while still meeting two key design goals:
Continuous delivery of changes - must be automated
Infrastructure changes must be managed as code
So, fundamentally, updates to ECS Task Definitions need to be versioned in git and updated as part of the automated release process and when they change, the services using them need to be updated.
For the service that accepts the traffic, this all works fine. The issue is with those services on ECS that are performing background tasks. There, I'm hitting a roadblock in that:
CodeDeploy Deployment Groups insist on being associated with a Load Balancer, and
Any Deployment provider that deals with updating the Task Definition requires a Deployment Group.
I think this is limited to the "CodeDeploy" and "ECS Blue/Green" providers.
Neither my "scheduler" nor my "worker" service accepts traffic
So, it comes down to this: what kind of deployment can I do that doesn't require a load balancer, but will still allow me to update the task definition as part of the deployment?
Details
Now, to give you more specifics, the list of services I want is:
"web" service - runs Django, exposed to ALB on port 8000
"scheduler" service - runs Celery "beat", no exposed port
"worker" service - runs Celery worker, no exposed port
For the "web" service, CI/CD is straightforward: we have a CodeDeploy Application with a Deployment Group that is associated with the Application Load Balancer and has the correct target groups, and this does a "Blue/Green" deployment.
We have built some custom tooling that generates a replacement taskdef.json and appspec.yml for each of the services. These tools are invoked during the Build phase of our pipeline and (for the "web" service) applied at deployment time; this is so that updates to the application environments and resources are also managed in code.
So the flow goes:
Build new docker container
Generate new taskdef.json from source templates - filling in resource IDs (secrets etc.) by querying the CloudFormation stack
Generate new appspec.yml with the revision number of the task definition incremented by 1
CodeDeploy creates a new revision of the application based on the new AppSpec and TaskDef (Build artifacts from previous step) and deploys the updated service on the cluster.
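For reference, that last step is roughly what the aws ecs deploy helper in the AWS CLI does outside of CodePipeline (it registers the task definition and creates a CodeDeploy deployment from the AppSpec); a sketch with hypothetical cluster, service, application, and file names:
aws ecs deploy \
  --cluster my-cluster \
  --service web \
  --task-definition build/taskdef.json \
  --codedeploy-appspec build/appspec.yml \
  --codedeploy-application my-codedeploy-app \
  --codedeploy-deployment-group web-deployment-group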
This works well for the "web" service, and I would like something similar for the other two services, but I cannot find a way to either: not have a Deployment Group, but still update the Task Definition; or have a Deployment Group but not have a Load Balancer (because there's no traffic to load balance).
Is there a trick to this? Or a deployment type I've missed that is aimed at background services?
I would appreciate any advice you have to offer. Thanks!
For posterity, the answer I came up with in the end was to create a Lambda dedicated to re-deploying the ECS services for Celery beat and the workers. CodePipeline then deploys the web service using a Blue/Green deployment and calls the Lambda twice (in parallel): once for the scheduler service, once for the worker service.
None of the built-in deployment types were of any help at all in getting this going.
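The Lambda itself isn't shown above; a rough CLI analogue of what such a redeploy step does (family, file, and service names here are hypothetical), with no load balancer or deployment group involved for the background services:
# register the newly generated task definition revision
aws ecs register-task-definition --cli-input-json file://taskdef-worker.json
# point the background service at the latest revision of that family
aws ecs update-service --cluster my-cluster --service worker --task-definition worker-taskdef
# or, if the task definition itself hasn't changed, just replace the running tasks
aws ecs update-service --cluster my-cluster --service worker --force-new-deployment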

How can I update a container image with the imageDigest parameter in an AWS Fargate cluster with the AWS CLI

My cluster is running and a task is running in it.
I need to update the container image in the running task in the cluster. How do I do that?
My image uses the latest tag, and every time new changes come in they are pushed to ECR under the latest tag.
Deploying with the tag latest isn't a best practice because you lose a lot of visibility into what you are doing (e.g. scale-out events where you deploy more tasks as part of a service will all end up using latest but will effectively be running different versions of the code, etc.).
This pontificating aside, you didn't say if you started your task(s) as standalone using the run-task API or if you started your task(s) as part of a service.
If the former, you need to stop your task and run it again. If the latter, you need to redeploy your service using the --force-new-deployment flag.
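In CLI terms, a minimal sketch of both paths (cluster, task, subnet, and service names are placeholders):
# standalone task: stop it and run it again so :latest is pulled fresh
aws ecs stop-task --cluster my-cluster --task <task-id>
aws ecs run-task --cluster my-cluster --task-definition my-taskdef --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc],securityGroups=[sg-abc],assignPublicIp=ENABLED}"
# service: force a new deployment so the tasks are replaced with the freshly pulled image
aws ecs update-service --cluster my-cluster --service my-service --force-new-deployment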

How to provide tasks with different environment variables ECS Terraform

I have an ECS service and within that ECS service, I want to boot up 3 tasks, all from the same task definition. I need each of these tasks to be on a separate EC2 instance; this seems simple enough. However, I want to pass a different command to each of the running tasks to specify where their config can be found, plus some other options, via the CLI within my running application.
For example, for task 1 I want to pass run-node CONFIG_PATH="/tmp/nodes/node_0", for task 2 run-node CONFIG_PATH="/tmp/nodes/node_1" --bootnode true, and for task 3 run-node CONFIG_PATH="/tmp/nodes/node_0" --http true.
I'm struggling to see how I can manage individual task instances like this within a single service using Terraform. It seems really easy to manage multiple instances that are all completely equal, but I can't find a way to pass custom overrides to each task when they are all running off the same task definition.
I am thinking this may be a job for a different dev-ops automation tool but would love to carry on doing it in Terraform if possible.
This is not a limitation of Terraform. This is how an ECS service works: it runs exact copies of the same task definition. Thus, you can't customize individual tasks in an ECS service, as all these tasks are meant to be identical, interchangeable, and disposable.
To provide overrides you have to run the tasks outside of a service, which you can do using run-task or start-task with --overrides in the AWS CLI, or the equivalent in any AWS SDK. Sadly there is no equivalent for that in Terraform, except running local-exec with the AWS CLI.
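A minimal run-task sketch with --overrides (the container name app and the cluster/task-definition names are assumptions; the name has to match the container name in your task definition):
aws ecs run-task \
  --cluster my-cluster \
  --task-definition my-taskdef \
  --count 1 \
  --overrides '{"containerOverrides":[{"name":"app","command":["run-node","CONFIG_PATH=/tmp/nodes/node_1","--bootnode","true"]}]}'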

AWS ECS: Monitoring the status of a service update

I am trying to migrate a set of microservices from Docker Swarm, to AWS ECS using Fargate.
I have created an ECS cluster. Moreover, I have initialized repositories in ECR, each of which contains an image of a microservice.
I have successfully come up with a way to create new images and push them into ECR. In fact, with each change in the code, a new Docker image is built, tagged, and pushed.
Moreover, I have created a task definition that is linked to a service. This task definition contains one container and all the necessary information. Its service defines that the task will run in a VPC, is linked to a load balancer, and has a target group. I am assuming that every new deployment uses the image with the "latest" tag.
So far with what I have explained, everything is clear and is working well.
Below is the part that is confusing me. After every new build, I would like to update the service so that new tasks with the updated image get deployed. I am using the CLI to do so with the following command:
aws ecs update-service --cluster <cluster-name> --service <service-name>
Typically, after performing the command, I am monitoring the deployment logs, under the event tab, and checking the state of the service using the following command:
aws ecs describe-services --cluster <cluster-name> --service <service-name>
Finally, I tried to simulate a case where the newly created image contains bad code, so the new tasks are not able to get deployed. What I have witnessed is that Fargate will keep trying (without stopping) to deploy the new tasks. Moreover, aside from the event logs, the describe-services command does not contain relevant information other than what Fargate is doing (e.g., registering/deregistering tasks). I am surprised that I could not find any mechanism that instructs Fargate, or the service, to stop the deployment and roll back to the already existing one.
I found this article (https://aws.amazon.com/blogs/compute/automating-rollback-of-failed-amazon-ecs-deployments/), which provides a solution. However, it is a fairly complicated one, and it assumes that each new deployment is triggered by a new task definition, which is not what I want.
Therefore, considering what I have described above, I hope you can answer the following questions:
1) Using CLI commands (for automation purposes), is there a way to instruct Fargate to automatically stop the current deployment after it fails to deploy the new tasks a few times?
2) Using the CLI commands, is there a way to monitor the current status of the deployment? For instance, when performing a service update on Docker Swarm, the terminal generates live logs on the update process.
3) After a failed deployment, is there a way for Fargate to signal an error code, flag, or message?
At the moment, ECS does not offer deployment status directly. Once you issue a deployment, there is no way to determine its status other than to continually poll for updates until you have enough information to infer from them. Plus, unexpected container exits are not logged anywhere; you have to search through failed tasks. The way I get them is via a CloudWatch Events rule that triggers a Lambda upon task state change.
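A minimal sketch of such a rule from the CLI (the rule name and the target Lambda ARN are placeholders, not something from the answer):
aws events put-rule \
  --name ecs-task-stopped \
  --event-pattern '{"source":["aws.ecs"],"detail-type":["ECS Task State Change"],"detail":{"lastStatus":["STOPPED"]}}'
aws events put-targets \
  --rule ecs-task-stopped \
  --targets Id=1,Arn=<arn-of-the-lambda-to-invoke>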
I recommend you read: https://medium.com/@aaron.kaz.music/monitoring-the-health-of-ecs-service-deployments-baeea41ae737
As of now, you have a way to do this:
aws ecs wait services-stable --cluster MyCluster --services MyService
The previous example pauses and continues only after it can confirm that the service running on the cluster is stable. It will return a 255 exit code after 40 failed checks.
To cancel a deployment, enable ECS Circuit Breaker when creating your service:
aws ecs create-service \
--service-name MyService \
--deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}" \
{...}
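With the ECS rolling deployment controller you can also poll the rollout state of the current deployment from the CLI; once the circuit breaker gives up, it is reported as FAILED (cluster and service names below are placeholders):
aws ecs describe-services \
  --cluster MyCluster \
  --services MyService \
  --query 'services[0].deployments[0].{state:rolloutState,reason:rolloutStateReason}'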
References:
Service deployment check.
Circuit Breaker