We are in the development phase, so we are using an AWS ECS cluster consisting of 2 EC2 instances. We submit tasks to the ECS cluster using Airflow's ECSOperator. We are looking to scale this process, so we plan to use Airflow's CeleryExecutor, which submits and schedules tasks on Airflow concurrently.
So the question is: should we care about the number of tasks submitted to ECS, or will ECS service all submitted tasks without failure through some internal queuing mechanism, regardless of how many we submit?
Related
I am building a file processing service in AWS and here is what I have now as a manager-worker architecture:
A nodejs application running on an EC2 instance, serving as the manager node; this EC2 instance also hosts a RabbitMQ service with a job queue
An ECS service running multiple task containers, which also run nodejs code. The code in every task container runs custom business logic for processing a job. The task containers get their jobs from the RabbitMQ job queue above: when jobs are enqueued, they are assigned to the ECS task containers, which start processing them.
Now, this ECS service should scale up or down. When there are no jobs in the queue (which happens very frequently), I want to keep just one worker container alive to save on costs.
When a large number of jobs arrives at the manager and is enqueued into the job queue, the manager has to figure out how to scale up.
It needs to figure out how many new worker containers to add to the ECS service. To do this, it needs to know:
the number of task containers in the ECS service now;
the status of each container: is it currently processing a job?
This second point leads to my question: is there a way to set a custom status on a task, such that the status can be read by the application on the EC2 instance through some AWS ECS API?
As others have noted in the comments, there isn't any built-in AWS method to do this. I have two suggestions that I hope can accomplish what you want:
Create a Lambda function that runs on a regular interval and calls your RabbitMQ API to check the queue length. It can then use the ECS API to set the desired task count for your service. You have as much control as you want over the thresholds and scaling strategy in your code.
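A rough sketch of that Lambda (the RabbitMQ host, queue name, cluster and service names below are placeholders, and the one-worker-per-five-jobs policy is just an example):

```python
# Sketch of a Lambda that scales an ECS service from RabbitMQ queue depth.
# All hostnames and resource names here are assumptions, not real values.
import math

def desired_count(queue_depth, jobs_per_worker=5, min_workers=1, max_workers=20):
    """Pure scaling policy: one worker per `jobs_per_worker` queued jobs,
    clamped between min_workers and max_workers."""
    wanted = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))

def handler(event, context):
    import json, urllib.request
    import boto3  # available in the Lambda runtime
    # RabbitMQ management API: GET /api/queues/<vhost>/<queue> returns
    # {"messages": N, ...}; real deployments need credentials on this request.
    with urllib.request.urlopen("http://rabbit-host:15672/api/queues/%2F/jobs") as resp:
        depth = json.load(resp)["messages"]
    boto3.client("ecs").update_service(
        cluster="my-cluster",
        service="worker-service",
        desiredCount=desired_count(depth),
    )
```

The minimum of one worker matches the requirement of keeping a single container alive when the queue is empty.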
Consider using AWS Batch. The compute backend for Batch is also ECS-based, so it might not be such a big change. Long-running jobs where you want to scale the processing up and down are its sweet spot. If you want, you can queue the work directly in Batch and skip Rabbit. Or, if you still need Rabbit, you could create a small job in Lambda (or anywhere else) that pulls the messages out and creates an AWS Batch job for each. Batch supports running on EC2 ECS clusters, but it can also use Fargate, which could simplify your management even further.
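The Rabbit-to-Batch bridge could look roughly like this; the job queue name, job definition name, and message shape are all assumptions:

```python
# Sketch: submit one AWS Batch job per drained RabbitMQ message.
# "file-processing", "file-processor", and the message fields are placeholders.
def batch_job_request(message, job_queue="file-processing", job_definition="file-processor"):
    """Build the kwargs for batch.submit_job() from one queued message."""
    return {
        "jobName": f"process-{message['file_id']}",
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Pass the payload to the container as environment variables.
        "containerOverrides": {
            "environment": [{"name": "FILE_ID", "value": str(message["file_id"])}]
        },
    }

def submit_all(messages, batch_client):
    """batch_client would be boto3.client('batch') in practice."""
    for msg in messages:
        batch_client.submit_job(**batch_job_request(msg))
```

Batch then handles the queuing and compute scaling that the RabbitMQ-plus-manager setup currently does by hand.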
I am currently using MWAA (the AWS managed Airflow service) and triggering a pre-created ECS Task Definition from a DAG running on it. I want to get the status of the triggered ECS task from Airflow.
Similar to how EMR has operators through which I can create a JobFlow and Steps, and then attach sensors to get the failed/succeeded/in-progress status of the Airflow task, I want to replicate the same for ECS tasks, but there aren't any such sensors in the Airflow docs. If I had to create a custom ECS task sensor, would I also need to implement a custom ECS hook, like here? Any references or leads on this would also help.
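For what it's worth, a custom sensor's poke() would mostly boil down to calling the ECS DescribeTasks API (e.g. via boto3) and mapping the response to a state. A sketch of that mapping, with the sensor wiring left as comments since the class and attribute names are assumptions:

```python
# Sketch: core logic for a hypothetical custom Airflow ECS task sensor.
def ecs_task_state(describe_tasks_response):
    """Map an ECS DescribeTasks response to 'running', 'success', or 'failed'."""
    task = describe_tasks_response["tasks"][0]
    if task["lastStatus"] != "STOPPED":
        return "running"  # PROVISIONING / PENDING / RUNNING / DEPROVISIONING
    # Once stopped, success means every container exited with code 0.
    containers = task.get("containers", [])
    if containers and all(c.get("exitCode") == 0 for c in containers):
        return "success"
    return "failed"

# Inside a sensor subclassing airflow's BaseSensorOperator, poke(self, context)
# might then do something like (names assumed):
#   resp = boto3.client("ecs").describe_tasks(cluster=self.cluster, tasks=[self.task_arn])
#   state = ecs_task_state(resp)
#   if state == "failed":
#       raise AirflowException("ECS task failed")
#   return state == "success"
```

A custom hook isn't strictly required for this; a hook mainly centralizes the boto3 connection/credential handling the way the linked example does.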
I have an ECS cluster with an EC2 instance tied to it, and a scheduled task set to run daily using the 'Scheduled Tasks' functionality on the ECS dashboard.
This task runs a bunch of containers that are each relatively expensive in memory, which is compounded even more when all the containers run at once.
I do not currently have a service set up for the ECS cluster, and my understanding is that for my goal of running a set task on some interval, a service would not be used.
AWS's definition of a service in their ECS docs says:
An Amazon ECS service enables you to run and maintain a specified number of instances of a task definition simultaneously in an Amazon ECS cluster.
Since this is not what I want (instead I just need to run a task on some scheduled interval), I gather I do not need a service tied to my ECS cluster.
My question is how to set up autoscaling for my scheduled tasks. The only references I can find to autoscaling within an ECS cluster are about creating ECS services that auto scale, which again is not what I want (at least from how I understand ECS services to work).
What I need is for my EC2 instances to scale with my scheduled task, allocating more resources as needed for the task to run. Would I just need to set up auto scaling on the specific EC2 instance the ECS cluster is tied to from within the EC2 dashboard, or is there some other way to do this from ECS directly?
For the above use case, it is better to use Fargate; then you don't have to maintain or worry about auto-scaling and scheduling. All you need to do is set up the scheduled task, and AWS will take care of the memory and other resources required by it. You will also only pay for the resources actually used by your ECS task, unlike the EC2 launch type, where you pay for the container instance.
AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate makes it easy for you to focus on building your applications. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.
aws fargate
Create a CloudWatch rule based on some schedule that will trigger the task. Make sure the container exits once it completes the job; Fargate will then automatically stop the task.
cloudwatch-event-rule-to-invoke-an-ecs-task
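In boto3, wiring such a rule to a Fargate task could be sketched like this; all ARNs, names, subnet IDs, and the cron expression are placeholders:

```python
# Sketch: schedule a Fargate task with a CloudWatch Events (EventBridge) rule.
def ecs_target(cluster_arn, task_definition_arn, subnet_id, role_arn):
    """Build the Events 'Target' that launches one Fargate task per rule firing."""
    return {
        "Id": "run-scheduled-task",
        "Arn": cluster_arn,
        "RoleArn": role_arn,  # a role allowing Events to call ecs:RunTask
        "EcsParameters": {
            "TaskDefinitionArn": task_definition_arn,
            "TaskCount": 1,
            "LaunchType": "FARGATE",
            "NetworkConfiguration": {
                "awsvpcConfiguration": {
                    "Subnets": [subnet_id],
                    "AssignPublicIp": "ENABLED",
                }
            },
        },
    }

def schedule_daily_task(events_client, **arns):
    """events_client would be boto3.client('events'); fires daily at 06:00 UTC."""
    events_client.put_rule(Name="daily-job", ScheduleExpression="cron(0 6 * * ? *)")
    events_client.put_targets(Rule="daily-job", Targets=[ecs_target(**arns)])
```

The same pair of calls covers the original question too: with `LaunchType` set to `FARGATE`, no EC2 instance sits idle between runs.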
I want to know the best way to run a scheduled task on AWS. Specifically, I would like to pay only for the computation needed to run the task. So if the task runs once per day for 2h, I only pay for 2h of computation. I don't want an EC2 instance running all the time when the task is not running.
Could an AWS expert please explain how to realize this on AWS?
AWS ECS Scheduled Tasks are a perfect solution for you. You can run ECS tasks on top of AWS Fargate, so you don't need to provision any EC2 instances. It's a completely serverless solution with a simple configuration.
Related information:
AWS ECS
AWS Fargate
Scheduling Amazon ECS Tasks
Alternatively, you can simply start and stop EC2 instances with an AWS CloudWatch Event (cron scheduler).
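For example, two scheduled CloudWatch Event rules could invoke one Lambda with different payloads; a minimal sketch, where the instance ID and the `{"action": ...}` payload shape are assumptions:

```python
# Sketch: one Lambda behind two scheduled CloudWatch Event rules, one sending
# {"action": "start"} and one sending {"action": "stop"}.
INSTANCE_IDS = ["i-0123456789abcdef0"]  # placeholder instance ID

def handle(event, ec2_client):
    """Start or stop the instances depending on which rule fired."""
    action = event.get("action")
    if action == "start":
        ec2_client.start_instances(InstanceIds=INSTANCE_IDS)
    elif action == "stop":
        ec2_client.stop_instances(InstanceIds=INSTANCE_IDS)
    else:
        raise ValueError(f"unknown action: {action!r}")
    return action

def lambda_handler(event, context):
    import boto3  # available in the Lambda runtime
    return handle(event, boto3.client("ec2"))
```

With this, the instance is billed only between the start and stop events rather than 24/7.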
I have a Docker image containing Python code and third-party binary executables. There are only outbound network requests. The image must run hourly and each execution lasts ~3 minutes.
I can:
Use an EC2 instance and schedule hourly execution via cron
Create a CloudWatch Event/Rule to run an ECS Task Definition hourly
Set up an Elastic Beanstalk environment and schedule hourly deployment of the image
In all of these scenarios, an EC2 instance is running 24/7 and I am being charged for extended periods of no usage.
How do I schedule the hourly starting of an existing EC2 instance, and the stopping of said instance after my Docker image completes?
Here's one approach I can think of. It's very high-level, and omits some details, but conceptually it would work just fine. You'll also need to consider the Identity & Access Management (IAM) Roles used:
CloudWatch Event Rule to trigger the Step Function
AWS Step Function to trigger the Lambda function
AWS Lambda function to start up EC2 instances
EC2 instance polling the Step Functions service for Activity Tasks
Create a CloudWatch Event Rule to schedule a periodic task, using a cron expression
The Target of the CloudWatch Event Rule is an AWS Step Function
The AWS Step Function State Machine starts by triggering an AWS Lambda function, which starts the EC2 instance
The next step in the Step Functions State Machine invokes an Activity Task, representing the Docker container that needs to execute
The EC2 instance has a script running on it, which polls the Activity Task for work
The EC2 instance executes the Docker container, waits for it to finish, and sends a completion message to the Step Functions Activity Task
The script running on the EC2 instance shuts itself down
The AWS Step Function ends
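The EC2-side polling script from the steps above could be sketched like this; the activity ARN, worker name, and container command are assumptions:

```python
# Sketch of the script on the EC2 instance: poll the Step Functions activity,
# run the container, report completion, and then let the instance shut down.
import json

ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:run-container"  # placeholder

def handle_one_task(sfn_client, run_container):
    """Poll once; if a task is available, run the container and report back.
    sfn_client would be boto3.client('stepfunctions') in practice."""
    task = sfn_client.get_activity_task(activityArn=ACTIVITY_ARN, workerName="ec2-worker")
    token = task.get("taskToken")
    if not token:  # the long poll timed out with no work available
        return False
    try:
        run_container()  # e.g. subprocess.run(["docker", "run", ...], check=True)
        sfn_client.send_task_success(taskToken=token, output=json.dumps({"status": "done"}))
    except Exception as exc:
        sfn_client.send_task_failure(taskToken=token, error="ContainerError", cause=str(exc))
    return True

# After handle_one_task() returns True, the script can shut the instance down,
# e.g. subprocess.run(["sudo", "shutdown", "-h", "now"]).
```

Reporting success or failure back through the task token is what lets the state machine end cleanly (or branch on errors) after the container run.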
Keep in mind that a potentially better option would be to spin up a new EC2 instance every hour, instead of simply starting and stopping the same instance. Although you might get better startup performance by starting an existing instance vs. launching a new instance, you'll also have to spend time to maintain the EC2 instance like a pet: fix issues if they crop up, or patch the operating system periodically. In today's world, it's a commonly accepted practice that infrastructure should be disposable. After all, you've already packaged up your application into a Docker container, so you most likely don't have overly specific expectations around which host that container is actually being executed on.
Another option would be to use AWS Fargate, which is designed to run Docker containers, without worrying about spinning up and managing container infrastructure.
AWS Step Functions
AWS Fargate
Blog: AWS Fargate: An Overview
Creating a CloudWatch Event Rule that triggers on a schedule