Running scheduled Tasks on AWS - amazon-web-services

I want to know the best way to run a scheduled task on AWS. Specifically, I would like to pay only for the computation that is needed to run the task. So if the task runs once per day for 2 hours, I only want to pay for those 2 hours of computation. I don't want to have an EC2 instance running all the time when the task is not running.
Could an AWS expert please explain how to realize this on AWS?

AWS ECS Scheduled Tasks is a perfect solution for you. You can run ECS tasks on top of AWS Fargate, so you don't need to provision any EC2 instances. It's a completely serverless solution with a simple configuration.
Related information:
AWS ECS
AWS Fargate
Scheduling Amazon ECS Tasks
Alternatively, you can just start and stop EC2 instances with an AWS CloudWatch Events rule (cron scheduler).
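As a rough illustration of the scheduled Fargate task approach, here is a minimal boto3 sketch that creates a CloudWatch Events schedule and points it at a Fargate task definition. The cluster, task definition, role, and subnet identifiers are placeholders you would replace with your own:

    import boto3

    events = boto3.client("events")

    # Fire once per day; a cron() expression works here as well.
    events.put_rule(
        Name="daily-batch-task",
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )

    # Point the rule at an ECS task definition, launched on Fargate.
    events.put_targets(
        Rule="daily-batch-task",
        Targets=[{
            "Id": "run-batch-task",
            "Arn": "arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster",   # placeholder
            "RoleArn": "arn:aws:iam::123456789012:role/ecsEventsRole",        # placeholder
            "EcsParameters": {
                "TaskDefinitionArn": "arn:aws:ecs:us-east-1:123456789012:task-definition/my-task:1",
                "TaskCount": 1,
                "LaunchType": "FARGATE",
                "NetworkConfiguration": {
                    "awsvpcConfiguration": {
                        "Subnets": ["subnet-0123456789abcdef0"],              # placeholder
                        "AssignPublicIp": "ENABLED",
                    }
                },
            },
        }],
    )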

Related

AWS ECS: Is it possible to make a scheduled ecs task accessible via ALB?

My current ECS infrastructure works as follows: ALB -> ECS Fargate -> ECS service -> ECS task.
Now I would like to replace the normal ECS task with a Scheduled ECS task. But nowhere do I find a way to connect the Scheduled ECS task to the service and thus make it accessible via the ALB. Isn't that possible?
Thanks in advance for answers.
A scheduled task is really intended for a job that runs to completion and then exits.
If you want to connect your ECS task to a load balancer you should run it as part of a Service. ECS will handle connecting the task to the load balancer for you when it runs as a Service.
You mentioned in comments that your end goal is to run a dev environment for a specific time each day. You can do this with an ECS service and scheduled auto-scaling. This feature isn't available through the AWS Web console for some reason, but you can configure it via the AWS CLI or one of the AWS SDKs. You would configure it to scale to 0 during the time you don't want your app running, and scale up to 1 or more during the time you do want it running.
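For example, here is a minimal boto3 sketch of scheduled scaling for an ECS service, assuming a cluster named my-cluster and a service named dev-env (both placeholders). It registers the service as a scalable target and adds two scheduled actions, one that scales up to 1 task in the morning and one that scales down to 0 in the evening:

    import boto3

    autoscaling = boto3.client("application-autoscaling")
    resource_id = "service/my-cluster/dev-env"          # placeholder cluster/service names

    # Allow the service's DesiredCount to move between 0 and 1.
    autoscaling.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=0,
        MaxCapacity=1,
    )

    # Scale up to 1 task at 08:00 UTC every weekday.
    autoscaling.put_scheduled_action(
        ServiceNamespace="ecs",
        ScheduledActionName="scale-up-morning",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        Schedule="cron(0 8 ? * MON-FRI *)",
        ScalableTargetAction={"MinCapacity": 1, "MaxCapacity": 1},
    )

    # Scale down to 0 tasks at 18:00 UTC every weekday.
    autoscaling.put_scheduled_action(
        ServiceNamespace="ecs",
        ScheduledActionName="scale-down-evening",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        Schedule="cron(0 18 ? * MON-FRI *)",
        ScalableTargetAction={"MinCapacity": 0, "MaxCapacity": 0},
    )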
A scheduled ECS task is a one-off task launched with the RunTask API, and it has no ties to an ALB (because it's not part of an ECS service). You could probably make this work, but you'd need to build the wiring yourself by finding out the details of the task and adding it to the target group. I believe what you need to do (if you want ECS to deal with the wiring) is to schedule a Lambda that increments the desired number of tasks in the service. I am also wondering what the use case is for this (as maybe there are other ways to achieve it). Scheduled tasks are usually batch jobs of some sort, not web services that need to be wired to a load balancer. What is the scenario / end goal you have?
UPDATE: I missed the non-UI support for scheduling the desired number of tasks so the Lambda isn't really needed.
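For reference, the Lambda variant described above only needs a single ECS API call. A minimal sketch, with the cluster and service names as placeholders:

    import boto3

    ecs = boto3.client("ecs")

    def handler(event, context):
        # Invoked by a scheduled CloudWatch Events rule; bumps the service
        # to one task so ECS wires it to the load balancer as usual.
        ecs.update_service(
            cluster="my-cluster",      # placeholder
            service="dev-env",         # placeholder
            desiredCount=1,
        )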

How to set up autoscaling for ECS cluster that uses scheduled tasks and no service?

I have an ECS cluster with an EC2 instance tied to it and a scheduled task set to run daily using the 'Scheduled Tasks' functionality on the ECS dashboard.
This task runs a bunch of containers that are each relatively expensive in memory, and this is compounded even more with all the containers running at once.
I do not currently have a service set up for the ECS cluster, and my understanding is that for my goal of running a set task on some interval, a service would not be used.
AWS's definition of a service in their ECS docs says:
An Amazon ECS service enables you to run and maintain a specified number of instances of a task definition simultaneously in an Amazon ECS cluster.
Since this is not what I want (I just need to run a task on some scheduled interval), I gather I do not need a service tied to my ECS cluster.
My question is how to set up autoscaling for my scheduled tasks. The only references I can find to auto scaling within an ECS cluster have to do with creating ECS services that auto scale, which again is not what I want (at least from how I understand ECS services to work).
What I need is for my EC2 instances to auto scale while my scheduled task is running, allocating more resources as needed for the task to run. Would I just need to set up auto scaling on the specific EC2 instance the ECS cluster is tied to from within the EC2 dashboard, or is there some other way to do this from ECS directly?
For the above use case, it is better to use Fargate; then you will not need to maintain or worry about auto scaling or scheduling the underlying instances. All you need to do is set up the scheduled task, and AWS will take care of the memory and other resources required by your task. You will also only pay for the resources that were used by your ECS task, unlike an EC2 launch type task, where you pay for the container instance.
AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate makes it easy for you to focus on building your applications. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.
AWS Fargate
Create a CloudWatch Events rule based on a schedule that will trigger the task. Make sure that the container exits once it completes the job; Fargate will then automatically stop the task.
cloudwatch-event-rule-to-invoke-an-ecs-task
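With Fargate, the CPU and memory your task gets (and what you are billed for) come from the task definition itself. A minimal boto3 sketch of registering one; the family, role, and image names are placeholders:

    import boto3

    ecs = boto3.client("ecs")

    ecs.register_task_definition(
        family="scheduled-batch",                          # placeholder
        requiresCompatibilities=["FARGATE"],
        networkMode="awsvpc",
        cpu="512",        # 0.5 vCPU, billed only while the task runs
        memory="1024",    # 1 GB
        executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
        containerDefinitions=[{
            "name": "batch-job",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/batch-job:latest",  # placeholder
            "essential": True,
            # The container should exit when the job is done; Fargate then stops the task.
        }],
    )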

Schedule Docker image to be run periodically on AWS ECS?

How do I schedule a Docker image to be run periodically (hourly) using ECS, without having to use a continuously running EC2 instance + cron? I have a Docker image containing third-party binaries and a Python project.
The EC2 + cron approach is not viable long-term, as it's expensive to keep the instance running 24/7 while it is only used for a small fraction of the day, given that each invocation of the script lasts only ~3 minutes.
For an AWS ECS cluster, it is recommended to have at least one EC2 server running 24x7. Have you looked at whether AWS Fargate can run your Docker container? Or AWS Batch? If Fargate and AWS Batch are not possible, then for your requirement I would recommend something like this, without ECS:
Build an EC2 AMI with Docker and the required software and libraries pre-installed.
Use AWS Instance Scheduler to spin up an EC2 server every hour and, as part of the user data, start a Docker container with the image you mentioned.
https://aws.amazon.com/answers/infrastructure-management/instance-scheduler/
If you know your task execution time (say ~5 minutes), bring the server down with the scheduler after 8 or 10 minutes.
The above approach will blindly start an EC2 instance and stop it without knowing whether your Python work finished successfully. We can still improve on it with a combination of Lambda and CloudFormation templates; a rough sketch of the Lambda part is shown below. Let me know your thoughts :)
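One possible shape of that Lambda-based improvement is a pair of scheduled Lambda handlers, one that starts the instance and one that stops it. The instance ID is a placeholder, and this is only a sketch of the idea (it still does not verify the job's result):

    import boto3

    ec2 = boto3.client("ec2")
    INSTANCE_ID = "i-0123456789abcdef0"    # placeholder

    def start_handler(event, context):
        # Triggered hourly by a scheduled CloudWatch Events rule.
        # The instance's user data / systemd unit runs the Docker container on boot.
        ec2.start_instances(InstanceIds=[INSTANCE_ID])

    def stop_handler(event, context):
        # Triggered ~10 minutes later by a second scheduled rule.
        ec2.stop_instances(InstanceIds=[INSTANCE_ID])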
Actually, it's possible to schedule the launch directly in CloudWatch by defining a rule, as explained in
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduled_tasks.html
This solution is cleaner, because you will not need to worry about the execution time: once finished, the task will just terminate, and a new one will be spawned on the next cycle.

Scheduling the stopping/starting of an EC2 instance when not in use by a Beanstalk Deployment or an ECS task?

I have a Docker image containing Python code and third-party binary executables. There are only outbound network requests. The image must run hourly and each execution lasts ~3 minutes.
I can:
Use an EC2 instance and schedule hourly execution via cron
Create a CloudWatch Event/Rule to run an ECS Task Definition hourly
Setup an Elastic Beanstalk environment and schedule hourly deployment of the image
In all of these scenarios, an EC2 instance is running 24/7 and I am being charged for extended periods of no usage.
How do I accomplish scheduling the starting of an existing EC2 instance hourly and the stopping of said instance after my Docker container completes?
Here's one approach I can think of. It's very high-level, and omits some details, but conceptually it would work just fine. You'll also need to consider the Identity & Access Management (IAM) Roles used:
CloudWatch Event Rule to trigger the Step Function
AWS Step Function to trigger the Lambda function
AWS Lambda function to start up EC2 instances
EC2 instance polling the Step Functions service for Activity Tasks
Create a CloudWatch Event Rule to schedule a periodic task, using a cron expression
The Target of the CloudWatch Event Rule is an AWS Step Function
The AWS Step Function State Machine starts by triggering an AWS Lambda function, which starts the EC2 instance
The next step in the Step Functions State Machine invokes an Activity Task, representing the Docker container that needs to execute
The EC2 instance has a script running on it, which polls the Activity Task for work
The EC2 instance executes the Docker container, waits for it to finish, and sends a completion message to the Step Functions Activity Task
The script running on the EC2 instance shuts itself down
The AWS Step Function ends
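As a rough sketch of steps 5 through 7, the script on the EC2 instance could look something like the following. The activity ARN and image name are placeholders, and the docker/shutdown commands assume the instance is set up accordingly:

    import subprocess
    import boto3

    sfn = boto3.client("stepfunctions")
    ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:run-docker-job"  # placeholder

    def main():
        # Long-poll the Step Functions activity for work (returns within ~60 seconds).
        task = sfn.get_activity_task(activityArn=ACTIVITY_ARN, workerName="worker-1")
        token = task.get("taskToken")
        if not token:
            return  # nothing scheduled right now

        # Run the container and report the outcome back to the state machine.
        result = subprocess.run(["docker", "run", "--rm", "my-image:latest"])  # placeholder image
        if result.returncode == 0:
            sfn.send_task_success(taskToken=token, output="{}")
        else:
            sfn.send_task_failure(taskToken=token, error="JobFailed",
                                  cause=f"docker exited with {result.returncode}")

        # Shut the instance down once the work has been reported.
        subprocess.run(["sudo", "shutdown", "-h", "now"])

    if __name__ == "__main__":
        main()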
Keep in mind that a potentially better option would be to spin up a new EC2 instance every hour, instead of simply starting and stopping the same instance. Although you might get better startup performance by starting an existing instance vs. launching a new instance, you'll also have to spend time to maintain the EC2 instance like a pet: fix issues if they crop up, or patch the operating system periodically. In today's world, it's a commonly accepted practice that infrastructure should be disposable. After all, you've already packaged up your application into a Docker container, so you most likely don't have overly specific expectations around which host that container is actually being executed on.
Another option would be to use AWS Fargate, which is designed to run Docker containers, without worrying about spinning up and managing container infrastructure.
AWS Step Functions
AWS Fargate
Blog: AWS Fargate: An Overview
Creating a CloudWatch Event Rule that triggers on a schedule

Does AWS ECS internally maintain tasks queue

We are in the development phase, so we are using an AWS ECS cluster consisting of 2 EC2 instances. We are submitting tasks to the ECS cluster using Airflow's ECSOperator. We are looking to scale this process, so we are going to use Airflow's CeleryExecutor, which is used to concurrently submit and schedule tasks on Airflow.
So the question is: should we care about the number of tasks submitted to ECS, or, irrespective of the number of tasks submitted, will ECS service all of them without failure via some internal queuing mechanism?