I have a question about ECS Scheduled Tasks.
My first impression is this: I can't tell a scheduled task to run N tasks for two hours and then stop them. The only options are rate and cron expressions, and neither helps me, because they are designed to start N tasks every N minutes/hours/days and stop each task once its main process finishes.
But I want to run N tasks for a specific length of time, because my task is an API and I only need that API during a specific window of time.
Does anyone know how to solve this? Or are scheduled tasks simply not the right tool for what I want?
I think you can achieve this using Step Functions and AWS Lambda with the boto3 client.
Step Functions has a Wait state that can pause until a specific future time, which fits your use case.
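As a minimal sketch of that idea, assume a three-state machine: a "start" Lambda, a Wait state of 7200 seconds, then a "stop" Lambda. The cluster, task definition, and subnet below are placeholders:

```python
import boto3

ecs = boto3.client("ecs")

def start_tasks(event, context):
    """First state: launch N copies of the API task."""
    resp = ecs.run_task(
        cluster="my-cluster",            # placeholder
        taskDefinition="my-api-task",    # placeholder
        count=event.get("count", 4),
        launchType="FARGATE",
        networkConfiguration={"awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],  # placeholder
            "assignPublicIp": "ENABLED",
        }},
    )
    # Pass the task ARNs through the state machine so the stop step can find them.
    return {"taskArns": [t["taskArn"] for t in resp["tasks"]]}

def stop_tasks(event, context):
    """Last state, after the two-hour Wait: stop everything we started."""
    for arn in event["taskArns"]:
        ecs.stop_task(cluster="my-cluster", task=arn)  # placeholder cluster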
I need to implement a comprehensive retry schedule in Airflow, say 3x with a 5-minute interval and then 4x with a 2-hour interval. What is the best way to do so? I am thinking of running the same task with the second retry schedule from the on_failure_callback of the task with the first retry schedule, but the problem is that in that case I cannot (or don't know how to) make downstream task(s) depend on the callback-launched task only if it was actually started. Maybe my idea is totally wrong. Any advice?
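For reference, Airflow's built-in retry knobs only express a single schedule, i.e. the first phase; the 4x/2h second phase is exactly the part with no native equivalent, which is why the question reaches for on_failure_callback. A minimal sketch (Airflow 2-style imports; the DAG id and job script are made up):

```python
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("retry_demo", start_date=datetime(2023, 1, 1), schedule_interval=None) as dag:
    first_phase = BashOperator(
        task_id="flaky_job",
        bash_command="python run_job.py",   # hypothetical job script
        retries=3,                          # phase one: 3x ...
        retry_delay=timedelta(minutes=5),   # ... with a 5-minute interval
        # on_failure_callback=start_phase_two  # where the hand-off would go
    )
```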
My Lambda function has a limit of 15 minutes, which was 5 minutes earlier. The Lambda process is automatically terminated after 15 minutes, but my process takes more than 15 minutes. How can I manage this?
There is no way around this. If you're doing some sort of long-running processing, your other option may be to run the task on an EC2 instance. If the long-running process can be broken down into multiple steps, you could look into AWS Step Functions.
15 minutes is the max, and this max cannot be extended.
EDIT:
Recently I started running some long-running tasks that are variable in length (anywhere from a couple of minutes to several hours). To accomplish this I've been using AWS Fargate, and my task is a Node.js script stored as a Docker container in ECR. Doing this was fairly easy and also fairly cheap (I think we spent a little over $1 in a month running this task daily). This may be worth looking into for others who come across this answer.
https://docs.aws.amazon.com/AmazonECS/latest/userguide/scheduled_tasks.html
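For what it's worth, wiring up such a scheduled task can also be done from code. Here's a rough boto3 sketch using a CloudWatch Events rule with an ECS target; all ARNs, names, and the subnet are placeholders:

```python
import boto3

events = boto3.client("events")

# Fire once a day; a cron() expression works here too.
events.put_rule(Name="daily-fargate-job", ScheduleExpression="rate(1 day)")

events.put_targets(
    Rule="daily-fargate-job",
    Targets=[{
        "Id": "daily-fargate-job-target",
        "Arn": "arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster",  # cluster ARN
        "RoleArn": "arn:aws:iam::123456789012:role/ecsEventsRole",       # Events->ECS role
        "EcsParameters": {
            "TaskDefinitionArn": "arn:aws:ecs:us-east-1:123456789012:task-definition/my-task:1",
            "TaskCount": 1,
            "LaunchType": "FARGATE",
            "NetworkConfiguration": {"awsvpcConfiguration": {
                "Subnets": ["subnet-0123456789abcdef0"],
                "AssignPublicIp": "ENABLED",
            }},
        },
    }],
)
```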
Typically you'd use a Fat Lambda strategy or a Step Function.
Fat Lambda Strategy
A Fat Lambda strategy is used when your task is singular but has a long-running execution time and/or heavy hardware requirements. The idea is that you create a script that executes your long process and put it into a Docker container hosted in Fargate, which means no limits on execution time and access to powerful hardware. (How to create a Fat Lambda: https://youtu.be/XUp9SHIHU8w)
Step Function strategy
A Step Function strategy is used to break your entire process down into smaller steps. Usually a Step Function strategy works for you if your process can be expressed as lots of small stages linked together instead of one colossal job attempting to do everything at once. Bear in mind that a "Fat Lambda" can also be triggered by a Step Function. (How to create a Step Function: https://www.youtube.com/watch?v=s0XFX3WHg0w)
Also, another note: remember that lambdas can trigger other lambdas, so you might even be able to have different lambdas run bits of your work. For example, a FOR loop that sends off lots of mini lambdas to run small tasks (see the sketch after the decision flow below). You might not even need a Step Function or a Fat Lambda.
If you're stuck on what to choose, follow the flow below; it will help you reason about your problem.
Singular Lambda >> Lambda invokes another Lambda? >> Step Functions? >> Fargate (Fat Lambda)?
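As a sketch of the fan-out note above: a parent Lambda firing off mini worker Lambdas asynchronously. The worker's name and the payload shape are made up:

```python
import json
import boto3

lam = boto3.client("lambda")

def handler(event, context):
    # FOR loop sending off one mini lambda per chunk of work
    for chunk in event["chunks"]:
        lam.invoke(
            FunctionName="my-worker-lambda",  # hypothetical worker
            InvocationType="Event",           # async: fire and forget
            Payload=json.dumps({"chunk": chunk}),
        )
```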
If you can checkpoint the task, then you can check getRemainingTimeInMillis on the Lambda context, and if time is running out, invoke the same lambda with a parameter telling it where to continue.
Something like this flow:
start working (0% done)
time is running low (40% done) => start a new lambda telling it to start from 40%
old lambda is terminated, new lambda starts working (40%)
when its time is running low, start a new lambda again (80%)
the third lambda finishes the job
But it requires a very specific type of task to support this. If you require a single execution from start to finish, then lambda is not a good choice for this.
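A rough sketch of that flow, assuming the work can be chopped into small units (do_one_unit is a stand-in for your real work loop):

```python
import json
import boto3

lam = boto3.client("lambda")
SAFETY_MS = 60_000  # hand off with a minute to spare

def handler(event, context):
    pos = event.get("progress", 0)
    while pos < 100:
        if context.get_remaining_time_in_millis() < SAFETY_MS:
            # Time is running low: start a fresh copy of this same lambda,
            # telling it where to continue, then let this one terminate.
            lam.invoke(
                FunctionName=context.function_name,
                InvocationType="Event",
                Payload=json.dumps({"progress": pos}),
            )
            return
        pos = do_one_unit(pos)  # hypothetical: advances progress one step
```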
What do you think about using a lambda to trigger an ECS task? An ECS task just runs a containerized application for as long as it needs to run.
This blog post is relevant: https://www.gravitywell.co.uk/insights/using-ecs-tasks-on-aws-fargate-to-replace-lambda-functions/.
AWS Lambda is meant for quick processing; if your task is this long, it's better to build that functionality some other way. You can set the timeout property for an AWS Lambda, but it cannot exceed 15 minutes.
For your use case it's better to deploy your application on an EC2 instance and terminate the instance when processing is done or it stays idle longer than a threshold.
Refer to the AWS Lambda documentation: https://docs.aws.amazon.com/lambda/index.html
To add to the Step Function answer, here's a very simple playbook:
Work for 10 minutes
Write progress to S3
Kick off another lambda to consume your progress
Terminate
Once you're done, write the output. Voilà: an infinite-runtime lambda with very little effective overhead.
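A compact sketch of that playbook, with the bucket, key, and work function as placeholders:

```python
import json
import boto3

s3 = boto3.client("s3")
lam = boto3.client("lambda")

def handler(event, context):
    state = work_for_ten_minutes(event)  # hypothetical: returns partial state
    if not state["done"]:
        # Write progress to S3, then kick off the next lambda and terminate.
        s3.put_object(Bucket="my-progress-bucket", Key="job-state.json",
                      Body=json.dumps(state))
        lam.invoke(FunctionName=context.function_name,
                   InvocationType="Event",
                   Payload=json.dumps({"resume_key": "job-state.json"}))
```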
No, you cannot run a lambda for more than 15 minutes!
But yes, you can manage this using signals.
Basically, a signal can tell you to start plan B when plan A won't finish within the 15 minutes. If you can decouple the tasks in your process and add checkpoints, then the next lambda invocation can pick up the work in plan B, or plan B can create DB entries for the unprocessed parts so another run can reprocess them.
Framework here -
https://gist.github.com/kuharan/c2bfddac7bd8dc5702f6eec31729fb48
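The gist above is the full framework; the core trick looks roughly like this (the plan-A and plan-B helpers are placeholders). Note this relies on the handler running in the main thread, which is the case for the Python runtime:

```python
import signal

def plan_b(signum, frame):
    # Plan A won't finish in time: record the unprocessed parts
    # (e.g. as DB entries) so another run can reprocess them.
    save_remaining_work()  # hypothetical
    raise SystemExit("handed off to plan B")

def handler(event, context):
    signal.signal(signal.SIGALRM, plan_b)
    # Arm the alarm 60s before Lambda's hard deadline.
    signal.alarm(max(1, context.get_remaining_time_in_millis() // 1000 - 60))
    process_everything(event)  # hypothetical plan A
    signal.alarm(0)            # finished in time: cancel the alarm
```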
In one of my ECS clusters I have a scheduled Fargate task that's meant to spin up 8 instances of its given target. However, when the task fires it starts way more than 8 tasks, sometimes as many as 50. Does anyone know what could be causing this to happen?
Details:
Cron Expression: cron(40 16 ? * 1-5 *)
Target Definition:
For anyone who might run into this problem in the future:
This problem occurred because we had too many tasks running in the cluster. As of the writing of this answer, AWS sets a limit of 50 running tasks in a single cluster. Before the rule triggered, there were already close to 50 tasks running. The rule would fire and start spinning up new tasks, trying to reach the desired count (8).
However, due to the limit it could never reach 8, because new tasks over the limit would just get shut down. So it would keep trying, and keep trying, and keep trying to spin up tasks, which led to a huge pending queue of tasks that seemingly pushed (nearly) all of our tasks out of the cluster, leaving us with way more tasks than we had asked for.
The solution: we just moved the scheduled task into a new cluster to avoid the 50 task limit.
I am a newbie to Airflow, but I am now working on how to throttle running jobs in Airflow. Does anyone know a bit about concurrency or throttling in Airflow? Any suggestions would be helpful.
Thanks a lot.
If you want to throttle tasks in a DAG, you need to define its concurrency parameter:
"concurrency" defines how many running task instances a DAG is allowed
to have, beyond which point things get queued.
If you want to throttle tasks globally, look into these lines of the config file:
The amount of parallelism as a setting to the executor. This defines
the max number of task instances that should run simultaneously
on this airflow installation
parallelism = 32
And
The number of task instances allowed to run concurrently by the scheduler
dag_concurrency = 16
The first is global; the second is the default concurrency value for all DAGs.
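For the per-DAG case, that's one argument on the DAG object; a minimal sketch (the dag_id and dates are made up):

```python
from datetime import datetime
from airflow import DAG

dag = DAG(
    dag_id="throttled_dag",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    concurrency=4,  # at most 4 running task instances of this DAG at once
)
```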
MapReduce tasks are run within a parent pipeline, and of course we all know they can run for a very long time. But at the same time, the pipeline API documents that a pipeline must complete within 10 minutes (https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python). What is the proper way to understand this?
Thanks.
That pipeline documentation is really old. When it was written, tasks were limited to 10 minutes. Now you can configure a non-default module (what used to be called a "backend") using basic/manual scaling, which will allow a task to run for 24 hours:
https://cloud.google.com/appengine/docs/python/modules/#Python_Instance_scaling_and_class
(Note: if you run a task on an auto-scaled module, it will still be limited to 10 minutes.)
The entire pipeline doesn't have to finish within 24 hours, though. The "root" pipeline (the first task that runs) can yield many child pipelines, and each of those can yield further pipelines. Each pipeline is a task that has to run within the allotted time (10 minutes or 24 hours); when it is done, it signals its parent to wake up and finish. So the overall pipeline could run for days or months or whatever.
We have our app split into two modules: one for the front end (default, auto-scaled) that handles web requests, and one for the "back end" (basic scaling) that runs all of our tasks.
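A minimal sketch of the fan-out shape described above, using the appengine-pipelines Python API (class names and the worker function are made up):

```python
import pipeline  # the appengine-pipelines library

class ShardWork(pipeline.Pipeline):
    def run(self, shard_id):
        # Each child pipeline is its own task, so it gets its own
        # 10-min (auto-scaled) or 24-hr (basic/manual scaled) budget.
        return do_shard(shard_id)  # hypothetical worker

class RootPipeline(pipeline.Pipeline):
    def run(self, num_shards):
        # The root yields many children; the framework wakes the
        # parent back up when they have all finished.
        for shard_id in range(num_shards):
            yield ShardWork(shard_id)
```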