I am new to AWS and experimenting with AWS Lambda and Fargate. I have a long-running process that I have defined as an AWS Fargate containerized task. This task is started with the ecs.runTask(taskParams, callback) API call from an AWS Lambda function. The Lambda function itself is triggered by a notification of a file uploaded into an S3 bucket.
You can now configure your AWS Lambda functions to run up to 15
minutes per execution. Previously, the maximum execution time
(timeout) for a Lambda function was 5 minutes. (Source: Amazon)
My question is: does ecs.runTask() run the task asynchronously inside an on-demand container, without the Lambda function that triggered it waiting for its completion? Does that explain why the Lambda function is no longer bound by the task's running time? Is this a recommended approach for long-running processes where we don't want an ECS instance sitting around idle?
Finally, what is the difference between the ecs.runTask() and ecs.startTask() API calls?
asynchronously inside an on-demand container without the lambda function that triggered it waiting for its completion?
Yes. Lambda just starts the task and returns; it does not wait for the task to complete.
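For illustration, here is a minimal sketch of such a handler in Python with boto3; the cluster name, task definition, subnet, and security group are placeholders you would replace with your own values:

import boto3

ecs = boto3.client('ecs')

def lambda_handler(event, context):
    # run_task returns as soon as ECS accepts the request;
    # the Lambda function does not wait for the task to finish.
    response = ecs.run_task(
        cluster='my-cluster',                 # placeholder cluster name
        launchType='FARGATE',
        taskDefinition='my-task',             # placeholder task definition
        count=1,
        networkConfiguration={
            'awsvpcConfiguration': {
                'subnets': ['subnet-0123456789abcdef0'],      # placeholder
                'securityGroups': ['sg-0123456789abcdef0'],   # placeholder
                'assignPublicIp': 'ENABLED'
            }
        }
    )
    return response['tasks'][0]['taskArn']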
what is the difference between ecs.runTask() and ecs.startTask() api calls?
startTask can only be used with the EC2 launch type and requires you to explicitly choose which container instance to run your task on. It can't be used with Fargate, and it launches a single task per call.
runTask can be used with both the EC2 and Fargate launch types. With runTask, ECS decides where to place your tasks (i.e. which instance), and you can run multiple copies of a task at once.
There are probably more differences, but I think the above are the key ones.
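As a rough illustration in Python with boto3 (cluster, task definition, and container instance ARN are placeholders), the two calls look like this:

import boto3

ecs = boto3.client('ecs')

# runTask: ECS decides placement (EC2 or Fargate) and can start several copies at once
ecs.run_task(
    cluster='my-cluster',
    launchType='EC2',
    taskDefinition='my-task',
    count=3
)

# startTask: EC2 launch type only; you explicitly name the container instance(s) to use
ecs.start_task(
    cluster='my-cluster',
    taskDefinition='my-task',
    containerInstances=['arn:aws:ecs:us-east-1:123456789012:container-instance/my-cluster/abc123']
)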
I have an ECS cluster running Fargate services (Spring Boot applications).
I want to run a Lambda function that checks whether the (Spring Boot) applications are up and running. This function has to be called after receiving an event indicating that the Fargate tasks are running successfully.
Is there a way to configure such an event in CloudWatch, or is there another possible solution for this?
One way to write Lambda functions that interact with AWS Fargate containers is to use the AWS ECS API from within the function. You can write the Lambda function in any supported programming language.
I am not 100% clear what you mean by consuming services. Do you want your Lambda function to perform actions on the container, such as stopping or starting tasks?
If so, you can write an AWS Lambda function using the Java runtime API and then use the software.amazon.awssdk.services.ecs.EcsClient to implement your business logic. Using this service client, you can perform many ECS operations, all from within an AWS Lambda function.
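The same idea in Python with boto3 (the SDK used elsewhere in this thread) looks roughly like the sketch below; the cluster name is a placeholder, and the equivalent Java calls on EcsClient follow the same shape:

import boto3

ecs = boto3.client('ecs')

def lambda_handler(event, context):
    # List the tasks in the cluster and report their status
    task_arns = ecs.list_tasks(cluster='my-cluster')['taskArns']   # placeholder cluster name
    if not task_arns:
        return []
    tasks = ecs.describe_tasks(cluster='my-cluster', tasks=task_arns)['tasks']
    return [{'taskArn': t['taskArn'], 'lastStatus': t['lastStatus']} for t in tasks]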
On the other hand, if you want your Lambda function to retrieve data from a service running in a container (for example, JSON data), you can invoke the service from within the Lambda function with ordinary programming logic and handle the response. No specific AWS APIs are needed for this.
Also, AWS CloudWatch is not used to invoke a service running in a container. It is simply a service that logs and monitors data.
You can configure an EventBridge (formerly known as CloudWatch Events) rule to react to an ECS state change by triggering a Lambda. Documentation for setting up the trigger is here, and information about ECS event types is here.
I'm not sure if either of those pages show an actual event rule. Here's one copied from a live system; you'll need to change the account ID and task definition name:
{
  "detail-type": ["ECS Task State Change"],
  "source": ["aws.ecs"],
  "detail": {
    "taskDefinitionArn": [{
      "prefix": "arn:aws:ecs:us-east-1:123456789012:task-definition/MyTaskDefinition:"
    }],
    "lastStatus": ["RUNNING"]
  }
}
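For context, a minimal sketch of the Lambda function such a rule could target (in Python) might look like this; the health-check step is only hinted at, since the application URL is not known here:

def lambda_handler(event, context):
    # EventBridge delivers the ECS task state change as the event payload
    detail = event['detail']
    print(f"Task {detail['taskArn']} is now {detail['lastStatus']}")
    # From here you could call the application's own health endpoint
    # (e.g. Spring Boot's /actuator/health) to confirm it is really ready.
    return detail['lastStatus']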
Also be aware that for something like a web app, the container might be up and running before the app is ready to handle requests.
I know about the provisioned concurrency configuration for Lambda functions. Is it possible to run multiple instances of a Lambda function on a timer? I know we generally use CloudWatch Events for this; I just don't know how to specify multiple instances.
To be clear, I want something like: I want 10 instances of my function to run at "2022-02-02 10:10:10".
Some options:
Create 10 identical CloudWatch events
Create a new Lambda that is triggered by your single CloudWatch event; the new Lambda then invokes your worker Lambda function 10 times asynchronously (see the sketch after this list)
Create a Step Functions state machine that triggers 10 Lambda invocations, and trigger the step function on a schedule
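A hedged sketch of the second option in Python with boto3 (the worker function name is a placeholder):

import json
import boto3

lambda_client = boto3.client('lambda')

def lambda_handler(event, context):
    # Fan out: asynchronously invoke the worker function 10 times
    for i in range(10):
        lambda_client.invoke(
            FunctionName='my-worker-function',    # placeholder worker name
            InvocationType='Event',               # 'Event' = fire-and-forget async invoke
            Payload=json.dumps({'instance': i})
        )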
I currently have a service running with ECS + Fargate. Now I want to create a Lambda function that would be triggered synchronously by an ECS task in that service.
Is it possible to trigger a Lambda function from an ECS task? I have come across documentation in which Lambda is used to invoke ECS, but not the other way around.
"You can invoke Lambda functions directly with the Lambda console, the Lambda API, the AWS SDK, the AWS CLI, and AWS toolkits." So you'll have to decide which method is going to work best for your ECS tasks. You will need to set the correct permissions so that your ECS will be able to invoke the Lambda function.
If you want to invoke Lamda in an async manner, you can publish a message to SQS from your ECS task and SQS can trigger the Lamda on message receive.
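As a sketch of the synchronous approach (in Python with boto3, running inside the ECS task; the function name is a placeholder, and the task role needs lambda:InvokeFunction permission):

import json
import boto3

lambda_client = boto3.client('lambda')

# Synchronous invocation from inside the ECS task: the call blocks until the
# Lambda function returns, and its response payload is available immediately.
response = lambda_client.invoke(
    FunctionName='my-function',               # placeholder function name
    InvocationType='RequestResponse',         # synchronous
    Payload=json.dumps({'source': 'ecs-task'})
)
result = json.loads(response['Payload'].read())
print(result)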
In my architecture, when I receive a new file in an S3 bucket, a Lambda function triggers an ECS task.
The problem occurs when I receive multiple files at the same time: the Lambda will trigger multiple instances of the same ECS task, which act on the same shared resources.
I want to ensure only one instance of a specific ECS task is running at a time. How can I do that?
Is there a specific setting that can ensure it?
I tried to query the ECS cluster before running a new instance of the ECS task, but (using the AWS Python SDK) I didn't receive any information while the task was in PROVISIONING status; the SDK only returns data when the task is in PENDING or RUNNING.
Thank you
I don't think you can control that, because your S3 events will keep triggering new tasks. It would be difficult to check whether the task is already running, and you might miss executions if you receive a lot of files.
You should approach this differently to achieve what you want. If you want only one task processing the files, forget about triggering the ECS task from the S3 event. It will work better if you implement a queue: your S3 event should add the information (via Lambda, maybe?) to an SQS queue.
From there you can have an ECS service doing SQS long polling and processing one message at a time.
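A minimal sketch of that consumer loop in Python with boto3; the queue URL is a placeholder and process_file stands in for your real processing logic:

import boto3

sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-file-queue'  # placeholder

def process_file(body):
    # placeholder for the real processing logic
    print('processing', body)

while True:
    # Long poll: wait up to 20 seconds and take at most one message
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20
    )
    for message in response.get('Messages', []):
        process_file(message['Body'])
        # Delete the message only after it has been processed successfully
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message['ReceiptHandle'])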
I understand that AWS Lambda is supposed to abstract the developer from the infrastructure. However, I don't quite understand how scaling works.
Does it automatically start new containers during high traffic?
AWS Lambda functions can be triggered by many different event sources.
AWS Lambda runs each Lambda function as a standalone process in its own environment. There is a default limit of 1000 concurrent Lambda executions.
There is no need to think of Lambda "scaling". Rather, whenever an event source (or your own application) runs a Lambda function, the environment is created, the function is run, and the environment is torn down. When there is nothing that is invoking a Lambda function, it is not running. When 1000 invocations happen, then 1000 Lambda functions run.
It "scales" automatically by running in parallel on AWS infrastructure. You only pay while a function is running, per 100ms. It is the job of AWS to ensure that their back-end infrastructure scales to support the number of Lambda functions being run by all customers in aggregate.
If you want to change the number of desired instances in an Auto Scaling Group, you can use botocore.session:
import botocore.session

# Create a session and an Auto Scaling client
session = botocore.session.get_session()
client = session.create_client('autoscaling')

client.set_desired_capacity(
    AutoScalingGroupName='NAME',   # name of your Auto Scaling Group
    DesiredCapacity=X,             # the new desired instance count
    HonorCooldown=True             # or False to ignore the cooldown period
)
https://docs.aws.amazon.com/cli/latest/reference/autoscaling/set-desired-capacity.html