Call concurrent instances of AWS lambda on Cloud watch event rule


I have one CloudWatch event rule that fires every minute and triggers an AWS Lambda function. I have set the Lambda's concurrent executions to 10, but it still triggers only a single instance per minute. I want it to run 10 concurrent instances per minute.

Concurrency in Lambda is managed pretty differently from what you expect.
In your case you want a single CloudWatch Event to trigger multiple instances each minute.
However, concurrency in Lambda works as follows. Suppose a CloudWatch Event triggers your Lambda, and other AWS services (e.g. S3 and DynamoDB) trigger it as well. When one of those triggers fires, a Lambda instance is started and stays occupied until it finishes its work. During that time, the available concurrency is reduced by one. If another trigger fires at that very moment, the available concurrency is reduced by one again. This continues for as long as your Lambda instances are executing.
So in your case there is always a single event (CloudWatch) triggering a single Lambda instance each minute. The system does not start multiple instances, and for the way it operates that is the correct behavior. In other words, raising the concurrent execution setting to 10 (or any other value) will not by itself give you 10 parallel instances per minute.
To achieve that, it's probably better to create an orchestrator Lambda that invokes your Lambda multiple times, and to set the concurrency of the invoked Lambda to 10 or higher (if you do not want it to throttle). This approach also makes it easier to manage the execution of the multiple instances and to catch errors per invocation, with finer control over the error flow.
You can refer to this article for the details of Lambda concurrency behavior. Implementing the orchestrator Lambda that manages the multiple executions is, by contrast, pretty straightforward.
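A minimal sketch of such an orchestrator, assuming Python with boto3. The worker name `my-worker-function`, the `worker_index` field, and the helper `fan_out` are hypothetical names chosen for illustration. Each worker is invoked asynchronously (`InvocationType="Event"`), so the orchestrator does not wait for the workers to finish.

```python
import json


def fan_out(lambda_client, function_name, count, payload=None):
    """Asynchronously invoke `function_name` `count` times; return the status codes."""
    statuses = []
    for index in range(count):
        # Pass the original payload along, plus an index to tell the workers apart.
        body = dict(payload or {}, worker_index=index)
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # asynchronous: do not wait for the worker
            Payload=json.dumps(body).encode("utf-8"),
        )
        statuses.append(response["StatusCode"])  # 202 means the request was accepted
    return statuses


def handler(event, context):
    # Orchestrator entry point, triggered by the per-minute rule.
    import boto3  # imported here so fan_out can be exercised without AWS access

    # "my-worker-function" is a placeholder for the Lambda that does the real work.
    return fan_out(boto3.client("lambda"), "my-worker-function", 10, event)
```

The orchestrator's execution role would need permission to invoke the worker function, and the per-minute rule would target the orchestrator instead of the worker.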

Related

Will eventbridge throttle lambda if there are so many events in a batch?

I am using an AWS EventBridge event bus as the source that triggers a Lambda function. EventBridge is not a low-latency service, and it sends events to Lambda roughly every 250 ms. If there are multiple events in one batch, it triggers Lambda multiple times, with one event per invocation.
My question is: if more than 1000 events occur in one batch, EventBridge will try to invoke Lambda more than 1000 times, so will Lambda be throttled (the maximum throughput in Lambda is 1000)? If yes, how can I solve the issue? If not, what happens to the extra events? Will they be cached or dropped?

EventBridge will retry failing invocations, such as due to throttling, for up to 24 hours, so events persist for that time and do not get dropped.
See https://docs.aws.amazon.com/eventbridge/latest/userguide/cloudwatch-limits-eventbridge.html#invocations-limits
If the invocation of a target fails due to a problem with the target service, account throttling, etc., new attempts are made for up to 24 hours for a specific invocation.

Run AWS Lambda for multiple parameters one at a time on schedule

I have a Lambda function that accepts a parameter, i.e. a category_id, pulls some data from an API, and updates the database based on the response.
I have to execute the same Lambda function for multiple IDs at intervals of 1 minute, on a daily basis.
For example, run the Lambda for category 1 at 12:00 AM, then for category 2 at 12:01 AM, and so on for 500+ categories.
What could be the best possible solution to achieve this?
This is what I am currently thinking:
Write Lambda using AWS SAM
Add Lambda Layer for Shared Dependencies
Attach Lambda with AWS Cloudwatch Events to run it on schedule
Add Environment Variable for category_id in lambda
Update the SAM template to declare the same Lambda function again and again, changing only the cron expression of the schedule and the value of the category_id environment variable
Problems in the Above Solution:
Number of Lambda functions will increase in the account.
Each Lambda will be attached with a Cloudwatch Event so its number will also increase
There is a quota limit of max 300 Cloudwatch Event per account (though we can request support to increase that limit)
It'll require the use of nested stacks because of the SAM template size limit as well as the number of resources per template, which is 200 max.
I'll be able to create only 50 Lambda functions per nested stack, which means the number of nested stacks will also increase, because 1 Lambda = 4 resources (Lambda + Role + Rule + Event)
Other solutions (not sure if they can be used):
Use of Step Functions
Trigger only the first Lambda function with a cron schedule, and have the current Lambda invoke the Lambda for the next category (only one CloudWatch Event is required, to invoke the function for the first category, but the time difference will vary, i.e. the next Lambda will not execute exactly one minute later).
Use only one Lambda and one CloudWatch schedule event. The Lambda function has a list of all category IDs and invokes itself recursively, using one category ID at a time and removing the used category ID from the list (the only problem is that the Lambda will not execute exactly one minute later for the next category_id in the list)
Looking forward to hearing about the best solution.

I would suggest using a standard Worker pattern:
Create an Amazon SQS queue
Configure the AWS Lambda function so that it is triggered to run whenever a message is sent to the SQS queue
Trigger a separate process at midnight (eg another Lambda function) that sends the 500 messages to the SQS queue, each with a different category ID
This will cause the Lambda function to execute once for each message in the queue. If you only want one invocation to be running at any time (with no parallel executions), set the function's reserved concurrency to 1. When one invocation completes, Lambda will automatically grab another message from the queue and start executing. There will be practically no "wasted time" between executions of the function.

Given that you are doing a large amount of processing, an Amazon EC2 instance might be more appropriate.
If the bandwidth requirements are low (eg if it is just making API calls), then a T3a.micro ($0.0094 per Hour) or even T3a.nano instance ($0.0047 per Hour) can be quite cost-effective.
A script running on the instance could process a category, then sleep for 30 seconds, in a big loop. Running 500 categories at one minute each would take about 8 hours. That's under 10c each day!
The instance can then stop or self-terminate when the work is complete. See: Auto-Stop EC2 instances when they finish a task - DEV Community
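A rough sketch of such a loop in Python. Rather than a fixed sleep after each category, this variant waits for the remainder of each one-minute slot, so that categories start about one minute apart regardless of how long processing takes; that is a substitution on my part, not what the answer literally describes. `run_paced` and `estimated_cost` are hypothetical helper names, and `process` stands for whatever does the API call and database update.

```python
import time


def run_paced(category_ids, process, interval_seconds=60.0,
              clock=time.monotonic, sleep=time.sleep):
    """Call process(category_id) for each ID, starting the calls interval_seconds apart."""
    start = clock()
    for position, category_id in enumerate(category_ids):
        # Wait until this category's slot; skip the wait if we are already late.
        delay = start + position * interval_seconds - clock()
        if delay > 0:
            sleep(delay)
        process(category_id)


def estimated_cost(categories, minutes_each, hourly_price):
    """Instance cost in dollars for one full run."""
    return categories * minutes_each / 60 * hourly_price
```

With the figures quoted above, `estimated_cost(500, 1, 0.0094)` comes to roughly $0.078 for a T3a.micro, which matches the "under 10c each day" estimate.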

Maximizing number of parallel operation in AWS Lambda

I have an AWS Lambda which has to invoke an API endpoint for 2 million records. Considering that the maximum execution period of Lambda is 15 minutes, I have to somehow process all these records using one Lambda (that is, in 15 minutes if possible). The API endpoint which I want to invoke can handle a TPS of 3000. I want to maximize/parallelize my calls so that I can utilize the TPS provided and run the operations using a single Lambda. I have created my invocations within a parallelStream in Java. Is it possible to do it using the current approach? If yes, what changes would I have to make in the Lambda runtime in order to use multiple cores?

Considering that the maximum execution period of Lambda is 15 minutes.
I have to somehow process all these records using one Lambda(that is
in 15 minutes if possible).
Why? This defeats the entire reason you would use AWS Lambda for this task. Why limit yourself to a single Lambda function invocation to do all this work?
If you wrote a script to take your 2 million records and add them to an SQS queue, then you could have the AWS Lambda service automatically feed these records into multiple, parallel instances of your AWS Lambda function. This would allow you to easily tune the number of Lambda functions you want to have running in parallel, and also automatically handle retries in the case of failures.
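As an illustration of the consuming side, here is a sketch of an SQS-triggered handler in Python that reports only the failed messages back to Lambda, so that just those are retried. It assumes the event source mapping has `ReportBatchItemFailures` enabled; `call_api` and the `fail` field are hypothetical stand-ins for the real API request.

```python
import json


def call_api(record):
    # Hypothetical stand-in for the real API request; raises on failure.
    if record.get("fail"):
        raise RuntimeError("API call failed")


def handler(event, context):
    """Process one SQS batch and report only the failed messages for retry."""
    failures = []
    for message in event["Records"]:
        try:
            call_api(json.loads(message["body"]))
        except Exception:
            # Only this message returns to the queue; the rest are deleted.
            failures.append({"itemIdentifier": message["messageId"]})
    return {"batchItemFailures": failures}
```

The number of parallel handler instances can then be capped, for example through the function's reserved concurrency, to stay within the API's 3000 TPS.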

CloudWatch event Lambda trigger and concurrency

If a Lambda function has a concurrency>1, and there are several instances running, does a CloudWatch event Lambda trigger get sent to all the running instances?

The question wording is a little bit ambiguous. I will try my best to make it more clear.
If a Lambda function has a concurrency>1, and there are several instances running
I think OP is talking about reserved concurrency which is set to a value that's greater than 1. In other words, the function is not throttled by default and can run multiple instances in parallel.
does a CloudWatch event Lambda trigger get sent to all the running instances?
This part is ambiguous. @hephalump provided one interpretation in the question comments.
I have another interpretation. If you are asking whether the currently running Lambda containers will be reused after their job is done, then here is the answer:
Based on @hephalump's comment, it's now clear that one CloudWatch event will trigger only one Lambda instance to run. If multiple events come in during a short period of time, then multiple Lambda instances will be triggered to run in parallel. Back to the question: if all existing instances of that function are busy, then no container will be reused, and a new Lambda instance will be spun up to handle the event. If one of the running instances has just finished its job, then that container, along with its execution environment, will be reused to handle the incoming event from CloudWatch.
Hope this helps.
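One way to observe this reuse is a small Python handler that keeps state at module level. Module-level code runs once per execution environment, so a reused container returns the same ID with a growing counter, while a freshly started one returns a new ID. The names `ENVIRONMENT_ID` and `invocation_count` are made up for this sketch.

```python
import uuid

# Runs once per execution environment, i.e. on a cold start.
ENVIRONMENT_ID = str(uuid.uuid4())
invocation_count = 0


def handler(event, context):
    """Report which execution environment served this invocation."""
    global invocation_count
    invocation_count += 1
    return {"environment_id": ENVIRONMENT_ID, "invocation_count": invocation_count}
```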

Serverless Task Scheduling on AWS

So our project was using Hangfire to dynamically schedule tasks but keeping in mind auto scaling of server instances we decided to do away with it. I was looking for cloud native serverless solution and decided to use CloudWatch Events with Lambda. I discovered later on that there is an upper limit on the number of Rules that can be created (100 per account) and that wouldn't scale automatically. So now I'm stuck and any suggestions would be great!

As per CloudWatch Events documentation you can request a limit increase.
100 per region per account. You can request a limit increase. For
instructions, see AWS Service Limits.
Before requesting a limit increase, examine your rules. You may have
multiple rules each matching to very specific events. Consider
broadening their scope by using fewer identifiers in your Event
Patterns in CloudWatch Events. In addition, a rule can invoke several
targets each time it matches an event. Consider adding more targets to
your rules.
If you're trying to create a serverless task scheduler one possible way could be:
CloudWatch Event that triggers a lambda function every minute.
Lambda function reads a DynamoDB table and decides which actions need to be executed at that time.
Lambda function could dispatch the execution to other functions or services.
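A rough sketch of that scheduler function in Python with boto3. The table layout is an assumption: a table named `scheduled-tasks` whose partition key `minute_bucket` holds the minute at which a task is due (e.g. `2024-01-02T03:04`), so that all tasks for the current minute can be fetched with a single key query. `dispatch` is a placeholder for handing the task to another function or service.

```python
from datetime import datetime, timezone


def minute_bucket(moment):
    """Format a datetime as the partition key for its minute."""
    return moment.strftime("%Y-%m-%dT%H:%M")


def due_tasks(dynamodb_client, table_name, moment):
    """Return every task item stored under the minute that contains `moment`."""
    kwargs = {
        "TableName": table_name,
        "KeyConditionExpression": "minute_bucket = :bucket",
        "ExpressionAttributeValues": {":bucket": {"S": minute_bucket(moment)}},
    }
    items = []
    while True:
        page = dynamodb_client.query(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            return items
        # More results remain: continue from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def dispatch(task):
    # Placeholder: invoke another function or enqueue the task here.
    print(f"dispatching {task}")


def handler(event, context):
    import boto3  # imported here so due_tasks can be exercised without AWS access

    tasks = due_tasks(boto3.client("dynamodb"), "scheduled-tasks",
                      datetime.now(timezone.utc))
    for task in tasks:
        dispatch(task)
    return len(tasks)
```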

So I decided to do as Diego suggested, use CloudWatch Events to trigger a Lambda every minute which would query DynamoDB to check for the tasks that need to be executed.
I had some concerns about the data that would be fetched from DynamoDB (duplicate items if an execution runs longer than 1 minute), so I decided to set the concurrency to 1 for that Lambda.
I also had some concerns about executing those tasks directly from that Lambda itself (timeouts, and tasks at the end of a long list), so what I'm doing is pushing each task to SQS separately, and another Lambda is triggered by SQS to execute those tasks in parallel. So far the results look good. I'll keep updating this thread if anything comes up.
