Serverless Task Scheduling on AWS

Serverless Task Scheduling on AWS - amazon-web-services

So our project was using Hangfire to dynamically schedule tasks but keeping in mind auto scaling of server instances we decided to do away with it. I was looking for cloud native serverless solution and decided to use CloudWatch Events with Lambda. I discovered later on that there is an upper limit on the number of Rules that can be created (100 per account) and that wouldn't scale automatically. So now I'm stuck and any suggestions would be great!

As per CloudWatch Events documentation you can request a limit increase.
100 per region per account. You can request a limit increase. For
instructions, see AWS Service Limits.
Before requesting a limit increase, examine your rules. You may have
multiple rules each matching to very specific events. Consider
broadening their scope by using fewer identifiers in your Event
Patterns in CloudWatch Events. In addition, a rule can invoke several
targets each time it matches an event. Consider adding more targets to
your rules.
If you're trying to create a serverless task scheduler one possible way could be:
CloudWatch Event that triggers a lambda function every minute.
Lambda function reads a DynamoDB table and decide which actions need to be executed at that time.
Lambda function could dispatch the execution to other functions or services.

So I decided to do as Diego suggested, use CloudWatch Events to trigger a Lambda every minute which would query DynamoDB to check for the tasks that need to be executed.
I had some concerns regarding the data that would be fetched from dynamoDb (duplicate items in case of longer than 1 minute of execution), so decided to set the concurrency to 1 for that Lambda.
I also had some concerns regarding executing those tasks directly from that Lambda itself (timeouts and tasks at the end of a long list) so what I'm doing is pushing the tasks to SQS each separately and another Lambda is triggered by the SQS to execute those tasks parallely. So far results look good, I'll keep updating this thread if anything comes up.

Related

Call concurrent instances of AWS lambda on Cloud watch event rule

I have one cloud watch event set per minute which triggers AWS Lambda.I have set concurrent executions of lambda to 10 however it's only triggering a single instance per minute. I want it to run 10 concurrent instances per minute.

Concurrency in Lambda is managed pretty differently from what you expect.
In your case you want a single CloudWatch Event to trigger multiple instances each minute.
However, Concurrency in Lambda is working as follows: think you have CloudWatch Event triggering your Lambda and also other AWS services (e.g. S3 and DynamoDB) which trigger your Lambda. What happens when one of your triggers activate the Lambda is that a Lambda instance is active and is consumed until the Lambda finishes its work/computation. During that period of time, the total concurrency units will be decreased by one. At that very moment if another trigger activates the Lambda, the total concurrency units will be decreased again. And this will happen until your Lambda instances are being executed.
So, in your case there will be always a single event (CloudWatch) triggering a single Lambda instance, causing the system not to trigger multiple instances, as for its operation this is the correct way to work. In other words, you do not want to increase concurrent lambda execution to 10 (or whatever) to reach your goal of running 10 parallel instances per minute.
In order to do so, it's probably better for you to create a Lambda orchestrator which calls multiple instances of your Lambda and then setting the Lambda Concurrency in this last Lambda higher than 10 (if you do not want the Lambda to throttle). This way is also pretty good in order to manage the execution of your multiple instances and to catch errors atomically with a greater error flow control.
You can refer to this article in order to get the Lambda Concurrency behavior. The implementation of Lambda orchestrator to manage the multiple instances execution, instead is pretty straightforward.

Timeout problems with bulk processing of files using AWS Lambda

I have a lambda function which I'm expecting to exceed 15 minutes of execution time. What should I do so it will continuously run until I processed all of my files?

If you can, figure out how to scale your workload horizontally. This means splitting your workload so it runs on many lambdas instead of one "super" lambda. You don't provide a lot of details so I'll list a couple common ways of doing this:
Create an SQS queue and each lambda takes one item off of the queue and processes it.
Use an S3 trigger so that when a new file is added to a bucket a lambda processes that file.
If you absolutely need to process for longer than 15 minutes you can look into other serverless technologies like AWS Fargate. Non-serverless options might include AWS Batch or running EC2.

15 minutes is the maximum execution time available for AWS Lambda functions.
If your processing is taking more than that, then you should break it into more than one lambda. You can trigger them in sequence or in parallel depending on your execution logic.

Scheduled jobs using AWS Lambda

I have several AWS lambda functions triggered by events from other applications, e.g. via Kinesis. Some of this events should trigger something happening at another time. As an example, consider the case of sending a reminder/notification e-mail to a user about something when 24 hours have passed since event X happened.
I have previously worked with lambda functions that schedule other lambda functions by dynamically creating CloudWatch "cron" rules in runtime, but I'm now revisiting my old design and considering whether this is the best approach. It was a bit tedious to set up lambdas that schedule other lambdas, because in addition to submitting CW rules with the new lambda as the target I also had to deal with runtime granting of the invoked lambda permissions to be triggered by the new CW rule.
So another approach I'm considering is to submit jobs to be done by adding them to a database table, with a given execution time, and then have one single CW cron rule running every x minutes that checks the database for due jobs. This reduces complexity of the CW rules (only one, static rule needed), lambda permissions (also static) etc, but adds complexity in an additional database table etc. Another difference is that while the old design only performed one executed one "job" per invocation, this design would potentially execute 100 pending jobs in the same invocation, and I'm not sure if this could cause timeout issues etc.
Did anyone successfully implement something similar? What approach did you choose?
I know there are also other services such as AWS Batch, but this seems overkill for scheduling of simple tasks such as sending an e-mail when time t has passed since event e happened, since to my knowledge it doesn't support simple lambda jobs. SQS also supports timed messages, but only up to 15 minutes, so it doesn't seem useful for scheduling something in 24 hours.

An interesting alternative is to use AWS Step Functions to trigger the AWS Lambda function after a given delay.
Step Functions has a Wait state that can schedule or delay execution, so you can can implement a fairly simple Step Functions state machine that puts a delay in front of calling a Lambda function. No database required!
For an example of the concept (slightly different, but close enough), see:
Using AWS Step Functions To Schedule Or Delay SNS Message Publication - Alestic.com
Task Timer - AWS Step Functions

Delay Lambda execution over specific data

I am trying to come up with a way to have pieces of data processed at specific time intervals by invoking aws lambda every N hours.
For example, parse a page at specific url every 6 hours and store result in s3 bucket.
Have many (~100k) urls each processed that way.
Of course, you can have a VM that hosts some scheduler that would trigger lambdas, as described in this answer, but that breaks the "serverless" approach.
So, is there a way to do this using aws services only?
Things I tried that does not work:
SQS can delay messages, but only for maximum of 15 min (I need hours) and there is no built-in integration between SQS and Lambda so you need to have some polling agent (lambda?) that would poll the qeueu all the time and send new messages to worker lambda, which again breaks the point of only executing at scheduled time;
CloudWatch Alarms can send messages to SNS that triggers Lambda. You can have periodic lambda calls implemented like that by using future metric timestamp, however alarm message cannot have a custom data (think url from example above) connected to it, so that does not work too;
I could create Lambda CloudWatch scheduled triggers programmatically but they also cannot pass any data to Lambda.
The only way I could think of, is to have a dynamo DB table with "url" records, each with the timestamp of last "processing" and have periodic lambda that would query the table and send "old" records as jobs to another "worker" lambda (directly or via SNS).
That would work, however you still need to have a "polling" lambda, which could become a bottleneck as number of items to process grows.
Any other ideas?

100k jobs every 6 hours, doesn't sound like a great use case for Serverless IMO. Personally, I would set up a CloudWatch event with a relevant cron expression that triggered a Lambda to start an EC2 instance that processed all the URLs (stored in DynamoDB) and script the EC2 instance to shutdown after processing the last url.
But that's not what you asked.
You could set up a CloudWatch event with a relevant cron expression that spawns a lambda (orchestrator) reads the urls from DynamoDB or even an S3 file then invokes a second lambda (worker) for each url to actually parse the pages.
Using this pattern you will start hitting concurrency issues at 1000 lambdas (1 orchestrator & 999 workers), less if you have other lambdas running in the same region. You can ask AWS to increase this limit, but I don't know under what scenarios they will do this, or how high they will increase the limit.
From here you have three choices.
Split out the payload to each worker lambda so each instance receives multiple urls to process.
Add an another column to your list of urls and group urls with this column (e.g. first 500 are marked with a 1, second 500 are marked with a 2, etc). Then your orchestrator lambda could take urls off the list in batches. This would require you to run the CloudWatch event at a greater frequency and manage the state so the orchestrator lambda when invoked knows which is the next batch (I've done this at a smaller scale just storing a variable in a S2 file).
Would be to use some combination of options 1 and 2.

Looks like, it's fitting Batch processing scenario with AWS lambda function as a job. It's serverless but obviously adds dependency on another AWS service.
In the same time, it has dashboard, processing status, retries and all perks from job scheduling service.

Dynamically create cronjobs in AWS

Is there a way to dynamically create scheduled lambda calls in AWS? I have to create many scheduled lambda calls that. I am aware of CloudWatch rules, but they have a limit on the amount you can get. I also heard about Cronally, but they are not launched yet, and I'd rather do something like this on my own. I do not see an obvious solution without trade offs, but does the 'easy way' exist, or it all depends on the particular application?

The cloudwatch events docs say the limit of 50 rules per account can be raised on request so maybe they might be able to raise it high enough for your needs.
Alternatively you could just do one rule that fires a single "scheduler"lambda function every minute. that scheduler can contain a
Schedule of which functions get fired at which times and invoke the other lambda functions according to that schedule. You could even store the schedule in a dynamoDB table or s3 bucket so don't need to update the lambda function itself to change the schedule.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js