Is there a way to dynamically create scheduled Lambda calls in AWS? I have to create many scheduled Lambda calls. I am aware of CloudWatch rules, but there is a limit on how many you can create. I also heard about Cronally, but they have not launched yet, and I'd rather do something like this on my own. I do not see an obvious solution without trade-offs, but does an 'easy way' exist, or does it all depend on the particular application?
The CloudWatch Events docs say the limit of 50 rules per account can be raised on request, so AWS may be able to raise it high enough for your needs.
Alternatively, you could create just one rule that fires a single "scheduler" Lambda function every minute. That scheduler can contain a schedule of which functions get fired at which times, and invoke the other Lambda functions according to that schedule. You could even store the schedule in a DynamoDB table or S3 bucket so you don't need to update the Lambda function itself to change the schedule.
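A minimal sketch of that single-rule scheduler, assuming Python and boto3. The schedule format (entries with `function_name`, `minute` and `hour` fields) and the function names are hypothetical, made up for illustration; the schedule could just as well be loaded from DynamoDB or S3.

```python
import json
from datetime import datetime, timezone

# Hypothetical schedule: "*" means "every". Field names are assumptions.
SCHEDULE = [
    {"function_name": "send-report", "minute": 0, "hour": 6},
    {"function_name": "poll-feed", "minute": "*", "hour": "*"},
]


def _matches(field, value):
    """Return True if a schedule field ("*" or an int) matches the value."""
    return field == "*" or field == value


def due_functions(schedule, now):
    """Return the names of the functions scheduled for the minute of `now`."""
    return [
        entry["function_name"]
        for entry in schedule
        if _matches(entry["minute"], now.minute) and _matches(entry["hour"], now.hour)
    ]


def dispatch(lambda_client, schedule, now):
    """Asynchronously invoke every function that is due at `now`."""
    names = due_functions(schedule, now)
    for name in names:
        lambda_client.invoke(
            FunctionName=name,
            InvocationType="Event",  # fire-and-forget; do not wait for the result
            Payload=json.dumps({"scheduled_for": now.isoformat()}).encode("utf-8"),
        )
    return names


def handler(event, context):
    """Entry point for the rule that fires every minute."""
    import boto3  # imported here so the logic above can be tested without AWS

    return dispatch(boto3.client("lambda"), SCHEDULE, datetime.now(timezone.utc))
```

Passing the client in as an argument keeps the matching logic testable without AWS credentials; only `handler` touches boto3.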
The situation
Currently, I am using an Amazon SQS queue that triggers a Lambda function to process new messages as they arrive in the queue. Messages that fail processing are moved to a DLQ (Dead-Letter Queue).
In order to seed the SQS queue, I am using a CRON that runs every day and inserts the available jobs into the queue.
I want to issue a summarizing alert/email once all the new jobs the CRON inserted for the day have been processed, along with details of how many jobs succeeded, how many failed, and how many were originally issued that day.
The problem:
As the Lambda functions run separately, and the fact that I want to keep it that way, I was wondering what would be the best service to use in order to store the temporary count value (at least two out of the three counts are needed among the total, succeeding and failing counts)?
I was thinking about DynamoDB, but every DB seems to be an overkill for that, and won't be cost-effective either. S3 also doesn't seem to be the most practical/preferred for this type of solution. I can also use SQS (as its "storage" is somewhat designed for cases with relatively small data storage such as these) with an identifier "count" that will be updated by every Lambda function, but knowing which Lambda function was the last requires checking the whole queue, which seems like over-complicating that.
Any other AWS service that comes to mind?
Here is a good listing of Storage Options in the AWS Cloud (2013, but includes some of the options available today as well).
AWS Systems Manager Parameter Store can be used as a 'mini-database'.
It requires AWS credentials to access (which would be available to the Lambda functions or whatever code you are running to perform this check) but has no operational cost.
From PutParameter - AWS Systems Manager:
Parameter Store offers a standard tier and an advanced tier for parameters. Standard parameters have a content size limit of 4 KB and can't be configured to use parameter policies. You can create a maximum of 10,000 standard parameters for each Region in an AWS account. Standard parameters are offered at no additional cost.
You could run into problems if multiple processes try to update the parameters simultaneously, but hopefully your use-case is pretty simple.
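As a rough sketch of the 'mini-database' idea, assuming Python and boto3: a counter kept in a String parameter. The parameter name is hypothetical. The read-then-write below is not atomic, which is exactly the simultaneous-update problem mentioned above.

```python
def increment_counter(ssm_client, name, amount=1):
    """Read an integer parameter, add `amount`, and write it back.

    Not atomic: two concurrent callers can read the same value, and one of
    the two updates will then be lost.
    """
    try:
        current = int(ssm_client.get_parameter(Name=name)["Parameter"]["Value"])
    except ssm_client.exceptions.ParameterNotFound:
        current = 0  # first update of the day: start counting from zero

    new_value = current + amount
    ssm_client.put_parameter(
        Name=name,
        Value=str(new_value),
        Type="String",
        Overwrite=True,  # required to update a parameter that already exists
    )
    return new_value


def handler(event, context):
    """Example use inside the job-processing Lambda (parameter name is made up)."""
    import boto3  # imported here so increment_counter can be tested without AWS

    return increment_counter(boto3.client("ssm"), "/daily-jobs/succeeded")
```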
So our project was using Hangfire to dynamically schedule tasks but keeping in mind auto scaling of server instances we decided to do away with it. I was looking for cloud native serverless solution and decided to use CloudWatch Events with Lambda. I discovered later on that there is an upper limit on the number of Rules that can be created (100 per account) and that wouldn't scale automatically. So now I'm stuck and any suggestions would be great!
As per CloudWatch Events documentation you can request a limit increase.
100 per region per account. You can request a limit increase. For instructions, see AWS Service Limits.
Before requesting a limit increase, examine your rules. You may have multiple rules each matching to very specific events. Consider broadening their scope by using fewer identifiers in your Event Patterns in CloudWatch Events. In addition, a rule can invoke several targets each time it matches an event. Consider adding more targets to your rules.
If you're trying to create a serverless task scheduler one possible way could be:
CloudWatch Event that triggers a lambda function every minute.
Lambda function reads a DynamoDB table and decides which actions need to be executed at that time.
Lambda function could dispatch the execution to other functions or services.
So I decided to do as Diego suggested, use CloudWatch Events to trigger a Lambda every minute which would query DynamoDB to check for the tasks that need to be executed.
I had some concerns regarding the data that would be fetched from DynamoDB (duplicate items if an execution takes longer than 1 minute), so I decided to set the concurrency to 1 for that Lambda.
I also had some concerns about executing those tasks directly from that Lambda itself (timeouts, and tasks at the end of a long list), so what I'm doing is pushing each task to SQS separately, and another Lambda is triggered by SQS to execute those tasks in parallel. So far the results look good; I'll keep updating this thread if anything comes up.
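The setup described above could look roughly like this in Python with boto3. The table layout (a `task_id` partition key and a numeric `run_at` epoch attribute) and the environment variable names are assumptions for the sketch. A Scan with a filter is used for brevity; a Query against a suitable index would likely be cheaper on a large table.

```python
import json
import os
import time


def fetch_due_tasks(dynamodb_client, table_name, now_epoch):
    """Scan the (hypothetical) tasks table for items whose run_at has passed."""
    items = []
    kwargs = {
        "TableName": table_name,
        "FilterExpression": "run_at <= :now",
        "ExpressionAttributeValues": {":now": {"N": str(now_epoch)}},
    }
    while True:
        page = dynamodb_client.scan(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            return items
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]  # next page


def dispatch_tasks(dynamodb_client, sqs_client, table_name, queue_url, items):
    """Send one SQS message per task, then delete the task so it is not re-sent."""
    for item in items:
        sqs_client.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps({"task_id": item["task_id"]["S"]}),
        )
        # Assumes task_id alone is the table's primary key.
        dynamodb_client.delete_item(
            TableName=table_name, Key={"task_id": item["task_id"]}
        )
    return len(items)


def handler(event, context):
    """Runs every minute; reserved concurrency of 1 avoids overlapping runs."""
    import boto3  # imported here so the functions above can be tested without AWS

    dynamodb = boto3.client("dynamodb")
    table = os.environ["TASKS_TABLE"]  # hypothetical variable names
    queue_url = os.environ["TASKS_QUEUE_URL"]
    due = fetch_due_tasks(dynamodb, table, int(time.time()))
    return dispatch_tasks(dynamodb, boto3.client("sqs"), table, queue_url, due)
```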
I know that AWS Lambda can be invoked by CloudWatch scheduler as well as by SQS event, but can they be used together in logical "and" combination?
Basically, what I need is to run my Lambda every minute (for example), but only when messages are available in SQS. Is it even possible with AWS config only?
I need this in order to use a third-party API with a hard rate limit. That's why I cannot just use the SQS event (it is easy to exceed the limit), and I don't like the idea of using the scheduler alone, because it will run uselessly when the queue is empty.
While this is a cool idea, this is unfortunately not possible - event sources in Lambda are always separate from each other. I understand your impulse to save CPU-cycles and API-calls (and money), but I think the only solution that works is your proposed put-it-on-a-timer-and-poll-sqs one.
I was searching the documentation for references on this, but couldn't find any.
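For what it's worth, the timer-and-poll approach can stay small. A sketch assuming Python and boto3, where the queue URL variable and the `call_third_party_api` function are placeholders: each scheduled run handles at most one batch, so a one-minute rule caps the third-party calls at `max_messages` per minute.

```python
import os


def drain_batch(sqs_client, queue_url, process, max_messages=10):
    """Receive one batch, process each message, and delete it afterwards.

    SQS returns at most 10 messages per receive call, so this also acts as
    a per-invocation cap on the work done.
    """
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=min(max_messages, 10),
        WaitTimeSeconds=0,  # return immediately if the queue is empty
    )
    messages = response.get("Messages", [])  # key is absent when nothing is queued
    for message in messages:
        process(message["Body"])
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
    return len(messages)


def call_third_party_api(body):
    """Placeholder for the rate-limited API call."""
    print(f"would send: {body}")


def handler(event, context):
    """Triggered by the every-minute schedule, not by the SQS event source."""
    import boto3  # imported here so drain_batch can be tested without AWS

    queue_url = os.environ["QUEUE_URL"]  # hypothetical variable name
    return drain_batch(boto3.client("sqs"), queue_url, call_third_party_api)
```

When the queue is empty the invocation does a single receive call and exits, so the wasted work per run is small.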
I have several AWS Lambda functions triggered by events from other applications, e.g. via Kinesis. Some of these events should trigger something happening at a later time. As an example, consider the case of sending a reminder/notification e-mail to a user about something when 24 hours have passed since event X happened.
I have previously worked with lambda functions that schedule other lambda functions by dynamically creating CloudWatch "cron" rules in runtime, but I'm now revisiting my old design and considering whether this is the best approach. It was a bit tedious to set up lambdas that schedule other lambdas, because in addition to submitting CW rules with the new lambda as the target I also had to deal with runtime granting of the invoked lambda permissions to be triggered by the new CW rule.
So another approach I'm considering is to submit jobs by adding them to a database table, with a given execution time, and then have one single CW cron rule running every x minutes that checks the database for due jobs. This reduces the complexity of the CW rules (only one static rule needed), Lambda permissions (also static), etc., but adds the complexity of an additional database table. Another difference is that while the old design only executed one "job" per invocation, this design could potentially execute 100 pending jobs in the same invocation, and I'm not sure whether this could cause timeout issues.
Did anyone successfully implement something similar? What approach did you choose?
I know there are also other services such as AWS Batch, but this seems overkill for scheduling of simple tasks such as sending an e-mail when time t has passed since event e happened, since to my knowledge it doesn't support simple lambda jobs. SQS also supports timed messages, but only up to 15 minutes, so it doesn't seem useful for scheduling something in 24 hours.
An interesting alternative is to use AWS Step Functions to trigger the AWS Lambda function after a given delay.
Step Functions has a Wait state that can schedule or delay execution, so you can implement a fairly simple Step Functions state machine that puts a delay in front of calling a Lambda function. No database required!
For an example of the concept (slightly different, but close enough), see:
Using AWS Step Functions To Schedule Or Delay SNS Message Publication - Alestic.com
Task Timer - AWS Step Functions
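To make the Wait-state idea concrete, here is a sketch in Python that builds such a state machine definition and starts a delayed execution with boto3. The state names and the `delay_seconds` and `payload` input fields are my own assumptions, not taken from the linked articles.

```python
import json


def build_delay_definition(lambda_arn):
    """Return an Amazon States Language definition: wait, then invoke a Lambda."""
    return {
        "Comment": "Delay a Lambda invocation by a per-execution number of seconds",
        "StartAt": "WaitForDelay",
        "States": {
            "WaitForDelay": {
                "Type": "Wait",
                # Read the delay from the execution input instead of hard-coding it.
                "SecondsPath": "$.delay_seconds",
                "Next": "InvokeLambda",
            },
            "InvokeLambda": {
                "Type": "Task",
                "Resource": lambda_arn,
                "End": True,
            },
        },
    }


def schedule_invocation(sfn_client, state_machine_arn, delay_seconds, payload):
    """Start an execution that invokes the Lambda after `delay_seconds`."""
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"delay_seconds": delay_seconds, "payload": payload}),
    )
```

For the 24-hour reminder case, the event-handling Lambda would call something like `schedule_invocation(boto3.client("stepfunctions"), arn, 86400, {...})`, where `arn` is the ARN of a state machine created from `json.dumps(build_delay_definition(...))`.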
Need to call a function at specific times without having a server up and running all the time
In particular, the challenge I'm facing is that we only use AWS Lambda and DynamoDB to - among other things - send a reminder to users at a time of their choice. That means we have to call a lambda function at the time the user wants to be reminded.
The time changes dynamically (depending on each user's choice) so the question is, what is a good way to set this up?
We are considering setting up a server if there's no way around it but even if we go for this solution, I lack the experience to see a good way to set this up. Any suggestions are greatly appreciated.
You can use AWS DynamoDB TTL event stream to trigger Lambda to achieve this. The approach is as follows.
Create a DynamoDB table to store User alarms.
When the user sets up an alarm, take the alarm timestamp as epoch seconds.
Then store that timestamp as the TTL attribute of the alarm record, along with the alarm information (DynamoDB TTL expects an absolute epoch time in seconds rather than a difference from the current time).
Configure DynamoDB Streams to trigger a Lambda when the TTL expires and the item is deleted.
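A rough sketch of the stream-handling side in Python. The attribute names (`user_id`, `message`, `expires_at`) are hypothetical, and the stream is assumed to include old images. Note that the TTL attribute holds an absolute epoch time in seconds, and that DynamoDB does not guarantee deletion at the exact expiry moment, so this fits reminders that can tolerate some delay.

```python
def build_alarm_item(user_id, alarm_epoch_seconds, message):
    """Build the item to store; `expires_at` is the table's TTL attribute."""
    return {
        "user_id": {"S": user_id},
        "message": {"S": message},
        "expires_at": {"N": str(int(alarm_epoch_seconds))},
    }


def expired_alarms(stream_event):
    """Return the old images of items that were removed by DynamoDB TTL.

    TTL deletions appear as REMOVE records whose userIdentity is the
    DynamoDB service; manual deletes carry no such identity and are skipped.
    """
    alarms = []
    for record in stream_event.get("Records", []):
        identity = record.get("userIdentity") or {}
        if (
            record.get("eventName") == "REMOVE"
            and identity.get("type") == "Service"
            and identity.get("principalId") == "dynamodb.amazonaws.com"
        ):
            alarms.append(record["dynamodb"]["OldImage"])
    return alarms


def handler(event, context):
    """Stream-triggered Lambda: send a reminder for each expired alarm."""
    for alarm in expired_alarms(event):
        # Replace the print with the real notification (e-mail, push, ...).
        print(f"Reminder for {alarm['user_id']['S']}: {alarm['message']['S']}")
```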
You can call your Lambda function on a scheduled event:
http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html
So set up your Lambda function with a cron-like event to wake at whatever interval you need, retrieve the list of alarms due to be sent next, send them, and mark the completed alarms so they won't be triggered again.