I'm moving an old system to a serverless solution and it's working like a charm; there is only one thing for which I have no idea what the best solution is. Here is the setup:
USER ---> API Gateway ---> Lambda ---> DynamoDB
A user wants to trigger a Lambda at a specified time. For example:
I'm a user and I want to post a message to a dashboard (the function that does this lives in a Lambda) with some parameters saved in DynamoDB, and it should happen tomorrow at 5.
The user makes an API request through API Gateway, the Lambda is executed and puts some info in DynamoDB. How can I trigger another Lambda with this info tomorrow?
In the old system we had a cron job with the time; when it was due to fire, it just read the DB to get the parameters.
What can I use? SQS? CloudWatch Events? Something with S3? A DynamoDB stream?
More info: it would be around 10-20 executions per day.
When the user invokes the Lambda via API Gateway and you put the data in DynamoDB, you can at the same time insert a message into SQS carrying the exact timestamp at which you want this user action to trigger a Lambda invocation.
Then have a scheduled Lambda that executes every minute, every 5 minutes, or whatever suits you. The job of this Lambda is to poll the messages in SQS and check whether any message's scheduled time has been reached. If yes, invoke another Lambda and pass it that payload; if not, do nothing until the next polling time. A sketch of this polling Lambda follows.
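A minimal sketch of the polling Lambda, assuming each SQS message body is JSON with a run_at epoch timestamp and a payload, and that the queue URL and target function name (both hypothetical) come from environment variables:

```python
import json
import os
import time

import boto3

sqs = boto3.client("sqs")
lam = boto3.client("lambda")

QUEUE_URL = os.environ["SCHEDULE_QUEUE_URL"]     # assumed env var
TARGET_FUNCTION = os.environ["TARGET_FUNCTION"]  # assumed env var

def handler(event, context):
    # Pull a batch of scheduled messages from the queue.
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    now = time.time()
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])
        if body["run_at"] <= now:
            # Schedule reached: fire the worker Lambda asynchronously...
            lam.invoke(
                FunctionName=TARGET_FUNCTION,
                InvocationType="Event",
                Payload=json.dumps(body["payload"]),
            )
            # ...and delete the message so it is not processed again.
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
        # Messages that are not yet due simply reappear once their
        # visibility timeout expires.
```

Note that SQS retains messages for at most 14 days, so schedules further out than that would need the DynamoDB-scan approach below.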
You could also reproduce the old architecture in the cloud: the on-prem cron can be replaced by a CloudWatch cron schedule, so your second Lambda is triggered on that schedule, scans your DB (DynamoDB in this case), and does the processing.
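A sketch of that scheduled scan, where the table name, the run_at/status attributes, and the worker function name are all assumptions, and the payload is stored as a JSON string:

```python
import os
import time

import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource("dynamodb").Table(os.environ["SCHEDULE_TABLE"])  # assumed
lam = boto3.client("lambda")

def handler(event, context):
    # Find every record whose scheduled time has passed and is still pending.
    resp = table.scan(
        FilterExpression=Attr("run_at").lte(int(time.time()))
        & Attr("status").eq("pending")
    )
    for item in resp["Items"]:
        lam.invoke(
            FunctionName=os.environ["WORKER_FUNCTION"],  # assumed
            InvocationType="Event",
            Payload=item["payload"],  # stored as a JSON string
        )
        # Mark the record as sent so the next scan skips it.
        table.update_item(
            Key={"id": item["id"]},
            UpdateExpression="SET #s = :sent",
            ExpressionAttributeNames={"#s": "status"},  # status is a reserved word
            ExpressionAttributeValues={":sent": "sent"},
        )
```

At 10-20 executions per day, a scan every few minutes over a handful of items is cheap.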
I can write a Lambda function that sends an email to a specific user using SES. What I want is some kind of scheduled task that triggers this Lambda function. The function should accept the user's email as a parameter, since different users have different emails, and it should be triggered at different times for different users. How can I achieve this? I already did some digging into SNS, SQS, CloudWatch, and Step Functions, and I got confused. Can anyone help me here?
Here is an AWS tutorial that shows how to write a Lambda function that uses the Lambda runtime API, and how to invoke other AWS services (such as Amazon DynamoDB) from it. In this example the user data is located in a database rather than passed to the Lambda function (a minor difference from what you described).
The Lambda function is invoked on a regular schedule using scheduled events. The tutorial walks you through all the steps:
Creating scheduled events to invoke Lambda functions
Following are some of the options:
Option 1)
a) An observer: a Lambda that connects to MongoDB and checks for users whose subscription/trial period is about to end. It builds the list of emails together with the message-body details, such as the date and other information used to draft the email (in JSON format, keyed by email), for those users.
b) A command executor: a Lambda that takes the list of emails and the mapped contents, iterates over the list, and sends an email to each user.
c) Schedule a serverless workflow using AWS Step Functions that connects Lambdas a) and b).
Additional information: refer to error handling in Step Functions for the negative test cases.
OR
Option 2) I am not sure whether this is available for your MongoDB setup:
a) Use MongoDB Scheduled Triggers.
b) Integrate the MongoDB Scheduled Triggers via AWS EventBridge, mapped to an AWS Lambda that contains the logic to send the email.
OR
Option 3) Combine the observer (1.a) and command executor (1.b) logic in a single Lambda, and schedule that Lambda as required using a CloudWatch Events rule; a sketch follows.
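A minimal sketch of option 3, where fetch_expiring_users() is a hypothetical helper that queries MongoDB (e.g. via pymongo) and the sender address is an assumption that must be SES-verified:

```python
import os

import boto3

ses = boto3.client("ses")

def fetch_expiring_users():
    # Hypothetical: query MongoDB for users whose trial is about to end
    # and return (email, details) pairs.
    return []

def handler(event, context):
    for email, details in fetch_expiring_users():
        ses.send_email(
            Source=os.environ["SENDER_EMAIL"],  # assumed, SES-verified address
            Destination={"ToAddresses": [email]},
            Message={
                "Subject": {"Data": "Your trial is about to end"},
                "Body": {"Text": {"Data": f"Hi, your trial ends on {details['end_date']}."}},
            },
        )
```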
Create a CloudWatch trigger per user if you have a small number of users (see the sketch below). Otherwise, trigger the function frequently, say every minute, via CloudWatch, and store the last send time for each user in a DynamoDB table.
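For the per-user variant, each rule can carry the user's email as a constant input that becomes the Lambda's event payload. A sketch; the function ARN and cron expression are placeholders, and the Lambda also needs a resource-based permission allowing events.amazonaws.com to invoke it:

```python
import json

import boto3

events = boto3.client("events")

def schedule_user_email(user_id, email, cron_expression):
    # One rule per user, firing at that user's preferred time.
    rule_name = f"send-email-{user_id}"
    events.put_rule(Name=rule_name, ScheduleExpression=cron_expression)
    # Input is passed verbatim to the target Lambda as its event.
    events.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "send-email-lambda",
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:SendEmail",  # placeholder
            "Input": json.dumps({"email": email}),
        }],
    )

# Example: schedule_user_email("42", "user@example.com", "cron(0 5 * * ? *)")
```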
I have a scheduled daemon Lambda that is called once a day to analyze data and create the alert messages that will later be sent to the app's users.
The problem I am trying to solve is avoiding spikes in email sending: if I try to send all the created emails in one burst, SES will naturally reject them. So I plan to store the created emails somewhere (SQS, DynamoDB, etc.) and then schedule a run of another Lambda that takes a portion of the emails, calls SES.sendEmail(), and then, if there are still emails left, schedules another call of itself a few seconds later.
I planned to create a CloudWatch event from inside the Lambda and use cron to schedule the next call, but realized that cron has one-minute precision, which is not enough for my purposes. I'm not going to sleep inside a Lambda (that is just a pay-for-nothing approach), so Lambda.invoke() after a wait is also inappropriate. Are there any other ways to recursively schedule a Lambda call from itself with a conditional delay and second-level precision?
You could have a Lambda that publishes the emails to SQS, which lets you set a per-message delivery delay with second-level precision, and another Lambda that is triggered by the messages in the queue. A producer sketch follows.
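A sketch of the producer side, with the queue URL as a placeholder; DelaySeconds accepts 0-900 seconds (15 minutes maximum), which is enough to stagger batches a few seconds apart:

```python
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/outbound-email"  # placeholder

def enqueue_emails(emails, batch_size=10, spacing_seconds=5):
    # Stagger batches so the consuming Lambda never bursts past SES limits.
    for i in range(0, len(emails), batch_size):
        delay = (i // batch_size) * spacing_seconds
        for email in emails[i:i + batch_size]:
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps(email),
                DelaySeconds=min(delay, 900),  # SQS caps the delay at 15 minutes
            )
```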
I have an AWS Lambda function that computes statistics over 1k stock tickers after market close. One option I have is the following:
Set up a cron job on an EC2 instance that submits 1k HTTP requests asynchronously (e.g. http://xxxxx.lambdafunction.xxxx?ticker= ) to trigger the AWS Lambda function (or submits 1k requests to SNS and lets Lambda pick them up).
I think that would run fine, but I would much appreciate any serverless/PaaS approach to triggering the task.
Off the top of my head, here are a couple of ways to achieve what you need:
Option 1: [Cost-effective]
Post all the tickers to an AWS FIFO SQS queue.
Define a trigger on this queue to invoke the Lambda function.
Result: since you are posting all the events to a FIFO queue, which maintains order, the events will be polled sequentially. Moreover, the SQS-to-Lambda trigger scales automatically based on the number of messages in the queue. A sketch of both sides follows.
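A sketch of the producer and the consumer, with the queue URL as a placeholder; FIFO queues require a MessageGroupId, and a MessageDeduplicationId unless content-based deduplication is enabled on the queue:

```python
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/tickers.fifo"  # placeholder

def post_tickers(tickers):
    for ticker in tickers:
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"ticker": ticker}),
            MessageGroupId="tickers",       # required for FIFO queues
            MessageDeduplicationId=ticker,  # or enable content-based dedup
        )

def handler(event, context):
    # Consumer Lambda: the SQS trigger delivers batches in event["Records"].
    for record in event["Records"]:
        ticker = json.loads(record["body"])["ticker"]
        # ... compute the statistics for this ticker ...
```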
Option 2: [Costly, but scales easily to real-time processing]
Same as above, but instead of posting to a FIFO queue, post to a Kinesis stream.
Enable the Kinesis stream to trigger the Lambda function.
Result: Kinesis ensures the order in which events arrive in the stream, and the Lambda invocations scale with the number of shards in the stream. This implementation scales significantly, and if you ever have a use case for real-time processing of tickers, it could be a great solution. A producer sketch follows.
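A producer sketch, with the stream name as a placeholder; using the ticker symbol as the partition key keeps each ticker's events ordered within its shard:

```python
import json

import boto3

kinesis = boto3.client("kinesis")

def post_tickers(tickers):
    for ticker in tickers:
        kinesis.put_record(
            StreamName="ticker-stream",  # placeholder
            Data=json.dumps({"ticker": ticker}).encode(),
            PartitionKey=ticker,  # same ticker -> same shard -> ordered
        )
```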
Option 3: [Cost-effective, an alternative to Option 1]
Collect all the ticker events (1k or however many) and put them into a file.
Upload this file to an AWS S3 bucket.
Enable an S3 event notification to trigger a proxy Lambda function.
The proxy Lambda function reads the S3 file and, based on the total number of events in the file, spawns n parallel actor Lambda functions.
The actor Lambda functions each process their share of the events.
Result: easy to implement, cost-effective, and easy to scale based on your own algorithm for distributing the load in the proxy Lambda function.
Option 4: [All-serverless]
Write a Lambda function that gets the list of tickers from some web server.
Define an AWS CloudWatch rule that generates events on a cron schedule/frequency.
Add a trigger to this CloudWatch rule to invoke the proxy Lambda function.
The proxy Lambda function can use any combination of the options above [1, 2, or 3] to trigger the actor Lambda functions that process the records.
Result: everything can be configured via the AWS console and is easy to use. Alternatively, you can write an AWS CloudFormation template that creates all the required resources in a single go.
Having said that, I will leave it up to you to choose the right solution for your business/cost requirements.
You can use the Lambda fan-out pattern.
You can follow these steps to process 1k or more tickers using a serverless approach:
1. Store all the stock tickers in an S3 file.
2. Create a master Lambda that reads the S3 file and splits the stocks into groups of 10.
3. Create a child Lambda that makes the async call to the external HTTP service and fetches the details.
4. In the master Lambda, loop through these groups and invoke 100 child Lambdas, passing one group to each, returning the results to the master Lambda.
5. Collect all the information returned from the child Lambdas and continue with your processing there.
You can then trigger this master Lambda at the end of each trading day using a CloudWatch time-based rule scheduler.
This is a completely serverless approach; a sketch of the master Lambda follows.
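A sketch of the master Lambda, assuming a hypothetical bucket/key holding one ticker per line and a child function named process-ticker-group; each invoke here is synchronous, so a thread pool runs the 100 child calls in parallel:

```python
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
lam = boto3.client("lambda")

def invoke_child(group):
    # Synchronous invoke so the master gets the child's result back.
    resp = lam.invoke(
        FunctionName="process-ticker-group",  # placeholder child Lambda
        InvocationType="RequestResponse",
        Payload=json.dumps({"tickers": group}),
    )
    return json.loads(resp["Payload"].read())

def handler(event, context):
    obj = s3.get_object(Bucket="tickers-bucket", Key="tickers.txt")  # placeholders
    tickers = obj["Body"].read().decode().splitlines()
    groups = [tickers[i:i + 10] for i in range(0, len(tickers), 10)]
    # Fan out: run the child invocations in parallel threads.
    with ThreadPoolExecutor(max_workers=100) as pool:
        results = list(pool.map(invoke_child, groups))
    # ... continue processing the collected results ...
    return results
```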
I am trying to come up with a way to have pieces of data processed at specific time intervals by invoking AWS Lambda every N hours.
For example, parse the page at a specific URL every 6 hours and store the result in an S3 bucket.
There are many (~100k) URLs, each processed that way.
Of course, you can have a VM that hosts some scheduler that triggers the Lambdas, as described in this answer, but that breaks the "serverless" approach.
So, is there a way to do this using AWS services only?
Things I tried that did not work:
SQS can delay messages, but only for a maximum of 15 minutes (I need hours), and there is no built-in integration between SQS and Lambda, so you need some polling agent (a Lambda?) that polls the queue all the time and hands new messages to a worker Lambda, which again defeats the point of executing only at the scheduled time;
CloudWatch Alarms can send messages to SNS, which triggers a Lambda. You can implement periodic Lambda calls that way by using a future metric timestamp; however, the alarm message cannot carry custom data (think the URL from the example above), so that does not work either;
I could create Lambda CloudWatch scheduled triggers programmatically, but they also cannot pass any data to the Lambda.
The only way I could think of is a DynamoDB table with "url" records, each carrying the timestamp of its last "processing", plus a periodic Lambda that queries the table and sends the "old" records as jobs to another "worker" Lambda (directly or via SNS).
That would work, but you still need a "polling" Lambda, which could become a bottleneck as the number of items to process grows.
Any other ideas?
100k jobs every 6 hours doesn't sound like a great use case for serverless, IMO. Personally, I would set up a CloudWatch event with a relevant cron expression that triggers a Lambda to start an EC2 instance, have the instance process all the URLs (stored in DynamoDB), and script it to shut itself down after processing the last URL.
But that's not what you asked.
You could set up a CloudWatch event with a relevant cron expression that spawns a Lambda (the orchestrator), which reads the URLs from DynamoDB, or even an S3 file, and then invokes a second Lambda (the worker) for each URL to actually parse the pages.
Using this pattern you will start hitting concurrency issues at 1000 Lambdas (1 orchestrator and 999 workers), fewer if you have other Lambdas running in the same region. You can ask AWS to increase this limit, but I don't know under what scenarios they will do so, or how high they will go.
From here you have three choices.
1. Split the payload across the worker Lambdas so each instance receives multiple URLs to process.
2. Add another column to your list of URLs and group the URLs by this column (e.g. the first 500 are marked with a 1, the second 500 with a 2, etc.). Then your orchestrator Lambda can take URLs off the list in batches. This requires you to run the CloudWatch event at a greater frequency and to manage state, so that the orchestrator Lambda knows which batch is next when invoked (I've done this at a smaller scale, just storing a variable in an S3 file; see the sketch below).
3. Use some combination of options 1 and 2.
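A sketch of that batch-pointer bookkeeping, using an S3 object to remember which batch the next run should take (the bucket and key names are placeholders):

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "url-scheduler-state"  # placeholder
STATE_KEY = "next-batch.txt"    # placeholder

def next_batch(total_batches):
    # Read the batch counter left behind by the previous invocation.
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=STATE_KEY)
        batch = int(obj["Body"].read())
    except s3.exceptions.NoSuchKey:
        batch = 0
    # Persist the pointer for the next scheduled run, wrapping around.
    s3.put_object(Bucket=BUCKET, Key=STATE_KEY, Body=str((batch + 1) % total_batches))
    return batch
```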
This looks like a fit for a batch-processing setup with an AWS Lambda function as the job. It's serverless, but it obviously adds a dependency on another AWS service.
At the same time, it gives you a dashboard, processing status, retries, and all the other perks of a job-scheduling service.
We have AWS alarms set up to email on alarm, but we would like to keep receiving the notification while the state stays in ALARM, even without a state change. How could I achieve this? (I would be happy to use a Lambda, but I have no idea how.)
Amazon CloudWatch alarm notifications are only sent when the state of the alarm changes. It is not possible to configure CloudWatch to continually send notifications while the alarm remains in the ALARM state.
You would need to write your own code to send such notifications. This could be done via a cron job, a scheduled AWS Lambda function, or your own application.
Try a script that uses the CloudWatch API, for example with Boto3 + Python, or a Lambda running every X minutes (see the sketch below). I have a Python script that reads values from CloudWatch that you can adapt: http://www.dbigcloud.com/cloud-computing/230-integrando-metricas-de-aws-cloudwatch-en-zabbix.html
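A minimal sketch of such a scheduled Lambda: list every alarm still in the ALARM state and re-publish it to an SNS topic (the topic ARN is a placeholder, with your email subscription attached to the topic):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")
sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:alarm-reminders"  # placeholder

def handler(event, context):
    # Every alarm currently in ALARM, regardless of when it last changed state.
    alarms = cloudwatch.describe_alarms(StateValue="ALARM")["MetricAlarms"]
    for alarm in alarms:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject=f"Still in ALARM: {alarm['AlarmName']}",
            Message=alarm["StateReason"],
        )
```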
One alternative is to create a Lambda function that sends the email and to hook it up to a CloudWatch rule with the Schedule option, with your Lambda function as the target. In the schedule you set the frequency at which you expect to receive the email; at that frequency, the rule will trigger the Lambda function to send it.