execute aws lambda function sequentially - amazon-web-services

I have written a Python script for autoscaling server naming, which checks the current servers in the Auto Scaling group and assigns an appropriate name and sequence number to each new server.
I trigger my AWS Lambda function from the Auto Scaling event. When I bring up 3 servers at the same time (or a new Auto Scaling group with a desired capacity of 10), I don't want the Lambda to execute in parallel, because that makes my script assign the same count to all servers.
Alternatively, could I implement some kind of locking to put the other Lambda invocations in a wait state? What should I use for this?

There are 2 options:
You can create a distributed lock using a DynamoDB table. This helps you maintain a state saying "operation in progress". Every time a Lambda function is invoked by an autoscaling event, check whether this record exists in DynamoDB. If it does not exist, create it and proceed. If it does exist, do nothing and return. After your Lambda function executes successfully, remove the entry. This will probably add no more than $2.00 to your total AWS bill, as the read and write capacity for this table will be very low.
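The important detail is that the existence check and the write must be a single atomic conditional put, or two Lambdas can still race. A minimal sketch of the pattern, with the DynamoDB table modeled by an in-memory dict and the lock id name hypothetical; the real call is shown in the comments:

```python
# Sketch of the DynamoDB distributed-lock pattern. The table is modeled
# with an in-memory dict here. With real DynamoDB, acquire_lock maps to:
#   table.put_item(Item={"lock_id": lock_id, "status": "in-progress"},
#                  ConditionExpression="attribute_not_exists(lock_id)")
# where a ConditionalCheckFailedException means the lock is already held.

_table = {}  # stands in for the DynamoDB lock table

def acquire_lock(lock_id):
    """Return True if we created the lock record, False if it already exists."""
    if lock_id in _table:               # the ConditionalCheckFailedException case
        return False
    _table[lock_id] = {"status": "in-progress"}
    return True

def release_lock(lock_id):
    """Remove the lock entry (delete_item in DynamoDB)."""
    _table.pop(lock_id, None)
```

Doing a separate get_item followed by put_item would reintroduce the race; the conditional expression pushes the check-and-set into DynamoDB itself.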
You can make use of Step Functions to implement this scenario. With Step Functions you can check whether an execution is already running and skip if so.
Read more about Step Functions here:
https://aws.amazon.com/step-functions/
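One way to do that "already running" check is to list the state machine's RUNNING executions and bail out if any exist. A sketch, with the boto3 calls left as comments and the decision kept as a pure function (the ARN is a placeholder):

```python
# Sketch: skip starting a new Step Functions execution if one is running.
# With boto3 the running list would come from:
#   sfn = boto3.client("stepfunctions")
#   running = sfn.list_executions(stateMachineArn=ARN,
#                                 statusFilter="RUNNING")["executions"]

def should_start(running_executions):
    """Start a new execution only when nothing is currently RUNNING."""
    return len(running_executions) == 0

# Called from the autoscaling-event Lambda:
# if should_start(running):
#     sfn.start_execution(stateMachineArn=ARN, input=event_json)
```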

Related

Can I use step functions to check whether a lambda is running to allow only one instance of lambda running to avoid db deadlock?

I have a state machine which gets triggered every time a file is uploaded to S3.
This state machine is triggered by a Lambda which is connected to SQS and is invoked on every file upload.
For one of the steps, this state machine contains a function which writes to a database.
I don't have a problem with multiple instances of the state machine running in parallel, but this 2nd Lambda function in the state machine, which writes to the db, should not run in parallel, to avoid table deadlock.
Is there a way I can get the state of the Lambda using Step Functions and execute it only when it is not running in any other instance of the step function?
If you want to limit an AWS Lambda function so that it only runs a maximum of one instance of the function at any time, then you can set the Reserved Concurrency to 1.
See: Managing Lambda reserved concurrency - AWS Lambda
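For reference, setting this via boto3 is a single call; the function name below is a placeholder:

```python
# Sketch: cap a function at one concurrent execution via boto3.
# The function name is a placeholder for your db-writing Lambda.
def set_single_concurrency(lambda_client, function_name="my-db-writer"):
    # Reserves exactly 1 concurrent execution for the function, so
    # additional invocations are throttled (and retried by async sources).
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,
    )
```

In real use, `lambda_client` would be `boto3.client("lambda")`.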

AWS lambda function for copying data into Redshift

I am new to the AWS world and I am trying to implement a process where data written into S3 by AWS EMR can be loaded into AWS Redshift. I am using Terraform to create S3, Redshift and the other supporting functionality. For loading the data I am using a Lambda function which is triggered when the Redshift cluster comes up. The Lambda function has the code to copy the data from S3 to Redshift. Currently the process seems to work fine, and the amount of data is currently low.
My question is:
This approach seems to work right now, but I don't know how it will behave once the volume of data increases, and what happens if the Lambda function times out.
Can someone please suggest an alternative way of handling this scenario, even if it can be handled without Lambda? One alternative I came across while searching this topic is AWS Data Pipeline.
Thank you
A serverless approach I've recommended clients move to in this case is the Redshift Data API (with Step Functions if needed). With the Redshift Data API you can launch a SQL command (COPY) and close your Lambda function. The COPY command will run to completion, and if that is all you need to do then you're done.
If you need to take additional actions after the COPY, then you need a polling Lambda that checks when the COPY completes. This is enabled by the Redshift Data API. Once the COPY completes you can start another Lambda to run the additional actions. All these Lambdas and their interactions are orchestrated by a Step Function that:
launches the first Lambda (initiates the COPY)
has a wait loop that calls the "status checker" Lambda every 30 sec (or whatever interval you want) and keeps looping until the checker says that the COPY completed successfully
Once the status-checker Lambda says the COPY is complete, the Step Function launches the additional-actions Lambda
The Step function is an action sequencer and the Lambdas are the actions. There are a number of frameworks that can set up the Lambdas and Step Function as one unit.
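The status-checker step might look like the sketch below. It assumes the first Lambda started the COPY via the Data API's `execute_statement` and passed the returned `Id` through the Step Function state:

```python
# Sketch of the "status checker" Lambda for a Redshift Data API COPY.
# The first Lambda would have started the COPY with something like:
#   resp = client.execute_statement(ClusterIdentifier=..., Database=...,
#                                   SecretArn=..., Sql=copy_sql)
#   statement_id = resp["Id"]
# and describe_statement(Id=statement_id) reports its status.

TERMINAL = {"FINISHED", "FAILED", "ABORTED"}

def classify(status):
    """Map a Data API status onto what the Step Function wait loop needs."""
    if status == "FINISHED":
        return "succeeded"
    if status in TERMINAL:
        return "failed"
    return "running"      # SUBMITTED / PICKED / STARTED: keep polling

def handler(event, context=None):
    # In the real Lambda, the status would come from
    # boto3.client("redshift-data").describe_statement(Id=event["Id"])["Status"]
    return {"Id": event["Id"], "state": classify(event["Status"])}
```

The Step Function's Choice state then loops back to the Wait state while `state` is "running".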
With bigger datasets, as you already know, Lambda may time out. But 15 minutes is still a lot of time, so you can implement an alternative solution in the meantime.
I wouldn't recommend Data Pipeline as it adds overhead (it will start an EC2 instance to run your commands). Your problem is simply the timeout, so you could use either ECS Fargate or a Glue Python Shell job. Either of them can be triggered by a CloudWatch Event on an S3 event.
a. With ECS Fargate, you'll have to take care of the Docker image and set up the ECS infrastructure, i.e. the Task Definition and Cluster (simple for Fargate).
b. With a Glue Python Shell job, you'll simply have to deploy your Python script to S3 (along with the required packages as wheel files) and link those files in the job configuration.
Both of these options are serverless, and you can choose one based on ease of deployment and your comfort level with Docker.
ECS doesn't have any timeout limit, while the timeout limit for Glue is 2 days.
Note: to trigger an AWS Glue job from a CloudWatch Event, you'll have to use a Lambda function, as CloudWatch Events doesn't support Glue start-job as a target yet.
Reference: https://docs.aws.amazon.com/eventbridge/latest/APIReference/API_PutTargets.html

Serverless Task Scheduling on AWS

So our project was using Hangfire to dynamically schedule tasks, but keeping in mind auto scaling of server instances we decided to do away with it. I was looking for a cloud-native serverless solution and decided to use CloudWatch Events with Lambda. I later discovered that there is an upper limit on the number of rules that can be created (100 per account), and that wouldn't scale automatically. So now I'm stuck, and any suggestions would be great!
As per the CloudWatch Events documentation, you can request a limit increase:
100 per region per account. You can request a limit increase. For instructions, see AWS Service Limits.
Before requesting a limit increase, examine your rules. You may have multiple rules each matching to very specific events. Consider broadening their scope by using fewer identifiers in your Event Patterns in CloudWatch Events. In addition, a rule can invoke several targets each time it matches an event. Consider adding more targets to your rules.
If you're trying to create a serverless task scheduler, one possible way could be:
A CloudWatch Event that triggers a Lambda function every minute.
The Lambda function reads a DynamoDB table and decides which actions need to be executed at that time.
The Lambda function could dispatch the execution to other functions or services.
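The "decide which actions need to be executed" step can be sketched as a filter over the table items. The attribute names here (`run_at`, `action`) are hypothetical:

```python
# Sketch of the per-minute scheduler's decision step. Items are assumed
# to carry a hypothetical "run_at" epoch attribute set at creation time.

def due_tasks(items, now_epoch, window_seconds=60):
    """Return tasks whose run_at falls inside the current one-minute tick."""
    return [t for t in items
            if now_epoch <= t["run_at"] < now_epoch + window_seconds]

# The Lambda would then dispatch each due task, e.g. by invoking another
# function asynchronously or pushing it to a queue.
```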
So I decided to do as Diego suggested: use CloudWatch Events to trigger a Lambda every minute, which queries DynamoDB to check for the tasks that need to be executed.
I had some concerns regarding the data that would be fetched from DynamoDB (duplicate items in case of an execution longer than 1 minute), so I decided to set the concurrency of that Lambda to 1.
I also had some concerns about executing those tasks directly from that Lambda itself (timeouts, and tasks at the end of a long list), so what I'm doing is pushing each task separately to SQS, and another Lambda is triggered by SQS to execute those tasks in parallel. So far the results look good; I'll keep updating this thread if anything comes up.
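The dispatch step described above might look like this sketch, where each due task becomes its own SQS message (the queue URL is a placeholder, and the actual send is `sqs.send_message`):

```python
import json

# Sketch of the fan-out step: one SQS message per task, so a slow task
# never delays the others. The queue URL is a placeholder.

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/tasks"

def to_messages(tasks):
    """Build one send_message kwargs dict per task."""
    return [{"QueueUrl": QUEUE_URL, "MessageBody": json.dumps(t)}
            for t in tasks]

# In the scheduler Lambda, with sqs = boto3.client("sqs"):
# for msg in to_messages(due):
#     sqs.send_message(**msg)
```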

Calling a function at specified times without a running server

Need to call a function at specific times without having a server up and running all the time
In particular, the challenge I'm facing is that we only use AWS Lambda and DynamoDB to - among other things - send a reminder to users at a time of their choice. That means we have to call a lambda function at the time the user wants to be reminded.
The time changes dynamically (depending on each user's choice) so the question is, what is a good way to set this up?
We are considering setting up a server if there's no way around it but even if we go for this solution, I lack the experience to see a good way to set this up. Any suggestions are greatly appreciated.
You can use a DynamoDB TTL event stream to trigger a Lambda to achieve this. The approach is as follows.
Create a DynamoDB table to store user alarms.
When a user sets up an alarm, store the alarm's time as an absolute epoch timestamp in the item's TTL attribute, along with the alarm information (DynamoDB TTL expects an absolute expiry timestamp, not a relative offset).
Configure DynamoDB Streams to trigger a Lambda when the TTL expires and the item is removed.
Note that TTL deletion is approximate and can lag behind the expiry time, so this suits reminders that don't have to fire at an exact moment.
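A sketch of the alarm item and the stream-side check, assuming a table whose TTL attribute is named `expires_at` (a hypothetical name):

```python
# Sketch: build the alarm item for a table whose TTL attribute is
# "expires_at" (hypothetical). DynamoDB TTL expects absolute epoch
# seconds; when the item expires and is deleted, the stream record
# (eventName == "REMOVE") triggers the reminder Lambda.

def alarm_item(user_id, alarm_epoch, message):
    return {
        "user_id": user_id,
        "expires_at": int(alarm_epoch),  # TTL attribute: absolute epoch seconds
        "message": message,
    }

def is_ttl_delete(record):
    """In the stream Lambda, react only to deletions (TTL expiry)."""
    return record.get("eventName") == "REMOVE"
```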
Alternatively, you can call your Lambda function on a scheduled event:
http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html
So set up your Lambda function with a cron-like event to wake on whatever interval you need, retrieve the list of alarms due next, send them, and mark completed alarms so they won't be triggered again.

Dynamically create cronjobs in AWS

Is there a way to dynamically create scheduled Lambda calls in AWS? I have to create many scheduled Lambda calls. I am aware of CloudWatch rules, but there is a limit on the number you can create. I also heard about Cronally, but they have not launched yet, and I'd rather do something like this on my own. I do not see an obvious solution without trade-offs, but does the 'easy way' exist, or does it all depend on the particular application?
The CloudWatch Events docs say the limit of 50 rules per account can be raised on request, so they might be able to raise it high enough for your needs.
Alternatively, you could have a single rule that fires one "scheduler" Lambda function every minute. That scheduler can contain a schedule of which functions get fired at which times, and invoke the other Lambda functions according to that schedule. You could even store the schedule in a DynamoDB table or S3 bucket, so you don't need to update the Lambda function itself to change the schedule.
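A minimal sketch of that scheduler, with the schedule kept as a mapping from "HH:MM" to function names (all names hypothetical; in practice the mapping could be loaded from DynamoDB or S3, and the real dispatch is `lambda_client.invoke`):

```python
# Sketch of the single "scheduler" Lambda. The schedule maps "HH:MM"
# to the Lambda functions to fire at that minute; the entries below
# are hypothetical examples.

SCHEDULE = {
    "09:00": ["send-daily-report"],
    "09:30": ["rotate-logs", "cleanup-temp"],
}

def functions_to_fire(now_hhmm, schedule=None):
    """Look up which functions are due at this minute's tick."""
    return (schedule or SCHEDULE).get(now_hhmm, [])

# In the handler, fire-and-forget each one with lambda_client = boto3.client("lambda"):
# for name in functions_to_fire(datetime.utcnow().strftime("%H:%M")):
#     lambda_client.invoke(FunctionName=name, InvocationType="Event")
```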