What's the update behavior (rolling vs blue/green) for AWS Lambda Functions when consuming from Kinesis Stream? - amazon-web-services

Let's say I have a Kinesis Stream with 4 shards being consumed by a Lambda Function. The stream is continuously receiving events so it would be a high usage scenario. As I have 4 shards I'd have 4 function instances running at the same time (assuming Parallelization Factor=1). Then I publish a new version of the function with some new code. What happens then?
The next invocation of the function will always pick up the latest version, meaning that there won't be interleaved invocations of both old and new versions?
A "rolling update", where each of the 4 instances is replaced one at a time over an interval, so that some batches are processed by the old version and some by the new one?
Something else?

The behavior is most like your first bullet point, but the details can vary.
A number of worker processes in the backend poll the shards for work. Whenever there is work to do, they do a synchronous invoke to the Lambda-API and wait for the response (docs).
The Lambda-API is now responsible for picking an execution context to handle the request. It will do that depending on the function ARN that you specified in the event source mapping. If you use the default, i.e. an unqualified ARN, which resolves to $LATEST, Lambda will just create an execution context with the latest version of the code (or reuse an existing one that satisfies the criteria) to run your code.
If you use an alias or a specific version in the function ARN for the event source mapping, the behavior depends on what you pick. If you specify a function version, Lambda will specifically execute that version.
If you specify an alias that does weighted routing to multiple function versions, each request is routed to one of those versions according to the configured weights.
tl;dr The behavior depends on the function ARN in the event source mapping, but usually, Lambda will switch to the new version without doing any rolling update logic. It behaves more like a blue-green deployment.
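The routing rules above can be sketched as a toy model. This is not Lambda's actual implementation; the function name resolve_version and the alias dictionary layout are assumptions made for illustration only.

```python
import random


def resolve_version(qualifier, published_versions, aliases, rng=random):
    """Toy model: decide which function version handles one invocation.

    qualifier          -- None / "$LATEST", a version number, or an alias name
    published_versions -- set of published version numbers (as strings)
    aliases            -- hypothetical layout:
                          {"live": {"version": "1", "routing": {"2": 0.1}}}
    """
    # An unqualified ARN always resolves to the most recently uploaded code.
    if qualifier is None or qualifier == "$LATEST":
        return "$LATEST"
    # A pinned version never changes, no matter what is published later.
    if qualifier in published_versions:
        return qualifier
    # An alias routes to its primary version unless a weighted
    # additional version wins the draw for this request.
    alias = aliases[qualifier]
    roll = rng.random()
    cumulative = 0.0
    for version, weight in alias.get("routing", {}).items():
        cumulative += weight
        if roll < cumulative:
            return version
    return alias["version"]
```

In this model only the weighted alias ever mixes old and new versions, which is the closest thing to a rolling update; the other two cases switch in one step.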

Related

How can I track the progress/status of an asynchronous AWS Lambda invocation?

I have an API which I use to trigger AWS Lambda jobs. Upon request, the API invokes an AWS Lambda job with InvocationType='Event'. Afterwards, I want to periodically poll whether the AWS Lambda job has finished.
The approach that would fit my architecture best is to store an identifier of the Lambda job in a database and periodically check whether the job has finished and what its output is. However, I was not able to find out how to do this.
How can I periodically poll for the result of an AWS Lambda job, and view the output once it has finished?
I have looked into using InvocationType='RequestResponse', but this requires me to store a future, which I cannot do in a database.
There's no built-in way to check for the status of an asynchronous Lambda invocation.
Asynchronous Lambda invocation, using the event invocation type, is meant to be a fire and forget job. As such, there's no 'progress' or 'status' to get or poll for.
As you don't want to wait for the Lambda to complete, synchronous Lambda invocation is out of the picture. In this case, you need to write your own logic to keep track of the status.
One way you could do this is to store a (job) item in a DynamoDB jobs table with 2 attributes:
jobId UUID (String attribute, set as the partition key)
completed boolean flag (Boolean attribute)
Workflow is then as follows:
Within your API, create & store a new job with completed defaulting to 'false'
Pass the newly-created jobId to the Lambda being invoked in the payload
When the Lambda finishes, look up the job associated with the passed-in jobId within the jobs table & set the completed attribute of the job to true
You can then periodically poll for the result of the job within the DynamoDB table.
Or take a look at using DynamoDB Streams as a way to know when a job finishes in near-real time without polling.
As to viewing the 'output', AWS Lambda just returns a success response without additional information. There is no 'output'. Store any output you might need in persistent storage - maybe an extra output attribute as a String with each job? - & later retrieve it.
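The workflow above could look roughly like the following sketch. It assumes table is a boto3 DynamoDB Table resource for the jobs table and lambda_client is a boto3 Lambda client; the helper names start_job, finish_job and poll_job, and the output attribute, are made up for this example.

```python
import json
import uuid


def start_job(table, lambda_client, function_name, payload):
    """API side: record a new job, then fire the Lambda asynchronously."""
    job_id = str(uuid.uuid4())
    table.put_item(Item={"jobId": job_id, "completed": False})
    lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # fire and forget
        Payload=json.dumps({**payload, "jobId": job_id}).encode("utf-8"),
    )
    return job_id


def finish_job(table, job_id, output):
    """Lambda side: call this at the end of the handler."""
    table.update_item(
        Key={"jobId": job_id},
        # Attribute names are aliased to stay clear of DynamoDB reserved words.
        UpdateExpression="SET #done = :done, #out = :out",
        ExpressionAttributeNames={"#done": "completed", "#out": "output"},
        ExpressionAttributeValues={":done": True, ":out": output},
    )


def poll_job(table, job_id):
    """API side: returns (completed, output) for the given job."""
    item = table.get_item(Key={"jobId": job_id}).get("Item") or {}
    return bool(item.get("completed")), item.get("output")
```

With real AWS resources, table would come from boto3.resource("dynamodb").Table(...) and lambda_client from boto3.client("lambda").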
@Ermiya Eskandary's answer is absolutely right.
I am a DynamoDB subject matter expert and have implemented this status tracking pattern (including error handling, retries and error logging) for many of my customers.
You could check the pynamodb_mate library; it has the status tracker pattern implemented and you can enable it with around 15 lines of code.
In general, when you say you want status tracking, you are talking about the following:
Each task should be handled by only one worker, so you want a concurrency lock mechanism to avoid double consumption. (A lot of people aren't aware of this; it is usually discussed under the heading of idempotency.)
For succeeded tasks, store additional information such as the output of the task, and log the success time.
For failed tasks, log the error message for debugging, so you can fix the bug and rerun the task.
For failed tasks, you want to get all of them with one simple query and rerun them with the updated business logic.
For tasks that have failed too many times, you don't want to retry them anymore and want to ignore them. (A lot of people run into an endless loop when they deploy to production and only then realize that this is a necessary feature.)
Run custom queries based on task status for analytics purposes.
You can read this Jupyter notebook example.
Basically, with pynamodb_mate your Lambda job application code becomes:
# this is your lambda application code
def lambda_handler(event, context):
    ...

# your new code should be (tracker is the status tracker object set up beforehand):
with tracker.start_job():
    lambda_handler(event, context)
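To make the list of requirements above more concrete, here is a minimal in-memory sketch of such a tracker. It does not use pynamodb_mate and does not reflect its API; track_job, the exception classes and the dict-based store are all hypothetical. A real implementation would keep the state in DynamoDB and take the lock with a conditional write, since a plain dict is not safe across workers.

```python
from contextlib import contextmanager


class JobLockedError(Exception):
    """Another worker is already processing this job."""


class JobIgnoredError(Exception):
    """The job failed too many times and is no longer retried."""


@contextmanager
def track_job(store, job_id, max_retries=3):
    job = store.setdefault(
        job_id, {"status": "pending", "retries": 0, "error": None}
    )
    # Lock check: refuse to run a job that is already being processed.
    if job["status"] == "in_progress":
        raise JobLockedError(job_id)
    # Retry cap: avoid the endless loop on permanently failing jobs.
    if job["retries"] >= max_retries:
        job["status"] = "ignored"
        raise JobIgnoredError(job_id)
    job["status"] = "in_progress"
    try:
        yield job  # the caller may attach output to this dict
    except Exception as exc:
        job["status"] = "failed"
        job["retries"] += 1
        job["error"] = repr(exc)  # kept for debugging and later reruns
        raise
    else:
        job["status"] = "succeeded"
```

Because every job carries a status field, "all failed jobs" or "all ignored jobs" becomes a simple filter over the store.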
If your application code is not Python, then you have two options:
Create another Lambda function that invokes the original one in synchronous mode. However, you pay more, because the "caller" Lambda function runs for the whole duration as well.
Suppose your Lambda code is in Node.js: then add an additional Lambda runtime as a layer and wrap a Python function around your Node.js code. In short, you are using Python to call Node.js.

Lambdas calls speed changes

I've created a simple lambda that reads data from dynamodb.
First time I call the lambda it takes about 1500ms to complete, but then after I run the lambda again it takes about 150ms. How is it possible?
What type of response caching does AWS perform to achieve this?
On the first call, AWS Lambda has to provision infrastructure for your function, which takes time. It also has to start the language runtime (for a Java function, a JVM) and load your code before it can call the function. Starting the runtime takes time and thus incurs some overhead.
This is known as a cold start: it happens when there is no idle container available waiting to run the code. All of this is invisible to the user, and AWS has full control over when to kill containers.
These steps are involved in the first call, which is why you see 1500 ms.
On the next call everything is already in place, so Lambda gives you a response in 150 ms or less.
This is by design: to save infrastructure cost, a serverless platform only provisions infrastructure when it is needed, i.e. on the first call.
I would suggest reading the documentation:
- https://aws.amazon.com/lambda/
This happens due to a cold start. It mainly occurs when we invoke the Lambda for the first time after deployment, or when a Lambda function has been idle for some time.
These articles explain how the language, memory or package size of the Lambda affects cold starts:
https://read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76
https://mikhail.io/serverless/coldstarts/aws/
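The effect described in the answers above can be reproduced in miniature: anything stored at module level survives between invocations that land on the same warm container, so the expensive setup only runs once. The sketch below is illustrative; _expensive_setup is a hypothetical stand-in for work such as creating SDK clients or loading configuration.

```python
# Module-level state lives as long as the container does.
_resources = None
_setup_calls = 0


def _expensive_setup():
    """Hypothetical stand-in for slow initialisation work."""
    global _setup_calls
    _setup_calls += 1
    return {"table_name": "example-table"}


def lambda_handler(event, context):
    global _resources
    cold = _resources is None
    if cold:
        # Only the first invocation on a fresh container pays this cost.
        _resources = _expensive_setup()
    return {"cold_start": cold, "setup_calls": _setup_calls}
```

Calling the handler twice in the same process mimics a cold invocation followed by a warm one: only the first call runs the setup.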

Could AWS Scheduled Events potentially overlap?

I would like to create a Scheduled Event to run a Lambda that executes an API call every 1 minute (cron-like behaviour).
The caveat to this setup is that the external API is unreliable / slow, and the API call could sometimes last longer than 1 minute.
So, my question here is: given the setup & scenario, would AWS run another Scheduled Event and execute the Lambda before the previous execution has finished? I.e. overlap?
If it does; is there a way to configure the scheduled event to not "overlap"?
I did some initial research into this and came across this article:
https://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html
It looks like you can set concurrency limits at function level? Is this the way to achieve non-overlapping scheduled lambda executions? i.e. set the function's concurrency limit to 1?
Yes. By default it will execute your Lambda function every 1 minute, regardless of whether the previous invocation has completed or not.
To enforce no more than one running instance of your Lambda function at a time, set the reserved concurrency of your Lambda function to 1.
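As a sketch, the concurrency setting can also be applied through the SDK. This assumes lambda_client is a boto3 Lambda client; the helper name limit_to_single_execution and the function name in the usage note are placeholders.

```python
def limit_to_single_execution(lambda_client, function_name):
    """Reserve exactly one concurrent execution for the function.

    While one invocation is running, further invocations are
    throttled instead of running in parallel.
    """
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,
    )
```

Usage would be along the lines of limit_to_single_execution(boto3.client("lambda"), "my-scheduled-function").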

Make Lambda function execute now, and/or in an hour

I'm trying to implement an AWS Lambda function that should send an HTTP request. If that request fails (the response is anything but status 200), I should wait another hour before retrying (longer than the Lambda stays hot). What's the best way to implement this?
What comes to mind is to persist my HTTP request in some way and be able to trigger the Lambda function again after a specified amount of time if there is a persisted HTTP request. But I'm not completely sure which AWS service would provide that functionality for me. Is SQS an option that can help here?
Or, can I dynamically schedule Lambda execution for this? Note that the request to be retried should be identical to the first one.
Any other suggestions? What's the best practice for this?
(Lambda function is my option. No EC2 or such things are possible)
You can't directly trigger Lambda functions from SQS (at the time of writing, anyhow; Lambda has since added SQS as a supported event source).
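You could potentially handle the non-200 errors by writing the request data (with an appropriate expiry timestamp) to a DynamoDB table that's configured for TTL. You can use DynamoDB Streams to detect when DynamoDB deletes a record, and the stream can trigger a Lambda function. Note that DynamoDB removes expired items in the background some time after the expiry timestamp rather than exactly at it, so the one-hour delay will only be approximate.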
This is obviously a roundabout way to achieve what you want but it should be simple to test.
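A rough sketch of the two halves of this approach follows. The attribute names requestId, request and expiresAt are assumptions (the TTL attribute is whatever you configure on the table), and the stream must be configured to include old images for OldImage to be present.

```python
import time

# TTL deletions are performed by the DynamoDB service itself and are
# marked with this principal in the stream record's userIdentity field.
TTL_PRINCIPAL = "dynamodb.amazonaws.com"


def build_retry_item(request_id, request, delay_seconds=3600, now=None):
    """Item to write when the HTTP call fails; expires after the delay."""
    now = int(time.time()) if now is None else int(now)
    return {
        "requestId": request_id,
        "request": request,
        "expiresAt": now + delay_seconds,  # TTL attribute, epoch seconds
    }


def expired_requests(stream_event):
    """Return the old images of records that were removed by TTL.

    Images are in DynamoDB JSON, e.g. {"request": {"S": "..."}}.
    """
    expired = []
    for record in stream_event.get("Records", []):
        if record.get("eventName") != "REMOVE":
            continue
        identity = record.get("userIdentity", {})
        if identity.get("principalId") != TTL_PRINCIPAL:
            continue  # a manual delete, not an expiry
        expired.append(record["dynamodb"]["OldImage"])
    return expired
```

The stream-triggered Lambda would call expired_requests(event) and resend each returned request.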
As jarmod mentioned, you cannot trigger Lambda functions directly from SQS. But a workaround (one I've used personally) would be to do the following:
If the request fails, push an item to an SQS Delay Queue (docs).
This SQS message will only become visible on the queue after a certain delay. Note that SQS caps the delay at 15 minutes, so for the hour you mentioned you would also store the retry time in the message and leave or re-queue messages that are not due yet.
Then have a second, scheduled Lambda function which is triggered by a cron value of a smaller timeframe (I used a minute).
This second function would then poll the SQS queue and, if a due item is on the queue, call your first Lambda function (either via SNS or with the AWS SDK) to retry it.
PS: Note that you can put data in an SQS item. Since you mentioned that the retried request needs to be identical, you can store your first function's input there so it can be reused after an hour.
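The delay-queue workaround might be sketched as follows. It assumes sqs is a boto3 SQS client; the helper names and the retryAt field are invented for this example. Since a single SQS delay cannot exceed 900 seconds, the sketch stores the intended retry time in the message body and lets the poller check it.

```python
import json
import time

MAX_SQS_DELAY = 900  # SQS allows at most 15 minutes of delay per message


def enqueue_retry(sqs, queue_url, request, retry_at, now=None):
    """Queue the failed request; returns the delay that was applied."""
    now = time.time() if now is None else now
    delay = int(min(MAX_SQS_DELAY, max(0, retry_at - now)))
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"request": request, "retryAt": retry_at}),
        DelaySeconds=delay,
    )
    return delay


def is_due(message_body, now=None):
    """Poller side: True once the stored retry time has passed."""
    now = time.time() if now is None else now
    return json.loads(message_body)["retryAt"] <= now
```

The scheduled poller would retry messages for which is_due is true and call enqueue_retry again (deleting the old message) for the rest.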
I suggest that you take a closer look at the AWS Step Functions for this. Basically, Step Functions is a state machine that allows you to execute a Lambda function, i.e. a task in each step.
More information can be found if you log in to your AWS Console and choose "Step Functions" from the "Services" menu. After pressing the Get Started button, several example implementations of different Step Functions are presented. First, I would take a closer look at the "Choice state" example (to determine whether or not the HTTP request was successful). If it was not, then proceed with the "Wait state" example.
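Combining the Choice and Wait states, a state machine definition for this retry loop might look like the sketch below, built as a Python dict and serialised to Amazon States Language JSON. The state names are arbitrary, and it assumes the Lambda returns its original input plus a statusCode field, so that the retried request stays identical.

```python
import json


def retry_state_machine(function_arn, wait_seconds=3600):
    """Build a definition: call the Lambda, wait and retry unless status is 200."""
    return {
        "Comment": "Retry an HTTP request after a fixed wait",
        "StartAt": "SendRequest",
        "States": {
            "SendRequest": {
                "Type": "Task",
                "Resource": function_arn,
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.statusCode",
                        "NumericEquals": 200,
                        "Next": "Done",
                    }
                ],
                "Default": "WaitBeforeRetry",
            },
            "WaitBeforeRetry": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "SendRequest",
            },
            "Done": {"Type": "Succeed"},
        },
    }


def definition_json(function_arn):
    """JSON string suitable for the state machine's definition field."""
    return json.dumps(retry_state_machine(function_arn), indent=2)
```

As written, the loop retries indefinitely; a retry counter in the state input with a second Choice rule would be one way to bound it.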

AWS Lambda faster process way

Currently, I'm implementing a solution based on S3, Lambda and DynamoDB.
My use case is: when a new object is uploaded to S3, a first Lambda function is called, downloads the new file, splits it into around 100 (or more) parts and adds additional information to each of them. In the next step, each part is processed by a second Lambda function, and in some cases an insert is performed in DynamoDB.
My question is only about the best, meaning the fastest, way to call the "second Lambda". I want to execute 100 Lambda functions (if I have 100 parts to process) at the same time.
I know there are different possibilities:
1) My first Lambda function can push each part as an item to a Kinesis stream, and my second Lambda function will react, retrieve an item and process it. In this case I don't know whether AWS will launch a new Lambda function each time there is a remaining item in the stream. Maybe there is some limitation...
2) My first Lambda function can push each part to an SNS topic, and then my second Lambda will react to each new message. In this case I have some doubts about the latency (the time between sending a message through the SNS topic and my second Lambda function being executed).
3) My first Lambda function can launch the second one directly by performing an API call and passing the information. In this case I have no idea whether I can launch 100 Lambda functions at the same time. I think I'll be stuck on a rate limit of the AWS API (I said, I think!)
Does anybody have feedback or advice regarding my use case? Once more, the most important thing for me is to have the fastest processing possible.
Thanks
Lambda limits are in place to provide some sane defaults but many workloads quickly exceed them. You can request an increase so this will not be a bottleneck for your use case. This document describes the process:
http://docs.aws.amazon.com/lambda/latest/dg/limits.html
I'm not sure how much latency your use case can tolerate but I often use SNS to fan out and the latency is usually sub-second to the next invocation (unless it's Java/coldstart).
If latency is extremely sensitive then you'd probably want to invoke Lambdas directly using Invoke with the InvocationType set to "Event". This would minimize blocking while you Invoke 100 times. You could also thread these Invoke calls within your main Lambda function to further increase parallelism if you want to hyper-optimize.
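A minimal sketch of the threaded fan-out described above, assuming lambda_client is a boto3 Lambda client; fan_out and its parameters are illustrative names, not part of any AWS API.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fan_out(lambda_client, function_name, parts, max_workers=20):
    """Invoke the second Lambda once per part, asynchronously and in parallel.

    Returns the HTTP status code of each Invoke call, in input order;
    an accepted asynchronous invocation reports 202.
    """

    def invoke(part):
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # do not wait for the result
            Payload=json.dumps(part).encode("utf-8"),
        )
        return response["StatusCode"]

    # Threads overlap the network round trips of the Invoke calls.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(invoke, parts))
```

The first Lambda would call fan_out(boto3.client("lambda"), "second-function", parts) after splitting the file.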
Cold containers will occasionally cause latency in your invocations. If milliseconds count this can become tricky. People who are trying to hyper-optimize Lambda processing times will sometimes schedule executions of their Lambda function with a "heartbeat" event that returns immediately (so processing time is cheap). These containers will remain "warm" for a small period of time which allows them to pick up your events without incurring "cold startup" time. Java containers are much slower to spin up cold than Node containers (I assume Python is probably equally fast as Node though I haven't tested).
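The "heartbeat" trick could be handled in the function like this. The heartbeat key is a made-up convention for the scheduled event's payload, and process_part is a placeholder for the real work.

```python
def process_part(event):
    """Placeholder for the real processing of one part."""
    return {"processed": event}


def lambda_handler(event, context):
    # Scheduled warm-up events return immediately, so they cost
    # almost nothing but keep the container around for real events.
    if event.get("heartbeat"):
        return {"warmed": True}
    return process_part(event)
```

The schedule that sends the heartbeat would use a constant JSON input such as {"heartbeat": true}.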