AWS Lambda: handling cold start requests differently

I am using provisioned concurrency for the Lambdas running my application. The initialisation time for my function is quite high. I was wondering if there is a way to check at runtime whether the Lambda instance under consideration is a provisioned one or not. If it isn't, I would like to handle the request in a very lean manner, cutting out a lot of the activities done during initialisation and during the actual request handling, to provide a degraded experience.
Also, it seems unlikely, but could I essentially spin off a background thread during the initialisation phase to take care of the heavy activities, and keep a flag in my code that checks whether initialisation is complete? If it isn't when I start processing the request, I go ahead with the degraded experience; otherwise I fulfill the request normally. I am not sure how a background thread would behave in Lambda.
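For what it's worth, both ideas can be sketched in code. Lambda exposes an AWS_LAMBDA_INITIALIZATION_TYPE environment variable whose value is "provisioned-concurrency" on provisioned instances, which makes the runtime check possible. Below is a minimal Python sketch combining that check with a background-init flag; heavy_init and the response payloads are hypothetical stand-ins:

    import os
    import threading

    # Set once the expensive initialisation has finished.
    init_done = threading.Event()

    def heavy_init():
        # ... expensive setup: caches, DB connections, model loading ...
        init_done.set()

    if os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE") == "provisioned-concurrency":
        # Provisioned instances are initialised ahead of traffic, so the
        # heavy work can run inline without delaying any real request.
        heavy_init()
    else:
        # On-demand cold start: push the heavy work to a background thread
        # and serve early requests in degraded mode.
        threading.Thread(target=heavy_init, daemon=True).start()

    def handler(event, context):
        if init_done.is_set():
            return {"mode": "full"}      # full experience, heavy deps ready
        return {"mode": "degraded"}      # lean fallback path

One caveat on the background-thread half: Lambda freezes the execution environment between invocations, so the thread only makes progress during the init phase and while a handler is actually running, which is the main reason background-thread behaviour is hard to rely on here.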

Related

AWS Lambda async code execution

I've scoured for an answer, but everything I've read is about concurrent Lambda executions or the async keyword syntax in Node; I can't find information about execution within a single Lambda instance.
The genesis of this was that I was at a meetup and someone mentioned that Lambda instances (i.e. the ephemeral containers hosted by AWS containing my code) can only execute one request at a time. This means that if I had 5 requests come in (for the sake of simplicity, let's say to already-warm instances), they would each run in a separate instance, i.e. in 5 separate containers.
The bananas thing to me is that this undermines years of development in async programming. Starting back in 2009, Node.js popularized programming with I/O in mind, given that for a boring, run-of-the-mill CRUD app most of your request time is spent waiting on external DB calls or something similar. Writing async code allowed a single thread of execution to seemingly execute many simultaneous requests. While Node didn't invent it, I think it's fair to say it popularized it, and it has been a massive driver of backend technology development over the last decade. Many languages have added features to make async programming easier (callbacks/tasks/promises/futures, or whatever you want to call them), and web servers have shifted to event-loop-based designs (Node, Vert.x, Kestrel, etc.), away from the thread-per-request models of yesteryear.
Anyway, enough with the history lesson. My point is that if what I heard is true, then developing with Lambda throws most of that out the window. If the Lambda runtime will never send multiple requests through my running instance, then programming in an async style just wastes resources. Say, for example, I'm writing C# and my Lambda retrieves widgets. Then this code, var response = await db.GetWidgets(), is actually inefficient, because it pushes the current thread context onto the stack so that other code can execute while it waits for the call to come back. Since no other request will be invoked until the original one completes, it makes more sense to program in a synchronous style, save for places where parallel calls can be made.
Is this correct?
If so, I'm honestly shocked it's not discussed more. Async programming is the biggest paradigm shift I've seen in the last few years, and this totally changes that.
TL;DR: does Lambda really only allow one request execution at a time per instance? If so, this upends a major shift in server development towards asynchronous code.
Yes, you are correct: Lambda will spin up multiple containers to handle concurrent requests, even if your Lambda does some work asynchronously (I have confirmed this through my own experience, and many other people have observed the same behavior; see this answer to a similar question). This is true for every supported runtime (C#, Node.js, etc.).
This means that async code in your Lambda function won't allow one Lambda container to handle multiple requests at once, as you stated. That said, you still get all the other benefits of async code, and you can still potentially improve your Lambda's performance by, for example, making many web service or database calls at once asynchronously; so this property of Lambda does not make async programming useless on the platform.
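To illustrate that last point, here is a minimal Python sketch of parallelising I/O inside a single invocation; the widgets table, its key shape and the widget_ids field are made up for the example. boto3 clients are thread-safe, so a thread pool is the simplest way to get concurrent calls out of an otherwise synchronous handler:

    import boto3
    from concurrent.futures import ThreadPoolExecutor

    dynamodb = boto3.client("dynamodb")

    def fetch_widget(widget_id):
        # One blocking DynamoDB read; several can be in flight at once.
        resp = dynamodb.get_item(
            TableName="widgets",             # hypothetical table
            Key={"id": {"S": widget_id}},
        )
        return resp.get("Item")

    def handler(event, context):
        # The container still serves one request at a time, but that one
        # request can parallelise its own I/O.
        with ThreadPoolExecutor(max_workers=10) as pool:
            widgets = list(pool.map(fetch_widget, event["widget_ids"]))
        return {"widgets": widgets}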
Regarding this part of your question:
Since no other request will be invoked until the original one completes, it makes more sense to program in a synchronous style, save for places where parallel calls can be made.
No, because you no longer have to wait for the answer, as you would with a synchronous process. Your trigger itself should finish right after the call so that its memory is freed. Either the Lambda sends a notification or triggers a new service once it has completed, or a watcher looks at the result value (it is possible to wait for the answer with a synchronous Lambda invocation, but it is not reliable because of the asynchronous machinery underneath the Lambda system itself). If you are an Android developer, you can compare this to intents and broadcasts; it is completely async.
It is a completely different way to design a solution, because the async mechanism must be managed at the workflow layer itself, no longer in the core of the app. The solution becomes an aggregation of notifiers/watchers that trigger micro-services; it is no longer a single binary of thousands of lines of code.
Each Lambda function should be an individual micro-service.
Coming back to handling heavy traffic: you can run millions of Lambda invocations in parallel, and as long as each micro-service finishes quickly, it won't cost much.
To ensure that your workflow does not drop anything, you can add SQS (message queuing) to the solution.
Further to the above answer, please see here. From what I understand, the request/response cycle is a synchronous loop, so the only way to make things async from a request-handling perspective is to delegate the work to a message queue, e.g. SQS, as written here. I think this is similar to how Celery is used to make Django asynchronous. Lastly, if you truly want async handling of requests in line with async/await in Node.js/Python/C#/C++, you may need to use AWS Fargate / EC2 instead of Lambda. Otherwise in Lambda, as you have mentioned yourself, it's bananas indeed. On the other hand, for heavy traffic, which is where async/await shows its benefits, Lambda is not a good fit. There is a break-even analysis here comparing the three services: EC2, Lambda and Fargate.
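As a concrete illustration of the delegate-to-a-queue pattern both answers mention, here is a minimal Python sketch: the request-facing Lambda enqueues the work and returns immediately, and a worker consumes the queue, here sketched as a second Lambda fed by an SQS event source mapping (assuming that wiring is available). The queue URL and payload shape are hypothetical:

    import json
    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # hypothetical

    def api_handler(event, context):
        # Hand the slow work to the queue and answer right away.
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(event))
        return {"statusCode": 202, "body": "accepted"}

    def worker_handler(event, context):
        # Invoked by the SQS event source mapping, one batch per invocation.
        for record in event["Records"]:
            job = json.loads(record["body"])
            # ... do the heavy, slow work with `job` here ...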

How good are Lambda functions for hitting a REST API after a fixed amount of time?

I am using the distributed scheduler 'Chronos' (a distributed crontab) to hit a REST API a few minutes after a job is added (example: add a job at time T to schedule it at T+5 minutes). This runs on sizeable infrastructure and takes care of fault tolerance and avoiding data loss, but it has a significant cost, and I am considering alternatives for the same requirement. Please advise whether this can be done using a Lambda function.
It's possible to invoke a Lambda function, block/wait for X seconds and then continue execution, but it is not recommended. You cannot wait for more than 300 seconds, though, as that is the maximum timeout allowed for Lambda functions.
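For completeness, the not-recommended sleep-then-call version looks something like this sketch; the callback_url and delay fields are placeholders invented for the example, and the delay has to stay below the function's configured timeout:

    import json
    import time
    import urllib.request

    def handler(event, context):
        delay = event.get("delay_seconds", 60)  # must stay below the timeout
        url = event["callback_url"]             # hypothetical REST endpoint

        # Burning billed execution time doing nothing is exactly
        # why this approach is discouraged.
        time.sleep(delay)

        req = urllib.request.Request(
            url,
            data=json.dumps(event.get("payload", {})).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return {"status": resp.status}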
Moreover, you will hit AWS's concurrent execution limits and will need to keep calling AWS support to increase them.
Another approach to this problem could be to use an actor-based system such as Akka: create an actor for each job and let it do the needful.

AWS Lambda execution duration randomly spikes and causes time-outs

I'm building a serverless web-tracking system which serves its tracking pixel using AWS API Gateway, which in turn calls a Lambda function whenever a tracking request arrives; the function writes the tracking event into a Kinesis stream.
The Lambda function itself does not do anything fancy. It just takes the incoming event (its own argument) and writes it to the stream. Essentially, it's just:
    import boto3

    kinesis_client = boto3.client("kinesis")
    kinesis_stream = "my_stream_name"

    def return_tracking_pixel(event, context):
        ...
        # Build the Kinesis record from the incoming event (details elided).
        new_record = ...(event)
        kinesis_client.put_record(
            StreamName=kinesis_stream,
            Data=new_record,
            PartitionKey=...,
        )
        return ...
Sometimes I experience a weird spike in the Lambda execution duration that causes some of my Lambda function invocations to time out, and those tracking requests are lost.
This is the graph of 1-minute invocation counts of the Lambda function in the affected time period:
Between 20:50 and 23:10 I suddenly see many invocation errors (1-minute error counts):
which are obviously caused by the Lambda execution time-out (maximum duration in 1-minute intervals):
There is nothing weird going on with my Kinesis stream (data in, number of put records, put_record success count, etc. all look normal), nor with my API Gateway (the number of invocations corresponds to the number of API Gateway calls, well within the API Gateway limits).
What could be causing the sudden (and seemingly randomly occurring) spike in the Lambda function execution duration?
EDIT: the Lambda functions are not being throttled either, which was my first idea.
Just to add my 2 cents, because there's not much investigative work possible without extra logging or some X-Ray analysis.
AWS Lambda will sometimes forcibly recycle containers, which feels like a cold start even though your function is being reasonably exercised and kept warm. This can bring in all the cold-start-related issues, like extra delays for ENIs if your Lambda is attached to a VPC, and so on; but even for a simple function like yours, a 1-second timeout is sometimes too optimistic for a cold start.
I don't know of any documentation on those forced recycles, other than reports from people who have observed them:
"We see a forced recycle about 7 times a day." source
"It also appears that even once warmed, high concurrency functions get recycled much faster than those with just a few in memory." source
I wonder how you could confirm this is the case. Perhaps you could check whether the errors appearing in the CloudWatch log streams come from containers that had never appeared before.
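One way to check that, sketched below: tag each container with a random ID generated during the init phase and log it on every invocation. If the timed-out requests log IDs that were never seen before, they ran on freshly recycled containers. This is an illustrative pattern, not an AWS API:

    import time
    import uuid

    # Runs once per container, during the init phase.
    CONTAINER_ID = uuid.uuid4().hex
    CONTAINER_BORN = time.time()

    def handler(event, context):
        # A CONTAINER_ID appearing for the first time in the logs means a
        # brand-new container (a cold start, possibly a forced recycle).
        print(f"container={CONTAINER_ID} age={time.time() - CONTAINER_BORN:.0f}s")
        # ... normal request handling ...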

AWS Lambda: fastest way to process

Currently, I'm implementing a solution based on S3, Lambda and DynamoDB.
My use case is: when a new object is uploaded to S3, a first Lambda function is called; it downloads the new file, splits it into around 100 (or more) parts and, for each of them, adds additional information. Next, each part will be processed by a second Lambda function, and in some cases an insert will be performed in DynamoDB.
My question is only about the best way to call the "second Lambda", meaning the fastest way. I want to execute 100 Lambda functions (if I had 100 parts to process) at the same time.
I know there are different possibilities:
1) My first Lambda function can push each part as an item into a Kinesis stream, and my second Lambda function will react, retrieve an item and process it. In this case I don't know whether AWS will launch a new Lambda function each time there is an item remaining in the stream. Maybe there is some limitation...
2) My first Lambda function can push each part to an SNS topic, and my second Lambda will then react to each new message. In this case I have some doubts about the latency (the time between sending a message to the SNS topic and the second Lambda function being executed).
3) My first Lambda function can launch the second one directly by performing an API call and passing the information along. In this case I have no idea whether I can launch 100 Lambda functions at the same time. I think I'll be stopped by a rate limit on the AWS API (I said, I think!).
Does somebody have feedback and maybe advice regarding my use case? Once more, the most important thing for me is to have the fastest way to process.
Thanks
Lambda limits are in place to provide some sane defaults, but many workloads quickly exceed them. You can request an increase, so this will not be a bottleneck for your use case. This document describes the process:
http://docs.aws.amazon.com/lambda/latest/dg/limits.html
I'm not sure how much latency your use case can tolerate, but I often use SNS to fan out, and the latency to the next invocation is usually sub-second (unless it's Java and a cold start).
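A minimal sketch of that SNS fan-out, with the topic ARN and the splitting helper as hypothetical placeholders; the second Lambda would be subscribed to the topic:

    import json
    import boto3

    sns = boto3.client("sns")
    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:parts-topic"  # hypothetical

    def first_handler(event, context):
        # One message per part; SNS invokes the subscribed Lambda once per
        # message, which gives you the fan-out parallelism for free.
        for part in split_into_parts(event):  # hypothetical splitter
            sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps(part))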
If latency is extremely sensitive, then you'd probably want to invoke the Lambdas directly using Invoke with the InvocationType set to "Event". This minimizes blocking while you Invoke 100 times. You could also thread these Invoke calls within your main Lambda function to further increase parallelism if you want to hyper-optimize.
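Sketched in Python, the direct-invoke option with threaded calls might look like the following; the function name and the splitting helper are placeholders:

    import json
    import boto3
    from concurrent.futures import ThreadPoolExecutor

    lambda_client = boto3.client("lambda")

    def invoke_part(part):
        # InvocationType="Event" queues the invocation and returns
        # immediately (HTTP 202), so each call only blocks for the
        # API round trip, not for the second function's execution.
        lambda_client.invoke(
            FunctionName="second-function",  # hypothetical
            InvocationType="Event",
            Payload=json.dumps(part),
        )

    def first_handler(event, context):
        parts = split_into_parts(event)      # hypothetical splitter
        # Threading the Invoke calls overlaps the API round trips.
        with ThreadPoolExecutor(max_workers=20) as pool:
            list(pool.map(invoke_part, parts))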
Cold containers will occasionally cause latency in your invocations. If milliseconds count, this can become tricky. People who are trying to hyper-optimize Lambda processing times will sometimes schedule executions of their Lambda function with a "heartbeat" event that returns immediately (so the processing time is cheap). These containers will remain "warm" for a small period of time, which allows them to pick up your events without incurring the "cold startup" time. Java containers are much slower to spin up cold than Node containers (I assume Python is about as fast as Node, though I haven't tested).
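The heartbeat trick is usually just an early return in the handler, driven by a scheduled event; the "heartbeat" marker below is an arbitrary convention for the example, not anything Lambda defines:

    def handler(event, context):
        # A scheduled rule sends {"heartbeat": true} every few minutes;
        # returning immediately keeps this container warm almost for free.
        if isinstance(event, dict) and event.get("heartbeat"):
            return "warm"
        # ... real processing for genuine events ...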

Is it possible to detect that an AWS account is nearing the Lambda concurrency limit?

Lambda has concurrency limits that, when hit, cause subsequent invocations to be throttled.
This makes sense, but is it possible to detect this situation ahead of time and start applying backpressure?
The problem is that (according to the docs) the concurrency limit is per-account, which means a single runaway microservice can block ALL unrelated services.
For example: a Lambda function with an S3 event source could easily lead to API Gateway handlers being throttled and unhappy API users.
Is there any QoS for Lambda functions? It'd be great to be able to give public-facing functions priority. (I know the answer is no, but I wish there were.)
Short of that, is it possible to detect that you're nearing this concurrency limit and build backpressure in?
I'm not seeing anything, and the only solution I can think of at the moment is to create a metric that watches for throttles and, as soon as one happens, toggles some flag somewhere? That adds significant complexity, though...
Any ideas?
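Not a full answer, but the metric-watching idea from the question can lean on the built-in AWS/Lambda "Throttles" CloudWatch metric; polling it (or alarming on it) is one way to decide when to apply backpressure. A rough Python sketch, with the look-back window and the backpressure policy as placeholders:

    from datetime import datetime, timedelta
    import boto3

    cloudwatch = boto3.client("cloudwatch")

    def throttles_in_last_minutes(minutes=5):
        # Sum the account-wide AWS/Lambda "Throttles" metric (no dimensions
        # means the aggregate across all functions).
        now = datetime.utcnow()
        resp = cloudwatch.get_metric_statistics(
            Namespace="AWS/Lambda",
            MetricName="Throttles",
            StartTime=now - timedelta(minutes=minutes),
            EndTime=now,
            Period=60,
            Statistics=["Sum"],
        )
        return sum(dp["Sum"] for dp in resp["Datapoints"])

    def should_apply_backpressure():
        # Hypothetical policy: shed load as soon as any throttling shows up.
        return throttles_in_last_minutes() > 0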