AWS Lambda: The function is idle - amazon-web-services

I am reviewing the Lambdas in my project and noticed that one of them is idle. At the top of the dashboard page I see a block with this text:
The function __ is idle. To reactivate your function, choose Restore.
I'm slightly confused by it, because this function is very similar to others that aren't marked as idle but also haven't been invoked for a couple of months.
Since I haven't found answers in the AWS documentation, I'd appreciate it if somebody could explain the difference between functions in the idle state and those that aren't, and how/why a function becomes Idle.

It is related to VPC; please check this doc:
If your functions aren't active for a long period of time, Lambda
reclaims its network interfaces, and the functions become Idle. To
reactivate an idle function, invoke it. This invocation fails, and the
function enters a Pending state again until a network interface is
available.
https://docs.aws.amazon.com/lambda/latest/dg/configuration-vpc.html
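To check this from code rather than from the console, here is a minimal sketch. It assumes a boto3 Lambda client is passed in (for example `boto3.client("lambda")`), and it assumes that the console's "idle" corresponds to `State` = `Inactive` with `StateReasonCode` = `Idle` in the `get_function_configuration` response, which is worth verifying against your own account. The helper names are made up for illustration.

```python
def describe_function_state(lambda_client, function_name):
    """Return the state fields Lambda reports for a function.

    lambda_client is assumed to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    return {
        "state": config.get("State"),                # Pending / Active / Inactive / Failed
        "reason_code": config.get("StateReasonCode"),
        "reason": config.get("StateReason"),
    }


def is_idle(lambda_client, function_name):
    """True if the function is reported as Inactive (assumed to be the console's 'idle')."""
    return describe_function_state(lambda_client, function_name)["state"] == "Inactive"
```

Running `is_idle` over all functions in the account should show which ones had their network interfaces reclaimed and which are merely unused.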

Instead of a CloudWatch event, I would suggest using Provisioned Concurrency to keep your Lambda(s) warm.
https://aws.amazon.com/about-aws/whats-new/2019/12/aws-lambda-announces-provisioned-concurrency/
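A rough sketch of what enabling it could look like with boto3, assuming a Lambda client is passed in and the function has a published version or alias (Provisioned Concurrency cannot be set on `$LATEST`). The wrapper name and its arguments are hypothetical; only `put_provisioned_concurrency_config` is the actual API call.

```python
def enable_provisioned_concurrency(lambda_client, function_name, qualifier, instances):
    """Ask Lambda to keep `instances` execution environments initialized.

    lambda_client is assumed to be boto3.client("lambda"); qualifier must be
    a published version number or an alias name.
    """
    if qualifier == "$LATEST":
        raise ValueError("Provisioned Concurrency needs a version or alias, not $LATEST")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,
        ProvisionedConcurrentExecutions=instances,
    )
```

Note that provisioned environments are billed for as long as they are configured, so this trades cost for predictable latency.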

You will need to invoke your functions regularly using CloudWatch Events if you want them to stay alive and warm. Otherwise they will cool down, as @Traycho Ivanov says.
Set up CloudWatch Events to invoke the Lambdas you need alive every so often. How often is a bit debated: it's not clear how AWS manages this, and it could easily change in the future, so your event may turn out to be not frequent enough, or too frequent, costing you a couple more cents than you would otherwise want!
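If you go the scheduled-ping route, the handler can return early on the ping so the warm-up costs as little as possible. A minimal sketch, assuming the rule delivers the default scheduled-event payload (with `source` set to `aws.events` and `detail-type` set to `Scheduled Event`); `do_real_work` is a hypothetical placeholder for your actual logic.

```python
def is_warmup_event(event):
    """True for the default payload of a CloudWatch Events scheduled rule."""
    return (
        isinstance(event, dict)
        and event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def do_real_work(event):
    # Hypothetical placeholder for the function's real business logic.
    return {"processed": event}


def handler(event, context):
    if is_warmup_event(event):
        # Return immediately: the only goal was to keep this environment alive.
        return {"warmed": True}
    return do_real_work(event)
```

This keeps at most one environment warm per ping, so it does not help with bursts of concurrent requests.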

Related

What's Happening when an AWS Lambda Function Freezes

What's going on behind the scenes when an AWS lambda function freezes?
That is -- many of the Lambda Runtime Docs refer broadly to the concept of a function freezing or unfreezing
The runtime and each extension indicate completion by sending a Next API request. Lambda freezes the execution environment when the runtime and each extension have completed and there are no pending events.
My understanding of this is that after a Lambda function initializes (or "cold starts") and executes the first invocation request, if there are no other invocations to process the function's execution environment will "freeze". Then, when there's another function invocation to process, the function's execution environment will "unfreeze" almost instantly without needing to initialize/cold-start again. If a frozen function goes too long without being invoked it will shutdown, and the next invocation request will need to cold start.
Does anyone know what this freezing is? It's my understanding that these execution environments are Firecracker virtual machines. Is this freezing something that Firecracker supports, or is it some extra magic that Amazon Web Services has and keeps to itself? Put another way, if I have a Firecracker VM running, can I freeze and unfreeze it?
You can think of the freeze as AWS Lambda putting the instance to sleep after each execution. In other words, the instance freezes (similar to a laptop in hibernate mode). The virtual CPU is turned off. This frees up resources on the worker node. The overhead of waking up such a function is negligible.
To understand how Firecracker works under the hood, take a look at this AWS re:Invent 2019 video: AWS re:Invent 2019: Firecracker open-source innovation (OPN402)
Also, take a look at these posts:
Understanding Container Reuse in AWS Lambda
A look behind the scenes of AWS Lambda
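One observable consequence of freeze/unfreeze is that module-level state survives between invocations served by the same execution environment. The sketch below illustrates that with a counter; the handler and field names are made up for illustration, and it says nothing about how Firecracker implements the pause itself.

```python
import time

# Module-level code runs once per execution environment, during the cold start.
_ENV_STARTED_AT = time.time()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation was the first one in this environment."""
    global _invocation_count
    _invocation_count += 1
    return {
        "cold_start": _invocation_count == 1,
        "invocations_in_this_environment": _invocation_count,
        "environment_age_seconds": round(time.time() - _ENV_STARTED_AT, 3),
    }
```

After an unfreeze the counter keeps increasing; once the environment has been shut down, the next invocation starts again from 1.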

Managing SQS Queue Manually vs Lambda Trigger

I'm not sure if this would be better served on ServerFault or Software Engineering, willing to move this post if appropriate.
We have somewhat recently started to move some of our data processing pipeline to use queues to manage individual bits of data, whereas previously we had timed lambdas that would pull all data since last change.
While making this change, we noticed that queues didn't work quite as we had anticipated. First of all, we thought Lambda would just pull items off the queue as the Lambdas had availability. Instead, it seems the AWS-managed Lambda trigger grabs a chunk of messages (up to ten) and throws it at the Lambda service. If Lambda doesn't have availability, the message gets throttled, then replayed after a backoff time, up to our configured replay "error" limit (five). After that, it's thrown into our dead letter queue.
We see a handful of messages per day end up in the dead letter queue as a result of throttling. We then throw these back into the main queue (we have a process to do so every handful of hours). However, we weren't 100% sure throttling was the reason for things being pushed over, since nothing indicates why the messages are moved over - we just assumed as much because we weren't getting any error logs for those messages. We contacted Amazon support to ask about this, and they were able to confirm the messages were in fact "errored" as a result of throttling.
We asked further into their recommendations for this - this must be a common problem right? They first off suggested upping our replay limit, which seemed an obvious no go. Replays occur for any failure, so that would just hammer our lambdas with bad requests when they came through. Asked also if there's any way to differentiate the errors because we don't care for throttling, we'd happily let those retry a dozen times if needed - but no. The other suggestion they had was to manage the queue ourselves from our lambdas. Build our own code within our lambdas to pull messages and then delete them after processing. This seems really counter-intuitive, though - why would every AWS consumer build their own infrastructure?
So I guess my question is: is this what others are doing? Are you using the built-in Lambda triggers? Are you creating your own code for managing queue consumption? Do you see these sorts of throttling, or is there anything we could do differently? Is there any difference with other services to manage this?
Best practice is to handle errors in your code and manually delete messages that have succeeded. That allows you to handle poison messages without reprocessing the good messages again. Throttles shouldn't be ending up in a DLQ that often. This video from re:Invent 2020 has a good explanation of how this works: Scalable serverless event-driven architectures with SNS, SQS & Lambda. Start at about the 20-minute mark to get into SQS error handling.
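A minimal sketch of that pattern, assuming the function is still invoked by the SQS trigger, a boto3 SQS client is passed in, and the queue URL is known to the function (it is not part of the event, so `queue_url` is an assumption here). `process_batch` and `process` are hypothetical names.

```python
def process_batch(records, sqs_client, queue_url, process):
    """Process SQS records one by one and delete each success immediately.

    records: the "Records" list of an SQS-triggered Lambda event.
    sqs_client: assumed to be boto3.client("sqs").
    process: your own function; it should raise on a bad message.
    """
    failed_ids = []
    for record in records:
        try:
            process(record["body"])
        except Exception:
            # Leave the message in the queue so only this one is retried.
            failed_ids.append(record["messageId"])
            continue
        sqs_client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=record["receiptHandle"],
        )
    if failed_ids:
        # Failing the invocation stops Lambda from deleting the whole batch;
        # the already-deleted messages will not be delivered again.
        raise RuntimeError(f"{len(failed_ids)} message(s) failed: {failed_ids}")
    return len(records)
```

With this approach a poison message only consumes its own retries, and the good messages in the same batch are not replayed.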

Lambdas calls speed changes

I've created a simple lambda that reads data from dynamodb.
First time I call the lambda it takes about 1500ms to complete, but then after I run the lambda again it takes about 150ms. How is it possible?
What type of caching does AWS perform to achieve this?
AWS Lambda provisions infrastructure on your first call, and that takes time. AWS also needs to start a JVM with your code before it can call the function. Starting the JVM takes time and thus incurs some overhead.
Another factor is the cold start, which happens when there is no idle container available waiting to run the code. This is all invisible to the user, and AWS has full control over when to kill containers.
All of the above steps are involved in the first call, which is why you see 1500 ms.
On the next call everything is already in place, so Lambda gives you a response in 150 ms or less.
This is by design in serverless to save infrastructure cost: infrastructure is only provisioned when it is needed, on the first call.
I would suggest reading the documentation:
- https://aws.amazon.com/lambda/
This happens due to a cold start. It occurs mainly when we invoke the Lambda for the first time after deployment, or when a Lambda function has been idle for some time.
These articles explain how language, memory, or package size affects the cold start:
https://read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76
https://mikhail.io/serverless/coldstarts/aws/
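The 1500 ms versus 150 ms gap from the question can be reproduced in miniature without AWS. In the sketch below, `_connect` is a made-up stand-in that sleeps to imitate one-time setup cost (runtime start, SDK import, opening a connection), and the result is cached at module level the way a real function would cache its DynamoDB client.

```python
import time

_table = None  # cached across invocations served by the same environment


def _connect():
    # Stand-in for expensive one-time setup; the 0.2 s delay is arbitrary.
    time.sleep(0.2)
    return {"widgets": [1, 2, 3]}


def get_table():
    """Create the (fake) table handle once and reuse it afterwards."""
    global _table
    if _table is None:
        _table = _connect()
    return _table


def handler(event, context):
    started = time.perf_counter()
    items = get_table()["widgets"]
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"items": items, "elapsed_ms": elapsed_ms}
```

The first call pays for the setup and later calls reuse it, so no response caching is involved, only reuse of an already initialized environment.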

AWS Lambda async code execution

I've scoured for an answer, but everything I've read is about concurrent Lambda executions and async keyword syntax in Node; I can't find information about Lambda instance execution.
The genesis of this was that I was at a meetup and someone mentioned that Lambda instances (i.e. an ephemeral container hosted by AWS containing my code) can only execute one request at a time. This means that if I had 5 requests come in (for the sake of simplicity, let's say to an already warm instance), they would all run in separate instances, i.e. in 5 separate containers.
The bananas thing to me is that this undermines years of development in async programming. Starting back in 2009, node.js popularized programming with I/O in mind, given that for a boring run-of-the-mill CRUD app most of your request time is spent waiting on external DB calls or something. Writing async code allowed a single thread of execution to seemingly execute many simultaneous requests. While node didn't invent it, I think it's fair to say it popularized it and has been a massive driver of backend technology development over the last decade. Many languages have added features to make async programming easier (callbacks/tasks/promises/futures or whatever you want to call them), and web servers have shifted to event-loop-based models (node, vertx, kestrel etc.), away from the single-thread-per-request models of yesteryear.
Anyway, enough with the history lesson. My point is that if what I heard is true, then developing with Lambdas throws most of that out the window. If the Lambda runtime will never send multiple requests through my running instance, then programming in an async style will just waste resources. Say, for example, I'm using C# and my Lambda is for retrieving widgets. Then this code var response = await db.GetWidgets() is actually inefficient, because it pushes the current thread context onto the stack so it can allow other code to execute while it waits for that call to come back. Since no other request will be invoked until the original one completes, it makes more sense to program in a synchronous style, save for places where parallel calls can be made.
Is this correct?
If so, I'm honestly shocked it's not discussed more. Async programming has been a paradigm shift I've seen in the last few years, and this totally changes that.
TL;DR: does Lambda really only allow one request execution at a time per instance? If so, this upends a major shift in server development towards asynchronous code.
Yes, you are correct - Lambda will spin up multiple containers to handle concurrent requests even if your Lambda does some work asynchronously (I have confirmed this through my own experience, and many other people have observed the same behavior - see this answer to a similar question). This is true for every supported runtime (C#, Node.js, etc).
This means that async code in your Lambda functions won't allow one Lambda container to handle multiple requests at once, as you stated. That being said, you still get all the other benefits of async code and you could still potentially improve your Lambda's performance by, for example, making many web service or database calls at once asynchronously - so this property of Lambda does not make async programming useless on the platform.
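To illustrate the second point, here is a small sketch in which a single invocation fans out several I/O calls at once. `fetch_widget` is a hypothetical stand-in that sleeps instead of calling a real database, and the event shape is made up for the example.

```python
import asyncio
import time


async def fetch_widget(widget_id, delay=0.1):
    # Stand-in for a database or web service call.
    await asyncio.sleep(delay)
    return {"id": widget_id}


async def fetch_all(widget_ids):
    # All calls are in flight at the same time; results keep the input order.
    return await asyncio.gather(*(fetch_widget(w) for w in widget_ids))


def handler(event, context):
    started = time.perf_counter()
    widgets = asyncio.run(fetch_all(event["widget_ids"]))
    return {"widgets": widgets, "elapsed_s": time.perf_counter() - started}
```

Three calls of 0.1 s each finish in roughly 0.1 s instead of 0.3 s, so the single request gets faster even though the container still serves only one request at a time.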
Your question is :
Since no other request will be invoked until the original one completes it makes more sense to program in a synchronous style save for places where parallel calls can be made.
No, because you no longer have to wait for the answer as you would with a sync process. Your trigger itself must die after the call so that it frees memory. Either the Lambda sends a notification or triggers a new service once it has completed, or a watcher looks at the result value (it is possible to wait for the answer with a sync Lambda, but it is not accurate due to the underlying async process beneath the Lambda system itself). As an Android developer, you can compare that to intents and broadcasts, which are completely async.
It is a completely different way to design a solution, because the async mechanism must be managed on the workflow layer itself and no longer in the core of the app. The solution becomes an aggregation of notifiers/watchers that trigger micro-services; it is no longer a single binary of thousands of lines of code.
Each Lambda function must be an individual micro-service.
Coming back to handling heavy traffic: you can run millions of Lambdas in parallel, and as long as your micro-service finishes quickly, it won't cost much.
To ensure that your workflow is not dropping anything, you can add SQS (queue messaging) in the solution.
Further to the above answer, please see here. From what I understand, it's a synchronous loop. So, the only way to make things async from a request-handling perspective is to delegate the work to a message queue, e.g. SQS, as written here. I think this is similar to how Celery is used to make Django asynchronous. Lastly, if you truly want async handling of requests in line with async/await in node.js/python/c#/c++, you may need to use AWS Fargate / EC2 instead of Lambda. Otherwise, in Lambda, as you have mentioned yourself, it's bananas indeed. On the other hand, for heavy traffic, for which async/await shows its benefits, Lambda is not a good fit. There is a break-even analysis here about the three services: EC2, Lambda and Fargate.

Is it possible to detect an AWS account is nearing the Lambda concurrency limit?

Lambda has some concurrency limits that when hit, cause subsequent invocations to get throttled.
This makes sense, but is it possible to detect this situation ahead of time and start applying backpressure?
The problem is that (according to the docs) the concurrency limit is per-account, which means a single runaway microservice can block ALL unrelated services.
For example: a lambda fn with an s3 event source could easily lead to API Gateway handlers being throttled and unhappy API users.
Is there any QoS for lambda functions? It'd be great to be able to give public-facing functions priority. (I know the answer is no, but I wish there were.)
Short of that, is it possible to detect that you're nearing this concurrency limit and build backpressure in?
I'm not seeing anything, and the only solution I can think of at this moment is to create a metric that watches for Throttles and as soon as one happens, toggle some flag somewhere? This adds significant complexity though...
Any ideas?
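One possible direction, offered as a sketch rather than a known-good answer: compare the account-level `ConcurrentExecutions` metric in CloudWatch with the account limit from `get_account_settings`, and start shedding load before throttling begins. Both clients are assumed to be boto3 clients passed in by the caller; the function names, the 5-minute lookback and the 0.8 threshold are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def concurrency_utilization(lambda_client, cloudwatch_client, lookback_minutes=5):
    """Return recent peak concurrency as a fraction of the account limit.

    lambda_client / cloudwatch_client are assumed to be boto3.client("lambda")
    and boto3.client("cloudwatch").
    """
    limit = lambda_client.get_account_settings()["AccountLimit"]["ConcurrentExecutions"]
    end = datetime.now(timezone.utc)
    stats = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="ConcurrentExecutions",  # no dimensions: account-wide value
        StartTime=end - timedelta(minutes=lookback_minutes),
        EndTime=end,
        Period=60,
        Statistics=["Maximum"],
    )
    peak = max((point["Maximum"] for point in stats["Datapoints"]), default=0)
    return peak / limit


def should_apply_backpressure(utilization, threshold=0.8):
    """Decide whether callers should slow down (threshold is an assumption)."""
    return utilization >= threshold
```

CloudWatch metrics arrive with some delay, so a check like this can lag behind a sudden spike; it reduces the complexity of the flag-toggling idea but does not remove the race.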