From the docs:
When your function is invoked, Lambda attempts to re-use the execution
environment from a previous invocation if one is available.
When a Lambda function is invoked, an instance of the configured runtime is launched.
Once execution completes, after how much time is that instance of the Lambda function destroyed?
Suppose:
at t0 = 0 s, the Lambda is invoked and starts initializing its runtime; let's say it takes 2 s to launch the runtime instance (e.g. Node.js),
at t1 = 2 s, function execution starts,
at t2 = 3 s, function execution completes,
and at some time t the Node.js runtime instance is destroyed on the AWS backend.
Is there any way to know the time t? Can it be computed, or is it predefined?
AWS Lambda functions run in ephemeral containers, and all those mechanisms are implemented and maintained by AWS itself.
As a developer you are not supposed to worry about the environment. You only need to keep in mind that all functions in your app should be stateless and should not be designed to share any data via the container itself.
AWS automatically scales the number of available containers to provide optimal performance and minimize cold starts during spikes of load on your service.
There is no exact time to live for those containers.
You can read more about reusing in Understanding Container Reuse in AWS Lambda | AWS Compute Blog.
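Container reuse can actually be observed from inside a function: module-level state survives warm invocations but is re-initialized on a cold start. A minimal sketch (the handler name, event shape, and return fields are illustrative, not an AWS-defined contract):

```python
import time

# Module-level code runs once per execution environment (the "cold start"),
# which is why expensive setup such as SDK clients belongs here,
# not inside the handler.
COLD_START_AT = time.time()
invocation_count = 0

def handler(event, context):
    global invocation_count
    invocation_count += 1
    # On a warm start, invocation_count > 1 and COLD_START_AT is unchanged,
    # proving the same environment was reused.
    return {
        "cold_start": invocation_count == 1,
        "invocations_in_this_environment": invocation_count,
    }
```

Invoking such a function twice in quick succession will typically show the counter climbing, confirming that the environment was reused rather than destroyed immediately after the first run.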
Related
I understand that Lambda is serverless and that it creates an Execution Environment (a MicroVM) on event invocation.
So, when an event is invoked, Lambda will spin up an execution environment that has the selected programming-language runtime inside it.
So far, it is clear that these Execution Environments (MicroVMs) are created on demand, and terminated if found idle for long.
Now to the original question.
My understanding is that Lambda has a Runtime API. So, whenever we create a Lambda resource in AWS, it can be accessed via the Lambda Runtime API, and these API endpoints are invoked by event sources such as SQS, SNS, etc.
My question is: is there any compute that runs all the time just to host these Lambda Runtime APIs? And if there is, why is there not much detail about it, and why are we not charged for it?
Please correct my understanding here.
In a very simplified explanation, Lambda should be considered as a service with two components:
Data Plane: EC2 instances where the functions are executed.
Control Plane: Service that contains all the metadata related to each Lambda deployed, including event mapping.
When an event occurs, it will be processed by the control plane. The control plane will validate the security and check if there is an available copy of the function already instantiated.
If one is available, it will forward the event to the Lambda and pass instructions to send back the result. If there is no function available, the control plane will download the function code together with its runtime, instantiate a new function in the data plane, and forward the event.
At all times, there will be control plane and data plane machines online. The AWS Lambda service will increase or decrease the number of each based on usage.
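The routing decision described above can be sketched as a toy model. To be clear, this is only an illustration of the idea, not AWS's actual implementation:

```python
class LambdaServiceModel:
    """Toy model of the control plane's routing decision."""

    def __init__(self):
        self.warm = []        # idle, already-initialized environments (data plane)
        self.cold_starts = 0  # how many environments we had to create

    def invoke(self, event):
        if self.warm:
            env = self.warm.pop()     # reuse an existing environment (warm start)
        else:
            env = self._cold_start()  # fetch code + runtime, initialize (cold start)
        result = env(event)
        self.warm.append(env)         # keep the environment around for the next event
        return result

    def _cold_start(self):
        # A real cold start downloads the function package and boots the runtime;
        # here it just builds a callable standing in for the function.
        self.cold_starts += 1
        return lambda event: f"processed {event}"
```

With sequential events, only the first one pays the cold-start cost; every later event finds a warm environment in the pool.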
I am aware of the cold-start and warm-start in AWS Lambda.
However, I am not sure during the warm-start if the Lambda architecture reuses the Firecracker VM in the backend? Or does it do the invocation in a fresh new VM?
Is there a way to enforce VM level isolation for every invocation through some other AWS solution?
Based on what's stated in the documentation for the Lambda execution context, Lambda tries to reuse the execution context between subsequent executions; this is what leads to cold starts (when the context is spun up) and warm starts (when an existing context is reused).
You typically see this latency when a Lambda function is invoked for the first time or after it has been updated because AWS Lambda tries to reuse the execution context for subsequent invocations of the Lambda function.
This is corroborated by another statement in the documentation for the Lambda Runtime Environment where it's stated that:
When a Lambda function is invoked, the data plane allocates an execution environment to that function, or chooses an existing execution environment that has already been set up for that function, then runs the function code in that environment.
A later passage of the same page gives a bit more info on how environments/resources are shared among functions and executions in the same AWS Account:
Execution environments run on hardware virtualized virtual machines (microVMs). A microVM is dedicated to an AWS account, but can be reused by execution environments across functions within an account. [...] Execution environments are never shared across functions, and microVMs are never shared across AWS accounts.
Additionally, there's also another doc page that gives some more details on isolation among environments but again, no mention to the ability to enforce 1 execution per environment.
As far as I know there's no way to make it so that a new execution will use a new environment rather than an existing one. AWS doesn't provide much insight in this but the wording around the topic seems to suggest that most people actually try to do the opposite of what you're looking for:
When you write your Lambda function code, do not assume that AWS Lambda automatically reuses the execution context for subsequent function invocations. Other factors may dictate a need for AWS Lambda to create a new execution context, which can lead to unexpected results, such as database connection failures.
I would say that if your concern is isolation from other customers/accounts, AWS guarantees isolation by means of virtualisation which, although not at the physical level, might be enough depending on their SLAs and your SLAs/requirements. If instead you're thinking of building some kind of multi-tenant infrastructure that requires Lambda executions to be isolated from one another, then this service might not be what you're looking for.
For how long does the AWS Lambda /tmp directory store files, or is it cleaned after each run of the function?
If a Lambda is invoked from cold, a new container (or sometimes, in my own experience, several containers) is created/provisioned.
If you write something to the temp folder and call that Lambda again soon after it has finished running, you will likely find what you wrote to the temp folder is still there.
But if you called that Lambda while it is still running from a previous call, you will be redirected to a different container, and therefore the temp drive will be empty.
AWS won't divulge how long a container is kept for, and I believe they change it fairly often. A container is likely to stay active longer if kept warm (called regularly), but AWS will occasionally flush all the containers regardless. Again, AWS won't give a time frame for when this happens, as they change it to suit themselves. I also think it is likely dependent on how busy a given region is at a particular time.
In general, AWS recommends that you don't rely on a container being the same container that you called on the last invocation. Lambdas are designed for stateless processes. If you need to maintain state, use external storage like S3 or whatever suits your needs best.
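The /tmp behaviour can be checked with a marker file: if a file written by a previous invocation is still present, the same container was reused. A sketch, using `tempfile.gettempdir()` so it also runs locally (on Lambda this resolves to `/tmp`; the file name is made up for the example):

```python
import os
import tempfile

# On AWS Lambda, tempfile.gettempdir() returns "/tmp".
MARKER = os.path.join(tempfile.gettempdir(), "lambda_reuse_marker")

def handler(event, context):
    # The marker survives only if this container handled an earlier invocation.
    reused = os.path.exists(MARKER)
    if not reused:
        with open(MARKER, "w") as f:
            f.write("warm")
    return {"container_reused": reused}
```

On a cold start the function reports `container_reused: False` and drops the marker; a warm invocation in the same container then reports `True`.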
From AWS Lambda Execution Context - AWS Lambda:
After a Lambda function is executed, AWS Lambda maintains the execution context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the execution context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again.
...
When you write your Lambda function code, do not assume that AWS Lambda automatically reuses the execution context for subsequent function invocations. Other factors may dictate a need for AWS Lambda to create a new execution context, which can lead to unexpected results, such as database connection failures.
So, it isn't necessarily for a certain time. It mostly depends on resource availability.
I'm looking for a specific piece of documentation about the scaling of AWS Lambda.
How I think the scaling works:
Scenario: high traffic
AWS spins up multiple instances of the same Lambda Function
AWS distributes the events (probably evenly) among the instances
So what am I looking for specifically?
Is there a document where AWS states how Lambda works internally, or any information that concerns the process I described above? (I need something to quote.)
Thank you.
Officially, none of the implementation details of how AWS Lambda operates should impact your usage of the service. All you need to know is that something triggers a Lambda function, it runs and exits.
There is a limit on the number of simultaneous functions that can run (but you can ask for an increase in this limit). There is no guarantee that the functions run in a specific order.
The reality, however, is that Lambda functions are deployed as containers and those containers might be reused. For example, if you have a function that runs once per second for 200ms, it is quite likely that the container will be reused. The benefit of this is that there is no initialization time for the container if it is reused. This is particularly beneficial for Java functions that require creation of a JVM.
It also means that your function should assume that the environment will be reused: it should clean up temporary files and reset global variables.
For more details, see: Understanding Container Reuse in AWS Lambda | AWS Compute Blog
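The clean-up recommendation translates to defensive code like the following sketch, which resets per-invocation state at the top of the handler instead of trusting a fresh environment (the cache name and `scratch-*` file pattern are invented for the example):

```python
import glob
import os
import tempfile

cache = {}  # module-level state survives warm starts, so treat it with care

def handler(event, context):
    # Never trust state left behind by a previous invocation.
    cache.clear()
    # Remove scratch files an earlier run in this container may have written.
    for path in glob.glob(os.path.join(tempfile.gettempdir(), "scratch-*")):
        os.remove(path)

    # ... do the real work, using fresh state only ...
    cache["last_event"] = event
    scratch = os.path.join(tempfile.gettempdir(), "scratch-result.txt")
    with open(scratch, "w") as f:
        f.write(str(event))
    return {"ok": True}
```

The point is that globals and /tmp are an optimization (e.g. caching a parsed config between warm invocations), never a correctness dependency.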
I understand that AWS Lambda is supposed to abstract the developer from the infrastructure. However I don't quite understand how scaling would work.
Does it automatically start new containers during high traffic?
AWS Lambda functions can be triggered by many different event sources.
AWS Lambda runs each Lambda function as a standalone process in its own environment. There is a default limit of 1,000 concurrent Lambda function executions.
There is no need to think of Lambda "scaling". Rather, whenever an event source (or your own application) runs a Lambda function, the environment is created, the function is run, and the environment is torn down. When there is nothing that is invoking a Lambda function, it is not running. When 1000 invocations happen, then 1000 Lambda functions run.
It "scales" automatically by running in parallel on AWS infrastructure. You only pay while a function is running, per 100ms. It is the job of AWS to ensure that their back-end infrastructure scales to support the number of Lambda functions being run by all customers in aggregate.
If you want to change the number of desired instances in an Auto Scaling Group, you can use botocore.session:
import botocore.session

# Create a session first; the original snippet used `session` without defining it.
session = botocore.session.get_session()
client = session.create_client('autoscaling')
client.set_desired_capacity(
    AutoScalingGroupName='NAME',  # your Auto Scaling Group name
    DesiredCapacity=X,            # target number of instances
    HonorCooldown=True            # set to False to ignore cooldown periods
)
https://docs.aws.amazon.com/cli/latest/reference/autoscaling/set-desired-capacity.html