Does AWS Lambda reset memory on each invocation? - amazon-web-services

I am currently using AWS Lambda in conjunction with Java 8. As far as I can see AWS Lambda is considered an isolated unit of work, i.e. each process can be done in isolation. In this case, upon a container being reused, would JVM or the Lambda container itself have persisted memory that the next execution could use? Does the Lambda memory reset at each invocation? I can't seem to find the answer to this in the docs.

No, Lambda does not wipe memory between invocations that reuse the same execution environment. If your requirements mandate it, implement your own memory-wiping step before your function returns.
Prior to a function’s first invocation, Lambda scrubs the memory before assigning it to an execution environment. However, Lambda does not scrub memory between subsequent invocations on the same execution environment for the same function to facilitate execution environment reuse.
You can read further details in Security Overview of AWS Lambda whitepaper.
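As a rough illustration of a "memory wiping process", here is a minimal Java 8 sketch. The class and method names (`SecretScrubber`, `useSecret`) are hypothetical and stand in for a real handler. It assumes the sensitive value is held in a `byte[]` rather than a `String`, because a `String` cannot be overwritten in place. This is best effort only: the JVM may still hold copies elsewhere (for example after garbage-collector moves).

```java
import java.util.Arrays;

// Hypothetical helper, not part of any AWS API.
class SecretScrubber {

    // Stand-in for a handler body: uses the secret, then overwrites it.
    // Returns the secret length only to show that the secret was used.
    static int useSecret(byte[] secret) {
        try {
            return secret.length; // real work with the secret would go here
        } finally {
            // Overwrite the buffer before the invocation ends, so a later
            // invocation on the same execution environment cannot read it.
            Arrays.fill(secret, (byte) 0);
        }
    }
}
```

The `finally` block runs whether the work succeeds or throws, which is why the overwrite is placed there rather than at the end of the method body.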

An execution environment can still hold cached data from a previous invocation, but this is not guaranteed, and AWS doesn't publish any TTL for when that cached state is discarded. This is useful if you need the same data over and over across invocations. However, if you specifically need a fresh start for each invocation, flush caches and reset variables as the first step of your function.
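To make the behaviour concrete, here is a small Java 8 sketch of static state surviving across invocations on a reused environment. `WarmStateDemo`, `handle`, and the `fresh` flag are hypothetical names; `handle` stands in for a real `handleRequest` method, and calling it several times in one JVM simulates warm invocations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Lambda handler class.
class WarmStateDemo {
    // Static fields live as long as the JVM, i.e. as long as the
    // execution environment is reused.
    private static final Map<String, String> CACHE = new HashMap<>();
    private static int invocationCount = 0;

    // 'fresh' is an assumed flag meaning "this invocation must not see old state".
    static String handle(String key, boolean fresh) {
        invocationCount++;
        if (fresh) {
            CACHE.clear(); // reset cached state as the first step
        }
        // Computes the value only if the key is not already cached.
        return CACHE.computeIfAbsent(key, k -> "computed-in-invocation-" + invocationCount);
    }
}
```

With `fresh` set to `false`, the second call for the same key returns the value computed by the first call. With `fresh` set to `true`, the value is recomputed.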

Related

Does AWS Lambda run every invocation in a separate Firecracker VM?

I am aware of the cold-start and warm-start in AWS Lambda.
However, I am not sure whether, during a warm start, the Lambda architecture reuses the Firecracker VM in the backend, or whether it runs the invocation in a fresh VM.
Is there a way to enforce VM level isolation for every invocation through some other AWS solution?
Based on what is stated in the documentation for the Lambda execution context, Lambda tries to reuse the execution context between subsequent executions. This is what leads to a cold start (when the context is spun up) and a warm start (when an existing context is reused).
You typically see this latency when a Lambda function is invoked for the first time or after it has been updated because AWS Lambda tries to reuse the execution context for subsequent invocations of the Lambda function.
This is corroborated by another statement in the documentation for the Lambda Runtime Environment where it's stated that:
When a Lambda function is invoked, the data plane allocates an execution environment to that function, or chooses an existing execution environment that has already been set up for that function, then runs the function code in that environment.
A later passage of the same page gives a bit more info on how environments/resources are shared among functions and executions in the same AWS Account:
Execution environments run on hardware virtualized virtual machines (microVMs). A microVM is dedicated to an AWS account, but can be reused by execution environments across functions within an account. [...] Execution environments are never shared across functions, and microVMs are never shared across AWS accounts.
Additionally, there's another doc page that gives some more details on isolation among environments, but again there is no mention of the ability to enforce one execution per environment.
As far as I know there's no way to make it so that a new execution will use a new environment rather than an existing one. AWS doesn't provide much insight in this but the wording around the topic seems to suggest that most people actually try to do the opposite of what you're looking for:
When you write your Lambda function code, do not assume that AWS Lambda automatically reuses the execution context for subsequent function invocations. Other factors may dictate a need for AWS Lambda to create a new execution context, which can lead to unexpected results, such as database connection failures.
I would say that if your concern is isolation from other customers/accounts, AWS guarantees isolation by means of virtualisation. Although this is not isolation at the physical level, it might be enough depending on their SLAs and your SLAs/requirements. If instead you're thinking of building some kind of multi-tenant infrastructure that requires Lambda executions to be isolated from one another, then this component might not be what you're looking for.

Is AWS lambda function doing some caching when it retrieves an AWS secret

I noticed that when I updated the secret, it took some time before the Lambda was able to retrieve the updated secret value. I wonder if there is some caching happening during Lambda invocation.
The only builtin caching I'm aware of in lambda function is the execution context reuse, which is documented here.
Take advantage of execution context reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler, and cache static assets locally in the /tmp directory. Subsequent invocations processed by the same instance of your function can reuse these resources. This saves execution time and cost.
To answer your question: if you fetch the secret outside the function handler, the value is kept in the execution context and reused by later invocations, so an updated secret only shows up once a new execution context is created.
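One way to bound how stale a cached secret can get is a small time-to-live cache held in a static field. The sketch below is an assumption-laden illustration, not an AWS API: `TtlSecretCache` is a hypothetical class, the `loader` would wrap whatever call you use to read the secret (for example a Secrets Manager client), and the `clock` parameter exists only so the behaviour can be exercised without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper: caches a secret and reloads it after ttlMillis.
class TtlSecretCache {
    private final Supplier<String> loader; // assumed to fetch the current secret value
    private final long ttlMillis;
    private final LongSupplier clock;      // e.g. System::currentTimeMillis
    private String value;
    private long loadedAt;
    private boolean loaded = false;

    TtlSecretCache(Supplier<String> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading it on first use or once the TTL has elapsed.
    synchronized String get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

Kept in a static field and called from inside the handler, such a cache still avoids a fetch on most warm invocations, while an updated secret is picked up after at most one TTL period.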

Is /tmp directory of lambda persistent for a while, or clears it after every run?

How long does the AWS Lambda /tmp directory keep files, or is it cleaned after each run of the function?
If a Lambda is invoked from cold, a new container (or sometimes containers from my own experience) is created/provisioned.
If you write something to the temp folder and call that Lambda again soon after it has finished running, you will likely find what you wrote to the temp folder is still there.
But if you called that Lambda while it is still running from a previous call, you will be redirected to a different container, and therefore the temp drive will be empty.
AWS won't divulge how long a container is kept for, and I believe they change it fairly often. A container is likely to stay active longer if kept warm (called regularly). But AWS will occasionally flush all the containers regardless. Again, AWS won't give a time frame for when this happens, as they will change it to suit themselves. I also think it likely depends on how busy a given region is at a particular time.
In general AWS recommend that you don't rely on a container being the same container that you called on the last invocation. Lambdas are designed for stateless processes. If you need to maintain a state, use external storage like S3 or whatever suits your needs best.
From AWS Lambda Execution Context - AWS Lambda:
After a Lambda function is executed, AWS Lambda maintains the execution context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the execution context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again.
...
When you write your Lambda function code, do not assume that AWS Lambda automatically reuses the execution context for subsequent function invocations. Other factors may dictate a need for AWS Lambda to create a new execution context, which can lead to unexpected results, such as database connection failures.
So, it isn't necessarily for a certain time. It mostly depends on resource availability.
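If you want to observe this yourself, a marker file is a simple probe. The following Java 8 sketch uses hypothetical names (`TmpMarker`, `sawPreviousInvocation`, `warm-marker.txt`); in a real function you would pass `Paths.get("/tmp")` as the directory. It only reports what it finds and makes no assumption about how long the environment is kept.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical probe for execution environment reuse.
class TmpMarker {

    // Returns true if a marker left by an earlier invocation is present
    // (the environment was reused); otherwise writes the marker and returns false.
    static boolean sawPreviousInvocation(Path dir) throws IOException {
        Path marker = dir.resolve("warm-marker.txt");
        if (Files.exists(marker)) {
            return true;
        }
        Files.write(marker, "written by an earlier invocation".getBytes(StandardCharsets.UTF_8));
        return false;
    }
}
```

Logging the return value on each invocation shows `false` whenever a new environment was created and `true` whenever an existing one was reused.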

AWS Lambda, Scaling, Implementation

I'm looking for a specific piece of documentation about the scaling of AWS Lambda.
How I think the scaling works:
Scenario: high traffic
AWS spins up multiple instances of the same Lambda Function
AWS distributes the events (probably evenly) among the instances
So what am I looking for specifically?
Is there a document where AWS states how lambda works internally or any information that concerns the process I described above (I need something to quote).
Thank you.
Officially, none of the implementation details of how AWS Lambda operates should impact your usage of the service. All you need to know is that something triggers a Lambda function, it runs and exits.
There is a limit on the number of simultaneous functions that can run (but you can ask for an increase in this limit). There is no guarantee that the functions run in a specific order.
The reality, however, is that Lambda functions are deployed as containers and those containers might be reused. For example, if you have a function that runs once per second for 200ms, it is quite likely that the container will be reused. The benefit of this is that there is no initialization time for the container if it is reused. This is particularly beneficial for Java functions that require creation of a JVM.
It also means that your function should assume that the environment will be reused: it should clean up temporary files and reset global variables.
For more details, see: Understanding Container Reuse in AWS Lambda | AWS Compute Blog
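A minimal Java 8 sketch of the "clean up temporary files" part follows. `ScratchCleaner` and `clear` are hypothetical names. The sketch assumes the function owns everything under the directory it is given (in Lambda, typically `/tmp` or a subdirectory of it) and that deleting all of it at the start of an invocation is acceptable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper that empties a scratch directory.
class ScratchCleaner {

    // Deletes everything under dir but keeps dir itself.
    // Returns the number of files and directories removed.
    static int clear(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories,
            // so each directory is already empty when it is deleted.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        int removed = 0;
        for (Path p : paths) {
            if (!p.equals(dir)) {
                Files.delete(p);
                removed++;
            }
        }
        return removed;
    }
}
```

Calling this as the first step of the handler means leftovers from an earlier invocation never reach the current one, whether or not the environment was reused.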

AWS Lambda: Is it secure to store data on AWS Lambda local Disk?

I have following basic security related questions regarding AWS Lambda service:
Where does AWS Lambda store data if for example I try to store data on local disk?
Is it possible to encrypt the data on Lambda?
Thanks
One important side note about the /tmp of Lambda functions is that the function containers are reused and the scratch space is not always erased. If an invocation uses a container that was spun up for a previous invocation (this happens if you execute a Lambda function several times in quick succession), the scratch space is shared.
This screwed up a functionality for me once.
I store temporary data in my lambda function, never had any issue.
Store your data in /tmp; you may not have access to other directories.
Treat the temporary data, as the name indicates, as available only for that invocation of the Lambda.
If the data is sensitive, encrypt it (if the encryption libraries are not provided by default for that language, make sure you package the library)
Files stored on Lambda's local volumes should be for temporary short-term storage only and should not be expected to persist beyond the lifetime of your single Lambda function invocation.
If you need to store data long-term, use a database like DynamoDB or use Amazon S3.
If you must store data on the local volume, you can encrypt it, but you must do it yourself. Also, note that the next time the function is called, the data most likely will be gone.
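As a sketch of what "do it yourself" could look like in Java 8, the following uses the standard `javax.crypto` API with AES-GCM. `ScratchCrypto` and its method names are hypothetical. Key handling is an assumption here: the sketch generates a key in memory, whereas a real function would more likely obtain one from a key management service. The returned byte array (IV followed by ciphertext and tag) is what you would write to the local volume.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper for encrypting scratch data before writing it to disk.
class ScratchCrypto {
    private static final int IV_BYTES = 12;  // commonly used IV size for GCM
    private static final int TAG_BITS = 128; // authentication tag length

    // Generates an in-memory AES key (assumption: key source is up to you).
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    // Returns IV || ciphertext-with-tag, using a fresh random IV per call.
    static byte[] encrypt(SecretKey key, byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = c.doFinal(plain);
        return ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
    }

    // Reads the IV from the front of the blob and decrypts the rest.
    static byte[] decrypt(SecretKey key, byte[] blob) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
        return c.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
    }
}
```

Because GCM authenticates the data, `decrypt` throws if the stored bytes were modified, which is a useful property when the scratch space may be shared with later invocations.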
If by "secure" you are asking who will have access to the data, then the answer is anyone who can call the Lambda. If by "secure" you are also wondering whether it is durable storage, then the answer is no. Lambda functions only have access to an ephemeral /tmp folder. There is no guarantee that two consecutive calls to the same Lambda function will be executed on the same physical machine. However, if the function is called twice within a short period of time, it could be executed on the very same machine, and then a file that was saved by the first call may be available to the second call. If you choose to use this temporary file storage, you should also be aware that there are limits on how much data can be stored.
Lambda stores data in the function's /tmp folder.
It is not safe to rely on data stored on Lambda.
The reason is that data in the /tmp folder is not guaranteed to survive once the Lambda function completes its execution.
Solution: before the Lambda function terminates or the script completes, move the data from the /tmp folder to an AWS S3 bucket.