My Lambda function was taking about 120 ms with a 1024 MB memory size. When I checked the logs, it was using only 22 MB at most, so I tried to optimize it by reducing the memory to 128 MB.
But when I did this, the ~120 ms of processing went up to about ~350 ms, while still only 22 MB was being used.
I'm a bit confused: if I only use 22 MB, why does having 128 MB or 1024 MB available impact the processing time?
The underlying CPU power is directly proportional to the amount of memory you select, so that memory knob controls your CPU allocation as well.
That is why reducing the memory causes your Lambda function to take longer to execute.
The following is documented in the AWS docs for Lambda:
Compute resources that you need – You only specify the amount of memory you want to allocate for your Lambda function. AWS Lambda allocates CPU power proportional to the memory by using the same ratio as a general purpose Amazon EC2 instance type, such as an M3 type. For example, if you allocate 256 MB memory, your Lambda function will receive twice the CPU share than if you allocated only 128 MB.
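If you want to see the effect yourself, a minimal CPU-bound handler (an illustrative sketch, not your actual function) deployed once at 128 MB and once at 1024 MB will show the difference in duration directly in the logs:

```python
import time

def handler(event, context):
    """Pure-Python busy loop: its duration scales with the CPU share
    Lambda grants for the configured memory size."""
    start = time.perf_counter()

    total = 0
    for i in range(5_000_000):
        total += i * i

    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"checksum={total} elapsed_ms={elapsed_ms:.1f}")
    return {"elapsed_ms": elapsed_ms}
```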
Related
Does anyone know what effect memory usage has on Lambda response time? I understand that memory allocation is directly correlated to CPU allocation, but what about memory utilization as a percentage, e.g. 100 MB allocated but 95 MB being used (for dependencies that should be in layers)? Will that affect the execution time?
Memory utilisation at runtime does not change the number of virtual CPU cores allocated to your function.
As you already know, the number of cores depends on the amount of memory you configure. But this is a configuration-time allocation and has nothing to do with how much of that memory you actually use at runtime.
As commenter Anon Coward already mentioned, high memory utilisation can still have an impact on your Lambda's execution time, but it does not have to; it depends on what your code is actually doing.
The great thing is: you can measure all of this and you can therefore find out what the best memory size is for your Lambda function.
Even better, there are already projects that help you do exactly that:
https://github.com/alexcasalboni/aws-lambda-power-tuning
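If you prefer a quick manual sweep before reaching for the full tuning tool, a boto3 sketch along these lines (the function name and payload are placeholders) sets a memory size, waits for the update to finish, and times one synchronous invoke; bear in mind that the first call after each configuration change will include a cold start:

```python
import json
import time

import boto3

lambda_client = boto3.client("lambda")
FUNCTION_NAME = "my-function"  # placeholder: your function's name

def time_invocation(memory_mb):
    """Reconfigure the function's memory, wait for the update, time one invoke."""
    lambda_client.update_function_configuration(
        FunctionName=FUNCTION_NAME, MemorySize=memory_mb
    )
    lambda_client.get_waiter("function_updated").wait(FunctionName=FUNCTION_NAME)

    start = time.perf_counter()
    lambda_client.invoke(FunctionName=FUNCTION_NAME, Payload=json.dumps({}).encode())
    return (time.perf_counter() - start) * 1000

for mb in (128, 256, 512, 1024):
    print(f"{mb} MB -> {time_invocation(mb):.0f} ms")
```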
I have to process a lot of data in my Lambda code, and this computation could be parallelized. I am currently using single-threaded Python code and want to optimize it. I thought about converting it to multi-threaded Python code, but it seems that AWS Lambda doesn't have enough resources. What is the best way to do this?
AWS Lambda now supports up to 10 GB of memory and 6 vCPUs per Lambda function.
If you want to run CPU-bound, parallelized work on Lambda, always keep these core behaviours in mind:
The total number of vCPUs (which correlates with the optimal thread count) is dictated by how much memory you assign to that Lambda function.
Lambda allocates CPU power in proportion to the amount of memory configured. Memory is the amount of memory available to your Lambda function at runtime. You can increase or decrease the memory and CPU power allocated to your function using the Memory (MB) setting. To configure the memory for your function, set a value between 128 MB and 10,240 MB in 1-MB increments. At 1,769 MB, a function has the equivalent of one vCPU (one vCPU-second of credits per second).
Lambda is also hard-limited by the maximum time a single invocation can run: 900 seconds (15 minutes).
Depending on how your application is architected, you can improve performance with these things in mind:
Lambda does support multi-threaded / multi-core processing; a how-to in Python can be found here (a minimal sketch also follows after this list).
When you hit the upper limits of a single Lambda run, think about ways to break the work into multiple Lambdas running in parallel if possible. That level of horizontal scaling is what Lambda excels at.
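Here is a minimal sketch of the Process-and-Pipe fan-out pattern commonly used for CPU-bound work inside a single invocation; the dataset and chunking are placeholders. (multiprocessing.Pool has historically not worked on Lambda because /dev/shm is unavailable, which is why plain Process and Pipe are used here.)

```python
import multiprocessing

def _work(chunk, conn):
    # CPU-bound work on one chunk; the result goes back through the pipe.
    conn.send(sum(x * x for x in chunk))
    conn.close()

def handler(event, context):
    data = list(range(1_000_000))            # placeholder for the real dataset
    n = multiprocessing.cpu_count()          # cores visible to this runtime
    chunks = [data[i::n] for i in range(n)]  # simple round-robin split

    processes, parent_conns = [], []
    for chunk in chunks:
        parent, child = multiprocessing.Pipe()
        p = multiprocessing.Process(target=_work, args=(chunk, child))
        p.start()
        processes.append(p)
        parent_conns.append(parent)

    results = [conn.recv() for conn in parent_conns]
    for p in processes:
        p.join()

    return {"cores": n, "result": sum(results)}
```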
In AWS Lambda, is the RAM allocated for a function per instance of that function, or shared across all running instances of it? Until now I believed it's per instance.
Let's consider a Lambda 'testlambda' configured with a 5-minute timeout, 3008 MB (the current maximum) of RAM, and the "Use unreserved account concurrency" option selected:
At time T, one instance of 'testlambda' starts running; assume it will run for 100 seconds and use 100 MB of RAM for the whole 100 seconds. If another instance of 'testlambda' starts at T+50s, how much RAM will be available to the second instance: 3008 MB or 2908 MB?
I used to believe that the second instance would also have 3008 MB, but after seeing my Lambda's recent execution logs I am inclined to say the second instance will have 2908 MB.
The allocation is for each container.
Containers are not used by more than one invocation at any given time -- that is, containers are reused, but not concurrently. (And not by more than one version of one function).
If your code is leaking memory, subsequent but non-concurrent invocations spaced relatively close together in time will be observed using more and more memory, because they are running in the same container... but this would never happen in the scenario you described: the second invocation at T+50 would never share a container with the 100-second process started at T+0.
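To make the container-reuse point concrete, here is an illustrative sketch (not code from the question): module-level state survives between warm, non-concurrent invocations of the same container, but a concurrently started container always begins fresh.

```python
import os

# Lives as long as this container instance does; grows only across warm
# reuses of this same container, never across concurrent containers.
_invocations = []

def handler(event, context):
    _invocations.append(event)
    # On a reused (warm) container this count keeps climbing; a container
    # started to serve a concurrent request reports 1 on its first call.
    print(f"pid={os.getpid()} invocations_in_this_container={len(_invocations)}")
    return {"count": len(_invocations)}
```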
From what I have seen, at least so far, the RAM is not shared. We had a lot of concurrent requests with the default RAM for Lambdas; if it were shared for some reason, we would have seen memory-related problems, but that never happened.
You could test this by reducing the RAM of a dummy Lambda that executes for X seconds, then calling it several times concurrently to see whether the memory used exceeds the memory you selected.
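A sketch of such a dummy function, with arbitrary sizes and durations chosen just for the test, might look like this; invoke it several times concurrently and compare the reported peak against the configured memory size:

```python
import resource
import time

def handler(event, context):
    """Hold some memory for a while, then report this process's peak usage."""
    mb_to_hold = event.get("mb", 100)
    blob = bytearray(mb_to_hold * 1024 * 1024)   # allocate roughly mb_to_hold MB
    time.sleep(event.get("seconds", 10))         # keep this instance busy

    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # KB on Linux
    return {"held_mb": len(blob) // (1024 * 1024), "peak_rss_mb": peak_kb // 1024}
```

If the configured RAM were shared across concurrent instances, the combined allocations would exceed it and invocations would start failing with out-of-memory errors; in practice each instance gets the full configured amount.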
I'm confused about the purpose of having both hard and soft memory limits for ECS task definitions.
IIRC the soft limit is how much memory the scheduler reserves on an instance for the task to run, and the hard limit is how much memory a container can use before it is murdered.
My issue is that if the ECS scheduler allocates tasks to instances based on the soft limit, you could have a situation where a task that is using memory above the soft limit but below the hard limit could cause the instance to exceed its max memory (assuming all other tasks are using memory slightly below or equal to their soft limit).
Is this correct?
Thanks
If you expect to run a compute workload that is primarily memory bound instead of CPU bound then you should use only the hard limit, not the soft limit. From the docs:
You must specify a non-zero integer for one or both of memory or memoryReservation in container definitions. If you specify both, memory must be greater than memoryReservation. If you specify memoryReservation, then that value is subtracted from the available memory resources for the container instance on which the container is placed; otherwise, the value of memory is used.
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html
By specifying only a hard memory limit for your tasks you avoid running out of memory, because ECS stops placing tasks on an instance once its memory is fully reserved, and Docker kills any container that tries to go over its hard limit.
The soft memory limit feature is designed for CPU bound applications where you want to reserve a small minimum of memory (the soft limit) but allow occasional bursts up to the hard limit. In this type of CPU heavy workload you don't really care about the specific value of memory usage for the containers that much because the containers will run out of CPU long before they exhaust the memory of the instance, so you can place tasks based on CPU reservation and the soft memory limit. In this setup the hard limit is just a failsafe in case something goes out of control or there is a memory leak.
So in summary you should evaluate your workload using load tests and see whether it tends to run out of CPU first or out of memory first. If you are CPU bound then you can use the soft memory limit with an optional hard limit just as a failsafe. If you are memory bound then you will need to use just the hard limit with no soft limit.
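As an illustration (a hypothetical task definition, not from the question), the two limits map to the memory and memoryReservation fields of a container definition; for a memory-bound workload you would keep only memory and drop memoryReservation:

```python
import boto3

ecs = boto3.client("ecs")

# Hypothetical example: a soft limit for placement plus a hard OOM-kill limit.
ecs.register_task_definition(
    family="example-web",
    containerDefinitions=[
        {
            "name": "app",
            "image": "example/app:latest",
            "cpu": 256,
            "memoryReservation": 256,  # soft limit (MiB): reserved at placement time
            "memory": 512,             # hard limit (MiB): Docker kills the container above this
            "essential": True,
        }
    ],
)
```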
#nathanpeck is the authority here, but I just wanted to address a specific scenario that you brought up:
My issue is that if the ECS scheduler allocates tasks to instances based on the soft limit, you could have a situation where a task that is using memory above the soft limit but below the hard limit could cause the instance to exceed its max memory (assuming all other tasks are using memory slightly below or equal to their soft limit).
This post from AWS explains what occurs in such a scenario:
If containers try to consume memory between these two values (or between the soft limit and the host capacity if a hard limit is not set), they may compete with each other. In this case, what happens depends on the heuristics used by the Linux kernel's OOM (Out of Memory) killer. ECS and Docker are both uninvolved here; it's the Linux kernel reacting to memory pressure. If something is above its soft limit, it's more likely to be killed than something below its soft limit, but figuring out which process gets killed requires knowing all the other processes on the system and what they are doing with their memory as well. Again the new memory feature we announced can come to rescue here. While the OOM behavior isn't changing, now containers can be configured to swap out to disk in a memory pressure scenario. This can potentially alleviate the need for the OOM killer to kick in (if containers are configured to swap).
I have a Lambda function that reads messages off an SQS queue and inserts items into DynamoDB. At first, I had it at 512 MB of memory. CloudWatch reported the max memory used was around 58 MB. I assumed I could then lower the memory to 128 MB and see the same rate of processing SQS messages. However, that wasn't the case; things noticeably slowed. Can anyone explain?
Here is CloudWatch showing max memory used with the 512 MB Lambda:
Here is CloudWatch showing max memory used with the 128 MB Lambda:
Here you can see the consumed capacity of the Dynamo table really dropped
Here you can see the rate at which messages were processed really slowed, as evidenced by the shallower slope
This seems counter-intuitive, but there's a logical explanation:
Reducing memory also reduces the available CPU cycles. You're paying for very short term use of a fixed fraction of the resources of an EC2 instance, which has a fixed ratio of CPU to memory.
Q: How are compute resources assigned to an AWS Lambda function?
In the AWS Lambda resource model, you choose the amount of memory you want for your function, and are allocated proportional CPU power and other resources. For example, choosing 256MB of memory allocates approximately twice as much CPU power to your Lambda function as requesting 128MB of memory and half as much CPU power as choosing 512MB of memory. You can set your memory in 64MB increments from 128MB to 1.5GB.
https://aws.amazon.com/lambda/faqs/
So, how much CPU capacity are we talking about?
AWS Lambda allocates CPU power proportional to the memory by using the same ratio as a general purpose Amazon EC2 instance type, such as an M3 type.
http://docs.aws.amazon.com/lambda/latest/dg/lambda-introduction-function.html
We can extrapolate.
In the M3 class the provisioning ratios are the same regardless of instance size; taking m3.2xlarge as the reference, the provisioning factors look like this:
CPU = Xeon E5-2670 v2 (Ivy Bridge) × 8 cores
Relative Compute Performance = 26 ECU
Memory = 30 GiB
An ECU is an EC2 Compute Unit, where 1.0 ECU is approximately equivalent to the compute capacity of a circa-2007 1.0–1.2 GHz Opteron or Xeon. It's a dimensionless quantity for simplifying comparison of the relative CPU capacity of differing instance types.
So the provisioning ratios look like this:
8/30 Cores/GiB
26/30 ECU/GiB
So at 512 MiB memory, your Lambda function's container's share of this machine would be...
8 ÷ 30 ÷ (1024/512) = 0.133 of 1 core (~13.3% CPU)
26 ÷ 30 ÷ (1024/512) = 0.433 ECU (~433 MHz equivalent)
At 128 MiB, it's only about 1/4 of that.
These numbers seem really small, but they are not inappropriate for the typical Lambda use-case -- single-threaded, asynchronous actions that are not CPU intensive.
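For reference, a few lines of Python reproduce the extrapolation above (M3-class ratios; the 1 ECU ≈ 1 GHz equivalence is the same rough approximation used earlier):

```python
# Back-of-the-envelope share of an M3-class host per Lambda memory setting.
CORES_PER_GIB = 8 / 30   # 8 cores per 30 GiB
ECU_PER_GIB = 26 / 30    # 26 ECU per 30 GiB

def share_for_memory(memory_mib):
    gib = memory_mib / 1024
    return CORES_PER_GIB * gib, ECU_PER_GIB * gib

for mem in (128, 512, 1024):
    cores, ecu = share_for_memory(mem)
    print(f"{mem:>5} MiB -> {cores:.3f} cores, {ecu:.3f} ECU (~{ecu * 1000:.0f} MHz equivalent)")
```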