For security reasons, I wrote a Lambda function that adds metadata to data before it is saved to a DynamoDB table. Now I want to evaluate the impact of this additional metadata functionality, so I want to compare the performance of two Lambda functions: one with metadata and one without. I want to use this information for my thesis.
How accurate are the metrics provided by Lambda (memory used and execution time)? Can I use these parameters to evaluate the performance impact, or is there a better way?
The metrics provided by Lambda are very accurate, as they are used to determine billing. One caveat: the reported execution time does not include the network latency incurred in calling the Lambda, and as far as I know it also excludes the time spent provisioning the container on cold starts (it does include the time spent initializing your code in the container, but I don't believe it covers starting the container itself, copying code, and so on).
In conclusion, I don't think you can get more accurate measurements for memory used. For execution duration, however, there are other ways you may want to look at the data, depending on what you care about.
However, if you just want to A/B test two different function implementations, executing in the same environment with the same input data, you can probably rely on the Lambda-reported metrics.
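For example, each invocation writes a REPORT line to CloudWatch Logs containing the measured duration and peak memory. A small script (a sketch; the exact fields in the line can vary by runtime) can aggregate these for the A/B comparison:

```javascript
// Parse Lambda "REPORT" lines from CloudWatch Logs and average the measured
// duration and peak memory across many invocations of one function version.
function parseReport(line) {
  // negative lookbehind so "Billed Duration" is not matched by mistake
  const duration = line.match(/(?<!Billed )Duration: ([\d.]+) ms/);
  const memory = line.match(/Max Memory Used: (\d+) MB/);
  if (!duration || !memory) return null;
  return { durationMs: parseFloat(duration[1]), maxMemoryMb: parseInt(memory[1], 10) };
}

function summarize(lines) {
  const reports = lines.map(parseReport).filter(Boolean);
  const avg = (key) => reports.reduce((sum, r) => sum + r[key], 0) / reports.length;
  return { avgDurationMs: avg('durationMs'), avgMaxMemoryMb: avg('maxMemoryMb') };
}
```

Running this over the logs of the with-metadata and without-metadata functions gives directly comparable averages.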
Related
I know that Dagger creates injections at compile time by generating code, and hence its performance is better than Guice, which does it at runtime. But specifically for the case of Lambda, I have seen it mentioned in multiple places that Dagger is preferred. Is that because of the cold start problem?
Because of the cold start problem, Lambda repeats its bootstrapping whenever it receives a request after a long idle period. So with Dagger, would bootstrapping be much faster than with Guice because the code is already generated? I am assuming that all the objects in Guice are also created during bootstrap, as opposed to being lazily loaded.
As you already know, any dependency injection framework, at some point, needs to build some sort of dependency graph of the objects that are required by your application. Building this graph is often the most computationally expensive part of the DI framework.
Guice figures out this graph by using reflection at runtime. Dagger generates code that represents the dependency graph at compile time. I don't know exactly how much faster Dagger is, but I do know that using reflection incurs a non-trivial performance hit.
However, the biggest difference is that Dagger does all the heavy lifting at compile time (which means you do the work once, no matter how many times you run it), whereas Guice must do the equivalent work every time the application starts up.
Now, to answer your question: Dagger is preferred if your application frequently starts and stops. With something like a mobile app, a slower startup time mostly just degrades the UX. With Lambda, not only does it slow down the cold start, but since you are billed for the time your code runs, constantly rebuilding the dependency graph actually costs you more money.
TLDR; Dagger is preferred on Lambda (for both the cold start time and the cost) because it moves the most expensive part of the DI framework to compile time instead of performing it at runtime.
I was telling a friend that an advantage of running a workload on Lambda is that each instance, and thus each execution, gets dedicated resources: memory and CPU (and perhaps disk and network, but those are less relevant). And then I started wondering...
For instance, if you have a function with some CPU-intensive logic that is used by multiple tenants, then one execution should never be affected by another. If some calculation takes 5 seconds to execute, it will always take 5 seconds, no matter how many requests are processed simultaneously.
This seems self-evident for memory, but less so for CPU. From a quick test I seem to get mixed results.
So, does every function instance get its own dedicated CPU resources?
My main focus is AWS Lambda, but the same question arises for Azure (on a Consumption plan, I guess) and Google.
Lambda uses fractional allocations of instance CPU, running on instances comparable to compute-optimized EC2 instances. That CPU share is dedicated to the Lambda function, and its allocation is based on the amount of memory allocated to the function.
The CPU share dedicated to a function is based off of the fraction of its allocated memory, per each of the two cores. For example, an instance with ~3 GB memory available for Lambda functions, where each function can have up to 1 GB memory, means at most you can utilize ~1/3 * 2 cores = 2/3 of the CPU. The details may be revisited in the future.
This explanation is supported by the Lambda function configuration documentation, which states:
Performance testing your Lambda function is a crucial part in ensuring you pick the optimum memory size configuration. Any increase in memory size triggers an equivalent increase in CPU available to your function.
So yes, you get a dedicated share of an instance's total CPU, based on your memory allocation and the formula above.
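The quoted formula can be written out as a small sketch (the ~3 GB instance memory and 2 cores are the illustrative numbers from the forum answer above, not values guaranteed by AWS):

```javascript
// CPU share per the formula quoted above: the fraction of instance memory
// your function gets, multiplied across the two cores. The instance figures
// (~3 GB, 2 cores) are illustrative, not guaranteed by AWS.
function cpuShare(functionMemoryMB, instanceMemoryMB = 3072, cores = 2) {
  return (functionMemoryMB / instanceMemoryMB) * cores;
}

console.log(cpuShare(1024)); // ~0.667, i.e. the "2/3 of the CPU" from the quote
```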
It might have been clearer that I wasn't looking for documentation, but for facts. The core question was whether we can assume that one execution is never affected by another.
As I said, a first quick test gave me mixed results, so I took the time to delve in a little deeper.
I created a very simple lambda that, for a specified number of seconds, generates and sums up random numbers (code here):
// Setup reconstructed for context; the full code is linked above, and
// `randomizer` is assumed to be Math.random or similar.
var start = process.hrtime();
var random = 0, rounds = 0;
while (process.hrtime(start)[0] < duration) { // run for `duration` seconds
    var nextRandom = randomizer();
    random = random + nextRandom - 0.5; // keep the running sum centered near 0
    rounds++; // count completed iterations
}
Now, if executions on different instances are really independent, then there should be no difference between executing this lambda just once or multiple times in parallel, all other factors being equal.
But the figures indicate otherwise. Here's a graph, showing the number of 'rounds' per second that was achieved.
Every datapoint is the average of 10 iterations with the same number of parallel requests - which should rule out cold start effects and other variations. The raw results can be found here.
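A driver for such a test might look like the sketch below. The `invoke` parameter stands in for a wrapper around the AWS SDK's Lambda invoke call that resolves to the burn function's `{rounds, duration}` result; making it a parameter is my assumption so that the aggregation logic stays self-contained:

```javascript
// Sketch of the measurement driver: fire `parallel` requests at once,
// repeat `iterations` times, and average the reported rounds-per-second.
// `invoke` is expected to resolve to { rounds, duration } per invocation.
async function measure(invoke, parallel, iterations = 10) {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    const results = await Promise.all(
      Array.from({ length: parallel }, () => invoke({ duration: 5 }))
    );
    const perSecond = results.map((r) => r.rounds / r.duration);
    total += perSecond.reduce((a, b) => a + b, 0) / perSecond.length;
  }
  return total / iterations; // average rounds/sec across all iterations
}
```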
The results look rather shocking: they indicate that avoiding parallel executions of the same lambda can almost double the performance....?!
But sticking to the original question: it looks like the CPU fraction 'dedicated' to a Lambda instance is not fixed, but depends on certain other factors.
Of course I welcome any remarks on the test, and of course, explanations for the observed behavior!
When thinking about Lambda costs for memory/requests, is there any evidence to suggest that having larger functions with many package includes is a better option than many smaller functions with minimal packages?
for example
Script A has 10 packages (aws-sdk + 9 required packages) to complete 4 steps
-> this would be a single request with a longer execution time (and a larger memory requirement?)
vs.
4 scripts with 3 packages each (aws-sdk + 2 required packages) to complete the same 4 steps
-> these would be multiple requests, each with a shorter execution time (and a smaller memory requirement?)
It's difficult to recommend whether to use larger functions without going into the details of the exact scenario. In general, however, rather than dividing functions based on file size, divide them based on business capabilities and maintainability.
Lambda bills duration rounded up to the nearest 100 milliseconds. In terms of cost, it makes sense to have larger functions of the right size (not too small) that execute for long enough. On the other hand, this hurts the function's cold start performance.
Also, having too many small functions makes an application difficult to maintain.
So, based on your application's priorities around cost, performance, and the grouping of business capabilities, divide your Lambda functions as serverless microservices.
My AWS Lambda function, integrated with API Gateway, times out on the first request to the URL but works for subsequent requests.
Note: We also tried to keep the Lambdas warm by scheduling them in CloudWatch, but it didn't work.
This is a cold start problem.
You can do a few of the following to improve cold start speed.
If you are using Node.js:
Webpack:
Bundle all the modules that live in separate files into a single file.
If you are using other languages:
Number of files:
Keep the number of files low.
Lazy loading:
Don't load everything upfront; load modules only when they are needed.
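The lazy-loading advice can be sketched in Node.js; here the built-in zlib module is just a stand-in for a heavy dependency:

```javascript
// Lazy-load a heavy module on first use instead of at cold start.
// zlib is merely a stand-in here; in practice this pays off for large
// third-party packages, not for tiny built-ins.
let heavy = null;

function getCompressor() {
  if (!heavy) {
    heavy = require('zlib'); // not loaded until the first call
  }
  return heavy;
}

// Only code paths that actually compress pay the module load cost:
function compress(buf) {
  return getCompressor().gzipSync(buf);
}
```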
Hope it helps.
Without knowing too much about your specific use case, here are two general suggestions:
Increase the memory allocated to your functions, which also increases CPU proportionally. Because your functions are called very infrequently, the added cost of increasing memory size will be balanced by faster cold start times and thus lower billed duration.
Reduce your code size: a smaller .zip, removing unnecessary require()'s in Node.js, etc. For example, if you are including the Async library just to remove a nested callback, consider forgoing that to improve performance.
Refer to https://forums.aws.amazon.com/thread.jspa?threadID=181348 for more options.
Is scheduling a lambda function to get called every 20 mins with CloudWatch the best way to get rid of lambda cold start times? (not completely get rid of)...
Will this get pricey, or is there something I am missing? I have it set up right now and I think it is working.
Before, my cold start time was around 10 seconds and every subsequent call completed in about 80 ms. Now every call, no matter how frequent, takes around 80 ms. Is this a good method until your user base grows, at which point you could turn it off?
My second option is just using Beanstalk and having a server running 24/7, but that sounds expensive, so I'd rather not.
As far as I know, this is currently the only way to keep functions warm. It only gets pricey when you have a lot of them.
You'd have to calculate for yourself how much you would pay to keep your functions alive, considering how many you have, how long each takes to run, and how much memory you need.
But once every 20 minutes is something like 2,000 invocations per month, so if you use e.g. 128 MB and make them finish in under 100 ms, then you could keep quite a lot of such functions alive at 20-minute intervals and still stay under the free tier: that is only about 200 seconds of execution per month per function. You don't even need to turn it off after you get a bigger load, because at that point the overhead becomes irrelevant. Besides, you can never be sure of a uniform load all the time, so you might keep your heartbeat code active even then.
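That back-of-the-envelope calculation can be sketched as follows (assuming a 30-day month and 100 ms billed per ping):

```javascript
// Rough keep-warm cost per function: pings per month, times billed seconds
// per ping, times memory in GB. Assumes a 30-day month.
function keepWarmGbSeconds(intervalMinutes, billedMs, memoryMB) {
  const pingsPerMonth = (30 * 24 * 60) / intervalMinutes; // 2160 at 20-minute intervals
  return pingsPerMonth * (billedMs / 1000) * (memoryMB / 1024);
}

console.log(keepWarmGbSeconds(20, 100, 128)); // ~27 GB-seconds per month, far under the free tier
```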
My guess is that since it is so cheap to keep a function alive (especially if you add a special argument that makes it return immediately), and since the difference is so great (10 seconds vs. 80 ms), pretty much everyone will do it; there is practically no excuse not to. In that case I expect Amazon either to fight this practice (by making it harder or more expensive than it currently is, which would not be a smart move) or to make it unnecessary in the future. If the difference between a hot and a cold start were 100 ms, no one would bother; at 10 seconds, everyone has to work around it.
There will always be some difference between running code that last ran a second ago and code that last ran a month ago, because keeping everything in RAM and ready to go would waste a lot of resources. But I see no reason why that difference couldn't be made less noticeable, or why there couldn't be a few more gradations instead of just hot and cold starts.
You can improve the cold start time by allocating more memory to your Lambda function. With the default 512MB, I am seeing cold start times of 8-10 seconds for functions written in Java. This improves to 2-3 seconds with 1536MB of memory.
Amazon says that it is the CPU allocation that really matters, but there is no way to directly change it. CPU allocation increases proportionately to memory.
And if you want close to zero cold start times, keeping the function warm is the way to go, as rsp suggested.
Starting from December 2019, AWS Lambda supports Provisioned Concurrency, so you can set the number of Lambda instances that will be initialized and waiting for new calls. [1]
The downside is that you are charged for the provisioned concurrency. If you provision a concurrency of 1 for a 128 MB Lambda that is active 24 hours a day for a whole month, you will be charged: 1 instance x 30 days x 24 hr x 60 min x 60 sec x (128/1024 GB) = 324,000 GB-sec, almost all of the capacity AWS grants in the Lambda free tier. [2]
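That calculation as a sketch (simplified to GB-seconds only; actual provisioned-concurrency pricing has its own per-GB-second rate and bills invocations separately):

```javascript
// GB-seconds consumed by keeping instances provisioned around the clock.
// Simplified model: it ignores the separate provisioned-concurrency rate
// and per-invocation charges.
function provisionedGbSeconds(instances, memoryMB, days) {
  return instances * days * 24 * 3600 * (memoryMB / 1024);
}

console.log(provisionedGbSeconds(1, 128, 30)); // 324000, matching the figure above
```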
With that, you get a Lambda instance that responds very fast; subsequent concurrent calls beyond the provisioned capacity may still suffer cold starts, though.
What is more, you can configure Application Auto Scaling to dynamically manage the provisioned concurrency of your Lambda. [3]
Refs:
https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
https://aws.amazon.com/lambda/pricing/
https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html
Besides adding more memory, there is one more approach to reducing cold starts for Java: use the GraalVM native-image tool, which compiles the jar's bytecode ahead of time into a native executable. Essentially, part of the work that would otherwise happen on AWS at startup is done at build time. When uploading your code to AWS, select "Custom runtime" instead of java8.
Helpful article: https://engineering.opsgenie.com/run-native-java-using-graalvm-in-aws-lambda-with-golang-ba86e27930bf
Beware:
but it also has its limitations; it does not support dynamic class loading, and reflection support is also limited
Azure has a pre-warming solution for serverless instances (link). This would be a great feature in AWS Lambda if and when they implement it. Instead of the user warming instances at the application level, it is handled by the cloud provider in the platform.
Hitting the server on a schedule does not cover the case of simultaneous requests from several users, or of a single page sending a few API requests asynchronously.
A better solution is to dump the warmed-up process into a Docker checkpoint (CRIU). This is especially useful for dynamic languages, where the warm-up itself is fast but loading all the libraries is slow.
For details read
https://criu.org/Docker
https://www.imperial.ac.uk/media/imperial-college/faculty-of-engineering/computing/public/1819-ug-projects/StenbomO-Refunction-Eliminating-Serverless-Cold-Starts-Through-Container-Reuse.pdf
Other hints:
use more memory
use Python or JavaScript with only the most basic libraries; try to eliminate bulky ones
create several 'microservices' to reduce the chance of several users hitting the same service
see more at https://www.jeremydaly.com/15-key-takeaways-from-the-serverless-talk-at-aws-startup-day/
Lambda's cold start time depends on multiple factors, such as your implementation, the language runtime you use, and the code size. Giving your Lambda function more memory can also reduce cold starts. You can read the best practices at https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
The serverless community also has recommendations for performance https://atlas.serverless.tech-field-community.aws.a2z.com/Performance/index.html
The Lambda team also launched Provisioned Concurrency: you can now request that multiple Lambda containers be kept in a "hyper ready" state, ready to re-run your function. This is the new best practice for reducing the likelihood of cold starts.
Official docs https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html?icmpid=docs_lambda_console