How Could I Monitor Lambda Concurrent Executions on a Second-by-Second Basis (or Find a Better Solution to Limit Lambda ConcurrentExecutions)?

I am working on a massive distributed computing platform built on AWS Lambda. The workload is extremely spiky: most of the time the number of ConcurrentExecutions is below 50, but we can hit the maximum (currently 1,000) for an hour or more when a large batch job hits the system (it is event-driven). This is a problem because our customer-facing APIs will lag terribly. Finally, I am not an architect, so I have minimal control over how the system was designed, but I have been asked to devise a clever concurrent-execution-limiting solution.
I'm not new to AWS, so I know the standard ways to handle this problem. #1 is reserved concurrency on the user-facing Lambdas. I'm not allowed to do that for the sake of this exercise (though I'll tell my boss that's what's necessary if it truly is). I'm thinking of a system where we designate high-priority functions (for the UI) and low-priority functions (for batch processing), and the low-priority functions check a value stored in DynamoDB, fed from CloudWatch, for the current number of ConcurrentExecutions. If a low-priority function finds that we are in danger of using all the ConcurrentExecutions, it posts to a queue with exponential backoff in place. This should all work, except that ConcurrentExecutions is only reported in one-minute increments, which is too slow, as many of our Lambdas run for around 500 ms.
So my questions are as follows:
Is there a way to set up a custom ConcurrentExecutions metric that has second-by-second data points, and if so, how would you do it?
Is there a better way to implement a counter than Cloudwatch?
Am I just missing something here, and is there a clever way to manage Lambda ConcurrentExecutions?
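For concreteness, here is a rough sketch of the check I have in mind, using a DynamoDB atomic counter as a shared semaphore instead of the one-minute CloudWatch metric (the table name, key, and limit below are placeholders, not anything that exists yet):

```python
import boto3

dynamodb = boto3.client("dynamodb")

TABLE = "concurrency-semaphore"  # placeholder table name
SOFT_LIMIT = 800                 # leave headroom below the 1,000 account limit

def try_acquire_slot() -> bool:
    """Atomically increment the in-flight counter unless the soft limit is hit."""
    try:
        dynamodb.update_item(
            TableName=TABLE,
            Key={"pk": {"S": "low-priority"}},
            UpdateExpression="ADD in_flight :one",
            ConditionExpression="attribute_not_exists(in_flight) OR in_flight < :limit",
            ExpressionAttributeValues={
                ":one": {"N": "1"},
                ":limit": {"N": str(SOFT_LIMIT)},
            },
        )
        return True
    except dynamodb.exceptions.ConditionalCheckFailedException:
        return False  # at capacity: re-queue the work with backoff instead

def release_slot() -> None:
    """Decrement the counter; call this in a finally block when the function exits."""
    dynamodb.update_item(
        TableName=TABLE,
        Key={"pk": {"S": "low-priority"}},
        UpdateExpression="ADD in_flight :neg",
        ExpressionAttributeValues={":neg": {"N": "-1"}},
    )
```

The obvious weakness of this sketch is that a function that crashes never releases its slot, so the counter would need a TTL or periodic reconciliation.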

I don't think it's necessary to create a monitoring or throttling solution at all; you would need to build, test, and maintain something in addition to your core solution. Instead, two suggestions:
First, it sounds like the current design has one Lambda function doing too much. Decompose it further, so you can split the work into a UI/public Lambda and one or more Lambdas dedicated to the batch processes. This way concurrency can be observed, and if need be allocated, per function, instead of one function dominating the shared pool.
Second, request a service quota / limit increase:
To raise the limit above 1,000 concurrent function executions, submit a request to the AWS Support Center by following the steps in our documentation. This feature is available in all regions where Lambda is available.
See AWS Lambda Raises Default Concurrent Execution Limits.
https://aws.amazon.com/about-aws/whats-new/2017/05/aws-lambda-raises-default-concurrent-execution-limit/
The limit management team is quite flexible; when asked for a limit to be raised, they will generally raise it to any reasonable number that your solution requires.
To request a limit increase, see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html

Related

Is SQS better than DynamoDB for peak loads?

A service runs on ECS and writes the requested URL to a DynamoDB table. Dynamic scaling was enabled to keep DynamoDB costs from becoming too high. DynamoDB sometimes scales more slowly than requests come in, so some calls are not logged. My question now is whether writing to an SQS queue would be the better way here, because the documentation says:
Standard queues support a nearly unlimited number of API calls per second, per API action (SendMessage, ReceiveMessage, or DeleteMessage).
Of course, the messages would then have to be written back to DynamoDB, but another service can then do that.
Is the throughput of messages per second to SQS really unlimited, so that it would definitely be cheaper to send messages to SQS instead of increasing DynamoDB's writes per second?
I don't know if this qualifies as a good answer, but remembering a discussion with my architect at the time, we concluded that having a queue for precisely this problem is good practice, regardless of load. It keeps requests even if services go down, so there is an added benefit.
SQS and DynamoDB fit two very different use cases. It's not so much which is better; it's which is right for what you need.
DynamoDB is a NoSQL document-based database. It is best when you have known access patterns to data that needs to persist over time, that you need to access quickly, but that you are probably not changing often (or at least the changes do not have to be immediately, sub-5 ms, visible). Each document in DynamoDB is similar (but also very different) to a row in a standard SQL table, in that it has attributes (columns) and keys (partition and sort key) and is retrievable through a query (though dynamic, on-the-fly queries are NOT good for DynamoDB).
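For example, a "known access pattern" might look like this (the table and key names here are hypothetical):

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("orders")  # hypothetical table

# Known access pattern: the ten most recent items for one partition key.
response = table.query(
    KeyConditionExpression=Key("customer_id").eq("c-123"),
    ScanIndexForward=False,  # sort key descending, i.e. newest first
    Limit=10,
)
for item in response["Items"]:
    print(item)
```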
SQS is a queue system. It has no long-term persistence: payloads (often JSON objects) are dropped onto the queue and then processed by some endpoint, whether a Lambda, a write into DynamoDB, or something else entirely, depending on your product's use case. It is perfect for when you receive bursts of data but your system needs time to handle each individual payload, for example because it is waiting on other systems to finish before it can handle the next one: instead of scaling horizontally (just handling all the payloads in parallel), you work through more payloads over time with a single consumer or only a few. You cannot access the data while it is waiting in the queue, and you cannot query it; you can only wait until the data comes off the queue and is processed by whatever system you have set up to receive it.
The answer to your question depends entirely on your use case and your system, something we here on SO will never fully understand, simply because we will always hear about it through you and never experience it directly. To answer it, you need to understand the capabilities of both DynamoDB and SQS, the pros and cons of each, and then determine which is best for your product.
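If a queue does fit your case, the pattern from the question might look roughly like this (queue URL and table name are placeholders): the service sends to SQS, which absorbs the burst, and a consumer Lambda drains batches into DynamoDB at whatever rate the table can handle:

```python
import json
import boto3

sqs = boto3.client("sqs")
table = boto3.resource("dynamodb").Table("request-log")  # placeholder table

QUEUE_URL = "https://sqs.eu-central-1.amazonaws.com/123456789012/request-log"  # placeholder

def log_request(url: str) -> None:
    """Producer side: enqueue instead of writing to DynamoDB directly."""
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps({"url": url}))

def handler(event, context):
    """Consumer Lambda with an SQS event source: batch-write into DynamoDB."""
    with table.batch_writer() as batch:
        for record in event["Records"]:
            payload = json.loads(record["body"])
            batch.put_item(Item={"pk": record["messageId"], "url": payload["url"]})
```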

AWS Lambda best practices for Real Time Tracking

We currently run an AWS Lambda function that simply redirects the user to a different URL. The function is invoked via API Gateway.
For tracking purposes, we would like to create a widget on our dashboard that provides real-time insights into how many redirects are performed each second. The creation of the widget itself is not the problem.
My main question is which AWS service is best suited to telling our other services that an invocation took place. We plan to register the invocation in our database.
Some additional things:
low latency (< 5 seconds) in order to be real-time data
nearly no added wait time for the user; we aim to redirect the user as fast as possible
Many thanks in advance!
Best Regards
Martin
I understand that your goal is simply to persist, somewhere, the information that an invocation happened, with minimal impact on the Lambda's response time.
For that purpose I'd probably use an SQS standard queue and just send a message to the queue that the invocation happened.
You can then have an asynchronous process (Lambda, Docker, EC2) process the messages from the queue and update your Dashboard.
Depending on your scalability requirements, looking into Kinesis Data Analytics might also be worth it.
It's a fully managed streaming data solution and the analytics part allows you to do sliding window analyses using SQL on data in the Stream.
In that case you'd write the info that something happened to the stream, which also has a low latency.
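A sketch of what the redirect handler could look like with the SQS approach (queue URL and redirect target are placeholders); the single SendMessage call typically adds only a few milliseconds before the redirect is returned:

```python
import json
import time
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.eu-central-1.amazonaws.com/123456789012/redirect-events"  # placeholder

def handler(event, context):
    target = "https://example.com/landing"  # resolve the real redirect target here

    # Record the invocation; the dashboard consumer reads from the queue.
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"target": target, "ts": time.time()}),
    )

    # API Gateway (Lambda proxy integration) turns this into an HTTP redirect.
    return {"statusCode": 302, "headers": {"Location": target}}
```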

azure event hub capture vs custom function

We use Event Hubs; the intent is to be able to archive the inbound event data for troubleshooting/analytics reasons. Understandably, the built-in Event Hub Capture plays that role; however, looking at the price tag, my boss is not happy. His question is: what benefit does it offer compared to us simply writing our own function to bridge the Event Hub to some sort of storage, e.g. Blob? Would that justify the cost saving in the long run?
I don't know how to answer this; could you please help?
The Azure Functions consumption plan is billed mainly on the number of executions, whereas Event Hub Capture is billed on the number of TUs (throughput units).
Here are a couple of things that can help reduce the function app's execution count:
A smaller Event Hub partition count: for example, 4 partitions would deliver events in larger batches than 32 partitions would.
A larger batchSize in the function app's configuration.
Since you have only 3 partitions and 1 TU of traffic to process, you will probably save money by running a function rather than Capture. I recommend doing some test runs to see how many executions are incurred; then you can compare the hourly cost of the function app to the fixed $0.10 hourly cost of Event Hub Capture.
I am assuming the storage-side billing will be similar, or you can even try reducing it further by increasing batching and decreasing the number of storage calls.
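A back-of-the-envelope comparison of the two billing models, with assumed prices and traffic (check the current Azure pricing pages; the consumption plan also bills GB-seconds, which this ignores):

```python
# Assumed figures for illustration only.
CAPTURE_PER_HOUR = 0.10        # EH Capture: ~$0.10/hour fixed (from the answer above)
COST_PER_MILLION_EXECS = 0.20  # Functions consumption plan, per million executions

events_per_second = 100
batch_size = 256               # larger batches => fewer function executions

executions_per_hour = events_per_second * 3600 / batch_size
function_cost = executions_per_hour / 1_000_000 * COST_PER_MILLION_EXECS

print(f"{executions_per_hour:.0f} execs/hour -> ${function_cost:.4f}/hour "
      f"vs ${CAPTURE_PER_HOUR:.2f}/hour for Capture")
```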

Has anyone ever had to exceed 1000 concurrent executions in lambda?

I'm currently using ~500 concurrent executions, and this can easily reach up to 5,000. Is this a long-term problem, or is it relatively easy to make a quota increase request to AWS?
Getting quota increases is not difficult, but it's also not instantaneous. In some cases the support person will ask for more information on why you need the increase (often to be sure you aren't running afoul of best practices), which can slow things down. Different support levels have different response times, too. So if you are concerned about it, get ahead of it and request the increase before you think you'll need it.
To request an increase:
In the AWS Management Console, select Service Quotas
Click AWS Lambda
Select Concurrent executions
Click Request quota increase
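The same request can be made programmatically. A minimal sketch with boto3; the quota code shown is the one generally listed for Lambda's "Concurrent executions" quota, but verify it in your own account before relying on it:

```python
import boto3

client = boto3.client("service-quotas")

# L-B99A9384 is believed to be Lambda's "Concurrent executions" quota code;
# confirm with client.list_service_quotas(ServiceCode="lambda") first.
response = client.request_service_quota_increase(
    ServiceCode="lambda",
    QuotaCode="L-B99A9384",
    DesiredValue=5000.0,
)
print(response["RequestedQuota"]["Status"])
```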

Is it possible to keep an AWS Lambda function warm?

There are a few pieces of my app that cannot afford the additional 1-2 second delay caused by the "freeze-thaw" cycle that Lambda functions go through when they're new or unused for some period of time.
How can I keep these Lambda functions warm so AWS doesn't have to re-provision them all the time? This goes for both 1) infrequently-used functions and 2) recently-deployed functions.
Ideally, there would be a setting that I missed called "keep warm" that increases the cost of the Lambda function but always keeps an instance warm and ready to respond; I'm pretty sure this does not exist.
I suppose one option is to use a CloudWatch timer to ping the functions every so often... but this feels wrong to me. Also, I don't know the interval AWS uses before putting Lambda functions on ice.
UPDATE DEC 2019
AWS now also offers 'Provisioned Concurrency'. https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
Basically, you pay around $10/month (for a 1 GB Lambda) per instance that you want to keep 'warm'.
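Provisioned concurrency can be configured from the console or the API. A minimal boto3 sketch; the function name and alias below are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Keep 5 instances of the function initialized at all times.
# "my-function" and the "live" alias are placeholders for your own names.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-function",
    Qualifier="live",  # provisioned concurrency requires a version or alias
    ProvisionedConcurrentExecutions=5,
)
```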
BBC has published a nice article on iPlayer engineering where they describe a similar issue.
They have chosen to call the function every few minutes using CloudWatch Scheduled Events.
So in theory, it should just stay there, except it might not. So we have set up a scheduled event to keep the container ‘warm’. We invoke the function every couple of minutes not to do any processing, but to make sure we’ve got the model ready. This accounts for a very small percentage of invocations but helps us mitigate race conditions in the model download. (We also limit artificially how many lambdas we invoke in parallel as an additional measure.)
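If you go the scheduled-ping route, the handler should short-circuit the warm-up invocations so they do no real work. A sketch, assuming (as a convention of your own, not a Lambda feature) that the CloudWatch Events rule sends a constant payload like {"warmup": true}:

```python
def handler(event, context):
    # Short-circuit scheduled warm-up pings; the {"warmup": true} payload
    # is an assumed convention configured on the CloudWatch Events rule.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True}

    # ... normal request processing below ...
    return {"statusCode": 200, "body": "hello"}
```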