Controlling Lambda + Kinesis Costs

We have a .NET client application that uploads files to S3. There is an event notification registered on the bucket which triggers a Lambda to process the file. If we need to do maintenance, then we suspend our processing by removing the event notification and adding it back later when we're ready to resume processing.
To process the backlog of files that have queued up in S3 while the event notification was disabled, we write a record containing each file's S3 key to a Kinesis stream, and we have an event source mapping that lets Lambda consume each Kinesis record. This works great for us because it allows us to control our concurrency when we are processing a large backlog by controlling the number of shards in the stream. We were originally using SNS, but when we had thousands of files that needed to be reprocessed, SNS would keep starting Lambdas until we hit our concurrent executions threshold, which is why we switched to Kinesis.
The problem we're facing right now is that the cost of Kinesis is killing us, even though we barely use it. We get 150 - 200 files uploaded per minute, and our Lambda takes about 15 seconds to process each one. If we suspend processing for a few hours we end up with thousands of files to process. We could easily reprocess them with a 128-shard stream, but that would cost us $1,400 / month. The current cost for running our Lambda each month is less than $300. It seems terrible that we have to increase our COGS by 400% just to be able to control our concurrency level during a recovery scenario.
I could attempt to keep the stream size small by default and then resize it on the fly before we re-process a large backlog, however resizing a stream from 1 shard up to 128 takes an incredibly long time. If we're trying to recover from an unplanned outage then we can't afford to sit around waiting for the stream to resize before we can use it. So my questions are:
Can anyone recommend an alternative pattern to using kinesis shards for being able to control the upper bound on the number of concurrent lambdas draining a queue?
Is there something I am missing which would allow us to use Kinesis more cost efficiently?
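As a sanity check on the numbers above, here is a rough back-of-the-envelope sketch. It assumes a steady arrival rate and a fixed 15 seconds per file; the function names and the 3-hour outage are hypothetical, chosen only for illustration.

```python
def required_concurrency(files_per_minute: float, seconds_per_file: float) -> float:
    """Average number of busy Lambdas needed just to keep up with arrivals."""
    return files_per_minute * seconds_per_file / 60.0


def drain_seconds(backlog: int, concurrency: int,
                  files_per_minute: float, seconds_per_file: float) -> float:
    """Time to clear a backlog while new files keep arriving."""
    service_rate = concurrency / seconds_per_file   # files finished per second
    arrival_rate = files_per_minute / 60.0          # files arriving per second
    net_rate = service_rate - arrival_rate
    if net_rate <= 0:
        return float("inf")  # the backlog never shrinks
    return backlog / net_rate


if __name__ == "__main__":
    # Steady-state concurrency at the low and high end of the upload rate.
    print(required_concurrency(150, 15), required_concurrency(200, 15))
    backlog = 3 * 60 * 200  # assumption: 3-hour outage at 200 files/minute
    print(round(drain_seconds(backlog, 128, 200, 15) / 3600, 2), "hours to drain")
    # Per-shard cost implied by the $1,400 / 128 shards figure quoted above.
    print(round(1400 / 128, 2), "USD per shard per month")
```

Under these assumptions, normal traffic already needs somewhere around 38 to 50 concurrent executions, so whatever replaces the shard count as the concurrency cap has to sit comfortably above that for a backlog to shrink at all.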

You can use SQS with Lambda or Worker EC2s.
Here is how it can be achieved (2 approaches):
1. Serverless Approach
S3 -> SNS -> SQS -> Lambda Scheduler -> Lambda
Use SQS instead of Kinesis for storing S3 Paths
Use a Lambda Scheduler to keep polling messages (S3 paths) from SQS
Invoke Lambda function from Lambda scheduler for processing files
2. EC2 Approach
S3 -> SNS -> SQS -> Beanstalk Worker
Use SQS instead of Kinesis for storing S3 Paths
Use Beanstalk Worker environment which polls SQS automatically
Implement the application (processing logic) as a local HTTP server running on the same EC2 instance as the Beanstalk worker
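The serverless approach could be sketched roughly as below. This is only an illustration, not a tested implementation: `dispatch_batch`, `worker_name`, `max_workers` and the payload shape are made-up names, and the `sqs` and `lambda_client` arguments are assumed to be boto3 clients (`boto3.client("sqs")` and `boto3.client("lambda")`). It also assumes the worker deletes the message itself once it succeeds.

```python
import json


def dispatch_batch(sqs, lambda_client, queue_url: str, worker_name: str,
                   max_workers: int) -> int:
    """Pull up to max_workers messages and start one worker Lambda per message.

    Returns the number of workers started in this scheduler run.
    """
    started = 0
    while started < max_workers:
        want = min(10, max_workers - started)  # ReceiveMessage returns at most 10
        resp = sqs.receive_message(QueueUrl=queue_url,
                                   MaxNumberOfMessages=want,
                                   WaitTimeSeconds=1)
        messages = resp.get("Messages", [])
        if not messages:
            break  # nothing visible in the queue right now
        for msg in messages:
            # Hypothetical payload: the worker processes the path, then deletes
            # the message using the receipt handle.
            payload = {"s3_path": msg["Body"],
                       "receipt_handle": msg["ReceiptHandle"]}
            lambda_client.invoke(FunctionName=worker_name,
                                 InvocationType="Event",  # asynchronous invoke
                                 Payload=json.dumps(payload).encode("utf-8"))
            started += 1
    return started
```

The cap only holds if the scheduler runs no more often than a worker takes to finish, and the queue's visibility timeout needs to be longer than the processing time so that in-flight messages are not handed out twice.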

Related

AWS Lambda read from SQS without concurrency

My requirement is like this.
Read from a SQS every 2 hours, take all the messages available and then process it.
Processing includes creating a file with details from SQS messages and sending it to an sftp server.
I implemented an AWS Lambda to achieve point 1. The Lambda has an SQS trigger with a batch size of 50 and a batch window of 2 hours. My assumption was that the Lambda would be triggered every 2 hours, that 50 messages would be delivered to the function in one go, and that I would create one file for every 50 records.
But I observed that my Lambda function is triggered with a varying number of messages (sometimes 50, sometimes 20, sometimes 5, etc.) even though I have configured the batch size as 50.
After reading some documentation I learned (though I am not sure) that Lambda spawns 5 long-polling connections to read from SQS, and that this is what causes the function to be triggered with a varying number of messages.
My question is
Is my assumption on 5 parallel connections being established correct? If yes, is there a way I can control it? I want this to happen in a single thread / connection
If 1 is not possible, what other alternatives do I have here? I do not want one file created for every few records. I want one file to be generated every two hours with all the messages in SQS.

A "SQS Trigger" for Lambda is implemented with the so-called Event Source Mapping integration, which polls, batches and deletes messages from the queue on your behalf. It's designed for continuous polling, although you can disable it. You can set a maximum batch size of up to 10,000 records a function receives (BatchSize) and a maximum of 300s long polling time (MaximumBatchingWindowInSeconds). That doesn't meet your once-every-two-hours requirement.
Two alternatives:
Remove the Event Source Mapping. Instead, trigger the Lambda every two hours on a schedule with an EventBridge rule. Your Lambda is responsible for the SQS ReceiveMessage and DeleteMessageBatch operations. This approach ensures your Lambda will be invoked only once per cron event.
Keep the Event Source Mapping. Process messages as they arrive, accumulating the partial results in S3. Once every two hours, run a second, EventBridge-triggered Lambda, which bundles the partial results from S3 and sends them to the SFTP server. You don't control the number of Lambda invocations.
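The first alternative might look something like the sketch below. It assumes `sqs` is a boto3 SQS client, that `send_file` is a hypothetical callback doing the SFTP upload, and that the queue's visibility timeout is long enough to cover the whole run (otherwise messages can reappear and be read twice). The function names are illustrative.

```python
def drain_queue(sqs, queue_url: str, max_messages: int = 100_000):
    """Receive messages until the queue looks empty; return them as a list."""
    collected = []
    while len(collected) < max_messages:
        resp = sqs.receive_message(QueueUrl=queue_url,
                                   MaxNumberOfMessages=10,  # API maximum per call
                                   WaitTimeSeconds=2)       # long polling
        messages = resp.get("Messages", [])
        if not messages:
            break
        collected.extend(messages)
    return collected


def delete_all(sqs, queue_url: str, messages) -> None:
    """Delete in chunks of 10, the DeleteMessageBatch limit."""
    for start in range(0, len(messages), 10):
        chunk = messages[start:start + 10]
        entries = [{"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
                   for i, m in enumerate(chunk)]
        sqs.delete_message_batch(QueueUrl=queue_url, Entries=entries)


def run_once(sqs, queue_url: str, send_file) -> int:
    """Build one file from everything in the queue, send it, then delete."""
    messages = drain_queue(sqs, queue_url)
    if messages:
        body = "\n".join(m["Body"] for m in messages)
        send_file(body)  # hypothetical SFTP upload; delete only after it succeeds
        delete_all(sqs, queue_url, messages)
    return len(messages)
```

Deleting only after the upload succeeds means a failed run leaves the messages in the queue for the next scheduled invocation.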
Note on scaling:
<Edit (mid-Jan 2023): AWS Lambda now supports SQS Maximum Concurrency>
AWS Lambda now supports setting Maximum Concurrency to the Amazon SQS event source, a more direct and less fiddly way to control concurrency than with reserved concurrency. The Maximum Concurrency setting limits the number of concurrent instances of the function that an Amazon SQS event source can invoke. The valid range is 2-1000 concurrent instances.
The create and update Event Source Mapping APIs now have a ScalingConfig option for SQS:
aws lambda update-event-source-mapping \
--uuid "a1b2c3d4-5678-90ab-cdef-11111EXAMPLE" \
--scaling-config '{"MaximumConcurrency":2}' # valid range is 2-1000
</Edit>
With the SQS Event Source Mapping integration you can tweak the batch settings, but ultimately the Lambda service is in charge of Lambda scaling. As the AWS blog post "Understanding how AWS Lambda scales with Amazon SQS standard queues" says:
Lambda consumes messages in batches, starting at five concurrent batches with five functions at a time. If there are more messages in the queue, Lambda adds up to 60 functions per minute, up to 1,000 functions, to consume those messages.
You could theoretically restrict the number of concurrent Lambda executions with reserved concurrency, but you would risk dropped messages due to throttling errors.

You could try to set the ReservedConcurrency of the function to 1. That may help. See the docs for reference.
A simple solution would be to create a CloudWatch Event Trigger (similar to a Cronjob) that triggers your Lambda function every two hours. In the Lambda function, you call ReceiveMessage on the Queue until you get all messages, process them and afterward delete them from the Queue. The drawback is that there may be too many messages to process within 15 minutes so that's something you'd have to manage.

batch processing s3 objects using lambda

The use case is that 1000s of very small-sized files are uploaded to s3 every minute and all the incoming objects are to be processed and stored in a separate bucket using lambda.
But using s3-object-create as a trigger will cause many Lambda invocations, and concurrency needs to be taken care of. I am trying to batch process the newly created objects every 5-10 minutes. S3 provides batch operations, but its reports are only generated daily or weekly. Is there a service available that can help me?

According to AWS documentation, S3 can publish "New object created events" to the following destinations:
Amazon SNS
Amazon SQS
AWS Lambda
In your case I would:
Create an SQS queue.
Configure S3 Bucket to publish S3 new object events to SQS.
Reconfigure your existing Lambda to subscribe to SQS.
Configure batching for input SQS events.
Currently, the maximum batch size for an SQS-Lambda subscription is 10,000 messages for standard queues (batch sizes above 10 also require a batching window). But since your Lambda needs around 2 seconds to process a single event, you should start with something smaller, otherwise Lambda will time out because it won't be able to process all of the events.
Thanks to this, uploading X items to S3 will produce roughly X / Y Lambda invocations, where Y is the configured batch size (batches are not always full, so treat this as a lower bound). For 1000 S3 items and a batch size of 100, it will only take around 10 Lambda invocations.
The AWS document mentioned above explains how to publish S3 events to SQS. I won't explain it here, as it's more about implementation details.
Execution time
However, you might run into a problem where the processing is too slow, because Lambda will probably be processing the events one by one in a loop.
The workaround would be to use asynchronous processing. The implementation depends on which runtime you use for Lambda; for Node.js it would be very easy to achieve.
Also, if you want to speed up the processing in other ways, simply reduce the maximum batch size and increase the Lambda memory configuration, so that a single execution processes a smaller number of events and has access to more CPU.
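As a rough illustration of the batching and the concurrent processing described above, here is a sketch of an SQS-triggered handler in Python (threads are used here in place of the Node.js async approach). It assumes S3 publishes directly to the queue, so each SQS message body is the S3 notification JSON; `process_object` is a hypothetical placeholder for the real work and the pool size of 16 is arbitrary.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import unquote_plus


def extract_objects(event):
    """Yield (bucket, key) for every S3 object referenced in an SQS batch."""
    for sqs_record in event.get("Records", []):
        s3_event = json.loads(sqs_record["body"])
        # S3 test notifications carry no "Records" key, so default to empty.
        for rec in s3_event.get("Records", []):
            bucket = rec["s3"]["bucket"]["name"]
            key = unquote_plus(rec["s3"]["object"]["key"])  # keys arrive URL-encoded
            yield bucket, key


def process_object(bucket: str, key: str) -> str:
    """Placeholder (assumption): read, transform and store the object elsewhere."""
    return f"{bucket}/{key}"


def handler(event, context=None):
    objects = list(extract_objects(event))
    # Threads overlap the I/O waits instead of handling objects one by one.
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(lambda bk: process_object(*bk), objects))
    return {"processed": len(results)}
```

If any object raises, the whole invocation fails and the batch becomes visible in the queue again, so the processing step should be safe to repeat.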

Why is the AWS Lambda handler invoked directly by some AWS services while Lambda needs to poll others?

Why are some Amazon Web Services configured such that they can make direct calls to Lambda handlers with appropriate permissions, while for others like SQS, lambda needs to poll repeatedly? Why can't we have a provision for invoking Lambda as soon as a message is added to an SQS, instead of polling repeatedly?

I think this is related to scaling.
From Understanding Scaling Behavior - AWS Lambda:
Poll-based event sources that are not stream-based: For Lambda functions that process Amazon SQS queues, AWS Lambda will automatically scale the polling on the queue until the maximum concurrency level is reached, where each message batch can be considered a single concurrent unit. AWS Lambda's automatic scaling behavior is designed to keep polling costs low when a queue is empty while simultaneously enabling you to achieve high throughput when the queue is being used heavily.
When an Amazon SQS event source mapping is initially enabled, Lambda begins long-polling the Amazon SQS queue. Long polling helps reduce the cost of polling Amazon Simple Queue Service by reducing the number of empty responses, while providing optimal processing latency when messages arrive.
When messages are available, Lambda initially launches up to 5 instances of your function, to handle 5 batches simultaneously. Then, Lambda launches up to 60 more instances per minute, up to 1000 total, as long as you have concurrency available at the account and function level.
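To get a feel for what that ramp means in practice, the quoted figures (5 initial instances, then up to 60 more per minute, capped at 1000) can be turned into a simple model. This is a simplification for illustration only; the function names are made up, and real scaling also depends on account and function concurrency limits.

```python
def max_concurrent_batches(minutes_elapsed: float, cap: int = 1000) -> int:
    """Upper bound on concurrent batches under the quoted 5 + 60/minute ramp."""
    return min(cap, 5 + 60 * int(minutes_elapsed))


def minutes_to_reach(target: int) -> int:
    """Whole minutes until the ramp allows at least `target` concurrent batches.

    Only meaningful for targets up to the cap (1000 in the quoted text).
    """
    if target <= 5:
        return 0
    return -(-(target - 5) // 60)  # ceiling division


if __name__ == "__main__":
    for minute in (0, 1, 5, 17):
        print(minute, max_concurrent_batches(minute))
    print(minutes_to_reach(1000), "minutes to reach the cap")
```

So even with a deep backlog, it takes on the order of a quarter of an hour for polling to reach its ceiling under this model.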

Best way to process SQS messages

I have a client that is constantly pouring semi-real-time data into an SQS queue, and I want to process and store the messages. My first thought was to use a CloudWatch schedule that triggers a Lambda, which reads the approximate number of messages in the queue and then spawns worker Lambdas to process them and push the data into a Firehose. The problem is that there will be hundreds of thousands of messages put into the queue every day. I could also use EC2 to do this, but is there any other cost-effective way to process the queue in semi-real-time?

The recommended solution for processing streaming data in AWS Lambda is to send the data to Amazon Kinesis, which can then trigger a Lambda function automatically. Kinesis also preserves the ordering of messages. (Amazon SQS only preserves ordering if you use a FIFO queue, which has throughput limitations.)
If you really are limited to processing from SQS, you could write a program that pulls from SQS and pushes to Kinesis or simply pull from SQS and process the data immediately. Such a program could run on an Amazon EC2 instance, or could be triggered on a regular basis by a scheduled Amazon CloudWatch Event.
The main thing to consider is how to handle variable volumes. If you cannot accept long delays between messages arriving and being processed, you will need to either use Lambda (automatically scalable) or have plenty of available processing power to handle the spikes.

Can I limit concurrent invocations of an AWS Lambda?

I have a Lambda function that’s triggered by a PUT to an S3 bucket.
I want to limit this Lambda function so that it’s only running one instance at a time – I don’t want two instances running concurrently.
I’ve had a look through the Lambda configuration and docs, but I can’t see anything obvious. I can think about writing my own locking system, but it would be nice if this was already a solved problem.
How can I limit the number of concurrent invocations of a Lambda?

AWS Lambda now supports concurrency limits on individual functions:
https://aws.amazon.com/about-aws/whats-new/2017/11/set-concurrency-limits-on-individual-aws-lambda-functions/
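Setting such a limit from code could look roughly like this. It is a sketch that assumes `lambda_client` is a boto3 Lambda client (`boto3.client("lambda")`); `limit_concurrency` is just an illustrative wrapper name.

```python
def limit_concurrency(lambda_client, function_name: str, limit: int = 1) -> int:
    """Reserve `limit` concurrent executions for one function; return the value set."""
    if limit < 0:
        raise ValueError("limit must be zero or greater")
    resp = lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
    return resp["ReservedConcurrentExecutions"]
```

With a limit of 1, extra S3 notifications are throttled rather than queued indefinitely. Lambda retries throttled asynchronous invocations only for a limited time, so this alone does not guarantee that every event is eventually processed.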

I would suggest using Kinesis Streams (or alternatively DynamoDB + DynamoDB Streams, which essentially have the same behavior).
You can see a Kinesis Stream as a queue. The good part is that you can use a Kinesis Stream as a trigger for your Lambda function. So anything that gets inserted into this queue will automatically be passed over to your function, in order. So you will be able to process those S3 events one by one, one Lambda execution after the other (one instance at a time).
In order to do that, you'll need to create a Lambda function with the simple purpose of getting S3 Events and putting them into a Kinesis Stream. Then you'll configure that Kinesis Stream as your Lambda Trigger.
When you configure the Kinesis Stream as your Lambda trigger, I suggest using the following configuration:
Batch size: 1
This means that your Lambda will be called with only one event from Kinesis. You can select a higher number and you'll get a list of events of that size (for example, if you want to process the last 10 events in one Lambda execution instead of 10 consecutive Lambda executions).
Starting position: Trim horizon
This means it'll behave as a queue (FIFO).
There is a bit more info in the AWS May Webinar Series talk "Streaming Data Processing with Amazon Kinesis and AWS Lambda".
I hope this helps anyone with a similar problem.
P.S. Bear in mind that Kinesis Streams have their own pricing. Using DynamoDB + DynamoDB Streams might be cheaper (or even free due to the non-expiring Free Tier of DynamoDB).
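The two functions described above could be sketched as follows. This assumes `kinesis` is a boto3 Kinesis client and a single-shard stream; the function names, the record shape and the constant partition key `"all"` are illustrative choices, not anything prescribed by AWS.

```python
import base64
import json


def forward_to_kinesis(kinesis, stream_name: str, s3_event: dict) -> int:
    """S3-triggered Lambda body: put one Kinesis record per S3 object."""
    count = 0
    for rec in s3_event.get("Records", []):
        data = {"bucket": rec["s3"]["bucket"]["name"],
                "key": rec["s3"]["object"]["key"]}
        kinesis.put_record(StreamName=stream_name,
                           Data=json.dumps(data).encode("utf-8"),
                           PartitionKey="all")  # assumption: one key, one shard
        count += 1
    return count


def consumer_handler(event, context=None):
    """Kinesis-triggered Lambda: record data arrives base64-encoded."""
    decoded = []
    for rec in event["Records"]:
        decoded.append(json.loads(base64.b64decode(rec["kinesis"]["data"])))
    return decoded  # a real handler would process each item here, in order
```

Ordering is only preserved within a shard, which is why everything is sent with the same partition key in this sketch.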

No, this is one of the things I'd really like to see Lambda support, but currently it does not. One of the problems is that if there were a lot of S3 PUT operations happening AWS would have to queue up all the Lambda invocations somehow, and there is currently no support for that.
If you built a locking mechanism into your Lambda function, what would you do with the requests you don't process due to a lock? Would you just throw those S3 notifications away?
The solution most people recommend is to have S3 send the notifications to an SQS queue, and then have your Lambda function scheduled to run periodically, like once a minute, and check if there is an item in the queue that needs to be processed.
Alternatively, have S3 send the notifications to SQS and just have a t2.nano EC2 instance with a single-threaded service polling the queue.

I know this is an old thread, but I ran across it trying to figure out how to make sure my time sequenced SQS messages were processed in order coming out of a FIFO queue and not getting processed simultaneously/out-of-order via multiple Lambda threads running.
Per the documentation:
For FIFO queues, Lambda sends messages to your function in the order that it receives them. When you send a message to a FIFO queue, you specify a message group ID. Amazon SQS ensures that messages in the same group are delivered to Lambda in order. Lambda sorts the messages into groups and sends only one batch at a time for a group. If your function returns an error, the function attempts all retries on the affected messages before Lambda receives additional messages from the same group.
Your function can scale in concurrency to the number of active message groups.
Link: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
So essentially, as long as you use a FIFO queue and submit your messages that need to stay in sequence with the same MessageGroupID, SQS/Lambda automatically handles the sequencing without any additional settings necessary.
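On the producer side that might look like the sketch below, assuming `sqs` is a boto3 SQS client and the queue URL points at a FIFO queue. `send_in_order` is an illustrative name, and hashing the body for the deduplication ID is just one possible choice.

```python
import hashlib


def send_in_order(sqs, queue_url: str, group_id: str, bodies) -> int:
    """Send messages to a FIFO queue so they stay ordered within one group."""
    sent = 0
    for body in bodies:
        sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=body,
            MessageGroupId=group_id,  # same group => delivered in order
            # Needed unless content-based deduplication is enabled on the queue.
            MessageDeduplicationId=hashlib.sha256(body.encode("utf-8")).hexdigest(),
        )
        sent += 1
    return sent
```

Note that deriving the deduplication ID from the body means two identical bodies sent within the deduplication window count as duplicates, so use a unique ID per message if repeats are legitimate.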

Have the S3 "Put events" cause a message to be placed on the queue (instead of involving a lambda function). The message should contain a reference to the S3 object. Then SCHEDULE a lambda to "SHORT POLL the entire queue".
PS: S3 events cannot trigger a Kinesis Stream... only SQS, SNS, Lambda (see http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html#supported-notification-destinations). Kinesis Streams are expensive and used for real-time event handling.
