Has anyone else solved the following problem?
I have an SNS topic that receives events from S3, and a Lambda function subscribed to that topic. When thousands of events are published to the topic, the Lambda function is throttled because it exceeds the concurrency limit. I don't want to request a limit increase for concurrent executions; I would rather reduce how many messages are consumed from the topic concurrently, but I couldn't find any information on how to do that. Thanks.
A couple of options regarding SNS:
1) SNS Maximum Receive Rate
Set the SNS Maximum Receive Rate. This will throttle the SNS messages sent to a subscriber, but may not be a great option if you have so many messages that they will be discarded before they can be processed. From the documentation:
You can set the maximum number of messages per second that Amazon SNS
sends to a subscribed endpoint by setting the Maximum receive rate
setting. Amazon SNS holds messages that are awaiting delivery for up
to an hour. Messages held for more than an hour are discarded.
If you're only getting thousands of events at a time, setting the Maximum Receive Rate to Lambda's default concurrent execution limit of '100' might be worth a try.
As @kndrk notes, this throttling is currently only available for HTTP/HTTPS subscribers to the SNS topic. To work around this, you can expose your Lambda function via AWS API Gateway and subscribe that endpoint to the SNS topic, rather than subscribing the Lambda function directly.
2) Process from SQS
Subscribe an SQS queue to the SNS topic and process messages from the queue, rather than directly from the SNS topic. A single invocation of SQS ReceiveMessage returns at most 10 messages, so the consumption rate may be easier for you to throttle.
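A minimal sketch of how option 2 could be wired up with boto3, assuming the queue and function already exist. The ARNs and the function name are placeholders, the batch size of 10 just mirrors the ReceiveMessage limit mentioned above, and the `ScalingConfig` concurrency cap is an event source mapping setting that you should confirm is available for your account before relying on it. The parameter builders are plain functions so they can be checked without AWS credentials.

```python
def sqs_subscription_params(topic_arn: str, queue_arn: str) -> dict:
    """Arguments for sns.subscribe(): deliver topic messages to an SQS queue."""
    return {"TopicArn": topic_arn, "Protocol": "sqs", "Endpoint": queue_arn}


def event_source_mapping_params(queue_arn: str, function_name: str,
                                batch_size: int = 10,
                                max_concurrency: int = 5) -> dict:
    """Arguments for lambda.create_event_source_mapping().

    max_concurrency is an illustrative value, not a recommendation.
    """
    return {
        "EventSourceArn": queue_arn,
        "FunctionName": function_name,
        "BatchSize": batch_size,
        "ScalingConfig": {"MaximumConcurrency": max_concurrency},
    }


def apply(topic_arn: str, queue_arn: str, function_name: str) -> None:
    """Make the two API calls; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without it

    boto3.client("sns").subscribe(
        **sqs_subscription_params(topic_arn, queue_arn))
    boto3.client("lambda").create_event_source_mapping(
        **event_source_mapping_params(queue_arn, function_name))
```

The queue also needs an access policy that lets the topic send messages to it; that step is left out here.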
It is also worth noting that you can publish S3 Events directly to AWS Lambda.
Related
I have a simple lambda app that is not in production right now, only being used for testing and debugging. The function sends a message to SQS to perform CRUD operations on an external application. I've set this function to be invoked by SQS when it receives a message, so the same function is sending and receiving.
I've just received an email saying I've used over 85% of my free tier SQS requests quota, or over 850,000 requests in just the past 2 weeks. I'm certain these requests are not messages being sent to queue, or received. The number of sends/receives has to be under 1000 for how often I've used this app. I've also verified using SQS monitoring that there are no messages stuck in queue. And the number of sent messages is more or less what I expected, a low number.
Like I said this app is only being used by myself for testing, a few days per week. Where does the 850,000+ requests come from?
Amazon SQS is charged at $0.40 per million API calls. Calls include send, receive and delete, so it is possible that a message might use 3+ API calls.
From AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources | AWS News Blog:
There are no additional charges for this feature, but because the Lambda service is continuously long-polling the SQS queue the account will be charged for those API calls at the standard SQS pricing rates.
Long-polling takes 20 seconds, which makes 4320 polls per day. This equates to 60,480 over two weeks or 129,600 per month. Admittedly, it would be more if messages are flowing, since long polling exits whenever there are messages.
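The arithmetic above can be reproduced in a few lines. This assumes a single poller issuing back-to-back 20-second long polls on an idle queue, which is the simplification used in this answer; the price is the $0.40 per million figure quoted earlier.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def empty_polls(days: int, poll_seconds: int = 20) -> int:
    """Back-to-back long polls that fit into `days` on an idle queue."""
    return days * SECONDS_PER_DAY // poll_seconds


def cost_usd(requests: int, price_per_million: float = 0.40) -> float:
    """Cost of `requests` API calls at the quoted SQS rate."""
    return requests / 1_000_000 * price_per_million


print(empty_polls(1))    # 4320
print(empty_polls(14))   # 60480
print(empty_polls(30))   # 129600
```

Even a full month of idle polling under this assumption stays well below the 850,000 requests reported in the question, which is why something else must be generating calls.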
So, either the queue is being used a lot (and you are getting excellent value for your $0.40) or you have something else generating lots of SQS API calls.
If you use the same function for sending to SQS and receiving from SQS, it means that:
Lambda sends a message to SQS -> SQS receives the message -> SQS triggers Lambda -> Lambda sends a message to SQS
And... it's an infinite loop :)
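One way to break such a loop, sketched under assumptions: carry a hop counter in a message attribute and stop re-sending once it reaches a limit. The attribute name `hops`, the `MAX_HOPS` value and the injected `send` callable are all hypothetical; in a real function `send` would wrap the SQS send call and a Lambda handler would call `process`.

```python
from typing import Callable

MAX_HOPS = 1  # hypothetical limit: a message may be re-enqueued once


def process(event: dict, send: Callable[[str, int], None]) -> int:
    """Handle an SQS-triggered event; return how many messages were re-sent."""
    sent = 0
    for record in event.get("Records", []):
        attrs = record.get("messageAttributes", {})
        # Messages without the attribute count as hop 0.
        hops = int(attrs.get("hops", {}).get("stringValue", "0"))
        if hops >= MAX_HOPS:
            continue  # stop the chain instead of feeding the queue again
        send(record["body"], hops + 1)
        sent += 1
    return sent
```

Splitting the sender and the receiver into two functions avoids the problem entirely and is probably the simpler fix.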
We are evaluating SNS for our messaging requirements to integrate multiple applications. We have a single producer that publishes messages to multiple topics on SNS. Each topic has 2-5 subscribers. For the case of subscriber failures (for example, a subscriber that is down for maintenance), I have a few questions about the recommended strategy of using one SQS queue per consumer:
Is it possible to configure SNS to push to SQS only in event of failure in delivering the message to a subscriber? Dumping all the messages in SQS queue creates a problem for the consumer to analyze all messages in the queue when it restarts.
In event of subscriber failure, it can read messages from SQS queue on restart but how would it know that it missed messages from SNS when it was overloaded?
Any suggestions on handling subscriber failures are welcome.
Thanks!
No, it is not possible to "configure SNS to push to SQS only in event of failure".
Rather than trying to recover a message after a failure, you can configure the Amazon SNS retry policies.
From Setting Amazon SNS Delivery Retry Policies for HTTP/HTTPS Endpoints:
You can use delivery policies to control not only the total number of retries, but also the time delay between each retry. You can specify up to 100 total retries distributed among four discrete phases. The maximum lifetime of a message in the system is one hour. This one hour limit cannot be extended by a delivery policy.
So, you don't need to worry as long as the destination is back online within an hour.
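A sketch of what such a delivery policy might look like for an HTTP/HTTPS subscription. The field names follow the `healthyRetryPolicy` structure of SNS delivery policies; the default numbers are illustrative only, and the checks encode just the limits mentioned above (at most 100 retries, split across the phases).

```python
import json

ALLOWED_BACKOFF = {"linear", "arithmetic", "geometric", "exponential"}


def http_delivery_policy(num_retries: int = 10, no_delay: int = 2,
                         min_delay_retries: int = 2,
                         max_delay_retries: int = 2,
                         min_delay_s: int = 5, max_delay_s: int = 600,
                         backoff: str = "exponential") -> dict:
    """Build a delivery policy; retries not assigned to the no-delay,
    min-delay or max-delay phase fall into the backoff phase."""
    if num_retries > 100:
        raise ValueError("at most 100 total retries are allowed")
    if no_delay + min_delay_retries + max_delay_retries > num_retries:
        raise ValueError("phase counts exceed the total number of retries")
    if backoff not in ALLOWED_BACKOFF:
        raise ValueError(f"unknown backoff function: {backoff}")
    return {"healthyRetryPolicy": {
        "numRetries": num_retries,
        "numNoDelayRetries": no_delay,
        "numMinDelayRetries": min_delay_retries,
        "numMaxDelayRetries": max_delay_retries,
        "minDelayTarget": min_delay_s,
        "maxDelayTarget": max_delay_s,
        "backoffFunction": backoff,
    }}


def apply(subscription_arn: str, policy: dict) -> None:
    """Attach the policy to a subscription; needs boto3 and credentials."""
    import boto3  # lazy import keeps the builder usable without boto3

    boto3.client("sns").set_subscription_attributes(
        SubscriptionArn=subscription_arn,
        AttributeName="DeliveryPolicy",
        AttributeValue=json.dumps(policy),
    )
```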
If it is likely to be offline for more than an hour, you will need to find a way to store and "replay" the messages, possibly by inspecting CloudWatch Logs.
Or, here's another idea...
Push initially to SQS. Have an AWS Lambda function triggered by SQS. The Lambda function can do the 'push' that would normally be done by SNS. If it fails, the message becomes visible again once the standard SQS visibility timeout expires and is retried later, eventually going to a Dead Letter Queue if one is configured.
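A minimal sketch of that idea, assuming the event source mapping has partial batch responses enabled (`ReportBatchItemFailures`) and the queue has a redrive policy pointing at a dead-letter queue. The `deliver` callable is a hypothetical stand-in for whatever push the consumer expects, for example an HTTP POST.

```python
from typing import Callable


def make_handler(deliver: Callable[[str], None]):
    """Return a Lambda-style handler that pushes each SQS message body."""

    def handler(event: dict, context=None) -> dict:
        failures = []
        for record in event.get("Records", []):
            try:
                deliver(record["body"])
            except Exception:
                # Reporting the id leaves this message on the queue, so it
                # is retried after the visibility timeout expires.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

Messages that keep failing are moved to the dead-letter queue by SQS once the queue's maximum receive count is reached, so nothing is lost while the subscriber is down.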
I have a Lambda function that is triggered by an SNS topic. What would happen to the messages being published to the SNS topic if Lambda reaches its limit of maximum concurrent executions and is not able to scale further?
For example, consider a situation where my SNS topic is receiving 1000 messages per second but Lambda is able to scale only up to processing 600 messages per second. From what I understand about SNS, it is a pub/sub mechanism and there can be no backlog in it (unlike SQS, Kinesis etc.). So what will happen to the extra 400 messages per second?
Also, how can I monitor if my Lambda is able to process at the rate at which SNS is receiving messages?
To answer your first question, you need to understand the retry behavior of AWS Lambda. See the following quote from the documentation.
Asynchronous invocation – Asynchronous events are queued before being
used to invoke the Lambda function. If AWS Lambda is unable to fully
process the event, it will automatically retry the invocation twice,
with delays between retries. If you have specified a Dead Letter Queue
for your function, then the failed event is sent to the specified
Amazon SQS queue or Amazon SNS topic. If you don't specify a Dead
Letter Queue (DLQ), which is not required and is the default setting,
then the event will be discarded. For more information, see Dead
Letter Queues.
To answer your second question:
You could use AWS CloudWatch.
Two metrics are of interest here:
AWS/Lambda - Invocations
AWS/SNS - NumberOfMessagesPublished
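A rough sketch of how the two metrics could be compared with boto3. The topic and function names are placeholders, and the comparison is only an approximation, since Lambda retries and duplicate deliveries also count as invocations. The `Throttles` metric in the `AWS/Lambda` namespace is worth checking alongside these two.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def metric_query(namespace: str, metric: str, dim_name: str, dim_value: str,
                 minutes: int = 60, period: int = 60,
                 now: Optional[datetime] = None) -> dict:
    """Arguments for cloudwatch.get_metric_statistics() (per-period sums)."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": namespace,
        "MetricName": metric,
        "Dimensions": [{"Name": dim_name, "Value": dim_value}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Sum"],
    }


def shortfall(published_points: list, invoked_points: list) -> float:
    """Messages published minus invocations over the same window."""
    published = sum(p["Sum"] for p in published_points)
    invoked = sum(p["Sum"] for p in invoked_points)
    return published - invoked


def fetch(query: dict) -> list:
    """Run one query; needs boto3 and AWS credentials."""
    import boto3  # lazy import so the helpers above work without it

    return boto3.client("cloudwatch").get_metric_statistics(**query)["Datapoints"]
```

For example, `fetch(metric_query("AWS/SNS", "NumberOfMessagesPublished", "TopicName", "my-topic"))` and `fetch(metric_query("AWS/Lambda", "Invocations", "FunctionName", "my-function"))` would give the two datapoint lists to pass to `shortfall`.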
From AWS SNS FAQ page (Reliability section) I can see that SNS guarantees at-least-once message delivery to SQS, but it's not clear whether the same is applicable when notification is sent to Lambda.
So the question is: does SNS provide at-least-once message delivery when a message is sent to AWS Lambda?
Q. What happens to Amazon SNS messages if the subscribing endpoint is not available?
Lambda: If Lambda is not available, SNS will retry 2 times at 1
seconds apart, then 10 times exponentially backing off from 1 seconds
to 20 minutes and finally 38 times every 20 minutes for a total 50
attempts over more than 13 hours before the message is discarded from
SNS.
https://aws.amazon.com/sns/faqs/
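The numbers in the quoted FAQ entry can be sanity-checked with a short calculation. The FAQ does not state the exact backoff curve, so the geometric growth from 1 second to 20 minutes used here is an assumption; only the attempt counts and the two fixed delays come from the quote.

```python
from typing import List


def lambda_retry_delays() -> List[float]:
    """Delay in seconds before each of the 50 attempts described in the FAQ."""
    immediate = [1.0] * 2                       # 2 attempts, 1 second apart
    ratio = 1200 ** (1 / 9)                     # assumed: 10 steps from 1 s to 1200 s
    backoff = [ratio ** i for i in range(10)]   # 10 attempts with growing delay
    steady = [1200.0] * 38                      # 38 attempts, 20 minutes apart
    return immediate + backoff + steady


delays = lambda_retry_delays()
total_hours = sum(delays) / 3600
print(len(delays))  # 50
print(total_hours)
```

Under this assumption the total comes out a little above 13 hours, consistent with the "more than 13 hours" in the quote.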
The same FAQ page states that
When a message is published to a topic, Amazon SNS will attempt to deliver notifications to all subscribers registered for that topic. Due to potential Internet issues or Email delivery restrictions, sometimes the notification may not successfully reach an HTTP or Email end-point. In the case of HTTP, an SNS Delivery Policy can be used to control the retry pattern (linear, geometric, exponential backoff), maximum and minimum retry delays, and other parameters. If it is critical that all published messages be successfully processed, developers should have notifications delivered to an SQS queue (in addition to notifications over other transports).
This applies to all non-SQS subscribers, which includes Lambda.
I think your question is: will a Lambda function, subscribed to an SNS topic, be invoked at least once or only once for any message?
If so, the answer is: at least once. The following part from the FAQs helps answer this, with emphasis mine:
Q: How many times will a subscriber receive each message?
Although most of the time each message will be delivered to your application exactly once, the distributed nature of Amazon SNS and transient network conditions could result in occasional, duplicate messages at the subscriber end. Developers should design their applications such that processing a message more than once does not create any errors or inconsistencies.
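Since duplicates are possible, the handler should be idempotent. A minimal sketch, assuming the SNS message ID is a good enough deduplication key: the in-memory `seen` set is only a stand-in for a durable store (a database table, for instance), because Lambda containers do not share memory, and `process` is a hypothetical callable for the real work.

```python
from typing import Callable, Set


def make_handler(process: Callable[[str], None], seen: Set[str]):
    """Return a Lambda-style handler that skips already-seen SNS messages."""

    def handler(event: dict, context=None) -> int:
        handled = 0
        for record in event.get("Records", []):
            sns = record["Sns"]
            if sns["MessageId"] in seen:
                continue  # duplicate delivery, already processed
            process(sns["Message"])
            # Record the id only after processing succeeded, so a failed
            # attempt can still be retried.
            seen.add(sns["MessageId"])
            handled += 1
        return handled

    return handler
```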
We're implementing the AWS fanout pattern using an SNS topic with multiple SQS queue subscribers. I was wondering what would happen if I successfully publish a message to an SNS topic but SNS fails (for some reason) to forward it to the queue(s). Will SNS retry, and if so, is there a way to control this?
I found this page http://docs.aws.amazon.com/sns/latest/dg/DeliveryPolicies.html that talks about configuring retry policies for SNS HTTP/HTTPS endpoints but there's nothing on SQS.
I'm a member of the SNS development team. For the SQS deliveries, SNS will retry thousands of times before giving up. Except in the case of catastrophic multi-day outages, message deliveries to SQS are guaranteed.
In practice, AWS guarantees the delivery of messages from SNS to SQS. Since the question asks the following, I want to point out one case in which delivery to SQS can fail.
I was wondering what would happen if I successfully publish a message
on a SNS topic but it fails (for some reason) to forward it to the queue(s).
This can happen only if the SQS queue we are pointing to has been deleted (or is being deleted) by some process. According to the SNS FAQs, if the SQS queue is unavailable, SNS will retry a fixed number of times before the message is discarded.
SQS: If a SQS queue is not available, SNS will retry 10 times
immediately, then 100,000 times every 20 seconds for a total of
100,010 attempts over more than 23 days before the message is
discarded from SNS.
This was pointed out by @grigori in a comment.
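The figures in the FAQ quote above are easy to check. This only uses the numbers from the quote (10 immediate attempts, then 100,000 attempts spaced 20 seconds apart) and treats the immediate attempts as taking no time.

```python
IMMEDIATE_ATTEMPTS = 10
SPACED_ATTEMPTS = 100_000
INTERVAL_SECONDS = 20

total_attempts = IMMEDIATE_ATTEMPTS + SPACED_ATTEMPTS
total_days = SPACED_ATTEMPTS * INTERVAL_SECONDS / 86_400  # seconds per day

print(total_attempts)        # 100010
print(round(total_days, 2))  # 23.15
```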
SQS and SNS were a single team back in 2013 and 2014, when I was a developer on that team. As George Lin has pointed out, the SNS backend responsible for delivery does retry thousands of times for SQS queues (the exact number was 100,000 but might have changed). There is an implicit backoff policy. On the first attempt alone, there will be several thousand retries in a very tight loop with no delay. If that fails, a backoff policy gradually spaces out further delivery attempts over time. Generally speaking, deliveries to SQS endpoints rarely fail.
AWS guarantees delivery to SQS, so you don't need to worry about it:
Q: Will Amazon SNS guarantee that messages will be delivered to the subscribed end-point?
When a message is published to a topic, Amazon SNS will attempt to
deliver notifications to all subscribers registered for that topic.
Due to potential Internet issues or Email delivery restrictions,
sometimes the notification may not successfully reach an HTTP or Email
end-point. In the case of HTTP, an SNS Delivery Policy can be used to
control the retry pattern (linear, geometric, exponential backoff),
maximum and minimum retry delays, and other parameters. If it is
critical that all published messages be successfully processed,
developers should have notifications delivered to an SQS queue (in
addition to notifications over other transports).
http://aws.amazon.com/sns/faqs/