We're implementing the AWS fanout pattern using an SNS topic with multiple SQS queue subscribers. I was wondering what would happen if I successfully publish a message on a SNS topic but it fails (for some reason) to forward it to the queue(s). Will SNS retry, and if so, is there a way to control this?
I found this page http://docs.aws.amazon.com/sns/latest/dg/DeliveryPolicies.html that talks about configuring retry policies for SNS HTTP/HTTPS endpoints but there's nothing on SQS.
I'm a member of the SNS development team. For the SQS deliveries, SNS will retry thousands of times before giving up. Except in the case of catastrophic multi-day outages, message deliveries to SQS are guaranteed.
In practice, AWS guarantees delivery of messages from SNS to SQS. Since the question asks the following, I want to point out one case where delivery to SQS can fail.
I was wondering what would happen if I successfully publish a message
on a SNS topic but it fails (for some reason) to forward it to the queue(s).
This can happen only if the SQS queue we are pointing to has been deleted (or is being deleted) by some process. According to the SNS FAQs, if the queue is unavailable, SNS retries a fixed number of times before the message is discarded.
SQS: If a SQS queue is not available, SNS will retry 10 times
immediately, then 100,000 times every 20 seconds for a total of
100,010 attempts over more than 23 days before the message is
discarded from SNS.
This was pointed out by @grigori in a comment.
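The figures in the quoted FAQ entry can be sanity-checked with a little arithmetic. This is only a sketch: the constants are copied from the quote above (they may have changed since), the function name is made up for illustration, and the 10 immediate retries are assumed to take negligible wall-clock time.

```python
# Constants taken from the FAQ text quoted above; they may have changed since.
IMMEDIATE_RETRIES = 10
DELAYED_RETRIES = 100_000
DELAY_SECONDS = 20


def sqs_retry_window():
    """Return (total_attempts, days) for the quoted SNS-to-SQS retry schedule."""
    total_attempts = IMMEDIATE_RETRIES + DELAYED_RETRIES
    # Assumption: the immediate retries add no meaningful wall-clock time.
    total_seconds = DELAYED_RETRIES * DELAY_SECONDS
    return total_attempts, total_seconds / 86_400  # 86,400 seconds per day


attempts, days = sqs_retry_window()
print(f"{attempts} attempts over about {days:.1f} days")
```

The delayed phase alone spans 2,000,000 seconds, which is where the "more than 23 days" in the quote comes from.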
SQS and SNS were a single team back in 2013 and 2014, when I was a developer on that team. As George Lin has pointed out, the SNS backend responsible for delivery does retry thousands of times for SQS queues (the exact number was 100,000 but might have changed). There is an implicit backoff policy. On the first attempt there are several thousand retries in a very tight loop with no delay. If those fail, a backoff policy gradually spreads further delivery attempts over time. Generally speaking, deliveries to SQS endpoints rarely fail.
AWS guarantees delivery to SQS, so you don't need to worry about it:
Q: Will Amazon SNS guarantee that messages will be delivered to the subscribed end-point?
When a message is published to a topic, Amazon SNS will attempt to
deliver notifications to all subscribers registered for that topic.
Due to potential Internet issues or Email delivery restrictions,
sometimes the notification may not successfully reach an HTTP or Email
end-point. In the case of HTTP, an SNS Delivery Policy can be used to
control the retry pattern (linear, geometric, exponential backoff),
maximum and minimum retry delays, and other parameters. If it is
critical that all published messages be successfully processed,
developers should have notifications delivered to an SQS queue (in
addition to notifications over other transports).
http://aws.amazon.com/sns/faqs/
Related
I'm wondering what would happen if there was an SNS topic having messages written to it, but for a period of time, there is no SQS queue. Let's say there was a container which normally was subscribed to the SNS topic to handle such messages, but it crashed and burned and spent 10 minutes getting resurrected; what would happen to any messages written to that topic, during which there is no queue? Do they disappear forever, or do they wait politely until some queue comes along, subscribes and picks up said messages?
They disappear forever.
SNS cannot know that some subscriber wants to subscribe but simply cannot right now. The topic either has subscribers or it does not. All current subscribers get the message; future ones do not.
If you have a subscriber but delivery fails, there is some SNS-specific behaviour with regard to retries: https://docs.aws.amazon.com/sns/latest/dg/sns-message-delivery-retries.html
If the subscriber fails to get the message, a retry mechanism in SNS kicks in as explained in the AWS docs:
When the delivery policy is exhausted, Amazon SNS stops retrying the delivery and discards the message—unless a dead-letter queue is attached to the subscription.
For an SQS subscriber, SNS can retry up to 100,015 times over 23 days.
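The dead-letter queue mentioned in the quote is attached per subscription. Below is a minimal sketch of how that might look, assuming `sns_client` behaves like `boto3.client("sns")`; the function name and the ARNs you pass in are placeholders, not anything from the question.

```python
import json


def attach_dead_letter_queue(sns_client, subscription_arn, dlq_arn):
    """Attach an SQS dead-letter queue to an SNS subscription.

    `sns_client` is assumed to behave like boto3.client("sns").
    """
    # The redrive policy is a small JSON document naming the DLQ's ARN.
    redrive_policy = json.dumps({"deadLetterTargetArn": dlq_arn})
    sns_client.set_subscription_attributes(
        SubscriptionArn=subscription_arn,
        AttributeName="RedrivePolicy",
        AttributeValue=redrive_policy,
    )
    return redrive_policy
```

With boto3 installed this would be called as `attach_dead_letter_queue(boto3.client("sns"), subscription_arn, dlq_arn)`. The dead-letter queue also needs an access policy that lets SNS send messages to it, which is not shown here.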
If the SQS queue goes down, the message won't disappear. Let's discuss this scenario:
Retry policy: Let's say you set "Number of retries" to n and "Retry-backoff function" to Linear (you can select any other retry-backoff function) on the SNS topic. Then, if SQS is not available, SNS will retry sending the message to the subscriber (SQS) n times, following the "Retry-backoff function".
But if you set "Number of retries" to 0, the message is discarded from the SNS topic immediately if the subscriber (SQS) is not available.
We are evaluating SNS for our messaging requirements to integrate multiple applications. We have a single producer that publishes messages to multiple topics on SNS. Each topic has 2-5 subscribers. In the event of subscriber failures (down for maintenance), I have a few questions on the recommended strategy of using SQS queues per consumer:
Is it possible to configure SNS to push to SQS only in event of failure in delivering the message to a subscriber? Dumping all the messages in SQS queue creates a problem for the consumer to analyze all messages in the queue when it restarts.
In the event of subscriber failure, the consumer can read messages from the SQS queue on restart, but how would it know that it missed messages from SNS while it was overloaded?
Any suggestions on handling subscriber failures are welcome.
Thanks!
No, it is not possible to "configure SNS to push to SQS only in event of failure".
Rather than trying to recover a message after a failure, you can configure the Amazon SNS retry policies.
From Setting Amazon SNS Delivery Retry Policies for HTTP/HTTPS Endpoints:
You can use delivery policies to control not only the total number of retries, but also the time delay between each retry. You can specify up to 100 total retries distributed among four discrete phases. The maximum lifetime of a message in the system is one hour. This one hour limit cannot be extended by a delivery policy.
So, you don't need to worry as long as the destination is back online within an hour.
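For illustration, here is a rough sketch of building such a delivery policy for an HTTP/HTTPS subscription. The key names follow the `healthyRetryPolicy` structure as I understand it from the SNS documentation, and the 100-retry cap comes from the quote above; the function name and its validation rules are my own, so check them against the current docs before relying on them.

```python
import json

BACKOFF_FUNCTIONS = {"linear", "arithmetic", "geometric", "exponential"}


def build_delivery_policy(num_retries, min_delay, max_delay,
                          no_delay=0, min_delay_retries=0,
                          max_delay_retries=0, backoff="linear"):
    """Build a subscription-level DeliveryPolicy JSON string (HTTP/HTTPS only)."""
    if not 0 <= num_retries <= 100:
        raise ValueError("numRetries must be between 0 and 100")
    if no_delay + min_delay_retries + max_delay_retries > num_retries:
        raise ValueError("per-phase retries cannot exceed numRetries")
    if min_delay > max_delay:
        raise ValueError("minDelayTarget cannot exceed maxDelayTarget")
    if backoff not in BACKOFF_FUNCTIONS:
        raise ValueError(f"unknown backoff function: {backoff}")
    policy = {
        "healthyRetryPolicy": {
            "numRetries": num_retries,
            "numNoDelayRetries": no_delay,            # phase 1
            "minDelayTarget": min_delay,              # seconds
            "numMinDelayRetries": min_delay_retries,  # phase 2
            "maxDelayTarget": max_delay,              # seconds
            "numMaxDelayRetries": max_delay_retries,  # phase 4
            "backoffFunction": backoff,               # shape of phase 3
        }
    }
    return json.dumps(policy)


print(build_delivery_policy(10, min_delay=5, max_delay=60, no_delay=2))
```

The resulting string would be passed as the `AttributeValue` for the `DeliveryPolicy` attribute in `set_subscription_attributes`. Retries not assigned to phases 1, 2, or 4 fall into the backoff phase.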
If it is likely to be offline for more than an hour, you will need to find a way to store and "replay" the messages, possibly by inspecting CloudWatch Logs.
Or, here's another idea...
Push initially to SQS. Have an AWS Lambda function triggered by SQS. The Lambda function can do the 'push' that would normally be done by SNS. If it fails, then the standard SQS invisibility process will retry it later, eventually going to a Dead Letter Queue.
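A minimal sketch of that idea, assuming the Lambda function is wired to the queue through an SQS event source mapping. `deliver` is a hypothetical stand-in for whatever "push" the real subscriber needs, and the `simulate_failure` field exists only so the failure path can be exercised.

```python
import json


def deliver(message_body):
    """Hypothetical stand-in for the real push to the downstream subscriber."""
    payload = json.loads(message_body)
    if payload.get("simulate_failure"):
        raise RuntimeError("downstream subscriber unavailable")
    return payload


def handler(event, context):
    """Lambda entry point for an SQS event source mapping."""
    delivered = []
    for record in event["Records"]:
        # An uncaught exception fails the batch: the messages are not deleted
        # and become visible again once the visibility timeout expires.
        delivered.append(deliver(record["body"]))
    return {"delivered": len(delivered)}
```

For messages to end up in a dead-letter queue, the source queue needs a redrive policy with a `maxReceiveCount`; that configuration is outside this sketch.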
From the AWS SNS FAQ page (Reliability section) I can see that SNS guarantees at-least-once message delivery to SQS, but it's not clear whether the same applies when a notification is sent to Lambda.
So the question is: does SNS provide at-least-once message delivery when a message is sent to AWS Lambda?
Q. What happens to Amazon SNS messages if the subscribing endpoint is not available?
Lambda: If Lambda is not available, SNS will retry 2 times at 1
seconds apart, then 10 times exponentially backing off from 1 seconds
to 20 minutes and finally 38 times every 20 minutes for a total 50
attempts over more than 13 hours before the message is discarded from
SNS.
https://aws.amazon.com/sns/faqs/
The same FAQ page states that
When a message is published to a topic, Amazon SNS will attempt to deliver notifications to all subscribers registered for that topic. Due to potential Internet issues or Email delivery restrictions, sometimes the notification may not successfully reach an HTTP or Email end-point. In the case of HTTP, an SNS Delivery Policy can be used to control the retry pattern (linear, geometric, exponential backoff), maximum and minimum retry delays, and other parameters. If it is critical that all published messages be successfully processed, developers should have notifications delivered to an SQS queue (in addition to notifications over other transports).
This applies to all non-SQS subscribers.
I think your question is: will a Lambda function, subscribed to an SNS topic, be invoked at least once or only once for any message?
If so, the answer is: at least once. The following part from the FAQs helps answer this, with emphasis mine:
Q: How many times will a subscriber receive each message?
Although most of the time each message will be delivered to your application exactly once, the distributed nature of Amazon SNS and transient network conditions could result in occasional, duplicate messages at the subscriber end. Developers should design their applications such that processing a message more than once does not create any errors or inconsistencies.
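One common way to follow that advice is to deduplicate on the SNS message ID. The sketch below assumes the usual SNS-to-Lambda event shape (`Records[].Sns.MessageId` and `Records[].Sns.Message`). The in-memory set is only a stand-in for a durable store, since Lambda containers do not share memory, and `process` is hypothetical business logic.

```python
# Assumption: an in-memory set stands in for a durable store (for example a
# database table with a uniqueness constraint on the message ID).
_seen_message_ids = set()
processed = []


def process(message):
    """Hypothetical business logic; here it just records the message."""
    processed.append(message)


def handler(event, context):
    """Lambda entry point for an SNS subscription; skips duplicate deliveries."""
    handled = 0
    for record in event["Records"]:
        sns = record["Sns"]
        message_id = sns["MessageId"]
        if message_id in _seen_message_ids:
            continue  # already handled this message on an earlier delivery
        process(sns["Message"])
        _seen_message_ids.add(message_id)
        handled += 1
    return handled
```

Calling `handler` twice with the same event processes the message once and ignores it the second time.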
Has anyone else solved the following problem?
I have an SNS topic filled with events from S3, and a Lambda function subscribed to this topic. When thousands of events are put on the topic, the Lambda function is throttled because it exceeds the concurrency limit. I don't want to request a limit increase for concurrent executions; I would rather reduce the rate at which the function consumes from the topic, but I couldn't find information on how to do that. Thanks.
A couple of options regarding SNS:
1) SNS Maximum Receive Rate
Set the SNS Maximum Receive Rate. This will throttle the SNS messages sent to a subscriber, but may not be a great option if you have so many messages that they will be discarded before they can be processed. From the documentation:
You can set the maximum number of messages per second that Amazon SNS
sends to a subscribed endpoint by setting the Maximum receive rate
setting. Amazon SNS holds messages that are awaiting delivery for up
to an hour. Messages held for more than an hour are discarded.
If you're only getting thousands of events at a time, setting the Maximum Receive Rate to Lambda's default concurrent execution limit of '100' might be worth a try.
As @kndrk notes, this throttling is currently only available for HTTP/HTTPS subscribers to the SNS topic. To work around this, you can expose your Lambda function via AWS API Gateway and subscribe that endpoint to the SNS topic, rather than subscribing the Lambda function directly.
2) Process from SQS
Subscribe an SQS queue to the SNS topic and process messages from the queue, rather than directly from the SNS topic. A single invocation of SQS ReceiveMessage returns at most 10 messages, so that may be easier for you to throttle.
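A rough sketch of the queue-polling approach, assuming `sqs_client` behaves like `boto3.client("sqs")`. The function name, the `handle` callback, and the `max_batches` knob are illustrative choices, not part of any AWS API.

```python
def drain_queue(sqs_client, queue_url, handle, max_batches=1):
    """Receive up to 10 messages per call and delete each one after handling.

    `sqs_client` is assumed to behave like boto3.client("sqs").
    """
    handled = 0
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # per-call maximum for ReceiveMessage
            WaitTimeSeconds=20,      # long polling to avoid busy loops
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue is empty for now
        for message in messages:
            handle(message["Body"])
            # Delete only after successful handling; if handle() raises, the
            # message reappears after the visibility timeout.
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
    return handled
```

Because the consumer decides when to call `receive_message`, the processing rate is bounded by how often you poll and by `max_batches`, not by how fast SNS publishes.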
It is also worth noting that you can publish S3 Events directly to AWS Lambda.
I want to compare SNS to SQS regarding dequeuing/consumption of the message/topic.
Does an SNS topic get dequeued/consumed if there is at least one active consumer?
Does the SNS topic not get dequeued if there is no active consumer?
By consumer I mean any HTTP endpoint, Lambda function, etc.
I suppose you were referring to a message published to an SNS topic.
The Amazon SNS FAQs have an answer to your question.
Q: What happens to Amazon SNS messages if the subscribing endpoint is not available?
If a message cannot be successfully delivered on the first attempt, Amazon SNS executes a 4-phase retry policy: 1) retries with no delay in between attempts, 2) retries with minimum delay between attempts, 3) retries according to a back-off model, and 4) retries with maximum delay between attempts. When the message delivery retry policy is exhausted, Amazon SNS can move the message to a dead-letter queue (DLQ). For more information, see Message Delivery Retries and Amazon SNS Dead-Letter Queues.
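To make the four phases concrete, here is a small sketch that lays out the delay before each retry. It is not how SNS computes its schedule internally: the function and parameter names are invented, and the back-off phase is assumed to be linear, with delays evenly spaced strictly between the minimum and maximum.

```python
def retry_delays(no_delay, min_delay_retries, backoff_retries,
                 max_delay_retries, min_delay, max_delay):
    """Return the delay in seconds before each retry across the four phases."""
    delays = [0] * no_delay                      # phase 1: no delay
    delays += [min_delay] * min_delay_retries    # phase 2: minimum delay
    # Phase 3 (assumed linear): evenly spaced steps between min and max.
    step = (max_delay - min_delay) / (backoff_retries + 1)
    delays += [min_delay + step * (i + 1) for i in range(backoff_retries)]
    delays += [max_delay] * max_delay_retries    # phase 4: maximum delay
    return delays


schedule = retry_delays(no_delay=3, min_delay_retries=2, backoff_retries=3,
                        max_delay_retries=2, min_delay=1, max_delay=9)
print(schedule)
print(f"{len(schedule)} retries, {sum(schedule)} seconds of waiting in total")
```

With these sample numbers the delays run 0, 0, 0, then 1, 1, then 3, 5, 7, then 9, 9 seconds, one group per phase.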
I am not quite sure I understood your question.
If you are asking about the dequeue process for queues that are subscribed to an SNS topic: SNS plays no role once the items [messages] have been transferred to the subscribers.
SNS doesn't persist or cache the items itself; it tries to deliver the topic's messages to its subscribers (Lambda, SQS, email, etc.).
Please edit your question to be clear to get better responses.