The case I'm trying to face is - log the DLQ messages content, while avoiding ack the messages, in order to allow re-queue to original queue in the future.
I thought about subscribing two DLQs, when one is for logging, and second is for having the messages for re-queue, but its not possible
You have to consume the SQS messages from DLQ as they have limited retention period (4 days by default). Thus, if you not act on them withing this period they will expire and you will loose them.
Normally, you would setup a lambda function which will process messages from the DLQ. Subsequently, you can do what you wish with these messages, which can include:
re-broadcasting to the original queue
sending SNS notifications about failed messages
saving them into S3 or CloudWatch Logs for future analysis.
You can configure a processing lambda(Event Source mapping) to get invoked when message falls into the DLQ and log the message contents in your cloudwatch log.
But ideally people will write the DLQ message into a file with the processing lambda and store them into s3 for logging/ auditing purposes
I have a Lambda function that is triggered by an SNS topic. What would happen to the messages being published to the SNS topic if Lambda reaches its limit of maximum concurrent executions and is not able to scale further?
For example, consider a situation where my SNS topic is receiving 1000 messages per second but Lambda is able to scale only up to processing 600 messages per second. From what I understand about SNS, it is a pub/sub mechanism and there can be no backlog in it (unlike SQS, Kinesis etc.). So what will happen to the extra 400 messages per second?
Also, how can I monitor if my Lambda is able to process at the rate at which SNS is receiving messages?
To answer your first question you need to understand the retry behavior of AWS Lambda. Please see the following quote out of the documentation.
Asynchronous invocation – Asynchronous events are queued before being
used to invoke the Lambda function. If AWS Lambda is unable to fully
process the event, it will automatically retry the invocation twice,
with delays between retries. If you have specified a Dead Letter Queue
for your function, then the failed event is sent to the specified
Amazon SQS queue or Amazon SNS topic. If you don't specify a Dead
Letter Queue (DLQ), which is not required and is the default setting,
then the event will be discarded. For more information, see Dead
Letter Queues.
To answer your second question:
You could use AWS CloudWatch.
There are two metrics interesting for you:
AWS/Lambda - Invocations
AWS/SNS - NumberOfMessagesPublished
Has anyone else solved the following problem?
I have SNS topic filled with events from S3 and there is Lambda function which is subscribed on this topic and when thousand of events are put to this topic, lambda function is throttled because of exceeding the limit of concurrency. I don't want to request a limit increase for concurrent executions but I would decrease concurrent consuming from the topic, but I didn't find information how to do it. Thanks.
A couple of options regarding SNS:
1) SNS Maximum Receive Rate
Set the SNS Maximum Receive Rate. This will throttle the SNS messages sent to a subscriber, but may not be a great option if you have so many messages that they will be discarded before they can be processed. From the documentation:
You can set the maximum number of messages per second that Amazon SNS
sends to a subscribed endpoint by setting the Maximum receive rate
setting. Amazon SNS holds messages that are awaiting delivery for up
to an hour. Messages held for more than an hour are discarded.
If you're only getting thousands of events at a time, setting the Maximum Receive Rate to Lambda's default concurrent execution limit of '100' might be worth a try.
As #kndrk notes, this throttling is currently only available for HTTP/HTTPS subscribers to the SNS topic. To work around this, you can expose your lambda function via AWS API Gateway and subscribe that endpoint to the SNS topic, rather than the lambda function directly.
2) Process from SQS
Subscribe an SQS queue to the SNS topic and process messages from the queue, rather than directly from the sns topic. A single invokation of SQS ReceiveMessage can only handle 10 messages at a time, so that may be easier for you to throttle.
It is also worth noting that you can publish S3 Events directly to AWS Lambda.
We're implementing AWS fanout pattern using SNS topic with multiple SQS queue subscribers. I was wondering what would happen if I successfully publish a message on a SNS topic but it fails (for some reason) to forward it to the queue(s). Will SNS retry and if so, is there a way to control this.
I found this page http://docs.aws.amazon.com/sns/latest/dg/DeliveryPolicies.html that talks about configuring retry policies for SNS HTTP/HTTPS endpoints but there's nothing on SQS.
I'm a member of the SNS development team. For the SQS deliveries, SNS will retry thousands of times before giving up. Except in the case of catastrophic multi-day outages, message deliveries to SQS are guaranteed.
Practically, AWS guarantees the delivery of messages to SQS from SNS. Since the question quotes as below, I want to point out a reason where delivery to SQS can fail.
I was wondering what would happen if I successfully publish a message
on a SNS topic but it fails (for some reason) to forward it to the queue(s).
This can happen only if the SQS we are pointing to, has been deleted (or is being deleted) by some process. As taken from SNS Faqs, if the SQS is unavailable, it will retry for a number of tries until the message is discarded.
SQS: If a SQS queue is not available, SNS will retry 10 times
immediately, then 100,000 times every 20 seconds for a total of
100,010 attempts over more than 23 days before the message is
discarded from SNS.
This was pointed out by #grigori in a comment.
SQS and SNS were a single team back in 2013 and 2014 when I was a developer on that team. As George Lin has pointed out, the SNS backend responsible for delivery does retry thousands of times for SQS queues (exact number was 100,000 times but might have changed). There is an implicit back off policy. In the first attempt itself, there will be several thousand retries in a very tight loop with no delay. If that fails, there is back off policy to gradually attempt to deliver over time. Generally speaking, deliveries to SQS endpoints rarely fail.
AWS guarantees delivery to SQS, so you don't need to worry about it:
Q: Will Amazon SNS guarantee that messages will be delivered to the subscribed end-point?
When a message is published to a topic, Amazon SNS will attempt to
deliver notifications to all subscribers registered for that topic.
Due to potential Internet issues or Email delivery restrictions,
sometimes the notification may not successfully reach an HTTP or Email
end-point. In the case of HTTP, an SNS Delivery Policy can be used to
control the retry pattern (linear, geometric, exponential backoff),
maximum and minimum retry delays, and other parameters. If it is
critical that all published messages be successfully processed,
developers should have notifications delivered to an SQS queue (in
addition to notifications over other transports).
http://aws.amazon.com/sns/faqs/
When would I use SNS versus SQS, and why are they always coupled together?
SNS is a distributed publish-subscribe system. Messages are pushed to subscribers as and when they are sent by publishers to SNS.
SQS is distributed queuing system. Messages are not pushed to receivers. Receivers have to poll or pull messages from SQS. Messages can't be received by multiple receivers at the same time. Any one receiver can receive a message, process and delete it. Other receivers do not receive the same message later. Polling inherently introduces some latency in message delivery in SQS unlike SNS where messages are immediately pushed to subscribers. SNS supports several end points such as email, SMS, HTTP end point and SQS. If you want unknown number and type of subscribers to receive messages, you need SNS.
You don't have to couple SNS and SQS always. You can have SNS send messages to email, SMS or HTTP end point apart from SQS. There are advantages to coupling SNS with SQS. You may not want an external service to make connections to your hosts (a firewall may block all incoming connections to your host from outside).
Your end point may just die because of heavy volume of messages. Email and SMS maybe not your choice of processing messages quickly. By coupling SNS with SQS, you can receive messages at your pace. It allows clients to be offline, tolerant to network and host failures. You also achieve guaranteed delivery. If you configure SNS to send messages to an HTTP end point or email or SMS, several failures to send message may result in messages being dropped.
SQS is mainly used to decouple applications or integrate applications. Messages can be stored in SQS for a short duration of time (maximum 14 days). SNS distributes several copies of messages to several subscribers. For example, let’s say you want to replicate data generated by an application to several storage systems. You could use SNS and send this data to multiple subscribers, each replicating the messages it receives to different storage systems (S3, hard disk on your host, database, etc.).
Here's a comparison of the two:
Entity Type
SQS: Queue (Similar to JMS)
SNS: Topic (Pub/Sub system)
Message consumption
SQS: Pull Mechanism - Consumers poll and pull messages from SQS
SNS: Push Mechanism - SNS Pushes messages to consumers
Use Case
SQS: Decoupling two applications and allowing parallel asynchronous processing
SNS: Fanout - Processing the same message in multiple ways
Persistence
SQS: Messages are persisted for some (configurable) duration if no consumer is available (maximum two weeks), so the consumer does not have to be up when messages are added to queue.
SNS: No persistence. Whichever consumer is present at the time of message arrival gets the message and the message is deleted. If no consumers are available then the message is lost after a few retries.
Consumer Type
SQS: All the consumers are typically identical and hence process the messages in the exact same way (each message is processed once by one consumer, though in rare cases messages may be resent)
SNS: The consumers might process the messages in different ways
Sample applications
SQS: Jobs framework: The Jobs are submitted to SQS and the consumers at the other end can process the jobs asynchronously. If the job frequency increases, the number of consumers can simply be increased to achieve better throughput.
SNS: Image processing. If someone uploads an image to S3 then watermark that image, create a thumbnail and also send a Thank You email. In that case S3 can publish notifications to an SNS topic with three consumers listening to it. The first one watermarks the image, the second one creates a thumbnail and the third one sends a Thank You email. All of them receive the same message (image URL) and do their processing in parallel.
You can see SNS as a traditional topic which you can have multiple Subscribers. You can have heterogeneous subscribers for one given SNS topic, including Lambda and SQS, for example. You can also send SMS messages or even e-mails out of the box using SNS. One thing to consider in SNS is only one message (notification) is received at once, so you cannot take advantage from batching.
SQS, on the other hand, is nothing but a queue, where you store messages and subscribe one consumer (yes, you can have N consumers to one SQS queue, but it would get messy very quickly and way harder to manage considering all consumers would need to read the message at least once, so one is better off with SNS combined with SQS for this use case, where SNS would push notifications to N SQS queues and every queue would have one subscriber, only) to process these messages. As of Jun 28, 2018, AWS Supports Lambda Triggers for SQS, meaning you don't have to poll for messages any more.
Furthermore, you can configure a DLQ on your source SQS queue to send messages to in case of failure. In case of success, messages are automatically deleted (this is another great improvement), so you don't have to worry about the already processed messages being read again in case you forgot to delete them manually. I suggest taking a look at Lambda Retry Behaviour to better understand how it works.
One great benefit of using SQS is that it enables batch processing. Each batch can contain up to 10 messages, so if 100 messages arrive at once in your SQS queue, then 10 Lambda functions will spin up (considering the default auto-scaling behaviour for Lambda) and they'll process these 100 messages (keep in mind this is the happy path as in practice, a few more Lambda functions could spin up reading less than the 10 messages in the batch, but you get the idea). If you posted these same 100 messages to SNS, however, 100 Lambda functions would spin up, unnecessarily increasing costs and using up your Lambda concurrency.
However, if you are still running traditional servers (like EC2 instances), you will still need to poll for messages and manage them manually.
You also have FIFO SQS queues, which guarantee the delivery order of the messages. SQS FIFO is also supported as an event source for Lambda as of November 2019
Even though there's some overlap in their use cases, both SQS and SNS have their own spotlight.
Use SNS if:
multiple subscribers is a requirement
sending SMS/E-mail out of the box is handy
Use SQS if:
only one subscriber is needed
batching is important
AWS SNS is a publisher subscriber network, where subscribers can subscribe to topics and will receive messages whenever a publisher publishes to that topic.
AWS SQS is a queue service, which stores messages in a queue. SQS cannot deliver any messages, where an external service (lambda, EC2, etc.) is needed to poll SQS and grab messages from SQS.
SNS and SQS can be used together for multiple reasons.
There may be different kinds of subscribers where some need the
immediate delivery of messages, where some would require the message
to persist, for later usage via polling. See this link.
The "Fanout Pattern." This is for the asynchronous processing of
messages. When a message is published to SNS, it can distribute it
to multiple SQS queues in parallel. This can be great when loading
thumbnails in an application in parallel, when images are being
published. See this link.
Persistent storage. When a service that is going to process a message is not reliable. In a case like this, if SNS pushes a
notification to a Service, and that service is unavailable, then the
notification will be lost. Therefore we can use SQS as a persistent
storage and then process it afterwards.
From the AWS documentation:
Amazon SNS allows applications to send time-critical messages to
multiple subscribers through a “push” mechanism, eliminating the need
to periodically check or “poll” for updates.
Amazon SQS is a message queue service used by distributed applications
to exchange messages through a polling model, and can be used to
decouple sending and receiving components—without requiring each
component to be concurrently available.
Fanout to Amazon SQS queues
Following are the major differences between the main messaging technologies on AWS (SQS, SNS, +EventBridge). In order to choose a particular AWS service, we should know the functionalities a service provides as well as its comparison with other services.
The below diagram summarizes the main similarities as well as differences between this service.
In simple terms,
SNS - sends messages to the subscriber using push mechanism and no need of pull.
SQS - it is a message queue service used by distributed applications to exchange messages through a polling model, and can be used to decouple sending and receiving components.
A common pattern is to use SNS to publish messages to Amazon SQS queues to reliably send messages to one or many system components asynchronously.
Reference from Amazon SNS FAQs.
One reason for coupling SQS and SNS would be for data processing pipelines.
Let's say you are generating three kinds of product, and that products B & C are both derived from the same intermediate product A. For each kind of product (i.e., for each segment of the pipeline) you set up:
a compute resource (maybe a lambda function, or a cluster of virtual machines, or an autoscaling kubernetes job) to generate the product.
a queue (describing units of work that need to be performed) to partition the work across the compute resource (so that each unit of work is processed exactly once, but separate units of work can be processed separately in parallel and asynchronously with each other).
a news feed (announcing outputs that have been produced).
Then arrange so that the input queues for B & C are both subscribing to the output announcements of A.
This makes the pipeline modular on the level of infrastructure. Rather than having a monolithic server application that generates all three products together, different stages of the pipeline can utilise different hardware resources (for example, perhaps stage B is very memory intensive, but the two other stages can be performed with cheaper hardware/services). This also makes it easier to iterate on the development of one pipeline segment without disrupting delivery of the other products.
There are some key distinctions between SNS and SQS:
SNS supports A2A and A2P communication, while SQS supports only A2A
communication.
SNS is a pub/sub system, while SQS is a queuing system. You'd
typically use SNS to send the same message to multiple consumers via
topics. In comparison, in most scenarios, each message in an SQS
queue is processed by only one consumer. With SQS, messages are
delivered through a long polling (pull) mechanism, while SNS uses a
push mechanism to immediately deliver messages to subscribed
endpoints.
SNS is typically used for applications that need real time
notifications, while SQS is more suited for message processing use
cases.
SNS does not persist messages - it delivers them to subscribers that
are present, and then deletes them. In comparison, SQS can persist
messages (from 1 minute to 14 days).
Individually, Amazon SQS and SNS are used for different use cases. You can, however, use them together in some scenarios.