I have a task generator that publishes task messages to an SQS queue, and a bunch of workers that poll the SQS queue to process the tasks. In this case, is there any benefit in having the task generator publish messages to an SNS topic first, with the SQS queue subscribed to the SNS topic? I assume publishing directly to the SQS queue is enough.
Assuming you don't need to fan out the messages to different types of workers, and your workers are all doing the same job, then no, you don't.
Each worker can take and process one message.
One item to be aware of is the timeout before a message becomes visible on SQS again; i.e. not configuring the timeout correctly could cause another worker to process the same message.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html
When a consumer receives and processes a message from a queue, the message remains in the queue. Amazon SQS doesn't automatically delete the message. Because Amazon SQS is a distributed system, there's no guarantee that the consumer actually receives the message (for example, due to a connectivity issue, or due to an issue in the consumer application). Thus, the consumer must delete the message from the queue after receiving and processing it.

Visibility Timeout

Immediately after a message is received, it remains in the queue. To prevent other consumers from processing the message again, Amazon SQS sets a visibility timeout, a period of time during which Amazon SQS prevents other consumers from receiving and processing the message. The default visibility timeout for a message is 30 seconds. The minimum is 0 seconds. The maximum is 12 hours.
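The quoted behaviour can be sketched in a few lines. This is a minimal in-memory model of the visibility-timeout semantics, not the real SQS API; `SketchQueue` and its methods are invented purely for illustration:

```python
import time

# Illustrative sketch: a received message stays in the queue but is hidden
# from other consumers until the visibility timeout elapses or the message
# is deleted.
class SketchQueue:
    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}          # message id -> body
        self.invisible_until = {}   # message id -> timestamp

    def receive(self, now=None):
        now = now if now is not None else time.time()
        for msg_id, body in self.messages.items():
            if self.invisible_until.get(msg_id, 0) <= now:
                # hide the message from other consumers for the timeout period
                self.invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None  # no visible message (real SQS would long-poll here)

    def delete(self, msg_id):
        # only an explicit delete actually removes the message
        self.messages.pop(msg_id, None)
        self.invisible_until.pop(msg_id, None)

q = SketchQueue(visibility_timeout=30)
q.messages["m1"] = "task payload"

first = q.receive(now=0)    # ("m1", "task payload"): delivered and hidden
hidden = q.receive(now=10)  # None: still inside the 30 s visibility timeout
again = q.receive(now=31)   # ("m1", "task payload"): timeout expired, redelivered
q.delete("m1")              # processing finished, so remove it for good
```

This is why a worker that crashes mid-processing is harmless: it never calls `delete`, so the message simply reappears for another worker once the timeout expires.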
Related
I'm reading aws documents on deadletter queue and re-drive policy, and the document mentioned "The redrive policy specifies the source queue, the dead-letter queue, and the conditions under which Amazon SQS moves messages from the former to the latter if the consumer of the source queue fails to process a message a specified number of times".
However, even though the document mentions "message processing failed" several times, I do not understand how SQS detects a message-processing failure (and thus triggers a redrive, or a move to the dead-letter queue).
From what I understand, consumer applications call receiveMessage to retrieve the message from SQS, then process the message. The processing function is not passed in to receiveMessage as a callback. So how does SQS know that message processing has failed?
When a client (e.g. a Lambda function) gets a message from the queue, it has limited time to call DeleteMessage. Each message also has a visibility timeout. If the message is not deleted by the client within the visibility timeout, SQS "assumes" that the processing failed.
Such messages can then be moved to the dead-letter queue, depending on how many failed attempts you configure the queue to tolerate.
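Concretely, that tolerance is the `maxReceiveCount` in the queue's redrive policy. A sketch of how it might be set (the queue URL and DLQ ARN below are placeholders, not real resources):

```python
import json

# The redrive policy is a JSON string stored as a queue attribute. After
# `maxReceiveCount` receives without a matching DeleteMessage, SQS moves
# the message to the dead-letter queue.
redrive_policy = {
    # hypothetical ARN of an existing dead-letter queue
    "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:my-dlq",
    "maxReceiveCount": "5",  # tolerate up to 5 failed processing attempts
}
attributes = {"RedrivePolicy": json.dumps(redrive_policy)}

# With boto3 this would be applied roughly as:
#   sqs = boto3.client("sqs")
#   sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```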
I had a question regarding SQS services.
If you have an SQS queue with multiple consumers and long polling enabled, for a FIFO-type SQS queue, which consumer gets preference for delivery? Is it based on which one started polling first, or is it random? Also, are there any good readings on this?
Thanks in advance!
The number of consumers does not impact operation of the Amazon SQS queue. When a consumer requests messages from a FIFO queue, they will be given the earliest unprocessed message(s).
There is an additional Message Group ID on each message. While a message with a particular Message Group ID is being processed, no further messages with the same Message Group ID will be provided. This ensures that those messages are processed in-order.
Long polling simply means that if no messages are available, SQS will wait up to 20 seconds before returning an empty response. The long-polling wait time is a queue-level default that you can override on each request to the queue.
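The Message Group ID behaviour above can be sketched with a small in-memory model (illustrative only, not the real API; `FifoSketch` is invented for this example): consumers get the earliest visible message, but while any message of a group is in flight, that whole group is skipped.

```python
# Illustrative sketch of FIFO delivery with message groups.
class FifoSketch:
    def __init__(self):
        self.messages = []            # (group_id, body), in arrival order
        self.in_flight_groups = set()

    def receive(self):
        for i, (group, body) in enumerate(self.messages):
            if group not in self.in_flight_groups:
                self.in_flight_groups.add(group)
                return self.messages.pop(i)  # earliest deliverable message
        return None  # every remaining group is currently in flight

    def ack(self, group):
        # consumer finished processing; the group may be delivered again
        self.in_flight_groups.discard(group)

q = FifoSketch()
q.messages = [("order-1", "step A"), ("order-1", "step B"), ("order-2", "step C")]

m1 = q.receive()  # ("order-1", "step A"): earliest message overall
m2 = q.receive()  # ("order-2", "step C"): order-1 is in flight, so skipped
m3 = q.receive()  # None: both groups are in flight
q.ack("order-1")
m4 = q.receive()  # ("order-1", "step B"): in-order within its group
```

Which physical consumer gets each message is simply whoever's `receive` call SQS happens to serve next; there is no preference ranking among consumers.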
What happens when a producer fails to send a message to the SQS queue? Is there any way to configure or retry to send that message again.
Sending to Amazon SQS is "fire and forget" from the producer's side: SQS is not responsible for retrying, since the producer initiates the send. It is the producer's job to get the message into SQS.
Retries in SQS exist for consumers, where SQS is responsible for ensuring that each message gets processed.
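So the producer must retry failed sends itself (note that the AWS SDKs such as boto3 already retry transient errors internally; this sketch just shows the idea of an application-level retry). `send` here is a stand-in for an SQS SendMessage call, e.g. boto3's `sqs.send_message`, and is an assumption of this example:

```python
import time

def send_with_retry(send, body, max_attempts=3, base_delay=0.5):
    """Retry a send callable with exponential backoff between attempts."""
    for attempt in range(max_attempts):
        try:
            return send(body)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...

# Exercised with a stand-in sender that fails twice, then succeeds:
attempts = []
def flaky_send(body):
    attempts.append(body)
    if len(attempts) < 3:
        raise ConnectionError("transient network error")
    return {"MessageId": "abc-123"}

result = send_with_retry(flaky_send, "hello", base_delay=0.01)
```

If the retries are exhausted, the producer still has to decide what to do with the message (log it, persist it locally, alert), because SQS never saw it.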
We are evaluating SNS for our messaging requirements to integrate multiple applications. We have a single producer that publishes messages to multiple topics on SNS. Each topic has 2-5 subscribers. In the event of subscriber failures (e.g. down for maintenance), I have a few questions on the recommended strategy of using SQS queues per consumer:
Is it possible to configure SNS to push to SQS only in the event of a failure to deliver the message to a subscriber? Dumping all the messages into the SQS queue creates a problem for the consumer, which has to analyze every message in the queue when it restarts.
In the event of a subscriber failure, it can read messages from the SQS queue on restart, but how would it know that it missed messages from SNS when it was overloaded?
Any suggestions on handling subscriber failures are welcome.
Thanks!
No, it is not possible to "configure SNS to push to SQS only in event of failure".
Rather than trying to recover a message after a failure, you can configure the Amazon SNS retry policies.
From Setting Amazon SNS Delivery Retry Policies for HTTP/HTTPS Endpoints:
You can use delivery policies to control not only the total number of retries, but also the time delay between each retry. You can specify up to 100 total retries distributed among four discrete phases. The maximum lifetime of a message in the system is one hour. This one hour limit cannot be extended by a delivery policy.
So, you don't need to worry as long as the destination is back online within an hour.
If it is likely to be offline for more than an hour, you will need to find a way to store and "replay" the messages, possibly by inspecting CloudWatch Logs.
Or, here's another idea...
Push initially to SQS. Have an AWS Lambda function triggered by SQS. The Lambda function can do the 'push' that would normally be done by SNS. If it fails, then the standard SQS invisibility process will retry it later, eventually going to a Dead Letter Queue.
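That last idea can be sketched as a Lambda handler: the function attempts the delivery that SNS would have done, and raises on failure so that the message becomes visible again in the queue for a later retry (and, eventually, the redrive policy moves it to the dead-letter queue). `deliver_to_endpoint` is a hypothetical stand-in for the actual push (e.g. an HTTP POST to the subscriber):

```python
def handler(event, context=None):
    """Lambda handler for an SQS-triggered function (SQS event batch shape)."""
    for record in event["Records"]:
        deliver_to_endpoint(record["body"])  # raises if the endpoint is down
    # Returning normally tells Lambda the batch succeeded, so the messages
    # are deleted from the queue; an exception leaves them to be retried.

delivered = []
def deliver_to_endpoint(body):
    # placeholder for the real push to the subscriber
    delivered.append(body)

handler({"Records": [{"body": "msg-1"}, {"body": "msg-2"}]})
```

One design note: with this pattern the retry window is governed by the queue's message retention period (up to 14 days), rather than SNS's one-hour delivery lifetime.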
Currently we are using RabbitMQ for reliable message delivery, and we plan to move to SQS. RabbitMQ monitors its consumers over TCP; when any consumer of a queue goes down, it automatically re-queues the message for processing.
Will SQS monitor all its slaves (consumers)? Will the message be visible in the queue again if one of its consumers goes down while processing the message?
I have tried to find the answer in the documentation, but could not find anything.
If by 'slaves', you mean SQS consumers, then no, SQS does not monitor the consumers from the queue.
In a nutshell, SQS works like this:
A consumer requests a message from the queue.
SQS sends the message to the consumer to process and makes that message temporarily invisible to other consumers.
When the consumer is finished processing the message, it sends a 'DeleteMessage' request back to SQS, and SQS removes that item from the queue.
If a consumer does not send the DeleteMessage soon enough (within the configurable visibility timeout), then SQS will make the message visible in the queue again automatically.
So SQS doesn't monitor your consumers, but if a consumer requests messages and does nothing with them, they will eventually end up back in the queue to be processed by someone else.
But if your queue doesn't have any consumers at all, then sooner or later (14 days maximum, per the queue's message retention period), the messages will be deleted altogether (or sent to a dead-letter queue if you set one up).
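The receive/process/delete loop described above looks roughly like this. The sketch is written against a boto3-style SQS client passed in by the caller, so it can be exercised with a stub here; with real AWS you would pass `boto3.client("sqs")` and a real queue URL (`drain_queue` and `StubSqs` are names invented for this example):

```python
def drain_queue(sqs, queue_url, process):
    """Receive, process, and delete messages until the queue is empty."""
    while True:
        resp = sqs.receive_message(QueueUrl=queue_url,
                                   MaxNumberOfMessages=1,
                                   WaitTimeSeconds=20)  # long polling
        messages = resp.get("Messages", [])
        if not messages:
            break
        for msg in messages:
            process(msg["Body"])
            # Delete only after successful processing; if process() raises,
            # the message becomes visible again after the visibility timeout.
            sqs.delete_message(QueueUrl=queue_url,
                               ReceiptHandle=msg["ReceiptHandle"])

# Minimal stub client standing in for boto3, to show the flow end to end:
class StubSqs:
    def __init__(self, bodies):
        self.pending = [{"Body": b, "ReceiptHandle": b} for b in bodies]
        self.deleted = []
    def receive_message(self, QueueUrl, MaxNumberOfMessages, WaitTimeSeconds):
        return {"Messages": self.pending[:1]} if self.pending else {}
    def delete_message(self, QueueUrl, ReceiptHandle):
        self.pending = [m for m in self.pending
                        if m["ReceiptHandle"] != ReceiptHandle]
        self.deleted.append(ReceiptHandle)

seen = []
stub = StubSqs(["a", "b"])
drain_queue(stub, "https://example-queue-url", seen.append)
```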
It is usually a good idea to setup your queue consumers in an auto-scaling group, with a health-check that can verify that it is running/processing properly. If an instance fails a health check, it will be terminated and a new instance spun up to continue the work in the queue. Optionally, you can spin up extra instances if the size of the SQS queue grows to meet peak demand.