My application consists of:
1 Amazon SQS message queue
n workers
The workers have the following logic:
1. Wait for message from SQS queue
2. Perform task described in message
3. Delete the message from the SQS queue
4. Go to (1)
I want each message to be received by only one worker to avoid redundant work.
Is there a mechanism to mark a message as "in progress" using SQS, so that other pollers do not receive it?
Alternatively, is it appropriate to delete the message as soon as it is received?
1. Wait for message from SQS queue
2. Delete the message from the SQS queue
3. Perform task described in message
4. Go to (1)
If I follow this approach, is there a way to recover received but unprocessed messages in case a worker crashes (step (3) fails)?
This question is specific to Spring, which contains all sorts of magic.
An SQS message is considered to be "inflight" after it is received from a queue by a consumer, but not yet deleted from the queue. These messages are not visible to other consumers.
In SQS, a message is considered "inflight" if:
you, the consumer, have received it,
the visibility timeout has not expired, and
you have not deleted it.
SQS is designed so that you can call ReceiveMessage and a message is given to you for processing. You have some amount of time (the visibility timeout) to process this message. During the visibility timeout, if you or any other worker calls ReceiveMessage again, the message you are currently working on will not be returned. It is hidden.
Once the visibility timeout expires, the message can be returned by future ReceiveMessage calls. This could happen if the consumer fails in some way. If processing succeeds, you can delete the message.
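The receive, process, delete cycle described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `sqs` is assumed to be a boto3 SQS client (for example `boto3.client("sqs")`), and `handler` and `queue_url` are hypothetical names supplied by the caller.

```python
def process_one_batch(sqs, queue_url, handler, visibility_timeout=30):
    """Receive one message, process it, and delete it only on success.

    `sqs` is assumed to be a boto3 SQS client; `handler` is a hypothetical
    callable that takes the message body. Returns the number of messages
    that were processed and deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,  # long polling: wait up to 20 s for a message
        VisibilityTimeout=visibility_timeout,  # hide the message while we work
    )
    deleted = 0
    # The "Messages" key is absent when nothing was received.
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Do not delete: the message becomes visible again once the
            # visibility timeout expires, so another worker can retry it.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        deleted += 1
    return deleted
```

A worker would call this in a loop. Because the delete happens only after `handler` returns, a crash during processing leaves the message in the queue.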
The number of messages that are hidden from ReceiveMessage calls is the "inflight" number. Currently, an SQS queue allows a maximum of approximately 120,000 messages to be "inflight" by default.
http://docs.amazonwebservices.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/AboutVT.html
A ReceiptHandle string is returned with each received message. Its expiry time is based on the queue's visibility timeout.
You can use this ReceiptHandle to delete the message from the queue.
Related
I am using SQS to hold http requests. How can I keep the message alive (in the queue) to be re-processed when the request fails, without another process grabbing it?
The typical process is:
A message is placed into the Amazon SQS queue
A worker process calls ReceiveMessage() to retrieve a message from the queue
The message is temporarily marked as 'invisible' (in-flight) so that other workers cannot see the message
If the worker successfully processes the message, it calls DeleteMessage() to permanently remove the message
If the worker does not delete the message within the visibility timeout period (e.g. because it failed), the message will reappear on the queue. The message can then be grabbed by another worker.
If a Dead Letter Queue has been configured, then a message that is retrieved from the queue more than a defined number of times will be moved to the Dead Letter Queue for separate investigation or re-processing.
Your question seems to fit the scenario for using a Dead Letter Queue.
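As a rough illustration of configuring a Dead Letter Queue, the redrive policy is set as a JSON string in the source queue's `RedrivePolicy` attribute. The helper name, the ARN, and the receive count below are hypothetical values chosen for the sketch.

```python
import json


def build_redrive_attributes(dead_letter_queue_arn, max_receive_count=5):
    """Build the Attributes dict that attaches a dead-letter queue.

    After a message has been received `max_receive_count` times without
    being deleted, SQS moves it to the queue identified by the given ARN.
    """
    policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the policy serialized as a JSON string.
    return {"RedrivePolicy": json.dumps(policy)}
```

With a boto3 client, the result would be passed as `sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=build_redrive_attributes(dlq_arn, 3))`, where `queue_url` and `dlq_arn` are placeholders for your own queue URL and DLQ ARN.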
I have an SQS FIFO queue to which we send a bunch of ids for processing on the other end. We have 4 workers consuming the messages. Once a worker receives a message, it deletes the message and stores the ids until it hits a limit before performing actions.
What I've noticed is that some ids are received more than once even though each id is only sent once. Is that normal?
Your current process appears to be:
A worker pulls (Receives) a message from a queue
It deletes the message
It performs actions on the message
This is not the recommended way to use a queue because the worker might fail after it has deleted the message but before it has completed the action. Thus, the message would be "lost".
The recommended way to use a queue would be:
Pull a message from the queue (makes the message temporarily invisible)
Process the message
Delete the message
This way, if the worker fails while processing the message, it will automatically "reappear" on the queue after the invisibility period. The worker can also send a "still working" signal to keep the message invisible for longer while it is being processed.
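One way to send the "still working" signal is the ChangeMessageVisibility API, which resets the message's visibility timeout counting from the moment of the call. The sketch below assumes `sqs` is a boto3 SQS client and that the work can be split into a hypothetical list of `steps` (callables taking the message body); neither comes from the original question.

```python
def process_with_heartbeat(sqs, queue_url, message, steps, timeout_seconds=60):
    """Run each step, extending the message's invisibility before every one.

    `sqs` is assumed to be a boto3 SQS client and `message` one entry from
    a receive_message response. The message is deleted only after all
    steps finish without raising.
    """
    receipt_handle = message["ReceiptHandle"]
    for step in steps:
        # Reset the timeout so the message stays hidden for another
        # `timeout_seconds`, measured from now.
        sqs.change_message_visibility(
            QueueUrl=queue_url,
            ReceiptHandle=receipt_handle,
            VisibilityTimeout=timeout_seconds,
        )
        step(message["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)
```

If a step raises, the delete is never reached and the message reappears once the last extension expires. Note that the total time a message can stay invisible is capped at 12 hours.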
Amazon SQS FIFO queues provide exactly-once processing. This means that a message will only be delivered once. (However, if the invisibility period expires before the message is deleted, it will be provided again.)
You say that "some ids are received more than once". I would recommend adding debug code to try and understand the circumstances in which this happens, since it should not be happening if the messages are deleted within the invisibility period.
I want to get all the messages in the queue to process them. However the property for MaxNumberOfMessages is 10 (based on documentation)
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html
How can I read in all messages so I can process them? Or how would I know when queue is empty?
thanks
When you receive messages from the queue, they are marked as "in flight." After you successfully process them, you send a call to the queue to delete them. This call includes the receipt handle of each message.
When the queue is empty, the next read will have an empty Messages array.
Usually when I do this I wrap my call to read the queue in a loop (a while loop) and only keep processing if I have Messages after doing a read.
It shouldn't make any difference if it's a FIFO queue or a standard one.
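The read-in-a-loop approach might look something like this. It is only a sketch: `sqs` is assumed to be a boto3 SQS client, `handler` is a hypothetical callable, and an empty response is treated as the stop signal, which is a simplification since delayed or in-flight messages can still exist at that point.

```python
def drain_queue(sqs, queue_url, handler, wait_seconds=5):
    """Keep receiving batches of up to 10 messages until a read is empty.

    `sqs` is assumed to be a boto3 SQS client. Each message is passed to
    `handler` and deleted afterwards. Returns the number processed.
    """
    processed = 0
    while True:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # the maximum SQS allows per call
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            # Nothing visible right now: stop polling.
            return processed
        for message in messages:
            handler(message["Body"])
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
```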
To check if the queue is empty you have to verify the total number of messages in the queue is zero. SQS does not provide a single metric for this, rather you have to calculate the sum of three different metrics.
From the docs:
To confirm that a queue is empty (AWS CLI, AWS API)
Stop all producers from sending messages.
Repeatedly run one of the following commands:
AWS CLI: get-queue-attributes
AWS API: GetQueueAttributes
Observe the metrics for the following attributes:
ApproximateNumberOfMessagesDelayed
ApproximateNumberOfMessagesNotVisible
ApproximateNumberOfMessages
When all of them are 0 for several minutes, the queue is empty.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/confirm-queue-is-empty.html
Getting an empty response from a ReceiveMessage call does NOT necessarily mean the queue is empty. You can have messages in the queue and still receive an empty response if:
Messages are delayed - You can set delays on individual messages for standard queues or at the queue level for standard and FIFO queues. During the delay period messages are invisible to consumers.
Messages are in-flight - When a consumer receives a message, that message remains in the queue until the consumer deletes it by calling DeleteMessage. While the message is in this state it is considered in-flight and is not available for other consumers.
Multiple messages have the same message group id in a FIFO queue - When a consumer receives a message from a FIFO queue, no other consumer can receive messages from the same message group. This ensures messages are processed in FIFO order.
By summing the metrics listed above, you can account for all of these scenarios.
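Summing the three attributes could be sketched as below, assuming `sqs` is a boto3 SQS client. The function and constant names are made up for the example; the attribute names are the ones listed above. Since the counts are approximate, a single zero reading is not proof that the queue is empty.

```python
COUNT_ATTRIBUTES = [
    "ApproximateNumberOfMessages",
    "ApproximateNumberOfMessagesNotVisible",
    "ApproximateNumberOfMessagesDelayed",
]


def approximate_total_messages(sqs, queue_url):
    """Return the sum of visible, in-flight, and delayed message counts.

    `sqs` is assumed to be a boto3 SQS client. Attribute values come back
    as strings, so they are converted to int before summing.
    """
    response = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=COUNT_ATTRIBUTES,
    )
    attributes = response["Attributes"]
    return sum(int(attributes[name]) for name in COUNT_ATTRIBUTES)
```

Following the documentation quoted above, you would stop all producers and check that this total stays at 0 for several minutes before treating the queue as empty.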
I set the visibility timeout to 12 hours, the max messages to 3, and the delay time to 15 minutes. I receive an SQS message, and a few minutes later I automatically receive the same message again.
Why do I receive the same SQS message multiple times before the timeout has expired?
After the visibility timeout expires, is the message deleted from the queue, or is it sent again?
When ReceiveMessage() is called on an Amazon SQS queue, up to 10 messages (configurable) will be retrieved from the queue.
These messages will be marked as Invisible or In-Flight. This means that the messages are still in the queue, but will not be returned via another ReceiveMessage() call. The messages will remain invisible for a period of time. The default period is configured on the queue ("Default Visibility Timeout") or when the messages are retrieved (VisibilityTimeout).
When an application has finished handling a message, it should call DeleteMessage(), passing the ReceiptHandle that was provided with the message. The message will then be deleted from the queue.
If the invisibility period expires before a message is deleted, it will be placed on the queue again and applications can retrieve it again. Therefore, be sure to set your invisibility timeout to be longer than an application normally takes to process a message.
It is possible that a message may be retrieved more than once from Amazon SQS. It is rare, but can happen where there are multiple processes retrieving messages simultaneously. Thus, SQS is "At least once delivery". If this is a problem, you can use FIFO Queues (not yet available in every region) that will guarantee that each message is delivered only once, but there are throughput restrictions on FIFO queues.
So, if you are receiving a message more than once:
You should check your invisibility timeout setting (both the default setting and the value that can be passed when you call ReceiveMessage())
Consider using FIFO queues
Have your application check whether a message has already been processed before processing it again (eg via a unique ID)
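The last suggestion, checking whether a message has already been processed, could be sketched as follows. This example uses the `MessageId` field of a received message as the unique ID and a plain in-memory set as the record of processed IDs. Both are assumptions made for illustration: with several workers you would need a shared store instead, and an ID carried in your own message body may suit your case better.

```python
def handle_once(message, handler, seen_ids):
    """Process a message only if its ID has not been seen before.

    `message` is one entry from a receive_message response, `handler` is
    a hypothetical callable taking the body, and `seen_ids` is a set of
    already processed IDs. Returns True if the handler ran.
    """
    message_id = message["MessageId"]
    if message_id in seen_ids:
        # Duplicate delivery: skip the work. The caller can still delete it.
        return False
    handler(message["Body"])
    # Record the ID only after the handler succeeded, so a failed
    # attempt can be retried on redelivery.
    seen_ids.add(message_id)
    return True
```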
I have an Amazon SQS Queue and I am trying to make it work this way:
When a new message is added to the queue, only the first client that receives that message will start work
For others, the message will be invisible for period of time
Is it possible to do this using Visibility Timeout?
When a consumer receives and processes a message from SQS queue, the message still remains in the queue (until it is deleted by the consumer). To make sure that other consumers don't process the same message, you can set visibility timeout of the queue. Once the message has been processed by the consumer, you can delete the message from the queue. For the duration of the visibility timeout, no other consumer will be able to receive and process the same message.
There is no other way to "lock" the message except setting a long Visibility Timeout, with a maximum 12 hour timeout.
However, if your real concern also includes errors or crashes, you can use a Dead-Letter-Queue redrive policy to deal with messages that repeatedly fail to be processed.