SQS returns more messages than queue size

I created a basic non-FIFO queue, and put only 1 message on it. I retrieve the message using the following code:
ReceiveMessageRequest request = new ReceiveMessageRequest();
request.setQueueUrl(queueUrl);
request.setMaxNumberOfMessages(10);
request.withMessageAttributeNames("All");
ReceiveMessageResult result = sqsClient.receiveMessage(request);
List<Message> messages = result.getMessages();
messages.size() gives 3
They have:
Same MessageId
Same body and attributes
Same MD5OfBody
Different ReceiptHandle
Changing MaxNumberOfMessages from 10 to 1 fixed it, but I want to receive messages in batches of 10 in the future.
Can someone explain why it is retrieving more messages than it should?
Below is my queue configuration:
Default visibility timeout = 0
message retention = 4 days
max message size = 256kb
delivery delay = 0
receive message wait time = 0
no redrive policy

Details / complement to Michael - sqlbot's comment.
Setting the SQS visibility timeout to a small value is not going to fix your problem; you are going to hit it again. Use 30 seconds or more so that your program has time to consume the message. (To cater for program crashes or unexpected delays, you should also create a redrive policy to mitigate these issues.)
AWS mentions this under At-Least-Once Delivery:
Amazon SQS stores copies of your messages on multiple servers for
redundancy and high availability. On rare occasions, one of the
servers that stores a copy of a message might be unavailable when you
receive or delete a message.
If this occurs, the copy of the message will not be deleted on that
unavailable server, and you might get that message copy again when you
receive messages. You should design your applications to be idempotent
(they should not be affected adversely when processing the same
message more than once).

Changing the Default Visibility Timeout from 0 to 1 second fixed the issue.
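For reference, the same fix can be applied programmatically. A minimal sketch using boto3 (a swap for the Java SDK used in the question; the queue URL is a placeholder), raising the default visibility timeout so a received message stays hidden while it is being processed:

```python
def visibility_attributes(timeout_seconds):
    """Build the attribute map for SetQueueAttributes.

    SQS accepts VisibilityTimeout as a string between 0 and 43200 (12 hours).
    """
    if not 0 <= timeout_seconds <= 43200:
        raise ValueError("VisibilityTimeout must be 0..43200 seconds")
    return {"VisibilityTimeout": str(timeout_seconds)}

def raise_queue_visibility(queue_url, timeout_seconds=30):
    """Apply the new default visibility timeout (requires AWS credentials)."""
    import boto3  # AWS SDK for Python
    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(QueueUrl=queue_url,
                             Attributes=visibility_attributes(timeout_seconds))
```

With a nonzero timeout, the duplicate copies that SQS stores internally stay invisible after the first receive, so a single ReceiveMessage call no longer returns the same message several times.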

Related

AWS SQS only returning 1 message at a time

I'm polling the SQS queue with the following code, but it always returns only 1 message. Based on the params below, I'm expecting up to 10 messages. This is not a FIFO queue.
Is there a different config parameter to bring back more than 1 message at a time?
const receiveParams = {
QueueUrl: queueURL,
MaxNumberOfMessages: 10,
VisibilityTimeout: 15,
WaitTimeSeconds: 20,
AttributeNames: ["All"],
};
const receiveCommand = new ReceiveMessageCommand(receiveParams);
const msgData = await sqsClient.send(receiveCommand);
I have been trying to resolve this issue recently and have discovered a few details that help explain why SQS (non-FIFO) behaves this way.
MaxNumberOfMessages
The maximum number of messages to return. Amazon SQS never returns more messages than this value (however, fewer messages might be returned). Valid values: 1 to 10. Default: 1.
VisibilityTimeout
The duration (in seconds) that the received messages are hidden from subsequent retrieve requests after being retrieved by a ReceiveMessage request.
WaitTimeSeconds
The duration (in seconds) for which the call waits for a message to arrive in the queue before returning. If a message is available, the call returns sooner than WaitTimeSeconds. If no messages are available and the wait time expires, the call returns successfully with an empty list of messages.
So if you have a single consumer and you want to get a specific message, one way to do this is to make repeated ReceiveMessage requests with a VisibilityTimeout long enough to ensure that each request only returns messages you have not seen before.
Note, if you delete each message as it is received, then that makes things easier because you will only receive messages that are not deleted in subsequent requests. This is of course not ideal if there are multiple consumers of the same queue for the same message.
Even with all that said, you are still left with no guarantee that you will receive the messages you want in 1 or more requests.
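The repeated-request approach described above might look like the following sketch, written in Python with boto3 rather than the JavaScript SDK used in the question (the queue URL and the message handler are the caller's own). Each batch is deleted before the next request, so later requests only see the remaining messages:

```python
def entries_for_delete(messages):
    """Turn a ReceiveMessage result into DeleteMessageBatch entries."""
    return [{"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
            for i, m in enumerate(messages)]

def drain_queue(queue_url, handle_message, visibility_timeout=60):
    """Repeatedly receive, process, and delete until the queue looks empty.

    Requires AWS credentials; handle_message is the caller's processor.
    """
    import boto3  # AWS SDK for Python
    sqs = boto3.client("sqs")
    while True:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,          # hard upper bound per request
            VisibilityTimeout=visibility_timeout,
            WaitTimeSeconds=20,              # long polling reduces empty replies
        )
        messages = resp.get("Messages", [])
        if not messages:
            break                            # queue *looks* empty, not a guarantee
        for m in messages:
            handle_message(m)
        sqs.delete_message_batch(QueueUrl=queue_url,
                                 Entries=entries_for_delete(messages))
```

As noted above, even this loop cannot guarantee that any single request returns all remaining messages; it only guarantees forward progress.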
So after all that, I was left with a redesign: removing SQS from the equation and using a Lambda to write the information that would otherwise have been an SQS message into proper storage instead, for retrieval and processing later.
Not exactly a solution, but a few workarounds and an alternative that may help.

Optimizing SQS Lambda configuration for single concurrency microservice composition

Apologies for the title. It's hard to summarise what I'm trying to accomplish.
Basically let me define a service to be an SQS Queue + A Lambda function.
A service (represented by square brackets below) performs a given task, where the queue is the input interface, processes the input, and outputs on to the queue of the subsequent service.
Service 1 Service 2 Service 3
[(APIG) -> (Lambda)] -> [(SQS) -> (Lambda)] -> [(SQS) -> (Lambda)] -> ...
Service 1: Consumes the request and payload, splits it into messages and passes on to the queue of the next service.
Service 2: This service does not have a reserved concurrency. It validates each message on the queue, and if valid, passes on to the next service.
Service 3: Processes each message in the queue (ideally in batches of approximately 100). The lambda here must have a reserved concurrency of 1 (as it hits an API that can't process multiple requests concurrently).
Currently I have the following configuration on Service 3.
Default visibility timeout of queue = 5 minutes
Lambda timeout = 5 minutes
Lambda reserved concurrency = 1
Problem 1: Service 3 consumes x items off the queue and if it finishes processing them within 30 seconds I expect the queue to process the next x items off the queue immediately (ideally x=100). Instead, it seems to always wait 5 minutes before taking the next batch of messages off the queue, even if the lambda completes in 30 seconds.
Problem 2: Service 3 typically consumes a few messages at a time (inconsistent) rather than batches of 100.
A couple of more notes:
In Service 3 I do not explicitly delete messages off the queue in the Lambda. AWS seems to do this itself when the Lambda successfully finishes processing the messages.
In Service 2 I have one item per message, so when I send messages to Service 3 I can only send 10 items at a time, which is kind of annoying, because with queue.send_messages(Entries=x), len(x) cannot exceed 10.
Does anyone know how I solve Problem 1 and 2? Is it an issue with my configuration? If you require any further information please ask in comments.
Thanks
Both your problems and notes indicate a misconfigured SQS queue and/or Lambda function.
In service 3 I do not explicitly delete messages off the queue using
the lambda. AWS seems to do this itself when the lambda successfully
finishes processing the messages.
This is definitely not the case here, as it would go against the reliability of SQS. How would SQS know that the message was successfully processed by your Lambda function? SQS doesn't care about consumers and doesn't really communicate with them, and that is exactly why there is such a thing as a visibility timeout. SQS deletes a message in two cases: either it receives a DeleteMessage API call specifying which message to delete via its ReceiptHandle, or you have set up a redrive policy with the maximum receive count set to 1. In the latter case, SQS will automatically send a message to the dead letter queue if it is received more than once, which means that every message returned to the queue will be sent there instead of staying in the queue. The last thing that can cause this is a low Message Retention Period (minimum 60 seconds), which drops the message after that many seconds.
Problem 1: Service 3 consumes x items off the queue and if it finishes
processing them within 30 seconds I expect the queue to process the
next x items off the queue immediately (ideally x=100). Instead, it
seems to always wait 5 minutes before taking the next batch of
messages off the queue, even if the lambda completes in 30 seconds.
This simply doesn't happen if everything is working as it should. If the Lambda function finishes in 30 seconds, if there is reserved concurrency available for the function, and if there are messages in the queue, then it will start processing the next messages right away.
The only thing that could cause this is that your Lambda (together with the concurrency limit) is timing out, which would explain those 5 minutes. Make sure that it really finishes in 30 seconds; you can monitor this via CloudWatch. The fact that the message has been successfully processed doesn't necessarily mean that the function has returned. Also make sure that there are messages to be processed when the function ends.
Problem 2: Service 3 typically consumes a few messages at a time
(inconsistent) rather than batches of 100.
It can never consume 100 messages, since the limit is 10 (messages in the SQS sense, not the actual data stored within a message, which can be anything up to 256 KB, possibly "more" using the extended SQS client library or a similar custom solution). Moreover, there is no guarantee that the Lambda will receive 10 messages in each batch; it depends on the Receive Message Wait Time setting. If you are using short polling (1 second), then only a subset of the servers storing the messages is polled, and a single message is stored on only a subset of those servers. If those two subsets do not overlap when the messages are polled, the message is not received in that batch. You can control this by increasing the polling interval, Receive Message Wait Time (max 20 seconds), but even if there are not enough messages in the queue when the timer expires, the batch will still be returned with fewer messages, possibly zero.
And as was mentioned in the comments, using this strategy with concurrency set to a low number can lead to some problems. Another thing is that you need to ensure that the rate at which messages are produced is consistent with the time it takes one instance of the Lambda function to process a message, otherwise you will end up with a constantly growing queue, possibly losing messages after they outlive the Message Retention Period.
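To improve the odds of fuller batches, the Receive Message Wait Time discussed above can be set on the queue itself. A minimal boto3 sketch (the queue URL is a placeholder; 20 seconds is the maximum SQS allows):

```python
def long_polling_attributes(wait_seconds=20):
    """Build the SetQueueAttributes map for long polling.

    SQS allows ReceiveMessageWaitTimeSeconds between 0 and 20.
    """
    if not 0 <= wait_seconds <= 20:
        raise ValueError("ReceiveMessageWaitTimeSeconds must be 0..20")
    return {"ReceiveMessageWaitTimeSeconds": str(wait_seconds)}

def enable_long_polling(queue_url, wait_seconds=20):
    """Switch the queue to long polling (requires AWS credentials)."""
    import boto3  # AWS SDK for Python
    boto3.client("sqs").set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=long_polling_attributes(wait_seconds))
```

With long polling, a receive request waits for more servers to report their messages, which tends to produce larger (though still not guaranteed full) batches.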

AWS SQS sender setting a time for a message to be available to the consumer

I have seen an option called visibility timeout in AWS SQS which sets a time during which other consumers ignore a message that is being processed by one of them.
Is there an option to set a time before a message actually becomes available to consumers, perhaps set when the message is inserted into the queue?
There is such an option, but it is a queue-level option, not a message-level option:
You can use the CreateQueue action to create a delay queue by setting the DelaySeconds attribute to any value between 0 and 900 (15 minutes). You can also change an existing queue into a delay queue using the SetQueueAttributes action to set the queue's DelaySeconds attribute
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-delay-queues.html
When the value is nonzero, all messages are delayed by the specified number of seconds before they are initially visible to any consumer.
Found an option called Amazon SQS Message Timers
Amazon SQS message timers allow you to specify an initial invisibility period for a message that you add to a queue. For example, if you send a message with the DelaySeconds parameter set to 45, the message isn't visible to consumers for the first 45 seconds during which the message stays in the queue.
The default value for DelaySeconds is 0.
To set a delay period that applies to all messages in a queue, use delay queues. A message timer setting for an individual message overrides any DelaySeconds value that applies to the entire delay queue.
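A per-message timer is set via the DelaySeconds parameter on SendMessage. A minimal boto3 sketch (the queue URL is a placeholder):

```python
def delayed_send_params(queue_url, body, delay_seconds):
    """Build SendMessage parameters with a message timer.

    DelaySeconds may be 0..900 (up to 15 minutes).
    """
    if not 0 <= delay_seconds <= 900:
        raise ValueError("DelaySeconds must be 0..900")
    return {"QueueUrl": queue_url, "MessageBody": body,
            "DelaySeconds": delay_seconds}

def send_delayed(queue_url, body, delay_seconds=45):
    """Send a message that stays invisible for delay_seconds (needs credentials)."""
    import boto3  # AWS SDK for Python
    boto3.client("sqs").send_message(
        **delayed_send_params(queue_url, body, delay_seconds))
```

A per-message DelaySeconds like this overrides the queue-level delay described above, matching the precedence rule quoted from the documentation.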
When you are processing a message, you can use ChangeMessageVisibility(). This allows you to change the timeout period for the message you are holding, either up or down. For example, if you know that your message failed the first time, you can set it to 0 and the message will go back on the queue immediately. If you want a specifically longer timeout, you can check for that type of message and set it to 10 minutes, for example.
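Both adjustments can be sketched with boto3 (the queue URL and receipt handle come from an earlier ReceiveMessage; this is an illustration, not a complete consumer):

```python
def clamp_visibility(seconds):
    """ChangeMessageVisibility accepts 0..43200 seconds (12 hours)."""
    return max(0, min(seconds, 43200))

def retry_now(queue_url, receipt_handle):
    """Return a failed message to the queue immediately (needs credentials)."""
    import boto3  # AWS SDK for Python
    boto3.client("sqs").change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=0,   # 0 makes the message visible again right away
    )

def hold_longer(queue_url, receipt_handle, minutes=10):
    """Extend the invisibility period for a slow-to-process message."""
    import boto3
    boto3.client("sqs").change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=clamp_visibility(minutes * 60),
    )
```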

I'm getting same multiple Sqs message before visibility timeout

I set the visibility timeout to 12 hours, max messages to 3, and the delivery delay to 15 minutes. I receive an SQS message, and a few minutes later I automatically receive the same message again.
Why do I get the same SQS message multiple times before the visibility timeout expires?
After the visibility timeout, is the message deleted from the queue, or is it delivered again?
When ReceiveMessage() is called on an Amazon SQS queue, up to 10 messages (configurable) will be retrieved from the queue.
These messages will be marked as Invisible or In-Flight. This means that the messages are still in the queue, but will not be returned via another ReceiveMessage() call. The messages will remain invisible for a period of time. The default period is configured on the queue ("Default Visibility Timeout") or when the messages are retrieved (VisibilityTimeout).
When an application has finished handling a message, it should call DeleteMessage(), passing the ReceiptHandle that was provided with the message. The message will then be deleted from the queue.
If the invisibility period expires before a message is deleted, it will be placed on the queue again and applications can retrieve it again. Therefore, be sure to set your invisibility timeout to be longer than an application normally takes to process a message.
It is possible that a message may be retrieved more than once from Amazon SQS. It is rare, but can happen where there are multiple processes retrieving messages simultaneously. Thus, SQS is "At least once delivery". If this is a problem, you can use FIFO Queues (not yet available in every region) that will guarantee that each message is delivered only once, but there are throughput restrictions on FIFO queues.
So, if you are receiving a message more than once:
You should check your invisibility timeout setting (both the default setting and the value that can be passed when you call ReceiveMessage())
Consider using FIFO queues
Have your application check whether a message has already been processed before processing it again (eg via a unique ID)
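The last suggestion, checking whether a message has already been processed, can be as simple as remembering message IDs. A sketch (the in-memory set stands in for a durable store such as a database table; names here are illustrative):

```python
class IdempotentProcessor:
    """Skip messages whose ID has been seen before.

    In production the `seen` set would live in durable storage
    (e.g. a database), not in memory; this is only a sketch.
    """
    def __init__(self, handler):
        self.handler = handler
        self.seen = set()

    def process(self, message):
        msg_id = message["MessageId"]
        if msg_id in self.seen:
            return False          # duplicate delivery: skip it
        self.handler(message)
        self.seen.add(msg_id)     # mark done only after success
        return True
```

This makes at-least-once delivery harmless: a second copy of the same message is detected and dropped instead of being processed twice.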

How do I limit the number of messages in my SQS queue?

I want to have only 200 messages there.
All others should move to the dead letter queue.
We just don't have the capacity to process more messages due to dependency on other services.
I don't think it is possible to limit the number of messages in a queue. You can set a limit to the size of a message in a queue but not the number of messages.
Source: SetQueueAttributes
You definitely can't limit the number of messages in the queue.
What is the nature of your application? Maybe there is a better solution if we knew more about why you need to limit the queue size...
SQS does not have such a limiting feature.
So don't try to do it at the SQS level. Instead, implement this limiting logic as you're pulling messages from the queue.
Keep track of the messages you pull from the queue and send to the 3rd-party service. Once you hit your limit (of 200), junk the message.
Have a counter of messages that are "being processed":
Pull a message from the queue.
Check the counter; if it's less than 200, increment it and send the message to the 3rd-party service.
When the 3rd-party service call returns, decrement the counter.
When you check the counter in step 2 above and it's already at 200, junk the message.
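The counter in the steps above maps naturally onto a semaphore. A sketch of the consumer-side limit (the third-party call is a stand-in supplied by the caller):

```python
import threading

class InFlightLimiter:
    """Allow at most `limit` messages in flight at once;
    anything over the limit is junked, as in the steps above."""

    def __init__(self, limit=200):
        self.slots = threading.Semaphore(limit)

    def submit(self, message, call_third_party):
        # Non-blocking acquire plays the role of "check the counter".
        if not self.slots.acquire(blocking=False):
            return "junked"            # counter already at the limit
        try:
            call_third_party(message)  # the rate-limited downstream service
            return "sent"
        finally:
            self.slots.release()       # decrement when the call returns
```

Instead of junking, the rejected message could also be left on the queue (by simply not deleting it) so it becomes visible again later.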
UPDATE:
If you don't have a delayed visibility set on the queue, you could check ApproximateNumberOfMessagesVisible via the CloudWatch API before allowing the message to go through.
The number of messages available for retrieval from the queue.
Units: Count
Valid Statistics: Average
If you do have a delayed visibility greater than 0, you could do two checks, the second with ApproximateNumberOfMessagesNotVisible.
If that solution doesn't work (yes, this seems a bit much), you could retrieve NumberOfMessagesSent and NumberOfMessagesDeleted and compute the number of messages still in the queue.
So the (pseudo) "code" would look like:
if (ApproximateNumberOfMessagesVisible < 200)
//Send message
//OR
var x = NumberOfMessagesSent - NumberOfMessagesDeleted;
if (x < 200)
//Send message
The above calls are documented in the Amazon SQS CloudWatch metrics reference.
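As a variation on the pseudo-code above, the queue depth can also be read directly with GetQueueAttributes instead of CloudWatch metrics; a boto3 sketch (queue URL is a placeholder):

```python
def under_limit(approx_visible, limit=200):
    """Decide whether another message may be sent."""
    return approx_visible < limit

def send_if_room(queue_url, body, limit=200):
    """Send only when the queue holds fewer than `limit` visible messages.

    Requires AWS credentials; the count is approximate, so this is a
    soft limit, not a hard guarantee.
    """
    import boto3  # AWS SDK for Python
    sqs = boto3.client("sqs")
    attrs = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],  # visible messages
    )["Attributes"]
    if under_limit(int(attrs["ApproximateNumberOfMessages"]), limit):
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)
        return True
    return False   # queue is "full"; the caller routes the message elsewhere
```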
Information Check?
After a second look, I do not believe the below configuration will solve this problem. I will leave it in the answer until confirmed incorrect.
I believe this is possible during the set up process by adjusting the Dead Letter Queue Settings
In the SQS set up you will see: Use Redrive Policy that states:
Send messages into a dead letter queue after exceeding the Maximum Receives.
And just below that: Maximum Receives that states:
The maximum number of times a message can be received before it is sent to the Dead Letter Queue.
This setting sends overflow to an optional secondary queue. In other words, you could enable the Redrive Policy, leave the Dead Letter Queue blank, and set Maximum Receives to 200.