Why does one SQS message not get deleted while others do?

I came across this question in my AWS study:
You create an SQS queue and decide to test it out by creating a simple
application which looks for messages in the queue. When a message is
retrieved, the application is supposed to delete the message. You
create three test messages in your SQS queue and discover that
messages 1 and 3 are quickly deleted but message 2 remains in the
queue. What is a possible cause for this behavior? Choose the 2
correct answers
Options:
A. The order that messages are received in is not guaranteed in SQS
B. Message 2 uses JSON formatting
C. You failed to set the correct permissions on message 2
D. Your application is using short polling
Correct Answer:
A. The order that messages are received in is not guaranteed in SQS
D. Your application is using short polling
Why is A considered one of the correct answers here? I understand that A is true as a statement about SQS, but it doesn't explain the behavior in this question, does it? And why is it not a permissions issue?
Am I missing anything?
Thank you.

I think that a justification for A & D is:
Various workers might be pulling messages from the queue
Given that it is not a FIFO queue, then message order is not guaranteed (A)
Short polling will not necessarily check every SQS server; it samples a subset of them and returns whatever messages it finds, so a given call can simply miss message 2 (D)
Message 2 simply hasn't been processed yet
Frankly, I don't think that D is so relevant, because long polling also returns as soon as it gets a message; the behavior may simply mean that no worker has requested message 2 yet.
B is irrelevant because message content has no impact on retrieval.
C is incorrect because there are no permissions on individual messages. Only the queue, or users accessing the queue, have permissions.
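To make the short- vs long-polling distinction concrete, here is a minimal boto3 sketch (the region and queue URL are placeholders, not from the question): a WaitTimeSeconds of zero is a short poll that samples only a subset of SQS servers, while a positive value turns the call into a long poll that queries all of them.

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")  # region is an assumption
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# Short poll: samples a subset of SQS servers, so a call may miss
# messages that are actually in the queue.
short = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                            WaitTimeSeconds=0)

# Long poll: queries all servers and waits up to 20 seconds, which
# avoids both missed messages and tight empty-response loops.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                           WaitTimeSeconds=20)

for msg in resp.get("Messages", []):
    # Process, then delete explicitly; SQS never deletes on receive.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```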

Related

Is it good to send a new RabbitMQ message from consumer to the same queue?

I do mailings via RabbitMQ: I send a mailing from the main application, and a consumer reads it and sends it out.
A broadcast may consist of different messages, which must be sent in the correct order.
In fact, a mailing list is a list of messages: [message_1, message_2, message_3, message_4]
Some of the messages may already have been sent when, at some point, the third-party service stops accepting requests.
I will describe the process of the consumer:
1. I take a message out of the queue, which contains a distribution.
2. Sending: part 1 > part 2.
3. An error occurs. It remains to send part 3 > part 4.
4. Acknowledge the original message from the queue.
5. Put a new message into the same queue: [message_3, message_4].
Question 1: Is it good to send a new message (from consumer) created from parts of an old one to the same queue?
Question 2: Is it a good solution?
Are there any other solutions?
The sequence you posted loses a message if the handler process crashes between steps 4 and 5, so you have to switch the order of steps 4 and 5: publish the remainder first, then acknowledge.
But as soon as you do that, you have to deal with message duplication. If for some reason (like a bug) the ack fails for a large percentage of messages, you can end up with the same broadcast repeated multiple times in the queue. So if you want to avoid duplicated messages, you have to use some external persistence to perform deduplication.
Also, RabbitMQ doesn't guarantee that messages are delivered in the same order, so you can end up in a situation where two messages for the same address are delivered out of order. Deduplication should therefore happen at the level of individual parts, not entire messages.
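A minimal sketch of that reordered flow with the pika client (the queue name, payload format, and third-party calls are assumptions): the unsent remainder is re-published before the original delivery is acknowledged, so a crash in between produces a duplicate instead of a loss.

```python
import json
import pika

class DeliveryError(Exception):
    """Hypothetical error raised when the third-party service refuses a request."""

def send_to_third_party(part):
    print("sending", part)  # hypothetical stand-in for the real mailing call

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="mailings", durable=True)  # queue name is an assumption

def handle(ch, method, properties, body):
    parts = json.loads(body)  # e.g. ["message_1", "message_2", "message_3", "message_4"]
    sent = 0
    try:
        for part in parts:
            send_to_third_party(part)
            sent += 1
    except DeliveryError:
        remainder = parts[sent:]
        if remainder:
            # Re-publish the remainder BEFORE acking the original, so a
            # crash between these two calls duplicates work rather than
            # losing it (note: this appends to the tail of the queue).
            ch.basic_publish(exchange="", routing_key="mailings",
                             body=json.dumps(remainder),
                             properties=pika.BasicProperties(delivery_mode=2))
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="mailings", on_message_callback=handle)
channel.start_consuming()
```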
Question 2: Is it a good solution? Are there any other solutions?
Consider using an orchestrator like temporal.io which eliminates most of the consistency problems I described.

Understanding SQS message receive amount

I have a queue which is supposed to receive messages sent by a Lambda function. This function is supposed to send each distinct message only once. However, I saw a scarily high receive count on the console.
Since I cannot find any plain-English explanation of the receive count, I need to consult the StackOverflow community. I have two theories to verify:
There are actually not that many messages, and the receive count is so high simply because I polled the messages for a long time, so the same messages were received more than once;
Since the function that sends the messages to the queue is SQS-triggered, those messages might be processed by multiple consumers. Though I have already set a VisibilityTimeout, are messages that have been processed going to be deleted? If they don't remain in the queue, there is no reason for them to be received and processed a second time.
Any debugging suggestion will be appreciated!!
So, the receive count is basically the number of times the lambda (or any other consumer) has received the message. A consumer can receive a message more than once (this is by design, and you should handle that in your logic).
That being said, the receive count also increases if your lambda fails to process the message (or even hits its execution limits). The default is 3 attempts, so if something is wrong with your lambda, you will see at least 3 receives per message.
Also, when you poll messages via the AWS console, you are increasing the receive count as well.
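On the question of whether processed messages get deleted: SQS never deletes a message just because it was received. The consumer must delete it explicitly after processing; otherwise it becomes visible again once the visibility timeout expires, and its ApproximateReceiveCount keeps growing. A minimal boto3 sketch, with the queue URL as a placeholder:

```python
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

def process(body):
    print("processing", body)  # hypothetical processing step

resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                           WaitTimeSeconds=20,
                           AttributeNames=["ApproximateReceiveCount"])
for msg in resp.get("Messages", []):
    process(msg["Body"])
    # Without this call, the message reappears after the visibility
    # timeout and its receive count climbs on every redelivery.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```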

Process messages from Amazon SQS Dead Letter Queue

I want to process messages from an Amazon SQS Dead Letter Queue.
What is the best way to process them?
1. Receive messages from the dead letter queue and process them directly.
2. Receive messages from the dead letter queue, put them back in the main queue, and then process them?
I only need to process messages from the dead letter queue once in a while.
After careful consideration of the various options, I am going with option 2 you mentioned: receive messages from the dead letter queue, put them back in the main queue, and then process them.
Make sure that messages are not lost while being transferred from one queue to the other.
Before putting messages from the DLQ back onto the main queue, make sure that the errors faced in the main listener (mainly coding errors, if any) have been fixed and that any network issues are resolved.
The listener on the main queue has already retried the message, and it will now retry it again. So make sure to skip the already-successful steps of message processing when a message is retried, and to revert successfully processed steps in case of errors. (This will help with ordinary message retries as well.)
The DLQ is meant for unexpected errors, so you may want an on-demand job for doing this.
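A minimal boto3 sketch of such an on-demand redrive job (queue URLs are placeholders): each message is sent to the main queue before it is deleted from the DLQ, so a crash between the two calls duplicates a message rather than losing one.

```python
import boto3

sqs = boto3.client("sqs")
dlq_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue-dlq"  # placeholder
main_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"     # placeholder

while True:
    resp = sqs.receive_message(QueueUrl=dlq_url, MaxNumberOfMessages=10,
                               WaitTimeSeconds=1)
    messages = resp.get("Messages", [])
    if not messages:
        break  # DLQ drained
    for msg in messages:
        # Send first, delete second: failing in between produces a
        # duplicate on the main queue instead of a lost message.
        sqs.send_message(QueueUrl=main_url, MessageBody=msg["Body"])
        sqs.delete_message(QueueUrl=dlq_url, ReceiptHandle=msg["ReceiptHandle"])
```

Note that SQS now also offers a built-in DLQ redrive (the console's "Start DLQ redrive" and the StartMessageMoveTask API), which can replace a hand-rolled loop like this.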
Presumably the message ended up in the Dead Letter Queue for a reason, after failing several times.
It would not be a good idea to put it back in the main queue because, presumably, it would fail again and you would create an infinite loop.
Initially, dead messages should be examined manually to determine the causes of failure. Then, based on this information, an alternate flow could be developed.

How to prevent other workers from accessing a message which is being currently processed?

I am working on a project that will require multiple workers to access the same queue to get information about a file which they will manipulate. Files range in size from mere megabytes to hundreds of gigabytes. For this reason, a fixed visibility timeout doesn't seem to make sense, because I cannot be certain how long processing will take. I have thought of a couple of ways, but if there is a better way, please let me know.
1. The message is deleted from the original queue and put into a 'waiting' queue. When the program finishes processing the file, it deletes the message from the waiting queue; otherwise, the message is deleted from the waiting queue and put back into the original queue.
2. The message id is checked against a database. If the message id is found, the message is ignored. Otherwise, the program starts processing the message and inserts the message id into the database.
Thanks in advance!
Use the default-provided SQS timeout but take advantage of ChangeMessageVisibility.
You can specify the timeout in several ways:
When the queue is created (default timeout)
When the message is retrieved
By having the worker call back to SQS and extend the timeout
If you are worried that you do not know the appropriate processing time, use a default value that is good for most situations, but don't make it so big that things become unnecessarily delayed.
Then, modify your workers to make a ChangeMessageVisibility call to SQS periodically to extend the timeout. If a worker dies, the message stops being extended and it will reappear on the queue to be processed by another worker.
See: MessageVisibility documentation
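A minimal sketch of that heartbeat pattern with boto3 (the queue URL, interval, timeout, and processing call are assumptions): a background thread keeps extending the visibility timeout while the file is processed; if the worker dies, the extensions stop and the message becomes visible to other workers again.

```python
import threading
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/files"  # placeholder

def process_file(body):
    print("processing", body)  # hypothetical long-running work

def heartbeat(receipt_handle, stop_event, interval=240, timeout=300):
    # Re-extend the visibility timeout every `interval` seconds until
    # processing finishes (or this worker dies and the loop stops).
    while not stop_event.wait(interval):
        sqs.change_message_visibility(QueueUrl=queue_url,
                                      ReceiptHandle=receipt_handle,
                                      VisibilityTimeout=timeout)

resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                           WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    stop = threading.Event()
    threading.Thread(target=heartbeat, args=(msg["ReceiptHandle"], stop),
                     daemon=True).start()
    try:
        process_file(msg["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
    finally:
        stop.set()  # stop extending once we're done (or have failed)
```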

Check if Kafka Queue is Empty

Right now I have functionality that writes a couple hundred messages onto a kafka queue. But when all of those messages have been consumed I need to also execute additional functionality. Is there a way to place a listener on a kafka queue to get notified when it has been emptied?
You could solve this two ways, I think:
Kafka's Fetch Response contains a HighwaterMarkOffset, which essentially is an offset of the last message in a partition. You could check whether your message has that offset and if so - you've reached the end. However, this won't work if you have producer and consumer working at the same time - consumer can just consume messages faster and thus stop earlier than you need.
Send a "poison pill" message - say you need to produce 100 messages. Then your producer sends these 100 messages + 1 special message (some UUID for example, but be sure it never appears under normal circumstances in your logic) that would mean "the end". On consumer side you would check whether the received message is a poison pill and shutdown if it is.