First of all, I quote the following text from the RabbitMQ docs:
When a message is requeued, it will be placed to its original position
in its queue, if possible. If not (due to concurrent deliveries and
acknowledgements from other consumers when multiple consumers share a
queue), the message will be requeued to a position closer to queue
head.
Now imagine that there are two messages (A and B) on the same queue, both unacked: when the official docs say "[...] closer to queue head", does that give any guarantee of ordering?
Would message A be queued before B under any condition? For me the answer is no, but I'm looking for advice.
The short answer is:
if both A and B are requeued, and
A was originally added to the queue before B, and
there is no consumer active on the queue between the time A was requeued and B was requeued,
then A will always be placed before B in the queue after both have been requeued.
See my answer here for explanation as to why these assumptions are necessary.
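To make the requeue scenario concrete, here is a minimal sketch with the pika Python client; the queue name and connection details are assumptions, not from the question:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="demo")

def on_message(ch, method, properties, body):
    # basic_nack with requeue=True puts the message back on the queue.
    # If A and B are both requeued this way, A was originally ahead of B,
    # and no other consumer was active in between, A ends up ahead of B again.
    ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)

channel.basic_consume(queue="demo", on_message_callback=on_message)
channel.start_consuming()
```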
I'm using an SQS queue in my application. To handle duplicates I store a unique id from each queue item in a DynamoDB table. Then for each item I check whether it exists first.
How long should I keep these ids in my DynamoDB table? I.e., once an item is processed, for how long afterwards is it possible for duplicates of that item to arrive from SQS?
Thanks
There's no documented time frame as far as I know. It should only be a matter of a few seconds though.
There are two queue types in SQS: standard and FIFO.
Let's assume further that consumers delete messages once they have been handled (if you aren't doing that, it's the first thing you need).
A FIFO queue doesn't deliver duplicates. A standard queue may. Since you are seeing duplicates, let's continue with the standard queue.
A standard queue uses eventual consistency in exchange for high throughput.
With an eventually consistent approach, there is no concrete time after which duplicates can no longer appear.
If you need strong consistency and concrete numbers, go with a FIFO queue.
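If you do move to a FIFO queue, SQS deduplicates for you within a 5-minute window. A minimal boto3 sketch; the queue URL and ids are placeholders:

```python
import boto3

sqs = boto3.client("sqs")
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo",  # placeholder
    MessageBody='{"order_id": 42}',
    MessageGroupId="orders",               # ordering scope within the FIFO queue
    MessageDeduplicationId="order-42-v1",  # duplicates within 5 minutes are dropped
)
```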
Once a message has been removed from the standard queue you can assume that you will not see it again. Therefore, the duplicate threat, in theory, persists until the message has been removed from the queue... either by error, successful completion or manual removal.
That said, if you have a redrive policy set up to retry errored messages after the visibility timeout has expired you probably don't want to treat those retries as duplicates. Therefore you will not only want to store the message's unique id, but its status as well.
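As an illustration of that id-plus-status idea, here is a minimal boto3/DynamoDB sketch; the table name, key, status value, and TTL attribute are assumptions:

```python
import time
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed_messages")  # assumed table

def try_claim(message_id: str, retention_seconds: int = 7 * 24 * 3600) -> bool:
    """Record the id with a status; return False if it was already seen."""
    try:
        table.put_item(
            Item={
                "message_id": message_id,
                "status": "IN_PROGRESS",
                # TTL attribute so DynamoDB expires old ids automatically
                "expires_at": int(time.time()) + retention_seconds,
            },
            ConditionExpression="attribute_not_exists(message_id)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # duplicate: this id was already recorded
        raise
```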
I came across this question in my AWS study:
You create an SQS queue and decide to test it out by creating a simple
application which looks for messages in the queue. When a message is
retrieved, the application is supposed to delete the message. You
create three test messages in your SQS queue and discover that
messages 1 and 3 are quickly deleted but message 2 remains in the
queue. What is a possible cause for this behavior? Choose the 2
correct answers
Options:
A. The order that messages are received in is not guaranteed in SQS
B. Message 2 uses JSON formatting
C. You failed to set the correct permissions on message 2
D. Your application is using short polling
Correct Answer:
A. The order that messages are received in is not guaranteed in SQS
D. Your application is using short polling
Why is A considered one of the correct answers here? I understand A is correct as a statement of how SQS works; however, it doesn't explain the issue in this question, right? Why is it not a permissions issue?
Is there anything I am missing?
Thank you.
I think that a justification for A & D is:
Various workers might be pulling messages from the queue
Given that it is not a FIFO queue, then message order is not guaranteed (A)
Short polling will not necessarily check every server; it samples a subset and simply returns whatever messages it finds (D)
Message 2 simply hasn't been processed yet
Frankly, I don't think that D is so relevant, because long polling also returns as soon as it gets a message; message 2 remaining in the queue simply means that no worker has requested it yet.
B is irrelevant because message content has no impact on retrieval.
C is incorrect because there are no permissions on individual messages. Only the queue, or users accessing the queue, have permissions.
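To make the short-vs-long polling distinction concrete, here is a minimal boto3 sketch; the queue URL is a placeholder, not from the question:

```python
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/test-queue"  # placeholder

# Short polling (WaitTimeSeconds=0): SQS samples only a subset of its
# servers, so an existing message (like message 2) may not be returned.
resp_short = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=0)

# Long polling (WaitTimeSeconds up to 20): SQS queries all servers and
# waits for messages to arrive, so they are far less likely to be missed.
resp_long = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20)
```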
Imagine the following lifetime of an Order.
Order is Paid
Order is Approved
Order is Completed
We chose to use an SQS FIFO queue to ensure all these messages are processed in the order they are produced, so that, for example, an order's status changes to Approved only after it was Paid, and never after it has been Completed.
But let's say that there is an error while trying to Approve an order, and after several attempts the message is moved to the dead-letter queue.
The problem we noticed is that the subsequent message, "Order is Completed", is processed even though the previous message, "Order is Approved", is sitting in the dead-letter queue.
How should we handle this?
Should we check the dead-letter queue for messages with the same MessageGroupId as the one being consumed, assuming we could do this?
Is there a mechanism that we are missing?
Sounds to me like you are using a single queue for multiple types of events, where I would probably recommend (at least) three separate queues:
An order paid event queue
An order approved event queue
An order completed event queue
When an order payment comes in, an event is put into the first queue; once your system has successfully processed that payment, it removes the item from the first queue (deletes the message) and then inserts an 'Order Approved' event into the second queue.
The process responsible for processing those events only watches that queue and does what it needs to do; once complete, it deletes the message and inserts a third message into the third queue so that yet another process can see and act on that message - process it and then delete it.
If anything fails along the way, the message will eventually end up in a dead-letter queue - either the same one, or one per queue, that makes no difference - but nothing that was supposed to happen AFTER the failed event would happen.
It doesn't even sound to me like you need a FIFO queue at all in this case, though there is no real harm (except for the slightly higher cost and lower throughput limits).
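A rough sketch of one hop in that pipeline, assuming boto3, placeholder queue URLs, and a stand-in process_payment function:

```python
import boto3

sqs = boto3.client("sqs")
PAID_Q = "https://sqs.us-east-1.amazonaws.com/123456789012/order-paid"          # placeholder
APPROVED_Q = "https://sqs.us-east-1.amazonaws.com/123456789012/order-approved"  # placeholder

def process_payment(body: str) -> None:
    # Stand-in for your real payment handling; raise on failure.
    print("processing payment:", body)

def handle_paid_events() -> None:
    resp = sqs.receive_message(QueueUrl=PAID_Q, WaitTimeSeconds=20,
                               MaxNumberOfMessages=10)
    for msg in resp.get("Messages", []):
        process_payment(msg["Body"])
        # Only after success: remove from this queue and hand off to the next.
        sqs.delete_message(QueueUrl=PAID_Q, ReceiptHandle=msg["ReceiptHandle"])
        sqs.send_message(QueueUrl=APPROVED_Q, MessageBody=msg["Body"])
        # If process_payment raises, the message is never deleted; it becomes
        # visible again after the visibility timeout, and after maxReceiveCount
        # failures the redrive policy moves it to the dead-letter queue, so the
        # "approved" step never runs for a failed payment.
```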
Source from AWS https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html:
Don't use a dead-letter queue with a FIFO queue if you don't want to break the exact order of messages or operations. For example, don't use a dead-letter queue with instructions in an Edit Decision List (EDL) for a video editing suite, where changing the order of edits changes the context of subsequent edits.
Right now I have functionality that writes a couple hundred messages onto a Kafka queue. But when all of those messages have been consumed, I also need to execute additional functionality. Is there a way to place a listener on a Kafka queue to get notified when it has been emptied?
You could solve this two ways, I think:
Kafka's Fetch Response contains a HighwaterMarkOffset, which is essentially the offset of the last message in a partition. You could check whether your message has that offset, and if so, you've reached the end. However, this won't work if the producer and consumer are running at the same time: the consumer can consume messages faster than they are produced and thus stop earlier than you need.
Send a "poison pill" message: say you need to produce 100 messages. Your producer then sends these 100 messages plus 1 special message (some UUID, for example, but make sure it can never appear under normal circumstances in your logic) that means "the end". On the consumer side you would check whether the received message is the poison pill and shut down if it is.
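A minimal kafka-python sketch of the poison-pill approach; the topic name, broker address, and sentinel value are assumptions, and it presumes a single-partition topic so the sentinel really arrives last:

```python
from kafka import KafkaConsumer, KafkaProducer

TOPIC = "work-items"                    # assumption
POISON_PILL = b"f2c1e6de-END-OF-BATCH"  # sentinel that never occurs in real data

# Producer: send the real messages, then the sentinel.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
for i in range(100):
    producer.send(TOPIC, value=f"message {i}".encode())
producer.send(TOPIC, value=POISON_PILL)
producer.flush()

# Consumer: process until the sentinel arrives, then run the follow-up step.
consumer = KafkaConsumer(TOPIC, bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest")
for record in consumer:
    if record.value == POISON_PILL:
        break  # all prior messages consumed; trigger the additional work here
    print("consumed:", record.value)
consumer.close()
```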
My understanding of the size limit on the message queue in an MFC thread comes from the explanation on the PostThreadMessage page of MSDN.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms644946%28v=vs.85%29.aspx
As stated, the limit is 10000 messages by default. I am trying to understand exactly what this limit means. I see it being one of two things.
Scenario A
I have a GUI that is handling messages. The rate at which messages are being placed in the queue is greater than the rate at which they are being pulled off the queue and handled. In this case messages accumulate; eventually there are 10000 messages on the queue, and when another message tries to join the queue, the post fails.
Scenario B
I have a GUI that is handling messages. The rate at which messages are being placed in the queue is less than the rate at which they are being pulled off the queue and handled. Messages do not accumulate on the queue. But after my queue has seen 10000 messages, it is rendered useless; effectively, my message queue has a limited operational life.
The more I think about it, the answer should be Scenario A... but stranger things have happened...
From the linked article: GetLastError returns ERROR_NOT_ENOUGH_QUOTA when the message limit is hit. So every attempt to send/post a message while the queue is full fails; that's all. In other words, Scenario A is correct: the queue is not rendered useless, and posting works again once messages have been pulled off.
Generally, the destination thread handles the messages and removes them from the queue. PeekMessage with the PM_NOREMOVE flag allows examining a message without removing it. For reference, the PeekMessage function: https://msdn.microsoft.com/en-us/library/windows/desktop/ms644943%28v=vs.85%29.aspx
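To see Scenario A in code, here is a hedged ctypes sketch (Windows only; the target thread id is whatever thread owns the queue you are filling):

```python
import ctypes

user32 = ctypes.WinDLL("user32", use_last_error=True)
ERROR_NOT_ENOUGH_QUOTA = 1816
WM_APP = 0x8000

def try_post(thread_id: int) -> bool:
    """Post a message; return False if the target queue is full (Scenario A)."""
    if user32.PostThreadMessageW(thread_id, WM_APP, 0, 0):
        return True
    if ctypes.get_last_error() == ERROR_NOT_ENOUGH_QUOTA:
        # The queue already holds its limit (10000 by default). Only this
        # post fails; the queue keeps working once messages are drained,
        # so there is no "limited operational life" as in Scenario B.
        return False
    raise ctypes.WinError(ctypes.get_last_error())
```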