Is it good to send a new RabbitMQ message from the consumer to the same queue?

I do mailings via RabbitMQ: the main application publishes a mailing, and the consumer reads it and sends it out.
A broadcast may consist of several messages, which must be sent in the correct order.
In fact, a mailing is a list of messages: [message_1, message_2, message_3, message_4]
Some of the messages may already be sent when, at some point, the third-party service stops accepting requests.
The consumer's process is:
1. Take the message containing the broadcast from the queue.
2. Send part 1, then part 2.
3. An error occurs; parts 3 and 4 remain unsent.
4. Acknowledge the original message from the queue.
5. Put a new message, [message_3, message_4], at the beginning of the same queue.
Question 1: Is it good to send a new message (from the consumer), created from parts of an old one, to the same queue?
Question 2: Is it a good solution? Are there any other solutions?

The sequence you posted loses a message if the handler process crashes between steps 4 and 5, so you have to switch the order of steps 4 and 5 (publish the remainder first, then acknowledge). But as soon as you do that, you have to deal with message duplication: if for some reason (like a bug) the ack fails for a large percentage of messages, you can end up with the same broadcast repeated multiple times in the queue. So if you want to avoid duplicated messages, you have to use some external persistence to perform deduplication.
Also, RabbitMQ doesn't guarantee that messages are delivered in the same order, so you can end up in a situation where two messages for the same address are delivered out of order. Deduplication should therefore happen at the level of individual parts, not entire messages.
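As a minimal sketch of the publish-before-ack ordering, assuming the pika client, a queue named "mailings", a JSON list of parts as the message body, and a placeholder deliver() for the third-party call:

    import json
    import pika

    def deliver(part):
        """Placeholder for the third-party mailing call; raises on failure."""
        print("delivered", part)

    def on_message(channel, method, properties, body):
        parts = json.loads(body)  # e.g. ["message_1", "message_2", "message_3", "message_4"]
        for i, part in enumerate(parts):
            try:
                deliver(part)
            except Exception:
                # Publish the unsent tail BEFORE acking the original, so a crash
                # between the two steps duplicates work instead of losing it.
                channel.basic_publish(
                    exchange="",
                    routing_key=method.routing_key,
                    body=json.dumps(parts[i:]),
                    properties=pika.BasicProperties(delivery_mode=2),  # persistent
                )
                break
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="mailings", durable=True)
    channel.basic_consume(queue="mailings", on_message_callback=on_message)
    channel.start_consuming()

Note that a plain republish lands at the tail of the queue rather than at the front, so parts of other broadcasts may be interleaved; that is one more reason to dedupe at the level of individual parts.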
Question 2: Is it a good solution? Are there any other solutions?
Consider using an orchestrator like temporal.io, which eliminates most of the consistency problems I described.

Related

Understanding SQS message receive amount

I have a queue that is supposed to receive the messages sent by a Lambda function. This function is supposed to send each distinct message only once. However, I saw a scarily high receive count on the console.
Since I cannot find any plain-English explanation of the receive count, I need to consult the Stack Overflow community. I have two theories to verify:
1. There are actually not that many messages, and the receive count is so high simply because I polled the messages for a long time, so they were received more than once.
2. Since the function that sends the messages to the queue is SQS-triggered, those messages might be processed by multiple consumers. Though I already set VisibilityTimeout, are messages that have been processed going to be deleted? If they don't remain in the queue, there is no reason for them to be received and processed a second time.
Any debugging suggestions will be appreciated!
So, the receive count is basically the number of times the Lambda (or any other consumer) has received the message. A consumer can receive a message more than once (this is by design, and you should handle that in your logic).
That being said, the receive count also increases if your Lambda fails to process the message (or even hits the execution limits). The default is 3 attempts, so if something is wrong with your Lambda, you will have at least 3 receives per message.
Note that polling the message via the AWS console also increases the receive count.
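As a rough boto3 illustration (the queue URL is a placeholder): every receive, including one triggered from the console, bumps ApproximateReceiveCount, and only a delete within the visibility timeout prevents the message from coming back:

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=10,                          # long polling
        AttributeNames=["ApproximateReceiveCount"],  # times this message was received
    )

    for msg in resp.get("Messages", []):
        print("receive count:", msg["Attributes"]["ApproximateReceiveCount"])
        # ... process the message ...
        # Deleting within the visibility timeout prevents redelivery; if processing
        # fails (or a Lambda errors out), the message reappears and the count grows.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])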

Why one SQS message doesn't get deleted while others do?

I came across this question in my AWS study:
You create an SQS queue and decide to test it out by creating a simple application which looks for messages in the queue. When a message is retrieved, the application is supposed to delete the message. You create three test messages in your SQS queue and discover that messages 1 and 3 are quickly deleted but message 2 remains in the queue. What is a possible cause for this behavior? Choose the 2 correct answers.
Options:
A. The order that messages are received in is not guaranteed in SQS
B. Message 2 uses JSON formatting
C. You failed to set the correct permissions on message 2
D. Your application is using short polling
Correct Answer:
A. The order that messages are received in is not guaranteed in SQS
D. Your application is using short polling
Why is A considered one of the answers here? I understand that A is correct by the definition of SQS's features; however, it does not explain the issue in this question, right? And why is it not a permissions issue?
Am I missing anything?
Thank you.
I think the justification for A and D is:
Various workers might be pulling messages from the queue.
Given that it is not a FIFO queue, message order is not guaranteed (A).
Short polling will not necessarily check every 'server'; it will simply return a message (D).
Message 2 simply hasn't been processed yet.
Frankly, I don't think D is that relevant, because long polling returns as soon as it gets a message; it would simply mean that no worker has requested the message yet. (See the sketch below for the difference.)
B is irrelevant because message content has no impact on retrieval.
C is incorrect because there are no permissions on individual messages; only the queue, or the users accessing the queue, have permissions.
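For reference, the short-versus-long-polling difference comes down to WaitTimeSeconds on the receive call; a quick boto3 sketch (the queue URL is a placeholder):

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/test-queue"  # placeholder

    # Short polling (WaitTimeSeconds=0) samples only a subset of SQS servers,
    # so a message that is in the queue may not appear in a given response.
    short = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=0)

    # Long polling (up to 20 seconds) queries all servers and waits for a
    # message to arrive before returning.
    long_poll = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)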

Amazon SQS FIFO Queue send message validation

I am working with Amazon's FIFO queue, and when I send a message I would like to know whether the item was added by my call, or whether the message was already in the queue and the call just returned success.
Assuming you only have one process adding messages to the queue, just keep track of the sequenceNumber from the result (i.e. add it to a set, as sketched below) - once you have X unique sequence numbers, you're set (no pun intended).
If you have multiple processes adding messages, you'll need to either:
1. ensure the messages sent by each process are unique (and thus use the same mechanism as the single-process case), or
2. use some mechanism for sharing information between processes.
Doing option 2 properly is likely more expensive than it's worth, and I'd strongly suggest either designing for option 1, or revisiting the requirement that each process sends exactly X unique messages, especially if "approximately X" is good enough.
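A minimal sketch of the single-process case with boto3 (the queue URL, group id, and deduplication id are placeholders). For a deduplicated send, SQS returns the SequenceNumber of the original message, so the set grows only for messages that were genuinely added:

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue.fifo"  # placeholder

    seen_sequence_numbers = set()

    def send_and_check(body, dedup_id):
        """Send one message; return True only if it was newly added to the queue."""
        resp = sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=body,
            MessageGroupId="group-1",        # placeholder group
            MessageDeduplicationId=dedup_id,
        )
        seq = resp["SequenceNumber"]
        if seq in seen_sequence_numbers:
            return False                     # duplicate of an earlier send
        seen_sequence_numbers.add(seq)
        return True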

Check if Kafka Queue is Empty

Right now I have functionality that writes a couple hundred messages onto a Kafka queue. But when all of those messages have been consumed, I need to execute some additional functionality. Is there a way to place a listener on a Kafka queue to get notified when it has been emptied?
You could solve this in two ways, I think:
1. Kafka's fetch response contains a HighwaterMarkOffset, which is essentially the offset of the last message in a partition. You could check whether your message has that offset, and if so, you've reached the end. However, this won't work if the producer and consumer are running at the same time: the consumer can simply consume messages faster and thus stop earlier than you need.
2. Send a "poison pill" message: say you need to produce 100 messages; your producer then sends these 100 messages plus 1 special message (some UUID, for example, but make sure it can never appear in your logic under normal circumstances) that means "the end". On the consumer side, you check whether the received message is the poison pill and shut down if it is.

Separating messages in a simple TCP echo server using Winsock DLL

Please consider a simple echo server using TCP and the Winsock DLL. The client application sends messages from multiple threads. The recv call on the server sometimes returns with multiple messages stored in the passed buffer. At that point, the server has no way of knowing whether this is one huge message or several small ones.
I've read that one could use setsockopt in combination with the TCP_NODELAY option. Besides the fact that MSDN states this option is implemented for backward compatibility only, it doesn't even change the behavior described above.
Of course, I could introduce some kind of delimiter at the end of each message and split the messages on the server side. But I don't think that's the way one should do it. So, what is the right way to do it?
Firstly, TCP_NODELAY is not the right way to do this. TCP is a byte-stream protocol, and any given connection only maintains the byte ordering, not necessarily the boundaries of any given send/write. It's inherently broken to rely on multiple threads that don't use any synchronisation being able to keep the messages they want to send together on the stream.
For example, say thread 1 wants to send the two-byte message "AB" and thread 2 wants to send "XY". Say thread 1 starts first and the output buffer only has room for one byte: send will enqueue "A" and let thread 1 know it has only sent one byte (so it should loop and retry, preferably after waiting for notification that the output queue has more space). Then thread 2 might get some or all of "XY" into the queue before thread 1 can get "B" in. These sorts of problems become more severe on slower connections and on slow, loaded machines (e.g. a low-powered phone that's playing video and multitasking while your app runs over 3G).
The ways to ensure that logical messages stay together over TCP include:
have a single sending thread that picks up messages sequentially from a shared queue (a mutex might be used to let the threads enqueue messages)
contend for a lock (mutex) so that each thread's send loop can run uninterrupted until a complete message has been sent (this wouldn't suit some apps, because any of the threads could be held up for quite a while doing comms work); see the sketch after this list
use a separate TCP connection per thread
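As a minimal sketch of the second option, written in Python rather than Winsock C for brevity, and combined with a 4-byte length prefix so the receiver can recover the message boundaries (all names are illustrative):

    import socket
    import struct
    import threading

    send_lock = threading.Lock()

    def send_message(sock, payload):
        """Frame the payload and send it without interleaving with other threads."""
        frame = struct.pack("!I", len(payload)) + payload
        with send_lock:
            sock.sendall(frame)   # sendall loops internally until every byte is queued

    def recv_exact(sock, n):
        """Read exactly n bytes from the stream, or raise if the peer closes."""
        chunks = []
        while n:
            chunk = sock.recv(n)
            if not chunk:
                raise ConnectionError("peer closed mid-message")
            chunks.append(chunk)
            n -= len(chunk)
        return b"".join(chunks)

    def recv_message(sock):
        """Read one length-prefixed message, restoring the sender's boundaries."""
        (length,) = struct.unpack("!I", recv_exact(sock, 4))
        return recv_exact(sock, length)

The same pattern maps directly onto Winsock: a critical section around a send-all loop, and a receive loop that reads the 4-byte header before the body.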