What percentage of SQS messages are delivered at least once? - amazon-web-services

I understand that standard SQS uses "at least once" delivery, while FIFO messages are delivered exactly once.
What percentage (roughly) of SQS messages will be duplicated? This seems like an important factor when weighing standard queues vs FIFO. I wonder if it depends on message throughput?

Amazon does not publish a number for this, not even a ballpark figure.
"On rare occasions" is the best I can find:
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/standard-queues.html
Based on Amazon's explanation of why this can happen, I think it is unrelated to your message throughput. You should treat it as an "expected" AWS platform glitch. It will not be an issue as long as your message handler is idempotent.

The SQS documentation says that duplicate messages can occur if one of the nodes hosting the queue is down and does not receive the delete request.
Based on that, you should see a fairly low number of duplicated messages. If your application cannot tolerate duplicated messages, then you probably want to use a FIFO queue.

I think the question you should be asking is "Is my process idempotent to handle duplicate messages?"
If not, make your process idempotent and use standard SQS queue.
If yes, use standard SQS queue.
You can always use SQS FIFO queue but that will make your application code "incompatible" with other queue systems that do not support such functionality.
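To illustrate what "idempotent" means in the answers above, here is a minimal sketch of a handler that ignores repeated deliveries. It assumes the SQS-assigned `MessageId` is a good enough deduplication key and keeps the seen IDs in memory; `IdempotentHandler` and the sample messages are hypothetical, and a real consumer would need a durable store shared by all workers.

```python
from typing import Callable, Dict, Set


class IdempotentHandler:
    """Wraps a side-effecting function so repeated deliveries are ignored."""

    def __init__(self, process: Callable[[Dict], None]) -> None:
        self._process = process
        # Assumption: an in-memory set; a real consumer needs a durable store.
        self._seen: Set[str] = set()

    def handle(self, message: Dict) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        message_id = message["MessageId"]
        if message_id in self._seen:
            return False
        self._process(message)
        # Record the ID only after processing succeeded, so a failure can be retried.
        self._seen.add(message_id)
        return True


processed = []
handler = IdempotentHandler(lambda m: processed.append(m["Body"]))
deliveries = [
    {"MessageId": "a1", "Body": "charge order 17"},
    {"MessageId": "a1", "Body": "charge order 17"},  # duplicate delivery
    {"MessageId": "b2", "Body": "charge order 18"},
]
results = [handler.handle(m) for m in deliveries]
```

With a handler like this, a duplicate delivery from a standard queue is harmless: the second copy of `a1` is skipped and only two side effects happen.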

Related

Throughput in Standard SQS vs FIFO SQS with a unique groupId for every message

I do not care much about the order of events but I would like the message to be processed exactly once. The lambda listening to SQS messages will store it in DynamoDB so throughput is pretty important as I have multiple microservices (as producers) writing messages to this SQS that will be read by a single microservice.
Exactly-once processing is something the FIFO queue supports, but its throughput is said to be poor.
Is the throughput of the FIFO queue the same as the Standard queue if each message has a unique groupId?
If not, my next option is probably to use "attribute_not_exists" in DynamoDB while storing the message.
Which of these should work better?
Messages per second:
FIFO:
  30,000 (with batching + high throughput mode)
  3,000 (without batching + high throughput mode)
  3,000 (with batching)
  300 (without batching)
Standard:
  Nearly unlimited
https://aws.amazon.com/sqs/faqs/
To process exactly once, you need to use a FIFO queue with a deduplication ID.
If your throughput requirement is below the limit mentioned above, then you're fine with the FIFO queue.
If not, using DynamoDB as in your original plan is an alternative. But with that approach you have to manage a lot of things yourself, such as deleting the message, tracking messages that have been read but not yet fully processed, and so on.
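The "is my throughput below the limit" check can be written down as a small lookup. This is only a sketch using the figures quoted in this answer; `FIFO_LIMITS` and `fifo_can_handle` are hypothetical names, and the current AWS quotas should be checked before relying on any of these numbers.

```python
# Messages per second for FIFO queues, keyed by (batching, high_throughput_mode).
# Figures are the ones quoted in this thread, not values fetched from AWS.
FIFO_LIMITS = {
    (True, True): 30_000,
    (False, True): 3_000,
    (True, False): 3_000,
    (False, False): 300,
}

MAX_BATCH_SIZE = 10        # messages per batched API call
BASE_CALLS_PER_SEC = 300   # API calls per second without high throughput mode


def fifo_can_handle(required_per_sec: int, batching: bool, high_throughput: bool) -> bool:
    """Return True if the quoted FIFO limit covers the required message rate."""
    return required_per_sec <= FIFO_LIMITS[(batching, high_throughput)]


# The batched figure is just calls per second times messages per call.
batched_limit = BASE_CALLS_PER_SEC * MAX_BATCH_SIZE

fits_with_batching = fifo_can_handle(2_500, batching=True, high_throughput=False)
fits_without_batching = fifo_can_handle(2_500, batching=False, high_throughput=False)
```

For example, a workload of 2,500 messages per second fits a FIFO queue only if the producers batch their sends.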
FIFO SQS queues have different rate limits than standard SQS queues, regardless of how message group IDs are used.
SQS Standard queues support a nearly unlimited number of API calls per second, per API action (SendMessage, ReceiveMessage, or DeleteMessage).
FIFO SQS supports 300 TPS for each API method
Look at the quota docs here
Also, AWS has a newer high throughput mode for FIFO queues, which might interest you.
With batching (a maximum of 10 messages per API call), you can handle 3,000 messages per second with a FIFO queue: 300 calls per second times 10 messages per call.
Regarding making sure you don't handle the same message twice: have you looked at the FIFO deduplication ID? I am not sure it is exactly what you need, but it sounds close to your requirement.
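As a rough sketch of how a deduplication ID might be attached on the producer side: the function below only builds the keyword arguments for a `SendMessage` call. `QueueUrl`, `MessageBody`, `MessageGroupId` and `MessageDeduplicationId` are real SQS parameters, but `build_fifo_send_kwargs`, the queue URL and the choice of a SHA-256 hash of the body as the ID are assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict


def build_fifo_send_kwargs(queue_url: str, payload: Dict[str, Any], group_id: str) -> Dict[str, str]:
    """Build SendMessage arguments with a content-derived deduplication ID."""
    # sort_keys gives the same body (and so the same hash) for equal payloads.
    body = json.dumps(payload, sort_keys=True)
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": group_id,
        "MessageDeduplicationId": dedup_id,
    }


# Hypothetical queue URL, used only to make the example complete.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"

first = build_fifo_send_kwargs(QUEUE_URL, {"order": 17, "action": "charge"}, "orders")
retry = build_fifo_send_kwargs(QUEUE_URL, {"action": "charge", "order": 17}, "orders")
other = build_fifo_send_kwargs(QUEUE_URL, {"order": 18, "action": "charge"}, "orders")
```

With boto3 the result would be passed along as `boto3.client("sqs").send_message(**first)`. A retried send of the same payload carries the same ID, so the queue can drop it within the deduplication interval. Note this only removes duplicates introduced by the producer; it says nothing about a consumer that fails after doing half its work.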
SQS delivery guarantee is at least once. Your application must be designed to handle processing duplicate messages.
I'd strongly recommend building your application this way.
If you must process some type of data exactly once, you need a strongly consistent system. Consider using DynamoDB and conditional updates.
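A conditional write could look roughly like the sketch below. `put_item` with a `ConditionExpression` of `attribute_not_exists(...)` and the `ConditionalCheckFailedException` error code are real DynamoDB features; `store_once`, the `message_id` key name and the in-memory `FakeTable` stand-in (used so the example runs without AWS) are assumptions.

```python
from typing import Any, Dict


def store_once(table: Any, message_id: str, payload: str) -> bool:
    """Write the item only if message_id is new. Return False for a duplicate."""
    try:
        table.put_item(
            Item={"message_id": message_id, "payload": payload},
            ConditionExpression="attribute_not_exists(message_id)",
        )
        return True
    except Exception as err:
        # botocore's ClientError carries the error code in err.response.
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise


class FakeConditionFailed(Exception):
    """Stand-in for the error DynamoDB returns when the condition is not met."""
    response = {"Error": {"Code": "ConditionalCheckFailedException"}}


class FakeTable:
    """In-memory stand-in for a DynamoDB table, for illustration only."""

    def __init__(self) -> None:
        self.items: Dict[str, Dict[str, str]] = {}

    def put_item(self, Item: Dict[str, str], ConditionExpression: str) -> None:
        if Item["message_id"] in self.items:
            raise FakeConditionFailed()
        self.items[Item["message_id"]] = Item


table = FakeTable()
first_write = store_once(table, "m1", "hello")
second_write = store_once(table, "m1", "hello")  # duplicate delivery
```

In a real consumer, `table` would be a boto3 `Table` resource, and a `False` result means the message was already stored and can simply be deleted from the queue.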

Using Amazon SQS for multiple consumers receiving the same message

I have one primary application sending messages to SQS Queue and want 4 consumer applications to consume the same message and process it however they want to
I am not sure what Queuing architecture to use for this purpose.
I see the options of Standard SQS, SQS FIFO, (SQS + SNS Topic) & Kinesis.
For the functionality that I want, it seems like either (SQS + SNS Topic) or Kinesis would be the way to go.
But I also have a question regarding Standard SQS & SQS FIFO - Is it not possible for all of the consumers to get the same message if I use SQS FIFO or Standard SQS?
I am overwhelmed by all the options and the amount of information available on queues, and I am still unsure which architecture to choose.
Primary source of information is Amazon docs and https://www.schibsted.pl/blog/choosing-best-aws-messaging-service/
Some of the questions I went through on stackoverflow:
Link_1 This post answers the question of using multiple consumers with the queue, but I am not sure whether it addresses the case of the same message being consumed by multiple consumers
Link_2
This one answers why Kinesis can be used for my scenario
Helpful_Info I used this article just to understand the differences
I would really appreciate some help on this. I am trying to read as much as possible but would definitely appreciate if someone can help me make the right decision
This looks like a perfect use case for SNS-SQS fanout notifications - the messages are sent to an SNS "topic", and SNS will deliver it to multiple SQS queues that are "subscribed" to that topic.
Some notes:
Each consumer application (that is attached to a queue) will consume at its own rate - this means that it's possible for one or more to "fall behind". In general, that should be ok as long as the consumers are independent - the queue acts as the buffer so no information is lost.
If you need them to be in sync, then that won't work - you should just use a single queue, and a process to synchronously poll the queue and deliver the message to each application.
You can perform similar logic with Kinesis (it's built to have multiple consumers), but the extra development complexity and cost is typically not worthwhile unless you are dealing with very large message volumes
Kinesis bills by data volume (megabytes), while SQS bills by message count - do the math for your use case.
Don't worry about SQS FIFO unless you need the guarantees it provides around ordering. Plain SQS is already roughly ordered, and will suffice for most use cases.
For your use case, SNS seems to be a great choice; however, if you want to persist the messages, you can use SQS with SNS.
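The fanout idea can be pictured with a small in-memory model: a topic copies every published message into each subscribed queue, and each consumer drains its own queue at its own pace. This is only a simulation of the pattern, not SNS or SQS API code; `Topic` and the queue names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict


class Topic:
    """In-memory model of a topic that fans out to subscribed queues."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = {}

    def subscribe(self, name: str) -> Deque[str]:
        """Create a queue for one consumer application and subscribe it."""
        queue: Deque[str] = deque()
        self._queues[name] = queue
        return queue

    def publish(self, message: str) -> None:
        """Deliver a copy of the message to every subscribed queue."""
        for queue in self._queues.values():
            queue.append(message)


topic = Topic()
billing = topic.subscribe("billing")
audit = topic.subscribe("audit")

topic.publish("order-created:17")
topic.publish("order-created:18")

# The billing consumer has handled one message; audit has fallen behind,
# but its queue still buffers both messages.
first_for_billing = billing.popleft()
```

Each queue holds its own copy, which is why one slow consumer does not block or starve the others.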

Is it possible to set up SQS standard queue to be sure to process only once my messages?

Is it possible to set up my SQS queue on AWS so that each message is processed only once?
Maybe by tweaking long/short polling (would that have any impact on once-only processing?),
or the visibility timeout,
or by adopting some best practice in my workers' application?
Or should I definitely move to a FIFO queue to be guaranteed once-only processing?
A standard SQS queue will deliver each message at least once, but there is a chance a message is processed more than once. Say you have a visibility timeout of 30 seconds and the consumer takes 35 seconds to process the message: the message becomes visible in the queue again and another consumer can pick it up. If duplicate messages are not a problem and you expect high throughput, SQS standard is the right choice. Even if you tweak short or long polling, you cannot guarantee that you will avoid duplicates with SQS standard.
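The 30-second timeout and 35-second job from this example can be replayed with a toy model of a queue. This is a simulation under simplifying assumptions (one message, an explicit clock passed in as `now`), not SQS client code; `SimulatedQueue` is a hypothetical name.

```python
from typing import Dict, Optional


class SimulatedQueue:
    """Toy queue that hides a received message for visibility_timeout seconds."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        # message_id -> time from which the message is visible again
        self._visible_from: Dict[str, float] = {}

    def send(self, message_id: str) -> None:
        self._visible_from[message_id] = 0.0

    def receive(self, now: float) -> Optional[str]:
        """Return a visible message and hide it, or None if nothing is visible."""
        for message_id, visible_from in self._visible_from.items():
            if now >= visible_from:
                self._visible_from[message_id] = now + self.visibility_timeout
                return message_id
        return None

    def delete(self, message_id: str) -> None:
        self._visible_from.pop(message_id, None)


queue = SimulatedQueue(visibility_timeout=30)
queue.send("m1")

first = queue.receive(now=0)     # consumer A starts a 35-second job
hidden = queue.receive(now=10)   # inside the timeout: nothing is visible
second = queue.receive(now=31)   # timeout expired before A finished
queue.delete("m1")               # A finishes at t=35 and deletes the message
after_delete = queue.receive(now=70)
```

Here `m1` is handed out twice (at t=0 and t=31), which is exactly the duplicate processing described above. Raising the visibility timeout above the worst-case processing time avoids this particular cause, but not the other sources of duplicates in a standard queue.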
If you need to process each message exactly once and cannot tolerate any duplicates, then FIFO is the right choice. Keep in mind that FIFO throughput is not as high as SQS standard: FIFO queues support up to 300 messages per second without batching.
FIFO queues are designed to never introduce duplicate messages. However, your message producer might introduce duplicates in certain scenarios: for example, if the producer sends a message, does not receive a response, and then resends the same message. Amazon SQS APIs provide deduplication functionality that prevents your message producer from sending duplicates. Any duplicates introduced by the message producer are removed within a 5-minute deduplication interval.
Please read more about SQS standard here
Please read more about SQS FIFO here

Is Amazon SQS a good service for syncup between two components?

I have two components: one is a cloud-based CLS app, and the other is a regular Java-based admin application which talks to MySQL.
Considering that SQS is not FIFO, I am not sure when I will receive a message at my consumer end. I might also receive a newer message before an older message about the same data, causing data inconsistency.
If I want to sync up data between these two systems, is SQS a good service?
Is SQS generally a good tool in such sync-up scenarios?
SQS is "loosely-FIFO" and the SQS FAQ recommends adding sequencing information to each message to achieve ordering:
If your system requires the order of messages to be preserved, place
sequencing information in each message so that messages can be ordered
when they are received. Source
Messages that need to arrive in a specific order may not be a good fit for a standard SQS queue. However, you can set a message sequence counter when sending each message. At the receiving end, keep processing messages as long as the sequence is right. If an out-of-sequence message arrives, hold it until the missing message arrives, then process that message and the ones that arrived in between, in order.
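The hold-and-release logic described above can be sketched as a small buffer. This assumes the producer numbers messages consecutively starting at 1 and that a lower-than-expected number means a duplicate; `Resequencer` and the sample messages are hypothetical.

```python
from typing import Dict, List


class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_sequence: int = 1) -> None:
        self._next = first_sequence
        self._pending: Dict[int, str] = {}

    def accept(self, sequence: int, body: str) -> List[str]:
        """Buffer the message and return every message that is now ready, in order."""
        if sequence < self._next:
            return []  # already released earlier: treat as a duplicate
        self._pending[sequence] = body
        ready: List[str] = []
        # Release the longest run of consecutive sequence numbers available.
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


resequencer = Resequencer()
released = [
    resequencer.accept(1, "create row"),
    resequencer.accept(3, "delete row"),  # arrives early, held back
    resequencer.accept(2, "update row"),  # fills the gap, releases 2 and 3
]
```

One design caveat: if a message is lost for good, everything after it stays buffered, so a real consumer would also need a timeout or a gap-handling policy.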
On November 17th, 2016, FIFO queues were introduced in certain regions (US East (Ohio) and US West (Oregon)), complementing the standard queue. The order in which messages are sent and received is strictly preserved, and a message is delivered once and remains available until a consumer processes and deletes it; duplicates are not introduced into the queue.
FIFO queues use the same API actions as standard queues, and the mechanics for receiving and deleting messages and changing the visibility timeout are the same. However, when sending messages, you must specify a message group ID.
Amazon SQS has just gained FIFO Queues with Exactly-Once Processing & Deduplication:
Today we are making SQS even more powerful and flexible with support
for FIFO (first-in, first-out) queues. We are rolling out this new
type of queue in two regions now, and plan to make it available in
many others in early 2017.
These queues are designed to guarantee that messages are processed
exactly once, in the order that they are sent, and without duplicates.
[...]
[emphasis mine]
As emphasized, these new FIFO SQS queues provide more options to cover the use case at hand, but are not yet available in all SQS regions [initially only in US East (Ohio) and US West (Oregon)]. Also, the SQS FAQ for FIFO queues outlines notable differences between standard and FIFO queues that should be considered before deciding which queue type matches a particular use case.

Does MSMQ have "lock until expire" functionality similar to Amazon SQS?

I've been using AWS SQS, which has a nice feature that when a message is claimed from the queue it locks for a period of time. During this lock if it is processed successfully the message is marked as completed. If the processing fails (and no response is received from the message processor), after a period of time the lock expires and the message is available for another processor to pick up.
Now I have a requirement to use queues outside of SQS (mostly for latency reasons, but potentially for cost reasons too). I'm really looking for a queue provider that has the same characteristic. MSMQ would be the obvious choice for me, since it's already installed and we use it elsewhere, but I can't find any functionality that handles failed messages in the same way.
Does MSMQ allow for this, or is there an easy way to replicate it?
Alternatively, is there another lightweight, open-source messaging service that does?
MSMQ does this already. If you read a message within a transaction and the transaction aborts then the message will reappear in the queue.