AWS SQS queue improve performance - amazon-web-services

I tried implementing an AWS SQS Queue to minimise the database interaction from the backend server, but I am having issues with it.
I have one consumer process that looks for messages from one SQS queue.
A JSON message is placed in the SQS queue when Clients click on a button in a web interface.
A backend job in the app server picks up the JSON message from the SQS queue, deletes the message from the queue and processes it.
To test the functionality, I implemented the logic for one client. It was running fine. However, when I added 3 more clients it was not working properly. I was able to see that the SQS queue was stuck up with 500 messages and the backend job was working properly reading from the queue.
Do I need to increase the number of backend jobs or increase the number of client SQS queues? Right now all the clients send the message to same queue.
How do I calculate the number of backend jobs required? Also, is there any setting to make SQS work faster?

Having messages stored in a queue is good - in fact, that's the purpose of using a queue.
If your backend systems cannot consume messages at the rate that they are produced, the queue will act as a buffer to retain the messages until they can be processed. A good example is this AWS re:Invent presentation where a queue is shown with more than 200 million messages: Building Elastic, High-Performance Systems with Amazon SQS and Amazon SNS
If it is important to process the messages quickly, then scale your consumers to match the rate of message production (or faster, so you can consume backlog).
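As a rough way to answer "how many backend jobs do I need", here is a minimal sizing sketch. It assumes each consumer handles one message at a time with a known average processing time; the function name and parameters are made up for illustration, and real workloads vary, so treat the result as a starting point rather than a guarantee.

```python
import math

def consumers_needed(arrival_rate, seconds_per_message, backlog=0, drain_seconds=None):
    """Estimate how many single-threaded consumers are needed.

    arrival_rate: messages produced per second
    seconds_per_message: average time one consumer spends on one message
    backlog: messages already waiting in the queue
    drain_seconds: time allowed to clear the backlog (None = ignore backlog)
    """
    rate = arrival_rate
    if backlog and drain_seconds:
        # Extra throughput needed to work off the backlog in time.
        rate += backlog / drain_seconds
    # Each consumer handles 1 / seconds_per_message messages per second.
    return max(1, math.ceil(rate * seconds_per_message))

print(consumers_needed(4, 0.5))                                   # 2
print(consumers_needed(4, 0.5, backlog=500, drain_seconds=100))   # 5
```

With 4 messages per second and 0.5 seconds per message, two consumers keep pace; clearing a 500-message backlog within 100 seconds at the same time needs five.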
You mention that your process "picks up the JSON message from the SQS queue, deletes the message from the queue and processes it". Please note that best practice is to receive a message from the queue, process it and then delete it (only after it is fully processed). This way, if your process fails, the message will automatically reappear on the queue once its visibility timeout expires. This makes your application more resilient to failure.
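The receive, process, then delete order could look roughly like the sketch below. It assumes `sqs` behaves like a boto3 SQS client (for example `boto3.client("sqs")`), and `handler` is a hypothetical function of yours that raises an exception when processing fails.

```python
def consume_batch(sqs, queue_url, handler):
    """Receive up to 10 messages, process each, delete only on success.

    `sqs` is expected to behave like a boto3 SQS client; `handler` is a
    hypothetical callable that raises an exception when processing fails.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,   # fetch a batch per request
        WaitTimeSeconds=20,       # long polling: fewer empty responses
    )
    processed = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Leave the message alone; it becomes visible again after
            # the visibility timeout and can be retried.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        processed += 1
    return processed
```

Fetching up to 10 messages per request with long polling is also one of the simpler ways to get more throughput out of a single consumer.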

Related

AWS SQS standard queue or FIFO queue when message can not be duplicated?

We plan to use the AWS SQS service to queue events created by a web service and then use several workers to process those events. Each event must be processed only once. According to the AWS SQS documentation, a standard queue can "occasionally" deliver a duplicate message but has unlimited throughput. A FIFO queue will not deliver duplicate messages but has a throughput limit of 300 API calls per second (with batchSize=10, equivalent to 3000 messages per second). Our current peak-hour traffic is only 80 messages per second, so both are fine in terms of throughput. But when I started to use a FIFO queue, I found that I need to do extra work, such as providing the extra parameters "MessageGroupId" and "MessageDeduplicationId" or enabling the "ContentBasedDeduplication" setting. So I am not sure which one is the better solution. We just need messages not to be duplicated; we don't need them to be FIFO.
Solution #1:
Use AWS SQS FIFO queue. For each message, need to generate a UUID for "MessageGroupId" and "MessageDeduplicationId" parameters.
Solution #2:
Use AWS SQS FIFO queue with "ContentBasedDeduplication" enabled. For each message, need to generate a UUID for "MessageGroupId".
Solution #3:
Use AWS SQS standard queue with AWS ElastiCache (either Redis or Memcached). For each message, the "MessageId" field will be saved in the cache server and checked for duplication later on. Existence means this message has been processed. (By the way, how long should the "MessageId" exist in the cache server? The AWS SQS documentation does not mention how far back a message could be duplicated.)
You are making your systems complicated with SQS.
We have moved to Kinesis Streams, and it works flawlessly. Here are the benefits we have seen:
Order of Events
Trigger an Event when data appears in stream
Deliver in Batches
Leave the responsibility to handle errors to the receiver
Go Back with time in case of issues
Buggier Implementation of the process
Higher performance than SQS
Hope it helps.
My first question would be: why is it so important that you don't get duplicate messages? An ideal solution would be to use a standard queue and design your workers to be idempotent. For example, if the messages contain something like a task ID and you store each completed task's result in a database, ignore messages whose task ID already exists in the DB.
Don't use receipt handles for application-side deduplication, because those change every time a message is received. In other words, SQS doesn't guarantee the same receipt handle for duplicate messages.
If you insist on de-duplication in the queue itself, then you have to use a FIFO queue.
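A minimal sketch of the idempotent-worker idea, assuming each message carries a task ID. It uses an in-memory SQLite database only to keep the example self-contained; the table layout and function names are illustrative, and a real deployment would use whatever database the workers share.

```python
import sqlite3

def make_store():
    # In-memory database for the sketch; real workers would share a store.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE results (task_id TEXT PRIMARY KEY, result TEXT)")
    return db

def handle_once(db, task_id, compute):
    """Run compute() for task_id unless a result is already stored.

    Returns True if the task was processed, False if it was a duplicate.
    """
    already_done = db.execute(
        "SELECT 1 FROM results WHERE task_id = ?", (task_id,)
    ).fetchone()
    if already_done:
        return False
    result = compute()
    # INSERT OR IGNORE keeps the first stored result if two workers race.
    db.execute(
        "INSERT OR IGNORE INTO results (task_id, result) VALUES (?, ?)",
        (task_id, result),
    )
    db.commit()
    return True

db = make_store()
print(handle_once(db, "task-42", lambda: "done"))  # True: first delivery
print(handle_once(db, "task-42", lambda: "done"))  # False: duplicate delivery
```

Note that two workers receiving the same message at nearly the same moment could still both run `compute()`; the primary key only ensures a single stored result, so the work itself should be safe to repeat.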

Amazon SQS messages

For Amazon SQS - If the number of requests a cron job can make in a given window is exceeded by the number of messages received in SQS, how would you ensure all messages are processed in that window?
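Standard queues can hold an unlimited number of messages. The figures often quoted (about 120,000 for standard queues and 20,000 for FIFO queues) are limits on in-flight messages, that is, messages that have been received but not yet deleted. Messages can be retained for up to 14 days, which is plenty.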
So Even If the number of requests a cron job can make in a given window is exceeded by the number of messages received in SQS:
All the Incoming messages are still stored in the queue.
The Cron Job/Jobs just takes the first message from the queue and processes it.
Make sure you set the visibility timeout appropriately when reading from the queue, and delete each message from the queue after it is processed. Otherwise, the cron job keeps processing the same message again and again.
Refer to the other AWS SQS features for better processing and utilization.
To ensure all messages are processed in that window, as I mentioned, you just need to run the cron job with the proper settings.
If you have a delivery deadline, which is the case most of the time, configure the environment to autoscale.
You can even configure scale-in and scale-out based on the number of messages in the SQS queue.
Happy to Help.. :)
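As an illustration of a cron-style run that works through the queue within its window, here is a hedged sketch. It assumes `sqs` behaves like a boto3 SQS client and `handler` is your own per-message function; it treats one empty long-poll response as "queue drained", which is a simplification.

```python
import time

def drain_queue(sqs, queue_url, handler, window_seconds, clock=time.monotonic):
    """Process messages in batches until the queue is empty or time runs out.

    `sqs` is expected to behave like a boto3 SQS client; `handler` is a
    hypothetical per-message callable. Returns the number of messages handled.
    """
    deadline = clock() + window_seconds
    handled = 0
    while clock() < deadline:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 per request
            WaitTimeSeconds=1,       # short wait so an empty queue ends the run
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # nothing left to do in this run
        for message in messages:
            handler(message["Body"])
            # Delete only after processing, so failures are retried later.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
    return handled
```

Anything not handled before the deadline simply stays in the queue for the next run, or for additional workers if you scale out.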

AWS SQS dump/restore

Is it possible to dump a SQS queue to open space for "urgent" messages and then restore the dump to keep SQS queue on track?
I am not talking about aws cli commands but any possibility of doing it.
Of course I could open a new SQS and change the application to look after that new queue, but it would have some implications.
No it's not possible. The design pattern I've seen AWS recommend when you want to have "high priority" messages is this:
Create 2 queues, one for high-priority messages and one for regular-priority messages.
Have your application always scan the high-priority queue first to check for new messages.
If you don't receive any messages from the high-priority queue, scan the regular-priority queue for messages.
AWS SQS does not provide a priority based queue at the moment. But you can do certain implementations and build a priority queue for your application (consumer). Following are some implementations you can use.
1) As @Markb mentioned, you can create two SQS queues, one for high-priority messages and the other for regular messages. Make sure the application polls the high-priority queue first and then moves on to the regular queue.
2) If using a single SQS queue, have a few worker threads on the application side that collect all the messages from SQS and check which ones have a higher priority. Process those first.
3) Use a combination of SQS and SNS. Send all the regular messages to SQS. If there are high-priority messages, send them to SNS to direct them to a specific endpoint in your application. On your application side (the consumer side), have an endpoint that listens for high-priority messages coming from SNS, and a process that polls SQS to retrieve the regular messages.
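The two-queue approach from option 1 can be sketched as follows, assuming `sqs` behaves like a boto3 SQS client and that the queue URLs are passed in order from highest to lowest priority. The function name is made up for this example.

```python
def receive_by_priority(sqs, queue_urls, max_messages=10):
    """Return (queue_url, messages) from the first queue that has messages.

    `queue_urls` is ordered from highest to lowest priority. `sqs` is
    expected to behave like a boto3 SQS client.
    """
    for queue_url in queue_urls:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=max_messages,
            WaitTimeSeconds=0,  # do not block on an empty high-priority queue
        )
        messages = response.get("Messages", [])
        if messages:
            return queue_url, messages
    return None, []
```

The caller still needs the returned queue URL to delete each message from the right queue once it has been processed.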

Manual acknowledgement of SQS?

Currently we are using 'RabbitMQ' for reliable message delivery, and we plan to move to SQS. RabbitMQ monitors its consumers through TCP; when any consumer of the queue goes down, it automatically re-queues the message for processing.
Will SQS monitor all its slaves? Will the message be visible in the queue if one of its consumers goes down while processing the message?
I have tried to find this in the documentation, but I could not find anything.
If by 'slaves', you mean SQS consumers, then no, SQS does not monitor the consumers from the queue.
In a nutshell, SQS works like this:
A consumer requests a message from the queue.
SQS sends the message to the consumer to process and makes that message temporarily invisible to other consumers.
When the consumer is finished processing the message, it sends a 'DeleteMessage' request back to SQS and SQS removes that item from the queue.
If a consumer does not send the DeleteMessage request soon enough (within the configurable visibility timeout), then SQS will make the message visible in the queue again automatically.
So SQS doesn't monitor your consumers, but if a consumer requests messages and does nothing with them, they will eventually end up back in the queue to be processed by someone else.
But if your queue doesn't have any consumers, then sooner or later (14 days max), the messages will be deleted altogether. A dead-letter queue, if you set one up, catches messages that were received repeatedly without being deleted; it does not catch messages that simply expire.
It is usually a good idea to setup your queue consumers in an auto-scaling group, with a health-check that can verify that it is running/processing properly. If an instance fails a health check, it will be terminated and a new instance spun up to continue the work in the queue. Optionally, you can spin up extra instances if the size of the SQS queue grows to meet peak demand.
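To make the visibility timeout behaviour concrete, here is a toy in-memory model. It is not the real service: times are passed in explicitly, and messages are deleted by ID, whereas SQS itself uses receipt handles.

```python
class VisibilityQueue:
    """Toy in-memory model of SQS visibility timeout (not the real service)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}      # message id -> body
        self._hidden_until = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = body
        self._hidden_until[self._next_id] = 0
        return self._next_id

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for message_id, body in self._messages.items():
            if self._hidden_until[message_id] <= now:
                self._hidden_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        self._messages.pop(message_id, None)
        self._hidden_until.pop(message_id, None)


queue = VisibilityQueue(visibility_timeout=30)
queue.send("job-1")
first = queue.receive(now=0)     # consumer A takes the message, then crashes
print(queue.receive(now=10))     # None: still invisible to other consumers
second = queue.receive(now=31)   # timeout passed, consumer B gets it
queue.delete(second[0])          # B finishes and deletes it
print(queue.receive(now=100))    # None: the message is gone for good
```

Consumer A never has to report its failure: not deleting the message in time is enough for another consumer to pick it up.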

What is the difference between Amazon SNS and Amazon SQS?

When would I use SNS versus SQS, and why are they always coupled together?
SNS is a distributed publish-subscribe system. Messages are pushed to subscribers as and when they are sent by publishers to SNS.
SQS is a distributed queuing system. Messages are not pushed to receivers. Receivers have to poll or pull messages from SQS. Messages can't be received by multiple receivers at the same time. Any one receiver can receive a message, process it and delete it. Other receivers do not receive the same message later. Polling inherently introduces some latency in message delivery in SQS, unlike SNS, where messages are immediately pushed to subscribers. SNS supports several endpoints such as email, SMS, HTTP endpoints and SQS. If you want an unknown number and type of subscribers to receive messages, you need SNS.
You don't have to couple SNS and SQS always. You can have SNS send messages to email, SMS or HTTP end point apart from SQS. There are advantages to coupling SNS with SQS. You may not want an external service to make connections to your hosts (a firewall may block all incoming connections to your host from outside).
Your endpoint may just die because of a heavy volume of messages. Email and SMS may not be your choice for processing messages quickly. By coupling SNS with SQS, you can receive messages at your own pace. It allows clients to be offline and makes them tolerant of network and host failures. You also achieve guaranteed delivery. If you configure SNS to send messages to an HTTP endpoint, email or SMS, several failed attempts to send a message may result in messages being dropped.
SQS is mainly used to decouple applications or integrate applications. Messages can be stored in SQS for a short duration of time (maximum 14 days). SNS distributes several copies of messages to several subscribers. For example, let’s say you want to replicate data generated by an application to several storage systems. You could use SNS and send this data to multiple subscribers, each replicating the messages it receives to different storage systems (S3, hard disk on your host, database, etc.).
Here's a comparison of the two:
Entity Type
SQS: Queue (Similar to JMS)
SNS: Topic (Pub/Sub system)
Message consumption
SQS: Pull Mechanism - Consumers poll and pull messages from SQS
SNS: Push Mechanism - SNS Pushes messages to consumers
Use Case
SQS: Decoupling two applications and allowing parallel asynchronous processing
SNS: Fanout - Processing the same message in multiple ways
Persistence
SQS: Messages are persisted for some (configurable) duration if no consumer is available (maximum two weeks), so the consumer does not have to be up when messages are added to queue.
SNS: No persistence. Whichever consumer is present at the time of message arrival gets the message and the message is deleted. If no consumers are available then the message is lost after a few retries.
Consumer Type
SQS: All the consumers are typically identical and hence process the messages in the exact same way (each message is processed once by one consumer, though in rare cases messages may be resent)
SNS: The consumers might process the messages in different ways
Sample applications
SQS: Jobs framework: The Jobs are submitted to SQS and the consumers at the other end can process the jobs asynchronously. If the job frequency increases, the number of consumers can simply be increased to achieve better throughput.
SNS: Image processing. If someone uploads an image to S3 then watermark that image, create a thumbnail and also send a Thank You email. In that case S3 can publish notifications to an SNS topic with three consumers listening to it. The first one watermarks the image, the second one creates a thumbnail and the third one sends a Thank You email. All of them receive the same message (image URL) and do their processing in parallel.
You can see SNS as a traditional topic to which you can have multiple subscribers. You can have heterogeneous subscribers for one given SNS topic, including Lambda and SQS, for example. You can also send SMS messages or even e-mails out of the box using SNS. One thing to consider with SNS is that only one message (notification) is received at a time, so you cannot take advantage of batching.
SQS, on the other hand, is nothing but a queue, where you store messages and subscribe one consumer to process them. (Yes, you can have N consumers on one SQS queue, but if every consumer needs to see every message, this gets messy and hard to manage very quickly. For that use case you are better off combining SNS with SQS: SNS pushes notifications to N SQS queues, and each queue has exactly one subscriber.) As of Jun 28, 2018, AWS supports Lambda triggers for SQS, meaning you don't have to poll for messages yourself any more.
Furthermore, you can configure a DLQ on your source SQS queue to send messages to in case of failure. In case of success, messages are automatically deleted (this is another great improvement), so you don't have to worry about the already processed messages being read again in case you forgot to delete them manually. I suggest taking a look at Lambda Retry Behaviour to better understand how it works.
One great benefit of using SQS is that it enables batch processing. Each batch can contain up to 10 messages, so if 100 messages arrive at once in your SQS queue, then 10 Lambda functions will spin up (considering the default auto-scaling behaviour for Lambda) and they'll process these 100 messages (keep in mind this is the happy path as in practice, a few more Lambda functions could spin up reading less than the 10 messages in the batch, but you get the idea). If you posted these same 100 messages to SNS, however, 100 Lambda functions would spin up, unnecessarily increasing costs and using up your Lambda concurrency.
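The batching arithmetic, and what a batch looks like inside a Lambda function, can be sketched like this. The `Records` list and the `body` field are how SQS events arrive in Lambda; the JSON payload and its `amount` field are invented for the example, and the invocation count is the happy-path estimate described above.

```python
import json
import math

def expected_invocations(message_count, batch_size=10):
    # Happy-path estimate: every batch is full except possibly the last.
    return math.ceil(message_count / batch_size)

def handler(event, context):
    """Lambda entry point for an SQS event source.

    Each invocation receives a list of messages under event["Records"];
    the message text is in record["body"].
    """
    total = 0
    for record in event["Records"]:
        payload = json.loads(record["body"])  # assumes JSON message bodies
        total += payload.get("amount", 0)     # "amount" is a made-up field
    return {"processed": len(event["Records"]), "total": total}

print(expected_invocations(100))      # 10: SQS batches of 10
print(expected_invocations(100, 1))   # 100: one message per invocation
```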
However, if you are still running traditional servers (like EC2 instances), you will still need to poll for messages and manage them manually.
You also have FIFO SQS queues, which guarantee the delivery order of the messages. SQS FIFO is also supported as an event source for Lambda as of November 2019.
Even though there's some overlap in their use cases, both SQS and SNS have their own spotlight.
Use SNS if:
multiple subscribers is a requirement
sending SMS/E-mail out of the box is handy
Use SQS if:
only one subscriber is needed
batching is important
AWS SNS is a publisher subscriber network, where subscribers can subscribe to topics and will receive messages whenever a publisher publishes to that topic.
AWS SQS is a queue service, which stores messages in a queue. SQS does not deliver messages itself; an external service (Lambda, EC2, etc.) has to poll SQS and fetch the messages.
SNS and SQS can be used together for multiple reasons.
There may be different kinds of subscribers where some need the immediate delivery of messages, where some would require the message to persist, for later usage via polling. See this link.
The "Fanout Pattern." This is for the asynchronous processing of messages. When a message is published to SNS, it can distribute it to multiple SQS queues in parallel. This can be great when loading thumbnails in an application in parallel, when images are being published. See this link.
Persistent storage. When a service that is going to process a message is not reliable. In a case like this, if SNS pushes a notification to a Service, and that service is unavailable, then the notification will be lost. Therefore we can use SQS as a persistent storage and then process it afterwards.
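The fanout idea can be illustrated with a small in-memory model. This is only a toy: `Topic` stands in for an SNS topic and each `deque` for an SQS queue, and the image location is hypothetical.

```python
from collections import deque

class Topic:
    """Toy stand-in for an SNS topic: pushes each message to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, deliver):
        # `deliver` is any callable that accepts one message.
        self._subscribers.append(deliver)

    def publish(self, message):
        for deliver in self._subscribers:
            deliver(message)

# Each deque stands in for an SQS queue that stores messages until polled.
thumbnail_queue = deque()
watermark_queue = deque()

topic = Topic()
topic.subscribe(thumbnail_queue.append)
topic.subscribe(watermark_queue.append)

topic.publish("s3://bucket/cat.jpg")  # hypothetical image location

# Each consumer later pulls its own copy at its own pace.
print(thumbnail_queue.popleft())
print(watermark_queue.popleft())
```

The publisher sends once, and each queue keeps its own copy until its consumer is ready, which is the persistence that a bare push to an unavailable service would lack.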
From the AWS documentation:
Amazon SNS allows applications to send time-critical messages to
multiple subscribers through a “push” mechanism, eliminating the need
to periodically check or “poll” for updates.
Amazon SQS is a message queue service used by distributed applications
to exchange messages through a polling model, and can be used to
decouple sending and receiving components—without requiring each
component to be concurrently available.
Fanout to Amazon SQS queues
The following are the major differences between the main messaging technologies on AWS (SQS, SNS, and EventBridge). To choose a particular AWS service, we should know the functionality it provides as well as how it compares with other services.
The diagram below summarizes the main similarities and differences between these services.
In simple terms,
SNS - sends messages to subscribers using a push mechanism, with no need to pull.
SQS - it is a message queue service used by distributed applications to exchange messages through a polling model, and can be used to decouple sending and receiving components.
A common pattern is to use SNS to publish messages to Amazon SQS queues to reliably send messages to one or many system components asynchronously.
Reference from Amazon SNS FAQs.
One reason for coupling SQS and SNS would be for data processing pipelines.
Let's say you are generating three kinds of product, and that products B & C are both derived from the same intermediate product A. For each kind of product (i.e., for each segment of the pipeline) you set up:
a compute resource (maybe a lambda function, or a cluster of virtual machines, or an autoscaling kubernetes job) to generate the product.
a queue (describing units of work that need to be performed) to partition the work across the compute resource (so that each unit of work is processed exactly once, but separate units of work can be processed separately in parallel and asynchronously with each other).
a news feed (announcing outputs that have been produced).
Then arrange so that the input queues for B & C are both subscribing to the output announcements of A.
This makes the pipeline modular on the level of infrastructure. Rather than having a monolithic server application that generates all three products together, different stages of the pipeline can utilise different hardware resources (for example, perhaps stage B is very memory intensive, but the two other stages can be performed with cheaper hardware/services). This also makes it easier to iterate on the development of one pipeline segment without disrupting delivery of the other products.
There are some key distinctions between SNS and SQS:
SNS supports A2A and A2P communication, while SQS supports only A2A communication.
SNS is a pub/sub system, while SQS is a queuing system. You'd typically use SNS to send the same message to multiple consumers via topics. In comparison, in most scenarios, each message in an SQS queue is processed by only one consumer. With SQS, messages are delivered through a long polling (pull) mechanism, while SNS uses a push mechanism to immediately deliver messages to subscribed endpoints.
SNS is typically used for applications that need real time notifications, while SQS is more suited for message processing use cases.
SNS does not persist messages - it delivers them to subscribers that are present, and then deletes them. In comparison, SQS can persist messages (from 1 minute to 14 days).
Individually, Amazon SQS and SNS are used for different use cases. You can, however, use them together in some scenarios.