Using Amazon SQS with multiple consumers

I have a service-based application that uses Amazon SQS with multiple queues and multiple consumers. I am doing this so that I can implement an event-based architecture and decouple all the services, where the different services react to changes in state of other systems. For example:
Registration Service:
Emits event 'registration-new' when a new user registers.
User Service:
Emits event 'user-updated' when user is updated.
Search Service:
Reads from queue 'registration-new' and indexes user in search.
Reads from queue 'user-updated' and updates user in search.
Metrics Service:
Reads from 'registration-new' queue and sends to Mixpanel.
Reads from queue 'user-updated' and sends to Mixpanel.
I'm having a number of issues:
A message can be received multiple times when doing polling. I can design a lot of the systems to be idempotent, but for some services (such as the metrics service) that would be much more difficult.
A message needs to be manually deleted from the queue in SQS. I have thought of implementing a "message-handling-service" that handles the deletion of messages when all the services have received them (each service would emit a 'message-acknowledged' event after handling a message).
I guess my question is this: what patterns should I use to support multiple consumers for a single queue in SQS, while ensuring that messages are delivered and deleted reliably? Thank you for your help.

I think you are doing it wrong.
It looks to me like you are using the same queue to do multiple different things. You are better off using a single queue for a single purpose.
Instead of putting an event into the 'registration-new' queue and having two different services poll that queue, BOTH needing to read the message and each doing something different with it (and then needing a third process to delete that message after the other two have processed it), use one queue for one purpose.
Create an 'index-user-search' queue and a 'send-to-mixpanel' queue. The search service reads from the search queue, indexes the user, and immediately deletes the message. The Mixpanel service reads from the 'send-to-mixpanel' queue, processes the message, and deletes it.
The registration service, instead of emitting a 'registration-new' event to a single queue, now emits it to two queues.
To take it one step further, add SNS into the mix: have the registration service publish an SNS message to a 'registration-new' topic (not queue), and then subscribe both of the queues mentioned above to that topic in a 'fan-out' pattern.
https://aws.amazon.com/blogs/aws/queues-and-notifications-now-best-friends/
Both queues will receive the message, but you only publish it to SNS once. If down the road a third, unrelated service also needs to process 'registration-new' events, you create another queue and subscribe it to the topic as well. It can run with no dependencies on, or knowledge of, what the other services are doing - that is the goal.
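A minimal boto3 sketch of that fan-out, with illustrative topic and queue names (note that each queue also needs an access policy allowing the topic to send to it, which is omitted here):

```python
import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

# One topic for the event, one queue per consuming service.
topic_arn = sns.create_topic(Name="registration-new")["TopicArn"]

for queue_name in ("index-user-search", "send-to-mixpanel"):
    queue_url = sqs.create_queue(QueueName=queue_name)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]
    # Each queue also needs an access policy that lets this topic call
    # sqs:SendMessage on it -- omitted here for brevity.
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# The registration service publishes once; SNS fans out to both queues.
sns.publish(TopicArn=topic_arn, Message='{"event": "registration-new", "user_id": 42}')
```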

The primary use-case for multiple consumers of a queue is scaling-out.
The mechanism that allows for multiple consumers is the Visibility Timeout, which gives a consumer time to process and delete a message without it being consumed concurrently by another consumer.
To address the "at-least-once delivery" property of standard queues, the consuming service should be idempotent.
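A minimal sketch of such a consumer, assuming the 'index-user-search' queue from the earlier answer and an in-memory dedupe store (a real service would persist the seen IDs, e.g. in a DynamoDB table):

```python
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="index-user-search")["QueueUrl"]
processed_ids = set()  # stand-in for a durable store such as a DynamoDB table

def handle(body):
    print("indexing user from", body)  # placeholder for real work

while True:
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,    # long polling
        VisibilityTimeout=60,  # hidden from other consumers while we work
    )
    for msg in resp.get("Messages", []):
        # MessageId is stable across redeliveries, so it can key the idempotency check.
        if msg["MessageId"] not in processed_ids:
            handle(msg["Body"])
            processed_ids.add(msg["MessageId"])
        # Delete before the visibility timeout expires, or the message reappears.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```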
If that isn't possible, one option is to use FIFO queues, but this mode has limited message throughput, and FIFO queues cannot be subscribed to standard SNS topics (only to the newer SNS FIFO topics).

They even have a tutorial on how to create a fanout scenario using the combo SNS+SQS.
https://aws.amazon.com/getting-started/tutorials/send-fanout-event-notifications/
Too bad it does not support FIFO queues, so you have to be careful to handle out-of-order messages.
It would be nice if they had a consistent hashing solution to have multiple competing consumers while respecting the message order.
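For what it's worth, SQS FIFO message groups get close to that wish: messages sharing a MessageGroupId are delivered strictly in order, while different groups can be consumed in parallel by competing consumers. A sketch, with an illustrative queue name:

```python
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(
    QueueName="user-events.fifo",  # FIFO queue names must end in .fifo
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "true"},
)["QueueUrl"]

# Messages sharing a MessageGroupId are delivered strictly in order;
# different groups can be consumed in parallel by competing consumers.
for i in range(3):
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=f"update {i} for user 42",
        MessageGroupId="user-42",
    )
```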

Related

AWS SQS Selective Polling Pattern

I have a system where I publish updates to a shared topic meant for specific consumers.
I noticed messages getting stuck in the queue due to a lack of selective listening in SQS consumers, so messages are being hijacked.
Example:
Given: Message{destination: A, payload: 1234}
Given: ConsumerA and ConsumerB
I expect the message to be processed by ConsumerA. However, it gets hijacked by ConsumerB continuously: ConsumerB receives the message, then refuses to process it since the destination field doesn't match, causing the visibility timeout to expire and the message to be put back on the queue. But due to the nature of SQS, ConsumerB has an equal chance of picking the message up again.
My question is, what patterns are used to solve this type of issue?
I'm considering creating a queue per consumer, but that has drawbacks specific to the system I'm working on.
If I could only listen for messages with matching attributes, problem solved, but that's seemingly not the case.
Is there any other way?
Sharing a single Amazon SQS queue is not an appropriate architecture for your use-case.
If you want your consumers to be able to 'request' a message from a particular subset, you should either use separate SQS queues or use a database. You could even store objects in Amazon S3 as a form of NoSQL database.
Having consumers grab messages and then 'send them back' to the queue is not compatible with the design of the Amazon SQS service.
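To illustrate the separate-queues approach with boto3 (queue names are made up): the producer routes each message to its destination's queue at send time, so nothing gets 'hijacked':

```python
import boto3

sqs = boto3.client("sqs")

# One queue per consumer instead of one shared queue that everyone polls.
queue_urls = {
    dest: sqs.create_queue(QueueName=f"work-{dest}")["QueueUrl"]
    for dest in ("consumer-a", "consumer-b")
}

def send(destination, payload):
    # Route at send time; each consumer only ever sees its own messages.
    sqs.send_message(QueueUrl=queue_urls[destination], MessageBody=payload)

send("consumer-a", '{"payload": 1234}')
```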

How to ensure once-only processing of data in an AWS serverless architecture?

I have some data that needs to be processed at a point in time.
My current strategy is to pull the data every minute and load it into a queue and process it.
I have two concerns with this strategy:
I can't guarantee that the last minute captures all data so I pull the last two minutes; and
Lambdas, as far as I know, can fire multiple times depending on the trigger (in this case, SQS).
I'm trying to avoid writing a flag to the data because of the spikey nature of batch processing.
The only other solution I can think of is using S3 to create a lock-file.
Is there a better way to 'kick off' future events? Is there a strategy outside database and S3 flags?
Have a look at SQS FIFO queues; they are designed to deliver messages once and only once.
You can now use Amazon Simple Queue Service (SQS) for applications that require messages to be processed in a strict sequence and exactly once using First-in, First-out (FIFO) queues. FIFO queues are designed to ensure that the order in which messages are sent and received is strictly preserved and that each message is processed exactly once. ... (source)
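A sketch of how a FIFO queue's deduplication ID guards against the overlapping two-minute pulls described above (queue name and record shape are illustrative):

```python
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(
    QueueName="batch-records.fifo",
    Attributes={"FifoQueue": "true"},
)["QueueUrl"]

# Re-sending the same record within the 5-minute deduplication window is a
# no-op, so pulling overlapping time windows doesn't enqueue duplicates.
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"record_id": "abc"}',
    MessageGroupId="batch",
    MessageDeduplicationId="abc",  # derive this from the record's identity
)
```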

Single SQS queue vs multiple SQS queues when creating an async model

I have to develop a component where the APIs are async in nature. In order to develop this async model, I am going to use AWS SQS queues for publishing messages, and the client will read from the queue and send the response back into the queue. Currently there are 10 APIs that I have to expose.
I can think of having a single request queue and a single response queue (which I will poll) for all the APIs, with the payload of each API identified by some operation field.
The other way is to use a separate queue for each API. The advantage I can see with multiple queues is that each API can have different traffic, and having multiple queues can help the consumers of the queues scale effectively.
What can be other pros or cons for both the approaches?
Separate your use-case into 2 distinct problems:
Problem 1: APIs to Workers, one queue or multiple?
If your workers do different types of work, then having a single queue will require them to inspect and then discard messages they don't care about. If this is the case, you should have one queue per message type; that way, a worker can handle any message it receives from its queue.
If you start ignoring messages, then other workers, who may be idle, may wait a while for the messages they care about.
Problem 2: Using a return queue for the "results"
If your clients will be polling for results, then at each poll your API will need to poll the queue. Again, it will be "searching" for the right response, discarding those it doesn't care about, starving other clients.
Recommendation:
Use multiple queues, one per "worker type". Workers should be able to process any message they receive from their queue.
Then use something other than SQS to store the result. One option is to use S3 to store the result:
When your API "creates" the task, create an object in S3 and put a reference to that S3 object on your SQS queue.
Your worker will do the work, then put the result where it was told to.
When your client polls your API for the result, your API will check S3 and return the status/results.
Instead of S3, other data stores could be used if appropriate: RDS, DynamoDB, etc.
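A rough boto3 sketch of that recommendation (the bucket name, queue URL, and function names are all placeholders):

```python
import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")
BUCKET = "my-task-results"  # placeholder bucket
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/tasks"  # placeholder

def create_task(task_id, payload):
    # API side: record the task in S3, then enqueue a reference to it.
    s3.put_object(Bucket=BUCKET, Key=f"tasks/{task_id}.json",
                  Body=json.dumps({"status": "pending", "payload": payload}))
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps({"task_id": task_id}))

def complete_task(task_id, result):
    # Worker side: put the result where the task reference points.
    s3.put_object(Bucket=BUCKET, Key=f"tasks/{task_id}.json",
                  Body=json.dumps({"status": "done", "result": result}))

def get_task_status(task_id):
    # API side: clients poll the API; the API just reads S3.
    obj = s3.get_object(Bucket=BUCKET, Key=f"tasks/{task_id}.json")
    return json.loads(obj["Body"].read())
```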

Event Driven MessageBus architecture with AWS SNS: one or many message buses / lambda action functions

I am implementing a process in my AWS based hosting business with an event driven architecture on AWS SNS. This is largely a learning experience with a new architecture, programming and hosting paradigm for me.
I have considered AWS Step functions, but have decided to implement a Message Bus with AWS SNS topic(s), because I want to understand the underlying event driven programming model.
Nearly all actions are performed by lambda functions and steps are coupled via SNS and/or SQS.
I am undecided whether to implement the process with one or many SNS topics, and whether I should subscribe the core logic to the message bus(es) with one or many lambda functions.
One or many message buses
My core process currently consists of 9 events, of which two sets of 2 can run in parallel; the remaining 4 are sequential. Subscribing these all to the same message bus is easier to set up, but requires each lambda function to check whether the message is relevant to it, which seems like a waste of resources.
On the other hand I could have 6 message buses and be sure that a notified resource has something to do with the message.
One or many lambda functions
If all lambda functions are subscribed to the same message bus, it may be easier to package them all up with a dispatcher function in a single lambda function. It would also reduce the amount of code to upload to lambda, albeit I don't have to pay for that.
On the other hand, I would lose the ability to control the timeout for each lambda function, and any change to the order of events would now depend on the dispatcher code.
I would still have the ability to scale each part of the process, as any parts that contain repeating elements are separated by SQS queues.
You should always emit each type of message to its own topic, as this allows other services to consume these events without tightly coupling the two services.
Likewise, each worker that wants to consume messages should have its own queue with its own subscription to the topic.
Doing this allows you to add new message consumers for a given event without having to modify the upstream service. Furthermore, responsibility over each component is clear: the service producing messages to a topic owns that topic (and the message format), whereas the consumer owns its queue and event-handling semantics.
Your consumer can specify a message filter when subscribing to a topic, so it can only receive messages it cares about (documentation).
For example, a process that sends a customer survey after the customer has received their order would subscribe its queue to the Order Status Changed event, with a filter set to only receive events where the new_status field is equal to shipment-received.
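A hedged sketch of that subscription using boto3 (the ARNs are placeholders, and the producer must publish new_status as a message attribute for the default attribute-based filter policy to match on it):

```python
import json
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:order-status-changed"  # placeholder
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:customer-survey"       # placeholder

# SNS drops non-matching messages before they ever reach the survey queue.
sns.subscribe(
    TopicArn=TOPIC_ARN,
    Protocol="sqs",
    Endpoint=QUEUE_ARN,
    Attributes={"FilterPolicy": json.dumps({"new_status": ["shipment-received"]})},
)

# The producer publishes the field as a message attribute so the filter can see it.
sns.publish(
    TopicArn=TOPIC_ARN,
    Message='{"order_id": 7, "new_status": "shipment-received"}',
    MessageAttributes={
        "new_status": {"DataType": "String", "StringValue": "shipment-received"}
    },
)
```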
The above reflects principles of Service-Oriented architecture - and there's plenty of good material out there elaborating the points above.

Chat bots: ensuring serial processing of messages on a per-conversation basis in clustered environment

In the context of writing a Messenger chat bot in a cloud environment, I'm facing some concurrency issues.
Specifically, I would like to ensure that incoming messages from the same conversation are processed one after the other.
As a constraint, I'm processing the messages with workers in a cloud environment (i.e. the worker pool is of variable size, and worker instances are potentially short-lived and may crash). Also, low latency is important.
So abstracting a little, my requirements are:
I have a stream of incoming messages
each of these messages has a 'topic key' (the conversation id)
the set of topics is not known ahead-of-time and is virtually infinite
I want to ensure that messages of the same topic are processed serially
on a cluster of potentially ephemeral workers
if possible, I would like reliability guarantees, e.g. making sure that each message is processed exactly once.
My questions are:
Is there a name for this concurrency scenario?
Are there technologies (message brokers, coordination services, etc.) which implement this out of the box?
If not, what algorithms can I use to implement this on top of lower-level concurrency tools? (distributed locks, actors, queues, etc.)
I don't know of a widely-accepted name for the scenario, but a common strategy to solve that type of problem is to route your messages so that all messages with the same topic key end up at the same destination. A couple of technologies that will do this for you:
With Apache ActiveMQ, HornetQ, or Apache ActiveMQ Artemis, you could use your topic key as the JMSXGroupId to ensure all messages with the same topic key are processed in-order by the same consumer, with failover
With Apache Kafka, you could use your topic key as the partition key, which will also ensure all messages with the same topic key are processed in-order by the same consumer
Some message broker vendors refer to this requirement as Message Grouping, Sticky Sessions, or Sticky Message Load Balancing.
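As an illustration of the Kafka option, using the kafka-python client (the broker address and topic name are placeholders):

```python
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(bootstrap_servers="localhost:9092")  # placeholder broker

# The default partitioner hashes the key, so every message for a given
# conversation lands on the same partition and is therefore consumed in
# order by the single consumer in the group that owns that partition.
for text in ("hello", "how are you?"):
    producer.send("chat-messages", key=b"conversation-42", value=text.encode())
producer.flush()
```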
Another common strategy on messaging systems with weaker delivery/ordering guarantees (like Amazon SQS) is to simply include a sequence number in the message and leave it up to the destination to resequence and request redelivery of missing messages as needed.
I think you can fix this by using a queue and a set. Send every message object to the queue and process them first in, first out. When you add a message to the queue, also add its topic name to the set; when you take a message out for processing, remove its topic name from the set.
Then, if a topic is already in the set, don't add another message object for the same topic to the queue.
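One way to read that suggestion as code, as a single-process sketch only (it doesn't by itself address the ephemeral-worker or exactly-once requirements): keep a per-topic buffer plus a set of active topics, and never let two workers drain the same topic at once.

```python
import threading
from collections import defaultdict, deque

lock = threading.Lock()
pending = defaultdict(deque)  # per-topic FIFO of waiting messages
active = set()                # topics currently being processed

def process(message):
    print("handled", message)  # placeholder for real work

def submit(topic, message):
    with lock:
        pending[topic].append(message)
        if topic in active:   # a worker already owns this topic; it will drain the backlog
            return
        active.add(topic)
    threading.Thread(target=drain, args=(topic,)).start()

def drain(topic):
    while True:
        with lock:
            if not pending[topic]:
                active.discard(topic)
                return
            message = pending[topic].popleft()
        process(message)      # messages of one topic are processed strictly in order
```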
I hope this will help you. All the best :)