Does AWS SQS replicate messages across regions? - amazon-web-services

SQS is a distributed queue, so does it replicate messages within the same region or across different regions? The architecture diagram in the AWS
docs shows the message being replicated, but is that replication within one region or across regions?
Use case:
I'm setting up a queue in region X, but it might be accessed from a region at the other end of the world. If there are two workers running, one in region X and one in region Y, do both get data from the same region X queue, or can each get data from a replica in the region nearest to it?
For example, the worker in X receives a message from the region X queue, and before that update reaches region Y, another worker reads the same message from a replicated region Y queue.
P.S. I know SQS has at-least-once semantics, but I want to know the semantics in the above use case.

SQS is a regional service that is highly available within a single region. There is no cross-region replication capability. You can still access the queue from other regions: just initialize the SQS client with the region the queue lives in.
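As a minimal sketch of that last point, assuming boto3 and a queue URL of the usual `https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>` shape (the helper names and the example URL are hypothetical), a worker running in any region could derive the queue's home region from its URL and build the client against it:

```python
from urllib.parse import urlparse


def region_from_queue_url(queue_url: str) -> str:
    """Extract the region from a URL shaped like
    https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>."""
    host = urlparse(queue_url).hostname or ""
    parts = host.split(".")
    if len(parts) < 4 or parts[0] != "sqs":
        raise ValueError(f"unrecognised SQS queue URL: {queue_url!r}")
    return parts[1]


def receive_from_queue(queue_url: str, max_messages: int = 10) -> list:
    """Poll the queue in its home region, wherever this worker runs."""
    import boto3  # imported lazily so the URL helper works without boto3

    sqs = boto3.client("sqs", region_name=region_from_queue_url(queue_url))
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=20,  # long polling
    )
    return response.get("Messages", [])
```

A worker in region Y would call `receive_from_queue("https://sqs.us-east-1.amazonaws.com/123456789012/jobs")` and talk to the same single queue as a worker in region X. Older URL formats are not handled by this helper.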

As a standard practice for AWS services, the data resides within the region that you create the service in.
There are exceptions, but these require you as the user to take an explicit action, such as copying an AMI or enabling S3 replication.
If the queue is consumed from multiple regions, every consumer accesses the SQS endpoint in the queue's own region, not the endpoint of the region the consumer runs in.
Because SQS is a queueing service, if you have workers distributed across regions, each item will most likely be removed from the queue and processed in a single region (strictly speaking, it is delivered at least once).
If you're trying to have each message consumed in multiple regions, consider a fanout-based approach in which each region's workers consume from their own SQS queue instead of sharing one.
For more information take a look at the https://aws.amazon.com/getting-started/hands-on/send-fanout-event-notifications/ documentation.
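A rough sketch of wiring up such a fanout with boto3 follows. It assumes the topic and the per-region queues already exist. The function names and the shape of the `queues` argument are made up for illustration. The queue policy is the usual "allow this SNS topic to send to this queue" statement.

```python
import json


def sns_to_sqs_policy(queue_arn: str, topic_arn: str) -> dict:
    """Queue access policy that lets one SNS topic deliver into one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


def fan_out(topic_arn: str, topic_region: str, queues: dict) -> None:
    """Subscribe one queue per region to the topic.

    `queues` maps region name -> (queue_url, queue_arn); this layout is
    an assumption of the sketch, not an AWS structure.
    """
    import boto3  # lazy import; only needed when talking to AWS

    sns = boto3.client("sns", region_name=topic_region)
    for region, (queue_url, queue_arn) in queues.items():
        sqs = boto3.client("sqs", region_name=region)
        policy = sns_to_sqs_policy(queue_arn, topic_arn)
        # Note: this overwrites any policy already set on the queue.
        sqs.set_queue_attributes(
            QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)}
        )
        sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
```

Each region's workers then poll only their local queue, and every published message is delivered once to each subscribed queue.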

Related

How to have multi-region replication for SQS messages?

I have an Active-Passive multi region architecture on AWS. We have cross-region replication for our RDS and DynamoDB; however, I'm not sure what to do with our messages in SQS when we failover. Based on the documentation there isn't a built-in feature for cross region replication.
Based on my research, the two solutions are:
Use fan-out with SNS so messages will be sent to both regions.
Have the secondary region applications get their messages from the primary region SQS queues.
The fan-out pattern will not work for us because we would need a way to determine whether a message has already been processed by the primary region. We only have one active region at a time.
I was hoping for a more elegant solution than having the secondary region's applications access the primary region's queues. It's more expensive and introduces a little more complexity.
Is there a better way to accomplish having the secondary region continue to process messages where the primary region left off?

Pub/Sub Ordering and Multi-Region

While researching the ordering features of Pub/Sub, I found that ordering is preserved only within the same region.
Suppose I have ordered Pub/Sub subscriptions outside of GCP.
Each subscription is in a different datacenter, on a different provider, in another region.
How can I specify that those subscriptions consume from a specific region?
Is there an option on an ordered subscription to specify a region?
If not, how does Pub/Sub decide which region my application is located in, given that it is provisioned in another datacenter on another provider? Is the assigned region going to change?
The ordering is preserved on the publish side only within a region. In other words, if you are publishing messages to multiple regions, only messages within the same region will be delivered in a consistent order. If your messages were all published to the same region, but your subscribers are spread across regions, then the subscribers will receive all messages in order. If you want to guarantee that your publishes all go to the same region to ensure they are in order, then you can use the regional service endpoints.
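As an illustration of pinning publishes to one region, here is a minimal sketch using the `google-cloud-pubsub` Python client. The function names, project, topic and ordering key are placeholders. It assumes the regional endpoint follows the `<region>-pubsub.googleapis.com:443` pattern.

```python
def regional_endpoint(region: str) -> str:
    """Build a regional Pub/Sub endpoint such as us-east1-pubsub.googleapis.com:443."""
    return f"{region}-pubsub.googleapis.com:443"


def publish_in_order(project_id, topic_id, region, ordering_key, payloads):
    """Publish byte payloads with one ordering key through one regional endpoint."""
    from google.cloud import pubsub_v1  # lazy import; needs google-cloud-pubsub

    publisher = pubsub_v1.PublisherClient(
        publisher_options=pubsub_v1.types.PublisherOptions(
            enable_message_ordering=True
        ),
        client_options={"api_endpoint": regional_endpoint(region)},
    )
    topic_path = publisher.topic_path(project_id, topic_id)
    futures = [
        publisher.publish(topic_path, data=payload, ordering_key=ordering_key)
        for payload in payloads
    ]
    # Wait for every publish and return the server-assigned message IDs.
    return [future.result() for future in futures]
```

If every publisher uses the same endpoint and ordering key, all messages for that key go through one region, which is the condition the answer describes.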

How does ECS Fargate distribute traffic across the available tasks when it receives messages from SQS?

I have a multi-region ECS Fargate setup, running 2 tasks in 1 cluster per region. In total I have 4 tasks: 2 in us-east-1 and 2 in us-west-1.
The purpose of the ECS consumer tasks is to process messages as and when they become available in SQS.
SQS will be configured in just a single region. The SQS ARN will be configured in the container running the tasks.
With this setup, when there are messages in SQS, how does the traffic get distributed across all available ECS tasks in both regions? Is it random? Could someone please clarify?
I am not configuring load balancers for the ECS task since I do not have external calls. The source is always the messages from SQS.
"With this setup, when there are messages from SQS, how does the traffic get distributed across all available ECS tasks across multi-region? Is it random?"
It's not random, but it is arbitrary. Here is what the docs say:
Standard queues provide best-effort ordering which ensures that messages are generally delivered in the same order as they're sent.
It's arbitrary because SQS queues are distributed across multiple nodes, and you have no idea how many nodes there are. So if SQS decides that it needs 20 nodes to handle the rate at which messages are added to the queue, and you retrieve 10 messages at a time (the limit), clearly you're going to get messages from some subset of those nodes.
Going into the realm of complete speculation, long polling might improve your chances of getting messages in the order that they were sent, because it is documented to "quer[y] all of the servers for messages." Of course, that could only apply when you can't fill your response from a single server. I would expect it to grab all messages that it can from each server and return as soon as it hits the maximum number of messages, even if it hasn't actually queried all servers.
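To make the pull model concrete, here is a hedged sketch of the loop each of the four tasks might run. Nothing pushes messages to a task. Whichever task polls next gets whatever SQS returns. The function name is hypothetical, and `sqs` is any object exposing boto3's `receive_message` and `delete_message`.

```python
def poll_once(sqs, queue_url, handler, wait_seconds=20):
    """Receive up to 10 messages with long polling, then handle and delete each.

    Returns the number of messages handled.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # the per-call maximum
        WaitTimeSeconds=wait_seconds,  # a value above 0 enables long polling
    )
    handled = 0
    for message in response.get("Messages", []):
        handler(message["Body"])
        # Delete only after the handler succeeds; otherwise the message
        # becomes visible again once its visibility timeout expires.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        handled += 1
    return handled
```

A task would call this in a `while True:` loop with a real client. Tasks in us-east-1 and us-west-1 running the same loop simply compete for messages.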
"SQS will be configured in just a single region. The SQS arn will be configured in the container running the tasks."
Beware that you need the queue URL, not its ARN, in order to retrieve messages.
Beware also that -- at least with the Python SDK -- you need to configure your SQS client's region to match the region where the queue exists (even though you pass the URL, which contains the region).
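If the container really is configured with the ARN, one way to get from there to a usable client and URL is sketched below. The names `QueueRef`, `parse_queue_arn` and `queue_url_from_arn` are hypothetical. The lookup uses boto3's `get_queue_url`, and the caller needs permission to call it.

```python
from typing import NamedTuple


class QueueRef(NamedTuple):
    region: str
    account_id: str
    name: str


def parse_queue_arn(arn: str) -> QueueRef:
    """Split arn:<partition>:sqs:<region>:<account-id>:<queue-name>."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sqs":
        raise ValueError(f"not an SQS queue ARN: {arn!r}")
    return QueueRef(region=parts[3], account_id=parts[4], name=parts[5])


def queue_url_from_arn(arn: str) -> str:
    """Look up the queue URL using a client pinned to the queue's region."""
    import boto3  # lazy import so the parser works without boto3

    ref = parse_queue_arn(arn)
    sqs = boto3.client("sqs", region_name=ref.region)
    response = sqs.get_queue_url(
        QueueName=ref.name, QueueOwnerAWSAccountId=ref.account_id
    )
    return response["QueueUrl"]
```

The region parsed from the ARN is also the one to pass as `region_name` for the client that later receives messages.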

Fault Tolerant Clustered Queues - SQS

I would like to create SQS queues in 2 different AWS regions. Is there a way to set up synchronization between both queues? When data is read off a queue in either region, the message must not be available for consumption. If one region goes down, the consumer must start reading from the next message in the available region. Does AWS support this out of the box, or are there any patterns available to support this use case?
No, this is not a feature of Amazon SQS.
It would be quite difficult to implement because you cannot request a specific message off a queue. So, if a message is retrieved in one region, there is no way to delete that message in a different region. You would need to operate a database to keep track of the messages, which sort of defeats the whole purpose of the queue.
Amazon SQS is a multi-AZ service that can survive failure of an Availability Zone, but resides in a single region.
You can use Amazon SNS to fan out messages to multiple SQS queues, even in different regions. Details here: Sending Amazon SNS messages to an Amazon SQS queue or AWS Lambda function in a different Region.
However, this results in duplicate messages across those regions, so it does not satisfy your requirement:
"When data is read off a queue in either region, the message must not be available for consumption."
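To show what "operate a database to keep track of the messages" would involve, here is a toy, in-memory sketch. The class and function names are invented. It assumes both fanned-out copies of a message carry the same application-level ID. A real deployment would need a shared, strongly consistent store reachable from both regions, which is the extra moving part the answer warns about.

```python
import threading


class ProcessedStore:
    """In-memory stand-in for a shared table of already-claimed message IDs."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, message_id: str) -> bool:
        """Return True for the first caller to claim an ID, False afterwards."""
        with self._lock:
            if message_id in self._seen:
                return False
            self._seen.add(message_id)
            return True


def consume(region_queue, store, handler):
    """Handle each (message_id, body) pair unless it was already claimed."""
    processed = []
    for message_id, body in region_queue:
        if store.claim(message_id):
            handler(body)
            processed.append(message_id)
    return processed
```

With two regional queues holding the same copies, whichever consumer claims an ID first processes it and the other skips it.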

Do Lambdas execute operations in sequence?

We are contemplating using Amazon Web Services for our project. The upstream flow will push messages into Kinesis, and those messages will then be fed into Lambda functions. The messages are in order before processing. As I understand it, AWS Lambda scales out horizontally based on message volume. We have a volume of 400 messages per second, which means Lambda will instantiate new processes in separate containers to leverage parallelism, and to achieve that parallelism, ordering has to be compromised. So if 10 ordered messages hit the Lambda functions and one invocation takes longer than another, AWS will provision a new function instance in some container to serve the request.
Is the final output going to be in order after all of this processing?
Any help is appreciated.
Thanks.
If you are using Amazon Kinesis Data Firehose, then you can use a Data Transformation to invoke an AWS Lambda function on incoming records.
This allows the record to be transformed or deleted, before continuing through the Firehose. Thus, records can be processed by Lambda while remaining in the same order. The final data can be delivered to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service or Splunk.
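A minimal sketch of such a transformation function is shown below. The handler name and the upper-casing step are placeholders. The `recordId` / `result` / `data` fields are the contract Firehose expects back for every record it sends in.

```python
import base64


def transform_handler(event, context):
    """Upper-case each record's payload and hand it back to Firehose."""
    output = []
    for record in event["records"]:
        payload = base64.b64decode(record["data"]).decode("utf-8")
        transformed = payload.upper()  # placeholder transformation
        output.append({
            "recordId": record["recordId"],  # must echo the incoming ID
            "result": "Ok",                  # or "Dropped" / "ProcessingFailed"
            "data": base64.b64encode(transformed.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

The function returns records in the same sequence it received them, so the transformation step does not reorder the batch.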
If your application is consuming records from Amazon Kinesis directly (instead of via Firehose), then records will be consumed in order by your application within each shard.
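For the direct case with Lambda as the consumer, a hedged sketch of a handler is shown below. The handler name is arbitrary. It assumes the default event source mapping, where each shard's batches are processed one at a time, so ordering holds per shard rather than across the whole stream.

```python
import base64


def stream_handler(event, context):
    """Return (sequence number, payload) pairs in the order Lambda delivered them."""
    seen = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        payload = base64.b64decode(kinesis["data"]).decode("utf-8")
        seen.append((kinesis["sequenceNumber"], payload))
    return seen
```

Records that must stay ordered relative to each other should share a partition key so that they land on the same shard.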