We need to sync data between different web servers. The idea is very basic: when an entity is created on one server, it should be sent to all the other servers. What's the right way to do it? We are currently evaluating two approaches: Amazon's SQS and SNS services, and a custom implementation on top of a key-value store (like memcached and memqueue). What are the common pitfalls of custom implementations? Any feedback will be highly appreciated.
SQS would work OK if you create a new queue for each server and write the data to each queue. The biggest downside is that you will need each server to poll for new messages.
SNS would work more efficiently because it allows you to broadcast a message to multiple locations. However, it's a one-shot try: if a machine can't receive its notification when SNS sends it, SNS will not try again.
You don't specify how many messages you are sending or what your performance requirements are, but any SQS/SNS system will likely be much, much slower (mostly due to latencies between sending the message and the servers receiving it) than a local memcache/key-value server solution.
A mixed solution would be to use a persistent store (like SimpleDB) and use SNS to alert the servers that new data is available.
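To make the mixed approach concrete, here is a minimal sketch of the flow: write to the store first, then broadcast only the key. `Store` and `Notifier` are in-memory stand-ins I made up for SimpleDB and an SNS topic (they are not real AWS clients), and `catch_up` is an assumed recovery step for servers that missed a notification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Store:
    """In-memory stand-in for a persistent store (hypothetical, not SimpleDB's API)."""
    rows: Dict[str, dict] = field(default_factory=dict)

    def put(self, key: str, value: dict) -> None:
        self.rows[key] = value

    def get(self, key: str) -> dict:
        return self.rows[key]


@dataclass
class Notifier:
    """In-memory stand-in for a broadcast topic: one attempt per subscriber, no retry."""
    subscribers: List[Callable[[str], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self.subscribers.append(callback)

    def publish(self, key: str) -> None:
        for callback in self.subscribers:
            try:
                callback(key)
            except Exception:
                pass  # a failed delivery is simply dropped


class Server:
    def __init__(self, name: str, store: Store) -> None:
        self.name = name
        self.store = store
        self.cache: Dict[str, dict] = {}

    def on_notification(self, key: str) -> None:
        # The notification only carries the key; the data comes from the store.
        self.cache[key] = self.store.get(key)

    def catch_up(self) -> None:
        # Because the store is durable, a server that missed notifications
        # can recover by scanning it (for example on startup).
        for key, value in self.store.rows.items():
            self.cache.setdefault(key, value)


def create_entity(store: Store, notifier: Notifier, key: str, value: dict) -> None:
    store.put(key, value)   # durable write first
    notifier.publish(key)   # then tell everyone that new data exists
```

The point of the design is that a lost notification costs you latency, not data: the store remains the source of truth.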
We have a system that uses Redis pub/sub to communicate between different parts of the system. To keep it simple, we used the pub/sub channel to implement several different things. On both ends (producer and consumer) we have servers running code that I see no way to convert into Lambda functions.
We are migrating to AWS and, among other changes, we are trying to replace Redis with a managed pub/sub solution. The required solution is fairly simple: a managed broker that allows publishing a message from one node and subscribing to it from zero or more other nodes.
It seems impossible to achieve this with any of the available solutions:
Kinesis - It is a streaming solution for data ingestion (similar to Apache Pulsar)
SNS - From the documentation, it looks like exactly what we need, until we realize that there is no way to connect a server (as opposed to a Lambda) except through a custom HTTP endpoint.
EventBridge - Same issue as with SNS
SQS - It is a queue, not a pub/sub.
Amazon MQ / RabbitMQ - It is a queue, not a pub/sub. It is also not a SaaS solution, but rather an installation on a node we would own.
I see no reason to remove a feature such as subscribing from a server, which is why I was sure it would be present in one or more of the available solutions. But we went through the documentation and attempted to consume from SNS and EventBridge without success. Are we missing something? How can we achieve what we need?
Example
Assume we have an API server layer, deployed on ECS with a load balancer in front. The API has 2 endpoints, a PUT to update a document, and an SSE to listen for updates on documents.
Assuming a simple round-robin load balancer, an update for document1 may occur on node1 where a client may have an ongoing SSE request for the same document on node2. This can be done with a Redis backbone; node1 publishes on document1 topic and node2 is subscribed to the same topic. This solution is fast and efficient (in this case at-most-once delivery is perfectly acceptable).
Since this is only an example, we will not consider a WebSocket pub/sub API or other ready-made solutions for this specific use case.
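For readers who want to see the routing described in the example, here is a rough sketch of it. The `Broker` class is an in-memory stand-in for the Redis backbone (it is not the Redis client API), and `ApiNode`, `open_sse` and `put_document` are names invented for illustration. Like Redis pub/sub, the stand-in delivers only to whoever is subscribed at publish time, so delivery is at-most-once.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set


class Broker:
    """In-memory stand-in for a pub/sub backbone: no persistence, no replay."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to current subscribers and return how many received it."""
        handlers = list(self._subs.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


class ApiNode:
    """One API server behind the load balancer."""

    def __init__(self, name: str, broker: Broker) -> None:
        self.name = name
        self.broker = broker
        # document id -> one event buffer per open SSE connection on this node
        self.sse_streams: DefaultDict[str, List[List[str]]] = defaultdict(list)
        self._subscribed: Set[str] = set()

    def open_sse(self, doc_id: str) -> List[str]:
        """A client starts listening; subscribe this node to the topic once."""
        stream: List[str] = []
        self.sse_streams[doc_id].append(stream)
        if doc_id not in self._subscribed:
            self._subscribed.add(doc_id)
            self.broker.subscribe(doc_id, lambda msg, d=doc_id: self._fan_out(d, msg))
        return stream

    def _fan_out(self, doc_id: str, message: str) -> None:
        for stream in self.sse_streams[doc_id]:
            stream.append(message)

    def put_document(self, doc_id: str, body: str) -> int:
        """Handle the PUT: publish the update on the document's topic."""
        return self.broker.publish(doc_id, body)
```

With two nodes sharing one broker, a `put_document("document1", ...)` on node1 reaches a stream opened with `open_sse("document1")` on node2, and a publish on a topic nobody listens to is simply dropped.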
Lambda
The subscriber side cannot be a Lambda. Two distinct contexts are involved (the SSE HTTP request and the SNS event), so two distinct Lambdas would fire with no way to 'stitch' them together.
SNS + SQS
We are hesitant to use SQS in conjunction with SNS, since that solution would add a lot of unneeded complexity:
Number of nodes is not known in advance and they scale, requiring an automated system to increase/reduce the number of SQS queues.
Persistence is not required
Additional latency is introduced
Additional infrastructure cost
HTTP Endpoint
This is the closest thing to a programmatic subscription but suffers from similar issues to the SNS-SQS solution:
Number of nodes is unknown requiring endpoint subscriptions to be automatically added.
Either we expose one endpoint for each node or have a particular configuration on the Load Balancer to route the message to the appropriate node.
Additional API endpoints must be exposed, maintained, and secured.
My company has a messaging system which sends real-time messages in JSON format, and it's not built on AWS.
Our team is trying to use AWS SQS to receive these messages and then store them in DynamoDB.
I'm thinking of using EC2 to read these messages and then save them.
Is there a better solution? Or how should I do it? I don't have much experience with this.
First of all, EC2 is infrastructure in the cloud; it is similar to a physical machine with an OS in a local setup. If you want to create an application that fetches data from Amazon SQS (messages in JSON format) and pushes it into DynamoDB (a NoSQL database), your design is correct, as both SQS and DynamoDB have thorough JSON support. Once your application is ready, you deploy it on an EC2 machine.
To achieve this, your application needs an asynchronous buffered SQS consumer to consume the messages. SQS messages are limited to 256 KB, so whichever application publishes the messages must keep each one under 256 KB.
Please refer to the link below for an SQS consumer:
is putting sqs-consumer to detect receiveMessage event in sqs scalable
Once you have consumed a message from the SQS queue, you need to save it in DynamoDB, which you can easily do using a CRUD repository. With a repository you can save the JSON directly in a DynamoDB table, but please be sure to configure the provisioned write capacity based on your request rate, because the more capacity you provision, the higher the cost. Please refer to the link below for configuring the write capacity of a table.
Dynamodb reading and writing units
In general, you'll have a setup something like this:
The EC2 instances (one or more) will read your queue every few seconds to see if there is anything there. If so, they will write this data to DynamoDB.
Based on what you're saying you'll have less than 1,000,000 reads from SQS in a month so you can start out on the free tier for that. You can have a single EC2 instance initially and that can be a very small instance - a T2.micro should be more than sufficient. And you don't need more than a few writes per second on DynamoDB.
The advantage of SQS is that if for some reason your EC2 instance is temporarily unavailable the messages continue to queue up and you won't lose any of them.
From a coding perspective, you don't mention your development environment but there are AWS libraries available for a pretty wide variety of environments. I develop in Java and the code to do this would be maybe 100 lines. I would guess that other languages would be similar. Make sure you look at long polling in the language you're using - it can help to speed up the processing and save you money.
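Not Java, but here is a minimal Python sketch of one polling cycle so you can see the shape of the code. The three callables are injected so the sketch runs without AWS. In a real boto3 program they would wrap `receive_message` (with `WaitTimeSeconds` set for long polling), the table's `put_item`, and `delete_message`; that wiring is my assumption about your setup, and the message dicts below only mirror the `Body` / `ReceiptHandle` fields SQS returns.

```python
import json
from typing import Callable, Dict, List

Message = Dict[str, str]  # mirrors the SQS shape: {"Body": ..., "ReceiptHandle": ...}


def drain_once(
    receive: Callable[[], List[Message]],
    save: Callable[[dict], None],
    delete: Callable[[str], None],
) -> int:
    """Run one polling cycle and return how many messages were stored.

    A message is deleted only after it has been saved, so if `save` raises,
    the message stays on the queue and is delivered again later.
    """
    stored = 0
    for msg in receive():
        try:
            item = json.loads(msg["Body"])
        except json.JSONDecodeError:
            # Malformed JSON: skip it and leave it on the queue.
            continue
        save(item)
        delete(msg["ReceiptHandle"])
        stored += 1
    return stored
```

The EC2 process would just call `drain_once` in a loop. With long polling enabled, the `receive` call itself blocks while the queue is empty, so the loop does not need its own sleep.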
Our .net core web app currently accepts websocket connections and pushes out data to clients on certain events (edit, delete, create of some of our entities).
We would like to load balance this application now but foresee a problem in how we handle the socket connections. Basically, if I understand correctly, only the node that handles a specific event will push data out to its clients and none of the clients connected to the other nodes will get the update.
What is a generally accepted way of handling this problem? The best way I can think of is to also send that same event to all nodes in a cluster so that they can also update their clients. Is this possible? How would I know about the other nodes in the cluster?
The app will be hosted in AWS.
You need to distribute the event to all nodes in the cluster, so that they can each push the update out to their websocket clients. A common way to do this on AWS is to use SNS to distribute the event to all nodes. You could also use ElastiCache Redis Pub/Sub for this.
As an alternative to SNS or Redis, you could use a Kinesis Stream. But before going to that link, read about Apache Kafka, because the AWS docs don't do a good job of explaining why you'd use Kinesis for anything other than log ingest.
To summarize: Kinesis is a "persistent transaction log": everything that you write to it is stored for some amount of time (by default a day, but you can pay for up to 7 days) and is readable by any number of consumers.
In your use case, each worker process would start reading at the then-current end of the stream, and continue reading (and distributing events) until shut down.
The main issue that I have with Kinesis is that there's no "long poll" mechanism like there is with SQS. A given read request may or may not return data. What it does tell you is whether you're currently at the end of the stream; if not, you have to keep reading until you are. And, of course, Amazon will throttle you if you read too fast. As a result, your code tends to have sleeps.
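To illustrate why the code ends up with sleeps, here is a small sketch of the read loop. `get_records` is an injected callable, so this runs without AWS. The response dict mirrors the `Records`, `NextShardIterator` and `MillisBehindLatest` fields of the Kinesis GetRecords response, but the function name `read_stream` and the one-second back-off are my own assumptions.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


def read_stream(
    get_records: Callable[[str], Dict[str, Any]],
    iterator: Optional[str],
    idle_sleep: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield records from one shard, backing off whenever we are caught up."""
    while iterator is not None:
        response = get_records(iterator)
        for record in response["Records"]:
            yield record
        # A missing or None iterator means the shard has been closed.
        iterator = response.get("NextShardIterator")
        caught_up = response.get("MillisBehindLatest", 0) == 0
        if iterator is not None and caught_up:
            # No long poll is available, so wait before asking again
            # rather than hammering the API and getting throttled.
            sleep(idle_sleep)
```

While the reader is behind, it keeps reading without pausing; only once it reaches the end of the stream does it start sleeping between requests.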
I'm looking for help with an architectural design decision I'm making with a product.
We've got multiple producers (initiated by API Gateway calls into Lambda) that put messages on a SQS queue (the request queue). There can be multiple simultaneous calls, so there would be multiple Lambda instances running in parallel.
Then we have consumers (let's say twenty EC2 instances) that long-poll the SQS queue for messages to process. Each message takes about 30-45 seconds to process.
I would then ideally like to send the response back to the producer that issued the request, and this is the part I'm struggling with in SQS. I would in theory have a separate response queue that the initial Lambda producers would then consume, but there doesn't seem to be a way to cherry-pick the specific correlated response. That is, each Lambda function might pick up another function's response. I'm looking for something similar to this design pattern: http://soapatterns.org/design_patterns/asynchronous_queuing
The only option that I can see is to create a new SQS Response queue for each Lambda API call, passing in its ARN in the message for the consumers to put the response on, but I can't imagine that's very efficient - especially when there's potentially hundreds of messages a minute? Am I missing something obvious?
I suppose the only other alternative would be setting up a bigger message broker environment (e.g. RabbitMQ/ActiveMQ), but I'd like to avoid that if possible.
Thanks!
Create a (Temporary) Response Queue For Every Request
Too late to the party, but I was thinking that I might find some help with what I want to achieve (#MattHouser, #Zaheer Ally), or give an idea to someone working on a related issue.
I am facing a similar challenge. I have an API that upon request by a client, needs to communicate to multiple external APIs and collect (delayed) results.
Since my PHP API is synchronous, it can only perform these requests sequentially. So, i was thinking to use a request queue, where the producer (API) would send messages. Then, multiple workers would consume these messages, each of them performing one of these external API calls.
To get the results back, the producer would have created a temporary response queue, the name-identifier of which would be embedded in the message sent to workers. Hence, each worker would 'publish' his results on this temporary queue.
In the meantime, the producer would keep polling the temporary queue until he received the expected number of messages. Finally, he would delete the queue and send the collected results back to the client.
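Here is roughly what I have in mind, as a sketch only. Python's `queue.Queue` and threads stand in for the request queue, the temporary response queue and the workers, and the `replies` dict plays the role of the queue service that looks a queue up by name. None of this is the SQS API, and all names are made up for illustration.

```python
import queue
import threading
import uuid
from typing import Callable, Dict, List


def scatter_gather(
    calls: List[Callable[[], str]],
    workers: int = 3,
    timeout: float = 5.0,
) -> List[str]:
    """Send each call to the workers and collect all replies on a temporary queue.

    Replies arrive in completion order, not request order. If a call raises,
    its reply never arrives and the collecting `get` times out with queue.Empty.
    """
    requests: "queue.Queue" = queue.Queue()           # shared request queue
    reply_name = f"reply-{uuid.uuid4()}"              # name of the temporary response queue
    replies: Dict[str, "queue.Queue"] = {reply_name: queue.Queue()}

    def worker() -> None:
        while True:
            msg = requests.get()
            if msg is None:                           # shutdown signal
                return
            # The worker learns where to answer from the message itself.
            replies[msg["reply_to"]].put(msg["call"]())

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(workers)]
    for thread in threads:
        thread.start()

    for call in calls:
        requests.put({"call": call, "reply_to": reply_name})

    # Keep polling the temporary queue until the expected number of replies is in.
    results = [replies[reply_name].get(timeout=timeout) for _ in calls]

    for _ in threads:
        requests.put(None)
    for thread in threads:
        thread.join()
    del replies[reply_name]                           # delete the temporary queue
    return results
```

Because the queue name is unique per request, one producer can never pick up another producer's replies.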
Yes, you could use RabbitMQ for a more "rpc" queue pattern.
But if you want to stay within AWS, try using something other than SQS for the response.
Instead, you could use S3 for the response. When your producer puts the item into SQS, include in the message an S3 destination for the response. When your consumer completes the tasks, put the response in the desired S3 location.
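A minimal sketch of that idea follows. `fetch` is an injected callable that returns `None` until the object exists, so a plain dict can stand in for S3 here. The bucket name, the `responses/<id>.json` key layout and the function names are all assumptions made for illustration.

```python
import time
import uuid
from typing import Any, Callable, Dict, Optional


def new_request(payload: Any, bucket: str = "responses-bucket") -> Dict[str, Any]:
    """Build the SQS message body, including where the consumer should put the response."""
    request_id = str(uuid.uuid4())
    return {
        "id": request_id,
        "payload": payload,
        "response_location": {"bucket": bucket, "key": f"responses/{request_id}.json"},
    }


def wait_for_response(
    fetch: Callable[[str], Optional[str]],
    key: str,
    timeout: float = 60.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll for the response object until it appears or the timeout expires."""
    deadline = clock() + timeout
    while True:
        body = fetch(key)
        if body is not None:
            return body
        if clock() >= deadline:
            return None  # gave up; the caller decides how to report that
        sleep(interval)
```

Each request gets its own key, so the producer only ever reads its own response, which is the correlation SQS alone does not give you.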
Then you can check S3 for the response.
Update
You may be able to accomplish an RPC-like message queue using Redis.
https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis
Then, you can use AWS ElastiCache for your Redis cluster. This would completely replace the use of SQS.
Another option would be to use Redis' pub/sub mechanism to asynchronously notify your Lambda that the backend work is done. You can use AWS's ElastiCache for Redis for an all-AWS-managed solution. Your Lambda function would generate a UUID for each request, use that as the channel name to subscribe to, pass it along in the SQS message, and then the backend workers would publish a notification to that channel when the work is done.
I was facing this same problem so I tried it out, and it does work. Whether it's worth the effort over just polling S3 is another question. You have to configure the lambda functions to run inside your VPC, so they can access your Redis. I was going to have to do this anyway since I'd want the workers, in my case also lambda functions, to be able to access my Elasticsearch and RDS. But there are some considerations: most importantly, you need to use a private subnet with a NAT Gateway (or your own NAT Instance), so it can get out to the Internet and AWS managed services (including SQS).
One other thing I just stumbled across is that requests through API Gateway currently cannot take longer than 29 seconds, and this cannot be increased by AWS. You mentioned your jobs take 30 or more seconds, so this could be a showstopper for you using API Gateway and Lambda in this way anyway.
AWS now provides a Java client that supports temporary queues. This is useful for request/response patterns. I can't see a non-Java version.
My project requires me to communicate with many devices outside the cloud. If successful, this means millions of devices. These devices will not be running Android or iOS, and will be running behind routers & firewalls (I cannot assume they have an external IP).
I am looking to use SQS to send messages to my users outside the cloud. To allow the servers to message individual users, I am designing the system to have one queue per client. This can potentially mean millions (billions?) of queues. While it states that SQS can support unlimited queues, I would like to make sure that I am not abusing the system. If successful, the probability of millions of users is very high.
I am aware that SQS can be expensive, but I am using it at this stage
for ease of administration.
As far as I can tell SNS requires either an iOS/Android client, or an
HTTP server running on the consumer. This is why I ruled out SNS, and I
am using SQS.
I am going to build a distributed cloud front-end over SQS to handle
client connections. This front-end will just be a wrapper, that will
authenticate clients, and relay them to the SQS queues.
Am I abusing the SQS "unlimited queues" policy (will SQS performance drop)? Is there a simpler design for per device messaging?
Let me break the answer down by the parts of your question:
About your questions:
Am I abusing the SQS "unlimited queues" policy?
AWS services are designed to prevent abuse and you will pay exactly for what you use, so if you believe this is the right approach, go for it. To remove the uncertainty, I'd advise a preliminary "proof of concept" implementation.
Is there a simpler design for per device messaging?
Probably yes, re-consider SNS and other messaging systems.
About your statements:
I am aware that SQS can be expensive, but I am using it at this stage
for ease of administration.
"Expensive" is a very context-dependent classification, considering that a SQS message can cost $0.00000005.
As far as I can tell SNS requires either an iOS/Android client, or an
HTTP server running on the consumer. This is why I ruled out SNS, and
I am using SQS.
SNS is a push-based messaging system (SQS is pull-based) that can handle 5 types of subscriptions: SMTP, SMS, HTTP, mobile push and SQS, so they are not mutually exclusive.
I am going to build a distributed cloud front-end over SQS to handle
client connections. This front-end will just be a wrapper, that will
authenticate clients, and relay them to the SQS queues.
Managing millions of queues can be an overwhelming task for your "distributed cloud front-end over SQS". Unless your project is exactly about queue management, this is probably undifferentiated heavy lifting.
This is about all I can say without knowing your case, but consider that you can use SNS/SQS together with each other and with other messaging software, such as Apache Camel and others, that may help you build your solution or proof of concept.
I think SQS (or SNS if you can eventually use them) are still your best bet, esp if you need "quick response" or "near real time"; however, for the sake of having "alternatives" just so you can compare...
1. You can consider a giant DynamoDB table, with each device/client having its own "device-id" and perhaps "message-id" as key. This way, your devices can query their own keys for messages. DynamoDB is meant to handle billions of rows, so this won't stress it much. You should be careful with the querying part, though, as you could use up your provisioned read capacity. At the aggregate level, however, your devices may not all respond/query at the same time, so you may still be OK.
2. You can also consider having a giant S3 bucket, with each folder keyed to the device id and further keyed into message-id folders. This is a poor man's SQS, but it's guaranteed to scale, both in message quantity and number of accesses to it.
In both #1 and #2, if your devices are registered with Cognito for credentials, there's a straightforward way to do policies, so the devices can only access their "own" stuff. However, both alternatives #1 and #2 are likely slower than SQS, especially if you use SQS long polling: with long polling you get a response as soon as SQS detects that a message has been dropped off, whereas these alternatives require you to wait for the next polling cycle.
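To make the key layout of the two alternatives concrete, here is a small sketch. The attribute names, the `devices/.../messages/...` prefix and the `Mailbox` class (an in-memory stand-in for the table) are all made up for illustration. It also assumes message ids sort in arrival order (zero-padded counters or timestamps), which is what lets a device ask only for messages after the last one it saw.

```python
from typing import Dict, List, Tuple


def dynamo_key(device_id: str, message_id: str) -> Dict[str, str]:
    """Composite key for alternative #1: partition on device, sort on message."""
    return {"device_id": device_id, "message_id": message_id}


def s3_key(device_id: str, message_id: str) -> str:
    """Object key for alternative #2: one 'folder' per device."""
    return f"devices/{device_id}/messages/{message_id}.json"


class Mailbox:
    """In-memory stand-in for the table, keyed by (device_id, message_id)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], str] = {}

    def put(self, device_id: str, message_id: str, body: str) -> None:
        self._rows[(device_id, message_id)] = body

    def query(self, device_id: str, after: str = "") -> List[Tuple[str, str]]:
        """Return this device's messages with an id greater than `after`, in order."""
        return [
            (mid, body)
            for (did, mid), body in sorted(self._rows.items())
            if did == device_id and mid > after
        ]
```

A device polls with `query(device_id, after=last_seen_id)`. Because the device id is the leading part of the key in both layouts, a per-device access policy only has to match on that prefix.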