I'm developing a game using AWS Amplify. The game state will be stored in DynamoDB tables and will be queried and modified with GraphQL. There isn't a pressing need for realtime or low-latency communication; however, I need to detect when a player joins or disconnects from a game. What's the best mechanism for implementing this?
What I had in mind was an event that fires when a WebSocket connection is established or broken. The best I could glean from the Amplify docs was using PubSub with AWS IoT, but I don't know if this will work. If possible, I would like to avoid incurring additional API costs.
I already implemented a version of this where the client updates a lastSeen field in the database every 30 seconds or so, but it felt pretty janky.
I think you need to distinguish between a disconnect and inactivity. Somebody may simply be inactive, and in that case you would disconnect them yourself after they have done nothing for x amount of time.
A disconnect, on the other hand, should notify your server that leaving is indeed what the user intended to do.
I think DataStore Events will do what you want. They have a specific network status event you can use to trigger state changes.
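For example, a minimal sketch of listening for that event via Amplify's Hub (assuming the Amplify JS library; adapt the event/data shape to the version you use):

```ts
import { Hub } from "aws-amplify";

// DataStore emits events on the "datastore" Hub channel; the "networkStatus"
// event fires when the underlying connection comes up or drops.
Hub.listen("datastore", ({ payload }) => {
  if (payload.event === "networkStatus") {
    const { active } = payload.data as { active: boolean };
    // e.g. update the player's presence record here
    console.log(active ? "player connected" : "player disconnected");
  }
});
```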
I have multiple instances of client A (the main application) and multiple instances of client B (a payments service).
If I publish a message from client A that will be processed and answered on client B (which publishes its answer to another topic), how do I capture this answer on client A?
The problem is that client A has multiple instances, so I can't guarantee that the exact same instance that triggered the request will receive the response (Pub/Sub will pick an arbitrary instance).
I saw that other brokers like RabbitMQ have a "reply-to" option. Is there anything similar in Google Pub/Sub?
That way, I could simulate a "synchronous" operation on client A and only answer the user when processing is finished, instead of dealing with this check on the front end every time.
Thank you!
Decoupling publishers from subscribers is one of the core features of Cloud Pub/Sub, which follows the publish-subscribe pattern. There’s currently no support in Cloud Pub/Sub for sending responses from subscribers directly to an entity that published a given message.
You could work around this by including information about the instance of client A that published a given message, so client B could figure out which instance of client A to notify once processing has finished. For example, client B could send an RPC directly to the publisher, or if there are few enough instances of client A, they can each have dedicated topics where they receive “processing complete” messages as subscribers (on a topic that client B is the publisher for).
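For illustration, a minimal sketch of that reply-topic workaround using the Node.js client (`@google-cloud/pubsub`); the topic names and the `replyTopic` attribute are hypothetical:

```ts
import { PubSub } from "@google-cloud/pubsub";

const pubsub = new PubSub();

// A client A instance ("a-7") publishes a request and names its own reply
// topic in a message attribute, so client B knows where to send the result.
await pubsub.topic("payment-requests").publishMessage({
  data: Buffer.from(JSON.stringify({ orderId: "42" })),
  attributes: { replyTopic: "replies-instance-a-7" }, // hypothetical per-instance topic
});

// Client B, after processing, publishes the result to the indicated topic,
// which only that one instance of client A subscribes to.
async function reply(replyTopic: string, result: object): Promise<void> {
  await pubsub.topic(replyTopic).publishMessage({
    data: Buffer.from(JSON.stringify(result)),
  });
}
```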
Potential issues to watch out for while you think about the right approach:
Cloud Pub/Sub offers at-least-once delivery. There is a possibility of duplicate messages being sent to subscribers, and your system will need to be resilient to this (see the sketch after this list).
What happens if a given instance of client A or client B crashes at any point in your process? Would it introduce the risk of processing erroneous/duplicate payments?
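As a sketch of the deduplication point (in-memory only for brevity; a real payments system would track processed IDs in shared, durable storage, e.g. a database column with a unique constraint):

```ts
// Pub/Sub assigns every message a unique id; remember the ones already handled.
const processed = new Set<string>();

function handleMessage(message: { id: string; data: Buffer; ack: () => void }): void {
  if (processed.has(message.id)) {
    message.ack(); // duplicate delivery: acknowledge and skip
    return;
  }
  processed.add(message.id);
  processPayment(JSON.parse(message.data.toString())); // hypothetical handler
  message.ack();
}

function processPayment(payment: unknown): void {
  // ... real payment logic would go here
}
```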
I have a specific use-case for an Akka implementation.
I have a set of agents that send heartbeats to Akka. Akka takes each heartbeat and assigns actors to forward it to my meta-data server (a separate server). This part is done.
Now my meta-data server also needs to send action information to the agents. However, since these agents may be behind firewalls, Akka cannot communicate with them directly, so it needs to send the action as a response to the heartbeat. Thus, when the meta-data server sends an action, Akka stores it in a DurableMessageQueue (a separate one for each agent ID) and keeps the mapping from agent ID to DurableMessageQueue in a HashMap. Then, whenever a heartbeat comes in, Akka checks this queue before responding and piggybacks any pending action on the response (sketched below).
The issue with this is that the HashMap will live in a single JVM, and therefore I cannot scale this. Am I missing something, or is there a better way to do it?
I have Akka running behind a Mina server, which receives and sends the messages.
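To make the piggyback pattern concrete, here is a minimal, language-agnostic sketch (written in TypeScript for brevity; the original is Akka/Java). The in-memory map is exactly the part that ties the design to a single JVM:

```ts
// Pending actions queued per agent, drained into the next heartbeat response.
const pendingActions = new Map<string, string[]>();

// Called when the meta-data server sends an action for an agent.
function enqueueAction(agentId: string, action: string): void {
  const queue = pendingActions.get(agentId) ?? [];
  queue.push(action);
  pendingActions.set(agentId, queue);
}

// Called when a heartbeat arrives; piggybacks any queued actions on the reply.
function onHeartbeat(agentId: string): { ack: boolean; actions: string[] } {
  const actions = pendingActions.get(agentId) ?? [];
  pendingActions.delete(agentId);
  return { ack: true, actions };
}
```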
I'm trying to come up with the best solution for scaling a chat service in AWS. I've come up with a couple of potential solutions:
Redis Pub/Sub - When a user establishes a connection to a server, that server subscribes to that user's ID. When someone sends a message to that user, a server performs a publish to the channel with the user's ID. The server the user is connected to will receive the message and push it down to the appropriate client (see the sketch after this list).
SQS - I've thought of creating a queue for each user. The server the user is connected to will poll (or use SQS long polling) that queue. When a new message is discovered, it will be pushed to the user from the server.
SNS - I really liked this solution until I discovered the 100 topic limit. I would need to create a topic for each user, which would only support 100 users.
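A minimal sketch of the Redis Pub/Sub option (using the `redis` Node client; the channel names are hypothetical):

```ts
import { createClient } from "redis";

const publisher = createClient({ url: "redis://localhost:6379" });
const subscriber = publisher.duplicate();
await publisher.connect();
await subscriber.connect();

// When user "123" connects to this server, subscribe to their channel.
await subscriber.subscribe("user:123", (message) => {
  // Push the message down the user's open connection here.
  console.log("deliver to user 123:", message);
});

// Any server that wants to reach user "123" publishes to that channel.
await publisher.publish("user:123", JSON.stringify({ from: "456", text: "hi" }));
```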
Are there any other ways chat could be scaled using AWS? Is the SQS approach viable? How long does it take AWS to add a message to a queue?
Building a chat service isn't as easy as you would think.
I've built full XMPP servers, clients, and SDKs and can attest to some of the subtle and difficult problems that arise. A prototype where users see each other and chat is easy. A full-featured system with account creation, security, discovery, presence, offline delivery, and friend lists is much more of a challenge. To then scale that across an arbitrary number of servers is especially difficult.
PubSub is a feature offered by chat services (see XEP-0060) rather than a traditional means of building a chat service. I can see the allure, but PubSub can have drawbacks.
Some questions for you:
Are you doing this over the Web? Are users going to be connecting and long-polling, or do you have a WebSockets solution?
How many users? How many connections per user? Ratio of writes to reads?
Your idea for using SQS that way is interesting, but it probably won't scale. It's not unusual to have 50k or more users on a chat server. If you're polling a separate SQS queue for each user, you're not going to get anywhere near that. You would be better off having a queue for each server, and the server polls only that queue. Then it's on you to figure out which server a user is on and put the message into the right queue.
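A sketch of that per-server queue approach with long polling (AWS SDK for JavaScript v3; the queue URL and delivery function are hypothetical):

```ts
import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} from "@aws-sdk/client-sqs";

const sqs = new SQSClient({ region: "us-east-1" });
const queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/chat-server-1";

async function pollForever(): Promise<void> {
  while (true) {
    const { Messages } = await sqs.send(
      new ReceiveMessageCommand({
        QueueUrl: queueUrl,
        WaitTimeSeconds: 20, // long polling: wait up to 20s for a message
        MaxNumberOfMessages: 10,
      })
    );
    for (const msg of Messages ?? []) {
      deliverToLocalUser(JSON.parse(msg.Body!)); // route to the connected client
      await sqs.send(
        new DeleteMessageCommand({ QueueUrl: queueUrl, ReceiptHandle: msg.ReceiptHandle! })
      );
    }
  }
}

function deliverToLocalUser(message: { to: string; text: string }): void {
  // look up the local connection for message.to and push it (stub)
}
```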
I suspect you'll want to go something like:
A big RDS database on the backend.
A bunch of front-end servers handling the client connections.
Some middle-tier Java / C# code tracking everything and routing messages to the right place.
To get an idea of the complexity of building a chat server, read the XMPP RFCs:
RFC 3920
RFC 3921
SQS/SNS might not fit your chat requirements. We have observed some latency in SQS that might not be suitable for a chat application. Also, SQS does not guarantee FIFO. I have worked with Redis on AWS; it is quite easy and stable if it is configured with all the best practices in mind.
I've thought about building a chat server using SNS, but instead of doing one topic per user, as you describe, I'd do one topic for the entire chat system and have each server subscribe to that topic, where each server runs some sort of long-polling or WebSockets chat system. Then, when an event occurs, the data is sent in the payload of the SNS notification. The server can then use this payload to determine which clients in its queue should receive the response, leaving any unrelated clients untouched. I actually built a small prototype of this, but haven't done a ton of testing to see if it's robust enough for a large number of users.
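A rough sketch of what each server's SNS endpoint could look like (Express; the routing function is a stub, and a production version would also verify the SNS message signature):

```ts
import express from "express";

const app = express();
app.use(express.text({ type: "*/*" })); // SNS posts JSON with a text content type

// Every chat server subscribes this endpoint to the single shared SNS topic.
app.post("/sns", (req, res) => {
  const body = JSON.parse(req.body);
  if (body.Type === "SubscriptionConfirmation") {
    fetch(body.SubscribeURL); // confirm the subscription with SNS
  } else if (body.Type === "Notification") {
    const event = JSON.parse(body.Message);
    deliverToLocalClients(event); // only clients connected to *this* server
  }
  res.sendStatus(200);
});

function deliverToLocalClients(event: { to: string; text: string }): void {
  // look up local sockets for event.to and push the message (stub)
}

app.listen(8080);
```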
Hi, realtime chat doesn't work well with SNS. It's designed for email/SMS or for services where a latency of one or a few seconds is acceptable. In realtime chat, one or a few seconds of latency is not acceptable.
Check this link:
Latency (i.e. “Realtime”) for PubNub vs SNS
Amazon SNS provides no latency guarantees, and the vast majority of latencies are measured over 1 second, and often many seconds slower. Again, this is somewhat irrelevant; Amazon SNS is designed for server-to-server (or email/SMS) notifications, where a latency of many seconds is often acceptable and expected.
Because PubNub delivers data via an existing, established open network socket, latencies are under 0.25 seconds from publish to subscribe at the 95th percentile of the subscribed devices. Most humans perceive something as “realtime” if the event is perceived within 0.6 – 0.7 seconds.
The way I would implement such a thing (if not using some framework) is the following:
Have a web server (on EC2) which accepts the messages from the users.
Use an Auto Scaling group for this web server. The web server can update any DB on Amazon RDS, which can scale easily.
If you are using your own DB, you might consider decoupling the DB from the web server using SQS (by sending all requests to the same queue), and then you can have a consumer which consumes the queue (see the sketch after this list). This consumer can also be placed behind an Auto Scaling group, so that if the queue grows beyond X messages, it will scale (you can set it up with alarms).
SQS normally updates pretty fast, i.e. less than one second from the moment you send a message to the moment it appears on the queue, and rarely more than that.
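A minimal sketch of the producer side of that decoupling (AWS SDK for JavaScript v3; the queue URL is hypothetical):

```ts
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({ region: "us-east-1" });

// The web server enqueues each incoming request instead of writing to the DB
// directly; the consumer fleet behind the queue performs the actual writes.
await sqs.send(
  new SendMessageCommand({
    QueueUrl: "https://sqs.us-east-1.amazonaws.com/123456789012/db-writes",
    MessageBody: JSON.stringify({ userId: "123", text: "hello" }),
  })
);
```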
Since the new AWS IoT service started to support WebSockets, keepalive, and Pub/Sub a couple of months ago, you may easily build an elastic chat on it. AWS IoT is a managed service with lots of SDKs for different languages, including JavaScript, that was built to handle monster loads (billions of messages) with zero administration.
You can read more about the update here:
https://aws.amazon.com/ru/about-aws/whats-new/2016/01/aws-iot-now-supports-websockets-custom-keepalive-intervals-and-enhanced-console/
Edit:
Latest SQS update (2016/11): you can now use Amazon Simple Queue Service (SQS) for applications that require messages to be processed in a strict sequence and exactly once, using First-In-First-Out (FIFO) queues. FIFO queues are designed to ensure that the order in which messages are sent and received is strictly preserved and that each message is processed exactly once.
Source:
https://aws.amazon.com/about-aws/whats-new/2016/11/amazon-sqs-introduces-fifo-queues-with-exactly-once-processing-and-lower-prices-for-standard-queues/
From now on, implementing SQS + SNS looks like a good idea too.
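A sketch of sending to a FIFO queue (AWS SDK for JavaScript v3; the queue URL and IDs are hypothetical):

```ts
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({ region: "us-east-1" });

// FIFO queue names must end in ".fifo".
await sqs.send(
  new SendMessageCommand({
    QueueUrl: "https://sqs.us-east-1.amazonaws.com/123456789012/chat.fifo",
    MessageBody: JSON.stringify({ from: "123", to: "456", text: "hi" }),
    MessageGroupId: "conversation-123-456", // ordering is preserved within a group
    MessageDeduplicationId: "msg-0001", // or enable content-based deduplication
  })
);
```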
We have a mobile application in a very unsteady WLAN environment. Sending data to a web server could result in a timeout or in a lost WLAN connection.
How do we ensure that our data is delivered correctly? Is there a possibility of having Web Services Reliable Messaging (WSRM) on the device?
MSMQ is not an option at the moment.
WSRM isn't supported. A reliable mechanism is to ensure either that the Web Service responds to the upload with an ack after the data has been received (i.e. a synchronous call), or that when you start the upload you get back a transaction ID that you can send back to the service at a later point to confirm delivery before deleting the data locally.
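For illustration, a sketch of the transaction-ID variant (the endpoints and field names are hypothetical):

```ts
// POST /upload returns { transactionId }; GET /confirm/:id reports delivery status.
async function reliableUpload(payload: unknown): Promise<boolean> {
  const res = await fetch("https://example.com/upload", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  const { transactionId } = (await res.json()) as { transactionId: string };

  // Later (e.g. once the WLAN is back), confirm before deleting the local copy.
  const check = await fetch(`https://example.com/confirm/${transactionId}`);
  const { delivered } = (await check.json()) as { delivered: boolean };
  return delivered; // caller deletes the local copy only when this is true
}
```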