How to implement a request-response pattern on Google Cloud PubSub?

I have multiple clients A (main application) and multiple clients B (payments service).
If I publish a message from client A that will be processed and answered on client B (publishing an answer in another topic), how to capture this answer on client A?
The problem is that client A has multiple instances, so I can't guarantee that the exact same instance that triggered the request will receive the response (Pub/Sub will pick one instance at random).
I saw that other brokers like RabbitMQ have a "reply-to" option. Is there anything similar in Google Pub/Sub?
That way, I could simulate a "synchronous" operation on client A and only answer the user when processing is finished, instead of having the front-end poll for the result every time.
Thank you!

Decoupling publishers from subscribers is one of the core features of Cloud Pub/Sub, which follows the publish-subscribe pattern. There’s currently no support in Cloud Pub/Sub for sending responses from subscribers directly to an entity that published a given message.
You could work around this by including information about the instance of client A that published a given message, so client B can figure out which instance of client A to notify once processing has finished. For example, client B could send an RPC directly to the publisher, or, if there are few enough instances of client A, each instance can have a dedicated topic where it receives "processing complete" messages as a subscriber (with client B as the publisher on that topic).
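For illustration, here is a minimal sketch of that per-instance reply-topic convention, assuming the google-cloud-pubsub Python client; the project, topic, and subscription names are hypothetical and are assumed to already exist:

from google.cloud import pubsub_v1

PROJECT = "my-project"        # hypothetical project ID
INSTANCE_ID = "client-a-7"    # unique ID for this instance of client A

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()

# Client A: publish the request, naming its own reply topic in an attribute.
request_topic = publisher.topic_path(PROJECT, "payment-requests")
reply_topic = publisher.topic_path(PROJECT, f"replies-{INSTANCE_ID}")
publisher.publish(request_topic, b"charge user 42", reply_to=reply_topic)

# Client B: process the request, then answer on the topic the attribute names.
# (Attach this callback to client B's subscription on "payment-requests".)
def handle_request(message):
    result = b"payment ok"    # ... real processing happens here ...
    publisher.publish(message.attributes["reply_to"], result)
    message.ack()

# Client A: each instance pulls from its own dedicated reply subscription.
reply_sub = subscriber.subscription_path(PROJECT, f"replies-{INSTANCE_ID}-sub")
future = subscriber.subscribe(reply_sub,
                              callback=lambda m: (print(m.data), m.ack()))
future.result()   # block this thread while replies stream in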
Potential issues to watch out for while you think about the right approach:
Cloud Pub/Sub offers at-least-once delivery. There is a possibility of having duplicate messages sent to subscribers, and your system will need to be resilient to this.
What happens if a given instance of client A or client B crashes at any point in your process? Would it introduce the risk of processing erroneous/duplicate payments?

Related

gRPC callback vs streaming in C++

I'm writing an application where a client will connect to a server and subscribe to data updates. The client tells the server what data items it is interested in, and then subscribes using a method with a streaming response. This works well.
However, there are also non-data related notifications that the client should know about. I'm not sure about the best way to handle those. I've thought of:
Adding another method to the existing service. This would be just like the data subscription but would be used for event subscription. The client could then subscribe to both types of updates. I'm not sure what the best practice is for the number of methods in a service, or for mixing responsibilities in a service.
Exposing a second service from the server with a streaming method for event notifications. This would make the client use multiple connections to get its data - and use another TCP port. The event notifications would be rare (maybe just a few during the lifetime of the connection) so not sure if that is important to consider. Again - not sure about best practices for the number of services exposed from a server.
This one seems unorthodox, but another method might be to pass connection info (IP address and port) from the client to the server during the client's connection sequence. The server could then use that to connect to the client as a way to send event notifications. So the client and server would each have to implement both client and server roles.
Any advice on ways to manage this? It seems like a problem that would already have been solved - but it also appears that the C++ implementation of gRPC lags a bit behind some of the other languages, which offer some more options.
Oh - and I'm doing this on Windows.
Thanks
I've come up with another alternative that seems to fit the ProtoBuf style better than the others. I've created ProtoBuf message types for each of the data/event/etc notifications that the server should send, and enclosed each of them inside a common 'notification' message that uses the 'oneof' type. This provides a way to have a single streaming method response that can accommodate any type of notification. It looks like this:
message NotificationData
{
    oneof oneof_notification_type
    {
        DataUpdate item_data_update = 1;
        EventUpdate event_data_update = 2;
        WriteResponse write_response_update = 3;
    }
}

service Items
{
    ...
    rpc Subscribe (SubscribeRequest) returns (stream NotificationData) {}
    ...
}
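On the receiving side, the client can dispatch on whichever field of the oneof is set. A minimal sketch in Python (the items_pb2/items_pb2_grpc module names and the server address are hypothetical; C++ generated code offers the equivalent oneof_notification_type_case() accessor):

import grpc
# items_pb2 / items_pb2_grpc stand in for the modules protoc would generate
# from the proto above (hypothetical names).
from items_pb2 import SubscribeRequest
from items_pb2_grpc import ItemsStub

channel = grpc.insecure_channel("localhost:50051")
stub = ItemsStub(channel)

for notification in stub.Subscribe(SubscribeRequest()):
    # WhichOneof returns the name of the field that is set, or None.
    kind = notification.WhichOneof("oneof_notification_type")
    if kind == "item_data_update":
        print("data:", notification.item_data_update)
    elif kind == "event_data_update":
        print("event:", notification.event_data_update)
    elif kind == "write_response_update":
        print("write ack:", notification.write_response_update)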
Any comments or concerns about this usage?
Thanks

Azure EventHub: offline event buffering/queueing possible?

I can't find any definitive answer here. My IoT service needs to tolerate flaky connections. Currently, I manage a local cache myself and retry a cloud-blob transfer as often as required. Could I replace this with an Azure EventHub service? I.e., will the EventHub client (on IoT-Core) buffer events until the connection is available? If so, where is the info on this?
It doesn't seem so according to:
https://azure.microsoft.com/en-us/documentation/articles/event-hubs-programming-guide/
It seems you are responsible for sending and caching yourself:
Send asynchronously and send at scale
You can also send events to an Event Hub asynchronously. Sending asynchronously can increase the rate at which a client is able to send events. Both the Send and SendBatch methods are available in asynchronous versions that return a Task object. While this technique can increase throughput, it can also cause the client to continue to send events even while it is being throttled by the Event Hubs service, and can result in the client experiencing failures or lost messages if not properly implemented. In addition, you can use the RetryPolicy property on the client to control client retry options.
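For what it's worth, here is a minimal sketch of the client-side cache-and-retry the documentation implies you must implement yourself, assuming the azure-eventhub Python SDK; the connection string and hub name are placeholders:

import time
from collections import deque
from azure.eventhub import EventHubProducerClient, EventData
from azure.eventhub.exceptions import EventHubError

pending = deque()   # local cache of events awaiting a confirmed send

def enqueue(payload: str):
    pending.append(payload)

def flush(producer: EventHubProducerClient):
    # Drain the cache; on failure, leave the event queued and back off.
    while pending:
        try:
            batch = producer.create_batch()
            batch.add(EventData(pending[0]))
            producer.send_batch(batch)
            pending.popleft()          # drop only after a confirmed send
        except EventHubError:
            time.sleep(5)              # flaky connection: retry later
            return

producer = EventHubProducerClient.from_connection_string(
    "<connection-string>", eventhub_name="<hub-name>")
enqueue("telemetry sample")
flush(producer)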

How to setup a ZMQ PUB/SUB pattern to serve only for pre-authorized subscriber(s)

How can I implement, or "hack" together, a PUB-SUB pattern that publishes only to authorized subscribers, disconnects unauthorized subscribers, etc.?
I googled for this problem, but all the answers boil down to setting a subscribe filter on the subscriber side.
But I want, as I said, to publish my updates from PUB only to those clients that have passed authorization, or that hold some secret key received over REQ-REP.
Thanks for any ideas.
Read Chapter 5 of The Guide, specifically the section called "Pros and Cons of Pub-Sub".
There are many problems with what you're trying to accomplish in the way you're trying to accomplish it (but there are solutions, if you're willing to change your architecture).
Presumably you need the PUB socket to be generally accessible to the world, whether that's the world at large or just a world consisting of some sockets which are authorized and some sockets which are not. If not, you can just control access (via firewall) to the PUB socket itself to only authorized machines/sockets.
When a PUB socket receives a new connection, it doesn't know whether the subscriber is authorized or not. PUB cannot receive actual communication from SUB sockets, so there's no way for the SUB socket to communicate its authorization directly. XPUB/XSUB sockets break this limitation, but they won't help you (see below).
No matter how you communicate a SUB socket's authorization to a PUB socket, I'm not aware of any way for the PUB socket to kill or ignore the SUB socket's connection if it is not authorized. This means that an untrusted SUB socket can subscribe ALL ('') and receive all messages from the PUB socket, and the PUB socket can't do anything about it. If you trust the SUB socket to police itself (you create the connecting socket and control the machines it's deployed on), then you have options to just subscribe to a "control" topic, send an authorization, and have the PUB socket feed back the channels/topics that you are allowed to subscribe to.
So, this pretty much kills it for achieving general security in a PUB/SUB paradigm that is publicly accessible.
Here are your options:
Abandon PUB/SUB - The only way you can control exactly which peer you send to every single time on the sending side (that I'm aware of) is with a ROUTER socket. If you use ROUTER/DEALER, the DEALER socket can send its authorization, the ROUTER socket stores that with its ID, and when something needs to be sent out, it just finds all connected sockets that are authorized and sends to each of them sequentially (a sketch follows these options). Whether this is feasible or not depends on the number of sockets and the workload (size and number of messages).
Encrypt your messages - You've already said this is your last resort, but it may be the only feasible answer. As I said above, any SUB socket that can access your PUB socket can just subscribe to ALL ('') messages being sent out, with no oversight. You cannot effectively hide your PUB socket address/port, you cannot hide any messages being sent out over that PUB socket, but you can hide the content of those messages with encryption. The proper method of key sharing depends on your situation.
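To make option 1 concrete, here is a hedged sketch of the ROUTER-side bookkeeping using pyzmq; the credential check is a placeholder, and peers are assumed to send single-frame messages:

import zmq

ctx = zmq.Context()
router = ctx.socket(zmq.ROUTER)
router.bind("tcp://*:5555")

authorized = set()   # ROUTER identities of peers that proved themselves

def is_valid(token: bytes) -> bool:
    return token == b"secret"   # placeholder for a real credential check

while True:
    ident, msg = router.recv_multipart()   # [identity, payload]
    if ident not in authorized:
        if is_valid(msg):
            authorized.add(ident)
            router.send_multipart([ident, b"OK"])
        continue                 # ignore peers that never authorized
    # "Publish": send the update sequentially to every authorized peer.
    for peer in authorized:
        router.send_multipart([peer, msg])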
As Jason has given you an excellent review of why (do not forget to add a +1 to his remarkable answer, ok?), let me add my two cents on how:
Q: How?
A: Forget about PUB/SUB archetype and create a case-specific one
Yes. ZeroMQ is a very powerful can-do toolbox, rather than a box of candies you are forbidden to taste and choose from to assemble your next super-code.
This way your code is, and remains, in control, setting both the controls and the measures for otherwise uncontrollable SUB-side code behaviour.
Creating one's own, composite, layered messaging solution is the very power ZeroMQ brings to your designs. There you realise you are the master of distributed system design. Besides the academic examples, no one uses the plain primitive-behaviour-archetypes, but typically composes more robust and reality-proof composite messaging patterns for the production-grade solutions.
There is no simple one-liner to make your system use-case work.
While they need not answer all your details, you may want to read remarks on managing PUB/SUB connections and on ZeroMQ authorisation measures.

Ideas for scaling chat in AWS?

I'm trying to come up with the best solution for scaling a chat service in AWS. I've come up with a couple potential solutions:
Redis Pub/Sub - When a user establishes a connection to a server, that server subscribes to that user's ID. When someone sends a message to that user, a server performs a publish to the channel with the user's ID. The server the user is connected to receives the message and pushes it down to the appropriate client (a sketch of this option follows the list).
SQS - I've thought of creating a queue for each user. The server the user is connected to will poll (or use SQS long-polling) that queue. When a new message is discovered, it will be pushed to the user from the server.
SNS - I really liked this solution until I discovered the 100 topic limit. I would need to create a topic for each user, which would only support 100 users.
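To illustrate option 1, a minimal sketch with the redis-py client; the channel naming and the push_to_websocket delivery helper are hypothetical:

import redis

r = redis.Redis()

def on_user_connect(user_id: str):
    # The server this user connected to subscribes to the user's channel.
    p = r.pubsub()
    p.subscribe(f"user:{user_id}")
    for item in p.listen():     # blocks; run one of these per connection
        if item["type"] == "message":
            push_to_websocket(user_id, item["data"])   # hypothetical helper

def send_message(to_user_id: str, body: str):
    # Any server in the fleet can deliver by publishing to that channel.
    r.publish(f"user:{to_user_id}", body)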
Are there any other ways chat could be scaled using AWS? Is the SQS approach viable? How long does it take AWS to add a message to a queue?
Building a chat service isn't as easy as you would think.
I've built full XMPP servers, clients, and SDKs, and can attest to some of the subtle and difficult problems that arise. A prototype where users see each other and chat is easy. A full-featured system with account creation, security, discovery, presence, offline delivery, and friend lists is much more of a challenge. To then scale that across an arbitrary number of servers is especially difficult.
PubSub is a feature offered by chat services (see XEP-0060) rather than a traditional means of building a chat service. I can see the allure, but PubSub can have drawbacks.
Some questions for you:
Are you doing this over the Web? Are users going to be connecting and long-polling, or do you have a WebSockets solution?
How many users? How many connections per user? Ratio of writes to reads?
Your idea for using SQS that way is interesting, but probably won't scale. It's not unusual to have 50k or more users on a chat server. If you're polling each SQS Queue for each user you're not going to get anywhere near that. You would be better off having a queue for each server, and the server polls only that queue. Then it's on you to figure out what server a user is on and put the message into the right queue.
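A hedged sketch of that per-server-queue routing with boto3; the queue naming, the lookup_server_for mapping, and deliver_to_connected_client are hypothetical:

import boto3

sqs = boto3.client("sqs")

def send_to_user(user_id: str, body: str):
    # Routing layer: find which server the user is connected to and put the
    # message on that server's queue (the lookup itself is a placeholder).
    server = lookup_server_for(user_id)          # hypothetical mapping
    url = sqs.get_queue_url(QueueName=f"chat-server-{server}")["QueueUrl"]
    sqs.send_message(QueueUrl=url, MessageBody=body)

def poll_my_queue(my_queue_url: str):
    # Each server long-polls only its own queue.
    resp = sqs.receive_message(QueueUrl=my_queue_url,
                               MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    for m in resp.get("Messages", []):
        deliver_to_connected_client(m["Body"])   # hypothetical delivery
        sqs.delete_message(QueueUrl=my_queue_url,
                           ReceiptHandle=m["ReceiptHandle"])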
I suspect you'll want to go something like:
A big RDS database on the backend.
A bunch of front-end servers handling the client connections.
Some middle tier Java / C# code tracking everything and routing messages to the right place.
To get an idea of the complexity of building a chat server read the XMPP RFC's:
RFC 3920
RFC 3921
SQS/SNS might not fit your chat requirement. We have observed some latency in SQS which might not be suitable for a chat application. Also, standard SQS does not guarantee FIFO ordering. I have worked with Redis on AWS; it is quite easy and stable if it is configured with all the best practices in mind.
I've thought about building a chat server using SNS, but instead of doing one topic per user, as you describe, doing one topic for the entire chat system and having each server subscribe to the topic - where each server is running some sort of long-polling or WebSockets chat system. Then, when an event occurs, the data is sent in the payload of the SNS notification. The server can then use this payload to determine which clients in its queue should receive the response, leaving any unrelated clients untouched. I actually built a small prototype for this, but haven't done a ton of testing to see if it's robust enough for a large number of users.
Hi, realtime chat doesn't work well with SNS. It's designed for email/SMS or services where a latency of one or a few seconds is acceptable. In realtime chat, one or a few seconds is not acceptable.
Check this link:
Latency (i.e. "Realtime") for PubNub vs SNS
Amazon SNS provides no latency guarantees, and the vast majority of latencies are measured over 1 second, and often many seconds slower. Again, this is somewhat irrelevant; Amazon SNS is designed for server-to-server (or email/SMS) notifications, where a latency of many seconds is often acceptable and expected.
Because PubNub delivers data via an existing, established open network socket, latencies are under 0.25 seconds from publish to subscribe for the 95th percentile of the subscribed devices. Most humans perceive something as "realtime" if the event is perceived within 0.6 - 0.7 seconds.
The way I would implement such a thing (if not using some framework) is the following:
Have a web server (on EC2) which accepts the messages from the user.
Use an Auto Scaling group for this web server. The web server can update a DB on Amazon RDS, which can scale easily.
If you are using your own DB, you might consider decoupling the DB from the web server using SQS (by sending all requests to the same queue), and then having a consumer which consumes the queue. This consumer can also be placed behind an Auto Scaling group, so that if the queue grows beyond X messages it will scale (you can set this up with alarms).
SQS normally updates pretty fast, i.e. less than one second from the moment you send a message to the moment it appears on the queue, and rarely more than that.
Since the new AWS IoT service started to support WebSockets, keepalive, and Pub/Sub a couple of months ago, you can easily build an elastic chat on it. AWS IoT is a managed service, with SDKs for many languages including JavaScript, that was built to handle monster loads (billions of messages) with zero administration.
You can read more about the update here:
https://aws.amazon.com/ru/about-aws/whats-new/2016/01/aws-iot-now-supports-websockets-custom-keepalive-intervals-and-enhanced-console/
Edit:
Last SQS update (2016/11): you can now use Amazon Simple Queue Service (SQS) for applications that require messages to be processed in a strict sequence and exactly once using First-in, First-out (FIFO) queues. FIFO queues are designed to ensure that the order in which messages are sent and received is strictly preserved and that each message is processed exactly once.
Source:
https://aws.amazon.com/about-aws/whats-new/2016/11/amazon-sqs-introduces-fifo-queues-with-exactly-once-processing-and-lower-prices-for-standard-queues/
From now on, implementing SQS + SNS looks like a good idea too.
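For illustration, a hedged sketch of the FIFO send path with boto3; the queue name is made up. FIFO queue names must end in .fifo, every send needs a MessageGroupId (the ordering scope), and exactly-once delivery relies on a deduplication ID or content-based deduplication:

import boto3

sqs = boto3.client("sqs")
queue = sqs.create_queue(QueueName="chat.fifo",
                         Attributes={"FifoQueue": "true"})
sqs.send_message(QueueUrl=queue["QueueUrl"],
                 MessageBody="hello",
                 MessageGroupId="room-42",           # ordering scope
                 MessageDeduplicationId="msg-0001")  # exactly-once key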

Best practices for sending automated daily emails from web service

I am running a web service that currently sends confirmation emails out to new users via the Gmail SMTP servers. As I'm only getting a few new users each day, this hasn't been a problem.
I've recently added new features to the webapp that will require a customized message to be sent out to each user every day. Think of this as similar to the regular messages LinkedIn sends out that give you a status report on the activity in your network. Every user's message will be different. With thousands of users, this means thousands of unique messages will be sent each day.
Edit: I've since found that these types of email are called "transactional or relationship messages". Spamtacular has a good article on differentiating between marketing and transactional email.
I don't think using Gmail's SMTP servers will cut it anymore, but I don't know that for sure. I don't know what Gmail's maximum number of outgoing messages per account is (it might be 100/day), but they limit outgoing mail to 500 recipients per message. I'm not sending a single message to 500 recipients, but I am going to be sending thousands of customized messages, with each recipient getting one per day.
I'm interested to learn any best practices for doing this (especially for Java-based webapps). Here are some of my thoughts and concerns on it:
Should I set up my own outgoing mail server? If I do this, it seems like I'll have all sorts of other issues to worry about, such as preventing mail server abuse, monitoring bounces, allowing ways to opt out of emails, etc. Are there any tools or services to help with this? Maybe something like OpenEMM or a service like MailChimp? But those seem focused more toward email marketing campaigns.
I don't think I should have the webapp itself handle sending emails as it currently is for new user signups. I'm thinking I should setup a separate messaging server that can access the same backend/datastore as the webapp. Thoughts on this?
Should I consider setting up some sort of message queueing service to help with this, such as JMS, RabbitMQ, ActiveMQ, etc.?
Do I need to provide users a way to opt-out? Do I need to flag these as bulk messages? I don't really consider these email marketing messages, but I'm unsure what is considered appropriate or proper netiquette.
Any advice is appreciated. I'm also very interested in open source tools or web services that simplify things and could help me to ramp up as quickly as possible.
Thanks!
With regard to your first question, yes, you should set up your own mail server. Using Gmail to do this might work for a while, but they are likely to shut you down in short order when they see this kind of activity. You could sign up for a business account and use App Engine to send messages. Here's a link with information about mail quotas for that service.
Regarding your second and third questions, it would be a good idea to have messages queued by the web app and sent out by a centralized service rather than having the app send out the messages on its own.
Usually I would just use a database table as a queue - the web app inserts rows for each message it wants to send. A service/scheduled-task app then grabs new messages out of the table and sends them off. This gives you lots of flexibility if you want to switch mail servers later, better reliability if the mail server is down, easier diagnostics if there are problems with recipients not getting messages, and the ability to resend messages. As for using JMS/MQ to do this - probably not necessary. IMO a database table used as a queue gives you more flexibility here than an actual JMS-based queue system.
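A minimal sketch of the table-as-queue idea, assuming sqlite3 and Python's standard smtplib; the table layout, addresses, and SMTP host are made up:

import sqlite3
import smtplib
from email.message import EmailMessage

db = sqlite3.connect("mailqueue.db")
db.execute("""CREATE TABLE IF NOT EXISTS outbox (
                  id INTEGER PRIMARY KEY,
                  recipient TEXT, subject TEXT, body TEXT,
                  sent INTEGER DEFAULT 0)""")

# Web app side: "queue" a message by inserting a row.
db.execute("INSERT INTO outbox (recipient, subject, body) VALUES (?, ?, ?)",
           ("user@example.com", "Daily digest", "Your network activity ..."))
db.commit()

# Scheduled-task side: send unsent rows, marking each one only on success,
# which also gives you resend-on-failure for free.
rows = db.execute("SELECT id, recipient, subject, body FROM outbox "
                  "WHERE sent = 0").fetchall()
with smtplib.SMTP("localhost") as smtp:   # placeholder SMTP host
    for row_id, rcpt, subj, body in rows:
        msg = EmailMessage()
        msg["From"] = "noreply@example.com"
        msg["To"] = rcpt
        msg["Subject"] = subj
        msg.set_content(body)
        smtp.send_message(msg)
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
        db.commit()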
As for opt outs, YES - you should give people a way to opt out. I don't think you need to flag the messages as bulk though.
On the architecture side of things, I would definitely consider decoupling the sending of the emails from the main service via some form of asynchronous message queuing (or a facsimile thereof, using a database as an intermediary). Another benefit of this approach is that if the SMTP server/network is down you can build in retry semantics. Additionally, for future scalability, you could implement multiple mail senders reading from the same queue, or implement send throttling or scheduling (i.e. send n messages per hour), etc.