According to this page about Shared Topic Subscription in WSO2, message delivery to subscribers sharing a client ID is done in round-robin order. The article only shows a single MB instance. I am wondering how delivery is managed when you have a cluster of MB instances with multiple subscribers sharing a client ID across the cluster. Is MB capable of round-robin delivery across all the nodes?
WSO2 Message Broker is a distributed broker with a slot-based delivery model [1]. Slot creation and slot dispatch happen on the coordinator node of the cluster. Each node runs a slot delivery worker that delivers messages to that node's local subscriptions.
So when there are multiple subscriptions across the cluster sharing the same subscription ID, the local subscriptions of each node receive messages in round-robin order.
Due to the slot architecture, it is guaranteed that no two subscriptions receive the same message, because each slot contains a distinct range of message IDs.
Example: say there is a two-node cluster with node1 and node2, and assume node1 is the coordinator. There is a topic called topic1. Publisher1 sends 1000 messages to topic1 on node1, and there are two subscribers on each node: subscriber1 and subscriber2 on node1, and subscriber3 and subscriber4 on node2. The coordinator creates slots as messages are published and dispatches them on demand to the nodes where subscribers are running; this happens over Thrift communication. So the local subscribers on node1 and node2 each receive messages in round-robin order.
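To make the shared subscription concrete at the client level, here is a rough JMS sketch (the JNDI lookup names, the client ID "sharedClient", and the subscription name "sub1" are made up for illustration). It assumes the broker is configured to allow shared topic subscriptions, so the same code can run on consumers attached to node1 and node2; each node's slot delivery worker then round-robins its slots' messages over the local consumers it finds on that shared subscription.

import javax.jms.{Connection, ConnectionFactory, Message, MessageListener, Session, Topic}
import javax.naming.InitialContext

object SharedTopicSubscriber {
  def main(args: Array[String]): Unit = {
    // JNDI names are illustrative; use the connection factory and topic
    // definitions from your MB client configuration.
    val ctx = new InitialContext()
    val cf = ctx.lookup("ConnectionFactory").asInstanceOf[ConnectionFactory]
    val topic = ctx.lookup("topic1").asInstanceOf[Topic]

    val connection: Connection = cf.createConnection()
    // Every subscriber that should share the subscription uses the same
    // client ID and the same durable subscription name.
    connection.setClientID("sharedClient")
    val session: Session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)
    val subscriber = session.createDurableSubscriber(topic, "sub1")

    subscriber.setMessageListener(new MessageListener {
      override def onMessage(message: Message): Unit = println(s"received: $message")
    })
    connection.start()
  }
}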
Hope this clarifies the high-level architecture.
[1] More details - https://docs.wso2.com/display/MB300/Architecture
Currently we are using Artemis for our publish-subscribe pattern, and in one specific case we use temporary queues to subscribe to a topic. They are non-shared and non-durable, and they receive messages as long as there is a consumer listening on them. As a simple example: application instances use a local cache for a configuration, and when that configuration changes an event is published, each instance receives the same message, and they evict their local caches. Each instance connects with a temporary queue (named by the broker with a UUID) at startup, and instances may be restarted because of a deployment or rescheduling on Kubernetes (as they run on spot instances). Is it possible to migrate this usage to AWS services using SNS and SQS?
So far the closest thing I have found is virtual queues, but as far as I understand, different virtual queues (of one standard queue) do not receive the same message. If I had to use a standard queue per instance, I would need unique queue names for the instances; there may also be scale-up and scale-down, so the application would need to detect queues that no longer have consumers and remove them (so they stop receiving messages from the topic).
I have made some trials with virtual queues where I created two consumer threads (receiving messages with AmazonSQSVirtualQueuesClient) and sent a message to the host queue (with AmazonSQSClient). The messages did not end up on the virtual queues; in fact they are still on the host queue at the moment. I also tried sending the message with AmazonSQSVirtualQueuesClient, but then I get the warning WARNING: Orphaned message sent to ... . I believe virtual queues only fit the request-responder pattern, where the exact destination is known to the publisher.
I have a use case in which data from a million clients is processed, each client by an individual actor, and all the actors are created across multiple nodes that form a cluster.
When I receive a message I have to send it to the particular actor in the cluster that is responsible for that client. How can I map a message to that actor with Akka Cluster? I don't want it to be sent to other actors.
Is this use case achievable with Akka Cluster?
How will failure handling work here?
I can't understand what a cluster singleton is; the docs say it is only created on the oldest node. In my case I want each of the million actors to be a singleton.
How is a particular actor in the cluster mapped to a message?
How can I create actors like this in a cluster?
Assuming that each actor is responsible for some uniquely identifiable part of state, it sounds like you want to use cluster sharding to balance the actors across the cluster, and route messages by ID.
Note that since the overhead of an actor is fairly small, it's not implausible to host millions of actors on a single node.
See the Akka JVM or Akka.Net docs.
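For orientation, here is a minimal classic Cluster Sharding sketch in Scala; the message type ClientEvent, the actor ClientActor, and the shard count are made up for the example. Every message carries the client id, and sharding routes it to the single actor owning that id, creating the actor on demand on whichever node its shard is allocated to.

import akka.actor.{Actor, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// Hypothetical message type: every message carries the id of the client it belongs to.
final case class ClientEvent(clientId: String, payload: String)

// One actor instance per client id, created lazily on the node that owns the shard.
class ClientActor extends Actor {
  override def receive: Receive = {
    case ClientEvent(id, payload) =>
      // process this client's data here
      println(s"actor for client $id got $payload")
  }
}

object ClientSharding {
  // Tell sharding how to derive the entity id and the shard id from a message.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case msg @ ClientEvent(id, _) => (id, msg)
  }
  val numberOfShards = 100
  val extractShardId: ShardRegion.ExtractShardId = {
    case ClientEvent(id, _) => (math.abs(id.hashCode) % numberOfShards).toString
  }

  def start(system: ActorSystem) =
    ClusterSharding(system).start(
      typeName = "Client",
      entityProps = Props[ClientActor](),
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)
}

// Usage: send every message to the shard region; sharding delivers it to the
// one actor responsible for that client id, wherever in the cluster it lives.
// val region = ClientSharding.start(system)
// region ! ClientEvent("client-42", "hello")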
I have two services, one a producer (Service A) and one a consumer (Service B). Service A produces a message that is published to Amazon SQS and then delivered to Service B, which subscribes to the queue. This works fine as long as I have a single instance of Service B.
But when I start another instance of Service B, so that there are now 2 instances of Service B both subscribing to the same queue (as it is the same service), I observe that the messages from SQS are delivered in round-robin fashion: at any given time, only one instance of Service B receives the message published by Service A. I want a message published to this queue to be received by all the instances of Service B.
How can we do this? I have developed these services as Spring Boot applications with Spring Cloud dependencies.
Please see the diagram below for reference.
If you are interested in building functionality like this, use SNS, not SQS. We have a Spring Boot example that shows how to build a web app that lets users sign up for email subscriptions; when a message is published, all subscribed email addresses get the message.
The purpose of this example is to get you up and running building a Spring Boot app using the Amazon Simple Notification Service. That is, you can build this app with Spring Boot and the official AWS Java V2 API:
Creating a Publish/Subscription Spring Boot Application
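The core of such an app is a single publish call against an SNS topic; everything subscribed to the topic then receives the message. A minimal sketch with the AWS SDK for Java V2 (the topic ARN and message text below are made up):

import software.amazon.awssdk.services.sns.SnsClient
import software.amazon.awssdk.services.sns.model.PublishRequest

object SnsPublishExample {
  def main(args: Array[String]): Unit = {
    val sns = SnsClient.create()

    // Illustrative topic ARN; every subscriber of this topic
    // (email addresses, SQS queues, HTTP endpoints, ...) gets the message.
    val topicArn = "arn:aws:sns:us-east-1:123456789012:config-changed"

    sns.publish(PublishRequest.builder()
      .topicArn(topicArn)
      .message("configuration updated, please refresh caches")
      .build())

    sns.close()
  }
}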
While your messages may appear to be read in a round-robin fashion, they are not actually consumed in a round robin. SQS works by making all messages available to any consumer (that has the appropriate IAM permissions) and hiding a message for a configurable amount of time (the visibility timeout) as soon as one consumer fetches it, effectively "locking" that message. The fact that all of your consumers seem to be operating in a round-robin way is most likely coincidental.
As others have mentioned you could use SNS instead of SQS to fanout messages to multiple consumers at once, but that's not as simple a setup as it may sound. If your service B is load balanced, the HTTP endpoint subscriber will point to the Load Balancer's DNS name, and thus only one instance will get the message. Assuming your instances have a public IP, you could modify your app so that it self-registers as an HTTP subscriber to the topic when the application wakes up. The downsides here are that you're not only bypassing your Load Balancer, you're also losing the durability guarantees that come with SQS since an SNS topic will try to send the message X times, but will simply drop the message after that.
An alternative solution would be to set the visibility timeout on the SQS queue to 0; that way a message is never locked and every consumer is able to read it. That also means you will need to modify your application to a) not process messages twice, as the same message will likely be read more than once before processing finishes, and b) handle failure gracefully when one instance deletes the message from the queue and the other instances then try to delete it as well.
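If you go that route, this is roughly what the change looks like with the AWS SDK for Java V2 (the queue URL is made up; the application-side deduplication mentioned above is not shown):

import scala.jdk.CollectionConverters._
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{QueueAttributeName, SetQueueAttributesRequest}

object DisableVisibilityTimeout {
  def main(args: Array[String]): Unit = {
    val sqs = SqsClient.create()
    val queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/service-b-events" // illustrative

    // A visibility timeout of 0 means a received message is never hidden from
    // other consumers, so every instance can read it; the application must then
    // cope with duplicate processing and concurrent deletes by itself.
    sqs.setQueueAttributes(SetQueueAttributesRequest.builder()
      .queueUrl(queueUrl)
      .attributes(Map(QueueAttributeName.VISIBILITY_TIMEOUT -> "0").asJava)
      .build())

    sqs.close()
  }
}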
Alternatively, you might want to use some sort of service mesh or service discovery mechanism so that instances can communicate with each other peer-to-peer: one instance pulls the message from the SQS queue and propagates it to the other instances of the service.
You could also use a distributed store like Redis or DynamoDB to persist the messages and their current status so that every instance can read them, but only one instance will ever insert a new row.
Ultimately there are a few solutions out there for this, but without understanding the use case properly it's hard to make a firm recommendation.
Implement message fanout using Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS). There is a hands-on Getting Started example of this.
Here's how it works: in the fanout model, service A publishes a message to an SNS topic. Each instance of service B has an associated SQS queue which is subscribed to that SNS topic. The published message is delivered to each subscribed queue and hence to each instance of service B.
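A rough SDK-level sketch of that wiring with the AWS SDK for Java V2 is shown below. It assumes the SNS topic already exists (the topic ARN and queue name are made up), and it omits the queue access policy that must allow the topic to deliver to the queue; each instance of service B would run this with its own queue name at startup.

import software.amazon.awssdk.services.sns.SnsClient
import software.amazon.awssdk.services.sns.model.SubscribeRequest
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{CreateQueueRequest, GetQueueAttributesRequest, QueueAttributeName}

object FanoutSetup {
  def main(args: Array[String]): Unit = {
    val sns = SnsClient.create()
    val sqs = SqsClient.create()

    // Illustrative names: one topic for service A, one queue per instance of service B.
    val topicArn = "arn:aws:sns:us-east-1:123456789012:service-a-events"
    val queueUrl = sqs.createQueue(
      CreateQueueRequest.builder().queueName("service-b-instance-1").build()).queueUrl()

    // Resolve the queue's ARN, which SNS needs as the subscription endpoint.
    val queueArn = sqs.getQueueAttributes(GetQueueAttributesRequest.builder()
      .queueUrl(queueUrl)
      .attributeNames(QueueAttributeName.QUEUE_ARN)
      .build()).attributes().get(QueueAttributeName.QUEUE_ARN)

    // Subscribe this instance's queue to the topic; every subscribed queue
    // receives its own copy of each published message.
    sns.subscribe(SubscribeRequest.builder()
      .topicArn(topicArn)
      .protocol("sqs")
      .endpoint(queueArn)
      .build())
  }
}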
I am new to Kafka. My use case: I have provisioned a 3-node Kafka cluster, and if I produce a message on node1 it is automatically synced to node2 and node3 (meaning I can consume the message on node2 and node3). Now I want all of the messages on another AWS EC2 machine. How can I do that?
You can use Apache Kafka's MirrorMaker, which facilitates multi-datacentre replication and lets you copy data between two Kafka clusters.
Data is read from topics in the origin cluster and written to a topic with the same name in the destination cluster. You can run many such mirroring processes to increase throughput and for fault-tolerance (if one process dies, the others will take over the additional load). The origin and destination clusters are completely independent entities: they can have different numbers of partitions and the offsets will not be the same. For this reason the mirror cluster is not really intended as a fault-tolerance mechanism (as the consumer position will be different). The MirrorMaker process will, however, retain and use the message key for partitioning, so order is preserved on a per-key basis.
Another option (that requires licensing) is Confluent Replicator that also handles topic configuration.
The Confluent Replicator allows you to easily and reliably replicate topics from one Kafka cluster to another. In addition to copying the messages, this connector will create topics as needed, preserving the topic configuration in the source cluster. This includes preserving the number of partitions, the replication factor, and any configuration overrides specified for individual topics.
Here's a quickstart tutorial that will help you to get started with Confluent Kafka Replicator.
If I understand correctly, the new machine is not a Kafka broker, so mirroring data to it wouldn't work.
it's automatically syncing in both node2 and node3
Only if the replication factor is 3 or more
mean I am consuming the msg in node2 and node3
Only if you have 3 or more partitions would you be consuming from all three nodes, since there is only one leader per partition, and all consume requests for a partition are served by its leader.
If you just run any consumer process on this new machine, you will get all messages from the existing cluster. If you planned on storing those messages for some particular reason, I would suggest looking into the Kafka Connect S3 connector; then you can query the S3 bucket using Athena, for example.
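Something like the following, run on the new EC2 machine, is enough to receive every message from the existing cluster; the broker host names, group id, and topic name are placeholders.

import java.time.Duration
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.serialization.StringDeserializer

object ClusterConsumer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Point at the existing brokers (host names are illustrative).
    props.put("bootstrap.servers", "node1:9092,node2:9092,node3:9092")
    props.put("group.id", "ec2-copy")
    props.put("key.deserializer", classOf[StringDeserializer].getName)
    props.put("value.deserializer", classOf[StringDeserializer].getName)
    props.put("auto.offset.reset", "earliest") // start from the beginning of the topic

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(List("my-topic").asJava)

    while (true) {
      val records = consumer.poll(Duration.ofSeconds(1))
      records.asScala.foreach { r =>
        println(s"partition=${r.partition()} offset=${r.offset()} value=${r.value()}")
      }
    }
  }
}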
I have two nodes in an Akka cluster.
I subscribe to all ClusterDomainEvent of the cluster with:
cluster.subscribe(
  self,
  InitialStateAsEvents,
  classOf[ClusterDomainEvent])
When one of the two nodes is down, I receive an Unreachable event and I start to receive warnings in the logs every few seconds like the following:
Association with remote system [akka.tcp://application#127.0.0.1:2554] has failed
When the downed node comes back, the logs stop, so the node is detected as reachable again, but I still don't get a ReachableMember event.
What am I missing? What should I do in order to receive this cluster event?
The way to do it is to subscribe explicitly to the reachability events with classOf[ReachabilityEvent].
So
cluster.subscribe(
  self,
  InitialStateAsEvents,
  classOf[MemberEvent],
  classOf[ReachabilityEvent])
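With that subscription in place the actor receives both UnreachableMember and ReachableMember. A minimal sketch of the matching receive block, assuming a classic (untyped) actor:

import akka.actor.Actor
import akka.cluster.ClusterEvent.{MemberEvent, ReachableMember, UnreachableMember}

class ClusterListener extends Actor {
  override def receive: Receive = {
    case UnreachableMember(member) =>
      println(s"member detected as unreachable: $member")
    case ReachableMember(member) =>
      println(s"member is reachable again: $member")
    case _: MemberEvent => // member up / exited / removed, ignored here
  }
}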