Ideal HA setup for WSO2 API Manager

I have been trying to set up an active-active deployment for WSO2 API Manager. This is described here:
https://docs.wso2.com/display/AM210/Configuring+an+Active-Active+Deployment#Linux-Mac
A few observations:
The setup looks to be on two different nodes, with all components deployed on each node.
The setup indicates that the Publisher on both nodes should be pointed to one of the two nodes. If that is the case, say node-1 (the publisher node) goes down; how will the second active instance help?
It recommends using NFS for content synchronization. NFS becomes a single point of failure in that case. Why is content synchronization needed, though? Is it only for advanced Siddhi-query-based throttling policies?
Finally, if I do two independent, all-components-in-one API Manager setups with a shared database and content synchronization using rsync/unison, but no throttling data publishing, what are the downsides?
Is this kind of setup a fit for active-passive?
Thanks

If you use rsync or any other deployment synchronization mechanism that is one-way, it becomes a single point of failure. In most use cases API publishing happens at development time, and this is indeed a limitation.
That's why we use NFS or another file-share mechanism. You can point the Publisher to localhost and write the Synapse file to the shared file system, which is then shared between the two nodes. When you publish an API, a Synapse artifact is created and deployed on the gateway node (in your case, one of the two nodes). You can find a sample file in the APIM_HOME/repository/deployment/server/synapse/default/api location.
If you disable throttling data publishing, i.e. advanced throttling, your APIs can be accessed without any limitation; there is simply no limit. However, burst control and backend throttling will still apply.
Yes, this setup fits active-passive. You can control active-active vs. active-passive from the load balancer.

Related

How to configure WSO2 Identity Server to avoid a single point of failure?

My company wants to set up a WSO2 Identity Server cluster on 3 machines such that if one machine fails, the cluster still works.
All the WSO2 documentation shows clustering with a shared user store and database but does not mention how to avoid a single point of failure.
As per my understanding, the only way to do this is to set up an external LDAP cluster as the user store and an external database cluster. But that would be much more complex and hard to manage.
Can we configure WSO2's embedded LDAP to replicate and sync with the other nodes' embedded LDAP?
Is there any other way to avoid a single point of failure in WSO2?
No, you can't use embedded LDAP.
You should avoid using the embedded LDAP in production at all costs. It will surely get corrupted with concurrent requests and the growth of data, and you will not be able to recover it at all. It is there for testing purposes only.
If you want to avoid any single point of failure due to the DB or LDAP, you should cluster the DB and LDAP as instructed by the respective provider, and point the WSO2 server to the common LB URL.

Deploying distributed queue mechanism on GCP

Background
I have a system running on Google Cloud. The system is built on a microservices architecture, and the applications communicate with each other through queues. Currently I am using Azure Storage Queues and want to move this managed service to GCP as well.
Requirements
Mandatory:
Max processing time - 1 second to 1 hour.
Ack mechanism to mark tasks as completed.
Scalable solution - should support a load of a few thousand messages per second.
Support pulling of messages.
Nice to have:
Handle priorities - some task queues are more important than others.
Managed solution - I prefer to use a managed solution rather than install it myself.
Solutions I have checked
I have already checked the following services and ruled them out because:
Microsoft Azure Storage Queues - outside of Google Cloud.
Google Cloud Pub/Sub - does not meet the mandatory requirements: processing time is limited to 10 minutes.
Google Cloud Tasks - seems to be designed mostly for serverless applications, which does not fit my application.
I also checked RabbitMQ, which seems to support all my requirements except 'managed solution'. It seems GCP does not provide RabbitMQ as a service, so I would have to deploy it on virtual machines and do all the configuration myself...
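To make the mandatory requirements concrete, here is a minimal sketch of the pull-and-ack flow I need, written here against Cloud Pub/Sub with the google-cloud-pubsub Python client (the project and subscription names are placeholders of mine):

    from google.cloud import pubsub_v1

    def process(payload: bytes) -> None:
        print("processing", payload)  # placeholder for the real task handler

    subscriber = pubsub_v1.SubscriberClient()
    subscription = subscriber.subscription_path("my-project", "billing-tasks")

    # Pull a batch of messages instead of having them pushed to a webhook.
    response = subscriber.pull(subscription=subscription, max_messages=10)
    for received in response.received_messages:
        process(received.message.data)
        # An explicit ack marks the task as completed; an unacked message is
        # redelivered after its ack deadline (at most 600 seconds) expires,
        # which is the 10-minute ceiling that rules Pub/Sub out for my long tasks.
        subscriber.acknowledge(subscription=subscription, ack_ids=[received.ack_id])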
The question
To be honest, the solution I'm looking for seems like it should be pretty common, nothing really special. But all of the services I described here have some critical cons.
Which service did I miss here?

WSO2 APIM Clustering Configuration

I am using WSO2 APIM 1.10.0 in a single-server deployment and would like to move to a clustered one. Looking at this documentation I could find a lot of information; however, something is bothering me: do I really always have to do all of it?
I mean, I don't want to split all my workers into multiple instances; all I want is to configure two full setups (key manager + publisher + store + gateway), each one on its own host, and make sure I can put a load balancer in front of them.
The requirements are simple: I would like to share the load across both of them and guarantee better availability in case one of the hosts goes down. Is it a MUST to break down the whole installation on both nodes so that I have to start each component independently with offset ports configured?
I could see that in version 2.0.0 a lot has been simplified; is there any way to achieve the same on 1.10.0?
Regards
Splitting into profiles is not mandatory. It is designed this way to scale API Manager based on the TPS. If you have a low TPS count and prefer a 2-node HA setup, you can do the following.
Cluster the two nodes using a membership scheme such as WKA or AWS.
Use dep-sync to share API artifacts between the two nodes.
Use one node as the Publisher. You need to handle the publisher traffic with a single node; this is to avoid SVN conflicts.
You can serve API requests from both nodes.
You do not always have to use the deployment pattern mentioned in the documentation you pointed to. There are various other deployment patterns you can use according to your scalability needs and requirements.
Please refer to the following documentation: [1] for the different deployment patterns you can use for WSO2 API Manager, and [2] for more information on worker/manager separation and load balancing.
[1] https://docs.wso2.com/display/CLUSTER44x/API+Manager+Deployment+Patterns
[2] https://docs.wso2.com/display/CLUSTER44x/Separating+the+Worker+and+Manager+Nodes

Microservices service registry registration and discovery

A little domain presentation
I currently have two microservices:
User - managing CRUD on users
Billings - managing CRUD on billings, with a "reference" to the user concerned by the billing
Explanation
When a billing is requested via an HTTP request, I need to send the full billing object with the user loaded. In this specific case, I really need this.
At first I looked around, and it seemed like a good idea to use message queuing, for asynchronicity, so the billing service can send on a queue:
"who is the user with id 123456? I need to load it"
That way my two services can exchange data without really knowing each other, or knowing each other's "location".
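To illustrate what I have in mind, here is a rough sketch of that exchange from the billing side, assuming a RabbitMQ broker and the pika client (the queue name "user.requests" and the payload format are just examples of mine):

    import json
    import uuid

    import pika  # assuming a RabbitMQ broker reachable on localhost

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()

    # Billing service: ask "who is the user with id 123456?" on a queue,
    # without knowing where the user service actually runs.
    reply_queue = channel.queue_declare(queue="", exclusive=True).method.queue
    correlation_id = str(uuid.uuid4())

    channel.basic_publish(
        exchange="",
        routing_key="user.requests",  # the queue the user service listens on
        properties=pika.BasicProperties(reply_to=reply_queue,
                                        correlation_id=correlation_id),
        body=json.dumps({"user_id": "123456"}),
    )

    # Wait for the user service to drop the answer on the reply queue.
    user = None

    def on_reply(ch, method, properties, body):
        global user
        if properties.correlation_id == correlation_id:
            user = json.loads(body)

    channel.basic_consume(queue=reply_queue, on_message_callback=on_reply, auto_ack=True)
    while user is None:
        connection.process_data_events(time_limit=1)

    print(user)  # full user object, ready to embed in the billing response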
Problems
My first question is: what is the point of using a service registry in that case? Message queuing is able to give us the information without knowing anything at all about the user service's location, no?
When do we need to use service registration?
In the case of the Aggregator pattern, with a RESTful API, we can navigate through HATEOAS links. In the case of the Proxy pattern, maybe? When the microservices are fronted by another service?
Assume now that we use the Proxy pattern, with a "front" service. In this case, it's okay for me to use service registration. But does it mean that the front service has to know the names of the user service and the billing service in the service registry? Example:
Service User registers as "UserServiceOfHell:http://80.80.80.80/v1/" on ZooKeeper
Service Billing registers as "BillingService:http://90.90.90.90/v4.3/"
The front-end service needs to send some requests to the user and billing services, which implies it needs to know that the user service is "UserServiceOfHell". Is this defined at the beginning of the project?
Last question: can we use multiple microservice patterns in one microservices architecture, or is this bad practice?
NB: Everything I ask is based on http://blog.arungupta.me/microservice-design-patterns/
A lot of good questions!
First of all, I want to answer your last question: multiple patterns are OK when you know what you're doing. It's fine to mix asynchronous queues, HTTP calls and even binary RPC; it depends on your consistency, availability and performance requirements. Sometimes simple pub/sub is a good fit and sometimes you need a distributed lock - microservices differ.
Your example is simple: two microservices need to exchange some information. You chose an asynchronous queue - fine; in this case they don't really need to know about each other. Queues don't require any discovery between consumers.
But we do need service discovery in other cases! For example, for backing services: databases, caches, and actually queues as well. Without service discovery you have probably hardcoded the URL to your queue, but if it goes down you have nothing. You need high availability - a cluster of nodes replicating your queue, for example. When you add a new node or an existing node crashes, you should not have to change anything; the service discovery tool should notice that and update the registry.
Consul is a perfect modern service discovery tool: you can just use a custom DNS name to access your backing services, and Consul will perform constant health checks and keep your cluster healthy.
The same rule applies to microservices: when you have a cluster running service A and you need to access it from service B without any queues (for example, via an HTTP call), you have to use service discovery to be sure the endpoint you use brings you to a healthy node. So it's a perfect fit for the Aggregator or Proxy patterns from the article you mentioned.
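As a rough illustration (assuming a local Consul agent on port 8500 and a service registered under the name "user-service" - both are just examples), service B can resolve a healthy instance of service A through Consul's HTTP API before making the HTTP call:

    import random

    import requests

    CONSUL = "http://localhost:8500"  # local Consul agent (example address)

    def resolve(service_name):
        # Ask Consul only for instances that are passing their health checks.
        r = requests.get(f"{CONSUL}/v1/health/service/{service_name}",
                         params={"passing": "true"})
        r.raise_for_status()
        entries = r.json()
        if not entries:
            raise RuntimeError(f"no healthy instance of {service_name}")
        entry = random.choice(entries)  # naive client-side load balancing
        address = entry["Service"]["Address"] or entry["Node"]["Address"]
        return address, entry["Service"]["Port"]

    host, port = resolve("user-service")
    user = requests.get(f"http://{host}:{port}/v1/users/123456").json()

If you prefer not to call the HTTP API directly, the same lookup works through Consul's DNS interface (for example, user-service.service.consul).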
Probably most of the confusion is caused by the fact that you see "hardcoded" URLs in ZooKeeper and think you need to manage them manually. Modern tools like Consul or etcd allow you to avoid that headache and just rely on them. It's also achievable with ZooKeeper, but it will require more time and resources to get a similar setup.
PS: please remember the most important rule in microservices - http://martinfowler.com/bliki/MonolithFirst.html

WSO2 API Manager Clustering configuration

I'm trying to install and configure a high-availability setup for the WSO2 API Manager. I've been reading through this document: http://docs.wso2.org/wiki/display/Cluster/Clustering+API+Manager and it explains how to break up the 4 components of the application into separate folders, and that these 4 components can run on a single server. I'm not sure why this is needed. All I really want to do is take 2 servers, install the full application on both of them (without breaking the application up into 4 different pieces) and cluster them together, with an Elastic Load Balancer in front of them.
What is the purpose of splitting up the multiple components on the same server if they all run out of a single installation? I'm looking for the simplest way to provide failover capability for this application if one server goes down. Any insight into their methodology would be greatly appreciated.
Thanks.
The article you've linked describes distributing the different components of API Manager. If you look at the very end of that article, there's a link to the clustering configuration doc. In a production deployment it is usually encouraged to run the 4 components on different nodes, rather than having everything in one node and having multiple such nodes. That's why it goes on to explain breaking it down into separate components. The official AM docs below include a page on different deployment patterns.
You can go through the following articles to get a better understanding on clustering API Manager.
http://docs.wso2.org/wiki/display/AM140/Clustered+Deployment
http://sanjeewamalalgoda.blogspot.com/2012/09/how-do-clustering-and-enable-replicate.html
My 2 cents:
The documentation mentioned in the remarks explains how WSO2 sees the world of clustering: spread the different functionality over different JVMs. This sounds logical from an architectural point of view. A disadvantage is that the different applications also need to be administered by operations, which makes the technical architecture rather complex.
In our situation, we defined 2 different servers with extra CPU and memory; on these servers we installed the full WSO2 API Manager and defined the cluster configuration. Everything is provisioned via Puppet.
Just a straightforward install, with all data sources pointing to one schema in an Oracle database.
And... it is working; our developers are happy, operations is happy, and the architecture department is happy.