How use throttling in Wso2 EI using client IP - wso2

I am planning to use throttling in wso2-ei 6.4.0, From local system i tested the scenario i face some problems could please help me if any one know thanks in advance.
If we restart the wso2-ei node policy is not working. It taking again from starting ( suppose request limit is 10 for 1 hour,Before restarting the node it processing 5 request after restarting it should take remaining 5 request but it accepting 10 request
Throttling is working based on wso2-ei node level but suppose Linux server having 10 nodes how to distribute the throttling policy in Linux server level .
How to consider client ip in throttling. If request coming from F5 load balance i need to consider the requested system IP not F5 server IP.

If we restart the wso2-ei node policy is not working. It taking again from starting ( suppose request limit is 10 for 1 hour,Before restarting the node it processing 5 request after restarting it should take remaining 5 request but it accepting 10 request
The throttle mediator does not store the throttle count. Therefore if you perform a server restart it will reset the throttle count value and start from zero. In a production environment, it is not expected to have frequent server restarts.
Throttling is working based on wso2-ei node level but suppose Linux server having 10 nodes how to distribute the throttling policy in Linux server level .
If you want to maintain the throttle count across all the nodes you need to cluster the nodes. Throttle mediator uses hazelcast cluster messages to maintain a global count across the cluster.

Related

Spring Data Neo4J - Unable to acquire connection from pool within configured maximum time

We have a Reactive REST API using Spring Data Neo4j (SpringBoot v2.7.5) deployed to Kubernetes. When running a stress test to determine the breaking point, once the volume of requests that the service can handle has been breached, the service does not auto-recover, even after the load has dropped to a level at which the service can handle.
After the service has fallen over the Neo4J health indicator shows:
“org.neo4j.driver.exceptions.ClientException: Unable to acquire connection from the pool within configured maximum time of 60000ms”
With respect to connection/configuration settings we are using defaults configured by SDN.
Observations:
Up until the point at which the service breaks only a small number of connections are utilised, at the point at which it breaks the connections in use jumps up to the max pool size and the above mentioned error is observed. No matter how much time passes (even well beyond the max connection lifetime) the service is unable to acquire a connection from the pool. Upon manually shutting down and restarting the service/pod the service returns to a healthy state.
As an interim solution we now check the Neo4J health indicator, if the mentioned error is present the liveness state is set to down which triggers Kubernetes to restart the service automatically. However, I’m wondering if there is an underlying issue with the connections in the pool not getting ‘cleaned up’?
You can take a look at this discussion https://github.com/spring-projects/spring-data-neo4j/issues/2632
I had the same issue. The problem is that either Spring Framework or Neo4j reactive transaction manager doesn't close connections in a complex reactive flow e.g. when there are a lot of inner calls/mappings and somewhere inside an exception is thrown.
So as a workaround you can add #Transactional in such places to avoid multiple transactions to be created.

Kafka Multi broker setup with ec2 machine: Timed out waiting for a node assignment. Call: createTopics

I am trying to setup kafka with 3 broker nodes and 1 zookeeper node in AWS EC2 instances. I have following server.properties for every broker:
kafka-1:
broker.id=0
listeners=PLAINTEXT_1://ec2-**-***-**-17.eu-central-1.compute.amazonaws.com:9092
advertised.listeners=PLAINTEXT_1://ec2-**-***-**-17.eu-central-1.compute.amazonaws.com:9092
listener.security.protocol.map=,PLAINTEXT_1:PLAINTEXT
inter.broker.listener.name=PLAINTEXT_1
zookeeper.connect=ec2-**-***-**-105.eu-central-1.compute.amazonaws.com:2181
kafka-2:
broker.id=1
listeners=PLAINTEXT_2://ec2-**-***-**-43.eu-central-1.compute.amazonaws.com:9093
advertised.listeners=PLAINTEXT_2://ec2-**-***-**-43.eu-central-1.compute.amazonaws.com:9093
listener.security.protocol.map=,PLAINTEXT_2:PLAINTEXT
inter.broker.listener.name=PLAINTEXT_2
zookeeper.connect=ec2-**-***-**-105.eu-central-1.compute.amazonaws.com:2181
kafka-3:
broker.id=2
listeners=PLAINTEXT_3://ec2-**-***-**-27.eu-central-1.compute.amazonaws.com:9094
advertised.listeners=PLAINTEXT_3://ec2-**-***-**-27.eu-central-1.compute.amazonaws.com:9094
listener.security.protocol.map=,PLAINTEXT_3:PLAINTEXT
inter.broker.listener.name=PLAINTEXT_3
zookeeper.connect=ec2-**-***-**-105.eu-central-1.compute.amazonaws.com:2181
zookeeper:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
When I ran following command in zookeeper I see that they are connected
I also telnetted from any broker to other ones with broker port they are all connected
However, when I try to create topic with 2 replication factor I get Timed out waiting for a node assignment
I cannot understand what is incorrect with my setup, I see 3 nodes running in zookeeper, but having problems when creating topic. BTW, when I make replication factor 1 I get the same error. How can I make sure that everything is alright with my cluster?
It's good that telnet checks the port is open, but it doesn't verify the Kafka protocol works. You could use kcat utility for that, but the fix includes
listeners are set to either PLAINTEXT://:9092 or PLAINTEXT://0.0.0.0:9092 for every broker, which means using the same port
Removing the number from the listener mapping and advertised listeners property such that each broker is the same
I'd also recommend looking at using Ansible/Terraform/Cloudformation to ensure you consistently modify the cluster rather than edit individual settings manually

Eureka Server memory, renew threshold is 0, self preservation issue - AWS

I deployed 2 instances of Eureka server and a total of 12 instances microservices. .
Renews (last min) is as expected 24. But Renew Threshold is always 0. Is this how it supposed to be when self preservation is turned on? Also seeing this error - THE SELF PRESERVATION MODE IS TURNED OFF. THIS MAY NOT PROTECT INSTANCE EXPIRY IN CASE OF NETWORK/OTHER PROBLEMS. What's the expected behavior in this case and how to resolve this if this is a problem?
As mentioned above, I deployed 2 instances of Eureka Server but after running for a while like around 19-20 hours, one instance of Eureka Server always goes down. Why that could be possibly happening? I checked the processes running using top command and found that Eureka Server is taking a lot of memory. What needs to be configured on Eureka Server so that it don't take a lot of memory?
Below is the configuration in the application.properties file of Eureka Server:
spring.application.name=eureka-server
eureka.instance.appname=eureka-server
eureka.instance.instance-id=${spring.application.name}:${spring.application.instance_id:${random.int[1,999999]}}
eureka.server.enable-self-preservation=false
eureka.datacenter=AWS
eureka.environment=STAGE
eureka.client.registerWithEureka=false
eureka.client.fetchRegistry=false
Below is the command that I am using to start the Eureka Server instances.
#!/bin/bash
java -Xms128m -Xmx256m -Xss256k -XX:+HeapDumpOnOutOfMemoryError -Dspring.profiles.active=stage -Dserver.port=9011 -Deureka.instance.prefer-ip-address=true -Deureka.instance.hostname=example.co.za -Deureka.client.serviceUrl.defaultZone=http://example.co.za:9012/eureka/ -jar eureka-server-1.0.jar &
java -Xms128m -Xmx256m -Xss256k -XX:+HeapDumpOnOutOfMemoryError -Dspring.profiles.active=stage -Dserver.port=9012 -Deureka.instance.prefer-ip-address=true -Deureka.instance.hostname=example.co.za -Deureka.client.serviceUrl.defaultZone=http://example.co.za:9011/eureka/ -jar eureka-server-1.0.jar &
Is this approach to create multiple instances of Eureka Server correct?
Deployment is on AWS. Is there any specific configuration needed for Eureka Server on AWS?
Spring Boot version: 2.3.4.RELEASE
I am new to all these, any help or direction will be a great help.
Let me try to answer your question one by one.
Renews (last min) is as expected 24. But Renew Threshold is always 0. Is this how it supposed to be when self-preservation is turned on?
What's the expected behaviour in this case and how to resolve this if this is a problem?
I can see that eureka.server.enable-self-preservation=false in your configuration, This is really needed if you want to remove an already registered application as soon as it fails to renew its lease.
Self-preservation feature is to prevent the above-mentioned situation since it can happen if there are some network hiccups. Say, you have two services A and B, both are registered to eureka and suddenly, B failed to renew its lease because of a temporary network hiccup. If Self-preservation is not there then B will be removed from the registry and A won't be able to reach B despite B is available.
So we can say that Self-preservation is a resiliency feature of eureka.
Renews threshold is the expected renews per minute, Eureka server enters self-preservation mode if the actual number of heartbeats in last minute(Renews) is less than the expected number of renews per minute(Renew Threshold) and
Of course, you can control the Renews threshold. you can configure renewal-percent-threshold (by default it is 0.85)
So in your case,
Total number of application instances = 12
You don't have eureka.instance.leaseRenewalIntervalInSeconds so default value 30s
and eureka.client.registerWithEureka=false
so Renewals(last minute) will be 24
You don't have renewal-percent-threshold configured, so the default value is 0.85
Number of renewals per application instance per minute = 2 (30s each)
so in case of self-preservation is enable Renews threshold will be calculated as 2 * 12 * 0.85 = 21 (rounded)
And in your case self-preservation is turned off, so Eureka won't calculate Renews Threshold
One instance of Eureka Server always goes down. Why that could be possibly happening?
I'm not able to answer this question time being, this can be because of multiple reasons.
You can find the reason mostly from logs, or if you can post logs here it would be great.
What needs to be configured on Eureka Server so that it doesn't take a lot of memory?
From the information that you have provided, I cannot tell about your memory issue and in addition to that you already specified -Xmx256m and I didn't face any memory issues with the eureka servers so far.
But I can say that top is not the right tool for checking the memory consumed by your java process. When JVM starts, It takes some memory from the operating system.
This is the amount of memory you see in tools like ps and top. so better use jstat or jvmtop
Is this approach to create multiple instances of Eureka Server correct?
It seems you are having the same hostname(eureka.instance.hostname) for both instances. Replication won't work if you use the same hostname.
And make sure that you have the same application names in both instances.
Deployment is on AWS. Is there any specific configuration needed for Eureka Server on AWS?
Nothing specifically for AWS as per my knowledge, other than making sure that the instances can communicate with each other.

Orderer disconnections in a Hyperledger Fabric application

We have a hyperledger application. The main application is hosted on AWS VM's whereas the DR is hosted on Azure VM's. Recently the Microsoft Team identified that one of the DR VM's became unavailable and the availability was restored in approximately 8 minutes. As per Microsoft "This unexpected occurrence was caused by an Azure initiated auto-recovery action. The auto-recovery action was triggered by a hardware issue on the physical node where the virtual machine was hosted. As designed, your VM was automatically moved to a different and healthy physical node to avoid further impact." The Zookeeper VM was also redeployed at the same
The day after this event occurred, we have started noticing that an orderer goes offline and immediately comes online after a few seconds. This disconnection/connection occurs regularly after a gap of 12 hours and 10 minutes.
We have noticed two things
In the log we get
- [orderer/consensus/kafka] startThread -> CRIT 24df#033[0m [channel:
testchainid] Cannot set up channel consumer = kafka server: The
requested offset is outside the range of offsets maintained by the
server for the given topic/partition.
- panic: [channel: testchainid] Cannot set up channel consumer = kafka
server: The requested offset is outside the range of offsets
maintained by the server for the given topic/partition.
- goroutine 52 [running]:
- github.com/hyperledger/fabric/vendor/github.com/op/go-logging.(*Logger).Panicf(0xc4202748a0,
0x108dede, 0x31, 0xc420327540, 0x2, 0x2)
- /w/workspace/fabric-binaries-x86_64/gopath/src/github.com/hyperledger/fabric/vendor/github.com/op/go-logging/logger.go:194
+0x134
- github.com/hyperledger/fabric/orderer/consensus/kafka.startThread(0xc42022cdc0)
- /w/workspace/fabric-binaries-x86_64/gopath/src/github.com/hyperledger/fabric/orderer/consensus/kafka/chain.go:261
+0xb33
- created by
github.com/hyperledger/fabric/orderer/consensus/kafka.(*chainImpl).Start
- /w/workspace/fabric-binaries-x86_64/gopath/src/github.com/hyperledger/fabric/orderer/consensus/kafka/chain.go:126
+0x3f
Another thing which we noticed is that, in logs prior to the VM failure event there were 3 kafka brokers but we can see only 2 kafka brokers in the logs after this event.
Can someone guide me on this? How do I resolve this problem?
Additional information - We have been through the Kafka logs of the day after which the VM was redeployed and we noticed the following
org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = 1195725856 larger than 104857600)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:132)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:93)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:231)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:192)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:528)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:469)
at org.apache.kafka.common.network.Selector.poll(Selector.java:398)
at kafka.network.Processor.poll(SocketServer.scala:535)
at kafka.network.Processor.run(SocketServer.scala:452)
at java.lang.Thread.run(Thread.java:748)
It seems that we have a solution but it needs to be validated. Once the solution is validated, I will post it on this site.

Can Akka Cluster Client Send Messages to Cluster Nodes Not in Initial Contacts?

Using Akka 2.3.14, I'm trying to create an Akka cluster of various services. Until now, I have had all my "services" in one artifact that was clustered across multiple nodes, but now I am trying to break this artifact into multiple services that all exist on the same cluster.
So in breaking this up, we've designed it so that any node on the cluster will first try to connect to the seed nodes. If there is no seed node, it will look to see if it is a candidate to run as a seed node (if it's on the same host that a seed node can be on) in which case it will grab the an open seed node port and become a seed node. So in this sense, any service in the cluster can become the seed node.
At least, that was the idea. Our API into this system running as a separate service implements a ClusterClient into this system. The initialContacts are set to be the same as the seed nodes. The problem is that the only receptionist actors I can send a message to through the ClusterClient are the actors on the seed nodes.
Here is an example if it helps. Let's say I have a String Service and a Double Service, and the receptionist for each service is a StringActor and a DoubleActor respectively. Now lets say I have a Client Service which sends StringMessages and DoubleMessages to the StringActor and DoubleActor
So for simplicity, let's say I have two nodes, server1 and server2 then:
seed-nodes = ["akka.tcp://system#server1:2773", "akka.tcp://system#server2:2773"]
My ClusterClient would be initialize like so:
system.actorOf(
ClusterClient.props(
Set(
system.actorSelection("akka.tcp://system#server1:2773/user/receptionist"),
system.actorSelection("akka.tcp://system#server2:2773/user/receptionist")
)
),
"clusterClient"
)
Here are the scenarios that are happening for me:
If the StringServices start up on both servers first, then DoubleMessages from the Client Service just disappear into the ether.
If the DoubleServices start up on both servers first, then StringMessages from the Client Service just disappear into the ether.
If the StringService starts up first on serverX and the DoubleService starts up first on serverY, then all StringMessages will be sent to serverX and all DoubleMessages will be sent to serverY, which is not as bad as the above case, but it means it's not really scaling.
This isn't what I expected, it's possible it's just a defect in my code, so I would like to know if this IS expected behavior or not. And if not, then is there another Akka concept that could help me with this?
Arguably, I could just make one service type my entry point, like a RoutingService that could accept StringMessages or DoubleMessages, and then send that to the correct service. But if the Client Service can only send messages to the RoutingService instances that are in the initial contacts, then I can't dynamically scale the RoutingService because no matter how many nodes I add the Client Service can only send to the initial contacts.
I'm also thinking about subscribing to ClusterEvents in my Client Service and seeing if I can add and remove initial contacts from my cluster client as nodes are started up in the cluster, but I'm not sure if this is possible, and it feels like there should be a better solution.
This is what I found out upon more troubleshooting, in case it helps anyone else:
The ClusterClient will attempt to connect to the initial contacts in order, and then only sends it's messages across that connection. If you are deploying different services on each node, you will have problems as the messages sent from the ClusterClient will only be sent to the node that it makes its connection to. In this way, you can think of the ClusterClient a legitimate client, it will connect to a URL that you give it, and then continue to communicate with the server through that URL.
Reading the Distributed Workers example, I realized that my Frontend, or in this case my routing service, should actually be part of the cluster, rather than acting as a client. For this I used the DistributedPubSub method instead.