How to set up an Akka cluster on multiple machines? - akka

I have looked into the official Akka documentation and I am confused. I followed this link and used the same application.conf shown there, changing the seed nodes to my other machines' IPs.
akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"
  remote.netty.tcp.port = 0
  remote.netty.tcp.hostname = 127.0.0.1
  cluster {
    seed-nodes = [
      "akka.tcp://ClusterSystem@slave01:2551",
      "akka.tcp://ClusterSystem@slave02:2552"]
    auto-down-unreachable-after = 10s
  }
  extensions = ["akka.cluster.client.ClusterClientReceptionist"]
  persistence {
    journal.plugin = "akka.persistence.journal.leveldb-shared"
    journal.leveldb-shared.store {
      # DO NOT USE 'native = off' IN PRODUCTION !!!
      native = off
      dir = "target/shared-journal"
    }
    snapshot-store.plugin = "akka.persistence.snapshot-store.local"
    snapshot-store.local.dir = "target/snapshots"
  }
}
The problem is that the nodes are reported as unreachable and the connection is refused. Any suggestions?

Have you made sure the connection isn't blocked by a firewall on the two hosts? I would first check whether both slave01 and slave02 are remotely reachable using telnet on their corresponding ports (e.g. telnet slave02 2552). And in case slave01 and slave02 are hostnames or FQDNs, they need to be mapped to the corresponding IP addresses in /etc/hosts or DNS accordingly.
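If telnet is not available, the same check can be scripted. Below is a minimal Java sketch (the hostnames and ports are the ones from the question; adjust as needed) that verifies both DNS resolution and TCP reachability of each seed node:

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class SeedNodeCheck {
    public static void main(String[] args) {
        String[][] seeds = { { "slave01", "2551" }, { "slave02", "2552" } };
        for (String[] seed : seeds) {
            try {
                // Resolve the hostname first; a failure here points at /etc/hosts or DNS.
                InetAddress addr = InetAddress.getByName(seed[0]);
                System.out.println(seed[0] + " resolves to " + addr.getHostAddress());
                // Then attempt a plain TCP connect with a 5 s timeout; a failure here
                // points at a firewall, or at the server binding to loopback only.
                try (Socket socket = new Socket()) {
                    socket.connect(new InetSocketAddress(addr, Integer.parseInt(seed[1])), 5000);
                    System.out.println(seed[0] + ":" + seed[1] + " is reachable");
                }
            } catch (Exception e) {
                System.out.println(seed[0] + " check failed: " + e);
            }
        }
    }
}

Note that with remote.netty.tcp.hostname=127.0.0.1 in the configuration above, each node binds to loopback only, so remote connections would be refused even with the firewall open; the hostname should be the machine's externally reachable address.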

Related

Connection to AWS MemoryDB cluster sometimes fails

We have an application that uses AWS MemoryDB for Redis. We have set up a cluster with one shard and two nodes. One of the nodes (named 0001-001) is the primary read/write node while the other (named 0001-002) is a read replica.
After deploying the application, connecting to MemoryDB sometimes fails when we use the cluster endpoint connection string. If we restart the application a few times it suddenly starts working. Whether it succeeds seems to be random. The error we get is the following:
Endpoint Unspecified/ourapp-memorydb-cluster-0001-001.ourapp-memorydb-cluster.xxxxx.memorydb.eu-west-1.amazonaws.com:6379 serving hashslot 6024 is not reachable at this point of time. Please check connectTimeout value. If it is low, try increasing it to give the ConnectionMultiplexer a chance to recover from the network disconnect. IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=0,Free=32767,Min=2,Max=32767), Local-CPU: n/a
If we connect directly to the primary read/write node we get no such errors.
If we connect directly to the read replica it always fails. It even gets the error above, complaining about the "0001-001" node.
We use .NET Core 6
We use Microsoft.Extensions.Caching.StackExchangeRedis 6.0.4 which depends on StackExchange.Redis 2.2.4
The application is hosted in AWS ECS
StackExchangeRedisCache is added to the service collection in a startup file:
services.AddStackExchangeRedisCache(o =>
{
    o.InstanceName = redisConfiguration.Instance;
    o.ConfigurationOptions = ToRedisConfigurationOptions(redisConfiguration);
});
...where ToRedisConfigurationOptions returns a basic ConfigurationOptions object:
new ConfigurationOptions()
{
    EndPoints =
    {
        { "clustercfg.ourapp-memorydb-cluster.xxxxx.memorydb.eu-west-1.amazonaws.com", 6379 } // Cluster endpoint
    },
    User = "username",
    Password = "password",
    Ssl = true,
    AbortOnConnectFail = false,
    ConnectTimeout = 60000
};
We tried multiple shards with multiple nodes, and connecting to the cluster still sometimes fails. We even tried updating the StackExchange.Redis dependency to 2.5.43, but no luck.
We could "solve" it by connecting directly to the primary node, but if a failover occurs and 0001-002 becomes the primary node we would have to change our connection string manually, which is not acceptable in a production environment.
Any help or advice is appreciated, thanks!

How Do I Add an External IP Address for an IPFS Client

I created an IPFS instance on Google Compute Engine.
When I start the daemon with
ipfs daemon
the local (private) IP address for the box, 10.128.0.4, is listed, but when I try to connect to the external IP (134.123.143.185) from outside Google I can't.
How do I add the external IP to the list of acceptable IP addresses to connect with?
try {
    const client = createClient(new URL('http://134.123.143.185:5001'))
    // call Core API methods
    const { cid } = await client.add('Hello world!')
    console.log(cid);
} catch (error) {
    console.log(error);
}
The IPFS server is probably set to listen only on 127.0.0.1, port 5001.
If you run
sudo ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001
and restart the daemon, the API will listen on all interfaces. This is probably horribly insecure and not the right thing to do; hopefully someone will correct this answer.
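Once the API is bound to 0.0.0.0 and the GCE firewall allows port 5001, you can probe it from outside. A minimal Java sketch, assuming the standard IPFS HTTP RPC API (recent versions require POST, and /api/v0/id returns the node identity; treat the endpoint details as assumptions for your version):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class IpfsApiProbe {
    public static void main(String[] args) throws Exception {
        // POST /api/v0/id asks the node for its identity; any 200 response
        // means the API is reachable from this machine.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://134.123.143.185:5001/api/v0/id"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}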

An issue with an AWS EC2 instance: WebSocket connection failed: Error in connection establishment: net::ERR_CONNECTION_TIMED_OUT

When I ran the chat app from localhost, connected to a MySQL database and coded in PHP using WebSockets, it was successful.
Also, when I ran it from the PuTTY terminal, logged in with the SSH credentials, it displayed "Server Started" with port# 8080.
ubuntu@ec3-193-123-96:/home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/server$ php websocket_server.php
PHP Fatal error: Uncaught React\Socket\ConnectionException: Could not bind to tcp://0.0.0.0:8080: Address already in use in /home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/vendor/react/socket/src/Server.php:29
Stack trace:
#0 /home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/vendor/cboden/ratchet/src/Ratchet/Server/IoServer.php(70): React\Socket\Server->listen(8080, '0.0.0.0')
#1 /home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/server/websocket_server.php(121): Ratchet\Server\IoServer::factory(Object(Ratchet\Http\HttpServer), 8080)
#2 {main}
thrown in /home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/vendor/react/socket/src/Server.php on line 29
ubuntu@ec3-193-123-96:/home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/server$
So I tried changing port# 8080 to port# 8282, and it was successful:
ubuntu@ec3-193-123-96:/home/admin/web/ec3-193-123-96.eu-central-1.compute.amazonaws.com/public_html/application/libraries/server$ php websocket_server.php
Keeping the shell script running, open a couple of web browser windows, and open a JavaScript console or a page with the following JavaScript:
var conn = new WebSocket('ws://0.0.0.0:8282');
conn.onopen = function(e) {
    console.log("Connection established!");
};
conn.onmessage = function(e) {
    console.log(e.data);
};
From the browser console results:
WebSocket connection to 'ws://5.160.195.94:8282/' failed: Error in connection establishment: net::ERR_CONNECTION_TIMED_OUT
websocket_server.php
<?php
use Ratchet\Server\IoServer;
use Ratchet\Http\HttpServer;
use Ratchet\WebSocket\WsServer;
use MyApp\Chat;

require dirname(__DIR__) . '/vendor/autoload.php';

$server = IoServer::factory(
    new HttpServer(
        new WsServer(
            new Chat()
        )
    ),
    8282
);
$server->run();
I even tried assigning the public IP and the private IP, but to no avail; it produced the same result.
These are the composer files generated after running $ composer require cboden/ratchet and adding the src folder.
composer.json (on the Amazon Web Server)
{
    "autoload": {
        "psr-4": {
            "MyApp\\": "src"
        }
    },
    "require": {
        "cboden/ratchet": "^0.4.1"
    }
}
composer.json (on localhost)
{
    "autoload": {
        "psr-4": {
            "MyApp\\": "src"
        }
    },
    "require": {
        "cboden/ratchet": "^0.4.3"
    }
}
How am I supposed to resolve this so that I can connect over the WebSocket, especially from the hosted server with a domain name such as
http://ec3-193-123-96.eu-central-1.compute.amazonaws.com/
var conn = new WebSocket('ws://localhost:8282');
From the Security Group: [screenshots of the Inbound and Outbound rule tabs omitted]
When it comes to a connectivity issue with an EC2 instance there are a few things you need to check to find the root cause.
SSH into the EC2 instance that the application is running on and make sure you can access it from within the instance. If that works, it is a network-related issue that we need to solve.
If step 1 was successful, you have now identified a network issue. To solve it, check the following.
Check that an Internet Gateway is created and attached to your VPC.
Next, check that your subnet's routing table has its default route pointing to the Internet Gateway. Check this link to complete this and the above step.
Check your subnet's Network ACL rules to see that the ports are not blocked.
Finally, you would want to check your instance's security group, as you have shown.
If you need access via the EC2 DNS name you will need to provision your EC2 instance in a public subnet and assign an Elastic IP.
If the issue still exists, check whether the EC2 status checks pass, or try provisioning a new instance. Once the network path is open, you can verify the WebSocket port from outside with the sketch below.
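For that outside check, here is a minimal WebSocket client sketch using Java 11's built-in java.net.http.WebSocket (the hostname is the EC2 DNS name from the question; this only verifies that the connection opens and messages arrive):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class WsProbe {
    public static void main(String[] args) throws Exception {
        WebSocket ws = HttpClient.newHttpClient().newWebSocketBuilder()
                // Use the instance's public DNS name, not localhost or 0.0.0.0.
                .buildAsync(
                        URI.create("ws://ec3-193-123-96.eu-central-1.compute.amazonaws.com:8282"),
                        new WebSocket.Listener() {
                            @Override
                            public void onOpen(WebSocket webSocket) {
                                System.out.println("Connection established!");
                                WebSocket.Listener.super.onOpen(webSocket);
                            }

                            @Override
                            public CompletionStage<?> onText(WebSocket webSocket,
                                    CharSequence data, boolean last) {
                                System.out.println(data);
                                return WebSocket.Listener.super.onText(webSocket, data, last);
                            }
                        })
                .join();
        Thread.sleep(5000); // keep the JVM alive long enough to see incoming messages
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "done").join();
    }
}

If this times out while the server is running, the block is in the network path (security group, NACL, or routing) rather than in the application.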

Hazelcast cluster over AWS using Docker

Hi, I am trying to configure a Hazelcast cluster on AWS.
I am running Hazelcast in a Docker container and using --net=host to use the host network configuration.
When I look at the Hazelcast logs, I see:
[172.17.0.1]:5701 [herald] [3.8] Established socket connection between /[node2]:5701 and /[node1]:47357
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-out-0] DEBUG c.h.n.t.SocketWriterInitializerImpl - [172.17.0.1]:5701 [herald] [3.8] Initializing SocketWriter WriteHandler with Cluster Protocol
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-in-0] WARN c.h.nio.tcp.TcpIpConnectionManager - [172.17.0.1]:5701 [herald] [3.8] Wrong bind request from [172.17.0.1]:5701! This node is not requested endpoint: [node2]:5701
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-in-0] INFO c.hazelcast.nio.tcp.TcpIpConnection - [172.17.0.1]:5701 [herald] [3.8] Connection[id=40, /[node2]:5701->/[node1]:47357, endpoint=null, alive=false, type=MEMBER] closed. Reason: Wrong bind request from [172.17.0.1]:5701! This node is not requested endpoint: [node2]:5701
I can see an error saying the bind request is coming from 172.17.0.1 to node1, and node1 is not accepting the request.
final Config config = new Config();
config.setGroupConfig(clientConfig().getGroupConfig());
final NetworkConfig networkConfig = new NetworkConfig();
final JoinConfig joinConfig = new JoinConfig();
final TcpIpConfig tcpIpConfig = new TcpIpConfig();
final MulticastConfig multicastConfig = new MulticastConfig();
multicastConfig.setEnabled(false);
final AwsConfig awsConfig = new AwsConfig();
awsConfig.setEnabled(true);
// awsConfig.setSecurityGroupName("xxxx");
awsConfig.setRegion("xxxx");
awsConfig.setIamRole("xxxx");
awsConfig.setTagKey("type");
awsConfig.setTagValue("xxxx");
awsConfig.setConnectionTimeoutSeconds(120);
joinConfig.setAwsConfig(awsConfig);
joinConfig.setMulticastConfig(multicastConfig);
joinConfig.setTcpIpConfig(tcpIpConfig);
networkConfig.setJoin(joinConfig);
final InterfacesConfig interfaceConfig = networkConfig.getInterfaces();
interfaceConfig.setEnabled(true).addInterface("172.29.238.71");
config.setNetworkConfig(networkConfig);
Above is the code that configures the AwsConfig.
Please help me resolve this issue.
Thanks
You are experiencing an issue (#11795) in the default Hazelcast bind address selection mechanism.
There are several workarounds available:
Workaround 1: System property
You can set the bind address by providing the correct IP address as the hazelcast.local.localAddress system property:
java -Dhazelcast.local.localAddress=[yourCorrectIpGoesHere]
or
System.setProperty("hazelcast.local.localAddress", "[yourCorrectIpGoesHere]")
Read details in System properties chapter of Hazelcast Reference Manual.
Workaround 2: Hazelcast Network configuration
Hazelcast Network configuration allows you to specify which IP addresses can be used to bind the server.
Declarative in hazelcast.xml:
<hazelcast>
  ...
  <network>
    ...
    <interfaces enabled="true">
      <interface>10.3.16.*</interface>
      <interface>10.3.10.4-18</interface>
      <interface>192.168.1.3</interface>
    </interfaces>
  </network>
  ...
</hazelcast>
Programmatic:
Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
InterfacesConfig interfaceConfig = network.getInterfaces();
interfaceConfig.setEnabled(true).addInterface("192.168.1.3");
HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
Read details in Interfaces section of Hazelcast Reference Manual.
Update:
With the earlier steps you are able to set a proper bind address, for instance the local one returned by ip addr show. Nevertheless, it could be insufficient if you run Hazelcast in an environment where the local IP and the public IP differ (clouds, Docker).
Next Step: Configure public address
This step is necessary in environments where cluster nodes don't see each other under the other node's reported local address. You have to set the public address: the one that other nodes are able to reach (optionally with the port specified).
networkConfig.setPublicAddress("172.29.238.71");
// or, if a non-default Hazelcast port is used, e.g. 9991:
networkConfig.setPublicAddress("172.29.238.71:9991");
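Putting the pieces together for the Docker-on-AWS setup from the question, a minimal sketch (using the host IP quoted above; adjust to your environment) that sets both the bind interface and the public address:

import com.hazelcast.config.Config;
import com.hazelcast.config.NetworkConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class HazelcastAddressExample {
    public static void main(String[] args) {
        Config config = new Config();
        NetworkConfig network = config.getNetworkConfig();
        // Bind to the host interface instead of Docker's 172.17.0.1.
        network.getInterfaces().setEnabled(true).addInterface("172.29.238.71");
        // Advertise the address that other members can actually reach.
        network.setPublicAddress("172.29.238.71");
        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        System.out.println("Members: " + instance.getCluster().getMembers());
    }
}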

akka cluster gives false alarms in reporting nodes unreachable

I have a cluster event listener running on each node that sends an email to notify me when a node becomes unreachable, and I have noticed two strange things:
most of the time, an unreachable event is followed by a reachable-again event
when an unreachable event occurs and I query the state of the cluster, it shows that all nodes are still UP
Here is my conf:
akka {
  loglevel = INFO
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  jvm-exit-on-fatal-error = on
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    // will be overwritten at runtime
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "127.0.0.1"
      port = 9989
    }
  }
  cluster {
    failure-detector {
      threshold = 12.0
      acceptable-heartbeat-pause = 10 s
    }
    use-dispatcher = cluster-dispatcher
  }
}
// relieve the unreachable report rate
cluster-dispatcher {
  type = "Dispatcher"
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 4
    parallelism-max = 8
  }
}
Please read the cluster membership lifecycle section in the documentation: http://doc.akka.io/docs/akka/2.4.0/common/cluster.html#Membership_Lifecycle
Unreachability is temporary; it indicates that there were no heartbeats from the remote node for a while. It is reverted once heartbeats arrive again. This is useful for rerouting data away from overloaded nodes or for compensating for smaller, intermittent networking issues. Please note that a cluster member does not go from unreachable to DOWN automatically unless configured to do so: http://doc.akka.io/docs/akka/2.4.0/scala/cluster-usage.html#Automatic_vs__Manual_Downing
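For reference, here is a minimal sketch of such a listener with the Akka 2.4 Java API (the class name is made up); it subscribes to both transitions, so the reachable-again events described above become visible:

import akka.actor.AbstractActor;
import akka.cluster.Cluster;
import akka.cluster.ClusterEvent;
import akka.cluster.ClusterEvent.ReachableMember;
import akka.cluster.ClusterEvent.UnreachableMember;
import akka.japi.pf.ReceiveBuilder;

public class ReachabilityListener extends AbstractActor {
    private final Cluster cluster = Cluster.get(context().system());

    public ReachabilityListener() {
        receive(ReceiveBuilder
                // Heartbeats were missed; the member is typically still UP.
                .match(UnreachableMember.class,
                        e -> System.out.println("Unreachable: " + e.member()))
                // Heartbeats resumed; pairs with the event above.
                .match(ReachableMember.class,
                        e -> System.out.println("Reachable again: " + e.member()))
                .build());
    }

    @Override
    public void preStart() {
        cluster.subscribe(self(), ClusterEvent.initialStateAsEvents(),
                UnreachableMember.class, ReachableMember.class);
    }

    @Override
    public void postStop() {
        cluster.unsubscribe(self());
    }
}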
The reason why DOWNing is manual and not automatic by default is the risk of split-brain scenarios and their consequences, for example when Cluster Singletons are used (which won't be singletons after the cluster falls into two parts because of a broken network cable). For more options for automatically resolving such cases there is the SBR (Split Brain Resolver) in the commercial version of Akka: http://doc.akka.io/docs/akka/rp-15v09p01/scala/split-brain-resolver.html
Also, DOWNing is permanent: a node, once marked as DOWN, is forever banished from the surviving part of the cluster, i.e. even if it turns out to be alive in the future it won't be allowed back again (see Fencing and STONITH for an explanation: https://en.wikipedia.org/wiki/STONITH or http://advogato.org/person/lmb/diary/105.html).