I created three AWS EC2 servers with RedHat 6 and used this tutorial to deploy Storm.
After creating the ZooKeeper and Nimbus instances I could manually start ZooKeeper and the nimbus/ui daemons. nimbus:8080 showed me an empty topology.
The third server was configured as a single supervisor/slave node, and I saw it in the UI.
After that I added the supervisord option and changed some EC2 firewall options (unfortunately at the same time).
Now when I start ZooKeeper, Nimbus and the UI (with or without supervisord) and look at the UI, I get this error:
java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection timed out
I have already tried fiddling with the AWS firewall configs, but nothing changes anything, not even opening some ports to all IP addresses.
I used this readme to get the right settings.
The logs are all pretty empty. ZooKeeper seems to accept connections:
2015-12-04 21:47:04,151 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /52.34.142.187:53935
2015-12-04 21:47:04,164 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#898] - Client attempting to establish new session at /52.34.142.187:53935
2015-12-04 21:47:04,165 [myid:] - INFO [SyncThread:0:FileTxnLog#199] - Creating new log file: log.195
2015-12-04 21:47:04,174 [myid:] - INFO [SyncThread:0:ZooKeeperServer#643] - Established session 0x15170091b370000 with negotiated timeout 20000 for client /52.34.142.187:53935
2015-12-04 21:47:04,177 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#489] - Processed session termination for sessionid: 0x15170091b370000
2015-12-04 21:47:04,179 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /52.34.142.187:53935 which had sessionid 0x15170091b370000
2015-12-04 21:47:04,181 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /52.34.142.187:53936
2015-12-04 21:47:04,183 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#898] - Client attempting to establish new session at /52.34.142.187:53936
2015-12-04 21:47:04,187 [myid:] - INFO [SyncThread:0:ZooKeeperServer#643] - Established session 0x15170091b370001 with negotiated timeout 20000 for client /52.34.142.187:53936
2015-12-04 21:47:04,193 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#489] - Processed session termination for sessionid: 0x15170091b370001
2015-12-04 21:47:04,194 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /52.34.142.187:53936 which had sessionid 0x15170091b370001
2015-12-04 21:47:04,201 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /52.34.142.187:53937
2015-12-04 21:47:04,203 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#898] - Client attempting to establish new session at /52.34.142.187:53937
2015-12-04 21:47:04,204 [myid:] - INFO [SyncThread:0:ZooKeeperServer#643] - Established session 0x15170091b370002 with negotiated timeout 20000 for client /52.34.142.187:53937
2015-12-04 21:47:54,973 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /52.33.187.63:58714
2015-12-04 21:47:55,034 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#898] - Client attempting to establish new session at /52.33.187.63:58714
2015-12-04 21:47:55,035 [myid:] - INFO [SyncThread:0:ZooKeeperServer#643] - Established session 0x15170091b370003 with negotiated timeout 20000 for client /52.33.187.63:58714
2015-12-04 21:47:56,056 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#489] - Processed session termination for sessionid: 0x15170091b370003
2015-12-04 21:47:56,058 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /52.33.187.63:58714 which had sessionid 0x15170091b370003
2015-12-04 21:47:56,063 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /52.33.187.63:58715
2015-12-04 21:47:56,065 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#898] - Client attempting to establish new session at /52.33.187.63:58715
2015-12-04 21:47:56,066 [myid:] - INFO [SyncThread:0:ZooKeeperServer#643] - Established session 0x15170091b370004 with negotiated timeout 20000 for client /52.33.187.63:58715
2015-12-04 21:49:12,000 [myid:] - INFO [SessionTracker:ZooKeeperServer#353] - Expiring session 0x15170091b370002, timeout of 20000ms exceeded
2015-12-04 21:49:12,000 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#489] - Processed session termination for sessionid: 0x15170091b370002
2015-12-04 21:49:12,002 [myid:] - INFO [SyncThread:0:NIOServerCnxn#1008] - Closed socket connection for client /52.34.142.187:53937 which had sessionid 0x15170091b370002
2015-12-04 21:49:14,001 [myid:] - INFO [SessionTracker:ZooKeeperServer#353] - Expiring session 0x15170091b370004, timeout of 20000ms exceeded
2015-12-04 21:49:14,001 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#489] - Processed session termination for sessionid: 0x15170091b370004
2015-12-04 21:49:14,002 [myid:] - INFO [SyncThread:0:NIOServerCnxn#1008] - Closed socket connection for client /52.33.187.63:58715 which had sessionid 0x15170091b370004
nimbus.log:
2015-12-04 22:11:07.531 o.a.s.s.o.a.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 20082ms for sessionid 0x0, closing socket connection and attempting reconnect
2015-12-04 22:11:08.632 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:11:18.481 o.a.s.s.o.a.c.ConnectionState [WARN] Connection attempt unsuccessful after 31043 (greater than max timeout of 20000). Resetting connection and trying again with a new connection.
2015-12-04 22:11:27.743 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x0 closed
2015-12-04 22:11:27.743 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#185100a6
2015-12-04 22:11:27.747 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2015-12-04 22:11:27.750 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:11:47.753 o.a.s.s.o.a.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 20006ms for sessionid 0x0, closing socket connection and attempting reconnect
2015-12-04 22:11:48.854 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:12:03.860 o.a.s.s.o.a.c.ConnectionState [WARN] Connection attempt unsuccessful after 45379 (greater than max timeout of 20000). Resetting connection and trying again with a new connection.
2015-12-04 22:12:07.958 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x0 closed
2015-12-04 22:12:07.959 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#185100a6
2015-12-04 22:12:07.959 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2015-12-04 22:12:07.960 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:12:27.965 o.a.s.s.o.a.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 20006ms for sessionid 0x0, closing socket connection and attempting reconnect
2015-12-04 22:12:29.066 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:12:44.074 o.a.s.s.o.a.c.ConnectionState [WARN] Connection attempt unsuccessful after 40213 (greater than max timeout of 20000). Resetting connection and trying again with a new connection.
2015-12-04 22:12:48.169 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x0 closed
2015-12-04 22:12:48.169 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#185100a6
2015-12-04 22:12:48.170 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2015-12-04 22:12:48.171 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:13:03.172 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181, initiating session
2015-12-04 22:13:03.177 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181, sessionid = 0x15170091b370005, negotiated timeout = 20000
2015-12-04 22:13:03.179 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-12-04 22:13:03.180 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
2015-12-04 22:13:03.186 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x15170091b370005 closed
2015-12-04 22:13:03.186 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2015-12-04 22:13:03.187 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [5]
2015-12-04 22:13:03.188 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2015-12-04 22:13:03.190 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#6c4fb026
2015-12-04 22:13:03.191 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-04 22:13:03.193 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181, initiating session
2015-12-04 22:13:03.195 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server ec2-52-34-113-54.us-west-2.compute.amazonaws.com/52.34.113.54:2181, sessionid = 0x15170091b370006, negotiated timeout = 20000
2015-12-04 22:13:03.195 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-12-04 22:13:03.233 b.s.d.nimbus [INFO] Starting Nimbus server...
Any ideas?
I found out: my hosts file was the same across machines, so the nimbus hostname resolved to the machine's external IP. Nimbus then tried to connect through that external IP even though the service was running on localhost. Either the firewall needs to be configured to allow that, or the hosts file has to be adjusted to point at the local address.
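A small sketch like this can show which address a hosts file maps a hostname to, which is how the misconfiguration shows up (the file and addresses below are fabricated for illustration; on a real box you would point it at /etc/hosts and the hostname from your storm.yaml/connectString):

```shell
# Print the IP a hosts-format file maps a hostname to (first match wins).
hosts_ip() {
  awk -v h="$2" '$0 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }' "$1"
}

# Demo with a throwaway file; 52.34.113.54 plays the "external IP" role here.
cat > /tmp/hosts.demo <<'EOF'
127.0.0.1     localhost
52.34.113.54  zkserver1
EOF
hosts_ip /tmp/hosts.demo zkserver1   # prints 52.34.113.54, the broken setup
```

If the hostname resolves to the public IP, the connection leaves the host and is subject to the EC2 security-group rules even when both processes run on the same machine.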
I have installed MySQL in a VM and wanted my EKS cluster, with Istio 1.9 installed, to talk to it. I am following https://istio.io/latest/docs/setup/install/virtual-machine/, but when I do the step that generates the hosts file, the file that gets generated is empty.
I tried with this empty hosts file anyway, but when starting the VM workload with this command I get errors:
> sudo systemctl start istio
When I tailed this file:
/var/log/istio/istio.log
2021-03-22T18:44:02.332421Z info Proxy role ips=[10.8.1.179 fe80::dc:36ff:fed3:9eea] type=sidecar id=ip-10-8-1-179.vm domain=vm.svc.cluster.local
2021-03-22T18:44:02.332429Z info JWT policy is third-party-jwt
2021-03-22T18:44:02.332438Z info Pilot SAN: [istiod.istio-system.svc]
2021-03-22T18:44:02.332443Z info CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2021-03-22T18:44:02.332997Z info Using CA istiod.istio-system.svc:15012 cert with certs: /etc/certs/root-cert.pem
2021-03-22T18:44:02.333093Z info citadelclient Citadel client using custom root cert: istiod.istio-system.svc:15012
2021-03-22T18:44:02.410934Z info ads All caches have been synced up in 82.7974ms, marking server ready
2021-03-22T18:44:02.411247Z info sds SDS server for workload certificates started, listening on "./etc/istio/proxy/SDS"
2021-03-22T18:44:02.424855Z info sds Start SDS grpc server
2021-03-22T18:44:02.425044Z info xdsproxy Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2021-03-22T18:44:02.425341Z info Starting proxy agent
2021-03-22T18:44:02.425483Z info dns Starting local udp DNS server at localhost:15053
2021-03-22T18:44:02.427627Z info dns Starting local tcp DNS server at localhost:15053
2021-03-22T18:44:02.427683Z info Opening status port 15020
2021-03-22T18:44:02.432407Z info Received new config, creating new Envoy epoch 0
2021-03-22T18:44:02.433999Z info Epoch 0 starting
2021-03-22T18:44:02.690764Z warn ca ca request failed, starting attempt 1 in 91.93939ms
2021-03-22T18:44:02.693579Z info Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-proxy --service-node sidecar~10.8.1.179~ip-10-8-1-179.vm~vm.svc.cluster.local --local-address-ip-version v4 --bootstrap-version 3 --log-format %Y-%m-%dT%T.%fZ %l envoy %n %v -l warning --component-log-level misc:error --concurrency 2]
2021-03-22T18:44:02.782817Z warn ca ca request failed, starting attempt 2 in 195.226287ms
2021-03-22T18:44:02.978344Z warn ca ca request failed, starting attempt 3 in 414.326774ms
2021-03-22T18:44:03.392946Z warn ca ca request failed, starting attempt 4 in 857.998629ms
2021-03-22T18:44:04.251227Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 10.8.0.2:53: no such host"
2021-03-22T18:44:04.849207Z warn ca ca request failed, starting attempt 1 in 91.182413ms
2021-03-22T18:44:04.940652Z warn ca ca request failed, starting attempt 2 in 207.680983ms
2021-03-22T18:44:05.148598Z warn ca ca request failed, starting attempt 3 in 384.121814ms
2021-03-22T18:44:05.533019Z warn ca ca request failed, starting attempt 4 in 787.704352ms
2021-03-22T18:44:06.321042Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 10.8.0.2:53: no such host"
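The repeated `lookup istiod.istio-system.svc on 10.8.0.2:53: no such host` lines suggest the VM has no way to resolve the istiod address, which is exactly what the generated hosts file was supposed to provide. As a hedged sketch (the gateway IP below is a placeholder, not a value from the tutorial output), the missing entry can be constructed manually:

```shell
# The VM install guide resolves istiod.istio-system.svc through /etc/hosts.
# GATEWAY_IP is an assumption here: substitute the external IP of your
# cluster's east-west gateway service.
GATEWAY_IP=203.0.113.10
printf '%s istiod.istio-system.svc\n' "$GATEWAY_IP" > /tmp/hosts.addendum
cat /tmp/hosts.addendum   # append this line to /etc/hosts on the VM (needs sudo)
```

Once the name resolves, the CA requests in the log above should be able to reach istiod on port 15012.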
Two computers, hz203 and hz204; a JobManager and a TaskManager run on each.
masters
hz203:9081
hz204:9081
slaves
hz203
hz204
flink-conf.yaml
jobmanager.rpc.port: 6123
rest.port: 9081
blob.server.port: 6124
query.server.port: 6125
web.tmpdir: /home/ctu/flink/deploy/webTmp
web.log.path: /home/ctu/flink/deploy/log
taskmanager.tmp.dirs: /home/ctu/flink/deploy/taskManagerTmp
high-availability: zookeeper
high-availability.storageDir: file:///home/ctu/flink/deploy/HA
high-availability.zookeeper.quorum: 10.0.1.79:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /flink
run ./start-cluster.sh
Starting HA cluster with 2 masters.
Starting standalonesession daemon on host hz203.
Starting standalonesession daemon on host hz204.
Starting taskexecutor daemon on host hz203.
Starting taskexecutor daemon on host hz204.
logs
2018-12-20 20:44:03,843 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/rest_server_lock'}.
2018-12-20 20:44:03,864 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend listening at http://127.0.0.1:9081.
2018-12-20 20:44:03,875 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2018-12-20 20:44:03,989 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
2018-12-20 20:44:03,999 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/resource_manager_lock'}.
2018-12-20 20:44:04,008 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2018-12-20 20:44:04,009 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/dispatcher_lock'}.
2018-12-20 20:44:04,010 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.
2018-12-20 20:44:04,206 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:43012
2018-12-20 20:44:04,221 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#127.0.0.1:43012] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#127.0.0.1:43012]] Caused by: [Connection refused: /127.0.0.1:43012]
2018-12-20 20:44:04,301 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:43012
2018-12-20 20:44:04,301 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#127.0.0.1:43012] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#127.0.0.1:43012]] Caused by: [Connection refused: /127.0.0.1:43012]
2018-12-20 20:44:04,378 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:43012
2018-12-20 20:44:04,378 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#127.0.0.1:43012] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#127.0.0.1:43012]] Caused by: [Connection refused: /127.0.0.1:43012]
2018-12-20 20:44:04,451 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:43012
2018-12-20 20:44:04,451 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#127.0.0.1:43012] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#127.0.0.1:43012]] Caused by: [Connection refused: /127.0.0.1:43012]
2018-12-20 20:44:04,520 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:43012
Questions
`akka.tcp://flink#127.0.0.1:33567/user/resourcemanager` --- why 127.0.0.1 instead of the jobmanager IP from the `masters` config file?
The problem is a bug we fixed in version 1.6.1: in 1.6.0 the --host command line option was not respected in ClusterEntrypoint#loadConfiguration, as you can see here compared to the code of version 1.6.1.
Thus, upgrading to the latest 1.6.x version should fix the problem. In general, I would always recommend upgrading to the latest bug-fix version of a release if possible.
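For anyone still on 1.6.0, a quick way to confirm you are hitting this bug is to check which address the REST endpoint reports binding to in the standalonesession log. A minimal sketch (the log path and sample line below are fabricated for illustration):

```shell
# Extract the host part of the "Web frontend listening at http://HOST:PORT." line.
listen_addr() {
  sed -n 's#.*listening at http://\([^:]*\):.*#\1#p' "$1"
}

# Demo against a fabricated log line matching the output shown above:
printf 'INFO DispatcherRestEndpoint - Web frontend listening at http://127.0.0.1:9081.\n' > /tmp/standalonesession.log
listen_addr /tmp/standalonesession.log   # prints 127.0.0.1, i.e. the bug is in effect
```

On a fixed version the extracted host should match the entry in the masters file (hz203 or hz204) rather than the loopback address.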
Following the vsomeip tutorial "vsomeip in 10 minutes", everything works up to the "Communication between 2 devices" section.
Current Setup:
Ubuntu 16.04 (two machines - Server & Client)
Two Machines connected over ethernet
Files used:
server.cpp
client.cpp
client_config.json
server_config.json
Output of Server
[info] Parsed vsomeip configuration in 1ms
[info] Using configuration file: "../clie_prop.json".
[info] Default configuration module loaded.
[info] Initializing vsomeip application "Hello".
[info] SOME/IP client identifier configured. Using 0033 (was: 1313)
[info] Instantiating routing manager [Proxy].
[info] Client [33] is connecting to [0] at /tmp/vsomeip-0
[info] Listening at /tmp/vsomeip-33
[info] Application(Hello, 33) is initialized (11, 100).
[info] Starting vsomeip application "Hello" using 2 threads
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] io thread id from application: 0033 (Hello) is: 7f80f5cd88c0 TID: 1497
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
[info] io thread id from application: 0033 (Hello) is: 7f80f15e7700 TID: 1501
[info] shutdown thread id from application: 0033 (Hello) is: 7f80f1de8700 TID: 1500
[info] main dispatch thread id from application: 0033 (Hello) is: 7f80f25e9700 TID: 1499
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
[warning] local_client_endpoint::connect: Couldn't connect to: /tmp/vsomeip-0 (Connection refused / 111)
[info] routing_manager_proxy::on_disconnect: Client 0x33 calling host_->on_state with DEREGISTERED
Output of Client
[info] Parsed vsomeip configuration in 0ms
[info] Using configuration file: "../serv_prop.json".
[info] Default configuration module loaded.
[info] Initializing vsomeip application "World".
[warning] Routing Manager seems to be inactive. Taking over...
[info] SOME/IP client identifier configured. Using 1212 (was: 1212)
[info] Instantiating routing manager [Host].
[info] init_routing_endpoint Routing endpoint at /tmp/vsomeip-0
[info] Client [1212] is connecting to [0] at /tmp/vsomeip-0
[info] Service Discovery enabled. Trying to load module.
[info] Service Discovery module loaded.
[info] Application(World, 1212) is initialized (11, 100).
[info] OFFER(1212): [1234.5678:0.0]
[info] Starting vsomeip application "World" using 2 threads
[info] Watchdog is disabled!
[info] io thread id from application: 1212 (World) is: 7fa68723d8c0 TID: 5370
[info] Network interface "enp0s3" state changed: up
[info] vSomeIP 2.10.21 | (default)
[info] Sent READY to systemd watchdog
[info] io thread id from application: 1212 (World) is: 7fa6828f3700 TID: 5374
[info] shutdown thread id from application: 1212 (World) is: 7fa6838f5700 TID: 5372
[info] main dispatch thread id from application: 1212 (World) is: 7fa6840f6700 TID: 5371
[warning] Releasing client identifier 0003. Its corresponding application went offline while no routing manager was running.
[info] Application/Client 0003 is deregistering.
All the code used is the same as the Request/Response code in the vsomeip tutorial. The config files are the same as those specified in the "Communication between 2 devices" section, with the IP addresses changed to match my machines.
Any help would be greatly appreciated, thanks.
I found a solution!
If you navigate to the /build/examples folder in the vsomeip (or vsomeip-master) directory, you will find executables (response-sample, subscribe-sample, etc.). If you run them so that they use the same configuration files as in "vsomeip in 10 minutes" (changing unicast addresses, etc.), it should work perfectly.
This is the configuration file I used.
{
"unicast" : "192.168.43.6",
"logging" :
{
"level" : "debug",
"console" : "true",
"file" : { "enable" : "false", "path" : "/tmp/vsomeip.log" },
"dlt" : "false"
},
"applications" :
[
{
"name" : "World",
"id" : "0x1212"
}
],
"services" :
[
{
"service" : "0x1234",
"instance" : "0x5678",
"unreliable" : "30509"
}
],
"routing" : "World",
"service-discovery" :
{
"enable" : "true",
"multicast" : "224.224.224.245",
"port" : "30490",
"protocol" : "udp",
"initial_delay_min" : "10",
"initial_delay_max" : "100",
"repetitions_base_delay" : "200",
"repetitions_max" : "3",
"ttl" : "3",
"cyclic_offer_delay" : "2000",
"request_response_delay" : "1500"
}
}
I used a shell script to do this.
#!/bin/bash
route add -host 224.224.224.245 dev <interface>
export VSOMEIP_CONFIGURATION=<config_file>
export VSOMEIP_APPLICATION_NAME=<application_name>
./<executable>
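For example, the placeholders might be filled in like this (the interface name, config path, and application name are assumptions for illustration, not values from the tutorial):

```shell
# Hypothetical concrete values; adjust to your own machine. The route and
# executable lines are shown as comments because they need root and the
# built sample binary:
#   sudo route add -host 224.224.224.245 dev eth0
#   ./notify-sample
export VSOMEIP_CONFIGURATION=/home/user/world.json   # assumed path
export VSOMEIP_APPLICATION_NAME=World
echo "running with config $VSOMEIP_CONFIGURATION as $VSOMEIP_APPLICATION_NAME"
```

The application name must match an entry in the "applications" array of the JSON file, and the configuration path must point at the file for the role (service or client) that this host plays.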
It did for me anyways! Hope this helps! :)
To clarify Rob Crowley's answer, I got it working using the two unique .json configuration files included in the "vsomeip in 10 minutes" tutorial. I used the "World" configuration on the host offering the service and the "Hello" configuration file on the host running the client. The only thing I needed to modify in these files was the "unicast" address, which I changed to match the IP address of the respective host.
I also modified the script to use "sudo" before the "route add -host" command, as I found it wouldn't actually add the route without it.
I called make in the "vsomeip/build/examples/" folder to build the examples. The executable I pointed the script at on the service side was "notify-sample", and on the client side it was "subscribe-sample" (both in vsomeip/build/examples/).
This combination worked for me after connecting my two hosts via ethernet and ensuring their IP addresses matched those in the "unicast" field of their respective configuration files.
By following the Akka documents, I can start two actors (front-end and back-end) on the same machine, and they can talk to each other. However, when I tried to deploy the back-end actor to another machine (Linux), I hit an error when starting remoting:
============
Multiple main classes detected, select one to run:
[1] com.goticks.BackendMain
[2] com.goticks.BackendRemoteDeployMain
[3] com.goticks.FrontendMain
[4] com.goticks.FrontendRemoteDeployMain
[5] com.goticks.FrontendRemoteDeployWatchMain
[6] com.goticks.SingleNodeMain
Enter number: 2
[info] Running com.goticks.BackendRemoteDeployMain
INFO [Slf4jLogger]: Slf4jLogger started
INFO [Remoting]: Starting remoting
ERROR [NettyTransport]: failed to bind to /192.168.1.9:2551, shutting down Netty transport
192.168.1.9 is another machine.
In backend.conf:
remote {
enabled-transports = ["akka.remote.netty.tcp"]
netty.tcp {
#hostname = "0.0.0.0"
hostname = "192.168.1.9"
port = 2551
}
}
I have one basic question: when deploying and starting a remote actor on a remote JVM, do we need user login information for the remote machine?
Thanks,
You don't need user login information. I think port 2551 is already in use on hostname 192.168.1.9; are you sure you haven't used it in the past?
I also had the same problem: I accidentally forgot to close the program already running on the same port, and when I ran the program a second time I got Exception in thread "main" org.jboss.netty.channel.ChannelException: Failed to bind to: /192.168.3.216:2552
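Two quick checks can save time here, sketched with bash (values are illustrative; note that binding to an address the machine does not own produces the same "Failed to bind" error as a busy port, which is worth ruling out separately):

```shell
# Returns 0 if something accepts connections on host:port (bash /dev/tcp probe;
# wrap in `timeout` for non-local hosts where packets may be dropped).
port_in_use() { (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; }

if port_in_use 127.0.0.1 2551; then echo "2551 busy"; else echo "2551 free"; fi
# Separately, confirm 192.168.1.9 is assigned to a local interface on the
# machine doing the binding (e.g. check `ip addr`); binding to an address
# this machine does not own fails even when the port is free.
```

If the port is busy, finding and stopping the old process (for example with `lsof -i :2551`) resolves the ChannelException.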
Just to add more information regarding my previous question:
Multiple main classes detected, select one to run:
[1] com.goticks.BackendMain
[2] com.goticks.BackendRemoteDeployMain
[3] com.goticks.FrontendMain
[4] com.goticks.FrontendRemoteDeployMain
[5] com.goticks.FrontendRemoteDeployWatchMain
[6] com.goticks.SingleNodeMain
Enter number: 2
[info] Running com.goticks.BackendRemoteDeployMain
[DEBUG] [04/18/2016 15:54:11.554] [run-main-0] [EventStream(akka://backend)] logger log1-Logging$DefaultLogger started
[DEBUG] [04/18/2016 15:54:11.555] [run-main-0] [EventStream(akka://backend)] Default Loggers started
[INFO] [04/18/2016 15:54:11.591] [run-main-0] [akka.remote.Remoting] Starting remoting
[ERROR] [04/18/2016 15:54:11.748] [backend-akka.remote.default-remote-dispatcher-5] [NettyTransport(akka://backend)] failed to bind to /192.168.1.9:2551, shutting down Netty transport
[ERROR] [04/18/2016 15:54:11.757] [run-main-0] [akka.remote.Remoting] Remoting error: [Startup failed] [
akka.remote.RemoteTransportException: Startup failed at
akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)
at akka.remote.Remoting.start(Remoting.scala:201)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:663)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:660)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:660)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:676)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:143)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:120)
at com.goticks.BackendRemoteDeployMain$.delayedEndpoint$com$goticks$BackendRemoteDeployMain$1(BackendRemoteDeployMain.scala:9)
at com.goticks.BackendRemoteDeployMain$delayedInit$body.apply(BackendRemoteDeployMain.scala:6)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.goticks.BackendRemoteDeployMain$.main(BackendRemoteDeployMain.scala:6)
at com.goticks.BackendRemoteDeployMain.main(BackendRemoteDeployMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sbt.Run.invokeMain(Run.scala:67)
at sbt.Run.run0(Run.scala:61)
at sbt.Run.sbt$Run$$execute$1(Run.scala:51)
at sbt.Run$$anonfun$run$1.apply$mcV$sp(Run.scala:55)
at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
at sbt.Logger$$anon$4.apply(Logger.scala:85)
at sbt.TrapExit$App.run(TrapExit.scala:248)
at java.lang.Thread.run(Unknown Source)
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: /192.168.1.9:2551
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:410)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:406)
I have setup a test enviroment on a aws cluster using three machines and this guide.
I tested my code in local mode and using wirbelsturm to create a local vagrant cluster, both of which works gives desired results.
When i now submit my code to the webserver my spouts and all of my bolts are silent. My spout reads from a csv, which I have copied to the nimbus and my supervisor. The storm UI shows me the topology as active and displays all bolts and my spout, the counters are not visible though. The supervisor has no used workers. The firewall is configured to let nimbus and supervisor accept the ports 6700-6703 from supervisor and nimbus. Does the zookeeper talk on those ports?
I can't find my output logs on my machines either. I find ui and nimbus logs in /usr/local/storm/logs on the nimbus and the slave, but other than that I do not get any errors or even logs for the spouts/bolts. The Vagrant machines show a worker-xxxx.log, but my AWS servers do not.
Is that because my code crashes on some error, or because I got a config setting wrong?
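For reference, this is roughly what a minimal distributed-mode storm.yaml looks like on Storm 0.10.x. This is a sketch, not my exact file: `zkserver1` and the `/app/storm` local dir are taken from the supervisor log below, while `nimbus` is a hypothetical alias that must resolve on every node.

```yaml
storm.zookeeper.servers:
  - "zkserver1"
nimbus.host: "nimbus"          # hypothetical alias; must resolve on every node
storm.local.dir: "/app/storm"  # matches the path in the supervisor log
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
```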
Update: I verified my topology against the storm-starter examples; those do not seem to work either. I used mvn package to build an uberjar.
Update 2:
I included the log from my supervisor. It doesn't show any errors, but maybe there's something in there...
2015-12-08 13:42:55.168 b.s.u.Utils [INFO] Using defaults.yaml from resources
2015-12-08 13:42:55.297 b.s.u.Utils [INFO] Using storm.yaml from resources
2015-12-08 13:42:57.434 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:host.name=ip-172-31-26-239.us-west-2.compute.internal
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.version=1.7.0_91
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.vendor=Oracle Corporation
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x86_64/jre
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.class.path=/usr/local/apache-storm-0.10.0/lib/clojure-1.6.0.jar:/usr/local/apache-storm-0.10.0/lib/log4j-core-2.1.jar:/usr/local/apache-storm-0.10.0/lib/log4j-api-2.1.jar:/usr/local/apache-sto$
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.io.tmpdir=/tmp
2015-12-08 13:42:57.435 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.compiler=<NA>
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.name=Linux
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.arch=amd64
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.version=2.6.32-504.8.1.el6.x86_64
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.name=storm
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.home=/app/home/storm
2015-12-08 13:42:57.436 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.dir=/
2015-12-08 13:42:57.459 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-12-08 13:42:57.459 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:host.name=ip-172-31-26-239.us-west-2.compute.internal
2015-12-08 13:42:57.459 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.version=1.7.0_91
2015-12-08 13:42:57.459 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.vendor=Oracle Corporation
2015-12-08 13:42:57.459 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x86_64/jre
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.class.path=/usr/local/apache-storm-0.10.0/lib/clojure-1.6.0.jar:/usr/local/apache-storm-0.10.0/lib/log4j-core-2.1.jar:/usr/local/apache-storm-0.10.0/lib/log4j-api-2.1.jar:/usr/local/ap$
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.io.tmpdir=/tmp
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:java.compiler=<NA>
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:os.name=Linux
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:os.arch=amd64
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:os.version=2.6.32-504.8.1.el6.x86_64
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:user.name=storm
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:user.home=/app/home/storm
2015-12-08 13:42:57.460 o.a.s.s.o.a.z.s.ZooKeeperServer [INFO] Server environment:user.dir=/
2015-12-08 13:42:57.774 b.s.u.Utils [INFO] Using defaults.yaml from resources
2015-12-08 13:42:57.803 b.s.u.Utils [INFO] Using storm.yaml from resources
2015-12-08 13:42:57.939 b.s.d.supervisor [INFO] Starting Supervisor with conf {"topology.builtin.metrics.bucket.size.secs" 60, "nimbus.childopts" "-Xmx1024m -Djava.net.preferIPv4Stack=true", "ui.filter.params" nil, "storm.cluster.mode" "distributed", "storm.messaging.net$
2015-12-08 13:42:57.963 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [5]
2015-12-08 13:42:58.063 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2015-12-08 13:42:58.066 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#502016b8
2015-12-08 13:42:58.081 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server zkServer1/xx.xx.xx.xx:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-08 13:42:58.089 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to zkServer1/xx.xx.xx.xx:2181, initiating session
2015-12-08 13:42:58.094 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server zkServer1/xx.xx.xx.xx:2181, sessionid = 0x15182c7ba25000d, negotiated timeout = 20000
2015-12-08 13:42:58.096 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-12-08 13:42:58.097 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
2015-12-08 13:42:59.109 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2015-12-08 13:42:59.110 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x15182c7ba25000d closed
2015-12-08 13:42:59.111 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [5]
2015-12-08 13:42:59.116 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2015-12-08 13:42:59.116 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=zkserver1:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#5edfa0aa
2015-12-08 13:42:59.121 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server zkServer1/xx.xx.xx.xx:2181. Will not attempt to authenticate using SASL (unknown error)
2015-12-08 13:42:59.122 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to zkServer1/xx.xx.xx.xx:2181, initiating session
2015-12-08 13:42:59.124 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server zkServer1/xx.xx.xx.xx:2181, sessionid = 0x15182c7ba25000e, negotiated timeout = 20000
2015-12-08 13:42:59.124 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-12-08 13:42:59.169 b.s.d.supervisor [INFO] Starting supervisor with id cc5e1723-cc06-4bc1-a1bf-192a1d7f5bf6 at host xxxxxxx.us-west-2.compute.internal
2015-12-08 13:43:06.059 b.s.d.supervisor [INFO] Downloading code for storm id production-topology-4-1449599549 from /app/storm/nimbus/stormdist/production-topology-4-1449599549
2015-12-08 13:43:06.075 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
Any ideas?
Update 3:
So I did find this:
java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection timed out
at backtype.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:59) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:51) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:103) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.ThriftClient.<init>(ThriftClient.java:72) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:74) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:37) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.utils.Utils.downloadFromMaster(Utils.java:361) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.daemon.supervisor$fn__7720.invoke(supervisor.clj:581) ~[storm-core-0.10.0.jar:0.10.0]
at clojure.lang.MultiFn.invoke(MultiFn.java:241) ~[clojure-1.6.0.jar:?]
at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__7638.invoke(supervisor.clj:465) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.event$event_manager$fn__7258.invoke(event.clj:40) [storm-core-0.10.0.jar:0.10.0]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_91]
Caused by: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.thrift7.transport.TSocket.open(TSocket.java:187) ~[storm-core-0.10.0.jar:0.10.0]
at org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:103) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48) ~[storm-core-0.10.0.jar:0.10.0]
... 11 more
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.7.0_91]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) ~[?:1.7.0_91]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) ~[?:1.7.0_91]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) ~[?:1.7.0_91]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.7.0_91]
at java.net.Socket.connect(Socket.java:579) ~[?:1.7.0_91]
at org.apache.thrift7.transport.TSocket.open(TSocket.java:182) ~[storm-core-0.10.0.jar:0.10.0]
at org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:103) ~[storm-core-0.10.0.jar:0.10.0]
at backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48) ~[storm-core-0.10.0.jar:0.10.0]
... 11 more
2015-12-08 14:26:41.028 b.s.util [ERROR] Halting process: ("Error when processing an event")
java.lang.RuntimeException: ("Error when processing an event")
at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:336) [storm-core-0.10.0.jar:0.10.0]
at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.6.0.jar:?]
at backtype.storm.event$event_manager$fn__7258.invoke(event.clj:48) [storm-core-0.10.0.jar:0.10.0]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_91]
I followed the same guide as you and ran into the same issue.
What solved the problem for me:
Edit the /etc/hosts files on all three machines (zookeeper, nimbus, and slave1) the same way.
First, remove the IPv6 line that starts with ::1; it is not supported by Apache Storm.
In the first line of the file, which contains the local aliases, place the public hostname of the local machine (the one known to the other nodes of the cluster) right after 127.0.0.1. I suppose this is the alias Storm takes into account.
Finally, as described in the guide, list all the other machines and their Storm-known hostnames.
My /etc/hosts ends up looking like this (on the nimbus machine):
127.0.0.1 vm-matthias-02 localhost.localdomain localhost
192.168.200.48 vm-matthias-01
192.168.200.49 vm-matthias-02
192.168.200.50 vm-matthias-03
Be careful to use the same machine names when you edit the configuration files.
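A quick way to sanity-check the edits is to verify that each hostname resolves to the IP you expect on every box. A minimal sketch; the hostnames/IPs in the comments are the ones from the example /etc/hosts above, so substitute your own:

```python
import socket

def resolves_to(hostname, expected_ip):
    """Return True if `hostname` resolves to `expected_ip` on this machine."""
    try:
        return socket.gethostbyname(hostname) == expected_ip
    except socket.gaierror:  # name does not resolve at all
        return False

# On the nimbus box from the example above you would expect, e.g.:
# resolves_to("vm-matthias-02", "127.0.0.1")       # local alias
# resolves_to("vm-matthias-01", "192.168.200.48")  # other cluster node
```

If any of these come back False, Storm will advertise or dial an address the other nodes cannot reach.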