AWS Elasticsearch EC2 discovery: can't find other nodes - amazon-web-services
My objective is to create an Elasticsearch cluster in AWS using EC2 discovery.
I have 3 instances each running elasticsearch.
I have given each instance an IAM role that allows it to describe EC2 data.
Each instance is in the security group "sec-group-elasticsearch".
The nodes start but do not find each other (logs below).
I can telnet from one node to another using the private DNS name and port 9300; e.g. telnet from node A->B works and B->A works:
telnet ip-xxx-xxx-xx-xxx.vpc.fakedomain.com 9300
Reference
IAM role for each instance:
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
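A quick sanity check that the role actually works from inside an instance (assuming the AWS CLI is installed on the node, and us-east-1 on the assumption that is what cloud.aws.region: us-east below refers to) is to issue the same API call the plugin makes:

# should print private DNS names instead of an UnauthorizedOperation error
aws ec2 describe-instances --region us-east-1 --query 'Reservations[].Instances[].PrivateDnsName'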
Security group rules
Inbound
Custom TCP Rule TCP 9200 - 9400 0.0.0.0/0
Outbound
All traffic allowed
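If you'd rather double-check those rules from the command line than in the console, something like this should list them (assuming the AWS CLI is configured on a machine with describe permissions):

aws ec2 describe-security-groups \
  --filters Name=group-name,Values=sec-group-elasticsearch \
  --query 'SecurityGroups[].IpPermissions'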
elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
On startup, each instance logs the following:
[2016-03-02 03:13:48,128][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] version[2.1.0], pid[26976], build[72cd1f1/2015-11-18T22:40:03Z]
[2016-03-02 03:13:48,129][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initializing ...
[2016-03-02 03:13:48,592][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] loaded [cloud-aws], sites [head]
[2016-03-02 03:13:48,620][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.4gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initialized
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] starting ...
[2016-03-02 03:13:51,065][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9300}, bound_addresses {xxx-xxx-xx-xxx:9300}
[2016-03-02 03:13:51,074][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] my-ec2-elasticsearch/xVOkfK4TT-GWaPln59wGxw
[2016-03-02 03:14:21,075][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] waited for 30s and no initial state was set by the discovery
[2016-03-02 03:14:21,084][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9200}, bound_addresses {xxx-xxx-xx-xxx:9200}
[2016-03-02 03:14:21,085][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] started
TRACE logging enabled for discovery:
[2016-03-02 04:25:27,753][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:916)
at ..............
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] sending to {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] received response from {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}: [ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[143], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[145], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[147], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[149], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[151], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[153], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[154], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}]
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_2#}{::1}{[::1]:9300}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
ConnectTransportException[[][127.0.0.1:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9300];
at ...........
[2016-03-02 04:25:29,255][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at
You have a tiny typo in your elasticsearch.yml configuration file:
discovery: ec2
should read:
discovery.type: ec2
Because the setting as written is not recognized, Elasticsearch falls back to the default zen unicast hosts, which is why your trace log shows pings going only to 127.0.0.1 and [::1] instead of to the hosts returned by the EC2 API.
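For reference, the full corrected elasticsearch.yml from the question (everything as posted, with only the discovery line fixed):

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery.type: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300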
Related
Elasticsearch: 503 error on cluster - master not discovered
I have set up a cluster with 3 masters for the time being on AWS. Here are the three /etc/elasticsearch/elasticsearch.yml files:
1. master1
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master1-stg
action.auto_create_index: true
2. master2
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master2-stg
action.auto_create_index: true
3. master3
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master3-stg
action.auto_create_index: true
However, when on, say, master1:
curl -XGET http://10.11.11.118:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
I have installed the discovery-ec2 plugin.
Turns out it needed a role attached to the instances with the following policy:
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
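For anyone scripting this instead of using the console, a rough sketch of attaching such a policy from the CLI (the role name, profile name, policy file, and instance ID here are hypothetical placeholders):

# inline policy saved locally as describe-instances.json (the JSON above)
aws iam put-role-policy --role-name es-discovery-role \
    --policy-name AllowDescribeInstances \
    --policy-document file://describe-instances.json
# the role reaches the instance through an instance profile
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=es-discovery-profile

The role must already exist with an EC2 trust relationship and be added to the instance profile for the second command to take effect.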
Elasticsearch 5.2 crashes in Ubuntu14.04, EC2 t2.large machine
I'm trying to run Elasticsearch 5.2 hosted on an EC2 Ubuntu 14.04 machine (t2.large, which has 8 GB of RAM, the minimum specified by Elastic to run Elasticsearch), but Elasticsearch is shutting down unexpectedly and I am not able to understand the cause of the shutdown. This is the elasticsearch.log:
[2017-03-20T10:07:53,410][INFO ][o.e.p.PluginsService ] [QrRfI_U] loaded module [transport-netty4]
[2017-03-20T10:07:53,411][INFO ][o.e.p.PluginsService ] [QrRfI_U] no plugins loaded
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] initialized
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] [QrRfI_U] starting ...
[2017-03-20T10:07:55,626][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: f6:fd:16:e4:90:62:fe:d6
[2017-03-20T10:07:55,673][INFO ][o.e.t.TransportService ] [QrRfI_U] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2017-03-20T10:07:58,755][INFO ][o.e.c.s.ClusterService ] [QrRfI_U] new_master {QrRfI_U}{QrRfI_UKQxWwvvhvgYxGmQ}{Rne8jnb_S0KVRnXvJj1m2w}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-20T10:07:58,793][INFO ][o.e.h.HttpServer ] [QrRfI_U] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2017-03-20T10:07:58,793][INFO ][o.e.n.Node ] [QrRfI_U] started
[2017-03-20T10:07:59,072][INFO ][o.e.g.GatewayService ] [QrRfI_U] recovered [6] indices into cluster_state
[2017-03-20T10:07:59,724][INFO ][o.e.c.r.a.AllocationService] [QrRfI_U] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[logstash-2017.02.26][4], [logstash-2017.02.26][3], [logstash-2017.02.26][1], [logstash-2017.02.26][0]] ...]).
[2017-03-20T10:50:12,228][INFO ][o.e.c.m.MetaDataMappingService] [QrRfI_U] [logstash-2017.03.20/HXANYkA9RRKne-YAK9cNQg] update_mapping [logs]
[2017-03-20T11:06:55,449][INFO ][o.e.n.Node ] [QrRfI_U] stopping ...
[2017-03-20T11:06:55,514][INFO ][o.e.n.Node ] [QrRfI_U] stopped
[2017-03-20T11:06:55,515][INFO ][o.e.n.Node ] [QrRfI_U] closing ...
[2017-03-20T11:06:55,523][INFO ][o.e.n.Node ] [QrRfI_U] closed
When I restart Elasticsearch, this is the node stats after one Logstash input (I've never managed more than 3 inputs before Elasticsearch crashes):
Request: curl -i -XGET 'localhost:9200/_nodes/stats'
Response:
{"_nodes":{"total":1,"successful":1,"failed":0},"cluster_name":"elasticsearch","nodes":{"QrRfI_UKQxWwvvhvgYxGmQ":{"timestamp":1490011241990,"name":"QrRfI_U","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0.1:9300","roles":["master","data","ingest"],"indices":{"docs":{"count":17,"deleted":0},"store":{"size_in_bytes":235863,"throttle_time_in_millis":0},"indexing":{"index_total":2,"index_time_in_millis":111,"index_current":0,"index_failed":0,"delete_total":0,"delete_time_in_millis":0,"delete_current":0,"noop_update_total":0,"is_throttled":false,"throttle_time_in_millis":0},"get":{"total":2,"time_in_millis":3,"exists_total":2,"exists_time_in_millis":3,"missing_total":0,"missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":84,"query_time_in_millis":70,"query_current":0,"fetch_total":80,"fetch_time_in_millis":91,"fetch_current":0,"scroll_total":0,"scroll_time_in_millis":0,"scroll_current":0,"suggest_total":0,"suggest_time_in_millis":0,"suggest_current":0},"merges":{"current":0,"current_docs":0,"current_size_in_bytes":0,"total":0,"total_time_in_millis":0,"total_docs":0,"total_size_in_bytes":0,"total_stopped_time_in_millis":0,"total_throttled_time_in_millis":0,"total_auto_throttle_in_bytes":545259520},"refresh":{"total":2,"total_time_in_millis":89,"listeners":0},"flush":{"total":0,"total_time_in_millis":0},"warmer":{"current":0,"total":28,"total_time_in_millis":72},"query_cache":{"memory_size_in_bytes":0,"total_count":0,"hit_count":0,"miss_count":0,"cache_size":0,"cache_count":0,"evictions":0},"fielddata":{"memory_size_in_bytes":0,"evictions":0},"completion":{"size_in_bytes":0},"segments":{"count":17,"memory_in_bytes":137618,"terms_memory_in_bytes":130351,"stored_fields_memory_in_bytes":5304,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":384,"points_memory_in_bytes":15,"doc_values_memory_in_bytes":1564,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto_id_timestamp":-1,"file_sizes":{}},"translog":{"operations":2,"size_in_bytes":6072},"request_cache":{"memory_size_in_bytes":12740,"evictions":0,"hit_count":0,"miss_count":20},"recovery":{"current_as_source":0,"current_as_target":0,"throttle_time_in_millis":0}},"os":{"timestamp":1490011241998,"cpu":{"percent":1,"load_average":{"1m":0.18,"5m":0.08,"15m":0.06}},"mem":{"total_in_bytes":8371847168,"free_in_bytes":5678006272,"used_in_bytes":2693840896,"free_percent":68,"used_percent":32},"swap":{"total_in_bytes":0,"free_in_bytes":0,"used_in_bytes":0}},"process":{"timestamp":1490011241998,"open_file_descriptors":220,"max_file_descriptors":66000,"cpu":{"percent":1,"total_in_millis":14800},"mem":{"total_virtual_in_bytes":3171389440}},"jvm":{"timestamp":1490011241998,"uptime_in_millis":205643,"mem":{"heap_used_in_bytes":195922864,"heap_used_percent":37,"heap_committed_in_bytes":519438336,"heap_max_in_bytes":519438336,"non_heap_used_in_bytes":75810224,"non_heap_committed_in_bytes":81326080,"pools":{"young":{"used_in_bytes":96089960,"max_in_bytes":139591680,"peak_used_in_bytes":139591680,"peak_max_in_bytes":139591680},"survivor":{"used_in_bytes":11413088,"max_in_bytes":17432576,"peak_used_in_bytes":17432576,"peak_max_in_bytes":17432576},"old":{"used_in_bytes":88419816,"max_in_bytes"
:362414080,"peak_used_in_bytes":88419816,"peak_max_in_bytes":362414080}}},"threads":{"count":43,"peak_count":45},"gc":{"collectors":{"young":{"collection_count":5,"collection_time_in_millis":164},"old":{"collection_count":1,"collection_time_in_millis":39}}},"buffer_pools":{"direct":{"count":29,"used_in_bytes":70307265,"total_capacity_in_bytes":70307264},"mapped":{"count":17,"used_in_bytes":217927,"total_capacity_in_bytes":217927}},"classes":{"current_loaded_count":10981,"total_loaded_count":10981,"total_unloaded_count":0}},"thread_pool":{"bulk":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"fetch_shard_started":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":26},"fetch_shard_store":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"flush":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"force_merge":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"generic":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":54},"get":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"index":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"listener":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"management":{"threads":5,"queue":0,"active":1,"rejected":0,"largest":5,"completed":203},"refresh":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":550},"search":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":165},"snapshot":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"warmer":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":23}},"fs":{"timestamp":1490011241999,"total":{"total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008},"data":[{"path":"/home/ubuntu/elasticsearch-5.2.0/data/nodes/0","mount":"/ (/dev/xvda1)","type":"ext4","total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008,"spins":"false"}],"io_stats":{"devices":[{"device_name":"xvda1","operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}],"total":{"operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}}},"transport":{"server_open":0,"rx_count":10,"rx_size_in_bytes":3388,"tx_count":10,"tx_size_in_bytes":3388},"http":{"current_open":5,"total_opened":12},"breakers":{"request":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"fielddata":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.03,"tripped":0},"in_flight_requests":{"limit_size_in_bytes":519438336,"limit_size":"495.3mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"parent":{"limit_size_in_bytes":363606835,"limit_size":"346.7mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0}},"script":{"compilations":0,"cache_evictions":0},"discovery":{"cluster_state_queue":{"total":0,"pending":0,"committed":0}},"ingest":{"total":{"count":0,"time_in_millis":0,"current":0,"failed":0},"pipelines":{}}}}}
Unable to form Elasticsearch (5.1.1) cluster on AWS EC2 instances
I am unable to form an ES cluster between 2 master nodes in EC2 instances. Following is the elasticsearch.yml for the nodes.
Node1:
bootstrap.memory_lock: true
cloud.aws.protocol: http
cloud.aws.proxy.host: <Proxy addr>
cloud.aws.proxy.port: <proxy port>
cloud.aws.region: us-east
cluster.name: production-test
discovery.ec2.availability_zones: us-east-1a,us-east-1b,us-east-1d,us-east-1e
discovery.zen.ping_timeout: 30s
discovery.ec2.tag.Name: <ec2-tag name>
discovery.zen.hosts_provider: ec2
#discovery.type: ec2
#discovery.zen.ping.multicast.enabled: false
http.port: 9205
#network.host: _eth0_, _local_, _ec2_
network.host: <private ip_addr>
#network.bind_host: <private ip_addr>
#network.publish_host: <private ip_addr>
node.data: true
node.master: true
plugin.mandatory: discovery-ec2, repository-s3
transport.tcp.port: 9305
#discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>","<private ip_addr of node2>"]
discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>:9305", "<private ip_addr of node2>:9305"]
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
node.name: nodetest1
path.data: /var/lib/elasticsearch/
#path.data: /data/elasticsearch/data/production
path.logs: /var/log/elasticsearch/
path.conf: /etc/elasticsearch
Node 2:
bootstrap.memory_lock: true
cloud.aws.protocol: http
cloud.aws.proxy.host: <Proxy addr>
cloud.aws.proxy.port: <proxy port>
cloud.aws.region: us-east
cluster.name: production-test
discovery.ec2.availability_zones: us-east-1a,us-east-1b,us-east-1d,us-east-1e
discovery.zen.ping_timeout: 30s
discovery.ec2.tag.Name: <ec2-instance tag name>
discovery.zen.hosts_provider: ec2
#discovery.type: ec2
#discovery.zen.ping.multicast.enabled: false
http.port: 9205
#network.host: _eth0_, _local_, _ec2_
network.host: <private ip_addr>
#network.bind_host: <private ip_addr>
#network.publish_host: <private ip_addr>
node.data: true
node.master: true
plugin.mandatory: discovery-ec2, repository-s3
transport.tcp.port: 9305
discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>:9305","<private ip_addr of node2>:9305"]
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
node.name: nodetest2
#Paths to log, conf, data directories
When both the nodes are started, the following is the log data on both the nodes:
[INFO ][o.e.b.BootstrapCheck ] [nodetest1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[WARN ][o.e.n.Node ] [nodetest1] timed out while waiting for initial discovery state - timeout: 30s
[INFO ][o.e.h.HttpServer ] [nodetest1] publish_address {<private ip_addr of node1>:9205}, bound_addresses {<private ip_addr of node1>:9205}
[INFO ][o.e.n.Node ] [nodetest1] started
[INFO ][o.e.d.z.ZenDiscovery ] [nodetest1] failed to send join request to master [{nodetest}{YcGzQ-4CQtmuuxUGMQJroA}{yuxHmvGPTeK-iw59VTj4ZA}{<private ip_addr of node2>}{<private ip_addr of node2>:9305}{aws_availability_zone=us-east-1d}], reason [RemoteTransportException[[nodetest][<private ip_addr of node2>:9305][internal:discovery/zen/join]]; nested: NotMasterException[Node [{nodetest}{YcGzQ-4CQtmuuxUGMQJroA}{yuxHmvGPTeK-iw59VTj4ZA}{<private ip_addr of node2>}{<private ip_addr of node2>:9305}{aws_availability_zone=us-east-1d}] not master for join request]; ], tried [3] times
I have searched many similar issues and tried to apply the fixes, but I still have the same result. Is there any fault in the elasticsearch.yml file?
curl -XGET <private ip_addr>:9205/_cat/master
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
The two node instances are running ES 5.1.1 and are in the same security group and IAM role. Any suggestions are highly appreciated. Thanks,
Elasticsearch zen discovery - Connection refused: /127.0.0.1:9302
Very new to Elasticsearch and trying out the Zen discovery plugin for the first time. I'm currently using version 5.0.0-alpha5. Here are my current settings:
cluster:
  name: Elastic-POC
node:
  name: ${HOSTNAME}-data
  master: false
  data: true
cloud:
  aws:
    access_key: xxxxxx
    secret_key: xxxxxx
    region: us-west-2
  ec2:
    protocol: http
    access_key: xxxxxx
    secret_key: xxxxxx
discovery:
  type: ec2
  zen.minimum_master_nodes: 1
  ec2.any_group: true
  ec2.groups: sg-xxxxxx
network:
  host: _ec2:privateIp_
The above settings are from the "data" node; it's unable to join the "master" node. I have enabled "TRACE" for the discovery plugin and this is what I got from the log:
[2016-07-12 00:30:39,377][INFO ][env ] [ip-172-29-1-44-data] heap size [15.8gb], compressed ordinary object pointers [true]
[2016-07-12 00:30:40,563][DEBUG][discovery.zen.elect ] [ip-172-29-1-44-data] using minimum_master_nodes [1]
[2016-07-12 00:30:40,913][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:40,914][DEBUG][discovery.zen.ping.unicast] [ip-172-29-1-44-data] using initial hosts [127.0.0.1, [::1]], with concurrent_connects [10]
[2016-07-12 00:30:40,922][DEBUG][discovery.zen ] [ip-172-29-1-44-data] using ping_timeout [3s], join.timeout [1m], master_election.ignore_non_master [false]
[2016-07-12 00:30:40,925][DEBUG][discovery.zen.fd ] [ip-172-29-1-44-data] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2016-07-12 00:30:40,938][DEBUG][discovery.zen.fd ] [ip-172-29-1-44-data] [node ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2016-07-12 00:30:41,250][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:41,250][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:41,252][INFO ][node ] [ip-172-29-1-44-data] initialized
[2016-07-12 00:30:41,252][INFO ][node ] [ip-172-29-1-44-data] starting ...
[2016-07-12 00:30:41,546][INFO ][transport ] [ip-172-29-1-44-data] publish_address {172.29.1.44:9300}, bound_addresses {172.29.1.44:9300}
[2016-07-12 00:30:41,561][TRACE][discovery.zen ] [ip-172-29-1-44-data] starting an election context, will accumulate joins
[2016-07-12 00:30:41,562][TRACE][discovery.zen ] [ip-172-29-1-44-data] starting to ping
[2016-07-12 00:30:42,477][TRACE][discovery.ec2 ] [ip-172-29-1-44-data] building dynamic unicast discovery nodes...
[2016-07-12 00:30:42,477][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using dynamic discovery nodes []
[2016-07-12 00:30:42,480][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-07-12 00:30:42,480][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_2#}{127.0.0.1}{127.0.0.1:9301}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_4#}{127.0.0.1}{127.0.0.1:9303}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_5#}{127.0.0.1}{127.0.0.1:9304}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_3#}{127.0.0.1}{127.0.0.1:9302}
[2016-07-12 00:30:42,483][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_6#}{::1}{[::1]:9300}
[2016-07-12 00:30:42,485][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_7#}{::1}{[::1]:9301}
[2016-07-12 00:30:42,485][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_8#}{::1}{[::1]:9302}
[2016-07-12 00:30:42,487][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_9#}{::1}{[::1]:9303}
[2016-07-12 00:30:42,487][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_10#}{::1}{[::1]:9304}
[2016-07-12 00:30:42,508][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] failed to connect to {#zen_unicast_3#}{127.0.0.1}{127.0.0.1:9302}
ConnectTransportException[[][127.0.0.1:9302] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9302];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:1008)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:972)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:944)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:325)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:301)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.run(UnicastZenPing.java:398)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:392)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /127.0.0.1:9302
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
[2016-07-12 00:30:42,510][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] failed to connect to {#zen_unicast_7#}{::1}{[::1]:9301}
ConnectTransportException[[][[::1]:9301] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9301];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:1008)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:972)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:944)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:325)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:301)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.run(UnicastZenPing.java:398)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:392)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /0:0:0:0:0:0:0:1:9301
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
I was able to figure it out. All my ES nodes were unable to make calls to the AWS API because all outbound traffic to the public internet was blocked, so discovery fell back to the default host addresses. After enabling outbound internet traffic (via NAT gateways), the nodes were able to find the expected ES hosts.
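A quick way to reproduce that diagnosis from a node (assuming curl is available; us-west-2 is the region from the question's config) is to probe the regional EC2 API endpoint directly:

# any HTTP status code means the EC2 API is reachable;
# a timeout means outbound internet traffic is still blocked
curl -sS -m 10 -o /dev/null -w '%{http_code}\n' https://ec2.us-west-2.amazonaws.com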
elasticsearch: EC2 discovery: master nodes work, data nodes fail
My objective is to run a 6-node cluster on three instances in EC2. I am placing one master-only and one data-only node on each instance (using the elastic ansible playbook). The master nodes from each of the three instances all find each other without issue using EC2 discovery, form a cluster of three, and elect a master. The data nodes from the same instances fail on startup with the error below.
What have I tried:
- switching data nodes to explicit zen.unicast discovery via hostnames works
- I can telnet on port 9301 from instance A->B without issue
REFERENCE:
java version - OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
es version - 2.1.0
data node elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9201
network.host: _ec2:privateDns_
node.data: true
node.master: false
transport.tcp.port: 9301
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1
master node elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-master
Errors from data node startup:
[2016-03-02 15:45:06,246][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initializing ...
[2016-03-02 15:45:06,679][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] loaded [cloud-aws], sites [head]
[2016-03-02 15:45:06,710][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.5gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initialized
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] starting ...
[2016-03-02 15:45:09,678][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9301}, bound_addresses {xxx-xxx-xx-xxx:9301}
[2016-03-02 15:45:09,687][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] my-cluster/PNI6WAmzSYGgZcX2HsqenA
[2016-03-02 15:45:09,701][WARN ][com.amazonaws.jmx.SdkMBeanRegistrySupport] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.registerMBean(MBeans.java:50)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.registerMetricAdminMBean(SdkMBeanRegistrySupport.java:27)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:355)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:09,703][WARN ][com.amazonaws.metrics.AwsSdkMetrics] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.isRegistered(MBeans.java:98)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.isMBeanRegistered(SdkMBeanRegistrySupport.java:46)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:361)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:39,688][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] waited for 30s and no initial state was set by the discovery
[2016-03-02 15:45:39,698][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9201}, bound_addresses {xxx-xxx-xx-xxx:9201}
[2016-03-02 15:45:39,699][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] started
I fixed this by removing the explicit transport.tcp.port setting and using the default behaviour of letting Elasticsearch pick any free port in the range 9300-9399. The warnings from AwsSdkMetrics remain but are NOT an issue, as Val stated.
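For clarity, here is roughly what changes in the data-node elasticsearch.yml from the question (a sketch, not the full file):

http.port: 9201
# transport.tcp.port: 9301   <- removed, so the node binds the first free transport port in 9300-9399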
This is not actually an error; see this issue where it has been reported. It just seems the plugin is logging too much. If you modify your logging.yml config file as suggested in that issue with the following entries, you'll be fine:
# aws will try to do some sketchy JMX stuff, but its not needed.
com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
com.amazonaws.metrics.AwsSdkMetrics: ERROR
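For context, those two entries belong under the logger: section of the ES 2.x logging.yml, so the relevant part of the file ends up looking roughly like this (a sketch, assuming the stock file layout):

logger:
  # aws will try to do some sketchy JMX stuff, but its not needed.
  com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
  com.amazonaws.metrics.AwsSdkMetrics: ERROR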