AWS Elasticsearch EC2 discovery: can't find other nodes - amazon-web-services
My objective is to create an Elasticsearch cluster in AWS using EC2 discovery.
I have 3 instances each running elasticsearch.
I have given each instance an IAM role that allows it to describe EC2 data.
Each instance is in the security group "sec-group-elasticsearch".
The nodes start but do not find each other (logs below).
I can telnet from one node to another using the private DNS name and port 9300; e.g. telnet from node A->B works and B->A works:
telnet ip-xxx-xxx-xx-xxx.vpc.fakedomain.com 9300
Reference
IAM role for each instance:
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
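A quick sanity check that the role actually works from inside an instance (assuming the AWS CLI is installed on the node, and us-east-1 on the assumption that is what cloud.aws.region: us-east below refers to) is to issue the same API call the plugin makes:

# should print private DNS names instead of an UnauthorizedOperation error
aws ec2 describe-instances --region us-east-1 --query 'Reservations[].Instances[].PrivateDnsName'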
Security group rules
Inbound
Custom TCP Rule TCP 9200 - 9400 0.0.0.0/0
Outbound
All traffic allowed
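If you'd rather double-check those rules from the command line than in the console, something like this should list them (assuming the AWS CLI is configured on a machine with describe permissions):

aws ec2 describe-security-groups \
  --filters Name=group-name,Values=sec-group-elasticsearch \
  --query 'SecurityGroups[].IpPermissions'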
elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
On startup, each instance logs the following:
[2016-03-02 03:13:48,128][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] version[2.1.0], pid[26976], build[72cd1f1/2015-11-18T22:40:03Z]
[2016-03-02 03:13:48,129][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initializing ...
[2016-03-02 03:13:48,592][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] loaded [cloud-aws], sites [head]
[2016-03-02 03:13:48,620][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.4gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initialized
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] starting ...
[2016-03-02 03:13:51,065][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9300}, bound_addresses {xxx-xxx-xx-xxx:9300}
[2016-03-02 03:13:51,074][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] my-ec2-elasticsearch/xVOkfK4TT-GWaPln59wGxw
[2016-03-02 03:14:21,075][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] waited for 30s and no initial state was set by the discovery
[2016-03-02 03:14:21,084][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9200}, bound_addresses {xxx-xxx-xx-xxx:9200}
[2016-03-02 03:14:21,085][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] started
TRACE logging enabled for discovery:
[2016-03-02 04:25:27,753][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:916)
at ..............
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] sending to {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] received response from {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}: [ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[143], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[145], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[147], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[149], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[151], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[153], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[154], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}]
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_2#}{::1}{[::1]:9300}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
ConnectTransportException[[][127.0.0.1:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9300];
at ...........
[2016-03-02 04:25:29,255][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at
You have a tiny typo in your elasticsearch.yml configuration file:
discovery: ec2
should read:
discovery.type: ec2
Because the setting as written is not recognized, Elasticsearch falls back to the default zen unicast hosts, which is why your trace log shows pings going only to 127.0.0.1 and [::1] instead of to the hosts returned by the EC2 API.
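For reference, the full corrected elasticsearch.yml from the question (everything as posted, with only the discovery line fixed):

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery.type: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300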
Related
Elasticsearch: 503 error on cluster - master not discovered
I have set up a cluster with 3 masters for the time being on AWS. Here are the three /etc/elasticsearch/elasticsearch.yml files:
1. master1
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master1-stg
action.auto_create_index: true
2. master2
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master2-stg
action.auto_create_index: true
3. master3
cluster.name: es-staging
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _ec2:privateIp_
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.host_type: private_ip
discovery.zen.hosts_provider: ec2
http.port: 9200
discovery.zen.minimum_master_nodes: 2
node.master: true
s3.client.default.endpoint: s3-eu-west-1.amazonaws.com
transport.tcp.port: 9300
node.name: elastic-master3-stg
action.auto_create_index: true
However, when on, say, master1:
curl -XGET http://10.11.11.118:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
I have installed the discovery-ec2 plugin.
Turns out it needed a role attached to the instances with the following policy:
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
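For anyone scripting this instead of using the console, a rough sketch of attaching such a policy from the CLI (the role name, profile name, policy file, and instance ID here are hypothetical placeholders):

# inline policy saved locally as describe-instances.json (the JSON above)
aws iam put-role-policy --role-name es-discovery-role \
    --policy-name AllowDescribeInstances \
    --policy-document file://describe-instances.json
# the role reaches the instance through an instance profile
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=es-discovery-profile

The role must already exist with an EC2 trust relationship and be added to the instance profile for the second command to take effect.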
Elasticsearch 5.2 crashes in Ubuntu14.04, EC2 t2.large machine
I'm trying to run Elasticsearch 5.2 hosted on an EC2 Ubuntu 14.04 machine (t2.large, which has 8 GB of RAM, the minimum specified by Elastic to run Elasticsearch), but Elasticsearch is shutting down unexpectedly and I am not able to understand the cause of the shutdown. This is the elasticsearch.log:
[2017-03-20T10:07:53,410][INFO ][o.e.p.PluginsService ] [QrRfI_U] loaded module [transport-netty4]
[2017-03-20T10:07:53,411][INFO ][o.e.p.PluginsService ] [QrRfI_U] no plugins loaded
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] initialized
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] [QrRfI_U] starting ...
[2017-03-20T10:07:55,626][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: f6:fd:16:e4:90:62:fe:d6
[2017-03-20T10:07:55,673][INFO ][o.e.t.TransportService ] [QrRfI_U] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2017-03-20T10:07:58,755][INFO ][o.e.c.s.ClusterService ] [QrRfI_U] new_master {QrRfI_U}{QrRfI_UKQxWwvvhvgYxGmQ}{Rne8jnb_S0KVRnXvJj1m2w}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-20T10:07:58,793][INFO ][o.e.h.HttpServer ] [QrRfI_U] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2017-03-20T10:07:58,793][INFO ][o.e.n.Node ] [QrRfI_U] started
[2017-03-20T10:07:59,072][INFO ][o.e.g.GatewayService ] [QrRfI_U] recovered [6] indices into cluster_state
[2017-03-20T10:07:59,724][INFO ][o.e.c.r.a.AllocationService] [QrRfI_U] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[logstash-2017.02.26][4], [logstash-2017.02.26][3], [logstash-2017.02.26][1], [logstash-2017.02.26][0]] ...]).
[2017-03-20T10:50:12,228][INFO ][o.e.c.m.MetaDataMappingService] [QrRfI_U] [logstash-2017.03.20/HXANYkA9RRKne-YAK9cNQg] update_mapping [logs]
[2017-03-20T11:06:55,449][INFO ][o.e.n.Node ] [QrRfI_U] stopping ...
[2017-03-20T11:06:55,514][INFO ][o.e.n.Node ] [QrRfI_U] stopped
[2017-03-20T11:06:55,515][INFO ][o.e.n.Node ] [QrRfI_U] closing ...
[2017-03-20T11:06:55,523][INFO ][o.e.n.Node ] [QrRfI_U] closed
When I restart Elasticsearch, this is the node stats after one Logstash input (I've never managed more than 3 inputs before Elasticsearch crashes):
Request: curl -i -XGET 'localhost:9200/_nodes/stats'
Response:
{"_nodes":{"total":1,"successful":1,"failed":0},"cluster_name":"elasticsearch","nodes":{"QrRfI_UKQxWwvvhvgYxGmQ":{"timestamp":1490011241990,"name":"QrRfI_U","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0.1:9300","roles":["master","data","ingest"],"indices":{"docs":{"count":17,"deleted":0},"store":{"size_in_bytes":235863,"throttle_time_in_millis":0},"indexing":{"index_total":2,"index_time_in_millis":111,"index_current":0,"index_failed":0,"delete_total":0,"delete_time_in_millis":0,"delete_current":0,"noop_update_total":0,"is_throttled":false,"throttle_time_in_millis":0},"get":{"total":2,"time_in_millis":3,"exists_total":2,"exists_time_in_millis":3,"missing_total":0,"missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":84,"query_time_in_millis":70,"query_current":0,"fetch_total":80,"fetch_time_in_millis":91,"fetch_current":0,"scroll_total":0,"scroll_time_in_millis":0,"scroll_current":0,"suggest_total":0,"suggest_time_in_millis":0,"suggest_current":0},"merges":{"current":0,"current_docs":0,"current_size_in_bytes":0,"total":0,"total_time_in_millis":0,"total_docs":0,"total_size_in_bytes":0,"total_stopped_time_in_millis":0,"total_throttled_time_in_millis":0,"total_auto_throttle_in_bytes":545259520},"refresh":{"total":2,"total_time_in_millis":89,"listeners":0},"flush":{"total":0,"total_time_in_millis":0},"warmer":{"current":0,"total":28,"total_time_in_millis":72},"query_cache":{"memory_size_in_bytes":0,"total_count":0,"hit_count":0,"miss_count":0,"cache_size":0,"cache_count":0,"evictions":0},"fielddata":{"memory_size_in_bytes":0,"evictions":0},"completion":{"size_in_bytes":0},"segments":{"count":17,"memory_in_bytes":137618,"terms_memory_in_bytes":130351,"stored_fields_memory_in_bytes":5304,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":384,"points_memory_in_bytes":15,"doc_values_memory_in_bytes":1564,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto_id_timestamp":-1,"file_sizes":{}},"translog":{"operations":2,"size_in_bytes":6072},"request_cache":{"memory_size_in_bytes":12740,"evictions":0,"hit_count":0,"miss_count":20},"recovery":{"current_as_source":0,"current_as_target":0,"throttle_time_in_millis":0}},"os":{"timestamp":1490011241998,"cpu":{"percent":1,"load_average":{"1m":0.18,"5m":0.08,"15m":0.06}},"mem":{"total_in_bytes":8371847168,"free_in_bytes":5678006272,"used_in_bytes":2693840896,"free_percent":68,"used_percent":32},"swap":{"total_in_bytes":0,"free_in_bytes":0,"used_in_bytes":0}},"process":{"timestamp":1490011241998,"open_file_descriptors":220,"max_file_descriptors":66000,"cpu":{"percent":1,"total_in_millis":14800},"mem":{"total_virtual_in_bytes":3171389440}},"jvm":{"timestamp":1490011241998,"uptime_in_millis":205643,"mem":{"heap_used_in_bytes":195922864,"heap_used_percent":37,"heap_committed_in_bytes":519438336,"heap_max_in_bytes":519438336,"non_heap_used_in_bytes":75810224,"non_heap_committed_in_bytes":81326080,"pools":{"young":{"used_in_bytes":96089960,"max_in_bytes":139591680,"peak_used_in_bytes":139591680,"peak_max_in_bytes":139591680},"survivor":{"used_in_bytes":11413088,"max_in_bytes":17432576,"peak_used_in_bytes":17432576,"peak_max_in_bytes":17432576},"old":{"used_in_bytes":88419816,"max_in_bytes"
:362414080,"peak_used_in_bytes":88419816,"peak_max_in_bytes":362414080}}},"threads":{"count":43,"peak_count":45},"gc":{"collectors":{"young":{"collection_count":5,"collection_time_in_millis":164},"old":{"collection_count":1,"collection_time_in_millis":39}}},"buffer_pools":{"direct":{"count":29,"used_in_bytes":70307265,"total_capacity_in_bytes":70307264},"mapped":{"count":17,"used_in_bytes":217927,"total_capacity_in_bytes":217927}},"classes":{"current_loaded_count":10981,"total_loaded_count":10981,"total_unloaded_count":0}},"thread_pool":{"bulk":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"fetch_shard_started":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":26},"fetch_shard_store":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"flush":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"force_merge":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"generic":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":54},"get":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"index":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"listener":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"management":{"threads":5,"queue":0,"active":1,"rejected":0,"largest":5,"completed":203},"refresh":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":550},"search":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":165},"snapshot":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"warmer":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":23}},"fs":{"timestamp":1490011241999,"total":{"total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008},"data":[{"path":"/home/ubuntu/elasticsearch-5.2.0/data/nodes/0","mount":"/ (/dev/xvda1)","type":"ext4","total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008,"spins":"false"}],"io_stats":{"devices":[{"device_name":"xvda1","operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}],"total":{"operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}}},"transport":{"server_open":0,"rx_count":10,"rx_size_in_bytes":3388,"tx_count":10,"tx_size_in_bytes":3388},"http":{"current_open":5,"total_opened":12},"breakers":{"request":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"fielddata":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.03,"tripped":0},"in_flight_requests":{"limit_size_in_bytes":519438336,"limit_size":"495.3mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"parent":{"limit_size_in_bytes":363606835,"limit_size":"346.7mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0}},"script":{"compilations":0,"cache_evictions":0},"discovery":{"cluster_state_queue":{"total":0,"pending":0,"committed":0}},"ingest":{"total":{"count":0,"time_in_millis":0,"current":0,"failed":0},"pipelines":{}}}}}
Unable to form Elasticsearch (5.1.1) cluster on AWS EC2 instances
I am unable to form an ES cluster between 2 master nodes in EC2 instances. Following is the elasticsearch.yml for the nodes.
Node1:
bootstrap.memory_lock: true
cloud.aws.protocol: http
cloud.aws.proxy.host: <Proxy addr>
cloud.aws.proxy.port: <proxy port>
cloud.aws.region: us-east
cluster.name: production-test
discovery.ec2.availability_zones: us-east-1a,us-east-1b,us-east-1d,us-east-1e
discovery.zen.ping_timeout: 30s
discovery.ec2.tag.Name: <ec2-tag name>
discovery.zen.hosts_provider: ec2
#discovery.type: ec2
#discovery.zen.ping.multicast.enabled: false
http.port: 9205
#network.host: _eth0_, _local_, _ec2_
network.host: <private ip_addr>
#network.bind_host: <private ip_addr>
#network.publish_host: <private ip_addr>
node.data: true
node.master: true
plugin.mandatory: discovery-ec2, repository-s3
transport.tcp.port: 9305
#discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>","<private ip_addr of node2>"]
discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>:9305", "<private ip_addr of node2>:9305"]
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
node.name: nodetest1
path.data: /var/lib/elasticsearch/
#path.data: /data/elasticsearch/data/production
path.logs: /var/log/elasticsearch/
path.conf: /etc/elasticsearch
Node 2:
bootstrap.memory_lock: true
cloud.aws.protocol: http
cloud.aws.proxy.host: <Proxy addr>
cloud.aws.proxy.port: <proxy port>
cloud.aws.region: us-east
cluster.name: production-test
discovery.ec2.availability_zones: us-east-1a,us-east-1b,us-east-1d,us-east-1e
discovery.zen.ping_timeout: 30s
discovery.ec2.tag.Name: <ec2-instance tag name>
discovery.zen.hosts_provider: ec2
#discovery.type: ec2
#discovery.zen.ping.multicast.enabled: false
http.port: 9205
#network.host: _eth0_, _local_, _ec2_
network.host: <private ip_addr>
#network.bind_host: <private ip_addr>
#network.publish_host: <private ip_addr>
node.data: true
node.master: true
plugin.mandatory: discovery-ec2, repository-s3
transport.tcp.port: 9305
discovery.zen.ping.unicast.hosts: ["<private ip_addr of node1>:9305","<private ip_addr of node2>:9305"]
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
node.name: nodetest2
#Paths to log, conf, data directories
When both the nodes are started, the following is the log data on both the nodes:
[INFO ][o.e.b.BootstrapCheck ] [nodetest1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[WARN ][o.e.n.Node ] [nodetest1] timed out while waiting for initial discovery state - timeout: 30s
[INFO ][o.e.h.HttpServer ] [nodetest1] publish_address {<private ip_addr of node1>:9205}, bound_addresses {<private ip_addr of node1>:9205}
[INFO ][o.e.n.Node ] [nodetest1] started
[INFO ][o.e.d.z.ZenDiscovery ] [nodetest1] failed to send join request to master [{nodetest}{YcGzQ-4CQtmuuxUGMQJroA}{yuxHmvGPTeK-iw59VTj4ZA}{<private ip_addr of node2>}{<private ip_addr of node2>:9305}{aws_availability_zone=us-east-1d}], reason [RemoteTransportException[[nodetest][<private ip_addr of node2>:9305][internal:discovery/zen/join]]; nested: NotMasterException[Node [{nodetest}{YcGzQ-4CQtmuuxUGMQJroA}{yuxHmvGPTeK-iw59VTj4ZA}{<private ip_addr of node2>}{<private ip_addr of node2>:9305}{aws_availability_zone=us-east-1d}] not master for join request]; ], tried [3] times
I have searched many similar issues and tried to apply the fixes, but I still have the same result. Is there any fault in the elasticsearch.yml file?
curl -XGET <private ip_addr>:9205/_cat/master
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
The two node instances are running ES 5.1.1 and are in the same security group and IAM role. Any suggestions are highly appreciated. Thanks,
Elasticsearch zen discovery - Connection refused: /127.0.0.1:9302
Very new to Elasticsearch and trying out the Zen discovery plugin for the first time. I'm currently using version 5.0.0-alpha5. Here are my current settings:
cluster:
  name: Elastic-POC
node:
  name: ${HOSTNAME}-data
  master: false
  data: true
cloud:
  aws:
    access_key: xxxxxx
    secret_key: xxxxxx
    region: us-west-2
  ec2:
    protocol: http
    access_key: xxxxxx
    secret_key: xxxxxx
discovery:
  type: ec2
  zen.minimum_master_nodes: 1
  ec2.any_group: true
  ec2.groups: sg-xxxxxx
network:
  host: _ec2:privateIp_
The above settings are from the "data" node; it's unable to join the "master" node. I have enabled "TRACE" for the discovery plugin and this is what I got from the log:
[2016-07-12 00:30:39,377][INFO ][env ] [ip-172-29-1-44-data] heap size [15.8gb], compressed ordinary object pointers [true]
[2016-07-12 00:30:40,563][DEBUG][discovery.zen.elect ] [ip-172-29-1-44-data] using minimum_master_nodes [1]
[2016-07-12 00:30:40,913][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:40,914][DEBUG][discovery.zen.ping.unicast] [ip-172-29-1-44-data] using initial hosts [127.0.0.1, [::1]], with concurrent_connects [10]
[2016-07-12 00:30:40,922][DEBUG][discovery.zen ] [ip-172-29-1-44-data] using ping_timeout [3s], join.timeout [1m], master_election.ignore_non_master [false]
[2016-07-12 00:30:40,925][DEBUG][discovery.zen.fd ] [ip-172-29-1-44-data] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2016-07-12 00:30:40,938][DEBUG][discovery.zen.fd ] [ip-172-29-1-44-data] [node ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2016-07-12 00:30:41,250][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:41,250][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using host_type [PRIVATE_IP], tags [{}], groups [[sg-xxxxxx]] with any_group [true], availability_zones [[]]
[2016-07-12 00:30:41,252][INFO ][node ] [ip-172-29-1-44-data] initialized
[2016-07-12 00:30:41,252][INFO ][node ] [ip-172-29-1-44-data] starting ...
[2016-07-12 00:30:41,546][INFO ][transport ] [ip-172-29-1-44-data] publish_address {172.29.1.44:9300}, bound_addresses {172.29.1.44:9300}
[2016-07-12 00:30:41,561][TRACE][discovery.zen ] [ip-172-29-1-44-data] starting an election context, will accumulate joins
[2016-07-12 00:30:41,562][TRACE][discovery.zen ] [ip-172-29-1-44-data] starting to ping
[2016-07-12 00:30:42,477][TRACE][discovery.ec2 ] [ip-172-29-1-44-data] building dynamic unicast discovery nodes...
[2016-07-12 00:30:42,477][DEBUG][discovery.ec2 ] [ip-172-29-1-44-data] using dynamic discovery nodes []
[2016-07-12 00:30:42,480][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-07-12 00:30:42,480][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_2#}{127.0.0.1}{127.0.0.1:9301}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_4#}{127.0.0.1}{127.0.0.1:9303}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_5#}{127.0.0.1}{127.0.0.1:9304}
[2016-07-12 00:30:42,482][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_3#}{127.0.0.1}{127.0.0.1:9302}
[2016-07-12 00:30:42,483][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_6#}{::1}{[::1]:9300}
[2016-07-12 00:30:42,485][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_7#}{::1}{[::1]:9301}
[2016-07-12 00:30:42,485][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_8#}{::1}{[::1]:9302}
[2016-07-12 00:30:42,487][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_9#}{::1}{[::1]:9303}
[2016-07-12 00:30:42,487][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] connecting (light) to {#zen_unicast_10#}{::1}{[::1]:9304}
[2016-07-12 00:30:42,508][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] failed to connect to {#zen_unicast_3#}{127.0.0.1}{127.0.0.1:9302}
ConnectTransportException[[][127.0.0.1:9302] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9302];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:1008)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:972)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:944)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:325)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:301)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.run(UnicastZenPing.java:398)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:392)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /127.0.0.1:9302
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
[2016-07-12 00:30:42,510][TRACE][discovery.zen.ping.unicast] [ip-172-29-1-44-data] [1] failed to connect to {#zen_unicast_7#}{::1}{[::1]:9301}
ConnectTransportException[[][[::1]:9301] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9301];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:1008)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:972)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:944)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:325)
at org.elasticsearch.transport.TransportService.connectToNodeLightAndHandshake(TransportService.java:301)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.run(UnicastZenPing.java:398)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:392)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /0:0:0:0:0:0:0:1:9301
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
I was able to figure it out. All my ES nodes were unable to make calls to the AWS API because all outbound traffic to the public internet was blocked, so discovery fell back to the default host addresses. After enabling outbound internet traffic (via NAT gateways), the nodes were able to find the expected ES hosts.
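A quick way to reproduce that diagnosis from a node (assuming curl is available; us-west-2 is the region from the question's config) is to probe the regional EC2 API endpoint directly:

# any HTTP status code means the EC2 API is reachable;
# a timeout means outbound internet traffic is still blocked
curl -sS -m 10 -o /dev/null -w '%{http_code}\n' https://ec2.us-west-2.amazonaws.com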
elasticsearch: EC2 discovery: master nodes work, data nodes fail
My objective is to run a 6-node cluster on three instances in EC2. I am placing one master-only and one data-only node on each instance (using the elastic ansible playbook). The master nodes from each of the three instances all find each other without issue using EC2 discovery, form a cluster of three, and elect a master. The data nodes from the same instances fail on startup with the error below.
What have I tried:
- switching data nodes to explicit zen.unicast discovery via hostnames works
- I can telnet on port 9301 from instance A->B without issue
REFERENCE:
java version - OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
es version - 2.1.0
data node elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9201
network.host: _ec2:privateDns_
node.data: true
node.master: false
transport.tcp.port: 9301
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1
master node elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-master
Errors from data node startup:
[2016-03-02 15:45:06,246][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initializing ...
[2016-03-02 15:45:06,679][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] loaded [cloud-aws], sites [head]
[2016-03-02 15:45:06,710][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.5gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initialized
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] starting ...
[2016-03-02 15:45:09,678][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9301}, bound_addresses {xxx-xxx-xx-xxx:9301}
[2016-03-02 15:45:09,687][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] my-cluster/PNI6WAmzSYGgZcX2HsqenA
[2016-03-02 15:45:09,701][WARN ][com.amazonaws.jmx.SdkMBeanRegistrySupport] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.registerMBean(MBeans.java:50)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.registerMetricAdminMBean(SdkMBeanRegistrySupport.java:27)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:355)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:09,703][WARN ][com.amazonaws.metrics.AwsSdkMetrics] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.isRegistered(MBeans.java:98)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.isMBeanRegistered(SdkMBeanRegistrySupport.java:46)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:361)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:39,688][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] waited for 30s and no initial state was set by the discovery
[2016-03-02 15:45:39,698][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9201}, bound_addresses {xxx-xxx-xx-xxx:9201}
[2016-03-02 15:45:39,699][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] started
I fixed this by removing the explicit transport.tcp.port setting and using the default behaviour of letting Elasticsearch pick any free port in the range 9300-9399. The warnings from AwsSdkMetrics remain but are NOT an issue, as Val stated.
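For clarity, here is roughly what changes in the data-node elasticsearch.yml from the question (a sketch, not the full file):

http.port: 9201
# transport.tcp.port: 9301   <- removed, so the node binds the first free transport port in 9300-9399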
This is not actually an error; see this issue where it has been reported. It just seems the plugin is logging too much. If you modify your logging.yml config file as suggested in that issue with the following entries, you'll be fine:
# aws will try to do some sketchy JMX stuff, but its not needed.
com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
com.amazonaws.metrics.AwsSdkMetrics: ERROR
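For context, those two entries belong under the logger: section of the ES 2.x logging.yml, so the relevant part of the file ends up looking roughly like this (a sketch, assuming the stock file layout):

logger:
  # aws will try to do some sketchy JMX stuff, but its not needed.
  com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
  com.amazonaws.metrics.AwsSdkMetrics: ERROR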