Elasticsearch 5.2 crashes on Ubuntu 14.04, EC2 t2.large machine - amazon-web-services
I'm trying to run Elasticsearch 5.2 hosted on an EC2 Ubuntu 14.04 machine (a t2.large, which has 8 GB of RAM, the minimum specified by Elastic to run Elasticsearch), but Elasticsearch keeps shutting down unexpectedly.
I haven't been able to work out the cause of the shutdown.
This is the elasticsearch.log:
[2017-03-20T10:07:53,410][INFO ][o.e.p.PluginsService ] [QrRfI_U] loaded module [transport-netty4]
[2017-03-20T10:07:53,411][INFO ][o.e.p.PluginsService ] [QrRfI_U] no plugins loaded
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] initialized
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] [QrRfI_U] starting ...
[2017-03-20T10:07:55,626][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: f6:fd:16:e4:90:62:fe:d6
[2017-03-20T10:07:55,673][INFO ][o.e.t.TransportService ] [QrRfI_U] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2017-03-20T10:07:58,755][INFO ][o.e.c.s.ClusterService ] [QrRfI_U] new_master {QrRfI_U}{QrRfI_UKQxWwvvhvgYxGmQ}{Rne8jnb_S0KVRnXvJj1m2w}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-20T10:07:58,793][INFO ][o.e.h.HttpServer ] [QrRfI_U] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2017-03-20T10:07:58,793][INFO ][o.e.n.Node ] [QrRfI_U] started
[2017-03-20T10:07:59,072][INFO ][o.e.g.GatewayService ] [QrRfI_U] recovered [6] indices into cluster_state
[2017-03-20T10:07:59,724][INFO ][o.e.c.r.a.AllocationService] [QrRfI_U] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[logstash-2017.02.26][4], [logstash-2017.02.26][3], [logstash-2017.02.26][1], [logstash-2017.02.26][0]] ...]).
[2017-03-20T10:50:12,228][INFO ][o.e.c.m.MetaDataMappingService] [QrRfI_U] [logstash-2017.03.20/HXANYkA9RRKne-YAK9cNQg] update_mapping [logs]
[2017-03-20T11:06:55,449][INFO ][o.e.n.Node ] [QrRfI_U] stopping ...
[2017-03-20T11:06:55,514][INFO ][o.e.n.Node ] [QrRfI_U] stopped
[2017-03-20T11:06:55,515][INFO ][o.e.n.Node ] [QrRfI_U] closing ...
[2017-03-20T11:06:55,523][INFO ][o.e.n.Node ] [QrRfI_U] closed
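Since the log above ends with a clean "stopping ..." / "closed" sequence and no stack trace, one check worth doing (a sketch, assuming the default Ubuntu 14.04 log locations; the grep patterns are only illustrative) is whether the kernel OOM killer or some other process asked Elasticsearch to stop:

# Any sign of the OOM killer around the time the node stopped?
dmesg | grep -iE 'killed process|out of memory'
sudo grep -iE 'oom|killed process' /var/log/syslog | tail -n 20

# What else was logged about the elasticsearch process at that time?
sudo grep -i elasticsearch /var/log/syslog | tail -n 50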
When I restart Elasticsearch, these are the node stats after one Logstash input (I've never gotten more than 3 inputs through before Elasticsearch crashes):
Request:
curl -i -XGET 'localhost:9200/_nodes/stats'
Response:
{"_nodes":{"total":1,"successful":1,"failed":0},"cluster_name":"elasticsearch","nodes":{"QrRfI_UKQxWwvvhvgYxGmQ":{"timestamp":1490011241990,"name":"QrRfI_U","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0.1:9300","roles":["master","data","ingest"],"indices":{"docs":{"count":17,"deleted":0},"store":{"size_in_bytes":235863,"throttle_time_in_millis":0},"indexing":{"index_total":2,"index_time_in_millis":111,"index_current":0,"index_failed":0,"delete_total":0,"delete_time_in_millis":0,"delete_current":0,"noop_update_total":0,"is_throttled":false,"throttle_time_in_millis":0},"get":{"total":2,"time_in_millis":3,"exists_total":2,"exists_time_in_millis":3,"missing_total":0,"missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":84,"query_time_in_millis":70,"query_current":0,"fetch_total":80,"fetch_time_in_millis":91,"fetch_current":0,"scroll_total":0,"scroll_time_in_millis":0,"scroll_current":0,"suggest_total":0,"suggest_time_in_millis":0,"suggest_current":0},"merges":{"current":0,"current_docs":0,"current_size_in_bytes":0,"total":0,"total_time_in_millis":0,"total_docs":0,"total_size_in_bytes":0,"total_stopped_time_in_millis":0,"total_throttled_time_in_millis":0,"total_auto_throttle_in_bytes":545259520},"refresh":{"total":2,"total_time_in_millis":89,"listeners":0},"flush":{"total":0,"total_time_in_millis":0},"warmer":{"current":0,"total":28,"total_time_in_millis":72},"query_cache":{"memory_size_in_bytes":0,"total_count":0,"hit_count":0,"miss_count":0,"cache_size":0,"cache_count":0,"evictions":0},"fielddata":{"memory_size_in_bytes":0,"evictions":0},"completion":{"size_in_bytes":0},"segments":{"count":17,"memory_in_bytes":137618,"terms_memory_in_bytes":130351,"stored_fields_memory_in_bytes":5304,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":384,"points_memory_in_bytes":15,"doc_values_memory_in_bytes":1564,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto_id_timestamp":-1,"file_sizes":{}},"translog":{"operations":2,"size_in_bytes":6072},"request_cache":{"memory_size_in_bytes":12740,"evictions":0,"hit_count":0,"miss_count":20},"recovery":{"current_as_source":0,"current_as_target":0,"throttle_time_in_millis":0}},"os":{"timestamp":1490011241998,"cpu":{"percent":1,"load_average":{"1m":0.18,"5m":0.08,"15m":0.06}},"mem":{"total_in_bytes":8371847168,"free_in_bytes":5678006272,"used_in_bytes":2693840896,"free_percent":68,"used_percent":32},"swap":{"total_in_bytes":0,"free_in_bytes":0,"used_in_bytes":0}},"process":{"timestamp":1490011241998,"open_file_descriptors":220,"max_file_descriptors":66000,"cpu":{"percent":1,"total_in_millis":14800},"mem":{"total_virtual_in_bytes":3171389440}},"jvm":{"timestamp":1490011241998,"uptime_in_millis":205643,"mem":{"heap_used_in_bytes":195922864,"heap_used_percent":37,"heap_committed_in_bytes":519438336,"heap_max_in_bytes":519438336,"non_heap_used_in_bytes":75810224,"non_heap_committed_in_bytes":81326080,"pools":{"young":{"used_in_bytes":96089960,"max_in_bytes":139591680,"peak_used_in_bytes":139591680,"peak_max_in_bytes":139591680},"survivor":{"used_in_bytes":11413088,"max_in_bytes":17432576,"peak_used_in_bytes":17432576,"peak_max_in_bytes":17432576},"old":{"used_in_bytes":88419816,"max_in_bytes":362414080,"peak_used_in_bytes":88419816,"peak_max_in_bytes":362414080}}},"threads":{"count":43,"peak_count":45},"gc":{"collectors":{"young":{"collection_count":5,"collection_time_in_millis":164},"old":{"collection_count":1,"collection_time_in_millis":39}}},"buffe
r_pools":{"direct":{"count":29,"used_in_bytes":70307265,"total_capacity_in_bytes":70307264},"mapped":{"count":17,"used_in_bytes":217927,"total_capacity_in_bytes":217927}},"classes":{"current_loaded_count":10981,"total_loaded_count":10981,"total_unloaded_count":0}},"thread_pool":{"bulk":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"fetch_shard_started":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":26},"fetch_shard_store":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"flush":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"force_merge":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"generic":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":54},"get":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"index":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"listener":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"management":{"threads":5,"queue":0,"active":1,"rejected":0,"largest":5,"completed":203},"refresh":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":550},"search":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":165},"snapshot":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"warmer":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":23}},"fs":{"timestamp":1490011241999,"total":{"total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008},"data":[{"path":"/home/ubuntu/elasticsearch-5.2.0/data/nodes/0","mount":"/ (/dev/xvda1)","type":"ext4","total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008,"spins":"false"}],"io_stats":{"devices":[{"device_name":"xvda1","operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}],"total":{"operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}}},"transport":{"server_open":0,"rx_count":10,"rx_size_in_bytes":3388,"tx_count":10,"tx_size_in_bytes":3388},"http":{"current_open":5,"total_opened":12},"breakers":{"request":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"fielddata":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.03,"tripped":0},"in_flight_requests":{"limit_size_in_bytes":519438336,"limit_size":"495.3mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"parent":{"limit_size_in_bytes":363606835,"limit_size":"346.7mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0}},"script":{"compilations":0,"cache_evictions":0},"discovery":{"cluster_state_queue":{"total":0,"pending":0,"committed":0}},"ingest":{"total":{"count":0,"time_in_millis":0,"current":0,"failed":0},"pipelines":{}}}}}
Related
never started elasticsearch node but got ERROR: Skipping security auto configuration
I got token from already running master node with following command bin\elasticsearch-create-enrollment-token -s node I ran other node that I've never strated bin\elasticsearch --enrollment-token <enrollment-token> but I got this error message ERROR: Skipping security auto configuration because it appears that security is already configured. I've never ran this node but It keep saying I did and sudo rm -rf /var/lib/elasticsearch/nodes doesn’t help I set inbound rule in AWS like: IPv4 HTTPS TCP 443 0.0.0.0/0 IPv4 SSH TCP 22 0.0.0.0/0 IPv4 HTTP TCP 80 0.0.0.0/0 IPv4 custom TCP TCP 9200 - 9399 0.0.0.0/0 IPv6 custom TCP TCP 9200 - 9399 ::/0 and I set master node on my instatce following these steps: #1 sudo yum install -y java-1.8.0-openjdk-devel.x86_64 #2 sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch #3 sudo nano /etc/yum.repos.d/elasticsearch.repo #4 save these lines on elasticsearch.repon [elasticsearch] name=Elasticsearch repository for 8.x packages baseurl=https://artifacts.elastic.co/packages/8.x/yum gpgcheck=1 gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch enabled=0 autorefresh=1 type=rpm-md #5 sudo yum install --enablerepo=elasticsearch elasticsearch #6 sudo chown -R elasticsearch:elasticsearch /usr/share/**elasticsearch** sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch sudo chown -R elasticsearch:elasticsearch /var/log/elasticsearch sudo chown -R elasticsearch:elasticsearch /etc/elasticsearch #7 sudo su elasticsearch -s /bin/bash #8 cd /etc/elasticsearch #9 change elasticsearch config settings cluster.name: dev-cluster node.name: ${HOSTNAME} path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch bootstrap.memory_lock: true network.host: _site_ http.port: 9200 discovery.seed_hosts: ["ip-100-10-10-10.ap-northeast-2.compute.internal"] cluster.initial_master_nodes: ["ip-100-10-10-10.ap-northeast-2.compute.internal"] xpack.security.enabled: true xpack.security.enrollment.enabled: true xpack.security.http.ssl: enabled: true keystore.path: certs/http.p12 xpack.security.transport.ssl: enabled: true verification_mode: certificate keystore.path: certs/transport.p12 truststore.path: certs/transport.p12 http.host: 0.0.0.0 transport.host: 0.0.0.0 #10 start elasticsearch sudo systemctl start elasticsearch #11 get password of elastic accound bin/elasticsearch-reset-password --username user1 -i then I stopped master node, started again with command ./bin/elasticsearch -d and I got this log [2022-10-27T17:32:11,581][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] version[8.4.3], pid[31711], build[rpm/42f05b9372a9a4a470db3b52817899b99a76ee73/2022-10-04T07:17:24.662462378Z], OS[Linux/5.10.144-127.601.amzn2.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/18.0.2.1/18.0.2.1+1-1] [2022-10-27T17:32:11,586][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] JVM home [/usr/share/elasticsearch/jdk], using bundled JDK [true] [2022-10-27T17:32:11,587][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] JVM arguments [-Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -Djava.security.manager=allow, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j2.formatMsgNoLookups=true, 
-Djava.locale.providers=SPI,COMPAT, --add-opens=java.base/java.io=ALL-UNNAMED, -XX:+UseG1GC, -Djava.io.tmpdir=/tmp/elasticsearch-14995458373941900172, -XX:+HeapDumpOnOutOfMemoryError, -XX:+ExitOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Xms1959m, -Xmx1959m, -XX:MaxDirectMemorySize=1027604480, -XX:G1HeapRegionSize=4m, -XX:InitiatingHeapOccupancyPercent=30, -XX:G1ReservePercent=15, -Des.distribution.type=rpm, --module-path=/usr/share/elasticsearch/lib, --add-modules=jdk.net, -Djdk.module.main=org.elasticsearch.server] [2022-10-27T17:32:13,376][INFO ][c.a.c.i.j.JacksonVersion ] [ip-172-31-44-89.ap-northeast-2.compute.internal] Package versions: jackson-annotations=2.13.2, jackson-core=2.13.2, jackson-databind=2.13.2.2, jackson-dataformat-xml=2.13.2, jackson-datatype-jsr310=2.13.2, azure-core=1.27.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot [2022-10-27T17:32:14,680][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [aggs-matrix-stats] [2022-10-27T17:32:14,681][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [analysis-common] [2022-10-27T17:32:14,681][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [constant-keyword] [2022-10-27T17:32:14,682][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [data-streams] [2022-10-27T17:32:14,682][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [frozen-indices] [2022-10-27T17:32:14,682][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [ingest-attachment] [2022-10-27T17:32:14,683][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [ingest-common] [2022-10-27T17:32:14,683][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [ingest-geoip] [2022-10-27T17:32:14,683][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [ingest-user-agent] [2022-10-27T17:32:14,684][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [kibana] [2022-10-27T17:32:14,684][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [lang-expression] [2022-10-27T17:32:14,684][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [lang-mustache] [2022-10-27T17:32:14,685][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [lang-painless] [2022-10-27T17:32:14,685][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [legacy-geo] [2022-10-27T17:32:14,685][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [mapper-extras] [2022-10-27T17:32:14,686][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [mapper-version] [2022-10-27T17:32:14,686][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [old-lucene-versions] [2022-10-27T17:32:14,686][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [parent-join] 
[2022-10-27T17:32:14,686][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [percolator] [2022-10-27T17:32:14,687][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [rank-eval] [2022-10-27T17:32:14,687][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [reindex] [2022-10-27T17:32:14,687][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repositories-metering-api] [2022-10-27T17:32:14,688][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repository-azure] [2022-10-27T17:32:14,688][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repository-encrypted] [2022-10-27T17:32:14,688][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repository-gcs] [2022-10-27T17:32:14,688][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repository-s3] [2022-10-27T17:32:14,689][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [repository-url] [2022-10-27T17:32:14,689][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [runtime-fields-common] [2022-10-27T17:32:14,689][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [search-business-rules] [2022-10-27T17:32:14,690][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [searchable-snapshots] [2022-10-27T17:32:14,690][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [snapshot-based-recoveries] [2022-10-27T17:32:14,690][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [snapshot-repo-test-kit] [2022-10-27T17:32:14,690][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [spatial] [2022-10-27T17:32:14,691][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [systemd] [2022-10-27T17:32:14,691][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [transform] [2022-10-27T17:32:14,691][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [transport-netty4] [2022-10-27T17:32:14,692][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [unsigned-long] [2022-10-27T17:32:14,692][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [vector-tile] [2022-10-27T17:32:14,692][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [wildcard] [2022-10-27T17:32:14,692][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-aggregate-metric] [2022-10-27T17:32:14,693][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-analytics] [2022-10-27T17:32:14,693][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-async] [2022-10-27T17:32:14,693][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-async-search] [2022-10-27T17:32:14,693][INFO ][o.e.p.PluginsService ] 
[ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-autoscaling] [2022-10-27T17:32:14,694][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-ccr] [2022-10-27T17:32:14,694][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-core] [2022-10-27T17:32:14,694][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-deprecation] [2022-10-27T17:32:14,694][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-enrich] [2022-10-27T17:32:14,695][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-eql] [2022-10-27T17:32:14,695][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-fleet] [2022-10-27T17:32:14,695][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-graph] [2022-10-27T17:32:14,695][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-identity-provider] [2022-10-27T17:32:14,696][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-ilm] [2022-10-27T17:32:14,696][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-logstash] [2022-10-27T17:32:14,696][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-ml] [2022-10-27T17:32:14,696][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-monitoring] [2022-10-27T17:32:14,697][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-ql] [2022-10-27T17:32:14,697][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-rollup] [2022-10-27T17:32:14,697][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-security] [2022-10-27T17:32:14,697][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-shutdown] [2022-10-27T17:32:14,698][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-sql] [2022-10-27T17:32:14,698][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-stack] [2022-10-27T17:32:14,698][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-text-structure] [2022-10-27T17:32:14,698][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-voting-only-node] [2022-10-27T17:32:14,699][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] loaded module [x-pack-watcher] [2022-10-27T17:32:14,699][INFO ][o.e.p.PluginsService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] no plugins loaded [2022-10-27T17:32:17,885][INFO ][o.e.e.NodeEnvironment ] [ip-172-31-44-89.ap-northeast-2.compute.internal] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [12gb], net total_space [14.9gb], types [xfs] [2022-10-27T17:32:17,885][INFO ][o.e.e.NodeEnvironment ] [ip-172-31-44-89.ap-northeast-2.compute.internal] heap size [1.9gb], compressed ordinary object pointers [true] [2022-10-27T17:32:17,935][INFO ][o.e.n.Node ] 
[ip-172-31-44-89.ap-northeast-2.compute.internal] node name [ip-172-31-44-89.ap-northeast-2.compute.internal], node ID [zl8ftWOcT_OJImkuzARXzw], cluster name [kp-dev-cluster], roles [transform, data_content, data_warm, master, remote_cluster_client, data, data_cold, ingest, data_frozen, ml, data_hot] [2022-10-27T17:32:21,155][INFO ][o.e.x.s.Security ] [ip-172-31-44-89.ap-northeast-2.compute.internal] Security is enabled [2022-10-27T17:32:21,504][INFO ][o.e.x.s.a.s.FileRolesStore] [ip-172-31-44-89.ap-northeast-2.compute.internal] parsed [0] roles from file [/etc/elasticsearch/roles.yml] [2022-10-27T17:32:21,998][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [ip-172-31-44-89.ap-northeast-2.compute.internal] [controller/31749] [Main.cc#123] controller (64 bit): Version 8.4.3 (Build 9c00cf51c9fea9) Copyright (c) 2022 Elasticsearch BV [2022-10-27T17:32:22,558][INFO ][o.e.t.n.NettyAllocator ] [ip-172-31-44-89.ap-northeast-2.compute.internal] creating NettyAllocator with the following configs: [name=elasticsearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={es.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=4mb}] [2022-10-27T17:32:22,588][INFO ][o.e.i.r.RecoverySettings ] [ip-172-31-44-89.ap-northeast-2.compute.internal] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b] [2022-10-27T17:32:22,630][INFO ][o.e.d.DiscoveryModule ] [ip-172-31-44-89.ap-northeast-2.compute.internal] using discovery type [multi-node] and seed hosts providers [settings] [2022-10-27T17:32:24,113][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] initialized [2022-10-27T17:32:24,114][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] starting ... [2022-10-27T17:32:24,127][INFO ][o.e.x.s.c.f.PersistentCache] [ip-172-31-44-89.ap-northeast-2.compute.internal] persistent cache index loaded [2022-10-27T17:32:24,128][INFO ][o.e.x.d.l.DeprecationIndexingComponent] [ip-172-31-44-89.ap-northeast-2.compute.internal] deprecation component started [2022-10-27T17:32:24,230][INFO ][o.e.t.TransportService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] publish_address {172.31.44.89:9300}, bound_addresses {[::]:9300} [2022-10-27T17:32:24,524][INFO ][o.e.b.BootstrapChecks ] [ip-172-31-44-89.ap-northeast-2.compute.internal] bound or publishing to a non-loopback address, enforcing bootstrap checks [2022-10-27T17:32:24,545][WARN ][o.e.c.c.ClusterBootstrapService] [ip-172-31-44-89.ap-northeast-2.compute.internal] this node is locked into cluster UUID [1V1YFJcKRh6FUZvuKcSzWQ] but [cluster.initial_master_nodes] is set to [ip-172-31-44-89.ap-northeast-2.compute.internal]; remove this setting to avoid possible data loss caused by subsequent cluster bootstrap attempts [2022-10-27T17:32:24,657][INFO ][o.e.c.s.MasterService ] [ip-172-31-44-89.ap-northeast-2.compute.internal] elected-as-master ([1] nodes joined)[_FINISH_ELECTION_, {ip-172-31-44-89.ap-northeast-2.compute.internal}{zl8ftWOcT_OJImkuzARXzw}{3Ak1MYQsROiwqqdy5sp47w}{ip-172-31-44-89.ap-northeast-2.compute.internal}{172.31.44.89}{172.31.44.89:9300}{cdfhilmrstw} completing election], term: 3, version: 38, delta: master node changed {previous [], current [{ip-172-31-44-89.ap-northeast-2.compute.internal}{zl8ftWOcT_OJImkuzARXzw}{3Ak1MYQsROiwqqdy5sp47w}{ip-172-31-44-89.ap-northeast-2.compute.internal}{172.31.44.89}{172.31.44.89:9300}{cdfhilmrstw}]} [2022-10-27T17:32:24,703][INFO ][o.e.c.s.ClusterApplierService] [ip-172-31-44-89.ap-northeast-2.compute.internal] master node 
changed {previous [], current [{ip-172-31-44-89.ap-northeast-2.compute.internal}{zl8ftWOcT_OJImkuzARXzw}{3Ak1MYQsROiwqqdy5sp47w}{ip-172-31-44-89.ap-northeast-2.compute.internal}{172.31.44.89}{172.31.44.89:9300}{cdfhilmrstw}]}, term: 3, version: 38, reason: Publication{term=3, version=38} [2022-10-27T17:32:24,781][INFO ][o.e.r.s.FileSettingsService] [ip-172-31-44-89.ap-northeast-2.compute.internal] starting file settings watcher ... [2022-10-27T17:32:24,798][INFO ][o.e.r.s.FileSettingsService] [ip-172-31-44-89.ap-northeast-2.compute.internal] file settings service up and running [tid=42] [2022-10-27T17:32:24,805][INFO ][o.e.h.AbstractHttpServerTransport] [ip-172-31-44-89.ap-northeast-2.compute.internal] publish_address {172.31.44.89:9200}, bound_addresses {[::]:9200} [2022-10-27T17:32:24,806][INFO ][o.e.n.Node ] [ip-172-31-44-89.ap-northeast-2.compute.internal] started {ip-172-31-44-89.ap-northeast-2.compute.internal}{zl8ftWOcT_OJImkuzARXzw}{3Ak1MYQsROiwqqdy5sp47w}{ip-172-31-44-89.ap-northeast-2.compute.internal}{172.31.44.89}{172.31.44.89:9300}{cdfhilmrstw}{ml.max_jvm_size=2055208960, ml.allocated_processors=2, xpack.installed=true, ml.machine_memory=4110344192} and this is my second node’s config cluster.name: dev-cluster node.name: ${HOSTNAME} path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch bootstrap.memory_lock: true network.host: _site_ http.port: 9200 discovery.seed_hosts: ["ip-100-10-10-10.ap-northeast-2.compute.internal"] cluster.initial_master_nodes: ["ip-100-10-10-10.ap-northeast-2.compute.internal"] xpack.security.enabled: true xpack.security.enrollment.enabled: true xpack.security.http.ssl: enabled: true keystore.path: certs/http.p12 xpack.security.transport.ssl: enabled: true verification_mode: certificate keystore.path: certs/transport.p12 truststore.path: certs/transport.p12 http.host: 0.0.0.0 transport.host: 0.0.0.0 I installed second node following exactly same step I used when installing master node except running node and setting password what's wrong with this
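For reference, the flattened elasticsearch.yml settings quoted above, laid out as they would appear in the file (the keys and values are exactly those given in the question; only the indentation is reconstructed):

cluster.name: dev-cluster
node.name: ${HOSTNAME}
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _site_
http.port: 9200
discovery.seed_hosts: ["ip-100-10-10-10.ap-northeast-2.compute.internal"]
cluster.initial_master_nodes: ["ip-100-10-10-10.ap-northeast-2.compute.internal"]
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
http.host: 0.0.0.0
transport.host: 0.0.0.0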
AWS EC2 Amazon Linux 2 instances failed to boot after applying OS updates
Yesterday we lost contact with 10 identically configured servers, after some investigation the conclusion was that a reboot after security updates had failed. We have so far not been able to get any of the servers back online, but were lucky enough to be able to reinstall the instances without data loss. I will paste the console log below, can anyone help me determine the root cause and perhaps give me some advice on if there is a better way to configure the server to make recovery easier (like getting past the "Press Enter to continue." prompt, that it seems to hang in). The full log is too big for SO, so I put it on pastebin and pasted a redacted version below. I have removed the escape sequences that colorize the output and removed some double new lines, but besides that it is complete. [ 0.000000] Linux version 4.14.200-155.322.amzn2.x86_64 (mockbuild#ip-10-0-1-230) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-10) (GCC)) #1 SMP Thu Oct 15 20:11:12 UTC 2020 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8 [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] SMBIOS 2.7 present. [ 0.000000] DMI: Amazon EC2 t3.micro/, BIOS 1.0 10/16/2017 [ 0.000000] Hypervisor detected: KVM [ 0.000000] tsc: Fast TSC calibration using PIT [ 0.000000] e820: last_pfn = 0x3e3fa max_arch_pfn = 0x400000000 [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [ 0.000000] Scanning 1 areas for low memory corruption [ 0.000000] Using GB pages for direct mapping [ 0.000000] RAMDISK: [mem 0x3433e000-0x36196fff] [ 0.000000] ACPI: Early table checksum verification disabled [ 0.000000] ACPI: RSDP 0x00000000000F8F80 000014 (v00 AMAZON) [ 0.000000] ACPI: RSDT 0x000000003E3FE360 00003C (v01 AMAZON AMZNRSDT 00000001 AMZN 00000001) [ 0.000000] ACPI: FACS 0x000000003E3FFF40 000040 [ 0.000000] ACPI: SSDT 0x000000003E3FF6C0 00087A (v01 AMAZON AMZNSSDT 00000001 AMZN 00000001) [ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs [ 0.000000] e820: [mem 0x40000000-0xdfffffff] available for PCI devices [ 0.000000] Booting paravirtualized kernel on KVM [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8 [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes) [ 0.000000] Memory: 943540K/1019488K available (10252K kernel code, 1958K rwdata, 2780K rodata, 2088K init, 4240K bss, 75948K reserved, 0K cma-reserved) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1 [ 0.000000] Kernel/User page tables isolation: enabled [ 0.000000] ftrace: allocating 26683 entries in 105 pages [ 0.004000] Hierarchical RCU implementation. [ 0.004000] RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=2. 
[ 0.004000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2 [ 0.004000] NR_IRQS: 524544, nr_irqs: 440, preallocated irqs: 16 [ 0.004000] Console: colour VGA+ 80x25 [ 0.004000] console [tty0] enabled [ 0.004000] console [ttyS0] enabled [ 0.004005] tsc: Detected 2500.000 MHz processor [ 0.007582] Calibrating delay loop (skipped) preset value.. 5000.00 BogoMIPS (lpj=10000000) [ 0.008002] pid_max: default: 32768 minimum: 301 [ 0.012006] ACPI: Core revision 20170728 [ 0.016560] ACPI: 2 ACPI AML tables successfully acquired and loaded [ 0.020015] Security Framework initialized [ 0.024002] SELinux: Initializing. [ 0.028159] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes) [ 0.032082] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes) [ 0.036012] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes) [ 0.040006] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes) [ 0.044325] Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8 [ 0.048003] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4 [ 0.052003] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization [ 0.056003] Spectre V2 : Mitigation: Full generic retpoline [ 0.060002] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch [ 0.064002] Speculative Store Bypass: Vulnerable [ 0.067720] TAA: Vulnerable: Clear CPU buffers attempted, no microcode [ 0.068002] MDS: Vulnerable: Clear CPU buffers attempted, no microcode [ 0.072086] Freeing SMP alternatives memory: 24K [ 0.076807] smpboot: Max logical packages: 1 [ 0.080264] x2apic enabled [ 0.084003] Switched APIC routing to physical x2apic. [ 0.088000] ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1 [ 0.088000] smpboot: CPU0: Intel(R) Xeon(R) Platinum 8175M CPU # 2.50GHz (family: 0x6, model: 0x55, stepping: 0x4) [ 0.088074] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only. [ 0.092046] Hierarchical SRCU implementation. [ 0.095857] NMI watchdog: Perf event create on CPU 0 failed with -2 [ 0.096002] NMI watchdog: Perf NMI watchdog permanently disabled [ 0.100049] smp: Bringing up secondary CPUs ... [ 0.103696] x86: Booting SMP configuration: [ 0.104003] .... node #0, CPUs: #1 [ 0.004000] kvm-clock: cpu 1, msr 0:3e357041, secondary cpu clock [ 0.106853] KVM setup async PF for cpu 1 [ 0.107214] kvm-stealtime: cpu 1, msr 3e1161c0 [ 0.112307] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details. [ 0.116006] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details. 
[ 0.120007] smp: Brought up 1 node, 2 CPUs [ 0.123417] smpboot: Total of 2 processors activated (10000.00 BogoMIPS) [ 0.124320] devtmpfs: initialized [ 0.126970] x86/mm: Memory block size: 128MB [ 0.128137] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns [ 0.132008] futex hash table entries: 512 (order: 3, 32768 bytes) [ 0.136156] NET: Registered protocol family 16 [ 0.139769] cpuidle: using governor ladder [ 0.140013] cpuidle: using governor menu [ 0.143281] ACPI: bus type PCI registered [ 0.144000] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 0.148144] PCI: Using configuration type 1 for base access [ 0.156770] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages [ 0.160017] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages [ 0.164044] ACPI: Added _OSI(Module Device) [ 0.168007] ACPI: Added _OSI(Processor Device) [ 0.172007] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.176004] ACPI: Added _OSI(Processor Aggregator Device) [ 0.180007] ACPI: Interpreter enabled [ 0.184011] ACPI: (supports S0 S4 S5) [ 0.187094] ACPI: Using IOAPIC for interrupt routing [ 0.188018] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 0.300750] ACPI: Enabled 16 GPEs in block 00 to 0F [ 0.308023] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff]) [ 0.312007] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI] [ 0.316010] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM [ 0.320007] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. [ 0.328324] acpiphp: Slot [3] registered [ 0.420040] acpiphp: Slot [31] registered [ 0.424003] PCI host bridge to bus 0000:00 [ 0.536451] pci 0000:00:03.0: vgaarb: setting as boot VGA device [ 0.540000] pci 0000:00:03.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none [ 0.548009] pci 0000:00:03.0: vgaarb: bridge control possible [ 0.551996] vgaarb: loaded [ 0.556090] EDAC MC: Ver: 3.0.0 [ 0.559140] PCI: Using ACPI for IRQ routing [ 0.560280] NetLabel: Initializing [ 0.563268] NetLabel: domain hash size = 128 [ 0.568019] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO [ 0.571902] NetLabel: unlabeled traffic allowed by default [ 0.576145] clocksource: Switched to clocksource kvm-clock [ 0.586755] VFS: Disk quotas dquot_6.6.0 [ 0.590090] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 0.594562] pnp: PnP ACPI init [ 0.597855] pnp: PnP ACPI: found 5 devices [ 0.608231] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns [ 0.614881] NET: Registered protocol family 2 [ 0.618324] TCP established hash table entries: 8192 (order: 4, 65536 bytes) [ 0.622749] TCP bind hash table entries: 8192 (order: 5, 131072 bytes) [ 0.626965] TCP: Hash tables configured (established 8192 bind 8192) [ 0.631170] UDP hash table entries: 512 (order: 2, 16384 bytes) [ 0.635163] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes) [ 0.639358] NET: Registered protocol family 1 [ 0.642779] pci 0000:00:00.0: Limiting direct PCI/PCI transfers [ 0.646797] pci 0000:00:01.0: Activating ISA DMA hang workarounds [ 0.651113] pci 0000:00:03.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff] [ 0.657825] Unpacking initramfs... 
[ 0.734208] Freeing initrd memory: 31076K [ 0.737636] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x240939f1bb2, max_idle_ns: 440795263295 ns [ 0.745181] Scanning for low memory corruption every 60 seconds [ 0.750602] audit: initializing netlink subsys (disabled) [ 0.754606] audit: type=2000 audit(1603879247.564:1): state=initialized audit_enabled=0 res=1 [ 0.754917] Initialise system trusted keyrings [ 0.764927] Key type blacklist registered [ 0.768266] workingset: timestamp_bits=36 max_order=18 bucket_order=0 [ 0.773861] zbud: loaded [ 0.905903] Key type asymmetric registered [ 0.909292] Asymmetric key parser 'x509' registered [ 0.912915] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251) [ 0.918972] io scheduler noop registered (default) [ 0.922543] io scheduler cfq registered [ 0.925904] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 [ 0.964594] crc32c_combine: 8373 self tests passed [ 0.968628] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled [ 1.000785] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A [ 1.007649] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12 [ 1.014310] i8042: Warning: Keylock active [ 1.018572] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 1.022414] serio: i8042 AUX port at 0x60,0x64 irq 12 [ 1.026284] rtc_cmos 00:00: RTC can wake from S4 [ 1.030475] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0 [ 1.034755] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram [ 1.038955] hidraw: raw HID events driver (C) Jiri Kosina [ 1.042936] NET: Registered protocol family 17 [ 1.046622] mce: Using 32 MCE banks [ 1.049627] sched_clock: Marking stable (1049607566, 0)->(1755024155, -705416589) [ 1.056014] registered taskstats version 1 [ 1.059279] Loading compiled-in X.509 certificates [ 1.064832] Loaded X.509 cert 'Build time autogenerated kernel key: 121ffea65ca15230f4a21fe7e5b65abaabaa433c' [ 1.072013] zswap: loaded using pool lzo/zbud [ 1.075526] ima: No TPM chip found, activating TPM-bypass! (rc=-19) [ 1.079746] ima: Allocated hash algorithm: sha1 [ 1.083589] rtc_cmos 00:00: setting system clock to 2020-10-28 09:59:31 UTC (1603879171) [ 1.091820] Freeing unused kernel memory: 2088K [ 1.116102] Write protecting the kernel read-only data: 16384k [ 1.120697] Freeing unused kernel memory: 2016K [ 1.126528] Freeing unused kernel memory: 1316K [ 1.160972] systemd[1]: Inserted module 'autofs4' [ 1.176133] NET: Registered protocol family 10 [ 1.181508] Segment Routing with IPv6 [ 1.184828] systemd[1]: Inserted module 'ipv6' [ 1.189116] random: systemd: uninitialized urandom read (16 bytes read) [ 1.193763] random: systemd: uninitialized urandom read (16 bytes read) [ 1.198171] random: systemd: uninitialized urandom read (16 bytes read) [ 1.205354] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) [ 1.217384] systemd[1]: Detected virtualization kvm. [ 1.221077] systemd[1]: Detected architecture x86-64. [ 1.224774] systemd[1]: Running in initial RAM disk. Welcome to Amazon Linux 2 dracut-033-535.amzn2.1.3 (Initramfs) [ 1.230712] systemd[1]: No hostname configured. [ 1.234213] systemd[1]: Set hostname to <localhost>. [ 1.237934] systemd[1]: Initializing machine ID from KVM UUID. [ OK ] Reached target Swap. [ 1.265844] systemd[1]: Reached target Swap. [ 1.269312] systemd[1]: Starting Swap. [ OK ] Created slice Root Slice. 
[ 1.274036] systemd[1]: Created slice Root Slice. [ OK ] Listening on Journal Socket. [ OK ] Reached target Timers. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Reached target Local File Systems. [ OK ] Listening on udev Control Socket. [ OK ] Created slice System Slice. Starting Setup Virtual Console... Starting Journal Service... Starting Create list of required st... nodes for the current kernel... Starting Apply Kernel Variables... [ OK ] Reached target Slices. [ OK ] Listening on udev Kernel Socket. [ OK ] Reached target Sockets. Starting dracut cmdline hook... [ OK ] Started Setup Virtual Console. [ OK ] Started Create list of required sta...ce nodes for the current kernel. [ OK ] Started Apply Kernel Variables. Starting Create Static Device Nodes in /dev... [ OK ] Started Create Static Device Nodes in /dev. [ OK ] Started Journal Service. [ OK ] Started dracut cmdline hook. Starting dracut pre-udev hook... [ 1.390579] device-mapper: uevent: version 1.0.3 [ 1.394255] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel#redhat.com [ OK ] Started dracut pre-udev hook. Starting udev Kernel Device Manager... [ OK ] Started udev Kernel Device Manager. Starting dracut pre-trigger hook... [ OK ] Started dracut pre-trigger hook. Starting udev Coldplug all Devices... [ OK ] Started udev Coldplug all Devices. Starting Show Plymouth Boot Screen... [ OK ] Reached target System Initialization. Starting dracut initqueue hook... [ 1.534629] nvme nvme0: pci function 0000:00:04.0 [ OK ] Started Show Plymouth Boot Screen. [ OK ] Reached target Paths. [ OK ] Reached target Basic System. [ 1.543815] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11 [ 1.546543] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0 [ 1.556607] nvme nvme1: pci function 0000:00:1f.0 [ 1.557854] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10 [ 1.576394] AVX2 version of gcm_enc/dec engaged. [ 1.580503] AES CTR mode by8 optimization enabled [ 1.601321] alg: No test for pcbc(aes) (pcbc-aes-aesni) [ 1.776495] nvme0n1: p1 p128 [ 1.908576] random: fast init done [ OK ] Found device /dev/disk/by-uuid/a1e1011e-e38f-408e-878b-fed395b47ad6. Starting File System Check on /dev/...e-e38f-408e-878b-fed395b47ad6... [ OK ] Started File System Check on /dev/d...11e-e38f-408e-878b-fed395b47ad6. [ OK ] Started dracut initqueue hook. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. Starting dracut pre-mount hook... [ OK ] Started dracut pre-mount hook. Mounting /sysroot... [ 2.235770] SGI XFS with ACLs, security attributes, no debug enabled [ 2.242333] XFS (nvme0n1p1): Mounting V5 Filesystem [ 4.142597] XFS (nvme0n1p1): Ending clean mount [ OK ] Mounted /sysroot. [ OK ] Reached target Initrd Root File System. Starting Reload Configuration from the Real Root... [ OK ] Started Reload Configuration from the Real Root. [ OK ] Reached target Initrd File Systems. [ OK ] Reached target Initrd Default Target. Starting dracut pre-pivot and cleanup hook... [ OK ] Started dracut pre-pivot and cleanup hook. Starting Cleaning Up and Shutting Down Daemons... [ OK ] Stopped Cleaning Up and Shutting Down Daemons. [ OK ] Stopped target Timers. [ OK ] Stopped dracut pre-pivot and cleanup hook. Stopping dracut pre-pivot and cleanup hook... [ OK ] Stopped target Remote File Systems. [ OK ] Stopped target Remote File Systems (Pre). [ OK ] Stopped target Initrd Default Target. Starting Plymouth switch root service... [ OK ] Stopped dracut pre-mount hook. 
Stopping dracut pre-mount hook... [ OK ] Stopped dracut initqueue hook. Stopping dracut initqueue hook... [ OK ] Stopped target Basic System. [ OK ] Stopped target Sockets. [ OK ] Stopped target System Initialization. [ OK ] Stopped target Swap. [ OK ] Stopped target Local File Systems. [ OK ] Stopped Apply Kernel Variables. Stopping Apply Kernel Variables... [ OK ] Stopped target Local Encrypted Volumes. [ OK ] Stopped udev Coldplug all Devices. Stopping udev Coldplug all Devices... [ OK ] Stopped dracut pre-trigger hook. Stopping dracut pre-trigger hook... Stopping udev Kernel Device Manager... [ OK ] Stopped target Slices. [ OK ] Stopped target Paths. [ OK ] Stopped udev Kernel Device Manager. [ OK ] Stopped Create Static Device Nodes in /dev. Stopping Create Static Device Nodes in /dev... [ OK ] Stopped Create list of required sta...ce nodes for the current kernel. Stopping Create list of required st... nodes for the current kernel... [ OK ] Stopped dracut pre-udev hook. Stopping dracut pre-udev hook... [ OK ] Stopped dracut cmdline hook. Stopping dracut cmdline hook... [ OK ] Closed udev Kernel Socket. [ OK ] Closed udev Control Socket. Starting Cleanup udevd DB... [ OK ] Started Cleanup udevd DB. [ OK ] Reached target Switch Root. [ 4.553875] systemd-journald[667]: Received SIGTERM from PID 1 (systemd). [ OK ] Started Plymouth switch root service. Starting Switch Root... [ 4.885212] systemd: 30 output lines suppressed due to ratelimiting [ 5.925390] SELinux: Disabled at runtime. [ 5.980115] audit: type=1404 audit(1603879176.396:2): selinux=0 auid=4294967295 ses=4294967295 [ 6.083250] ip_tables: (C) 2000-2006 Netfilter Core Team [ 6.106470] systemd[1]: Inserted module 'ip_tables' Welcome to Amazon Linux 2 [ OK ] Stopped Switch Root. [ OK ] Stopped Journal Service. Starting Journal Service... [ OK ] Reached target Swap. [ OK ] Listening on Delayed Shutdown Socket. Mounting Huge Pages File System... [ OK ] Stopped target Switch Root. [ OK ] Stopped target Initrd Root File System. [ OK ] Created slice system-getty.slice. [ OK ] Listening on udev Control Socket. [ OK ] Listening on Device-mapper event daemon FIFOs. [ OK ] Created slice User and Session Slice. Starting Create list of required st... nodes for the current kernel... [ OK ] Listening on LVM2 poll daemon socket. [ OK ] Stopped target Initrd File Systems. [ OK ] Listening on udev Kernel Socket. Mounting Debug File System... [ OK ] Reached target Slices. [ OK ] Listening on LVM2 metadata daemon socket. Mounting POSIX Message Queue File System... [ OK ] Created slice system-selinux\x2dpol...grate\x2dlocal\x2dchanges.slice. Starting Monitoring of LVM2 mirrors... dmeventd or progress polling... [ OK ] Created slice system-serial\x2dgetty.slice. Starting Read and set NIS domainname from /etc/sysconfig/network... [ OK ] Listening on /dev/initctl Compatibility Named Pipe. [ OK ] Set up automount Arbitrary Executab...ats File System Automount Point. Starting Remount Root and Kernel File Systems... [ OK ] Started Journal Service. [ OK ] Mounted Debug File System. [ OK ] Mounted POSIX Message Queue File System. [ OK ] Mounted Huge Pages File System. [ OK ] Started Create list of required sta...ce nodes for the current kernel. [ OK ] Started Remount Root and Kernel File Systems. [ OK ] Started Read and set NIS domainname from /etc/sysconfig/network. Starting udev Coldplug all Devices... Starting Configure read-only root support... Starting Relabel kernel modules early in the boot, if needed... Starting Create Static Device Nodes in /dev... 
Starting Flush Journal to Persistent Storage... [ OK ] Started Relabel kernel modules early in the boot, if needed. Starting Load Kernel Modules... [ 7.047237] systemd-journald[1398]: Received request to flush runtime journal from PID 1 [ 7.069936] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.2.10g [ 7.084119] ena: ena device version: 0.10 [ 7.089001] ena: ena controller version: 0.0.1 implementation version 1 [ OK ] Started Configure read-only root support. Starting Load/Save Random Seed... [ OK ] Started Load/Save Random Seed. [ 7.156042] ena 0000:00:05.0: LLQ is not supported Fallback to host mode policy. [ OK ] Started udev Coldplug all Devices. Starting udev Wait for Complete Device Initialization... [ 7.181318] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem febf4000, mac addr 0a:cf:65:4e:dd:ff [ OK ] Started Load Kernel Modules. Starting Apply Kernel Variables... [ OK ] Started LVM2 metadata daemon. Starting LVM2 metadata daemon... [ OK ] Started Apply Kernel Variables. [ OK ] Started Create Static Device Nodes in /dev. Starting udev Kernel Device Manager... [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Started udev Kernel Device Manager. [ OK ] Found device /dev/ttyS0. [ 7.776329] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3 [ 7.783413] ACPI: Power Button [PWRF] [ 7.786723] input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input4 [ 7.793032] ACPI: Sleep Button [SLPF] Starting Relabel kernel modules early in the boot, if needed... [ OK ] Created slice system-ec2net\x2difup.slice. [ OK ] Started Relabel kernel modules early in the boot, if needed. [ 7.888784] input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input5 [ 7.904661] mousedev: PS/2 mouse device common for all mice [ OK ] Started udev Wait for Complete Device Initialization. Starting Activation of DM RAID sets... [ OK ] Started Activation of DM RAID sets. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Started Monitoring of LVM2 mirrors,...ng dmeventd or progress polling. [ OK ] Reached target Local File Systems (Pre). [ 59.305661] random: crng init done [ 59.308921] random: 7 urandom warning(s) missed due to ratelimiting [ TIME ] Timed out waiting for device dev-sdf.device. [DEPEND] Dependency failed for /home/storage. [DEPEND] Dependency failed for Local File Systems. [DEPEND] Dependency failed for Mark the need to relabel after reboot. [DEPEND] Dependency failed for Relabel all filesystems, if necessary. [DEPEND] Dependency failed for Migrate local... structure to the new structure. Starting Preprocess NFS configuration... [ OK ] Reached target Timers. [ OK ] Reached target Network (Pre). [ OK ] Reached target Login Prompts. [ OK ] Reached target Cloud-init target. Starting Initial hibernation setup job... Starting Initial cloud-init job (metadata service crawler)... [ OK ] Reached target Network. [ OK ] Reached target Paths. [ OK ] Reached target Sockets. Starting Create Volatile Files and Directories... [ OK ] Started Emergency Shell. Starting Emergency Shell... [ OK ] Reached target Emergency Mode. Starting Tell Plymouth To Write Out Runtime Data... [ OK ] Started Preprocess NFS configuration. [ OK ] Started Create Volatile Files and Directories. Mounting RPC Pipe File System... Starting Security Auditing Service... Starting RPC bind service... [ 97.160193] RPC: Registered named UNIX socket transport module. [ 97.160194] RPC: Registered udp transport module. [ 97.160194] RPC: Registered tcp transport module. 
[ 97.160195] RPC: Registered tcp NFSv4.1 backchannel transport module. [ OK ] Mounted RPC Pipe File System. [ OK ] Reached target rpc_pipefs.target. [ OK ] Reached target NFS client services. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. [ OK ] Started Tell Plymouth To Write Out Runtime Data. [ OK ] Started RPC bind service. [ OK ] Started Security Auditing Service. Starting Update UTMP about System Boot/Shutdown... [ OK ] Started Update UTMP about System Boot/Shutdown. Starting Update UTMP about System Runlevel Changes... [ OK ] Started Update UTMP about System Runlevel Changes. [ 99.871085] hibinit-agent[1855]: Traceback (most recent call last): [ 99.871339] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 496, in <module> [ 99.871592] hibinit-agent[1855]: main() [ 99.872080] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 435, in main [ 99.872516] hibinit-agent[1855]: if not hibernation_enabled(config.state_dir): [ 99.873017] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 390, in hibernation_enabled [ 99.873487] hibinit-agent[1855]: imds_token = get_imds_token() [ 99.873793] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 365, in get_imds_token [ 99.875332] hibinit-agent[1855]: response = requests.put(token_url, headers=request_header) [ 99.877065] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 121, in put [ 99.877230] hibinit-agent[1855]: return request('put', url, data=data, **kwargs) [ 99.877959] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request [ 99.878225] hibinit-agent[1855]: response = session.request(method=method, url=url, **kwargs) [ 99.878614] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 486, in request [ 99.879747] hibinit-agent[1855]: resp = self.send(prep, **send_kwargs) [ 99.880157] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 598, in send [ 99.884411] hibinit-agent[1855]: r = adapter.send(request, **kwargs) [ 99.884728] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 419, in send [ 99.892094] hibinit-agent[1855]: raise ConnectTimeout(e, request=request) [ 99.892377] hibinit-agent[1855]: requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7efc029fa390>: Failed to establish a new connection: [Errno 101] Network is unreachable',)) [FAILED] Failed to start Initial hibernation setup job. See 'systemctl status hibinit-agent.service' for details. [ 101.215791] cloud-init[1856]: Cloud-init v. 19.3-3.amzn2 running 'init' at Wed, 28 Oct 2020 10:01:11 +0000. Up 101.18 seconds. [ 101.264707] cloud-init[1856]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++ [ 101.264940] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+ [ 101.272469] cloud-init[1856]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address | [ 101.274166] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+ [ 101.274497] cloud-init[1856]: ci-info: | eth0 | False | . | . | . | 0a:cf:65:4e:dd:ff | [ 101.284890] cloud-init[1856]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . 
| [ 101.286727] cloud-init[1856]: ci-info: | lo | True | ::1/128 | . | host | . | [ 101.286986] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+ [ 101.291933] cloud-init[1856]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++ [ 101.292215] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+ [ 101.294122] cloud-init[1856]: ci-info: | Route | Destination | Gateway | Interface | Flags | [ 101.294383] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+ [ 101.294543] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+ Welcome to emerg Cannot open access to console, the root account is locked. See sulogin(8) man page for more details. Press Enter to continue.
Ok, shortly after posting we figured it out. It seems a mount point changed (I expect due to a Linux kernel update) and we had not used the nofail option in /etc/fstab as described in the AWS Knowledge Center, which caused the server to hang at boot. Going forward we will also make sure we mount by UUID so we are independent of the device naming in /dev/.
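A minimal sketch of the kind of /etc/fstab entry described above, assuming an XFS data volume mounted at /home/storage (the UUID is a placeholder; x-systemd.device-timeout only shortens the wait if the device is absent):

# /etc/fstab - mount by UUID and add nofail so a missing volume cannot block boot
UUID=<data-volume-uuid>  /home/storage  xfs  defaults,nofail,x-systemd.device-timeout=10  0  2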
I think I've narrowed this down to the ec2-utils package. We had the same issue with devices not mounting properly, which we initially blamed on the ENA or NVMe driver. Once we ran a yum update it was resolved, and if you downgrade the ec2-utils package to ec2-utils-1.2-2.amzn2 the issue returns. This seems to affect only Nitro-based instances. To fix it, you can temporarily boot as a t2 or another older instance type and update the package.
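Roughly the commands involved, assuming an Amazon Linux 2 AMI (exact package versions may differ on yours):

# updating ec2-utils resolved the mount problem for us
sudo yum update -y ec2-utils

# downgrading to the older build reintroduces it, which is how we confirmed the culprit
sudo yum downgrade -y ec2-utils-1.2-2.amzn2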
Reference: jfrog artifactory could not validate router error
I have tried everyone's suggestions and I still get a failure. This is on a new installation of Artifactory (jfrog-artifactory-oss-7.4.1-linux.tar.gz) on a local CentOS VM.

2020-04-18T11:53:25.305Z [jfrt ] [INFO ] [a88f4f6ce96d65bb] [o.j.c.ExecutionUtils:141 ] [pool-13-thread-1 ] - Cluster join: Retry 5: Service registry ping failed, will retry. Error while trying to connect to local router at address 'http://localhost:8046/access': Connect to localhost:8046 [localhost/127.0.0.1] failed: Connection refused (Connection refused)

hostname -i
172.16.217.147

more /etc/hosts
127.0.0.1 centos7 centos7.example.com localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.217.147 artifactory-master

system.yaml
shared:
  node:
    ip: 172.16.217.147

This is from access-service.log:

2020-04-18T11:52:19.789Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.s.r.s.GrpcServerImpl:65 ] [ocalhost-startStop-2] - Starting gRPC Server on port 8045
2020-04-18T11:52:20.072Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.s.r.s.GrpcServerImpl:84 ] [ocalhost-startStop-2] - gRPC Server started, listening on 8045
2020-04-18T11:52:21.995Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.AccessApplication:59 ] [ocalhost-startStop-2] - Started AccessApplication in 11.711 seconds (JVM running for 13.514)
2020-04-18T11:52:29.093Z [jfac ] [WARN ] [7b2c676f76c7ef43] [o.j.c.ExecutionUtils:141 ] [pool-6-thread-2 ] - Retry 20 Elapsed 9.54 secs failed: Registration with router on URL http://localhost:8046 failed with error: UNAVAILABLE: io exception. Trying again
2020-04-18T11:52:34.119Z [jfac ] [WARN ] [7b2c676f76c7ef43] [o.j.c.ExecutionUtils:141 ] [pool-6-thread-2 ] - Retry 30 Elapsed 14.57 secs failed: Registration with router on URL http://localhost:8046 failed with error: UNAVAILABLE: io exception. Trying again
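Not from the original post, but a generic check that makes the symptom concrete: while the retries are happening, verify whether anything is actually listening on the router port that the access service is trying to reach.

# is the JFrog router bound to 8046 at all?
ss -tlnp | grep 8046

# does the endpoint from the error message answer locally?
curl -v http://localhost:8046/access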
AWS elasticsearch EC2 Discovery, can't find other nodes
My objective is to create an Elasticsearch cluster in AWS using EC2 discovery. I have 3 instances, each running Elasticsearch, and I have given each instance an IAM role which allows it to describe EC2 data. Each instance is inside the security group "sec-group-elasticsearch". The nodes start but do not find each other (logs below). I can telnet from one node to another using the private DNS name and port 9300, e.g. telnet from node A->B works and B->A works:

telnet ip-xxx-xxx-xx-xxx.vpc.fakedomain.com 9300

IAM role for each instance:

{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}

Security group rules:

Inbound: Custom TCP Rule, TCP 9200 - 9400, 0.0.0.0/0
Outbound: All traffic allowed

elasticsearch.yml:

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300

On startup each instance logs like so:

[2016-03-02 03:13:48,128][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] version[2.1.0], pid[26976], build[72cd1f1/2015-11-18T22:40:03Z]
[2016-03-02 03:13:48,129][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initializing ...
[2016-03-02 03:13:48,592][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] loaded [cloud-aws], sites [head]
[2016-03-02 03:13:48,620][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.4gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initialized
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] starting ...
[2016-03-02 03:13:51,065][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9300}, bound_addresses {xxx-xxx-xx-xxx:9300}
[2016-03-02 03:13:51,074][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] my-ec2-elasticsearch/xVOkfK4TT-GWaPln59wGxw
[2016-03-02 03:14:21,075][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] waited for 30s and no initial state was set by the discovery
[2016-03-02 03:14:21,084][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9200}, bound_addresses {xxx-xxx-xx-xxx:9200}
[2016-03-02 03:14:21,085][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] started

TRACE LOGGING ON FOR DISCOVERY:

[2016-03-02 04:25:27,753][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:916)
at ..............
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] sending to {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] received response from {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}: [ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[143], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[145], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[147], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[149], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[151], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[153], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[154], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}]
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_2#}{::1}{[::1]:9300}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
ConnectTransportException[[][127.0.0.1:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9300];
at ...........
[2016-03-02 04:25:29,255][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at
You have a tiny typo in your elasticsearch.yml configuration file:

discovery: ec2

should read:

discovery.type: ec2
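For completeness, the discovery-related portion of the elasticsearch.yml from the question would then look like this (only the first line changes, everything else stays as posted):

discovery.type: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false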
elasticsearch: EC2 discovery: master nodes work, data nodes fail
My objective is to run a 6 node cluster on three instances in EC2, placing one master-only and one data-only node on each instance (using the elastic ansible playbook). The master nodes from the three instances all find each other without issue using EC2 discovery, form a cluster of three, and elect a master. The data nodes on the same instances fail on startup with the error below.

What have I tried:
- switching the data nodes to explicit zen.unicast discovery via hostnames works
- I can telnet on port 9301 from instance A->B without issue

REFERENCE:
java version - OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
es version - 2.1.0

data node elasticsearch.yml:

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9201
network.host: _ec2:privateDns_
node.data: true
node.master: false
transport.tcp.port: 9301
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1

master node elasticsearch.yml:

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-master

Errors from data node startup:

[2016-03-02 15:45:06,246][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initializing ...
[2016-03-02 15:45:06,679][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] loaded [cloud-aws], sites [head]
[2016-03-02 15:45:06,710][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.5gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initialized
[2016-03-02 15:45:09,597][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] starting ...
[2016-03-02 15:45:09,678][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9301}, bound_addresses {xxx-xxx-xx-xxx:9301}
[2016-03-02 15:45:09,687][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] my-cluster/PNI6WAmzSYGgZcX2HsqenA
[2016-03-02 15:45:09,701][WARN ][com.amazonaws.jmx.SdkMBeanRegistrySupport] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.registerMBean(MBeans.java:50)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.registerMetricAdminMBean(SdkMBeanRegistrySupport.java:27)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:355)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:09,703][WARN ][com.amazonaws.metrics.AwsSdkMetrics] java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
at java.security.AccessController.checkPermission(AccessController.java:559)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
at com.amazonaws.jmx.MBeans.isRegistered(MBeans.java:98)
at com.amazonaws.jmx.SdkMBeanRegistrySupport.isMBeanRegistered(SdkMBeanRegistrySupport.java:46)
at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:361)
at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:39,688][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] waited for 30s and no initial state was set by the discovery
[2016-03-02 15:45:39,698][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9201}, bound_addresses {xxx-xxx-xx-xxx:9201}
[2016-03-02 15:45:39,699][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] started
I fixed this by removing the explicit setting of transport.tcp.port: 9300 and letting Elasticsearch fall back to the default behaviour of binding any port in the range 9300-9399. The warnings from AwsSdkMetrics remain, but they are NOT an issue, as Val stated.
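To make that concrete, the data node elasticsearch.yml from the question would end up roughly like this, with the transport.tcp.port line simply dropped and everything else unchanged:

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9201
network.host: _ec2:privateDns_
node.data: true
node.master: false
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1
# no transport.tcp.port here - the node binds the first free port in the default 9300-9399 range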
This is not actually an error; see this issue where it has been reported. It just seems the plugin is logging too much. If you modify your logging.yml config file as suggested in that issue with the following, you'll be fine:

# aws will try to do some sketchy JMX stuff, but its not needed.
com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
com.amazonaws.metrics.AwsSdkMetrics: ERROR