I'm using Logstash 1.4.1 with Elasticsearch 1.1.1 (installed as an EC2 cluster) and the Elasticsearch AWS plugin 2.1.1.
To check whether Logstash is properly talking to Elasticsearch, I use:
bin/logstash -e 'input { stdin { } } output { elasticsearch { host => <ES_cluster_IP> } }'
and I get:
log4j, [2014-06-10T18:30:17.622] WARN: org.elasticsearch.discovery: [logstash-ip-xxxxxxxx-20308-2010] waited for 30s and no initial state was set by the discovery
Exception in thread ">output" org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(org/elasticsearch/action/support/master/TransportMasterNodeOperationAction.java:180)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(org/elasticsearch/cluster/service/InternalClusterService.java:492)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
at java.lang.Thread.run(java/lang/Thread.java:744)
But when I use:
bin/logstash -e 'input { stdin { } } output { elasticsearch_http { host => <ES_cluster_IP> } }'
it works fine, with the following warning:
Using milestone 2 output plugin 'elasticsearch_http'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
I don't understand why I can't use elasticsearch instead of elasticsearch_http, even when the versions are compatible.
I'd take care to set the protocol option to one of "http", "transport", or "node". The documentation on this is contradictory: on the one hand it states that the option is optional and has no default, while at the end it says the default differs depending on the Ruby runtime:
The ‘node’ protocol will connect to the cluster as a normal
Elasticsearch node (but will not store data). This allows you to use
things like multicast discovery. If you use the node protocol, you
must permit bidirectional communication on the port 9300 (or whichever
port you have configured).
The ‘transport’ protocol will connect to the host you specify and will
not show up as a ‘node’ in the Elasticsearch cluster. This is useful
in situations where you cannot permit connections outbound from the
Elasticsearch cluster to this Logstash server.
The ‘http’ protocol will use the Elasticsearch REST/HTTP interface to
talk to elasticsearch.
All protocols will use bulk requests when talking to Elasticsearch.
The default protocol setting under java/jruby is “node”. The default
protocol on non-java rubies is “http”
The problem here is that the protocol setting has a pretty significant impact on how you connect to Elasticsearch and how it operates, yet it's not clear what happens when you don't set it. Better to pick one and set it explicitly:
http://logstash.net/docs/1.4.1/outputs/elasticsearch#protocol
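For example, a minimal output block that pins the protocol explicitly (the host is a placeholder; substitute whichever of the three protocols suits your network):
output {
  elasticsearch {
    host => "<ES_cluster_IP>"
    protocol => "http"
  }
}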
The Logstash elasticsearch plugin page mentions:
VERSION NOTE: Your Elasticsearch cluster must be running Elasticsearch 1.1.1. If you use
any other version of Elasticsearch, you should set protocol => http in this plugin.
So it is not a version incompatibility.
Elasticsearch uses port 9300 for multicast discovery and for communication with other nodes and clients, so the problem is probably that your Logstash can't talk to your Elasticsearch cluster. Check your server configuration to see whether a firewall is blocking port 9300.
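A quick way to verify this from the Logstash machine (the address is a placeholder) is to probe the port directly, e.g. with netcat:
# Reports success only if nothing blocks the transport port
nc -vz <ES_cluster_IP> 9300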
Set the protocol in your Logstash output configuration:
output {
  elasticsearch { host => "localhost" protocol => "http" port => "9200" }
  stdout { codec => rubydebug }
}
Related
Specs:
The serverless Amazon MSK that's in preview.
t2.xlarge EC2 instance with Amazon Linux 2
Installed Kafka from https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
openjdk version "11.0.13" 2021-10-19 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.13+8-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.13+8-LTS, mixed mode, sharing)
Gradle 7.3.3
https://github.com/aws/aws-msk-iam-auth, successfully built.
I also tried adding IAM authentication information, as recommended by the Amazon MSK Library for AWS Identity and Access Management. It says to add the following in config/client.properties:
# Sets up TLS for encryption and SASL for authN.
security.protocol = SASL_SSL
# Identifies the SASL mechanism to use.
sasl.mechanism = AWS_MSK_IAM
# Binds SASL client implementation.
# sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required;
# Encapsulates constructing a SigV4 signature based on extracted credentials.
# The SASL client bound by "sasl.jaas.config" invokes this class.
sasl.client.callback.handler.class = software.amazon.msk.auth.iam.IAMClientCallbackHandler
# Binds SASL client implementation. Uses the specified profile name to look for credentials.
sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required awsProfileName="kafka-client";
And kafka-client is the IAM role attached to the EC2 instance as an instance profile.
Networking: I used VPC Reachability Analyzer to confirm that the security groups are configured correctly and the EC2 instance I'm using as a Producer can reach the serverless MSK cluster.
What I'm trying to do: create a topic.
How I'm trying: bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 --topic quickstart-events --bootstrap-server boot-zclcyva3.c2.kafka-serverless.us-east-2.amazonaws.com:9098
Result:
Error while executing topic command : Timed out waiting for a node assignment. Call: createTopics
[2022-01-17 01:46:59,753] ERROR org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: createTopics
(kafka.admin.TopicCommand$)
I also tried the plaintext port, 9092. (9098 is the IAM-authentication port in MSK, and serverless MSK uses IAM authentication by default.)
All the other posts I found on SO about this node assignment error didn't include MSK. I tried suggestions like uncommenting the listener setting in server.properties, but that didn't change anything.
Installing kcat for troubleshooting didn't work for me, since there's no out-of-the-box installation for the yum package manager, which Amazon Linux 2 uses, and since these instructions failed for me at the step "checking for libcurl (by compile)... failed (fail)".
The Question: Any other tips on solving this "node assignment" error?
The documentation has been updated recently; I was able to follow it end to end without any issue (the IAM policy is now correct):
https://docs.aws.amazon.com/msk/latest/developerguide/serverless-getting-started.html
The created properties file is not automatically used; your command needs to include --command-config client.properties (see the example after the extract below). This properties file is documented in the MSK docs on the linked IAM page.
Extract...
ssl.truststore.location=<PATH_TO_TRUST_STORE_FILE>
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
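With that file in place, the create-topic command from the question becomes something along these lines (same bootstrap server as above, plus the config flag):
bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 \
  --topic quickstart-events \
  --bootstrap-server boot-zclcyva3.c2.kafka-serverless.us-east-2.amazonaws.com:9098 \
  --command-config client.properties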
Alternatively, if the plaintext port didn't work, then you have other networking issues.
Beyond these steps, I suggest reaching out to MSK support and telling them to update the "Create a Topic" page to no longer use Zookeeper, keeping in mind that Kafka 3.0 is not (yet) supported.
I want to set up an ELK stack. I have installed Elasticsearch and Kibana on one machine and Logstash on another machine. Below is my Logstash config file, named logstash.conf, at /etc/logstash/conf.d, with the following configuration:
input {
  stdin {}
  file {
    type => "syslog"
    path => "/u01/workspace/data/tenodata/logs/teno.log"
    start_position => "end"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["172.3.9.5:9200"]
  }
}
but somehow it is not able to connect to Elasticsearch.
Can someone help me with this? Also, what is the location of the Elasticsearch log?
The path.logs property in the elasticsearch.yml file (found in the config folder of wherever you installed ES, possibly /etc/elasticsearch) will point to where you can find logs (for me it's /var/log/elasticsearch).
I would check connectivity from the machine; maybe run curl 172.3.9.5:9200 and see if it spits back anything. If it just hangs, try looking at your security groups to make sure traffic is allowed.
If that's good, make sure that Elasticsearch is set to listen for connections from the outside world. By default it's set to bind only to localhost; you can edit the network.host property in elasticsearch.yml and set it to 0.0.0.0 to see if that works. (Just FYI: if you're using the ec2-discovery plugin, you would set it to _ec2_.)
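For example, in elasticsearch.yml (illustrative values; path.logs is the log-location property mentioned above):
network.host: 0.0.0.0
path.logs: /var/log/elasticsearch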
I have been wrestling with this for a couple of days now. I want to deploy Spring Cloud Data Flow Server for Cloud Foundry to my org's enterprise Pivotal Cloud Foundry instance. My problem is forcing all Data Flow Server web requests to TLS/HTTPS. Here is an example of a configuration I've tried to get this working:
# manifest.yml
---
applications:
- name: gdp-dataflow-server
  buildpack: java_buildpack_offline
  host: dataflow-server
  memory: 2G
  disk_quota: 2G
  instances: 1
  path: spring-cloud-dataflow-server-cloudfoundry-1.2.3.RELEASE.jar
  env:
    SPRING_APPLICATION_NAME: dataflow-server
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_URL: https://api.system.x.x.io
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_ORG: my-org
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SPACE: my-space
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_DOMAIN: my-domain.io
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_USERNAME: user
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_PASSWORD: pass
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_SERVICES: dataflow-mq
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_BUILDPACK: java_buildpack_offline
    SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_SERVICES: dataflow-db
    SPRING_APPLICATION_JSON: |
      {
        "server": {
          "use-forward-headers": true,
          "tomcat": {
            "remote-ip-header": "x-forwarded-for",
            "protocol-header": "x-forwarded-proto"
          }
        },
        "management": {
          "context-path": "/management",
          "security": {
            "enabled": true
          }
        },
        "security": {
          "require-ssl": true,
          "basic": {
            "enabled": true,
            "realm": "Data Flow Server"
          },
          "user": {
            "name": "dataflow-admin",
            "password": "nimda-wolfatad"
          }
        }
      }
  services:
  - dataflow-db
  - dataflow-redis
Despite the security block in SPRING_APPLICATION_JSON, the Data Flow Server's web endpoints are still accessible via insecure HTTP. How can I force all requests to HTTPS? Do I need to customize my own build of the Data Flow Server for Cloud Foundry? I understand that PCF's proxy is terminating SSL/TLS at the load balancer, but configuring the forward headers should induce Spring Security/Tomcat to behave the way I want, should it not? I must be missing something obvious here, because this seems like a common desire that should not be this difficult.
Thank you.
There's nothing out-of-the-box from Spring Boot proper to enable/disable HTTPS and at the same time also intercept and auto-redirect plain HTTP -> HTTPS.
There are several online write-ups on how to create a custom Configuration class that accepts multiple connectors in Spring Boot (see example).
Spring Cloud Data Flow (SCDF) is a simple Spring Boot application, so all this applies to the SCDF-server as well.
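As a minimal sketch of that approach for a Spring Boot 1.x app such as the SCDF server (the class name is hypothetical; it assumes the x-forwarded-* header settings from the question are in place, so Tomcat knows which requests already arrived over TLS and redirects the rest):
import org.apache.catalina.Context;
import org.apache.tomcat.util.descriptor.web.SecurityCollection;
import org.apache.tomcat.util.descriptor.web.SecurityConstraint;
import org.springframework.boot.context.embedded.EmbeddedServletContainerFactory;
import org.springframework.boot.context.embedded.tomcat.TomcatEmbeddedServletContainerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HttpsEnforcementConfig {

    @Bean
    public EmbeddedServletContainerFactory servletContainer() {
        // Tomcat factory that marks every path as CONFIDENTIAL, so requests
        // that did not arrive over TLS (per x-forwarded-proto) are redirected
        // to HTTPS instead of being served over plain HTTP.
        return new TomcatEmbeddedServletContainerFactory() {
            @Override
            protected void postProcessContext(Context context) {
                SecurityConstraint constraint = new SecurityConstraint();
                constraint.setUserConstraint("CONFIDENTIAL");
                SecurityCollection collection = new SecurityCollection();
                collection.addPattern("/*");
                constraint.addCollection(collection);
                context.addConstraint(constraint);
            }
        };
    }
}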
That said, if you intend to enforce HTTPS throughout your application interactions, there is a PCF setting [Disable HTTP traffic to HAProxy] that can be applied as a global override in Elastic Runtime - see docs. This applies consistently to all applications, and it is not specific to Spring Boot or SCDF: even Python, Node, or other types of apps can be forced to interact via HTTPS with this setting.
I'm working on an Elasticsearch project where I want to get data from Amazon S3. For this, I'm using Logstash. To configure it,
output {
  elasticsearch {
    host => 'host_'
    cluster => 'cluster_name'
  }
}
is the usual approach.
But I'm using the Amazon Elasticsearch service, which has only an endpoint and a Domain ARN. How should I specify the host name in this case?
In the simplest case where your ES cluster on AWS is open to the world, you can have a simple elasticsearch output config like this:
For Logstash 2.0:
output {
  elasticsearch {
    hosts => 'search-xxxxxxxxxxxx.us-west-2.es.amazonaws.com:80'
  }
}
don't forget the port number at the end
make sure to use the hosts setting (not host)
For Logstash 1.5.x:
output {
  elasticsearch {
    host => 'search-xxxxxxxxxxxx.us-west-2.es.amazonaws.com'
    port => 80
    protocol => 'http'
  }
}
the port number is a separate setting named port
make sure to use the host setting (not hosts), i.e. the opposite of 2.0
I have one OpsWorks Node.js stack. I set up multiple Node.js apps. The problem now is that all the Node.js server.js scripts listen on port 80 for the Amazon life check, but the port can be used by only one of them.
I don't know how to solve this. I have read the Amazon documentation but could not find a solution. I read that I could try changing the deploy recipe variables to set this life check to a different port, but it didn't work. Any help?
I battled with this issue for a while and eventually found a very simple solution.
The port is set in the deploy cookbook's attributes...
https://github.com/aws/opsworks-cookbooks/blob/release-chef-11.10/deploy/attributes/deploy.rb
by the line...
default[:deploy][application][:nodejs][:port] = deploy[:ssl_support] ? 443 : 80
you can override this using the stack's custom JSON, such as:
{
  "deploy" : {
    "app_name_1": {
      "nodejs": {
        "port": 80
      }
    },
    "app_name_2": {
      "nodejs": {
        "port": 3000
      }
    }
  },
  "mongodb" : {
    ...
  }
}
Now the monitrc files at /etc/monit.d/node_web_app-.monitrc should reflect their respective ports, and monit should keep them alive!
My solution was to implement a life-check Node service that listens on port 80. When the Amazon life check request is made to that service, it responds and executes its own logic to check the health of all services. It works great.
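For illustration, a rough sketch of such a life-check service in Node (the per-service probing is a placeholder for your own logic):
// health.js - answers the Amazon life check on port 80
const http = require('http');

// Placeholder: probe each app's real port here and aggregate the results.
function allServicesHealthy() {
  return true;
}

http.createServer((req, res) => {
  if (allServicesHealthy()) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('OK');
  } else {
    res.writeHead(503, { 'Content-Type': 'text/plain' });
    res.end('UNHEALTHY');
  }
}).listen(80);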