I am trying to have Airflow email me through AWS SES whenever a task in my DAG fails or is retried. I am using my AWS SES credentials rather than my general AWS credentials.
My current airflow.cfg
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = emailsmtpserver.region.amazonaws.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = REMOVEDAWSACCESSKEY
smtp_password = REMOVEDAWSSECRETACCESSKEY
smtp_port = 25
smtp_mail_from = myemail@myjob.com
Current task in my DAG that is designed to intentionally fail and retry:
testfaildag_library_install_jar_jdbc = PythonOperator(
    task_id='library_install_jar',
    retries=3,
    retry_delay=timedelta(seconds=15),
    python_callable=add_library_to_cluster,
    params={'_task_id': 'cluster_create', '_cluster_name': CLUSTER_NAME, '_library_path': 's3000://fakepath.jar'},
    dag=dag,
    email_on_failure=True,
    email_on_retry=True,
    email='myname@myjob.com',
    provide_context=True
)
Everything works as designed: the task retries the set number of times and ultimately fails. However, no emails are sent. I have also checked the logs for the task above, and SMTP is never mentioned.
I've looked at the similar question here, but the only solution there did not work for me. Additionally, Airflow's documentation such as their example here does not seem to work for me either.
Does SES work with Airflow's email_on_failure and email_on_retry functions?
What I am currently thinking of doing is using the on_failure_callback function to call a python script provided by AWS here to send an email on failure, but that is not the preferable route at this point.
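For reference, the on_failure_callback route mentioned above could look roughly like the sketch below. It only builds the alert message from the context dict that Airflow passes to callbacks; the delivery function (send) is left as a parameter, and the names build_failure_email and make_failure_callback and the example.com addresses are hypothetical, not part of Airflow or the AWS script.

```python
from email.message import EmailMessage


def build_failure_email(context, sender, recipients):
    """Build a plain-text alert from the context dict passed to the callback."""
    ti = context.get("task_instance")
    task_id = getattr(ti, "task_id", "unknown task")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "Airflow alert: {} failed".format(task_id)
    msg.set_content(
        "DAG: {}\nTask: {}\nException: {}".format(
            getattr(ti, "dag_id", "unknown"),
            task_id,
            context.get("exception"),
        )
    )
    return msg


def make_failure_callback(send, sender, recipients):
    """Return a callable for on_failure_callback; `send` delivers the message."""
    def _callback(context):
        send(build_failure_email(context, sender, recipients))
    return _callback
```

The returned function would be passed as on_failure_callback=make_failure_callback(...) on the operator, with send being whatever actually delivers the message (for example a wrapper around the AWS-provided script).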
Thank you, appreciate any help.
--updated 6/8 with working SES
Here's my write-up on how we got it all working. There is a small summary at the bottom of this answer.
Couple of big points:
We initially decided not to use Amazon SES and to use sendmail instead. We now have SES up and working as well.
It is the airflow worker that services the email_on_failure and email_on_retry features. You can run journalctl -u airflow-worker -f to monitor it during a DAG run. On your production server, you do NOT need to restart your airflow-worker after changing the smtp settings in airflow.cfg; they should be picked up automatically. No need to worry about disrupting currently running DAGs.
Here is the technical write-up on how to use sendmail:
Since we changed from SES to sendmail on localhost, we had to change the smtp settings in airflow.cfg.
The new config is:
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = localhost
smtp_starttls = False
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
#smtp_user = not used
#smtp_password = not used
smtp_port = 25
smtp_mail_from = myjob@mywork.com
This works in both production and local airflow instances.
Some common errors one might receive if their config is not like mine above:
socket.error: [Errno 111] Connection refused -- you must change your smtp_host line in airflow.cfg to localhost
smtplib.SMTPException: STARTTLS extension not supported by server. -- you must change your smtp_starttls in airflow.cfg to False
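The two errors above can be reproduced outside Airflow with a few lines of smtplib, which is quicker than waiting for a task to fail. This is only a sketch: check_smtp is a hypothetical helper, and the smtp_factory parameter exists purely so the function can be exercised without a real mail server.

```python
import smtplib


def check_smtp(host, port, use_starttls, user=None, password=None,
               smtp_factory=smtplib.SMTP, timeout=10):
    """Open an SMTP session as described by the [smtp] settings and report the outcome."""
    try:
        server = smtp_factory(host, port, timeout=timeout)
    except OSError as exc:
        # Covers "Connection refused" and connection timeouts
        return "connect failed: {}".format(exc)
    try:
        server.ehlo()
        if use_starttls:
            # Raises an SMTPException if the server does not offer STARTTLS
            server.starttls()
            server.ehlo()
        if user and password:
            server.login(user, password)
        return "ok"
    except smtplib.SMTPException as exc:
        return "smtp error: {}".format(exc)
    finally:
        try:
            server.quit()
        except smtplib.SMTPException:
            pass
```

For the sendmail setup above, check_smtp("localhost", 25, use_starttls=False) should return "ok" if the local mail service is listening. Note that this only checks the connection from your machine to the SMTP host, not onward delivery.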
In my local testing, I wanted to force Airflow to log what happened when it tried to send an email, so I created a throwaway DAG as follows:
# Airflow imports
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
# General imports
from datetime import datetime, timedelta

def throwerror():
    # Always fail so that Airflow has to send the failure email
    raise ValueError("Failure")

SPARK_V_2_2_1 = '3.5.x-scala2.11'

args = {
    'owner': 'me',
    'email': ['me@myjob'],
    'depends_on_past': False,
    'start_date': datetime(2018, 5, 24),
    'end_date': datetime(2018, 6, 28)
}

dag = DAG(
    dag_id='testemaildag',
    default_args=args,
    catchup=False,
    schedule_interval="* 18 * * *"
)

t1 = DummyOperator(
    task_id='extract_data',
    dag=dag
)

t2 = PythonOperator(
    task_id='fail_task',
    dag=dag,
    python_callable=throwerror
)

# fail_task runs after extract_data
t2.set_upstream(t1)
If you run journalctl -u airflow-worker -f, you can see the worker report that it sent an alert email for the failure to the address in your DAG, but we were still not receiving the email. We then looked at the mail logs by running cat /var/log/maillog, and saw entries like this:
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: connect from localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: ID: client=localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/cleanup[port]: ID: message-id=<randomMessageID@production-server-ip-range-ec2-instance>
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: disconnect from localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/qmgr[port]: MESSAGEID: from=<myjob@mycompany.com>, size=1297, nrcpt=1 (queue active)
Jun 5 14:10:55 production-server-ip-range postfix/smtp[port]: connect to aspmx.l.google.com[smtp-ip-range]:25: Connection timed out
Jun 5 14:11:25 production-server-ip-range postfix/smtp[port]: connect to alt1.aspmx.l.google.com[smtp-ip-range]:25: Connection timed out
So this is probably the biggest "Oh duh" moment. Here we are able to see what is actually going on in our smtp service. We used telnet to confirm that we could not connect to the Gmail mail servers' IP ranges on port 25.
We determined that Airflow was handing the email off correctly, but that the local mail service could not connect to the destination mail servers.
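The telnet check can also be done from Python, which is handy on hosts where telnet is not installed. A minimal sketch (can_connect is a hypothetical helper, not something Airflow provides):

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, DNS failure, or timed out
        return False
```

In our case, can_connect("aspmx.l.google.com", 25) returning False would match the "Connection timed out" lines in the mail log.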
We decided to allow all outbound traffic on port 25 in AWS (our airflow production environment is an EC2 instance), and it now works. We now receive emails on failures and retries. Tip: email_on_failure and email_on_retry default to True (see the API reference), so you do not need to put them in your args, but it is still good practice to set them explicitly to True or False.
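As an illustration of setting those flags explicitly, a default_args dict might look like the sketch below. The address and the retry values are placeholders, not our production settings.

```python
from datetime import timedelta

# Passed to DAG(default_args=default_args); every task in the DAG inherits these.
default_args = {
    "owner": "me",
    "email": ["me@example.com"],   # placeholder recipient
    "email_on_failure": True,      # already the default, stated explicitly
    "email_on_retry": True,        # already the default, stated explicitly
    "retries": 3,
    "retry_delay": timedelta(seconds=15),
}
```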
SES now works. Here is the airflow config:
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = emailsmtpserver.region.amazonaws.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = REMOVEDAWSACCESSKEY
smtp_password = REMOVEDAWSSECRETACCESSKEY
smtp_port = 587
# smtp_mail_from must be an email address verified in SES
smtp_mail_from = myemail@myjob.com
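If you want to sanity-check an [smtp] section before pointing Airflow at SES, something like the following sketch may help. The port sets reflect my understanding of the SES SMTP endpoint (STARTTLS on 25, 587 or 2587; TLS wrapper on 465 or 2465), and check_ses_smtp_section is a hypothetical helper. Port 25 passes this static check, but outbound port 25 is commonly restricted from EC2, which is why we ended up on 587.

```python
import configparser

SES_STARTTLS_PORTS = {25, 587, 2587}
SES_TLS_WRAPPER_PORTS = {465, 2465}


def check_ses_smtp_section(cfg_text):
    """Return a list of likely problems in the [smtp] section of an airflow.cfg."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    smtp = parser["smtp"]
    port = smtp.getint("smtp_port")
    starttls = smtp.getboolean("smtp_starttls")
    ssl = smtp.getboolean("smtp_ssl")

    problems = []
    if starttls and ssl:
        problems.append("smtp_starttls and smtp_ssl are both enabled")
    if not starttls and not ssl:
        problems.append("SES expects an encrypted connection")
    if starttls and port not in SES_STARTTLS_PORTS:
        problems.append("port {} is not a STARTTLS port".format(port))
    if ssl and port not in SES_TLS_WRAPPER_PORTS:
        problems.append("port {} is not a TLS wrapper port".format(port))
    if not smtp.get("smtp_user") or not smtp.get("smtp_password"):
        problems.append("smtp_user/smtp_password are required for SES")
    return problems
```

An empty list means nothing obviously wrong was found; it does not prove that the credentials or the sender address are accepted by SES.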
Thanks!
Similar case here. I tried to follow the same debugging process but got no log output. Also, the outbound rule for my airflow EC2 instance is open to all ports and IPs, so there must be some other cause.
I noticed that when you create the SMTP credentials from SES, it also creates an IAM user. I am not sure how Airflow is running in your case (bare metal on an EC2 instance or wrapped in containers), or how that user's access is set up.
I am configuring my logstash.conf file for the first time, with an output to amazon_es.
My whole logstash.conf file is here:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    # The user we wish to execute our statement as
    jdbc_user => "root"
    jdbc_password => "root"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/mnt/c/Users/xxxxxxxx/mysql-connector-java-5.1.45/mysql-connector-java-5.1.45-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # our query
    statement => "SELECT * FROM testtable"
  }
}
output {
  amazon_es {
    hosts => ["search-xxxxx.eu-west-3.es.amazonaws.com"]
    region => "eu-west-3"
    aws_access_key_id => 'xxxxxxxxxxxxxxxxxxxxxx'
    aws_secret_access_key => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    index => "test-migrate"
    document_type => "data"
  }
}
My query selects 3 rows from my database, but the first time I run the script, only the first row is indexed in Elasticsearch. The second time I run it, all 3 rows are indexed. I get the error shown in the log below each time I run Logstash with this conf file.
EDIT 2:
[2018-02-08T14:31:18,270][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/modules/fb_apache/configuration"}
[2018-02-08T14:31:18,279][DEBUG][logstash.plugins.registry] Adding plugin to the registry {:name=>"fb_apache", :type=>:modules, :class=>#<LogStash::Modules::Scaffold:0x47c515a1 #module_name="fb_apache", #directory="/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/modules/fb_apache/configuration", #kibana_version_parts=["6", "0", "0"]>}
[2018-02-08T14:31:18,286][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/modules/netflow/configuration"}
[2018-02-08T14:31:18,287][DEBUG][logstash.plugins.registry] Adding plugin to the registry {:name=>"netflow", :type=>:modules, :class=>#<LogStash::Modules::Scaffold:0x6f1a5910 #module_name="netflow", #directory="/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/modules/netflow/configuration", #kibana_version_parts=["6", "0", "0"]>}
[2018-02-08T14:31:18,765][DEBUG][logstash.runner ] -------- Logstash Settings (* means modified) ---------
[2018-02-08T14:31:18,765][DEBUG][logstash.runner ] node.name: "DEVFE-AMT"
[2018-02-08T14:31:18,766][DEBUG][logstash.runner ] *path.config: "logstash.conf"
[2018-02-08T14:31:18,766][DEBUG][logstash.runner ] path.data: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/data"
[2018-02-08T14:31:18,767][DEBUG][logstash.runner ] modules.cli: []
[2018-02-08T14:31:18,768][DEBUG][logstash.runner ] modules: []
[2018-02-08T14:31:18,768][DEBUG][logstash.runner ] modules_setup: false
[2018-02-08T14:31:18,768][DEBUG][logstash.runner ] config.test_and_exit: false
[2018-02-08T14:31:18,769][DEBUG][logstash.runner ] config.reload.automatic: false
[2018-02-08T14:31:18,769][DEBUG][logstash.runner ] config.reload.interval: 3000000000
[2018-02-08T14:31:18,769][DEBUG][logstash.runner ] config.support_escapes: false
[2018-02-08T14:31:18,770][DEBUG][logstash.runner ] metric.collect: true
[2018-02-08T14:31:18,770][DEBUG][logstash.runner ] pipeline.id: "main"
[2018-02-08T14:31:18,771][DEBUG][logstash.runner ] pipeline.system: false
[2018-02-08T14:31:18,771][DEBUG][logstash.runner ] pipeline.workers: 8
[2018-02-08T14:31:18,771][DEBUG][logstash.runner ] pipeline.output.workers: 1
[2018-02-08T14:31:18,772][DEBUG][logstash.runner ] pipeline.batch.size: 125
[2018-02-08T14:31:18,772][DEBUG][logstash.runner ] pipeline.batch.delay: 50
[2018-02-08T14:31:18,772][DEBUG][logstash.runner ] pipeline.unsafe_shutdown: false
[2018-02-08T14:31:18,772][DEBUG][logstash.runner ] pipeline.java_execution: false
[2018-02-08T14:31:18,773][DEBUG][logstash.runner ] pipeline.reloadable: true
[2018-02-08T14:31:18,773][DEBUG][logstash.runner ] path.plugins: []
[2018-02-08T14:31:18,773][DEBUG][logstash.runner ] config.debug: false
[2018-02-08T14:31:18,776][DEBUG][logstash.runner ] *log.level: "debug" (default: "info")
[2018-02-08T14:31:18,783][DEBUG][logstash.runner ] version: false
[2018-02-08T14:31:18,784][DEBUG][logstash.runner ] help: false
[2018-02-08T14:31:18,784][DEBUG][logstash.runner ] log.format: "plain"
[2018-02-08T14:31:18,786][DEBUG][logstash.runner ] http.host: "127.0.0.1"
[2018-02-08T14:31:18,793][DEBUG][logstash.runner ] http.port: 9600..9700
[2018-02-08T14:31:18,793][DEBUG][logstash.runner ] http.environment: "production"
[2018-02-08T14:31:18,794][DEBUG][logstash.runner ] queue.type: "memory"
[2018-02-08T14:31:18,796][DEBUG][logstash.runner ] queue.drain: false
[2018-02-08T14:31:18,804][DEBUG][logstash.runner ] queue.page_capacity: 67108864
[2018-02-08T14:31:18,809][DEBUG][logstash.runner ] queue.max_bytes: 1073741824
[2018-02-08T14:31:18,822][DEBUG][logstash.runner ] queue.max_events: 0
[2018-02-08T14:31:18,823][DEBUG][logstash.runner ] queue.checkpoint.acks: 1024
[2018-02-08T14:31:18,836][DEBUG][logstash.runner ] queue.checkpoint.writes: 1024
[2018-02-08T14:31:18,837][DEBUG][logstash.runner ] queue.checkpoint.interval: 1000
[2018-02-08T14:31:18,846][DEBUG][logstash.runner ] dead_letter_queue.enable: false
[2018-02-08T14:31:18,854][DEBUG][logstash.runner ] dead_letter_queue.max_bytes: 1073741824
[2018-02-08T14:31:18,859][DEBUG][logstash.runner ] slowlog.threshold.warn: -1
[2018-02-08T14:31:18,868][DEBUG][logstash.runner ] slowlog.threshold.info: -1
[2018-02-08T14:31:18,873][DEBUG][logstash.runner ] slowlog.threshold.debug: -1
[2018-02-08T14:31:18,885][DEBUG][logstash.runner ] slowlog.threshold.trace: -1
[2018-02-08T14:31:18,887][DEBUG][logstash.runner ] keystore.classname: "org.logstash.secret.store.backend.JavaKeyStore"
[2018-02-08T14:31:18,896][DEBUG][logstash.runner ] keystore.file: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/config/logstash.keystore"
[2018-02-08T14:31:18,896][DEBUG][logstash.runner ] path.queue: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/data/queue"
[2018-02-08T14:31:18,911][DEBUG][logstash.runner ] path.dead_letter_queue: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/data/dead_letter_queue"
[2018-02-08T14:31:18,911][DEBUG][logstash.runner ] path.settings: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/config"
[2018-02-08T14:31:18,926][DEBUG][logstash.runner ] path.logs: "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/logs"
[2018-02-08T14:31:18,926][DEBUG][logstash.runner ] --------------- Logstash Settings -------------------
[2018-02-08T14:31:18,998][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-02-08T14:31:19,067][DEBUG][logstash.agent ] Setting up metric collection
[2018-02-08T14:31:19,147][DEBUG][logstash.instrument.periodicpoller.os] Starting {:polling_interval=>5, :polling_timeout=>120}
[2018-02-08T14:31:19,293][DEBUG][logstash.instrument.periodicpoller.jvm] Starting {:polling_interval=>5, :polling_timeout=>120}
[2018-02-08T14:31:19,422][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2018-02-08T14:31:19,429][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2018-02-08T14:31:19,453][DEBUG][logstash.instrument.periodicpoller.persistentqueue] Starting {:polling_interval=>5, :polling_timeout=>120}
[2018-02-08T14:31:19,464][DEBUG][logstash.instrument.periodicpoller.deadletterqueue] Starting {:polling_interval=>5, :polling_timeout=>120}
[2018-02-08T14:31:19,519][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.2.0"}
[2018-02-08T14:31:19,537][DEBUG][logstash.agent ] Starting agent
[2018-02-08T14:31:19,565][DEBUG][logstash.agent ] Starting puma
[2018-02-08T14:31:19,580][DEBUG][logstash.agent ] Trying to start WebServer {:port=>9600}
[2018-02-08T14:31:19,654][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>["/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/CONTRIBUTORS", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/Gemfile", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/Gemfile.lock", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/LICENSE", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/NOTICE.TXT", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/bin", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/config", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/data", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/lib", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/logs", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/logstash-core", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/logstash-core-plugin-api", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/modules", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/tools", "/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/vendor"]}
[2018-02-08T14:31:19,658][DEBUG][logstash.api.service ] [api-service] start
[2018-02-08T14:31:19,662][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/mnt/c/Users/anthony.maffert/l/logstash-6.2.0/logstash.conf"}
[2018-02-08T14:31:19,770][DEBUG][logstash.agent ] Converging pipelines state {:actions_count=>1}
[2018-02-08T14:31:19,776][DEBUG][logstash.agent ] Executing action {:action=>LogStash::PipelineAction::Create/pipeline_id:main}
[2018-02-08T14:31:19,948][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-02-08T14:31:21,157][DEBUG][logstash.plugins.registry] On demand adding plugin to the registry {:name=>"jdbc", :type=>"input", :class=>LogStash::Inputs::Jdbc}
[2018-02-08T14:31:21,557][DEBUG][logstash.plugins.registry] On demand adding plugin to the registry {:name=>"plain", :type=>"codec", :class=>LogStash::Codecs::Plain}
[2018-02-08T14:31:21,580][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#id = "plain_32fc0754-0187-437b-9d4d-2611eaba9a45"
[2018-02-08T14:31:21,581][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#enable_metric = true
[2018-02-08T14:31:21,581][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#charset = "UTF-8"
[2018-02-08T14:31:21,612][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_connection_string = "jdbc:mysql://localhost:3306/testdb"
[2018-02-08T14:31:21,613][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_user = "root"
[2018-02-08T14:31:21,616][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_password = <password>
[2018-02-08T14:31:21,623][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_driver_library = "/mnt/c/Users/anthony.maffert/Desktop/DocumentsUbuntu/mysql-connector-java-5.1.45/mysql-connector-java-5.1.45-bin.jar"
[2018-02-08T14:31:21,624][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_driver_class = "com.mysql.jdbc.Driver"
[2018-02-08T14:31:21,631][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#statement = "SELECT * FROM testtable"
[2018-02-08T14:31:21,633][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#id = "ff7529f734e0813846bc8e3b2bcf0794d99ff5cb61b947e0497922b083b3851a"
[2018-02-08T14:31:21,647][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#enable_metric = true
[2018-02-08T14:31:21,659][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#codec = <LogStash::Codecs::Plain id=>"plain_32fc0754-0187-437b-9d4d-2611eaba9a45", enable_metric=>true, charset=>"UTF-8">
[2018-02-08T14:31:21,663][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#add_field = {}
[2018-02-08T14:31:21,663][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_paging_enabled = false
[2018-02-08T14:31:21,678][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_page_size = 100000
[2018-02-08T14:31:21,679][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_validate_connection = false
[2018-02-08T14:31:21,693][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_validation_timeout = 3600
[2018-02-08T14:31:21,694][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#jdbc_pool_timeout = 5
[2018-02-08T14:31:21,708][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#sequel_opts = {}
[2018-02-08T14:31:21,708][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#sql_log_level = "info"
[2018-02-08T14:31:21,715][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#connection_retry_attempts = 1
[2018-02-08T14:31:21,716][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#connection_retry_attempts_wait_time = 0.5
[2018-02-08T14:31:21,721][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#parameters = {}
[2018-02-08T14:31:21,723][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#last_run_metadata_path = "/home/maffer_a/.logstash_jdbc_last_run"
[2018-02-08T14:31:21,731][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#use_column_value = false
[2018-02-08T14:31:21,731][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#tracking_column_type = "numeric"
[2018-02-08T14:31:21,745][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#clean_run = false
[2018-02-08T14:31:21,746][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#record_last_run = true
[2018-02-08T14:31:21,808][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#lowercase_column_names = true
[2018-02-08T14:31:21,808][DEBUG][logstash.inputs.jdbc ] config LogStash::Inputs::Jdbc/#columns_charset = {}
[2018-02-08T14:31:21,830][DEBUG][logstash.plugins.registry] On demand adding plugin to the registry {:name=>"stdout", :type=>"output", :class=>LogStash::Outputs::Stdout}
[2018-02-08T14:31:21,893][DEBUG][logstash.plugins.registry] On demand adding plugin to the registry {:name=>"json_lines", :type=>"codec", :class=>LogStash::Codecs::JSONLines}
[2018-02-08T14:31:21,901][DEBUG][logstash.codecs.jsonlines] config LogStash::Codecs::JSONLines/#id = "json_lines_e27ae5ff-5352-4061-9415-c75234fafc91"
[2018-02-08T14:31:21,902][DEBUG][logstash.codecs.jsonlines] config LogStash::Codecs::JSONLines/#enable_metric = true
[2018-02-08T14:31:21,902][DEBUG][logstash.codecs.jsonlines] config LogStash::Codecs::JSONLines/#charset = "UTF-8"
[2018-02-08T14:31:21,905][DEBUG][logstash.codecs.jsonlines] config LogStash::Codecs::JSONLines/#delimiter = "\n"
[2018-02-08T14:31:21,915][DEBUG][logstash.outputs.stdout ] config LogStash::Outputs::Stdout/#codec = <LogStash::Codecs::JSONLines id=>"json_lines_e27ae5ff-5352-4061-9415-c75234fafc91", enable_metric=>true, charset=>"UTF-8", delimiter=>"\n">
[2018-02-08T14:31:21,924][DEBUG][logstash.outputs.stdout ] config LogStash::Outputs::Stdout/#id = "4fb47c5631fa87c6a839a6f476077e9fa55456c479eee7251568f325435f3bbc"
[2018-02-08T14:31:21,929][DEBUG][logstash.outputs.stdout ] config LogStash::Outputs::Stdout/#enable_metric = true
[2018-02-08T14:31:21,939][DEBUG][logstash.outputs.stdout ] config LogStash::Outputs::Stdout/#workers = 1
[2018-02-08T14:31:23,217][DEBUG][logstash.plugins.registry] On demand adding plugin to the registry {:name=>"amazon_es", :type=>"output", :class=>LogStash::Outputs::AmazonES}
[2018-02-08T14:31:23,287][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#id = "plain_673a059d-4236-4f10-ba64-43ee33e050e4"
[2018-02-08T14:31:23,288][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#enable_metric = true
[2018-02-08T14:31:23,288][DEBUG][logstash.codecs.plain ] config LogStash::Codecs::Plain/#charset = "UTF-8"
[2018-02-08T14:31:23,294][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#hosts = ["search-XXXXXXXXXXXXXX.eu-west-3.es.amazonaws.com"]
[2018-02-08T14:31:23,294][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#region = "eu-west-3"
[2018-02-08T14:31:23,295][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#aws_access_key_id = "XXXXXXXXXXX"
[2018-02-08T14:31:23,295][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#aws_secret_access_key = "XXXXXXXXXXXXX"
[2018-02-08T14:31:23,296][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#index = "test-migrate"
[2018-02-08T14:31:23,299][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#document_type = "data"
[2018-02-08T14:31:23,299][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#id = "7c6401c2f72c63f8d359a42a2f440a663303cb2cbfefff8fa32d64a6f571a527"
[2018-02-08T14:31:23,306][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#enable_metric = true
[2018-02-08T14:31:23,310][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#codec = <LogStash::Codecs::Plain id=>"plain_673a059d-4236-4f10-ba64-43ee33e050e4", enable_metric=>true, charset=>"UTF-8">
[2018-02-08T14:31:23,310][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#workers = 1
[2018-02-08T14:31:23,310][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#manage_template = true
[2018-02-08T14:31:23,317][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#template_name = "logstash"
[2018-02-08T14:31:23,325][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#template_overwrite = false
[2018-02-08T14:31:23,326][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#port = 443
[2018-02-08T14:31:23,332][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#protocol = "https"
[2018-02-08T14:31:23,333][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#flush_size = 500
[2018-02-08T14:31:23,335][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#idle_flush_time = 1
[2018-02-08T14:31:23,340][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#action = "index"
[2018-02-08T14:31:23,341][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#path = "/"
[2018-02-08T14:31:23,341][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#max_retries = 3
[2018-02-08T14:31:23,341][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#retry_max_items = 5000
[2018-02-08T14:31:23,342][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#retry_max_interval = 5
[2018-02-08T14:31:23,342][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#doc_as_upsert = false
[2018-02-08T14:31:23,342][DEBUG][logstash.outputs.amazones] config LogStash::Outputs::AmazonES/#upsert = ""
[2018-02-08T14:31:23,426][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-02-08T14:31:23,476][DEBUG][logstash.outputs.amazones] Normalizing http path {:path=>"/", :normalized=>"/"}
[2018-02-08T14:31:23,791][INFO ][logstash.outputs.amazones] Automatic template management enabled {:manage_template=>"true"}
[2018-02-08T14:31:23,835][INFO ][logstash.outputs.amazones] Using mapping template {:template=>{"template"=>"logstash-*", "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "omit_norms"=>true}, "dynamic_templates"=>[{"message_field"=>{"match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fields"=>{"raw"=>{"type"=>"string", "index"=>"not_analyzed", "ignore_above"=>256}}}}}], "properties"=>{"@version"=>{"type"=>"string", "index"=>"not_analyzed"}, "geoip"=>{"type"=>"object", "dynamic"=>true, "properties"=>{"location"=>{"type"=>"geo_point"}}}}}}}}
[2018-02-08T14:31:24,480][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2018-02-08T14:31:24,482][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2018-02-08T14:31:25,242][ERROR][logstash.outputs.amazones] Failed to install template: [400] {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"No handler for type [string] declared on field [@version]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [_default_]: No handler for type [string] declared on field [@version]","caused_by":{"type":"mapper_parsing_exception","reason":"No handler for type [string] declared on field [@version]"}},"status":400}
[2018-02-08T14:31:25,246][INFO ][logstash.outputs.amazones] New Elasticsearch output {:hosts=>["search-XXXXXXXXXXXX.eu-west-3.es.amazonaws.com"], :port=>443}
[2018-02-08T14:31:25,619][INFO ][logstash.pipeline ] Pipeline started succesfully {:pipeline_id=>"main", :thread=>"#<Thread:0x42da9cf8 run>"}
[2018-02-08T14:31:25,712][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
Thu Feb 08 14:31:26 GMT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[2018-02-08T14:31:26,716][INFO ][logstash.inputs.jdbc ] (0.008417s) SELECT version()
[2018-02-08T14:31:26,858][INFO ][logstash.inputs.jdbc ] (0.002332s) SELECT count(*) AS `count` FROM (SELECT * FROM testtable) AS `t1` LIMIT 1
[2018-02-08T14:31:26,863][DEBUG][logstash.inputs.jdbc ] Executing JDBC query {:statement=>"SELECT * FROM testtable", :parameters=>{:sql_last_value=>2018-02-08 14:23:01 UTC}, :count=>3}
[2018-02-08T14:31:26,873][INFO ][logstash.inputs.jdbc ] (0.000842s) SELECT * FROM testtable
[2018-02-08T14:31:27,022][DEBUG][logstash.inputs.jdbc ] Closing {:plugin=>"LogStash::Inputs::Jdbc"}
[2018-02-08T14:31:27,023][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.918Z, "personid"=>4004, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Richard", "lastname"=>"Baron"}}
[2018-02-08T14:31:27,023][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.919Z, "personid"=>4003, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Sharon", "lastname"=>"McWell"}}
[2018-02-08T14:31:27,023][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.890Z, "personid"=>4005, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Jaques", "lastname"=>"Kallis"}}
[2018-02-08T14:31:27,032][DEBUG][logstash.pipeline ] output received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.918Z, "personid"=>4004, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Richard", "lastname"=>"Baron"}}
[2018-02-08T14:31:27,035][DEBUG][logstash.pipeline ] output received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.890Z, "personid"=>4005, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Jaques", "lastname"=>"Kallis"}}
[2018-02-08T14:31:27,040][DEBUG][logstash.pipeline ] output received {"event"=>{"@timestamp"=>2018-02-08T14:31:26.919Z, "personid"=>4003, "city"=>"Cape Town", "@version"=>"1", "firstname"=>"Sharon", "lastname"=>"McWell"}}
[2018-02-08T14:31:27,047][DEBUG][logstash.pipeline ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0x42da9cf8 sleep>"}
[2018-02-08T14:31:27,053][DEBUG][logstash.pipeline ] Shutting down filter/output workers {:pipeline_id=>"main", :thread=>"#<Thread:0x42da9cf8 run>"}
[2018-02-08T14:31:27,062][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x3f1899bb#[main]>worker0 run>"}
[2018-02-08T14:31:27,069][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x41529ca4#[main]>worker1 run>"}
[2018-02-08T14:31:27,070][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x1c56e6d6#[main]>worker2 run>"}
[2018-02-08T14:31:27,083][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x2f767b45#[main]>worker3 sleep>"}
[2018-02-08T14:31:27,083][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x2017b165#[main]>worker4 run>"}
[2018-02-08T14:31:27,098][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x65923ecd#[main]>worker5 sleep>"}
[2018-02-08T14:31:27,099][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0x1714b839#[main]>worker6 run>"}
[2018-02-08T14:31:27,113][DEBUG][logstash.pipeline ] Pushing shutdown {:pipeline_id=>"main", :thread=>"#<Thread:0xcbee48c#[main]>worker7 run>"}
[2018-02-08T14:31:27,116][DEBUG][logstash.pipeline ] Shutdown waiting for worker thread {:pipeline_id=>"main", :thread=>"#<Thread:0x3f1899bb#[main]>worker0 run>"}
{"@timestamp":"2018-02-08T14:31:26.919Z","personid":4003,"city":"Cape Town","@version":"1","firstname":"Sharon","lastname":"McWell"}
{"@timestamp":"2018-02-08T14:31:26.918Z","personid":4004,"city":"Cape Town","@version":"1","firstname":"Richard","lastname":"Baron"}
{"@timestamp":"2018-02-08T14:31:26.890Z","personid":4005,"city":"Cape Town","@version":"1","firstname":"Jaques","lastname":"Kallis"}
[2018-02-08T14:31:27,153][DEBUG][logstash.pipeline ] Shutdown waiting for worker thread {:pipeline_id=>"main", :thread=>"#<Thread:0x41529ca4#[main]>worker1 run>"}
[2018-02-08T14:31:27,158][DEBUG][logstash.pipeline ] Shutdown waiting for worker thread {:pipeline_id=>"main", :thread=>"#<Thread:0x1c56e6d6#[main]>worker2 run>"}
[2018-02-08T14:31:27,200][DEBUG][logstash.outputs.amazones] Flushing output {:outgoing_count=>1, :time_since_last_flush=>1.927723, :outgoing_events=>{nil=>[["index", {:_id=>nil, :_index=>"test-migrate", :_type=>"data", :_routing=>nil}, #<LogStash::Event:0x1bacf548>]]}, :batch_timeout=>1, :force=>nil, :final=>nil}
[2018-02-08T14:31:27,207][DEBUG][logstash.pipeline ] Shutdown waiting for worker thread {:pipeline_id=>"main", :thread=>"#<Thread:0x2f767b45#[main]>worker3 sleep>"}
[2018-02-08T14:31:27,251][DEBUG][logstash.instrument.periodicpoller.os] Stopping
[2018-02-08T14:31:27,271][DEBUG][logstash.instrument.periodicpoller.jvm] Stopping
[2018-02-08T14:31:27,273][DEBUG][logstash.instrument.periodicpoller.persistentqueue] Stopping
[2018-02-08T14:31:27,281][DEBUG][logstash.instrument.periodicpoller.deadletterqueue] Stopping
[2018-02-08T14:31:27,356][DEBUG][logstash.agent ] Shutting down all pipelines {:pipelines_count=>1}
[2018-02-08T14:31:27,362][DEBUG][logstash.agent ] Converging pipelines state {:actions_count=>1}
[2018-02-08T14:31:27,363][DEBUG][logstash.agent ] Executing action {:action=>LogStash::PipelineAction::Stop/pipeline_id:main}
[2018-02-08T14:31:27,385][DEBUG][logstash.pipeline ] Stopping inputs {:pipeline_id=>"main", :thread=>"#<Thread:0x42da9cf8 sleep>"}
[2018-02-08T14:31:27,389][DEBUG][logstash.inputs.jdbc ] Stopping {:plugin=>"LogStash::Inputs::Jdbc"}
[2018-02-08T14:31:27,399][DEBUG][logstash.pipeline ] Stopped inputs {:pipeline_id=>"main", :thread=>"#<Thread:0x42da9cf8 sleep>"}
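The "No handler for type [string]" error in your log comes from the default index template that the plugin tries to install: it still uses the string field type, which Elasticsearch 6.x no longer accepts. You should try to add the index template yourself. Copy this ES 6.x template to your local file system and then add the template setting to your amazon_es output; it should work: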
amazon_es {
  hosts => ["search-xxxxx.eu-west-3.es.amazonaws.com"]
  region => "eu-west-3"
  aws_access_key_id => 'xxxxxxxxxxxxxxxxxxxxxx'
  aws_secret_access_key => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  index => "test-migrate"
  document_type => "data"
  template => '/path/to/template.json'
}
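If you would rather adapt the old template than copy a new one, the core change is replacing string fields with text or keyword. Here is a rough sketch of that conversion in Python; modernize_mapping is a hypothetical helper and not part of Logstash or the amazon_es plugin. It only rewrites the type and index keys, so other legacy settings in the old template (such as _all or omit_norms) may still need manual attention.

```python
def modernize_mapping(node):
    """Recursively rewrite legacy 'string' field definitions as 'text' or 'keyword'."""
    if isinstance(node, list):
        return [modernize_mapping(item) for item in node]
    if not isinstance(node, dict):
        return node
    # Build a new dict so the input template is left untouched
    out = {key: modernize_mapping(value) for key, value in node.items()}
    if out.get("type") == "string":
        index = out.pop("index", "analyzed")
        if index == "not_analyzed":
            out["type"] = "keyword"   # exact-value field
        else:
            out["type"] = "text"      # analyzed full-text field
            if index == "no":
                out["index"] = False
    return out
```

You could load the old template with json.load, pass it through modernize_mapping, write the result to template.json, and point the template setting at that file.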