I am completely new to ElastAlert. I am trying to use ElastAlert to trigger an email when no log is sent to Logstash from my client server. I have successfully installed ElastAlert on my master server. However, when I run elastalert-create-index I get the following error:
Traceback (most recent call last):
  File "/usr/bin/elastalert-create-index", line 11, in <module>
    load_entry_point('elastalert==0.1.21', 'console_scripts', 'elastalert-create-index')()
  File "/usr/lib/python2.7/site-packages/elastalert-0.1.21-py2.7.egg/elastalert/create_index.py", line 77, in main
    username = args.username if args.username else data.get('es_username')
UnboundLocalError: local variable 'data' referenced before assignment
My config.yaml is as follows:
# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: example_rules
# How often ElastAlert will query Elasticsearch
# The unit can be anything from weeks to seconds
run_every:
  minutes: 1
# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 15
# The Elasticsearch hostname for metadata writeback
# Note that every rule can have its own Elasticsearch host
es_host: localhost
# The Elasticsearch port
es_port: 9200
# The AWS region to use. Set this when using AWS-managed elasticsearch
#aws_region: us-east-1
# The AWS profile to use. Use this if you are using an aws-cli profile.
# See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
# for details
#profile: test
# Optional URL prefix for Elasticsearch
#es_url_prefix: elasticsearch
# Connect with TLS to Elasticsearch
#use_ssl: True
# Verify TLS certificates
#verify_certs: True
# GET request with body is the default option for Elasticsearch.
# If it fails for some reason, you can pass 'GET', 'POST' or 'source'.
# See http://elasticsearch-py.readthedocs.io/en/master/connection.html?highlight=send_get_body_as#transport
# for details
#es_send_get_body_as: GET
# Optional basic-auth username and password for Elasticsearch
#es_username:
#es_password:
# Use SSL authentication with client certificates. client_cert must be
# a PEM file containing both the cert and key for the client
#verify_certs: True
#ca_certs: /path/to/cacert.pem
#client_cert: /path/to/client_cert.pem
#client_key: /path/to/client_key.key
# The index on es_host which is used for metadata storage
# This can be an unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: elastalert_status
# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2
Did you try running elastalert-create-index without any arguments? It guides you through the setup process like this:
$>elastalert-create-index
Enter Elasticsearch host: localhost
Enter Elasticsearch port: 9200
Use SSL? t/f: f
Enter optional basic-auth username (or leave blank):
Enter optional basic-auth password (or leave blank):
Enter optional Elasticsearch URL prefix (prepends a string to the URL of every request):
New index name? (Default elastalert_status)
Name of existing index to copy? (Default None)
Elastic Version:6
Mapping used for string:{'type': 'keyword'}
New index elastalert_status created
Done!
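A likely explanation for the UnboundLocalError, judging from the traceback (this is an inference, not something stated above): in this version of create_index.py, the parsed configuration variable is only assigned when a config.yaml is found relative to the current working directory, but it is referenced unconditionally afterwards. Running the command from the directory that actually contains your config.yaml, or upgrading ElastAlert to a release where this is fixed, avoids the crash:
cd /opt/elastalert    # hypothetical path: wherever your config.yaml lives
elastalert-create-index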
We are using Fluent Bit as a sidecar container in our ECS Fargate cluster, which runs a .NET application. Initially we faced the issue of Fluent Bit sending the logs as multiple lines, and we solved that with the Fluent Bit multiline feature. Now the multiline logs are being sent to Sumo Logic, but they are sent as JSON, whereas we just want Fluent Bit to send only the raw log.
The logs currently look like this:
{
  "date": 1675120653.269619,
  "container_id": "xvgbertytyuuyuyu",
  "container_name": "XXXXXXXXXX",
  "source": "stdout",
  "log": "2023-01-30 23:17:33.269Z DEBUG [.NET ThreadPool Worker] Connection.ManagedDbConnection - ComponentInstanceEntityAsync - Executing stored proc: dbo.prcGetComponentInstance"
}
We want only the line
2023-01-30 23:17:33.269Z DEBUG [.NET ThreadPool Worker] Connection.ManagedDbConnection - ComponentInstanceEntityAsync - Executing stored proc: dbo.prcGetComponentInstance
You need to modify the Fluent Bit configuration to include the following filters and output configuration:
fluent.conf:
## prepare headers for Sumo Logic
[FILTER]
    Name   record_modifier
    Match  *
    Record headers.content-type text/plain

## Set headers as headers attribute
[FILTER]
    Name          nest
    Match         *
    Operation     nest
    Wildcard      headers.*
    Nest_under    headers
    Remove_prefix headers.

[OUTPUT]
    Name        http
    ...
    # use log key as body
    body_key    $log
    # use headers key as headers
    headers_key $headers
That way, you are going to craft the HTTP request manually. This sends one request per log record, which is not necessarily a good idea. To mitigate that, you can add the following parser and use it (flush_timeout may need adjustment):
parsers.conf
# merge everything as one big log
[MULTILINE_PARSER]
    name          multiline-all
    type          regex
    flush_timeout 500
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |  state name   | regex pattern | next state
    # ------|---------------|---------------|-----------
    rule      "start_state"   ".*"            "cont"
    rule      "cont"          ".*"            "cont"
fluent.conf:
[INPUT]
    name             tail
    ...
    multiline.parser multiline-all
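If you want to see what the record looks like after the filters before pointing the output at Sumo Logic, one option (a debugging sketch, not part of the original setup) is to temporarily add a stdout output next to the http one:
# temporary debug output: prints every processed record to the container's own logs
[OUTPUT]
    Name   stdout
    Match  *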
I tried to install Elasticsearch on an AWS EC2 instance.
I tried to set up the initial password of Elasticsearch with the following command and got this error message:
/usr/share/elasticsearch/bin
$ ./elasticsearch-reset-password -u elasticsearch
ERROR: Failed to determine the health of the cluster.
Here is my elasticsearch.yml file:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: aaaaaaa
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: ${HOSTNAME}
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
# Path to log files:
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.bind_host: 0.0.0.0
#network.publish_host: ${HOSTNAME}
network.host: ["0.0.0.0"]
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: - ${HOSTNAME}
discovery.type: single-node
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: - ${HOSTNAME}
#
# For more information, consult the discovery and cluster formation module documentation.
#
# --------------------------------- Readiness ----------------------------------
#
# Enable an unauthenticated TCP readiness endpoint on localhost
#
#readiness.port: 9399
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false
#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically
# generated to configure Elasticsearch security features on 22-10-2022 08:48:06
#
# --------------------------------------------------------------------------------
# Enable security features
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
enabled: true
keystore.path: certs/http.p12
# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
enabled: true
verification_mode: certificate
keystore.path: certs/transport.p12
truststore.path: certs/transport.p12
# Create a new cluster with the current node only
# Additional nodes can still join the cluster later
# cluster.initial_master_nodes: ["aaaaaaa"]
# Allow HTTP API connections from anywhere
# Connections are encrypted and require user authentication
http.host: 0.0.0.0
# Allow other nodes to join the cluster from anywhere
# Connections are encrypted and mutually authenticated
#transport.host: 0.0.0.0
#----------------------- END SECURITY AUTO CONFIGURATION -------------------------
This error means that your cluster is not ready yet when you are trying to change the cluster password.
First, you should configure your cluster, make sure it is healthy, and then change the password.
The built-in superuser for Elasticsearch is "elastic"; in older releases its default password was "changeme", while recent releases generate a password or have you set one yourself.
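Before retrying the reset, it is worth confirming that the node is actually up and reachable on the HTTP port (a rough sketch assuming the defaults and the security auto-configuration shown above). Note that the built-in superuser is elastic, so the reset command is normally run as ./elasticsearch-reset-password -u elastic.
sudo systemctl status elasticsearch        # is the service running at all?
sudo journalctl -u elasticsearch -n 50     # startup errors, e.g. bootstrap.memory_lock needing ulimit changes
sudo ss -tlnp | grep 9200                  # is anything listening on the HTTP port?
curl -k https://localhost:9200             # with security enabled, an HTTP 401 here still proves the node is reachable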
I am trying to have Airflow email me using AWS SES whenever a task in my DAG fails to run or retries to run. I am using my AWS SES credentials rather than my general AWS credentials too.
My current airflow.cfg
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = emailsmtpserver.region.amazonaws.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = REMOVEDAWSACCESSKEY
smtp_password = REMOVEDAWSSECRETACCESSKEY
smtp_port = 25
smtp_mail_from = myemail#myjob.com
Current task in my DAG that is designed to intentionally fail and retry:
testfaildag_library_install_jar_jdbc = PythonOperator(
    task_id='library_install_jar',
    retries=3,
    retry_delay=timedelta(seconds=15),
    python_callable=add_library_to_cluster,
    params={'_task_id': 'cluster_create', '_cluster_name': CLUSTER_NAME, '_library_path': 's3000://fakepath.jar'},
    dag=dag,
    email_on_failure=True,
    email_on_retry=True,
    email='myname#myjob.com',
    provide_context=True
)
Everything works as designed: the task retries the set number of times and ultimately fails, except no emails are being sent. I have also checked the logs for the task mentioned above, and SMTP is never mentioned.
I've looked at a similar question here, but the only solution there did not work for me. Additionally, Airflow's documentation, such as their example here, does not seem to work for me either.
Does SES work with Airflow's email_on_failure and email_on_retry functions?
What I am currently thinking of doing is using the on_failure_callback function to call a python script provided by AWS here to send an email on failure, but that is not the preferable route at this point.
Thank you, appreciate any help.
--updated 6/8 with working SES
Here's my write-up on how we got it all working. There is a small summary at the bottom of this answer.
Couple of big points:
We originally decided not to use Amazon SES and to use sendmail instead; as of the update below, we now have SES up and working.
It is the Airflow worker that services the email_on_failure and email_on_retry features. You can run journalctl -u airflow-worker -f to monitor it during a DAG run. On your production server, you do NOT need to restart your airflow-worker after changing your airflow.cfg with new SMTP settings - it should be picked up automatically. No need to worry about messing up currently running DAGs.
Here is the technical write-up on how to use sendmail:
Since we changed from SES to sendmail on localhost, we had to change our SMTP settings in airflow.cfg.
The new config is:
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = localhost
smtp_starttls = False
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
#smtp_user = not used
#smtp_password = not used
smtp_port = 25
smtp_mail_from = myjob#mywork.com
This works in both production and local airflow instances.
Some common errors one might receive if their config is not like mine above:
socket.error: [Errno 111] Connection refused -- you must change your smtp_host line in airflow.cfg to localhost
smtplib.SMTPException: STARTTLS extension not supported by server. -- you must change your smtp_starttls in airflow.cfg to False
In my local testing, I tried to simply force Airflow to show a log of what was going on when it tried to send an email - I created a fake DAG as follows:
# Airflow imports
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator

# General imports
from datetime import datetime, timedelta


def throwerror():
    raise ValueError("Failure")


SPARK_V_2_2_1 = '3.5.x-scala2.11'

args = {
    'owner': 'me',
    'email': ['me#myjob'],
    'depends_on_past': False,
    'start_date': datetime(2018, 5, 24),
    'end_date': datetime(2018, 6, 28)
}

dag = DAG(
    dag_id='testemaildag',
    default_args=args,
    catchup=False,
    schedule_interval="* 18 * * *"
)

t1 = DummyOperator(
    task_id='extract_data',
    dag=dag
)

t2 = PythonOperator(
    task_id='fail_task',
    dag=dag,
    python_callable=throwerror
)

t2.set_upstream(t1)
If you run journalctl -u airflow-worker -f, you can see that the worker says it has sent an alert email for the failure to the email in your DAG, but we were still not receiving the email. We then decided to look into the sendmail mail logs with cat /var/log/maillog. We saw a log like this:
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: connect from localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: ID: client=localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/cleanup[port]: ID: message-id=<randomMessageID#production-server-ip-range-ec2-instance>
Jun 5 14:10:25 production-server-ip-range postfix/smtpd[port]: disconnect from localhost[127.0.0.1]
Jun 5 14:10:25 production-server-ip-range postfix/qmgr[port]: MESSAGEID: from=<myjob#mycompany.com>, size=1297, nrcpt=1 (queue active)
Jun 5 14:10:55 production-server-ip-range postfix/smtp[port]: connect to aspmx.l.google.com[smtp-ip-range]:25: Connection timed out
Jun 5 14:11:25 production-server-ip-range postfix/smtp[port]: connect to alt1.aspmx.l.google.com[smtp-ip-range]:25: Connection timed out
So this is probably the biggest "oh, duh" moment. Here we are able to see what is actually going on in our SMTP service. We used telnet to confirm that we were not able to connect from the instance to Gmail's MX servers (the targeted IP ranges) on port 25.
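For example, a quick connectivity check from the EC2 instance (the hostname is taken from the maillog above) either opens an SMTP banner or hangs until it times out:
telnet aspmx.l.google.com 25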
We determined that the email was attempting to be sent, but that the sendmail service was unable to connect to those IP ranges successfully.
We decided to allow all outbound traffic on port 25 in AWS (as our Airflow production environment is an EC2 instance), and it now works successfully. We are now able to receive emails on failures and retries (tip: email_on_failure and email_on_retry default to True in the DAG API reference - you do not need to put them in your args if you do not want to, but it is still good practice to explicitly state True or False).
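For reference, a sketch of opening outbound port 25 with the AWS CLI (the security group ID is a placeholder; note also that EC2 throttles outbound port 25 by default, so you may additionally need to ask AWS to lift that restriction):
aws ec2 authorize-security-group-egress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 25 \
    --cidr 0.0.0.0/0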
SES now works. Here is the Airflow config:
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = emailsmtpserver.region.amazonaws.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = REMOVEDAWSACCESSKEY
smtp_password = REMOVEDAWSSECRETACCESSKEY
smtp_port = 587
smtp_mail_from = myemail#myjob.com (Verified SES email)
Thanks!
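If SES mails still do not arrive, one way to take Airflow out of the picture is to test the SMTP credentials directly with a short script (a sketch using Python's standard smtplib; the endpoint, credentials, and addresses are placeholders to replace with your own, and the sender must be an SES-verified address):
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "SES SMTP test outside Airflow"
msg["From"] = "verified-sender@example.com"   # must be verified in SES
msg["To"] = "recipient@example.com"
msg.set_content("If this arrives, the SES SMTP credentials and network path are fine.")

# placeholder endpoint: use the SES SMTP endpoint for your region and your SMTP credentials (not IAM access keys)
with smtplib.SMTP("email-smtp.us-east-1.amazonaws.com", 587) as server:
    server.starttls()
    server.login("SES_SMTP_USERNAME", "SES_SMTP_PASSWORD")
    server.send_message(msg)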
Similar case here: I tried to follow the same debugging process but got no log output. Also, the outbound rule for my Airflow EC2 instance is open to all ports and IPs, so it must be some other cause.
I noticed that when you create the SMTP credential from SES, it also creates an IAM user. I am not sure how Airflow is running in your case (bare metal on an EC2 instance or wrapped in containers), or how that user's access is set up.
I use the syslog plugin available with the Kong API gateway and I have the following:
{ "api_id": "some_id",
"id": "some_id",
"created_at": 4544444,
"enabled": true,
"name": "syslog",
"config":
{ "client_errors_severity": "info",
"server_errors_severity": "info",
"successful_severity": "info",
"log_level": "emerg"
} }
I use CentOS 7, and I have the following conf file (/etc/rsyslog.conf):
# rsyslog configuration file
# For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
# If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html
#### MODULES ####
# The imjournal module bellow is now used as a message source instead of imuxsock.
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imjournal # provides access to the systemd journal
#$ModLoad imklog # reads kernel messages (the same are read from journald)
#$ModLoad immark # provides --MARK-- message capability
# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514
# Provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514
#### GLOBAL DIRECTIVES ####
# Where to place auxiliary files
$WorkDirectory /var/lib/rsyslog
# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$template precise,"%syslogpriority%,%syslogfacility%,%timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"
$ActionFileDefaultTemplate precise
# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
#$ActionFileEnableSync on
# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf
# Turn off message reception via local log socket;
# local messages are retrieved through imjournal now.
$OmitLocalLogging on
# File to store the position in the journal
$IMJournalStateFile imjournal.state
#### RULES ####
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages
# The authpriv file has restricted access.
authpriv.* /var/log/secure
# Log all the mail messages in one place.
mail.* -/var/log/maillog
# Log cron stuff
cron.* /var/log/cron
# Everybody gets emergency messages
*.emerg :omusrmsg:*
# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler
# Save boot messages also to boot.log
local7.* /var/log/boot.log
# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
#$ActionQueueFileName fwdRule1 # unique name prefix for spool files
#$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
#$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
#$ActionQueueType LinkedList # run asynchronously
#$ActionResumeRetryCount -1 # infinite retries if host is down
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
#*.* @@remote-host:514
# ### end of the forwarding rule ###
The syslog daemon is running on the CentOS 7 machine, and it is configured with a logging severity (info) that is the same as or lower than the set config.log_level (emerg).
I am unable to see any logs in /var/log/messages.
For proper logging, the syslog daemon must be configured with a logging severity level that is the same as or lower than the set config.log_level.
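A quick way to check that rsyslog itself writes info-level messages to /var/log/messages, independent of Kong (a simple sanity check, not from the original answer):
logger -p user.info "kong syslog plugin test message"
tail -n 5 /var/log/messages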
I have a RADIUS client configuration file in /etc/raddb/server, and from it I want to get the valid IP addresses without the commented lines. So I'm using:
grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' /etc/raddb/server
127.0.0.1
192.168.0.147
But I want to ignore 127.0.0.1, which is commented out with #. How can I do this?
The /etc/raddb/server file is as follows:
cat /etc/raddb/server
# pam_radius_auth configuration file. Copy to: /etc/raddb/server
#
# For proper security, this file SHOULD have permissions 0600,
# that is readable by root, and NO ONE else. If anyone other than
# root can read this file, then they can spoof responses from the server!
#
# There are 3 fields per line in this file. There may be multiple
# lines. Blank lines or lines beginning with '#' are treated as
# comments, and are ignored. The fields are:
#
# server[:port] secret [timeout]
#
# the port name or number is optional. The default port name is
# "radius", and is looked up from /etc/services The timeout field is
# optional. The default timeout is 3 seconds.
#
# If multiple RADIUS server lines exist, they are tried in order. The
# first server to return success or failure causes the module to return
# success or failure. Only if a server fails to response is it skipped,
# and the next server in turn is used.
#
# The timeout field controls how many seconds the module waits before
# deciding that the server has failed to respond.
#
# server[:port] shared_secret timeout (s)
#127.0.0.1 secret 1
#other-server other-secret 3
192.168.0.147:1812 testing123 1
#
# having localhost in your radius configuration is a Good Thing.
#
# See the INSTALL file for pam.conf hints.
Try grep -o '^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' /etc/raddb/server. The leading ^ anchors the match to the start of the line, so addresses on commented lines (which begin with #) are not matched.
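An alternative that also handles indented comments and addresses that are not at the very start of a line (a sketch using standard grep only) is to strip comment lines first and then extract the addresses:
grep -v '^[[:space:]]*#' /etc/raddb/server | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}'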