Logstash grok & regex filter - regex

I want a filter that extracts certain information from log messages.
I'm currently using the following, although it's very specific to one format/log layout:
filter {
  if "ONT" in [message] {
    grok {
      match => { "message" => "%{SYSLOGBASE} %{WORD:Alarm_Severity} %{DATA:Message} %{QS:ONT_ID} %{DATA:Time} %{QS:ONT_Message}" }
    }
  }
}
The log files are:
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: clear Alarm for card 1/1 at 2019/12/16 15:01:13.39: "Backup files exist"
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: service "403 for ONT: "10002" - ONT needs restart at 2019/12/16 15:01:13.39 ONT message: "Backup files exist"
I want the output to give me these parameters:
Time: 15:01:13
Host: NPF_OLT_LAB05
Alarm Severity: clear
ONT ID: 10002
Source IP: 172.20.x.xx
ONT Message: "Backup files exist"
Message: clear Alarm for card 1/1
Service ID: 403

I guess these are two different logs, so you need two different grok patterns, as below.
Dec 16 15:01:13 172.20.12.12 NPF_OLT_LAB05: clear Alarm for card 1/1 at 2019/12/16 15:01:13.39: "Backup files exist"
Grok pattern
(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\:\s(%{WORD:Severity} %{GREEDYDATA:Message})\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME})\S\s\S%{GREEDYDATA:ONTMessage}\"
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: service "403 for ONT: "10002" - ONT needs restart at 2019/12/16 15:01:13.39 ONT message: "Backup files exist"
Grok pattern
(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\S\s%{WORD:Severity}\s\S%{BASE10NUM:ServiceID} %{NOTSPACE}\s(?:ONT: \S%{BASE10NUM:ONT_ID}\S) %{NOTSPACE} %{GREEDYDATA:Message}\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME}) (?:ONT message\: \S(?<ONT_Message>%{GREEDYDATA}\S))
Below is the conf:
filter {
  if "ONT" in [message] {
    grok {
      match => { "message" => [ "(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\:\s(%{WORD:Severity} %{GREEDYDATA:Message})\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME})\S\s\S%{GREEDYDATA:ONTMessage}\"" ,
        "(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\S\s%{WORD:Severity}\s\S%{BASE10NUM:ServiceID} %{NOTSPACE}\s(?:ONT: \S%{BASE10NUM:ONT_ID}\S) %{NOTSPACE} %{GREEDYDATA:Message}\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME}) (?:ONT message\: \S(?<ONT_Message>%{GREEDYDATA}\S))" ]
      }
    }
  }
}
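Since grok only produces string fields, you may also want a date filter to promote the captured event time into @timestamp. A minimal sketch, assuming the timestamp field produced by the patterns above and the 2019/12/16 15:01:13.39 format from the sample logs:
filter {
  date {
    # "timestamp" comes from the grok patterns above, e.g. "2019/12/16 15:01:13.39"
    match => [ "timestamp", "yyyy/MM/dd HH:mm:ss.SS" ]
    target => "@timestamp"
  }
}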

Related

Logstash AWS solving code 403 trying to reconnect

I'm trying to push documents from my local machine to an Elasticsearch server in AWS, and when doing so I get a 403 error and Logstash keeps trying to establish a connection with the server, like so:
[2021-05-09T11:09:52,707][TRACE][logstash.inputs.file ][main] Registering file input {:path=>["~/home/ubuntu/json_try/json_try.json"]}
[2021-05-09T11:09:52,737][DEBUG][logstash.javapipeline ][main] Shutdown waiting for worker thread {:pipeline_id=>"main", :thread=>"#<Thread:0x5033269f run>"}
[2021-05-09T11:09:53,441][DEBUG][logstash.outputs.amazonelasticsearch][main] Waiting for connectivity to Elasticsearch cluster. Retrying in 4s
[2021-05-09T11:09:56,403][INFO ][logstash.outputs.amazonelasticsearch][main] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>https://my-dom.co:8001/scans, :path=>"/"}
[2021-05-09T11:09:56,461][WARN ][logstash.outputs.amazonelasticsearch][main] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"https://my-dom.co:8001/scans", :error_type=>LogStash::Outputs::AmazonElasticSearch::HttpClient::Pool::BadResponseCodeError, :error=>"Got response code '403' contacting Elasticsearch at URL 'https://my-dom.co:8001/scans/'"}
[2021-05-09T11:09:56,849][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2021-05-09T11:09:56,853][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2021-05-09T11:09:57,444][DEBUG][logstash.outputs.amazonelasticsearch][main] Waiting for connectivity to Elasticsearch cluster. Retrying in 8s
.
.
.
I'm using the following logstash conf file:
input {
  file {
    type => "json"
    path => "~/home/ubuntu/json_try/json_try.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
output {
  amazon_es {
    hosts => ["https://my-dom.co/scans"]
    port => 8001
    ssl => true
    region => "us-east-1b"
    index => "snapshot-%{+YYYY.MM.dd}"
  }
}
Also, I've exported AWS keys for the SSL to work. Is there anything I'm missing for the connection to succeed?
I've been able to solve this by using elasticsearch as my output plugin instead of amazon_es.
This requires the cloud_id of the target AWS node, its cloud_auth credentials, and the target index in Elasticsearch for the data to be sent to. So the conf file will look something like this:
input {
  file {
    type => "json"
    path => "~/home/ubuntu/json_try/json_try.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
output {
  elasticsearch {
    cloud_id => "node_name:node_hash"
    cloud_auth => "auth_hash"
    index => "snapshot-%{+YYYY.MM.dd}"
  }
}

What is the grok pattern for this jenkins log?

Can you please help me with the grok pattern for this Jenkins sample data/log? The log is only a single line.
hudson.slaves.CommandLauncher launch\nSEVERE: Unable to launch the agent for dot-dewsttlas403-ci\njava.io.IOException: Failed to create a temporary file in /opt_shared/iit_slave/jenkins_slave/workspace\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:144)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:109)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:84)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:74)\n\tat hudson.util.TextFile.write(TextFile.java:116)\n\tat jenkins.branch.WorkspaceLocatorImpl$WriteAtomic.invoke(WorkspaceLocatorImpl.java:264)\n\tat jenkins.branch.WorkspaceLocatorImpl$WriteAtomic.invoke(WorkspaceLocatorImpl.java:256)\n\tat hudson.FilePath$FileCallableWrapper.call(FilePath.java:3042)\n\tat hudson.remoting.UserRequest.perform(UserRequest.java:212)\n\tat hudson.remoting.UserRequest.perform(UserRequest.java:54)\n\tat hudson.remoting.Request$2.run(Request.java:369)\n\tat hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n\tSuppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to dot-dewsttlas403-ci\n\t\tat hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)\n\t\tat hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)\n\t\tat hudson.remoting.Channel.call(Channel.java:957)\n\t\tat hudson.FilePath.act(FilePath.java:1069)\n\t\tat hudson.FilePath.act(FilePath.java:1058)\n\t\tat jenkins.branch.WorkspaceLocatorImpl.save(WorkspaceLocatorImpl.java:254)\n\t\tat jenkins.branch.WorkspaceLocatorImpl.access$500(WorkspaceLocatorImpl.java:80)\n\t\tat jenkins.branch.WorkspaceLocatorImpl$Collector.onOnline(WorkspaceLocatorImpl.java:561)\n\t\tat hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:697)\n\t\tat hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:432)\n\t\tat hudson.slaves.CommandLauncher.launch(CommandLauncher.java:154)\n\t\tat hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)\n\t\tat jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)\n\t\tat jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)\n\t\tat java.util.concurrent.FutureTask.run(Unknown Source)\n\t\tat java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\n\t\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\n\t\tat java.lang.Thread.run(Unknown Source)\nCaused by: java.io.IOException: No space left on device\n\tat java.io.UnixFileSystem.createFileExclusively(Native Method)\n\tat java.io.File.createTempFile(File.java:2024)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:142)\n\t... 15 more\n
I am only interested in extracting the following from the above log:
agent status, agent name
Expected Result:
agent status: Unable
agent name: dot-dewsttlas403-ci
SEVERE: %{DATA:agent_status} to launch the agent for %{DATA:agent_name}\\n
This should give you the result you are interested in, but it would only work if the structure of the message is the same.
Configuration used:
input { stdin {} }
filter {
  grok {
    match => {
      "message" => "SEVERE: %{DATA:agent_status} to launch the agent for %{DATA:agent_name}\\n"
    }
  }
}
output { stdout { codec => json } }
Result:
{
  "host": "MY_COMPUTER",
  "agent_status": "Unable",
  "message": "hudson.slaves.CommandLauncher launch\\nSEVERE: Unable to launch the agent for dot-dewsttlas403-ci\\njava.io.IOException: Failed to create a temporary file in /opt_shared/iit_slave/jenkins_slave/workspace\\n\\tat \r",
  "agent_name": "dot-dewsttlas403-ci",
  "@timestamp": "2020-01-29T16:54:27.256Z",
  "@version": "1"
}
Also to help you next time you're working with logstash-grok:
An online tester for patterns:
http://grokconstructor.appspot.com/do/match
The basic grok patterns: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

How to create Regex pattern for fluentd

I am trying to parse daemon logs from my Linux machine into Elasticsearch using fluentd, but I am having a hard time creating a regex pattern for it. Below are a few of the logs from the daemon log:
Jun 5 06:46:14 user avahi-daemon[309]: Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*.
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting default route via fe80::1e56:feff:fe13:2da
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting route to 2402:3a80:9db:48da::/64
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting address fe80::a7c0:8b54:ee45:ea4
Jun 5 06:46:14 user avahi-daemon[309]: Withdrawing address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.
Jun 5 06:46:14 user avahi-daemon[309]: Leaving mDNS multicast group on interface wlan0.IPv6 with address fe80::a7c0:8b54:ee45:ea4.
So as you can see from the above logs, first we have the time of the logs, then we have the username and the daemon name, followed by the message.
I want to create the JSON format below for the above logs:
{
  "time": "Jun 5 06:46:14",
  "username": "user",
  "daemon": "avahi-daemon[309]",
  "msg": "Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*."
}
{
  "time": "Jun 5 06:46:14",
  "username": "user",
  "daemon": "dhcpcd[337]: wlan0",
  "msg": "deleting default route via fe80::1e56:feff:fe13:2da"
}
Can anyone please give me some help with this? Is there any tool we can use to generate the regex for fluentd?
Edit:
I have managed to get a few things matched from the logs, for example:
^(?<time>^(.*?:.*?):\d\d) (?<username>[^ ]*) matches Jun 5 06:46:14 user
but when I pass this into Fluentular, it doesn't show any results.
Try Regex: ^(?<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s(?<username>[^ ]+)\s+(?<daemon>[^:]+):\s+(?<message>.*)$
See Demo
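If it helps, here is a rough fluentd tail/parse sketch that uses that expression. Assumptions: the in_tail input, hypothetical log and position-file paths, and keep_time_key so the matched time also stays in the record; adjust to your setup.
<source>
  @type tail
  # hypothetical paths; adjust to your environment
  path /var/log/daemon.log
  pos_file /var/log/fluentd/daemon.pos
  tag system.daemon
  <parse>
    @type regexp
    expression /^(?<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s(?<username>[^ ]+)\s+(?<daemon>[^:]+):\s+(?<message>.*)$/
    time_format %b %d %H:%M:%S
    # keep the matched time in the record as well as using it as the event time
    keep_time_key true
  </parse>
</source>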

Extracting a substring after a match position using grok in logstash

Objective: I have a log file from which I want to extract the amount that appears after the string Amount::: (see the log file below).
What I have done so far: since this is custom parsing, I have created a custom pattern using regex and I am trying to implement it with Logstash.
Here is my log file:
28-04-2017 14:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 3000.00
28-04-2017 12:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 31000.00
28-04-2017 14:15:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 10000.00
28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00
28-04-2017 08:15:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 7000.00
I have used a regex to find the string Amount:::.
Note: I want to extract the substring that comes after the string Amount:::.
Here are the custom patterns I have used in grok
(but they don't yield good results):
CUSTOM_AMOUNT (?<= - Amount::: ).*
CUSTOM_AMOUNT (?<=Amount::: )%{BASE16FLOAT}
Here is my logstash.conf:
input {
  file {
    path => "D:\elk\data\amnt_parse.txt"
    type => "customgrok"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  if [type] == "customgrok" {
    if "_grokparsefailure" in [tags] {
      grok {
        patterns_dir => "D:\elk\logstash-5.2.1\vendor\bundle\jruby\1.9\gems\logstash-patterns-core-4.0.2\patterns\custom"
        match => { "message" => "%{CUSTOM_AMOUNT:amount" }
        add_field => { "subType" => "Amount" }
      }
    }
  }
  mutate {
    gsub => ['message', "\t", " "]
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
  elasticsearch {
    index => "amnt_parsing_change"
    hosts => "localhost"
  }
}
Our intention is to visualize and perform aggregation operations, using Kibana and Elasticsearch, based on the extracted substring.
But it just stores the whole log line in the "message" field, as you can see here: match => { "message" => "%{CUSTOM_AMOUNT:amount" }.
Here is how the line is stored inside "message" when I view it in Kibana:
"message": "28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00\r",
"message": "28-04-2017 12:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 31000.00\r",
"message": "28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00\r",
Logstash is loading the data (log file) and the index is also getting created, but the custom pattern isn't giving the expected result.
What are the possibilities for extracting the substring I mentioned above? Or do we have any alternatives?
Here is what you have to do:
filter {
  grok {
    match => {
      "message" => "%{DATESTAMP:Date} %{WORD:LogSeverity}\s+%{WORD:LogInfo} \(%{NOTSPACE:JavaClass}\) \- Amount::: %{NUMBER:Amount}"
    }
  }
  mutate {
    gsub => [
      "Date", " ", "-"
    ]
    # If you don't want those fields
    remove_field => ["Date", "LogSeverity", "LogInfo", "JavaClass"]
  }
}
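If you only need the amount and nothing else from the line, a shorter, unanchored pattern is also enough. A minimal sketch (the field name amount and the :float coercion are my choices):
filter {
  grok {
    # match only the number that follows "Amount::: " and store it as a float
    match => { "message" => "Amount::: %{NUMBER:amount:float}" }
  }
}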
I recommend you read the documentation:
Grok documentation
Grok patterns
You can use the following debugger:
Grok Debugger

Logstash, EC2 and elasticsearch

I have two Elasticsearch nodes set up in EC2 and am trying to use Logstash with them. I get this error when I run Logstash:
log4j, [2014-02-24T10:45:32.722] WARN: org.elasticsearch.discovery.zen.ping.unicast: [Ishihara, Shirow] failed to send ping to [[#zen_unicast_1#][inet[/10.110.65.91:9300]]]
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize exception response from stream
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize exception response from stream
at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:169)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:123)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
That's a snippet of it.
Here is the conf file I am using with logstash:
input {
  redis {
    host => "10.110.65.91"
    # these settings should match the output of the agent
    data_type => "list"
    key => "logstash"
    # We use the 'json' codec here because we expect to read
    # json events from redis.
    codec => json
  }
}
output {
  stdout { debug => true debug_format => "json" }
  elasticsearch {
    host => "10.110.65.91"
    cluster => searchbuild
  }
}
I'm running Logstash on .91 (I have a second terminal window open). Am I missing something?
I had to change "elasticsearch" to "elasticsearch_http".
Fixed.
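For reference, the output section would then look roughly like this. This is a sketch assuming the Logstash 1.x-era elasticsearch_http plugin, which talks to the REST API on port 9200; the node/transport-based elasticsearch output on port 9300 is sensitive to Elasticsearch version mismatches, a common cause of the deserialization error above. Option names may vary by Logstash version.
output {
  stdout { debug => true debug_format => "json" }
  elasticsearch_http {
    # HTTP endpoint of one of the Elasticsearch nodes
    host => "10.110.65.91"
    # REST API port, not the 9300 transport port
    port => 9200
    index => "logstash-%{+YYYY.MM.dd}"
  }
}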