Extracting a substring after a match position using grok in logstash - regex

Objective: I have a log file from which I want to extract the amount that follows the string Amount:::.
What I have done so far: since this needs custom parsing, I created a custom pattern using a regex and am trying to apply it in Logstash.
Here is my log file:
28-04-2017 14:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 3000.00
28-04-2017 12:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 31000.00
28-04-2017 14:15:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 10000.00
28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00
28-04-2017 08:15:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 7000.00
I used a regex to find the string Amount:::.
Note: I want to extract the substring that comes after the string Amount:::.
Here are the custom patterns I have tried in grok (neither gives the expected result):
CUSTOM_AMOUNT (?<= - Amount::: ).*
CUSTOM_AMOUNT (?<=Amount::: )%{BASE16FLOAT}
Here is my logstash.conf:
input {
file {
path => "D:\elk\data\amnt_parse.txt"
type => "customgrok"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter{
if[type]== "customgrok" {
if "_grokparsefailure" in [tags] {
grok {
patterns_dir => "D:\elk\logstash-5.2.1\vendor\bundle\jruby\1.9\gems\logstash-patterns-core-4.0.2\patterns\custom"
match => { "message" => "%{CUSTOM_AMOUNT:amount" }
add_field => { "subType" => "Amount" }
}
}
}
mutate {
gsub => ['message', "\t", " "]
} } }
output {
stdout {
codec => "rubydebug"
}
elasticsearch {
index => "amnt_parsing_change"
hosts =>"localhost"
}
}
Our intention is to visualize the extracted substring and run aggregations on it using Kibana and Elasticsearch.
However, the whole log line is stored in the field "message", which is the field I match against: match => { "message" => "%{CUSTOM_AMOUNT:amount" }.
Here is how the line is stored in "message" when I view it in Kibana:
"message": "28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00\r",
"message": "28-04-2017 12:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 31000.00\r",
"message": "28-04-2017 11:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 9000.00\r",
Logstash loads the data (the log file) and the index is created, but the custom pattern does not give the expected result.
How can I extract the substring mentioned above? Are there any alternatives?

Here is what you have to do:
filter {
  grok {
    match => {
      "message" => "%{DATESTAMP:Date} %{WORD:LogSeverity}\s+%{WORD:LogInfo} \(%{NOTSPACE:JavaClass}\) \- Amount::: %{NUMBER:Amount}"
    }
  }
  mutate {
    # Replace the space between the date and the time with a hyphen
    gsub => [
      "Date", " ", "-"
    ]
    # If you don't want those fields
    remove_field => ["Date", "LogSeverity", "LogInfo", "JavaClass"]
  }
}
I recommend reading the documentation:
Grok Documentation
Grok Patterns
You can also use the following debugger:
Grok Debugger
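To check the idea outside Logstash, the same extraction can be sketched in Python. This is only an illustration: Python's re module stands in for grok, the regex is a rough approximation of %{NUMBER} rather than its exact definition, and the names AMOUNT_RE and extract_amount are made up for this sketch. The last sample line is hypothetical and only shows what happens when no amount is present.

```python
import re

# Rough stand-in for "Amount::: %{NUMBER:Amount}"; not grok's exact NUMBER pattern.
AMOUNT_RE = re.compile(r"Amount::: (?P<amount>\d+(?:\.\d+)?)")


def extract_amount(line):
    """Return the amount after 'Amount::: ' as a float, or None if there is none."""
    match = AMOUNT_RE.search(line)
    return float(match.group("amount")) if match else None


lines = [
    # Copied from the question, including the trailing \r seen in Kibana.
    "28-04-2017 14:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 3000.00\r",
    "28-04-2017 12:45:50 INFO abcinfo (ABC_TxnLog_ServiceImpl.java295) - Amount::: 31000.00\r",
    # Hypothetical line without an amount.
    "28-04-2017 08:15:50 INFO abcinfo (no amount on this line)",
]

amounts = [extract_amount(line) for line in lines]
print(amounts)
```

In Logstash itself, grok can convert the capture to a number with %{NUMBER:Amount:float}, which is what Kibana needs in order to aggregate on the field.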

Regex - extract ip

I'm trying to pull some data from a plain log file with a JSON converter.
This is the log entry:
01/04/2022 15:29:34.2934 +03:00 - [INFO] - [w3wp/LPAPI-Last Casino/177] - AppsFlyerPostback?re_targeting_conversion_type=&is_retargeting=false&app_id=id624512118&platform=ios&event_type=in-app-event&attribution_type=organic&ip=8.8.8.8&name=blabla
This is the regex I'm using:
(?P<date>[0-9]{2}\/[0-9]{2}\/[0-9]{4}).(?P<time>\s*[0-9]{2}:[0-9]{2}:[0-9]{2}).*(?P<level>\[\D+\]).-.\[(?P<application_subsystem_thread>.*)\].-.(?P<message>.*)
This is the output I'm getting:
{
"application_subsystem_thread": "w3wp/LPAPI-Last Casino/177",
"date": "01/04/2022",
"level": "[INFO]",
"message": "AppsFlyerPostback?re_targeting_conversion_type=&is_retargeting=false&app_id=id624512118&platform=ios&event_type=in-app-event&attribution_type=organic&ip=8.8.8.8&name=blabla",
"time": "15:29:34"
}
As you can see, the converter uses the group names as the JSON keys.
I would like to get the following output instead:
{
"application_subsystem_thread": "w3wp/LPAPI-Last Casino/177",
"date": "01/04/2022",
"level": "[INFO]",
"message": "AppsFlyerPostback?re_targeting_conversion_type=&is_retargeting=false&app_id=id624512118&platform=ios&event_type=in-app-event&attribution_type=organic&ip=8.8.8.8&name=blabla",
"time": "15:29:34",
"ip": "8.8.8.8"
}
As you can see, I would like to get the IP as well. How can I do that?

You could extract it from the message part.
As it appears in the message, it can be captured with:
ip\=(?P<ip_address>(?:[0-9]+\.){3}[0-9]+)
Then we incorporate it into the larger message group:
(?P<message>.*ip\=(?P<ip_address>(?:[0-9]+\.){3}[0-9]+).*)
This results in the final expression:
(?P<date>[0-9]{2}\/[0-9]{2}\/[0-9]{4}).(?P<time>\s*[0-9]{2}:[0-9]{2}:[0-9]{2}).*(?P<level>\[\D+\]).-.\[(?P<application_subsystem_thread>.*)\].-.(?P<message>.*ip\=(?P<ip_address>(?:[0-9]+\.){3}[0-9]+).*)
var message = `01/04/2022 15:29:34.2934 +03:00 - [INFO] - [w3wp/LPAPI-Last Casino/177] - AppsFlyerPostback?re_targeting_conversion_type=&is_retargeting=false&app_id=id624512118&platform=ios&event_type=in-app-event&attribution_type=organic&ip=8.8.8.8&name=blabla`;
// NOTE - The regex in this code sample has been modified to be ECMAScript compliant
console.log(/(?<date>[0-9]{2}\/[0-9]{2}\/[0-9]{4}).(?<time>\s*[0-9]{2}:[0-9]{2}:[0-9]{2}).*(?<level>\[\D+\]).-.\[(?<application_subsystem_thread>.*)\].-.(?<message>.*ip\=(?<ip_address>(?:[0-9]+\.){3}[0-9]+).*)/gm.exec(message).groups)
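The expression uses Python-style (?P<name>...) groups, so it can also be tried directly with Python's re module. The sketch below is a check under that assumption, not a statement about what the converter itself runs. The only deliberate change is that the inner group is named ip instead of ip_address, so the key matches the output requested in the question; the redundant backslashes before / and = are dropped.

```python
import re

# Log entry copied from the question, split across literals for readability.
LINE = (
    "01/04/2022 15:29:34.2934 +03:00 - [INFO] - [w3wp/LPAPI-Last Casino/177] - "
    "AppsFlyerPostback?re_targeting_conversion_type=&is_retargeting=false"
    "&app_id=id624512118&platform=ios&event_type=in-app-event"
    "&attribution_type=organic&ip=8.8.8.8&name=blabla"
)

# Same structure as the final expression above, with the inner group renamed to "ip".
PATTERN = re.compile(
    r"(?P<date>[0-9]{2}/[0-9]{2}/[0-9]{4}).(?P<time>\s*[0-9]{2}:[0-9]{2}:[0-9]{2}).*"
    r"(?P<level>\[\D+\]).-.\[(?P<application_subsystem_thread>.*)\].-."
    r"(?P<message>.*ip=(?P<ip>(?:[0-9]+\.){3}[0-9]+).*)"
)

match = PATTERN.search(LINE)
fields = match.groupdict() if match else {}
print(fields.get("ip"))
```

Note that a line without an ip= parameter no longer matches at all, because the ip group is a mandatory part of the message group.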

Grok Filter for Confluence Logs

I am trying to write a grok expression to parse Confluence logs and have been partially successful.
My current grok pattern is:
%{TIMESTAMP_ISO8601:conflog_timestamp} %{LOGLEVEL:conflog_severity} \[%{APPNAME:conflog_ModuleName}\] \[%{DATA:conflog_classname}\] (?<conflog_message>(.|\r|\n)*)
with the custom pattern:
APPNAME [a-zA-Z0-9\.\#\-\+_%\:]+
With it I am able to parse the log line below.
Log line 1:
2020-06-14 10:44:01,575 INFO [Caesium-1-1] [directory.ldap.cache.AbstractCacheRefresher] synchroniseAllGroupAttributes finished group attribute sync with 0 failures in [ 2030ms ]
However, I also have other log lines, such as:
Log line 2:
2020-06-15 09:24:32,068 WARN [https-jsse-nio2-8443-exec-13] [atlassian.confluence.pages.DefaultAttachmentManager] getAttachmentData Could not find data for attachment:
-- referer: https://confluence.jira.com/index.action | url: /download/attachments/393217/global.logo | traceId: 2a0bfc77cad7c107 | userName: abcd
and log line 3:
2020-06-12 01:19:03,034 WARN [https-jsse-nio2-8443-exec-6] [atlassian.seraph.auth.DefaultAuthenticator] login login : 'ABC' tried to login but they do not have USE permission or weren't found. Deleting remember me cookie.
-- referer: https://confluence.jira.com/login.action?os_destination=%2Findex.action&permissionViolation=true | url: /dologin.action | traceId: 8744d267e1e6fcc9
Here the params "userName", "referer", "url" and "traceId" may or may not be present in the log line.
I could write a separate grok expression for each combination. Can all of them be handled in a single grok expression instead?
In short: match all log lines.
If the log line has a "referer" param, store it in a variable. If not, proceed to match the rest of the params.
If the log line has a "url" param, store it. If not, try to match the rest of the params.
Repeat for "traceId" and "userName".
Thank you.
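The optional-field logic described above can be sketched in Python. This is not a grok expression and not a tested Logstash answer; it only illustrates "store each param if present, otherwise move on". It assumes the params always follow a "-- " marker and are separated by " | ", as in log lines 2 and 3. The names TRAILER_KEYS and parse_trailer are invented for the sketch.

```python
# Params that may or may not appear after the "-- " marker.
TRAILER_KEYS = ("referer", "url", "traceId", "userName")


def parse_trailer(log_entry):
    """Return a dict with whichever optional params are present in the entry."""
    _, marker, trailer = log_entry.partition("-- ")
    if not marker:
        return {}  # no trailer at all, as in log line 1
    fields = {}
    for part in trailer.split(" | "):
        key, colon, value = part.partition(": ")
        if colon and key.strip() in TRAILER_KEYS:
            fields[key.strip()] = value.strip()
    return fields


LINE_1 = (
    "2020-06-14 10:44:01,575 INFO [Caesium-1-1] "
    "[directory.ldap.cache.AbstractCacheRefresher] synchroniseAllGroupAttributes "
    "finished group attribute sync with 0 failures in [ 2030ms ]"
)
LINE_2 = (
    "2020-06-15 09:24:32,068 WARN [https-jsse-nio2-8443-exec-13] "
    "[atlassian.confluence.pages.DefaultAttachmentManager] getAttachmentData "
    "Could not find data for attachment:\n"
    "-- referer: https://confluence.jira.com/index.action | "
    "url: /download/attachments/393217/global.logo | "
    "traceId: 2a0bfc77cad7c107 | userName: abcd"
)

print(parse_trailer(LINE_1))
print(parse_trailer(LINE_2))
```

In a regex or grok pattern, the same effect is usually obtained by wrapping each param in an optional non-capturing group, (?: ... )?, so that a missing param does not make the whole match fail.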

What is the grok pattern for this jenkins log?

Can you please help me with a grok pattern for this Jenkins sample log? The log is a single line.
hudson.slaves.CommandLauncher launch\nSEVERE: Unable to launch the agent for dot-dewsttlas403-ci\njava.io.IOException: Failed to create a temporary file in /opt_shared/iit_slave/jenkins_slave/workspace\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:144)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:109)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:84)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:74)\n\tat hudson.util.TextFile.write(TextFile.java:116)\n\tat jenkins.branch.WorkspaceLocatorImpl$WriteAtomic.invoke(WorkspaceLocatorImpl.java:264)\n\tat jenkins.branch.WorkspaceLocatorImpl$WriteAtomic.invoke(WorkspaceLocatorImpl.java:256)\n\tat hudson.FilePath$FileCallableWrapper.call(FilePath.java:3042)\n\tat hudson.remoting.UserRequest.perform(UserRequest.java:212)\n\tat hudson.remoting.UserRequest.perform(UserRequest.java:54)\n\tat hudson.remoting.Request$2.run(Request.java:369)\n\tat hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n\tSuppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to dot-dewsttlas403-ci\n\t\tat hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)\n\t\tat hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)\n\t\tat hudson.remoting.Channel.call(Channel.java:957)\n\t\tat hudson.FilePath.act(FilePath.java:1069)\n\t\tat hudson.FilePath.act(FilePath.java:1058)\n\t\tat jenkins.branch.WorkspaceLocatorImpl.save(WorkspaceLocatorImpl.java:254)\n\t\tat jenkins.branch.WorkspaceLocatorImpl.access$500(WorkspaceLocatorImpl.java:80)\n\t\tat 
jenkins.branch.WorkspaceLocatorImpl$Collector.onOnline(WorkspaceLocatorImpl.java:561)\n\t\tat hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:697)\n\t\tat hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:432)\n\t\tat hudson.slaves.CommandLauncher.launch(CommandLauncher.java:154)\n\t\tat hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)\n\t\tat jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)\n\t\tat jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)\n\t\tat java.util.concurrent.FutureTask.run(Unknown Source)\n\t\tat java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\n\t\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\n\t\tat java.lang.Thread.run(Unknown Source)\nCaused by: java.io.IOException: No space left on device\n\tat java.io.UnixFileSystem.createFileExclusively(Native Method)\n\tat java.io.File.createTempFile(File.java:2024)\n\tat hudson.util.AtomicFileWriter.<init>(AtomicFileWriter.java:142)\n\t... 15 more\n
I am only interested in extracting the following from the log above:
agent status, agent name
Expected result:
agent status: Unable
agent name: dot-dewsttlas403-ci

Use this pattern:
SEVERE: %{DATA:agent_status} to launch the agent for %{DATA:agent_name}\\n
This should give you the fields you are interested in, but it only works if the message always has this structure.
Configuration used:
input { stdin {} }
filter {
  grok {
    match => {
      "message" => "SEVERE: %{DATA:agent_status} to launch the agent for %{DATA:agent_name}\\n"
    }
  }
}
output { stdout { codec => json } }
Result:
{
"host": "MY_COMPUTER",
"agent_status": "Unable",
"message": "hudson.slaves.CommandLauncher launch\\nSEVERE: Unable to launch the agent for dot-dewsttlas403-ci\\njava.io.IOException: Failed to create a temporary file in /opt_shared/iit_slave/jenkins_slave/workspace\\n\\tat \r",
"agent_name": "dot-dewsttlas403-ci",
"@timestamp": "2020-01-29T16:54:27.256Z",
"@version": "1"
}
Also to help you next time you're working with logstash-grok:
An online tester for patterns:
http://grokconstructor.appspot.com/do/match
The basic grok patterns: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
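For a quick check without Logstash, the pattern can be approximated in Python. This is a sketch under assumptions: %{DATA} is written as the lazy .*? it expands to, the message is a shortened copy of the sample log, and AGENT_RE is a name chosen for the example. The raw strings matter because the log contains the two characters backslash and n, not real line breaks.

```python
import re

# Shortened copy of the sample log; raw strings keep "\n" as backslash + n.
message = (
    r"hudson.slaves.CommandLauncher launch\nSEVERE: Unable to launch the agent "
    r"for dot-dewsttlas403-ci\njava.io.IOException: Failed to create a temporary file"
)

# %{DATA} is a lazy ".*?"; "\\n" in the regex matches a literal backslash followed by n.
AGENT_RE = re.compile(
    r"SEVERE: (?P<agent_status>.*?) to launch the agent for (?P<agent_name>.*?)\\n"
)

match = AGENT_RE.search(message)
result = match.groupdict() if match else {}
print(result)
```

If the input ever contains real line breaks instead of the literal two-character sequence, the trailing \\n in the pattern has to change accordingly.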

Logstash grok & regex filter

I want a filter that extracts the following information from log messages.
I am currently using the filter below, although it is very specific to one log layout:
filter {
  if "ONT" in [message] {
    grok {
      match => { "message" => "%{SYSLOGBASE} %{WORD:Alarm_Severity} %{DATA:Message} %{QS:ONT_ID} %{DATA:Time} %{QS:ONT_Message}" }
    }
  }
}
The log lines are:
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: clear Alarm for card 1/1 at 2019/12/16 15:01:13.39: "Backup files exist"
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: service "403 for ONT: "10002" - ONT needs restart at 2019/12/16 15:01:13.39 ONT message: "Backup files exist"
I want the output to give me these fields:
Time: 15:01:13
Host: NPF_OLT_LAB05
Alarm Severity: clear
ONT ID: 10002
Source IP: 172.20.x.xx
ONT Message: "Backup files exist"
Message: clear Alarm for card 1/1
Service ID: 403

These look like two different log formats, so you need two different grok patterns, as below.
Dec 16 15:01:13 172.20.12.12 NPF_OLT_LAB05: clear Alarm for card 1/1 at 2019/12/16 15:01:13.39: "Backup files exist"
Grok pattern
(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\:\s(%{WORD:Severity} %{GREEDYDATA:Message})\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME})\S\s\S%{GREEDYDATA:ONTMessage}\"
Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: service "403 for ONT: "10002" - ONT needs restart at 2019/12/16 15:01:13.39 ONT message: "Backup files exist"
Grok pattern
(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\S\s%{WORD:Severity}\s\S%{BASE10NUM:ServiceID} %{NOTSPACE}\s(?:ONT: \S%{BASE10NUM:ONT_ID}\S) %{NOTSPACE} %{GREEDYDATA:Message}\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME}) (?:ONT message\: \S(?<ONT_Message>%{GREEDYDATA}\S))
Configuration:
filter {
  if "ONT" in [message] {
    grok {
      match => { "message" => [ "(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\:\s(%{WORD:Severity} %{GREEDYDATA:Message})\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME})\S\s\S%{GREEDYDATA:ONTMessage}\"" ,
        "(?<Date>%{MONTH} +%{MONTHDAY}) %{TIME:Time} %{IPV4:SourceIP} %{NOTSPACE:HOST}\S\s%{WORD:Severity}\s\S%{BASE10NUM:ServiceID} %{NOTSPACE}\s(?:ONT: \S%{BASE10NUM:ONT_ID}\S) %{NOTSPACE} %{GREEDYDATA:Message}\s(?<timestamp>%{YEAR}\/%{MONTHNUM}\/%{MONTHDAY}\s%{TIME}) (?:ONT message\: \S(?<ONT_Message>%{GREEDYDATA}\S))" ]
      }
    }
  }
}
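When match is given a list, grok tries the patterns in order and, by default, stops at the first one that matches. That behaviour can be sketched in Python as follows. The regexes here are simplified approximations written for this example, not translations of the grok patterns above, and the names PREFIX, STAMP, PATTERNS and parse_line are assumptions of the sketch. The source IP is matched loosely because it is masked in the sample lines.

```python
import re

# Shared syslog-style prefix: date, time, source IP (masked in the samples), host.
PREFIX = (
    r"(?P<date>[A-Z][a-z]{2} +\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<source_ip>\S+) (?P<host>[^:\s]+): "
)
STAMP = r"(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+)"

PATTERNS = [
    # Alarm layout: <severity> <message> at <timestamp>: "<ONT message>"
    re.compile(
        PREFIX + r"(?P<severity>\w+) (?P<message>.*?) at " + STAMP
        + r': "(?P<ont_message>.*)"$'
    ),
    # Service layout: service "<id> for ONT: "<id>" - <message> at <timestamp> ONT message: "..."
    re.compile(
        PREFIX + r'service "(?P<service_id>\d+)"? for ONT: "(?P<ont_id>\d+)" - '
        r"(?P<message>.*?) at " + STAMP + r' ONT message: "(?P<ont_message>.*)"$'
    ),
]


def parse_line(line):
    """Try each pattern in order and return the fields of the first match."""
    for pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return match.groupdict()
    return None  # comparable to a _grokparsefailure


alarm = parse_line(
    'Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: clear Alarm for card 1/1 '
    'at 2019/12/16 15:01:13.39: "Backup files exist"'
)
service = parse_line(
    'Dec 16 15:01:13 172.20.x.xx NPF_OLT_LAB05: service "403 for ONT: "10002" '
    '- ONT needs restart at 2019/12/16 15:01:13.39 ONT message: "Backup files exist"'
)
print(alarm)
print(service)
```

Ordering the list from most specific to most general keeps a loose pattern from swallowing lines that a stricter one would parse into more fields.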

Logstash Output Amazon ES Error

I'm using Logstash 2.3.4 and Amazon Elasticsearch Service (2.3).
My config:
input {
  jdbc {
    # MySQL JDBC connection string to our database
    jdbc_connection_string => "jdbc:mysql://awsmigration.XXXXXXXX.ap-southeast-1.rds.amazonaws.com:3306/table_receipt?zeroDateTimeBehavior=convertToNull&autoReconnect=true&useSSL=false"
    # The user we wish to execute our statement as
    jdbc_user => "XXXXXXXX"
    jdbc_password => "XXXXXXXX"
    # The path to our downloaded JDBC driver
    jdbc_driver_library => "/opt/logstash/drivers/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar"
    # The name of the MySQL driver class
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Our query
    statement => "SELECT * from Receipt"
    jdbc_paging_enabled => true
    jdbc_page_size => 200
  }
}
output {
  #stdout { codec => json_lines }
  amazon_es {
    hosts => ["search-XXXXXXXX.ap-southeast-1.es.amazonaws.com"]
    region => "ap-southeast-1"
    index => "slurp_receipt"
    document_type => "Receipt"
    document_id => "%{uid}"
  }
}
After running the command
bin/logstash agent -f db.conf
I got this error:
Attempted to send a bulk request to Elasticsearch configured at '["https://search-XXXXXXXX.ap-southeast-1.es.amazonaws.com:443"]', but an error occurred and it failed! Are you sure you can reach elasticsearch from this machine using the configuration provided? {:client_config=>{:hosts=>["https://search-slurp-wjgudsrlz66esh6hyrijaagamu.ap-southeast-1.es.amazonaws.com:443"], :region=>"ap-southeast-1", :aws_access_key_id=>nil, :aws_secret_access_key=>nil, :transport_options=>{:request=>{:open_timeout=>0, :timeout=>60}, :proxy=>nil}, :transport_class=>Elasticsearch::Transport::Transport::HTTP::AWS, :logger=>nil, :tracer=>nil, :reload_connections=>false, :retry_on_failure=>false, :reload_on_failure=>false, :randomize_hosts=>false, :http=>{:scheme=>"https", :user=>nil, :password=>nil, :port=>443}}, :error_message=>"undefined method `credentials' for nil:NilClass", :error_class=>"NoMethodError", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/aws-sdk-core-2.1.36/lib/aws-sdk-core/signers/v4.rb:24:in `initialize'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es/aws_v4_signer_impl.rb:36:in `signer'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es/aws_v4_signer_impl.rb:48:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/faraday-0.9.2/lib/faraday/rack_builder.rb:139:in `build_response'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/faraday-0.9.2/lib/faraday/connection.rb:377:in `run_request'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es/aws_transport.rb:49:in `perform_request'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.18/lib/elasticsearch/transport/transport/base.rb:257:in `perform_request'", 
"/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es/aws_transport.rb:45:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.18/lib/elasticsearch/transport/client.rb:128:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.18/lib/elasticsearch/api/actions/bulk.rb:90:in `bulk'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es/http_client.rb:53:in `bulk'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es.rb:321:in `submit'", "org/jruby/ext/thread/Mutex.java:149:in `synchronize'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es.rb:318:in `submit'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es.rb:351:in `flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1342:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:216:in `buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:159:in `buffer_receive'", "/opt/logstash/vendor/local_gems/b0f0ff24/logstash-output-amazon_es-1.0-java/lib/logstash/outputs/amazon_es.rb:311:in `receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/outputs/base.rb:83:in `multi_receive'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/outputs/base.rb:83:in `multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/output_delegator.rb:130:in `worker_multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/output_delegator.rb:114:in 
`multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/pipeline.rb:301:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/pipeline.rb:301:in `output_batch'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/pipeline.rb:232:in `worker_loop'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.4-java/lib/logstash/pipeline.rb:201:in `start_workers'"], :level=>:error}
May I know how to solve this problem?
Thank you.