How to create Regex pattern for fluentd - regex

I am trying to parse daemon logs from my Linux machine into Elasticsearch using fluentd, but I am having a hard time creating a regex pattern for them. Below are a few lines from the daemon logs:
Jun 5 06:46:14 user avahi-daemon[309]: Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*.
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting default route via fe80::1e56:feff:fe13:2da
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting route to 2402:3a80:9db:48da::/64
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting address fe80::a7c0:8b54:ee45:ea4
Jun 5 06:46:14 user avahi-daemon[309]: Withdrawing address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.
Jun 5 06:46:14 user avahi-daemon[309]: Leaving mDNS multicast group on interface wlan0.IPv6 with address fe80::a7c0:8b54:ee45:ea4.
As you can see from the above logs, each line starts with the time, followed by the username, the daemon name, and then the message.
I want to produce the following JSON for the above logs:
{
"time": "Jun 5 06:46:14",
"username": "user",
"daemon": "avahi-daemon[309]",
"msg": "Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*."
}
{
"time": "Jun 5 06:46:14",
"username": "user",
"daemon": "dhcpcd[337]: wlan0",
"msg": "deleting default route via fe80::1e56:feff:fe13:2da"
}
Can anyone please help me with this? Is there a tool we can use to generate regexes for fluentd?
Edit:
I have managed to match a few things from the logs, for example:
^(?<time>^(.*?:.*?):\d\d) (?<username>[^ ]*) matches Jun 5 06:46:14 user
but when I pass this to Fluentular, it doesn't show any results.

Try Regex: ^(?<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s(?<username>[^ ]+)\s+(?<daemon>[^:]+):\s+(?<message>.*)$
See Demo
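As a quick check of the suggested pattern, here is a minimal Python sketch that runs it against two of the sample lines. Python's re module spells named groups (?P<name>...) instead of the Ruby-style (?<name>...) that fluentd uses, so the pattern is rewritten accordingly; the names SYSLOG_LINE, samples and records are only illustrative.

```python
import re

# The answer's pattern, with Ruby-style (?<name>...) groups rewritten as
# Python's (?P<name>...); everything else is unchanged.
SYSLOG_LINE = re.compile(
    r"^(?P<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s"
    r"(?P<username>[^ ]+)\s+(?P<daemon>[^:]+):\s+(?P<message>.*)$"
)

# Two of the sample lines from the question.
samples = [
    "Jun 5 06:46:14 user avahi-daemon[309]: Registering new address record "
    "for fe80::a7c0:8b54:ee45:ea4 on wlan0.*.",
    "Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting default route via "
    "fe80::1e56:feff:fe13:2da",
]

# Each match becomes a dict keyed by group name.
records = [SYSLOG_LINE.match(line).groupdict() for line in samples]
for record in records:
    print(record)
```

Two differences from the JSON requested in the question are worth noting: the last group is named message rather than msg, and for the dhcpcd line the daemon group stops at the first colon, so wlan0: ends up at the start of the message instead of in the daemon field.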

Related

Regex for filtering out the groups and if there is specific string in the group then extract that into another group

Hi, I am trying to match 3 log lines with a regex. The issue is that the pattern is not dynamic: if a value changes, the regex no longer works for that group.
I think a practical example will give a better understanding: https://regex101.com/r/sdoZaH/1
There, group 1 (<address>) works on the first log line only; it fails to capture the string on the second line.
In the <message> group, if there is an IP Addr, I want it captured as a separate group; otherwise the group should cover the remaining text.
How do I make it dynamic so that it matches all lines?
The lines I am trying to match:
Mar 21 23:31:19 c10sw1 raslogd: AUDIT, 2022/03/21-23:31:19 (PDT), [SEC-3020], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10sw1/FID 128, 8.2.1c, , , , , , , Event: login, Status: success, Info: Successful login attempt via REMOTE, IP Addr: test.domain.com.
Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), [SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, Status: success, Info: Successful logout by user [admin].
Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), [SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, Status: success, Info: Successful logout by user [admin].
Please try the following pattern. The address group is written as \S+ rather than \D\w+, because \w does not match the hyphens in c10-M1000e-SW1:
^[A-Za-z]+[\d\s:]+(?<address>\S+)\s.+?,\s(?<time>\d+\/\d+\/\d+\-\d+\:\d+\:\d+).+?\s\w+\/.+?\/(?<domain>.+?)\/(?<destinationprocess>.+?)\/(?<sourceprocess>.+?),.+Event:\s(?<eventtype>.+?),.+Status:\s(?<status>.+?),\sInfo:\s(?<message>.+)$
Please could you put various valid and invalid strings in the question.
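A minimal Python sketch of this approach, under two stated assumptions. First, the address group is written as \S+, since a \D\w+ group cannot span the hyphens in c10-M1000e-SW1. Second, the optional IP Addr is pulled out of the message in a separate pass; that second pattern (IP_ADDR) and the parse helper are hypothetical and not part of the answer. Named groups use Python's (?P<name>...) spelling.

```python
import re

# Ruby-style named groups rewritten as (?P<name>...) for Python's re module.
# Assumption: the address group is \S+ (any run of non-whitespace) so that
# hyphenated hostnames are captured whole.
AUDIT_LINE = re.compile(
    r"^[A-Za-z]+[\d\s:]+(?P<address>\S+)\s.+?,\s"
    r"(?P<time>\d+/\d+/\d+-\d+:\d+:\d+).+?\s\w+/.+?/"
    r"(?P<domain>.+?)/(?P<destinationprocess>.+?)/(?P<sourceprocess>.+?),"
    r".+Event:\s(?P<eventtype>.+?),.+Status:\s(?P<status>.+?),"
    r"\sInfo:\s(?P<message>.+)$"
)

# Hypothetical second pass: a trailing ", IP Addr: <value>." in the message.
IP_ADDR = re.compile(r",\sIP Addr:\s(?P<ip_addr>\S+?)\.?$")


def parse(line):
    """Return a dict of fields, or None if the line does not match."""
    match = AUDIT_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    ip = IP_ADDR.search(fields["message"])
    fields["ip_addr"] = ip.group("ip_addr") if ip else None
    return fields


login_line = (
    "Mar 21 23:31:19 c10sw1 raslogd: AUDIT, 2022/03/21-23:31:19 (PDT), "
    "[SEC-3020], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, "
    "ad_0/c10sw1/FID 128, 8.2.1c, , , , , , , Event: login, Status: success, "
    "Info: Successful login attempt via REMOTE, IP Addr: test.domain.com."
)
logout_line = (
    "Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), "
    "[SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, "
    "ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, "
    "Status: success, Info: Successful logout by user [admin]."
)

login = parse(login_line)
logout = parse(logout_line)
print(login["address"], login["ip_addr"])
print(logout["address"], logout["ip_addr"])
```

Keeping the IP Addr extraction as a second step leaves the main pattern unchanged for lines that have no such suffix; folding it into one expression as an optional group is also possible but harder to read.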

fluentd regexp to extract events from a log file

I'm new to fluentd.
I have a log that I want to push to AWS with fluentd but I can't figure out what the regexp should be.
All the log lines, except the multilines, start with a UUID.
Here's a sample log:
6b0815f2-8ff1-4181-a4e6-058148288281 2020-11-03 13:00:05.976366 [DEBUG] switch_core_state_machine.c:611 (some_other_data) State Change CS_REPORTING -> CS_DESTROY
And, I'm trying to get UUID, DateTime, and Message.
With this regex:
/^(?<UUID>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}) (?<time>.*) (?<message>[^ ]*)/gm
I'm getting the last word CS_DESTROY.
I tried fluentular and still got:
text:
f6a6e1ae-e52e-4aba-a8a5-4e3cc7f40914 2020-11-03 14:32:34.975779 [CRIT] mod_dptools.c:1866 audio3: https://mydomain.s3-eu-west-1.amazonaws.com/media/576d06e5-04fc-11eb-a52c-020fd8c14d18/5f9ddf2d5df0f698094395.mpg
regexp:
^(?<UUID>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}) (?<time>.*) (?<message>[^ ]*)$
and got:
time 2020/11/03 14:32:34 +0000
UUID f6a6e1ae-e52e-4aba-a8a5-4e3cc7f40914
message https://mydomain.s3-eu-west-1.amazonaws.com/media/576d06e5-04fc-11eb-a52c-020fd8c14d18/5f9ddf2d5df0f698094395.mpg
It's missing what's between the datetime and "https".
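In your pattern, the greedy (?<time>.*) consumes everything up to the last space, so (?<message>[^ ]*) is left with only the final word. Stop time before the first [ and start message at it. Try: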
^(?<UUID>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}) (?<time>[^\[]*) (?<message>\[.*)$
Live at rubular: https://rubular.com/r/JQQXs5VTkr2IxM
Here's the output for both logs:
Match 1
UUID 6b0815f2-8ff1-4181-a4e6-058148288281
time 2020-11-03 13:00:05.976366
message [DEBUG] switch_core_state_machine.c:611 (some_other_data) State Change CS_REPORTING -> CS_DESTROY
Match 2
UUID f6a6e1ae-e52e-4aba-a8a5-4e3cc7f40914
time 2020-11-03 14:32:34.975779
message [CRIT] mod_dptools.c:1866 audio3: https://mydomain.s3-eu-west-1.amazonaws.com/media/576d06e5-04fc-11eb-a52c-020fd8c14d18/5f9ddf2d5df0f698094395.mpg
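To see the difference between the two patterns side by side, here is a small Python sketch, with the named groups rewritten in Python's (?P<name>...) syntax. The variable names original and fixed are only illustrative.

```python
import re

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Pattern from the question: greedy .* for time runs to the last space,
# so message can only ever be the final word.
original = re.compile(
    "^(?P<UUID>" + UUID + r") (?P<time>.*) (?P<message>[^ ]*)$"
)
# Pattern from the answer: time stops before the first "[", and message
# starts at that "[" and runs to the end of the line.
fixed = re.compile(
    "^(?P<UUID>" + UUID + r") (?P<time>[^\[]*) (?P<message>\[.*)$"
)

line = (
    "6b0815f2-8ff1-4181-a4e6-058148288281 2020-11-03 13:00:05.976366 [DEBUG] "
    "switch_core_state_machine.c:611 (some_other_data) State Change "
    "CS_REPORTING -> CS_DESTROY"
)

before = original.match(line).groupdict()
after = fixed.match(line).groupdict()
print(before["message"])
print(after["time"])
print(after["message"])
```

The fix relies on the log level always appearing in square brackets right after the timestamp, which holds for both sample lines; a line without a [ would not match.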

Couldn't figure out why the relaying is denied

Below is what happened to one mail sent from a Drupal client.
$ grep 'B6693C0977' /var/log/maillog
Jan 19 14:12:30 instance-1 postfix/pickup[19329]: B6693C0977: uid=0 from=<admin@mailgun.domainA.com>
Jan 19 14:12:30 instance-1 postfix/cleanup[20035]: B6693C0977: message-id=<20170119141230.B6693C0977@mail.instance-1.c.tw-pilot.internal>
Jan 19 14:12:30 instance-1 postfix/qmgr[19330]: B6693C0977: from=<admin@mailgun.domainA.com>, size=5681, nrcpt=1 (queue active)
Jan 19 14:12:33 instance-1 postfix/smtp[20039]: B6693C0977:
to=<username@hotmail.com>, relay=smtp.mailgun.org[52.41.19.62]:2525, delay=2.4,
delays=0.02/0.05/1.8/0.53, dsn=5.7.1, status=bounced (host smtp.mailgun.org
[52.41.19.62] said: 550 5.7.1 **Relaying denied** (in reply to RCPT TO command))
Jan 19 14:12:33 instance-1 postfix/bounce[20050]: B6693C0977: sender non-delivery notification: ABB94C0976
Jan 19 14:12:33 instance-1 postfix/qmgr[19330]: B6693C0977: removed
Relevant excerpts from my /etc/postfix/main.cf are below
# RELAYHOST SETTINGS
smtp_tls_security_level = encrypt
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
sender_dependent_relayhost_maps = hash:/etc/postfix/relayhost_map
and /etc/postfix/sasl_passwd is as follows
@mailgun.domainA.com postmaster@mailgun.domainA.com:password
and /etc/postfix/relayhost_map is as follows
@mailgun.domainA.com [smtp.mailgun.org]:2525
The permissions of the db files are as follows
# ls -lZ /etc/postfix/relayhost_map.db
-rw-r-----. root postfix unconfined_u:object_r:postfix_etc_t:s0 /etc/postfix/relayhost_map.db
# ls -lZ /etc/postfix/sasl_passwd.db
-rw-r-----. root postfix unconfined_u:object_r:postfix_etc_t:s0 /etc/postfix/sasl_passwd.db
The problem is
Outbound mails are not going.
No logs are shown in mailgun console.
Any insight is appreciated
I know that this is an old question now but I've just had the same issue and wanted to post a response for anyone who comes across this article in future.
I believe your issue is in /etc/postfix/relayhost_map, which should contain the following. Note that there are no brackets; for me, it was the inclusion of brackets that caused the issue:
@mailgun.domainA.com smtp.mailgun.org:2525
For anyone who is not using /etc/postfix/relayhost_map and is doing it all in /etc/postfix/sasl_passwd directly, the same applies:
smtp.mailgun.org:2525 postmaster@mailgun.domainA.com:password
Don't forget to regenerate the Postfix sasl_passwd.db file (and relayhost_map.db, if you changed relayhost_map) and restart the service afterwards:
sudo postmap /etc/postfix/sasl_passwd
sudo systemctl restart postfix
Or sudo service postfix restart if you're on an older system / not running systemd.
Usually this is related to problems on their platform. If everything was working OK previously, just open a ticket; they usually fix it within a few hours (yes, a few hours is a long wait).

Perl Regex issues

Why isn't this Perl regex working? I'm grabbing the date and username (the date works fine). It grabs all the usernames, but when it hits bob.thomas it grabs the entire line.
Code:
m/^(.+)\s-\sUser\s(.+)\s/;
print "$2------\n";
Sample Data:
Feb 17, 2013 12:18:02 AM - User plasma has logged on to client from host
Feb 17, 2013 12:13:00 AM - User technician has logged on to client from host
Feb 17, 2013 12:09:53 AM - User john.doe has logged on to client from host
Feb 17, 2013 12:07:28 AM - User terry has logged on to client from host
Feb 17, 2013 12:04:10 AM - User bob.thomas has been logged off from host because its web server session timed out. This means the web server has not received a request from the client in 3 minute(s). Possible causes: the client process was killed, the client process is hung, or a network problem is preventing access to the web server.
For the user who asked for the full code:
open(FILE, "log") or die "couldn't open file: $!";
$record = 0;
$first = 1;
while (<FILE>)
{
    if (m/(.+)\sto now/ && $first == 1) # find the area to start recording
    {
        $record = 1;
        $first = 0;
    }
    if ($record == 1)
    {
        m/^(.+)\s-\sUser\s(.+)\s/;
        <STDIN>; # pause until Enter is pressed
        print "$2------\n";
        if (!exists $users{$2})
        {
            $users{$2} = $1;
        }
    }
}
.+ is greedy, it matches the longest possible string. If you want it to match the shortest, use .+?:
/^(.+)\s-\sUser\s(.+?)\s/;
Or use a regexp that doesn't match whitespace:
/^(.+)\s-\sUser\s(\S+)/;
Use the reluctant/ungreedy quantifier to match up until the first occurrence rather than the last. You should do this in both cases just in case the "User" line also has " - User "
m/^(.+?)\s-\sUser\s(.+?)\s/;
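The three variants can be compared directly. The sketch below uses Python's re module rather than Perl, but the quantifiers behave the same way for these patterns; the variable names are only illustrative.

```python
import re

line = "Feb 17, 2013 12:18:02 AM - User plasma has logged on to client from host"

# Greedy (.+)\s : the second group runs up to the LAST whitespace on the line.
greedy = re.match(r"^(.+)\s-\sUser\s(.+)\s", line)
# Lazy (.+?)\s : the second group stops at the FIRST whitespace after the name.
lazy = re.match(r"^(.+?)\s-\sUser\s(.+?)\s", line)
# \S+ : the second group cannot contain whitespace at all.
non_space = re.match(r"^(.+)\s-\sUser\s(\S+)", line)

print(repr(greedy.group(2)))
print(repr(lazy.group(2)))
print(repr(non_space.group(2)))
```

With the greedy pattern, how much extra text lands in the second group depends only on where the last whitespace of the line is, which is why a long line such as the bob.thomas one shows the problem so clearly.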

Regex: Matching,parsing an FTP response to a request

Here's what I'm trying to do:
I want to have some FTP functionality in one of my apps (this is just for myself, not a business application or such), and since I didn't want to write all that FTP request/response code myself, I (being the lazy man I am) searched the internet for an FTP wrapper.
I have found this DLL.
This all works like a charm, except for one thing: when I request the LastWriteTime of a specific file on the FTP server, the DLL gives me strange (fictional) dates. I've been able to find the problem. Whenever you send a request to the FTP server, it sends back a one-line response in a very specific format. From what I've been able to gather, this format differs between servers; my wrapper DLL comes with 6 predefined response formats, but my FTP server sends back a 7th one. Here's a response to a request, followed by the regex formats:
-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file
Here are my regex parsing formats:
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)", _
"(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[Aa|Pp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)"
None of these seems to parse the datetime correctly, and since I have no idea how to do that, can a regex pro please write me a ParsingFormat that would parse the above FTP response?
Both a hand check and an irb check of the fourth format show that it does match:
> re=/(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
=> /(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
> m=re.match("-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file")
=> #<MatchData "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file" dir:"-" permission:"rw-r--r--" size:"594" timestamp:"Jun 11 03:44" name:"random_log.file">
> m['dir']
=> "-"
> m['permission']
=> "rw-r--r--"
> m['size']
=> "594"
> m['timestamp']
=> "Jun 11 03:44"
> m['name']
=> "random_log.file"
>
I think the pile of regular expressions is fine. Perhaps you need to look elsewhere for the problem.
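The same check can be repeated for all six formats. The Python sketch below is only a rough equivalent of what the wrapper might do: how the DLL actually applies the formats (order, search versus full match) is not shown in the question, so re.search and the helper to_python are assumptions. The helper just rewrites (?<name>...) as Python's (?P<name>...).

```python
import re

# The six parsing formats from the question, in their original syntax.
FORMATS = [
    r"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)",
    r"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)",
    r"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)",
    r"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)",
    r"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)",
    r"(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[Aa|Pp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)",
]


def to_python(pattern):
    # Python's re module spells named groups (?P<name>...).
    return pattern.replace("(?<", "(?P<")


response = "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file"

# Map format number (1-based, as in the question) to its captured fields.
matches = {}
for number, pattern in enumerate(FORMATS, start=1):
    found = re.search(to_python(pattern), response)
    if found:
        matches[number] = found.groupdict()

for number, fields in matches.items():
    print(number, fields["timestamp"], fields["name"])
```

In this sketch the fourth and fifth formats both match the response, and both capture the timestamp as Jun 11 03:44. That string contains no year, so whatever converts it into a date has to supply one; that conversion step, rather than the regexes, may be a place to look.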