Perl Regex issues - regex

Why isn't this Perl regex working? I'm grabbing the date and the username. The date works fine and the usernames come out as expected until it hits bob.thomas, where it grabs the entire line.
Code:
m/^(.+)\s-\sUser\s(.+)\s/;
print "$2------\n";
Sample Data:
Feb 17, 2013 12:18:02 AM - User plasma has logged on to client from host
Feb 17, 2013 12:13:00 AM - User technician has logged on to client from host
Feb 17, 2013 12:09:53 AM - User john.doe has logged on to client from host
Feb 17, 2013 12:07:28 AM - User terry has logged on to client from host
Feb 17, 2013 12:04:10 AM - User bob.thomas has been logged off from host because its web server session timed out. This means the web server has not received a request from the client in 3 minute(s). Possible causes: the client process was killed, the client process is hung, or a network problem is preventing access to the web server.
For the user who asked for the full code:
open(FILE, "log") or die "couldn't open file: $!";
$record = 0;
$first = 1;
while (<FILE>)
{
    # find the area to start recording
    if (m/(.+)\sto now/ && $first == 1)
    {
        $record = 1;
        $first = 0;
    }
    if ($record == 1)
    {
        m/^(.+)\s-\sUser\s(.+)\s/;   # the pattern in question
        <STDIN>;                     # wait for Enter before printing (debugging aid)
        print "$2------\n";
        if (!exists $users{$2})      # remember the first date seen for each user
        {
            $users{$2} = $1;
        }
    }
}

.+ is greedy: it matches the longest possible string. If you want it to match the shortest, use .+?:
/^(.+)\s-\sUser\s(.+?)\s/;
Or use a pattern that matches only non-whitespace characters:
/^(.+)\s-\sUser\s(\S+)/;
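To see the difference between the three patterns side by side, here is a minimal sketch in Python rather than Perl (the quantifier behaviour shown is the same in both). The input is the bob.thomas line from the sample data, shortened; the variable names are only illustrative.

```python
import re

# The bob.thomas line from the sample data, shortened for readability.
line = ("Feb 17, 2013 12:04:10 AM - User bob.thomas has been logged off "
        "from host because its web server session timed out.")

greedy = re.match(r"^(.+)\s-\sUser\s(.+)\s", line)      # pattern from the question
lazy = re.match(r"^(.+)\s-\sUser\s(.+?)\s", line)       # non-greedy second group
non_space = re.match(r"^(.+)\s-\sUser\s(\S+)", line)    # non-whitespace only

print(greedy.group(2))     # runs on to the last whitespace in the line
print(lazy.group(2))       # stops at the first whitespace after the name
print(non_space.group(2))  # same result, without relying on backtracking
```

The \S+ form is usually the easier one to reason about, because the capture simply cannot contain a space.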

Use the reluctant (non-greedy) quantifier to match up to the first occurrence rather than the last. You should do this for both groups, in case the text after the username also contains " - User ":
m/^(.+?)\s-\sUser\s(.+?)\s/;
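Why the first group matters can be shown with a line whose message text itself contains " - User ". The line below is made up for illustration (it is not from the sample data), and the sketch is in Python rather than Perl.

```python
import re

# Hypothetical line: the message text itself contains " - User ".
line = ("Feb 17, 2013 12:04:10 AM - User bob said: see ticket "
        "- User alice is away")

# Greedy first group: it extends to the LAST " - User " in the line.
greedy_first = re.match(r"^(.+)\s-\sUser\s(.+?)\s", line)

# Non-greedy first group: it stops at the FIRST " - User ".
lazy_both = re.match(r"^(.+?)\s-\sUser\s(.+?)\s", line)

print(greedy_first.groups())
print(lazy_both.groups())
```

With the greedy first group the "date" swallows part of the message and the "username" becomes alice; with both groups non-greedy the date and bob are captured.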

Related

Regex for filtering out the groups and if there is specific string in the group then extract that into another group

Hi, I am trying to match 3 log lines with one regex. The issue I face is that the pattern is not dynamic: when a value changes, the regex no longer works for that group.
I think a practical example will give a better understanding: https://regex101.com/r/sdoZaH/1
There, group 1 (<address>) works on the 1st log line only; it is not able to identify the string in the 2nd line.
In the <message> group, if there is an IP Addr I want it in a separate group; otherwise <message> should cover the remaining part of the line.
How do I make the pattern dynamic so that it matches all lines?
The lines I am trying to match:
Mar 21 23:31:19 c10sw1 raslogd: AUDIT, 2022/03/21-23:31:19 (PDT), [SEC-3020], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10sw1/FID 128, 8.2.1c, , , , , , , Event: login, Status: success, Info: Successful login attempt via REMOTE, IP Addr: test.domain.com.
Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), [SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, Status: success, Info: Successful logout by user [admin].
Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), [SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, Status: success, Info: Successful logout by user [admin].
Please try the following pattern:
^[A-Za-z]+[\d\s:]+(?<address>\D\w+)\s.+?,\s(?<time>\d+\/\d+\/\d+\-\d+\:\d+\:\d+).+?\s\w+\/.+?\/(?<domain>.+?)\/(?<destinationprocess>.+?)\/(?<sourceprocess>.+?),.+Event:\s(?<eventtype>.+?),.+Status:\s(?<status>.+?),\sInfo:\s(?<message>.+)$
Please could you put various valid and invalid strings in the question.
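As a rough way to try this outside regex101, here is a Python sketch adapted from the pattern above. It is not the pattern exactly as posted: Python spells named groups (?P<name>...), the address group is widened to [\w-]+ so that a hyphenated host name such as c10-M1000e-SW1 is captured whole, and an optional ip group is added for the "IP Addr" part asked about in the question. The ip group, the stripping of the final period, and the parse helper are assumptions made for this example.

```python
import re

LOG_RE = re.compile(
    r"^[A-Za-z]+[\d\s:]+(?P<address>[\w-]+)\s.+?,\s"
    r"(?P<time>\d+/\d+/\d+-\d+:\d+:\d+).+?\s\w+/.+?/"
    r"(?P<domain>.+?)/(?P<destinationprocess>.+?)/(?P<sourceprocess>.+?),"
    r".+Event:\s(?P<eventtype>.+?),.+Status:\s(?P<status>.+?),\sInfo:\s"
    # message stops before an optional ", IP Addr: ..." tail and the final "."
    r"(?P<message>.+?)(?:,\sIP Addr:\s(?P<ip>.+?))?\.?$"
)

LINE_LOGIN = (
    "Mar 21 23:31:19 c10sw1 raslogd: AUDIT, 2022/03/21-23:31:19 (PDT), "
    "[SEC-3020], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, "
    "ad_0/c10sw1/FID 128, 8.2.1c, , , , , , , Event: login, Status: success, "
    "Info: Successful login attempt via REMOTE, IP Addr: test.domain.com."
)
LINE_LOGOUT = (
    "Mar 21 23:37:13 c10-M1000e-SW1 raslogd: AUDIT, 2022/03/21-23:37:13 (PDT), "
    "[SEC-3022], INFO, SECURITY, admin/admin/test.domain.com/ssh/CLI, "
    "ad_0/c10-M1000e-SW1/FID 128, 8.2.2b, , , , , , , Event: logout, "
    "Status: success, Info: Successful logout by user [admin]."
)


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


for line in (LINE_LOGIN, LINE_LOGOUT):
    print(parse(line))
```

For the logout line the ip entry is None, since that line has no "IP Addr" part.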

Match 2 Pulse Secure events with 1 regular expression

I am trying to match 2 events with 1 regular expression and need some help.
REGEX
^(?:[^\.\n]*\.){6}\d+\s+\w+\s+(?P<software>\w+\-\w+/\d+\.\d+\.\d+\.\d+\s+\(\w+\s+\d+\)\s+\w+/\d+\.\d+\.\d+\.\d+)
Match
Mar 31 02:37:38 vpn PulseSecure: 2020-03-31 02:37:38 - vpn - [192.168.17.249] FRED(DUO-Windows)[] - Agent login succeeded for FRED/DUO-Windows from 192.168.17.24 with Pulse-Secure/8.3.4.1333 (Windows 10) Pulse/5.3.4.1333.
software Pulse-Secure/8.3.4.1333 (Windows 10) Pulse/5.3.4.1333
No match
Mar 31 03:01:13 vpn PulseSecure: 2020-03-31 03:01:13 - vpn - [192.168.17.24] FRED(DUO-Mac)[Mac] - Agent login succeeded for FRED/DUO-Mac from 192.168.17.24 with Pulse-Secure/9.0.4.1731 (Macintosh 10_14) Pulse/9.0.4.1731.
Your pattern didn't work because the parenthesized part differs between the two events, but the pattern uses the same sub-pattern (\w+\s+\d+) for both. \d+ matches 10 but cannot match 10_14:
(Windows 10)
(Macintosh 10_14)
I have updated the regex; please check here.
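The updated pattern itself is not reproduced in the text, so the following is only one possible fix, sketched in Python: the parenthesized part is matched with \([^)]+\), which accepts anything up to the closing parenthesis. The names ORIGINAL_RE and RELAXED_RE are made up for this example.

```python
import re

PREFIX = r"^(?:[^.\n]*\.){6}\d+\s+\w+\s+"
VERSION = r"\d+\.\d+\.\d+\.\d+"

# Pattern from the question: the OS part must be "word, space, digits".
ORIGINAL_RE = re.compile(
    PREFIX + r"(?P<software>\w+-\w+/" + VERSION
    + r"\s+\(\w+\s+\d+\)\s+\w+/" + VERSION + r")"
)

# One possible relaxation: accept anything inside the parentheses.
RELAXED_RE = re.compile(
    PREFIX + r"(?P<software>\w+-\w+/" + VERSION
    + r"\s+\([^)]+\)\s+\w+/" + VERSION + r")"
)

WINDOWS = (
    "Mar 31 02:37:38 vpn PulseSecure: 2020-03-31 02:37:38 - vpn - "
    "[192.168.17.249] FRED(DUO-Windows)[] - Agent login succeeded for "
    "FRED/DUO-Windows from 192.168.17.24 with Pulse-Secure/8.3.4.1333 "
    "(Windows 10) Pulse/5.3.4.1333."
)
MAC = (
    "Mar 31 03:01:13 vpn PulseSecure: 2020-03-31 03:01:13 - vpn - "
    "[192.168.17.24] FRED(DUO-Mac)[Mac] - Agent login succeeded for "
    "FRED/DUO-Mac from 192.168.17.24 with Pulse-Secure/9.0.4.1731 "
    "(Macintosh 10_14) Pulse/9.0.4.1731."
)

print(ORIGINAL_RE.match(MAC))  # None: \d+ cannot match "10_14"
for event in (WINDOWS, MAC):
    print(RELAXED_RE.match(event).group("software"))
```

A tighter alternative would be \(\w+\s+\w+\), since \w also matches the underscore in 10_14.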

How to create Regex pattern for fluentd

I am trying to parse daemon logs from my Linux machine into Elasticsearch using fluentd, but I am having a hard time creating a regex pattern for them. Below are a few lines from the daemon logs:
Jun 5 06:46:14 user avahi-daemon[309]: Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*.
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting default route via fe80::1e56:feff:fe13:2da
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting route to 2402:3a80:9db:48da::/64
Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting address fe80::a7c0:8b54:ee45:ea4
Jun 5 06:46:14 user avahi-daemon[309]: Withdrawing address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.
Jun 5 06:46:14 user avahi-daemon[309]: Leaving mDNS multicast group on interface wlan0.IPv6 with address fe80::a7c0:8b54:ee45:ea4.
As you can see from the logs above, first we have the time of the log entry, then the username and the daemon name, followed by the message.
I want to produce the JSON format below for these logs:
{
"time": "Jun 5 06:46:14",
"username": "user",
"daemon": "avahi-daemon[309]",
"msg": "Registering new address record for fe80::a7c0:8b54:ee45:ea4 on wlan0.*."
}
{
"time": "Jun 5 06:46:14",
"username": "user",
"daemon": "dhcpcd[337]: wlan0",
"msg": "deleting default route via fe80::1e56:feff:fe13:2da"
}
Can anyone please give me some help with this? Is there any tool we can use to generate a regex for fluentd?
Edit:
I have managed to get a few things matched from the logs, like:
^(?<time>^(.*?:.*?):\d\d) (?<username>[^ ]*) matches Jun 5 06:46:14 user
but when I pass this to Fluentular, it does not show any results.
Try Regex: ^(?<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s(?<username>[^ ]+)\s+(?<daemon>[^:]+):\s+(?<message>.*)$
See Demo
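To check what the suggested pattern captures before putting it into a fluentd config, here is a small Python sketch. Python spells named groups (?P<name>...) instead of (?<name>...), so the pattern is adapted accordingly; the to_record helper is a made-up name for this example.

```python
import json
import re

DAEMON_RE = re.compile(
    r"^(?P<time>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s"
    r"(?P<username>[^ ]+)\s+(?P<daemon>[^:]+):\s+(?P<message>.*)$"
)


def to_record(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = DAEMON_RE.match(line)
    return match.groupdict() if match else None


lines = [
    "Jun 5 06:46:14 user avahi-daemon[309]: Registering new address record "
    "for fe80::a7c0:8b54:ee45:ea4 on wlan0.*.",
    "Jun 5 06:46:14 user dhcpcd[337]: wlan0: deleting default route via "
    "fe80::1e56:feff:fe13:2da",
]

for line in lines:
    print(json.dumps(to_record(line), indent=2))
```

Two things differ from the JSON requested in the question: the last field is named message rather than msg, and for the dhcpcd line the daemon group stops at the first colon, so "wlan0:" ends up at the start of the message instead of in the daemon field.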

How to write Regex on last matching line then do another match then return the some value in log file?

I need help with a regex. I have a log file that is continuously populated with log information.
I need to find the last occurrence of a string and, in the same line, get the value of another string.
Below are the conditions:
1. It is a runtime log file. I need to find the last match in the file (that is, take the latest match; there will be many identical entries).
2. Find the last line that contains the text Tomcat - Deploy A New Application Simple Web App
3. In the same line where that text was found, find the entry result=Step Result: true (this search should be done only if the text from step 2 is found).
4. Then return the step result value, in this case true.
I used the regex below.
Deploy A New Application Simple Web App(.*)Step Result: (\w+)
I need help finding the last entry and returning only the Step Result value.
Example: ABCD.log
2017-09-21 22:47:37,381 [job-426106-jobServer-426106-3:Tomcat - Deploy
A New Application Simple Web App(P81.F2842.E2849.E2845)] DEBUG
(com.nolio.nimi.appmsg.durability.DurableCommunicationApi:155) - Got
new message:
INBAHRLP01735_150556657499340:payload=[ID:8444fce1d80a400_7f7#INBAHRLP01735,
from:INBAHRLP01735,
to:executionLog__426106_INBAHRLP01735#es_INBAHRLP01735-
StepExecutionEventDto [result=Step Result: true - Deploy an web
application [simple-web-app] successfully., hostIp=INBAHRLP01735,
jobId=426106, envServerId=426106, timestamp=Thu Sep 21 22:47:37 IST
2017, state=FINISHED, stepId=P81.F2842.E2849.E2845, stepTitle=Tomcat -
Deploy A New Application Simple Web App, startDate=Thu Sep 21 22:47:22
IST 2017, stopDate=Thu Sep 21 22:47:37 IST 2017, eventCounter=4264]]
2017-09-21 22:48:37,381 [job-426106-jobServer-426106-3:Tomcat - Deploy
A New Application Simple Web App(P81.F2842.E2849.E2845)] DEBUG
(com.nolio.nimi.appmsg.durability.DurableCommunicationApi:155) - Got
new message:
INBAHRLP01735_150556657499340:payload=[ID:8444fce1d80a400_7f7#INBAHRLP01735,
from:INBAHRLP01735,
to:executionLog__426106_INBAHRLP01735#es_INBAHRLP01735-
StepExecutionEventDto [result=Step Result: true - Deploy an web
application [simple-web-app] successfully., hostIp=INBAHRLP01735,
jobId=426106, envServerId=426106, timestamp=Thu Sep 21 22:47:37 IST
2017, state=FINISHED, stepId=P81.F2842.E2849.E2845, stepTitle=Tomcat -
Deploy A New Application Simple Web App, startDate=Thu Sep 21 22:48:22
IST 2017, stopDate=Thu Sep 21 22:48:37 IST 2017, eventCounter=4264]]
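One way to approach the conditions above, sketched in Python: scan the file line by line, apply a single regex that requires both the step title and the Step Result on the same line, and keep only the most recent capture. This assumes each log entry is one physical line in the real file (the wrapping in the example is taken to be a display artifact); the sample entries below are shortened and made up, and last_step_result is a hypothetical helper name.

```python
import re

# The step title must appear first, then "Step Result: <value>" later on the same line.
STEP_RE = re.compile(
    r"Tomcat - Deploy A New Application Simple Web App.*?Step Result: (\w+)"
)


def last_step_result(lines):
    """Return the Step Result value from the last matching line, or None."""
    result = None
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            result = match.group(1)  # later matches overwrite earlier ones
    return result


# Made-up, shortened entries, one per line.
sample = [
    "2017-09-21 22:47:37,381 [job:Tomcat - Deploy A New Application Simple "
    "Web App(P81)] DEBUG - [result=Step Result: true - Deploy ok]",
    "2017-09-21 22:47:50,000 [job:Other Step(P82)] DEBUG - "
    "[result=Step Result: true]",
    "2017-09-21 22:48:37,381 [job:Tomcat - Deploy A New Application Simple "
    "Web App(P81)] DEBUG - [result=Step Result: false - Deploy failed]",
]

print(last_step_result(sample))
```

For a real file, passing the open file object (for example `with open("ABCD.log") as fh: last_step_result(fh)`) avoids loading the whole log into memory.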

Regex: Matching,parsing an FTP response to a request

Here's what I'm trying to do:
I want to have some FTP functionality in one of my apps (this is just for myself, not a business application or such), and since I didn't want to write all that FTP request/response code myself, I (being the lazy man I am) searched the internet for an FTP wrapper.
I have found this DLL.
This is all very great and works like a charm, except for one thing: when I request the LastWriteTime of a specific file on the FTP server, the DLL gives me strange dates (namely, it prints out fictional dates). I've been able to find the problem. Whenever you send a request to the FTP server, it sends back a one-line response in a very particular format. From what I've been able to gather, this format differs between servers; my wrapper DLL comes with 6 pre-defined response formats, but my FTP server sends back a 7th one. Here's a response to a request, followed by the regex formats:
-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file
Here are my regex parsing formats:
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)", _
"(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[Aa|Pp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)"
None of these seems to parse the datetime correctly, and since I have no idea how to do that, could a regex pro please write me a ParsingFormat that is able to parse the above FTP response?
Both a hand check and an irb check of the fourth format show that it does match:
> re=/(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
=> /(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
> m=re.match("-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file")
=> #<MatchData "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file" dir:"-" permission:"rw-r--r--" size:"594" timestamp:"Jun 11 03:44" name:"random_log.file">
> m['dir']
=> "-"
> m['permission']
=> "rw-r--r--"
> m['size']
=> "594"
> m['timestamp']
=> "Jun 11 03:44"
> m['name']
=> "random_log.file"
>
I think the pile of regular expressions is fine. Perhaps you need to look elsewhere for the problem.
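The same check can be repeated in Python (named groups are spelled (?P<name>...) there). The sketch below also illustrates one place to look: the captured timestamp "Jun 11 03:44" contains no year, so whatever converts it to a date has to supply one. Whether that is what goes wrong inside the DLL is an assumption, not something confirmed here; to_datetime and its year parameter are made up for this example.

```python
import re
from datetime import datetime

# The fourth format from the question, with Python-style named groups.
UNIX_LIST_RE = re.compile(
    r"(?P<dir>[\-d])(?P<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+"
    r"(?P<size>\d+)\s+(?P<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?P<name>.+)"
)

response = "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file"
fields = UNIX_LIST_RE.match(response).groupdict()
print(fields)


def to_datetime(timestamp, year):
    """Combine a 'Mon DD HH:MM' timestamp with an explicitly supplied year."""
    return datetime.strptime(f"{year} {timestamp}", "%Y %b %d %H:%M")


# The year is NOT in the FTP response; 2013 here is an arbitrary example value.
print(to_datetime(fields["timestamp"], 2013))
```

The month abbreviation is parsed with %b, which depends on the process locale; the sketch assumes an English or C locale.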