I am reading logs from a file that QRadar treats as a multiline log source. While reading, QRadar sometimes joins two records and treats them as one log.
I defined the start and end patterns of a log line when adding the log source to QRadar:
Start Pattern RegEx: ^(\d{7})\,
End Pattern RegEx: (\d{2}:\d{2}:\d{2})$
I should be reading the logs like this:
1158896,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:01:15,Security Management,Log in to the server,Network Management,Succeeded,User name: someuser,2019-09-29 03:01:15
1158897,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:03:16,Security Management,Log out the server,Network Management,Succeeded,"User name: someuserOnline duration: 0 day(s) 0 hour(s) 2 minute(s) 1 second(s)",2019-09-29 03:03:16
But I receive some of them assembled, like:
1158896,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:01:15,Security Management,Log in to the server,Network Management,Succeeded,User name: someuser,2019-09-29 03:01:151158897,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:03:16,Security Management,Log out the server,Network Management,Succeeded,"User name: someuserOnline duration: 0 day(s) 0 hour(s) 2 minute(s) 1 second(s)",2019-09-29 03:03:16
Here are the regex101.com demos of my start and end patterns:
https://regex101.com/r/2IfMR7/3
https://regex101.com/r/2IfMR7/4
As you can see, they work as expected on regex101.com.
Why is QRadar reading them as one?
You (or QRadar) might be using a greedy quantifier together with a group that matches across lines.
If you're doing something like this: ^(\d{7})\,(?:\n|.)*(\d{2}:\d{2}:\d{2})$ where the central group is (?:\n|.)* or any similar phrase matching across multiple lines, the greedy operator * means it will match from the very first 7 digits to the very last timestamp in the whole log, skipping over every record start and end in between. Try using *? instead; the ? makes it non-greedy (lazy), so it stops at the first timestamp that ends a line.
Compare: greedy vs non-greedy.
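To make the difference concrete, here is a minimal sketch in Python. Python's re module is only a stand-in here (which engine QRadar uses internally is not something I can confirm), and the two records are the ones from the question:

```python
import re

# The two records from the question, joined by a newline as they sit in the file.
log = (
    "1158896,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:01:15,"
    "Security Management,Log in to the server,Network Management,Succeeded,"
    "User name: someuser,2019-09-29 03:01:15\n"
    "1158897,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:03:16,"
    "Security Management,Log out the server,Network Management,Succeeded,"
    '"User name: someuserOnline duration: 0 day(s) 0 hour(s) 2 minute(s) 1 second(s)",'
    "2019-09-29 03:03:16"
)

# Same start and end patterns; only the quantifier in the middle differs.
greedy = re.compile(r"^(\d{7}),(?:\n|.)*(\d{2}:\d{2}:\d{2})$", re.MULTILINE)
lazy = re.compile(r"^(\d{7}),(?:\n|.)*?(\d{2}:\d{2}:\d{2})$", re.MULTILINE)

greedy_records = [m.group(0) for m in greedy.finditer(log)]
lazy_records = [m.group(0) for m in lazy.finditer(log)]

print(len(greedy_records))  # both records swallowed by a single match
print(len(lazy_records))    # one match per record
```

The greedy version runs to the last timestamp in the input, while the lazy one stops at the first timestamp that sits at the end of a line.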
I wrote a simple regex for extracting the user SC08.
https://regex101.com/r/L1DOzH/1/ Performance-wise it is really bad, taking around 1448 steps.
Jun 2 11:16:44 192.168.55.19 1 2020-06-02T10:16:43.721Z chisdsm#abcd.com dsm 4493 USR1278I [U#21513 sev="INFO" msg="user logged out due to inactivity" user="SC08"]
Jun 2 10:13:50 192.168.55.19 1 2020-06-02T09:13:50.297Z chisdsm#abcd.com dsm 4493 DO0426I [DA#21513 sev="INFO" msg="switch domain" admin="SC08"
Jun 2 10:13:43 192.168.55.19 1 2020-06-02T09:13:42.956Z chisdsm#abcd.com dsm 4493 DAO0267I [DA#21513 sev="INFO" msg="user logged in" admin="SC08" stime="2020-06-02 10:13:42.944" role="ALL_ADMIN" source="192.168.54.9"]
May 27 15:53:38 192.168.55.129 1 2020-05-27T14:53:37.669Z chisdsm#abcd.com dsm 4493 DAO0227I [DA#21513 sev="INFO" msg="delete file signature" user="SC08" filePath="/bin/rm"]
An alternation group at the very start of a regex cancels some optimizations that are in place for patterns that start with a more specific subpattern.
Since both of your alternatives match strings that follow =, you may put = at the beginning of the pattern and then use lookarounds, as in Michail's suggestion. Here is a small variation that takes 139 steps:
=(?:(?<=user=)"(?<user1>\w+)|(?<=admin=)"(?<user2>\w+))
See the regex demo. Details
= - an equals sign
(?:(?<=user=)"(?<user1>\w+)|(?<=admin=)"(?<user2>\w+)) - a non-capturing group:
(?<=user=) - user= must be immediately to the left of the current position
" - a " char
(?<user1>\w+) - Group "user1": 1+ word chars
| - or
(?<=admin=) - admin= must be immediately to the left of the current position
" - a " char
(?<user2>\w+) - Group "user2": 1+ word chars
If your matches are always preceded with a whitespace, use it as the first pattern:
\s(?:user="(?<user1>\w+)|admin="(?<user2>\w+))
See this regex demo, with 918 steps.
If you know the matches are somewhere close to the end of the line, use
.*\b(?:user="(?<user1>\w+)|admin="(?<user2>\w+))
See this regex demo, 568 steps. .* at the start moves the regex index to the end of the line/string, and the engine then backtracks to find either user= or admin=.
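If you want to try the first pattern outside regex101, here is a rough Python sketch. Note the assumptions: Python's re spells named groups (?P<name>...) instead of (?<name>...), the helper name extract_user is made up for illustration, and the step counts above come from regex101 and do not carry over to other engines.

```python
import re

# The "=" first, then lookbehinds to decide which key preceded it.
USER_RE = re.compile(r'=(?:(?<=user=)"(?P<user1>\w+)|(?<=admin=)"(?P<user2>\w+))')

def extract_user(line):
    """Return the value of user="..." or admin="...", or None if neither is present."""
    m = USER_RE.search(line)
    if m is None:
        return None
    # Only one of the two named groups participates in a match.
    return m.group("user1") or m.group("user2")

logout_line = (
    'Jun 2 11:16:44 192.168.55.19 1 2020-06-02T10:16:43.721Z chisdsm#abcd.com dsm 4493 '
    'USR1278I [U#21513 sev="INFO" msg="user logged out due to inactivity" user="SC08"]'
)
login_line = (
    'Jun 2 10:13:43 192.168.55.19 1 2020-06-02T09:13:42.956Z chisdsm#abcd.com dsm 4493 '
    'DAO0267I [DA#21513 sev="INFO" msg="user logged in" admin="SC08" '
    'stime="2020-06-02 10:13:42.944" role="ALL_ADMIN" source="192.168.54.9"]'
)

print(extract_user(logout_line))
print(extract_user(login_line))
```

Both sample lines yield SC08; the msg="user logged ..." text is skipped because its = is preceded by msg, not by user or admin.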
I have a log file that records connection drops of computers on a LAN. I want to extract the name of each computer from every line of the log file, and for that I am using this: (?<=Name:)\w+|(-PC)
The target text:
`[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: | (11/23 10:54:09 - 11/23 10:54:44) | Average limit (300) exceeded while pinging www.google.com [74.125.224.147] 8x
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: | (11/23 17:02:00) | Client is connected to agent.`
The problem is that some computer names have -PC in them and some don't. The expression I created matches computers without -PC in their names, but if a computer has -PC in its name, it treats that as a separate match, and I don't want that. In short, it gives me 3 matches, but I want only 2. That's why I need help here; I am a beginner in regex.
You may use
(?<=Name:)\w+(?:-PC)?
Details
(?<=Name:) - a place immediately preceded with Name:
\w+ - 1+ word chars
(?:-PC)? - an optional non-capturing group that matches 1 or 0 occurrences of -PC substring.
Consider using word boundaries if you need to match PC as a whole word,
(?<=Name:)\w+(?:-PC\b)?
See the regex demo.
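For a quick check outside the demo, here is a small Python sketch comparing the pattern from the question with the suggested one. The two sample lines are shortened versions of the target text; Python is just a convenient stand-in for whatever tool you run the regex in.

```python
import re

# Shortened versions of the two lines from the question.
log = (
    "[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId:\n"
    "[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId:\n"
)

# Original pattern: "-PC" is a separate top-level alternative, so it matches on its own.
original = [m.group(0) for m in re.finditer(r"(?<=Name:)\w+|(-PC)", log)]

# Suggested pattern: "-PC" is an optional tail of the same match.
fixed = re.findall(r"(?<=Name:)\w+(?:-PC)?", log)

print(original)
print(fixed)
```

The original pattern produces three matches (KCUTSHALL, -PC, FRONTOFFICE), the fixed one produces two.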
I have a problem with the following regex pattern:
m).*?^([^n]*)(modified)([^n]*)$.*
I want to replace the clipboard contents with
Clipboard := RegExReplace(Clipboard, "m).*?^([^n]*)(modified)([^n]*)$.*" ,"" )
Source looks like:
Ask Question Interesting 326 Featured
Hot Week Month 1 vote 0 answers 12 views
Type Guard for empty object
typescript modified 2 mins ago kremerd 312
0 votes
Expected result should be:
typescript modified 2 mins ago kremerd 312
But it replaces nothing. If this works, I later want to get the tag names ^([^n]*) by using RegExMatch.
I am scripting with AutoHotkey (open-source software for Windows) from https://autohotkey.com
You want to match a line that contains a modified substring. The dot in a regex does not match a newline by default, so you need to pass the s (DOTALL) modifier (you may combine it with the m, MULTILINE, modifier, which makes ^ match at the start of each line and $ match at the end of each line). Besides, to match non-newline characters you need [^\n] (not [^n], which matches any character other than the letter n).
To solve the issue you may use
RegExMatch(Clipboard, "s)^.*?(\n[^\n]*)(modified|asked|answered)", res)
res holds the whole match, res1 holds the text before the keyword on its line (Group 1, including the leading newline) and res2 holds the keyword itself.
Details
s) - the . now matches any char including line break chars
^ - start of the string
.*? - any 0+ chars, as few as possible
(\n[^\n]*) - Group 1 (accessed via res1 later): a newline followed with 0+ chars other than newline chars
(modified|asked|answered) - any of the three alternatives: modified, asked or answered.
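The same pattern can be tried in Python as a rough sketch of what the AutoHotkey call does. This is not AutoHotkey code: re.S plays the role of the s) option, and m.group(1) and m.group(2) correspond to res1 and res2.

```python
import re

# The clipboard contents from the question.
text = (
    "Ask Question Interesting 326 Featured\n"
    "Hot Week Month 1 vote 0 answers 12 views\n"
    "Type Guard for empty object\n"
    "typescript modified 2 mins ago kremerd 312\n"
    "0 votes\n"
)

# re.S lets "." cross line breaks, so the lazy ".*?" can skip whole lines.
m = re.search(r"^.*?(\n[^\n]*)(modified|asked|answered)", text, re.S)

tags = m.group(1).strip()   # text before the keyword on the matching line
keyword = m.group(2)
print(tags, keyword)
```

With [^n] instead of [^\n], the group would stop at the first letter n rather than at the line break, which is one reason the original pattern did not behave as intended.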
I have a column of dates in Notepad++:
2017-06-12
2017-06-13
2017-06-14
2017-06-15
2017-06-16
2017-06-17
2017-06-18
2017-06-19
2017-06-20
2017-06-20
2017-06-21
2017-06-22
2017-06-23
2017-06-24
2017-06-25
2017-06-26
2017-06-27
2017-06-28
2017-06-29
2017-06-30
2017-07-01
2017-07-02
2017-07-03
2017-07-04
2017-07-05
2017-07-06
2017-07-07
2017-07-08
2017-07-09
2017-07-10
I need to cut it into weeks by placing \r\n after each week, like this:
2017-06-12
2017-06-13
2017-06-14
2017-06-15
2017-06-16
2017-06-17
2017-06-18

2017-06-19
2017-06-20
2017-06-20
2017-06-21
2017-06-22
2017-06-23
2017-06-24
2017-06-25

2017-06-26
2017-06-27
2017-06-28
2017-06-29
2017-06-30
2017-07-01
2017-07-02

2017-07-03
2017-07-04
2017-07-05
2017-07-06
2017-07-07
2017-07-08
2017-07-09

2017-07-10
I do the replacement using RegEx. I find 7 days with:
\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n
And now I would like to add \r\n.
But how do I replace the matched text with itself plus \r\n?
If you are sure that the first date is a Monday, you could do this:
Ctrl+H
Find what: (?:\d{4}-\d\d-\d\d\R){7}
Replace with: $0\r\n
Replace all
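Outside Notepad++, the same find/replace can be sketched in Python. Two differences to be aware of: Python's re has no \R, so \r?\n is used as an approximation, and the whole match is written \g<0> instead of $0. The 15-day input is generated just for the demonstration.

```python
import re
from datetime import date, timedelta

start = date(2017, 6, 12)  # the first date in the question, a Monday
# 15 consecutive days, one per line, no duplicates.
days = "".join(f"{start + timedelta(days=i)}\n" for i in range(15))

# Match 7 date lines in a row and put the match back followed by an empty line.
weekly = re.sub(r"(?:\d{4}-\d\d-\d\d\r?\n){7}", r"\g<0>\n", days)

blocks = weekly.split("\n\n")
print([len(b.splitlines()) for b in blocks])
```

This gives two full weeks followed by the one leftover day. Like the Notepad++ version, it counts lines only, so a duplicated date would shift the week boundaries.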
In your example input some lines are doubled, e.g. 2017-06-20. In your example output this line is also doubled, and the week block consists of eight lines: seven unique lines and one doubled line for 2017-06-20. I assume that all lines in the input are sorted, so duplicate lines are adjacent. Additionally, I assume that the first line marks the first day of a week.
Do a regular expression find/replace like this:
Open Replace Dialog
Find What: (((.*\R)\3*){7})
Replace With: \1\r\n
Check regular expression, do not check . matches newline
Click Replace or Replace All
Explanation
Let's explain (((.*\R)\3*){7}) from the inside out, starting at the third, innermost group. In the following, x and y stand for regex parts, not literal characters.
(.*\R) the third group is just one line from start to end, including its line break
(y\3*) a y followed by zero or more repetitions of whatever the third group captured; since y is the third group itself, this matches a line followed by any number of copies of that line, which deals with the 2017-06-20 case
(x{7}) seven repetitions of x, which here means seven unique lines, each of which may be repeated within the block, so 8 lines with one line doubled is fine
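Here is a small Python sketch of the same idea, run on the dates from the question. It is an approximation of the Notepad++ pattern: Python's re has no \R, so \n is used, and the two outer capturing groups are dropped, which turns \3 into \1.

```python
import re
from datetime import date, timedelta

# 2017-06-12 .. 2017-07-10, with 2017-06-20 doubled as in the question.
dates = [str(date(2017, 6, 12) + timedelta(days=i)) for i in range(29)]
dates.insert(dates.index("2017-06-20"), "2017-06-20")
text = "\n".join(dates) + "\n"

# (.*\n) is one line; \1* swallows adjacent copies of that same line;
# seven such "unique line plus copies" units make one week block.
weekly = re.sub(r"(?:(.*\n)\1*){7}", r"\g<0>\n", text)

sizes = [len(block.splitlines()) for block in weekly.split("\n\n")]
print(sizes)
```

The second block comes out with eight lines because the doubled 2017-06-20 is absorbed by the backreference instead of being counted as a new day.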
I want to get all text that does not start with 1,2,12,34.
I wrote
^((?!1|2|12|34).)*$
(^ asserts position at start of a line)
as in:
https://regex101.com/r/gI6sN8/14
Problems
It also doesn't select text that has 1 or 2 in the middle ("AB 1 CD").
It also doesn't select 13 (because it starts with 1)
How can I restrict it?
Looks like you want this:
^(?!(1|2|12|34)\s).*
https://regex101.com/r/gI6sN8/16
As mentioned in the comments, you need a word boundary and the correct parenthesis position:
^(?!(?:1|2|12|34)\b)(.*)$
Regex Demo
You can also use \D
^(?!(?:1|2|12|34)\D)(.*)$
In your regex
^((?!1|2|12|34).)*$
you are checking at every position in the line whether any of the alternatives 1|2|12|34 matches there. That's why it doesn't match AB 1 CD.
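The difference between checking at every position and checking once at the start of the line can be seen in a short Python sketch. The sample lines are made up for illustration; only the two patterns come from this thread.

```python
import re

# Lookahead repeated before every character: any 1, 2, 12 or 34 anywhere rejects the line.
tempered = re.compile(r"^((?!1|2|12|34).)*$")
# Lookahead applied once at the start, with a word boundary after the number.
anchored = re.compile(r"^(?!(?:1|2|12|34)\b)(.*)$")

# Hypothetical sample lines.
samples = ["1 foo", "2 bar", "12 baz", "34 qux", "13 ok", "AB 1 CD", "345 ok", "hello"]

kept_tempered = [s for s in samples if tempered.match(s)]
kept_anchored = [s for s in samples if anchored.match(s)]

print(kept_tempered)
print(kept_anchored)
```

The first pattern keeps only hello, while the second also keeps 13 ok, AB 1 CD and 345 ok, because the word boundary stops 1 and 34 from matching as prefixes of longer numbers.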
This works
^(?!(?:12?|2|34)(?!\d)).+$
https://regex101.com/r/gI6sN8/19
A valid boundary between the numbers you don't want the line to start with and the character after them appears to be any non-digit.