I need to use a regex ("filter=regexp" in NSClient++) to get a specific line (marked with "<--"), but only if there is NO $ sign in it.
Below is an example of the text I have to search (it is a description message from an event log).
A member was added to a security-enabled global group
Subject:
Security ID: ...
Account Name: ...
Account Domain: ...
Logon ID: ...
Member:
Security ID: ... <--
Account Name: ...
Group:
Security ID: ...
Group Name: ...
Group Domain: ...
Additional Information:
...
So if the line which starts with "Security ID" inside of "Member" does NOT contain a $-sign then I need the output of the regex to be "Security ID: ..." otherwise there should be no output.
I tried several things, but I can't get it working quite right:
/(?<=Member:[\n]).*(Security ID:((?![$]).)*)(?=[\n].*Account Name:)/s
--> Wrong if there is a line between "Security ID ..." and "Account Name ...", and it seems to return two matches.
So maybe someone can help me with that. ;)
UPDATE:
How to do it if there can also be multiple lines between "Member" and "Security ID"?
UPDATE2:
Actually this is what I meant:
/(?m)^(?<=Member:[\n])(?:.|\n)*?(Security ID:[^$\n]*$)(?:.|\n)*?(?=Group:[\n])/
Thanks for your help, Robin & AdamK; otherwise I would still be trying to get it right! :)
Greetings,
Cédric
The hard part is to distinguish one section (Subject, Member...) from another: we don't want to match just any Security ID. This regex relies on the two-space indentation to tell them apart.
The wanted line is captured in the first capturing group, see demo here.
(?m)Member:\n(?: .*\n|\n)*?(^ *Security ID:[^$\n]*$)
(?m) is just the inline way of turning on the multiline flag m
^[^$\n]*$ matches only lines not containing $
\S matches anything but any sort of whitespace (no newline, tab...)
(?: \S.*\n)*? matches any number of lines indented by two spaces
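To see the pattern above in action, here is a minimal Python sketch. The sample messages, the helper name member_security_id and the two-space indentation of the key/value lines are assumptions made for illustration; the real event log text may be laid out differently.

```python
import re

# Pattern from the answer above; (?m) makes ^ and $ match at line boundaries.
PATTERN = re.compile(r"(?m)Member:\n(?: .*\n|\n)*?(^ *Security ID:[^$\n]*$)")


def member_security_id(message):
    """Return the Member section's Security ID line, or None if it contains '$'."""
    match = PATTERN.search(message)
    return match.group(1).strip() if match else None


# Hypothetical messages; the two-space indentation is an assumption.
USER_EVENT = (
    "Subject:\n"
    "  Security ID: S-1-5-18\n"
    "Member:\n"
    "  Security ID: S-1-5-21-1004\n"
    "  Account Name: alice\n"
    "Group:\n"
    "  Security ID: S-1-5-32-544\n"
)
# Same message, but the member is a machine account ending in '$'.
MACHINE_EVENT = USER_EVENT.replace("S-1-5-21-1004", "EXAMPLE\\PC01$")

print(member_security_id(USER_EVENT))     # the Member line is returned
print(member_security_id(MACHINE_EVENT))  # no match, so None
```

The match cannot slide into the Group section, because the unindented "Group:" line is not consumed by the indented-line alternative.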
Related
I'm performing regex extraction for parsing logs for our SIEM. I'm working with PCRE2.
In those logs I have this problem: the field I have to extract can be preceded by several different prefixes, and I want to use only one group name.
Let me be clearer with an example.
The SSH connection can appear in our log with this form:
UserType=SSH,
And I know that a simple regex expression to catch this is:
UserType=(?<app>.*?),
But, at the same time, SSH can appear with another "prefix":
ACCESS TYPE:SSH;
that can be captured with:
ACCESS\sTYPE:(?<app>.*?);
Now, because the logical field is the same (SSH protocol) and I want to map it in every case to the group name "app", is there a way to OR the previous prefixes together and use the same group name?
The desired final result is something like:
(UserType=) OR (ACCESS TYPE:) <field_value_here>
You can use
(?:UserType=|ACCESS\sTYPE:)(?<app>[^,;]+)
See the regex demo. Details:
(?:UserType=|ACCESS\sTYPE:) - either UserType= or ACCESS + whitespace + TYPE:
(?<app>[^,;]+) - Group "app": one or more chars other than , and ;.
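As a quick check of the alternation, here is a small Python sketch. Note that Python's re module writes named groups as (?P<app>...) rather than (?<app>...); the helper name extract_app and the third sample line are made up for illustration.

```python
import re

# Same idea as the pattern above; Python spells the named group (?P<app>...).
APP = re.compile(r"(?:UserType=|ACCESS\sTYPE:)(?P<app>[^,;]+)")


def extract_app(line):
    """Return the value of the 'app' group, or None if neither prefix is present."""
    match = APP.search(line)
    return match.group("app") if match else None


print(extract_app("UserType=SSH,"))     # SSH
print(extract_app("ACCESS TYPE:SSH;"))  # SSH
print(extract_app("no prefix here"))    # None
```

Both prefixes feed the same named group, so downstream field mapping only has to look at "app".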
What I'm trying to achieve: I want to match user entered sentence with my templates and to see which template matches better (as many groups out of all in template as possible).
Regex which I'm building to solve example:
^(\bMyCompany1\b)?(?:.+)?\s(\bestablishes\b)?(?:.+)?\s(\bAnotherCompany\b)?(?:.+)?$
Example sentences:
'MyCompany1 establishes AnotherCompany' - matches all 3 groups. This is OK.
'MyCompany1 establ AnotherCompany' - matches the first and last groups and ignores the typo in the middle. This is also OK.
'MyCompany1 establishes AnotherCompany ' - space at the end. Groups 2 and 3 are not identified. I don't understand why.
'MyCompany1 establishes AnotherCompany' - additional spaces after the word 'establishes'. For some reason the 2nd group is no longer detected.
This regex is just an example of one template. I will have one regex (built dynamically) per template, like 'User1 sent a request to User2' or 'Company1 borrowed to Company2 $111'. My idea is to define each part of the template and see how many parts were matched. E.g. in my example:
- I expect some company name from the list (MyCompany or MyCompany1), or a non-capturing group to ignore the rest (maybe the user made a typo or is still typing and hasn't finished)
- I expect the groups to appear in the same order
Can you please explain what I'm doing wrong in my regex? Is regex the right tool for this at all?
This covers all your test cases. It is based on three lookaheads, each containing an optional non-capturing group that wraps a capturing group for one of the keywords you're looking for.
^(?=(?:.*(\bMyCompany1\b))?)(?=(?:.*?(\bestablishes\b))?)(?=(?:.*(\bAnotherCompany\b))?).*$
You'll get regex explanation at the link below:
Demo
Or, if the order matters:
^(?:.*(\bMyCompany1\b))?(?:.*?(\bestablishes\b))?(?:.*(\bAnotherCompany\b))?.*$
Demo
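To turn the order-sensitive pattern into a "how well does this template match" number, one option is to count the capturing groups that took part in the match. This is a minimal Python sketch under that assumption; the function name template_score is hypothetical.

```python
import re

# Order-sensitive pattern from the answer above.
ORDERED = re.compile(
    r"^(?:.*(\bMyCompany1\b))?(?:.*?(\bestablishes\b))?(?:.*(\bAnotherCompany\b))?.*$"
)


def template_score(sentence):
    """Count how many of the template's keyword groups were found, in order."""
    match = ORDERED.match(sentence)
    if match is None:
        return 0
    # Groups that did not participate in the match are None.
    return sum(group is not None for group in match.groups())


for sentence in (
    "MyCompany1 establishes AnotherCompany",
    "MyCompany1 establ AnotherCompany",
    "MyCompany1 establishes AnotherCompany ",
    "MyCompany1 establishes    AnotherCompany",
):
    print(repr(sentence), template_score(sentence))
```

With one such regex per template, the template with the highest score would be the best candidate for the sentence typed so far.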
Could you please try the regex below:
^(\bMyCompany1\b)?\s+(\bestablishes\b)?\s+(\bAnotherCompany\b)?(?:.+)?$
Hope it helps.
I have Windows logs being aggregated to a syslog server, which messes with the format a little, and I'm trying to write a regular expression (PCRE) to reformat them a bit so I can extract some key/value pairs.
I've had a go at the regular expression myself, but I'm stuck on the fact that each "Message" section has several "Headers" which have defined key/value pairs underneath them.
An example would be:
An attempt was made to access an object. Subject: Security ID: NT AUTHORITY\SYSTEM Account Name: NAME$ Account Domain: DOMAIN Logon ID: 0x3e7 Object: Object Server: Security Object Type: File Object Name: Z:\PATH\PATH\PATH\file.log Handle ID: 0x9b0 Process Information: Process ID: 0xa84 Process Name: C:\Program Files\PROGRAM\EXECUTABLE.exe Access Request Information: Accesses: ReadData (or ListDirectory) Access Mask: 0x1
The headers would be Subject, Object and Process Information.
Where I seem to be stuck is that the only delimiter here is \s, whether after a header or after a pair.
This has got me close:
\s([^:\s]+)\:[\s]([^\s]*)
but it only captures a single word of a multi-word header or key.
With \s being the only delimiter, will this be possible?
If you only want those header names you might use an alternation and list the words between word boundaries \b.
Note that you don't have to escape the : and a single \s could also be written without the square brackets.
\b(Process Information|\S+)\b:\s(\S*)
Explanation
\b Word boundary
( Capturing group 1
Process Information|\S+ Match any of the listed
) Close capturing group
\b:\s Match word boundary, : and whitespace char
(\S*) Capturing group 2 matching 0+ times a non whitespace char
See a regex demo
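To see what this pattern returns in practice, here is a short Python sketch run over a fragment of the example message. Only "Process Information" is listed in the alternation here, as in the pattern above.

```python
import re

# Pattern from the answer above: a listed multi-word name, or any single word.
PAIRS = re.compile(r"\b(Process Information|\S+)\b:\s(\S*)")

# Fragment taken from the example message in the question.
fragment = "Process Information: Process ID: 0xa84"

pairs = PAIRS.findall(fragment)
print(pairs)  # [('Process Information', 'Process'), ('ID', '0xa84')]
```

The listed header is kept whole, but since a header is followed directly by its first key, its captured "value" is that key's first word, and the unlisted multi-word key "Process ID" is reduced to "ID". Adding further multi-word names to the alternation would keep those whole as well.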
Below is my Content:
Subject:
Security ID: S-1-5-21-3368353891-1012177287-890106238-22451
Account Name: ChamaraKer
Account Domain: JIC
Logon ID: 0x1fffb
Object:
Object Server: Security
Object Type: File
Object Name: D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
Handle ID: 0x11dc
I need to match the line containing Object Name using a regular expression.
The following is what I have tried:
^.*\b(Object|Name)\b.*$
The above regex also matches Account Name: ChamaraKer, but my requirement is to match only the line containing the words Object Name. How can I do this? It would be great if anyone could help me with this problem.
Your regex actually matches lines that contain Object OR Name.
Change it to
^.*\bObject Name\b.*$
Response to comment:
^.*\bObject Name:(.*)$
Group 1 will contain everything matched by (.*).
How you read it depends on the regex engine, for example \1 (Notepad++) or match.Groups[1].Value (C#).
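For comparison, a minimal Python sketch of the same capture, using a shortened copy of the content from the question. It assumes the multiline flag so that ^ and $ match at every line, and strips the whitespace that follows the colon.

```python
import re

# re.MULTILINE lets ^ and $ match at the start and end of every line.
OBJECT_NAME = re.compile(r"^.*\bObject Name:(.*)$", re.MULTILINE)

# Shortened copy of the content from the question.
content = r"""Subject:
Account Name: ChamaraKer
Object:
Object Type: File
Object Name: D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
Handle ID: 0x11dc"""

match = OBJECT_NAME.search(content)
# Group 1 is everything after "Object Name:".
object_name = match.group(1).strip() if match else None
print(object_name)
```

The "Account Name" line is skipped because the pattern now requires the two words together.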
World's most convoluted title, I know; an example should explain it better. I have a large txt file in the format below, though the details and number of lines will change every time:
Username: john_joe Owner: John Joe
Account:
CLI:
Default:
LGICMD:
Flags:
Primary days:
Secondary days:
No access restrictions
Expiration:
Pwdlifetime:
Last Login:
Maxjobs:
Maxacctjobs:
Maxdetach:
Prclm:
Prio:
Queprio:
CPU:
Authorized Privileges:
BYPASS
Default Privileges:
SYSPRV
This sequence is repeated a couple of thousand times for different users. I need to find every user (ideally the entire first line of the above) that has SYSPRV under "Default Privileges".
I know I could write an application to do this; I was just hoping there might be a nice regex I could use.
Cheers
^Username:\s*(\S+)((?!^Username).)*Default Privileges:\s+SYSPRV
with the option to make ^ match start of line, and to make dot match newlines, will isolate those records and capture the username in backreference no. 1. Tell me which language you're using, and I'll provide a code sample.
Explanation:
^Username:\s*: match "Username:" at the start of a line, followed by any whitespace.
(\S+): match one or more non-whitespace characters and capture them into backreference no. 1. This will be the username.
((?!^Username).)*: match any character as long as it does not begin a new line starting with "Username". This ensures that we won't accidentally cross over into the next record.
Default Privileges:\s+SYSPRV: match the required text.
So in Python, for example, you would use:
import re
# The repeated group is non-capturing here so that findall returns only the usernames
result = re.findall(r"(?sm)^Username:\s*(\S+)(?:(?!^Username).)*Default Privileges:\s+SYSPRV", subject)
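To check the pattern against data, here is a small self-contained sketch with two made-up records, shortened to a few fields. The record contents and the function name users_with_default_sysprv are assumptions for illustration; the repeated group is written as non-capturing so that findall returns plain strings.

```python
import re

# (?s) lets . match newlines, (?m) lets ^ match at every line start.
RECORD = re.compile(
    r"(?sm)^Username:\s*(\S+)(?:(?!^Username).)*Default Privileges:\s+SYSPRV"
)


def users_with_default_sysprv(text):
    """Return usernames whose Default Privileges section starts with SYSPRV."""
    return RECORD.findall(text)


# Two hypothetical records; only the first has SYSPRV as a default privilege.
SAMPLE = (
    "Username: john_joe Owner: John Joe\n"
    "Account:\n"
    "Authorized Privileges:\n"
    "  BYPASS\n"
    "Default Privileges:\n"
    "  SYSPRV\n"
    "Username: jane_roe Owner: Jane Roe\n"
    "Account:\n"
    "Authorized Privileges:\n"
    "  SYSPRV\n"
    "Default Privileges:\n"
    "  NETMBX\n"
)

print(users_with_default_sysprv(SAMPLE))  # ['john_joe']
```

The second user is not reported even though SYSPRV appears in that record, because there it is listed under Authorized Privileges only.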