I have a log file that records connection drops of computers on a LAN. I want to extract the name of each computer from every line of the log file, and for that I am using this regex: (?<=Name:)\w+|(-PC)
The target text:
`[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: | (11/23 10:54:09 - 11/23 10:54:44) | Average limit (300) exceeded while pinging www.google.com [74.125.224.147] 8x
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: | (11/23 17:02:00) | Client is connected to agent.`
The problem is that some computer names contain -PC and some don't. My expression matches computer names without -PC, but if a name contains -PC, it treats that part as a separate match, which I don't want. In short, it gives me 3 matches, but I want only 2. I am a beginner in regex, so any help is appreciated.
You may use
(?<=Name:)\w+(?:-PC)?
Details
(?<=Name:) - a positive lookbehind that matches a position immediately preceded by Name:
\w+ - one or more word characters
(?:-PC)? - an optional non-capturing group that matches 1 or 0 occurrences of the -PC substring.
Consider adding a word boundary if you need to match PC as a whole word:
(?<=Name:)\w+(?:-PC\b)?
See the regex demo.
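As a rough check of the difference between the two patterns, here is a minimal Python sketch. The log lines are shortened copies of the ones in the question, and the variable names are only illustrative. Python's `re` module supports the fixed-width lookbehind used here.

```python
import re

# Shortened versions of the two log lines from the question.
log = (
    "[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: |\n"
    "[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: |\n"
)

# Pattern from the question: the alternation makes -PC a separate match.
original = re.compile(r"(?<=Name:)\w+|(-PC)")
# Pattern from the answer: -PC is an optional tail of the same match.
fixed = re.compile(r"(?<=Name:)\w+(?:-PC)?")

original_matches = [m.group(0) for m in original.finditer(log)]
fixed_matches = [m.group(0) for m in fixed.finditer(log)]

print(original_matches)  # three matches, with -PC on its own
print(fixed_matches)     # two matches, one per computer name
```

With the alternation, the engine first consumes the word characters after Name: and then finds -PC as a match of its own; making -PC an optional suffix keeps it inside the first match.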
I am trying to split a string using a regex. I need to do this in NiFi so that the string is split into groups. Could anyone help me split the string below?
I have a string like this:
"abc","-9223371901096288826","/home/test/20170614","abc.com","Hello,Test","7462200","4622012","1296614","1029293","893529","a:ce:o:5:l:p:MMM dd HH:mm:ss","Logs","UTF8","<111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -"
I want to split on commas, but I need to ignore commas inside quotes. I want a result like this:
group 1 - abc
group 2 - -9223371901096288826
group 3 - /home/test/20170614
group 4 - abc.com
group 5 - Hello,Test
group 6 - 7462200
group 7 - 4622012
group 8 - 1296614
group 9 - 1029293
group 10 - 893529
group 11 - a:ce:o:5:l:p:MMM dd HH:mm:ss
group 12 - Logs
group 13 - UTF8
group 14 - <111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -
I tried many regexes to split the string but could not get the right result.
I tried the regex ,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$), which I found online.
That regex works well with the split() function in Java, but I don't want to use Java.
I also tried the regex (?<=\")([^,]*)(?=\"), which splits the string into groups on commas, but it also splits on the commas inside double quotes.
Could anyone help me? Thanks in advance.
You can do this without capturing groups, as follows.
1. Use UpdateAttribute to store the whole string in an attribute named "InputString":
"abc","-9223371901096288826","/home/test/20170614","abc.com","Hello,Test","7462200","4622012","1296614","1029293","893529","a:ce:o:5:l:p:MMM dd HH:mm:ss","Logs","UTF8","<111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -"
2. After that UpdateAttribute, use another UpdateAttribute to extract the values like this:
group1:${InputString:getDelimitedField(1)}
group2:${InputString:getDelimitedField(2)}
group3:${InputString:getDelimitedField(3)}
group4:${InputString:getDelimitedField(4)}
group5:${InputString:getDelimitedField(5)}
group6:${InputString:getDelimitedField(6)}
group7:${InputString:getDelimitedField(7)}
group8:${InputString:getDelimitedField(8)}
group9:${InputString:getDelimitedField(9)}
group10:${InputString:getDelimitedField(10)}
group11:${InputString:getDelimitedField(11)}
group12:${InputString:getDelimitedField(12)}
group13:${InputString:getDelimitedField(13)}
The getDelimitedField function is the easiest way to extract those values. See the reference below:
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#getdelimitedfield
Let me know if you face any issues with it.
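To make the quote-aware splitting concrete outside NiFi, here is a small Python sketch. It is not NiFi code and does not call getDelimitedField. It only illustrates the same idea in two ways: with the standard `csv` module, and with the lookahead regex quoted in the question. The sample line is a shortened, made-up string in the same style as the one in the question.

```python
import csv
import re

# Hypothetical short sample in the same style as the string in the question.
line = '"abc","Hello,Test","say ""hi"", ok"'

# 1. A CSV parser treats commas inside quotes as data and
#    turns a doubled quote ("") back into a single quote.
csv_fields = next(csv.reader([line]))

# 2. The regex from the question: split on a comma only when an even
#    number of quotes follows it, i.e. the comma is outside any quoted field.
splitter = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')
regex_fields = splitter.split(line)

print(csv_fields)
print(regex_fields)
```

Note the difference between the two results: the CSV parser strips the enclosing quotes and unescapes the doubled quotes, while the regex split leaves every field exactly as it appears in the input, quotes included.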
I want to grab the regular expressions out of the snort rules.
Here's an example of the text that I've saved as a csv - https://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-exploit.rules
There are multiple rules in the file, for example:
#by Akash Mahajan
#
alert udp $EXTERNAL_NET any -> $HOME_NET 14000 (msg:"ET EXPLOIT Borland VisiBroker Smart Agent Heap Overflow"; content:"|44 53 52 65 71 75 65 73 74|"; pcre:"/[0-9a-zA-Z]{50}/R"; reference:bugtraq,28084; reference:url,aluigi.altervista.org/adv/visibroken-adv.txt; reference:url,doc.emergingthreats.net/bin/view/Main/2007937; classtype:successful-dos; sid:2007937; rev:4;)
I want only the text that appears after "pcre" in each of them, extracted and printed to a new file, without the quotes.
pcre:"/[0-9a-zA-Z]{50}/R";
So, from the line above, I want to end up with the text below:
/[0-9a-zA-Z]{50}/R
I want this for every place "pcre" appears in the whole file.
I've been messing around with grep, awk, and sed. I just can't figure it out. I'm fairly new to this.
Could anyone give me some tips?
Thanks
With GNU sed:
$ sed -n -r 's/.*\<pcre:"([^"]+).*/\1/p' file
/[0-9a-zA-Z]{50}/R
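For comparison, the same capture-group idea can be sketched in Python. This is only an illustration under assumptions: the function name `extract_pcres` is made up, the rule line is a shortened copy of the one in the question, and, like the sed command, it assumes the pcre value contains no literal double quote.

```python
import re

# Capture everything between pcre:" and the next double quote.
PCRE_OPTION = re.compile(r'\bpcre:"([^"]+)"')

def extract_pcres(lines):
    """Return every pcre value found in the given rule lines."""
    found = []
    for line in lines:
        # findall returns the captured group for each match on the line.
        found.extend(PCRE_OPTION.findall(line))
    return found

rules = [
    "#by Akash Mahajan",
    'alert udp $EXTERNAL_NET any -> $HOME_NET 14000 (msg:"ET EXPLOIT Borland '
    'VisiBroker Smart Agent Heap Overflow"; content:"|44 53 52 65 71 75 65 73 74|"; '
    'pcre:"/[0-9a-zA-Z]{50}/R"; sid:2007937; rev:4;)',
]

patterns = extract_pcres(rules)
print(patterns)
```

Unlike the sed substitution, which prints at most one value per line, this collects every pcre option on a line.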
You can also do this using grep. The catch is that grep cannot print just a capturing group; it can only print the text matched by the whole pattern.
To get around this, you need to use look-ahead and look-behind.
Lookahead (?=foo)
Asserts that what immediately follows the current position in the string is foo
Lookbehind (?<=foo)
Asserts that what immediately precedes the current position in the string is foo
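┌─ print file to standard output
│                    ┌─ has pcre:" before the matching group (look-behind)
│                    │                ┌─ has "; after the matching group (look-ahead)
cat file | grep -Po '(?<=pcre:\")(.*?)(?=\";)'
                 ││              └─ what we want (matching group)
                 │└─ -o: print only the matched part
                 └─ -P: use Perl-compatible regular expressions
The group is lazy (.*?) so that the match stops at the first "; after pcre:" rather than at the last one on the line.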
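The look-behind and look-ahead approach can be tried in Python as well. The sketch below uses a made-up rule line in which a quoted option follows pcre, to show why the quantifier between the two assertions matters; the variable names are illustrative only.

```python
import re

# Hypothetical rule line where a quoted option (content) comes after pcre.
line = 'alert tcp any any -> any any (msg:"x"; pcre:"/a+/R"; content:"b"; sid:1;)'

# Greedy: .* runs to the last "; on the line, so it overshoots.
greedy = re.findall(r'(?<=pcre:").*(?=";)', line)

# Lazy: .*? stops at the first "; after pcre:".
lazy = re.findall(r'(?<=pcre:").*?(?=";)', line)

print(greedy)
print(lazy)
```

Because the assertions consume no characters, the match itself contains only the text between them, which is what makes this usable with tools that print the whole match.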