ansible string split with set facts - regex

Hi, I am trying to achieve the following.
[root#WEBSERVER]# ll /dev/disk/by-id/scsi-* | grep sdd
lrwxrwxrwx. 1 root root 9 Oct 25 15:26 /dev/disk/by-id/scsi-1234567891123455 -> ../../sdd
I want to assign "/dev/disk/by-id/scsi-1234567891123455" from the above output to a variable.
Ansible:
name: Capture output
command: ll /dev/disk/by-id/scsi-* | grep sdd
register: lsblkoutput
Now I want to query lsblkoutput and extract /dev/disk/by-id/scsi-1234567891123455 from it.
Thanks,

I have no knowledge of Ansible, but I have created a regex that matches what you want.
However, since I don't know Ansible, I don't know whether its regex engine supports this pattern:
lrwxrwxrwx[^/]*(.*(?=\s->))
The regex matches the word 'lrwxrwxrwx', followed by zero or more characters that are not a slash '/'. From the first slash on, it captures everything up to the point where the lookahead finds whitespace followed by '->'.
The value, you're looking for, is in the captured Group 1.
Hope you can use this in Ansible.
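The pattern can be checked outside Ansible first. Below is a minimal sketch in plain Python, on the assumption that Ansible's regex filters rely on Python's re module (which supports lookahead). The variable names are only illustrative, and the input is the single line from the question.

```python
import re

# The line of output shown in the question.
line = ("lrwxrwxrwx. 1 root root 9 Oct 25 15:26 "
        "/dev/disk/by-id/scsi-1234567891123455 -> ../../sdd")

# Same pattern as in the answer: skip to the first '/', then capture
# everything that is followed by whitespace and '->'.
pattern = re.compile(r"lrwxrwxrwx[^/]*(.*(?=\s->))")

match = pattern.search(line)
device_path = match.group(1) if match else None
print(device_path)
```

If the registered output can contain several lines, the same search would have to be applied per line.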

Related

Regular expression for matching a specific substring of a string

I have a log file that logs connection drops of computers in a LAN. I want to extract the name of each computer from every line of the log file, and for that I am using this: (?<=Name:)\w+|(-PC)
The target text:
`[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: | (11/23 10:54:09 - 11/23 10:54:44) | Average limit (300) exceeded while pinging www.google.com [74.125.224.147] 8x
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: | (11/23 17:02:00) | Client is connected to agent.`
The problem is that some computer names have -PC in them and some don't. The expression I have created matches computers without -PC in their names, but if a computer has -PC in its name, it treats that as a separate match, and I don't want that. In short, it gives me 3 matches, but I want only 2. That's why I need help here; I am a beginner in regex.
You may use
(?<=Name:)\w+(?:-PC)?
Details
(?<=Name:) - a place immediately preceded with Name:
\w+ - 1+ word chars
(?:-PC)? - an optional non-capturing group that matches 1 or 0 occurrences of -PC substring.
Consider using a word boundary if you need to match PC as a whole word:
(?<=Name:)\w+(?:-PC\b)?
See the regex demo.
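To see the difference between the two patterns, here is a small sketch in Python (my choice of language; the thread does not say which regex engine is used). The log text is abbreviated from the question, and the helper name is hypothetical.

```python
import re

# Abbreviated versions of the two log lines from the question.
log = (
    "[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked)\n"
    "[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C\n"
)

def all_matches(pattern, text):
    """Return the full text of every match, ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Original attempt: the alternation makes '-PC' a separate match.
original = all_matches(r"(?<=Name:)\w+|(-PC)", log)

# Suggested fix: '-PC' is an optional tail of the same match.
fixed = all_matches(r"(?<=Name:)\w+(?:-PC)?", log)

print(original)
print(fixed)
```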

How to use zgrep to display all words of a x size from a wordlist?

I want to display all the words from my wordlist that start with a w and are 9 letters long. Yesterday I learnt a bit more about how to use zgrep, so I came up with:
zgrep '\(^w\)\(^.........$\)' a.gz
But this doesn't work, and I think it's because I don't know how to do an AND between the two conditions. I found that it should be (?=expr)(?=expr), but I can't figure out how to build my command with it.
So how can I build my command using (?=expr)?
For example, if I have a wordlist like this:
Washington
Sausage
Walalalalalaaaa --> shouldn't match
Wwwwwwwww --> should match
You may use
zgrep '^w[[:alpha:]]\{8\}$' a.gz
The POSIX BRE pattern will match a string that
^w - starts with w
[[:alpha:]]\{8\} - then has eight letters
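$ - followed by the end of string marker.
Note that ^w is case-sensitive; pass grep's -i option if words starting with an uppercase W, as in the sample list, should match too.
Also, see section 9.3, Basic Regular Expressions, of the POSIX specification.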
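The same filter can be sketched in Python to check it against the sample list. This is only an illustration: Python's re has no POSIX [[:alpha:]] class, so [A-Za-z] stands in for it, and the IGNORECASE flag is my assumption because the sample words start with an uppercase W. The function name is hypothetical.

```python
import re

# 'w' followed by exactly eight letters; fullmatch() plays the role of ^ and $.
# [A-Za-z] approximates POSIX [[:alpha:]]; IGNORECASE is an assumption.
PATTERN = re.compile(r"w[A-Za-z]{8}", re.IGNORECASE)

def nine_letter_w_words(lines):
    """Return the words that start with 'w' and are exactly 9 letters long."""
    stripped = (line.strip() for line in lines)
    return [word for word in stripped if PATTERN.fullmatch(word)]

words = ["Washington", "Sausage", "Walalalalalaaaa", "Wwwwwwwww"]
matches = nine_letter_w_words(words)
print(matches)
```

For a compressed wordlist, the lines could come from gzip.open("a.gz", "rt") instead of a list.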

String split with comma and ignoring comma in double quotes using regex

I am trying to split a string using regex. I need to use regex in NiFi to split a string into groups. Could anyone help me split the string below using regex?
I have a string like this:
"abc","-9223371901096288826","/home/test/20170614","abc.com","Hello,Test","7462200","4622012","1296614","1029293","893529","a:ce:o:5:l:p:MMM dd HH:mm:ss","Logs","UTF8","<111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -"
I want to split by commas, but I need to ignore commas inside quotes. I want a result like this:
group 1 - abc
group 2 - -9223371901096288826
group 3 - /home/test/20170614
group 4 - abc.com
group 5 - Hello,Test
group 6 - 7462200
group 7 - 4622012
group 8 - 1296614
group 9 - 1029293
group 10 - 893529
group 11 - a:ce:o:5:l:p:MMM dd HH:mm:ss
group 12 - Logs
group 13 - UTF8
group 14 - <111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -
I tried many regexes to split the string but was unable to get the proper result.
I tried the regex ,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$) found from this link.
The above regex works great with Java's split() function, but I don't want to use Java.
I also tried the regex (?<=\")([^,]*)(?=\") to split the string into groups by commas, but it splits inside the double quotes as well.
Could anyone help me? Thanks in advance.
You can meet your requirement without capturing groups in the following way.
Consider your string below.
1. Use UpdateAttribute to store the whole string in an attribute named "InputString".
"abc","-9223371901096288826","/home/test/20170614","abc.com","Hello,Test","7462200","4622012","1296614","1029293","893529","a:ce:o:5:l:p:MMM dd HH:mm:ss","Logs","UTF8","<111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -"
2. After that UpdateAttribute, use another UpdateAttribute to extract the values, like below:
group1:${InputString:getDelimitedField(1)}
group2:${InputString:getDelimitedField(2)}
group3:${InputString:getDelimitedField(3)}
group4:${InputString:getDelimitedField(4)}
group5:${InputString:getDelimitedField(5)}
group6:${InputString:getDelimitedField(6)}
group7:${InputString:getDelimitedField(7)}
group8:${InputString:getDelimitedField(8)}
group9:${InputString:getDelimitedField(9)}
group10:${InputString:getDelimitedField(10)}
group11:${InputString:getDelimitedField(11)}
group12:${InputString:getDelimitedField(12)}
group13:${InputString:getDelimitedField(13)}
Using getDelimitedField is the easiest way to extract those values; see the reference below:
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#getdelimitedfield
Let me know if you face any issues with it.
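The splitting rule itself (commas inside double quotes are data, not separators) can be illustrated outside NiFi. The sketch below uses Python, which is my choice and not part of the NiFi flow; the record is a short hypothetical sample in the same format as the question's string. It compares a CSV parser with the lookahead regex quoted in the question.

```python
import csv
import re

# Short, hypothetical sample in the same format as the question's string.
record = '"abc","-9223371901096288826","Hello,Test","say ""hi"", ok"'

# csv.reader keeps commas inside quotes and reads "" as one literal quote.
fields = next(csv.reader([record]))

# The regex from the question: split on commas that are followed by an
# even number of quotes. The surrounding quotes stay in each piece.
raw_pieces = re.split(r',(?=(?:[^"]*"[^"]*")*[^"]*$)', record)

for number, value in enumerate(fields, start=1):
    print(f"group {number} - {value}")
```

The CSV parser also unescapes doubled quotes, which differs from the expected output in the question, where they stay doubled.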

Extract Text From CSV

I want to grab the regular expressions out of the snort rules.
Here's an example of the text that I've saved as a csv - https://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-exploit.rules
There are multiple rules, for example:
#by Akash Mahajan
#
alert udp $EXTERNAL_NET any -> $HOME_NET 14000 (msg:"ET EXPLOIT Borland VisiBroker Smart Agent Heap Overflow"; content:"|44 53 52 65 71 75 65 73 74|"; pcre:"/[0-9a-zA-Z]{50}/R"; reference:bugtraq,28084; reference:url,aluigi.altervista.org/adv/visibroken-adv.txt; reference:url,doc.emergingthreats.net/bin/view/Main/2007937; classtype:successful-dos; sid:2007937; rev:4;)
I want only the text that appears after "pcre" in each of them, extracted and printed to a new file, without the quotes:
pcre:"/[0-9a-zA-Z]{50}/R";
So, from this line above, I want to end up with the below text;
/[0-9a-zA-Z]{50}/R
From every place "pcre" appears in the whole file.
I've been messing around with grep, awk, and sed. I just can't figure it out. I'm fairly new to this.
Could anyone give me some tips?
Thanks
With GNU sed:
$ sed -n -r 's/.*\<pcre:"([^"]+).*/\1/p' file
/[0-9a-zA-Z]{50}/R
You can do this using grep. However, grep cannot print only a capture group; it can only print the whole matched text.
To get around this, you need to use look-ahead and look-behind.
Lookahead (?=foo)
Asserts that what immediately follows the current position in the string is foo
Lookbehind (?<=foo)
Asserts that what immediately precedes the current position in the string is foo
┌─ print file to standard output
│                    ┌─ has pcre:" before matching group (look-behind)
│                    │               ┌─ has "; after matching group (look-ahead)
cat file | grep -Po '(?<=pcre:\")(.*)(?=\";)'
                 ││              └─ what we want (matching group)
                 │└─ print only matched part (-o)
                 └─ interpret the pattern as a Perl-compatible regex (-P)
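For comparison, the same extraction can be sketched in Python with a capture group, which avoids the lookarounds. This is an illustration, not part of either answer: the function name is hypothetical, and the pattern assumes the pcre value contains no double quote.

```python
import re

# The sample rule from the question, split across lines for readability.
rule = (
    'alert udp $EXTERNAL_NET any -> $HOME_NET 14000 '
    '(msg:"ET EXPLOIT Borland VisiBroker Smart Agent Heap Overflow"; '
    'content:"|44 53 52 65 71 75 65 73 74|"; pcre:"/[0-9a-zA-Z]{50}/R"; '
    'reference:bugtraq,28084; classtype:successful-dos; sid:2007937; rev:4;)'
)

# Capture everything between pcre:" and the next double quote.
PCRE_OPTION = re.compile(r'pcre:"([^"]+)"')

def extract_pcre(text):
    """Return every pcre value found in the rules text, without quotes."""
    return PCRE_OPTION.findall(text)

pcre_values = extract_pcre(rule)
print("\n".join(pcre_values))
```

Unlike the greedy (.*) in the grep command, [^"]+ stops at the first closing quote, so a later "; on the same line cannot extend the match.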

In NotePad++ I'm trying to replace Jan\d+ with \r\nJan but it's not working the way I thought

10 Sts - $5,763Jan17 11 Lon -2 ft-1 Janet HallFeb2 9 Lon -10gd-4 F-nw7000lc
Using Notepad++ in the above phrase I wanted to start a new line with the dates Jan17 and Feb2 but when I try Jan\d+ to \r\nJan I get Jan 11 Lon -2 ft-1 Janet Hall without the 17 part of the date.
I can split the line again with Feb\d+ to \r\nFeb but again the 2 part of the date is missing in the newly created line.
You need to use a capture group.
Try Find what: Jan(\d+)
Replace: \r\nJan\1
The parentheses in (\d+) capture the digits into group 1, and \1 in the replacement inserts whatever that group captured. The same works for Feb: find Feb(\d+) and replace with \r\nFeb\1.
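The effect of the capture group can be reproduced outside Notepad++. Here is a minimal sketch in Python, which is my substitution for the editor's search and replace. Handling both months in one pass with (Jan|Feb) is my own generalization, and "\n" is used in place of "\r\n".

```python
import re

text = "10 Sts - $5,763Jan17 11 Lon -2 ft-1 Janet HallFeb2 9 Lon -10gd-4 F-nw7000lc"

# Without a capture group the digits are consumed and lost.
lossy = re.sub(r"Jan\d+", r"\nJan", text)

# With capture groups, \1 (month) and \2 (digits) are put back after the newline.
# "Janet" is untouched because no digit follows "Jan" there.
split_text = re.sub(r"(Jan|Feb)(\d+)", r"\n\1\2", text)

print(split_text)
```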