Regexp on nmap services file - regex

I need to filter with sed only the ports from /usr/share/nmap/nmap-services
tcpmux 1/tcp 0.001995 # TCP Port Service Multiplexer [rfc-1078]
compressnet 2/tcp 0.000013 # Management Utility
compressnet 3/tcp 0.001242 # Compression Process
unknown 4/tcp 0.000477
unknown 6/tcp 0.000502
echo 7/tcp 0.004855
unknown 8/tcp 0.000013
discard 9/tcp 0.003764 # sink null
unknown 10/tcp 0.000063
systat 11/tcp 0.000075 # Active Users
I've tryed something like (!?([0-9]+/tcp))
But it wont work: why?
Thank you

Try doing this :
grep -oP '\d+(?=/(udp|tcp))' /usr/share/nmap/nmap-services
or with perl :
perl -lne 'print $& if m!\d+(?=/(udp|tcp))!' /usr/share/nmap/nmap-services
I use a positive look ahead advanced regex, see http://www.perlmonks.org/?node_id=518444
or with awk without advanced regex :
awk '{gsub("/.*", ""); print $2}' /usr/share/nmap/nmap-services
or
awk -F'[ /\t]' '{print $2}' /usr/share/nmap/nmap-services

Here's an example using AWK
cat /usr/share/nmap/nmap-services | awk '{print $2}' | awk -F\/ '{print $1}'

The simplest is so:
cut -s -d\ -f2 test
You can also do it so:
sed '/[^ ]* \([^ ]*\).*/ s::\1:; /^$/d' FILE
cut variant prints empty lines for non-matching.

Related

Using grep regex to select to first hyphen

echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | grep -oE "([^\/]+$)"
This prints just the filename, without the directory structure, but I cannot manage to print just mainbinary from that string. Suggestions?
And a sed alternative to PS.'s great grep -oP
echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |sed -r 's#^.*/([^-]+).*#\1#'
mainbinary
echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |grep -oP '.*/\K[^-]+'
mainbinary
This will scan till last / and ignore everything to its left and keep moving until - (excluding)
With any awk in any shell on any UNIX machine:
$ echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | awk -F'[/-]' '{print $3}'
mainbinary

Grepping Port 22

I'm trying to grep output from an nmap scan by just parsing the greppable file output:
cat $file | grep "22/open" | awk '{print $2}' > $results/service_ssh
This works, however the fail is that any port with '22' in the last position is also added to the file, which is wrong (Ex. 8222/open, 8322/open, 1022/open, ...).
Should I be using something other than grep to do this? I'm not the strongest at regex expressions yet, so any help would be appreciated.
Use a word boundary (\b):
grep '\b22/open' "$file" | awk '{print $2}' > "$results/service_ssh"
There's no need to use cat. And quoting variables is a good habit.
try this:
grep "^22/open" $file | awk '{print $2}' > $results/service_ssh
^ matches the begin of the line.
or, if the line doesn't start with "22/open", you can use \b to mark the beginning of the word (as suggested by #Biffen).
grep "\b22/open" $file | awk '{print $2}' > $results/service_ssh

How to use sed to identify a string in brackets?

I want to find the string in that is placed with in the brackets. How do I use sed to pull the string?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the exact result
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting an output
cfq
It can be easier with grep, if it happens to be changing the position in which the text in between brackets is located:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This is look-behind: whenever you find a string [, start fetching all the characters up to a ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F[][] '{print $2}' file
cfq
It is setting the field separators as [ and ]. And from that, prints the second one.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it is looking from the block of text in between [] and prints it back.
I found one possible solution-
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1
Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq
Another way by gnu grep :
grep -Po "\[\K[^]]*" file
with pure shell:
while read line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq
perl -lne 'print $1 if(/\[([^\]]*)\]/)'
Tested here

awk regex can't match ip addresses when trying to find repeating digits

I can't get the following to match any IP addresses
awk '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/{print $0}' maillog
or this one...
awk '/[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}/' maillog
but this works...
awk '/127.0.0.1/{print $0}' maillog
and so does this...
awk '/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]/{print $0}' maillog
What am I doing wrong in the first two?
To use interval {1,3} with gnu awk you my need to enable it with --re-interval like this:
awk --re-interval '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/{print $0}' maillog
They are just fine.
The following is working for me.
$ echo "2.168.1.1" | awk '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/{print $0}'
2.168.1.1
$ echo "2.1.1.1" | awk '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/{print $0}'
2.1.1.1
$ echo "22.1.1.1" | awk '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/{print $0}'
22.1.1.1
I would investigate your maillog and make sure that everything there is in plaintext.

sed or awk to capture part of url

I am not very experienced with regular expressions and sed/awk scripting.
I have urls that are similar to the following torrent url:
http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
I would like to have sed or awk script extract the text after the title i.e
from the example above just get:
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
A simple approach with awk: use the = as the field separator:
awk -F"=" '{print $2}'
Thus:
echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | awk -F"=" '{print $2}'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
Just remove everything before the title=: sed 's/.*title=//'
$ echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | sed 's/.*title=//'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
Let's say:
s='http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent'
Pure BASH solution:
echo "${s/*title=}"
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
OR using grep -P:
echo "$s"|grep -oP 'title=\K.*'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
By using sed (no need to mention title in the regexp in your example) :
sed 's/.*=//'
An another solution exists with cut, another standard unix tool :
cut -d= -f2