Grepping Port 22 - regex

I'm trying to grep output from an nmap scan by just parsing the greppable file output:
cat $file | grep "22/open" | awk '{print $2}' > $results/service_ssh
This works, but any port ending in '22' is also written to the file, which is wrong (e.g. 8222/open, 8322/open, 1022/open, ...).
Should I be using something other than grep to do this? I'm not very strong with regular expressions yet, so any help would be appreciated.

Use a word boundary (\b):
grep '\b22/open' "$file" | awk '{print $2}' > "$results/service_ssh"
There's no need to use cat. And quoting variables is a good habit.
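To see the difference on actual data, here is a minimal sketch that runs the \b pattern and a more portable alternative against a few made-up lines in nmap's greppable format. The addresses and the function names (sample_scan, ssh_hosts_wordboundary, ssh_hosts_portable) are invented for illustration. Note that \b in grep is a GNU extension, so other grep implementations may not support it.

```shell
# Made-up sample of nmap greppable (-oG) output; not taken from a real scan.
sample_scan() {
  printf 'Host: 10.0.0.5 ()\tPorts: 22/open/tcp//ssh///\n'
  printf 'Host: 10.0.0.6 ()\tPorts: 8222/open/tcp//unknown///\n'
  printf 'Host: 10.0.0.7 ()\tPorts: 80/open/tcp//http///, 22/open/tcp//ssh///\n'
}

# Word-boundary version from the answer above (\b is a GNU grep extension).
ssh_hosts_wordboundary() {
  grep '\b22/open' | awk '{print $2}'
}

# Portable version: "22" must sit at the start of the line or follow a non-digit.
ssh_hosts_portable() {
  grep -E '(^|[^0-9])22/open' | awk '{print $2}'
}

sample_scan | ssh_hosts_wordboundary
sample_scan | ssh_hosts_portable
```

Both functions should list 10.0.0.5 and 10.0.0.7 and skip the host that only has 8222 open.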

Try this:
grep "^22/open" "$file" | awk '{print $2}' > "$results/service_ssh"
^ matches the beginning of the line.
Or, if the line doesn't start with "22/open", you can use \b to mark the beginning of the word (as suggested by @Biffen):
grep "\b22/open" "$file" | awk '{print $2}' > "$results/service_ssh"

Related

Using grep regex to select to first hyphen

echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | grep -oE "([^\/]+$)"
This prints just the filename, without the directory structure, but I cannot manage to print just mainbinary from that string. Suggestions?
And a sed alternative to PS.'s great grep -oP
echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |sed -r 's#^.*/([^-]+).*#\1#'
mainbinary
echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |grep -oP '.*/\K[^-]+'
mainbinary
This scans up to the last /, discards everything to its left (\K), and then matches up to, but not including, the first -.
With any awk in any shell on any UNIX machine:
$ echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | awk -F'[/-]' '{print $3}'
mainbinary
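The awk version above relies on the name being the third field, i.e. exactly two directory levels with no hyphen in them. If that cannot be assumed, a minimal sketch using only POSIX shell parameter expansion (the variable names path, file and name are just for illustration) could look like this:

```shell
path='Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb'

# Strip the longest prefix ending in "/" to leave only the file name.
file=${path##*/}
# Strip the longest suffix starting with "-" to keep the part before the first hyphen.
name=${file%%-*}

echo "$name"
```

This prints mainbinary without starting any external process.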

How to use sed to identify a string in brackets?

I want to find the string that is enclosed in the brackets. How do I use sed to extract it?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the expected result:
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting this output:
cfq
It can be easier with grep, which also works when the position of the bracketed text changes:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This uses a look-behind: after each [, it matches all the characters up to the next ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F'[][]' '{print $2}' file
cfq
This sets the field separators to [ and ], and then prints the second field.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it looks for the block of text in between [] and prints it back.
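One caveat with a plain s/// is that lines without brackets are printed unchanged. A minimal sketch that prints only the bracketed text, using sed -n with the p flag (the function name first_bracketed is made up for this example):

```shell
# Print only the first bracketed string of each line;
# lines without [...] produce no output at all.
first_bracketed() {
  sed -n 's/^[^[]*\[\([^]]*\)\].*/\1/p'
}

printf '%s\n' 'noop anticipatory deadline [cfq]' 'no brackets here' | first_bracketed
```

For the two sample lines this prints just cfq.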
I found one possible solution:
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1
Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq
Another way, with GNU grep:
grep -Po "\[\K[^]]*" file
With pure shell (bash):
while read -r line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
Another awk:
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq
perl -lne 'print $1 if(/\[([^\]]*)\]/)'

sed or awk to capture part of url

I am not very experienced with regular expressions and sed/awk scripting.
I have urls that are similar to the following torrent url:
http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
I would like a sed or awk script to extract the text after title=, i.e. from the example above just get:
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
A simple approach with awk: use the = as the field separator:
awk -F"=" '{print $2}'
Thus:
echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | awk -F"=" '{print $2}'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
Just remove everything before the title=: sed 's/.*title=//'
$ echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | sed 's/.*title=//'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
Let's say:
s='http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent'
Pure BASH solution:
echo "${s/*title=}"
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
OR using grep -P:
echo "$s"|grep -oP 'title=\K.*'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
Using sed (no need to mention title in the regexp for your example):
sed 's/.*=//'
Another solution uses cut, another standard Unix tool:
cut -d= -f2
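The approaches that split on = assume that title is the only parameter and that the title itself contains no =. If that is not guaranteed, a minimal sketch that anchors on the parameter name and stops at the next & might look like this (the function name url_title is invented for illustration):

```shell
# Print the value of the title= query parameter, stopping at the next "&" if any.
url_title() {
  sed -n 's/.*[?&]title=\([^&]*\).*/\1/p'
}

echo 'http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent' | url_title
```

For the URL from the question this prints the same [kickass.to]against.the.ropes.by.carly.fall.epub.torrent as the other answers.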

Regexp on nmap services file

I need to filter with sed only the ports from /usr/share/nmap/nmap-services
tcpmux 1/tcp 0.001995 # TCP Port Service Multiplexer [rfc-1078]
compressnet 2/tcp 0.000013 # Management Utility
compressnet 3/tcp 0.001242 # Compression Process
unknown 4/tcp 0.000477
unknown 6/tcp 0.000502
echo 7/tcp 0.004855
unknown 8/tcp 0.000013
discard 9/tcp 0.003764 # sink null
unknown 10/tcp 0.000063
systat 11/tcp 0.000075 # Active Users
I've tried something like (!?([0-9]+/tcp))
But it won't work. Why?
Thank you
Try this:
grep -oP '\d+(?=/(udp|tcp))' /usr/share/nmap/nmap-services
or with perl:
perl -lne 'print $& if m!\d+(?=/(udp|tcp))!' /usr/share/nmap/nmap-services
This uses a positive lookahead, an advanced (Perl-compatible) regex feature; see http://www.perlmonks.org/?node_id=518444
or with awk, without advanced regex:
awk '{gsub("/.*", ""); print $2}' /usr/share/nmap/nmap-services
or
awk -F'[ /\t]' '{print $2}' /usr/share/nmap/nmap-services
Here's an example using AWK
cat /usr/share/nmap/nmap-services | awk '{print $2}' | awk -F\/ '{print $1}'
The simplest way:
cut -s -d' ' -f2 test
You can also do it like this:
sed '/[^ ]* \([^ ]*\).*/ s::\1:; /^$/d' FILE
The cut variant prints empty lines for non-matching lines.
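If only the bare port numbers are wanted and comment lines should be skipped, a minimal awk sketch along these lines may help. The function name service_ports and the sample comment line are made up; the other sample lines are from the question, and only tcp and udp entries are assumed to matter.

```shell
# Print the port number from second fields shaped like "<digits>/tcp" or
# "<digits>/udp", skipping comment lines that start with "#".
service_ports() {
  awk '!/^#/ && $2 ~ /^[0-9]+\/(tcp|udp)$/ { split($2, parts, "/"); print parts[1] }'
}

printf '%s\n' \
  '# made-up comment line 99/tcp' \
  'tcpmux 1/tcp 0.001995 # TCP Port Service Multiplexer [rfc-1078]' \
  'echo 7/tcp 0.004855' | service_ports
```

On the sample input this prints 1 and 7, one per line. On the real file it would be called as service_ports < /usr/share/nmap/nmap-services.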

using sed to get only line number of "grep -in"

Which regexp should I use to only get line number from grep -in output?
The usual output is something like this:
241113:keyword
I need to get only "241113" from sed's output.
I suggest cut:
grep -in keyword ... | cut -d: -f1
If you insist on sed:
grep -in keyword ... | sed 's/:.*//'
You don't need to use sed. Cut is enough. Just pipe grep's output to
cut -d ':' -f 1
As an example:
grep -n blabla file.txt | cut -d ':' -f 1
Personally, I like awk
grep -in 'search' file | awk --field-separator : '{print $1}'
As said in other answers, cut is the right tool; but if you really want to use a swiss-army knife, you can also use awk:
grep -in keyword ... | awk -F: '{print $1}'
or using grep again:
grep -in keyword ... | grep -oE '^[0-9]+'
Just in case someone is wondering if all this could be done without grep, i.e. with sed alone ...
echo '
a
b
keyword
c
keyWord
x
y
keyword
Keyword
z
' |
sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;}'
#sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;q;}' # only line number of first match
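In the same spirit, awk alone can do the case-insensitive match without spelling out every letter pair. This is a minimal sketch assuming any POSIX awk; the function name match_line_numbers is invented for illustration.

```shell
# Print the line number of every line containing "keyword", ignoring case.
match_line_numbers() {
  awk 'tolower($0) ~ /keyword/ { print NR }'
}

printf '%s\n' a keyword b KeyWord c | match_line_numbers
```

For this five-line input it prints 2 and 4. Adding an exit after the print would stop at the first match, like the q in the sed variant.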