using sed to get only line number of "grep -in" - regex

Which regexp should I use to only get line number from grep -in output?
The usual output is something like this:
241113:keyword
I need to get only "241113" from sed's output.

I suggest cut
grep -in keyword ... | cut -d: -f1
If you insist on sed:
grep -in keyword ... | sed 's/:.*$//g'

You don't need to use sed. Cut is enough. Just pipe grep's output to
cut -d ':' -f 1
As an example:
grep -n blabla file.txt | cut -d ':' -f 1

Personally, I like awk
grep -in 'search' file | awk --field-separator : '{print $1}'

As said in other answers, cut is the right tool; but if you really want to use a swiss-army knife, you can also use awk:
grep -in keyword ... | awk -F: '{print $1}'
or using grep again:
grep -in keyword ... | grep -oE '^[0-9]+'
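One caveat applies to all of the pipelines above: when grep is given more than one file, it prefixes every match with the file name, so the first colon-separated field is no longer the line number. Below is a minimal sketch that guards against this, assuming a grep that supports -h (GNU and BSD grep both do). The helper name line_numbers and the sample file contents are made up for illustration.

```shell
# line_numbers PATTERN FILE...
# Print only the line numbers of case-insensitive matches.
# -h suppresses the "file:" prefix that grep adds when given several files,
# so the line number is always the first colon-separated field.
line_numbers() {
    pattern=$1
    shift
    grep -inh -- "$pattern" "$@" | cut -d: -f1
}

# Demo on a throwaway file (contents are invented for this example).
demo=$(mktemp)
printf '%s\n' 'a' 'keyword' 'b' 'KeyWord: x' > "$demo"
line_numbers keyword "$demo"
rm -f "$demo"
```

The last demo line contains a colon of its own, which is harmless here because cut only keeps the text before the first colon.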

Just in case someone is wondering if all this could be done without grep, i.e. with sed alone ...
echo '
a
b
keyword
c
keyWord
x
y
keyword
Keyword
z
' |
sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;}'
#sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;q;}' # only line number of first match
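For comparison, the same idea can be sketched in awk, which avoids spelling out every letter in both cases. This assumes a POSIX awk (tolower, index and NR are standard); the helper name match_lines is hypothetical.

```shell
# match_lines WORD
# Read stdin and print the line number of every line containing WORD,
# ignoring case. WORD is treated as a plain substring, not as a regex.
match_lines() {
    awk -v word="$1" 'index(tolower($0), tolower(word)) { print NR }'
}

printf '%s\n' a b keyword c keyWord x | match_lines keyword
```

To stop at the first match, as in the commented-out sed line, the action could be changed to `{ print NR; exit }`.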

Related

Grepping Port 22

I'm trying to grep output from an nmap scan by just parsing the greppable file output:
cat $file | grep "22/open" | awk '{print $2}' > $results/service_ssh
This works, but the problem is that any port ending in '22' is also added to the file, which is wrong (e.g. 8222/open, 8322/open, 1022/open, ...).
Should I be using something other than grep to do this? I'm not the strongest at regular expressions yet, so any help would be appreciated.
Use a word boundary (\b):
grep '\b22/open' "$file" | awk '{print $2}' > "$results/service_ssh"
There's no need to use cat. And quoting variables is a good habit.
Try this:
grep "^22/open" $file | awk '{print $2}' > $results/service_ssh
^ matches the beginning of the line.
Or, if the line doesn't start with "22/open", you can use \b to mark the beginning of the word (as suggested by @Biffen):
grep "\b22/open" $file | awk '{print $2}' > $results/service_ssh
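Note that \b is not part of POSIX regular expressions, so it may not be available in every grep. A more portable sketch, assuming only grep -E, requires that "22/open" is preceded by either the start of the line or a non-digit. The helper name ssh_hosts and the two sample lines (a simplified imitation of nmap's greppable format) are invented for illustration.

```shell
# ssh_hosts: read greppable nmap output on stdin and print the second
# field (the address) of every host line that lists port 22 as open.
# (^|[^0-9]) rejects 8222/open, 1022/open and so on, because there the
# text "22/open" is preceded by a digit.
ssh_hosts() {
    grep -E '(^|[^0-9])22/open' | awk '{print $2}'
}

printf '%s\n' \
    'Host: 10.0.0.1 () Ports: 22/open/tcp//ssh///, 80/open/tcp//http///' \
    'Host: 10.0.0.2 () Ports: 8222/open/tcp//unknown///' | ssh_hosts
```

Only 10.0.0.1 is printed, since the second host has port 8222 open but not port 22.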

Grep in bash with regex

I am getting the following output from a bash script:
INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist
and I would like to get only the path (MajorDomo/MajorDomo-Info.plist) using grep. In other words, everything after the equals sign. Any ideas of how to do this?
This job suits awk better:
s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist'
awk -F' *= *' '{print $2}' <<< "$s"
MajorDomo/MajorDomo-Info.plist
If you really want grep then use grep -P:
grep -oP ' = \K.+' <<< "$s"
MajorDomo/MajorDomo-Info.plist
Not exactly what you were asking, but
echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.*= \(.*\)$/\1/'
will do what you want.
You could use cut as well:
your_script | cut -d = -f 2-
(where your_script does something equivalent to echo INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist)
If you need to trim the space at the beginning:
your_script | cut -d = -f 2- | cut -d ' ' -f 2-
If you have multiple spaces at the beginning and you want to trim them all, you'll have to fall back to sed: your_script | cut -d = -f 2- | sed 's/^ *//' (or, simpler, your_script | sed 's/^[^=]*= *//')
Assuming your script outputs a single line, there is a shell only solution:
line="$(your_script)"
echo "${line#*= }"
Bash
IFS=' =' read -r _ x <<<"INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist"
printf "%s\n" "$x"
MajorDomo/MajorDomo-Info.plist
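The shell-only approach can be extended to cope with any amount of whitespace after the equals sign. The following is a sketch using only POSIX parameter expansion; the function name value_after_equals and the second sample line are made up for illustration.

```shell
# value_after_equals LINE
# Print everything after the first "=" in LINE, minus leading whitespace.
value_after_equals() {
    value=${1#*=}                               # drop up to the first "="
    value=${value#"${value%%[![:space:]]*}"}    # drop leading whitespace
    printf '%s\n' "$value"
}

value_after_equals 'INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist'
value_after_equals 'KEY=   a = b'
```

Because only the shortest prefix up to the first "=" is removed, later equals signs stay part of the value. A line without any "=" is returned unchanged apart from the whitespace trimming.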

How to use sed to identify a string in brackets?

I want to find the string that is placed within the brackets. How do I use sed to pull out that string?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the exact result
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting an output
cfq
It can be easier with grep, especially if the position of the bracketed text varies:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This uses a look-behind: after each [, it fetches all the characters up to (but not including) the next ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F'[][]' '{print $2}' file
cfq
It sets the field separators to [ and ], then prints the second field.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it looks for the block of text in between [] and prints it back.
I found one possible solution-
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1
Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq
Another way, with GNU grep:
grep -Po "\[\K[^]]*" file
with pure shell:
while IFS= read -r line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
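The loop above prints only the first bracketed string on each line. A sketch that prints all of them is shown here; it swaps the regex match for plain POSIX parameter expansion, so it does not rely on bash. The function name bracketed is hypothetical.

```shell
# bracketed LINE
# Print every [bracketed] string in LINE, one per line.
bracketed() {
    rest=$1
    while :; do
        case $rest in
            *\[*\]*) ;;          # a "[" followed later by a "]" remains
            *) break ;;
        esac
        rest=${rest#*\[}                  # drop text up to the first "["
        printf '%s\n' "${rest%%\]*}"      # print up to the next "]"
        rest=${rest#*\]}                  # continue after that "]"
    done
}

bracketed 'hello this [is something] we want to [enclose] yeah'
```

For the sample line this prints "is something" and "enclose" on separate lines. Nested or unbalanced brackets are not handled specially.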
Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq
perl -lne 'print $1 if(/\[([^\]]*)\]/)'

Regexp on nmap services file

I need to filter with sed only the ports from /usr/share/nmap/nmap-services
tcpmux 1/tcp 0.001995 # TCP Port Service Multiplexer [rfc-1078]
compressnet 2/tcp 0.000013 # Management Utility
compressnet 3/tcp 0.001242 # Compression Process
unknown 4/tcp 0.000477
unknown 6/tcp 0.000502
echo 7/tcp 0.004855
unknown 8/tcp 0.000013
discard 9/tcp 0.003764 # sink null
unknown 10/tcp 0.000063
systat 11/tcp 0.000075 # Active Users
I've tried something like (!?([0-9]+/tcp))
But it won't work. Why?
Thank you
Try this:
grep -oP '\d+(?=/(udp|tcp))' /usr/share/nmap/nmap-services
or with perl:
perl -lne 'print $& if m!\d+(?=/(udp|tcp))!' /usr/share/nmap/nmap-services
Both use a positive look-ahead, an advanced regex feature; see http://www.perlmonks.org/?node_id=518444
or with awk, without advanced regexes:
awk '{gsub("/.*", ""); print $2}' /usr/share/nmap/nmap-services
or
awk -F'[ /\t]' '{print $2}' /usr/share/nmap/nmap-services
Here's an example using AWK
cat /usr/share/nmap/nmap-services | awk '{print $2}' | awk -F\/ '{print $1}'
The simplest way is:
cut -s -d' ' -f2 test
You can also do it like this:
sed '/[^ ]* \([^ ]*\).*/ s::\1:; /^$/d' FILE
The cut variant prints empty lines for non-matching input.
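As for why the original attempt fails: sed has no look-around support at all, and in Perl-style syntax a negative look-ahead is written (?!...), not (!?...). Since the question asked for sed, here is a sketch that uses a capture group instead, assuming a sed that accepts -E for extended regexes (GNU and BSD sed both do). The function name ports and the comment line in the sample input are invented for illustration.

```shell
# ports: read nmap-services-style lines on stdin and print the port number
# of every tcp/udp entry. -n plus the p flag prints only the lines where
# the substitution succeeded, so comment lines and anything without a
# "number/proto" second field are skipped.
ports() {
    sed -nE 's,^[^[:space:]]+[[:space:]]+([0-9]+)/(tcp|udp).*,\1,p'
}

printf '%s\n' \
    '# comment line' \
    'tcpmux 1/tcp 0.001995 # TCP Port Service Multiplexer [rfc-1078]' \
    'unknown 4/tcp 0.000477' \
    'echo 7/udp 0.004855' | ports
```

A comma is used as the delimiter of the s command because the pattern itself contains both / and |.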

awk or sed: Best way to grab [this text]

I'm trying to parse various info from log files, some of which is placed within square brackets. For example:
Tue, 06 Nov 2007 10:04:11 INFO processor:receive: [someuserid], [somemessage] msgtype=[T]
What's an elegant way to grab 'someuserid' from these lines, using sed, awk, or other unix utility?
Use cut like this: cut -f2 -d[ | cut -f1 -d]
bart@hal9k:~> YOURTEXT="Tue, 06 Nov 2007 10:04:11 INFO processor:receive: [someuserid], [somemessage] msgtype=[T]"
bart@hal9k:~> SOMEID=$(echo "$YOURTEXT" | cut -f2 -d'[' | cut -f1 -d']')
bart@hal9k:~> echo "$SOMEID"
someuserid
If you want to do something with all the bracketed fields, I'd use Perl:
perl -lne '
my @fields = /\[(.*?)\]/g;
# do something with @fields, like:
print join(":", @fields);
' logfile ...
using bash shell
while read -r line
do
case "$line" in
*processor*receive* )
t=${line#*[}
echo ${t%%]*}
;;
esac
done < "file"
sed -n '/INFO/{s/.[^[]*\[//;s/\].*//p}' file
Using AWK:
cat file | awk -F[\]\[] '{print $2}'
I have found that multiple delimiters do not work in some older versions of AWK. If they don't, you can chain two awk calls:
cat file | awk -F[ '{print $2}' | awk -F] '{print $1}'
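When both brackets are field separators, the bracketed texts end up in the even-numbered fields, so any of them can be selected by position. The following is a sketch assuming an awk that accepts a bracket expression as the field separator, as in the multi-delimiter command above; the helper name nth_bracketed is hypothetical.

```shell
# nth_bracketed N
# Read lines on stdin and print the N-th [bracketed] field of each.
# With both [ and ] as separators, the bracketed texts land in the
# even-numbered fields: $2, $4, $6, ...
nth_bracketed() {
    awk -F'[][]' -v n="$1" '{ print $(2 * n) }'
}

line='Tue, 06 Nov 2007 10:04:11 INFO processor:receive: [someuserid], [somemessage] msgtype=[T]'
printf '%s\n' "$line" | nth_bracketed 1   # someuserid
printf '%s\n' "$line" | nth_bracketed 2   # somemessage
```

This relies on the line containing no stray [ or ] outside the bracketed fields.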