grep regex to output only multiple matches on a single line

I want to find multiple matches on a single line and output only the matched words using grep. The input looks like this:
alias host='echo -e "Connecting to host 10.10.11.120\n===================";ssh root#10.10.11.120'
alias host1='echo -e "Connecting to host 10.10.11.121\n===================";ssh root#10.10.11.121'
alias host2='echo -e "Connecting to host 10.10.11.122\n===================";ssh root#10.10.11.122'
alias host3='echo -e "Connecting to host 10.10.11.123\n===================";ssh root#10.10.11.123'
I want grep to output only the host name and IP address, like this:
host 10.10.11.120
host1 10.10.11.121
host2 10.10.11.122
host3 10.10.11.123

With GNU grep and PCRE (the -P option), the required strings can be extracted:
$ grep -oP 'alias \K[^=]+|#\K[0-9.]+' ip.txt
host
10.10.11.120
host1
10.10.11.121
host2
10.10.11.122
host3
10.10.11.123
However, each extracted string ends up on a separate line, so other commands are needed to join them, for example:
$ grep -oP 'alias \K[^=]+|#\K[0-9.]+' ip.txt | paste - -
host 10.10.11.120
host1 10.10.11.121
host2 10.10.11.122
host3 10.10.11.123
Or, a single perl command can also be used:
$ perl -pe 's/alias (host\d*).*#([\d.]+).*/$1 $2/' ip.txt
host 10.10.11.120
host1 10.10.11.121
host2 10.10.11.122
host3 10.10.11.123
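One detail about the paste step: by default paste joins lines with a tab, so the paired output above is tab-separated rather than space-separated. A minimal sketch, assuming a single space is wanted between the name and the IP; join_pairs is a hypothetical helper name, not something from the answer:

```shell
# join_pairs (hypothetical helper): merge every two input lines into one,
# separated by a single space instead of paste's default tab.
join_pairs() {
    paste -d' ' - -
}

# Stand-in for the one-match-per-line output of grep -oP shown above.
printf '%s\n' host 10.10.11.120 host1 10.10.11.121 | join_pairs
```

In the real pipeline this would replace the trailing `paste - -`.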

1st solution: With awk this can be much simpler. With your shown samples, please try the following awk code.
awk -F" |=|'|#" '{print $2,$(NF-1)}' Input_file
Explanation: Set space, =, ' and # as field separators, then print the 2nd and the second-to-last field of each line. The second-to-last field is the IP address because each line ends with ', which leaves an empty last field.
2nd solution: Use field separators together with awk's match function. Set space and = as field separators and use match to get the values the OP needs.
awk -F' |=' 'match($0,/root#([0-9]+\.){3}[0-9]+/){print $2,substr($0,RSTART+5,RLENGTH-5)}' Input_file
3rd solution (generic): This works regardless of where alias host and root#... are placed in the line, in case anyone needs it.
awk '
match($0,/alias host[^=]*/){
firstVal=substr($0,RSTART+6,RLENGTH-6)
match($0,/root#([0-9]+\.){3}[0-9]+/)
print firstVal,substr($0,RSTART+5,RLENGTH-5)
}
' Input_file
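A small variation on the generic solution, as a sketch only: it guards the second match so that lines with an alias but no root#... part are skipped instead of printing a stray substring. extract_host_ip is a hypothetical name, the sample lines are assumed input, and the IP pattern is loosened to [0-9.]+ so it does not rely on interval expressions, which some awk builds lack.

```shell
# extract_host_ip (hypothetical helper): read alias lines on stdin and
# print "<alias name> <ip>" for lines that contain both parts.
extract_host_ip() {
    awk '
    match($0, /alias [^=]+/) {
        # Skip the 6 characters of "alias " to keep only the alias name.
        name = substr($0, RSTART + 6, RLENGTH - 6)
        # Only print when the root#<ip> part is also present on the line.
        if (match($0, /root#[0-9.]+/))
            print name, substr($0, RSTART + 5, RLENGTH - 5)
    }'
}

# Assumed sample input: one ssh alias and one unrelated alias.
printf '%s\n' "alias host1='echo hi;ssh root#10.10.11.121'" "alias ll='ls -l'" | extract_host_ip
```

The unrelated alias produces no output because the inner match fails.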

grep matches. It doesn't modify.
Try sed:
sed -E "s/^alias ([^=]+)=.*ssh root#([0-9.]+).*/\1 \2/" myscript.sh

Related

Access the word in the file with grep

I have a conf file and I use grep to read data from it, but the output is not quite what I need.
How can I get just the value for a search term?
I am using:
grep "export:" /etc/VDdatas.conf
Print:
export: HelloWorld
I want: (without "export: ")
HelloWorld
How can I do that?
If you're using GNU grep you can use PCRE and a lookbehind:
grep -P -o '(?<=export: ).*' /etc/VDdatas.conf
The -o option means to print only the part of the line that matches the regexp, and using a lookbehind for the export: prefix makes it not part of the match.
You can also use sed or awk
sed -n '/export:/s/^export: //p' /etc/VDdatas.conf
awk '/export:/ {print $2}' /etc/VDdatas.conf
I suggest you pipe the match to awk.
grep "export:" /etc/VDdatas.conf | awk -F ' ' '{print $2}'
This will print the second word in the output (after splitting the line on spaces).
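If this lookup is needed for more than one key, the sed variant can be wrapped in a function. This is only a sketch: get_value is a hypothetical name, the temporary file stands in for /etc/VDdatas.conf, and it assumes the key contains no regex metacharacters and that lines look like "key: value".

```shell
# get_value KEY FILE (hypothetical helper): print whatever follows "KEY: "
# on lines that start with KEY, and nothing for other lines.
get_value() {
    sed -n "s/^$1: *//p" "$2"
}

# Assumed sample config standing in for /etc/VDdatas.conf.
conf=$(mktemp)
printf 'import: Other\nexport: HelloWorld\n' > "$conf"
get_value export "$conf"
rm -f "$conf"
```

Unlike the `print $2` approaches, this keeps values that contain spaces intact.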

Use sed to retrieve IP address from one string

I have one variable:
my_var="Hello 192.168.0.1:22 World"
I want to retrieve the IP from this variable, so I wrote this sed command:
echo $my_var | sed -n "s/.*\([0-9\.]\+\):.*$/\1/p"
I expected to get "192.168.0.1", but I got "1" instead.
Could someone help me? What is wrong with my sed command?
The problem is that .* is greedy, so it will match the numbers in the IP. You need to make it stop at the space before the IP.
echo $my_var | sed -n "s/.* \([0-9.]\+\):.*$/\1/p"
If sed supported PCRE, you could use a non-greedy quantifier .*?, but it only has BRE and ERE.
If there isn't always a space, you could match any character that is not a digit or a dot. But you also need to allow for the IP to be at the beginning of the string.
echo $my_var | sed -n "s/^\(\|.*[^0-9.]\)\([0-9.]\+\):.*$/\2/p"
BTW, it's not necessary to escape . inside [].
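To see the greedy behaviour side by side, here is a minimal sketch. The variable names greedy and anchored are made up for illustration, and \{1,\} is used in place of \+ only to stay within POSIX BRE.

```shell
my_var='Hello 192.168.0.1:22 World'

# Greedy .* swallows as much as it can, leaving only the last digit
# before the colon for the capture group.
greedy=$(printf '%s\n' "$my_var" | sed -n 's/.*\([0-9.]\{1,\}\):.*$/\1/p')

# Requiring a space right before the group stops .* at the start of the IP.
anchored=$(printf '%s\n' "$my_var" | sed -n 's/.* \([0-9.]\{1,\}\):.*$/\1/p')

echo "greedy:   $greedy"
echo "anchored: $anchored"
```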
With bash parameter expansion
$ r='Hello 192.168.0.1:22 World'
$ # remove up to first space
$ echo "${r#* }"
192.168.0.1:22 World
$ s="${r#* }"
$ # remove from first : to end of line
$ echo "${s%%:*}"
192.168.0.1
With awk
$ # space or : is input delimiter
$ echo "$r" | awk -F'[ :]' '{print $2}'
192.168.0.1

use grep/sed/awk to extract string corresponding to certain field

On my Fedora system, I get the following:
$ cat /proc/net/arp
IP address HW type Flags HW address Mask Device
130.48.0.1 0x1 0x2 80:4b:c7:10:3e:41 * wlp1s0
How can I extract the Device value (in this case the answer is wlp1s0) using a stream editor or filter such as sed, grep or awk?
Thanks!
To get the name of the interface used for outbound traffic, you can use this:
ip route get 8.8.8.8 | awk 'NR==1 {print $5}'
eth0
It picks the interface that the route actually uses, even if more than one interface is up.
awk 'NR>1{print $6}' < /proc/net/arp
If we're after the first line (to get rid of the header "Device"), print sixth field (separated by whitespace).
$ awk '/^[0-9]/{print $6}' /proc/net/arp
wlp1s0
/^[0-9]/ selects lines that start with a digit, i.e. the lines with an IP address
print $6 prints the 6th column, which is the Device
Assuming your fields are tab-separated, if the Device always appears as the last column:
$ awk -F'\t' 'NR>1{print $NF}' file
wlp1s0
otherwise:
$ awk -F'\t' 'NR==1{for (i=1;i<=NF;i++) f[$i]=i; next} {print $(f["Device"])}' file
wlp1s0
Four methods:
awk 'NR==1{for (i=1;i<=NF;i++) f[$i]=i; next} {print $(f["HW"])}' </proc/net/arp
awk 'NR==2{print $NF}' < /proc/net/arp
sed -nr '/^[0-9]/s#(.*)* (.*)#\2#gp' /proc/net/arp
sed -nr '/^[0-9]/s#(.*)\*[^a-z]*(.*)#\2#gp' /proc/net/arp
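A caveat on the header-lookup variants: the header of /proc/net/arp contains multi-word column names ("IP address", "HW type", "HW address"), so with whitespace splitting the header has 9 words while each data row has 6 fields, and a lookup by header word does not line up in general. A minimal sketch that avoids the header altogether, assuming Device is always the last column; arp_devices is a hypothetical name and the sample file mimics the output shown in the question:

```shell
# arp_devices [FILE] (hypothetical helper): print the Device column,
# assumed to be the last whitespace-separated field of every non-header row.
arp_devices() {
    awk 'NR > 1 { print $NF }' "${1:-/proc/net/arp}"
}

# Assumed sample mimicking /proc/net/arp from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
IP address       HW type     Flags       HW address            Mask     Device
130.48.0.1       0x1         0x2         80:4b:c7:10:3e:41     *        wlp1s0
EOF
arp_devices "$sample"
rm -f "$sample"
```

Called without an argument, it reads /proc/net/arp directly.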

grep extract simple url - without scheme

I need to extract a URL from a file. I've started with:
grep -E -o 'ftp://\S*' $filename
I know that this particular URL starts with the ftp scheme and ends with a whitespace character (space or newline).
I receive something like:
ftp://dir/some_file.ext
But I need just the path (/dir/some_file.ext), without the scheme (the ftp:// part).
Can I do it with the first regexp? Do I have to use a second one?
I cannot use anything other than grep/egrep.
If your grep supports -P (PCRE flag) then you can use:
grep -oP 'ftp:/\K/\S*' $filename
/dir/some_file.ext
If for some reason you don't have grep -P available, then pipe into another grep:
grep -oE 'ftp://\S*' file | grep -oE '/[^/].*'
/dir/some_file.ext
This GNU awk version (GNU awk is needed because the record separator has multiple characters) may also do:
awk -v RS="ftp:/" 'NR>1 {print $1}' file
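For reference, the two-grep approach can be written without the \S shorthand, which is a GNU extension. This is a sketch under the assumption that the grep in use supports -o (GNU and BSD grep do); ftp_path is a hypothetical name and the sample line is made up.

```shell
# ftp_path (hypothetical helper): read text on stdin and print the part of
# each ftp:// URL that follows the scheme.
ftp_path() {
    # First grep isolates the URL, second grep keeps everything from the
    # first "/" that is not immediately followed by another "/".
    grep -oE 'ftp://[^[:space:]]*' | grep -oE '/[^/].*'
}

printf 'fetch ftp://dir/some_file.ext now\n' | ftp_path
```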

Extracting IP address from a line from ifconfig output with grep

Given this specific line pulled from ifconfig, in my case:
inet 192.168.2.13 netmask 0xffffff00 broadcast 192.168.2.255
How could one extract the 192.168.2.13 part (the local IP address), presumably with regex?
Here's one way using grep:
line='inet 192.168.2.13 netmask 0xffffff00 broadcast 192.168.2.256'
echo "$line" | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b"
Results:
192.168.2.13
192.168.2.256
If you wish to select only valid addresses, you can use:
line='inet 192.168.0.255 netmask 0xffffff00 broadcast 192.168.2.256'
echo "$line" | grep -oE "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
Results:
192.168.0.255
Otherwise, just select the fields you want using awk, for example:
line='inet 192.168.0.255 netmask 0xffffff00 broadcast 192.168.2.256'
echo "$line" | awk -v OFS="\n" '{ print $2, $NF }'
Results:
192.168.0.255
192.168.2.256
Addendum:
Word boundaries: \b
Use this regex (the lookahead needs PCRE, e.g. grep -oP): ((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?=\s*netmask)
You can use egrep (which is basically the same as grep -E).
egrep supports POSIX named character classes, e.g. [[:digit:]]
(which makes the command longer in this case, but you get the point).
Another thing worth knowing is that you can use braces to repeat a pattern:
ifconfig | egrep '([0-9]{1,3}\.){3}[0-9]{1,3}'
or
ifconfig | egrep '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}'
If you only care about the actual IP address, use the -o option to limit output to the matched pattern instead of the whole line:
ifconfig | egrep -o '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}'
...and if you don't want broadcast addresses and such, you can use this:
ifconfig | egrep -o 'addr:([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}' | egrep -o '[[:digit:]].*'
I assumed you were talking about IPv4 addresses only
Just to add an alternative:
ip addr | grep -Po '(?!(inet 127\.\d\.\d\.1))(inet \K(\d{1,3}\.){3}\d{1,3})'
It prints all the IPs except the localhost one.
[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}
I don't have enough reputation points to comment, but I found a bug in Steve's "select only valid addresses" regex. I don't quite understand the problem, but I believe I have found the fix. The first command demonstrates the bug; the second one demonstrates the fix:
$ echo "test this IP: 200.1.1.1" |grep -oE "\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
$ echo "test this IP: 200.1.1.1" |grep -oE "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
200.1.1.1
$
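The difference between the two commands comes down to grouping: in the first one the \. belongs only to the last alternative, so an octet such as 200 (matched by 2[0-4][0-9]) can never be followed by a dot. Wrapping the alternatives in their own parentheses applies the dot to all of them. A minimal sketch of the fixed form; the octet variable and the valid_ipv4s function are hypothetical names, and \b is assumed to be supported by the grep in use, as in the commands above:

```shell
# One octet, 0-255, with the alternatives grouped so that anything appended
# after the group applies to every alternative.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# valid_ipv4s (hypothetical helper): print every valid-looking IPv4 address
# found on stdin, one per line.
valid_ipv4s() {
    grep -oE "\\b(${octet}\\.){3}${octet}\\b"
}

echo "test this IP: 200.1.1.1" | valid_ipv4s
```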
grep -oE "\b([0-9]{1,3}\.?){4}\b"
One way using sed. First instruction deletes all characters until first digit in the line, and second instruction saves first IP in group 1 (\1) and replaces all the line with it:
sed -e 's/^[^0-9]*//; s/\(\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}\).*/\1/'
maybe this, one sed command, just for sports:
ip -o -4 addr show dev eth0 | sed 's/.* inet \([^/]*\).*/\1/'
This code works nicely and is easy too.
ifconfig | grep Bcast > /tmp/ip1
cat /tmp/ip1 | awk '{ print $2 }' > /tmp/ip2
sed -i 's/addr://' /tmp/ip2
IPADDRESS=$(cat /tmp/ip2)
echo "$IPADDRESS"
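The same steps can probably be done in one pipeline without temporary files. This is a sketch, not a drop-in replacement: extract_addr is a hypothetical name, and it assumes the older ifconfig output format where the address appears as addr:<ip> on the line containing Bcast. The sample line below is assumed input in that format.

```shell
# extract_addr (hypothetical helper): from ifconfig-style text on stdin,
# take the second field of lines containing "Bcast" and strip "addr:".
extract_addr() {
    awk '/Bcast/ { sub(/^addr:/, "", $2); print $2 }'
}

# Assumed sample line in the old net-tools format.
IPADDRESS=$(printf '          inet addr:192.168.1.6  Bcast:192.168.1.255  Mask:255.255.255.0\n' | extract_addr)
echo "$IPADDRESS"
```

With a live system the printf would be replaced by ifconfig itself.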
This code works for me on raspberry pi zero w.
(extract wlan0: inet 192.168.x.y address from ifconfig output)
Search for pattern 'inet 192' in ifconfig output and get the 10th position using space delimiter.
$> ifconfig |grep 'inet 192'|cut -d' ' -f10
Output:
192.168.1.6
If using grep that supports Perl regex:
(your command that pulls mentioned line) | grep -Po 'inet \K[\d\.]+'