Grep in bash with regex

Grep in bash with regex - regex

I am getting the following output from a bash script:
INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist
and I would like to get only the path(MajorDomo/MajorDomo-Info.plist) using grep. In other words, everything after the equals sign. Any ideas of how to do this?

This job suites more to awk:
s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist'
awk -F' *= *' '{print $2}' <<< "$s"
MajorDomo/MajorDomo-Info.plist
If you really want grep then use grep -P:
grep -oP ' = \K.+' <<< "$s"
MajorDomo/MajorDomo-Info.plist

Not exactly what you were asking, but
echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.*= \(.*\)$/\1/'
will do what you want.

You could use cut as well:
your_script | cut -d = -f 2-
(where your_script does something equivalent to echo INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist)
If you need to trim the space at the beginning:
your_script | cut -d = -f 2- | cut -d ' ' -f 2-
If you have multiple spaces at the beginning and you want to trim them all, you'll have to fall back to sed: your_script | cut -d = -f 2- | sed 's/^ *//' (or, simpler, your_script | sed 's/^[^=]*= *//')

Assuming your script outputs a single line, there is a shell only solution:
line="$(your_script)"
echo "${line#*= }"

Bash
IFS=' =' read -r _ x <<<"INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist"
printf "%s\n" "$x"
MajorDomo/MajorDomo-Info.plist

Related

Using grep regex to select to first hyphen

echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | grep -oE "([^\/]+$)"
This prints just the filename, without the directory structure, but I cannot manage to print just mainbinary from that string. Suggestions?

And a sed alternative to PS.'s great grep -oP
echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |sed -r 's#^.*/([^-]+).*#\1#'
mainbinary

echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" |grep -oP '.*/\K[^-]+'
mainbinary
This will scan till last / and ignore everything to its left and keep moving until - (excluding)

With any awk in any shell on any UNIX machine:
$ echo "Linux/DEB/mainbinary-0.1.20190424165331-0-armdef.deb" | awk -F'[/-]' '{print $3}'
mainbinary

Regex w/grep against tnsnames.ora

I am trying to print out the contents of a TNS entry from the tnsnames.ora file to make sure it is correct from an Oracle RAC environment.
So if I do something like:
grep -A 4 "mydb.mydomain.com" $ORACLE_HOME/network/admin/tnsnames.ora
I will get back:
mydb.mydomain.com =
(DESCRIPTION =
(ADDRESS =
(PROTOCOL = TCP)(HOST = myhost.mydomain.com)(PORT = 1521))
  (CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME=mydb)))
Which is what I want. Now I have an environment variable being set for the JDBC connection string by an external program when the shell script gets called like:
export $DB_URL=#myhost.mydomain.com:1521/mydb
So I need to get TNS alias mydb.mydomain.com out of the above string. I'm not sure how to do multiple matches and reorder the matches with regex and need some help.
grep #.+: $DB_URL
I assume will get the
#myhost.mydomain.com:
but I'm looking for
mydb.mydomain.com
So I'm stuck at this part. How do I get the TNS alias and then pipe/combine it with the initial grep to display the text for the TNS entry?
Thanks
update:
#mklement0 #Walter A - I tried your ways but they are not exactly what I was looking for.
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
All these methods get me back: myhost.mydomain.com
What I am looking for is actually: mydb.mydomain.com

Note:
- For brevity, the commands below use bash/ksh/zsh here-string syntax to send strings to stdin (<<<"$var"). If your shell doesn't support this, use printf %s "$var" | ... instead.
The following awk command will extract the desired string (mydb.mydomain.com) from $DB_URL (#myhost.mydomain.com:1521/mydb):
awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL"
-F'[#:/]' tells awk to split the input into fields by either # or : or /. With your input, this means that the field of interest are part of the second field ($2) and the fourth field ($4). The sub() call removes the first .-based component from $2, and the print call pieces together the result.
To put it all together:
domain=$(awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL")
grep -F -A 4 "$domain" "$ORACLE_HOME/network/admin/tnsnames.ora"
You don't strictly need intermediate variable $domain, but I've added it for clarity.
Note how -F was added to grep to specify that the search term should be treated as a literal, so that characters such as . aren't treated as regex metacharacters.
Alternatively, for more robust matching, use a regex that is anchored to the start of the line with ^, and \-escape the . chars (using shell parameter expansion) to ensure their treatment as literals:
grep -A 4 "^${domain//./\.}" "$ORACLE_HOME/network/admin/tnsnames.ora"

You can get a part of a string with
# Only GNU-grep
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
# or
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
# or
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
# or, when the string already is in a var
echo "${DB_URL#*#}" | cut -d":" -f1
# or using a temp var
tmpvar="${DB_URL#*#}"
echo "${tmpvar%:*}"
I had skipped the alternative awk, that was given by #mklement0 already:
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
The awk solution is straight-forward, when you want to use the same approach without awk you can do something like
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
or the ugly
echo "#myhost.mydomain.com:1521/mydb" | (IFS='#:' read -r _ url _; echo "$url")
What is happening here?
After introducing the new IFS I want to take the second word of the input. The first and third word(s) are caught in the dummy var's _ (you could have named them dummyvar1 and dummyvar2). The pipe | creates a subprocess, so you need ()to hold reading and displaying the var url in the same process.

How to use sed to identify a string in brackets?

I want to find the string in that is placed with in the brackets. How do I use sed to pull the string?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the exact result
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting an output
cfq

It can be easier with grep, if it happens to be changing the position in which the text in between brackets is located:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This is look-behind: whenever you find a string [, start fetching all the characters up to a ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F[][] '{print $2}' file
cfq
It is setting the field separators as [ and ]. And from that, prints the second one.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it is looking from the block of text in between [] and prints it back.

I found one possible solution-
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1

Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq

Another way by gnu grep :
grep -Po "\[\K[^]]*" file
with pure shell:
while read line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file

Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq

perl -lne 'print $1 if(/\[([^\]]*)\]/)'
Tested here

(GNU)Sed: how to replace any character from nth character to nth+10?

I need to replace characters from 10th to 20th in the string which looks like that:
123456789012345678901234567890
So far I've tried:
a)
Works for the 10th character ONLY:
echo "123456789012345678901234567890" | sed 's/./X/10'
b)
Doesn't work on the range:
echo "123456789012345678901234567890" | sed 's/./X/10,20'
echo "123456789012345678901234567890" | sed 's/./X/10\,20'
echo "123456789012345678901234567890" | sed 's/./X/\{10,20\}'
echo "123456789012345678901234567890" | sed 's/./X/\{10\,20\}'
Does not work and I get error
unknown option to `s'
So - the question is - how do I make this to work:
echo "123456789012345678901234567890" | sed 's/./X/10,20'

Try:
$ sed -r "s/^(.{9})(.{11})/\1XXXXXXXXXX/" <<< 123456789012345678901234567890
123456789XXXXXXXXXX1234567890

It is a complex sed problem, I could just find this solution:
$ sed 's/^\(.\{10\}\)\(.\{10\}\)/\1XXXXXXXXXX/' <<< 123456789012345678901234567890
1234567890XXXXXXXXXX1234567890
With awk it looks nicer:
$ awk 'BEGIN{FS=OFS=""} {for (i=10;i<=20;i++) $i="X"} {print}' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890

You can do it with bash parameter substitution like this:
#!/bin/bash
s="123456789012345678901234567890"
l=${s:0:9} # Extract left part
m=${s:10:11} # Extract middle part
r=${s:20} # Extract right part
# Diddle with middle part to your heart's content and re-assemble "$l$m$r" when done
m=$(sed 's/./X/g' <<<$m)
See here for more explanation and examples.
Or, you can do this:
transform the row of letters into a column so each is on its own line
apply your edits to LINES 10 through 20 (as opposed to characters 10 through 20)
transform column of letters back into a row (by deleting linefeeds)
as shown in the one-liner below:
$ echo "123456789012345678901234567890" | sed "s/\(.\)/\1\n/g" | sed "10,20s/./X/" | tr -d "\n"

I know, that it looks ugly, but:
echo "123456789012345678901234567890" | \
sed 's/^\(.\{10\}\).\{10\}\(.*\)/\1XXXXXXXXXX\2/'

Without placing multiple X in sed command:
sed -r 's/^(.{9})(.{10,20})(.*)$/\1\n\2\n\3/' | sed -e '2s/./X/g' -e 'N;N;s/\n//g'

To replace the 10th to 20th characters, inclusive, try:
echo 123456789012345678901234567890 | sed 's/\(.\{9\}\).\{11\}/\1XXXXXXXXXX/'
123456789XXXXXXXXXX1234567890
With the GNU sed, you can use the -r switch to remove most of the backslashes:
echo 123456789012345678901234567890 | sed -r 's/(.{9}).{11}/\1XXXXXXXXXX/'
Or the naive approach also works here:
echo 123456789012345678901234567890 | sed 's/\(.........\).........../\1XXXXXXXXXX/'

This might work for you (GNU sed):
sed ':a;/.\{9\}X\{11\}/!s/\(.\{9\}X*\)./\1X/;ta' file
or with a bit of syntactic sugar:
sed -r ':a;/.{9}X{11}/!s/(.{9}X*)./\1X/;ta' file

using sed to get only line number of "grep -in"

Which regexp should I use to only get line number from grep -in output?
The usual output is something like this:
241113:keyword
I need to get only "241113" from sed's output.

I suggest cut
grep -in keyword ... | cut -d: -f1
If you insist with sed:
grep -in keyword ... | sed 's/:.*$//g

You don't need to use sed. Cut is enough. Just pipe grep's output to
cut -d ':' -f 1
As an example:
grep -n blabla file.txt | cut -d ':' -f 1

Personally, I like awk
grep -in 'search' file | awk --field-separator : '{print $1}'

As said in other answers, cut is the right tool; but if you really want to use a swiss-army knife, you can also use awk:
grep -in keyword ... | awk -F: '{print $1}'
or using grep again:
grep -in keyword ... | grep -oE '^[0-9]+'

Just in case someone is wondering if all this could be done without grep, i.e. with sed alone ...
echo '
a
b
keyword
c
keyWord
x
y
keyword
Keyword
z
' |
sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;}'
#sed -n '/[Kk][Ee][Yy][Ww][Oo][Rr][Dd]/{=;q;}' # only line number of first match

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Grep in bash with regex - regex

I am getting the following output from a bash script: INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist and I would like to get only the path(MajorDomo/MajorDomo-Info.plist) using grep. In other words, everything after the equals sign. Any ideas of how to do this?

This job suites more to awk: s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist' awk -F' = ' '{print $2}' <<< "$s" MajorDomo/MajorDomo-Info.plist If you really want grep then use grep -P: grep -oP ' = \K.+' <<< "$s" MajorDomo/MajorDomo-Info.plist

Not exactly what you were asking, but echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.= \(.\)$/\1/' will do what you want.

Assuming your script outputs a single line, there is a shell only solution: line="$(your_script)" echo "${line#*= }"

Bash IFS=' =' read -r _ x <<<"INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" printf "%s\n" "$x" MajorDomo/MajorDomo-Info.plist

Related

Using grep regex to select to first hyphen

Regex w/grep against tnsnames.ora

How to use sed to identify a string in brackets?

(GNU)Sed: how to replace any character from nth character to nth+10?

using sed to get only line number of "grep -in"

Categories

Resources

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Grep in bash with regex - regex

I am getting the following output from a bash script: INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist and I would like to get only the path(MajorDomo/MajorDomo-Info.plist) using grep. In other words, everything after the equals sign. Any ideas of how to do this?

This job suites more to awk: s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist' awk -F' *= *' '{print $2}' <<< "$s" MajorDomo/MajorDomo-Info.plist If you really want grep then use grep -P: grep -oP ' = \K.+' <<< "$s" MajorDomo/MajorDomo-Info.plist

Not exactly what you were asking, but echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.*= \(.*\)$/\1/' will do what you want.

Assuming your script outputs a single line, there is a shell only solution: line="$(your_script)" echo "${line#*= }"

Bash IFS=' =' read -r _ x <<<"INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" printf "%s\n" "$x" MajorDomo/MajorDomo-Info.plist

Related

Using grep regex to select to first hyphen

Regex w/grep against tnsnames.ora

How to use sed to identify a string in brackets?

(GNU)Sed: how to replace any character from nth character to nth+10?

using sed to get only line number of "grep -in"

Categories

Resources

This job suites more to awk: s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist' awk -F' = ' '{print $2}' <<< "$s" MajorDomo/MajorDomo-Info.plist If you really want grep then use grep -P: grep -oP ' = \K.+' <<< "$s" MajorDomo/MajorDomo-Info.plist

Not exactly what you were asking, but echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.= \(.\)$/\1/' will do what you want.