Using sed captured group variable as input for bash command - regex

I have text like:
TEXT="I need to replace the hostname [[google.com]] with it's ip in side the text"
Is there a way to use something like below, but working?
sed -Ee "s/\[\[(.*)\]\]/`host -t A \1 | rev | cut -d " " -f1 | rev`/g" <<< $TEXT
It looks like the value of \1 is not being passed to the shell command used inside sed.
Thanks

Backquote interpolation is performed by the shell, not by sed. This means that your backquotes will either be replaced by the output of a command before the sed command is run, or (if you correctly quote them) they will not be replaced at all, and sed will see the backquotes.
You appear to be trying to have sed perform a replacement, then have the shell perform backquote interpolation.
You can get the backquotes past the shell by quoting them properly:
$ echo "" | sed -e 's/^/`hostname`/'
`hostname`
However, in that case you will have to use the resulting string in a shell command line to cause backquote interpolation again.
Depending on how you feel about awk, perl, or python, I'd suggest you use one of them to do this job in a single pass. Alternatively, you could make a first pass extracting the hostnames into a command without backquotes, then execute the commands to get the IP addresses you want, then replace them in another pass.
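A minimal sketch of the multi-pass idea in plain bash, assuming bash 3.2 or later. The names replace_placeholders and lookup_ip are made up for this illustration, and the resolver is passed in by name so that it can be swapped out. It is not the only way to do this, just one arrangement that keeps the lookup out of sed:

```shell
# Replace every [[name]] in a text with the output of a resolver command.
# $1: the text, $2: name of a command or function that prints the
#     replacement for the name it receives as its first argument.
replace_placeholders() {
    local text=$1 resolver=$2 name value
    local re='\[\[([^]]*)\]\]'           # ERE: [[ ... ]] with no ']' inside
    while [[ $text =~ $re ]]; do
        name=${BASH_REMATCH[1]}          # the captured host name
        value=$("$resolver" "$name")     # run the lookup in the shell, not in sed
        text=${text//"[[$name]]"/$value} # quoted pattern is matched literally
    done
    printf '%s\n' "$text"
}

# Hypothetical resolver built on the host command from the question;
# it prints the first A record and needs network access when called.
lookup_ip() {
    host -t A "$1" | awk '/has address/ { print $NF; exit }'
}

# Usage (not run here because it needs DNS):
# replace_placeholders "$TEXT" lookup_ip
```

Passing the resolver as an argument also makes the function easy to try with a stub that prints a fixed address.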

It has to be a two-part command: one part to get a variable that bash can use, the other to do a straightforward s/// replacement with sed.
TEXT="I need to replace the hostname [[google.com]] with it's ip in side the text"
DOMAIN=$(echo "$TEXT" | sed -e 's/^.*\[\[//' -e 's/\]\].*$//')
echo "$TEXT" | sed -e 's/\[\[.*\]\]/'"$(host -tA "$DOMAIN" | rev | cut -d " " -f1 | rev)"'/'
More cleanly, using the technique from "How to split a string in shell and get the last field":
TEXT="I need to replace the hostname [[google.com]] with it's ip in side the text"
DOMAIN=$(echo "$TEXT" | sed -e 's/^.*\[\[//' -e 's/\]\].*$//')
HOSTLOOKUP=$(host -tA "$DOMAIN")
echo "$TEXT" | sed -e 's/\[\[.*\]\]/'"${HOSTLOOKUP##* }"/
The short version is that you can't mix sed and bash the way you're expecting to.

This works:
#!/bin/bash
txt="I need to replace the hostname [[google.com]] with it's ip in side the text"
host_name=$(sed -E 's/^[^[]*\[\[//; s/^(.*)\]\].*$/\1/' <<<"$txt")
ip_addr=$(host -tA "$host_name" | sed -E 's/.* ([0-9.]*)$/\1/')
echo "$txt" | sed -E 's/\[\[.*\]\]/'"$ip_addr/"
# I need to replace the hostname 172.217.4.174 with it's ip in side the text

Thank you all,
I made the below solution:
function host_to_ip () {
    host -t A "$1" | head -n 1 | rev | cut -d" " -f1 | rev
}
function resolve_hosts () {
    # Collect the ##host## placeholders found in the file into an array
    local host_placeholders=($(grep -o -e "##.*##" "$1"))
    for HOST in "${host_placeholders[@]}"
    do
        sed -i -e "s/$HOST/$(host_to_ip "$(sed -Ee 's/##(.*)##/\1/g' <<< "$HOST")")/g" "$1"
    done
}
Where resolve_hosts gets a text file as an argument

Regex w/grep against tnsnames.ora

I am trying to print out the contents of a TNS entry from the tnsnames.ora file to make sure it is correct from an Oracle RAC environment.
So if I do something like:
grep -A 4 "mydb.mydomain.com" $ORACLE_HOME/network/admin/tnsnames.ora
I will get back:
mydb.mydomain.com =
(DESCRIPTION =
(ADDRESS =
(PROTOCOL = TCP)(HOST = myhost.mydomain.com)(PORT = 1521))
  (CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME=mydb)))
Which is what I want. Now I have an environment variable being set for the JDBC connection string by an external program when the shell script gets called like:
export DB_URL=#myhost.mydomain.com:1521/mydb
So I need to get TNS alias mydb.mydomain.com out of the above string. I'm not sure how to do multiple matches and reorder the matches with regex and need some help.
grep #.+: $DB_URL
I assume will get the
#myhost.mydomain.com:
but I'm looking for
mydb.mydomain.com
So I'm stuck at this part. How do I get the TNS alias and then pipe/combine it with the initial grep to display the text for the TNS entry?
Thanks
update:
@mklement0 @Walter A - I tried your approaches, but they are not exactly what I was looking for.
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
All these methods get me back: myhost.mydomain.com
What I am looking for is actually: mydb.mydomain.com
Note:
- For brevity, the commands below use bash/ksh/zsh here-string syntax to send strings to stdin (<<<"$var"). If your shell doesn't support this, use printf %s "$var" | ... instead.
The following awk command will extract the desired string (mydb.mydomain.com) from $DB_URL (#myhost.mydomain.com:1521/mydb):
awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL"
-F'[#:/]' tells awk to split the input into fields on any of #, :, or /. With your input, this means that the pieces of interest are in the second field ($2) and the fourth field ($4). The sub() call removes the first .-separated component from $2, and the print call pieces the result together.
To put it all together:
domain=$(awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL")
grep -F -A 4 "$domain" "$ORACLE_HOME/network/admin/tnsnames.ora"
You don't strictly need intermediate variable $domain, but I've added it for clarity.
Note how -F was added to grep to specify that the search term should be treated as a literal, so that characters such as . aren't treated as regex metacharacters.
Alternatively, for more robust matching, use a regex that is anchored to the start of the line with ^, and \-escape the . chars (using shell parameter expansion) to ensure their treatment as literals:
grep -A 4 "^${domain//./\.}" "$ORACLE_HOME/network/admin/tnsnames.ora"
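The same extraction can also be sketched with shell parameter expansion alone, which avoids starting awk. The function name tns_alias_from_url is hypothetical, and the sketch assumes the URL always has the shape #host.domain:port/service shown in the question:

```shell
# Build the TNS alias (service.domain) from '#host.domain:port/service'.
tns_alias_from_url() {
    local url=$1 service host domain
    service=${url##*/}   # text after the last '/'        -> mydb
    host=${url#*\#}      # drop everything up to the '#'
    host=${host%%:*}     # drop ':port/service'           -> myhost.mydomain.com
    domain=${host#*.}    # drop the first dotted label    -> mydomain.com
    printf '%s.%s\n' "$service" "$domain"
}

tns_alias_from_url '#myhost.mydomain.com:1521/mydb'
```

The result can be passed to grep -F -A 4 in the same way as $domain above.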
You can get a part of a string with
# Only GNU-grep
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
# or
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
# or
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
# or, when the string already is in a var
echo "${DB_URL#*#}" | cut -d":" -f1
# or using a temp var
tmpvar="${DB_URL#*#}"
echo "${tmpvar%:*}"
I had skipped the awk alternative, since @mklement0 had already given it:
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
The awk solution is straightforward. If you want to use the same approach without awk, you can do something like
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
or the ugly
echo "#myhost.mydomain.com:1521/mydb" | (IFS='#:' read -r _ url _; echo "$url")
What is happening here?
After setting the new IFS, I want to take the second word of the input. The first and third word(s) are caught in the dummy variables _ (you could have named them dummyvar1 and dummyvar2). The pipe | creates a subprocess, so you need the () to keep reading and displaying the variable url in the same process.
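When the string is already in a variable, a here-string lets read run in the current shell, so neither the pipe nor the parentheses are needed. This is a sketch assuming bash; host_from_url and the variable names skip1 and rest are made up for the example:

```shell
# Split on '#' and ':' and keep the second field (the host name).
host_from_url() {
    local skip1 host rest
    # IFS is changed only for this one read command.
    IFS='#:' read -r skip1 host rest <<<"$1"
    printf '%s\n' "$host"
}

host_from_url '#myhost.mydomain.com:1521/mydb'
```

Because IFS is set as a prefix to read, the rest of the script keeps its normal word splitting.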

Replace string with another string based on backreference with sed

I'm trying to convert a predefined string %c# where # can be some number with another string. The catch is that the length of the other string must be truncated to # number of characters.
Ideally these set of commands would work:
FORMAT="%c10"
LAST_COMMIT="5189e42b14797b1e36ffb7fc5657c7eea08f1c0f"
echo $FORMAT | sed "s/%c\([0-9]\+\)/${LAST_COMMIT:0:\1}/g"
but clearly there is a syntax error on the \1. You can replace it with a number to see what I'm trying to get as output.
I'm open to using some other program other than sed to achieve this but ideally it should be programs that are pretty much native to most linux installations.
Thanks!
This is my idea.
echo ${LAST_COMMIT} | head -c $(echo ${FORMAT} | sed -e 's/%c//')
This gets the number with sed and then takes that many leading characters with head.
EDIT1
This might be better.
echo ${LAST_COMMIT} | head -c $(echo ${FORMAT} | sed -e 's/%c\([0-9]\+\)/\1/')
EDIT2
I wrote a script, because the one-liner is hard to follow. Please try this.
$ cat sample.sh
#!/bin/bash
FORMAT="%b-%t-%c10-%c5"
LAST_COMMIT="5189e42b14797b1e36ffb7fc5657c7eea08f1c0f"
## List numbers
lengths=$(echo ${FORMAT} | sed -e "s/%[^c]//g" -e "s/-//g" -e "s/%c/ /g")
## Substitute %cXX to first XX characters of LAST_COMMIT
for n in ${lengths}
do
    to_str=${LAST_COMMIT:0:${n}}
    FORMAT=$(echo "${FORMAT}" | sed "s/%c${n}/${to_str}/")
done
## Print result
echo "${FORMAT}"
This is the result.
$ ./sample.sh
%b-%t-5189e42b14-5189e
The same thing as a one-liner (identical logic, but long and hard to read):
for n in $(echo ${FORMAT} | sed -e "s/%[^c]//g" -e "s/-//g" -e "s/%c/ /g"); do to_str=${LAST_COMMIT:0:${n}}; FORMAT=$(echo "${FORMAT}" | sed "s/%c${n}/${to_str}/"); done; echo "${FORMAT}"
The value of $LAST_COMMIT gets interpolated before sed runs, so there is no backreference to refer back to yet. There is an /e extension in GNU sed which would support something like this, but I would simply use a slightly more capable tool.
perl -e '$fmt = shift; $fmt=~ s/%c(\d+)/%.$1s/g; printf("$fmt\n", @ARGV)' '%c10' "$LAST_COMMIT"
Of course, if you can let go of your own ad-hoc format string specifier, and switch to a printf-compatible format string altogether, just use the printf shell command straight off.
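As a small illustration of the printf route, a precision on %s truncates the argument, so %.10s plays the role of the ad-hoc %c10. The variable name short is only for the example:

```shell
LAST_COMMIT="5189e42b14797b1e36ffb7fc5657c7eea08f1c0f"
# %.10s prints at most the first 10 characters of the argument.
short=$(printf '%.10s' "$LAST_COMMIT")
echo "$short"
```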
length=$(echo $FORMAT | sed "s/%c\([0-9]\+\)/\1/g")
echo "${LAST_COMMIT:0:$length}"
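If the format string can contain several %cN specifiers, one possible sketch is to let bash find each one with a regex and substitute the truncated hash itself. It assumes bash 3.2 or later; expand_commit_format is a hypothetical name, and lengths with leading zeros are not handled:

```shell
# Expand every %cN in a format string to the first N characters of a commit hash.
# $1: format string, $2: commit hash
expand_commit_format() {
    local fmt=$1 commit=$2 n
    local re='%c([0-9]+)'
    while [[ $fmt =~ $re ]]; do
        n=${BASH_REMATCH[1]}               # the captured length
        fmt=${fmt/"%c$n"/${commit:0:n}}    # replace this specifier only
    done
    printf '%s\n' "$fmt"
}

expand_commit_format '%b-%t-%c10-%c5' '5189e42b14797b1e36ffb7fc5657c7eea08f1c0f'
```

Other specifiers such as %b and %t are left untouched for whatever handles them later.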

How to cut a string from a string

My script gets this string for example:
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Let's say I don't know how long the part of the string before /importance is.
I want a new variable that will keep only the /importance/lib1/lib2/lib3/file from the full string.
I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
Here is the command in my code:
find <main_path> -name file | sed 's/.*importance//'
I am not familiar with the regex, so I need your help please :)
Sorry, friends, I got my question wrong:
I don't need the output /importance/lib1/lib2/lib3/file but /importance/lib1/lib2/lib3 with no /file in the output.
Can you help me?
I would use awk:
$ echo "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file" | awk -F"/importance/" '{print FS$2}'
/importance/lib1/lib2/lib3/file
Which is the same as:
$ awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file
That is, we set the field separator to /importance/, so that the first field is what comes before it and the 2nd one is what comes after. To print /importance/ itself, we use FS!
All together, and to save it into a variable, use:
var=$(find <main_path> -name file | awk -F"/importance/" '{print FS$2}')
Update
I don't need the output /importance/lib1/lib2/lib3/file but
/importance/lib1/lib2/lib3 with no /file in the output.
Then you can use something like dirname to get the path without the name itself:
$ dirname $(awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file")
/importance/lib1/lib2/lib3
Instead of substituting all until importance with nothing, replace with /importance:
~$ echo $var
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
~$ sed 's:.*importance:/importance:' <<< $var
/importance/lib1/lib2/lib3/file
As noted by @lurker, if importance can also occur inside other directory names, you could add the surrounding /s to be safe:
~$ sed 's:.*/importance/:/importance/:' <<< "/dir1/dirimportance/importancedir/..../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file
With GNU sed:
echo '/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file' | sed -E 's#.*(/importance.*)#\1#'
Output:
/importance/lib1/lib2/lib3/file
pure bash
kent$ a="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
kent$ echo ${a/*\/importance/\/importance}
/importance/lib1/lib2/lib3/file
external tool: grep
kent$ grep -o '/importance/.*' <<<$a
/importance/lib1/lib2/lib3/file
I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
You were very close. All you had to do was substitute /importance back in:
sed 's/.*importance/\/importance/'
However, I would use Bash's built-in parameter expansion. It is faster because it does not start an external process.
The expansion ${foo##pattern} takes the shell variable foo and removes the longest match of the glob pattern from its left side. Since that also strips importance itself, prepend it again:
file_name="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
file_name="/importance${file_name##*importance}"
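For the updated requirement (no trailing /file), the prefix removal can be combined with a suffix removal. This is only a sketch: importance_dir is a made-up name, and it assumes the path contains /importance/ followed by at least one more component:

```shell
# Keep the path from /importance onward and drop the final component.
importance_dir() {
    local path=$1 tail
    tail="/importance${path##*/importance}"  # strip the longest prefix, then restore /importance
    printf '%s\n' "${tail%/*}"               # strip the shortest '/component' suffix
}

importance_dir "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
```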
Removing the /file at the end, as you ask:
echo '<path>' | sed -r 's#.*(/importance.*)/[^/]*#\1#'
Input /dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Returns: /importance/lib1/lib2/lib3
See this "Match groups" tutorial.

Log Extract: SED Command

I am trying to extract logs from my application between specific timestamps, so I wrote the following script:
a= echo $1 | sed 's/\//\\\//g';
b= echo $2 | sed 's/\//\\\//g';
sed -n "/$a/,/$b/p" SystemOut.log;
Here a and b are the timestamps which I pass as parameters. When I run the script, sed does not expand the variables.
But if i run the following script in terminal it works fine
sed -n '/6\/30\/14 9:03/,/6\/30\/14 9:04/p' SystemOut.log
Anyone can help?
I am running the script as following-
sh extract.sh '6/30/14 9:01' '6/30/14 9:03'
Try this way:
a=$(echo $1 | sed 's/\//\\\//g');
b=$(echo $2 | sed 's/\//\\\//g');
sed -n "/$a/,/$b/p" SystemOut.log;
In order to store the output of a command in a variable you can use $()
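An alternative sketch that avoids escaping the slashes at all: sed accepts any delimiter for an address regex when it is introduced with a backslash, so \|...| works for timestamps that contain /. The function name extract_between is hypothetical, and the sketch assumes the timestamps contain no | and no regex metacharacters:

```shell
# Print the lines from the first match of $1 through the next match of $2 in file $3.
extract_between() {
    local start=$1 end=$2 file=$3
    # \|regex| is an address with '|' as the delimiter instead of '/'
    sed -n "\|$start|,\|$end|p" "$file"
}

# Usage, with the arguments from the question:
# extract_between '6/30/14 9:01' '6/30/14 9:03' SystemOut.log
```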
Use double quotes around the sed script so that the shell expands the variables; the variables themselves need no extra quoting inside it:
sed -n "/$a/,/$b/p" SystemOut.log;

Substitute a regex pattern using awk

I am trying to write a regex expression to replace one or more '+' symbols present in a file with a space. I tried the following:
echo This++++this+++is+not++done | awk '{ sub(/\++/, " "); print }'
This this+++is+not++done
Expected:
This this is not done
Any ideas why this did not work?
Use gsub which does global substitution:
echo This++++this+++is+not++done | awk '{gsub(/\++/," ");}1'
sub function replaces only 1st match, to replace all matches use gsub.
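The difference can be seen side by side in a small sketch. The function names replace_first_run and replace_all_runs are made up for the example, and [+]+ is used instead of \++ only because a bracket expression reads the same in every awk:

```shell
# sub() changes only the first run of '+' on each line.
replace_first_run() { awk '{ sub(/[+]+/, " "); print }'; }

# gsub() changes every run of '+' on each line.
replace_all_runs() { awk '{ gsub(/[+]+/, " "); print }'; }

echo 'This++++this+++is+not++done' | replace_first_run
echo 'This++++this+++is+not++done' | replace_all_runs
```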
Or the tr command:
echo This++++this+++is+not++done | tr -s '+' ' '
The idiomatic awk solution would be just to translate the input field separator to the output separator:
$ echo This++++this+++is+not++done | awk -F'[+]+' '{$1=$1}1'
This this is not done
Try this
echo "This++++this+++is+not++done" | sed -re 's/(\+)+/ /g'
You could use sed too.
echo This++++this+++is+not++done | sed -e 's/+\{1,\}/ /g'
This matches one or more + and replaces it with a space.
For this case I recommend sed; it is powerful for substitutions and has a short syntax.
Solution sed:
echo This++++this+++is+not++done | sed -En 's/\++/ /gp'
Result:
This this is not done
For awk:
You must use the gsub function for global substitution (more than one substitution per line).
The syntax:
gsub(regexp, replacement [, target]).
If the third parameter is omitted, then $0 is the target.
The target must be a variable, field, or array element. gsub modifies the target in place, overwriting each match with the replacement.
Solution awk:
echo This++++this+++is+not++done | awk 'gsub(/\++/," ")'
Result:
This this is not done
echo "This++++this+++is+not++done" | sed 's/++*/ /g'
If you have access to node on your computer, you can do it by installing rexreplace
npm install -g rexreplace
and then run
rexreplace '\++' ' ' myfile.txt
Or, if you have more files in a directory named data, you can do
rexreplace '\++' ' ' data/*.txt