using vars as numbers in sed [duplicate] - regex

This question already has answers here:
sed substitution with Bash variables
(6 answers)
Closed 8 years ago.
Having some difficulty in getting a sed | grep pipe to work when using vars as numbers.
In the string below the '3,5p' works fine, but when substituting the numbers for vars I get the error
sed: -e expression #1, char 4: extra characters after command
working=$(sed -n '3,5p' ${myFile} | grep -n "string" |cut -f1 -d: )
notWorking=$(sed -n '${LINESTART},${LINEEND}p' ${myFile} | grep -n "string" |cut -f1 -d: )
I would also be interested in any advice how I could change command so the line number returned is replaced with $string2 in the file myFile
thanks
Art

The shell has to expand the variables before sed sees the expression. Single quotes suppress that expansion, so enclose the expression in double quotes:
sed -n "${LINESTART},${LINEEND}p" ${myFile}
       ^                        ^
instead of
sed -n '${LINESTART},${LINEEND}p' ${myFile}
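A quick way to confirm the quoting behaviour is to try it on a throwaway file. This is only a sketch: the six-line temporary file and the variable name selected are made up for illustration.

```shell
# Build a small throwaway input file (contents are illustrative).
myFile=$(mktemp)
printf '%s\n' one two three four five six > "$myFile"

LINESTART=3
LINEEND=5

# Double quotes: the shell expands the variables, so sed receives "3,5p".
selected=$(sed -n "${LINESTART},${LINEEND}p" "$myFile")
printf '%s\n' "$selected"

# With single quotes sed would receive the literal text
# ${LINESTART},${LINEEND}p and reject it as an invalid script.

rm -f "$myFile"
```

Quoting "$myFile" as well protects against file names that contain spaces.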
Since you are looking for the line number in $myFile where string is found, provided that line lies between $LINESTART and $LINEEND, you can do:
awk 'NR>=start && NR<=end && /string/ {print NR}' start=$LINESTART end=$LINEEND ${myFile}
Suppose you want to replace a string just if it appears in specific lines. You can use this:
sed -i.bak "${LINESTART},${LINEEND} s/FIND/REPLACE/" file
-i.bak makes a backup of the file and does an in-place edit: file will contain the modified file, while file.bak will be the backup.
Test
$ cat a
hello
this is
something
i want changed
end
but this is not to be changed
$ sed -i.bak '3,5 s/changed/NEW/' a
$ cat a
hello
this is
something
i want NEW <---- "changed" got replaced
end
but this is not to be changed <---- this "changed" did not
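For the second part of the question, one possible reading is that the whole matching line inside the range should be replaced by $string2. A minimal sketch under that assumption follows; the sample file contents and the value of string2 are invented, and the file is rewritten through a temporary copy rather than with sed -i.

```shell
# Illustrative input: "string" occurs on line 3 (in range) and line 5 (out of range).
myFile=$(mktemp)
printf '%s\n' 'line one' 'line two' 'has string here' 'line four' 'string again' > "$myFile"

LINESTART=2
LINEEND=4
string2='REPLACED'

# Lines inside the range that match /string/ are replaced wholesale by
# $string2; every other line passes through unchanged.
tmp=$(mktemp)
awk -v start="$LINESTART" -v end="$LINEEND" -v repl="$string2" '
  NR >= start && NR <= end && /string/ { print repl; next }
  { print }
' "$myFile" > "$tmp" && mv "$tmp" "$myFile"

result=$(cat "$myFile")
printf '%s\n' "$result"
rm -f "$myFile"
```

Note that awk -v interprets backslash escapes in the assigned value, so a replacement containing backslashes would need extra care.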

Related

how to replace continuous pattern in text

I have text like 1|2|3||| and am trying to replace each || with |0|. My command is the following:
echo '1|2|3|||' | sed -e 's/||/|0|/g'
but I get the result 1|2|3|0||, so the pattern is only replaced once.
Could someone help me improve the command? Thanks.
Just do it 2 times
l_replace='s#||#|0|#g'
echo '1|2|3||||||||4||5|||' | sed -e "$l_replace;$l_replace"
Using any sed or any awk in any shell on every Unix box:
$ echo '1|2|3|||' | sed -e 's/||/|0|/g; s/||/|0|/g'
1|2|3|0|0|
$ echo '1|2|3|||' | awk '{while(gsub(/\|\|/,"|0|"));}1'
1|2|3|0|0|
This might work for you (GNU sed):
sed 's/||/|0|/g;s//|0|/g' file
or:
sed ':a;s/||/|0|/g;ta' file
The replacement needs to be applied twice because part of the next match lies inside the replacement text, and the g flag does not rescan text it has just inserted.
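The loop variant can be checked against both inputs used in this thread. The sketch below spells the script with separate -e options, which should also be accepted by non-GNU sed; the function name fill_empty is a hypothetical helper, not part of any answer above.

```shell
# Label, substitute, then branch back to the label if anything was replaced.
fill_empty() {   # hypothetical helper name
  sed -e ':a' -e 's/||/|0|/g' -e 'ta'
}

short=$(printf '%s\n' '1|2|3|||' | fill_empty)
long=$(printf '%s\n' '1|2|3||||||||4||5|||' | fill_empty)

printf '%s\n' "$short" "$long"
# 1|2|3|0|0|
# 1|2|3|0|0|0|0|0|0|0|4|0|5|0|0|
```

The loop stops as soon as a pass makes no substitution, so it needs no assumption about how many passes are required.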

sed & regex expression

I'm trying to add a 'chr' prefix to the lines where it is missing. This is only necessary on lines that do not start with '#'.
At first I used grep + sed, as follows, but I want the command to overwrite the original file.
grep -v "^#" 5b110660bf55f80059c0ef52.vcf | grep -v 'chr' | sed 's/^/chr/g'
So, to run the command in file I write this:
sed -i -E '/^#.*$|^chr.*$/ s/^/chr/' 5b110660bf55f80059c0ef52.vcf
This is the content of the vcf file.
##FORMAT=<ID=DP4,Number=4,Type=Integer,Description="#ref plus strand,#ref minus strand, #alt plus strand, #alt minus strand">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 24430-0009S21_GM17-12140
1 955597 95692 G T 1382 PASS VARTYPE=1;BGN=0.00134309;ARL=150;DER=53;DEA=55;QR=40;QA=39;PBP=1091;PBM=300;TYPE=SNP;DBXREF=dbSNP:rs115173026,g1000:0.2825,esp5400:0.2755,ExAC:0.2290,clinvar:rs115173026,CLNSIG:2,CLNREVSTAT:mult,CLNSIGLAB:Benign;SGVEP=AGRN|+|NM_198576|1|c.45G>T|p.:(p.Pro15Pro)|synonymous GT:DP:AD:DP4 0/1:125:64,61:50,14,48,13
chr1 957898 82729935 G T 1214 off_target VARTYPE=1;BGN=0.00113362;ARL=149;DER=50;DEA=55;QR=38;QA=40;PBP=245;PBM=978;NVF=0.53;TYPE=SNP;DBXREF=dbSNP:rs2799064,g1000:0.3285;SGVEP=AGRN|+|NM_198576|2|c.463+56G>T|.|intronic GT:DP:AD:DP4 0/1:98:47,51:9,38,10,41
If I understand your expected result correctly, try:
sed -ri '/^(#|chr)/! s/^/chr/' file
Your question isn't clear and you didn't provide the expected output so we can't test a potential solution but if all you want is to add chr to the start of lines where it's not already present and which don't start with # then that's just:
awk '!/^(#|chr)/{$0="chr" $0} 1' file
To overwrite the original file using GNU awk would be:
awk -i inplace '!/^(#|chr)/{$0="chr" $0} 1' file
and with any awk:
awk '!/^(#|chr)/{$0="chr" $0} 1' file > tmp && mv tmp file
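Before overwriting the real file, it may be worth running the negated-address command on a small sample and inspecting the result. The sketch below uses shortened, made-up VCF lines and writes to standard output only.

```shell
# Shortened sample: two header lines, one line lacking chr, one line with it.
vcf=$(mktemp)
printf '%s\n' '##FORMAT=<ID=DP4>' '#CHROM POS ID' \
  '1 955597 95692' 'chr1 957898 82729935' > "$vcf"

# Negated address: act only on lines that do NOT start with # or chr.
fixed=$(sed -E '/^(#|chr)/! s/^/chr/' "$vcf")
printf '%s\n' "$fixed"

rm -f "$vcf"
```

Only the third line gains the prefix; header lines and the line that already starts with chr are left alone.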
This can be done with a single sed invocation. The script itself is something like the following.
If you have an input of format
$ echo -e '#\n#\n123chr456\n789chr123\nabc'
#
#
123chr456
789chr123
abc
then prepending chr to the non-commented lines that do not contain chr is done as
$ echo -e '#\n#\n123chr456\n789chr123\nabc' | sed '/^#/ {p
d
}
/chr/ {p
d
}
s/^/chr/'
which prints
#
#
123chr456
789chr123
chrabc
(Note the multiline sed script.)
Now you only need to run this script on a file in-place (-i in modern sed versions.)

Dynamically substitute pattern with env variable with bash [duplicate]

This question already has answers here:
Bash Search File for Pattern, Replace Pattern With Code that Includes Git Branch Name
(1 answer)
Replace a string in shell script using a variable
(12 answers)
Closed 6 years ago.
I have a file file.txt with this content: Hi {YOU}, it's {ME}
I would like to dynamically create a new file file1.txt like this
YOU=John
ME=Leonardo
cat ./file.txt | sed 'SED_COMMAND_HERE' > file1.txt
whose content would be: Hi John, it's Leonardo
The sed command I tried so far is like this s#{\([A-Z]*\)}#'"$\1"'#g but the "substitution" part doesn't work correctly, it prints out Hi $YOU, it's $ME
The sed utility can do multiple things to each input line:
$ sed -e "s/{YOU}/$YOU/" -e "s/{ME}/$ME/" inputfile.txt >outputfile.txt
This assumes that {YOU} and {ME} each occur only once on the line; otherwise, just add g ("s/{YOU}/$YOU/g" etc.)
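One caveat with interpolating variables into the replacement: a value containing /, & or \ would be interpreted by sed rather than inserted literally. A minimal sketch of one way around that is shown here; escape_repl is a hypothetical helper and the value of YOU is invented to exercise the problem. Values containing newlines are not handled.

```shell
# Hypothetical helper: backslash-escape \, / and & so that sed treats the
# value as literal replacement text when / is the s command delimiter.
escape_repl() {
  printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g'
}

YOU='R&D/ops'      # illustrative value containing sed metacharacters
ME='Leonardo'

line=$(printf '%s\n' "Hi {YOU}, it's {ME}" |
  sed -e "s/{YOU}/$(escape_repl "$YOU")/g" \
      -e "s/{ME}/$(escape_repl "$ME")/g")

printf '%s\n' "$line"
# Hi R&D/ops, it's Leonardo
```

The helper uses | as its own delimiter so that / can appear unescaped inside the bracket expression.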
You can use awk with 2 files.
$> cat file.txt
Hi {YOU}, it's {ME}
$> cat repl.txt
YOU=John
ME=Leonardo
$> awk -F= 'FNR==NR{a["{" $1 "}"]=$2; next} {for (i in a) gsub(i,a[i])}1' repl.txt file.txt
Hi John, it's Leonardo
The first block (FNR==NR) goes through the replacement file and stores each key-value pair in an array a, wrapping each key in { and }.
While reading the second file, we just replace each key with its value.
Update:
To do this without creating repl.txt you can use process substitution:
awk -F= 'FNR==NR{a["{" $1 "}"]=$2; next} {
for (i in a) gsub(i,a[i])} 1' <(( set -o posix ; set ) | grep -E '^(YOU|ME)=') file.txt

Bash print word after match [duplicate]

This question already has answers here:
Get string after character [duplicate]
(5 answers)
Closed 7 years ago.
I have a variable that stores the output of a file. Within that output, I would like to print the first word after Database:. I'm fairly new to regex, but this is what I've tried so far:
sed -n -e 's/^.*Database: //p' "$output"
When I try this, I am getting a sed: can't read prints_output: File name too long error.
Does sed only take in a filename? I am running a hive query to desc formatted table and storing the results in output like so:
output=`hive -S -e "desc formatted table"`
output is then set to the result of that:
...
# Detailed Table Information
Database: sample_db
Owner: sample_owner
CreateTime: Thu Feb 26 23:36:43 PDT 2015
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: maprfs:/some/location
Table Type: EXTERNAL_TABLE
Table Parameters:
...
Superficially, you should be using:
hive -S -e "desc formatted table" |
sed -n -e 's/^.*Database: //p'
This prints whatever follows Database: on each matching line, because the s command strips everything up to and including the match. When you've got that working, you can trim any unwanted material that remains on the line too.
Alternatively, you could use:
echo "$output" |
sed -n -e 's/^.*Database: //p'
Or, again, given that you're using Bash, you could use:
sed -n -e 's/^.*Database: //p' <<< "$output"
I'd use the first unless you need the whole output preserved for rescanning. Then I'd probably capture the output in a file (with tee):
hive -S -e "desc formatted table" |
tee output.log |
sed -n -e 's/^.*Database: //p'
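To keep only the first word after Database:, a capture group can drop anything that follows it. The sketch below assumes the output is already in a variable; the sample text is a shortened copy of the listing in the question and the variable name db is illustrative.

```shell
# Shortened stand-in for the hive output shown in the question.
output='# Detailed Table Information
Database: sample_db
Owner: sample_owner'

# Capture the first run of non-whitespace after "Database:" and discard
# the rest of the line; -n with the p flag prints matching lines only.
db=$(printf '%s\n' "$output" |
  sed -n 's/^.*Database:[[:space:]]*\([^[:space:]]*\).*$/\1/p')

printf '%s\n' "$db"
# sample_db
```

Piping the variable through printf avoids passing its contents to sed as a file name, which is what caused the original error.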
Try using grep -E (the modern spelling of egrep), where output_file stands for a file holding the saved output:
grep -Eoh 'Database:[[:blank:]]+[[:alnum:]_]+' output_file | awk '{print $2}'

Extract nth line from lines having specific word using sed or awk? [duplicate]

This question already has answers here:
Display only the n'th match of grep
(7 answers)
Closed 8 years ago.
hg tags output on terminal:
tip 596:bb834c42599f
P_SONY_1004.21.15 595:db10157b1515
P_PHILIPS_1000.21.2 590:67b7e71f76b4
P_SONY_1004.21.2 539:2b50e157e217
P_SONY_10015.21.2 533:15160fyafd88
P_creative.21.1 512:cdac14a00df4
P_SONY_1004.15.5 500:21affdf1bbfd
P_SONY_1002.15.5 466:a7bad21505ca
P_SONY_1002.15.15 424:efbe741500bb
P_creative.15.2 420:415c415a65fa
P_SONY_1004.15.1 414:24f1ab415c15
P_PHILIPS_1000.15.1 412:5d151556c288
P_SONY_1002.15.1 410:bf1f5af64ebb
P_SONY_1002.15.1 390:152e0f4ec815
P_creative.8.2 370:ecdc64f8a4b4
P_creative.8.1 350:5b8e81bd725a
P_creative.7.5 343:221d5c15efa6
P_creative.6.1 222:62115db1e015
From this output I have to extract the 2nd of the lines containing the word "creative".
I tried this:
hg tags | awk '/creative/{print $1;}'
Its output:
P_creative.21.1
P_creative.15.2
P_creative.8.2
P_creative.8.1
P_creative.7.5
P_creative.6.1
But i want only this as output:
P_creative.15.2
How can I change my command to get "P_creative.15.2" as output, and how can I use it in a shell script?
Also, can I extract "15.2" from it?
You can set c as you like to get the c-th match:
awk -v c=2 '/creative/{count++;}count==c{print $0;exit}' file
(The above will print the whole line)
To get first word:
awk -v c=2 '/creative/{count++;}count==c{print $1;exit}' file
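For use in a shell script, the counter approach can be wrapped in a function and the version suffix taken with parameter expansion. This is a sketch only: nth_match is a hypothetical helper, and the tag list is a shortened copy of the hg tags output from the question.

```shell
# Hypothetical helper: print the first field of the n-th line matching pat.
nth_match() {
  awk -v pat="$1" -v n="$2" '$0 ~ pat && ++count == n { print $1; exit }'
}

# Shortened stand-in for the output of `hg tags`.
tags='tip 596:bb834c42599f
P_SONY_1004.21.15 595:db10157b1515
P_creative.21.1 512:cdac14a00df4
P_creative.15.2 420:415c415a65fa
P_creative.8.2 370:ecdc64f8a4b4'

tag=$(printf '%s\n' "$tags" | nth_match creative 2)
version=${tag#*.}     # strip everything up to and including the first dot

printf '%s\n' "$tag" "$version"
# P_creative.15.2
# 15.2
```

In a real script the printf would be replaced by hg tags piped straight into nth_match.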
Like this:
awk '/creative/{if(++p==2){gsub(/.*creative./,"");print $1;exit}}' f
To assign the result to a variable, use command substitution, $( ):
VERSION=$(awk '/creative/{if(++p==2){gsub(/.*creative./,"");print $1;exit}}' f)
For the second instance:
hg tags | awk '/creative/{print $1;}' | head -n 2 | tail -n 1
More generally, where $n identifies the required instance:
n=2
hg tags | awk '/creative/{print $1;}' | head -n "$n" | tail -n 1
This might work for you (GNU sed):
sed -n '/creative/{x;/./{x;p;q};x;H}' file
Use the hold space as a flag. Can be adapted:
sed -nr '/creative/{H;x;/^(\n[^\n]*){5}/{x;p;q};x}' file
Finds the 5th such line.
Another way:
awk '/creative/{print $1;}' yourfile | sed -n '2p'
To get only version:
awk '/creative/{print $1;}' yourfile | sed -n '2s/[^.]*\.\(.*\)/\1/p'
Test:
$ awk '/creative/{print $1;}' file | sed -n '2p'
P_creative.15.2
$ awk '/creative/{print $1;}' file | sed -n '2s/[^.]*\.\(.*\)/\1/p'
15.2