Using SED to replace a pattern causes extra characters to appear - regex

I have a file and want to replace a pattern with another string.
#PBS -N bench1-M1-plt2-size15
#PBS -o /home/results/bench1-M1-plt2-size15.out
./run -d ./results/ --config=bench1-M1-plt2-size15.conf \
--results=bench1-M1-plt2-size15.res config/myConfig.txt -F 2000000
I use this command
sed 's/M1-[^-]*-[^-]/M1-plt32-size10/g' filename
However the output file looks like this:
#PBS -N M1-plt32-size100
#PBS -o /home/mahmood/gem5/results/M1-plt32-size100.out
./run -d ./results/ --config=bench1-M1-plt32-size100.conf \
--results=bench1-M1-plt32-size100.res config/myConfig.txt -F 2000000
Please note an extra '0' character after "size10". As you can see, in the SED command, size is set to 10, however the output file is "size100"
What is the problem and how can I fix that?

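Your regex is not correct. The final [^-] matches exactly one non-hyphen character, so everything after that character in the old name is left in place and shows up after the replacement. I believe this is what you intended: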
sed 's/M1-[^-]*-[^\.]*/M1-plt32-size10/' filename
OUTPUT:
#PBS -N bench1-M1-plt32-size10
#PBS -o /home/results/bench1-M1-plt32-size10.out
./run -d ./results/ --config=bench1-M1-plt32-size10.conf \
--results=bench1-M1-plt32-size10.res config/myConfig.txt -F 2000000

Related

How do I replace the second occurrence of a whitespace in each line with 'sed' or 'awk'?

I have a file hashes which has many lines that look like this:
wget https://ipfs.io/ipfs/QmbKi6XiMmf4YfvKXhqVPymD1HDwJ3WqukjyLuEvnrZrCz The_Supremes_-_My_World_Is_Empty_Without_You_(lyrics).mkv
All the lines in hashes will follow the pattern:
wget https://ipfs.io/ipfs/hashthatis46characterlong nameOfAfileWithoutSpaces
as they are written by my script with the following lines of code:
find ~/pCloudDrive/VisualArts/Films/Fiction_Movies -maxdepth 1 -type f -size +200M -exec ipfs add --nocopy {} \;>>~/CS/ipfs/hashes && \
sed -i 's;added ;wget https://ipfs.io/ipfs/;g' ~/CS/ipfs/hashes
All hashes are going to be 46-character long and they typically start with 'Qm' but this may not necessarily be
the case in the future.
I want to replace the second space of each line of this file with ' -O ' so that it looks like:
wget https://ipfs.io/ipfs/hashthatis46characterlong -O nameOfAfileWithoutSpaces
I tried sed 's/[0-9A-z]{46,46}\s/& -O /g' hashes but to no avail - I get the following output:
sed: -e expression #1, char 27: Invalid range end
How do I do this? Would awk present a better solution for this problem than sed?
Using GNU awk and gensub() to change the second occurrence on each record:
$ awk '{print gensub(/ /," -O ","2")}' file
For example:
$ echo 1 2 3 4 5 | awk '{print gensub(/ /," -O ","2")}'
1 2 -O 3 4 5
As simple as this
sed 's/ / -O /2' input
where the trailing 2 in the sed command means "the second occurrence".
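As a quick check, the occurrence flag can be tried on the pattern line from the question. The names line and result are just illustrative.

```shell
# Pattern line from the question: exactly two spaces, no spaces in the file name.
line='wget https://ipfs.io/ipfs/hashthatis46characterlong nameOfAfileWithoutSpaces'

# The trailing 2 tells sed to replace only the second match of the pattern.
result=$(printf '%s\n' "$line" | sed 's/ / -O /2')

printf '%s\n' "$result"
```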
Since nameOfAfileWithoutSpaces contains no spaces, you can also get the desired result another way using GNU sed, namely:
s/\([^[:space:]]*\)$/-O \1/
This captures the run of non-whitespace characters at the end of the line ($) and replaces it with -O followed by those same characters. I tested it using sed.js.org; for the input
wget https://ipfs.io/ipfs/hashthatis46characterlong nameOfAfileWithoutSpaces
wget https://ipfs.io/ipfs/hashthatis46characterlong anotherName
output is
wget https://ipfs.io/ipfs/hashthatis46characterlong -O nameOfAfileWithoutSpaces
wget https://ipfs.io/ipfs/hashthatis46characterlong -O anotherName
Another awk:
$ awk '{$3="-O" OFS $3}1' file
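For completeness, the original attempt can probably be repaired as well. In basic regular expressions {46,46} is taken literally unless the braces are escaped, and the range A-z depends on the locale's collation order, which is a likely source of the "Invalid range end" message. A minimal sketch, assuming a sed that supports -E (GNU and BSD sed do) and hashes made only of letters and digits:

```shell
# Sample line from the question; the hash is 46 alphanumeric characters.
line='wget https://ipfs.io/ipfs/QmbKi6XiMmf4YfvKXhqVPymD1HDwJ3WqukjyLuEvnrZrCz The_Supremes_-_My_World_Is_Empty_Without_You_(lyrics).mkv'

# -E enables extended syntax, so {46} is an interval.
# & re-inserts the match (the hash plus the space after it), then "-O " follows.
result=$(printf '%s\n' "$line" | sed -E 's/[[:alnum:]]{46} /&-O /')

printf '%s\n' "$result"
```

This keys on the hash length rather than on the position of the space, so it only helps if every hash really is 46 characters long.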

grep parts of string with match and not match?

I have a log file which contains string errorCode:null or errorCode:404 etc. I want to find all occurrences where errorCode is not null.
I use:
grep -i "errorcode" -A 10 -B 10 app.2020-.* | grep -v -i "errorCode:null"
grep -v -i 'errorcode:[0-9]^4' -A 10 -B 10 app.2020-.*
but this is not the right regex match. How could this be done?
If you have GNU grep, use a negative lookahead with the -P option:
grep -Pi 'errorcode(?!:null)' -A 10 -B 10 app.2020-.*
If you don't have GNU grep, try this awk (tolower makes the match case-insensitive, like grep -i):
awk 'tolower($0) ~ /errorcode/ && tolower($0) !~ /errorcode:null/' app.2020-.*
Matching the equivalent of grep's -A 10 -B 10 options would require a bit more awk code.
I have no idea why you are using -A 10 -B 10, I've just used the following command and everything is working fine:
grep -i "ErrorCode" test.txt | grep -v -i "Errorcode:null"
This is the file content:
Errorcode:null
Errorcode:405
Errorcode:504
Nonsens
This is the output of the command:
Errorcode:405
Errorcode:504
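If the non-null codes are always numeric, as the errorCode:404 example suggests, a single grep without -P may be enough. This is only a sketch under that assumption; the temporary file and its contents are made up for the demonstration, and -A 10 -B 10 can be added back when context lines are wanted.

```shell
# Hypothetical log content standing in for app.2020-.*
log=$(mktemp)
printf '%s\n' 'errorCode:null' 'errorCode:404' 'other line' 'ErrorCode:500' > "$log"

# -E for extended syntax, -i to ignore case; only numeric codes match,
# so errorCode:null is never selected in the first place.
matches=$(grep -Ei 'errorcode:[0-9]+' "$log")

printf '%s\n' "$matches"
rm -f "$log"
```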

Sed command to search by regex in file

I need to get the version number from a file. My version file looks like this:
#define MINOR_VERSION_NUMBER 1
I tried to use this sed command:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
but I get error:
sed: -e expression #1, char 2: extra characters after command
The "address" that selects matching lines needs to be enclosed in /.../ (or \X...X for any X).
sed -ne '/MINOR_VERSION_NUMBER/{ s/[^0-9]*\([0-9][0-9]*\).*/\1/;p; }'
Don't use -i, it changes the file in place and doesn't output anything.
The more common way would be to use awk to find the line and extract the wanted column:
awk '(/MINOR_VERSION_NUMBER/){print$3}'
Using grep:
grep MINOR_VERSION_NUMBER | grep -o '[0-9]*$'
Demo:
$ echo "#define MINOR_VERSION_NUMBER 1" | grep -o '[0-9]*$'
1
$ echo "#define MINOR_VERSION_NUMBER 1123" | grep -o '[0-9]*$'
1123
Here is a correction of your attempt. Change your line:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
into:
VERSION_MINOR=`sed -n -e '/^#define\s\+MINOR_VERSION_NUMBER\s\+\([0-9]\+\).*/ s//\1/p' $WORKSPACE/project/common/version.h`
This can be made more readable with GNU sed's -r option:
VERSION_MINOR=`sed -n -r -e '/^#define\s+MINOR_VERSION_NUMBER\s+([0-9]+).*/ s//\1/p' $WORKSPACE/project/common/version.h`
As stated by choroba, awk would be more suited than sed for this kind of processing (see his answer).
However, here is another solution using bash's read builtin, together with GNU grep:
read x x VERSION_MINOR x < <(grep -F -w -m1 MINOR_VERSION_NUMBER $WORKSPACE/project/common/version.h)
VERSION_MINOR=$(echo "#define MINOR_VERSION_NUMBER 1" | tr -s ' ' | cut -d' ' -f3)
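If GNU extensions such as \s and \+ are not available, the same extraction can be sketched with POSIX bracket classes and intervals. The header file below is a made-up stand-in for $WORKSPACE/project/common/version.h, and the multi-digit value is only there to show that the whole number is captured.

```shell
# Hypothetical header standing in for version.h
header=$(mktemp)
printf '%s\n' '#define MAJOR_VERSION_NUMBER 2' '#define MINOR_VERSION_NUMBER 17' > "$header"

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded. \{1,\} is the POSIX BRE spelling of "one or more".
VERSION_MINOR=$(sed -n 's/^#define[[:space:]]\{1,\}MINOR_VERSION_NUMBER[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p' "$header")

echo "$VERSION_MINOR"
rm -f "$header"
```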

sed string replace is giving some kind of warning?

I am using sed with the grep command to replace a string. The old string occurs in 8 files in my home location, and I want to replace every occurrence with the new string. I am using this:
#! /bin/bash
read oldstring
read newstring
sed -i -e 's/'Soldstring'/'$newstring'/' grep "$oldstring" /home/*
Now this command works but I am getting a warning:
sed: can't read grep: No such file or directory
sed: can't read oldstring: No such file or directory
Any ideas?
You probably wanted
sed -i -e "s|$oldstring|$newstring|" $(grep -l "$oldstring" /home/*)
However that form is unsafe. Better use xargs:
grep -l "$oldstring" /home/* | xargs sed -i -e "s|$oldstring|$newstring|"
Another option, if your bash supports it, is to store the file names in an array:
readarray -t files < <(exec grep -l "$oldstring" /home/*)
sed -i -e "s|$oldstring|$newstring|" "${files[@]}"
You are not executing grep; you are passing it to sed as an argument.
Are you missing backticks?
sed -i -e 's/'Soldstring'/'$newstring'/' `grep "$oldstring" /home/*`
sed -i -e "s/$oldstring/$newstring/g" `grep -l "$oldstring" /home/*`
Just in order to clearly point out the various typos in your code:
#! /bin/bash
# ^
# extra space here (not really an error I think -- but unusual)
read oldstring
read newstring
sed -i -e 's/'Soldstring'/'$newstring'/' grep "$oldstring" /home/*
#             ^                          ^                        ^
# `S` instead of `$` here                |                        |
#                                        here        and        there
#                                        missing backticks (`)
As a side note, I suggest backticks above, but, since you are using bash, the syntax $(grep ....) is probably better than the classic Bourne Shell syntax `grep ....`. Finally, as suggested by konsolebox, "command nesting" might be unsafe, for example, in this case, if some file names contain spaces.
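One way to sidestep the file name problem entirely is to loop over the glob instead of nesting commands. The following is only a sketch: the temporary directory stands in for /home/*, the sample strings are invented, and a temporary file replaces sed -i so that it does not depend on GNU sed. As in the commands above, sed treats the old string as a regular expression, so characters such as | or & in either string would need escaping.

```shell
# Hypothetical sandbox standing in for /home/*
dir=$(mktemp -d)
printf 'foo bar\n' > "$dir/with space.txt"
printf 'nothing here\n' > "$dir/other.txt"
oldstring='foo'
newstring='baz'

for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # Only rewrite files that actually contain the old string.
    if grep -q -- "$oldstring" "$f"; then
        sed "s|$oldstring|$newstring|g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    fi
done

changed=$(cat "$dir/with space.txt")
untouched=$(cat "$dir/other.txt")
printf '%s\n' "$changed" "$untouched"
rm -rf "$dir"
```

Because "$f" is quoted everywhere, a file name containing spaces is passed to grep and sed as a single argument.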

Find parameters beginning with a dash in a string

Assume one has a string containing parameters:
echo "-v foo -d --print bar-foo ba-z fOo"
How can one get parameters beginning with a dash?
-v -d --print
An alternative:
STR="-v foo -d --print bar-foo ba-z fOo"
echo "$STR" | egrep -o -e "(^| )+--?[^ ]+" | sed -e 's/ //g'
Will output:
-v
-d
--print
If you want to parse options passed to your script, you should consider using getopt.
References:
example of how to use getopts in bash
$ str="-v foo -d --print bar-foo ba-z"
$ for i in $str; do test ${i::1} = - && echo $i; done
-v
-d
--print
Note this is an instance where you must not quote the variable, since you want word splitting to occur. (That is, do not write for i in "$str")
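The same loop can be sketched without the bash-only ${i::1} expansion by using a case pattern, which should work in any POSIX shell. The variable opts is an illustrative name; it collects the matches on one line, as in the desired output from the question.

```shell
str='-v foo -d --print bar-foo ba-z fOo'
opts=''

# $str is deliberately unquoted so that the shell splits it into words.
for word in $str; do
    case $word in
        # Keep only words that start with a dash, separated by single spaces.
        -*) opts="${opts:+$opts }$word" ;;
    esac
done

echo "$opts"
```

Words such as bar-foo are skipped because the pattern -* only matches a dash in the first position.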