Sed finds backslash but doesn't replace it (escape character problem)? [duplicate] - regex

This question already has answers here:
Is it possible to escape regex metacharacters reliably with sed
(4 answers)
Closed 2 years ago.
I want to replace \emph{G. fortis} with \emph{G. fortis}\index{\emph{Geospiza fortis}} to add terms to an index in a TeX document. I have a list of words in a file called gfortis that I will feed to the sed command in a while read -r loop.
PREFTXt="\emph{G. fortis}" # text to search
REPLACETXT="$PREFTXt\index{\emph{Geospiza fortis}}" # text to replace
sed -e "s/${PREFTXt}/${REPLACETXT}/" path/chapt1.tex
The result is this:
\emph{G. fortis}index{emph{Geospiza fortis}}
But it should be:
\emph{G. fortis}\index{\emph{Geospiza fortis}}
The final command looks like this:
while read -r RP
do
echo "Adding $RP to the index"
PREFTXt="$RP"
ADDTXt="\index{\emph{Geospiza fortis}}"
REPLACETXT="$PREFTXt$ADDTXt"
echo "Replaced $RP with $REPLACETXT"
sed -e "s/${PREFTXt}/${REPLACETXT}/" path/chapt1.tex # should replace the text within this file.
done < path/words_index/gfortis # input the words file to replace with a certain \index command
The chapt1.tex file contains this:
\chapter{Another chapter in the wall}
NICE other\index{other} to be added to the index.
\emph{Geospiza fortis}
All of the stuff that I put here shall be into the index.
\emph{Geospiza fortis}
This index will be gigantic, but I won't be making multiple indexes.
\emph{G. fortis}
Other cool stuff here
\emph{G. fortis}
I'm using bash in macOS Mojave

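sed needs to see a double backslash instead of a single one. sed treats a backslash as an escape character in both the pattern and the replacement, so "\e" and "\i" are reduced to "e" and "i". Note that inside double quotes bash itself turns \\ into \, so either write the text in single quotes with \\ or use extra backslashes inside double quotes.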
To avoid dealing with the shell's quoting rules, you could put the sed-escaped text in a file, for instance preftxt.cfg, and do something similar for the replacement text.
The file would contain
\\emph{G. fortis}
and you could use it like this:
PREFTXt="$(cat preftxt.cfg)"

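Use three backslashes (\\\) instead of one. Inside double quotes bash turns \\ into \ and leaves \i alone, so sed receives \\index, which it writes out as a literal \index: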
REPLACETXT="$PREFTXt\\\index{\\\emph{Geospiza fortis}}"
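The manual escaping above can also be done programmatically. The following is a minimal sketch, not the original posters' code: the helper names escape_pattern and escape_replacement are hypothetical, and it assumes single-line strings, basic regular expressions, and / as the delimiter of the final s command.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post).

# Escape BRE metacharacters and the "/" delimiter so sed matches the text literally.
escape_pattern() {
    printf '%s\n' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Escape the characters that are special in the replacement part: \ / &
escape_replacement() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

search='\emph{G. fortis}'
add='\index{\emph{Geospiza fortis}}'

pat=$(escape_pattern "$search")
rep=$(escape_replacement "$search$add")

# A single sample line stands in for path/chapt1.tex here.
result=$(printf '%s\n' '\emph{G. fortis}' | sed "s/$pat/$rep/")
printf '%s\n' "$result"
```

Single quotes keep the shell from touching the backslashes, so only sed's own escaping rules have to be handled.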


Regexp delete using sed works on regex101 but not with sed [duplicate]

This question already has answers here:
sed multiline delete with pattern
(2 answers)
Closed 2 years ago.
I need to remove strings from a text file that matches a REGEX pattern, using regex101 my pattern match works fine, but when I execute using sed, nothing gets deleted and for some reason the regex is not working:
https://regex101.com/r/oLNrDB/1/
I need to remove all occurrences of all text including newlines between the following 2 strings:
DELIMITER ;;
some text with newlines
DELIMITER ;
The sed command I am using is:
sed '/DELIMITER ;;[\S\s]*?DELIMITER ;/d' myfile.sql;
but the output is identical to the input file, what am I doing wrong ?
The problem here is that sed reads files line-by-line and applies the pattern to each line separately. In your case, this means that the one pattern can't match both the starting and finishing delimiter because no one line contains them both.
The sed way of doing this is to use a range with the delete command, /start pattern/,/end pattern/d, which means delete all lines between the start pattern and end pattern inclusive. For example
$ cat foo.txt
some text before
DELIMITER ;;
some text with newlines
DELIMITER ;
some text after
$ sed '/DELIMITER ;;/,/DELIMITER ;/d' foo.txt
some text before
some text after
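As a further sketch of the same range technique, the patterns can be anchored with ^ and $ so that only lines consisting of exactly the delimiter statements start or end a range. The SQL content below is made-up sample data, and the file contains two blocks to show that every range is deleted.

```shell
#!/bin/sh
# Made-up sample input with two DELIMITER blocks.
sql=$(mktemp)
cat > "$sql" <<'EOF'
CREATE TABLE t (id INT);
DELIMITER ;;
CREATE TRIGGER a BEFORE INSERT ON t FOR EACH ROW SET NEW.id = 1;;
DELIMITER ;
INSERT INTO t VALUES (1);
DELIMITER ;;
CREATE TRIGGER b BEFORE UPDATE ON t FOR EACH ROW SET NEW.id = 2;;
DELIMITER ;
SELECT id FROM t;
EOF

# Delete every range from a "DELIMITER ;;" line through the next "DELIMITER ;" line.
result=$(sed '/^DELIMITER ;;$/,/^DELIMITER ;$/d' "$sql")
printf '%s\n' "$result"
rm -f "$sql"
```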

unix regex: match from end of string until white space character [duplicate]

This question already has answers here:
regex to match a single character that is anything but a space
(3 answers)
Closed 2 years ago.
The git command:
git diff-tree --no-commit-id --name-status -r <commit hash>
generates a list that looks kinda like this:
D path/to/deleted/file.txt
A path/to/added/file.txt
A path/to/added/file.asd
M path/to/modified/file.txt
I want to grep out only the added and modified (A or M) txt files and their paths. I know I can do like this:
grep -v "^D"
to not include the deleted files.
and
grep -o "\w*.txt$"
to only get the txt files. But this command does not give me the path to the files. Since \w only matches the word. Is there any other wildcard that will match until the whitespace character (so that it removes the A/M with corresponding whitespace)?
Use \S to match anything that isn't whitespace.
grep -o '\S*\.txt$'
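Alternatively, awk can print the last field of every added or modified line (this keeps all file types, not only .txt):
awk '/^[AM]/ {print $NF}'
path/to/added/file.txt
path/to/added/file.asd
path/to/modified/file.txt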
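To apply both conditions from the question at once (status A or M, and a .txt extension), a single awk filter is one option. This is a sketch under the assumption that paths contain no whitespace; the printf call only simulates the git diff-tree output.

```shell
#!/bin/sh
# Simulated git diff-tree output: status letter, tab, path.
input=$(printf '%s\t%s\n' \
    D path/to/deleted/file.txt \
    A path/to/added/file.txt \
    A path/to/added/file.asd \
    M path/to/modified/file.txt)

# Keep lines whose status is A or M and whose last field ends in .txt,
# and print only the path.
result=$(printf '%s\n' "$input" | awk '$1 ~ /^[AM]$/ && $NF ~ /\.txt$/ { print $NF }')
printf '%s\n' "$result"
```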

sed escape user input string [duplicate]

This question already has answers here:
Escape a string for a sed replace pattern
(17 answers)
Closed 7 years ago.
I am using sed for string replacement in a config file.
User has to input the string salt and then I replace this salt string in the config file:
Sample config file myconfig.conf
CONFIG_SALT_VALUE=SOME_DUMMY_VALUE
I use the command to replace dummy value with value of salt entered by the user.
sed -i "s/^CONFIG_SALT_VALUE.*/CONFIG_SALT_VALUE=$salt/g" ./myconfig.conf
Issue : value of $salt can contain any character, so if $salt contains / (like 12d/dfs) then my above sed command breaks.
I can change the delimiter to !, but then if $salt contains ! (like amgh!fhf) my sed command will break again.
How should I approach this problem?
You can use almost any character as sed delimiter. However, as you mention in your question, to keep changing it is fragile.
Maybe it is useful to use awk instead, doing a little bit of parsing of the line:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="}
$1 == "CONFIG_SALT_VALUE" {$2=repl}
1' "$salt" file
As one liner:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="} $1 == "CONFIG_SALT_VALUE" {$2=repl}1' "$salt" file
This sets = as field separator. Then, it checks when a line contains CONFIG_SALT_VALUE as parameter name. When this happens, it replaces the value to the one given.
To prevent values in $salt like foo\\bar from being interpreted, as that other guy commented in my original answer, we have the trick:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""} ...' "$var" file
This uses the answer in How to use variable including special symbol in awk? where Ed Morton says that
The way to pass a shell variable to awk without backslashes being
interpreted is to pass it in the arg list instead of populating an awk
variable outside of the script.
and then
You need to set ARGV[1]="" after populating the awk variable to
avoid the shell variable value also being treated as a file name.
Unlike any other way of passing in a variable, ALL characters used in
a variable this way are treated literally with no "special" meaning.
This does not do in-place editing, but you can redirect to another file and then replace the original:
awk '...' file > tmp_file && mv tmp_file file
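Putting the pieces together, a complete run might look like the sketch below. The salt value and the extra OTHER_SETTING line are made up for illustration, and a temporary file stands in for myconfig.conf.

```shell
#!/bin/sh
# Made-up salt containing characters that would break the sed version.
salt='12d/dfs!a&b\c'

# Temporary stand-in for myconfig.conf.
conf=$(mktemp)
printf '%s\n' 'OTHER_SETTING=1' 'CONFIG_SALT_VALUE=SOME_DUMMY_VALUE' > "$conf"

# Pass the salt through ARGV so awk takes it literally, then replace the file.
tmp=$(mktemp)
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="}
$1 == "CONFIG_SALT_VALUE" {$2=repl}
1' "$salt" "$conf" > "$tmp" && mv "$tmp" "$conf"

result=$(cat "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

Only the second field is replaced, so this assumes the old value itself contains no = character.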

Replace string variable with string variable using Sed [duplicate]

This question already has answers here:
"sed" special characters handling
(3 answers)
Is it possible to escape regex metacharacters reliably with sed
(4 answers)
Escape a string for a sed replace pattern
(17 answers)
Closed 5 years ago.
I have a file called ethernet containing multiple lines. I have saved one of these lines as a variable called old_line. The contents of this variable looks like this:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="2r:11:89:89:9g:ah", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
I have created a second variable called new_line that is similar to old_line but with some modifications in the text.
I want to substitute the contents of old_line with the contents of new_line using sed. So far I have the following, but it doesn't work:
sed -i "s/${old_line}/${new_line}/g" ethernet
You need to escape old_line so that sed treats its regex special characters literally. Luckily, this can be done with sed itself. Note that this does not escape / in old_line, nor anything in new_line.
old_line=$(echo "${old_line}" | sed -e 's/[]$.*[\^]/\\&/g' )
sed -i -e "s/${old_line}/${new_line}/g" ethernet
Your sed command fails because ${old_line} contains regex metacharacters such as * and .
Use this awk command instead, which uses no regex and prints every line, replacing the first occurrence of old where it is found:
awk -v old="$old_line" -v new="$new_line" '(p = index($0, old)) {
$0 = substr($0, 1, p-1) new substr($0, p+length(old)) } 1' ethernet
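A variation on the awk approach, sketched here with made-up sample lines: the strings are passed through the environment (ENVIRON) so that awk does not process backslash escapes in them, as it does for -v assignments, and a loop replaces every occurrence on a line. The environment variable names OLD and NEW are arbitrary.

```shell
#!/bin/sh
old_line='KERNEL=="eth*", NAME="eth1"'
new_line='KERNEL=="eth*", NAME="eth0"'

# Temporary stand-in for the ethernet file.
file=$(mktemp)
printf '%s\n' \
    'ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"' \
    'ATTR{type}=="1", KERNEL=="wlan*", NAME="wlan0"' > "$file"

result=$(OLD="$old_line" NEW="$new_line" awk '
BEGIN { old = ENVIRON["OLD"]; new = ENVIRON["NEW"] }
length(old) > 0 {
    out = ""; rest = $0
    # index() does plain substring search, so no character is special.
    while ((p = index(rest, old)) > 0) {
        out = out substr(rest, 1, p - 1) new
        rest = substr(rest, p + length(old))
    }
    $0 = out rest
}
{ print }' "$file")
printf '%s\n' "$result"
rm -f "$file"
```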

Matching pattern containing parentheses with sed [duplicate]

This question already has answers here:
Whether to escape ( and ) in regex using GNU sed
(4 answers)
Closed 4 years ago.
I need to insert '--' at the beginning of the line if line contains word VARCHAR(1000)
Sample of my file is:
TRIM(CAST("AP_RQ_MSG_TYPE_ID" AS NVARCHAR(1000))) AP_RQ_MSG_TYPE_ID,
TRIM(CAST("AP_RQ_PROCESSING_CD" AS NVARCHAR(1000)))
AP_RQ_PROCESSING_CD, TRIM(CAST("AP_RQ_ACQ_INST_ID" AS NVARCHAR(11)))
AP_RQ_ACQ_INST_ID, TRIM(CAST("AP_RQ_LOCAL_TXN_TIME" AS NVARCHAR(10)))
AP_RQ_LOCAL_TXN_TIME, TRIM(CAST("AP_RQ_LOCAL_TXN_DATE" AS
NVARCHAR(10))) AP_RQ_LOCAL_TXN_DATE, TRIM(CAST("AP_RQ_RETAILER" AS
NVARCHAR(11))) AP_RQ_RETAILER,
I used this command
sed 's/\(^.*VARCHAR\(1000\).*$\)/--\1/I' *.sql
But the result is not as expected.
Does anyone have idea what am I doing wrong?
This should do:
sed 's/.*VARCHAR(1000).*/--&/' file
The problem in your sed command is in the regex. By default sed uses BRE, in which ( and ) are literal parentheses. Escaping them as \( and \) gives them a special meaning: regex grouping. So the parentheses wrapping 1000 must not be escaped.
The first \( and the last \) you escaped correctly, since you want a group to reference later as \1. So your problem comes down to when to escape and when not to. :)
Use the following sed command:
sed '/VARCHAR(1000)/ s/.*/--&/' *.sql
The s command applies only to lines containing VARCHAR(1000). It replaces the whole line (.*) with itself (&), with -- in front.
Through awk,
awk '/VARCHAR\(1000\)/ {sub (/^/,"--")}1' infile > outfile
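The escaping rule flips between basic and extended regular expressions, which the following sketch illustrates on two lines adapted from the sample file. It assumes a sed that supports the -E option for extended syntax.

```shell
#!/bin/sh
input=$(printf '%s\n' \
    'TRIM(CAST("AP_RQ_MSG_TYPE_ID" AS NVARCHAR(1000))) AP_RQ_MSG_TYPE_ID,' \
    'TRIM(CAST("AP_RQ_ACQ_INST_ID" AS NVARCHAR(11))) AP_RQ_ACQ_INST_ID,')

# BRE (default): unescaped ( and ) are literal characters.
bre=$(printf '%s\n' "$input" | sed 's/.*VARCHAR(1000).*/--&/')

# ERE (-E): ( and ) group, so literal parentheses must be escaped.
ere=$(printf '%s\n' "$input" | sed -E 's/.*VARCHAR\(1000\).*/--&/')

printf '%s\n' "$bre"
```

Both commands produce the same output: only the line containing VARCHAR(1000) gets the -- prefix.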