How can I disable variable interpolation with the Perl substitution operator?

I'm trying to replace a particular line in a text file on VMS. Normally, this is a simple one-liner with Perl. But I ran into a problem when the replacement side was a symbol containing a VMS path. Here is the file and what I tried:
Contents of file1.txt:
foo
bar
baz
quux
Attempt to substitute 3rd line:
$ mysub = "disk$data1:[path.to]file.txt"
$ perl -pe "s/baz/''mysub'/" file1.txt
Yields the following output:
foo
bar
disk:[path.to]file.txt
quux
It looks like Perl was overeager and replaced the $data1 part of the path with the contents of a non-existent variable (i.e., nothing). Running with the debugger confirmed that. I didn't supply /e, so I thought Perl should just replace the text as-is. Is there a way to get Perl to do that?
(Also note that I can reproduce similar behavior at the linux command line.)

With just the right mix of quotes and double quotes you can get there:
We create a tight concatenation of "string" + 'substitute symbol' + "string".
The two double quoted strings contain single quotes for the substitute.
Contrived... but it works.
$ perl -pe "s'baz'"'mysub'"'" file1.txt
That's a DCL level solution.
For a Perl solution, use the q() operator for a non-interpolated string and execute that.
Stuff the symbol into the parens using simple DCL substitution.
This is my favorite because it (almost) makes sense to me and does not get too confusing with quotes.
$ perl -pe "s/baz/q(''mysub')/e" file1.txt

As ruakh discovered when reading my comment, the problem of perl interpolation can be solved by accessing the %ENV hash, rather than using a shell variable:
perl -pwe "s/baz/$ENV{mysub}/" file1.txt
Added -w because I do not believe in not using warnings even in one-liners.

Haven't even seen a VMS system in decades, but ... escape your sigil?
$ mysub = "disk\$data1:[path.to]file.txt"
or maybe
$ mysub = "disk\\$data1:[path.to]file.txt"
?

For sh or a derivative, I'd use
perl -pe'BEGIN { $r = shift(@ARGV) } s/baz/$r/' "$mysub" file1.txt
Otherwise, you have to somehow convert the value of mysub into a Perl string literal. That would be more complicated. Find the equivalent for your shell.
Edit by OP:
Yes, this works. The VMS equivalent is
perl -pe "BEGIN { $r = shift(@ARGV) } s/baz/$r/" 'mysub' file1.txt

You have to escape the $ with \$. Otherwise Perl sees a variable reference and replaces the string $data1 with the content of $data1 before the regular expression is evaluated. As you didn't define $data1 it is, of course, empty.

You can use the following code:
$ mysub='disk\$data1:[path.to]file.txt'
$ perl -pe 's/baz/'$mysub'/' FILE
foo
bar
disk$data1:[path.to]file.txt
quux

You could try using a logical name instead of a DCL symbol:
$ define mysub "disk$data1:[path.to]file.txt"
$ perl -pe "s/baz/mysub/" file1.txt

Use sed.
Sed won't try to interpolate $data1 as a variable, so you should get what you want.
$ sed -e "s/baz/''mysub'/" file1.txt

Related

How to use sed in shell script to replace all environment value occurrences with their current values

I would like to have a shell script to iterate over all the occurrences of environment variable names in a file and replace them with their current values. I am not sure how this can be done by using sed command.
The file content:
values:
value1:
name: "something"
value: "$ENV_VAR1" # this could be any variable name
value2:
name: "something"
value: "$ENV_VAR2"
...
First, I need to find all occurrences of any variable (Using regex "\$(.*?)" ). Then, somehow, I need to replace it with the variable value from the shell. I am not sure how I can use the sed command to achieve the second part as the variable name is specified in the file itself.
Something like the following command:
sed -i "s/\"\$(.*?)\"/${Some_How_Get_Var_Name}/g" file.yaml

This is a problem that comes up often. envsubst is commonly given as a solution, but I find it's easier to just stick with perl and do something like:
perl -pe 'while (my ($k, $v) = each %ENV) { s/\$$k/$v/g }'
This is almost certainly not a robust solution (it will replace $FOO, but it won't do replacements of the form ${FOO}), but I find I'm always disappointed that envsubst doesn't do ${FOO-bar}, and envsubst seems less ubiquitous than perl.
Or, rather than doing the replacement for everything in the environment, you might prefer something like:
perl -pe 's/\$([[:alpha:]_][_[:alnum:]]+)/$ENV{$1}/g'
or
perl -pe 's/\$([[:alpha:]_][\w]+)/$ENV{$1}/g'
These last two will replace '$FOO' with the empty string if FOO is not defined, while the first leaves it unreplaced. Which behavior you desire may drive the decision as to which to use.
I won't claim these are completely correct, but they are a reasonable approximation.

If you are using bash and the envsubst command is available, you can do:
envsubst < inputfile
E.g. (creating a temporary input to demonstrate it):
$ env | tail -2 | sed 's_^_$_'
$MANPATH=/home/linuxbrew/.linuxbrew/share/man:
$INFOPATH=/home/linuxbrew/.linuxbrew/share/info:
Then running this through envsubst:
$ env | tail -2 | sed 's_^_$_' | envsubst
/home/linuxbrew/.linuxbrew/share/man:=/home/linuxbrew/.linuxbrew/share/man:
/home/linuxbrew/.linuxbrew/share/info:=/home/linuxbrew/.linuxbrew/share/info:

This might work for you (GNU sed):
sed '/value:/{y/"/\n/;s/^.*/printf "&"/e;y/\n/"/}' file
On any line containing the string value: convert any "'s to newlines, use printf to convert the environmental variables to their real values and reconvert the introduced newlines back to "'s.
N.B. If the environmental variable can contain "'s, these will need to be quoted following the printf command, i.e. insert s/"/\\"/g before the last y command.

Linux CLI change price (awk or sed?)

I have price strings formatted as
$25.00
in various html files. I would like to use the Linux command line (BASH, presumably with awk or sed) to increase each price by a certain dollar amount ($3 in this case).
In short, I need to find $nn.00 and replace it with $(n+3)n.00
I started to put it together, but I don't know how to add 3:
sed -r 's/([^$][0-9][0-9][.]00). ????' file.html
Thanks!

Sample data:
$ cat prices_file.html
<p>$25.00</p><p>$78.00</p>
<p>$2.00</p>
<p>$101.00</p>
Solution with Perl:
$ perl -pi.bak -e 's/\$(\d+\.\d+)/sprintf("\$%.2f", $1 + 3)/eg' prices_file.html
After:
$ cat prices_file.html
<p>$28.00</p><p>$81.00</p>
<p>$5.00</p>
<p>$104.00</p>
The example above is one of the most common Perl use cases for substitution.
The -i.bak switch also backs up your original file (as prices_file.html.bak) in case the edit does something unwanted.
What is perhaps less common is the evaluation modifier (s///e), which lets you execute arbitrary Perl code in the replacement.
The global modifier (s///g) tells Perl to replace all occurrences on each line; without it, only the first price on each line would be replaced (in the sample data, $78.00 on the first line would be left unchanged).
In the sprintf("\$%.2f", $1 + 3) replacement, $1 refers to the captured group (\d+\.\d+).

sed replace exact match

I want to change some names in a file using sed. This is what the file looks like:
#! /bin/bash
SAMPLE="sample_name"
FULLSAMPLE="full_sample_name"
...
Now I only want to change sample_name & not full_sample_name using sed
I tried this
sed s/\<sample_name\>/sample_01/g ...
I thought \<> could be used to find an exact match, but when I use this, nothing is changed.
Adding '' helped to only change the sample_name. However there is another problem now: my situation was a bit more complicated than explained above since my sed command is embedded in a loop:
while read SAMPLE
do
name=$SAMPLE
sed -e 's/\<sample_name\>/$SAMPLE/g' /path/coverage.sh > path/new_coverage.sh
done < $1
So sample_name should be changed with the value attached to $SAMPLE. However when running the command sample_name is changed to $SAMPLE and not to the value attached to $SAMPLE.

I believe \< and \> work with gnu sed, you just need to quote the sed command:
sed -i.bak 's/\<sample_name\>/sample_01/g' file

In GNU sed, the following command works:
sed 's/\<sample_name\>/sample_01/' file
The only difference here is that I've enclosed the command in single quotes. Even when it is not necessary to quote a sed command, I see very little disadvantage to doing so (and it helps avoid these kinds of problems).
Another way of achieving what you want more portably is by adding the quotes to the pattern and replacement:
sed 's/"sample_name"/"sample_01"/' script.sh
Alternatively, the syntax you have proposed also works in GNU awk:
awk '{sub(/\<sample_name\>/, "sample_01")}1' file
If you want to use a variable in the replacement string, you will have to use double quotes instead of single, for example:
sed "s/\<sample_name\>/$var/" file
Variables are not expanded within single quotes, which is why you are getting the name of your variable rather than its contents.
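Applied to the loop from the question, this could look like the sketch below. It assumes GNU sed (for \< and \>); the directory, the sample list and the per-sample output names are hypothetical stand-ins for the real paths. Writing one output file per sample is a deliberate choice here, since redirecting every iteration to the same file would keep only the last result.

```shell
# Hypothetical working files standing in for /path/coverage.sh and the sample list.
workdir=$(mktemp -d)
printf 'SAMPLE="sample_name"\nFULLSAMPLE="full_sample_name"\n' > "$workdir/coverage.sh"
printf 'sample_01\nsample_02\n' > "$workdir/samples.txt"

while IFS= read -r SAMPLE; do
    # Double quotes let the shell expand $SAMPLE before sed sees the command;
    # \< and \> pass through unchanged because the shell only treats a
    # backslash specially before $, `, ", \ and newline inside double quotes.
    sed -e "s/\<sample_name\>/$SAMPLE/g" "$workdir/coverage.sh" \
        > "$workdir/coverage_$SAMPLE.sh"
done < "$workdir/samples.txt"

first_out=$(cat "$workdir/coverage_sample_01.sh")
printf '%s\n' "$first_out"
rm -rf "$workdir"
```

Note that a sample name containing / or & would need escaping, because sed treats both specially in the replacement.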

@user1987607
You can do this the following way:
sed s/"sample_name">/sample_01/g
where having "sample_name" in quotes " " matches the exact string value.
/g is for global replacement.
If "sample_name" occurs like this ifsample_name and you want to replace that as well
then you should use the following:
sed s/"sample_name ">/"sample_01 "/g
So that it replaces only the desired word. For example the above syntax will replace word "the" from a text file and not from words like thereby.
If you are interested in replacing only the first occurrence, then this would work fine:
sed s/"sample_name"/sample_01/
Hope it helps

How can I use perl/awk/sed to search for all occurrences of text wrapped in quotes within a file and then delete them?

How can I use perl, awk, or sed to search for all occurrences of text wrapped in quotes within a file, and print the result of deleting those occurrences from the file? I do not want to actually alter the file, but simply print the result of altering the file like sed does.
For example, say the file contains the following :
data|more data|"not important"|"more unimportant stuff"
I need it to print out:
data|more data||
But I want to leave the file intact. I tried using sed but I could not get it to accept regexs.
I have tried something like this:
sed -e 's/\<["]+[^"]*["]+\>//g' file.txt
but it does nothing and prints the original file.
Any Thoughts?

Using a perl one-liner:
perl -pe 's/".*?"//g' file
Explanation:
Switches:
-p: Creates a while(<>){...; print} loop for each line in your input file.
-e: Tells perl to execute the code on command line.

You seem to have a few extra characters in your sed command.
sed -e 's/"[^"]*"//g' file.txt
Input:
"quoted text is here" but not quoted there
never more
"hello world" foo bar
data|more data|"not important"|"more unimportant stuff"
Output:
but not quoted there
never more
foo bar
data|more data||

echo 'data|more data|"not important"|"more unimportant stuff"' | sed -E 's/"[^"]*"//g'
You don't need to declare a character class (brackets) for only one character...

my $cnt=qq(data|more data|"not important"|"more unimportant stuff");
my @arr = $cnt =~ m{(?:^|\|)([^"][^\|]*[^"])(?=\||$)}ig;
print "@arr";
This code might help you..

Why this single and double quote make so much difference in output

Why do the outputs of these two commands differ?
cat config.xml|perl -ne 'print $1,"\n" if /([0-9\.]+):161/'
cat config.xml|perl -ne "print $1,"\n" if /([0-9\.]+):161/"
The first works as expected, printing the matched group, while the second prints the whole line.

I see two main things wrong with your command.
First off, double quotes allow shell interpolation, and $1 will be taken for a shell variable and replaced. Since it is unlikely to exist, it will be replaced with an empty string. So instead of print $1, you get print, which is shorthand for print $_, and is probably why the entire line prints.
Second, you have unescaped double quotes inside your command, so you are in fact passing three strings to Perl:
print ,
\n
if /(....)/
As for why or how this works with your shell, I don't know, since I do not have access to your OS, nor know which one it is. In Windows, I get a Perl bareword warning for n (Unquoted string "n" may clash with future reserved word at -e line 1.) which means that the \n is interpreted as a string. Now, here's the tricky part. What we get is this:
print , \n if /.../
Which means that \n is no longer an argument to print, it is a statement that comes after print and it is in void context, so it gets ignored. We can see this by this warning (which I had to fake in my shell):
Useless use of single ref constructor in void context at -e line 1.
(Note that you do not get these warnings as you do not use warnings -- the -w switch)
So what we are left with is
print if /.../
Which is exactly the code for the behaviour you described: It prints the whole line when a match is found.
What you can do to visualize the problem in your shell is add the -MO=Deparse switch to your one-liner, as shown here:
C:\perl>perl -MO=Deparse -ne"print ,"\n" if /a/"
LINE: while (defined($_ = <ARGV>)) {
print($_), \'n' if /a/;
}
-e syntax OK
Now we can clearly see that the print statement is separated from the newline, and that the newline is a reference to a string.
Solution:
However, your code has other problems, and if done right you can avoid all the shell difficulties. First, you have a UUOC (Useless Use of Cat). A file argument can be given to perl when using the -n switch on the command line. Secondly, you do not need to use variables for this, you can simply print the return value of your regex:
perl -nlwe 'print for /(...)/' config.xml
The -l switch will handle newlines for you, and in this case add newline to the print. The for is necessary to avoid printing empty matches.

Inside double quotes, some things are substituted by the shell ($variable, `command`, ...), while inside single quotes they are left as is.
$ echo "$HOME"
/home/falsetru
$ echo '$HOME'
$HOME
$ echo "`echo 1`"
1
$ echo '`echo 1`'
`echo 1`
Nested quotes:
$ echo ""hello""
hello
$ echo '"hello"'
"hello"
$ echo "\"hello\""
"hello"
Escape double quotes, $ to get same result:
cat config.xml | perl -ne "print \$1,\"\n\" if /([0-9\.]+):161/"

Two things:
Nested quotes.
Variables expand differently.
The first command has one string that happens to contain some double quotes. The variable is not expanded.
The second command has two strings with an unquoted \n in between. The variable is expanded.
Let's say $1 contains "blah"
The first passes this string to perl:
print $1,"\n" if /([0-9\.]+):161/
the second, this:
print blah,\n if /([0-9\.]+):161/
