This is my one-line Perl command in a Bash script. How do I get the s/// substitution to match across multiple lines?
#!/bin/bash
file=$1
echo "processing $file"
perl -0777pe 's/.*<script[ |>].*<\/script>/<script> \.\.\. <\/script>/g' "$file" >"${file}changed.txt"
I am inputting an XHTML file in this script. The Perl command line works fine when begin script and end script tags are in the same line. Perl does not find the begin script and end script tags when on separate lines.
Is there a problem with <> in a regular expression?
You're using -0777, which slurps the entire file. Now all you need to do is add the /s switch to your regular expression so that the any-character metacharacter . also matches newlines.
You probably also need to change your regex to be non-greedy .*?, and the regex can be simplified by using assertions and a different delimiter:
#!/bin/bash
file=$1
echo "processing $file"
perl -0777 -pe 's{<script\s*>\K.*?(?=</script>)}{ ... }gs' "$file" > "${file}changed.txt"
Switches:
-0777: Slurp the entire file
-p: Creates a while(<>){...; print} loop for each “line” in your input file.
-e: Tells perl to execute the code on command line.
Try this:
perl -0777pe 's:<script[ |>].*?</script>:<script> ... </script>:gs' "$file"
From perlre:
s
Treat string as single line. That is, change "." to match any
character whatsoever, even a newline, which normally it would not
match.
The non-greedy match .*? helps when there are multiple <script> </script> blocks. Without it (that is, with the greedy match .*), the input
<script> some </script> <script> some2 </script>
will give only one
<script> ... </script>
With the non-greedy match (.*?), the same input will give
<script> ... </script> <script> ... </script>
I found this to remove whitespace from the end of a script, How to remove trailing whitespaces with sed?, but it doesn't quite do what I was hoping. When I think of removing all whitespace, I would also like to remove any empty lines. I think that this sed just removes spaces and tabs, but can it be expanded to also trim out any empty lines from the end of the file? Maybe it's not possible to do this with one line, and maybe there are better ways to achieve this; any options are great.
Also, am I right in thinking that this should replace the file in place with the changes? I'm just not sure that's happening in my testing.
sed -i 's/[ \t]*$//' ~/.bashrc
# -i is in place, [ \t] applies to any number of spaces and tabs before the end of the file "*$"
To remove all whitespace at the end of the file:
perl -0777 -pe 's{\s+\z}{}m' foo > bar
To change the file in-place:
perl -i.bak -0777 -pe 's{\s+\z}{}m' foo
To replace all whitespace at the end of the file with a single newline:
perl -0777 -pe 's{\s+\z}{\n}m' foo > bar
To change the file in-place:
perl -i.bak -0777 -pe 's{\s+\z}{\n}m' foo
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one record at a time (a line by default; the whole file with -0777), assigning it to $_ by default. Add print $_ after each loop iteration.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
-0777 : Slurp files whole.
\s+\z : one or more whitespace characters (including newline) at the end of the string (which happens to be the entire file).
The regex uses this modifier:
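/m : Treat the string as multiple lines, so that ^ and $ match at line boundaries. It has no effect on \z, so it is optional here.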
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)
This might work for you (GNU sed):
sed ':a;/\S/!{$d;N;ba}' file
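If the current line contains no non-whitespace character, append the next line to the pattern space and repeat.
If the last line is reached while the pattern space is still blank, delete the pattern space.
Otherwise print the pattern space.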
To remove spaces from the end of all lines too:
sed ':a;/\S/!{$d;N;ba};s/ *$//mg' file
or:
sed 'H;$!d;x;s/.//;s/ *$//mg;s/\n*$//' file
I am trying to find a pattern of two consecutive lines, where the first line is a fixed string and the second has a part substring I like to replace.
This is to be done in sh or bash on macOS.
If I had a regex tool at hand that would operate on the entire text, this would be easy for me. However, all I find is bash's simple text replacement - which doesn't work with regex, and sed, which is line oriented.
I suspect that I can use sed in a way where it first finds a matching first line, and only then looks to replace the following line if its pattern also matches, but I cannot figure this out.
Or are there other tools present on macOS that would let me do a regex-based search-and-replace over an entire file or a string? Maybe with Python (v2.7 and v3 is installed)?
Here's a sample text and how I like it modified:
keyA
value:474
keyB
value:474 <-- only this shall be replaced (follows "keyB")
keyC
value:474
keyB
value:474
Now, I want to find all occurrences where the first line is "keyB" and the following one is "value:474", and then replace that second line with another value, e.g. "value:888".
As a regex that ignores line separators, I'd write this:
Search: (\bkeyB\n\s*value):474
Replace: $1:888
So, basically, I find the pattern before the 474, and then replace it with the same pattern plus the new number 888, thereby preserving the original indentation (which is variable).
You can use
sed -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}' file
# Or, to replace the contents of the file inline in FreeBSD sed:
sed -i '' -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}' file
Details:
/keyB$/ - finds all lines that end with keyB
n - prints the current pattern space, then replaces it with the next line of input
s/\(.*\):[0-9]*/\1:888/ - finds any text up to the last :, followed by zero or more digits, capturing that text into Group 1, and replaces the match with the contents of the group and :888.
The {...} create a block that is executed only once the /keyB$/ condition is met.
See an online sed demo.
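To try the sed command without touching a real file, the sample from the question can be fed in through a pipe. The indentation of the value lines below is made up to show that it is preserved.

```shell
# The sample from the question, with made-up indentation on the value lines.
input='keyA
  value:474
keyB
    value:474
keyC
  value:474
keyB
  value:474'

# On every line ending in keyB: n prints that line and loads the next one,
# so the substitution only ever sees the line that follows keyB.
result=$(printf '%s\n' "$input" |
  sed -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}')

printf '%s\n' "$result"
```

Only the two lines that follow keyB end in :888 afterwards; the lines after keyA and keyC keep :474.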
Use a perl one-liner with -0777 to scan over multiple lines:
$ # inline edit:
$ perl -0777 -i -pe 's/(\bkeyB\s*value):\d*/$1:888/g' file.txt
$ # to stdout:
$ cat file.txt | perl -0777 -pe 's/(\bkeyB\s*value):\d*/$1:888/g'
In plain bash:
#!/bin/bash
keypattern='^[[:blank:]]*keyB$'
valpattern='(.*):'
replacement=888
while read -r; do
    printf '%s\n' "$REPLY"
    if [[ $REPLY =~ $keypattern ]]; then
        read -r
        if [[ $REPLY =~ $valpattern ]]; then
            printf '%s%s\n' "${BASH_REMATCH[0]}" "$replacement"
        else
            printf '%s\n' "$REPLY"
        fi
    fi
done < file
I am trying to add 5 blank line spaces in a text file (text.txt) before and after string pattern matches. I used the following to get spaces after the 'string' match which worked for me-
sed '/string/{G;G;G;G;G;}' text.txt
I want to apply the same sed command to obtain 5 blank lines before the 'string'. Here I don't want spaces, but rather blank lines before and after each match. Any suggestions?
sed -r 's/(^.*)(string)(.*$)/\1\n\n\n\n\n\2\n\n\n\n\n\3/' text.txt
Use -r or -E to enable extended regular expressions, split the line into three sections, and then substitute the line with the first section, 5 newlines, the second section, 5 newlines, and finally the third section.
Use this Perl one-liner:
perl -pe 's/string/\n\n\n\n\n$&\n\n\n\n\n/' text.txt
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
s/PATTERN/REPLACEMENT/ : change PATTERN to REPLACEMENT.
$& : matched pattern.
\n : newline character.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlrequick: Perl regular expressions quick start
For a single string match:
$ sed -e '/string/{ s/^/\n\n\n\n\n/; s/$/\n\n\n\n\n/ }' text.txt
For multiple strings, assuming same requirements:
$ sed -E '/(string1|string2|string3)/{ s/^/\n\n\n\n\n/; s/$/\n\n\n\n\n/ }' text.txt
This might work for you:
sed '/string/{G;s/\(string\)\(.*\)\(.\)/\3\3\3\3\3\1\3\3\3\3\3\2/}' file
Match on string, append an empty line, pattern match using the newline to separate the match by 5 lines either side.
And an awk version:
awk '{if(/string1|string2|.../){printf "\n\n\n\n\n%s\n\n\n\n\n",$0}else{print}}' file
I have a perl program that takes its input from STDIN (piped from another bash command). The output from the bash command is quite large, about 200 lines. I want to take the entire input (multiple lines) and feed that to a one-liner perl script, but so far nothing I've tried has worked. Conversely, if I use the following perl (.pl file):
#!/usr/bin/perl
use strict;
my $regex = qr/{(?:\n|.)*}(?:\n)/p;
if ( <> =~ /$regex/g ) {
print "${^MATCH}\n";
}
And execute my bash command like this:
<bash command> | perl -0777 try_m_1.pl
It works. But as a one-liner, it doesn't work with the same regex/bash command. The result of the print command is nothing. I've tried it like this:
<bash command> | perl -0777 -e '/{(?:\n|.)*}(?:\n)/pg && print "$^MATCH";'
and this:
<bash command> | perl -0777 -e '/{(?:\n|.)*}(?:\n)/g; print "$1\n";'
And a bunch of other things, too many to list them all. I'm new to perl and only want to use it to get regex output from the text. If there's something better than perl to do this (I understand from reading around that sed wouldn't work for this?) feel free to suggest.
Update: based on @zdim's answer, I tried the following, which worked:
<bash command> | perl -0777 -ne '/(\{(?:\n|.)*\}(?:\n))/s and print "$1\n"'
I guess my regex needed to be wrapped in () and the { curly braces needed to be escaped.
A one-liner needs -n (or -p) to process input, so that files are opened, streams attached, and a loop set up. It still needs that even as the -0777 unsets the input record separator, so the file is read at once; see Why use the -p|-n in slurp mode in perl one liner?
That regex matches either a newline or any character other than a newline. There is a modifier for that, /s, with which . matches a newline as well. The pattern then needs to sit inside literal curly braces, which you need to escape in newer Perls. The newline that follows doesn't need grouping.
So altogether you'd have
<bash command> | perl -0777 -ne'/(\{(.*)\}\n)/s and print "$1\n"'
I've got a document containing empty lines (\n\n). They can be removed with sed:
echo $'a\n\nb'|sed -e '/^$/d'
But how do I do that with an ordinary regular expression in perl? Anything like the following just shows no result at all.
echo $'a\n\nb'|perl -p -e 's/\n\n/\n/s'
You need to use s/^\n\z//. Input is read by line so you will never get more than one newline. Instead, eliminate lines that do not contain any other characters. You should invoke perl using
perl -ne 's/^\n\z//; print'
No need for the /s switch.
The narrower problem of not printing blank lines is more straightforward:
echo $'a\n\nb' | perl -ne 'print if /\S/'
will output all lines except the ones that only contain whitespace.
The input is three separate lines, and perl with the -p option only processes one line at a time.
The workaround is to tell perl to slurp in multiple lines of input at once. One way to do it is:
echo $'a\n\nb' | perl -pe 'BEGIN{$/=undef}; s/\n\n/\n/'
Here $/ is the record separator variable, which tells perl how to parse an input stream into lines.