Check if any replacement done by `perl -i -pe`

In GNU sed, I can capture the result of a successful substitution by using the w flag of the s command to write the changed lines to a file. A simple example:
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
sed -ri 's|(mariadb)(\.service)|postgresql-9.4\2|w sed-output.log' something.conf;
[[ -s sed-output.log ]] && echo "Pattern found and modified. $(cat sed-output.log)" || echo "Pattern not found.";
Because sed is limited when dealing with multi-line patterns, I switched to Perl.
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
perl -i -pe 's|(mariadb)(\.service)|postgresql-9.4\2|' something.conf;
The code above does the same as the sed version, but how can I get the modified content ("postgresql-9.4.service") into a file, or printed out?
Basically, I would like the script to tell me after it has run whether it succeeded (and what was actually substituted); if it did not, I'll display a message saying what couldn't be found and replaced.
Edit:
Highlighted that I want to get only the modified content, which indicates that my script was successful. With perl -i -pe 's/pattern/replace/' file, I can't tell whether the substitution returned true or false. Of course I can simply run grep -E "pattern" to find out, but that's not the question.

This command exits with status 0 when a replacement is made:
$ perl -i -pe '$M += s|(mariadb)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
0
When no substitution is made, the exit status is 1:
$ perl -i -pe '$M += s|(maria)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
1
From the Perl documentation (perlmod):
An END code block is executed as late as possible, that is, after perl
has finished running the program and just before the interpreter is
being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via exec, or being
blown out of the water by a signal--you have to trap that yourself (if
you can).) You may have multiple END blocks within a file--they will
execute in reverse order of definition; that is: last in, first out
(LIFO). END blocks are not executed when you run perl with the -c
switch, or if compilation fails.
The s operator returns the number of replacements (perlop):
s/PATTERN/REPLACEMENT/msixpodualngcer
Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made.

It isn't as tidy in Perl because you have to open your log file explicitly, and for a one-liner that has to be in a BEGIN block. But Perl's s/// returns the number of changes made, so you can test it for truth.
Note also that $2 is better than \2 in Perl. Perl accepts \2 on the replacement side of s/// only as a holdover from sed; in any other double-quoted string it means the character with code point 2 (U+0002 START OF TEXT).
perl -i -pe ' BEGIN { open F, ">", "perl-output.log" or die $! } print F $_ if s|(mariadb)(\.service)|postgresql-9.4$2| ' something.conf

You can check the output directly if you only print the substituted lines:
if [[ -z $(sed -n 's/mariadb\(\.service\)/postgresql-9.4\1/p' something.conf) ]]; then
echo nope
fi

Related

How can I get my Perl one-liner to show only the first regex match in the file?

I have a file with this format:
KEY1="VALUE1"
KEY2="VALUE2"
KEY1="VALUE2"
I need a Perl command that gets only the first occurrence of KEY1, i.e. VALUE1.
I'm using this command:
perl -ne 'print "$1" if /KEY1="(.*?)"/' myfile
But the result is:
VALUE1VALUE2
EDIT
The solution must use perl, because there is no other regex tool on the system.

Add and last to your one-liner like so (extra quotes removed):
perl -ne 'print $1 and last if /KEY1="(.*?)"/' myfile
This works because -n switch effectively wraps your code in a while loop. Thus, if the pattern matches, print is executed, which succeeds and thus causes last to be executed. This exits the while loop.
You can also use the more verbose last LINE, which specifies the (implicit) label of the while loop that iterates over the input lines. This last form is useful for more complex code than you have here, such as the code involving nested loops.

You can exit after printing first match:
perl -ne '/KEY1="([^"]*)"/ && print ($1 . "\n") && exit' file
VALUE1

You can also use sed:
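sed -nE '/^KEY1="(.*)"/{s//\1/p;q;}' file
The braces apply s//\1/p (substitute, reusing the address regex, then print) followed by q (quit) only to the first matching line. A bare p;q after the substitution would quit after the first input line whether or not it matched.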

For the record, thanks to @Andy Lester's comment I also found a simple way to solve the problem with grep and cut, without the need for a capturing regex:
grep -a -m1 'KEY1' file | cut -d "\"" -f2
returns
VALUE1

Bash: Using quoted variable for grep within quoted expression

I'm trying to create a function within a bash script that queries a log file. Within the query function, I have something that resembles the following:
if [ -n "$(cat some-log-file.log | grep \"$1\")" ]; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
If I send something I know will be in the log file as $1, like "I don't", I get the output:
$ ./y.sh query "I don't"
grep: don't": No such file or directory
No lines matched the search term.
If I try to single quote the $() expression, it sends the literal string and always evaluates true. I'm guessing it has something to do with the way grep interprets backslashes, but I can't figure it out. Maybe I'm overlooking something simple, but I've been at this for hours looking on forums and plugging in all kinds of strange combinations of quotes and escape characters. Any help or advice is appreciated.

It's actually really easy, if you realize that $() is allowed to have unescaped quotes:
if [ -n "$(cat some-log-file.log | grep "$1")" ]; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
You can actually even skip that step, though, because grep gives an appropriate exit code:
if grep -q "$1" some-log-file.log; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
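In short, this works because $( ) starts a new quoting context: the double quotes inside the command substitution are parsed on their own and do not close the outer pair. In the original, the backslash-escaped \" became literal quote characters, so the now-unquoted $1 was split into words and grep took don't" as a file name. Parameter expansion and command substitution happen before word splitting and quote removal; see more about how bash parses commands in the Shell Expansions section of the bash manual.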
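
A small sketch of the exit-status approach as a reusable function follows. The function name query matches the subcommand in the question, but its argument order and the -F and -- flags are assumptions added here: -F treats the search term as a fixed string instead of a regex, which may or may not be what the asker wants.

```shell
#!/bin/sh
# query TERM LOGFILE: illustrative wrapper; the argument order is assumed.
query() {
  # -q: no output, exit status only
  # -F: match TERM literally, not as a regular expression
  # --: end of options, so a TERM starting with "-" is not read as a flag
  if grep -qF -- "$1" "$2"; then
    echo "There is a match."
  else
    echo "No lines matched the search term."
  fi
}

log=$(mktemp)
printf '%s\n' "I don't know" "something else" > "$log"

query "I don't" "$log"
query "missing" "$log"
```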

Regex word boundaries in double bracket version of test

It seems that I can't get word boundaries to work in [[:
$ echo foo | md5sum
d3b07384d113edec49eaa6238ad5ff00 -
$ [[ "$(echo foo | md5sum)" =~ ^d3b07384d113edec49eaa6238ad5ff00 ]] && echo ok
ok
$ [[ "$(echo foo | md5sum)" =~ ^d3b07384d113edec49eaa6238ad5ff00\b ]] && echo ok
$ ## no output
Are word boundaries not accepted in [[? Or am I missing something?

This seems to work, although it's a bit verbose:
[[ "$(echo foo | md5sum)" =~ $(echo '^d3b07384d113edec49eaa6238ad5ff00\b') ]] && echo ok
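
A less verbose variant of the same idea, sketched under the assumption that bash is linked against a regex library that understands \b (a GNU extension, not POSIX ERE), is to keep the pattern in a variable. The variable names are illustrative, and the second pattern shows a POSIX-only alternative to the word boundary.

```shell
#!/usr/bin/env bash
# Sample md5sum output taken from the question.
sum='d3b07384d113edec49eaa6238ad5ff00  -'

# An unquoted variable on the right of =~ is used as a regex, and the
# backslash inside it survives because the shell never re-parses it.
re_gnu='^d3b07384d113edec49eaa6238ad5ff00\b'                # needs GNU \b
re_posix='^d3b07384d113edec49eaa6238ad5ff00([[:space:]]|$)' # POSIX ERE only

[[ $sum =~ $re_posix ]] && echo "ok (POSIX class)"
[[ $sum =~ $re_gnu ]] && echo "ok (GNU word boundary)"
```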

The problem seems related to the backslash losing the meaning you intend when interpreted by the shell. There's probably some incantation of quoting that would eliminate the issue, but for me it's sometimes just easier to dump the output of a construct into Perl for further processing.
If you can accept a solution that invokes Perl on your system, this works:
echo foo | md5sum | perl -nE 'say "ok" if m/^\bd3b07384d113edec49eaa6238ad5ff00\b/'
If you're stuck with a Perl that predates v5.10, then this:
echo foo | md5sum | perl -lne 'print "ok" if m/^\bd3b07384d113edec49eaa6238ad5ff00\b/'
The solution is fairly self-explanatory if you read through perlrun, which explains what the various command line switches do. We're using -n to cause Perl to process some input, -E to tell Perl to evaluate some code using modern (5.10+) features (say), and the rest just reads as you would expect.
For older Perl versions (pre-5.10), say wasn't available, so the command-line switches change to -l, -n, and -e. The -l switch strips newlines from input (not useful here) and adds them to output (useful, because print doesn't do that, whereas the newer say does), and -e evaluates the code with pre-5.10 semantics.

How to get a part of a string with a regular expression in a /bin/sh script

I need to extract part of a string in a shell script. The original string is pretty complicated, so I really need a regular expression to select the right part of it; just removing a prefix and suffix won't work. Also, the regular expression needs to check the context of the string I want to extract, so, for example, I need a regular expression like a\([^b]*\)b to extract 123 from 12a123b23.
The shell script needs to be portable, so I cannot make use of the Bash constructs [[ and BASH_REMATCH.
I want the script to be robust, so when the regular expression does not match, the script should notice this e.g. through a non-zero exit code of the command to be used.
What is a good way to do this?
I've tried various tools, but none of them fully solved the problem:
expr match "$original" ".*$regex.*" works except for the error case. With this command, I don't know how to detect that the regex did not match. Also, expr seems to derive its exit code from the extracted string: when I happened to extract 00, expr exited with status 1. So I would generally need to ignore the exit code with expr match "$original" ".*$regex.*" || true
echo "$original" | sed "s/.*$regex.*/\\1/" also works except for the error case. To handle this case, I'd need to test whether I got back the original string, which is also quite inelegant.
So, isn't there a better way to do this?

You could use the -n option of sed to suppress output of all input lines and add the p option to the substitute command, like this:
echo "$original" | sed -n -e "s/.*$regex.*/\1/p"
If the regular expression matches, the matched group is printed as before. But now if the regular expression does not match, nothing is printed and you will need to test only for the empty string.
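
The empty-string test can be folded into a small POSIX sh function so that callers get the non-zero exit code the question asks for. This is only a sketch: the name extract and the result variable are made up here, and an empty capture cannot be told apart from a failed match.

```shell
#!/bin/sh
# extract STRING BRE: hypothetical helper; BRE must contain one \(...\) group.
extract() {
  # -n suppresses the default output, p prints only when s/// succeeded.
  result=$(printf '%s\n' "$1" | sed -n -e "s/.*$2.*/\\1/p")
  # Non-empty result: print it and return 0. Empty result: return 1.
  [ -n "$result" ] && printf '%s\n' "$result"
}

if part=$(extract '12a123b23' 'a\([^b]*\)b'); then
  echo "extracted: $part"
else
  echo "no match" >&2
fi
```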

How about grep -o? The only possible problem is portability; otherwise it satisfies all requirements:
➜ echo "hello and other things" | grep -o hello
hello
➜ echo $?
0
➜ echo "hello and other things" | grep -o nothello
➜ echo $?
1
One of the best things is that, since it's grep, you can pick which regex flavor you want: BRE, ERE, or Perl (PCRE).
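
Since grep -o prints the whole match and not a capture group, the surrounding context has to be trimmed afterwards. One possible sketch, assuming a single match per input and using the question's 12a123b23 example, strips the known delimiters with POSIX parameter expansion; the variable names are illustrative.

```shell
#!/bin/sh
original='12a123b23'

# grep exits 0 and prints "a123b" on a match, or exits 1 with no output.
if match=$(printf '%s\n' "$original" | grep -o 'a[^b]*b'); then
  part=${match#a}   # drop the leading context character
  part=${part%b}    # drop the trailing context character
  echo "$part"
else
  echo "no match" >&2
fi
```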

If egrep is available (it almost always is; grep -E is the POSIX spelling):
egrep 'YourPattern' YourFile
or
egrep "${YourPattern}" YourFile
If only grep is available:
grep -e 'YourPattern' YourFile
Check the status of the command with a classic [ $? -eq 0 ] (and also account for failure to read YourFile).
For the content itself, extract it with sed or awk (for portability) after the failure test. The braces make sed print and quit on the first matching line only:
Content="$( sed -n -e "/.*\(${YourPattern}\).*/{s//\1/p;q;}" YourFile )"

regexp (sed) suppress "no match" output

I'm stuck on that and can't wrap my head around it: How can I tell sed to return the value found, and otherwise shut up?
It's really beyond me: Why would sed return the whole string if it found nothing? Do I have to run another test on the returned string to verify it? I tried using "-n" from the (very short) man page but it effectively suppresses all output, including matched strings.
This is what I have now :
echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
which returns
02 (and that is fine and dandy, thank you very much), but:
echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
returns
plop-02plop (when it should return this = "" nothing! Dang, you found nothing so be quiet!
For crying out loud !!)
I tried checking for a return value, but this failed too ! Gasp !!
$ echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
02
0
$ echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
plop-02plop
0
$
This last one I cannot even believe. Is sed really the tool I should be using? I want to extract a needle from a haystack, and I want a needle or nothing..?

sed by default prints all lines.
What you want to do is
/patt/!d;s//repl/
In other words, delete lines that do not match your pattern; for lines that do match, the empty pattern in s// reuses the same regex, and the replacement extracts the element you want, for instance by capture group number. In your case it will be:
sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
You can also use the -n option to suppress the automatic printing of every line. A line is then printed only when you explicitly ask for it. In practice, scripts using -n are usually longer and more cumbersome to maintain. Here it will be:
sed -ne 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p'
There is also grep, but your example shows why sed is sometimes better.
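
To get the "needle or nothing" behavior in a script, the delete-then-substitute form can be wrapped and its output tested for emptiness. The function name needle below is an assumption for illustration; the sed expression is the one from the answer, split across two -e options.

```shell
#!/bin/sh
# needle STRING: hypothetical wrapper around the sed command above.
needle() {
  # Non-matching lines are deleted; on matching lines the empty regex in
  # s// reuses the address pattern and \2 keeps only the two digits.
  printf '%s\n' "$1" |
    sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d' -e 's//\2/'
}

for haystack in plop-02-plop plop-02plop; do
  found=$(needle "$haystack")
  if [ -n "$found" ]; then
    echo "$haystack: $found"
  else
    echo "$haystack: nothing"
  fi
done
```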

Perhaps you can use egrep -o?
input.txt:
blooody
aaaa
bbbb
odor
qqqq
E.g.
sehe@meerkat:/tmp$ egrep -o o+ input.txt
ooo
o
o
sehe@meerkat:/tmp$ egrep -no o+ input.txt
1:ooo
4:o
4:o
Of course egrep will have slightly different (better?) regex syntax for advanced constructs (back-references, non-greedy operators). I'll let you do the translation, if you like the approach.
