Performing two (clashing) interpolations on one string

I have an interpolate function that replaces %foo with the value from $HV{'default'}{'foo'} and %foo.bar from $HV{foo}{bar}:
sub interpolate {
my $work = "@_";
$work =~ s/\%(\w+)\.(\w+)/$HV{$1}{$2}/g;
$work =~ s/\%(\w+)/$HV{'default'}{$1}/g;
return $work;
}
However if $HV{'foo'}{'bar'} contains a % character, the second operation matches it which is not what I want. My first fix was to change all occurrences of %foo into %default.foo with
$work =~ s/\%(\w+)/%default\.$1/g;
$work =~ s/\%(\w+)\.(\w+)/$HV{$1}{$2}/g;
But this changes %foo.bar into %default.foo.bar. Is there a way to do what I want without re-doing my hash?
Also for bonus credit I'd be interested in a regular expression that would match %A.very.long.and.deeply.nested.hash.value with the corresponding value to make it work with any hash.

The easiest solution is to do a single traversal of the string, not two in a row:
$work =~ s{%(\w+)(?:\.(\w+))?}{
defined $2
? $HV{$1}{$2}
: $HV{default}{$1}
}eg;
To fix your other approach, you could change your regex to
$work =~ s/%(\w+)(?!\.\w)/%default.$1/g;
to only replace %foo if it's not followed by .bar.
Bonus credit: Assuming you want to replace %foo.bar.baz by $HV{foo}{bar}{baz}, this can be done as follows:
sub lookup {
my ($cur, @keys) = @_;
$cur = $cur->{$_} for @keys;
return $cur;
}
s{%(\w+(?:\.\w+)*)}{
lookup(\%HV, split(/\./, $1))
}eg;
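Putting the pieces together, here is a minimal sketch of a single-pass interpolate that handles both the bare %foo form and dotted paths of any depth. The contents of %HV are made up for illustration, and treating a bare name as a key under 'default' follows the question's convention; the `// ''` fallback for missing keys is my own assumption.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real %HV comes from the question.
my %HV = (
    default => { name => 'world' },
    foo     => { bar => '100%done', deep => { er => 'nested' } },
);

# Walk a list of keys down a nested hash, as in lookup() above.
sub lookup {
    my ($cur, @keys) = @_;
    $cur = $cur->{$_} for @keys;
    return $cur;
}

# Single pass: a bare %name is looked up under 'default',
# a dotted %a.b.c is looked up key by key.
sub interpolate {
    my ($work) = @_;
    $work =~ s{%(\w+(?:\.\w+)*)}{
        my @keys = split /\./, $1;
        unshift @keys, 'default' if @keys == 1;
        lookup(\%HV, @keys) // ''
    }eg;
    return $work;
}

print interpolate('Hello %name: %foo.bar, %foo.deep.er'), "\n";
```

Because the string is traversed only once, the % inside the value of $HV{foo}{bar} is never looked at again.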

Related

Save result from flip-flop in variable?

I have about 1kB of text from STDIN
my $f = join("", <STDIN>);
and I would like to get the content between open1 and close1, so /open1/../close1/ comes to mind.
I have only seen it used in one-liners and in scripts inside while loops operating on $_.
Question
How can I get the result from /open1/../close1/ in my script when everything is in $f?

Capturing all matches with a single regular expression
If you want to capture all the lines between the open1 and close1 markers (excluding the markers), it is easily done with a single regular expression:
my $f = join("", <STDIN>);
my @matches = ( $f =~ m/\bopen1\b(.*?)\bclose1\b/gs );
for my $m (@matches) {
print "$m";
}
where
the s modifier treats the string as a single line, so that . also matches newlines;
the g modifier captures all the matches;
(.*?) matches a group of any characters using the lazy quantifier.
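To see what the capture actually returns, here is a small sketch with a hard-coded string standing in for STDIN (the sample text is invented for the demonstration):

```perl
use strict;
use warnings;

# Hypothetical input standing in for the text read from STDIN.
my $f = "head\nopen1\nfirst\nclose1\nmiddle\nopen1\nsecond\nclose1\ntail\n";

# In list context with /g, the match returns every captured group.
my @matches = ( $f =~ m/\bopen1\b(.*?)\bclose1\b/gs );

# Each element still carries the newlines that surrounded it.
print scalar(@matches), " blocks found\n";
print "[$_]" for @matches;
```

Note that the text between the two pairs ("middle") is not captured, because the lazy quantifier stops at the first close1 it can reach.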
Using the range operator
The range operator (so-called flip-flop) is not very convenient for this task if you want to avoid capturing the markers, because an expression like /open1/ .. /close1/ returns true for the lines matching the patterns.
The expression /^open1$/ .. /^close1$/ returns false until /^open1$/ is true. Once the left expression has matched, it is no longer evaluated, and the operator keeps returning true up to and including the line where /^close1$/ matches. After the right expression matches, the cycle restarts. Thus, the open1 and close1 markers are included in $matches.
It is even less convenient if the input is stored in a variable, because you will need to read the contents of the variable line by line, e.g.:
my $matches = "";
my @lines = split /\n/, $f;
foreach my $line (@lines) {
if ($line =~ m/^open1$/ .. $line =~ m/^close1$/) {
$matches .= "$line\n";
}
}
Note that it is possible to use arbitrary Perl expressions as operands of the range operator. I wouldn't recommend this code, as it is neither very efficient nor very readable. At the same time, it is easy to adapt the first example to the case where the open1 and close1 markers are included in the set of matches, e.g.:
my @matches = ( $f =~ m/\bopen1\b(.*?)\bclose1\b/gs );
for my $m (@matches) {
print "open1${m}close1\n";
}

You can rewrite how $f is generated so that it takes advantage of the flip-flop inside a while loop:
my ( $f, $matched );
while ( <> ) {
$f .= $_;
$matched .= $_ if /open1/ .. /close1/;
}

Another way is to create a new input stream out of the contents of $f.
open my $fh, '<', \$f;
while (<$fh>) {
if (/open1/ .. /close1/) {
...
}
}
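For completeness, a runnable sketch of the in-memory filehandle approach, with a made-up $f and the matching lines collected into a variable I have called $matched:

```perl
use strict;
use warnings;

my $f = "head\nopen1\nbody line\nclose1\ntail\n";  # hypothetical input

# Opening a reference to a scalar gives a read handle over the string.
open my $fh, '<', \$f or die "Cannot open in-memory handle: $!";

my $matched = '';
while (<$fh>) {
    # True from the line matching open1 through the line matching close1.
    $matched .= $_ if /open1/ .. /close1/;
}
close $fh;

print $matched;
```

The marker lines themselves end up in $matched, as with any flip-flop over /open1/ .. /close1/.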

You can also employ split. To get what is between the first pair of open1 and close1:
my $open_to_close = (split /open1|close1/, $f)[1];
The delimiter can be either open1 or close1, so split returns a list of three elements: the text before open1, the text between the markers, and the text after close1. We take the second element.
If there are more open1/close1 pairs, take all the odd-indexed elements.
Either build the array first
my @parts = split /open1|close1/, $f;
my @all_open_to_close = @parts[ grep { $_ & 1 } 0..$#parts ];
or get it directly from the list
my @all_open_to_close =
grep { CORE::state $i; ++$i % 2 == 0 } split /open1|close1/, $f;
state is a feature available from v5.10. If you already enable it, you don't need the CORE:: prefix.

Matching a variable in a string in Perl from the end

I want to match a variable character in a given string, but from the end.
Ideas on how to do this action?
for example:
sub removeCharFromEnd {
my $string = shift;
my $char = shift;
if($string =~ m/$char/){ # I want to match the char, searching from the end; $ doesn't work
print "success";
}
}
Thank you for your assistance.

There is no regex modifier that would force Perl regex engine to parse the string from right to left. Thus, the most convenient way to achieve that is via a negative lookahead:
m/$char(?!.*$char)/
The (?!.*$char) negative lookahead will require the absence (=will fail the match if found) of a $char after any 0+ chars other than linebreak chars (use s modifier if you are running the regex against a multiline string input).
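As a quick check of the lookahead, here is a sketch that reports where the last occurrence sits. The helper name last_offset is mine, and I have added \Q...\E so that a $char such as $ or . is taken literally, which the pattern above does not do on its own.

```perl
use strict;
use warnings;

# Return the 0-based offset of the last occurrence of $char, or -1.
sub last_offset {
    my ($string, $char) = @_;
    # \Q...\E quotes regex metacharacters in $char (an addition; the
    # answer interpolates $char directly).
    return $string =~ m/\Q$char\E(?!.*\Q$char\E)/s ? $-[0] : -1;
}

print last_offset('abcabc', 'b'), "\n";
print last_offset('abcabc', 'x'), "\n";
```

$-[0] holds the start offset of the most recent successful match, so the first call reports the second b rather than the first.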

The regex engine works from left to right.
You can use the natural greediness of quantifiers to reach the end of the string and find the last char with the backtracking mechanism:
if($string =~ m/.*\K$char/s) { ...
\K marks the position of the match result beginning.
Other ways:
you can also reverse the string and use your previous pattern.
you can search all occurrences and take the last item in the list

I'm having trouble understanding what you want. Your subroutine is called removeCharFromEnd, so perhaps you want to remove $char from $string if it appears at the end of the string.
You can do that like this
sub removeCharFromEnd {
my ( $string, $char ) = @_;
if ( $string =~ s/$char\z// ) {
print "success";
}
$string;
}
Or perhaps you want to remove the last occurrence of $char wherever it is. You can do that with
s/.*\K$char//
The subroutine I have written returns the modified string, so you would have to assign the result to a variable to save it. You can write
my $s = 'abc';
$s = removeCharFromEnd($s, 'c');
say $s;
output
ab
If you just want to modify the string in place, then you should write
$_[0] =~ s/$char\z//
using whichever substitution you choose. Then you can do this
my $s = 'abc';
removeCharFromEnd($s, 'c');
say $s;
This produces the same output.
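A minimal sketch of the in-place variant, relying on the fact that the elements of @_ are aliases to the caller's arguments. The sub name remove_char_from_end and the \Q...\E quoting are my additions.

```perl
use strict;
use warnings;

# Elements of @_ are aliases to the caller's arguments, so a
# substitution on $_[0] changes the caller's variable directly.
sub remove_char_from_end {
    my $char = $_[1];
    return $_[0] =~ s/\Q$char\E\z//;   # true if something was removed
}

my $s = 'abc';
remove_char_from_end($s, 'c');
print "$s\n";
```

Copying the argument with my ( $string, $char ) = @_; breaks the alias, which is why the earlier version has to return the modified string instead.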

To get Perl to search from the end of a string, reverse the string.
sub removeCharFromEnd {
my $string = reverse shift @_;
my $char = quotemeta reverse shift @_;
$string =~ s/$char//;
$string = reverse $string;
return $string;
}
print removeCharFromEnd(qw( abcabc b )), "\n";
print removeCharFromEnd(qw( abcdefabcdef c )), "\n";
print removeCharFromEnd(qw( !"/$%?&*!"/$%?&* $ )), "\n";

Perl Grepping from an Array

I need to grep a value from an array.
For example, I have these values:
@a=('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl');
@Array = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl','branches/Soft/B2/c.tct', 'branches/Docs/A1/b.txt');
Now I need to loop over @a and find whether each value matches an element of @Array. For Example

It works for me with grep. You'd do it the exact same way as in the List::MoreUtils example below, except for having grep instead of any. You can also shorten it to
my $got_it = grep { /$str/ } @paths;
my @matches = grep { /$str/ } @paths;
This tests the pattern against $_ by default, which is aliased to each element of the list in turn. The $str and @paths are the same as below.
You can use the module List::MoreUtils as well. Its function any returns true/false depending on whether the condition in the block is satisfied for any element in the list, i.e. whether there was a match in this case.
use warnings;
use strict;
use List::MoreUtils qw(any);
my $str = 'branches/Soft/a.txt';
my @paths = ('branches/Soft/a.txt', 'branches/Soft/b.txt',
'branches/Docs/A1/b.txt', 'branches/Soft/B2/c.tct');
my $got_match = any { $_ =~ m/$str/ } @paths;
With the list above, containing the $str, the $got_match is 1.
Or you can roll it by hand and catch the match as well
foreach my $p (#paths) {
print "Found it: $1\n" if $p =~ m/($str)/;
}
This does print out the match.
Note that the strings you show in your example do not contain the one to match. I added it to my list for a test. Without it in the list no match is found in either of the examples.
To test for more than one string, with the added sample
my @strings = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl');
my @paths = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl',
'branches/Soft/B2/c.tct', 'branches/Docs/A1/b.txt');
foreach my $str (@strings) {
foreach my $p (@paths) {
print "Found it: $1\n" if $p =~ m/($str)/;
}
# Or, instead of the foreach loop above use
# my $match = grep { /$str/ } @paths;
# print "Matched for $str\n" if $match;
}
This prints
Found it: branches/Soft/a.txt
Found it: branches/Soft/h.cpp
Found it: branches/Main/utils.pl
When the lines with grep are uncommented and foreach ones commented out I get the corresponding prints for the same strings.

The dots in $a are regex metacharacters and will pose a problem, so you either have to escape the string when doing a regex match or use a simple eq to find the matches:
Regex match with $a escaped:
my @matches = grep { /\Q$a\E/ } @array;
Simple comparison with "equals":
my @matches = grep { $_ eq $a } @array;
With your sample data both will give an empty array @matches because there is no match.
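To make the difference concrete, here is a small sketch with two invented paths, one of which only matches because an unescaped dot matches any character. I use $want instead of $a, since $a is also the special variable used by sort.

```perl
use strict;
use warnings;

my @array = ('branches/Soft/a.txt', 'branches/Soft/aXtxt');  # hypothetical
my $want  = 'branches/Soft/a.txt';   # named $want to avoid the sort variable $a

my @loose  = grep { /$want/ }     @array;  # "." matches any character
my @quoted = grep { /\Q$want\E/ } @array;  # "." matches only a literal dot
my @exact  = grep { $_ eq $want } @array;  # whole-string equality

printf "%d %d %d\n", scalar @loose, scalar @quoted, scalar @exact;
```

The quoted regex still matches anywhere inside an element, whereas eq requires the whole element to be identical, so pick whichever matches your intent.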

This solved my question. Thanks to all, especially @zdim, for the valuable time and support.
my @SVNFILES = ('branches/Soft/a.txt', 'branches/Soft/b.txt');
my @paths = ('branches/Soft/a.txt', 'branches/Soft/b.txt',
'branches/Docs/A1/b.txt', 'branches/Soft/B2/c.tct');
foreach my $svn (@SVNFILES)
{
chomp ($svn);
my $m = grep { /$svn/ } (@paths);
if ( $m eq '0' ) {
print "Files Mismatch\n";
exit 1;
}
}

You should escape characters like '/' and '.' in any regex when you need them as literal characters.
For example:
$a='branches\/Soft\/a\.txt';
Retry whatever you did with either grep or perl with that. If it still doesn't work, tell us precisely what you tried.

Unable to replace a string via regex through a subroutine

I am trying to replace square brackets in a string with an empty string, which means that if a string is [SECTION], I want to convert it to SECTION.
I have tried this and it works,
$sectionName =~ tr/[]//d;
print "$sectionName\n";
However, when I tried to create a general subroutine for replacing strings, it didn't work out. Here's what I tried,
sub strReplace
{
my $string = shift;
my $target = shift;
my $replacement = shift;
$target = quotemeta $target;
$replacement = quotemeta $replacement;
$string =~ tr/$target/$replacement/d;
return $string;
}
I am calling the sub like this,
# the string "[SECTION]" below is intended to be replaced by a variable
my $sectionName = strReplace("[SECTION]", "[]", "");
print "$sectionName\n";
However, instead of getting the replaced string, I am still getting the old one, i.e, [SECTION]. What am I doing wrong? (PS: Perl version 5.14.2)

Perl's tr/// operator does not support variables. You can find various strategies to work around this here: Perl's tr/// is not doing what I want
To summarize, you have two main options:
Wrap your tr/// in an eval.
Convert your tr/// into a substitution using s///.
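As a rough sketch of the first option, the tr/// can be built at run time with a string eval. The sub name tr_delete is made up, it only covers the deletion case, and it should only be used with trusted input because eval runs whatever code it is handed.

```perl
use strict;
use warnings;

# Build the tr/// at run time with a string eval. quotemeta keeps the
# delimiter and other special characters in the list literal.
# Only use this with trusted input: eval runs whatever code it is given.
sub tr_delete {
    my ($string, $chars) = @_;
    my $list = quotemeta $chars;
    eval "\$string =~ tr/$list//d; 1" or die $@;
    return $string;
}

print tr_delete('[SECTION]', '[]'), "\n";
```

The eval is needed because tr/// builds its translation table at compile time and never interpolates variables.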

If your main case for strReplace is actually just to remove characters, I'd write a less-general-purpose sub that does that. Otherwise, a s/// conversion that can both remove and replace looks like this:
sub strReplace
{
my $string = shift;
my $target = shift;
my $replacement = shift;
my %replacement;
@replacement{ split //, $target } = split //, $replacement;
$string =~ s{ ([\Q$target\E]) }{ $replacement{$1} // '' }gxe;
return $string;
}
The substitution repeatedly (because of the /g flag) looks for [\Q$target\E] (a character class of any of the characters in $target, with any special characters automatically escaped by \Q...\E), and replaces it with the value found by looking in the hash, or just removes it if it wasn't found in the hash.
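Here is the same sub in a self-contained form with two sample calls, one that only deletes and one that maps characters position by position. The sample strings are mine; the // operator needs Perl 5.10 or later, which the 5.14.2 mentioned in the question satisfies.

```perl
use strict;
use warnings;

sub strReplace {
    my ($string, $target, $replacement) = @_;
    my %replacement;
    # Map each target character to the replacement character at the
    # same position; targets without a partner map to undef.
    @replacement{ split //, $target } = split //, $replacement;
    $string =~ s{ ([\Q$target\E]) }{ $replacement{$1} // '' }gxe;
    return $string;
}

print strReplace('[SECTION]', '[]', ''), "\n";   # deletion only
print strReplace('hello', 'el', 'ip'), "\n";     # e -> i, l -> p
```

With an empty replacement list every target character maps to undef, so the // '' fallback deletes it, which mirrors what tr/[]//d does.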

Using Perl, how can I build a dynamic regexp by passing in an argument to a subroutine?

I would like to create a subroutine with a dynamically created regexp. Here is what I have so far:
#!/usr/bin/perl
use strict;
my $var = 1234567890;
foreach (1 .. 9){
&theSub($_);
}
sub theSub {
my $int = @_;
my $var2 = $var =~ m/(??{$int})/;
print "$var2\n";
}
It looks like it will work, but it seems that once the $int in the regex gets evaluated for the first time, it's there forever.
Is there any way to do something similar to this, but have the regex pick up the new argument each time the sub is called?

The easiest way to fix your code is to add parentheses around my, and remove ??{. Here is the fixed program:
#!/usr/bin/perl
use strict;
my $var = 1234567890;
foreach (1 .. 9){
theSub($_);
}
sub theSub {
my($int) = @_;
my($var2) = $var =~ m/($int)/;
print "$var2\n";
}
One of the problematic lines in your code was my $int = @_, which was equivalent to my $int = 1, because it evaluated @_ in scalar context, yielding the number of elements in @_. To get the first argument of your sub, use my($int) = @_;, which evaluates @_ in list context, or fetch the first element using my $int = $_[0];, or fetch and remove the first element using my $int = shift;
There was a similar problem in the my $var2 = line: you need the parentheses there as well to evaluate the regexp match in list context, yielding the list of ($1, $2, ...), and assigning $var2 = $1.
The construct (??{...}) you were trying to use had the opposite effect to what you wanted: (among other things) it compiled your regexp the first time it was used for matching. For regexps containing $ or @, but not containing ??{...}, Perl recompiles the regexp automatically for each match, unless you specify the o flag (e.g. m/$int/o).
The construct (??{...}) means: use Perl code ... to generate a regexp, and insert that regexp here. To get more information, search for ??{ on http://perldoc.perl.org/perlre.html . The reason why it didn't work in your example is that you would have needed an extra layer of parentheses to capture $1, but even with my ($var2) = $var =~ m/((??{$int}))/ it wouldn't have worked, because ??{ has an undocumented property: it forces the compilation of its argument the first time the regexp is used for matching, so my ($var2) = $var =~ m/((??{$int + 5}))/ would have always matched 6.

my $int = @_;
This will give you the count of parameters, always '1' in your case.
I think you want
my $int = shift;

To dynamically pass a regexp to a function, rather than dynamically build it in the function, use qr//.
#!/usr/bin/perl
use strict;
my $var = 1234567890;
foreach (1 .. 9){
theSub(qr/($_)/);
}
sub theSub {
my($regexp) = @_;
my($var2) = ($var =~ $regexp);
print "$var2\n";
}
qr// accepts the same trailing modifiers that m// does: i, m, s, and x.
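A short sketch of a modifier travelling with the compiled pattern. The sample text and the helper name first_capture are invented for the example.

```perl
use strict;
use warnings;

my $var = 'Hello World';   # hypothetical sample text

# Return the first capture of $regexp in $var, or undef if no match.
sub first_capture {
    my ($regexp) = @_;
    my ($captured) = $var =~ $regexp;
    return $captured;
}

# The /i flag travels with the compiled pattern into the sub.
print first_capture(qr/(WORLD)/i), "\n";
```

The sub never sees the flag itself; it is part of the qr// object, so callers can choose case sensitivity without the sub having to know about it.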

my $int = ... puts the right-hand side in scalar context, whereas my ($int) = ... puts it in list context, and that puts $_[0] into $int. In the following, only 10 is put into $int, and the rest, 11 to 99, are lost.
my ($int)=(10..99);
print $int;
10
