I'd like to use a variable as a RegEx pattern for matching filenames:
my $file = "test~";
my $regex1 = '^.+\Q~\E$';
my $regex2 = '^.+\\Q~\\E$';
print int($file =~ m/$regex1/)."\n";
print int($file =~ m/$regex2/)."\n";
print int($file =~ m/^.+\Q~\E$/)."\n";
The result (or on ideone.com):
0
0
1
Can anyone explain to me how I can use a variable as a RegEx pattern?
As the documentation says:
$re = qr/$pattern/;
$string =~ /foo${re}bar/; # can be interpolated in other patterns
$string =~ $re; # or used standalone
$string =~ /$re/; # or this way
So, use the qr quote-like operator.
You cannot use \Q in a single-quoted / non-interpolated string. It must be seen by the lexer.
Anyway, tilde isn’t a meta-character.
Add use re "debug" and you will see what is actually happening.
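To illustrate the point about \Q needing an interpolating context, here is a minimal sketch. The variable name $tail is an assumption made for the example and does not come from the question; the idea is to keep the literal text in a plain string and apply \Q...\E when the pattern is compiled with qr.

```perl
use strict;
use warnings;

my $file = "test~";
my $tail = '~';    # hypothetical variable holding the literal text to match

# \Q...\E takes effect here because qr// is an interpolating context,
# unlike the single-quoted strings in the question.
my $re = qr/^.+\Q$tail\E$/;

print int($file  =~ $re), "\n";    # prints 1
print int("test" =~ $re), "\n";    # no trailing tilde, so this does not match
```

Calling quotemeta($tail) before building the pattern should have the same effect as \Q$tail\E.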
I realize it is possible to achieve this with a slight workaround, but I am hoping there is a simpler way (since I often make use of this type of expression).
Given the example string:
my $str = "An example: sentence!*";
A regex can be used to match each punctuation mark and capture them in an array.
Thereafter, I can simply repeat the regex and replace the matches as in the following code:
push(@matches, $1) while $str =~ /([\*\!:;])/g;
$str =~ s/([\*\!:;])//g;
Would it be possible to combine this into a single step in Perl where substitution occurs globally while also keeping tabs on the replaced matches?
You can embed code to run in your regular expression:
my @matches;
my $str = 'An example: sentence!*';
$str =~ s/([\*\!:;])(?{push @matches, $1})//g;
But with a match this simple, I'd just do the captures and substitution separately.
Yes, it's possible.
my @matches;
$str =~ s/[*!:;]/ push @matches, $&; "" /eg;
However, I'm not convinced that the above is faster or clearer than the following:
my @matches = $str =~ /[*!:;]/g;
$str =~ tr/*!:;//d;
Use:
use feature 'say';
use Data::Dumper;
my $str = "An example: sentence!*";
my @matches = $str =~ /([\*\!:;])/g;
say Dumper \@matches;
$str =~ tr/*!:;//d;
Output:
$VAR1 = [
':',
'!',
'*'
];
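Is that what you're looking for?
my ($str, @matches) = ("An example: sentence!*");
# first method (note: with /g, $1 holds only the last capture, so only '*' is pushed):
($str =~ s/([\*\!:;])//g) && push(@matches, $1);
# second method (no /g, so each pass removes and records one character):
push(@matches, $1) while ($str =~ s/([\*\!:;])//);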
Try:
my $str = "An example: sentence!*";
push(@mys, ($str=~m/([^\w\s])/g));
print join "\n", @mys;
Thanks.
I am trying to construct a small regex during runtime, but somehow it never matches -- what am I doing wrong?
my $word = quotemeta("test");
my $lines = "just a test to testing find tester testönig something fastest out pentest";
my $rule = "m/" . $word . "/g";
my $regex = qr/$rule/;
while ($lines =~ $regex) {
# this never happens...
print "\nFound pattern: '$&'";
}
Your code:
my $word = quotemeta("test");
my $rule = "m/" . $word . "/g";
my $regex = qr/$rule/;
is the same as this:
my $word = quotemeta("test");
my $rule = "m/test/g"; # interpolated $word
my $regex = qr~m/test/g~; # interpolated $rule
That is, it matches the literal string "m/test/g" and nothing else.
buff has already given pretty much the same code I would have suggested, except that I recommend avoiding the use of $& due to a performance penalty as noted in perlvar:
The use of this variable anywhere in a program imposes a considerable
performance penalty on all regular expression matches. To avoid this
penalty, you can extract the same substring by using @-. Starting with
Perl v5.10.0, you can use the /p match flag and the ${^MATCH} variable
to do the same thing for particular match operations.
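As a sketch of the alternative that the perlvar excerpt mentions, the loop can use the /p flag together with ${^MATCH} instead of $&. The sample text is shortened to ASCII words and the @found array is an addition for illustration; this assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

my $word  = quotemeta("test");
my $lines = "just a test to testing find tester fastest out pentest";
my $regex = qr/$word/;

my @found;    # hypothetical collector, added for this example
while ($lines =~ /$regex/gp) {
    # ${^MATCH} holds the matched text for this particular match only
    push @found, ${^MATCH};
}

print "Found ", scalar(@found), " matches\n";    # Found 5 matches
```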
This might be what you want:
#!/usr/bin/perl
use strict;
use warnings;
my $word = quotemeta("test");
my $lines = "just a test to testing find tester testönig something fastest out pentest";
my $regex = qr/$word/;
while ($lines =~ /$regex/g) {
print "\nFound pattern: '$&'";
}
You cannot use /g directly with qr.
Consider this example,
$relPath = '..\A\B/C/D/E';
$contentsDir = '..\A\B';
$relPath =~ s/$contentsDir//;
print "$relPath\n";
#Desired output: '/C/D/E'
#Actual output: '..\A\B/C/D/E'
Please help .. this unwanted interpolation has made it impossible to compute this.
Don't mix slashes and backslashes in paths. Use just slashes.
If you want to ignore any regular expression characters in a string, place it between \Q and \E (see the documentation in perlre), or pass it to quotemeta.
Here's an example:
#!/usr/bin/perl -w
use strict;
my $string = 'abc.*def';
my $sub = '.*';
$string =~ s/c\Q$sub\E/d/;
# or: my $pattern = 'c' . quotemeta($sub); $string =~ s/$pattern/d/;
print $string; # abddef
Quote special regex characters with quotemeta before matching:
$contentsDir = quotemeta '..\A\B';
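Applied to the paths from the question, the quoting advice might look like the sketch below. The $rest variable and the ^ anchor are additions for illustration; the anchor assumes the directory should only be stripped from the start of the path.

```perl
use strict;
use warnings;

my $relPath     = '..\A\B/C/D/E';
my $contentsDir = '..\A\B';

# \Q...\E escapes the dots and backslashes so they are matched literally.
(my $rest = $relPath) =~ s/^\Q$contentsDir\E//;

print "$rest\n";    # /C/D/E
```

Copying into $rest first leaves $relPath untouched, which may or may not matter for the original use case.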
Is there an easy way to add regex modifiers such as 'i' to a quoted regular expression? For example:
$pat = qr/F(o+)B(a+)r/;
$newpat = $pat . 'i'; # This doesn't work
The only way I can think of is to print "$pat\n" and get back (?-xism:F(o+)B(a+)r) and try to remove the 'i' in ?-xism: with a substitution
You cannot put the flag inside the result of qr that you already have, because it’s protected. Instead, use this:
$pat = qr/F(o+)B(a+)r/i;
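A small sketch of what "protected" means in practice: an interpolated qr object brings its own flag group with it, so wrapping it in (?i) has no effect on the inner pattern. The second half assumes, purely for illustration, that the pattern source is also available as a plain string ($source is a hypothetical name), in which case it can be compiled once per flag set.

```perl
use strict;
use warnings;

my $pat     = qr/F(o+)B(a+)r/;
my $wrapped = qr/(?i)$pat/;    # the interpolated qr group switches /i back off

# Assumption: the pattern text is kept as a string, not only as a qr object.
my $source = 'F(o+)B(a+)r';
my $exact  = qr/$source/;
my $nocase = qr/$source/i;

for my $re ($wrapped, $exact, $nocase) {
    print 'FOOBAR' =~ $re ? "match\n" : "mismatch\n";
}
# prints: mismatch, mismatch, match
```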
You can modify an existing regex as if it were a string, as long as you recompile it afterwards:
my $pat = qr/F(o+)B(a+)r/;
print $pat, "\n";
print 'FOOBAR' =~ $pat ? "match\n" : "mismatch\n";
$pat =~ s/i//;
$pat = qr/(?i)$pat/;
print $pat, "\n";
print 'FOOBAR' =~ $pat ? "match\n" : "mismatch\n";
OUTPUT
(?-xism:F(o+)B(a+)r)
mismatch
(?-xism:(?i)(?-xsm:F(o+)B(a+)r))
match
Looks like the only way is to stringify the RE, replace (-i) with (i-) and re-quote it back:
my $pat = qr/F(o+)B(a+)r/;
my $str = "$pat";
$str =~ s/(?<!\\)(\(\?\w*)-([^i:]*)i([^i:]*):/$1i-$2$3:/g;
$pati = qr/$str/;
UPDATE: perl 5.14 quotes regexps in a different way, so my sample should probably look like
my $pat = qr/F(o+)B(a+)r/;
my $str = "$pat";
$str =~ s/(?<!\\)\(\?\^/(?^i/g;
$pati = qr/$str/;
But I don't have perl 5.14 at hand and can't test it.
UPD2: I also failed to check for escaped opening parenthesis.
I want to be able to do a regex match on a variable and assign the results to the variable itself. What is the best way to do it?
I want to essentially combine lines 2 and 3 in a single line of code:
$variable = "some string";
$variable =~ /(find something).*/;
$variable = $1;
Is there a shorter/simpler way to do this? Am I missing something?
my($variable) = "some string" =~ /(e\s*str)/;
This works because
If the /g option is not used, m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, i.e., ($1, $2, $3 …).
and because my($variable) = ... (note the parentheses around the scalar) supplies list context to the match.
If the pattern fails to match, $variable gets the undefined value.
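The difference between list and scalar context can be seen in a short sketch. The variable names $hit, $miss and $flag are made up for the example.

```perl
use strict;
use warnings;

my ($hit)  = "some string" =~ /(e\s*str)/;        # list context: the capture
my ($miss) = "some string" =~ /(no such text)/;   # list context, failed match
my $flag   = "some string" =~ /(e\s*str)/;        # scalar context: true/false

print "hit:  $hit\n";                                        # hit:  e str
print "miss: ", (defined $miss ? $miss : "undef"), "\n";     # miss: undef
print "flag: $flag\n";                                       # flag: 1
```

Without the parentheses around the variable, the assignment stores only whether the match succeeded, not the captured text.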
Why do you want it to be shorter? Does it really matter?
$variable = $1 if $variable =~ /(find something).*/;
If you are worried about the variable name or doing this repeatedly, wrap the thing in a subroutine and forget about it:
some_sub( $variable, qr/pattern/ );
sub some_sub { $_[0] = $1 if eval { $_[0] =~ m/$_[1]/ }; $1 };
However you implement it, the point of the subroutine is to make the code reusable: you give a particular set of lines a short name that stands in their place.
Several other answers mention a destructive substitution:
( my $new = $variable ) =~ s/pattern/replacement/;
I tend to keep the original data around, and Perl v5.14 has an /r flag that leaves the original alone and returns a new string with the replacement (instead of the count of replacements):
my $match = $variable =~ s/pattern/replacement/r;
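As a minimal sketch of the /r flag applied to the extraction problem from the question, assuming Perl 5.14 or later and using the sample string from another answer in this thread:

```perl
use 5.014;
use warnings;

my $variable = "find something soon";

# /r returns the modified copy and leaves $variable alone.
my $match = $variable =~ s/^.*?(find something).*/$1/r;

say $match;       # find something
say $variable;    # find something soon
```

One caveat with this approach: if the pattern does not match, /r returns the original string unchanged rather than undef, so it cannot be used on its own to detect a failed match.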
Well, you could say
my $variable;
($variable) = ($variable = "find something soon") =~ /(find something).*/;
or
(my $variable = "find something soon") =~ s/^.*?(find something).*/$1/;
You can do substitution as:
$a = 'stackoverflow';
$a =~ s/(\w+)overflow/$1/;
$a is now "stack"
From Perl Cookbook 2nd ed
6.1 Copying and Substituting Simultaneously
$dst = $src;
$dst =~ s/this/that/;
becomes
($dst = $src) =~ s/this/that/;
I just assumed everyone did it this way, amazed that no one gave this answer.
Almost ....
You can combine the match and retrieve the matched value with a substitution.
$variable =~ s/.*(find something).*/$1/;
AFAIK, you will always have to copy the value, though, unless you do not mind clobbering the original.
$variable2 = "stackoverflow";
(my $variable1) = ($variable2 =~ /stack(\w+)/);
$variable1 now equals "overflow".
I do this:
#!/usr/bin/perl
$target = "n: 123";
my ($target) = $target =~ /n:\s*(\d+)/g;
print $target; # the var $target now is "123"
Also, to amplify the accepted answer using the ternary operator to allow you to specify a default if there is no match:
my $match = $variable =~ /(pattern).*/ ? $1 : $defaultValue; # pattern and $defaultValue are placeholders