Regex Group: $1 in a string before replacement happens - regex

I have the following code, which replaces a string with a part of it using $1:
my $string = "abcd1234";
print "$string\n";
$string =~ s/c(d1)2/<found: $1>/;
print "$string\n";
Output:
abcd1234
ab<found: d1>34
Now I want to have variables, which contain the condition and the replacement. But if I do it like this, an error occurs:
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $rule = "<found: $1>";
$string =~ s/$cond/$rule/;
print "$string\n";
Output:
Use of uninitialized value $1 in concatenation (.) or string
abcd1234
ab<found: >34
I get that $1 doesn't exist yet on the line where $rule is defined. But how can I put a placeholder there?

You can't delay the interpretation of a variable within a quoted string like that. $1 is substituted as soon as the line is executed.
To do what you seem to want, you would create a sprintf template and execute it in the replacement part of the substitution with the /e flag.
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $replacement_fmt = "<found: %s>";
$string =~ s/$cond/sprintf($replacement_fmt, $1)/e;
print "$string\n";
## ab<found: d1>34
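If the replacement needs more logic than a format string allows, one alternative is to keep it as a code reference and call it from the replacement side with /e. This is only a sketch: the name $make_replacement is made up for illustration, and the capture is passed in as an argument so the closure does not depend on $1 directly.

```perl
use strict;
use warnings;

my $string = "abcd1234";
my $cond   = qr/c(d1)2/;

# Hypothetical helper: builds the replacement text from the capture
# it is given. Nothing is interpolated until the sub is called.
my $make_replacement = sub {
    my ($captured) = @_;
    return "<found: $captured>";
};

# /e evaluates the replacement as code after the match has set $1.
$string =~ s/$cond/$make_replacement->($1)/e;
print "$string\n";    # ab<found: d1>34
```

Like the sprintf version, this uses a single /e, so the replacement text itself is never evaluated as code.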

You can use sub_modify from String::Substitution:
use String::Substitution qw(sub_modify);
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $rule = '<found: $1>'; # Single quotes!
sub_modify($string, $cond, $rule);
print "$string\n";
For completeness, note that it's also possible to do this using s/// with the /ee modifier. However, you shouldn't use it, as it can lead to security issues and various bugs.

Related

Assigning a regex pattern match to a variable in perl

I want to match a regex pattern and assign the matched pattern to a variable.
I then want to print the value assigned to the variable.
My attempt
#!/usr/bin/perl -w
use strict;
my $variable = 'iaw75w8yu';
my $value =~ /w[0-9][0-9]w[0-9]/;
print $value;
Desired output
w75w8
Actual output
Use of uninitialized value $value in pattern match (m//) at ./temp.pl line 5.
Use of uninitialized value $value in print at ./temp.pl line 6.
This works too:
#!/usr/bin/env perl
use strict;
use warnings;
my $variable = 'iaw75w8yu';
my $value;
$value = $1 if ($variable =~ /(w[0-9][0-9]\w[0-9])/);
print $value if defined ($value);
It seems closer to what you initially intended to try. The drawback of this method is that your value may never get defined if the match fails. That's important to keep in mind when you start having optional captures in a m// statement.
This should be printing w78w8
No, it shouldn't! Even if your code was correct, it would never print w78w8 because the source string you're trying to match against doesn't contain w78w8. In the examples below, I'm going to assume you meant to say w75w8.
The problem is that you're trying to match an undefined value, the result of which is undefined. Your code is printing warnings, which is exactly what it should be printing since you specified -w.
If you want to capture something, you need a capture group:
use strict;
use warnings;
use Data::Dumper;
'iaw75w8yu' =~ /(w\d\dw\d)/;
print Dumper($1);
Outputs:
$VAR1 = 'w75w8';
If you want to store the matches in a variable of your choosing, you need to evaluate the expression in list context:
my @matches = ('iaw75w8yu' =~ /(w\d\dw\d)/);
print Dumper(\@matches);
Outputs:
$VAR1 = [
'w75w8'
];
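To make the context difference concrete, here is a small sketch (the variable names are just for illustration) that runs the same match once in scalar context and once in list context:

```perl
use strict;
use warnings;

my $text = 'iaw75w8yu';

# Scalar context: the match returns true (1) or false, not the capture.
my $matched = $text =~ /(w\d\dw\d)/;

# List context (note the parentheses around the variable):
# the match returns the list of captures.
my ($capture) = $text =~ /(w\d\dw\d)/;

print "scalar context: $matched\n";    # scalar context: 1
print "list context:   $capture\n";    # list context:   w75w8
```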
I usually do it like this
#!/usr/bin/perl -w
use strict;
# the variable
my $variable = 'iaw75w8yu';
# the value
my $value;
# regex for the variable
if ( $variable =~ /(w[0-9][0-9]\w[0-9])/ ) {
    $value = $1;
}
else {
    $value = "regex not found";
}
print $value;

Add html to perl Regex

I am trying to replace all backtick-quoted text (`...`) with an HTML code tag
replace:
$string = "Foo `FooBar` Bar";
with:
$string = "Foo <code>FooBar</code> Bar";
I tried these:
$pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$subject =~ s/$pattern/$replace/im;
#And
$subject =~ s/$pattern/<code/>$&</code>/im;
but none of them works.
Assuming you meant $string instead of $subject...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$string =~ s{$pattern}{$replace}im;
say $string;
This results in...
$ perl ~/tmp/test.plx
Use of uninitialized value $& in concatenation (.) or string at /Users/schwern/tmp/test.plx line 9.
Foo <code/></code> Bar
There are some problems here. First, $& means the string matched by the last match. That would be all of `FooBar`. You just want FooBar, which is inside the capturing parens. You get that with $1. See Extracting Matches in the Perl Regex Tutorial.
Second, $& and $1 are variables. If you put them in double quotes like $replace = "<code/>$&</code>" then Perl will immediately interpolate them. That means $replace is <code/></code>. This is where the warning comes from. If you want to use $1, it has to go directly into the replacement.
Finally, when quoting regexes it's best to use qr{}. That does special regex quoting. It avoids all sorts of quoting issues.
Put it all together...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = qr{`(.*?)`};
$string =~ s{$pattern}{<code>$1</code>}im;
say $string;
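The question asks to replace all backtick-quoted spans, and a substitution without /g only touches the first one. A minimal sketch with /g, assuming a sample string that contains two spans (the second span is invented for the demonstration):

```perl
use strict;
use warnings;

my $string  = "Foo `FooBar` Bar `Baz`";
my $pattern = qr{`(.*?)`};

# /g repeats the substitution for every backtick-quoted span.
$string =~ s{$pattern}{<code>$1</code>}g;
print "$string\n";    # Foo <code>FooBar</code> Bar <code>Baz</code>
```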

Perl how do you assign a variable to a regex match result

How do you create a $scalar from the result of a regex match?
Is there any way that once the script has matched the regex that it can be assigned to a variable so it can be used later on, outside of the block.
I.e., if $regex_result = blah blah then do something.
I understand that I should make the regex as non-greedy as possible.
#!/usr/bin/perl
use strict;
use warnings;
# use diagnostics;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Outlook';
my @Qmail;
my $regex = "^\\s\*owner \#";
my $sentence = $regex =~ "/^\\s\*owner \#/";
my $outlook = Win32::OLE->new('Outlook.Application')
or warn "Failed Opening Outlook.";
my $namespace = $outlook->GetNamespace("MAPI");
my $folder = $namespace->Folders("test")->Folders("Inbox");
my $items = $folder->Items;
foreach my $msg ( $items->in ) {
if ( $msg->{Subject} =~ m/^(.*test alert) / ) {
my $name = $1;
print " processing Email for $name \n";
push @Qmail, $msg->{Body};
}
}
for (@Qmail) {
next unless /$regex|^\s*description/i;
print; # prints what i want ie lines that start with owner and description
}
print $sentence; # prints ^\\s\*offense \ # not lines that start with owner.
One way is to verify a match occurred.
use strict;
use warnings;
my $str = "hello what world";
my $match = 'no match found';
my $what = 'no what found';
if ( $str =~ /hello (what) world/ )
{
$match = $&;
$what = $1;
}
print '$match = ', $match, "\n";
print '$what = ', $what, "\n";
Use the Perl variables below to meet your requirements:
$` = The string preceding whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
$& = The string matched by the last pattern match.
$' = The string following whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
For example:
$_ = 'abcdefghi';
/def/;
print "$`:$&:$'\n"; # prints abc:def:ghi
The match of a regex is stored in special variables (as well as some more readable variables if you specify the regex to do so and use the /p flag).
For the whole last match, you're looking for the $MATCH variable ($& for short; the long name requires use English). This is covered in the perlvar manual page.
So say you wanted to store your last for loop's matches in an array called @matches, you could write the loop (and for some reason I think you meant it to be a foreach loop) as:
use English;    # provides $MATCH as a readable alias for $&
my @matches = ();
foreach (@Qmail) {
    next unless /$regex|^\s*description/i;
    push @matches, $MATCH;
    print;
}
I think you have a problem in your code. I'm not sure of the original intention but looking at these lines:
my $regex = "^\\s\*owner \#";
my $sentence = $regex =~ "/^\s*owner #/";
I'll step through that as:
Assign $regex the string ^\s*owner #.
Assign $sentence the result of matching $regex against the regular expression /^s*owner $/ (which won't match; if it did, $sentence would be 1, but since it doesn't, it's false).
I think. I'm actually not exactly certain what that line will do or was meant to do.
I'm not quite sure what part of the match you want: the captures, or something else. I've written Regexp::Result which you can use to grab all the captures etc. on a successful match, and Regexp::Flow to grab multiple results (including success statuses). If you just want numbered captures, you can also use Data::Munge
You can do the following:
use feature 'say';
my $str = "hello world";
my ($hello, $world) = $str =~ /(hello)|(what)/;
say "[$_]" for ($hello, $world);
As you can see, $hello contains "hello"; $world is undef because the second alternative did not match.
If you have an older perl on your system like me (perl 5.18 or earlier) and you use $`, $&, or $' as in codequestor's answer above, it will slow down your program.
Instead, you can use your regex pattern with the modifier /p, and then check these 3 variables: ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} for your matching results.
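For illustration, here is a rough equivalent of the abc:def:ghi example using /p. The variable names $pre, $match, and $post are only for the demonstration; it needs Perl 5.10 or newer.

```perl
use strict;
use warnings;

my $text = 'abcdefghi';
my ($pre, $match, $post);

# /p asks Perl to keep the pre-match, match, and post-match strings
# for this one regex only.
if ( $text =~ /def/p ) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    print "$pre:$match:$post\n";    # abc:def:ghi
}
```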

Matching a regular expression multiple times with Perl

Noob question here. I have a very simple perl script and I want the regex to match multiple parts in the string
my $string = "ohai there. ohai";
my @results = $string =~ /(\w\w\w\w)/;
foreach my $x (@results){
print "$x\n";
}
This isn't working the way I want, as it only returns ohai. I would like it to match and print out ohai ther ohai.
How would I go about doing this?
Thanks
Would this do what you want?
my $string = "ohai there. ohai";
while ($string =~ m/(\w\w\w\w)/g) {
print "$1\n";
}
It returns
ohai
ther
ohai
From perlretut:
The modifier "//g" stands for global matching and allows the
matching operator to match within a
string as many times as possible.
Also, if you want to put the matches in an array instead you can do:
my $string = "ohai there. ohai";
my @matches = ($string =~ m/(\w\w\w\w)/g);
foreach my $x (@matches) {
print "$x\n";
}
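As a small variation on the list-context form, the pattern can be written with a quantifier, and the same /g match can be counted by assigning it through an empty list. This is just a sketch of those two idioms, not something the question asked for:

```perl
use strict;
use warnings;

my $string = "ohai there. ohai";

# In list context, /g returns every match; with no capture group,
# the whole matched text is returned each time.
my @matches = $string =~ /\w{4}/g;

# Assigning through an empty list forces list context, and the
# scalar on the left then receives the number of matches.
my $count = () = $string =~ /\w{4}/g;

print "@matches\n";    # ohai ther ohai
print "$count\n";      # 3
```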
Or you could do this
my $string = "ohai there. ohai";
my @matches = split(/\s/, $string);
foreach my $x (@matches) {
print "$x\n";
}
The split function in this case splits on spaces and prints
ohai
there.
ohai

Match regex and assign results in single line of code

I want to be able to do a regex match on a variable and assign the results to the variable itself. What is the best way to do it?
I want to essentially combine lines 2 and 3 in a single line of code:
$variable = "some string";
$variable =~ /(find something).*/;
$variable = $1;
Is there a shorter/simpler way to do this? Am I missing something?
my($variable) = "some string" =~ /(e\s*str)/;
This works because
If the /g option is not used, m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, i.e., ($1, $2, $3 …).
and because my($variable) = ... (note the parentheses around the scalar) supplies list context to the match.
If the pattern fails to match, $variable gets the undefined value.
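A quick sketch of both outcomes, using a second pattern that is assumed not to match (the names $found and $missing are illustrative, and the // operator needs Perl 5.10 or newer):

```perl
use strict;
use warnings;

# Successful match: the first capture lands in $found.
my ($found) = "some string" =~ /(e\s*str)/;

# Failed match: the returned list is empty, so $missing is undef.
my ($missing) = "some string" =~ /(no such text)/;

print "found:   '$found'\n";                       # found:   'e str'
print "missing: ", $missing // '(undef)', "\n";    # missing: (undef)
```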
Why do you want it to be shorter? Does it really matter?
$variable = $1 if $variable =~ /(find something).*/;
If you are worried about the variable name or doing this repeatedly, wrap the thing in a subroutine and forget about it:
some_sub( $variable, qr/pattern/ );
sub some_sub { $_[0] = $1 if eval { $_[0] =~ m/$_[1]/ }; $1 };
However you implement it, the point of the subroutine is to make it reusable, so you give a particular set of lines a short name that stands in their place.
Several other answers mention a destructive substitution:
( my $new = $variable ) =~ s/pattern/replacement/;
I tend to keep the original data around, and Perl v5.14 has an /r flag that leaves the original alone and returns a new string with the replacement (instead of the count of replacements):
my $match = $variable =~ s/pattern/replacement/r;
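Applied to the "find something" example from the question, that might look like the sketch below. The sample string is an assumption; note that if the pattern does not match, /r returns the original string unchanged rather than undef.

```perl
use strict;
use warnings;
use v5.14;    # the /r flag needs Perl 5.14 or newer

my $variable = "find something soon";

# /r returns the modified copy and leaves $variable untouched.
my $match = $variable =~ s/^.*?(find something).*/$1/r;

print "$match\n";       # find something
print "$variable\n";    # find something soon
```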
Well, you could say
my $variable;
($variable) = ($variable = "find something soon") =~ /(find something).*/;
or
(my $variable = "find something soon") =~ s/^.*?(find something).*/$1/;
You can do substitution as:
$a = 'stackoverflow';
$a =~ s/(\w+)overflow/$1/;
$a is now "stack"
From Perl Cookbook 2nd ed
6.1 Copying and Substituting Simultaneously
$dst = $src;
$dst =~ s/this/that/;
becomes
($dst = $src) =~ s/this/that/;
I just assumed everyone did it this way, amazed that no one gave this answer.
Almost ....
You can combine the match and retrieve the matched value with a substitution.
$variable =~ s/.*(find something).*/$1/;
AFAIK, you will always have to copy the value though, unless you do not care about clobbering the original.
$variable2 = "stackoverflow";
(my $variable1) = ($variable2 =~ /stack(\w+)/);
$variable1 now equals "overflow".
I do this:
#!/usr/bin/perl
$target = "n: 123";
my ($target) = $target =~ /n:\s*(\d+)/g;
print $target; # the var $target now is "123"
Also, to amplify the accepted answer using the ternary operator to allow you to specify a default if there is no match:
my $match = $variable =~ /(pattern).*/ ? $1 : 'default value';