Running regexp on the same variable twice

How can I run a regexp more than once on the same variable?
I want
$$contRef =~ /Kurs.*?(\d+,?\d*)<\/span/msgi;
and
$$contRef =~ /Price(\d+,?\d*)<\/span/msgi;
IIRC a regexp match sets a position pointer, and if I want to search for another pattern from the beginning I need to reset it. I don't want to copy the content to another variable.
I want to reset the pointer so that the search starts from the beginning again.

What you're looking for is exactly what you just pasted.
$$contRef =~ /Kurs.*?(\d+,?\d*)<\/span/msgi;
# Do stuff with $1
# ...
# ...
$$contRef =~ /Price(\d+,?\d*)<\/span/msgi;
# Do new stuff with the new $1
# ...
# ...

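Maybe your unusual use of variables is the problem? The code below uses a symbolic reference ($contRef holds the name of the variable $a), which only works without use strict. It runs and prints 9922: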
$a="Kursplat99</spanPrice22</span";
$contRef = "a";
$$contRef =~ /Kurs.*?(\d+,?\d*)<\/span/msgi;
print $1;
$$contRef =~ /Price(\d+,?\d*)<\/span/msgi;
print $1;
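The concern in the question is real when the /g modifier is used in scalar context: Perl remembers the match position for the variable, and the next /g match on it starts from there. The following is a minimal sketch, assuming made-up sample data (the $content string is hypothetical) in which "Price" comes before "Kurs"; it rewinds the position with pos() before the second match.

```perl
use strict;
use warnings;

# Hypothetical sample data: "Price" appears before "Kurs".
my $content = 'Price22</span>Kursplat99</span>';

# A /g match in scalar context leaves pos($content) at the end of the match.
my $kurs  = $content =~ /Kurs.*?(\d+,?\d*)<\/span/gi ? $1 : undef;
my $where = pos($content);    # defined: the next /g match would start here

# Rewind so that the next /g match starts at the beginning of the string.
pos($content) = 0;
my $price = $content =~ /Price(\d+,?\d*)<\/span/gi ? $1 : undef;

print "Kurs: $kurs, Price: $price\n";    # Kurs: 99, Price: 22
```

Dropping /g from both matches avoids the issue altogether, since a match without /g always starts at the beginning of the string.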

As has been suggested in the comments, perhaps a while, alternation (|) and an array for the captures would be helpful here:
use strict;
use warnings;
use Data::Dumper;
my $string = "Kursplat99</span>Price22</span";
my @array;
push @array, $1 while $string =~ /(?:Kurs.*?|Price)(\d+,?\d*)<\/span/msgi;
print Dumper \@array if @array;
Output:
$VAR1 = [
'99',
'22'
];

Related

Regex Group: $1 in a string before replacement happens

I have the following code, which replaces a string with a part of it using $1:
my $string = "abcd1234";
print "$string\n";
$string =~ s/c(d1)2/<found: $1>/;
print "$string\n";
Output:
abcd1234
ab<found: d1>34
Now I want to have variables, which contain the condition and the replacement. But if I do it like this, an error occurs:
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $rule = "<found: $1>";
$string =~ s/$cond/$rule/;
print "$string\n";
Output:
Use of uninitialized value $1 in concatenation (.) or string
abcd1234
ab<found: >34
I understand that $1 doesn't exist yet on the line where $rule is defined. But how can I put a placeholder there?

You can't delay the interpretation of a variable within a quoted string like that. $1 is substituted as soon as the line is executed.
To do what you seem to want, you would create a sprintf template and execute it in the replacement part of the substitution with the /e flag.
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $replacement_fmt= "<found: %s>";
$string =~ s/$cond/sprintf($replacement_fmt, $1)/e;
print "$string\n";
## ab<found: d1>34
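A related option, sketched here as an assumption rather than something taken from the answers, is to store the replacement as a code reference and call it with /e. The name $rule and its single-argument signature are illustrative only.

```perl
use strict;
use warnings;

my $string = "abcd1234";
my $cond   = qr/c(d1)2/;

# Hypothetical rule: a closure that builds the replacement from the capture.
my $rule = sub {
    my ($captured) = @_;
    return "<found: $captured>";
};

# /e evaluates the replacement as code, so $1 is read after the match.
$string =~ s/$cond/$rule->($1)/e;
print "$string\n";    # ab<found: d1>34
```

Because the closure body is ordinary compiled code, nothing from the rule is evaluated as a string at run time, which is the risk with /ee.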

You can use sub_modify from String::Substitution:
use String::Substitution qw(sub_modify);
my $string = "abcd1234";
print "$string\n";
my $cond = qr/c(d1)2/;
my $rule = '<found: $1>'; # Single quotes!
sub_modify($string, $cond, $rule);
print "$string\n";
For completeness, note that it's also possible to do this using s/// with the /ee modifier. However, you shouldn't use it, as it can lead to security issues and various bugs.

How to capture every match in a global regex substitution?

I realize it is possible to achieve this with a slight workaround, but I am hoping there is a simpler way (since I often make use of this type of expression).
Given the example string:
my $str = "An example: sentence!*";
A regex can be used to match each punctuation mark and capture them in an array.
Thereafter, I can simply repeat the regex and replace the matches as in the following code:
push(@matches, $1) while $str =~ /([\*\!:;])/g;
$str =~ s/([\*\!:;])//g;
Would it be possible to combine this into a single step in Perl where substitution occurs globally while also keeping tabs on the replaced matches?

You can embed code to run in your regular expression:
my @matches;
my $str = 'An example: sentence!*';
$str =~ s/([\*\!:;])(?{push @matches, $1})//g;
But with a match this simple, I'd just do the captures and substitution separately.

Yes, it's possible.
my @matches;
$str =~ s/[*!:;]/ push @matches, $&; "" /eg;
However, I'm not convinced that the above is faster or clearer than the following:
my @matches = $str =~ /[*!:;]/g;
$str =~ tr/*!:;//d;

Use:
use feature 'say';
use Data::Dumper;
my $str = "An example: sentence!*";
my @matches = $str =~ /([\*\!:;])/g;
say Dumper \@matches;
$str =~ tr/*!:;//d;
Output:
$VAR1 = [
':',
'!',
'*'
];

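Is that what you're looking for?
my ($str, @matches) = ("An example: sentence!*");
# First method: records only one capture, because after s///g
# $1 holds the capture from the last successful match.
($str =~ s/([\*\!:;])//g) && push(@matches, $1);
# Second method: remove one match per iteration (no /g), so that
# every removed character is pushed.
push(@matches, $1) while ($str =~ s/([\*\!:;])//);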

Try:
my $str = "An example: sentence!*";
my @mys;
push(@mys, ($str=~m/([^\w\s])/g));
print join "\n", @mys;
Thanks.

Put regex match only into array, not entire line

I am trying to check each line of a document for a regex match.
If the line has a match, I want to push the match only into an array.
In the code below, I thought that using the g modifier at the end of the regex would make $line's value the regex match only. Instead, $line's value is the entire line of the document containing the match...
my $line;
my @table;
while($line = <$input>){
    if($line =~ m/foo/g){
        push (@table, $line);
    }
}
print @table;
If any one could help me get my matches into an array, it is much appreciated.
Thanks.
p.s.
Still learning... so any explanations of concepts I may have missed is also much appreciated.

The g modifier means "global": in s///g it replaces every occurrence, and in m//g it finds every match. It never changes what $line contains.
If you just want to push the matched text into an array, you need to capture it by enclosing that part of the pattern in (). Captured elements are stored in the variables $1, $2, etc.
Try following modification to your code:
my @table;
while(my $line = <$input>){
    if($line =~ m/(foo)/){
        push (@table, $1);
    }
}
print @table;
Refer to this documentation for more details.
Or if you want to avoid needless use of global variables,
my @table;
while(my $line = <$input>){
    if(my @captures = $line =~ m/(foo)/){
        push @table, @captures;
    }
}
which simplifies to
my @table;
while(my $line = <$input>){
    push @table, $line =~ m/(foo)/;
}
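The g modifier does become useful here once the match is evaluated in list context, because m//g then returns every match on the line rather than just the first. A minimal sketch, assuming a hypothetical @lines array in place of the input file and a slightly widened pattern (foo\w*) so that the individual matches are distinguishable:

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the input file.
my @lines = ("herp de derp foo\n", "yerp fool foo flerp\n", "heyhey\n");

my @table;
for my $line (@lines) {
    # In list context, m//g returns all captures from all matches on the line.
    push @table, $line =~ m/(foo\w*)/g;
}

print join(',', @table), "\n";    # foo,fool,foo
```

Lines without a match contribute an empty list, so nothing is pushed for them.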

Expanding on jkshah's answer a little, I'm explicitly storing the matches in @matches instead of using the magic variable $1, which I find a little harder to read.
"__DATA__" is a simple way to store lines in a filehandle in a Perl source file.
use strict;
use warnings;
my @table;
while(my $line = <DATA>){
    my @matches = $line =~ m/(foo)/;
    if(@matches) {
        warn "found: " . join(',', @matches );
        push(@table,@matches);
    }
}
print @table;
__DATA__
herp de derp foo
yerp fool foo flerp
heyhey

If your file is not very big (100-500 MB is fine with 2 GB of RAM), you can use the approach below. Here I extract every number found in the file contents. It should be much faster than a foreach loop over the lines.
#!/usr/bin/perl
open my $file_h, '<', 'abc' or die "ERROR-$!";
my @file = <$file_h>;
my $file_cont = join(' ', @file);
@file = ();
my @match = $file_cont =~ /\d+/g;
print "@match";

Different ways to test for $1 after regex?

Normally when I check if the regex succeeded I do
if ($var =~ /aaa(\d+)bbb(\d+)/) {
    # $1 and $2 should be defined now
}
but I recall seeing a variation of this that seemed shorter. Perhaps it was only with one buffer.
Can anyone think of other ways to test for $1 after a successful regex?

You can avoid $1 and similar altogether:
if (my ($anum, $bnum) = $var =~ /aaa(\d+)bbb(\d+)/) {
# Work with $anum and $bnum
}
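Named captures are another way to avoid the numbered variables. This is a sketch under assumptions: it needs Perl 5.10 or later, and the sample value of $var and the %found hash are made up for illustration.

```perl
use strict;
use warnings;

my $var = 'aaa12bbb34';    # hypothetical sample input
my %found;

if ($var =~ /aaa(?<anum>\d+)bbb(?<bnum>\d+)/) {
    # %+ holds the named captures of the last successful match; copy them
    # so that later matches cannot overwrite the values.
    %found = %+;
}

print "$found{anum} $found{bnum}\n";    # 12 34
```

The names document what each group means, which helps when a pattern has more than two or three groups.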

The only shorter way that I can think of is if the match is on $_. So for instance:
for (@strings) {
if (m/aaa(\d+)bbb(\d+)/) {
...
If the match succeeds then $1 and $2 will be populated.

Never forget about
use strict;
use warnings;
I like plain syntax in Perl, but not in this way:
my $str = 'abc101abc';
$str =~ m/(\d+)/ and do {print $1;};
OR
$str =~ m/(\d+)/ and print $1;
OR
(with $str in $_, i.e. $_ = $str;)
m/(\d+)/ and print $1;
BUT! TIMTOWTDI helps you to dream about your own style :)
I prefer the old if style.

Reading both answers, I now recall that this was what I had seen
my $str = 'abc101abc';
$str =~ m/(\d+)/;
print $1 if $1;
print $1 if $str =~ m/(\d+)/;

How to save a matching regex's value to a variable in one line of perl?

I'm sure there is a very simple way to do this, but whenever I search for examples, I get the two step method. Here is what I typically do:
$data =~ m/(my_query)/;
$result = $1;
I want to set $result in the same line as the regex and never use $1. Thanks!

my($result) = ($data =~ m/(my_query)/);
As noted in a comment, my($result) needs the parentheses to provide list context (often called array context) for the result of the match. In list context, the match returns the captures ($1 etc.), which are then assigned to the list on the left. You could use @result = ($data =~ m/(my_query)/); you could omit the my, but you would need to keep the parentheses; or you could subscript the list using $result = ($data =~ m/(my_query)/)[0]; (thanks ysth). The key words here are 'list context'.
Examples:
$ perl -e '$data="abcdef";my($result)=($data =~ m/(cde)/); print "$result\n"'
cde
$ perl -e '$data="abcdef"; ($result)=($data =~ m/(cde)/); print "$result\n"'
cde
$ perl -e '$data="abcdef"; @result =($data =~ m/(cde)/); print "$result[0]\n"'
cde
$ perl -e '$data="abcdef"; $result =($data =~ m/(cde)/)[0]; print "$result\n"'
cde
$

You didn't specify what problem you want to avoid, but there is definitely one to avoid. A failed match does not reset $1, so the following code assigns whatever an earlier successful match left there (or undef) to $result when the pattern doesn't match:
$data =~ /(my_query)/;
my $result = $1;
You could use a conditional to assign something useful to $result when the pattern doesn't match:
my $result = $data =~ /(my_query)/ ? $1 : undef;
Or you could take advantage of the fact that m// in list context returns what it captured.
my ($result) = $data =~ /(my_query)/;
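The difference between the two-step form and the list-context form can be seen in a small sketch. The strings $first and $second and the variable names $r1 to $r3 are made up for illustration.

```perl
use strict;
use warnings;

my $first  = 'abc my_query def';    # hypothetical input that matches
my $second = 'nothing to see';      # hypothetical input that does not

$first =~ /(my_query)/;
my $r1 = $1;                        # 'my_query'

$second =~ /(my_query)/;            # fails, so $1 is left untouched
my $r2 = $1;                        # stale: still 'my_query'

# In list context a failed match returns an empty list, so $r3 is undef.
my ($r3) = $second =~ /(my_query)/;

print "r2 = $r2\n";                                      # r2 = my_query
print "r3 is ", (defined $r3 ? $r3 : 'undef'), "\n";     # r3 is undef
```

Testing defined $r3 afterwards is then enough to tell whether the match succeeded.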

$data="abcde";
$data =~ s/(cde)/$result=$1/e;
