Trying to check an array of strings against an array of regexes.
Throws:
'Use of uninitialized value in $string in pattern match (m//) at myscript line '
If I take out the if statement, it still gives the warning but prints each
element in @string_list
foreach my $expr (@expr_list) {
    foreach my $string (@string_list) {
        if ($string =~ $expr) {
            print $string,"\n";
        }
    }
}
That means one of the elements of @string_list is undef.
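A minimal sketch of one way to guard against that, assuming the undef entries can simply be skipped. The helper name matching_strings and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: collect every string that matches a pattern,
# skipping undef entries so the match never sees an uninitialized value.
sub matching_strings {
    my ($exprs, $strings) = @_;
    my @found;
    for my $expr (@$exprs) {
        for my $string (@$strings) {
            next unless defined $string;    # skip undef entries
            push @found, $string if $string =~ $expr;
        }
    }
    return @found;
}

my @expr_list   = (qr/^foo/, qr/^cr/);
my @string_list = ('foobar', undef, 'baz', 'crowbar');

print "$_\n" for matching_strings(\@expr_list, \@string_list);
```

If the undef entries are unexpected, it is probably worth finding out where they come from rather than only skipping them.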
I need to grep a value from an array.
For example, I have these values:
@a = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl');
@Array = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl', 'branches/Soft/B2/c.tct', 'branches/Docs/A1/b.txt');
Now I need to loop over @a and find whether each value matches an element of @Array. For Example
It works for me with grep. You'd do it the exact same way as in the List::MoreUtils example below, except for having grep instead of any. You can also shorten it to
my $got_it = grep { /$str/ } @paths;
my @matches = grep { /$str/ } @paths;
This by default matches the pattern against $_, which grep sets to each element of the list in turn. In scalar context grep returns the number of matching elements; in list context it returns the elements themselves. The $str and @paths are the same as below.
You can use the module List::MoreUtils as well. Its function any returns true/false depending on whether the condition in the block is satisfied for any element in the list, i.e. whether there was a match in this case.
use warnings;
use strict;
use List::MoreUtils qw(any);
my $str = 'branches/Soft/a.txt';
my @paths = ('branches/Soft/a.txt', 'branches/Soft/b.txt',
    'branches/Docs/A1/b.txt', 'branches/Soft/B2/c.tct');
my $got_match = any { $_ =~ m/$str/ } @paths;
With the list above, containing the $str, the $got_match is 1.
Or you can roll it by hand and catch the match as well
foreach my $p (@paths) {
    print "Found it: $1\n" if $p =~ m/($str)/;
}
This does print out the match.
Note that the strings you show in your example do not contain the one to match. I added it to my list for a test. Without it in the list no match is found in either of the examples.
To test for more than one string, with the added sample
my @strings = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl');
my @paths = ('branches/Soft/a.txt', 'branches/Soft/h.cpp', 'branches/Main/utils.pl',
    'branches/Soft/B2/c.tct', 'branches/Docs/A1/b.txt');
foreach my $str (@strings) {
    foreach my $p (@paths) {
        print "Found it: $1\n" if $p =~ m/($str)/;
    }
    # Or, instead of the foreach loop above use
    # my $match = grep { /$str/ } @paths;
    # print "Matched for $str\n" if $match;
}
This prints
Found it: branches/Soft/a.txt
Found it: branches/Soft/h.cpp
Found it: branches/Main/utils.pl
When the lines with grep are uncommented and foreach ones commented out I get the corresponding prints for the same strings.
The dot in $a is a regex metacharacter and will pose a problem (the slashes are harmless once the string is interpolated from a variable), so you either have to escape it when doing a regex match or use a simple eq to find the matches:
Regex match with $a escaped:
my @matches = grep { /\Q$a\E/ } @array;
Simple comparison with "equals":
my @matches = grep { $_ eq $a } @array;
With your sample data both will give an empty array @matches because there is no match.
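To make the difference between the three approaches concrete, here is a small sketch. The list contents are invented for the demonstration, and the search string is held in $name rather than $a only because $a is also used by sort:

```perl
use strict;
use warnings;

my $name  = 'branches/Soft/a.txt';    # hypothetical search string
my @array = ('branches/Soft/a.txt', 'branches/Soft/a_txt', 'branches/Soft/a.txt.bak');

my @loose   = grep { /$name/ }     @array;   # '.' matches any character
my @literal = grep { /\Q$name\E/ } @array;   # '.' matches only a dot, still a substring match
my @exact   = grep { $_ eq $name } @array;   # whole-string equality

print scalar(@loose), " ", scalar(@literal), " ", scalar(@exact), "\n";   # prints "3 2 1"
```

Note that \Q...\E only makes the characters literal; the pattern can still match part of a longer string, so eq is the one to use when the whole value must be identical.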
This solved my question. Thanks to all, especially @zdim, for the valuable time and support.
my @SVNFILES = ('branches/Soft/a.txt', 'branches/Soft/b.txt');
my @paths = ('branches/Soft/a.txt', 'branches/Soft/b.txt',
    'branches/Docs/A1/b.txt', 'branches/Soft/B2/c.tct');
foreach my $svn (@SVNFILES)
{
    chomp ($svn);
    my $m = grep { /$svn/ } (@paths);
    if ( $m == 0 ) {
        print "Files Mismatch\n";
        exit 1;
    }
}
You should escape characters like '.' in a regex when you need them to match literally; '/' also needs escaping when it is the pattern delimiter.
For example:
$a = 'branches\/Soft\/a\.txt';   # single quotes, so the backslashes survive
Retry whatever you did with either grep or perl with that. If it still doesn't work, tell us precisely what you tried.
Kindly explain why this issue occurs.
My data file:
DATA----1
DATA----2
DATA----3
DATA----4
DATA----5
DATA----6
DATA----7
SAMPLE----1
SAMPLE----12
SAMPLE----13
SAMPLE----2
SAMPLE----3
SAMPLE----4
SAMPLE----5
OTHER----1
OTHER----2
OTHER----3
I need every line that starts with DATA or SAMPLE in one array, and another array should contain the lines that start with SAMPLE and end with a two-digit number.
I got the expected output with the following script:
use strict;
use warnings;
open(FH, "di.txt");
my @file = <FH>;
close(FH);
my @arr2 = grep { $_ =~ m/^SAMPLE.+\d\d$/g } @file; ## this array prints
my @arr1 = grep { $_ =~ m/^DATA|^SAMPLE/g } @file;
print @arr1,"\n\t~~~~~~~~~~~\n\n",@arr2;
I first wrote it as
use strict;
use warnings;
open(FH, "di.txt");
my @file = <FH>;
close(FH);
my @arr1 = grep { $_ =~ m/^DATA|^SAMPLE/g } @file;
my @arr2 = grep { $_ =~ m/^SAMPLE.+\d\d$/g } @file; ## this doesn't print
print @arr1,"\n\t~~~~~~~~~~~\n\n",@arr2;
When I run this one, it prints only @arr1.
What would be the reason @arr2 doesn't print?
The problem is because of the behaviour of the global match /g option in scalar context
Every scalar variable has a marker that remembers where the most recent global match left off, and hence where the next one should start searching. It enables the use of the \G anchor in regex patterns, as well as while loops like this
my $s = 'aaabacad';
while ( $s =~ /a(.)/g ) {
print "$1 ";
}
which prints
a b c d
In truth you're not interested in a global match in this case, you just want to discover whether or not the pattern can be found in the string. The grep operator applies scalar context to its first parameter, so in using the /g option in this statement
my @arr1 = grep { $_ =~ m/^DATA|^SAMPLE/g } @file;
you have left every matching element of @file with the marker set to right after DATA or SAMPLE. That means the next match on the same element, m/^SAMPLE.+\d\d$/g, will start looking from there and can't even match the ^ anchor, so the match fails
The pos function gives you access to the marker, and you can fix your original code by resetting it to the start of the string after the first grep call. If you write this instead
my @arr1 = grep { $_ =~ m/^DATA|^SAMPLE/g } @file;
pos($_) = 0 for @file;
my @arr2 = grep { $_ =~ m/^SAMPLE.+\d\d$/g } @file; ## this now prints
then the output will be what you expected
The correct fix, however, is to write what you mean anyway, which means you should remove the /g option from the pattern matches. This code also works fine, and it's also more concise, more readable, and far less fragile
my @arr1 = grep /^DATA|^SAMPLE/, @file;
my @arr2 = grep /^SAMPLE.+\d\d$/, @file;
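If you want to see the marker for yourself, the following sketch inspects pos after the first grep. It uses a shortened, made-up copy of the data instead of reading di.txt:

```perl
use strict;
use warnings;

# Shortened stand-in for the contents of the data file.
my @file = ("DATA----1\n", "SAMPLE----12\n", "SAMPLE----2\n", "OTHER----1\n");

# First pass with /g: each successful match leaves pos() set on that element.
my @first     = grep { /^DATA|^SAMPLE/g } @file;
my @positions = map { pos($_) // 'undef' } @file;

# Second pass with /g resumes at the saved position, so ^ cannot match.
my @second_g = grep { /^SAMPLE.+\d\d$/g } @file;

# Without /g the search always starts at the beginning of the string.
my @second_plain = grep { /^SAMPLE.+\d\d$/ } @file;

print "positions: @positions\n";                                    # positions: 4 6 6 undef
print scalar(@second_g), " vs ", scalar(@second_plain), "\n";       # 0 vs 1
```

The failed /g matches in the second pass reset the markers again, which is why the plain match afterwards (and any later /g match) sees the strings from the start.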
This conditional must match either telco_imac_city or telco_hier_city. When it succeeds I need to extract up to the second underscore of the value that was matched.
I can make it work with this code
if ( ($value =~ /(telco_imac_)city/) || ($value =~ /(telco_hier_)city/) ) {
print "value is: \"$1\"\n";
}
But if possible I would rather use a single regex like this
$value = $ARGV[0];
if ( $value =~ /(telco_imac_)city|(telco_hier_)city/ ) {
print "value is: \"$1\"\n";
}
But if I pass the value telco_hier_city I get this output on testing the second value
Use of uninitialized value $1 in concatenation (.) or string at ./test.pl line 19.
value is: ""
What am I doing wrong?
while (<$input>){
chomp;
print "$1\n" if /(telco_hier|telco_imac)_city/;
}
Perl numbers capture groups by the position of their opening parenthesis across the whole pattern, not separately for each alternative. Your input, telco_hier_city, matches the second alternative of that single regex (/(telco_imac_)city|(telco_hier_)city/), so the prefix lands in the second group and you'd need to use $2:
my $value = $ARGV[0];
if ( $value =~ /(telco_imac_)city|(telco_hier_)city/ ) {
print "value is: \"$2\"\n";
}
Output:
$> ./conditionalIfRegex.pl telco_hier_city
value is: "telco_hier_"
Because there was no match in your first capture group ((telco_imac_)), $1 is uninitialized, as expected.
To fix your original code, use FlyingFrog's regex:
my $value = $ARGV[0];
if ( $value =~ /(telco_hier_|telco_imac_)city/ ) {
print "value is: \"$1\"\n";
}
Output:
$> ./conditionalIfRegex.pl telco_hier_city
value is: "telco_hier_"
$> ./conditionalIfRegex.pl telco_imac_city
value is: "telco_imac_"
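If for some reason you wanted to keep the two-group form of the regex, a possible sketch is to take whichever group is defined. The helper name telco_prefix is made up, and the // operator assumes Perl 5.10 or later:

```perl
use strict;
use warnings;

# Hypothetical helper: return the prefix captured from either alternative.
sub telco_prefix {
    my ($value) = @_;
    # Two groups: $1 belongs to the first alternative, $2 to the second.
    if ($value =~ /(telco_imac_)city|(telco_hier_)city/) {
        return $1 // $2;    # whichever group took part in the match
    }
    return;                 # no match at all
}

for my $value (qw(telco_imac_city telco_hier_city other_city)) {
    my $prefix = telco_prefix($value);
    print "$value => ", (defined $prefix ? "\"$prefix\"" : 'no match'), "\n";
}
```

The single group with the alternation inside it is still the simpler choice, since $1 is then always the one to read.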
I want to compare two numbers isolated from this sample data:
'gi|112807938|emb|CU075707.1|_Xenopus_tropicalis_finished_cDNA,_clone_TNeu129d01 C1:TCONS_00039972(XLOC_025068),_12.9045:32.0354,_Change:1.3118,_p:0.00025,_q:0.50752 C2:TCONS_00045925(XLOC_029835),_10.3694:43.8379,_Change:2.07985,_p:0.0004,_q:0.333824',
'gi|115528274|gb|BC124894.1|_Xenopus_laevis_islet-1,_mRNA_(cDNA_clone_MGC:154537_IMAGE:8320777),_complete_cds C1:TCONS_00080221(XLOC_049570),_17.9027:40.8136,_Change:1.18887,_p:0.00535,_q:0.998852 C2:TCONS_00092192(XLOC_059015),_17.8995:35.5534,_Change:0.990066,_p:0.0355,_q:0.998513',
'gi|118404233|ref|NM_001078963.1|_Xenopus_(Silurana)_tropicalis_pancreatic_lipase-related_protein_2_(pnliprp2),_mRNA C1:TCONS_00031955(XLOC_019851),_0.944706:5.88717,_Change:2.63964,_p:0.01915,_q:0.998852 C2:TCONS_00036655(XLOC_023660),_2.31819:11.556,_Change:2.31757,_p:0.0358,_q:0.998513',
using the following regex:
#!/usr/bin/perl -w
use strict;
use File::Slurp;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
my (@log_change, @largest_change);
foreach (@intersect) {
    chomp;
    my @condition1_match = ($_ =~ /C1:.*?Change:(-?\d+\.\d+)|C1:.*?Change:(-?inf)/); # Sometimes the value is 'inf' or '-inf'. This allows either a numerical or inf value to be captured.
    my @condition2_match = ($_ =~ /C2:.*?Change:(-?\d+\.\d+)|C2:.*?Change:(-?inf)/);
    push @log_change, "@condition1_match\t@condition2_match";
}
print Dumper (\@log_change);
Which gives this output:
'1.3118 2.07985 ',
'1.18887 0.990066 ',
'2.63964 2.31757 ',
Ideally, within the same loop I now want to make a comparison between the values held in @condition1_match and @condition2_match such that the larger value is pushed onto a new array, unless comparing against a non-numerical 'inf', in which case push the numerical value.
Something like this:
my (@log_change, @largest_change);
foreach (@intersect) {
    chomp;
    my @condition1_match = ($_ =~ /C1:.*?Change:(-?\d+\.\d+)|C1:.*?Change:(-?inf)/);
    my @condition2_match = ($_ =~ /C2:.*?Change:(-?\d+\.\d+)|C2:.*?Change:(-?inf)/);
    push @log_change, "@condition1_match\t@condition2_match";
    unless ($_ =~ /Change:-?inf/) {
        if (@condition1_match > @condition2_match) {
            push @largest_change, @condition1_match;
        }
        else {
            push @largest_change, @condition2_match;
        }
    }
}
print Dumper (\@largest_change);
Which gives:
'2.07985',
undef,
'0.990066',
undef,
'2.31757',
undef,
as well as a lot of this error message:
Use of uninitialized value $condition2_match[1] in join or string at intersect.11.8.pl line 114.
I'm unsure as to what exactly the error message means, as well as why I'm getting undef values in my @largest_change
As you've written your code, @condition1_match and @condition2_match will be created with 2 elements -- corresponding to the 2 capture groups in your regular expression -- each time there is a match. But one of these elements will always necessarily be undef, leading to the uninitialized ... warnings.
In this case, you can repair this program by putting the | inside the capture group:
my ($condition1_match) = ($_ =~ /C1:.*?Change:(-?\d+\.\d+|-?inf)/);
my ($condition2_match) = ($_ =~ /C2:.*?Change:(-?\d+\.\d+|-?inf)/);
so that there is a single capture group and the matching operation produces a list with a single, defined element.
In addition, the comparison
if (@condition1_match > @condition2_match) {
is probably not doing what you think it is doing. In Perl, a numerical comparison between two arrays is a comparison of array lengths. What you apparently mean to do is to compare the defined value in each of those arrays, so you would need to do something more cumbersome like:
my $condition1_match = $condition1_match[0] // $condition1_match[1];
my $condition2_match = $condition2_match[0] // $condition2_match[1];
if ($condition1_match > $condition2_match) {
    push @largest_change, $condition1_match;
} else {
    push @largest_change, $condition2_match;
}
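A short sketch of the array-length point, using hand-written arrays in the shape the two-group regex produces (the names @c1 and @c2 are made up, and the values are copied from the first line of the sample output):

```perl
use strict;
use warnings;

my @c1 = ('1.3118',  undef);    # shape of a two-group match result
my @c2 = ('2.07985', undef);

# In numeric context an array yields its element count, so this compares 2 with 2.
my $by_length = (@c1 > @c2) ? 'first' : 'second or equal';

# Picking out the defined element compares the captured numbers themselves.
my ($v1) = grep { defined } @c1;
my ($v2) = grep { defined } @c2;
my $larger = $v1 > $v2 ? $v1 : $v2;

print "array lengths: ", scalar(@c1), " and ", scalar(@c2), "\n";   # array lengths: 2 and 2
print "larger value: $larger\n";                                    # larger value: 2.07985
```

Since both arrays always have two elements, the length comparison is never true and the else branch runs every time, which is why the original loop only ever pushed @condition2_match (including its undef).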
I have a strange problem in matching a pattern.
Consider the Perl code below
#!/usr/bin/perl -w
use strict;
my @Array = ("Hello|World","Good|Day");
function();
function();
function();
sub function
{
    foreach my $pattern (@Array)
    {
        $pattern =~ /(\w+)\|(\w+)/g;
        print $1."\n";
    }
    print "\n";
}
__END__
The output I expect should be
Hello
Good
Hello
Good
Hello
Good
But what I get is
Hello
Good
Use of uninitialized value $1 in concatenation (.) or string at D:\perlfiles\problem.pl line 28.
Use of uninitialized value $1 in concatenation (.) or string at D:\perlfiles\problem.pl line 28.
Hello
Good
What I observed was that the pattern matches only on alternate calls.
Can someone explain to me what the problem is with this code?
To fix this I changed the function subroutine to something like this:
sub function
{
    my $string;
    foreach my $pattern (@Array)
    {
        $string .= $pattern."\n";
    }
    while ($string =~ m/(\w+)\|(\w+)/g)
    {
        print $1."\n";
    }
    print "\n";
}
Now I get the output as expected.
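It is the global /g modifier that is at work. In scalar context a /g match remembers the position where the last successful match on that string ended and resumes from there on the next attempt. When a match fails, the position is reset, so the attempt after that starts over from the beginning of the string. That is why your matches succeed only on every other call.
Remove the /g modifier, and it will act as you expect.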
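The alternating behaviour can be reproduced on a single string. This is only an illustrative sketch; the variable names @with_g and @without_g are made up:

```perl
use strict;
use warnings;

my $pattern = 'Hello|World';

my @with_g;
for (1 .. 4) {
    # Scalar-context /g resumes from pos($pattern); a failed match resets it.
    push @with_g, ($pattern =~ /(\w+)\|(\w+)/g ? $1 : 'no match');
}

my @without_g;
for (1 .. 4) {
    # Without /g every attempt starts at the beginning of the string.
    push @without_g, ($pattern =~ /(\w+)\|(\w+)/ ? $1 : 'no match');
}

print join(', ', @with_g), "\n";      # Hello, no match, Hello, no match
print join(', ', @without_g), "\n";   # Hello, Hello, Hello, Hello
```

Checking the result of the match before using $1, as the ternary does here, also avoids the uninitialized-value warning when a match fails.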