Search pattern for decoding logic - regex

I need to decode logic from a database I reference when determining if an attribute is able to be associated with an item I am defining. The database uses standard logic flow and order of importance. I need to be able to determine nested conditional statements in the strings I pull from the database. For example:
$val = "(red && (blue || yellow)) && black";
The string I'd pull from the database would resemble:
$pull = "red.yellow.black";
I would separate $pull on . and store the values in an array, then I'd verify each value is in the array(s) I generate from the $val string. I am having trouble with the best method for determining how to unpack the nested logic though. Initially, I considered just using a regex, then removing the portion of the logic that has been assessed like so:
($eval) = $val =~ /.*\((.*)?\)/; $val =~ s/(.*)\((.*)?\)(.*)/$1 $2/;
If I do this in a while ($val =~ /\(/) loop, I could probably stack the extracted logic into arrays and evaluate the logical expression in each element to determine if each condition is true for the item I am evaluating, but there's something wrong with my regexp causing the sequence to fail. The ? is not as lazy as I thought, apparently...
my $val = "(red && (blue || yellow)) && black";
my $pull = "red.yellow.black";
my @stack = ();
until ($val !~ /\(/) {
my ($eval) = $val =~ /.*\((.*)?\)/;
push(@stack, $eval);
print "$eval\n";
$val =~ s/(.+)\((.*)?\)(.*)/$1 $2/;
print "$val\n";
}
If I just run the sequence in a perl shell, with some debugging info, I get this:
[bin]$ perl -e 'my $val = "((red && (blue || yellow) && black)"; my $pull = "red.yellow.black"; my @stack = (); until ($val !~ /\(/) { my ($eval) = $val =~ /.*\((.+)?\).?/; push(@stack, $eval); print "EVAL: $eval\n";$pause = <STDIN>;$val =~ s/(.*)\((.*)?\)(.*)/$1 $2/;print "VAL: $val\n"; }'
EVAL: blue || yellow) && black
VAL: ((red && blue || yellow) && black
EVAL: red && blue || yellow
VAL: ( red && blue || yellow
EVAL:
Any input on what I'm doing wrong would be appreciated, and any improvements on efficiency would be greatly appreciated! TIA
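One likely culprit in the pattern above: (.*)? makes the group optional, it does not make it lazy (the lazy form is .*?), and the leading .* is greedy as well. A minimal sketch of the "peel off one group at a time" idea, assuming the parentheses are balanced; the sub name innermost_groups is hypothetical. The character class [^()]* cannot cross a parenthesis, so every match is an innermost group.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the parenthesised groups of $expr,
# innermost first. Assumes the parentheses are balanced.
sub innermost_groups {
    my ($expr) = @_;
    my @stack;
    # [^()]* cannot contain a parenthesis, so each match is an innermost
    # group; replacing it with its contents exposes the next level.
    while ( $expr =~ s/\(([^()]*)\)/$1/ ) {
        push @stack, $1;
    }
    return @stack;
}

my @groups = innermost_groups("(red && (blue || yellow)) && black");
print "$_\n" for @groups;
```

Note that splicing the contents back in place drops the grouping, so the second group reads "red && blue || yellow". If the grouping matters for evaluation, substituting a placeholder for each extracted group instead would preserve it.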
Update: OK, so I just simplified the whole thing like this. I lose some of the order of operations, but if I evaluate each statement individually, I'll pick up whenever one of the conditions is broken and can act on that accordingly. The code I am going to expand on is below. It just returns the individual components of the expression. I don't like how everything outside the parentheses is concatenated into the first element, but I guess it still works fundamentally.
my $val = "(red && (blue || yellow)) && black";
$val =~ s/\s+//g;
my @chars = ();
my @ops = ();
while ($val) {
my ($char) = $val =~ /(.)/;
$val =~ s/.//;
push #chars, $char;
}
my $index = 0;
foreach my $char (@chars) {
if ($char =~ /\(/) {
$index++;
} elsif ($char =~ /\)/) {
$index--
}
#assign the character to the index.
$ops[$index] .= $char;
}
s/[()]//g for @ops;
print "@ops\n";
Output:
[/bin]$ perl decode_logic.pl
&&black red&& blue||yellow
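If keeping the order of operations matters, one alternative to splitting by nesting depth is a small recursive-descent evaluator. The following is only a sketch under assumptions: the expressions contain nothing but names, &&, || and parentheses, && binds tighter than ||, and the sub names (evaluate, parse_or, parse_and, parse_primary) are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical evaluator. %$have maps each attribute the item has to a
# true value. Characters that are not part of a token are silently skipped.
sub evaluate {
    my ($expr, $have) = @_;
    my @tokens = $expr =~ /\s*(&&|\|\||[()]|\w+)/g;
    my $result = parse_or(\@tokens, $have);
    die "Unexpected token '$tokens[0]'\n" if @tokens;
    return $result;
}

# or_expr := and_expr ( '||' and_expr )*
sub parse_or {
    my ($tokens, $have) = @_;
    my $value = parse_and($tokens, $have);
    while ( @$tokens && $tokens->[0] eq '||' ) {
        shift @$tokens;
        my $rhs = parse_and($tokens, $have);   # always parsed, to consume its tokens
        $value ||= $rhs;
    }
    return $value;
}

# and_expr := primary ( '&&' primary )*
sub parse_and {
    my ($tokens, $have) = @_;
    my $value = parse_primary($tokens, $have);
    while ( @$tokens && $tokens->[0] eq '&&' ) {
        shift @$tokens;
        my $rhs = parse_primary($tokens, $have);
        $value &&= $rhs;
    }
    return $value;
}

# primary := NAME | '(' or_expr ')'
sub parse_primary {
    my ($tokens, $have) = @_;
    my $token = shift @$tokens;
    die "Unexpected end of expression\n" unless defined $token;
    if ( $token eq '(' ) {
        my $value = parse_or($tokens, $have);
        my $close = shift @$tokens;
        die "Missing ')'\n" unless defined $close && $close eq ')';
        return $value;
    }
    die "Unexpected token '$token'\n" unless $token =~ /\A\w+\z/;
    return $have->{$token} ? 1 : 0;
}

# Build the lookup from the dotted string pulled from the database.
my %have = map { $_ => 1 } split /\./, "red.yellow.black";
print evaluate("(red && (blue || yellow)) && black", \%have)
    ? "match\n"
    : "no match\n";
```

Each grammar rule gets its own sub, so nesting is handled by ordinary recursion rather than by repeated regex substitution.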

Related

using ~~ correctly

I am trying to parse through a simple enough file of field and value pairs.
There are some fields I am not interested in and want to skip.
So in my "play" code I had a static check like this:
next if $field =~ m/fieldToIgnore1|fieldToIgnore2/;
... then I extended this to an array and was still happy:
print "== using ~~ ==\n";
foreach my $field (@fields) {
next if $field ~~ @foni;
print "$field\n";
}
(foni == fields not of interest)
But when I carry that back over into my non-play setup, it doesn't work.
In the play code I was just looping over
my @fields = ("field1", "field2");
my @foni = ("fieldToIgnore1", "fieldToIgnore2");
In my proper code I go through each line, take out the lines that are set up as field - value pairs, and then strip the field out into a scalar. That is why I thought the same idea as my play code would work, but it doesn't seem to:
while ( <$infile> ) {
if ( /^PUBLISH:/ ) {
( $symbol, $record_type ) = ( $1, $2 );
print "symbol is: [$symbol]\n";
} else {
my ( $field, $value ) = split(/\|/);
next unless $value;
print "field is: [$field]\n";
print "value is: [$value]\n";
$field =~ s/^\s+|\s+$//g;
$value =~ s/^\s+|\s+$//g;
print "... field is: [$field]\n";
print "... value is: [$value]\n";
## ADD FIELD SKIPPING LOGIC HERE
You can build a regex pattern from your array, like this
my $re = join '|', @foni;
$re = qr/$re/; # Compile the regex
for my $field (#fields) {
next if $field =~ $re;
...
}
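One caveat: the joined pattern above is unanchored, so it also skips any field that merely contains one of the names (for example "fieldToIgnore10"). A sketch of two exact-match variants, assuming the names should match whole fields only; the helper names keep_fields and ignore_regex are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fields that are not in the ignore list,
# using a hash as an exact-match lookup table.
sub keep_fields {
    my ($fields, $ignore) = @_;
    my %skip = map { $_ => 1 } @$ignore;
    return grep { !$skip{$_} } @$fields;
}

# Hypothetical helper: builds an anchored alternation from the names,
# escaping any regex metacharacters they might contain.
sub ignore_regex {
    my @names = @_;
    my $alternation = join '|', map { quotemeta($_) } @names;
    return qr/\A(?:$alternation)\z/;
}

my @foni   = ("fieldToIgnore1", "fieldToIgnore2");
my @fields = ("field1", "fieldToIgnore1", "fieldToIgnore10", "field2");

print "$_\n" for keep_fields(\@fields, \@foni);

my $re = ignore_regex(@foni);
for my $field (@fields) {
    next if $field =~ $re;
    print "$field\n";
}
```

The hash lookup avoids regex escaping entirely; the anchored regex is closer to the original answer and works anywhere a pattern is expected.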

Perl conditional regex with inequalities

I have one query. I have to match 2 strings in one if condition:
$release = 5.x (Here x should be greater than or equal to 3)
$version = Rx (this x should be greater than or equal to 5 if $release is 5.3, otherwise anything is acceptable)
e.g. 5.1R11 is not acceptable, 5.3R4 is not, 5.3R5 is acceptable, and 5.4 R1 is acceptable.
I have written code like this:
$release = "5.2";
$version = "R4";
if ( $release =~ /5.(?>=3)(\d)/ && $version =~ m/R(?>=5)(\d)/ )
{
print "OK";
}
How can I write this?
This is really a three-level version string, and I suggest that you use Perl's version facility
use strict;
use warnings 'all';
use feature 'say';
use version;
my $release = '5.2';
my $version = 'R4';
if ( $version =~ /R(\d+)/ && version->parse("$release.$1") ge v5.3.5 ) {
say 'OK';
}
In a regex, (?>...) is an atomic group, not a numeric comparison.
Capture the digit in a group so that it is stored in $1, then compare $1 with the number, like this:
if (( ($release =~ /5\.(\d)/) && ($1 > 3) ) && (($version =~ m/R(\d)/) && ($1 >= 3) ) )
{
print "OK\n";
}
I got the correct one after modifying mkhun's solution:
if ((($release =~ /5\.3/)) && (($version =~ m/R(\d+)(\.\d+)?/) && ($1 >= 5))
|| ((($release =~ /5\.(\d)/) && ($1 > 3)) && ($version =~ m/R(\d+)(\.\d+)?/)) )
{
print "OK\n";
}
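The same rule can also be written by capturing the numbers once and comparing them numerically, which may be easier to read than nested regex tests. A sketch, assuming $release always looks like "5.3" and $version like "R5"; the sub name release_ok is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the release/version pair is acceptable.
# Assumes $release looks like "5.3" and $version starts with "R<digits>".
sub release_ok {
    my ($release, $version) = @_;
    my ($major, $minor) = $release =~ /\A(\d+)\.(\d+)\z/ or return 0;
    my ($rev)           = $version =~ /\AR(\d+)/         or return 0;

    return 0 unless $major == 5 && $minor >= 3;   # need release 5.3 or later
    return 0 if $minor == 3 && $rev < 5;          # 5.3 also needs R5 or later
    return 1;
}

# The four cases from the question.
for my $pair ( [ '5.1', 'R11' ], [ '5.3', 'R4' ], [ '5.3', 'R5' ], [ '5.4', 'R1' ] ) {
    my ($release, $version) = @$pair;
    printf "%s%s: %s\n", $release, $version,
        release_ok($release, $version) ? 'OK' : 'not acceptable';
}
```

Keeping the captures in named variables also avoids relying on $1 surviving across several matches in one condition.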

List files that exist only on a given directory?

I am trying to list the files that are only in directory ./a/b or ./a/d. At the moment I specify these explicitly using (-d && $_ =~ "b") || (-d && $_ =~ "d"). Is there any way I can put the needed folders in an array?
use File::Find;
my $filename = "h*.txt";
print ("Now it's:", $filename);
find({
wanted => \&wanted,
preprocess => \&dir_preprocess,
}, './a');
sub dir_preprocess {
my (@entries) = @_;
#my @tmparr=("d","b"); This isn't working
if ( $File::Find::dir eq './a' ) {
@entries = grep { (-d && $_ =~ "b") || (-d && $_ =~ "d") } @entries;
}
return @entries;
}
my @mylist;
sub wanted{
if($_ =~ $filename) {
push(@mylist, $_);
}
}
print ("It's:", @mylist);
You can use | as "or" in a regular expression: (-d && $_ =~ /(b|d)/) would match either b or d. You could put the names in an array and then use join to generate the regex you need.
You might want to prepend the path up to that depth, otherwise you might match at different levels of the hierarchy. You can also add $ at the end to anchor the match to the end of the name.
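Putting the folder names in an array could look roughly like this. It is a sketch, not a drop-in replacement: the sub name files_in_subdirs is hypothetical, the directory names are matched exactly through a hash rather than by regex, and the file pattern \Ah.*\.txt\z is an assumed reading of "h*.txt".

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns the paths of files under $root whose names
# match $file_re, descending only into the top-level directories in @$dirs.
sub files_in_subdirs {
    my ($root, $dirs, $file_re) = @_;
    my %keep = map { $_ => 1 } @$dirs;
    my @found;
    find({
        preprocess => sub {
            # Filter only at the top level; deeper levels pass through.
            return @_ unless $File::Find::dir eq $root;
            return grep { -d && $keep{$_} } @_;
        },
        wanted => sub {
            push @found, $File::Find::name if -f && /$file_re/;
        },
    }, $root);
    return @found;
}

# Only run the demonstration when the example directory exists.
if ( -d './a' ) {
    my @list = files_in_subdirs('./a', [ 'b', 'd' ], qr/\Ah.*\.txt\z/);
    print "$_\n" for @list;
}
```

Because the lookup is exact, a directory named "bd" or "db" is not picked up, which the unanchored $_ =~ "b" test would have allowed.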

How to determine number of times a word appears in text?

How can I find the number of times a word is in a block of text in Perl?
For example my text file is this:
#! /usr/bin/perl -w
# The 'terrible' program - a poorly formatted 'oddeven'.
use constant HOWMANY => 4; $count = 0;
while ( $count < HOWMANY ) {
$count++;
if ( $count == 1 ) {
print "odd\n";
} elsif ( $count == 2 ) {
print "even\n";
} elsif ( $count == 3 ) {
print "odd\n";
} else { # at this point $count is four.
print "even\n";
}
}
I want to find the number of times the word "count" appears in that text file. The file is named terrible.pl.
Ideally it should use a regex and a minimal number of lines of code.
EDIT: This is what I have tried:
use IO::File;
my $fh = IO::File->new('terrible.pl', 'r') or die "$!\n";
my %words;
while (<$fh>) {
for my $word ($text =~ /count/g) {
print "x";
$words{$word}++;
}
}
print $words{$word};
Here's a complete solution. If this is homework, you learn more by explaining this to your teacher than by rolling your own:
perl -0777ne "print+(@@=/count/g)+0" terrible.pl
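If you are trying to count how many times the word "count" appears, this will work:
my $count = 0;
# Three-argument open with a lexical filehandle, and fail loudly on error
open(my $in, '<', 'terrible.pl') or die "Cannot open terrible.pl: $!\n";
while (<$in>) {
$count++ while /count/g;
}
close($in);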
print "$count times\n";
I'm not actually sure what your example code is but you're almost there:
perl -e '$text = "lol wut foo wut bar wut"; $count = 0; $count++ while $text =~ /wut/g; print "$count\n";'
You can use the /g modifier to continue searching the string for matches. In the example above, the while loop runs once for every instance of the word 'wut' in the $text variable.
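The same count can be had without an explicit loop by matching with /g in list context and assigning to an empty list; that assignment, evaluated in scalar context, yields the number of matches. A small sketch; the sub name count_matches is hypothetical, and it assumes the pattern has no capture groups (with groups, every captured value would be counted).

```perl
use strict;
use warnings;

# Hypothetical helper: number of times $re matches in $text.
# Assumes $re contains no capture groups.
sub count_matches {
    my ($text, $re) = @_;
    # List assignment in scalar context returns the number of items
    # produced by the right-hand side, i.e. the number of matches.
    my $count = () = $text =~ /$re/g;
    return $count;
}

my $text = "lol wut foo wut bar wut";
print count_matches($text, qr/\bwut\b/), "\n";   # prints 3
```

The \b anchors restrict the match to whole words, so "wutever" would not be counted.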
You can probably use something like so:
my $fh = IO::File->new('test.txt', 'r') or die "$!\n";
my %words;
while (<$fh>) {
for my $word (split / /) {
$words{$word}++;
}
}
That will give you a count of every "word" (defined here as a run of characters separated by a single space), stored in a hash keyed by the word, whose value is the number of times that word was seen.
perldoc perlrequick has an answer. The term you want in that document is "scalar context".
Given that this appears to be a homework question, I'll point you at the documentation instead.
So, what are you trying to do? You want the number of times something appears in a block of text. You can use the Perl grep function. That will go through a block of text without needing to loop.
If you want an odd/even return value, you can use the modulo arithmetic function. You can do something like this:
if ($number % 2) {
print "$number is odd\n"; #Returns a "1" or true
}
else {
print "$number is even\n"; #Returns a "0" or false
}

Why this code does not do what I mean?

$w = 'self-powering';
%h = (self => 'self',
power => 'pauә',
);
if ($w =~ /(\w+)-(\w+)ing$/ && $1~~%h && $2~~%h && $h{$2}=~/ә$/) {
$p = $h{$1}.$h{$2}.'riŋ';
print "$w:"," [","$p","] ";
}
I expect the output to be
self-powering: selfpauәriŋ
But what I get is:
self-powering: [riŋ]
My guess is something's wrong with the code
$h{$2}=~/ә$/
It seems that when I use
$h{$2}!~/ә$/
Perl will do what I mean, but why can't I get "self-powering: selfpauәriŋ"?
What am I doing wrong? Any ideas?
Thanks as always for any comments/suggestions/pointers :)
When you run
$h{$2}!~/ә$/
in your if statement, the match itself succeeds and resets $1 and $2 to undef, because that pattern contains no capture groups. If you save the captures first, like this:
if ($w =~ /(\w+)-(\w+)ing$/){
my $m1 = $1;
my $m2 = $2;
if($m1~~%h && $m2~~%h && $h{$m2}=~/ә$/) {
$p = $h{$m1}.$h{$m2}.'riŋ';
print "$w:"," [","$p","] ";
}
}
I expect you will get what you want.
Are you running with use warnings enabled? That would tell you that $1 and $2 are not what you expect. Your second regex, not the first, determines the values of those variables once you enter the if block. To illustrate with a simpler example:
print $1, "\n"
if 'foo' =~ /(\w+)/
and 'bar' =~ /(\w+)/;