Regular expression matching in Perl

I'm trying to match part of a string to a yes or no answer. The string is either '{"valid_js":"yes"...' or '{"valid_js":"no"...', and I want to extract the "yes" or "no".
Can I just use something like:
/:."yes"/g
Or do I need something more complicated?

Try something like this...
if (m/"valid_js":"(yes|no)"/)
{
# At this point $1 will contain either yes or no
if ($1 eq 'yes')
{
# Answer is yes
}
else
{
# Answer is no
}
}

This is a generic form of regexp:
valid_js["]:["](yes|no)["] -> $1
You can use that regexp to match or replace.

This should help you:
$input = '{"valid_js":"yes"...';
if (($input =~ m/"valid_js":"(.*?)"/) && ($1 eq 'yes')) {
print "1: yes\n";
}
$input = '{"valid_js":"no"...';
if (($input =~ m/"valid_js":"(.*?)"/) && ($1 eq 'yes')) {
print "2: yes\n";
}

Related

pattern matching in regular expression (Perl)

Make a pattern that will match three consecutive copies of whatever is currently contained in $what. That is, if $what is fred, your pattern should match fredfredfred. If $what is fred|barney, your pattern should match fredfredbarney, barneyfredfred, barneybarneybarney, or many other variations. (Hint: You should set $what at the top of the pattern test program with a statement like my $what = 'fred|barney';)
But my solution to this is just too easy, so I'm assuming it's wrong. My solution is:
#!/usr/bin/perl
use warnings;
use strict;
while (<>){
chomp;
if (/fred|barney/ig) {
print "pattern found! \n";
}
}
It displays what I want, and I didn't even have to save the pattern in a variable. Can someone help me through this, or enlighten me if I'm misunderstanding the problem?
This example should clear up what was wrong with your solution:
my @tests = qw(xxxfooxx oofoobar bar bax rrrbarrrrr);
my $str = 'foo|bar';
for my $test (@tests) {
my $match = $test =~ /$str/ig ? 'match' : 'not match';
print "$test did $match\n";
}
OUTPUT
xxxfooxx did match
oofoobar did match
bar did match
bax did not match
rrrbarrrrr did match
SOLUTION
#!/usr/bin/perl
use warnings;
use strict;
# notice the example has the `|`. Meaning
# match "fred" or "barney" 3 times.
my $str = 'fred|barney';
my @tests = qw(fred fredfredfred barney barneybarneybarny barneyfredbarney);
for my $test (@tests) {
if( $test =~ /^($str){3}$/ ) {
print "$test matched!\n";
} else {
print "$test did not match!\n";
}
}
OUTPUT
$ ./test.pl
fred did not match!
fredfredfred matched!
barney did not match!
barneybarneybarny did not match!
barneyfredbarney matched!
use strict;
use warnings;
my $s="barney/fred";
my @ra=split("/", $s);
my $test="barneybarneyfred"; #etc, this will work on all permutations
if ($test =~ /^(?:$ra[0]|$ra[1]){3}$/)
{
print "Valid\n";
}
else
{
print "Invalid\n";
}
split breaks your string apart on "/". (?:$ra[0]|$ra[1]) groups "barney" or "fred" without capturing, and {3} requires exactly three copies. Add an i after the closing "/" if case doesn't matter. The ^ anchors the match at the start of the string, and the $ anchors it at the end.
EDIT:
If you need the format to be barney\fred, use this:
my $s="barney\\fred";
my @ra=split(/\\/, $s);
If you know that the matching will always be on fred and barney, then you can just replace $ra[0], $ra[1] with fred and barney.
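For the exercise as stated, the pattern can also be built directly from $what without splitting it first. This is only a minimal sketch, assuming $what holds a valid alternation; the helper name three_copies is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text contains three consecutive
# copies of whatever $what matches.
sub three_copies {
    my ($what, $text) = @_;
    # (?:...) groups the whole alternation so that {3} applies to all of it
    return $text =~ /(?:$what){3}/ ? 1 : 0;
}

my $what = 'fred|barney';
for my $test (qw(fredfredbarney barneyfredfred fredbarney)) {
    print "$test: ", (three_copies($what, $test) ? "match" : "no match"), "\n";
}
```

Without the grouping, the pattern would read fred|barney{3}, where {3} applies only to the final y.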

Perl if-statements / regular expressions

I'm not entirely sure why my if-statements are not validating user input. Here's my code.
The statements that contain regular expressions are supposed to allow leading, and trailing white space.
sub Menu
{
&processlist;
&creating_Refs;
print "[Sort by COLUMN|sortup|sortdown| quit]:";
my $user_input = <STDIN>;
chomp($user_input);
if($user_input =~ m/[quit\s]/)
{
exit;
}
elsif($user_input eq 'sortup')
{
print "working bro\n\n";
@$VAR1 = sort sortup @$VAR1;
foreach my $ref (@$VAR1)
{
print "$ref->{PID}, $ref->{USER}, $ref->{PR}, $ref->{NI}, $ref->{VIRT}, $ref->{RES}, $ref->{SHR}, $ref->{S}, $ref->{CPU}, $ref->{MEM}, $ref->{TIME}, $ref->{COMMAND} \n";
}
}
elsif($user_input eq 'sortdown \n')
{
print "working on sortdown\n\n";
}
elsif($user_input =~ m/[sort by]+\w/)
{
}
else
{
print "Error, please re-enter command \n\n";
&Menu;
}
}
A character class like [abcd] allows any one of the characters specified in the square brackets. When you say [sort by], it is equivalent to /s|o|r|t| |b|y/, which will match any one of those characters, only once. If you want to match sort by, use /sort by/.
And in your case:
if($user_input =~ m/quit/){
exit;
}
and to match exact words use word boundaries:
if($user_input =~ m/\bquit\b/){
exit;
}
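Also, chomp removes the trailing \n, and single quotes do not interpolate escapes, so 'sortdown \n' is the literal text sortdown followed by a space, a backslash and an n. So:
elsif($user_input eq 'sortdown \n')
will never be true.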
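Since the question asks for leading and trailing whitespace to be allowed, one option is to anchor each pattern and surround the word with \s*. The following is a rough sketch, not the asker's code: parse_command is a hypothetical helper, and the "sort by COLUMN" form is my reading of the prompt text.

```perl
use strict;
use warnings;

# Hypothetical helper: classify one line of user input.
sub parse_command {
    my ($input) = @_;
    # ^\s* and \s*$ tolerate surrounding whitespace, including the newline
    return 'quit'     if $input =~ /^\s*quit\s*$/;
    return 'sortup'   if $input =~ /^\s*sortup\s*$/;
    return 'sortdown' if $input =~ /^\s*sortdown\s*$/;
    # "sort by COLUMN": capture the column name
    return "sort by $1" if $input =~ /^\s*sort\s+by\s+(\w+)\s*$/;
    return 'error';
}

print parse_command("  sortdown \n"), "\n";
print parse_command("sort by PID"), "\n";
```

Because the patterns are anchored at both ends, input such as "equity" is not mistaken for quit.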

In Perl, how can I remove all spaces that are not inside double quotes " "?

I'm trying to come up with some regex that will remove all space chars from a string as long as they're not inside double quotes (").
Example string:
some string with "text in quotes"
Result:
somestringwith"text in quotes"
So far I've come up with something like this:
$str =~ /"[^"]+"|/g;
But it doesn't seem to be giving the intended result.
I'm honestly very new at perl and haven't had too much regexp experience. So if anyone willing to answer would also be willing to provide some insight into the why and how that would be great!
Thanks!
EDIT
String will not contain escaped "'s
It should actually always be formatted like this:
Some.String = "Some Value"
Result would be
Some.String="Some Value"
Here is a technique using split to separate the quoted strings. It relies on your data being consistent and will not work with loose quotes.
use strict;
use warnings;
my @line = split /("[^"]*")/;
for (@line) {
unless (/^"/) {
s/[ \t]+//g;
}
}
print @line; # @line is altered
Basically, you split up the string in order to isolate the quoted strings. Once that is done, perform the substitution on all other strings. Since the array elements are aliased in the loop, substitutions are performed on the actual array.
You can run this script like so:
perl -n script.pl inputfile
To see the output. Or
perl -n -i.bak script.pl inputfile
To do in-place edit on inputfile, while saving backup in inputfile.bak.
With that said, I'm not sure what your edit means. Do you want to change
Some.String = "Some Value"
to
Some.String="Some Value"
Text::ParseWords is tailor-made for this:
#!/usr/bin/env perl
use strict;
use warnings;
use Text::ParseWords;
my @strings = (
q{This.string = "Hello World"},
q{That " string " and "another shoutout to my bytes"},
);
for my $s ( @strings ) {
my @words = quotewords '\s+', 1, $s;
print join('', @words), "\n";
}
Output:
This.string="Hello World"
That" string "and"another shoutout to my bytes"
Using Text::ParseWords means if you ever had to deal with quoted strings with escaped quotation marks in them, you'd be ready ;-)
Also, this sounds like you have a configuration file of some sort and you're trying to parse it. If that is the case, there are probably better solutions.
I suggest removing the quoted substrings using split and then recombining them with join after removing whitespace from the intermediate text.
Note that if the regex used for split contains captures then the captured values will also be included in the list returned.
Here's some sample code.
use strict;
use warnings;
my $source = <<END;
Some.String = "Some Value";
Other.String = "Other Value";
Last.String = "Last Value";
END
print join '', map {s/\s+//g unless /"/; $_; } split /("[^"]*")/, $source;
output
Some.String="Some Value";Other.String="Other Value";Last.String="Last Value";
I would simply loop through the string char by char. This way you can handle escaped strings too (just add an isEscaped variable).
my $text='lala "some thing with quotes " lala ... ';
my $quoteOpen = 0;
my $out = '';
foreach my $char (split //, $text) {
if ($char eq "\"" && $quoteOpen==0) {
$quoteOpen = 1;
$out .= $char;
} elsif ($char eq "\"" && $quoteOpen==1) {
$quoteOpen = 0;
$out .= $char;
} elsif ($char =~ /\s/ && $quoteOpen==1) {
$out .= $char;
} elsif ($char !~ /\s/) {
$out .= $char;
}
}
print "$out\n";
Splitting on double quotes, removing spaces only from the odd-numbered fields (i.e. those outside quotes):
sub remove_spaces {
my $string = shift;
my @fields = split /"/, $string . ' '; # trailing space needed to keep final " in output
my $flag = 1;
return join '"', map { s/ +//g if $flag; $flag = ! $flag; $_} @fields;
}
It can be done with regex:
s/("[^"]*"|[^ "]*) */$1/g
Note that this won't handle any kind of escapes inside the quotes.
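Another way to phrase the same idea is to match either a quoted run or a run of whitespace, and decide in the replacement which one was seen. This is a minimal sketch under the same assumption (no escaped quotes inside the quotes); strip_unquoted_spaces is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whitespace outside double-quoted runs.
sub strip_unquoted_spaces {
    my ($str) = @_;
    # Either capture a quoted run (kept via $1) or match whitespace (dropped).
    # /e evaluates the replacement as code; $1 is undef for the \s+ branch.
    $str =~ s/("[^"]*")|\s+/defined $1 ? $1 : ''/ge;
    return $str;
}

print strip_unquoted_spaces('Some.String = "Some Value"'), "\n";
```

Trying the quoted alternative first matters: the engine consumes the whole quoted run in one match, so the whitespace branch never sees the spaces inside it.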

Why this code does not do what I mean?

$w = 'self-powering';
%h = (self => 'self',
power => 'pauә',
);
if ($w =~ /(\w+)-(\w+)ing$/ && $1~~%h && $2~~%h && $h{$2}=~/ә$/) {
$p = $h{$1}.$h{$2}.'riŋ';
print "$w:"," [","$p","] ";
}
I expect the output to be
self-powering: selfpauәriŋ
But what I get is:
self-powering: [riŋ]
My guess is something's wrong with the code
$h{$2}=~/ә$/
It seems that when I use
$h{$2}!~/ә$/
Perl will do what I mean but why I can't get "self-powering: selfpauәriŋ"?
What am I doing wrong? Any ideas?
Thanks as always for any comments/suggestions/pointers :)
When you run
$h{$2}!~/ә$/
in your if statement, the contents of $1 and $2 are reset, because that second match succeeded and contains no capture groups. If you save the captures first, like this:
if ($w =~ /(\w+)-(\w+)ing$/){
my $m1 = $1;
my $m2 = $2;
if($m1~~%h && $m2~~%h && $h{$m2}=~/ә$/) {
$p = $h{$m1}.$h{$m2}.'riŋ';
print "$w:"," [","$p","] ";
}
}
I expect you will get what you want.
Are you running with use warnings enabled? That would tell you that $1 and $2 are not what you expect. Your second regex, not the first, determines the values of those variables once you enter the if block. To illustrate with a simpler example:
print $1, "\n"
if 'foo' =~ /(\w+)/
and 'bar' =~ /(\w+)/;
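One way to avoid depending on $1 and $2 surviving later matches is to capture in list context and test keys with exists. This is only a sketch: transcribe is a hypothetical helper, and the hash uses ASCII stand-in values (an assumption made to sidestep source-encoding questions), so the suffix test is /R$/ rather than the original one.

```perl
use strict;
use warnings;

my %h = (self => 'SELF', power => 'POWER');  # stand-in values, not the asker's data

# Hypothetical helper: returns the joined transcription, or undef.
sub transcribe {
    my ($w) = @_;
    # A list-context match copies the captures into lexicals right away
    my ($first, $second) = $w =~ /^(\w+)-(\w+)ing$/ or return;
    return unless exists $h{$first} && exists $h{$second};
    # This later match no longer disturbs $first and $second
    return unless $h{$second} =~ /R$/;
    return $h{$first} . $h{$second} . 'ing';
}

print transcribe('self-powering'), "\n";
```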

evaluating a user-defined regex from STDIN in Perl

I'm trying to make an on-the-fly pattern tester in Perl.
Basically it asks you to enter the pattern, and then gives you a >>>> prompt where you enter possible matches. If it matches it says "%%%% before matched part after match" and if not it says "%%%! string that didn't match". It's trivial to do like this:
while(<>){
chomp;
if(/$pattern/){
...
} else {
...
}
}
but I want to be able to enter the pattern like /sometext/i rather than just sometext
I think I'd use an eval block for this? How would I do such a thing?
This sounds like a job for string eval, just remember not to eval untrusted strings.
#!/usr/bin/perl
use strict;
use warnings;
my $regex = <>;
$regex = eval "qr$regex" or die $@;
while (<>) {
print /$regex/ ? "matched" : "didn't match", "\n";
}
Here is an example run:
perl x.pl
/foo/i
foo
matched
Foo
matched
bar
didn't match
^C
You can write /(?i:<pattern>)/ instead of /<pattern>/i.
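Building on that, the /pattern/flags input can be split apart and recompiled with inline flags, which avoids string eval. The sketch below is an assumption-laden illustration: compile_pattern is a hypothetical helper, and it accepts only the flags i, m, s and x.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "/body/flags" into a compiled regex, or undef.
sub compile_pattern {
    my ($input) = @_;
    chomp $input;
    # Split the input into the pattern body and its trailing flags
    my ($body, $flags) = $input =~ m{^/(.*)/([imsx]*)$}s or return;
    # Flags become an inline (?flags:...) group; the block eval traps
    # compile errors from a malformed pattern instead of dying
    my $re = eval { my $src = "(?${flags}:${body})"; qr/$src/ };
    return $re;
}

my $re = compile_pattern("/foo/i\n");
print "Foo" =~ $re ? "matched" : "didn't match", "\n";
```

The block form of eval only traps errors; it does not run the input as Perl code, although the pattern itself is still user-supplied.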
This works for me:
my $foo = "My bonnie lies over the ocean";
print "Enter a pattern:\n";
while (<STDIN>) {
my $pattern = $_;
if (not ($pattern =~ /^\/.*\/[a-z]*$/)) {
print "Invalid pattern\n";
} else {
my $x = eval "if (\$foo =~ $pattern) { return 1; } else { return 0; }";
if ($x == 1) {
print "Pattern match\n";
} else {
print "Not a pattern match\n";
}
}
print "Enter a pattern:\n"
}