Perl special variables for regex matches

Perl special variables for regex matches - regex

I'd like to use one of perl's special variable to make this snippet a bit less large and ugly:
my $mysqlpass = "mysqlpass=verysecret";
$mysqlpass = first { /mysqlpass=/ } #vars;
$mysqlpass =~ s/mysqlpass=//;
I have looked this info up and tried several special variables ($',$1,$`, etc) to no avail

A s/// will return true if it replaces something.
Therefore, it is possible to simply combine those two statements instead of having a redundant m//:
use strict;
use warnings;
use List::Util qw(first);
chomp(my #vars = <DATA>);
my $mysqlpass = first { s/mysqlpass=// } #vars;
print "$mysqlpass\n";
__DATA__
mysqluser=notsosecret
mysqlpass=verysecret
mysqldb=notsecret
Outputs:
verysecret
One Caveat
Because $_ is an alias to the original data structure, the substitution will effect the #vars value as well.
Alternative using split
To avoid that, I would inquire if the #vars contains nothing but key value pairs separated by equal signs. If that's the case, then I would suggest simply translating that array into a hash instead.
This would enable much easier pulling of all keys:
use strict;
use warnings;
chomp(my #vars = <DATA>);
my %vars = map {split '=', $_, 2} #vars;
print "$vars{mysqlpass}\n";
__DATA__
mysqluser=notsosecret
mysqlpass=verysecret
mysqldb=notsecret
Outputs:
verysecret

Yeah, regular expression it, if you really want to visit the path of obfuscation.
See following code:
my $string = "mysqlpass=verysecret";
if ($string =~ /^(\w+)\=(\w+)$/) {
print $1; # This stores 'mysqlpass'
print $2; # This stores 'verysecret'
}
My recommendation against this though, is that you want your code to be readable.

The one you're looking for is $_.

Related

Perl regex to find keywords and not variables

I'm trying to create a regex as following :
print $time . "\n"; --> match only print because time is a variable ($ before)
$epoc = time(); --> match only time
My regex for the moment is /(?-xism:\b(print|time)\b)/g but it match time in $time in the first example.
Check here.
I tried things like [^\$] but then it doesn't match print anymore.
(I will have more keyword like print|time|...|...)
Thanks

Parsing perl code is a common and useful teaching tool since the student must understand both the parsing techniques and the code that they're trying to parse.
However, to do this properly, the best advice is to use PPI
The following script parses itself and outputs all of the barewords. If you wanted to, you could compare the list of barewords to the ones that you're trying to match. Note, this will avoid things within strings, comments, etc.
use strict;
use warnings;
use PPI;
#my $src = do {local $/; <DATA>}; # Could analyze the smaller code in __DATA__ instead
my $src = do {
local #ARGV = $0;
local $/;
<>;
};
# Load a document
my $doc = PPI::Document->new( \$src );
# Find all the barewords within the doc
my $barewords = $doc->find( 'PPI::Token::Word' );
for (#$barewords) {
print $_->content, "\n";
}
__DATA__
use strict;
use warnings;
my $time = time;
print $time . "\n";
Outputs:
use
strict
use
warnings
use
PPI
my
do
local
local
my
PPI::Document
new
my
find
for
print
content
__DATA__

What you need is a negative lookbehind (?<!\$), it's zero-width so it doesn't "consume" characters.
(?<!\$)a means match a if not preceded with a literal $. Note that we escaped $ since it means end of string (or line depending on the m modifier).
Your regex will look like (?-xism:\b(?<!\$)(print|time)\b).
I'm wondering why you are turning off the xism modifiers. They are off by default.So just use /\b(?<!\$)(?:print|time)\b/g as pattern.
Online demo
SO regex reference

Perl regex return matches from substitution

I am trying to simultaneously remove and store (into an array) all matches of some regex in a string.
To return matches from a string into an array, you could use
my #matches = $string=~/$pattern/g;
I would like to use a similar pattern for a substitution regex. Of course, one option is:
my #matches = $string=~/$pattern/g;
$string =~ s/$pattern//g;
But is there really no way to do this without running the regex engine over the full string twice? Something like
my #matches = $string=~s/$pattern//g
Except that this will only return the number of subs, regardless of list context. I would also take, as a consolation prize, a method to use qr// where I could simply modify the quoted regex to to a sub regex, but I don't know if that's possible either (and that wouldn't preclude searching the same string twice).

Perhaps the following will be helpful:
use warnings;
use strict;
my $string = 'I thistle thing am thinking this Thistle a changed thirsty string.';
my $pattern = '\b[Tt]hi\S+\b';
my #matches;
$string =~ s/($pattern)/push #matches, $1; ''/ge;
print "New string: $string; Removed: #matches\n";
Output:
New string: I am a changed string.; Removed: thistle thing thinking this Thistle thirsty

Here is another way to do it without executing Perl code inside the substitution. The trick is that the s///g will return one capture at a time and undef if it does not match, thus quitting the while loop.
use strict;
use warnings;
use Data::Dump;
my $string = "The example Kenosis came up with was way better than mine.";
my #matches;
push #matches, $1 while $string =~ s/(\b\w{4}\b)\s//;
dd #matches, $string;
__END__
(
"came",
"with",
"than",
"The example Kenosis up was way better mine.",
)

Perl regular expression - eliminate

Am a newbie here. I use glimpse in my Perl script to get the path of files.
For example
/home/user/Proj/A/Apps/App.pm
/home/user/Proj/B/Apps.pm
I need to fetch the part after Proj i.e; the output should be
A/Apps/App.pm
B/Apps.pm

If you want to use regex/replace you could do something like:
$str =~ s!.*/Proj/!!;

You have various options here. When it's always at /home/user/Proj/, I prefer the second way. If not, you can use the first way as well. The best way is a substr (when its a static length):
use 5.014;
use strict;
use warnings;
my $s_a = "/home/user/Proj/A/Apps/App.pm";
my $s_b = "/home/user/Proj/B/Apps.pm";
say $s_a =~ s{.*Proj/}{}r;
say $s_b =~ s{.*Proj/}{}r;
say $s_a =~ s{/home/user/Proj/}{}r;
say $s_b =~ s{/home/user/Proj/}{}r;
say substr $s_a, 16;
say substr $s_b, 16;
output:
A/Apps/App.pm
B/Apps.pm
A/Apps/App.pm
B/Apps.pm
A/Apps/App.pm
B/Apps.pm

If you want to modifiy an existing variable to remove the first part of the path then it's simple: just use the substitution operator s/// to remove the first part of the string up to /Proj/. I've used alternative delimiters s||| here to avoid having to escape the slashes in the pattern.
use strict;
use warnings;
my #paths = qw{
/home/user/Proj/A/Apps/App.pm
/home/user/Proj/B/Apps.pm
};
for my $path (#paths) {
$path =~ s|.*/Proj/||;
print $path, "\n";
}
output
A/Apps/App.pm
B/Apps.pm
But if you want to leave your path variable as it is and copy the tail portion to another variable, then I think it's best to use a regular expression to capture the wanted part, like this
for my $path (#paths) {
my ($tail) = $path =~ m|/Proj/(.+)|;
print $tail, "\n";
}
The output is identical.

How to pass a replacing regex as a command line argument to a perl script

I am trying to write a simple perl script to apply a given regex to a filename among other things, and I am having trouble passing a regex into the script as an argument.
What I would like to be able to do is somthing like this:
> myscript 's/hi/bye/i' hi.h
bye.h
>
I have produced this code
#!/utils/bin/perl -w
use strict;
use warnings;
my $n_args = $#ARGV + 1;
my $regex = $ARGV[0];
for(my $i=1; $i<$n_args; $i++) {
my $file = $ARGV[$i];
$file =~ $regex;
print "OUTPUT: $file\n";
}
I cannot use qr because apparently it cannot be used on replacing regexes (although my source for this is a forum post so I'm happy to be proved wrong).
I would rather avoid passing the two parts in as seperate strings and manually doing the regex in the perl script.
Is it possible to pass the regex as an argument like this, and if so what is the best way to do it?

There's more than one way to do it, I think.
The Evial Way:
As you basically send in a regex expression, it can be evaluated to get the result. Like this:
my #args = ('s/hi/bye/', 'hi.h');
my ($regex, #filenames) = #args;
for my $file (#filenames) {
eval("\$file =~ $regex");
print "OUTPUT: $file\n";
}
Of course, following this way will open you to some very nasty surprises. For example, consider passing this set of arguments:
...
my #args = ('s/hi/bye/; print qq{MINE IS AN EVIL LAUGH!\n}', 'hi.h');
...
Yes, it will laugh at you most evailly.
The Safe Way:
my ($regex_expr, #filenames) = #args;
my ($substr, $replace) = $regex_expr =~ m#^s/((?:[^/]|\\/)+)/((?:[^/]|\\/)+)/#;
for my $file (#filenames) {
$file =~ s/$substr/$replace/;
print "OUTPUT: $file\n";
}
As you can see, we parse the expression given to us into two parts, then use these parts to build a full operator. Obviously, this approach is less flexible, but, of course, it's much more safe.
The Easiest Way:
my ($search, $replace, #filenames) = #args;
for my $file (#filenames) {
$file =~ s/$search/$replace/;
print "OUTPUT: $file\n";
}
Yes, that's right - no regex parsing at all! What happens here is we decided to take two arguments - 'search pattern' and 'replacement string' - instead of a single one. Will it make our script less flexible than the previous one? No, as we still had to parse the regex expression more-or-less regularly. But now user clearly understand all the data that is given to a command, which is usually quite an improvement. )
#args in both examples corresponds to #ARGV array.

The s/a/b/i is an operator, not simply a regular expression, so you need to use eval if you want it to be interpreted properly.
#!/usr/bin/env perl
use warnings;
use strict;
my $regex = shift;
my $sub = eval "sub { \$_[0] =~ $regex; }";
foreach my $file (#ARGV) {
&$sub($file);
print "OUTPUT: $file\n";
}
The trick here is that I'm substituting this "bit of code" into a string to produce Perl code that defines an anonymous subroutine $_[0] =~ s/a/b/i; (or whatever code you pass it), then using eval to compile that code and give me a code reference I can call from within the loop.
$ test.pl 's/foo/bar/' foo nicefood
OUTPUT: bar
OUTPUT: nicebard
$ test.pl 'tr/o/e/' foo nicefood
OUTPUT: fee
OUTPUT: nicefeed
This is more efficient than putting an eval "\$file =~ $regex;" inside the loop as then it'll get compiled and eval-ed at every iteration rather than just once up-front.
A word of warning about eval - as raina77ow's answer explains, you should avoid eval unless you're 100% sure you are always getting your input from a trusted source...

s/a/b/i is not a regex. It is a regex plus substitution. Unless you use the string eval, make this work might be pretty tough (consider s{a}<b>e and so on).

The trouble is that you are trying to pass a perl operator when all you really need to pass is the arguments:
myscript hi bye hi.h
In the script:
my ($find, $replace, #files) = #ARGV;
...
$file =~ s/$find/$replace/i;
Your code is a bit clunky. This is all you need:
use strict;
use warnings;
my ($find, $replace, #files) = #ARGV;
for my $file (#files) {
$file =~ s/$find/$replace/i;
print "$file\n";
}
Note that this way allows you to use meta characters in the regex, such as \w{2}foo?. This can be both a good thing and a bad thing. To make all characters intepreted literally (disable meta characters), you can use \Q ... \E like so:
... s/\Q$find\E/$replace/i;

Using a regular expression in Perl to list variables from another Perl script

My thoughts on how to grab all scalars and arrays out of a Perl file went along the lines of:
open (InFile, "SomeScript.pl");
#InArray = <InFile>;
#OutArray = {};
close (InFile);
$ArrayCount = #InArray;
open (OutFile, ">outfile.txt");
for ($x=0; $x<=$ArrayCount; $x++){
$Testline = #InArray[$x];
if($Testline =~ m/((#|\$)[A-Z]+)/i){
$Outline = "$1\n";
push #OutArray, $Outline;
}
}
print OutFile #OutArray;
close(OutFile);
...and this works fairly well. The problem is that if multiple variables appear on a line it will only grab the first variable. An example might be:
$FirstVar = $SecondVar + $ThirdVar;
The script would only grab $FirstVar and output to a file. This might still work though because $SecondVar and $ThirdVar have to be initialized somewhere else before the proceeding line has any meaning. I guess the exception to the rule would be a line in which multiple variables are initialized at the same time.
Could an example in real Perl code break this script?
Also, how to grab multiple items that match my regular expression's criteria from the same line?

Don't do that
You can't really parse Perl with regexes, so I wouldn't even try.
You can't even properly parse it without actually running it, but you can get close with PPI.
perl-variables.pl
#! /usr/bin/env perl
use strict;
use warnings;
use 5.10.1;
use PPI;
use PPI::Find;
my($filename) = (#ARGV, $0); # checks itself by default
my $Doc = PPI::Document->new($filename);
my $Find = PPI::Find->new( sub{
return 0 unless $_[0]->isa('PPI::Token::Symbol');
return 1;
});
$Find->start($Doc);
while( my $symbol = $Find->match ){
my $raw = $symbol->content;
my $var = $symbol->symbol;
if( $raw eq $var ){
say $var;
} else {
say "$var\t($raw)";
}
}
print "\n";
my #found = $Find->in($Doc);
my %found;
$found{$_}++ for #found;
say for sort keys %found;
Running it against itself, produces:
$filename
#ARGV
$0
$Doc
$filename
$Find
#_ ($_)
$Find
$Doc
$symbol
$Find
$raw
$symbol
$var
$symbol
$raw
$var
$var
#found
$Find
$Doc
%found
%found ($found)
$_
#found
%found
$0
$Doc
$Find
$_
$filename
$found
$raw
$symbol
$var
%found
#ARGV
#found

It looks like this will miss fully qualified variable names ($My::Package::Foo) and the rare but valid variable names enclosed with braces (${variable}, ${"varname!with#special+chars"}). Your script will also match element accesses of hashes and arrays ($array[4] ==> $array, $hash{$key} ==> $hash), and object method calls ($object->method() ==> $object), which may or may not be what you want.
You also mismatch variables with underscores ($my_var) and numbers ($var3), and you could get false positives from comments, quoted strings, pod, etc. (# report bugs to bob#company.org).
Matching multiple expressions is a matter of using the /g modifier, which will return a list of matches:
#vars = $Testline =~ /[#\$]\w+/gi;
if (#vars > 0) {
push #OutArray, #vars;
}

Time simple-minded answer is to the /g flag on your regexp.
The complex answer is that this sort of code analysis is very difficult for perl. Look at the module PPI for a better, more full featured, semantic analysis of perl code.

I can't answer either of your questions directly, but I will offer this: I don't know why you're trying to extract scalars, but the debugger package that comes with perl has to "know" about all variables, and the last time I looked it was written in Perl. You may be better off trying to evaluate a perl script using the debugger package or techniques borrowed from that package rather than reinventing the wheel.

Despite the limitations with the method, here is a slightly simpler version of the script above that reads from stdin.
#!/usr/bin/perl
use strict;
use warnings;
my %vars;
while (<>) {
$vars{$_}++ for (m'([$#]\w+)'g);
}
my #vars = keys %vars;
print "#vars\n";

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Perl special variables for regex matches - regex

I'd like to use one of perl's special variable to make this snippet a bit less large and ugly: my $mysqlpass = "mysqlpass=verysecret"; $mysqlpass = first { /mysqlpass=/ } #vars; $mysqlpass =~ s/mysqlpass=//; I have looked this info up and tried several special variables ($',$1,$`, etc) to no avail

The one you're looking for is $_.

Related

Perl regex to find keywords and not variables

Perl regex return matches from substitution

Perl regular expression - eliminate

How to pass a replacing regex as a command line argument to a perl script

Using a regular expression in Perl to list variables from another Perl script

Categories

Resources