Perl regex pattern match with saved variable

I need to do a pattern match with two variables: one contains the string and the other contains the regex pattern.
I tried the following program:
#!/usr/bin/perl
my $name = "sathish.java";
my $other = '*.java';
if ( $name =~ m/$other/ )
{
print "sathish";
}
Kindly help me see what I am missing.
Thanks
Sathishkumar

@Shmuel's answer suits your needs, but if you are looking for a common way to extract the filename from a complete path name, you can use File::Basename:
use strict;
use warnings;
use File::Basename;
my ($name, $path, $suffix) = fileparse("/example/path/test.java", qw/.java/);
print "name: $name\n";
print "path: $path\n";
print "suffix: $suffix\n";
It prints:
name: test
path: /example/path/
suffix: .java

'*.java' is not a valid regex: it is a shell glob, and a regex cannot begin with the quantifier *. You probably want to use this code:
my $other = '\.java$';
if ($name =~ m/$other/) {
    print "sathish";
}
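If the pattern really has to arrive as a shell-style glob such as '*.java', one option is to translate it into a regex first. The following is only a minimal sketch: glob_to_regex is a hypothetical helper name, not something from the answers here, and it assumes simple globs that use only * and ?.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): convert a simple
# shell-style glob into an anchored regex. Only "*" and "?" are treated
# as wildcards; everything else is matched literally.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = quotemeta $glob;    # escape every regex metacharacter
    $re =~ s/\\\*/.*/g;          # escaped "*" becomes "any run of characters"
    $re =~ s/\\\?/./g;           # escaped "?" becomes "exactly one character"
    return qr/\A$re\z/;          # anchor at both ends, as a glob would
}

my $name  = "sathish.java";
my $other = '*.java';
print "sathish\n" if $name =~ glob_to_regex($other);
```

Anchoring with \A and \z keeps names like "sathish.javax" from matching, which an unanchored pattern would allow.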

You can use the following style, which is closer to the glob you wrote; note that the glob * has to be written as .* in a regex:
$other = '.*\.java';
if ($name =~ m/^$other$/){}
--SJ

I like Shmuel's answer, but I'm guessing you probably want to capture the first part of the match into a variable as well?
If so, use
my $other = '\.java$';
if ($name =~ m/(\D*)$other/) {
print $1;
# prints "sathish"
}

Related

Perl Regex regular expression to split //

I searched Stack Overflow and other sites for a simple solution using a regex in Perl.
$str = q(//////);
Say I have six or seven slashes, or other characters like q(aaaaa).
I want to split them into pairs, like ['//','//'].
I tried @my_split = split ( /\/\/,$str); but it didn't work.
Is it possible with a regex?
The reason for this question is, say I have this URL:
$site_name = q(http://www.yahoo.com/blah1/blah2.txt);
I wanted to split on a single slash to get the domain name, but I couldn't do it.
I tried
split( '/'{1,1}, $sitename); #didn't work. I expected it to split on one slash rather than two.
Thanks.
The question is rather unclear.
To break a string into pairs of consecutive characters:
my @pairs = $string =~ /(..)/g;
or to split a string on a double slash:
my @parts = split /\/\//, $string;
The separator pattern, in /.../, is an actual regex, so we need to escape / inside it.
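As a small illustration of the escaping point, the same split can be written with a different delimiter so that the slashes need no backslashes. This is just a sketch; the sample string and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Sample input chosen for illustration only.
my $string = q(a//b//c);

my @escaped = split /\/\//, $string;   # slash delimiters: each / must be escaped
my @braced  = split m{//}, $string;    # brace delimiters: no escaping needed

print join("|", @braced), "\n";        # prints "a|b|c"
```

Both calls return the same list; the m{...} form is usually easier to read when the pattern itself contains slashes.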
But then you say you want to parse a URI?
Use a module, please. For example, there is URI:
use warnings;
use strict;
use feature 'say';
use URI;
my $string = q(http://www.yahoo.com/blah1/blah2.txt);
my $uri = URI->new($string);
say "Scheme: ", $uri->scheme;
say "Path: ", $uri->path;
say "Host: ", $uri->host;
# there's more, see docs
and then there's URI::Split
use URI::Split qw(uri_split uri_join);
my ($scheme, $auth, $path, $query, $frag) = uri_split($uri);
A number of other modules or frameworks, which you may already be using, nicely handle URIs.
Here's a quick way to split the full URL into its components:
my $u = q(http://www.yahoo.com/blah1/blah2.txt);
my ($protocol, $server, $path) = split(/:\/\/([^\/]+)/, $u);
print "($protocol, $server, $path)\n";
h/t @Mike
The following piece of code does the trick:
use strict;
use warnings;
use Data::Dumper;
my %url;
while( <DATA> ) {
    chomp;
    # skip lines that do not match, so stale captures are never reused
    next unless m|^(\w+)://([\w.-]+)/(.+)/([^/]+)$|;
    @url{qw(proto dn path file)} = ($1,$2,$3,$4);
    print Dumper(\%url);
}
__DATA__
http://www.yahoo.com/blah1/blah2.txt
http://www.google.com/dir1/dir2/dir3/file.ext
ftp://www.server.com/dir1/dir2/file.ext
https://www.inter.net/dir/file.ext
So it seems you want to simply get the Domain name:
my $url = q(http://www.yahoo.com/blah1/blah2.txt);
my @vars = split /\//, $url;
print $vars[2];
results:
www.yahoo.com

Perl regular expression - eliminate

I am a newbie here. I use glimpse in my Perl script to get the paths of files.
For example
/home/user/Proj/A/Apps/App.pm
/home/user/Proj/B/Apps.pm
I need to fetch the part after Proj, i.e., the output should be
A/Apps/App.pm
B/Apps.pm
If you want to use regex/replace you could do something like:
$str =~ s!.*/Proj/!!;
You have various options here. When the prefix is always /home/user/Proj/, I prefer the second way (removing the literal prefix). If not, you can use the first way (removing everything up to Proj/). The best way is substr, when the prefix has a fixed length:
use 5.014;
use strict;
use warnings;
my $s_a = "/home/user/Proj/A/Apps/App.pm";
my $s_b = "/home/user/Proj/B/Apps.pm";
say $s_a =~ s{.*Proj/}{}r;
say $s_b =~ s{.*Proj/}{}r;
say $s_a =~ s{/home/user/Proj/}{}r;
say $s_b =~ s{/home/user/Proj/}{}r;
say substr $s_a, 16;
say substr $s_b, 16;
output:
A/Apps/App.pm
B/Apps.pm
A/Apps/App.pm
B/Apps.pm
A/Apps/App.pm
B/Apps.pm
If you want to modify an existing variable to remove the first part of the path then it's simple: just use the substitution operator s/// to remove the first part of the string up to /Proj/. I've used alternative delimiters s||| here to avoid having to escape the slashes in the pattern.
use strict;
use warnings;
my @paths = qw{
    /home/user/Proj/A/Apps/App.pm
    /home/user/Proj/B/Apps.pm
};
for my $path (@paths) {
    $path =~ s|.*/Proj/||;
    print $path, "\n";
}
output
A/Apps/App.pm
B/Apps.pm
But if you want to leave your path variable as it is and copy the tail portion to another variable, then I think it's best to use a regular expression to capture the wanted part, like this
for my $path (@paths) {
    my ($tail) = $path =~ m|/Proj/(.+)|;
    print $tail, "\n";
}
The output is identical.
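One detail of the capture approach is what happens when a path has no /Proj/ component: the list assignment then leaves the variable undefined. A minimal sketch of wrapping that in a function follows; tail_after_proj is a hypothetical name introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): return the part of
# a path after /Proj/, or undef when the path has no /Proj/ component.
sub tail_after_proj {
    my ($path) = @_;
    my ($tail) = $path =~ m|/Proj/(.+)|;   # list context: capture, or empty list
    return $tail;
}

for my $path ('/home/user/Proj/A/Apps/App.pm', '/tmp/other/file.pm') {
    my $tail = tail_after_proj($path);
    print defined $tail ? "$tail\n" : "no /Proj/ in $path\n";
}
```

Checking defined before printing avoids the "uninitialized value" warning that the bare loop would emit for a non-matching path.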

Need to replace part of a string with another string

I'm still pretty new to Perl and regex and need some help getting started. I would love to provide some code, but that's kind of where I'm stuck.
What I'm trying to do is that I have this string in a file like this:
dn: CN=doe\, john,OU=Users,DC=domain,DC=com
and a string like this:
uid: d12345
I need to do a search and replace to get the following result.
dn: uid= d12345,OU=Users,DC=domain,DC=com
Can anyone help me get started with this one? Much thanks!
So you want to replace CN=doe\, john with uid= d12345? Try this:
$uidString = "uid: d12345";
$dnString = 'dn: CN=doe\, john,OU=Users,DC=domain,DC=com';
if( $uidString =~ /uid: (\w+)/ ) {
    $uid = $1;
    $dnString =~ s/CN=.+?[^\\],/uid= $uid,/;
}
That will replace everything from CN= to the first unescaped comma with the uid.
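The "first unescaped comma" idea can also be spelled out explicitly by consuming either a backslash-escaped pair or any character that is neither a comma nor a backslash. This is a sketch under assumptions: replace_cn is a hypothetical function name, and it writes uid=d12345 without the space shown in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): replace the CN
# component of a DN line, up to the first unescaped comma, with a uid.
sub replace_cn {
    my ($dn, $uid) = @_;
    # (?:\\.|[^,\\])* consumes escaped pairs such as "\," as a unit,
    # or ordinary characters, so it stops only at an unescaped comma.
    $dn =~ s/CN=(?:\\.|[^,\\])*,/uid=$uid,/;
    return $dn;
}

# Single quotes keep the backslash in "doe\, john" intact.
print replace_cn('dn: CN=doe\, john,OU=Users,DC=domain,DC=com', 'd12345'), "\n";
# prints "dn: uid=d12345,OU=Users,DC=domain,DC=com"
```

Note the single-quoted input: in a double-quoted Perl string "\," collapses to a plain comma, and the escape the pattern relies on is lost.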
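Won't a one-line regex do the trick?
use strict;
use warnings;
# single quotes keep the backslash in "doe\, john"
my $dn  = 'dn: CN=doe\, john,OU=Users,DC=domain,DC=com';
my $uid = "uid: d12345";
$uid =~ s/^uid:/uid=/;   # "uid: d12345" becomes "uid= d12345"
#the regex
$dn =~ s/CN(.*?), .*?,/$uid,/;
print "$dn\n";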
I suspect your DNs and uids will be dynamic. Here is something that will help. The regex will substitute CN= all the way until the comma with whatever string you put in $uid.
#!/usr/bin/env perl
use strict;
use warnings;
my $string = 'dn: CN=doe\, john,OU=Users,DC=domain,DC=com';
my $uid_str = 'uid: d12345';
my ($uid) = $uid_str =~ m/^uid:(.+)$/;
$string =~ s/CN=.+(,OU=.+$)/uid=$uid$1/;
print "String is: $string\n";
Output: String is: dn: uid= d12345,OU=Users,DC=domain,DC=com

Question about reg exps in perl

I need to write a regular expression that will parse strings like this:
Build-Depends: cdbs, debhelper (>=5), smthelse
I want to extract package names (without version numbers and brackets).
I wrote something like this:
$line =~ /^Build-Depends:\s*(\S+)\s$/
But it's not exactly what I want.
Does someone know how to manage it?
P.S. I just want to get the list: "cdbs debhelper smthelse" as a result
This regex should do what you want: /\s(\S*)(?:\s\(.*?\))?(?:,|$)/g
Edit: You'd call it like this to loop through all the results:
while ($str =~ /\s(\S*)(?:\s\(.*?\))?(?:,|$)/g) {
print "$1 is one of the packages.\n";
}
With your regex /^Build-Depends:\s*(\S+)\s$/ you are matching until the end of string.
Try /^Build-Depends:\s*(\S+)\s/ instead.
This will work for the types of package names listed here.
use warnings;
use strict;
my @packs;
my $line = "Build-Depends: cdbs, debhelper (>=5), smthelse";
if ( $line =~ /^Build-Depends: (.+)$/ ) { # get everything
    @packs = split /,+\s*/, $1;
    s/\s*\([^)]+\)//g for @packs; # remove version stuff and the space before it
}
print "$_\n" for @packs;
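The split-then-strip approach can be packaged as a function that returns the bare package names. The sketch below assumes the simple format shown in the question; package_names is a hypothetical name chosen for the example, and alternatives written with | are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): extract package
# names from a "Build-Depends:" line, dropping version constraints.
sub package_names {
    my ($line) = @_;
    return unless $line =~ /^Build-Depends:\s*(.+)$/;
    my $list = $1;                       # copy the capture before further matching
    my @names;
    for my $entry (split /\s*,\s*/, $list) {
        $entry =~ s/\s*\(.*?\)//g;       # drop constraints such as "(>=5)"
        push @names, $entry if length $entry;
    }
    return @names;
}

my $line = "Build-Depends: cdbs, debhelper (>=5), smthelse";
print join(" ", package_names($line)), "\n";   # prints "cdbs debhelper smthelse"
```

Splitting on commas first and stripping the parenthesised part afterwards keeps each regex small, at the cost of two passes over the data.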
How about splitting the input on whitespace and print each element if a ( is not present?
Something like this perhaps
perl -lane 'foreach $_ (@F[1..$#F]) {print if not m/\(/}'
cdbs,
debhelper
smthelse

How do I split a string into an array by comma but ignore commas inside double quotes?

I have a line:
$string = 'Paul,12,"soccer,baseball,hockey",white';
I am trying to split this into @array so that it has 4 values and
print $array[2];
gives
soccer,baseball,hockey
How do I do this? Help!
Just use Text::CSV. As you can see from the source, getting CSV parsing right is quite complicated:
sub _make_regexp_split_column {
my ($esc, $quot, $sep) = @_;
if ( $quot eq '' ) {
return qr/([^\Q$sep\E]*)\Q$sep\E/s;
}
qr/(
\Q$quot\E
[^\Q$quot$esc\E]*(?:\Q$esc\E[\Q$quot$esc\E0][^\Q$quot$esc\E]*)*
\Q$quot\E
| # or
[^\Q$sep\E]*
)
\Q$sep\E
/xs;
}
The standard module Text::ParseWords will do this as well.
use Text::ParseWords;
my @array = parse_line(q{,}, 0, $string);
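For completeness, here is the Text::ParseWords approach as a small self-contained script, using the string from the question. The second argument to parse_line is the "keep" flag; passing 0 removes the surrounding double quotes from the returned fields.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

my $string = 'Paul,12,"soccer,baseball,hockey",white';

# Arguments: delimiter pattern, keep flag (0 strips the quotes), input line.
my @array = parse_line(',', 0, $string);

print scalar(@array), "\n";   # prints 4
print "$array[2]\n";          # prints soccer,baseball,hockey
```

Text::ParseWords ships with Perl, so this works even where installing CPAN modules is not an option.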
In response to how to do it with Text::CSV(_PP). Here is a quick one.
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_PP;
my $parser = Text::CSV_PP->new();
my $string = "Paul,12,\"soccer,baseball,hockey\",white";
$parser->parse($string);
my @fields = $parser->fields();
print "$_\n" for @fields;
Normally one would install Text::CSV or Text::CSV_PP through the cpan utility.
To work around your not being able to install modules, I suggest you use the 'pure Perl' implementation so that you can 'install' it. The above example would work assuming you copied the text of Text::CSV_PP source into a file named CSV_PP.pm in a folder called Text created in the same directory as your script. You could also put it in some other location and use the use lib 'directory' method as discussed previously. See here and here to see other ways to get around install restriction using CPAN modules.
Use this regex: m/("[^"]+"|[^,]+)(?:,\s*)?/g;
The above regular expression matches each field globally: a field is either a double-quoted string (which may contain commas) or a run of non-comma characters, optionally followed by a comma and whitespace.
Here is a sample code and the corresponding output.
my $string = "Word1, Word2, \"Commas, inbetween\", Word3, \"Word4Quoted\", \"Again, commas, inbetween\"";
my @arglist = $string =~ m/("[^"]+"|[^,]+)(?:,\s*)?/g;
map { print $_ , "\n"} @arglist;
Here is the output:
Word1
Word2
"Commas, inbetween"
Word3
"Word4Quoted"
"Again, commas, inbetween"
Try this:
@array=($string =~ /^([^,]*)[,]([^,]*)[,]["]([^"]*)["][,]([^']*)$/);
The array will contain the output you expect.
use strict;
use warnings;
#use Data::Dumper;
my $string = qq/Paul,12,"soccer,baseball,hockey",white/;
#split string into three parts
my ($st1, $st2, $st3) = split(/,"|",/, $string);
#output: st1:Paul,12 st2:soccer,baseball,hockey st3:white
#split $st1 into two parts
my ($st4, $st5) = split(/,/,$st1);
#push records into array
push (my #test,$st4, $st5,$st2, $st3 ) ;
#print Dumper \@test;
print "$test[2]\n";
output:
soccer,baseball,hockey
#$VAR1 = [
# 'Paul',
# '12',
# 'soccer,baseball,hockey',
# 'white'
# ];
$string = "Paul,12,\"soccer,baseball,hockey\",white";
1 while($string =~ s#"(.*?),(.*?)"#\"$1aaa$2\"#g);
@array = map {$_ =~ s/aaa/,/g; $_ =~ s/\"//g; $_} split(/,/, $string);
$" = "\n";
print "$array[2]";