Regex parsing string substitution question - regex

I would like to know if there is an easy way to parse a string like this:
set PROMPT = Yes, Master?
What I would like to do is parse the part of this string up to the equal sign into one string and the part after the equal sign into another string.

Something like...
$phrase = 'set PROMPT = Yes, Master?';
@parts = split /=/, $phrase;
or
($set, $value) = split /=/, $phrase, 2;
[updated] Changes per comments.

Try matching this regex /\s*set\s*(\w+)\s*=\s*(.*)\s*$/ and setting the parts with $1 and $2:
my $str = 'set PROMPT = Yes, Master?';
my ($k, $v);    # declared separately: "my ... if ..." has unreliable behaviour
($k, $v) = ($1, $2) if $str =~ /\s*set\s*(\w+)\s*=\s*(.*)\s*$/;
print "OK: k=$k, v=$v\n";    # prints OK: k=PROMPT, v=Yes, Master?

while ($subject =~ m/([^\s]+)\s*=\s*([^\$]+)/img) {
# $1 = $2
}
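The split-based idea from the question can be sketched as a small helper that splits on the first '=' only and trims whitespace from both halves. The name parse_assignment is hypothetical and used only for illustration; this is a minimal sketch, not the only way to do it.

```perl
use strict;
use warnings;

# Hypothetical helper: split "set NAME = value" on the first '=' only.
sub parse_assignment {
    my ($phrase) = @_;
    my ($left, $value) = split /=/, $phrase, 2;
    return unless defined $value;        # no '=' in the input
    s/^\s+|\s+$//g for $left, $value;    # trim both halves in place
    return ($left, $value);
}

my ($set, $value) = parse_assignment('set PROMPT = Yes, Master?');
print "[$set] [$value]\n";    # prints: [set PROMPT] [Yes, Master?]
```

The limit of 2 passed to split keeps any later '=' characters inside the value.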

Related

Perl how do you assign a variable to a regex match result

How do you create a $scalar from the result of a regex match?
Is there any way that, once the script has matched the regex, the match can be assigned to a variable so it can be used later on, outside of the block?
I.e. if $regex_result = blah blah, then do something.
I understand that I should make the regex as non-greedy as possible.
#!/usr/bin/perl
use strict;
use warnings;
# use diagnostics;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Outlook';
my @Qmail;
my $regex = "^\\s\*owner \#";
my $sentence = $regex =~ "/^\\s\*owner \#/";
my $outlook = Win32::OLE->new('Outlook.Application')
or warn "Failed Opening Outlook.";
my $namespace = $outlook->GetNamespace("MAPI");
my $folder = $namespace->Folders("test")->Folders("Inbox");
my $items = $folder->Items;
foreach my $msg ( $items->in ) {
if ( $msg->{Subject} =~ m/^(.*test alert) / ) {
my $name = $1;
print " processing Email for $name \n";
push @Qmail, $msg->{Body};
}
}
for (@Qmail) {
next unless /$regex|^\s*description/i;
print; # prints what i want ie lines that start with owner and description
}
print $sentence; # prints ^\\s\*offense \ # not lines that start with owner.
One way is to verify a match occurred.
use strict;
use warnings;
my $str = "hello what world";
my $match = 'no match found';
my $what = 'no what found';
if ( $str =~ /hello (what) world/ )
{
$match = $&;
$what = $1;
}
print '$match = ', $match, "\n";
print '$what = ', $what, "\n";
Use the Perl variables below to meet your requirements:
$` = The string preceding whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
$& = The string matched by the last pattern match.
$' = The string following whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
For example:
$_ = 'abcdefghi';
/def/;
print "$`:$&:$'\n"; # prints abc:def:ghi
The match of a regex is stored in special variables (as well as some more readable variables if you specify the regex to do so and use the /p flag).
For the whole last match you're looking at the $MATCH (or $& for short) variable. This is covered in the manual page perlvar.
So say you wanted to store your last for loop's matches in an array called @matches; you could write the loop (and for some reason I think you meant it to be a foreach loop) as:
my @matches = ();
foreach (@Qmail) {
next unless /$regex|^\s*description/i;
push @matches, $&;    # $& is available as $MATCH under "use English"
print;
}
I think you have a problem in your code. I'm not sure of the original intention but looking at these lines:
my $regex = "^\\s\*owner \#";
my $sentence = $regex =~ "/^\\s\*owner \#/";
I'll step through that as:
Assign $regex to the string ^\s*owner #.
Assign $sentence to the value of running a match within $regex with the regular expression /^\s*owner #/, literal slashes included (which won't match; if it did, $sentence would be 1, but since it didn't, it's false).
I think. I'm actually not exactly certain what that line will do or was meant to do.
I'm not quite sure what part of the match you want: the captures, or something else. I've written Regexp::Result, which you can use to grab all the captures etc. on a successful match, and Regexp::Flow to grab multiple results (including success statuses). If you just want numbered captures, you can also use Data::Munge.
You can do the following:
use feature 'say';    # needed for say
my $str = "hello world";
my ($hello, $world) = $str =~ /(hello)|(what)/;
say "[$_]" for($hello,$world);
As you see $hello contains "hello".
If you have an older perl on your system like me, perl 5.18 or earlier, and you use $`, $& or $' as in codequestor's answer above, it will slow down your program.
Instead, you can use your regex pattern with the modifier /p, and then check these 3 variables: ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} for your matching results.
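A minimal sketch of the /p approach, reusing the 'abcdefghi' example from the earlier answer. It assumes Perl 5.10 or later, where /p and the ${^...} variables are available; the variable names $pre, $match and $post are just for illustration.

```perl
use strict;
use warnings;

my $text = 'abcdefghi';
my ($pre, $match, $post);

if ( $text =~ /def/p ) {
    # Copy the values out so they survive later pattern matches.
    ($pre, $match, $post) = ( ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} );
}
print "$pre:$match:$post\n";    # prints abc:def:ghi
```

Copying into lexicals inside the if block is what makes the result usable later, outside of the block.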

Perl regex - Having the delimiter as part of the string itself

I have a long string in the format
id1:2014-08-05 11:24;Does this work?,id2:2014-08-04 13:22; Does this work,too?,id3:2014-07-25 16:56 ...
I am trying to extract the 'date' and 'comment' part out of this, based on the id, which is the input.
For example, if the input is id2, I'd want the comment as 'Does this work, too?' and date as '2014-08-04 13:22'. Here is the regex I have so far.
if($string =~ m/\b$id:(.*?);(.*,?)/){
my $date = $1;
my $comment = substr($2,0,-1); #to remove the last ,
}
Now since there is a ',' as part of the string itself, my regex treats it as a delimiter and just returns 'Does this work' as the comment, leaving out the ',too?' part.
Any suggestions on how to handle a string that contains the delimiter within itself would be appreciated.
I think the best way to do this is to form a hash out of the string. If you start by splitting the string on any comma that's immediately followed by some alphanumeric characters and a colon then the commas within the comments will be ignored and most of your work is done.
Then just use a regex to divide each split into three chunks: the ID, the date/time, and the comment, and put them into a hash. After that you can get the date/time for an ID as $data{id1}[0] and the comment as $data{id1}[1]
This program demonstrates
use strict;
use warnings;
my $s = 'id1:2014-08-05 11:24;Does this work?,id2:2014-08-04 13:22; Does this work,too?,id3:2014-07-25 16:56 ...';
my %data;
for (split /,(?=\w+:)/, $s) {
my ($id, $date, $comment) = /([^:]+):([^;]+);\s*(.+)/ or next;    # skip records with no comment
$data{$id} = [ $date, $comment ];
}
print $data{id2}[1], "\n";
output
Does this work,too?
$str = "id1:2014-08-05 11:24;Does this work?,id2:2014-08-04 13:22; Does this work,too?,id3:2014-07-25 16:56; bla";
$id = "id2";
# I need comma to set the end of the last "record"
$str = $str . ",";
if ($str =~ /$id:([\d\-\: ]+);([ \w\?\,]+)\,/) {
print "date = $1\n";
print "comment = $2\n";
}

Perl regex pattern match with saved variable

I need to do a pattern match with two variables: one contains the string and the other contains the regex pattern.
I tried the following program:
#!/usr/bin/perl
my $name = "sathish.java";
my $other = '*.java';
if ( $name =~ m/$other/ )
{
print "sathish";
}
Kindly help me find what I am missing.
Thanks
Sathishkumar
@Shmuel's answer suits your needs, but if you are looking for a common way to extract the filename from a complete path name, you can use File::Basename:
use strict;
use warnings;
use File::Basename;
my ($name, $path, $suffix) = fileparse("/example/path/test.java", qw/.java/);
print "name: $name\n";
print "path: $path\n";
print "suffix: $suffix\n";
it prints:
name: test
path: /example/path/
suffix: .java
'*.java' is not a valid regex. You probably want to use this code:
my $other = '\.java$';
if ($name =~ m/$other/) {
You can use the following style, which is more appropriate for your need:
$other = "*.java";
if ($name =~m/^$other/){}
--SJ
I like Shmuel's answer, but I'm guessing you probably want to capture the first part of the regex into variable as well?
if so, use
my $other = '\.java$';
if ($name =~ m/(\D*)$other/) {
print $1;
# prints "sathish"
}
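Since '*.java' looks like a shell-style glob rather than a regex, another option is to translate the glob into a regex before matching. The sketch below assumes only * and ? are special in the glob; the helper name glob_to_regex is hypothetical and not from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a simple glob (only * and ? are special)
# into an anchored, compiled regex.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = quotemeta $glob;    # escape everything, including * and ?
    $re =~ s/\\\*/.*/g;          # escaped *  becomes  .*
    $re =~ s/\\\?/./g;           # escaped ?  becomes  .
    return qr/\A$re\z/;
}

my $name  = 'sathish.java';
my $other = glob_to_regex('*.java');
print "matched\n" if $name =~ $other;    # prints: matched
```

Because of quotemeta, the dot in '.java' is matched literally, so a name such as 'sathishxjava' does not match.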

Perl search and replace with variable and capture group

As the question says, I am trying to do a search and replace using a variable and a capture group. That is, the replace string contains $1. I followed the answers here and here, but they did not work for me; $1 comes through literally in the replacement. Can you help me spot my problem?
I am reading my regular expressions from a file like so:
while( my $line = <$file>) {
my @findRep = split(/:/, $line);
my $find = $findRep[0];
my $replace = '"$findRep[2]"'; # This contains the $1
$allTxt =~ s/$find/$replace/ee;
}
If I manually set my $replace = '"$1 stuff"' the replace works as expected. I have played around with every single/double quoting and /e combination I can think of.
You're using single quotes so $findRep[2] isn't interpolated. Try this instead:
my $replace = qq{"$findRep[2]"};
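Here is a self-contained sketch of that fix. The rule line and the sample text are made up for illustration, and the field layout (find pattern in field 0, replacement in field 2) is taken from the indices used in the question. Note that /ee runs the replacement text as Perl code, so it should only be used on rule files you trust.

```perl
use strict;
use warnings;

my $allTxt = 'hello world';
my $line   = "(w\\w+):unused:\$1 stuff\n";    # made-up rule line

chomp $line;                         # drop the newline read from the file
my @findRep = split /:/, $line;
my $find    = $findRep[0];
my $replace = qq{"$findRep[2]"};     # the text "$1 stuff", quotes included

# The first /e evaluates $replace to its contents; the second /e
# evaluates that text as Perl code, which interpolates $1.
$allTxt =~ s/$find/$replace/ee;
print "$allTxt\n";                   # prints: hello world stuff
```

The chomp matters because the replacement is the last field on the line and would otherwise carry the trailing newline into the output.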
Why use a regex replacement when you already have your values in @findRep?
while( my $line = <$file>) {
my @findRep = split(/:/, $line);
$findRep[0] = $findRep[2];
my $allTxt = join(":", @findRep);
}

perl regex replace only part of string

I need to write a perl regex to convert
site.company.com => dc=site,dc=company,dc=com
Unfortunately I am not able to remove the trailing "," using the regex I came up with below. I could of course remove the trailing "," in the next statement, but I would prefer that to be handled as part of the regex.
$data="site.company.com";
$data =~ s/([^.]+)\.?/dc=$1,/g;
print $data;
This above code prints:
dc=site,dc=company,dc=com,
Thanks in advance.
When handling urls it may be a good idea to use a module such as URI. However, I do not think it applies in this case.
This task is most easily solved with a split and join, I think:
my $url = "site.company.com";
my $string = join ",", # join the parts with comma
map "dc=$_", # add the dc= to each part
split /\./, $url; # split into parts
s/\./,dc=/g && s/^/dc=/ for $data;    # "for" aliases $_ to $data so both substitutions apply to it
tested below:
> echo "site.company.com" | perl -pe 's/\./,dc=/g&&s/^/dc=/g'
dc=site,dc=company,dc=com
Try doing this :
my $x = "site.company.com";
my @a = split /\./, $x;
s/^/dc=/ for @a;
print join ",", @a;
just put like this,
$data="site.company.com";
$data =~ s/,dc=$1/dc=$1/g; #(or) $data =~ s/,dc/dc/g;
print $data;
I'm going to try the /ge route:
$data =~ s{^|(\.)}{
( $1 && ',' ) . 'dc='
}ge;
e = evaluate replacement as Perl code.
So, it says given the start of the string, or a dot, make the following replacement. If it captured a period, then emit a ','. Regardless of this result, insert 'dc='.
Note, that I like to use a brace style of delimiter on all my evaluated replacements.
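To compare the approaches, the sketch below runs the split/map/join version and a single-substitution version side by side on the input from the question. The variable names are illustrative, and the ternary is a small variation on the /ge answer so that an undefined $1 at the start of the string does not trigger a warning under "use warnings".

```perl
use strict;
use warnings;

my $host = 'site.company.com';

# split/map/join version
my $joined = join ',', map { "dc=$_" } split /\./, $host;

# single-substitution version: at the start of the string $1 is undef,
# so only 'dc=' is emitted; at each dot ',dc=' is emitted.
( my $subbed = $host ) =~ s{^|(\.)}{ ( $1 ? ',' : '' ) . 'dc=' }ge;

print "$joined\n$subbed\n";    # both lines read dc=site,dc=company,dc=com
```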