Perl Modification of a read only variable - regex

I am having a problem with a regex that behaves in a way that doesn't make sense to me. $line is a reference to a scalar (in this instance the string is 'print "hello world\n"'). The attempt to perform a regex match appears to succeed, yet it also changes the value of $$line. In addition, I get an error when trying to modify $$line on line 65.
Here is the code:
my $line = $_[0];
$$line =~ s/^(\s+\(?)//;
my @functions = ('print');
# Check if the expression is a function
for my $funcName (@functions) {
print $$line . "\n";
if ($$line =~ m/^($funcName\(?\s*)/) {
print $$line . "\n";
$$line =~ s/$1//; # THIS IS LINE 65
my $args = [];
while (scalar(@{$args}) == 0 || ${$line} =~ /\s*,/) {
push (@{$args}, parseExpression($line))
}
my $function = {
type => 'function',
name => $funcName,
args => $args
};
return $function;
}
}
The output is as such:
print "hello world\n"
print
Modification of a read-only value attempted at ./perl2python.pl line 65, <> line 3.
This code is an excerpt from a function, however it should be enough to illustrate what is going wrong.
The second line of the output should be the same as the first, but it appears $$line is being altered between the two print statements by the if clause.
Any advice??

If you eliminate all the confusion belonging to code that is not part of the problem, eliminate the subroutine call, with its parameter passing, and so on, you could boil the problem down to code that looks something like this:
my $line = \"Hello world!\n"; # $line contains a reference to a string literal.
$$line =~ s/Hello/Goodbye/;
And when you run that, you will get the Modification of a read-only value attempted... message, because you cannot modify a string literal.
The fact that adding my somewhere fixed it may just mean that your new lexical $line masks some other scalar variable named $line, which was the one that held a reference to a string literal.
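The difference between a reference to a literal and a reference to a variable can be seen side by side. This is a minimal sketch; the variable names are made up for illustration, and the eval block is only there to catch the error so the script keeps running.

```perl
use strict;
use warnings;

# A reference to a string literal points at a read-only value.
my $literal_ref = \"Hello world!\n";
my $ok = eval { $$literal_ref =~ s/Hello/Goodbye/; 1 };
print "literal: ", ($ok ? "modified\n" : "died: $@");

# A reference to an ordinary variable can be modified through the reference.
my $text = "Hello world!\n";
my $var_ref = \$text;
$$var_ref =~ s/Hello/Goodbye/;
print "variable: $text";
```

Copying the string into a variable and passing a reference to that variable is one way to avoid the error.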

Related

Why is a regular expression containing the exact string not matching successfully?

This might be a beginner's mistake.
The regex always fails to match, although it clearly should.
#!/usr/bin/perl
# This will print "Hello, World"
print "Hello, world\n";
my $addr = "Hello";
#if($addr =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\)/ )
if (my $addr =~ /Hello/)
{
print("matched\n\n");
}else
{
print("Didnt Match\n\n");
}
The my makes the variable you match local and uninitialised.
So you should change to
if ($addr =~ /Hello/)
The my declares a new $addr that belongs to the if statement ("my own here"), i.e. a different variable from the other $addr, the one with the larger, outer scope.
Only the outer scope variable got initialised to something which would match your regex. The second, inner one is not initialised and (at least in your case) has no matching value.
Note: Comments by other authors have proposed a best practice for avoiding/detecting the cause of your problem in future programming.
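The shadowing can be observed directly. The following is a small sketch, not taken from the original answers: it assumes that enabling warnings is acceptable, and it collects the warnings in an array (a hypothetical helper variable) so they can be printed at the end.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them immediately.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $addr = "Hello";

# Buggy form: "my" inside the condition declares a new, undefined $addr.
my $inner = 0;
if (my $addr =~ /Hello/) { $inner = 1 }

# Fixed form: the match is applied to the outer $addr.
my $outer = 0;
if ($addr =~ /Hello/) { $outer = 1 }

print "inner matched: $inner\n";
print "outer matched: $outer\n";
print "warning: $_" for @warnings;
```

With warnings enabled, the buggy form reports a use of an uninitialized value in the pattern match, which points straight at the shadowed variable.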
I know this has been answered already, but let me just expand a bit.
In general, in Perl we work in blocks, if we can call it that. If you set
my $string = 'String';
at the beginning of the script, outside any block, that variable keeps its value throughout the script unless you re-declare or re-assign it somewhere along the line:
my $string = 'string';
$string = 'Text';
That changes a bit if you work inside of a block, let's say in an if statement:
Scenario 1.
my $var = 'test';
if ($var =~ /test/) {
my $string = 'string';
}
print $string; # This will not work as $string only existed in the if block.
The following is the same scenario, but you re-declare $var in the if block and therefore it will try and match the new variable which has no value and therefore $string in this instance will never be set.
Scenario 2.
my $var = 'test';
if (my $var =~ /test/) {
my $string = 'string';
}
There is another declarator, though, which works differently from my, and that is our.
Scenario 3.
my $var = 'test';
if ($var =~ /test/) {
our $string = 'string';
}
print $string;
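The above scenario works a bit differently from scenario 1. We are declaring our $string inside the if block, and we can still reach its value outside that block, because our does not create a new lexical variable: it gives a short name to the package variable $main::string, which lives on after the block ends. Note that the plain print $string after the block only compiles without use strict; under strict you would write $main::string or declare our $string again.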
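Scenario 3 can also be written so that it runs under strict. This is a sketch under the assumption that the code lives in package main; the fully qualified name $main::string is used after the block because the short name introduced by our is only visible inside it.

```perl
use strict;
use warnings;

my $var = 'test';
if ($var =~ /test/) {
    our $string = 'string';   # assigns to the package variable $main::string
}

# The short name $string is out of scope here, but the package
# variable still holds the value assigned inside the block.
my $seen = $main::string;
print "$seen\n";
```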
You had declared another variable called $addr in the scope of the if, instead of using the variable that was initialised in the outer scope.
#!/usr/bin/perl
# This will print "Hello, World"
print "Hello, world\n";
my $addr = "Hello";
#if($addr =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\)/ )
if ($addr =~ /Hello/)
{
print("matched\n\n");
}else
{
print("Didnt Match\n\n");
}

PERL: line by line text parsing script

For exercise and curiosity, does anyone know whether the following script can be made more compact and faster?
foreach(@list){
if ($_=~"givenName: ") {
$cname=$_;
$cname=~ s/givenName: //g;
}
if ($_=~"cn: ") {
$cn=$_;
$cn=~ s/cn: //g;
}
...
}
What it does:
- It looks for a string inside the line, to see if the line contains that particular field name.
- It then strips off that string and reads the rest of the line, putting the content into the variable.
- The script reads the output of another script line by line and identifies the field on each line, putting the value into the proper variable.
If every line in the list is guaranteed to be in the format 'variableName: someText' then you could do this instead:
foreach (@list) {
/^(\w+): (.*)/ and $vars{$1} = $2; # "and" rather than "&&", which binds tighter than "="
}
It's not exactly like your solution -- it puts the results into a %vars hash instead of into variables named $cname, $cn, etc. -- but it's more succinct and general.
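The hash approach can be tried on a few sample lines. This is a minimal sketch with made-up input values standing in for the other script's output. It uses the low-precedence and, because && binds more tightly than the assignment and would not parse as intended.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real list comes from another script.
my @list = ('givenName: Ada', 'cn: Ada Lovelace', 'unrelated line');

my %vars;
for (@list) {
    # The assignment runs only when the line has the "name: value" shape.
    /^(\w+): (.*)/ and $vars{$1} = $2;
}

print "$_ => $vars{$_}\n" for sort keys %vars;
```

Lines that do not match the pattern are skipped, so the hash ends up with one entry per recognised field.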
How about something like this?
my $data = {}; #a hashref to store your data
foreach my $line (@list) {
$line =~ s/(givenName|cn|more|names):\s//g and $data->{$1} = $line;
...
}
#EDIT: now you have all your data inside the hashref and can call each var accordingly
print $data->{givenName};
print $data->{cn};
my @list = ('name', 'givenName: ', 'noname');
foreach(@list){
s/givenName: //g if /givenName: /;
my $var = $_;
...
}

Perl regular expression unintendedly seems to be modifying a source String

I am trying to match some data patterns from a file in a Perl program. Since the match could span multiple lines, I have set the line separator to undefined.
$/ = undef ;
Now, since the match can be across multiple lines and more than one, I am using smgi modifiers.
if ( $msgText =~ /$msgTypeExpr/smgi )
Now, the problem I am having is that the variable $msgText above gets modified though I am not replacing it.
Here is the relevant code:
open (HANDLE1,"$file") || die "cannot open file \n";
$/ = undef ;
while ( my $msgText = <HANDLE1> )
{
my $msgTypeExpr = "<city\\W+";
print "Attempt 1:\n";
if ( $msgText =~ /$msgTypeExpr/smgi )
{
print "matched\n";
}
else
{
print " not matched \n";
}
print "Attempt 2:\n";
if ( $msgText =~ /$msgTypeExpr/smgi )
{
print "matched\n";
}
else
{
print " not matched \n";
}
}
The test input file looks like this:
<city
name="abc">
</city>
One would expect the pattern to match twice but it is matching the first time only and not the second time around.
I have temporarily fixed this issue with assigning to a temp variable for now before matching and using that temp variable to match.
my $tmpMsgText = $msgText ;
This is the first time I am posting a question on this forum, so please pardon any etiquette mistakes I may have made, and please be kind enough to point them out so that I don't repeat them in the future.
if (//g) makes no sense. "If it matches and keeps on matching until there's no match"? Get rid of the g.
I don't know why you're using s or m either.
The s is useless since the pattern doesn't contain a dot (.).
The m is useless since the pattern doesn't contain ^ or $.
In reality, //g in scalar context acts as an iterator.
$ perl -E'$_ = "abc"; /(.)/g && say $1; /(.)/g && say $1;'
a
b
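The same iterator behaviour explains the two attempts in the question. A minimal sketch, assuming a string that stands in for the file contents and using pos to show where the next /g match would resume:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the file contents from the question.
my $msgText = "<city\nname=\"abc\">\n</city>\n";

my $first  = ($msgText =~ /<city\W+/gi) ? 1 : 0;  # matches and sets pos($msgText)
my $resume = pos($msgText);                       # offset where the next /g match starts
my $second = ($msgText =~ /<city\W+/gi) ? 1 : 0;  # searches from $resume, fails, pos is reset
my $third  = ($msgText =~ /<city\W+/gi) ? 1 : 0;  # starts from the beginning again

# Without /g every match starts at the beginning of the string.
my $plain1 = ($msgText =~ /<city\W+/i) ? 1 : 0;
my $plain2 = ($msgText =~ /<city\W+/i) ? 1 : 0;

print "with /g:    $first $second $third (resumed at offset $resume)\n";
print "without /g: $plain1 $plain2\n";
```

The string itself is never changed; only the remembered search position differs between the attempts.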
First of all, I'm not sure about reading a file like that. Modifying Perl's special variables, like $/, should be done with local, like this:
local $/ = undef;
This way the variable is only modified in the current scope (thus eliminating possible action-at-a-distance bugs). By setting $/ to undef you will read the entire file in one go, so there is no point in putting a while loop there. I'd read the whole file like this:
open my $fh, "<", "somefile" or die;
my $content = do { local $/ = undef; <$fh> };
The do block restricts the modified $/ value to that one statement (it creates a new scope).
About regex matching: remove the /g modifier after the regex. In scalar context, /g records the position where the last match ended (see pos) and the next match continues from there. To check whether the string was altered, print the variable before and after those matches. You will see that it is not modified.
Instead of:
if ( $msgText =~ /$msgTypeExpr/smgi )
put:
if ( $msgText =~ /$msgTypeExpr/smi )

Perl pattern matching when using arrays

I have a strange problem in matching a pattern.
Consider the Perl code below
#!/usr/bin/perl -w
use strict;
my @Array = ("Hello|World","Good|Day");
function();
function();
function();
sub function
{
foreach my $pattern (@Array)
{
$pattern =~ /(\w+)\|(\w+)/g;
print $1."\n";
}
print "\n";
}
__END__
The output I expect should be
Hello
Good
Hello
Good
Hello
Good
But what I get is
Hello
Good
Use of uninitialized value $1 in concatenation (.) or string at D:\perlfiles\problem.pl line 28.
Use of uninitialized value $1 in concatenation (.) or string at D:\perlfiles\problem.pl line 28.
Hello
Good
What I observed was that the pattern matches only on alternate calls.
Can someone explain what the problem with this code is?
To fix this I changed the function subroutine to something like this:
sub function
{
my $string;
foreach my $pattern (@Array)
{
$string .= $pattern."\n";
}
while ($string =~ m/(\w+)\|(\w+)/g)
{
print $1."\n";
}
print "\n";
}
Now I get the output as expected.
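It is the global /g modifier that is at work. In scalar context it remembers, for each string, the position where the last pattern match ended. When the next match fails because the end of the string has been reached, the position is reset and the following match starts over.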
Remove the /g modifier, and it will act as you expect.
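Because foreach aliases the loop variable to the array elements, the remembered position stays attached to each element between calls. The following sketch contrasts the two forms; the array and subroutine names are made up for illustration, and separate arrays are used so the two variants cannot influence each other.

```perl
use strict;
use warnings;

my @with_g    = ("Hello|World", "Good|Day");
my @without_g = ("Hello|World", "Good|Day");

sub scan_with_g {
    my @out;
    for my $item (@with_g) {
        # Scalar-context /g resumes from pos($item), which persists between calls.
        push @out, ($item =~ /(\w+)\|(\w+)/g) ? $1 : 'no match';
    }
    return "@out";
}

sub scan_without_g {
    my @out;
    for my $item (@without_g) {
        # Without /g the match always starts at the beginning of the string.
        push @out, ($item =~ /(\w+)\|(\w+)/) ? $1 : 'no match';
    }
    return "@out";
}

my @g_results     = map { scan_with_g() } 1 .. 3;
my @plain_results = map { scan_without_g() } 1 .. 3;

print "with /g:    $_\n" for @g_results;
print "without /g: $_\n" for @plain_results;
```

With /g the second call finds nothing, which mirrors the uninitialized $1 warnings in the question; without /g all three calls behave the same.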

Perl regex replacement string special variable

I'm aware of the match, prematch, and postmatch predefined variables. I'm wondering if there is something similar for the evaluated replacement part of the s/// operator.
This would be particularly useful in dynamic expressions so they don't have to be evaluated a 2nd time.
For example, I currently have %regexs which is a hash of various search and replace strings.
Here's a snippet:
while (<>) {
foreach my $key (keys %regexs) {
while (s/$regexs{$key}{'search'}/$regexs{$key}{'replace'}/ee) {
# Here I want to do something with just the replaced part
# without reevaluating.
}
}
print;
}
Is there a convenient way to do it? Perl seems to have so many convenient shortcuts, and it seems like a waste to have to evaluate twice (which appears to be the alternative).
EDIT: I just wanted to give an example: $regexs{$key}{'replace'} might be the string '"$2$1"' thus swapping the positions of some text in the string $regexs{$key}{'search'} which might be '(foo)(bar)' - thus resulting in "barfoo". The second evaluation that I'm trying to avoid is the output of $regexs{$key}{'replace'}.
Instead of using string eval (which I assume is what's going on with s///ee), you could define code references to do the work. Those code references can then return the value of the replacement text. For example:
use strict;
use warnings;
my %regex = (
digits => sub {
my $r;
return unless $_[0] =~ s/(\d)(\d)_/$r = $2.$1/e;
return $r;
},
);
while (<DATA>){
for my $k (keys %regex){
while ( my $replacement_text = $regex{$k}->($_) ){
print $replacement_text, "\n";
}
}
print;
}
__END__
12_ab_78_gh_
34_cd_78_yz_
I'm pretty sure there isn't any direct way to do what you're asking, but that doesn't mean it's impossible. How about this?
{
my $capture;
sub capture {
$capture = $_[0] if @_;
$capture;
}
}
while (s<$regexes{$key}{search}>
<"capture('" . $regexes{$key}{replace} . "')">eeg) {
my $replacement = capture();
#...
}
Well, except to do it really properly you'd have to shoehorn a little more code in there to make the value in the hash safe inside a singlequotish string (backslash singlequotes and backslashes).
If you do the second eval manually you can store the result yourself.
my $store;
s{$search}{ $store = eval $replace }e;
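Filled in with the values from the question's example, the manual eval looks like this. It is a minimal sketch: the sample text is made up, and the search and replace strings are the ones given in the question's EDIT.

```perl
use strict;
use warnings;

my $search  = '(foo)(bar)';   # pattern from the question's example
my $replace = '"$2$1"';       # replacement expression, evaluated as Perl code
my $text    = "xx foobar yy"; # hypothetical input line

my $store;
# A single /e runs the code block; the string eval inside it builds the
# replacement text, which is kept in $store and also used as the replacement.
$text =~ s{$search}{ $store = eval $replace }e;

print "replacement: $store\n";
print "result:      $text\n";
```

The replacement text is evaluated once and remains available in $store after the substitution.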
Why not assign to a local variable first:
my $replace = $regexs{$key}{'replace'};
Now you're evaluating once.