I am trying to replace every pair of backticks (``) with an HTML code tag.
replace:
$string = "Foo `FooBar` Bar";
with:
$string = "Foo <code>FooBar</code> Bar";
I tried these:
$pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$subject =~ s/$pattern/$replace/im;
#And
$subject =~ s/$pattern/<code/>$&</code>/im;
But neither of them works.
Assuming you meant $string instead of $subject...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$string =~ s{$pattern}{$replace}im;
say $string;
This results in...
$ perl ~/tmp/test.plx
Use of uninitialized value $& in concatenation (.) or string at /Users/schwern/tmp/test.plx line 9.
Foo <code/></code> Bar
There are a few problems here. First, $& means the string matched by the last match. That would be all of `FooBar`, backticks included. You just want FooBar, which is inside the capturing parens. You get that with $1. See Extracting Matches in the Perl Regex Tutorial.
Second, $& and $1 are variables. If you put them in double quotes like $replace = "<code/>$&</code>" then Perl interpolates them immediately, at the assignment, before any match has happened. That means $replace is <code/></code>. This is where the warning comes from. If you want to use $1, it has to go directly into the replacement part of the substitution.
Finally, when quoting regexes it's best to use qr{}. That does special regex quoting. It avoids all sorts of quoting issues.
Put it all together...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = qr{`(.*?)`};
$string =~ s{$pattern}{<code>$1</code>}im;
say $string;
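Since the question asks to replace all backtick pairs, here is a minimal sketch of the same idea with the /g modifier. The sample input with two backtick spans and the variable name $html are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input containing two backtick-quoted spans.
my $string = "Foo `FooBar` Bar `Baz`";

# [^`]* cannot run past the closing backtick, and /g repeats the
# substitution for every pair found in the string.
(my $html = $string) =~ s{`([^`]*)`}{<code>$1</code>}g;

print "$html\n";
```

Using a negated character class instead of .*? is a design choice here: it keeps a match from spanning an unpaired backtick.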
Related
I have CSV text like
1,2,3,{4,5,6,7,8},9,10,100
I want to replace the delimiter of fields between {}. The text should look like:
1,2,3,{4|5|6|7|8},9,10,100
I tried perl -0777 -pe 's/\{.*?,\}/|/g'
but nothing happens. What should I do instead?
This will do as you ask. It replaces every comma that is followed by a sequence of non-brace characters and then a closing brace.
use strict;
use warnings;
use 5.010;
my $s = '1,2,3,{4,5,6,7,8},9,10,100';
$s =~ s/,(?=[^{}]*\})/|/g;
say $s;
output
1,2,3,{4|5|6|7|8},9,10,100
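To see that the lookahead leaves commas outside the braces alone even when a line has several groups, here is a small sketch. The helper name pipe_inside_braces and the sample input are made up for this example, and it assumes the braces are balanced and not nested.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the lookahead substitution.
sub pipe_inside_braces {
    my ($line) = @_;
    # A comma qualifies only if, reading to the right, a '}' shows up
    # before any other brace character.
    $line =~ s/,(?=[^{}]*\})/|/g;
    return $line;
}

print pipe_inside_braces('1,{2,3},4,{5,6}'), "\n";
```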
You can use the following regex with $1$2| replacement string:
(\{\s*|(?<!^)\G)(\d+),(?=[,0-9]*\})
Output:
1,2,3,{4|5|6|7|8},9,10,100
Sample code:
#!/usr/bin/perl
$txt = "1,2,3,{4,5,6,7,8},9,10,100";
$txt =~ s/(\{\s*|(?<!^)\G)(\d+),(?=[,0-9]*\})/$1$2|/g;
print $txt;
Here's a command line version for Perl 5.14 and greater.
perl -pe 's/([{][\d,]+[}])/$1 =~ s~,~|~gr/ge'
The /e means the replacement is evaluated as a Perl expression rather than treated as an interpolated string. The expression takes the value of the first capture ($1) and performs a substitution on it with /r, which returns the modified copy instead of changing its target. That avoids the error you would get from trying to modify the read-only $1.
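On a Perl older than 5.14, where /r is not available, the same idea can be sketched by copying the capture into a lexical before editing it. The variable names $line and $group are assumptions for this example.

```perl
use strict;
use warnings;

my $line = '1,2,3,{4,5,6,7,8},9,10,100';

# $1 is read-only, so copy it, change the copy, and make the copy the
# value of the replacement expression.
$line =~ s!(\{[^{}]*\})! my $group = $1; $group =~ tr/,/|/; $group !ge;

print "$line\n";
```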
You can try this:
$st = "1,2,3,{4,5,6,7,8},9,10,100";
if ( $st =~ /\{(.*?)\}/ ) {
$tr = $1;
$tr =~ s/,/|/g;
# Replace the whole brace group, not just the closing brace.
$st =~ s/\{.*?\}/{$tr}/;
print "$st \n";
}
Output:
1,2,3,{4|5|6|7|8},9,10,100
I'm working on some Perl code that handles possibly malformed UTF8, and have come across an oddity with regex matching. Consider the following code:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
my $string = "One \x{FFFF_FFFF} three\n";
my $re1 = qr/\x{FFFF_FFFF}/;
my $re2 = qr/.*\x{FFFF_FFFF}/;
my $re3 = qr/.\x{FFFF_FFFF}/;
print "One\n" if $string =~ $re1;
print "Two\n" if $string =~ $re2;
print "Three\n" if $string =~ $re3;
The output is:
One
Three
Why doesn't the second regular expression also match? Is there a work-around?
I'm using Perl 5.14.2.
Because of a bug that has already been fixed in 5.18:
$ usr/perlbrew/perls/5.16.3t/bin/perl -wE'
say "One \x{FFFF_FFFF} three\n" =~ /.*\x{FFFF_FFFF}/ ?1:0'
0
$ usr/perlbrew/perls/5.18.2t/bin/perl -wE'
say "One \x{FFFF_FFFF} three\n" =~ /.*\x{FFFF_FFFF}/ ?1:0'
1
I am trying to simultaneously remove and store (into an array) all matches of some regex in a string.
To return matches from a string into an array, you could use
my @matches = $string =~ /$pattern/g;
I would like to use a similar pattern for a substitution regex. Of course, one option is:
my @matches = $string =~ /$pattern/g;
$string =~ s/$pattern//g;
But is there really no way to do this without running the regex engine over the full string twice? Something like
my @matches = $string =~ s/$pattern//g
Except that this will only return the number of substitutions, regardless of list context. I would also take, as a consolation prize, a method to use qr// where I could simply modify the quoted regex into a substitution regex, but I don't know if that's possible either (and that wouldn't preclude searching the same string twice).
Perhaps the following will be helpful:
use warnings;
use strict;
my $string = 'I thistle thing am thinking this Thistle a changed thirsty string.';
my $pattern = '\b[Tt]hi\S+\b';
my @matches;
$string =~ s/($pattern)/push @matches, $1; ''/ge;
print "New string: $string; Removed: @matches\n";
Output:
New string: I   am    a changed  string.; Removed: thistle thing thinking this Thistle thirsty
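The same technique can be packaged as a reusable routine. This is only a sketch: the helper name remove_matches, its reference-based interface, and the sample data are assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every match of $re from the string that
# $text_ref points to and returns the removed substrings.
sub remove_matches {
    my ($text_ref, $re) = @_;
    my @removed;
    # /e runs the replacement as code: record the match, then replace
    # it with the empty string.
    $$text_ref =~ s/($re)/push @removed, $1; ''/ge;
    return @removed;
}

my $text   = 'a1b22c333';
my @digits = remove_matches(\$text, qr/\d+/);

print "Remaining: $text\n";
print "Removed: @digits\n";
```

The string is scanned once, which is what the question asked for.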
Here is another way to do it without executing Perl code inside the substitution. The trick is that s/// without /g removes one match at a time, setting $1 and returning true; when nothing matches it returns false, which ends the while loop.
use strict;
use warnings;
use Data::Dump;
my $string = "The example Kenosis came up with was way better than mine.";
my @matches;
push @matches, $1 while $string =~ s/(\b\w{4}\b)\s//;
dd @matches, $string;
__END__
(
"came",
"with",
"than",
"The example Kenosis up was way better mine.",
)
I have text in the form:
Name=Value1
Name=Value2
Name=Value3
Using Perl, I would like to match /Name=(.+?)/ every time it appears and extract the (.+?) and push it onto an array. I know I can use $1 to get the text I need and I can use =~ to perform the regex matching, but I don't know how to get all matches.
A m//g in list context returns all the captured matches.
#!/usr/bin/perl
use strict; use warnings;
my $str = <<EO_STR;
Name=Value1
Name=Value2
Name=Value3
EO_STR
my @matches = $str =~ /=(\w+)/g;
# or my @matches = $str =~ /=([^\n]+)/g;
# or my @matches = $str =~ /=(.+)$/mg;
# depending on what you want to capture
print "@matches\n";
However, it looks like you are parsing an INI-style configuration file. In that case, I would recommend Config::Std.
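When the pattern has more than one capture group, m//g in list context returns the captures of every match as one flat list. Here is a minimal sketch; the sample data with distinct keys and the names $config, @pairs and %settings are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data with distinct keys.
my $config = "Host=alpha\nPort=8080\nUser=root\n";

# Two capture groups per match: the list comes back as
# key, value, key, value, ...
my @pairs = $config =~ /^(\w+)=(\w+)$/mg;

# A flat key/value list can be assigned straight to a hash.
my %settings = @pairs;

print "$_ => $settings{$_}\n" for sort keys %settings;
```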
my @values;
while(<DATA>){
chomp;
push @values, /Name=(.+?)$/;
}
print join " " => @values,"\n";
__DATA__
Name=Value1
Name=Value2
Name=Value3
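The following collects every captured value in an array. Note the /m modifier, so that $ matches at the end of each line, and $1 rather than $&, which would include the = sign.
push(@matches, $1) while ($string =~ /=(.+)$/mg);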
Use a Config:: module to read configuration data. For something simple like that, I might reach for ConfigReader::Simple. It's nice to stay out of the weeds whenever you can.
Instead of using a regular expression you might prefer trying a grammar engine like:
Parse::RecDescent
Regexp::Grammars
I've given a snippet of a Parse::RecDescent answer before on SO. However, Regexp::Grammars looks very interesting and is influenced by Perl6 rules & grammars.
So I thought I'd have a crack at Regexp::Grammars ;-)
use strict;
use warnings;
use 5.010;
my $text = q{
Name=Value1
Name = Value2
Name=Value3
};
my $grammar = do {
use Regexp::Grammars;
qr{
<[VariableDeclare]>*
<rule: VariableDeclare>
<Var> \= <Value>
<token: Var> Name
<rule: Value> <MATCH= ([\w]+) >
}xms;
};
if ( $text =~ $grammar ) {
my @Name_values = map { $_->{Value} } @{ $/{VariableDeclare} };
say "@Name_values";
}
The above code outputs Value1 Value2 Value3.
Very nice! The only caveat is that it requires Perl 5.10 and that it may be overkill for the example you provided ;-)
/I3az/
I would like to do the following:
$find = "start (.*) end";
$replace = "foo \1 bar";
$var = "start middle end";
$var =~ s/$find/$replace/;
I would expect $var to contain "foo middle bar", but it does not work. Neither does:
$replace = 'foo \1 bar';
Somehow I am missing something regarding the escaping.
On the replacement side, you must use $1, not \1.
And you can only do what you want by making replace an evalable expression that gives the result you want and telling s/// to eval it with the /ee modifier like so:
$find="start (.*) end";
$replace='"foo $1 bar"';
$var = "start middle end";
$var =~ s/$find/$replace/ee;
print "var: $var\n";
To see why the "" and double /e are needed, see the effect of the double eval here:
$ perl
$foo = "middle";
$replace='"foo $foo bar"';
print eval('$replace'), "\n";
print eval(eval('$replace')), "\n";
__END__
"foo $foo bar"
foo middle bar
(Though as ikegami notes, a single /e or the first /e of a double e isn't really an eval(); rather, it tells the compiler that the substitution is code to compile, not a string. Nonetheless, eval(eval(...)) still demonstrates why you need to do what you need to do to get /ee to work as desired.)
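As a small illustration of the difference between one /e and two, the sketch below runs the same substitution both ways. The variable names $one and $two exist only for this example.

```perl
use strict;
use warnings;

my $find    = 'start (.*) end';
my $replace = '"foo $1 bar"';    # a string holding Perl source code

# One /e: the replacement expression is just $replace, so its value
# (quotes and literal $1 included) is inserted as-is.
(my $one = 'start middle end') =~ s/$find/$replace/e;

# Two /e: that value is then evaluated as Perl code, which is where
# $1 finally gets interpolated.
(my $two = 'start middle end') =~ s/$find/$replace/ee;

print "$one\n";
print "$two\n";
```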
Deparse tells us this is what is being executed:
$find = 'start (.*) end';
$replace = "foo \cA bar";
$var = 'start middle end';
$var =~ s/$find/$replace/;
However,
s/$find/foo \1 bar/
is interpreted as:
$var =~ s/$find/foo $1 bar/;
Unfortunately it appears there is no easy way to do this.
You can do it with a string eval, but that's dangerous.
The most sane solution that works for me was this:
$find = "start (.*) end";
$replace = 'foo \1 bar';
$var = "start middle end";
sub repl {
my $find = shift;
my $replace = shift;
my $var = shift;
# Capture first
my @items = ( $var =~ $find );
$var =~ s/$find/$replace/;
for( reverse 0 .. $#items ){
my $n = $_ + 1;
# Many More Rules can go here, ie: \g matchers and \{ }
$var =~ s/\\$n/${items[$_]}/g ;
$var =~ s/\$$n/${items[$_]}/g ;
}
return $var;
}
print repl $find, $replace, $var;
A rebuttal against the ee technique:
As I said in my answer, I avoid evals for a reason.
$find="start (.*) end";
$replace='do{ print "I am a dirty little hacker" while 1; "foo $1 bar" }';
$var = "start middle end";
$var =~ s/$find/$replace/ee;
print "var: $var\n";
This code does exactly what you think it does.
If your substitution string is in a web application, you just opened the door to arbitrary code execution.
Good Job.
Also, it WON'T work with taints turned on for this very reason.
$find="start (.*) end";
$replace='"' . $ARGV[0] . '"';
$var = "start middle end";
$var =~ s/$find/$replace/ee;
print "var: $var\n"
$ perl /tmp/re.pl 'foo $1 bar'
var: foo middle bar
$ perl -T /tmp/re.pl 'foo $1 bar'
Insecure dependency in eval while running with -T switch at /tmp/re.pl line 10.
However, the more careful technique is sane, safe, secure, and doesn't fail taint. (Be assured, though, that the string it emits is still tainted, so you don't lose any security.)
As others have suggested, you could use the following:
my $find = 'start (.*) end';
my $replace = '"foo $1 bar"'; # Using \1 here instead of $1 is an error.
my $var = "start middle end";
$var =~ s/$find/$replace/ee;
The above is short for the following:
my $find = 'start (.*) end';
my $replace = '"foo $1 bar"';
my $var = "start middle end";
$var =~ s/$find/ eval($replace) /e;
I prefer the second to the first since it doesn't hide the fact that eval(EXPR) is used. However, both of the above silence errors, so the following would be better:
my $find = 'start (.*) end';
my $replace = '"foo $1 bar"';
my $var = "start middle end";
$var =~ s/$find/ my $r = eval($replace); die $@ if $@; $r /e;
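To show what the explicit error check buys you, here is a sketch that feeds it a deliberately broken replacement and catches the resulting exception. The broken string and the names $ok and $error are assumptions made for this example.

```perl
use strict;
use warnings;

my $find    = 'start (.*) end';
my $replace = '"foo $1 bar';    # deliberately broken: no closing quote
my $var     = 'start middle end';

# The block eval catches the die raised inside the replacement code.
my $ok = eval {
    $var =~ s/$find/ my $r = eval($replace); die $@ if $@; $r /e;
    1;
};
my $error = $ok ? '' : $@;

print "substitution failed: $error" unless $ok;
```

With a plain /ee the same mistake would only produce an empty replacement and a warning, not an exception you can handle.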
But as you can see, all of the above allow for the execution of arbitrary Perl code. The following would be far safer:
use String::Substitution qw( sub_modify );
my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
sub_modify($var, $find, $replace);
# perl -de 0
$match="hi(.*)"
$sub='$1'
$res="hi1234"
$res =~ s/$match/$sub/gee
p $res
1234
Be careful, though. This causes two layers of eval to occur, one for each e at the end of the regex:
$sub --> $1
$1 --> final value, in the example, 1234
I would suggest something like:
$text =~ m{(.*)$find(.*)};
$text = $1 . $replace . $2;
It is quite readable and seems to be safe. If multiple replacements are needed, a loop is easy:
while ($text =~ m{(.*)$find(.*)}){
$text = $1 . $replace . $2;
}
#!/usr/bin/perl
$sub = "\\1";
$str = "hi1234";
$res = $str;
$match = "hi(.*)";
$res =~ s/$match/$1/g;
print $res
This got me the '1234'.
See this previous SO post on using a variable on the replacement side of s/// in Perl. Look at both the accepted answer and the rebuttal answer.
What you are trying to do is possible with the s///ee form that performs a double eval on the right hand string. See perlop quote like operators for more examples.
Be warned that there are security implications of eval, and this will not work in taint mode.
I did not manage to make the most popular answers work.
The ee method complained when my replacement string contained several consecutive backreferences.
Kent Fredric's answer only replaced the first match, and I need my search and replace to be global. I did not figure out a way to make it replace all matches that didn't cause other issues. For example, I tried running the method recursively until it no longer caused the string to change, but that causes an infinite loop if the replacement string contains the search string, whereas a regular global replacement does not do that.
I attempted to come up with a solution of my own using plain old eval:
eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';
Of course, this allows for code injection. But as far as I know, the only way to escape the regex query and inject code is to insert two forward slashes in $find or one in $replace, followed by a semicolon, after which you can add code. For example, if I set the variables this way:
my $find = 'foo';
my $replace = 'bar/; print "You\'ve just been hacked!\n"; #';
The evaluated code is this:
$var =~ s/foo/bar/; print "You've just been hacked!\n"; #/gsu;
So what I do is make sure the strings don't contain any unescaped forward slashes.
First, I copy the strings into dummy strings.
my $findTest = $find;
my $replaceTest = $replace;
Then, I remove all escaped backslashes (backslash pairs) from the dummy strings. This allows me to find forward slashes that are not escaped, without falling into the trap of considering a forward slash escaped if it's preceded by an escaped backslash. For example: \/ contains an escaped forward slash, but \\/ contains a literal forward slash, because the backslash is escaped.
$findTest =~ s/\\\\//gmu;
$replaceTest =~ s/\\\\//gmu;
Now if any forward slash that is not preceded by a backslash remains in the strings, I throw a fatal error, as that would allow the user to insert arbitrary code.
if ($findTest =~ /(?<!\\)\// || $replaceTest =~ /(?<!\\)\//)
{
print "String must not contain unescaped slashes.\n";
exit 1;
}
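The two checking steps above can be sketched as one routine. The helper name has_unescaped_slash is an assumption for this example. Note also that this check alone does not make the eval safe: the replacement side of s/// is interpolated like a double-quoted string, so a construct such as @{[ ... ]} can still run code without containing any slash.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s contains a forward slash that is not
# escaped by a backslash.
sub has_unescaped_slash {
    my ($s) = @_;
    # Drop escaped backslashes (pairs) first, so that a slash following
    # a pair is not mistaken for an escaped slash.
    $s =~ s/\\\\//g;
    return $s =~ m{(?<!\\)/} ? 1 : 0;
}

# In single quotes, \\ is one backslash and \/ stays as backslash-slash.
print has_unescaped_slash('a/b'),      "\n";   # plain slash
print has_unescaped_slash('a\/b'),     "\n";   # escaped slash
print has_unescaped_slash('a\\\\/b'),  "\n";   # escaped backslash, then slash
```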
Then I eval.
eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';
I'm not an expert at preventing code injection, but I'm the only one using my script, so I'm content using this solution without fully knowing if it's vulnerable. But as far as I know, it may be, so if anyone knows if there is or isn't any way to inject code into this, please provide your insight in a comment.
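For the global case raised above, here is a rough sketch that avoids eval entirely: it walks the matches with m//g, reads the captures from the @- and @+ offset arrays, and fills \N or $N placeholders in the replacement template. The helper names fill_template and replace_all are made up for this example, and it only understands plain numbered placeholders, not \g{...} or named captures.

```perl
use strict;
use warnings;

# Hypothetical helper: fill \1, $1, \2, $2, ... in $template from @caps.
sub fill_template {
    my ($template, @caps) = @_;
    $template =~ s/[\\\$](\d+)/
        ($1 >= 1 && defined $caps[$1 - 1]) ? $caps[$1 - 1] : '' /gex;
    return $template;
}

# Hypothetical helper: replace every match of $find in $text.
sub replace_all {
    my ($text, $find, $template) = @_;
    my ($out, $last) = ('', 0);
    while ($text =~ /$find/g) {
        # Save offsets and captures first: the regex inside
        # fill_template would otherwise overwrite @- and @+.
        my ($start, $end) = ($-[0], $+[0]);
        my @caps = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : undef
        } 1 .. $#+;
        $out .= substr($text, $last, $start - $last)
              . fill_template($template, @caps);
        $last = $end;
    }
    return $out . substr($text, $last);
}

print replace_all('start middle end', 'start (.*) end', 'foo \1 bar'), "\n";
print replace_all('a1 b2', '(\w)(\d)', '$2$1'), "\n";
```

Because the output is built in a separate string, inserted text is never rescanned, so a replacement that contains the search pattern does not cause a loop.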
I'm not certain what it is you're trying to achieve. But maybe you can use this:
$var =~ s/^start/foo/;
$var =~ s/end$/bar/;
I.e. just leave the middle alone and replace the start and end.