How can I replace the backslash inside the variable?
$string = 'a\cc\ee';
$re = 'a\\cc';
$rep = "Work";
#doesnt work in variable
$string =~ s/$re/$rep/og;
print $string."\n";
#work with String
$string =~ s/a\\cc/$rep/og;
print $string."\n";
output:
a\cc\ee
Work\ee
Because you're using this inside of a regex -- you probably want quotemeta() or \Q and \E (see perldoc perlre)
perl -E'say quotemeta( q[a/asf$## , d] )'
# prints: a\/asf\$\#\#\ \,\ d
# Or, with `\Q`, and `\E`
$string =~ s/\Q$re\E/$rep/og;
print $string."\n";
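A minimal sketch of the \Q...\E fix applied to the code from the question. The variable $fixed is introduced here for illustration only, so the original string is left untouched.

```perl
use strict;
use warnings;

my $string = 'a\cc\ee';   # single quotes: the text a\cc\ee
my $re     = 'a\\cc';     # single quotes turn \\ into one backslash: a\cc
my $rep    = 'Work';

# \Q...\E escapes every non-word character in $re, so the backslash
# is matched literally instead of starting the \cc escape.
(my $fixed = $string) =~ s/\Q$re\E/$rep/g;

print "$fixed\n";   # prints: Work\ee
```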
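If you set $re = 'a\\\\cc';, it would work: single quotes collapse each \\ pair into one backslash, so the regex receives a\\cc. With $re = 'a\\cc' the variable holds a\cc, and the backslash is not handled as you expect once the variable is interpolated into the regex: \cc is read as a control character, not as a literal backslash followed by cc.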
Alternatively you could define the string with double quotes, but that's not a good practice. It's better to always use single quotes in your strings unless you explicitly want to interpolate something in the content -- it saves an infinitesimal amount of processing, but more importantly it is a hint to the reader as to what you, the programmer, intended.
The problem is the number of backslashes left in $re by the time the regex sees it. The pattern needs two backslashes to match one literal backslash, but $re ends up holding only one.
Single quotes tell Perl not to interpolate the string, but they still recognize two escapes: \\ for a backslash and \' for a quote. So 'a\\cc' stores the four characters a\cc.
Compare:
$re0 = 'a\\cc';
$re1 = "a\\cc";
When you print them out you'll see that both hold the same string:
print $re0."\n".$re1."\n";
a\cc
a\cc
On the other hand, when you write the pattern directly in the regex, nothing has consumed the backslashes beforehand, so one backslash acts as the escape and the other is what you're escaping. That is why s/a\\cc/ works while the variable version does not.
Related
The following regular expression gives me proper results when tried in Notepad++ editor but when tried with the below perl program I get wrong results. Right answer and explanation please.
The link to file I used for testing my pattern is as follows:
(http://sainikhil.me/stackoverflow/dictionaryWords.txt)
Regular expression: ^Pre(.*)al(\s*)$
Perl program:
use strict;
use warnings;
sub print_matches {
my $pattern = "^Pre(.*)al(\s*)\$";
my $file = shift;
open my $fp, $file;
while(my $line = <$fp>) {
if($line =~ m/$pattern/) {
print $line;
}
}
}
print_matches @ARGV;
A few thoughts:
You should not escape the dollar sign
The capturing group around the whitespaces is useless
Same for the capturing group around the dot .
which leads to:
^Pre.*al\s*$
If you don't want words like precious final to match (because of the middle whitespace), change the regex to:
^Pre\S*al\s*$
Included in your code:
while(my $line = <$fp>) {
if($line =~ /^Pre\S*al\s*$/m) {
print $line;
}
}
You're getting messed up by assigning the pattern to a variable before using it as a regex and putting it in a double-quoted string when you do so.
This is why you need to escape the $, because, in a double-quoted string, a bare $ indicates that you want to interpolate the value of a variable. (e.g., my $str = "foo$bar";)
The reason this is causing you a problem is because the backslash in \s is treated as escaping the s - which gives you just plain s:
$ perl -E 'say "^Pre(.*)al(\s*)\$";'
^Pre(.*)al(s*)$
As a result, when you go to execute the regex, it's looking for zero or more s characters rather than zero or more whitespace characters.
The most direct fix for this would be to escape the backslash:
$ perl -E 'say "^Pre(.*)al(\\s*)\$";'
^Pre(.*)al(\s*)$
A better fix would be to use single quotes instead of double quotes and not escape the $:
$ perl -E "say '^Pre(.*)al(\s*)$';"
^Pre(.*)al(\s*)$
The best fix would be to use the qr (quote regex) operator instead of single or double quotes, although that makes it a little less human-readable if you print it out later to verify the content of the regex (which I assume to be why you're putting it into a variable in the first place):
$ perl -E "say qr/^Pre(.*)al(\s*)$/;"
(?^u:^Pre(.*)al(\s*)$)
Or, of course, just don't put it into a variable at all and do your matching with
if($line =~ m/^Pre(.*)al(\s*)$/) ...
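A rough sketch of the qr approach, run against a few in-memory lines instead of the dictionary file. The sample words in @lines are made up for illustration and are not taken from the linked file.

```perl
use strict;
use warnings;

# qr// compiles the pattern directly, so \s and $ are never touched
# by string-quoting rules.
my $pattern = qr/^Pre(.*)al(\s*)$/;

# Hypothetical sample input standing in for the dictionary file.
my @lines = ("Prenatal\n", "Presents\n", "Premedical\r\n", "prenatal\n");

# Keep only the lines that match the compiled pattern.
my @matches = grep { $_ =~ $pattern } @lines;

print @matches;
```

Only the lines starting with a capital Pre and ending in al (plus trailing whitespace) are kept, so Presents and the lowercase prenatal are dropped.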
Try removing trailing newline character(s):
while(my $line = <$fp>) {
$line =~ s/[\r\n]+$//s;
And, to match only words that begin with Pre and end with al, try this regular expression:
/^Pre\w*al$/
(\w matches a word character, i.e. a letter, digit, or underscore, rather than just any character)
And, if you want to match both Pre and pre, do a case-insensitive match:
/^Pre\w*al$/i
Here is an example of the code:
my $testVar = "^.+|gg$";
That line is causing an error. It says the dollar sign should be escaped but I want the entire line to match that so I need the ^ and $ characters.
I'm a bit new to Perl and would much rather assign my regexes to variables for ease of use, but I'm not sure how.
If you want to produce the string
^.+|gg$
then you must use one of the following literals
my $pat = '^.+|gg$';
my $pat = "^.+|gg\$";
Note that $ must be escaped in double-quoted string literals because $ marks the start of a variable to interpolate in double-quoted string literals.
But it's simpler with qr, and it compiles the pattern for you.
my $re = qr/^.+|gg$/;
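As a small illustration of using the compiled pattern afterwards, here is a sketch that tests two sample strings. The test strings are assumptions chosen for the example. Note that the alternation splits the pattern into ^.+ on one side and gg$ on the other.

```perl
use strict;
use warnings;

# Compiled once; can be reused in any later match.
my $re = qr/^.+|gg$/;

# Hypothetical inputs: a non-empty string and an empty one.
for my $text ('egg', '') {
    my $verdict = $text =~ $re ? 'matches' : 'does not match';
    print "'$text' $verdict\n";
}
```

The compiled pattern can also be interpolated into a larger regex, for example /prefix$re/, without any extra escaping.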
Here's an example of how you should do it:
$variable =~ /(find something)/;
If you want to assign a pattern to a variable, here's how you could do it:
my $pattern = qr"pattern";
my $content = "content";
my @results = $content =~ m/($pattern)/;
You'll need, indeed, to escape every special character.
When I do this
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $s = 'dfgdfg5 )';
my $a = '5 )';
my $b = '567';
$s =~ s/$a/$b/g;
print Dumper $s;
I get
Unmatched ) in regex; marked by <-- HERE in m/5 ) <-- HERE / at ./test.pl line 11.
The problem is that $a contains a ).
How do I prevent the regex from failing?
Update
I get the string in $a from a database query, so I can't change it. Or would it be possible to make an $a2 where "something" searches for ) and replaces them with \)?
You need to escape it. Either manually by adding backslash in front of it, or by using quotemeta or the \Q sequence inside the regex:
$a = quotemeta($a);
Or
$s =~ s/\Q$a/$b/g;
ETA: This is a good option if you want to match literal strings from a database query.
You should also be aware that it is not a good idea to use $a and $b as variable names, since they shadow the predefined variables that sort uses, e.g. sort { $a <=> $b } @foo.
The simple answer is to backslash escape the paren. my $a = '5 \)'; In your case, as your post mentions, you aren't the one creating the strings, so literally escaping them isn't an option.
It may be simpler to just wrap the variable that's being interpolated by the regex inside of a \Q ... \E.
$s =~ s/\Q$a\E/$b/g;
The quotemeta() function may also be helpful to you, depending on how your code is factored. With that option you would pass $a through quotemeta before interpolating it in the regex. \Q...\E is probably easier in this situation, but if your code is simplified by using quotemeta instead, it's there for you.
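Both options can be sketched side by side. The names $needle, $replace, $inline and $upfront are chosen for this example (also to avoid $a and $b); $needle stands in for the value that would come from the database.

```perl
use strict;
use warnings;

my $text    = 'dfgdfg5 )';
my $needle  = '5 )';    # stands in for the value read from the database
my $replace = '567';

# Option 1: escape inline with \Q...\E.
(my $inline = $text) =~ s/\Q$needle\E/$replace/g;

# Option 2: escape once up front with quotemeta().
my $escaped = quotemeta $needle;
(my $upfront = $text) =~ s/$escaped/$replace/g;

print "$inline\n$upfront\n";   # both lines print: dfgdfg567
```

The quotemeta form may be the more convenient one when the same escaped value is reused in several patterns.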
Use \) instead of just ). ) is special because it normally closes a group in a pattern, so you need to escape it to match it literally.
Escape the parentheses with a backslash:
my $a = '5 \)';
Or use \Q inside the regexp:
$s =~ s/\Q$a/$b/g;
Also when storing regexps in a variable, you should look into the regexp quote operator: http://perldoc.perl.org/perlop.html#Regexp-Quote-Like-Operators
my $a = qr/5 \)/oi;
In Perl regular expression you need to mask special chars with a backslash \.
Try
my $a = '5 \)';
my $b = '567';
$s =~ s/$a/$b/g;
For details and a good start see perldoc perlretut
Update: I didn't know the RE came from a database. Well, the code above works nevertheless. The hint for the tutorial still applies.
I think you just need to escape the parentheses, i.e. replace ) with \)
I am a beginner in perl and I have a query regarding pattern matching.
I came across a line in perl where it was written
$variable =~ s-/\Z--;
Further on in the code, another variable was modified in a similar way:
$variable1 =~ s-/--;
Can you please tell me what these two lines do?
I want to know what s-/\Z-- and s-/-- mean.
$variable =~ s-/\Z--;
- is used as a delimiter here. However, best practice suggests that you either use / or {} as delimiters.
It could be re-written as:
$variable =~ s{/\Z}{}; # remove a / at the end of a string
Consider:
$variable1 =~ s-/--;
Again, it could be re-written as:
$variable1 =~ s{/}{}; # remove the first /
The s/// operator in Perl is a substitution operation, which performs a search-and-replace on a string using a special kind of pattern called a regular expression. You can read more about regular expressions and Perl's pattern matching in the man pages that come with Perl:
man perlretut
man perlre
If you don't have these on your system, try searching Google for the same.
Applying a substitution to a variable is done with the =~ operator. So the following replaces the first instance of 'foo' in the variable $var with 'bar' (add the g flag, s/foo/bar/g, to replace all instances).
$var =~ s/foo/bar/;
All the Perl operators are documented on the 'perlop' man page.
Even though the most common separator character is a slash (hence s///), you can also use any other punctuation character as a separator. So in this case, the author has decided to use the dash (-) as the separator.
Here's the same line of code above using dash as a separator:
$var =~ s-foo-bar-;
In your case, the dash doesn't seem to add any clarity to the code, so it might be best to update it to use the conventional slashes instead.
The s/// search and replace function in Perl can be used with different delimiters, which is what is done in this case. They have replaced / with the minus sign -, or dash.
The s-/-- removes the first / from the string.
The s-/\Z-- matches and removes a slash at the end of the line. I think this is better written: s{/$}{}.
$variable1 =~ s-/--; could be written as
$variable =~ s{/}{}xms;
or this
$variable =~ s/ \/ //xms;
It means delete the first / in the string.
Regarding s-/\Z--, it is usually written like this
$variable =~ s{/ \Z}{}xms;
or this
$variable =~ s/ \/ \Z //xms;
It means delete a / if it is at the end of the string (\Z).
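To see both substitutions in action, here is a short sketch on a sample path. The value of $path and the result variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $path = '/usr/local/lib/';   # hypothetical sample value

# Dash delimiters, as in the original code.
(my $dash_form = $path) =~ s-/\Z--;      # /usr/local/lib

# The same substitution with brace delimiters.
(my $no_trailing = $path) =~ s{/\Z}{};   # /usr/local/lib

# Without \Z, only the first slash is removed.
(my $no_first = $path) =~ s{/}{};        # usr/local/lib/

print "$dash_form\n$no_trailing\n$no_first\n";
```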
I have the following string:
$foo = "'fdsfdsf', 'fsdfdsfdsfdsfds fdsf sdfd f sfs', 'fsdfsdfsd f' fdfsdfdsfdsfsf";
I want to get everything between ' ' but from first to last occurrence.
I have tried to search my string by /.*('.*').*/ but only 'fdsfdsf' was taken, how to turn on greedy or something like that?
* is greedy (*? is the non-greedy version). The following regex works fine: /'(.*)'/.
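The difference between the two quantifiers can be sketched on the string from the question. The variable names $greedy and $lazy are chosen for this example.

```perl
use strict;
use warnings;

my $foo = "'fdsfdsf', 'fsdfdsfdsfdsfds fdsf sdfd f sfs', 'fsdfsdfsd f' fdfsdfdsfdsfsf";

# Greedy: runs from the first quote to the last quote in the string.
my ($greedy) = $foo =~ /'(.*)'/;

# Non-greedy: stops at the first closing quote it can find.
my ($lazy) = $foo =~ /'(.*?)'/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```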
I am going to go out on a limb and assume that you actually want to get everything between single quotes within each comma-separated field.
#!/usr/bin/perl
use strict; use warnings;
my $foo = "'fdsfdsf', 'fsdfdsfdsfdsfds fdsf sdfd f sfs', 'fsdfsdfsd f' fdfsdfdsfdsfsf";
my @fields = map /'([^']*)'/, split ', ', $foo;
use YAML;
print Dump \@fields;
Output:
---
- fdsfdsf
- fsdfdsfdsfdsfds fdsf sdfd f sfs
- fsdfsdfsd f
Try
@result = $subject =~ m/'([^']*)'/g;
This will give you the strings enclosed in single quotes within each field in the string.
Of course, the quotes need to be balanced, and the regex match will be thrown off by escaped quotes in your string, so if those may occur, the pattern needs to be modified.
For example,
m/'((?:\\.|[^'])*)'/
would work if quotes can be escaped with backslashes.
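A minimal sketch of that pattern on a made-up input containing one backslash-escaped quote. The sample string is an assumption for illustration, and note that the captured text keeps its backslash.

```perl
use strict;
use warnings;

# Hypothetical input; q{} keeps the backslash before the inner quote.
my $subject = q{'plain', 'it\'s escaped', 'last one' trailing};

# \\. consumes a backslash plus the character after it, so an escaped
# quote does not end the field; [^'] consumes everything else.
my @result = $subject =~ m/'((?:\\.|[^'])*)'/g;

print "$_\n" for @result;
```

If the backslashes should be removed from the captured fields, that would need a separate substitution afterwards.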
Below seems to be working as well. Please check:
@res = $str =~ m/'([A-Z|a-z| ]*)'*/g;