Someone has written something like the following code:
#! /usr/bin/perl
my $myVar = 'somecomplicatedString';
my $someString = 'mySystemvariable=SOMESTR';
if ( $someString =~ /SOMESTR/ ) {
    $someString =~ s{SOMESTR}{$myVar};
}
# $someString now equals 'mySystemvariable=somecomplicatedString'
What is the difference between the s/// operator and the s{}{} operator?
You can use almost any non-whitespace character as the delimiter for Perl's match operator (m//) and substitution operator (s///).
Other examples:
s#oldTest#newTest#
s/oldTest/newTest/
s!oldTest!newTest!
s~oldTest~newTest~
s{oldTest}{newTest} # Here we use appropriate opening and closing braces.
m/someText/
m!someText!
/someText/ # You can omit the `m` when `/` is the delimiter
!someText! # This is wrong: you can't omit the `m` with any other delimiter
The major advantage of varying the delimiter is that you can avoid escaping the delimiter character inside the pattern by choosing a character that does not occur in it.
So, using # as the delimiter, you don't need to escape / in the pattern.
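As a small sketch of that point (the path and the /opt/ replacement are made-up sample values), the same substitution can be written with and without escaped slashes:

```perl
use strict;
use warnings;

my $path = '/usr/local/bin/perl';   # hypothetical sample value

# With the default delimiter, every slash in the pattern needs a backslash.
(my $escaped = $path) =~ s/\/usr\/local\//\/opt\//;

# With # as the delimiter, the slashes can be written as they are.
(my $plain = $path) =~ s#/usr/local/#/opt/#;

print "$escaped\n";   # /opt/bin/perl
print "$plain\n";     # /opt/bin/perl
```

Both statements do the same thing; only the second one is easy to read.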
From perlop doc:
Under m// operator section:
If "/" is the delimiter then the initial m is optional. With the m you can use any pair of non-whitespace (ASCII) characters as delimiters. This is particularly useful for matching path names that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is the delimiter, then a match-only-once rule applies, described in m?PATTERN? below.
Under s/// operator section:
Any non-whitespace delimiter may replace the slashes. Add space after the s when using a character allowed in identifiers.
It is the same operator, just written with different delimiters, which can make the code more readable.
{} are convenient when using the /e modifier:
$string =~ s{(\d)}{
# ...
$1 + 1;
}e;
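Here is a runnable version of that idea, as a sketch only: the sample string is invented, and I have added /g (which the snippet above does not have) so that every digit is processed.

```perl
use strict;
use warnings;

my $string = 'a1 b2 c9';   # hypothetical sample value

# With /e the replacement part is Perl code; its value replaces each match.
$string =~ s{(\d)}{
    $1 + 1;    # increment the digit that was captured
}ge;

print "$string\n";   # a2 b3 c10
```

The braces let the replacement read like an ordinary code block, which is hard to achieve with s/// once the code spans several lines.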
I was unable to decipher what this regex does:
$c =~ s^.*/^^g;
I don't have access to the input or the output.
Does anyone know what it does?
The default delimiter for s/// is the slash, but you can use almost any non-whitespace character as an alternative.
So
$c =~ s^.*/^^g
is equivalent to
$c =~ s/.*\///g
Note that using the conventional delimiter requires the slash within the pattern itself to be escaped.
Some options are better than others, and in the case where you're just trying to avoid escaping slashes within the pattern I think a pipe character | is better.
I wouldn't hope to learn too much from this programmer. As you have experienced, ^ is a poor and confusing choice. Also, the /g modifier is superfluous, and $c is a terrible choice for an identifier.
I would write
$c =~ s|.*/||
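To check that the three spellings behave the same, here is a quick sketch. The input path is an assumption on my part, since the original input is not available:

```perl
use strict;
use warnings;

my $file = '/var/log/app/error.log';   # hypothetical input

(my $caret = $file) =~ s^.*/^^g;     # original form, ^ as the delimiter
(my $slash = $file) =~ s/.*\///g;    # default delimiter, slash escaped
(my $pipe  = $file) =~ s|.*/||;      # pipe delimiter, no /g

print "$caret\n";   # error.log
print "$slash\n";   # error.log
print "$pipe\n";    # error.log
```

Because .* is greedy, the first match already swallows everything up to the last slash, so on a single-line string like this one /g finds nothing more to do.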
Here ^ is used as the delimiter.
Almost any non-whitespace character may be used as a regex delimiter.
s^.*/^^g;
s/.*\///g;
Both regexes are the same.
A non-standard delimiter is mostly used to avoid the need to escape the delimiter character within a regex pattern. For example:
$c = "this is a string with / slash";
Written with the default delimiter, your regex would be
$c =~ s/.*\///
          ^^
Here the slash has to be escaped.
You can pick whichever delimiter you like, as simbabque mentioned in a comment.
s{foo}{bar}gs # here curly braces are the delimiters
s[some][same] # here square brackets are the delimiters
Other characters can serve as the delimiter too, whichever is most convenient.
Choosing a delimiter that does not occur in the pattern avoids the escaping.
I’m using a variable to search and replace a string using Perl.
I want to replace the string 23.0 with 23.0.1, so I tried this:
my $old="23.0";
my $new="23.0.1";
$_ =~ s/$old/$new/g;
The problem is that it also replaced the string 2310, so I tried:
my $old="23\.0"
and also /ee.
But I can’t get the correct syntax for it to work. Can someone show me the correct syntax?
There are two things that will help you here:
The quotemeta function, which escapes regex metacharacters, and the \Q and \E escapes, which apply the same quoting to whatever is interpolated between them inside a regex.
print quotemeta "21.0";
Or:
my $old="23.0";
my $new="23.0.1";
my $str = "2310";
$str =~ s/\Q$old\E/$new/g;
print $str;
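To see the difference side by side, here is a small sketch (the sample text is made up) that runs the substitution with and without \Q...\E:

```perl
use strict;
use warnings;

my $old = '23.0';
my $new = '23.0.1';

my $text = 'version 23.0 build 2310';   # hypothetical sample text

(my $unquoted = $text) =~ s/$old/$new/g;       # "." matches any character
(my $quoted   = $text) =~ s/\Q$old\E/$new/g;   # "." matches only a literal dot

print quotemeta($old), "\n";   # 23\.0
print "$unquoted\n";           # version 23.0.1 build 23.0.1
print "$quoted\n";             # version 23.0.1 build 2310
```

Only the quoted version leaves 2310 alone.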
Just use single quotes and escape the dot.
my $old='23\.0';
To complement Sobrique's excellent answer, let me note that the reason your attempt with "23\.0" didn't work is that "23\.0" and "23.0" evaluate to the same string: in a double-quoted string literal, the backslash escape sequence \. simply evaluates to a plain dot (.).
There are several things you could do to avoid this:
If you indeed want to match a fixed string, and don't need or want to include any special regexp metacharacters in it, you can do as Sobrique suggests and use quotemeta or \Q to escape them.
In particular, this is almost always the correct solution if the string to be matched comes from user input. If you do want to allow some limited set of non-literal metacharacters, you can unescape those after running the pattern through quotemeta. For a simple example, here's a quick-and-dirty way to turn a basic glob-like pattern (using the metacharacters ? and * for "any character" and "any string of characters" respectively) into an equivalent regexp:
my $regexp = "^\Q$glob\E\$"; # quote and anchor the pattern
$regexp =~ s/\\\?/./g; # replace "?" (escaped to "\?" by \Q) with "."
$regexp =~ s/\\\*/.*/g; # replace "*" (escaped to "\*" by \Q) with ".*"
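For a quick check of how this behaves, the three lines can be wrapped in a helper and tried on a few names. The function name glob_to_regexp, the glob, and the file names are all made up for this sketch:

```perl
use strict;
use warnings;

# glob_to_regexp is a hypothetical helper wrapping the three lines above.
sub glob_to_regexp {
    my ($glob) = @_;
    my $regexp = "^\Q$glob\E\$";   # quote and anchor the pattern
    $regexp =~ s/\\\?/./g;         # turn "\?" back into "."
    $regexp =~ s/\\\*/.*/g;        # turn "\*" back into ".*"
    return $regexp;
}

my $re = glob_to_regexp('report-??.t*');

my @names = qw(report-01.txt report-1.txt report-ab.tsv areport-01.txt);
my @hits  = grep { /$re/ } @names;

print "@hits\n";   # report-01.txt report-ab.tsv
```

Being quick-and-dirty, this does not try to handle globs that themselves contain backslashes.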
Conversely, if you want to have a literal regexp pattern in your code, without immediately matching it against something, you can use the qr// regexp-like quote operator, like this:
my $old = qr/\b23\.0(\.0)?\b/; # match 23.0 or 23.0.0 (but not 123.012!)
my $new = "23.0.1"; # just a literal string
s/$old/$new/g; # replace any string matching $old in $_ with $new
Note that qr// has other effects beyond just allowing you to use regexp syntax in a string literal: it actually pre-compiles the pattern into a special Regexp object, so that it doesn't need to be recompiled every time it's used later. In particular, as a side effect, the string representation of a qr// regexp literal will usually not exactly match the original content, although it will be equivalent as a regexp. For example, say qr/\b23\.0(\.0)?\b/ will, on my Perl version, output (?^u:\b23\.0(\.0)?\b).
You could also just use a normal double-quoted string literal, and double any backslashes in it, but that's (usually) less efficient than using qr//, and also less readable due to leaning toothpick syndrome.
Using a single-quoted string literal would be slightly better, since backslashes in a single-quoted string are only special when followed by another backslash or a single quote. Even so, readability can still suffer if you happen to need to match any literal backslashes in your regexp, not to mention that it's easy to create subtle bugs if you forget to double a backslash in those rare places where it's still needed.
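To make the comparison concrete, here is a sketch of the same pattern written all three ways (the pattern and sample text are chosen just for illustration). All three produce the same substitution; they differ only in how many backslashes you have to type:

```perl
use strict;
use warnings;

my $qr     = qr/\b23\.0\b/;     # regexp syntax written directly
my $double = "\\b23\\.0\\b";    # double-quoted: every backslash doubled
my $single = '\b23\.0\b';       # single-quoted: backslashes kept as typed

my @results;
for my $pattern ($qr, $double, $single) {
    my $text = 'from 23.0 to 2310';   # hypothetical sample text
    $text =~ s/$pattern/23.0.1/g;
    push @results, $text;
}

print "$_\n" for @results;   # from 23.0.1 to 2310 (three times)
```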
I want to use regular expression to substitute forward slash with opening brackets, separated by a comma. For example, for
$str = "c/g/c/g/c/g/c/g";
the result should be
$str = "c),(g),(c),(g),(c),(g),(c),(g";
I wrote the following line, but it does not work:
$observed = ~ s///);(//;
Any advice on how to solve it?
If you want to use a literal forward slash in a substitution that is delimited with forward slashes, e.g. s/../../g, you first have to escape the forward slash:
s/\//),(/; # Note: this only replaces a single occurrence
To replace all occurrences, you can add the g (global) modifier:
s/\//),(/g;
You can also choose any other delimiter for the substitution expression, the following are all valid and equivalent:
s|/|),(|g;
s./.),(.g;
s#/#),(#g;
s{/}{),(}g;
It is also important to note that
$str = c/g/c/g/c/g/c/g.
is not a valid statement in Perl. All strings need to be enclosed in single or double quotes, or in the corresponding q{} and qq{} forms:
$str = 'c/g/c/g/c/g/c/g.';
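Putting it together, a short sketch using the string from the question shows the effect of leaving out or adding /g:

```perl
use strict;
use warnings;

my $str = 'c/g/c/g/c/g/c/g';

(my $first = $str) =~ s/\//),(/;     # without /g: only the first slash
(my $all   = $str) =~ s{/}{),(}g;    # with /g: every slash

print "$first\n";   # c),(g/c/g/c/g/c/g
print "$all\n";     # c),(g),(c),(g),(c),(g),(c),(g
```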
What is the difference between the two lines?
if ($data =~ m/$str/) {
####         ^--- HERE
print "OK";
}
and
if ($data =~ /$str/) {
print "OK";
}
The whole difference is just an 'm'.
The m indicates that you are using the match operator, as opposed to substitution (s///), transliteration (tr///), or other operators that use / as a delimiter. If you use / as the delimiter, the m is optional: a standalone /.../ is treated as m/.../. The m is mandatory if you want to use other characters as delimiters around the regexp, as in $str =~ m|$regexp|. This is useful for writing more readable code when your regexp contains lots of / characters, since you don't have to escape them.
Additionally, a few of the delimiters that can be used with m change how the pattern is processed, as the documentation explains:
http://perldoc.perl.org/perlop.html#Regexp-Quote-Like-Operators
With the m you can use any pair of non-whitespace (ASCII) characters
as delimiters. This is particularly useful for matching path names
that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is
the delimiter, then a match-only-once rule applies, described in
m?PATTERN? below. If "'" (single quote) is the delimiter, no
interpolation is performed on the PATTERN. When using a character
valid in an identifier, whitespace is required after the m.
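A small sketch of those rules in practice (the sample string is invented for illustration):

```perl
use strict;
use warnings;

my $data = 'path=/usr/bin/perl user@example.com';   # hypothetical sample value

my $bare   = $data =~ /perl/          ? 1 : 0;   # m omitted, / is the delimiter
my $braces = $data =~ m{/usr/bin/}    ? 1 : 0;   # m required, slashes left unescaped
my $quoted = $data =~ m'user@example' ? 1 : 0;   # ' delimiter: @example is not interpolated

print "$bare $braces $quoted\n";   # 1 1 1
```

With ordinary delimiters the third pattern would try to interpolate an array named @example, which fails to compile under strict.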
I am trying to write a regex for Perl that checks for alphanumeric values (which may contain spaces) but excludes the underscore "_" and limits the length to 30 characters. I am trying the code below, but it is not working. Could anyone please tell me what I am doing wrong? It even accepts special characters as alphanumeric values: $currLine = 'Kapil# 123' should not be a valid value.
** Apologies: by $currLine = "regex" I meant $currLine =~ "regex"
if ($currLine = /^[a-zA-Z0-9]{1,30}$/){
say "Line3 Good: ", $currLine;
} else {
say "Error in Line 3: Name not alphanumeric ";
}
$currLine = /^[a-zA-Z0-9]{1,30}$/
means
$currLine = $_ =~ /^[a-zA-Z0-9]{1,30}$/
You want to use
$currLine =~ /^[a-zA-Z0-9]{1,30}$/
Now on to the other problems.
You didn't allow spaces. (What follows allows whitespace. If you mean SPACE specifically, use that instead of \s).
You allow a trailing newline.
You allow 31 characters if the 31st is a newline.
You forbid many alphanumeric characters.
You forbid zero characters.
$currLine =~ /^[\p{Alnum}\s]{0,30}\z/
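The trailing-newline point is easy to miss, so here is a minimal sketch of it. The character class is simplified to ASCII letters, digits and a space, and the input is made up:

```perl
use strict;
use warnings;

my $with_newline = "Kapil 123\n";   # hypothetical input read without chomp

# $ also matches just before a newline at the end of the string.
my $dollar = $with_newline =~ /^[a-zA-Z0-9 ]{1,30}$/  ? 1 : 0;

# \z matches only at the very end of the string.
my $z      = $with_newline =~ /^[a-zA-Z0-9 ]{1,30}\z/ ? 1 : 0;

print "$dollar $z\n";   # 1 0
```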
You are using = (assignment) where you should have =~ (bind).
Enabling warnings may have alerted you to this. The code you have is matching $_ and then assigning the results of the match to $currLine.
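Here is a sketch of what the two operators actually do. I have put an unrelated value in $_ on purpose so the effect is visible; in the original code $_ was presumably undefined, which is what would trigger the warning.

```perl
use strict;
use warnings;

my $currLine = 'Kapil# 123';
my $pattern  = qr/^[a-zA-Z0-9]{1,30}$/;

$_ = 'abc';    # unrelated value sitting in $_ (assumption for this demo)

# Bind operator: the pattern is tested against $currLine.
my $bound = ($currLine =~ $pattern) ? 'valid' : 'invalid';

# Assignment: the pattern is tested against $_, and the result of that
# match (true here, because 'abc' matches) overwrites the variable.
my $copy     = $currLine;
my $assigned = ($copy = /^[a-zA-Z0-9]{1,30}$/) ? 'valid' : 'invalid';

print "$bound\n";      # invalid
print "$assigned\n";   # valid
print "$copy\n";       # 1
```

Note that the assignment form also destroys the original contents of the variable.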
For your regular expression to match alphanumeric values that include spaces, you need to include whitespace inside your character class. You should also be using the bind operator =~ instead of = here.
if ( $currLine =~ /^[a-z0-9\s]{1,30}$/i ) { ...
Note: I included the i modifier for case-insensitive matching.
You are using the assignment operator (=) instead of the binding operator (=~). You should change the if statement to:
if ($currLine =~ /^[a-zA-Z0-9]{1,30}$/)
This can also be shortened to:
if ($currLine =~ /^[^\W_]{1,30}$/)
[^\W] matches anything except what \W matches, which is the same set as \w. To exclude _ as well, we add it to the negated character class, giving [^\W_]. Note, however, that this matches much more than just [a-zA-Z0-9]: it also includes other Unicode characters that count as word characters. To restrict the regex to ASCII, add the /a character set modifier:
/^[^\W_]{1,30}$/a
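A short sketch of the difference, assuming Perl 5.14 or later (where /a is available). The test string uses a Greek alpha, written as \x{3b1}, as an example of a non-ASCII word character:

```perl
use strict;
use warnings;

my $name = "\x{3b1}bc123";   # Greek alpha followed by ASCII (sample value)

my $unicode = $name =~ /^[^\W_]{1,30}$/  ? 1 : 0;          # alpha counts as a word character
my $ascii   = $name =~ /^[^\W_]{1,30}$/a ? 1 : 0;          # /a limits \w to [A-Za-z0-9_]
my $under   = 'snake_case' =~ /^[^\W_]{1,30}$/a ? 1 : 0;   # underscore is excluded either way

print "$unicode $ascii $under\n";   # 1 0 0
```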