I have a string v1.2NDM. I'm trying to use regex to get 1.2.
my $string = "v1.2NDM";
$string =~ s/[^0-9.]//;
print $string;
output: 1.2NDM but I'm trying to get 1.2.
You can remove characters with the transliteration operator:
$string =~ y/0-9.//cd;
/c means complement - match any character not specified in the search list.
/d means delete characters for which no replacement is specified in the replace list (all matching characters in this case).
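A minimal sketch of the two modifiers working together. The helper name keep_digits_and_dots is hypothetical and only wraps the statement above so it can be reused:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only digits and dots, delete everything else.
sub keep_digits_and_dots {
    my ($text) = @_;
    # /c complements the search list (everything except 0-9 and .),
    # /d deletes those characters because the replacement list is empty.
    $text =~ tr/0-9.//cd;
    return $text;
}

print keep_digits_and_dots('v1.2NDM'), "\n";   # prints 1.2
```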
Add the g (global) flag so that every match is replaced, not just the first:
$string =~ s/[^0-9.]+//g;
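It now outputs 1.2. Adding + after the character class is also slightly more efficient, because each run of unwanted characters is removed in a single substitution instead of one character at a time.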
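To see the difference the flag makes, here is a small side-by-side sketch. The sub names strip_first and strip_all are made up for illustration:

```perl
use strict;
use warnings;

# Without /g only the first unwanted character ("v") is removed.
sub strip_first {
    my ($s) = @_;
    $s =~ s/[^0-9.]//;
    return $s;
}

# With /g (and +) every run of unwanted characters is removed.
sub strip_all {
    my ($s) = @_;
    $s =~ s/[^0-9.]+//g;
    return $s;
}

print strip_first('v1.2NDM'), "\n";   # prints 1.2NDM
print strip_all('v1.2NDM'), "\n";     # prints 1.2
```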
I'm trying to replace a number (20) with the value of a variable ($cntr=120) in a string using the -replace operator, but I end up with the literal $cntr in the output. What am I doing wrong? Is there a better solution?
Input string
myurl.com/search?project=ABC&startAt=**20**&maxResults=100&expand=log
Desired Output string
myurl.com/search?project=ABC&startAt=**120**&maxResults=100&expand=log
Actual Output string
myurl.com/search?project=ABC&startAt=**$cntr**&maxResults=100&expand=log
Code:
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=120
$str = $str -replace '^(.+&startAt=)(\d+)(&.+)$', '$1$cntr$3'
$str
You need to
Use double quotes to be able to use string interpolation
Use the unambiguous backreference syntax, ${n}, where n is the group ID.
In this case, you can use
PS C:\Users\admin> $str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
PS C:\Users\admin> $cntr=120
PS C:\Users\admin> $str = $str -replace '^(.+&startAt=)(\d+)(&.+)$', "`${1}$cntr`$3"
PS C:\Users\admin> $str
myurl.com/search?project=ABC&startAt=120&maxResults=100&expand=log
See the .NET regex "Substituting a Numbered Group" documentation:
All digits that follow $ are interpreted as belonging to the number group. If this is not your intent, you can substitute a named group instead. For example, you can use the replacement string ${1}1 instead of $11 to define the replacement string as the value of the first captured group along with the number "1".
A couple things here:
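If you just add the "12", you end up with $112$3, which is read as group 112 rather than group 1 followed by "12". What I did was put a backslash in front of the number and then remove it afterwards, so the replacement becomes $1\12$3. Note that the final Replace removes every backslash, so this only works when the string contains no other backslashes.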
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=12
$str = ($str -replace '^(.+&startAt=)(\d+)(&.+)$', ('$1\' + $cntr.ToString() +'$3')).Replace("\", "")
$str
I'm still looking for a way to add the literal "12" in the replacement without the extra character, but this does work.
Here's another way to do it where you have a literal string between the $1 and $3 and then replace that at the end.
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=12
$str = ($str -replace '^(.+&startAt=)(\d+)(&.+)$', ('$1REPLACECOUNTER$3')).Replace("REPLACECOUNTER", "$cntr")
$str
I need an efficient way to replace all chars from a string with another char based on a hash map
Currently I am using the regex s/// operator and that is working fine. Can I use tr instead, since I just need character-by-character conversion?
This is what I am trying:
my %map = ( a => 9 , b => 4 , c => 8 );
my $str = 'abc';
my $str2 = $str;
$str2 =~ s/(.)/$map{$1}/g; # $str2 =~ tr /(.)/$map{$1}/ Does not work
print " $str => $str2\n";
If you need to replace exactly 1 character by 1 character, tr is ideal for you:
#!/usr/bin/perl
use strict;
use warnings;
my $str = 'abcd';
my $str2 = $str;
$str2 =~ tr /abc/948/;
print " $str => $str2\n";
Unlike the code from your question, this does not delete "d": there, every character without an entry in the hash is replaced by an empty string. Output:
abcd => 948d
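If the mapping has to stay in a hash, the s/// version from the question can be given a fallback so that unmapped characters survive. This is only a sketch: it assumes Perl 5.10 or later for the // (defined-or) operator, and translate_chars is a hypothetical name.

```perl
use strict;
use warnings;

my %map = ( a => 9, b => 4, c => 8 );

# Hypothetical helper: map each character through %map,
# leaving characters without an entry unchanged.
sub translate_chars {
    my ($text) = @_;
    # /e evaluates the replacement as code; // falls back to the
    # original character when the hash has no entry for it.
    $text =~ s{(.)}{ $map{$1} // $1 }ge;
    return $text;
}

print translate_chars('abcd'), "\n";   # prints 948d
```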
No, one cannot do that with tr. That tool is very different from regex.
Its entry in Quote Like Operators in perlop says
Transliterates all occurrences of the characters found (or not found if the /c modifier is specified) in the search list with the positionally corresponding character in the replacement list [...]
and further down it adds
Characters may be literals, or (if the delimiters aren't single quotes) any of the escape sequences accepted in double-quoted strings. But there is never any variable interpolation, so "$" and "#" are always treated as literals. [...]
So one surely can't have a hash evaluated, nor match on a regex pattern in the first place.
The lack of even basic variable interpolation is explained at the very end
... the transliteration table is built at compile time, ...
We are then told of using eval, with an example, if we must use variables with tr.
In this case you'd need to first build variables, one a sequence of characters to replace (keys) and the other a sequence of their replacement characters (values), and then use them in tr via eval like in the docs. Better yet, you'd build a sub with it as in ikegami's comment. Here is a related page.
But this is the opposite of finding an approach simpler than that basic regex, the question's point.
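For completeness, the eval approach described above might look roughly like the following. This is a sketch under stated assumptions, not the code from the documentation or from the comment mentioned: make_translator is a hypothetical name, and every key and value in the hash is assumed to be a single character.

```perl
use strict;
use warnings;

# Hypothetical helper: build a tr-based sub once from a hash of
# single-character mappings (assumed: one character per key and value).
sub make_translator {
    my (%map) = @_;
    my @keys = sort keys %map;
    # quotemeta protects characters such as "/", "-" and "\" inside tr.
    my $from = join '', map { quotemeta } @keys;
    my $to   = join '', map { quotemeta $map{$_} } @keys;
    # tr builds its table at compile time, so the lists are spliced
    # into source code and compiled with a string eval.
    my $sub = eval "sub { (my \$s = shift) =~ tr/$from/$to/; return \$s }"
        or die "Could not compile tr: $@";
    return $sub;
}

my $translate = make_translator( a => 9, b => 4, c => 8 );
print $translate->('abcd'), "\n";   # prints 948d
```

The string eval runs only once, when the sub is built, so repeated calls to $translate pay no extra compilation cost.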
Given:
my $str = "foo95285734776bar";
$str =~ s/([0-9]{2,4})/_????_/g;
What single regex where '????' is the length of $1 can produce output "foo_4__4__3_bar" ?
That is, where "9528" is replaced with "_4_", "5734" with "_4_", and the remaining "776" with "_3_".
You can use the /e modifier to add Perl code into the substitution part that is then evaled.
my $str = "foo95285734776bar";
$str =~ s/([0-9]{2,4})/'_' . length($1) . '_'/ge;
print $str;
Will output
foo_4__4__3_bar
Note that you now need a full Perl expression there. That's why you have to actually quote and concatenate the underscores.
From perlop:
A /e will cause the replacement portion to be treated as a full-fledged Perl expression and evaluated right then and there. It is, however, syntax checked at compile-time. A second e modifier will cause the replacement portion to be evaled before being run as a Perl expression.
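To make the difference between one and two e modifiers concrete, here is a hedged sketch. The sub names and the sample input are invented for illustration, and /ee should only ever be used with trusted strings because it runs them as code.

```perl
use strict;
use warnings;

# /e: the replacement is Perl code, compiled along with the program.
sub double_numbers {
    my ($text) = @_;
    $text =~ s/(\d+)/$1 * 2/ge;
    return $text;
}

# /ee: the replacement is first evaluated to obtain a string, and that
# string is then evaluated as code. Only safe with trusted input.
sub apply_template {
    my ($text, $template) = @_;
    $text =~ s/(\d+)/$template/gee;
    return $text;
}

print double_numbers('width=3 height=10'), "\n";            # prints width=6 height=20
print apply_template('width=3 height=10', '$1 + 1'), "\n";  # prints width=4 height=11
```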
How can I swap the letter o with the letter e and e with o?
I just tried this but I don't think this is a good way of doing this. Is there a better way?
my $str = 'Absolute force';
$str =~ s/e/___eee___/g;
$str =~ s/o/e/g;
$str =~ s/___eee___/o/g;
Output: Abseluto ferco
Use the transliteration operator:
$str =~ y/oe/eo/;
E.g.
$ echo "Absolute force" | perl -pe 'y/oe/eo/'
Abseluto ferco
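Two related conveniences, shown as a small sketch: the /r modifier (Perl 5.14 or later) returns a transliterated copy instead of modifying the original, and tr with an empty replacement list and no /d simply counts matching characters.

```perl
use strict;
use warnings;

my $str = 'Absolute force';

# /r returns the transliterated copy and leaves $str untouched.
my $swapped = $str =~ tr/oe/eo/r;

# With an empty replacement list and no /d, tr only counts matches.
my $count = ( $str =~ tr/oe// );

print "$swapped\n";   # prints Abseluto ferco
print "$count\n";     # prints 4
```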
As has already been said, the way to do this is the transliteration operator
tr/SEARCHLIST/REPLACEMENTLIST/cdsr
y/SEARCHLIST/REPLACEMENTLIST/cdsr
Transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list. It returns the number of characters replaced or deleted. If no string is specified via the =~ or !~ operator, the $_ string is transliterated.
However, I want to commend you on your creative use of regular expressions. Your solution works, although the placeholder string _ee_ would've been sufficient.
tr only helps with single-character replacements, though, so here is how to use regular expressions for a more complicated mass replacement. Basically, you use the /e modifier to execute code in the RHS. The following also does the replacement you were aiming for:
my $str = 'Absolute force';
$str =~ s/([eo])/$1 eq 'e' ? 'o' : 'e'/eg;
print $str;
Outputs:
Abseluto ferco
Note how the LHS (left-hand side) matches both o and e, and then the RHS (right-hand side) tests which one matched and returns the opposite as the replacement.
Now, it's common to have a list of words that you want to replace, so it's convenient to just build a hash of your from/to values and then dynamically build the regular expression. The following does that:
my $str = 'Hello, foo. How about baz? Never forget bar.';
my %words = (
foo => 'bar',
bar => 'baz',
baz => 'foo',
);
my $wordlist_re = '(?:' . join('|', map quotemeta, keys %words) . ')';
$str =~ s/\b($wordlist_re)\b/$words{$1}/eg;
print $str;
Outputs:
Hello, bar. How about foo? Never forget baz.
The above would have worked for your e and o case as well, but would have been overkill. Note how I use quotemeta to escape the keys in case they contain a regular expression special character. I also intentionally used a non-capturing group around them in $wordlist_re so that the variable can be dropped into any regex and behave as desired. I then put the capturing group inside the s/// because it's important to be able to see what's being captured in a regex without having to look up the value of an interpolated variable.
The tr/// operator is best. However, if you wanted to use the s/// operator (to handle more than just single letter substitutions), you could write
$ echo 'Absolute force' | perl -pe 's/(e)|o/$1 ? "o" : "e"/eg'
Abseluto ferco
The capturing parentheses avoid the redundant $1 eq 'e' test in @Miller's answer.
From man sed:
y/source/dest/
Transliterate the characters in the pattern space which appear in source to the corresponding character in dest.
The tr command can do this too:
$ echo "Absolute force" | tr 'oe' 'eo'
Abseluto ferco
I am using Perl to clean up a raw text file which contains some odd characters like the following:
printableNNH=0A=0A =0A=0A=0A Event Registration Request=0A=0A ...
There are many occurrences of =0A in the file which I have to get rid of. They occur in runs of varying length, such as the runs of 2 and 3 in the example above.
I am using the following line in my Perl script to eliminate these characters:
tr/=0A//d; #remove =0A
That works but it also removes the zeros (0) from all telephone numbers and other content containing 0s.
Can anyone advise on pattern matching an exact substring and deleting it?
tr/// is not a regular expression operator: with the /d modifier, it deletes every individual character that appears in the search list.
In your case, tr/=0A//d deletes every occurrence of =, 0, and A.
s///, however, is the substitution operator: it replaces whatever a regular expression matches with a specified string, which in your case is the empty string.
Thus, use:
open my $input, '<', 'in.txt' or die "$!";
while (<$input>){
chomp;
s/=0A//g;
print "$_\n";
}
Or, as a one-liner:
perl -pe 's/=0A//g' inFile > outFile
Use the following if you only want to remove =0A and not every =, 0, or A:
$string=~s/=0A//g;
From perlop:
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
Transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list.
Instead of replacing all occurrences of =0A, tr replaces all occurrences of =, 0, and A:
perl -we '$_ = "foo=0AbAr0"; tr/=0A//d; print'
Prints:
foobr
Instead, you should use s/pattern/replacement/, e.g.
perl -we '$_ = "foo=0AbAr0"; s/=0A//g; print'
Prints:
foobAr0
The g modifier performs the replacement globally, i.e. for every occurrence in a line.