How can I swap the letter o with the letter e and e with o?
I just tried this but I don't think this is a good way of doing this. Is there a better way?
my $str = 'Absolute force';
$str =~ s/e/___eee___/g;
$str =~ s/o/e/g;
$str =~ s/___eee___/o/g;
Output: Abseluto ferco
Use the transliteration operator:
$str =~ y/oe/eo/;
E.g.
$ echo "Absolute force" | perl -pe 'y/oe/eo/'
Abseluto ferco
As has already been said, the way to do this is the transliteration operator
tr/SEARCHLIST/REPLACEMENTLIST/cdsr
y/SEARCHLIST/REPLACEMENTLIST/cdsr
Transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list. It returns the number of characters replaced or deleted. If no string is specified via the =~ or !~ operator, the $_ string is transliterated.
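A minimal sketch of the two behaviours described in that documentation excerpt, assuming Perl 5.14 or later for the /r flag. The helper names swap_oe and count_oe are made up for illustration:

```perl
use strict;
use warnings;

# Return a copy of the text with 'o' and 'e' swapped.
# /r (Perl 5.14+) makes tr/// return the new string instead of a count.
sub swap_oe {
    my ($text) = @_;
    return $text =~ tr/oe/eo/r;
}

# Without /r, tr/// modifies its target in place and returns the number
# of characters it transliterated. $text is a private copy here.
sub count_oe {
    my ($text) = @_;
    return $text =~ tr/oe/eo/;
}

print swap_oe('Absolute force'), "\n";    # Abseluto ferco
print count_oe('Absolute force'), "\n";   # 4
```
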
However, I want to commend you on your creative use of regular expressions. Your solution works, although the placeholder string _ee_ would've been sufficient.
tr is only going to help you for character replacements though, so I'd like to quickly teach you how to use regular expressions for a more complicated mass replacement. Basically, you just use the /e modifier to execute code in the RHS. The following will also do the replacement you were aiming for:
my $str = 'Absolute force';
$str =~ s/([eo])/$1 eq 'e' ? 'o' : 'e'/eg;
print $str;
Outputs:
Abseluto ferco
Note how the LHS (left hand side) matches both o and e, and then the RHS (right hand side) does a test to see which matched and returns the opposite for replacement.
Now, it's common to have a list of words that you want to replace, so it's convenient to just build a hash of your from/to values and then dynamically build the regular expression. The following does that:
my $str = 'Hello, foo. How about baz? Never forget bar.';
my %words = (
foo => 'bar',
bar => 'baz',
baz => 'foo',
);
my $wordlist_re = '(?:' . join('|', map quotemeta, keys %words) . ')';
$str =~ s/\b($wordlist_re)\b/$words{$1}/eg;
Outputs:
Hello, bar. How about foo? Never forget baz.
The above could've worked for your e and o case as well, but would've been overkill. Note how I use quotemeta to escape the keys in case they contain a regular expression special character. I also intentionally used a non-capturing group around them in $wordlist_re so that the variable could be dropped into any regex and behave as desired. I then put the capturing group inside the s/// because it's important to be able to see what's being captured in a regex without having to backtrack to the value of an interpolated variable.
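To illustrate why quotemeta matters, here is a small sketch with keys that contain regex metacharacters. The helper replace_all is hypothetical, and it drops the \b anchors because they would not match around non-word characters such as +. Sorting the keys longest first is an extra precaution for keys that are prefixes of one another, and /r assumes Perl 5.14 or later:

```perl
use strict;
use warnings;

# Replace every key of %$map found in $text with its value, in one pass.
sub replace_all {
    my ($text, $map) = @_;

    # Escape each key so that '+' and '.' are matched literally, and try
    # longer keys first so 'catalog' wins over its prefix 'cat'.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) }
        keys %$map;

    return $text =~ s/($alternation)/$map->{$1}/gr;
}

print replace_all('1+1=2 in a.b', { '1+1' => 'two', 'a.b' => 'A-B' }), "\n";
# two=2 in A-B
print replace_all('cat catalog', { cat => 'dog', catalog => 'index' }), "\n";
# dog index
```

Because all replacements happen in a single pass, a value that is also a key (as in the foo/bar/baz example) is never replaced a second time.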
The tr/// operator is best. However, if you wanted to use the s/// operator (to handle more than just single letter substitutions), you could write
$ echo 'Absolute force' | perl -pe 's/(e)|o/$1 ? "o" : "e"/eg'
Abseluto ferco
The capturing parentheses avoid the redundant $1 eq 'e' test in @Miller's answer.
from man sed:
y/source/dest/
Transliterate the characters in the pattern space which appear in source to the corresponding character in dest.
and tr command can do this too:
$ echo "Absolute force" | tr 'oe' 'eo'
Abseluto ferco
I have a string that looks like this (key":["value","value","value"])
"emailDomains":["google.co.uk","google.com","google.com","google.com","google.co.uk"]
and I use the following regex to select from the string. (The regex is set up in a way where it won't select a string that looks like this: "key":[{"key":"value","key":"value"}] )
(?<=:\[").*?(?="])
Resulting Selection:
google.co.uk","google.com","google.com","google.com","google.co.uk
I want to remove the " in that selected string, and I was wondering if there was an easy way to do this using the replace command. Desired result...
"emailDomains":["google.co.uk, google.com, google.com, google.com, google.co.uk"]
How do I solve this problem?
If your string indeed has the form "key":["v1", "v2", ... "vN"], you can split off the part that needs to be changed, replace "," by a space in it, and re-assemble:
my @parts = split / (\["\s* | \s*\"]) /x, $string; #"
$parts[2] =~ s/",\s*"/ /g;
my $processed = join '', @parts;
The regex pattern for the separator in split is captured, since in that case the separators are also included in the returned list, which is helpful here for putting the string back together. Then we need to change the third element of the array.
In this approach, we have to change a specific element in the array so if your format varies, even a little, this may not (or still may) be suitable.
This should of course be processed as JSON, using a module. If the format isn't sure, as indicated in a comment, it would be best to try to ensure that you have JSON. Picking bits and pieces like above (or below) is a road to madness once requirements slowly start evolving.
The same approach can be used in a regex, and this may in fact have the advantage of being able to scoop up and ignore everything preceding the : (with split that part may end up as multiple elements if the format isn't exactly as shown, which then affects everything)
$string =~ s{ :\["\s*\K (.*?) ( "\] ) }{
my $e = $2;
my $n = $1 =~ s/",\s*"/ /gr;
$n.$e
}ex;
Here the /e modifier makes it so that the replacement side is evaluated as code, where we do the same as with the split above.
Notes on regex
Have to save away $2 first, since it gets reset in the next regex
The /r modifier†, which doesn't change its target but rather returns the changed string, is what allows us to use substitution operator on the read-only $1
If nothing gets captured for $2, and perhaps for $1, that means that there was no match and the outcome is simply that $string doesn't change, quietly. So if this substitution should always work then you may want to add handling of such unexpected data
Don't need a $n above, but can return ($1 =~ s/",\s*"/ /gr) . $e
Or, using lookarounds as attempted
$string =~ s{ (?<=:\[") (.+?) (?="\]) }{ $1 =~ s/",\s*"/ /gr }egx;
which does reduce the amount of code, but may be trickier to work with later.
While this is a direct answer to the question I think it's least maintainable.
† This useful modifier, for "non-destructive substitution," appeared in v5.14. In earlier Perl versions we would copy the string and run regex on that, with an idiom
(my $n = $1) =~ s/",\s*"/ /g;
In the lookarounds-example we then need a little more
$string =~ s{...}{ (my $n = $1) =~ s/",\s*"/ /g; $n }egx
since the s/// operator returns the number of substitutions made, while we need $n to be returned from that whole piece of code in {} (the replacement side), to be used as the replacement.
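As a side-by-side sketch of the two idioms in this footnote, the following hypothetical helpers (join_values_r and join_values_copy are names invented here) do the same inner replacement, once with /r and once with the pre-5.14 copy idiom:

```perl
use strict;
use warnings;

# Perl 5.14+: /r leaves $list untouched and returns the changed string.
sub join_values_r {
    my ($list) = @_;
    return $list =~ s/",\s*"/ /gr;
}

# Older Perls: copy first, substitute on the copy, then return the copy.
# Returning the s/// expression itself would yield the substitution
# count, not the string.
sub join_values_copy {
    my ($list) = @_;
    (my $n = $list) =~ s/",\s*"/ /g;
    return $n;
}

print join_values_r('a","b", "c'), "\n";      # a b c
print join_values_copy('a","b", "c'), "\n";   # a b c
```
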
You can use this \G-based regex. The first alternative matches the opening :[" and each later match, anchored by \G to the end of the previous one, captures a value and the comma after it, so that the replacement can keep the value and the comma while dropping the double quotes.
(:\[")|(?!^)\G([^"]+)"(,)"
Regex Demo
Your text is almost proper JSON, so it's really easy to go the final inch and make it so, and then process that:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say postderef/;
no warnings qw/experimental::postderef/;
use JSON::XS; # Install through your OS package manager or a CPAN client
my $str = q/"emailDomains":["google.co.uk","google.com","google.com","google.com","google.co.uk"]/;
my $json = JSON::XS->new();
my $obj = $json->decode("{$str}");
my $fixed = $json->ascii->encode({emailDomains =>
join(', ', $obj->{'emailDomains'}->@*)});
$fixed =~ s/^\{|\}$//g;
say $fixed;
Try Regex: " *, *"
Replace with: ,
Demo
I have a string like
XXXXYYYYZZZYYZZZYYYY which needs to be converted to
XXXXAAAYZZZAYZZZAAAY
$s =~ s/Y{2}+/AY/g;
this has 2 problems, {2}+ will get YYYY to AYAY; and AY is not the same length as YYYY (expecting AAAY)
How to get this done in perl?
Use a "look-ahead":
$s =~ s/Y(?=Y+)/A/g;
(?=Y+) means "followed by one or more Y characters", so any Y character that is followed by another Y character will be replaced with an A.
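A short sketch of the look-ahead approach wrapped in a function, assuming Perl 5.14+ for /r. The name mask_runs is made up here. Since the look-ahead only needs to see one following Y, (?=Y) is enough:

```perl
use strict;
use warnings;

# Replace every Y that is immediately followed by another Y with A.
# The look-ahead consumes nothing, so the last Y of each run survives.
sub mask_runs {
    my ($s) = @_;
    return $s =~ s/Y(?=Y)/A/gr;
}

print mask_runs('XXXXYYYYZZZYYZZZYYYY'), "\n";   # XXXXAAAYZZZAYZZZAAAY
print mask_runs('XYZ'), "\n";                    # XYZ (a lone Y is untouched)
```
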
More info from perlretut
There's always more than one way to do it. My suggestion is to grab all the Ys except the last one, and then use that to create a string of As of the same length. The e modifier tells perl to execute the code in the replacement side instead of using it directly, and the r modifier tells s/// to return the result of the substitution instead of modifying the input text directly (useful for these one-liner tests, among other places).
$ perl -E 'say shift =~ s/(Y+)(?=Y)/"A"x length$1/gre' XXXXYYYYZZZYYZZZYYYY
XXXXAAAYZZZAYZZZAAAY
$s =~ s/Y{2}+/AY/g
The pattern Y{2}+ is obscure and rarely used. The trailing + after {2} is a possessive quantifier, a feature supported only by some more advanced regex engines (Perl among them) and closely related to atomic grouping.
You might have meant (Y{2})+ which is (YY)+ or Y{2,} which is YY+
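To make the difference between these three patterns concrete, here is a small sketch that applies the question's AY replacement with each of them. The function names are invented for illustration, and /r assumes Perl 5.14+:

```perl
use strict;
use warnings;

# Y{2}+ : exactly two Ys, possessively. Each pair is replaced separately.
sub possessive_pair {
    my ($s) = @_;
    return $s =~ s/Y{2}+/AY/gr;
}

# (?:Y{2})+ : one or more pairs, so only an even number of Ys is consumed.
sub repeated_pairs {
    my ($s) = @_;
    return $s =~ s/(?:Y{2})+/AY/gr;
}

# Y{2,} : two or more Ys, so the whole run is consumed.
sub two_or_more {
    my ($s) = @_;
    return $s =~ s/Y{2,}/AY/gr;
}

print possessive_pair('YYYY'), "\n";   # AYAY
print repeated_pairs('YYY'),   "\n";   # AYY
print two_or_more('YYY'),      "\n";   # AY
```

None of the three preserves the length of the run, which is why the look-ahead or the /e approach is needed for the original problem.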
In Perl this is simple, since it supports lookarounds:
perl -e '$s = "XXXXYYYYZZZYYZZZYYYY"; $s =~ s/Y(?=Y)/A/g; print $s'
A less capable regex engine such as sed's can still do it, albeit in a more cumbersome way:
echo XXXXYYYYZZZYYZZZYYYY |sed -E 's/YY+/&\n/g;s/Y/A/g;s/A\n/Y/g'
Given :
my $str = "foo95285734776bar";
$str =~ s/([0-9]{2,4})/_????_/g;
What single regex where '????' is the length of $1 can produce output "foo_4__4__3_bar" ?
That is, where "9528" is replaced with "_4_", "5734" with "_4_", and the remaining "776" with "_3_".
You can use the /e modifier to add Perl code into the substitution part that is then evaled.
my $str = "foo95285734776bar";
$str =~ s/([0-9]{2,4})/'_' . length($1) . '_'/ge;
print $str;
Will output
foo_4__4__3_bar
Note that you now need a full Perl expression there. That's why you have to actually quote and concatenate the underscores.
From perlop:
A /e will cause the replacement portion to be treated as a full-fledged Perl expression and evaluated right then and there. It is, however, syntax checked at compile-time. A second e modifier will cause the replacement portion to be evaled before being run as a Perl expression.
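As a sketch of what /e changes, the two hypothetical helpers below run the same substitution with and without it (assuming Perl 5.14+ for /r). Without /e the replacement is treated like a double-quoted string, so $1 is interpolated but length is just literal text:

```perl
use strict;
use warnings;

# With /e the replacement is a Perl expression that is evaluated per match.
sub lengths_with_e {
    my ($s) = @_;
    return $s =~ s/([0-9]{2,4})/'_' . length($1) . '_'/ger;
}

# Without /e the replacement is an interpolated string: only $1 is expanded.
sub lengths_without_e {
    my ($s) = @_;
    return $s =~ s/([0-9]{2,4})/_length($1)_/gr;
}

print lengths_with_e('foo95285734776bar'), "\n";   # foo_4__4__3_bar
print lengths_without_e('ab1234'), "\n";           # ab_length(1234)_
```
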
How do I write a regex to replace a with b and b with a so that a&b will become b&a?
UPDATE
It should replace all occurrences of a with b and each b with a. The a&b is just an example to illustrate what I want. Sorry for the confusion.
You can use capturing groups and positional replacements to do this:
pax$ echo 'a&b' | perl -pne 's/([^&]*)&(.*)/\2&\1/'
b&a
The (blah blah) bits capture the "blah blah" and save it in a register for later use in the replacement string. When used with a replacement section like \1, the captured text is placed in the result.
So that regex simply captures all non-ampersand characters into register 1, then an ampersand, then all the rest of the string into register 2.
In the substitution, it gives you register 2, an ampersand, then register 1.
If, as you mention in a comment, you want to do & and |, you need to capture and use the operator as well:
pax$ echo 'a&b' | perl -pne 's/([^&|]*)([&|])(.*)/\3\2\1/'
b&a
pax$ echo 'a|b' | perl -pne 's/([^&|]*)([&|])(.*)/\3\2\1/'
b|a
You can see that the positional replacements are slightly different now since you're capturing more groups but the concept is still identical.
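The same operand swap can be sketched as a Perl function using the $1-style variables in the replacement. The name swap_operands is hypothetical, the anchors are an addition here, and /r assumes Perl 5.14+:

```perl
use strict;
use warnings;

# Swap the text before and after the first '&' or '|', keeping the operator.
sub swap_operands {
    my ($expr) = @_;
    return $expr =~ s/^([^&|]*)([&|])(.*)\z/$3$2$1/r;
}

print swap_operands('a&b'), "\n";   # b&a
print swap_operands('a|b'), "\n";   # b|a
```

A string without either operator does not match and is returned unchanged.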
The traditional generic way is:
my $string = "abcdefedcba";
my %subst = ( 'a' => 'b', 'b' => 'a' );
$string =~ s/(@{[ join '|', map quotemeta, sort { length($b) <=> length($a) } keys %subst ]})/$subst{$1}/g;
print $string;
Non-generically:
$string =~ s/(a|b)/{a=>'b',b=>'a'}->{$1}/ge;
Per your updated instructions:
$str =~ tr/ab/ba/;
Another possibility is to do 3 s///.
$_ = 'a&b';
# First, change every 'a' to something that does not appear in your string
s/a/\0/g;
# Then, change 'b' to 'a'
s/b/a/g;
# And now change your special character to b
s/\0/b/g;
I'd like to suggest a variation of this answer. Beginner Regex: Multiple Replaces
$text =~ s/(cat|tomatoes)/ ${{ qw<tomatoes cat cat tomatoes> }}{$1} /ge;
I'm debugging some code and wondered if there is any practical difference between $1 and \1 in Perl regex substitutions
For example:
my $package_name = "Some::Package::ButNotThis";
$package_name =~ s{^(\w+::\w+)}{$1};
print $package_name; # Some::Package
This following line seems functionally equivalent:
$package_name =~ s{^(\w+::\w+)}{\1};
Are there subtle differences between these two statements? Do they behave differently in different versions of Perl?
First, you should always use warnings when developing:
#!/usr/bin/perl
use strict; use warnings;
my $package_name = "Some::Package::ButNotThis";
$package_name =~ s{^(\w+::\w+)}{\1};
print $package_name, "\n";
Output:
\1 better written as $1 at C:\Temp\x.pl line 7.
When you get a warning you do not understand, add diagnostics:
C:\Temp> perl -Mdiagnostics x.pl
\1 better written as $1 at x.pl line 7 (#1)
(W syntax) Outside of patterns, backreferences live on as variables.
The use of backslashes is grandfathered on the right-hand side of a
substitution, but stylistically it's better to use the variable form
because other Perl programmers will expect it, and it works better if
there are more than 9 backreferences.
Why does it work better when there are more than 9 backreferences? Here is an example:
#!/usr/bin/perl
use strict; use warnings;
my $t = (my $s = '0123456789');
my $r = join '', map { "($_)" } split //, $s;
$s =~ s/^$r\z/\10/;
$t =~ s/^$r\z/$10/;
print "[$s]\n";
print "[$t]\n";
Output:
C:\Temp> x
]
[9]
If that does not clarify it, take a look at:
C:\Temp> x | xxd
0000000: 5b08 5d0d 0a5b 395d 0d0a [.]..[9]..
See also perlop:
The following escape sequences are available in constructs that interpolate and in transliterations …
\10 octal is 8 decimal. So, the replacement part contained the character code for BACKSPACE.
NB
Incidentally, your code does not do what you want. It will not print Some::Package, contrary to what your comment says, because all you are doing is replacing Some::Package with Some::Package without touching ::ButNotThis.
You can either do:
($package_name) = $package_name =~ m{^(\w+::\w+)};
or
$package_name =~ s{^(\w+::\w+)(?:::\w+)*\z}{$1};
From perldoc perlre:
The bracketing construct "( ... )" creates capture buffers. To refer to
the current contents of a buffer later on, within the same pattern, use
\1 for the first, \2 for the second, and so on. Outside the match use
"$" instead of "\".
The \<digit> notation works in certain circumstances outside the match. But it can potentially clash with octal escapes. This happens when the backslash is followed by more than one digit.
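A related case is wanting the first capture followed by a literal digit. A small sketch, assuming Perl 5.14+ for /r and using a made-up helper name: writing $10 would mean capture group 10, so braces around the group number keep the two apart.

```perl
use strict;
use warnings;

# Insert a literal 0 after the first character.
# ${1}0 means "group 1, then the digit 0"; $10 would mean group 10.
sub zero_after_first {
    my ($s) = @_;
    return $s =~ s/^(\w)/${1}0/r;
}

print zero_after_first('a-b'), "\n";   # a0-b
```
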