Perl Replace 26 characters with numeric - regex

I would like to replace each character of a string with its numeric equivalent.
For example (one-liner on Windows):
perl -e "$_ = \"abcdefghijklmnopqrstuvwxyz\"; tr/a-z/1-9/; print;"
The result is:
12345678999999999999999999
This works up to 9, but how can I assign the numeric equivalent to the characters after i?
I would like to know how to map one character to two characters,
for example,
j -> 10, k -> 11, etc.
To keep the numeric values distinguishable, it would make sense to use
"1-", "2-", ... "25-", "26".

perl -E"$_ = 'abcdefghijklmnopqrstuvwxyz'; s/([a-z])/ord($1)-96/ge; say;"
or, if you have Perl 5.14+ (which adds the /r modifier):
perl -E"say 'abcdefghijklmnopqrstuvwxyz' =~ s/([a-z])/ord($1)-96/ger;"
You can substitute any rule instead of ord($1) - 96.

I don't believe tr/// can do that, unfortunately: it is a one-to-one character substitution. So you're going to have to go the long way round:
my %indices = map { $_ => (ord($_) - ord('a')) + 1 } ('a' .. 'z');
my $result = join '', map { $indices{$_} } split(//, $string);
Unfortunately, that's not a one-liner.
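If the numbers need to stay distinguishable in the output, as the question suggests with "1-", "2-", ..., one option is to map each letter and join the results with a separator. This is a minimal sketch, assuming lowercase ASCII input only; the sub name letters_to_numbers is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map each lowercase letter to its 1-based alphabet
# position and join the numbers with a separator (default '-').
# Characters outside a-z are dropped.
sub letters_to_numbers {
    my ($string, $separator) = @_;
    $separator = '-' unless defined $separator;
    return join $separator,
           map  { ord($_) - ord('a') + 1 }
           grep { /[a-z]/ }
           split //, $string;
}

print letters_to_numbers('abcdefghijklmnopqrstuvwxyz'), "\n";
```

Because every value is followed by the separator except the last, the result can be split back into individual numbers without ambiguity.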

Related

How to remove an ID from a string

I have a string that looks like this, they are ids in a table:
1,2,3,4,5,6,7,8,9
If someone deletes something from the database, I need to update the string. I know that the code below removes the value but not the commas. How can I check whether the ID has a comma before and after it, so that my string doesn't break?
$new_values = $original_values[0];
$new_values =~ s/$car_id//;
Result: 1,2,,4,5,6,7,8,9 using the above sample (bad). It should be 1,2,4,5,6,7,8,9.
To remove the $car_id from the string:
my $car_id = 3;
my $new_values = q{1,2,3,4,5,6,7,8,9};
$new_values = join q{,}, grep { $_ != $car_id }
split /,/, $new_values;
say $new_values;
# Prints:
# 1,2,4,5,6,7,8,9
If you already removed the id(s), and you need to remove the extra commas, reformat the string like so:
my $new_values = q{,,1,2,,4,5,6,7,8,9,,,};
$new_values = join q{,}, grep { /\d/ } split /,/, $new_values;
say $new_values;
# Prints:
# 1,2,4,5,6,7,8,9
You can use
s/^$car_id,|,$car_id\b//
Details
^ - start of string
$car_id - variable value
, - comma
| - or
, - comma
$car_id - variable value
\b - word boundary.
If $car_id may contain regex metacharacters, escape it with \Q...\E:
s/^\Q$car_id\E,|,\Q$car_id\E\b//
Another approach is to store an extra leading and trailing comma (,1,2,3,4,5,6,7,8,9,)
The main benefit is that it makes it easier to search for the id using SQL (since you can search for ,$car_id,). Same goes for editing it.
On the Perl side, you'd use
s/,\K\Q$car_id\E,// # To remove
substr($_, 1, -1) # To get actual string
Ugly way: use a regex to remove the value, then collapse the doubled comma
$new_values = $original_values[0];
$new_values =~ s/$car_id//;
$new_values =~ s/,+/,/;
Nice way: split and merge
$new_values = $original_values[0];
my @values = split(/,/, $new_values);
my $index = 0;
# Stop at the end of the array so a missing ID cannot loop forever.
$index++ until $index > $#values || $values[$index] eq $car_id;
splice(@values, $index, 1) if $index <= $#values;
$new_values = join(',', @values);
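The split-and-merge idea can also be wrapped in a small sub. This is a sketch rather than a definitive implementation: the name remove_id is hypothetical, and it assumes IDs should be compared as strings, so that removing 1 never touches 10 or 21.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every occurrence of $id from a comma-separated
# list. IDs are compared as strings, so partial matches are impossible.
sub remove_id {
    my ($list, $id) = @_;
    return join ',', grep { $_ ne $id } split /,/, $list;
}

print remove_id('1,2,3,4,5,6,7,8,9', 3), "\n";
print remove_id('1,10,21', 1), "\n";
```

Since the list is rebuilt with join, no leading, trailing, or doubled commas can be left behind, whichever position the ID was in.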

Keep track of matches and check against condition

I have $entire_line = "if varC > 0: varB = varC + 2"
I would like my regex to find the following in $entire_line: varC, varB, varC
These matches then need to be checked to see whether they exist in a HashMap. If so, a $ should be appended to the match.
Hence the output should be:
"if $varC > 0: $varB = $varC + 2"
NOTE: 0 and 2 don't appear in the HashMap.
Currently, I have:
$entire_line =~ s/(\w+)/\$$1/g if (exists($variable_hash{$1}));
However, this does not work as intended as the $1 in exists($variable_hash{$1}) does not refer to the previous regex: $entire_line =~ s/(\w+)/\$$1/g
Is there a proper way to go about this?
Thanks for your help.
Use the /e modifier and put the code into the replacement part:
$entire_line =~ s/(\w+)/exists $variable_hash{$1} ? $variable_hash{$1} : $1/ge;
If I understood your question correctly, you don't need to substitute variable values (as in @choroba's answer) but only prepend a $ character to known variables. If %variable_hash is not very long, how about joining all of its keys with a | character to get a regex that matches every known variable?
my %variable_hash = (
varA => 1,
# varB => 1, # commented out to check that it will not be replaced
varC => 1,
);
my $entire_line = "if varC > 0: varB = varC + 2;";
my $key_regex = join('|', map { quotemeta $_; } keys %variable_hash);
# $key_regex will contain "varA|varC"
$entire_line =~ s/\b($key_regex)\b/\$$1/g;
# prefix all matching substrings with $ character
print "$entire_line\n";
Also check my comment to @choroba's answer.
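For comparison, the /e approach can also prepend the $ instead of substituting the hash value. This is a minimal sketch under that assumption; mark_variables is a hypothetical name and the hash is passed by reference.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix every word that is a key of %$known with '$'.
sub mark_variables {
    my ($line, $known) = @_;
    # /e evaluates the replacement as code, so $1 refers to the current
    # match at the moment the hash lookup runs.
    $line =~ s/(\w+)/exists $known->{$1} ? "\$$1" : $1/ge;
    return $line;
}

my %variable_hash = (varB => 1, varC => 1);
print mark_variables('if varC > 0: varB = varC + 2', \%variable_hash), "\n";
```

Unlike the alternation approach, this does not need to rebuild a regex when the hash changes, at the cost of one hash lookup per word in the line.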

Error with regex, match numbers

I have a string 00000001001300000708303939313833313932E2
I want to match everything between 708 and E2.
So I wrote:
(?<=708)(.*\n?)(?=E2) - tested in RegExr (it works)
Now, from that result, 303939313833313932, I want to keep every second digit to get:
099183192
How?
To match everything between 708 and E2, use:
708(\d+)
if you are sure that there will be only digits. Otherwise try with:
708(.*?)E2
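To match every second digit from 303939313833313932, use:
(?:\d(\d))+
Note that a repeated capture group keeps only its last match in most regex engines, so a global replace is more reliable:
find: \d(\d)
replace: $1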
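Since the question does not name a language, here is how the two steps might look in Perl. This is a sketch that assumes only digits appear between 708 and E2; the sub name every_second_digit is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: take the digits between the first "708" and the
# following "E2", then keep the second digit of every pair.
sub every_second_digit {
    my ($input) = @_;
    # Step 1: capture the payload; give up if the markers are not found.
    my ($payload) = $input =~ /708(\d+)E2/ or return;
    # Step 2: global replace of each digit pair with its second digit.
    (my $result = $payload) =~ s/\d(\d)/$1/g;
    return $result;
}

print every_second_digit('00000001001300000708303939313833313932E2'), "\n";
```

If the payload has an odd number of digits, the final unpaired digit is left in place, which may or may not be what is wanted.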
Are you expecting a regular expression answer to this?
You are perhaps better off doing this using string operations in whatever programming language you're using. If you have text = "abcdefghi..." then do output = text[0] + text[2] + text[4]... in a loop, until you run out of characters.
You haven't specified a programming language, but in Python I would do something like:
>>> text = "abcdefghjiklmnop"
>>> for n, char in enumerate(text):
...     if n % 2 == 0:  # every second char
...         print(char)
...
a
c
e
g
j
k
m
o

In Perl, how many groups are in the matched regex?

I would like to tell the difference between a number 1 and string '1'.
The reason I want to do this is that I want to determine the number of capturing parentheses in a regular expression after a successful match. According to the perlop doc, a list (1) is returned when there are no capturing groups in the pattern. So if I get a successful match and a list (1), I cannot tell whether the pattern has no parens or has one paren that matched a '1'. I can resolve that ambiguity if there is a difference between the number 1 and the string '1'.
You can tell how many capturing groups are in the last successful match by using the special @+ array. $#+ is the number of capturing groups. If that's 0, then there were no capturing parentheses.
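A short sketch of the $#+ check follows. The sub name capture_group_count is hypothetical, and the pattern is assumed to be passed as a qr// object.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number of capture groups in $pattern after
# a successful match against $string, or undef if the match fails.
sub capture_group_count {
    my ($string, $pattern) = @_;
    return unless $string =~ $pattern;
    # @+ has one entry for the whole match plus one per capture group,
    # so its last index equals the number of groups in the pattern.
    my $groups = $#+;
    return $groups;
}

print capture_group_count('1', qr/1/), "\n";      # pattern without parens
print capture_group_count('1', qr/(1)/), "\n";    # pattern with one group
```

This distinguishes the two cases from the question without inspecting the returned list at all, so the number-versus-string distinction never comes up.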
For example, bitwise operators behave differently for strings and integers:
~1 = 18446744073709551614
~'1' = Î ('1' = 0x31, ~'1' = ~0x31 = 0xce = 'Î')
#!/usr/bin/perl
($b) = ('1' =~ /(1)/);
print isstring($b) ? "string\n" : "int\n";
($b) = ('1' =~ /1/);
print isstring($b) ? "string\n" : "int\n";
sub isstring {
    return ($_[0] & ~$_[0]);
}
isstring returns either 0 (the result of a numeric bitwise op), which is false, or "\0" (the result of a bitwise string op, see perldoc perlop), which is true because it is a non-empty string.
If you want to know the number of capture groups a regex matched, just count them. Don't look at the values they return, which appears to be your problem:
You can get the count by looking at the result of the list assignment, which returns the number of items on the right hand side of the list assignment:
my $count = my @array = $string =~ m/.../g;
If you don't need to keep the capture buffers, assign to an empty list:
my $count = () = $string =~ m/.../g;
Or do it in two steps:
my @array = $string =~ m/.../g;
my $count = @array;
You can also use the @+ or @- variables, using some of the tricks I show in the first pages of Mastering Perl. These arrays hold the starting (@-) and ending (@+) positions of each of the capture buffers. The values in index 0 apply to the entire pattern, the values in index 1 are for $1, and so on. The last index, then, is the total number of capture buffers. See perlvar.
Perl converts between strings and numbers automatically as needed. Internally, it tracks the values separately. You can use Devel::Peek to see this in action:
use Devel::Peek;
$x = 1;
$y = '1';
Dump($x);
Dump($y);
The output is:
SV = IV(0x3073f40) at 0x3073f44
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
SV = PV(0x30698cc) at 0x3073484
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x3079bb4 "1"\0
CUR = 1
LEN = 4
Note that the dump of $x has a value for the IV slot, while the dump of $y doesn't but does have a value in the PV slot. Also note that simply using the values in a different context can trigger stringification or numification and populate the other slots. For example, if you did $x . '' or $y + 0 before peeking at the value, you'd get this:
SV = PVIV(0x2b30b74) at 0x3073f44
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0x3079c5c "1"\0
CUR = 1
LEN = 4
At which point 1 and '1' are no longer distinguishable at all.
Check for the definedness of $1 after a successful match. The logic goes like this:
If the list is empty then the pattern match failed
Else if $1 is defined then the list contains all the captured substrings
Else the match was successful, but there were no captures
Your question doesn't make a lot of sense, but it appears you want to know the difference between:
$a = "foo";
@f = $a =~ /foo/;
and
$a = "foo1";
@f = $a =~ /foo(1)?/;
since they both return the same thing regardless of whether a capture was made.
The answer is: don't try to use the returned array. Check whether $1 is not equal to "".

Powershell, how many replacements did you make?

I need to know how many replacements are made by Powershell when using either the -replace operator or Replace() method. Or, if that's not possible, if it made any replacements at all.
For example, in Perl, because the substitution operation returns the number of replacements made, and zero is false while non-zero is true in a boolean context, one can write:
$greeting = "Hello, Earthlings";
if ($greeting =~ s/Earthlings/Martians/) { print "Mars greeting ready." }
However with Powershell the operator and method return the new string. It appears that the operator provides some additional information, if one knows how to ask for it (e.g., captured groups are stored in a new variable it creates in the current scope), but I can't find out how to get a count or success value.
I could just compare the before and after values, but that seems entirely inefficient.
You're right, I don't think you can squeeze anything extra out of -replace. However, you can find the number of matches using Regex.Matches(). For example
> $greeting = "Hello, Earthlings"
> $needle = "l"
> $([regex]::matches($greeting, $needle)).Length # cast explicitly to an array
3
You can then use the -replace operator which uses the same matching engine.
After looking a little deeper, there's an overload of Replace which takes a MatchEvaluator delegate which is called each time a match is made. So, if we use that as an accumulator, it can count the number of replacements in one go.
> $count = 0
> $matchEvaluator = [System.Text.RegularExpressions.MatchEvaluator]{$count ++}
> [regex]::Replace("Hello, Earthlings","l",$matchEvaluator)
Heo, Earthings
> $count
3
Here is a complete working example that preserves the replacement behavior and counts the number of matches:
$Script:Count = 0
$Result = [regex]::Replace($InputText, $Regex, [System.Text.RegularExpressions.MatchEvaluator] {
param($Match)
$Script:Count++
return $Match.Result($Replacement)
})
None of the above answers both perform a replacement and work in recent PS versions:
James Kolpack - shows how to count removed matches (not replaced ones);
Kino101 - incomplete answer, variables not defined;
Annarfych - outdated answer; in recent PS versions the evaluator's count variable needs to be global.
Here is how you can do a replace and count it:
$String = "Hello World"
$Regex = "l|o" #search for 'l' or 'o'
$ReplaceWith = "?"
$Count = 0
$Result = [regex]::Replace($String, $Regex, { param($found); $Global:Count++; return $found.Result($ReplaceWith) })
$Result
$Count
Result in Powershell 5.1:
He??? W?r?d
5
Version of the script that actually does replace things and not null them:
$greeting = "Hello, earthlings. Mars greeting ready"
$counter = 0
$search = '\s'
$replace = ''
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
param($found)
$counter++
Write-Output ([regex]::Replace($found, [regex] $search, $replace))
}
[regex]::Replace($greeting, [regex] $search, $evaluator);
$counter
->
> Hello,earthlings.Marsgreetingready
> 4