Elegant way to find repeated digits in Raku (née Perl 6) - regex

I'm trying to find groups of repeated digits in a number, e.g. 12334555611 becomes (1 2 33 4 555 6 11).
This works:
$n.comb(/ 0+ | 1+ | 2+ | 3+ | 4+ | 5+ | 6+ | 7+ | 8+ | 9+ /)
but is not very elegant.
Is there a better way to do this?

'12334555611'.comb(/\d+ % <same>/)
See also the answers to the first task of the Perl Weekly Challenge.

You may use
$n.comb(/(.) $0*/)
The (.) creates a capturing group that captures any single char. Raku numbers captures from 0, so $0 is the backreference to that first group. The * quantifier then matches zero or more further occurrences of the same char.
Replace the . with \d to match any digit if you need to only match repeated digits.
See a Perl6 demo online.
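For readers outside Raku, the same capture-plus-backreference idea can be sketched in Python, where the first group is written \1 instead of $0. This is a translation for illustration, not Raku code, and the helper name digit_runs is made up here.

```python
import re

def digit_runs(s: str) -> list[str]:
    # (\d) captures one digit; \1* extends the match over repeats of that digit.
    return [m.group(0) for m in re.finditer(r"(\d)\1*", s)]

print(digit_runs("12334555611"))  # ['1', '2', '33', '4', '555', '6', '11']
```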

In case someone navigates here wanting to remove singleton digits (Raku REPL code below):
Returning each maximal run of repeated digits while dropping singleton digits (uses the m:g adverb combo):
> put $/ if m:g/ \d**2..* % <same> / given '12334555611';
33 555 11
Alternatively, depending on how you interpret the original question, returning the overlapping runs of repeated digits (uses m:ov adverb combo):
> put $/ if m:ov/ \d**2..* % <same> / given '12334555611';
33 555 55 11
The difference between the two versions is particularly dramatic as repeated runs get longer:
> given '122333444455555666666' {put $/ if m:g/ \d**2..* % <same> /};
22 333 4444 55555 666666
> given '122333444455555666666' {put $/ if m:ov/ \d**2..* % <same> /};
22 333 33 4444 444 44 55555 5555 555 55 666666 66666 6666 666 66
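A rough Python analogue of the two variants, assuming the overlapping case means "the longest run of two or more starting at each position". Python's re has no :ov adverb, so the sketch emulates it with a capturing lookahead; the function names are made up for this example.

```python
import re

def runs(s: str) -> list[str]:
    # Non-overlapping: each maximal run of two or more identical digits.
    return [m.group(0) for m in re.finditer(r"(\d)\1+", s)]

def overlapping_runs(s: str) -> list[str]:
    # The lookahead consumes nothing, so a match is attempted at every
    # position; group 1 holds the longest run starting at that position.
    return [m.group(1) for m in re.finditer(r"(?=((\d)\2+))", s)]

print(runs("12334555611"))              # ['33', '555', '11']
print(overlapping_runs("12334555611"))  # ['33', '555', '55', '11']
```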

Related

German Phone Number Regex

I have a regex
\(?\+\(?49?\)?[ ()]?([- ()]?\d[- ()]?){11}
This correctly matches German phone code like
+491739341284
+49 1739341284
(+49) 1739341284
+49 17 39 34 12 84
+49 (1739) 34 12 84
+(49) (1739) 34 12 84
+49 (1739) 34-12-84
but fails to match 0049 (1739) 34-12-84.
I need to adjust the regular expression so that it also matches numbers starting with 0049. Can anyone help me with the regex?
Try this one:
\(?(?:\+|0{0,2})\(?49\)?[ ()]*[ \d]+[ ()]*[ -]*\d{2}[ -]*\d{2}[ -]*\d{2}
https://regex101.com/r/CHjNBV/1
However, it is better to accept only +49 or 0049 and show an error message when a number fails validation. If you ever need to extend the format, a permissive regex like this one will become much more complicated.
If you want to match the variations in the question, you might use a pattern like:
^(?:\+?(?:00)?(?:49|\(49\))|\(\+49\))(?: *\(\d{4}\)|(?: ?\d){4})? *\d\d(?:[ -]?\d\d){2}$
Explanation
^ Start of string
(?: Non capture group
\+? Match an optional +
(?:00)? Optionally match 2 zeroes
(?:49|\(49\)) Match 49 or (49)
| Or
\(\+49\) Match (+49)
) Close non capture group
(?: Non capture group
" *" Match optional spaces (a space followed by *)
\(\d{4}\) Match ( 4 digits and )
| Or
(?: ?\d){4} Repeat 4 times matching an optional space and a digit
)? Close non capture group and make it optional
" *" Match optional spaces (a space followed by *)
\d\d Match 2 digits
(?:[ -]?\d\d){2} Repeat 2 times matching an optional space or - followed by 2 digits
$ End of string
Regex demo
Or a slightly broader variation: match the same 49 prefix variants, then exactly 10 digits, where each digit may be preceded by any number of the characters in the character class [ ()-].
^(?:\+?(?:00)?(?:49|\(49\))|\(\+49\))(?:[ ()-]*\d){10}$
Regex demo
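As a quick check, the two patterns above can be run against the sample numbers from the question. This is a minimal Python sketch; the names STRICT, BROAD and SAMPLES are invented for the example, and the patterns are copied unchanged from the answer.

```python
import re

# STRICT expects the 4 + 2 + 2 + 2 digit grouping; BROAD only counts 10 digits.
STRICT = re.compile(
    r"^(?:\+?(?:00)?(?:49|\(49\))|\(\+49\))"
    r"(?: *\(\d{4}\)|(?: ?\d){4})? *\d\d(?:[ -]?\d\d){2}$"
)
BROAD = re.compile(r"^(?:\+?(?:00)?(?:49|\(49\))|\(\+49\))(?:[ ()-]*\d){10}$")

SAMPLES = [
    "+491739341284",
    "+49 1739341284",
    "(+49) 1739341284",
    "+49 17 39 34 12 84",
    "+49 (1739) 34 12 84",
    "+(49) (1739) 34 12 84",
    "+49 (1739) 34-12-84",
    "0049 (1739) 34-12-84",
]

for number in SAMPLES:
    print(number, bool(STRICT.match(number)), bool(BROAD.match(number)))
```

Note that both patterns also accept a bare 49 prefix with neither + nor 00, because both parts of the prefix are optional.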

Google form RegEx validation that matches all numbers and ranges

In my Google form I would like to authorize numbers and range of numbers like this:
1 to 9,
10 to 80,
90,
100
The separator can be a comma, |, ; or a newline character. Other examples:
100
110 to 115
540
or
50 | 60 | 70 to 80 | 100
I was expecting this regEx to work (selecting Regular Expression > Matches > (to)|[0-9 \n|,;]*) but it failed.
Any ideas?
You can use
^\d+(?:(?:\s*(?:to|[\n|;,])\s*)\d+)*\n*$
See the regex demo. Details:
^ - start of a string
\d+ - one or more digits
(?:(?:\s*(?:to|[\n|;,])\s*)\d+)* - zero or more occurrences of
  \s*(?:to|[\n|;,])\s* - to, or one of |, ;, a comma or a newline, surrounded by zero or more whitespace characters
  \d+ - one or more digits
\n* - zero or more newlines
$ - end of string.
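The pattern can be tried outside Google Forms with a short Python sketch. Google Forms uses its own regex engine, so this only illustrates the logic; the helper name is_valid is hypothetical.

```python
import re

RANGES = re.compile(r"^\d+(?:(?:\s*(?:to|[\n|;,])\s*)\d+)*\n*$")

def is_valid(text: str) -> bool:
    # True when the text is a list of numbers joined by "to" or a separator.
    return RANGES.match(text) is not None

print(is_valid("1 to 9,\n10 to 80,\n90,\n100"))  # True
print(is_valid("50 | 60 | 70 to 80 | 100"))      # True
print(is_valid("10 - 20"))                       # False
```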

regex: Numbers and spaces (10 or 14 numbers)

How can I write a regex that accepts 10 or 14 digits, split into groups of 1, 2 or 3 digits separated by a single space?
examples:
123 45 6 789 1 is valid
1234 567 8 9 1 is not valid (group of 4 digits)
123 45 6 789 109 123 8374 is not valid (not 10 or 14 digits)
EDIT
This is what I have tried so far
[0-9 ]{10,14}+
But it also accepts 11, 12 or 13 characters, and it doesn't check the size of the digit groups.
You may use this regex with lookahead assertion:
^(?=(?:\d ?){10}(?:(?:\d ?){4})?$)\d{1,3}(?: \d{1,3})+$
RegEx Demo
Here (?=...) is a lookahead assertion that enforces the presence of exactly 10 or 14 digits in the input.
\d{1,3}(?: \d{1,3})+ matches groups of 1 to 3 digits separated by a single space, with no space allowed at the start or end.
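A small Python sketch of the lookahead pattern applied to the examples from the question. The 14-digit sample is invented for this illustration, as is the name GROUPED.

```python
import re

GROUPED = re.compile(
    r"^(?=(?:\d ?){10}(?:(?:\d ?){4})?$)\d{1,3}(?: \d{1,3})+$"
)

for text in [
    "123 45 6 789 1",             # 10 digits, groups of 1 to 3
    "123 456 789 012 34",         # 14 digits (made-up sample)
    "1234 567 8 9 1",             # contains a group of 4 digits
    "123 45 6 789 109 123 8374",  # neither 10 nor 14 digits
]:
    print(repr(text), bool(GROUPED.match(text)))
```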
aggtr,
You can match your use case with the following:
^(?:\d\s?){10}$|^(?:\d\s?){14}$
^ means the beginning of the string and $ means the end of the string.
(?:...) is a non-capturing group. The part before the | therefore matches a string that consists, from start to end, of exactly 10 repetitions of a digit followed by an optional whitespace character. The | then allows either 10 or 14 repetitions of that group.
Edit: I missed the part of your requirement that the digits be grouped in 1, 2 or 3 digits, so this pattern does not enforce the group size.
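The limitation mentioned in the edit is easy to see in a short Python sketch: the digit count is enforced, but a group of 4 digits still passes. The name COUNT_ONLY is made up for the example.

```python
import re

COUNT_ONLY = re.compile(r"^(?:\d\s?){10}$|^(?:\d\s?){14}$")

print(bool(COUNT_ONLY.match("123 45 6 789 1")))             # True
print(bool(COUNT_ONLY.match("1234 567 8 9 1")))             # True, group size is not checked
print(bool(COUNT_ONLY.match("123 45 6 789 109 123 8374")))  # False
```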

PCI Compliance regex detect pattern with spaces

I have to write a regular expression that detects text containing credit card numbers. I already have one, but it fails when extra spaces are inserted into the number. For example, this one matches (not a valid credit card number):
4320 7589 9456 0123
The regex is:
4\d{3}(\s+|-)?\d{4}(\s+|-)?\d{4}(\s+|-)?\d{4}
This regex matches the simple case, but if someone alters the text by putting a space between any of the digits, like this:
4 320 7589 9456 0123
it does not match. I need a regex that detects any variant with spaces, special symbols or letters between the digits. Some examples:
43 20 75 89 94 56 01 23
4 3 2 0 7 5 8 9 9 4 5 6 0 1 2 3
4320a7589b9456c0123
4320$7589$9456$0123
4320_7589_9456_0123
I don't know whether I can strip spaces and symbols from the text before analyzing it.
I am posting because what you actually asked for is a pattern that matches a leading 4 followed by 15 more digits, with any number of non-digits in between.
The pattern is
^4(?:\D*\d){15}$
See demo
Regex breakdown:
^ - start of string
4 - literal 4
(?:\D*\d){15} - 15 occurrences of sequences of...
\D* - 0 or more non-digit symbols before...
\d - a digit
$ - end of string
If you need to capture, you can capture (like ^4((?:\D*\d){3})((?:\D*\d){4})((?:\D*\d){4})((?:\D*\d){4})$), but the submatches will still contain the "junk" in-between digits.
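A minimal Python sketch of both patterns on the examples from the question. The names CARD and CARD_GROUPS are made up here, and the sample number is the non-valid one used in the question.

```python
import re

CARD = re.compile(r"^4(?:\D*\d){15}$")
CARD_GROUPS = re.compile(
    r"^4((?:\D*\d){3})((?:\D*\d){4})((?:\D*\d){4})((?:\D*\d){4})$"
)

for text in [
    "4320 7589 9456 0123",
    "4 3 2 0 7 5 8 9 9 4 5 6 0 1 2 3",
    "4320a7589b9456c0123",
    "4320$7589$9456$0123",
    "4320_7589_9456_0123",
]:
    print(text, bool(CARD.match(text)))

# The captured groups keep the separators found between the digits.
print(CARD_GROUPS.match("4320a7589b9456c0123").groups())
# ('320', 'a7589', 'b9456', 'c0123')
```

Because the pattern is anchored at both ends, it tests a whole string; scanning longer text for embedded numbers would need a different approach.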

Regex to allow only number between 1 to 12

I am trying (12)|[1-9]\d? but it's not working. Please help, as I am new to regular expressions.
Something like
^([1-9]|1[012])$
^ Anchors the regex at start of the string
[1-9] Matches 1 to 9
| Alternation, matches the previous match or the following match.
1[012] Matches 10, 11, or 12
$ Anchors the regex at the end of the string.
Regex Demo
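To see which values the anchored pattern accepts, here is a short Python sketch; the names ONE_TO_TWELVE and accepted are made up for the example.

```python
import re

ONE_TO_TWELVE = re.compile(r"^([1-9]|1[012])$")

# Try every integer from 0 to 19 and keep the ones the pattern accepts.
accepted = [n for n in range(20) if ONE_TO_TWELVE.match(str(n))]
print(accepted)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```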
Here are some ready-made regular expressions for a few common number ranges:
Range       Label          Regex
1 to 12     hour / month   1[0-2]|[1-9]
1 to 24     hour           2[0-4]|1[0-9]|[1-9]
1 to 31     day of month   3[01]|[12][0-9]|[1-9]
1 to 53     week of year   5[0-3]|[1-4][0-9]|[1-9]
0 to 59     min / sec      [1-5]?[0-9]
0 to 100    percentage     100|[1-9]?[0-9]
0 to 127    signed byte    12[0-7]|1[01][0-9]|[1-9]?[0-9]
32 to 126   ASCII codes    12[0-6]|1[01][0-9]|[4-9][0-9]|3[2-9]
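The listed expressions have no anchors, so they are meant to be matched against the whole input. The following Python sketch checks each one with re.fullmatch against every integer from 0 to 199; the names TABLE and mismatches are invented for this check.

```python
import re

# (low, high) -> pattern, copied from the list above.
TABLE = {
    (1, 12): r"1[0-2]|[1-9]",
    (1, 24): r"2[0-4]|1[0-9]|[1-9]",
    (1, 31): r"3[01]|[12][0-9]|[1-9]",
    (1, 53): r"5[0-3]|[1-4][0-9]|[1-9]",
    (0, 59): r"[1-5]?[0-9]",
    (0, 100): r"100|[1-9]?[0-9]",
    (0, 127): r"12[0-7]|1[01][0-9]|[1-9]?[0-9]",
    (32, 126): r"12[0-6]|1[01][0-9]|[4-9][0-9]|3[2-9]",
}

def mismatches(low: int, high: int, pattern: str, upto: int = 200) -> list[int]:
    # Numbers where "pattern matches the whole string" disagrees with the range.
    rx = re.compile(pattern)
    return [n for n in range(upto)
            if (rx.fullmatch(str(n)) is not None) != (low <= n <= high)]

for (low, high), pattern in TABLE.items():
    print(f"{low} to {high}: mismatches = {mismatches(low, high, pattern)}")
```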
Try something like this:
^(1[0-2]|[1-9])$
1[0-2] : first character must be 1 and second character can be in the range from 0 to 2
[1-9] : numbers from 1-9
^ : start of string
$ : end of string
Demo
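[1-9]|1[012] works when the pattern is anchored, but without anchors it will not match 10 in full: alternatives are tried from left to right, so as soon as [1-9] matches the 1 the engine stops.
Try this at https://regex101.com/
[2-9]|1[012] avoids that for 10 to 12 but no longer matches a lone 1. Putting the longer alternative first, as in 1[012]|[1-9], handles both.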
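How the order of alternatives affects an unanchored match can be checked directly. This is a Python sketch using re.search for the unanchored case and re.fullmatch for the anchored one; other backtracking engines behave the same way for these patterns, as far as I know.

```python
import re

# Unanchored: the first alternative that succeeds wins.
print(re.search(r"[1-9]|1[012]", "10").group())     # 1
print(re.search(r"1[012]|[1-9]", "10").group())     # 10

# [2-9]|1[012] has no alternative that matches a lone 1.
print(re.fullmatch(r"[2-9]|1[012]", "1"))           # None

# Anchored (whole-string) matching makes the order irrelevant.
print(re.fullmatch(r"[1-9]|1[012]", "10").group())  # 10
```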
I think this should work:
[1-9]|1[0-2]