Checking the minimum number of characters in a capturing group in regex

I have a regex question.
I am dealing only with the digits 0 and 1.
I have a 10-digit number split into 4 groups as below:
([01]{2})([01]{4})([01]{2})([01]{2})
I need to match all numbers that have at least two 1s in the second group, ([01]{4}), no matter how many 0s or 1s the other groups contain. I am interested only in the second group.
For example, these are potential matches:
0000110000
0011000000
0001100000
0000110000
I tried using a positive lookahead like this:
^(\d{2})((?=\d*1{2,}\d*)(\d{4}))(\d{2})(\d{2})
but this also matches
0000000011
Any help is deeply appreciated

If the two 1s are not necessarily consecutive in Group 2, you can use
^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$
See the regex demo
Details:
^ - start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?=(?:[01]*1){2}[01]{4,6}$) - immediately to the right of the current location, there must be two occurrences of zero or more 0 or 1 chars followed by a 1, and then four, five or six 1 or 0 chars up to the end of the string
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - end of string.
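One way to sanity-check this pattern is to compare it with a plain count of the 1s in characters 3 to 6 for every 10-digit binary string. The sketch below uses Python's re module. The helper name group2_has_two_ones and the brute-force loop are illustrative assumptions, not part of the answer.

```python
import re
from itertools import product

# Pattern from the answer above: at least two 1s (not necessarily adjacent) in group 2.
PATTERN = re.compile(
    r"^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$"
)

def group2_has_two_ones(s: str) -> bool:
    """Reference check without regex: count the 1s in characters 3..6."""
    return len(s) == 10 and set(s) <= {"0", "1"} and s[2:6].count("1") >= 2

# Compare the regex and the reference check on all 1024 ten-digit binary strings.
all_strings = ["".join(bits) for bits in product("01", repeat=10)]
mismatches = [
    s for s in all_strings
    if bool(PATTERN.match(s)) != group2_has_two_ones(s)
]
print(mismatches)  # an empty list means the two checks agree
```

The same loop can be reused to test any of the other patterns in this thread by swapping the compiled pattern and the reference check.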

If the ones need to be consecutive (as per your sample data), maybe you can use:
^(?=[01]{2,4}11)[01]{10}$
See the online demo. The idea here is that you match 2-4 zeros or ones up to a sequence of two ones. This works because the only allowed combinations have the "11" sequence after exactly 2-4 other digits.
^ - Start line anchor.
(?=[01]{2,4}11) - Positive lookahead that looks for 2-4 characters from our character class followed by "11".
[01]{10} - Match exactly 10 characters from our character class.
$ - End line anchor.
If need be, you can split the [01]{10} part into capture groups.
EDIT:
If they don't have to be consecutive, maybe you can work with:
^[01]{2}(?=[01]{8}$)([01]{0,2}1[01]{0,2}1[01]{0,2})[01]{4}$
See the online demo.
Or less verbose:
^(?=[01]{10}$)(..)(.*1.*1.*)(..)(..)$
See the demo
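To see how the consecutive and non-consecutive variants differ, here is a small Python sketch that runs both on a few inputs. The sample list is an assumption for illustration: it contains the question's examples, the false positive from the question, and one extra string (0010100000) where the two 1s in group 2 are not adjacent.

```python
import re

# Variant that requires "11" inside group 2.
CONSECUTIVE = re.compile(r"^(?=[01]{2,4}11)[01]{10}$")
# Variant that only requires two 1s somewhere in group 2.
LOOSE = re.compile(r"^(?=[01]{10}$)(..)(.*1.*1.*)(..)(..)$")

samples = ["0000110000", "0011000000", "0001100000", "0000000011", "0010100000"]

results = {
    s: (bool(CONSECUTIVE.match(s)), bool(LOOSE.match(s)))
    for s in samples
}
for s, (consecutive, loose) in results.items():
    print(s, "consecutive:", consecutive, "loose:", loose)
```

Only the last sample separates the two patterns: its second group is 1010, which has two 1s but no "11".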

Not a job for regex but for bitwise operators:
(in PHP):
$nums = [
'0000110000',
'0011000000',
'0001100000',
'1000110000',
'0000000110',
'0001000000'
];
foreach ($nums as $num) {
if ( !in_array((bindec($num) >> 4) & 15, [0, 1, 2, 4, 8]) )
echo $num, PHP_EOL;
}
You can probably do that in any language.
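As a rough illustration of "any language", the same idea can be sketched in Python. Note one swap: instead of listing the values with fewer than two set bits (0, 1, 2, 4, 8), this version counts the 1 bits of the second group directly. The function name group2_ones is hypothetical.

```python
def group2_ones(num: str) -> int:
    """Count the 1 bits in the second group (characters 3..6) of a 10-digit binary string."""
    value = int(num, 2)             # parse the string as a base-2 number
    group2 = (value >> 4) & 0b1111  # drop groups 3 and 4, then keep the next 4 bits
    return bin(group2).count("1")

nums = [
    "0000110000",
    "0011000000",
    "0001100000",
    "1000110000",
    "0000000110",
    "0001000000",
]

# Keep only the numbers whose second group has at least two 1s.
matches = [n for n in nums if group2_ones(n) >= 2]
print(matches)
```

Like the PHP version, this assumes the input is already known to be a 10-digit string of 0s and 1s.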

If lookaheads are supported, you could also use a positive lookahead to assert that group 2 contains at least one 11.
^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$
^ Start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?= Positive lookahead
[01]{0,2}11 Match 0-2 times either 0 or 1 and match 11
) Close lookahead
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - end of string.
Regex demo
Or you can write out all 3 alternatives matching 11
^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$
Regex demo
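The lookahead and the alternation versions should accept exactly the same strings. A minimal Python sketch to check that, and to show what the four groups capture, could look like this (the brute-force comparison is my own addition, not part of the answer):

```python
import re
from itertools import product

LOOKAHEAD = re.compile(r"^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$")
ALTERNATION = re.compile(
    r"^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$"
)

# Compare both patterns on every 10-digit binary string.
all_strings = ["".join(bits) for bits in product("01", repeat=10)]
disagreements = [
    s for s in all_strings
    if bool(LOOKAHEAD.match(s)) != bool(ALTERNATION.match(s))
]
print(len(disagreements))

# Inspect the captured groups for one of the question's samples.
groups = LOOKAHEAD.match("0001100000").groups()
print(groups)
```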

Related

Regex to match all permutations of {1,2,3} with repetition in the middle. Ex: 122223

I need a regular expression that will find all permutations of the digits (1, 2, 3), where the digit in the middle occurs one or more times.
For ex:
123
133332
21111113
312
13333332
I've tried this expression:
([1][2]+[3])|([1][3]+[2])|([2][1]+[3])|([2][3]+[1])|([3][2]+[1])|([3][1]+[2])
Unfortunately it is slow. Is there any way to make it more efficient?
You may use
([1-3])(?!\1)([1-3])\2*(?!\1|\2)[1-3]
See the regex demo
Details
([1-3]) - Group 1: 1, 2 or 3
(?!\1)([1-3])\2* - a digit from 1 to 3 not equal to Group 1 value and then 0+ occurrences of the digit
(?!\1|\2)[1-3] - a digit from 1 to 3 not equal to Group 1 and 2 value
In case you need to match the whole string, add ^ at the start and $ at the end of the pattern.
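A short Python sketch of the anchored version, run on the question's samples plus a few negative cases that I made up for illustration:

```python
import re

# Anchored variant: first digit, a different digit repeated 1+ times, then the remaining digit.
PATTERN = re.compile(r"^([1-3])(?!\1)([1-3])\2*(?!\1|\2)[1-3]$")

should_match = ["123", "133332", "21111113", "312", "13333332"]
should_fail = ["122", "112", "1233", "12", "1243"]

matched = [s for s in should_match + should_fail if PATTERN.match(s)]
print(matched)
```

The backreferences \1 and \2 inside the negative lookaheads are what keep the three digits distinct, so no alternation over the six permutations is needed.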

Regex: to match unsigned integer values (valid values: 0 to 65535 only) for comma separated values

I have
([0-5]?\d?\d?\d?\d|6[0-4]\d\d\d|65[0-4]\d\d|655[0-2]\d|6553[0-5])
which works for single input as:
0
1
65
6553
but I want it to work for a comma-separated input string such as:
0,1,65,6553 -> this is a valid string
65535,-1,25 -> this is an invalid string because of the negative number.
Can anyone please suggest a solution?
Note:
I have already tried repetition as:
^([0-5]?\d?\d?\d?\d|6[0-4]\d\d\d|65[0-4]\d\d|655[0-2]\d|6553[0-5])+(,(([0-5]?\d?\d?\d?\d|6[0-4]\d\d\d|65[0-4]\d\d|655[0-2]\d|6553[0-5])))*$
which is accepting 65537 also which is undesirable.
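Checking the number bounds afterwards seems more straightforward to me, but anyway, this is a regex you may use (I refactored the integer part a little bit). Note that [0-5]?\d{0,4} can also match an empty string, so empty fields such as a trailing comma are accepted; use [0-5]?\d{1,4} if that matters.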
^(([0-5]?\d{0,4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5])(,|(?=$)))+$
https://regex101.com/r/1RpNuy/1
Details:
^ : String start
( : Group start
([0-5]?\d{0,4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5]) : Match a number
(,|(?=$)) : Match either , or make sure this is end of line (but without reading the $)
)+ : End of group, repeat as many times as possible
$ : End of string
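The "check the bounds afterwards" approach mentioned at the start of this answer could be sketched in Python as follows. The function name is hypothetical. Unlike the regexes in this thread, this version accepts any number of leading zeros, which is an assumption about what the input should allow.

```python
def is_valid_u16_list(text: str) -> bool:
    """Return True if text is a comma-separated list of integers in 0..65535."""
    fields = text.split(",")
    return all(
        # isascii() guards against non-ASCII digit characters that isdigit() accepts
        field.isascii() and field.isdigit() and int(field) <= 65535
        for field in fields
    )

print(is_valid_u16_list("0,1,65,6553"))
print(is_valid_u16_list("65535,-1,25"))
print(is_valid_u16_list("65537"))
```

Empty fields fail the isdigit() test, so inputs such as "1," or an empty string are rejected.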
You may use
^(?:\d{1,4}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5])(?:,(?:\d{1,4}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5]))*$
In PCRE and Onigmo, you may use a shorter pattern where \g<1> repeats the Group 1 pattern:
^(\d{1,4}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5])(?:,\g<1>)*$
See the regex demo and regex demo #2
The regex is basically ^<BLOCK>(?:,<BLOCK>)*$ where the BLOCK pattern is a regex matching the numbers from 0 to 65535:
\d{1,4} - 1, 2, 3 or 4 digits (0 - 9999)
[1-5]\d{4} - a digit from 1 to 5 and then any 4 digits (10000 - 59999)
6[0-4]\d{3} - 6, then a digit from 0 to 4, and then three digits (60000 - 64999)
65[0-4]\d{2} - 65, a digit from 0 to 4 and then any two digits (65000 - 65499)
655[0-2]\d - 655, a digit from 0 to 2 and then any digit (65500 - 65529)
6553[0-5] - 6553 and then a digit from 0 to 5 (65530 - 65535)
The general pattern:
^ - start of string
<BLOCK> - BLOCK pattern described above
(?:,<BLOCK>)* - 0 or more repetitions of , and then BLOCK pattern
$ - end of string.
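Python's re module has no \g<1> subroutine call inside patterns, but the ^<BLOCK>(?:,<BLOCK>)*$ shape can be built by string concatenation. This is a sketch under that assumption. The names BLOCK and LIST_RE are mine, and re.ASCII is added so that \d only matches 0-9.

```python
import re

# BLOCK matches one integer from 0 to 65535 (no more than 5 digits).
BLOCK = r"(?:\d{1,4}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5])"

# General shape: ^<BLOCK>(?:,<BLOCK>)*$
LIST_RE = re.compile("^" + BLOCK + "(?:," + BLOCK + ")*$", re.ASCII)

# The pattern should accept exactly the numbers 0..65535 when given a single value.
single_value_ok = all(
    bool(LIST_RE.match(str(n))) == (n <= 65535) for n in range(70000)
)
print(single_value_ok)

print(bool(LIST_RE.match("0,1,65,6553")))
print(bool(LIST_RE.match("65535,-1,25")))
```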

Regular expression to get positive integer and -1

Below is the text I hope to match:
00000001,00000002,00000003
It works fine with ((([-1-9]+),)+)?[-1-9]+.
But it didn't match -1. The expression must not match with -2 or anything else except -1.
You may use
^(?:0*[1-9][0-9]*|-1)(?:,(?:0*[1-9][0-9]*|-1))*$
See the regex demo.
Pattern details:
^ - start of string
(?:0*[1-9][0-9]*|-1) - a non-capturing group matching...
0*[1-9][0-9]* - zero or more 0 chars, followed by a non-zero digit, followed by any 0 or more digits
| - or
-1 - a -1 substring
(?:,(?:0*[1-9][0-9]*|-1))* - a non-capturing group quantified with * (0 or more) quantifier matching 0 or more repetitions of:
, - a comma
(?:0*[1-9][0-9]*|-1) - same subpattern as in the beginning (-1 or a non-zero number with no fractions)
$ - end of string.
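A quick Python check of this pattern against the question's sample and a few edge cases (the edge cases are my own, chosen to show that 0 and -2 are rejected):

```python
import re

PATTERN = re.compile(r"^(?:0*[1-9][0-9]*|-1)(?:,(?:0*[1-9][0-9]*|-1))*$")

cases = [
    "00000001,00000002,00000003",
    "-1",
    "1,-1,20",
    "-2",
    "0",
    "1,,2",
    "-11",
]

accepted = [c for c in cases if PATTERN.match(c)]
print(accepted)
```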
[-1-9]+ doesn't match what you're expecting it to match. It matches for example: "-31-23", which is obviously not a number.
A simple regex like:
^(?:-1|[0-9]+)$
will match "-1", or any non-negative integer (including 0001, 00000002, etc...).
Also, depending on the language you're using, it would be simpler to use the language's features to decide if the number is "-1" or any other positive number.
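For instance, in Python the same decision can be made without a regex. This is only a sketch: the function name is hypothetical, and it treats 0 and zero-padded values as acceptable, as the regex in this answer does.

```python
def is_minus_one_or_unsigned(token: str) -> bool:
    """Accept the literal "-1" or a string made only of ASCII digits."""
    if token == "-1":
        return True
    # isascii() excludes non-ASCII digit characters that isdigit() would accept
    return token.isascii() and token.isdigit()

tokens = ["-1", "00000002", "0", "-2", "-31-23", ""]
valid = [t for t in tokens if is_minus_one_or_unsigned(t)]
print(valid)
```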
As you state that ((([-1-9]+),)+)?[-1-9]+ works fine for positive integers, and going by the title of the question, you might use this regex. It uses an alternation to capture -1 or non-negative integers (including 0 or 00000, and numbers preceded by zeroes) from a string.
The positive integers will be captured in group 1.
-[02-9][0-9]*|0*(-?[0-9]+)
Details
- Match literally
[02-9][0-9]* Match a 0 or digits 2-9 followed by zero or more times a digit. Note that the - is not part of the character class or else --- would also match.
| Or
0* Match zero or more times a zero
(-?[0-9]+) Capture in group 1 an optional hyphen followed by one or more times a digit

Regex max number 11 and must 2 digits

I have a validation expression I'm trying to figure out. First, I want the user to only be allowed to enter a number up to 11: not 11 characters, but 11 as the maximum value that can be entered. I got that to work with the code below and it works fine.
ValidationExpression="^([1-9]|[0-1][0-1])$"
However, I want the user to also be forced to use 2 digits. For example, instead of 1 they need to enter 01. I've tried different ways of doing this but can't seem to get it to work.
I tried this as well but that didn't work either.
ValidationExpression="^([1-9]|[0-1][0-1])${2}"
If you need to perform this in a single step (i.e. you can't do a < and > check as well as a regex) then this should do it:
ValidationExpression="^(?:0\d|1[01])$"
Or, if your language doesn't recognise the \d symbol:
ValidationExpression="^(?:0[0-9]|1[01])$"
"Match either (0 followed by any digit) or (1 followed by 0 or 1), anchored at the beginning and end of the input string."
To match padded 2-digit numbers from 01 to 11 you may use
ValidationExpression="^(0[1-9]|1[01])$"
See the regex demo.
The expression matches:
^
( - start of a group (here, a capturing group is used for better readability, a non-capturing one can also be used)
0 - zero
[1-9] - 1 to 9 digit
| - or
1 - 1
[01] - 0 or 1 digit
) - end of group
$ - end of string.
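The two anchored patterns above differ only in whether 00 is allowed. A small Python sketch that lists what each one accepts among all two-digit strings (the variable names are illustrative):

```python
import re

ZERO_TO_ELEVEN = re.compile(r"^(?:0[0-9]|1[01])$")  # accepts 00 through 11
ONE_TO_ELEVEN = re.compile(r"^(0[1-9]|1[01])$")     # accepts 01 through 11

# Every zero-padded two-digit string from 00 to 99.
candidates = [f"{n:02d}" for n in range(100)]

accepted_from_zero = [c for c in candidates if ZERO_TO_ELEVEN.match(c)]
accepted_from_one = [c for c in candidates if ONE_TO_ELEVEN.match(c)]
print(accepted_from_zero)
print(accepted_from_one)
```

Neither pattern accepts a single digit such as 1, which is the padding requirement from the question.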
You can use this regex
/\b(?:[0][\d]|[1][01])\b/
This matches a 0 followed by any digit 0-9, or a 1 followed by 0 or 1. It is bounded on both sides by word boundaries and uses a non-capturing group.

How to match a whole string that contains just two and no more than two digits between 0 and 10 in regex?

This regex does not work for me, as it selects all groups of two or more digits rather than the whole string.
abcde9 = match
abcde12 = not matched
abcde12345678 = not matched
What I have at the moment is this. I just can't include the 0 and the 10 as two-digit numbers in the regex. Can anyone help me?
\d{0,10}[1-9]
If you want to match any string containing exactly one integer from 0 to 10 then use
^\D*(\d|10)\D*$
which means "any non-digit content followed by either a single digit or the number 10 and then followed by any non-digit content"
try it at regex101
I think you are looking for
^\D*(?:[0-9]|10)(?:\D+(?:[0-9]|10))?\D*$
See demo
This will match a whole string that contains 1 or 2 whole integer numbers from 0 to 10, and no other digits.
The regex breakdown:
^ - start of string
\D* - 0 or more characters other than digit
(?:[0-9]|10) - numbers from 0 to 10
(?:\D+(?:[0-9]|10))? - 1 or 0 occurrence of
\D+ - 1 or more characters other than digit
(?:[0-9]|10) - numbers from 0 to 10
\D* - 0 or more characters other than digit
$ - end of string
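To compare the "exactly one number" pattern with the "one or two numbers" pattern, here is a small Python sketch. The inputs beyond the three from the question are assumptions added for illustration.

```python
import re

ONE_NUMBER = re.compile(r"^\D*(\d|10)\D*$")
ONE_OR_TWO = re.compile(r"^\D*(?:[0-9]|10)(?:\D+(?:[0-9]|10))?\D*$")

inputs = ["abcde9", "abcde12", "abcde12345678", "abc10", "a1b2", "a1b2c3"]

results = {
    s: (bool(ONE_NUMBER.match(s)), bool(ONE_OR_TWO.match(s)))
    for s in inputs
}
for s, (one, one_or_two) in results.items():
    print(s, "one:", one, "one or two:", one_or_two)
```

The two patterns only disagree on a1b2, which contains two separate numbers from 0 to 10.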
Is this what you are looking for:
/(0[1-9])$/
You can test that regex to make sure it fits your needs:
https://regex101.com/r/hX6lB7/3