Trouble locating regex mistake

import re
reg = r'^[(][+-]?([0]|([1-9][0-9]*)\.?\d+?),\s[+-]?([0]|([1-9][0-9]*)\.?\d+?)[)]$'
for _ in range(int(input())):
    coord = input()
    if re.search(reg, coord):
        if 0 <= float(re.search(reg, coord).group(1)) <= 90 and 0 <= float(re.search(reg, coord).group(3)) <= 180:
            print('Valid')
        else: print('Invalid')
    else: print('Invalid')
Here is my code, which uses a regular expression to validate coordinates. I had trouble finding the mistake in the regular expression. The test cases that fail are (-6, -165) and (-6, -172). What prevents the code from entering the first if statement?

The main issue is that \d+? matches 1 or more digits (as few as possible), whereas you assumed it matches 0 or more digits. A single-digit integer part such as 6 therefore cannot match: [1-9] consumes the 6, and \d+? still requires at least one more digit.
To make the .xxx part optional, use an optional non-capturing group, (?:\.\d+)?:
^\([+-]?((?:0|[1-9][0-9]*)(?:\.\d+)?),\s[+-]?((?:0|[1-9][0-9]*)(?:\.\d+)?)\)$
Note that this pattern has only two capturing groups, so the second number is in group(2) rather than group(3).
The part that matches a number, (?:0|[1-9][0-9]*)(?:\.\d+)?, now matches:
  (?:0|[1-9][0-9]*) - a non-capturing group matching either of the 2 alternatives:
    0 - a zero
    | - or
    [1-9][0-9]* - a digit from 1 to 9 and then any 0+ digits
  (?:\.\d+)? - an optional non-capturing group matching 1 or 0 occurrences of:
    \. - a dot
    \d+ - 1 or more digits.
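A minimal sketch of how the corrected pattern might be wired into the original check, assuming the same "(lat, lon)" input format. The helper name is_valid is made up for illustration. Because the sign sits outside the capturing groups, group(1) and group(2) hold unsigned magnitudes, so only the upper bounds need checking.

```python
import re

# Corrected pattern from the answer: the fractional part is an optional group.
PATTERN = re.compile(
    r'^\([+-]?((?:0|[1-9][0-9]*)(?:\.\d+)?),\s[+-]?((?:0|[1-9][0-9]*)(?:\.\d+)?)\)$'
)

def is_valid(coord):
    """Hypothetical helper: True if coord looks like a valid (lat, lon) pair."""
    m = PATTERN.search(coord)
    if m is None:
        return False
    # Groups 1 and 2 exclude the sign, so both values are non-negative.
    lat, lon = float(m.group(1)), float(m.group(2))
    return lat <= 90 and lon <= 180

print(is_valid('(-6, -165)'))  # True
```

Running the match once and reusing the match object also avoids the three separate re.search calls in the original loop.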

Related

regex in python/ansible

I am a newbie to regex. I have an example string: account-device-v2-2-3-63-21900
and I am using the regular expression [1-9]-[0-9]-[0-9]*
I am getting 1-2-3 as output,
but my intention is to match/extract the pattern 2-3-63,
that is, the hyphen-separated digits after v2 (or v1, etc.). I don't need the last numeric part (21000 or any other number).
Any suggestions, please?

You want to get 1 or more digits (excluding 0), a dash, 1 or more digits, a dash, and 1 or more digits from account-device-v2-2-3-63-21900 or account-device-v1-2-3-63-21900?
Use v[12]-([1-9]+?-[0-9]+?-[0-9]+?)- and take the first group.
Demo: https://regex101.com/r/hMLGsK/1

The pattern [1-9]-[0-9]-[0-9]* matches 2-2-3 because it does not account for the v and the digit that follows it, so the digit in v2 is the first place where a match can start.
Note that [0-9]* matches optional digits, so 2-2- could also be a match.
Use a capture group to get the value:
\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)
\bv[1-9][0-9]*- Match v1, or possibly v20 etc., followed by -
( Capture group 1
  [1-9][0-9]* Match a number whose first digit is 1-9
  -[0-9]+-[0-9]+ Two parts, each matching - and 1 or more digits 0-9
) Close group 1
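As a rough illustration in Python, the capture-group pattern above could be applied like this. The function name extract_version is an assumption for the example, not something from the question.

```python
import re

# Pattern from the answer above; group 1 holds the three hyphen-separated numbers.
VERSION_RE = re.compile(r'\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)')

def extract_version(name):
    """Hypothetical helper: return the part after the vN- prefix, or None."""
    m = VERSION_RE.search(name)
    return m.group(1) if m else None

print(extract_version('account-device-v2-2-3-63-21900'))  # 2-3-63
```

The \b before v keeps the pattern from starting inside a word such as "device".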

Checking min no of characters in capturing group in Regex

I have a question about regex.
I am dealing with the digits 0 and 1 only.
I have a 10-digit number split into 4 groups as below:
([01]{2})([01]{4})([01]{2})([01]{2})
I need to match all numbers with at least two 1s in the second group, ([01]{4}), no matter how many 0s or 1s the other groups have. I am interested only in the second group.
For example, these are potential matches:
0000110000
0011000000
0001100000
0000110000
I tried using a positive lookahead like:
^(\d{2})((?=\d*1{2,}\d*)(\d{4}))(\d{2})(\d{2})
but this also matches
0000000011
Any help is deeply appreciated.

If the two 1s are not necessarily consecutive in Group 2, you can use
^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$
Details:
^ - start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?=(?:[01]*1){2}[01]{4,6}$) - immediately to the right of the current location, there must be two occurrences of any zero or more 0 or 1 chars followed with 1 and then there must be four, five or six 1 or 0 chars till the end of string
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - end of string.
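One way to gain confidence in a lookahead like this is to compare it against a direct check over all 1024 possible inputs. The sketch below does that in Python. The function expected is a hypothetical reference check written for this example.

```python
import re
from itertools import product

PATTERN = re.compile(
    r'^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$'
)

def expected(s):
    # Reference check: characters 3-6 (group 2) contain at least two 1s.
    return s[2:6].count('1') >= 2

# Every 10-character string over the alphabet {0, 1}.
candidates = [''.join(bits) for bits in product('01', repeat=10)]
mismatches = [s for s in candidates if bool(PATTERN.match(s)) != expected(s)]
print(len(candidates), len(mismatches))  # 1024 0
```

An empty mismatch list means the regex and the plain string check agree on every possible input.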

If the ones need to be consecutive (as per your sample data), maybe you can use:
^(?=[01]{2,4}11)[01]{10}$
The idea here is to match 2-4 zeros or ones up to a sequence of two ones. This makes sense once you realise that the only allowed combinations have the "11" sequence after exactly 2-4 other digits.
^ - Start line anchor.
(?=[01]{2,4}11) - Positive lookahead that looks for 2-4 characters from our character class followed by "11".
[01]{10} - Match exactly 10 characters from our character class.
$ - End line anchor.
If need be, you can split the [01]{10} part into pieces where you'd use capture groups.
EDIT:
If they don't have to be consecutive, maybe you can work with:
^[01]{2}(?=[01]{8}$)([01]{0,2}1[01]{0,2}1[01]{0,2})[01]{4}$
Or, less verbose:
^(?=[01]{10}$)(..)(.*1.*1.*)(..)(..)$

Not a job for regex but for bitwise operators (in PHP):
$nums = [
    '0000110000',
    '0011000000',
    '0001100000',
    '1000110000',
    '0000000110',
    '0001000000'
];
foreach ($nums as $num) {
    // Drop the low 4 bits, keep the 4 bits of group 2,
    // and skip values with fewer than two bits set (0, 1, 2, 4, 8).
    if ( !in_array((bindec($num) >> 4) & 15, [0, 1, 2, 4, 8]) )
        echo $num, PHP_EOL;
}
You can probably do that in any language.
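For instance, a rough Python equivalent might look like the sketch below. It swaps the lookup list [0, 1, 2, 4, 8] for a count of set bits, which is an assumption of this example rather than part of the answer. The function name is also made up.

```python
def group2_has_two_ones(num):
    """Hypothetical helper: True if group 2 (characters 3-6) has at least two 1s."""
    # Bits 4-7 of the 10-bit value correspond to characters 3-6 of the string.
    group2 = (int(num, 2) >> 4) & 0b1111
    return bin(group2).count('1') >= 2

nums = [
    '0000110000',
    '0011000000',
    '0001100000',
    '1000110000',
    '0000000110',
    '0001000000',
]
matches = [n for n in nums if group2_has_two_ones(n)]
print(matches)  # the first four entries
```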

If a positive lookahead is supported, you could also assert that group 2 contains at least one 11.
^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$
^ - Start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?= - Positive lookahead
  [01]{0,2}11 - Match 0-2 times either 0 or 1, then match 11
) - Close lookahead
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - End of string.
Or you can write out all 3 alternatives matching 11:
^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$
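The two patterns in this answer should accept exactly the same strings. A small Python sketch can check that exhaustively against a plain substring test. The dictionary labels and the expected function are names invented for this example.

```python
import re
from itertools import product

PATTERNS = {
    'lookahead': r'^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$',
    'alternation': r'^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$',
}

def expected(s):
    # Reference check: group 2 (characters 3-6) contains two consecutive 1s.
    return '11' in s[2:6]

# Every 10-character string over the alphabet {0, 1}.
candidates = [''.join(bits) for bits in product('01', repeat=10)]
mismatches = {
    name: [s for s in candidates if bool(re.match(rx, s)) != expected(s)]
    for name, rx in PATTERNS.items()
}
print({name: len(bad) for name, bad in mismatches.items()})
```

If both counts are zero, either pattern can be used interchangeably for the consecutive case.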

regex for matching latitude, longitudes without any character

I am looking for a regex that strictly allows 2 comma-separated floating-point numbers.
Test cases:
0,0
0.021312311323,0
0,0.012312312312
1.1,0.9836373
Regex that I have tried is
^[-+]?([1-8]?\d(\.\d+)?|90(\.0+)?),\s*[-+]?(180(\.0+)?|((1[0-7]\d)|([1-9]?\d))(\.\d+)?)$\D+|\d*\.?\d+
These are latitudes and longitudes, but I just want 2 values in these parameters.
This regex fails on:
-10a, 10a
10a,10b
I would really appreciate any help and guidance.

Your regex ends with a couple of redundant patterns; you should remove \D+|\d*\.?\d+ after $. Since $ means the end of string, there can be no more text after it, so the \D+ that follows cannot find the non-digit chars it requires. The top-level | then adds a second, unanchored alternative, \d*\.?\d+, which matches any float or integer number anywhere in the string; this is what matched your unwelcome strings.
You can use
^([-+]?(?:[1-8]?\d(?:\.\d+)?|90(?:\.0+)?)),\s*([-+]?(?:180(?:\.0+)?|(?:1[0-7]\d|[1-9]?\d)(?:\.\d+)?))$
Note I converted some capturing groups into non-capturing ones, so that just two capturing groups remain in the pattern: one for the latitude and one for the longitude.
Details
^ - start of string
([-+]?(?:[1-8]?\d(?:\.\d+)?|90(?:\.0+)?)) - Group 1:
  [-+]? - an optional - or +
  (?:[1-8]?\d(?:\.\d+)?|90(?:\.0+)?) - either a number from 0 to 89 ([1-8]?\d) and then an optional fractional part ((?:\.\d+)?), or 90 and then an optional . followed with one or more 0 chars
,\s* - a comma and 0+ whitespace chars
([-+]?(?:180(?:\.0+)?|(?:1[0-7]\d|[1-9]?\d)(?:\.\d+)?)) - Group 2:
  [-+]? - an optional - or +
  (?:180(?:\.0+)?|(?:1[0-7]\d|[1-9]?\d)(?:\.\d+)?) - either 180 followed with an optional . + one or more 0 chars, or a number from 0 to 179 and then an optional fractional part
$ - end of string.
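A minimal Python sketch of how the two capturing groups could be used to parse a pair. The function name parse_lat_lon is hypothetical, and the re.ASCII flag is an added assumption so that \d only matches 0-9.

```python
import re

# Two-group pattern from the answer above, split across two lines for readability.
LAT_LON = re.compile(
    r'^([-+]?(?:[1-8]?\d(?:\.\d+)?|90(?:\.0+)?)),\s*'
    r'([-+]?(?:180(?:\.0+)?|(?:1[0-7]\d|[1-9]?\d)(?:\.\d+)?))$',
    re.ASCII,
)

def parse_lat_lon(text):
    """Hypothetical helper: return (lat, lon) as floats, or None if invalid."""
    m = LAT_LON.match(text)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

for case in ['0,0', '1.1,0.9836373', '-10a, 10a', '10a,10b']:
    print(case, '->', parse_lat_lon(case))
```

Since the signs are inside the groups here, float() receives the signed value directly.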

Your regular expression is almost correct. You should have stopped at $ indicating the end of the string.
const testCases = [
    "0,0",
    "0.021312311323,0",
    "0,0.012312312312",
    "1.1,0.9836373",
    "-10a, 10a",
    "10a,10b"
];
const re = /^[-+]?([1-8]?\d(\.\d+)?|90(\.0+)?),\s*[-+]?(180(\.0+)?|((1[0-7]\d)|([1-9]?\d))(\.\d+)?)$/g;
testCases.forEach(tc => {
    if (tc.match(re)) {
        console.log(" VALID : " + tc);
    } else {
        console.log("NOT VALID : " + tc);
    }
});

Regular expression to get positive integer and -1

Below is the text I hope to match:
00000001,00000002,00000003
It works fine with ((([-1-9]+),)+)?[-1-9]+.
But it doesn't match -1. The expression must not match -2 or anything else except -1.

You may use
^(?:0*[1-9][0-9]*|-1)(?:,(?:0*[1-9][0-9]*|-1))*$
Pattern details:
^ - start of string
(?:0*[1-9][0-9]*|-1) - a non-capturing group matching...
  0*[1-9][0-9]* - zero or more 0 chars, followed with a non-zero digit, followed with any 0 or more digits
  | - or
  -1 - a -1 substring
(?:,(?:0*[1-9][0-9]*|-1))* - a non-capturing group quantified with the * quantifier, matching 0 or more repetitions of:
  , - a comma
  (?:0*[1-9][0-9]*|-1) - same subpattern as in the beginning (-1 or a non-zero number with no fractions)
$ - end of string.
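A short Python sketch exercising this pattern on a few inputs. The helper name is_valid_list is an assumption made for the example.

```python
import re

# Comma-separated list where each item is a positive integer
# (leading zeros allowed) or exactly -1.
ID_LIST = re.compile(r'^(?:0*[1-9][0-9]*|-1)(?:,(?:0*[1-9][0-9]*|-1))*$')

def is_valid_list(text):
    """Hypothetical helper: True if every item is a positive integer or -1."""
    return ID_LIST.match(text) is not None

for case in ['00000001,00000002,00000003', '-1', '1,-1,3', '-2', '0']:
    print(case, '->', is_valid_list(case))
```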

[-1-9]+ doesn't match what you're expecting it to match. It matches for example: "-31-23", which is obviously not a number.
A simple regex like:
(?:^-1)$|^[0-9]+
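will match "-1", or any non-negative integer at the start of the string (including 0001, 00000002, etc.). Note that the second alternative has no $ anchor, so add one if trailing characters must be rejected.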
Also, depending on the language you're using, it would be simpler to use the language's features to decide if the number is "-1" or any other positive number.

As you state that ((([-1-9]+),)+)?[-1-9]+ works fine for capturing a positive integer, and judging by the title of the question, you might use an alternation to capture -1 or only positive integers (including 0 or 00000) from a string, where the number may be preceded by zeroes.
The positive integers will be captured in group 1.
-[02-9][0-9]*|0*(-?[0-9]+)
Details
- Match a hyphen literally
[02-9][0-9]* Match a 0 or a digit 2-9, followed by zero or more digits. Note that the - is not part of the character class, or else --- would also match.
| Or
0* Match zero or more zeros
(-?[0-9]+) Capture in group 1 an optional hyphen followed by one or more digits

Regex range between 0 and 100 including two decimal

I'm trying to figure out a regex expression that does the following. Both conditions below must be true:
1) Between 0 and 100 inclusive
2) May have one or two decimal places, but they are not obligatory.
It should not allow 100.01 or 100.1
100 is the maximum value, or 100.0 or 100.00
I tried ^(100(?:\.00)?|0(?:\.\d\d)?|\d?\d(?:\.\d\d)?)$
which helped me in this question
but this does not accept 99.0 (one decimal).
I'm probably very close.

You just need to make each second decimal digit optional:
^(?:100(?:\.00?)?|\d?\d(?:\.\d\d?)?)$
              ^                 ^
See the updated regex demo. The 0(?:\.\d\d)? alternative is covered by \d?\d(?:\.\d\d)? one (as per Sebastian's comment) and can thus be removed.
The ? quantifier matches one or zero occurrences of the subpattern it quantifies.
Pattern details:
^ - start of string
(?: - start of an alternation group:
100(?:\.00?)? - 100, 100.0 or 100.00 (the .00 is optional and the last 0 is optional, too)
\d?\d(?:\.\d\d?)? - an optional digit followed by an obligatory digit followed with an optional sequence of a dot, a digit and an optional digit.
) - end of the alternation group
$ - end of string.
BONUS: If the number can have either . (dot) or , (comma) as a decimal separator, you can replace all \. patterns in the regex with [.,]:
^(?:100(?:[.,]00?)?|\d?\d(?:[.,]\d\d?)?)$
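As a quick sanity check, the dot-only pattern could be tried in Python like this. The helper name is_percentage and the re.ASCII flag (so that \d only matches 0-9) are assumptions added for this sketch.

```python
import re

# 0 to 100 inclusive, with an optional fractional part of one or two digits.
PERCENT = re.compile(r'^(?:100(?:\.00?)?|\d?\d(?:\.\d\d?)?)$', re.ASCII)

def is_percentage(text):
    """Hypothetical helper: True if text is 0-100 with at most two decimals."""
    return PERCENT.match(text) is not None

for case in ['99.0', '100', '100.00', '100.01', '100.1', '99.999']:
    print(case, '->', is_percentage(case))
```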
