Regular expression for nnn or nnn.nnn - regex

I have this regex
"^([0-9]{1,3})+(\.[0-9]{3})?$"
and it should allow only numbers in the format n, nn, nnn or nnn.nnn.
In my case it also accepts the format nnnnn.nnn.

You should remove + and redundant parentheses:
^[0-9]{1,3}(\.[0-9]{3})?$
^^^^^^^^^^
Your pattern matches the start of the string (^), 1 or more occurrences of 1 to 3 digits (with ([0-9]{1,3})+) and an optional sequence of a dot followed by 3 digits ((\.[0-9]{3})?) at the end of the string ($).
The [0-9]{1,3} will only match 1 to 3 digits.
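To see the difference concretely, here is a minimal sketch in Python (my choice of language; the question does not name one) that runs both patterns over a few sample inputs. The sample strings are my own.

```python
import re

# Original pattern: the "+" repeats the 1-3 digit group any number of times.
original = re.compile(r"^([0-9]{1,3})+(\.[0-9]{3})?$")
# Corrected pattern from the answer above: no "+" and no extra group.
fixed = re.compile(r"^[0-9]{1,3}(\.[0-9]{3})?$")

samples = ["1", "12", "123", "123.456", "12345.678", "1234", "123.45"]

# Map each sample to (accepted by original, accepted by fixed).
results = {s: (bool(original.match(s)), bool(fixed.match(s))) for s in samples}

for s, (old, new) in results.items():
    print(f"{s!r:12} original={old!s:5} fixed={new}")
```

Only the original pattern accepts 12345.678 and 1234; both reject 123.45.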

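If you meant groups of exactly three digits, remove the 1 from the expression: ^([0-9]{3})+(\.[0-9]{3})?$ Note that this version no longer accepts n or nn, and the + still allows nnnnnn.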

The + after the first parenthesis allows for an arbitrary number of repeats. If you mean {1,3} then you don't need the + at all.

This happens because of the + in the middle of your regex.
It means "one or more of the preceding element", so the pattern effectively matches one or more repetitions of ([0-9]{1,3}), optionally followed by (\.[0-9]{3}) at the end of the string.

Related

CMake regex simple digit match [duplicate]

What is the difference between:
(.+?)
and
(.*?)
when I use it in my php preg_match regex?
They are called quantifiers.
* 0 or more of the preceding expression
+ 1 or more of the preceding expression
By default a quantifier is greedy, which means it matches as many characters as possible.
A ? after a quantifier makes it "ungreedy" (lazy), which means it matches as few characters as possible.
Example greedy/ungreedy
For example on the string "abab"
a.*b will match "abab" (preg_match_all will return one match, the "abab")
while a.*?b will match only the starting "ab" (preg_match_all will return two matches, "ab")
You can test your regexes online, e.g. on Regexr.
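The same "abab" example can be checked outside PHP. This is a small sketch in Python, using re.findall as a rough stand-in for preg_match_all (an assumption on my part; the two functions are not identical in general).

```python
import re

text = "abab"

# Greedy: ".*" grabs as much as it can, so one match spans the whole string.
greedy = re.findall(r"a.*b", text)
# Lazy: ".*?" stops at the first possible "b", giving two short matches.
lazy = re.findall(r"a.*?b", text)

print(greedy)  # ['abab']
print(lazy)    # ['ab', 'ab']
```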
The first (+) is one or more characters. The second (*) is zero or more characters. Both are non-greedy (?) and match anything (.).
In RegEx, {i,f} means "between i and f matches". Let's take a look at the following examples:
{3,7} means between 3 and 7 matches
{,10} means up to 10 matches with no lower limit (i.e. the lower limit is 0); not every engine accepts an omitted lower bound (JavaScript, for example, does not treat this form as a quantifier), so {0,10} is the portable spelling
{3,} means at least 3 matches with no upper limit (i.e. the upper limit is infinity)
{,} means no upper limit or lower limit for the number of matches (i.e. the lower limit is 0 and the upper limit is infinity); the portable spelling is {0,}
{5} means exactly 5
Most good languages contain abbreviations, and so does RegEx:
+ is the shorthand for {1,}
* is the shorthand for {0,}
? is the shorthand for {0,1}
This means + requires at least 1 match, * accepts any number of matches or none at all, and ? accepts either zero matches or exactly 1.
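As a rough check of the shorthand table, this Python sketch compares each shorthand with its explicit brace form on a few inputs. The helper name spans and the sample strings are made up for illustration, and I use explicit lower bounds ({0,}, {0,1}) because those are accepted most widely.

```python
import re

# Pairs of (shorthand, explicit-brace form) that should behave identically.
pairs = [(r"a+", r"a{1,}"), (r"a*", r"a{0,}"), (r"a?", r"a{0,1}")]
samples = ["", "a", "aa", "aaaaa", "b"]

def spans(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# For every pair, compare what each form matches at the start of each sample.
equivalent = all(
    spans(short, s) == spans(explicit, s)
    for short, explicit in pairs
    for s in samples
)
print(equivalent)  # True

# {5} means exactly five repetitions: "aaaaa" passes, "aaaa" does not.
print(bool(re.fullmatch(r"a{5}", "aaaaa")), bool(re.fullmatch(r"a{5}", "aaaa")))
```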
Credit: Codecademy.com
+ matches at least one character
* matches any number (including 0) of characters
The ? indicates a lazy expression, so it will match as few characters as possible.
A + matches one or more instances of the preceding pattern. A * matches zero or more instances of the preceding pattern.
So basically, if you use a + there must be at least one instance of the pattern, if you use * it will still match if there are no instances of it.
Consider the following string to match.
ab
The pattern (ab.*) matches, and the capture group contains ab.
The pattern (ab.+) does not match and returns nothing, because .+ needs at least one more character after ab.
If you change the string to the following, (ab.+) matches and returns aba:
aba
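The ab / aba example can be reproduced with a few lines of Python (my choice of language; capture is a hypothetical helper name):

```python
import re

def capture(pattern, text):
    """Return the first capture group of the first match, or None."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

for text in ["ab", "aba"]:
    # ".*" may match nothing after "ab"; ".+" needs at least one character.
    print(text, capture(r"(ab.*)", text), capture(r"(ab.+)", text))
```

This prints "ab ab None" for the first string and "aba aba aba" for the second.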
+ is minimal one, * can be zero as well.
A star is very similar to a plus, the only difference is that while the plus matches 1 or more of the preceding character/group, the star matches 0 or more.
I think the previous answers fail to highlight a simple example.
Say we have an array:
numbers = [5, 15]
The regex ^[0-9]+ matches both 5 and 15, because + requires at least one occurrence of the preceding expression (not a duplicate of it).
^[0-9]* also matches both, and additionally matches input with no leading digit at all, such as an empty string. To match 15 only you would need a second digit, as in ^[0-9][0-9]+.
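Checking the [5, 15] example directly in Python (my choice of language; the values are treated as strings, and the empty string is an extra case I added) suggests that both patterns accept 5 and 15 and differ only on input with no leading digit:

```python
import re

numbers = ["5", "15", ""]

# "+" needs at least one digit; "*" is also satisfied by zero digits.
plus = [bool(re.search(r"^[0-9]+", n)) for n in numbers]
star = [bool(re.search(r"^[0-9]*", n)) for n in numbers]

print(plus)  # [True, True, False]
print(star)  # [True, True, True]
```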

Match a string with a word and a digit 1-9

My regex skills are weak. Given the following strings:
"OtherId":47
"OtherId":7
"MyId":47 (Match this one)
"MyId":7
I want to pick up the string that has "MyId" and a number that is not 1-9.
I thought I could just use the approach from:
RegEx: How can I match all numbers greater than 49?
combined with:
Regular Expressions: Is there an AND operator?
But it's not happening... you can see my failed attempt here:
https://www.regextester.com/index.php?fam=99753
Which is
\b"MyId":\b(?=.*^[0-10]\d)
What am I doing wrong?
You can use this regex to match any number >= 10:
^"MyId":[1-9][0-9]+$
If leading zeroes are to be allowed as well then use:
^"MyId":0*[1-9][0-9]+$
[1-9] makes sure the number starts with 1-9, and [0-9]+ matches 1 or more digits after the first digit.
Essentially, you are looking for 2 or more digits:
\"MyId\"\:(\d{2,})
I have escaped the quotes and colon, and {2,} means 2 or more.
If you need an exact match for any number greater than 9:
^"MyId":[1-9][0-9]+$
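Here is a quick Python sketch (my choice of language) comparing the anchored pattern with the \d{2,} variant from the answers above. The last two test lines are my own additions, and I wrote the second pattern without the optional backslashes before the quotes and colon.

```python
import re

# The four lines from the question plus two extra cases of my own.
lines = ['"OtherId":47', '"OtherId":7', '"MyId":47', '"MyId":7',
         '"MyId":100', '"MyId":07']

# Anchored: first digit 1-9, then at least one more digit.
strict = re.compile(r'^"MyId":[1-9][0-9]+$')
# Unanchored: any run of two or more digits, so a leading zero counts.
two_plus = re.compile(r'"MyId":(\d{2,})')

strict_hits = [s for s in lines if strict.search(s)]
two_plus_hits = [s for s in lines if two_plus.search(s)]

print(strict_hits)    # ['"MyId":47', '"MyId":100']
print(two_plus_hits)  # ['"MyId":47', '"MyId":100', '"MyId":07']
```

The \d{2,} form counts digits rather than value, so it also accepts 07.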

Regex to match a 2-digit number or a 3 digit number

I need to be able to check if a string contains either a 2 digit or a 4 digit number before a . (period).
For example, 39. is good, and so is 3926., but 392. is not.
I originally had (^\\d{2,4}.$) but that allows any number of digits between 2 and 4 preceding a period.
I also tried (^\\d{2}.|\\d{4}.$) but that didn't work.
You can use this regex:
^\d{2}(?:\d{2})?\.$
This regex makes the second \d{2} optional, so it matches 12. or 1234. but not 123. (three digits).
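In the expression (^\d{2}.|\d{4}.$), the dots match any character.
Try escaping them to make them match literal dots: (^\d{2}\.|\d{4}\.$)
Note that ^ still applies only to the first alternative and $ only to the second; grouping the alternatives as ^(\d{2}|\d{4})\.$ anchors both ends.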
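A short Python check (my choice of language) of the two suggestions above. The sample strings beyond 39., 3926. and 392. are my own.

```python
import re

samples = ["39.", "3926.", "392.", "3.", "39261.", "39"]

# Pattern from the first answer: two digits, optionally two more, then a dot.
strict = re.compile(r"^\d{2}(?:\d{2})?\.$")
accepted = [s for s in samples if strict.search(s)]
print(accepted)  # ['39.', '3926.']

# The escaped alternation anchors each branch on one side only, so extra
# text can slip through (the input below is my own test string).
loose = re.compile(r"(^\d{2}\.|\d{4}\.$)")
print(bool(loose.search("39.xyz")))   # True
print(bool(strict.search("39.xyz")))  # False
```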

What is the point of having * in a regular expression

Recently I have been wondering why we need * in regular expressions. For example, if we want to represent A0,A1..,Z99, we can do:
[A-Z][0-9][0-9]*
But A0A (which is not what we want) is also valid according to the above. What benefit does the * give me?
* is just a quantifier, matching between zero and unlimited times.
[A-Z][0-9][0-9]* matches A0,A1..,Z99 and also A10000,Z123456789...
Remember that if you don't use ^ and $ as anchors, the processor will match the specified part and return true even if the input contains more characters, because you didn't say that you want a positive result ONLY if the entire input matches the regex.
If your goal is to match just A0,A1..,Z99, the regex should be:
^[A-Z][0-9][0-9]?$
Or simply:
^[A-Z]\d{1,2}$
\d means 'digit', and is the same as [0-9].
{1,2} means at least 1 time and no more than 2 times.
? also is a quantifier, matching 0 or 1 time.
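To make the effect of the anchors and of {1,2} visible, here is a small Python sketch (my choice of language; the sample list is mine). I pass re.ASCII so that \d means exactly [0-9], since Python's \d otherwise also accepts other Unicode digits.

```python
import re

samples = ["A0", "Z99", "A10000", "A0A", "a1"]

# No anchors: a match anywhere in the string is enough.
unanchored = [s for s in samples if re.search(r"[A-Z][0-9][0-9]*", s)]
# Anchored, but "*" still allows any number of trailing digits.
anchored_star = [s for s in samples if re.search(r"^[A-Z][0-9][0-9]*$", s)]
# Anchored with one or two digits only.
bounded = [s for s in samples if re.search(r"^[A-Z]\d{1,2}$", s, re.ASCII)]

print(unanchored)     # ['A0', 'Z99', 'A10000', 'A0A']
print(anchored_star)  # ['A0', 'Z99', 'A10000']
print(bounded)        # ['A0', 'Z99']
```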
But A0A (which is not we want) is also valid
No it is not valid, you just need to use anchors:
^[A-Z][0-9][0-9]*$
^ will ensure this matches at line start and $ ensures it matches till line end.
Also, if only the 2nd digit is optional, it is better to use:
^[A-Z][0-9][0-9]?$
Since * matches 0 or more times whereas ? matches 0 or 1 time.
It seems like you're trying to match strings that start with an uppercase letter followed by a number from 0 to 99 (with no leading zero).
^[A-Z][1-9]?[0-9]$
^ asserts that we are at the start and $ asserts that we are at the end, so together they force an exact match of the whole string. The anchored pattern won't match in the middle, at the start or at the end of a longer string or line. That is, [A-Z][1-9]?[0-9] will match A10 in the string fooA10, but ^[A-Z][1-9]?[0-9]$ won't produce a match there.
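The fooA10 case can be sketched in Python (my choice of language; the extra inputs A0, A99, A00 and A100 are mine):

```python
import re

text = "fooA10"

# Without anchors the pattern is free to match inside a longer string.
free = re.search(r"[A-Z][1-9]?[0-9]", text)
# With anchors the whole string has to fit the pattern.
pinned = re.search(r"^[A-Z][1-9]?[0-9]$", text)

print(free.group(0) if free else None)  # A10
print(pinned)                           # None

# Which whole strings the anchored pattern accepts.
whole = [s for s in ["A0", "A99", "A00", "A100"]
         if re.search(r"^[A-Z][1-9]?[0-9]$", s)]
print(whole)  # ['A0', 'A99']
```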