I want a regular expression that matches plain numbers as well as numbers containing a "-", in both cases without leading zeros.
The regex
should match 0 1 2 3 123 2-3 22-33 and
should not match 0123-123 01234.
The following regex nearly works:
\b(0|[1-9][0-9]*\-?[0-9]*)\b
The numbers 0 1 2 3 123 2-3 22-33 are matched correctly and 01234 is correctly rejected, but 0123-123 is not: it is partly matched (the 123 after the "-"). https://regex101.com/r/0Po3Ed/1.
You may use a negative lookbehind in your regex:
(?<!-)\b(?:0|[1-9][0-9]*(?:-[0-9]+)?)\b
(?<!-) is a negative lookbehind that fails the match if a - immediately precedes the number. The optional group (?:-[0-9]+)? also requires at least one digit after the -.
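To see the difference between the two patterns, here is a minimal Python sketch that runs both over the sample numbers from the question. The names ORIGINAL, FIXED, SAMPLE and all_matches are made up for this illustration, and Python's re module is assumed to behave like the flavor used in the demo.

```python
import re

# Pattern from the question (no lookbehind).
ORIGINAL = re.compile(r"\b(0|[1-9][0-9]*\-?[0-9]*)\b")
# Pattern from the answer: a "-" directly before the number blocks the match.
FIXED = re.compile(r"(?<!-)\b(?:0|[1-9][0-9]*(?:-[0-9]+)?)\b")

# Sample text assembled from the numbers listed in the question.
SAMPLE = "0 1 2 3 123 2-3 22-33 0123-123 01234"


def all_matches(pattern, text):
    """Return the full text of every match, ignoring capture groups."""
    return [m.group(0) for m in pattern.finditer(text)]


original_hits = all_matches(ORIGINAL, SAMPLE)
fixed_hits = all_matches(FIXED, SAMPLE)

print(original_hits)  # includes a stray "123" taken from 0123-123
print(fixed_hits)     # the stray "123" is gone
```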
I'm trying to find a regex for numeric input. A leading 0 is allowed only when it is followed by a dot and 1 or 2 decimal digits. And of course only numbers should be accepted.
These are the scenarios that we can accept:
0.01
1.1
1.02
120.01
We can't accept these values
0023
0100
.01
.12
Which regex is the best option for these cases?
So far we have tried the following regex for accepting only numbers and dots:
[A-Za-z,]
We also tried the following ones:
^[+-]?[0-9]{1,3}(?:[0-9]*(?:[.,][0-9]{1})?|(?:,[0-9]{3})*(?:\.[0-9]{1,2})?|(?:\.[0-9]{3})*(?:,[0-9]{1,2})?)$
"/^[-]?[$]\d{1,3}(?:,?\d{3})*\.\d{2}$/"
"/(^(\d{1})\.{0,1}([0-9]){0,2}$)|(^([1-9])\d{0,2}(\,\d{0,3})$)/g"
(?:0|[1-9][0-9]*)(?:\.[0-9]{1,2})?
And this one for deleting leading zeros, but it didn't work for cases like 0.10:
^0+
If a negative lookahead is supported, you can exclude matches that start with a zero and have no decimal part.
^(?!0\d*$)\d+(?:\.\d{1,2})?$
^ Start of string
(?!0\d*$) Negative lookahead, assert that what follows is not a zero plus optional digits up to the end of the string
\d+ Match 1+ digits
(?:\.\d{1,2})? Match an optional decimal part with 1 or 2 digits
$ End of string
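As a quick check of the breakdown above, this Python sketch tests the pattern against the accepted and rejected values from the question. DECIMAL, accepted, rejected and results are illustrative names only.

```python
import re

# Pattern from the answer: digits, optional 1-2 decimals,
# but not a zero-led string of digits without a decimal part.
DECIMAL = re.compile(r"^(?!0\d*$)\d+(?:\.\d{1,2})?$")

accepted = ["0.01", "1.1", "1.02", "120.01"]
rejected = ["0023", "0100", ".01", ".12"]

# Map every sample value to True (match) or False (no match).
results = {value: bool(DECIMAL.match(value)) for value in accepted + rejected}

for value, ok in results.items():
    print(f"{value:>8} -> {'match' if ok else 'no match'}")
```

Note that with this pattern a bare 0 is rejected too, because the lookahead treats it as a zero with no decimal part.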
I would go with ^(0|[1-9]\d*|(0|[1-9]\d*)\.\d+)$
You can test here: https://regex101.com/r/oNMgR9/1
Explanation
^ means : match the beginning of the string (or line if the m flag is enabled).
$ means : match the end of the string (or line if the m flag is enabled).
(a|b) means match "a" or match "b" so I'll use this to match either "0" alone or any number not starting with a "0". It's the syntax for a logical or.
. alone is used to match any char. So you have to escape it if you want to match the dot character. This is why I wrote \. instead of . in the pattern.
[ ] is used to list some characters you want to match. It can be a range if you use the - char, so [1-9] means any digit char from "1" to "9".
\d is to match a digit. It's totally equivalent to [0-9].
* means : match the preceding pattern 0 or many times, so \d* means that it will match 0 or many times a digit, so it will match "8" or "465" or "09" but also an empty string "". If you want to match the preceding pattern at least once or many times then you use + instead of *. So \d+ won't match an empty string "" but \d* would match it.
A) Just a number not starting with 0
[1-9]\d* will match any digit from 1 to 9, optionally followed by other digits. This will match numbers without a decimal point.
B) Just 0
0 alone is a possibility. This is because the case above isn't covering it.
C) A number with decimals
(0|[1-9]\d*)\.\d+ will match either a "0" alone or a number not starting with "0", followed by a point and some other digits (which have to be present because we don't want to match "45." without the numbers behind the dot).
Better alternative
The solution from #TheFourthBird is a bit cleaner thanks to its negative lookahead, though it is slightly harder to understand. He also read the question completely: you wanted 1 or 2 digits after the decimal point. I forgot about that, so \d+ should be replaced by \d{1,2}, as you don't want more than 2 digits.
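Putting the three alternatives together with the \d{1,2} correction gives a pattern that can be tried out as follows. This is only a sketch: NUMBER and is_valid are hypothetical names, and the extra inputs "0", "45", "45." and "1.234" are added here for illustration.

```python
import re

# The answer's pattern with \d+ replaced by \d{1,2}, as suggested in the text.
NUMBER = re.compile(r"^(0|[1-9]\d*|(0|[1-9]\d*)\.\d{1,2})$")


def is_valid(text: str) -> bool:
    """True if text is 0, an integer without leading zeros,
    or such a number followed by a dot and 1-2 decimals."""
    return NUMBER.match(text) is not None


for sample in ["0.01", "1.1", "120.01", "0", "45", "0023", ".01", "45.", "1.234"]:
    print(f"{sample:>8} -> {is_valid(sample)}")
```

Unlike the lookahead version, this variant accepts a bare 0.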
You can use
^(?![0.]+$)(?:[1-9]\d*|0)(?:\.\d{1,2})?$
Details:
^ - start of string
(?![0.]+$) - fail the match if there are just zeros or dots till end of string
(?:[1-9]\d*|0) - either a non-zero digit followed with any zero or more digits or a zero
(?:\.\d{1,2})? - optionally followed with a sequence of a . and one or two digits
$ - end of string.
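The effect of the (?![0.]+$) lookahead is easiest to see on zero-valued inputs. The sketch below is a small Python check. AMOUNT and accepts are illustrative names, and the zero-valued samples are not from the question.

```python
import re

# Pattern from the answer above.
AMOUNT = re.compile(r"^(?![0.]+$)(?:[1-9]\d*|0)(?:\.\d{1,2})?$")


def accepts(text: str) -> bool:
    """True if the whole string matches the pattern."""
    return AMOUNT.match(text) is not None


# Values from the question plus a few zero-only inputs.
for sample in ["0.01", "0.10", "120.01", "10", "0", "0.0", "0.00", "0023", ".01"]:
    print(f"{sample:>8} -> {accepts(sample)}")
```

Inputs made up only of zeros and dots (0, 0.0, 0.00) are rejected, which may or may not be what is wanted, since the question does not say how zero should be handled.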
I'm trying to create a regex that retrieves the last number from a string, or the only number if the string contains just one.
Examples:
6 łyżek stopionego masła -> 6
5 łyżek blabla, 6 łyżek masła -> 6
5 łyżek mąki lub masła -> 5
I'm matching only on masła (this word is a variable that changes), so it has to be included in the regex.
EDIT:
I cannot explain what I actually need:
Here is regex101 example: https://regex101.com/r/pEeRk3/1
EDIT2:
Emma's solution works great, but I would need to parse decimals and multi-digit numbers as well, meaning that those should match too:
https://regex101.com/r/pEeRk3/3 - I added examples with answers in the link
If you want to match the last occurrence of a number with an optional decimal part, and your word has to follow this value, you might use lookarounds:
(?<!\S)\d+(?:\.\d+)?(?!\S)(?!.*\d)(?=.*masła)
(?<!\S)\d+(?:\.\d+)?(?!\S) Match 1+ digits with an optional part to match a dot and 1+ digits, surrounded by whitespace boundaries
(?!.*\d) assert that there are no more digits following
(?=.*masła) Assert what is on the right is your word
Or you might use a capturing group:
(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* masła(?!\S)[^\d\n]*$
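Both variants can be tried on the examples from the question with a short Python sketch. WORD, LOOKAROUND, CAPTURE and the two helper functions are hypothetical names. The word is inserted with re.escape on the assumption that it is supplied as a plain string, and the "2.5 łyżki masła" sample is invented for the decimal case.

```python
import re

WORD = "masła"  # the changing word from the question

# Lookaround variant: the whole match is the number.
LOOKAROUND = re.compile(
    r"(?<!\S)\d+(?:\.\d+)?(?!\S)(?!.*\d)(?=.*" + re.escape(WORD) + r")"
)
# Capturing-group variant: the number is in group 1.
CAPTURE = re.compile(
    r"(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* " + re.escape(WORD) + r"(?!\S)[^\d\n]*$"
)


def last_amount(text):
    """Number found by the lookaround pattern, or None."""
    m = LOOKAROUND.search(text)
    return m.group(0) if m else None


def last_amount_captured(text):
    """Number found by the capturing-group pattern, or None."""
    m = CAPTURE.search(text)
    return m.group(1) if m else None


for line in [
    "6 łyżek stopionego masła",
    "5 łyżek blabla, 6 łyżek masła",
    "5 łyżek mąki lub masła",
    "2.5 łyżki masła",
]:
    print(line, "->", last_amount(line), last_amount_captured(line))
```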
This expression might simply suffice:
.*([0-9])
if we are interested in one digit only, or
.*([0-9]+)
if multiple digits might be desired.
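One thing worth checking with the multi-digit version: because the leading .* is greedy, it takes as much as it can and leaves only the final digit for the group. The Python sketch below shows this on a made-up sample with a two-digit number. The second pattern is just one possible workaround and is an assumption of this example, not part of the answer.

```python
import re

# Hypothetical sample with a two-digit last number.
text = "5 łyżek blabla, 16 łyżek masła"

# Greedy .* leaves only the last digit for the capture group.
greedy = re.search(r".*([0-9]+)", text).group(1)

# Possible workaround (assumption): lazy prefix, then require
# that only non-digits follow the captured number.
anchored = re.search(r".*?([0-9]+)\D*$", text).group(1)

print(greedy)    # 6
print(anchored)  # 16
```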
If only strings containing masła are desired, we can expand our expression to:
(?=.*masła).*([0-9])
If we are not validating the numbers and assume they are valid, possibly containing commas or dots, then this expression would likely return the desired output:
(?=.*masła)([0-9,.]+)(\D*)$
My regex skills are weak. Given the following strings:
"OtherId":47
"OtherId":7
"MyId":47 (Match this one)
"MyId":7
I want to pick up the string that has "MyId" and a number that is not 1 - 9
I thought I could just use:
RegEx: How can I match all numbers greater than 49?
Combined using:
Regular Expressions: Is there an AND operator?
But it's not happening... you can see my failed attempt here:
https://www.regextester.com/index.php?fam=99753
Which is
\b"MyId":\b(?=.*^[0-10]\d)
What am I doing wrong?
You can use this regex to match any number >= 10:
^"MyId":[1-9][0-9]+$
If leading zeroes are to be allowed as well then use:
^"MyId":0*[1-9][0-9]+$
[1-9] makes sure the number starts with 1-9, and [0-9]+ matches 1 or more digits after the first digit.
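A small Python sketch of both patterns applied to the lines from the question. STRICT, LENIENT and lines are illustrative names, and the last entry with a leading zero is added here to show where the two differ.

```python
import re

STRICT = re.compile(r'^"MyId":[1-9][0-9]+$')     # no leading zeros
LENIENT = re.compile(r'^"MyId":0*[1-9][0-9]+$')  # leading zeros allowed

lines = [
    '"OtherId":47',
    '"OtherId":7',
    '"MyId":47',
    '"MyId":7',
    '"MyId":047',  # hypothetical extra line, not in the question
]

strict_hits = [line for line in lines if STRICT.match(line)]
lenient_hits = [line for line in lines if LENIENT.match(line)]

print(strict_hits)
print(lenient_hits)
```

When the whole input is one multi-line string rather than a list of lines, the ^ and $ anchors would need the multiline flag (re.MULTILINE in Python).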
Essentially, you are looking for 2 or more digits:
\"MyId\"\:(\d{2,})
I have escaped the quotes and colon, and {2,} means 2 or more. Note that \d{2,} also accepts values with a leading zero, such as 07.
If you need an exact match for any number greater than 9:
^"MyId":[1-9][0-9]+$
I would like to match the "775" (representing the last 3-digit number, with an unknown total number of occurrences) within the string "one 234 two 449 three 775 f4our", with "f4our" representing an unknown number of characters (letters, digits, spaces, but not 3 or more digits in a row).
I came up with the regular expression "(\d{3}).*?$" thinking the "?" would suffice to get the 775 instead of the 234, but this doesn't seem to work.
Is there any way to accomplish this using VBA regular expressions?
Note that (\d{3}).*?$ just matches and captures into Group 1 the first 3 consecutive digits and then matches any 0+ characters other than a newline up to the end of the string.
You need to get the 3 digit chunk at the end of the string that is not followed with a 3-digit chunk anywhere after it.
You may use a negative lookahead (?!.*\d{3}) to impose a restriction on the match:
\d{3}(?!.*\d{3})
Or, if the 3 digits are to be matched as a whole word:
\b\d{3}\b(?!.*\b\d{3}\b)
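The question asks about VBA, but the two patterns use only lookaheads and word boundaries, so they can be illustrated with a Python sketch, assuming the target engine handles these constructs the same way. The second sample string is invented to show where the two patterns differ.

```python
import re

text = "one 234 two 449 three 775 f4our"

# Last run of 3 digits that has no other 3-digit run after it.
last_three = re.search(r"\d{3}(?!.*\d{3})", text).group(0)
# Same idea, but the 3 digits must stand alone as a whole word.
last_three_word = re.search(r"\b\d{3}\b(?!.*\b\d{3}\b)", text).group(0)

print(last_three, last_three_word)  # 775 775

# Hypothetical input with a longer number after the 3-digit one.
other = "678 then 12345"
print(re.search(r"\d{3}(?!.*\d{3})", other).group(0))             # 123
print(re.search(r"\b\d{3}\b(?!.*\b\d{3}\b)", other).group(0))     # 678
```

The plain version picks 3 digits out of a longer number, while the whole-word version skips it.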
I am trying to figure out a regular expression that matches any 9 digits, but the last 4 digits can't be 9999 or 0000. For example, I want the regex to match these:
123456789
123459991
123459990
But not these:
123450000
123459999
I tried negative lookahead. But it doesn't seem to fit in my requirement.
The closest I can get is \d{5}[^\D90]{4}, but with this the last 4 digits can't be 0 or 9 at all, which is not what I want.
It is a special zip code requirement. Any help will be appreciated!
Use a negative lookahead after the 5th digit:
^[0-9]{5}(?!0000|9999)[0-9]{4}$
or a lookbehind at the end:
^[0-9]{9}$(?<!0000|9999)
If your regex flavor doesn't allow a lookbehind with an alternation, use two lookbehinds:
^[0-9]{9}$(?<!0000)(?<!9999)
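The three variants can be compared on the examples from the question with the Python sketch below. The constant names are made up. Python's re module accepts the alternation inside the lookbehind here because both alternatives have the same length; other flavors may differ.

```python
import re

LOOKAHEAD = re.compile(r"^[0-9]{5}(?!0000|9999)[0-9]{4}$")
LOOKBEHIND = re.compile(r"^[0-9]{9}$(?<!0000|9999)")
TWO_LOOKBEHINDS = re.compile(r"^[0-9]{9}$(?<!0000)(?<!9999)")

should_match = ["123456789", "123459991", "123459990"]
should_fail = ["123450000", "123459999"]

# For each code, list the verdict of all three patterns.
verdicts = {
    code: [bool(p.match(code)) for p in (LOOKAHEAD, LOOKBEHIND, TWO_LOOKBEHINDS)]
    for code in should_match + should_fail
}

for code, flags in verdicts.items():
    print(code, flags)
```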
Your \d{5}[^\D90]{4} regex matches any 5 digits followed by 4 characters other than a non-digit, 9 and 0.
You can use
^(?!\d*(?:9999|0000)$)\d{9}$
Shorter variant: ^(?!\d*(?:9{4}|0{4})$)\d{9}$
The negative lookahead (anchored at the start) will fail the match if the input contains some digits and ends with either 9999 or 0000.
If we go on optimizing the regex, the most efficient version (based on what Casimir suggests in his answer) is:
^\d{5}(?!9{4}|0{4})\d{4}$
Here,
^ - start of the string
\d{5} - exactly 5 digits
(?!9{4}|0{4}) - check (but do not consume; the index stays after the 5th digit since it is a zero-width assertion) whether exactly four 9s or four 0s follow, and if so, fail the match (as (?!...) is a negative lookahead)
\d{4} - exactly 4 digits
$ - end of string.
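To check that the lookahead anchored at the start and the one placed after the 5th digit accept the same inputs, the following Python sketch runs both over a block of generated 9-digit strings. The names and the fixed "1234" prefix are arbitrary choices for this illustration, and the sketch does not measure speed.

```python
import re

START_LOOKAHEAD = re.compile(r"^(?!\d*(?:9999|0000)$)\d{9}$")
MID_LOOKAHEAD = re.compile(r"^\d{5}(?!9{4}|0{4})\d{4}$")

# All 9-digit strings that start with an arbitrary fixed prefix "1234".
candidates = [f"1234{n:05d}" for n in range(100_000)]

start_ok = [c for c in candidates if START_LOOKAHEAD.match(c)]
mid_ok = [c for c in candidates if MID_LOOKAHEAD.match(c)]

print(start_ok == mid_ok)                # both patterns agree
print(len(candidates) - len(start_ok))   # number of rejected strings
```

Within this block, 20 strings are rejected: those ending in 0000 or 9999, with ten possible values for the fifth digit in each case.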