Regular expression for phone numbers - regex

I'm trying:
\d{3}|\d{11}|\d{11}-\d{1}
to match three-digit numbers, eleven-digit numbers, eleven-digit followed by a hyphen, followed by one digit.
But, it only matches three digit numbers!
I also tried \d{3}|\d{11}|\d{11}-\d{1} but that doesn't work either.
Any ideas?

There are many ways of punctuating phone numbers. Why don't you remove everything but the digits and check the length?
Note that there are several ways of indicating "extension":
+1 212 555 1212 ext.35
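A minimal Python sketch of the strip-and-count idea above. The helper name digit_count_ok and the allowed lengths (3, 11, and 12 for the "11 digits, hyphen, 1 digit" form) are assumptions read off the question, not something the answer specifies.

```python
import re


def digit_count_ok(raw, allowed_lengths=(3, 11, 12)):
    """Strip every non-digit character, then check how many digits remain.

    allowed_lengths is an assumption based on the question:
    3 digits, 11 digits, or 11 digits plus 1 digit after a hyphen.
    """
    digits = re.sub(r"\D", "", raw)
    return len(digits) in allowed_lengths


if __name__ == "__main__":
    for sample in ["123", "12345678901", "12345678901-2", "1234",
                   "+1 212 555 1212 ext.35"]:
        print(repr(sample), digit_count_ok(sample))
```

The last sample shows the caveat about extensions: its digits run together into a 13-digit string, so an extension would have to be split off before counting.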

If the first part of an alternation matches, then the regex engine doesn't even try the second part.
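Presuming you want to match only three-digit, 11-digit, or 11-digit-hyphen-1-digit numbers, you can use lookarounds to ensure that the preceding and following characters aren't digits. Put the hyphenated alternative first; otherwise \d{11} succeeds on its own and the hyphen and final digit are never tried.
(?<!\d)(\d{11}-\d|\d{11}|\d{3})(?!\d)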
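The effect of alternation order and of the digit lookarounds can be checked with a short Python sketch. The names naive, guarded and first_match are made up for illustration. The guarded pattern lists the longest alternative first, an ordering chosen here so that the hyphenated form is not cut short.

```python
import re

# The pattern from the question: the first alternative that succeeds wins.
naive = re.compile(r"\d{3}|\d{11}|\d{11}-\d{1}")

# Lookarounds reject matches that sit inside a longer run of digits;
# the longest alternative is listed first so it gets tried first.
guarded = re.compile(r"(?<!\d)(?:\d{11}-\d|\d{11}|\d{3})(?!\d)")


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["123", "12345678901", "12345678901-2", "1234"]:
        print(repr(sample), first_match(naive, sample), first_match(guarded, sample))
```

With the naive pattern every sample yields only its first three digits, which is the behaviour described in the question.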

\d{7}+\d{4} will select an eleven digit number. I could not get \d{11} to actually work.

This should work: /(?:^|(?<=\D))(\d{3}|\d{11}|\d{11}-\d{1})(?:$|(?=\D))/
or combined /(?:^|(?<!\d))(\d{3}|\d{11}(?:-\d{1})?)(?:$|(?![\d-]))/
expanded:
/ (?:^ | (?<!\d)) # either start of string or not a digit before us
( # capture grp 1
\d{3} # a 3-digit number
| # or
\d{11} # an 11-digit number
(?:-\d{1})? # optional '-' plus a 1-digit number
) # end capture grp 1
(?:$ | (?![\d-])) # either end of string or not a digit nor '-' after us
/

Related

Regex expression for numbers and leading zeros just with a dot and decimal

I'm trying to find a regex for numeric inputs. A leading 0 is allowed only when it is followed by a dot and 1 or 2 decimal digits. And of course only numbers should be accepted.
These are the scenarios that we can accept:
0.01
1.1
1.02
120.01
We can't accept these values
0023
0100
.01
.12
Which regex is the best option for these cases?
So far we have tried the following regex for accepting just numbers and dots:
[A-Za-z,]
We also tried the following ones:
^[+-]?[0-9]{1,3}(?:[0-9]*(?:[.,][0-9]{1})?|(?:,[0-9]{3})*(?:\.[0-9]{1,2})?|(?:\.[0-9]{3})*(?:,[0-9]{1,2})?)$
"/^[-]?[$]\d{1,3}(?:,?\d{3})*\.\d{2}$/"
"/(^(\d{1})\.{0,1}([0-9]){0,2}$)|(^([1-9])\d{0,2}(\,\d{0,3})$)/g"
(?:0|[1-9][0-9]*)(?:\.[0-9]{1,2})?
And the next one for deleting the leading zeros but it didn't work for 0.10 cases
^0+
If a negative lookahead is supported, you can exclude matches that start with a zero and have no decimal part.
^(?!0\d*$)\d+(?:\.\d{1,2})?$
^ Start of string
(?!0\d*$) Negative lookahead, assert that what follows is not a zero and then only optional digits up to the end of the string
\d+ Match 1+ digits
(?:\.\d{1,2})? Match an optional decimal part with 1 or 2 digits
$ End of string
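As a quick check, the pattern can be run in Python against the accepted and rejected values listed in the question. The variable names here are illustrative only.

```python
import re

# Pattern from the answer above.
pattern = re.compile(r"^(?!0\d*$)\d+(?:\.\d{1,2})?$")

accepted = ["0.01", "1.1", "1.02", "120.01"]   # should match
rejected = ["0023", "0100", ".01", ".12"]      # should not match

results = {text: bool(pattern.match(text)) for text in accepted + rejected}

if __name__ == "__main__":
    for text, ok in results.items():
        print(f"{text!r}: {'match' if ok else 'no match'}")
```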
I would go with ^(0|[1-9]\d*|(0|[1-9]\d*)\.\d+)$
You can test here: https://regex101.com/r/oNMgR9/1
Explanation
^ means : match the beginning of the string (or line if the m flag is enabled).
$ means : match the end of the string (or line if the m flag is enabled).
(a|b) means match "a" or match "b" so I'll use this to match either "0" alone or any number not starting with a "0". It's the syntax for a logical or.
. alone is used to match any char. So you have to escape it if you want to match the dot character. This is why I wrote \. instead of a bare . in the pattern.
[ ] is used to list some characters you want to match. It can be a range if you use the - char, so [1-9] means any digit char from "1" to "9".
\d is to match a digit. It's totally equivalent to [0-9].
* means : match the preceding pattern 0 or many times, so \d* means that it will match 0 or many times a digit, so it will match "8" or "465" or "09" but also an empty string "". If you want to match the preceding pattern at least once or many times then you use + instead of *. So \d+ won't match an empty string "" but \d* would match it.
A) Just a number not starting with 0
[1-9]\d* will match any digit from 1 to 9, optionally followed by other digits. This will match numbers without a decimal point.
B) Just 0
0 alone is a possibility. This is because the case above isn't covering it.
C) A number with decimals
(0|[1-9]\d*)\.\d+ will match either a "0" alone or a number not starting with "0", followed by a point and some other digits (which have to be present because we don't want to match "45." without the numbers behind the dot).
Better alternative
The solution from @TheFourthBird is a bit cleaner thanks to the negative lookahead, though it takes a little more effort to understand. He also read the question completely: you wanted 1 or 2 digits after the decimal point. I forgot about that, so \d+ should be replaced by \d{1,2}, as you don't want more than 2 digits.
You can use
^(?![0.]+$)(?:[1-9]\d*|0)(?:\.\d{1,2})?$
See the regex demo.
Details:
^ - start of string
(?![0.]+$) - fail the match if there are just zeros or dots till end of string
(?:[1-9]\d*|0) - either a non-zero digit followed by zero or more digits, or a single zero
(?:\.\d{1,2})? - optionally followed by a . and one or two digits
$ - end of string.
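A small Python sketch that probes this pattern with a few edge cases beyond the question's list. The sample values and variable names are chosen here for illustration; whether a bare "0" should be rejected is not stated in the question.

```python
import re

# Pattern from the answer above.
pattern = re.compile(r"^(?![0.]+$)(?:[1-9]\d*|0)(?:\.\d{1,2})?$")

# Values taken from the question plus a few extra edge cases (assumed).
samples = ["0.01", "0.10", "120.01", "10", "0", "0.0", "00.12", "0023", ".01"]

results = {text: bool(pattern.match(text)) for text in samples}

if __name__ == "__main__":
    for text, ok in results.items():
        print(f"{text!r}: {'match' if ok else 'no match'}")
```

The lookahead is what rejects inputs made only of zeros and dots, such as "0" and "0.0", while the (?:[1-9]\d*|0) group rules out extra leading zeros such as "00.12".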

Regex to not allow duplicate wild cards

I want to make regex which can pass the following cases:
02:12
10:23
00.23
0.23
.02
:88
Here is what I have tried: ^([0-9:. ])*[.: ]+$
But it allows duplicate ":", "." and " " (space) characters, and I'm also not able to limit the input to 1-2 digits on both sides of the wildcard. Any help would be great. Thanks
The pattern you tried only matches digits on the left side and matching the duplicates is due to the quantifiers.
If you want to allow 1 or 2 digits on both sides and make the digits on the left optional:
^[0-9]{0,2}[.:][0-9]{1,2}$
^ Start of string
[0-9]{0,2} Match 0, 1 or 2 times a digit 0-9
[.:] Match either . or :
[0-9]{1,2} Match 1 or 2 times a digit 0-9
$ End of string
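The pattern can be checked in Python against the cases from the question. The rejected samples are extra inputs made up here to show the duplicate-separator and digit-count limits.

```python
import re

# Pattern from the answer above.
pattern = re.compile(r"^[0-9]{0,2}[.:][0-9]{1,2}$")

should_pass = ["02:12", "10:23", "00.23", "0.23", ".02", ":88"]
should_fail = ["1::2", "123:45", "12:345", "12"]  # hypothetical bad inputs

results = {text: bool(pattern.match(text)) for text in should_pass + should_fail}

if __name__ == "__main__":
    for text, ok in results.items():
        print(f"{text!r}: {'match' if ok else 'no match'}")
```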

Input Commas into regex during the whole number part of 10,4 decimal

I am looking for a regex that will limit a decimal to 10,4 but in the whole number part (10) I would like it to separate with commas.
For example - 1,123,123,123.1234
This gets me close to what I need - \d{0,10}.\d{4}
But I would like to show commas as in the example.
But I am not sure how to tweak this to achieve what I need?
You should be able to use the following :
(?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10})(?:\.\d{1,4})?
I've tested it here.
The whole pattern is an integer part followed by an optional floating part.
The integer part, (?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10}), is an alternative between three sub-patterns :
up to 9 digits with commas, \d{1,3}(?:,\d{3}){0,2}, which is a leading group of digits of one to three digits followed by up to two optional groups of exactly three digits, groups which are separated by commas
the 10 digits case with commas, \d(?:,\d{3}){3}, in which the leading digits group must contain exactly one digit and is followed by three three-digits groups, groups which are separated by commas
the commas-less number you had to begin with, \d{1,10}
The floating part is a dot followed by at least one digit and at most four.
Note that if you can avoid using a regex you absolutely should, this is the kind of regex which will make maintainers cry...
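A short Python sketch for trying the pattern above. The pattern itself has no anchors, so the sketch uses re.fullmatch to require that the whole string fits. That is a choice made here, not part of the answer, and the helper name is_valid_amount is hypothetical.

```python
import re

# Pattern from the answer above: integer part, then an optional fractional part.
pattern = re.compile(
    r"(?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10})(?:\.\d{1,4})?"
)


def is_valid_amount(text):
    """True if the whole string is a number with at most 10 integer digits."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1,123,123,123.1234", "1234567890.5", "123,456",
                   "12,123,123,123", "12345678901", "1,23"]:
        print(repr(sample), is_valid_amount(sample))
```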
I don't think you can do this with a single regex
The algorithm I use is
Take the part of the number before the decimal point
Convert that to a string
Reverse the string
Split the string into chunks of 3 digits allowing the last group to have 1, 2 or 3 digits (this depends on your programming language)
Join the string together inserting , between each group
Reverse the string.
Concatenate a decimal point and the decimal digits if necessary.
You now have a correctly formatted string.
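The steps above can be sketched in Python roughly as follows. The function name group_thousands is made up for illustration, and the input is assumed to be a plain decimal string with no sign and no existing commas.

```python
def group_thousands(number_text):
    """Insert commas into the integer part of a decimal string.

    Assumes number_text looks like '12345678.01' or '123'
    (no sign, no existing commas).
    """
    # Take the part before the decimal point.
    whole, dot, fraction = number_text.partition(".")
    # Reverse it so that chunking starts from the least significant digit.
    reversed_digits = whole[::-1]
    # Split into chunks of 3; the last chunk may hold 1, 2 or 3 digits.
    chunks = [reversed_digits[i:i + 3] for i in range(0, len(reversed_digits), 3)]
    # Join with commas, then reverse back.
    grouped = ",".join(chunks)[::-1]
    # Re-attach the decimal point and decimals, if there were any.
    return grouped + dot + fraction


if __name__ == "__main__":
    print(group_thousands("12345678.01"))
    print(group_thousands("1123123123.1234"))
```

For actual numbers rather than strings, Python's built-in format specifier (for example format(1234567, ",")) already performs this grouping.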
This does the job:
^(?:\d,)?\d{0,3}(?:,\d{1,3}){0,2}\.\d{4}$
Explanation:
^ # beginning of string
(?:\d,)? # non capture group, a digit and a comma, optional
\d{0,3} # 0 to 3 digits
(?: # non capture group
, # a comma
\d{1,3} # 1 to 3 digits
){0,2} # end group, may appear 0, 1 or 2 times
\. # a dot
\d{4} # 4 digits
$ # end of string
The following perl code uses a trick to work from right to left:
$num = 12345678.01;
$rev = reverse($num);
$rev =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
$res = reverse($rev);
print "$res\n";
results in
12,345,678.01

Regex preg_match to neutralize a pricelist, keeping only digits, dots and commas*

I am using preg_match (PHP version 5.5.*) and want to ignore all alphabetic letters [a-zA-Z] and special symbols such as $ and -, only to match numbers, commas, dots. Whitespaces between numbers such as 6 000 should be matched. Commas after a number that is not followed by another number should be ignored, such as 6, would only match 6
Note that this is used in a single string and never in a list, like the sample below. I use the list to show what input and desired output is, "per line".
Sample input:
1
1,99
1.99
10
100
5999 dollars
2 USD
$2,99
Our price 2.99
Price: $ 20
200 $
20,-
6 999 USD
Desired output:
1
1,99
1.99
10
100
5999
2
2,99
2.99
20
200
20
6 999
I have tried /([0-9.,\s]+)/ but the output of 6 999 USD becomes 6.
Edit
The code we are using looks like this:
preg_match($regex, $value, $extractions);
array_shift($extractions);
$this->persist($extractions);
Update:
If you have &nbsp; instead of spaces, you can do two things. My recommendation is to just do a str_replace() first:
$number = str_replace('&nbsp;', ' ', $number);
The other option is to also check for &nbsp; alongside the [\s,] group:
[\d.](?:[\d.]|(?:[\s,]|&nbsp;)(?=\d))*
Example:
preg_match('/[\d.](?:[\d.]|[\s,](?=\d))*/', $number, $matches);
$number = reset($matches);
Explanation:
So I classified the valid characters (digits, spaces, commas, and periods) into two groups: [\d.] and [\s,]. A number must start with a digit or a period ($.99 == .99 != 99). Then we use a repeated non-capturing group (?:...)* to take care of our alternation and lookahead assertions. Any time there is a [\d.] we match it with no questions asked. Otherwise (|), if it is a [\s,], we assert that it is followed by a digit using a lookahead ((?=...)).
Example:
preg_replace('/\s*[^\d\s,.]+\s*|,(?!\d)/', '', $number);
Explanation:
[^\d\s,.]+ will match 1+ characters that are not either a digit, whitespace, a comma, or a period. We put \s* on either side to grab any extra whitespace around these unwanted characters (like in "Our price "). The only unwanted character this doesn't match is a trailing comma. We use an alternation (|), then look for a comma, and then make sure that it is not followed by a digit using a negative lookahead ((?!...)).
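Both approaches above are written for PHP. As a rough cross-check, here is a Python sketch that applies the same two patterns to the sample input from the question. The function names extract_price and strip_noise are hypothetical, and Python's re module is assumed to treat these patterns the same way PCRE does for these inputs.

```python
import re

# First approach: match the number, allowing a space or comma only before a digit.
EXTRACT = re.compile(r"[\d.](?:[\d.]|[\s,](?=\d))*")

# Second approach: delete everything that is not part of the number.
NOISE = re.compile(r"\s*[^\d\s,.]+\s*|,(?!\d)")


def extract_price(text):
    """Return the first number-like run in text, or None if there is none."""
    m = EXTRACT.search(text)
    return m.group(0) if m else None


def strip_noise(text):
    """Remove letters, symbols and trailing commas, keeping the number."""
    return NOISE.sub("", text)


samples = {
    "1": "1", "1,99": "1,99", "1.99": "1.99", "10": "10", "100": "100",
    "5999 dollars": "5999", "2 USD": "2", "$2,99": "2,99",
    "Our price 2.99": "2.99", "Price: $ 20": "20", "200 $": "200",
    "20,-": "20", "6 999 USD": "6 999",
}

if __name__ == "__main__":
    for raw, wanted in samples.items():
        print(repr(raw), repr(extract_price(raw)), repr(strip_noise(raw)), repr(wanted))
```

The samples dictionary maps each input line from the question to its desired output, so the two functions can be compared against it directly.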

RegEx with counting digits and allow special chars

I've done some searching but can't find the right regex.
I would like a regex for a text that only contains digits, whitespace and plus signs.
like: [0-9 +]
But with a min/max limit for only the digits in that text.
My suggestions ended up with something like this:
^[0-9 \+](?=(.*[0-9]){5,8})$
Should be OK:
"123 456 7"
"12345"
"+ 123 456 78"
Should not be ok:
"123456789"
"+ 124 578a"
"+123456789"
Anyone got a solution that might do the trick?
Edit:
I can see that my explanation of what I'm aiming for was too short.
My regex conditions should be:
Must include between 5-8 digits
Allow whitespaces and plus signs
I'm guessing from your own regex that between 5 and 8 digits in a row, without whitespace in between, are allowed. If that's true, then the following regex might do the trick (example written in Python). It allows a single group of between 5 and 8 digits. If there is more than one group, each group must have exactly 3 digits, except for the last group, which can be between 1 and 3 digits long. A single plus sign on the left is optional.
Are you parsing phone numbers? :)
In [176]: regex = re.compile(r"""
^ # start of string
(?: \+\s )? # optional plus sign followed by whitespace
(?:
(?: \d{3}\s )+ # one or more groups of three digits followed by whitespace
\d{1,3} # one group of between one and three digits
| # ALTERNATIVE
\d{5,8} # one group of between five and eight digits
)
$ # end of string
""", flags=re.X)
# --- MATCHES ---
In [177]: regex.findall('123 456 7')
Out[177]: ['123 456 7']
In [178]: regex.findall('12345')
Out[178]: ['12345']
In [179]: regex.findall('+ 123 456 78')
Out[179]: ['+ 123 456 78']
In [200]: regex.findall('12345678')
Out[200]: ['12345678']
# --- NON-MATCHES ---
In [180]: regex.findall('123456789')
Out[180]: []
In [181]: regex.findall('+ 124 578a')
Out[181]: []
In [182]: regex.findall('+123456789')
Out[182]: []
In [198]: regex.findall('123')
Out[198]: []
In [24]: regex.findall('1234 556')
Out[24]: []
You can do something like this:
^(?:[ +]*[0-9]){5}(?:(?:[ +]*[0-9])?){3}$
See it here on Regexr
The first group (?:[ +]*[0-9]){5} are the 5 minimum digits, with any amount of spaces and plus before, the second part (?:(?:[ +]*[0-9])?){3} matches the optional digits, with any amount of spaces and plus before.
You were very close - you need to anchor the lookahead to the start of input, and add a second negative lookahead for the upper bound of the quantity of digits:
^(?=(.*\d){5,8})(?!(.*\d){9,})[\d +]+$
Also, fyi you don't need to escape the plus sign within the character class, and [0-9] is \d
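A Python sketch that runs the last two patterns against the examples from the question. The names by_groups and by_lookahead are made up for illustration, and the trailing-space sample is an extra input added here to show where the two patterns differ.

```python
import re

# Counting approach: 5 required digits, then up to 3 optional ones,
# each digit may be preceded by spaces or plus signs.
by_groups = re.compile(r"^(?:[ +]*[0-9]){5}(?:(?:[ +]*[0-9])?){3}$")

# Lookahead approach: at least 5 digits, not 9 or more, only digits, spaces, plus.
by_lookahead = re.compile(r"^(?=(.*\d){5,8})(?!(.*\d){9,})[\d +]+$")

should_pass = ["123 456 7", "12345", "+ 123 456 78"]
should_fail = ["123456789", "+ 124 578a", "+123456789"]

if __name__ == "__main__":
    for text in should_pass + should_fail + ["12345 "]:
        print(repr(text), bool(by_groups.match(text)), bool(by_lookahead.match(text)))
```

The two patterns agree on the question's examples. They differ on trailing characters: "12345 " (with a trailing space) is accepted by the lookahead version but not by the counting version, because the latter only allows spaces and plus signs in front of a digit.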