Italian phone 10-digit number regex issue - regex

I'm trying to use the regex from this site
/^([+]39)?((38[{8,9}|0])|(34[{7-9}|0])|(36[6|8|0])|(33[{3-9}|0])|(32[{8,9}]))([\d]{7})$/
for Italian mobile phone numbers, but a simple number such as 3491234567 is reported as invalid.
(I don't care about spaces, as I'll trim them.)
should pass:
349 1234567
+39 349 1234567
TODO: 0039 349 1234567
TODO: (+39) 349 1234567
TODO: (0039) 349 1234567
regex101 and regexr both pass the validation. What's wrong?
UPDATE:
To clarify:
The regex should match any number that starts with either
388/389/380 (38[{8,9}|0])|
or
347/348/349/340 (34[{7-9}|0])|
or
366/368/360 (36[6|8|0])|
or
333/334/335/336/337/338/339/330 (33[{3-9}|0])|
328/329 (32[{8,9}])
plus 7 digits ([\d]{7})
and the +39 at the start optionally ([+]39)?

The following regex appears to fulfill your requirements. I took out the syntax errors and guessed a bit, and added the missing parts to cover your TODO comments.
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[7-90]|36[680]|33[3-90]|32[89])\d{7}$
Demo: https://regex101.com/r/yF7bZ0/1
Your test cases fail to cover many of the variations captured by the regex; perhaps you'll want to beef up the test set to make sure it does what you want.
The beginning allows for an optional international prefix with or without the parentheses. The basic pattern is (00|\+)39 and it is repeated with or without parentheses around it. (Perhaps a better overall approach would be to trim parentheses and punctuation as well as whitespace before processing begins; you'll want to keep the plus as significant, of course.)
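A quick way to check the pattern against the test cases from the question is a short Python sketch. It assumes, as the question says, that spaces are stripped before matching; the helper name is_italian_mobile is made up for illustration.

```python
import re

# Pattern copied from the answer above (split across two string literals).
ITALIAN_MOBILE = re.compile(
    r"^(\((00|\+)39\)|(00|\+)39)?"
    r"(38[890]|34[7-90]|36[680]|33[3-90]|32[89])\d{7}$"
)

def is_italian_mobile(raw: str) -> bool:
    """Strip spaces (as the question plans to do) and test the pattern."""
    return ITALIAN_MOBILE.match(raw.replace(" ", "")) is not None

samples = [
    "349 1234567",
    "+39 349 1234567",
    "0039 349 1234567",
    "(+39) 349 1234567",
    "(0039) 349 1234567",
    "321 1234567",        # 321 is not one of the listed operator prefixes
    "(+39 349 1234567",   # unbalanced parenthesis
]
for sample in samples:
    print(f"{sample!r}: {is_italian_mobile(sample)}")
```

The first five samples are the ones listed in the question; the last two are extra negative cases added here to show what the pattern rejects.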
Updated with information from @Edoardo's answer; wrapped for legibility and added comments:
^ # beginning of line
(\((00|\+)39\)|(00|\+)39)? # country code or trunk code, with or without parentheses
( # followed by one of the following
32[89]| # 328 or 329
33[013-9]| # 33x where x != 2
34[04-9]| # 34x where x not in 1,2,3
35[01]| # 350 or 351
36[068]| # 360 or 366 or 368
37[019]| # 370 or 371 or 379
38[089]) # 380 or 388 or 389
\d{6,7} # ... followed by 6 or 7 digits
$ # and end of line
There are obvious accidental gaps which will probably also get filled over time. Generalizing this further is likely to improve resilience toward future changes, but of course may at the same time increase the risk of false positives. Make up your mind about which is worse.
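In Python, the wrapped and commented layout can be used almost verbatim with the re.VERBOSE flag, which ignores whitespace and # comments inside the pattern. The sketch below is one possible transcription, not a definitive implementation; the name ITALIAN_MOBILE_VERBOSE is an assumption, and every alternative except the last is followed by an alternation bar.

```python
import re

ITALIAN_MOBILE_VERBOSE = re.compile(
    r"""
    ^                            # beginning of string
    (\((00|\+)39\)|(00|\+)39)?   # optional country code, with or without parentheses
    (                            # followed by one of the following
        32[89]    |              # 328 or 329
        33[013-9] |              # 33x where x != 2
        34[04-9]  |              # 34x where x not in 1,2,3
        35[01]    |              # 350 or 351
        36[068]   |              # 360 or 366 or 368
        37[019]   |              # 370 or 371 or 379
        38[089]                  # 380 or 388 or 389
    )
    \d{6,7}                      # followed by 6 or 7 digits
    $                            # end of string
    """,
    re.VERBOSE,
)

# Inputs are assumed to have had their spaces removed already.
for number in ["3701234567", "+393491234567", "(0039)349123456", "3321234567"]:
    print(number, bool(ITALIAN_MOBILE_VERBOSE.match(number)))
```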

I found this and updated it with new operators and MVNO prefixes (Iliad, ho.):
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])\d{6,7}$

I improved the regex to handle spaces between the digit groups, and an optional space after the country prefix:
^(\((00|\+)39\)|(00|\+)39)?\s?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])(\s?\d{3}\s?\d{3,4}|\d{6,7})$
So, for example, it matches phone numbers like (0039) 349 123 4567 or 349 123 4567.

According to this document:
https://it.qaz.wiki/wiki/Telephone_numbers_in_Italy
a simple regex for Italian mobile numbers without special characters is:
/^3[0-9]{8,9}$/
It matches a string that starts with the digit '3' followed by 8 or 9 digits, e.g.:
3345678103
You can then add the Italian prefix, '+39 ' or '0039 ':
/^\+39 3[0-9]{8,9}$/ --- match --> +39 3345678103
/^0039 3[0-9]{8,9}$/ --- match --> 0039 3345678103

Related

Regex matching issue to Test-String

I have a problem that I don't understand.
My Regex:
My Test-String:
I have two issues and one general question :)
As you can see in my Test-String, the very last (German) phone number (the big yellow one in the Test-String attachment) is not matched correctly by my regex pattern. I don't understand what the problem is here: the "0049" ends up in Group 5 but should be in Group 2. Why is that?
My second problem: how can I get rid of the spaces before and after every match? (The 7 small yellow circles in the Test-String attachment.)
For copy/paste purposes, here is the Regex and Test-String again:
Regex:
((\+\d{2}|00\d{2})?([ ])?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ])?(\d+)?([ ])?(\d+)?)
Test-String:
Vorwahl 089, die E.123 ebenfalls , also (089) 1234567. Die DIN 5008, also +49 89 1234567 respectivly 0049 89 1234567. Die E.123 empfiehlt, also +49 89 123456 0 respectivly 0049 89 123456 0 oder +49 89 123456 789. Also +49 89 123 456 789. Klammern 089/1234567 und 0151 19406041. Test +49 151 123 456 789 respectivly 0049 151 123 456 789
Last but not least, my general question:
Is it a good approach to group each logical part as I did in my example?
One last piece of information: I validate my regex with https://regex101.com/ and use it in Python with the re module.
What makes it unpredictable is the large number of optional groups (..)?.
As a first step, I recommend replacing ([ ])?(\d+)? with the coupled expression ([ ]?\d+)?, which avoids spaces at the end of the match (your point #2).
As a second step, I recommend coupling the first optional space with the expression for the international dialling prefix: ((\+|00)\d{2}([ ])?)?. This fixes both the space at the beginning and the recognition of the whole number, because a match can no longer start on a bare space and there are fewer possible ways to match.
The new expression now looks like this:
(((\+|00)\d{2}([ ])?)?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ]?\d+)?([ ]?\d+)?)
I then recommend simplifying the last part, if you don't need the individual group values:
(((\+|00)\d{2}([ ])?)?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ]?\d+){0,2})
For better performance, I suggest removing the parentheses/groups where possible, or marking them as non-capturing, if you don't need the specific group values.
In some programming languages you will not need the outermost parentheses, as the whole match is always group 0.
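Since the question mentions Python's re module, the difference between the two expressions can be seen on the tail of the test string. This is only a sketch: the names ORIGINAL and SIMPLIFIED are made up here, and the sample text is cut down from the question's Test-String.

```python
import re

# The regex from the question.
ORIGINAL = re.compile(
    r"((\+\d{2}|00\d{2})?([ ])?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})"
    r"([ ])?(\d+)?([ ])?(\d+)?)"
)
# The final expression from this answer.
SIMPLIFIED = re.compile(
    r"(((\+|00)\d{2}([ ])?)?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ]?\d+){0,2})"
)

text = "respectivly 0049 151 123 456 789"

old = ORIGINAL.search(text)
new = SIMPLIFIED.search(text)

# The original can start on the space before "0049", so the optional
# prefix group stays empty and "0049" lands in group 5.
print(repr(old.group(0)), old.group(2), old.group(5))

# The simplified version cannot start on a space, so the match begins
# at "0049" and covers the whole number.
print(repr(new.group(0)))
```

The leftmost possible start wins in a regex search, which is why the stand-alone optional space in the original pulls the match one character too far to the left.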

Regex for getting just name of street and number from messy address

I have this list of messy addresses, some are clean some aren't:
Av. Chorrillos # 1759 Local 1082 Exterior Jumbo
Av. Balmaceda N° 2355 Local BS - 121 / Subterráneo sector servicios
Tarapaca N° 729
The structure is usually name of street + N°|#|nothing + number + extra stuff
I'd like to erase this extra stuff so that the expected output from the above list is:
Av. Chorrillos # 1759
Av. Balmaceda N° 2355
Tarapaca N° 729
I tried using a combination of letters and lookback:
([a-zA-Z\s]+\d+)
But the # and N° gave me trouble, so I tried also including them
([(\w|°|#)\s]+\d+)
but still no luck.
I know regex on addresses is a nightmare, but any regex that fits those three cases above would fit 95% of my list, which is good enough for me!
I'm using this with python regex in case that matters.
You can find the list of addresses and my regex attempt on regex101
(Some addresses have extra info BEFORE the relevant information of street + number, but I'm fine with screwing up those)
Based on your specifications, I came up with this regex.
Regex: ^.*?(?:[N°#Nº]\s*)?\d+
Explanation:
^.*? consumes everything from the beginning of the string. Since the match is lazy, it stops as soon as the next part, (?:[N°#Nº]\s*)?\d+, can match.
(?:[N°#Nº]\s*)? optionally matches one of the characters N, °, # or º, followed by zero or more whitespace characters.
\d+ matches the number.
Regex101 Demo
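Since the question uses Python, here is a minimal sketch of applying the pattern to the three sample addresses. The helper name street_and_number is an assumption, and returning the input unchanged when nothing matches is just one possible fallback.

```python
import re

# Pattern from the answer above.
STREET_AND_NUMBER = re.compile(r"^.*?(?:[N°#Nº]\s*)?\d+")

def street_and_number(address: str) -> str:
    """Return the address up to and including the first number."""
    match = STREET_AND_NUMBER.match(address)
    return match.group(0) if match else address

addresses = [
    "Av. Chorrillos # 1759 Local 1082 Exterior Jumbo",
    "Av. Balmaceda N° 2355 Local BS - 121 / Subterráneo sector servicios",
    "Tarapaca N° 729",
]
for address in addresses:
    print(street_and_number(address))
```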

phone number RegEx not working for some strings

I want to recognize phone number as 9 consecutive figures which can be separated by white spaces, non-breaking spaces etc. with regEx "(\s*\d\s*){9}"
I run VBA macro (JS RegEx) and here are example strings which work fine with above RegEx:
ul. 27 Grudnia 16, tel. 21 287 31 61, fax 61 286 69 60 –
ul. Wrzosowa 110/120/222, kom. 692 601 428
And here is an example where phone number is not detected in VBA, but is detected by RegEx JS online tools:
al. Mazowieckiego 63, kom. 622 769 694 –
Strings which are detected and these which are not, have the same structure, so I have no idea why VBA doesn't detect phone number in some of them.
It turned out that VBA had changed some of the strings being searched: it replaced an ordinary space, chr(32), with a non-breaking space, chr(160).
Removing chr(160) from the string being searched solves the problem.
I will also try to find a RegEx that accepts non-breaking spaces, because \s* doesn't match them, at least in VBA.
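One way to accept non-breaking spaces is to list the character explicitly next to \s. The sketch below illustrates the idea in Python rather than VBA. Python's default \s already matches U+00A0, so the re.ASCII flag is used here only to emulate an engine whose \s does not. The position of the non-breaking spaces in the sample string is an assumption.

```python
import re

# re.ASCII restricts \s to [ \t\n\r\f\v], emulating an engine whose \s
# does not cover the non-breaking space chr(160).
PLAIN = re.compile(r"(?:\s*\d\s*){9}", re.ASCII)

# Same pattern, with the non-breaking space added explicitly.
WITH_NBSP = re.compile(r"(?:[\s\u00a0]*\d[\s\u00a0]*){9}", re.ASCII)

# Assumed input: the digit groups are separated by chr(160), not chr(32).
text = "al. Mazowieckiego 63, kom. 622\u00a0769\u00a0694"

print(PLAIN.search(text))  # no 9-digit run is found

found = WITH_NBSP.search(text)
digits = re.sub(r"\D", "", found.group(0))
print(digits)
```

In an engine that accepts the same character-class syntax, the equivalent would be to add the non-breaking space character to the class in whatever escape form that engine supports.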

Match phone numbers with lengths between 8-16 digits, ignoring ()+-

Consider the following:
+12 34 456 432
(12) 34 567 124
1234 56 78 90
(1234) 567 890
1234-567-890
1234 - 567 - 890
12 34 56 78
12-34-56-78
Assume these are all valid phone number structures
Can a regex be used to express: find at least 8 digits, but not more than 16, and ignore spaces, round brackets, the plus symbol (once) and the minus?
My current working sample is a mess:
^([\+|\(]{1,2})?+(\d{2,4})+([ |-|\)]{1,2})?+(\d{2,3})+([ |-]{1})?+(\d{2,3})+([ |-]{1})?+(\d{2,3})?$
Even though phone number validation is generally discouraged, is there not a simpler regex for this?
To just account for the number of digits and ignore the -, ), ( or spaces (allowing a + at the beginning), you can use the following regex:
^\+?(?:[ ()-]*\d){8,16}$
It matches
^ - start of string
\+? - one or zero +
(?:[ ()-]*\d){8,16} - 8 to 16 sequences of...
[ ()-]* - 0 or more -, ), ( or a space characters
\d - a digit
$ - end of string
See the regex demo
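As a rough check, the pattern can be run against the sample numbers from the question. This Python sketch uses the made-up helper name has_valid_digit_count; the two rejected inputs are extra cases added here.

```python
import re

# Pattern from the answer above: optional +, then 8 to 16 digits,
# each optionally preceded by spaces, parentheses or dashes.
PHONE = re.compile(r"^\+?(?:[ ()-]*\d){8,16}$")

def has_valid_digit_count(number: str) -> bool:
    return PHONE.match(number) is not None

samples = [
    "+12 34 456 432",
    "(12) 34 567 124",
    "1234 56 78 90",
    "(1234) 567 890",
    "1234-567-890",
    "1234 - 567 - 890",
    "12 34 56 78",
    "12-34-56-78",
    "1234567",             # only 7 digits
    "12345678901234567",   # 17 digits
]
for sample in samples:
    print(f"{sample!r}: {has_valid_digit_count(sample)}")
```

Note that the pattern must end on a digit, so a trailing space or closing parenthesis after the last digit is rejected.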
This may ease your task.
First, remove everything that is not a number:
myString = myString.replace(/\D/g,'');
You'll get this:
1234456432
1234567124
1234567890
1234567890
1234567890
1234567890
12345678
12345678
Then just check for length:
if (myString.length >= 8 && myString.length <= 16)
// Do stuff
Use preg_replace to keep only the digits, then check for a valid length:
<?php
$ph = "(12) 34 567 124";
$len = strlen(preg_replace('/[^0-9]+/', '', $ph));
if($len >=8 && $len <=16)
echo "Valid";
else
echo "Invalid";
Don't even think about it. Phone numbers are complicated. They are hugely complicated. Google has a decent library to handle phone numbers named libphonenumber.
And excuse me, but ignoring the "+" makes whatever you are doing totally, absolutely wrong. A plus is followed by the country code of some country, followed by a local phone number within that country (which needs to be parsed according to the rules of that country, and there are about 200). Without the "+", you have a phone number according to the local rules, and you need to find out which local rules apply. Which means your number can start with a code for dialing a foreign exchange instead of the "+", otherwise it is formatted according to local rules.
As a result, a number may be valid with the "+" and invalid without it or vice versa, and most likely refers to a different actual phone in totally different countries with or without the "+".

Phone validation regex

I'm using this pattern to check the validation of a phone number
^[0-9\-\+]{9,15}$
It works for 0771234567 and +0771234567,
but I want it to work for 077-1234567, +077-1234567, +077-1-23-45-67 and +077-123-45-6-7 as well.
What should I change in the pattern?
Please refer to this SO Post
example of a regular expression in jquery for phone numbers
/\(?([0-9]{3})\)?([ .-]?)([0-9]{3})\2([0-9]{4})/
(123) 456 7899
(123).456.7899
(123)-456-7899
123-456-7899
123 456 7899
1234567899
are supported
This solution actually validates the numbers and the format. For example: 123-456-7890 is a valid format but is NOT a valid US number and this answer bears that out where others here do not.
If you do not want the extension capability, remove the following, including the parentheses:
(?:\s*(?:#|x.?|ext.?|extension)\s*(\d+)\s*)? :)
edit (addendum) I needed this in a client side only application so I converted it. Here it is for the javascript folks:
var myPhoneRegex = /(?:(?:\+?1\s*(?:[.-]\s*)?)?(?:(\s*([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9])\s*)|([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9]))\s*(?:[.-]\s*)?)([2-9]1[02-9]|[2-9][02-9]1|[2-9][02-9]{2})\s*(?:[.-]\s*)?([0-9]{4})\s*(?:\s*(?:#|x\.?|ext\.?|extension)\s*(\d+)\s*)?$/i;
if (myPhoneRegex.test(phoneVar)) {
// Successful match
} else {
// Match attempt failed
}
hth.
end edit
This allows extensions or not and works with .NET
(?:(?:\+?1\s*(?:[.-]\s*)?)?(?:(\s*([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9])\s*)|([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9]))\s*(?:[.-]\s*)?)([2-9]1[02-9]|[2-9][02-9]1|[2-9][02-9]{2})\s*(?:[.-]\s*)?([0-9]{4})(?:\s*(?:#|x\.?|ext\.?|extension)\s*(\d+))?$
To validate with or without trailing spaces (for example, when using .NET validators and trimming server-side), use this slightly different regex:
(?:(?:\+?1\s*(?:[.-]\s*)?)?(?:(\s*([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9])\s*)|([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9]))\s*(?:[.-]\s*)?)([2-9]1[02-9]|[2-9][02-9]1|[2-9][02-9]{2})\s*(?:[.-]\s*)?([0-9]{4})\s*(?:\s*(?:#|x\.?|ext\.?|extension)\s*(\d+)\s*)?$
All valid:
1 800 5551212
800 555 1212
8005551212
18005551212
+1800 555 1212 extension65432
800 5551212 ext3333
Invalid #s
234-911-5678
314-159-2653
123-234-5678
EDIT: Based on Felipe's comment I have updated this for international.
Based on what I could find out from here and here regarding valid global numbers
This is tested as a first line of defense of course. An overarching element of the international number is that it is no longer than 15 characters. I did not write a replace for all the non digits and sum the result. It should be done for completeness. Also, you may notice that I have not combined the North America regex with this one. The reason is that this international regex will match North American numbers, however, it will also accept known invalid # such as +1 234-911-5678. For more accurate results you should separate them as well.
Pauses and other dialing instruments are not mentioned and therefore invalid per E.164
\(?\+[0-9]{1,3}\)? ?-?[0-9]{1,3} ?-?[0-9]{3,5} ?-?[0-9]{4}( ?-?[0-9]{3})?
With 1-10 letter word for extension and 1-6 digit extension:
\(?\+[0-9]{1,3}\)? ?-?[0-9]{1,3} ?-?[0-9]{3,5} ?-?[0-9]{4}( ?-?[0-9]{3})? ?(\w{1,10}\s?\d{1,6})?
Valid international numbers (the country name is for reference only; it is not part of the match):
+55 11 99999-5555 Brazil
+593 7 282-3889 Ecuador
(+44) 0848 9123 456 UK
+1 284 852 5500 BVI
+1 345 9490088 Grand Cayman
+32 2 702-9200 Belgium
+65 6511 9266 Asia Pacific
+86 21 2230 1000 Shanghai
+9124 4723300 India
+821012345678 South Korea
And for your extension pleasure
+55 11 99999-5555 ramal 123 Brazil
+55 11 99999-5555 foo786544 Brazil
Enjoy
I have a more generic regex to allow the user to enter only numbers, +, -, whitespace and (). It respects the parenthesis balance and there is always a number after a symbol.
^([+]?[\s0-9]+)?(\d{3}|[(]?[0-9]+[)])?([-]?[\s]?[0-9])+$
false, ""
false, "+48 504 203 260##"
false, "+48.504.203.260"
false, "+55(123) 456-78-90-"
false, "+55(123) - 456-78-90"
false, "504.203.260"
false, " "
false, "-"
false, "()"
false, "() + ()"
false, "(21 7777"
false, "+48 (21)"
false, "+"
true , " 1"
true , "1"
true, "555-5555-555"
true, "+48 504 203 260"
true, "+48 (12) 504 203 260"
true, "+48 (12) 504-203-260"
true, "+48(12)504203260"
true, "+4812504203260"
true, "4812504203260"
Consider:
^\+?[0-9]{3}-?[0-9]{6,12}$
This only allows + at the beginning; it requires 3 digits, followed by an optional dash, followed by 6-12 more digits.
Note that the original regex allows 'phone numbers' such as 70+12---12+92, which is a bit more liberal than you probably had in mind.
The question was amended to add:
+077-1-23-45-67 and +077-123-45-6-7
You now probably need to be using a regex system that supports alternatives:
^\+?[0-9]{3}-?([0-9]{7}|[0-9]-[0-9]{2}-[0-9]{2}-[0-9]{2}|[0-9]{3}-[0-9]{2}-[0-9]-[0-9])$
The first alternative is seven digits; the second is 1-23-45-67; the third is 123-45-6-7. These all share the optional plus + followed by 3 digits and an optional dash - prefix.
The comment below mentions another pattern:
+077-12-34-567
It is not at all clear what the general pattern should be - maybe one or more digits separated by dashes; digits at front and back?
^\+?[0-9]{3}-?[0-9]+(-[0-9]+)+$
This will allow the '+077-' prefix, followed by any sequence of digits alternating with dashes, with at least one digit between each dash and no dash at the end.
/^[0-9\+]{1,}[0-9\-]{3,15}$/
so first is a digit or a +, then some digits or -
First test the length of the string to see if it is between 9 and 15.
Then use this regex to validate:
^\+?\d+(-\d+)*$
This is yet another variation of the normal* (special normal*)* pattern, with normal being \d and special being -.
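The two steps (length check first, then the pattern) can be sketched in Python as follows. The function name is_valid_phone is an assumption, and the 9 to 15 range is applied to the whole string, as in the question's original pattern.

```python
import re

# normal* (special normal*)* with normal = \d and special = -
DASHED_DIGITS = re.compile(r"^\+?\d+(-\d+)*$")

def is_valid_phone(value: str) -> bool:
    """Check the overall length first, then the digit/dash structure."""
    if not 9 <= len(value) <= 15:
        return False
    return DASHED_DIGITS.match(value) is not None

for candidate in [
    "0771234567",
    "+0771234567",
    "077-1234567",
    "+077-1-23-45-67",
    "+077-123-45-6-7",
    "70+12---12+92",   # accepted by the question's pattern, rejected here
    "077-123456-",     # trailing dash
]:
    print(f"{candidate!r}: {is_valid_phone(candidate)}")
```

Because every dash must be followed by at least one digit, doubled and trailing dashes are rejected without any extra alternatives.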
I tried :
^(1[ \-\+]{0,3}|\+1[ -\+]{0,3}|\+1|\+)?((\(\+?1-[2-9][0-9]{1,2}\))|(\(\+?[2-8][0-9][0-9]\))|(\(\+?[1-9][0-9]\))|(\(\+?[17]\))|(\([2-9][2-9]\))|([ \-\.]{0,3}[0-9]{2,4}))?([ \-\.][0-9])?([ \-\.]{0,3}[0-9]{2,4}){2,3}$
I took care of special country codes like 1-97... as well. Here are the numbers I tested against (from Puneet Lamba and MCattle):
***** PASS *****
18005551234
1 800 555 1234
+1 800 555-1234
+86 800 555 1234
1-800-555-1234
1.800.555.1234
+1.800.555.1234
1 (800) 555-1234
(800)555-1234
(800) 555-1234
(800)5551234
800-555-1234
800.555.1234
(+230) 5 911 4450
123345678
(1) 345 654 67
+1 245436
1-976 33567
(1-734) 5465654
+(230) 2 345 6568
***** CORRECTLY FAILING *****
(003) 555-1212
(103) 555-1212
(911) 555-1212
1-800-555-1234p
800x555x1234
+1 800 555x1234
***** FALSE POSITIVES *****
180055512345
1 800 5555 1234
+867 800 555 1234
1 (800) 555-1234
86 800 555 1212
Originally posted here: Regular expression to match standard 10 digit phone number
Here is the regex for Ethiopian phone numbers (EthioTelecom and Safaricom). For my fellow Ethiopian developers ;)
phoneExp = /^(^\+251|^251|^0)?(9|7)\d{8}$/;
It matches the following (the anchors reject any unwanted characters at the start and end):
+251912345678
251912345678
0912345678
912345678
+251712345678
251712345678
0712345678
712345678
You can test it on this site regexr.
^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$
Matches the following cases:
123-456-7890
(123) 456-7890
123 456 7890
123.456.7890
+91 (123) 456-7890
Try this
\+?\(?([0-9]{3})\)?[-.]?\(?([0-9]{3})\)?[-.]?\(?([0-9]{4})\)?
It matches the following cases
+123-(456)-(7890)
+123.(456).(7890)
+(123).(456).(7890)
+(123)-(456)-(7890)
+123(456)(7890)
+(123)(456)(7890)
123-(456)-(7890)
123.(456).(7890)
(123).(456).(7890)
(123)-(456)-(7890)
123(456)(7890)
(123)(456)(7890)
For further explanation on the pattern CLICKME
The following regex matches a '+' followed by n digits
var mobileNumber = "+18005551212";
var regex = new RegExp("^\\+[0-9]*$");
var OK = regex.test(mobileNumber);
if (OK) {
console.log("is a phone number");
} else {
console.log("is NOT a phone number");
}
^\+?\d{3}-?\d{2}-?\d{2}-?\d{3}$
You may try this....
How about this one....Hope this helps...
^(\+?)\d{3,3}-?\d{2,2}-?\d{2,2}-?\d{3,3}$
^[0-9\-\+]{9,15}$
would match 0+0+0+0+0+0, or 000000000, etc.
(\-?[0-9]){7}
would match a specific number of digits with optional hyphens in any position among them.
What is this +077 format supposed to be?
It's not a valid format. No country codes begin with 0.
The digits after the + should usually be a country code, 1 to 3 digits long.
Allowing for "+" then country code CC, then optional hyphen, then "0" plus two digits, then hyphens and digits for next seven digits, try:
^\+CC\-?0[1-9][0-9](\-?[0-9]){7}$
Oh, and {3,3} is redundant; it simplifies to {3}.
This regex matches any number in the common format 1-(999)-999-9999 and anything in between. It also allows the area code with or without parentheses, and separators that are a period, space or dash:
"^([01][- .])?(\(\d{3}\)|\d{3})[- .]?\d{3}[- .]\d{4}$"
Adding to @Joe Johnston's answer, this will also accept:
+16444444444,,241119933
(Required for Apple's special character support for dial-ins - https://support.apple.com/kb/PH18551?locale=en_US)
\(?\+[0-9]{1,3}\)? ?-?[0-9]{1,3} ?-?[0-9]{3,5} ?-?[0-9]{4}( ?-?[0-9]{3})? ?([\w\,\#\^]{1,10}\s?\d{1,10})?
Note: Accepts up to 10 digits for the extension code.
/^(([+]{0,1}\d{2})|\d?)[\s-]?[0-9]{2}[\s-]?[0-9]{3}[\s-]?[0-9]{4}$/gm
https://regexr.com/4n3c4
Tested for
+94 77 531 2412
+94775312412
077 531 2412
0775312412
77 531 2412
// Not matching
77-53-12412
+94-77-53-12412
077 123 12345
77123 12345
JS code:
function checkIfValidPhoneNumber(input){
"use strict";
if(/^((\+?\d{1,3})?[\(\- ]?\d{3,5}[\)\- ]?)?(\d[.\- ]?\d)+$/.test(input)&&input.replace(/\D/g,"").length<=15){
return true;
} else {
return false;
}
}
It may be primitive in terms of checking phone number, but it checks that input text is compliant with E.164 recommendation.
Maximum phone length is 15 digits
Country code consists of 1 to 3 digits, could be preceded with plus (could be omitted)
Region (network) code consists of 3 to 5 digits (could be omitted but only if country code is omitted)
It allows some delimiters in phone number and around region code (.- )
For example:
+7(918)000-12-34
911
1-23456-789.10.11.12
all are compliant with E.164 and validated
For all phone number formats:
/^\+?([87](?!95[5-7]|99[08]|907|94[^09]|336)([348]\d|9[0-6789]|7[01247])\d{8}|[1246]\d{9,13}|68\d{7}|5[1-46-9]\d{8,12}|55[1-9]\d{9}|55[138]\d{10}|55[1256][14679]9\d{8}|554399\d{7}|500[56]\d{4}|5016\d{6}|5068\d{7}|502[345]\d{7}|5037\d{7}|50[4567]\d{8}|50855\d{4}|509[34]\d{7}|376\d{6}|855\d{8,9}|856\d{10}|85[0-4789]\d{8,10}|8[68]\d{10,11}|8[14]\d{10}|82\d{9,10}|852\d{8}|90\d{10}|96(0[79]|17[0189]|181|13)\d{6}|96[23]\d{9}|964\d{10}|96(5[569]|89)\d{7}|96(65|77)\d{8}|92[023]\d{9}|91[1879]\d{9}|9[34]7\d{8}|959\d{7,9}|989\d{9}|971\d{8,9}|97[02-9]\d{7,11}|99[^4568]\d{7,11}|994\d{9}|9955\d{8}|996[2579]\d{8}|998[3789]\d{8}|380[345679]\d{8}|381\d{9}|38[57]\d{8,9}|375[234]\d{8}|372\d{7,8}|37[0-4]\d{8}|37[6-9]\d{7,11}|30[69]\d{9}|34[679]\d{8}|3459\d{11}|3[12359]\d{8,12}|36\d{9}|38[169]\d{8}|382\d{8,9}|46719\d{10})$/