SSN and 9 number screening issues

This regex looks for Social Security Numbers (SSNs) in several formats, while ignoring obviously invalid SSNs such as 123-45-6789 or 000-00-0000.
This expression should find a Social Security Number that:
Contains any non-numeric delimiter (i.e. ###-##-####, ###.##.####, or ### ## ####)
It should also catch 9 digits in sequence with no delimiter, but bounded by whitespace (i.e. `text ######### text`, or `### ######### ###`)
This expression will ignore a Social Security Number that:
Contains all zeroes in any specific group (i.e. 000-##-####, ###-00-####, or ###-##-0000)
Begins with 666
Begins with any value from 900-999
Is equal to 078-05-1120 (due to the Woolworth's Wallet Fiasco)
Is equal to 219-09-9999 (appeared in an advertisement for the Social Security Administration)
Contains all matching values (i.e. 000-00-0000, 111-11-1111, 222-22-2222, etc.)
Contains all incrementing values (i.e. 123-45-6789)
Regex
(#"(?!\b(\d)\1+\D?(\d)\1+\D?(\d)\1+\b)(?!123\D?45\D?6789|219\D?09\D?9999|078\D?05\D?1120)(?!666|000|9\d{2})(?<!\d)\d{3}\D?(?!00)\d{2}\D?(?!0{4})\d{4}(?!\d)(?<!\d{5}-\d{4})",
The problem is that the pattern also catches other entries that merely resemble SSNs. We need it to be specific enough that entries like these aren't caught:
(xxxx) xxx-xx-xxxx
684072943 (an order number, etc.)
FA300217F0090
Potential Match #1:--------------- nt: ex: 201[[71230 0821]] am ex: 201[[71230 0821]] am 26 JUNE 2012 ---------------Potential Match #2:--------------- am ex: 201[[71230 0821]] am 26 JUNE 2012 DTG (date time group)
"[[ 210v13:2012]],"
Any ideas?

You can use \D? to match any non-digit as your delimiter. This would be a simpler SSN validator:
^(?!219\D?09\D?9999|078\D?05\D?1120)(?!666|000|9\d{2})\d{3}\D?(?!00)\d{2}\D?(?!0{4})\d{4}$
This article might be helpful: http://rion.io/2013/09/10/validating-social-security-numbers-through-regular-expressions-2/
The article also gives a more over-the-top solution, which may be what you are looking for:
^(?!\b(\d)\1+\D?(\d)\1+\D?(\d)\1+\b)(?!123\D?45\D?6789|219\D?09\D?9999|078\D?05\D?1120)(?!666|000|9\d{2})\d{3}\D?(?!00)\d{2}\D?(?!0{4})\d{4}$
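As a rough illustration, the stricter pattern can be tried out in Python. This is only a sketch: the names SSN_STRICT and looks_like_ssn are made up for the example, the backreferences are written as \1 and the digit classes as \d, and the pattern is anchored to the whole string rather than searched for inside longer text.

```python
import re

# Stricter screening pattern, split across lines for readability.
SSN_STRICT = re.compile(
    r"^(?!\b(\d)\1+\D?(\d)\1+\D?(\d)\1+\b)"                 # repeated-digit values such as 111-11-1111
    r"(?!123\D?45\D?6789|219\D?09\D?9999|078\D?05\D?1120)"  # well-known bogus numbers
    r"(?!666|000|9\d{2})\d{3}"                              # first group: not 666, 000 or 900-999
    r"\D?(?!00)\d{2}"                                       # second group: not 00
    r"\D?(?!0{4})\d{4}$"                                    # third group: not 0000
)


def looks_like_ssn(text: str) -> bool:
    """Return True if the whole string passes the screening pattern."""
    return SSN_STRICT.match(text) is not None


for sample in ["457-55-5462", "457 55 5462", "123-45-6789",
               "078-05-1120", "111-11-1111", "666-12-3456", "684072943"]:
    print(sample, looks_like_ssn(sample))
```

Note that a bare nine-digit order number such as 684072943 still passes, which is exactly the kind of false positive the question describes; a regex alone cannot tell it apart from an undelimited SSN.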

Related

Check if contains four digits year number in apple script

I am working on a file-renaming AppleScript project. Here is an example filename: The.Fantasy.1997.DVDRip.XviD-ETRG.avi.
I want to check whether the filename contains a four-digit year. In this case, it's 1997. The year MUST begin with 19 or 20 and MUST contain four digits.
If the result is true I will do something; if false I will do something else.
I tried to use regex but couldn't find a solution; it's beyond my ability. I'm looking for help here. Thanks a million.

If you want to avoid regex completely, do something like below, using text item delimiters:
(*
This first bit breaks the string up into a list of words by cutting the string
at the period delimiter.
*)
set tid to my text item delimiters
set my text item delimiters to "."
set bits_list to text items of file_name_string
set my text item delimiters to tid
(*
This repeat loop goes though the list of words and tests them (first) to see
if it can be converted to an integer, and (second) whether the number is between
1900 and 2100. If so, it chooses it as the year.
*)
repeat with this_item in bits_list
    try
        set possibleYear to this_item as integer
        if possibleYear ≥ 1900 and possibleYear < 2100 then
            -- do what you want with the year value here
            exit repeat
        end if
    end try
end repeat
Of course, this will not work properly if there's a number in the name (e.g., "2001.A.Space.Odyssey.1968.avi") or if a file name has different delimiters (e.g., a space or a dash). But you'd run into those problems using regex as well, so...

Since you only wish to check whether the filename contains a four-digit year within the range 1900-2099, you can do this very simply by defining a handler like so:
on hasYearInTitle(filmTitle as text)
    repeat with yyyy from 1900 to 2099
        if yyyy is in the filmTitle then return true
    end repeat
    return false
end hasYearInTitle
Then you can call this handler and pass it a film title, like so:
hasYearInTitle("The.Fantasy.1997.DVDRip.XviD-ETRG.avi") --> true
hasYearInTitle("The.Fantasy.197.DVDRip.XviD-ETRG.avi") --> false
hasYearInTitle("2001.A.Space.Odyssey.1968.avi") --> true
hasYearInTitle("2001.A.Space.Odyssey.avi") --> true (hm...)
As a side note, films indexed by newznab servers follow a strict file-naming protocol that allows a media server (on your machine) to parse the name easily and extract information quickly, namely (as seen in your example file name): the film's title, the film's release date, the source material, the encoding quality, the encoding format (codec), the release group, and the containing file format.
Although some filenames contain more information and some less, the fields should always appear in a set order. This makes them very simple to parse yourself should you need to, but if you're looking to create an organised media library, you would be best off using a media server, of which there are excellent, freeware, long-standing options available for macOS and pretty much any other operating system.

The regex .+\.(?:19|20)\d{2}\..+ should do it
The breakdown:
.+ 1 or more any characters
\. An actual dot
(?:19|20) The string "19" or "20" (non-capturing group)
\d{2} Exactly two digits
\. An actual dot
.+ 1 or more any characters
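To see how the pattern behaves on the file names from this thread, here is a small Python sketch. The capture group around the year and the helper name find_year are additions for illustration, not part of the answer's regex, and the sketch says nothing about how the pattern would be invoked from AppleScript.

```python
import re

# The pattern from the breakdown, with an assumed capture group around the year.
YEAR_PATTERN = re.compile(r".+\.((?:19|20)\d{2})\..+")


def find_year(file_name: str):
    """Return the year as an int, or None if no dot-delimited 19xx/20xx is present."""
    m = YEAR_PATTERN.fullmatch(file_name)
    return int(m.group(1)) if m else None


print(find_year("The.Fantasy.1997.DVDRip.XviD-ETRG.avi"))
print(find_year("The.Fantasy.197.DVDRip.XviD-ETRG.avi"))
print(find_year("2001.A.Space.Odyssey.1968.avi"))
print(find_year("2001.A.Space.Odyssey.avi"))
```

Because the pattern requires a dot on both sides of the year, a leading "2001" in the title is not mistaken for the year, and the greedy leading .+ means the last dot-delimited year in the name is the one captured.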

How to create a regex pattern in VBA to extract dates from a string and exclude false matches

I am trying to use Regex to parse a series of strings to extract one or more text dates that may be in multiple formats. The strings will look something like the following:
24 Aug 2016: nno-emvirt010a/b; 16 Aug 2016 nnt-emvirt010a/b nnd-emvirt010a/b COSI-1.6.5
24.16 nno-emvirt010a/b nnt-emvirt010a/b nnd-emvirt010a/b EI.01.02.03\
9/23/16: COSI-1.6.5 Logs updated at /vobs/COTS/1.6.5/files/Status_2016-07-27.log, Status_2016-07-28.log, Status_2016-08-05.log, Status_2016-08-08.log
I am not concerned about validating the individual date fields; just extracting the date string. The part I am unable to figure out is how to not match on number sequences that match the pattern but aren’t dates (‘1.6.5’ in ex. (1) and 01.02.03 in ex. (2)) and dates that are part of a file name (2016-07-27 in ex. (3)). In each of these exception cases in my input data, the initial numbers are preceded by either a period(.), underscore (_) or dash (-), but I cannot determine how to use this to edit the pattern syntax to not match these strings.
The pattern I have that partially works is below. It will only ignore the non date matches if it starts with 1 digit as in example 1.
/[^_\.\(\/]\d{1,4}[/\-\.\s*]([1-9]|0[1-9]|[12][0-9]|3[01]|[a-z]{3})[/\-\.\s*]\d{1,4}/ig

I am not sure whether this works in VBA, so please check. This page gives many options: https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
^(?:
# m/d or mm/dd
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
# d/m or dd/mm
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy or /yyyy
/(?:[0-9]{2})?[0-9]{2}$

According to the test strings you've presented, you can use the following regex
See this regex in use here
(?<=[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
This regex ensures that specific date formats are met and that each match is preceded either by nothing (the beginning of the string) or by a character that is not a letter (a-z, A-Z), a digit (0-9) or a dot (.). The date formats that will be matched are:
24 Aug 2016
24.16
9/23/16
The regex could be further manipulated to ensure numbers are in the proper range according to days/month, etc., however, I don't feel that is really necessary.
Edits
Edit 1
Since VBA doesn't support lookbehinds, you can use the following. The date is in capture group 1.
(?:[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
Edit 2
As per bulbus's comment below
(?:[^\w.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d{2,4})|(?:(?:\d{1,2}\/){2}\d{2,4})|(?:\d{2,4}(?:-\d{2}){2})|\d{2}\.\d{2})
Took liberty to edit that a bit.
1. Replaced [^a-zA-Z\d.] with [^\w.], which comes with the added advantage of excluding dates such as _2016-07-28.log
2. Due to 1, removed the trailing condition (?=[^a-zA-Z\d.]).
3. Forced year digits from \d+ to \d{2,4}
Edit 3
Due to added conditions of the regex, I've made the following edits (to improve upon both previous edits). As per the OP:
The edited pattern above works in all but 2 cases:
it does not find dates with the year first (ex. 2016/07/11)
if the date is contained within parenthesis in the string, it returns the left parenthesis as part of the date (ex. match = (8/20/2016)
Can you provide the edit to fix these?
In the below regexes, I've changed years to \d+ in order for it to work on any year greater than or equal to 0.
See the code in use here
(?:[^\w.]|^)((?:\d{1,2}\s+[A-Z][a-z]{2}\s+\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:\/\d{1,2}){2})|(?:\d+(?:-\d{2}){2})|\d{2}\.\d+)
This regex adds the possibility of dates in the XXXX/XX/XX format, where the year may appear first.
The reason you are getting ( as a match before the regex is the nature of the Full Match. You need to, instead, grab the value of the first capture group and not the whole regex result. See this answer on how to grab submatches from a regex pattern in VBA.
Also, note that any additional date formats you need to catch need to be explicitly set in the regex. Currently, the regex supports the following date formats:
\d{1,2}\s+[A-Z][a-z]{2}\s+\d+
12 Apr 17
12 Apr 2017
(?:\d{1,2}\/){2}\d+
1/4/17
01/04/17
1/4/2017
01/04/2017
\d+(?:\/\d{1,2}){2}
17/04/01
2017/4/1
2017/04/01
17/4/1
\d+(?:-\d{2}){2}
17-04-01
2017-04-01
\d{2}\.\d+ - Although I'm not sure what this date format is even used for and how it could be considered efficient if it's missing month
24.16
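The point about reading the first capture group rather than the full match can be checked outside VBA. The sketch below uses Python's re module with the Edit 3 pattern; DATE_PATTERN and extract_dates are illustrative names, and Python's \w is Unicode-aware, so behaviour on non-ASCII input may differ from VBScript's engine.

```python
import re

# Edit 3 pattern; the only capturing group (group 1) holds the date itself.
DATE_PATTERN = re.compile(
    r"(?:[^\w.]|^)("
    r"(?:\d{1,2}\s+[A-Z][a-z]{2}\s+\d+)"   # 24 Aug 2016
    r"|(?:(?:\d{1,2}/){2}\d+)"             # 9/23/16
    r"|(?:\d+(?:/\d{1,2}){2})"             # 2016/07/11
    r"|(?:\d+(?:-\d{2}){2})"               # 2017-04-01
    r"|\d{2}\.\d+"                         # 24.16
    r")"
)


def extract_dates(text: str):
    """Return every date found, taken from capture group 1 of each match."""
    return [m.group(1) for m in DATE_PATTERN.finditer(text)]


line1 = ("24 Aug 2016: nno-emvirt010a/b; 16 Aug 2016 nnt-emvirt010a/b "
         "nnd-emvirt010a/b COSI-1.6.5")
line3 = ("9/23/16: COSI-1.6.5 Logs updated at "
         "/vobs/COTS/1.6.5/files/Status_2016-07-27.log")
print(extract_dates(line1))
print(extract_dates(line3))

m = DATE_PATTERN.search("(8/20/2016)")
print(repr(m.group(0)), repr(m.group(1)))  # full match vs. group 1
```

The last line shows why the left parenthesis appears in the full match but not in group 1: the leading non-capturing group consumes the character before the date.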

Regex to validate only US and India retail number

I am trying to write a regex that meets the requirements below:
Validates US and Indian retail phone numbers.
Excludes special-purpose/business phone numbers in both countries, i.e. numbers starting with 800, 888, 877, 866 or 900. US numbers must have at least 10 digits. There may be more guidelines; the above is just an example.
It should accept special characters such as (, ), +, 1, 0 if they are included; if the number satisfies all these points, it should be a valid phone number.
If preceded by STD or ISD, consider it valid.
Landline and mobile numbers should both be valid.
I looked for others with the same requirements, but the solutions I found serve different requirements and are not exactly what I am looking for.

Without a definitive exclusion/inclusion list of the phone numbers you want to match, here is a "template" regular expression that you could use to match US numbers:
(?:^|\b)(\+?1[ -.\/]?)?\(?(?!37|950|958|959|96|976)[2-9]([0-8])(?!\2)\d(?:\) ?|[ -.\/])?[2-9](?!11)\d\d[ -.]?\d{4}(?:$|\b)
A break-down:
(?:^|\b): Start of string or break. This prevents, for example, the match of digits to start in the middle of a longer series of digits;
(\+?1[ -.\/]?)?: this matches an optional prefix of the US country code (i.e. 1), and accepts values like +1, 1/, +1, 1;
\(?: an optional opening bracket for the region code;
(?!37|950|958|959|96|976): exclusion list of region codes. When only 2 digits are given, any region code starting with those is rejected -- you'll need to extend this list to identify other "special business" phone numbers you want to exclude;
[2-9]: first digit of region code; cannot be 0 or 1;
([0-8]): second digit of region code; cannot be 9;
(?!\2)\d: third digit of region code; cannot be the same as the second digit (\2 refers to the second match group);
(?:\) ?|[ -.\/])?: optional separator: ),-,.,/, or space. If ), it can optionally be followed by a space;
[2-9]: first digit of exchange code; cannot be 0 or 1;
(?!11): exclusion for second and third digits of exchange code -- they cannot both be 1 at the same time;
\d\d: second and third digit of exchange code; no further limitations;
[ -.]?: optional separator; can be -, . or space;
\d{4}: four digit customer number; no restrictions.
(?:$|\b): End of string or break. This prevents, for example, the match of digits to stop in the middle of a longer series of digits;
Here is an online regex test.
I suppose with the above as inspiration, you could fine-tune it to your expectations, and add the Indian formats in the same manner. You can use the | operator to separate the two sub-regular expressions you will have, like (US|IND), where you need to replace those two arguments by real expressions of course.
To capture also the prefix STD or ISD, you can insert the following in the above regex, right after the break test:
(?:STD\b\s*|ISD\b\s*|)
...which matches these optional words followed by optional spaces.
However, the complexity of the final regex will increase the more precise you want to match and exclude invalid numbers. For example, if you would want to validate against the All India STD Code List, then your regular expression would get very long and hard to manage.
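For experimenting with the US template, a Python sketch follows. It copies the template as given, inserts the optional STD/ISD fragment right after the leading break test, and applies the result to a whole string with fullmatch; the names US_RETAIL and is_us_retail are invented for the example, and the Indian half is deliberately left out.

```python
import re

# Optional STD/ISD prefix, inserted right after the leading break test.
STD_ISD = r"(?:STD\b\s*|ISD\b\s*|)"

US_RETAIL = re.compile(
    r"(?:^|\b)" + STD_ISD +
    r"(\+?1[ -.\/]?)?\(?(?!37|950|958|959|96|976)[2-9]([0-8])(?!\2)\d"
    r"(?:\) ?|[ -.\/])?[2-9](?!11)\d\d[ -.]?\d{4}(?:$|\b)"
)


def is_us_retail(entry: str) -> bool:
    """Return True if the whole entry matches the US template."""
    return US_RETAIL.fullmatch(entry) is not None


for sample in ["(212) 555-0123", "+1 212 555 0123", "STD 212 555 0123",
               "800-555-0123", "900-555-0123", "212-911-0123"]:
    print(sample, is_us_retail(sample))
```

Codes such as 800, 888, 877, 866 and 900 are rejected here by the rule that the second and third digits of the region code may not be equal, not by the explicit exclusion list, so that list still needs extending for other special-purpose ranges.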

RegEx: Uk Landlines, Mobile phone numbers

I've been struggling to find a suitable solution:
I need a regex that will match all UK landline and mobile phone numbers.
So far this one appears to cover most UK numbers:
^0\d{2,4}[ -]{1}[\d]{3}[\d -]{1}[\d -]{1}[\d]{1,4}$
However, it does not match mobile numbers, or phone numbers written as a single solid block such as 01234567890.
Could anyone help me create the required regex?

[\d -]{1}
is blatantly incorrect: a digit OR a space OR a hyphen.
01000 123456
01000 is not a valid UK area code. 123456 is not a valid local number.
It is important that test data be real area codes and real number ranges.
^\s*(?(020[7,8]{1})?[ ]?[1-9]{1}[0-9{2}[ ]?[0-9]{4})|(0[1-8]{1}[0-9]{3})?[ ]?[1-9]{1}[0-9]{2}[ ]?[0-9]{3})\s*|[0-9]+[ ]?[0-9]+$
The above pattern is garbage for many different reasons.
[7,8] matches 7 or comma or 8. You don't need to match a comma.
London numbers also begin with 3 not just 7 or 8.
London 020 numbers aren't the only 2+8 format numbers; see also 023, 024, 028 and 029.
[1-9]{1} simplifies to [1-9]
[ ]? simplifies to \s?
Having found the initial 0 once, why keep searching for it again and again?
^(0....|0....|0....|0....)$ simplifies to ^0(....|....|....|....)$
Seriously. ([1]|[2]|[3]|[7]){1} simplifies to [1237] here.
UK phone numbers use a variety of formats: 2+8, 3+7, 3+6, 4+6, 4+5, 5+5, 5+4. Some users don't know which format goes with which number range and might use the wrong one on input. Let them do that; you're interested in the DIGITS.
Step 1: Check the input format looks valid
Make sure that the input looks like a UK phone number. Accept various dial prefixes, +44, 011 44, 00 44 with or without parentheses, hyphens or spaces; or national format with a leading 0. Let the user use any format they want for the remainder of the number: (020) 3555 7788 or 00 (44) 203 555 7788 or 02035-557-788 even if it is the wrong format for that particular number. Don't worry about unbalanced parentheses. The important part of the input is making sure it's the correct number of digits. Punctuation and spaces don't matter.
^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)(?:\d{5}\)?[\s-]?\d{4,5}|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}|\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$
The above pattern matches optional opening parentheses, followed by 00 or 011 and optional closing parentheses, followed by an optional space or hyphen, followed by optional opening parentheses. Alternatively, the initial opening parentheses are followed by a literal + without a following space or hyphen. Any of the previous two options are then followed by 44 with optional closing parentheses, followed by optional space or hyphen, followed by optional 0 in optional parentheses, followed by optional space or hyphen, followed by optional opening parentheses (international format). Alternatively, the pattern matches optional initial opening parentheses followed by the 0 trunk code (national format).
The previous part is then followed by the NDC (area code) and the subscriber phone number in 2+8, 3+7, 3+6, 4+6, 4+5, 5+5 or 5+4 format with or without spaces and/or hyphens. This also includes provision for optional closing parentheses and/or optional space or hyphen after where the user thinks the area code ends and the local subscriber number begins. The pattern allows any format to be used with any GB number. The display format must be corrected by later logic if the wrong format for this number has been used by the user on input.
The pattern ends with an optional extension number arranged as an optional space or hyphen followed by x, ext and optional period, or #, followed by the extension number digits. The entire pattern does not bother to check for balanced parentheses as these will be removed from the number in the next step.
At this point you don't care whether the number begins 01 or 07 or something else. You don't care whether it's a valid area code. Later steps will deal with those issues.
Step 2: Extract the NSN so it can be checked in more detail for length and range
After checking the input looks like a GB telephone number using the pattern above, the next step is to extract the NSN part so that it can be checked in greater detail for validity and then formatted in the right way for the applicable number range.
^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)(44)\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)([1-9]\d{1,4}\)?[\s\d-]+)(?:((?:x|ext\.?\s?|\#)\d+)?)$
Use the above pattern to extract the '44' from $1 to know that international format was used, otherwise assume national format if $1 is null.
Extract the optional extension number details from $3 and store them for later use.
Extract the NSN (including spaces, hyphens and parentheses) from $2.
Step 3: Validate the NSN
Remove the spaces, hyphens and parentheses from $2 and use further RegEx patterns to check the length and range and identify the number type.
These patterns will be much simpler, since they will not have to deal with various dial prefixes or country codes.
The pattern to match valid mobile numbers is therefore as simple as
^7([45789]\d{2}|624)\d{6}$
Premium rate is
^9[018]\d{8}$
There will be a number of other patterns for each number type: landlines, business rate, non-geographic, VoIP, etc.
By breaking the problem into several steps, a very wide range of input formats can be allowed, and the number range and length for the NSN checked in very great detail.
Step 4: Store the number
Once the NSN has been extracted and validated, store the number with country code and all the other digits with no spaces or punctuation, e.g. 442035557788.
Step 5: Format the number for display
Another set of simple rules can be used to format the number with the requisite +44 or 0 added at the beginning.
The rule for numbers beginning 03 is
^44(3\d{2})(\d{3})(\d{4})$
formatted as
0$1 $2 $3 or as +44 $1 $2 $3
and for numbers beginning 02 is
^44(2\d)(\d{4})(\d{4})$
formatted as
(0$1) $2 $3 or as +44 $1 $2 $3
The full list is quite long. I could copy and paste it all into this thread, but it would be hard to maintain that information in multiple places over time. For the present the complete list can be found at: http://aa-asterisk.org.uk/index.php/Regular_Expressions_for_Validating_and_Formatting_GB_Telephone_Numbers
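Steps 2 to 5 can be strung together roughly as follows. This Python sketch uses only the patterns quoted in this answer (the Step 2 extraction pattern, the mobile and premium-rate checks, and the 02 and 03 display rules); the function names are hypothetical, and a real implementation would need the full set of range and format rules from the linked list.

```python
import re

# Step 2 pattern: group 1 = "44" if international, group 2 = NSN, group 3 = extension.
EXTRACT = re.compile(
    r"^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)(44)\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)"
    r"([1-9]\d{1,4}\)?[\s\d-]+)(?:((?:x|ext\.?\s?|\#)\d+)?)$"
)
MOBILE = re.compile(r"^7([45789]\d{2}|624)\d{6}$")
PREMIUM = re.compile(r"^9[018]\d{8}$")

# Step 5 display rules quoted above (only the 03 and 02 ranges).
FORMAT_RULES = [
    (re.compile(r"^44(3\d{2})(\d{3})(\d{4})$"), r"0\g<1> \g<2> \g<3>"),
    (re.compile(r"^44(2\d)(\d{4})(\d{4})$"), r"(0\g<1>) \g<2> \g<3>"),
]


def extract_nsn(raw: str):
    """Steps 2-3: return (nsn_digits, extension) or None if the input does not parse."""
    m = EXTRACT.match(raw.strip())
    if m is None:
        return None
    nsn = re.sub(r"[\s()-]", "", m.group(2))  # drop spaces, hyphens, parentheses
    return nsn, m.group(3)


def to_storage(nsn: str) -> str:
    """Step 4: country code plus NSN, no punctuation."""
    return "44" + nsn


def format_national(stored: str):
    """Step 5: apply the first matching display rule, or None if no rule is known."""
    for pattern, template in FORMAT_RULES:
        if pattern.match(stored):
            return pattern.sub(template, stored)
    return None


nsn, ext = extract_nsn("(020) 3555 7788")
print(nsn, ext, to_storage(nsn), format_national(to_storage(nsn)))
print(bool(MOBILE.match(extract_nsn("+44 (0) 7700 900123")[0])))
```

Keeping extraction, validation, storage and display as separate functions mirrors the step-by-step split described above, so each regex stays short enough to review on its own.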

Given that people sometimes write their numbers with spaces in random places, you might be better off ignoring the spaces all together - you could use a regex as simple as this then:
^0(\d ?){10}$
This matches:
01234567890
01234 234567
0121 3423 456
01213 423456
01000 123456
But it would also match:
01 2 3 4 5 6 7 8 9 0
So you may not like it, but it's certainly simpler.
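A quick way to confirm the lists above is to run the pattern over them, for instance in Python (the variable names are just for this sketch):

```python
import re

# The simple pattern: a leading 0, then ten digits each optionally followed by a space.
SIMPLE_UK = re.compile(r"^0(\d ?){10}$")

samples = [
    "01234567890",
    "01234 234567",
    "0121 3423 456",
    "01213 423456",
    "01000 123456",
    "01 2 3 4 5 6 7 8 9 0",
    "0123 456",  # too short, should not match
]
for s in samples:
    print(repr(s), bool(SIMPLE_UK.match(s)))
```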

Would this regex do?
// using System.Text.RegularExpressions;
/// <summary>
/// Regular expression built for C# on: Wed, Sep 8, 2010, 06:38:28
/// Using Expresso Version: 3.0.2766, http://www.ultrapico.com
///
/// A description of the regular expression:
///
/// [1]: A numbered capture group. [\+44], zero or one repetitions
/// \+44
/// Literal +
/// 44
/// [2]: A numbered capture group. [\s+], zero or one repetitions
/// Whitespace, one or more repetitions
/// [3]: A numbered capture group. [\(?]
/// Literal (, zero or one repetitions
/// [area_code]: A named capture group. [(\d{1,5}|\d{4}\s+?\d{1,2})]
/// [4]: A numbered capture group. [\d{1,5}|\d{4}\s+?\d{1,2}]
/// Select from 2 alternatives
/// Any digit, between 1 and 5 repetitions
/// \d{4}\s+?\d{1,2}
/// Any digit, exactly 4 repetitions
/// Whitespace, one or more repetitions, as few as possible
/// Any digit, between 1 and 2 repetitions
/// [5]: A numbered capture group. [\)?]
/// Literal ), zero or one repetitions
/// [6]: A numbered capture group. [\s+|-], zero or one repetitions
/// Select from 2 alternatives
/// Whitespace, one or more repetitions
/// -
/// [tel_no]: A named capture group. [(\d{1,4}(\s+|-)?\d{1,4}|(\d{6}))]
/// [7]: A numbered capture group. [\d{1,4}(\s+|-)?\d{1,4}|(\d{6})]
/// Select from 2 alternatives
/// \d{1,4}(\s+|-)?\d{1,4}
/// Any digit, between 1 and 4 repetitions
/// [8]: A numbered capture group. [\s+|-], zero or one repetitions
/// Select from 2 alternatives
/// Whitespace, one or more repetitions
/// -
/// Any digit, between 1 and 4 repetitions
/// [9]: A numbered capture group. [\d{6}]
/// Any digit, exactly 6 repetitions
///
///
/// </summary>
public Regex MyRegex = new Regex(
"(\\+44)?\r\n(\\s+)?\r\n(\\(?)\r\n(?<area_code>(\\d{1,5}|\\d{4}\\s+"+
"?\\d{1,2}))(\\)?)\r\n(\\s+|-)?\r\n(?<tel_no>\r\n(\\d{1,4}\r\n(\\s+|-"+
")?\\d{1,4}\r\n|(\\d{6})\r\n))",
RegexOptions.IgnoreCase
| RegexOptions.Singleline
| RegexOptions.ExplicitCapture
| RegexOptions.CultureInvariant
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
//// Replace the matched text in the InputText using the replacement pattern
// string result = MyRegex.Replace(InputText,MyRegexReplace);
//// Split the InputText wherever the regex matches
// string[] results = MyRegex.Split(InputText);
//// Capture the first Match, if any, in the InputText
// Match m = MyRegex.Match(InputText);
//// Capture all Matches in the InputText
// MatchCollection ms = MyRegex.Matches(InputText);
//// Test to see if there is a match in the InputText
// bool IsMatch = MyRegex.IsMatch(InputText);
//// Get the names of all the named and numbered capture groups
// string[] GroupNames = MyRegex.GetGroupNames();
//// Get the numbers of all the named and numbered capture groups
// int[] GroupNumbers = MyRegex.GetGroupNumbers();
Notice how the spaces and dashes are optional and can be part of the number. The pattern is also divided into two named capture groups, area_code and tel_no, to break the number down and make it easier to extract.

Strip all whitespace and non-numeric characters and then do the test. It'll be much, much easier than trying to account for all the possible options around brackets, spaces, etc.
Try the following:
#"^(([0]{1})|([\+][4]{2}))([1]|[2]|[3]|[7]){1}\d{8,9}$"
Starts with 0 or +44 (for international) - I'm sure you could add 0044 if you wanted.
It then has a 1, 2, 3 or 7.
It then has either 8 or 9 digits.
If you want to be even smarter, the following may be a useful reference: http://en.wikipedia.org/wiki/Telephone_numbers_in_the_United_Kingdom

It's not a single regex, but there's sample code from Braemoor Software that is simple to follow and fairly thorough.
The JS version is probably easiest to read. It strips out spaces and hyphens (which I realise you said you can't do) then applies a number of positive and negative regexp checks.

Start by stripping the non-numerics, excepting a + as the first character.
(Javascript)
var tel = document.getElementById("tel").value;
// Keep a leading + or digit, then drop every non-digit from the rest.
tel = tel.substr(0,1).replace(/[^+0-9]/g,'') + tel.substr(1).replace(/[^0-9]/g,'');
The regex below allows, after the international indicator +, any combination of between 7 and 15 digits (the ITU maximum) UNLESS the code is +44 (UK). Otherwise if the string either begins with +44, +440 or 0, it is followed by 2 or 7 and then by nine of any digit, or it is followed by 1, then any digit except 0, then either seven or eight of any digit. (So 0203 is valid, 0703 is valid but 0103 is not valid). There is currently no such code as 025 (or in London 0205), but those could one day be allocated.
/(^\+(?!44)[0-9]{7,15}$)|(^(\+440?|0)(([27][0-9]{9}$)|(1[1-9][0-9]{7,8}$)))/
Its primary purpose is to identify a correct starting digit for a non-corporate number, followed by the correct number of digits to follow. It doesn't deduce if the subscriber's local number is 5, 6, 7 or 8 digits. It does not enforce the prohibition on initial '1' or '0' in the subscriber number, about which I can't find any information as to whether those old rules are still enforced. UK phone rules are not enforced on properly formatted international phone numbers from outside the UK.
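The same strip-then-test approach can be sketched in Python to check the cases mentioned in the description (0203 and 0703 accepted, 0103 rejected). The function names normalise and is_plausible are made up for this example; the regex is the one given above.

```python
import re

UK_OR_INTL = re.compile(
    r"(^\+(?!44)[0-9]{7,15}$)|(^(\+440?|0)(([27][0-9]{9}$)|(1[1-9][0-9]{7,8}$)))"
)


def normalise(tel: str) -> str:
    """Keep a leading + or digit, then drop every non-digit from the rest."""
    return re.sub(r"[^+0-9]", "", tel[:1]) + re.sub(r"[^0-9]", "", tel[1:])


def is_plausible(tel: str) -> bool:
    """Strip the punctuation, then apply the regex to what is left."""
    return UK_OR_INTL.search(normalise(tel)) is not None


for sample in ["0203 555 7788", "0703 555 7788", "0103 555 7788",
               "+44 20 3555 7788", "+1 212 555 0123", "+44 1234"]:
    print(sample, normalise(sample), is_plausible(sample))
```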

After a long search for valid regexen to cover UK cases, I found that the best way (if you're using client side javascript) to validate UK phone numbers is to use libphonenumber-js along with custom config to reduce bundle size:
If you're using NodeJS, generate UK metadata by running:
npx libphonenumber-metadata-generator metadata.custom.json --countries GB --extended
then import and use the metadata with libphonenumber-js/core:
import { isValidPhoneNumber } from "libphonenumber-js/core";
import data from "./metadata.custom.json";
isValidPhoneNumber("01234567890", "GB", data);
CodeSandbox Example

Custom RegEx expression for validating different possibilities of phone number entries?

I'm looking for a custom RegEx expression (that works!) that will validate common phone number entries with an area code (no country code), such as:
111-111-1111
(111) 111-1111
(111)111-1111
111 111 1111
111.111.1111
1111111111
And combinations of these / anything else I may have forgotten.
Also, is it possible to have the RegEx expression itself reformat the entry? So take the 1111111111 and put it in 111-111-1111 format. The regex will most likely be entered in a Joomla / some type of CMS module, so I can't really add code to it aside from the expression itself.

\(?(\d{3})\)?[ .-]?(\d{3})[ .-]?(\d{4})
will match all your examples; after a match, backreference 1 will contain the area code, backreference 2 and 3 will contain the phone number.
I hope you don't need to handle international phone numbers, too.
If the phone number is in a string by itself, you could also use
^\s*\(?(\d{3})\)?[ .-]?(\d{3})[ .-]?(\d{4})\s*$
allowing for leading/trailing whitespace and nothing else.
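Where code is available, the three backreferences make the reformatting part straightforward. A minimal Python sketch, assuming the anchored pattern above and a hypothetical helper named reformat (a CMS field that accepts only a pattern would still need its own replace feature):

```python
import re

PHONE = re.compile(r"^\s*\(?(\d{3})\)?[ .-]?(\d{3})[ .-]?(\d{4})\s*$")


def reformat(entry: str):
    """Return the number as 111-111-1111, or None if the entry does not match."""
    m = PHONE.match(entry)
    return "-".join(m.groups()) if m else None


for sample in ["111-111-1111", "(111) 111-1111", "(111)111-1111",
               "111 111 1111", "111.111.1111", "1111111111", "111-1111"]:
    print(repr(sample), reformat(sample))
```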

Why not just remove spaces, parenthesis, dashes, and periods, then check that it is a number of 10 digits?

Depending on the language in question, you might be better off using a replace-like statement to replace non-numeric characters: ()-/. with nothing, and then just check if what is left is a 10-digit number.
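In Python, for example, that suggestion could look like the sketch below. The set of characters removed (space, parentheses, dash, slash, period) follows the two answers above; the names STRIP and is_ten_digits are illustrative.

```python
import re

# Characters to discard before counting digits.
STRIP = str.maketrans("", "", " ()-/.")


def is_ten_digits(entry: str) -> bool:
    """Remove the separators, then require exactly ten ASCII digits."""
    return re.fullmatch(r"[0-9]{10}", entry.translate(STRIP)) is not None


for sample in ["(111) 111-1111", "111.111.1111", "111-abc-1111", "111-1111"]:
    print(repr(sample), is_ten_digits(sample))
```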
