A typical regex for phone number validation in Java

I am struggling to get the right regex for phone number validation in my application. I have a regex that accepts only numbers and some special symbols like ()- etc; however, the problem is that it also accepts input made up of symbols alone. For example, it would accept something like ()()()(). I want to modify the regex, or get a whole new one, so that it accepts these symbols only when there is at least one number before and after each symbol.
My requirements are:
Only numbers
Number with combination of special symbols
Each symbol should be followed by a number (before and after) but white spaces are okay
Max length should be 15

In my experience, parentheses only appear around the first group of digits, and there are never fewer than 3 digits in a group. This regex enforces that and prevents multiple consecutive separators, with the exception of a space following a closing paren, as in "(123) 456-7890". I also added support for periods as separators. It allows for 1, 2, or 3 groups of numbers and attempts to enforce an overall range of 7-15 digits, but it errs on the permissive side.
^\\s*(?:\\d{7,15}|\\d{3,12}[\\-.]?\\s?\\d{3,12}[\\-.\\s]?|[(]?\\d{3,9}[)\\-.]?\\s?\\d{3,9}[\\-.\\s]?\\d{3,9})\\s*$
In my environment I have to escape the backslashes; you may not have to, in which case replace each \\ with \. The hyphen is escaped because inside a character class it would otherwise denote a range.
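As a rough illustration of how a pattern along these lines might be used from Java, the sketch below joins the three alternatives with single | operators inside one group and anchors both ends, then checks a few inputs. The class and method names (PhoneCheck, isPhone) are made up for the example, and the sub-patterns keep the same permissive digit ranges described above.

```java
import java.util.regex.Pattern;

class PhoneCheck {
    // One, two, or three digit groups, joined in a single non-capturing group
    // and anchored at both ends. Backslashes are doubled because this is a
    // Java string literal.
    private static final Pattern PHONE = Pattern.compile(
        "^\\s*(?:\\d{7,15}"
        + "|\\d{3,12}[\\-.]?\\s?\\d{3,12}[\\-.\\s]?"
        + "|[(]?\\d{3,9}[)\\-.]?\\s?\\d{3,9}[\\-.\\s]?\\d{3,9})\\s*$");

    // Hypothetical helper: true when the whole input looks like a phone number.
    static boolean isPhone(String input) {
        return PHONE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"(123) 456-7890", "123.456.7890", "1234567", "()()()()", "123--4567"};
        for (String s : samples) {
            System.out.println(s + " -> " + isPhone(s));
        }
    }
}
```

Because matches() requires the entire input to match, a symbols-only string such as ()()()() is rejected, as are doubled separators.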

Related

How to write a Regex that identifies specific letters plus a minimum amount of numbers

I'm trying to write a regex that can locate IDs in a body of text. The ID starts with "DW" and has a minimum of 5 numbers after that. It will only have numbers and no other characters following that.
Correct Examples
DW40056
DW4000057
Wrong Examples
DW4005
DW405679fg
Use word boundaries around DW followed by 4 digits then one or more digits:
\bDW\d{4}\d+\b
The word boundaries prevent matches with input such as ABCDW12345XYZ etc.
Although you could code the digits part as \d{5,}, which is simpler than \d{4}\d+, not all engines support open-ended quantity ranges. Since you haven't indicated the language or tool you're using, this regex is going to work in more situations.
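Since no language was given, here is a minimal sketch in Java of how the word-boundary pattern could be used to pull IDs out of a body of text. The class name DwIdFinder and the sample sentence are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DwIdFinder {
    // "DW" followed by at least five digits, with word boundaries on both sides.
    private static final Pattern DW_ID = Pattern.compile("\\bDW\\d{4}\\d+\\b");

    // Collects every ID found in the text, in order of appearance.
    static List<String> findIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher m = DW_ID.matcher(text);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        String text = "Valid: DW40056 and DW4000057. Invalid: DW4005, DW405679fg, ABCDW12345XYZ.";
        System.out.println(findIds(text));
    }
}
```

Only DW40056 and DW4000057 are collected: the trailing boundary rejects DW405679fg, and the leading boundary rejects ABCDW12345XYZ.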
Try this pattern: DW\d{5,}$
Explanation:
DW is the two characters that the ID starts with
\d matches a digit, 0-9
{5,} means \d must appear five or more times
$ means the end of the string, so this pattern only accepts strings that end with the digits (no other characters after them)

Identifying number sequences with optional punctuation

I am trying to identify account numbers in different formats using a single regex. The following are the different formats I need to detect:
12-34-56-78-9
12-3456-78-9
123-456-789
1-23-45678-9
We need to detect "-" in between the digits of a 9-digit number, but there is no clue where the "-" could appear. As of now, I am creating a regex for each individual format and detecting them separately. Is there a simple regex that detects all of the above in a single shot?
Here you go, that's a pretty simple pattern:
^(?:\d-?){8}\d$
It simply means: find a digit (\d), optionally followed by a hyphen (-?), 8 times in a row ({8}), then the last digit (\d). This prevents a hyphen from being the first or last character, and it also prevents two hyphens in a row.
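For illustration, here is a small Java sketch that runs this pattern against the four formats from the question plus a few inputs that should fail. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class AccountNumberCheck {
    // Eight "digit, optional hyphen" units followed by one final digit.
    private static final Pattern ACCOUNT = Pattern.compile("^(?:\\d-?){8}\\d$");

    // Hypothetical helper: true when the input is 9 digits with single hyphens in between.
    static boolean isAccountNumber(String s) {
        return ACCOUNT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "12-34-56-78-9", "12-3456-78-9", "123-456-789", "1-23-45678-9",
            "-123456789", "123456789-", "12--3456789", "12345678"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isAccountNumber(s));
        }
    }
}
```

The first four samples are accepted; the last four fail because of a leading hyphen, a trailing hyphen, a doubled hyphen, and too few digits respectively.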

Regex to detect filling character length with periods

I'm trying to build some regex that would detect when someone is trying to "fill out" their username with dots.
There are a few other requirements:
username must contain only letters, numbers and dots
username must start and end with a letter or number
but not more than one consecutive dot
minimum of 6 characters (letters and numbers)
e.g.:
a.b.c.d.e.6 is allowed (not caught) because it has 6 characters
a.b.c.d.5 is not (is caught) because it does not have the prerequisite 6 characters
The regex is built so that a match means the username is rejected.
What I have thus far is:
/[^a-z0-9.]|^\.|\.$|\.{2,}|\S{31,}|^\S{0,5}$/i
This catches:
any characters that aren't letters, numbers, dots
can't start with a dot
can't end with a dot
can't have 2 or more consecutive dots
can't have 31 or more characters
can't have 5 or fewer characters
I've tried dozens of different ways to get that last check in place, but they've all either broken the entire check, caught the allowable (a.b.c.d.e.6) or just not worked.
The one that I've come closest with is:
(\.{1}[a-z0-9]{1,}){1,3}\S{1,}$
The problem with this is that it's also catching 123.456 (which should be allowed / not caught)
other examples of character strings that it should catch:
asdf.g
a.sdfg
a.sdf.g
as.df.g
I'm trying to do this using only regex, without having to pre-format it using JS.
Ok, after much experimentation I've actually found the answer. It turns out that finding the non-permitted strings was actually easier (for me anyway):
/^(\w\.?){4}\w$/
same expression expanded:
/^\w\.?\w\.?\w\.?\w\.?\w$/
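This will catch any string made up of exactly 5 characters interspersed with dots (an optional dot after each of the first four). Shorter dotted strings such as a.b.c are not matched by it; using {0,4} instead of {4} would cover those as well.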
The full regex that I'm using also catches:
Strings of 31 or more characters (alphanumeric and periods).
Any characters that are not alphanumeric and periods.
Any string starting with a period
Any string ending with a period
Any string that has 2 or more consecutive periods
And a newcomer to the list: Any string that contains a run of 8 or more consecutive digits.
/^(\w\.?){4}\w$|^\w{0,5}$|\w{31,}|[^a-z0-9.]|^\.|\.$|\.{2,}|\d{8,}/i
I've tested this with all the possible combinations that I can think of on regex101 here: https://regex101.com/r/xI7wZ3/1
And it works! (yay)
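The original is a JavaScript regex literal, but the same idea can be sketched in Java to check the examples from the question. The class name UsernameFilter and the method isRejected are invented here; find() stands in for an unanchored JS test(), and CASE_INSENSITIVE stands in for the /i flag.

```java
import java.util.regex.Pattern;

class UsernameFilter {
    // Same alternation as the full regex above: any match means "reject".
    private static final Pattern REJECT = Pattern.compile(
        "^(\\w\\.?){4}\\w$|^\\w{0,5}$|\\w{31,}|[^a-z0-9.]|^\\.|\\.$|\\.{2,}|\\d{8,}",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true when the username should be rejected.
    static boolean isRejected(String username) {
        return REJECT.matcher(username).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "a.b.c.d.e.6", "123.456",            // should be allowed
            "a.b.c.d.5", "asdf.g", "a.sdfg",     // should be caught
            "a.sdf.g", "as.df.g", "abc..def"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (isRejected(s) ? "caught" : "allowed"));
        }
    }
}
```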

How to optimise this regex to match string (1234-12345-1)

I've got this RegEx example: http://regexr.com?34hihsvn
I'm wondering if there's a more elegant way of writing it, or perhaps a more optimised way?
Here are the rules:
Digits and dashes only.
Must not contain more than 10 digits.
Must have two hyphens.
Must have at least one digit between each hyphen.
Last number must only be one digit.
I'm new to this so would appreciate any hints or tips.
In case the link expires, the text to search is
----------
22-22-1
22-22-22
333-333-1
333-4444-1
4444-4444-1
4444-55555-1
55555-4444-1
666666-7777777-1
88888888-88888888-1
1-1-1
88888888-88888888-22
22-333-
333-22
----------
My regex is: \b((\d{1,4}-\d{1,5})|(\d{1,5}-\d{1,4}))-\d{1}\b
I'm using this site for testing: http://gskinner.com/RegExr/
Thanks for any help,
Nick
Here is a regex I came up with:
(?=\b[\d-]{3,10}-\d\b)\b\d+-\d+-\d\b
This uses a look-ahead to validate the overall length before attempting the match: it looks for between 3 and 10 characters from the class [\d-], followed by a dash and a digit. After that comes the actual match, which confirms that the string really has the form digits(dash)digits(dash)digit.
From your sample strings this regex matches:
22-22-1
333-333-1
333-4444-1
4444-4444-1
4444-55555-1
55555-4444-1
1-1-1
It also matches the following strings:
22-7777777-1
1-88888888-1
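To see the look-ahead version at work, here is a rough Java sketch that scans the sample text from the question and collects the matches. The class name HyphenatedCodeScan is made up, and the sample lines are joined with newlines as an assumption about how the text is laid out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenatedCodeScan {
    // The look-ahead caps the first two groups (plus their hyphen) at 10 characters;
    // the rest enforces the digits-dash-digits-dash-digit shape.
    private static final Pattern CODE = Pattern.compile(
        "(?=\\b[\\d-]{3,10}-\\d\\b)\\b\\d+-\\d+-\\d\\b");

    // Collects every match found in the text, in order of appearance.
    static List<String> findCodes(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CODE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "22-22-1", "22-22-22", "333-333-1", "333-4444-1", "4444-4444-1",
            "4444-55555-1", "55555-4444-1", "666666-7777777-1",
            "88888888-88888888-1", "1-1-1", "88888888-88888888-22",
            "22-333-", "333-22");
        findCodes(text).forEach(System.out::println);
    }
}
```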
Your regexp only allows a first and second group of digits with a maximum length of 5. Therefore, valid strings like 1-12345678-1 or 123456-1-1 won't be matched.
This regexp works for the given requirements:
\b(?:\d\-\d{1,8}|\d{2}\-\d{1,7}|\d{3}\-\d{1,6}|\d{4}\-\d{1,5}|\d{5}\-\d{1,4}|\d{6}\-\d{1,3}|\d{7}\-\d{1,2}|\d{8}\-\d)\-\d\b
You can use this with the m modifier (switch the multiline mode on):
^\d(?!.{12})\d*-\d+-\d$
or this one without the m modifier:
\b\d(?!.{12})\d*-\d+-\d\b
By design, these two patterns require at least three digits separated by hyphens, so there is no need for a minimum-length quantifier such as {5,n} anywhere.
The patterns are also built to fail fast:
I chose to start them with a digit \d, so that every start of line or word boundary not followed by a digit is discarded immediately. Also, because exactly one digit has been consumed, I know how much of the string may remain.
Then I test the upper limit of the string length with a negative lookahead that checks whether there is one character more than the maximum allowed (if there are 12 characters after this position, the string has at least 13 characters). There is no need for anything more descriptive than the dot metacharacter here; the goal is just to test the length quickly.
Finally, I describe the rest of the string without doing anything special. That is probably the slowest part of the pattern, but it doesn't matter, since the overwhelming majority of unsuitable positions have already been discarded.
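A quick way to check the length cap is to try strings right at the limit. The sketch below is in Java and validates one string at a time, so the anchored form is used with matches() and no multiline flag is needed; the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

class LengthCappedCode {
    // After the first digit, the negative lookahead rejects any input with
    // 12 or more characters remaining, i.e. more than 12 characters in total
    // (10 digits plus 2 hyphens).
    private static final Pattern CODE = Pattern.compile("^\\d(?!.{12})\\d*-\\d+-\\d$");

    // Hypothetical helper: true when the whole string is a valid code.
    static boolean isValid(String s) {
        return CODE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1-1-1",
            "1-88888888-1",   // 10 digits: at the limit
            "12-88888888-1",  // 11 digits: one over
            "22-22-22",       // last group has two digits
            "22-333-"         // missing last digit
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```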

Preparing number using abbreviations

RegEx for BMHT in a sequence is my previous post.
I'm looking to build a number using abbreviations, and of course using regex.
Now I know how to validate a number with BMTH abbreviations.
Now my next and final target is to build a number using the abbreviations.
e.g. -2T2H22.55 should be displayed as -2,222.55
-2M2H22.63 should be displayed as -2,000,222.63
Help appreciated.
Flex's scripting language, ActionScript, is an ECMAScript implementation like JavaScript, so regex literals have to be delimited with slashes, for example: /^(?:\d+B)?(?:\d{1,3}M)?(?:\d{1,3}T)?(?:\d{1}H)?(\.[0-9]*)?/.
But that regex still has some problems. For one thing, you don't account for the minus sign or the two digits after the hundreds place. And, while the decimal point may be optional, if it is present you should require it to be followed by at least one digit (so +, not * in that last group).
Finally, you'll need to capture the various components so you can use them to construct the number. Here's my result:
/^(-?)(?:(\d+)B)?(?:(\d{1,3})M)?(?:(\d{1,3})T)?(?:(\d)H)?(\d{0,2})(\.\d+)?$/
The minus sign, if present, will be captured in group $1. The rest of the components will be in groups $2 through $7. You can use them in a callback function to construct the number. Also, notice that everything in this regex is optional; it will match an empty string or just a hyphen, so you'll need to check for that.
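The answer targets ActionScript, but the step of turning the captured groups into a number can be sketched in Java as well. Everything beyond the regex itself is an assumption made for the example: the class name AbbreviatedNumber, the method expand, the reading of B, M, T, H as billions, millions, thousands, and hundreds, and the use of US-style comma grouping.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbbreviatedNumber {
    // Groups: 1 = sign, 2 = billions, 3 = millions, 4 = thousands,
    // 5 = hundreds digit, 6 = tens and units, 7 = decimal part (with the dot).
    private static final Pattern ABBREV = Pattern.compile(
        "^(-?)(?:(\\d+)B)?(?:(\\d{1,3})M)?(?:(\\d{1,3})T)?(?:(\\d)H)?(\\d{0,2})(\\.\\d+)?$");

    // A missing or empty group contributes zero.
    private static long part(String digits) {
        return (digits == null || digits.isEmpty()) ? 0L : Long.parseLong(digits);
    }

    // Hypothetical helper: expands e.g. "-2T2H22.55" to "-2,222.55".
    static String expand(String input) {
        Matcher m = ABBREV.matcher(input);
        // The regex also matches "" and "-", so require at least one digit.
        boolean hasDigit = input.chars().anyMatch(Character::isDigit);
        if (!m.matches() || !hasDigit) {
            throw new IllegalArgumentException("Not an abbreviated number: " + input);
        }
        long whole = part(m.group(2)) * 1_000_000_000L
                   + part(m.group(3)) * 1_000_000L
                   + part(m.group(4)) * 1_000L
                   + part(m.group(5)) * 100L
                   + part(m.group(6));
        String fraction = (m.group(7) == null) ? "" : m.group(7);
        return m.group(1) + String.format(Locale.US, "%,d", whole) + fraction;
    }

    public static void main(String[] args) {
        System.out.println(expand("-2T2H22.55"));
        System.out.println(expand("-2M2H22.63"));
    }
}
```

The whole part is summed as a long, so a very large billions group would overflow; a real implementation might want BigInteger there.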