Regular Expression for matching a phone number - regex

I need a regular expression to match phone numbers. I just want to know if the number is probably a phone number and it could be any phone format, US or international. So I developed a strategy to determine if it matches.
I want it to accept the following characters: 0-9 as well as ,.()- and optionally start with a + (for international numbers). The string should not match if it has any other characters.
I tried this:
/\+?[0-9\/\.\(\)\-]/
But it matches phone numbers that have + in the middle of the number. And it matches numbers that contain alpha chars (I don't want that).
Lastly, I want to set the minimum length to 9 characters.
Any thoughts?
Thanks for any help, I'm obviously not too swift on RegEx stuff :)

Well, you're pretty close. Try this:
^\+?[0-9\/.()-]{9,}$
Without the start and end anchors you allow partial matching, so it can match +123 from the string :-)+123.
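As a rough illustration of the anchoring point, here is a small Python sketch. The names UNANCHORED, ANCHORED and looks_like_phone are made up for this example, and the sample strings are invented.

```python
import re

# Same character class as above, once without and once with anchors.
UNANCHORED = re.compile(r"\+?[0-9/.()-]{9,}")
ANCHORED = re.compile(r"^\+?[0-9/.()-]{9,}$")

def looks_like_phone(text: str) -> bool:
    """Return True only if the whole string fits the anchored pattern."""
    return ANCHORED.search(text) is not None

noisy = ":-)+1(555)123-4567"
# The unanchored pattern finds a phone-like substring inside the noise.
print(UNANCHORED.search(noisy).group())
# The anchored pattern rejects the string because of the leading ":-)".
print(looks_like_phone(noisy))
# A + in the middle is rejected as well, since + is only allowed at the start.
print(looks_like_phone("555+1234567"))
```

Note that this version still accepts nine punctuation characters with no digits at all, which is what the next patterns address.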
If you want a minimum of 9 digits, rather than any characters (so ---.../// isn't valid), you can use:
^\+?[\/.()-]*([0-9][\/.()-]*){9,}$
Or use a lookahead: before matching the string against [0-9/.()-]*, the regex engine checks for (\D*\d){9}, which is a run of 9 digits, each digit possibly preceded by other characters (which we validate afterwards).
^\+?(?=(\D*\d){9})[0-9\/.()-]*$
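To compare the two digit-counting variants side by side, here is a minimal Python sketch. REPEAT, LOOKAHEAD and has_nine_digits are names chosen for this example, and the candidate strings are invented.

```python
import re

# Explicit repetition: nine or more digits, punctuation allowed around them.
REPEAT = re.compile(r"^\+?[/.()-]*([0-9][/.()-]*){9,}$")
# Lookahead: first check that nine digits exist, then validate the characters.
LOOKAHEAD = re.compile(r"^\+?(?=(\D*\d){9})[0-9/.()-]*$")

def has_nine_digits(text: str) -> bool:
    """Return True if text is phone-like and contains at least nine digits."""
    return LOOKAHEAD.match(text) is not None

for candidate in ["+1(555)123-4567", "---...///", "(555)123-45", "12345678a9"]:
    # Both patterns should agree on every candidate.
    print(candidate, bool(REPEAT.match(candidate)), has_nine_digits(candidate))
```

In the last candidate the lookahead alone succeeds (it sees nine digits), but the character class then stops at the letter, so the overall match fails.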

One reason a pattern like this can match alpha characters is an unescaped period, which matches any character when it sits outside a character class, so escape it there. I don't know which editor you are using for this; this is what I would use in Vim:
^+\?[()\-\.]\?\([0-9][\.()\-]\?\)\{3,\}$

jQuery has a plugin for US phone validation. Check this link. You can also see the regular expression in the source code.

Related

Regex not select word with character at the end

I have a simple question.
I need a regular expression to match a hexadecimal number without a colon at the end.
For example:
0x85af6b9d: 0x00256f8a ;some more interesting code
// don't match 0x85af6b9d: at all, but match 0x00256f8a
My expression for a hexadecimal number is 0[xX][0-9A-Fa-f]{1,8}
A version with (?!:) doesn't work, because it will just match 0x85af6b9 (the {1,8} quantifier backtracks by one digit)
Using a $ also isn't possible, since there can be more than one number
Thanks!
Here is one way to do so:
0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])
See the online demo.
We use a negative lookahead to match all hexadecimal numbers without : at the end. Because of {1,8}, it is also necessary to ensure that the entire hexadecimal number is correctly matched. We therefore reuse the character set ([0-9A-Fa-f]) to ensure that the number does not continue.
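A short Python sketch of the difference, using the sample line from the question. NAIVE and STRICT are names invented for this example.

```python
import re

line = "0x85af6b9d: 0x00256f8a ;some more interesting code"

# Lookahead that only forbids a colon: the quantifier can give back one digit.
NAIVE = re.compile(r"0[xX][0-9A-Fa-f]{1,8}(?!:)")
# Lookahead that also forbids another hex digit, so a shortened match is rejected.
STRICT = re.compile(r"0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])")

print(NAIVE.findall(line))   # includes a truncated version of the first number
print(STRICT.findall(line))  # only the number that is not followed by a colon
```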

Regex that matches specific letters preceded by a percent sign and does not allow numbers. "%l%m%p"

I'm having trouble writing a regex that matches a pattern like this "%n%m%p" or "%n:%m%p". Only allow specific letters and each letter must have percent sign in front of it. No numbers allowed.
This regex /%(n|m|p)$/ works but allows numbers in between. For example, "%n3%p%m" matches. How do I disallow any numbers?
The regex %(n|m|p) itself matches either %n or %m or %p. That the numbers are allowed between each of the parts is most likely because of your other code.
You can match the whole string with this regex:
/^(%(n|m|p):{0,1}){0,}$/
Just need to be clear about the exact requirements.
The allowed letters are [nmp]
Each letter has to be preceded by a %
There can be an optional : before %
+ One or more tokens from ^ start to $ end
These requirements won't allow any digit.
^(?::?%[nmp])+$
You can test it at regex101
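For a quick local check instead of an online tester, a minimal Python sketch could look like this. TOKENS and is_valid_format are names made up for the example.

```python
import re

# One or more %n / %m / %p tokens, each optionally preceded by a colon.
TOKENS = re.compile(r"^(?::?%[nmp])+$")

def is_valid_format(text: str) -> bool:
    """Return True if text consists only of the allowed tokens."""
    return TOKENS.match(text) is not None

for sample in ["%n%m%p", "%n:%m%p", "%n3%p%m", "%x", ""]:
    print(repr(sample), is_valid_format(sample))
```

Because the whole string is anchored and only %, :, n, m and p can appear, any digit makes the match fail.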
I can't leave a comment but I can answer, so...
It would help to know what exactly you need from this. Do you need those letters in that order? Do you need exactly 3? Or are you looking for any number of any length with any valid characters in between?
That said, one option if you're matching the entire string is
/^(%[nmp][^\d]*)+$/
which should match any %[nmp] with any character between them that isn't a digit. Note though that this will match a single %n, for example. If you want to match exactly i tokens or at least j tokens, change the + to {i} or {j,} respectively.
As long as it has one of the letters and a percent sign it should match. Just no numbers.
Use the following regex pattern:
%[nmp](?!\d)\b
https://regex101.com/r/CrSnFp/2
(?!\d) is a negative lookahead assertion: %[nmp] matches only if it is not followed by a digit.

regular expression for decimal with fixed total number of digits

Is there a way to write regular expression that will match strings like
(0|[1-9][0-9]*)\.[0-9]+
but with a specified number of numeric characters? For example, for 3 numeric characters it should match "0.12" and "12.3" but not "1.234" or "1.2". I know I can write it something like
(?<![0-9])(([0-9]{1}\.[0-9]{2})|([1-9][0-9]{1})\.[0-9]{1})(?![0-9])
but that becomes quite tedious for large number of digits.
(I know I don't need {1} but it better explains what I'm doing)
^(?=[\d.]{4}$)\d+\.\d+$
You can try this for 3 digits. It can be extended for more: the lookahead length is the number of digits plus one for the dot. See demo.
https://regex101.com/r/bN8dL3/4
or
\b(?=[\d.]{4}\b)\d+\.\d+\b
If you don't want anchors.
You can match them by adding alternation:
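To avoid editing the number by hand for each digit count, the anchored pattern can be generated. This is a minimal Python sketch; fixed_digit_decimal is a hypothetical helper name, and it assumes the lookahead length is always the digit count plus one for the dot.

```python
import re

def fixed_digit_decimal(total_digits: int) -> re.Pattern:
    """Build the anchored lookahead pattern for a given total number of digits."""
    length = total_digits + 1  # all digits plus the single dot
    return re.compile(rf"^(?=[\d.]{{{length}}}$)\d+\.\d+$")

three = fixed_digit_decimal(3)
for sample in ["0.12", "12.3", "1.234", "1.2", "1234"]:
    print(sample, three.match(sample) is not None)
```

The lookahead only fixes the overall length; the \d+\.\d+ part still enforces exactly one dot with digits on both sides.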
\b(?:[0-9]\.[0-9]{2}|[1-9][0-9]\.[0-9])\b
Then, you won't need any start/end string/line anchors.
See demo

fixed number of characters in a regex match

Is there a way to match a fixed number of characters in a fixed length string via regex?
For example, I want to match all strings where the length of the string is 5 and there are exactly 3 letters and 2 exclamation marks (!). The exclamation marks can be anywhere in the string.
Example matches: abc!!, a!b!c, !!abc, a!!bc
I tried to match using lookahead but I wasn't able to limit the length. The following was the regex I used.
(?=\w*!\w*!\w*)[\w!]{5}
This matches a!!!b and a!!!! as well which I don't want.
You can do this using a lookahead based regular expression.
^(?=(?:\w*!){2}\w*$)[\w!]{5}$
Live Demo
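The same pattern can be checked locally with a few lines of Python. EXACTLY_TWO and matches are names invented for this sketch; the samples come from the question.

```python
import re

# The lookahead requires exactly two '!' in the whole string;
# the main part then limits the string to five allowed characters.
EXACTLY_TWO = re.compile(r"^(?=(?:\w*!){2}\w*$)[\w!]{5}$")

def matches(text: str) -> bool:
    return EXACTLY_TWO.match(text) is not None

for sample in ["abc!!", "a!b!c", "!!abc", "a!!bc", "a!!!b", "a!!!!"]:
    print(sample, matches(sample))
```

The lookahead is anchored with $ on its own, which is what stops a third ! from slipping through.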
Probably easiest to just specify all possibilities.
(?=\w\w\w!!|\w\w!\w!|\w\w!!\w|\w!\w\w!|\w!\w!\w|\w!!\w\w|!\w\w\w!|!\w\w!\w|!\w!\w\w|!!\w\w\w)
Regex doesn't work well with combinations/permutations.
If the number of combinations is too large, do it in parts where the first regex gathers potential matches and the second (and beyond) continue to validate it.
[\w!]{5}
match.count('!') == 2
match.count('\w') == 3
(that isn't valid code -- just a concept)
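A runnable version of that concept might look like the following Python sketch. CANDIDATE and is_valid are names chosen for this example, and it assumes the whole input string is the candidate.

```python
import re

# Step 1: a cheap regex that only checks length and allowed characters.
CANDIDATE = re.compile(r"[\w!]{5}")

def is_valid(text: str) -> bool:
    """Gather with the regex first, then validate the counts in plain code."""
    if CANDIDATE.fullmatch(text) is None:
        return False
    exclamations = text.count("!")
    word_chars = len(re.findall(r"\w", text))
    # Step 2: exactly two '!' and three word characters.
    return exclamations == 2 and word_chars == 3

for sample in ["abc!!", "a!b!c", "a!!!b", "a!!!!", "abcde"]:
    print(sample, is_valid(sample))
```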

Extracting Customer Unique IDs from Text

I need to extract customer IDs which are unique alphanumeric character sequences from text. They can contain digits only or digits and alphabetic characters or only alphabetic characters. We can assume that they are longer than 5 characters. They might be capitalized or not.
I thought about using a dictionary, if the character sequence is not a word in dictionary and a sequence longer than 5, it is a good candidate.
Any ideas or sample Java code would help. Thanks
Here is a simple regular expression that will match alphanumeric sequences of 6 characters or more:
(?<![A-Za-z0-9])[A-Za-z0-9]{6,}
I used a negative lookbehind here instead of a word boundary (\b) in case there are underscores in your text. If your regex flavor doesn't have lookbehind, you'll want to use the word boundary instead (but I note now that you mentioned Java in your question, and Java does have lookbehind).
If the customer ID must contain a number, then a regular expression to match these would look like this:
(?<![A-Za-z0-9])(?=[A-Za-z]*[0-9][A-Za-z0-9]*)[A-Za-z0-9]{6,}
See Regex101 demo.
Is there a limit to how long your customer IDs can be? If so, then putting that limit in would probably be helpful - any alphanumeric character sequence longer than that number obviously won't be a match. If the limit is 25 characters, for example, the regex would look like this:
(?<![A-Za-z0-9])(?=[A-Za-z]*[0-9][A-Za-z0-9]*)[A-Za-z0-9]{6,25}(?![A-Za-z0-9])
(I added the lookahead at the end, otherwise this could simply match the first 25 characters of a long alphanumeric sequence!)
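To see the effect of the upper limit and the trailing lookahead, here is a small sketch. It is written in Python rather than Java purely for brevity; the lookbehind and lookahead syntax used here is the same in both. ID_PATTERN and extract_ids are names invented for the example, and the sample text is made up.

```python
import re

# 6 to 25 alphanumerics, at least one digit, not glued to other alphanumerics.
ID_PATTERN = re.compile(
    r"(?<![A-Za-z0-9])(?=[A-Za-z]*[0-9][A-Za-z0-9]*)[A-Za-z0-9]{6,25}(?![A-Za-z0-9])"
)

def extract_ids(text: str) -> list:
    """Return all candidate customer IDs found in text."""
    return ID_PATTERN.findall(text)

sample = "Customer AB12345 ordered widgets; ref x9 and account 000123456."
print(extract_ids(sample))
# A 30-character alphanumeric run exceeds the limit and yields no partial match.
print(extract_ids("A1" * 15))
```

Plain words such as "Customer" are skipped because the first lookahead requires a digit, and "x9" is skipped because it is shorter than 6 characters.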
Once you have the matches extracted from your text, then you could do a dictionary lookup. I know there are questions and answers on StackOverflow on this subject.
To actually use this regex in Java, you would use the Pattern and Matcher classes. For example,
String mypattern = "(?<![A-Za-z0-9])(?=[A-Za-z]*[0-9][A-Za-z0-9]*)[A-Za-z0-9]{6,25}(?![A-Za-z0-9])";
Pattern tomatch = Pattern.compile(mypattern);
Etc. Hope this helps.
UPDATE
This just occurred to me: rather than trying a dictionary match, it might be better to store the extracted values in a database table and then compare them against your customers table.