Regexp pattern Optional character [duplicate] - regex

I want to match a string like 19740103-0379 or 197401030379, i.e. the dash is optional.
How do I accomplish this with regexp?

Usually you can just use -?. Alternatively, you can use -{0,1} but you should find that ? for "zero or one occurrences of" is supported just about everywhere.
pax> echo 19740103-0379 | grep -E '19740103-?0379'
19740103-0379
pax> echo 197401030379 | grep -E '19740103-?0379'
197401030379
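The same idea can be sketched in Python. This assumes the input is always eight digits, an optional dash, and four digits, which is inferred from the example rather than stated in the question; the names `ID_PATTERN` and `matches_id` are illustrative.

```python
import re

# Eight digits, an optional dash, then four digits (layout assumed from the example).
ID_PATTERN = re.compile(r'[0-9]{8}-?[0-9]{4}')

def matches_id(text):
    # fullmatch requires the whole string to fit the pattern.
    return ID_PATTERN.fullmatch(text) is not None

print(matches_id('19740103-0379'))   # True
print(matches_id('197401030379'))    # True
print(matches_id('19740103--0379'))  # False: -? allows at most one dash
```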
If you want to accept 12 digits with any number of dashes in there anywhere, you might have to do something like:
-*([0-9]-*){12}
which is basically zero or more dashes followed by 12 occurrences of (a digit followed by zero or more dashes) and will match all sorts of wonderful things like:
--3-53453---34-4534---
(of course, you should use \d instead of [0-9] if your regex engine has support for that).
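As a rough illustration, here is how that looser pattern might be used in Python to accept the input and then strip the dashes. The helper name `digits_if_valid` is made up for this sketch.

```python
import re

# Optional leading dashes, then exactly 12 digits, each optionally followed by dashes.
DASHED = re.compile(r'-*(?:[0-9]-*){12}')

def digits_if_valid(text):
    """Return the 12 digits with dashes removed, or None if the text does not fit."""
    if DASHED.fullmatch(text) is None:
        return None
    return text.replace('-', '')

print(digits_if_valid('--3-53453---34-4534---'))
print(digits_if_valid('19740103-0379'))
print(digits_if_valid('1234'))
```

Only strings with exactly 12 digits pass; anything shorter, longer, or containing other characters returns None.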

You could try different ones:
\d* matches a string consisting only of digits
\d*-\d* matches a string of format digits - dash - digits
[0-9\-]* matches a string consisting of only dashes and digits
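You can combine them with | (or), for example (\d*)|(\d*-\d*), which matches either digits only or digits-dash-digits. Anchor the combined pattern, as in ^(\d*|\d*-\d*)$, so the first alternative cannot succeed on just a prefix of the input.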
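A small Python sketch of the alternation, for illustration. It uses \d+ rather than \d* so that an empty string or a lone dash is rejected; that is an assumption about what is wanted, not something the answer states. `re.fullmatch` plays the role of the ^ and $ anchors.

```python
import re

# Digits only, or digits-dash-digits. \d+ (not \d*) is an assumption made here
# so that "" and "-" are rejected.
EITHER = re.compile(r'\d+|\d+-\d+')

for candidate in ['197401030379', '19740103-0379', '1974-01-03', '']:
    # fullmatch tries every alternative until one covers the whole string.
    print(repr(candidate), EITHER.fullmatch(candidate) is not None)
```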

How do I create a regex expression for a 10 digit phone number with the same separator?

I am trying to create a basic regular expression to match a phone number which can either use dots [.] or hyphens [-] as the separator.
The format is 123.456.7890 or 123-456-7890.
The expression I am currently using is:
\d\d\d[-.]\d\d\d[-.]\d\d\d\d
The issue here is that it also matches the phone numbers that have both separators in them which I want to be termed as invalid/not a match. For example, with my expression, 123.456-7890 and 123-456.7890 show up as a match, something I do not want happening.
Is there a way to do that?
Use a backreference:
^\d{3}([.-])\d{3}\1\d{4}$
Here is an explanation of the regex:
^ from the start of the number
\d{3} match any 3 digits
([.-]) then match AND capture either a dot or a dash separator
\d{3} match any 3 digits
\1 match the SAME separator seen earlier
\d{4} match any 4 digits
$ end of the number
You can use this regex:
^\d{3}([-.])\d{3}\1\d{4}$
The key point is that you capture the separator with parentheses, ([-.]), and then reuse it with the backreference \1.
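To see the backreference at work, here is a minimal Python sketch run against the numbers from the question. The function name `same_separator` is illustrative.

```python
import re

# ([.-]) captures the first separator; \1 requires the same character again.
PHONE = re.compile(r'\d{3}([.-])\d{3}\1\d{4}')

def same_separator(number):
    return PHONE.fullmatch(number) is not None

for number in ['123.456.7890', '123-456-7890', '123.456-7890', '123-456.7890']:
    print(number, same_separator(number))
```

The first two numbers are accepted and the two mixed-separator numbers are rejected.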

Need Regex to validate 11-digit phone number without plus sign [duplicate]

I need a regex to validate a phone number without a plus (+) sign, for example
46123456789,46-123-456-789,46-123-456-789
The number should be 11 digits; everything else should be ignored.
I am currently using this regex: /([+]?\d{1,2}[.-\s]?)?(\d{3}[.-]?){2}\d{4}/g
It's not correct at all.
About the pattern you tried:
The part [+]? in your pattern optionally matches a plus sign. It sits inside the optional group ([+]?\d{1,2}[.-\s]?)?, so the pattern can match anywhere from 10 to 12 digits in total.
The character class [.-\s] matches 1 of the listed characters, allowing for mixed delimiters like 333-333.3333.
You are not using anchors, so you can also get partial matches.
You could use an alternation | to match either the pattern with the hyphens and digits or match only 11 digits.
^(?:\d{2}-\d{3}-\d{3}-\d{3}|\d{11})$
^ Start of string
(?: Non capture group for the alternation
\d{2}-\d{3}-\d{3}-\d{3} Match the groups of 2, 3, 3 and 3 digits separated by hyphens
| Or
\d{11} Match 11 digits
) Close group
$ End of string.
If you want multiple delimiters which have to be consistent, you could use a capturing group with a backreference \1
^(?:\d{2}([-.])\d{3}\1\d{3}\1\d{3}|\d{11})$
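The consistent-delimiter variant can be tried out in Python as a sketch. `re.fullmatch` stands in for the ^ and $ anchors, and `is_valid` is a hypothetical helper name.

```python
import re

# Either 2-3-3-3 digits with one consistent separator (- or .), or 11 bare digits.
PATTERN = re.compile(r'(?:\d{2}([-.])\d{3}\1\d{3}\1\d{3}|\d{11})')

def is_valid(number):
    return PATTERN.fullmatch(number) is not None

for number in ['46123456789', '46-123-456-789', '46.123.456.789',
               '46-123.456-789', '+46123456789', '4612345678']:
    print(number, is_valid(number))
```

The first three inputs are accepted; the mixed-separator, plus-prefixed and 10-digit inputs are rejected.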
I would have this function return true or false, but it can be used as is.
function isPhoneValid(phone) {
  // Strip everything that is not a digit, then count what is left.
  const onlyNumbers = phone.replace(/[^0-9]/g, "");
  if (onlyNumbers.length !== 11) console.log(phone + ' is invalid');
  else console.log(phone + ' is valid');
}
isPhoneValid('1 (888) 555-1234'); // 11 digits
isPhoneValid('(888) 555-1234');   // 10 digits
I am not sure what the input looks like, but based on your question I assume you want to trim it and then match it with a regex.
Trim your input:
string.split(/[^0-9.]/).join('');
Then you can match it with this regex:
((\([0-9]{3}\))|[0-9]{3})[\s\-]?[0-9]{3}[\s\-]?[0-9]{4}$

Accept only numbers but ignore if two number groups have spaces in between them [duplicate]

I'm trying to develop a regex with the following rules:
it should accept solely numbers,
if the string contains any letters or any other special characters, the whole string should be rejected,
regarding spaces, there should only be one consecutive number group, which can be surrounded by spaces,
if there is more than one consecutive number group, with spaces in between the groups, the whole string should be rejected.
Example Cases:
accepted:
1234
[SPACE][SPACE]111[SPACE]
[SPACE]111[SPACE][SPACE]
declined:
1a234
aa1234aa
1234a
12#4
[SPACE]11[SPACE]111
[SPACE]11[SPACE]111#
So far, I've come up with this ([0-9]+[^\s]*) which can be seen here.
What modifications do I have to do to achieve the scenario I want above?
Use this:
^\s*\d+\s*$
All we need to do is accept one or more digits bounded by zero or more spaces on either side.
EDIT:
Just add a capturing group around the digits to use them later:
^\s*(\d+)\s*$
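Here is a short Python sketch that runs the capturing version against the accepted and declined cases from the question. The helper name `extract_number` is made up for illustration; note that in Python \s also matches tabs and newlines.

```python
import re

# One run of digits, optionally surrounded by whitespace, and nothing else.
SINGLE_GROUP = re.compile(r'\s*(\d+)\s*')

def extract_number(text):
    """Return the digit run, or None if the string should be rejected."""
    match = SINGLE_GROUP.fullmatch(text)
    return match.group(1) if match else None

accepted = ['1234', '  111 ', ' 111  ']
declined = ['1a234', 'aa1234aa', '1234a', '12#4', ' 11 111', ' 11 111#']
for text in accepted + declined:
    print(repr(text), extract_number(text))
```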
The pattern you tried, ([0-9]+[^\s]*), matches 1+ digits followed by 0+ non-whitespace characters; the negated character class [^\s]* matches any character except whitespace, so it would also match 1234aa in aa1234aa.
It can also match multiple times in the same string, as there are no anchors asserting the start ^ and the end $ of the string.
If you want to match spaces, instead of matching \s which could also match newlines, you could match a single space and repeat that 0+ times on the left and on the right side.
^ *[0-9]+ *$
If you only need the digits, you could use a capturing group
^ *([0-9]+) *$
^\s*[0-9]+\s*$
Notice that I've used [0-9] instead of \d.
[0-9] accepts only Western Arabic digits.
\d may accept any Unicode digit, such as Eastern Arabic or Thai digits (١,٢,٣, ๑,๒,๓, etc.); at least this is the case in XSD regex when it validates an XML file.
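Python 3 behaves the same way for str patterns, which makes the difference easy to demonstrate. This sketch compares \d, [0-9], and \d under the `re.ASCII` flag; the digit strings are written as escapes to avoid right-to-left rendering issues in the code.

```python
import re

samples = {
    'western': '123',
    'eastern_arabic': '\u0661\u0662\u0663',  # the Eastern Arabic digits 1, 2, 3
    'thai': '\u0e51\u0e52\u0e53',            # the Thai digits 1, 2, 3
}

for name, text in samples.items():
    unicode_d = re.fullmatch(r'\d+', text) is not None
    ascii_class = re.fullmatch(r'[0-9]+', text) is not None
    ascii_flag = re.fullmatch(r'\d+', text, re.ASCII) is not None
    print(name, unicode_d, ascii_class, ascii_flag)
```

Only the Western digits pass all three checks; the other two pass only the plain \d check.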

Regex for excluding strings that start with consecutive leading zeroes or are only alphabets [duplicate]

I am looking for a regex that selects only the strings that do not start with consecutive zeroes or consecutive letters before the underscore in the strings below.
For example:
ABC_DE-001 is invalid
abc is invalid (only alphabets)
0_DE-001 is invalid (1 zero before underscore)
000_DE-001 is invalid (sequence of 3 consecutive zeroes)
00_DE-001 is invalid (sequence of 2 consecutive zeroes)
01_DE-001 is valid (0 followed by some other number is valid)
10_DE-001 is valid (starts with 1)
100_DE-001 is valid (starts with 1)
One approach I tried was:
(0[1-9]+|[1-9][0-9]+|0[0*$][1-9])_[A-Z0-9]+[-][0-9]{3}
I am not sure though if any scenario is missed with this. Also, how can the same thing be achieved using negative or positive lookaround?
For your example data, you can match an optional zero at the start with ^0?, since one leading zero may occur but not more than one.
^0?[1-9][0-9]*_[A-Z]+-[0-9]{3}$
That will match
^0? An optional zero at the start of the string
[1-9][0-9]* Match a digit 1-9 followed by 0+ digits
_[A-Z]+ Match an _ followed by 1+ times A-Z
-[0-9]{3} Match - followed by 3 digits
$ Assert the end of the string
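As a quick check, the pattern can be run in Python against the example strings from the question. This is only a sketch; `is_valid_code` is a hypothetical name, and `re.fullmatch` replaces the ^ and $ anchors.

```python
import re

# Optional single leading zero, a non-zero digit, more digits, then _LETTERS-ddd.
CODE = re.compile(r'0?[1-9][0-9]*_[A-Z]+-[0-9]{3}')

def is_valid_code(text):
    return CODE.fullmatch(text) is not None

examples = ['ABC_DE-001', 'abc', '0_DE-001', '000_DE-001', '00_DE-001',
            '01_DE-001', '10_DE-001', '100_DE-001']
for text in examples:
    print(text, 'valid' if is_valid_code(text) else 'invalid')
```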
You can try with negative look ahead groups:
grep -Pi '^(?![a-z]+(?:_|$|\s)|0+(?:_|$|\s))' test.txt
Explanation:
-Pi - use PCRE and process ignore case. This is grep specific, you can adapt these options to your case. If you cannot make the regex processor to ignore case, just replace [a-z] with [a-zA-Z]. And of course, PCRE support is required.
^ - beginning of the line
(?!rgx) - look forward without moving the cursor to check the line doesn't match the enclosed regular expression rgx.
[a-z]+(?:_|$|\s)|0+(?:_|$|\s) :
don't keep consecutive letters ([a-z]+) followed by an underscore, the end of the line, or a blank character ((?:_|$|\s))
don't keep consecutive zeroes (0+) followed by an underscore, the end of the line, or a blank character ((?:_|$|\s))
(?:...) is a non-capturing group (the matched content is not stored, which can improve performance)
Output:
01_DE-001 is valid (0 followed by some other number is valid)
10_DE-001 is valid (starts with 1)
100_DE-001 is valid (starts with 1)
Since grep prints only matching lines by default, the lines that are not displayed were treated as invalid.
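For readers without grep -P, the same negative lookahead can be sketched in Python, with `re.IGNORECASE` standing in for the -i option. The function name `keep` is illustrative. Note that this filter only rejects the unwanted prefixes; it does not validate the rest of the string.

```python
import re

# Reject a line whose first token is all letters or all zeroes, where the token
# ends at an underscore, the end of the line, or whitespace.
FILTER = re.compile(r'^(?![a-z]+(?:_|$|\s)|0+(?:_|$|\s))', re.IGNORECASE)

def keep(line):
    return FILTER.match(line) is not None

lines = ['ABC_DE-001', 'abc', '0_DE-001', '000_DE-001', '00_DE-001',
         '01_DE-001', '10_DE-001', '100_DE-001']
for line in lines:
    if keep(line):
        print(line)
```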

Perl comprehensive phone number regex [duplicate]

I have a file that contains phone numbers of the following formats:
(xxx) xxx.xxxx
(xxx).xxx.xxxx
(xxx) xxx-xxxx
(xxx)-xxx-xxxx
xxx.xxx.xxxx
xxx-xxx-xxxx
xxx xxx-xxxx
xxx xxx.xxxx
I must parse the file for phone numbers of those and ONLY those formats, and output them to a separate file. I'm using Perl, and so far I have what I think is a valid regex for two of these formats:
my $phone_regex = qr/^(\d{3}\-)?(\(\d{3}\))?\d{3}\-\d{4}$/;
But I'm not sure if this is correct, or how to do the rest all in one regex. Thank you!
Here you go
\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}
See a demo on regex101.com.
Broken down this is
\(? # "(", optional
\d{3} # three digits
\)? # ")", optional
[-. ] # one of "-", "." or " "
\d{3} # three digits
[-. ] # same as above
\d{4} # four digits
If you want, you can add a word boundary (\b) on the right side; some potential matches may then be filtered out.
You have needlessly escaped the hyphen; outside a character class it does not need escaping. The regex you are trying to create is this:
^\(?\d{3}\)?[ .-]\d{3}[ .-]\d{4}$
Explanation:
^ - Start of string
\(? - Optional starting parenthesis (
\d{3} - Followed by three digits
\)? - Optional closing parenthesis )
[ .-] - A single character either a space or . or -
\d{3} - Followed by three digits
[ .-] - Again a single character either a space or . or -
\d{4} - Followed by four digits
$ - End of string
Your current regex allows too much, as it will allow xxx-(xxx) at the beginning. It also doesn't handle any of the . or space separated cases. You want to have only three sets of digits, and then allow optional parentheses around the first set which you can use an alternation for, and then you can make use of character classes to indicate the set of separators you want to allow.
Additionally, don't use \d as it will match any Unicode digit. Since you likely only want to allow ASCII digits, use the character class [0-9] (there are other options, but this is the simplest).
Finally, $ allows a newline at the end of the string, so use \z instead which does not. Make sure if you are reading these from a file that you chomp them so they do not contain trailing newlines.
This leaves us with:
qr/^(?:[0-9]{3}|\([0-9]{3}\))[-. ][0-9]{3}[-.][0-9]{4}\z/
If you want to ensure that the two separators are the same if the first is a . or -, it is easiest to do this in multiple regex checks (these can be more lenient since we already validated the general format):
if ($str =~ m/^[0-9()]+ /
    or $str =~ m/^[0-9()]+\.[0-9]{3}\./
    or $str =~ m/^[0-9()]+-[0-9]{3}-/) {
  # allowed
}
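A rough Python equivalent of this answer, as a sketch only. It compares two captured separators instead of running the separate regex checks, and it uses `re.fullmatch` in place of \z, so a trailing newline is rejected. The names `PHONE` and `is_listed_format` are made up for illustration.

```python
import re

# Three digits, bare or parenthesised; a first separator (-, . or space);
# three digits; a second separator (- or .); four digits.
PHONE = re.compile(r'(?:[0-9]{3}|\([0-9]{3}\))([-. ])[0-9]{3}([-.])[0-9]{4}')

def is_listed_format(text):
    match = PHONE.fullmatch(text)
    if match is None:
        return False
    first, second = match.groups()
    # After a space either separator may follow; otherwise both must agree.
    return first == ' ' or first == second

for text in ['(555) 555.1234', '(555)-555-1234', '555 555-1234',
             '555.555-1234', '(555 555-1234', '555-555-1234\n']:
    print(repr(text), is_listed_format(text))
```

The first three inputs are accepted; the mixed separators, the unbalanced parenthesis and the trailing newline are rejected.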