I am trying to write a regular expression to validate a number that has between 9 and 13 digits. The sequence can contain dashes and spaces, and ideally it should not have more than one space or dash in a row.
This rule lets me validate a length between 9 and 13:
/^[\d]{9,13}$/
Now, to allow dashes and spaces:
/^[\d -]{9,13}$/
I think I need something like this, but I need to count only the digits:
/^[ -](?:\d){9,13}$/
Any tips?
Notice how my regex starts and ends with a digit. It also prevents consecutive spaces and dashes. The leading \d plus 8 to 12 repetitions of the group gives 9 to 13 digits.
/^\d([ \-]?\d){8,12}$/
It appears that you don't want leading or trailing spaces and dashes. This should do it.
/^\d([- ]*\d){8,12}$/
Regular expression:
\d digits (0-9)
( group and capture to \1 (between 8 and 12 times)
[- ]* any character of: '-', ' ' (0 or more times)
\d digits (0-9)
){8,12} end of \1
Another option: a digit followed by any number of spaces or dashes, 8 to 12 times, followed by a final digit.
/^(\d[- ]*){8,12}\d$/
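As a quick check, here is a minimal Python sketch that runs the pattern above against a few inputs. Python's re module, the helper name is_valid and the sample strings are assumptions made for illustration. Because [- ]* accepts runs of separators, the sketch also includes an assumed stricter variant that uses [- ]? to allow at most one separator between digits.

```python
import re

# Pattern from the answer above: 8 to 12 groups of "digit + optional
# separators", then a final digit, so 9 to 13 digits in total.
LOOSE = re.compile(r"^(\d[- ]*){8,12}\d$")

# Assumed stricter variant: at most one separator between two digits.
STRICT = re.compile(r"^(\d[- ]?){8,12}\d$")


def is_valid(text: str, pattern: re.Pattern = LOOSE) -> bool:
    """Return True if the string satisfies the given pattern."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    samples = ["123456789", "123-456 789", "12--3456789", "12345678", "-123456789"]
    for sample in samples:
        print(repr(sample), is_valid(sample), is_valid(sample, STRICT))
```

Both variants reject leading and trailing separators, since the pattern must start and end with a digit.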
Use look aheads to assert the various constraints:
/^(?!.*( {2}|--))(?=(\D*\d){9,13}\D*$)[\d -]+$/
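The three constraints can be read off one by one in the following Python sketch. It assumes the first alternative of the negative lookahead is meant to be two literal spaces (written here as " {2}"), and the helper name is_valid is made up for illustration.

```python
import re

# Assumption: the first alternative is two literal spaces, written " {2}".
PATTERN = re.compile(
    r"^(?!.*( {2}|--))"        # no double space and no double dash anywhere
    r"(?=(\D*\d){9,13}\D*$)"   # between 9 and 13 digits in total
    r"[\d -]+$"                # only digits, spaces and dashes
)


def is_valid(text: str) -> bool:
    """Return True if the string passes all three constraints."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    for sample in ["123 456-789", "123  456789", "123--456789", "- 123456789"]:
        print(repr(sample), is_valid(sample))
```

Note that this approach accepts a leading or trailing separator, and a dash directly next to a space, because only doubled spaces and doubled dashes are excluded.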
Assuming a dash following a space or vice versa is ok:
^( -?|- ?)?(\d( -?|- ?)?){9,13}$
Explanation:
( -?|- ?) - this is equivalent to ( | -|-|- ). Note that there can't be 2 consecutive dashes or spaces here, and this can only appear at the start or directly after a digit, so this prevents 2 consecutive dashes or spaces in the string.
And there clearly must be exactly one digit in (\d( -?|- ?)?), thus the {9,13} enforces 9-13 digits.
Assuming a dash following a space or vice versa is NOT ok:
^[ -]?(\d[ -]?){9,13}$
Explanation similar to the above.
Both of the above allow the string to start or end with a digit, dash or space.
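The difference between the two patterns shows up on inputs where a dash sits next to a space. A small Python sketch, with the constant names MIXED_OK and MIXED_NOT_OK chosen only for this example:

```python
import re

# A space and a dash may sit next to each other (" -" or "- ").
MIXED_OK = re.compile(r"^( -?|- ?)?(\d( -?|- ?)?){9,13}$")

# At most one separator character between two digits.
MIXED_NOT_OK = re.compile(r"^[ -]?(\d[ -]?){9,13}$")


def check(text: str) -> tuple:
    """Return (matches MIXED_OK, matches MIXED_NOT_OK) for the string."""
    return (
        MIXED_OK.match(text) is not None,
        MIXED_NOT_OK.match(text) is not None,
    )


if __name__ == "__main__":
    for sample in ["123456789", "123 -456789", "123--456789", "-123456789 "]:
        print(repr(sample), check(sample))
```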
The strings I parse with a regular expression contain a region of fixed length N where there can either be numbers or dashes. However, if a dash occurs, only dashes are allowed to follow for the rest of the region. After this region, numbers, dashes, and letters are allowed to occur.
Examples (N=5, starting at the beginning):
12345ABC
12345123
1234-1
1234--1
1----1AB
How can I correctly match this? I currently am stuck at something like (?:\d|-(?!\d)){5}[A-Z0-9\-]+ (for N=5), but I cannot make numbers work directly following my region if a dash is present, as the negative look ahead blocks the match.
Update
Strings that should not be matched (N=5)
1-2-3-A
----1AB
--1--1A
You could assert that the first 5 characters are either digits or - and make sure that there is no - before a digit in the first 5 chars.
^(?![\d-]{0,3}-\d)(?=[\d-]{5})[A-Z\d-]+$
^ Start of string
(?![\d-]{0,3}-\d) Make sure that in the first 5 chars there is no - before a digit
(?=[\d-]{5}) Assert that the first 5 characters are digits or -
[A-Z\d-]+ Match 1+ times any of the listed characters
$ End of string
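To see the two lookaheads at work, this Python sketch runs the pattern against the sample strings from the question. The helper name matches_region is an assumption for illustration.

```python
import re

PATTERN = re.compile(
    r"^(?![\d-]{0,3}-\d)"  # no dash followed by a digit within the first 5 chars
    r"(?=[\d-]{5})"        # the first 5 chars are digits or dashes
    r"[A-Z\d-]+$"          # the whole string uses only the allowed characters
)


def matches_region(text: str) -> bool:
    """Return True if the string follows the N=5 region rule."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    should_match = ["12345ABC", "12345123", "1234-1", "1234--1", "1----1AB"]
    should_fail = ["1-2-3-A", "----1AB", "--1--1A"]
    for sample in should_match + should_fail:
        print(sample, matches_region(sample))
```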
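If atomic groups are available, the following handles the examples above, but it does not force the digits-then-dashes run to span the whole region, so an input such as 12-34 also matches: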
^(?=[\d-]{5})(?>\d+-*|-{5})[A-Z\d_]*$
^ Start of string
(?=[\d-]{5}) Assert at least 5 chars - or digit
(?> Atomic group
\d+-* Match 1+ digits and optional -
| or
-{5} match 5 times -
) Close atomic group
[A-Z\d_]* Match optional chars A-Z digit or _
$ End of string
Use a non-word-boundary assertion \B:
^[-\d](?:-|\B\d){4}[A-Z\d-]*$
A non-word boundary succeeds at a position between two word characters (from \w, i.e. [A-Za-z0-9_]) or between two non-word characters (from \W, i.e. [^A-Za-z0-9_]), and also between a non-word character and either end of the string.
With it, each \B\d always follows a digit. (and can't follow a dash)
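A short Python sketch of this behaviour: the first call shows which digits \B\d can match on its own, and the helper in_format (a name assumed for this example) applies the full pattern.

```python
import re

# \B\d only matches a digit whose left neighbour is a word character,
# so a digit right after a dash, or at the very start, is skipped.
print(re.findall(r"\B\d", "12-34"))  # ['2', '4']

REGION = re.compile(r"^[-\d](?:-|\B\d){4}[A-Z\d-]*$")


def in_format(text: str) -> bool:
    """Return True if the first 5 characters are digits followed only by dashes."""
    return REGION.match(text) is not None


if __name__ == "__main__":
    for sample in ["12345ABC", "1234--1", "1-2-3-A", "----1AB"]:
        print(sample, in_format(sample))
```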
Other way (if lookbehinds are allowed):
^\d*-*(?<=^.{5})[A-Z\d-]*$
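Since the region length N is fixed but may differ between inputs, the lookbehind variant can be built for any N. The following is a sketch in Python, whose re module accepts this fixed-width lookbehind; the function name region_pattern is hypothetical.

```python
import re


def region_pattern(n: int) -> re.Pattern:
    """Build the lookbehind pattern for a region of length n.

    Digits are consumed first, then dashes, and the lookbehind asserts
    that exactly n characters have been consumed at that point.
    """
    return re.compile(rf"^\d*-*(?<=^.{{{n}}})[A-Z\d-]*$")


if __name__ == "__main__":
    pattern = region_pattern(5)
    for sample in ["12345ABC", "1234--1", "1----1AB", "1-2-3-A", "----1AB"]:
        print(sample, pattern.match(sample) is not None)
```

Backtracking does the work here: for 1234--1 the engine first takes both dashes, fails the lookbehind at position 6, gives one dash back and succeeds at position 5.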
I want to require a space after every comma in a list. I've got this, which works pretty well for my lists that have 5 to 7 digits, separated by commas.
^([^,]{5,7},)*[^,][^ ]{5,7}$
The problem is it allows 12345,12345. I don't want that to pass. 12345, 12345 should pass. I also need just 12345 to pass, so the comma and space is not required if it's just one 5-7 digit number.
Your regex does not match 12345,12345 because this part ([^,]{5,7},)* will match from the start including the comma.
Then it matches not a comma [^,] which will match the second 1 and then it has to match not a whitespace [^ ]{5,7} but there are only 4 characters left to match which are 2345 and it can not match.
If the first part fails it tries to match [^,][^ ]{5,7} which in total matches 6-8 characters.
You might use:
^[^,\s]{5,7}(?:, [^,\s]{5,7})*$
^ Start of the string
[^,\s]{5,7} Match any character except a comma or a whitespace character, 5 to 7 times
(?: Non capturing group
, [^,\s]{5,7} Match a comma, space and not a comma or a whitespace character 5-7 times
)* Close non capturing group and repeat 0+ times
$ End of the string
I didn't understand your regex, but something as simple as this should work:
^(?:\d{5,7}, )*\d{5,7}$
Or if you didn't intend to allow digit-only,
^(?:[^, ]{5,7}, )*[^, ]{5,7}$
I am trying to recognize these types of phone number inputs:
0172665476
+6265476393
+62-65476393
+62-654-76393
+62 65476393
While my regex: (?:\d+\s*)+ can recognize the 1st 2 sample values, it recognizes the last 3 sample values as multiple matches in each line, instead of recognizing the number as a whole.
How can I modify this to support multiple dashes and/or spaces and still recognize it as 1 whole number instead of multiple matches?
You may use this regex:
^\+?\d+(?:[\s-]\d+)*\b
RegEx Details:
^\+?: Match optional + at start
\d+: match 1+ digits
(?:[\s-]\d+)*: Match 0 or more groups that start with whitespace or - followed by 1+ digits
\b: Word boundary (used instead of $ because, if there are trailing spaces, a match anchored with $ would be missed.)
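A brief Python sketch that applies this regex line by line to the sample values from the question. The helper name extract_number is made up for illustration.

```python
import re

PHONE = re.compile(r"^\+?\d+(?:[\s-]\d+)*\b")


def extract_number(line: str):
    """Return the phone number at the start of the line, or None."""
    match = PHONE.match(line)
    return match.group(0) if match else None


if __name__ == "__main__":
    samples = [
        "0172665476",
        "+6265476393",
        "+62-65476393",
        "+62-654-76393",
        "+62 65476393",
    ]
    for sample in samples:
        print(extract_number(sample))
```

Each sample comes back as a single match, and trailing spaces after the last digit are left out of the result.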
This should work:
(?:[\d +-]+)+
This would work as per your requirement (trailing spaces are ignored by this regex):
Regex: '^(?:[\d +-]+)\b'
Another option could be to use an alternation to match either 10 digits without a leading plus sign or match the pattern with a +, and optional space or hyphen:
(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b
That will match:
(?: Non capturing group
\d{10} Match 10 digits
| Or
\+\d{2}[- ]?\d{3}-?\d{5} Match +, 2 digits, optional - or space, 3 digits, optional -, 5 digits
)\b Close non capturing group and word boundary
If your language supports negative lookbehinds you could prepend (?<!\S) which checks that what comes before is not a non-whitespace character.
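Python's re module supports that lookbehind, so the alternation can be sketched as follows for pulling numbers out of running text. The function name find_numbers and the sample sentence are assumptions for illustration.

```python
import re

NUMBER = re.compile(
    r"(?<!\S)"                                 # not preceded by a non-whitespace char
    r"(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b"   # 10 digits, or the +NN form
)


def find_numbers(text: str) -> list:
    """Return every phone number found in the text as one whole match."""
    return NUMBER.findall(text)


if __name__ == "__main__":
    print(find_numbers("call 0172665476 or +62-654-76393, fax +62 65476393"))
```

Because the group is non-capturing, findall returns the full match for each number rather than a group fragment.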
$ cat t1.txt:
ABCD_EFG_HIJK
ABCD_HJIJ_IJKL
What could be the regex for the above two lines,
or even for just one of them?
Or, put differently:
The scenario is 4 characters, followed by an underscore, followed by any number of characters, followed by an underscore, followed by any number of characters, and so on, ending with characters.
4characters_(minimum of 1 character)_(minimum of 1 character)_(ends with a minimum of 1 character).
Note: it starts with 4 characters.
After the edit, the question is to find a regex that matches a string that starts with 4 characters, followed by at least one group consisting of '_' followed by at least 1 character.
[A-Z]{4}(_[A-Z]+)+
explanation:
[A-Z]{4} # exactly 4 picks from A-Z
( # group 1 start
_[A-Z]+ # "_" followed by 1 or more character out of A-Z
)+ # group 1 end. Repeat group 1 1 or more times.
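A short Python sketch of the pattern explained above. The pattern itself is unanchored, so the sketch uses re.fullmatch to require that the whole line fits; that choice and the helper name is_record are assumptions.

```python
import re

RECORD = re.compile(r"[A-Z]{4}(_[A-Z]+)+")


def is_record(line: str) -> bool:
    """Return True if the whole line is 4 capitals plus one or more _GROUP parts."""
    return RECORD.fullmatch(line) is not None


if __name__ == "__main__":
    for sample in ["ABCD_EFG_HIJK", "ABCD_HJIJ_IJKL", "ABC_EFG", "ABCD_"]:
        print(sample, is_record(sample))
```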
You can play with it at regex101
In the above regex I've chosen capital letters as the characters, since the question suggests this. However, the set could also include lowercase letters, for example, which would change the regex to:
[a-zA-Z]{4}(_[a-zA-Z]+)+
If by "any number of characters" you mean at least one character, this should work: /^[A-Za-z0-9]{4}_([A-Za-z0-9]+_)+[A-Za-z0-9]+$/g.
If you want, you can try this solution at a regex website: regexr.com
EDIT: If you want to allow only capital letters, then remove a-z and 0-9 from the square brackets.
Another option:
[^_\n]+_[^_\n]+_[^_\n]+
This matches runs of any character except a newline \n and _, separated by underscores.
I know that we can use regex in perl to catch numbers using [\d], but my pattern is like this:
261 193 546 302
or it could be like this:
16 0 98 120
The point is - I just want to catch a line that has any four numbers separated by a space. Each number can be made up of any number of digits, it could be a single-digit number, or a double-digit number, and so on.
^\d+(?:\s+\d+){3}$
Try this. This should do it for you.
You don't explicitly have to wrap the token inside of a character class. And for this you want to assert the start of the string and end of the string positions, so I would use anchors and quantify a non-capturing group "3" times.
^\d+(?: \d+){3}$
Explanation:
^ # the beginning of the string
\d+ # digits (0-9) (1 or more times)
(?: # group, but do not capture (3 times):
# ' '
\d+ # digits (0-9) (1 or more times)
){3} # end of grouping
$ # before an optional \n, and the end of the string
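The same pattern can be used to filter lines, shown here as a Python sketch for illustration only (the question is about Perl). The function name four_number_lines is made up.

```python
import re

FOUR_NUMBERS = re.compile(r"^\d+(?: \d+){3}$")


def four_number_lines(text: str) -> list:
    """Return the lines that consist of exactly four space-separated numbers."""
    return [line for line in text.splitlines() if FOUR_NUMBERS.match(line)]


if __name__ == "__main__":
    data = "261 193 546 302\n16 0 98 120\n1 2 3\n1 2 3 4 5\n1  2 3 4"
    print(four_number_lines(data))
```

A line with two spaces between numbers is rejected by this version; swapping the literal space for \s+ would accept it.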
Based on your requirement to "catch a line that has any four numbers separated by a space", I would use the following, as it contains a capture group that will hold your number sequence and will ignore any leading or trailing spaces.
((?:\d+\s){3}\d+)
Usage in Perl
my $re = qr/((?:\d+\s){3}\d+)/;
As you can see it will match exactly 4 numbers separated by a single space and will ignore preceding and trailing characters.
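Because this pattern is not anchored, it finds the first run of four numbers wherever it occurs. A Python sketch of that behaviour, used here only for illustration, with the helper name first_sequence assumed:

```python
import re

CAPTURE = re.compile(r"((?:\d+\s){3}\d+)")


def first_sequence(text: str):
    """Return the first run of four whitespace-separated numbers, or None."""
    match = CAPTURE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(first_sequence("  261 193 546 302  "))
    print(first_sequence("1 2 3 4 5"))
    print(first_sequence("1 2 3"))
```

For a line with five numbers the group still captures the first four, so add anchors if lines with extra numbers must be rejected.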
Alternate
If you were being explicit and actually want to capture the whole line, including any other characters, this will be better suited.
(^.*(?:\d+\s){3}\d+.*$)
Usage In Perl
my $re = qr/(^.*(?:\d+\s){3}\d+.*$)/m;
Note this will match numbers with decimal places due to the way it is structured.
Try ^\d+\s\d+\s\d+\s\d+$. That will match 4 numbers with spaces and nothing else.