Regex fixed number of characters but any quantity of spaces - regex

I'm validating some input fields. Here's the regex for a simple example:
^[0-9]{6,6}$
In the example, it requires 6 numbers to be input. However, I want to relax the validation a little and allow spaces where necessary, and remove them later - an example might be a bank sort code.
In the UK, a sort code could be written as 123456, or perhaps 12 34 56.
I know I can amend the expression to include a space within the brackets and relax the numbers in the curly brackets, but what I'd like to do is continue to limit the digits so that 6 must always be input, and allow none or more spaces - of course the spaces could be anywhere.
I'm not sure how to approach this - any ideas, help appreciated.

Try this:
^(\d\s*){6}$
It allows 0 or more whitespace characters after every digit.
If you want to limit whitespace to be inside the digits (without leading or trailing spaces):
^(\d\s*){5}\d$
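As a rough illustration, here is how the two patterns above behave in Python. The helper names (is_sort_code, normalise) are made up for this sketch, and fullmatch is used so that the whole input must match:

```python
import re

# Six digits, each optionally followed by whitespace (trailing whitespace allowed).
LOOSE = re.compile(r"^(\d\s*){6}$")
# Same idea, but the input must start and end with a digit.
INNER_ONLY = re.compile(r"^(\d\s*){5}\d$")


def is_sort_code(text: str, pattern: re.Pattern = LOOSE) -> bool:
    """Return True if the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None


def normalise(text: str) -> str:
    """Remove all whitespace, e.g. after the input has been validated."""
    return re.sub(r"\s+", "", text)


if __name__ == "__main__":
    for sample in ["123456", "12 34 56", "12 34 56 ", " 123456", "12345"]:
        print(repr(sample), is_sort_code(sample), is_sort_code(sample, INNER_ONLY))
```

Note that the first pattern accepts trailing whitespace but not leading whitespace, because each group starts with a digit.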

If you allow spaces at any position alongside 6 digits, then you need
^(\s*[0-9]){6}\s*$
The \s* matches any whitespace, 0 or more repetitions.
Note that a limiting quantifier {6,6} (minimum 6 and maximum 6 repetitions) is equal to {6}.
Also, note that you need to double escape the \s as \\s if you pass the regex pattern as a regular string literal.
And if you plan to only allow regular spaces, not all whitespace, just use
^([ ]*[0-9]){6}[ ]*$
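A small sketch of the escaping point, assuming Python, where a raw string literal avoids the doubled backslash. The matches helper is hypothetical:

```python
import re

# A raw string keeps the backslash in \s as written...
RAW = r"^(\s*[0-9]){6}\s*$"
# ...while a regular string literal needs the backslash doubled.
ESCAPED = "^(\\s*[0-9]){6}\\s*$"
# Variant that accepts only the ordinary space character, not tabs etc.
SPACES_ONLY = r"^([ ]*[0-9]){6}[ ]*$"


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the text."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    print(RAW == ESCAPED)                      # both literals spell the same pattern
    print(matches(RAW, "12\t34\t56"))          # tabs count as whitespace
    print(matches(SPACES_ONLY, "12\t34\t56"))  # tabs rejected by the space-only variant
```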

I think you want to look at a lookahead expression
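For your example, ^(?=(\s*[0-9]\s*){6}$)[\d\s]*$ (the $ inside the lookahead makes the six-digit check apply to the whole string, and [\d\s]* then consumes the digits and spaces)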
This looks for any amount of space, followed by a digit followed by any amount of space 6 times.
Other answers I've seen so far only allow a total of 6 characters, this expression will allow any number of spaces but only 6 digits, no more, no less.
Note: ^(\s*[0-9]\s*){6}$ this will also work, without the lookahead expression
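To compare the two styles, here is a Python sketch. PLAIN is the pattern from the note above; LOOKAHEAD is this sketch's own spelling of the lookahead idea, with the end anchor placed inside the lookahead so that the digit count applies to the whole string. The sample list is invented for illustration:

```python
import re

# Plain version: six digits, each with optional whitespace on either side.
PLAIN = re.compile(r"^(\s*[0-9]\s*){6}$")
# Lookahead version: first check the digit count up to the end of the string,
# then consume any mix of digits and whitespace.
LOOKAHEAD = re.compile(r"^(?=(\s*[0-9]\s*){6}$)[0-9\s]*$")

SAMPLES = ["123456", "12 34 56", " 1 2 3 4 5 6 ", "12345", "1234567", "12a456"]


def verdicts(pattern: re.Pattern) -> list:
    """Return one boolean per sample: does the pattern accept it?"""
    return [pattern.search(s) is not None for s in SAMPLES]


if __name__ == "__main__":
    print(verdicts(PLAIN))
    print(verdicts(LOOKAHEAD))
```

Both patterns accept the same samples here, so the lookahead adds nothing for this particular requirement.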

Related

Different regex conditions on same string

I am trying to implement a regex for phone numbers, based on our business logic.
What the customer wants is that the phone number must contain between 8 and 15 digits, and may also contain any number of spaces and dots anywhere, which don't count towards the number of digits. So, theoretically this should be valid:
3 .... 44444444
Because it contains 9 numbers.
I can't get any further than
~[0-9\.\ ]{8,15}$
but obviously it counts dots and spaces to the limit too.
Is it even possible to implement it via regex?
A Regex attempt:
^(?:[ .]*\d){8,15}[ .]*$
This will match 8 to 15 digits, with any number of spaces or dots anywhere in between.
The non-capturing group, (?:[ .]*\d), matches a digit preceded by any number of dots or spaces; {8,15} enforces the range on the digit count.
[ .]*$ matches any number of dots or spaces at the end.
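A quick way to try the pattern, sketched in Python with a hypothetical is_valid_phone helper:

```python
import re

# 8 to 15 digits, each optionally preceded by spaces or dots,
# with optional spaces or dots at the very end.
PHONE = re.compile(r"^(?:[ .]*\d){8,15}[ .]*$")


def is_valid_phone(text: str) -> bool:
    """Return True if the whole string is 8-15 digits plus spaces/dots."""
    return PHONE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["3 .... 44444444", "12345678", "1234567", "1234-5678"]:
        print(repr(sample), is_valid_phone(sample))
```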
As far as I know, regular expressions cannot validate this. However you could maybe globally remove all whitespace and dots and then try to match a regex that is ^[[:digit:]]{8,15}$
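The strip-then-match approach could look roughly like this in Python. Python's re module has no POSIX [[:digit:]] class, so [0-9] stands in for it here, and the function name is made up for the sketch:

```python
import re


def is_valid_phone_stripped(text: str) -> bool:
    """Remove whitespace and dots, then require 8 to 15 digits."""
    stripped = re.sub(r"[\s.]", "", text)
    return re.fullmatch(r"[0-9]{8,15}", stripped) is not None


if __name__ == "__main__":
    for sample in ["3 .... 44444444", "12.34.56.78", "1234 567", "1234-5678"]:
        print(repr(sample), is_valid_phone_stripped(sample))
```

A side benefit of this route is that the stripped value is already in the form you would probably want to store.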

Need a Regex that contains at least one number, zero or more letters, no spaces, min/max

I need a regular expression that will match a string that contains:
at least one number
zero or more letters
no other characters such as spaces
The string must also be a minimum of 8 characters and a maximum of 13 characters.
Placement of the numbers and/or letters within the 8-13 character string does not matter. I haven't figured out how to make sure that the string contains a number, but here are some expressions that don't work because they are picking up spaces in the online tool Regexr. Take a look below:
- ([\w^/s]){8,13}
- ([a-zA-Z0-9]){8,13}
- ([a-zA-Z\d]){8,13}
I am specifically looking to exclude spaces and special characters. The linked and related questions all appear to allow for these characters. This is not for validating passwords, it is for detecting case numbers in natural language processing. This is different from "Password REGEX with min 6 chars, at least one letter and one number and may contain special characters" because I am looking for at least one number but zero or more letters. I also do not want to return strings that contain any special characters including spaces.
This is a typical password validation with your requirements.
Note that this will also match 8-13 digits as well (but it is requested).
Ten million + 1 (and counting) happy customers ..
^(?=.*\d)[a-zA-Z\d]{8,13}$
Explained
^ # Beginning of string
(?= .* \d ) # Lookahead for a digit
[a-zA-Z\d]{8,13} # Consume 8 to 13 alphanum characters
$ # End of string
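For anyone who wants to check the behaviour, here is a minimal Python sketch of the pattern above. The helper name is_case_number is an assumption for illustration:

```python
import re

# Lookahead requires a digit somewhere; the body allows only
# 8 to 13 ASCII letters or digits.
CASE_NUMBER = re.compile(r"^(?=.*\d)[a-zA-Z\d]{8,13}$")


def is_case_number(token: str) -> bool:
    """Return True if the token is 8-13 alphanumerics with at least one digit."""
    return CASE_NUMBER.fullmatch(token) is not None


if __name__ == "__main__":
    for sample in ["abcd1234", "12345678", "abcdefgh", "abc 1234", "abc1234"]:
        print(repr(sample), is_case_number(sample))
```

The all-letter sample is rejected because the \d in the lookahead still has to match somewhere, even though .* may match nothing.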
I've seen the answer above (by sln) all over the internet. It does require a digit: the .* in the positive lookahead may match zero characters, but the \d after it must still match somewhere, so a string of 8 to 13 letters with no digit is rejected.
What it does not do is require both an uppercase and a lowercase letter.
In order to match at least 1 digit, 1 A-Z and 1 a-z in a password that's at least 8 characters long, you'll need something like this:
(?=.{1,7}\d)(?=.{1,7}[a-z])(?=.{1,7}[A-Z])[a-zA-Z\d]{8,13}
it uses 3 lookaheads:
(?=.{1,7}\d)
(?=.{1,7}[a-z])
(?=.{1,7}[A-Z])
Each time, it looks for the target (e.g. the first digit) but allows 1 to 7 occurrences of any character before it.
Then it will match 8 to 13 alphanumeric characters.
NOTE to Powershell users:
Use a search group to be able to extract a result
$password = [regex]::match(${string-to-search},'(?=.{1,7}\d)(?=.{1,7}[a-z])(?=.{1,7}[A-Z])([a-zA-Z\d]{8,13})').Groups[1].Value

Regex - matching while ignoring some characters

I am trying to write a regex to match a sequence of numbers that is 5 digits long or over, but I ignore any spaces, dashes, parens, or hashes when doing that analysis. Here's what I have so far.
(\d|\(|\)|\s|#|-){5,}
The problem with this is that this will match any sequence of 5 characters including those characters I want to ignore, so something like "#123 " would match. While I do want to ignore the # and space character, I still need the number itself to be 5 digits or more in order to qualify as a match.
To be clear, these would match:
1-2-3-4-5
123 45
2(134) 5
Bonus points if the matching begins and ends with a number rather than with one of those "special characters" I am excluding.
Any tips for doing this kind of matching?
If I understood requirements right you can use:
^\d(?:[()\s#-]*\d){4,}$
It always matches a digit at start. Then it is followed by 4 or more of a non-capturing group i.e. (?:[()\s#-]*\d) which means 0 or more of any listed special character followed by a digit.
So just repeat a digit, followed by any other sequence of allowed characters 5 or more times:
^(\d[()\s#-]*){5,}$
You can ensure it ends on a digit if you subtract one of the repetitions and add an explicit digit at the end:
^(\d[()\s#-]*){4,}\d$
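A short Python sketch comparing two of the patterns above against the examples from the question. The constant names are invented for illustration:

```python
import re

# Must start and end with a digit; at least 5 digits in total.
DIGIT_AT_BOTH_ENDS = re.compile(r"^\d(?:[()\s#-]*\d){4,}$")
# Must start with a digit, but ignorable characters may trail the last one.
TRAILING_FILLER_OK = re.compile(r"^(\d[()\s#-]*){5,}$")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["1-2-3-4-5", "123 45", "2(134) 5", "#123 ", "12345#"]:
        print(repr(sample),
              accepts(DIGIT_AT_BOTH_ENDS, sample),
              accepts(TRAILING_FILLER_OK, sample))
```

The only sample on which the two differ is "12345#", which ends in an ignorable character.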
You can match non-digits with \D, so it would be something like:
(\d\D*){5,}

How can I recognize a valid barcode using regex?

I have a barcode of the format 123456########. That is, the first 6 digits are always the same followed by 8 digits.
How would I check that a variable matches that format?
You haven't specified a language, but regexp. syntax is relatively uniform across implementations, so something like the following should work: 123456\d{8}
\d Indicates numeric characters and is typically equivalent to the set [0-9].
{8} indicates repetition of the preceding character set precisely eight times.
Depending on how the input is coming in, you may want to anchor the regexp. thusly:
^123456\d{8}$
Where ^ matches the beginning of the line or string and $ matches the end. Alternatively, you may wish to use word boundaries, to ensure that your bar-code strings are properly separated:
\b123456\d{8}\b
Where \b matches the empty string but only at the edges of a word (normally defined as a sequence consisting exclusively of alphanumeric characters plus the underscore, but this can be locale-dependent).
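Since no language was specified, here is one possible rendering in Python. The prefix 123456 is the placeholder from the question, and the function names are hypothetical:

```python
import re

# Whole-string check: the fixed prefix followed by exactly eight digits.
EXACT = re.compile(r"^123456\d{8}$")
# Word-boundary check for barcodes embedded in longer text.
EMBEDDED = re.compile(r"\b123456\d{8}\b")


def is_barcode(value: str) -> bool:
    """Return True if the whole value is a barcode in the expected format."""
    return EXACT.fullmatch(value) is not None


def find_barcodes(text: str) -> list:
    """Return every properly separated barcode found in the text."""
    return EMBEDDED.findall(text)


if __name__ == "__main__":
    print(is_barcode("12345612345678"))
    print(find_barcodes("scan 12345600000001 and 123456000000029 then x12345600000003"))
```

In the second call only the first candidate is returned: the second has one digit too many, and the third is glued to a preceding letter, so neither sits between word boundaries.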
123456\d{8}
123456 # Literals
\d # Match a digit
{8} # 8 times
You can change the {8} to any number of digits depending on how many are after your static ones.
Regexr will let you try out the regex.
123456\d{8}
should do it. This breaks down to:
123456 - the fixed bit; obviously substitute this for whatever your fixed bit is, and remember to escape any regex special characters in here, although with just numbers you should be fine
\d - a digit
{8} - the number of times the previous element must be repeated, 8 in this case.
The {8} can take two numbers if you have a minimum and a maximum, so you could use {6,8} if the previous element had to be repeated between 6 and 8 times.
The way you describe it, it's just
^123456[0-9]{8}$
...where you'd replace 123456 with your 6 known digits. I'm using [0-9] instead of \d because I don't know what flavor of regex you're using, and \d allows non-Arabic numerals in some flavors (if that concerns you).

Can this Regex be improved?

I have a regex to match a user-entered ID which has the basic format [a-zA-Z]{2}[\d]{8}, but the kicker is that a space can be placed between any of the letters or digits in the ID, so my regex looks like this:
[A-Za-z]+[\s]*[A-Za-z]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*
Which is obviously an abomination and should be killed with fire, can this be improved upon?
All of the following are valid inputs
a b 1 2 2 3 4 5 5 6
ab12345678
ab 12345678
Your regex does not comply with your specification. Can there be 2 or more letters before the digits? Exactly 8 digits, or 8 digits or more?
Try
([a-zA-Z]\s*){2}(\d\s*){8}
If there can only be one space between each character:
([a-zA-Z]\s?){2}(\d\s?){8}
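A small Python sketch of the two suggestions, using fullmatch so the whole input has to fit. The constant and function names are made up:

```python
import re

# Two letters and eight digits, any amount of whitespace after each character.
ANY_SPACES = re.compile(r"([a-zA-Z]\s*){2}(\d\s*){8}")
# Same, but at most one whitespace character after each character.
ONE_SPACE = re.compile(r"([a-zA-Z]\s?){2}(\d\s?){8}")


def is_valid_id(text: str, pattern: re.Pattern = ANY_SPACES) -> bool:
    """Return True if the whole string matches the given ID pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["a b 1 2 2 3 4 5 5 6", "ab12345678", "ab 12345678", "ab  12345678"]:
        print(repr(sample), is_valid_id(sample), is_valid_id(sample, ONE_SPACE))
```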
Don't ever use \d and \s unless you know EXACTLY where you are going...
\d will match 09E6 ০ BENGALI DIGIT ZERO (the ০ is your digit :-) ). For example read http://msdn.microsoft.com/en-us/library/w1c0s6bb.aspx
\s will match more types of strange spaces (and the tab character) than you can count, and I'm not kidding. http://msdn.microsoft.com/en-us/library/t809ektx.aspx
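The links above are about .NET, but the same effect can be seen in other flavors. As an illustration, assuming Python's re module with str patterns:

```python
import re

BENGALI_ZERO = "\u09e6"   # BENGALI DIGIT ZERO
NO_BREAK_SPACE = "\u00a0"

# By default \d and \s are Unicode-aware for str patterns.
unicode_digit = re.fullmatch(r"\d", BENGALI_ZERO) is not None
unicode_space = re.fullmatch(r"\s", NO_BREAK_SPACE) is not None

# re.ASCII restricts \d and \s to their ASCII-only meanings.
ascii_digit = re.fullmatch(r"\d", BENGALI_ZERO, flags=re.ASCII) is not None
ascii_space = re.fullmatch(r"\s", NO_BREAK_SPACE, flags=re.ASCII) is not None

# An explicit class is unambiguous in any flavor.
explicit_digit = re.fullmatch(r"[0-9]", BENGALI_ZERO) is not None

if __name__ == "__main__":
    print(unicode_digit, unicode_space, ascii_digit, ascii_space, explicit_digit)
```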
Paradoxically, by using [a-zA-Z] you are limiting your users quite a lot... No àèéìòù, nor the Turkish ı and İ (the first is a lowercase i without the dot, the second is the uppercase version of the dotted i) http://en.wikipedia.org/wiki/Dotted_and_dotless_I .
Perhaps you could use (\p{L}\p{M}*) (with brackets) instead of [A-Za-z] (all the letters plus the combining marks). You have to add an * or a + AFTER the close bracket. The one expression is for a single letter PLUS its combining marks.
Oh... and you can use one of the other suggestions as a basis for the regex :-)
[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*[\d]+[\s]*
can be replaced with...
\s*(?:\d+\s*){8}
(Also, you can just write \s, rather than [\s], and \d rather than [\d] - the brackets are redundant if you're only specifying a single backslash character class.)
Edit Since there seems to be some confusion about what part of the original regex is being replaced, here's the entire expression after replacement:
[A-Za-z]+\s*[A-Za-z]+\s*(?:\d+\s*){8}
(?:[A-Za-z]+\s*){2}(?:\d+\s*){8}
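As a sanity check, a Python sketch can compare the original expression with the compact one on a few inputs. The sample list is invented for illustration and agreement on it is not a proof of equivalence:

```python
import re

# The original expression from the question, built by repetition.
ORIGINAL = re.compile(r"[A-Za-z]+[\s]*[A-Za-z]+[\s]*" + r"[\d]+[\s]*" * 8)
# The compact replacement.
COMPACT = re.compile(r"(?:[A-Za-z]+\s*){2}(?:\d+\s*){8}")

SAMPLES = [
    "a b 1 2 2 3 4 5 5 6",
    "ab12345678",
    "ab 12345678",
    "abc123456789",  # accepted by both: the + quantifiers allow extra characters
    "a12345678",     # rejected by both: only one letter
    "ab1234567",     # rejected by both: only seven digits
]


def verdicts(pattern: re.Pattern) -> list:
    """Return one boolean per sample: does the whole sample match?"""
    return [pattern.fullmatch(s) is not None for s in SAMPLES]


if __name__ == "__main__":
    print(verdicts(ORIGINAL))
    print(verdicts(COMPACT))
```

Both versions keep the looseness of the original: because of the + quantifiers they accept more than two letters and more than eight digits.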