Regular expression that matches strings containing exactly 10 digits - regex

How would I write a regular expression (Python or Java) that matches strings that contain exactly 10 digits (0-9)? I don't care if the string contains any other characters, and the 10 digits do not have to be consecutive. For example, I want the following strings to match: "2fdf675&*85y989$%#0" and "3h2j9f88__+=123..54". Any ideas on how to go about doing this?

Just try with:
^\D*(?:\d\D*){10}$

It will probably be clearer if you don't try to cram everything into one regex. I don't know Java or Python, but here's what you could do in Perl:
$str =~ s/\D//g; # Remove all non-digit characters
if ( length($str) == 10 ) ... # Must be left with 10 characters.

Here's a regex and Java code:
if (str.matches("(\\D*\\d){10}\\D*")) // matches() must match the whole string, so ^ and $ are implied
But in Java, here's a way that is easier to comprehend:
if (str.replaceAll("\\D", "").length() == 10)
This just removes all non-digits and checks the length of what's left.
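Since the question also mentions Python, both approaches might look like this there. This is only a sketch: the function names are made up for illustration, and [0-9] is used instead of \d because Python's \d also matches non-ASCII digits in str patterns.

```python
import re

# [0-9] rather than \d: for str patterns, Python's \d also matches non-ASCII digits.
TEN_DIGITS = re.compile(r"[^0-9]*(?:[0-9][^0-9]*){10}")

def has_ten_digits_regex(s: str) -> bool:
    # fullmatch must consume the whole string, like Java's String.matches
    return TEN_DIGITS.fullmatch(s) is not None

def has_ten_digits_count(s: str) -> bool:
    # Count the ASCII digits directly instead of stripping and measuring
    return sum(c in "0123456789" for c in s) == 10

for sample in ["2fdf675&*85y989$%#0", "3h2j9f88__+=123..54", "12345", "12345678901"]:
    print(sample, has_ten_digits_regex(sample), has_ten_digits_count(sample))
```

The first two samples are the strings from the question; the last two have 5 and 11 digits and should be rejected by both functions.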

Related

how to exclude digits using regular expression in VBA

Hello, I need to exclude the numbers 890000 and 890001;
890002 to 899999 are acceptable.
Is it possible to do this using a regular expression?
No need for regex.
If Value >= 890002 And Value <= 899999 Then
' Accept
End If
OK, if you insist on using regex (maybe for learning purposes):
In this simple case it is actually easier to exclude those two numbers and match the rest:
^89(?!000[01])\d{4}$
Explanation:
^ match from start of text
89 match 89
(?!000[01]) negative lookahead for three zeros followed by one of the characters in the character class (0 or 1). If this doesn't block the match:
\d{4} match 4 digits
$ match end of text.
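To illustrate the lookahead outside VBA, here is a small Python sketch that compares it with the plain range check. The lookahead uses [01] so that it rejects exactly 890000 and 890001, the two values the question excludes; the function names are hypothetical.

```python
import re

# Accept 890002..899999; reject 890000 and 890001 (the values excluded in the question).
PATTERN = re.compile(r"89(?!000[01])[0-9]{4}")

def accept_regex(text: str) -> bool:
    # fullmatch plays the role of the ^ and $ anchors
    return PATTERN.fullmatch(text) is not None

def accept_range(value: int) -> bool:
    return 890002 <= value <= 899999

# Compare both checks around the edges of the accepted range.
for value in (889999, 890000, 890001, 890002, 899999, 900000):
    print(value, accept_regex(str(value)), accept_range(value))
```

Both checks should agree on every value, which is a quick way to convince yourself the regex encodes the same rule as the comparison.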

Regular expression to get everything between two characters/strings

I have been trying to use regular expression to extract data from the following strings
LTE_LTE_FSD9167__P_Airport1
I want to extract the 7 digit sitecode(FSD9167) from the above string.
RUR1251__S_KhooNaiWala
I want to extract 7 digit sitecode(RUR1251) from above string.
For the LTE_LTE case I wrote LTE_LTE_([^_;]+).* but it matches the whole string instead of only the required text.
The pattern I see is three letters followed by four numbers, so:
\w{3}\d{4}
Use () to capture the pattern:
(\w{3}\d{4})
PHP:
$re = '/(\w{3}\d{4})/m';
JavaScript:
const regex = /(\w{3}\d{4})/gm;
Use https://regex101.com/ to learn the explanation.
You can use something like this:
^(?:LTE_LTE_)?(\S{7})\S*$ /gm
This captures the seven non-whitespace characters either at the beginning (case 2) or just after LTE_LTE_
You did not provide any rule about what the code can look like. I noticed that both codes in your examples have 3 letters followed by 4 digits. I made the rule more generic: at least 2 letters followed by at least 3 digits.
The regex is:
[a-zA-Z]{2,}\d{3,}
As you want to match only these 2 strings, use:
(?<![A-Z0-9])[A-Z0-9]{7}(?![A-Z0-9])
Explanation:
(?<![A-Z0-9]) # negative lookbehind, make sure we haven't alphanum before
[A-Z0-9]{7} # 7 alphanumerics
(?![A-Z0-9]) # negative lookahead, make sure we haven't alphanum after
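As a sketch of how the extraction could look in Python, the pattern below combines the lookarounds with the "three letters, four digits" shape seen in the two sample strings. That shape is an assumption taken from the examples, not a rule stated in the question, and the function name is made up.

```python
import re
from typing import Optional

# Assumption: a site code is three uppercase letters followed by four digits,
# not directly adjacent to another uppercase letter or digit.
SITE_CODE = re.compile(r"(?<![A-Z0-9])[A-Z]{3}[0-9]{4}(?![A-Z0-9])")

def extract_site_code(name: str) -> Optional[str]:
    match = SITE_CODE.search(name)
    return match.group(0) if match else None

print(extract_site_code("LTE_LTE_FSD9167__P_Airport1"))  # FSD9167
print(extract_site_code("RUR1251__S_KhooNaiWala"))       # RUR1251
```

If real site codes can have a different shape, the character classes and counts would need to be adjusted accordingly.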

Finding all the ten different digits in a random string

Sorry if this is answered somewhere, but I couldn't find it.
I need to write a regexp that matches strings that contain each of the digits from 0 to 9 exactly once. For example:
e8v5i0l9ny3hw1f24z7q6
You can see that numbers [0-9] are present exactly once and in random order. (Letters are present also exactly once, but that is an advanced quest...) It must not match if a digit is missing or if any digit is present more than one time.
So what would be the best regexp to match on strings like these? I am still learning regex and couldn't find a solution. It is PCRE, running in perl environment, but I cannot use perl, only the regex part of it. Sorry for my english and thank you in advance.
What about this pattern to verify the string:
^\D*(?>(\d)(?!.*\1)\D*){10}$
^\D* starts with any number of non-digit characters
(?>(\d)(?!.*\1)\D*){10} followed by 10 repetitions of: a digit (captured in the first capturing group), a negative lookahead that fails if the same digit appears again later in the string, and any number of non-digits \D. Ten digits, none of which recurs later, must be ten different digits, i.e. each of [0-9] exactly once.
\d is a shorthand for [0-9], \D is the negation [^0-9]
If you then need the digit string, just extract the digits, e.g. in PHP:
$str = "e8v5i0l9ny3hw1f24z7q6";
$pattern = '/^\D*(?>(\d)(?!.*\1)\D*){10}$/';
if(preg_match($pattern, $str)) {
    echo preg_replace('/\D+/', "", $str);
}
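For comparison, the same verification idea could be sketched in Python as follows. The atomic group (?>...) is replaced by a plain non-capturing group, since Python's re module only supports atomic groups from version 3.11; the function name is hypothetical.

```python
import re

# Each of the 10 digits must not appear again later in the string.
# re.DOTALL lets the ".*" in the lookahead cross newlines as well.
ALL_DIGITS_ONCE = re.compile(r"[^0-9]*(?:([0-9])(?!.*\1)[^0-9]*){10}", re.DOTALL)

def has_each_digit_once(s: str) -> bool:
    # fullmatch replaces the ^ and $ anchors
    return ALL_DIGITS_ONCE.fullmatch(s) is not None

print(has_each_digit_once("e8v5i0l9ny3hw1f24z7q6"))  # True
print(has_each_digit_once("e8v5i0l9ny3hw1f24z7q8"))  # False: 8 twice, 6 missing
```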
It is easy to create a regular expression that matches one specific permutation of the digits and ignores the other characters. E.g.
^[^\d]*0[^\d]*1[^\d]*2[^\d]*3[^\d]*4[^\d]*5[^\d]*6[^\d]*7[^\d]*8[^\d]*9[^\d]*$
You can combine the 10! expressions, one for every possible permutation, with |
Although this is completely impractical, it shows that such a regular expression (without lookahead) is indeed possible.
However, this is something that is much better done without regular expression matching.
$s = "e8v5i0l9ny3hw1f24z7q6";
$s = preg_replace('/\D/', '', $s); // remove non-digits
if (strlen($s) == 10) { // do we have exactly 10 digits?
    if (!preg_match('/(\d).*\1/', $s)) { // no digit occurs twice, adjacent or not
        echo "String has 10 different digits";
    }
}
http://ideone.com/eY4eGx
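The same strip-and-check idea without any regex could be sketched in Python like this. Duplicates are detected with a set, so it does not matter where in the string they occur; the function name is made up for illustration.

```python
def digits_are_permutation(s: str) -> bool:
    # Keep only the ASCII digits, in order of appearance
    digits = [c for c in s if c in "0123456789"]
    # Ten digits with no duplicates must be exactly 0 through 9
    return len(digits) == 10 and len(set(digits)) == 10

print(digits_are_permutation("e8v5i0l9ny3hw1f24z7q6"))  # True
print(digits_are_permutation("1213456789"))             # False: 1 occurs twice
```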

Limit number of alpha characters in regular expression

I've been struggling to figure out how to best do this regular expression.
Here are my requirements:
Up to 8 characters
Can only be alphanumeric
Can only contain up to three alpha characters [a-z] (zero alpha characters are valid too)
Any ideas would be appreciated.
This is what I've got so far, but it only looks for contiguous letter characters:
^(\d|([A-Za-z])(?!([A-Za-z]{3,}))){0,8}$
I'd write it like this:
^(?=[a-z0-9]{0,8}$)(?:\d*[a-z]){0,3}\d*$
It has two parts:
(?=[a-z0-9]{0,8}$)
Looks ahead and matches up to 8 alphanumerics to the end of the string
(?:\d*[a-z]){0,3}\d*$
Essentially allowing injection of up to 3 [a-z] among \d*
Tested on rubular.com:
12345678 // matches
123456789
#(#*#$
12345 // matches
abc12345 // matches
abcd1234
12a34b5c // matches
12ab34cd
123a456 // matches
Alternatives
I do think regex is the best solution for this, but since the string is short, it would be a lot more readable to do this in two steps as follows:
It must match [a-z0-9]{0,8}
Then, delete all \d
The length must now be <= 3
Do you have to do this in exactly one regular expression? It is possible to do that with standard regular expressions, but the regular expression will be rather long and complicated. You can do better with some of the Perl extensions, but depending on what language you're using, they may or may not be supported. The cleanest solution is probably to check whether the string matches:
^[A-Za-z0-9]{0,8}$
but doesn't match:
([A-Za-z].*){4}
i.e. it's an alphanumeric string of up to 8 characters (first regular expression) that doesn't contain 4 or more alpha characters, possibly separated by other characters (second regular expression).
/^(?!(?:\d*[a-z]){4})[a-z0-9]{0,8}$/i
Explanation:
[a-z0-9]{0,8} matches up to 8 alphanumerics.
The negative lookahead is placed first, so it is checked before any characters are consumed.
The (?:\d*[a-z]) matches 1 alphabetic character, optionally preceded by digits. The {4} makes the count 4. So this stops the regex from matching when 4 alphabetic characters can be found (i.e. it limits the count to ≤3).
It's better not to exploit regex like this. Suppose you use this solution, are you sure you will know what the code is doing when you revisit it 1 year later? A clearer way is just check rule-by-rule, e.g.
if len(theText) <= 8 and theText.isalnum():
    if sum(1 for c in theText if c.isalpha()) <= 3:
        pass  # valid
The easiest way to do this would be in multiple steps:
Test the string against /^[a-z0-9]{0,8}$/i -- the string is up to 8 characters and only alphanumeric
Make a copy of the string, delete all non-alphabetic characters
See if the resulting string has a length of 3 or less.
If you want to do it in one regular expression, you can use something like:
/^(?=\d*(?:[a-z]?\d*){0,3}$)[a-z0-9]{0,8}$/i
This looks for an alphanumeric string of length 0 to 8 (^[a-z0-9]{0,8}$), but first uses a lookahead ((?=\d*(?:[a-z]?\d*){0,3}$)) to make sure that the string has at most 3 alphabetic characters.
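To see that the approaches in this thread agree, here is a Python sketch that runs three of them over a few sample strings. The function names are hypothetical, and explicit [A-Za-z0-9] classes are used instead of a case-insensitive flag to keep the matching ASCII-only.

```python
import re

ALNUM_UP_TO_8 = re.compile(r"[A-Za-z0-9]{0,8}")
FOUR_LETTERS = re.compile(r"(?:[A-Za-z].*){4}")
LOOKAHEAD = re.compile(r"(?!(?:[0-9]*[A-Za-z]){4})[A-Za-z0-9]{0,8}")

def valid_two_regexes(s: str) -> bool:
    # Must match the first pattern and must not contain four letters
    return ALNUM_UP_TO_8.fullmatch(s) is not None and FOUR_LETTERS.search(s) is None

def valid_lookahead(s: str) -> bool:
    # Single pattern: the negative lookahead rejects strings with four letters
    return LOOKAHEAD.fullmatch(s) is not None

def valid_by_counting(s: str) -> bool:
    # Validate the shape first, then count the letters
    if ALNUM_UP_TO_8.fullmatch(s) is None:
        return False
    return sum(c.isalpha() for c in s) <= 3

samples = ["12345678", "123456789", "12345", "abc12345", "abcd1234",
           "12a34b5c", "12ab34cd", "123a456", ""]
for s in samples:
    print(repr(s), valid_two_regexes(s), valid_lookahead(s), valid_by_counting(s))
```

All three functions should print the same verdict for every sample; the empty string is accepted because zero characters and zero letters satisfy both limits.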

Looking for a regex - 8 char min w/ 1 num and 1 char

I'm looking for some help creating a regex that requires 8 char (at a minimum) along w/ 1 number and 1 char (not special char).
example: a1234567 is valid but 12345678 is not
Any help for a regex newb?
EDIT:
Thanks for the quick replies- the implementation that worked in VB is shown below
Dim ValidPassword As Boolean = Regex.IsMatch(Password, "^(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}$")
something like
^(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}$
would work
dissected:
^ the beginning of the string
(?=.*[0-9]) look ahead and make sure that there is at least 1 digit
(?=.*[a-zA-Z]) look ahead and make sure there is at least 1 letter
\w{8,} actually match the 8+ characters
$ the end of the string
Edit: if you want extra characters (that don't count for the 1 letter requirement) use
^(?=.*[0-9])(?=.*[a-zA-Z]).{8,}$
this will allow for any character besides newline to be used
If you only want certain characters allowed, replace \w in the first regex with a character class such as [A-Za-z0-9##$%^&*], adding your choice of symbols
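A rough Python equivalent of the first pattern is sketched below. The re.ASCII flag is an addition of this sketch: without it, Python's \w also matches non-ASCII letters and digits. The function name is made up for illustration.

```python
import re

# re.ASCII restricts \w to [A-Za-z0-9_]; without it \w also matches non-ASCII word characters.
PASSWORD = re.compile(r"(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}", re.ASCII)

def is_valid_password(password: str) -> bool:
    # fullmatch takes the place of the ^ and $ anchors
    return PASSWORD.fullmatch(password) is not None

print(is_valid_password("a1234567"))  # True
print(is_valid_password("12345678"))  # False: no letter
```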
^(?![0-9_]*$)(?![a-zA-Z_]*$)\w{8,}$
You really need an expression with three regexes :)
/\w{8}/
gives minimum 8 A-Z, a-z, 0-9 and _ chars
/\d/
finds a single digit
/[A-Za-z]/
finds a single letter.
So, in Perl:
$string =~ /\w{8}/ and $string =~ /\d/ and $string =~ /[A-Za-z]/
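The same "one check per rule" idea could be sketched in Python as follows. One deliberate difference from the Perl line: the length check here uses fullmatch, so the whole string must consist of word characters, whereas an unanchored /\w{8}/ accepts any string that merely contains 8 of them in a row. The function name is hypothetical.

```python
import re

def passes_all_checks(s: str) -> bool:
    return (
        re.fullmatch(r"[A-Za-z0-9_]{8,}", s) is not None  # at least 8 word characters, nothing else
        and re.search(r"[0-9]", s) is not None            # at least one digit
        and re.search(r"[A-Za-z]", s) is not None         # at least one letter
    )

print(passes_all_checks("a1234567"))  # True
print(passes_all_checks("12345678"))  # False: no letter
```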
Try this regular expression with positive look-ahead assertion:
(?=[a-zA-Z]*[0-9])(?=[0-9]*[a-zA-Z])^[0-9a-zA-Z]{8,}$
The parts are:
(?=[a-zA-Z]*[0-9]) checks for at least one character of 0-9
(?=[0-9]*[a-zA-Z]) checks for at least one character of the set a-z, A-Z
^[0-9a-zA-Z]{8,}$ checks for the length of at least 8 occurrences of 0-9, a-z, A-Z.
Or with just the basic syntax:
^([0-9][a-zA-Z][0-9a-zA-Z]{6,}|[0-9]{2}[a-zA-Z][0-9a-zA-Z]{5,}|[0-9]{3}[a-zA-Z][0-9a-zA-Z]{4,}|[0-9]{4}[a-zA-Z][0-9a-zA-Z]{3,}|[0-9]{5}[a-zA-Z][0-9a-zA-Z]{2,}|[0-9]{6}[a-zA-Z][0-9a-zA-Z]+|[0-9]{7,}[a-zA-Z][0-9a-zA-Z]*|[a-zA-Z][0-9][0-9a-zA-Z]{6,}|[a-zA-Z]{2}[0-9][0-9a-zA-Z]{5,}|[a-zA-Z]{3}[0-9][0-9a-zA-Z]{4,}|[a-zA-Z]{4}[0-9][0-9a-zA-Z]{3,}|[a-zA-Z]{5}[0-9][0-9a-zA-Z]{2,}|[a-zA-Z]{6}[0-9][0-9a-zA-Z]+|[a-zA-Z]{7,}[0-9][0-9a-zA-Z]*)$