Sorry if this is answered somewhere, but I couldn't find it.
I need to write a regexp that matches strings containing each of the digits 0 to 9 exactly once. For example:
e8v5i0l9ny3hw1f24z7q6
You can see that the digits 0-9 are each present exactly once, in random order. (Each letter is also present exactly once, but that is an advanced quest...) It must not match if a digit is missing or if any digit is present more than once.
So what would be the best regexp to match strings like these? I am still learning regex and couldn't find a solution. It is PCRE, running in a Perl environment, but I cannot use Perl, only the regex part of it. Sorry for my English and thank you in advance.
What about this pattern to verify the string:
^\D*(?>(\d)(?!.*\1)\D*){10}$
^\D* starts with any number of non-digit characters
(?>(\d)(?!.*\1)\D*){10} followed by 10 repetitions of: a digit (captured in the first capturing group), a negative lookahead (?!.*\1) asserting that the captured digit does not occur again anywhere ahead, and then any number of \D non-digits. Ten digits, none of which occurs again later in the string, must be 10 different digits from [0-9].
\d is a shorthand for [0-9], \D is the negation [^0-9]
If you need the digit string, just extract the digits afterwards, e.g. in PHP:
$str = "e8v5i0l9ny3hw1f24z7q6";
$pattern = '/^\D*(?>(\d)(?!.*\1)\D*){10}$/';
if(preg_match($pattern, $str)) {
echo preg_replace('/\D+/', "", $str);
}
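For readers working in Python rather than PHP, the same check could be sketched roughly like this. It is only an illustration, not the answerer's code: the name has_each_digit_once is made up here, and the atomic group (?>...) is swapped for a plain non-capturing group (?:...) because Python's re module only supports atomic groups from version 3.11. Since \d and \D can never match the same character, the result should be the same.

```python
import re

# Ten repetitions of: one digit that does not occur again later, then non-digits.
# re.ASCII limits \d / \D to the ASCII digits 0-9; re.DOTALL lets the lookahead's
# ".*" scan across newlines as well.
PATTERN = re.compile(r"^\D*(?:(\d)(?!.*\1)\D*){10}$", re.ASCII | re.DOTALL)


def has_each_digit_once(s: str) -> bool:
    """Return True if s contains each of the digits 0-9 exactly once."""
    return PATTERN.search(s) is not None


print(has_each_digit_once("e8v5i0l9ny3hw1f24z7q6"))   # True: all ten digits, no repeats
print(has_each_digit_once("e8v5i0l9ny3hw1f24z7q"))    # False: the digit 6 is missing
print(has_each_digit_once("e8v5i0l9ny3hw1f24z7q66"))  # False: the digit 6 appears twice
```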
It is easy to create a regular expression that matches one specific permutation of the digits and ignores the other characters. E.g.
^[^\d]*0[^\d]*1[^\d]*2[^\d]*3[^\d]*4[^\d]*5[^\d]*6[^\d]*7[^\d]*8[^\d]*9[^\d]*$
You can combine 10! expressions for every possible permutation with |
Although this is completely impractical, it shows that such a regular expression (without lookahead) is indeed possible.
However this is something that is much better done without regular expression matching.
$s = "e8v5i0l9ny3hw1f24z7q6";
$s = preg_replace('/[^\d]/i', '', $s); //remove non digits
if(strlen($s) == 10) //do we have 10 digits ?
if (!preg_match('/(\d).*\1/', $s)) //if no digit occurs twice (adjacent or not)
echo "String has 10 different digits";
http://ideone.com/eY4eGx
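The same "strip, then count" idea could be sketched in Python as below. This is an assumption-laden illustration rather than a port of the PHP above: the function name is hypothetical, and a set is used to detect repeated digits instead of a second regex.

```python
import re


def has_ten_distinct_digits(s: str) -> bool:
    """Return True if s contains each of the digits 0-9 exactly once."""
    digits = re.sub(r"[^0-9]", "", s)  # remove everything that is not 0-9
    # Exactly ten digits, and all ten are different from each other.
    return len(digits) == 10 and len(set(digits)) == 10


print(has_ten_distinct_digits("e8v5i0l9ny3hw1f24z7q6"))  # True
print(has_ten_distinct_digits("0123456780"))             # False: 0 twice, 9 missing
```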
I want to find erroneous NCRs (numeric character references) that are missing the leading &# and repair them. The code point is a 4- or 5-digit decimal number. I wrote this PHP code:
function repl0($m) {
return '&#'.$m[0];
}
$s = "This is a good 23200; sample ship";
echo "input1= ".htmlentities($s)."<br>";
$out1=preg_replace_callback('/(?<!#)(\d{4,5};)/','repl0',$s);
echo 'output1 = '.htmlentities($out1).'<br>';
The output is:
input1= This is a good 23200; sample ship
output1 = This is a good 2&#3200; sample ship
According to the output, only one match happens.
What I want is to match '23200;' instead of '3200;'.
Matching should be greedy by default, so I expected it to capture the 5-digit number rather than the 4-digit one.
Do I misunderstand 'greedy' here? How can I get what I want?
The (?<!#)(\d{4,5};) pattern matches like this:
(?<!#) - matches a location that is not immediately preceded with #
(\d{4,5};) - then tries to match and consume four or five digits and a ; char immediately after these digits.
So, if you have #32000; as the input string, 3 cannot be the starting character of a match, as it is preceded by #, but 2 can, since it is not preceded by a # and it starts a run of four digits followed by a ; for the pattern to match.
What you need here is to curb the match on the left by adding a digit to the lookbehind,
(?<![#\d])(\d{4,5};)
With this trick, you ensure that the match cannot be immediately preceded with either # or a digit.
You say you finally used (?<!#)(?<!\d)\d{4,5};, and this pattern is functionally equivalent to the pattern above since the lookbehinds, as all lookarounds, "stand their ground", i.e. the regex index does not move when the lookaround patterns are matched. So, the check for a digit or a # char occurs at the same location in the string.
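A quick way to see the difference between the two lookbehinds is to run both on the same inputs. The sketch below uses Python's re module instead of PHP's preg_replace_callback; the names LOOSE, STRICT and add_prefix are made up for the illustration.

```python
import re

LOOSE = re.compile(r"(?<!#)(\d{4,5};)")       # only "#" is forbidden on the left
STRICT = re.compile(r"(?<![#\d])(\d{4,5};)")  # "#" and digits are forbidden on the left


def add_prefix(pattern: re.Pattern, text: str) -> str:
    """Prepend '&#' to every match of pattern in text."""
    return pattern.sub(r"&#\1", text)


# A bare number gets its prefix, using all five digits.
print(add_prefix(STRICT, "This is a good 23200; sample ship"))
# -> This is a good &#23200; sample ship

# An already valid reference: the loose pattern starts matching at the 2 ...
print(add_prefix(LOOSE, "&#32000;"))   # -> &#3&#2000;
# ... while the strict pattern leaves it alone.
print(add_prefix(STRICT, "&#32000;"))  # -> &#32000;
```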
How would I write a regular expression (Python or Java) that matches strings that contain exactly 10 digits (0-9)? I don't care if it contains any other characters, and the 10 digits do not have to be consecutive. For example, I want the following strings to match: "2fdf675&*85y989$%#0" and "3h2j9f88__+=123..54". Any ideas on how to go about doing this?
Just try with:
^\D*(?:\d\D*){10}$
It will probably be clearer if you don't try to cram everything into one regex. I don't know Java or Python, but here's what you could do in Perl:
$str =~ s/\D//g; # Remove all non-digit characters
if ( length($str) == 10 ) ... # Must be left with 10 characters.
Here's a regex and java code:
if (str.matches("(\\D*\\d){10}\\D*")) // FYI matches() must match the whole string, so ^ and $ are implied
But in java here's an easier way to comprehend:
if (str.replaceAll("\\D", "").length() == 10)
This just removes all non-digits and checks the length of what's left.
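Since the question also mentions Python, here is a rough Python sketch of both approaches from the answers above. The function names are hypothetical, and re.ASCII is an added assumption so that \d means only 0-9 (by default Python's \d also matches other Unicode digits).

```python
import re

# Same idea as ^\D*(?:\d\D*){10}$ ; fullmatch() anchors at both ends.
TEN_DIGITS = re.compile(r"\D*(?:\d\D*){10}", re.ASCII)


def has_exactly_ten_digits(s: str) -> bool:
    """Regex version: ten digits with any non-digits around and between them."""
    return TEN_DIGITS.fullmatch(s) is not None


def has_exactly_ten_digits_simple(s: str) -> bool:
    """Strip-and-count version: remove non-digits, check what is left."""
    return len(re.sub(r"[^0-9]", "", s)) == 10


for sample in ["2fdf675&*85y989$%#0", "3h2j9f88__+=123..54", "12345"]:
    print(sample, has_exactly_ten_digits(sample), has_exactly_ten_digits_simple(sample))
```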
I need to find all the one-digit numbers in a text.
My code:
$string = 'text 4 78 text 558 my.name@gmail.com 5 text 78998 text';
$pattern = '/ [\d]{1} /';
(result: 4 and 5)
Everything works perfectly; I just wanted to ask whether it is correct to use spaces.
Maybe there is some other way to distinguish one-digit number.
Thanks
First of all, [\d]{1} is equivalent to \d.
As for your question, it would be better to use a zero width assertion like a lookbehind/lookahead or word boundary (\b). Otherwise you will not match consecutive single digits because the leading space of the second digit will be matched as the trailing space of the first digit (and overlapping matches won't be found).
Here is how I would write this:
(?<!\S)\d(?!\S)
This means "match a digit only if there is not a non-whitespace character before it, and there is not a non-whitespace character after it".
I used the double negative like (?!\S) instead of (?=\s) so that you will also match single digits that are at the beginning or end of the string.
I prefer this over \b\d\b for your example because it looks like you really only want to match when the digit is surrounded by spaces, and \b\d\b would match the 4 and the 5 in a string like 192.168.4.5.
To allow punctuation at the end, you could use the following:
(?<!\S)\d(?![^\s.,?!])
Add any additional punctuation characters that you want to allow after the digit to the character class (inside of the square brackets, but make sure it is after the ^).
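To make the differences concrete, here is a small Python comparison of the patterns discussed in this answer. It is only a sketch: the sample strings are shortened or invented for the illustration, and Python's re.findall stands in for PHP's preg_match_all.

```python
import re

text = "text 4 78 text 558 5 text 78998 text"

# Lookarounds: a digit with no non-whitespace character on either side.
print(re.findall(r"(?<!\S)\d(?!\S)", text))            # ['4', '5']

# Literal spaces consume the separator, so the second of two
# neighbouring single digits is missed.
print(re.findall(r" \d ", "a 1 2 b"))                  # [' 1 ']
print(re.findall(r"(?<!\S)\d(?!\S)", "a 1 2 b"))       # ['1', '2']

# Word boundaries also fire next to dots.
print(re.findall(r"\b\d\b", "192.168.4.5"))            # ['4', '5']
print(re.findall(r"(?<!\S)\d(?!\S)", "192.168.4.5"))   # []

# Variant that tolerates trailing punctuation.
print(re.findall(r"(?<!\S)\d(?![^\s.,?!])", "I have 3, maybe 4."))  # ['3', '4']
```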
Use word boundaries. Note that the quantifier {1} is redundant (a single \d already matches exactly one digit), and so is the character class [], because it contains only one element.
\b\d\b
Search around word boundaries:
\b\d\b
As explained by the others, this will extract single digits meaning that some special characters might not be respected like "." in an ip address. To address that, see F.J and Mike Brant's answer(s).
It really depends on where the numbers can appear and whether you care if they are adjacent to other characters (like . at the end of a sentence). At the very least, I would use word boundaries so that you can get numbers at the beginning and end of the input string:
$pattern = '/\b\d\b/';
But you might consider punctuation at the end like:
$pattern = '/\b\d(\b|\.|\?|\!)/';
If one-digit numbers can be preceded or followed by characters other than digits (e.g., "a1 cat" or "Call agent 7, pronto!") use
(?<!\d)\d(?!\d)
The regular expression reads: match a digit (\d) that is neither preceded nor followed by a digit, (?<!\d) being a negative lookbehind and (?!\d) being a negative lookahead.
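As a rough check of that pattern, the following Python sketch runs it on the two example phrases plus two extra strings chosen for this illustration (the name ISOLATED_DIGIT is made up here).

```python
import re

# A digit with no digit directly before or after it.
ISOLATED_DIGIT = re.compile(r"(?<!\d)\d(?!\d)")

print(ISOLATED_DIGIT.findall("a1 cat"))                 # ['1']
print(ISOLATED_DIGIT.findall("Call agent 7, pronto!"))  # ['7']
print(ISOLATED_DIGIT.findall("room 42"))                # []
print(ISOLATED_DIGIT.findall("192.168.4.5"))            # ['4', '5']
```

Note that, like \b\d\b, this still picks up the 4 and the 5 of an IP address, so which pattern fits depends on what may surround the digit.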
I need to write a Perl regex to match numbers in a word with both letters and numbers.
Example: test123. I want to write a regex that matches only the number part and capture it
I am trying this \S*(\d+)\S* and it captures only the 3 but not 123.
Regex atoms will match as much as they can.
Initially, the first \S* matched "test123", but the regex engine had to backtrack to allow \d+ to match. The result is:
 +-------------------- Matches "test12"
 |     +-------------- Matches "3"
 |     |     +-------- Matches ""
 |     |     |
---  -----  ---
\S*  (\d+)  \S*
All you need is:
my ($num) = "test123" =~ /(\d+)/;
It'll try to match at position 0, then position 1, ... until it finds a digit, then it will match as many digits as it can.
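The backtracking described above can be observed directly. The sketch below uses Python's re module rather than Perl, and wraps the two \S* parts in extra capture groups (an addition for this illustration only) so that each part's share of the match is visible.

```python
import re

# Extra groups around \S* show what each piece ends up matching.
m = re.search(r"(\S*)(\d+)(\S*)", "test123")
print(m.groups())  # ('test12', '3', '')

# Without the surrounding \S*, the digits are captured as a whole.
print(re.search(r"(\d+)", "test123").group(1))  # 123
```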
The * quantifiers in your regex are greedy, which is why they also "eat" digits. Exactly what @Marc said, you don't need them.
perl -e '$_ = "qwe123qwe"; s/(\d+)/$numbers=$1/e; print $numbers . "\n";'
"something122320" =~ /(\d+)/ will return 122320; this is probably what you're trying to do ;)
\S matches any non-whitespace characters, including digits. You want \d+:
my ($number) = 'test123' =~ /(\d+)/;
Were it a case where a non-digit was required (say before, per your example), you could use the following non-greedy expressions:
/\w+?(\d+)/ or /\S+?(\d+)/
(The second one is more in tune with your \S* specification.)
Your expression matches any string containing one or more digits, and that may be what you want. It even matches a string of digits surrounded by spaces (" 123 "), because \S* can match zero characters at the boundary between the leading space and the first digit, and the same is true at the boundary between the '3' and the following space.
Chances are that you don't need any specification and capturing the first digits in the string is enough. But when it's not, it's good to know how to specify expected patterns.
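Here is a small Python sketch of how the non-greedy variants behave (Python is used only for illustration; the patterns are the ones from this answer). One thing it shows is that the lazy prefix still has to consume at least one character, and nothing stops that character from being a digit.

```python
import re

# The lazy prefix gives up characters as early as possible,
# so the whole run of digits ends up in the group.
print(re.search(r"\S+?(\d+)", "test123").group(1))  # 123
print(re.search(r"\w+?(\d+)", "test123").group(1))  # 123

# With nothing but digits, the prefix takes the first one.
print(re.search(r"\S+?(\d+)", "123").group(1))      # 23
```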
I think parentheses signify capture groups, which is exactly what you don't want. Remove them. You're looking for /\d+/ or /[0-9]+/
I'm looking for some help creating a regex that requires at least 8 characters, including at least 1 digit and 1 letter (no special characters).
example: a1234567 is valid but 12345678 is not
Any help for a regex newb?
EDIT:
Thanks for the quick replies- the implementation that worked in VB is shown below
Dim ValidPassword As Boolean = Regex.IsMatch(Password, "^(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}$")
something like
^(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}$
would work
dissected:
^ the beginning of the string
(?=.*[0-9]) look ahead and make sure that there is at least 1 digit
(?=.*[a-zA-Z]) look ahead and make sure there is at least 1 letter
\w{8,} actually match the 8+ characters
$ the end of the string
Edit: if you want extra characters (that don't count for the 1 letter requirement) use
^(?=.*[0-9])(?=.*[a-zA-Z]).{8,}$
this will allow for any character besides newline to be used
If you only want certain characters allowed, replace \w in the first regex with a character class such as [A-Za-z0-9@#$%^&*], using your choice of symbols.
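For anyone who wants to try the lookahead patterns outside VB, here is a rough Python sketch. The names WORD_ONLY, ANY_CHARS and is_valid are hypothetical; fullmatch() takes the place of ^ and $, and re.ASCII is an added assumption so that \w means only [A-Za-z0-9_].

```python
import re

# At least one digit, at least one letter, eight or more word characters.
WORD_ONLY = re.compile(r"(?=.*[0-9])(?=.*[a-zA-Z])\w{8,}", re.ASCII)
# Same requirements, but any character except newline is allowed.
ANY_CHARS = re.compile(r"(?=.*[0-9])(?=.*[a-zA-Z]).{8,}")


def is_valid(pattern: re.Pattern, password: str) -> bool:
    """Return True if the whole password satisfies the pattern."""
    return pattern.fullmatch(password) is not None


print(is_valid(WORD_ONLY, "a1234567"))   # True
print(is_valid(WORD_ONLY, "12345678"))   # False: no letter
print(is_valid(WORD_ONLY, "a123456"))    # False: only 7 characters
print(is_valid(WORD_ONLY, "a1234567!"))  # False: "!" is not a word character
print(is_valid(ANY_CHARS, "a1234567!"))  # True
```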
^(?![0-9]*$)(?![a-zA-Z_]*$)\w{8,}$
You really need an expression with three regexes :)
/\w{8}/
gives minimum 8 A-Z, a-z, 0-9 and _ chars
/\d/
finds a single digit
/[A-Za-z]/
finds a single letter.
So, in Perl:
$string =~ /\w{8}/ and $string =~ /\d/ and $string =~ /[A-Za-z]/
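The three-check approach translates to Python along these lines. This is a sketch with a made-up function name; as in the Perl above, the \w{8} check is unanchored, so it only asks for eight consecutive word characters somewhere in the string.

```python
import re


def is_valid_password(s: str) -> bool:
    """Three independent checks, all of which must succeed."""
    return bool(
        re.search(r"\w{8}", s, re.ASCII)   # eight consecutive word characters
        and re.search(r"\d", s, re.ASCII)  # at least one digit
        and re.search(r"[A-Za-z]", s)      # at least one letter
    )


print(is_valid_password("a1234567"))    # True
print(is_valid_password("12345678"))    # False: no letter
print(is_valid_password("a1234567!!"))  # True: nothing here rejects the "!!"
```

If other characters must be rejected as well, the first check could be anchored, for example as ^\w{8,}$.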
Try this regular expression with positive look-ahead assertion:
(?=[a-zA-Z]*[0-9])(?=[0-9]*[a-zA-Z])^[0-9a-zA-Z]{8,}$
The parts are:
(?=[a-zA-Z]*[0-9]) checks for at least one character of 0-9
(?=[0-9]*[a-zA-Z]) checks for at least one character of the set a-z, A-Z
^[0-9a-zA-Z]{8,}$ checks for the length of at least 8 occurrences of 0-9, a-z, A-Z.
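Or with just the basic syntax (note that this enumeration only covers strings whose leading run of digits or of letters is at most seven characters long, so for example 12345678a is rejected):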
^([0-9][a-zA-Z][0-9a-zA-Z]{6,}|[0-9]{2}[a-zA-Z][0-9a-zA-Z]{5,}|[0-9]{3}[a-zA-Z][0-9a-zA-Z]{4,}|[0-9]{4}[a-zA-Z][0-9a-zA-Z]{3,}|[0-9]{5}[a-zA-Z][0-9a-zA-Z]{2,}|[0-9]{6}[a-zA-Z][0-9a-zA-Z]+|[0-9]{7}[a-zA-Z][0-9a-zA-Z]*|[a-zA-Z][0-9][0-9a-zA-Z]{6,}|[a-zA-Z]{2}[0-9][0-9a-zA-Z]{5,}|[a-zA-Z]{3}[0-9][0-9a-zA-Z]{4,}|[a-zA-Z]{4}[0-9][0-9a-zA-Z]{3,}|[a-zA-Z]{5}[0-9][0-9a-zA-Z]{2,}|[a-zA-Z]{6}[0-9][0-9a-zA-Z]+|[a-zA-Z]{7}[0-9][0-9a-zA-Z]*)$