Regular Expression matching a string of specified length - regex

I want to write a regular expression that matches all 9-letter words in a word list that contain only one vowel; for example, "strengths".
The best I've achieved is:
^(([^aeiou]{8}[aeiou])|
([^aeiou]{7}[aeiou][^aeiou]{1})|
([^aeiou]{6}[aeiou][^aeiou]{2})|
([^aeiou]{5}[aeiou][^aeiou]{3})|
([^aeiou]{4}[aeiou][^aeiou]{4})|
([^aeiou]{3}[aeiou][^aeiou]{5})|
([^aeiou]{2}[aeiou][^aeiou]{6})|
([^aeiou]{1}[aeiou][^aeiou]{7})|
([aeiou][^aeiou]{8}))$
(linebreaks added for readability). This works, but is there a way to do it with a more general pattern such as:
^[^aeiou]*[aeiou][^aeiou]*$
by adding a "fail if length is not 9" condition in some way?

Use a lookahead to limit the length and accept only letters:
^(?=[a-z]{9}$)[^aeiou]*[aeiou][^aeiou]*$

You may use
^(?=.{9}$)[b-df-hj-np-tv-z]*[aeiou][b-df-hj-np-tv-z]*$
See the regex demo. Add a case insensitive flag to also match uppercase letters.
*Details
^ - start of a string
(?=.{9}$) - a *positive lookahead that makes sure the string contains any 9 chars (else, fail the match)
[b-df-hj-np-tv-z]* - 0+ consonants
[aeiou] - a vowel
[b-df-hj-np-tv-z]* - 0+ consonants
$ - end of string.

adding a "fail if length is not 9" condition in some way?"
The best way is to not check the string length using a regex.
Use the functionality provided by the language you use to check the string length then use a regex to check that it contains only the accepted characters and it includes one wovel.
A simple example in JavaScript (rewriting it in any other language is an easy task):
var input = 'some input string';
if (input.length == 9 && input.match(/[aeiou]/)) {
console.log('Valid input (length is 9 and contains at least one wovel)');
} else {
console.log('Invalid input');
}

Related

Substitute every character after a certain position with a regex

I'm trying to use a regex replace each character after a given position (say, 3) with a placeholder character, for an arbitrary-length string (the output length should be the same as that of the input). I think a lookahead (lookbehind?) can do it, but I can't get it to work.
What I have right now is:
regex: /.(?=.{0,2}$)/
input string: 'hello there'
replace string: '_'
current output: 'hello th___' (last 3 substituted)
The output I'm looking for would be 'hel________' (everything but the first 3 substituted).
I'm doing this in Typescript, to replace some old javascript that is using ugly split/concatenate logic. However, I know how to make the regex calls, so the answer should be pretty language agnostic.
If you know the string is longer than given position n, the start-part can be optionally captured
(^.{3})?.
and replaced with e.g. $1_ (capture of first group and _). Won't work if string length is <= n.
See this demo at regex101
Another option is to use a lookehind as far as supported to check if preceded by n characters.
(?<=.{3}).
See other demo at regex101 (replace just with underscore) - String length does not matter here.
To mention in PHP/PCRE the start-part could simply be skipped like this: ^.{1,3}(*SKIP)(*F)|.

Extract text up to the n-th character in a string, but return the whole string if the character isn't present

I was looking at this question and the accepted answer gives this as a solution for the case when there are fewer than n characters in the string:
^(([^>]*>){4}|.*)
However, I have done a fiddle here, and it shows that this regex will just simply return the entire string all of the time.
This code:
SELECT
SUBSTRING(a FROM '^(([^>]*>){4}|.*)'),
a,
LENGTH(SUBSTRING(a FROM '^(([^>]*>){4}|.*)')),
LENGTH(a),
LENGTH(SUBSTRING(a FROM '^(([^>]*>){4}|.*)')) = LENGTH(a)
FROM s
WHERE LENGTH(SUBSTRING(a FROM '^(([^>]*>){4}|.*)')) = LENGTH(a) IS false;
after several runs returns no records - meaning that the regex is doing nothing.
Question:
I would like a regex which returns up to the fourth > character (not including it) OR the entire string if the string only contains 3 or fewer > characters. RTRIM() can always be used to trim the final > if not including it is too tricky - having an answer which gives both possibilities would help me to deepen my understanding of regexes!
This is not a duplicate - it's certainly related, but I'd like to correct the error in the original answer - and provide a correct answer of my own.
You can use
REGEXP_REPLACE(a, '^((?:[^>]*>){4}).*', '\1')
See the regex demo. Details:
^ - start of string
((?:[^>]*>){4}) - Group 1 (\1): four sequences of any chars other than > and then a > char
.* - the rest of the line.
Here is a test:
CREATE TABLE s
(
a TEXT
);
INSERT INTO s VALUES
('afsad>adfsaf>asfasf>afasdX>asdffs>asfdf>'),
('23433>433453>4>4559>455>3433>'),
('adfd>adafs>afadsf>');
SELECT REGEXP_REPLACE(a, '^((?:[^>]*>){4}).*', '\1') as Output FROM s;
Output:
You can repeat matching 0-3 times including the > using
^(?:[^>]*>){0,3}[^>]*
^ Start of string
(?:[^>]*>){0,3} Repeat 0 - 3 times matching any character except > and then match >
[^>]* Optionally match any char except >
See a regex demo.
If there should be at least a single > then the quantifier can be {1,3}

How to create "blocks" with Regex

For a project of mine, I want to create 'blocks' with Regex.
\xyz\yzx //wrong format
x\12 //wrong format
12\x //wrong format
\x12\x13\x14\x00\xff\xff //correct format
When using Regex101 to test my regular expressions, I came to this result:
([\\x(0-9A-Fa-f)])/gm
This leads to an incorrect output, because
12\x
Still gets detected as a correct string, though the order is wrong, it needs to be in the order specified below, and in no other order.
backslash x 0-9A-Fa-f 0-9A-Fa-f
Can anyone explain how that works and why it works in that way? Thanks in advance!
To match the \, folloed with x, followed with 2 hex chars, anywhere in the string, you need to use
\\x[0-9A-Fa-f]{2}
See the regex demo
To force it match all non-overlapping occurrences, use the specific modifiers (like /g in JavaScript/Perl) or specific functions in your programming language (Regex.Matches in .NET, or preg_match_all in PHP, etc.).
The ^(?:\\x[0-9A-Fa-f]{2})+$ regex validates a whole string that consists of the patterns like above. It happens due to the ^ (start of string) and $ (end of string) anchors. Note the (?:...)+ is a non-capturing group that can repeat in the string 1 or more times (due to + quantifier).
Some Java demo:
String s = "\\x12\\x13\\x14\\x00\\xff\\xff";
// Extract valid blocks
Pattern pattern = Pattern.compile("\\\\x[0-9A-Fa-f]{2}");
Matcher matcher = pattern.matcher(s);
List<String> res = new ArrayList<>();
while (matcher.find()){
res.add(matcher.group(0));
}
System.out.println(res); // => [\x12, \x13, \x14, \x00, \xff, \xff]
// Check if a string consists of valid "blocks" only
boolean isValid = s.matches("(?i)(?:\\\\x[a-f0-9]{2})+");
System.out.println(isValid); // => true
Note that we may shorten [a-zA-Z] to [a-z] if we add a case insensitive modifier (?i) to the start of the pattern, or just use \p{Alnum} that matches any alphanumeric char in a Java regex.
The String#matches method always anchors the regex by default, we do not need the leading ^ and trailing $ anchors when using the pattern inside it.

Regular expression for password (at least 2 digits and one special character and minimum length 8)

I have been searching for regular expression which accepts at least two digits and one special character and minimum password length is 8. So far I have done the following: [0-9a-zA-Z!##$%0-9]*[!##$%0-9]+[0-9a-zA-Z!##$%0-9]*
Something like this should do the trick.
^(?=(.*\d){2})(?=.*[a-zA-Z])(?=.*[!##$%])[0-9a-zA-Z!##$%]{8,}
(?=(.*\d){2}) - uses lookahead (?=) and says the password must contain at least 2 digits
(?=.*[a-zA-Z]) - uses lookahead and says the password must contain an alpha
(?=.*[!##$%]) - uses lookahead and says the password must contain 1 or more special characters which are defined
[0-9a-zA-Z!##$%] - dictates the allowed characters
{8,} - says the password must be at least 8 characters long
It might need a little tweaking e.g. specifying exactly which special characters you need but it should do the trick.
There is no reason, whatsoever, to implement all rules in a single regex.
Consider doing it like thus:
Pattern[] pwdrules = new Pattern[] {
Pattern.compile("........"), // at least 8 chars
Pattern.compile("\d.*\d"), // 2 digits
Pattern.compile("[-!"ยง$%&/()=?+*~#'_:.,;]") // 1 special char
}
String password = ......;
boolean passed = true;
for (Pattern p : pwdrules) {
Matcher m = p.matcher(password);
if (m.find()) continue;
System.err.println("Rule " + p + " violated.");
passed = false;
}
if (passed) { .. ok case.. }
else { .. not ok case ... }
This has the added benefit that passwort rules can be added, removed or changed without effort. They can even reside in some ressource file.
In addition, it is just more readable.
Try this one:
^(?=.*\d{2,})(?=.*[$-/:-?{-~!"^_`\[\]]{1,})(?=.*\w).{8,}$
Here's how it works shortly:
(?=.*\d{2,}) this part saying except at least 2 digits
(?=.*[$-/:-?{-~!"^_[]]{1,})` these are special characters, at least 1
(?=.*\w) and rest are any letters (equals to [A-Za-z0-9_])
.{8,}$ this one says at least 8 characters including all previous rules.
Below is map for current regexp (made with help of Regexper)
UPD
Regexp should look like this ^(?=(.*\d){2,})(?=.*[$-\/:-?{-~!"^_'\[\]]{1,})(?=.*\w).{8,}$
Check out comments for more details.
Try this regex. It uses lookahead to verified there is a least two digits and one of the special character listed by you.
^(?=.*?[0-9].*?[0-9])(?=.*[!##$%])[0-9a-zA-Z!##$%0-9]{8,}$
EXPLANATION
^ #Match start of line.
(?=.*?[0-9].*?[0-9]) #Look ahead and see if you can find at least two digits. Expression will fail if not.
(?=.*[!##$%]) #Look ahead and see if you can find at least one of the character in bracket []. Expression will fail if not.
[0-9a-zA-Z!##$%0-9]{8,} #Match at least 8 of the characters inside bracket [] to be successful.
$ # Match end of line.
Regular expressions define a structure on the string you're trying to match. Unless you define a spatial structure on your regex (e.g. at least two digits followed by a special char, followed by ...) you cannot use a regex to validate your string.
Try this : ^.*(?=.{8,15})(?=.*\d)(?=.*\d)[a-zA-Z0-9!##$%]+$
Please read below link for making password regular expression policy:-
Regex expression for password rules

Regex password issue [duplicate]

This question already has answers here:
Regexp Java for password validation
(17 answers)
Closed 8 years ago.
Working on a project where it requires me to have a password field using pattern attribute.
Not really done a lot of regex stuff and was wondering whether someone could help out.
The requirements for the field are as follows:
Can't contain the word "password"
Must be 8-12 in length
Must have 1 upper case
Must have 1 lower case
Must have 1 digit
Now, so far I have the following:
[^(password)].(?=.*[0-9])?=.*[a-zA-Z]).{8,12}
This doesn't work. We can get it so everything else works, apart from the password string being matched.
Thanks in advance,
Andy
EDIT: the method we've used now (nested in comments below) is:
^(?!.*(P|p)(A|a)(S|s)(S|s)(W|w)(O|o)(R|r)(D|d)).(?=.*\d)(?=.*[a-zA-Z]).{8,12}$
Thanks for the help
Use a series of anchored look aheads for the :must contain" criteria:
^(?!.*(?i)password)(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,12}$
I've added the "ignore case" switch (?i) to the "password" requirement, so it will reject `the word no matter the case of the letters.
This regex should to the job:
^(?!.*password)(?=.*\d)(?=.*[A-Z])(?=.*[a-z]).{8,12}$
it will match:
^ // from the beginning of the input
(?!.*password) // negative lookbehind whether the text contains password
(?=.*\d+) // positive lookahead for at least one digit
(?=.*[A-Z]+) // positive lookahead for at least one uppercase letter
(?=.*[a-z]+) // positive lookahead for at least one lowercase letter
.{8,12} // length of the input is between 8 and 12 characters
$
Link to phpliveregex
Try This:
^(?!.*password)(?=.*[A-Z])(?=.*[0-9])(?=.*[a-z]).{8,12}$
Explanation
^ Start anchor
(?=.*[A-Z]) Ensure string has one uppercase letter
(?=.*[0-9]) Ensure string has one digits.
(?=.*[a-z]) Ensure string has one lowercase letter
{8,12} Ensure string is of length 8 - 12.
(?!.*password) Not of word password
$ End anchor.
Try this way
Public void validation()
string xxxx = "Aa1qqqqqq";
if (xxxx.ToUpper().Contains("password") || !(xxxx.Length <= 12 & xxxx.Length >= 8) || !IsMatch(xxxx, "(?=.*[A-Z])") || !IsMatch(xxxx, "(?=.*[a-z])") || !IsMatch(xxxx, "(?=.*[0-9])"))
{
throw new System.ArgumentException("Your error message here", "Password");
}
}
public bool IsMatch(string input, string pattern)
{
Match xx = Regex.Match(input, pattern);
return xx.Success;
}