Regular expression to match repeated strings - regex

I am trying to make a regex that will
match if the string exclusively is constructed with strings from a set of strings.
not match if there is any other string in there.
examples for a set of strings that is ['xyz', 'a', 'b']:
'xyzab' == true
'xyzxyzabbb' == true
'aaabb' == true
'' == true
'd' == false
'aabbbbd' == false
'zxy' == false
I am URL matching :/

You can try this regex: ^(?:xyz|[ab])*$
var regex = new RegExp('^(?:xyz|[ab])*$');
var input = ['xyzab', 'xyzxyzabbb', 'aaabb', '', 'd', 'aabbbbd', 'zxy'];
for (var i = 0, l = input.length; i < l; i++) {
console.log(input[i], '->', regex.test(input[i]));
}

Given a set of string {"str1", "str2", ..., "strN"}, write the regex as follows:
^(str1|str2|...|strN)*$
Where
^ matches the beginning of the string
(...) matches any of the strings
* means that the one above may be repeated from 0 to infinite times
$ matches the end of the string

Related

Match word by its prefix

I'm trying to match a string by its prefix that ends with a particular character. For example, if my string is "abcd" and ends in #, then any word which is a prefix of "abcd" should be matched as long as it ends with #. Here are some examples to help illustrate the pattern:
Input: "ab#" gives true (as "ab" is a prefix of "abcd" and end with a #).
Input: "abcd#" gives true (as "abcd" is a prefix of "abcd" and end with a #).
Input: "bc#" gives false (as "bc" is a not a prefix of "abcd" ).
Input: "ab#" gives false (while "ab" is a prefix of "abcd", it doesn't end with #) .
Input: "ac#" gives false (while "ac" is contained within "abcd", it doesn't begin with a prefix from "abcd") .
So far, I've managed to come up with the following expression which seems to be working fine:
/(abcd|abc|ab|a)#/
While this is working, it isn't very practical, as larger words of length n will make the expression quite large:
/(n|n-1|n-2| ... |1)#/
Is there a way to rewrite this expression so it is more scalable and concise?
Example of my attempt (in JS):
const regex = /(abcd|abc|ab|a)#/;
console.log(regex.test("abcd#")); // true
console.log(regex.test("ab#")); // true
console.log(regex.test("abc#")); // true
console.log(regex.test("abz#")); // false
console.log(regex.test("abc#")); // false
Edit: Some of the solutions provided are nice and do do what I'm after, however, for this particular question, I'm after a solution which uses pure regular expressions to match the prefix.
Just use String#startsWith and String#endsWith here:
String input = "abcd";
String prefix = "ab#";
if (input.startsWith(prefix.replaceAll("#$", "")) && prefix.endsWith("#")) {
System.out.println("MATCH");
}
else {
System.out.println("NO MATCH");
}
Edit: A JavaScript version of the above:
var input = "abcd";
var prefix = "ab#";
if (input.startsWith(prefix.replace(/#$/, "")) && prefix.endsWith("#")) {
console.log("MATCH");
}
else {
console.log("NO MATCH");
}
Try ^ab?c?d?#$
Explanation:
`^` - match beginning of a string
`b?` - match match zero or one `b`
Rest is analigocal to the above.
Demo
Here's a left field JavaScript option. Build and array of valid prefixes, use join on the array to make your regex pattern.
var validPrefixes = ["abcd",
"abc",
"ab",
"a",
"areallylongprefix"];
var regexp = new RegExp("^(" + validPrefixes.join("|") + ")#$");
console.log(regexp.test("abcd#"));// true
console.log(regexp.test("ab#")); // true
console.log(regexp.test("abc#")); // true
console.log(regexp.test("abz#")); // false
console.log(regexp.test("abc#")); // false
console.log(regexp.test("areallylongprefix#")); //true
This can be adapted to the language of tour choosing, also handy if your prefixes are dynamically retrieved from a database or similar.
Here's my c# attempt:
private static bool test(string v)
{
var pattern = "abcd#";
//No error handling
return v.EndsWith(pattern[pattern.Length-1])
&& pattern.Replace("#", "").StartsWith(v.Replace("#",""));
}
Console.WriteLine(test("abcd#")); // true
Console.WriteLine(test("ab#")); // true
Console.WriteLine(test("abc#")); // true
Console.WriteLine(test("abz#")); // false
Console.WriteLine(test("abc#")); // false
Console.WriteLine(test("abc")); //false
/a(b(cd?)?)?#/
Or for a longer example, to match a prefix of "abcdefg#":
/a(b(c(d(e(fg?)?)?)?)?)?#/
Generating this regex isn't completely trivial, but some options are:
function createPrefixRegex(s) {
// This method creates an unnecessary set of parentheses
// around the last letter, but that won't harm anything.
return new RegExp(s.slice(0,-1).split('').join('(') + ')?'.repeat(s.length - 2) + '#');
}
function createPrefixRegex2(s) {
var r = s[0];
for (var i = 1; i < s.length - 2; ++i) {
r += '(' + s[i];
}
r += s[s.length - 2] + '?' + ')?'.repeat(s.length - 3) + '#';
return new RegExp(r);
}
function createPrefixRegex3(s) {
var recurse = function(i) {
if (i >= s.length - 1) {
return '';
}
if (i === s.length - 2) {
return s[i] + '?';
}
return '(' + s[i] + recurse(i + 1) + ')?';
}
return new RegExp(s[0] + recurse(1) + '#');
}
These may fail if the input string has no prefix before the '#' character, and they assume the last character in the string is '#'.

Regex Matching using Matcher and Pattern

I am trying to do regex on a number based on the below conditions, however its returning an empty string
import java.util.regex.Matcher
import java.util.regex.Pattern
object clean extends App {
val ALPHANUMERIC: Pattern = Pattern.compile("^[a-zA-Z0-9]*$")
val SPECIALCHAR: Pattern = Pattern.compile("[a-zA-Z0-9\\-#\\.\\(\\)\\/%&\\s]")
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
val TRAILINGZEROES: Pattern = Pattern.compile("\\.0*$|(\\.\\d*?)0+$")
def evaluate(codes: String): String = {
var str2: String = codes.toString
var text:Matcher = LEADINGZEROES.matcher(str2)
str2 = text.replaceAll("")
text = ALPHANUMERIC.matcher(str2)
str2 = text.replaceAll("")
text = SPECIALCHAR.matcher(str2)
str2 = text.replaceAll("")
text = TRAILINGZEROES.matcher(str2)
str2 = text.replaceAll("")
}
}
the code is returning empty string for LEADINGZEROES match.
scala> println("cleaned value :" + evaluate("0001234"))
cleaned value :
What change should I do to make the code work as I expect. Basically i am trying to remove leading/trailing zeroes and if the numbers has special characters/alphanumeric values than entire value should be returned null
Your LEADINGZEROES pattern is working correct as
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
println(LEADINGZEROES.matcher("0001234").replaceAll(""))
gives
//1234
But then there is a pattern matching
text = ALPHANUMERIC.matcher(str2)
which replaces all alphanumeric to "" and this made str as empty ("")
As when you do
val ALPHANUMERIC: Pattern = Pattern.compile("^[a-zA-Z0-9]*$")
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
println(ALPHANUMERIC.matcher(LEADINGZEROES.matcher("0001234").replaceAll("")).replaceAll(""))
it will print empty
Updated
As you have commented
if there is a code that is alphanumeric i want to make that value NULL
but in case of leading or trailing zeroes its pure number, which should return me the value after removing zeroes
but its also returning null for trailing and leading zeroes matches
and also how can I skip a match , suppose i want the regex to not match the number 0999 for trimming leading zeroes
You can write your evaluate function and regexes as below
val LEADINGTRAILINGZEROES = """(0*)(\d{4})(0*)""".r
val ALPHANUMERIC = """[a-zA-Z]""".r
def evaluate(codes: String): String = {
val LEADINGTRAILINGZEROES(first, second, third) = if(ALPHANUMERIC.findAllIn(codes).length != 0) "0010" else codes
if(second.equalsIgnoreCase("0010")) "NULL" else second
}
which should give you
println("cleaned value : " + evaluate("000123400"))
// cleaned value : 1234
println("alphanumeric : " + evaluate("0001A234"))
// alphanumeric : NULL
println("skipping : " + evaluate("0999"))
// skipping : 0999
I hope the answer is helpful

Regex: Any letters, digit, and 0 up to 3 special chars

It seems I'm stuck with a simple regex for a password check.
What I'd like:
8 up to 30 symbols (Total)
With any of these: [A-Za-z\d]
And 0 up to 3 of these: [ -/:-#[-`{-~À-ÿ] (Special list)
I took a look here and then I wrote something like:
(?=.{8,15}$)(?=.*[A-Za-z\d])(?!([ -\/:-#[-`{-~À-ÿ])\1{4}).*
But it doesn't work, one can put more than 3 of the special chars list.
Any tips?
After shuffling your regex around a bit, it works for the examples you provided (I think you made a mistake with the example "A#~` C:", it should not match as it has 6 special chars):
(?!.*(?:[ -\/:-#[-`{-~À-ÿ].*){4})^[A-Za-z\d -\/:-#[-`{-~À-ÿ]{8,30}$
It only needs one lookahead instead of two, because the length and character set check can be done without lookahead: ^[A-Za-z\d -/:-#[-`{-~À-ÿ]{8,30}$
I changed the negative lookahead a bit to be correct. Your mistake was to only check for consecutive special chars, and you inserted the wildcards .* in a way that made the lookahead never hit (because the wildcard allowed everything).
Will this work?
string characters = " -/:-#[-`{-~À-ÿ";
string letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
string[] inputs = {
"AABBCCDD",
"aaaaaaaa",
"11111111",
"a1a1a1a1",
"AA####AA",
"A1C EKFE",
"AADE F"
};
foreach (string input in inputs)
{
var counts = input.Cast<char>().Select(x => new { ch = characters.Contains(x.ToString()) ? 1 : 0, letter = letters.Contains(x.ToString()) ? 1 : 0, notmatch = (characters + letters).Contains(x) ? 0 : 1}).ToArray();
Boolean isMatch = (input.Length >= 8) && (input.Length <= 30) && (counts.Sum(x => x.notmatch) == 0) && (counts.Sum(x => x.ch) <= 3);
Console.WriteLine("Input : '{0}', Matches : '{1}'", input, isMatch ? "Match" : "No Match");
}
Console.ReadLine();
I would use: (if you want to stick to Regex)
var specialChars = #" -\/:-#[-`{-~À-ÿ";
var regularChars = #"A-Za-z\d";
if (Regex.Match(password,$"^(.[{regularChars}{specialChars}]{7,29})$").Success && Regex.Matches(password, $"[{specialChars}]").Count<=3))
{
//Password OK
}
If consists of:
Check Length and if password contains illegal characters
Check if ony contains 3 times special char
A litle faster:
var specialChars = #" -\/:-#[-`{-~À-ÿ";
var regularChars = #"A-Za-z\d";
var minChars = 8;
var maxChars = 30;
if (password.Length >= minChars && password.Length <= maxChars && Regex.Match(password,$"^[{regularChars}{specialChars}]+$").Success && Regex.Matches(password, $"[{specialChars}]").Count<=3))
{
//Password OK
}
Newbie here..I think I've managed to get what you need but one of the test cases you shared was kinda weird..
A#~` C:
OK -- Match (3 specials, it's okay)
Shouldn't this be failed because it has more than 3 specials?
Could you perhaps try this? If it works I'll type out the explanations for the regex.
https://regex101.com/r/KCL6R1/2
(?=^[A-Za-z\d -\/:-#[-`{-~À-ÿ]{8,30}$)^(?:[A-Za-z\d]*[ -\/:-#[-`{-~À-ÿ]){0,3}[A-Za-z\d]*$

Regex for the string with '#'

I wondering how should be the regex string for the string containig '#'
e.g.
abc#def#ghj#ijk
I wanna get
#def
#ghj
#ijk
I tried #[\S]+ but it selects the whole #def#ghj#ijk Any ideas ?
Edit
The code below selects only #Me instead of #MessageBox. Why ?
var m = new RegExp('#[^\s#]+').exec('http://localhost/Lorem/10#MessageBox');
if (m != null) {
var s = '';
for (i = 0; i < m.length; i++) {
s = s + m[i] + "\n";
}
}
Edit 2
the double backslash solved that problem. '#[^\\s#]+'
Try #[^\s#]+ to match # followed by a sequence of one or mor characters which are neither # nor whitespace.
Match all characters that are not #:
#[^#]+

Validate mathematical expressions using regular expression?

I want to validate mathematical expressions using regular expression. The mathematical expression can be this
It can be blank means nothing is entered
If specified it will always start with an operator + or - or * or / and will always be followed by a number that can have
any number of digits and the number can be decimal(contains . in between the numbers) or integer(no '.' symbol within the number).
examples : *0.9 , +22.36 , - 90 , / 0.36365
It can be then followed by what is mentioned in point 2 (above line).
examples : *0.9+5 , +22.36*4/56.33 , -90+87.25/22 , /0.36365/4+2.33
Please help me out.
Something like this should work:
^([-+/*]\d+(\.\d+)?)*
Regexr Demo
^ - beginning of the string
[-+/*] - one of these operators
\d+ - one or more numbers
(\.\d+)? - an optional dot followed by one or more numbers
()* - the whole expression repeated zero or more times
You could try generating such a regex using moo and such:
(?:(?:((?:(?:[ \t]+))))|(?:((?:(?:\/\/.*?$))))|(?:((?:(?:(?<![\d.])[0-9]+(?![\d.])))))|(?:((?:(?:[0-9]+\.(?:[0-9]+\b)?|\.[0-9]+))))|(?:((?:(?:(?:\+)))))|(?:((?:(?:(?:\-)))))|(?:((?:(?:(?:\*)))))|(?:((?:(?:(?:\/)))))|(?:((?:(?:(?:%)))))|(?:((?:(?:(?:\()))))|(?:((?:(?:(?:\)))))))
This regex matches any amount of int, float, braces, whitespace, and the operators +-*/%.
However, expressions such as 2+ would still be validated by the regex, so you might want to use a parser instead.
If you want negative or positive expression you can write it like this>
^\-?[0-9](([-+/*][0-9]+)?([.,][0-9]+)?)*?$
And a second one
^[(]?[-]?([0-9]+)[)]??([(]?([-+/*]([0-9]))?([.,][0-9]+)?[)]?)*$
With parenthesis in expression but doesn't count the number you will need method that validate it or regex.
// the method
public static bool IsPairParenthesis(string matrixExpression)
{
int numberOfParenthesis = 0;
foreach (char character in matrixExpression)
{
if (character == '(')
{
numberOfParenthesis++;
}
if (character == ')')
{
numberOfParenthesis--;
}
}
if (numberOfParenthesis == 0)
{ return true; }
return false;
}
This is java regex, but this is only if not have any braces
[+\-]?(([0-9]+\.[0-9]+)|([0-9]+\.?)|(\.?[0-9]+))([+\-/*](([0-9]+\.[0-9]+)|([0-9]+\.?)|(\.?[0-9]+)))*
Also this with braces in java code
In this case I raplace (..) to number (..), should matches without brace pattern
// without brace pattern
static Pattern numberPattern = Pattern.compile("[+\\-]?(([0-9]+\\.[0-9]+)|([0-9]+\\.?)|(\\.?[0-9]+))([+\\-/*](([0-9]+\\.[0-9]+)|([0-9]+\\.?)|(\\.?[0-9]+)))*");
static Pattern bracePattern = Pattern.compile("\\([^()]+\\)");
public static boolean matchesForMath(String txt) {
if (txt == null || txt.isEmpty()) return false;
txt = txt.replaceAll("\\s+", "");
if (!txt.contains("(") && !txt.contains(")")) return numberPattern.matcher(txt).matches();
if (txt.contains("(") ^ txt.contains(")")) return false;
if (txt.contains("()")) return false;
Queue<String> toBeRematch = new ArrayDeque<>();
toBeRematch.add(txt);
while (toBeRematch.size() > 0) {
String line = toBeRematch.poll();
Matcher m = bracePattern.matcher(line);
if (m.find()) {
String newline = line.substring(0, m.start()) + "1" + line.substring(m.end());
String withoutBraces = line.substring(m.start() + 1, m.end() - 1);
toBeRematch.add(newline);
if (!numberPattern.matcher(withoutBraces).matches()) return false;
}
}
return true;
}