How do I change the RegularExpression below so that it also accepts #?
[StringLength(250)]
[RegularExpression(@"[A-Za-z0-9][A-Za-z0-9\-\.]*|^$",
ErrorMessage = "DomainName may only contain letters (a-z), digits (0-9), hyphens (-) and dots (.), and must start with a letter or digit")]
public string DomainName { get; set; }
Use
^([A-Za-z0-9][A-Za-z0-9#.-]*)?$
See regex demo
Here is the regex breakdown:
^ - start of string
([A-Za-z0-9][A-Za-z0-9#.-]*)? - 1 or 0 (due to ? greedy quantifier) occurrence of...
[A-Za-z0-9] - one ASCII letter or digit followed by...
[A-Za-z0-9#.-]* - 0 or more characters that are either ASCII letters or digits or literal #/./- symbols.
$ - end of string.
So, the main points are:
adding the # into the second character class
turning the whole expression into an optional group (...)? (it can also be a non-capturing group, BTW: (?:...)?)
removing unnecessary escape symbols from the character class (if - is at the start/end of the character class, or as in your regex after a valid range, it does not require escaping).
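The pattern can be checked outside of the validation attribute. Below is a minimal sketch in Python rather than C#, assuming the helper name `is_valid_domain_name` (hypothetical, not from the original post); it only exercises the regex itself, not the attribute.

```python
import re

# Pattern from the answer above: the whole expression is an optional group,
# so the empty string is accepted as well.
DOMAIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9#.-]*)?$")


def is_valid_domain_name(value: str) -> bool:
    """True if value is empty, or starts with a letter/digit and then
    contains only letters, digits, '#', '.' or '-'."""
    return DOMAIN_RE.fullmatch(value) is not None


for candidate in ["", "example.com", "my#domain-1.org", "#example", "-example"]:
    print(repr(candidate), is_valid_domain_name(candidate))
```

The first three candidates are accepted; the last two are rejected because the first character must be a letter or digit.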
Related
^([a-zA-Z0-9_-]+)$ matches:
BAP-78810
BAP-148080
But does not match:
B8241066 C
Q2111999 A
Q2111999 B
How can I modify regex pattern to match any space and/or special character?
For the example data, you can write the pattern as:
^[a-zA-Z0-9_-]+(?: [A-Z])?$
^ Start of string
[a-zA-Z0-9_-]+ Match 1+ chars listed in the character class
(?: [A-Z])? Optionally match a space and a char A-Z
$ End of string
Regex demo
Or a more exact match:
^[A-Z]+-?\d+(?: [A-Z])?$
^ Start of string
[A-Z]+-? Match 1+ chars A-Z and optional -
\d+(?: [A-Z])? Match 1+ digits and optional space and char A-Z
$ End of string
Regex demo
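Both patterns can be run against the sample data from the question. This is a small sketch in Python; the names `LOOSE`, `EXACT` and `SAMPLES` are only illustrative.

```python
import re

# The two patterns suggested above.
LOOSE = re.compile(r"^[a-zA-Z0-9_-]+(?: [A-Z])?$")
EXACT = re.compile(r"^[A-Z]+-?\d+(?: [A-Z])?$")

# Example data from the question.
SAMPLES = ["BAP-78810", "BAP-148080", "B8241066 C", "Q2111999 A", "Q2111999 B"]

for text in SAMPLES:
    print(text, bool(LOOSE.match(text)), bool(EXACT.match(text)))

# The stricter pattern rejects input that the looser one still accepts.
print(bool(LOOSE.match("bap-1")), bool(EXACT.match("bap-1")))
```

All five samples match both patterns. The lowercase `bap-1` only matches the looser one, which is the practical difference between the two.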
Whenever you want to match something that can either be a space or a special character, you would use the dot symbol (.). Your regex pattern would then be modified to:
^([a-zA-Z0-9_-])+.$
Here the dot matches a single space or any other character (except a newline, by default), so this pattern allows exactly one such character at the end of the string. To match the example provided, where exactly one alphanumeric character follows the space, you could add \w such that:
^([a-zA-Z0-9_-])+.\w$
Note that \w is equivalent to [A-Za-z0-9_]
Further, be careful when you use . as it makes your pattern less specific and therefore more prone to false positives.
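The effect of the dot, including the false positives it lets through, can be seen in a short Python sketch. The names `ONE_ANY`, `ANY_THEN_WORD` and `accepts` are made up for this illustration.

```python
import re

# The two dot-based patterns from this answer.
ONE_ANY = re.compile(r"^([a-zA-Z0-9_-])+.$")
ANY_THEN_WORD = re.compile(r"^([a-zA-Z0-9_-])+.\w$")


def accepts(pattern, text: str) -> bool:
    """True if the pattern matches the whole text."""
    return pattern.match(text) is not None


# A single trailing character of any kind is accepted by the first pattern...
print(accepts(ONE_ANY, "B8241066!"))
# ...but a space followed by a letter needs the second pattern.
print(accepts(ONE_ANY, "B8241066 C"), accepts(ANY_THEN_WORD, "B8241066 C"))
# The dot is not limited to spaces, so this is accepted too.
print(accepts(ANY_THEN_WORD, "B8241066#C"))
```

The last call shows the kind of false positive the answer warns about: `#` is accepted in the position where a space was intended.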
I suggest using this approach
^[A-Z][A-Z\d -]{6,}$
The first character must be an uppercase letter, followed by at least 6 uppercase letters, digits, spaces or -.
I removed the group because there was only one group and it was the entire regex.
You can also use \w - which includes A-Z,a-z and 0-9, as well as _ (underscore). To make it case-insensitive, without explicitly adding a-z or using \w, you can use a flag - often an i.
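As a sketch of the flag approach, here is the same pattern compiled with and without a case-insensitive flag in Python, where the i flag is spelled `re.IGNORECASE`. The names `STRICT` and `RELAXED` are illustrative only.

```python
import re

# Same pattern, with and without the case-insensitive flag.
STRICT = re.compile(r"^[A-Z][A-Z\d -]{6,}$")
RELAXED = re.compile(r"^[A-Z][A-Z\d -]{6,}$", re.IGNORECASE)

for text in ["BAP-78810", "B8241066 C", "bap-78810", "BAP-1"]:
    print(text, bool(STRICT.match(text)), bool(RELAXED.match(text)))
```

`bap-78810` is accepted only by the case-insensitive version. `BAP-1` is rejected by both because fewer than six characters follow the first letter.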
I have a string which has the following format:
Foo/FooVersion some info
Foo can contain:
punctuations
special characters
emojis
alpha numeric
Chinese characters
I have this regex to capture the following pattern:
^[\+$-¨™®é!?_ó–:—🔥😘兼职,.&\w\s]+\/\d+[\+\w.-]*
This seems like quite an exhaustive character set, and I am not sure it covers all the characters. What I am looking for is a simplified regex that takes these characters into account and returns true if there is a match. I am using SQL.
FooVersion can consist of:
a digit at the start, followed by word characters, dots or hyphens
You could use a pattern such as ([^\/]+)\/\1Version.+
Pattern explanation:
([^\/]+) - [^\/]+ matches one or more characters other than / (this is a negated character class), () means a capturing group, so the matched text is put into the first capturing group
\/ - match / literally
\1 - back reference to match the same text as was matched by first capturing group
Version - match Version literally
.+ - match one or more of any characters (to match rest of a string - this is optional and can be removed)
Regex demo
Update
To match updated requirements, you should use ([^\/]+)\/\d[a-zA-Z\d.-]+
What's new is:
[a-zA-Z\d.-]+ - match one or more characters from the set a-z (lowercase letters), A-Z (uppercase letters), \d (digits), .- (dot or hyphen)
Updated demo
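The question targets SQL, and regex support differs between SQL dialects, so the sketch below uses Python only to show what the updated pattern does. The sample strings and the name `VERSIONED` are invented for the illustration.

```python
import re

# Updated pattern from the answer: anything up to the first "/", then a
# digit followed by one or more letters, digits, dots or hyphens.
VERSIONED = re.compile(r"([^\/]+)\/\d[a-zA-Z\d.-]+")

m = VERSIONED.match("Foo🔥兼职/1.2.3-beta some info")
print(m.group(0))  # the whole "Foo/FooVersion" part
print(m.group(1))  # just the "Foo" part, emoji and Chinese characters included

# No digit directly after the slash, so there is no match.
print(VERSIONED.match("Foo/beta some info"))
```

Because the negated class [^\/] accepts every character except the slash, there is no need to enumerate punctuation, emojis or Chinese characters. Note that the trailing + requires at least one more character after the leading digit, so a one-character version such as `Foo/1` would not match as written.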
So I need to match upper and lower case a-z letters, period (.) and # in a string. As a complication the string must have # exactly once anywhere in the string and . at least once anywhere in the string.
abcd#. // match
#ab.cd // match
a#cd#. // no match
abcd# // no match
I've tried to be clever (obviously not very) by doing look ahead but this one seems tricky eg.
(?=[#]){1}[a-zA-Z#]+$
The (?=[#]){1}[a-zA-Z#]+$ pattern matches any substring that starts with # and then has zero or more letters or # up to the end of the string. Look at what it matches.
You need to use
^(?=[^#]*#[^#]*$)(?=[^.]*\.)[a-zA-Z#.]+$
Or, if there must be exactly one dot in the string
^(?=[^#]*#[^#]*$)(?=[^.]*\.[^.]*$)[a-zA-Z#.]+$
See the regex demo #1 and the regex demo #2.
Details
^ - start of string
(?=[^#]*#[^#]*$) - requires only one # and no more than one in string - a positive lookahead that requires 0+ chars other than #, a #, and again zero or more chars other than # till the end of string
(?=[^.]*\.) - requires at least one dot - a positive lookahead that requires 0+ chars other than . and then a .
(?=[^.]*\.[^.]*$) - requires only one dot and no more than one in string - a positive lookahead that requires 0+ chars other than ., a ., and again zero or more chars other than . till the end of string
[a-zA-Z#.]+ - one or more ASCII letters, # or .
$ - end of string.
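The two lookahead patterns can be checked against the examples from the question. A minimal Python sketch, with the names `AT_LEAST_ONE_DOT` and `EXACTLY_ONE_DOT` chosen just for this illustration:

```python
import re

# Exactly one '#', at least one '.'.
AT_LEAST_ONE_DOT = re.compile(r"^(?=[^#]*#[^#]*$)(?=[^.]*\.)[a-zA-Z#.]+$")
# Exactly one '#', exactly one '.'.
EXACTLY_ONE_DOT = re.compile(r"^(?=[^#]*#[^#]*$)(?=[^.]*\.[^.]*$)[a-zA-Z#.]+$")

for text in ["abcd#.", "#ab.cd", "a#cd#.", "abcd#", "a.b#.c"]:
    print(text,
          bool(AT_LEAST_ONE_DOT.match(text)),
          bool(EXACTLY_ONE_DOT.match(text)))
```

The first two strings match both patterns and the next two match neither. `a.b#.c` has two dots, so only the first pattern accepts it.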
Another option is to use a single lookahead that asserts the # and to match a dot between two character classes, or the other way around: assert a dot and match the #.
^(?=[^#]*#[^#]*$)[A-Za-z#]*\.[A-Za-z#]*$
Explanation
^ Start of string
(?=[^#]*#[^#]*$) Assert only 1 # char in the string
[A-Za-z#]*\.[A-Za-z#]* Match a dot between optionally repeating character classes each matching 1 out of A-Za-z#
$ End of string
Regex demo
For "and . at least once anywhere in the string", you can allow matching a dot in the second character class:
^(?=[^#]*#[^#]*$)[A-Za-z#]*\.[A-Za-z#.]*$
Regex demo
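The difference between the two single-lookahead variants shows up only when the string contains more than one dot. A short Python sketch, with the names `ONE_DOT` and `ONE_OR_MORE_DOTS` assumed for illustration:

```python
import re

# The dot is matched explicitly; only the '#' count is asserted by the lookahead.
ONE_DOT = re.compile(r"^(?=[^#]*#[^#]*$)[A-Za-z#]*\.[A-Za-z#]*$")
# Same, but further dots are allowed after the first one.
ONE_OR_MORE_DOTS = re.compile(r"^(?=[^#]*#[^#]*$)[A-Za-z#]*\.[A-Za-z#.]*$")

for text in ["abcd#.", "#ab.cd", "a.b#.c", "a#cd#.", "abcd#"]:
    print(text, bool(ONE_DOT.match(text)), bool(ONE_OR_MORE_DOTS.match(text)))
```

Only `a.b#.c` separates the two: the first pattern rejects it and the second accepts it.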
I'm thinking you could just use:
^(?=.*\.)[a-zA-Z.]*#[a-zA-Z.]*$
See the online demo.
^ - Start string anchor.
(?=.*\.) - Positive lookahead for any amount of characters up to a literal dot.
[a-zA-Z.]* - Zero or more characters from upper/lowercase letters or a dot.
# - A single #.
[a-zA-Z.]* - Zero or more characters from upper/lowercase letters or a dot.
$ - End string anchor.
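One way to gain confidence in a pattern like this is to compare it with a plain, non-regex statement of the rules over every short string from a small alphabet. The sketch below does that in Python; `by_counting` and `agree_on_all` are hypothetical helpers written for this check, not part of the answer.

```python
import itertools
import re

PATTERN = re.compile(r"^(?=.*\.)[a-zA-Z.]*#[a-zA-Z.]*$")


def by_regex(s: str) -> bool:
    """Decide with the regex from the answer."""
    return PATTERN.fullmatch(s) is not None


def by_counting(s: str) -> bool:
    """Decide by stating the rules directly: only ASCII letters, '#' and '.',
    exactly one '#', and at least one '.'."""
    allowed = all((c.isascii() and c.isalpha()) or c in "#." for c in s)
    return allowed and s.count("#") == 1 and "." in s


def agree_on_all(alphabet: str, max_len: int) -> bool:
    """True if both checks agree on every string up to max_len characters."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            if by_regex(s) != by_counting(s):
                return False
    return True


print(agree_on_all("aZ#.1", 4))
```

The alphabet deliberately includes a digit so that disallowed characters are exercised as well.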
I am working on regex with the following conditions:
Must contain from 1 to 63 alphanumeric characters or hyphens.
First character must be a letter.
Cannot end with a hyphen or contain two consecutive hyphens.
I am able to get the regex like:
^[a-zA-Z0-9](?!.*--)[a-zA-Z0-9-]{0,61}[A-Za-z0-9]$
But it fails on the length constraint as well as allows patterns like "a-". How can I meet the conditions?
I would phrase your requirements as:
^(?=.{1,63}$)(?!.*--)[a-zA-Z]([a-zA-Z0-9\-]*[a-zA-Z0-9])?$
Demo
Here is a brief explanation of what each part of the above regex does:
^ from the start of the string
(?=.{1,63}$) assert that the string is between 1 and 63 characters long
(?!.*--) assert that two hyphens do not appear together anywhere
[a-zA-Z] first character is a letter (mandatory in all matches)
([a-zA-Z0-9\-]*[a-zA-Z0-9])?
The final portion says to match a final character which is alphanumeric, but not dash, possibly preceded by alphanumeric characters or dash.
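The three requirements can be exercised with a few boundary cases. A minimal Python sketch, assuming the helper name `is_valid_label` (hypothetical):

```python
import re

LABEL_RE = re.compile(
    r"^(?=.{1,63}$)(?!.*--)[a-zA-Z]([a-zA-Z0-9\-]*[a-zA-Z0-9])?$"
)


def is_valid_label(text: str) -> bool:
    """1 to 63 letters, digits or hyphens; starts with a letter;
    no trailing hyphen and no two consecutive hyphens."""
    return LABEL_RE.fullmatch(text) is not None


for text in ["a", "a-", "a-b", "a--b", "1abc", "a" * 63, "a" * 64]:
    print(text[:12], len(text), is_valid_label(text))
```

The single letter `a` is valid because the group after the first letter is optional, while `a-` is rejected because that group, when present, must end in a letter or digit.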
My take on this would be:
^[A-Za-z](?!.*?--)[A-Za-z0-9\-]{0,62}(?<!-)$
Try it out here
Explanation:
^ - Matches the start of the string.
[A-Za-z] - Matches the first letter.
(?!.*?--) - Ensures that there are no two consecutive hyphens in the rest of the string.
[A-Za-z0-9\-]{0,62} - Matches the remaining alphanumeric and hyphen characters.
(?<!-) - Ensures that the string doesn't end with a hyphen.
$ - Matches the end of the string.
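The lookbehind variant relies on the engine supporting (?<!-), which Python's re module does. Here is a short sketch that runs it on a handful of boundary cases; the names `LOOKBEHIND_RE`, `CASES` and `results` are illustrative.

```python
import re

# Length is enforced by the quantifier: one letter plus at most 62 more characters.
LOOKBEHIND_RE = re.compile(r"^[A-Za-z](?!.*?--)[A-Za-z0-9\-]{0,62}(?<!-)$")

CASES = ["a", "a-", "ab-", "a-b", "a--b", "-a", "1abc", "", "a" * 63, "a" * 64]

results = {text: LOOKBEHIND_RE.fullmatch(text) is not None for text in CASES}
for text, ok in results.items():
    print(repr(text[:12]), len(text), ok)
```

Of these cases, only `a`, `a-b` and the 63-character string are accepted.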
I am using the regex
(.*)\d.txt
on the expression
MyFile23.txt
Now the online tester says that the above regex matches the mentioned string. My understanding is that it should not match, because there are two digits, 2 and 3, while the regex contains only one digit token, i.e. \d. It should have been \d+. My current expression reads: zero or more of any character, followed by one digit, followed by .txt. My question is: why does the above string pass the regex?
This regex (.*)\d.txt will still match MyFile23.txt because of .* which will match 0 or more of any character (including a digit).
So for the given input MyFile23.txt, here is the breakdown:
.* # matches MyFile2
\d # matches 3
. # matches a dot (though it can match anything here due to unescaped dot)
txt # will match literal txt
To make sure it only matches MyFile2.txt you can use:
^\D*\d\.txt$
Where ^ and $ are anchors to match start and end. \D* will match 0 or more non-digit.
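The breakdown can be confirmed by inspecting the capture group. A small Python sketch, with `ORIGINAL` and `FIXED` as illustrative names:

```python
import re

ORIGINAL = re.compile(r"(.*)\d.txt")
FIXED = re.compile(r"^\D*\d\.txt$")

m = ORIGINAL.match("MyFile23.txt")
print(m.group(1))  # what the greedy (.*) ended up capturing

# The unescaped dot accepts any character before "txt".
print(bool(ORIGINAL.match("MyFile23_txt")))

# The anchored version allows exactly one digit and a literal dot.
print(bool(FIXED.match("MyFile2.txt")), bool(FIXED.match("MyFile23.txt")))
```

The group holds `MyFile2`: the greedy .* first consumes the whole string and then backtracks just far enough for \d to take the last digit, 3.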
The pattern you have has one group, (.*), which for your example would match MyFile2, because the . allows any character.
Furthermore the . in the pattern after this group is not escaped which will result in allowing another character of any kind.
To avoid this use:
(\D*)\d+\.txt
the group (\D*) would now match all non digit characters.
Here is why your "MyFile23.txt" matches the regex pattern:
A literal period . should always be escaped as \. else it will match "any character".
And finally, (.*) matches all the string from the beginning to the last digit (MyFile2). Have a look at the "MATCH INFORMATION" area on the right at this page.
So, I'd suggest the following fix:
^\D*\d\.txt$ = beginning of a line/string, any number of non-digit characters, a digit, a literal period, a literal txt, and the end of the string/line (line or string depends on the m switch, which in turn depends on whether your input is a list of names on separate lines or a single file name).
Here is a working example.