I am writing a regex that matches a-z and A-Z and also allows -, ., _ and digits.
For example:
Testing-Server1
Testing.Server
Testing_Server
I tried this, but was unsure how to allow -, _, . and digits:
"^[a-z][A-Z]*$"
Simple enough:
^[-a-zA-Z0-9_.]*$
Explanation
^ = Match from the start of the input
[-a-zA-Z0-9_.] = A character class (a list of allowed characters):
- matches the literal '-' character (to be literal it must be the first or last character in the class, or be escaped)
a-z matches lowercase alpha characters
A-Z matches uppercase alpha characters
0-9 matches the numeric characters
_ matches the literal '_' character
. matches the literal '.' character (unlike outside a character class, where it matches any character)
* = Match 0 to infinite characters (use + to match at least one character)
$ = Match to the end of the string
Alternative
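As stranac mentions in his answer, you can replace a-zA-Z0-9_ with \w, but I prefer the more explicit version, as it's more understandable. Note that in Unicode-aware engines (Python 3 str patterns, for example) \w also matches non-ASCII letters and digits, so the two are not always equivalent.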
Limiting matched characters
As the OP asked in a comment, to limit the allowed number of characters to 15:
^[-a-zA-Z0-9_.]{0,15}$
Where {0,15} means match between 0 and 15 characters (of the character class) only. You can adjust the values as appropriate, for example, to match at least one character, use {1,15}.
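As a minimal Python sketch of the pattern above (the function name is_valid_name and the default limit are illustrative assumptions, not part of the answer), re.fullmatch can stand in for the ^ and $ anchors:

```python
import re

# Same character class as above. fullmatch() requires the whole string to
# match, so the ^ and $ anchors are not needed here.
NAME_CLASS = r"[-a-zA-Z0-9_.]"

def is_valid_name(name: str, max_len: int = 15) -> bool:
    """Return True if name has 1..max_len characters, all from NAME_CLASS."""
    pattern = rf"{NAME_CLASS}{{1,{max_len}}}"
    return re.fullmatch(pattern, name) is not None

print(is_valid_name("Testing-Server1"))  # True
print(is_valid_name("Testing Server"))   # False (space is not in the class)
print(is_valid_name("a" * 16))           # False (longer than 15 characters)
```

This sketch uses {1,15} so that empty names are rejected; switch to {0,15} if an empty string should pass.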
The other answers are over-complicating things.
\w already matches letters, digits and underscores, so you only need to add dot and minus to those.
This regex does the trick: r'^[\w.-]+$'
A few examples:
>>> re.search(r'^[\w.-]+$', 'Testing-Server1').group()
'Testing-Server1'
>>> re.search(r'^[\w.-]+$', 'Testing.Server').group()
'Testing.Server'
>>> re.search(r'^[\w.-]+$', 'Testing_Server').group()
'Testing_Server'
Instead of scanning the whole string to check that it contains only allowed characters, it can be faster to search for the first character that is not allowed, because the search stops as soon as one is found:
if not re.search(r'[^\w.-]', yourstring):
...
If you need to check the max length of the string, you can simply write:
if len(yourstring) < 16 and not re.search(r'[^\w.-]', yourstring):
    ...
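Wrapped into a function, the negated-class check might look like the sketch below. The names FORBIDDEN and is_allowed are hypothetical, and re.ASCII is an added assumption that keeps \w limited to [a-zA-Z0-9_]:

```python
import re

# Matches any single character that is NOT a word character, dot or hyphen.
# re.ASCII restricts \w to [a-zA-Z0-9_]; drop the flag to accept Unicode letters.
FORBIDDEN = re.compile(r"[^\w.-]", re.ASCII)

def is_allowed(text: str, max_len: int = 15) -> bool:
    """True if text is non-empty, short enough, and has no forbidden character."""
    return 0 < len(text) <= max_len and FORBIDDEN.search(text) is None

print(is_allowed("Testing_Server"))  # True
print(is_allowed("Testing/Server"))  # False
```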
The following pattern is enough:
^[a-zA-Z0-9._-]+$
Explanation
a-z a single character in the range between a and z (case sensitive)
A-Z a single character in the range between A and Z (case sensitive)
0-9 a single character in the range between 0 and 9
. matches the character . literally
_ matches the character _ literally
- the literal character - (placed last in the class, so it needs no escaping)
You can try the following pattern :
[a-zA-Z\d._-]+
Try this out :
([a-zA-Z0-9._-]+)
Explanation:
1) a-z a single character in the range between a and z (case sensitive)
2) A-Z a single character in the range between A and Z (case sensitive)
3) 0-9 a single character in the range between 0 and 9
4) ._- a single character in the list ._- literally
It should work; check it here: https://regex101.com/r/gZ5xN5/2
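A detail worth checking in any character class like the ones above is the range endpoints: a range typed as A-z (lowercase z) spans ASCII 65 to 122 and therefore lets through the six punctuation characters that sit between Z and a. A small sketch that lists them:

```python
import re

suspect = re.compile(r"[A-z]")  # note the lowercase z

# Collect the ASCII characters accepted by [A-z] that are not letters.
extras = [chr(c) for c in range(128)
          if suspect.fullmatch(chr(c)) and not chr(c).isalpha()]
print(extras)  # ['[', '\\', ']', '^', '_', '`']
```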
Related
I use Vue.js and vee-validate. I need help with a regex.
First name and last name
First letter uppercase, the rest lowercase
Allow only letters, including non-ASCII (UTF-8) characters such as "č,ř,ž,ý,á,í,é,ě,š"
Do not allow numbers, spaces or special characters
Nickname
Allow only letters (uppercase and lowercase), numbers and the same non-ASCII (UTF-8) characters
Max 13 characters
Template
v-validate="{required: true, regex: /^[a-zA-Z]+$/ }"
For your First name you can use:
^([A-Z\xC0-\xD6\xD8-\xDE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u0370\u0372\u0376\u037F\u0386\u0388-\u038A\u038C\u038E\u038F\u0391-\u03A1\u03A3-\u03AB\u03CF\u03D2-\u03D4\u03D8\u03DA\u03DC\u03DE\u03E0\u03E2\u03E4\u03E6\u03E8\u03EA\u03EC\u03EE\u03F4\u03F7\u03F9\u03FA\u03FD-\u042F\u0460\u0462\u0464\u0466\u0468\u046A\u046C\u046E\u0470\u0472\u0474\u0476\u0478\u047A\u047C\u047E\u0480\u048A\u048C\u048E\u0490\u0492\u0494\u0496\u0498\u049A\u049C\u049E\u04A0\u04A2\u04A4\u04A6\u04A8\u04AA\u04AC\u04AE\u04B0\u04B2\u04B4\u04B6\u04B8\u04BA\u04BC\u04BE\u04C0\u04C1\u04C3\u04C5\u04C7\u04C9\u04CB\u04CD\u04D0\u04D2\u04D4\u04D6\u04D8\u04DA\u04DC\u04DE\u04E0\u04E2\u04E4\u04E6\u04E8\u04EA\u04EC\u04EE\u04F0\u04F2\u04F4\u04F6\u04F8\u04FA\u04FC\u04FE\u0500\u0502\u0504\u0506\u0508\u050A\u050C\u050E\u0510\u0512\u0514\u0516\u0518\u051A\u051C\u051E\u0520\u0522\u0524\u0526\u0528\u052A\u052C\u052E\u0531-\u0556\u10A0-\u10C5\u10C7\u10CD\u13A0-\u13F5\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32
\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFE\u1F08-\u1F0F\u1F18-\u1F1D\u1F28-\u1F2F\u1F38-\u1F3F\u1F48-\u1F4D\u1F59\u1F5B\u1F5D\u1F5F\u1F68-\u1F6F\u1FB8-\u1FBB\u1FC8-\u1FCB\u1FD8-\u1FDB\u1FE8-\u1FEC\u1FF8-\u1FFB\u2102\u2107\u210B-\u210D\u2110-\u2112\u2115\u2119-\u211D\u2124\u2126\u2128\u212A-\u212D\u2130-\u2133\u213E\u213F\u2145\u2183\u2C00-\u2C2E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E-\u2C80\u2C82\u2C84\u2C86\u2C88\u2C8A\u2C8C\u2C8E\u2C90\u2C92\u2C94\u2C96\u2C98\u2C9A\u2C9C\u2C9E\u2CA0\u2CA2\u2CA4\u2CA6\u2CA8\u2CAA\u2CAC\u2CAE\u2CB0\u2CB2\u2CB4\u2CB6\u2CB8\u2CBA\u2CBC\u2CBE\u2CC0\u2CC2\u2CC4\u2CC6\u2CC8\u2CCA\u2CCC\u2CCE\u2CD0\u2CD2\u2CD4\u2CD6\u2CD8\u2CDA\u2CDC\u2CDE\u2CE0\u2CE2\u2CEB\u2CED\u2CF2\uA640\uA642\uA644\uA646\uA648\uA64A\uA64C\uA64E\uA650\uA652\uA654\uA656\uA658\uA65A\uA65C\uA65E\uA660\uA662\uA664\uA666\uA668\uA66A\uA66C\uA680\uA682\uA684\uA686\uA688\uA68A\uA68C\uA68E\uA690\uA692\uA694\uA696\uA698\uA69A\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uFF21-\uFF3A]|\uD801[\uDC00-\uDC27\uDCB0-\uDCD3]|\uD803[\uDC80-\uDCB2]|\uD806[\uDCA0-\uDCBF]|\uD835[\uDC0
0-\uDC19\uDC34-\uDC4D\uDC68-\uDC81\uDC9C\uDC9E\uDC9F\uDCA2\uDCA5\uDCA6\uDCA9-\uDCAC\uDCAE-\uDCB5\uDCD0-\uDCE9\uDD04\uDD05\uDD07-\uDD0A\uDD0D-\uDD14\uDD16-\uDD1C\uDD38\uDD39\uDD3B-\uDD3E\uDD40-\uDD44\uDD46\uDD4A-\uDD50\uDD6C-\uDD85\uDDA0-\uDDB9\uDDD4-\uDDED\uDE08-\uDE21\uDE3C-\uDE55\uDE70-\uDE89\uDEA8-\uDEC0\uDEE2-\uDEFA\uDF1C-\uDF34\uDF56-\uDF6E\uDF90-\uDFA8\uDFCA]|\uD83A[\uDD00-\uDD21])([a-z\u00C0-\u017F])+$
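The first group above is meant to cover all uppercase letters in Unicode. The second group, [a-z\u00C0-\u017F], covers ASCII lowercase letters plus the range À to ſ; that range also contains accented uppercase letters, so it does not strictly enforce lowercase.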
Then for your Nickname you can use:
^([\w\u00C0-\u017F]{1,13})$
^ asserts position at start of a line
1st Capturing Group ([\w\u00C0-\u017F]{1,13})
Match a single character present in the list below [\w\u00C0-\u017F]{1,13}
{1,13} Quantifier: matches between 1 and 13 times, as many times as possible, giving back as needed (greedy)
\w matches any word character (equal to [a-zA-Z0-9_])
\u00C0-\u017F a single character in the range between À (index 192) and ſ (index 383) (case sensitive)
$ asserts position at the end of a line
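For comparison, here is a hedged Python sketch of the nickname rule (the original context is JavaScript; NICKNAME and is_valid_nickname are illustrative names). re.ASCII is used so that \w means [a-zA-Z0-9_], as it does in the JavaScript regex:

```python
import re

# re.ASCII keeps \w equal to [a-zA-Z0-9_]; the explicit range
# U+00C0 (À) to U+017F (ſ) then adds the accented letters.
NICKNAME = re.compile(r"[\w\u00C0-\u017F]{1,13}", re.ASCII)

def is_valid_nickname(nick: str) -> bool:
    """True if nick is 1 to 13 characters, all from the class above."""
    return NICKNAME.fullmatch(nick) is not None

print(is_valid_nickname("Řehoř99"))    # True
print(is_valid_nickname("two words"))  # False (space is not allowed)
print(is_valid_nickname("x" * 14))     # False (more than 13 characters)
```

Note that \w also admits the underscore; replace it with a-zA-Z0-9 if underscores should be rejected.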
I have a scenario where I only want to accept the digits 0-9 and the dot. For that I used this regex:
[0-9.]$
This regex accepts 0-9 and . (dot for decimal). But when I write something like this
1,1
It also accepts comma (,). How can I avoid this?
Since you are parsing numbers (you said the dot is for decimals), you probably don't want the string to start or end with a dot, and you want to accept at most one dot. If that is your case, try using:
^(\d+\.?\d+|\d)$
where:
\d+ stands for any digit (one or more)
\.? stands for zero or one of literal dot
\d stands for any digit (just one)
Or maybe you'd like to accept strings that start with a dot, which are normally read as having 0 as the integer part; in that case you can use ^\d*\.?\d+$.
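To see how these patterns differ, here is a small Python sketch (the constant names are made up for illustration, and re.ASCII is an added assumption that limits \d to 0-9). It also shows why the pattern from the question lets 1,1 through: it only looks at the last character.

```python
import re

LAST_CHAR_ONLY = re.compile(r"[0-9.]$")                # pattern from the question
STRICT = re.compile(r"(\d+\.?\d+|\d)", re.ASCII)       # digits on both sides of the dot
LEADING_DOT_OK = re.compile(r"\d*\.?\d+", re.ASCII)    # also accepts ".5"

for text in ["1.1", "1,1", ".5", "5.", "7"]:
    print(text,
          bool(LAST_CHAR_ONLY.search(text)),
          bool(STRICT.fullmatch(text)),
          bool(LEADING_DOT_OK.fullmatch(text)))
# 1.1 True True True
# 1,1 True False False
# .5 True False True
# 5. True False False
# 7 True True True
```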
The regex [0-9.]$ consists of a character class that matches a single digit or dot, followed by $, the end of the line. It only checks the last character, which is why 1,1 is accepted.
If you want to match exactly one digit or dot, you could add ^ to assert the position at the start of the line:
^[0-9.]$
If you want to match one or more digits, a dot and one or more digits you could use:
^[0-9]+\.[0-9]+$
This regex may help you:
/[0-9.]+/g
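Matches runs of the digits 0 to 9 and the dot (.). It is not anchored, so add ^ and $ if the whole input must consist of these characters.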
Explanation:
Match a single character present in the list below [0-9.]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
. matches the character . literally (case sensitive)
I thought [^0-9a-zA-Z]* excludes all alpha-numeric letters, but allows for special characters, spaces, etc.
With the search string [^0-9a-zA-Z]*ELL[^0-9A-Z]* I expect outputs such as
ELL
ELLs
The ELL
Which ELLs
However, I also get the following outputs:
Ellis Island
Bellis
How can I correct this?
You may use
(?:\b|_)ELLs?(?=\b|_)
It will find ELL or ELLs if it is surrounded with _ or non-word chars, or at the start/end of the string.
Details:
(?:\b|_) - a non-capturing alternation group matching a word boundary position (\b) or (|) a _
ELLs? - matches ELL or ELLs since s? matches 1 or 0 s chars
(?=\b|_) - a positive lookahead that requires the presence of a word boundary or _ immediately to the right of the current location.
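A quick Python check of this pattern against the strings from the question might look like this. re.IGNORECASE is an assumption added here so that "Ellis" and "Bellis" are genuine candidates, as they were in the question:

```python
import re

# IGNORECASE is an assumption; without it "Ellis" could never match "ELL".
ELL = re.compile(r"(?:\b|_)ELLs?(?=\b|_)", re.IGNORECASE)

samples = ["ELL", "ELLs", "The ELL", "Which ELLs", "Ellis Island", "Bellis"]
matched = [s for s in samples if ELL.search(s)]
print(matched)  # ['ELL', 'ELLs', 'The ELL', 'Which ELLs']
```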
Change the * to +.
A * means any amount, including none. A + means one or more. What you probably want, though, is a word boundary:
\bELL\b
A word boundary is a position between \w and \W (a non-word char), or at the beginning or end of a string if it begins or ends (respectively) with a word character ([0-9A-Za-z_]). More about that here:
What is a word boundary in regexes?
How do I also accept # in the RegularExpression I have below?
[StringLength(250)]
[RegularExpression(@"[A-Za-z0-9][A-Za-z0-9\-\.]*|^$",
ErrorMessage = "DomainName may only contain letters (a-z), digits (0-9), hyphens (-) and dots (.), and must start with a letter or digit")]
public string DomainName{ get; set; }
Use
^([A-Za-z0-9][A-Za-z0-9#.-]*)?$
Here is the regex breakdown:
^ - start of string
([A-Za-z0-9][A-Za-z0-9#.-]*)? - 1 or 0 (due to ? greedy quantifier) occurrence of...
[A-Za-z0-9] - one ASCII letter or digit followed by...
[A-Za-z0-9#.-]* - 0 or more characters that are either ASCII letters or digits or literal #/./- symbols.
$ - end of string.
So, the main points are:
adding the # into the second character class
turning the whole expression into an optional group (...)? (it can also be a non-capturing group, BTW: (?:...)?)
removing unnecessary escape symbols from the character class (if - is at the start/end of the character class, or as in your regex after a valid range, it does not require escaping).
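The behaviour of the suggested pattern can be sketched in Python (rather than in the C# attribute from the question; DOMAIN and is_valid_domain are illustrative names):

```python
import re

DOMAIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9#.-]*)?$")

def is_valid_domain(value: str) -> bool:
    # fullmatch() avoids the case where $ would match before a trailing newline.
    return DOMAIN.fullmatch(value) is not None

print(is_valid_domain(""))             # True (the whole group is optional)
print(is_valid_domain("example.com"))  # True
print(is_valid_domain("my-host#1"))    # True
print(is_valid_domain("#host"))        # False (must start with a letter or digit)
```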
I am trying to put together a regex that returns the string between the first _ and the final _$ (where $ is the end of the string).
input:
abc_def_ghi_
desired regex outcome:
def_ghi
I've tried quite a few combinations such as this:
((([^_]*){1})[^_]*)_$
Any help appreciated.
Note: the regex above returns abc_def, and not the desired def_ghi.
So it's everything between the first _ and the final _ (both excluded)?
Then try
(?<=_).*(?=_$)
(hoping you're not using JavaScript)
Explanation:
(?<=_) # Assert that the previous character is a _
.* # Match any number of characters...
(?=_$) # ... until right before the final, string-ending _
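In Python, which supports lookbehind, the lookaround version can be tried like this (a minimal sketch using the input from the question):

```python
import re

text = "abc_def_ghi_"
match = re.search(r"(?<=_).*(?=_$)", text)
result = match.group() if match else None
print(result)  # def_ghi
```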
You could try to use the greediness of the operators to your advantage:
^.*?_(.*)_$
This matches everything from the start (non-greedy) up to the first underscore, then everything from that underscore on to the end of the string, where it expects an underscore followed by the end of the string. The part between the two underscores is captured in the first group.
^ Beginning of string
.*? Any number of characters (zero or more), as few as possible (lazy)
_ Literal underscore
(.*) Any number of characters, greedy (captured)
_ Literal underscore
$ End of string
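A short Python sketch of the capture-group approach (the helper name between_underscores is hypothetical), including the case where the input lacks the trailing underscore:

```python
import re

def between_underscores(text: str):
    """Return the text between the first _ and the string-ending _, or None."""
    match = re.search(r"^.*?_(.*)_$", text)
    return match.group(1) if match else None

print(between_underscores("abc_def_ghi_"))  # def_ghi
print(between_underscores("abc_def_ghi"))   # None (no trailing underscore)
```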
I was searching for this within a larger log entry:
"threat_name":"PUP.Optional.Wajam"
The format enclosed the field name in double quotes then a colon then the value in double quotes.
Here's what I ended up with, to avoid punctuation breaking the regex:
threat_name["][:]["](?P<signature>.*?)["]
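The (?P<name>...) syntax is also what Python's re module uses, so the pattern can be tried there directly. In this sketch the surrounding fields of the log line are invented for illustration; only the threat_name pair comes from the original:

```python
import re

# Hypothetical log entry; only the threat_name field is from the original post.
LOG_LINE = '{"host":"pc-01","threat_name":"PUP.Optional.Wajam","action":"blocked"}'

PATTERN = re.compile(r'threat_name["][:]["](?P<signature>.*?)["]')

match = PATTERN.search(LOG_LINE)
signature = match.group("signature") if match else None
print(signature)  # PUP.Optional.Wajam
```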
(from regex101.com)
threat_name matches the characters threat_name literally (case sensitive)
["] match a single character present in the list below
" a single character in the list " literally (case sensitive)
[:] match a single character present in the list below
: the literal character :
["] match a single character present in the list below
" a single character in the list " literally (case sensitive)
(?P<signature>.*?) Named capturing group signature
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
["] match a single character present in the list below
" a single character in the list " literally (case sensitive)