Regex + vee-validate - regex

I use Vue.js and vee-validate, and I need help with a regex.
First name and last name
First letter uppercase, the rest lowercase
Allow only letters, including accented characters such as "č,ř,ž,ý,á,í,é,ě,š"
Do not allow numbers, spaces or special characters
Nickname
Allow only letters (uppercase and lowercase), numbers and the same accented characters
Max 13 characters
Template
v-validate="{required: true, regex: /^[a-zA-Z]+$/ }"

For your First name you can use:
^([A-Z\xC0-\xD6\xD8-\xDE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u0370\u0372\u0376\u037F\u0386\u0388-\u038A\u038C\u038E\u038F\u0391-\u03A1\u03A3-\u03AB\u03CF\u03D2-\u03D4\u03D8\u03DA\u03DC\u03DE\u03E0\u03E2\u03E4\u03E6\u03E8\u03EA\u03EC\u03EE\u03F4\u03F7\u03F9\u03FA\u03FD-\u042F\u0460\u0462\u0464\u0466\u0468\u046A\u046C\u046E\u0470\u0472\u0474\u0476\u0478\u047A\u047C\u047E\u0480\u048A\u048C\u048E\u0490\u0492\u0494\u0496\u0498\u049A\u049C\u049E\u04A0\u04A2\u04A4\u04A6\u04A8\u04AA\u04AC\u04AE\u04B0\u04B2\u04B4\u04B6\u04B8\u04BA\u04BC\u04BE\u04C0\u04C1\u04C3\u04C5\u04C7\u04C9\u04CB\u04CD\u04D0\u04D2\u04D4\u04D6\u04D8\u04DA\u04DC\u04DE\u04E0\u04E2\u04E4\u04E6\u04E8\u04EA\u04EC\u04EE\u04F0\u04F2\u04F4\u04F6\u04F8\u04FA\u04FC\u04FE\u0500\u0502\u0504\u0506\u0508\u050A\u050C\u050E\u0510\u0512\u0514\u0516\u0518\u051A\u051C\u051E\u0520\u0522\u0524\u0526\u0528\u052A\u052C\u052E\u0531-\u0556\u10A0-\u10C5\u10C7\u10CD\u13A0-\u13F5\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32
\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFE\u1F08-\u1F0F\u1F18-\u1F1D\u1F28-\u1F2F\u1F38-\u1F3F\u1F48-\u1F4D\u1F59\u1F5B\u1F5D\u1F5F\u1F68-\u1F6F\u1FB8-\u1FBB\u1FC8-\u1FCB\u1FD8-\u1FDB\u1FE8-\u1FEC\u1FF8-\u1FFB\u2102\u2107\u210B-\u210D\u2110-\u2112\u2115\u2119-\u211D\u2124\u2126\u2128\u212A-\u212D\u2130-\u2133\u213E\u213F\u2145\u2183\u2C00-\u2C2E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E-\u2C80\u2C82\u2C84\u2C86\u2C88\u2C8A\u2C8C\u2C8E\u2C90\u2C92\u2C94\u2C96\u2C98\u2C9A\u2C9C\u2C9E\u2CA0\u2CA2\u2CA4\u2CA6\u2CA8\u2CAA\u2CAC\u2CAE\u2CB0\u2CB2\u2CB4\u2CB6\u2CB8\u2CBA\u2CBC\u2CBE\u2CC0\u2CC2\u2CC4\u2CC6\u2CC8\u2CCA\u2CCC\u2CCE\u2CD0\u2CD2\u2CD4\u2CD6\u2CD8\u2CDA\u2CDC\u2CDE\u2CE0\u2CE2\u2CEB\u2CED\u2CF2\uA640\uA642\uA644\uA646\uA648\uA64A\uA64C\uA64E\uA650\uA652\uA654\uA656\uA658\uA65A\uA65C\uA65E\uA660\uA662\uA664\uA666\uA668\uA66A\uA66C\uA680\uA682\uA684\uA686\uA688\uA68A\uA68C\uA68E\uA690\uA692\uA694\uA696\uA698\uA69A\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uFF21-\uFF3A]|\uD801[\uDC00-\uDC27\uDCB0-\uDCD3]|\uD803[\uDC80-\uDCB2]|\uD806[\uDCA0-\uDCBF]|\uD835[\uDC0
0-\uDC19\uDC34-\uDC4D\uDC68-\uDC81\uDC9C\uDC9E\uDC9F\uDCA2\uDCA5\uDCA6\uDCA9-\uDCAC\uDCAE-\uDCB5\uDCD0-\uDCE9\uDD04\uDD05\uDD07-\uDD0A\uDD0D-\uDD14\uDD16-\uDD1C\uDD38\uDD39\uDD3B-\uDD3E\uDD40-\uDD44\uDD46\uDD4A-\uDD50\uDD6C-\uDD85\uDDA0-\uDDB9\uDDD4-\uDDED\uDE08-\uDE21\uDE3C-\uDE55\uDE70-\uDE89\uDEA8-\uDEC0\uDEE2-\uDEFA\uDF1C-\uDF34\uDF56-\uDF6E\uDF90-\uDFA8\uDFCA]|\uD83A[\uDD00-\uDD21])([a-z\u00C0-\u017F])+$
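The first group above is meant to cover every uppercase letter in Unicode (the code points are listed explicitly, including surrogate pairs for characters outside the Basic Multilingual Plane). The second group, [a-z\u00C0-\u017F], then requires one or more further characters. Note that the range \u00C0-\u017F also contains uppercase accented letters (such as Č or Ř), so the pattern does not strictly force the remaining letters to be lowercase.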
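As a rough cross-check outside the browser, the first-name rule can also be sketched in Python without spelling out code points, by leaning on the Unicode-aware string methods. This is a different technique from the regex above, not a translation of it, and the function name is hypothetical.

```python
def is_valid_first_name(name: str) -> bool:
    """First character uppercase, then one or more lowercase letters."""
    # Mirror the regex, which needs one uppercase letter plus at least one more.
    if len(name) < 2:
        return False
    # isalpha() rejects digits, spaces and punctuation anywhere in the name.
    if not name.isalpha():
        return False
    # Unicode-aware case checks, so "Ř" counts as uppercase and "ř" as lowercase.
    return name[0].isupper() and name[1:].islower()
```

For example, is_valid_first_name("Řehoř") is accepted, while "řehoř", "Anna1" and "Anna Marie" are rejected.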
Then for your Nickname you can use:
^([\w\u00C0-\u017F]{1,13})$
^ asserts position at start of a line
1st Capturing Group ([\w\u00C0-\u017F]{1,13})
Match a single character present in the list below [\w\u00C0-\u017F]{1,13}
{1,13} Quantifier: matches between 1 and 13 times, as many times as possible, giving back as needed (greedy)
\w matches any word character (equal to [a-zA-Z0-9_])
\u00C0-\u017F a single character in the range between À (index 192) and ſ (index 383) (case sensitive)
$ asserts position at the end of a line
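The nickname rule can be tried out in Python as a minimal sketch. It assumes the character class without \s, as in the breakdown above, and uses re.ASCII so that \w means [a-zA-Z0-9_] as it does in JavaScript. The function name is hypothetical.

```python
import re

# re.ASCII keeps \w limited to [a-zA-Z0-9_]; the accented range is added explicitly.
NICKNAME = re.compile(r"[\w\u00C0-\u017F]{1,13}", re.ASCII)


def is_valid_nickname(nickname: str) -> bool:
    # fullmatch() plays the role of the ^ and $ anchors.
    return NICKNAME.fullmatch(nickname) is not None
```

Note that \w also lets the underscore through, which the original requirements do not mention.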

Related

RegEx for checking specific rule in string

I cannot figure out how to make this RegEx syntax work.
https://regex101.com/r/Zcxjtn/1
I would like to check whether a string is valid or not.
Rules:
1. The string must consist of 3 capital letters [A-Z]
2. If the string is longer, then each block of 3 capital letters must be separated by a semicolon (;) only
3. The string must not start or end with a separator (;)
4. Optional: whitespace is allowed between a separator and the next 3-letter substring
examples of valid strings:
AAA;BBB
AAA; BBB
AAA
examples of invalid strings:
;AAA
AAA;BBB;
123;AAA
1. The string must consist of 3 capital letters [A-Z]
[A-Z] matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
{3} matches the previous token exactly 3 times
Put together you get
[A-Z]{3}
2. If the string is longer, then each block of 3 capital letters must be separated by a semicolon (;) only
Require the pattern from 1. at the beginning, as ^[A-Z]{3}, followed by a group that occurs 0 or more times until the end of the input, ( )*$. The group contains a leading ; and the pattern from 1., so (;[A-Z]{3})*$.
Put together you get
^[A-Z]{3}(;[A-Z]{3})*$
3. The string must not start or end with a separator (;)
Already covered by 2.
4. Optional: whitespace is allowed between a separator and the next 3-letter substring
Add a whitespace token \s that occurs 0 or more times (*), so \s*.
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
Placed at the correct location in the regex, you get
^[A-Z]{3}(;\s*[A-Z]{3})*$
See: https://regex101.com/r/hZ5l6a/1
If you would like to capture only the letters, add capturing groups ( ) and mark the other groups as non-capturing (?: )
Example:
^([A-Z]{3})(?:;\s*([A-Z]{3}))*$
See: https://regex101.com/r/NlDTfq/1
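To see the rules in action, here is a minimal Python sketch. Because a repeated capturing group keeps only its last match, the sketch validates the whole string first and then collects the blocks with a second, simpler pattern. The function name is hypothetical.

```python
import re

# Same structure as the pattern above; fullmatch() replaces the ^ and $ anchors.
BLOCKS = re.compile(r"[A-Z]{3}(?:;\s*[A-Z]{3})*")


def parse_blocks(text: str):
    """Return the list of 3-letter blocks, or None if the string is invalid."""
    if BLOCKS.fullmatch(text) is None:
        return None
    # The string is known to be valid, so every run of 3 capitals is one block.
    return re.findall(r"[A-Z]{3}", text)
```

With this, parse_blocks("AAA; BBB") gives the two blocks, and strings such as ";AAA" or "AAA;BBB;" are rejected.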

Regex pattern to match username requirements

Can someone please provide me with a regex pattern to match these requirements?
between 3 and 20 characters
begins with a letter
cannot end in a period(.)
can contain: a-z, 0-9, period(.), hyphen(-), underscore(_)
I'm new to regex. I've tried ^[[:alpha:]][[:alnum:]_.-]+$, given to me by someone, and I started to do my own with [a-z,A-Z,.]{3,20}[0-9]*
I will be using this in JavaScript but so far I've just been testing at regexr.com because it is convenient.
You can use /^[a-z][\w.-]{1,18}[\w-]$/i as a pattern. A little breakdown:
^ is an anchor for the start of the string, as we want to check the whole string
[a-z] is a character class matching letters a-z, lowercase and also uppercase due to the i-Modifier. This is used for your begins with a letter condition
[\w.-]{1,18} is a character class matching letters, numbers, underscore (= \w), dot and hyphen. It is repeated 1 to 18 times, which together with the first and the last character gives the required 3 to 20 characters
[\w-] is basically the same character class, but without the dot, to fit cannot end in a period(.)
$ is an anchor for the end of the string
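The question targets JavaScript, but the pattern can be checked quickly in Python as a sketch. re.ASCII is assumed here so that \w means [a-zA-Z0-9_] as in JavaScript, and re.IGNORECASE stands in for the i modifier. The function name is hypothetical.

```python
import re

USERNAME = re.compile(r"[a-z][\w.-]{1,18}[\w-]", re.ASCII | re.IGNORECASE)


def is_valid_username(name: str) -> bool:
    # fullmatch() plays the role of the ^ and $ anchors.
    return USERNAME.fullmatch(name) is not None
```

For instance, "john.doe" is accepted, while "john." (ends in a period), "9lives" (starts with a digit) and "ab" (too short) are rejected.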

regex: extract text blocks, defined beginning, undefined end

I have text like this:
Date: 01.02.2015 //<-stable format
something
something more
some random more
Date: 02.02.2015
something random
i dont know
So I have many such blocks. Each starts with Date... and ends where the next Date... starts.
The lines inside a block can be anything except the Date... format.
I need an array at the end, with such blocks:
array[0] = "Date: 01.02.2015
something
something more
some random more"
array[1] = "Date: 02.02.2015
something random
i dont know"
For now I add a unique splitter before each Date... and then split by that splitter.
Question: is it possible to get such blocks using only a regex?
(I use VBA to parse the text, with the RegExp object.)
Instead of split just match using
\bDate:\s\d{1,2}\.\d{1,2}\.\d{4}[\s\S]*?(?=\nDate:|$)
See demo.
https://regex101.com/r/uF4oY4/77
Syntax explanation (from the linked site):
\b assert position at a word boundary: (^\w|\w$|\W\w|\w\W)
Date: matches the characters Date: literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\d{1,2} matches a digit (equal to [0-9]) between 1 and 2 times, as many times as possible, giving back as needed (greedy)
\. matches the character . literally
\d{1,2} matches a digit (equal to [0-9]) between 1 and 2 times, as many times as possible, giving back as needed (greedy)
\. matches the character . literally
\d{4} matches a digit (equal to [0-9]) exactly 4 times
[\s\S] matches any single character, including line breaks, because it combines the two classes below
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\S matches any non-whitespace character (equal to [^\r\n\t\f\v ])
*? Quantifier: matches the preceding [\s\S] between zero and unlimited times, as few times as possible, expanding as needed (lazy)
(?= ) Positive Lookahead: asserts that the enclosed regex matches at this position, without consuming it
\nDate: Option 1
\n matches a line-feed (newline) character (ASCII 10)
Date: matches the characters Date: literally (case sensitive)
$ Option 2: asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
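The question uses VBA, but the matching approach can be illustrated with a short Python sketch using the same pattern. The sample text is taken from the question, the function name is hypothetical, and the input is assumed to use \n line endings.

```python
import re

# Lazy [\s\S]*? grows one character at a time until the lookahead sees
# either the start of the next "Date:" line or the end of the text.
BLOCK = re.compile(r"\bDate:\s\d{1,2}\.\d{1,2}\.\d{4}[\s\S]*?(?=\nDate:|$)")


def split_blocks(text: str):
    # The pattern has no capturing groups, so findall() returns whole matches.
    return BLOCK.findall(text)


sample = (
    "Date: 01.02.2015\n"
    "something\n"
    "something more\n"
    "some random more\n"
    "Date: 02.02.2015\n"
    "something random\n"
    "i dont know"
)
blocks = split_blocks(sample)
```

Here blocks holds two strings, one per Date block, which corresponds to the array asked for in the question.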

regex - and . _

Writing a regex for a-z, A-Z and allowing - and . _ and integers.
For example:
Testing-Server1
Testing.Server
Testing_Server
Tried this, but was unsure how to allow - _ . and integers:
"^[a-z][A-Z]*$"
Simple enough:
^[-a-zA-Z0-9_.]*$
Explanation
^ = Match from the start of the input
[-a-zA-Z0-9_.] = A character class (a list of allowed characters):
- matches the literal '-' character (must be the first or last character in the class)
a-z matches lowercase alpha characters
A-Z matches uppercase alpha characters
0-9 matches the numeric characters
_ matches the literal '_' character
. matches the literal '.' character (unlike outside a character class, where it matches any character)
* = Match 0 to infinite characters (use + to match at least one character)
$ = Match to the end of the string
Alternative
As stranac mentions in his answer, you can replace a-zA-Z0-9_ with \w, but I prefer the more explicit version, as it's more understandable.
Limiting matched characters
As the OP asked in a comment, to limit the allowed number of characters to 15:
^[-a-zA-Z0-9_.]{0,15}$
Where {0,15} means match between 0 and 15 characters (of the character class) only. You can adjust the values as appropriate, for example, to match at least one character, use {1,15}.
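As a quick check of the explicit character class with a length limit, here is a minimal Python sketch. It uses the {1,15} variant mentioned above so that empty input is rejected; the function name is hypothetical.

```python
import re

# The hyphen comes first in the class, so it is a literal and not a range.
ALLOWED = re.compile(r"[-a-zA-Z0-9_.]{1,15}")


def is_valid_name(name: str) -> bool:
    # fullmatch() plays the role of the ^ and $ anchors.
    return ALLOWED.fullmatch(name) is not None
```

The three examples from the question pass, while a name containing a space or one longer than 15 characters does not.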
The other answers are over-complicating things.
\w already matches letters, digits and underscores, so you only need to add dot and minus to those.
This regex does the trick: r'^[\w.-]+$'
A few examples:
>>> re.search(r'^[\w.-]+$', 'Testing-Server1').group()
'Testing-Server1'
>>> re.search(r'^[\w.-]+$', 'Testing.Server').group()
'Testing.Server'
>>> re.search(r'^[\w.-]+$', 'Testing_Server').group()
'Testing_Server'
Instead of matching the whole string to check that it contains only allowed characters, you can search for the first character that is not allowed. The search stops as soon as one is found, which can be faster:
if not re.search(r'[^\w.-]', yourstring):
...
If you need to check the max length of the string, you can simply write:
if (len(yourstring) < 16 and not re.search(r'[^\w.-]', yourstring)):
The following pattern is enough:
^[a-zA-Z0-9._-]+$
Explanation
a-z a single character in the range between a and z (case sensitive)
A-Z a single character in the range between A and Z (case sensitive)
0-9 a single character in the range between 0 and 9
. matches the character . literally
_ matches the character _ literally
- matches the character - literally
You can try the following pattern :
[a-zA-Z\d._-]+
Try this out:
([a-zA-Z0-9._-]+)
Explanation:
1) a-z a single character in the range between a and z (case sensitive)
2) A-Z a single character in the range between A and Z (case sensitive)
3) 0-9 a single character in the range between 0 and 9
4) ._- a single character in the list ._- literally
It should work; check this: https://regex101.com/r/gZ5xN5/2
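A detail that is easy to get wrong in character classes like these: the range A-z (with a lowercase z) is wider than A-Z, because the ASCII characters [ \ ] ^ _ and ` sit between Z and a. Likewise, 1-9 leaves out 0. The following Python sketch compares the two spellings; the variable and function names are hypothetical.

```python
import re

too_wide = re.compile(r"[a-zA-z1-9._-]+")   # A-z and 1-9: likely typos
intended = re.compile(r"[a-zA-Z0-9._-]+")   # A-Z and 0-9


def accepts(pattern: re.Pattern, text: str) -> bool:
    return pattern.fullmatch(text) is not None
```

With these, too_wide accepts "Testing^Server" but rejects "Server10", while intended does the opposite.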

Using ?=. in regular expression

I saw the phrase
^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])[A-Za-z0-9_##%\*\-]{8,24}$
in a regex that was used as a password-checking mechanism. I have read a few courses on regular expressions, but I never saw the combination ?=. explained.
I want to know how it works. In the example it checks for at least one capital letter, one lowercase letter and one number. I guess it is something like an "if".
(?=regex_here) is a positive lookahead. It is a zero-width assertion, meaning that it matches a location that is followed by the regex contained within (?= and ). To quote from the linked page:
lookaround actually matches characters, but then gives up the match,
returning only the result: match or no match. That is why they are
called "assertions". They do not consume characters in the string, but
only assert whether a match is possible or not. Lookaround allows you
to create regular expressions that are impossible to create without
them, or that would get very longwinded without them.
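The . is not part of the lookahead syntax. It is the ordinary dot, which matches any single character that is not a line terminator, so .* inside the lookahead simply skips over whatever precedes the required character.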
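The "does not consume characters" part can be seen in a small Python sketch. The pattern below is a simplified illustration, not the password regex from the question, and the function name is hypothetical.

```python
import re

# The lookahead checks that a digit exists somewhere ahead,
# but the match itself is produced only by [a-z]+.
pattern = re.compile(r"(?=.*[0-9])[a-z]+")


def leading_letters_if_digit_follows(text: str):
    m = pattern.match(text)
    return m.group() if m else None
```

For "abc1" the result is "abc": the lookahead saw the 1 and succeeded, yet the 1 is not part of the match. For "abc" the lookahead fails and there is no match at all.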
I am new to regex myself, but this is how I understand the regex above:
1- ?= is a positive lookahead, i.e. it looks ahead from the current position to see whether there is anything that matches your search pattern, such as [A-Z].
2- .* allows 0 or more characters before the expression you are looking for, i.e. it lets the lookahead scan up to the end of the input string to find a match.
In short, * is a quantifier that means "0 or more". For instance, if you replaced * with ? in the [A-Z] part, the expression would only match if the 1st or 2nd character is a capital letter. If you replaced it with +, the expression would only match if some character other than the first is a capital letter.
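The breakdown below appears to describe a pattern of the form ^(?=\D*\d)(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z]).{8,30}$, pieced together from the tokens it lists:
^ asserts position at start of the string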
Positive Lookahead (?=\D*\d)
Assert that the Regex below matches
\D matches any character that's not a digit (equivalent to [^0-9])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
Positive Lookahead (?=[^a-z]*[a-z])
Assert that the Regex below matches
Match a single character not present in the list below [^a-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
Match a single character present in the list below [a-z]
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
Positive Lookahead (?=[^A-Z]*[A-Z])
Assert that the Regex below matches
Match a single character not present in the list below [^A-Z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Match a single character present in the list below [A-Z]
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
. matches any character (except for line terminators)
{8,30} matches the previous token between 8 and 30 times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
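Assembling the tokens listed above gives a pattern that can be tried in Python. This is a sketch under that assumption: the full pattern is reconstructed from the breakdown, and the function name is hypothetical.

```python
import re

# Three lookaheads, each anchored at the start: a digit, a lowercase letter
# and an uppercase letter must each appear somewhere; then 8 to 30 characters.
PASSWORD = re.compile(r"^(?=\D*\d)(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z]).{8,30}$")


def is_acceptable(password: str) -> bool:
    # fullmatch() also rules out a trailing newline, which $ alone would tolerate.
    return PASSWORD.fullmatch(password) is not None
```

"Passw0rd" satisfies all three lookaheads and the length rule, whereas "password1" (no uppercase), "PASSWORD1" (no lowercase) and "Pw0" (too short) are rejected.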