VBScript regex to support all accented characters - regex

I have a below regex in VBScript, Pattern:
^(?=.*[a-z])(?=.*[A-Z])(?!.*\s)(?=.*[0-9])(?=.*[!##\$&\*])(?=.{8,20}$)
This validates: length between 8 and 20, with at least one lowercase letter, one uppercase letter, one special character, and one digit.
Issue #1
When I enter à, it passes validation, which should not happen. How can I restrict it?
Issue #2
Later, I realized that users can type with a keyboard for any language, so I modified my regex to support all accented letters, but it's not working either. Pattern:
^(?=.*\\p{L})(?!.*\s)(?=.*[0-9])(?=.*[!##\$&\*])(?=.{8,20}$)
Does VBScript support \p{L} in a regex? Is there any alternative?

Your current pattern does not actually accept à on its own. But it is slightly off and won't implement what you have in mind. Try this instead:
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!##\$&*])[A-Za-z0-9!##$&*]{8,20}$
(the final character class, [A-Za-z0-9!##$&*]{8,20}, is the important part)
This says to assert that there is at least one:
lowercase letter
uppercase letter
digit
special character (!##$&*)
Then, it matches any of the above four types of characters 8 to 20 times.
The critical problem with your pattern, and the reason it would admit accented characters provided the other assertions pass, is this:
(?=.{8,20}$)
Your final lookahead does enforce 8 to 20 characters, but it admits any character. Instead, I use a character class limited to only the types of characters you want to allow.
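To see the difference, both patterns can be run against a password that contains à. This is only a sketch: it uses Python's re module rather than VBScript's RegExp object (an assumption made for illustration), and the sample passwords are made up.

```python
import re

# Pattern from the question: the final lookahead counts 8-20 characters of ANY kind.
ORIGINAL = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?!.*\s)(?=.*[0-9])(?=.*[!##\$&\*])(?=.{8,20}$)"
)

# Pattern from the answer: the consuming character class limits what may appear.
RESTRICTED = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!##\$&*])[A-Za-z0-9!##$&*]{8,20}$"
)


def accepts(pattern, password):
    """Return True if the pattern matches the password."""
    return pattern.search(password) is not None


# Hypothetical sample passwords; the second one contains U+00E0 (a with grave).
samples = ["Passw0rd!", "P\u00e0ssw0rd!"]
results = {s: (accepts(ORIGINAL, s), accepts(RESTRICTED, s)) for s in samples}

for sample, (original_ok, restricted_ok) in results.items():
    print(f"{sample!r}: original={original_ok}, restricted={restricted_ok}")
```

The accented password satisfies every lookahead in the original pattern, so only the restricted character class rejects it.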

Related

Regular Expression for Password strength with one special characters except Underscore

I have the following regular expression:
^.*(?=^.{8,}$)(?=.*\d)(?=.*[!##$%^&*-])(?=.*[A-Z])(?=.*[a-z]).*$
I am using it to validate for
At least one letter
At least one capital letter
At least one number
At least one special character
At least 8 characters
But along with this I need to restrict the underscore (_).
If I enter the password Pa$sw0rd, it validates, which is correct.
If I enter Pa$_sw0rd, it also validates, which is wrong.
The regex passes whenever all the rules are satisfied. I want an additional rule that rejects the underscore.
Any help would be much appreciated.
I think you can use a negated character class [^_] to add this restriction. Also, remove the initial .*, which is redundant. The first lookahead, (?=^.{8,}$), can go too: it already sits at the beginning of the pattern, so there is no need to repeat ^, and the total length can be checked at the end:
^(?=.*\d)(?=.*[!##$%^&*-])(?=.*[A-Z])(?=.*[a-z])[^_]{8,}$
^(?=.*?\d)(?=.*?[!##$%^&*-])(?=.*?[A-Z])(?=.*?[a-z])(?!.*_).{8,}$
You can try this. The .* at the start is of no use. See demo:
https://regex101.com/r/pG1kU1/34
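The two suggested patterns can be compared side by side. The following is a minimal sketch in Python's re module (an assumption, since the question does not name an engine); the passwords are the two from the question plus one hypothetical short one.

```python
import re

# First answer: the consuming class [^_] forbids the underscore.
NEGATED_CLASS = re.compile(
    r"^(?=.*\d)(?=.*[!##$%^&*-])(?=.*[A-Z])(?=.*[a-z])[^_]{8,}$"
)

# Second answer: a negative lookahead rejects any string containing an underscore.
NEGATIVE_LOOKAHEAD = re.compile(
    r"^(?=.*?\d)(?=.*?[!##$%^&*-])(?=.*?[A-Z])(?=.*?[a-z])(?!.*_).{8,}$"
)


def check(password):
    """Return (negated-class verdict, negative-lookahead verdict)."""
    return (
        NEGATED_CLASS.search(password) is not None,
        NEGATIVE_LOOKAHEAD.search(password) is not None,
    )


for candidate in ["Pa$sw0rd", "Pa$_sw0rd", "Pa$sw0r"]:
    print(candidate, check(candidate))
```

Both variants agree on these inputs; they differ only in whether the underscore rule lives in the consuming part of the pattern or in a lookahead.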

Regex for at least one alphabet and shouldn't allow dot(.)

I have written the regex below but I'm facing an issue:
^[^\.]*[a-zA-Z]+$
As per the above regex, df45543 is invalid, but I want to allow such a string. At least one letter is mandatory and a dot is not allowed. All other characters are allowed.
Just add the digits as allowed characters:
^[^\.]*[a-zA-Z0-9]+$
In case you need to disallow dots, and allow at least 1 English letter, then use lookaheads:
^(?!.*\.)(?=.*[a-zA-Z]).+$
(?!.*\.) disallows a dot in the string, and (?=.*[a-zA-Z]) requires at least one English letter.
Another scenario is when the dot is not allowed only at the beginning. Then, use
^(?!\.)(?=.*[a-zA-Z]).+$
You need to use a lookahead to require at least one letter:
^(?=.*?[a-zA-Z])[^.]+$
(?=.*?[a-zA-Z]) is a positive lookahead that makes sure there is at least one letter in the input.
You can use this:
^[^.a-z]*[a-z][^.]*$
(Use a case-insensitive mode, or add A-Z to the character classes.)
You can append the first part of your regex, [^.]*, to the end, like this:
^[^.]*[A-Za-z]+[^.]*$
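The patterns that require a letter and forbid a dot anywhere should all give the same verdicts. Here is a rough check in Python's re module (an assumption, as no engine is named above); the dictionary labels and the sample strings other than df45543 are made up for illustration.

```python
import re

# Patterns taken from the answers above; the labels are hypothetical.
PATTERNS = {
    "two lookaheads": re.compile(r"^(?!.*\.)(?=.*[a-zA-Z]).+$"),
    "lookahead plus negated class": re.compile(r"^(?=.*?[a-zA-Z])[^.]+$"),
    "no lookahead": re.compile(r"^[^.a-z]*[a-z][^.]*$", re.IGNORECASE),
    "letters in the middle": re.compile(r"^[^.]*[A-Za-z]+[^.]*$"),
}


def verdicts(text):
    """Map each pattern label to True/False for the given text."""
    return {label: pattern.search(text) is not None
            for label, pattern in PATTERNS.items()}


for sample in ["df45543", "45A", "12345", "ab.c"]:
    print(sample, verdicts(sample))
```

The strings df45543 and 45A contain a letter and no dot, so every pattern accepts them; 12345 has no letter and ab.c contains a dot, so every pattern rejects them.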

How to optimise this regex to match string (1234-12345-1)

I've got this RegEx example: http://regexr.com?34hihsvn
I'm wondering if there's a more elegant way of writing it, or perhaps a more optimised way?
Here are the rules:
Digits and dashes only.
Must not contain more than 10 digits.
Must have two hyphens.
Must have at least one digit between each hyphen.
Last number must only be one digit.
I'm new to this so would appreciate any hints or tips.
In case the link expires, the text to search is
----------
22-22-1
22-22-22
333-333-1
333-4444-1
4444-4444-1
4444-55555-1
55555-4444-1
666666-7777777-1
88888888-88888888-1
1-1-1
88888888-88888888-22
22-333-
333-22
----------
My regex is: \b((\d{1,4}-\d{1,5})|(\d{1,5}-\d{1,4}))-\d{1}\b
I'm using this site for testing: http://gskinner.com/RegExr/
Thanks for any help,
Nick
Here is a regex I came up with:
(?=\b[\d-]{3,10}-\d\b)\b\d+-\d+-\d\b
This uses a look-ahead to validate the information before attempting the match. It looks for between 3 and 10 characters from the class [\d-], followed by a dash and a digit. After that comes the actual match, which confirms that the string really has the format digits(dash)digits(dash)digit.
From your sample strings this regex matches:
22-22-1
333-333-1
333-4444-1
4444-4444-1
4444-55555-1
55555-4444-1
1-1-1
It also matches the following strings:
22-7777777-1
1-88888888-1
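The claim can be checked by running the pattern over the sample list from the question. This is a small sketch in Python's re module (the question used RegExr, so the engine is an assumption); each sample is searched on its own.

```python
import re

# Look-ahead limits the part before the final "-d" to 3-10 characters of digits/dashes.
PATTERN = re.compile(r"(?=\b[\d-]{3,10}-\d\b)\b\d+-\d+-\d\b")

# Sample strings from the question.
SAMPLES = [
    "22-22-1", "22-22-22", "333-333-1", "333-4444-1", "4444-4444-1",
    "4444-55555-1", "55555-4444-1", "666666-7777777-1",
    "88888888-88888888-1", "1-1-1", "88888888-88888888-22",
    "22-333-", "333-22",
]

matched = [s for s in SAMPLES if PATTERN.search(s)]
extra = [s for s in ["22-7777777-1", "1-88888888-1"] if PATTERN.search(s)]

print(matched)
print(extra)
```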
Your regexp only allows a first and second group of digits with a maximum length of 5. Therefore, valid strings like 1-12345678-1 or 123456-1-1 won't be matched.
This regexp works for the given requirements:
\b(?:\d\-\d{1,8}|\d{2}\-\d{1,7}|\d{3}\-\d{1,6}|\d{4}\-\d{1,5}|\d{5}\-\d{1,4}|\d{6}\-\d{1,3}|\d{7}\-\d{1,2}|\d{8}\-\d)\-\d\b
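The difference between the asker's pattern and the alternation can be seen on the two strings mentioned above. A minimal sketch, assuming Python's re module and splitting the long pattern across two string literals only for readability:

```python
import re

# Pattern from the question: each of the first two groups is capped at 5 digits.
ASKER = re.compile(r"\b((\d{1,4}-\d{1,5})|(\d{1,5}-\d{1,4}))-\d{1}\b")

# Alternation from the answer: the first two groups share a budget of 9 digits.
ALTERNATION = re.compile(
    r"\b(?:\d\-\d{1,8}|\d{2}\-\d{1,7}|\d{3}\-\d{1,6}|\d{4}\-\d{1,5}"
    r"|\d{5}\-\d{1,4}|\d{6}\-\d{1,3}|\d{7}\-\d{1,2}|\d{8}\-\d)\-\d\b"
)


def compare(text):
    """Return (asker's verdict, alternation's verdict) for one string."""
    return (ASKER.search(text) is not None,
            ALTERNATION.search(text) is not None)


for sample in ["1-12345678-1", "123456-1-1", "666666-7777777-1"]:
    print(sample, compare(sample))
```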
You can use this with the m modifier (switch the multiline mode on):
^\d(?!.{12})\d*-\d+-\d$
or this one without the m modifier:
\b\d(?!.{12})\d*-\d+-\d\b
By design, these two patterns match at least three digits separated by hyphens, so there is no need to add a {5,n} quantifier anywhere.
The patterns are also built to fail fast:
I chose to start them with a digit, \d. This way, each line start or word boundary that is not followed by a digit is discarded immediately. Also, having consumed exactly one digit, I know how much of the string may remain.
Then I test the upper limit of the string length with a negative lookahead that checks whether there is one more character than the maximum allows (if 12 characters follow this position, the string has at least 13 characters). There is no need for anything more descriptive than the dot metacharacter here; the goal is only to test the length quickly.
Finally, I describe the rest of the string without doing anything special. That is probably the slowest part of the pattern, but it doesn't matter, since the overwhelming majority of unnecessary positions have already been discarded.
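Both variants can be tried on the sample text from the question. The sketch below assumes Python's re module, where re.MULTILINE plays the role of the m modifier; the variable names are made up.

```python
import re

# Sample text from the question, one candidate per line.
TEXT = """22-22-1
22-22-22
333-333-1
333-4444-1
4444-4444-1
4444-55555-1
55555-4444-1
666666-7777777-1
88888888-88888888-1
1-1-1
88888888-88888888-22
22-333-
333-22"""

# After the first digit, at most 11 more characters may follow (12 in total:
# 10 digits plus 2 hyphens), so 12 further characters means "too long".
LINE_PATTERN = re.compile(r"^\d(?!.{12})\d*-\d+-\d$", re.MULTILINE)
WORD_PATTERN = re.compile(r"\b\d(?!.{12})\d*-\d+-\d\b")

line_matches = LINE_PATTERN.findall(TEXT)
word_matches = WORD_PATTERN.findall(TEXT)

print(line_matches)
print(word_matches)
```

Note that in the word-boundary variant the lookahead counts characters up to the end of the line, not the end of the "word", so the two patterns are only interchangeable when each candidate sits on its own line, as it does here.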

Regex for username that allows numbers, letters and spaces

I'm looking for some regex code that I can use to check for a valid username.
I would like for the username to have letters (both upper case and lower case), numbers, spaces, underscores, dashes and dots, but the username must start and end with either a letter or number.
Ideally, it should also not allow for any of the special characters listed above to be repeated more than once in succession, i.e. they can have as many spaces/dots/dashes/underscores as they want, but there must be at least one number or letter between them.
I'm also interested to find out if you think this is a good system for a username? I've had a look for some regex that could do this, but none of them seem to allow spaces, and I would like for the usernames to have some spaces in them.
Thank you :)
So it looks like you want your username to have a "word" part (sequence of letters or numbers), interspersed with some "separator" part.
The regex will look something like this:
^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$
Here's a schematic breakdown:
           _____sep-word…____
          /                  \
^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$    i.e. "word ( sep word )*"
|\_______/   \____/\_______/  |
| "word"     "sep"  "word"    |
|                             |
from beginning of string...   till the end of string
So essentially we want to match things like word, word-sep-word, word-sep-word-sep-word, etc.
There will be no consecutive sep without a word in between
The first and last char will always be part of a word (i.e. not a sep char)
Note that for [ _.-], - is last so that it's not a range definition metacharacter. The (?:…) is what is called a non-capturing group. We need the brackets for grouping for the repetition (i.e. (…)*), but since we don't need the capture, we can use (?:…)* instead.
To allow uppercase/various Unicode letters etc, just expand the character class/use more flags as necessary.
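As a quick illustration of the "word ( sep word )*" idea, here is a minimal sketch in Python's re module (an assumption, since the question does not name a language). The IGNORECASE flag stands in for "use more flags" to allow uppercase letters, and the function name and sample usernames are made up.

```python
import re

# word ( sep word )* : a separator must always be followed by another word.
USERNAME = re.compile(r"^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$", re.IGNORECASE)


def matches_word_sep_word(name):
    """Return True if the whole name fits the word/separator structure."""
    return USERNAME.fullmatch(name) is not None


for name in ["John Smith", "john.doe-99", "john..doe", "_john", "john_"]:
    print(repr(name), matches_word_sep_word(name))
```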
References
regular-expressions.info/Anchors, Character Class, Repetition, Grouping
Although I'm sure someone will shortly post a 1 million lines regex to do exactly what you want, I don't think in this case a regex is a good solution.
Why don't you write a good old fashioned parser? It will take about as long as writing the regex that does everything you mentioned, but it's going to be much easier to maintain and read.
In particular, this is the tricky part:
"it should also not allow for any of the special characters listed above to be repeated more than once in succession"
Alternatively you can always do a hybrid of the two: a regex for the other checks ([a-zA-Z0-9][a-zA-Z0-9 _.-]*[a-zA-Z0-9]) and a non-regex method for the no-repeat requirement. (The hyphen goes last in the character class so that it is not read as a range.)
You don't have to use a regex for everything. I find that requirements like the "no two consecutive characters" usually make the regexes so ugly that it's better to do that bit with a simple procedural loop.
I'd just use something like ^[A-Za-z0-9][A-Za-z0-9 \.\-_]*[A-Za-z0-9]$ (or the equivalents like [:alnum:] if your regex engine is more advanced) and then just check every character in a loop to make sure the next character isn't the same.
By doing it procedurally, you can check all the other rules you're likely to want at some point without resorting to what I call "regex gymnastics", things like:
not allowed to contain your first or last name.
no more than two consecutive digits.
and so forth.
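A hybrid along these lines might look like the sketch below, written in Python as an assumption since no language is named. The function name is hypothetical, and the loop implements the asker's rule (no two separators in a row, even different ones) rather than a strict "same character twice" check.

```python
import re

SEPARATORS = " ._-"

# Simple regex: allowed characters only, starting and ending with a letter or digit.
ALLOWED = re.compile(r"[A-Za-z0-9][A-Za-z0-9 ._-]*[A-Za-z0-9]")


def is_valid_username_hybrid(name):
    """Regex for the character rules, plain loop for the no-repeat rule."""
    if ALLOWED.fullmatch(name) is None:
        return False
    # Walk over adjacent pairs and reject two separators in succession.
    for current, following in zip(name, name[1:]):
        if current in SEPARATORS and following in SEPARATORS:
            return False
    return True


for name in ["john doe", "john__doe", "john._doe", ".john", "a"]:
    print(repr(name), is_valid_username_hybrid(name))
```

One consequence of this particular regex is that a username needs at least two characters; further rules (such as a digit limit) could be added as extra checks in the function.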

Regular expression (alphanumeric)

I need a regular expression to allow the user to enter an alphanumeric string that starts with a letter (not a digit).
This should work in any of the Regular Expression (RE) engines. There is a nicer syntax in the PCRE world but I prefer mine to be able to run anywhere:
^[A-Za-z][A-Za-z0-9]*$
Basically, the first character must be alphabetic, followed by zero or more alphanumerics. The start and end anchors are there to ensure that the whole line is matched. Without those, you may match the AB12 of the "###AB12!!!" string.
Full explanation:
^ start anchor.
[A-Za-z] any one of the upper/lower case letters.
[A-Za-z0-9] any one of the upper/lower case letters or digits,
* repeated zero or more times.
$ end anchor.
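The effect of the ^ and $ markers can be shown with a short sketch in Python's re module (an assumption; the pattern itself is engine-neutral). The variable names are made up.

```python
import re

ANCHORED = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")
UNANCHORED = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Without ^ and $, the engine is free to match a fragment inside the string.
partial = UNANCHORED.search("###AB12!!!")
print(partial.group())

# With ^ and $, the whole string has to fit the pattern.
for text in ["###AB12!!!", "AB12", "1AB", "A"]:
    print(repr(text), ANCHORED.search(text) is not None)
```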
Update:
As Richard Szalay rightly points out, this is ASCII only (or, more correctly, any encoding scheme where the A-Z, a-z and 0-9 groups are contiguous) and only for the "English" letters.
If you want true internationalized REs (only you know whether that is a requirement), you'll need to use one of the more appropriate RE engines, such as the PCRE mentioned above, and ensure it's compiled for Unicode mode. Then you can use "characters" such as \p{L} and \p{N} for letters and numerics respectively. I think the RE in that case would be:
^\p{L}[\pL\pN]*$
but I'm not certain. I've never used REs for our internationalized software. See the PCRE documentation for more than you ever wanted to know about PCRE.
I think this should do the work:
^[A-Za-z][A-Za-z0-9]*$
You're looking for a pattern like this:
^[a-zA-Z][a-zA-Z0-9]*$
That one requires one letter and any number of letters/numbers after that. You may want to adjust the allowed lengths.