Reg-ex with different symbols - regex

I'm using the following regex, which works well. The only problem is
that you cannot mix symbols like
aaa-bbbb\ccc
It should always have the same separator,
like aaa-bbb-cccc
"^(?:(?:-?[A-z0-9]+)*|(?:_?[A-z0-9]+)*|(?:\/?[A-z0-9]+/?)*)\s*$"
How can I change it?
The value may contain:
hyphen '-',
underscore '_',
slash '/'

First off, A-z gives you too wide a range. In ASCII (and Unicode) there are characters between uppercase 'Z' and lowercase 'a' that are not letters or numbers. Use A-Za-z for letters instead. The escape sequence \w is a shorter alternative, but it is not equivalent: it also matches digits and the underscore, which is one of your separators.
Also, it looks like you know you'll always have three sections, so the optional quantifiers (?) and the repeated groups are unnecessary.
[A-Za-z\d]+([-_\\/])[A-Za-z\d]+\1[A-Za-z\d]+\s*
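This ensures the same separator is used throughout; it can be a hyphen, underscore, slash, or backslash (the class [-_\\/] accepts both kinds of slash). Whatever the separator is, it separates three groups of alphanumeric characters. Add ^ and $ around the pattern if the whole string must match.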
Is this what you're looking for?
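As a rough illustration of the backreference idea, here is a minimal Python sketch. The function name `has_consistent_separator` is made up for this example, `[0-9]` stands in for `\d` to keep the match ASCII-only, and `fullmatch` is used in place of explicit anchors.

```python
import re

# Same shape as the pattern above: three alphanumeric groups, with the
# separator captured once and then required again via the backreference \1.
SAME_SEPARATOR = re.compile(r"[A-Za-z0-9]+([-_\\/])[A-Za-z0-9]+\1[A-Za-z0-9]+\s*")


def has_consistent_separator(value: str) -> bool:
    """Return True if both separators in `value` are the same character."""
    return SAME_SEPARATOR.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ("aaa-bbb-cccc", "aaa_bbb_ccc", "aaa-bbbb/ccc"):
        print(sample, has_consistent_separator(sample))
```

The first two samples use one separator throughout and are accepted; the third mixes a hyphen with a slash and is rejected.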

To make sure the symbol is the same throughout you should use a back reference, e.g.
aaa([_/-])bbb\1ccc
             ^^
The \1 must match whatever symbol was captured by the group ([_/-]).
Also you are using [A-z] which almost certainly doesn't do what you think it does, the characters between uppercase A and lowercase z are:
ABCD...XYZ[\]^_`abcd...xyz
You probably want [A-Za-z]
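If you want to check this yourself, a short Python sketch like the one below lists the characters that sit between 'Z' and 'a' and confirms that `[A-z]` accepts them while `[A-Za-z]` does not. The variable names are only illustrative.

```python
import re

# Code points strictly between 'Z' (90) and 'a' (97) in ASCII.
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]

# Which of those characters does each character class accept?
accepted_by_wide_range = [ch for ch in between if re.fullmatch(r"[A-z]", ch)]
accepted_by_letters_only = [ch for ch in between if re.fullmatch(r"[A-Za-z]", ch)]

if __name__ == "__main__":
    print(between)                   # ['[', '\\', ']', '^', '_', '`']
    print(accepted_by_wide_range)    # the same six characters
    print(accepted_by_letters_only)  # []
```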

Related

Regex that only allows empty string, letters, numbers or spaces?

Need help coming up with a regex that only allows numbers, letters, empty string, or spaces.
^[_A-z0-9]*((-|\s)*[_A-z0-9])*$
This one is the closest I've found but it allows underscores and hyphen.
Only letters, numbers, space, or empty string?
Then 1 character class will do.
^[A-Za-z0-9 ]*$
^ : start of the string or line (depending on the flag)
[A-Za-z0-9 ]* : zero or more upper-case or lower-case letters, or digits, or spaces.
$ : end of the string or line (depending on the flag)
The A-z range contains more than just letters.
You can see that in the ASCII table.
Note that \s matches any whitespace, so it also includes tabs and line breaks.
If you want to allow those too, use it instead of the space.
^[A-Za-z0-9\s]*$
Also, depending on the regex engine/dialect that your language/tool uses, you could use \p{L} to match any Unicode letter, since [A-Za-z] only includes the plain ASCII letters.
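A quick way to try the single character class is a small Python check such as the following sketch. The helper name `is_allowed` is an assumption for this example; `fullmatch` plays the role of the ^ and $ anchors.

```python
import re

# Letters, digits and plain spaces only; * also accepts the empty string.
ALLOWED = re.compile(r"[A-Za-z0-9 ]*")


def is_allowed(text: str) -> bool:
    """Return True if `text` is empty or made only of letters, digits and spaces."""
    return ALLOWED.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("", "abc 123", "a_b", "a-b", "a\tb"):
        print(repr(sample), is_allowed(sample))
```

The empty string and "abc 123" pass, while the underscore, hyphen and tab samples are rejected.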
Your regex is more complicated than it needs to be.
The first part is fine: you are allowing letters and numbers, and you can simply add the space character to the class.
Then the * quantifier, which means zero or more, takes care of the empty-string case.
/^[a-z0-9 ]*$/gmi
Notice that I'm not using A-z like you were, because that means any character from A (octal 101, decimal 65) to z (octal 172, decimal 122) in ASCII. It therefore also matches the characters in between (octal 133 to 140, decimal 91 to 96), which are neither numbers nor letters. Instead I've used a-z, which allows lowercase letters, together with the i flag, which tells the regex to ignore case.
Matching only certain characters is equivalent to not matching any other character, so you could use the regex r = /[^a-z\d ]/i to determine if the string contains any character other than the ones permitted. In Ruby that would be implemented as follows.
"aBc d01e e$9" !~ r #=> false
"aBc d01e ex9" !~ r #=> true
In this situation there may not be much to choose between this approach and attempting to match /\A[a-z\d ]*\z/i, but in other situations the use of a negative match can simplify the regex considerably.
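The same negative-match idea can be sketched in Python, for readers who are not using Ruby. The helper name `only_permitted` is hypothetical, and `re.ASCII` is added so that case-insensitive matching stays within ASCII letters.

```python
import re

# Matches any single character that is NOT a letter, a digit or a space.
FORBIDDEN = re.compile(r"[^a-z0-9 ]", re.IGNORECASE | re.ASCII)


def only_permitted(text: str) -> bool:
    """Return True if no forbidden character occurs anywhere in `text`."""
    return FORBIDDEN.search(text) is None


if __name__ == "__main__":
    print(only_permitted("aBc d01e e$9"))  # False: '$' is forbidden
    print(only_permitted("aBc d01e ex9"))  # True
    print(only_permitted(""))              # True: nothing forbidden in an empty string
```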

Regex that excludes spaces and requires 2 capital letters or more

I'm trying to create a regular expression that matches strings with:
19 to 90 characters
symbols
at least 2 uppercase alphabetical characters
lowercase alphabetical characters
no spaces
I already know that for the size and space exclusion the regex would be:
^[^ ]{19,90}$
And I know that this one will match any string with at least 2 uppercase characters:
^(.*?[A-Z]){2,}.*$
What I don't know is how to combine them. There is no context for the strings.
Edit: I forgot to say that it would be better if the regex excluded strings that end with .com, .jpeg, .png, or any .something (that "something" being 2-5 characters).
This regex should do what you want.
^(?=(?:\w*\W+)+\w*$)(?=(?:\S*?[A-Z]){2,}\S*?$)(?=(?:\S*?[a-z])+\S*?$)(?!.*?\.\w{2,5}$).{19,90}$
Basically it uses three positive lookaheads and a negative lookahead to guarantee the conditions that you specified:
(?=(?:\w*\W+)+\w*$)
ensures that there is at least one non-word (symbol) character
(?=(?:\S*?[A-Z]){2,}\S*?$)
ensures that there are at least two uppercase characters, and also excludes a match if there are any spaces in the string
(?=(?:\S*?[a-z])+\S*?$)
ensures that there is at least one lowercase character in the string. The negative lookahead
(?!.*?\.\w{2,5}$)
ensures that strings that end with a . and 2-5 characters are excluded
Finally,
.{19,90}
performs the actual match and ensures that there are between 19 and 90 characters.
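To see the combined lookaheads in action, here is a minimal Python sketch that applies the pattern above to a few made-up strings. The sample strings and the helper name `is_valid` are assumptions for illustration only.

```python
import re

# The pattern from the answer, unchanged: three positive lookaheads,
# one negative lookahead, then the length check.
PATTERN = re.compile(
    r"^(?=(?:\w*\W+)+\w*$)"        # at least one non-word (symbol) character
    r"(?=(?:\S*?[A-Z]){2,}\S*?$)"  # at least two uppercase letters, no whitespace
    r"(?=(?:\S*?[a-z])+\S*?$)"     # at least one lowercase letter
    r"(?!.*?\.\w{2,5}$)"           # must not end in a file-like extension
    r".{19,90}$"                   # 19 to 90 characters in total
)


def is_valid(text: str) -> bool:
    """Return True if `text` satisfies every condition listed above."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    samples = (
        "Correct-Horse-battery",   # satisfies every rule
        "correct-Horse-battery",   # only one uppercase letter
        "Correct Horse-battery",   # contains a space
        "Correct-Horse-pic.jpeg",  # ends like a file name
        "Ab-cD",                   # too short
    )
    for sample in samples:
        print(sample, is_valid(sample))
```

Only the first sample is accepted; each of the others breaks exactly one of the rules.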
Following your requirements, I suggest using the following pattern:
^(?=.*[a-z])(?=.*[A-Z].*[A-Z])(?!.*\s).{19,90}$
Instead of excluding only spaces, I used \s, since you probably don't want to allow tabs, newlines, etc. either. The negative lookahead (?!.*\s) rejects the string if whitespace appears anywhere in it. However, it is still unclear which symbols you want to allow. If, for example, only [a-zA-Z!"§$%&\/()=?+] is permitted, add a lookahead that checks every character up to the end of the string:
^(?=.*[a-z])(?=.*[A-Z].*[A-Z])(?!.*\s)(?=[a-zA-Z!"§$%&\/()=?+]+$).{19,90}$
To cover your additional requirement not to match file-like extensions at the end of the string, add a negative lookahead to the first pattern: (?!.*\.\w{2,5}$)
^(?=.*[a-z])(?=.*[A-Z].*[A-Z])(?!.*\s)(?!.*\.\w{2,5}$).{19,90}$
You can use backreferences as described here: https://www.ocpsoft.org/tutorials/regular-expressions/and-in-regex/
Another reference with examples here: https://www.regular-expressions.info/refcapture.html

Match exact string and random string length in between

I hate regex and I really can't get my head around it properly. I'm trying to match the following example:
fwb fcb"><a href="https://www.facebook.com/random.length?
while random.length can be any word with upper/lowercase letters, a dot or a number. And it ends with the ? so the question mark indicates the end.
I came as far as:
/fwb fcb"><a href="https:\/\/www.facebook.com\/ missing bit ?/g
Any help?
[a-zA-Z0-9\.]+\? should do the trick.
a-z matches all lowercase letters.
A-Z matches all uppercase letters.
0-9 matches all digits.
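Inside a character class the dot is already literal, so the backslash in [a-zA-Z0-9\.] is optional but harmless. Outside a class (as in www.facebook.com) the dot matches any character and needs to be escaped with a backslash.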
+ means that the length of the string can be anything from 1 to infinity.
The missing part could be (\w|\.)+ if underscores were acceptable. Otherwise, as in your case, you have to spell out the allowed characters: [A-Za-z0-9\.]+. Also note that some characters in your regex need to be escaped (for example . and ?).
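Putting the pieces together, a Python sketch of the full pattern might look like this. The sample HTML fragments and the function name `extract_profile` are invented for the example; the capturing group around the character class is an addition so the matched name can be read back.

```python
import re
from typing import Optional

# Literal prefix with the dots escaped, then the profile name, then a literal '?'.
PROFILE = re.compile(
    r'fwb fcb"><a href="https://www\.facebook\.com/([A-Za-z0-9.]+)\?'
)


def extract_profile(html: str) -> Optional[str]:
    """Return the profile name that precedes the '?', or None if nothing matches."""
    match = PROFILE.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = '<span class="fwb fcb"><a href="https://www.facebook.com/random.length?fref=nf">'
    print(extract_profile(sample))  # random.length
```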

Why is this regex allowing a caret?

http://regexr.com/3ars8
^(?=.*[0-9])(?=.*[A-z])[0-9A-z-]{17}$
Should match "17 alphanumeric chars, hyphens allowed too, must include at least one letter and at least one number"
It'll correctly match:
ABCDF31U100027743
and correctly decline to match:
AB$DF31U100027743
(and almost any other non-alphanumeric char)
but will apparently allow:
AB^DF31U100027743
Because your character class [A-z] matches this symbol.
[A-z] matches [, \, ], ^, _, `, and the English letters.
Actually, it is a common mistake. You should use [a-zA-Z] instead to only allow English letters.
So this regex (with the i option) won't match your string:
^(?=.*[0-9])(?=.*[a-z])[0-9a-z-]{17}$
In my opinion, it is always safer to use the IgnoreCase option to avoid such an issue and shorten the regex.
Ranges in a character class follow character-code order. The printable ASCII characters run from space to tilde, so [ -~] matches every one of them.
Likewise, [A-z] matches everything from A (65) to z (122), which includes the six punctuation characters between Z and a.
You're allowing A-z (capital 'A' through lowercase 'z'). You don't say what regex package you're using, but A-Z and a-z are not contiguous in ASCII; there are other characters in between, including the caret. Try this instead:
^(?=.*[0-9])(?=.*[A-Za-z])[0-9A-Za-z-]{17}$
It seems to meet your criteria for me in regexpal.
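The difference between the two ranges is easy to reproduce. The following Python sketch runs the original pattern and the corrected one against the three strings from the question; the constant names `LOOSE` and `STRICT` are just labels chosen for this example.

```python
import re

# Original pattern: [A-z] also lets through [ \ ] ^ _ and the backtick.
LOOSE = re.compile(r"^(?=.*[0-9])(?=.*[A-z])[0-9A-z-]{17}$")

# Corrected pattern: explicit upper- and lowercase ranges.
STRICT = re.compile(r"^(?=.*[0-9])(?=.*[A-Za-z])[0-9A-Za-z-]{17}$")

if __name__ == "__main__":
    for sample in ("ABCDF31U100027743", "AB$DF31U100027743", "AB^DF31U100027743"):
        print(sample, bool(LOOSE.match(sample)), bool(STRICT.match(sample)))
```

Both patterns accept the first string and reject the second, but only the original pattern accepts the string containing the caret.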

Regex for alphanumeric, but at least one letter

In my ASP.NET page, I have an input box that has to have the following validation on it:
Must be alphanumeric, with at least one letter (i.e. can't be ALL
numbers).
^\d*[a-zA-Z][a-zA-Z0-9]*$
Basically this means:
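Zero or more digits (\d; in .NET this also matches non-ASCII Unicode digits unless RegexOptions.ECMAScript is set, so write [0-9] for ASCII only);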
One alphabetic ASCII character;
Zero or more alphanumeric ASCII characters.
Try a few tests and you'll see it accepts any alphanumeric ASCII string that contains at least one letter.
The key to this is the \d* at the front. Without it the regex gets much more awkward to do.
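For anyone who wants to run those few tests, here is a small Python sketch. It is not ASP.NET code; it only exercises the same pattern, with `[0-9]` written in place of `\d` because Python's `\d` also accepts non-ASCII digits. The helper name `has_letter` is hypothetical.

```python
import re

# Leading digits, then one mandatory letter, then any alphanumeric tail.
ALNUM_WITH_LETTER = re.compile(r"[0-9]*[a-zA-Z][a-zA-Z0-9]*")


def has_letter(text: str) -> bool:
    """Return True for an ASCII alphanumeric string containing at least one letter."""
    return ALNUM_WITH_LETTER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("123a", "a", "12a34", "1234", "", "ab-1"):
        print(repr(sample), has_letter(sample))
```

The leading digit run can only stop at the first letter, so the engine never needs to backtrack far to decide.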
Most answers to this question are correct, but there's an alternative, that (in some cases) offers more flexibility if you want to change the rules later on:
^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$
This matches any sequence of alphanumeric characters, but only if the lookahead (?=.*[a-zA-Z].*) first confirms that the string contains at least one letter. Lookaheads are a lesser-known feature of regular expressions that let you handle some very difficult validation problems.
For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here wouldn't work, but using the look-ahead trick, the regex simply becomes:
^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
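As a sketch of how the lookahead and the length constraint work together, the pattern can be tried in Python as shown below. The function name `is_valid_code` is an assumption made for this example.

```python
import re

# The lookahead checks for a letter anywhere; the group then enforces
# "alphanumeric only" and the 6 to 12 character length in one place.
LETTER_AND_LENGTH = re.compile(r"^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$")


def is_valid_code(text: str) -> bool:
    """Return True for 6-12 alphanumeric characters with at least one letter."""
    return LETTER_AND_LENGTH.match(text) is not None


if __name__ == "__main__":
    for sample in ("abc123", "123456", "abc12", "abcdefghijklm", "abc_123"):
        print(sample, is_valid_code(sample))
```

Because the lookahead consumes nothing, the length rule can change without touching the "at least one letter" rule, which is the flexibility the answer describes.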
^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$
Explanation:
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
\p{L} matches one letter
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.
Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.
\p{L} is similar to [A-Za-z] except it will include all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, using a larger set of Unicode characters. If you don't want that flexibility substitute [A-Za-z]. A similar remark applies to \p{N} which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.
The less fancy non-Unicode version would be
^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
^[0-9]*[A-Za-z][0-9A-Za-z]*$
is the regex that will do what you're after. The ^ and $ match the start and end of the string to prevent other characters. You could replace the [0-9A-Za-z] block with \w if underscores are acceptable too, but I prefer the more verbose form because it's easier to extend with other characters if you want.
Add a regular expression validator to your asp.net page as per the tutorial on MSDN: http://msdn.microsoft.com/en-us/library/ms998267.aspx.
^\w*[\p{L}]\w*$
This one's not that hard. The regular expression reads: match a line starting with any number of word characters (letters, numbers, and connector punctuation such as the underscore, which you might not want), that contains one letter character (that's the [\p{L}] part in the middle), followed by any number of word characters again.
If you want to exclude punctuation, you'll need a heftier expression:
^[\p{L}\p{N}]*[\p{L}][\p{L}\p{N}]*$
And if you don't care about Unicode you can use a boring expression:
^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
^[0-9]*[a-zA-Z][a-zA-Z0-9]*$
This can be:
any number of digits followed by a single letter,
or an alphanumeric string that starts with a letter,
or a run of digits followed by a letter and then any alphanumeric characters