I need a regex which is matched when the string doesn't have both lowercase and uppercase letters.
If the string has only lowercase letters -> should be matched
If the string has only uppercase letters -> should be matched
If the string has only digits or special characters -> should be matched
For example
abc, ABC, 123, abc123, ABC123&^ - should match
AbC, A12b, AB^%12c - should not match
Basically I need an inverse/negation of the following regex:
^(?=.*[a-z])(?=.*[A-Z]).+$
Does not sound like any lookarounds would be needed.
Either match only characters that are not a-z, or only characters that are not A-Z.
^(?:[^a-z]+|[^A-Z]+)$
See this demo at regex101 (used + for one or more)
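To check the pattern against the examples from the question, here is a minimal JavaScript sketch. The helper name lacksMixedCase is made up for this illustration only.

```javascript
// Matches strings that do NOT contain both a lowercase and an uppercase ASCII letter.
const singleCase = /^(?:[^a-z]+|[^A-Z]+)$/;

// Hypothetical helper name, used only for this illustration.
const lacksMixedCase = (s) => singleCase.test(s);

const shouldMatch = ["abc", "ABC", "123", "abc123", "ABC123&^"];
const shouldNotMatch = ["AbC", "A12b", "AB^%12c"];

for (const s of shouldMatch) console.log(s, lacksMixedCase(s));    // true for each
for (const s of shouldNotMatch) console.log(s, lacksMixedCase(s)); // false for each
```

Because of the + quantifier, an empty string is not matched; switching to * would accept it.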
You may use
^(?!.*[A-Z].*[a-z])(?!.*[a-z].*[A-Z])\S+$
Or
^(?=(?:[^a-z]+|[^A-Z]+)$).*$
See the regex demo #1 and regex demo #2
A lookaround solution like this can be used in more complex scenarios, when you need to apply more restrictions on the pattern. Else, consider a non-lookaround solution.
Details
^ - start of string
(?!.*[A-Z].*[a-z]) - no uppercase followed with a lowercase letter
(?!.*[a-z].*[A-Z]) - no lowercase letter followed with an uppercase one
(?=(?:[^a-z]+|[^A-Z]+)$) - a positive lookahead that requires 1 or more characters other than lowercase ASCII letters ([^a-z]+) to the end of the string, or 1 or more characters other than uppercase ASCII letters ([^A-Z]+) to the end of the string
\S+ - 1+ non-whitespace chars (in the second pattern, .* matches 0+ chars other than line break chars)
$ - end of string.
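The two lookaround patterns are not fully interchangeable: the \S+ in the first one also rejects whitespace, while the second one accepts it. A small JavaScript sketch (variable names are my own) makes the difference visible:

```javascript
// Pattern 1: two negative lookaheads, then one or more non-whitespace chars.
const noMixA = /^(?!.*[A-Z].*[a-z])(?!.*[a-z].*[A-Z])\S+$/;
// Pattern 2: one positive lookahead, then any chars except line breaks.
const noMixB = /^(?=(?:[^a-z]+|[^A-Z]+)$).*$/;

for (const s of ["abc", "ABC123&^", "AbC", "A12b", "abc def"]) {
  console.log(JSON.stringify(s), noMixA.test(s), noMixB.test(s));
}
// "abc" and "ABC123&^" pass both, "AbC" and "A12b" fail both,
// "abc def" fails pattern 1 (space) but passes pattern 2.
```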
You can use this regex
^(([A-Z0-9?&%^](?![a-z]))+|([a-z0-9?&%^](?![A-Z]))+)$
You can test more cases here.
I've only added the characters ?&%^ as allowed special characters, but you can add whichever ones you like.
I would go with:
^(?:[^a-z]+?|[^A-Z]+?)$
It translates to "If the entire string is composed of non-lowercase letters or non-uppercase letters then match the string."
Lazy quantifiers +? are used so that the end-of-string $ anchor is obeyed when the multiline flag is enabled. If you're only validating a single-line string then you can simply use + without the question mark.
If you have a whitelist of specific allowed special chars then change [^a-z] into [A-Z0-9()_+=-] (and likewise [^A-Z] into [a-z0-9()_+=-]) and list the allowed special chars.
https://regex101.com/r/Wg6tLn/1
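To illustrate the point about the multiline flag, here is a small JavaScript sketch (the sample text is my own). With greedy quantifiers the negated classes also consume line breaks, so one match can span several lines; the lazy version stops at the first line end.

```javascript
const text = "abc\n123\nAbC";

// Greedy: [^A-Z]+ also consumes "\n", so a single match can cover two lines.
const greedy = text.match(/^(?:[^a-z]+|[^A-Z]+)$/gm);

// Lazy: each alternative stops at the first position where $ is satisfied.
const lazy = text.match(/^(?:[^a-z]+?|[^A-Z]+?)$/gm);

console.log(JSON.stringify(greedy)); // ["abc\n123"]
console.log(JSON.stringify(lazy));   // ["abc","123"]
```

In both cases the mixed-case line AbC is left out.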
Related
I am new to RegExp. I have a sentence and I would like to pull out a word which satisfies the following -
It must contain only one capitalized letter
It must consist of only characters/letters without numbers
For instance -
"appLe", "warDrobe", "hUsh"
The words that do not fit - "sf_dsfsdF", "331ffsF", "Leopard1997", "mister_Ram" et cetera.
How would you resolve this problem?
The following regex should work:
will find words that have only one capital letter
will only find words with letters (no numbers or special characters)
will match the entire word
\b(?=[A-Z])[A-Z][a-z]*\b|\b(?=[a-z])[a-z]+[A-Z][a-z]*\b
Matches:
appLe
hUsh
Harry
suSan
I
Rejects
HarrY - has TWO capital letters
warDrobeD - has TWO capital letters
sf_dsfsdF - has SPECIAL characters
331ffsF - has NUMBERS
Leopd1997 - has NUMBERS
mistram - does not have a CAPITAL LETTER
See it in action here
Note:
If the capital letter is OPTIONAL, then you will need to add a ? after each [A-Z] like this:
\b(?=[A-Z])[A-Z]?[a-z]*\b|\b(?=[a-z])[a-z]+[A-Z]?[a-z]*\b
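Since the question asks about pulling such words out of a sentence, here is a minimal JavaScript sketch using the first (mandatory capital) pattern with the g flag. The sentence is an invented example.

```javascript
// One capital letter, letters only, whole word (pattern from the answer above).
const oneCapital = /\b(?=[A-Z])[A-Z][a-z]*\b|\b(?=[a-z])[a-z]+[A-Z][a-z]*\b/g;

// Invented sample sentence mixing valid and invalid words.
const sentence = "suSan gave Harry an appLe, not HarrY or Leopd1997";

// match() with the g flag returns all matches, or null when there are none.
const words = sentence.match(oneCapital) || [];
console.log(words); // [ 'suSan', 'Harry', 'appLe' ]
```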
You can do this by using character sets ([a-z] & [A-Z]) with appropriate quantifiers (use ? for one or zero capitals), wrapped in () to capture, surrounded by word boundaries \b.
If the capital is optional and can appear anywhere use:
/\b([a-z]*[A-Z]?[a-z]*)\b/ // will still match an empty string, so check the length
If you always want one capital appearing anywhere use:
/\b([a-z]*[A-Z][a-z]*)\b/ // does not match empty string
If you always want one capital that must not be the first or last character use:
/\b([a-z]+[A-Z][a-z]+)\b/ // does not match empty string
Here is a working snippet demonstrating the second regex from above in JavaScript:
const exp = /\b([a-z]*[A-Z][a-z]*)\b/
const strings = ["appLe", "warDrobe", "hUsh", "sf_dsfsdF", "331ffsF", "Leopard1997", "mister_Ram", ""];
for (const str of strings) {
console.log(str, exp.test(str))
}
Regex101 is great for dev & testing!
RegExp:
/\b[a-z\-]*[A-Z][a-z\-]*\b/g
Demo:
RegEx101
Explanation
\b[a-z\-]* - find a word boundary (a point where a non-word character is adjacent to a word character, i.e. \w or [A-Za-z0-9_]), then match zero or more lowercase letters and hyphens (the hyphen is escaped as \- to keep it literal inside the character class)
[A-Z] - match a single uppercase letter
[a-z\-]*\b - match zero or more lowercase letters and hyphens, then find another word boundary
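A short JavaScript sketch of how the hyphen-aware pattern behaves; the sample string is invented for illustration.

```javascript
// Lowercase letters and hyphens around exactly one uppercase letter.
const hyphenated = /\b[a-z\-]*[A-Z][a-z\-]*\b/g;

// Invented sample: two hyphenated words, one plain word, one with an underscore.
const sample = "a well-Known apPle-pie and mister_Ram";

const found = sample.match(hyphenated) || [];
console.log(found); // [ 'well-Known', 'apPle-pie' ]
```

mister_Ram is skipped because the underscore counts as a word character, so there is no word boundary before Ram.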
Recently I ran into a validation situation I've been trying to solve with regex. The rules are as such:
Must start with a capital letter
Center of the string may be of any length
Center of the string may have any combination of upper and lower case letters and numbers
Center of the string may have up to one underscore
Must end with a number
I have attempted to match this string with the following regex:
^(?!_{2,})([A-Z][a-zA-Z0-9_]*[0-9])$
and
^(?<=_{0,1})([A-Z][a-zA-Z0-9_]*[0-9])$
Both of these attempts still match cases where there is more than one underscore present. I.E. App_l_e9 or App__le9.
How can you check to see if your regex match, I.E. the ([A-Z][a-zA-Z0-9_]*[0-9]) part contains zero or one underscore in any place within the middle of the string?
The simplest approach would probably be this
^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$
Explanation:
^[A-Z] Must start with an uppercase letter
[a-zA-Z0-9]* A combination of uppercase and lowercase letters and numbers of any length (also 0-length)
_? Either zero or one underscore character
[a-zA-Z0-9]* Again A combination of uppercase and lowercase letters and numbers of any length (also 0-length)
[0-9]$ Must end with a number
This will accept A_9 or AA0_xY8 but for instance not aXY_34 or Aasf1__asdf5
If the underscore in the middle part must not be the first or last character of that middle part, you can replace each * with a + like this:
^[A-Z][a-zA-Z0-9]+_?[a-zA-Z0-9]+[0-9]$
This won't accept, for instance, A_9 anymore; the shortest accepted word with an underscore is something like Ax_d9. Note that a string without an underscore must then be at least four characters long as well.
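Both variants can be tried out with a small JavaScript sketch; the names relaxed and strict are my own labels for the * and + versions.

```javascript
// * version: the underscore may touch the first letter or the final digit.
const relaxed = /^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$/;
// + version: at least one alphanumeric char on each side of the underscore position.
const strict = /^[A-Z][a-zA-Z0-9]+_?[a-zA-Z0-9]+[0-9]$/;

const inputs = ["A_9", "AA0_xY8", "Ax_d9", "Apple9", "aXY_34", "Aasf1__asdf5", "App_l_e9"];
for (const s of inputs) {
  console.log(s, relaxed.test(s), strict.test(s));
}
// A_9 passes only the relaxed pattern; AA0_xY8, Ax_d9 and Apple9 pass both;
// aXY_34, Aasf1__asdf5 and App_l_e9 fail both.
```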
You might also start the match with an uppercase A-Z and immediately check that the string ends with a number 0-9 using a positive lookahead to prevent catastrophic backtracking.
^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*_?[a-zA-Z0-9]*$
^ Start of string
[A-Z] Match an uppercase char A-Z
(?=.*[0-9]$) Positive lookahead to assert a digit 0-9 at the end of the string
[a-zA-Z0-9]* Optionally match any of the listed
_? Match an optional _
[a-zA-Z0-9]* Optionally match any of the listed
$ End of string
Regex demo
Or with an optional group
^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*(?:_[a-zA-Z0-9]*)?$
Regex demo
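As a sanity check, the lookahead variants can be compared with the simpler pattern on a handful of inputs. This is only a sketch in JavaScript with sample strings of my own choosing, not a proof of equivalence.

```javascript
const patterns = {
  simple: /^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$/,
  lookahead: /^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*_?[a-zA-Z0-9]*$/,
  optionalGroup: /^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*(?:_[a-zA-Z0-9]*)?$/,
};

const samples = ["A9", "A_9", "AA0_xY8", "Apple9", "A", "A_", "aXY_34", "App_l_e9", "App__le9"];
for (const s of samples) {
  // One boolean per pattern, in the order simple, lookahead, optionalGroup.
  const results = Object.values(patterns).map((re) => re.test(s));
  console.log(s, results);
}
// The first four samples are accepted by all three patterns, the rest by none.
```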
I want to find invoice numbers with a regex. The string has to be longer than 3 characters. It may contain signs like {., , /, _}, all numbers and it may contain one or two capital letters - those can stand alone or follow each other. This is what I'm currently trying, without success.
`([0-9-\.\\\/_]{,3})([A-Z]{0,2})?`
Here I have two examples, which should be matched:
019S836/03717008
DR094255
This should not be matched:
DRF094255
Can somebody help me please?
You can use
^(?!(?:[^A-Z]*[A-Z]){3})(?=\D*\d)[0-9A-Z.\\\/_-]{4,}$
See the regex demo.
Details:
^ - start of string
(?!(?:[^A-Z]*[A-Z]){3}) - a negative lookahead that fails the match if, immediately to the right of the current location (i.e. from the start of string), there are three occurrences of any zero or more chars other than uppercase ASCII letters followed with one uppercase ASCII letter
(?=\D*\d) - there must be at least one digit in the string
[0-9A-Z.\\\/_-]{4,} - four or more occurrences of digits, uppercase letters, ., \, /, _ or -
$ - end of string.
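Here is a small JavaScript sketch of the pattern applied to the examples from the question. It uses the {4,} quantifier from the breakdown above, on the assumption that "longer than 3" means at least four characters; the extra inputs AB1 and 12/3 are my own.

```javascript
// At most two uppercase letters, at least one digit, four or more allowed chars.
const invoice = /^(?!(?:[^A-Z]*[A-Z]){3})(?=\D*\d)[0-9A-Z.\\\/_-]{4,}$/;

const candidates = ["019S836/03717008", "DR094255", "DRF094255", "AB1", "12/3"];
for (const s of candidates) {
  console.log(s, invoice.test(s));
}
// 019S836/03717008 -> true, DR094255 -> true,
// DRF094255 -> false (three capitals), AB1 -> false (too short), 12/3 -> true
```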
I am working on a text processing API in Java. I need to match the strings which are:
At least 8 characters in length.
Should only contain uppercase letters, lowercase letters or spaces.
Spaces must not appear between the letters; they can, however, be leading or trailing. The string can also consist only of spaces, as long as there are at least 8 of them.
Regular expression which I tried but failed:
^\s*[a-zA-Z]{8,}\s*$
Demo of my tries in here.
Any help will be welcomed.
You can use the below regex to achieve your result:
^(?=.{8,}) *[a-zA-Z]* *$
Explanation of the above regex:
^ - denotes start of the test String.
(?=) - Positive lookahead.
.{8,} - any character other than newline with length at least 8.
A space followed by * - 0 or more spaces, to match the leading spaces (\s is avoided because it would also match tabs and line breaks).
[a-zA-Z]* - 0 or more letters (uppercase or lowercase). (You can use [a-z]* along with the i (case-insensitive) flag, although it will have no effect on performance.)
A space followed by * - 0 or more spaces, to match the trailing spaces (\s is avoided for the same reason).
$ - denotes end of the test String.
Above regex demo.
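The difference between the attempt from the question and the lookahead version can be seen with a few inputs. The question is about Java, but the pattern itself is not Java-specific, so this sketch uses JavaScript; the sample strings are my own.

```javascript
// Attempt from the question: requires 8 letters, spaces are not counted.
const attempt = /^\s*[a-zA-Z]{8,}\s*$/;
// Answer: the lookahead counts all characters, spaces included.
const fixed = /^(?=.{8,}) *[a-zA-Z]* *$/;

const tests = ["abcdefgh", "  abcdef", "  abc   ", "        ", "abc defgh", "abcdefg", "abcd1234"];
for (const s of tests) {
  console.log(JSON.stringify(s), attempt.test(s), fixed.test(s));
}
// The first four strings pass the fixed pattern (only "abcdefgh" passes the attempt);
// "abc defgh" (inner space), "abcdefg" (7 chars) and "abcd1234" (digits) fail both.
```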
I have a string, actually is a directory file name.
str='\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3'
I need to extract the target substring 'UA0001A' with MATLAB (though I would like to think all tools share the same syntax).
It does not necessarily have to be exactly 'UA0001A'; it can be an arbitrary letter-number combination.
To make it more general, I would like the substring (or the word) to satisfy the following:
it is an alphanumeric (letter-number) word
it cannot be a purely alphabetic word or a purely numeric word
it cannot include 'midd' or 'midd3' or 'Midd3' or 'MIDD3', etc., so a case-insensitive method may be used to exclude words beginning with 'midd'
it cannot include 'y[0-9]{2,4}m[0-9]{1,2}d[0-9]{1,2}\w*'
How to write the regular expression to find the target substring?
Thanks in advance!
You can use
s = '\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3';
res = regexp(s, '(?i)\\(?![^\W_]*(midd|y\d+m\d+))(?=[^\W_]*\d)(?=[^\W_]*[a-zA-Z])([^\W_]+)','tokens');
disp(res{1}{1})
See the regex demo
Pattern explanation:
(?i) - the case-insensitive modifier
\\ - a literal backslash
(?![^\W_]*(midd|y\d+m\d+)) - a negative lookahead that will fail a match if there are midd or y+digits+m+digits after 0+ letters or digits
(?=[^\W_]*\d) - a positive lookahead that requires at least 1 digit after 0+ digits or letters ([^\W_]*)
(?=[^\W_]*[a-zA-Z]) - there must be at least 1 letter after 0+ letters or digits
([^\W_]+) - Group 1 (the value that will be extracted) matching 1+ letters or digits (or 1+ characters other than non-word chars and _).
The 'tokens' "mode" will let you extract the captured value rather than the whole match.
See the IDEONE demo
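For readers without MATLAB, the same pattern can be tried in JavaScript. This is a port under my own assumptions: the inline (?i) modifier is replaced by the i flag, the inner group is made non-capturing, and String.raw is used so the backslashes in the path stay literal.

```javascript
// Sample path from the question; String.raw keeps every backslash as typed.
const path = String.raw`\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3`;

// Backslash, then an alphanumeric run that has a digit and a letter
// and does not contain midd or a y<digits>m<digits> stamp.
const target = /\\(?![^\W_]*(?:midd|y\d+m\d+))(?=[^\W_]*\d)(?=[^\W_]*[a-zA-Z])([^\W_]+)/i;

const m = path.match(target);
const extracted = m ? m[1] : null; // group 1 plays the role of MATLAB's 'tokens'
console.log(extracted); // UA0001A
```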
this should get you started:
[\\](?i)(?![^\\]*midd)([0-9]+[a-z]+[a-z0-9]*|[a-z]+[0-9]+[a-z0-9]*)
[\\] : match a backslash
(?i) : rest of regex is case insensitive
?! : negative lookahead, the text that follows must not match this
(?![^\\]*midd) : the current path segment (everything up to the next backslash) must not contain midd. Using .* here instead of [^\\]* would scan the rest of the string, and the sample path would then be rejected everywhere because Midd3 appears later on
([0-9]+[a-z]+[a-z0-9]*|[a-z]+[0-9]+[a-z0-9]*) : match at least one number followed by at least one letter OR at least one letter followed by at least one number, followed in both cases by any amount of letters and numbers (remember, the ?! group must not match, so no segment which contains midd)