How to write a regex in title case - regex

I'm working with an SAP application called Information Steward and creating a rule where names have to be in title case (i.e. each word is capitalized).
I've formulated the following rule:
BEGIN
IF(match_regex($name, '(^(\b[A-Z]\w*\s*)+$)', null)) RETURN TRUE;
ELSE RETURN FALSE;
END
Although it runs successfully, it appears to accept inputs which should be identified as 'FALSE'.
'TesT Name' and 'TEST NAME' should be FALSE but are instead passing under this regex.
Any help/guidance with the regex would be very useful.

The (^(\b[A-Z]\w*\s*)+$) regex matches an entire string that consists of:
^ - start of string
(\b[A-Z]\w*\s*)+ - 1 or more occurrences (due to (...)+) of
\b - a word boundary
[A-Z] - an uppercase ASCII letter
\w* - 0 or more letters/digits/underscores
\s* - 0+ whitespaces
$ - end of string.
As you see, it allows trailing whitespace, and \w matches what [A-Za-z0-9_] matches, i.e. it matches both lower- and uppercase letters.
You want to only match lowercase letters after initial uppercase ones, also allowing - and _ chars. You may use
^[A-Z][a-z0-9_-]*(\s+[A-Z][a-z0-9_-]*)*$
See the regex demo.
Details
^ - start of string anchor
[A-Z][a-z0-9_-]* - an uppercase letter followed with 0+ lowercase letters, digits, _ or - chars
(\s+[A-Z][a-z0-9_-]*)* - zero or more occurrences of:
\s+ - 1 or more whitespaces
[A-Z][a-z0-9_-]* - an uppercase letter followed with 0+ lowercase letters, digits, _ or - chars
$ - end of string.
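As a quick check outside Information Steward, the original and the corrected patterns can be compared in Python. This is only a sketch: it assumes Python's re module treats these constructs the same way as the engine behind match_regex, and the names ORIGINAL, FIXED and is_title_case are made up for illustration.

```python
import re

# Pattern from the question: \w also matches uppercase letters.
ORIGINAL = re.compile(r'(^(\b[A-Z]\w*\s*)+$)')
# Pattern from the answer: only lowercase letters, digits, _ or - after the capital.
FIXED = re.compile(r'^[A-Z][a-z0-9_-]*(\s+[A-Z][a-z0-9_-]*)*$')

def is_title_case(name: str) -> bool:
    """True if each word is one capital followed by lowercase letters, digits, _ or -."""
    return FIXED.match(name) is not None

for sample in ['Test Name', 'TesT Name', 'TEST NAME']:
    # Columns: input, result of the original pattern, result of the fixed pattern
    print(sample, bool(ORIGINAL.match(sample)), is_title_case(sample))
```

The original pattern accepts all three samples, while the fixed one accepts only 'Test Name'.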

I would write your regex as:
^[A-Z]\w*(?:\s+[A-Z]\w*)*$
This matches a single word starting with a capital letter, followed by zero or more repetitions of one or more whitespace characters and another word starting with a capital letter.
I phrase a matching word as starting with [A-Z] followed by \w*, meaning zero or more word characters. This allows for things like A to match.
Demo
Edit:
Based on the comments above, if you want some other character class to represent what follows the initial uppercase letter, then do that instead:
^[A-Z][something]*(?:\s+[A-Z][something]*)*$
where [something] is your character class.
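To see what difference the choice of [something] makes, here is a small Python sketch. The helper build_pattern and the choice of [a-z] as the stricter class are assumptions for illustration, not part of the answer, and Python's re module is assumed to behave like the target engine here.

```python
import re

def build_pattern(tail_class: str) -> re.Pattern:
    """Build the title-case pattern with tail_class in place of [something]."""
    word = rf'[A-Z]{tail_class}*'
    return re.compile(rf'^{word}(?:\s+{word})*$')

loose = build_pattern(r'\w')       # the pattern as first written, with \w*
strict = build_pattern(r'[a-z]')   # one possible choice for [something]

print(bool(loose.match('TesT Name')))    # True: \w also matches uppercase letters
print(bool(strict.match('TesT Name')))   # False
print(bool(strict.match('Test Name')))   # True
```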

Related

Regex to check Optional Group of numbers

I am trying to create a regex which should accept the following strings:
proj_asdasd_000.gz.xml
proj_asdasd.gz.xml
Basically, the 2nd underscore is optional, and if any value follows it, it should only be an integer.
Following is the regex that I am trying:
^proj([a-zA-z0-9]?)+_[a-zA-z]+(_[0-9]?)+\.[a-z]+.[a-z]
Any suggestion to make it accept the above mentioned strings?
You may use either of the following:
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$
See the regex demo
Details
^ - start of string
proj - a literal substring
[a-zA-Z0-9]* - 0 or more alphanumeric chars
_ - a _ char
[a-zA-Z]+ - 1+ ASCII letters
(?:_[0-9]+)? - an optional sequence of an underscore followed with 1+ digits
\.[a-z]+\.[a-z]+ = (?:\.[a-z]+){2} - two occurrences of . and 1+ lowercase ASCII letters
$ - end of string.
Notes:
[A-z] matches more than just ASCII letters
([a-zA-z0-9]?)+ matches an optional character 1 or more times, which makes little sense. Either match a char 1 or more times with + or 0 or more times with *, no need of parentheses
(_[0-9]?)+ matches 1 or more sequences of _ followed by a single optional digit (so, it matches _9___1_, for example). The quantifiers must be swapped to match an optional sequence of _ and 1+ digits.
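A short Python sketch to check the first pattern against the two file names from the question, plus two invalid variants added here for illustration. The name FILENAME is hypothetical, and Python's re module is assumed to be close enough to the target engine for these constructs.

```python
import re

FILENAME = re.compile(r'^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$')

samples = [
    'proj_asdasd_000.gz.xml',   # optional numeric part present
    'proj_asdasd.gz.xml',       # optional numeric part absent
    'proj_asdasd_.gz.xml',      # second underscore without digits
    'proj_asdasd_00a.gz.xml',   # non-digit after the second underscore
]
for name in samples:
    print(name, bool(FILENAME.match(name)))

# [A-z] spans the ASCII range from 'A' to 'z', which also covers [ \ ] ^ _ and `
print(bool(re.fullmatch(r'[A-z]', '_')))   # True
```

The first two names match and the last two do not.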

Regex to match if a word starts and end with a letter, have no more than one consecutive non-letter (. *')

I'm currently trying to find a regex to match a specific use case and I'm not finding any way to achieve it. As the title says, I would like to match if a word starts and ends with a letter and contains only letters and these characters: space, *, - and '. It should also have no more than one consecutive non-letter.
I currently have this, but it accepts consecutive non-letters and doesn't accept single letters: [a-zA-Z][a-zA-Z \-*']+[a-zA-Z]
I want my regex to accept this string
This is accepted since it contains only spaces and letter and there is no consecutive space
a should be accepted
This is --- not accepted because it contains 5 consecutive non-letters characters (3 dashes and 2 spaces)
" This is not accepted because it starts with a space"
Neither is this one since it ends with a dash -
You may use
^[a-zA-Z]+(?:[ *'-][a-zA-Z]+)*$
See the regex demo.
Details
^ - start of string anchor
[a-zA-Z]+ - 1+ ASCII letters
(?:[ *'-][a-zA-Z]+)* - 0 or more sequences of:
[ *'-] - a space, *, ' or -
[a-zA-Z]+ - 1+ ASCII letters
$ - end of string anchor.
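The pattern can be tried against the examples from the question with a few lines of Python. This is a sketch under the assumption that Python's re module is an acceptable stand-in for the asker's engine; WORDS and is_valid are illustrative names.

```python
import re

WORDS = re.compile(r"^[a-zA-Z]+(?:[ *'-][a-zA-Z]+)*$")

def is_valid(text: str) -> bool:
    """True if text is runs of letters joined by single separators."""
    return WORDS.match(text) is not None

print(is_valid('This is accepted since it contains only spaces and letter'))  # True
print(is_valid('a'))                                                          # True
print(is_valid('This is --- not accepted'))           # False: consecutive non-letters
print(is_valid(' This starts with a space'))          # False
print(is_valid('Neither is this one since it ends with a dash -'))            # False
```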

password Regex not matching if spaces around in plain text

Building a regex to match passwords stored in plain text.
8-15 characters, must contain at least:
1 uppercase letter [A-Z]
1 lowercase letter [a-z]
1 number \d
1 special character [!##\$%\^&\*]
The problem I have is when the password is inline with other text or spaces after, it doesn't return a match. When it's on its own without spaces it matches.
Example:
This is a Testing!23 surrounded by other text.
Testing!23
(?=.{8,15})(?=.*[!##\$%\^&\*])(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).*
You want to find all non-whitespace chunks matching the conditions you outlined.
Use
(?<!\S)(?=\S{8,15}(?!\S))(?=[^!##$%^&*\s]*[!##$%^&*])(?=[^\s\d]*\d)(?=[^\sa-z]*[a-z])(?=[^\sA-Z]*[A-Z])\S+
See the regex demo
Details
(?<!\S) - a whitespace or start of string should be right before the current position
(?=\S{8,15}(?!\S)) - right after the current position, there must be 8 to 15 non-whitespace chars followed with either whitespace or end of string
(?=[^!##$%^&*\s]*[!##$%^&*]) - there must be a char from the [!##$%^&*] set after zero or more non-whitespace chars outside of the set
(?=[^\s\d]*\d) - there must be a digit after 0+ non-whitespace and non-digit chars
(?=[^\sa-z]*[a-z]) - 1 lowercase letter must appear after 0+ chars other than whitespace and lowercase letters
(?=[^\sA-Z]*[A-Z]) - 1 uppercase letter must appear after 0+ chars other than whitespace and uppercase letters
\S+ - all checks are over, and if they succeed, match and consume 1+ non-whitespace chars (finally).
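The lookarounds above can be exercised with Python's re.findall. This is a sketch that assumes Python's re module as the engine; the pattern is copied as it appears in the answer, only split across lines with comments, and the name PASSWORD is made up.

```python
import re

PASSWORD = re.compile(
    r'(?<!\S)'                        # whitespace or start of string before
    r'(?=\S{8,15}(?!\S))'             # 8 to 15 non-whitespace chars, then whitespace or end
    r'(?=[^!##$%^&*\s]*[!##$%^&*])'   # at least one special char
    r'(?=[^\s\d]*\d)'                 # at least one digit
    r'(?=[^\sa-z]*[a-z])'             # at least one lowercase letter
    r'(?=[^\sA-Z]*[A-Z])'             # at least one uppercase letter
    r'\S+'                            # consume the chunk
)

print(PASSWORD.findall('This is a Testing!23 surrounded by other text.'))
print(PASSWORD.findall('Testing!23'))
print(PASSWORD.findall('Testing!23456789012'))   # 19 chars, too long
```

The first two calls return ['Testing!23'] and the last returns an empty list, because the whole whitespace-delimited chunk must be 8 to 15 characters long.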

checking if one expression contains the next expression in regex

I want my regex to allow alphanumeric characters, "/_-" and white spaces in between but it must always have at least one alphanumeric character.
my validation goes like this,
/^([A-Za-z0-9/-]+[A-Za-z0-9/-\s]*[A-Za-z0-9/_-]+)$/
It should accept ABC_1-2-3 but it must not allow 123 or -_/ alone
Can somebody help me please.
The below given regex will capture strings with alpha-numeric characters with optional white space, hyphen and underscore in it. Try it.
([*A-Za-z]+(\s+)?([\d\-_]+)?)
Your regex is almost right, you need to add 2 positive lookaheads at the start to require at least 1 letter and at least 1 digit:
/^(?=.*[a-z])(?=.*\d)[a-z0-9\/_-][a-z0-9\/_\s-]*[a-z0-9\/_-]$/i
See the regex demo (in the demo, \s is replaced with a space since the demo is multiline).
Details:
^ - start of string
(?=.*[a-z]) - after any 0+ chars other than line break chars, there must be at least 1 letter (replace .* with [^a-z]* for better performance)
(?=.*\d) - after any 0+ chars other than line break chars, there must be at least 1 digit (replace .* with \D* for better performance)
[a-z0-9\/_-] - a letter, digit, /, _ or -
[a-z0-9\/_\s-]* - 0+ letters, digits, /, whitespaces, _ or -
[a-z0-9\/_-] - a letter, digit, /, _ or -
$ - end of string.
The i modifier makes the pattern case insensitive.
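Here is the same pattern as a Python sketch. The /.../i delimiters and the \/ escapes are dropped because Python does not need them, re.IGNORECASE stands in for the i modifier, and the name VALID is made up for illustration.

```python
import re

VALID = re.compile(
    r'^(?=.*[a-z])(?=.*\d)[a-z0-9/_-][a-z0-9/_\s-]*[a-z0-9/_-]$',
    re.IGNORECASE,
)

for sample in ['ABC_1-2-3', 'ABC 1', '123', '-_/', 'ABC']:
    print(repr(sample), bool(VALID.match(sample)))
```

'ABC_1-2-3' and 'ABC 1' match. '123' and '-_/' fail the letter lookahead, and 'ABC' fails the digit lookahead.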

Regex pattern for underscore or hyphen but not both

I have a regular expression that is allowing a string to be standalone, separated by hyphen and underscore.
I need help so the string only takes hyphen or underscore, but not both.
This is what I have so far.
^([a-z][a-z0-9]*)([-_]{1}[a-z0-9]+)*$
foo = passed
foo-bar = passed
foo_bar = passed
foo-bar-baz = passed
foo_bar_baz = passed
foo-bar_baz_qux = passed # but I don't want it to
foo_bar-baz-quz = passed # but I don't want it to
You may expand the pattern a bit and use a backreference to only match the same delimiter:
^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$
See the regex demo
Details:
^ - start of string
[a-z][a-z0-9]* - a letter followed with 0+ lowercase letters or digits
(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)? - an optional sequence of:
([-_]) - Capture group 1 matching either - or _
[a-z0-9]+ - 1+ lowercase letters or digits
(?:\1[a-z0-9]+)* - 0+ sequences of:
\1 - the same value as in Group 1
[a-z0-9]+ - 1 or more lowercase letters or digits
$ - end of string.
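The backreference approach can be checked against the samples from the question in Python. This is a sketch that assumes Python's re module handles \1 the same way as the asker's engine; SLUG and is_valid are illustrative names.

```python
import re

SLUG = re.compile(r'^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$')

def is_valid(text: str) -> bool:
    """True if all delimiters in text are the same character (- or _)."""
    return SLUG.match(text) is not None

samples = [
    'foo', 'foo-bar', 'foo_bar', 'foo-bar-baz', 'foo_bar_baz',   # expected to pass
    'foo-bar_baz_qux', 'foo_bar-baz-quz',                        # mixed delimiters
]
for s in samples:
    print(s, is_valid(s))
```

The first five samples pass and the two with mixed delimiters are rejected, because \1 only matches the delimiter captured first.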
Here's a nice clean solution:
^([a-zA-Z-]+|[a-zA-Z_]+)$
Break it down!
^ start at the beginning of the text
[a-zA-Z-]+ match anything a-z or A-Z or -
| OR operator
[a-zA-Z_]+ match anything a-z or A-Z or _
$ end at the end of the text
Here's an example on regexr!
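A quick Python check of this alternation, as a sketch that assumes Python's re module as the engine (EITHER is a made-up name). It does reject mixed delimiters, but it behaves differently from the pattern in the question in other respects: it has no digits in its classes, and it allows uppercase letters and leading or repeated delimiters.

```python
import re

EITHER = re.compile(r'^([a-zA-Z-]+|[a-zA-Z_]+)$')

print(bool(EITHER.match('foo-bar-baz')))   # True
print(bool(EITHER.match('foo-bar_baz')))   # False: mixes - and _
print(bool(EITHER.match('foo1')))          # False: digits are not in either class
print(bool(EITHER.match('-foo')))          # True: a leading - is allowed
```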