Regex to check Optional Group of numbers - regex

I am trying to create a regex that accepts the following strings:
proj_asdasd_000.gz.xml
proj_asdasd.gz.xml
Basically, the 2nd underscore is optional, and if any value follows it, it should only be an integer.
The following is the regex that I am trying:
^proj([a-zA-z0-9]?)+_[a-zA-z]+(_[0-9]?)+\.[a-z]+.[a-z]
Any suggestions to make it accept the above-mentioned strings?

You may use
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$
See the regex demo
Details
^ - start of string
proj - a literal substring
[a-zA-Z0-9]* - 0 or more alphanumeric chars
_ - a _ char
[a-zA-Z]+ - 1+ ASCII letters
(?:_[0-9]+)? - an optional sequence of an underscore followed with 1+ digits
\.[a-z]+\.[a-z]+ = (?:\.[a-z]+){2} - two occurrences of . and 1+ lowercase ASCII letters
$ - end of string.
Notes:
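[A-z] matches more than just ASCII letters: it also matches the six chars between Z and a in the ASCII table ([, \, ], ^, _ and `). Use [A-Za-z] instead.
([a-zA-z0-9]?)+ matches an optional character 1 or more times, which makes little sense. Either match a char 1 or more times with + or 0 or more times with *; there is no need for parentheses.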
(_[0-9]?)+ matches 1 or more sequences of _ followed by a single optional digit (so, it matches _9___1_, for example). The quantifiers must be swapped to match an optional sequence of _ and 1+ digits.
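The pattern and the [A-z] note can be checked with a short script. This is a minimal sketch using Python's re module. The helper name is_valid and the last two sample names are made up for illustration and are not part of the question.

```python
import re

# Second pattern from the answer above, copied unchanged.
FILE_NAME = re.compile(r"^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$")


def is_valid(name: str) -> bool:
    # fullmatch also rejects a trailing newline, which a bare $ would tolerate.
    return FILE_NAME.fullmatch(name) is not None


# Only the first two names come from the question; the rest are hypothetical.
samples = [
    "proj_asdasd_000.gz.xml",
    "proj_asdasd.gz.xml",
    "proj_asdasd_.gz.xml",      # underscore with no digits after it
    "proj_asdasd_12a.gz.xml",   # non-digit after the second underscore
]
results = {name: is_valid(name) for name in samples}
for name, ok in results.items():
    print(name, ok)

# [A-z] spans ASCII 65..122, so it also accepts the six chars between Z and a.
extra_chars = [c for c in "[\\]^_`" if re.fullmatch(r"[A-z]", c)]
print(extra_chars)
```

The first two names should be accepted and the last two rejected, and every char in extra_chars is one that [A-z] lets through by accident.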

Related

Greedy regex quantifier not matching password criteria

/(^[a-zA-Z]+-?[a-zA-Z0-9]+){5,15}$/g
regex criteria
match length must be between 6 and 16 characters inclusive
must start with a letter only
must contain letters, numbers and one optional hyphen
must not end with a hyphen
The above regular expression doesn't satisfy all 4 conditions. I tried moving the ^ before the group and omitting the + quantifiers, but it doesn't work.
You are setting the limiting quantifier on a group that already has quantified subpatterns, thus, the length restriction won't work.
To set the length restriction, add the (?=.{6,16}$) lookahead after ^ and then feel free to set your consuming pattern.
You may use
/^(?=.{6,16}$)[a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9]+)?$/
See the regex demo. Note that you should not use the g modifier when validating the whole input string against a regex.
Details
^ - start of string
(?=.{6,16}$) - a positive lookahead that requires the whole string to be 6 to 16 chars long
[a-zA-Z] - a letter as the first char
[a-zA-Z0-9]* - 0+ alphanumeric chars
(?:-[a-zA-Z0-9]+)? - an optional sequence of - and then 1+ alphanumeric chars
$ - end of string.
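As a rough check of the lookahead-based length restriction, here is a minimal Python sketch. The sample strings and the name is_valid_password are assumptions made for illustration. The JavaScript-style /.../ delimiters are dropped because Python takes the pattern as a plain string.

```python
import re

# Pattern from the answer above, without the JavaScript /.../ delimiters.
PASSWORD = re.compile(r"^(?=.{6,16}$)[a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9]+)?$")


def is_valid_password(text: str) -> bool:
    # fullmatch guards against a trailing newline slipping past $.
    return PASSWORD.fullmatch(text) is not None


# Hypothetical inputs, one per rule in the question.
checks = {
    "abc-def": is_valid_password("abc-def"),     # one hyphen, 7 chars
    "abcde": is_valid_password("abcde"),         # too short (5 chars)
    "a" * 16: is_valid_password("a" * 16),       # upper length bound
    "a" * 17: is_valid_password("a" * 17),       # too long
    "1abcdef": is_valid_password("1abcdef"),     # starts with a digit
    "abcdef-": is_valid_password("abcdef-"),     # ends with a hyphen
    "ab-cd-ef": is_valid_password("ab-cd-ef"),   # two hyphens
}
for text, ok in checks.items():
    print(text, ok)
```

The lookahead only counts characters, so the consuming part of the pattern stays free to describe the letter, digit and hyphen rules.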
All you need
^(?i)(?=.{6,16}$)(?!.*-.*-)[a-z][a-z\d-]*\d[a-z\d-]*(?<!-)$
Readable
^
(?i)
(?= .{6,16} $ ) # 6 - 16 chars
(?! .* - .* - ) # Not 2 dashes
[a-z] # Start letter
[a-z\d-]* # Optional letters, digits, dashes
\d # Must be digit
[a-z\d-]* # Optional letters, digits, dashes
(?<! - ) # Not end in dash
$
Well, at least my regex forces a number to be present.
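The readable form above maps fairly directly onto Python's verbose mode. This is a sketch under two assumptions: the inline (?i) is replaced with the re.IGNORECASE flag, because recent Python versions reject an inline global flag that is not at the very start of the pattern, and re.ASCII is added so that \d and the case-insensitive [a-z] stay limited to ASCII. The sample strings are made up.

```python
import re

PASSWORD_WITH_DIGIT = re.compile(
    r"""
    ^
    (?=.{6,16}$)    # 6 to 16 chars
    (?!.*-.*-)      # not 2 dashes
    [a-z]           # start letter
    [a-z\d-]*       # optional letters, digits, dashes
    \d              # must be digit
    [a-z\d-]*       # optional letters, digits, dashes
    (?<!-)          # not end in dash
    $
    """,
    re.IGNORECASE | re.VERBOSE | re.ASCII,
)


def has_valid_form(text: str) -> bool:
    return PASSWORD_WITH_DIGIT.fullmatch(text) is not None


# Hypothetical inputs.
outcome = {
    text: has_valid_form(text)
    for text in ["abc123", "ABC123", "abc-12", "abcdef", "ab-c-12", "abc12-", "1abcde"]
}
for text, ok in outcome.items():
    print(text, ok)
```

Unlike the first answer, this variant rejects letter-only input such as abcdef, since it requires at least one digit.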

How to write a regex in title case

I'm working with an SAP application called Information Steward and creating a rule where names have to be in title case (i.e. each word is capitalized).
I've formulated the following rule:
BEGIN
IF(match_regex($name, '(^(\b[A-Z]\w*\s*)+$)', null)) RETURN TRUE;
ELSE RETURN FALSE;
END
Although the rule runs successfully, it accepts inputs which should be identified as 'FALSE'. Please see the attached screenshot.
'TesT Name' and 'TEST NAME' should be FALSE but are instead passing under this regex.
Any help/guidance with the regex would be very useful.
The (^(\b[A-Z]\w*\s*)+$) regex matches a whole string that consists of:
^ - start of string
(\b[A-Z]\w*\s*)+ - 1 or more occurrences (due to (...)+) of
\b - a word boundary
[A-Z] - an uppercase ASCII letter
\w* - 0 or more letters/digits/underscores
\s* - 0+ whitespaces
$ - end of string.
As you see, it allows trailing whitespace, and \w matches what [A-Za-z0-9_] matches, i.e. it matches both lower- and uppercase letters.
You want to match only lowercase letters after the initial uppercase letter of each word, also allowing digits, - and _ chars. You may use
^[A-Z][a-z0-9_-]*(\s+[A-Z][a-z0-9_-]*)*$
See the regex demo.
Details
^ - start of string anchor
[A-Z][a-z0-9_-]* - an uppercase letter followed with 0+ lowercase letters, digits, _ or - chars
(\s+[A-Z][a-z0-9_-]*)* - zero or more occurrences of:
\s+ - 1 or more whitespaces
[A-Z][a-z0-9_-]* - an uppercase letter followed with 0+ lowercase letters, digits, _ or - chars
$ - end of string.
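The pattern can be tried outside Information Steward with a few lines of Python. This is only a sketch: the match_regex function of the SAP rule language is not reproduced here, and is_title_case is a hypothetical helper name.

```python
import re

# Pattern from the answer above, copied unchanged.
TITLE_CASE = re.compile(r"^[A-Z][a-z0-9_-]*(\s+[A-Z][a-z0-9_-]*)*$")


def is_title_case(name: str) -> bool:
    return TITLE_CASE.fullmatch(name) is not None


# The second and third inputs are the failing cases named in the question.
verdicts = {
    name: is_title_case(name)
    for name in ["Test Name", "TesT Name", "TEST NAME", "Test Name ", "A"]
}
for name, ok in verdicts.items():
    print(repr(name), ok)
```

Only 'Test Name' and 'A' should pass. The trailing-space variant fails because whitespace is allowed only between words.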
I would write your regex as:
^[A-Z]\w*(?:\s+[A-Z]\w*)*$
This matches a single word starting with a capital letter, followed by zero or more repetitions of one or more whitespace chars and another word starting with a capital letter.
I phrase a matching word as starting with [A-Z] followed by \w*, meaning zero or more word characters. This allows for things like A to match.
Demo
Edit:
Based on the comments above, if you want some other character class to represent what follows the initial uppercase letter, then do that instead:
^[A-Z][something]*(?:\s+[A-Z][something]*)*$
where [something] is your character class.
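To illustrate the template, here is a small Python sketch that builds the regex from a caller-supplied character class. The function name build_title_case_regex and the choice of [a-z] as a replacement class are assumptions for the example, not part of the answer.

```python
import re


def build_title_case_regex(tail_class: str) -> re.Pattern:
    # tail_class stands in for [something]: what may follow the capital letter.
    word = rf"[A-Z]{tail_class}*"
    return re.compile(rf"^{word}(?:\s+{word})*$")


with_word_chars = build_title_case_regex(r"\w")      # the regex as first written
with_lowercase = build_title_case_regex(r"[a-z]")    # a stricter, assumed class

name = "TesT Name"
loose_result = with_word_chars.fullmatch(name) is not None
strict_result = with_lowercase.fullmatch(name) is not None
print(loose_result, strict_result)
```

With \w the mixed-case 'TesT Name' still passes, because \w also matches uppercase letters. The narrower class rejects it.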

What is the Regular Expression to allow only one hyphen(-) only between 2 letters or a letter and a number but not 2 numbers

I was using the below pattern.
/^[A-Za-z0-9]+(-[A-Za-z0-9]+)*$/.
What I need is that it should not allow a hyphen between 2 numbers.
I know that the 0-9 part has to be modified so that the user cannot enter a digit on both sides of the hyphen.
The (?!.*[0-9]-[0-9]) lookahead after ^ will make sure there is no digit-hyphen-digit pattern in the string. Also, if there must be 1 or 0 hyphens, replace * at the end with ? (1 or 0 occurrences).
Use
^(?!.*[0-9]-[0-9])[A-Za-z0-9]+(-[A-Za-z0-9]+)?$
See the regex demo.
Details
^ - start of string
(?!.*[0-9]-[0-9]) - a negative lookahead that fails the match if, after any 0+ chars other than line break chars, there is a digit, hyphen, digit pattern
[A-Za-z0-9]+ - 1 or more ASCII alphanumeric chars
(-[A-Za-z0-9]+)? - 1 or 0 sequences of:
- - a hyphen
[A-Za-z0-9]+ - 1 or more ASCII alphanumeric chars
$ - end of string.
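A quick way to see the negative lookahead at work is to run the pattern over a few inputs in Python. This is a minimal sketch. The sample strings and the helper name is_allowed are made up for illustration.

```python
import re

# Pattern from the answer above, copied unchanged.
ONE_HYPHEN = re.compile(r"^(?!.*[0-9]-[0-9])[A-Za-z0-9]+(-[A-Za-z0-9]+)?$")


def is_allowed(text: str) -> bool:
    return ONE_HYPHEN.fullmatch(text) is not None


# Hypothetical inputs.
report = {
    text: is_allowed(text)
    for text in ["abc-def", "a1-b2", "1a-b2", "abc", "abc1-2def", "a-b-c", "-abc", "abc-"]
}
for text, ok in report.items():
    print(text, ok)
```

A hyphen between a letter and a digit (a1-b2) is accepted. A hyphen between two digits (abc1-2def) and a second hyphen (a-b-c) are both rejected.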

checking if one expression contains the next expression in regex

I want my regex to allow alphanumeric characters, "/_-" and white spaces in between but it must always have at least one alphanumeric character.
My validation goes like this:
/^([A-Za-z0-9/-]+[A-Za-z0-9/-\s]*[A-Za-z0-9/_-]+)$/
It should accept **ABC_1-2-3 but it must not allow 123 or -_/ alone
Can somebody help me, please?
The below given regex will capture strings with alpha-numeric characters with optional white space, hyphen and underscore in it. Try it.
([*A-Za-z]+(\s+)?([\d\-_]+)?)
Your regex is almost right, you need to add 2 positive lookaheads at the start to require at least 1 letter and at least 1 digit:
/^(?=.*[a-z])(?=.*\d)[a-z0-9\/_-][a-z0-9\/_\s-]*[a-z0-9\/_-]$/i
See the regex demo (in the demo, \s is replaced with a space since the demo is multiline).
Details:
^ - start of string
(?=.*[a-z]) - after any 0+ chars other than line break chars, there must be at least 1 letter (replace .* with [^a-z]* for better performance)
(?=.*\d) - after any 0+ chars other than line break chars, there must be at least 1 digit (replace .* with \D* for better performance)
[a-z0-9\/_-] - a letter, digit, /, _ or -
[a-z0-9\/_\s-]* - 0+ letters, digits, /, whitespaces, _ or -
[a-z0-9\/_-] - a letter, digit, /, _ or -
$ - end of string.
The i modifier makes the pattern case insensitive.
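The two lookaheads can be exercised with a short Python sketch. It assumes the i modifier maps to re.IGNORECASE and drops the backslash before /, which Python does not need. The sample strings other than ABC_1-2-3, 123 and -_/ are invented.

```python
import re

# Same pattern as above, written as a Python string with the i flag passed separately.
CODE = re.compile(
    r"^(?=.*[a-z])(?=.*\d)[a-z0-9/_-][a-z0-9/_\s-]*[a-z0-9/_-]$",
    re.IGNORECASE,
)


def is_valid_code(text: str) -> bool:
    return CODE.fullmatch(text) is not None


tested = {
    text: is_valid_code(text)
    for text in ["ABC_1-2-3", "ABC 123", "A1", "123", "-_/", "ABC", " ABC1"]
}
for text, ok in tested.items():
    print(repr(text), ok)
```

Note that this pattern asks for at least one letter and at least one digit, so a letters-only value such as ABC is rejected along with 123 and -_/.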

Regex pattern for underscore or hyphen but not both

I have a regular expression that is allowing a string to be standalone, separated by hyphen and underscore.
I need help so the string only takes hyphen or underscore, but not both.
This is what I have so far.
^([a-z][a-z0-9]*)([-_]{1}[a-z0-9]+)*$
foo = passed
foo-bar = passed
foo_bar = passed
foo-bar-baz = passed
foo_bar_baz = passed
foo-bar_baz_qux = passed # but I don't want it to
foo_bar-baz-quz = passed # but I don't want it to
You may expand the pattern a bit and use a backreference to only match the same delimiter:
^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$
See the regex demo
Details:
^ - start of string
[a-z][a-z0-9]* - a letter followed with 0+ lowercase letters or digits
(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)? - an optional sequence of:
([-_]) - Capture group 1 matching either - or _
[a-z0-9]+ - 1+ lowercase letters or digits
(?:\1[a-z0-9]+)* - 0+ sequences of:
\1 - the same value as in Group 1
[a-z0-9]+ - 1 or more lowercase letters or digits
$ - end of string.
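The backreference approach can be checked against the strings listed in the question. The following is a minimal Python sketch. The helper name uses_single_delimiter is made up.

```python
import re

# Pattern from the answer above: \1 must repeat whatever ([-_]) captured first.
SAME_DELIMITER = re.compile(r"^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$")


def uses_single_delimiter(text: str) -> bool:
    return SAME_DELIMITER.fullmatch(text) is not None


# Test strings taken from the question.
question_cases = [
    "foo",
    "foo-bar",
    "foo_bar",
    "foo-bar-baz",
    "foo_bar_baz",
    "foo-bar_baz_qux",
    "foo_bar-baz-quz",
]
passed = {text: uses_single_delimiter(text) for text in question_cases}
for text, ok in passed.items():
    print(text, "passed" if ok else "failed")
```

The first five strings should pass and the two mixed-delimiter strings should fail, since the first delimiter seen fixes the value that \1 accepts afterwards.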
Here's a nice clean solution:
^([a-zA-Z-]+|[a-zA-Z_]+)$
Break it down!
^ start at the beginning of the text
[a-zA-Z-]+ match one or more chars that are a-z, A-Z or -
| OR operator
[a-zA-Z_]+ match one or more chars that are a-z, A-Z or _
$ end at the end of the text
Here's an example on regexr!
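For comparison with the pattern in the question, here is a small Python sketch of how this alternation behaves. The sample strings are assumptions chosen to probe the differences.

```python
import re

# Pattern from this answer, copied unchanged.
EITHER_CLASS = re.compile(r"^([a-zA-Z-]+|[a-zA-Z_]+)$")


def accepted(text: str) -> bool:
    return EITHER_CLASS.fullmatch(text) is not None


# Hypothetical inputs.
behaviour = {
    text: accepted(text)
    for text in ["foo-bar", "foo_bar", "foo-bar_baz", "foo1", "-foo-", "Foo"]
}
for text, ok in behaviour.items():
    print(text, ok)
```

It does reject mixed delimiters, but it appears looser or stricter than the original in other respects: digits are no longer allowed (foo1 fails), while uppercase letters and leading or trailing hyphens are accepted (Foo and -foo- pass).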