I need help with a regular expression. My strings always take the following form: a capital letter, followed by 3 digits, then a hyphen, followed by more digits:
A012-123
B001-012
C023-456
I've tried [A-Z0-9]-[0-9] and I can't get a match. Can someone show me how to construct this correctly?
[A-Z][0-9]{3}-[0-9]{3}
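The {3} means "match exactly three times". This matches any string that starts with a capital letter, followed by 3 digits, a hyphen and 3 digits. Your attempt [A-Z0-9]-[0-9] describes only three characters (one letter or digit, a hyphen, one digit), so it cannot match the whole string.
But if the number of digits after the hyphen can vary, you can use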
[A-Z][0-9]{3}-[0-9]+
This matches any string that starts with a capital letter, followed by 3 digits, a hyphen and one or more digits.
Note: Instead of writing [0-9], you can use \d. For ASCII input the two are equivalent (some Unicode-aware engines also let \d match non-ASCII digits). So your first regex becomes
[A-Z]\d{3}-\d{3}
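The two patterns above can be checked against the sample strings with a minimal Python sketch. The helper name `is_code` and the two extra test strings are illustrative assumptions, not part of the question; `re.fullmatch` is used so that the whole string has to match.

```python
import re

# Exactly three digits after the hyphen.
FIXED = re.compile(r"[A-Z][0-9]{3}-[0-9]{3}")
# One or more digits after the hyphen.
OPEN_ENDED = re.compile(r"[A-Z][0-9]{3}-[0-9]+")


def is_code(text, pattern=FIXED):
    """Return True if the whole string matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


# The last two samples are made up to show a longer suffix and a lowercase prefix.
samples = ["A012-123", "B001-012", "C023-456", "A012-12345", "a012-123"]
fixed_results = [is_code(s) for s in samples]
open_results = [is_code(s, OPEN_ENDED) for s in samples]
print(fixed_results)
print(open_results)
```

Only the open-ended pattern accepts "A012-12345", and neither accepts the lowercase "a012-123".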
Related
I have a few strings and I need some help with constructing Regex to match them.
The example strings are:
AAPL10.XX1.XX2
AAA34CL
AAXL23.XLF2
AAPL
I have tried a few expressions but couldn't get the exact results I need. They are the following:
[0-9A-Z]+\.?[0-9A-Z]$
[A-Z0-9]*\.?[^.]$
Following are some of the points which should be maintained:
The pattern should only contain capital letters and digits and no small letters are allowed.
The '.' in the middle of the text is optional. And the maximum number of times it can appear is only 2.
It should not have any special characters at the end.
Please ask me for any clarification.
You can write the pattern as:
^[A-Z\d]+(?:\.[A-Z\d]+){0,2}$
The pattern matches:
^ Start of string
[A-Z\d]+ Match 1+ chars A-Z or a digit
(?:\.[A-Z\d]+){0,2} Repeat 0 - 2 times a . and 1+ chars A-Z or a digit
$ End of string
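As a quick check of the breakdown above, here is a small Python sketch. The names `SYMBOL` and `is_symbol` and the rejected strings are assumptions made for illustration; `re.ASCII` is added so that `\d` is limited to 0-9.

```python
import re

# One run of uppercase letters/digits, then up to two more runs, each led by a dot.
# re.ASCII keeps \d limited to 0-9 (an assumption about the intended input).
SYMBOL = re.compile(r"^[A-Z\d]+(?:\.[A-Z\d]+){0,2}$", re.ASCII)


def is_symbol(text):
    """Return True if the whole string matches (hypothetical helper)."""
    return SYMBOL.fullmatch(text) is not None


accepted = ["AAPL10.XX1.XX2", "AAA34CL", "AAXL23.XLF2", "AAPL"]
# Made-up negatives: lowercase, trailing dot, leading dot, three dots,
# consecutive dots, and a special character.
rejected = ["aapl", "AAPL.", ".AAPL", "A.B.C.D", "AA..PL", "AAPL-1"]

for text in accepted + rejected:
    print(text, is_symbol(text))
```

Because each dot must be followed by at least one allowed character, the pattern also rules out a trailing dot and two dots in a row.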
I am trying to match a string (length =4) with lower case letters and digits. That could be 4 digits but not 4 letters. For example I want to match:
d4rt
df5h
34d6
4567
But not 'erty'.
I came up with the pattern ([a-z]+|[0-9]+){4}, but it still matches the four-letter case.
Your regex ([a-z]+|[0-9]+){4} uses an alternation which will either match 1+ lowercase characters or 1+ digits in a capturing group and repeat that 4 times. That would also match 4 letters.
If lookarounds are supported, you could use a negative lookahead to assert that what follows are not 4 lowercase characters.
To match a string with length of 4, you could use anchors to assert the start ^ and the end $ of the string.
^(?![a-z]{4})[a-z0-9]{4}$
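To make the difference concrete, the following Python sketch runs the original pattern and the negative-lookahead version side by side. The helper `whole` and the extra inputs "abcdefgh" and "d4rt5" are assumptions added for illustration.

```python
import re

ORIGINAL = re.compile(r"([a-z]+|[0-9]+){4}")
FIXED = re.compile(r"^(?![a-z]{4})[a-z0-9]{4}$")


def whole(pattern, text):
    """Return True if the pattern matches the entire string (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


# The original accepts the all-letter string and even an eight-letter string,
# because each of the four repetitions may consume one or more characters.
original_hits = {s: whole(ORIGINAL, s) for s in ["erty", "d4rt", "abcdefgh"]}

# The fixed pattern accepts only four-character strings that are not all letters.
fixed_hits = {
    s: whole(FIXED, s) for s in ["d4rt", "df5h", "34d6", "4567", "erty", "d4rt5"]
}

print(original_hits)
print(fixed_hits)
```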
Your expression repeats the group four times ({4}), and each repetition matches either one or more lowercase letters ([a-z]+) or one or more digits ([0-9]+). It therefore also matches strings longer than four characters, as well as strings made only of letters.
Your problem can be solved with lookaheads.
(?=[a-z]{0,3}[0-9])[a-z0-9]{4}
(?=[a-z]{0,3}[0-9]) looks ahead for up to three letters followed by a digit. A lookahead consumes no characters, so once it succeeds, matching continues from the position where the lookahead started. Think of it as a sort of "pre-match".
[a-z0-9]{4} This part matches four digits or lowercase letters; the lookahead has already guaranteed that at least one of them is a digit. Add the anchors ^ and $ if the whole string must be exactly four characters long.
Since your requirement is that the string contains at least one digit and consists of exactly 4 characters, each a digit or a lowercase letter, you can use this regex:
^(?=.*\d)[a-z0-9]{4}$
Explanation:
^ --> Start of input
(?=.*\d) --> Look ahead to ensure the input contains at least one digit
[a-z0-9]{4} --> Match exactly four characters, each a lowercase letter or a digit
$ --> End of input
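The three answers to this question use different lookaheads. A minimal Python sketch, under the assumption that the requirement is "exactly four characters from a-z and 0-9, with at least one digit", can compare them exhaustively over a small alphabet. The function names, the labels in `PATTERNS`, and the test alphabet "ab01Z" are illustrative choices; `re.fullmatch` stands in for the anchors that the second pattern lacks.

```python
import re
from itertools import product

PATTERNS = {
    "negative lookahead": r"^(?![a-z]{4})[a-z0-9]{4}$",
    "bounded lookahead": r"(?=[a-z]{0,3}[0-9])[a-z0-9]{4}",
    "any-digit lookahead": r"^(?=.*\d)[a-z0-9]{4}$",
}
LOWER = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"


def expected(text):
    """Plain-Python statement of the requirement (an assumption about its intent)."""
    return (
        len(text) == 4
        and all(c in LOWER + DIGITS for c in text)
        and any(c in DIGITS for c in text)
    )


def disagreements(alphabet="ab01Z", lengths=(3, 4, 5)):
    """List (pattern name, text) pairs where a pattern differs from expected()."""
    found = []
    for n in lengths:
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            for name, pattern in PATTERNS.items():
                matched = re.fullmatch(pattern, text) is not None
                if matched != expected(text):
                    found.append((name, text))
    return found


mismatches = disagreements()
print(mismatches)
```

An empty list means the three patterns agree with the stated requirement on every string that was tried; it is a spot check over a tiny alphabet, not a proof.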
I need a regular expression for a string which has at least 8 symbols and exactly one uppercase character, for use in Java.
For example, it should match:
Asddffgf
asdAsadasd
asdasdaA
But not:
adadAasdasAsad
AsdaAadssadad
asdasdAsadasdA
I tried this: ^[a-z]*[A-Z][a-z]*$ This works well, but I need at least 8 symbols.
Then I tried this: (^[a-z]*[A-Z][a-z]*$){8,} But it doesn't work.
^(?=[^A-Z]*[A-Z][^A-Z]*$).{8,}$
https://regex101.com/r/zTrbyX/6
Explanation:
^ - Anchor to the beginning of the string, so that the following lookahead restriction doesn't skip anything.
(?= ) - Positive lookahead; assert that the beginning of the string is followed by the contained pattern.
[^A-Z]*[A-Z][^A-Z]*$ - A sequence of any number of characters that are not capital letters, then a single capital letter, then more non-capital characters until the end of the string. This ensures that there is one and only one capital letter in the whole string.
.{8,} - Any non-newline character eight or more times.
$ - Anchor at the end of the string (possibly unnecessary depending on your requirements).
In your first regex ^[a-z]*[A-Z][a-z]*$ you could append a positive lookahead (?=[a-zA-Z]{8,}) right after the ^.
That will assert that what follows matches at least 8 times a lower or uppercase character.
^(?=[a-zA-Z]{8,})[a-z]*[A-Z][a-z]*$
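The two answers are not interchangeable: the first allows any character besides the single capital, while the second allows letters only. The sketch below shows this; it is written in Python rather than Java purely for illustration (the syntax used here, lookahead, negated classes and counted quantifiers, is also supported by java.util.regex). The names `ANY_CHARS`, `LETTERS_ONLY` and `check`, and the inputs "Asddffg" and "abc1234A", are assumptions.

```python
import re

# Any characters allowed, exactly one A-Z, length of at least 8.
ANY_CHARS = re.compile(r"^(?=[^A-Z]*[A-Z][^A-Z]*$).{8,}$")
# ASCII letters only, exactly one A-Z, length of at least 8.
LETTERS_ONLY = re.compile(r"^(?=[a-zA-Z]{8,})[a-z]*[A-Z][a-z]*$")


def check(text):
    """Return (first pattern matches, second pattern matches) for the whole string."""
    return (
        ANY_CHARS.fullmatch(text) is not None,
        LETTERS_ONLY.fullmatch(text) is not None,
    )


for text in ["Asddffgf", "asdAsadasd", "asdasdaA",      # should match
             "adadAasdasAsad", "AsdaAadssadad",         # more than one capital
             "Asddffg",                                  # only 7 characters
             "abc1234A"]:                                # digits plus one capital
    print(text, check(text))
```

Which of the two is right depends on whether "symbols" in the question is meant to include digits and punctuation.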
I am trying to write a regex that handles these cases:
Correct:
IUG4455
I4UG455
A4U345A
Wrong:
IUGG453
IIUG44555
There need to be exactly 4 letters (in any order) and exactly 3 digits (in any order).
I tried this expression:
[A-Z]{3}\\d{4}
but it only accepts strings that start with the letters and then continue with the digits.
You have a couple of options for this:
Option 1:
\b(?=(?:\d*[A-Z]){3})(?=(?:[A-Z]*\d){4})[A-Z\d]{7}\b
\b Assert position as a word boundary
(?=(?:\d*[A-Z]){3}) Positive lookahead ensuring the following matches
  (?:\d*[A-Z]){3} Match the following exactly 3 times
    \d* Match any digit any number of times
    [A-Z] Match any uppercase ASCII character
(?=(?:[A-Z]*\d){4}) Positive lookahead ensuring the following matches
  (?:[A-Z]*\d){4} Match the following exactly 4 times
    [A-Z]* Match any uppercase ASCII character any number of times
    \d Match any digit
[A-Z\d]{7} Match any digit or uppercase ASCII character exactly 7 times
\b Assert position as a word boundary
If speed needs to be taken into consideration, you can expand the above option and use the following:
\b(?=\d*[A-Z]\d*[A-Z]\d*[A-Z])(?=[A-Z]*\d[A-Z]*\d[A-Z]*\d[A-Z]*\d)[A-Z\d]{7}\b
Option 2:
\b(?=(?:\d*[A-Z]){3}(?!\d*[A-Z]))(?=(?:[A-Z]*\d){4}(?![A-Z]*\d))[A-Z\d]+\b
Similar to Option 1, but uses negative lookahead to ensure an extra character (uppercase ASCII letter or digit) doesn't exist in the string.
Having two positive lookaheads back-to-back simulates a logical AND: both subpatterns must be satisfied starting at that particular position. Since you have two conditions (3 uppercase ASCII letters and 4 digits), you should use two lookaheads.
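Both options can be tried against the examples from the question with a short Python sketch. The helper `whole` and the sentence passed to `findall` are assumptions added to show whole-string checks and in-text search respectively.

```python
import re

OPTION_1 = re.compile(r"\b(?=(?:\d*[A-Z]){3})(?=(?:[A-Z]*\d){4})[A-Z\d]{7}\b")
OPTION_2 = re.compile(
    r"\b(?=(?:\d*[A-Z]){3}(?!\d*[A-Z]))(?=(?:[A-Z]*\d){4}(?![A-Z]*\d))[A-Z\d]+\b"
)

correct = ["IUG4455", "I4UG455", "A4U345A"]
wrong = ["IUGG453", "IIUG44555"]


def whole(pattern, text):
    """Return True if the pattern matches the entire string (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


for text in correct + wrong:
    print(text, whole(OPTION_1, text), whole(OPTION_2, text))

# The word boundaries also allow searching inside longer text.
found = OPTION_1.findall("ID IUG4455 ok")
print(found)
```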
As an alternative,
(?:(?<d>\d)|(?<c>[A-Z])){7}(?<-d>){3}(?<-c>){4}
doesn't require any lookarounds. It relies on balancing groups, which are specific to the .NET regex engine: it matches seven letters or digits and then checks that it found 3 digits and 4 letters.
Adjust the 3 and 4 to taste... your examples have 4 digits and 3 letters.
Also add word boundaries or anchors depending on whether you are trying to match whole words or a whole string.
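In an engine without lookarounds or balancing groups, one alternative (not taken from the answers above) is to let the regex check only the character set and length, and to count the digits in ordinary code. A minimal Python sketch, with the hypothetical helper `has_counts` and default counts chosen to fit the question's examples:

```python
import re


def has_counts(text, letters=3, digits=4):
    """Hypothetical helper: exact counts of A-Z letters and 0-9 digits, in any order."""
    total = letters + digits
    # The regex only verifies the allowed characters and the overall length.
    if re.fullmatch(r"[A-Z0-9]{%d}" % total, text) is None:
        return False
    # With the length fixed, the right number of digits implies the right number of letters.
    digit_count = sum(ch in "0123456789" for ch in text)
    return digit_count == digits


for text in ["IUG4455", "I4UG455", "A4U345A", "IUGG453", "IIUG44555"]:
    print(text, has_counts(text))
```

Swapping the defaults (`letters=4, digits=3`) gives the counts as stated in the question's text instead of its examples.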
My question is simple: I want to match a string with the following attributes:
No white space
Must start with a letter
Must not contain any other special characters other than underscore
May contain numbers
Please help in creating such a regex.
^[a-zA-Z][a-zA-Z0-9_]*$
Dissecting it:
^ start of line/string
[a-zA-Z] starts with a letter
[a-zA-Z0-9_]* followed by zero or more letters, underscores or digits.
$ end of line/string
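A short Python sketch of the pattern dissected above. The names `IDENT` and `is_identifier` and the sample strings are assumptions. One Python-specific detail matters for the "no white space" rule: `$` also matches just before a trailing newline, so the sketch uses `re.fullmatch` for the actual check.

```python
import re

IDENT = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]*$")


def is_identifier(text):
    """Return True if the whole string matches (hypothetical helper)."""
    # fullmatch rejects a trailing newline, which "$" alone tolerates in Python.
    return IDENT.fullmatch(text) is not None


valid = ["foo", "foo_1", "x", "A_b_9"]
invalid = ["", "_foo", "1abc", "has space", "a-b", "foo\n"]

for text in valid + invalid:
    print(repr(text), is_identifier(text))

# With match() instead of fullmatch(), the trailing newline slips through.
trailing_newline_with_match = IDENT.match("foo\n") is not None
print(trailing_newline_with_match)
```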
If you need to consider Unicode, then the following is probably more sane:
^\p{L}[\p{L}\p{Nd}_]*$
This will match not only ASCII letters and digits but across all scripts that are supported by Unicode. Digits are restricted to decimal digits, only, so you won't get Roman numerals.
/^[a-zA-Z]\w*$/
[a-zA-Z] - starts with a letter
\w* - followed by zero or more letters, digits and underscores
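Whether `\w` means exactly `[a-zA-Z0-9_]` depends on the engine. As an illustration of that caveat, in Python 3 `\w` on a text pattern is Unicode-aware by default, so the short form accepts more than the explicit character class unless `re.ASCII` is passed. The variable names and the sample word are assumptions.

```python
import re

SHORT = r"[a-zA-Z]\w*"
EXPLICIT = r"[a-zA-Z][a-zA-Z0-9_]*"
word = "na\u00efve"  # "naive" spelled with a non-ASCII i-diaeresis

# Default str patterns: \w also matches non-ASCII letters.
unicode_default = re.fullmatch(SHORT, word) is not None
# re.ASCII restricts \w to [a-zA-Z0-9_].
ascii_only = re.fullmatch(SHORT, word, flags=re.ASCII) is not None
# The explicit class never matches the non-ASCII letter.
explicit = re.fullmatch(EXPLICIT, word) is not None

print(unicode_default, ascii_only, explicit)
```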