Groovy regular expression difficulty

I have strings like 1R12 or 2EURO16.
The first character is 1 or 2 (numeric).
The middle is a letter or a word (R, L, X, Y, B or EURO).
The end is 10, 12, 14 or 16 (numeric).
What I tried is this:
(^1|2)(R|L|X|Y|B|EURO)(10|12|14|16$)
But this gives a negative result.
What would be a correct or possible regex?

The (^1|2) matches 1 at the start of the string and 2 anywhere in a string. Similarly, (10|12|14|16$) matches 10, 12 and 14 anywhere inside a string and 16 at the end of the string.
You need to rearrange the anchors:
/^[12](?:[RLXYB]|EURO)(?:10|12|14|16)$/
Details
^ - start of string
[12] - 1 or 2
(?:[RLXYB]|EURO) - R, L, X, Y, B or EURO
(?:10|12|14|16) - 10, 12, 14 or 16
$ - end of string
NOTE: If you use the ==~ operator in Groovy, you do not need anchors at all because ==~ requires a full string match:
println("1EURO16" ==~ /[12](?:[RLXYB]|EURO)(?:10|12|14|16)/) // => true
println("1EURO19" ==~ /[12](?:[RLXYB]|EURO)(?:10|12|14|16)/) // => false
See the Groovy demo.
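Since Groovy regexes are java.util.regex patterns, the effect of the misplaced anchors can also be checked in plain Java. The following is only a minimal sketch: the class name AnchorDemo, its helper methods, and the sample strings xx2R10 and 1R12345 are made up for illustration and do not come from the question.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Pattern from the question: ^ binds only to "1", $ binds only to "16".
    static final Pattern LOOSE =
        Pattern.compile("(^1|2)(R|L|X|Y|B|EURO)(10|12|14|16$)");
    // Pattern from the answer: the anchors apply to the whole expression.
    static final Pattern STRICT =
        Pattern.compile("^[12](?:[RLXYB]|EURO)(?:10|12|14|16)$");
    // Same expression without anchors, for use with a full-string match.
    static final Pattern UNANCHORED =
        Pattern.compile("[12](?:[RLXYB]|EURO)(?:10|12|14|16)");

    // Partial match: succeeds if the pattern is found anywhere in s.
    static boolean looseFind(String s) { return LOOSE.matcher(s).find(); }
    static boolean strictFind(String s) { return STRICT.matcher(s).find(); }

    // Full-string match: the whole of s must match, so no anchors are needed.
    static boolean fullMatch(String s) { return UNANCHORED.matcher(s).matches(); }

    public static void main(String[] args) {
        String[] samples = {"1R12", "2EURO16", "1EURO19", "xx2R10", "1R12345"};
        for (String s : samples) {
            System.out.println(s + " loose=" + looseFind(s)
                + " strict=" + strictFind(s) + " full=" + fullMatch(s));
        }
    }
}
```

With find(), the question's pattern accepts xx2R10 and 1R12345 because only one alternative on each side is anchored, while the rearranged pattern rejects both.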

Related

Why does the regex [a-zA-Z]{5} return true for non-matching string?

I defined a regular expression to check if the string only contains alphabetic characters and with length 5:
use regex::Regex;
fn main() {
    let re = Regex::new("[a-zA-Z]{5}").unwrap();
    println!("{}", re.is_match("this-shouldn't-return-true#"));
}
The text I use contains many illegal characters and is longer than 5 characters, so why does this return true?
You have to put it inside ^...$ to match the whole string and not just parts:
use regex::Regex;
fn main() {
    // Anchored: the whole input must consist of exactly 5 letters.
    let re = Regex::new("^[a-zA-Z]{5}$").unwrap();
    println!("{}", re.is_match("this-shouldn't-return-true#"));
}
Playground.
As explained in the docs:
Notice the use of the ^ and $ anchors. In this crate, every expression is executed with an implicit .*? at the beginning and end, which allows it to match anywhere in the text. Anchors can be used to ensure that the full text matches an expression.
Your pattern returns true because it matches any run of 5 consecutive alphabetic characters anywhere in the input; in your case it finds such runs in both 'shouldn' (shoul) and 'return' (retur).
Change your regex to: ^[a-zA-Z]{5}$
^ start of string
[a-zA-Z]{5} matches 5 alpha chars
$ end of string
This will match a string only if the string has a length of 5 chars and all of the chars from start to end fall in range a-z and A-Z.

Regex for set of 6 digits from 1-49

I have a problem defining a regular expression correctly. I want to check sets of numbers, e.g. 1,2,14,15,16,17 or 12,13,14,15,16,17 or 1,2,3,6,7,8. Every set contains 6 numbers from 1 to 49. I check it via the input's pattern field.
I wrote some regex but it works only for 2-digit sets.
([1-9]|[1-4][0-9],){5}([1-9]|[1-4][0-9])
Thanks for all answers :)
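You forgot to group the number alternatives inside the quantified group before the comma, and you need anchors to make the regex engine match the full input string:
^(?:(?:[1-9]|[1-4][0-9]),){5}(?:[1-9]|[1-4][0-9])$
(The additions are the ^ anchor, the inner (?:...) group around the number alternatives, and the $ anchor.)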
See the regex demo.
Details
^ - start of string
(?:(?:[1-9]|[1-4][0-9]),){5} - five occurrences of:
  (?:[1-9]|[1-4][0-9]) - either a digit from 1 to 9 or a number from 10 to 49
  , - a comma
(?:[1-9]|[1-4][0-9]) - a final number from 1 to 49
$ - end of string.
JS demo:
var strs = ['1,2,14,15,16,17', '12,13,14,15,16,17', '1,2,3,6,7,8', '1,2,3,6,7,8,'];
var rng = '(?:[1-9]|[1-4][0-9])';
var rx = new RegExp("^(?:" + rng + ",){5}" + rng + "$");
for (var s of strs) {
  console.log(s, '=>', rx.test(s));
}

Regex logic for a 3 digits code with specific 1 digit Alphabetic and Numbers

I need help writing a regex for the following conditions. The code keyed in by the user should have:
3 bytes max
The 1st byte can be a letter (specifically A, B or P); otherwise all 3 bytes are digits
The 2nd and 3rd bytes must be numeric
No special characters are allowed.
Examples,
A23 - match
B45 - match
P71 - match
A3 - match
418 - match
91 - match
C23 - not match
AC2 - not match
D3 - not match
I tried the expression, but no luck. The logic is
alphaNumericRegExp =/[A,B,P][0-9]{3}/
Matcher matcher = mask.matcher(service.getRacprCd1());
Matcher matcher1=digitPattern.matcher(service.getRacprCd1());
if (!matcher.matches()) {
    vectErrMsgs.add("Pr code is not valid. " );
}
You may use
alphaNumericRegExp =/[ABP0-9]?[0-9]{1,2}/
With matcher.matches(), a full string match is required, so there is no need to add the ^ and $ anchors. It matches:
[ABP0-9]? - an optional A, B, P, or digit
[0-9]{1,2} - 1 or 2 digits
Note that a , or | inside a character class is matched literally, so [A,B,P] also accepts a comma.
Split it into logical pieces. The first char can be A, B, P, a number, or (if I understand correctly) nothing. Therefore:
[ABP\d]?
Then there needs to be 1 or 2 digits.
\d{1,2}
So all together,
^[ABP\d]?\d{1,2}$
One gotcha: this allows a single digit. I can't tell from your question if that is allowed. If the code has to be at least 2 characters long, remove the ?.
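The two variants can be tried against the examples from the question with a short Java sketch. The class name PrCodeCheck, its helper methods, and the extra single-digit sample 7 are made up for illustration; only the two patterns come from the answers above.

```java
import java.util.regex.Pattern;

class PrCodeCheck {
    // Optional leading A, B, P or digit, followed by one or two digits.
    static final Pattern CODE = Pattern.compile("[ABP0-9]?[0-9]{1,2}");
    // Variant without the "?": the code must be 2 or 3 characters long.
    static final Pattern CODE_MIN2 = Pattern.compile("[ABP0-9][0-9]{1,2}");

    // matches() requires the whole input to match, so no anchors are needed.
    static boolean isValid(String code) {
        return CODE.matcher(code).matches();
    }

    static boolean isValidMin2(String code) {
        return CODE_MIN2.matcher(code).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"A23", "B45", "P71", "A3", "418", "91",
                            "C23", "AC2", "D3", "7"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s)
                + " (min. 2 chars: " + isValidMin2(s) + ")");
        }
    }
}
```

The single-digit input 7 is the only sample on which the two variants differ: it is accepted with the optional first character and rejected without it.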

How do I represent "Any string except for .... "

I'm trying to solve a regex where the given alphabet is Σ={a,b}
The first expression is:
L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
which means the corresponding regex is: aa(a)*b(bbb)*
What would be a regex for L2, complement of L1?
Is it right to assume L2 = "Any string except for aa(a)*b(bbb)*"?
First, in my opinion, the regex for L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
is NOT what you gave but is: aa(aa)*b(bbb)*. The reason is that a^2n with n >= 1 means there are at least two a's and the number of a's is even.
Now, the regular expression for "Any string except for aa(aa)*b(bbb)*" is:
^(?!^aa(aa)*b(bbb)*$).*$
more details here: Regex101
Explanations
aa(aa)*b(bbb)* the regex you DON'T want to match
^ represents the beginning of the line
(?!) negative lookahead: should NOT match what's in this group
$ represents end of line
EDIT
Yes, a complement for aa(aa)*b(bbb)* is "Any string but the ones that match aa(aa)*b(bbb)*".
Now you need to find a regex that represents that with the syntax that you can use. I gave you a regex in this answer that is correct and matches "Any string but the ones that match aa(aa)*b(bbb)*", but if you want a mathematical representation following the pattern you gave for L1, you'll need to find something simpler.
Without any negative lookahead, that would be (the last alternative, .*ba.*, covers strings in which an a follows a b, such as abab):
L2 = ^((b+.*)|((a(aa)*)?b*)|a*((bbb)*|bb(bbb)*)|(.*a+)|(.*ba.*))$
Test it here at Regex101
Good luck with the mathematical representation translation...
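One way to gain confidence in any candidate expression for L2 is to compare it with "not in L1" on every string over {a, b} up to some length. The Java sketch below does this by brute force; the class name ComplementCheck, the method countDisagreements, and the length limit of 10 are arbitrary choices for illustration, not part of the original answer.

```java
import java.util.regex.Pattern;

class ComplementCheck {
    // L1 as corrected above: an even, non-zero number of a's, then 3m+1 b's.
    static final Pattern L1 = Pattern.compile("aa(aa)*b(bbb)*");

    // Counts the strings over {a, b} of length <= maxLen on which the
    // candidate regex does NOT behave as the complement of L1.
    static int countDisagreements(String candidateRegex, int maxLen) {
        Pattern candidate = Pattern.compile(candidateRegex);
        int disagreements = 0;
        for (int len = 0; len <= maxLen; len++) {
            // Each bit pattern of width len encodes one string: 0 -> a, 1 -> b.
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < len; i++) {
                    sb.append(((bits >> i) & 1) == 0 ? 'a' : 'b');
                }
                String s = sb.toString();
                boolean inL1 = L1.matcher(s).matches();
                boolean inCandidate = candidate.matcher(s).matches();
                // A complement must accept exactly the strings L1 rejects.
                if (inCandidate == inL1) {
                    disagreements++;
                }
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        // The negative-lookahead version from this answer.
        System.out.println(countDisagreements("^(?!^aa(aa)*b(bbb)*$).*$", 10));
    }
}
```

A result of 0 only shows agreement up to the chosen length, not a proof, but a non-zero count is enough to show that a candidate is wrong. Strings that mix the letters, such as abab, are the cases most easily overlooked when writing a lookahead-free expression.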
The first expression is:
L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
Regex for L1 is:
^aa(?:aa)*b(?:bbb)*$
Regex demo
Input
a
b
ab
aab
abb
aaab
aabb
abbb
aaaab
aaabb
aabbb
abbbb
aaaaab
aaaabb
aaabbb
aabbbb
abbbbb
aaaaaab
aaaaabb
aaaabbb
aaabbbb
aabbbbb
abbbbbb
aaaabbbb
Matches
MATCH 1
1. [7-10] `aab`
MATCH 2
1. [30-35] `aaaab`
MATCH 3
1. [75-81] `aabbbb`
MATCH 4
1. [89-96] `aaaaaab`
MATCH 5
1. [137-145] `aaaabbbb`
Regex for L2, complement of L1
^aa(?:aa)*b(?:bbb)*$(*SKIP)(*FAIL)|^.*$
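Explanation (note that the (*SKIP) and (*FAIL) verbs are PCRE/Perl features and are not available in every regex flavor):
^aa(?:aa)*b(?:bbb)*$ matches L1
^aa(?:aa)*b(?:bbb)*$(*SKIP)(*FAIL) anything that matches L1 is skipped and fails
|^.*$ matches every other string, i.e. those that do not match L1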
Regex demo
Matches
MATCH 1
1. [0-1] `a`
MATCH 2
1. [2-3] `b`
MATCH 3
1. [4-6] `ab`
MATCH 4
1. [11-14] `abb`
MATCH 5
1. [15-19] `aaab`
MATCH 6
1. [20-24] `aabb`
MATCH 7
1. [25-29] `abbb`
MATCH 8
1. [36-41] `aaabb`
MATCH 9
1. [42-47] `aabbb`
MATCH 10
1. [48-53] `abbbb`
MATCH 11
1. [54-60] `aaaaab`
MATCH 12
1. [61-67] `aaaabb`
MATCH 13
1. [68-74] `aaabbb`
MATCH 14
1. [82-88] `abbbbb`
MATCH 15
1. [97-104] `aaaaabb`
MATCH 16
1. [105-112] `aaaabbb`
MATCH 17
1. [113-120] `aaabbbb`
MATCH 18
1. [121-128] `aabbbbb`
MATCH 19
1. [129-136] `abbbbbb`

Regular expression match decimal with letters

I have the following strings: 3.14, 123.56f, .123e5f, 123D, 1234, 343E12, 32.
What I want to do is match any combination of the above inputs. So far I have started with the following:
^[0-9]\d*(\.\d+)
I realize I have to escape the . since it is a regex metacharacter itself.
Thanks.
This should also work, if not already proposed.
try {
    Pattern regex = Pattern.compile("\\.?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?[fD]?\\b", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // matched text: regexMatcher.group()
        // match start: regexMatcher.start()
        // match end: regexMatcher.end()
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
Probably
^(\d+(\.\d+)?|\.\d+)([eE]\d+)?[fD]?$
http://regexr.com?2ut9t
^ start of the string
(\d+(\.\d+)?|\.\d+) one or more digits with an optional (. and one or more digits), or a . and one or more digits
([eE]\d+)? an optional ( e or E and one or more digits)
[fD]? an optional f or D
$ end of the string
As a sidenote, I've made the D compatible with everything but the f.
If you need positive and negative sign, add [+-]? after the ^
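The expression above can be checked against the inputs from the question with a small Java sketch. The class name NumberLiteralCheck and the helper isLiteral are made up for illustration, and the sketch assumes that the final period in "32." in the question is sentence punctuation, so the input tested is 32.

```java
import java.util.regex.Pattern;

class NumberLiteralCheck {
    // Digits with an optional fraction, or a bare fraction; then an optional
    // exponent and an optional f or D suffix.
    static final Pattern LITERAL =
        Pattern.compile("^(\\d+(\\.\\d+)?|\\.\\d+)([eE]\\d+)?[fD]?$");

    static boolean isLiteral(String s) {
        return LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // The first seven samples are the inputs from the question (assuming
        // "32" without the trailing period); the rest are counter-examples.
        String[] samples = {"3.14", "123.56f", ".123e5f", "123D", "1234",
                            "343E12", "32", "32.", "1e", "abc"};
        for (String s : samples) {
            System.out.println(s + " -> " + isLiteral(s));
        }
    }
}
```

If a trailing dot as in 32. should be accepted as well, the fraction part would need to allow zero digits after the dot; this expression requires at least one.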
This will match all of those:
[0-9.]+(?:[Ee][0-9.]*)?[DdFf]?
Note that within a character class (square brackets), the dot . is not a special character and does not need to be escaped.
Maybe this one?
^\d*(?:\.\d+)?(?:[eE]\d+)?(?:[fD])?$
with
^\d* #possibly a digit or sequence of digits at the start
(?:\.\d+)? #possibly followed by a dot and at least one digit
(?:[eE]\d+)? #possibly a 'e' or 'E' followed by at least one digit
(?:[fD])?$ #optionally followed by an 'f' or 'D' letter at the end
You can use regexpal to test it out, but this seems to work on all of those examples:
^\d*\.?(\d*[eE]?\d*)[fD]?$