RegEx to match days hours minutes with spaces - regex

I need to make sure the users' time stamp input is
(0 to infinity) days
(0 to 23) hours
(h to 59) minutes
with spaces between, so good example is
22h 40m
2d 20m
1d
but not
0d
2d22h20m
what i have so far is:
^(0|[1-9][0-9]*d)?( [0|[1-5][0-9]*h)?(0| [1-5][0-9]*m)?$
which gets:
22d 3h 40m
2d 40m
but not
40m
40h
it seems trivial, and I've composed it from several SO questions, but nothing matches exactly this.
Edit
just note that my original attempt has a mistake that allows 1-59 hours.

It is quite verbose, but one option could be to use an alternation and list all the allowed possibilities using your patterns for d, h and m.
Matching (d h m) or (d h) or (d m) or (h m) or (d) or (h) or (m) you might use:
^(?:[1-9][0-9]*d (?:2[0-3]|1[0-9]|[1-9])h (?:[1-5][0-9]|[1-9])m|[1-9][0-9]*d (?:(?:2[0-3]|1[0-9]|[1-9])h|(?:[1-5][0-9]|[1-9])m)|(?:2[0-3]|1[0-9]|[1-9])h (?:[1-5][0-9]|[1-9])m|[1-9][0-9]*d|(?:2[0-3]|1[0-9]|[1-9])h|(?:[1-5][0-9]|[1-9])m)$
regex101 demo
That will match:
^ Start of string
(?: Non capturing group
[1-9][0-9]*d (?:2[0-3]|1[0-9]|[1-9])h (?:[1-5][0-9]|[1-9])m Match full part
| Or
[1-9][0-9]*d Match d
(?: Non capturing group
(?:2[0-3]|1[0-9]|[1-9])h Match h
| Or
(?:[1-5][0-9]|[1-9])m Match m
) close non capturing group
| Or
(?:2[0-3]|1[0-9]|[1-9])h (?:[1-5][0-9]|[1-9])m Match h and m
| Or
[1-9][0-9]*d Match d
| Or
(?:2[0-3]|1[0-9]|[1-9])h Match h
| Or
(?:[1-5][0-9]|[1-9])m Mach m
) Close non capturing group
$ End of string

Edit
The only suggestions (so far) are really to list all possibilities. So I'll be doing that here.. just note that my original attempt has a mistake that allows 1-59 hours.
regex101
^((([1-9][0-9])d (2[0-3]|1[0-9]|[1-9])h ([1-5][0-9]|[1-9])m)|(([1-9][0-9])d (2[0-3]|1[0-9]|[1-9])h)|(([1-9][0-9])d ([1-5][0-9]|[1-9])m)|((2[0-3]|1[0-9]|[1-9])h ([1-5][0-9]|[1-9])m)|([1-9][0-9])d|(2[0-3]|1[0-9]|[1-9])h|([1-5][0-9]|[1-9])m)$
meaning: (d h m) or (d h) or (d m) or (h m) or (d) or (h) or (m)

Related

How to extract the operands on both sides of "==" using regex?

Language and package
python3.8, regex
Description
The inputs and wanted outputs are listed as following:
if (programWorkflowState.getTerminal(1, 2) == Boolean.TRUE) {
Want: programWorkflowState.getTerminal(1, 2) and Boolean.TRUE
boolean ignore = !_isInStatic.isEmpty() && (_isInStatic.peek() == 3) && isAnonymous;
Want: _isInStatic.peek() and 3
boolean b = (num1 * ( 2 + num2)) == value;
Want: (num1 * ( 2 + num2)) and value
My current regex
((?:\((?:[^\(\)]|(?R))*\)|[\w\.])+)\s*==\s*((?:\((?:[^\(\)]|(?R))*\)|[\w\.])+)
This pattern want to match \((?:[^\(\)]|(?R))*\) or [\w\.] on both side of "=="
Result on regex101.com
Problem: It failed to match the recursive part (num1 * ( 2 + num2)).
The explanation of the recursive pattern \((?:m|(?R))*\) is here
But if I only use the recursive pattern, it succeeded to match (num1 * ( 2 + num2)) as the image shows.
What's the right regex to achieve my purpose?
The \((?:m|(?R))*\) pattern contains a (?R) construct (equal to (?0) subroutine) that recurses the entire pattern.
You need to wrap the pattern you need to recurse with a group and use a subroutine instead of (?R) recursion construct, e.g. (?P<aux>\((?:m|(?&aux))*\)) to recurse a pattern inside a longer one.
You can use
((?:(?P<aux1>\((?:[^()]++|(?&aux1))*\))|[\w.])++)\s*[!=]=\s*((?:(?&aux1)|[\w.])+)
See this regex demo (it takes just 6875 steps to match the string provided, yours takes 13680)
Details
((?:(?P<aux1>\((?:[^()]++|(?&aux1))*\))|[\w.])++) - Group 1, matches one or more occurrences (possessively, due to ++, not allowing backtracking into the pattern so that the regex engine could not re-try matching a string in another way if the subsequent patterns fail to match)
(?P<aux1>\((?:[^()]++|(?&aux1))*\)) - an auxiliary group "aux1" that matches (, then zero or more occurrences of either 1+ chars other than ( and ) or the whole Group "aux1" pattern, and then a )
| - or
[\w.] - a letter, digit, underscore or .
\s*[!=]=\s* - != or == with zero or more whitespace on both ends
((?:(?&aux1)|[\w.])+) - Group 2: one or more occurences of Group "aux" pattern or a letter, digit, underscore or ..

Regex logic for a 3 digits code with specific 1 digit Alphabetic and Numbers

I need help on this to write logic for regex expression for the following conditions. The user keyed code should have
3 Bytes max
1st byte can have alpha (specifically A, B, P) or all 3 numbers
2nd & 3rd bytes must be numeric
No special characters allowed.
Examples,
A23 - match
B45 - match
P71 - match
A3 - match
418 - match
91 - match
C23 - not match
AC2 - not match
D3 - not match
I tried the expression, but no luck. The logic is
alphaNumericRegExp =/[A,B,P][0-9]{3}/
Matcher matcher = mask.matcher(service.getRacprCd1());
Matcher matcher1=digitPattern.matcher(service.getRacprCd1());
if (!matcher.matches()) {
vectErrMsgs.add("Pr code is not valid. " );
}
You may use
alphaNumericRegExp =/[ABP0-9]?[0-9]{1,2}/
With matcher.matches(), it requires a full string match, no need adding ^ and $ anchors. It matches:
[ABP0-9]? - an optional A, B, P, or digit
[0-9]{1,2} - 1 or 2 digits
Note that a | inside a character class makes it match the literal pipe symbol.
Split it into logical pieces. The first char can be A, B, P, a number, or (if I understand correctly) nothing. Therefore:
[ABP\d]?
Then there needs to be 1 or 2 digits.
\d{1,2}
So all together,
^[ABP\d]?\d{1,2}$
One gotcha, this allows a single digit. I can't tell from your question if that is allowed. If the code has to be at least 2 chars long, remove the ?

Regex for replacing multiple spaces and dashes with or without spaces

I can do this with two separate regex passes, but this is already slow and doing two doesn't help, so I want to be able to do it in one pass.
I want to:
replace multiple spaces with one space
replace a dash (hyphen) with a space
However, if the dash has a space on either side of it then the dash and any spaces either side to be replaced with just one space.
As an example:
a - b c-d e -f g- h i - j k - l m - n
must end up like
a b c d e f g h i j k l m n
I have tried things like this:
\s+| - | -|- |-
but that doesn't work:
a b c d e f g h i j k l m n
Use the following regexp to match multiple spaces or dashes;
[\s-]+
Replace with a single space.
[\s-]+ with a global 'g' modifier and replace with one single space.
See here
Regex:
(?:\s*-\s*)+|\s{2,}
REplacement string:
<space>
DEMO

Regular Expression: search multiple string with linefeed delimited by ";"

I have a string such this that described a structured data source:
Header whocares;
SampleTestPlan 2
a b
c d;
Test abc;
SampleTestPlan 3
e f
g h
i l;
Wafer 01;
EndOfFile;
Every field...
... is starting with "FieldName"
... is ending with ";"
... may contain linefeed
My need is to find with regular expression the values of SampleTestPlan that's repeated twice. So...
1st value is:
2
a b
c d
2nd value is
3
e f
g h
i l
I've performed several attempts with such search string:
/SampleTestPlan(.\s)/gm
/SampleTestPlan(.\s);/gm
/SampleTestPlan(.*);/gm
but I need to understand much better how Regular Expression work as I'm definitively a newbie on them and I need to learn a lot.
Thanks in advance to anyone that may help me!
Stefano, Milan, ITALY
You could use the following regex:
(?<=\w\b)[^;]+(?=;)
See it working live here on regex101!
How it works:
It matches everything that is:
preceded by a sequence of characters: \w+
followed by a ;
contains anything (at least one character) except a ; (including newlines).
For example, for that input:
Header whocares;
SampleTestPlan 2
a b
c d;
Test abc;
SampleTestPlan 3
e f
g h
i l;
Wafer 01;
EndOfFile;
It matches 5 times:
whocares
then:
2
a b
c d
then:
abc
then:
3
e f
g h
i l
then:
01
Assuming your input will be always in this well formatted like the sample, try this:
/SampleTestPlan(\s+\d+.*?);/sg
Here, /s modifier means Dot matches newline characters
You can try this at online.
That would be /SameTestPlan([^;]+)/g. [^abc] means any character which is not a, b or c.

Regex to calculate straight poker hand - Using ASCII CODE

In another question I learned how to calculate straight poker hand using regex (here).
Now, by curiosity, the question is: can I use regex to calculate the same thing, using ASCII CODE?
Something like:
regex: [C][C+1][C+2][C+3][C+4], being C the ASCII CODE (or like this)
Matches: 45678, 23456
Doesn't matches: 45679 or 23459 (not in sequence)
Your main problem is really going to be that you're not using ASCII-consecutive encodings for your hands, you're using numerics for non-face cards, and non-consecutive, non-ordered characters for face cards.
You need to detect, at the start of the strings, 2345A, 23456, 34567, ..., 6789T, 789TJ, 89TJQ, 9TJQK and TJQKA.
These are not consecutive ASCII codes and, even if they were, you would run into problems since both A2345 and TJQKA are valid and you won't get A being both less than and greater than the other characters in the same character set.
If it has to be done by a regex, then the following regex segment:
(2345A|23456|34567|45678|56789|6789T|789TJ|89TJQ|9TJQK|TJQKA)
is probably the easiest and most readable one you'll get.
There is no regex that will do what you want as the other answers have pointed out, but you did say that you want to learn regex, so here's another meta-regex approach that may be instructional.
Here's a Java snippet that, given a string, programmatically generate the pattern that will match any substring of that string of length 5.
String seq = "ABCDEFGHIJKLMNOP";
System.out.printf("^(%s)$",
seq.replaceAll(
"(?=(.{5}).).",
"$1|"
)
);
The output is (as seen on ideone.com):
^(ABCDE|BCDEF|CDEFG|DEFGH|EFGHI|FGHIJ|GHIJK|HIJKL|IJKLM|JKLMN|KLMNO|LMNOP)$
You can use this to conveniently generate the regex pattern to match straight poker hands, by initializing seq as appropriate.
How it works
. metacharacter matches "any" character (line separators may be an exception depending on the mode we're in).
The {5} is an exact repetition specifier. .{5} matches exactly 5 ..
(?=…) is positive lookahead; it asserts that a given pattern can be matched, but since it's only an assertion, it doesn't actually make (i.e. consume) the match from the input string.
Simply (…) is a capturing group. It creates a backreference that you can use perhaps later in the pattern, or in substitutions, or however you see fit.
The pattern is repeated here for convenience:
match one char
at a time
|
(?=(.{5}).).
\_________/
must be able to see 6 chars ahead
(capture the first 5)
The pattern works by matching one character . at a time. Before that character is matched, however, we assert (?=…) that we can see a total of 6 characters ahead (.{5})., capturing (…) into group 1 the first .{5}. For every such match, we replace with $1|, that is, whatever was captured by group 1, followed by the alternation metacharacter.
Let's consider what happens when we apply this to a shorter String seq = "ABCDEFG";. The ↑ denotes our current position.
=== INPUT === === OUTPUT ===
A B C D E F G ABCDE|BCDEFG
↑
We can assert (?=(.{5}).), matching ABCDEF
in the lookahead. ABCDE is captured.
We now match A, and replace with ABCDE|
A B C D E F G ABCDE|BCDEF|CDEFG
↑
We can assert (?=(.{5}).), matching BCDEFG
in the lookahead. BCDEF is captured.
We now match B, and replace with BCDEF|
A B C D E F G ABCDE|BCDEF|CDEFG
↑
Can't assert (?=(.{5}).), skip forward
A B C D E F G ABCDE|BCDEF|CDEFG
↑
Can't assert (?=(.{5}).), skip forward
A B C D E F G ABCDE|BCDEF|CDEFG
↑
Can't assert (?=(.{5}).), skip forward
:
:
A B C D E F G ABCDE|BCDEF|CDEFG
↑
Can't assert (?=(.{5}).), and we are at
the end of the string, so we're done.
So we get ABCDE|BCDEF|CDEFG, which are all the substrings of length 5 of seq.
References
regular-expressions.info/Dot, Repetition, Grouping, Lookaround
Something like regex: [C][C+1][C+2][C+3][C+4], being C the ASCII CODE (or like this)
You can not do anything remotely close to this in most regex flavors. This is simply not the kinds of patterns that regex is designed for.
There is no mainstream regex pattern that will succintly match any two consecutive characters that differ by x in their ASCII encoding.
For instructional purposes...
Here you go (see also on ideone.com):
String alpha = "ABCDEFGHIJKLMN";
String p = alpha.replaceAll(".(?=(.))", "$0(?=$1|\\$)|") + "$";
System.out.println(p);
// A(?=B|$)|B(?=C|$)|C(?=D|$)|D(?=E|$)|E(?=F|$)|F(?=G|$)|G(?=H|$)|
// H(?=I|$)|I(?=J|$)|J(?=K|$)|K(?=L|$)|L(?=M|$)|M(?=N|$)|N$
String p5 = String.format("(?:%s){5}", p);
String[] tests = {
"ABCDE", // true
"JKLMN", // true
"AAAAA", // false
"ABCDEFGH", // false
"ABCD", // false
"ACEGI", // false
"FGHIJ", // true
};
for (String test : tests) {
System.out.printf("[%s] : %s%n",
test,
test.matches(p5)
);
}
This uses meta-regexing technique to generate a pattern. That pattern ensures that each character is followed by the right character (or the end of the string), using lookahead. That pattern is then meta-regexed to be matched repeatedly 5 times.
You can substitute alpha with your poker sequence as necessary.
Note that this is an ABSOLUTELY IMPRACTICAL solution. It's much more readable to e.g. just check if alpha.contains(test) && (test.length() == 5).
Related questions
How does the regular expression (?<=#)[^#]+(?=#) work?
SOLVED!
See in http://jsfiddle.net/g48K9/3
I solved using closure, in js.
String.prototype.isSequence = function () {
If (this == "A2345") return true; // an exception
return this.replace(/(\w)(\w)(\w)(\w)(\w)/, function (a, g1, g2, g3, g4, g5) {
return code(g1) == code(g2) -1 &&
code(g2) == code(g3) -1 &&
code(g3) == code(g4) -1 &&
code(g4) == code(g5) -1;
})
};
function code(card){
switch(card){
case "T": return 58;
case "J": return 59;
case "Q": return 60;
case "K": return 61;
case "A": return 62;
default: return card.charCodeAt();
}
}
test("23456");
test("23444");
test("789TJ");
test("TJQKA");
test("8JQKA");
function test(cards) {
alert("cards " + cards + ": " + cards.isSequence())
}
Just to clarify, ascii codes:
ASCII CODES:
2 = 50
3 = 51
4 = 52
5 = 53
6 = 54
7 = 55
8 = 56
9 = 57
T = 84 -> 58
J = 74 -> 59
Q = 81 -> 60
K = 75 -> 61
A = 65 -> 62