I have the following RegEx that is supposed to do 24 hours time format validation, which I'm trying out in https://rubular.com
/^[0-23]{2}:[0-59]{2}:[0-59]{2}$/
But the following times fails to match even if they look correct
02:06:00
04:05:00
Why this is so?
In character classes, you're supposed to denote the range of characters allowed (in contrast to the numbers you want to match in your example). For minutes and seconds, this is relatively straight-forward - the following expression
[0-5][0-9]
...will match any numerical string from "00" to "59".
But for the hours, you need to two separate expressions:
[01][0-9]|2[0-3]
...one to match "00" to "19" and one to match "20" to "23". Due to the alternative used (| character), these need to be grouped, which adds another bit of syntax (?:...). Finally we're just adding the anchors ^ and $ for beginning and end of string, which you already had where they belong.
^(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$
You can check this solution out at regex101, if you like.
Your problem is that you understand characters ranges wrong: 0-23 doesn't mean "match any number from 0 to 23", it means: 0-2- match one digit: 0,1 or 2, then match 3.
Try this pattern: (?:[01][0-9]|2[0-3])(?::[0-5][0-9]){2}
Explanation:
(?:...) - non-capturing group
[01][0-9]|2[0-3] - alternation: match whether 0 or one followed by any digits fro 0 to 9 OR 2 followed by 0, 1, 2 or 3 (number from 00-23)
(?::[0-5][0-9]){2} - match : and [0-5][0-9] (basically number from 00-59) twice
Demo
use this (([0-1]\d|[2][0-3])):(([0-5][0-9])):(([0-5][0-9]))
Online demo
Related
I have the following string:
"Thu Dec 31 22:00:00 UYST 2009"
I want to replace everything except for the hours and minutes so I get the following result:
"22:00"
I am using this regex :
(^([0-9][0-9]:[0-9][0-9]))
But its not matching anything.
This would be my line of actual code :
println("Thu Dec 31 22:00:00 UYST 2009".replace("(^([0-9][0-9]:[0-9][0-9]))".toRegex(),""))
Can someone help me to correct the regex?
The reason the one you have isn't working is because you are asserting that the line starts right before the minutes and seconds, which isn't the case. This can be fixed by removing the assertion (^).
If you need the assertion to remain, there is another way. In most languages, you wouldn't be able to use a variable-length positive lookbehind here, but lucky for you, it looks like you can in Kotlin.
A positive lookbehind is basically just telling the pattern "this comes before what I'm looking for". It's denoted by a group beginning with ?<=. In this case, you can use something like (?<=^[\w ]+). This will match all word characters or spaces between the beginning of the line and where the pattern that comes after it is able to match. Appending it to your expression would look something like (?<=^[\w ]+)([0-9][0-9]:[0-9][0-9]) (note you will have to escape the \w in order for it to be in a string and not be angry about it).
Side note, Yogesh_D is correct in saying that \d\d:\d\d is the same as your [0-9][0-9]:[0-9][0-9]. Using this, it would look more like (?<=^[\w ]+)\d\d:\d\d.
You may use various solutions, here are two:
val text = """Thu Dec 31 22:00:00 UYST 2009"""
val match = """\b(?:0?[1-9]|1\d|2[0-3]):[0-5]\d\b""".toRegex().find(text)
println(match?.value)
val match2 = """\b(\d{1,2}:\d{2}):\d{2}\b""".toRegex().find(text)
println(match2?.groupValues?.getOrNull(1))
Both return 22:00. See regex #1 demo and regex #2 demo.
The regex complexity should be selected based on how messy the input string is.
Details
\b - a word boundary
(?:0?[1-9]|1\d|2[0-3]) - an optional zero and then a non-zero digit, or 1 and any digit, or 2 and a digit from 0 to 3
: - a : char
[0-5]\d - 0, 1, 2, 3, 4 or 5 and then any one digit
\b - a word boundary.
If there is a match with this regex, you get it as a whole match, so you can access it via match?.value.
If you do not have to worry about any pre-valiation when matching, you may simply match 3 colon-separated digit pairs and capture the first two, see the second regex:
\b - a word boundary
(\d{1,2}:\d{2}) - Group 1: one or two digits, : and two digits
:\d{2} - a : and two digits (not captured)
\b - a word boundary.
If there is a match, we need Group 1 value, hence match2?.groupValues?.getOrNull(1) is used.
I am not sure what language you are using but why use negation when you can directly match the first digits in the hh:mm format.
Assuming that the date string format always is in the format with a hh:mm in there.
This regex snippet should have the first group match the hh:mm.
https://regex101.com/r/aHdehZ/1
The regex to use is (\d\d:\d\d)
For the example i have these four ip address:
10.100.0.11; wrong
10.100.1.12; good
10.100.11.4; good
10.100.44.1; wrong
The task has simple rules. In the 3rd place cant be 0, and the 4rd place cant be a solo 1.
I need to select they from an ip table in different routers and i know only this rules.
My solution:
^(10.100.[1-9]{1,3}.[023456789]{1,3})$
but in this case every number with 1 like 10, 100 etc is missing, so in this way this solution is wrong.
^(10.100.[1-9]{1,3}.[1-9]{2,3})$
This solve the problem of the single 1, but make another one.
From the rules you have given, this regex should work:
10\.100\.([123456789]\d*|\d{2,})\.([^1]$|\d{2,})
it also matches 3rd position number containing a 0 but not in the first place.
so 10.100.10.4 will match as well as 10.100.02.4
I don't know if it's the intended behavior since I'm not familiar with ip adress.
The last part \.([^1]$|\d{2,}) reads like this:
"after the 3rd dot is either
a character which is not 1 followed by the end of the line
or two or more digits"
If you want to avoid malformed string containing non-digit character like 10.100.12.a to be match you should replace [^1] by [023456789] or lazier (and therefore better ;) by [02-9]
I use https://regex101.com to debug regex. It's just awesome.
Here is your regex if you want to play with it
You might use
^10\.100\.[1-9]{1,3}\.(?:[02-9]|\d{2,3})$
The pattern matches
^ Start of string
10\.100\. Match 10.100. (note to escape the dot to match it literally)
[1-9]{1,3} Match 3 times a digit 1-9
\. Match a dot
(?: Non capture group
[02-9] Match a digit 0 or 2-9
| Or
\d{2,3} Match 2 or 3 digits 0-9
) Close the group
$ End of string
Regex demo
I have files with these filename:
ZATR0008_2018.pdf
ZATR0018_2018.pdf
ZATR0218_2018.pdf
Where the 4 digits after ZATR is the issue number of magazine.
With this regex:
([1-9][0-9]*)(?=_\d)
I can extract 8, 18 or 218 but I would like to keep minimum 2 digits and max 3 digits so the result should be 08, 18 and 218.
How is possible to do that?
You may use
0*(\d{2,3})_\d
and grab Group 1 value. See the regex demo.
Details
0* - zero or more 0 chars
(\d{2,3}) - Group 1: two or three digits
_\d - a _ followed with a digit.
Here is a PCRE variation that grabs the value you need into a whole match:
0*\K\d{2,3}(?=_\d)
See another regex demo
Here, \K makes the regex engine omit the text matched so far (zeros) and then matches 2 to 3 digits that are followed with _ and a digit.
(?:[1-9][0-9]?)?[0-9]{2}(?=_[0-9])
or perhaps:
(?:[1-9][0-9]+|[0-9]{2})(?=_[0-9])
(https://www.freeformatter.com/regex-tester.html, which claims to use the XRegExp library, that you mention in another answer doesn't seem to backtrack into the (?:)? in my first suggestion where necessary, which makes it very different from any regex engine I've encoutered before and makes it prefer to match just the 18 of 218 even though it starts later in the string. But it does work with my second suggestion.
([1-9]\d{2,3})(?=_\d)
{x,y} will match from x to y times the previous pattern, in this case \d
Edit: from your own regex it looked as you wanted the part of the number which starts with a non-zero. However since your examples include leading 0s, maybe you really wanted :
(\d{2,3})(?=_\d)
Which will give you the last 3 digits before underscore unless there are only 2 digits.
I propose you:
^ZATR0*(\d{2,3})_\d+\.pdf$
demo code here. Result:
Match 1 Full match 0-17 ZATR0008_2018.pdf Group 1. 6-8 08
Match 2 Full match 18-35 ZATR0018_2018.pdf Group 1. 24-26 18
Match 3 Full match 36-53 ZATR0218_2018.pdf Group 1. 41-44 218
I need to be able to check if a string contains either a 2 digit or a 4 digit number before a . (period).
For example, 39. is good, and so is 3926., but 392. is not.
I originally had (^\\d{2,4).$) but that allows between a 2 and a 4 digit number preceding a period.
I also tried (^\\d{2}.|\\d{4}.$) but that didn't work.
You can use this regex:
^\d{2}(?:\d{2})?\.$
This regex makes 2nd set of \d{2} optional thus allowing to match 12. or 1234. but not 123..
In the expression (^\d{2}.|\d{4}.$), the dots match any character.
Try escaping them to make them match literal dots: (^\d{2}\.|\d{4}\.$)
I have a string
2012-02-19 00:11:12,128|DEBUG|Thread-1|### Time taken is 18 ms
Below regex allows me to search for 18 ms
\d\d\s[m][s]
What I want to do is search for string prior to 18 ms in Notepad++ and then delete it. So that out of thousands rows I have, I can just extract out timings.
Also, I need regex mentioned above to work with timings which are in 3 digits as well as 2 digits. For example it should be able to search for 18 ms as well as 999 ms.
Please help.
You may put your regex into a positive lookahead:
^.*(?=\d{2,3}\sms\s*$)
In case you have some text after 18 ms, you need to use a word boundary \b:
\b allows you to perform a "whole words only" search using a regular expression in the form of \bword\b
^.*(?=\d{2,3}\sms\b)
See demo
{2,3} is a limiting quantifier that lets you match 2 or 3 preceding subpattern.
There's an additional quantifier that allows you to specify how many times a token can be repeated. The syntax is {min,max}, where min is zero or a positive integer number indicating the minimum number of matches, and max is an integer equal to or greater than min indicating the maximum number of matches. If the comma is present but max is omitted, the maximum number of matches is infinite.
You can replace with empty string and 18 ms will stay on the line.
Note you can use \d+ to allow 1 or more digits to be matched (without restrictions on the digit number).
Note 2: if your number is the first of many on the line you need to use lazy matching, i.e. use .*? instead of .* in the beginning of pattern.
Also, I need regex mentioned above to work with timings which are in 3 digits as well as 2 digits.
.*?(?=\d{2,3}\sms\b)
Use the above regex and then replace the match with empty string.
You can use capturing group:
Find:
^.*(\d{2,}\s[m][s])$
Replace with:
\1