Parse time string using regex

My time string may be in one of the following formats (x and y are integers; h and m are literal characters):
xh ym
xh
ym
y
Examples:
1h 20m
45m
2h
120
What regular expression should I write to get the x and y numbers from such a string?

(\d+)([mh]?)(?:\s+(\d+)m)?
You can then inspect groups 1-3. For your examples those would be
('1', 'h', '20')
('45', 'm', '')
('2', 'h', '')
('120', '', '')
As always, you might want to use some anchors ^, $, \b...
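For illustration, here is a minimal Python sketch of the same pattern with ^ and $ anchors added. The helper name to_minutes and the choice to read a bare number as minutes are assumptions made for this example, not part of the answer above. Note that Python reports an optional group that did not take part in the match as None rather than as an empty string.

```python
import re

# Same pattern as above, anchored so that trailing junk is rejected.
PATTERN = re.compile(r"^(\d+)([mh]?)(?:\s+(\d+)m)?$")

def to_minutes(text):
    """Return the total number of minutes, or None if the text does not match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    first, unit, second = match.groups()
    # Assumption: a number without a unit means minutes.
    total = int(first) * 60 if unit == "h" else int(first)
    if second is not None:
        total += int(second)
    return total

for sample in ["1h 20m", "45m", "2h", "120"]:
    print(sample, "->", to_minutes(sample))
```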

I'm going to assume you're using .NET due to your username. :)
I think in this case, it's easier to use TimeSpan.ParseExact for this task.
You can specify a list of permitted formats (see here for the format for these) and ParseExact will read in the TimeSpan according to them.
Here is an example:
var formats = new[]{"h'h'", "h'h 'm'm'", "m'm'", "%m"};
// I have assumed that a single number means minutes
foreach (var item in new[]{"23","1h 45m","1h","45m"})
{
TimeSpan timespan;
if (TimeSpan.TryParseExact(item, formats, CultureInfo.InvariantCulture, out timespan))
{
// valid
Console.WriteLine(timespan);
}
}
Output:
00:23:00
01:45:00
01:00:00
00:45:00
The only problem with this is that it is rather inflexible. Additional whitespace in the middle will fail to validate. A more robust solution using Regex is:
var items = new[]{"23","1h 45m", "45m", "1h", "1h 45", "1h 45", "1h45m"};
foreach (var item in items)
{
var match = Regex.Match(item, @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$", RegexOptions.ExplicitCapture);
if (match.Success)
{
int hours;
int.TryParse(match.Groups["hours"].Value, out hours); // hours == 0 on failure
int minutes;
int.TryParse(match.Groups["minutes"].Value, out minutes);
Console.WriteLine(new TimeSpan(0, hours, minutes, 0));
}
}
Breakdown of the regex:
^ - start of string
(?=\d) - must start with a digit (do this because both parts are marked optional, but we want to make sure at least one is present)
((?<hours>\d+)h)? - hours (optional, capture into named group)
\s* - whitespace (optional)
((?<minutes>\d+)m?)? - minutes (optional, capture into named group, the 'm' is optional too)
$ - end of string
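The breakdown above carries over to other regex engines almost unchanged. As a rough sketch, this is how it might look in Python, where named groups are written (?P<name>...). The function name parse is hypothetical, and the outer groups are made non-capturing by hand because Python has no direct counterpart to RegexOptions.ExplicitCapture.

```python
import re

# Python spells named groups (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(r"^(?=\d)(?:(?P<hours>\d+)h)?\s*(?:(?P<minutes>\d+)m?)?$")

def parse(text):
    """Return (hours, minutes), or None when the text is not a valid time."""
    match = PATTERN.match(text)
    if match is None:
        return None
    # A group that did not take part in the match is None; treat it as 0.
    hours = int(match.group("hours") or 0)
    minutes = int(match.group("minutes") or 0)
    return hours, minutes

for sample in ["23", "1h 45m", "45m", "1h", "1h 45", "1h45m", "", "h"]:
    print(repr(sample), "->", parse(sample))
```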

I would say that mhyfritz's solution is simple, efficient, and good if your input is only what you have shown.
If you ever need to handle corner cases, you can use a more discriminative expression:
^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$
But it can be overkill...
(Get rid of ^ and $ if you need to detect such a pattern in a larger body of text, of course.)
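To make the difference concrete, here is a small Python sketch that runs the simpler pattern (with anchors added) and the stricter one over the same inputs. The names SIMPLE and STRICT and the extra inputs "45m 20m" and "1h 20" are assumptions chosen for the comparison.

```python
import re

# The simpler pattern, with ^ and $ added so both are tested the same way.
SIMPLE = re.compile(r"^(\d+)([mh]?)(?:\s+(\d+)m)?$")
# The stricter pattern: a second number is only allowed after hours.
STRICT = re.compile(r"^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$")

for sample in ["1h 20m", "45m", "2h", "120", "45m 20m", "1h 20"]:
    simple_ok = SIMPLE.match(sample) is not None
    strict_ok = STRICT.match(sample) is not None
    print(f"{sample!r}: simple={simple_ok}, strict={strict_ok}")
```

The two patterns should only disagree on inputs such as "45m 20m", where minutes are followed by more minutes.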

Try this one: ^(?:(\d+)h\s*)?(?:(\d+)m?)?$:
var s = new[] { "1h 20m", "45m", "2h", "120", "1m 20m" };
foreach (var ss in s)
{
var m = Regex.Match(ss, @"^(?:(\d+)h\s*)?(?:(\d+)m?)?$");
int hour = m.Groups[1].Value == "" ? 0 : int.Parse(m.Groups[1].Value);
int min = m.Groups[2].Value == "" ? 0 : int.Parse(m.Groups[2].Value);
if (hour != 0 || min != 0)
Console.WriteLine("Hours: " + hour + ", Mins: " + min);
else
Console.WriteLine("No match!");
}

In bash:
echo "$string" | awk '{for(i=1;i<=NF;i++) print $i}' | sed 's/[hm]//g'

Related

How to detect an incomplete date in a list and replace it with Flutter?

Hello, I can't figure out how to detect an incomplete date in a List<String>. I thought about using a regex, but I don't know how to extract this sequence from the input.
input=[2022-01-20 20:01, 2022-01-20 21, 2022-01-20 22:25, 2022-01-20 23:01]
Here I am trying to match 2022-01-20 21 (it is the only entry without minutes).
After the match, I want to append the minutes (:00) to fix the wrong date format.
Here is the result I am looking for:
output=[2022-01-20 20:01, 2022-01-20 21:00, 2022-01-20 22:25, 2022-01-20 23:01]
Here is what I tried:
dateList=[2022-01-20 20:01, 2022-01-20 21, 2022-01-20 22:25, 2022-01-20 23:01];
for (var i = 1; i < dateList.length; i++) {
RegExp regExp = new RegExp(
r"^((?!:).)*$",
);
var match = regExp.firstMatch("${dateList}");
var index = dateList1.indexOf(match);
dateList.replaceRange(index, index + 1, ["$match:00"]);
}
For each index of my string list, I search for the only entry that does not contain a ":". Once I have found the index of the problematic entry, I replace it with the same value plus :00.
The problem: match returns null...
Thank you
I agree that using regular expressions is the way to go here. Detecting a date is relatively simple, you're basically looking for
4-digits dash 2-digits dash 2-digits space 2-digits colon 2-digits
Which, in RegExp language is
\d{4}-\d{2}-\d{2} \d{2}:\d{2}
Now we can detect whether a given String contains a complete datetime. The only thing that's left is to add the trailing minutes when it is missing. Note that you can decide what to add using another regular expression, but this code will just add the minutes, assuming that's always the issue.
List<String> input = ['2022-01-20 20:01', '2022-01-20 21', '2022-01-20 22:25', '2022-01-20 23:01'];
List<String> output = [];
// detect a date + time
RegExp regex = RegExp(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}');
for (String maybeDate in input) {
bool isCompleteDate = regex.hasMatch(maybeDate);
if (isCompleteDate) {
output.add(maybeDate);
} else {
// we want to complete the String
// in this case, I assume it's always just the minutes missing, but you could use another regex to see which part is missing
output.add(maybeDate + ':00');
}
}
print(output);
Alternatively, you can indeed use negative lookahead to find the missing minutes:
// detects a date and hour, without a colon and two digits (the minutes)
RegExp missingMinutes = RegExp(r'(\d{4}-\d{2}-\d{2} \d{2})(?!:\d{2})');
Which, in case you have a String instead of a List<String> would result in
List<String> input = ['2022-01-20 20:01', '2022-01-20 21', '2022-01-20 22:25', '2022-01-20 23:01'];
String listAsString = input.toString();
RegExp missingMinutes = RegExp(r'(\d{4}-\d{2}-\d{2} \d{2})(?!:\d{2})');
List<RegExpMatch?> matches = missingMinutes.allMatches(listAsString).toList();
for (int i = matches.length - 1; i >= 0; i--) {
// walk through all matches
if (matches[i] == null) continue;
listAsString = listAsString.substring(0, matches[i]!.end) + ':00' + listAsString.substring(matches[i]!.end);
}
print(listAsString);
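The negative-lookahead idea is not specific to Dart. As a minimal sketch, the same pattern can be applied per list entry in Python with re.sub; the function name complete_dates is hypothetical, and the sketch assumes, like the answer above, that only the minutes can be missing.

```python
import re

# A date plus hour that is NOT followed by ":" and two digits.
MISSING_MINUTES = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2})(?!:\d{2})")

def complete_dates(dates):
    """Append ':00' to every entry that has an hour but no minutes."""
    return [MISSING_MINUTES.sub(r"\1:00", date) for date in dates]

dates = ["2022-01-20 20:01", "2022-01-20 21", "2022-01-20 22:25", "2022-01-20 23:01"]
print(complete_dates(dates))
```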

How to match sequences of consecutive date-like characters in a string in Dart?

I have dates written as consecutive characters, like 20210215 and 14032020.
I am trying to convert them to date strings like 2021.02.15 and 14.03.2020.
My first problem is that the consecutive characters come in 2 formats, like:
1) 20210215
2) 14032020
My second problem is converting them to date strings without changing the format, like:
1) 2021.02.15
2) 14.03.2020
When I searched for a regex, I couldn't find any pattern that converts consecutive characters such as 20210215 into a date string such as 2021.02.15.
What is the correct regex pattern to convert both formats as described above in Dart?
UPDATE-1:
I need to turn the string "20210215" into "2021.02.15" as a string, not a DateTime. I also need to turn the string "14032020" into the string "14.03.2020". I don't want to convert to a DateTime string.
First I need to detect whether the year is at the beginning or the end of the string. Then I need to put a dot (.) between the day, month and year.
UPDATE-2:
This is the best I could find, but it turns a 02 day or month into 2. I need it kept as it is.
var timestampString = '13022020';//'20200213';
var re1 = RegExp(
r'^'
r'(?<year>\d{4})'
r'(?<month>\d{2})'
r'(?<day>\d{2})'
r'$',
);
var re2 = RegExp(
r'^'
r'(?<day>\d{2})'
r'(?<month>\d{2})'
r'(?<year>\d{4})'
r'$',
);
var dateTime;
var match1 = re1.firstMatch(timestampString);
if (match1 == null) {
var match2 = re2.firstMatch(timestampString);
if (match2 == null) {
//throw FormatException('Unrecognized timestamp format');
dateTime = '00.00.0000';
print('DATE_TIME: $dateTime');
} else {
var _day = int.parse(match2.namedGroup('day'));
var _month = int.parse(match2.namedGroup('month'));
var _year = int.parse(match2.namedGroup('year'));
dateTime = '$_day.$_month.$_year';
print('DATE_TIME(match2): $dateTime');
}
} else {
var _year = int.parse(match1.namedGroup('year'));
var _month = int.parse(match1.namedGroup('month'));
var _day = int.parse(match1.namedGroup('day'));
dateTime = '$_year.$_month.$_day';
print('DATE_TIME(match1): $dateTime');
}
Output:
DATE_TIME: 2020.2.13
But I need to get the output as 2020.02.13.
Second, match1 also matches 13022020 and prints 1302.20.20. If I remove the match2 section and the format is like 20200213, it works, but the leading 0 is still dropped, as shown above.
You can use
text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) => m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}")
The \b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b regex matches
\b - a word boundary
(?: - start of a non-capturing group:
((?:19|20)\d{2}) - year from 20th and 21st centuries
(0?[1-9]|1[0-2]) - month
(0?[1-9]|[12][0-9]|3[01]) - day
| - or
(0?[1-9]|[12][0-9]|3[01]) - day
(0?[1-9]|1[0-2]) - month
((?:19|20)\d{2}) - year
) - end of the group
\b - word boundary.
See the regex demo.
See a Dart demo:
void main() {
final text = '13022020 and 20200213 20111919';
print(text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) =>
m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}"));
}
Returning 13.02.2020 and 2020.02.13 20.11.1919.
If Group 4 is null, the first alternative matched, so we need to use Group 1, 2 and 3. Else, we join Group 4, 5 and 6 with a dot.
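For comparison, the same replacement can be sketched in Python, where re.sub accepts a function in place of a replacement string. The names DATE and add_dots are hypothetical; the pattern itself is copied from the answer above and only split across lines for readability.

```python
import re

DATE = re.compile(
    r"\b(?:"
    r"((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])"  # year, month, day
    r"|"
    r"(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2})"  # day, month, year
    r")\b"
)

def add_dots(match):
    # Groups 1-3 are set when the year comes first, groups 4-6 otherwise.
    parts = match.group(1, 2, 3) if match.group(4) is None else match.group(4, 5, 6)
    # The captured text is joined as-is, so leading zeros are preserved.
    return ".".join(parts)

print(DATE.sub(add_dots, "13022020 and 20200213 20111919"))
```

Because the groups are never converted to integers, a day or month such as 02 stays 02.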

How do I use readLine() to find matches in a file using regex and print them out to the console

I am trying to have the user input a class number and name to pull up the information on that class that I have in a file. I have figured out how to match the information using .toRegex(). I can't figure out how to use the user's input to find only the match they need, rather than every match in the file. I am very new to regex.
val pattern = """\d+\s+([A-Z]+).\s+(\d+)\s.+\s+\w.+""".toRegex()
val fileName = "src/main/kotlin/Enrollment.txt"
var lines = File(fileName).readLines()// reads every line on the file
do{
print("please enter class name")
var className = readLine()!!
print("please enter class number ")
var classNum = readLine()!!
for(i in 0..(lines.size-1) ){
var matchResult = pattern.find(lines[i])
if(matchResult != null) {
var (className,classNum) = matchResult.groupValues
println("className: $className, class number: $classNum ")
}
}
} while (readLine()!! != "EXIT")
example line from file
Name Num
0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0
See MatchResult#groupValues reference:
This list has size of groupCount + 1 where groupCount is the count
of groups in the regular expression. Groups are indexed from 1 to
groupCount and group with the index 0 corresponds to the entire
match.
If the group in the regular expression is optional and there were no
match captured by that group, corresponding item in groupValues
is an empty string.
You need
var (_, className,classNum) = matchResult.groupValues
See Kotlin demo:
val lines = "0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0 "
val pattern = """^\d+\s+([A-Z]+)\s+(\d+)""".toRegex()
var matchResult = pattern.find(lines)
if(matchResult != null) {
var (_, className,classNum) = matchResult.groupValues
println("className: $className, class number: $classNum ")
}
// => className: HELP, class number: 134
I simplified the regex a bit since find() does not require a full string match to
^\d+\s+([A-Z]+)\s+(\d+)
See the regex demo. Details:
^ - start of string
\d+ - one or more digits
\s+ - one or more whitespaces
([A-Z]+) - Group 1: one or more uppercase ASCII letters
\s+ - one or more whitespaces
(\d+) - Group 2: one or more digits
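The "group 0 is the whole match" convention is shared by most regex APIs. As a rough Python sketch of the same extraction (the function name extract is hypothetical), group(0) returns the entire match while groups() starts at group 1, so no placeholder is needed when unpacking:

```python
import re

PATTERN = re.compile(r"^\d+\s+([A-Z]+)\s+(\d+)")

def extract(line):
    """Return (class_name, class_num) from one enrollment line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # match.group(0) is the whole match; groups() holds groups 1 and 2 only.
    class_name, class_num = match.groups()
    return class_name, class_num

line = "0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0"
print(extract(line))
```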
You need to put the values you get from the user via readLine() into the pattern.
Loop over the lines and check whether the pattern occurs in each one, for example with pattern.containsMatchIn().
val className = readLine()!!.toUpperCase()
print("please enter class number ")
val classNum = readLine()!!
val pattern = """\s+\d+\s+$className.\s+$classNum""".toRegex()
for(i in 0..(lines.size-1) ) {
var matchResult = pattern.find(lines[i])
if(matchResult != null ){
if (pattern.containsMatchIn(lines[i])) {
println(lines[i])
}
}
}
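When user input is spliced into a pattern, it is usually worth escaping it so that characters such as "." or "+" are not read as regex syntax. Here is a minimal Python sketch of that idea; the function name find_class is hypothetical and the second sample line is made up purely for the demonstration.

```python
import re

def find_class(lines, class_name, class_num):
    """Return the lines whose name and number columns match the user's input."""
    # re.escape keeps characters typed by the user from acting as regex syntax.
    pattern = re.compile(
        r"^\s*\d+\s+" + re.escape(class_name.upper())
        + r"\s+" + re.escape(class_num) + r"\b"
    )
    return [line for line in lines if pattern.search(line)]

lines = [
    "0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0",
    # Made-up second record, only here to show that non-matching lines are skipped.
    "0670 HELP 135 SOMETHING ELSE 4.0 4.0 Jones Q 001 0174 MTWTh 9:30A 10:30A 23 15 8 4.0",
]
print(find_class(lines, "help", "134"))
```

The trailing \b keeps an input of 13 from matching class 134.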

Regex: Any letters, digit, and 0 up to 3 special chars

It seems I'm stuck with a simple regex for a password check.
What I'd like:
8 up to 30 symbols (Total)
With any of these: [A-Za-z\d]
And 0 up to 3 of these: [ -/:-@[-`{-~À-ÿ] (Special list)
I took a look here and then I wrote something like:
(?=.{8,15}$)(?=.*[A-Za-z\d])(?!([ -\/:-@[-`{-~À-ÿ])\1{4}).*
But it doesn't work, one can put more than 3 of the special chars list.
Any tips?
After shuffling your regex around a bit, it works for the examples you provided (I think you made a mistake with the example "A#~` C:", it should not match as it has 6 special chars):
(?!.*(?:[ -\/:-@[-`{-~À-ÿ].*){4})^[A-Za-z\d -\/:-@[-`{-~À-ÿ]{8,30}$
It only needs one lookahead instead of two, because the length and character set check can be done without lookahead: ^[A-Za-z\d -/:-@[-`{-~À-ÿ]{8,30}$
I changed the negative lookahead a bit to be correct. Your mistake was to only check for consecutive special chars, and you inserted the wildcards .* in a way that made the lookahead never hit (because the wildcard allowed everything).
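As a quick way to try the single-lookahead approach, here is a Python sketch. It assumes the intended special ranges are space to "/", ":" to "@", "[" to "`", "{" to "~" and À to ÿ; the names SPECIAL, ALLOWED, PASSWORD and is_valid, as well as the test inputs, are hypothetical.

```python
import re

# Assumed special ranges; "[" is escaped to keep the character class unambiguous.
SPECIAL = r" -/:-@\[-`{-~À-ÿ"
ALLOWED = r"A-Za-z\d" + SPECIAL

# Reject 4 or more specials anywhere, then check length and character set.
PASSWORD = re.compile(rf"(?!.*(?:[{SPECIAL}].*){{4}})^[{ALLOWED}]{{8,30}}$")

def is_valid(password):
    return PASSWORD.match(password) is not None

# Made-up inputs, chosen only to exercise each rule.
for candidate in ["abcdefgh", "abc!def?g#", "a!b?c#d$e", "short1!", "abcdefg€"]:
    print(repr(candidate), "->", is_valid(candidate))
```

The lookahead only succeeds in finding four specials when they exist, so non-consecutive specials are counted too.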
Will this work?
string characters = " -/:-#[-`{-~À-ÿ";
string letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
string[] inputs = {
"AABBCCDD",
"aaaaaaaa",
"11111111",
"a1a1a1a1",
"AA####AA",
"A1C EKFE",
"AADE F"
};
foreach (string input in inputs)
{
var counts = input.Cast<char>().Select(x => new { ch = characters.Contains(x.ToString()) ? 1 : 0, letter = letters.Contains(x.ToString()) ? 1 : 0, notmatch = (characters + letters).Contains(x) ? 0 : 1}).ToArray();
Boolean isMatch = (input.Length >= 8) && (input.Length <= 30) && (counts.Sum(x => x.notmatch) == 0) && (counts.Sum(x => x.ch) <= 3);
Console.WriteLine("Input : '{0}', Matches : '{1}'", input, isMatch ? "Match" : "No Match");
}
Console.ReadLine();
I would use: (if you want to stick to Regex)
var specialChars = @" -\/:-@[-`{-~À-ÿ";
var regularChars = @"A-Za-z\d";
// Braces of the quantifier are doubled so the interpolated string emits {8,30} literally
if (Regex.Match(password, $"^[{regularChars}{specialChars}]{{8,30}}$").Success && Regex.Matches(password, $"[{specialChars}]").Count <= 3)
{
//Password OK
}
It consists of:
Checking the length and that the password contains no illegal characters
Checking that it contains at most 3 special characters
A little faster:
var specialChars = @" -\/:-@[-`{-~À-ÿ";
var regularChars = @"A-Za-z\d";
var minChars = 8;
var maxChars = 30;
if (password.Length >= minChars && password.Length <= maxChars && Regex.Match(password, $"^[{regularChars}{specialChars}]+$").Success && Regex.Matches(password, $"[{specialChars}]").Count <= 3)
{
//Password OK
}
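The two-step idea (check length and character set first, then count the specials) can be sketched in Python as follows. The special ranges are assumed to be space to "/", ":" to "@", "[" to "`", "{" to "~" and À to ÿ, and every name below is hypothetical.

```python
import re

SPECIAL = r" -/:-@\[-`{-~À-ÿ"                           # assumed special ranges
ALLOWED_ONLY = re.compile(rf"[A-Za-z\d{SPECIAL}]+")     # legal characters only
ONE_SPECIAL = re.compile(rf"[{SPECIAL}]")               # a single special character

def is_valid(password, min_chars=8, max_chars=30, max_special=3):
    # Cheap length test first, so the regexes only run on plausible inputs.
    if not min_chars <= len(password) <= max_chars:
        return False
    if ALLOWED_ONLY.fullmatch(password) is None:
        return False
    return len(ONE_SPECIAL.findall(password)) <= max_special

print(is_valid("abc!def?g#"), is_valid("a!b?c#d$e"))
```

Splitting the check this way keeps each regex trivial, at the cost of scanning the string twice.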
Newbie here..I think I've managed to get what you need but one of the test cases you shared was kinda weird..
A#~` C:
OK -- Match (3 specials, it's okay)
Shouldn't this be failed because it has more than 3 specials?
Could you perhaps try this? If it works I'll type out the explanations for the regex.
https://regex101.com/r/KCL6R1/2
(?=^[A-Za-z\d -\/:-@[-`{-~À-ÿ]{8,30}$)^(?:[A-Za-z\d]*[ -\/:-@[-`{-~À-ÿ]){0,3}[A-Za-z\d]*$

Regex exclude exact digit from digits

Hello my fellow dream builders.
I am parsing time from twitter and I am using this regex:
{
match = /^[1]/.exec(obj.tweetTime);
if(match != null){
time = "1 hour ago";
}
else
{
match = /^[0-9]{1,2}/.exec(obj.tweetTime);
time = match + " hours ago";
}
}
My question is whether there is a simpler way to do this. As you can see, the time has up to 2 digits. I just want to format my output correctly (hour/hours, as you can see).
Is it possible to write only 1 regex and use only 1 conditional block?
PS: I am a beginner at regex, and I know /^[0-9]{1,2}/ practically allows numbers from 0 to 99, but as I said it works for my needs. I'm just asking whether it is possible to do this properly, since I lack the knowledge.
Thank you, much love <3
I would do it like this:
var match = obj.tweetTime.match(/^\d+$/);
if (match) {
var time = match[0] + ' hour' + (match[0] == 1 ? '' : 's') + ' ago';
}
EDIT Turns out the string is formatted! In which case:
var match = obj.tweetTime.match(/^(\d+)([smhd])$/);
if (match) {
var units = { s: 'second', m: 'minute', h: 'hour', d: 'day' },
time = match[1] + ' ' + units[match[2]] + (match[1] == 1 ? '' : 's') + ' ago';
}
To explain the regex:
^ Anchor matches to the beginning of the string
(\d+) Capture one or more digits in first group
([smhd]) Capture s, m, h or d in second group
$ Anchor to end of string
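The same capture-then-pluralize logic, sketched in Python for comparison. The function name time_ago is hypothetical, and the sketch assumes the input looks like "5h" or "12m", as in the edited answer above.

```python
import re

UNITS = {"s": "second", "m": "minute", "h": "hour", "d": "day"}

def time_ago(tweet_time):
    """Turn '1h' into '1 hour ago' and '12h' into '12 hours ago'."""
    match = re.fullmatch(r"(\d+)([smhd])", tweet_time)
    if match is None:
        return None
    count, unit = int(match.group(1)), UNITS[match.group(2)]
    # Only an exact count of 1 is singular, so 11 or 21 still get an "s".
    suffix = "" if count == 1 else "s"
    return f"{count} {unit}{suffix} ago"

for sample in ["1h", "12h", "5m", "1d"]:
    print(sample, "->", time_ago(sample))
```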