Regex exclude exact digit from digits - regex

Hello my fellow dream builders.
I am parsing time from twitter and I am using this regex:
{
match = /^[1]/.exec(obj.tweetTime);
if(match != null){
time = "1 hour ago";
}
else
{
match = /^[0-9]{1,2}/.exec(obj.tweetTime);
time = match + " hours ago";
}
}
My question is whether there is a simpler way to do this. As you can see, the time has up to 2 digits. I just want the output to read correctly: "hour" vs. "hours".
Is it possible to write only 1 regex and use only 1 conditional block?
PS: I am a beginner at regex, and I know /^[0-9]{1,2}/ practically allows numbers from 0 to 99, but as I said it works for my needs. I am just asking whether it is possible to do this properly, since I lack the knowledge.
Thank you, much love <3

I would do it like this:
var match = obj.tweetTime.match(/^\d+$/);
if (match) {
var time = match[0] + ' hour' + (match[0] == 1 ? '' : 's') + ' ago';
}
EDIT Turns out the string is formatted! In which case:
var match = obj.tweetTime.match(/^(\d+)([smhd])$/);
if (match) {
var units = { s: 'second', m: 'minute', h: 'hour', d: 'day' },
time = match[1] + ' ' + units[match[2]] + (match[1] == 1 ? '' : 's') + ' ago';
}
To explain the regex:
^ Anchor matches to the beginning of the string
(\d+) Capture one or more digits in first group
([smhd]) Capture s, m, h or d in second group
$ Anchor to end of string
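The second snippet can be wrapped in a small function for reuse. This is only a sketch, assuming tweetTime is a string such as "1h" or "23m"; the name formatTweetTime and the null fallback for unrecognized input are my own choices, not something from the question.

```javascript
// Hypothetical helper: turns "1h" / "23m" / "2d" into "1 hour ago" / "23 minutes ago" / "2 days ago".
function formatTweetTime(tweetTime) {
  const units = { s: 'second', m: 'minute', h: 'hour', d: 'day' };
  const match = /^(\d+)([smhd])$/.exec(tweetTime);
  if (!match) return null; // assumption: unrecognized input yields null
  const count = Number(match[1]);
  // Only a count of exactly 1 takes the singular unit name.
  return count + ' ' + units[match[2]] + (count === 1 ? '' : 's') + ' ago';
}

console.log(formatTweetTime('1h'));
console.log(formatTweetTime('12h'));
```

Converting the captured digits with Number() makes the singular check explicit instead of relying on the loose == comparison between a string and a number.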

Related

How to match sequences of consecutive Date like characters string in Dart?

I have strings of consecutive digits that represent dates, like 20210215 and 14032020.
I am trying to convert them to date strings like 2021.02.15 and 14.03.2020.
My first problem is that the digit strings come in 2 formats:
1) 20210215
2) 14032020
My second problem is converting them to date strings without changing the order of the parts:
1) 2021.02.15
2) 14.03.2020
When I searched for a regex, I couldn't find any pattern that converts digit strings like 20210215 into date strings like 2021.02.15.
What is the correct regex pattern to convert both formats as described above in Dart?
UPDATE-1:
I need to turn the string "20210215" into "2021.02.15" as a string, not a DateTime. I also need to turn the string "14032020" into the string "14.03.2020". I don't want to convert to a DateTime.
First I need to detect whether the year is at the beginning of the string or at the end of it, then put a dot (.) between the day, month and year.
UPDATE-2:
This is the best I could find, but it turns a day or month of 02 into 2. I need it kept as it is.
var timestampString = '13022020';//'20200213';
var re1 = RegExp(
r'^'
r'(?<year>\d{4})'
r'(?<month>\d{2})'
r'(?<day>\d{2})'
r'$',
);
var re2 = RegExp(
r'^'
r'(?<day>\d{2})'
r'(?<month>\d{2})'
r'(?<year>\d{4})'
r'$',
);
var dateTime;
var match1 = re1.firstMatch(timestampString);
if (match1 == null) {
var match2 = re2.firstMatch(timestampString);
if (match2 == null) {
//throw FormatException('Unrecognized timestamp format');
dateTime = '00.00.0000';
print('DATE_TIME: $dateTime');
} else {
var _day = int.parse(match2.namedGroup('day'));
var _month = int.parse(match2.namedGroup('month'));
var _year = int.parse(match2.namedGroup('year'));
dateTime = '$_day.$_month.$_year';
print('DATE_TIME(match2): $dateTime');
}
} else {
var _year = int.parse(match1.namedGroup('year'));
var _month = int.parse(match1.namedGroup('month'));
var _day = int.parse(match1.namedGroup('day'));
dateTime = '$_year.$_month.$_day';
print('DATE_TIME(match1): $dateTime');
}
Output:
DATE_TIME: 2020.2.13
But I need to get output as 2020.02.13.
The second problem is that re1 also matches 13022020 and prints 1302.20.20. If I remove the match2 section and the input is like 20200213, it works, but it drops the leading 0 as shown above.
You can use
text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) => m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}")
The \b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b regex matches
\b - a word boundary
(?: - start of a non-capturing group:
((?:19|20)\d{2}) - year from 20th and 21st centuries
(0?[1-9]|1[0-2]) - month
(0?[1-9]|[12][0-9]|3[01]) - day
| - or
(0?[1-9]|[12][0-9]|3[01]) - day
(0?[1-9]|1[0-2]) - month
((?:19|20)\d{2}) - year
) - end of the group
\b - word boundary.
See the regex demo.
See a Dart demo:
void main() {
final text = '13022020 and 20200213 20111919';
print(text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) =>
m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}"));
}
Returning 13.02.2020 and 2020.02.13 20.11.1919.
If Group 4 is null, the first alternative matched, so we need to use Group 1, 2 and 3. Else, we join Group 4, 5 and 6 with a dot.
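For comparison, the same idea can be sketched in JavaScript (not Dart). The regex is the one from the answer above; the function name dotDates is hypothetical. The point it illustrates is that the captured groups are joined as strings, so leading zeros survive because nothing is parsed as a number.

```javascript
// Hypothetical helper: inserts dots into yyyymmdd / ddmmyyyy digit runs found in the text.
function dotDates(text) {
  const re = /\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b/g;
  // Groups of the alternative that did not match are undefined in JavaScript.
  return text.replace(re, (whole, y1, m1, d1, d2, m2, y2) =>
    y1 !== undefined ? `${y1}.${m1}.${d1}` : `${d2}.${m2}.${y2}`);
}

console.log(dotDates('13022020 and 20200213 20111919'));
```

Note that a string such as 20122012 is valid in both layouts; because the year-first alternative is listed first, it wins in that case.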

How do I use readLine() to find matches in a file using regex and print them out to the console

I am trying to have the user input a class number and name to pull up a list of information on that class that I have in a file. I have figured out how to match the information using .toRegex. I can't figure out how to use the user's input to find the match they need rather than every match in the file. I am very new to regex.
val pattern = """\d+\s+([A-Z]+).\s+(\d+)\s.+\s+\w.+""".toRegex()
val fileName = "src/main/kotlin/Enrollment.txt"
var lines = File(fileName).readLines()// reads every line on the file
do{
print("please enter class name")
var className = readLine()!!
print("please enter class number ")
var classNum = readLine()!!
for(i in 0..(lines.size-1) ){
var matchResult = pattern.find(lines[i])
if(matchResult != null) {
var (className,classNum) = matchResult.groupValues
println("className: $className, class number: $classNum ")
}
}
} while (readLine()!! != "EXIT")
example line from file
Name Num
0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0
See MatchResult#groupValues reference:
This list has size of groupCount + 1 where groupCount is the count
of groups in the regular expression. Groups are indexed from 1 to
groupCount and group with the index 0 corresponds to the entire
match.
If the group in the regular expression is optional and there were no
match captured by that group, corresponding item in groupValues
is an empty string.
You need
var (_, className,classNum) = matchResult.groupValues
See Kotlin demo:
val lines = "0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0 "
val pattern = """^\d+\s+([A-Z]+)\s+(\d+)""".toRegex()
var matchResult = pattern.find(lines)
if(matchResult != null) {
var (_, className,classNum) = matchResult.groupValues
println("className: $className, class number: $classNum ")
}
// => className: HELP, class number: 134
Since find() does not require a full string match, I simplified the regex a bit to
^\d+\s+([A-Z]+)\s+(\d+)
See the regex demo. Details:
^ - start of string
\d+ - one or more digits
\s+ - one or more whitespaces
([A-Z]+) - Group 1: one or more uppercase ASCII letters
\s+ - one or more whitespaces
(\d+) - Group 2: one or more digits
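The same "index 0 is the whole match" rule applies to regex results in other languages. As a rough JavaScript illustration (the function name parseClassLine is made up for this example), array destructuring can skip index 0 in the same way the `_` placeholder does in Kotlin:

```javascript
// Hypothetical helper: extracts the class name and number from one enrollment line.
function parseClassLine(line) {
  const match = /^\d+\s+([A-Z]+)\s+(\d+)/.exec(line);
  if (!match) return null;
  // The leading comma skips index 0, which holds the entire match.
  const [, className, classNum] = match;
  return { className, classNum };
}

const sample = '0669 HELP 134 AN CV THING ETC 4.0 4.0 Smith P 001 0173 MTWTh 9:30A 10:30A 23 15 8 4.0';
console.log(parseClassLine(sample));
```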
You need to build the pattern from the values you read from the user with readLine().
Then loop over the lines and print each line in which the pattern is found, using pattern.containsMatchIn():
val className = readLine()!!.toUpperCase()
print("please enter class number ")
val classNum = readLine()!!
// Note: the "." after $className matches any single character
val pattern = """\s+\d+\s+$className.\s+$classNum""".toRegex()
for (line in lines) {
if (pattern.containsMatchIn(line)) {
println(line)
}
}
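One caveat when user input is interpolated into a pattern: characters such as "." or "+" in the input are treated as regex syntax. The following JavaScript sketch shows the general idea of escaping the input first. Both function names are hypothetical, and the pattern assumes whitespace-separated columns as in the example line from the question.

```javascript
// Hypothetical helper: escapes regex metacharacters so user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns only the lines whose class name and number match the user's input.
function findClassLines(lines, className, classNum) {
  const pattern = new RegExp(
    '^\\s*\\d+\\s+' + escapeRegExp(className.toUpperCase()) +
    '\\s+' + escapeRegExp(classNum) + '\\b'
  );
  return lines.filter((line) => pattern.test(line));
}

const lines = [
  '0669 HELP 134 AN CV THING ETC 4.0',
  '0700 MATH 101 SOMETHING ELSE 3.0',
];
console.log(findClassLines(lines, 'help', '134'));
```

The trailing \b keeps an input of 134 from also matching a class numbered 1345.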

RegEx parsing email address checking for quality

I'm very new to regex; I'm reading and learning but need some direction.
I need to parse email addresses for quality:
fax=1AreaCodeNumber@domain.com
or 1AreaCodeNumber@doamin.com
I need to ensure the 1AreaCodeNumber part is exactly 10 digits and starts with 1.
If it is 9 digits and the first digit is not a 1, add the 1.
Any help would be greatly appreciated.
JavaScript:
const regex = /^(1?)\d{9}@(domain\.com|doamin\.com)/;
let str = `1123456879@domain.com`;
let m;
if ((m = regex.exec(str)) !== null) {
// The result can be accessed through the `m`-variable.
// m[1] is empty when the leading 1 is missing, so prepend it.
if(m[1] === '') {
str = '1' + str;
}
console.log('matched: ' + str);
} else {
console.log('No Match');
}
1?\d{9} will match a 1 followed by 9 digits, or 9 digits without the 1. The full regex would be (fax=)?1?\d{9}@domain\.com

Regular expression that matches string equals to one in a group

E.g. I want to match string with the same word at the end as at the begin, so that following strings match:
aaa dsfj gjroo gnfsdj riier aaa
sdf foiqjf skdfjqei adf sdf sdjfei sdf
rew123 jefqeoi03945 jq984rjfa;p94 ajefoj384 rew123
This one could do the job:
/^(\w+\b).*\b\1$/
explanation:
/ : regex delimiter
^ : start of string
( : start capture group 1
\w+ : one or more word character
\b : word boundary
) : end of group 1
.* : any number of any char
\b : word boundary
\1 : group 1
$ : end of string
/ : regex delimiter
M42's answer is ok except degenerate cases -- it will not match string with only one word. In order to accept those within one regexp use:
/^(?:(\w+\b).*\b\1|\w+)$/
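A quick way to check the combined pattern is to run it against the sample strings plus a few edge cases. This is just a sketch in JavaScript; the variable name sameEdgeWords and the extra test strings are mine.

```javascript
// Backreference \1 must repeat the first word at the very end; a lone word is also accepted.
const sameEdgeWords = /^(?:(\w+\b).*\b\1|\w+)$/;

const samples = [
  'aaa dsfj gjroo gnfsdj riier aaa',
  'rew123 jefqeoi03945 jq984rjfa;p94 ajefoj384 rew123',
  'aaa',      // single word: matched by the second alternative
  'aaa bbb',  // different edge words
  'aaa xaaa', // last word merely ends with the first word
];

for (const s of samples) {
  console.log(sameEdgeWords.test(s), s);
}
```

The \b before \1 is what rejects the last sample: there is no word boundary between "x" and "aaa".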
Also, matching only the necessary part may be significantly faster on very large strings. Here are my solutions in JavaScript:
RegExp:
function areEdgeWordsTheSame(str) {
var m = str.match(/^\w+\b/);
// \b keeps the last word from matching as a mere suffix, e.g. "aaa xaaa"
return m !== null && new RegExp('\\b' + m[0] + '$').test(str);
}
String:
function areEdgeWordsTheSame(str) {
var first = str.indexOf(' ');
if (first < 0) return true;
// compare the first word with everything after the last space
return str.slice(0, first) === str.slice(str.lastIndexOf(' ') + 1);
}
I don't think a regular expression is the right choice here. Why not split the line into an array and compare the first and the last item:
In c#:
string[] words = line.Split(' ');
return words.Length >= 2 && words[0] == words[words.Length - 1];

Parse time string using regex

My time string may be in one of the following formats (x and y are integers, h and m are literal characters):
xh ym
xh
ym
y
Examples:
1h 20m
45m
2h
120
What regular expression should I write to get x and y numbers from such string?
(\d+)([mh]?)(?:\s+(\d+)m)?
You can then inspect groups 1-3. For your examples those would be
('1', 'h', '20')
('45', 'm', '')
('2', 'h', '')
('120', '', '')
As always, you might want to use some anchors ^, $, \b...
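To make the group inspection concrete, here is a small JavaScript sketch that anchors the pattern and converts the result to total minutes. The name toMinutes is hypothetical, and treating a bare number as minutes is an assumption (the question does not say which unit it is). In JavaScript an unmatched optional group is undefined rather than an empty string.

```javascript
// Hypothetical helper: returns total minutes, or null when the string is not recognized.
function toMinutes(text) {
  const m = /^(\d+)([mh]?)(?:\s+(\d+)m)?$/.exec(text);
  if (!m) return null;
  const [, first, unit, second] = m;
  if (unit === 'h') return Number(first) * 60 + Number(second || 0);
  // A trailing "ym" part only makes sense after hours, so reject e.g. "45m 20m".
  return second === undefined ? Number(first) : null;
}

for (const s of ['1h 20m', '45m', '2h', '120']) {
  console.log(s, '->', toMinutes(s));
}
```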
I'm going to assume you're using .NET due to your username. :)
I think in this case, it's easier to use TimeSpan.ParseExact for this task.
You can specify a list of permitted formats (see here for the format for these) and ParseExact will read in the TimeSpan according to them.
Here is an example:
var formats = new[]{"h'h'", "h'h 'm'm'", "m'm'", "%m"};
// I have assumed that a single number means minutes
foreach (var item in new[]{"23","1h 45m","1h","45m"})
{
TimeSpan timespan;
if (TimeSpan.TryParseExact(item, formats, CultureInfo.InvariantCulture, out timespan))
{
// valid
Console.WriteLine(timespan);
}
}
Output:
00:23:00
01:45:00
01:00:00
00:45:00
The only problem with this is that it is rather inflexible. Additional whitespace in the middle will fail to validate. A more robust solution using Regex is:
var items = new[]{"23","1h 45m", "45m", "1h", "1h 45", "1h 45", "1h45m"};
foreach (var item in items)
{
var match = Regex.Match(item, @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$", RegexOptions.ExplicitCapture);
if (match.Success)
{
int hours;
int.TryParse(match.Groups["hours"].Value, out hours); // hours == 0 on failure
int minutes;
int.TryParse(match.Groups["minutes"].Value, out minutes);
Console.WriteLine(new TimeSpan(0, hours, minutes, 0));
}
}
Breakdown of the regex:
^ - start of string
(?=\d) - must start with a digit (do this because both parts are marked optional, but we want to make sure at least one is present)
((?<hours>\d+)h)? - hours (optional, capture into named group)
\s* - whitespace (optional)
((?<minutes>\d+)m?)? - minutes (optional, capture into named group, the 'm' is optional too)
$ - end of string
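The same pattern carries over to other regex flavours that support lookahead and named groups. As an illustration only, here is a JavaScript port; the function name parseDuration is hypothetical, and the outer groups are written as non-capturing because JavaScript has no ExplicitCapture option.

```javascript
// Hypothetical helper: returns { hours, minutes }, or null when the input does not match.
function parseDuration(text) {
  const re = /^(?=\d)(?:(?<hours>\d+)h)?\s*(?:(?<minutes>\d+)m?)?$/;
  const m = re.exec(text);
  if (!m) return null;
  // Named groups that did not take part in the match are undefined, so default them to 0.
  return {
    hours: Number(m.groups.hours || 0),
    minutes: Number(m.groups.minutes || 0),
  };
}

for (const s of ['23', '1h 45m', '45m', '1h', '1h 45', '1h45m']) {
  console.log(s, '->', JSON.stringify(parseDuration(s)));
}
```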
I would say that mhyfritz's solution is simple, efficient and good if your input is limited to what you have shown.
If you ever need to handle corner cases, you can use a more discriminative expression:
^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$
But it can be overkill...
(get rid of ^ and $ if you need to detect such pattern in a larger body of text, of course).
Try this one: ^(?:(\d+)h\s*)?(?:(\d+)m?)?$:
var s = new[] { "1h 20m", "45m", "2h", "120", "1m 20m" };
foreach (var ss in s)
{
var m = Regex.Match(ss, @"^(?:(\d+)h\s*)?(?:(\d+)m?)?$");
int hour = m.Groups[1].Value == "" ? 0 : int.Parse(m.Groups[1].Value);
int min = m.Groups[2].Value == "" ? 0 : int.Parse(m.Groups[2].Value);
if (hour != 0 || min != 0)
Console.WriteLine("Hours: " + hour + ", Mins: " + min);
else
Console.WriteLine("No match!");
}
In bash:
echo "$string" | awk '{for(i=1;i<=NF;i++) print $i}' | sed 's/[hm]//g'