Regex match the number in between numbers - regex

I have a list of strings containing times in the following format:
15 min 43 sec
I want to extract only the 43. I was practicing at http://regexr.com/ but could not find an answer. The pattern I have come up with so far is \d+\s+min+\s+(\d*)+\s+sec, which matches the whole string, but it should match only 43. Thanks in advance for any help.

A rudimentary and fast solution can be... \s(\d+)\s
But try to find a better one ;)

Use lookaround:
(\d+)(?=\s+sec)
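As a rough sketch of the lookahead idea (Python's re module is used here purely for illustration, and the sample strings and variable names are made up for this example), the digits only match when whitespace and "sec" follow, and the lookahead itself is not part of the match:

```python
import re

# Digits that are immediately followed by whitespace and "sec".
# The lookahead (?=...) checks the suffix without consuming it,
# so the match itself contains only the digits.
SECONDS = re.compile(r"\d+(?=\s+sec)")

samples = ["15 min 43 sec", "7 min 5 sec"]  # illustrative inputs
seconds = [SECONDS.search(s).group() for s in samples]
print(seconds)
```

Because nothing but the digits is consumed, no capturing group is needed to read the value.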

The following pattern contains two capturing groups (for minutes and seconds) and allows for an arbitrary amount of whitespace between the values. If only the seconds need to be extracted, one group would suffice.
To extract the values, match against an input (using a Matcher) and read the value of the corresponding group (matcher.group(n), where 1 is the first group):
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("(\\d+)\\s*min\\s*(\\d+)\\s*sec");
String[] data = {"15 min 43 sec", "15min 43sec", "15 min 43 sec"};
for (String d : data) {
    Matcher matcher = pattern.matcher(d);
    // matches() requires the whole input to fit the pattern
    if (matcher.matches()) {
        int minutes = Integer.parseInt(matcher.group(1));
        int seconds = Integer.parseInt(matcher.group(2));
        System.out.println(minutes + ":" + seconds);
    } else {
        System.out.println("no match: " + d);
    }
}

Related

Looping over brackets with regex

Regex extracting 99% of desired result.
This is my line:
Customer Service Representative (CS) (TM PM *) **
*Can have more parameters, e.g. (TM PM TR), etc.
**Can have more parentheses, e.g. (TM PM) (RI) (AB CD), etc.
Except for the first bracket (CS in this case), which is group 1, I can have any number of parentheses, and any number of parameters within those parentheses, in group 2.
My attempt yields the desired result, but with the brackets included:
(\(.*?\))\s*(\(.*?\).*)
My result:
My desired result:
group 1 : CS
group 2 : if gg yiy rt jfjfj jhfjh uigtu
I want help removing those parentheses from the result.
My attempt:
\((.*?)\)\s*\((.*?\).*)
which gives me
Can someone help me with this? I need to remove all the brackets from group 2 as well. I have been at it for a long time but can't figure out a way. Thank you.
You can't match disjoint sections of text using a single match operation. When you need to repeat a group, there is no way to even use a replace approach with capturing groups.
You need a post-process step to remove ( and ) from Group 2 value.
So, after you get your matches with the current approach, remove all ( and ) from the Group 2 value with
Group2value = Group2value.Replace("(", "").Replace(")", "");
Here is one approach which uses string splitting along with the base string functions:
string input = "(CS) (if gg yiy rt) (jfjfj) (jhfjh uigtu)";
string[] parts = Regex.Split(input, "\\) \\(");
string grp1 = parts[0].Replace("(", "");
parts[0] = "";
parts[parts.Length - 1] = parts[parts.Length - 1].Replace(")", "");
string grp2 = string.Join(" ", parts).Trim();
Console.WriteLine(grp1);
Console.WriteLine(grp2);
CS
if gg yiy rt jfjfj jhfjh uigtu
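For comparison, here is a small Python sketch of a different technique: instead of splitting, it collects the contents of every bracket pair and joins all but the first. It assumes the brackets are never nested; the variable names are only illustrative.

```python
import re

line = "(CS) (if gg yiy rt) (jfjfj) (jhfjh uigtu)"

# Each match is the text between one pair of parentheses.
# [^()]* keeps a match from running across bracket boundaries
# (assumption: brackets are not nested).
parts = re.findall(r"\(([^()]*)\)", line)

group1 = parts[0]             # contents of the first bracket
group2 = " ".join(parts[1:])  # contents of all remaining brackets
print(group1)
print(group2)
```

This still takes two steps (find, then join), in line with the point above that a single group cannot hold disjoint pieces of text.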

RegEx for matching the first {N} chars and last {M} chars

I'm having an issue filtering tags in Grafana with an InfluxDB backend. I'm trying to filter out the first 8 characters and last 2 of the tag but I'm running into a really weird issue.
Here are some of the names...
GYPSKSVLMP2L1HBS135WH
GYPSKSVLMP2L2HBS135WH
RSHLKSVLMP1L1HBS045RD
RSHLKSVLMP35L1HBS135WH
RSHLKSVLMP35L2HBS135WH
I only want to return something like this:
MP8L1HBS225
MP24L2HBS045
I first started off using this expression:
[MP].*
But it only returns the following out of 148:
PAYNKSVLMP27L1HBS045RD
PAYNKSVLMP27L1HBS135WH
PAYNKSVLMP27L1HBS225BL
PAYNKSVLMP27L1HBS315BR
The pattern [MP].* matches a single M or P (it is a character class, not the literal sequence MP) and then any characters up to the end of the string, so it takes no account of the characters, digits or counts that should come afterwards.
If you want the match to start at MP and end on a digit, even though the value itself does not end with a digit, you could use:
MP[A-Z0-9]+[0-9]
If lookaheads are supported you might also use:
MP[A-Z0-9]+(?=[A-Z0-9]{2}$)
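To see what the lookahead variant returns, here is a quick check in Python (used only for illustration; Grafana/InfluxDB regex support may differ, so treat this as a sketch of the pattern's behaviour rather than of the dashboard setup). The tag names are taken from the question.

```python
import re

names = [
    "GYPSKSVLMP2L1HBS135WH",
    "RSHLKSVLMP1L1HBS045RD",
    "RSHLKSVLMP35L2HBS135WH",
]

# Start at "MP", take letters/digits, and stop so that exactly two
# letters/digits remain before the end of the string.
pattern = re.compile(r"MP[A-Z0-9]+(?=[A-Z0-9]{2}$)")

trimmed = [pattern.search(n).group() for n in names]
print(trimmed)
```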
You may not even need to touch MP. You can simply define a left and a right boundary, just as your question asks, and capture everything in between, which might be faster. For example, an expression similar to:
(\w{8})(.*)(\w{2})
The part you want is then the second capturing group, which you can reference as $2 in a replacement.
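A minimal Python sketch of the same boundary idea, assuming the anchored form of the expression (Python writes the group reference as \2 instead of $2; the helper name strip_ends is made up for this example):

```python
import re

# Anchored version of the three-group expression: 8 leading characters,
# the middle part, and 2 trailing characters.
BOUNDS = re.compile(r"^(\w{8})(.*)(\w{2})$")

def strip_ends(tag):
    # Replace the whole tag with its middle group; tags that do not
    # match (for example, shorter than 10 characters) are returned unchanged.
    return BOUNDS.sub(r"\2", tag)

middle = strip_ends("RSHLKSVLMP35L2HBS135WH")
print(middle)
```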
Performance
This JavaScript snippet measures the performance of this expression using a simple loop of 1 million iterations.
var repeat = 1000000;
var start = Date.now();
var match;
for (var i = repeat; i > 0; i--) {
    var string = "RSHLKSVLMP35L2HBS135WH";
    var regex = /^(\w{8})(.*)(\w{2})$/g;
    // keep only the second capturing group
    match = string.replace(regex, "$2");
}
var end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
Try Regex: (?<=\w{8})\w+(?=\w{2})

Regex to get certains number of string - Python

I need to get the price from this string, "Prix\xa0de base : 26 900 euros – bonus", but there is a 0 in 'Prix\xa0de' and I don't know how to do it.
Thanks for your help!
You can use something like this:
import re
subject = "Prix\xa0de base : 26 900 euros – bonus"
# capture the digits and spaces that follow the colon
match = re.search(r"^.*:\s+([\d ]+)\s+", subject)
if match:
    result = match.group(1)
else:
    result = ""
result will be 26 900
If it is always followed by the word 'euros', then it is as simple as:
'(\d+ ?\d+) euros'
Capturing the number (or number with a space as separator) before 'euros'
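Here is a short sketch of that second pattern in use. Note that \xa0 in a Python string literal is the escape for a single non-breaking space character, so there is no digit 0 in the text for the regex to trip over. The conversion to int at the end is an extra step added for illustration.

```python
import re

# "\xa0" is one non-breaking space character, not a backslash plus "xa0".
# "\u2013" is the en dash from the original string.
subject = "Prix\xa0de base : 26 900 euros \u2013 bonus"

match = re.search(r"(\d+ ?\d+) euros", subject)
price_text = match.group(1) if match else ""

# Optional: drop the thousands separator to get a number.
price = int(price_text.replace(" ", "")) if price_text else None
print(price_text, price)
```

As written, the pattern needs at least two digits and allows at most one space, so a price such as "1 250 000 euros" would need a more general pattern.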

Using regex to capture phone numbers with spaces inserted at differing points

I want to be able to extract a complete phone number from text, irrespective of how many spaces interrupt the number.
For example in the passage:
I think Emily was her name, and that her number was either 0421032614 or 0423 032 615 or 04321 98 564
I would like to extract:
0421032614
0423032615
0432198564
I can extract the first two using
(\d{4}[\s]?)(\d{3}[\s]?)+
But this is contingent on me knowing ahead of time how the ten numbers will be grouped (i.e. where the spaces will be). Is there any way to capture the ten numbers with a more flexible pattern?
You need to remove all whitespace, then loop over the matches and read the captured group. This works here because the numbers are separated by words, so digits from different numbers cannot run together:
public static void main(String[] args) {
    String reg = "(\\d{10})";
    String word = "I think Emily was her name, and that her number was either 0421032614 or 0423 032 615 or 04321 98 564";
    word = word.replaceAll("\\s+", ""); // replace all the whitespace with nothing
    Pattern pat = Pattern.compile(reg);
    Matcher mat = pat.matcher(word);
    while (mat.find()) {
        for (int i = 1; i <= mat.groupCount(); i++) {
            System.out.println(mat.group(i));
        }
    }
}
output is
0421032614
0423032615
0432198564
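An alternative sketch, in Python and using a different technique from the answer above: rather than stripping whitespace from the whole text first, allow at most one whitespace character between consecutive digits and clean up each match afterwards. The "at most one" limit is an assumption made for this example.

```python
import re

text = ("I think Emily was her name, and that her number was either "
        "0421032614 or 0423 032 615 or 04321 98 564")

# One digit, then nine more digits, each optionally preceded by a
# single whitespace character: ten digits in total, grouped any way.
PHONE = re.compile(r"\d(?:\s?\d){9}")

# Remove the embedded whitespace from every match.
numbers = [re.sub(r"\s", "", m) for m in PHONE.findall(text)]
print(numbers)
```

Since the surrounding text is left intact, two separate numbers written next to each other are less likely to be merged than when all whitespace is removed up front.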

decision on regular expression length

I want to accomplish the following requirements using regex only (no C# code can be used):
• If the BTN length is 12 and the BTN starts with 0[123456789], remove one digit from the left and one digit from the right.
WORKING CORRECTLY
• If the BTN length is 12 and the case above does not apply, always return the 10 rightmost digits by removing 2 from the start (e.g. 491234567891 should be changed to 1234567891).
NOT WORKING CORRECTLY
• If the BTN length is 11, remove one digit from the left. WORKING CORRECTLY
For BTNs of length <= 10 nothing needs to be done; they can remain as they are, or the regex may fail on them, which is acceptable.
Using SQL, this can be achieved like this:
case when len(BTN) = 12 and BTN like '0[123456789]%' then SUBSTRING(BTN,2,10) else RIGHT(BTN,10) end
But how can I do this using regex?
So far I have been able to get some results correct using this regex:
[0*|\d\d]*(.{10}) but with it I am not able to correctly remove the first and last characters of a BTN such as 015732888810 to get 1573288881; the regex returns 5732888810, which is wrong.
The code is:
string s = "111112573288881,0573288881000,057328888105,005732888810,15732888815,344956345335,004171511326,01777203102,1772576210,015732888810,494956345335";
string[] arr = s.Split(',');
foreach (string ss in arr)
{
    // Match mm = Regex.Match(ss, @"\b(?:00(\d{10})|0(\d{10})\d?|(\d{10}))\b");
    // Match mm = Regex.Match(ss, "0*(.{10})");
    // ([0*|\\d\\d]*(.{10}))|
    Match mm = Regex.Match(ss, "[0*|\\d\\d]*(.{10})");
    // Match mm = Regex.Match(ss, "(?(^\\d{12}$)(.^{12}$)|(.^{10}$))");
    // Match mm = Regex.Match(ss, "(info)[0*|\\d\\d]*(.{10}) (?(1)[0*|\\d\\d]*(.{10})|[0*|\\d\\d]*(.{10}))");
    string m = mm.Groups[1].Value;
    Console.WriteLine("Original BTN :" + ss + "\t\tModified::" + m);
}
This should work:
(0(\d{10})0|\d\d(\d{10}))
UPDATE:
(0(\d{10})0|\d{1,2}(\d{10}))
1st alternate will match 12-digits with 0 on left and 0 on right and give you only 10 in between.
2nd alternate will match 11 or 12 digits and give you the right 10.
EDIT:
The regex matches the spec, but your code doesn't read the results correctly. Try this:
Match mm = Regex.Match(ss, "(0(\\d{10})0|\\d{1,2}(\\d{10}))");
string m = mm.Groups[2].Value;
if (string.IsNullOrEmpty(m))
m = mm.Groups[3].Value;
Groups are as follows:
index 0: returns the full match
index 1: returns everything inside the outer parentheses
index 2: returns only what matches the inner group of the first alternative
index 3: returns only what matches the inner group of the second alternative
NOTE: This does not deal with anything greater than 12 digits or less than 11. Those entries will either fail or return 10 digits from somewhere. If you want results for those use this:
"(0(\\d{10})0|\\d*(\\d{10}))"
You'll get rightmost 10 digits for more than 12 digits, 10 digits for 10 digits, nothing for less than 10 digits.
EDIT:
This one should cover your additional requirements from the comments:
"^(?:0|\\d*)(\\d{10})0?$"
The (?:) makes a grouping excluded from the Groups returned.
EDIT:
This one might work:
"^(?:0?|\\d*)(\\d{10})\\d?$"
(?(^\d{12}$)(?(^0[1-9])0?(?<digit>.{10})|\d*(?<digit>.{10}))|\d*(?<digit>.{10}))
This does exactly the same thing as the SQL query and always returns the result in Groups[1], so I didn't have to change the code at all :)
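For readers outside .NET: Python's re module does not accept a lookaround as the condition in (?(...)...|...), so the expression above cannot be used there as is. The sketch below is one possible equivalent built from two lookaheads instead. It is an illustration of the same rule as the SQL query, not the pattern from the answers, and the helper name normalize is made up.

```python
import re

# Alternative 1: a leading 0 followed by [1-9] and exactly 10 more digits
#                (12 digits in total) -> capture digits 2..11.
# Alternative 2: skip leading digits until exactly 10 remain
#                -> capture the rightmost 10 digits.
BTN = re.compile(r"^(?:0(?=[1-9]\d{10}$)|\d*(?=\d{10}$))(\d{10})")

def normalize(btn):
    m = BTN.match(btn)
    # Inputs with fewer than 10 digits do not match and are returned as is.
    return m.group(1) if m else btn

samples = ["015732888810", "491234567891", "01777203102", "1772576210"]
results = [normalize(s) for s in samples]
print(results)
```

The result is always in group 1, which mirrors the single-group behaviour described above.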