I am trying to match strings that contain two or more of the following words: Strength, Intelligence and Dexterity, each with a value of 45 or higher. This is an example of a string that would return a match:
+51 to Strength
+47 to Intelligence
+79 to maximum Life
+73 to maximum Mana
28% increased Rarity of Items found
+37% to Cold Resistance
The regular expression is to be entered in a game (Path of Exile). It can be at most 50 characters long.
The fourth bird has found a solution, but the string is more than 50 characters:
\b[45][0-9] to (?:Str|Int|Dex)[\s\S]*?\b[45][0-9] to (?:Str|Int|Dex).
Is there a way to find a similar expression, but with 50 characters or less?
Thanks in advance!
You can shorten it using a group and repeat that group with a quantifier, and write [0-9] as \d for example:
^(?:[\s\S]*?\b[45]\d to (?:Str|Int|Dex)){2}
The pattern matches:
^ Start of string
(?: Non capture group
[\s\S]*? Match any char as few as possible
\b[45]\d to (?:Str|Int|Dex) Match 4 or 5 followed by a digit, to and one of Str Int Dex
){2} Close the non capture group and repeat it 2 times
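As a quick check of the shortened pattern, here is a minimal Python sketch. It assumes the game's regex engine treats `[\s\S]`, `\b` and `\d` the way Python's `re` module does, which the question does not confirm; the names `PATTERN`, `two_stats`, `one_stat` and `has_two_stats` are made up for illustration.

```python
import re

# Shortened pattern from the answer above (43 characters, under the 50 limit).
PATTERN = r"^(?:[\s\S]*?\b[45]\d to (?:Str|Int|Dex)){2}"

# Item text with two qualifying stats (taken from the question's sample).
two_stats = "+51 to Strength\n+47 to Intelligence\n+79 to maximum Life\n+37% to Cold Resistance"
# Hypothetical item text with only one qualifying stat.
one_stat = "+51 to Strength\n+30 to Intelligence\n+79 to maximum Life"


def has_two_stats(text: str) -> bool:
    """Return True when the pattern finds two qualifying stat lines."""
    return re.search(PATTERN, text) is not None


print(len(PATTERN), has_two_stats(two_stats), has_two_stats(one_stat))
```

Note that `[45]\d` accepts the values 40 through 59, so it is looser than "45 or higher" at the low end and does not cover values of 60 and above.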
I am trying to pull the dollar amount from some invoices. I need the match to be on the word directly after the word "TOTAL". Also, the word total may sometimes appear with a colon after it (i.e. Total:). An example text sample is shown below:
4 Discover Credit Purchase - c REF#: 02353R TOTAL: 40.00 AID: 1523Q1Q TC: mzQm 40.00 CHANGE 0.00 TOTAL NUMBER OF ITEMS SOLD = 0 12/23/17 Ql:38piii 414 9 76 1G6 THANK YOU FOR SHOPPING KR08ER Now Hiring - Apply Today!
In the case of the sample above, the match should be "40.00".
The Regex statement that I wrote:
(?<=total)([^\n\r]*)
pulls EVERYTHING after the word "total". I only want the very next word.
This (unlike other answers so far) matches only the total amount (ie without needing to examine groups):
((?<=\bTOTAL\b )|(?<=\bTOTAL\b: ))[\d.]+
See live demo matching when input has, and doesn’t have, the colon after TOTAL.
Two lookbehinds (which don't consume input) are needed because, in most regex flavors, a lookbehind can't have variable length. The optional colon is handled by an alternation (a regex OR via ...|...) of 2 lookbehinds, one with and one without the colon.
If TOTAL can be in any case, add (?i) (the ignore case flag) to the start of the regex.
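The lookbehind approach can be tried out with a short Python sketch. Python's `re` also requires fixed-width lookbehinds, so the two-lookbehind alternation carries over unchanged. The names `TOTAL_AMOUNT`, `receipt` and `total_amount` are illustrative, and the case-insensitive flag is passed as `re.IGNORECASE` instead of an inline `(?i)`.

```python
import re

# Two fixed-width lookbehinds: "TOTAL " and "TOTAL: ".
TOTAL_AMOUNT = re.compile(r"((?<=\bTOTAL\b )|(?<=\bTOTAL\b: ))[\d.]+", re.IGNORECASE)

# Shortened version of the sample text from the question.
receipt = "REF#: 02353R TOTAL: 40.00 AID: 1523Q1Q TC: mzQm 40.00 CHANGE 0.00 TOTAL NUMBER OF ITEMS SOLD = 0"


def total_amount(text):
    """Return the amount right after TOTAL (the whole match), or None."""
    match = TOTAL_AMOUNT.search(text)
    return match.group(0) if match else None


print(total_amount(receipt))
```

Because the amount is the whole match, no group needs to be inspected; "TOTAL NUMBER" is skipped since no digit follows it.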
What you could do is match total followed by an optional colon :? and zero or more times a whitespace character \s* and capture in a group one or more digits followed by an optional part that matches a dot and one or more digits.
To match an uppercase or lowercase variant of total, you could make the match case insensitive, for example by adding the modifier (?i) or by using a case-insensitive flag.
\btotal:?\s*(\d+(?:\.\d+)?)
The value 40.00 will be in group 1.
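A minimal Python sketch of the capture-group variant, assuming the case-insensitive flag is supplied as `re.IGNORECASE`; `AMOUNT_AFTER_TOTAL`, `sample` and `amounts_after_total` are hypothetical names.

```python
import re

# total, optional colon, optional whitespace, then the number in group 1.
AMOUNT_AFTER_TOTAL = re.compile(r"\btotal:?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)

# Shortened version of the sample text from the question.
sample = "REF#: 02353R TOTAL: 40.00 AID: 1523Q1Q CHANGE 0.00 TOTAL NUMBER OF ITEMS SOLD = 0"


def amounts_after_total(text):
    """Return every group-1 value, i.e. each number that directly follows 'total'."""
    return AMOUNT_AFTER_TOTAL.findall(text)


print(amounts_after_total(sample))
```

With a single capturing group, `findall` returns the group-1 strings directly, so the second "TOTAL" (followed by "NUMBER") contributes nothing.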
Explanations are in the regex pattern.
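using System;
using System.Text.RegularExpressions;

string str = "4 Discover Credit Purchase - c REF#: 02353R TOTAL: 40.00 AID: 1523Q1Q";
string pattern = @"(?ix) # 'i' means case-insensitive search; 'x' ignores pattern whitespace and allows # comments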
\b # Word boundary
total # 'TOTAL' or 'total' or any other combination of cases
:? # Matches colon if it exists
\s+ # One or more spaces
(\d+\.\d+) # Sought number saved into group
\s # One space";
// The number is in the first group: Groups[1]
Console.WriteLine(Regex.Match(str, pattern).Groups[1].Value);
You can use the regex below to get the amount after TOTAL:
\bTOTAL\b:?\s*([\d.]+)
It will capture the amount in the first group.
Link : https://regex101.com/r/tzze8J/1/
Try this pattern: TOTAL:? ?(\d+\.\d+)[^\d]?
I have multiple 24-hour time strings through several files. For example, 1234, which I wish to replace with 12:34.
Finding them is easy: just \d\d\d\d, which I understand and which works. However, what replacement string do I need? In other words, given xx:xx, what do I put in place of each x?
I've tried numbers of things to no avail. I'm obviously not understanding how I get it to remember the digits it found and to recall them in the replace string.
If the 4 digits in your example data represent 24-hour time strings, you could match 2 capturing groups between word boundaries to prevent a match with more than 4 digits. You can adjust the word boundaries to your requirements.
Match
\b(\d{2})(\d{2})\b
Replace
group1:group2 \1:\2
Explanation
\b Match a word boundary
(\d{2}) Capture in a group 2 digits
(\d{2}) Capture in a group 2 digits
\b Match a word boundary
Note
Matching 4 digits does not verify a valid 24 hour time. You could match that using for example \b([01][0-9]|2[0-3])([0-5][0-9])\b and replace with \1:\2
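To see how the captured groups are recalled in the replacement string, here is a small Python sketch using `re.sub`. The question does not say which editor or language is in use, so the `\1:\2` replacement syntax is an assumption that happens to hold for Python; `LOOSE`, `STRICT` and `add_colon` are illustrative names.

```python
import re

# Any 4 digits between word boundaries.
LOOSE = re.compile(r"\b(\d{2})(\d{2})\b")
# Only valid 24-hour times: hours 00-23, minutes 00-59.
STRICT = re.compile(r"\b([01][0-9]|2[0-3])([0-5][0-9])\b")


def add_colon(text, pattern=LOOSE):
    """Insert a colon between the two captured digit pairs."""
    return pattern.sub(r"\1:\2", text)


print(add_colon("Depart 1234, arrive 0905"))
print(add_colon("2561"), add_colon("2561", STRICT))
```

The second line shows the difference between the two patterns: the loose one rewrites 2561 even though it is not a valid time, while the strict one leaves it alone.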
I'm new to regex and I want to know if there is any way to select all numbers after a matched string.
For example:
Input:
important string
abc 100
def 50
ghi jk 10
m 60
not important string
aa 90
bb 20
As output, I want to select all these numbers: 100, 50, 10, 60
I have tried important string[\w\n ]* (\d+) but I got only 60
Thanks a lot!
A generic PCRE approach to matching multiple occurrences in between some texts is to use a \G based pattern that allows anchoring matches at the end of the previous successful match:
(?:\G(?!\A)|(?<!\bnot )important string)(?:(?!not important string)\D)*?\K\d+
Basically,
(?s)(?:\G(?!\A)|STARTING_DELIMITER_STRING)(?:(?!END_DELIMITER_STRING).)*?\K\d+
Or, in order to stay within the initial STARTING_DELIMITER_STRING boundaries, add it to the negative lookahead:
(?s)(?:\G(?!\A)|STARTING_DELIMITER_STRING)(?:(?!STARTING_DELIMITER_STRING|END_DELIMITER_STRING).)*?\K\d+
Details:
(?:\G(?!\A)|(?<!\bnot )important string) - either the end of the previous successful match (\G(?!\A)) or an important string literal char sequence not preceded with not + space
(?:(?!not important string)\D)*? - any char other than digit (\D), 0+ occurrences, as few as possible, that is not a starting point for a not important string char sequence
\K - match reset operator
\d+ - 1+ digits
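Python's built-in `re` module supports neither `\G` nor `\K`, so the pattern above cannot be used there as written. The sketch below is a different, two-step technique that yields the same numbers for the question's input: first isolate the block between the delimiters, then collect the digits inside it. `BLOCK`, `text` and `numbers_in_block` are hypothetical names, and the delimiters are hard-coded from the question.

```python
import re

text = """important string
abc 100
def 50
ghi jk 10
m 60
not important string
aa 90
bb 20"""

# Step 1: capture everything from "important string" (not preceded by "not ")
# up to the next "not important string" or the end of the text.
BLOCK = re.compile(
    r"(?<!not )important string(.*?)(?=not important string|\Z)", re.DOTALL
)


def numbers_in_block(source):
    """Step 2: return all digit runs found inside the captured block."""
    block = BLOCK.search(source)
    return re.findall(r"\d+", block.group(1)) if block else []


print(numbers_in_block(text))
```

This handles only the first delimited block; the `\G`-based pattern remains the single-expression option in PCRE-style engines.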
I figured out a regular expression for my country's phone numbers, but I'm missing something.
The rule here is: (Area Code) Prefix - Suffix
Area Code could be 3 to 5 digits
Prefix could be 2 to 4 digits.
Area Code + Prefix is 7 digits long.
Suffix is always 4 digits long
Total digits are 11.
I figured I could have 3 simple regex chained with an OR "|" like this:
/(\(?\d{3}\)?[- .]?\d{4}[- .]?\d\d\d\d)|(\(?\d{4}\)?[- .]?\d{3}[- .]?\d\d\d\d)|(\(?\d{5}\)?[- .]?\d{2}[- .]?\d\d\d\d)/
What I'm doing wrong is that \d\d\d\d doesn't restrict the suffix to exactly 4 digits. For example, (011) 4740-5000, which is a valid phone number, matches correctly, but if I add extra digits it is still accepted as a valid phone number, e.g. (011) 4740-5000000000
You should use ^ and $ to match the whole string.
For example ^\d{4}$ will match exactly 4 digits not more not less.
Here is the complete regex pattern
^((\(?\d{3}\)? \d{4})|(\(?\d{4}\)? \d{3})|(\(?\d{5}\)? \d{2}))-\d{4}$
Since your regex pattern allows the delimiter to be -, . or a single space, try:
^((\(?\d{3}\)?[-. ]?\d{4})|(\(?\d{4}\)?[-. ]?\d{3})|(\(?\d{5}\)?[-. ]?\d{2}))[-. ]?\d{4}$
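The effect of the anchors can be checked with a short Python sketch. It uses the anchored pattern above as-is; `PHONE` and `is_valid_phone` are illustrative names, and the test numbers are the two from the question.

```python
import re

# Anchored pattern from the answer above: three area-code/prefix splits, then the suffix.
PHONE = re.compile(
    r"^((\(?\d{3}\)?[-. ]?\d{4})|(\(?\d{4}\)?[-. ]?\d{3})|(\(?\d{5}\)?[-. ]?\d{2}))[-. ]?\d{4}$"
)


def is_valid_phone(number):
    """True only when the entire string fits the pattern."""
    return PHONE.search(number) is not None


print(is_valid_phone("(011) 4740-5000"))        # valid number from the question
print(is_valid_phone("(011) 4740-5000000000"))  # extra digits after the suffix

# Without anchors, \d{4} is satisfied by the first 4 digits of a longer run.
print(re.search(r"\d{4}", "5000000000") is not None)
```

The last line illustrates the original problem: an unanchored `\d{4}` finds four digits inside a longer run, so the trailing digits are simply ignored rather than rejected.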
This pattern works fine for me:
/^\(?(\d{3,5})?\)?\s?(15)?[\s-]?(4)\d{2,3}[\s-]?\d{4}$/
I've tested this in regex101:
/^((?:\(?\d{3}\)?[- .]?\d{4}|\(?\d{4}\)?[- .]?\d{3}|\(?\d{5}\)?[- .]?\d{2})[- .]?\d{4})$/
^ Matches the beginning of a string
( Beginning of capture group
(?: Beginning of non-capturing group
Your different options for area code & prefix
) End non-capturing group
[- .]?\d{4} The last four digits of the phone number
) End capture group
$ Matches the end of a string
If you're trying to validate such a phone number, then the following one should suit your needs:
^(?=.{15}$)[(]\d{3,5}[)] \d{2,4}-\d{4}$
You need to match the complete expression by indicating the start and end with anchors. You also don't need alternation for the different lengths.
/^(?=(\D*\d){11}$)\(?\d{3,5}\)?[- .]?\d{2,4}[- .]?\d{4}$/
Here's the breakdown:
(?=(\D*\d){11}$) is a positive lookahead ensuring that there are exactly 11 digits in total, with any number of non-digits amongst them
\(?\d{3,5}\)?[- .]? matches 3-5 digits in parens (area code), followed by a separator
\d{2,4}[- .]? matches 2-4 digits (prefix), followed by a separator
\d{4} matches the suffix
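A minimal Python sketch of the lookahead version, to show how the digit count and the flexible group lengths interact. `PHONE_11` and `is_valid_phone_11` are hypothetical names; apart from the first two, the test numbers are invented for illustration.

```python
import re

# Lookahead counts exactly 11 digits; the rest checks the layout of the groups.
PHONE_11 = re.compile(r"^(?=(\D*\d){11}$)\(?\d{3,5}\)?[- .]?\d{2,4}[- .]?\d{4}$")


def is_valid_phone_11(number):
    """True when the string has the expected layout and exactly 11 digits."""
    return PHONE_11.search(number) is not None


for candidate in [
    "(011) 4740-5000",        # 3 + 4 + 4 digits
    "(011) 4740-5000000000",  # too many digits
    "(0111) 474-5000",        # 4 + 3 + 4 digits
    "(011) 474-5000",         # only 10 digits, rejected by the lookahead
]:
    print(candidate, is_valid_phone_11(candidate))
```

The ranges `\d{3,5}` and `\d{2,4}` alone would accept area code and prefix combinations of the wrong total length; the lookahead is what ties them to 11 digits overall.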
I am trying to do a smart input field for UK style weight input, e.g. "6 stone and 3 lb" or "6 st 11 pound", capturing the 2 numbers in groups.
For now I got: ([0-9]{1,2}).*?([0-9]{1,2}).*
Problem is it matches "12 stone" in 2 groups, 1 and 2 instead of just 12. Is it possible to make a regex which captures correctly in both cases?
You need to make the first part possessive so it never gets backtracked into.
([0-9]{1,2}+).*?([0-9]{1,2})
Because . matches everything, including digits, try this:
/(\d{1,2})\D+(\d{1,2})?/
Something like this?
\b(\d+)\b.*?\b(\d+)\b
Groups 1 and 2 will have your numbers in either case.
Explanation :
"
\b # Assert position at a word boundary
( # Match the regular expression below and capture its match into backreference number 1
\d # Match a single digit 0..9
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\b # Assert position at a word boundary
. # Match any single character that is not a line break character
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\b # Assert position at a word boundary
( # Match the regular expression below and capture its match into backreference number 2
\d # Match a single digit 0..9
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\b # Assert position at a word boundary
"
This works, then look at capture groups 1 and 3:
([0-9]{1,2})[^0-9]+(([0-9]{1,2})?.+)?
The idea is to make a number and text mandatory, but make a second number and text optional.
Here is my suggestion for a regex to match both variants you showed:
(?<stone>\d+\s(?:stone|st))(?:\s(and)?\s?)(?<pound>\d+\s(?:pound|lb))
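For reference, a Python sketch of the named-group suggestion. The question does not state a language, so Python is an assumption; its `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the pattern is otherwise unchanged. `WEIGHT` and `parse_weight` are illustrative names.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
WEIGHT = re.compile(
    r"(?P<stone>\d+\s(?:stone|st))(?:\s(and)?\s?)(?P<pound>\d+\s(?:pound|lb))"
)


def parse_weight(text):
    """Return (stone part, pound part) including units, or None if both are not present."""
    match = WEIGHT.search(text)
    return (match.group("stone"), match.group("pound")) if match else None


print(parse_weight("6 stone and 3 lb"))
print(parse_weight("6 st 11 pound"))
print(parse_weight("12 stone"))
```

Both parts are required by this pattern, so an input with only a stone value, such as "12 stone", is not matched.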
It's a bit vague at the moment, this works:
/([0-9]{1,2})(?:[^0-9]+([0-9]{1,2}).*)?/
for this data:
6 stone and 3 lb
6 st 11 pound
12 stone
12 st and 11lbs
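The four inputs listed above can be run through that pattern with a short Python sketch (Python is an assumption, since the question names no language; `WEIGHT_OPTIONAL`, `samples` and `stone_and_pounds` are made-up names).

```python
import re

# First number is mandatory; the second number, with its surrounding text, is optional.
WEIGHT_OPTIONAL = re.compile(r"([0-9]{1,2})(?:[^0-9]+([0-9]{1,2}).*)?")

samples = ["6 stone and 3 lb", "6 st 11 pound", "12 stone", "12 st and 11lbs"]


def stone_and_pounds(text):
    """Return (group 1, group 2); group 2 is None when there is no second number."""
    match = WEIGHT_OPTIONAL.search(text)
    return match.groups() if match else None


for sample in samples:
    print(sample, "->", stone_and_pounds(sample))
```

Wrapping the second number in an optional non-capturing group is what lets "12 stone" come back as a single value instead of being split into 1 and 2.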
Seeing as everyone is having a go, here's mine:
(\d+)(?:\D+(\d+))?
It's definitely the concisest so far. This will match one or two groups of digits anywhere:
"12": ("12", null)
"12st": ("12", null)
"12 st": ("12", null)
"12st 34 lb": ("12", "34")
"cabbage 12st 34 lb": ("12", "34")
"12 potato 34 moo": ("12", "34")
The next step would be making it catch the name of the units that were used.
Edit: as pointed out above, we don't know what language you're using, and not all regex functionality is available in all implementations. However, as far as I know, \d for digits and \D for non-digits are fairly universal.