I have strings like:
Name 31X10.50R15 109S RX706 SUV
Brand 131/70R11 NU8 Word RX808
Word 6.00R16 983/222 10PR MONO S+V
I need to match only 31X10.50 and 6.00R16 from these strings. As you can see, the second string contains no "digit X digit" or "digit R digit" pattern that should match.
My preg_match was this:
/(\d*\.?\d+?)x\K\d*\.?\d+?|\d*\.?\d+?r\d*/i
With the first alternative, (\d*\.?\d+?)x\K\d*\.?\d+?, I find 31 and 10.5 in the first string.
With the second alternative, \d*\.?\d+?r\d*, I hoped to find 6.00R16 and take only 6.00.
So my regex logic is to match 31X10.50 or 6.00R16 in the strings, but the second alternative is not working for me.
What am I doing wrong?
You may use
(?<![\d\/])(\d*\.?\d+)[xr](\d*\.?\d+)
See the regex demo.
Details
(?<![\d\/]) - a negative lookbehind: there should be no digit or / immediately to the left of the current location
(\d*\.?\d+) - Group 1: 0+ digits, an optional . and 1+ digits
[xr] - x or r (keep the case-insensitive i flag from the original pattern so that X and R also match)
(\d*\.?\d+) - Group 2: 0+ digits, an optional . and 1+ digits
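As a quick check of the pattern above, here is a minimal Python sketch run against the three sample strings from the question. The question uses PHP's preg_match; Python's re module is used here only for illustration, and the re.IGNORECASE flag and the helper name first_size are assumptions, not part of the original answer.

```python
import re

# Pattern from the answer. re.IGNORECASE is an assumption so that [xr]
# also matches the uppercase X and R found in the sample strings.
TIRE_SIZE = re.compile(r"(?<![\d/])(\d*\.?\d+)[xr](\d*\.?\d+)", re.IGNORECASE)

samples = [
    "Name 31X10.50R15 109S RX706 SUV",
    "Brand 131/70R11 NU8 Word RX808",
    "Word 6.00R16 983/222 10PR MONO S+V",
]


def first_size(text):
    """Return the first size-like match in text, or None if there is none."""
    m = TIRE_SIZE.search(text)
    return m.group(0) if m else None


results = [first_size(s) for s in samples]
print(results)  # ['31X10.50', None, '6.00R16']
```

The second string yields no match because 70R11 is preceded by /, which the lookbehind rejects.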
Another pattern, which worked on any string containing a decimal point:
(\d+\.\d+)\w+
I wish to match a filename with column and line info, eg.
\path1\path2\a_file.ts:17:9
//what i want to achieve:
match[1]: a_file.ts
match[2]: 17
match[3]: 9
This string can have garbage before and after the pattern, like
(at somewhere: \path1\path2\a_file.ts:17:9 something)
What I have now is this regex, which manages to match the line and column numbers, but I got stuck on the filename-capturing part. I guess a negative lookahead is the way to go, but it seems to match all previous groups and the garbage text at the end of the string.
(?!.*[\/\\]):(\d+):(\d+)\D*$
Here's a link to current implementation regex101
You can replace the lookahead with a negated character class:
([^\/\\]+):(\d+):(\d+)\D*$
See the regex demo. Details:
([^\/\\]+) - Group 1: one or more chars other than / and \
: - a colon
(\d+) - Group 2: one or more digits
: - a colon
(\d+) - Group 3: one or more digits
\D*$ - zero or more non-digit chars till end of string.
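The breakdown above can be tried with a small Python sketch. The helper name parse_location is hypothetical, and reading the two numbers as line then column is an assumption based on the usual file:line:column convention.

```python
import re

# Same pattern as in the answer; inside a character class the forward
# slash needs no escaping in Python.
LOCATION = re.compile(r"([^/\\]+):(\d+):(\d+)\D*$")


def parse_location(text):
    """Return (filename, first number, second number) or None."""
    m = LOCATION.search(text)
    if m is None:
        return None
    filename, line, column = m.groups()  # assumed order: line, then column
    return filename, int(line), int(column)


parsed = parse_location(r"(at somewhere: \path1\path2\a_file.ts:17:9 something)")
print(parsed)  # ('a_file.ts', 17, 9)
```

Because the character class excludes both slashes, Group 1 cannot extend to the left past the last path separator.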
So... I had a regex which worked just fine (wasn't pretty but worked), until the Roman Numerals reached more than X.
Currently my Regex looks like this:
(.*?)(^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$)*(.)( EP\. )(\d*)(.*)
The problem I have right now is that if the Roman numeral has a value of 10 or more, it ends up in the 1st group, which drives me nuts.
I need it to work in a way that everything before the Roman numeral is ignored.
Test Text:
PEPA THE PIG XVI EP. 169 - BAD ENDING
Could you please help me fix the regex so it actually does what it is supposed to do?
You should reconsider using anchors in the middle of a regex: ^ requires the start of the string and $ requires the end of the string.
Besides, the (.) before ( EP\. ) consumes the space, so the ( EP\. ) pattern, which itself starts with a space, cannot match.
Consider using
^(.*?)\b(X{1,3}(?:I[XV]|V?I{0,3})|I[XV]|V?I{1,3}|V)\b(.)\b(EP\.)\s*(\d+)(.*)
See the regex demo. You might still need to check what exactly you want to match with (.).
Details:
^ - start of string
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
\b - a word boundary
(X{1,3}(?:I[XV]|V?I{0,3})|I[XV]|V?I{1,3}|V) - Group 2: one to three Xs followed by IX, IV, or an optional V and zero to three Is; or IX or IV; or an optional V followed by one to three Is; or V
\b - a word boundary
(.) - Group 3: any one char (other than a newline)
\b - a word boundary
(EP\.) - Group 4: EP.
\s* - zero or more whitespaces
(\d+) - Group 5: one or more digits
(.*) - Group 6: any zero or more chars other than line break chars, as many as possible
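To see how the six groups come out, here is a minimal Python sketch that applies the suggested pattern to the test text from the question. The variable names are illustrative only.

```python
import re

# Pattern from the answer, unchanged.
EPISODE = re.compile(
    r"^(.*?)\b(X{1,3}(?:I[XV]|V?I{0,3})|I[XV]|V?I{1,3}|V)\b(.)\b(EP\.)\s*(\d+)(.*)"
)

m = EPISODE.match("PEPA THE PIG XVI EP. 169 - BAD ENDING")
groups = m.groups() if m else None
print(groups)
# ('PEPA THE PIG ', 'XVI', ' ', 'EP.', '169', ' - BAD ENDING')
```

The I inside PIG is not picked up as a numeral because there is no word boundary between P and I.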
I'm trying to come up with a regex expression to replace an entire string with just the first two values. Examples:
Entire String: AO SMITH 100108283 4500W/240V SCREW-IN ELEMENT, 11"
First Two Values: AO SMITH
Entire String: BRA14X18HEBU / P11-042 / 310-470NL BRASS 1/4 x 1/8 HEX BUSHING
First Two Values: BRA14X18HEBU / P11-042
Entire String: TWO-HOLE PIPE STRAP 4" 008004EG 72E 4
First Two Values: TWO-HOLE PIPE
The caveat is that I want to preserve special characters such as "/" and "-" and not count them as separate values. The code I've written so far does not do this; it leaves the new value entirely blank instead. Only the first example above works.
Here's what I've got so far:
Matching Value:
^(\w+) +(\w+).+$
New Value:
$1 $2
One option could be using a single capture group and use that in the replacement.
^(\w+(?:-\w+)?(?: +\/)? +\w+(?:-\w+)?).+
The pattern matches:
^ Start of string
( Capture group 1
\w+(?:-\w+)? Match 1+ word chars with an optional part to match a - and 1+ word chars
(?: +\/)? Optionally match 1+ spaces followed by /
 +\w+(?:-\w+)? Match 1+ spaces, then 1+ word chars with an optional part to match a - and 1+ word chars
) Close group 1
.+ Match 1+ times any char (the rest of the line)
If there can be more than 1 hyphen, you can use * instead of ?
Regex demo
Output
AO SMITH
BRA14X18HEBU / P11-042
TWO-HOLE PIPE
A broader match could be matching non word chars in between the words
^(\w+(?:-\w+)*[\W\r\n]+\w+(?:-\w+)*).+
Regex demo
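Both patterns can be compared with a short Python sketch. The question does not say which tool performs the replacement, so re.sub with \1 stands in for the $1 replacement here; the names NARROW, BROAD and first_two are hypothetical.

```python
import re

# The two patterns from the answer (the forward slash needs no escaping in Python).
NARROW = re.compile(r"^(\w+(?:-\w+)?(?: +/)? +\w+(?:-\w+)?).+")
BROAD = re.compile(r"^(\w+(?:-\w+)*[\W\r\n]+\w+(?:-\w+)*).+")

lines = [
    'AO SMITH 100108283 4500W/240V SCREW-IN ELEMENT, 11"',
    "BRA14X18HEBU / P11-042 / 310-470NL BRASS 1/4 x 1/8 HEX BUSHING",
    'TWO-HOLE PIPE STRAP 4" 008004EG 72E 4',
]


def first_two(pattern, text):
    """Replace the whole line with capture group 1 (the first two values)."""
    return pattern.sub(r"\1", text)


narrow_out = [first_two(NARROW, s) for s in lines]
broad_out = [first_two(BROAD, s) for s in lines]
print(narrow_out)
# ['AO SMITH', 'BRA14X18HEBU / P11-042', 'TWO-HOLE PIPE']
```

On these three inputs the broader pattern gives the same result as the narrow one.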
I am trying to extract parts of an expression like this one in Postgres 9.4:
"some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop."
using pattern: (\d+\_.*\_\d+\D)+?
result is:
"123_good_345"
"123_some_invalid and 222_work ok_333"
But I need
"123_good_345"
"222_work ok_333"
note, ignoring "123_some_invalid"
Please help!
You may use
\d+_(?:(?!\d_).)*_\d+
See the regex demo. Or, if there can be no digits between \d+_ and _\d+, use
\d+_\D+_\d+
See this regex demo.
Details
\d+ - 1 or more digits
_ - an underscore
(?:(?!\d_).)* - any char, 0 or more repetitions, as many as possible, that does not start a digit + _ char sequence
\D+ - any 1+ chars other than digits
_ - an underscore
\d+ - 1+ digits.
See the PostgreSQL demo:
SELECT unnest(regexp_matches('some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop.', '\d+_(?:(?!\d_).)*_\d+', 'g'));
or
SELECT unnest(regexp_matches('some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop.', '\d+_\D+_\d+', 'g'));
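Outside PostgreSQL, the same two patterns can be checked with a small Python sketch. This only illustrates the matching logic and is not a replacement for the queries above; the names TEMPERED and NO_DIGITS are made up for this example.

```python
import re

text = (
    "some text 123_good_345 and other text 123_some_invalid "
    "and 222_work ok_333 stop."
)

# First pattern: each consumed char must not start a "digit + underscore" sequence.
TEMPERED = re.compile(r"\d+_(?:(?!\d_).)*_\d+")
# Second pattern: only valid when no digits may appear between the underscores.
NO_DIGITS = re.compile(r"\d+_\D+_\d+")

tempered_matches = TEMPERED.findall(text)
no_digit_matches = NO_DIGITS.findall(text)
print(tempered_matches)  # ['123_good_345', '222_work ok_333']
print(no_digit_matches)  # ['123_good_345', '222_work ok_333']
```

In both cases 123_some_invalid is skipped because the match cannot run past the start of 222_ to reach a closing underscore followed by digits.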
I need to match these values:
(First approach to a regex that roughly does what I want)
\d+([.,]\d{3})*[.,]\d{2}
like
24,56
24.56
1.234,56
1,234.56
1234,56
1234.56
but I need to not match
1.234.56
1,234,56
So somehow I need to check the last occurrence of "." or "," to not be the same as the previous "." or ",".
Background: Amounts shall be matched in English and German format with (optional) 1000-Separators.
But even with help of regex101 I completely fail at coming up with a correctly working look-behind. Any suggestions are highly appreciated.
UPDATE
Based on the answers I got so far, I came up with this (demo):
\d{1,3}(?:([\.,'])?\d{3})*(?!\1)[\.,\s]\d{2}
But it matches for example 1234.567,23 which is not desirable.
You may capture the digit grouping symbol and use a negative lookahead with a backreference to restrict the decimal separator:
^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$
See the regex demo
Group 1 will contain the last value of the digit grouping symbol and (?!\1)[.,] will match the other symbol.
Details:
^ - start of string
(?:\d+|\d{1,3}(?:([.,])\d{3})*) - either of the two alternatives:
\d+ - 1+ digits
| - or
\d{1,3} - 1 to 3 digits,
(?:([.,])\d{3})* - zero or more sequences of:
([.,]) - Group 1 capturing . or ,
\d{3} - 3 digits
(?!\1)[.,] - a . or , but not equal to what was last captured with ([.,]) pattern above
\d{2} - 2 digits
$ - end of string.
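The pattern above can be exercised against the sample amounts from the question with a minimal Python sketch. The helper name is_amount is hypothetical. Note that this relies on a backreference to a group that did not participate failing to match, which is how Python's re and PCRE behave; JavaScript treats such a backreference as matching the empty string, so the lookahead would behave differently there.

```python
import re

# Pattern from the answer, unchanged.
AMOUNT = re.compile(r"^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$")

accepted = ["24,56", "24.56", "1.234,56", "1,234.56", "1234,56", "1234.56"]
rejected = ["1.234.56", "1,234,56", "1234.567,23"]


def is_amount(value):
    """True if value is an amount in English or German format."""
    return AMOUNT.search(value) is not None


accepted_results = [is_amount(v) for v in accepted]
rejected_results = [is_amount(v) for v in rejected]
print(accepted_results)  # all True
print(rejected_results)  # all False
```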
You can use
^\d+(([.,])\d{3})*(?!\2)[.,]\d{2}$
live demo