Is it possible to match only the number between a string and another number?
RO41 RNCB 0089 0957 6044 0001 FPS21098343 RO17 BTRL 0470 1202 W949 45XX
What I want: 21098343
What I'm trying: [0-9]{4}\s*\S+\s+(\S+)
What I get: FPS21098343
Any help is much appreciated! Thanks.
If the digits are at the end of the string, you could anchor the pattern with $:
\b\d{4}\s+[A-Z]+(\d+)$
Or, if the digits are not at the end of the string, you can match the uppercase chars preceding the digits and capture the digits in group 1, asserting that they are not followed by whitespace and a digit:
\b\d{4}\s+[A-Z]+(\d+)\b(?!\s+\d)
\b\d{4}\s+ A word boundary, match 4 digits and 1+ whitespace chars
[A-Z]+(\d+)\b Match 1+ uppercase chars and capture 1+ digits in group 1
(?!\s+\d) Negative lookahead, assert that what follows is not 1+ whitespace chars and a digit
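As a quick check, the lookahead variant can be run against the sample line from the question. This is only a minimal sketch using Python's re module; the variable names are illustrative and not part of the original answer.

```python
import re

# Sample line from the question.
line = "RO41 RNCB 0089 0957 6044 0001 FPS21098343 RO17 BTRL 0470 1202 W949 45XX"

# 4 digits, whitespace, uppercase chars, then the digits to capture.
# The negative lookahead rejects candidates such as "1202 W949",
# because they are followed by whitespace and a digit.
pattern = re.compile(r"\b\d{4}\s+[A-Z]+(\d+)\b(?!\s+\d)")

matches = pattern.findall(line)
print(matches)  # ['21098343']
```

findall returns the contents of group 1 for every match, so the prefix FPS is not part of the result.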
Match the number only if it is preceded by 3 capital letters.
(?<=[A-Z]{3})([\d]+)
I am looking for a regex to match a string which should:
start with a digit
'in-between' have a permutation of exactly 7 digits and 2 hyphens, without 2 consecutive hyphens
end with a sequence of digit, hyphen, digit
Match:
01-234-5678-9
01234-56-78-9
0123-4-5678-9
012-345-678-9
01-234567-8-9
01-234-5678-9
0-12345-678-9
0-123-45678-9
0-123-45678-9
01-23456-78-9
0-123456-78-9
0-1234567-8-9
No Match:
01-234-56789-0
01-234-567-8
01--2345678-9
01-2345678--9
0-1-23456789
-01-2345678-9
For now, I could not quite figure out how to match the 2 'in-between' hyphens: ^\d\d{7}\d-\d$
EDIT:
Thanks to the answers I had to this question, I was able to expand it to this other question regarding ISBN-10 and ISBN-13...
You can use a lookahead to assert the 7 digits and the digit-hyphen-digit part at the end.
For the match there should be at least a single digit before and after the hyphen to prevent consecutive hyphens.
^\d(?=(?:-?\d){7}-?\d-\d$)\d*-\d+-\d*\d-\d$
^ Start of string
\d Match a single digit
(?= Positive lookahead
(?:-?\d){7} Match 7 digits, each optionally preceded by a -
-?\d-\d$ Match an optional - and the \d-\d$ at the end
) Close the lookahead
\d*-\d+-\d*\d-\d Match possible formats where all hyphens are separated by at least a single digit
$ End of string
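To see how the lookahead and the main pattern work together, the pattern can be run against the lists from the question. The following is a sketch in Python, assuming the re module; the list and variable names are made up for illustration.

```python
import re

pattern = re.compile(r"^\d(?=(?:-?\d){7}-?\d-\d$)\d*-\d+-\d*\d-\d$")

# Unique entries from the "Match" list in the question.
should_match = [
    "01-234-5678-9", "01234-56-78-9", "0123-4-5678-9", "012-345-678-9",
    "01-234567-8-9", "0-12345-678-9", "0-123-45678-9", "01-23456-78-9",
    "0-123456-78-9", "0-1234567-8-9",
]
# Entries from the "No Match" list in the question.
should_not_match = [
    "01-234-56789-0", "01-234-567-8", "01--2345678-9",
    "01-2345678--9", "0-1-23456789", "-01-2345678-9",
]

accepted = [s for s in should_match if pattern.search(s)]
rejected = [s for s in should_not_match if not pattern.search(s)]

print(len(accepted), len(rejected))  # 10 6
```

The lookahead fixes the number of digits, while the consuming part of the pattern fixes the number of hyphens and keeps them apart.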
My two cents:
^(?=.{11}-\d$)(?:\d+-){3}\d
^ - Start string anchor.
(?= - Open positive lookahead:
.{11}-\d$ - Any character other than newline 11 times followed by a hyphen, a single digit and the end string anchor.
) - Close positive lookahead.
(?: - Open non-capture group:
\d+- - 1+ digits followed by a hyphen.
){3} - Close non-capture group and match three times.
\d - Match a single digit.
I guess alternatively even ^(?=.{13}$)(?:\d+-){3}\d$ would work.
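A quick way to compare the two variants is to run both on the same inputs. The sketch below uses Python's re module; the second input, 0-1-2-34567-8, is a made-up string that is not in the question. Because the first variant has no $ after the final \d, only its lookahead checks the end of the string, so it appears to accept that input, while the fully anchored variant rejects it.

```python
import re

# Variant whose main pattern is not anchored at the end (only the lookahead is).
loose = re.compile(r"^(?=.{11}-\d$)(?:\d+-){3}\d")
# Variant anchored with $ after the final digit.
strict = re.compile(r"^(?=.{13}$)(?:\d+-){3}\d$")

sample = "01-234-5678-9"        # from the question's "Match" list
extra_hyphen = "0-1-2-34567-8"  # hypothetical input with 4 hyphens and 9 digits

loose_results = [bool(loose.search(s)) for s in (sample, extra_hyphen)]
strict_results = [bool(strict.search(s)) for s in (sample, extra_hyphen)]

print(loose_results)   # [True, True]
print(strict_results)  # [True, False]
```

If inputs with more than three hyphens can occur, the anchored variant looks like the safer choice.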
I have a string that has the following structure:
digit-word(s)-digit.
For example:
2029 AG.IZTAPALAPA 2
I want to extract the word(s) in the middle, and the digit at the end of the string.
I want to extract AG.IZTAPALAPA and 2 in the same capture group, like this:
AG.IZTAPALAPA 2
I managed to capture them as individual capture groups but not as a single:
town_state['municipality'] = town_state['Town'].str.extract(r'(\D+)', expand=False)
town_state['number'] = town_state['Town'].str.extract(r'(\d+)$', expand=False)
Thank you for your help!
You can use a single capture group for the example string. It matches a single "word" that consists of uppercase chars A-Z, with optional dots in the middle (not at the start or end), followed by a space and 1 or more digits.
\b\d+ ([A-Z]+(?:\.[A-Z]+)* \d+)\b
Explanation
\b A word boundary
\d+ Match 1+ digits followed by a space
( Capture group 1
[A-Z]+ Match 1+ occurrences of an uppercase char A-Z
(?:\.[A-Z]+)* \d+ Repeat 0+ times a dot followed by 1+ chars A-Z, then match a space and 1+ digits
) Close group 1
\b A word boundary
Or you can make the pattern a bit broader matching either a dot or a word character
\b\d+ ([\w.]+(?: [\w.]+)* \d+)\b
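The single-group idea can be tried with plain Python before wiring it into a DataFrame. This is a minimal sketch using the re module; the helper name extract_town is hypothetical. Since the pattern has exactly one capture group, it should also be usable with pandas' Series.str.extract(..., expand=False), as in the question's own code, though that is not shown here.

```python
import re
from typing import Optional

# Stricter pattern from the answer: uppercase "word" with optional inner dots.
pattern = re.compile(r"\b\d+ ([A-Z]+(?:\.[A-Z]+)* \d+)\b")

def extract_town(value: str) -> Optional[str]:
    """Return the word(s) plus trailing number, or None if nothing matches."""
    m = pattern.search(value)
    return m.group(1) if m else None

result = extract_town("2029 AG.IZTAPALAPA 2")  # example value from the question
print(result)  # AG.IZTAPALAPA 2
```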
You can use the following simple regex:
[0-9]+\s([A-Z]+.[A-Z]+(?: [0-9]+)*)
Note:
(?: [0-9]+)* makes the trailing number optional. The unescaped . in the pattern matches any character; use \. to match only a literal dot.
To match a dash-less checksum I can do something like:
\b[0-9a-z]{32}\b
However, I'm seeing some checksums that also have dashes, such as:
d3bd55bf-062f-473b-9417-935f62c4c98a
While this is probably a fixed size, 8, then 4, then 4, then 4, then 12, I was wondering if I could do a regex where the number of non-dash digits adds up to 32. I think the answer is no, but hopefully some regex wizard can come up with something.
Here is a starting point for some sample inputs: https://regex101.com/r/K0IMKe/1.
You can use
\b[0-9a-z](?:-?[0-9a-z]){31}\b
It matches
\b - a word boundary
[0-9a-z] - a digit or a lowercase ASCII letter
(?:-?[0-9a-z]){31} - thirty-one repetitions of an optional - followed by a single digit or a lowercase ASCII letter
\b - a word boundary.
If you do not mind a trailing - at the end of the match (which is only possible when a word char follows it), you may also use
\b(?:[0-9a-z]-?){32}\b
Here, (?:[0-9a-z]-?){32} matches thirty-two repetitions of a digit or lowercase ASCII letter followed by an optional hyphen.
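The first pattern can be checked with a few inputs. Below is a small sketch in Python, assuming the re module; the too_short value is invented for illustration, and the other two come from the question's example.

```python
import re

# One char, then 31 more chars that may each be preceded by a single dash.
pattern = re.compile(r"\b[0-9a-z](?:-?[0-9a-z]){31}\b")

dashed = "d3bd55bf-062f-473b-9417-935f62c4c98a"  # from the question
dashless = dashed.replace("-", "")                # same 32 chars without dashes
too_short = "d3bd55bf-062f-473b-9417"             # hypothetical, 20 non-dash chars

results = [bool(pattern.search(s)) for s in (dashed, dashless, too_short)]
print(results)  # [True, True, False]
```

Because each repetition consumes exactly one non-dash char, the count of 32 holds no matter where the dashes sit.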
If there can be multiple dashes, you can assert 32 to 36 chars using a positive lookahead.
^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+)*$
^ Start of string
(?=[a-z0-9-]{32,36}$) Positive lookahead, assert what is at the right is 32 - 36 repetitions of the listed characters
[a-z0-9]+ Match 1+ times any of the listed
(?: Non capture group
-[a-z0-9]+ Match a - followed by 1+ times any of the listed (the string can not end with a hyphen)
)* Close the group and match 0+ times to also match the string without dashes
$ End of string
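One thing worth checking with this approach is what the length assertion actually counts. The sketch below, in Python with the re module, suggests that the lookahead measures the total length including dashes, so a string with many dashes can pass with fewer than 32 non-dash chars. The many_dashes input and the non_dash_count helper are made up for illustration.

```python
import re

pattern = re.compile(r"^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+)*$")

def non_dash_count(s: str) -> int:
    """Count the chars that are not dashes."""
    return len(s.replace("-", ""))

dashed = "d3bd55bf-062f-473b-9417-935f62c4c98a"  # from the question
trailing_dash = dashed[:-1] + "-"                 # hypothetical: ends with a dash
many_dashes = "a-" * 15 + "ab"                    # hypothetical: 32 chars in total

print(bool(pattern.search(dashed)))         # True
print(bool(pattern.search(trailing_dash)))  # False
print(bool(pattern.search(many_dashes)), non_dash_count(many_dashes))  # True 17
```

If exactly 32 non-dash chars are required, a separate count such as non_dash_count(s) == 32 could be combined with the match.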
If you want to limit the number of dashes to 0-4, you can change the quantifier * to {0,4}+ (a possessive quantifier):
^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+){0,4}+$
I'm trying to extract the first string that appears between numbers:
For example:
testsfa13.4extractthis8488.9090testssffwwww
ajfafs-sss133.6extractthis887878.222testtest522252.9thismore
So far I have the following:
[\d](.*?)[\d]
However, the match includes the numbers at the end of the capture group. Any suggestions are appreciated. Thank you.
If you want to extract the first match, you could start with the anchor ^, match any chars except digits with \D*, and then match a number with an optional decimal part.
^\D*\d+(?:[.,]\d+)*(\D+)\d
^ Start of string
\D* Match 0+ times any char except a digit
\d+(?:[.,]\d+)* Match 1+ digits and optionally repeat a . or , and 1+ digits
(\D+) Capture group 1, match 1+ times any char except a digit
\d Match a digit
To prevent crossing newline boundaries:
^[^\d\n\r]*\d+(?:[,.]\d+)*([^\d\n\r]+)\d
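Applied to the two sample lines from the question, the newline-safe pattern can be exercised like this. It is only a sketch, assuming Python's re module and one input string per line; the helper name first_between is hypothetical.

```python
import re
from typing import Optional

# Leading non-digits, the first number (optionally with . or , parts),
# then the captured run of non-digits, which must be followed by a digit.
pattern = re.compile(r"^[^\d\n\r]*\d+(?:[,.]\d+)*([^\d\n\r]+)\d")

def first_between(line: str) -> Optional[str]:
    """Return the first text found between two numbers, or None."""
    m = pattern.search(line)
    return m.group(1) if m else None

lines = [
    "testsfa13.4extractthis8488.9090testssffwwww",
    "ajfafs-sss133.6extractthis887878.222testtest522252.9thismore",
]
first = [first_between(line) for line in lines]
print(first)  # ['extractthis', 'extractthis']
```

The later segments such as testtest are ignored because the pattern is anchored to the start of the string.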
Try \d([A-Za-z]+)\d and take the first value from the returned object:
https://regex101.com/r/v61exp/1
I am trying to validate a decimal number with up to 13 digits before the dot and up to 4 digits after it, excluding commas, i.e. a comma shouldn't be counted as a digit.
Valid Cases
1,234,567,890,123.1234
1234567890123.1234
123456789012.1234
1234567890123.123
12345.123
1.2
0
Invalid Cases
12345abc.23 // letters or special characters are not allowed
1,234,567,890,1231.1234
1,234,567,890,123.12341
12345678901231.1234
1234567890123.12341
Current Regex
^[0-9]{1,13}(\.[0-9]{0,4})?$
The current Regex is counting comma as a digit.
Any help would be great.
You could use a negative lookahead to assert that the part before the dot does not contain 14 digits:
^(?!(?:[^.\s\d]*\d){14})-?\d+(?:,\d{1,3})*(?:\.\d{1,4})?$
Explanation
^ Start of string
(?! Negative lookahead, assert what follows is not
(?:[^.\s\d]*\d){14} 14 repetitions of 0+ chars other than a dot, whitespace char or digit, followed by a digit
) Close lookahead
-? Optional hyphen
\d+ Match 1+ digits
(?:,\d{1,3})* Match a comma and 1-3 digits, and repeat 0+ times (or use \d+)
(?:\.\d{1,4})? Optional part, match a dot and 1-4 digits
$ End of the string
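The pattern can be checked against the valid and invalid cases listed in the question. The following is a sketch in Python using the re module; the list names are illustrative.

```python
import re

pattern = re.compile(
    r"^(?!(?:[^.\s\d]*\d){14})-?\d+(?:,\d{1,3})*(?:\.\d{1,4})?$"
)

# "Valid Cases" from the question.
valid = [
    "1,234,567,890,123.1234", "1234567890123.1234", "123456789012.1234",
    "1234567890123.123", "12345.123", "1.2", "0",
]
# Invalid cases from the question.
invalid = [
    "12345abc.23", "1,234,567,890,1231.1234", "1,234,567,890,123.12341",
    "12345678901231.1234", "1234567890123.12341",
]

accepted = [s for s in valid if pattern.search(s)]
rejected = [s for s in invalid if not pattern.search(s)]

print(len(accepted), len(rejected))  # 7 5
```

Commas are skipped inside the lookahead by [^.\s\d]*, which is why they do not count toward the 13-digit limit.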
You could also just make each comma optional, like:
^[0-9]{0,1}([,])?[0-9]{0,3}([,])?[0-9]{0,3}([,])?[0-9]{0,3}([,])?[0-9]{1,3}(\.[0-9]{0,4})?$