Suppose I have strings like:
ABC-L-W7P-1423
ABC-L-W7E-87
CH-L-W7-756
I need to grab the number at the end. That number might be 2, 3 or 4 digits. But currently what I have is:
=REGEXREPLACE(B2,"[^0-9]","")
Which of course also grabs the '7' in 'W7P' which I don't want.
EDIT:
I also need to match something like this:
CH-M-311-MM
So always a 2, 3 or 4 (or 5) digit number, but I need single digits excluded.
You can use =REGEXEXTRACT with \b[0-9]{2,4}\b:
=REGEXEXTRACT(B2, "\b[0-9]{2,4}\b")
See the regex demo.
Details:
\b - a leading word boundary
[0-9]{2,4} - 2 to 4 digits (use {2,5} if 5-digit numbers must match too, as the edit to the question mentions)
\b - trailing word boundary
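If you want to try the pattern outside Sheets, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for the Sheets regex engine, and the helper name extract_number is made up for illustration.

```python
import re

# Same pattern as in the formula: 2 to 4 digits delimited by word boundaries.
PATTERN = re.compile(r"\b[0-9]{2,4}\b")

def extract_number(text):
    """Return the first standalone 2-4 digit run, or None if there is none."""
    match = PATTERN.search(text)
    return match.group(0) if match else None

samples = ["ABC-L-W7P-1423", "ABC-L-W7E-87", "CH-L-W7-756", "CH-M-311-MM"]
results = [extract_number(s) for s in samples]
print(results)  # ['1423', '87', '756', '311']
```

The 7 in W7P is skipped because it sits between two word characters, so there is no word boundary before it.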
In case your 2-4 digits are always preceded with -, you may use
=REGEXREPLACE(B2,"^.*-([0-9]{2,4})\b.*","$1")
See this regex demo
Details:
^ - start of string
.*- - any 0+ chars up to the last - that is followed with...
([0-9]{2,4}) - (Group 1 referred to with $1 in the replacement pattern) - 2 to 4 digits
\b - a trailing word boundary
.* - any chars up to the end of string.
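The replace-based variant can be sketched the same way. This assumes Python's re.sub in place of REGEXREPLACE, with \1 playing the role of $1; number_after_dash is a hypothetical helper name.

```python
import re

# Greedy ^.*- backtracks to the last dash that is followed by 2-4 digits.
PATTERN = re.compile(r"^.*-([0-9]{2,4})\b.*")

def number_after_dash(text):
    # Replace the whole string with the captured digits.
    # If the pattern does not match, the input is returned unchanged.
    return PATTERN.sub(r"\1", text)

samples = ["ABC-L-W7P-1423", "CH-L-W7-756", "CH-M-311-MM"]
results = [number_after_dash(s) for s in samples]
print(results)  # ['1423', '756', '311']
```

Note that a string without a matching number comes back as is, so you may want to check for that case separately.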
I'm not sure which language you use, but if it supports lookarounds, you can assert that there is a - (dash) on the left side.
(?<=-)\d+
See: https://regex101.com/r/sI9zR9/1
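A quick way to check the lookbehind is a short Python sketch (Python's re supports fixed-width lookbehinds). The third input is made up to show that a single digit after a dash is matched as well, so \d{2,} may be the safer quantifier if single digits must be excluded.

```python
import re

# One or more digits that are immediately preceded by a dash.
LOOKBEHIND = re.compile(r"(?<=-)\d+")

first = LOOKBEHIND.findall("ABC-L-W7P-1423")
second = LOOKBEHIND.findall("CH-M-311-MM")
third = LOOKBEHIND.findall("AB-7-1423")  # hypothetical input with a lone digit

print(first)   # ['1423']
print(second)  # ['311']
print(third)   # ['7', '1423']
```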
Related
I couldn't come up with a search query that finds this in previous answers, so I'm posting it.
How do I match a string of exactly 7 characters where 0-2 of them can be dashes (in any position) and 5-7 of them are \w characters? All I could come up with is
^(\w?){5}([\w-]?){2}(\w?){5}$
but regex101 shows that it can match up to 12 characters.
You could do something like this:
(?=^(?:\w*-?\w*){2}$)^.{7}$
(?= - start lookahead
^(?:\w*-?\w*){2}$ - from start to finish ensure we have all \w characters and allow for a maximum of 2 dashes anywhere in the string
) - end lookahead
^.{7}$ - match exactly 7 chars from start to end
https://regex101.com/r/L7IReu/1/
Another option could be to assert 7 characters and optionally match 1 or 2 hyphens between word characters.
^(?=[\w-]{7}$)\w*(?:-\w*){0,2}$
^ Start of string
(?=[\w-]{7}$) Assert 7 word chars or - in the whole string
\w* Match optional word chars
(?:-\w*){0,2} Repeat 0-2 times matching - and optional word chars
$ End of string
Regex demo
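To compare the two patterns above, here is a small Python sketch. The sample strings are made up for illustration; both patterns are copied unchanged.

```python
import re

# Pattern from the first answer: dash count checked in a lookahead, then length.
DASHES_FIRST = re.compile(r"(?=^(?:\w*-?\w*){2}$)^.{7}$")
# Pattern from the second answer: length checked in a lookahead, then dashes.
LENGTH_FIRST = re.compile(r"^(?=[\w-]{7}$)\w*(?:-\w*){0,2}$")

expected = {
    "abcdefg": True,    # 7 word chars, no dash
    "ab-cd-e": True,    # 2 dashes, 5 word chars
    "--abcde": True,    # dashes may sit anywhere
    "a-b-c-d": False,   # 3 dashes
    "abcdef": False,    # only 6 chars
    "abcdefgh": False,  # 8 chars
    "abc+def": False,   # + is neither a dash nor a word char
}

results = {
    s: (bool(DASHES_FIRST.search(s)), bool(LENGTH_FIRST.search(s)))
    for s in expected
}
for s, pair in results.items():
    print(s, pair)
```

On these inputs both patterns give the same answer; which one reads better is mostly a matter of taste.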
I have an email that contains some 11-digit numbers; those are easy: \d{11}.
There are also "words" that are 6 characters long.
The letters in them are always uppercase.
A word may contain 1-5 digits, but never 6. It never exceeds 6 characters.
\b(\d{11}|([A-Z0-9]{6}))(\s|\.|$)
also captures e.g. "123456" which I'd like to omit.
It's within an email, so I'm using VBA with its "Microsoft VBScript Regular Expressions 5.5" library.
I think you are after:
\b(?:\d{11}|(?!\d{6})[A-Z\d]{6})\b
See an online demo
\b - Word boundary.
(?: - Open non-capture group:
\d{11} - Eleven digits.
| - Or:
(?!\d{6}) - Negative lookahead that fails the match if the next 6 characters are all digits.
[A-Z\d]{6} - Exactly six uppercase letters or digits.
) - Close non-capture group.
\b - Word boundary.
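The question uses VBA, but the pattern itself can be checked quickly in Python. This is only a sketch: the mail body below is invented, and Python's re stands in for the VBScript engine.

```python
import re

# 11 digits, or 6 uppercase letters/digits that are not all digits.
CODE = re.compile(r"\b(?:\d{11}|(?!\d{6})[A-Z\d]{6})\b")

# Hypothetical mail text for illustration.
body = "Order 12345678901, ref AB12CD. Ignore 123456 and ABCDEFG."
found = CODE.findall(body)
print(found)  # ['12345678901', 'AB12CD']
```

123456 is rejected by the negative lookahead, and ABCDEFG is rejected because the trailing \b cannot match after only six of its seven letters.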
You may use this regex:
\b(?:\d{11}|(?=\d{0,5}[A-Z])[A-Z0-9]{6})(?:[\s\.]|$)
RegEx Demo
(?=\d{0,5}[A-Z]) is a positive lookahead that asserts the presence of an uppercase letter after 0 to 5 digits, thus failing the match when the second alternative would consist of 6 digits.
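One thing to be aware of with this pattern is that the trailing whitespace or dot is part of the match. A small Python sketch (the text is made up, and Python's re is assumed instead of VBScript) shows it:

```python
import re

# The final group consumes a whitespace char or a dot unless the string ends there.
CODE = re.compile(r"\b(?:\d{11}|(?=\d{0,5}[A-Z])[A-Z0-9]{6})(?:[\s\.]|$)")

# Hypothetical mail text for illustration.
body = "Order 12345678901 ref AB12CD. Ignore 123456 and 12345A"
matches = [m.group(0) for m in CODE.finditer(body)]
print(matches)  # ['12345678901 ', 'AB12CD.', '12345A']
```

If you only need the code itself, wrapping the first group in capturing parentheses and reading that group would avoid the extra character.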
To match a dash-less checksum I can do something like:
\b[0-9a-z]{32}\b
However, I'm seeing some checksums that also have dashes, such as:
d3bd55bf-062f-473b-9417-935f62c4c98a
While this is probably a fixed size, 8, then 4, then 4, then 4, then 12, I was wondering if I could do a regex where the number of non-dash digits adds up to 32. I think the answer is no, but hopefully some regex wizard can come up with something.
Here is a starting point for some sample inputs: https://regex101.com/r/K0IMKe/1.
You can use
\b[0-9a-z](?:-?[0-9a-z]){31}\b
See the regex demo.
It matches
\b - a word boundary
[0-9a-z] - a digit or a lowercase ASCII letter
(?:-?[0-9a-z]){31} - thirty-one repetitions of an optional - followed with a single digit or a lowercase ASCII letter
\b - a word boundary.
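A short Python sketch to check the pattern against the sample from the question. The dash-less and shortened variants are derived from that sample for illustration.

```python
import re

# One alphanumeric char, then 31 more, each optionally preceded by a dash.
CHECKSUM = re.compile(r"\b[0-9a-z](?:-?[0-9a-z]){31}\b")

dashed = "d3bd55bf-062f-473b-9417-935f62c4c98a"  # sample from the question
dashless = dashed.replace("-", "")               # same 32 chars without dashes
too_short = dashed[:-1]                          # only 31 non-dash chars

flags = [bool(CHECKSUM.fullmatch(s)) for s in (dashed, dashless, too_short)]
print(flags)  # [True, True, False]
```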
If you do not mind the match ending with a trailing - when that - is followed by a word char, you may also use
\b(?:[0-9a-z]-?){32}\b
See this regex demo. Here, (?:[0-9a-z]-?){32} will match thirty-two repetitions of a digit or lowercase ASCII letter followed with an optional hyphen.
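The difference between the two patterns only shows up when a dash and a word character follow the checksum. A minimal Python sketch, with a made-up input built from the question's sample:

```python
import re

STRICT = re.compile(r"\b[0-9a-z](?:-?[0-9a-z]){31}\b")
LOOSE = re.compile(r"\b(?:[0-9a-z]-?){32}\b")

# 32 alphanumeric chars followed by "-x" (hypothetical input).
base = "d3bd55bf-062f-473b-9417-935f62c4c98a".replace("-", "")
text = base + "-x"

strict_match = STRICT.search(text).group(0)
loose_match = LOOSE.search(text).group(0)

print(len(strict_match), strict_match.endswith("-"))  # 32 False
print(len(loose_match), loose_match.endswith("-"))    # 33 True
```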
If there can be multiple dashes, you can assert 32 to 36 chars using a positive lookahead.
^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+)*$
^ Start of string
(?=[a-z0-9-]{32,36}$) Positive lookahead, assert what is at the right is 32 - 36 repetitions of the listed characters
[a-z0-9]+ Match 1+ times any of the listed
(?: Non capture group
-[a-z0-9]+ Match a - followed by 1+ times any of the listed (the string can not end with a hyphen)
)* Close the group and match 0+ times to also match the string without dashes
$ End of string
Regex demo
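Here is a small Python sketch of this approach. Note that the lookahead counts the dashes as part of the 32 to 36 characters, so it checks the overall length rather than the number of non-dash characters; the second input below is made up to show that.

```python
import re

PATTERN = re.compile(r"^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+)*$")

uuid_like = "d3bd55bf-062f-473b-9417-935f62c4c98a"  # sample from the question
padded = "a" * 15 + "-" + "a" * 16   # 32 chars in total, only 31 non-dash
trailing_dash = uuid_like[:-1] + "-"  # ends with a hyphen

flags = [bool(PATTERN.search(s)) for s in (uuid_like, padded, trailing_dash)]
print(flags)  # [True, True, False]
```

If exactly 32 non-dash characters are required, a pattern that counts the alphanumerics themselves, like the one with {31} above, is the stricter choice.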
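If you want to limit the number of dashes to 0-4, you can change the quantifier * to {0,4}+ (the trailing + makes the quantifier possessive; use plain {0,4} if your regex flavor does not support possessive quantifiers)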
^(?=[a-z0-9-]{32,36}$)[a-z0-9]+(?:-[a-z0-9]+){0,4}+$
Regex demo
I am looking for a regex expression that will accept the following: The capital letter A followed by any number of digits. This might also be a decimal number. All of these are valid: A1, A500, A543.987
This is NOT OK to accept: Apple, AE100
Currently I have [A]\w.[0-9]* but it accepts App and AE100.
You may use the following regex if the entire string should match:
^A[0-9]+(?:\.[0-9]+)?$
Or, to match these strings as whole words:
\bA[0-9]+(?:\.[0-9]+)?\b
See the regex demo.
Details
^ - start of string / \b - a word boundary
A - an A
[0-9]+ - 1+ digits
(?:\.[0-9]+)? - an optional sequence of . and then 1+ digits
$ - end of string / \b - a word boundary.
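Both variants can be checked against the examples from the question with a short Python sketch. The sentence used for the whole-word variant is made up.

```python
import re

WHOLE_STRING = re.compile(r"^A[0-9]+(?:\.[0-9]+)?$")
WHOLE_WORD = re.compile(r"\bA[0-9]+(?:\.[0-9]+)?\b")

inputs = ["A1", "A500", "A543.987", "Apple", "AE100"]
accepted = [s for s in inputs if WHOLE_STRING.search(s)]
print(accepted)  # ['A1', 'A500', 'A543.987']

# Hypothetical sentence for the whole-word variant.
words = WHOLE_WORD.findall("A1 and AE100, then A543.987")
print(words)  # ['A1', 'A543.987']
```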
I would suggest "A\d+(\.\d+)?". \d matches any digit, the + means one or more of the preceding token, and (\.\d+) is a . followed by one or more digits. The ? after the group makes it optional.
I am trying to create a regex that only matches for valid dates (in MM/DD or MM/DD/YY(YY) format)
My current regex (\d+)/(\d+)/?(\d+)? is very simple, but it matches any number that has a / before or after it. I.e. if a string is 2015/2016 12/25, it will see both of these as matches, but I only want the 12/25 portion.
Here is a link to some sample RegEx.
You can add word boundaries (\b) to make sure you match the date string as a whole "word" (so that the match does not start in the middle of a number) and restrict the occurrences \d matches with the help of limiting quantifiers:
\b(\d{2})/(\d{1,2})/?(\d{4}|\d{2})?\b
See the regex demo
The regex breakdown:
\b - word boundary to make sure there is a non-word character or start of string right before the digit
(\d{2}) - match and capture exactly 2 digits
/ - match a literal /
(\d{1,2}) - match and capture 1 to 2 digits
/? - match 1 or 0 /
(\d{4}|\d{2})? - match 1 or 0 occurrences of either 4 or 2 digits
\b - trailing word boundary
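A minimal Python sketch to try the pattern on the string from the question (extended with a made-up MM/DD/YYYY date). Keep in mind the pattern only checks the shape, not whether the month and day are in range, and it requires a two-digit month.

```python
import re

DATE = re.compile(r"\b(\d{2})/(\d{1,2})/?(\d{4}|\d{2})?\b")

text = "2015/2016 12/25 and 12/25/2015"
found = [m.group(0) for m in DATE.finditer(text)]
print(found)  # ['12/25', '12/25/2015']

# Shape-only check: an impossible date still matches, a one-digit month does not.
impossible = DATE.search("99/99") is not None
one_digit_month = DATE.search("1/5") is not None
print(impossible, one_digit_month)  # True False
```

For real validation you would still parse the captured groups, for example with a date library.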