I need a regular expression to find the last occurrence of 5 consecutive digits in a string. This is what I have right now:
([0-9]{5})[a-zA-Z]*$
This only matches some of my test strings.
In a live environment the numbers will change, but for testing I expect to capture the substring '12345' in each of the test strings below:
D012345
D012345AS
D012345RM-67
D12345D
12345D67
TEST-Str12345ing-rm6
Updated
Works with the global flag and passes all tests. No capture group is required:
[0-9]{5}(?![0-9])(?!.*[0-9]{5})
Live example:
http://www.regexr.com/3c875
Let's break this down:
Match any instance of five digits
[0-9]{5}
But the instance cannot be immediately followed by another digit - this way,
we always get the last five in any group of consecutive numbers.
(?![0-9])
Lastly, make sure no later run of five or more consecutive digits exists
anywhere after this point in the string.
(?!.*[0-9]{5})
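For anyone who wants to try this outside regexr, here is a minimal Python sketch that runs the pattern against the test strings from the question. The helper name last_five_digits is made up for illustration, and Python's re module is my choice, not something the question specifies.

```python
import re

# Pattern from the answer above: five digits, not followed by a digit,
# and with no later run of five digits anywhere in the string.
LAST_FIVE = re.compile(r"[0-9]{5}(?![0-9])(?!.*[0-9]{5})")

# Test strings taken from the question.
SAMPLES = [
    "D012345",
    "D012345AS",
    "D012345RM-67",
    "D12345D",
    "12345D67",
    "TEST-Str12345ing-rm6",
]


def last_five_digits(text):
    """Return the last run of five digits in text, or None (hypothetical helper)."""
    match = LAST_FIVE.search(text)
    return match.group(0) if match else None


for sample in SAMPLES:
    print(sample, "->", last_five_digits(sample))
```

Each of the six samples should come back as 12345, with no capture group involved.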
You can use this regex:
.*([0-9]{5})
to make sure you're matching the last 5 consecutive digits in the input. The matched 5 digits are available in capture group #1.
The greedy .* at the start makes sure that we match only the very last 5 digits.
RegEx Demo
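A quick way to check the greedy approach, sketched in Python (the function name last_five_greedy is an assumption for this example only):

```python
import re

# Greedy .* swallows as much as possible, then backtracks just far enough
# for the capture group to grab the last five consecutive digits.
GREEDY_LAST_FIVE = re.compile(r".*([0-9]{5})")


def last_five_greedy(text):
    """Return capture group 1 (the last five digits), or None if there is no match."""
    match = GREEDY_LAST_FIVE.search(text)
    return match.group(1) if match else None


for sample in ["D012345RM-67", "12345D67", "TEST-Str12345ing-rm6"]:
    print(sample, "->", last_five_greedy(sample))
```

Unlike the lookahead version, the full match here includes everything before the digits, so you have to read group 1 rather than the whole match.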
I am trying to create regex for below case:
Input string consisting of all numbers, max length is 30.
Check that, within the first 10 digits, no digit appears 3 or more times consecutively
eg.
1234567 --> is good (no consecutive number)
1234456 --> is good (4 appears consecutive but length is less than 3)
1234445 --> is bad (4 appears consecutive and length is equal or greater than 3)
12345678904444 --> is good (4 appears consecutive and length is greater than 3 however it is accepted since it is appearing after cut off of 10 digit)
The regex I came up with is below. Pardon any mistakes in the regex, I am still learning regexes:
https://regex101.com/r/rv5e6a/1
Currently it is applied across the whole string, and I am not sure how to limit it so that it applies only to the first 10 digits.
You can use
^(?!\d{0,7}(\d)\1{2})\d{1,30}$
See the regex demo. Note that \d{0,7} in the lookahead will allow checking for repeated digits only within the first ten. More details:
^ - start of string
(?!\d{0,7}(\d)\1{2}) - a negative lookahead that fails the match if there are three identical digits after zero to seven digits immediately to the right of the current location
\d{1,30} - one to thirty digits
$ - end of string.
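Here is a small Python sketch that checks the pattern against the four examples from the question. The name is_valid is hypothetical, and the 30- and 32-digit inputs are extra cases I added to exercise the length limit.

```python
import re

# Reject a tripled digit that starts within the first eight positions
# (so the whole triple sits inside the first ten digits), then require
# one to thirty digits in total.
FIRST_TEN_OK = re.compile(r"^(?!\d{0,7}(\d)\1{2})\d{1,30}$")


def is_valid(number):
    """Return True if the digit string passes the rule (hypothetical helper)."""
    return FIRST_TEN_OK.search(number) is not None


for number in ["1234567", "1234456", "1234445", "12345678904444"]:
    print(number, "->", "good" if is_valid(number) else "bad")
```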
^(?:(\d)(?!\1{2})){1,9}$|^(?:(\d)(?!\2{2})){10}\d*$
regex101 link
Explanation:
^ # beginning of the line
(?: # start of a non-capturing group
(\d) # a single digit in a group that we can refer to with \1 later on
(?!\1{2}) # not followed by the digit in the \1 group repeated twice
){1,9} # repeat the non-capturing group 1-9 times
$ # end of the line
| # OR
^ # beginning of the line
(?: # start of a second non-capturing group
(\d) # a single digit in a group that we can refer to with \2 later on
(?!\2{2}) # not followed by the digit in the \2 group repeated twice
){10} # repeat the non-capturing group 10 times
\d* # the rest of the string can be more digits
$ # end of the line
The important part of the regex above, (\d)(?!\1{2}), makes sure that a given digit is not followed by the same digit two more times. But, because we only care about the first 10 digits, we need to handle this in two cases.
In the first case, we have a string of digits that is less than 10 characters long, so we want our pattern to repeat 1-9 times and then hit the end of the line.
In the second case, we have a string of digits that is 10 or more characters long, and there might be even more characters after that which we don't care about.
We need to keep these two cases separate because we don't want to exclude the cases where there are fewer than 10 characters total in the string.
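To see the two cases in action, here is a rough Python sketch run against the examples from the question. The helper name passes_two_case is made up for this example.

```python
import re

# Case 1: one to nine digits, none followed by two more copies of itself.
# Case 2: the same check on exactly the first ten digits, then any more digits.
TWO_CASE = re.compile(
    r"^(?:(\d)(?!\1{2})){1,9}$|^(?:(\d)(?!\2{2})){10}\d*$"
)


def passes_two_case(number):
    """Return True if the digit string matches either alternative (hypothetical helper)."""
    return TWO_CASE.search(number) is not None


for number in ["1234567", "1234456", "1234445", "12345678904444"]:
    print(number, "->", "good" if passes_two_case(number) else "bad")
```

One thing I noticed while testing: the lookahead on the ninth and tenth digits peeks past the cutoff, so an input like 1234567894444 is rejected by this pattern. The pattern also does not enforce the 30-digit maximum from the question, so you may need to add that yourself.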
I should only catch numbers that fit the rules.
Rules:
it should be 16 digits long
the first 11 digits can be any number
the next 3 digits must all be zero
the last two digits can be any number.
I did it this way:
([0-9]{11}[0]{3}[0-9]{2})
number example:
1234567890100012
Now I want to get the number even if it has letters at the beginning or end of the string, like " abc1234567890100012abc".
My output should be just the number, like "1234567890100012".
When I add [a-zA-Z]* it returns the whole string.
Another point: if there are extra digits at the beginning or end of the string, like "999912345678901000129999", the program shouldn't accept it. I mean it should return none or nothing. How can I write this with a regex?
You can use look around to exclude the cases where there are more digits before/after:
(?<!\d)\d{11}000\d\d(?!\d)
On regex101
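As a quick check, here is a Python sketch of the lookaround version using the sample strings from the question. The function name extract_number is an assumption for illustration.

```python
import re

# 11 digits, three zeros, two digits, with no digit directly before or after.
SIXTEEN_DIGITS = re.compile(r"(?<!\d)\d{11}000\d\d(?!\d)")


def extract_number(text):
    """Return the 16-digit number if present and not part of a longer digit run."""
    match = SIXTEEN_DIGITS.search(text)
    return match.group(0) if match else None


print(extract_number(" abc1234567890100012abc"))
print(extract_number("999912345678901000129999"))
```

The first call should give back just the number, and the second should give None because the digits are embedded in a longer run.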
You can use a capture group, and match optional chars a-zA-Z before and after the group.
To prevent a partial match, you can use word boundaries \b. If the string should match from the start to the end of the line, you can use the anchors ^ and $ instead.
\b[a-zA-Z]*([0-9]{11}000[0-9]{2})[a-zA-Z]*\b
Regex demo
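A short Python sketch of the capture-group version, again with the strings from the question (extract_with_group is a made-up name):

```python
import re

# Optional letters on either side, with the number itself in capture group 1.
WITH_LETTERS = re.compile(r"\b[a-zA-Z]*([0-9]{11}000[0-9]{2})[a-zA-Z]*\b")


def extract_with_group(text):
    """Return capture group 1 (the 16-digit number), or None if nothing matches."""
    match = WITH_LETTERS.search(text)
    return match.group(1) if match else None


print(extract_with_group(" abc1234567890100012abc"))
print(extract_with_group("999912345678901000129999"))
```

Here the full match includes the surrounding letters, so the number has to be read from group 1.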
I'm trying to compute a regex for this scenario:
the ID must start with the letter 'M' and end with 3 digits, but triple zeros are not allowed.
I've tried
M(00[1-9])
But this only works on blocking triple zeros, how can I cater for the other digits?
The easiest way is probably with a negative lookahead:
M(?!0{3})\d{3}
[Regex101]
This matches literal M, checks that the next thing is not triple zero, then matches three digits.
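If you want to try it quickly, here is a minimal Python sketch. I am assuming the whole ID is just M plus three digits, so it uses fullmatch; the helper name is_valid_id is hypothetical.

```python
import re

# Literal M, a lookahead rejecting "000", then any three digits.
NOT_TRIPLE_ZERO = re.compile(r"M(?!0{3})\d{3}")


def is_valid_id(candidate):
    """Return True if the whole string is M followed by three digits other than 000."""
    return NOT_TRIPLE_ZERO.fullmatch(candidate) is not None


for candidate in ["M001", "M123", "M000", "M12"]:
    print(candidate, "->", is_valid_id(candidate))
```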
If you want to block a specific set of digits, you can modify your lookahead to check for specific repeated digits (0, 2, 5, 6 here):
M(?!([0256])\1{2})\d{3}
[Regex101]
To check for all triple digits, replace [0256] with \d. This regex makes the lookahead check for one digit, then test if it is repeated twice using a backreference.
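Here is a rough Python comparison of the two lookahead variants described above. The constant names are made up, and fullmatch is used on the assumption that the ID is exactly M plus three digits.

```python
import re

# Blocks only the listed digits (0, 2, 5, 6) when they appear tripled.
BLOCK_SOME = re.compile(r"M(?!([0256])\1{2})\d{3}")
# Blocks any tripled digit.
BLOCK_ALL = re.compile(r"M(?!(\d)\1{2})\d{3}")

for candidate in ["M000", "M222", "M111", "M225"]:
    some = BLOCK_SOME.fullmatch(candidate) is not None
    every = BLOCK_ALL.fullmatch(candidate) is not None
    print(candidate, "listed digits only:", some, "| any digit:", every)
```

M111 is the interesting case: it passes the first pattern because 1 is not in the blocked set, but fails the second.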
A less redundant way might be to put the capture group outside the lookahead:
M(\d)(?!\1{2})\d{2}
[Regex101]
This version says to capture one digit, make sure it is not repeated two more times, then match two more digits.
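A small Python sketch of this last variant, under the same assumption that the ID is exactly M plus three digits (no_triple is a hypothetical helper name):

```python
import re

# Capture the first digit, refuse two more copies of it, then take two digits.
CAPTURE_OUTSIDE = re.compile(r"M(\d)(?!\1{2})\d{2}")


def no_triple(candidate):
    """Return True if the whole string is M plus three digits that are not all equal."""
    return CAPTURE_OUTSIDE.fullmatch(candidate) is not None


for candidate in ["M000", "M111", "M112", "M100"]:
    print(candidate, "->", no_triple(candidate))
```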
I have a huge dataset, where I am trying to extract a group of 4 digits. The problem is, sometimes there will be a preceding group of 4 digits that I don't want. These 2 groups will never be the same as each other.
Example:
String String 7777 Some more string
String 1234 7777 Some more string
In both cases, I want to extract ONLY 7777 (or whatever digit combination replaces it). There is no pattern to distinguish which number group will be in which position - any number from 0000 to 9999 can be in either first or second position.
If this were possible, I think it'd do what I want?
\b\d{4}{0,1}\s{0,1}(\d{4})\b
Optional 4 digits, optional space, capture 4 digits. But I've tried it, and some variations of it, but I can't get it to work!
A look-ahead seems like a possible candidate, but I don't understand how to construct the pattern.
You can use a negative look-ahead to check if there is no subsequent 4-digit number after it:
\b\d{4}\b(?!\s?\d{4}\b)
See demo
EDIT:
To capture a 4-digit number that is not followed, after any amount of other text, by another 4-digit number, you should use:
\b\d{4}\b(?!.+\b\d{4}\b)
See demo
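To compare the two patterns, here is a minimal Python sketch. The first two strings are from the question; the third, with a word between the two numbers, is one I made up to show where the patterns differ.

```python
import re

# First version: rejects a 4-digit group only if another follows immediately.
ADJACENT_ONLY = re.compile(r"\b\d{4}\b(?!\s?\d{4}\b)")
# Edited version: rejects a 4-digit group if another appears anywhere later.
ANYWHERE_LATER = re.compile(r"\b\d{4}\b(?!.+\b\d{4}\b)")

lines = [
    "String String 7777 Some more string",
    "String 1234 7777 Some more string",
    "String 1234 and 7777 text",  # invented example, not from the question
]

for line in lines:
    first = ADJACENT_ONLY.search(line).group(0)
    second = ANYWHERE_LATER.search(line).group(0)
    print(repr(line), "->", first, "/", second)
```

On the third line the first pattern returns 1234, because the next 4-digit group is not directly adjacent, while the edited pattern still returns 7777.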
You can use this expression that matches the four digit group not followed by any other four digit groups:
\d{4}(?!.+\d{4}.+)
Online test here.
Can I use
\d\d\d\d[^\d]
to check for four consecutive numbers?
For example,
411112 OK
455553 OK
1200003 OK
f44443 OK
g55553 OK
3333 OK
f4442 No
45553 No
f4444g4444 No
f44444444 No
If you want to find any series of 4 digits in a string, /\d\d\d\d/ or /\d{4}/ will do. If you want to find a series of exactly 4 digits with a non-digit character on each side, use /[^\d]\d{4}[^\d]/. If the whole string should consist of exactly 4 digits, use /^\d{4}$/.
Edit: I think you want to find 4 of the same digits, you need a backreference for that. /(\d)\1{3}/ is probably what you're looking for.
Edit 2: /(^|(.)(?!\2))(\d)\3{3}(?!\3)/ will only match strings with exactly 4 of the same consecutive digits.
The first group matches either the start of the string or any single character (captured as group 2). The negative look-ahead then uses group 2 to ensure that the next character differs from that preceding character, if there is one. The third group matches any digit, which is then repeated 3 more times with a backreference to group 3. Finally, another negative look-ahead ensures that the following character is not the same digit again.
This sort of stuff is difficult to do in JavaScript because you don't have things like forward references and, before ES2018, look-behind.
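For a quick check of the Edit 2 pattern, here is a sketch that runs it through Python's re module instead of JavaScript (my choice for the example; I would expect the same results for this pattern, but I have only reasoned it through for Python). The helper name has_run_of_exactly_four is made up.

```python
import re

# Start of string or a character that differs from what follows, then a digit
# repeated three more times, not followed by the same digit again.
EXACTLY_FOUR_SAME = re.compile(r"(^|(.)(?!\2))(\d)\3{3}(?!\3)")


def has_run_of_exactly_four(text):
    """Return True if text contains a run of exactly four identical digits."""
    return EXACTLY_FOUR_SAME.search(text) is not None


samples = ["411112", "455553", "1200003", "f44443", "g55553", "3333",
           "f4442", "45553", "f44444444", "f4444g4444"]

for sample in samples:
    print(sample, "->", has_run_of_exactly_four(sample))
```

Note that f4444g4444 is reported as a match, since it does contain a run of exactly four 4s. If a string with two such runs should be rejected, as the question's list suggests, this pattern alone will not do that.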
Should the numbers be part of a string, or do you want only the four numbers? In the latter case, the regexp should be ^\d{4}$. The ^ marks the beginning of the string, $ the end. That makes sure that only four numbers are valid, and nothing before or after that.
That should match four digits (\d\d\d\d) followed by a non-digit character ([^\d]). If you just want to match any four digits, you should use \d\d\d\d or \d{4}. If you want to make sure that the string consists of just four consecutive digits, use ^\d{4}$. The ^ will instruct the regex engine to start matching at the beginning of the string while the $ will instruct the regex engine to stop matching at the end of the string.
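To make the difference between these patterns concrete, here is a small Python sketch. The sample inputs are my own, apart from 3333, which is in the question's list.

```python
import re

# Pattern from the question: four digits followed by one non-digit character.
FOUR_THEN_NON_DIGIT = re.compile(r"\d\d\d\d[^\d]")
# Any four digits anywhere in the string.
ANY_FOUR = re.compile(r"\d{4}")
# The whole string is exactly four digits.
ONLY_FOUR = re.compile(r"^\d{4}$")

for text in ["3333", "3333x", "ab12345"]:
    print(
        text,
        "| four then non-digit:", FOUR_THEN_NON_DIGIT.search(text) is not None,
        "| any four:", ANY_FOUR.search(text) is not None,
        "| only four:", ONLY_FOUR.search(text) is not None,
    )
```

The question's pattern does not match 3333 on its own, because [^\d] needs an actual character after the four digits.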