I'm trying to compute a regex for this scenario:
the ID must start with the letter 'M' and end with 3 digits, but triple zeros are not allowed.
I've tried
M(00[1-9])
But this only takes care of blocking triple zeros. How can I cater for the other digits?
The easiest way is probably with a negative lookahead:
M(?!0{3})\d{3}
This matches literal M, checks that the next thing is not triple zero, then matches three digits.
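As a quick sanity check, here is a minimal Python sketch of the lookahead pattern above. The helper name is_valid_id and the use of fullmatch (so that nothing may precede the M or follow the three digits) are assumptions made for illustration; they are not part of the original question.

```python
import re

# Literal M, a lookahead that rejects "000", then exactly three digits.
ID_PATTERN = re.compile(r"M(?!0{3})\d{3}")

def is_valid_id(candidate: str) -> bool:
    """Return True if the whole string is M plus three digits other than 000."""
    return ID_PATTERN.fullmatch(candidate) is not None

for sample in ["M001", "M123", "M000", "M12", "X123"]:
    print(sample, is_valid_id(sample))
```

Only M001 and M123 are accepted here; M000 is rejected by the lookahead and the last two fail the basic shape.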
If you want to block a specific set of digits, you can modify your lookahead to check for specific repeated digits (0, 2, 5, 6 here):
M(?!([0256])\1{2})\d{3}
To check for all triple digits, replace [0256] with \d. This regex makes the lookahead check for one digit, then test if it is repeated twice using a backreference.
A less redundant way might be to put the capture group outside the lookahead:
M(\d)(?!\1{2})\d{2}
This version says to capture one digit, make sure it is not repeated two more times, then match two more digits.
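The difference between the two backreference variants can be sketched in Python as follows. The names BLOCK_SOME, BLOCK_ALL and allowed are hypothetical, and whole-string matching via fullmatch is an assumption.

```python
import re

# Lookahead with an inner capture: rejects only 000, 222, 555 and 666.
BLOCK_SOME = re.compile(r"M(?!([0256])\1{2})\d{3}")
# Capture the first digit, then refuse two more copies of it: rejects every triple.
BLOCK_ALL = re.compile(r"M(\d)(?!\1{2})\d{2}")

def allowed(pattern: re.Pattern, candidate: str) -> bool:
    """Return True if the pattern matches the entire candidate string."""
    return pattern.fullmatch(candidate) is not None

for sample in ["M111", "M222", "M225", "M999"]:
    print(sample, allowed(BLOCK_SOME, sample), allowed(BLOCK_ALL, sample))
```

M111 passes the first pattern because 1 is not in the blocked set, but fails the second, which blocks any repeated triple.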
I am trying to analyse my source code (written in C) for mismatched timer variable comparisons/allocations. I have a range of timers with different timebases (2-250 milliseconds). Every timer variable contains its granularity in milliseconds in its name (e.g. timer10ms) as well as every timer-photo and define (e.g. fooTimer10ms, DOO_TIMEOUT_100MS).
Here are some example lines:
fooTimer10ms = timer10ms;
baaTimer20ms = timer10ms;
if (DIFF_100MS(dooTimer10ms) >= DOO_TIMEOUT_100MS)
if (DIFF_100MS(dooTimer10ms) < DOO_TIMEOUT_100MS)
I want to match those lines where the timebases do not correspond (in this case the second, third and fourth line). So far I have this regex:
(\d{1,3}(?i)ms(?-i)).*[^\d](\d{1,3}(?i)ms(?-i))
that is capable of finding every line where there are two of those granularities. So instead of just line 2, 3 and 4 it matches all of them. The only idea I had to narrow it down is to add a negative lookbehind with a back-reference, like so:
(\d{1,3}(?i)ms(?-i)).*[^\d](\d{1,3}(?i)ms(?-i))(?<!\1)
but this is not allowed because a negative lookbehind has to have a fixed length.
I found these two questions (one, two) but the first does not have the restriction of having both capture groups being of the same kind and the second is looking for equal instances of the capture group.
If what I want can be achieved far more easily by using something other than regex, I would be happy to know. My mind is just stuck due to my belief that regex is capable of that and I am just not creative enough to use it properly.
One option is to match the timer part followed by the digits and use a negative lookahead with a backreference to assert that it does not occur at the right.
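For the example data, a more specific pattern restricted to the range 2-250 might be the following (it relies on case-insensitive matching, since the example names such as fooTimer10ms are written with a capital T):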
.*?(timer(?:2[0-4]\d|250|1?\d\d|[2-9])ms)\b\S*[^\S\r\n]*[<>]?=[^\S\r\n]*\b(?!\S*\1)\S+
The pattern matches
.*? Match any char except a newline, as few as possible (non-greedy)
( Capture group 1
timer Match literally
(?:2[0-4]\d|250|1?\d\d|[2-9]) Match a number in the range of 2-250
ms Match literally
)\b Close group and a word boundary
\S*[^\S\r\n]* Match optional non whitespace chars and optional spaces without newlines
[<>]?= Match an optional < or > and =
[^\S\r\n]*\b Match optional whitespace chars without a newline and a word boundary
(?!\S*\1) Negative lookahead, assert no occurrence of what is captured in group 1 in the value
\S+ Match 1+ non whitespace chars
Or perhaps a broader pattern matching 1-3 digits and optional whitespace chars which might also match a newline:
.*?(timer\d{1,3}ms\b)\S*\s*[<>]?=\s*\b(?!.*\1)\S+
Note that {1-3} should be {1,3} and could also match 999
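Since the question also asks about approaches other than a single regex, here is a hedged Python sketch of a different technique: extract every granularity token on a line and flag the line when the values differ. This is not the pattern from the answer above; the names TOKEN and has_mismatched_timebase are hypothetical, and it assumes that every 1-3 digit number directly followed by "ms" (in any case) is a timebase.

```python
import re

# Granularity tokens such as "10ms" or "100MS". The lookbehind prevents
# picking up only the tail of a longer number (e.g. "000ms" out of "1000ms").
TOKEN = re.compile(r"(?<!\d)(\d{1,3})ms", re.IGNORECASE)

def has_mismatched_timebase(line: str) -> bool:
    """Return True if the line mentions at least two different timebases."""
    values = {int(value) for value in TOKEN.findall(line)}
    return len(values) > 1

lines = [
    "fooTimer10ms = timer10ms;",
    "baaTimer20ms = timer10ms;",
    "if (DIFF_100MS(dooTimer10ms) >= DOO_TIMEOUT_100MS)",
    "if (DIFF_100MS(dooTimer10ms) < DOO_TIMEOUT_100MS)",
]
for line in lines:
    print(has_mismatched_timebase(line), line)
```

On the four example lines this flags the second, third and fourth line, because the comparison is made on the numeric values rather than on the surrounding identifier text.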
First off, this has sort of been asked before. However I haven't been able to modify this to fit my requirement.
In short: I want a regex that matches an expression if and only if it only contains digits, and there are 5 (or more) increasing consecutive digits somewhere in the expression.
I understand the logic of
^(?=\d{5}$)1*2*3*4*5*6*7*8*9*0*$
however, this limits the expression to 5 digits. I want to allow digits before and after the increasing run. So 1111345671111 should match, while 11111 shouldn't.
I thought this might work:
^[0-9]*(?=\d{5}0*1*2*3*4*5*6*7*8*9*)[0-9]*$
which I interpret as:
^$: The entire expression must only contain what's between these 2 symbols
[0-9]*: Any digits between 0-9, 0 or more times followed by:
(?=\d{5}0*1*2*3*4*5*6*7*8*9*): A part where at least 5 increasing digits are found followed by:
[0-9]*: Any digits between 0-9, 0 or more times.
However this regex is incorrect, as for example 11111 matches. How can I solve this problem using a regex? So examples of expressions to match:
00001459000
12345
This shouldn't match:
abc12345
9871234444
While this problem can be solved using pure regular expressions (the set of strictly ascending five-digit strings is finite, so you could just enumerate all of them), it's not a good fit for regexes.
That said, here's how I'd do it if I had to:
^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$
Core idea: 0?1?2?3?4?5?6?7?8?9? matches an ascending numeric substring, but it doesn't restrict its length. Every single part is optional, so it can match anything from "" (empty string) to the full "0123456789".
We can force it to match exactly 5 characters by combining a look-ahead of five digits and an arbitrary suffix (which we capture) with a backreference \1 (which must match exactly the suffix captured by the look-ahead, ensuring we've now walked ahead 5 characters in the string).
Live demo: https://regex101.com/r/03rJET/3
(By the way, your explanation of (?=\d{5}0*1*2*3*4*5*6*7*8*9*) is incorrect: It looks ahead to match exactly 5 digits, followed by 0 or more occurrences of 0, followed by 0 or more occurrences of 1, etc.)
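To see the look-ahead plus backreference trick in action, here is a small Python sketch run against the examples from the question. The names ASCENDING_RUN and has_ascending_run are hypothetical; the pattern itself is the one given above.

```python
import re

# \d* skips a prefix, the look-ahead captures everything after the next five
# digits, and \1 forces the optional-digit chain to consume exactly those five.
ASCENDING_RUN = re.compile(r"^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$")

def has_ascending_run(text: str) -> bool:
    """Return True if text is all digits and contains 5 strictly ascending digits in a row."""
    return ASCENDING_RUN.match(text) is not None

for sample in ["00001459000", "12345", "1111345671111",
               "abc12345", "9871234444", "11111"]:
    print(sample, has_ascending_run(sample))
```

The first three samples are accepted and the last three are rejected, which agrees with the expectations listed in the question.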
Because the starting position of the increasing digits isn't known in advance, and the consecutive increasing digits don't end at the end of the string, the linked answer's concise pattern won't work here. I don't think this is possible without being repetitive: alternate between all possibilities of increasing digits. A 0 must be followed by [1-9], written as 0(?=[1-9]). A 1 must be followed by [2-9]. A 2 must be followed by [3-9], and so on. Alternate between these possibilities in a group and repeat that group four times, then match any digit after that (the lookahead attached to the fourth repeated digit ensures that this fifth digit is in sequence as well).
First lookahead for digits followed by the end of the string, then match the alternations described above, followed by one or more digits:
^(?=\d+$)\d*?(?:0(?=[1-9])|1(?=[2-9])|2(?=[3-9])|3(?=[4-9])|4(?=[5-9])|5(?=[6-9])|6(?=[7-9])|7(?=[89])|8(?=9)){4}\d+
Separated out for better readability:
^(?=\d+$)\d*?
(?:
0(?=[1-9])|
1(?=[2-9])|
2(?=[3-9])|
3(?=[4-9])|
4(?=[5-9])|
5(?=[6-9])|
6(?=[7-9])|
7(?=[89])|
8(?=9)
){4}
\d+
The lazy quantifier \d*? in the first line isn't necessary, but it makes the pattern a bit more efficient (otherwise it initially greedily matches the whole string, requiring lots of failing alternations and backtracking until at least 5 characters before the end of the string).
https://regex101.com/r/03rJET/2
It's ugly, but it works.
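Because the alternation is so repetitive, it can also be generated rather than typed. The following Python sketch builds an equivalent pattern programmatically and checks it against the question's examples; the names steps, ASCENDING_RUN and has_ascending_run are hypothetical, and the generated class for 8 is [9-9] instead of a bare 9, which matches the same character.

```python
import re

# One alternative per digit 0-8: the digit itself, then a lookahead
# requiring that the next character is a strictly larger digit.
steps = "|".join(f"{d}(?=[{d + 1}-9])" for d in range(9))

# Four such steps followed by at least one digit give five ascending digits.
ASCENDING_RUN = re.compile(rf"^(?=\d+$)\d*?(?:{steps}){{4}}\d+")

def has_ascending_run(text: str) -> bool:
    """Return True if text is all digits and contains 5 strictly ascending digits in a row."""
    return ASCENDING_RUN.match(text) is not None

for sample in ["00001459000", "12345", "abc12345", "9871234444", "11111"]:
    print(sample, has_ascending_run(sample))
```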
I need a regular expression to find the last occurrence of 5 consecutive digits in a string. This is what I have right now:
([0-9]{5})[a-zA-Z]*$
This only matches some of my test strings.
In a live environment the numbers will change, but for testing I expect to capture the substring '12345' in each of the test strings below:
D012345
D012345AS
D012345RM-67
D12345D
12345D67
TEST-Str12345ing-rm6
Updated
Works with the global flag and passes all tests. No capture group required:
[0-9]{5}(?![0-9])(?!.*[0-9]{5})
Live example:
http://www.regexr.com/3c875
Let's break this down:
Match any instance of five digits
[0-9]{5}
But the instance cannot be immediately followed by another digit - this way,
we always get the last five in any group of consecutive numbers.
(?![0-9])
Lastly, make sure no further groups of consecutive numbers exist that have
five or more digits.
(?!.*[0-9]{5})
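A minimal Python sketch of this breakdown, run against the test strings from the question. The names LAST_FIVE and last_five_digits are hypothetical; search is used here in place of a global flag because only the single surviving match is needed.

```python
import re
from typing import Optional

# Five digits, not followed by another digit, with no later run of five digits.
LAST_FIVE = re.compile(r"[0-9]{5}(?![0-9])(?!.*[0-9]{5})")

def last_five_digits(text: str) -> Optional[str]:
    """Return the last run of five consecutive digits, or None if there is none."""
    match = LAST_FIVE.search(text)
    return match.group(0) if match else None

samples = ["D012345", "D012345AS", "D012345RM-67",
           "D12345D", "12345D67", "TEST-Str12345ing-rm6"]
for sample in samples:
    print(sample, "->", last_five_digits(sample))
```

Each of the six test strings yields 12345. No capture group is involved; the whole match is the result.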
You can use this regex:
.*([0-9]{5})
to make sure you're matching last 5 continuous digits in the input. Matched 5 digits are available in captured group #1.
.* (greedy match) at start makes sure that we match very last 5 digits only.
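The greedy variant can be sketched the same way in Python. The names GREEDY_LAST_FIVE and last_five_greedy are hypothetical; here the result is read from capture group 1 and not from the whole match.

```python
import re
from typing import Optional

# The greedy .* consumes as much as possible, so group 1 ends up holding
# the rightmost five consecutive digits.
GREEDY_LAST_FIVE = re.compile(r".*([0-9]{5})")

def last_five_greedy(text: str) -> Optional[str]:
    """Return the rightmost five consecutive digits via group 1, or None."""
    match = GREEDY_LAST_FIVE.match(text)
    return match.group(1) if match else None

samples = ["D012345", "D012345AS", "D012345RM-67",
           "D12345D", "12345D67", "TEST-Str12345ing-rm6"]
for sample in samples:
    print(sample, "->", last_five_greedy(sample))
```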
I'm trying to generate a regular expression with the next pattern.
A number, of a maximum of 16 digits, that can have or not a comma inside, never at the beginning, never at the end.
I tried this:
^(?:\d+(?:,{1}\d+){1})$
The problem is that I can't count the result of a group {0,16}.
This is a list of numbers that should fit the expression:
123,34
1,33333333
1222222233
Example numbers that shouldn't fit:
1111,1111,1111
,11111
11111,
11111111111111111111111111111,111111111111111 (more than 16 characters)
You may check the length beforehand, or use ^(?=[\d,]{1,16}$)(?:\d+(?:,\d+)?)$
That is a lookahead that checks the length before doing the real match.
If your regex flavour supports lookahead assertions, you can use this:
^(?!(?:\d{17,}$|[,\d]{18,}$))(?:\d+(?:,\d+)?)$
I removed the superfluous {1} and made the group with the fraction optional.
The negative lookahead assertion (?!(?:\d{17,}$|[,\d]{18,}$)) is checking your length requirement. It fails if it finds 17 or more digits till the end OR 18 or more digits and commas till the end. Allowing multiple commas in the character class here is not a problem, because the pattern that follows ensures there is at most one comma.
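The two suggested patterns differ in what they count, which a short Python sketch makes visible. The names CHAR_LIMIT and DIGIT_LIMIT are hypothetical labels for the positive-lookahead pattern (the comma counts toward the 16) and the negative-lookahead pattern (up to 16 digits plus an optional comma).

```python
import re

# Positive lookahead: at most 16 characters in total, comma included.
CHAR_LIMIT = re.compile(r"^(?=[\d,]{1,16}$)(?:\d+(?:,\d+)?)$")
# Negative lookahead: at most 16 digits, with one optional comma on top.
DIGIT_LIMIT = re.compile(r"^(?!(?:\d{17,}$|[,\d]{18,}$))(?:\d+(?:,\d+)?)$")

should_match = ["123,34", "1,33333333", "1222222233"]
should_fail = ["1111,1111,1111", ",11111", "11111,",
               "11111111111111111111111111111,111111111111111"]

for pattern in (CHAR_LIMIT, DIGIT_LIMIT):
    print([bool(pattern.match(s)) for s in should_match],
          [bool(pattern.match(s)) for s in should_fail])

# 16 digits plus a comma is 17 characters: only DIGIT_LIMIT accepts it.
edge_case = "12345678,12345678"
print(bool(CHAR_LIMIT.match(edge_case)), bool(DIGIT_LIMIT.match(edge_case)))
```

Both patterns agree on the examples from the question; they only disagree on inputs like the edge case, so the choice depends on whether the comma should count toward the limit.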
Can I use
\d\d\d\d[^\d]
to check for four consecutive numbers?
For example,
411112 OK
455553 OK
1200003 OK
f44443 OK
g55553 OK
3333 OK
f4442 No
45553 No
f4444g4444 No
f44444444 No
If you want to find any series of 4 digits in a string /\d\d\d\d/ or /\d{4}/ will do. If you want to find a series of exactly 4 digits surrounded by non-digit characters, use /[^\d]\d{4}[^\d]/. If the string should consist of exactly 4 digits and nothing else, use /^\d{4}$/.
Edit: I think you want to find 4 of the same digits, you need a backreference for that. /(\d)\1{3}/ is probably what you're looking for.
Edit 2: /(^|(.)(?!\2))(\d)\3{3}(?!\3)/ will only match strings with exactly 4 of the same consecutive digits.
The first group matches the start of the string or any character (the character itself is captured as the second group). Then there's a negative look-ahead that uses that second group to ensure that the following character doesn't match the preceding character, if any. The third group matches any digit, which is then repeated 3 times with a backreference to group 3. Finally there's a look-ahead that ensures that the following character doesn't match the series of consecutive digits.
This sort of stuff is difficult to do in JavaScript because you don't have things like forward references and look-behind.
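For comparison, here is a sketch of the two backreference patterns from this answer in Python, whose re module accepts the same syntax. The names ANY_FOUR_SAME, EXACTLY_FOUR_SAME and has_exact_run are hypothetical.

```python
import re

# Four identical digits in a row anywhere in the string.
ANY_FOUR_SAME = re.compile(r"(\d)\1{3}")
# Four identical digits that are neither preceded nor followed by the same digit.
EXACTLY_FOUR_SAME = re.compile(r"(^|(.)(?!\2))(\d)\3{3}(?!\3)")

def has_exact_run(text: str) -> bool:
    """Return True if text contains a run of exactly four identical digits."""
    return EXACTLY_FOUR_SAME.search(text) is not None

for sample in ["411112", "455553", "1200003", "f44443", "g55553", "3333",
               "f4442", "45553", "f44444444", "f4444g4444"]:
    print(sample, has_exact_run(sample))
```

One caveat worth knowing: f4444g4444 is accepted by this pattern, because each run of four is bounded by a different character, even though the question lists it as a non-match. Rejecting it would need an extra rule, such as allowing only one run per string.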
Should the numbers be part of a string, or do you want only the four numbers? In the latter case, the regexp should be ^\d{4}$. The ^ marks the beginning of the string, $ the end. That makes sure that only four numbers are valid, with nothing before or after them.
That should match four digits (\d\d\d\d) followed by a non-digit character ([^\d]). If you just want to match any four digits, you should use \d\d\d\d or \d{4}. If you want to make sure that the string contains just four consecutive digits, use ^\d{4}$. The ^ will instruct the regex engine to start matching at the beginning of the string while the $ will instruct the regex engine to stop matching at the end of the string.