My regular expression = '(\d+)\1+'
My aim is to capture repeating patterns such as 2323, 1212, 345345, which have different digits. The current regex also captures 11, 22, 11111, which I need to exclude.
Example -
For the input = 44556841335158684945454545
Matches are
44
55
45454545
Matches should be -
45454545
How do I write a regex which excludes 44 and 55 and gives only results which have different digits?
Here is the regex I believe you want:
(\d)((?!\1)\d)
A bit of explanation:
(\d) - first capturing group
  \d - matches a digit (equal to [0-9])
((?!\1)\d) - second capturing group
  (?!\1) - negative lookahead: asserts that the regex below does not match
    \1 - matches the same text as most recently matched by the 1st capturing group
  \d - matches a digit (equal to [0-9])
Here is a quick JS demo:
var s = "44556841335158684945454545"
console.log(s.match(/(\d)((?!\1)\d)/g))
To say "two different numbers repeated" you can try
((\d)(?!\2)\d)\1
Capturing parentheses are numbered from the left; so \1 matches the entire outer pair of parentheses, and (?!\2) refers to the inner parentheses around the first digit, constraining the second digit so that it cannot be identical to the first.
Demo: https://regex101.com/r/5f2CEf/1
To cover all adjacent repetitions of the match, add a + at the end: ((\d)(?!\2)\d)\1+
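As a quick check of the pattern above with the trailing + added, here is a minimal JS sketch run against the input from the question. The second pattern, anyLengthRepeat, is not from either answer: it is a hypothetical generalization for units longer than two digits (such as 345345), and it only requires that the repeated unit contains some digit different from its first digit.

```javascript
// Pattern from the answer above, with the trailing + added:
// a unit of two different digits, followed by one or more copies of itself.
const twoDigitRepeat = /((\d)(?!\2)\d)\1+/g;

// Hypothetical generalization (an assumption, not from the answers):
// the unit may have any length, but must contain a digit that differs
// from its first digit, so units like "1" or "11" are rejected.
const anyLengthRepeat = /((\d)\d*(?!\2)\d\d*)\1+/g;

const input = "44556841335158684945454545";

console.log(input.match(twoDigitRepeat));     // one match: "45454545"
console.log(input.match(anyLengthRepeat));    // one match: "45454545"
console.log("345345".match(twoDigitRepeat));  // null, the unit has three digits
console.log("345345".match(anyLengthRepeat)); // one match: "345345"
console.log("11111".match(anyLengthRepeat));  // null, all digits are the same
```

The generalized pattern relies on backtracking to find a unit length that repeats, so it may be slow on very long digit runs.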
I'm working on a regular expression for SSN with the rules below. I have successfully applied all matching rules except #7. Can someone help alter this expression to include the last rule, #7:
^((?!000|666)[0-8][0-9]{2}-(?!00)[0-9]{2}-(?!0000)[0-9]{4}$|(?!000|666)[0-8][0-9]{2}(?!00)[0-9]{2}(?!0000)[0-9]{4}$)
1. Hyphens should be optional (this is handled above by using 2 expressions with an OR)
2. Cannot begin with 000
3. Cannot begin with 666
4. Cannot begin with 900-999
5. Middle digits cannot be 00
6. Last four digits cannot be 0000
7. Cannot be all the same number, e.g. 111-11-1111 or 111111111
Add the following negative look ahead anchored to start:
^(?!(.)(\1|-)+$)
This captures the first character then asserts the rest of the input is not made of that captured char or hyphen.
The whole regex can be shortened to:
^(?!(.)(\1|-)+$)(?!000|666|9..)(?!...-?00)(?!.*0000$)\d{3}(-?)\d\d\3\d{4}$
The main trick to avoid repeating the regex both with and without the hyphens is to capture the optional hyphen (as group 3), then use a back reference \3 to that capture at the next hyphen position, so the hyphens are either both there or both absent.
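To see the rules in action, here is a small JS sketch that runs the shortened pattern against one sample per rule. The sample values are made up for illustration.

```javascript
// Shortened pattern from the answer above. Group 3 captures the optional
// first hyphen; \3 then requires the same text (a hyphen or nothing) at the
// second hyphen position.
const ssn = /^(?!(.)(\1|-)+$)(?!000|666|9..)(?!...-?00)(?!.*0000$)\d{3}(-?)\d\d\3\d{4}$/;

// Hypothetical sample inputs, one per rule.
const samples = [
  "123-45-6789", // accepted: with hyphens
  "123456789",   // accepted: without hyphens
  "123-456789",  // rejected: only one hyphen
  "111-11-1111", // rejected: all the same digit
  "666-12-3456", // rejected: begins with 666
  "900-12-3456", // rejected: begins with 900-999
  "123-00-4567", // rejected: middle digits are 00
  "123-45-0000", // rejected: last four digits are 0000
];

for (const s of samples) {
  console.log(s, ssn.test(s));
}
```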
First, let's shorten the pattern, as it contains two nearly identical alternatives, one matching an SSN with hyphens and the other matching an SSN without hyphens. Instead of a ^(x-y-z$|xyz$) pattern, you can use a ^x(-?)y\1z$ pattern, so your regex can be reduced to ^(?!000|666)[0-8][0-9]{2}(-?)(?!00)[0-9]{2}\1(?!0000)[0-9]{4}$
To make a pattern never match a string that contains only identical digits, you may add the following negative lookahead right after ^:
(?!\D*(\d)(?:\D*\1)*\D*$)
It fails the match if there are
\D* - zero or more non-digits
(\d) - a digit (captured in Group 1)
(?:\D*\1)* - zero or more occurrences of zero or more non-digits followed by the same digit as in Group 1, and then
\D*$ - zero or more non-digits till the end of string.
Now, since I suggested shortening the regex to the pattern with backreference(s), you will have to adjust the backreferences after adding this lookahead.
So, your solution looks like
^(?!\D*(\d)(?:\D*\1)*\D*$)(?!000|666)[0-8]\d{2}(-?)(?!00)\d{2}\2(?!0000)\d{4}$
^(?![^0-9]*([0-9])(?:[^0-9]*\1)*[^0-9]*$)(?!000|666)[0-8][0-9]{2}(-?)(?!00)[0-9]{2}\2(?!0000)[0-9]{4}$
Note the \1 in the pattern without the lookahead turned into \2 as (-?) became Group 2.
Note also that in some regex flavors \d is not equal to [0-9].
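The renumbering can be checked with a short JS sketch. It first tests the lookahead body on its own (as a plain pattern named allSameDigit, a name chosen here for illustration), then shows that the optional hyphen ends up in group 2 of the full pattern.

```javascript
// The body of the negative lookahead, used as a standalone pattern:
// it matches only when every digit in the string is the same.
const allSameDigit = /^\D*(\d)(?:\D*\1)*\D*$/;
console.log(allSameDigit.test("111-11-1111")); // true
console.log(allSameDigit.test("111-11-1112")); // false

// Full pattern from the answer above. Group 1 sits inside the lookahead,
// so the optional hyphen is group 2 and the backreference is \2.
const ssn = /^(?!\D*(\d)(?:\D*\1)*\D*$)(?!000|666)[0-8]\d{2}(-?)(?!00)\d{2}\2(?!0000)\d{4}$/;

console.log(ssn.exec("123-45-6789")[2]); // "-"
console.log(ssn.exec("123456789")[2]);   // "" (empty string)
console.log(ssn.test("111-11-1111"));    // false
console.log(ssn.test("123-456789"));     // false, hyphens must be both or neither
```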
I'd like to take an xpath-like string such as:
a.b.c[2].d[123].e1[4].f88[5]
And have each path-part as a match, with each subscript ("array index") as a group, like this:
match 1: a
match 2: b
match 3: c, group 1: 2
match 4: d, group 1: 123
match 5: e1, group 1: 4
match 6: f88, group 1: 5
I tried with the following (which doesn't work):
[^.]+(?:\[)*([0-9]+)*(?:\])*
As I understand this Regex, it means:
1. First, match all characters except for a dot.
2. Then, check (but don't capture) for a left square bracket - it may be present 0 to unlimited times.
3. Then, check for any number, with length 1 to unlimited - and capture it as a group.
4. Then, do step 2 again for a right square bracket.
But it doesn't work.
How can I make it work?
[^.]+(?:\[)*([0-9]+)*(?:\])*
"But it doesn't work" because [^.]+ is greedy and consumes every character up to the next dot, brackets and index included. Furthermore, each subscript should be optional as a whole, rather than part by part.
Applying those criteria, this expression does work:
([^.\[]+)(?:\[(\d+)\])?
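A minimal JS sketch of how this expression could be applied to the example string. The shape of the result objects (name and index properties) is an assumption made for illustration.

```javascript
const path = "a.b.c[2].d[123].e1[4].f88[5]";

// Group 1: the path part (anything except a dot or an opening bracket).
// Group 2: the subscript, only present when the whole [digits] unit is there.
const part = /([^.\[]+)(?:\[(\d+)\])?/g;

const parts = [...path.matchAll(part)].map((m) => ({
  name: m[1],
  index: m[2] === undefined ? null : Number(m[2]),
}));

console.log(parts);
// six entries: a, b, c (index 2), d (index 123), e1 (index 4), f88 (index 5)
```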
The pattern that you tried matches too much: the negated character class [^.]+ matches any char except a dot 1 or more times, so it can also match square brackets.
Note that the notation (?:\[)* is the same as \[* and matches an opening square bracket 0 or more times.
If the \G anchor is supported, and you want to match the example string only from the start of the string, you might use 2 capture groups for the data that you want, and match the dots and square brackets in between.
\G([^\][.\s]+)(?:\[(\d+)\])?\.?
The pattern matches:
\G Assert the position at the end of the previous match, or at the start of the string
([^\][.\s]+) Capture group 1, match 1+ char other than ] [ . or a whitespace char (as there do not seem to be any spaces in the example string)
(?:\[(\d+)\])? Optionally match capture group 2 between matching square brackets
\.? Match an optional dot to continue the consecutive matching for the \G anchor
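JavaScript does not support the \G anchor, so the pattern above cannot be used there as written. As a rough sketch, the sticky flag (y) can play a similar role, because it forces each match to start exactly where the previous one ended. The parsePath helper below is hypothetical and returns null when the matches do not cover the whole string.

```javascript
// Assumption: the y (sticky) flag stands in for \G, which JavaScript lacks.
function parsePath(path) {
  const step = /([^\][.\s]+)(?:\[(\d+)\])?\.?/gy;
  const parts = [];
  let consumed = 0;
  for (const m of path.matchAll(step)) {
    parts.push([m[1], m[2] ?? null]);
    consumed = m.index + m[0].length; // end of the contiguous run so far
  }
  // Reject inputs where the contiguous matches stop before the end.
  return consumed === path.length ? parts : null;
}

console.log(parsePath("a.b.c[2].d[123].e1[4].f88[5]"));
console.log(parsePath("a..b")); // null, matching stops at the second dot
```

Like the original pattern, this sketch accepts a trailing dot.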
If there can not be a dot at the end of the string, and there must be at least 1 dot present, you can assert the whole format first from the start of the string:
(?:^(?=[^.]+(?:\.[^.]+)+$)|\G(?!^))\.?([^\][.]+)(?:\[(\d+)\])?
How can I match any number followed by h, t or l, like 1h, 126h or 1268h, but not 1.1h, 12.6h or 12.68h, in a given paragraph?
I am writing an application that replaces 1h with 100 or 1268h with 126800, so that instead of typing 00 a person can simply put h after a number. However, my pattern also matches decimal numbers.
The pattern that I wrote is (\d+)(h|t|l)
You could use a whitespace boundary to the left, (?<!\S), if a negative lookbehind is supported, or use anchors to match the whole line.
The alternation can be written as a character class [htl]
(?<!\S)(\d+)([htl])
(?<!\S) Negative lookbehind, assert that what is on the left is not a non-whitespace char
(\d+) Capture group 1, match 1+ digits
([htl]) Capture group 2, match either h, t or l
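Here is a minimal JS sketch of the replacement the question describes, using the lookbehind pattern above. Only the h expansion (two zeros) comes from the question; the expansions for t and l are guesses (thousand and lakh) and the expandShorthand helper is hypothetical.

```javascript
// Assumed expansions: only h -> "00" is stated in the question.
// The values for t and l are guesses for illustration.
const zeros = { h: "00", t: "000", l: "00000" };

function expandShorthand(text) {
  // (?<!\S) rejects a number that is preceded by any non-whitespace char,
  // which excludes the fractional part of 12.6h.
  return text.replace(/(?<!\S)(\d+)([htl])/g, (_, digits, unit) => digits + zeros[unit]);
}

console.log(expandShorthand("pay 1h now, 1268h later, but keep 12.6h"));
// pay 100 now, 126800 later, but keep 12.6h
```

The pattern has no right-hand boundary, so a word such as 3hours would also be rewritten; adding \b after the character class is one possible way to prevent that.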
Using anchors to match the whole line
^(\d+)([htl])$
Without a lookaround, you could match either a whitespace char or the start of the string (?:\s|^) for example:
(?:\s|^)(\d+)([htl])
I have a text with a number that contains dots:
text 304.33.44.52.03.001 text
where I want to capture the number including the dots:
304.33.44.52.03.001
The following regex will capture several groups:
(\d+\.?)
Resulting in:
304.
33.
44.
...
What is the correct syntax to return the entire number including dots in one result?
\d+\.? matches 1+ digits and then an optional . char.
You need to use either
\d+(?:\.\d+)*
or
\d[\d.]*
The \d+(?:\.\d+)* pattern matches
\d+ - 1+ digits
(?:\.\d+)* - 0 or more occurrences of a . and then 1+ digits. (?:...) is a non-capturing group that is used to group 2 patterns and set a quantifier on their sequence.
The \d[\d.]* pattern matches a digit first, and then tries to match 0 or more digits or . chars. Unlike the first pattern, it also accepts consecutive or trailing dots.
In regex engines that do not support \d you need to use a safer pattern, a bracket expression [0-9].
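A short JS sketch comparing the two patterns on the text from the question. The second input, "version 1.2.", is a made-up example to show where the two patterns differ.

```javascript
const text = "text 304.33.44.52.03.001 text";

const strict = /\d+(?:\.\d+)*/; // dots only between digit runs
const loose = /\d[\d.]*/;       // a digit, then any mix of digits and dots

console.log(text.match(strict)[0]); // 304.33.44.52.03.001
console.log(text.match(loose)[0]);  // 304.33.44.52.03.001

// Hypothetical input with a trailing dot: only the loose pattern keeps it.
console.log("version 1.2.".match(strict)[0]); // 1.2
console.log("version 1.2.".match(loose)[0]);  // 1.2.
```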
I need a regex that matches a 4-digit number only if its digits are not all the same. I tried different regexes which I found here, but they are not working.
for example:
1111 = false
1112 = true
It's my homework so I must do it in regex :)
You can use this regex:
^(\d)(?!\1+$)\d{3}$
Explanation:
^ - Match line start
(\d) - match first digit and capture it in back reference #1 i.e. \1
(?!..) is a negative lookahead
(?!\1+$) means disallow the match if the first digit is followed only by the same digit (the captured group) until the end.
\d{3}$ matches the next 3 digits followed by line end
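A quick JS check of this regex against the examples from the question, plus a few extra inputs chosen here for illustration:

```javascript
const notAllSame = /^(\d)(?!\1+$)\d{3}$/;

// 1111 and 1112 come from the question; the other inputs are made up.
for (const s of ["1111", "1112", "1211", "123", "12345"]) {
  console.log(s, notAllSame.test(s));
}
// 1111 false, 1112 true, 1211 true, 123 false, 12345 false
```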
How about this?
(?=^\d{4}$)(\d)+(?!\1)\d\d*
The first look-ahead group (?=^\d{4}$) insists that the whole string consists of 4 digits.
The first capture group is then repeated to match one or more digits: (\d)+. After each repetition, \1 holds the digit matched last.
After this, there must be a digit that is different from the one held in the first capture group: (?!\1)\d
Finally, there can be any number of digits trailing: \d*
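The backtracking behaviour can be observed in a small JS sketch. Because the capture group is repeated, group 1 ends up holding the digit just before the first position where the following digit differs from it, found by backing off from the right. The inputs other than 1111 and 1112 are made up.

```javascript
const hasDifferentDigit = /(?=^\d{4}$)(\d)+(?!\1)\d\d*/;

console.log(hasDifferentDigit.test("1111"));    // false
console.log(hasDifferentDigit.test("1112"));    // true
console.log(hasDifferentDigit.test("12345"));   // false, not exactly 4 digits

// Group 1 holds the last digit matched by the repeated group.
console.log(hasDifferentDigit.exec("1112")[1]); // "1"
console.log(hasDifferentDigit.exec("1211")[1]); // "2"
```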