regex - match pattern of alternating characters - regex

I want to match patterns of alternating lowercase characters.
ababababa -> match
I tried this
([a-z][a-z])+[a-z]
but this would be a match too
ababxyaba

You can use this regex with 2 back-references to match alternating lowercase letters:
^([a-z])(?!\1)([a-z])(?:\1\2)*\1?$
RegEx Demo
RegEx Breakup:
^: Start
([a-z]): Match first letter in capturing group #1
(?!\1): Lookahead to make sure we don't match same letter again
([a-z]): Match second letter in capturing group #2
(?:\1\2)*: Match zero or more pairs of first and second letter
\1?: Match optional first letter before end
$: End
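As a quick check, the pattern above can be run against a few strings. This is only a sketch: the sample inputs other than ababababa and ababxyaba are made up for illustration, and the variable names are hypothetical.

```javascript
// Pattern from the answer above: two different lowercase letters, strictly alternating.
const alternating = /^([a-z])(?!\1)([a-z])(?:\1\2)*\1?$/;

// Sample inputs; only the first and the fourth come from the question.
const alternatingSamples = ["ababababa", "abab", "ab", "ababxyaba", "aaaa", "a"];

// Pair every input with the test result.
const alternatingResults = alternatingSamples.map(s => [s, alternating.test(s)]);

alternatingResults.forEach(([s, ok]) => {
  console.log(`${s}: ${ok ? "match" : "no match"}`);
});
```

Note that a single letter such as a is rejected, because the pattern requires at least one pair of different letters.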

Related

Regex to negate certain characters

I need some help with regex to extract specific characters from a given string. Following is the string and required string to extract.
21.4R3.13-EVO -- Required 21.4R3-EVO
24.1R13.13-EVO -- Required 24.1R13-EVO
21.4R2-S1-20190223000107.0-EVO -- Required 21.4R2-EVO
19.4R11-S1-20190223000107.0-EVO -- Required 19.4R11-EVO
I tried the following and it matches the first half, but I couldn't negate the middle section.
[0-9]+.[0-9]+[A-Z][0-9]+
Matching
21.4R3
You can use this regex for search:
^(\d+\.[a-zA-Z\d]+)[-.].*(-[a-zA-Z]+)$
and replace with $1$2.
RegEx Demo
RegEx Breakup:
^: Start
(\d+\.[a-zA-Z\d]+): Match 1+ digits then a dot followed by 1+ alpha numeric character. Capture this in group #1
[-.].*: Match a - or a . followed by 0 or more of any character (greedy)
(-[a-zA-Z]+): Match last - followed by 1+ letters and capture in group #2
$: End
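The search-and-replace above can be sketched in JavaScript, where String.prototype.replace accepts the same $1$2 replacement syntax. The four inputs are the ones from the question; the variable names are hypothetical.

```javascript
// Search pattern from the answer above.
const versionPattern = /^(\d+\.[a-zA-Z\d]+)[-.].*(-[a-zA-Z]+)$/;

// The four version strings listed in the question.
const versions = [
  "21.4R3.13-EVO",
  "24.1R13.13-EVO",
  "21.4R2-S1-20190223000107.0-EVO",
  "19.4R11-S1-20190223000107.0-EVO"
];

// Keep group 1 (the release part) and group 2 (the trailing suffix), drop the middle.
const shortened = versions.map(v => v.replace(versionPattern, "$1$2"));

shortened.forEach((v, i) => console.log(`${versions[i]} -> ${v}`));
```

Because [a-zA-Z\d]+ cannot consume a dot or a hyphen, group 1 stops at the first separator, and the greedy .* backtracks just far enough to leave the last -letters suffix for group 2.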

pcre2 - Regex match to distinct letters only

I am trying to create a regex that matches the following criteria below
first letter is uppercase
remaining 5 letters following the first letter are lowercase
ends in ".com"
no letter repeats (this applies only to the part before the ".com")
no digits
there are only 5 lowercase letters before the .com, with only the first letter being uppercase
The above criteria should match to strings such as:
Amipsa.com
Ipsamo.com
I created the regex below, but it seems to match strings with repeating letters - examples here: https://regex101.com/r/wwJBmc/1
^([A-Z])(?![a-z]*\1)(?:([a-z])\1(?!\2)(?:([a-z])(?![a-z]*\3)){3}|(?:([a-z])(?![a-z]*\4)){5})\.com$
Would appreciate any insight.
You may use this regex in PCRE with a negative lookahead:
^(?!(?i)[a-z]*([a-z])[a-z]*\1)[A-Z][a-z]{5}\.com$
Updated RegEx Demo
RegEx Details:
^: Start
(?!: Start negative lookahead
(?i): Enable ignore case modifier
[a-z]*: Match 0 or more letters
([a-z]): Match a letter and capture in group #1
[a-z]*: Match 0 or more letters
\1: Match same letter as in capture group #1
): End negative lookahead
[A-Z][a-z]{5}: Match an uppercase letter followed by 5 lowercase letters
\.com: Match .com
$: End
You might use
^(?i)[a-z]*?([a-z])[a-z]*?\1(?-i)(*SKIP)(*F)|[A-Z][a-z]{5}\.com$
Regex demo
Explanation
^ Start of string
(?i) Case insensitive match
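[a-z]*?([a-z])[a-z]*?\1 Match a letter that occurs twice (a-z or A-Z, since matching is case insensitive here)
(?-i) Turn off case insensitive matching
(*SKIP)(*F) Force a failure and skip the text consumed so far, so a string with a repeated letter is rejected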
[A-Z][a-z]{5} Match A-Z and 5 chars a-z
\.com Match .com
$ End of string
Another idea, if you are using JavaScript and cannot use (?i), is to use 2 patterns: one with the case insensitive flag /i to check that no character is repeated, and one for the full match.
const rFullMatch = /^[A-Z][a-z]{5}\.com$/;
const rRepeatedChar = /^[a-z]*([a-z])[a-z]*\1/i;
[
  "Regxas.com",
  "Ipsamo.com",
  "Plpaso.com",
  "Amipsa.com",
  "Ipsama.com",
  "Ipsima.com",
  "IPszma.com",
  "Ipsamo&.com",
  "abcdef.com"
].forEach(s => {
  if (!rRepeatedChar.test(s) && rFullMatch.test(s)) {
    console.log(`Match: ${s}`);
  }
});

Regex to match the letter string group between 2 numbers

Is it possible to match only the letter from the following string?
RO41 RNCB 0089 0957 6044 0001 FPS21098343
What I want: FPS
What I'm trying: [0-9]{4}\s*\S+\s+(\S+)
What I get: FPS21098343
Any help is much appreciated! Thanks.
You can try with this:
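// Avoid naming a variable String: that would shadow the built-in String object.
var str = "0258 6044 0001 FPS21098343";
var reg = /^(?:\d{4} )+ *([a-zA-Z]+)(?:\d+)$/;
var match = reg.exec(str);
console.log(match);    // the full match array
console.log(match[1]); // group 1, the letters: FPS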
You can match up to the first one or more letters in the following way:
^[^a-zA-Z]*([A-Za-z]+)
^.*?([A-Za-z]+)
^[\w\W]*?([A-Za-z]+)
(?s)^.*?([A-Za-z]+)
If the tool treats ^ as the start of a line, replace it with \A, which always matches the start of the string.
The point is to match
^ / \A - start of string
[^a-zA-Z]* - zero or more chars other than letters
([A-Za-z]+) - capture one or more letters into Group 1.
The .*? part matches any text (as short as possible) before the subsequent pattern(s). (?s) makes . match line break chars.
Replace A-Za-z in all the patterns with \p{L} to match any Unicode letters. Also, note that [^\p{L}] = \P{L}.
To grep all the groups of letters that go in a row in any place in the string you can simply use:
([a-zA-Z]+)
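A minimal JavaScript sketch of these patterns follows. JavaScript has no \A or inline (?s), so only the ^-anchored forms are shown; \p{L} requires the u flag there. The second sample string and all variable names are assumptions made for illustration.

```javascript
// First run of ASCII letters, skipping any leading non-letters.
const firstLetters = /^[^a-zA-Z]*([A-Za-z]+)/;
// Unicode-aware variant: \P{L} is the negation of \p{L}; needs the u flag.
const firstUnicodeLetters = /^\P{L}*(\p{L}+)/u;
// Every run of ASCII letters anywhere in the string.
const allLetterRuns = /[a-zA-Z]+/g;

// Shortened sample string from the first answer.
const firstRun = "0258 6044 0001 FPS21098343".match(firstLetters)[1];

// Hypothetical input with a non-ASCII letter (\u00C9 is an accented capital E).
const firstUnicodeRun = "123 \u00C9mile42".match(firstUnicodeLetters)[1];

// Full string from the question: with /g, match() returns all runs.
const letterRuns = "RO41 RNCB 0089 0957 6044 0001 FPS21098343".match(allLetterRuns);

console.log(firstRun);        // FPS
console.log(firstUnicodeRun); // the letters before 42
console.log(letterRuns);      // [ 'RO', 'RNCB', 'FPS' ]
```

On the full string from the question the first run of letters is RO, not FPS, so the "first letters" patterns only give FPS when the input starts with the digit groups.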
You could use a capture group to get FPS:
\b[0-9]{4}\s+\S+\s+([A-Z]+)
The pattern matches:
\b[0-9]{4} A word boundary to prevent a partial match, then match 4 digits
\s+\S+\s+ Match 1+ whitespace chars, 1+ non-whitespace chars, then 1+ whitespace chars
([A-Z]+) Capture group 1, match 1+ chars A-Z
Regex demo
If the chars have to be followed by digits till the end of the string, you can add \d+$ to the pattern:
\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$
Regex demo
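Both capture-group patterns can be tried on the string from the question. A small sketch, with hypothetical variable names:

```javascript
// String from the question.
const accountString = "RO41 RNCB 0089 0957 6044 0001 FPS21098343";

// Capture the letters that follow "4 digits, whitespace, a non-whitespace run, whitespace".
const lettersAfterDigits = /\b[0-9]{4}\s+\S+\s+([A-Z]+)/;
// Same, but the letters must be followed by digits up to the end of the string.
const lettersThenDigitsToEnd = /\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$/;

const looseMatch = accountString.match(lettersAfterDigits);
const strictMatch = accountString.match(lettersThenDigitsToEnd);

// match() returns null when nothing matches, so guard before reading group 1.
console.log(looseMatch ? looseMatch[1] : null);   // FPS
console.log(strictMatch ? strictMatch[1] : null); // FPS
```

The engine first tries 0089 and 0957 as the 4-digit start, fails because the third field is not uppercase letters, and only succeeds when starting at 6044.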

Regex - Words with 2 different letters repeated twice followed by a number repeated 6 times

I need a regex that matches a string like LLMM222222. I tried a pattern like (\w{2})(\w{2})2{6} but it does not work.
You can use this regex with 3 back-references:
^([A-Za-z])\1([A-Za-z])\2(\d)\3{5}$
RegEx Demo
RegEx Breakup:
^: Start
([A-Za-z]): Match a letter and capture it in group #1
\1: Make sure we repeat same letter using a back-reference #1
([A-Za-z]): Match a letter and capture it in group #2
\2: Make sure we repeat same letter using a back-reference #2
(\d): Match and capture a digit in capture group #3
\3{5}: Make sure we repeat same digit 5 more times using a back-reference #3
$: End
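A short sketch to exercise the pattern above. All inputs except LLMM222222 are invented for illustration. The pattern as written does not require the two letters to differ, so the second regex below is a hypothetical variant (not part of the answer) that adds a negative lookahead for that.

```javascript
// Pattern from the answer above: doubled letter, doubled letter, one digit six times.
const pairsThenDigits = /^([A-Za-z])\1([A-Za-z])\2(\d)\3{5}$/;

// Assumed variant: (?!\1) additionally rejects a second pair equal to the first.
const distinctPairsThenDigits = /^([A-Za-z])\1(?!\1)([A-Za-z])\2(\d)\3{5}$/;

// Made-up inputs, except for the first one.
const pairSamples = ["LLMM222222", "aaBB777777", "LLMM222223", "LMLM222222", "LLMM22222", "LLLL222222"];

pairSamples.forEach(s => {
  console.log(`${s}: ${pairsThenDigits.test(s)} / distinct: ${distinctPairsThenDigits.test(s)}`);
});
```

Only LLLL222222 gives different results for the two patterns: the original accepts it, the variant rejects it.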

Regex for matching groups but excluding a specific combination of groups

I'm trying to match two groups in an expression, each group represents a single letter in initials as part of a name, for example in George R. R. Martin the first group would match the first R and the second group would match the second R, I have something like this:
\b([a-zA-Z])[\.{0,1} {0,1}]{1,2}([a-zA-Z])\b
However, I'd like to exclude a specific combination of those groups, say when the first group matches the letter d and the second group matches the letter r.
Is that possible?
You may restrict matches with a negative lookahead:
\b(?![dD]\.? ?[rR]\b)([a-zA-Z])\.? ?([a-zA-Z])\b
  ^^^^^^^^^^^^^^^^^^^
See the regex demo
Note:
The (?![dD]\.? ?[rR]\b) lookahead is best placed right after the word boundary, so that the check is only triggered at a word boundary, not at every location in the string
The lookahead is negative, it fails the match if the pattern inside it matches the text
It matches: a d or D with [dD], then an optional literal dot with \.?, an optional space with " ?" (a space followed by ?), an r or R with [rR] and a trailing word boundary \b.
The main pattern is a more generic pattern - \b([a-zA-Z])\.? ?([a-zA-Z])\b:
\b - leading word boundary
(?![dD]\.? ?[rR]\b) - the negative lookahead
([a-zA-Z]) - Group 1 capturing an ASCII letter
\.? - an optional dot
" ?" - an optional space (a space character followed by ?)
([a-zA-Z]) - Group 2 capturing an ASCII letter
\b - a trailing word boundary
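To see the exclusion in action, here is a small sketch that runs the pattern on a few names. Apart from George R. R. Martin, the names are assumptions picked for illustration, as are the helper and variable names.

```javascript
// Pattern from the answer above: two initials, unless they are d/D followed by r/R.
const initials = /\b(?![dD]\.? ?[rR]\b)([a-zA-Z])\.? ?([a-zA-Z])\b/;

// Hypothetical helper: returns the two captured letters, or null when nothing matches.
function findInitials(text) {
  const m = text.match(initials);
  return m ? [m[1], m[2]] : null;
}

console.log(findInitials("George R. R. Martin")); // [ 'R', 'R' ]
console.log(findInitials("J. K. Rowling"));       // [ 'J', 'K' ]
console.log(findInitials("Dr. Smith"));           // null, excluded by the lookahead
console.log(findInitials("D. R. Horton"));        // null, excluded by the lookahead
```

In George R. R. Martin the attempt at G fails because e is not followed by a word boundary, so the first successful match starts at the first R.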