Check odd number of a certain character - regex

For Uni, I need to write a method with a string as parameter which checks if the string has an even number of a's in it. Normally I had sequences like this:
baaaaaad which would then be easy to figure out with the RegEx (.*)(aa)*(.*)
But now they look like this:
baadaafaag
And I have no clue how to do this since there are other characters separating the a's.

Try this one for a simpler solution
^([^a]*(a{2})*[^a]*)*$
It checks for groups of 2 "a"s delimited by non-"a"s
bad no match
baad match
baaad no match
baaaad match
baaaaad no match
baaaaaad match
baadaafaag match
baadaaaaag no match
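As a quick check of the pattern above, here is a minimal Python sketch (Python is an assumption, since the question does not name a language). It runs the pattern against the inputs listed. Note that the pattern requires every run of a's to have even length. EVEN_TOTAL is a hypothetical variant that is not part of the answer; it accepts any string whose total number of a's is even, however they are spread out.

```python
import re

# Pattern from the answer: every run of a's must have even length.
PAIRED_RUNS = re.compile(r'^([^a]*(a{2})*[^a]*)*$')

# Hypothetical variant (not from the answer): the total count of a's is even,
# regardless of where the a's sit in the string.
EVEN_TOTAL = re.compile(r'^[^a]*(?:a[^a]*a[^a]*)*$')


def has_paired_runs(s):
    """True if all a's in s occur in runs of even length."""
    return PAIRED_RUNS.match(s) is not None


def has_even_total(s):
    """True if s contains an even number of a's overall."""
    return EVEN_TOTAL.match(s) is not None


if __name__ == "__main__":
    for word in ["bad", "baad", "baaad", "baadaafaag", "baadaaaaag", "abab"]:
        print(word, has_paired_runs(word), has_even_total(word))
```

The two patterns differ on an input like abab: it has two a's in total, but each run has length one, so only the variant accepts it.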

Just use this: [a-z]*aa+[a-z]*aa+[a-z]*
Here [a-z]* matches zero or more characters, and aa+ matches an a followed by at least one more a, that is, aa or longer.
The inner [a-z]* allows any number of characters between the pairs of aa.
The outer [a-z]* allows any number of characters before the first aa and after the last one.

Related

What would be the Regex expression to get the first letter after a group of character and some integers?

I have a string that has the following structure:
ABCD123456EFGHIJ78 but sometimes it's missing a number or a character like:
ABC123456EFGHIJ78 or
ABCD123456E or
ABCD12345EFGHIJ78
etc.
That's why I need regular expressions.
What I want to extract is the first letter of the third group, in this case 'E'.
I have the following regex:
(\D+)+(\d+)+(\D{1})\3
but I don't get the letter E.
This seems to work for the example cases you provided.
^(?:[A-Za-z]+)(?:\d+)(.)
It assumes that the first group is only letters and that the second group is only digits.
There's already a nice answer.
But for the record, your initial proposal was very close to working. You just needed to say that the character matched by the 3rd group can repeat several times by adding a star:
^(\D+)(\d+)(\D{1})\3*
The main weakness is that \D matches any character except digits, including spaces. Making it more robust means spelling out the accepted character ranges explicitly:
^([A-Za-z]+)(\d+)([A-Za-z]{1})\3*
It's much better, but my favourite uses \w at the end of the pattern to match any remaining word characters (letters, digits, or underscores):
([A-Za-z]+)(\d+)([A-Za-z]{1})\w*
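To see the patterns from these answers in action, here is a small Python sketch (the language is an assumption; the question does not state one). The function name first_letter_of_third_group is made up for illustration. It uses the explicit-range pattern and returns the captured letter, or None when the string does not have the letters-digits-letter shape.

```python
import re

# Pattern from the first answer: the letter is the only capturing group.
FIRST_ANSWER = re.compile(r'^(?:[A-Za-z]+)(?:\d+)(.)')

# Explicit-range pattern from the later answer: the letter is group 3.
EXPLICIT_RANGES = re.compile(r'^([A-Za-z]+)(\d+)([A-Za-z]{1})\w*')


def first_letter_of_third_group(s):
    """Return the first letter after the leading letters and digits, or None."""
    m = EXPLICIT_RANGES.match(s)
    return m.group(3) if m else None


if __name__ == "__main__":
    samples = ["ABCD123456EFGHIJ78", "ABC123456EFGHIJ78",
               "ABCD123456E", "ABCD12345EFGHIJ78"]
    for s in samples:
        print(s, first_letter_of_third_group(s), FIRST_ANSWER.match(s).group(1))
```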

How to do a complex multiple if-then-else regex?

I need to do a complex if-then-else with five preferential options. Suppose I first want to match abc but if it's not matched then match a.c, then if it's not matched def, then %##, then 1z;.
Can I nest the if-thens or how else would it be accomplished? I've never used if-thens before.
For instance, in the string 1z;%##defarcabcaqcdef%##1z; I would like the output abc.
In the string 1z;%##defarcabaqcdef%##1z; I would like the output arc.
In the string 1z;%##defacabacdef%##1z; I would like the output def.
In the string 1z;##deacabacdf%##1z; I would like the output %##.
In the string foo;%#dfaabaef##1z;barbbbaarr3 I would like the output 1z;.
You need to force individual matching of each option rather than putting them together in one group. A pattern like .*?(?:x|y|z) matches at the first position where any of the options occurs. Used against a string such as abczx, it returns z because that is the first match it finds. To force prioritization, combine .*? with each option separately, giving a regex resembling .*?x|.*?y|.*?z. It tries each option in turn until one matches, so if x doesn't exist anywhere, it continues to the next option, and so on.
See regex in use here
(?m)^(?:.*?(?=abc)|.*?(?=a.c)|.*?(?=def)|.*?(?=%##)|.*?(?=1z;))(.{3})
(?m) Enables multiline mode so that ^ and $ match the start/end of each line
(?:.*?(?=abc)|.*?(?=a.c)|.*?(?=def)|.*?(?=%##)|.*?(?=1z;)) Match either of the following options
.*?(?=abc) Match any character any number of times, but as few as possible, ensuring what follows is abc literally
.*?(?=a.c) Match any character any number of times, but as few as possible, ensuring what follows is a, any character, then c
.*?(?=def) Match any character any number of times, but as few as possible, ensuring what follows is def literally
.*?(?=%##) Match any character any number of times, but as few as possible, ensuring what follows is %## literally
.*?(?=1z;) Match any character any number of times, but as few as possible, ensuring what follows is 1z; literally
(.{3}) Capture any character exactly 3 times into capture group 1
If the options vary in length, you'll have to capture in different groups as seen here:
(?m)^(?:.*?(abc)|.*?(a.c)|.*?(def)|.*?(%##)|.*?(1z;))
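Both patterns can be tried on the strings from the question with a short Python sketch (Python is an assumption; the answer itself is engine-agnostic). The helper name first_by_priority is hypothetical. For the separate-groups pattern, only one group takes part in a match, so the helper returns whichever group is not None.

```python
import re

# Same-length options: a lookahead finds the spot, then 3 characters are captured.
SAME_LENGTH = re.compile(
    r'(?m)^(?:.*?(?=abc)|.*?(?=a.c)|.*?(?=def)|.*?(?=%##)|.*?(?=1z;))(.{3})')

# Options of possibly different lengths: one capture group per option.
SEPARATE_GROUPS = re.compile(
    r'(?m)^(?:.*?(abc)|.*?(a.c)|.*?(def)|.*?(%##)|.*?(1z;))')


def first_by_priority(text):
    """Return the highest-priority option found in text, or None."""
    m = SEPARATE_GROUPS.search(text)
    if m is None:
        return None
    # Exactly one alternative matched; its group is the only non-None one.
    return next(g for g in m.groups() if g is not None)


if __name__ == "__main__":
    tests = [
        "1z;%##defarcabcaqcdef%##1z;",
        "1z;%##defarcabaqcdef%##1z;",
        "1z;%##defacabacdef%##1z;",
        "1z;##deacabacdf%##1z;",
        "foo;%#dfaabaef##1z;barbbbaarr3",
    ]
    for t in tests:
        print(SAME_LENGTH.search(t).group(1), first_by_priority(t))
```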

Cleaning up a regular expression which has lots of repetition

I am looking to clean up a regular expression which matches 2 or more characters at a time in a sequence. I have made one which works, but I was looking for something shorter, if possible.
Currently, it looks like this for every character that I want to search for:
([A]{2,}|[B]{2,}|[C]{2,}|[D]{2,}|[E]{2,}|...)*
Example input:
AABBBBBBCCCCAAAAAADD
See this question, which I think was asking the same thing you are asking. You want to write a regex that will match 2 or more of the same character. Let's say the characters you are looking for are just capital letters, [A-Z]. You can do this by matching one character in that set and grouping it by putting it in parentheses, then matching that group using the reference \1 and saying you want two or more of that "group" (which is really just the one character that it matched).
([A-Z])\1{1,}
The reason it's {1,} and not {2,} is that the first character was already matched by the set [A-Z].
Not sure I understand your needs but, how about:
[A-E]{2,}
This is shorter than yours, but not quite equivalent: it also matches runs of mixed letters such as AB.
But if you want multiple occurrences of each letter:
(?:([A-Z])\1+)+
where ([A-Z]) matches one capital letter and stores it in group 1
\1 is a backreference that repeats group 1
+ requires one or more repetitions
Finally it matches strings like the one you've given: AABBBBBBCCCCAAAAAADD
To be sure there're no other characters in the string, you have to anchor the regex:
^(?:([A-Z])\1+)+$
And, if you want to match case-insensitively:
^(?i)(?:([A-Z])\1+)+$
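Here is a minimal Python sketch of the backreference approach (Python is an assumption; the answers do not name a regex engine). The function names runs and only_runs are made up for illustration. The sketch passes re.IGNORECASE as a flag rather than embedding (?i) after the ^, because, as far as I know, recent Python versions only accept global inline flags at the very start of a pattern.

```python
import re

# One letter captured in group 1, then one or more repeats of that same letter.
RUN = re.compile(r'([A-Z])\1+')

# Anchored form: the whole string must consist of such runs.
ONLY_RUNS_PATTERN = r'^(?:([A-Z])\1+)+$'


def runs(s):
    """Return every run of two or more identical capital letters in s."""
    return [m.group(0) for m in RUN.finditer(s)]


def only_runs(s, ignore_case=False):
    """True if s consists solely of runs of two or more identical letters."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.match(ONLY_RUNS_PATTERN, s, flags) is not None


if __name__ == "__main__":
    text = "AABBBBBBCCCCAAAAAADD"
    print(runs(text))
    print(only_runs(text), only_runs("AABC"), only_runs("aabb", ignore_case=True))
```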

Match Regular Expression if string contains exactly N occurrences of a character

I'd like a regular expression to match a string only if it contains a character that occurs a predefined number of times.
For example:
I want to match all strings that contain the character "_" 3 times;
So
"a_b_c_d" would pass
"a_b" would fail
"a_b_c_d_e" would fail
Does someone know a simple regular expression that would satisfy this?
Thank you
For your example, you could do:
\b[a-z]*(_[a-z]*){3}[a-z]*\b
(with an ignore case flag).
You can play with it here
It says "match 0 or more letters, followed by '_[a-z]*' exactly three times, followed by 0 or more letters". The \b means "word boundary", i.e. "match a whole word".
Since I've used '*' this will match if there are exactly three "_" in the word regardless of whether it appears at the start or end of the word - you can modify it otherwise.
Also, I've assumed you want to match all words in a string with exactly three "_" in it.
That means the string "a_b a_b_c_d" would say that "a_b_c_d" passed (but "a_b" fails).
If you mean that globally across the entire string you only want three "_" to appear, then use:
^[^_]*(_[^_]*){3}[^_]*$
This anchors the regex at the start of the string and goes to the end, making sure there are only three occurrences of "_" in it.
Elaborating on Rado's answer, which is so far the most versatile but could be a pain to write if there are more occurrences to match:
^([^_]*_){3}[^_]*$
It will match entire strings (from the beginning ^ to the end $) in which there are exactly 3 ({3}) times the pattern consisting of 0 or more (*) times any character not being underscore ([^_]) and one underscore (_), the whole being followed by 0 or more times any character other than underscore ([^_]*, again).
Of course one could alternatively group the other way round, as in our case the pattern is symmetric:
^[^_]*(_[^_]*){3}$
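The same idea generalizes to any single character and any count. Below is a hedged Python sketch (Python and the helper name has_exactly are assumptions, not part of the answers). It builds the pattern from the character and the count, and uses re.fullmatch in place of the ^ and $ anchors.

```python
import re


def has_exactly(text, ch, n):
    """True if the single character ch occurs exactly n times in text."""
    c = re.escape(ch)  # keep characters such as ^ or ] literal inside the class
    # Non-ch characters, then n times: one ch followed by non-ch characters.
    pattern = rf'[^{c}]*(?:{c}[^{c}]*){{{n}}}'
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for s in ["a_b_c_d", "a_b", "a_b_c_d_e", "___"]:
        print(s, has_exactly(s, "_", 3))
```

For a plain count, text.count(ch) == n gives the same answer without a regex; the pattern form is mainly useful when it has to be combined with other regex conditions.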
This should do it:
^[^_]*_[^_]*_[^_]*_[^_]*$
If your examples are the only possibilities (like a_b_c_...), then the others are fine, but I wrote one that will handle some other possibilities. Such as:
a__b_adf
a_b_asfdasdfasfdasdfasf_asdfasfd
___
_a_b_b
Etc.
Here's my regex.
\b(_[^_]*|[^_]*_|_){3}\b

Using regex to find arbitrary length consecutive blocks

I have a string containing ones and zeroes. I want to determine if there are substrings of 1 or more characters that are repeated at least 3 consecutive times. For example, the string '000' has a length 1 substring consisting of a single zero character that is repeated 3 times. The string '010010010011' actually has 3 such substrings that each are repeated 3 times ('010', '001', and '100').
Is there a regex expression that can find these repeating patterns without knowing either the specific pattern or the pattern's length? I don't care what the pattern is nor what its length is, only that the string contains a 3-peat pattern.
Here's something that might work. However, it will only tell you whether there is a pattern repeated three times, and I don't think it can be extended to tell you whether there are others:
/(.+).*?\1.*?\1/
Breaking that out:
(.+) matches any 1 or more characters, starting anywhere in the string
.*? allows any length of interposing other characters (0 or more)
\1 matches whatever was captured by the (.+) parentheses
.*? 0 or more of anything
\1 the original pattern, again
If you want the repetitions to occur immediately adjacent, then instead use
/(.+)\1\1/
… as suggested by @Buh Buh. The \1 vs. $1 notation may vary, depending on your regexp system.
(.+)\1\1
The \ might be a different character depending on your language choice. This means: match any string, then try to match it twice more.
The \1 means repeat the 1st match.
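A short Python sketch of the adjacent-repetition pattern follows (Python is an assumption; the function name find_three_peat is made up for illustration). It returns the first block it finds that repeats three times in a row, or None if the string has no such block.

```python
import re

# Any block of one or more characters, immediately followed by two more copies.
THREE_PEAT = re.compile(r'(.+)\1\1')


def find_three_peat(bits):
    """Return a block that occurs three times in a row in bits, or None."""
    m = THREE_PEAT.search(bits)
    return m.group(1) if m else None


if __name__ == "__main__":
    for s in ["000", "010010010011", "0110"]:
        print(s, find_three_peat(s))
```

Because only the presence of a repeat matters in the question, checking the result against None is enough.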
It looks weird, but this could be the solution:
/000000000|100100100|010010010|001001001|110110110|011011011|101101101|111111111/
This contains every possible three-character block repeated three times. So your regular expression will match these numbers, for example:
10010010011
00010010011
10110110110
But not for these:
101010101010
001110111110
111000111000
And it doesn't matter where the sequence appears in the whole string.