Describing RE's in English Language - regex

I have the following question from a past exam paper:
I am struggling to formalise their definitions within the necessary 15 word limit. So far I have:
i) The empty string or set of strings that contain zero or many a's OR b's OR both
ii) The set of strings that start with one or many a's, unless preceded by b's, followed by one or many a's with zero or many possible preceding b's.
My definitions seem rather cumbersome. I just don't want to lose any information by oversimplifying the definition.

Try to simplify the regular expressions before describing them.
i is equivalent to (a | b)*, which means any number of a's and b's in any order.
ii is equivalent to (a|b)*a(a|b)*a, which is hard to describe in only 15 words. My best attempt is: a's and b's in any order, at least two a's, the final letter is a.
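A description like this can be sanity-checked by comparing it with the expression on every short string. The following is only a sketch: it assumes Python's re module and the simplified form (a|b)*a(a|b)*a given above, and the helper names described and all_strings are made up for illustration.

```python
import re
from itertools import product

# Simplified form of (ii) taken from the answer above.
PATTERN = re.compile(r"(a|b)*a(a|b)*a")

def described(s):
    """English description: at least two a's, and the final letter is a."""
    return s.count("a") >= 2 and s.endswith("a")

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# Strings on which the regex and the description disagree.
mismatches = [s for s in all_strings("ab", 8)
              if bool(PATTERN.fullmatch(s)) != described(s)]
print(mismatches)
```

An empty list means the description and the expression agree on every string up to length 8, which is evidence rather than a proof.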

I have written a tool that attempts to do this for arbitrary regular expressions. You can find it here. Enter your regular expression and change the mode to "Explain."

Related

Does order not matter in regular expressions?

I was looking at the question posed in this Stack Overflow link (Regular expression for odd number of a's), which asks for a regular expression for strings that have an odd number of a's over Σ = {a,b}.
The answer given by the top comment which works is b*(ab*ab*)*ab*.
I am quite confused - a was placed just before the last b*, does this ordering actually matter? Why can't it be b*a(ab*ab*)*b* instead (where a is placed after the first b*), or any other permutation of it?
Another thing I am confused about is why it is (ab*ab*)* and not (b*ab*ab*)*. Isn't b*ab*ab* the more accurate definition of 'having exactly 2 a'?
Why can't it be b*a(ab*ab*)*b* instead?
b*a(ab*ab*)*b* does not work because, for any string with more than one a, it requires two consecutive a's before the first non-leading b. For example, abaa would not be matched by your proposed regex when it should be. Use the regex debugger on a site like Regex101 to see this for yourself.
On the other hand, moving the whole ab* part to the start (b*ab*(ab*ab*)*) works as well.
why it is (ab*ab*)* and not (b*ab*ab*)*?
(b*ab*ab*)* does work, but the first b* is redundant: any b's left over are matched by the last b* of the previous repetition of the group, and the b* before the group already matches any leading b's. The first b* inside the group therefore never needs to match anything.
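These claims can be checked by brute force over all short strings. This is a sketch assuming Python's re module; the helper language and the variable names are mine, not part of the answer.

```python
import re
from itertools import product

def language(pattern, max_len=8):
    """Strings over {a, b} up to max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    words = ("".join(t) for n in range(max_len + 1)
             for t in product("ab", repeat=n))
    return {w for w in words if rx.fullmatch(w)}

# Reference language: every string up to length 8 with an odd number of a's.
odd_a = {w for w in language(r"[ab]*") if w.count("a") % 2 == 1}

accepted = language(r"b*(ab*ab*)*ab*")     # the accepted answer
moved = language(r"b*ab*(ab*ab*)*")        # whole ab* moved to the front
redundant = language(r"b*(b*ab*ab*)*ab*")  # extra b* inside the group
proposed = language(r"b*a(ab*ab*)*b*")     # only the a moved

print(accepted == odd_a, moved == odd_a, redundant == odd_a)
# Shortest odd-a string that the proposed expression misses.
print(min(odd_a - proposed, key=lambda w: (len(w), w)))
```

The first three expressions should agree with the reference set, while the proposed one should miss strings such as abaa.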
There are infinitely many equivalent regular expressions which generate a given (infinite) regular language. A particular expression might be preferable in some cases and by certain authors: one might prefer a minimal expression, or one which shows structure or symmetry, or even one that simplifies the reasoning in a proof by induction.
Your particular suggestion to move the a is insufficient since, as noted above, that ensures the substring aa will appear in any string with more than one a. However, ab*ab* could be changed to b*ab*a to make that placement work. Choosing b*ab*ab* would work with either placement. You could even go for an expression like b*ab* + b*ab*ab*ab* + (b*ab*ab*)*a(b*ab*ab*)*, which might be nice to work with depending on your application. Something like b*(ab*ab*)*ab* has the advantage of being minimal (if it's not strictly minimal, it must be pretty close).

find a regular expression where a is never immediately followed by b (Theory of formal languages)

I need to find a simplified regular expression for the language of all strings
of a's, b's, and c's where a is never immediately followed by b.
I tried something and got as far as (a+c)*c(b+c)* + (b+c)*(a+c)*
Is this fine and if so can this be simplified?
Thanks in advance.
You are looking for a negative lookbehind:
(?<!a)b
This will find you all the b instances that are not immediately following a
Or a negative lookahead:
a(?!b)
This will find you all the a instances that are not immediately followed by b
Here is a regex101 example for the lookbehind:
https://regex101.com/r/RsqXbW/1
Here is a regex101 example for the lookahead:
https://regex101.com/r/qiDIZU/1
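The lookarounds above locate individual characters. To test a whole string against the language from the question, one option (my assumption, not something this answer spells out) is to repeat "an a not followed by b, or a b, or a c" across the entire string with Python's re.fullmatch. Lookarounds are an engine extension and not part of textbook regular expression syntax, so this is a practical check rather than an answer in the formal sense.

```python
import re

# Every character is either an 'a' that is not immediately followed by 'b',
# or a 'b', or a 'c'.
NO_AB = re.compile(r"(?:a(?!b)|[bc])*")

def in_language(s):
    """True if s is over {a, b, c} and no 'a' is immediately followed by 'b'."""
    return NO_AB.fullmatch(s) is not None

for s in ["", "acbac", "bca", "cab", "aab"]:
    print(repr(s), in_language(s))
```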
Your solution contains only strings from the desired language. However, it does not contain all of them. For example, acbac is not contained. Your basic idea is fine, but you need to be able to iterate the possible factors. In:
(b+c)*(a (a)*(c(b+c)*)*)*
the first part generates all strings without a.
After the first a there comes either nothing, another a, or c. Another a leaves us with the same three options. c basically starts the game again. This is what the part after the first a formalizes. The many * are needed to possibly generate the empty string in all of the different options.
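Both expressions can be compared with the plain rule "no a directly before b" on all short strings. This sketch assumes Python's re module, so the textbook + (union) is rewritten as a character class or |. The names ATTEMPT, ANSWER and wanted are mine.

```python
import re
from itertools import product

# (a+c)*c(b+c)* + (b+c)*(a+c)*  from the question
ATTEMPT = re.compile(r"[ac]*c[bc]*|[bc]*[ac]*")
# (b+c)*(a (a)*(c(b+c)*)*)*     from this answer
ANSWER = re.compile(r"[bc]*(?:aa*(?:c[bc]*)*)*")

words = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]
wanted = {w for w in words if "ab" not in w}

attempt_lang = {w for w in words if ATTEMPT.fullmatch(w)}
answer_lang = {w for w in words if ANSWER.fullmatch(w)}

# The attempt only produces wanted strings, but not all of them.
print(attempt_lang <= wanted, "acbac" in attempt_lang)
# The answer's expression produces exactly the wanted strings (up to length 6).
print(answer_lang == wanted)
```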

Regular expression for "even odd language of strings over {a, b}

I want to write a regular expression for strings with an even number of b's and an odd number of a's, and also build a DFA and an NFA for it.
For this I made the DFA below.
I got these two Regular Expression
Regular expression for an even number of b's: (a*a*a*bb)*
Regular expression for an odd number of a's: (a b*b*)(a b*b*a)*
QUESTION: Did I make the right DFA ?
How to merge above two Regular Expressions into one if both are correct??
How to Convert DFA into NFA?
Edits: I got DFA From Grijesh Chauhan Answer
I am still unable to write a regular expression that allows only an even number of b's and an odd number of a's.
I also tried this Regular Expression
(a(bb)*(aa)*)*
Note: The RE above only generates strings that start with a, but I want an RE that generates strings with an even number of b's and an odd number of a's regardless of whether they start with a or b.
The regexes are incorrect. They should be
a*(ba*ba*)* for an even number of b
b*ab*(ab*ab*)* for an odd number of a
There is a systematic way to merge these two, because every regular expression can be represented by a state machine and vice versa, and there is definitely a way to combine two state machines such that the resulting machine accepts exactly when both of them accept, which is what is needed here. I cannot remember how this is done directly on regular expressions, though.
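A sketch that sidesteps the merge: run the two corrected expressions separately and require both to match the whole string. It assumes Python's re module, and the function name even_b_and_odd_a is made up for illustration. The loop also checks each expression against a direct parity count on all strings up to length 8.

```python
import re
from itertools import product

EVEN_B = re.compile(r"a*(ba*ba*)*")     # even number of b's
ODD_A = re.compile(r"b*ab*(ab*ab*)*")   # odd number of a's

def even_b_and_odd_a(s):
    """Accept only when both expressions match the whole string."""
    return bool(EVEN_B.fullmatch(s)) and bool(ODD_A.fullmatch(s))

# Each expression should agree with a direct parity count.
for n in range(9):
    for t in product("ab", repeat=n):
        s = "".join(t)
        assert bool(EVEN_B.fullmatch(s)) == (s.count("b") % 2 == 0), s
        assert bool(ODD_A.fullmatch(s)) == (s.count("a") % 2 == 1), s

print([s for s in ("a", "bab", "abb", "babb", "aaababb") if even_b_and_odd_a(s)])
```

This gives a working test for the combined language, but not a single textbook regular expression for it.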
Your DFA is incorrect. One can see this because it has cycles of odd length, and following such a cycle flips the even/odd parity. Start with "babb", which your DFA accepts even though it has an odd number of b's (and an odd number of a's). q0->q1->q2 is a cycle of 3 a's, so adding 3 a's while in one of those states does not change whether the automaton accepts. Your automaton therefore accepts "aaababb", which has neither an odd number of a's nor an even number of b's. (Your machine also fails for "bab", even though it has both an odd number of a's and an even number of b's.)
Your DFA should at minimum keep track of the parity of the number of a's and b's. So you should start with 4 states: Q_{even,even}, Q_{even,odd}, Q_{odd,even} and Q_{odd,odd}. Having labeled the states in this way, it should be straightforward to set up the transitions and select the initial and accepting states.
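The four-state machine can be sketched directly in code. This is a minimal illustration in Python, not the asker's DFA: a state is a pair (parity of a's, parity of b's), and the names START, ACCEPTING, step and accepts are my own.

```python
# States are (parity of a's, parity of b's); 0 = even, 1 = odd.
START = (0, 0)          # Q_{even,even}: nothing read yet
ACCEPTING = {(1, 0)}    # Q_{odd,even}: odd number of a's, even number of b's

def step(state, symbol):
    """Flip the parity that corresponds to the symbol just read."""
    a_parity, b_parity = state
    if symbol == "a":
        return (a_parity ^ 1, b_parity)
    if symbol == "b":
        return (a_parity, b_parity ^ 1)
    raise ValueError(f"symbol {symbol!r} is not in the alphabet {{a, b}}")

def accepts(word):
    """Run the DFA over the word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = step(state, symbol)
    return state in ACCEPTING

for word in ["bab", "babb", "aaababb"]:
    print(word, accepts(word))
```

Every transition changes exactly one parity, so every cycle in this machine has even length in each letter, which avoids the odd-cycle problem described above.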
Your regular expressions also have some issues. Note that a* means 0 or more a's, so a*a* means 0 or more a's followed by 0 or more a's. This means that a*a* = a*. Other than that, see Georg's answer.
Conventional definitions are such that every DFA is also an NFA. The conversion that can be a problem is the other direction, from NFA to DFA.
See Need Regular Expression for Finite Automata: Even number of 1s and Even number of 0s for a discussion on what algebra can be done on regular expressions.
Use this DFA; it may help you. I made it in Paint, so it is not pretty.

Is the language L={words without the substring 'bb' } regular?

L = {words such that the substring 'bb' is not present in them}
Given that the alphabet is A = {a,b}, is this language regular? If so, is there a regular expression that represents it?
Yes, this language is regular. Since this looks like homework, here's a hint: if the string bb isn't present, then the string consists of lots of blocks of strings of the form a* or a*b. Try seeing how to assemble the solution from this starting point.
EDIT: If this isn't a homework problem, here's one possible solution:
(a*(ba+)*b?)?
The idea is to decompose the string into a lot of long sequences of as with some b's interspersed in-between them. The first block of a's is at the front. Then, we repeatedly place down a b, at least one a, and then any number of additional as. Finally, we may optionally have one b at the end. As an alternative, we could have the empty string, so the entire thing is guarded by a ?.
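A quick way to gain confidence in this expression is to compare it with a plain substring test on every short string. This sketch assumes Python's re module; the name NO_BB is mine.

```python
import re
from itertools import product

NO_BB = re.compile(r"(a*(ba+)*b?)?")

# Compare the regex with the rule "bb does not occur" on all strings up to length 10.
for n in range(11):
    for t in product("ab", repeat=n):
        s = "".join(t)
        assert bool(NO_BB.fullmatch(s)) == ("bb" not in s), s

print("regex agrees with the 'no bb' rule on all strings up to length 10")
```

The same check passes without the outer ?, since a*(ba+)*b? already matches the empty string.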
Hope this helps!

regular expression to an English description

I'm really struggling with regular expressions. I have to give English descriptions of the following regular expressions. Can anyone please help me?
i. a(aa)*
ii. a(b*ab*ab*)*
iii. b(b*ab*ab*)*
Here are my attempts, but everyone else in the class seems to have shorter answers.
i. Find a "a" followed by either zero or more times "aa"s should be seen
ii. Find a "a" followed by either zero or more times of this pattern :
(zero or more times "b" followed by zero or more times "ab" followed by zero or more times "ab")
iii. Find a "b" followed by either zero or more times of this pattern :
(zero or more times "b" followed by zero or more times "ab" followed by zero or more times "ab")
If those strings are actual regexes, they (completely) match the following:
1. An odd number of as.
2. A string starting with a, followed by any combination of as and bs, with an overall odd number of as. Edge case: If the string contains any b, it needs to contain at least three as.
3. A string starting with b, followed by any combination of as and bs, with an overall even number of as. Edge case: If the string contains more than one b, it needs to contain at least two as.
"Any combination" includes zero instances of each character.
Some possible matches for 1.:
a
aaa
aaaaa
aaaaaaa etc.
Some possible matches for 2.:
a
aaa
ababa
aaab
abbbbbbbbaa
ababababababa
Some possible matches for 3.:
b
baa
baba
baaaaaba
bbbbbbbbbbaa
bababababbbbb
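A brute-force comparison can confirm descriptions like these, including the edge cases. This sketch assumes Python's re module; desc_i, desc_ii and desc_iii are my own encodings of the descriptions. Note that desc_ii also excludes a single a followed only by b's (for example ab), because the group in a(b*ab*ab*)* always contributes two a's.

```python
import re
from itertools import product

RE_I = re.compile(r"a(aa)*")
RE_II = re.compile(r"a(b*ab*ab*)*")
RE_III = re.compile(r"b(b*ab*ab*)*")

def desc_i(s):
    # Only a's, and an odd number of them.
    return set(s) <= {"a"} and len(s) % 2 == 1

def desc_ii(s):
    # Starts with a, odd number of a's; any b requires at least three a's.
    n_a = s.count("a")
    return s.startswith("a") and n_a % 2 == 1 and (n_a >= 3 or s == "a")

def desc_iii(s):
    # Starts with b, even number of a's; a second b requires at least two a's.
    n_a = s.count("a")
    return s.startswith("b") and n_a % 2 == 0 and (n_a >= 2 or s == "b")

# Check every string over {a, b} up to length 8.
for n in range(9):
    for t in product("ab", repeat=n):
        s = "".join(t)
        assert bool(RE_I.fullmatch(s)) == desc_i(s), s
        assert bool(RE_II.fullmatch(s)) == desc_ii(s), s
        assert bool(RE_III.fullmatch(s)) == desc_iii(s), s

# The two edge cases: neither of these is matched.
print(bool(RE_II.fullmatch("ab")), bool(RE_III.fullmatch("bb")))
```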
There's a free tool, Ultrapico Expresso, which can help. Just run a match on any of the regexes you mentioned, then it should be relatively easy to translate into regular English:
i - an odd number of a's, with at least one a.
ii - an odd number of a's, with at least one a, and 0 or more b's between each pair of a's.
Your attempted solutions seem correct, but I would expect your professor will complain that your description is rephrasing the RE and is not an English description of the result.
I'll leave iii back to you to re-word (mainly because it's more difficult than the other two and I'm lazy this morning!)
Let me hint you a bit:
How would you describe the regular expression 'a'? How about 'aa'? OK, now, how would you describe the expressions 'a*' and '(aa)*'? For the latter there is an interesting pattern. Now, try to combine them. What is a(aa)*? If you write down a couple of specimens of the regular language, there is a pattern you can spot.
Odd and even plays a role here.
The trick is to cut up the regular expression and understand each part. Then write down a couple of strings which are in the language the RE decides. Then look for a pattern. My guess is that this is what your TA/Prof wants you to do in order to understand the relationship between an RE and the language it decides.
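Writing down specimens can also be done mechanically. The generator below is a sketch assuming Python's re module; the name specimens is made up. It lists the shortest strings over {a, b} that a pattern matches in full, so the pattern in their lengths is easy to see.

```python
import re
from itertools import count, islice, product

def specimens(pattern, alphabet="ab"):
    """Yield strings fully matched by `pattern`, shortest first."""
    rx = re.compile(pattern)
    for n in count():
        for t in product(alphabet, repeat=n):
            s = "".join(t)
            if rx.fullmatch(s):
                yield s

# First four members of each language.
print(list(islice(specimens(r"(aa)*"), 4)))
print(list(islice(specimens(r"a(aa)*"), 4)))
```

The generator never terminates on its own, so it should always be limited with islice or a similar cut-off.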
An odd number of as.
A string starting with a, followed by any combination of single as and multiple bs (zero or more), with an overall odd number of as.
A string starting with b, followed by any combination of single as and multiple bs (zero or more), with an overall even number of as.