Can this language be expressed with a regular expression? - regex

The given language is:
B = { a^i b^i c^(2m) | i ≥ 0, m ≥ 0 } ∪ { a^r b^s c^(2t) | r ≥ 4, s ≥ 4, t ≥ 0 }
My Prof has provided a regular expression for this language which is:
a^4 a* b^4 b* (c^2)* + (ε + ab + a^2 b^2 + a^3 b^3)(c^2)*
I believe that this is incorrect and that this language is not regular. Is there a possible regex for this language, if not this one?
As far as I'm aware
(i) Any language with balanced strings is not regular as it cannot be expressed as a regular expression/DFA/NFA and is instead context free so it can be represented as a PDA/CF-grammar
(ii) The union of a non-regular CF language and a regular language is CF
So the first half of the union should not be regular, while the second half is regular as none of the superscripts on the tokens are related to each other. Thus the overall language B is non-regular, correct?
Regarding my Prof's solution, the left part refers to the a^r b^s c^(2t) part of the language B and seems correct, but the right part seems to refer to the part of B that is a^i b^i c^(2m), and doesn't seem to generate all of the strings in the language.

Your teacher is right.
The right part of the solution is indeed intended to deal with a^i b^i c^(2m), and it is also true that it does not cover all of it.
But here is the catch: for i >= 4, a^i b^i c^(2m) is a subset of a^r b^s c^(2t)! So the only cases that the left expression does not cover are those with i < 4, which means there are just 4 specific cases to deal with (i = 0, 1, 2, 3), and by listing them separately the "problem" with the equal superscripts disappears.
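A quick brute-force check can make this concrete. The sketch below is my own illustration, not part of the original answer: it translates the professor's expression into Python `re` syntax and compares it with the set definition of B on every string over {a, b, c} up to a small length. The names `PROF`, `in_B` and `mismatches` are made up for this example. Agreement on short strings is a sanity check, not a proof.

```python
import re
from itertools import product

# Professor's expression in Python syntax:
# a^4 a* b^4 b* (c^2)*  +  (ε + ab + a^2 b^2 + a^3 b^3)(c^2)*
PROF = re.compile(r"a{4}a*b{4}b*(cc)*|(ab|aabb|aaabbb)?(cc)*")

def in_B(s):
    """Membership in B, taken directly from the set definition."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    # c-count must be even; either equal a/b counts or both at least 4
    return k % 2 == 0 and (i == j or (i >= 4 and j >= 4))

def mismatches(max_len):
    """Strings up to max_len on which the regex and the definition disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if bool(PROF.fullmatch(s)) != in_B(s):
                bad.append(s)
    return bad

print(mismatches(8))  # an empty list means no disagreement was found
```

The point of the check is the one made above: once i >= 4 the "balanced" half is swallowed by the unbalanced half, so only finitely many balanced cases remain.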

How to tell whether a language is a regular language, context free language, Push down Automata, etc?

I know that the pumping lemma can be used to determine whether a language is a Regular Language, Context Free Language, Pushdown Automata, etc. However, I would like to know if there are any tricks in telling what type of language a given language is, or perhaps general tendencies for certain languages?
For example, is there any way to tell what type each of the following languages is just by looking at its description?
1. L = {(0^n)2(1^m) | n >= m }
2. L = {(0^n)2(1^m) | n >= 1, m >= 1, n + m <= 100 }
3. L = {(0^n)(1^m)2 | n >= 1, m >= 1, n + m <= 100 }
4. L = {ww^R | w element of {0, 1}*, where w^R is the reverse of w}
5. L = {w2w | w element of {0, 1}*}
6. L = {w2w^R | w element of {0, 1}*, where w^R is the reverse of w}
The answers are:
1. Not Finite Automata, Not DPDA by empty stack, but DPDA by final state
2. Finite Automata, but Not DPDA by empty stack.
3. Finite Automata, also DPDA by empty stack
4. Is a PDA, but not DPDA
5. Not any DPDA
6. DPDA by empty stack and DPDA by final state, not FSA
Thanks!
There are some simple points you can check by looking at a language that help in deciding which class it belongs to. (Note: these are not stated rules; they are derived from the definitions of these language classes.)
Check whether the language is finite. Every finite language is accepted by a finite automaton. For example, L = {a^n b^m | n+m < 100} and L = {a^n b^n | n < 50} may look context-free, but they are finite and hence accepted by finite automata.
Check whether the condition given in the language involves a single comparison. If it involves more than one comparison, the language is typically neither regular nor context-free; it is then context-sensitive. For example, L = {a^n b^n c^n | n > 1} and L = {a^n b^n c^m | m > n} both involve more than one comparison. In the first case both comparisons are in the body of the language; in the second, one comparison is in the body and the other is in the condition.
Distinguishing between PDA and DPDA is easy if you know how to design a PDA. If the language contains a clear point at which the automaton changes state (for example, a marker symbol between the two halves), then a DPDA suffices; otherwise a nondeterministic PDA is needed.
If the context-free language involves a condition of equality, like L = {w w^R | w element of {0,1}*} or L = {a^n b^n | n > 1}, then it is accepted by a PDA both by empty stack and by final state; if the condition is an inequality, you need to check whether the stack will be empty or not.
Try to visualize a stack, and work out whether the given comparison can be made with it. For a language to be context-free, a single stack must be enough for the comparison.
If the language is too complex to tell at a glance whether it is regular or context-free, then assume it is context-sensitive. There isn't any way to tell straight away whether a language is recursively enumerable or context-sensitive, so it is assumed that every language which can be written in set-builder form and which is not regular or context-free is context-sensitive.
As I said earlier, these aren't stated rules or facts, just some points derived from the definitions of these languages. To get quick at guessing, practise on some languages using these points.
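To make the "visualize a stack" and "clear point of changing state" points concrete, here is a minimal sketch of a deterministic, single-stack recognizer for the language {w2w^R | w element of {0, 1}*} from the question. It is my own illustration, and the function name `accepts_w2wR` is hypothetical. The marker `2` is the clear switching point: before it the recognizer only pushes, after it the recognizer only pops and compares.

```python
def accepts_w2wR(s):
    """Simulate a one-stack deterministic recognizer for { w 2 w^R }."""
    stack = []
    symbols = iter(s)

    # Phase 1: push every symbol of w until the marker '2' is read.
    for ch in symbols:
        if ch == "2":
            break
        if ch not in "01":
            return False
        stack.append(ch)
    else:
        # The input ended without a marker, so it cannot be w 2 w^R.
        return False

    # Phase 2: each remaining symbol must equal the popped stack top.
    for ch in symbols:
        if not stack or stack.pop() != ch:
            return False

    # Accept only if the whole of w has been matched (empty stack).
    return not stack

for example in ["01210", "2", "0121", "0110"]:
    print(example, accepts_w2wR(example))
```

For {ww^R} without the marker there is no such switching point, so a recognizer like this one would have to guess where the middle is, which is why that language needs a nondeterministic PDA.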

How to visualize the statement 'regular languages are closed under x,y... '?

I am studying DFAs and regular expressions, and I keep encountering the statement
regular languages are closed under union, intersection, complement etc.
I understand the definition of closure, which means that when we apply some operation on some element of the set, the resulting element should also be in the set.
However, none of the resources I referred to have any concrete examples of that. They prove it by equations. Could somebody help me visualize the statement above with an example of a regex?
Let's use the alphabet {0,1}.
Let L1 be the regular language containing all strings of length 3, {000, 001, 010, 011, 100, 101, 110, 111}; we can use the regular expression '(0+1) (0+1) (0+1)'.
Let L2 be the regular language containing all strings starting with 0, {0, 00, 01, 000, 001, 010, 011, …}; we can use the regular expression '0 (0+1)*'.
The union of these languages contains all strings of length 3, plus all strings starting with 0. The + operator does exactly this, so we can just write '(0+1)(0+1)(0+1) + 0(0+1)*'. (We could simplify this expression slightly, but we don't need to.)
The intersection of these languages contains all strings of length 3 that start with 0: '0 (0+1) (0+1)'.
The complement of L1 contains all strings of length 0, 1, 2, or ≥4; we can write 'ε + (0+1) + (0+1)(0+1) + (0+1)(0+1)(0+1)(0+1)(0+1)*'.
The complement of L2 contains the empty string, plus all strings starting with 1; we can write 'ε + 1 (0+1)*'.
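The four expressions above can be checked mechanically. The following is a small sketch, assuming Python's `re` module as the regex engine (so the textbook `+` becomes `|`, `(0+1)` becomes `[01]`, and ε becomes an empty alternative). The names `CHECKS` and `failures` are made up for this example, and testing strings up to a fixed length is a sanity check rather than a proof.

```python
import re
from itertools import product

def strings(max_len):
    """All strings over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def in_L1(s):
    return len(s) == 3

def in_L2(s):
    return s.startswith("0")

# name -> (regex from the text in Python syntax, set-theoretic definition)
CHECKS = {
    "union": (r"[01][01][01]|0[01]*",
              lambda s: in_L1(s) or in_L2(s)),
    "intersection": (r"0[01][01]",
                     lambda s: in_L1(s) and in_L2(s)),
    "complement of L1": (r"|[01]|[01][01]|[01][01][01][01][01]*",
                         lambda s: not in_L1(s)),
    "complement of L2": (r"|1[01]*",
                         lambda s: not in_L2(s)),
}

def failures(max_len=7):
    """Pairs (check name, string) where regex and definition disagree."""
    return [(name, s)
            for name, (pattern, spec) in CHECKS.items()
            for s in strings(max_len)
            if (re.fullmatch(pattern, s) is not None) != spec(s)]

print(failures())  # an empty list means every expression agreed
```

In each case the result of the operation is again described by a regular expression, which is exactly what "closed under the operation" means.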
Edited to add: That said, as some commenters mention above, it's probably easier to picture this using finite state machines. In particular, DFAs (deterministic finite automata) are probably the way to go.
Here are DFAs representing L1 and L2:
We can complete/extend these DFAs, without changing the languages that they define, by adding additional non-accept states that we will transition to whenever there is no other state transition. (This way, every string ends up in some state.) That gives:
Their union has the cross-product of the states in the two DFAs; for example, it has an "AD" state, meaning "if I were following the DFA for L1, I'd be in state A, and if I were following the DFA for L2, I'd be in state D." The accept states are the states corresponding to accept states in either DFA:
Their intersection is similar, except that its accept states are the states corresponding to accept states in both DFAs:
Though of course, we can greatly simplify it by removing all the states that can never lead to an accept state:
The complements, lastly, are simply the same DFAs, but with all accept states changed to non-accept states and vice versa:
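The constructions just described can also be sketched in code. This is a minimal illustration under my own assumptions: the state names below (`0`..`4`, `"init"`, `"ok"`, `"dead"`) are invented for this example and do not correspond to the state labels used in the answer's diagrams, and each DFA is represented as a plain dictionary.

```python
from itertools import product

ALPHABET = "01"

# Complete DFA for L1 (length exactly 3): the state counts the symbols
# read so far, and state 4 is the extra non-accepting "dead" state.
L1 = {
    "start": 0,
    "accept": {3},
    "delta": {(q, c): min(q + 1, 4) for q in range(5) for c in ALPHABET},
}

# Complete DFA for L2 (strings starting with 0).
L2 = {
    "start": "init",
    "accept": {"ok"},
    "delta": {
        ("init", "0"): "ok", ("init", "1"): "dead",
        ("ok", "0"): "ok", ("ok", "1"): "ok",
        ("dead", "0"): "dead", ("dead", "1"): "dead",
    },
}

def states(dfa):
    # Every state of a complete DFA appears as the source of a transition.
    return {q for q, _ in dfa["delta"]}

def accepts(dfa, s):
    q = dfa["start"]
    for c in s:
        q = dfa["delta"][(q, c)]
    return q in dfa["accept"]

def product_dfa(d1, d2, combine):
    """Cross-product DFA; combine(in1, in2) decides which pairs accept."""
    pairs = list(product(states(d1), states(d2)))
    return {
        "start": (d1["start"], d2["start"]),
        "accept": {(p, q) for p, q in pairs
                   if combine(p in d1["accept"], q in d2["accept"])},
        "delta": {((p, q), c): (d1["delta"][(p, c)], d2["delta"][(q, c)])
                  for p, q in pairs for c in ALPHABET},
    }

def complement(dfa):
    # Same DFA, with accepting and non-accepting states swapped.
    return {**dfa, "accept": states(dfa) - dfa["accept"]}

union = product_dfa(L1, L2, lambda x, y: x or y)
intersection = product_dfa(L1, L2, lambda x, y: x and y)

print(accepts(union, "0"), accepts(intersection, "011"), accepts(complement(L1), "011"))
```

The only difference between the union and the intersection machine is the rule for choosing accept states, and the complement needs no new states at all, which is why DFAs make the closure properties easy to see.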

Show that two regular expressions are equivalent in Automata Theory without using DFAs

I have been trying to prove that two regexes are equivalent. I know that two regexes are equivalent if they define the same language, but I can't find a way to prove it without using DFAs.
For example, I have to prove that the following are equivalent.
(a + b)*a(a + b)*b(a + b)* = (a + b)*ab(a + b)*
I know both of these define the language having at least one 'a' and one 'b'.
The same is the case with the following.
(a + b)*ab(a +b)* + b*a* = (a + b)*
Any help will be appreciated.
Thanks
You should be able to prove them using the identities on slide 16 of this regex lecture. In particular, I'd recommend clever use of the last equality of the 9th identity there, R* = RR*+e.
By the way, the first language is not precisely "at least one 'a' and one 'b'". For example, 'ba' is not in the language, but has at least one 'a' and one 'b'.
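I think that in the first expression the (a + b)* in the middle is arbitrary, so it can be ignored, and then the two expressions become equivalent. This works because if an 'a' occurs somewhere before a 'b', then the first 'b' after that 'a' is immediately preceded by an 'a', so the string already contains 'ab'.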
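Before attempting an algebraic proof, it can help to test the two identities from the question by brute force. The sketch below is only a sanity check on strings up to a fixed length, not a proof, and it assumes Python's `re` syntax (`+` written as `|`, `(a + b)` as `[ab]`). The helper name `language` is made up for this example.

```python
import re
from itertools import product

def language(pattern, max_len, alphabet="ab"):
    """Set of strings up to max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    return {"".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)
            if rx.fullmatch("".join(t))}

N = 10

# (a + b)*a(a + b)*b(a + b)*  =  (a + b)*ab(a + b)*
lhs1 = language(r"[ab]*a[ab]*b[ab]*", N)
rhs1 = language(r"[ab]*ab[ab]*", N)

# (a + b)*ab(a + b)* + b*a*  =  (a + b)*
lhs2 = language(r"[ab]*ab[ab]*|b*a*", N)
rhs2 = language(r"[ab]*", N)

print(lhs1 == rhs1, lhs2 == rhs2)  # → True True
print("ba" in lhs1)                # → False
```

The last line illustrates the remark about 'ba': it contains an 'a' and a 'b' but is not in the first language.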

Math: Giving regular expression for a language:

I am going over and learning regular expressions and languages. I was working through some questions about giving a regular expression to represent a specified language. The question I was a little stuck on is this:
Come up with a regular expression that expresses the following language. The alphabet of the language is {a,b}.
The language of all strings with two consecutive a's, but no three consecutive a's (i.e., "aa", "aabaa", "babaa" are in the language, while "abab" and "aaaab" are not).
My answer for this so far is:
(b*(e+a+aa)bb*)* (aa) (bb*(e+a+aa)b*)*
where 'e' is the empty string and '+' functions essentially as an 'or'.
I guess what I am wondering is if my answer is correct (I believe it is), and if it can at all be simplified?
Thanks guys.
I believe that your regular expression is correct. It ensures that an aa exists in the string, and makes sure that aaa cannot exist. As for being simplest (simplest being subjective here), I would say the following is simpler:
(b + ab + aab)* aa (b + ba + baa)*
Note that you could actually derive the above from the regular expression that you have. Taking just the part before the aa in your regular expression, we have:
(b*(e+a+aa)bb*)*
= (b*bb* + b*abb* + b*aabb*)*
= (b + ab + aab)*
That last step is a little bit of a jump, but it takes noticing that all those b*'s are redundant due to the * on the whole expression, and a b existing inside the brackets.
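As a sanity check on both expressions, here is a small brute-force sketch. It assumes Python's `re` syntax (`+` written as `|`, and `e` as an empty alternative) and compares each expression with the plain-language definition on all strings over {a, b} up to a small length. The names `ORIGINAL`, `SIMPLER`, `in_language` and `disagreements` are made up for this example, and a bounded check like this is not a proof.

```python
import re
from itertools import product

# (b*(e+a+aa)bb*)* (aa) (bb*(e+a+aa)b*)*
ORIGINAL = re.compile(r"(b*(|a|aa)bb*)*aa(bb*(|a|aa)b*)*")
# (b + ab + aab)* aa (b + ba + baa)*
SIMPLER = re.compile(r"(b|ab|aab)*aa(b|ba|baa)*")

def in_language(s):
    """Two consecutive a's somewhere, but never three in a row."""
    return "aa" in s and "aaa" not in s

def disagreements(rx, max_len):
    """Strings up to max_len where the regex and the definition differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if bool(rx.fullmatch(s)) != in_language(s):
                bad.append(s)
    return bad

print(disagreements(ORIGINAL, 8))
print(disagreements(SIMPLER, 8))
```

The same `disagreements` helper can be pointed at any other candidate expression for this language.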
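I think this regex comes close to your language as well, though as written the outer star also lets it match the empty string and "aaaa", which are not in the language: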
^((ab|b)*aa(ba|b)*)*$

What regular language, intersected with 1*0*, gives 1^n 0^n?

I'm reading a book on automata theory, and the book gives an example in which the language of strings with an equal number of 0s and 1s, intersected with 1*0*, gives 1^n 0^n, where n > 0.
So my question is: how can I find some regular languages that, when intersected with 1*0*, would also give 1^n 0^n? Is there a way to think about that?
update:
Thanks for the answers! I guess what I'm trying to find is some regular languages, so the ones like 1^n 0^n wouldn't work ;)
Is it possible? Any ideas?
N.B. The language with an equal and unbounded number of 0s and 1s is not a regular language.
As for your question, I don't think there are any more restrictions you can add to some ones followed by some zeros to get n ones followed by n zeros other than the two you have given.
There are an infinite number of trivially-constructed languages that satisfy the conditions: A 1^n B 0^n C, where A, B and C are any expressions that can match zero width.
Just think of the question as: "What languages, when intersected with 1^n 0^m, give the language 1^n 0^n?" Basically, anything that adds the constraint that n = m.
One example is a^n b^n, where a != b.
Another one is L = { 1^n 0^n 1^m 0^m | n != m, n >= 0, m >= 0 }.
Also, as OrangeDog pointed out, 1^n 0^n is not regular, and since regular languages are closed under intersection, it follows that any language whose intersection with 1*0* gives 1^n 0^n is not regular.
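The book's example can be reproduced on a bounded scale. The following is a minimal sketch of my own, with made-up helper names (`strings`, `equal_counts`, `intersection`, `target`): it intersects "equal number of 0s and 1s" with 1*0* on all strings up to a given length and compares the result with 1^n 0^n. Note that the empty string (n = 0) is included here, because it has equal counts and matches 1*0*.

```python
import re
from itertools import product

ONES_THEN_ZEROS = re.compile(r"1*0*")

def strings(max_len):
    """All strings over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def equal_counts(s):
    # The non-regular language: same number of 0s and 1s, in any order.
    return s.count("0") == s.count("1")

def intersection(max_len):
    """Equal-count strings that also have the shape 1*0*."""
    return {s for s in strings(max_len)
            if equal_counts(s) and ONES_THEN_ZEROS.fullmatch(s)}

def target(max_len):
    """The strings 1^n 0^n that fit within max_len (n = 0 included)."""
    return {"1" * n + "0" * n for n in range(max_len // 2 + 1)}

print(intersection(10) == target(10))
```

The filter `equal_counts` is what supplies the constraint n = m; by the closure argument above, no regular language can play that role.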