Regular Expression to DFA


Can someone tell me if the attached DFA is correct?
I am supposed to give a DFA for a language over the alphabet Σ = {a, b}.
I need a DFA for this language: A = {ε, b, ab}

No, for multiple reasons:
1. Your automaton accepts bab
2. Your automaton does not accept ab
3. Your automaton is not a DFA, at least by some strict definitions
Regarding the first point: starting at q1, we see b, go to q2, see a, go to q3, see b, and go to q4, which is accepting. We saw bab and accepted it.
Regarding the second point: starting at q1, we see a but have no defined transition. The automaton "crashes" and fails to accept. So no string starting with a is accepted, including ab.
Regarding the third point: DFAs are often required to show all states and transitions, including dead states and transitions that will never lead back to any accepting state. You don't show all transitions and don't show all states in your automaton.
You can use the Myhill-Nerode theorem to determine how many states a minimal DFA for your language has. We note that the empty string can have either the empty string, b, or ab appended to get a string in the language; a can have b appended; and b can have the empty string appended. Nothing can be appended to aa, bb, or ba to get a string in the language (so these are indistinguishable); but ab can have the empty string appended (and so is indistinguishable from b).
Equivalence classes so determined correspond to states in a minimal DFA. Our equivalence classes are:
(1) Strings like the empty string
(2) Strings like b
(3) Strings like a
(4) Strings like aa
We note that the empty string and b are both in the language, so the first and second classes correspond to accepting states. We notice nothing can be appended to aa to get a string in the language, so this class corresponds to a dead state in the DFA. We write the transitions between these states by seeing which new equivalence class the appending of a new symbol puts us in:
From (1): appending a puts us in (3), since appending a to the empty string gives a, which is in (3). Appending b puts us in (2), since appending b to the empty string gives b, which is in (2).
From (2): appending a puts us in (4), since appending a to b gives ba, which is like aa in that it isn't a prefix of any string in the language. Appending b, we arrive in (4) by a similar argument.
From (3): appending a we get aa and are in (4). Appending b we get ab, which is like b, so we are in (2).
From (4): all transitions from a dead state return to a dead state; both a and b lead back to (4).
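The grouping above can be checked by brute force. The following is only a sketch: the helper names residual and equivalence_classes are made up for illustration, and it relies on the language being finite, so that prefixes up to length 3 already show every class.

```python
from itertools import product

LANGUAGE = {"", "b", "ab"}   # A = {ε, b, ab}
ALPHABET = "ab"

def residual(prefix):
    """Return every suffix s such that prefix + s is in the language."""
    return frozenset(w[len(prefix):] for w in LANGUAGE if w.startswith(prefix))

def equivalence_classes(max_len=3):
    """Group all prefixes up to max_len by the set of suffixes that complete them."""
    classes = {}
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            word = "".join(letters)
            classes.setdefault(residual(word), []).append(word)
    return classes

classes = equivalence_classes()
for suffixes, members in classes.items():
    # Show each class by its completing suffixes and a few member prefixes.
    print(sorted(suffixes), "<-", members[:4])
```

Two prefixes land in the same group exactly when the same suffixes complete them, which is the Myhill-Nerode indistinguishability relation restricted to this language.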
You end up with something like:
q1 --a--> q3
|        /|
b  --b--< a
| /       |
vv        v
q2 -a,b-> q4 \
          ^  a,b
          \_/
Or in tabular form:
q    s    q'
==   =    ==
q1   a    q3
q1   b    q2
q2   a    q4
q2   b    q4
q3   a    q4
q3   b    q2
q4   a    q4
q4   b    q4
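As a quick check, the transition table can be run directly. This is a minimal sketch: the names TRANSITIONS, ACCEPTING, and accepts are illustrative, and q1 is treated as accepting on the assumption that ε must be accepted because it is in A.

```python
# Transition table copied from the tabular form above.
TRANSITIONS = {
    ("q1", "a"): "q3", ("q1", "b"): "q2",
    ("q2", "a"): "q4", ("q2", "b"): "q4",
    ("q3", "a"): "q4", ("q3", "b"): "q2",
    ("q4", "a"): "q4", ("q4", "b"): "q4",
}
START = "q1"
# Assumption: q1 accepts because ε is in A; q2 accepts b and ab.
ACCEPTING = {"q1", "q2"}

def accepts(word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

for word in ["", "b", "ab", "a", "bab", "aa"]:
    print(repr(word), accepts(word))
```

Because every state has a transition on both a and b, the lookup never fails for strings over {a, b}, which is the "no missing transitions" requirement from the third point.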

I think this DFA is correct for that language.

Your attached DFA is wrong.
It accepts only ε, b, and bab; it cannot accept ab.
To make your DFA accept ab as well, add a new state reached from q0 on input a, and when that new state reads b, send it to a final state.
Because it is a DFA, send every input that is not required to a new state (a dead state).

Related

Regular expression - Kleene star of a union expression

I'm trying to code something that randomly returns a possible result of going through a regular expression.
I was confused about how to tackle this when you have the Kleene star of a union expression.
If you have (a + b)*, does this mean that you repeatedly choose between a and b and repeat each choice some number of times, or do you just randomly choose between a and b twice?
If it is the former, would it make sense to first generate a random number to determine how many times I'm going to randomly choose between a and b, and then, each time I randomly choose an element, generate another random number and repeat the element that many times?

If you're asking what kind of things match (a | b)*, you might as well think of it in terms of a grammar:
<expression> := <empty> | <parens><expression>
<parens> := a | b
That's what a * operator really means: for any expression x, x* matches either the empty string or (x)(x*) (this is a recursive definition).
If you want to randomly generate a string that matches the expression, then that's a much more complicated matter. You now have to think in terms of which distribution you want to use, because the length of the string is unbounded, and it's impossible to have a uniform distribution over an unbounded range. (In other words, you can't pick a random length between 0 and infinity uniformly, so you'd have to decide how you're going to pick that in the first place.) Once you have your length problem resolved, expand (a | b)* into (a | b) repeated N times (where N is your randomly-chosen length) and resolve each parenthesized subexpression separately — for instance, if you choose to expand the subexpression 3 times, that would become (a | b)(a | b)(a | b), which will match all of aaa, baa, aba, bba, aab, bab, abb and bbb.
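One way to settle the length question is a geometric distribution: after each symbol, stop with a fixed probability. The sketch below assumes that choice; the function name random_kleene_union and the parameter stop_probability are made up for illustration, and any other length distribution would work just as well.

```python
import random

def random_kleene_union(choices=("a", "b"), stop_probability=0.5, rng=random):
    """Sample one string matching (a | b)*.

    The length is geometric: before each symbol we stop with probability
    stop_probability, so the empty string is a possible result.
    stop_probability must be greater than 0 or the loop never ends.
    """
    pieces = []
    while rng.random() >= stop_probability:
        # Each repetition of the starred group resolves the union independently.
        pieces.append(rng.choice(choices))
    return "".join(pieces)

rng = random.Random(0)          # seeded only to make runs repeatable
samples = [random_kleene_union(rng=rng) for _ in range(200)]
print(samples[:10])
```

Note that no second random number is needed to repeat each chosen element: the star already repeats the whole union, so aa is simply the result of picking a twice.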

If you want to test whether a string is a member of a set formed by applying the Kleene star, such as:
{"a", "b"}* = {ε, "a", "b", "aa", "ab", "ba", "bb", "aaa", "aab", ...}
then the regex ^[ab]*$ will work, including for the empty string.
If you want to limit the length of the string, say to 10, then try ^[ab]{0,10}$ (some regex engines do not accept an omitted lower bound such as {,10}).
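For instance, in Python (a sketch using the standard re module; the helper names are illustrative):

```python
import re

KLEENE = re.compile(r"^[ab]*$")         # any length, including empty
BOUNDED = re.compile(r"^[ab]{0,10}$")   # at most 10 symbols

def in_kleene_star(text):
    """True if text consists only of a's and b's."""
    return KLEENE.fullmatch(text) is not None

def in_bounded_star(text):
    """True if text consists only of a's and b's and has length <= 10."""
    return BOUNDED.fullmatch(text) is not None

for text in ["", "abba", "abc", "a" * 10, "a" * 11]:
    print(repr(text), in_kleene_star(text), in_bounded_star(text))
```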

How do I concatenate adjacent Kleene star symbols from an alphabet?

I came across a situation where I need to convert regular expressions over the alphabet {1,0} to NFA diagrams. Within the regex, I found that there are two concatenated symbols with Kleene stars, 1*0*. Basically this means that the string has any number of 1's followed by any number of 0's.
Whilst converting into an NFA, I got confused mainly because there are two transitions pointing outwards from the first symbol's (1*) accept state: an epsilon transition back to the initial state (because it has a Kleene star), and an epsilon transition to the initial state of 0*.
I am not sure whether 1) I can have two transitions leaving the same state when converting to an NFA and, if so, 2) how to simplify this transition.
Any help here would be appreciated!

You can definitely have multiple epsilon transitions from the same state.
Using Thompson's construction (https://en.wikipedia.org/wiki/Thompson%27s_construction):
Concatenation of s and t: the initial state of s is the new initial state, the accepting state of t is the new accepting state. The accepting state of s becomes the initial state of t.
Kleene closure of s: introduce a new initial state and a new accepting state. Add an epsilon transition from the new initial state to the new accepting state. Add an epsilon transition from the new initial to the original initial, an epsilon transition from original accepting to new accepting, and an epsilon transition from original accepting to original initial.
So, our expression 1*0* breaks into: 1* concatenated with 0*.
1 on its own is just q --1--> f. Going through the Kleene conversion to NFA yields
/--------------e--------------\
|                             V
q --e--> q1 --1--> q1f --e--> f
         ^         |
         \----e----/
With a similar construction for 0*. To concatenate them, take the accepting state from the first, and define it to be the starting state of the second:
/---------------e--------------\/---------------e--------------\
|                              V|                              V
q --e--> q1a --1--> q1f --e--> q0 --e--> q0a --0--> q0f --e--> f
         ^          |                    ^          |
         \-----e----/                    \-----e----/
To simplify, you can convert it to an NFA without epsilon transitions or to a DFA using the corresponding conversion algorithms.
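The combined automaton can also be simulated directly, epsilon transitions included. The sketch below encodes the seven states from the concatenated diagram; the dictionary layout and the function names are illustrative choices, not part of Thompson's construction itself.

```python
# Epsilon transitions of the concatenated NFA for 1*0*.
EPSILON = {
    "q":   {"q1a", "q0"},   # enter the 1-loop, or skip it
    "q1f": {"q1a", "q0"},   # repeat the 1-loop, or leave it
    "q0":  {"q0a", "f"},    # enter the 0-loop, or skip it
    "q0f": {"q0a", "f"},    # repeat the 0-loop, or leave it
}
# Transitions that consume an input symbol.
SYMBOL = {("q1a", "1"): {"q1f"}, ("q0a", "0"): {"q0f"}}
START, ACCEPT = "q", "f"

def epsilon_closure(states):
    """All states reachable from states using only epsilon transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in EPSILON.get(state, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def accepts(word):
    """Track the set of possible states while reading word."""
    current = epsilon_closure({START})
    for symbol in word:
        moved = set()
        for state in current:
            moved |= SYMBOL.get((state, symbol), set())
        current = epsilon_closure(moved)
    return ACCEPT in current

for word in ["", "1", "0", "1100", "01", "101"]:
    print(repr(word), accepts(word))
```

Having two epsilon transitions leave q1f causes no trouble here: the simulation simply follows both, which is also what the subset construction does when building a DFA.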

Regex not working as expected when combined with a dot

I am writing a regex to recognize an IP address of the form "A.B.C.D", where the value of A, B, C, and D may range from 0 to 255. Leading zeros are allowed. The length of A, B, C, or D can't be greater than 3. I know this regex is easily available on the Internet, but I am writing it on my own for practice.
First I wrote the regex for A as follows:
a = ^(^0{0,2}\d|^0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])$
It works as expected, then I wrote it for A.B as follows:
ab = ^(^0{0,2}\d|^0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])\.
(^0{0,2}\d|^0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])$
But somehow it isn't working as expected. It's not recognizing strings like "2.3" but recognizing "2.003". This is very weird. I have spent hours figuring it out but have totally given up now. Please help me with this.

As @Jorge pointed out in the comments, the ^ character matches the start of a string/line, which can occur for A, as it is presumably the first group of characters in the line, but cannot occur for B, since it will always be preceded by A. This is why it could match 003 (through the subpattern [0-1]\d\d), but it couldn't match 3 through the subpattern ^0{0,2}\d.
Remove the superfluous ^s (keeping only the leading ^ and the trailing $), and you should get the desired behavior:
ab = ^(0{0,2}\d|0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])\.
(0{0,2}\d|0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])$
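The difference is easy to reproduce in Python. This is a sketch using the standard re module; the names OCTET, BROKEN_OCTET, and matches are made up here, and the four-part IPV4 pattern is an assumed extension of the same idea rather than something from the question.

```python
import re

# One octet with up to two leading zeros, as in the corrected pattern.
OCTET = r"(0{0,2}\d|0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])"
# The original octet, with ^ anchors inside two of the alternatives.
BROKEN_OCTET = r"(^0{0,2}\d|^0{0,1}\d\d|[0-1]\d\d|2[0-4]\d|25[0-5])"

FIXED_AB = re.compile(r"^" + OCTET + r"\." + OCTET + r"$")
BROKEN_AB = re.compile(r"^" + BROKEN_OCTET + r"\." + BROKEN_OCTET + r"$")
# Assumed extension to the full A.B.C.D form.
IPV4 = re.compile(r"^" + OCTET + r"(\." + OCTET + r"){3}$")

def matches(pattern, text):
    """True if the anchored pattern matches text."""
    return pattern.search(text) is not None

for text in ["2.3", "2.003", "256.1"]:
    print(text, "broken:", matches(BROKEN_AB, text), "fixed:", matches(FIXED_AB, text))
```

In the broken pattern the second group can only succeed through the alternatives that have no ^, all of which need three digits, which is why "2.003" matched and "2.3" did not.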

Concatenate a range of cells in OO Calc

I have column A with these cells:
A1: Apple
A2: Banana
A3: Cherry
I want a formula that will string them together in one cell like this:
"Apple, Banana, Cherry"

I don't know if it's implemented in OpenOffice, but its cousin LibreOffice Calc has had the TEXTJOIN function since version 5.2:
TEXTJOIN( delimiter, skip_empty, string1[, string2][, …] )
delimiter is a text string and can be a range.
skip_empty is a logical (TRUE or FALSE, 1 or 0) argument. When TRUE, empty strings will be ignored.
string1[, string2][, …] are strings or references to cells or ranges that contain text to join.
Ranges are traversed row by row (from top to bottom).
Example: =TEXTJOIN(",",1,A1:A10)
More info here:
https://help.libreoffice.org/6.3/en-US/text/scalc/01/func_textjoin.html?DbPAR=CALC#bm_id581556228060864

A different approach, suitable for a long list, would be to copy A1 to B1, prepend a " and in B2 enter:
=B1&", "&A2&IF(A3="";"""";"")
then double-click the fill handle of cell B2 (the small square at its bottom right). The result should appear in column B in the row of the last entry of your list.

As of version 4.1.7 of Apache OpenOffice Calc, there still isn't a simple solution to this problem. CONCATENATE doesn't accept cell ranges, and there isn't a TEXTJOIN function like LibreOffice. However, there is a workaround.
This is essentially a duplicate of pnuts' answer, but with images to hopefully help. His answer explicitly addresses separating the items with delimiters, as well as the opening and closing quotations, as the question above uses. As the general question (how to concatenate a range of cells) is useful to many people, I think my answer should still be useful even though I haven't done that.
In my case, I had one column with letters corresponding to finished worksets, and one column with letters corresponding to unfinished worksets. The letters only appear on every 8th row, so I can't view them all at the same time. I wanted to just mash all the finished letters together in one cell to be easy to view, and the same with the unfinished letters.
The example removes the 7 empty rows per letter and manually inputs which letters are finished/unfinished for convenience.
Column A is the "unfinished" column to be concatenated. Column C is used to perform the concatenation. Row 2 is the first row, and row 24 is the final row. G1 shows the concatenated result in an easy-to-see spot near the top of the document.
Columns B and D, and cell G2, utilize the same method to show the "finished" data. The formulas aren't shown here.
In cell C2, point explicitly to A2:
=A2
If you may have blanks, as I do, there needs to be a conditional in C2 to treat the first cell as blank text instead of as zero (see Note 1):
=IF (A2 <> "" ; A2 ; "")
Then, in cell C3, concatenate C2 and A3:
=C2 & A3
Copy C3, then highlight C4:C24 and paste the formula to autofill those cells.
Wherever you need the result of the concatenation, reference C24.
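For readers who think in code, the helper column is a running concatenation: each cell of column C holds everything joined so far, and the last cell holds the final result. The following Python sketch only illustrates that logic outside of Calc; the sample data in column_a is invented.

```python
from itertools import accumulate

# Invented sample data standing in for A2:A6; "" represents a blank cell.
column_a = ["", "A", "", "C", "D"]

# accumulate concatenates step by step, like C3 = C2 & A3 filled down.
column_c = list(accumulate(column_a))

# The bottom cell of the helper column is the concatenation of the range.
result = column_c[-1]
print(column_c)
print(result)
```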
Notes
Note 1: If N cells at the top of column A are blank and you just let C2 = A2, the first N rows of column C will show 0, and a single 0 will be prepended to the concatenation result. Here, columns B and D are used to illustrate the problem.

Either use the CONCATENATE function or ampersands (&):
=CONCATENATE("""", A1, ", ", A2, ", ", A3, """")
For something more powerful, write a Basic macro that uses Join.
EDIT:
There is no function that can concatenate a range. Instead, write a Basic macro or drag and drop CONCATENATE formulas to multiple cells. See https://forum.openoffice.org/en/forum/viewtopic.php?f=9&t=5438.

NFA to an RE Kleene's Theorem

Here is my NFA:
Here is my attempt.
Create new start and final nodes
Next eliminate the 2nd node from the left which gives me ab
Next eliminate the 2nd node from the right which gives me ab*a
Next eliminate the 2nd node from the left which gives me abb*b
Next eliminate the 2nd node from the right which gives me b+ab*a
Which leads to abbb (b+aba)*
Is this the correct answer?

No, that is not correct :(
You do not need to create a new start state; the first state, marked with a - sign, is the start state. Also, the label a,b means a or b, not ab.
There is a theorem called Arden's theorem that is quite helpful for converting an NFA into an RE; see:
What is Regular Expression for this NFA?
In your NFA, the initial part is:
step-1:
(-) --a,b-->(1)
means (a+b)
step-2: next, from state 1 to state 2. Note that state 2 is the accepting (final) state, marked with a + sign.
(1) --b--->(2+)
So you need (a+b)b to reach the final state.
step-3: Once you are at final state 2, any number of b's is accepted (any number means zero or more). This is because of the self-loop on state 2 with label b.
So b* is accepted at state 2.
step-4:
There are actually two loops on state 2.
One is the self-loop with label b, as described in step-3. Its expression is b*.
The second loop on state 2 goes via state 3.
The expression for the second loop on state 2 is aa*b.
Why the expression aa*b?
because:
          a-
          ||       ====> aa*b
          ▼|
(2+)--a-->(3) --b-->(2+)
So, from step-3 and step-4, because of the loops on state 2, a run can loop back either via b or via aa*b ===> (b + aa*b)*
So the regular expression for your NFA is:
(a+b) b (b + aa*b)*
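The result can be sanity-checked by comparing the automaton with the expression on every short string. This is only a sketch: the original NFA image is not available, so the transition table below is an assumption reconstructed from the steps in this answer, and + is written as | in Python's regex syntax.

```python
import re
from itertools import product

# Assumed NFA, reconstructed from steps 1-4: "-" is the start state, "2" accepts.
DELTA = {
    ("-", "a"): {"1"}, ("-", "b"): {"1"},
    ("1", "b"): {"2"},
    ("2", "b"): {"2"}, ("2", "a"): {"3"},
    ("3", "a"): {"3"}, ("3", "b"): {"2"},
}
START, ACCEPTING = "-", {"2"}
PATTERN = re.compile(r"(a|b)b(b|aa*b)*")   # (a+b) b (b + aa*b)*

def nfa_accepts(word):
    """Follow every possible transition and test for an accepting state."""
    current = {START}
    for symbol in word:
        current = set().union(*(DELTA.get((s, symbol), set()) for s in current))
    return bool(current & ACCEPTING)

def mismatches(max_len=8):
    """Strings up to max_len on which the NFA and the regex disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if nfa_accepts(word) != (PATTERN.fullmatch(word) is not None):
                bad.append(word)
    return bad

print(mismatches())
```

An empty list is evidence, not proof, that the expression describes the automaton, but a disagreement on any short string would show up immediately.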
