Is this a regular language? If so, what is its regular expression?

B = {1^k y | k >= 1, y in {0, 1}* and y contains at least k 1's }
Is this language regular? If so, how do you prove it, and how would you represent it with a regular expression in Python?
This is for class work, so if you could explain the reasons and processes behind your answer, it'd be much appreciated.

The language you have is equivalent to this language:
B' = {1 y | y in {0, 1}* and y contains at least one 1}
B' is a subset of B, since the condition defining B' is the condition defining B with k fixed at 1.
Conversely, B is a subset of B'. Take any word 1^k y in B with k >= 1, strip its first 1, and let y' = 1^(k-1) y be the rest of the string. Since y contains at least k >= 1 ones, y' contains at least one 1, so the word has the form 1 y' required by B'.
Therefore, we can conclude that B = B'.
So our job is simplified to ensuring that the first character is 1 and that there is at least one 1 in the rest of the string.
The regular expression (the CS notation) will be:
10*1(0 + 1)*
In the notation used by common regex engines:
10*1[01]*
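The equivalence can also be checked by brute force in Python. This is a minimal sketch, not part of the proof: the helper names in_B, all_words and disagreements are made up here, and the check only covers strings up to a chosen length.

```python
import re
from itertools import product

PATTERN = re.compile(r"10*1[01]*")

def in_B(word):
    """Membership in B straight from the definition:
    word = 1^k y with k >= 1 and y containing at least k ones."""
    for k in range(1, len(word) + 1):
        prefix, y = word[:k], word[k:]
        if prefix == "1" * k and y.count("1") >= k:
            return True
    return False

def all_words(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)

def disagreements(max_len):
    """Words on which the definition and the regex disagree."""
    return [w for w in all_words(max_len)
            if in_B(w) != bool(PATTERN.fullmatch(w))]

print(disagreements(10))  # an empty list means the two agree up to length 10
```

Note the use of fullmatch: re.match would only anchor the start of the string, which does not matter for this particular pattern but is a common source of mistakes with others.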
The DFA:
Here q2 is a final state.
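Since the diagram is not reproduced here, the following Python sketch shows one DFA that fits the description. Only "q2 is final" comes from the text above; the other state names (q0, q1, dead) and the transition table are assumptions derived from the regular expression 10*1[01]*.

```python
# Transition table. "dead" is a trap state for strings that start with 0.
DELTA = {
    ("q0", "0"): "dead", ("q0", "1"): "q1",    # first symbol must be 1
    ("q1", "0"): "q1",   ("q1", "1"): "q2",    # wait for a second 1
    ("q2", "0"): "q2",   ("q2", "1"): "q2",    # anything may follow
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}

def accepts(word):
    """Run the DFA on word and report whether it ends in the final state q2."""
    state = "q0"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "q2"

for w in ["11", "1001", "10", "011", ""]:
    print(repr(w), accepts(w))
```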
"At least" is the key to solving this question. If the word becomes "equal", then the story will be different.

Proving a Certain Language Regular

In my computational theory class we have an assignment of proving a language is regular.
The language is defined as:
B = {1^k y | y is in {0, 1}* and y contains at least k 1s, for k >= 1}
To me this language looks like it would need a pushdown automaton, but I would like someone to push me in the right direction to prove that it is regular. Showing me one of these ways to prove it would be helpful: creating an NFA, a DFA, a regular expression, or a regular grammar.
The language L:
L = {1^k y | y is in {0, 1}* and y contains at least k 1s, for k >= 1}
is a regular language. Indeed, this language is very simple. A plain English description of L is: "the set of all strings of '0's and '1's such that every string starts with '1' and contains at least two '1's".
The description given in the question is purposefully complicated to make the question tricky. A simple approach to this kind of problem is to read the definition and try to understand the pattern of the strings in the language: write out all the shortest strings, then the next shortest, and so on.
Also, try to find the shortest strings that do not belong to the language. Below I show this approach on your example, going from the description to an RE or FA.
First, let us understand what kind of strings are allowed in L. Read the following points:
All strings in L consist of '0's and '1's.
According to 1^k y and k >= 1, all strings in L must start with '1', as k is greater than 0.
The pattern of the strings is 11...y (or, we can say, 1+y). That is, a string starts with some 1s and has a suffix y, where y is a string of 0s and 1s.
Note: because k can be any number greater than 0, the only real constraint is that there must be at least one '1' before the substring y. After the first '1', you can consider the remaining suffix to be part of y.
In other words, we can also describe the language as L = { 1y | y contains at least one 1 }.
Now, as I said, try to write some example strings in the language.
Some of the shortest possible strings are:
'11' where k = 1 and y = '1'
'101' where k = 1 and y = '01'
'110' where k = 1 and y = '10'
One more example:
'111' where k = 1 and y = '11' # remember, y must contain at least k ones
One more example: '1111'. What can k and y be? The string '1111' can be interpreted in the following ways:
'1111' with k = 1 and y = '111' # remember, y must contain at least k ones
or k = 2 and y = '11' # remember, y must contain at least k ones
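Listing the valid (k, y) splits by hand gets tedious for longer strings, so here is a small Python helper that enumerates them. It is only an illustration of the definition; the name decompositions is made up for this sketch.

```python
def decompositions(word):
    """Return every (k, y) with word == '1' * k + y, k >= 1,
    and y containing at least k ones."""
    result = []
    for k in range(1, len(word) + 1):
        if word[:k] != "1" * k:
            break  # the prefix is no longer all ones, so no larger k works
        y = word[k:]
        if y.count("1") >= k:
            result.append((k, y))
    return result

print(decompositions("1111"))  # [(1, '111'), (2, '11')]
print(decompositions("110"))   # [(1, '10')]
print(decompositions("10"))    # []
```

A string belongs to the language exactly when this list is non-empty.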
Some example strings that are not in the language:
Strings such as '0', '00' and '01111' cannot be in L, because a string has to start with '1'. So no string matching the pattern 0(0 + 1)*, that is, starting with '0', is in the language.
There are also strings that start with '1' but are still not in the language, e.g. '10': with k = 1 (the minimum value of k), y is '0', which contains no '1'. For the same reason, the string '1' is not in the language. So no string matching the pattern 10*, that is, '1' followed by any number of '0's, is in the language.
So every string in the language starts with '1', and the y part also contains at least one '1'. There is no restriction on where the '1' appears in y. The substring y can be any string of zeros and ones with at least a single one, and the regular expression for y is: 0*1(1 + 0)*.
The regular expression for L is: 10*1(1 + 0)*
A similar approach helps when writing a DFA for the language; see the answer "drawing minimal DFA for the given regular expression". To write a regular grammar, read the answer "Left-Linear and Right-Linear Grammars".

Mathematica solution?

I am new to Mathematica and I have one more task to figure out, but I can't find the answer. I have two lists of numbers ("b", "u"):
b = {8.734059001373602`, 8.330508824111284`, 5.620669156438947`,
1.4722145583571766`, 1.797504620275392`, 7.045821078656974`,
2.1437334927375247`, 2.295629405840401`, 9.749038328921163`,
5.9928406294151095`, 5.710839663259195`, 7.6983109942364365`,
1.02781847368645`, 4.909108426318685`, 2.5860897177525572`,
9.56334726886076`, 5.661774934433563`, 3.4927397824800384`,
0.4570000499566351`, 6.240122061193738`, 8.371962670138991`,
4.593105388706549`, 7.653068139076581`, 2.2715973346475877`,
7.6234743784167875`, 0.9177107503732636`, 3.182296027902268`,
6.196168580445633`, 0.1486794884986935`, 1.2920960388213274`,
7.478757220079665`, 9.610332785387424`, 0.05088141346751485`,
3.940557901075696`, 5.21881311050797`, 7.489624788199514`,
8.773397599406234`, 3.397275198258715`, 1.4847171141876618`,
0.06574278834161795`, 0.620801320529969`, 2.075457888143216`,
5.244608900551409`, 4.54384757203616`, 7.114276285060143`,
2.8878711430358344`, 5.70657733453041`, 8.759173986432632`,
1.9392596667256967`, 7.419234634325729`, 8.258205508179927`,
1.185315253730261`, 3.907753644335596`, 7.168561412289151`,
9.919881985898002`, 3.169835543867407`, 8.352858871046699`,
7.959492335118693`, 7.772764587074317`, 7.091413185764939`,
1.433673058797801`};
and
u={5.1929, 3.95756, 5.55276, 3.97068, 5.67986, 4.57951, 4.12308,
2.52284, 6.58678, 4.32735, 7.08465, 4.65308, 3.82025, 5.01325,
1.17007, 6.43412, 4.67273, 3.7701, 4.10398, 2.90585, 3.75596,
5.12365, 4.78612, 7.20375, 3.19926, 8.10662};
This is the LinePlot of "b" and "u";
I need to compare the first 5 numbers of "b" with the 1st number of "u" and always keep the maximum (wherever "b" < "u", replace "b" with "u"). Then I need to shift by 2 numbers and compare the 3rd, 4th, 5th, 6th and 7th "b" with the 2nd "u", and so on (the shift is always 2 steps). But the overlapping numbers need to be "remembered" and compared again in the next step, so that the maximum is always picked (e.g. the 3rd, 4th and 5th "b" have to be > than both the 1st and the 2nd "u").
Possibly the easiest way would be to cover the maxima shown in the image across the whole function, but I am new to this software and I don't have the experience to do that. Still, it would be awesome if someone could figure out how to do this with a function that does what I have described above.
I believe this does what you want:
With[{n = Length @ u},
  Array[b[[#]] ~Max~ Take[u, ⌊{#-2, #+1}/2⌋ ~Clip~ {1, n}] &, 2 n + 3]
]
{8.73406, 8.33051, 5.62067, 5.1929, 5.55276, 7.04582, 5.55276, 5.55276, 9.74904,--
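For readers who do not use Mathematica, the index arithmetic can be restated in Python. This is a minimal sketch under my own reading of the code above: window j (1-based) covers b[2j-1] through b[2j+3], so element i of b is compared with u[floor((i-2)/2)] through u[floor((i+1)/2)], clipped to the valid range. The function name windowed_max is made up, and stopping at the end of b when it is shorter than 2n + 3 is an assumption of this sketch.

```python
def windowed_max(b, u):
    """For each element of b, take the maximum of it and every u value
    whose 5-wide window (shifted by 2 each step) covers that element."""
    n = len(u)
    out = []
    for i in range(1, min(len(b), 2 * n + 3) + 1):   # 1-based index into b
        lo = min(max((i - 2) // 2, 1), n)            # first covering window
        hi = min(max((i + 1) // 2, 1), n)            # last covering window
        out.append(max(b[i - 1], max(u[lo - 1:hi])))
    return out

# Tiny made-up data: two windows, b[0..4] against 5 and b[2..6] against 3.
print(windowed_max([1, 9, 1, 1, 1, 1, 1], [5, 3]))  # [5, 9, 5, 5, 5, 3, 3]
```

Python's // rounds toward negative infinity, which matches the floor brackets in the Mathematica version for the negative value that occurs at i = 1.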
Or if the lengths of u and b are appropriately matched:
With[{n = Length @ u},
  MapIndexed[# ~Max~ Take[u, ⌊(#2[[1]] + {-2, 1})/2⌋ ~Clip~ {1, n}] &, b]
]
These are quite a lot faster than Mark's solution. With the following data:
u = RandomReal[{1, 1000}, 1500];
b = RandomReal[{1, 1000}, 3004];
Mark's code takes 2.8 seconds, while mine take 0.014 and 0.015 seconds.
Please ask your future questions on the dedicated Mathematica StackExchange site.
I think that there's a small problem with your data: u doesn't have as many elements as Partition[b,5,2]. Leaving that to one side, the best I could do was:
Max /@ Transpose[
  Table[Map[If[# > 0, Max[#, u[[i]]], 0] &,
    RotateRight[PadRight[Partition[b, 5, 2][[i]], Length[b]],
      2 (i - 1)]], {i, 1, Length[u]}]]
which starts producing the same numbers as in your comment.
As ever, pick this apart from the innermost expression and work outwards.

How to assign binary variable in AMPL in respect to another variable

I have a problem with AMPL modelling. Can you help me define a binary variable u that is supposed to equal 0 when another variable x equals 0, and 1 when x is different from 0?
I tried to use logical expressions, but the solvers I am working with (cplex and minos) don't allow them.
My idea was:
subject to:
u || x != u && x
Take M to be a 'big' constant such that x < M always holds, and assume x is a nonnegative integer (or, if x is continuous, that x >= 1 whenever x is nonzero). You can use the two constraints:
u <= x (if x=0, then u=0)
x <= M*u (if x>0, then u=1)
with u a binary variable.
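To see that the two inequalities pin u down, one can enumerate the feasible values of u for each x. This is a plain Python check, not AMPL; the function name feasible_u and the bound M = 10 are made up for the illustration.

```python
def feasible_u(x, M):
    """Values of the binary variable u that satisfy u <= x and x <= M*u."""
    return [u for u in (0, 1) if u <= x and x <= M * u]

M = 10  # hypothetical upper bound on x
for x in range(0, M + 1):
    # For integer x in 0..M exactly one u is feasible: 0 if x == 0, else 1.
    assert feasible_u(x, M) == [0 if x == 0 else 1]

print(feasible_u(0, M))    # [0]
print(feasible_u(7, M))    # [1]
print(feasible_u(0.3, M))  # []  (no u works for a fractional x below 1)
```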
If x is continuous and can take values strictly between 0 and 1, you will have to adapt the constraints above (for example, the first constraint would be violated with x=0.3 and u=1).
The general idea is that you can (in many cases) replace those logical constraints with inequalities, using the fact that if a and b are boolean variables, then the statement "a implies b" can be written as b>=a (if a=1, then b=1).
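The claim about implication is easy to confirm with a four-row truth table. A short Python sketch (the helper implies is defined here only for the comparison):

```python
from itertools import product

def implies(a, b):
    """Logical implication: false only when a is true and b is false."""
    return (not a) or b

# Each row: a, b, truth value of "a implies b", truth value of b >= a.
table = [(a, b, implies(bool(a), bool(b)), b >= a)
         for a, b in product((0, 1), repeat=2)]

for a, b, logical, inequality in table:
    print(a, b, logical, inequality)
```

The last two columns agree on every row, which is why the inequality can stand in for the logical statement when a and b are restricted to 0 and 1.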

Solving a linear equation in one variable

What would be the most efficient algorithm to solve a linear equation in one variable given as a string input to a function? For example, for input string:
"x + 9 - 2 - 4 + x = - x + 5 - 1 + 3 - x"
The output should be 1.
I am considering using a stack and pushing each string token onto it as I encounter spaces in the string. If the input were in Polish notation, it would be easier to pop numbers off the stack to get to a result, but I am not sure what approach to take here.
It is an interview question.
Solving the linear equation is (I hope) extremely easy for you once you've worked out the coefficients a and b in the equation a * x + b = 0.
So, the difficult part of the problem is parsing the expression and "evaluating" it to find the coefficients. Your example expression is extremely simple, it uses only the operators unary -, binary -, binary +. And =, which you could handle specially.
It is not clear from the question whether the solution should also handle expressions involving binary * and /, or parentheses. I'm wondering whether the interview question is intended:
to make you write some simple code, or
to make you ask what the real scope of the problem is before you write anything.
Both are important skills :-)
It could even be that the question is intended:
to separate those with lots of experience writing parsers (who will solve it as fast as they can write/type) from those with none (who might struggle to solve it at all within a few minutes, at least without some hints).
Anyway, to allow for future more complicated requirements, there are two common approaches to parsing arithmetic expressions: recursive descent or Dijkstra's shunting-yard algorithm. You can look these up, and if you only need the simple expressions in version 1.0 then you can use a simplified form of Dijkstra's algorithm. Then once you've parsed the expression, you need to evaluate it: use values that are linear expressions in x and interpret = as an operator with lowest possible precedence that means "subtract". The result is a linear expression in x that is equal to 0.
If you don't need complicated expressions then you can evaluate that simple example pretty much directly from left-to-right once you've tokenised it[*]:
x
x + 9
// set the "we've found minus sign" bit to negate the first thing that follows
x + 7 // and clear the negative bit
x + 3
2 * x + 3
// set the "we've found the equals sign" bit to negate everything that follows
3 * x + 3
3 * x - 2
3 * x - 1
3 * x - 4
4 * x - 4
Finally, solve a * x + b = 0 as x = - b/a.
[*] example tokenisation code, in Python:
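def tokenise(input):
    acc = None  # value of the number currently being read, if any
    for idx, ch in enumerate(input):
        if ch in '1234567890':
            if acc is None: acc = 0
            acc = 10 * acc + int(ch)
            continue
        if acc is not None:
            yield acc
            acc = None
        if ch in '+-=x':
            yield ch
        elif ch == ' ':
            pass
        else:
            raise ValueError('illegal character "%s" at %d' % (ch, idx))
    # emit a number that runs up to the end of the input
    if acc is not None:
        yield acc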
Alternative example tokenisation code, also in Python, assuming there will always be spaces between tokens as in the example. This leaves token validation to the parser:
return input.split()
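The left-to-right walk described in this answer can be sketched in a few lines of Python. This is a minimal sketch, assuming space-separated tokens (as with the split() tokeniser) and only +, -, = and x; the names solve_linear, side and sign are made up here and stand in for the "equals sign" and "minus sign" bits.

```python
from fractions import Fraction

def solve_linear(equation):
    """Accumulate a*x + b = 0 while scanning the tokens once, then solve."""
    a = b = 0
    side = 1   # becomes -1 after '=': everything that follows is negated
    sign = 1   # becomes -1 after '-': only the next operand is negated
    for tok in equation.split():
        if tok == '=':
            side, sign = -1, 1
        elif tok == '+':
            sign = 1
        elif tok == '-':
            sign = -1
        elif tok == 'x':
            a += side * sign
            sign = 1
        else:
            b += side * sign * int(tok)
            sign = 1
    if a == 0:
        raise ValueError('x cancels out; no unique solution')
    return Fraction(-b, a)  # exact result, no floating-point rounding

print(solve_linear("x + 9 - 2 - 4 + x = - x + 5 - 1 + 3 - x"))  # 1
```

On the example the accumulated coefficients end at a = 4 and b = -4, the same 4 * x - 4 reached in the trace above.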
ok, some simple pseudocode that you could use to solve this problem:
function(stringToParse){
    arrayoftokens = stringToParse.match(RegexMatching);
    foreach(arrayoftokens as token)
    {
        //now step through the tokens and determine what they are
        //and store the necessary information.
    }
    //Use the above information to do the arithmetic.
    //count the number of times a variable appears positive and negative
    //do the arithmetic.
    //add up the numbers both positive and negative.
    //return the result.
}
The first thing is to parse the string and identify the various tokens (numbers, variables and operators), so that an expression tree can be formed by giving the operators their proper precedence.
Regular expressions can help, but they are not the only method (grammar parsers like boost::spirit are good too, and you can even roll your own: it's all "find and recurse").
The tree can then be manipulated by reducing its nodes: execute the operations that deal only with constants, and group the variable-related operations and execute them accordingly.
This goes on recursively until you are left with one variable-related node and one constant node.
At that point the solution is calculated trivially.
These are basically the same principles that lead to the production of an interpreter or a compiler.
Consider:
from operator import add, sub
def ab(expr):
    a, b, op = 0, 0, add
    for t in expr.split():
        if t == '+': op = add
        elif t == '-': op = sub
        elif t == 'x': a = op(a, 1)
        else : b = op(b, int(t))
    return a, b
Given an expression like 1 + x - 2 - x... this converts it to a canonical form ax+b and returns a pair of coefficients (a,b).
Now, let's obtain the coefficients from both parts of the equation:
le, ri = equation.split('=')
a1, b1 = ab(le)
a2, b2 = ab(ri)
and finally solve the trivial equation a1*x + b1 = a2*x + b2:
x = (b2 - b1) / (a1 - a2)
Of course, this only solves this particular example, without operator precedence or parentheses. To support the latter you'll need a parser, presumably a recursive descent one, which would be simplest to code by hand.
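For illustration, here is what such a hand-written recursive descent parser could look like. It is a sketch under assumptions that go beyond the original question: the grammar (+, -, *, unary minus, parentheses), the class name LinearParser and the function solve are all made up here. Every sub-expression is evaluated to a pair (a, b) meaning a*x + b.

```python
import re
from fractions import Fraction

class LinearParser:
    """Evaluates an expression to (a, b), meaning a*x + b."""

    def __init__(self, text):
        # Characters outside this token set (e.g. spaces) are silently skipped.
        self.toks = re.findall(r"\d+|[x+\-*()]", text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):    # expr := term (('+' | '-') term)*
        a, b = self.term()
        while self.peek() in ('+', '-'):
            op = self.take()
            c, d = self.term()
            a, b = (a + c, b + d) if op == '+' else (a - c, b - d)
        return a, b

    def term(self):    # term := factor ('*' factor)*
        a, b = self.factor()
        while self.peek() == '*':
            self.take()
            c, d = self.factor()
            if a != 0 and c != 0:
                raise ValueError('x * x is not linear')
            a, b = a * d + c * b, b * d
        return a, b

    def factor(self):  # factor := '-' factor | '(' expr ')' | 'x' | number
        tok = self.take()
        if tok == '-':
            a, b = self.factor()
            return -a, -b
        if tok == '(':
            value = self.expr()
            if self.take() != ')':
                raise ValueError('missing closing parenthesis')
            return value
        if tok == 'x':
            return 1, 0
        if tok is None:
            raise ValueError('unexpected end of input')
        return 0, int(tok)  # int() raises ValueError on a stray operator

def solve(equation):
    left, right = equation.split('=')
    sides = []
    for text in (left, right):
        parser = LinearParser(text)
        sides.append(parser.expr())
        if parser.peek() is not None:
            raise ValueError('unexpected token %r' % parser.peek())
    (a1, b1), (a2, b2) = sides
    return Fraction(b2 - b1, a1 - a2)  # ZeroDivisionError if x cancels out

print(solve("2 * (x + 3) = x + 10"))                     # 4
print(solve("x + 9 - 2 - 4 + x = - x + 5 - 1 + 3 - x"))  # 1
```

Each grammar rule becomes one method, and precedence falls out of which method calls which: expr calls term, term calls factor.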

How do I build this finite automaton?

I'm studying for a Discrete Mathematics test and I found this exercise which I can't figure out.
"Build a basic finite automaton (DFA,NFA,NFA-lambda) for the language in the alphabet Sigma = {0,1,2} where the sum of the elements in the string is even AND this sum is more than 3"
I have tried using Kleene's Theorem, concatenating two languages: the one associated with this regular expression:
(00 U 11 U 22 U 02 U 20)* - the even elements
with this one:
(22 U 1111 U 222 U 2222)* - the ones whose sum is greater than 3
Does this make any sense? I think my regexes are flabby.
I find your notation a bit fuzzy, so perhaps I'm completely misunderstanding. If so, disregard the following. It seems you're not there yet:
I assume the * means '0 or more times'. However, at least one of the strings with sum >= 3 must occur. I'd say you need a + ('1 or more times').
112 and 211 are missing from the list of strings with sum >= 3.
222 and 2222 in that list are superfluous.
All of these strings may be arbitrarily interspersed with 0s.
The sum of 00 is no more even than the sum of 0.
Edit: how about this (acc is the only accepting state, dot-source):
automaton http://student.science.uva.nl/~sschroev/so/885411.png
At states a and c the string sum is always odd. At states start, b and acc the sum is always even. Furthermore, at start the sum is 0, at b it is 2 and at acc it is >= 4. This can be proved rather easily. Hence the accepting state acc meets all criteria.
Edit 2: I'd say this is a regex which accepts the requested language:
0*(2|(1(0|2)*1))(0*(2|(1(0|2)*1)))+
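A candidate expression like this can be compared against the definition by brute force. The sketch below is in Python; the pattern string assumes the repeated part is grouped as (...)+ with balanced parentheses, and the names in_language and mismatches are made up. On strings of up to three symbols it reports 121 and 220, both of which have sum 4, so the expression seems to need a further adjustment for single 1...1 blocks with a 2 inside and for trailing 0s.

```python
import re
from itertools import product

# Candidate regex, with the repeated group assumed to be parenthesised.
CANDIDATE = re.compile(r"0*(2|(1(0|2)*1))(0*(2|(1(0|2)*1)))+")

def in_language(word):
    """Definition from the exercise: digit sum is even and greater than 3."""
    total = sum(int(ch) for ch in word)
    return total % 2 == 0 and total > 3

def mismatches(max_len):
    """Strings over {0,1,2} on which the regex and the definition disagree."""
    found = []
    for n in range(max_len + 1):
        for letters in product("012", repeat=n):
            word = "".join(letters)
            if in_language(word) != bool(CANDIDATE.fullmatch(word)):
                found.append(word)
    return found

print(mismatches(3))  # ['121', '220']
```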
Not sure if this is answering your question, but: do you need to submit a regular expression? or will an FSM do?
At any rate, it might be helpful to draw the FSM first, and I think this is a correct DFA:
FSM http://img254.imageshack.us/img254/5324/fsm.png
If that is the case, when constructing your regular expression (which, remember, has different syntax than programming "regex"):
0* to indicate "0 as many times as you want". This makes sense, since 0 in your string doesn't change the state of the machine. (See, in the FSM, 0 just loops back to itself)
You'd need to account for the different combinations of "112" or "22" etc - until you reach at least 4 in your sum.
If your sum is greater than 3, and even, then (0|2)* would keep you at a final state. Otherwise (sum > 3, and odd) you'd need something like 1(0|2)* in order to put you at an accepting state.
(don't know if this helps, or if it's right - but it might be a start!)
Each expression, as guided by Stephan, may be:
(0*U 2* U 11)* - the even sums
with this one
(22 U 11 U 222 U 112 U 211 U 121)+ - the ones whose sum is greater than 3
I don't know if it could be simplified further; it would help in designing the automaton.
I find it easier just to think in terms of states. Use five states: 0, 1, 2, EVEN, ODD
Transitions:
State, Input -> New State
(0, 0) -> 0
(0, 1) -> 1
(0, 2) -> 2
(1, 0) -> 1
(1, 1) -> 2
(1, 2) -> ODD
(2, 0) -> 2
(2, 1) -> ODD
(2, 2) -> EVEN
(ODD, 0) -> ODD
(ODD, 1) -> EVEN
(ODD, 2) -> ODD
(EVEN, 0) -> EVEN
(EVEN, 1) -> ODD
(EVEN, 2) -> EVEN
Only EVEN is an accepting state.
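The transition table above can be run directly and compared with the definition of the language. A minimal Python sketch: the table is copied from the list above, while the function names accepts and agrees_with_definition are made up for the check.

```python
from itertools import product

# (state, input) -> new state, exactly as listed above.
DELTA = {
    ("0", "0"): "0",       ("0", "1"): "1",       ("0", "2"): "2",
    ("1", "0"): "1",       ("1", "1"): "2",       ("1", "2"): "ODD",
    ("2", "0"): "2",       ("2", "1"): "ODD",     ("2", "2"): "EVEN",
    ("ODD", "0"): "ODD",   ("ODD", "1"): "EVEN",  ("ODD", "2"): "ODD",
    ("EVEN", "0"): "EVEN", ("EVEN", "1"): "ODD",  ("EVEN", "2"): "EVEN",
}

def accepts(word):
    """Run the DFA from state 0; only EVEN is accepting."""
    state = "0"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "EVEN"

def agrees_with_definition(max_len):
    """Check every string over {0,1,2} up to max_len against the definition."""
    for n in range(max_len + 1):
        for letters in product("012", repeat=n):
            word = "".join(letters)
            total = sum(int(ch) for ch in word)
            if accepts(word) != (total % 2 == 0 and total > 3):
                return False
    return True

print(agrees_with_definition(8))  # True
```

States 0, 1 and 2 record the exact sum while it is still at most 2; once the sum reaches 3 or more, only its parity matters, which is what ODD and EVEN track.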