Regex: Binary string contains at least 3 of a certain integer - regex

I am working with regular expressions for a class, and we need a pattern that requires at least three 0s anywhere in the string. They don't need to be next to each other, so 101010 would pass but 101011 would fail because it has only two 0s.
^[0-1]*(?:0){3,}[0-1]*$
This is what I currently have but that requires them to be adjacent.

How about:
([01]*0[01]*){3}
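The 0 in the center has no quantifier, so it must match exactly once per repetition; repeating the group three times therefore guarantees at least three 0s in the string. The [01]* on either side matches any run of 0s and 1s (zero or more characters), giving it some wiggle room so as not to require that the three zeros occur consecutively. The pattern has no anchors, so use a full-string match (or wrap it in ^...$) when validating.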
(Demo) (regexr.com)
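To see the difference between the two patterns, it helps to run both against a few strings. This is a minimal sketch in Python's re module, which is an assumption since the question doesn't name a regex flavor; the helper name has_three_zeros is made up for illustration.

```python
import re

# Pattern from the question: needs three *adjacent* zeros.
ADJACENT = re.compile(r"[0-1]*(?:0){3,}[0-1]*")
# Pattern from the answer: three zeros anywhere in the string.
ANYWHERE = re.compile(r"([01]*0[01]*){3}")

def has_three_zeros(s: str) -> bool:
    """True if s is a binary string containing at least three 0s."""
    # fullmatch anchors at both ends, so no ^...$ is needed.
    return ANYWHERE.fullmatch(s) is not None

# Columns: input, result of the question's pattern, result of the answer's.
for s in ["101010", "101011", "1000", "0100"]:
    print(s, ADJACENT.fullmatch(s) is not None, has_three_zeros(s))
```

The two columns differ exactly on inputs such as 101010 and 0100, where the zeros are present but not adjacent.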

Related

What is the language accepted by this NFA? Can you express it in English and with regular expression?

I have been trying to solve this NFA, this below it is the best I could come up with. I have hard time describing in English the language it produces, can someone help me to understand better?
Regular expression
(0+11)*10(0+1(0+11)10)
The automaton is not deterministic because there are 3 transitions coming out of P.
It does not accept words ending with 0 and an even number of 1. It must have at least one 1 and one zero. Sequences of odd numbers of 1s followed by zero. Or even number of 1s ending with a 0 if there is at least one 10 preceding the series of 1.
Later I have decided that this description is better.
Accept all strings ending with 0 and an odd number of 1, it does not accept strings ending with 1. It will not accept strings ending with 0 and even number of 1.
Accepted words:
1110
1001100110
11010
Not accepted
111001
You can see that you only have 0s going into the accepting state, which means the pattern has to end with a 0.
You also see that you need to have at least one 1, otherwise you will be stuck in the starting state.
Finally, you see that going from the starting state if you get an even number of 1s, you will end up in the start state again.
So, in other words, accepted patterns need to contain an odd number of 1s and end with a 0.
The simplest regular expression I can think of is:
0*(10*10*)*10+
The ending is pretty clear: you need a 1 followed by at least one 0 (so 10+).
The beginning should be quite clear too: you should be able to have as many 0s as you like in the beginning, thus 0*.
Now what remains is (10*10*)* which is 10*10* repeated an arbitrary number of times.
What this represents is any pattern with an even number of 1s which starts with a 1 (the fact that we have 0* just before that ensures that the global expression also covers strings that don't start with a 1).
Note that 10*10* contains exactly two 1s, so no matter how many times you repeat this pattern you will always have an even number of ones.
But how do we know that any string with an even number of 1s would satisfy this pattern?
We can prove this inductively.
A string with no 1s at all will match this expression (if we consider the 0* bit at the beginning) so we have our base case covered.
A string with a positive even number of ones can always be split into a prefix with just two 1s and a suffix with an even number of 1s (this even number may itself be 0).
So what is the expression for a string that contains exactly two 1s?
It's 0*10*10* or 0* followed by 10*10*.
So that's it - our pattern works for a string without any 1s, and assuming it works for some even number of 1s, we showed that it also works for two additional 1s. That's basically the entire inductive proof.
A quick note to clarify why we only have 0* once at the beginning:
What happens when you have 0*10*10* followed by another 0*10*10*?
That's right - you get 0*10*10*0*10*10*, but since 0*0* is equivalent to 0* we can simplify the expression and keep only a single 0* at the beginning, omitting it from the repeated expression.
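The inductive argument can also be sanity-checked by brute force. The sketch below (Python, with made-up helper names expected and mismatches) compares the regex with the plain-English rule "odd number of 1s and ends with 0" on every binary string up to a given length. It is a check of the regex against that rule, not of the original NFA, which isn't reproduced here.

```python
import re
from itertools import product

PATTERN = re.compile(r"0*(10*10*)*10+")

def expected(s: str) -> bool:
    """Plain-English rule: odd number of 1s and the string ends with 0."""
    return s.count("1") % 2 == 1 and s.endswith("0")

def mismatches(max_len: int) -> list:
    """Binary strings up to max_len where the regex and the rule disagree."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if (PATTERN.fullmatch(s) is not None) != expected(s):
                bad.append(s)
    return bad

# An empty list means the two descriptions agree on all tested strings.
print(mismatches(10))
```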

Regex to allow one special character with at least 5 digits and maximum 6 digits

I have created a regex which follows the following parameters:
Minimum length: 5
Maximum length: 6
Needs to have at least 5 digits
Space and Special characters allowed: #&()_+[]:;',/.\-"*
No alphabets allowed
The regex I created is :
^\d{3}[_\+\[\]\:\;\'\,\/.\-"!##$%^&*()\s]{0,1}\d{2,3}$
This is fulfilling the length requirements and 5 digit requirement, however it is not allowing special characters. I am blocked due to this and unable to find any solution, please help.
You could do it with
^(?:(?=.{6}$)\d*[-#&()_+[\]:;',\/.\\"*]\d*|\d{5,6})$
if your regex-flavor supports look-aheads.
It uses two alternatives. The first starts with a positive look-ahead that checks the length, which must always be 6 when a special character is included (to leave room for 5 digits). Then it matches any number of digits, followed by one special character, and finally any number of digits.
The other alternative just checks for 5-6 digits.
See it here at regex101.
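For a quick local check, here is a small sketch using Python's re module (an assumption, since the question doesn't say which flavor is in use; Python does support look-aheads). The pattern is copied unchanged from above, and is_valid is just an illustrative wrapper.

```python
import re

# Regex from the answer, unchanged. The raw triple-quoted string avoids
# having to escape the quote characters inside the character class.
PATTERN = re.compile(r"""^(?:(?=.{6}$)\d*[-#&()_+[\]:;',\/.\\"*]\d*|\d{5,6})$""")

def is_valid(s: str) -> bool:
    """True if s is 5-6 digits, or 5 digits plus one allowed special character."""
    return PATTERN.match(s) is not None

samples = ["12345", "123456", "123#45", "#12345",
           "12#45", "12##45", "1234567", "12a456"]
for s in samples:
    print(f"{s!r}: {is_valid(s)}")
```

The first four samples satisfy the rules; the last four fail because they have too few digits, two special characters, too many characters, or a letter.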

Convert a regular expression to DFA

I have been trying different ways to solve this problem for over an hour and am getting very frustrated.
The problem is: Give regular expressions and DFAs for each of the following languages over Sigma = {0,1}.
a). {w ∈ Σ* | w contains an even number of 0s or an odd number of 1s}
If anyone could provide hints or get me started on figuring this one out, it would be very appreciated!
I know it is something along the lines of this DFA but this one is for
{w ∈ Σ* | w contains an even number of 0s or exactly two 1's}
so it's a bit different but I can't figure it out.
You can see it as follows: you always have to remember two things:
whether the number of 0s is even or odd; and
whether the number of 1s is even or odd.
Now if we denote even with e and odd with o, we consider four states: ee (both even), eo (even number of 0s and odd number of 1s), oe and oo.
Now when we read a zero (0), we simply swap the first state token, so it means we introduce transitions from:
ee - 0 -> oe;
eo - 0 -> oo;
oe - 0 -> ee; and
oo - 0 -> eo.
The same for ones (1):
ee - 1 -> eo;
eo - 1 -> ee;
oe - 1 -> oo; and
oo - 1 -> oe.
Now we only need to determine the initial state and the accepting state(s). The initial state is ee, since at that moment we have considered no zeros and no ones.
Furthermore, the accepting states can be determined by the condition:
w contains an even number of 0s or an odd number of 1s
So that means the accepting states are ee, eo and oo. A drawing of this DFA is shown below:
[Figure: drawing of the four-state DFA]
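The four-state DFA described above can be written down as a transition table and simulated directly. This is a minimal sketch in Python; the state names and transitions are taken from the answer, while the dictionary layout and the function name accepts are illustrative choices.

```python
# States are named as in the answer: the first letter is the parity of 0s,
# the second letter is the parity of 1s (e = even, o = odd).
TRANSITIONS = {
    ("ee", "0"): "oe", ("eo", "0"): "oo", ("oe", "0"): "ee", ("oo", "0"): "eo",
    ("ee", "1"): "eo", ("eo", "1"): "ee", ("oe", "1"): "oo", ("oo", "1"): "oe",
}
START = "ee"
ACCEPTING = {"ee", "eo", "oo"}

def accepts(w: str) -> bool:
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

for w in ["", "0", "01", "00", "0001"]:
    print(repr(w), accepts(w))
```

The only rejecting state is oe, so the only rejected inputs are those with an odd number of 0s and an even number of 1s, such as "0".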
There exists an algorithmic way to convert a DFA into an equivalent regular expression as is stated here.
You can construct a regular expression by splitting the problem into two easier problems:
a regex that checks if the number of 0s is even; and
a regex that checks if the number of 1s is odd.
For the first, you can use the regex:
(1*01*0)*1*
Indeed: you first have a group (1*01*0). This group ensures that there are two zeros, and 1s can appear everywhere in between. We allow an arbitrary number of repetitions, since the number always remains even. The regex ends with 1* since it is still possible that there are additional ones in the string.
The second problem can be solved with the regex:
0*1(0*10*1)*0*
The solution is more or less the same. The expression between the brackets, (0*10*1), contains exactly two 1s, so any number of repetitions contributes an even number of 1s. By adding a single 1 in front, we ensure the total number of 1s is odd.
A regular expression that then solves the problem is:
(1*01*0)*1*|0*1(0*10*1)*0*
The "pipe" (|) means "or", so the combined expression matches any string that satisfies at least one of the two conditions.
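One way to gain confidence in the combined expression is to compare it with a direct count of 0s and 1s on all short strings. The sketch below does that in Python; the names in_language and disagreements are hypothetical, and fullmatch is used so that the whole string has to match one of the two alternatives.

```python
import re
from itertools import product

EVEN_ZEROS = r"(1*01*0)*1*"
ODD_ONES = r"0*1(0*10*1)*0*"
# Wrap each side in a group so the alternation is between the two
# complete sub-expressions.
COMBINED = re.compile(f"(?:{EVEN_ZEROS})|(?:{ODD_ONES})")

def in_language(w: str) -> bool:
    """Direct definition: even number of 0s or odd number of 1s."""
    return w.count("0") % 2 == 0 or w.count("1") % 2 == 1

def disagreements(max_len: int) -> list:
    """Binary strings up to max_len where regex and definition differ."""
    out = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if (COMBINED.fullmatch(w) is not None) != in_language(w):
                out.append(w)
    return out

# An empty list means no counterexample was found up to that length.
print(disagreements(10))
```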
Think about what possible states you can ever be in.
A string contains either an even number of 0's or an odd number of 0's. (2 possible states)
A string contains either an even number of 1's or an odd number of 1's. (2 possible states)
Now let's look at what combinations are accepted by your language:
even 0's, even 1's: accept
even 0's, odd 1's: accept
odd 0's, even 1's: reject
odd 0's, odd 1's: accept
As a result, your DFA will need 4 states, of which 3 are accept states and 1 is a reject state. Every state will have 2 transitions leading to a different state. Since the empty string has an even number of 0's and an even number of 1's, the first state will be the initial state.
For making this into a regular expression: think about how you'd match an even number of 0's, then how you'd match an odd number of 1's. The language is just the union of these two.
Alternatively, as suggested by Willem, you can use an algorithm to convert any NFA to a regular expression. It has the advantage of being very general, but it's also more technical. Either way, it should lead to an equivalent regular expression.
What does a string with an even number of 0's look like? It might start with any number of 1's, but when we do find a 0 we had better find another one! There can be any number of 1's in between, but we only care about the 0's. Thus, we come up with the following regular expression:
1*(01*01*)*
You should be able to apply a similar logic to match an odd number of 1's. Finally, OR the two expressions to get the requested regular expression.

How do I convert language set notation to regular expressions?

I have the following question on regular expressions, and I just can't get my head around this kind of problem.
L1 = { 0^n 1^m | n ≥ 3 ∧ m is odd }
How would I write a regular expression for this sort of problem when the alphabet is {0,1}?
What's the answer?
The regular expression for your example is (see note 1 at the end of this answer):
000+1(11)*
So what does this do?
The first two characters, 00, are literal zeros. This is going to be important for the next point
The second two characters, 0+, mean "at least one zero, no upper bound". These first four characters satisfy the first condition, which is that we have at least three zeros.
The next character, 1, is a literal one. Since we need to have an odd number of ones, this is the smallest number we're allowed to have
The last-but-one characters, (11), represent a logical grouping of two literal ones, and the ending * says to match this grouping zero or more times. Since we always have at least one 1, we'll always match an odd number. So we're done.
How'd I get that?
The key is knowing regular expression syntax. I happen to have quite a bit of experience in it, but this website helped me to verify.
Once you know the basic building blocks of regex, you need to break down your problem into what you can represent.
For example, regex lets us specify a lower and an upper bound for matching (the {x,y} syntax), and {x} matches exactly x times. Many practical flavors also accept {x,} for a bare lower bound (so 0{3,} would work there), but I stuck with + and *, the basic operators that permit an unbounded number of matches. I also knew that it didn't make sense to apply those modifiers to a group; the restriction that we must have at least 3 zeroes doesn't imply that we must have a multiple of three, for example, so (000)+ was out. I had to apply the modifier to only one character, which meant I had to match a few literals first. 000 matches exactly three 0s, and appending 0* (giving 0000*) allows any number of further zeros, which is exactly what I want; I then condensed that to the equivalent 000+.
For the second condition, I had to think about what an odd number is. By definition, an odd number can be expressed by 2*k + 1, where k is an integer. So I had to match one 1 (Hence the literal 1), and some number of the substring 11. That led me to the group, and then the *. On a slightly different problem, you could write 1(11)+ to match any odd number of ones, and at least 3.
Note 1: A colleague of mine pointed out to me that the + operator isn't technically part of the formal definition of regular expressions. If this is an academic question rather than a programming one, you might find the 0000* version more helpful. In that case, the final string would be 0000*1(11)*
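Both forms, the condensed 000+1(11)* and the formal 0000*1(11)*, can be checked against the set definition by generating strings of the shape 0^n 1^m. This is a sketch in Python; the names in_l1 and check are made up for the example.

```python
import re

PROGRAMMING = re.compile(r"000+1(11)*")   # uses the + shorthand
FORMAL = re.compile(r"0000*1(11)*")       # only concatenation, grouping and *

def in_l1(n: int, m: int) -> bool:
    """Membership rule from the question for the string 0^n 1^m."""
    return n >= 3 and m % 2 == 1

def check(limit: int) -> bool:
    """Compare both regexes with the rule for all n, m below limit."""
    for n in range(limit):
        for m in range(limit):
            s = "0" * n + "1" * m
            want = in_l1(n, m)
            if (PROGRAMMING.fullmatch(s) is not None) != want:
                return False
            if (FORMAL.fullmatch(s) is not None) != want:
                return False
    return True

print(check(8))
# A 0 appearing after a 1 breaks the 0^n 1^m shape, so this is not a match.
print(PROGRAMMING.fullmatch("0001011") is not None)
```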

XSD Restriction based off of certain numbers in range using regular expression

I am working on creating an XSD for a web service that will take in an ID number as an element in the XML. These ID numbers consist of 10 consecutive digits ([0-9]{10}), but I was trying to create a regular expression that could exclude certain elements from this range.
For example, here is the restriction I have currently in my XSD:
<xsd:restriction base="xsd:string">
<xsd:pattern value="[0-9]{10}" />
</xsd:restriction>
I need the restriction to allow a string of [0-9]{10} that doesn't fit the following IDs:
All 0's: [0]{10}
Starting with 6: [6][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
Starting with 000: [0][0][0][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
Starting with 999: [9][9][9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
Ends with 2 0's: [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0][0]
4 0's in Middle: [0-9][0-9][0-9][0][0][0][0][0-9][0-9][0-9]
Is this possible to do from within the XSD or regular expression?
Thanks.
I'd rephrase your restrictions a bit:
The first digit must not be a 6.
At least one of the last two digits must not be a zero.
At least one of the middle four digits must not be a zero.
The original first restriction, an ID consisting only of zeroes, is already covered by the last two of these rules.
The rule on the first digit can be expressed by a set of allowed characters that does not include 6, i.e. [0-57-9].
For the other restrictions, a straightforward solution is to start at the beginning of a section that must not consist only of zeroes and assume a non-zero digit; if that assumption is true, the remaining digits may include zeroes; otherwise the first digit in that section must be a zero and for the remaining characters, this rule can be repeated recursively until only one character is left: ([1-9][0-9]{3}|0(... repeat for three digits, then two digits, ...))
Therefore, a suitable RegEx would be:
[0-57-9][0-9]{2}([1-9][0-9]{3}|0([1-9][0-9]{2}|0([1-9][0-9]|0[1-9])))[0-9]([1-9][0-9]|0[1-9])
Update: The additional restrictions require the following:
At least one of the first three digits must not be a 0.
At least one of the first three digits must not be a 9.
This can be included the same way as above: accept either a first digit that is neither 0 nor 9, or one of those two digits followed by a pair that breaks the run:
([1-57-8][0-9]{2}|0([1-9][0-9]|[0-9][1-9])|9([0-8][0-9]|[0-9][0-8]))([1-9][0-9]{3}|0([1-9][0-9]{2}|0([1-9][0-9]|0[1-9])))[0-9]([1-9][0-9]|0[1-9])
The new part is in the front of the expression:
([1-57-8][0-9]{2}|0([1-9][0-9]|[0-9][1-9])|9([0-8][0-9]|[0-9][0-8]))
So,
either the ID starts with neither a 0 nor with a 9. In that case, there are no restrictions for the next two digits.
or the ID starts with a 0. In that case, one of the next two digits must not be a zero, either the first one or the second one.
or the ID starts with a 9. In that case, one of the next two digits must not be a nine, either the first one or the second one.
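Since the pattern is long, it is worth testing against the rules written out directly. The sketch below uses Python's re module rather than an XSD validator (an assumption made for convenience; the pattern uses only constructs the two share), and compares the regex with a hypothetical allowed function on pseudo-random 10-digit strings that are biased towards 0, 6 and 9.

```python
import random
import re

# Final pattern from the answer, copied as-is and split over three lines.
ID_PATTERN = re.compile(
    r"([1-57-8][0-9]{2}|0([1-9][0-9]|[0-9][1-9])|9([0-8][0-9]|[0-9][0-8]))"
    r"([1-9][0-9]{3}|0([1-9][0-9]{2}|0([1-9][0-9]|0[1-9])))"
    r"[0-9]([1-9][0-9]|0[1-9])"
)

def allowed(id_: str) -> bool:
    """The question's rules written out directly (assumes 10 ASCII digits)."""
    return not (
        id_ == "0" * 10
        or id_.startswith("6")
        or id_.startswith("000")
        or id_.startswith("999")
        or id_.endswith("00")
        or id_[3:7] == "0000"
    )

def count_mismatches(samples: int, seed: int = 0) -> int:
    """Number of sampled IDs where the regex and the rules disagree."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(samples):
        # Bias towards 0, 6 and 9 so the edge cases actually show up.
        candidate = "".join(rng.choice("0006699123") for _ in range(10))
        if (ID_PATTERN.fullmatch(candidate) is not None) != allowed(candidate):
            bad += 1
    return bad

print(count_mismatches(100_000))
```

XSD patterns are implicitly anchored to the whole value, which is why fullmatch is the closest Python equivalent here.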
I think this will cover it:
[01-57-9]\d{2}([1-9]\d{3}|\d[1-9]\d{2}|\d{2}[1-9]\d|\d{3}[1-9])\d([1-9]\d|\d[1-9])
Broken down:
[01-57-9] First character is a number-not-6.
\d{2} Next two characters can be any digit.
Then there is a (...|...|...|...) section, ORing all of these together.
[1-9]\d{3} Of the next 4 digits, the first cannot be zero.
OR
\d[1-9]\d{2} Of the next 4, the second cannot be zero.
OR
\d{2}[1-9]\d Or the third is not zero.
OR
\d{3}[1-9] Or the fourth is not zero.
Then we have another \d, any digit.
Finally,
([1-9]\d|\d[1-9]) either the first or the second of the last two digits cannot be 0.
Since we have two sections that demand at least one digit is not zero, there is no way for all 10 to be zero.
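A few sample IDs show what this second pattern does and does not cover. This is a sketch in Python (again an assumption, standing in for an XSD validator); as far as I can tell the pattern does not enforce the "starting with 000" and "starting with 999" rules, and the last two samples are there to illustrate that.

```python
import re

# Pattern from this answer, copied as-is and split over three lines.
SECOND = re.compile(
    r"[01-57-9]\d{2}"
    r"([1-9]\d{3}|\d[1-9]\d{2}|\d{2}[1-9]\d|\d{3}[1-9])"
    r"\d([1-9]\d|\d[1-9])"
)

def ok(id_: str) -> bool:
    """True if the whole ID matches the pattern."""
    return SECOND.fullmatch(id_) is not None

samples = ["1234567891", "6234567891", "1230000567", "1234567800",
           "0000000000", "0001234567", "9991234567"]
for id_ in samples:
    print(id_, ok(id_))
```

If the 000 and 999 prefixes must be rejected as well, the front part of the pattern would need the same kind of treatment as in the updated expression from the other answer.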