Is the regular expression below the solution to this exercise? I found it on the internet, but I don't believe it is correct.
(1*011*(0+011*))*
According to the theory of Chapter 1 in the book "The handbook of computational linguistics and natural language processing", how could I solve this exercise?
I would like a regular expression that describes the regular language below:
L = {010,0101,0110,0101010,01011110,.....}
Here is another option:
^[^0]*[0]{1}([^0]+[0]{1}[^0]*)+$
You can go with:
^(?!.*00.*)(?=.*0.*0).*$
You can play with it here.
Explanation:
(?!.*00.*) the input can't have two consecutive 0s
(?=.*0.*0) the input has to contain at least two 0s
If you don't want to use lookaround, use Maria's answer:
(1+01)* 0 (1+) 0 (1+10)*
This solves the problem
How about this:
0[^0]+0
Zero 0 followed by a character in the range "not zero" [^0] followed by zero 0.
The regexp you posted is erroneous: it suffices to note that it has a 0+ subsequence, which admits a sequence of one or more 0s. It can be corrected with this solution:
1*011*0(10?)*
or with + operator
1*01+0(10?)*
An explanation: first skip all the 1s that start the string with the 1* subpattern, so you get to the first 0; then match at least one 1, and all the 1s following it (with subpattern 1+), up to the second 0. At that point we have matched the minimum that your regular language requires. From here on, everything is optional: we repeat any number of times the pattern 1 with an optional trailing 0 (as in 10?), because otherwise there would be two consecutive 0s. You can check it in this demo, which contains all the possible strings from 1 to 8 characters and whether each of them matches.
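The check over all strings of 1 to 8 characters can also be reproduced locally. Below is a minimal Python sketch, assuming the language is "at least two 0s and no two consecutive 0s" over the alphabet {0,1}; the helper names in_language and disagreements are made up for this illustration.

```python
import re
from itertools import product

def in_language(s):
    # Assumed reading of L: at least two 0s, never two 0s in a row.
    return s.count("0") >= 2 and "00" not in s

def disagreements(pattern, max_len=8):
    """Return the binary strings (length 1..max_len) on which the pattern,
    applied to the whole string, disagrees with in_language."""
    rx = re.compile(pattern)
    bad = []
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if (rx.fullmatch(s) is not None) != in_language(s):
                bad.append(s)
    return bad

for pattern in ("1*011*0(10?)*", "1*01+0(10?)*"):
    print(pattern, disagreements(pattern))
```

An empty list means the pattern agrees with that reading of the language on every string tried. It is a sanity check up to the chosen length, not a proof.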
If it's at least two 0s, then there's also a possibility of 1s being at the start.
So wouldn't that be 1* 0 1* 0 (1+01)*
But if it's according to the language given (the two 0s at the beginning and end),
0 (1+01)* 0
1*01(1|01)*01*
I think this would work perfectly
Given language contains at least two zeroes but not consecutive zeroes.
The strings accepted by this language are L= {010,0101,0110,0101010,01011110,.....}
The matching regular expression is:
1*01*(10+101*)^+
Here + represents at least a one time occurrence.
DFA for the above language is shown in this link:
DFA IMAGE
I'm trying to solve this issue:
How to use the interval quantifier curlybraces "{}" from 0 up to 4 in a REGEXMATCH Google Sheets formula to make it match only the occurrences from zero occurrence up to 4 occurrences and no more?
Here the source and context I started first from (section Quantifiers — * + ? and {}) Regex tutorial — A quick cheatsheet by examples
Specifying the following:
a(bc){2,5} matches a string that has a followed by 2 up to 5 copies of the sequence bc
My formula is:
=REGEXMATCH($A7,"a(bc){0,4}")
Here the 1st input in A7:
abcbcbcbcbc
Contrary to expectation, it returns TRUE even though A7 contains more than 4 occurrences of bc.
The same unexpected result occurs for the intervals {1,4} and {2,4}:
=REGEXMATCH($A7,"a(bc){1,4}")
=REGEXMATCH($A7,"a(bc){2,4}")
It still returns TRUE in those latter two cases as well, despite the 5 occurrences of the bc sequence.
Here the Sheet:
quantifier interval in regex from zero to defined interval end
I read the general regex info answer here Learning Regular Expressions [closed] but couldn't find the solution.
How to make it return FALSE for any input of more than 4 bc's in A7?
Thanks a lot for your help!
A regex does not have to match the entire string you are checking it against by default. The function will return True if the regex matches any substring of the provided string.
To change that behaviour add the character ^ to match the start of the subject string and the character $ to match its end.
For example: =REGEXMATCH($A7,"^a(bc){0,4}$") will not match abcbcbcbcbc.
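The same effect can be reproduced outside Sheets. Here is a small sketch in Python, assuming its re.search is comparable to REGEXMATCH in that it looks for the pattern anywhere in the string:

```python
import re

subject = "abcbcbcbcbc"  # "a" followed by five copies of "bc"

# Unanchored: succeeds because the prefix "abcbcbcbc" already satisfies the pattern.
unanchored = re.search(r"a(bc){0,4}", subject) is not None

# Anchored: the whole string must be "a" followed by at most four "bc".
anchored = re.search(r"^a(bc){0,4}$", subject) is not None

print(unanchored, anchored)  # prints: True False
```

With the anchors in place, an input with exactly four copies of bc (abcbcbcbc) still matches, while the five-copy input does not.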
I've got a question that asks for a regular expression matching non-empty strings that start and end with two 1's. The alphabet is {0,1}. It needs to match strings such as {11, 111, 1111, 11000...11..0011}. However many 1's and 0's are in between doesn't matter, as long as the string ends with two 1's. So far I've got this:
^(1{2,4}|(11[01]*0[01]*11))$
But my answer wasn't accepted because it needs to be simplified. Something along these lines: 11(0|1)*(11)*. This allows infinitely many 11's at the end, so it's not accepted. I just can't figure it out; can someone please push me in the right direction?
One possibility: ^(?=11)[01]*11$. See demo. This uses a lookahead to assert that the string starts with 11, which handles the edge cases (11, 111) well because the lookahead doesn't consume characters. It then matches the whole string with [01]*11$, which allows only 1 and 0 and requires the string to end with 11.
Or based on your existing approach ^(1{2,3}|11[01]*11)$ should work as well. demo.
The simplest one:
11((0*1)*1)*
Explain:
Whenever the inner group consumes 0's, it must end with a 1, and the outer group then requires another 1 after it (e below stands for the empty string).
11 # match because 11 and Kleene star group is empty
111 # match 11(e1) -> 111
1111 # match 11(e1)1 --> 1111
11011 # match 11(01)1
11001 # non-match because 11(001) (no 1's at the end)
110111011 # match 11((01)1)(e1)((01)1)
^(1{2,4}|11[01]+11)$
^(1{2,3}|11[01]*11)$
^(11|111|11[01]*11)$
Your last answer is very close. (^11[01]*11$|^11+$) would do.
I added the OR 11+ to cover the 11 and 111 cases, because the expression on the left covers anything that starts with 11, then optionally has some 0's and/or 1's, and then definitely has 11 again. This means the shortest string it will match is 1111. Hence the fix.
EDIT:
Sorry, I answered too fast. Take Psidom's answer; it's perfect.
Only 0 and 1?
And starts and ends with 11?
But also matching "11" or "111"?
Then this regex also does that:
^11(1|[01]*11)?$
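To compare the patterns suggested in this thread, one option is to brute-force all short binary strings. This is a rough Python sketch, assuming the requirement is "starts with 11 and ends with 11, where the two may overlap" (so 11 and 111 count); the names CANDIDATES, expected and mismatches are invented for the example.

```python
import re
from itertools import product

CANDIDATES = [
    r"^(?=11)[01]*11$",
    r"^(1{2,3}|11[01]*11)$",
    r"11((0*1)*1)*",
    r"(^11[01]*11$|^11+$)",
    r"^11(1|[01]*11)?$",
]

def expected(s):
    # Assumed spec: starts with 11 and ends with 11 (the two may overlap).
    return s.startswith("11") and s.endswith("11")

def mismatches(pattern, max_len=8):
    """Binary strings (length 1..max_len) where the pattern, applied to the
    whole string, disagrees with expected()."""
    rx = re.compile(pattern)
    return [
        s
        for n in range(1, max_len + 1)
        for s in map("".join, product("01", repeat=n))
        if (rx.fullmatch(s) is not None) != expected(s)
    ]

for pattern in CANDIDATES:
    print(pattern, mismatches(pattern))
```

Under that assumption, the same check flags the pattern from the question, ^(1{2,4}|(11[01]*0[01]*11))$, on inputs such as 11111, because neither alternative accepts five or more 1's with no 0 in between.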
I came across the regular expression not containing 101 as follows:
0∗1∗0∗+(1+00+000)∗+(0+1+0+)∗
I was unable to understand how the author came up with this regex. So I thought of a string that does not contain 101:
01000100
It seems that the above string will not be matched by the above regex, but I am unsure. So I tried translating it to an equivalent PCRE regex on regex101.com, but failed there too (as can be seen, my regex does not even match a string containing a single 1).
What's wrong with my translation? Is the above regex indeed correct? If not, what would the correct regex be?
Here is a bit shorter expression ^0*(1|00+)*0*$
https://www.regex101.com/r/gG3wP5/1
Explanation:
(1|00+)* we can mix zeroes and ones as long as zeroes occur in groups
^0*...0*$ we can have as many zeroes as we want in prefix/suffix
Direct translation of the original regexp would be like
^(0*1*0*|(1|00|000)*|(0+1+0+)*)$
Update
This seems like artificially complicated version of the above regexp:
(1|00|000)* is the same as (1|00+)*
it is almost the solution, but it does not match strings 0, 01.., and ..10
0*1*0* doesn't match strings with 101 inside, but matches 0 and some of 01.., and ..10
we still need to match those of 01.., and ..10 which have 0 & 1 mixed inside, e.g. 01001.. or ..10010
(0+1+0+)* matches some of the remaining cases but there are still some valid strings unmatched
e.g. 10010 is one of the shortest valid strings (together with 01001) that none of the three alternatives matches.
So, this solution is overly complicated and not complete.
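The claims above can be checked by brute force. This is a minimal Python sketch, with the hypothetical helpers missed and wrongly_accepted, that tests both the shorter expression and the direct translation against every binary string of up to 8 characters:

```python
import re
from itertools import product

SHORT = r"^0*(1|00+)*0*$"
DIRECT = r"^(0*1*0*|(1|00|000)*|(0+1+0+)*)$"

def binary_strings(max_len):
    # All strings over {0,1} of length 1..max_len, shortest first.
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def missed(pattern, max_len=8):
    """Strings without 101 that the pattern fails to match."""
    rx = re.compile(pattern)
    return [s for s in binary_strings(max_len)
            if "101" not in s and rx.fullmatch(s) is None]

def wrongly_accepted(pattern, max_len=8):
    """Strings containing 101 that the pattern matches anyway."""
    rx = re.compile(pattern)
    return [s for s in binary_strings(max_len)
            if "101" in s and rx.fullmatch(s) is not None]

print(missed(SHORT), wrongly_accepted(SHORT))
print(missed(DIRECT)[:5], wrongly_accepted(DIRECT))
```

Because the strings are generated shortest first, the first entries of missed(DIRECT) are the shortest valid strings that the translated expression does not cover.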
Read the explanation in the right-side tab on regex101; it tells you what your regex does (I think you misunderstood what the list operator does). Inside a list operator ( [ ), other characters such as ( are no longer metacharacters, so the expression [(0*1*0*)[1(00)(000)] is equivalent to [01()*[], which means it matches 0 or 1 or ( or ) or * or [.
The correct translation of the regular expression 0∗1∗0∗+(1+00+000)∗+(0+1+0+)∗
will be as follows:
^((?:0*1*0*)|(?:1|00|000)*|(?:0+1+0+)*)$
What your regex [(0*1*0*)[1(00)(000)]*(0+1+0+)*] does:
[(0*1*0*)[1(00)(000)]* -> matches any of the characters 0, 1, (, ), *, [ zero or more times, followed by
(0+1+0+)* --> matches the pattern 0+1+0+ 0 or more times followed by
] --> matches the character ]
so your expression is equivalent to
[01()*[]*(0+1+0+)*] which is not a regular expression that matches strings that do not contain 101
0* 1* ( (00+000)* 1*)* (ε+0)
I think this expression covers all cases because:
Any number apart from 1 can be broken into constituent 2's and 3's, i.e. any number n = 2*i + 3*j. So there can be any number of 0's between two consecutive 1's apart from exactly one 0. Hence, 101 cannot be obtained.
ε+0 for expressions ending in one 0.
The RE for language not containing 101 as sub-string can also be written as (0*1*00)*.0*.1*.0*
This may be a smaller one than what you are using. Try to make use of this.
Regular Expression I got (0+10)1. (looks simple :P)
I just considered all cases to make this.
If you consider two consecutive 1's, we have to continue with 1's only.
case 1: 11111111111111...
case 2: 0000000011111111111111... (once we take two 1's we can't accept 0's, so the only option is to continue with 1's)
If you consider a single 1 followed by a 0, there is no issue: after a single 1 we can have any number of 0's.
case 3: 00000000 10100100010000100000100000 1111111111
=>(0*+10*)1
final answer (0+10)1.
Thanks for your patience.
I am looking for a simple way to use a regex to catch variants of a word, in the simplest format possible.
For example, the 5 variants of the word below.
hike
hhike
hiike
hikke
hikkee
Using something similar to the format below...
[([a-zA-Z]){4,}]
Thanks
Are you looking for something like /h+i+k+e+/?
Meaning:
The literal h character repeated 1 to infinity times
The literal i character repeated 1 to infinity times
The literal k character repeated 1 to infinity times
The literal e character repeated 1 to infinity times
If each character can appear at most twice, you can use /h{1,2}i{1,2}k{1,2}e{1,2}/ meaning "present 1 or 2 times".
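For a given word, a pattern of this shape can be built mechanically. Here is a small Python sketch, where variant_pattern is a made-up helper name and the assumption is that each letter may be repeated any number of times; it produces one pattern per word, not a single generic regex.

```python
import re

def variant_pattern(word):
    # Hypothetical helper: allow every letter of the word to be repeated.
    # re.escape keeps characters such as "." or "+" literal.
    return re.compile("".join(re.escape(ch) + "+" for ch in word))

hike = variant_pattern("hike")
print(hike.pattern)

for candidate in ["hike", "hhike", "hiike", "hikke", "hikkee", "hke", "bike"]:
    # fullmatch requires the whole candidate to fit the pattern.
    print(candidate, hike.fullmatch(candidate) is not None)
```

Swapping the "+" for "{1,2}" in the helper would give the bounded variant instead.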
You probably cannot solve this generically (i.e. for any word) under standard regex syntax.
For a given word, as others have pointed out, it is trivial.
This is more of a soundex kind of problem I think:
https://stackoverflow.com/a/392236/514463
I'm looking for a way to check if a number is greater than 0 using regex.
For example:
12 would return true
0 would return false.
I don't know how MVC is relevant, but if your ID is an integer, this BRE should do:
^[1-9][0-9]*$
If you want to match real numbers (floats) rather than integers, you need to handle the case above, along with normal decimal numbers (e.g. 2.5 or 3.3̅), cases where your number is between 0 and 1 (e.g. 0.25), as well as cases where your number has a decimal part that is 0 (e.g. 2.0). And while we're at it, we'll add support for leading zeros on integers (e.g. 005):
^(0*[1-9][0-9]*(\.[0-9]+)?|0+\.[0-9]*[1-9][0-9]*)$
Note that this second one is an Extended RE. The same thing can be expressed in Basic RE, but almost everything understands ERE these days. Let's break the expression down into parts that are easier to digest.
^(
The caret matches the null at the beginning of the line, so preceding your regex with a caret anchors it to the beginning of the line. The opening parenthesis is there because of the or-bar, below. More on that later.
0*[1-9][0-9]*(\.[0-9]+)?
This matches any positive integer, or any decimal number of 1 or more. So our 2.0 would be matched, but 0.25 would not. The 0* at the start handles leading zeros, so 005 == 5.
|
The pipe character is an "or-bar" in this context. For purposes of evaluation of this expression, it has lower precedence than everything else, so it effectively joins two complete regular expressions together. Parentheses are used to group multiple expressions separated by or-bars.
And the second part:
0+\.[0-9]*[1-9][0-9]*
This matches any number that starts with one or more 0 characters (replace + with * to match zero or more zeros, e.g. .25), followed by a period, followed by a string of digits that includes at least one that is not a 0. So this matches everything above 0 and below 1.
)$
And finally, we close the parentheses and anchor the regex to the end of the line with the dollar sign, just as the caret anchors to the beginning of the line.
Of course, if you let your programming language evaluate something numerically rather than try to match it against a regular expression, you'll save headaches and CPU.
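As a rough illustration, the ERE above can be tried in Python next to the numeric alternative. The function names regex_positive and numeric_positive are assumptions made for this sketch, and float() accepts more input formats than the regex does (exponents, surrounding whitespace, a leading sign), so the two are not strictly equivalent.

```python
import re

POSITIVE = re.compile(r"^(0*[1-9][0-9]*(\.[0-9]+)?|0+\.[0-9]*[1-9][0-9]*)$")

def regex_positive(text):
    # True when the text is a decimal number greater than 0 in the
    # formats the expression above was built for.
    return POSITIVE.match(text) is not None

def numeric_positive(text):
    # The alternative: let the language parse the number and compare it.
    try:
        return float(text) > 0
    except ValueError:
        return False

for sample in ["12", "005", "2.0", "2.5", "0.25", "0", "0.0", ".25", "-3"]:
    print(sample, regex_positive(sample), numeric_positive(sample))
```

The two functions disagree on .25: the regex requires at least one 0 before the period, as noted above, while float() parses it.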
What about this: ^[1-9][0-9]*$
Another solution for integer:
^[1-9]\d*$
\d equivalent to [0-9]
Code:
^([0-9]*[1-9][0-9]*(\.[0-9]+)?|[0]+\.[0-9]*[1-9][0-9]*)$
Example: http://regexr.com/3anf5
Reference: https://social.msdn.microsoft.com/Forums/en-US/17089c0f-f9cb-437a-9667-ba8329681624/regular-expression-number-greater-than-0?forum=regexp
I think the best solution is to add the + sign between the two brackets of regex expression:
^[1-9]+[0-9]*$
If you only want non-negative integers, try:
^\d+$
I tried this one, and it worked for me for all decimal/integer numbers greater than zero:
Allows white space: ^\s*(?=.*[1-9])\d*(?:\.\d{1,2})?\s*$
No white space: ^(?=.*[1-9])\d*(?:\.\d{1,2})?$
Reference: Regex greater than zero with 2 decimal places
there you go:
MatchCollection myMatches = Regex.Matches(yourstring, @"[1-9][0-9]*");
on submit:
if(myMatches.Count > 0)
{
//do whatever you want
}
You can use the below expression:
(^\d*\.?\d*[1-9]+\d*$)|(^[1-9]+\.?\d*$)
Valid entries: 1 1. 1.1 1.0 all positive real numbers
Invalid entries: all negative real numbers and 0 and 0.0
Simplified only for 2 decimal places.
^\s*(?=.*[1-9])\d*(?:\.\d{1,2})?\s*$
Ref: https://www.regextester.com/94470
The simple answer is: ^[1-9][0-9]*$
I think this would perfectly work :
([1-9][0-9]*(\.[0-9]*[1-9])?|0\.[0-9]*[1-9])
Valid:
1
1.2
1.02
0.1
0.02
Not valid :
0
01
01.2
1.10
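The valid and invalid lists above can be replayed in Python. This sketch assumes the pattern is meant to cover the whole input, so it uses fullmatch; the expression as written has no anchors and would otherwise also match inside longer text.

```python
import re

PATTERN = re.compile(r"([1-9][0-9]*(\.[0-9]*[1-9])?|0\.[0-9]*[1-9])")

valid = ["1", "1.2", "1.02", "0.1", "0.02"]
invalid = ["0", "01", "01.2", "1.10"]

for text in valid + invalid:
    # fullmatch: the entire text must fit the pattern (an assumption here).
    print(text, PATTERN.fullmatch(text) is not None)
```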
[1-9]\.\d{1,2}|0\.((0?[1-9])|([1-9]0?)){1,2}\b
Very simple answer to this use this: \d*