How to evaluate this regular expression? - regex

I'm just learning regular expressions, so I just want to make sure my understanding is correct.
01* means 0 followed by 0 or more repetitions of 1.
1* + 01* means either 0 or more repetitions of 1 OR 0 followed by 0 or more repetitions of 1.
Am I right or is there something that I'm missing? Thanks.

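In most regex engines the + doesn't mean OR but "one or more of". (Formal-language textbooks often do write union as +, which is the notation your question uses.)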
So instead of 1* + 01* you would say:
1*|01*
which would mean either a (maybe zero length) string of ones, or a zero followed by (maybe a zero length) string of ones.
So it would match any of:
1
1111
0
011
But none of:
101
110
100001
001
00
The OR operator (vertical pipe) has a low precedence.
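The lists above can be checked with a regex engine. This is a minimal sketch using Python's `re` module, which is an assumption on my part since the question does not name an engine. `fullmatch` forces the whole string to match, so the low precedence of `|` cannot leave one branch unanchored.

```python
import re

PATTERN = re.compile(r"1*|01*")

def matches(s: str) -> bool:
    # fullmatch requires the entire string to match one of the two branches
    return PATTERN.fullmatch(s) is not None

should_match = ["1", "1111", "0", "011"]
should_fail = ["101", "110", "100001", "001", "00"]

for s in should_match + should_fail:
    print(f"{s!r}: {matches(s)}")
```

Note that the empty string also matches, because both branches allow zero repetitions of 1.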

This seems correct to me. (Even though I'm not a whiz at regular expressions myself.)
But here's a good tutorial you could check out.
This one I found useful as well.

Related

RegEx difference for answer

I've got a question that asks for a non-empty string that starts and ends with two 1's. The alphabet is {0,1}. It needs to match the string {11,111,1111,11000...11..0011} However many 1's and 0's in between doesn't matter as long as it ends with 2 1's. So far I've got this:
^(1{2,4}|(11[01]*0[01]*11))$
But my answer wasn't accepted because it needs to be simplified. I tried something along the lines of 11(0|1)*(11)*, but that allows any number of 11s at the end, so it's not accepted. I just can't figure it out; can someone please push me in the right direction?
One possibility is ^(?=11)[01]*11$. See demo. The lookahead asserts that the string starts with 11 without consuming any characters, which handles the edge cases (11, 111) nicely. The rest of the pattern, [01]*11$, then matches the whole string, which may contain only 0s and 1s and must end with 11.
Or based on your existing approach ^(1{2,3}|11[01]*11)$ should work as well. demo.
The simplest one:
11((0*1)*1)*
Explanation:
Whenever the inner group consumes 0s, it must end with a 1, and the outer group then adds one more 1. Below, e stands for the empty string.
11 # match: the Kleene star group is empty
111 # match: 11(e1) -> 111
1111 # match: 11(e1)(e1) -> 1111
11011 # match: 11((01)1)
11001 # non-match: 11(001) has no second 1 at the end
110111011 # match: 11((01)1)(e1)((01)1)
^(1{2,4}|11[01]+11)$
^(1{2,3}|11[01]*11)$
^(11|111|11[01]*11)$
Your last answer is very close. (^11[01]*11$|^11+$) would do.
I added the OR 11+ to cover the 11 and 111 cases, because the expression on the left matches anything that starts with 11, then has any number of 0s and 1s (possibly none), and then definitely has 11 again. This means the shortest string it matches is 1111. Hence the fix.
EDIT:
Sorry, I answered too fast. Take Psidom's answer; it's perfect.
Only 0 and 1?
And starts and ends with 11?
But also matching "11" or "111"?
Then this regex also does that:
^11(1|[01]*11)?$
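To compare the patterns proposed in this thread, one option is to brute-force every short binary string against a plain-code statement of the requirement. The sketch below assumes Python's `re` module; `reference` and `first_disagreement` are hypothetical helper names, and the length cap of 10 is arbitrary.

```python
import re
from itertools import product

def reference(s: str) -> bool:
    # Plain-code statement of the requirement: starts and ends with "11".
    return s.startswith("11") and s.endswith("11")

def first_disagreement(pattern: str, max_len: int = 10):
    """Return the shortest binary string on which pattern and reference differ, or None."""
    rx = re.compile(pattern)
    for n in range(1, max_len + 1):
        for chars in product("01", repeat=n):
            s = "".join(chars)
            if (rx.fullmatch(s) is not None) != reference(s):
                return s
    return None

CANDIDATES = [
    r"^(?=11)[01]*11$",
    r"^(1{2,3}|11[01]*11)$",
    r"11((0*1)*1)*",
    r"^(1{2,4}|11[01]+11)$",
    r"^(11|111|11[01]*11)$",
    r"(^11[01]*11$|^11+$)",
    r"^11(1|[01]*11)?$",
    r"^(1{2,4}|(11[01]*0[01]*11))$",  # the pattern from the question
]

for pattern in CANDIDATES:
    print(pattern, "->", first_disagreement(pattern))
```

Under these assumptions the seven answer patterns agree with the reference on every string tried, while the pattern from the question disagrees on 11111: it has more than four 1s and no 0, so neither branch accepts it.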

Regular expressions: contains at least two 0s but not consecutive 0s

Is the regular expression below the solution to this exercise? I found it on the internet, but I don't believe it is correct.
(1*011*(0+011*))*
According to the theory of Chapter 1 in the book "The handbook of computational linguistics and natural language processing", how could I solve this exercise?
I would like a regular expression that will satisfy the below regular language
L = {010,0101,0110,0101010,01011110,.....}
Here is another option:
^[^0]*[0]{1}([^0]+[0]{1}[^0]*)+$
You can go with:
^(?!.*00.*)(?=.*0.*0).*$
You can play with it here.
Explanation:
(?!.*00.*) the input can't have two consecutive 0s
(?=.*0.*0) the input has to contain at least two 0s
If you don't want to use lookarounds, use Maria's answer.
(1+01)* 0 (1+) 0 (1+10)*
This solves the problem
How about this:
0[^0]+0
Zero 0 followed by a character in the range "not zero" [^0] followed by zero 0.
The regexp you posted is erroneous: it contains a 0+ subpattern, which admits a sequence of one or more 0s. It can be corrected with this solution:
1*011*0(10?)*
or with + operator
1*01+0(10?)*
An explanation: first skip all the leading 1s with the 1* subpattern, which brings you to the first 0. Then skip at least one 1 and all the 1s following it (subpattern 1+) up to the second 0. At that point we have matched the minimum a string needs to be in your regular language. Everything after that is optional: we repeat, any number of times, a 1 with an optional trailing 0 (as in 10?), since otherwise there would be two consecutive 0s. You can check it in this demo, which lists all possible strings from 1 to 8 characters and whether each one matches.
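The same kind of exhaustive check the demo performs can be sketched locally. This assumes Python's `re` module and the alphabet {0,1}; `in_language` and `counterexamples` are hypothetical helper names, and the language is taken to be "at least two 0s, no two adjacent 0s" as in the question title.

```python
import re
from itertools import product

def in_language(s: str) -> bool:
    # At least two 0s, and no two 0s next to each other.
    return s.count("0") >= 2 and "00" not in s

def counterexamples(pattern: str, max_len: int = 8) -> list:
    """Binary strings (length 1..max_len) where the regex and in_language disagree."""
    rx = re.compile(pattern)
    bad = []
    for n in range(1, max_len + 1):
        for chars in product("01", repeat=n):
            s = "".join(chars)
            if (rx.fullmatch(s) is not None) != in_language(s):
                bad.append(s)
    return bad

for pattern in (
    r"1*011*0(10?)*",
    r"1*01+0(10?)*",
    r"^(?!.*00.*)(?=.*0.*0).*$",
    r"^[^0]*[0]{1}([^0]+[0]{1}[^0]*)+$",
):
    print(pattern, counterexamples(pattern)[:5])
```

An empty list means the pattern agrees with the plain-code definition on every string up to the length cap; it is evidence, not a proof, for longer inputs.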
If it's at least two 0s, then there's also the possibility of a 1 at the start.
So wouldn't that be 1* 0 1* 0 (1+01)*
But if it's according to the language given (with the two 0s at the beginning and end),
0 (1+01)* 0
1*01(1|01)*01*
I think this would work perfectly
Given language contains at least two zeroes but not consecutive zeroes.
The strings accepted by this language are L= {010,0101,0110,0101010,01011110,.....}
The matching regular expression is:
1*01*(10+101*)^+
Here ^+ represents at least one occurrence.
DFA for the above language is shown in this link:
DFA IMAGE

Regex to match numbers and commas, but not numbers starting with 0 unless it's 0,

Well I tried to sum it up in the title.
I need a reg ex to match numbers and commas, but not numbers starting with 0 unless it's 0,number
My users enter hours in a field, so they have to be able to enter 0,3 hours, but they are not allowed to write 002 or 09.
I have this reg ex
^[0-9]*\,?[0-9]+$
How can I extend it to disallow a leading 0 unless the 0 is followed by a comma?
Another one :)
^(0|[1-9]\d*(|,\d+)|0,\d+)$
This one should suit your needs:
^(?:0,\d*[1-9]|[1-9]\d*)$
either 0,\d*[1-9]: a 0, followed by a comma, followed by zero or more digits, followed by one digit between 1 and 9
or [1-9]\d*: a digit between 1 and 9, followed by zero or more digits
The group is needed so that ^ and $ apply to both alternatives.
Matches:
0,3
0,03
3
30
Doesn't match:
0
0,0
0,30
03
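The two lists above can be checked mechanically. The sketch below assumes Python's `re` module and compares a grouped form of the pattern with an ungrouped one, because `|` binds more loosely than `^` and `$`: without a group, each anchor applies to only one alternative.

```python
import re

GROUPED = re.compile(r"^(?:0,\d*[1-9]|[1-9]\d*)$")
UNGROUPED = re.compile(r"^0,\d*[1-9]|[1-9]\d*$")

expected_match = ["0,3", "0,03", "3", "30"]
expected_fail = ["0", "0,0", "0,30", "03"]

for s in expected_match + expected_fail:
    grouped = GROUPED.search(s) is not None
    ungrouped = UNGROUPED.search(s) is not None
    print(f"{s!r}: grouped={grouped} ungrouped={ungrouped}")
```

With the group, the results line up with the lists. Without it, 03 is accepted through the second alternative (the 3 at the end) and 0,30 through the first (the prefix 0,3).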
You don't need to force everything into a single regex to do this.
It will be far clearer if you use multiple regexes, each one making a specific check.
if ( /^(?:0|[1-9][0-9]*),[0-9]+$/ || /^[1-9][0-9]*$/ )
Here we are making two different checks. "Either this one matches, or the other one matches", and then you don't have to jam both conditions into one regex.
Let the expressive form of your host language be used, rather than trying to cram logic into a regex.
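The same idea carries over to other host languages. Here is a minimal Python sketch, with several assumptions of mine: the function name `is_valid_hours` is hypothetical, a bare 0 is rejected, and the integer part of a fractional value may not have leading zeros either.

```python
import re

# Whole hours without a leading zero: 3, 30, 120
WHOLE_HOURS = re.compile(r"[1-9][0-9]*")
# Fractional hours with a decimal comma: 0,3  1,5  12,25
FRACTIONAL = re.compile(r"(?:0|[1-9][0-9]*),[0-9]+")

def is_valid_hours(text: str) -> bool:
    """Each regex makes one specific check; the `or` combines them."""
    return bool(WHOLE_HOURS.fullmatch(text) or FRACTIONAL.fullmatch(text))

for sample in ["0,3", "3", "1,5", "002", "09", "0"]:
    print(sample, is_valid_hours(sample))
```

If a plain 0 should be accepted, adding a third check is easier to read than reworking either pattern.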

Violating RegExp: Every string that is smaller or equal "001700"

i have a unique challenge.
i want to create a google analytics filter for a custom variable that only returns a value if the given string is smaller or equal than '001700'. yeah, i know that a string can't be smaller, still i need to find a way to make this work.
oh, and if you ask: no there is no way to convert that string to a number (according to my knowledge - via a google analytics filter - and that is what i have to work with in this case).
so basically, i have
000000
000001
000002
000003
...
...
999998
999999
and i need a regular expression that matches
001700
001699
001698
...
...
000001
000000
but does not match
001701
001702
...
...
999998
999999
sub question a) is it possible? (as i have learned, everything is possible with regExp if you are clever and/or masochistic enough)
sub question b) how to do it?
thx very much
You can do:
^00(1700|1[0-6][0-9]{2}|0[0-9]{3})$
See it
yes you can do
see this article
Eg:
alert('your numerical string'.replace(/\d+/g, function(match) {
  return parseInt(match, 10) <= 1700 ? '*' : match;
}));
JavaScript calls our function, passing the match into our match argument. Then, we return either the asterisk (if the number matched is at most 1700) or the match itself (i.e. no replacement should take place).
Can be done with RegEx:
/00(1([0-6][0-9]{2}|700)|0[0-9]{3})/
Explanation:
00 followed by
1 followed by 0 to 6 and any 2 digits = 1000 - 1699
or
1700
or
0 followed by any 3 digits = 0000 - 0999
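Since there are only a million six-digit strings, both patterns from this thread can be checked exhaustively against the numeric comparison. This sketch assumes Python's `re` module rather than the Google Analytics engine, and `mismatches` is a hypothetical helper name. It uses `fullmatch`, so the second pattern is tested as if it were anchored; in an engine that does partial matching you would likely want to add ^ and $ to it.

```python
import re

PATTERNS = {
    "alternation": r"^00(1700|1[0-6][0-9]{2}|0[0-9]{3})$",
    "factored": r"00(1([0-6][0-9]{2}|700)|0[0-9]{3})",
}

def mismatches(pattern: str) -> list:
    """Six-digit strings where the regex disagrees with the numeric test n <= 1700."""
    rx = re.compile(pattern)
    bad = []
    for n in range(1_000_000):
        s = f"{n:06d}"  # zero-padded to six digits, e.g. 42 -> "000042"
        if (rx.fullmatch(s) is not None) != (n <= 1700):
            bad.append(s)
    return bad

for name, pattern in PATTERNS.items():
    print(name, mismatches(pattern)[:5])
```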

RegEx for value Range from 1 - 365

What is the RegEx for value Range from 1- 365
Try this:
^(?:[1-9]\d?|[12]\d{2}|3[0-5]\d|36[0-5])$
The start anchor ^ and end anchor $ are there to match the whole input and not just part of it.
(?: ) is for non-capturing grouping.
| is for alternation.
[1-9]\d? matches 1 to 99
[12]\d{2} matches 100 to 299
3[0-5]\d matches 300 to 359
36[0-5] matches 360 to 365
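The breakdown above can be confirmed by listing which numbers the pattern accepts. A minimal sketch, assuming Python's `re` module; `accepted_numbers` is a hypothetical helper and the upper bound of 2000 is arbitrary.

```python
import re

DAY = re.compile(r"^(?:[1-9]\d?|[12]\d{2}|3[0-5]\d|36[0-5])$")

def accepted_numbers(limit: int = 2000) -> list:
    # Render each integer without padding and keep the ones the regex accepts.
    return [n for n in range(limit) if DAY.match(str(n))]

nums = accepted_numbers()
print(nums[0], nums[-1], len(nums))
print(bool(DAY.match("05")))  # leading zeros are not accepted
```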
You would have to list the possible combinations 1-9, 10-99, 100-299, 300-359, 360-365:
^([1-9]\d?|[12]\d\d|3[0-5]\d|36[0-5])$
Not really a good fit for regex, but if you insist:
^(?:36[0-5]|3[0-5][0-9]|[12][0-9][0-9]|[1-9][0-9]|[1-9])$
This is not allowing leading zeroes. If you wish to allow those, let me know.
The expression above can be shortened a little to
^(?:36[0-5]|3[0-5]\d|[12]\d{2}|[1-9]\d?)$
but I find the first solution to be a bit more readable. YMMV.
A general solution for matching the numbers from 1 to XYZ
^(?!0)(?!\d{4,}$)(?![X+1-9]\d{2}$)(?!X[Y+1-9]\d$)(?!XY[Z+1-9]$)\d+$
Notes:
If any of X, Y or Z is 9, then X+1 etc. would be 10. In that case, leave out the part of the regex that would need the 10.
This can be extended to numbers with more or fewer digits following the same principles.
It does not allow left-padded zeros.
Applied to your case:
^(?!0)(?!\d{4,}$)(?![4-9]\d{2}$)(?!3[7-9]\d$)(?!36[6-9]$)\d+$
Let's explain:
(?!0) - does not start with 0
(?!\d{4,}$) - does not have 4 or more digits, i.e. it's not between 1000 and infinity
(?![4-9]\d{2}$) - it's not between 400 and 999
(?!3[7-9]\d$) - it's not between 370 and 399
(?!36[6-9]$) - it's not between 366 and 369
Test it.
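The recipe can also be generated for an arbitrary upper bound. The following is only a sketch of that idea, assuming Python's `re` module; `range_pattern` is a hypothetical name, it assumes the limit is at least 1, and it uses `\d{n+1,}` in the length lookahead so that every longer string is rejected, not just those with exactly one extra digit.

```python
import re

def range_pattern(limit: int) -> str:
    """Build a lookahead-based pattern for the decimal strings 1..limit (no leading zeros)."""
    digits = str(limit)
    n = len(digits)
    # Reject a leading zero and anything with more digits than the limit.
    parts = ["^(?!0)", rf"(?!\d{{{n + 1},}}$)"]
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 9:
            continue  # no larger digit exists at this position, so nothing to exclude
        rest = n - i - 1
        tail = rf"\d{{{rest}}}" if rest else ""
        # Same prefix as the limit, a larger digit here, then any remaining digits.
        parts.append(rf"(?!{digits[:i]}[{d + 1}-9]{tail}$)")
    parts.append(r"\d+$")
    return "".join(parts)

print(range_pattern(365))
rx = re.compile(range_pattern(365))
print([n for n in (0, 1, 365, 366, 1000, 10000) if rx.match(str(n))])
```

Generating the pattern keeps the 9-digit special case in one place instead of relying on hand editing.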
^(?:36[0-5]|3[0-5][0-9]|[12][0-9][0-9]|[1-9][0-9]?)$
^(?:3(6[0-5]|[0-5]\d)|[12]\d\d|[1-9]\d|[1-9])$
Or, if inputs like 05 are guaranteed not to occur:
^(?:3(6[0-5]|[0-5]\d)|[12]?\d?\d)$
P.S.: Anyway no need of regex here. Use ToInt(), <=, >=
It really depends on your regex engine since they may not all be PCRE-style. I usually work to the lowest common denominator unless I know it will be targeting a minimum engine.
To that end, I'd just use something like:
^([1-9]|[1-9][0-9]|[1-2][0-9]{2}|3[0-5][0-9]|36[0-5])$
This will take care of (in order):
1-9.
10-99.
100-299.
300-359.
360-365.
However, unless you're absolutely required to use just a regex, I wouldn't. It's like trying to kill a fly with a thermo-nuclear warhead.
Just use the much simpler ^[0-9]{1,3}$ then use whatever language features you have to convert it to an integer and check it's between 1 and 365 inclusive:
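import re

def is_valid_day_other_than_leap_year(s):
    # Cheap shape check first: one to three ASCII digits.
    if not re.fullmatch(r"[0-9]{1,3}", s):
        return False
    # Then let ordinary integer comparison do the range check.
    n = int(s)
    return 1 <= n <= 365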
Your code will be more readable that way and I tend to rethink the use of regular expressions the second they start looking like they may be hard to read six months down the track.
This worked for me, but note that it rejects many valid values such as 9, 47 and 199:
^[1-3][0-6]?[0-5]?$