I'm trying to create a regex that matches any string containing only the characters '0' and '1', as long as it does not contain the specific sequence "00" or "010".
My idea was something like this:
1*(011+)*
But there is a problem: a string that ends with a 0, like 1110, should be valid. With my regex, every 0 has to be followed by two or more ones, and I can't figure out how to make this specific exception.
I need a regex that forces every 0 to be followed by two or more ones OR by the end of the string. This would allow sequences ending in 0 and 01, in addition to the obvious "ends with 1" case where, as in "111111", the unauthorized sequences are not present.
How can I "cut off short" a condition in a regex, allowing it to either go on according to my rules, or to end right there?
You can fix your regex like this:
1*(011+)*(01?)?
Add an (optional) group at the end that matches incomplete zero groups, i.e. 0 and 01.
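As a quick sanity check, here is a minimal Python sketch that compares the fixed regex against a plain substring test on every binary string up to length 10. The helper names (is_valid, all_binary_strings) and the length limit are assumptions made for illustration.

```python
import re
from itertools import product

# The fixed pattern from above; fullmatch() anchors it at both ends.
PATTERN = re.compile(r"1*(011+)*(01?)?")

def is_valid(s: str) -> bool:
    """Reference check: neither '00' nor '010' occurs in s."""
    return "00" not in s and "010" not in s

def all_binary_strings(max_len: int):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# Collect every string where the regex and the reference check disagree.
mismatches = [s for s in all_binary_strings(10)
              if bool(PATTERN.fullmatch(s)) != is_valid(s)]
print(mismatches)
```

An empty list means the regex and the substring test agree on all strings that were tried.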
I came across the regular expression not containing 101 as follows:
0∗1∗0∗+(1+00+000)∗+(0+1+0+)∗
I was unable to understand how the author came up with this regex, so I thought of a string that does not contain 101:
01000100
It seems that the string above is not matched by the regex, but I was unsure. So I tried translating it to an equivalent PCRE regex on regex101.com, but failed there too (as can be seen, my regex does not even match a string containing a single 1).
What's wrong with my translation? Is the regex above indeed correct? If not, what would the correct regex be?
Here is a somewhat shorter expression: ^0*(1|00+)*0*$
https://www.regex101.com/r/gG3wP5/1
Explanation:
(1|00+)* we can mix zeroes and ones as long as zeroes occur in groups
^0*...0*$ we can have as many zeroes as we want in prefix/suffix
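The short expression can be checked by brute force. This is a minimal Python sketch; the variable names and the length limit of 10 are assumptions for illustration.

```python
import re
from itertools import product

# The short expression from above.
short = re.compile(r"^0*(1|00+)*0*$")

# Compare the regex with a plain substring test on all short binary strings.
mismatches = []
for n in range(11):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        if bool(short.match(s)) != ("101" not in s):
            mismatches.append(s)
print(mismatches)
```

An empty list means the regex accepts exactly the strings without 101, at least up to the tested length.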
Direct translation of the original regexp would be like
^(0*1*0*|(1|00|000)*|(0+1+0+)*)$
Update
The original looks like an artificially complicated version of the regexp above:
(1|00|000)* is the same as (1|00+)*
it is almost the solution, but it does not match the string 0 or strings of the form 01.. and ..10
0*1*0* does not match strings with 101 inside, and it does match 0 and some strings of the form 01.. and ..10
we still need to match those strings of the form 01.. and ..10 that have 0 and 1 mixed inside, e.g. 01001.. or ..10010
(0+1+0+)* matches some of the remaining cases, but some valid strings are still unmatched
e.g. 10010 is one of the shortest strings that none of the three alternatives matches.
So this solution is overly complicated and not complete.
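The gap can be made visible with a small Python sketch that lists every string without 101, up to length 5, that the direct translation fails to match. The name missed and the length limit are assumptions for illustration.

```python
import re
from itertools import product

# Direct translation of the original expression (three alternatives).
direct = re.compile(r"(?:0*1*0*|(1|00|000)*|(0+1+0+)*)")

# Strings that do not contain 101 but are still rejected by the translation.
missed = []
for n in range(6):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        if "101" not in s and not direct.fullmatch(s):
            missed.append(s)
print(missed)
```

The printed list includes 10010, and none of its entries is shorter than five characters.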
Read the explanation in the right-hand tab on regex101; it tells you what your regex does. I think you misunderstood what the list operator does: inside a list operator ( [ ), other characters such as ( are no longer metacharacters, so the expression [(0*1*0*)[1(00)(000)] is equivalent to [01()*[], which matches 0 or 1 or ( or ) or * or [.
The correct translation of the regular expression 0∗1∗0∗+(1+00+000)∗+(0+1+0+)∗
will be as follows:
^((?:0*1*0*)|(?:1|00|000)*|(?:0+1+0+)*)$
What your regex [(0*1*0*)[1(00)(000)]*(0+1+0+)*] does:
[(0*1*0*)[1(00)(000)]* -> matches any of the characters 0, 1, (, ), *, [ zero or more times, followed by
(0+1+0+)* --> matches the pattern 0+1+0+ zero or more times, followed by
] --> matches the character ]
So your expression is equivalent to
[([)*01]*(0+1+0+)*], which is not a regular expression that matches strings that do not contain 101.
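The effect is easy to reproduce. The sketch below uses Python's re module rather than PCRE, on the assumption that both parse this particular pattern the same way: the first bracket pair becomes a character class and the final ] is a literal.

```python
import re

# The mistranslated pattern from the question, unchanged.
mistranslated = re.compile(r"[(0*1*0*)[1(00)(000)]*(0+1+0+)*]")

# A lone "1" fails because the pattern demands a literal "]" at the end.
print(mistranslated.fullmatch("1"))

# Strings made of the class characters followed by "]" are accepted.
print(mistranslated.fullmatch("1]"))
print(mistranslated.fullmatch("(*[)]"))
```

The first call prints None, while the other two return match objects, which is consistent with the breakdown above.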
0* 1* ( (00+000)* 1*)* (ε+0)
I think this expression covers all cases because:
any number other than 1 can be broken into constituent 2's and 3's, i.e. any number n ≠ 1 can be written as n = 2*i + 3*j. So there can be any number of 0's between two consecutive 1's except exactly one 0. Hence, 101 cannot be obtained.
ε+0 covers strings ending in a single 0.
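The claim about 2's and 3's can be checked directly. This small Python sketch writes the textbook (00+000)* as (?:00|000)* and tests runs of zeros of length 0 to 12; the upper bound is an arbitrary assumption.

```python
import re

# (00+000)* in textbook notation, written in Python regex syntax.
zero_run = re.compile(r"(?:00|000)*")

# Lengths n for which a run of n zeros can be built from 2's and 3's.
lengths = [n for n in range(13) if zero_run.fullmatch("0" * n)]
print(lengths)  # every length from 0 to 12 except 1
```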
The RE for the language not containing 101 as a substring can also be written as (0*1*00)*.0*.1*.0*
This may be a smaller one than what you are using. Try to make use of this.
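Here is a minimal Python sketch that tests this expression by brute force. The dots in the textbook notation denote concatenation, so they are dropped in the Python pattern; the length limit of 10 is an assumption.

```python
import re
from itertools import product

# (0*1*00)*.0*.1*.0* with the concatenation dots removed.
candidate = re.compile(r"(0*1*00)*0*1*0*")

# Compare with a plain substring test on all short binary strings.
mismatches = []
for n in range(11):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        if bool(candidate.fullmatch(s)) != ("101" not in s):
            mismatches.append(s)
print(mismatches)
```

An empty list means the expression matches exactly the strings without 101 among those tested.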
The regular expression I got is (0+10)*1*. (Looks simple :P)
I just considered all the cases to make this.
If you take two consecutive 1's, you have to end with continuous 1's:
case 1: 11111111111111...
case 2: 0000000011111111111111... (once we take two 1's we can't accept 0's, so the only option is to continue with 1's)
If you take only a single 1 followed by a 0, there is no issue, and after a single 1 we can have any number of 0's:
case 3: 00000000 10100100010000100000100000 1111111111
=> (0*+10*)*1*
Final answer: (0+10)*1*.
Thanks for your patience.
Hi, I am working on a regex. A correct response should NOT allow a number given to the tenths only, as in RESPONSE = "925.0", nor should it allow trailing zeros after the hundredths place, as in RESPONSE = "925.000". The only correct responses are: 925, 0925, 0925., 925., 925.00, 00925
I worked on it and finally came up with this:
"^-?(0)*(\d*(\.(00))?\d+.|(\d){1,3}(,(\d){3})*(\.(00))?)$"
It works for three-digit numbers, but it doesn't allow 38400.00.
I am not quite certain whether the decimal places can be any digit or if they have to be zero. If the former, then this should do the trick:
^-?\d{1,3}(,?\d{3})*(\.(\d{2})?)?$
If the latter, then this:
^-?\d{1,3}(,?\d{3})*(\.(00)?)?$
The entire match starting with the decimal point is optional, and the two decimal places in that match are optional as well.
UPDATE I just realized that it appears you need to accept commas in the response as well - I assume for thousands, millions, etc.
UPDATE #2 per OP's comment
^-?(\d+|\d{1,3}(,\d{3})*)(\.(00)?)?$
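A minimal Python sketch of how the last pattern behaves on the responses from the question. The comma-separated value 38,400.00 is an extra test case added here as an assumption; it does not appear in the question.

```python
import re

# Pattern from UPDATE #2 above.
pattern = re.compile(r"^-?(\d+|\d{1,3}(,\d{3})*)(\.(00)?)?$")

# Responses the question lists as correct, plus 38400.00 and a comma variant.
accepted = ["925", "0925", "0925.", "925.", "925.00", "00925",
            "38400.00", "38,400.00"]
# Responses the question lists as incorrect.
rejected = ["925.0", "925.000"]

for s in accepted + rejected:
    print(f"{s!r:>12} -> {bool(pattern.match(s))}")
```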
UPDATE #3 Added link to regex101 for explanation of this regular expression.
Have a try with:
^-?\d{1,3}(?:,?\d{3})*(?:\.(?:00)?)?$
I think your problem is that you're trying to match it in chunks of three, with commas separating, but 38400.00 doesn't have commas.
Try this:
^-?\d+(\.?(\d{2})?)$
The - indicates the character, -. With the ? after, it says that it may or may not apply. This allows negative numbers, so if you only want positive numbers matched, delete the first two characters.
\d represents every digit. The + after says that there can be as many as you want, as long as there's at least one.
Then there's a \., which is just a dot in the number. The ? does the same as before. Since you seem to allow trailing periods, I assumed you wanted the period to be considered separately from the following digits.
The outer () encloses the optional period together with the inner group (\d{2}), which is two characters that match \d (two digits) and which may occur 0 or 1 times, as dictated by the ?. This allows people to have either no digits after the period or exactly two, but nothing else.
The ^ at the beginning specifies that the match has to start at the beginning of the line, and the $ at the end specifies that it has to end at the end of the line. If you are matching each line of a multi-line input, remember to enable the multiline (m) flag so the anchors apply per line.
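To illustrate the effect of the multiline flag, here is a small Python sketch. The sample text, with one response per line, is an assumption made up for the example.

```python
import re

# Same pattern, compiled once with and once without the multiline flag.
per_line = re.compile(r"^-?\d+(\.?(\d{2})?)$", re.MULTILINE)
whole_text = re.compile(r"^-?\d+(\.?(\d{2})?)$")

# Hypothetical input: one response per line.
text = "925\n925.0\n925.00\n0925.\n925.000"

# With re.MULTILINE, ^ and $ anchor at every line boundary.
matches = [m.group(0) for m in per_line.finditer(text)]
print(matches)  # ['925', '925.00', '0925.']

# Without the flag, ^ and $ refer to the whole text, so nothing matches here.
print(whole_text.search(text))  # None
```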
Disclaimer: I've not done much regex work before, so I could well be totally off. If it doesn't work, let me know.
Couldn't you do this without the ?'s
^[0-9,]+(\.){0,1}(\d{2}){0,1}$
improved: ^\d+[0-9,]*(\.){0,1}(\d{2}){0,1}$
Edit:
Broken down a bit as requested
Old one:
[0-9,]+
1 or more digits/commas (this would have accepted a lone ',' as valid), so the improved version starts with:
\d+
for starts with 1 or more digits
[0-9,]*
0 or more digits/commas
followed by
(\.){0,1}
0 or 1 decimal point
Followed by
(\d{2}){0,1}
0 or 1 of (exactly 2 digits)
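The difference between the old and the improved version shows up on a lone comma. This is a minimal Python sketch; the sample inputs are assumptions chosen for illustration.

```python
import re

old = re.compile(r"^[0-9,]+(\.){0,1}(\d{2}){0,1}$")
improved = re.compile(r"^\d+[0-9,]*(\.){0,1}(\d{2}){0,1}$")

# Print whether each sample is accepted by the old and the improved pattern.
for s in [",", "38,400.00", "925.", "925.0"]:
    print(f"{s!r:>12}  old={bool(old.match(s))}  improved={bool(improved.match(s))}")
```

Only the lone comma is treated differently: the old pattern accepts it and the improved one rejects it.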
I am trying to replace text in a KiCad file using Notepad++. I am having trouble using wildcards.
The string I am trying to find is similar to this one:
(fp_text reference J2 (at -8.30084 1.4004 270)
J2 is a wildcard, but it will not be changed, and it can be anywhere from 2 to 5 characters long.
-8.30084 can be any number that I want to change to zero
1.4004 can be any number that I want to change to zero
270 will not change, no matter what the number is.
In the end, I want the string to be
(fp_text reference J2 (at 0 0 270)
If I understand correctly, you're looking for a regex that matches that string and replaces the first and second (but not the third) number with 0. Without knowing which characters are valid for the token you have as J2, I'll assume that it's any non-space character.
You can reference a capture group within your replacement string, so you can capture the parts you want to preserve. (In the example below I also capture other unknown parts of the string, but that's not really necessary.)
The regex should be something like:
(\S)\s\(at ([-+]?\d*\.?\d+) ([-+]?\d*\.?\d+) ([-+]?\d*\.?\d+)\)
And your replacement will be something like:
\1 (at 0 0 \4)
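The same search and replace can be tried outside Notepad++. Below is a minimal sketch using Python's re.sub, on the assumption that the pattern and the \1 and \4 backreferences behave the same way for this input in both engines.

```python
import re

# Search pattern and replacement exactly as given above.
pattern = re.compile(
    r"(\S)\s\(at ([-+]?\d*\.?\d+) ([-+]?\d*\.?\d+) ([-+]?\d*\.?\d+)\)"
)
replacement = r"\1 (at 0 0 \4)"

line = "(fp_text reference J2 (at -8.30084 1.4004 270)"
result = pattern.sub(replacement, line)
print(result)  # (fp_text reference J2 (at 0 0 270)
```

Group 1 only captures the last character of the reference, but everything before it lies outside the match and is therefore left untouched.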
There's a long natural number that can be grouped to smaller numbers by the 0 (zero) delimiter.
Example: 4201100370880
This would divide to Group1: 42, Group2: 110, Group3: 370880
There are 3 groups; groups never start with 0 and are at least 1 char long. Also, the last group is taken "as is", meaning it is not terminated by a trailing 0.
This is what I came up with, but it only works for certain inputs (like 420110037880):
(\d+)0([1-9][0-9]{1,2})0([1-9]\d+)
This shows I'm attempting to declare the 2nd group's length to min2 max3, but I'm thinking the correct solution should not care about it. If the delimiter was non-numeric I could probably tackle it, but I'm stumped.
All right, factoring in comment information, try splitting on a regex (this may vary based on what language you're using - .split(/.../) in JavaScript, preg_split in PHP, etc.)
The regex you want to split on is: 0(?!0). This translates to "a zero that is not followed by a zero". I believe this will solve your splitting problem.
If your language allows a limit parameter (PHP does), set it to 3. If not, you will need to do something like this (JavaScript):
result = input.split(/0(?!0)/);
result = result.slice(0,2).concat(result.slice(2).join("0"));
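For a language with a limit parameter, the split is a single call. This is a minimal Python sketch; note that re.split counts the number of splits (maxsplit=2) rather than the number of resulting pieces.

```python
import re

number = "4201100370880"

# Split on a zero that is not followed by another zero, at most twice,
# so the third group keeps any later zeros "as is".
groups = re.split(r"0(?!0)", number, maxsplit=2)
print(groups)  # ['42', '110', '370880']
```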
The following one should suit your needs:
^(.*?)0(?!0)(.*?)0(?!0)(.*)$
The following regex works:
(\d+?)0(?!0) with the g modifier
Demo: http://regex101.com/r/rS4dE5
For only three matches, you can do:
(\d+?)0(?!0)(\d+?)0(?!0)(.*)
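Both variants can be tried on the example number. This is a minimal Python sketch, where re.findall plays the role of the g modifier.

```python
import re

number = "4201100370880"

# Global matching: every lazily matched group that ends at a delimiter zero.
all_groups = re.findall(r"(\d+?)0(?!0)", number)
print(all_groups)  # ['42', '110', '37', '88']

# Three-group version: the last group takes the rest of the string.
m = re.match(r"(\d+?)0(?!0)(\d+?)0(?!0)(.*)", number)
print(m.groups())  # ('42', '110', '370880')
```

The global version keeps splitting at every delimiter zero, so only the three-group version reproduces the grouping asked for in the question.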
Better explained with examples:
1. HHH
2. HHHH
3. HHHBBHHH
4. HHHBH
5. BB
6. HHBH
I need to come up with a regexp that matches only 3 H's or a multiple of 3 H's (so 6, 9, 12, ... H's are ok as well) and 5 H's are not ok. And if possible I don't want to use Perl regexps.
So for the input above the regexp would match (1), (3) and (6) only.
I'm just starting with regular expressions here so I don't exactly know how I'm supposed to approach this.
edit
Just to clear something up: an H can only be in one group of 3 H's. The group of 3 H's might be HHH or HHBH.
That's why example 2 above is not a match: the last H is not in a group of 3 H's. And you can't put the last 3 H's in a group, because the middle 2 H's are already inside an earlier group.
You can use the following regular expression:
^([^H]*H[^H]*H[^H]*H[^H]*)+$
It matches any string which contains in total 3 H or any multiple of 3. In between there might be any other character.
Explanation:
^ begin of string
( start of group
[^H]*H any string of characters (or none) not including 'H' plus a single 'H'
[^H]*H any string of characters (or none) not including 'H' plus a single 'H'
[^H]*H any string of characters (or none) not including 'H' plus a single 'H'
[^H]* any string of characters (or none) which is not 'H'
)+ end of group; the group must occur one or more times
$ end of string
By repeating the subpattern [^H]*H three times we make sure that there are indeed 3 H included, [^H]* allows any separating characters.
Note: use either egrep or run grep with additional argument -E.
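The answer targets grep -E, but the pattern can also be checked with a small Python sketch (Python is an assumption here, chosen only for illustration) against the six examples from the question.

```python
import re

pattern = re.compile(r"^([^H]*H[^H]*H[^H]*H[^H]*)+$")

# The six example lines from the question, numbered from 1.
examples = ["HHH", "HHHH", "HHHBBHHH", "HHHBH", "BB", "HHBH"]

matched = [i for i, s in enumerate(examples, start=1) if pattern.match(s)]
print(matched)  # [1, 3, 6]
```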
Use this to match a multiple of 3 H's:
(H{3})+
Here is a complete regex for your examples:
^(H{3})+B*(H{3})*$
Edit: It looks like you need to count non-consecutive H's. In that case:
^(([^H]*H){3})+[^H]*$
That should match any string with a multiple of 3 H's.
Given the requirement that H's can be arbitrarily interleaved with non-H's, but that the total number of H's must be a non-zero multiple of 3 (so XXX, containing no H's, is not a match), then the total regular expression is anything but trivial. This is not a beginner's regular expression.
I'm going to assume that the dialect of regular expression treats {} and () as metacharacters for counting and grouping, and includes + for one-or-more. If you're using a regular expression system that has a different requirement (\{\}, for example) then adjust accordingly.
You need the regex to match the whole string, so there are no stray H's allowed. So, it must start with ^ and end with $. You need to allow an arbitrary number of non-H's at front and back. The H's may be separated by an arbitrary number of non-H's. That leads to:
^([^H]*H[^H]*H[^H]*H)+[^H]*$
Ouch; that is hard to read! It says the line must consist of 1 or more (+) groups of an arbitrary number of non-H's followed by an H, an arbitrary number of non-H's, another H, an arbitrary number of non-H's and a third H; all of which can be followed by an arbitrary number of non-H's.
Using the {} for counting:
^(([^H]*H){3})+[^H]*$
That's still hard to read. Note that my description said "arbitrary number of non-H's at front and back", but I only use the [^H]* at the back; that's because the repeating pattern allows an arbitrary number of non-H's at the front anyway so there's no need to repeat that fragment.
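A minimal Python sketch to confirm that the long and the counted form accept the same strings, and that both agree with simply counting H's. The alphabet {H, B} and the length limit of 9 are assumptions for illustration.

```python
import re
from itertools import product

long_form = re.compile(r"^([^H]*H[^H]*H[^H]*H)+[^H]*$")
counted_form = re.compile(r"^(([^H]*H){3})+[^H]*$")

def expected(s: str) -> bool:
    """Reference check: a non-zero multiple of 3 H's."""
    n = s.count("H")
    return n > 0 and n % 3 == 0

# Collect every string on which either regex disagrees with the count.
mismatches = []
for length in range(10):
    for chars in product("HB", repeat=length):
        s = "".join(chars)
        results = (bool(long_form.match(s)), bool(counted_form.match(s)))
        if results != (expected(s), expected(s)):
            mismatches.append(s)
print(mismatches)
```

An empty list means both patterns accept exactly the strings whose number of H's is a non-zero multiple of 3, at least for the strings tried.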