I hope you can help me. I'm learning how to write regexes and I'm stuck on this problem:
Write a regex that accepts strings of 0s and 1s that have a 1 in the fifth position counting from the right.
E.g. 10000 is accepted because it has a 1 in position 5 from the right; 010000, 0010000 and 1110000 are accepted too.
I was thinking of something like: (0+1)*+1(0+1)(0+1)(0+1)(0+1)(0+1)
You can use this regex:
1[01]{4}$
If you want to match full input then use:
^[01]*1[01]{4}$
Here 1[01]{4}$ matches a 1 followed by exactly four more digits (each 0 or 1) before the end of the string, which puts that 1 in the fifth position from the right.
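As a quick sanity check, here is a minimal Python sketch of the full-input pattern. The function names are made up for illustration, and `by_index` is just a plain indexing check to compare the regex against, not part of the answer.

```python
import re

# Same pattern as above; fullmatch() plays the role of ^ and $.
FIFTH_FROM_RIGHT = re.compile(r"[01]*1[01]{4}")


def has_one_fifth_from_right(s: str) -> bool:
    """True if s consists of 0/1 only and has a 1 in position 5 from the right."""
    return FIFTH_FROM_RIGHT.fullmatch(s) is not None


def by_index(s: str) -> bool:
    """Reference check without a regex (hypothetical helper for comparison)."""
    return len(s) >= 5 and set(s) <= {"0", "1"} and s[-5] == "1"


if __name__ == "__main__":
    for sample in ["10000", "010000", "0010000", "1110000", "00000", "1000"]:
        print(sample, has_one_fifth_from_right(sample))
```

Both functions should agree on every binary string, which is an easy way to convince yourself the pattern does what the exercise asks.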
Well - think of it this way. It needs to be as many 1s and 0s as you please, followed by a 1, followed by 4 more ones or zeroes.
So:
my_regex =
"^[01]*" + // Zero or more leading characters, each a zero or a one
"1" + // Followed by a one
"[01]{4}$" // Followed by exactly four characters, each a zero or a one, then the end of the string
Your (0 + 1) syntax looks foreign to me. I'm using character classes to specify the [01] things but you could use (0|1) in their place, which is what your attempt looks more like.
The full thing, together, is ^[01]*1[01]{4}$
Hello and thank you in advance,
I buy items that have a variety of human written listings on auction sites and forums.
Oftentimes the quantity is clear to a person, but extracting it has been a real challenge. I'm using Google Sheets and REGEXEXTRACT().
I consider myself to be an intermediate regex user, but this has me stumped, so I need an expert.
Here are a few examples, my desired return, and what I'm getting.
| Listing | Desired Return | Actual Return |
|---|---|---|
| Red 1996 Corvette 2x - Matchbox | 2 | 2 |
| 3 x SmartCar, broken 2nd door | 3 | 3 |
| 2nd edition Kindle (x3) | 3 | 3 |
| **1x** 2008 financial crash notice | 1 | 1 |
| Collectors Edition Beannie Baby, item 204/343 | 1 | 4 |
| (6) Nissan window motors (1995-1998 ONLY) | 6 | N/A |
| White chevy F150, 1996 | 1 | 6 |
| Green bowl, cracked (stored in room 2A5) | 1 | 5 |
As I thought this through, I realized I can put some reasonable limits on the logic, but writing the regex is harder.
The quantity will only be a single digit, 1-9 (perhaps reject all numbers > 9?).
It may be preceded or followed by an X or x, with or without a space.
It may be next to a special character like *, ( ) or -.
Ordinal notation (1st, 2nd, 3rd, ... 9th) should be ignored.
If a number is mixed into a word, like 2A3, the whole thing should be ignored.
Obviously most descriptions don't state a quantity, so if nothing or zero is returned, that's fine.
I have something that feels close, and does a reasonable job:
[^a-wy-zA-WY-Z0-9]*([1-4]){1}([^a-wA-w0-9]|$)
It doesn't return anything with the returns marked of 1*, and that's fine. It breaks on the last two, and I've struggled for too long!
Thanks in advance!
You can use
=IFNA(INT(REGEXEXTRACT(REGEXREPLACE(LOWER(A27), "\d{2,}|(x\d)|(\dx)|[^\W\d]+\d\w*|\d+[^\W\d]\w*", "$1$2"), "(\d)")), 1)
Here,
REGEXREPLACE(LOWER(A27), "\d{2,}|(x\d)|(\dx)|[^\W\d]+\d\w*|\d+[^\W\d]\w*", "$1$2") finds and removes chunks of two or more digits, and chunks that mix a digit with at least one letter, but keeps the sequences where a digit is preceded or followed by x.
REGEXEXTRACT(..., "(\d)") extracts the first digit left after the replacement.
=IFNA(INT(...), 1) either casts the found digit to an integer or, if there was no match, inserts 1 into the column.
The long regex breaks down as follows:
\d{2,} - two or more digits
| - or
(x\d) - Group 1 ($1): x and a digit
| - or
(\dx) - Group 2 ($2): a digit and x
| - or
[^\W\d]+\d\w* - one or more word chars except digits, a digit and then zero or more word chars
| - or
\d+[^\W\d]\w* - one or more digits, a letter or underscore, and then zero or more word chars.
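For readers who want to try the same replace-then-extract idea outside Google Sheets, here is a rough Python sketch. It assumes Python's `re` module behaves closely enough to the Sheets regex engine for this pattern; the function name `quantity` is made up for illustration.

```python
import re

# Same alternation as in the formula: noise is removed, "x<digit>" and
# "<digit>x" are kept through groups 1 and 2.
CLEANUP = re.compile(r"\d{2,}|(x\d)|(\dx)|[^\W\d]+\d\w*|\d+[^\W\d]\w*")


def quantity(listing: str) -> int:
    """Return the single-digit quantity in a listing, defaulting to 1."""
    # Groups that did not take part in a match are substituted as "".
    cleaned = CLEANUP.sub(r"\1\2", listing.lower())
    first_digit = re.search(r"\d", cleaned)
    return int(first_digit.group()) if first_digit else 1


if __name__ == "__main__":
    for text in [
        "Red 1996 Corvette 2x - Matchbox",
        "(6) Nissan window motors (1995-1998 ONLY)",
        "Green bowl, cracked (stored in room 2A5)",
    ]:
        print(text, "->", quantity(text))
```

The default of 1 mirrors the IFNA(..., 1) fallback in the formula.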
I've got a question that asks for a regex matching non-empty strings that start and end with two 1's. The alphabet is {0,1}. It needs to match strings such as 11, 111, 1111 and 11000...11..0011. How many 1's and 0's come in between doesn't matter, as long as the string ends with two 1's. So far I've got this:
^(1{2,4}|(11[01]*0[01]*11))$
But my answer wasn't accepted because it needs to be simplified. I tried something along these lines: 11(0|1)*(11)*, but this allows any number of 11's at the end, so it's not accepted. I just can't figure it out. Can someone please push me in the right direction?
One possibility: ^(?=11)[01]*11$. The lookahead (?=11) asserts that the string starts with 11 without consuming any characters, which handles the edge cases (11, 111) nicely. Then [01]*11$ matches the whole string, which may contain only 1s and 0s and must end with 11.
Or, based on your existing approach, ^(1{2,3}|11[01]*11)$ should work as well.
The simplest one:
11((0*1)*1)*
Explanation (e denotes the empty string):
Each iteration of the outer group ends with a 1, and whenever the inner group (0*1) consumes 0s it also ends with a 1, so the string always ends with 11.
11 # match: the starred outer group is empty
111 # match: 11(e1) -> 111
1111 # match: 11(e1)(e1) -> 1111
11011 # match: 11((01)1)
11001 # non-match: after 11(001) there is no second 1 for the outer group
110111011 # match: 11((01)1)(e1)((01)1)
^(1{2,4}|11[01]+11)$
^(1{2,3}|11[01]*11)$
^(11|111|11[01]*11)$
Your last answer is very close. (^11[01]*11$|^11+$) would do.
I added the ^11+$ alternative to cover the 11 and 111 cases. The expression on the left covers anything that starts with 11, optionally has some 0's and/or 1's in the middle, and then definitely has 11 again, which means the shortest string it matches is 1111. Hence the fix.
EDIT:
Sorry, I answered too fast. Take Psidom's answer; it's perfect.
Only 0 and 1?
And starts and ends with 11?
But also matching "11" or "111"?
Then this regex also does that:
^11(1|[01]*11)?$
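One way to compare the patterns suggested in this thread is to brute-force every short binary string and check each pattern against a plain "starts with 11 and ends with 11" test. The sketch below does that in Python; the dictionary labels, `expected` and `disagreements` are names invented for this illustration.

```python
import re
from itertools import product

# Patterns quoted in the answers above; fullmatch() makes the unanchored
# Kleene-star version comparable with the anchored ones.
CANDIDATES = {
    "lookahead": r"^(?=11)[01]*11$",
    "two alternatives": r"^(1{2,3}|11[01]*11)$",
    "kleene star": r"11((0*1)*1)*",
    "anchored alternatives": r"(^11[01]*11$|^11+$)",
    "optional tail": r"^11(1|[01]*11)?$",
}


def expected(s: str) -> bool:
    """Plain-Python statement of the requirement."""
    return s.startswith("11") and s.endswith("11")


def disagreements(max_len: int = 8) -> list:
    """List (pattern name, string) pairs where a pattern gets it wrong."""
    wrong = []
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            for name, pattern in CANDIDATES.items():
                if (re.fullmatch(pattern, s) is not None) != expected(s):
                    wrong.append((name, s))
    return wrong


if __name__ == "__main__":
    print(disagreements())
```

The same harness also shows why the pattern in the question falls short: ^(1{2,4}|(11[01]*0[01]*11))$ rejects 11111, which starts and ends with 11.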
I have a string of 8 separated hexadecimal numbers, such as:
3E%12%3%1F%3E%6%1%19
And I need to check whether the number 12 appears among the first 4 numbers.
I'm guessing this shouldn't be all that complex, but my searches turned up empty. Regular expressions are always a trouble for me, but I don't have access to anything else in this scenario. Any help would be appreciated.
^([^%]+%){0,3}12%
The idea is:
^ - from the start
[^%]+% - match multiple non % characters, followed by a % character
{0,3} - between 0 and 3 of those
12% - 12% after that
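The breakdown above can be tried out with a small Python sketch. The function names are hypothetical, and the split-based version is only included as a cross-check for cases where a regex is not the only tool available.

```python
import re

# Up to three "number%" chunks, then 12 followed by a separator.
FIRST_FOUR = re.compile(r"^([^%]+%){0,3}12%")


def in_first_four_regex(s: str) -> bool:
    """True if 12 is one of the first four %-separated numbers (regex)."""
    return FIRST_FOUR.search(s) is not None


def in_first_four_split(s: str) -> bool:
    """Same check by splitting on % (cross-check only)."""
    return "12" in s.split("%")[:4]


if __name__ == "__main__":
    for sample in ["3E%12%3%1F%3E%6%1%19", "3E%1%3%1F%12%6%1%19"]:
        print(sample, in_first_four_regex(sample), in_first_four_split(sample))
```

Note that the pattern expects a % after the 12, which always holds here because the input has 8 numbers.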
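Here you go (this relies on a variable-length lookbehind, which only some regex engines, such as .NET, support):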
^([^%]*%){4}(?<=.*12.*)
This also matches when 12 is part of a longer number, as in both of the following, if that is what is intended:
1%312%..
1%123%..
Decide whether %123% should match or not.
If the number 12 should stand on its own, then use
^([^%]*%){4}(?<=.*\b12\b.*)
I've got a small problem in SAS Content Categorization. I'm trying to extract two values, value 1 and value 2.
I use predicate_rule, so when I click on the matched string in the program I get
ARGUMENT 0 [val1]: 4
ARGUMENT 1 [val2]: 4
ARGUMENT 2 [valName]: Score
In this example 4 is just a sample value. My problem is that when the text reads 4+4 (no space between 4, + and 4), I can't get the second value WITHOUT the plus symbol, so I get this:
ARGUMENT 0 [val1]: 4
ARGUMENT 1 [val2]: +4
ARGUMENT 2 [valName]: Score
I only manage to get the value printed correctly if there is a space between the numbers and the plus symbol.
I have now created two regexes and two predicate_rules.
This one is for the first value (val1), called: Regex1
REGEX:[1-5]
This is for the second value (val2), called: Regex2
REGEX:\+[1-5]
I know that the plus symbol is printed because of Regex2, but I can't manage to get the second value without writing it this way.
In the main concept I have created two predicate_rules: one that should handle scores with spaces between the numbers and the plus symbol, and one that should handle scores with no spaces.
#With space
PREDICATE_RULE:(valName,val1,val2):(ORDDIST_4, "_valName{valName}", "_val1{Regex1}", "+", "_val2{Regex1}")
#Without space
PREDICATE_RULE:(valName,val1,val2):(ORDDIST_3, "_valName{valName}", "_val1{Regex1}", "_val2{Regex2}")
valName only contains terms that should be in distance of the arguments so it matches correctly.
Thanks in advance.
I think you can alter the second regex in the predicate_rule. Since a text pattern like 4+4 is the issue, you could use a positive lookbehind. A positive lookbehind requires that certain text precede the main expression without including that text in the match.
Patterns like the ones below can be handled by a positive lookbehind:
4+4
4 + 4
4 +4
4 4
Try the following regex for the second predicate_rule:
(?<=[\+ ])[\d]
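To see what the lookbehind does, here is a small sketch in Python's `re` module. It only illustrates the matching behaviour; whether SAS Content Categorization accepts lookbehind syntax is not confirmed here and would need to be checked in that product. The helper name `second_value` is made up.

```python
import re

# A digit that is immediately preceded by "+" or a space; the preceding
# character is required but is not part of the match.
SECOND_VALUE = re.compile(r"(?<=[\+ ])[\d]")


def second_value(text: str):
    """Return the first digit preceded by '+' or a space, or None."""
    match = SECOND_VALUE.search(text)
    return match.group() if match else None


if __name__ == "__main__":
    for sample in ["3+5", "3 + 5", "3 +5", "3 5"]:
        print(repr(sample), "->", second_value(sample))
```

In each sample the returned text is just the digit, without the plus sign or the space in front of it.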
Well I tried to sum it up in the title.
I need a regex to match numbers and commas, but not numbers starting with 0 unless it's 0,number.
My users enter hours in a field, so they have to be able to enter 0,3 hours, but they are not allowed to write 002 or 09.
I have this regex:
^[0-9]*\,?[0-9]+$
How can I extend it to disallow a leading 0 unless the 0 is followed by a comma?
Another one :)
^(0|[1-9]\d*(|,\d+)|0,\d+)$
This one should suit your needs:
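^(?:0,\d*[1-9]|[1-9]\d*)$
either 0,\d*[1-9]: a 0, followed by a comma, followed by zero or more digits, followed by one digit between 1 and 9
or [1-9]\d*: a digit between 1 and 9, followed by zero or more digits
The (?:...) group keeps both alternatives inside the ^ and $ anchors; without it, ^ applies only to the first alternative and $ only to the second.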
Matches:
0,3
0,03
3
30
Doesn't match:
0
0,0
0,30
03
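A pattern of this shape is sensitive to how | interacts with the anchors, so it is worth checking the lists above mechanically. The Python sketch below compares an ungrouped spelling with one wrapped in (?:...); the constant and function names are invented for the illustration.

```python
import re

# Without a group, ^ binds only to the first alternative and $ only to
# the second, so partial matches slip through.
UNGROUPED = re.compile(r"^0,\d*[1-9]|[1-9]\d*$")
GROUPED = re.compile(r"^(?:0,\d*[1-9]|[1-9]\d*)$")

SHOULD_MATCH = ["0,3", "0,03", "3", "30"]
SHOULD_NOT_MATCH = ["0", "0,0", "0,30", "03"]


def accepted(pattern: re.Pattern, samples: list) -> list:
    """Return the samples in which the pattern finds a match."""
    return [s for s in samples if pattern.search(s)]


if __name__ == "__main__":
    print("grouped accepts:  ", accepted(GROUPED, SHOULD_MATCH + SHOULD_NOT_MATCH))
    print("ungrouped accepts:", accepted(UNGROUPED, SHOULD_MATCH + SHOULD_NOT_MATCH))
```

With the group in place the two lists come out as described; without it, 0,30 and 03 are accepted through a partial match.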
You don't need to force everything into a single regex to do this.
It will be far clearer if you use multiple regexes, each one making a specific check.
if ( /^[0-9]+,[0-9]+$/ || /^[1-9][0-9]*$/ )
Here we are making two different checks. "Either this one matches, or the other one matches", and then you don't have to jam both conditions into one regex.
Let the expressive form of your host language be used, rather than trying to cram logic into a regex.
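The same "several small checks" idea might look like this in Python. This is only a sketch: the function name is hypothetical, and the first pattern is tightened to reject a leading zero before the comma (so 09,5 fails), which is an assumption about what the question intends rather than a copy of the check above.

```python
import re


def is_valid_hours(text: str) -> bool:
    """Accept '0,3', '12,5' or '7'; reject values such as '002' or '09'."""
    # Check 1: a comma number whose integer part is 0 or has no leading zero
    # (assumption: '09,5' should be rejected just like '09').
    is_comma_number = re.fullmatch(r"(0|[1-9][0-9]*),[0-9]+", text) is not None
    # Check 2: a whole number without a leading zero.
    is_whole_number = re.fullmatch(r"[1-9][0-9]*", text) is not None
    return is_comma_number or is_whole_number


if __name__ == "__main__":
    for sample in ["0,3", "12,5", "7", "002", "09"]:
        print(sample, is_valid_hours(sample))
```

Each check reads on its own, and a new rule can be added as another named boolean instead of another branch in one long pattern.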