What does the regular expression ^(\d{1,2})$ mean? [duplicate] - regex

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 8 years ago.
I am trying to understand what the regular expression ^(\d{1,2})$ stands for in google sheets. A quick look around the regex sites and in tools left me confused. Can anybody please help?

^ Asserts position at start of the string
( Denotes the start of a capturing group
\d Numerical digit, 0, 1, 2, ... 9. Etc.
{1,2} one to two times.
) You guessed it - Closes the group.
$ Assert position at end of the string
Regular expression visualization:

^ - start of a line.
(\d{1,2}) - captures upto two digits(ie; one or two digits).
$ - End of the line.

It means at least one at most two digits \d{1,2}, no other characters at the beginning ^ or the end $. Parenthesis essentially picks the string in it i.e. what ever the digits are

^ matches the start of the line
The parens can be ignored for now..
\d{1, 2} means one or two digits
$ is the end of the line.
The parens, if you need them, can be used to retrieve the digit(s) that were found in the regex.

Related

Regex to match specific string + optional space + 8 digits [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I need a regular expression to validate strings with the prefix 'CON' followed by an optional space followed by 8 digits.
I've tried various expressions, I got tangled up and now I'm lost.
^(CON+s\?d{8})$
\bCON\b\S?D{8}
Syntax is off a bit
^(CON\s?\d{8})
( starts a capturing group
CON is exactly matched
\s matches any white space character and the ? makes it optional
\d{8} matches 8 digits
) ends the capturing group
You were pretty well off to start, Hope this helps :)
keeping in mind If there is no space, then there shouldn't be 8 more digits
^CON(\ \d{8})?
If the string you are looking for can be part of a larger string (note that in this case it may be preceded or followed by anything, even other digits):
CON\s?\d{8}
If the string must match in full, use ^$ to designate that:
^CON\s?\d{8}$
You can add variations to it, if say you want it to begin/end with a word boundary - use \bto indicate that. If you want it to end in a non-digit, use \D+ at the end, instead of $.
Finally, if you want the string to end with an EOL or a non-digit, you may use an expression like this:
CON\s?\d{8}(\D+|$) or the same with a non-capturing group: CON\s?\d{8}(?:\D+|$)

Regex to match given amount of characters in undefined order [duplicate]

This question already has answers here:
Regex to match exactly n occurrences of letters and m occurrences of digits
(3 answers)
Closed 4 years ago.
I am looking for a regex that matches the following:
2 times the character 'a' and 3 times the character 'b'.
Additionally, the characters do not have to be subsequent, meaning that not only 'aabbb' and 'bbaaa' should be allowed, but also 'ababb', 'abbab' and so forth.
By the sound of it this should be an easy task, but atm I just can't wrap my head around it. Redirection to a good read is appreciated.
You need to use positive lookaheads. This is the same as the password validation problem described here.
Edit:
A positive lookahed will allow you to check a pattern against the string without changing where the next part of the regex matches. This means that you can test multiple regex patterns at the current position of the string and for the regex to match all the positive lookaheads will have to match.
In your case you are looking for 2 a' and 3 b's so the regex to match exactly 2 a's anywhere in the string is /^[^a]*a[^a]*a[^a]*$/ and for 3 b's is /^[^b]*b[^b]*b[^b]*b[^b]*$/ we now need to combine these so that we can match both together as follows /^(?=[^a]*a[^a]*a[^a]*$)(?=[^b]*b[^b]*b[^b]*b[^b]*$).*$/. This will start at the beginning of the string with the ^ anchor, then look for exactly 2 a's then the end of the string. Then because that was a positive lookahead the (?= ... ) the position for the next part of the pattern to match at in the string wont move so we are still at the start of the string and now match exactly 3 b's. As this is a positive lookahead we are still at the beginning of the string but now know that we have 2 a's and 3'b in the string so we match the whole of the string with .*$.

Regex quantifier not restricting match [duplicate]

This question already has an answer here:
Restricting character length in a regular expression
(1 answer)
Closed 4 years ago.
I would like to match 1 or more capital letters, [A-Z]+ followed by 0 or more numbers, [0-9]* but the entire string needs to be less than or equal to 8 characters in total.
No matter what regex I come up with the total length seems to be ignored. Here is what I've tried.
^[A-Z]+[0-9]*{1,8}$ //Range ignored, will not work on regex101.com but will on rubular.com/
^([A-Z]+[0-9]*){1,8}$ //Range ignored
^(([A-Z]+[0-9]*){1,8})$ //Range ignored
Is this not possible in regex? Do I just need to do the range check in the language I'm writing in? That's fine but I thought it would be cleaner to keep in all in regex syntax. Thanks
The behaviour is expected. When you write the following pattern:
^([A-Z]+[0-9]*){1,8}$
The {1,8} quantifier is telling the regex to repeat the previous pattern, therefore the capturing group in this case, between one to eight times. Due to the greedyness of your operators, you will match and capture indefinitely.
You need to use a lookahead to obtain the desired behaviour:
^(?=.{1,8}$)[A-Z]+[0-9]*$
^ Assert beginning of string.
(?=.{1,8}$) Ensure that the string that follows is between one and eight characters in length.
[A-Z]+[0-9]*$ Match any upper case letters, one or more, and any digits, zero or more.
$ Asserts position end of string.
See working demo here.
The regex ^([A-Z]+[0-9]*){1,8}$ would match [A-Z]+[0-9]* 1 - 8 times. That would match for example a repetition of 8 times A1A1A1A1A1A1A1A1 but not a repetition of 9 times A1A1A1A1A1A1A1A1A1
You might use a positive lookahead (?=[A-Z0-9]{1,8}$) to assert the length of the string:
^(?=[A-Z0-9]{1,8}$)[A-Z]+[0-9]*$
That would match
^ From the start of the string
(?=[A-Z0-9]{1,8}$) Positive lookahead to assert that what follows matches any of the characters in the character class [A-Z0-9] 1 - 8 times and assert the end of the string.
[A-Z]+[0-9]*$ Match one or more times an uppercase character followed by zero or more times a digit and assert the end of the string. $

How to do regexp replace/search [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
I am following some instructions for data upload. I can't figure out what the following two points mean. Does anyone have any idea?
Regexp search/replace
search: 201([0-9])([0-9])([0-9])([0-9][0-9]) ([0-9])
replace:201\1\2\3\4 \5
Regexp search/replace
replace 20110401 with whatever year month day that is being fixed
^(.{462})
\120110401
Any decent regex tutorial will help.
() wrap groups that can be referenced later with \#. For example, \2 references the token matched by the second pair of parentheses.
[0-9] means any character between 0-9 inclusive.
^ is the left anchor (i.e., start of string or new line), and .{462} means any character, 462 times.

noncapturing group explanation within a positive lookahead [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
Does this regular expression mean that at least one of the following that isn't a-z:
(?=.*(?:[a-z]))
It's part of the following expression:
/^(?=[A-Za-z0-9\'\s\d\.]{2,50}$)(?=.*(?:[a-z]))[a-zA-Z0-9]+[A-Za-z0-9\'\s\.]+$/m
No, (?=.*(?:[a-z])) means that there could be whatever but must finish with a lowercase letter.
This regex means:
/^(?=[A-Za-z0-9\'\s\d\.]{2,50}$)(?=.*(?:[a-z]))[a-zA-Z0-9]+[A-Za-z0-9\'\s\.]+$/m
Match the line that starts with 2 to 50 alphanumeric, single quote, spaces or a dot, and then follows with lower case letter, and continues with alphanumerics and must ends followed by alphanumerics, spaces, single quote or dot.
Here you can see a better graphical approach for your regex:
Actually, this can be improved as:
/^(?=[A-Za-z\d'\s.]{2,50}$)(?=.*[a-z])[a-zA-Z\d]+[A-Za-z\d'\s.]+$/m