Match letter followed by specific numeric range

Match letter followed by specific numeric range - regex

I am writing a regular expression in which the string can be of 2-3 characters.
The first character has to be a Alphabet between A and H (capitals). This character has to be followed by a number between 1 and 12.
I wrote
[A-H]{1}[1-12]{1,2}
This is fine when I keyin A12 but not when I keyin A6
Please suggest.

You can't specify a range of digits like that because it is implemented as a range between characters, so [1-12] is equivalent to [12], which would only match either a 1 or a 2. Instead, try the following:
[A-H](?:1[012]|[1-9])
Here is an explanation:
[A-H] # one letter from A to H
(?: # start non-capturing group
1[012] # 1 followed by 0, 1, or 2 (10, 11, 12)
| # OR
[1-9] # one digit from 1 to 9
) # end non-capturing group
Note that the {1} after [A-H] in your original regex is unnecessary, [A-H]{1} and [A-H] are equivalent.
You may want to consider adding anchors to the regex, otherwise you would also get a partial match on a string like A20. If you are trying to match an entire string then you should use the following:
\A[A-H](?:1[012]|[1-9])\z
If it is within a larger text you could use word boundaries instead:
\b[A-H](?:1[012]|[1-9])\b

Here you go:
^[A-H]([1-9]|1[0-2])$
No need to for the {1} in your question.
The regex is anchored with ^ and $ meaning it can can be the only thing on your line.
It will not match A60 for example

Related

regex in python/ansible

I am new bee to regex, I have an example string : account-device-v2-2-3-63-21900
and using this regular expression [1-9]-[0-9]-[0-9]*
I am getting output as 1-2-3
but my intention is to match/extract pattern 2-3-63
Meaning to get digits with hyphens after v2 (or v1 etc), I don't need last digit part (21000 or any other number)
Any suggestions please?

You want to get 1 or more digit except 0, dash, 1 or more digit, dash, 1 or more digit from account-device-v2-2-3-63-21900 or account-device-v1-2-3-63-21900?
Use v[12]-([1-9]+?-[0-9]+?-[0-9]+?)- and get first group.
Demo: https://regex101.com/r/hMLGsK/1

The pattern [1-9]-[0-9]-[0-9]* matches 2-2-3 because your pattern does not match the v and a digit part and this is the first part it can match.
Note that [0-9]* Matches optional digits, so 2-2- could also be a match.
Using a capture group to get the value:
\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)
\bv[1-9][0-9]*- Match v1 or also possibly v20 etc..
( Capture group 1
[1-9][0-9]* Match a digit starting at 1
-[0-9]+-[0-9]+ 2 parts matching - and 1 or more digits starting from 0
) Close group 1
Regex demo

How can I limit the total length of 2 adjacent strings in Regular Expression?

Example word: name.surname#exm.gov.xx.en
I want to limit the name + surname's total length to 12.
Ex: If name's length is 5 then the surname's length cannot bigger than 7.
My regex is here: ([a-z|çöşiğü]{0,12}.[a-z|çöşiğü]{0,12}){0,12}#exm.gov.xx.en
Thx in advance

If there should be a single dot present which should not be at the start or right before the #, you could assert 13 characters followed by an #
^(?=[a-zçöşğü.]{13}#)[a-zçöşğü]+\.[a-zçöşğü]+#exm\.gov\.xx\.en$
In parts
^ Start of string
(?= Positive lookahead, assert what is on the right is
[a-zçöşğü.]{13}# Match 13 times any of the listed followed by an #
) Close lookahead
[a-zçöşğü]+\.[a-zçöşğü]+ Match 2 times any of the listed with a dot inbetween
#exm\.gov\.xx\.en Match #exm.gov.xx.en
$ End of string
Regex demo
Note that I have omitted the pipe | from the character class as it would match it literally instead of meaning OR. If you meant to use it as a char, you could add it back. I also have remove the i as that will be matched by a-z

How to write that the pattern should be repeated?

I have a line of pattern:
double1, +double2,-double3.
For single double value pattern is :
[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)
How to make it for triple value?
Such as:
1.1, 0, -0
0, -123, 33
Not valid for:
""
1,123
123,123,123,123

You can use a slightly simpler pattern:
^(?:(?:^[+-]?|, ?[+-]?)\d+(?:\.\d+)?){3}$
Matches only triple occurences as you specified in your edit.
You can try it here.
As correctly pointed out by The Fourth Bird in his comments below, if you wish to match entries such as .9, where no digits precede the full stop you can use:
^(?:(?:^[+-]?|, ?[+-]?)(?:\d+(?:\.\d+)?|\.\d+)){3}$
You can check this pattern here.

The double part ([.][0-9]*)? is optional which will match 0 or 1 times.
To match it triple times, you could match a double using [-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+) which will match an optional + or - followed by an alternation that will match either a digit followed by an optional part that matches a dot and one or more digits or a dot followed by one or more digits.
Repeat that pattern 2 times using a quantifier {2} preceded by a comma and zero or more times a whitespace character \s*.
Add anchors to assert the start ^ and the end $ of the string and you could make use of a non capturing group (?: if you only want to check if it is a match and not refer to the groups anymore.
^[-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)(?:,\s*[-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)){2}$

How to use regular expression to use as few groups as possible to match as long string as possible

For example, this is the regular expression
([a]{2,3})
This is the string
aaaa // 1 match "(aaa)a" but I want "(aa)(aa)"
aaaaa // 2 match "(aaa)(aa)"
aaaaaa // 2 match "(aaa)(aaa)"
However, if I change the regular expression
([a]{2,3}?)
Then the results are
aaaa // 2 match "(aa)(aa)"
aaaaa // 2 match "(aa)(aa)a" but I want "(aaa)(aa)"
aaaaaa // 3 match "(aa)(aa)(aa)" but I want "(aaa)(aaa)"
My question is that is it possible to use as few groups as possible to match as long string as possible?

How about something like this:
(a{3}(?!a(?:[^a]|$))|a{2})
This looks for either the character a three times (not followed by a single a and a different character) or the character a two times.
Breakdown:
( # Start of the capturing group.
a{3} # Matches the character 'a' exactly three times.
(?! # Start of a negative Lookahead.
a # Matches the character 'a' literally.
(?: # Start of the non-capturing group.
[^a] # Matches any character except for 'a'.
| # Alternation (OR).
$ # Asserts position at the end of the line/string.
) # End of the non-capturing group.
) # End of the negative Lookahead.
| # Alternation (OR).
a{2} # Matches the character 'a' exactly two times.
) # End of the capturing group.
Here's a demo.
Note that if you don't need the capturing group, you can actually use the whole match instead by converting the capturing group into a non-capturing one:
(?:a{3}(?!a(?:[^a]|$))|a{2})
Which would look like this.

Try this Regex:
^(?:(a{3})*|(a{2,3})*)$
Click for Demo
Explanation:
^ - asserts the start of the line
(?:(a{3})*|(a{2,3})*) - a non-capturing group containing 2 sub-sequences separated by OR operator
(a{3})* - The first subsequence tries to match 3 occurrences of a. The * at the end allows this subsequence to match 0 or 3 or 6 or 9.... occurrences of a before the end of the line
| - OR
(a{2,3})* - matches 2 to 3 occurrences of a, as many as possible. The * at the end would repeat it 0+ times before the end of the line
-$ - asserts the end of the line

Try this short regex:
a{2,3}(?!a([^a]|$))
Demo
How it's made:
I started with this simple regex: a{2}a?. It looks for 2 consecutive a's that may be followed by another a. If the 2 a's are followed by another a, it matches all three a's.
This worked for most cases:
However, it failed in cases like:
So now, I knew I had to modify my regex in such a way that it would match the third a only if the third a is not followed by a([^a]|$). So now, my regex looked like a{2}a?(?!a([^a]|$)), and it worked for all cases. Then I just simplified it to a{2,3}(?!a([^a]|$)).
That's it.
EDIT
If you want the capturing behavior, then add parenthesis around the regex, like:
(a{2,3}(?!a([^a]|$)))

How would I find values in a file, but only on lines that don't start with #?

I've got a document that looks something like this:
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
(..and so on..)
In other words, it starts off with a few comment lines, then the rest of the document is just a bunch of numbers separated by spaces.
I'm trying to write a regex that gets all digits on all lines that don't start with #, but I can't seem to get it.
I've read over answers such as
Regular Expressions: Is there an AND operator?
Regex: Find a character anywhere in a document but only on lines that begin with a specific word
and pawed through sites such as http://regular-expressions.info, but I still can't get an expression that works (the best I can get is a lengthy version of ^[^#].*
So how can I match digits (or text, or whatever) in a string, but only on lines that don't start with a certain character?

Your regex ^[^#].* uses a negated character class which matches not a # from the start of the string ^ and after that matches any character zero or more times.
This would for example also match t test
What you might do is use an alternation to match a whole line ^#.*$ that starts with a # or capture in a group one or more digits (\d+)
Your digits are captured group 1. You could change the (\d+) to for example a character class ([\w+.]+) to match more than only digits.
(?:^#.*$|(\d+))
Details
(?: Non capturing group
^#.*$ Match from the start of the line ^ a # followed by any character zero or more times .* until the end of the string $
| Or
(\d+) capture one or more digits in a group
) Close non capturing group

I think a way simpler method would be to replace the lines with "" first with this regex:
^#.*
And then you can just match all the numbers with this:
-?\d+ (-? is for negative)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Match letter followed by specific numeric range - regex

Here you go: ^[A-H]([1-9]|1[0-2])$ No need to for the {1} in your question. The regex is anchored with ^ and $ meaning it can can be the only thing on your line. It will not match A60 for example

Related

regex in python/ansible

How can I limit the total length of 2 adjacent strings in Regular Expression?

How to write that the pattern should be repeated?

How to use regular expression to use as few groups as possible to match as long string as possible

How would I find values in a file, but only on lines that don't start with #?

Categories

Resources