Regex to get specific numbers with three digits - regex

I'm trying to get a string to match this pattern:
C006,
C007,
C008,
C009,
C010,
C011
I have this:
C00[6-9]|1[0-1]
It works for "C006" to "C009", but for "C010" or "C011" the regex matches only the number 10 or 11.
Tested on http://rubular.com/r/gFKJ2eTyrz
Can anyone help me with this?
Thanks.

Because | has the lowest precedence, your regex is read as
C00[6-9]
or
1[0-1]
https://regex101.com/r/CsjWzT/3
You need to group the alternative patterns. Try:
C0(0[6-9]|1[0-1])
Demo: https://regex101.com/r/CsjWzT/1
If you want it exact use anchors:
^C0(0[6-9]|1[0-1])$
https://regex101.com/r/CsjWzT/2
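
To see the difference the grouping makes, here is a minimal Python sketch. The sample list `codes` and the helper `first_match` are made up for illustration; only the two patterns come from the question and the answer above.

```python
import re

# Pattern from the question: parsed as "C00[6-9]" OR "1[0-1]".
ungrouped = re.compile(r"C00[6-9]|1[0-1]")
# Grouped and anchored pattern from the answer.
grouped = re.compile(r"^C0(0[6-9]|1[0-1])$")

# Hypothetical sample inputs.
codes = ["C005", "C006", "C009", "C010", "C011", "C012"]


def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


# The ungrouped pattern only picks up "10" / "11" inside C010 / C011.
ungrouped_hits = {code: first_match(ungrouped, code) for code in codes}
# The grouped pattern accepts exactly C006 through C011.
accepted = [code for code in codes if grouped.match(code)]

print(ungrouped_hits)
print(accepted)
```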

Try this: C(\d){3}
C -> matches the char 'C'
\d -> matches any digit
{3} -> exactly three digits
Note that this accepts any code from C000 to C999, not only C006 to C011.

Related

Python Regex - How to extract the third portion?

My input has the format (xxx)yyyy(zz)(eee)fff, where x, y, z, e and f are all digits. The fff part is optional.
Input: x = (123)4567(89)(660)
Expected output: only the eee part, i.e. the number inside the third "()", which is 660 in my example.
I am able to achieve this so far:
re.search("\((\d*)\)", x).group()
Output: (123)
Expected: (660)
I am surely missing something fundamental. Please advise.
Edit 1: Just added fff to the input data format.
You could find all the matches enclosed in round brackets () with findall and print the third one:
import re
n = "(123)4567(89)(660)999"
r = re.findall(r"\(\d*\)", n)  # raw string avoids invalid-escape warnings
print(r[2])
Output:
(660)
The (eee) part is identical to the (xxx) part in your regex. If you don't provide an anchor, or some sequencing requirement, then an unanchored search will match the first thing it finds, which is (xxx) in your case.
If you know the (eee) always appears at the end of the string, you could append an "at-end" anchor ($) to force the match at the end. Or perhaps you could append a following character, like a space or comma or something.
Otherwise, you might do well to match the other parts of the pattern and not capture them:
pattern = r'[0-9()]{13}\((\d{3})\)'
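
As a rough sketch of both suggestions, the snippet below tries an end-anchored pattern and the fixed-width pattern above on the sample input, with and without the optional fff part. The `end_anchored` pattern is my own assumption about what "append an anchor" could look like once trailing digits are allowed; it is not taken from the answer.

```python
import re

# Sample inputs: without and with the optional trailing fff digits.
samples = ["(123)4567(89)(660)", "(123)4567(89)(660)999"]

# Assumed variant: the last "(digits)" group, optionally followed by
# trailing digits, anchored at the end of the string.
end_anchored = re.compile(r"\((\d+)\)\d*$")

# Pattern from the answer: skip a 13-character prefix of digits and
# parentheses, then capture the three digits inside the next "()".
fixed_prefix = re.compile(r"[0-9()]{13}\((\d{3})\)")

results = [
    (end_anchored.search(s).group(1), fixed_prefix.search(s).group(1))
    for s in samples
]
print(results)
```

The fixed-width version depends on the prefix always being exactly 13 characters long, so the end-anchored form is probably the more forgiving of the two if the field widths can vary.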
If you want to get the third group of numbers in brackets, you need to skip the first two groups which you can do with a repeating non-capturing group which looks for a set of digits enclosed in () followed by some number of non ( characters:
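import re
x = '(123)4567(89)(660)'
# Skip two "(digits)" groups plus any following non-"(" characters, then capture the third
print(re.search(r"(?:\(\d+\)[^(]*){2}(\(\d+\))", x).group(1))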
Output:
(660)

1 to 5 of the same groups in REGEX

For a string such as:
abzyxcabkmqfcmkcde
Notice that there are string patterns between ab and c (zyx and kmqf, shown in bold in the original). To capture the first string pattern:
ab([a-z]{3,5})c
Is it possible to match both of the groups from the sample string? Actually, there should be 1 to 5 groups.
Note: python style regex.
You can verify that a given string conforms to the 1-5 repetitions of ab([a-z]{3,5})c using this regex
(?:ab([a-z]{3,5})c){1,5}
or this one if there are characters expected between the groups
(?:ab([a-z]{3,5})c.*?){1,5}
However, you will only be able to extract the last matching group from that string, not any of the previous ones. To get the earlier ones you need to use hsz's approach.
Just match all results - i.e. with g flag:
/ab([a-z]{3,5})c/g
or some method like in Python:
re.findall(pattern, string, flags=0)
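
A small Python sketch of the two behaviours described above, using the sample string from the question. The variable names are only illustrative.

```python
import re

text = "abzyxcabkmqfcmkcde"

# findall returns the captured group of every non-overlapping match.
all_groups = re.findall(r"ab([a-z]{3,5})c", text)

# A capturing group inside a repeated group keeps only its last iteration.
repeated = re.search(r"(?:ab([a-z]{3,5})c){1,5}", text)
last_group = repeated.group(1)

print(all_groups)
print(last_group)
```

If the 1 to 5 limit matters, one option is to check `1 <= len(all_groups) <= 5` after the call to `findall`.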

How can I match this string '-,-,-,9,-'?

I was told to validate a string like this: -,-,-,9,-
It is separated by commas and contains exactly one digit (0-9); all the other parts are -
Some examples:
9,-,-,-,-
-,-,-,-,9
-,-,2,-,-
How can I match this? And what concepts should I learn in regex?
Update
Sorry, I forgot the count: the string can only contain 5 parts, so its length can only be 9. This means a string like the one below should not pass:
-,-,9,-,-,-
and of course, it should have only one number.
^(?=\D*\d\D*$)[0-9-](?:,[0-9-]){4}$
You can try this. See demo.
https://regex101.com/r/nM7nT5/5
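
Here is a quick Python check of the pattern above against the examples from the question. The rejected samples other than -,-,9,-,-,- are my own additions for illustration.

```python
import re

# Lookahead: exactly one digit in the whole string.
# Body: exactly five single-character parts separated by commas.
pattern = re.compile(r"^(?=\D*\d\D*$)[0-9-](?:,[0-9-]){4}$")

should_pass = ["-,-,-,9,-", "9,-,-,-,-", "-,-,-,-,9", "-,-,2,-,-"]
should_fail = [
    "-,-,9,-,-,-",  # six parts (from the question)
    "-,-,-,-,-",    # no digit (assumed invalid)
    "9,9,-,-,-",    # two digits (assumed invalid)
    "-,-,9",        # only three parts (assumed invalid)
]

passed = [s for s in should_pass + should_fail if pattern.match(s)]
print(passed)
```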
This ensures that the string has at least one comma and exactly one digit.
^(?:\d(?:,-)+|-(?:,-)*,\d(?:,-)*)$
OR
^(?=\D*(?:^|,)\d(?:,|$)\D*$)[\d-](?:,[\d-])+$

Regex to match numeric pattern

I am trying to match specific numeric pattern from the below list.
My requirement is to match only report.20150325 to report.20150331. Please help.
report.20150319
report.20150320
report.20150321
report.20150322
report.20150323
report.20150324
report.20150325
report.20150326
report.20150327
report.20150328
report.20150329
report.20150330
report.20150331
It's very simple: to match 25 to 31, use the regex 2[5-9]|3[01]
Here is the complete regex:
(report\.201503(2[5-9]|3[01]))
Explanation of 2[5-9]|3[01]
2 followed by a single character in the range between 5 and 9
OR
3 followed by 0 or 1
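
As a minimal Python sketch, the complete regex can be applied to the file names from the question. The list is rebuilt here with a comprehension, and `fullmatch` is used so that nothing may follow the date; both are my own choices for illustration.

```python
import re

# report.20150319 ... report.20150331, as listed in the question.
names = [f"report.201503{day}" for day in range(19, 32)]

# 2[5-9] covers days 25-29, 3[01] covers days 30-31.
pattern = re.compile(r"report\.201503(2[5-9]|3[01])")

matched = [name for name in names if pattern.fullmatch(name)]
print(matched)
```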
You could use something like this: /^report\.201503(2[5-9]|3[01])$/gm
It should match the reports you are after.
A regexp match isn't always the right approach. Here you are asking to match a string followed by a number so use a string and numeric comparisons:
$ awk -F'.' '$1=="report" && ($2>=20150325) && ($2<=20150331)' file
report.20150325
report.20150326
report.20150327
report.20150328
report.20150329
report.20150330
report.20150331
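
The same string-plus-numeric comparison can be sketched in Python. The function name `in_range` and its default bounds are hypothetical, chosen to mirror the awk one-liner.

```python
# File names as listed in the question.
lines = [f"report.201503{day}" for day in range(19, 32)]


def in_range(line, low=20150325, high=20150331):
    """Keep 'report.<number>' lines whose number lies within [low, high]."""
    name, _, suffix = line.partition(".")
    return name == "report" and suffix.isdigit() and low <= int(suffix) <= high


selected = [line for line in lines if in_range(line)]
print(selected)
```

Unlike the regex, this stays readable when the range crosses a month boundary, although it still treats the suffix as a plain integer rather than a real date.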
It seems you want to print the lines that fall between the lines matching two separate patterns (including the matching lines themselves).
$ sed -n '/^report\.20150325$/,/^report\.20150331$/p' file
report.20150325
report.20150326
report.20150327
report.20150328
report.20150329
report.20150330
report.20150331

Regex for Regex validation decimal[19,3]

I want to validate a decimal number (decimal[19,3]). I used this
@"[\d]{1,16}|[\d]{1,16}[\.]\d{1,3}"
but it didn't work.
Below are valid values:
1234567890123456.123
1234567890123456.12
1234567890123456.1
1234567890123456
1234567
0.0
.1
Simplification:
The \d doesn't have to be in []. Use [] only when you want to check whether a character is one of multiple characters or character classes.
. doesn't need to be escaped inside [] - [\.] appears to just allow ., but allowing \ to appear in the string in the place of the . may be a language dependent possibility (?). Or you can just take it out of the [] and keep it escaped.
So we get to:
\d{1,16}|\d{1,16}\.\d{1,3}
(which can be shortened using the optional / "once or not at all" quantifier (?)
to \d{1,16}(\.\d{1,3})?)
Corrections:
You probably want to make the second \d{1,16} optional, or equivalently simply make it \d{0,16}, so something like .1 is allowed:
\d{1,16}|\d{0,16}\.\d{1,3}
If something like 1. should also be allowed, you'll need to add an optional . to the first part:
\d{1,16}\.?|\d{0,16}\.\d{1,3}
Edit: I was under the impression [\d] matches \ or d, but it actually matches the character class \d (corrected above).
This would match your 3 scenarios
^(\d{1,16}|(\d{0,16}\.)?\d{1,3})$
first part: a 1 to 16 digit number
second: a 0 to 16 digit number with 1 to 3 decimals
third: nothing before a dot and then 1 to 3 decimals
The ^ and $ are anchors that match the start and end of the line, so if you need to search for numbers inside lines of text, you should remove them.
Testdata:
Usage in C#
// requires: using System.Text.RegularExpressions;
string resultString = null;
try {
    resultString = Regex.Match(subjectString, @"\d{1,16}\.?|\d{0,16}\.\d{1,3}").Value;
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
Slight optimization
A slightly more complicated but more correct regex adds the ?: notation to the "inner" group, making it a non-capturing group if you are not using its capture, like this:
^(\d{1,16}|(?:\d{0,16}\.)?\d{1,3})$
The following regex will help you out:
@"^(\d{1,16}(\.\d{1,3})?|\.\d{1,3})$"
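
A hedged Python check of this last pattern against the valid values from the question. The anchors are dropped in favour of `fullmatch`, and the `invalid` samples are my own assumptions about what should be rejected.

```python
import re

# Either 1-16 integer digits with an optional 1-3 digit fraction,
# or a bare fraction such as ".1".
DECIMAL_19_3 = re.compile(r"(\d{1,16}(\.\d{1,3})?|\.\d{1,3})")

valid = [
    "1234567890123456.123",
    "1234567890123456.12",
    "1234567890123456.1",
    "1234567890123456",
    "1234567",
    "0.0",
    ".1",
]
# Assumed-invalid inputs: 17 integer digits, 4 decimals, trailing dot,
# empty string, non-numeric text.
invalid = ["12345678901234567", "1.1234", "1.", "", "abc"]


def is_valid(text):
    """True when the whole string matches the decimal[19,3] pattern."""
    return DECIMAL_19_3.fullmatch(text) is not None


print([is_valid(v) for v in valid])
print([is_valid(v) for v in invalid])
```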
Try something like this:
(\d{0,16}\.\d{0,3})|(\d{0,16})
It works with all your examples.
edit. new version ;)
You can try:
^\d{0,16}(?:\.|$)(?:\d{0,3}|)$
match 0 to 16 digits
then match a dot or end of string
and then match up to 3 more digits