Regex: Match dot and dash in a 5 digit number

I'm trying to use a regex to match only the dot and the dash in a number with this format:
00.000-0
I could do this in two steps: first check that the number is in the format 00.000-0, then match only the dot and the dash with a pattern like [^\d] or [\.\-].
But I'd like to do it in a single step, with one pattern that matches the dot only when it follows the first two digits, and the dash only when it follows two digits, a dot and three digits.
First, I tried a positive lookahead on regex101.com, something like (?=\d\d)\.(?=\d\d\d)\-, but it didn't work. Then I tried just (?=\d\d)\. to see whether the lookahead worked for the dot alone, but that didn't match either.
According to Regular-Expressions.info, the lookahead syntax I used is correct.
Is there another way to match the dot and the dash, but only for this format: 00.000-0?

You might capture the dot and the dash in capturing groups ().
From the start of the string ^ match 2 digits [0-9]{2}, then capture the dot (\.) in capturing group 1, match 3 digits [0-9]{3}, capture the dash (-) in capturing group 2, and finally match a digit [0-9] at the end of the string $
^[0-9]{2}(\.)[0-9]{3}(-)[0-9]$
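Your attempts fail because a lookahead inspects the text to the right of the current position: (?=\d\d)\. requires the next two characters to be digits and then tries to match a dot at that same position, which can never succeed. To check the digits on the left you need a lookbehind.
If your engine supports lookbehinds, an option to match only the dot and the dash is to match a dot or a hyphen only when the expected pattern is present on both its left side and its right side.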
(?<=^\d{2})\.(?=\d{3}-\d$)|(?<=^\d{2}\.\d{3})-(?=\d$)
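As a rough illustration, both patterns above can be tried in Python's re module. This is only a sketch: the helper names separators_by_capture and separators_by_lookaround are made up for this example, and it assumes an engine that, like Python, accepts fixed-width lookbehinds.

```python
import re

# Approach 1: validate the whole string and capture the two separators.
CAPTURE = re.compile(r"^[0-9]{2}(\.)[0-9]{3}(-)[0-9]$")

# Approach 2: match only the separators; the lookarounds check the context.
LOOKAROUND = re.compile(
    r"(?<=^\d{2})\.(?=\d{3}-\d$)|(?<=^\d{2}\.\d{3})-(?=\d$)"
)


def separators_by_capture(text):
    """Return the captured dot and dash, or an empty list if the format is wrong."""
    m = CAPTURE.match(text)
    return list(m.groups()) if m else []


def separators_by_lookaround(text):
    """Return every dot/dash that sits in the expected position."""
    return LOOKAROUND.findall(text)


if __name__ == "__main__":
    for candidate in ("12.345-6", "12.345-67", "123.45-6"):
        print(candidate, separators_by_capture(candidate),
              separators_by_lookaround(candidate))
```

With the capture approach the separators are in groups 1 and 2 of a single match; with the lookaround approach each separator is a match of its own.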

Related

Negative Lookahead not match suffix

I have an expression that is matching something, but am trying to get this not to match if it's followed by the suffix: one or more spaces, three dashes, one or more spaces, one or more digits, a slash, and finally one or more digits. Here is the expression:
(?<=(^|\s+))[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)(?!(\s+\-\-\-\s+[0-9]+/[0-9]+))
And here is the text:
January 10.5/13.5 --- 22/26 ---
It's matching January 10.5/13, but I don't want it to match anything.
As lookarounds are supported, you can change the positive lookbehind at the start to a negative lookbehind (?<!\S), which asserts a whitespace boundary to the left.
In the lookahead, you can start with .* so that it scans the rest of the line, instead of requiring the suffix to start right away with 1 or more whitespace chars \s+. That prevents the match from backtracking to January 10.5/13, where the suffix does not follow directly.
The negative lookahead (?!.*\s-{3}\s+[0-9]+/[0-9]) asserts that the suffix does not occur to the right.
You can omit the quantifier + after the last character class: it does not matter whether 1 or more digits follow, as long as there is at least one digit.
Note that in the current pattern the decimal part is an optional capturing group 2, nested inside group 1. If you only need the whole value in group 1, you can make the decimal part an optional non-capturing group (?:\.[0-9]{1,3})?.
(?<!\S)[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)(?!.*\s-{3}\s+[0-9]+/[0-9])
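A minimal Python sketch of the pattern above. It assumes case-insensitive matching (the question reports that [A-Z]+ matches January, which suggests that flag was on), and find_value is a hypothetical helper name.

```python
import re

# Assumption: case-insensitive matching, so that [A-Z]+ can match "January".
PATTERN = re.compile(
    r"(?<!\S)[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)"
    r"(?!.*\s-{3}\s+[0-9]+/[0-9])",
    re.IGNORECASE,
)


def find_value(line):
    """Return the matched text, or None when the line carries the suffix."""
    m = PATTERN.search(line)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(find_value("January 10.5/13.5 --- 22/26 ---"))  # suffix present
    print(find_value("January 10.5/13.5"))                # no suffix
```

Because the lookahead starts with .*, shortening the match by backtracking does not help the engine get past it: the suffix is still found further to the right.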

Regex: Opposite of group match

this expression
(^\+\d{2})_\1
would match
+32_+32
How can I make it match this instead:
+32_+44
If you want the opposite, you might use a negative lookahead (?!\1) asserting that what follows is not the value of group 1, and then match a + and 2 digits.
^(\+\d{2})_(?!\1)\+\d{2}
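A short Python sketch of this pattern; has_different_codes is a made-up helper name used only for illustration.

```python
import re

# Group 1 captures the first value; (?!\1) rejects a repeat of it after "_".
DIFFERENT_CODES = re.compile(r"^(\+\d{2})_(?!\1)\+\d{2}")


def has_different_codes(text):
    """True when the value after the underscore differs from the one before it."""
    return DIFFERENT_CODES.match(text) is not None


if __name__ == "__main__":
    for candidate in ("+32_+44", "+32_+32"):
        print(candidate, has_different_codes(candidate))
```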
If you only want to match a + and 2 digits, an underscore, and again a + and 2 digits, regardless of whether the two values are equal, you don't need the capturing group.
^\+\d{2}_\+\d{2}

Regex to pull first two fields from a comma separated file

I want to pull the second string in a comma-delimited list where the first value is numeric and the second is alphabetic.
I'm using \d[^,]+(?=,) to pull the numeric value in the first field and just need help with pulling the second value from the "Name" column.
Here's part of a sample file that I'm trying to extract data from:
Address Number,Name,Employee Master Exist(Y/N),Auto-Deposit Exists(Y/N),Supplier Master Exists(Y/N),Supplier Master Created,ACH Account Exists(Y/N),ACH Account Created,ACH Same as Auto-deposit(Y/N)
//line break here is for clarity and does not exist in file//
4398,Presley Elvis Aaron,Y,N,Y,N,Y,N,N
10154,Shepard Alan Barrett,Y,Y,Y,N,Y,N,N
You could use a capturing group to get the second string: first match 1+ digits and a comma.
Then capture 1+ chars a-zA-Z in a group, optionally repeated with a space in between, and match the trailing comma.
^\d+,([a-zA-Z]+(?: [a-zA-Z]+)*),
^ Start of string
\d+, Match 1+ digits and a comma (Or use (\d+), if the digits should also be a group)
( Capture group 1
[a-zA-Z]+ Match 1+ chars a-zA-Z
(?: [a-zA-Z]+)* Repeat 0+ times matching a space followed by 1+ chars a-zA-Z
), Close capturing group and match trailing comma
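The breakdown above can be tried in Python roughly as follows. This sketch uses the (\d+) variant so that both fields are captured, assumes re.MULTILINE so that ^ matches at the start of every line, and abbreviates the header row of the sample; first_two_fields is a hypothetical helper name.

```python
import re

# Abbreviated copy of the sample file (header shortened for readability).
SAMPLE = """Address Number,Name,Employee Master Exist(Y/N)
4398,Presley Elvis Aaron,Y,N,Y,N,Y,N,N
10154,Shepard Alan Barrett,Y,Y,Y,N,Y,N,N
"""

# re.MULTILINE makes ^ match at the start of each line, not only of the text.
FIELDS = re.compile(r"^(\d+),([a-zA-Z]+(?: [a-zA-Z]+)*),", re.MULTILINE)


def first_two_fields(text):
    """Return (number, name) tuples for every line that starts with digits."""
    return FIELDS.findall(text)


if __name__ == "__main__":
    for number, name in first_two_fields(SAMPLE):
        print(number, name)
```

The header row is skipped because it does not start with a digit.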
For a slightly broader match, you could use this pattern, which allows spaces anywhere in the name but requires at least a single char a-zA-Z:
\d+,([a-zA-Z ]*[a-zA-Z][a-zA-Z ]*),
Note that \d[^,]+ in your pattern does not match only digits: it matches 1 digit followed by 1+ times any char except a comma, so it would for example also match 4a$.
You could try this regex:
^\d+,([^,]+),
This will look for lines:
starting with one or more digits
followed by a comma
capture anything that is not a comma
followed by a comma
If not all lines contain a name, then change the + to a *:
^\d+,([^,]*),
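To see the difference between the + and * variants, here is a small Python sketch applied to single lines; name_field is a made-up helper, and the line with the empty name is an invented example.

```python
import re

REQUIRED = re.compile(r"^\d+,([^,]+),")  # name must have at least one char
OPTIONAL = re.compile(r"^\d+,([^,]*),")  # name may be empty


def name_field(pattern, line):
    """Return the second field of the line, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(1) if m else None


if __name__ == "__main__":
    for line in ("4398,Presley Elvis Aaron,Y", "4398,,Y"):
        print(repr(name_field(REQUIRED, line)), repr(name_field(OPTIONAL, line)))
```

With + a line whose name field is empty is skipped entirely; with * it matches and group 1 is an empty string.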

How to capture an entire group consisting out of different characters?

I have a text with a number that contains dots:
text 304.33.44.52.03.001 text
where I want to capture the number including the dots:
304.33.44.52.03.001
The following regex will capture several groups:
(\d+\.?)
Resulting in:
304.
33.
44.
...
What is the correct syntax to return the entire number including dots in one result?
\d+\.? matches 1+ digits and then an optional . char.
You need to use either
\d+(?:\.\d+)*
or
\d[\d.]*
The \d+(?:\.\d+)* pattern matches
\d+ - 1+ digits
(?:\.\d+)* - 0 or more occurrences of a . and then 1+ digits. (?:...) is a non-capturing group that is used to group 2 patterns and set a quantifier on their sequence.
The \d[\d.]* pattern matches a digit first, and then tries to match 0 or more digits or . chars.
In regex engines that do not support \d, you can use the more portable bracket expression [0-9] instead.
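The two patterns behave the same on the sample text but differ when a dot trails the number. A minimal Python sketch, with first_number as a hypothetical helper and the second input invented for illustration:

```python
import re

STRICT = re.compile(r"\d+(?:\.\d+)*")  # every dot must be followed by digits
LOOSE = re.compile(r"\d[\d.]*")        # any run of digits and dots after a digit


def first_number(pattern, text):
    """Return the first match of the pattern in the text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for text in ("text 304.33.44.52.03.001 text", "version 1.2. released"):
        print(first_number(STRICT, text), first_number(LOOSE, text))
```

The stricter pattern stops before a trailing dot, while the looser one includes it, so the choice depends on whether such input can occur.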

Regex: Detect Phone numbers that are separated by dashes (-) and/or spaces

I am trying to recognize these types of phone number inputs:
0172665476
+6265476393
+62-65476393
+62-654-76393
+62 65476393
While my regex (?:\d+\s*)+ can recognize the first two sample values, it recognizes the last three sample values as multiple matches in each line, instead of recognizing the number as a whole.
How can I modify this to support multiple dashes and/or spaces and still recognize it as 1 whole number instead of multiple matches?
You may use this regex:
^\+?\d+(?:[\s-]\d+)*\b
RegEx Details:
^\+?: Match optional + at start
\d+: match 1+ digits
(?:[\s-]\d+)*: Match 0 or more groups that start with whitespace or - followed by 1+ digits
\b: Word boundary (used instead of the end anchor $, because with $ a line with trailing spaces would not match)
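A quick way to check this pattern against the sample inputs in Python; match_phone is a made-up helper name and the input with trailing spaces is an invented example.

```python
import re

PHONE = re.compile(r"^\+?\d+(?:[\s-]\d+)*\b")

SAMPLES = [
    "0172665476",
    "+6265476393",
    "+62-65476393",
    "+62-654-76393",
    "+62 65476393",
]


def match_phone(text):
    """Return the phone number at the start of the text, or None."""
    m = PHONE.match(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "->", repr(match_phone(sample)))
    # Trailing spaces are left out of the match thanks to \b.
    print(repr(match_phone("+62 65476393  ")))
```

Each sample is returned as one whole match rather than several pieces.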
This should work:
(?:[\d +-]+)+
This would work as per your requirement (trailing spaces are not included in the match):
^(?:[\d +-]+)\b
Another option could be to use an alternation to match either 10 digits without a leading plus sign or match the pattern with a +, and optional space or hyphen:
(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b
That will match:
(?: Non capturing group
\d{10} Match 10 digits
| Or
\+\d{2}[- ]?\d{3}-?\d{5} Match +, 2 digits, optional space or -, 3 digits, optional -, 5 digits
)\b Close non capturing group and word boundary
If your language supports negative lookbehinds you could prepend (?<!\S) which checks that what comes before is not a non-whitespace character.
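As a sketch of the alternation combined with the (?<!\S) prefix, the following Python snippet pulls numbers out of running text; find_phones and the sample sentence are invented for illustration.

```python
import re

# (?<!\S) requires the start of the text or whitespace directly before the number.
PHONE = re.compile(r"(?<!\S)(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b")


def find_phones(text):
    """Return all phone numbers that start at a whitespace boundary."""
    return PHONE.findall(text)


if __name__ == "__main__":
    text = "call 0172665476 or +62-654-76393 today, not x+6265476393"
    print(find_phones(text))
```

Without the lookbehind, the number glued to x at the end of the sentence would be matched as well.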