Matching a substring of n numbers, but not if there are any numbers after that [duplicate] - regex

Basically I'm looking for a regex that matches some simple phone numbers.
I want to match numbers in a longer string of text like 123 4567, 891-0111, or 21314151, something that is (hopefully) identified by (\d{3,4}[- ]\d{3,4}|\d{4,8}), but I don't want to match them if they're part of a longer number like 3919503570275.
If I require the next character to be a non-digit or the end of a line, then that next character is also included in the match, which I don't want.

Surround your regex with a negative lookbehind and a negative lookahead to reject \d on both sides:
(?<!\d)(\d{3,4}[- ]\d{3,4}|\d{4,8})(?!\d)
Note that this would accept a string that looks like a phone number preceded or followed by letters.
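As a rough illustration of the lookaround approach, here is a minimal Python sketch. The choice of Python and the names PHONE and find_phone_numbers are assumptions made for the example; the pattern itself is the one given above.

```python
import re

# Lookarounds are zero-width: they inspect the neighbouring characters
# without adding them to the match.
PHONE = re.compile(r"(?<!\d)(\d{3,4}[- ]\d{3,4}|\d{4,8})(?!\d)")

def find_phone_numbers(text):
    """Return every phone-like substring that does not touch another digit."""
    return [m.group(0) for m in PHONE.finditer(text)]

sample = "Call 123 4567, 891-0111, or 21314151, but not 3919503570275."
print(find_phone_numbers(sample))
```

The 13-digit number is skipped because every candidate substring inside it has a digit immediately before or after it.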

Depending on the programming language you use, I suggest either using a negative lookahead or using a capture group to extract the number.
See https://www.regular-expressions.info/lookaround.html for information about lookaround patterns.

Regex for alphanumeric with at least one digit [duplicate]

I'm looking for a regex for an invoice number in VBScript.
It can be alphanumeric, but at least one numeric digit is a must.
I'm using the regex below, but it also matches the purely alphabetic string INVOICE. It needs to have at least one digit.
\b(?=.*\d)[A-Z0-9\-]{5,12}\b
Expected Match String
1233444
M62899M
M828828
783838PTE
A751987
Expected Unmatch String
INVOICE
ubb62727
XYZ
123
If we use ([A-Z0-9]*[0-9]+[A-Z0-9]*), I can't specify the length.
Please suggest a proper regex. Please note it's totally different from the suggested duplicate, as the requirement and format are different.
The blanket .* in your lookahead will happily skip past the trailing \b if it has to. Make it more constrained, so it can't.
\b(?=[-A-Z]*\d)[A-Z0-9-]{5,12}\b
(I removed the backslash before the -; if you really want to allow a literal backslash, obviously add it back, to the character class in the lookahead also. A dash at beginning or end of a character class is unambiguous and doesn't require a backslash escape; this is also the only way to have a literal dash in a character class in many regex dialects.)
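To see the difference between the blanket and the constrained lookahead, here is a small Python sketch (Python rather than VBScript is an assumption made for the example, as are the names LOOSE, STRICT and text). It runs both patterns over a line that contains several candidate tokens.

```python
import re

# Original pattern: .* lets the lookahead find a digit in a *later* token.
LOOSE = re.compile(r"\b(?=.*\d)[A-Z0-9\-]{5,12}\b")
# Constrained pattern: the lookahead may only cross characters that are
# allowed in the token itself before it reaches a digit.
STRICT = re.compile(r"\b(?=[-A-Z]*\d)[A-Z0-9-]{5,12}\b")

text = "INVOICE M62899M XYZ 123 783838PTE"
print(LOOSE.findall(text))
print(STRICT.findall(text))
```

With the loose pattern, INVOICE is accepted because the digit in M62899M satisfies the lookahead; the constrained version cannot reach past the space.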

how to validate that two underscores are not together in regular expression [duplicate]

I have tried different regular expressions already, but I am not sure how to make it catch one or more underscores. If there are two together, the string must be invalid.
The first character must be a capital letter, followed by any characters; the problem is the underscore.
I have this: (^[A-Z])(\w{6,30} ?=*(_))
This regex may work for you with a negative lookahead condition:
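^[A-Z](?!.*__)\w{6,30}$
(?!.*__) is a negative lookahead that fails the match if __ appears anywhere after the first capital letter. Using .* rather than [^_]* inside the lookahead matters: [^_]*__ only detects a double underscore when it is the first underscore in the string, so Ab_cd__ef would slip through.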
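If you mean a pattern which is a word starting with a capital letter, followed by any number of groups each consisting of a single underscore and a word (note that \w itself matches an underscore, so the words use [A-Za-z0-9] here):
^[A-Z][A-Za-z0-9]{6,30}(_[A-Za-z0-9]{6,30})*$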
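A minimal Python sketch of the lookahead approach follows. It assumes the intent is to reject a double underscore anywhere in the string, so the lookahead uses .* to scan the whole remainder; the names NAME and is_valid_name are made up for the example.

```python
import re

# Capital letter first, then 6 to 30 word characters,
# with no "__" anywhere after the first character.
NAME = re.compile(r"^[A-Z](?!.*__)\w{6,30}$")

def is_valid_name(s):
    """True if s starts with a capital and never has two underscores in a row."""
    return NAME.match(s) is not None

for candidate in ["Abcdefg", "Ab_cdefg", "Ab__cdefg", "Ab_cd__ef", "abcdefg"]:
    print(candidate, is_valid_name(candidate))
```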

Regex substitution: find double quotes not following by specific character [duplicate]

I have the following situation:
3" a
3":a
3",a
3"a
3"2
3"A
I need to find and replace a double quote with a space every time the double quote is not followed by : or ,.
So, for my case the expected results will be:
3 a
3":a
3",a
3 a
3 2
3 A
Any idea how to write this logic using regex?
You can use a negative lookahead A(?!B) for that. It matches an expression A that is not followed by expression B.
The replacement of the matches with spaces will depend on the used language.
"(?![:,])
Applied to your examples: https://regex101.com/r/UiPlaC/2
If you want to handle the case 3" a without having multiple spaces, just include one (or even more?) optional spaces in the match.
"(?![:,])\ ?
See here for more information:
Regex lookahead, lookbehind and atomic groups
https://www.regular-expressions.info/lookaround.html
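Since the replacement step depends on the language, here is how it might look in Python (an assumption; the question does not name a language). The names QUOTE and replace_quotes are hypothetical. The sketch uses the variant with the optional trailing space.

```python
import re

# A double quote not followed by ':' or ',', plus one optional space
# so that '3" a' does not end up with two spaces.
QUOTE = re.compile(r'"(?![:,]) ?')

def replace_quotes(line):
    """Replace each qualifying double quote with a single space."""
    return QUOTE.sub(" ", line)

for line in ['3" a', '3":a', '3",a', '3"a', '3"2', '3"A']:
    print(replace_quotes(line))
```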

Does not match when the string does not have a dot but it will match multiple dots [duplicate]

I am trying to match the string when there are zero or multiple dots. The regex I have matches strings with multiple dots, but not strings with no dot.
(\w*)((\w*\.)+\w*)
These are the test string I am using
dial.check.Catch.Url
dial.check.Catch.Url.Dial.check.Catch.Url
32443.324342.23423424.23.423.423.42.34.234.32.4..2..2.342.4
234dfasfd2aa4234234.234aa341.4.123daaadf.df.af....
12fd.dafd
.
abc
The Regex will match these
dial.check.Catch.Url
dial.check.Catch.Url.Dial.check.Catch.Url
32443.324342.23423424.23.423.423.42.34.234.32.4..2..2.342.4
234dfasfd2aa4234234.234aa341.4.123daaadf.df.af....
12fd.dafd
.
But not this one:
abc
https://regexr.com/?38ed7
If you really must use a regex, here is one (but it is inefficient):
/^(?![^.]*\.[^.]*$).*$/
It says:
Match a string whose beginning is not followed by a whole string containing exactly one dot. In other words, it accepts strings with no dot or with two or more dots.
It does some backtracking when parsing the negative lookahead.
As mentioned in the comments to the question, I do think, unless you must have a regex, that a simple function might be better. But if you like the conciseness of a regex and performance is not a huge concern, you can go with the one I gave above. Regexes with "nots" in them are generally a tad messy, but once you understand lookarounds they do become doable. Cheers.
/\..*\.|^[^.]*$/
Or, in plain English:
Match EITHER a dot, then any number of characters, then another dot; OR the beginning of the string, then any number of non-dots, then the end of the string.
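Both patterns above read "multiple" as "two or more", so a string with exactly one dot is rejected. The Python sketch below (the language and the names LOOKAHEAD, ALTERNATION and not_exactly_one_dot are assumptions made for the example) checks the two regexes against the plain-function alternative mentioned earlier.

```python
import re

LOOKAHEAD = re.compile(r"^(?![^.]*\.[^.]*$).*$")
ALTERNATION = re.compile(r"\..*\.|^[^.]*$")

def not_exactly_one_dot(s):
    """Non-regex equivalent: accept zero dots or two-plus dots."""
    return s.count(".") != 1

samples = ["dial.check.Catch.Url", "12fd.dafd", ".", "abc", "4..2"]
for s in samples:
    print(
        s,
        bool(LOOKAHEAD.search(s)),
        bool(ALTERNATION.search(s)),
        not_exactly_one_dot(s),
    )
```

For single-line input the three checks agree, which is one argument for the simpler count-based function when a regex is not strictly required.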

Regex for string containing one string, but not another [duplicate]

We have a regex in our project that matches any URL containing the string "/pdf/":
(.+)/pdf/.+
I need to modify it so that it won't match URLs that also contain "help".
Example:
Shouldn't match: "/dealer/help/us/en/pdf/simple.pdf"
Should match: "/dealer/us/en/pdf/simple.pdf"
If lookarounds are supported, this is very easy to achieve. Anchor the pattern at the start so that both lookaheads scan the whole string:
^(?=.*/pdf/)(?!.*help)(.+)
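A short Python sketch of the two-lookahead idea, assuming Python's re module and a hypothetical helper name matches_pdf_url. The pattern is anchored with ^ here because an unanchored search could begin after the word "help" and still succeed.

```python
import re

# One lookahead requires "/pdf/", the other forbids "help";
# both are evaluated from the start of the string.
PDF_NOT_HELP = re.compile(r"^(?=.*/pdf/)(?!.*help)(.+)")

def matches_pdf_url(url):
    """True if url contains /pdf/ and does not contain help."""
    return PDF_NOT_HELP.match(url) is not None

print(matches_pdf_url("/dealer/us/en/pdf/simple.pdf"))
print(matches_pdf_url("/dealer/help/us/en/pdf/simple.pdf"))
```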
(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)
First, we match either a whitespace character or the start of the line:
(?:^|\s)
Then we match anything that is not a space or h, OR any h that is not followed by elp, one or more times (+), until we find /pdf/; then we match non-space characters (\S) any number of times (*).
((?:[^h ]|h(?!elp))+\/pdf\/\S*)
If we also want to reject help after the /pdf/, we can repeat the same sub-pattern there.
((?:[^h ]|h(?!elp))+\/pdf\/(?:[^h ]|h(?!elp))+)
Finally, we match a whitespace character or the end of the line/string ($):
(?:$|\s)
The full match will include leading/trailing spaces, and should be stripped. If you use capture group 1, you don't need to strip the ends.
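Here is a small Python sketch of the capture-group usage described above. Python and the names PDF_URL and extract_pdf_url are assumptions for the example; the pattern is the first full one given in this answer, so help appearing after /pdf/ is not rejected.

```python
import re

PDF_URL = re.compile(r"(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)")

def extract_pdf_url(text):
    """Return the first /pdf/ URL with no 'help' before /pdf/, or None."""
    m = PDF_URL.search(text)
    # Group 1 excludes the surrounding whitespace, so no stripping is needed.
    return m.group(1) if m else None

print(extract_pdf_url("see /dealer/us/en/pdf/simple.pdf now"))
print(extract_pdf_url("/dealer/help/us/en/pdf/simple.pdf"))
```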