Regex substitution: find double quotes not followed by a specific character [duplicate]

This question already has an answer here:
Regex Match a character which is not followed by another specific character
(1 answer)
Closed 4 years ago.
I have the following situation:
3" a
3":a
3",a
3"a
3"2
3"A
I need to find and replace a double quote with a space every time the double quote is not followed by : or ,.
So, for my case the expected results will be:
3 a
3":a
3",a
3 a
3 2
3 A
Any idea how to write this logic using regex?
Regards,

You can use a negative lookahead A(?!B) for that. It matches an expression A that is not followed by expression B.
The replacement of the matches with spaces will depend on the used language.
"(?![:,])
Applied to your examples: https://regex101.com/r/UiPlaC/2
If you want to handle the case 3" a without ending up with multiple spaces, include an optional space in the match:
"(?![:,])\ ?
See here for more information:
Regex lookahead, lookbehind and atomic groups
https://www.regular-expressions.info/lookaround.html
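Since the replacement step depends on the language, here is a minimal sketch in Python (an assumption; the question does not name a language). The helper name replace_quotes is made up for this example.

```python
import re

# A double quote that is NOT followed by ':' or ',', plus one optional
# trailing space so that 3" a does not end up with two spaces.
QUOTE = re.compile(r'"(?![:,]) ?')

def replace_quotes(text: str) -> str:
    """Replace each matching quote (and an optional space after it) with one space."""
    return QUOTE.sub(" ", text)

samples = ['3" a', '3":a', '3",a', '3"a', '3"2', '3"A']
for s in samples:
    print(repr(s), "->", repr(replace_quotes(s)))
```

The lookahead is zero-width, so the character after the quote is only inspected, never consumed or replaced.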

Related

Regexp to match multi-line string [duplicate]

This question already has answers here:
What is a non-capturing group in regular expressions?
(18 answers)
Closed 2 years ago.
I have this regexp:
^(?<FOOTER_TYPE>[ a-zA-Z0-9-]+)?(?<SEPARATOR>:)?(?<FOOTER>(?<=:)(.|[\r\n](?![\r\n]))*)?
Which I'm using to match text like:
BREAKING CHANGE: test
my multiline
string.
This is not matched
You can see the result here https://regex101.com/r/gGroPK/1
However, why is there a Group 4 at the end?
You will need to make the last group non-capturing:
^(?<FOOTER_TYPE>[ a-zA-Z0-9-]+)?(?<SEPARATOR>:)?(?<FOOTER>(?<=:)(?:.|[\r\n](?![\r\n]))*)?
Make note of:
(?:.|[\r\n](?![\r\n]))*)?
(?: at the start makes this inner group non-capturing.
Updated Demo
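It is Group 4 because the fourth opening parenthesis you defined starts this group:
(.|[\r\n](?![\r\n]))*)
It translates to
"either any character (the dot), or a line break that is not followed by another line break"
and in the example you have, the matched text ends with a dot:
string.
Because the group is repeated, it keeps only its last iteration, so that final dot is what gets captured as the fourth group.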
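The effect can be reproduced with a short sketch in Python (an assumption; the question uses regex101). Python spells named groups (?P<name>...) rather than (?<name>...), and the sample text assumes a blank line before "This is not matched", which is what makes the footer stop there.

```python
import re

# Shared prefix; Python needs (?P<name>...) for named groups.
HEAD = r"^(?P<FOOTER_TYPE>[ a-zA-Z0-9-]+)?(?P<SEPARATOR>:)?"

# Original pattern: the inner alternation is a capturing group (group 4).
CAPTURING = re.compile(HEAD + r"(?P<FOOTER>(?<=:)(.|[\r\n](?![\r\n]))*)?")

# Fixed pattern: the inner alternation is non-capturing.
NON_CAPTURING = re.compile(HEAD + r"(?P<FOOTER>(?<=:)(?:.|[\r\n](?![\r\n]))*)?")

# Assumed sample text: a blank line separates the footer from the rest.
text = "BREAKING CHANGE: test\nmy multiline\nstring.\n\nThis is not matched"

m = CAPTURING.match(text)
print(CAPTURING.groups)        # how many groups the pattern defines
print(repr(m.group(4)))        # last iteration of the repeated group

n = NON_CAPTURING.match(text)
print(NON_CAPTURING.groups)
print(repr(n.group("FOOTER")))
```

Both patterns match the same text; only the number of reported groups differs.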

Find DATE match starting from end of string [duplicate]

This question already has answers here:
Regex Last occurrence?
(7 answers)
Closed 3 years ago.
I have the following RegEx syntax that will match the first date found.
([0-9]+)/([0-9]+)/([0-9]+)
However, I would like to start from the end of the content and search backwards. In other words, in the below example, my syntax will always match the first date, but I want it to match the last instead.
Some Text here
01/02/15
Some additional
text here.
10/04/14
Ending text
here
I believe this is possible by using a negative lookahead, but all my attempts failed at this because I don't understand RegEx enough. Help would be appreciated.
Note: my application uses PCRE regular expressions.
You could make the dot match a newline using for example an inline modifier (?s) and match until the end of the string.
Then rely on backtracking to find the last occurrence of the date-like pattern, preceding its first digit with a word boundary.
Use \K to discard what has been matched so far, then match the date-like pattern.
^(?s).*\b\K[0-9]+/[0-9]+/[0-9]+
Regex demo
Note that the pattern is a very broad match and does not validate a date itself.
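For engines without \K, the same backtracking idea works with a capture group. The sketch below uses Python's standard re module, which does not support \K; this is a swapped-in variant, not the PCRE pattern above, and the function name last_date is made up for the example.

```python
import re
from typing import Optional

# Greedy .* (with DOTALL so it crosses newlines) runs to the end of the text,
# then backtracks to the last position where the date-like pattern fits.
# The capture group plays the role that \K plays in the PCRE version.
LAST_DATE = re.compile(r".*\b([0-9]+/[0-9]+/[0-9]+)", re.DOTALL)

def last_date(text: str) -> Optional[str]:
    """Return the last date-like substring, or None if there is none."""
    m = LAST_DATE.match(text)
    return m.group(1) if m else None

text = (
    "Some Text here\n01/02/15\nSome additional\ntext here.\n"
    "10/04/14\nEnding text\nhere"
)
print(last_date(text))
```

As with the original pattern, this only looks for digits separated by slashes and does not validate the date.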

Does not match when the string does not have a dot but it will match multiple dots [duplicate]

This question already has answers here:
Regex to allow alphanumeric and dot
(3 answers)
Closed 4 years ago.
I am trying to match a string when it has zero or multiple dots. The regex I have matches strings that contain dots, but not strings with no dot.
(\w*)((\w*\.)+\w*)
These are the test strings I am using:
dial.check.Catch.Url
dial.check.Catch.Url.Dial.check.Catch.Url
32443.324342.23423424.23.423.423.42.34.234.32.4..2..2.342.4
234dfasfd2aa4234234.234aa341.4.123daaadf.df.af....
12fd.dafd
.
abc
The regex matches these:
dial.check.Catch.Url
dial.check.Catch.Url.Dial.check.Catch.Url
32443.324342.23423424.23.423.423.42.34.234.32.4..2..2.342.4
234dfasfd2aa4234234.234aa341.4.123daaadf.df.af....
12fd.dafd
.
But not this one:
abc
https://regexr.com/?38ed7
If you really must use a regex, here is one (but it is inefficient):
/^(?![^.]*\.[^.]*$).*$/
It says:
Match a string so that the beginning of the string is not followed by a whole string with a single dot.
It does some backtracking when parsing the negative lookahead.
As mentioned in the comments to the question, I do think, unless you must have a regex, that a simple function might be better. But if you like the conciseness of a regex and performance is not a huge concern, you can go with the one I gave above. Regexes with "nots" in them are generally a tad messy, but once you understand lookarounds they do become doable. Cheers.
/\..*\.|^[^.]*$/
Or, in plain English:
Match EITHER a dot, then any number of characters, then another dot; OR the beginning of the string, then any number of non-dots, then the end of the string.
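A quick way to compare the two patterns is to run them side by side. This is a minimal sketch in Python (an assumption; the original patterns are written with JavaScript-style delimiters), and the helper name accepted is hypothetical. As written, both patterns accept strings with no dot or with two or more dots, and reject strings with exactly one dot.

```python
import re
from typing import Tuple

# First answer: reject strings that contain exactly one dot.
LOOKAHEAD = re.compile(r"^(?![^.]*\.[^.]*$).*$")

# Second answer: two dots somewhere, OR no dots at all.
ALTERNATION = re.compile(r"\..*\.|^[^.]*$")

def accepted(s: str) -> Tuple[bool, bool]:
    """Return (lookahead result, alternation result) for one string."""
    return bool(LOOKAHEAD.search(s)), bool(ALTERNATION.search(s))

for s in ["abc", "12fd.dafd", ".", "dial.check.Catch.Url"]:
    print(repr(s), accepted(s))
```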

Matching a substring of n numbers, but not if there are any numbers after that [duplicate]

This question already has answers here:
Java RegEx that matches exactly 8 digits
(3 answers)
Closed 5 years ago.
Basically I'm looking for a regex that matches some simple phone numbers.
I want to match numbers in a longer string of text like 123 4567, 891-0111, or 21314151, something that is (hopefully) identified by (\d{3,4}[- ]\d{3,4}|\d{4,8}), but I don't want to match them if they're part of a longer number like 3919503570275.
If I require the next character to be a non-digit or the end of a line, then that next character is also included in the match, which I don't want.
Surround your regex with a negative lookbehind and a negative lookahead to reject \d on both sides:
(?<!\d)(\d{3,4}[- ]\d{3,4}|\d{4,8})(?!\d)
Demo.
Note that this would accept a string that looks like a phone number preceded or followed by letters.
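A minimal sketch of how that pattern behaves, written in Python as an assumption (the linked duplicate is about Java). The sample sentence and the name find_numbers are made up for illustration.

```python
import re
from typing import List

# Negative lookbehind and lookahead keep the match from starting or ending
# in the middle of a longer run of digits, without consuming any characters.
PHONE = re.compile(r"(?<!\d)(\d{3,4}[- ]\d{3,4}|\d{4,8})(?!\d)")

def find_numbers(text: str) -> List[str]:
    """Return every phone-like number found in the text."""
    return [m.group(1) for m in PHONE.finditer(text)]

sample = "Call 123 4567, 891-0111, or 21314151 but not 3919503570275."
print(find_numbers(sample))
```

Because lookarounds are zero-width, the character after the number is checked but is not part of the match.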
Depending on the programming language you use, I suggest either using a negative lookahead or using groups to extract the number.
See https://www.regular-expressions.info/lookaround.html for information about lookaround patterns.

Regex for string containing one string, but not another [duplicate]

This question already has answers here:
Regular expression for a string containing one word but not another
(5 answers)
Closed 3 years ago.
We have a regex in our project that matches any URL that contains the string "/pdf/":
(.+)/pdf/.+
We need to modify it so that it won't match URLs that also contain "help".
Example:
Shouldn't match: "/dealer/help/us/en/pdf/simple.pdf"
Should match: "/dealer/us/en/pdf/simple.pdf"
If lookarounds are supported, this is very easy to achieve:
(?=.*/pdf/)(?!.*help)(.+)
See a demo on regex101.com.
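Here is a small sketch of the two-lookahead pattern in Python (an assumption; the question does not say which engine the project uses). The function name matches_pdf_url is hypothetical.

```python
import re

# Both lookaheads are evaluated at the start of the URL:
# the first requires "/pdf/" somewhere ahead, the second forbids "help".
PDF_NOT_HELP = re.compile(r"(?=.*/pdf/)(?!.*help)(.+)")

def matches_pdf_url(url: str) -> bool:
    """True if the URL contains /pdf/ and does not contain help."""
    # re.match anchors at the start of the string. An unanchored search
    # could begin after the 'h' of "help" and match from there.
    return PDF_NOT_HELP.match(url) is not None

for url in ["/dealer/help/us/en/pdf/simple.pdf", "/dealer/us/en/pdf/simple.pdf"]:
    print(url, matches_pdf_url(url))
```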
(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)
The first thing is to match either whitespace or the start of a line:
(?:^|\s)
Then we match anything that is not a space or h, OR any h that is not followed by elp, one or more times (+), until we find /pdf/; then we match non-space characters (\S) any number of times (*).
((?:[^h ]|h(?!elp))+\/pdf\/\S*)
If we want to detect help after the /pdf/, we can duplicate matching from the start.
((?:[^h ]|h(?!elp))+\/pdf\/(?:[^h ]|h(?!elp))+)
Finally, we match whitespace or the end of the line/string ($):
(?:$|\s)
The full match will include leading/trailing spaces, and should be stripped. If you use capture group 1, you don't need to strip the ends.
Example on regex101
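To show the difference between the full match and capture group 1, here is a minimal sketch in Python (an assumption, as no language is given). The name extract_pdf_url is made up for this example.

```python
import re
from typing import Optional

# Lookahead-only variant: boundaries are matched by (?:^|\s) and (?:$|\s),
# and the URL itself is captured in group 1.
PDF_NO_HELP = re.compile(r"(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)")

def extract_pdf_url(text: str) -> Optional[str]:
    """Return the first /pdf/ URL with no 'help' before /pdf/, or None."""
    m = PDF_NO_HELP.search(text)
    return m.group(1) if m else None

padded = " /dealer/us/en/pdf/simple.pdf "
print(repr(PDF_NO_HELP.search(padded).group(0)))  # full match keeps the spaces
print(repr(extract_pdf_url(padded)))              # group 1 does not
print(extract_pdf_url("/dealer/help/us/en/pdf/simple.pdf"))
```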