How to match an apostrophe (') unless it is escaped (\')?

Is it possible to construct a regular expression for this? If so, I'd appreciate if someone shows how.

Use this regular expression:
(?<!\\)'
It matches every apostrophe that is not preceded by a backslash. (The backslash is doubled in the pattern because it is a special character in regular expressions.)
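A minimal Python sketch of that lookbehind, for illustration only. The sample string and the name UNESCAPED_APOSTROPHE are made up here, not part of the answer:

```python
import re

# An apostrophe that is not immediately preceded by a backslash.
UNESCAPED_APOSTROPHE = re.compile(r"(?<!\\)'")

text = r"it's, it\'s, 'quoted'"

# Collect the index of every apostrophe the pattern accepts.
positions = [m.start() for m in UNESCAPED_APOSTROPHE.finditer(text)]
print(positions)

# An apostrophe after an escaped backslash (\\') is also skipped,
# because the lookbehind only inspects the single preceding character.
print(UNESCAPED_APOSTROPHE.search(r"\\'"))
```

The second print shows the limit of the single-character lookbehind: it cannot tell an escaped apostrophe from an apostrophe that follows an escaped backslash.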

If you are using the .NET regex engine or another engine that can handle indefinite-length lookbehind assertions, then use
(?<=(?<!\\)(?:\\\\)*)'
That makes sure that there is an even number of backslashes before the apostrophe.
Explanation:
(?<= # Assert that the following regex matches before the current position:
(?<!\\) # No backslash before...
(?:\\\\)* # ... an even number of backslashes.
) # (End of lookbehind assertion)
' # Match an apostrophe.
If your regex engine can't handle that, you'll need to make the (even number of) backslashes part of the match and account for them later:
(?<!\\)((?:\\\\)*)'
Now $1 (or \1) will contain the matched backslashes, so you can replace the result by \1\\' or $1\\', depending on the details of the QRegExp implementation.
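Python's built-in re module only accepts fixed-width lookbehind, so it falls into the second case. A rough sketch of the capture-group variant there; the helper names and the sample string are hypothetical:

```python
import re

# Group 1 captures an even-length run of backslashes (possibly empty)
# that is not itself preceded by another backslash.
PATTERN = re.compile(r"(?<!\\)((?:\\\\)*)'")

def unescaped_apostrophes(text):
    """Return the index of every apostrophe preceded by an even number of backslashes."""
    # The apostrophe is the last character of each match.
    return [m.end() - 1 for m in PATTERN.finditer(text)]

def escape_apostrophes(text):
    """Put a backslash in front of every apostrophe that is not already escaped."""
    # \1 re-inserts the captured backslashes, \\ adds one more, then the apostrophe.
    return PATTERN.sub(r"\1\\'", text)

sample = r"a' b\' c\\' d\\\'"
print(unescaped_apostrophes(sample))
print(escape_apostrophes(sample))
```

Because the backslashes become part of the match, the position of the apostrophe has to be derived from the end of the match rather than from its start.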

Related

Match any character but no empty and not only white spaces

I have this regex:
\[tag\](.*?)\[\/tag\]
It matches any characters between the two tags. The problem is that it also matches empty content or content that is only whitespace, for example:
[tag][/tag]
[tag] [/tag]
How can I avoid that? It should match only if the content has at least one character and is not just whitespace. Thanks!

Use
\[tag\](?!\s*\[\/tag\])(.*?)\[\/tag\]
       ^^^^^^^^^^^^^^^^
The (?!\s*\[\/tag\]) is a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more whitespace characters followed by [/tag].
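A small Python sketch of the lookahead at work. The sample input is invented, and re.DOTALL is an added assumption so that tag content may span several lines:

```python
import re

# [tag], then a lookahead that rejects "optional whitespace + [/tag]",
# then the lazily captured content and the closing tag.
TAG_WITH_CONTENT = re.compile(r"\[tag\](?!\s*\[/tag\])(.*?)\[/tag\]", re.DOTALL)

sample = "[tag][/tag] [tag]   [/tag] [tag] hello [/tag]"

# Only the third pair has non-whitespace content, so only it is returned.
contents = TAG_WITH_CONTENT.findall(sample)
print(contents)
```

The captured group keeps the surrounding spaces; call strip() on it if they are not wanted.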

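You might change your expression to something like this, which requires at least one character between the tags (note that [\s\S]+ is greedy and still accepts whitespace-only content):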
\[tag\]([\s\S]+)\[\/tag\]
and you might add a quantifier to it, and bound it with number of chars, similar to this expression:
\[tag\]([\s\S]{3,})\[\/tag\]
Or you could do the same with your original expression as this expression:

Try this regex:
\[(tag)\](?!\s*\[\/\1\])(.*?)\[\/\1\]
This regex matches tag only if it has at least one non-whitespace char.
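The backreference makes the same idea work for more than one tag name. A sketch in Python, assuming tag names consist of word characters (\w+); that generalisation and the sample input are not from the answer, which hard-codes "tag":

```python
import re

# Group 1 is the tag name; \1 reuses it in the lookahead and the closing tag.
ANY_TAG = re.compile(r"\[(\w+)\](?!\s*\[/\1\])(.*?)\[/\1\]", re.DOTALL)

sample = "[b] [/b][i]text[/i][u][/u]"

# Pairs of (tag name, content) for tags with non-whitespace content.
found = [(m.group(1), m.group(2)) for m in ANY_TAG.finditer(sample)]
print(found)
```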

If this is PCRE (PHP), Notepad++ or Perl, use this:
(?s)(?:\[tag\]\s*\[/tag\](*SKIP)(?!)|\[tag\]\s*(.+?)\s*\[/tag\])
https://regex101.com/r/aCsOoQ/1
If not, you're stuck with using Stribnetz regex, which works because of an odd condition of your requirements.
Readable
(?s)
(?:
     \[tag\]
     \s*
     \[/tag\]
     (*SKIP)
     (?!)
  |
     \[tag\]
     \s*
     ( .+? )          # (1)
     \s*
     \[/tag\]
)

RegEx: don't capture match, but capture after match

There are a thousand regular expression questions on SO, so I apologize if this is already covered. I did look first.
I have string:
Name Subname 11X22 88X620 AB33(20) YA5619 77,66
I need to capture this string: YA5619
What I am doing is just finding AB33(20) and after this I am capturing until first white space. But AB33(20) can be AB-33(20) or AB33(-20) or AB33(-1).
My preg_match regex is: (?<=\bAB\d{2}\(\d{2}\)\s).+?(?=\s)
Why am I getting an error when I change \d{2} to \d+?
For the final result I thought this regex would work, but it does not:
(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)
Any ideas what I am doing wrong?

With most regex flavors, lookbehind needs to evaluate to a fixed-length sequence, so you can't use variable quantifiers like * or + or even {1,2}.
Instead of using lookaround, you can simply match your marker pattern and then forget it with \K.
AB-?\d+(?:\(-?\d+\))? \K[^ ]+
demo: https://regex101.com/r/8XXngH/1
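For engines without \K, a capture group does the same job. A sketch in Python, whose built-in re module has neither \K nor variable-length lookbehind; the group-based pattern is a swapped-in equivalent, not the pattern from the answer:

```python
import re

# Match the marker normally and capture only the token that follows it.
AFTER_MARKER = re.compile(r"\bAB-?\d+\(-?\d+\)\s+(\S+)")

def token_after_marker(line):
    """Return the first whitespace-delimited token after AB..(..), or None."""
    m = AFTER_MARKER.search(line)
    return m.group(1) if m else None

line = "Name Subname 11X22 88X620 AB33(20) YA5619 77,66"
print(token_after_marker(line))

# The variants from the question are handled by the optional minus signs and \d+.
print(token_after_marker("x AB-33(20) YA5619 y"))
print(token_after_marker("x AB33(-1) YA5619 y"))
```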

It depends on the language. In .NET, for example, your pattern matches because .NET supports variable-length lookbehind.
Another solution is to use a character class listing the characters you want to allow after AB, then match a whitespace character, reset the match with \K, and match \S+ (one or more non-whitespace characters).
\bAB[()\d-]+\s\K\S+
Explanation
\bAB Match AB literally, preceded by a word boundary so that AB cannot be part of a larger word
[()\d-]+ Match 1+ times any of the characters listed in the character class
\s Match a whitespace char (or use \s+ to match 1 or more)
\K Reset the starting point of the reported match (forget what was matched so far)
\S+ Match 1+ non-whitespace characters

Find slash that are NOT followed by non word character

I am trying to write a regex for finding slashes only that are not followed by special characters.
For example, if the string is,
/PErs/#loc/g/2, then the regex should find the slashes (/) that are before P, g and 2. It should not return the slash before #, as # is a special character.
I could write \/\w but it is returning me /P, /g and /2.

The simplest option is to use a word boundary, \b:
\/\b
\b matches between a word character and a non-word character.

You want to use the lookahead operator.
Positive lookahead or detect if something is present after (ahead)
Try this regex instead:
\/(?=\w)
We use here the positive lookahead operator (?=). It will "detect" the position of a given expression but won't match the expression.
Negative lookahead or detect if something is NOT present after (ahead)
Alternatively, you can also use the negative look ahead operator (?!).
\/(?![#])
Negative lookahead with multiple special characters
This will match any / NOT followed by #. If you have more special characters, simply add them to the character class.
For example, if # and % were special characters, the regular expression above would become:
\/(?![#%])

Matching slashes NOT followed by a NON-word character is not the same as matching slashes followed by a word character.
Have a try with:
/(?!\W)
This matches slashes that are NOT followed by a NON-word character.
Unlike the other patterns, it also matches a slash at the very end of the string, such as the final slash in PErs/
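The difference between the three approaches only shows up at the end of the string. A quick Python comparison; the trailing slash added to the question's sample input is an assumption made to expose that case:

```python
import re

# The question's input with one extra slash appended at the end.
text = "/PErs/#loc/g/2/"

def starts(pattern):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

lookahead = starts(r"/(?=\w)")      # slash followed by a word character
boundary = starts(r"/\b")           # slash followed by a word boundary
not_nonword = starts(r"/(?!\W)")    # slash not followed by a non-word character

print(lookahead)
print(boundary)
print(not_nonword)
```

The first two patterns agree on this input. The third also reports the final slash, because at the end of the string there is no non-word character for the negative lookahead to reject.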

Match everything to the first unescaped (with \) character

I have following input:
!foo\[bar[bB]uz\[xx/
I want to match everything from the start up to the first unescaped [, including any escaped bracket \[ and omitting leading characters that are in the [!#\s] group
Expected output:
foo\[bar
I've tried with:
(?![!#\s])[^/\s]+\[
But it returns:
foo\[bar[bB]uz\[

Java: Use Lookbehind
(?<=!)(?:\\\[|[a-z])+
See the regex demo
Explanation
The lookbehind (?<=!) asserts that what precedes the current position is the character !
The non-capture group (?:\\\[|[a-z]) matches \[ OR | a letter between a and z
The + causes the group to be matched one or more times
Reference
Lookahead and Lookbehind Zero-Length Assertions
Mastering Lookahead and Lookbehind

You can use this regex:
!((?:[^[\\]*\\\[)*[^[]*)
Online Regex Demo
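A sketch of that pattern in Python, run against the input from the question. The [ inside the character classes is written as \[ here, which should be equivalent; the wanted text ends up in group 1:

```python
import re

# "!" followed by any number of "text + escaped bracket" chunks,
# then the remaining text up to the first unescaped "[".
PREFIX_TO_BRACKET = re.compile(r"!((?:[^\[\\]*\\\[)*[^\[]*)")

s = r"!foo\[bar[bB]uz\[xx/"
m = PREFIX_TO_BRACKET.search(s)
print(m.group(1))
```

Note that this version only handles a leading !, not the other characters in the [!#\s] group.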

Add a ? after [^/\s]+ to catch the shortest group possible
Add \w+ to the end to catch the first group of alphanumeric characters after \[
Result :
(?![!#\s])[^\/\s]+?\[\w+
Try it

You can try this pattern:
(?<=^[!#\s]{0,1000})(?:[^!#\s\\\[]|\\.)(?>[^\[\\]+|\\.)*(?=\[)
pattern details:
The beginning is a lookbehind and means: preceded by zero or more forbidden characters at the start of the string
(?:[^!#\s\\\[]|\\.) ensures that the first character is an allowed character or an escaped character.
(?>[^\[\\]+|\\.)* describes the content: all that is not a [ or a \, or an escaped character. (note that this subpattern can be written like that too: (?:[^\[\\]|\\.)*)
(?=\[) checks that the next character is a literal opening square bracket. (since all escaped characters are matched by the precedent group, you can be sure that this one is not escaped)
link to fiddle (push the Java button)

Use a negated character class for the start (i.e. the match must not start with a special char), then a reluctant quantifier (which stops at the first hit), with a negative lookbehind to skip over escaped brackets:
[^!#\s].*?(?<!\\)\[
See live demo

Eclipse regex find and replace

I want to replace the below statement
ImageIcon("images/calender.gif");
with
ImageIcon(res.getResource("images/calender.gif"));
Can anyone suggest a regex to do this in Eclipse? Any filename can appear in place of "calender.gif".

You can find this pattern (in regex mode):
ImageIcon\(("[^"]+")\)
and replace with:
ImageIcon(res.getResource($1))
The \( and \) in the pattern escape the parentheses, since they are to be matched literally. The unescaped parentheses (…) set up capturing group 1, which matches the double-quoted string literal; the literal must not contain escaped double quotes (which I believe are illegal in filenames anyway).
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
The + is one-or-more repetition, so [^"]+ matches non-empty sequence of everything except double quotes. We simply surround this pattern with " to match the double-quoted string literal.
So the pattern breaks down like this:
  literal(   literal)
         |          |
ImageIcon\(("[^"]+")\)
           \_______/
            group 1
In replacement strings, $1 substitutes what group 1 matched.
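The same find-and-replace can be sketched outside Eclipse. In Python, for example, the group is referenced as \1 in the replacement rather than $1; the function name wrap_resource is made up for this illustration:

```python
import re

# Group 1 is the double-quoted string literal, quotes included.
PATTERN = re.compile(r'ImageIcon\(("[^"]+")\)')

def wrap_resource(source):
    """Wrap the string argument of ImageIcon(...) in res.getResource(...)."""
    # \1 inserts whatever group 1 matched.
    return PATTERN.sub(r"ImageIcon(res.getResource(\1))", source)

before = 'new ImageIcon("images/calender.gif");'
after = wrap_resource(before)
print(after)
```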
References
regular-expressions.info
Character Class, Repetition, Brackets
Examples/Programming Constructs - Strings - has patterns for strings that may contain escaped doublequotes

Ctrl-F
Find: ImageIcon\("([^\"]*)"\);
Replace with: ImageIcon(res.getResource("$1"));
Check Regular Expressions checkbox.
