Pattern for apostrophe inside quotes - regex

I am looking for a pattern that can find apostrophes that are inside single quotes. For example the text
Foo 'can't' bar 'don't'
I want to find and replace the apostrophes in can't and don't, but not the surrounding single quotes.
I have tried something like
(.*)'(.*)'(.*)'
and applying the replacement to the second capture group, but this pattern fails when the text has two words with apostrophes.
Edit: to clarify, the text could have single quotes with no apostrophes inside them, which should be preserved as is. For example
'foo' 'can't' bar 'don't'
I am still looking only for apostrophes, so the single quotes around foo should not match.

I believe you need to require "word" characters to appear before and after a ' symbol, and it can be done with a word boundary:
\b'\b
See the regex demo
To match only an apostrophe that sits between letters, use
(?<=\p{L})'(?=\p{L})
(?<=[[:alpha:]])'(?=[[:alpha:]])
(?U)(?<=\p{Alpha})'(?=\p{Alpha}) # Java, double the backslashes in the string literal
Or ASCII only
(?<=[a-zA-Z])'(?=[a-zA-Z])
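
As a quick check of how these variants differ, here is a minimal Python sketch. Python's built-in `re` module has no `\p{L}` or POSIX bracket classes, so only the word-boundary and ASCII patterns are exercised. The helper name and the `’` replacement character are illustrative assumptions, not part of the question.

```python
import re

# Word-boundary variant: apostrophe with a word character on each side.
WORD_BOUNDARY = re.compile(r"\b'\b")
# ASCII-letter variant: apostrophe with an ASCII letter on each side.
ASCII_LETTERS = re.compile(r"(?<=[a-zA-Z])'(?=[a-zA-Z])")


def replace_apostrophes(pattern, text, replacement="\u2019"):
    """Replace every apostrophe matched by `pattern` (hypothetical helper)."""
    return pattern.sub(replacement, text)


sample = "'foo' 'can't' bar 'don't'"
print(replace_apostrophes(WORD_BOUNDARY, sample))
print(replace_apostrophes(ASCII_LETTERS, sample))

# Digits count as word characters, so only the \b variant touches 5'6.
print(replace_apostrophes(WORD_BOUNDARY, "5'6"))
print(replace_apostrophes(ASCII_LETTERS, "5'6"))
```

On the sample text both patterns leave the surrounding quotes alone. They differ only when the neighbouring characters are digits or underscores.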

You can use the following regular expression:
'[^']+'\s|'[^']+(')[^' ]+'
It returns 3 matches; whenever capture group 1 participates in a match, it holds the apostrophe inside the word:
'foo'
'can't'
'don't'
How it works:
'[^']+'\s
' match an apostrophe
[^']+ followed by at least one character that isn't an apostrophe
' followed by an apostrophe
\s followed by a whitespace character
| or
'[^']+(')[^' ]+'
' match an apostrophe
[^']+ followed by at least one character that isn't an apostrophe
(') followed by an apostrophe, and capture it in capture group 1
[^' ]+ followed by at least one character that is not an apostrophe or a space
' followed by an apostrophe
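
To show how group 1 can drive the replacement, here is a rough Python sketch. The helper name and the `’` replacement are assumptions for illustration. The callback leaves a match untouched when group 1 did not participate.

```python
import re

# First alternative: a quoted word with no inner apostrophe, then whitespace.
# Second alternative: a quoted word whose inner apostrophe lands in group 1.
PATTERN = re.compile(r"'[^']+'\s|'[^']+(')[^' ]+'")


def replace_inner_apostrophes(text, replacement="\u2019"):
    """Swap only the apostrophe captured by group 1 (hypothetical helper)."""
    def fix(match):
        whole = match.group(0)
        if match.group(1) is None:
            return whole  # plain quoted word: leave it as is
        # Position of the captured apostrophe relative to the match start.
        offset = match.start(1) - match.start(0)
        return whole[:offset] + replacement + whole[offset + 1:]
    return PATTERN.sub(fix, text)


sample = "'foo' 'can't' bar 'don't'"
for m in PATTERN.finditer(sample):
    # start(1) is -1 when group 1 did not take part in the match.
    print(repr(m.group(0)), m.start(1))
print(replace_inner_apostrophes(sample))
```

Note that the first alternative consumes the whitespace after the closing quote, so the first match is 'foo' plus a trailing space.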

Related

Regex for no single quote and newline character in between single quotes

So far I have '.[^ \n']*'(?!') with a negative lookahead after the last quote.
Unfortunately, this still allows ''' (three single quotes).
The regex should match these strings
'abc'
'abc##$%^xyz'
The regex shouldn't match these strings
'\n'
'abc#'#$%^xyz'
'''
'
My current regex relies on a negative lookahead for a single quote. I am trying to generalize it so that it doesn't match when the string has an odd number of single quotes.
If your patterns always occur alone on a line, you could use this:
^'[^\n']*'$
If you want to find matching pairs of single quotes in a bigger text, I think regex is not the solution for you.
You could use:
^'[^\n']*(?:'[^\n']*')*[^\n']*'$
Explanation
^ Start of string
' Match a single quote
[^\n']* Match 0+ chars other than a newline or a single quote
(?: Non capture group to repeat as a whole part
'[^\n']*' Match from ' to ' without matching newlines in between
)* Close the non capture group and optionally repeat it
[^\n']* Match 0+ chars other than a newline or a single quote
' Match a single quote
$ End of string
See a regex101 demo.
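
Both patterns can be checked against the strings from the question with a small Python sketch. It assumes that '\n' in the question means a real newline between the quotes, and it adds one extra case ('a'b'c') to show where the two patterns differ. `fullmatch` stands in for the ^ and $ anchors, because Python's $ also matches just before a trailing newline.

```python
import re

# Exactly one pair of quotes, no quote or newline in between.
SINGLE_PAIR = re.compile(r"'[^\n']*'")
# Outer pair that may contain further complete pairs of quotes.
NESTED_PAIRS = re.compile(r"'[^\n']*(?:'[^\n']*')*[^\n']*'")

candidates = [
    "'abc'",
    "'abc##$%^xyz'",
    "'\n'",             # assumed: a real newline between the quotes
    "'abc#'#$%^xyz'",
    "'''",
    "'",
    "'a'b'c'",          # extra case: four quotes, i.e. two pairs
]


def check(pattern, text):
    """True when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for text in candidates:
    print(repr(text), check(SINGLE_PAIR, text), check(NESTED_PAIRS, text))
```

Only the second pattern accepts the last candidate, since it allows any even number of quotes.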

Preserve space after delimiter regex

I have the following regex, which keeps "c" and the delimiter sign from being replaced:
(?<=c[:=\s]|:=).+
The problem is that when there are spaces after the delimiter, it replaces them as well:
c= test1
will be replaced with, for example:
c=test
How can I preserve the space after the delimiter so that it is not replaced:
c= test
I have tried the following:
(?<=c[:=\s]\s).+
But it doesn't match, and so doesn't correctly replace, strings that have no space after the delimiter:
c=test1
You could match c followed by : or = and zero or more whitespace characters \s*, capturing that whole prefix in group 1, and then match one or more characters with .+. Start with a word boundary \b to make sure c is not the end of a longer word.
As the replacement, use the first capturing group \1 followed by your replacement text.
Match
\b(c[:=]\s*).+
Replace with
\1test
Demo Python
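
A minimal Python sketch of that match and replace follows. The helper name and the extra sample lines are assumptions for illustration.

```python
import re

# Group 1 keeps "c", the delimiter and any whitespace after it;
# .+ is the old value that gets replaced.
PATTERN = re.compile(r"\b(c[:=]\s*).+")


def set_value(line):
    """Replace the value after c= or c: with "test" (hypothetical helper)."""
    return PATTERN.sub(r"\1test", line)


for line in ["c= test1", "c=test1", "c:  test1", "abc=test1"]:
    print(repr(line), "->", repr(set_value(line)))
```

The last line is left unchanged because \b does not match between the b and the c of abc.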

Regex to Match All Whitespace After Word

I have strings like this:
"2015/08/this filename has whitespace .jpg"
I need to match the whitespace characters in those strings. They all start with "2015/08/ and end with ".
I'm using Sublime Text 2 to search and replace in a SQL DB dump. I'm at a loss on how to do the match. I know I can match whitespace with \s, but I have no clue how to restrict the match to those strings.
As per my comment, this expression should work for a string that has the same number of opening/closing double quotes:
\s+(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)
See demo here. The look-ahead is checking for an odd number of double quotes until the end of file.
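
The lookahead pattern also runs unchanged in Python's `re` module, so here is a small sketch of the replacement. The SQL row, the table name `files` and the helper name are made up for illustration.

```python
import re

# Whitespace run with an odd number of double quotes after it,
# i.e. whitespace that sits inside a quoted string.
INSIDE_QUOTES = re.compile(r'\s+(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)')


def encode_quoted_whitespace(text, replacement="%20"):
    """Replace whitespace inside double quotes (hypothetical helper)."""
    return INSIDE_QUOTES.sub(replacement, text)


# Hypothetical row from a dump; only the quoted path contains whitespace to fix.
row = 'INSERT INTO files VALUES ("2015/08/this filename has whitespace .jpg", 7);'
print(encode_quoted_whitespace(row))
```

Keep in mind that this matches whitespace inside any double-quoted string, not only those starting with "2015/08/, and that it relies on the quotes being balanced through the end of the input.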
Another approach is to define the boundary with \G and trim the beginning of the match with \K:
(?:"\d{4}\/\d{2}\/|(?!^)\G)[^"\s]*\K\s(?=[^"]*")
See demo
The regex finds a match:
(?:"\d{4}\/\d{2}\/|(?!^)\G) - when a substring starts with numbers like 2015/12/ or after a successful match
[^"\s]*\K - matches all characters that are not whitespace or " and omits them due to \K operator
\s - here it matches a whitespace symbol
(?=[^"]*") - a look-ahead checking we are presumably inside double quotes.
Replacing the spaces with, say, %20 results in:
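
Python's built-in `re` module supports neither \G nor \K, so the pattern above cannot be run there directly. A rough stand-in, which is a different technique from the one in this answer, matches each whole quoted path and replaces the whitespace inside it with a callback. The sample row and helper name are assumptions, and escaped double quotes inside the strings are assumed not to occur.

```python
import re

# A whole quoted string beginning with a yyyy/mm/ prefix.
QUOTED_PATH = re.compile(r'"\d{4}/\d{2}/[^"]*"')


def encode_path_whitespace(text, replacement="%20"):
    """Replace each whitespace character inside matching quoted paths."""
    return QUOTED_PATH.sub(
        lambda m: re.sub(r"\s", replacement, m.group(0)),
        text,
    )


# Hypothetical input: the second quoted string has no date prefix.
row = '("2015/08/this filename has whitespace .jpg", "other text")'
print(encode_path_whitespace(row))
```

Unlike the quote-counting lookahead, this only touches strings that start with the date prefix.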

Regex expression for all white space except when it is in quotes

I'm looking for the regex that will match all white space in a string except when it is between quotes.
For example, if I have the following string:
abc def " gh i " jkl " m n o p " qrst
- -- -- - -- - --
I want to match the spaces that have a dash under them. The dashes are not part of the string, only for illustration purposes.
Can this be done?
You could try the below positive lookahead based regex.
\s(?=(?:"[^"]*"|[^"])*$)
or
(?=(?:"[^"]*"|[^"])*$)
Explanation:
\s Matches a whitespace character
(?=(?:"[^"]*"|[^"])*$) only if it's followed by,
"[^"]*" an opening double quote, then [^"]* (zero or more characters other than a double quote), then a closing double quote. So it matches a whole double-quoted block such as "foo" or "ljilcjljfcl"
| OR: if what follows is not a double-quoted block, the engine tries the alternative after the |, i.e. [^"].
[^"] Matches any single character except a double quote.
Take foo "foo bar" buz as an example string.
foo "foo bar" buz
\s first matches a space as a candidate. The lookahead then checks that everything after that space, up to the end of the string, is a sequence of double-quoted blocks or non-quote characters. Take the first space: it is followed by the double-quoted string "foo bar", which "[^"]*" matches. The next character is a space, so "[^"]*" fails there and the engine falls back to the alternative [^"], which matches that space. Because * repeats the whole group, [^"] goes on to match the remaining characters up to the end of the string. The condition holds for the first space, so it is matched. By contrast, the space inside "foo bar" is followed by bar" buz, where the lone " can be matched neither as a quoted block nor by [^"], so the lookahead fails and that space is not matched.
[ ](?=(?:[^"]*"[^"]*")*[^"]*$)
Try this. See demo.
https://regex101.com/r/pM9yO9/7
This basically says: match any space that is followed only by complete pairs of double quotes, never by a lone unpaired ". This is enforced through the lookahead.
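
Both lookahead patterns work as written in Python's `re` module. The sketch below runs them on a shortened version of the string from the question and assumes the double quotes in the input are balanced.

```python
import re

# Both lookaheads assume the double quotes in the input are balanced.
OUTSIDE_A = re.compile(r'\s(?=(?:"[^"]*"|[^"])*$)')
OUTSIDE_B = re.compile(r'[ ](?=(?:[^"]*"[^"]*")*[^"]*$)')

sample = 'abc def "gh i" jkl "m n o p" qrst'

# Mark every matched space so the effect is visible.
print(OUTSIDE_A.sub("_", sample))
print(OUTSIDE_B.sub("_", sample))

# Splitting on the same pattern keeps the quoted chunks together.
print(OUTSIDE_A.split(sample))
```

Because the lookahead rescans to the end of the string for every candidate space, this gets slow on very long inputs.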
If your regex flavor is PCRE, you could use (*SKIP)(*F) to skip the quoted parts and match one or more \s for replacement:
"[^"]*"(*SKIP)(*F)|\s+
Test at regex101.com
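
The (*SKIP)(*F) verbs are PCRE-specific and are not available in Python's `re` module. A commonly used substitute, sketched here as an assumption rather than as this answer's method, is an alternation where quoted strings match first and are put back unchanged, while whitespace runs land in a capture group.

```python
import re

# Quoted strings are consumed by the first alternative and returned as is;
# whitespace runs outside quotes end up in group 1 and are replaced.
QUOTED_OR_SPACE = re.compile(r'"[^"]*"|(\s+)')


def replace_outside_quotes(text, replacement="_"):
    """Replace whitespace runs that are not inside double quotes."""
    def swap(match):
        return replacement if match.group(1) is not None else match.group(0)
    return QUOTED_OR_SPACE.sub(swap, text)


print(replace_outside_quotes('abc def "gh i" jkl  "m n o p" qrst'))
```

As with the PCRE version, \s+ collapses a run of several whitespace characters into a single replacement.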

Eclipse regex find and replace

I want to replace the below statement
ImageIcon("images/calender.gif");
with
ImageIcon(res.getResource("images/calender.gif"));
Can anyone suggest a regex to do this in Eclipse? Any filename can appear in place of "calender.gif".
You can find this pattern (in regex mode):
ImageIcon\(("[^"]+")\)
and replace with:
ImageIcon(res.getResource($1))
The \( and \) in the pattern escape the parentheses, since they are meant to match literally. The unescaped parentheses (…) set up capturing group 1, which matches the double-quoted string literal; the literal must not contain escaped double quotes (which I believe are illegal in filenames anyway).
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
The + is one-or-more repetition, so [^"]+ matches a non-empty sequence of characters other than double quotes. We simply surround this pattern with " to match the double-quoted string literal.
So the pattern breaks down like this:
   literal(   literal)
          |          |
ImageIcon\(("[^"]+")\)
           \_______/
            group 1
In replacement strings, $1 substitutes what group 1 matched.
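
The same pattern can be tried outside Eclipse. Here is a small Python sketch, where the group reference is written \1 instead of $1. The helper name and the sample statements are assumptions for illustration.

```python
import re

# Same pattern as above: group 1 is the double-quoted string literal.
PATTERN = re.compile(r'ImageIcon\(("[^"]+")\)')
# Python spells the group reference \1 where the Eclipse answer uses $1.
REPLACEMENT = r"ImageIcon(res.getResource(\1))"


def wrap_resources(code):
    """Wrap string arguments of ImageIcon(...) (hypothetical helper)."""
    return PATTERN.sub(REPLACEMENT, code)


print(wrap_resources('new ImageIcon("images/calender.gif");'))
# A non-literal argument does not match and is left alone.
print(wrap_resources('new ImageIcon(path);'))
```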
References
regular-expressions.info
Character Class, Repetition, Brackets
Examples/Programming Constructs - Strings - has patterns for strings that may contain escaped doublequotes
Ctrl-F
Find: ImageIcon\("([^\"]*)"\);
Replace with: ImageIcon(res.getResource("\1"));
Check Regular Expressions checkbox.