Select all single quotes in regex field - regex

I have this field in my JSON data:
"pinyin": "bei1 'ai1",
I just want to select any single quote ' like the one before ai1;
I tried this
(?<="pinyin": "\w*)\'+(?!")
but it didn't work

You can use
(?<="pinyin": "[\w\s]*)'(?!")
See this regex demo. Details:
(?<="pinyin": "[\w\s]*) - a positive lookbehind that matches a location that is immediately preceded with "pinyin": " and then any zero or more word or whitespace chars
' - a single quotation mark
(?!") - a negative lookahead that fails the match if there is a " char immediately to the right of the current location.
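The lookbehind above has unbounded length, which Python's built-in re module does not accept (it requires fixed-width lookbehinds). As a rough sketch for that environment, the same goal can be approached in two steps instead: capture the pinyin value, then locate the quotes inside it. The function name pinyin_quote_positions is hypothetical, and unlike the pattern above this version reports every single quote in the value and does not apply the (?!") check.

```python
import re

def pinyin_quote_positions(text):
    """Return absolute offsets of single quotes inside every "pinyin" value."""
    positions = []
    # Step 1: capture each pinyin value made of word chars, whitespace and quotes.
    for value in re.finditer(r'"pinyin": "([\w\s\']*)"', text):
        # Step 2: find the single quotes inside the captured value.
        for quote in re.finditer(r"'", value.group(1)):
            positions.append(value.start(1) + quote.start())
    return positions

sample = '"pinyin": "bei1 \'ai1",'
print(pinyin_quote_positions(sample))  # [16]
```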

Related

Regexp to cut all text inside the external quotation signs

Please help me with adjusting regexp. I need to cut all text inside the external quotation signs.
I have text:
some text "have "some text" here "that should" be cut"
My regexp:
some text "(?<name>[^"]*)"
Need to get
have "some text" here "that should" be cut
But I've got
have
If you want to support the first level of nested double quotes you can use
some text "(?<name>[^"]*(?:"[^"]*"[^"]*)*)"
See the regex demo.
Details:
[^"]* - zero or more chars other than double quotes
(?:"[^"]*"[^"]*)* - zero or more repetitions of
"[^"]*" - a substring between double quotes that contains no other double quotes
[^"]* - zero or more chars other than double quotes.
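For a quick check of the one-level pattern outside the demo, here is a minimal Python sketch. The only change is the named-group spelling, since Python's re module writes it as (?P<name>...); the variable names are illustrative.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(r'some text "(?P<name>[^"]*(?:"[^"]*"[^"]*)*)"')

text = 'some text "have "some text" here "that should" be cut"'
match = PATTERN.search(text)
inner = match.group("name") if match else None
print(inner)  # have "some text" here "that should" be cut
```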
If your regex flavor supports recursion:
some text ("(?<name>(?:[^"]++|\g<1>)*)")
See this regex demo. Here, ("(?<name>(?:[^"]++|\g<1>)*)") is a capturing group #1 that matches
" - a " char
(?<name>(?:[^"]++|\g<1>)*) - Group "name": zero or more sequences of
[^"]++ - one or more chars other than "
| - or
\g<1> - Group 1 pattern recursed
" - a " char
Assuming you want to remove all text up to the first quotes then retain everything till the last quote, you can try this.
Demo
[[:alpha:]][^"]*\"(?<name>.*)"
You can solve this problem with nested regexp operators:
SELECT regexp_replace(Regexp_substr(regexp_replace(word,'(^")|("$)'),'["].+'),'(^")') as Result
from(
SELECT '"some text "have "some text" here "that should" be cut"' as word from dual)

How to find all strings in ' but exclude lines which contain System.debug

I am trying to get all data inside ' using this
['].*?[']
and additionally I want to exclude lines which contain System.debug.
I tried
^((?!System.debug).)*$['].*?[']
but this is not working. What am I doing wrong?
You can use
(?<!System\.debug.*)'(?!.*System\.debug)[^']*'
See the regex demo.
Details:
(?<!System\.debug.*) - a negative lookbehind that fails the match if there is System.debug and any zero or more chars other than line break chars as many as possible immediately to the left of the current location
' - a ' char
(?!.*System\.debug) - a negative lookahead that fails the match if there is any zero or more chars other than line break chars as many as possible and then System.debug immediately to the right of the current location
[^']* - zero or more chars other than '
' - a ' char.
NOTE: In VSCode regex, [^'] and other negated character classes only match line breaks if there is \r or \n somewhere in the regex pattern, so it is not necessary to use [^'\n], but it is used in the regex101 demo.
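Outside an editor, the same filtering can be sketched without the unbounded lookbehind, which Python's built-in re module does not support. This swaps in a simpler technique: skip every line that contains System.debug, then collect the '...' substrings from the remaining lines. The function name and the sample input are assumptions made for illustration.

```python
import re

def quoted_strings_outside_debug(source):
    """Collect '...' substrings from lines that do not mention System.debug."""
    results = []
    for line in source.splitlines():
        if "System.debug" in line:
            continue  # the whole line is excluded, as in the regex above
        results.extend(re.findall(r"'[^']*'", line))
    return results

code = "String a = 'one';\nSystem.debug('two');\nfoo('three', 'four');"
print(quoted_strings_outside_debug(code))  # ["'one'", "'three'", "'four'"]
```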

Notepad++: replace occurrences of characters before other character

I have a file with text like this:
"Title" = "Body"
And I would like to remove both " before the =, to leave it like this:
Title = "Body"
So far I managed to select the first block of text with:
.+(=)
That selects everything up to the =, but I can't find how to replace (or delete) both " .
Any suggestions?
You could use a capture group in the replacement, and match the double quotes to be removed while asserting an equals sign at the right.
Find what:
"([^"]+)"(?=\h*=)
" Match literally
([^"]+) Capture group 1, match 1+ times any char other than "
" Match literally
(?=\h*=) Positive lookahead, assert an = sign at the right
Regex demo
Replace with:
$1
To match the whole pattern from the start till the end of the string, you might also use 2 capture groups and use those in the replacement.
^"([^"]+)"(\h*=\h*"[^"]+")$
Regex demo
In the replacement use $1$2
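Both replacements can be tried in a script as well. The following is a small Python sketch; since Python's re module has no \h shorthand, [ \t] stands in for horizontal whitespace, and the replacement is written \1 (or \1\2) instead of $1 (or $1$2).

```python
import re

line = '"Title" = "Body"'

# Variant 1: strip the quotes from a quoted string that is followed by "=".
unquoted = re.sub(r'"([^"]+)"(?=[ \t]*=)', r"\1", line)

# Variant 2: match the whole line with two capture groups and rebuild it.
whole_line = re.sub(
    r'^"([^"]+)"([ \t]*=[ \t]*"[^"]+")$', r"\1\2", line, flags=re.MULTILINE
)

print(unquoted)    # Title = "Body"
print(whole_line)  # Title = "Body"
```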
You can use
(?:\G(?!^)|^(?=.*=))[^"=\v]*\K"
Replace with an empty string.
Details:
(?:\G(?!^)|^(?=.*=)) - end of the previous successful match (\G(?!^)) or (|) start of a line that contains = somewhere on it (^(?=.*=))
[^"=\v]* - any zero or more chars other than ", = and vertical whitespace
\K - omit the text matched
" - a " char (matched, consumed and removed)

Regex extraction of substrings ignoring internal character used to match

I'm matching the strings of a key-value pair between " characters with "(.*?)". How can I ignore any extra " characters within the value part?
Example string: {"1"=>"email#example.com"}
You may use
String pat = "(?<=\\{|=>)\"(.*?)\"(?=\\}|=>)";
See the regex demo
Details
(?<=\{|=>) - a positive lookbehind that matches a location immediately preceded with { or =>
" - a double quotation mark
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
" - a double quotation mark
(?=\}|=>) - a positive lookahead that matches a location immediately followed with } or =>.
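For comparison, here is a sketch of the same idea in Python. Python's re module only accepts fixed-width lookbehinds, so the single lookbehind with two alternatives of different lengths is split into two separate lookbehinds; the function name and the second sample string are assumptions for illustration.

```python
import re

# (?<=\{|=>) is rejected by Python's re because the alternatives differ in
# length, so each alternative gets its own fixed-width lookbehind.
PAIR_ITEM = re.compile(r'(?:(?<=\{)|(?<==>))"(.*?)"(?=\}|=>)')

def extract_items(text):
    """Return the key and value strings, keeping inner quotes in the value."""
    return PAIR_ITEM.findall(text)

print(extract_items('{"1"=>"email#example.com"}'))  # ['1', 'email#example.com']
print(extract_items('{"k"=>"va"lue"}'))             # ['k', 'va"lue']
```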

How can I check it with regular Expression?

I have a long input string that contains certain field names embedded in it. For instance:
SELECT some-name, some-name FROM [some-table] WHERE [some-column] = 'some-value'
The actual field name may change, but it is always in the form of word-word. I need to perform a regex replace on the string so that the output will look like this:
SELECT some - name, some - name FROM [some-table] WHERE [some-column] = 'some - value'
In other words, when the field name is enclosed in square brackets, it should be left untouched, but when it is not, spaces should be inserted on either side of the dash. There are no nested square brackets, and such a field name may occur one or more times in the string.
You can do this:
Regex.Replace(input, "(?<!\[[^-\]]*)(\w+)-(\w+)(?![^-\]]*\])", "$1 - $2")
Here's an explanation of the pattern:
(?<!\[[^-\]]*) - This is a negative look-behind. It asserts that matches cannot be immediately preceded by text that matches the sub-pattern \[[^-\]]*. In other words, the matches we are looking for cannot be preceded by a [ character followed by any number of characters that are not a - or a ].
(\w+)-(\w+) - Matches one or more word-characters, then a dash, and then one or more word characters following the dash. By enclosing the sub-patterns on either side of the dash in capturing groups, we can then refer to their values as $1 and $2 in the replacement pattern.
(?![^-\]]*\]) - This is a negative look-ahead. Similar to the negative look-behind, it asserts that matches cannot be immediately followed by text which matches the sub pattern [^-\]]*\]. In other words, a match cannot be followed by any number of characters that are not a - or a ] and then a closing ].
See a demo.
At first glance, you might assume that you could simply assert that it must not be immediately preceded by a [ character and that it must not be immediately followed by a ] character. In other words, (?<!\[)(\w+)-(\w+)(?!\]). However, that pattern would still match the text ome-nam in the input [some-name] because the text ome-nam is not immediately preceded or followed by the brackets.
Dim regex As Regex = New Regex("\[[^-]*-[^-]*\]")
Dim match As Match = regex.Match("A long string containing square brackets [some-name]")
If match.Success Then
    Console.WriteLine(match.Value)
End If
Or you could use Regex.IsMatch:
Return Regex.IsMatch("A long string containing square brackets [some-name]",
"\[[^-]*-[^-]*\]")
You may match and capture the [...] substrings and then only match the hyphens outside them (together with any surrounding whitespace) to replace them:
Dim nStr As String = "SELECT 'some-name' FROM [some-name]"
Dim nResult = Regex.Replace(nStr, "(\[.+?])|\s*-\s*", New MatchEvaluator(Function(m As Match)
        If m.Groups(1).Success Then
            Return m.Groups(1).Value
        Else
            Return " - "
        End If
    End Function))
So, what is happening is:
(\[[^]]+]) - matches and stores the value of [...] substring inside the Group(1) buffer (or \[.+?] can be used here to match a [, then 1 or more any characters and then ] - with RegexOptions.Singleline flag so that . could match a newline, too)
(?<!\s)-(?!\s) - matches any hyphen not preceded ((?<!\s)) or followed ((?!\s)) with whitespace (\s). Actually, we may even use \s*-\s* (where \s* stands for zero or more whitespaces as many as possible since * is a greedy quantifier matching zero or more occurrences of the quantified subpattern) here to remove any whitespace there is to make sure we just insert 1 space before and after -.
If Group 1 matches, then we just re-insert it (Return m.Groups(1).Value), else we insert the space-enclosed hyphen Return " - ".
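The same capture-and-skip idea carries over to other languages. Below is a minimal Python sketch, assuming the (\[[^\]]+\]) form for the bracketed part; the function names are hypothetical.

```python
import re

def space_out_hyphens(sql):
    """Put ' - ' around hyphens, leaving [bracketed] names untouched."""
    def keep_or_replace(m):
        # Group 1 participates only when a [...] block was matched.
        return m.group(1) if m.group(1) is not None else " - "
    return re.sub(r"(\[[^\]]+\])|\s*-\s*", keep_or_replace, sql)

query = "SELECT some-name, some-name FROM [some-table] WHERE [some-column] = 'some-value'"
print(space_out_hyphens(query))
# SELECT some - name, some - name FROM [some-table] WHERE [some-column] = 'some - value'
```

Because the bracketed alternative comes first, a hyphen inside [...] is consumed as part of Group 1 before the hyphen alternative ever gets a chance to match it.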
Just to check if it exists, you could try
\[[^\]]+-[^\]]+\]
It matches a literal [ and then any characters, except ], up to (including) a hyphen. Then again any characters, except ], up to a literal ].
See it here at regex101.
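As a quick illustration of the existence check, here is a short Python sketch using the same pattern; the helper name has_bracketed_hyphen is made up for this example.

```python
import re

BRACKETED_HYPHEN = re.compile(r"\[[^\]]+-[^\]]+\]")

def has_bracketed_hyphen(text):
    """True if some [...] block contains a hyphen with text on both sides."""
    return BRACKETED_HYPHEN.search(text) is not None

print(has_bracketed_hyphen("A long string containing square brackets [some-name]"))  # True
print(has_bracketed_hyphen("no [brackets] here-there"))  # False
```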
Actually I don't know the vb.net syntax but you can use regex as
/[\s\'](\w+)\-(\w+)/g
Find the (\w+)-(\w+) which is preceded by a space or ' and replace it with capture group 1, " - " and capture group 2.
See the sample here