search/replace from script (execute search command from string variable) - regex

I am scripting a search/replace function in Vim and have a problem. It should work in visual mode on the selected lines.
This works:
let b:CON="'<,'>s:[^\t ]:a:e"
vnoremap r :<C-u>execute b:CON
But this does not (it should add 'a' after first letter in line):
let b:CON="'<,'>s:\([^\t ]\):\1a:e"
vnoremap r :<C-u>execute b:CON
So here I just added a group to the regex, but now it does nothing. What is wrong here? The same command works fine if I type it in or call it directly via a map. Does execute unescape some characters in my string? I think it should not.
Extra question: is there some other good way to "execute" a command on multiple lines, other than what I use here (C-u)?

The first only works by accident: b:CON contains an actual tab character instead of \t, because Vim interprets escape sequences in double-quoted strings. In the second, \( is turned into a plain ( and \1 turns into ^A (<c-a>).
You just need to double the backslashes.
let b:CON="'<,'>s:[^\\t ]:a:e"
let b:CON="'<,'>s:\\([^\\t ]\\):\\1a:e"
Or use single-quoted strings and escape the single quotes (two single quotes stand for one escaped single quote).
let b:CON='''<,''>s:[^\t ]:a:e'
let b:CON='''<,''>s:\([^\t ]\):\1a:e'
Another way to rewrite the second one is to use &, since that represents the whole match:
let b:CON="'<,'>s:[^\\t ]:&a:e"
Or
let b:CON='''<,''>s:[^\t ]:&a:e'
Take a look at :help literal-string for singled quoted strings and :help expr-string for double quoted strings.
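The same trap exists outside Vim. As a loose analogy only (this is Python, not Vim script, and the sample line and variable names are made up for illustration), an ordinary string literal turns \1 into the control character ^A before the regex engine ever sees it, while a raw string passes the backslash through:

```python
import re

line = "  hello"  # hypothetical sample line with leading blanks

# Ordinary literal: "\1" is an octal escape, i.e. chr(1) (^A), not a backreference.
cooked = re.sub("([^\t ])", "\1a", line, count=1)

# Raw literal: the backslash survives, so \1 refers to the captured group.
raw = re.sub(r"([^\t ])", r"\1a", line, count=1)

print(repr(cooked))
print(repr(raw))
```

Python's raw strings play roughly the role that single-quoted strings play in Vim here.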

Related

Replace repeated special characters with a single special character

I am attempting to use REGEXREPLACE in Google Sheets to remove the repeating special character \n.
I can't get it to replace all repeating instances of the characters with a single instance.
Here is my code:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope","\\n+","\\n")
I want the results to be:
Hi Gene\nHope
But it always maintains the new lines.
Hi Gene\n\n\n\n\nHope
It has to be an issue with replacing the special characters because this:
REGEXREPLACE("Hi Gennnne\nHope","n+","n")
Produces:
Hi Gene\nHope
How do I remove repeating instances of special characters with a single instance of the special character in Google Sheets?
Edit
Just found an easier way:
=REGEXREPLACE("Hi Gene\n\n\n\n\nHope","(\\n)+","\\n")
Original solution
Try this formula:
=REGEXREPLACE(A1,REPT(F2,(len(A1)-len(REGEXREPLACE(A1,"\\n","")))/2),"\\n")
Put your text in A1.
How it works
It's a workaround. The final formula we want looks like this:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope","\\n+\\n+\\n+\\n+\\n+","\\n")
The first step is to find out how many times to repeat \\n+:
=(len(F1)-len(REGEXREPLACE(F1,F2,F3)))/2
Then just combine the pieces into one regex.
https://support.google.com/docs/answer/3098245?hl=en
REGEXREPLACE(text, regular_expression, replacement)
The problem seems to be how it interprets the "text". If I put this in a cell REGEXREPLACE("Hi Gene\n\n\n\n\nHope","","")
the output is Hi Gene\n\n\n\n\nHope as well.
If I place the text in a cell by itself with proper newlines and have this REGEXREPLACE(A1, "(\n)\n*", "$1") it works.
Note I could not just do s/\n+/\n/ as it still does not interpret the newline notation as anything special. It would just output \n instead of a newline.
I believe that you don't need to double escape the newlines, e.g. just search for \n:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope", "\n+", "\n")
When you replace \\n you are searching for the literal text \n, rather than newline.
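To make the difference between the literal text \n and a real newline concrete, here is a small sketch. It uses Python's re module as a stand-in, which is an assumption: Sheets has its own regex engine, but these particular patterns should behave the same way. The variable names are made up.

```python
import re

# Text containing the two characters backslash + n, as typed into the formula.
literal = r"Hi Gene\n\n\n\n\nHope"
# Text containing real line-feed characters.
real = "Hi Gene\n\n\n\n\nHope"

# \\n+ means "a backslash, then one or more n". Each \n pair matches on its own
# and is replaced by itself, so the text does not change.
unchanged = re.sub(r"\\n+", lambda m: "\\n", literal)

# Grouping the pair makes + apply to the whole two-character sequence.
collapsed_literal = re.sub(r"(\\n)+", lambda m: "\\n", literal)

# For real newlines a single backslash in the pattern is enough.
collapsed_real = re.sub(r"\n+", "\n", real)

print(unchanged)
print(collapsed_literal)
print(repr(collapsed_real))
```

This also shows why the original "\\n+" attempt appeared to do nothing: the + was bound to the n alone, not to the backslash-n pair.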

Regex to remove commas between quotes with comma right before end quote Notepad++

In Notepad++, I am using Regex to replace commas between quotes in CSV file.
Using an example similar to one from another question, this is what I am trying to read:
1070,17,2,GN3-670,"COLLAR B, M STAY,","2,606.45"
except in my text there is an extra comma right before the closing quotes.
The regex ("[^",]+),([^"]+") does not seem to pick up the last comma and result is
1070,17,2,GN3-670,"COLLAR B M STAY,","2606.45"
I would like
1070,17,2,GN3-670,"COLLAR B M STAY","2606.45"
Is there a simple regex, or will I have to use a CSV reader in C#?
Edit: Some of the Regex is giving false matches so I would like to add another scenario. If I have
1070,17,2,GN3-670,"COLLAR B, M STAY,",55, FREE,"2,606.45"
I would like
1070,17,2,GN3-670,"COLLAR B M STAY",55, FREE,"2606.45"
I think this is what you're looking for:
,(?=[^"]*"(?:[^"]*"[^"]*")*[^"]*$)
This matches any comma that's followed by an odd number of quotes. It consumes only the comma, so you replace it with nothing.
The thing about your original solution is that it would only match one comma per quoted field. It never even tried to match the second comma in "COLLAR B, M STAY,", so its position didn't really matter. This solution removes any number of commas, regardless of their position within the field.
UPDATE: This regex assumes you're processing one line at a time. If you're using it on a whole document containing many lines, the regex is probably timing out. You can work around that by excluding line terminators (carriage returns and linefeeds), like this:
,(?=[^"\r\n]*"(?:[^"\r\n]*"[^"\r\n]*")*[^"\r\n]*$)
Note that the CSV spec (such as it is) says you can have line terminators in quoted fields, so this regex is technically incorrect. If you do need to support multiline fields, you might as well switch to the CSV library. Regexes are not quite capable of handling CSV fully, but in most cases they're good enough.
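If you want to check the lookahead outside Notepad++, here is a minimal sketch in Python, applied one line at a time as the answer assumes. The helper name strip_quoted_commas is made up for illustration; the sample rows are the ones from the question.

```python
import re

# Lookahead from the answer: a comma followed by an odd number of quotes
# before the end of the line, i.e. a comma inside a quoted field.
COMMA_IN_QUOTES = re.compile(r',(?=[^"]*"(?:[^"]*"[^"]*")*[^"]*$)')

def strip_quoted_commas(line: str) -> str:
    """Remove commas inside quoted fields of a single CSV line."""
    return COMMA_IN_QUOTES.sub("", line)

rows = [
    '1070,17,2,GN3-670,"COLLAR B, M STAY,","2,606.45"',
    '1070,17,2,GN3-670,"COLLAR B, M STAY,",55, FREE,"2,606.45"',
]
for row in rows:
    print(strip_quoted_commas(row))
```

The field separators survive because each of them is followed by an even number of quotes.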
You can use the following to match:
((["])(?:(?=(\\?))\3.)*?),\2
And replace with the following:
\1"
This should work
Find What ("[^"]*),"
Replace With \1"

PCRE regex replace a text pattern within double quotes

In Notepad++ 6.5.1 I need to replace certain patterns within quote pairs. I want to save the replace as part of a macro, so all replacements need to happen in one step.
For example, in the following string, replace all 'a' characters within quote pairs with a dash, while leaving characters outside the quote pairs untouched:
Input: aa"bbabaavv"kdjhas"bbabaavv"x
Desired result: aa"bb-b--vv"kdjhas"bb-b--vv"x
Note that the quotes are matched up pairwise, such that the 'a' in kdjhas is untouched.
So far I have tried searching for (?:"[^"a]*|\G)\Ka([^"a]*) and replacing with -$1, but that simply replaces all the a's, with the result --"bb-b--vv"kdjh-s"bb-b--vv"x. I'm attempting PCRE regex that will let me recursively replace the quote-delimited text.
Edit: Quote marks within a quoted string are escaped with an extra quote, e.g. "". However, assume I will have already replaced these in a previous pass with a special character. Therefore a regex solution to this problem will not have to deal with escaped quotes.
It is hard to tell if this is possible as you've only provided one line of input text.
But assuming that input follows this pattern:
BOL|any text|string with two groups of a's|any text|string with two groups of a's|any text|EOL
aa "bbabaavv" kdjhas "bbabaavv" x
I was able to create this regexp search string:
^(.+?\".+?)([a]+)(.+?)([a]+)(.*?\")(.+?\".+?)([a]+)(.+?)([a]+)(.*?\".*)$
With this replace string:
\1-\3-\5\6-\8-\A
and it turns your input string from this:
aa"bbabaavv"kdjhas"bbabaavv"x
into this:
aa"bb-b-vv"kdjhas"bb-b-vv"x
Naturally, the search and replace will fail if the input varies from the pattern described, since the search looks for exactly those four groups of a's inside the two quoted strings.
Also, I tested that regexp using Zeus, which can create a regexp with more than 9 groups.
As you can see, the regexp requires 10 groups.
I'm not familiar with Notepad++, so I don't know if it supports that many groups.
If your data have variable number of occurrences of quoted strings, then it is not possible to perform replacements only via regex at least in its form offered by Notepad++.
To replace using regex, you would need to perform regex find in existing regex match. As far as I know such a functionality is not available in Notepad++ regexes.
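Outside Notepad++, the "regex find inside an existing regex match" idea is easy to express with a replacement callback. A minimal sketch in Python, assuming (as the question states) that escaped quotes have already been dealt with; the function name and its parameters are made up for illustration:

```python
import re

def dash_inside_quotes(text: str, target: str = "a", repl: str = "-") -> str:
    """Replace target with repl, but only inside "..." pairs."""
    # Outer match: one quote-delimited run with no embedded quotes.
    # Inner step: plain string replacement inside that run only.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(target, repl), text)

print(dash_inside_quotes('aa"bbabaavv"kdjhas"bbabaavv"x'))
```

Because the outer pattern consumes quotes pairwise from left to right, the a in kdjhas sits between two matches and is left alone.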
Self-answer
I may have been reaching for the stars in trying to get Notepad++ to do this regex replace, but I think I found a workaround.
The actual task I was attempting involved creating a SQL Server VALUES list from an Excel spreadsheet, where I was copying and pasting selected cells into Notepad++. The delimiters are \t and \r\n. But, cells can have linefeeds too, which are delimited by ". So, I was going to replace these linefeeds with <br> (or something like it), so that
"line1
line2"
would become "line1<br>line2", before processing the actual end-of-row line feeds.
Having such parsing work reliably, especially when more than two lines were in a single cell, may have been too much to ask of Notepad++'s regex capability.
So I came up with a workaround that seems to be working :) Basically, it starts with selecting a blank "dummy" column to the right of my column selection (which I can insert if I'm partially selecting from the middle). This leaves a trailing \t at the end of each row, which effectively sets these EOLs apart from ones that might exist within a text cell, freeing me from having to parse line feeds from a "..." field.
So I compiled a macro from the following steps, which seems to be working well:
replace ' with ''
replace \t\r\n with '\)\r\n, \('
replace \t with ', '
replace "" with ''
replace " with <blank>
replace ^ with \(' (cleanup - first row only)
replace ^, \('$ with <blank> (cleanup - last row only)
Example transformation:
from
line1 line 2
"line3
line3b
line3c" line 4
to
('line1', 'line 2')
, ('line3
line3b
line3c', 'line 4')
which can now be easily modified into a SELECT statement:
SELECT *
FROM (VALUES('line1', 'line 2')
, ('line3
line3b
line3c', 'line 4')
) t(a,b)
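For comparison, here is a sketch of the same conversion done with a script instead of the Notepad++ macro. This is a different technique from the one above: it relies on Python's csv module with the excel-tab dialect to handle quoted multi-line cells, so no dummy column is needed. The function name to_values_list is made up, and unlike the macro it turns a doubled "" inside a cell back into a single ".

```python
import csv
import io

def to_values_list(clipboard_text: str) -> str:
    """Turn tab-separated spreadsheet text into the body of a SQL VALUES list."""
    # newline="" keeps line feeds inside quoted cells exactly as pasted.
    reader = csv.reader(io.StringIO(clipboard_text, newline=""), dialect="excel-tab")
    rows = []
    for row in reader:
        # Double any single quote so it is a valid SQL string literal.
        cells = ", ".join("'" + cell.replace("'", "''") + "'" for cell in row)
        rows.append("(" + cells + ")")
    return "\n, ".join(rows)

sample = 'line1\tline 2\r\n"line3\nline3b\nline3c"\tline 4\r\n'
print(to_values_list(sample))
```

The printed text can be pasted between VALUES and the closing ) t(a,b) of the SELECT statement.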

Search non-escaped single quote

I need to write a function to search for single quotes (') while skipping escaped quotes (\'). I know I can do a pattern search using a function like this:
let contains string pattern =
  begin
    let re = Str.regexp_string pattern
    in
    try ignore (Str.search_forward re string 0); true
    with Not_found -> false
  end
But how do I only search for non-escaped quote?
I'd say a non-escaped quote is one that's at the beginning of the input or is not preceded by a backslash. Unfortunately, special characters in OCaml regular expressions are marked by backslashes, and backslashes need to be doubled in an OCaml string. So you get something like the following:
let neq = "\\(^\\|[^\\]\\)'"
It just says "(the beginning of the input or a non-backslash) followed by a quote".
Don't use Str.regexp_string. Its purpose is to produce a regular expression that matches a given string exactly. You want to use a "real" regular expression. So use Str.regexp.
As a side comment, if you really just want to find unescaped quote characters (rather than learning about regular expressions), it would be much easier just to look for quote characters and then test the previous character to see if it's a backslash.
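The "look for the quote, then test the previous character" approach could look roughly like this. The sketch is in Python rather than OCaml, and the function name and defaults are made up. Like the regex above, it does not treat a doubled backslash before the quote specially.

```python
def find_unescaped_quote(s: str, quote: str = "'", escape: str = "\\") -> int:
    """Return the index of the first quote not preceded by escape, or -1."""
    pos = s.find(quote)
    while pos != -1:
        # A quote at the very start, or one not preceded by the escape
        # character, counts as unescaped.
        if pos == 0 or s[pos - 1] != escape:
            return pos
        pos = s.find(quote, pos + 1)
    return -1

print(find_unescaped_quote("a\\'sdfde"))
print(find_unescaped_quote("a'sdfde"))
```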
The String.Escaping module in Core (make sure to install Core, and do an open Core.Std first) lets you do just what you want here.
utop[9]> String.Escaping.index ~escape_char:'\\' "a\\'sdfde" '\'';;
- : int option = None
utop[10]> String.Escaping.index ~escape_char:'\\' "a'sdfde" '\'';;
- : int option = Some 1
utop[11]> String.Escaping.index ~escape_char:'\\' "asdfde" '\'';;
- : int option = None

auto indent in vim string replacement new line?

I'm using the following command to auto replace some code (adding a new code segment after an existing segment)
%s/my_pattern/\0, \r some_other_text_i_want_to_insert/
The problem is that with the \r, some_other_text_i_want_to_insert gets inserted right after the new line:
mycode(
    some_random_text my_pattern
)
would become
mycode(
    some_random_text my_pattern
some_other_text_i_want_to_insert <--- this line is NOT indented
)
instead of
mycode(
    some_random_text my_pattern
    some_other_text_i_want_to_insert <--- this line is now indented
)
i.e. the new inserted line is not indented.
Is there any option in vim or trick that I can use to indent the newly inserted line?
Thanks.
Try this:
:let @x="some_other_text_i_want_to_insert\n"
:g/my_pattern/normal "x]p
Here it is, step by step:
First, place the text you want to insert in a register...
:let @x="some_other_text_i_want_to_insert\n"
(Note the newline at the end of the string -- it's important.)
Next, use the :global command to put the text after each matching line...
:g/my_pattern/normal "x]p
The ]p normal-mode command works just like the regular p command (that is, it puts the contents of a register after the current line), but also adjusts the indentation to match.
More info:
:help ]p
:help :global
:help :normal
%s/my_pattern/\=submatch(0).", \n".matchstr(getline('.'), '^\s*').'some_other_text'/g
Note that you will have to use submatch and concatenation instead of & and \N. This answer is based on the fact that substitute command puts the cursor on the line where it does the substitution.
How about normal =``?
:%s/my_pattern/\0, \r some_other_text_i_want_to_insert/ | normal =``
<equal><backtick><backtick>: re-indent up to the position before the latest jump
(Sorry about the strange formatting, escaping backtick is really hard to use here)
To keep them as separate command you could do one of these mappings:
" Equalize and move cursor to end of change - more intuitive for me"
nnoremap =. :normal! =````<CR>
" Equalize and keeps cursor at beginning of change"
nnoremap =. :keepjumps normal! =``<CR>
I read the mapping as "equalize last change" since dot already means "repeat last change".
Or skip the mapping altogether since =`` is only 3 keys with 2 of them being repeats. Easy peasy, lemon squeezy!
References
:help =
:help mark-motions
Kind of a roundabout way of achieving the same thing: you could record a macro which finds the next occurrence of my_pattern and inserts a newline and your replacement string after it. If autoindent is turned on, the indent level will be maintained regardless of where the occurrence of my_pattern is found.
Something like this key sequence:
q 1 # begin recording
/my_pattern/e # find my_pattern, set cursor to end of match
a # append
\nsome_other_text... # the text to append
<esc> # exit insert mode
q # stop recording
Repeat it by pressing @1
You can do it in two steps. This is similar to Bill's answer but simpler and slightly more flexible, since you can use part of the original string in the replacement.
First substitute and then indent.
:%s/my_pattern/\0, \r some_other_text_i_want_to_insert/
:%g/some_other_text_i_want_to_insert/normal ==
If you use part of the original string with \0,\1, etc. just use the common part of the replacement string for the :global (second) command.
I achieved this by using \s* at the beginning of my pattern to capture the preceding whitespace.
I'm using the vim addon for VSCode, which doesn't seem to match standard vim completely, but for me,
:%s/(\s*)(existing line)/$1$2\n$1added line/g
turns this
mycode{
    existing line
}
into this
mycode{
    existing line
    added line
}
The parentheses in the search pattern define groups which are referenced by $1 and $2. In this case $1 is the white space captured by (\s*). I'm not an expert on different implementations of vim or regex, but as far as I can tell, this way of referencing regex groups is specific to VSCode (or at least not general). Using \s* to capture a group of whitespace should be general, though, or at least have a close analog in your environment.
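As an example of a close analog in another environment, here is the same capture-the-indent trick as a Python sketch. The function name add_line_after is made up, and [ \t]* is used instead of \s* so the group cannot swallow a preceding newline in multi-line mode.

```python
import re

def add_line_after(text: str, existing: str, added: str) -> str:
    """Insert added after each line equal to existing, reusing its indentation."""
    # Group 1 captures the leading blanks, group 2 the existing line itself.
    pattern = re.compile(r"^([ \t]*)(" + re.escape(existing) + r")$", re.MULTILINE)
    return pattern.sub(
        lambda m: m.group(1) + m.group(2) + "\n" + m.group(1) + added, text
    )

before = "mycode{\n    existing line\n}"
print(add_line_after(before, "existing line", "added line"))
```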