Consider the following line, for example:
'{"place":"buddy's home"}'
I want to escape only the single quote in buddy's. The single quotes at the start and end of the line must stay intact, so the resulting line would look like this:
'{"place":"buddy\'s home"}'
There could be multiple lines, each with multiple occurrences of such single quotes. I have to escape all of them except the ones at the start and end of each line.
I can find these quotes with the Vim search /.'. since that pattern requires a character on each side of the quote, so it never matches at the start or end of the line. But I can't work out how to replace y's with y\'s everywhere.
If the regex .'. is accurate enough then you can substitute all occurrences with:
:%s/.\zs'\ze./\\'/g
Instead of using \ze and \zs you could use groups (...) as well. However I find this version slightly more readable.
See :h /\zs and :h /\ze for further information.
:%s/\(.\)'\(.\)/\1\\'\2/gc
:%s/ substitute over the whole buffer (see :help range to explain the %)
\(.\) match a character and save it in capture group 1 (see :help \()
' a literal '
\(.\) match a character and save it in capture group 2
/ replace by
\1 capture group 1 (see :help \1)
\\' this is a \' (you need to escape the backslash)
\2 capture group 2
/gc replace every match in the line (g) and ask for confirmation (c) (see :help :s_flags)
You can omit the c flag if you are sure all replacements are legitimate.
As kongo2002 says in his answer you could replace the capture groups by \zs and \ze:
\zs will start a match and discard everything before
\ze will end a match and discard everything after
See :help \ze and :help \zs.
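To sanity-check the idea outside Vim, here is a minimal Python sketch of the same substitution. It is not Vim syntax: Python has no \zs or \ze, so this assumes lookbehind and lookahead as stand-ins, and the function name escape_inner_quotes is made up for illustration.

```python
import re

def escape_inner_quotes(line: str) -> str:
    # (?<=.) and (?=.) require a character on each side of the quote without
    # consuming it, which is the role \zs and \ze play in the Vim pattern.
    return re.sub(r"(?<=.)'(?=.)", r"\\'", line)

print(escape_inner_quotes("""'{"place":"buddy's home"}'"""))
```

Because neighbouring characters are not consumed, quotes that are only one character apart are each still matched, which is also an advantage of the \zs/\ze version over the capture-group version.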
Related
When saving a Markdown file, I’d like to remove single trailing spaces at the end of the line and trim two or more trailing spaces to two.
I’ve tried
:%s/\([^\s]\)\s$/\1/gc
but that still matches two trailing spaces? Trimming two, seems to work though:
:%s/\s\{2,}$/  /gc
What am I missing here? Thanks!
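Inside [], \s is not treated as a character class: the backslash and the s are taken literally. So you're effectively saying "any character BUT \ or s", which any whitespace character satisfies. With two trailing spaces, the first space matches [^\s] and the second matches \s$, which is why the pattern still matches. What you want is \S (any non-whitespace character).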
Also, you can make this simpler. Vim has special zero-width modifiers \zs and \ze to set the start or end point of a match, respectively. So, you could do the following:
:%s/\S\zs\s$//gc
Broken down:
%s/{pattern}//gc - replace every occurrence of {pattern} in the entire file with the empty string, with confirmation
\S - any non-whitespace character
\zs - start match here
\s - any whitespace character
$ - end of line
See the following :help topics:
:h :s
:h :s_flags
:h pattern-atoms
:h /[]
:h /\zs
:h /\ze
:h /\$
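As a rough cross-check outside Vim, the same "one trailing space preceded by a non-space" rule can be sketched in Python. This assumes a lookbehind in place of \S\zs, and strip_single_trailing_space is a hypothetical helper name.

```python
import re

def strip_single_trailing_space(line: str) -> str:
    # (?<=\S) plays the role of \S\zs: the character before the space must
    # be non-whitespace, but it is not part of what gets replaced.
    return re.sub(r"(?<=\S)\s$", "", line)

for sample in ["foo ", "foo  ", "foo"]:
    print(repr(strip_single_trailing_space(sample)))
```

Only a lone trailing space is removed; a run of two trailing spaces is left alone because the last space is preceded by another space.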
As an example,
%s/\s\+$/\=strlen(submatch(0)) >= 2 ? '  ' : ''/e
That is, capture all whitespace at the end of the line and replace it with two spaces if the match is at least two characters long, or with nothing otherwise. Pretty straightforward, I believe. See also :h sub-replace-expression.
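The expression-as-replacement idea has a close analogue in Python, where re.sub accepts a function. The sketch below only illustrates the logic and is not what Vim runs; normalize_trailing_spaces is a made-up name.

```python
import re

def normalize_trailing_spaces(line: str) -> str:
    # The lambda mirrors the \= expression: two or more trailing whitespace
    # characters become exactly two spaces, a single one is removed.
    return re.sub(r"\s+$", lambda m: "  " if len(m.group(0)) >= 2 else "", line)

for sample in ["foo ", "foo  ", "foo     ", "foo"]:
    print(repr(normalize_trailing_spaces(sample)))
```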
Let's say I have the following text:
new_item['uid']
And I want to capture everything within the [ ... ]. So in this case grab the 'uid'. Normally I could use something like:
\[([^\]]+)]
To match this (start with the opening bracket and get everything until the closing bracket). But without the character classes, or negated character class in vim, how would I do something similar?
If you want a capture group written as (..), you need very magic mode; otherwise you have to escape the ( and ), as in POSIX basic regular expressions (BRE).
So both give the matched part in \1:
\[\([^]]*\)
and (\v tells vim to match in verymagic mode)
\v\[([^]]*)
You could use .\{-} in place of .*? to make a lazy dot match. In the default magic mode the capture group parentheses need escaping as well:
\[\(.\{-}\)\]
In addition to the other answers, you could use \v\[\zs.{-}\ze\] to match only the text between \zs and \ze, see :h \ze
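For comparison with the PCRE-style pattern from the question, here is a small Python sketch of both approaches, the negated class and the lazy dot. The helper names are invented for illustration; the Vim spellings above are what you would actually type in Vim.

```python
import re

def inside_brackets_negated(text: str) -> str:
    # [^\]]* stops at the first closing bracket (Vim: \[\([^]]*\)).
    return re.search(r"\[([^\]]*)\]", text).group(1)

def inside_brackets_lazy(text: str) -> str:
    # .*? is the lazy quantifier that Vim spells .\{-}
    return re.search(r"\[(.*?)\]", text).group(1)

print(inside_brackets_negated("new_item['uid']"))
print(inside_brackets_lazy("new_item['uid']"))
```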
I'm looking to use vim to extract only the square brackets and the number inside from a file containing the following example text:
13_[4]_3_[4]_[1]_5_[1]_29_[3]_4_[2]_9_[1]_6_[2]_4
14_[4]_28_[3]_4_[2]_12_[1]_8_[2]_2
[1]_[4]_15_[1]_16_[3]_4_[2]_11_[1]_16_[2]_2
9_[4]_3_[4]_3_[4]_9_[4]_4_[4]_7_[1]_12_[3]_4_[2]_9_[1]_[2]_2
14_[4]_30_[3]_4_[2]_5_[1]_19_[1]_3_[1]_8_[2]_10_[1]_4_[1]_3_[1]_2
So for the first example line I would like an output line that looks like:
[4][4][1][1][3][2][1][2].
I can easily delete the square brackets with:
:%s/\[\d\]//g
but I am having real trouble trying to delete all text that doesn't match \[\d\]. Most vim commands that work with negation (e.g. :v) appear to only operate on the whole line rather than individual strings, and using %s with group matching:
:%s/\v(.*)([\d])(.*)/\2
also matches and deletes the square brackets.
Would someone have a suggestion to solve my problem?
You were close. You need to quote the square brackets and use something far less greedy than .*.
:%s/\v[^[]*(\[\d\])[^[]*/\1/g
Overview
Match leading text + [ + digit + ] + trailing text, capturing the [ + digit + ]. Replace the match with the capture group, leaving only the brackets and digits.
Glory of details
Using \v for very magic. See :h magic
[...] is a bracketed character class, which matches any one of the characters inside, e.g. fooba[rs] matches foobar and foobas, but not foobaz. See :h /\[. (Note that Vim calls this a collection.)
[^...] is a negated bracketed character class, which matches any one character that is not listed inside the brackets, e.g. fooba[^rz] matches foobas, but not foobaz or foobar.
[^[] - match any non-[ character. (This looks funny.)
[^[]* - match a non-[ character zero or more times. This will match the leading text we want to remove.
(...) - capture group
\[ & \] represent literal [ / ]. We must escape to prevent a character class.
\d match 1 digit.
[^[]* - match trailing text to be removed
\1 the replacement will be our capture group aka bracketed digits.
Use the g flag to do this globally or more plainly multiple times.
Use a range of % to do a substitution, :s, over the entire file, 1,$.
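The breakdown above can be checked outside Vim with a short Python sketch of the same pattern. Python needs the [ escaped inside the negated class, and keep_bracketed_digits is a hypothetical helper name.

```python
import re

def keep_bracketed_digits(line: str) -> str:
    # leading non-[ text, a captured [digit], trailing non-[ text;
    # every match is replaced by just the captured group.
    return re.sub(r"[^\[]*(\[\d\])[^\[]*", r"\1", line)

print(keep_bracketed_digits("13_[4]_3_[4]_[1]_5_[1]_29_[3]_4_[2]_9_[1]_6_[2]_4"))
```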
So why does :%s/\v(.*)([\d])(.*)/\2 fail?
tl;dr: Your pattern doesn't match. Try /[\d].
Long version:
The first .* will capture too much leaving only the last portion. e.g. [2]....
[\d] creates a bracketed character class that matches one of the following characters: d or \
The second .* suffers from the same problem as the first when using the g flag.
Why not 3 capture groups? You can certainly have more capture groups, but in this case they are unnecessary, so remove them.
Missing g flag. This means the command will only do 1 substitution per line which will leave plenty of text.
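The greediness problem on its own can be reproduced in Python. Note the assumption: this sketch escapes the brackets (unlike the original attempt), so it isolates only the effect of the greedy .* and is not a replay of the exact Vim command. The name greedy_attempt is made up.

```python
import re

def greedy_attempt(line: str) -> str:
    # The first .* swallows as much as it can, so only the LAST [digit]
    # on the line is left for the second group.
    return re.sub(r"(.*)(\[\d\])(.*)", r"\2", line)

print(greedy_attempt("13_[4]_3_[4]_[1]_5_[1]_29_[3]_4_[2]_9_[1]_6_[2]_4"))
```

Even with correct escaping, one match covers the whole line, so repeating the substitution cannot recover the earlier bracketed digits.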
General regex and substitution advice
When working with a tricky regex pattern it is often best to start with a search, /, instead of a substitution. This allows you to see where the matches are beforehand. You can tweak your search via / and pressing <up> or <c-p>. Or even better, use q/ to open the command-line-window so you can edit your pattern like any other text. You can also use <c-f> while on the command line (including /) to bring up the command-line-window.
Once you have your pattern, you can start your substitution. Vim provides a shortcut for reusing the current search: leave the pattern empty, e.g. :%s//\1/g.
This technique especially combined with set incsearch and set hlsearch, means you can see your matches interactively before you do your substitutions. This technique is shown in the following Vimcast episode: Refining search patterns with the command-line window.
Need to learn more regex syntax? See :h pattern. It is a very long and dense read, but will greatly aid you in the future. I also find reading Perl's regex documentation via perldoc perlre to be a good place to look as well. Note: Perl's regexes are different from Vim's regexes (See :h perl-patterns), but Perl Compatible Regular Expressions (PCRE) are very common.
Thoughts
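You may also consider grep -o, e.g. :%!grep -o '\[[0-9]\]'. Note that grep -o prints each match on its own line, so the matches from one input line are no longer joined together.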
More help
:h :s
:h range
:h magic
:h /\[
:h /\(
:h s/\1
:h /\d
:h :s_flags
:h 'hlsearch'
:h 'incsearch'
:h q/
:h command-line-window
:h :range!
Another way to do it:
:%s/\v[^[]*(%(\[\d\])?)/\1/g
I have a situation like
something 'text' something 'moretext'
I forgot to add more spacing the first time I created this file, and now on each line I need to put some whitespace before the 2nd occurrence of ' .
Now I can't build a regex for this.
I know that:
my command should begin with :%s because I want it to be executed on all lines
I should use the {2} operator to pick the 2nd occurrence?
If my regex matches something, I can put stuff before the match with &
The main problem for me is how to build a regex that matches the second ' using the {} notation. It's frustrating because I don't know where it's supposed to be inserted, or whether I should use the magic or non-magic regex in vim.
The result I'm looking for
something 'text ' something 'moretext'
You can use
:%s:\v(^[^']*'[^']*)':\1 ':
[^'] means everything except '
\1 is a backreference to the first captured group (...)
Basically what you're doing here is capturing everything up to the second quote, and replacing the line up to (and including) this quote with what you've captured, a space, and a quote.
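The same capture-and-reinsert idea can be tried in Python as a quick check. This is only a sketch of the logic, and space_before_second_quote is an invented name.

```python
import re

def space_before_second_quote(line: str) -> str:
    # Group 1 is everything up to (but not including) the second quote;
    # the replacement puts it back followed by a space and the quote.
    return re.sub(r"^([^']*'[^']*)'", r"\1 '", line)

print(space_before_second_quote("something 'text' something 'moretext'"))
```

Lines with fewer than two quotes do not match and are left unchanged.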
{2} doesn't mean "the second match", it means "two matches" so it's completely useless for the task.
You could use a substitution like this one or the one in Robin's answer:
:%s/[^']*'[^']*\zs'/ '
Or you could use something like this (type a literal space where <space> appears, since :normal does not interpret key notation):
:g/ '[^']*' /norm 2f'i<space>
Yet another way to do it:
:%s/\v%([^']*\zs\ze'){2}/ /
Note: I am using very magic, \v, to reduce amount of escaping.
This approach uses \zs and \ze to set the start and end of the match. Because of the {2} quantifier the group is matched twice, and each repetition resets the \zs and \ze positions, so the final match is the empty position just before the second quote.
For more help see:
:h /\zs
:h /\v
Of course there is always sed, but the trick is getting the quote escaped correctly.
:%!sed -e 's/'\''/ &/2'
I'm playing with vim-ruby indent, and there are some pretty complex regexes there:
" Regex used for words that, at the start of a line, add a level of indent.
let s:ruby_indent_keywords = '^\s*\zs\<\%(module\|class\|def\|if\|for' .
\ '\|while\|until\|else\|elsif\|case\|when\|unless\|begin\|ensure' .
\ '\|rescue\):\@!\>' .
\ '\|\%([=,*/%+-]\|<<\|>>\|:\s\)\s*\zs' .
\ '\<\%(if\|for\|while\|until\|case\|unless\|begin\):\@!\>'
With the help of vim documentation I deciphered it to mean:
start-of-line <any number of spaces> <start matching> <beginning of a word> /atom
<one of provided keywords> <colon character> <nothing> <end of word> ...
I have some doubts:
Is it really matching ':'? It doesn't seem to work like that, but I don't see anything about the colon being a special character in regexes.
Why is there \zs (start of the match) and no \ze (end of the match)?
What does \%() do? Is it just some form of grouping?
:\@! says to match only if the keyword is not followed by a colon, if I read it correctly. I am not familiar with the ruby syntax that this is matching against so this may not be quite correct. See :help /\@! and the surrounding topics for more info on lookarounds.
You can have a \zs with no \ze, it just means that the end of the match is at the end of the regex. The opposite is also true.
\%(\) just creates a grouping just as \(\) would except that the group is not available as a backreference (like would be used in a :substitute command).
You can check whether ':' (or any other string) is matched by copying the regex and using it in a search with / on the code you are working on. Using :set incsearch may help you see what is being matched while you type the regex.
\zs and \ze don't change whether the pattern matches; they determine which part of the matched text is reported as the match and used by commands and functions such as :s and substitute(). You can check that by searching with / with the 'incsearch' option set: start a search for a string in the text, which will be highlighted, then add \zs and \ze and watch the highlight on the matched text change. There is no need to "close" \zs with \ze or vice versa, as one can discard only the start or only the end of the match.
It is a form of grouping that is not saved in temporary variables for use with \1, \2 or submatch(), as stated in :h \%():
\%(\) A pattern enclosed by escaped parentheses.
Just like \(\), but without counting it as a sub-expression. This
allows using more groups and it's a little bit faster.
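For readers more used to Perl-style syntax, the two constructs discussed here have direct counterparts in Python: (?:...) for \%(...\) and (?!:) for :\@!. The sketch below assumes a shortened, made-up keyword list and does not try to emulate \zs, so unlike the Vim pattern its match includes any leading whitespace.

```python
import re

# (?:...) groups without capturing, like \%(...\) in Vim.
# (?!:) is a negative lookahead, like :\@! in Vim.
KEYWORD = re.compile(r"^\s*\b(?:if|def|class)(?!:)\b")

print(KEYWORD.groups)                    # number of capture groups
print(bool(KEYWORD.search("  def foo")))
print(bool(KEYWORD.search("if: 1")))
print(bool(KEYWORD.search("iffy")))
```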