Substituting zero-width match in vim script - regex

I have written this script that replaces many spaces around the cursor with one space. This however doesn't work when I use it with no spaces around the cursor. It seems to me that Vim doesn't replace on a zero-width match.
function JustOneSpace()
let save_cursor = getpos(".")
let pos = searchpos(' \+', 'bc')
s/\s*\%#\s*/ /e
let save_cursor[2] = pos[1] + 1
call setpos('.', save_cursor)
endfunction
nmap <space> :call JustOneSpace()<cr>
Here are a few examples (pipe | is cursor):
This line
hello | world
becomes
hello |world
But this line
hello wo|rld
doesn't become
hello wo |rld
Update: By changing the function to the following it works for the examples above.
function JustOneSpace()
let save_cursor = getpos(".")
let pos = searchpos(' *', 'bc')
s/\s*\%#\s*/ /e
let save_cursor[2] = pos[1] + 1
call setpos('.', save_cursor)
endfunction
This line
hello |world
becomes
hello w|orld
The problem is that the cursor moves to the next character. It should stay in the same place.
Any pointers or tips?

I think that the only problem with your script is that the position saving doesn't seem correct. You can essentially do what you are trying to do with:
:s/\s*\%#\s*/ /e
which is identical to the (correct) code in your question. You could simply map this with:
:nmap <space> :s/\s*\%#\s*/ /e<CR>
If you want to save the position, it gets a little more complicated. Probably the best bet is to use something like this:
function! JustOneSpace()
" Get the current contents of the current line
let current_line = getline(".")
" Get the current cursor position
let cursor_position = getpos(".")
" Generate a match using the column number of the current cursor position
let matchRE = '\(\s*\)\%' . cursor_position[2] . 'c\s*'
" Find the number of spaces that precede the cursor
let isolate_preceding_spacesRE = '^.\{-}' . matchRE . '.*$'
let preceding_spaces = substitute(current_line, isolate_preceding_spacesRE, '\1', "")
" Modify the line by replacing with one space
let modified_line = substitute(current_line, matchRE, " ", "")
" Modify the cursor position to handle the change in string length
let cursor_position[2] -= len(preceding_spaces) - 1
" Set the line in the window
call setline(".", modified_line)
" Reset the cursor position
call setpos(".", cursor_position)
endfunction
Most of that is comments, but the key thing is that you look at the length of the line before and after the substitution and decide on the new cursor position accordingly. You could do this with your method by comparing len(getline(".")) before and after if you prefer.
Edit
If you want the cursor to end after the space character, modify the line:
let cursor_position[2] -= len(current_line) - len(modified_line)
such that it looks like this:
let cursor_position[2] -= (len(current_line) - len(modified_line)) - 1
Edit (2)
I've changed the script above to consider your comments such that the cursor position is only adjusted by the number of spaces before the cursor position. This is done by creating a second regular expression that extracts the spaces preceding the cursor (and nothing else) from the line and then adjusting the cursor position by the number of spaces.
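The cursor adjustment described above can also be tried on a plain string, outside any buffer. The following is a minimal sketch, not the answer's exact method: it splits the line at the cursor column with strpart() instead of using a \%c pattern, and the function name CollapseSpacesAt is made up for illustration. It assumes a 1-based byte column, as returned by getpos(".")[2].

```vim
" Hypothetical helper (not part of the answer above).
" Collapse the whitespace around byte column a:col (1-based) of a:line
" into one space. Returns [new_line, new_col], where new_col is the
" column of the character right after the single space.
function! CollapseSpacesAt(line, col) abort
  " Text strictly before the cursor column
  let before = strpart(a:line, 0, a:col - 1)
  " Text from the cursor column onwards
  let after = strpart(a:line, a:col - 1)
  " Drop trailing whitespace on the left and leading whitespace on the right
  let head = substitute(before, '\s*$', '', '')
  let tail = substitute(after, '^\s*', '', '')
  " The new cursor column depends only on what precedes the cursor
  return [head . ' ' . tail, len(head) + 2]
endfunction
```

Something like `let [text, col] = CollapseSpacesAt(getline('.'), col('.'))` followed by setline() and cursor() would apply it to the current line.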

I don't use vim, but if you want to match zero or more spaces, shouldn't you be using ' *' instead of ' \+' ?
EDIT: re the cursor positioning problem: what you're doing now is setting the position at the beginning of the whitespace before you do the substitution, then moving it forward one position so it's after the space. Try setting it at the end of the match instead, like this:
search(' *', 'bce')
That way, any additions or removals will occur before the cursor position. In most editors, the cursor position automatically moves to track such changes. You shouldn't need to do any of that getpos/setpos stuff.

This function is based on Al's answer.
function JustOneSpace()
" Get the current contents of the current line
let current_line = getline(".")
" Get the current cursor position
let cursor_position = getpos(".")
" Generate a match using the column number of the current cursor position
let matchre = '\s*\%' . cursor_position[2] . 'c\s*'
let pos = match(current_line, matchre) + 2
" Modify the line by replacing with one space
let modified_line = substitute(current_line, matchre, " ", "")
" Modify the cursor position to handle the change in string length
let cursor_position[2] = pos
" Set the line in the window
call setline(".", modified_line)
" Reset the cursor position
call setpos(".", cursor_position)
endfunction
Instead of using the difference between the original and the modified line, I find the position of the first space that will match the regular expression of the substitution. Then I set the cursor position to that position + 1.
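The "+ 2" in the function comes from match() returning a 0-based index while cursor columns are 1-based: one is added to convert the index to a column, and one more to step over the single space. A small sketch of just that calculation, with a hypothetical function name:

```vim
" Hypothetical helper illustrating the column arithmetic only.
" a:col is a 1-based byte column, as in getpos(".")[2].
function! ColumnAfterSpace(line, col) abort
  " Same pattern as in the function above: whitespace around column a:col
  let pat = '\s*\%' . a:col . 'c\s*'
  " match() is 0-based: +1 converts to a column, +1 skips the single space
  return match(a:line, pat) + 2
endfunction
```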

This simple one I use does almost the same:
nnoremap <leader>6 d/\S<CR>
Put the cursor where the spaces you want to remove begin; the mapping deletes everything from the cursor up to the next non-whitespace text.

Related

Remove lines from buffer that match the selected text

When analyzing large log files, I often remove lines containing text I find irrelevant:
:g/whatever/d
Sometimes I find text that spans multiple lines, like stack traces. For that, I record the steps taken (search, go to start anchor, delete to end anchor) and replay that macro with 100000@q. I'm looking for a function or a built-in Vim feature that lets me mark text and remove all lines containing this text. Ideally this would also work for block selection.
If I understood your problem right, this command should do what you want:
:g/NullPointer/,/omitt/d
Example:
Before:
1
2
3
NullPointerException1
4
5
6
omitted
7
NullPointerException2
8
9
omitted
10
After:
1
2
3
7
10
Please read :h edit-paragraph-join; it has a good explanation of the command. In your case, you just change join into d.
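If you need this often, the command can be wrapped in a function. This is only a sketch: the name DeleteBlocks is hypothetical, and it assumes neither pattern contains an unescaped / and that every start line is followed by an end line (otherwise the search for the end pattern fails or wraps around).

```vim
" Hypothetical wrapper around :g/start/,/end/d
" Deletes every block from a line matching a:start_pat up to the next
" line matching a:end_pat. Uses the black hole register so the
" unnamed register is left untouched.
function! DeleteBlocks(start_pat, end_pat) abort
  execute 'g/' . a:start_pat . '/,/' . a:end_pat . '/delete _'
endfunction
```

For the example above: `:call DeleteBlocks('NullPointer', 'omitt')`.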
:g/whatever/d2
will delete a line with whatever and the line after it. If you can find text that always happens in the first line, you can strip out all of the following text if it has the same number of lines by changing 2 to whatever you need.
You could actually just use some normal-mode commands inside a global command to achieve what you want. Looking at your example (I hope I understood it more or less right):
someText
NullPointerException
...
omitted
You want to delete from the line above the NPE until the line with omitted, right?
Just use the following:
:g/NullPointerException/execute "normal! kddd/omitted\<cr>dd"
It may look complex, but it isn't. It is not better than a macro [1], but I like commands more, because I always make errors when recording macros.
Since it only uses normal Vim movements, it is easy to adapt. If, for example, you do not know where your previous anchor is, you could use ?anchor\<cr> instead of kd. For a better demonstration you will have to submit a realistic example.
[1] You could argue, that this only needs to be run once, but that is also true for a recursive macro http://vim.wikia.com/wiki/Record_a_recursive_macro
Thanks to the answers here, I was able to write a very handy function. The source below lets you select text and remove all lines with the same (or similar) text from the current buffer. It works with both in-line and multiline selections. As I said, I was looking for something that makes me faster at analyzing log files. Log files typically contain dates and times, and these change all the time, so it's a good idea to have something that lets us ignore numbers. I'm using these two mappings:
vnoremap d :<C-U>echo RemoveSelectionFromBuffer(0)<CR>
vnoremap D :<C-U>echo RemoveSelectionFromBuffer(1)<CR>
Typical usage:
Remove similar lines ignoring numbers: Shift+v, then Shift+d
Remove same matches (single line): Mark text inline (leaving out dates and times), then d
Remove same matches (multiline): Mark text across lines (leaving out dates and times), then d
Here's the source code:
" Removes lines matching the selected text from buffer.
function! RemoveSelectionFromBuffer(ignoreNumbers)
let lines = GetVisualSelection() " selected lines
" Escape backslashes and slashes (delimiters)
call map(lines, {k, v -> substitute(v, '\\\|/', '\\&', 'g')})
if a:ignoreNumbers == 1
" Substitute all numbers with \s*\d\s* - in formatted output matching
" lines may have whitespace instead of numbers. All backslashes need
" to be escaped because \V (very nomagic) will be used.
call map(lines, {k, v -> substitute(v, '\s*\d\+\s*', '\\s\\*\\d\\+\\s\\*', 'g')})
endif
let blc = line('$') " number of lines in buffer (before deletion)
let vlc = len(lines) " number of selected lines
let pattern = join(lines, '\_.') " support multiline patterns
let cmd = ':g/\V' . pattern . '/d_' . vlc " delete matching lines (d_3)
let pos = getpos('v') " save position
execute "silent " . cmd
call setpos('.', pos) " restore position
let dlc = blc - line('$') " number of deleted lines
let dmc = dlc / vlc " number of deleted matches
let cmd = substitute(cmd, '\(.\{50\}\).*', '\1...', '') " command output
let lout = dlc . ' line' . (dlc == 1 ? '' : 's')
let mout = '(' . dmc . ' match' . (dmc == 1 ? '' : 'es') . ')'
return printf('%s removed: %s', (vlc == 1 ? lout : lout . ' ' . mout), cmd)
endfunction
I took the GetVisualSelection() code from this answer.
function! GetVisualSelection()
if mode() == "v"
let [line_start, column_start] = getpos("v")[1:2]
let [line_end, column_end] = getpos(".")[1:2]
else
let [line_start, column_start] = getpos("'<")[1:2]
let [line_end, column_end] = getpos("'>")[1:2]
end
if (line2byte(line_start)+column_start) > (line2byte(line_end)+column_end)
let [line_start, column_start, line_end, column_end] =
\ [line_end, column_end, line_start, column_start]
end
let lines = getline(line_start, line_end)
if len(lines) == 0
return ''
endif
let lines[-1] = lines[-1][: column_end - 1]
let lines[0] = lines[0][column_start - 1:]
return lines
endfunction
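The escaping step in RemoveSelectionFromBuffer() can be looked at in isolation. The sketch below uses escape() rather than the substitute() call from the function, and the name LiteralPattern is made up for illustration. With \V (very nomagic), only the backslash and the / delimiter still need escaping for the text to match literally.

```vim
" Hypothetical helper: build a pattern that matches a:text literally
" when used between / delimiters, e.g. in :g/.../d
function! LiteralPattern(text) abort
  " \V makes everything literal except the backslash;
  " the slash is escaped because it is the command delimiter
  return '\V' . escape(a:text, '\/')
endfunction
```

For example, `execute 'g/' . LiteralPattern('a/b') . '/d'` would delete all lines containing the literal text a/b.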
Thanks, aepksbuck, DoktorOSwaldo and Kent.

Join lines after specific word till another specific word

I have a .txt file of a transcript that looks like this
MICHAEL: blablablabla.
further talk by Michael.
more talk by Michael.
VALERIE: blublublublu.
Valerie talks more.
MICHAEL: blibliblibli.
Michael talks again.
........
All in all this pattern goes on for up to 4000 lines and not just two speakers but with up to seven different speakers, all with unique names written with upper-case letters (as in the example above).
For some text mining I need to rearrange this .txt file in the following way
Join the lines following one speaker - but only the ones that still belong to him - so that the above file looks like this:
MICHAEL: blablablabla. further talk by Michael. more talk by Michael.
VALERIE: blublublublu. Valerie talks more.
MICHAEL: blibliblibli. Michael talks again.
Sort the now properly joined lines in the .txt file alphabetically, so that all lines spoken by a speaker are together. However, the sort should not reorder the sentences spoken by one speaker (after having grouped each speaker's lines together).
I know some basic vim commands, but not enough to figure this out, especially the first part. I do not know what kind of pattern I can use in vim so that it only joins the lines of each speaker.
Any help would be greatly appreciated!
Alright, first the answer:
:g/^\u\+:/,/\n\u\+:\|\%$/join
And now the explanation:
g stands for global and executes the following command on every line that matches
/^\u\+:/ is the pattern :g searches for: ^ is the start of the line, \u is an uppercase character, \+ means one or more matches, and : is, unsurprisingly, a literal :
then comes the tricky bit: we make the executed command operate on a range, from the match to some other pattern match. /\n\u\+:\|\%$/ has two parts separated by the alternation \| . \n\u\+: is a newline followed by the previous pattern, i.e. it matches on the line before the next speaker. \%$ is the end of the file
join does what it says on the tin
So to put it together: For each speaker, join until the line before the next speaker or the end of the file.
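The same joining logic can be written out on a plain list of lines, which may make it easier to see what the range does. This is a sketch under the same assumption as the pattern above (a speaker line starts with uppercase letters followed by a colon); the function name JoinSpeakerLines is hypothetical.

```vim
" Hypothetical helper: join continuation lines onto the preceding
" speaker line. a:lines is a list of strings, e.g. from getline(1, '$').
function! JoinSpeakerLines(lines) abort
  let result = []
  for text in a:lines
    if text =~# '^\u\+:' || empty(result)
      " A new speaker starts here (or this is the very first line)
      call add(result, text)
    else
      " Continuation: append to the current speaker's line
      let result[-1] = result[-1] . ' ' . text
    endif
  endfor
  return result
endfunction
```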
The closest to the sorting I know of is
:sort /\u\+:/ r
which sorts only by speaker name and reverses the other lines, so it isn't really what you are looking for.
Well, I don't know much about vim, but I was able to match the lines belonging to a particular speaker, and here is the regex for that.
Regex: /([A-Z]+:)([A-Za-z\s\.]+)(?!\1)$/gm
Explanation:
([A-Z]+:) captures the speaker's name which contains only capital letters.
([A-Za-z\s\.]+) captures the dialogue.
(?!\1)$ backreferences to the Speaker's name and compares if the next speaker was same as the last one. If not then it matches till the new speaker is found.
I hope this will help you with matching at least.
In vim you might take a two step approach, first replace all newlines.
:%s/\n\+/ /g
Then insert a new line before the terms UPPERCASE: except the first one:
:%s/ \([[:upper:]]\+:\)/\r\1/g
For the sorting you can leverage the UNIX sort program by filtering the buffer through it:
:%!sort
You can combine them using a pipe symbol:
:%s/\n\+/ /g | %s/ \([[:upper:]]\+:\)/\r\1/g | %!sort
and map them to a key in your vimrc file:
:nnoremap <F5> :%s/\n\+/ /g \| %s/ \([[:upper:]]\+:\)/\r\1/g \| %!sort<CR>
If you press F5 in normal mode, the transformation happens. Note that the | needs to get escaped in the nnoremap command.
Here is a script solution to your problem.
It's not well tested, so I added some comments so you can fix it easily.
To make it run, just:
fill the g:speakers var in the top of the script with the uppercase names you need;
source the script (ex: :sav /tmp/script.vim|so %);
run :call JoinAllSpeakLines() to join the lines by speakers;
run :call SortSpeakLines() to sort
You may adapt the different patterns to better fit your needs, for example adding some space tolerance (\u\{2,}\s*\ze:).
Here is the code:
" Fill the following array with all the speakers names:
let g:speakers = [ 'MICHAEL', 'VALERIE', 'MATHIEU' ]
call sort(g:speakers)
function! JoinAllSpeakLines()
" In the whole file, join all the lines between two uppercase speaker names
" followed by ':', first inclusive:
silent g/\u\{2,}:/call JoinSpeakLines__()
endf
function! SortSpeakLines()
" Sort the whole file by speaker, keeping the order for
" each speaker.
" Must be called after JoinAllSpeakLines().
" Create a new dict, with one key for each speaker:
let speakerlines = {}
for speaker in g:speakers
let speakerlines[speaker] = []
endfor
" For each line in the file:
for line in getline(1,'$')
let speaker = GetSpeaker__(line)
if speaker == ''
continue
endif
" Add the line to the right speaker:
call add(speakerlines[speaker], line)
endfor
" Delete everything in the current buffer:
normal gg"_dG
" Add the sorted lines, speaker by speaker:
for speaker in g:speakers
call append(line('$'), speakerlines[speaker])
endfor
" Delete the first (empty) line in the buffer:
normal gg"_dd
endf
function! GetOtherSpeakerPattern__(speaker)
" Returns a pattern which matches all speaker names, except the
" one given as a parameter.
" Create an new list with a:speaker removed:
let others = copy(g:speakers)
let idx = index(others, a:speaker)
if idx != -1
call remove(others, idx)
endif
" Create and return the pattern list, which looks like
" this : "\v<MICHAEL>|<VALERIE>..."
call map(others, 'printf("<%s>:",v:val)')
return '\v' . join(others, '|')
endf
function! GetSpeaker__(line)
" Returns the uppercase name followed by a ':' in a line
return matchstr(a:line, '\u\{2,}\ze:')
endf
function! JoinSpeakLines__()
" When cursor is on a line with an uppercase name, join all the
" following lines until another uppercase name.
let speaker = GetSpeaker__(getline('.'))
if speaker == ''
return
endif
normal V
" Search for other names after the cursor line:
let srch = search(GetOtherSpeakerPattern__(speaker), 'W')
echo srch
if srch == 0
" For the last one only:
normal GJ
else
normal kJ
endif
endf
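A variant of the sorting step that does not need a predefined g:speakers list could look like the sketch below. It works on a list of already joined lines and discovers the speakers as it goes; the function name SortBySpeaker is hypothetical, and, like SortSpeakLines() above, it skips lines that do not start with a speaker name.

```vim
" Hypothetical helper: group joined lines by speaker, speakers in
" alphabetical order, each speaker's lines in their original order.
function! SortBySpeaker(lines) abort
  let groups = {}
  for text in a:lines
    let speaker = matchstr(text, '^\u\+\ze:')
    if speaker ==# ''
      " Not a speaker line: skipped in this sketch
      continue
    endif
    if !has_key(groups, speaker)
      let groups[speaker] = []
    endif
    call add(groups[speaker], text)
  endfor
  let result = []
  for speaker in sort(keys(groups))
    call extend(result, groups[speaker])
  endfor
  return result
endfunction
```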

Enumerate existing text in Vim (make numbered list out of existing text)

I have a source document with the following text
Here is a bunch of text
...
Collect underpants
???
Profit!
...
More text
I would like to visually select the middle three lines and insert numbers in front of them:
Here is a bunch of text
...
1. Collect underpants
2. ???
3. Profit!
...
More text
All the solutions I found either put the numbers on their own new lines or prepended the actual line of the file.
How can I prepend a range of numbers to existing lines, starting with 1?
It makes for a good macro.
Add the first number to your line, and put your cursor back at the beginning.
Start a macro with qq (or q<any letter>)
Copy the number with yf<space> (yank find )
Move down a line with j
Paste your yank with P
Move back to the beginning of the line with 0
Increment the number with Ctrl-a
Back to the beginning again with 0 (incrementing positions you at the end of the number)
End the macro by typing q again
Play the macro with @q (or @<the letter you picked>)
Replay the macro as many times as you want with <number>@@ (@@ replays the last macro)
Profit!
To summarize the fun way, the whole key sequence is i1. <Esc>0qqyf jP0^a0q10@q (where ^a stands for Ctrl-a).
To apply enumeration for all lines:
:let i=1 | g/^/s//\=i.'. '/ | let i=i+1
To enumerate only selected lines:
:let i=1 | '<,'>g/^/s//\=i.'. '/ | let i=i+1
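These one-liners work because :g takes everything after the pattern as its command, including the | and the let that follows it, so the counter is incremented once per matched line. The same numbering can be sketched on a list of lines with map(); the function name NumberLines is made up for illustration.

```vim
" Hypothetical helper: prefix each line with "N. ", starting at 1.
function! NumberLines(lines) abort
  " In a map() string expression, v:key is the 0-based index
  " and v:val is the line itself
  return map(copy(a:lines), 'printf("%d. %s", v:key + 1, v:val)')
endfunction
```

After a visual selection, something like `:call setline(line("'<"), NumberLines(getline("'<", "'>")))` would apply it to the selected lines.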
Set a non-recursive mapping with the following command, then type ,enum in normal mode while the cursor is inside the lines you want to enumerate.
:nn ,enum {j<C-v>}kI0. <Esc>vipg<C-a>
TL;DR
You can type :help CTRL-A to see an answer to your question.
{Visual}g CTRL-A Add [count] to the number or alphabetic character in
the highlighted text. If several lines are
highlighted, each one will be incremented by an
additional [count] (so effectively creating a
[count] incrementing sequence).
For Example, if you have this list of numbers:
1.
1.
1.
1.
Move to the second "1." and Visually select three
lines, pressing g CTRL-A results in:
1.
2.
3.
4.
If you have a paragraph (:help paragraph) you can select it (look at :help object-select). Suppose each new line in the paragraph needs to be enumerated.
{ jump to the beginning of current paragraph
j skip blank line, move one line down
<C-v> stands for Ctrl-v, which turns on blockwise Visual mode
} jump to the end of current paragraph
k skip blank line, move one line up
required region selected, we can make multi row edit:
I go into Insert mode and place cursor in the beginning of each line
0. is added in the beginning of each line
<Esc> to change mode back to Normal
You should get list prepended with zeros. If you already have such, you can omit this part.
vip select inner paragraph (list prepended with "0. ")
g<C-a> does the magic
I have found it easier to enumerate starting from zeroes than to leave the first line of the list out of the selection, as the documentation suggests.
Note: personally I have no mappings. It is easier to remember what g<C-a> does and use it directly. The answer above describes using plain <C-a>, which requires you to count manually; g<C-a>, on the other hand, can increment numbers by a given value (a step) and keeps its own internal counter.
Create a map for @DmitrySandalov's solution:
vnoremap <silent> <Leader>n :<C-U>let i=1 \| '<,'>g/^/s//\=i.'. '/ \| let i=i+1 \| nohl<CR>

vim sed match more than one newline and replace it with one newline

I'm having some trouble with vim, gg=G doesn't remove extra newlines, I'm trying with
:%s/\(\n\)\n\+/\1/g
but it's not working in the whole file. Any help appreciated.
This should work in vim...
:g/^\s*$/d
" Put the function below in your vimrc
" remove extra newlines keeping the cursor position and search registers
fun! DelBlank()
let _s=@/
let l = line(".")
let c = col(".")
:g/^\n\{2,}/d
let @/=_s
call cursor(l, c)
endfun
" the function can be called with "leader" d see :h <leader>
map <special> <leader>d :keepjumps call DelBlank()<cr>
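To see what "remove extra newlines" means here, the effect of the :g/^\n\{2,}/d line can be sketched on a list of lines: every run of blank lines is reduced to a single blank line. This is an illustration with a hypothetical name (SqueezeBlankLines); unlike DelBlank(), it also treats whitespace-only lines as blank.

```vim
" Hypothetical helper: keep at most one blank line in a row.
function! SqueezeBlankLines(lines) abort
  let result = []
  for text in a:lines
    " Skip a blank line if the previously kept line is blank too
    if text =~# '^\s*$' && !empty(result) && result[-1] =~# '^\s*$'
      continue
    endif
    call add(result, text)
  endfor
  return result
endfunction
```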

Add spaces at cursor position

I would like to know if it is possible to add spaces (30 spaces) at cursor position
I tried to do it with regex but I don't know how to represent the actual cursor position in regex.
30i<Space><Esc> will add 30 spaces at the cursor position when typed in normal mode.
You can use a Vim register for this:
"a selects register a, so cutting a whitespace character with "ax stores it in register a. Then use:
30"ap
Alternatively, cut a whitespace character with x and paste it with 30p.
Note: a named register keeps its value, so the first solution is more useful.
In addition to already given answers I can say that cursor position is represented in regex with \%#, so s/\%#/\=repeat(" ", 30)/ will add 30 spaces at cursor position just like 30i<Space><Esc>.
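What that substitution does to the line can be sketched as a plain string operation. The function name InsertSpacesAt is hypothetical; a:col is assumed to be a 1-based byte column such as col('.').

```vim
" Hypothetical helper: insert a:n spaces before byte column a:col.
function! InsertSpacesAt(line, col, n) abort
  return strpart(a:line, 0, a:col - 1)
        \ . repeat(' ', a:n)
        \ . strpart(a:line, a:col - 1)
endfunction
```

For the current line this could be applied with `:call setline('.', InsertSpacesAt(getline('.'), col('.'), 30))`.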