vi: :s how to replace only the second occurrence on a line? - regex

:s/u/X/2 - this replaces the first u with X on the current line and on the next line...
Or is that how to replace the second character on a line with X? I don't know.
Or perhaps it's something other than :s?
I suspect I have to use grouping of some kind (\2?) but I don't know how to write that.
I heard that sed's s command and the :s command in vi are alike, and on a help page for sed I found:
3.1.3. Substitution switches:
Standard versions of sed support 4 main flags or switches which may be added to
the end of an "s///" command. They are:
N - Replace the Nth match of the pattern on the LHS, where
N is an integer between 1 and 512. If N is omitted,
the default is to replace the first match only.
g - Global replace of all matches to the pattern.
p - Print the results to stdout, even if -n switch is used.
w file - Write the pattern space to 'file' if a replacement was
done. If the file already exists when the script is
executed, it is overwritten. During script execution,
w appends to the file for each match.
http://sed.sourceforge.net/sedfaq3.html#s3.1.3
So :r! sed 's/u/X/2' would work, although I think there is a specifically vi way of doing this?
IDK if it's relevant but I'm using the tcsh shell.
also,
:version:
Version 1.79 (10/23/96) The CSRG, University of California, Berkeley.
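The numeric flag quoted from the sed FAQ can be checked from the command line before trying it inside the editor. The following is a minimal sketch, assuming a POSIX sed and a POSIX sh (not tcsh); the two sample lines are invented for illustration.

```shell
# Replace only the second "u" on each input line (numeric flag 2).
# Lines with fewer than two matches pass through unchanged.
# The input text is an invented sample, not taken from the question.
result=$(printf '%s\n' 'uu and uuu' 'only one u here' | sed 's/u/X/2')
printf '%s\n' "$result"
```

The first line should come back with only its second u changed, and the second line should be left alone because it has a single u.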

This is brittle, but may be enough to do what you want. This substitute command with a regex:
:%s/first\(.\{-}\)first/first\1second/g
converts this:
first and first then first again
first and first then first again
first and first then first again
first and first then first again
to this:
first and second then first again
first and second then first again
first and second then first again
first and second then first again
The regexp looks for the first "first", followed by a match of any characters using pattern .\{-}, which is the non-greedy version of .* (type :help non-greedy in vim for more info.) This non-greedy match is followed with the second "first".
The characters between the first and second "first" are captured by surrounding the .\{-} with parentheses, which, with escaping, results in \(.\{-}\); that captured group is then referenced with \1 (1 means the first captured group) in the replacement.

In order to substitute the second occurrence on a line, you can say:
:call feedkeys('nyq') | s/u/X/gc
In order to invoke it over a range of lines or the entire file, use it in a function:
:function Mysub()
: call feedkeys('nyq') | s/u/X/gc
:endfunction
For example, the following would substitute the second occurrence of u for X in every line in the file:
:1,$ call Mysub()

Here's a dumber but easier to understand way: first find a string that doesn't exist in the file - for the sake of argument assume it's zzz. then simply:
:%s/first/zzz
:%s/first/second
:%s/zzz/first
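The same three-step placeholder swap can be tried outside the editor. This is a sketch using sed rather than :s, assuming (as the answer does) that the placeholder zzz does not occur anywhere in the input.

```shell
# Step 1: hide the first "first" behind the placeholder.
# Step 2: the original second "first" is now the first match, so replace it.
# Step 3: restore the placeholder.
# "zzz" is assumed to be absent from the input.
result=$(printf '%s\n' 'first and first then first again' |
  sed -e 's/first/zzz/' -e 's/first/second/' -e 's/zzz/first/')
printf '%s\n' "$result"
```

Because none of the three substitutions uses the g flag, each one touches at most one match per line, which is what makes the trick work.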

Related

sed using regex example

I'm going over some legacy code and found this code:
cat some_file | \
sed "/^\/${CATEGORY}\/latest\//s: /.*$: ${DATA_PATH}:"
The format of the original file looks like:
/car/latest/ /US/car/2017/04/02
/bike/latest/ /US/bike/2017/03/31
/boat/latest/ /US/boat/2017/04/03
Assume the CATEGORY above is bike, and the DATA_PATH is /US/bike/2017/04/02, I guess the output will be like this, otherwise it does not make any sense.
/car/latest/ /US/car/2017/04/02
/bike/latest/ /US/bike/2017/04/02
/boat/latest/ /US/boat/2017/04/03
If so, what does the "s: /.*$:" do here? Why doesn't "/boat/latest/ /US/boat/2017/04/03" get substituted since we are replacing to the end (using the dollar sign).
If not, then what will be the output?
Thanks!
As the sed part is the issue, let us break it down:
/^\/${CATEGORY}\/latest\// -- This first part is an address: it selects the lines that match the pattern. Assuming CATEGORY = bike, the pattern is ^/bike/latest/. Note that ^ means the line must start with this.
s: /.*$: ${DATA_PATH}: -- Once a line matching the above is found, this replacement is performed. First, note that the "normal" / delimiter has been replaced by :. Read closely, it says: match a space followed by / and then all characters until the end of the line. The space is the key, as the only place on each line where a space is followed by / is at the start of the second column, namely " /US/bike/2017/03/31" in our bike example. The replacement portion is also a space followed by DATA_PATH.
if we take a single line of our data (where we have bike), the matching portion is:
/bike/latest/ /US/bike/2017/03/31
             ^^^^^^^^^^^^^^^^^^^^
Note how the first ^ is prior to the / in front of US
The address matches the /bike/latest/ line in your example. The s: /.*$: substitution replaces a space followed by a slash followed by any characters up to the end of the line. If DATA_PATH is the same as what is being replaced then this actually does nothing. Try replacing DATA_PATH with something else and you can see the substitution.
Just to clarify, the substitution replaces everything after a slash that is preceded by a space. There are no spaces before any of the category paths, e.g. /bike/latest/
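To see the address and the substitution working together, the command from the question can be run against the three sample lines. This sketch uses the values the question assumes (CATEGORY=bike and DATA_PATH=/US/bike/2017/04/02) and feeds the data with printf instead of cat some_file.

```shell
# Values taken from the question's own example.
CATEGORY=bike
DATA_PATH=/US/bike/2017/04/02

# Only the line whose start matches /bike/latest/ is edited; on that line,
# everything from the space before the second column to the end is replaced.
result=$(printf '%s\n' \
  '/car/latest/ /US/car/2017/04/02' \
  '/bike/latest/ /US/bike/2017/03/31' \
  '/boat/latest/ /US/boat/2017/04/03' |
  sed "/^\/${CATEGORY}\/latest\//s: /.*$: ${DATA_PATH}:")
printf '%s\n' "$result"
```

The car and boat lines never reach the s command because the address in front of it does not match them, which is why they are not substituted.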

vim search regular expression replace with register

I'd like to search a regex pattern with vim and replace the matches with a paste from a register. In detail that means:
acb123acb
asokqwdad
def442ads
asduiosdf
df567hjk
should finish with
acbXYZacb
asokqwdad
defPOWads
asduiosdf
dafMANhjk
where I had
XYZ
POW
MAN
in a register A (:g/pattern/y A)
A regex pattern to search for might be [0-9]{3} to match the 3 numbers from the text block.
Block mode would help if there were no lines between the matches...
I could of course use a Perl script for this. However, I'm sure that, if it is possible in Vim, it would be a lot faster, right?
Thank you in advance
If you want to replace all strings matching [0-9]{3} with the same value, which happens to be the contents of register a:
:%s/\v\d{3}/\=@a/g
In detail:
:% - apply to all lines in buffer
s/.../.../g - replace all occurrences
\v - what follows is a "very magic" regular expression
\d{3} - match 3 digits
\= - replace with the value of...
@a - register a
If on the other hand you want to read replacement values from register a:
:let a=getreg('a', 1, 1)
:%s/\v\d{3}/\=remove(a, 0)/g
In detail:
let a=getreg('a', 1, 1) - transfer the contents of register a to a list, imaginatively also named a
then same as above, except...
remove(a, 0) - deletes the first element in list a and returns it.
Also, VimL is, sadly, nowhere near as fast as Perl. :)
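For comparison with the script-based route the question mentions, here is a rough sketch of the same "take the next replacement from a list" idea outside Vim. It uses awk instead of Perl or VimL, and the hard-coded list XYZ POW MAN stands in for the register contents; both choices are assumptions made for illustration only.

```shell
# Replace each run of three digits with the next entry from a list.
# The list "XYZ POW MAN" is a stand-in for the contents of register a.
result=$(printf '%s\n' acb123acb asokqwdad def442ads asduiosdf daf567hjk |
  awk 'BEGIN { n = split("XYZ POW MAN", repl, " ") }
       {
         # Rewrite matches left to right; stop when the list is used up.
         while (i < n && match($0, /[0-9][0-9][0-9]/)) {
           i++
           $0 = substr($0, 1, RSTART - 1) repl[i] substr($0, RSTART + RLENGTH)
         }
         print
       }')
printf '%s\n' "$result"
```

Like remove(a, 0) in the Vim answer, the counter i consumes one list entry per match, so the replacements are handed out in order.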

Regex: Match any character (including whitespace) except a comma

I would like to match any character and any whitespace except comma with regex. Only matching any character except comma gives me:
[^,]*
but I also want to match any whitespace characters, tabs, space, newline, etc. anywhere in the string.
EDIT:
This is using sed in vim via :%s/foo/bar/gc.
I want to find starting from func up until the comma, in the following example:
func("bla bla bla"
"asdfasdfasdfasdfasdfasdf"
"asdfasdfasdf", "more strings")
To work with multiple lines in sed using a regex, you should look here.
EDIT:
In sed, working with newlines is a bit different. sed supports three commands for multiline operations: N, P and D. To see how they work, see this explanation (Working with Multiple Lines), where these three commands are discussed.
My guess is that the N command is the piece that is missing here. Adding N lets the pattern see \n in the pattern space.
An example from here:
Occasionally one wishes to use a new line character in a sed script.
Well, this has some subtle issues here. If one wants to search for a
new line, one has to use "\n." Here is an example where you search for
a phrase, and delete the new line character after that phrase -
joining two lines together.
(echo a;echo x;echo y) | sed '/x$/ {
N
s:x\n:x:
}'
which generates
a
xy
However, if you are inserting a new line, don't use "\n" - instead
insert a literal new line character:
(echo a;echo x;echo y) | sed 's:x:X\
:'
generates
a
X

y
So basically you're trying to match a pattern over multiple lines.
Here's one way to do it in sed (pretty sure these are not useable within vim though, and I don't know how to replicate this within vim)
sed '
/func/{
:loop
/,/! {N; b loop}
s/[^,]*/func("ok"/
}
' inputfile
Let's say inputfile contains these lines
func("bla bla bla"
"asdfasdfasdfasdfasdfasdf"
"asdfasdfasdf", "more strings")
The output is
func("ok", "more strings")
Details:
If a line contains func, enter the braces.
:loop is a label named loop
If the line does not contain , (that's what /,/! means)
append the next line to pattern space (N)
branch to / go to loop label (b loop)
So it will keep on appending lines and looping until , is found, upon which the s command is run which matches all characters before the first comma against the (multi-line) pattern space, and performs a replacement.
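The loop described above can be exercised without creating inputfile by piping the three sample lines in. This sketch splits the script into separate -e pieces, which some sed implementations handle more reliably than labels and braces sharing a line; that split is a portability assumption, not part of the original answer.

```shell
# Keep appending lines (N) until the pattern space contains a comma,
# then replace everything before the first comma.
# Each -e piece becomes its own script line.
result=$(printf '%s\n' \
  'func("bla bla bla"' \
  '"asdfasdfasdfasdfasdfasdf"' \
  '"asdfasdfasdf", "more strings")' |
  sed -e '/func/{' \
      -e ':loop' \
      -e '/,/!{' \
      -e 'N' \
      -e 'b loop' \
      -e '}' \
      -e 's/[^,]*/func("ok"/' \
      -e '}')
printf '%s\n' "$result"
```

Once the lines are joined in the pattern space, [^,]* happily matches across the embedded newlines, because a newline is just another character that is not a comma.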

How to replace all the blanks within square brackets with an underscore using sed?

I figured out that in order to turn [some name] into [some_name] I need to use the following expression:
s/\(\[[^ ]*\) /\1_/
i.e. create a backreference capture for anything that starts with a literal '[' and contains any number of non-space characters, followed by a space, to be replaced with the captured text followed by an underscore. What I don't know yet, though, is how to alter this expression so it works for ALL spaces within the brackets, e.g. turning [a few words] into [a_few_words].
I sense that I'm close, but am just missing a chunk of knowledge that will unlock the key to making this thing work an infinite number of times within the constraints of the first set of []s contained in a line (of SQL Server DDL in this case).
Any suggestions gratefully received....
There are two parts to the trickery needed:
Stop replacing when you reach a close square bracket (but do it repeatedly on the line):
s/\(\[[^] ]*\) /\1_/g
This matches an open square bracket, followed by zero or more characters that are neither a blank nor a close square bracket. The global suffix means that the pattern is applied to all sequences starting with an open square bracket followed eventually by a blank or close square bracket on the line. Note, too, that this regex does not alter '[single-word] and context' whereas the original would translate that to '[single-word]_and context', which is not the object of the exercise.
Get sed to repeat the search from where this one started. Unfortunately, there isn't a truly good way to do that. Sed always resumes searching after the text that was substituted; and this is one occasion when we don't want that. Sometimes, you can get away with simply repeating the substitute operation. In this case, you have to repeat it every time the substitution succeeds, stopping when there are no more substitutions.
Two of the less well known operations in sed are the ':label' and the 't' commands. They were present in the 7th Edition of Unix (circa 1978), though, so they are not new features. The first simply identifies a position in the script which can be jumped to with 'b' (not wanted here) or 't':
[2addr]t [label]
Branch to the ':' function bearing the label if any substitutions have been made since the most recent reading of an input line or execution of a 't' function. If no label is specified, branch to the end of the script.
Marvellous: we need:
sed -e ':redo; s/\(\[[^] ]*\) /\1_/g; t redo' data.file
Except - it doesn't work all on one line like that (at least, not on MacOS X). This did work admirably, though:
sed -e ':redo
s/\(\[[^] ]*\) /\1_/g
t redo' data.file
Or, as noted in the comments, you could write three separate '-e' options (which works on MacOS X):
sed -e ':redo' -e 's/\(\[[^] ]*\) /\1_/g' -e 't redo' data.file
Given the data file:
a line with [one blank] word inside square brackets.
a line with [two blank] or [three blank] words inside square brackets.
a line with [no-blank] word inside square brackets.
a line with [multiple words in a single bracket] inside square brackets.
a line with [multiple words in a single bracket] [several times on one line]
the output from the sed script shown is:
a line with [one_blank] word inside square brackets.
a line with [two_blank] or [three_blank] words inside square brackets.
a line with [no-blank] word inside square brackets.
a line with [multiple_words_in_a_single_bracket] inside square brackets.
a line with [multiple_words_in_a_single_bracket] [several_times_on_one_line]
And, finally, reading the fine print in the question, if you need this done only in the first square-bracketed field on each line, then we need to ensure that there are no open square brackets before the one that starts the match. This variant works:
sed -e ':redo' -e 's/^\([^]]*\[[^] ]*\) /\1_/' -e 't redo' data.file
(The 'g' qualifier is gone - it probably isn't needed in the other variants either given the loop; its presence might make the process marginally more efficient, but it would most likely be essentially impossible to detect that. The pattern is now anchored to the start of the line (the caret) and allows zero or more characters that are not a close square bracket before the open square bracket, so a match cannot reach past the first close square bracket.)
Sample output:
a line with [two_blank] or [three blank] words inside square brackets.
a line with [no-blank] word inside square brackets.
a line with [multiple_words_in_a_single_bracket] inside square brackets.
a line with [multiple_words_in_a_single_bracket] [several times on one line]
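As a quick check of the first-field-only variant on something closer to the DDL mentioned in the question, the sketch below runs it on one made-up line. The table and column names are hypothetical and chosen only to show that the second bracketed name is left alone.

```shell
# The t command jumps back to :redo for as long as the substitution keeps
# succeeding, so every blank in the first bracketed name gets replaced.
# The DDL-like input line is invented for illustration.
result=$(printf '%s\n' 'CREATE TABLE [my table name] ([first col] int)' |
  sed -e ':redo' -e 's/^\([^]]*\[[^] ]*\) /\1_/' -e 't redo')
printf '%s\n' "$result"
```

The loop stops on its own: once the first name contains no blanks, the anchored pattern can no longer match and t falls through to the end of the script.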
This is easier in a language like perl which has "executable" substitutions:
perl -wne 's/(\[.*?])/ do { my $x = $1; $x =~ y, ,_,; $x } /ge; print'
Or to split it up more clearly:
sub replace_with_underscores {
my $s = shift;
$s =~ y/ /_/;
$s
}
s/(\[.*?])/ replace_with_underscores($1) /ge;
The .*? is the non-greedy match (to avoid slurring together two adjacent bracketed phrases) and the e flag to the substitution causes it to be evaluated, so you can call a function to do the inner work.

Substitute the n-th occurrence of a word in vim

I saw other questions dealing with the finding the n-th occurrence of a word/pattern, but I couldn't find how you would actually substitute the n-th occurrence of a pattern in vim. There's the obvious way of hard coding all the occurrences like
:s/.*\(word\).*\(word\).*\(word\).*/.*\1.*\2.*newWord.*/g
Is there a better way of doing this?
For information,
s/\%(\(pattern\).\{-}\)\{41}\zs\1/2/
also works to replace the 42nd occurrence. However, I prefer the solution given by John Kugelman, which is simpler, even if it will not limit itself to the current line.
You can do this a little more simply by using multiple searches. The empty pattern in the :s/pattern/repl/ command means replace the most recent search result.
:/word//word//word/ s//newWord/
or
:/word//word/ s/word/newWord/
You could then repeat this multiple times by doing @:, or even 10@: to repeat the command 10 more times.
Alternatively, if I were doing this interactively I would do something like:
3/word
:s//newWord/r
That would find the third occurrence of word starting at the cursor and then perform a substitution.
Replace each Nth occurrence of PATTERN in a line with REPLACE.
:%s/\(\zsPATTERN.\{-}\)\{N}/REPLACE/
To replace the nth occurrence of PATTERN in a line in Vim: in addition to the answer above, I want to explain how the pattern matching actually works, for easier understanding.
So I will be discussing the \(.\{-}\zsPATTERN\)\{N} solution.
The example I will be using replaces the second run of one or more spaces in a sentence (string).
The pieces of the pattern are:
\zs - According to the Vim docs (:help /\zs), this matches at any position and sets the start of the match there: the next character is the first character of the whole match.
(It is not the normal-mode zs command, which scrolls the text horizontally.)
.\{-} - 0 or more, as few as possible.
Here . matches any character and \{} gives the number of times.
e.g. ab\{2,3}c matches where b occurs either 2 or 3 times.
.* would also match 0 or more characters, but as many as possible, so the match would end up on the last occurrence rather than the Nth one counted from the start of the line; that is why the non-greedy form is used.
According to the Vim non-greedy docs, "\{-}" is the same as "*" but uses the shortest-match-first algorithm.
\{N} - matches N of the preceding atom.
/\<\d\{4}\> searches for exactly 4 digits, same as /\<\d\d\d\d\>
(Ignore the \< and \> here; they match word boundaries, so searching for \<fred\> finds fred but not alfred.)
\( \) - groups the whole pattern.
PATTERN here is the pattern you are matching: \s\{1,} (\s is whitespace and \{1,} means 1 or more, as explained just above).
"abc substring def"
:%s/\(.\{-}\zs\s\{1,}\)\{2}/,/
OUTPUT -> "abc substring,def"
# explanation: the first space is between abc and substring, and the second
# occurrence of the pattern is between substring and def, hence that one
# is replaced by the "," specified in the replace command above.
This answers your actual question, but not your intent.
You asked about replacing the nth occurrence of a word (but seemed to mean "within a line"). Here's an answer for the question as asked, in case someone finds it like I did =)
For weird tasks (like needing to replace every 12th occurrence of "dog" with "parrot"), I like to use recursive recordings.
First blank the recording in @q
qqq
Now start a new recording in q
qq
Next, manually do the thing you want to do (using the example above, replace the 12th occurrence of "dog" with "parrot"):
/dog
nnnnnnnnnnn
delete "dog" and get into insert
diwi
type parrot
parrot
Now play your currently empty "@q" recording
@q
which does nothing.
Finally, stop recording:
q
Now your recording in @q calls itself at the end. But because it calls the recording by name, it won't be empty anymore. So, call the recording:
@q
It will replay the recording, then at the end, as the last step, replay itself again. It will repeat this until the end of the file.
TL;DR
qqq
qq
/dog
nnnnnnnnnnndiwiparrot<esc>
@q
q
@q
Well, if you do /gc then you can count the number of times it asks you for confirmation, and go ahead with the replacement when you get to the nth :D