How to match a decimal letter and blank in vim?

I need to change
1 A
2 B
3 C
4 D
to
A
B
C
D
which means the decimal digit at the beginning of every line, and the one or more blanks following it, should be deleted.
I'm only familiar with regex in Perl, so I tried :%s/^\d\s+// to solve my problem, but it does not work. Can anyone tell me how to get this done in vim?
Thanks.

Vim needs a backslash for +, so try
:%s/^\d\s\+//

One way is to use the global command with the search-and-replace command:
:g/^[0-9]  */s//
It searches for the sequence:
start of line ^
a digit [0-9]
a space <space>
zero or more spaces <space>*
and then substitutes it for nothing (s//).
While you can do a similar thing with just the search-and-replace command on its own, it's useful to learn the global command since you can do all sorts of wonderful things with the lines selected (not just search and replace).
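For readers more at home in a scripting language, the idea behind the global command (pick the lines that match, then run some command on each of them) can be sketched in Python. This is only an illustration of the concept, not how Vim implements it, and the helper name apply_to_matching is made up for this example. The pattern uses "one or more spaces", which is the same as "a space followed by zero or more spaces".

```python
import re

def apply_to_matching(lines, pattern, action):
    """Run action on every line that matches pattern; leave other lines alone."""
    compiled = re.compile(pattern)
    return [action(line) if compiled.search(line) else line for line in lines]

lines = ["1 A", "2 B", "no number here", "4 D"]

# Select lines starting with a digit and one or more spaces, then strip that prefix.
prefix = r"^[0-9] +"
result = apply_to_matching(lines, prefix, lambda line: re.sub(prefix, "", line))
```

The action does not have to be a substitution; any function of the line would do, which is the flexibility the answer is pointing at.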

Use the following
:%s/^[0-9] *//

You should use instead
:%s/^\d\s\+//
Being a text editor, vim tends to treat more characters literally, as text, when matching a pattern than Perl does. In the default mode, + matches a literal +.
Of course, this is configurable. Try
:%s/\v^\d\s+//
and read the help file.
:help magic
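For comparison, here is the same substitution as a small Python sketch. Python's re syntax treats + as a quantifier, the way Perl and Vim's \v mode do. The sample text is taken from the question; the variable names are just for illustration.

```python
import re

text = "1 A\n2 B\n3 C\n4 D\n"

# re.MULTILINE makes ^ match at the start of every line, as :%s does line by line.
# [ \t]+ is used instead of \s+ so the match can never run across a newline.
cleaned = re.sub(r"^\d[ \t]+", "", text, flags=re.MULTILINE)
```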

You can also use the Visual Block mode (Ctrl+V), then move down and to the right to highlight a block of characters and use 'x' to remove them. Depending on the layout of the line, that may in fact be quicker (and easier to remember).

If you still want to use Perl for this, you can:
:%!perl -pe 's/^\d\s+//'
Vim will write the file to a temporary file, run the given Perl script on it, and reload the file into the edit buffer.

Escape the plus sign:
:%s/^\d\s\+//

If it's in a column like that, you could go into column (blockwise) visual mode by pressing:
esc ctrl+q
Then you can highlight what you want to delete.

Remove Word smart quotes from a text file using vim

I have a large text file, originally generated in Microsoft Word, that contains these four character sequences, alongside regular text:
?~#~\
?~#~]
?~#~X
?~#~Y
From the content of what is written in the file, it appears that the sequences respectively correspond to open double quotes, close double quotes, open single quote, and close single quote. When displayed in Vim, everything in the sequences other than the question mark appears in blue.
I cannot remove them with a command such as
:.,$s/?~#~Y//
This command results in the following error from vim:
E33: No previous substitute regular expression
E476: Invalid command
Press ENTER or type command to continue
These commands also produce errors:
:.,$s/\?~#~Y//
:.,$s/\?\~\#\~Y//
Specifically,
E866: (NFA regexp) Misplaced ?
E476: Invalid command
Press ENTER or type command to continue
What would be the correct way to automatically remove or replace the sequences? Ideally, I'd like to remove the double quotes, and replace the open/close single quotes with a traditional single quote or apostrophe.
Since "everything in the sequences other than the question mark appears in blue", all characters except the question mark are probably binary characters. I'd suggest this approach:
go to the first sequence and yank it: press v to start marking, extend the mark to the end of the sequence, then press y
paste the sequence as the search pattern from the unnamed register: type :%s/ then press Ctrl-r followed by " to insert the yanked text, type //g and press Enter
repeat for the remaining sequences.
If you’re using a unicode-compatible encoding (such as utf-8) and your font supports it, the smart quotes will show properly.
Additionally, the digraphs for them are 6', 6", 9', and 9". This makes it pretty easy to chain a couple of substitutes to swap them for straight variants:
%s/<C-k>6'\|<C-k>9'/'/g
Etc. Wrap it in a function or command to make it easier for later.
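If the file can be processed outside Vim, the same swap can be sketched in Python. This assumes the text has already been decoded correctly (for example as UTF-8), so the smart quotes are the Unicode characters U+2018, U+2019, U+201C and U+201D; the function name is made up for this example.

```python
# Map the four "smart" quotes to plain ASCII quotes.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

def straighten_quotes(text):
    """Return text with smart quotes replaced by their straight variants."""
    return text.translate(SMART_TO_STRAIGHT)

sample = "\u201cIt\u2019s fine,\u201d she said."
straight = straighten_quotes(sample)
```

To delete the double quotes instead, as the question asks, map those two entries to None rather than to '"'.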
Sorry to bump an old thread but I stumbled upon this late at night while trying to figure out how to remove the exact same characters from a bind9 configuration file that I had pasted in from a website. The aberrant characters were "~#~X", "~#~Y", " | ", and I believe another but I can't remember it at the moment. Anyway, regular expressions couldn't seem to find and replace using the above methods, but I was able to find a solution.
If you can set VIM to show the special characters in their binary representation, then you can use regex to find that. Here's how I did it:
Steps to fix
Open the file with the problem characters in VIM
(a) original method - :set encoding=latin1|set isprint=|set display+=uhex
(b) easier method - :set encoding=utf-8
NOTE: either of these should display the digraph characters in their binary form <<<>>>
(e.g. <80>, <99>, ... )
Then search and replace with VIM regex like so
:%s:\%xNN:':g #replace NN with byte code (i.e. 80, 99, etc.)
Let's break that command down, shall we:
%s: - the substitute command. The % at the start is the range (every line in the file) and the 's' stands for substitute. The ':' (colon) has been used as the delimiter in this case, but you can use other symbols to delimit the command.
\%x - matches the character whose code is given by the hex digits that follow (i.e. the two hex digits shown between the angle brackets)
NN - replace with the two chars inside of the <> that you're looking to replace in your file. In my case, the byte codes were <e2>, <80>, <99>, which I had to search for separately.
:' - then, the colon delimiting the replacement, where I'm specifying a single quote to replace the byte code; you could put whatever text you want here.
:g - finally, the last colon delimiter and the 'g' flag, which replaces every occurrence on a line rather than only the first one.
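The same byte-level idea can be sketched outside Vim in Python, working on raw bytes so that no decoding is involved. This assumes the file is UTF-8, where each smart quote is a three-byte sequence starting with e2 80, which matches the <e2>, <80>, <99> bytes mentioned above. Replacing the whole sequence at once avoids searching for the three bytes separately. The names below are made up for this example.

```python
# UTF-8 byte sequences of the four smart quotes, mapped to ASCII replacements.
REPLACEMENTS = {
    b"\xe2\x80\x98": b"'",   # U+2018 left single quotation mark
    b"\xe2\x80\x99": b"'",   # U+2019 right single quotation mark
    b"\xe2\x80\x9c": b'"',   # U+201C left double quotation mark
    b"\xe2\x80\x9d": b'"',   # U+201D right double quotation mark
}

def replace_quote_bytes(data):
    """Replace each smart-quote byte sequence in data with its ASCII quote."""
    for raw, plain in REPLACEMENTS.items():
        data = data.replace(raw, plain)
    return data

raw_line = b"zone \xe2\x80\x9cexample.com\xe2\x80\x9d"
fixed_line = replace_quote_bytes(raw_line)
```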
You can do more research in VIM's help with:
:help isprint
Anyway, I hope this helps someone else in the future.
References:
https://blog-en.openalfa.com/how-to-edit-non-printing-and-unicode-characters-in-vim-editor
https://unix.stackexchange.com/questions/108020/can-vim-display-ascii-characters-only-and-treat-other-bytes-as-binary-data
VIM How do I search for a <XX> single byte representation

Deleting every 2nd line from a file using Notepad++

I am looking for some regex help.
I have a textfile, nothing super important but I would like to delete every second line from it - I have tried following this guide: Delete every other line in notepad++
However I just can't get it to work, is the regex I am using ok? I am noob with regex
Find:
([^\n]*\n)[^\n]*\n
Replace with:
$1
No matter what I try (mouse position at the beginning, ctrl+a and Replace All) I just can't get it to work. I appreciate any help.
I've put the regex into here: http://regexpal.com/ and if I remove the final \n it highlights the individual rows.
Make sure you select regular expression for the search mode...
Also, you may want to make that final newline optional. In the case that there are an even number of lines and you do not have a trailing newline, it won't remove the last line.
([^\n]*\n)[^\n]*\n?
Update:
Note that Windows ends lines with \r\n instead of just \n. Try updating the expression to take this into account:
([^\r\n]*[\r\n]+)[^\r\n]*[\r\n]*
Final Update:
Thanks to @zx81, I now know that N++ uses PCRE so \R can be used for Unicode newline characters. However [^\R] won't work (this looks for anything except R literally), so you will need to keep [^\r\n]. This can be simplified as:
([^\r\n]*\R)[^\r\n]*\R?
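To try the expression outside Notepad++, here is a hedged Python sketch of the same find-and-replace. Python's re module has no \R, so an explicit alternation for the newline stands in for it; the names NEWLINE, EVERY_OTHER and drop_even_lines are made up for this example.

```python
import re

# Stand-in for \R: match \r\n first so a Windows line ending is consumed as one unit.
NEWLINE = r"(?:\r\n|\r|\n)"

# Group 1 is the line to keep; the following line and its optional newline are dropped.
EVERY_OTHER = re.compile(rf"([^\r\n]*{NEWLINE})[^\r\n]*{NEWLINE}?")

def drop_even_lines(text):
    """Keep lines 1, 3, 5, ... and delete lines 2, 4, 6, ..."""
    return EVERY_OTHER.sub(r"\1", text)

unix_text = "one\ntwo\nthree\nfour"
windows_text = "one\r\ntwo\r\nthree\r\nfour\r\n"
```

Calling drop_even_lines on either sample keeps "one" and "three" along with their original line endings.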

My Vim replace with a regex is throwing a `E488: Trailing characters`

I'm trying to find all instances of a Twitter handle, and wrap an anchor tag around them.
:%s/\(@[\w]\)/<a href="http://www.twitter.com/\1">\1<\/a>/gc
Which gives me:
E488: Trailing characters
If you have this when replacing within a selected block of text, it may be because you mistakenly typed %s when you should only type s
I had this happen by selecting a block, typing : and at the prompt :'<,'>, typing %s/something/other/ resulting in :'<,'>%s/something/other/ when the proper syntax is :'<,'>s/something/other/ without the percent.
When the separator character (/ in your case) between {pattern} and {string} is contained in one of those, it must be escaped with a \. A trick to avoid that is to use a different separator character, e.g. #:
:%s#@\(\w\+\)#<a href="http://www.twitter.com/\1">\0</a>#gc
PS: If it should do what I think it should do, your pattern is wrong; see my correction.
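As a way to check the intended result outside Vim, here is a small Python sketch of the same replacement. It assumes a handle is written as @ followed by word characters; the function name link_handles is made up for this example. In Python the replacement string needs no delimiter escaping, so the slashes in the URL are not a problem.

```python
import re

def link_handles(text):
    """Wrap every @handle in an anchor tag pointing at its Twitter page."""
    # \1 is the handle without the "@"; \g<0> is the whole match, "@" included.
    return re.sub(r"@(\w+)", r'<a href="http://www.twitter.com/\1">\g<0></a>', text)

linked = link_handles("thanks @vim_user!")
```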
I had this issue and couldn't make it go away until I found out that my .vimrc file had parts I had copied from elsewhere that contained abbreviations, like this for example:
abbrev gc !php artisan generate:controller
That abbreviation would mess up my search and replace commands which usually look like this:
:%s/foo/bar/gc
by expanding that gc into !php artisan generate:controller, except, that it wouldn't do it on the spot/ in real time. The way that I clued in was by looking through the command history (by pressing : and the up arrow) and seeing
:%s/foo/bar/!php artisan generate:controller
So if you're getting trailing character errors no matter what you do I'd look inside
~/.vimrc
and see if you can find the problem there.
I had the same problem.
Only using other delimiters didn't help. So, additionally
I didn't select any row.
And didn't use g for global.
so just
:%s#to_be_replaced#replacement#
did the job. Changed all occurrences of 'to_be_replaced' with 'replacement'.
:%s/\/apps/log_dir/g
Here the string to replace is /apps
and the replacement string is log_dir.
Because the pattern contains a /, which is also the delimiter, it has to be written as "\/".
This is how I caused my "E488: Trailing characters"
Occasionally the muscle memory in my brain skips a beat as it did today causing me to seek the reason for my E488: Trailing characters.
:%s/searchItem/changeTo/s
The s at the end caused my E488: Trailing characters.
I should have used a g
:%s/searchItem/changeTo/g
Placing the g at the end worked as always.

Grep is messing up my understanding

For sometime I have been trying to play with grep to retrieve data from files and I noticed something funny.
It might be my ignorance but here is what happens...
Suppose I have a file ABC. the data is:
a
abc
ab
bac
bb
ac
Now ran this grep command,
grep a* ABC
I found that the output contains lines starting with a as well as lines starting with b. Why is this happening?
You used 'a*' as your search pattern... the '*' means ZERO or more of the previous character, so a line starting with 'b' matches too, having ZERO or more 'a's in it.
On a semi-related note, I'd recommend quoting the 'a*' bit, since if you have ANY files in the current subdirectory which start with a, you'll be VERY surprised to see what you're really searching for, since the shell (bash,zsh,csh,sh,dash,wtfsh...) will perform wildcard expansion automatically BEFORE the command is executed.
if you want to search for lines which START with 'a', then you'll need to anchor the search pattern with a leading ^ character, so your pattern becomes '^a*', but again, the * means ZERO or more, so it's not useful in this situation where you only have one letter... use '^a' instead.
As a contrived example, if you wanted to find all the lines containing a 'c' AND those containing the letters 'bc', then you could use 'b*c' as the search pattern... meaning ZERO or more b's, and a c.
The power of the regex search pattern is immense, and takes some time to grok. Peruse the man pages for grep(1), regex(7), pcre(3), pcresyntax(3), pcrepattern(3).
Once you get the hang of them, regex's are useful in sed, grep, perl, vim, (probably emacs too), ... uh, it's late (early?) nothing more comes to mind, but they're VERY powerful.
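As a bonus, '*' means ZERO or more, '+' means ONE or more, and '?' means ZERO or ONE. With plain grep, '+' and '?' are only special when you use the -E option (or write them as '\+' and '\?' in GNU grep).
So to search for things with two or more a's... grep -E 'aa+', which is 1 a, and 1+ a (1 or more)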
I ramble.... (regex(7)!)
grep tries to find that pattern in the whole line. Use ^a to get line starting with a or ^a*$ to find lines containing only as (including the empty line).
also, please quote that shell argument (eg: '^a*$'), if you use a* and there is a file in the working directory starting with an a you will get very weird results...
Try this, it works for me. The ^ means beginning of a line - so it has to start with a.
grep ^a ABC
You need to put quotes around your pattern:
grep "a*" ABC
Otherwise the * is interpreted by the shell (which does wild-card filename matching), instead of by grep itself.
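The difference between the patterns discussed above can be checked with a short Python sketch that mimics grep's "match anywhere in the line" behaviour on the file from the question. The grep function here is a made-up stand-in, not the real tool, and Python's regex syntax is close to but not identical to grep's.

```python
import re

lines = ["a", "abc", "ab", "bac", "bb", "ac"]

def grep(pattern, candidates):
    """Return the candidates in which pattern matches anywhere, as grep does."""
    return [line for line in candidates if re.search(pattern, line)]

everything = grep(r"a*", lines)      # zero a's still counts as a match, so every line is selected
starts_with_a = grep(r"^a", lines)   # anchored: only lines beginning with a
only_as = grep(r"^a*$", lines)       # lines made of nothing but a's
```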

Convert line-broken paragraphs into single paragraphs? (Folding text?)

I have searched everywhere for an answer to this, but I think I must not be using the right lingo... I have text like this:
This text is actually just
one paragraph, but every
few words are broken to a
new line, and that's
annoying as hell, because
I have to go to each line
and fix it by hand...

Then there's a second
paragraph which does the
same thing.
I would like to convert that to:
This text is actually just one paragraph, but every few words are broken to a new line, and that's annoying as hell, because I have to go to each line and fix it by hand...
Then there's a second paragraph which does the same thing.
I've tried as many regex techniques as I could think of in TextMate, and can't find any macros or commands to re-wrap the text... The text in question is a result of content editors on one of my sites pasting from Word... I think they may even type this way (holdover from typewriter days!).
Based on your comment, there's probably something you can do with lookaheads. I tried it, but it didn't work (perhaps didn't try enough). So you can try to do this with a series of commands.
First replace any series of spaces with just a single space character:
:%s/ \+/ /g
Then replace all newlines with a space:
:%s/\n/ /g
Then replace all double spaces (which is what the blank line between paragraphs has become) with double newlines:
:%s/  /^M^M/g
The ^M can be obtained in vim by doing CTRL+V CTRL+M.
Or, you could even do:
:%s/  /\r\r/g
This is a little ghetto, but it should work :)
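If the text can be run through a script instead, the same three steps collapse into a few lines of Python. This is a minimal sketch that assumes paragraphs are separated by at least one blank line; the function name unwrap_paragraphs and the shortened sample text are made up for this example.

```python
import re

def unwrap_paragraphs(text):
    """Join the hard-wrapped lines of each paragraph into a single line."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # str.split() with no arguments collapses every run of whitespace,
    # including the single newlines inside a paragraph.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)

wrapped = (
    "This text is actually just\n"
    "one paragraph, but every\n"
    "few words are broken.\n"
    "\n"
    "Then there's a second\n"
    "paragraph."
)
unwrapped = unwrap_paragraphs(wrapped)
```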