Adding Line Break After pattern in VIM - regex

I have a css file and I want to add an empty line after every }.
How can I do this in Vim?

A substitution would work nicely.
:%s/}/\0\r/g
Replace } with the whole match \0 and a new line character \r.
or
:%s/}/&\r/g
Here & is an alternative way to refer to the whole match. It looks a bit funny in my opinion, but Vim golfers like it because it saves them a keystroke :)
\0 or & in the replacement part of the substitution acts as a special character. During the substitution, the whole string that was matched replaces the \0 or the &.
We can demonstrate this with a more complex search and replace -
Which witch is which?
Apply a substitution -
:s/[wW][ih][ti]ch/The \0/g
Gives -
The Which The witch is The which?
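For comparison outside Vim, here is a minimal sketch of the same whole-match idea using Python's re module. The helper name prefix_matches is made up for this example; m.group(0) plays the role of \0 or &.

```python
import re

def prefix_matches(text: str, pattern: str, prefix: str) -> str:
    """Put `prefix` in front of every match of `pattern` (hypothetical helper)."""
    # m.group(0) is the whole matched string, like \0 or & in a Vim replacement.
    return re.sub(pattern, lambda m: prefix + m.group(0), text)

result = prefix_matches("Which witch is which?", r"[wW][ih][ti]ch", "The ")
print(result)  # The Which The witch is The which?
```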

The answer is :%s/}/}\r/ I guess.

:%s/pre/cur\r/g
%: operate on the entire buffer.
pre (previous pattern): the pattern that will be replaced.
cur (current pattern): the text that replaces the previous pattern.
\r: new line.
g: repeat for every match on a line (default is to just replace the first).
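To see what the g flag changes, here is a rough Python sketch that imitates the substitution line by line. The function name substitute and the sample CSS are assumptions made up for illustration; this is not how Vim implements it.

```python
import re

def substitute(text: str, pattern: str, replacement: str, every_match: bool = True) -> str:
    """Imitate :%s/pattern/replacement/[g] by processing each line on its own."""
    # count=0 replaces every match on the line (like g);
    # count=1 replaces only the first match on the line (no g).
    count = 0 if every_match else 1
    lines = text.split("\n")
    return "\n".join(re.sub(pattern, replacement, line, count=count) for line in lines)

css = "a{color:red}b{color:blue}\np{margin:0}"
with_g = substitute(css, r"\}", "}\n", every_match=True)
without_g = substitute(css, r"\}", "}\n", every_match=False)
```

With the two rules on the first line, only with_g gets a break after both closing braces.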

Related

Regular expression is not matching new lines

I have the following reg ex:
"^((?!([\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*#((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3})))).)*$"
But it's not matching new lines as well:
https://regex101.com/r/nT6wK0/1
Any ideas how to make it match when there is a new line?
The . at the end actually means
All but a line break character. (source)
By replacing it with [\S\s], it means
All spacing characters and all non-spacing characters; so all characters.
Then it seems to work. You could have used other variants like [\W\w], [\D\d],...
So the "correct" regex (please don't take my word for it, first test this) is:
^((?!([\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*#((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))))[\S\s])*$
regex101 demo.
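A small Python sketch of the difference, using a made-up two-line string instead of the full pattern above. It also shows the DOTALL flag, which is Python's way of making . match line breaks; other engines may name that option differently.

```python
import re

text = "first line\nsecond line"

# '.' does not match '\n' by default, so the whole text cannot match.
dot_only = re.fullmatch(r".*", text)

# [\s\S] means "whitespace or non-whitespace", i.e. any character at all.
any_char = re.fullmatch(r"[\s\S]*", text)

# Same effect through a flag instead of a character class.
dotall = re.fullmatch(r".*", text, re.DOTALL)

print(dot_only is None, any_char is not None, dotall is not None)
```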
Assuming that you only want to match the first line, you can add the multiline option (/m) to include the newline.
If you want the second line to be included you'll need to read ahead an extra line. How you do that depends on the regex engine: N in sed; getline in awk; -n in perl; ...

specify pattern at the beginning of string in regular expression

I have some string with multiple possible values:
e
(space)Exact
Exact
exact
phase
I want to get only the first four values, the regular expression I came up with is:
^\s*e
It means that at the beginning of the string there is 0 or more whitespace followed by e (or E, case insensitive). However, it always filters out the case
(space)Exact
My guess is that it takes ^ as "not" instead of as the beginning of the string. How can I correct that? I use Perl Compatible Regular Expressions (PCRE) as the matching engine.
Try using mode modifiers in your regex to make ^ and $ match at line breaks and, if necessary, to make matching case insensitive:
(?mi)^\s*e
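Here is a quick sketch of what the m modifier changes, written with Python's re module rather than PCRE, and assuming all five values sit in one string separated by newlines (that layout is a guess, the question does not say).

```python
import re

# Assumed layout: every candidate value on its own line of a single string.
text = "e\n Exact\nExact\nexact\nphase"

# Without m, ^ matches only at the very start of the string.
start_only = re.findall(r"(?i)^\s*e", text)

# With m, ^ also matches right after each line break.
each_line = re.findall(r"(?mi)^\s*e", text)

print(start_only)  # ['e']
print(each_line)   # ['e', ' E', 'E', 'e']
```

The line "phase" is never picked up, because its e is not at the start of the line.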
The ^ character means only the beginning of a string. The beginning of a new line does not count as the beginning of a string. So this would not work if more than one value is inside the same "string" object. Not sure how PCRE works, but if you want to be able to match the beginning of a line as well, you have to have the multi-line flag enabled.
Edit: If you want to pick up the beginning of a new line, go this route instead: put \r\n at the beginning of the expression and remove the "^".
Edit #2 (because I feel like doing regex): here's what you're looking for:
(\b)[eE]+\w*

Regex optimization: negative character class "[^#]" nullifies multiline flag "m"

I'm trying to parse a text line by line, catching everything EXCEPT what's after a specific marker, # for example. No escaping to take into account, pretty basic.
For instance, if the input text is:
Multiline input text
Mid-sentence# cut, this won't be matched
Hey there
I want to retrieve
['Multiline input text',
'Mid-sentence',
'Hey There']
This is working fine with /(.*?)(?:#.*$|$)/mg (even though there are a few empty matches). However, if I try to improve the regex (by avoiding backtracking and getting rid of empty matches) with /([^#]++)(?:#.*$|$)/mg, it returns
[
"Multiline input text
Mid-sentence",
"
Hey There"
]
As if [^#] was including line breaks, even with the multiline flag on. As far as I can tell I can fix that by adding \n\r to the character class, as in [^#\n\r], but this makes the multiline option kind of useless and I'm afraid it could break on some weird line breaks in some environments/encodings.
Would any of you know the reason for this behavior, and if there's another workaround? Thanks!
Edit
Originally, it happens in PCRE. But even in Javascript with /([^#]+)(?:#.*$|$)/mg, same unwanted multiline behavior. I know I could probably use the language to parse the text line by line, but I'd like to do it with regex only.
It seems you got your definition of /m wrong. The only thing this flag does is change what ^ and $ match, so that they also match at the beginning and end of each line respectively. It does not affect anything else. If you don't want to match line breaks, you should do as you suggested and use [^#\n\r].
The regex that will work for you is:
^(.*?)(?:#.*|)$
Online Demo: http://regex101.com/r/aP8eV6
The difference is the use of .*? instead of [^#]+.
[^#]+ by definition matches anything but # and that includes newlines as well.
multiline flag m only lets you use line start/end anchors ^ and $ in multiline inputs.
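The behaviour is easy to reproduce. Below is a small sketch in Python (not PCRE or JavaScript, so the possessive ++ from the question is left out) that runs the three variants on the sample input. The variable names are mine.

```python
import re

text = "Multiline input text\nMid-sentence# cut, this won't be matched\nHey there"

# [^#] happily matches '\n', so a match can run across line breaks.
spans_lines = re.findall(r"([^#]+)(?:#.*$|$)", text, re.MULTILINE)

# Excluding line breaks from the class keeps each match on one line.
per_line_class = re.findall(r"([^#\n\r]+)(?:#.*$|$)", text, re.MULTILINE)

# Anchoring with ^ and using lazy .*? (which never matches '\n') also works.
per_line_anchor = re.findall(r"^(.*?)(?:#.*|)$", text, re.MULTILINE)

print(spans_lines)
print(per_line_class)
print(per_line_anchor)
```

In all three calls the flag only affects where ^ and $ may match; what the character class accepts stays the same.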

Regex Question: how do I replace a single space with a newline in VI

Regex Question: how do I replace a single space with a newline in VI.
:%s/ /^V^M/g
note: hit ctrl-v, ctrl-m.
edit: if you want to spell out "exactly one space", you can add a count. Note that \{1\} still matches each space in a run of spaces one at a time, so this behaves the same as the command above:
:%s/ \{1\}/^V^M/g
and if you only want the first space on each line replaced, drop the g flag:
:%s/ /^V^M/
Just do the following in command mode:
:%s/ /\r/gic
gic in the end means:
- g: replace all occurrences in the same line (not just the first).
- i: case insensitive (not really helpful here but good to know).
- c: prompt for confirmation (nice to have to avoid you having to do immediate undo if it goes wrong :) ).
\([^ ]\|^\) \([^ ]\|$\) will find lone spaces only, if that's what you need.
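For anyone who wants to try the "every space" versus "lone spaces only" distinction outside of Vim, here is a minimal Python sketch. It uses lookarounds instead of the groups above so that the neighbouring characters are not part of the match; the sample string is made up.

```python
import re

sample = "a b  c d"

# Every space becomes a line break, including both spaces of the double space.
all_spaces = re.sub(r" ", "\n", sample)

# Only a space with no space directly before or after it is replaced.
lone_spaces = re.sub(r"(?<! ) (?! )", "\n", sample)

print(repr(all_spaces))   # 'a\nb\n\nc\nd'
print(repr(lone_spaces))  # 'a\nb  c\nd'
```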

Regular Expression to get comments in VB.Net source code

I have a syntax highlighting function in vb.net. I use regular expressions to match "!IF" for instance and then color it blue. This works perfect until I tried to figure out how to do comments.
In the language I'm writing this for, a comment either takes up a whole line that starts with a single quote ', or starts anywhere in a line at two single quotes '':
'this line is a comment
!if StackOverflow = "AWESOME" ''this is also a comment
Now I know how to see if it starts with a single quote (^'), but I need it to return the string all the way to the end of the line so I can color the entire comment green and not just the single quotes.
You shouldn't need the code but here is a snippet just in case it helps.
For Each pass In frmColors.lbRegExps.Items
    RegExp = System.Text.RegularExpressions.Regex.Matches(LCase(rtbMain.Text), LCase(pass))
    For Each RegExpMatch In RegExp
        rtbMain.Select(RegExpMatch.Index, RegExpMatch.Length)
        rtbMain.SelectionColor = ColorTranslator.FromHtml(frmColors.lbHexColors.Items(PassNumber))
    Next
    PassNumber += 1
Next
Something along the lines of:
^(\'[^\r\n]+)$|(''[^\r\n]+)$
should give you the commented line (or part of the line) in group n° 1
Actually, you do not even need a group
^\'[^\r\n]+$|''[^\r\n]+$
If it finds something, it is a comment.
"(^'|'').*$"
mentioned by Boaz would work if applied only line by line (which may be your case).
For multi-line detection, you must be sure to avoid the 'Dotall' mode, where '.' stands also for \r and \n characters. Otherwise that pattern would match both your lines entirely.
That is why I generally prefer [^\r\n] to '.': it avoids any dependency on the mode of the pattern. Even in 'Dotall' mode, it still works and avoids trying any match on the next line.
While the above would work you can simplify it:
"(^'|'').*$"
As VonC mentions - this would only work if you feed the Regex one line at a time. For multi line mode use:
"(^'|'').*?$"
The ? makes the * operator non-greedy, forcing the regex to match a single line.
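The mode dependence discussed above can be checked with a short sketch. It is written in Python rather than VB.Net, on the two sample lines from the question; re.MULTILINE and re.DOTALL are Python's names for the multi-line and 'Dotall' modes.

```python
import re

text = (
    "'this line is a comment\n"
    "!if StackOverflow = \"AWESOME\" ''this is also a comment"
)
flags = re.MULTILINE | re.DOTALL

# Multi-line mode only: '.' stops at the line break, one match per comment.
plain = re.findall(r"(?:^'|'').*$", text, re.MULTILINE)

# With Dotall as well, greedy .* runs to the very end of the text.
greedy_dotall = re.findall(r"(?:^'|'').*$", text, flags)

# Lazy .*? stops at the first end of line, even in Dotall mode.
lazy_dotall = re.findall(r"(?:^'|'').*?$", text, flags)

# [^\r\n] never crosses a line break, whatever the mode.
char_class = re.findall(r"(?:^'|'')[^\r\n]*$", text, flags)
```

Only greedy_dotall comes back as a single match covering both lines; the other three each return the two comments separately.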
Using the regex pattern: REM((\t| ).*$|$)|^\'[^\r\n]+$|''[^\r\n]+$
see more https://code.msdn.microsoft.com/How-to-find-code-comments-9d1f7a29/