non-greedy multiline search in vim - regex

I am trying to search a Ruby file and find all methods (before auto-replacing them later).
In Vim, I use the following regexp:
/\vdef.*(\n.*){-}end
However, even though I use "{-}", it selects the whole file's contents.

Vim uses \_. to match any character including the newline, which the ordinary . does not match:
/\vdef\_.{-}end

In my version of Vim, you need to escape the { (the pattern below does not use \v), as well as use \_. as other answers have said:
/def\_.\{-}end

Try the following regex:
/\vdef(\n|.){-}end
The .* was the culprit in your case.
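For comparison outside Vim, the same lazy, newline-spanning idea can be sketched in Python, where re.DOTALL plays the role of \_. and *? the role of {-}. The Ruby source below is made up purely for illustration. Like the Vim patterns above, the lazy pattern stops at the first "end" it sees, so a nested block inside a method would cut the match short.

```python
import re

# Hypothetical Ruby source, used only to illustrate the match.
ruby_source = """def first
  puts 'a'
end

def second
  puts 'b'
end
"""

# DOTALL lets "." cross newlines (like Vim's \_.); "*?" is lazy (like {-}).
methods = re.findall(r"def.*?end", ruby_source, re.DOTALL)

# Greedy version: runs from the first "def" to the last "end".
greedy = re.findall(r"def.*end", ruby_source, re.DOTALL)

print(len(methods), len(greedy))
```

The lazy pattern yields one match per method, while the greedy one swallows everything between the first def and the last end, which is the "selects the whole file" symptom from the question.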

Related

Search string enclosed in quotes in Vim

In Vim I need to search for all strings in quotes, e.g. 'foo'.
Does anyone see the problem in this regex? It fails with E486: Pattern not found: \'([^']*)'
:\/'([^']*)'
The first problem is that your use of find is a bit confusing. If you
just want to find, use /. The colon (which indicates command mode) is
not necessary. If you're using the find as a range (basically the same
thing, since / is just an empty command with a range) you can use the
colon, but either way escaping the first slash is not necessary.
The other main problem is that parentheses by default need to be escaped
if you mean a capturing group. All of this depends on your
'magic' option; reading the help for the /magic topic (you can do a
:h magic) is highly recommended. With "vanilla" Vim settings, the
regex you need looks like this:
/'\([^']*\)'
With very magic enabled (by using the \v atom) this can be simplified
to your original design:
/\v'([^']*)'
Alternatively you can use
\v'(\a+)'
This regex behaves similarly to yours, except when stray quotes (such as apostrophes) are encountered. In the text:
The user's first 'answer'.
the regex \v'(\a+)' will capture answer, while your original regex (as corrected by sidyll) \v'([^']*)' will match 's first '.
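The difference between the two patterns can be reproduced in Python as a rough sketch. Python has no \a class, so [A-Za-z] stands in for it here; that substitution is an assumption of this example, not part of the Vim answer.

```python
import re

text = "The user's first 'answer'."

# Anything that is not a quote: starts at the apostrophe in "user's".
any_non_quote = re.search(r"'([^']*)'", text)

# Letters only: the apostrophe match fails at the space, so the
# search moves on to the real quoted word.
letters_only = re.search(r"'([A-Za-z]+)'", text)

print(repr(any_non_quote.group(1)))
print(repr(letters_only.group(1)))
```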

how to extract a part of header in Fasta file by using Linux command

I have a FASTA file with unique headers, and I would like to extract a part of each header using a regular expression in Unix.
For example, my FASTA file starts with this header:
>jgi|Penbr2|47586|fgenesh1_pm.1_#_25
and I would like to extract just the last part of this header, like:
>fgenesh1_pm.1_#_25
I tried this regular expression in the Vim editor, but it did not work:
:%s/^([^|]+\|){3}//g
or
:%s/^([A-Z][0-9]+\|){3}//g
I would appreciate any suggestions.
You can use sed:
sed -e 's/>.*|/>/' fasta-file
i.e. everything from > through the last | is replaced by >.
I don't know if the leading > is also part of your text; I assume it is not.
Since you tagged the question with vim, I'll just post the Vim solution.
You can take advantage of the greediness of the regex:
In Vim:
%s/.*|//
will leave the last part; this is the easiest way.
In Vim you can use \zs, \ze and non-greedy matching too:
%s/\zs.\{-}\ze[^|]\+$//
Of course, if you like grouping, you can use \(...\) to group instead of using \zs and \ze.
In your code, you grouped with plain (...) without escaping. I don't know how you configured the magic setting in your vimrc; with the default, you have to escape ( and ) to give them their special meaning (grouping here), just as in BRE. Do a :h magic and find the table to see the differences.
In Vim, do :h terms to get detailed information.
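As a small sketch of the greedy approach outside Vim and sed, the same trimming can be done in Python. The header string is the one from the question; the variable names are only illustrative.

```python
import re

header = ">jgi|Penbr2|47586|fgenesh1_pm.1_#_25"

# Greedy ".*" runs to the last "|", so only the final field survives.
short = re.sub(r"^>.*\|", ">", header)

# Equivalent without a regex: split once from the right.
short_split = ">" + header.rsplit("|", 1)[-1]

print(short)
```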

Vim regex to substitute/escape pipe characters

Let's suppose I have a line:
a|b|c
I'd like to run a regex to convert it to:
a\|b\|c
In most regex engines I'm familiar with, something like s%\|%\\|%g should work. If I try this in Vim, I get:
\|a\||\|b\||\|c
As it turns out, I discovered the answer while typing up this question. I'll submit it with my solution, anyway, as I was a bit surprised a search didn't turn up any duplicates.
Vim has its own regex syntax. There is a comparison with PCRE in the Vim help (see :help perl-patterns).
Besides that, Vim has nomagic/magic/very magic modes. See :h magic to check the table.
By default, Vim uses magic mode. If you want to make the :s command in your question work, just activate very magic:
:s/\v\|/\\|/g
Vim does the opposite of PCRE in this regard: | is a literal pipe character, with \| serving as the alternation operator. I couldn't find an appropriate escape sequence because the pipe character does not need to be escaped.
The following command works for the line in my example:
:. s%|%\\|%g
If you use very magic (\v), you'll get the Perl/PCRE behaviour for most special characters (excluding the Vim specifics):
:s#\v\|#\\|#g
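For contrast with the PCRE-style behaviour mentioned above, here is a minimal Python sketch of the same substitution. In Python, as in PCRE, a bare | is alternation, so it is the pattern that needs the backslash.

```python
import re

line = "a|b|c"

# Pattern: "\|" is a literal pipe. Replacement template: "\\" is one
# literal backslash, followed by a literal pipe.
escaped = re.sub(r"\|", r"\\|", line)

# The same result with plain string replacement, no regex involved.
plain = line.replace("|", "\\|")

print(escaped)
```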

regexIssueTracker not working in CruiseControl.net

I am trying to get an issueUrlBuilder to work in my CruiseControl.NET config, and cannot figure out why it isn't working.
The first one I tried is this:
<cb:define name="issueTracker">
<issueUrlBuilder type="regexIssueTracker">
<find>^.*Issue (\d*).|\n*$</find>
<replace>https://issuetracker/ViewIssue.aspx?ID=$1</replace>
</issueUrlBuilder>
</cb:define>
Then, I reference it in the sourceControl block:
<sourcecontrol type="vaultplugin">
...
<issueTracker/>
</sourcecontrol>
My checkin comments look like this:
[Issue 1234] This is a test comment
I cannot find anywhere in the build reports/logs/etc. where that issue link is converted to a link. Is my regex wrong?
I've also tried the default issueUrlBuilder:
<cb:define name="issueTracker">
<issueUrlBuilder type="defaultIssueTracker">
<url>https://issuetracker/ViewIssue.aspx?ID={0}</url>
</issueUrlBuilder>
</cb:define>
Again, same comments and no links anywhere.
Does anyone have any ideas?
It looks like you're trying to match a potentially multiline comment by using .|\n instead of just ., which doesn't match newlines by default. Your first problem is that | has the lowest precedence of all regex constructs, so it's dividing your whole regex into the alternatives ^.*Issue (\d*). or \n*$. You would need to enclose the alternation in a group: (?:.|\n)*.
Another potential problem is that the lines might be separated by \r\n (carriage-return plus linefeed) instead of just \n. If CCNET uses the .NET regex engine under the hood, that won't be a problem because the dot matches \r. But that's not true of all flavors, and anyway, there's always a better way to match anything including newlines than (?:.|\n)*. I suggest you try
<find>^.*Issue (\d*)(?s:.*)$</find>
or
<find>(?s)^.*Issue (\d*).*$</find>
(?s) and (?s:...) are inline modifiers which allow the dot to match line separator characters.
EDIT: It looks like this is a known bug in CCNET. If the inline modifier doesn't work, try replacing . with [\s\S], as you would in a JavaScript regex. Example:
<find>^.*Issue (\d*)[\s\S]*$</find>
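The precedence problem and the [\s\S] workaround can be checked with a short Python sketch. Python's re module is not the .NET engine that CCNET may use, so this only illustrates how the patterns themselves divide up; the second line of the sample comment is made up for the example.

```python
import re

# First line is the check-in comment from the question; the second
# line is hypothetical, added to make the comment multiline.
comment = "[Issue 1234] This is a test comment\nSecond line of the comment"

# "|" splits the whole pattern in two: "^.*Issue (\d*)." OR "\n*$".
broken = re.search(r"^.*Issue (\d*).|\n*$", comment)

# [\s\S] matches any character, newlines included, with no modifier.
fixed = re.search(r"^.*Issue (\d*)[\s\S]*$", comment)

print(repr(broken.group(0)))
print(fixed.group(1))
```

The broken pattern stops right after the character following the issue number, while the fixed one consumes the whole comment and still captures the number in group 1.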

Multiline Regular Expression search and replace!

I've hit a wall. Does anybody know a good text editor that has search and replace like Notepad++ but can also do multi-line regex search and replace? Basically, I am trying to find something that can match a regex like:
search oldlog\(.*\n\s+([\r\n.]*)\);replace newlog\(\1\)
Any ideas?
Notepad++ can now handle multi line regular expressions (just update to the latest version - feature was introduced around March '12).
I needed to remove all onmouseout and onmouseover statements from an HTML document and I needed to create a non-greedy multi line match.
onmouseover=.?\s*".*?"
Make sure you check the: [ ] . matches newline checkbox if you want to use the multi line match capability.
EditPad Pro has better regex capabilities than any other editor I've ever used.
Also, I suspect you have an error in your regex: [\r\n.] will match only carriage returns, newlines, and full stops. If you're trying to match any character (i.e. the "dot operator" plus CR and LF), try [\s\S] instead.
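The character-class point is easy to verify with a quick Python sketch; the sample text is arbitrary.

```python
import re

text = "first line\r\nsecond line"

# Inside [...], "." is a literal full stop, so letters are not matched.
literal_dot = re.fullmatch(r"[\r\n.]*", text)

# The class does match strings made only of CR, LF and full stops.
dots_only = re.fullmatch(r"[\r\n.]*", "..\r\n.")

# [\s\S] = whitespace or non-whitespace, i.e. any character at all.
any_char = re.fullmatch(r"[\s\S]*", text)

print(literal_dot is None, any_char is not None)
```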
My personal recommendation is IDM Computing's UltraEdit (www.ultraedit.com) - it can do regular expressions (both search and replace) with Perl, Unix and UltraEdit syntax. Multi-line matching is one of the capabilities in Perl regex mode in it.
It also has other nice search capabilities (e.g search in specific character column range, search in multiple files, search history, search favorites, etc...)
The Zeus editor can do multi-line search and replace.
I use Eclipse, which is free and which you may already have if you are a developer. '\R' acts as a platform-independent line delimiter. Here is an example of a multi-line search:
search:
\bibitem.(\R.)?\R?{([^{])}$\R^([^\].[^}]$\R.$\R.)
and replace:
\defcitealias{$2}{$3}
I'm pretty sure Notepad++ can do that now via the TextFX plugin (which is included by default). Hit Control-R in Notepad++ and have a play.
TextPad has good Regex search and replace capabilities; I've used it for a while and am pretty happy with it.
From the Features:
Powerful search/replace engine using UNIX-style regular expressions, with the power of editor macros. Sets of files in a directory tree can be searched, and text can be replaced in all open documents at once.
For more options than you could possibly need, check out "Notepad++ Alternatives" at AlternativeTo.net.
You can use the Python Script plugin for multiline regular expression search and replace:
- http://npppythonscript.sourceforge.net/docs/latest/scintilla.html?highlight=pymlreplace#Editor.pymlreplace
# This example replaces any <br/> that is followed by another on the next line (with optional spaces in between), with a single one
editor.pymlreplace(r"<br/>\s*\r\n\s*<br/>", "<br/>\r\n")
I use Notepad++ all the time, but its regex has always been a bit lacking.
Sublime Text is what you want.
EditPlus does a good job at search/replace using regex (including multiline)
You could use Visual Studio. Download Express for free if you don't have a copy.
VS's regex is non-standard, so you'd have to use \n:b+[\r\n] instead.
The latest version of UltraEdit has multiline find and replace w/ regex support.
Or if you're OK with using a more specialized regular expression tool for this, there's Regex Hero. It has the side benefit of being able to do everything on the fly. In other words, you don't have to click a button to test your regular expression because it's automatically tested after every keypress.
Personally, I'd use UltraEdit if I'm looking to replace text in multiple files. That way I can just select the files to replace as a batch and click Replace. But if I'm working with a single text file and I'm in need of writing a more complex regular expression then I'd paste it into Regex Hero and work with it there. That's because Regex Hero can save time when you see everything happen in real-time.
ED for Windows has two versions of regex and three sorts of cut and paste (selection, lines or blocks), and you can shift from one to the next with just a mouse click while you are highlighting (unlike UltraEdit, which is clunky at best); there is no need to pull down a menu. The sheer speed of getting the job done is incredible; like reading on a Kindle, you don't have to think about it.
You can use a recent version of Notepad++ (Mine is 6.2.2).
There is no need to use the ". matches newline" option suggested in another answer. Instead, use an appropriate regular expression with ^ for "beginning of line" and $ for "end of line". Then use \r\n after the $ for a "new line" in a DOS file (or just \n in a Unix file, as the carriage return is mainly used in DOS/Windows text files):
Example: to remove all lines starting with the tag OBJE after a line starting with the tag UID (from a GEDCOM file, used in genealogy), I used the following search regex:
^UID (.*)$\r\n^(OBJE (.*)$\r\n)+
And the following replace value:
UID \1\r\n
This is matching lines like this:
UID 4FBB852FB485B2A64DE276675D57A1BA
OBJE #M4#
OBJE #M3#
OBJE #M2#
OBJE #M1#
and the output of the replacement is
UID 4FBB852FB485B2A64DE276675D57A1BA
550 instances have been replaced in less than 1 sec. Notepad++ is really efficient!
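The same line-oriented replacement can be sketched in Python. Two things here are assumptions of the sketch rather than part of the Notepad++ recipe: the data uses Unix \n line endings, and the pattern drops the $ anchors, because Python's $ only matches directly before \n and would not sit in front of a \r\n pair.

```python
import re

# Sample lines taken from the example above, with "\n" line endings.
gedcom = (
    "UID 4FBB852FB485B2A64DE276675D57A1BA\n"
    "OBJE #M4#\n"
    "OBJE #M3#\n"
    "OBJE #M2#\n"
    "OBJE #M1#\n"
)

# MULTILINE makes ^ match at every line start; "." stops at "\n",
# so each ".*" stays within one line.
pattern = re.compile(r"^UID (.*)\n(?:OBJE .*\n)+", re.MULTILINE)

# Keep the UID line and drop the run of OBJE lines that follows it.
cleaned = pattern.sub(r"UID \1\n", gedcom)

print(cleaned, end="")
```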
Otherwise, to validate a regular expression I like to use the .NET regex tester (http://regexhero.net/tester/). It's great for writing and testing a regex on the fly.
PS: You can also use [\s\S] in your regex to match any character including newlines. So, if you are looking for any block of multi-line text starting with "xxx" and ending with "abc", the following regex will do: ^xxx[\s\S]*?abc$ where "*?" matches as little as possible between xxx and abc.
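A minimal Python sketch of that last pattern, using made-up sample text, shows the difference the lazy quantifier makes. re.MULTILINE is needed so that ^ and $ anchor at line boundaries.

```python
import re

# Hypothetical text with two lines ending in "abc".
text = "xxx start\nmiddle abc\nlater abc\n"

# Lazy: stops at the first line that ends with "abc".
lazy = re.search(r"^xxx[\s\S]*?abc$", text, re.MULTILINE)

# Greedy: extends to the last line that ends with "abc".
greedy = re.search(r"^xxx[\s\S]*abc$", text, re.MULTILINE)

print(repr(lazy.group(0)))
print(repr(greedy.group(0)))
```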