Vim regex - quantifier

I have a problem with the following expression in Vim.
In the text AekwoeuwioeuwioeuwB_AewieuiwuiweuB-A32r3r3hruh3u2huB A32r3r3hruh3u2huB I would like to select each A...B string separately. This is achievable with A.*?B in Perl-style regex, but I have not been able to do it in Vim.

The syntax for the non-greedy "zero or more" match A.*?B in Vim is A.\{-}B. See :help /\{-.
An overview of the main differences to Perl's regular expression dialect can be found at :help perl-patterns.
Alternative
For simple patterns, the end delimiter can be excluded from the repeated character class, so that the default greedy matching works as well: A[^B]*B
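The difference between the two approaches can be checked outside Vim. The following is a minimal Python sketch run on the sample text from the question. Python's re module uses the Perl-style *? rather than Vim's \{-}, and the variable names are only illustrative.

```python
import re

text = "AekwoeuwioeuwioeuwB_AewieuiwuiweuB-A32r3r3hruh3u2huB A32r3r3hruh3u2huB"

# Non-greedy match: the Perl-style counterpart of Vim's A.\{-}B
lazy = re.findall(r"A.*?B", text)

# Greedy match, but the end delimiter is excluded from the repeated class
negated = re.findall(r"A[^B]*B", text)

# Plain greedy match: runs from the first A to the last B
greedy = re.findall(r"A.*B", text)

print(lazy)
print(negated)
print(greedy)
```

On this input the lazy and negated-class patterns find the same four strings, while the plain greedy pattern returns the whole text as a single match.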

Related

How to limit a negating (emacs-compatible) regex to the negated string

I am looking for a regex which does not match "CVS". I don't want to use any feature which cannot translate into Emacs regex.
So far I have (in Python syntax, because I want to show it on regex101.com):
(^[^C].*|^.[^V].*|^..[^S].*)
This regex does not match "CVS" - so far so good. Unfortunately it does not match "CVS and more" either, but it should match it.
How can I adjust my regex to match "CVS and anything after it", but still not match "CVS"? (I.e., how can I make the last test on the regex101.com page succeed?)

My understanding is that you want to match every individual line which isn't the line:
CVS
You're not too far off with your attempt.
Here is the regex in Python syntax on regex101.com:
^(?:[^C\n]|C[^V\n]|CV[^S\n]|CVS.).*|^CV?$
And here's an elisp regexp in the read syntax for strings:
"^\\(?:[^C\n]\\|C[^V\n]\\|CV[^S\n]\\|CVS.\\).*\\|^CV?$"
Note the newlines: \n in the read syntax stands for a literal newline character. So in string syntax it becomes:
^\(?:[^C
]\|C[^V
]\|CV[^S
]\|CVS.\).*\|^CV?$
n.b. You can use M-x re-builder to test these in Emacs.
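The Python form of the pattern can be checked line by line with a short script. This is only a sketch: the helper name matches_line and the sample lines are assumptions made for illustration, not part of the original answer.

```python
import re

# The Python-syntax regex from the answer above
PATTERN = re.compile(r"^(?:[^C\n]|C[^V\n]|CV[^S\n]|CVS.).*|^CV?$")

def matches_line(line):
    """Return True if the whole line is matched by the pattern."""
    return PATTERN.fullmatch(line) is not None

# Hypothetical sample lines
for line in ["CVS", "CVS and more", "C", "CV", "CVX", "src"]:
    print(repr(line), matches_line(line))
```

Only the exact line CVS is rejected. Shorter prefixes such as C and CV are accepted by the ^CV?$ branch, and anything longer than CVS is accepted by the CVS. branch.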

Need regular expression to replace before and after while keeping the numbers in the middle

I have the following:
itemid=44'>Red Flower</a>
I need it to be this:
_ITEMID_START_44_ITEMID_END_
Can this be done with regular expressions? I need to keep the id (44 in the example), and replace everything on the left with _ITEMID_START_ and everything on the right with _ITEMID_END_.
Note: The itemid is one or two digits, never more than two.
I found something about tagged regular expressions and backreferences which seems like it would work but the syntax is killing me.
I tried this (and other attempts):
Find What: ^(\bitemid=\b)^([0-9][0-9]^)\b'>\b[a-z]+\b</a>\b)
Replace With: ^(\b_ITEMID_START_\b^2^(\b_ITEMID_END_\b
I am using UltraEdit to do the find and replace in over 20,000 *.html files. Any help would be very much appreciated.

The solution by Casimir et Hippolyte and the first solution by Avinash Raj both work in UltraEdit when Perl is selected as the regular expression engine. The second search string by Avinash Raj works in UltraEdit only after removing the backslash to the left of the ' character.
UltraEdit has 3 regular expression engines: UltraEdit, Unix and Perl.
The search string in the question mixes UltraEdit and Perl regular expression syntax and therefore does not work.
With the UltraEdit regular expression engine:
Find what: itemid=^([0-9]+^)*</a>
Replace with: _ITEMID_START_^1_ITEMID_END_
With Unix or Perl regular expression engine:
Find what: itemid=([0-9]+).*</a>
Replace with: _ITEMID_START_\1_ITEMID_END_
Safer because it is non-greedy, but only available with the Perl regex engine:
Find what: itemid=(\d+).*?</a>
Replace with: _ITEMID_START_\1_ITEMID_END_
IDM has published power tips on tagged expressions for the UltraEdit regex engine and on backreferences for the Perl regex engine.
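Why the non-greedy variant is safer can be seen when two links share one line. The following Python sketch uses the Perl-style find and replace strings from above; the second link (itemid=7, Blue Flower) is a made-up example, and Python's re module stands in for UltraEdit's Perl engine, so behavior in UltraEdit itself is not verified here.

```python
import re

# First link is from the question; the second one is hypothetical
line = "itemid=44'>Red Flower</a> itemid=7'>Blue Flower</a>"

replacement = r"_ITEMID_START_\1_ITEMID_END_"

# Greedy: .* runs to the LAST </a> on the line and swallows the second link
greedy = re.sub(r"itemid=([0-9]+).*</a>", replacement, line)

# Non-greedy: .*? stops at the first </a>, so each link is replaced on its own
lazy = re.sub(r"itemid=(\d+).*?</a>", replacement, line)

print(greedy)
print(lazy)
```

With the greedy pattern the second item id is lost, while the non-greedy pattern keeps both.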

You can try this:
Find What: \bitemid=([0-9][0-9]?)'>[^<]*</a>
Replace With: _ITEMID_START_\1_ITEMID_END_
A replacement string is a normal string, and all the regex special characters (except for the backreference) lose their special meaning.
\b, the word boundary, is the position between a character from the \w character class (a shortcut for [A-Za-z0-9_]) and any other character.
Note: I can't try it with UltraEdit. If you get a literal \1, replace it with $1.

The regex below matches the whole line and captures only the digits that come right after itemid=. In the replacement, the whole line is replaced with _ITEMID_START_\1_ITEMID_END_ (\1 refers to the first captured group; the notation may vary between languages and tools).
.*(?<=\bitemid=)([0-9]{1,2}).*
And the substitution would be,
_ITEMID_START_\1_ITEMID_END_
If you only want to replace,
itemid=44'>Red Flower</a>
with
_ITEMID_START_44_ITEMID_END_
Then your regex would be,
\bitemid=([0-9]{1,2})\'>[^<]*<\/a>
And the substitution would be,
_ITEMID_START_\1_ITEMID_END_

very magic regex that will allow me to replace these images?

I'm using vim daily for manipulating text and to write code. However, every time I have to perform any substitution, or do any kind of regex work, it drives me crazy, and I have to switch to sublime. I'd like to know, what's the correct way of turning this:
<img src="whatever.png"/>
<img src="x.png"/>
into
<img src="<%= image_path("whatever.png") %>"/>
<img src="<%= image_path("x.png") %>"/>
In sublime, I can use this as the regex for search: src="(.*?.png)" and this as the regex for substitution: src="<%= asset_path("\1") %>". In vim, if I do this: :%s/\vsrc="(.*?.png)/src="<%= asset_path("\1") %>"/g I get:
E62: Nested ?
E476: Invalid command
What am I not doing right?

As @nhahtdh stated, Vim's dialect of regex uses \{-} as the non-greedy quantifier. If you use the very magic flag (\v) it is just {-}. Very magic also turns = into a quantifier (0 or 1, like ?), so the literal = has to be written as \=. The closing " belongs in the pattern as well, otherwise it ends up doubled in the result. So your command turns into:
:%s/\vsrc\="(.{-}.png)"/src="<%= image_path("\1") %>"/g
However, you didn't escape the . in .png, so:
:%s/\vsrc\="(.{-}\.png)"/src="<%= image_path("\1") %>"/g
But we can still do better! By using \zs and \ze we can avoid retyping the src=" bit. \zs and \ze mark the start and end of the match where the substitution will occur.
:%s/\vsrc\="\zs(.{-}\.png)"/<%= image_path("\1") %>"/g
However, we are still not done. If we carefully choose where to put \zs and \ze, we can use Vim's & in the substitution, which stands for the whole match (like $& in Perl). Now we don't need any capture groups, which removes the need for the very magic flag.
:%s/src="\zs.\{-}\.png\ze"/<%= image_path("&") %>/g
For more help see the following documentation:
:h /\zs
:h /\{-
:h s/\&
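For readers who know Perl-style regex better than Vim's, the final command can be mirrored in Python as a rough sketch. This is not Vim code: the lookbehind (?<=src=") and lookahead (?=") play the roles of \zs and \ze, \g<0> plays the role of &, and the variable names are illustrative.

```python
import re

html = '<img src="whatever.png"/>\n<img src="x.png"/>'

# Only the file name is part of the match, so the surrounding src="..." is
# left untouched, much like with \zs and \ze in Vim.
result = re.sub(
    r'(?<=src=").*?\.png(?=")',
    r'<%= image_path("\g<0>") %>',
    html,
)

print(result)
```

The output has each file name wrapped in <%= image_path("...") %>, matching the target shown in the question.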

According to the Vim documentation (:help /\{-), the syntax for the lazy quantifier in Vim is different from the syntax used in Perl-like regex.
Let me quote it:
*/\{-*
\{-n,m} matches n to m of the preceding atom, as few as possible
\{-n} matches n of the preceding atom
\{-n,} matches at least n of the preceding atom, as few as possible
\{-,m} matches 0 to m of the preceding atom, as few as possible
\{-} matches 0 or more of the preceding atom, as few as possible
{Vi does not have any of these}
n and m are positive decimal numbers or zero
*non-greedy*
If a "-" appears immediately after the "{", then a shortest match
first algorithm is used (see example below). In particular, "\{-}" is
the same as "*" but uses the shortest match first algorithm.

:%s/"\(.*\)"/"<%= image_path("\1") %>"/g
The double quotes are our main pattern. Everything we want to capture goes into a group \( \) so we can later refer to it via \1.
If you use very magic, you have to escape the =, thus \vsrc\="(.*)\.png". So using your way the answer is:
:%s/\vsrc\="(.*\.png)"/src="<%= image_path("\1") %>"/g
It's easy to see if you :set hlsearch and then play around with /. :)

Vim positive lookahead regex

I am still not so used to the vim regex syntax. I have this code:
rename_column :keywords, :textline_two_id_4, :textline_two_id_4
I would like to match the last id with a positive lookahead in Vim's regex syntax.
How would you do this?
\id@=_\d$
This does not work.
This Perl syntax works:
id(?=_\d$)
Edit - the answer:
/id\(_\d$\)\@=
Can someone explain the syntax?

If you check the Vim help, there is not much to explain (:h \@=):
\@= Matches the preceding atom with zero width. {not in Vi}
Like "(?=pattern)" in Perl.
Example matches
foo\(bar\)\@= "foo" in "foobar"
foo\(bar\)\@=foo nothing
This should match the last id:
/id\(_\d$\)\@=
You can save some backslashes with "very magic":
/\vid(_\d$)@=
Actually, it looks more straightforward to use Vim's \zs and \ze:
id\ze_\d$
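The Perl-style form from the question can be checked with a small Python sketch, since Python's re module accepts the same (?=...) lookahead. Only the Perl-style pattern is exercised here, not the Vim ones, and the variable names are illustrative.

```python
import re

line = "rename_column :keywords, :textline_two_id_4, :textline_two_id_4"

# "id" only where it is followed by an underscore, a digit and the end of line;
# the lookahead itself is zero-width, so the match is just "id"
matches = list(re.finditer(r"id(?=_\d$)", line))

for m in matches:
    print(m.group(), m.start())
```

Only the last id on the line qualifies, because the earlier id_4 is followed by a comma rather than the end of the line.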

vim regex with meta-characters

I have the following in a text file:
This is some text for cv_1 for example
This is some text for cv_001 for example
This is some text for cv_15 for example
I am trying to use regex cv_.*?\s to match cv_1, cv_001, cv_15 in the text. I know that the regex works. However, it doesn't match anything when I try it in Vim.
Do we need to do something special in Vim?

The non-greedy modifier ? (as in *?) doesn't work in Vim; you should use:
cv_.\{-}\s
...instead of:
cv_.*?\s
Here's a quick reference for matching:
* (0 or more) greedy matching
\+ (1 or more) greedy matching
\{-} (0 or more) non-greedy matching
\{-n,} (at least n) non-greedy matching
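To confirm that the Perl-style pattern itself does what the question expects, here is a minimal Python sketch on the three sample lines. Python uses *? where Vim uses \{-}, and the variable names are illustrative.

```python
import re

text = (
    "This is some text for cv_1 for example\n"
    "This is some text for cv_001 for example\n"
    "This is some text for cv_15 for example"
)

# Non-greedy: stop at the first whitespace character after cv_
raw = re.findall(r"cv_.*?\s", text)

# \s is part of the match, so strip it to get the bare identifiers
names = [s.strip() for s in raw]

print(names)
```

Each match includes the trailing whitespace matched by \s, which is why the sketch strips it afterwards.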

Vim's regex syntax is a little different. What you're looking for is
cv_.\{-}\s
(the \{-} being the Vim equivalent of Perl's *?, i.e., non-greedy 0-or-more). See :help /\{- for details.
