Replace while keeping certain "words" in vi/vim - regex

For example, suppose I have $asd['word_123'] and want to replace it with $this->line('word_123'), keeping the 'word_123'. How could I do that?
By using this:
%s/asd\[\'.*\'\]/this->line('.*')/g
I will not be able to keep the wording in between. Please enlighten me.

Using regex, you could do something like :%s/\$asd\['\([^']*\)'\]/$this->line('\1')/g
Step by step:
%s - substitute on the whole file
\$asd\[' - match "$asd['". Notice the $ and [ need to be escaped since these have special meaning in regex.
\([^']*\) - the \( \) can be used to select what's called an "atom" so that you can use it in the replacement. The [^'] means anything that is not a ', and * means match 0 or more of them.
'\] - finishes our match.
$this->line('\1') - replaces with what we want, and \1 replaces with our matched atom from before.
g - do this for multiple matches on each line.
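For readers outside Vim, the same capture-and-reuse idea can be sketched in Python's re module. This is only an illustration: the names PATTERN and convert are made up for the example, and Python writes groups as plain ( ) rather than \( \).

```python
import re

# Same pattern as the Vim command, in Python syntax:
# \$ and \[ are escaped, ([^']*) captures everything between the quotes.
PATTERN = re.compile(r"\$asd\['([^']*)'\]")

def convert(line: str) -> str:
    # \1 in the replacement stands for the text captured by the group.
    return PATTERN.sub(r"$this->line('\1')", line)

print(convert("echo $asd['word_123'];"))
```

Like the g flag in Vim, re.sub replaces every match on the line, not just the first.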
Alternative (macro)
Instead of regex you could also use a macro. For example,
qq/\$asd<Enter>ct'$this->line(<Esc>f]r)q
then @q as many times as you need. You can also use @@ after you've used @q once, or 80@q if you want to run it 80 times.
Alternative (:norm)
In some cases, using :norm may be the best option. For example, if you have a short block of code and you're matching a unique character or position. If you know that "$" only appears in "$asd" for a particular block of code you could visually select it and
:norm $T$ct'this->line(<C-v><Esc>f]r)<Enter>
For a discourse on using :norm more effectively, read :help :norm and this reddit post.

Try using
:%s/\$asd\[\'\([^\']\+\)\'\]/$this->line('\1')/g

Related

vs code replace two different sides of something

suppose I have code that looks something like
myFunc(someInput);
and suppose I run this function in many different places, on many different inputs (someInput could be various things, and I need to preserve whatever it is).
All of a sudden I realize I need to perform another function on the input. So I would like to replace every instance with
myFunc(nutherFunc(someInput));
I could run a replace of myFunc( with myFunc(nutherFunc( but would have to manually close the nutherFunc call everywhere. Is there a way, using regex or otherwise, that I can replace myFunc( with myFunc(nutherFunc( AND ) with )) while preserving the input?
Said another way, can I say "replace these two character sets but keep what is in between them"?
You can use a regex with a capture group to accomplish this. I'd recommend
myFunc\( # match the literal characters "myFunc(" (We have to escape the paren)
(\w+) # capture group so we can refer to the argument of `myFunc` in the replacement
\) # a literal close paren
with a replacement of
myFunc(nutherFunc($1))
Where the $1 represents the group that was captured between parens.
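To try the idea outside the editor, here is a minimal Python sketch of the same capture-group replacement. The helper name wrap_argument is hypothetical, and Python's re module writes the reference as \1 instead of $1. As with the pattern above, (\w+) only covers simple identifier-like arguments.

```python
import re

def wrap_argument(source: str) -> str:
    # myFunc\(  literal "myFunc("
    # (\w+)     the argument, captured as group 1
    # \)        literal ")"
    return re.sub(r"myFunc\((\w+)\)", r"myFunc(nutherFunc(\1))", source)

print(wrap_argument("myFunc(someInput);"))
```

An argument such as a.b or a nested call is left untouched by this pattern, so a broader character class would be needed for those cases.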
Here's a video: https://clip.brianschiller.com/wjuWYgc-2019-12-17-replace.mp4

Regex to find two words on the page

I'm trying to find all pages which contain words "text1" and "text2".
My regex:
text1(.|\n)*text2
it doesn't work..
If your IDE supports the s (single-line) flag (so the . character can match newlines), you can search for your items with:
(text1).*(text2)|\2.*\1
If the IDE does not support the s flag, you will need to use [\s\S] in place of .:
(text1)[\s\S]*(text2)|\2[\s\S]*\1
Some languages use $1 and $2 in place of \1 and \2, so you may need to change that.
EDIT:
Alternately, if you want to simply match that a file contains both strings (but not actually select anything), you can utilize look-aheads:
(?s)^(?=.*?text1)(?=.*?text2)
This doesn't care about the order (or number) of the arguments, and for each additional text that you want to search for, you simply append another (?=.*?text_here). This approach is nice, since you can even include regex instead of just plain strings.
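A small Python sketch of the look-ahead approach, assuming the page content is available as a string. The names BOTH and contains_both are illustrative only.

```python
import re

# (?s) lets . match newlines; each look-ahead checks for one word
# starting from the beginning of the text without consuming anything.
BOTH = re.compile(r"(?s)^(?=.*?text1)(?=.*?text2)")

def contains_both(page: str) -> bool:
    return BOTH.search(page) is not None

print(contains_both("text2\nsomething\ntext1"))
print(contains_both("only text1 here"))
```

Because each look-ahead starts from the same position, the order of the two words in the page does not matter.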
text0[\s\S]*text1
Try this; it should do it for you.
What this does is match everything, including across multiple lines, similar to using .* with the s flag.
\s takes care of spaces, newlines and tabs.
\S takes care of any non-space character.
If you want the regex to match over several lines I would try:
text1[\w\W]*text2
Using . is not a good choice, because it usually doesn't match over multiple lines. Also, for matching single characters I think using square brackets is more idiomatic than using ( ... | ... )
If you want the match to be order-independent then use this:
(?:text1[\w\W]*text2)|(?:text2[\w\W]*text1)
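The difference between . and [\w\W] can be checked quickly in Python, used here only as a convenient test bench. The variable names are made up for the example.

```python
import re

page = "text1\nsome lines\ntext2"

# Without a dot-all flag, . stops at the newline, so this finds nothing.
dot_match = re.search(r"text1.*text2", page)

# [\w\W] matches any character at all, newlines included.
class_match = re.search(r"text1[\w\W]*text2", page)

# Order-independent version: try one order, then the other.
EITHER_ORDER = re.compile(r"(?:text1[\w\W]*text2)|(?:text2[\w\W]*text1)")

print(dot_match is None, class_match is not None)
print(EITHER_ORDER.search("text2\ntext1") is not None)
```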
Adding a response for IntelliJ
Building on @OnlineCop's answer, to swap the order of two expressions in IntelliJ, you would style the search as in the accepted response, but since IntelliJ doesn't allow a one-line version, you have to put the replace statement in a separate field. Also, IntelliJ uses $ to identify expressions instead of \.
For example, I tend to put my nulls at the end of my comparisons, but some people prefer it otherwise. So, to keep things consistent at work, I used this regex pattern to swap the order of my comparisons:
Notice that IntelliJ shows in a tooltip what the result of the replacement will be.
For me, text1*{0,}(text2){0,} works.
With {0,} you can decide to get your keyword zero or more times OR you set {1,x} to get your keyword 1 or x-times (how often you want).

Regular expression question

I have some text like this:
dagGeneralCodes$_ctl1$_ctl0
Some text
dagGeneralCodes$_ctl2$_ctl0
Some text
dagGeneralCodes$_ctl3$_ctl0
Some text
dagGeneralCodes$_ctl4$_ctl0
Some text
I want to create a regular expression that extracts the last occurrence of dagGeneralCodes$_ctl[number]$_ctl0 from the text above.
the result should be: dagGeneralCodes$_ctl4$_ctl0
Thanks in advance
Wael
This should do it:
.*(dagGeneralCodes\$_ctl\d\$_ctl0)
The .* at the front is greedy so initially it will grab the entire input string. It will then backtrack until it finds the last occurrence of the text you want.
Alternatively you can just find all the matches and keep the last one, which is what I'd suggest.
Also, specific advice will probably depend on what language you're doing this in. In Java, for example, you will need to use DOTALL mode so that . matches newlines, because ordinarily it doesn't. Other languages call this single-line mode (Ruby calls it multiline mode). JavaScript has a slightly different workaround for this, and so on.
You can use:
[\d\D]*(dagGeneralCodes\$_ctl\d+\$_ctl0)
I'm using [\d\D] instead of . to make it match new-line as well. The * is used in a greedy way so that it will consume all but the last occurrence of dagGeneralCodes$_ctl[number]$_ctl0.
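Both suggestions from this thread (greedy backtracking, and collecting all matches to keep the last) can be sketched in Python. The function names and the TEXT sample are only for illustration; the sample repeats the input from the question.

```python
import re

TEXT = """dagGeneralCodes$_ctl1$_ctl0
Some text
dagGeneralCodes$_ctl2$_ctl0
Some text
dagGeneralCodes$_ctl3$_ctl0
Some text
dagGeneralCodes$_ctl4$_ctl0
Some text
"""

PATTERN = r"dagGeneralCodes\$_ctl\d+\$_ctl0"

def last_by_backtracking(text):
    # The greedy [\d\D]* swallows everything, then backtracks
    # until the group can match, which is the last occurrence.
    m = re.search(r"[\d\D]*(" + PATTERN + ")", text)
    return m.group(1) if m else None

def last_by_findall(text):
    # Simpler alternative: collect every match and keep the final one.
    matches = re.findall(PATTERN, text)
    return matches[-1] if matches else None

print(last_by_backtracking(TEXT))
print(last_by_findall(TEXT))
```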
I really like using this Regular Expression Cheatsheet; it's free, a single page, and printed, fits on my cube wall.

Explain this Regular Expression please

Regular Expressions are a complete void for me.
I'm dealing with one right now in TextMate that does what I want it to do...but I don't know WHY it does what I want it to do.
/[[:alpha:]]+|( )/(?1::$0)/g
This is used in a TextMate snippet and what it does is takes a Label and outputs it as an id name. So if I type "First Name" in the first spot, this outputs "FirstName".
Previously it looked like this:
/[[:alpha:]]+|( )/(?1:_:/L$0)/g (it might have been \L instead)
This would turn "First Name" into "first_name".
So I get that the underscore adds an underscore for a space, and that the /L lowercases everything...but I can't figure out what the rest of it does or why.
Someone care to explain it piece by piece?
EDIT
Here is the actual snippet in question:
<column header="$1"><xmod:field name="${2:${1/[[:alpha:]]+|( )/(?1::$0)/g}}"/></column>
This regular expression (regex) format is basically:
/matchthis/replacewiththis/settings
The "g" setting at the end means do a global replace: every match is replaced, rather than only the first one.
Breaking it down further...
[[:alpha:]]+|( )
That matches a run of one or more alphabetic characters (the whole match is held in $0), or alternatively a single space (captured in group $1).
(?1::$0)
As Roger says, the ? indicates this part is a conditional. If group $1 matched, the match is replaced with the text between the two colons, in this case nothing. If $1 did not match, the match is replaced with the contents of $0, i.e. any run of alphabetic characters is output unchanged.
This explains why the spaces are removed in the first example, and the spaces get replaced with underscores in your second example.
In the second expression the \L is used to lowercase the text.
The extra question in the comment was how to run this expression outside of TextMate. Using vi as an example, I would break it into multiple steps:
:0,$s/ //g
:0,$s/\u/\L\0/g
The first part of the above commands tells vi to run a substitution starting on line 0 and ending at the end of the file (that's what $ means).
The rest of the expression uses the same sorts of rules as explained above, although some of the notation in vi is a bit custom - see this reference webpage.
I find RegexBuddy a good tool for me in dealing with regexs. I pasted your 1st regex in to Buddy and I got the explanation shown in the bottom frame:
I use it for helping to understand existing regexes, building my own, testing regexes against strings, etc. I've become better at regexes because of it. FYI I'm running it under Wine on Ubuntu.
It's searching for any alpha character that appears at least once in a row [[:alpha:]]+ or a space ( ).
/[[:alpha:]]+|( )/(?1::$0)/g
The (?1 is a conditional and used to strip the match if group 1 (a single space) was matched, or replace the match with $0 if group 1 wasn't matched. As $0 is the entire match, it gets replaced with itself in that case. This regex is the same as:
/ //g
I.e. remove all spaces.
/[[:alpha:]]+|( )/(?1:_:/\L$0)/g
This regex is still using the same condition, except now if group 1 was matched, it's replaced with an underscore, and otherwise the full match ($0) is used, modified by \L. \L changes the case of all text that comes after it, so \LABC would result in abc; think of it as a special control code.
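Python's re module has no (?1:...) conditional replacement, but the same logic can be approximated with a replacement callback. This is a rough sketch under assumptions: [A-Za-z] stands in for [[:alpha:]], which Python's re does not support, and the names LABEL, to_id and to_snake are invented for the example.

```python
import re

# One alternative matches a run of letters, the other captures a space.
LABEL = re.compile(r"[A-Za-z]+|( )")

def to_id(label: str) -> str:
    # Group 1 set -> it was a space, drop it; otherwise keep the match.
    return LABEL.sub(lambda m: "" if m.group(1) else m.group(0), label)

def to_snake(label: str) -> str:
    # Group 1 set -> underscore; otherwise the match, lowercased (like \L).
    return LABEL.sub(lambda m: "_" if m.group(1) else m.group(0).lower(), label)

print(to_id("First Name"))
print(to_snake("First Name"))
```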

Regex search and replace in VI

I have a document with lots of <swf...>.....</swf> in it. I would like to remove all these. Using vi when i type
:%s/\<swf[^\/swf>]+\/swf\>//g
I was hoping this would work, but it doesn't match anything.
You can remove all those from the buffer with this command:
:%s!<swf.\{-}/swf>!!
if you also have tags that might be split on two lines, you can add the \_ modifier to make . match newlines too:
:%s!<swf\_.\{-}/swf>!!
This assumes you want to remove both the tags and what they contain. If you just want to get rid of the tags and keep the content:
:%s!</\?swf.\{-}>!!
Notes:
you don't need to escape < or >
you can choose whatever pattern delimiter you wish: Vim will use the first character you put after the s in the substitute command: this takes away the need to escape forward slashes in your pattern
EDIT: extending my answer after your comment
This is exactly like :%s/STRING/REPLACE/g; I only used ! instead of / as the delimiter so that I don't have to escape the forward slashes in the pattern (see my second point above).
I didn't add the g flag at the end since I have had :set gdefault in my .vimrc forever (it makes Vim substitute all matches in a line by default instead of just the first, thus inverting the meaning of /g). Without gdefault, append g to each command above.
\{-} is the "ungreedy" version of the * quantifier, i.e. it matches 0 or more of the preceding atom but takes as few as possible. This makes sure that your search pattern extends to the first "closing tag" instead of the last.
HTH
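The effect of the non-greedy quantifier is easy to see outside Vim as well. The following Python sketch uses *? (Python's counterpart of Vim's \{-}) on a made-up sample string; the variable names are illustrative.

```python
import re

doc = "a<swf x=1>one</swf>b<swf>two</swf>c"

# Non-greedy: each match stops at the first closing tag.
lazy = re.sub(r"<swf.*?/swf>", "", doc, flags=re.DOTALL)

# Greedy: a single match runs from the first opening tag to the last closing tag.
greedy = re.sub(r"<swf.*/swf>", "", doc, flags=re.DOTALL)

# Strip only the tags and keep what is between them.
tags_only = re.sub(r"</?swf.*?>", "", doc)

print(lazy, greedy, tags_only)
```

re.DOTALL plays the role of \_. in Vim, letting the match cross line breaks.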
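The problem here is that [] is a character class, so [^\/swf>] says that between the opening and closing swf tags none of the characters /, s, w, f or > may appear, in any order. In addition, Vim needs \+ rather than + for "one or more" by default, and \< and \> are word-boundary anchors there, not literal angle brackets.
You could try a non-greedy match instead:
<swf.\{-}\/swf>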
Note that . does not allow newline by default.
I don't use Vim though, so I used this guide to discover the syntax. I hope it is correct.