Regex search and replace except for specific word - regex

I'm trying to search and replace some strings in my code with RegEx as follows:
replacing self.word with self.indices['word']
So for example, replace:
self.example
with
self.indices['example']
But it should work for all words, not just 'example'.
So the RegEx needed for this probably doesn't need the actual word in it.
It should actually just replace everything but the word itself.
What I tried is the following:
self.(.*?)
It works for searching all the strings that match 'self.' but when I want to change it I lose the word like 'example' so I can't change it to self.indices['example'].

What you need to do is a substitution using capturing groups. Depending on the language used some of the syntax might be slightly different but in general you would want something like this:
self\.([^\s]+)
replace with:
self.indices['\1']
https://regex101.com/r/R3xFZB/1
We are using the parans to caputre what is after self. and in the replace putting that captured data into \1 which is the first (and in this case) only captured group. For some languages you might need $1 or \\1 for the substitution variable.
As with most regexes this can be done many different ways using look arounds, etc. but I think for somebody new this is easy to read, understand and maintain. The comment #wnull made has a more proper regex leveraging a look behind that should also do the trick :)

Related

How to replace all lines based on previous lines in Notepad++?

I have an XML code:
<Line1>Matched_text Other_text</Line1>
<Line2>Text_to_replace</Line2>
How to tell Notepad++ to find Matched_text and replace Text_to_replace to Replaced_text? There are several similar blocks of code, with one exactly Matched _text and different Other_text and Text_to_replace. I want to replace all in once.
My idea is to put
Matched_text*<Line2>*</Line2>
in the Find field, and
Matched_text*<Line2>Replaced_text</Line2>
in the Replace field. I know that \1 in regex might be useful, but I don't know where to start.
The actual code is:
<Name>Matched_text, Other_text</Name>
<IsBillable>false</IsBillable>
<Color>-Text_to_replace</Color>
The regex you're looking for is something like the following.
Find: (Matched_text[\w,\s<>\/]*<Color>-).*(</Color>)
Replace: \1Replaced_text\2
Broken down:
`()` is how you tell regex that you want to keep things (for use in /1, /2, etc.), these are called capture groups in regex land.
`Matched_text[\w,\s<>\/]*` means you want your anchor `Matched_text` and everything after it up till the next part of the expression.
`<Color>-).*(</Color>)` Select everything between <Color>- and </Color> for replacement.
If you have any questions about the expression, I highly recommend looking at a regex cheatsheet.

Regex to find two words on the page

I'm trying to find all pages which contain words "text1" and "text2".
My regex:
text1(.|\n)*text2
it doesn't work..
If your IDE supports the s (single-line) flag (so the . character can match newlines), you can search for your items with:
(text1).*(text2)|\2.*\1
Example with s flag
If the IDE does not support the s flag, you will need to use [\s\S] in place of .:
(text1)[\s\S]*(text2)|\2[\s\S]*\1
Example with [\s\S]
Some languages use $1 and $2 in place of \1 and \2, so you may need to change that.
EDIT:
Alternately, if you want to simply match that a file contains both strings (but not actually select anything), you can utilize look-aheads:
(?s)^(?=.*?text1)(?=.*?text2)
This doesn't care about the order (or number) of the arguments, and for each additional text that you want to search for, you simply append another (?=.*?text_here). This approach is nice, since you can even include regex instead of just plain strings.
text0[\s\S]*text1
Try this.This should do it for you.
What this does is match all including multiline .similar to having .*? with s flag.
\s takes care of spaces,newlines,tabs
\S takes care any non space character.
If you want the regex to match over several lines I would try:
text1[\w\W]*text2
Using . is not a good choice, because it usually doesn't match over multiple lines. Also, for matching single characters I think using square brackets is more idiomatic than using ( ... | ... )
If you want the match to be order-independent then use this:
(?:text1[\w\W]*text2)|(?:text2[\w\W]*text1)
Adding a response for IntelliJ
Building on #OnlineCop's answer, to swap the order of two expressions in IntelliJ,you would style the search as in the accepted response, but since IntelliJ doesn't allow a one-line version, you have to put the replace statement in a separate field. Also, IntelliJ uses $ to identify expressions instead of \.
For example, I tend to put my nulls at the end of my comparisons, but some people prefer it otherwise. So, to keep things consistent at work, I used this regex pattern to swap the order of my comparisons:
Notice that IntelliJ shows in a tooltip what the result of the replacement will be.
For me works text1*{0,}(text2){0,}.
With {0,} you can decide to get your keyword zero or more times OR you set {1,x} to get your keyword 1 or x-times (how often you want).

EditPad: How to replace multiple search criteria with multiple values?

I did some searching and found tons of questions about multiple replacements with Regex, but I'm working in EditPadPro and so need a solution that works with the regex syntax of that environment. Hoping someone has some pointers as I haven't been able to work out the solution on my own.
Additional disclaimer: I suck with regex. I mean really... it's bad. Like I barely know wtf I'm doing.So that being said, here is what I need to do and how I'm currently approaching it...
I need to replace two possible values, with their corresponding replacements. My two searches are:
(.*)-sm
(.*)-rad
Currently I run these separately and replace each with simple strings:
sm
rad
Basically I need to lop off anything that comes prior to "sm" so I just detect everything up to and including sm, and then replace it all with that string (and likewise for "rad").
But it seems like there should be a way to do this in a single search/replace operation. I can do the search part fine with:
(.*)-sm|(.*)-rad
But then how to replace each with it's matching value? That's where I'm stuck. I tried:
sm|rad
but alas, that just becomes the literal complete string that is used for replacement.
Jonathan, first off let me congratulate you for using EPP Pro for regex in your text. It's my main text editor, and the main reason I chose it, as a regex lover, is that its support of regex syntax is vastly superior to competing editors. For instance Notepad++ is known for its shoddy support of regular expressions. The reason of course is that EPP's author Jan Goyvaerts is the author of the legendary RegexBuddy.
A picture is worth a thousand words... So here is how I would do your replacement. Just hit the "replace all button". The expression in the regex box assumes that anything before the dash that is not a whitespace character can be stripped, so if this is not what you want, we need to tune it.
Search for:
(.*)-(sm|rad)
Now, when you put something in parenthesis in Regex, those matches are stored in temporary variables. So whatever matched (.*) is stored in \1 and whatever matched (sm|rad) is stored in \2. Therefore, you want to replace with:
\2
Note that the replacement variable may be different depending on what programming language you are using. In Perl, for example, I would have to use $2 instead.

Notepad++ masschange using regular expressions

I have issues to perform a mass change in a huge logfile.
Except the filesize which is causing issues to Notepad++ I have a problem to use more than 10 parameters for replacement, up to 9 its working fine.
I need to change numerical values in a file where these values are located within quotation marks and with leading and ending comma: ."123,456,789,012.999",
I used this exp to find and replace the format to:
,123456789012.999, (so that there are no quotation marks and no comma within the num.value)
The exp used to find is:
([,])(["])([0-9]+)([,])([0-9]+)([,])([0-9]+)([,])([0-9]+)([\.])([0-9]+)(["])([,])
and the exp to replace is:
\1\3\5\7\9\10\11\13
The problem is parameters \11 \13 are not working (the chars eg .999 as in the example will not appear in the changed values).
So now the question is - is there any limit for parameters?
It seems for me as its not working above 10. For shorter num.values where I need to use only up to 9 parameters the string for serach and replacement works fine, for the example above the search works but not the replacement, the end of the changed value gets corrupted.
Also, it came to my mind that instead of using Notepad++ I could maybe change the logfile on the unix server directly, howerver I had issues to build the correct perl syntax. Anyone who could help with that maybe?
After having a little play myself, it looks like back-references \11-\99 are invalid in notepad++ (which is not that surprising, since this is commonly omitted from regex languages.) However, there are several things you can do to improve that regular expression, in order to make this work.
Firstly, you should consider using less groups, or alternatively non-capture groups. Did you really need to store 13 variables in that regex, in order to do the replacement? Clearly not, since you're not even using half of them!
To put it simply, you could just remove some brackets from the regex:
[,]["]([0-9]+)[,]([0-9]+)[,]([0-9]+)[,]([0-9]+)[.]([0-9]+)["][,]
And replace with:
,\1\2\3\4.\5,
...But that's not all! Why are you using square brackets to say "match anything inside", if there's only one thing inside?? We can get rid of these, too:
,"([0-9]+),([0-9]+),([0-9]+),([0-9]+)\.([0-9]+)",
(Note I added a "\" before the ".", so that it matches a literal "." rather than "anything".)
Also, although this isn't a big deal, you can use "\d" instead of "[0-9]".
This makes your final, optimised regex:
,"(\d+),(\d+),(\d+),(\d+)\.(\d+)",
And replace with:
,\1\2\3\4.\5,
Not sure if the regex groups has limitations, but you could use lookarounds to save 2 groups, you could also merge some groups in your example. But first, let's get ride of some useless character classes
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
We could merge those groups:
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
^^^^^^^^^^^^^^^^^^^^
We get:
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(,)
Let's add lookarounds:
(?<=\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(?=,)
The replacement would be \2\4\6\8.
If you have a fixed length of digits at all times, its fairly simple to do what you have done. Even though your expression is poorly written, it does the job. If this is the case, look at Tom Lords answer.
I played around with it a little bit myself, and I would probably use two expressions - makes it much easier. If you have to do it in one, this would work, but be pretty unsafe:
(?:"|(\d+),)|(\.\d+)"(?=,) replace by \1\2
Live demo: http://regex101.com/r/zL3fY5

regex to do a non-contiguous find and replace

Often, I run into situations where I'd like to find and replace two sides of a string, and leave the middle intact. Usually, the first part is easy enough to identify, but the second part is too common. Here is an example:
To change the code string ActiveDocument.Sections["SectionName"] to SectionName, it's easy to find the first part, but the latter "] is way too common without its relationship to ActiveDocument.Sections[". Obviously, if SectionName was a static string, this wouldn't matter, I can just find the whole code string, inclusive, to replace.
Is there a way to match both sides, skipping the middle, or can regex only find contiguous parts? Or maybe there's a way of doing what I want by temporarily storing what's found by .*? in an expression?
I'm using UltraEdit for my find/replace operation. I can also use Javascript to execute on the code as a string ie: codestring.replace()
This is going to depend on the tool you use, but in Javascript you can use $1 to return the first capture group.
So you can use the pattern ActiveDocument.Sections\["(.*?)"\] for "find" and $1 for "replace".
I tested this with the javascript regex tool at regular-expressions.info
.*\["(.*?)"\]worked for me in Python
import re
re.match(r'.*\["(.*?)"\]', 'ActiveDocument.Sections["SectionName"]').group(1)
'SectionName'
Based on an online test, it appears to work with Javascript as well.