How to replace all lines based on previous lines in Notepad++? - regex

I have an XML code:
<Line1>Matched_text Other_text</Line1>
<Line2>Text_to_replace</Line2>
How to tell Notepad++ to find Matched_text and replace Text_to_replace to Replaced_text? There are several similar blocks of code, with one exactly Matched _text and different Other_text and Text_to_replace. I want to replace all in once.
My idea is to put
Matched_text*<Line2>*</Line2>
in the Find field, and
Matched_text*<Line2>Replaced_text</Line2>
in the Replace field. I know that \1 in regex might be useful, but I don't know where to start.
The actual code is:
<Name>Matched_text, Other_text</Name>
<IsBillable>false</IsBillable>
<Color>-Text_to_replace</Color>

The regex you're looking for is something like the following.
Find: (Matched_text[\w,\s<>\/]*<Color>-).*(</Color>)
Replace: \1Replaced_text\2
Broken down:
`()` is how you tell regex that you want to keep things (for use in /1, /2, etc.), these are called capture groups in regex land.
`Matched_text[\w,\s<>\/]*` means you want your anchor `Matched_text` and everything after it up till the next part of the expression.
`<Color>-).*(</Color>)` Select everything between <Color>- and </Color> for replacement.
If you have any questions about the expression, I highly recommend looking at a regex cheatsheet.

Related

Regex: Trying to extract all values (separated by new lines) within an XML tag

I have a project that demands extracting data from XML files (values inside the <Number>... </Number> tag), however, in my regular expression, I haven't been able to extract lines that had multiple data separated by a newline, see the below example:
As you can see above, I couldn't replicate the multiple lines detection by my regular expression.
If you are using a script somewhere, your first plan should be to use a XML parser. Almost every language has one and it should be far more accurate compared to using regex. However, if you just want to use regex to search for strings inside npp, then you can use \s+ to capture multiple new lines:
<Number>(\d+\s)+<\/Number>
https://regex101.com/r/MwvBxz/1
I'm not sure I fully understand what you are trying to do so if this doesn't do it then let me know what you are going for.
You can use this find+replace combo to remove everything which is not a digit in between the <Number> tag:
Find:
.*?<Number>(.*?)<\/Number>.*
Replace:
$1
finally i was able to find the right regular expression, I'll leave it below if anyone needs it:
<Type>\d</Type>\n<Number>(\d+\n)+(\d+</Number>)
Explanation:
\d: Shortcut for digits, same as [1-9]
\n: Newline.
+: Find the previous element 1 to many times.
Have a good day everybody,
After giving it some more thought I decided to write a second answer.
You can make use of look arounds:
(?<=<Number>)[\d\s]+(?=<\/Number>)
https://regex101.com/r/FiaTKD/1

Replace wrong xml-comments with Regex

I am dealing with a bunch of xml files that contain one-line-comments like this: // Some comment.
I am pretty sure that xml comments look like this: <!-- Some comment -->
I would like to use a regular expression in the Atom editor to find and replace all wrong comment syntax.
according to this question, the comment can be found with (?<=\s)//([^\n\r]*) and replaced with something like <!--$1-->. There must be an error somewhere since clicking replace button leaves the comment as is, instaed of replacing it. Actually I can't even replace it with a simple character.
The find and replace works with a different regex in the "Find" field:
Find: name.*
Replace: baloon
Is there anything I can write in the "Find" and "Replace" field to achieve this transformation?
Atom editor search and replace currently does not support lookbehind constructs, like (?<=\s). To "imitate" it, you may use a capturing group with an alternation between start of string, ^, and a whitespace, \s.
So, you may use
Find What: (^|\s)//([^\n\r]+)
Replace With: $1<!--$2-->
See the regex demo. NOTE \s may match newlines, so you may probably want to use (^|[^\S\r\n])//([^\n\r]+) to avoid matching across line breaks.
If you do not need to check for a whitespace, just remove that first capturing group and use a mere:
Find What: //([^\n\r]+)
Replace With: <!--$1-->
See another regex demo.

EditPad: How to replace multiple search criteria with multiple values?

I did some searching and found tons of questions about multiple replacements with Regex, but I'm working in EditPadPro and so need a solution that works with the regex syntax of that environment. Hoping someone has some pointers as I haven't been able to work out the solution on my own.
Additional disclaimer: I suck with regex. I mean really... it's bad. Like I barely know wtf I'm doing.So that being said, here is what I need to do and how I'm currently approaching it...
I need to replace two possible values, with their corresponding replacements. My two searches are:
(.*)-sm
(.*)-rad
Currently I run these separately and replace each with simple strings:
sm
rad
Basically I need to lop off anything that comes prior to "sm" so I just detect everything up to and including sm, and then replace it all with that string (and likewise for "rad").
But it seems like there should be a way to do this in a single search/replace operation. I can do the search part fine with:
(.*)-sm|(.*)-rad
But then how to replace each with it's matching value? That's where I'm stuck. I tried:
sm|rad
but alas, that just becomes the literal complete string that is used for replacement.
Jonathan, first off let me congratulate you for using EPP Pro for regex in your text. It's my main text editor, and the main reason I chose it, as a regex lover, is that its support of regex syntax is vastly superior to competing editors. For instance Notepad++ is known for its shoddy support of regular expressions. The reason of course is that EPP's author Jan Goyvaerts is the author of the legendary RegexBuddy.
A picture is worth a thousand words... So here is how I would do your replacement. Just hit the "replace all button". The expression in the regex box assumes that anything before the dash that is not a whitespace character can be stripped, so if this is not what you want, we need to tune it.
Search for:
(.*)-(sm|rad)
Now, when you put something in parenthesis in Regex, those matches are stored in temporary variables. So whatever matched (.*) is stored in \1 and whatever matched (sm|rad) is stored in \2. Therefore, you want to replace with:
\2
Note that the replacement variable may be different depending on what programming language you are using. In Perl, for example, I would have to use $2 instead.

Remove text appearing after numbers in Notepad++ using regular expressions

I have a large text file which contains many timestamps. The timestamps look like this: 2013/11/14 06:52:38AM. I need to remove the last two characters (am/pm/AM/PM) from each of these. The problem is that a simple find and replace of "AM" may remove text from other parts of the file (which contains a lot of other text).
I have done a find using the regular expression (:\d\d[ap]m), which in the above example would track down the last bit of the timestamp: :38AM. I now need to replace this with :38, but I don't know how this is done (allowing for any combination of two digits after the colon).
Any help would be much appreciated.
EDIT: What I needed was to replace (:\d\d)[ap]m with \1
Make (:\d\d[ap]m) into (:\d\d)[ap]m and use $1 not \1
Go to Search > Replace menu (shortcut CTRL+H) and do the following:
Find what:
[0-9]{2}\K[AP]M
Replace:
[leave empty]
Select radio button "Regular Expression"
Then press Replace All
You can test it at regex101.
Note: the use of [0-9] is generally better than \d (read why), and avoiding to use a capture group $1 with the use of \K is considered better. It's definitely not important in your case, but it is good to know :)

Replacing char in a String with Regular Expression

I got a string like this:
PREFIX-('STRING WITH SPACES TO REPLACE')
and i need this:
PREFIX-('STRING_WITH_SPACES_TO_REPLACE')
I'm using Notepad++ for the Regex Search and Replace, but i'm shure every other Editor capable of regex replacements can do it to.
I'm using:
PREFIX-\('(.*)(\s)(.*)'\)
for search and
PREFIX-('\1_\3')
for replace
but that replaces only one space from the string.
The regex search feature in Notepad++ is very, very weak. The only way I can see to do this in NPP is to manually select the part of the text you want to work on, then do a standard find/replace with the In selection box checked.
Alternatively, you can run the document through an external script, or you can get a better editor. EditPad Pro has the best regex support I've ever seen in an editor. It's not free, but it's worth paying for. In EPP all I had to do was this:
search: ((?:PREFIX-\('|\G)[^\s']+)\s+
replace: $1_
EDIT: \G matches the position where the previous match ended, or the beginning of the input if there was no previous match. In other words, the first time you apply the regex, \G acts like \A. You can prevent that by adding a negative lookahead, like so:
((?:PREFIX-\('|(?!\A)\G)[^\s']+)\s+
If you want to prevent a match at the very beginning of the text no matter what it starts with, you can move the lookahead outside the group:
(?!\A)((?:PREFIX-\('|\G)[^\s']+)\s+
And, just in case you were wondering, a lookbehind will work just as well as a lookahead:
((?:PREFIX-\('|(?<!\A)\G)[^\s']+)\s+
You have to keep matching from the beggining of the string untill you can match no more.
find /(PREFIX-\('[^\s']*)\s([^']*'\))/
replace $1_$2
like: while (/(PREFIX-\('[^\s']*)\s([^']*'\))/$1_$2/) {}
How about using Replace all for about 20 times? Or until you're sure no string contains more spaces
Due to nature of regex, it's not possible to do this in one step by normal regular expression.
But if I be in your place, I do such replaces in several steps:
find such patterns and mark them with special character
(Like replacing STRING WITH SPACES TO REPLACE with #STRING WITH SPACES TO REPLACE#
Replace #([^#\s]*)\s to #\1_ server times.
Remove markers!
I studied a little the regex tool in Notepad++ because I didn't know their possibilities.
I conclude that they aren't powerful enough to do what you want.
Your are obliged to learn and use a programming language having a real regex capability. There are a number of them. Personnaly, I use Python. It would take 1 mn to do what you want with it
You'd have to run the replace several times for each space but this regex will work
/(?<=PREFIX-\(')([^\s]+)\s+/g
Replace with
\1_ or $1_
See it working at http://refiddle.com/10z