Notepad++ regexp to search and replace with exceptions - regex

I'm a regexp newbie and I would like to know how to do a search and replace for the following case:
A file contains many occurrences of the following:
L1234_XL3.ext
and also many occurrences of:
L1234_XL3
I only want to find and replace L1234_XL3 occurrences with XL3 without affecting instances that have an extension.
I am using notepad++ to do the regular expression.

If Notepad++ supports lookaheads, you can simply use L1234_XL3(?!\.ext) for the search and "XL3" for the replacement.
EDIT: Looks like it doesn't support lookaheads after all. A pity; you'll have to do it the hard way without regexes (regexen?):
Replace L1234_XL3.ext with QQQ (or any other string that doesn't appear in the file)
Replace L1234_XL3 with XL3.
Replace QQQ with L1234_XL3.ext.
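For comparison, here is a minimal Python sketch of both ideas: the negative lookahead and the three-step placeholder swap. The sample text is made up, and Python's re module stands in for an editor whose regex engine supports lookaheads.

```python
import re

# Made-up sample text containing both forms of the name.
text = "open L1234_XL3.ext then use L1234_XL3 and L1234_XL3.ext again"

# One pass: match the name only when ".ext" does not follow it.
one_pass = re.sub(r"L1234_XL3(?!\.ext)", "XL3", text)

# Three plain replacements; the placeholder must not occur in the text.
placeholder = "QQQ"
if placeholder in text:
    raise ValueError("pick a placeholder that does not occur in the text")
three_step = (
    text.replace("L1234_XL3.ext", placeholder)
        .replace("L1234_XL3", "XL3")
        .replace(placeholder, "L1234_XL3.ext")
)

print(one_pass)
print(three_step)
```

Both variables should hold the same string, with only the extension-less occurrence shortened to XL3.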

Step 1.
Change all occurrences of L1234_XL3.ext to L-1-2-3-4_XL3.ext (for example)
Step 2.
Change all occurrences of L1234_XL3 to XL3
Step 3.
Change all occurrences of L-1-2-3-4_XL3.ext back to L1234_XL3.ext
As far as I understand, Notepad++ 5.4.5 doesn't support positive lookahead.

Related

Regex in search & replace: avoid fixed length of lookaround

In a long corpus of text, I want to make some corrections in certain
environments. However, I am encountering problems when using regex with text
editors. I switched to gedit to have an editor which supports regex in
search & replace.
Crucially, I only want to make changes if the line starts with a certain
pattern (\nm or \mb). The problem is that the element that I want to
replace (o' -> o'o) is not at a fixed length from the beginning of the line
and I can't include the regex in the lookbehind (the lookbehind fails).
Is there any way to include what I am looking for in a simple text editor
regex? Or is this already a step where I have to learn how to script in, for
example, Python?
This is what the regex looks like so far.
(?<=\\(nm|mb)).*o'(?=(q|w|r|t|z|p|s|d|f|g|h|j|k|l|x|c|v|b|n|m|a|i|u|e))
Of course, I can't apply .* in the replace without losing its content.
Put a capture group around .* and a back-reference in the replacement. Note that (nm|mb) inside the lookbehind is already group 1, so the new group is group 2.
Find: (?<=\\(nm|mb))(.*)o'(?=(q|w|r|t|z|p|s|d|f|g|h|j|k|l|x|c|v|b|n|m|a|i|u|e))
Replace: \2o'o
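If you do end up scripting it, a rough Python sketch of the same find/replace could look like the one below. The pattern is copied from above. The sample lines and the helper name fix_line are made up for illustration. Because .* is greedy, one pass only changes the last o' on a line, so the helper repeats the substitution until nothing changes. This terminates because the inserted o is not in the lookahead list, so a fixed o'o never matches again.

```python
import re

# Pattern copied from the answer. Group 1 is (nm|mb) inside the lookbehind,
# group 2 is the text before o', group 3 is the letter seen by the lookahead.
PATTERN = re.compile(
    r"(?<=\\(nm|mb))(.*)o'"
    r"(?=(q|w|r|t|z|p|s|d|f|g|h|j|k|l|x|c|v|b|n|m|a|i|u|e))"
)


def fix_line(line):
    """Repeat the substitution until no match is left (hypothetical helper)."""
    while True:
        line, count = PATTERN.subn(r"\2o'o", line)
        if count == 0:
            return line


# Made-up corpus lines: only those marked \nm or \mb should change.
for sample in [r"\nm to'ka", r"\mb to'ka so'li", r"\tx to'ka"]:
    print(fix_line(sample))
```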

Notepad ++ Regular Expression Replacement

I've got a big XML file and I should modify a tag.
Original:
<MyTag>13/19/59/70/68/32'</MyTag>'
What I want with regular expression:
<MyTag>13,19,59,70,68,32</MyTag>
That would be pretty easy if each tag always held the same quantity of numbers, but I could have 8 numbers, or 5, or 6, or fewer.
How can I do that in one pass?
As already pointed out in the comments, Notepad++'s regexes don't seem to be powerful enough to make that replacement. In general, I don't think bare regex replacement is powerful enough for this: at most you could get 13/19/59/70/68/32 in a capture group and perform the / to , replacement on that string by other means. That's why I'd consider using another tool you are proficient in (Perl, Java, whatever) instead.
Using notepad++, I'd go for a normal replace first, to change all occurrences of '</MyTag>' to </MyTag>, and then a regex replace with this regular expression: (\d+)/. The replace should be \1,. Clicking on Replace all should replace all occurrences.
If you wanted to avoid replacing digits separated by / in other tags, maybe you could use this regular expression <MyTag>(.*)(\d+)/(.*)</MyTag> and replace it with <MyTag>\1\2,\3</MyTag>. This replace will have to be executed N times, so you might be interested in recording a macro or similar if you want to use it.
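As an illustration of the "capture it, then replace by other means" route in an external tool, here is a small Python sketch. The second tag in the sample and the helper name fix_tag are assumptions added only to show that other tags are left alone.

```python
import re

# Sample input: the tag from the question plus a made-up tag that must not change.
xml = "<MyTag>13/19/59/70/68/32'</MyTag>'\n<Other>1/2</Other>"


def fix_tag(match):
    # Swap the separators only inside the captured tag body,
    # and drop the stray quotes around the closing tag.
    return "<MyTag>" + match.group(1).replace("/", ",") + "</MyTag>"


result = re.sub(r"<MyTag>([\d/]+)'</MyTag>'", fix_tag, xml)
print(result)
```

The callback runs once per tag, so it does not matter how many numbers a tag holds.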
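It is possible to do in one regexp.
Search for:
/([0-9]+)('(</MyTag>)')?
Replace with:
,\1\3
Group 3 holds the whole closing tag, so the slash in </MyTag> survives; where the optional part does not match, \3 is empty.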
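A quick way to check a one-pass pattern like this outside the editor is Python's re.sub. The sketch below uses a variant that captures the whole closing tag in group 3, so the slash of </MyTag> is carried into the result. It assumes Python 3.5 or later, where a group that did not take part in the match expands to an empty string.

```python
import re

sample = "<MyTag>13/19/59/70/68/32'</MyTag>'"

# Each "/number" becomes ",number"; on the last number the optional part
# also swallows the quoted closing tag and group 3 puts it back unquoted.
one_pass = re.sub(r"/([0-9]+)('(</MyTag>)')?", r",\1\3", sample)
print(one_pass)
```

Note that the pattern is not tied to MyTag on the non-final numbers, so any /digits sequence elsewhere in the file would be rewritten too.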

Notepad++ Regex: Find all 1 and 2 letter words

I’m working with a text file with 200.000+ lines in Notepad++. Each line has only one word. I need to strip out and remove all words which contain only one letter (e.g.: I) and words which contain only two letters (e.g.: as).
I thought I could just paste in a regular regex like this [a-zA-Z]{1,2} but it does not recognize anything (I’m trying to Mark them).
I’ve done a manual search and I know that words of that length do exist, so it can only be my regex that’s wrong. Does anyone know how to do this in Notepad++?
Cheers,
- Mestika
If you want to remove only the words but leave the lines empty, this works:
^[a-zA-Z]{1,2}$
Replace this with an empty string. ^ and $ are anchors for the beginning and the end of a line (because Notepad++'s regexes work in multi-line mode).
If you want to remove the lines completely, search for this:
^[a-zA-Z]{1,2}\r\n
And replace with an empty string. However, this won't work before Notepad++ 6, so make sure yours is up-to-date.
Note that you will have to replace \r\n with the specific line-endings of your file!
As Tim Pietzker suggested, a platform independent solution that also removes empty lines would be:
^[a-zA-Z]{1,2}[\r\n]+
A platform-independent solution that does not remove empty lines but only those with one or two letters would be:
^[a-zA-Z]{1,2}(\r\n?|\n)
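To see how the three variants differ, here is a small Python sketch on a made-up word list. re.MULTILINE plays the role of Notepad++'s multi-line behaviour for ^ and $.

```python
import re

# Made-up word list, one word per line, with one empty line after "as".
words = "I\nas\n\ncat\nto\ndog\n"

# Variant 1: blank out the short words but keep their (now empty) lines.
blanked = re.sub(r"^[a-zA-Z]{1,2}$", "", words, flags=re.MULTILINE)

# Variant 2: remove each short word together with exactly one line ending.
removed = re.sub(r"^[a-zA-Z]{1,2}(\r\n?|\n)", "", words, flags=re.MULTILINE)

# Variant 3: [\r\n]+ also eats empty lines that directly follow a short word.
greedy = re.sub(r"^[a-zA-Z]{1,2}[\r\n]+", "", words, flags=re.MULTILINE)

print(repr(blanked))
print(repr(removed))
print(repr(greedy))
```

On this sample the second variant keeps the empty line that followed "as", while the third removes it along with the word.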
I don't use Notepad++ but my guess is it could be because you have too many matches - try including word boundaries (your exp will match every set of 2 letters)
\b[a-zA-Z]{1,2}\b
The regex you specified should find 1-or-2 characters (even in Notepad++'s Find dialog), but not in the way you'd think. You want the regex to start at the beginning of the line and end at the end of it, using ^ and $, respectively:
^[a-zA-Z]{1,2}$
Notepad++ version 6.0 introduced the PCRE engine, so if this doesn't work in your current version try updating to the most recent.
You seem to use the version of Notepad++ that doesn't support explicit quantifiers: that's why there's no match at all (as { and } are treated as literals, not special symbols).
The solution is to use their somewhat more lengthy replacement:
\w\w?
... but that's only part of the story, as this regex will match one or two word characters anywhere, including inside longer words, and not just short words. To match only whole lines, you need something like this:
^\w\w?$
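The differences between the patterns suggested in these answers are easy to see in a short Python sketch. The word list is made up; "it's" and "42" are included only to show where \b and \w behave differently from the anchored letter class.

```python
import re

# Made-up sample, one word per line.
text = "\n".join(["I", "as", "cat", "it's", "42"])

# No anchors: also matches pieces of longer words.
unanchored = re.findall(r"[a-zA-Z]{1,2}", text)

# Word boundaries: skips "cat" but still splits "it's" at the apostrophe.
boundaries = re.findall(r"\b[a-zA-Z]{1,2}\b", text)

# Line anchors: only lines made of exactly one or two letters.
anchored = re.findall(r"^[a-zA-Z]{1,2}$", text, flags=re.MULTILINE)

# \w also accepts digits and underscores, so "42" is matched as well.
word_chars = re.findall(r"^\w\w?$", text, flags=re.MULTILINE)

print(unanchored, boundaries, anchored, word_chars, sep="\n")
```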

Replacing char in a String with Regular Expression

I got a string like this:
PREFIX-('STRING WITH SPACES TO REPLACE')
and i need this:
PREFIX-('STRING_WITH_SPACES_TO_REPLACE')
I'm using Notepad++ for the regex search and replace, but I'm sure every other editor capable of regex replacements can do it too.
I'm using:
PREFIX-\('(.*)(\s)(.*)'\)
for search and
PREFIX-('\1_\3')
for replace
but that replaces only one space from the string.
The regex search feature in Notepad++ is very, very weak. The only way I can see to do this in NPP is to manually select the part of the text you want to work on, then do a standard find/replace with the In selection box checked.
Alternatively, you can run the document through an external script, or you can get a better editor. EditPad Pro has the best regex support I've ever seen in an editor. It's not free, but it's worth paying for. In EPP all I had to do was this:
search: ((?:PREFIX-\('|\G)[^\s']+)\s+
replace: $1_
EDIT: \G matches the position where the previous match ended, or the beginning of the input if there was no previous match. In other words, the first time you apply the regex, \G acts like \A. You can prevent that by adding a negative lookahead, like so:
((?:PREFIX-\('|(?!\A)\G)[^\s']+)\s+
If you want to prevent a match at the very beginning of the text no matter what it starts with, you can move the lookahead outside the group:
(?!\A)((?:PREFIX-\('|\G)[^\s']+)\s+
And, just in case you were wondering, a lookbehind will work just as well as a lookahead:
((?:PREFIX-\('|(?<!\A)\G)[^\s']+)\s+
You have to keep matching from the beginning of the string until you can match no more.
find /(PREFIX-\('[^\s']*)\s([^']*'\))/
replace $1_$2
like: while (s/(PREFIX-\('[^\s']*)\s([^']*'\))/$1_$2/) {}
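The same "repeat until nothing matches" loop can be sketched in Python with re.subn, which returns the number of substitutions made. The function name replace_spaces is just a label for this example.

```python
import re

# Same pattern as above: everything up to the first whitespace inside the
# quotes, the whitespace itself, then the rest up to the closing ').
PATTERN = re.compile(r"(PREFIX-\('[^\s']*)\s([^']*'\))")


def replace_spaces(text):
    """Replace one space per pass until no pass changes anything."""
    while True:
        text, count = PATTERN.subn(r"\g<1>_\g<2>", text)
        if count == 0:
            return text


print(replace_spaces("PREFIX-('STRING WITH SPACES TO REPLACE') and other words"))
```

Each pass fixes the first remaining space in every PREFIX-('...') occurrence, so the number of passes equals the largest number of spaces in any one occurrence, plus a final pass that finds nothing.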
How about using Replace all for about 20 times? Or until you're sure no string contains more spaces
Due to the nature of regex, it's not possible to do this in one step with a normal regular expression.
But if I were in your place, I would do such replacements in several steps:
Find such patterns and mark them with a special character
(like replacing STRING WITH SPACES TO REPLACE with #STRING WITH SPACES TO REPLACE#).
Replace #([^#\s]*)\s with #\1_ several times.
Remove the markers!
I studied the regex tool in Notepad++ a little because I didn't know its capabilities.
I conclude that it isn't powerful enough to do what you want.
You will need to learn and use a programming language with real regex capabilities. There are a number of them. Personally, I use Python. It would take one minute to do what you want with it.
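For what it's worth, a minimal sketch of such a Python approach could use re.sub with a function as the replacement, so that every space inside the quotes is handled in a single pass. The function names here are made up for the example.

```python
import re


def _fix_match(match):
    # Group 1 is the quoted string; replace every whitespace character in it.
    return "PREFIX-('" + re.sub(r"\s", "_", match.group(1)) + "')"


def underscore_spaces(text):
    """Replace whitespace with underscores inside PREFIX-('...') only."""
    return re.sub(r"PREFIX-\('([^']*)'\)", _fix_match, text)


print(underscore_spaces("x = PREFIX-('STRING WITH SPACES TO REPLACE') keep these spaces"))
```

Text outside the PREFIX-('...') construct is never passed to the callback, so its spaces stay as they are.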
You'd have to run the replace once for each space, but this regex will work
/(?<=PREFIX-\(')([^\s]+)\s+/g
Replace with
\1_ or $1_
See it working at http://refiddle.com/10z

Extract and use a part of string with a regex in GVIM

I've got a string:
doCall(valA, val.valB);
Using a regex in GVIM I would like to change this to:
valA = doCall(valA, val.valB);
How would I go about doing this? I use %s for basic regex search and replace in GVIM, but this is a bit different from my normal usages.
Thanks
You can use this:
%s/\vdoCall\(<(\w*)>,/\1 = doCall(\1,/
\v enables “more magic” in regular expressions – not strictly necessary here but I usually use it to make the expressions simpler. <…> matches word boundaries and the in-between part matches the first parameter and puts it in the first capture group. The replacement uses \1 to access that capture group and insert into the right two places.
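For readers more at home outside Vim, roughly the same substitution can be sketched in Python. Here \b stands in for Vim's < and > word-boundary atoms, and the sample line is the one from the question.

```python
import re

line = "doCall(valA, val.valB);"

# Capture the first argument as group 1 and reuse it twice in the replacement.
rewritten = re.sub(r"doCall\(\b(\w+)\b,", r"\1 = doCall(\1,", line)
print(rewritten)
```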