I have tried to use the Notepad++ Search/Replace with a Regular Expression to replace specific words with shorter versions of those words.
I used the following regex to match every word that ends with er (but not er as a word) - and replace the matching words with the same words minus the ending r, using a backreference:
Find what: ([a-zA-Z]+e)r
Replace with: $1
But it doesn't replace the matching words, even though it finds them.
However, if I change the backreference syntax to this:
Replace with: \1
Everything works fine.
Why doesn't the $1 backreference work?
What is the difference between the two forms of the backreference - \1 and $1?
Notepad++'s earlier versions (v5.9.8 and prior) only supported standard POSIX Regular Expressions. However, full PCRE (Perl Compatible Regular Expression) Search/Replace support was added in version 6.0:
New features and enhancement in Notepad++ 6.0:
PCRE (Perl Compatible Regular Expressions) is supported.
This means that if you're using Notepad++ v6.0 or any newer version (e.g., v6.1.5), you can use the PCRE syntax and write $1 instead of \1 for backreferences, but it won't be compatible with earlier versions of Notepad++ (prior to version 6.0). Other than that, they're similar.
For more info regarding the differences between the backreference syntax and the reasons behind the new syntax support, see Backreferences syntax in replacement strings (why dollar sign?).
A useful tutorial on how to use regular expressions in Notepad++ can be found here.
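To check the pattern outside Notepad++, here is a minimal Python sketch of the same replacement. It rests on a few assumptions: Python's re module writes the backreference as \1 (it does not accept $1), the function name drop_final_r is made up for illustration, and the \b word boundaries are an addition so that only whole words ending in "er" are touched.

```python
import re

def drop_final_r(text: str) -> str:
    # Group 1 captures the word up to and including the "e";
    # the trailing "r" is matched but left out of the replacement.
    # The \b anchors are an added assumption to keep matches to whole words.
    return re.sub(r"\b([a-zA-Z]+e)r\b", r"\1", text)

print(drop_final_r("faster runner er"))
```

The standalone word "er" is left alone because the group needs at least one letter before the "e".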
Does VSCode support numbered backreferences? I'm trying to do a find and replace from the dialog, but it matches the entire regex and replaces it with a literal \1.
# Regex
<tr><th align="right">target</th><td><pre>(.*)</pre></td></tr>
What is the regex engine that VSCode actually uses under the hood?
The VSCode search and replace feature follows the ECMAScript standard for regex-based search, and the replacement backreferences are the same ones that can be used in JavaScript.
To insert Group 1 value use $1.
However, to replace with the whole match, you may use either $& (as in JavaScript) or $0 (as in PCRE).
And remember to use $$ to insert a single literal $ char.
Note that beginning with Visual Studio Code v.1.31.0 release, as a result of moving to Electron 3.0, you may use all the cool features ECMAScript 2018 provides (like infinite-width lookbehinds).
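For comparison outside the editor, here is a rough Python analogue of those replacement tokens. This is only a sketch: Python's re module uses \g<1> and \g<0> where VSCode uses $1 and $&, and a literal $ needs no doubling in Python. The function names and the sample row are invented for illustration.

```python
import re

# The pattern from the question, with one capturing group around the <pre> body.
ROW = re.compile(r'<tr><th align="right">target</th><td><pre>(.*)</pre></td></tr>')

def keep_group_1(row: str) -> str:
    # Equivalent in spirit to replacing with $1 in VSCode.
    return ROW.sub(r"\g<1>", row)

def wrap_whole_match(row: str) -> str:
    # Equivalent in spirit to replacing with [$&] or [$0] in VSCode.
    return ROW.sub(r"[\g<0>]", row)

def prefix_dollar(row: str) -> str:
    # In VSCode this would be $$$1; Python treats $ as an ordinary character.
    return ROW.sub(r"$\g<1>", row)

sample = '<tr><th align="right">target</th><td><pre>id=main</pre></td></tr>'
print(keep_group_1(sample))
print(prefix_dollar(sample))
```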
I want to learn more about the regex syntax of the search and replace function in eclipse c++.
Does it use its own regex syntax (in which case, does anyone know a good tutorial?) or does it use the syntax of another language (like Java regex, grep, or Perl regex)?
Eclipse search and replace feature uses Java regex:
The regular expression must respect Java Regex.
However, one of the peculiarities is that you cannot match zero-length strings (i.e., (?=,) won't match the empty string before ,). In such cases, use capturing groups in the regex and use backreferences to those groups in the replacement patterns (e.g., to add newlines after a comma, use , in the search and $0\n in the replacement).
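The workaround of consuming the comma and re-inserting it can be sketched in Python, next to the zero-length version it stands in for. Two assumptions apply: Python writes the whole-match reference as \g<0> rather than $0, and a lookbehind is used here because the newline goes after the comma. Both function names are hypothetical.

```python
import re

def newline_after_commas_whole_match(text: str) -> str:
    # Match the comma itself and put it back via the whole-match reference.
    return re.sub(r",", r"\g<0>\n", text)

def newline_after_commas_lookbehind(text: str) -> str:
    # Zero-length match at the position right after each comma.
    return re.sub(r"(?<=,)", "\n", text)

print(newline_after_commas_whole_match("a,b,c"))
```

Both functions give the same result, which is why the consuming form is a safe substitute when zero-length matches are not available.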
I am having an issue with trying to figure out how to insert some text after I perform a regex search. I know there is a replace function, but I am not looking for that option, just inserting. The text editor I am using is Notepad2, but I am willing to try this in other text editors.
Here is the example that I have.
TEST|Test2|Test3|Test4
This is what I am looking for
Test|Test2|PrefixTest3|Test4
Notice that I am trying to insert the phrase "Prefix" after the 2nd pipe and leave everything else alone.
I can successfully query the result by using this regex:
^[^|]*\|[^|]*|
But then I do not know how I can retain everything prior and after the search point. Any ideas?
You could simply use \K in order to discard the previously matched characters.
^[^|]*\|[^|]*\|\K
Then replace the match with the string prefix.
You may easily do that in Notepad2 using the regex-based Replace feature.
Find: ^\([^|]*|[^|]*|\)
Replace: \1Prefix
Details:
^ - start of a line (Notepad2 never overflows line boundaries!)
\([^|]*|[^|]*|\) - Capturing group 1 matching a sequence of:
[^|]* - zero or more chars other than |
| - a literal (yes, no escaping is necessary, both escaped and unescaped | match a literal |) pipe symbol
[^|]*| - see above, gets to the second |.
The replacement contains a \1 backreference that inserts what was captured with the capturing group 1.
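The capture-and-reinsert idea carries over to other engines. Below is a minimal Python sketch of it, with some assumptions: Python's re module needs the pipes escaped and uses unescaped parentheses for groups, \n is excluded from the character classes so a match cannot spill onto the next line, and the function name insert_after_second_pipe is hypothetical.

```python
import re

def insert_after_second_pipe(text: str, insertion: str = "Prefix") -> str:
    # Group 1 captures everything up to and including the second pipe on each line;
    # the callback puts it back and appends the inserted text.
    return re.sub(
        r"^([^|\n]*\|[^|\n]*\|)",
        lambda m: m.group(1) + insertion,
        text,
        flags=re.MULTILINE,
    )

print(insert_after_second_pipe("TEST|Test2|Test3|Test4"))
```

A callback is used instead of a replacement template so that an insertion containing backslashes is taken literally.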
NOTE that Notepad2 regex engine is very limited. Here is what the Notepad2 documentation says:
Notepad2 supports only a limited subset of regular expressions, as provided by built-in engine of the Scintilla source code editing component. The advantage is that it has a very small footprint. There's currently no plans to integrate a more advanced regular expressions engine, but this may be an option for future development.
Note: Regular expression search is limited to single lines, only.
Also, you may refer to the inline comments inside Scintilla RESearch.cxx file describing the supported syntax. Bear in mind that the regex type used in the Notepad2 S&R tool is that of POSIX and not all of the described Scintilla regex features will work in the tool.
Note that Notepad2 does not seem to support alternation and limiting quantifiers (similar to Lua patterns), but \w matches Unicode letters together with ASCII ones. Sadly, I could not make ? quantifier work.
^([^|]*\|[^|]*\|)
Try this. Replace by $1prefix. See demo. Just capture the first group and then use it for the replacement. The first group can be accessed by $1.
http://regex101.com/r/pQ9bV3/11
I am very new to the regular expression arena. Recently I searched for a regular expression for Powershell that allows me to match a html tag and I found the following in this site.
$content -match '(?s)<table[^>]+width\s*=\s*"300px"\s*.*?>(.*?)</table>'
I have been looking for all regular expressions references and books (Perl and Powershell) for the meaning of (?s) with no luck. It looks like a condition but missing the then part.
Can someone point me to the right direction for the meaning of this?
Thanks
According to Regular Expressions reference site.
Turn on "dot matches newline" for the remainder of the regular
expression. (Older regex flavors may turn it on for the entire regex.)
On its own, "?" is a quantifier meaning 1 or 0 matches, but inside "(?s)" it is part of the inline modifier syntax: "(?s)" enables dot matching newlines. A period is normally a wildcard that will match any character, save the newline.
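The effect of (?s) is easy to see by running the same pattern with and without it. Here is a small Python sketch, assuming Python's re module (which accepts the same inline (?s) modifier at the start of a pattern); the helper name table_body and the sample HTML are invented for illustration.

```python
import re
from typing import Optional

# The pattern from the question, without the leading (?s).
TABLE = r'<table[^>]+width\s*=\s*"300px"\s*.*?>(.*?)</table>'

def table_body(html: str, dot_matches_newline: bool = True) -> Optional[str]:
    # Prepend (?s) so that "." also matches newline characters.
    pattern = ("(?s)" if dot_matches_newline else "") + TABLE
    match = re.search(pattern, html)
    return match.group(1) if match else None

multi_line = '<table width="300px">\nrow\n</table>'
print(repr(table_body(multi_line)))         # body found with (?s)
print(repr(table_body(multi_line, False)))  # no match without it
```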
Is there any implementation of regex that allow to replace group in regex with lowercase version of it?
If your regex version supports it, you can use \L, like so with GNU sed in a POSIX shell (\L is a GNU extension):
sed -r 's/(^.*)/\L\1/'
In Perl, you can do:
$string =~ s/(some_regex)/lc($1)/ge;
The /e option causes the replacement expression to be interpreted as Perl code to be evaluated, whose return value is used as the final replacement value. lc($x) returns the lowercased version of $x. (Not sure but I assume lc() will handle international characters correctly in recent Perl versions.)
/g means match globally. Omit the g if you only want a single replacement.
If you're using an editor like SublimeText or TextMate [1], there's a good chance you may use
\L$1
as your replacement, where $1 refers to something from the regular expression that you put parentheses around. For example [2], here's something I used to downcase field names in some SQL, getting everything to the right of the 'as' at the end of any given line. First the "find" regular expression:
(as|AS) ([A-Za-z_]+)\s*,$
and then the replacement expression:
$1 '\L$2',
If you use Vim (or presumably gvim), then you'll want to use \L\1 instead of \L$1, but there's another wrinkle that you'll need to be aware of: Vim reverses the syntax between literal parenthesis characters and escaped parenthesis characters. So to designate a part of the regular expression to be included in the replacement ("captured"), you'll use \( at the beginning and \) at the end. Think of \ as—instead of escaping a special character to make it a literal—marking the beginning of a special character (as with \s, \w, \b and so forth). So it may seem odd if you're not used to it, but it is actually perfectly logical if you think of it in the Vim way.
[1] I've tested this in both TextMate and SublimeText and it works as-is, but some editors use \1 instead of $1. Try both and see which your editor uses.
[2] I just pulled this regex out of my history. I always tweak regexen while using them, and I can't promise this is the final version, so I'm not suggesting it's fit for the purpose described, and especially not with SQL formatted differently from the SQL I was working on, just that it's a specific example of downcasing in regular expressions. YMMV. UAYOR.
Several answers have noted the use of \L. However, \E is also worth knowing about if you use \L.
\L converts everything up to the next \U or \E to lowercase. ... \E turns off case conversion.
(Source: https://www.regular-expressions.info/replacecase.html )
So, suppose you wanted to use rename to lowercase part of some file names like this:
artist_-_album_-_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a
artist_-_album_-_Another_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a
you could do something like:
rename -v 's/^(.*_-_)(.*)(_-_.*\.m4a)/$1\L$2\E$3/g' *
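In engines without \L and \E, the same "lowercase only the middle group" effect can be had by rebuilding the string in a callback. Here is a minimal Python sketch under some assumptions: Python's re module has no \L or \E in replacement templates, the function name lowercase_title is hypothetical, and the sketch only transforms the name without renaming any files.

```python
import re

# Same three groups as the rename expression; the dot before m4a is escaped
# and a trailing $ is added as an assumption to anchor the extension.
NAME = re.compile(r"^(.*_-_)(.*)(_-_.*\.m4a)$")

def lowercase_title(filename: str) -> str:
    # Group 2 plays the role of the text between \L and \E.
    return NAME.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        filename,
    )

print(lowercase_title("artist_-_album_-_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a"))
```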
In Perl, there's
$string =~ tr/A-Z/a-z/;
Most Regex implementations allow you to pass a callback function when doing a replace, hence you can simply return a lowercase version of the match from the callback.
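As one concrete instance of the callback approach, here is a short Python sketch. It assumes Python's re.sub, which accepts a function as the replacement argument; the helper name lowercase_matches and the sample pattern are made up for illustration.

```python
import re

def lowercase_matches(pattern: str, text: str) -> str:
    # The callback receives each match object and returns its replacement text.
    return re.sub(pattern, lambda m: m.group(0).lower(), text)

# Lowercase every run of two or more capital letters.
print(lowercase_matches(r"[A-Z]{2,}", "Select NAME From USERS"))
```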