VS Code find and replace across files: backreference is repeated if more than one match is on the same line - regex

I am searching across files in VS Code and replacing the regex
(?<=[a-zA-Z])_([a-zA-Z]) with $1
to turn Some_Random_Text into SomeRandomText,
but VS Code repeats the backreference value when several matches are on the same line; in my case the S is repeated.

That sure looks like a bug to me. It does not happen in the Find widget for the current file. You should file an issue on this; it is probably caused by how Search across files handles the lookbehind (which should be no problem, since it is fixed-width).
In the meantime, you can easily remove the lookbehind part and use:
find: ([a-zA-Z])_([a-zA-Z])
replace: $1$2
which works as it should.
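For comparison, here is a minimal sketch of both patterns using Python's re module. Python only stands in for the editor's engine here (an assumption), and it writes the replacement groups as \1 and \2 instead of $1 and $2. One difference worth knowing: the lookbehind-free pattern consumes the letter before the underscore, so in a run such as a_b_c a single pass leaves the second underscore in place.

```python
import re

# Python's re module stands in for the editor's engine (an assumption);
# replacement groups are spelled \1, \2 here rather than $1, $2.
LOOKBEHIND = re.compile(r"(?<=[a-zA-Z])_([a-zA-Z])")
WORKAROUND = re.compile(r"([a-zA-Z])_([a-zA-Z])")


def strip_with_lookbehind(text: str) -> str:
    """Drop an underscore between two letters, keeping the letter after it."""
    return LOOKBEHIND.sub(r"\1", text)


def strip_with_workaround(text: str) -> str:
    """Same idea without lookbehind: capture both neighbours and re-emit them."""
    return WORKAROUND.sub(r"\1\2", text)


if __name__ == "__main__":
    print(strip_with_lookbehind("Some_Random_Text"))
    print(strip_with_workaround("Some_Random_Text"))
    # Single-letter segments need a second pass with the workaround.
    print(strip_with_workaround("a_b_c"))
```

For identifiers like Some_Random_Text, where every segment has at least two letters, the two patterns agree; for single-letter segments, running the workaround replace a second time should clean up what is left.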

Related

Regex in search & replace: avoid fixed length of lookaround

In a long corpus of text, I want to make some corrections in certain
environments. However, I am encountering problems when using regex with text
editors. I switched to gedit to have an editor which supports regex in
search & replace.
Crucially, I only want to make changes if the line starts with a certain
pattern (\nm or \mb). The problem is that the element that I want to
replace (o' -> o'o) is not at a fixed length from the beginning of the line
and I can't include the regex in the lookbehind (the lookbehind fails).
Is there any way to include what I am looking for in a simple text editor
regex? Or is this already a step where I have to learn how to script in, for
example, Python?
This is what the regex looks like so far.
(?<=\\(nm|mb)).*o'(?=(q|w|r|t|z|p|s|d|f|g|h|j|k|l|x|c|v|b|n|m|a|i|u|e))
Of course, I can't apply .* in the replace without losing its content.
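Put a capture group around .* and a back-reference in the replacement. Make the group inside the lookbehind non-capturing, so that (.*) is group 1 and \1 does not refer to nm or mb:
Find: (?<=\\(?:nm|mb))(.*)o'(?=(q|w|r|t|z|p|s|d|f|g|h|j|k|l|x|c|v|b|n|m|a|i|u|e))
Replace: \1o'o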
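If the editor regex becomes unwieldy, the same correction is short to script. Below is a minimal sketch in Python, assuming the corpus is plain text with one record per line; the function name fix_line and the sample lines are made up for illustration. Because the line prefix is tested separately, no lookbehind is needed, and every o' on a qualifying line is handled in one pass (a greedy .* in the editor version reaches only the last one per pass).

```python
import re

# o' followed by one of the letters listed in the question's lookahead.
TARGET = re.compile(r"o'(?=[qwrtzpsdfghjklxcvbnmaiue])")


def fix_line(line: str) -> str:
    """Rewrite o' as o'o, but only on lines that start with \\nm or \\mb."""
    if line.startswith(("\\nm", "\\mb")):
        return TARGET.sub("o'o", line)
    return line


if __name__ == "__main__":
    sample = ["\\nm ko'la bo'ta", "\\ge ko'la"]  # hypothetical corpus lines
    for line in sample:
        print(fix_line(line))
```

To process a whole file, one would read it line by line, apply fix_line to each line, and write the result to a new file.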

vs code can't replace using regex group references under PCRE2 mode

I'm trying to replace my std STL usage with EASTL, and since I have a lot of cpp/h files, I'm relying on the 'Search in Files' option of VS Code, with the following pattern:
((?<=#include \<)([^\/(.h)]+?)(?=\>))
This matches fine on regexr.com, for both match and replace, and in VS Code as well, although VS Code needs the PCRE2 engine option enabled because of the backreferences.
Referencing matching group #1 with $1 in the Search sidebar view simply doesn't work: it inserts the literal text "$1".
But if I search & replace with the same input in each file manually, it works as intended.
Thanks.
EDIT: The bug which prevented replace from working properly with lookarounds has been fixed, see capture group in regex not working. It is working in the Insider's Build and will presumably be included in v1.39.
However, your regex
((?<=#include \<)([^\/(.h)]+?)(?=\>))
should be changed to
((?<=#include <)([^\/(.h)]+?)(?=>))
Note the removal of the escapes before < and >; with that change it works in the Insider's Build as of this date.
[And the PCRE2 mode has been deprecated since the original question. So you do not need that option anymore, PCRE2 will be used automatically if needed.]
There is a similar bug when using search/replace with newlines and the replace just literally inserts a $1 instead of the capture group's value. This bug has been fixed in the latest Insider's Build, see multiline replace issue and issue: newlines and replace with capture groups.
But I tried your regex in the Insider's Build and it has the same result as you had before - it inserts the literal $1 instead of its value. It appears to be a similar bug but due to the regex lookarounds.
So I tried a simpler, but I think still correct, regex without the lookarounds:
^(#include\s+<)([^\.\/]+?)(>)
and replace with $1EASTL/$2.h$3 and it works as expected.
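As a quick check outside the editor, here is a minimal sketch of the same replacement with Python's re module (an assumption: Python stands in for VS Code's engine, and \1 is Python's spelling of $1). The function name to_eastl is made up for illustration, and the sketch rewrites every extension-less include blindly, without checking that a matching EASTL header exists.

```python
import re

# ^ has to match at the start of every line, hence re.MULTILINE.
INCLUDE = re.compile(r"^(#include\s+<)([^./]+?)(>)", re.MULTILINE)


def to_eastl(source: str) -> str:
    """Rewrite extension-less angle-bracket includes to <EASTL/name.h>."""
    return INCLUDE.sub(r"\1EASTL/\2.h\3", source)


if __name__ == "__main__":
    code = "#include <vector>\n#include <stdio.h>\n#include <thread>\n"
    print(to_eastl(code))
```

Includes that already contain a dot or a slash are left alone. Names containing an h, such as thread, are rewritten too; the character class [^\/(.h)] in the original pattern would reject them, because inside brackets (, ., h and ) are just literal characters.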

Regex to match any .config file with a few exceptions

I'm trying to get a regex working to use in an .hgignore file that will ignore various copies of .config files made during debugging.
The regex should match any path ending in .config as long as the path does not start with _config, config, or packages and as long as the file name (the characters immediately following the last slash) is not app, web, packages, or repositories (or web.release, web.debug).
The closest I seem to get is
^(?!(_config|[Cc]onfig|packages)).*\/(?!([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
This will properly ignore Data/app.config, and seems to work with all other cases, but it will incorrectly match Libraries/Data/app.config. When I check this out at http://regex101.com/ it shows me that the .*\/ group is only matching through Libraries/, not Libraries/Data/ as I expected.
I tried changing it to
^(?!(_config|[Cc]onfig|packages))(.*\/)*(?!([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
But then the group (.*\/)* seems to match the whole path for any .config file.
If I change the last negative lookahead to a matching group like so
^(?!(_config|[Cc]onfig|packages))(.*\/)(([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
Then the (.*\/) matches Libraries/Data/, which is what I want and expected, but it appears the negative lookahead changes the matching behavior of (.*\/).
I'm not sure where to go from here? The conditions I'm trying to match or not match don't seem that complicated, but I'm not the most experienced with regexes. Maybe there is a simpler way to achieve the same thing in .hgignore?
These are examples of paths that should match and be ignored:
Web/smtp.config
Libraries/Data/connectionStrings.config
These are examples of paths that should NOT match and not be ignored
_config/staging/smtp.config
Web/web.config
Web/web.release.config
Web/Views/web.config
Libraries/Data/app.config
Libraries/Data/packages.config
Data/app.config
packages/MiniProfiler.EF6.3.0.11/lib/net40/MiniProfiler.EntityFramework6.dll.config
packages/repositories.config
You were really close. Try this regex on regex101:
^(?!_?config|packages).*\/(?!(app|web|packages|repositories)\.)[^\/]*config$
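I simplified it a little, but the main change was to allow no slashes in the part matched just before "config": [^\/]* instead of .*. With .*, the regex can backtrack .*\/ to an earlier slash such as Libraries/, where the negative lookahead sees Data/ rather than app. and therefore passes.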
Note: I used a case-insensitive flag to simplify the regex itself.
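The pattern can be checked against the paths from the question with a short script. This is a minimal sketch using Python's re module, with re.IGNORECASE standing in for the case-insensitive flag; the helper name is_ignored is made up for illustration.

```python
import re

# The answer's pattern; re.IGNORECASE plays the role of the case-insensitive flag.
IGNORE = re.compile(
    r"^(?!_?config|packages).*\/(?!(app|web|packages|repositories)\.)[^\/]*config$",
    re.IGNORECASE,
)

SHOULD_MATCH = [
    "Web/smtp.config",
    "Libraries/Data/connectionStrings.config",
]

SHOULD_NOT_MATCH = [
    "_config/staging/smtp.config",
    "Web/web.config",
    "Web/web.release.config",
    "Web/Views/web.config",
    "Libraries/Data/app.config",
    "Libraries/Data/packages.config",
    "Data/app.config",
    "packages/MiniProfiler.EF6.3.0.11/lib/net40/MiniProfiler.EntityFramework6.dll.config",
    "packages/repositories.config",
]


def is_ignored(path: str) -> bool:
    """True if the pattern matches, i.e. the path would be ignored."""
    return IGNORE.search(path) is not None


if __name__ == "__main__":
    for path in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(f"{'ignored' if is_ignored(path) else 'kept':8} {path}")
```

Note that the pattern requires at least one slash, so a .config file at the repository root is never matched by it.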

Regex substitution with Notepad++

I have a text file with several lines like these ones:
cd_cod_bus
nm_number_ex
cd_goal
And I want to get rid of the _ and uppercase the following character using Notepad++ (I can also use other tool but if it doesn't get the problem more troublesome).
So I tried to match the characters with the regex (?<=_)\w and to replace them using \U\1\E\2 for the uppercasing trick, but this is where my problems started. I think the regex is OK, but once I click Replace All I get this result:
cd_od_us
nm_umber_x
cd_oal
as you can see it is only deleting the match.
Do you know where the problem is?
Thanks.
The search regex has no capture groups, i.e. the \1 and \2 references in the replacement do not refer to anything.
Try this instead:
Search: _(\w)
Replace: \U\1\E
There you have a capture group in the search part (the parenthesis around the \w) and the \1 in the replacement refers back to what was captured.
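The same capture-group idea carries over to a script if another tool is acceptable. Below is a minimal sketch in Python; its re module has no \U in replacement strings, so a callback does the uppercasing instead. The function name snake_to_camel is made up for illustration.

```python
import re


def snake_to_camel(name: str) -> str:
    """Remove each underscore and uppercase the character captured after it."""
    return re.sub(r"_(\w)", lambda match: match.group(1).upper(), name)


if __name__ == "__main__":
    for line in ["cd_cod_bus", "nm_number_ex", "cd_goal"]:
        print(snake_to_camel(line))
```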
replace
_(.)
with
\U$1
will give you:
cdCodBus
nmNumberEx
cdGoal
and for your
I can also use other tool but if it doesn't get the problem more troublesome
I suggest you try vim.
Try this,
_(\w)
and replace with
\U\1

regexIssueTracker not working in CruiseControl.net

I am trying to get an issueUrlBuilder to work in my CruiseControl.NET config, and cannot figure out why it isn't working.
The first one I tried is this:
<cb:define name="issueTracker">
<issueUrlBuilder type="regexIssueTracker">
<find>^.*Issue (\d*).|\n*$</find>
<replace>https://issuetracker/ViewIssue.aspx?ID=$1</replace>
</issueUrlBuilder>
</cb:define>
Then, I reference it in the sourceControl block:
<sourcecontrol type="vaultplugin">
...
<issueTracker/>
</sourcecontrol>
My checkin comments look like this:
[Issue 1234] This is a test comment
I cannot find anywhere in the build reports/logs/etc. where that issue reference is converted to a link. Is my regex wrong?
I've also tried the default issueUrlBuilder:
<cb:define name="issueTracker">
<issueUrlBuilder type="defaultIssueTracker">
<url>https://issuetracker/ViewIssue.aspx?ID={0}</url>
</issueUrlBuilder>
</cb:define>
Again, same comments and no links anywhere.
Does anyone have any ideas?
It looks like you're trying to match a potentially multiline comment by using .|\n instead of just ., which doesn't match newlines by default. Your first problem is that | has the lowest precedence of all regex constructs, so it divides your whole regex into the alternatives ^.*Issue (\d*). and \n*$. You would need to enclose the alternation in a group: (?:.|\n)*.
Another potential problem is that the lines might be separated by \r\n (carriage-return plus linefeed) instead of just \n. If CCNET uses the .NET regex engine under the hood, that won't be a problem because the dot matches \r. But that's not true of all flavors, and anyway, there's always a better way to match anything including newlines than (?:.|\n)*. I suggest you try
<find>^.*Issue (\d*)(?s:.*)$</find>
or
<find>(?s)^.*Issue (\d*).*$</find>
(?s) and (?s:...) are inline modifiers which allow the dot to match line separator characters.
EDIT: It looks like this is a known bug in CCNET. If the inline modifier doesn't work, try replacing . with [\s\S], as you would in a JavaScript regex. Example:
<find>^.*Issue (\d*)[\s\S]*$</find>
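The precedence problem and the [\s\S] fix can be reproduced outside CCNET. This is a minimal sketch using Python's re module as a stand-in for the .NET engine (an assumption: the two agree on the constructs used here); \1 is Python's spelling of $1, and the function name issue_url is made up for illustration.

```python
import re
from typing import Optional

URL_TEMPLATE = r"https://issuetracker/ViewIssue.aspx?ID=\1"

# Original pattern: the top-level | splits it into two alternatives,
# "^.*Issue (\d*)." and "\n*$".
BROKEN = re.compile(r"^.*Issue (\d*).|\n*$")
# Fixed pattern: [\s\S]* swallows the rest of the comment, newlines included.
FIXED = re.compile(r"^.*Issue (\d*)[\s\S]*$")


def issue_url(comment: str) -> Optional[str]:
    """Build the issue URL from a check-in comment, or None if no issue is named."""
    match = FIXED.search(comment)
    return match.expand(URL_TEMPLATE) if match else None


if __name__ == "__main__":
    print(issue_url("[Issue 1234] This is a test comment"))
    print(issue_url("[Issue 1234] first line\r\nsecond line"))
    # The broken pattern "matches" the empty string at the end of any text,
    # via its second alternative, without capturing an issue number.
    print(BROKEN.search("no issue here"))
```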