Python regex error: "cannot refer to an open group" - regex

I am creating rules for a reddit automoderator. It gets its rules from a YAML config file and the regexes are interpreted as Python regex.
I am trying to make the following regular expression work:
(https?://[\\w\\d:##%/;$()~_?+-=\\.&]+\\.\\w{2,6})([\\S\\s]*\\1)
When I test it on https://pythex.org/ it works perfectly to achieve what I want.
Unfortunately my group reference at the end of the expression is causing an error when I copy the same regex into the config file:
Generated an invalid regex for body (regex): cannot refer to an open group
I have also tried this version with everything escaped just to make sure that the characters weren't interfering in any way:
(https?://[\\w\\d\\:\\#\\#\\%\\/\\;\\$\\(\\)\\~\\_\\?\\+\\-\\=\\.&]+\\.\\w{2,6})([\\S\\s]*\\1)
But I still get the same error. Does anyone know what I'm doing wrong here?

I managed to fix the problem by changing the group selection to \2 instead of \1.
It turned out that YAML or AutoModerator were automatically putting parentheses around the whole expression, so any group references within must be 1 more than you would initially expect.
I had suspected this at the start and tried the fix explained above, but because of a separate issue with the AutoModerator code, the fix did not appear to work at the time. All resolved now though; thanks for your patience and help.
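The error can be reproduced directly with Python's re module. The sketch below assumes the tool wraps the configured pattern in one extra pair of parentheses, as described above. The simplified URL character class and the helper name compile_error are illustrative only and are not AutoModerator's actual code.

```python
import re

# Simplified stand-in for the URL pattern in the question (illustrative only).
URL = r"(https?://[\w.:/-]+\.\w{2,6})"


def compile_error(pattern):
    """Return the re.error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return exc.msg
    return None


# Assumption: the tool wraps the whole pattern in an extra group.
wrapped_with_1 = "(" + URL + r"([\S\s]*\1)" + ")"
wrapped_with_2 = "(" + URL + r"([\S\s]*\2)" + ")"

# \1 now points at the outer group, which is still open at that position.
print(compile_error(wrapped_with_1))
# \2 points at the URL group, which is already closed, so this compiles.
print(compile_error(wrapped_with_2))

text = "http://example.com some text http://example.com"
match = re.search(wrapped_with_2, text)
print(match.group(2))
```

If the wrapping assumption holds, every backreference in the configured pattern has to be shifted up by one, which is what changing \1 to \2 does.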

Related

Regex : Match if exact prefix is present however ignore if the prefix has other attribute

I am trying to get a regex to work where I have a prefix, .i, and I only want the action to run if someone types in the prefix without an attribute. For example, if someone types in .i L, it would be completely legal.
The regexes I have tried are ^[.i] and ^[.i]?
However, with these, if someone types something such as ".intro", the action still fires, which I don't want.
Playing with this tester, I found this solution:
^(?:^|\W).i (?:$|\W)?
which seems to work for the examples you gave.
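A short Python sketch of why ^[.i] fires on ".intro": square brackets form a character class, so [.i] matches a single character that is either "." or "i". The prefix_only pattern below is a hypothetical alternative and not the answer's regex. It escapes the dot and requires whitespace or the end of the input after the prefix.

```python
import re

# [.i] is a character class: it matches ONE character that is "." or "i".
char_class = re.compile(r"^[.i]")

# Hypothetical alternative: a literal ".i" that must be followed by
# whitespace or the end of the input.
prefix_only = re.compile(r"^\.i(?=\s|$)")

for message in [".i", ".i L", ".intro", "idea"]:
    print(
        repr(message),
        bool(char_class.search(message)),
        bool(prefix_only.search(message)),
    )
```

The character class accepts all four inputs, while the escaped version accepts only ".i" and ".i L".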

Are named capture groups supported? If so, how to engage?

I understand that VSCode uses the JavaScript regex engine for its functionality.
The latest JavaScript specification allows for named capture groups to be used.
However, I am at a loss in understanding whether this is enabled in VSCode v1.43?
I am using the following notations in the general find command:
(?<name-of-capture>pattern to find)( other stuff )(\k<name-of-capture>)
(?<name-of-capture>pattern to find)( other stuff )(\g<name-of-capture>)
I have also used the combinations of \k'name' and \g'name' and these have no effect.
If anyone has insights into this, I would appreciate hearing them.
If you want to use an inline backreference, they work in VSCode.
(?<group>[a-z]+) \d+ \k<group>
matches abc 1 abc.
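For comparison outside VSCode, the same kind of inline named backreference can be tried in Python. This is only an illustration, since VSCode does not use Python's engine, and the syntax differs: Python's re module spells the group (?P<group>...) and the backreference (?P=group) rather than \k<group>.

```python
import re

# Python equivalent of (?<group>[a-z]+) \d+ \k<group>
pattern = re.compile(r"(?P<group>[a-z]+) \d+ (?P=group)")

# The backreference must repeat exactly what the named group captured.
print(bool(pattern.fullmatch("abc 1 abc")))
print(bool(pattern.fullmatch("abc 1 abd")))
```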
However, the new JavaScript-style $<group> replacement does not work, and the .NET-style replacement backreference, ${group}, does not work either, probably due to the issue referred to by @JW.
NOTE: They say they need 20 votes on the issue and there are 3 days to go before they close the issue and turn down the suggestion to introduce backreferences in replacement. If you want this feature to be implemented, please consider voting for that issue.

jmeter.extractor.RegexExtractor: Error in pattern

I tested my regex in the tree view and it worked fine, but when I actually run the test it's giving me the error in the title.
The pattern I'm using is (?<=\{\"id\":)\d+
I have also tried (?<=\{\"id\":)(\d+)
The response data looks like this: aaData":[{"id":488,"environment": (I am trying to match 488)
I've tried changing the response field to check (I've tried them all), not sure what else could be wrong.
It looks like your regular expression itself is OK.
At a pinch (not having used JMeter), I'd say the problem is the lack of support for lookbehinds.
The current user manual states:
Note that (?<=regexp) - lookbehind - is not supported.
I guess \{\"id\":(\d+) should work without upset (provided you're able to use the first capture group as the result).
Edit: The working regex ended up using a non-capturing group:
(?:\{\"id\":)(\d+)
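The capture-group approach can be checked against the sample response from the question. The sketch below uses Python only to illustrate the pattern. Python's engine does support lookbehind, so it does not reproduce the JMeter error itself.

```python
import re

# Sample response fragment copied from the question.
response = 'aaData":[{"id":488,"environment":'

# Match the literal prefix, then capture the digits in group 1.
match = re.search(r'\{"id":(\d+)', response)
print(match.group(1))
```

The value of interest is in group 1 rather than in the whole match, so the extractor has to be told to return that group.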

Specific file name regex with optional text

I am trying to build a regular expression for a specific name with optional text in the middle. This alone is fairly easy:
^(pom)(.*?)([.]xml)$
However, there is one constraint I would like to have. This may be possible, or perhaps it isn't (I haven't been able to find anything like this). There can be additional text within the file name, but if it is there, it has to be preceded by an underscore. The following examples should help illustrate what I am trying to get:
pom.xml - SUCCEED
pomdxml - FAIL
pomd.xml - FAIL
pom_asdf.xml - SUCCEED
pom_.xml - FAIL
Thank you in advance for your knowledge and help!
Here you go:
^(pom)(_.+)?(\.xml)$
Just use an optional group.
^(pom)(_.*)?(\.xml)$
This also worked for me
^pom(_\w+)*([.]xml)$
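The three suggested patterns can be compared against the examples from the question with a small Python check. The dictionary labels and the helper name mismatches are made up for this sketch. Note that the _.* variant accepts pom_.xml, which the question lists as FAIL, because .* is allowed to match nothing after the underscore.

```python
import re

# The three patterns suggested in the answers; labels are illustrative.
PATTERNS = {
    "underscore_plus": r"^(pom)(_.+)?(\.xml)$",
    "underscore_star": r"^(pom)(_.*)?(\.xml)$",
    "underscore_word": r"^pom(_\w+)*([.]xml)$",
}

# Expected results taken from the question (True = SUCCEED).
EXPECTED = {
    "pom.xml": True,
    "pomdxml": False,
    "pomd.xml": False,
    "pom_asdf.xml": True,
    "pom_.xml": False,
}


def mismatches(pattern):
    """Return the file names whose result differs from the expected one."""
    regex = re.compile(pattern)
    return [
        name
        for name, should_match in EXPECTED.items()
        if bool(regex.search(name)) != should_match
    ]


for label, pattern in PATTERNS.items():
    print(label, mismatches(pattern))
```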

Replacing all instances of a name in all strings in a solution

We have a large solution with many projects in it, and throughout the project in forms, messages, etc we have a reference to a company name. For years this company name has been the same, so it wasn't planned for it to change, but now it has.
The application is specific to one state in the US, so localizations/string resource files were never considered or used.
A quick Find All instances of the word pulled up 1309 lines, but we only need to change lines that actually end up being displayed to the user (button text, message text, etc).
Code can be refactored later to make it more readable when we have time to ensure nothing breaks, but for time being we're attempting to find all visible instances and replace them.
Is there any way to easily find these "instances"? Perhaps a type of Regex that can be used in the Find All functionality in Visual Studio to only pull out the word when it's wrapped inside quotes?
Before I go down the rabbit hole of trying to make my job easier and spending far more time than it would have taken to just go line by line, figured I would see if anyone has done something like this before and has a solution.
You can give this a try. (I hope your code is under source control!)
Foobar{[^"]*"([^"]*"[^"]*")*[^"]*}$
And replace with
NewFoobar\1
Explanation
Foobar the name you are searching for
[^"]*" a workaround for the missing non greedy modifier. [^"] means match anything but " that means this matches anything till the first ".
([^"]*"[^"]*")* To ensure that you are matching only inside quotes. This ensures that there are only complete sets of quotes following.
[^"]* ensures that there is no quote anymore till the end of the line $
{} the curly braces put everything following your company's name into a capturing group, which you can refer to using \1
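The same idea can be sketched in Python syntax, where capturing groups are written with parentheses instead of curly braces. The function name replace_inside_quotes is made up for this illustration. The pattern treats the name as being inside a string when an odd number of double quotes follows it on the line.

```python
import re


def replace_inside_quotes(line, old, new):
    """Replace `old` when an odd number of double quotes follows it on the line."""
    pattern = re.compile(
        re.escape(old)
        # One closing quote, then only complete pairs of quotes up to end of line.
        + r'([^"]*"(?:[^"]*"[^"]*")*[^"]*)$'
    )
    # Keep the captured remainder of the line unchanged after the new name.
    return pattern.sub(lambda m: new + m.group(1), line)


print(replace_inside_quotes('label.Text = "Welcome to Foobar Inc.";', "Foobar", "NewFoobar"))
print(replace_inside_quotes('Log.WriteLine("Hello " + m_Foobar + " There");', "Foobar", "NewFoobar"))
```

Because the match runs to the end of the line, this replaces at most one occurrence per line, so a line that contains the name twice needs a second pass.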
The VS regex capability is quite stripped down. It perhaps represents 20% of what can be done with full-powered regular expressions. It won't be sufficient for your needs. For example, one way to solve this quote-delimited problem is to use non-greedy matching, which VS regex does not support.
If I were in your shoes, I would write a Perl script or a C# assembly that runs outside of Visual Studio and simply races through all files (having a particular file extension) and fixes everything. Then reload into Visual Studio, and you are done. Well, if all went well with the regex anyway.
Ultimately what you really must watch out for is code like this:
Log.WriteLine("Hello " + m_CompanyName + " There");
In this case, regex will think that "m_CompanyName" appears between two quotes - but it is not what you meant. In this case you need even more sophistication, and I think you'll find the answer with a special .net regular expression extension.
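One way an external script could avoid that trap is to find the double-quoted string literals first and replace the name only inside them. The Python sketch below is a minimal illustration under stated assumptions: the names STRING_LITERAL and replace_in_literals are hypothetical, and it only handles ordinary one-line literals with backslash escapes. C# verbatim strings (@"..."), comments, and literals spanning several lines are not handled, and walking the solution's files is left out.

```python
import re

# An ordinary double-quoted literal: any run of non-quote, non-backslash
# characters or backslash escapes between two double quotes.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')


def replace_in_literals(line, old, new):
    """Replace `old` with `new` only inside double-quoted string literals."""
    return STRING_LITERAL.sub(
        lambda m: m.group(0).replace(old, new),
        line,
    )


# The identifier sits between two literals, not inside one, so it is kept.
print(replace_in_literals('Log.WriteLine("Hello " + m_Foobar + " There");', "Foobar", "NewFoobar"))
# The name inside the literal is replaced.
print(replace_in_literals('button.Text = "Contact Foobar";', "Foobar", "NewFoobar"))
```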