How to properly parse multiline warnings using Jenkins Warnings Next Generation Plugin?

I am trying to get the Jenkins Warnings Next Generation plugin to parse warning messages that span multiple lines, but the plugin only seems to match single lines.
The plugin configuration shows a preview when you try out your regular expression. In the preview, my expression works fine and catches my example warning, but when the plugin parses the console output, it fails to catch any warnings (all my warnings span multiple lines).
I'm not sure why it works in the preview but not on the real output. The plugin does catch multiple warnings when each match is only one line.
You can see what I have done here:
https://regexr.com/4o3lq
I am currently using this regular expression in the plugin configuration:
(?ms)\x08(.*?)\x08
The warning is encapsulated by the \x08 special character (see regexr).
I thought that the mode modifiers (m and s) would allow multiline matching, but apparently not.
Thanks in advance!

Was able to resolve my issue.
The Jenkins warnings parser preview is a bit misleading: I thought the previous regex would be good enough to catch multiline warnings, but apparently a \r needs to be included. The plugin source code checks whether the regular expression contains a \r or \n, and if it does, the Groovy parser enables multiline support.
(?ms)\x08(.*?)\r?\x08
You may or may not need the "m" in the mode modifier.
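As a rough check of the expression itself (not of the plugin), the sketch below runs it with Python's re module against a made-up console output. The sample text and file name are assumptions for illustration only. It shows that the s modifier is what lets the match cross line breaks; whether the plugin switches to multiline parsing is a separate matter, decided by the \r or \n check described above.

```python
import re

# Hypothetical console output: one two-line warning wrapped in \x08 markers,
# using Windows-style line endings.
console_output = (
    "Compiling module A\r\n"
    "\x08warning: unused variable 'x'\r\n"
    "    at src/main.c:12\r\x08\r\n"
    "Build finished\r\n"
)

# The expression from the answer, with the m and s modifiers.
with_modifiers = re.compile(r"(?ms)\x08(.*?)\r?\x08")
# The same expression without modifiers: "." stops at "\n".
without_modifiers = re.compile(r"\x08(.*?)\r?\x08")

warnings = with_modifiers.findall(console_output)
missed = without_modifiers.findall(console_output)

print(warnings)  # one entry containing both lines of the warning
print(missed)    # empty list
```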

Related

How can I specify the regular expression dialect in IntelliJ IDEA?

I have a file which is in Java's regular expression dialect:
# Prevents matching at the second half of a version number and things like
# 1.16.2 splitting into 1.1 and 6.2
(?<![._\-\d])
(?<sign>-)?
(?<integerPart>\d+(?:,\d+)*)
(
(?<fractionalPart>\.\d+)?
(?<suffix>[kKMG%])?
# Prevents matching at the first half of a version number
(?![._\-\d])
|
# Note how this one does _not_ include '.' because we wanted to deal with
# integers with a period after them. This may change?
(?![_\-\d])
)
IDEA gives me errors on all the groups, saying: "This named group syntax is not supported in this regex dialect".
But when I edit settings for this inspection there is just one checkbox.
Questions:
What dialect is the default anyway? I'm mildly surprised that it isn't the Java Pattern one
How do I configure this to use the Java one? Is there a magic comment I can put in the file to hint at the format, which IDEA and maybe even other text editors would recognise?
It looks like a known bug in IntelliJ IDEA. There is no way to change the dialect at the moment.

Regular expression to find non-encapsulated text that's also not commented out

I'm working on migrating a database into a SQL project and need to replace all instances of cross-database calls with a SQLCMD variable. I am struggling to write a regex to help me find the places I still need to update.
In the SQL, we have the following:
MyOtherDatabase.MySchema.MyTable
[MyOtherDatabase].MySchema.MyTable
Which I need to change to:
[$(MyOtherDatabase)].MySchema.MyTable
So far, I've come up with the following regex:
([^(]M|^M)yOtherDatabase
Which finds all places where "MyOtherDatabase" is used, and hasn't been replaced with the variable.
HOWEVER, it's also picking it up in SQL comments, such as:
-- I don't want to find MyOtherDatabase in this line
and
FROM ADifferentPlace -- Used to be MyOtherDatabase
If this was only a few instances, I'd live with it, but I've currently got 560 matches, most of which are one or the other of the above, making it very easy for human error to get in the way.
I'm using this regex in the "Search" box within Visual Studio 2015, with the "use regex" checkbox ticked.
Any advice would be helpful!
Edit
Also need to NOT find the following:
from MyTable -- from MyOtherDatabase.MySchema.MyTable
If your environment supports variable-length negative lookbehinds, you could use the following to avoid matching any commented section:
search for (?<!--.*)MyOtherDatabase(?=]?\.)
replace by $(MyOtherDatabase)
If it doesn't, you can still match lines from the start:
search for ^((?:[^-]|-[^-])*)MyOtherDatabase(]?\.)
replace by \1$(MyOtherDatabase)\2
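Python's re module is one environment without variable-length lookbehinds, so it makes a convenient place to try the second form. The sketch below applies it one line at a time; the helper name replace_db is made up for illustration, and the sample lines are the ones from the question.

```python
import re

# Second pattern from the answer: group 1 is everything before any "--",
# group 2 is the optional "]" plus the dot that follows the database name.
PATTERN = re.compile(r"^((?:[^-]|-[^-])*)MyOtherDatabase(]?\.)")


def replace_db(line: str) -> str:
    """Rewrite the uncommented cross-database reference in one line of SQL."""
    return PATTERN.sub(r"\1$(MyOtherDatabase)\2", line)


samples = [
    "SELECT * FROM [MyOtherDatabase].MySchema.MyTable",
    "SELECT * FROM MyOtherDatabase.MySchema.MyTable",
    "-- I don't want to find MyOtherDatabase in this line",
    "FROM ADifferentPlace -- Used to be MyOtherDatabase",
    "from MyTable -- from MyOtherDatabase.MySchema.MyTable",
]

for sample in samples:
    print(replace_db(sample))
```

Because the pattern is anchored at the start of the line, it rewrites at most one reference per line per pass. Note also that the unbracketed form comes out as $(MyOtherDatabase).MySchema.MyTable, so the square brackets asked for in the question would still have to be added separately.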

What flavor of Regex does Visual Studio Code use?

Trying to search-replace in Visual Studio Code, I find that its Regex flavor is different from full Visual Studio. Specifically, I try to declare a named group with string (?<p>[\w]+) which works in Visual Studio but not in Visual Studio Code. It'll complain with the error Invalid group.
Apart from solving this specific issue, I'm looking for information about the flavor of Regexes in Visual Studio Code and where to find documentation about it, so I can help myself with any other questions I might stumble upon.
Full Visual Studio uses .NET Regular Expressions as documented here. This link is mentioned as the documentation for VS Code elsewhere on Stack Overflow, but it is not the right reference for VS Code.
Rust Regex in the Find/Replace in Files Sidebar
Rob Lourens of MSFT wrote that the file search uses Rust regex. The Rust language documentation describes the syntax.
JavaScript Regex in the Find/Replace in File Widget
Alexandru Dima of MSFT wrote that the find widget uses JavaScript regex. As Wicktor commented, ECMAScript 5's documentation describes the syntax. So does the MDN JavaScript Regular Expression Guide.
Test the Difference
The find in files sidebar does not support (?=foobar) whereas the find in file widget does support that lookahead syntax.
Regarding Find/Replace with Groups
To find/replace with groups, use parentheses () to group and $1, $2, $3, $n to replace.
Shaun's answer is still correct. As an update, VS Code recently added the option to opt into using the Perl-compatible PCRE2 engine. You can enable this through your settings config.
This allows you to perform more advanced regex operations like lookaheads and backreferences. But as noted below, the regex still has to be valid JavaScript regex.
VS Code does support regular expression searches, however,
backreferences and lookaround aren't supported by default. But you can
enable these with the setting search.usePCRE2. This configures ripgrep
to use the PCRE2 regex engine. While PCRE2 supports many other
features, we only support regex expressions that are still valid in
JavaScript, because open editors are still searched using the editor's
JavaScript-based search.
And for a bonus if you ended up here trying to do multi line searches, VS Code recently added that feature as well!
I've found newer information (July 22, 2020) about it.
IllusionMH left the following comment on GitHub:
ripgrep (compatible with PCRE2) is already used for Find in files
functionality (for not open editors) and JS engine used only for open
editors.
Which regex engine does VS Code use? It is now a little more nuanced than previously. The best source is the VS Code wiki on GitHub, "Notes on Regular Expression Support":
[At top of document]
This document applies to search (CMD+SHIFT+F/CTRL+SHIFT+F) and
quickopen (CMD+P/CTRL+P). By default, VS Code uses the ripgrep tool to
drive search.
...
[At end of document]
Text search uses two different sets of regular expression engines. The
workspace is searched using ripgrep, which will use the Rust regex
engine, and will fallback to PCRE2 if the regex fails to parse in the
Rust regex engine. The Rust regex engine doesn't support some features
like backreferences and look-around, so if you use those features,
PCRE2 will be used. Open files are searched using a JS regex in the
editor itself. Most of the time, you don't need to worry about this,
but you may see an inconsistency in how some complex regexes are
interpreted, and this can be an explanation. Especially when you see a
regex interpreted one way when a file is open, and another way when it
is not. During a Replace operation, each file will be opened in turn,
and the search query will be run as a JS regex.
Another potential issue is how newlines are handled between ripgrep
and the editor. The editor normalizes newlines, so that you can match
both CRLF and LF line endings just with \n. It's actually not possible
to match \r explicitly in the editor because it is normalized away.
When searching in the workspace, VS Code tries to rewrite a regex so
that \n will match CRLF. But \r\n or \s\n will also match CRLF in
closed files, but not in open files.
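The newline behaviour described in the quote can be imitated outside VS Code. The sketch below is only an analogy using Python's re module: the "editor" case normalizes line endings before searching, and the "workspace" case rewrites \n in the query to \r?\n. That rewrite is an assumption made for illustration; the wiki does not say how VS Code actually rewrites the regex.

```python
import re

# A file with Windows (CRLF) line endings.
crlf_text = "first line\r\nsecond line"

# A query that spans a line break using only "\n".
query = r"line\nsecond"

# Searching the raw text fails: "\r" sits between "line" and "\n".
raw_match = re.search(query, crlf_text)

# Editor-style: normalize CRLF to LF first, then search with the same query.
normalized_match = re.search(query, crlf_text.replace("\r\n", "\n"))

# Workspace-style (assumed rewrite): let "\n" also accept a preceding "\r".
rewritten_query = query.replace(r"\n", r"\r?\n")
rewritten_match = re.search(rewritten_query, crlf_text)

print(raw_match is None, normalized_match is not None, rewritten_match is not None)
```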
Two key points: (1) newlines are handled specially and (2) backreferences and look-arounds are supported despite using the Rust regex engine. If your regex has a look-around or backreference in it, PCRE2 will be used instead of the Rust engine.
More on lookarounds
The Find Widget (Ctrl+F), which searches only within the active editor, supports all lookarounds (lookahead and lookbehind), and those lookarounds can be non-fixed-length. So this will work in the Find Widget: (?<!blah.*).
In a search across files (Ctrl+Shift+F), non-fixed-length lookbehinds DO NOT WORK. Lookaheads can be fixed or non-fixed-length, but non-fixed-length positive or negative lookbehinds do not work and you will get an error message below the search input box: "Regex parse error: lookbehind assertion is not fixed length". The message may not appear until you actually try to run the search.
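The fixed-length restriction is not specific to VS Code's cross-file search. Python's re module happens to have a similar rule, so it can serve as a stand-in for seeing which shapes of lookaround are affected. This is only an analogy with a different engine, and the helper name compiles is made up for the sketch.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Non-fixed-length lookbehind: rejected by Python's re.
variable_lookbehind = compiles(r"(?<!blah.*)foo")
# Fixed-length lookbehind: accepted.
fixed_lookbehind = compiles(r"(?<!blah)foo")
# Non-fixed-length lookahead: accepted.
variable_lookahead = compiles(r"foo(?=.*bar)")

print(variable_lookbehind, fixed_lookbehind, variable_lookahead)
```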

Filter by regex example

Could anyone provide an example of a regex filter for the Google Chrome Developer toolbar?
I especially need exclusion. I've tried many regexes, but somehow they don't seem to work.
It turned out that Google Chrome actually didn't support this until early 2015, see Google Code issue. With newer versions it works great, for example excluding everything that contains banners:
/^(?!.*?banners)/
It's possible -- at least in Chrome 58 Dev. You just need to wrap your regex with forward-slashes: /my-regex-string/
For example, this is one I'm currently using: /^(.(?!fallback font))+$/
It successfully filters out any messages that contain the substring "fallback font".
EDIT
Something else to note is that if you want to use the ^ (caret) symbol to search from the start of the log message, you have to first match the "fileName.js?someUrlParam:lineNumber " part of the string.
That is to say, the regex is matching against not just the log message, but also the stack-entry for the line which made the log.
So this is the regex I use to match all log messages where the actual message starts with "Dog":
/^.+?:[0-9]+ Dog/
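To see what the three filters above keep and drop, the sketch below applies the same patterns (without the surrounding slashes) using Python's re module rather than Chrome's JavaScript engine; these particular patterns only use syntax the two share. The log lines are invented and follow the "fileName.js?someUrlParam:lineNumber message" shape described above.

```python
import re

# Hypothetical console lines, including the file and line-number prefix.
log_lines = [
    "app.js?v=3:42 Dog barked",
    "app.js?v=3:57 using fallback font for heading",
    "ads.js:7 loading banners",
]


def keep(pattern, lines):
    """Return the lines that the filter pattern matches."""
    return [line for line in lines if re.search(pattern, line)]


no_banners = keep(r"^(?!.*?banners)", log_lines)
no_fallback = keep(r"^(.(?!fallback font))+$", log_lines)
starts_with_dog = keep(r"^.+?:[0-9]+ Dog", log_lines)

print(no_banners)
print(no_fallback)
print(starts_with_dog)
```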
The negative or exclusion case is much easier to write and think about when using DevTools' native syntax. To provide the exclusion logic you need, simply use this:
-/app/ -/some\sother\sregex/
The "-" prior to the regex makes the result negative.
Your expression should not contain the forward slashes and /s, these are not needed for crafting a filter.
I believe your regex should finally read:
!(appl)
Depending on what exactly you want to filter.
The regex above will filter out all lines without the string "appl" in them.
edit: apparently exclusion is not supported?

Multiline regexp in jEdit custom mode

I'm currently creating a language with a friend and I would like to provide syntax highlighting for it in jEdit.
Its syntax is actually quite simple. The functions can only match this pattern:
$function_name(arguments)
Note that our parser currently works without a closing token like the C-style semicolon, and we would like to keep this feature.
I created my jEdit mode and (almost) succeeded in highlighting my pattern with <SPAN_REGEXP>. Here's how I did it:
<SPAN_REGEXP HASH_CHAR="\$" TYPE="KEYWORD3" DELEGATE="ARGS">
<BEGIN>\$[A-Za-z0-9_]*\s*\(</BEGIN>
<END>)</END>
</SPAN_REGEXP>
But it's not good enough.
Here's what I would like:
Same color for the entire function skeleton : $func( )
Special highlighting (already defined within the ARGS rules set) for %content1% in $func(%content1%)
No highlighting for brackets not following a $func
Allow an alternative multiline syntax like
$func
(
args
)
which is for now not highlighted.
I guessed I needed to change my <BEGIN> regexp to accept newlines, but it seems that jEdit is unable to match multiline regexps for highlighting, although it does so perfectly for search and replace!
I tried the (?s) and (?m) flags, the [\d\D]* workaround, even [\r\n]* but it never works.
So, here are my questions:
Does anyone know how to match a multiline regexp in a jEdit mode's <SPAN_REGEXP>?
If not, does anyone have any idea how to do what I need?
As stated in the help, the SPAN_REGEXP does not support multi-line regexes. You can of course specify multi-line regexes, but they are only checked against individual lines and thus will then never match. You could post a Feature Request to the Feature Request Tracker of jEdit though if there is none for it yet.