Regex for extracting the exception names - regex

I want to extract the exception name from the sentences below using a regex pattern:
Error: MYTERA RuntimeException: No task output
Error: android.java.lang.NullPointerException.checked
I need the terms RuntimeException and NullPointerException with a single Regex pattern.

This expression might help you to do so:
([A-Za-z]+Exception)
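As a quick check, the pattern can be applied to both sample sentences from the question. This is only a minimal sketch in plain JavaScript; the helper name extractExceptionName is made up for illustration, and the capturing group is dropped because the whole match is already the wanted term:

```javascript
// Hypothetical helper: returns the first token ending in "Exception", or null.
function extractExceptionName(line) {
  const found = line.match(/[A-Za-z]+Exception/);
  return found ? found[0] : null;
}

// The two sample sentences from the question.
const samples = [
  'Error: MYTERA RuntimeException: No task output',
  'Error: android.java.lang.NullPointerException.checked',
];

const names = samples.map(extractExceptionName);
console.log(names);
```

Because the character class only contains letters, the match cannot run across the dots or spaces that surround the exception name.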
Performance
This JavaScript snippet measures the runtime of a replace-based variant of that expression using a simple loop of 1 million iterations. Note that this variant expects a dot right before the exception name, so it fits the second sample sentence only.
const repeat = 1000000;
let match;
const start = Date.now();
for (let i = repeat; i > 0; i--) {
  const string = 'Error: android.java.lang.NullPointerException.checked';
  const regex = /(.*)\.([A-Za-z]+Exception)(.*)/g;
  // Replace the whole string with the second capturing group.
  match = string.replace(regex, "$2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");

Related

RegEx for matching the first {N} chars and last {M} chars

I'm having an issue filtering tags in Grafana with an InfluxDB backend. I'm trying to filter out the first 8 characters and last 2 of the tag but I'm running into a really weird issue.
Here are some of the names...
GYPSKSVLMP2L1HBS135WH
GYPSKSVLMP2L2HBS135WH
RSHLKSVLMP1L1HBS045RD
RSHLKSVLMP35L1HBS135WH
RSHLKSVLMP35L2HBS135WH
I only want to return something like this:
MP8L1HBS225
MP24L2HBS045
I first started off using this expression:
[MP].*
But it only returns the following out of 148:
PAYNKSVLMP27L1HBS045RD
PAYNKSVLMP27L1HBS135WH
PAYNKSVLMP27L1HBS225BL
PAYNKSVLMP27L1HBS315BR
The pattern [MP].* matches either an M or a P and then matches any character until the end of the string, without taking any character, digit, or quantifier afterwards into account.
If you want the match to start at MP and end on a digit, even though the value itself does not end on a digit, you could use:
MP[A-Z0-9]+[0-9]
If lookaheads are supported you might also use:
MP[A-Z0-9]+(?=[A-Z0-9]{2}$)
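To see what the two patterns return, they can be tried on a couple of the tag values from the question. This is a sketch in plain JavaScript, not in Grafana itself; whether the lookahead variant is accepted depends on the regex engine behind the tool you use, and the variable names are made up for illustration:

```javascript
// Sample tag values taken from the question.
const tags = ['GYPSKSVLMP2L1HBS135WH', 'RSHLKSVLMP35L2HBS135WH'];

// Variant 1: the greedy part backtracks until the last matched character is a digit.
const endsOnDigit = /MP[A-Z0-9]+[0-9]/;
// Variant 2: the match stops two characters before the end of the string.
const stopsBeforeSuffix = /MP[A-Z0-9]+(?=[A-Z0-9]{2}$)/;

const byDigit = tags.map((t) => t.match(endsOnDigit)[0]);
const byLookahead = tags.map((t) => t.match(stopsBeforeSuffix)[0]);

console.log(byDigit);
console.log(byLookahead);
```

For these two samples both variants give the same result, because the two trailing characters are letters. They would differ for a value whose suffix contains a digit.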
You may not even need to touch MP. You can simply define a left and a right boundary, just as your question asks, and capture everything in between, which might be faster. An expression similar to this could work:
(\w{8})(.*)(\w{2})
You can then refer to the middle part as $2, the second capturing group, which makes it easy to use in a replacement.
Performance
This JavaScript snippet measures the runtime of this expression using a simple loop of 1 million iterations.
const repeat = 1000000;
let match;
const start = Date.now();
for (let i = repeat; i > 0; i--) {
  const string = "RSHLKSVLMP35L2HBS135WH";
  const regex = /^(\w{8})(.*)(\w{2})$/g;
  // Keep only the second capturing group, i.e. the part between the boundaries.
  match = string.replace(regex, "$2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
Try Regex: (?<=\w{8})\w+(?=\w{2})
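A short sketch of how this lookaround version behaves in plain JavaScript. It assumes an engine with lookbehind support (ES2018 or later), which may not hold for the regex engine of every tool; the variable names are made up for illustration:

```javascript
// Match whatever sits between the first 8 and the last 2 word characters.
const trimTag = /(?<=\w{8})\w+(?=\w{2})/;

const tag = 'RSHLKSVLMP35L2HBS135WH';
const middle = tag.match(trimTag)[0];
console.log(middle);

// A value with only 10 characters has nothing left between the boundaries.
const tooShort = 'ABCDEFGHIJ'.match(trimTag);
console.log(tooShort);
```

Unlike the capturing-group version, nothing has to be replaced afterwards, because the prefix and suffix are never part of the match.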

Matching exactly two consecutive spaces in Google Apps Scripts

I'm trying to match exactly two consecutive spaces using DocumentApp.getActiveDocument().getBody().replaceText() and replace them with a single space.
Unfortunately, it only supports a subset of regex syntax (https://github.com/google/re2/wiki/Syntax).
I've tried DocumentApp.getActiveDocument().getBody().replaceText("[^ ] {2}[^ ]", " ") but that matches the characters surrounding the spaces as well.
I've tried DocumentApp.getActiveDocument().getBody().replaceText("([^ ]) {2}([^ ])", "$1 $2") but this outputs "$1 $2" rather than "character character".
I've tried DocumentApp.getActiveDocument().getBody().replaceText(" {2}", " ") but that also matches two spaces within a larger group of spaces.
It was difficult (for me) to write a single regular expression for the required replacements, because the surrounding non-space characters were also replaced each time. Moreover, in the general case we have to handle the special cases where the spaces sit at the very beginning or at the very end of the string.
As a result, I suggest the 2 functions below, which cover all kinds of replacements:
function replaceDoubleSpace() {
  var body = DocumentApp.getActiveDocument().getBody();
  // Every pattern contains exactly two spaces, written as " {2}".
  var count = replaceWithPattern('^ {2}$', body);
  Logger.log(count + ' replacement(s) done for the entire string');
  count = replaceWithPattern('[^ ] {2}[^ ]', body);
  Logger.log(count + ' replacement(s) done inside the string');
  count = replaceWithPattern('^ {2}[^ ]', body);
  Logger.log(count + ' replacement(s) done at the beginning of the string');
  count = replaceWithPattern('[^ ] {2}$', body);
  Logger.log(count + ' replacement(s) done at the end of the string');
}
function replaceWithPattern(pat, body) {
  var count = 0;
  while (true) {
    var range = body.findText(pat);
    if (range == null) break;
    var text = range.getElement().asText().getText();
    // For all four patterns, start offset + 1 lands on one of the two spaces.
    var pos = range.getStartOffset() + 1;
    // Drop that single space and write the text back.
    text = text.substring(0, pos) + text.substring(pos + 1);
    range.getElement().asText().setText(text);
    count++;
  }
  return count;
}
Of course, the first function can be simplified, but it becomes less readable:
function replaceDoubleSpace() {
  var body = DocumentApp.getActiveDocument().getBody();
  var count = replaceWithPattern('^ {2}$|[^ ] {2}[^ ]|^ {2}[^ ]|[^ ] {2}$', body);
  Logger.log(count + ' replacement(s) done');
}
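The four cases handled above (entire string, inside, beginning, end) can be checked outside of Apps Script with plain JavaScript. The sketch below is only a way to verify the expected results on ordinary strings: it relies on a lookahead, which the RE2 syntax linked in the question does not support, so it is not a drop-in pattern for replaceText. The function name collapseExactDoubleSpaces is made up for illustration.

```javascript
// Hypothetical helper: turn runs of exactly two spaces into a single space.
function collapseExactDoubleSpaces(text) {
  // (^|[^ ])    start of the string or a non-space character (kept via $1)
  //  {2}        exactly two spaces
  // (?=[^ ]|$)  followed by a non-space character or the end of the string
  return text.replace(/(^|[^ ]) {2}(?=[^ ]|$)/g, '$1 ');
}

// Build the test inputs with repeat() so the number of spaces is explicit.
const two = ' '.repeat(2);
const three = ' '.repeat(3);

console.log(JSON.stringify(collapseExactDoubleSpaces('a' + two + 'b')));   // two spaces inside
console.log(JSON.stringify(collapseExactDoubleSpaces('a' + three + 'b'))); // three spaces stay untouched
console.log(JSON.stringify(collapseExactDoubleSpaces(two + 'a' + two)));   // beginning and end
```

The lookahead leaves the following character unconsumed, so two double-space runs separated by a single character are both replaced in one pass.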

Another regex expression

I need a regular expression for the following rules:
should not start or end with a space
should contain just letters (lower / upper), digits, #, single quotes, hyphens and spaces (spaces just inside, but not at the beginning and the end, as I already said)
should contain at least one letter (lower or upper).
Thank you
I think
^[^ ](?=.*[a-zA-Z]+)[a-zA-Z0-9#'\- ]*[^ ]$
should help you.
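One caveat: in the pattern above, [^ ] accepts any non-space character in the first and last position, and the lookahead sits after the first character, so a letter in the first position is not counted. A stricter single-pattern variant could look like the sketch below. It is an assumption about what the rules intend, written for JavaScript, and the name strictPattern is made up for illustration:

```javascript
// ^(?=.*[a-zA-Z])             at least one letter anywhere in the input
// [a-zA-Z0-9#'-]              first character: any allowed character except a space
// (?:[a-zA-Z0-9#' -]*[a-zA-Z0-9#'-])?$
//                             optional rest: allowed characters, not ending in a space
const strictPattern = /^(?=.*[a-zA-Z])[a-zA-Z0-9#'-](?:[a-zA-Z0-9#' -]*[a-zA-Z0-9#'-])?$/;

const candidates = [
  "a1 - 'super' 1", // valid
  " a1",            // leading space
  "a1 ",            // trailing space
  "a1?",            // character outside the allowed set
  "1 - '123'",      // no letter
  "a",              // single letter
];

const results = candidates.map((c) => strictPattern.test(c));
console.log(results);
```

Making the tail optional is what allows a one-character input such as "a" to pass.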
"Does it really matter guys?"
With regard to the dialect of regex: yes, it does matter. Different languages may have different dialects. One example off the top of my head is that the regex library in PHP supports lookbehinds, whereas JavaScript only gained them with ES2018, so older JavaScript engines do not support them. This is why it is important for you to state the underlying language that you're using. Also, for future reference, it helps those wanting to answer your questions if you provide sample input and the matches you expect from it.
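Whether a given JavaScript engine accepts lookbehind syntax can be probed at runtime. A minimal sketch, with supportsLookbehind as a made-up helper name; the pattern is built through the RegExp constructor so that an unsupported engine raises a catchable error instead of failing while the script is parsed:

```javascript
// Hypothetical helper: true if this engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    // Engines without lookbehind support reject the pattern as invalid.
    return false;
  }
}

const hasLookbehind = supportsLookbehind();
console.log('lookbehind supported: ' + hasLookbehind);
```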
Using the information that you provided, I also feel that this is a case where you should combine regex with plain JavaScript to validate the input. Take a look at this example:
window.onload = function() {
  var valid = "a1 - 'super' 1";
  var invalid1 = " a1 - 'super' 1"; // leading whitespace
  var invalid2 = "a1 - 'super' 1 "; // trailing whitespace
  var invalid3 = "a1 - 'super' 1?"; // invalid (?) char
  var invalid4 = "1 - '123'"; // no letters
  console.log(valid + ": " + validation(valid));
  console.log(invalid1 + ": " + validation(invalid1));
  console.log(invalid2 + ": " + validation(invalid2));
  console.log(invalid3 + ": " + validation(invalid3));
  console.log(invalid4 + ": " + validation(invalid4));
};
function validation(input) {
  // Matches any character outside the allowed set: letters, digits, #, single quotes, hyphens, and spaces.
  var forbiddenChars = /[^a-zA-Z\d#' -]/;
  var containsLetter = /[a-zA-Z]/;
  return input.length > 0 && input.trim().length == input.length && !forbiddenChars.test(input) && containsLetter.test(input);
}

Regexp help - is this possible at all with regexp?

I'm still struggling with regexp and wondering if this is possible at all.
I need to parse variable names from an expression, but I need to skip the ones within string literals and the ones after a dot.
So for an expression like:
'test' + (n + text.length)
I would like to get only n and text.
I'm using /([a-z_][a-z0-9_]*)/gi
but it gives me test,n,text,length
Thanks for help:)
If your input is not too complicated, here is a possible regex option:
var re = /'[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*"|(?:^|[^.])\b(\w+)/g;
var str = '\'test\\\' this\' + "Missing \\\"here\\\"" + (n + text.length)';
document.body.innerHTML = "Testing string: <b>" + str + "</b><br/>";
var res = [];
var m;
while ((m = re.exec(str)) !== null) {
  // Group 1 is only set when the third alternative (a name) matched.
  if (m[1]) { res.push(m[1]); }
}
document.body.innerHTML += JSON.stringify(res, null, 4);
The regex details:
'[^'\\]*(?:\\.[^'\\]*)*' - single quoted string literals (supporting escaped sequences)
| - or
"[^"\\]*(?:\\.[^"\\]*)*" - double quoted string literals (supporting escaped sequences)
| - or
(?:^|[^.])\b(\w+) - 1+ word characters that are either right at the string start or after a non-dot and preceded with a word boundary (placed inside Group 1)
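The same regex can be wrapped in a small function and run on the expression from the question, without needing a browser page. A minimal sketch; the function name extractVariableNames is made up for illustration:

```javascript
// Hypothetical helper: collect names that are neither inside a string literal
// nor directly after a dot.
function extractVariableNames(expr) {
  const re = /'[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*"|(?:^|[^.])\b(\w+)/g;
  const names = [];
  for (const m of expr.matchAll(re)) {
    // String literals match the first two alternatives and leave Group 1 empty.
    if (m[1]) names.push(m[1]);
  }
  return names;
}

const found = extractVariableNames("'test' + (n + text.length)");
console.log(found);
```

Since \w also covers digits, a numeric literal such as the 1 in n + 1 would be collected too, so the result may need an extra filter if the expressions contain numbers.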

VB.Net help selecting first index of string with regex

I was wondering if there is a way to start a selection from the regex match I have in the example below.
The example works exactly how I want it to. However, if there is matching text before it on another line, it chooses the wrong text and highlights it.
What I'm wondering is whether there is a way to get the start index of the regex match.
If Regex.IsMatch(Me.TextBox1.Text, "\b" + Regex.Escape("is") + "\b") Then
Me.TextBox1.SelectionStart = Me.TextBox1.Text.IndexOf("is")
Dim linenumber As Integer = Me.TextBox1.GetLineFromCharIndex(Me.TextBox1.Text.IndexOf("is"))
Me.TextBox1.SelectionLength = Me.TextBox1.Lines(linenumber).Length
Me.TextBox1.Focus()
Me.TextBox1.SelectedText = "is " & Me.TextBox2.Text
The System.Text.RegularExpressions.Match object has a property which should help you here: Match.Index. Match.Index tells you where the match starts, and Match.Length tells you how long it is. Using those, you could change your code to look like this:
Dim m As Match = Regex.Match(Me.TextBox1.Text, "\b" + Regex.Escape("is") + "\b")
If m.Success Then
    ' m.Index is the start of the whole-word match, not of the first "is" substring.
    Me.TextBox1.SelectionStart = m.Index
    Dim linenumber As Integer = Me.TextBox1.GetLineFromCharIndex(m.Index)
    Me.TextBox1.SelectionLength = Me.TextBox1.Lines(linenumber).Length
    Me.TextBox1.Focus()
    Me.TextBox1.SelectedText = "is " & Me.TextBox2.Text
End If