RegEx for matching the first {N} chars and last {M} chars - regex

I'm having an issue filtering tags in Grafana with an InfluxDB backend. I'm trying to filter out the first 8 characters and last 2 of the tag but I'm running into a really weird issue.
Here are some of the names...
GYPSKSVLMP2L1HBS135WH
GYPSKSVLMP2L2HBS135WH
RSHLKSVLMP1L1HBS045RD
RSHLKSVLMP35L1HBS135WH
RSHLKSVLMP35L2HBS135WH
I only want to return something like this:
MP8L1HBS225
MP24L2HBS045
I first started off using this expression:
[MP].*
But it only returns the following out of 148:
PAYNKSVLMP27L1HBS045RD
PAYNKSVLMP27L1HBS135WH
PAYNKSVLMP27L1HBS225BL
PAYNKSVLMP27L1HBS315BR

The pattern [MP].* is a character class followed by a greedy wildcard: it matches a single M or P and then any characters up to the end of the string, regardless of which letters or digits follow.
If the match should start at MP and end on a digit, even though the value itself does not end with one, you could use:
MP[A-Z0-9]+[0-9]
If lookaheads are supported you might also use:
MP[A-Z0-9]+(?=[A-Z0-9]{2}$)
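As a rough illustration, the two patterns above can be tried on a few of the sample tags from the question in plain JavaScript. This is only a sketch: the `tags` array and the `extract` helper are made up for the example, and the regex flavor behind your Grafana or InfluxDB field may differ from JavaScript's.

```javascript
const tags = [
  "GYPSKSVLMP2L1HBS135WH",
  "RSHLKSVLMP1L1HBS045RD",
  "RSHLKSVLMP35L2HBS135WH",
];

// Variant 1: start at MP and end on the last digit.
const endsOnDigit = /MP[A-Z0-9]+[0-9]/;
// Variant 2: start at MP and stop before the final two characters.
const dropLastTwo = /MP[A-Z0-9]+(?=[A-Z0-9]{2}$)/;

// Hypothetical helper: return the matched text, or null when nothing matches.
function extract(regex, tag) {
  const m = tag.match(regex);
  return m ? m[0] : null;
}

for (const tag of tags) {
  console.log(tag, "=>", extract(endsOnDigit, tag), "|", extract(dropLastTwo, tag));
}
// GYPSKSVLMP2L1HBS135WH => MP2L1HBS135 | MP2L1HBS135
// RSHLKSVLMP1L1HBS045RD => MP1L1HBS045 | MP1L1HBS045
// RSHLKSVLMP35L2HBS135WH => MP35L2HBS135 | MP35L2HBS135
```

Both variants give the same result on these tags; they would differ only for values whose last two characters are not both letters.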

You may not even need to touch MP. You can simply define a left and a right boundary, just as your question asks, and capture everything in between, which might be faster. An expression similar to this could work:
(\w{8})(.*)(\w{2})
You can then refer to the middle part as $2, the second capturing group, which makes the replacement easy.
Performance
This JavaScript snippet measures the performance of this expression using a simple for loop that runs 1 million times.
const repeat = 1000000;
let match;
const start = Date.now();
for (let i = repeat; i > 0; i--) {
  const string = "RSHLKSVLMP35L2HBS135WH";
  const regex = /^(\w{8})(.*)(\w{2})$/g;
  // Replace the whole tag with its middle part (capture group 2).
  match = string.replace(regex, "$2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");

Try Regex: (?<=\w{8})\w+(?=\w{2})
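A minimal sketch of how the lookaround version could be checked in JavaScript, assuming an engine with lookbehind support (recent Node.js and current browsers). The `stripEnds` helper is a hypothetical name used only for this example.

```javascript
// Lookbehind: 8 word characters must precede the match.
// Lookahead: at least 2 word characters must follow it.
const middle = /(?<=\w{8})\w+(?=\w{2})/;

// Hypothetical helper: return the middle part of a tag, or null if it is too short.
function stripEnds(tag) {
  const m = tag.match(middle);
  return m ? m[0] : null;
}

console.log(stripEnds("RSHLKSVLMP35L2HBS135WH")); // "MP35L2HBS135"
console.log(stripEnds("GYPSKSVLMP2L1HBS135WH"));  // "MP2L1HBS135"
console.log(stripEnds("ABCDEFGHIJ"));             // null (nothing left between the ends)
```

The pattern is not anchored, so it relies on the greedy \w+ to run as far right as possible before leaving two characters for the lookahead.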

Related

regex to extract substring for special cases

I have a scenario where I want to extract a substring based on the following conditions:
search for any pattern myvalue=123& and extract myvalue=123
if "myvalue" is present at the end of the line without "&", extract myvalue=123
For example:
The string is abcdmyvalue=123&xyz => then it should return myvalue=123
The string is abcdmyvalue=123 => then it should return myvalue=123
The first scenario works for me with the following regex: myvalue*=(.*?(?=[&,""]))
I am looking for a way to modify this regex so that it covers my second scenario as well. I am using https://regex101.com/ to test this.
Thanks in advance!
Some notes about the pattern that you tried:
if you only want a match, you can omit the capture group
e* matches the character e 0 or more times
the part .*?(?=[&,""]) matches as few characters as possible until it can assert either & , or " to the right, so the positive lookahead requires a character to be present on the right, which is why it fails at the end of the string
You could shorten the pattern to a plain match, using a negated character class that matches 0 or more characters other than whitespace or &:
myvalue=[^&\s]*
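A short sketch of how the negated character class behaves on the two strings from the question. The `extractPair` helper is a hypothetical name introduced only for this example.

```javascript
// Match "myvalue=" followed by any run of characters that are neither & nor whitespace.
const pairPattern = /myvalue=[^&\s]*/;

// Hypothetical helper: return the matched key=value text, or null if absent.
function extractPair(line) {
  const m = line.match(pairPattern);
  return m ? m[0] : null;
}

console.log(extractPair("abcdmyvalue=123&xyz")); // "myvalue=123"
console.log(extractPair("abcdmyvalue=123"));     // "myvalue=123"
console.log(extractPair("no key here"));         // null
```

Because the class stops at & or whitespace on its own, no lookahead is needed and the end-of-line case works without a special branch.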
function regex(data) {
  // Try to capture the text between = and &.
  var test = data.match(/=(.*)&/);
  if (test === null) {
    // No & found: take everything after the =.
    return data.split('=')[1];
  } else {
    return test[1];
  }
}
console.log(regex('abcdmyvalue=123&3e')); //123
console.log(regex('abcdmyvalue=123')); //123
Here is working code. If there is no & after the value, match returns null and the if branch runs, where we simply split the string on = and take the value. If an & follows the value, the regex extracts the text between = and &.
If you want to stick with a single regex, you can do it like this:
var test = data1.match(/=(.*)&|=(.*)/)
const result = test[1] ? test[1] : test[2];
console.log(result);

Regex for set of 6 digits from 1-49

I'm having trouble defining a regular expression correctly. I want to check sets of numbers, e.g. 1,2,14,15,16,17 or 12,13,14,15,16,17 or 1,2,3,6,7,8. Every set contains 6 numbers from 1 to 49. I check it using the input's pattern attribute.
I wrote a regex, but it only works for sets of 2-digit numbers:
([1-9]|[1-4][0-9],){5}([1-9]|[1-4][0-9])
Thanks for all answers :)
You forgot to group the number alternatives inside the quantified group, so the comma belongs only to the second alternative, and you also need anchors to make the regex engine match the full input string:
^(?:(?:[1-9]|[1-4][0-9]),){5}(?:[1-9]|[1-4][0-9])$
^   ^^^                ^                         ^
Details
^ - start of string
(?:(?:[1-9]|[1-4][0-9]),){5} - five occurrences of:
  (?:[1-9]|[1-4][0-9]) - either a digit from 1 to 9 or a number from 10 to 49
  , - a comma
(?:[1-9]|[1-4][0-9]) - one more number from 1 to 49, without a trailing comma
$ - end of string.
JS demo:
var strs = ['1,2,14,15,16,17','12,13,14,15,16,17', '1,2,3,6,7,8', '1,2,3,6,7,8,'];
var rng = '(?:[1-9]|[1-4][0-9])';
var rx = new RegExp("^(?:" + rng + ",){5}" + rng + "$");
for (var s of strs) {
  console.log(s, '=>', rx.test(s));
}

How to get last two words written in Regex in Javascript

I am trying to get data after a colon.
This is my code:
function myFunction() {
  var withBreaks = "*Cats are:* cool Pets [CATS]";
  var sheet = SpreadsheetApp.getActiveSheet();
  if (withBreaks) {
    var tmp = withBreaks.match(/^[\*]Cats are:[\*][\s]([a-z]+[\s]+[A-Za-z].*)$/m);
    var username = (tmp && tmp[1]) ? tmp[1].trim() : 'No username';
    sheet.appendRow([username]);
  }
}
So I'm trying to get the information after *Cats are:*. This code works, but some sentences have the asterisks and others don't. I would like one pattern that handles both cases, if that clarifies my question a bit.
What I would like to do is get the data after the colon without specifying the asterisk, so anything after Cats are:. Do I have to specify the asterisk?
I suggest
/^\**Cats are:\**\s*([\s\S]*)/
Here, any text is captured into Group 1 with ([\s\S]*), and the asterisks are made optional with the * quantifier, which means 0 or more repetitions.
If the asterisks can appear 1 or 0 times, replace * with ?:
/^\*?Cats are:\*?\s*([\s\S]*)/
    ^           ^
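A small sketch of the optional-asterisk pattern in plain JavaScript, outside of Apps Script. The `textAfterLabel` helper is a hypothetical name for this example; it trims the captured text in the same way the question's code does.

```javascript
// \*? makes each asterisk optional (0 or 1 times); group 1 captures the rest.
const afterLabel = /^\*?Cats are:\*?\s*([\s\S]*)/;

// Hypothetical helper: return the text after the label, or null if the label is missing.
function textAfterLabel(sentence) {
  const m = sentence.match(afterLabel);
  return m ? m[1].trim() : null;
}

console.log(textAfterLabel("*Cats are:* cool Pets [CATS]")); // "cool Pets [CATS]"
console.log(textAfterLabel("Cats are: cool Pets [CATS]"));   // "cool Pets [CATS]"
console.log(textAfterLabel("Dogs are: loud"));               // null
```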

Regex replace phone numbers with asterisks pattern

I want to apply a mask to my phone numbers replacing some characters with "*".
The specification is as follows:
Phone entry: (123) 123-1234
Output: (1**) ***-**34
I was trying this pattern: "\B\d(?=(?:\D*\d){2})" and then replacing the matches with "*".
But the result is something like (123)465-7891 -> (1**)4**-7*91
That is pretty close to what I want, but two digits (the 4 and the 7) are left unmasked. I was thinking of using the match-zero-or-once option (??), but I'm not sure how.
Try this Regex:
(?<!\()\d(?!\d?$)
Replace each match with *
Explanation:
(?<!\() - negative lookbehind to find the position which is not immediately preceded by (
\d - matches a digit
(?!\d?$) - negative lookahead to find the position not immediately followed by an optional digit followed by end of the line
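As a quick check, the lookaround pattern can be applied with a global replace in JavaScript. This is a sketch that assumes an engine with lookbehind support; `maskPhone` is a hypothetical helper name.

```javascript
// Mask every digit that is not right after "(" and is not one of the last two characters.
const maskable = /(?<!\()\d(?!\d?$)/g;

// Hypothetical helper: replace each maskable digit with an asterisk.
function maskPhone(phone) {
  return phone.replace(maskable, "*");
}

console.log(maskPhone("(123) 123-1234")); // "(1**) ***-**34"
console.log(maskPhone("(123)465-7891"));  // "(1**)***-**91"
```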
Alternative without lookarounds :
match \((\d)\d{2}\)\s+\d{3}-\d{2}(\d{2})
replace by (\1**) ***-**\2
In my opinion you should avoid lookarounds when possible. I find them less readable, they are less portable and often less performant.
Testing Gurman's regex and mine on regex101's PHP engine, mine completes in 14 steps while Gurman's completes in 80 steps.
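For comparison, here is a sketch of the capture-group alternative in JavaScript. Note that JavaScript replacement strings refer to groups as $1 and $2 rather than \1 and \2; `maskPhoneByGroups` is a hypothetical helper name.

```javascript
// Capture the first digit and the last two digits; everything else is rewritten.
const phoneShape = /\((\d)\d{2}\)\s+\d{3}-\d{2}(\d{2})/;

// Hypothetical helper: rebuild the number with fixed asterisks around the kept digits.
function maskPhoneByGroups(phone) {
  return phone.replace(phoneShape, "($1**) ***-**$2");
}

console.log(maskPhoneByGroups("(123) 123-1234")); // "(1**) ***-**34"
// \s+ requires whitespace after ")", so this input is returned unchanged:
console.log(maskPhoneByGroups("(123)465-7891"));  // "(123)465-7891"
```

The trade-off is that this version only handles the exact layout it spells out, whereas the lookaround version masks digits regardless of the separators around them.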
Some "quickie":
function maskNumber(number) {
  // Number of asterisks; added to the 4 kept characters, it equals the input length.
  var asteriskLength = number.length - 4;
  // Keep the last 4 characters and pad with asterisks.
  var masked = number.slice(-4);
  for (var i = 0; i < asteriskLength; i++) masked += '*';
  // Fisher-Yates shuffle: the kept characters end up at random positions.
  var mask = masked.split('');
  for (var i = mask.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = mask[i];
    mask[i] = mask[j];
    mask[j] = tmp;
  }
  return mask.join('');
}

Exclude quantifier from regular expression

I have a regular expression with a quantifier that matches a 5-digit code: [0-9]{5}.
How can I exclude any match of the above pattern?
I tried [^([0-9]{5})], but it doesn't seem to work.
Test data follows:
including:
12345678875645 (will be matched)
pppppaaaaa (will be matched)
52p26 (will be matched)
123 (will be matched)
excluding:
12345 (won't be matched)
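Try this:
^(\d{1,4}|\d{6,})$
This won't match numbers with exactly 5 digits. Note that it only accepts all-digit input, so values such as pppppaaaaa or 52p26 are rejected as well.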
demo here: https://regex101.com/r/sHvRMA/1
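A quick sketch of this pattern run against the test data from the question. The `samples` array simply repeats the question's values; the variable names are made up for the example.

```javascript
// Accept 1 to 4 digits, or 6 or more digits; the whole string must consist of digits.
const notFiveDigits = /^(\d{1,4}|\d{6,})$/;

const samples = ["12345678875645", "pppppaaaaa", "52p26", "123", "12345"];
const results = samples.map((s) => notFiveDigits.test(s));

samples.forEach((s, i) => console.log(s, "=>", results[i]));
// 12345678875645 => true
// pppppaaaaa => false
// 52p26 => false
// 123 => true
// 12345 => false
```

The non-digit samples come out false, so this pattern covers only part of the question's "including" list.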
You can use a negative look ahead:
/(?!^[0-9]{5}$)^.+$/
var rexp = /(?!^[0-9]{5}$)^.+$/;
var str = ['12345', '12345678875645', 'pppppaaaaa', '52p26', '123'];
for (var i = 0; i < str.length; i++) {
  console.log(str[i] + ' - ' + (rexp.test(str[i]) ? 'matched' : 'did not match'));
}
I assume that you need a regex that matches everything except a 5-digit string.
You simply need a negative lookahead assertion to exclude the 5 digits; that is all:
\b(?!\d{5}).+|.{6,}\b
It excludes only the 5-digit case, not anything else.
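To see how this last pattern behaves, here is a sketch that runs it over the question's test data. The pattern is not anchored, so `test` reports whether it matches anywhere in the string; the variable names are made up for the example.

```javascript
// First branch: from a word boundary, match anything not starting with 5 digits.
// Second branch: match any run of 6 or more characters ending at a word boundary.
const exceptFiveDigits = /\b(?!\d{5}).+|.{6,}\b/;

const inputs = ["12345", "12345678875645", "pppppaaaaa", "52p26", "123"];
const outcomes = inputs.map((s) => exceptFiveDigits.test(s));

inputs.forEach((s, i) => console.log(s, "-", outcomes[i] ? "matched" : "did not match"));
// 12345 - did not match
// 12345678875645 - matched
// pppppaaaaa - matched
// 52p26 - matched
// 123 - matched
```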