Regex replace phone numbers with asterisks pattern - regex

I want to apply a mask to my phone numbers replacing some characters with "*".
The specification is as follows:
Phone entry: (123) 123-1234
Output: (1**) ***-**34
I tried this pattern: "\B\d(?=(?:\D*\d){2})", replacing each match with "*".
But the result looks like this: (123)465-7891 -> (1**)4**-7*91
That is close to what I want, but two extra digits are left unmasked. I was thinking of using the lazy "zero or one" quantifier (??), but I'm not sure how.

Try this Regex:
(?<!\()\d(?!\d?$)
Replace each match with *
Click for Demo
Explanation:
(?<!\() - negative lookbehind to find the position which is not immediately preceded by (
\d - matches a digit
(?!\d?$) - negative lookahead to find the position not immediately followed by an optional digit followed by end of the line (this leaves the last two digits unmasked)
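As a rough sketch, the pattern above might be applied in JavaScript like this. The function name maskPhone is made up for illustration, and the lookbehind assumes an engine that supports it (for example an ES2018+ runtime).

```javascript
// Hypothetical helper (not from the original answer): masks every digit
// except one directly after "(" and the last two digits of the string.
function maskPhone(phone) {
  return phone.replace(/(?<!\()\d(?!\d?$)/g, "*");
}

console.log(maskPhone("(123) 123-1234")); // (1**) ***-**34
console.log(maskPhone("(123)465-7891"));  // (1**)***-**91
```

The g flag matters here: without it, replace would mask only the first matching digit.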

Alternative without lookarounds :
match \((\d)\d{2}\)\s+\d{3}-\d{2}(\d{2})
replace by (\1**) ***-**\2
In my opinion you should avoid lookarounds when possible. I find them less readable, they are less portable and often less performant.
Testing Gurman's regex and mine on regex101's php engine, mine completes in 14 steps while Gurman's completes in 80 steps
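For illustration, the match/replace pair above could be written in JavaScript roughly as follows. This is a sketch: maskPhoneWithGroups is a hypothetical name, and JavaScript writes group references in the replacement string as $1 and $2 rather than \1 and \2.

```javascript
// Hypothetical helper: keeps the first digit and the last two digits via
// capture groups and rewrites everything else with a fixed mask.
function maskPhoneWithGroups(phone) {
  return phone.replace(
    /\((\d)\d{2}\)\s+\d{3}-\d{2}(\d{2})/,
    "($1**) ***-**$2"
  );
}

console.log(maskPhoneWithGroups("(123) 123-1234")); // (1**) ***-**34
// Input that does not fit the expected layout is returned unchanged.
console.log(maskPhoneWithGroups("(123)465-7891"));  // (123)465-7891
```

Because the pattern requires whitespace after the closing parenthesis (\s+), numbers written without that space are not masked; \s* would accept both forms.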

Some "quickie":
function maskNumber(number) {
  // Number of asterisks that, added to the 4 kept characters, gives the original length
  var asteriskLength = number.length - 4;
  // Keep the last 4 characters, then append the asterisks
  var masked = number.slice(-4);
  for (var i = 0; i < asteriskLength; i++) masked += '*';
  // Fisher-Yates shuffle: note that this randomizes where the kept characters end up
  var mask = masked.split('');
  for (var k = mask.length - 1; k > 0; k--) {
    var j = Math.floor(Math.random() * (k + 1));
    var tmp = mask[k];
    mask[k] = mask[j];
    mask[j] = tmp;
  }
  return mask.join('');
}

Related

Testing for n lowercase characters in a string using a regex?

Trying to create a regular expression that tests for n lowercase characters in a string.
So for a minimum of 2 characters for example, I thought something like ([a-z]){2,} might work.
For the below test the first two are expected to pass:
const min = 2;
const tests = ['a2a#$2', 'a2a#$2a2', 'a2'];
const regex2: RegExp = new RegExp(`([a-z]){${min},}`);
tests.forEach((t) => {
  const valid = regex2.test(t);
  console.log(`t: ${t} is valid: ${valid}`);
});
Thoughts?
[a-z].*[a-z]
Looks for lowercase, then anything or nothing in between, then lowercase again.
Try it out for yourself:
https://www.debuggex.com/
I might go the route of first stripping off all characters other than lowercase letters, then using a length assertion:
var min = 2;
var tests = ['a2a#$2', 'a2a#$2a2', 'a2'];
tests.forEach(e => {
  if (e.replace(/[^a-z]+/g, "").length >= min) {
    console.log("MATCH: " + e);
  }
  else {
    console.log("NO MATCH: " + e);
  }
});
This isn't the most scalable solution, but for 2 lowercase letters you could do this: .*[a-z].*[a-z].*. Of course, this breaks down if you want to match 1000 lowercase letters, since you'd have to type [a-z] 1000 times.
To test if the string has exactly n lower-case letters, attempt to match the following regular expression:
^[^a-z]*(?:[a-z][^a-z]*){n}$
where n is replaced with the desired value.
See Demo for n = 9.
To match at least n lower-case letters use
[^a-z]*(?:[a-z][^a-z]*){n,}
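A minimal sketch of how the two patterns above could be built for an arbitrary n in JavaScript. The function names are invented for this example, and the "at least" version drops the leading [^a-z]* because an unanchored search does not need it.

```javascript
// Hypothetical helpers: build the pattern for a given n at call time.
function hasExactlyLowercase(text, n) {
  // Anchored: the whole string must contain exactly n lowercase letters.
  return new RegExp(`^[^a-z]*(?:[a-z][^a-z]*){${n}}$`).test(text);
}

function hasAtLeastLowercase(text, n) {
  // Unanchored: n lowercase letters anywhere in the string are enough.
  return new RegExp(`(?:[a-z][^a-z]*){${n},}`).test(text);
}

for (const t of ['a2a#$2', 'a2a#$2a2', 'a2']) {
  console.log(t, hasAtLeastLowercase(t, 2), hasExactlyLowercase(t, 2));
}
```

With the question's test strings and n = 2, the first two pass the "at least" check and only the first passes the "exactly" check.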
In the snippet below, match returns an array of all matches, or null if there are none (hence the || [] fallback). You can then compare matches.length against the minimum.
const min = 2;
const tests = ['a2a#$2', 'a2a#$2a2', 'a2'];
tests.forEach((t) => {
  const matches = t.match(/([a-z])/g) || [];
  console.log(`t: ${t} is valid: ${matches.length >= min}`);
});

The first digit matches the last digit in a four-digits number

I'm trying to find numbers that start and end with the same digit and have identical digits in between those two. Here are some examples:
7007 1551 3993 5115 9889
I tried the following regular expression to identify the first and the last digit. However, no number was selected.
^(\d{1})\1$
I appreciate your help.
Use this:
(\d)(\d)\2+\1
Capture the first and second digits separately, then match them in the reverse order.
Demo
Maybe,
^(\d)(\d)\2+\1$
might be an option to look into.
RegEx Demo
If you wish to simplify, update, or explore the expression, it is explained in the top right panel of regex101.com. If you're interested, you can watch the matching steps or modify them in the debugger link. The debugger shows how a regex engine might consume sample input strings step by step while performing the match.
Your regex will match two digit numbers where both digits are the same. You just need to expand it: (\d)(\d)\2\1
As well, since the numbers are on the same line, use word boundaries (\b) instead of line boundaries (^ and $).
\b(\d)(\d)\2\1\b
BTW {1} is redundant
Demo on regex101
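To illustrate the word-boundary version, here is a small JavaScript sketch that pulls every matching number out of a line. The helper name findMirrored is an assumption made for this example.

```javascript
// Hypothetical helper: returns all four-digit numbers of the form ABBA
// (first digit equals last, both middle digits equal each other).
function findMirrored(line) {
  return line.match(/\b(\d)(\d)\2\1\b/g) || [];
}

console.log(findMirrored("7007 1551 3993 5115 9889"));
// [ '7007', '1551', '3993', '5115', '9889' ]
console.log(findMirrored("7007 1551 3393 5115 9883"));
// [ '7007', '1551', '5115' ]
```

With the g flag, match returns the full matches rather than the capture groups, which is what is wanted here.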
Simple JS way.
let a = "7007 1551 3393 5115 9883";
a = a.split(" ");
let ans = [];
a.forEach((val) => {
  let temp = val.split("");
  if (temp[0] === temp[temp.length - 1]) {
    // Drop the first and last digits, then check that the middle digits are all equal
    temp = temp.slice(1, temp.length - 1);
    ans.push(temp.every((val, i, arr) => val === arr[0]));
  } else {
    ans.push(false);
  }
});
console.log(ans);
Regular Expression:
let a = "7007 1551 3393 5115 9883";
a = a.split(" ");
let ans = [];
a.forEach((val) => {
  let reg = /(\d)(\d*)(\d)/gi;
  let match = reg.exec(val);
  if (match && match.length > 3 && match[1] === match[3]) {
    let temp = match[2].split("");
    ans.push(temp.every((val, i, arr) => val === arr[0]));
  } else {
    ans.push(false);
  }
});
console.log(ans);

Regex find most center

I need to manually hyphenate words that are too long. Using hyphen.js, I get soft hyphens between every syllable, like below.
I want to find the hyphen closest to the middle. All words will be more than 14 characters long. I'm looking for a regex that works in https://regex101.com/ or a Node/JS example.
Basically: find the middle character excluding hyphens and check if there is a hyphen there, then step backwards one step, then forwards one step, then backwards two steps, etc.
re-spon-si-bil-i-ties => [re-spon-si,-bil-i-ties]
com-pe-ten-cies. => [com-pe,-ten-cies.]
ini-tia-tives. => [ini-tia,-tives.]
vul-ner-a-bil-i-ties => [vul-ner-a,-bil-i-ties]
Here's a simple js approach based on string splitting. There could be a binary search style algorithm as you mentioned which would avoid the array allocation but that seems overkill for these small data sets.
function halve(str) {
  var right = str.split('-');
  var left = right.splice(0, Math.ceil(right.length / 2));
  return right.length > 0 ? [left.join('-'), '-' + right.join('-')] : left;
}
console.log(halve('re-spon-si-bil-i-ties'));
console.log(halve('com-pe-ten-cies.'));
console.log(halve('ini-tia-tives.'));
console.log(halve('vul-ner-a-bil-i-ties'));
console.log(halve('none')); // no hyphens returns ["none"]
You can work this out with this method:
Get middle point of string
From the middle point, and checking each character in both directions (left from middle, right from middle) check if that position is the - character. Set the index to the first such match.
If it matches that character, stop the loop and split the string on that index, otherwise return the original word.
const words = [
  're-spon-si-bil-i-ties',
  'com-pe-ten-cies.',
  'ini-tia-tives.',
  'vul-ner-a-bil-i-ties',
  'test',
  '-aa',
  'aa-'
];
const split = '-';
for (const word of words) {
  const m = Math.floor(word.length / 2);
  let offset = 0, i = null;
  do {
    if (word[m - offset] == split) i = m - offset;
    else if (word[m + offset] == split) i = m + offset;
    else offset++;
  } while (offset <= m && i == null);
  if (i != null && i > 0) console.log([word.substring(0, i), word.substring(i)]);
  else console.log(word);
}
You can achieve this with:
var words = [
  're-spon-si-bil-i-ties',
  'com-pe-ten-cies.',
  'ini-tia-tives.',
  'vul-ner-a-bil-i-ties',
  're-ports—typ-i-cal-ly',
  'none'
];
for (var i = 0; i < words.length; ++i) {
  // Halfway point of this word's length without the hyphens, rounded down
  var half = Math.floor(words[i].replace(/-/g, '').length / 2);
  var matches = words[i]
    .match(new RegExp('^((?:[^-]+?-?){' + half + '})(-.+)?$'))
    .slice(1); // Remove position 0 because it is the entire word
  console.log(matches);
}
Regex explanation for re-spon-si-bil-i-ties:
^((?:[^-]+?-?){8})(-.+)?$
^( - start of the string; open the capture group leading up to the halfway point
(?:[^-]+?-?) - lazily match one or more non-hyphen characters with an optional hyphen after them. The hyphen is optional so that the second capture group can claim it
{8} - 8 times; this gets us halfway (the word has 16 characters once the hyphens are removed)
) - close the halfway capture group
(-.+)?$ - optionally match a hyphen and, greedily, everything after it up to the end of the string

RegEx for matching the first {N} chars and last {M} chars

I'm having an issue filtering tags in Grafana with an InfluxDB backend. I'm trying to filter out the first 8 characters and last 2 of the tag but I'm running into a really weird issue.
Here are some of the names...
GYPSKSVLMP2L1HBS135WH
GYPSKSVLMP2L2HBS135WH
RSHLKSVLMP1L1HBS045RD
RSHLKSVLMP35L1HBS135WH
RSHLKSVLMP35L2HBS135WH
I only want to return something like this:
MP8L1HBS225
MP24L2HBS045
I first started off using this expression:
[MP].*
But it only returns the following out of 148:
PAYNKSVLMP27L1HBS045RD
PAYNKSVLMP27L1HBS135WH
PAYNKSVLMP27L1HBS225BL
PAYNKSVLMP27L1HBS315BR
The pattern [MP].* matches a single M or P (it is a character class, not the sequence MP) and then any characters up to the end of the string, so it takes no account of which characters or digits follow.
If you want the match to start at MP and end at the last digit, dropping the trailing non-digit characters, you could use:
MP[A-Z0-9]+[0-9]
Regex demo
If lookaheads are supported you might also use:
MP[A-Z0-9]+(?=[A-Z0-9]{2}$)
Regex demo
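As a quick check outside Grafana, both patterns can be tried in JavaScript against the tag names from the question. This is only a sketch: the extract helper is hypothetical, and whether the lookahead version works in a given backend depends on that backend's regex engine.

```javascript
// The two patterns suggested above.
const withTrailingDigit = /MP[A-Z0-9]+[0-9]/;
const withLookahead = /MP[A-Z0-9]+(?=[A-Z0-9]{2}$)/;

// Hypothetical helper: returns the matched text, or null if nothing matches.
function extract(pattern, name) {
  const m = name.match(pattern);
  return m ? m[0] : null;
}

console.log(extract(withTrailingDigit, "GYPSKSVLMP2L1HBS135WH")); // MP2L1HBS135
console.log(extract(withLookahead, "RSHLKSVLMP35L2HBS135WH"));    // MP35L2HBS135
```

The first pattern stops at the last digit, while the second always leaves off exactly the final two characters, so the two only agree when the name ends in two non-digit characters.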
You may not even want to touch MP. You can simply define a left and right boundary, just like your question asks, and swipe everything in between which might be faster, maybe an expression similar to:
(\w{8})(.*)(\w{2})
You can then refer to the middle part as $2, the second capturing group, which makes it easy to use in a replacement.
Performance
This JavaScript snippet times the expression over a simple loop of one million iterations.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = repeat; i > 0; i--) {
  const string = "RSHLKSVLMP35L2HBS135WH";
  const regex = /^(\w{8})(.*)(\w{2})$/g;
  match = string.replace(regex, "$2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
Try Regex: (?<=\w{8})\w+(?=\w{2})
Demo
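A small JavaScript sketch of the lookbehind/lookahead pattern above, assuming an engine with lookbehind support; the function name middlePart is made up for illustration.

```javascript
// Hypothetical helper: returns the text after the first 8 word characters
// and before the last 2, or null if the name is too short to match.
function middlePart(name) {
  const m = name.match(/(?<=\w{8})\w+(?=\w{2})/);
  return m ? m[0] : null;
}

console.log(middlePart("GYPSKSVLMP2L1HBS135WH")); // MP2L1HBS135
console.log(middlePart("ABCDEFGHIJ"));            // null (only 10 characters)
```

Unlike the capture-group version, the match itself is already the wanted text, so no replacement step is needed.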

Exclude quantifier from regular expression

I have a regular expression with a quantifier that matches a 5-digit code: [0-9]{5}.
How can I exclude anything matched by that expression?
I tried [^([0-9]{5})], but it doesn't seem to work.
Test data follows:
including:
12345678875645 (will be matched)
pppppaaaaa (will be matched)
52p26 (will be matched)
123 (will be matched)
excluding:
12345 (won't be matched)
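Try this:
^(\d{1,4}|\d{6,})$
This won't match numbers with exactly 5 digits. Note that it only matches strings made up entirely of digits, so inputs such as pppppaaaaa or 52p26 are not matched either.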
demo here: https://regex101.com/r/sHvRMA/1
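Running this pattern over the question's test data in JavaScript gives a quick picture of what it accepts. The variable names below are chosen for this sketch only.

```javascript
// The pattern from the answer above: 1-4 digits or 6+ digits, nothing else.
const notFiveDigits = /^(\d{1,4}|\d{6,})$/;

const samples = ['12345678875645', 'pppppaaaaa', '52p26', '123', '12345'];
for (const s of samples) {
  console.log(s + ' - ' + (notFiveDigits.test(s) ? 'matched' : 'did not match'));
}
```

Only the all-digit strings of a length other than 5 are matched, so this pattern fits when the input is known to be numeric.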
You can use a negative look ahead:
/(?!^[0-9]{5}$)^.+$/
var rexp = /(?!^[0-9]{5}$)^.+$/;
var str = ['12345', '12345678875645', 'pppppaaaaa', '52p26', '123'];
for (var i = 0; i < str.length; i++) {
  console.log(str[i] + ' - ' + (rexp.test(str[i]) ? 'matched' : 'did not match'));
}
I assume that you need a regex that matches everything except a string of 5 digits.
You simply need a negative lookahead assertion to exclude the 5 digits:
\b(?!\d{5}).+|.{6,}\b
It excludes only the 5-digit case and nothing else.