Regular expression to mask a given number from either the beginning or the end - regex

I have a scenario where I have to mask two numbers returned from my application, based on configured regular expression patterns. I have the following two numbers and need to mask them as shown below.
20128569 --> 2012****
40953186 --> ****3186
I need two regular expression patterns to achieve this, using String.replaceAll(...) or some other possible way.
public static void main(String[] args) {
String value = "20128569";
String pattern = "(?<=.{4}).?" ;
String formattedValue = value.replaceAll(pattern, "*");
System.out.println(formattedValue);
}
Note: I need two regular expression patterns in order to mask the numbers as shown above.
However, I have currently resolved this issue temporarily with the following code. It would be nicer if I could resolve it with a regular expression only.
String maskedAccountNumber = Pattern.compile(aRegexPattern).matcher(aKey).replaceFirst(MASK_CHARACTER);
StringBuilder maskBuffer = new StringBuilder();
for(int i = 0; i <= aKey.length() - maskedAccountNumber.length() ; i++){
maskBuffer.append(MASK_CHARACTER);
}
return maskedAccountNumber.replace(MASK_CHARACTER, maskBuffer.toString());
Below are the two regular expressions I have used so far:
^.\d{1,3}
.\d{1,3}$

That's fairly easy to do, although it's still hard to tell under what circumstances you want which regex.
Either way, match (\d*)\d{4} and replace it with $1****, as seen at https://regex101.com/r/uI0zJ6/2
Or match \d{4}(\d*) and replace it with ****$1, as seen at https://regex101.com/r/uI0zJ6/3 .
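As a minimal sketch of both masks, here are the two capture-group patterns from this answer written in Python (chosen only for brevity; Python spells the backreference \1 where Java's replaceAll uses $1). The function names are illustrative and not part of the question. The last two functions show a per-character lookaround alternative, on the assumption that each matched character should become a single *.

```python
import re

def mask_last_four(number: str) -> str:
    # Keep the leading digits (group 1) and replace the final four digits.
    return re.sub(r"(\d*)\d{4}", r"\1****", number)

def mask_first_four(number: str) -> str:
    # Replace the first four digits and keep the remainder (group 1).
    return re.sub(r"\d{4}(\d*)", r"****\1", number)

def mask_after_first_four(number: str) -> str:
    # Each character that has four characters before it becomes '*'.
    return re.sub(r"(?<=.{4}).", "*", number)

def mask_all_but_last_four(number: str) -> str:
    # Each character that still has four characters after it becomes '*'.
    return re.sub(r".(?=.{4})", "*", number)

print(mask_last_four("20128569"))
print(mask_first_four("40953186"))
```

For eight-digit inputs the lookaround variants give the same results as the capture-group ones; for longer inputs they mask everything outside the four kept digits rather than exactly four characters.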

How to Get substring from given QString in Qt

I have a QString like this:
QString fileData = "SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication"
What I need to do is to create substrings as follows:
SoftwareName = MY_DISPLAY_OS //text between '=' and ':'
Version = 10.25.10086-1
Release = 2022-3
I tried using QString QString::sliced(qsizetype pos, qsizetype n) const, but it didn't work because I'm using Qt 5.9 and this function is only supported from 6.0 onwards.
QString fileData = "SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication";
QString SoftwareName = fileData.sliced(fileData.lastIndexOf(':'), fileData.indexOf('.'));
Please help me to code this in Qt.
Use QString::split 3 times:
Split by QLatin1Char('=') to two parts:
SOFT_PACKAGES.ABC
MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication
Next, split the 2nd part by QLatin1Char(':'). If there can never be more than 2 parts, limit this split to 2 parts as well, so that the 2nd part can contain colons:
MY_DISPLAY_OS
MY-Display-OS.2022-3.10.25.10086-1.myApplication
Finally, split 2nd part of previous step by QLatin1Char('.'):
MY-Display-OS
2022-3
10
25
10086-1
myApplication
Now just assemble your required output strings from these parts. If the exact number of parts is unknown, you can get Version = 10.25.10086-1 by removing the first two elements and the last element from the final list above, and then joining the rest with QLatin1Char('.'). If the indexes are known and fixed, you can just use QStringLiteral("%1.%2.%3").arg(....
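The three-split approach can be sketched as follows. This is a language-neutral illustration in Python, not Qt code; the function name parse_package is hypothetical, and it assumes the input always has the shape shown in the question.

```python
def parse_package(file_data: str):
    # Split 1: drop everything up to and including the first '='.
    _, payload = file_data.split("=", 1)
    # Split 2: software name before the first ':', the rest after it.
    software_name, remainder = payload.split(":", 1)
    # Split 3: break the rest into its dot-separated parts.
    parts = remainder.split(".")
    release = parts[1]
    # Version: drop the first two parts and the last one, then rejoin.
    version = ".".join(parts[2:-1])
    return software_name, version, release

data = "SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication"
print(parse_package(data))
```

Limiting the first two splits to one cut each keeps later '=' or ':' characters inside the remainder.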
One way is using
QString::mid(int startIndex, int howManyChar);
so you probably want something like this:
QString fileData = "SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication";
// Take the text between '=' and the last ':'.
QString SoftwareName = fileData.mid(fileData.indexOf('=')+1, (fileData.lastIndexOf(':') - fileData.indexOf('=')-1));
To extract the other parts, and provided the number of '.' characters is the same in every string you want to parse, you can use the second argument of indexOf to move the search start past a known number of '.' occurrences, for example:
int pos = -1;
// Advance to the third '.', which precedes the version.
for (int i = 0; i < 3; i++) {
    pos = fileData.indexOf('.', pos + 1);
}
int StartIndex = pos + 1;
// The version ends at the last '.', just before "myApplication".
int EndIndex = fileData.lastIndexOf('.');
should give the right indices to be cut out with
QString SoftwareVersion = fileData.mid(StartIndex, EndIndex - StartIndex);
If the strings to be parsed are less consistent than this, try switching to regular expressions; they are the more flexible approach.
In my experience, using regular expressions for these types of tasks is generally simpler and more robust. You can do this with a regular expression as follows:
// Create the regular expression.
// Using C++ raw string literal to reduce use of escape characters.
QRegularExpression re(R"(.+=([\w_]+):[\w-]+\.(\d+-\d+)\.(\d+\.\d+\.\d+-?\d+))");
// Match against your string
auto match = re.match("SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication");
// Now extract the portions you are interested in
// match.captured(0) is always the full string that matched the entire expression
const auto softwareName = match.captured(1);
const auto version = match.captured(3);
const auto release = match.captured(2);
Of course for this to make sense, you have to understand regex, so here is my explanation of the regex used here:
.+=([\w_]+):[\w-]+\.(\d+-\d+)\.(\d+\.\d+\.\d+-?\d+)
.+=
match all characters up to and including the equals sign (.+ is greedy, so with several equals signs it would run to the last one that still allows a match)
([\w_]+)
capture one or more word characters (\w already covers letters, digits and underscores, so the extra _ is redundant but harmless)
:
a colon
[\w-]+\.
one or more alphanumeric or dash characters followed by a single period
(\d+-\d+)
capture one or more of digits followed by a dash followed by one or more digits
\.
a single period
(\d+\.\d+\.\d+-?\d+)
capture three sets of digits with periods in between, then an optional dash, and one or more further digits
I think it is generally easier to make a regex that handles changes to the input (let's say the version becomes 10.25.10087) than to parse things manually by index.
Regex is a powerful tool once you get used to it, but it can certainly seem daunting at first.
Example of this regex on regex101.com: https://regex101.com/r/dj3Z4U/1
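To try the same expression outside Qt, here is a small sketch using Python's re module. The pattern is the one from this answer, unchanged; the function name extract_fields is an assumption made for illustration.

```python
import re

# Same pattern as in the answer; groups are 1 = name, 2 = release, 3 = version.
PACKAGE_RE = re.compile(r".+=([\w_]+):[\w-]+\.(\d+-\d+)\.(\d+\.\d+\.\d+-?\d+)")

def extract_fields(file_data: str):
    match = PACKAGE_RE.search(file_data)
    if match is None:
        return None  # the input did not have the expected shape
    return {
        "softwareName": match.group(1),
        "release": match.group(2),
        "version": match.group(3),
    }

data = "SOFT_PACKAGES.ABC=MY_DISPLAY_OS:MY-Display-OS.2022-3.10.25.10086-1.myApplication"
print(extract_fields(data))
```

Because the trailing -?\d+ can backtrack, a version without the dash suffix, such as 10.25.10087, is still captured whole.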

Regex to match all occurrences that begin with n characters in sequence

I'm not sure if it's even possible for a regular expression to do this. Let's say I have a list of the following strings:
ATJFH
ABHCNEK
BKDFJEE
NCK
ABH
ABHCNE
KDJEWRT
ABHCN
EGTI
And I want to match all strings that are a prefix, of any length, of the following string: ABHCNEK
The matches would be:
ABH
ABHCN
ABHCNE
ABHCNEK
I tried things like ^[A][B][H][C][N][E][K] and ^A[B[H[C[N[E[K]]]]]], but I can't seem to get it to work...
Can this be done in regex? If so, what would it be?
The simplest can be
^(?:ABHCNEK|ABHCNE|ABHCN|ABHC|ABH|AB|A)$
See demo.
https://regex101.com/r/eB8xU8/6
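Writing out every prefix by hand gets tedious for longer strings. As a sketch, an equivalent pattern can be generated by nesting optional groups, which is roughly what the ^A[B[H[... attempt in the question was reaching for. The helper name prefix_pattern is hypothetical, and Python is used only for illustration.

```python
import re

def prefix_pattern(target: str):
    # Build A(?:B(?:H(?:...)?)?)? from the inside out.
    pattern = ""
    for ch in reversed(target):
        tail = "(?:%s)?" % pattern if pattern else ""
        pattern = re.escape(ch) + tail
    return re.compile("^" + pattern + "$")

candidates = ["ATJFH", "ABHCNEK", "BKDFJEE", "NCK", "ABH",
              "ABHCNE", "KDJEWRT", "ABHCN", "EGTI"]
matcher = prefix_pattern("ABHCNEK")
print([s for s in candidates if matcher.match(s)])
```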
Use this regular expression:
^[ABHCNEK]+$
You haven't said how you want to use it, but one option doesn't require regex. Loop through the various strings and check whether each one is a prefix of your test string:
var strings = ['ATJFH', 'ABHCNEK', 'BKDFJEE', 'NCK', 'ABH', 'ABHCNE', 'KDJEWRT', 'ABHCN', 'EGTI'];
var test = 'ABHCNEK';
for (var i = 0; i < strings.length; i++) {
    // indexOf(...) === 0 means the candidate occurs at the very start of test.
    if (test.indexOf(strings[i]) === 0) {
        console.log(strings[i]);
    }
}
This returns:
ABHCNEK
ABH
ABHCNE
ABHCN
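The same regex-free idea, sketched in Python for comparison: str.startswith checks the prefix condition directly, so a candidate that only occurs in the middle of the test string is not reported. The function name prefixes_of is illustrative.

```python
def prefixes_of(test: str, candidates):
    # Keep only the candidates that the test string starts with.
    return [c for c in candidates if test.startswith(c)]

strings = ["ATJFH", "ABHCNEK", "BKDFJEE", "NCK", "ABH",
           "ABHCNE", "KDJEWRT", "ABHCN", "EGTI"]
print(prefixes_of("ABHCNEK", strings))
```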

I want to check a string against many different regular expressions at once

I have a string that the user has entered, and I have regular expressions stored in my database. Checking the input string against those regular expressions works fine.
Now I need to add another column to my database that holds another regular expression. I want to use the same for loop to check the input string against the new regular expression as well, but only after the first loop has finished, and against the same string.
i.e.
\\D\\W\\D <-- first expression
\\d <-- second expression, which I want to use after the first expression is done
Using the regular expressions from the database against the input string: works
Adding the new regular expression, incorporating it in the same loop and checking it against the same string: not working
My code is as follows:
std::string errorMessages [2][2] = {
{
"Correct .R\n",
},
{
"Free text characters out of bounds\n",
}
};
for(int i = 0; i < el.size(); i++)
{
if(el[i].substr(0,3) == ".R/")
{
DCS_LOG_DEBUG("--------------- Validating .R/ ---------------");
output.push_back("\n--------------- Validating .R/ ---------------\n");
str = el[i].substr(3);
split(st,str,boost::is_any_of("/"));
DCS_LOG_DEBUG("main loop done");
for (int split_id = 0 ; split_id < splitMask.size() ; split_id++ )
{
boost::regex const string_matcher_id(splitMask[split_id]);
if(boost::regex_match(st[split_id],string_matcher_id))
{
a = errorMessages[0][split_id];
DCS_LOG_DEBUG("" << a );
}
else
{
a = errorMessages[1][split_id];
DCS_LOG_DEBUG("" << a);
}
output.push_back(a);
}
DCS_LOG_DEBUG("Out of the loop 2");
}
}
How can I retrieve my regular expression from the database and, after this loop has finished, use the new regex against the same string?
The string is: shamari
The regular expression I want to add: "\\d"
Ask me any questions if you do not understand.
I'm not sure I understand you entirely, but if you're asking "How do I combine two separate regexes into a single regex", then you need to do
combinedRegex = "(?:" + firstRegex + ")|(?:" + secondRegex + ")"
if you want an "or" comparison (either one of the parts must match).
For an "and" comparison it's a bit more complicated, depending on whether these regexes match the entire string or only a substring.
Be aware that if the second regex uses numbered backreferences, this won't work since the indexes will change: (\w+)\1 and (\d+)\1 would have to become (?:(\w+)\1)|(?:(\d+)\2), for example.
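As a rough sketch of both readings of the question, the snippet below builds the "or" combination described above and, separately, checks one string against a list of patterns in turn. It is written in Python rather than Boost; the helper names are hypothetical, and it searches for a substring match, whereas boost::regex_match requires the whole string to match.

```python
import re

def combine_or(*patterns):
    # Wrap each part in a non-capturing group so '|' applies to whole parts.
    return re.compile("|".join("(?:%s)" % p for p in patterns))

def check_each(text, patterns):
    # Apply every pattern to the same string, one after another.
    return [re.search(p, text) is not None for p in patterns]

first, second = r"\D\W\D", r"\d"
combined = combine_or(first, second)
print(combined.pattern)
print(check_each("shamari", [first, second]))
```

Keeping the patterns separate, as in check_each, also sidesteps the backreference renumbering problem mentioned above.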

VB.Net Matching and replacing the contents of multiple overlapping sets of brackets in a string

I am using vb.net to parse my own basic scripting language, sample below. I am a bit stuck trying to deal with the 2 separate types of nested brackets.
Assuming name = Sam
Assuming timeFormat = hh:mm:ss
Assuming time() is a function that takes a format string but
has a default value and returns a string.
Hello [[name]], the time is [[time(hh:mm:ss)]].
Result: Hello Sam, the time is 19:54:32.
The full time is [[time()]].
Result: The full time is 05/06/2011 19:54:32.
The time in the format of your choice is [[time([[timeFormat]])]].
Result: The time in the format of your choice is 19:54:32.
I could in theory change the syntax of the script completely but I would rather not. It is designed like this to enable strings without quotes because it will be included in an XML file and quotes in that context were getting messy and very prone to errors and readability issues. If this fails I could redesign using something other than quotes to mark out strings but I would rather use this method.
Preferably, unless there is some other way I am not aware of, I would like to do this using regex. I am aware that the standard regex is not really capable of this but I believe this is possible using MatchEvaluators in vb.net and some form of recursion based replacing. However I have not been able to get my head around it for the last day or so, possibly because it is hugely difficult, possibly because I am ill, or possibly because I am plain thick.
I do have the following regex for parts of it.
Detecting the parentheses: (\w*?)\((.*?)\)(?=[^\(+\)]*(\(|$))
Detecting the square brackets: \[\[(.*?)\]\](?=[^\[+\]]*(\[\[|$))
I would really appreciate some help with this as it is holding the rest of my project back at the moment. And sorry if I have babbled on too much or not put enough detail, this is my first question on here.
Here's a little sample which might help you iterate through several matches/groups/captures. I realize that I am posting C# code, but it would be easy for you to convert that into VB.Net
//these two may be passed in as parameters:
string tosearch;//the string you are searching through
string regex;//your pattern to match
//...
Match m;
CaptureCollection cc;
GroupCollection gc;
Regex r = new Regex(regex, RegexOptions.IgnoreCase);
m = r.Match(tosearch);
gc = m.Groups;
Debug.WriteLine("Number of groups found = " + gc.Count.ToString());
// Loop through each group.
for (int i = 0; i < gc.Count; i++)
{
cc = gc[i].Captures;
int counter = cc.Count;
int grpnum = i + 1;
Debug.WriteLine("Scanning group: " + grpnum.ToString() );
// Print number of captures in this group.
Debug.WriteLine(" Captures count = " + counter.ToString());
if (cc.Count >= 1)
{
foreach (Capture cap in cc)
{
Debug.WriteLine(string.Format(" Capture found: {0}", cap.ToString()));
}
}
}
Here is a slightly simplified version of the code I wrote for this. Thanks for the help everyone and sorry I forgot to post this before. If you have any questions or anything feel free to ask.
Function processString(ByVal scriptString As String)
' Functions
Dim pattern As String = "\[\[((\w+?)\((.*?)\))(?=[^\(+\)]*(\(|$))\]\]"
scriptString = Regex.Replace(scriptString, pattern, New MatchEvaluator(Function(match) processFunction(match)))
' Variables
pattern = "\[\[([A-Za-z0-9+_]+)\]\]"
scriptString = Regex.Replace(scriptString, pattern, New MatchEvaluator(Function(match) processVariable(match)))
Return scriptString
End Function
Function processFunction(ByVal match As Match)
Dim nameString As String = match.Groups(2).Value
Dim paramString As String = match.Groups(3).Value
paramString = processString(paramString)
Select Case nameString
Case "time"
Return getLocalValueTime(paramString)
Case "math"
Return getLocalValueMath(paramString)
End Select
Return ""
End Function
Function processVariable(ByVal match As Match)
Try
Return moduleDictionary("properties")("vars")(match.Groups(1).Value)
Catch ex As Exception
End Try
End Function
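For readers who want to experiment with the idea outside VB.Net, here is a minimal Python sketch of a related strategy: rather than recursing from inside the function handler, it repeatedly replaces the innermost [[...]] until nothing changes. This is a different technique from the code above, not a translation of it. The names expand, variables and functions are hypothetical, and the time function is a stand-in that only echoes its format string.

```python
import re

# Innermost [[...]]: the body may not contain another "[[" or "]]".
INNERMOST = re.compile(r"\[\[((?:(?!\[\[|\]\]).)*)\]\]")
# A function call such as time(hh:mm:ss) or time().
CALL = re.compile(r"(\w+)\((.*)\)")

def expand(script, variables, functions):
    def evaluate(match):
        body = match.group(1)
        call = CALL.fullmatch(body)
        if call:
            name, argument = call.groups()
            return functions[name](argument)
        return str(variables[body])

    previous = None
    while script != previous:  # stop once a pass changes nothing
        previous = script
        script = INNERMOST.sub(evaluate, script)
    return script

variables = {"name": "Sam", "timeFormat": "hh:mm:ss"}
# Stand-in for a real time function: it only reports the format it was given.
functions = {"time": lambda fmt: "<time as %s>" % (fmt or "default format")}
print(expand("Hello [[name]], the time is [[time([[timeFormat]])]].", variables, functions))
```

Resolving the innermost brackets first means the inner [[timeFormat]] has already become plain text by the time the enclosing time(...) call is evaluated.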

What regular expression can I use to match a cell reference?

For one of my projects I want to use a regular expression to match a string like "REF:Sheet1!$C$6".
So far I have done
private static bool IsCellReference()
{
string CELL_REFERENCE_PATTERN = @"REF:Sheet[1-9]!$[A-Z]$[0-9]";
Regex r = new Regex(CELL_REFERENCE_PATTERN);
Match m = r.Match("REF:Sheet1!$C$6");
if (m.Success) return true;
else return false;
}
but it is not working. It is returning false.
Where am I wrong?
You need to escape your $ signs.
REF:Sheet[1-9]!\$[A-Z]\$[0-9]
See Regular Expression Language Elements for more information
Also, this page is good for testing your regexes: A better .NET Regular Expression Tester
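The effect of the escaping is easy to check in any regex engine. Here is a short sketch in Python (not .NET) comparing the unescaped and escaped patterns. The function name is_cell_reference is illustrative, and requiring the whole string to match is an assumption that goes beyond the original code.

```python
import re

UNESCAPED = r"REF:Sheet[1-9]!$[A-Z]$[0-9]"   # '$' means end of input here
ESCAPED = r"REF:Sheet[1-9]!\$[A-Z]\$[0-9]"   # '\$' matches a literal dollar sign

def is_cell_reference(text: str) -> bool:
    # fullmatch: the entire string must be a cell reference.
    return re.fullmatch(ESCAPED, text) is not None

print(re.search(UNESCAPED, "REF:Sheet1!$C$6") is not None)
print(is_cell_reference("REF:Sheet1!$C$6"))
```

The unescaped pattern can never match this input, because it demands the end of the input right after the ! and then further characters.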