Regex to evaluate phone number in Pentaho [duplicate] - regex

I want to remove special characters such as ! @ # $ % ^ * _ = + | \ } { [ ] : ; < > ? / from a string field.
I used the "Replace in String" step and enabled "Use RegEx". However, I do not know the right syntax to put in "Search" to remove all of these characters from the string. If I put only one character in "Search", it is removed from the string. How can I remove all of them?

As per documentation, the regex flavor is Java. You may use
\p{Punct}
See the Java regex syntax reference:
\p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
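To see exactly which characters this covers, here is a minimal sketch in JavaScript. JavaScript has no \p{Punct}, so the sketch spells out the same 32 ASCII punctuation characters in an explicit character class. The names ASCII_PUNCT and stripPunctuation are made up for illustration; in Pentaho itself, \p{Punct} in "Search" should be enough.

```javascript
// Hypothetical helper: JavaScript does not support Java's \p{Punct},
// so the same ASCII punctuation set is written out by hand.
// Inside the class, -, /, \ and ] are escaped.
const ASCII_PUNCT = /[!"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~]+/g;

function stripPunctuation(text) {
  // Replace every run of punctuation with the empty string.
  return text.replace(ASCII_PUNCT, "");
}

console.log(stripPunctuation("a!b@c#d$ [x] {y} \\z/")); // abcd x y z
```

Letters, digits and whitespace are left untouched; only the listed punctuation is removed.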

Related

Regular expression not working with \ and ]

I have a regex for validating a password, which has to be at least 8 characters long and must contain a letter (upper and lower case), a number, and a special character from the set ^ $ * . [ ] { } ( ) ? - " ! @ # % & / \ , > < ' : ; | _ ~ ` .
I face two problems: after adding the / to the regex, it is not recognized (other characters still work OK). If I add the /] as well, the expression no longer works (everything is invalid, though the pattern seems to be OK in the browser debug mode).
The regex string
static get PASSWORD_VALIDATION_REGEX(): string {
  return '(?=.*[a-z])(?=.*[0-9])(?=.*[A-Z])' + // contains lowercase, number, uppercase
    '(?=.*[\-~\$@!%#<>\|\`\\\/\[;:=\+\{\}\.\(\)*^\?&\"\,\'])' + // special
    '.{8,}'; // at least 8 characters
}
I used the regexp as a form validator and as a match in function
password: ['', {
    validators: [
      Validators.required,
      Validators.pattern(StringUtils.PASSWORD_VALIDATION_REGEX)
    ],
    updateOn: 'change'
  }
]
//....
value.match(StringUtils.PASSWORD_VALIDATION_REGEX)
I tried using only (?=.*[\\]) for the special character list; in that case I received a console error:
Invalid regular expression: /^(?=.*[a-z])(?=.*[0-9])(?=.*[A-Z])(?=.*[\]).{8,}$/: Unterminated character class
For '(?=.*[\]])' there is no console error, but the form validation reports the following error under 'pattern':
actualValue: "AsasassasaX000[[][]"
requiredPattern: "^(?=.*[a-z])(?=.*[0-9])(?=.*[A-Z])(?=.*[]]).{8,}$"
The same value and pattern also fail on https://regex101.com/
Thanks for your help / suggestions in advance!
You have overescaped your pattern and failed to escape the ] char correctly. In JavaScript regex, ] inside a character class must be escaped.
If you are confused with how to define escapes inside a string literal (and it is rather confusing indeed), you should use a regex literal. One thing to remember about the regex use with Validators.pattern is that the string pattern is anchored by the Angular framework by enclosing the whole pattern with ^ and $, so these anchors must be present when you define the pattern as a regex literal.
Use
static get PASSWORD_VALIDATION_REGEX(): RegExp {
  return /^(?=.*[a-z])(?=.*[0-9])(?=.*[A-Z])(?=.*[-~$@!%#<>|`\\\/[\];:=+{}.()*^?&",']).{8,}$/;
}
Note the \] that matches a ] char and \\ to match \ inside [...].
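A quick way to check the corrected pattern outside Angular is to run it against a few sample values. The sketch below is plain JavaScript that follows the answer's pattern; isValidPassword and the sample passwords (other than the value from the question) are invented for illustration.

```javascript
// Regex literal: no string-level escaping is involved,
// so \\ matches one backslash and \] matches one ].
const PASSWORD_REGEX =
  /^(?=.*[a-z])(?=.*[0-9])(?=.*[A-Z])(?=.*[-~$@!%#<>|`\\\/[\];:=+{}.()*^?&",']).{8,}$/;

// Hypothetical helper, not part of Angular.
function isValidPassword(value) {
  return PASSWORD_REGEX.test(value);
}

console.log(isValidPassword("AsasassasaX000[[][]")); // true: contains [ and ]
console.log(isValidPassword("Abcdef1\\"));            // true: the special char is a backslash
console.log(isValidPassword("Abcdefg1"));             // false: no special character
console.log(isValidPassword("Ab1]"));                 // false: shorter than 8 characters
```

If the pattern is kept as a string instead, every backslash has to be doubled once more for the string literal, which is where the original version went wrong.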

How to filter a string for invalid filename characters using regex

I don't want the user to type anything invalid, so I am trying to remove it. I made a regex that removes everything except word characters, but it also removes . , and -, and I need to keep these signs to make the user happy :D
In short: this script removes bad characters from an input field using a regex.
Input field:
$CustomerInbox = New-Object System.Windows.Forms.TextBox #initialization -> initializes the input box
$CustomerInbox.Location = New-Object System.Drawing.Size(10,120) #Location -> where the label is located in the window
$CustomerInbox.Size = New-Object System.Drawing.Size(260,20) #Size -> defines the size of the inputbox
$CustomerInbox.MaxLength = 30 #sets max. length of the input box to 30
$CustomerInbox.add_TextChanged($CustomerInbox_OnTextEnter)
$objForm.Controls.Add($CustomerInbox) #adding -> adds the input box to the window
Function:
$ResearchGroupInbox_OnTextEnter = {
    if ($ResearchGroupInbox.Text -notmatch '^\w{1,6}$') { # regex check: 1 to 6 word characters (letters, digits, underscore)
        $ResearchGroupInbox.Text = $ResearchGroupInbox.Text -replace '\W' # removes all non-word characters
    }
}
Bad Characters I don't want to appear:
~ " # % & * : < > ? / \ { | } #those are the 'bad characters'
Note that if you want to replace invalid file name chars, you could leverage the solution from How to strip illegal characters before trying to save filenames?
Answering your question, if you have specific characters, put them into a character class, do not use a generic \W that also matches a lot more characters.
Use
[~"#%&*:<>?/\\{|}]+
See the regex demo
Note that none of these characters except \ needs escaping inside a character class. Also, adding the + quantifier (which matches one or more occurrences of the quantified subpattern) streamlines the replacement: it matches whole runs of consecutive characters and replaces each run at once with the replacement (here, an empty string).
Note you may also need to account for filenames like con, lpt1, etc.
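The character-class approach can be sketched in JavaScript as follows. This is only an illustration of the pattern, not the PowerShell form handler from the question; BAD_CHARS and removeBadChars are hypothetical names.

```javascript
// The bad characters from the question, in one character class.
// \ must be escaped inside the class; / is escaped too so that it
// cannot be mistaken for the end of the regex literal.
const BAD_CHARS = /[~"#%&*:<>?\/\\{|}]+/g;

// Hypothetical helper name.
function removeBadChars(text) {
  return text.replace(BAD_CHARS, "");
}

console.log(removeBadChars('re~port: "Q1"/2024?')); // report Q12024
console.log(removeBadChars("a.b,c-d"));             // a.b,c-d (dot, comma, hyphen are kept)
```

Unlike \W, this leaves . , and - alone, which is what the question asked for.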
To ensure the filename is valid, you should use the .NET GetInvalidFileNameChars method to retrieve all invalid characters and use a regex to check whether the filename is valid:
[regex]$containsInvalidCharacter = '[{0}]' -f ([regex]::Escape([System.IO.Path]::GetInvalidFileNameChars()))
if ($containsInvalidCharacter.IsMatch(($ResearchGroupInbox.Text)))
{
# filename is invalid...
}
$ResearchGroupInbox.Text -replace '~|"|#|%|\&|\*|:|<|>|\?|\/|\\|{|\||}'
Or, as @Wiketor suggests, you can shorten it to '[~"#%&*:<>?/\\{|}]+'

Regular expression extract filename from line content

I'm very new to regular expressions. I want to extract the following string
"109_Admin_RegistrationResponse_20130103.txt"
from this file content, which is read line by line:
01-10-13 10:44AM 47 107_Admin_RegistrationDetail_20130111.txt
01-10-13 10:40AM 11 107_Admin_RegistrationResponse_20130111.txt
The regular expression should not pick the second line; only the first line should return true.
Your regex has several different mistakes:
Your line does not start with the required filename, yet you anchored the pattern with ^.
The + is missing after your character group [a-zA-Z], so it can only match a single character.
The character group does not include _, so it won't match Admin_RegistrationResponse.
The \ is missing before d, so d{2} would match only the literal text dd.
As per M42's answer (which I left out), you also need to escape your dot ., or it would also match 123_abc_12345678atxt (notice the a before txt).
Your regex should be
\d+_[a-zA-Z_]+_\d{4}\d{2}\d{2}\.txt$
which can be simplified as
\d+_[a-zA-Z_]+_\d{8}\.txt$
since \d{4}\d{2}\d{2} is redundant, unless you want capturing groups, in which case you would use:
\d+_[a-zA-Z_]+_(\d{4})(\d{2})(\d{2})\.txt$
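As a rough illustration of the capturing-group version, here is a small JavaScript sketch. The helper name parseListingLine and the returned field names are assumptions made for the example; the sample line is the one quoted elsewhere in this thread.

```javascript
// Same pattern as above, with capture groups for year, month and day.
const FILE_RE = /\d+_[a-zA-Z_]+_(\d{4})(\d{2})(\d{2})\.txt$/;

// Hypothetical helper: returns the file name and date parts, or null
// when the line does not end in a matching file name.
function parseListingLine(line) {
  const m = FILE_RE.exec(line);
  if (m === null) return null;
  return { name: m[0], year: m[1], month: m[2], day: m[3] };
}

const line = "01-09-13 10:17AM 11 109_Admin_RegistrationResponse_20130103.txt";
console.log(parseListingLine(line));
```

Note that the pattern matches any file name of this shape. To pick out one specific file, the numeric prefix or the name part would have to be written literally.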
Remove the anchors and escape the dot:
\d+[a-zA-Z_]+\d{8}\.txt
I'm a newbie in PHP, but I think you can use the explode() function in PHP or an equivalent in your language.
$string = "01-09-13 10:17AM 11 109_Admin_RegistrationResponse_20130103.txt";
$pieces = explode("_", $string); // split the line on underscores
$stringout = "";
for ($i = 0; $i < count($pieces); $i++) { // counted loop over the pieces
    $stringout = $stringout . $pieces[$i]; // concatenate the pieces back together
}

Problem in not replacing the minus sign (-) with a blank using regex

I am using this regex to replace some characters with "".
I used it as
query=query.replace(/[^a-zA-Z 0-9 * ? : . + - ^ "" _]+/g,'');
When my query is +White+Diamond, I get +White+Diamond as expected, but when the query is -White+diamond, I get White+diamond. This means the - is replaced by "", which I don't want.
Please tell me what the problem is.
Inside a character class, - means "from ... to ..." (a range). Escape your - with a backslash: \-.
What SteeveDroz said:
query=query.replace(/[^a-zA-Z0-9*?:.+\-^"_ ]+/g,'');
I'm assuming you want to keep spaces as well (that is, exclude them from removal). If not, remove the final space from the character class.
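The difference between the two character classes can be checked with a short sketch. BUGGY and FIXED are just illustrative names for the question's pattern and the corrected one.

```javascript
// Original class: " - " between two spaces is read as the range space..space,
// so the hyphen itself is not in the (negated) class and gets removed.
const BUGGY = /[^a-zA-Z 0-9 * ? : . + - ^ "" _]+/g;

// Fixed class: \- is a literal hyphen, so hyphens are kept.
const FIXED = /[^a-zA-Z0-9*?:.+\-^"_ ]+/g;

console.log("-White+diamond".replace(BUGGY, "")); // White+diamond
console.log("-White+diamond".replace(FIXED, "")); // -White+diamond
console.log("+White+Diamond".replace(FIXED, "")); // +White+Diamond
```

Placing the hyphen first or last in the class would also make it literal; escaping it is simply the least ambiguous option.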