Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I created a RegEx-based validation with SO's help and passed it off to my coworker explaining the reasons why it is needed (Referring to it as a Whitelist). The coworker then proceeded to change my code (insisting upon what they called a Blacklist) and modified the RegEx. The code corresponding to each approach is listed below. The validation should ensure that only a hyphen, numbers, spaces and letters are allowed. I'd like to know
Which of the code fragments achieves that?
How can I break my co-worker's code?
Is a Blacklist just a Whitelist with the condition inverted?
My co-worker's Code:
objRegExp.Pattern= "[^-A-Za-z0-9'&(). ]"
if objRegExp.Test(strInput) then
FoundSpecialChar= true
exit function
end if
FoundSpecialChar= false
Set objRegExp = Nothing
My Code:
objRegExp.Pattern= "^[-A-Za-z0-9\s'&().]+"
if objRegExp.Test(strInput) then
FoundSpecialChar= false
exit function
end if
FoundSpecialChar= true
Set objRegExp = Nothing
Your colleague's approach lists the acceptable characters. If it finds even one character not in that list, it sets FoundSpecialChar= true which seems to be what you want. To test the difference between his code and yours, you could try to run both the code fragments with strInput = "ABCD#EFGH".
Running your code once with strInput = "A#" and another time with strInput = "#A" should help as well.
BTW, Set objRegExp = Nothing should be included before Exit Function as well.
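The test inputs suggested above can be checked with a small sketch. This is a Python translation of the two VBScript patterns, not the original code; the function names are made up for illustration.

```python
import re

# Python translations of the two VBScript patterns from the question.
NEGATED_CLASS = re.compile(r"[^-A-Za-z0-9'&(). ]")     # co-worker's pattern
ANCHORED_START = re.compile(r"^[-A-Za-z0-9\s'&().]+")  # asker's pattern


def found_special_char_coworker(text: str) -> bool:
    """True as soon as any character outside the allowed set appears anywhere."""
    return NEGATED_CLASS.search(text) is not None


def found_special_char_asker(text: str) -> bool:
    """True only when the input does not START with an allowed character."""
    return ANCHORED_START.search(text) is None


for sample in ("ABCD#EFGH", "A#", "#A", "ABCD"):
    print(sample, found_special_char_coworker(sample), found_special_char_asker(sample))
```

The asker's pattern is anchored only at the start, so "ABCD#EFGH" and "A#" pass because their first character is allowed. Adding a trailing $ (that is, ^[-A-Za-z0-9\s'&().]+$) would make it check the whole string. Note also that both patterns allow ' & ( ) and . in addition to the hyphen, digits, spaces and letters named in the requirement.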
I come from a testing background, and in my experience the Whitelist approach suits application developers while the Blacklist approach suits testers. As a developer, a Whitelist gives you control over exactly what input a user is allowed to enter. As a tester, I would lean on the Blacklist approach more, because it offers a practically unlimited number of inputs to test.
Interesting discussion on SO --> blacklisting vs whitelisting in form's input filtering and validation
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
Below is the email data, which is stored in a single column.
The requirement is to print only the text from Call Example through additional details.
Input:
Summary:
Below are the details:
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Please check out the call details.
Second Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
Some random text.
Output:
Both call examples need to be populated in the new column 'Calldetails1' as two separate rows using PySpark.
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
The regexp_extract call I used to print from Call Example to additional details:
result = df.withColumn('result',regexp_extract('comments','(?s)(?=Call Example)(.?additional details:\s[\w+])',1))
It works for one group. Please suggest how to make it match globally in Python.
As mentioned in the chat:
(?=Call Example)([\w\s:\*]+?[\S])$
(?=Call Example) is a lookahead that asserts the match starts at the text Call Example.
[\w\s:\*]+? lazily matches one or more word characters, whitespace, colons or asterisks, and [\S]$ requires the match to end on a non-whitespace character at the end of the line.
Extracting multiple captured groups using pySpark
https://stackoverflow.com/questions/58930893/extracting-several-regex-matches-in-pyspark
https://stackoverflow.com/questions/54597183/i-have-an-issue-with-regex-extract-with-multiple-matches
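As a rough illustration of matching globally, here is a plain-Python sketch using re.findall on the sample text from the question. The pattern and the function name are assumptions chosen for this example, not the pattern from the chat, and the code does not use PySpark itself; in Spark the same function could in principle be wrapped in a UDF and combined with explode to get one row per match.

```python
import re

# Sample comment text copied from the question.
COMMENTS = """Summary:
Below are the details:
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Please check out the call details.
Second Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
Some random text."""

# Lazily match from "Call Example:" up to the first "additional details:"
# and the non-whitespace value that follows it. DOTALL lets "." cross newlines.
CALL_BLOCK = re.compile(r"Call Example:.*?additional details:\S*", re.DOTALL)


def extract_call_details(text: str) -> list[str]:
    """Return every Call Example block found in the text, in order."""
    return CALL_BLOCK.findall(text)


for block in extract_call_details(COMMENTS):
    print(block)
    print("---")
```

The lazy .*? is what keeps the first match from swallowing the second call example.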
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 3 years ago.
I've been trying for ages to get the id 30393 from the string below.
https://example.com/service_requests/30393/journey
Any ideas how? It's the / that is causing me issues. I've been trying (?<=/).*?(?=/) but obviously it doesn't work.
You could try this:
\d+
DEMO: https://regex101.com/r/9JyTdx/1
OR
(?<=service_requests\/)\d+(?=\/journey)
DEMO: https://regex101.com/r/8SXJiJ/1
/^https:\/\/example\.com\/service_requests\/(\d+)\/journey$/
The answer is this (?<=service_requests\/).*?(?=\/)
Some of the comments inspired me. I just made an alteration to @bhusak's comment.
This might not be a job for regexes, but for existing tools in your language of choice. Regexes are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.
In PHP, use the parse_url function.
Perl: URI module.
Ruby: URI module.
.NET: 'Uri' class
Split the string on / characters and return the 2nd-to-last element. Here's the JavaScript solution, but it should be similar in other languages.
let url = 'https://example.com/service_requests/30393/journey';
let arr = url.split('/');
let id = arr[arr.length-2];
console.log(id);
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I'm trying to learn more about regex and I'm running into a block
my current query:
function telephoneCheck(str) {
return str.match(/[0-9]{3}[-][0-9]{3}[-][0-9]{4}/g)? true : false
}
This only works for specific inputs such as "555-555-5555"; for other inputs such as "1 (555) 555-5555" it does not. I'm at a loss on how to query for optional characters and whitespace. Moreover, bracket handling is odd, and I've found some crazy queries such as /(\d+-)\1\d{4}/g, but I have no idea what it's doing and I don't want to use code I don't understand.
Can someone show me a query that solves for "1 (555) 555-5555" where the first two characters (the one and space) are optional inputs?
These are inputs that the regex should be able to handle:
"1 (555) 555-5555"
"1(555)555-5555"
"1 555-555-5555"
"555-555-5555"
"(555)555-5555"
"5555555555"
I found a solution regex:
function telephoneCheck(str) {
var regex = /^(1\s?)?(\(\d{3}\)|\d{3})[\s\-]?\d{3}[\s\-]?\d{4}$/;
return regex.test(str);
}
telephoneCheck("555-555-5555");
But I have no idea what's going on in here. If someone could explain what's happening, I'd be happy to give you the answer for this posted question :)
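One way to read that pattern is to split it into commented pieces. The sketch below rewrites it in Python with re.VERBOSE; the Python names are assumptions for illustration, and fullmatch stands in for the ^ and $ anchors of the JavaScript version.

```python
import re

PHONE = re.compile(
    r"""
    (1\s?)?              # optional leading 1, optionally followed by one whitespace
    (\(\d{3}\)|\d{3})    # area code: three digits, with or without parentheses
    [\s\-]?              # optional separator: one whitespace or one hyphen
    \d{3}                # next three digits
    [\s\-]?              # optional separator again
    \d{4}                # last four digits
    """,
    re.VERBOSE,
)


def telephone_check(text: str) -> bool:
    """True when the whole string matches one of the accepted phone shapes."""
    return PHONE.fullmatch(text) is not None


for sample in ("1 (555) 555-5555", "1(555)555-5555", "1 555-555-5555",
               "555-555-5555", "(555)555-5555", "5555555555", "555-5555"):
    print(sample, telephone_check(sample))
```

The ? after a group or character class is what makes that piece optional, and the | inside the second group means "either the parenthesized form or the bare form".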
You have to be wary of trying to do everything within one regex, and you should question why the data is so varied in the first place.
If you are just parsing a bunch of what should be phone numbers and notice a lot of different formats, it might actually be more readable to use logic.
There is probably a really clever way of doing the above but I tend to be a bit more brute force with regex until I need more.
The below combines both patterns into one regular expression. You use the | separator to say "or". Also, if your strings are exactly as you say, you should use ^ (starts with) and $ (ends with) to ensure you don't get false positives.
var pattern = /^[0-9] \([0-9]{3}\) [0-9]{3}-[0-9]{4}$|^[0-9]{3}[-][0-9]{3}[-][0-9]{4}$/
pattern.test('555-555-5555') //true
pattern.test('1 (555) 555-5555') // true
pattern.test('(555) 555-5555') // false
And as I say, if you have lots of different formats mixed together, question why, and whether there is a way to clean things up first. Then perhaps use logic and separate statements.
var parensPattern = /^[0-9] \([0-9]{3}\) [0-9]{3}-[0-9]{4}$/
var noParensPattern = /^[0-9]{3}[-][0-9]{3}[-][0-9]{4}$/
if(parensPattern.test('1 (555) 555-5555')) {
// do something
} else if (noParensPattern.test('555-555-5555')) {
// do something
}
Check out http://regex101.com, it is a great resource.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I am trying to figure out a regex to strip extra single quotes so that I would end up with only one single quote. To explain my question better, here is an example.
Let's say I have 3 different strings such as these.
(two single quotes)
Name<span fontSize=''16'' baselineShift=''superscript''>ABC</span>
(three single quotes)
Name<span fontSize='''16''' baselineShift='''superscript'''>ABC</span>
(four single quotes)
Name<span fontSize=''''16'''' baselineShift=''''superscript''''>ABC</span>
I am trying to sanitize the string to end up with this:
Name<span fontSize='16' baselineShift='superscript'>ABC</span>
I tried several online tools. This one is my favourite one: http://ryanswanson.com/regexp/#start. But I just can't get it right.
Could someone please help me out? Any tips and suggestions would be greatly appreciated.
Thank you in advance!
Did you try '+?
var str:String = "Name<span fontSize=''''16'''' baselineShift=''''superscript''''>ABC</span>";
trace( str.replace(/'+/g, "'") );
Have you looked at the docs for AS3's RegEx code? AS3 Replace
You could try something like this
var myPattern:RegExp = /'{2,100}/g;
var str:String = "fontSize=''''16''''";
trace(str.replace(myPattern, "'"));
The '{2,100} essentially looks for a match of ' that occurs between 2 - 100 times and replaces it with a single '.
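The same replacement can be sketched in Python. The function name is made up for illustration, and {2,} is used here instead of {2,100} so that there is no upper bound on the run length.

```python
import re


def collapse_single_quotes(text: str) -> str:
    """Replace every run of two or more single quotes with one single quote."""
    return re.sub(r"'{2,}", "'", text)


sample = "Name<span fontSize=''''16'''' baselineShift=''''superscript''''>ABC</span>"
print(collapse_single_quotes(sample))
```

Note that any such pattern would also collapse an intentionally empty attribute value written as '' into a single quote, so it only suits input like the samples in the question.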
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
Alright, so I have a database where you can get information from that'll show off in this kind of way:
ID, Display name, Likes Cake, Likes Coffee, Likes Dogs
So if you get the information, it would look a little like this:
1,anonymous,1,0,1
Now it's not very popular, so I would like to show the people who have answered this. That means I want everything in "1,!anonymous!,1,0,1" outside the !'s gone. I looked around and found a RegExp that removes everything outside quotes, but putting all the display names in quotes is rather hard and I'm rather impatient.
So a RegExp that erases the numbers, so I could put the usernames up, would be delicious.
Well, you could do something like this:
Replace '^[^,]+,([^,]+).*' With '$1'
How it looks exactly in your language may vary, of course.
But in your case this looks like CSV, so isn't parsing the CSV file easier in that case? E.g. in PowerShell you could do
Import-Csv foo.csv | select 'Display name'
and likewise for other languages that have such parsing built in somewhere. Besides, most other options may break depending on the input, because fields in CSV may contain commas too, which breaks both the regex above and a naïve splitting method.
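A Python counterpart to the PowerShell line could look like this sketch, using the standard csv module. The header names come from the question; the second data row, with a quoted comma in the name, is a made-up example to show the case that a regex or plain split would get wrong.

```python
import csv
import io

# Column names as listed in the question.
HEADER = ["ID", "Display name", "Likes Cake", "Likes Coffee", "Likes Dogs"]


def display_names(csv_text: str) -> list[str]:
    """Parse header-less CSV text and return the 'Display name' column."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=HEADER)
    return [row["Display name"] for row in reader]


# The second row is hypothetical: a display name containing a comma.
rows = '1,anonymous,1,0,1\n2,"Doe, Jane",0,1,1\n'
print(display_names(rows))
```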
You can split the database result string and then get the relevant array index.
string dbString = "1,anonymous,1,0,1";
string username = dbString.Split(',')[1];
//value of username will be "anonymous"