Regex Find One Word But Not Another - regex

How would I detect if a string contained for instance my first name but not my last name using only a regular expression?

The easiest way to use regular expressions for this is to use simple regexes and logical connectives. Since you only want simple matches and since you didn't list a language, here is a basic implementation in Perl:
my $str1="firstname lastname blah blah blah";
my $str2="blurg firstname etc";
foreach($str1,$str2)
{
if(/firstname/ and !/lastname/)
{
print "$_ matched firstname and not lastname!\n";
}
else
{
print "No match for $_\n";
}
}
As expected, the output is:
No match for firstname lastname blah blah blah
blurg firstname etc matched firstname and not lastname!

How about:
#!/usr/bin/perl
use Modern::Perl;
while (<DATA>) {
chomp;
say /^(?=.*\bfirstname\b)(?!.*\blastname\b)/ ? "OK : $_" : "KO : $_";
}
__DATA__
jhjg firstname jkhkjh lastname kljh
jhgj lastname kjhjk firstname kjhkjh
jhgdf firstname sjhdfg not_my_lastname
jshgdf no_my_lastname jhg firstname jkhghg
output:
KO : jhjg firstname jkhkjh lastname kljh
KO : jhgj lastname kjhjk firstname kjhkjh
OK : jhgdf firstname sjhdfg not_my_lastname
OK : jshgdf no_my_lastname jhg firstname jkhghg
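For readers who are not using Perl, here is a minimal sketch of the same lookahead idea in Python. The names `FIRST`, `LAST` and `has_first_but_not_last` are made up for illustration and are not part of the answer above.

```python
import re

# Hypothetical names used only for illustration.
FIRST = "firstname"
LAST = "lastname"

# (?=...) requires FIRST as a whole word somewhere in the line;
# (?!...) rejects the line if LAST appears as a whole word.
pattern = re.compile(
    rf"^(?=.*\b{re.escape(FIRST)}\b)(?!.*\b{re.escape(LAST)}\b)"
)

def has_first_but_not_last(line: str) -> bool:
    """Return True when the line contains FIRST but not LAST as whole words."""
    return pattern.search(line) is not None

samples = [
    "jhjg firstname jkhkjh lastname kljh",
    "jhgdf firstname sjhdfg not_my_lastname",
]
for line in samples:
    print("OK" if has_first_but_not_last(line) else "KO", ":", line)
```

Because `_` counts as a word character, `not_my_lastname` is not treated as containing the whole word `lastname`, which matches the behaviour shown in the Perl output.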

^(?=.*firstname)(?=.*lastname)
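With many regex flavors you can write something like this. I'm using zero-width lookaheads to search for your firstname and your lastname. They don't "move" the regex cursor, so both are scanned starting from the first character. As written, the regex fails if either firstname or lastname is missing, so it detects strings containing both; to require firstname but reject lastname, turn the second lookahead into a negative one: (?!.*lastname).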
Seriously, the regex should be more complex, otherwise you could have these situations:
firstname = `name`
lastname = `lastname`
lastname // ok with the given rules
and with
firstname = `firstname`
lastname = `lastname`
firstnamelastname // ok with the given rules, even without the space
xfirstnamex xlastnamex // ok with the given rules
The regex would need to be:
^(?:.*\bfirstname\b.*\blastname\b|.*\blastname\b.*\bfirstname\b)
so checking for both orders of firstname and lastname and checking that there is a word separator before and after firstname and lastname.
I'll add that what I have shown are perfect examples of things not to do. You want to use regexes? You don't! First you try to use the string functions of your language. Then, if they fail, you can try with regexes.
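As a rough illustration of the "string functions first" advice, here is a sketch in Python that uses only `str.split` and membership tests. The function name `contains_first_not_last` is hypothetical, and the sketch assumes words are separated by whitespace with no punctuation attached.

```python
def contains_first_not_last(text: str, first: str, last: str) -> bool:
    """Whole-word check using plain string functions, no regex."""
    words = text.split()  # split on any run of whitespace
    return first in words and last not in words

print(contains_first_not_last("blurg firstname etc", "firstname", "lastname"))
print(contains_first_not_last("firstname lastname blah", "firstname", "lastname"))
```

If the input can contain punctuation such as `firstname,` this simple split is no longer enough, and that is the point where a regex with `\b` becomes reasonable.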

Related

How to Extract people's last name start with "S" and first name not start with "S"

As the title says, how do I capture a person whose:
Last name starts with the letter "S"
First name does NOT start with the letter "S"
The expression should match the entire last name, not just the first letter, and the first name should NOT be matched.
Input string is like the following:
(Last name) (First name)
Duncan, Jean
Schmidt, Paul
Sells, Simon
Martin, Jane
Smith, Peter
Stephens, Sheila
This is my regular expression:
/([S].+)(?:, [^S])/
Here is the result I got:
Schmidt, P
Smith, P
The result includes the ",", the space and the letter "P", which should be excluded.
The ideal match would be
Schmidt
Smith
You can try this pattern: ^S\w+(?=, [A-RT-Z]) (with the multiline flag if the whole list is a single string).
^S\w+ matches a word (a last name in your case) that starts with S at the beginning of a line.
(?=, [A-RT-Z]) is a positive lookahead: it makes sure that what follows is a comma, a space and a capital letter other than S ([A-RT-Z] covers all capitals except S), without including any of that in the match.
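To see the pattern at work on the sample list, here is a small sketch in Python. It assumes the names arrive as one multi-line string, which is why `re.MULTILINE` is passed so that `^` matches at the start of every line.

```python
import re

text = """Duncan, Jean
Schmidt, Paul
Sells, Simon
Martin, Jane
Smith, Peter
Stephens, Sheila"""

# ^S\w+           : a last name starting with S at the start of a line
# (?=, [A-RT-Z])  : followed by ", " and a capital letter other than S
pattern = re.compile(r"^S\w+(?=, [A-RT-Z])", re.MULTILINE)

last_names = pattern.findall(text)
print(last_names)  # ['Schmidt', 'Smith']
```

Since the lookahead consumes nothing, `findall` returns only the last names, without the comma or the first initial.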
I did something similar to catch the initials. I've just updated the code to fit your need. Check it:
public static void Main(string[] args)
{
//Your code goes here
Console.WriteLine(ValidateName("FirstName LastName", 'L'));
}
private static string ValidateName(string name, char letter)
{
// Split name by space
string[] names = name.Split(new string[] {" "}, StringSplitOptions.RemoveEmptyEntries);
if (names.Count() > 0)
{
var firstInitial = names.First().ToUpper().First();
var lastInitial = names.Last().ToUpper().First();
if(!firstInitial.Equals(letter) && lastInitial.Equals(letter))
{
return names.Last();
}
}
return string.Empty;
}
In your current regex you capture the last name in a capturing group and match the rest in a non-capturing group.
If you change your non-capturing group (?: into a positive lookahead (?= the match would contain only the last name.
([S].+)(?=, [^S]) or a bit shorter S.+(?=, [^S])
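Your regex is close. With a second capturing group added for the first name, the last name is available in the first group:
$array = ["Duncan, Jean","Schmidt, Paul","Sells, Simon","Martin, Jane","Smith, Peter","Stephens, Sheila"];
foreach($array as $el){
// $matches[1] holds the last name, $matches[2] the first name (with its leading space)
if(preg_match('/([S].+)(?:,)( [^S].+)/',$el,$matches))
echo $matches[1]."<br/>";
}
The answer I get is
Schmidt
Smith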

Regex to check capitalization of keyword

Suppose I have a keyword ProcessTest and I want to match all occurrences that are not capitalized that way, in order to replace them with the correctly spelled keyword.
PROCESSTEST >> ProcessTest
Processtest >> ProcessTest
proceSstest >> ProcessTest
Etc.
So I first need a case-insensitive match for the keyword, and next a case-sensitive check for the correctly spelled keyword.
Any suggestions how to do this with regex?
I totally agree with @casimir-et-hippolyte: simply match the word globally with the case-insensitive option (gi), then replace it.
The code is very simple and efficient in every language. Why add complexity when it is not necessary? ;)
Here is a sample
var text = "This is ProCeSSTesT or processTest or PROCESSTEST. This is FooBAR or martyMCfly or foobAr"
var words = ["ProcessTest", "FooBar", "MartyMcFly"]
words.forEach(function(word) {
var re = new RegExp(word, "gi")
text = text.replace(re, word)
})
console.log(text)
Some improvements
When searching for ProcessTest you certainly don't want to match words like preProcessTest or processTester, so let's update our regex like this: \bprocesstest\b
I didn't use lookaheads because not every language supports them.
Demo
var text = "This is ProCeSSTesT or processTester or PROCESSTEST. This is FooBAR or martyMCfly or foobAr or preProcessTest and DummyFooBar"
var words = ["ProcessTest"]
words.forEach(function(word) {
var re = new RegExp("\\b" + word + "\\b", "gi")
text = text.replace(re, word)
})
console.log(text)
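The same approach carries over to other languages. Below is a sketch in Python, assuming the keywords are plain words; `normalize_keywords` is a hypothetical helper name. It adds `re.escape` so that a keyword containing regex metacharacters is still matched literally, which the JavaScript sample above does not do.

```python
import re

def normalize_keywords(text: str, keywords: list[str]) -> str:
    """Rewrite every case variant of each keyword to its canonical spelling."""
    for word in keywords:
        # \b keeps preProcessTest and processTester untouched.
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        # A function replacement avoids backslash handling in the keyword.
        text = pattern.sub(lambda _match, w=word: w, text)
    return text

sample = "PROCESSTEST and processTester and preProcessTest and proceSstest"
print(normalize_keywords(sample, ["ProcessTest"]))
```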

PowerShell Normalize List of Names

I have some really messed up names from a system that I'm trying to match First and Last names in AD. Just need to parse the strings. I have names such as :
Hagstrom, N.P., Ana (Analise)
Banas, R.N., Cynthia
Saltzmann, N.P., April
Lee, Christopher
Rajaram, Pharm.D., Sharmee
Goode Jr, John (Jack) L
Reyes, R.N., Meghan
Miller, M.S., Adrienne M
Chavez, Gabriela
Stevens, MS, CCC-SLP, Christopher
Lockwood Flores, R.N., Jessica
I have tried this, but for some reason, the GivenName isn't being returned properly.
$Name = "Saltzmann, N.P., April"
$GivenName = $Name.Split(",")[$Name.Split(",").GetUpperBound(0)]
$SN = $Name.Split(",")[0]
If ($SN.IndexOf("-") -gt -1) {
$HypenLast = $SN.Split("-")[0]
$SNName = $SN.Split("-")[1]
}
If ($GivenName.IndexOf(" ") -gt -1) {
$GivenName = $GivenName.Replace("(","").Replace(")","").Split(" ")[0]
$MiddleName =$GivenName.Replace("(","").Replace(")","").Split(" ")[1]
}
Trying to take everything before the first comma and everything after last comma, but take letters before the second space of the first name.
Trying to get LastName FirstName but then need to flip it to FirstName LastName. Thanks.
All of the names could be piped to a script block that uses a regex with some named capture groups. The named capture group values can be extracted to rebuild the name you need using string interpolation.
$nameList | ForEach-Object {
$match = [Text.RegularExpressions.Regex]::Match($_, "(?<last>[\w\s]+),(?:.*,)?(?:\s*)(?<first>\w+)")
$lastName = $match.Groups["last"].Value
$firstName = $match.Groups["first"].Value
"$firstName $lastName"
}
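For comparison, here is a sketch of the same named-group idea in Python, checked against the sample names from the question. `NAME_RE` and `first_last` are hypothetical names, and the last-name group uses `[^,]+` instead of `[\w\s]+`, which is my own variation so that hyphenated surnames would also be captured.

```python
import re

# last  : everything before the first comma
# (?:.*,)? : optionally skip credentials up to the last comma
# first : the first word after that
NAME_RE = re.compile(r"^(?P<last>[^,]+),(?:.*,)?\s*(?P<first>\w+)")

def first_last(raw: str) -> str:
    """Turn 'Last, [titles,] First ...' into 'First Last'."""
    match = NAME_RE.match(raw)
    if match is None:
        return raw  # leave unparseable input unchanged
    return f"{match['first']} {match['last'].strip()}"

for raw in ["Saltzmann, N.P., April", "Goode Jr, John (Jack) L", "Lee, Christopher"]:
    print(first_last(raw))
```

Note that splitting on "," alone, as in the question, leaves a leading space on the last piece (" April"), so a later split on " " returns an empty first element. The `\s*` before the first-name group sidesteps that.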

RegEx and split camelCase

I want to get an array of all the words with capital letters that are included in the string. But only if the line begins with "set".
For example:
- string "setUserId", result array("User", "Id")
- string "getUserId", result false
Without the "set" restriction, the regex looks like /([A-Z][a-z]+)/
$str ='setUserId';
$rep_str = preg_replace('/^set/','',$str);
if($str != $rep_str) {
$array = preg_split('/(?<=[a-z])(?=[A-Z])/',$rep_str);
var_dump($array);
}
Your regex will also work:
$str = 'setUserId';
if(preg_match('/^set/',$str) && preg_match_all('/([A-Z][a-z]*)/',$str,$match)) {
var_dump($match[1]);
}
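The two-step logic (check the prefix, then collect the capitalized words) can be sketched in Python as well. `setter_words` is a hypothetical name, and it returns `None` where the question asks for false.

```python
import re

def setter_words(name: str):
    """Return the capitalized words after a leading 'set', or None otherwise."""
    if not name.startswith("set"):
        return None
    # Drop the prefix, then collect each run that starts with a capital letter.
    return re.findall(r"[A-Z][a-z]*", name[len("set"):])

print(setter_words("setUserId"))  # ['User', 'Id']
print(setter_words("getUserId"))  # None
```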

convert comma-separated string-pairs with regex

I have a comma-separated list of first- and lastnames which I need to convert to SQL
(whitespace exists after the comma):
joe, cool
alice, parker
etc.
should become:
( firstname ='joe' and lastname = 'cool' ) or
( firstname ='alice' and lastname = 'parker' )
How can I achieve this with a regular expression?
In Perl you can do this:
s/(\S+),\s*(\S+)/( firstname ='\1' and lastname = '\2' )/
From the command line:
> perl -pe "s/(\S+),\s*(\S+)/( firstname ='\1' and lastname = '\2' )/" input.txt
Input:
joe, cool
alice, parker
Output:
( firstname ='joe' and lastname = 'cool' )
( firstname ='alice' and lastname = 'parker' )
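The output above does not include the "or" between clauses that the question asks for. One way to add it is to build the clauses first and join them afterwards. Here is a sketch in Python; `PAIR_RE` and `pairs_to_sql` are hypothetical names, and the doubling of single quotes is my own addition to keep names like O'Brien from breaking the statement.

```python
import re

# One "first, last" pair per line, with optional surrounding whitespace.
PAIR_RE = re.compile(r"^\s*(\S+),\s*(\S+)\s*$")

def pairs_to_sql(lines):
    """Build an OR-joined SQL condition from 'first, last' lines."""
    clauses = []
    for line in lines:
        match = PAIR_RE.match(line)
        if match is None:
            continue  # skip lines that are not a name pair
        # Double any single quote so the SQL string literal stays valid.
        first, last = (part.replace("'", "''") for part in match.groups())
        clauses.append(f"( firstname ='{first}' and lastname = '{last}' )")
    return " or\n".join(clauses)

print(pairs_to_sql(["joe, cool", "alice, parker"]))
```

If the condition is going to be executed rather than pasted into a script, parameterized queries are usually a safer choice than building SQL text.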