Regex get next 2 words after certain string - regex

I need a regular expression that can find names in some text content. It should match from 1 to 3 names: first name, (middle name), (surname).
I have a list of valid first names which will be used to search the text. If a first name is found in the text, the regular expression should also capture the following middle name and/or surname, if they exist.
For example, the names below should all be found as valid names:
John
John Doe
John Average Joe
Special cases:
John average Doe (if possible, it should match/find John Doe)
So far my solution is:
\b(John|Mary|Tom)\b(?:(?:([^A-Za-z]*[A-Z][^\s,]*)*[^A-Za-z]+)){0,3}
This mostly works, but the match is supposed to be limited to a maximum of 3 words, and this pattern does not enforce that.
Online test: http://regex101.com/r/aM7bS3/2

I've modified your regex. You can use the following:
\b(Mogens|Victor|John)(\b\s*([A-Z]\w+)){0,2}
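To see how that pattern behaves on the examples from the question, here is a minimal Python sketch. FIRST_NAMES, NAME_RE and find_names are names made up for this illustration, and the sample sentences are invented. It assumes the first-name list is small enough to join into one alternation.

```python
import re

# Hypothetical list of valid first names; in practice this comes from your own data.
FIRST_NAMES = ["Mogens", "Victor", "John"]

# Same shape as the pattern above: a known first name, then up to two
# further words that each start with an uppercase letter.
NAME_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, FIRST_NAMES)) + r")(\b\s*([A-Z]\w+)){0,2}"
)

def find_names(text):
    """Return the full text of every match of NAME_RE in text."""
    return [m.group(0) for m in NAME_RE.finditer(text)]

print(find_names("Yesterday John Average Joe met Victor Hugo in town."))
print(find_names("John average Doe"))
```

Note that the special case from the question is not covered: a lowercase middle word stops the match, so "John average Doe" only yields "John".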

Related

Validate authors in google sheets using regex

I have an authors column and I would like to limit the input to a specific format using data validation and REGEXMATCH.
Let's say we have 3 authors (of course the validation should allow for 1 or more authors). In no particular order:
John Edward Smith
Jane Doe
José Luis-Visquez
The desired format is strictly this (including upper and lower case and punctuation):
Smith JE, Doe J, Luis-Visquez J
Anything else should throw an error.
No dot at the end
I tried this regex but it is matching incorrect inputs as well:
(?:(?:[A-Z][a-z]+\-?(?:[A-Z][a-z]+)?)\s[A-Z]{1,2}, )*(?:(?:[A-Z][a-z]+\-?(?:[A-Z][a-z]+)?)\s[A-Z]{1,2})
What is the correct regex that would allow for unlimited authors in this specific format in no particular order for the author names? The regex should be general to any name.
Try:
=ARRAYFORMULA(REGEXMATCH(B2:B4, "^\w+(?:-\w+)? [A-Z]{1,2}$"))
or more strict:
=ARRAYFORMULA(REGEXMATCH(B2:B4, "^[A-Z][a-z]+(?:-[A-Z][a-z]+)? [A-Z]{1,2}$"))
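The two formulas above check a single author per cell. For a whole comma-separated list in one cell, one option is to repeat the single-author pattern with ", " between repetitions. Here is a minimal Python sketch of that idea, assuming ASCII-only surnames and the stricter single-author pattern. AUTHOR, AUTHOR_LIST_RE and is_valid_author_list are names made up for illustration. In Sheets the same structure would presumably go inside REGEXMATCH with ^ and $ around it, but that part is untested here.

```python
import re

# One author: capitalized surname, optional hyphenated second part,
# a space, then one or two uppercase initials.
AUTHOR = r"[A-Z][a-z]+(?:-[A-Z][a-z]+)? [A-Z]{1,2}"

# Whole cell: one author, then any number of ", author" repetitions.
AUTHOR_LIST_RE = re.compile(rf"{AUTHOR}(?:, {AUTHOR})*")

def is_valid_author_list(cell):
    """True only if the entire cell is a well-formed author list."""
    return AUTHOR_LIST_RE.fullmatch(cell) is not None

print(is_valid_author_list("Smith JE, Doe J, Luis-Visquez J"))
print(is_valid_author_list("Smith JE, Doe J."))
```

The original attempt most likely accepted bad input because it had no anchors, so any valid fragment inside the cell was enough. fullmatch plays the role of the anchors here.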

Convert MS Outlook formatted email addresses to names of attendees using RegEx

I'm trying to use the regex find and replace in Notepad++ to extract names from MS Outlook formatted meeting attendee details.
I copied and pasted the attendee details and got names like this:
Fred Jones <Fred.Jones#example.org.au>; Bob Smith <Bob.Smith#example.org.au>; Jill Hartmann <Jill.Hartmann#example.org.au>;
I'm trying to wind up with
Fred Jones; Bob Smith; Jill Hartmann;
I've tried a number of permutations of
\B<.*>; \B
on Regex 101.
Regex quantifiers are greedy: <.*> matches from the first < to the last > in one fell swoop. You want to say "any character which is neither of these" instead of just "any character".
 *<[^<>]*>
The pattern starts with a single literal space followed by an asterisk, which consumes any spaces before the match. Replace these matches with nothing and you will be left with just the names, like in your example.
This is a very common FAQ.
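The difference between the greedy and the negated-class version can be checked with a short Python sketch, using the attendee string from the question. The variable names are made up for illustration.

```python
import re

attendees = (
    "Fred Jones <Fred.Jones#example.org.au>; "
    "Bob Smith <Bob.Smith#example.org.au>; "
    "Jill Hartmann <Jill.Hartmann#example.org.au>;"
)

# Greedy: .* runs to the last > on the line, so everything after the
# first name is swallowed in a single match.
greedy = re.sub(r" *<.*>", "", attendees)

# Negated class: [^<>]* cannot cross a bracket, so each address is
# matched and removed separately.
negated = re.sub(r" *<[^<>]*>", "", attendees)

print(greedy)   # Fred Jones;
print(negated)  # Fred Jones; Bob Smith; Jill Hartmann;
```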

Regex for more than 1 First Name before the Middle Initial

I'm not that good with regular expressions, and here is my problem:
I want to create a regex that matches a name with two or more first names (e.g. Francis Gabriel).
I came up with the regex ^[A-Z][a-z]{3,30}/s[A-Z][a-z]{3,30}, but it only matches two first names, not all of them.
The regex should match John John J. Johnny.
^[A-Z][a-z]{3,30}(\\s[A-Z](\\.|[a-z]{2,30})?)*$
In Java, the backslashes must be doubled (\\s, \\.) because the pattern is written as a string literal passed to Pattern.compile.
Each additional part must be either an initial followed by a dot (X.) or a full word (Xyz), never a mix of the two:
John Johny J.hny -> is wrong
So after the capital letter the pattern accepts either \. or [a-z]{2,30}. Only the first name is required, so the second part ends with * to match it zero or more times.
Java cannot be run in this snippet, so here is a JavaScript implementation of the same regex:
var reg = /^[A-Z][a-z]{3,30}(\s[A-Z](\.|[a-z]{2,30})?)*$/;
console.log(reg.test("John john")); // false: the second part starts with a lowercase letter
console.log(reg.test("John John")); // true
console.log(reg.test("John John J.")); // true
console.log(reg.test("John John J. Johny")); // true
Use the following regex:
^\w+\s(\w+\s)+\w\.\s\w+$
^\w+\s matches a name and a space
(\w+\s)+ followed by at least one more name and space
\w\.\s followed by a single-letter initial with a dot, then a space
\w+$ followed by a last name
Test code:
String testInput = "John John P. Johnny";
// String.matches() must match the entire input, so ^ and $ are optional here
if (testInput.matches("^\\w+\\s(\\w+\\s)+\\w\\.\\s\\w+$")) {
    System.out.println("We have a match");
}
Try this:
^(\S*\s+)(\S*)?\s+\S*?
Francis Gabriel - matches:
0: [0,10] Francis
1: [0,9] Francis
2: [9,9]
John John2 J. Johnny - matches:
0: [0,11] John John2
1: [0,5] John
2: [5,10] John2

Multiple filter regex

Sample Data:
ID Name User
12 Test Same
14 Xyz Joe
15 Abc John
16 Def Bill
17 Ghi Donald
If a user searches for Abc or Joe, they should get those rows.
Regex:
'Abc|Joe'
Output:
14 Xyz Joe
15 Abc John
Now, if the user further searches for e, it should filter based on the previous output (2 rows retrieved), so I will just get 14 Xyz Joe. Is this possible using regex?
I am trying to have all this in one regex.
`'Abc|Joe and the second filter goes here (All in one regex)'`
Use case: The user selects checkboxes to set the filters he wants to apply on the data (All the data in the columns Name and User are available). He may then search again on the filtered result using a search textbox.
((firstRegex)(?:.*(secondRegex)))|((secondRegex)(?:.*(firstRegex)))
((Abc|Xyz)(?:.*(Jo)))|((Jo)(?:.*(Abc|Xyz)))
We don't know which pattern will appear first in the row, so there are two cases, and we combine them with |. If there are more search terms, I suggest writing some code instead.
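As a rough idea of what "some code" could look like, here is a minimal Python sketch that applies any number of filters in sequence, using the sample rows from the question. apply_filters and combined_pattern are names made up for illustration. The second helper builds a single regex with lookaheads, which is a different technique from the alternation above and is only one possible approach.

```python
import re

ROWS = [
    "12 Test Same",
    "14 Xyz Joe",
    "15 Abc John",
    "16 Def Bill",
    "17 Ghi Donald",
]

def apply_filters(rows, patterns):
    """Keep only the rows in which every pattern matches somewhere."""
    compiled = [re.compile(p) for p in patterns]
    return [row for row in rows if all(rx.search(row) for rx in compiled)]

def combined_pattern(patterns):
    """Build one regex that requires all patterns, in any order, via lookaheads."""
    return "^" + "".join(f"(?=.*(?:{p}))" for p in patterns)

print(apply_filters(ROWS, ["Abc|Joe"]))
print(apply_filters(ROWS, ["Abc|Joe", "e"]))
print(combined_pattern(["Abc|Joe", "e"]))
```

Both helpers search the whole row, so a filter like e would also match the Name column. Restricting a filter to one column would need the row split into fields first.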
For the 2 filters:
/^\d+\s+(?:Abc|Xyz|Def)\s+\S*(?:Jo|ill).*/mg;
If the user doesn't specify the second filter, you could just leave it empty as (?:).
I'm positive you could create these kinds of expressions if you spent a few minutes reading about regex syntax, so allow me to recommend:
Regular Expressions Tutorial (regular-expressions.info). A quite comprehensive tutorial to learn regex.
regex101.com. Allows you to test different expressions and understand the way a pattern matches the subject string.

CSV - split full name into first and last name

I regularly need to process large lists of user data for our marketing emails. I get a lot of CSVs with full name and email address and need to split these full names into separate first name and last name values. For example:
John Smith,jsmith#gmail.com
Jane E Smith,jane-smith#example.com
Jeff B. SMith,jeff_b#demo.com
Joel K smith,joelK#demo.org
Mary Jane Smith,mjs#demo.co.uk
In all of these cases, I want Smith to go in the last name column and everything else into the first name column.
Basically, I'd like to look for the last space before the first comma and replace that last space with a comma. But, I'm lost on how to do this, so any suggestions would be greatly appreciated. Also, I'm using BBEdit to process the text file.
Try the following regex:
(.*?) (\b\w*\b)(,[^,]*$)
And the substitution:
$1,$2$3
After substitution, the data will be as follows:
John,Smith,jsmith#gmail.com
Jane E,Smith,jane-smith#example.com
Jeff B.,SMith,jeff_b#demo.com
Joel K,smith,joelK#demo.org
Mary Jane,Smith,mjs#demo.co.uk
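If you ever need to script this outside BBEdit, the same search and replace can be sketched in Python. SPLIT_RE and split_full_name are names made up for illustration, and the sketch assumes one record per line with exactly two columns, as in the sample data.

```python
import re

# Same pattern as above: lazily take everything up to the last space
# before the final comma, then the last word, then the rest of the line.
SPLIT_RE = re.compile(r"(.*?) (\b\w*\b)(,[^,]*$)")

def split_full_name(line):
    """Replace the last space before the final comma with a comma."""
    return SPLIT_RE.sub(r"\1,\2\3", line, count=1)

sample = [
    "John Smith,jsmith#gmail.com",
    "Jane E Smith,jane-smith#example.com",
    "Jeff B. SMith,jeff_b#demo.com",
    "Joel K smith,joelK#demo.org",
    "Mary Jane Smith,mjs#demo.co.uk",
]
for row in sample:
    print(split_full_name(row))
```

A line whose name has no space (a single-word name) does not match and is returned unchanged, so it would still have only two columns afterwards.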