Regular Expression - regex

I wonder if anyone can help.
I need to write a regular expression that throws away everything apart from the last word if that last word is an alphanumeric (numbers and letters) or a single number or a single letter.
For example
Ground floor Apartment 2
Garden Apartment 1A
Block 2D
Suite 12
Unit C
Basement Flat
General Office
I would like to remove all words and characters that are not part of the actual number i.e.
Ground Floor Apartment 2 should become 2
Garden Apartment 1A should become 1A
Block 2D should become 2D
Suite 12 should become 12
Unit C should become C
Basement Flat should become blank as there are no numbers involved
General Office should become blank
Many Thanks in advance

You could try using a positive lookahead which asserts your requirements at the end of the string.
(?:\b[A-Za-z]{1}|\d+|(?=.*\d)[a-zA-Z0-9]+)$
Explanation
A non capturing group (?:
A word boundary \b
Match a single letter [A-Za-z]{1}
Or |
One or more digits \d+
Or |
A positive lookahead which asserts that a digit occurs further on; since the match must run to the end of the string, that digit is in the last word (?=.*\d)
Match one or more lower/upper case characters or digits [a-zA-Z0-9]+
Close non capturing group )
The end of the string $
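As a rough check, the pattern above can be run against the sample lines from the question. This is a minimal sketch in Python (the question does not name a language); the helper name `last_token` is made up for illustration, and "blank" is assumed to mean an empty string.

```python
import re

# Pattern copied from the answer above.
LAST_TOKEN = re.compile(r"(?:\b[A-Za-z]{1}|\d+|(?=.*\d)[a-zA-Z0-9]+)$")

def last_token(address: str) -> str:
    """Return the trailing letter/number token, or "" when there is none.

    `last_token` is a hypothetical helper name, not part of the answer.
    """
    match = LAST_TOKEN.search(address)
    return match.group(0) if match else ""

samples = [
    "Ground floor Apartment 2",
    "Garden Apartment 1A",
    "Block 2D",
    "Suite 12",
    "Unit C",
    "Basement Flat",
    "General Office",
]
for line in samples:
    print(f"{line!r} -> {last_token(line)!r}")
```

Using search rather than a replace avoids having to describe "everything else" in the pattern: the wanted token is simply what matches at the end of the string.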

What language are you using? You should be able to get the last word by splitting/exploding the string using spaces, then apply the regex to the last word.
You may want to handle words of length 1 separately, to make your regex simpler to understand and troubleshoot. The regex below works for any word that is 2 characters or longer.
Here's a regex that should work for that last word. It uses positive lookaheads to ensure that at least one letter and one digit are present. https://regex101.com/r/i5R9bq/1/
(?=.*[0-9])(?=.*[A-Za-z])[0-9A-Za-z]+
Note that the letter class should be written [A-Za-z] rather than [A-z], because [A-z] also matches the ASCII characters between Z and a ([, \, ], ^, _ and `).
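The split-then-match approach described in this answer might look like the sketch below, written in Python as an assumption since the asker's language is unknown. The letter class is written as [A-Za-z]. The helper name `last_word_token` is hypothetical, and the separate all-digits branch is an addition that the answer's regex does not cover on its own; it is needed for inputs such as "Suite 12".

```python
import re

# Mixed letters-and-digits word, as in the answer (letter class [A-Za-z]).
MIXED = re.compile(r"(?=.*[0-9])(?=.*[A-Za-z])[0-9A-Za-z]+")

def last_word_token(address: str) -> str:
    """Return the last word if it looks like a unit number, else "".

    `last_word_token` is a hypothetical helper name.
    """
    words = address.split()
    if not words:
        return ""
    last = words[-1]
    # Length-1 words are handled separately, as the answer suggests.
    if len(last) == 1:
        return last if re.fullmatch(r"[0-9A-Za-z]", last) else ""
    # Assumption beyond the answer's regex: accept all-digit words ("Suite 12").
    if re.fullmatch(r"[0-9]+", last):
        return last
    return last if MIXED.fullmatch(last) else ""

print(last_word_token("Garden Apartment 1A"))
print(repr(last_word_token("Basement Flat")))
```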

Related

Regex: How to find a phone number (or number sequence) that begins with a particular single digit (multiple numbers on the same line)

Newbie question, but how can I check for instances where there are multiple numbers on the same line? For instance, the content reads: contact 408-555-5454 or reach out to 408-555-4545. Right now the best I can do is ^4, but that only catches multiple things if the multiline flag is turned on. Any ideas?
You could try the regex below
/4\d{2}(-| )?\d{3}(-| )?\d{4}/g
This of course assumes that you're looking for numbers that start with 4. You can have a look at the Regex Snippet here and you can experiment with trying different variations of the regex to suit your needs.
Here's a key to the regex elements included:
1. 4 = matches the literal number 4
2. \d{2} = matches 2 digits (0-9)
3. (-| )? = matches either a hyphen or a single space, but makes it optional, i.e. you can have a space, a hyphen, or neither
4. \d{3} = matches 3 digits (0-9)
5. Same as #3 above
6. \d{4} = matches 4 digits (0-9)
7. The g flag ensures that you search through the whole text and do not stop after the first match
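For comparison, here is a minimal Python sketch of the same search (the question does not say which language is in use, so Python is an assumption). Python has no g flag; `re.findall` plays that role. The two separator groups are made non-capturing here so that `findall` returns whole matches rather than the captured separators. `find_numbers` is a hypothetical helper name.

```python
import re

# Same pattern as above, with the separator groups made non-capturing.
PHONE = re.compile(r"4\d{2}(?:-| )?\d{3}(?:-| )?\d{4}")

def find_numbers(text: str) -> list[str]:
    """Return every phone-like number starting with 4 (hypothetical helper)."""
    return PHONE.findall(text)

text = "contact 408-555-5454 or reach out to 408-555-4545"
print(find_numbers(text))  # ['408-555-5454', '408-555-4545']
```

Because the pattern is not anchored with ^, it finds numbers anywhere on a line, regardless of the multiline flag.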
If you like the answer please Accept it :)

RegEx - Require minimum length of one group

I'm trying to make a nickname validator. Here are the rules I want:
Total character count must be between 3 and 15
There can be two non-consecutive spaces
Only letters (a-z) are allowed
Each word separated by a space can begin with an uppercase letter, the rest of the word must be lowercase
At least one of the words must have 3 or more characters
This is what I currently have which checks the four first rules, but I have no idea how to check the last rule.
^(?=.{3,15}$)(\b[A-Z]?[a-z]* ?\b){1,3}$
Should match:
Yaw
yaw
James Bond
Monkey D Luffy
List item
Shouldn't match:
YaW
Two spaces (with two consecutive space characters)
No no no
JamesBond
Try Regex: ^(?=[A-Za-z ]{3,15}$)(?=[A-Za-z ]*[A-Za-z]{3})(?:\b[A-Z]?[a-z]*\ ?\b){1,3}$
For the last rule, a positive lookahead without space was used (?=[A-Za-z ]*[A-Za-z]{3})
Regarding "but I have no idea how to check the last rule": as you only allow [A-Za-z] and space in your regex, you could simply use (?=.*?\S{3}), which looks ahead for 3 consecutive non-whitespace characters. .*? lazily matches any number of characters.
Once 3 non-whitespace characters are required, the initial lookahead can be replaced by the negative lookahead ^(?!.{16}), as the minimum length of 3 is already enforced by \S{3}, and the word pattern becomes [A-Za-z][a-z]*
Further you can drop the initial \b which is redundant as there can only be start or space before.
^(?!.{16})(?=.*?\S{3})(?:[A-Za-z][a-z]* ?\b){1,3}$
Here is a demo at regex101 (for more regex info see the SO regex faq)
If your tool supports atomic groups, improve performance by use of (?> instead of (?:
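The pattern ^(?!.{16})(?=.*?\S{3})(?:[A-Za-z][a-z]* ?\b){1,3}$ can be checked against the examples from the question with a short script. This is only a sketch in Python (the asker's tool is not stated); `is_valid_nickname` is a hypothetical helper name, and `fullmatch` is used so that a trailing newline is not accepted.

```python
import re

# Pattern from the second answer above.
NICKNAME = re.compile(r"^(?!.{16})(?=.*?\S{3})(?:[A-Za-z][a-z]* ?\b){1,3}$")

def is_valid_nickname(name: str) -> bool:
    """True when `name` satisfies the nickname rules (hypothetical helper)."""
    return NICKNAME.fullmatch(name) is not None

should_match = ["Yaw", "yaw", "James Bond", "Monkey D Luffy"]
should_not_match = ["YaW", "Two  spaces", "No no no", "JamesBond"]

for name in should_match + should_not_match:
    print(f"{name!r}: {is_valid_nickname(name)}")
```

"Two  spaces" is written with two consecutive space characters, which the word-boundary check after the optional space rejects.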

Use UltraEdit find and replace with Perl regex to insert a colon into a 4-digit time string

I have multiple 24-hour time strings through several files. For example, 1234, which I wish to replace with 12:34.
Finding them is easy, just \d\d\d\d, that I understand and it works. However, what replace string do I need. In other words, say xx:xx, what do I put in place of each x.
I've tried numbers of things to no avail. I'm obviously not understanding how I get it to remember the digits it found and to recall them in the replace string.
If the 4 digits in your example data represent 24-hour time strings, you could match 2 capturing groups between word boundaries to prevent a match within a run of more than 4 digits. You can adjust the word boundaries to your requirements.
Match
\b(\d{2})(\d{2})\b
Replace
\1:\2 (group 1, a colon, group 2)
Explanation
\b Match a word boundary
(\d{2}) Capture in a group 2 digits
(\d{2}) Capture in a group 2 digits
\b Match a word boundary
Note
Matching 4 digits does not verify a valid 24 hour time. You could match that using for example \b([01][0-9]|2[0-3])([0-5][0-9])\b and replace with \1:\2
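To see the capture-and-backreference idea outside the editor, here is a small Python sketch. Python's `re.sub` is only a stand-in for UltraEdit's Perl regex engine, and the function names are hypothetical; the patterns and the \1:\2 replacement are the ones given above.

```python
import re

def add_colon(text: str) -> str:
    """Insert a colon into any standalone 4-digit run (no validity check)."""
    return re.sub(r"\b(\d{2})(\d{2})\b", r"\1:\2", text)

def add_colon_validated(text: str) -> str:
    """Same, but only for values that form a valid 24-hour time."""
    return re.sub(r"\b([01][0-9]|2[0-3])([0-5][0-9])\b", r"\1:\2", text)

print(add_colon("Depart 1234, arrive 0930, ref 12345"))
print(add_colon("2561"))            # not a real time, but still rewritten
print(add_colon_validated("2561"))  # left untouched
```

Each pair of parentheses in the search pattern remembers what it matched, and \1 and \2 in the replacement recall those two digit pairs in order.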

Regular expression for match string within first five words of input sentence

I want to match specific strings between the beginning and the 5th word of an article title.
Input string:
The 14 best US colleges in the West are dominated by California — here's who makes the cut.
regex:
/^.*(\bbest\b|\btop\b|\bhot\b).*$/
Currently this matches the whole article title, but I only want to search up to "colleges".
It also needs to ignore (not match) strings like laptop, hot-spot, etc.
You can use this expression
^((?:\w+\s?){1,5}).*
Explanation:
^ assert position at start of the string
\w+ match one or more word characters
\s? match an optional whitespace character
{1,5} Quantifier - Between 1 and 5 times, as many times as possible
.* matches any character (except newline)
This matches the first 5 words (and spaces).
^(\w+\s){0,4}\b(best|top|hot)(\s|$)
You want to match a string within the first five words of the input sentence. Counting from the start of the sentence, there must then be 0-4 words before the word you want to match. So you need ^(\w+\s){0,4} before the specific words you want to match. See https://regex101.com/r/nS0dU6/4
regex101 comes to help again.
^(?=(?:\w+\s){0,4}?(?:best|top|hot)\b(?!-))(\w+(?:\s\w+){0,4})
(?=(?:\w+\s){0,4}?(?:best|top|hot)\b(?!-)) checks that the keyword is within the first 5 words (note that (?!-) is added to cater for words such as hot-spot)
(\w+(?:\s\w+){0,4}) then matches the first maximum 5 words
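The lookahead-based pattern can be tried on the title from the question with a few lines of Python. This is a sketch only: the question uses /.../ delimiters, so the asker's language is probably not Python, and `leading_words_if_keyword` is a hypothetical helper name.

```python
import re
from typing import Optional

# Pattern from the answer above: keyword check in a lookahead, then capture
# up to five leading words.
KEYWORD_IN_FIRST_FIVE = re.compile(
    r"^(?=(?:\w+\s){0,4}?(?:best|top|hot)\b(?!-))(\w+(?:\s\w+){0,4})"
)

def leading_words_if_keyword(title: str) -> Optional[str]:
    """Return the first (up to) five words if a keyword is among them."""
    match = KEYWORD_IN_FIRST_FIVE.search(title)
    return match.group(1) if match else None

title = ("The 14 best US colleges in the West are dominated by "
         "California — here's who makes the cut.")
print(leading_words_if_keyword(title))
print(leading_words_if_keyword("My laptop is great"))
print(leading_words_if_keyword("A hot-spot for travellers"))
```

Because each \w+\s in the lookahead consumes a whole word, a keyword embedded in a longer word such as laptop cannot satisfy it.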

Noob regex poser (match MAY contain and MUST have)

Probably really simple for you Regex masters :) I'm a noob at regex, just having picked up some PHP, but wanting to learn (once this project is complete, I'll knuckle down and crack regular expressions).
I'd like to understand how to compose a regex that may contain some data, but must contain other.
My example being, the match MAY begin with numbers but doesn't have to, however if it does, I need the number and the following 2 words. If it doesn't begin with a number, just the first 2 words. The data will be at the beginning of the string.
The following would match:
123 Fore Street, Fiveways (123 Fore Street returned(no comma))
Our House Village (Our House returned)
7 Eightnine (7 Eightnine returned)
Thanks
Something like this should work:
^((?:\d+\s)?\w+(?:\s\w+)?)
You can test it out somewhere like http://rubular.com/ before coding it, it's usually easier.
What it means:
^ -> beginning of the line
(?:\d+\s)? -> a non-capturing group (marked by ?:), consisting of one or more digits and a whitespace character; since we follow it by ?, it's optional.
\w+(?:\s\w+)? -> one or more word characters (look up what \w means), optionally followed by a whitespace character and another "word", again in a non-capturing group.
The whole thing is encapsulated in a capturing group, so group 1 will contain your match.
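A quick way to confirm that group 1 holds the expected text is to run the pattern over the three examples from the question. The question mentions PHP; the sketch below uses Python purely for illustration, and `address_prefix` is a hypothetical helper name.

```python
import re
from typing import Optional

# Pattern from the answer above; group 1 holds the wanted prefix.
PREFIX = re.compile(r"^((?:\d+\s)?\w+(?:\s\w+)?)")

def address_prefix(text: str) -> Optional[str]:
    """Return the optional leading number plus up to two words."""
    match = PREFIX.search(text)
    return match.group(1) if match else None

for sample in ["123 Fore Street, Fiveways", "Our House Village", "7 Eightnine"]:
    print(f"{sample!r} -> {address_prefix(sample)!r}")
```

The comma after Street is excluded because \w does not match punctuation.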
Use this regex with multiline option
^(\d+(\s*\b[a-zA-Z]+\b){1,2}|(\s*\b[a-zA-Z]+\b){1,2})
Group1 contains your required data
\d+ matches a digit (\d) one or more times (+)
\s* matches a whitespace character (\s) zero or more times (*)
(\s*\b[a-zA-Z]+\b){1,2} matches 1 to 2 words