I'm using regex to match certain text after selecting it with XPath.
For example: Huntsville, Alabama 11111
I want only Alabama, which always comes after the comma.
I use [^,]*$ to get the text after the comma,
but I can't find a way to exclude the numbers and return only the letters.
Another example: when I want to get the numbers after the comma, I use [^[0-9],]*$
but when I tweak it in any other way, it returns only numbers or nothing.
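You can try this: (?<=,\s*)[a-zA-Z]+ (note the parentheses; the engine must support variable-length lookbehind, as .NET does).
Explanation:
(?<=...) => lookbehind: requires the text to be preceded by the given pattern without including it in the match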
,\s* => match comma followed by 0 or more spaces
[a-zA-Z]+ => match letters only (one or more)
HTH
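For engines without variable-length lookbehind, a capture group can do the same job. Python's re module, for instance, rejects (?<=,\s*) with a "look-behind requires fixed-width pattern" error. Below is a minimal sketch, assuming Python; the names STATE_AFTER_COMMA and state_after_comma are made up for illustration.

```python
import re

# The comma and optional spaces are matched but not captured;
# only the letters land in group 1.
STATE_AFTER_COMMA = re.compile(r",\s*([a-zA-Z]+)")


def state_after_comma(text):
    """Return the first run of letters after a comma, or None if there is none."""
    match = STATE_AFTER_COMMA.search(text)
    return match.group(1) if match else None


print(state_after_comma("Huntsville, Alabama 11111"))  # Alabama
```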
To match a letter word after the last comma, you may use
[a-zA-Z]+(?=[^,]*$)
See the regex demo.
Details
[a-zA-Z]+ - 1 or more ASCII letters
(?=[^,]*$) - followed with 0+ chars other than , up to the end of the string.
To match 1 or more words in the same context, use
[a-zA-Z]+(?:\s+[a-zA-Z]+)*(?=[^,]*$)
         ^^^^^^^^^^^^^^^^^
See this regex demo.
The (?:\s+[a-zA-Z]+)* part matches zero or more consecutive occurrences of 1+ whitespace chars followed by 1+ ASCII letters.
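Here is a quick way to try both patterns, sketched in Python. The second sample string is an assumption added to exercise the multi-word case; it is not from the question.

```python
import re

# One letter-only word that has no comma between it and the end of the string.
LAST_WORD = re.compile(r"[a-zA-Z]+(?=[^,]*$)")
# Same context, but allows several whitespace-separated words.
LAST_WORDS = re.compile(r"[a-zA-Z]+(?:\s+[a-zA-Z]+)*(?=[^,]*$)")


def first_match(pattern, text):
    """Return the first match of pattern in text, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


print(first_match(LAST_WORD, "Huntsville, Alabama 11111"))    # Alabama
print(first_match(LAST_WORDS, "Birmingham, New York 11111"))  # New York
```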
I have a few strings and I need some help with constructing Regex to match them.
The example strings are:
AAPL10.XX1.XX2
AAA34CL
AAXL23.XLF2
AAPL
I have tried a few expressions but couldn't get exact results. They are the following:
[0-9A-Z]+\.?[0-9A-Z]$
[A-Z0-9]*\.?[^.]$
Following are some of the points which should be maintained:
The pattern should only contain capital letters and digits and no small letters are allowed.
The '.' in the middle of the text is optional. And the maximum number of times it can appear is only 2.
It should not have any special characters at the end.
Please ask me for any clarification.
You can write the pattern as:
^[A-Z\d]+(?:\.[A-Z\d]+){0,2}$
The pattern matches:
^ Start of string
[A-Z\d]+ Match 1+ chars A-Z or a digit
(?:\.[A-Z\d]+){0,2} Repeat 0 - 2 times a . and 1+ chars A-Z or a digit
$ End of string
Regex demo
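To check the pattern against the sample strings, something like the following Python sketch should do. The re.ASCII flag is an addition here so that \d only means 0-9 in Python, and the rejected inputs are made-up examples.

```python
import re

# Pattern from the answer above. re.ASCII restricts \d to 0-9,
# since Python's \d otherwise also accepts other Unicode digits.
SYMBOL = re.compile(r"^[A-Z\d]+(?:\.[A-Z\d]+){0,2}$", re.ASCII)


def is_valid_symbol(text):
    """True if text is capitals/digits with at most two inner dots."""
    return SYMBOL.match(text) is not None


for sample in ["AAPL10.XX1.XX2", "AAA34CL", "AAXL23.XLF2", "AAPL"]:
    print(sample, is_valid_symbol(sample))  # True for all four samples
```

Inputs such as aapl, AAPL. or A.B.C.D are rejected, because of the lowercase letters, the trailing dot, and the third dot respectively.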
I want to extract an ID from a search query but I don't know the length of the ID.
From this input I want to get the numbers that are not in the words and the numbers that are not separated by symbols.
12 11231390 good123e41 12he12o1 1391389 dajue1290a 12331 12-10 1.2 test12.0why 12+12 12*6 2d1139013 09`29 83919 1
Here I want to return
12 11231390 1391389 12331 83919 1
So far I've tried /\b[^\D]\d*[^\D]\b/gm but I get the numbers in between the symbols and I don't get the 1 at the end.
You could repeatedly match digits between whitespace boundaries. Using a word boundary \b would give you partial matches.
Note that [^\D] is the same as \d and would expect at least a single character.
Your pattern can be written as \b\d\d*\d\b and you can see that you don't get the 1 at the end as your pattern matches at least 2 digits.
(?<!\S)\d+(?:\s+\d+)*(?!\S)
The pattern matches:
(?<!\S) Negative lookbehind, assert a whitespace boundary to the left
\d+(?:\s+\d+)* Match 1+ digits and optionally repeat matching 1+ whitespace chars and 1+ digits.
(?!\S) Negative lookahead, assert a whitespace boundary to the right
Regex demo
If lookarounds are not supported, you could use a match with a capture group
(?:^|\s)(\d+(?:\s+\d+)*)(?:$|\s)
Regex demo
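Both variants can be compared on the input from the question with a short Python sketch. Note that each match is a run of adjacent numbers (for example "12 11231390"), so the runs are split afterwards to get the individual IDs; that splitting step is an addition, not part of the patterns above.

```python
import re

TEXT = ("12 11231390 good123e41 12he12o1 1391389 dajue1290a 12331 "
        "12-10 1.2 test12.0why 12+12 12*6 2d1139013 09`29 83919 1")

# Lookaround version: whitespace boundaries on both sides.
WITH_LOOKAROUNDS = re.compile(r"(?<!\S)\d+(?:\s+\d+)*(?!\S)")
# Capture-group version for engines without lookarounds.
WITH_GROUP = re.compile(r"(?:^|\s)(\d+(?:\s+\d+)*)(?:$|\s)")


def split_runs(runs):
    """Flatten runs such as '83919 1' into the individual numbers."""
    return [number for run in runs for number in run.split()]


ids_lookaround = split_runs(WITH_LOOKAROUNDS.findall(TEXT))
ids_group = split_runs(WITH_GROUP.findall(TEXT))  # findall returns group 1
print(ids_lookaround)
# ['12', '11231390', '1391389', '12331', '83919', '1']
```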
Each word's length must be either 2 or 6-10 characters, and words may be separated by a space or a comma. Words contain only letters and are not case-sensitive.
Here is the groups of words that should be matched:
RE,re,rereRE
Not matching groups:
RE,rere,rel
RE,RERE
Here is the pattern that I have tried
((([a-zA-Z]{2})|([a-zA-Z]{6,10}))(,|\s+)?)
But unfortunately this pattern can match string like this: RE,RERE
It looks like the word boundary has not been set.
You could match chars a-z either 2 or 6-10 times using an alternation.
Then repeat that pattern 0+ times, each time preceded by a comma or a space [, ].
^(?:[A-Za-z]{6,10}|[A-Za-z]{2})(?:[, ](?:[A-Za-z]{6,10}|[A-Za-z]{2}))*$
Explanation
^ Start of string
(?:[A-Za-z]{6,10}|[A-Za-z]{2}) Match chars a-z 6-10 or 2 times
(?: Non capturing group
[, ](?:[A-Za-z]{6,10}|[A-Za-z]{2}) Match comma or space and repeat previous pattern
)* Close non capturing group and repeat 0+ times
$ End of string
Regex demo
If lookarounds are supported, you might also assert what is directly on the left and on the right is not a non whitespace character \S.
(?<!\S)(?:[A-Za-z]{6,10}|[A-Za-z]{2})(?:[ ,](?:[A-Za-z]{6,10}|[A-Za-z]{2}))*(?!\S)
Regex demo
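The anchored pattern can be checked against the three groups from the question with a small Python sketch. The helper name is_valid_group is hypothetical.

```python
import re

# Anchored pattern from the answer: every word must be 2 or 6-10 letters,
# and words are separated by exactly one comma or space.
WORD = r"(?:[A-Za-z]{6,10}|[A-Za-z]{2})"
GROUP = re.compile(rf"^{WORD}(?:[, ]{WORD})*$")


def is_valid_group(text):
    """True if every word in text is 2 or 6-10 letters long."""
    return GROUP.match(text) is not None


print(is_valid_group("RE,re,rereRE"))  # True
print(is_valid_group("RE,rere,rel"))   # False: 'rere' has 4 letters
print(is_valid_group("RE,RERE"))       # False: 'RERE' has 4 letters
```

The anchors ^ and $ are what stop RE,RERE from matching partially: without them, the engine is free to match just RE,RE and leave the rest.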
([a-zA-Z]{2}(,|\s)|[a-zA-Z]{6,10}|(,|\s))
This one will get only the words that have 2 letters, or between 6 and 10 letters.
\b,?([a-zA-Z]{6,10}|[a-zA-Z]{2}),?\b
You can use this
^(?!.*\b[a-z]{4}\b)(?:(?:[a-z]{2}|[a-z]{6,10})(?:,|[ ]+)?)+$
Regex Demo
This regex will match your first case, but neither of your two other cases:
^((([a-zA-Z]{2})|([a-zA-Z]{6,10}))(,|[ ]+|$))+$
I'm making the assumption here that each line should be a single match.
Here it is in action.
I am trying to recognize these types of phone number inputs:
0172665476
+6265476393
+62-65476393
+62-654-76393
+62 65476393
While my regex: (?:\d+\s*)+ can recognize the 1st 2 sample values, it recognizes the last 3 sample values as multiple matches in each line, instead of recognizing the number as a whole.
How can I modify this to support multiple dashes and/or spaces and still recognize it as 1 whole number instead of multiple matches?
You may use this regex:
^\+?\d+(?:[\s-]\d+)*\b
RegEx Details:
^\+?: Match optional + at start
\d+: match 1+ digits
(?:[\s-]\d+)*: Match 0 or more groups that start with whitespace or - followed by 1+ digits
\b: Word boundary (used instead of $, because with $ a number followed by trailing spaces would not match)
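A minimal Python sketch to confirm that each sample is recognized as one whole number. The helper name whole_number and the trailing-space input are assumptions added for illustration.

```python
import re

# Pattern from the answer: optional +, digits, then groups of
# (whitespace or hyphen) + digits, ending on a word boundary.
PHONE = re.compile(r"^\+?\d+(?:[\s-]\d+)*\b")

SAMPLES = ["0172665476", "+6265476393", "+62-65476393",
           "+62-654-76393", "+62 65476393"]


def whole_number(text):
    """Return the phone number at the start of text, or None."""
    match = PHONE.match(text)
    return match.group(0) if match else None


for sample in SAMPLES:
    print(repr(whole_number(sample)))  # each sample comes back as one match

print(repr(whole_number("+62 65476393   ")))  # '+62 65476393'
```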
This should work:
(?:[\d +-]+)+
This would work as per your requirement (trailing spaces, if any, are left out of the match):
Regex: '^(?:[\d +-]+)\b'
Another option could be to use an alternation to match either 10 digits without a leading plus sign or match the pattern with a +, and optional space or hyphen:
(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b
That will match:
(?: Non capturing group
\d{10} Match 10 digits
| Or
\+\d{2}[- ]?\d{3}-?\d{5} Match +, 2 digits, optional space or -, 3 digits, optional -, 5 digits
)\b Close non capturing group and word boundary
Regex demo
If your language supports negative lookbehinds you could prepend (?<!\S) which checks that what comes before is not a non-whitespace character.
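Since this pattern is not anchored, it can also pull numbers out of longer text. A rough Python sketch with the (?<!\S) lookbehind prepended follows; the sentence used as input is invented for the example.

```python
import re

# Alternation from the answer, with the suggested (?<!\S) prepended so a
# number only starts at the beginning of the text or after whitespace.
PHONE_IN_TEXT = re.compile(r"(?<!\S)(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b")


def find_numbers(text):
    """Return all phone numbers found in free text."""
    return PHONE_IN_TEXT.findall(text)


print(find_numbers("Call 0172665476 or +62-654-76393 today"))
# ['0172665476', '+62-654-76393']
```

Unlike the \d+ based patterns, this one fixes the digit counts (10 digits, or +2/3/5), so it only fits numbers shaped like the samples.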
I checked on stackoverflow already but didn't find a solution I could use.
I need a regular expression to match any word (by word I mean anything between full spaces) that contains numbers. It can be alphanumeric AB12354KFJKL, or dates 11/01/2014, or numbers with hyphens in the middle, 123-489-568, or just plain normal numbers 123456789 - but it can't match anything without numbers.
Thanks,
Better example of what I want (in bold) in a sample text:
ABC1 ABC 23-4787 ABCD 4578 ABCD 11/01/2014 ABREKF
There must be something better, but I think this should work:
\S*\d+\S*
\S* - Zero or more non-whitespace characters
\d+ - One or more digits
\S* - Zero or more non-whitespace characters
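Run against the sample text from the question, a Python sketch of this pattern looks like the following. The function name words_with_digits is made up.

```python
import re

# Any run of non-whitespace characters that contains at least one digit.
WORD_WITH_DIGIT = re.compile(r"\S*\d+\S*")


def words_with_digits(text):
    """Return the whitespace-delimited words that contain a digit."""
    return WORD_WITH_DIGIT.findall(text)


print(words_with_digits("ABC1 ABC 23-4787 ABCD 4578 ABCD 11/01/2014 ABREKF"))
# ['ABC1', '23-4787', '4578', '11/01/2014']
```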
Use this lookahead:
(?=\D*\d)
This asserts that, from the current position, there are zero or more non-digit characters (\D) followed by a digit, i.e. that at least one digit lies ahead.
If you want to match/capture the string, then just add .* to the regex:
(?=\D*\d).*
Reference: http://www.rexegg.com/regex-lookarounds.html
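Note that (?=\D*\d).* applied to a whole line matches the entire line as soon as it contains any digit. To get individual words, one option (an assumption here, not part of the answer above) is to split on whitespace first and apply the lookahead to each word, sketched in Python:

```python
import re

# Lookahead from the answer: succeeds if a digit lies ahead of the position.
HAS_DIGIT = re.compile(r"(?=\D*\d)")


def tokens_with_digits(text):
    """Split on whitespace and keep the tokens that contain a digit."""
    return [token for token in text.split() if HAS_DIGIT.match(token)]


# Applied to a whole string, the pattern with .* matches the entire string.
whole = re.search(r"(?=\D*\d).*", "ABC 123 def")
print(whole.group(0))  # ABC 123 def

print(tokens_with_digits("ABC1 ABC 23-4787 ABCD 4578 ABCD 11/01/2014 ABREKF"))
# ['ABC1', '23-4787', '4578', '11/01/2014']
```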