I am looking to use a regexp in VBScript to match strings that begin with one or more digits followed by capital letters or spaces, OR strings that begin with capital letters and spaces and end with one or more digits.
I tried "^([0-9]+[A-Z\s]+)|([A-Z\s]+[0-9]+)$" but it is not working.
Example Match strings:
75 MANOJ TIGADI
VASANT KANETKAR 111
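In your pattern each anchor binds to only one branch: ^ applies to the first alternative and $ to the second. You could match it both ways by putting the alternation inside a single group, so that both anchors apply to both branches.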
If you don't need the value as a group, you can make it non capturing.
If you don't want to match only spaces, but a single space between the uppercase chars and no trailing spaces, you can use an optional repeating group (?: [A-Z]+)*
Note that \s could also possibly match a newline.
^(?:[0-9]+(?: [A-Z]+)*|[A-Z]+(?: [A-Z]+)* [0-9]+)$
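As a quick check of the pattern above, here is a minimal sketch in Python's re module (the question targets VBScript, so this only illustrates the regex itself). The names PATTERN and is_match and the negative samples are assumptions made for the illustration.

```python
import re

# Pattern from the answer above: digits first, or uppercase words first.
PATTERN = re.compile(r"^(?:[0-9]+(?: [A-Z]+)*|[A-Z]+(?: [A-Z]+)* [0-9]+)$")


def is_match(text: str) -> bool:
    """Return True when the whole string fits one of the two branches."""
    return PATTERN.match(text) is not None


# The first two samples come from the question; the rest are made-up negatives.
samples = [
    "75 MANOJ TIGADI",
    "VASANT KANETKAR 111",
    "75  MANOJ",   # double space is rejected
    "manoj 75",    # lowercase is rejected
    "MANOJ",       # no digits at all
]

for sample in samples:
    print(repr(sample), is_match(sample))
```

Because the alternation sits inside one non-capturing group, both ^ and $ apply to either branch.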
I want to make a regex that recognizes some patterns and rejects others.
_*[a-zA-Z][a-zA-Z0-9_][^-]*.*(?<!_)
Samples of patterns that I want to recognize:
a100__version_2
_a100__version2
And samples of patterns that I don't want to recognize:
100__version_2
a100__version2_
_100__version_2
a100--version-2
The regex works for all of them except this one:
a100--version-2
So I don't want to match the dashes.
I tried _*[a-zA-Z][a-zA-Z0-9_][^-]*.*(?<!_)
so the problem is at [^-]
You could write the pattern like this, but [^-]* can also match newlines and spaces.
To not match newlines and spaces, and matching at least 2 characters:
^_*[a-zA-Z][a-zA-Z0-9_][^-\s]*$(?<!_)
Or, to allow only word characters: after the first letter, \w* matches zero or more word characters, so a single letter is enough to match:
^_*[a-zA-Z]\w*$(?<!_)
^ Start of string
_* Match optional underscores
[a-zA-Z] Match a single char a-zA-Z
\w* Match optional word chars (Or [a-zA-Z0-9_]*)
$ End of string
(?<!_) Assert not _ to the left at the end of the string
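The breakdown above can be tried against the samples from the question. This is a minimal sketch using Python's re module; the names IDENT and is_valid are assumptions for illustration, and re.ASCII is added so that \w means exactly [a-zA-Z0-9_].

```python
import re

# Optional leading underscores, one letter, then word characters,
# and the last character before the end must not be an underscore.
IDENT = re.compile(r"^_*[a-zA-Z]\w*$(?<!_)", re.ASCII)


def is_valid(name: str) -> bool:
    """Return True when the name fits the pattern."""
    return IDENT.match(name) is not None


accepted = ["a100__version_2", "_a100__version2"]
rejected = ["100__version_2", "a100__version2_", "_100__version_2", "a100--version-2"]

for name in accepted + rejected:
    print(name, is_valid(name))
```

The dashes are rejected here simply because \w never matches a dash, so no negated class is needed.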
I have a list that could look sort of like
("!Goal 27' Edward Nketiah"),
("!Goal 33' 46' Pierre Emerick-Aubameyang"),
("!Sub Nicolas Pepe"),
("Jordan Pickford"),
and I'm looking to match either !Sub or !Goal 33' 46' or !Goal 27'
Right now I'm using the regex (!\w+\s) which will match !Goal and !Sub, but I want to be able to get the timestamps too. Is there an easy way to do that? There is no limit on the number of timestamps there could be.
As I mentioned in my comment, you can use the following regex to accomplish this:
(!\w+(?:\s\d+')*)
Explanation:
(!\w+(?:\s\d+')*) capture the following
! matches this character literally
\w+ matches one or more word characters
(?:\s\d+')* match the following non-capture group zero or more times
\s match a whitespace character
\d+ matches one or more digits
' match this character literally
Additionally, the first capture group isn't necessary - you can remove it to simply match:
!\w+(?:\s\d+')*
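If you need each timestamp, note that a repeated capturing group such as (\s\d+')* keeps only its last repetition. Wrap the whole repetition in the capture instead, !\w+((?:\s\d+')*), and split capture group 1 on the space character.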
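To get the individual timestamps, one option is to capture the whole run of timestamps and split it afterwards. A capturing group that is itself repeated keeps only its last repetition, so this sketch puts the repetition inside an outer capturing group. It uses Python's re module; EVENT and parse_event are hypothetical names.

```python
import re

# Group 1 holds the entire run of timestamps, e.g. " 33' 46'".
EVENT = re.compile(r"!\w+((?:\s\d+')*)")


def parse_event(line: str):
    """Return (matched text, list of timestamps), or None if there is no event."""
    m = EVENT.search(line)
    if m is None:
        return None
    # str.split() with no argument splits on whitespace and drops empty items.
    return m.group(0), m.group(1).split()


lines = [
    "!Goal 27' Edward Nketiah",
    "!Goal 33' 46' Pierre Emerick-Aubameyang",
    "!Sub Nicolas Pepe",
    "Jordan Pickford",
]

for line in lines:
    print(parse_event(line))
```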
If your input always follows the format "bang text blank digits apostrophe blank digits apostrophe etc", then it should be as simple as:
!\w+(?:\s\d+')*
Explanation:
! matches an exclamation mark
\w+ matches 1 or more word characters (letters, digits, underscores)
(?:…) is a non-capturing group
\s matches a single whitespace character
\d+ matches one or more digits
' matches the apostrophe character
* repeatedly matches the group 0 or more times
This pattern:
(!\w+(?:\s\d+')*)
will capture:
"!Goal 27'"
"!Goal 33' 46'"
"!Sub"
I am splitting a file into words. I am able to split it into words, but some words contain a special sequence like '___'. I want to skip that sequence and also split the word at it.
The file which contains data like this
Yahoo$$$Yahoo OK : ___GET
Gmail$$$Gmail Ok:___GET
google_data$$$Google.com.in___POST
using ((?!:)[.0-9a-zA-Z\s]\w+)+ gives me
Yahoo
Yahoo OK
___GET
Gmail
Gmail Ok
GET
google_data
Google.com.in___POST
I don't want that '___' and also the following string:
Google.com.in___POST
has to be split in two words, like:
Google.com.in
POST
Can any one help me with this ?
Using \w will also match an underscore. Looking at the example data, you want to match characters a-z or a digit, and in between there can be a space, dot or underscore.
Instead of splitting, you might match the values:
[0-9a-zA-Z]+(?:[._ ][0-9a-zA-Z]+)*
Explanation
[0-9a-zA-Z]+ Match a digit or a-z in lower or uppercase 1+ times
(?: Non-capturing group
[._ ] Match a . _ or space
[0-9a-zA-Z]+ Match a digit or a-z in lower or uppercase 1+ times
)* Close non-capturing group and repeat 0+ times
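Matching the values instead of splitting could look like the sketch below, run on the three sample lines from the question. It uses Python's re.findall; the names TOKEN, lines and tokens are assumptions for illustration.

```python
import re

# Alphanumeric runs, optionally joined by a single dot, underscore or space.
TOKEN = re.compile(r"[0-9a-zA-Z]+(?:[._ ][0-9a-zA-Z]+)*")

lines = [
    "Yahoo$$$Yahoo OK : ___GET",
    "Gmail$$$Gmail Ok:___GET",
    "google_data$$$Google.com.in___POST",
]

# findall returns whole matches because the only group is non-capturing.
tokens = [TOKEN.findall(line) for line in lines]

for line, found in zip(lines, tokens):
    print(line, "->", found)
```

A run of three underscores cannot be bridged, since the joining class matches only one character and must be followed by an alphanumeric, which is why Google.com.in and POST come out as separate values.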
I am trying to recognize these types of phone number inputs:
0172665476
+6265476393
+62-65476393
+62-654-76393
+62 65476393
While my regex: (?:\d+\s*)+ can recognize the 1st 2 sample values, it recognizes the last 3 sample values as multiple matches in each line, instead of recognizing the number as a whole.
How can I modify this to support multiple dashes and/or spaces and still recognize it as 1 whole number instead of multiple matches?
You may use this regex:
^\+?\d+(?:[\s-]\d+)*\b
RegEx Details:
^\+?: Match optional + at start
\d+: match 1+ digits
(?:[\s-]\d+)*: Match 0 or more groups that start with whitespace or - followed by 1+ digits
\b: Word boundary after the last digit (used instead of $, because with $ the match would be missed if there are trailing spaces)
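The five sample inputs from the question can be checked with a small sketch in Python's re module. PHONE and match_phone are hypothetical names, and the input with trailing spaces is an added example.

```python
import re

# Optional +, digits, then any number of "separator + digits" groups.
PHONE = re.compile(r"^\+?\d+(?:[\s-]\d+)*\b")


def match_phone(text: str):
    """Return the matched number as one string, or None."""
    m = PHONE.match(text)
    return m.group(0) if m else None


samples = [
    "0172665476",
    "+6265476393",
    "+62-65476393",
    "+62-654-76393",
    "+62 65476393",
]

for sample in samples:
    print(repr(sample), "->", repr(match_phone(sample)))

# Trailing spaces are left out of the match thanks to \b.
print(repr(match_phone("+62 65476393  ")))
```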
This should work:
(?:[\d +-]+)+
This would work as per your requirement (trailing spaces are ignored):
Regex: '^(?:[\d +-]+)\b'
Another option could be to use an alternation to match either 10 digits without a leading plus sign or match the pattern with a +, and optional space or hyphen:
(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b
That will match:
(?: Non capturing group
\d{10} Match 10 digits
| Or
\+\d{2}[- ]?\d{3}-?\d{5} Match +, 2 digits, optional space or -, 3 digits, optional -, 5 digits
)\b Close non capturing group and word boundary
If your language supports negative lookbehinds you could prepend (?<!\S) which checks that what comes before is not a non-whitespace character.
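Python's re module supports this lookbehind, so the alternation with (?<!\S) prepended can be sketched as below. The sample sentence is made up for the illustration, and PHONE, text and found are hypothetical names.

```python
import re

# (?<!\S) requires start of string or whitespace before the number.
PHONE = re.compile(r"(?<!\S)(?:\d{10}|\+\d{2}[- ]?\d{3}-?\d{5})\b")

# Made-up text: two valid numbers and one glued to a preceding letter.
text = "call 0172665476 or +62-654-76393, not x+6265476393"

found = PHONE.findall(text)
print(found)
```

The last number is skipped because it is preceded by a non-whitespace character.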
I need a regular expression that will match this pattern (case doesn't matter):
066B-E77B-CE41-4279
4 groups of letters or numbers 4 characters long per group, hyphens in between each group.
Any help would be greatly appreciated.
^(?:\w{4}-){3}\w{4}$
Explanation:
^ # must match beginning of string
(?: # make a non-capturing group (for duplicating entry)
\w{4} # a-z, A-Z, 0-9 or _ matching 4 times
- # hyphen
){3} # this group matches 3 times
\w{4} # 4 more of the letters numbers or underscore
$ # must match end of string
Would be my best bet. Then you can use Regex Match (static).
P.S. More info on regex can be found here.
P.P.S. If you don't want to match underscores, the \w above can be replaced (both times) with [a-zA-Z0-9] (known as a class matching lowercase and uppercase letters and numbers). e.g.
^(?:[a-zA-Z0-9]{4}-){3}[a-zA-Z0-9]{4}$
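The difference between the \w version and the [a-zA-Z0-9] version shows up only for underscores. A minimal sketch in Python's re module (the answer above refers to .NET's Regex class, so this only illustrates the patterns; the names below are assumptions):

```python
import re

# \w also admits underscores; the explicit class does not.
KEY_WORD = re.compile(r"^(?:\w{4}-){3}\w{4}$", re.ASCII)
KEY_ALNUM = re.compile(r"^(?:[a-zA-Z0-9]{4}-){3}[a-zA-Z0-9]{4}$")


def is_key(pattern: re.Pattern, text: str) -> bool:
    """Return True when text is four 4-character groups joined by hyphens."""
    return pattern.match(text) is not None


for candidate in ["066B-E77B-CE41-4279", "066b-e77b-ce41-4279",
                  "066_-E77B-CE41-4279", "066B-E77B-CE41"]:
    print(candidate, is_key(KEY_WORD, candidate), is_key(KEY_ALNUM, candidate))
```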
Try:
[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}
With such a small sample of data, it's not easy to be certain what you actually want.
I'm going to assume that all the characters in that string are hex digits, and that's what you need to search for.
In that case, you would need a regular expression something like this:
^[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}$
If they can be any letter, then replace the fs with zs.
Oh, and use myRE.IgnoreCase = True to make it case insensitive.
If you need further advice on regular expressions, I'd recommend http://www.regular-expressions.info/ as good site. They even have a VB.net-specific page.
Assuming from your example:
There are four groups of letters, separated by dashes.
Each group is four letters.
The letters are hexadecimal digits.
This pattern would match that:
^[\dA-F]{4}-[\dA-F]{4}-[\dA-F]{4}-[\dA-F]{4}$
Note that ^ and $ match the beginning and end of the string, which is important if you want to match the entire string and not check if the pattern occurs inside a string.
You could also make use of the repetitions in the pattern:
^(?:[\dA-F]{4}-){3}[\dA-F]{4}$
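Since the question says case doesn't matter, the hex pattern needs a case-insensitive flag to accept lowercase digits. The sketch below uses Python's re module with re.IGNORECASE and also contrasts the anchored pattern with an unanchored one; HEX_KEY, UNANCHORED and the sample strings are assumptions for illustration.

```python
import re

FLAGS = re.IGNORECASE | re.ASCII  # ASCII keeps \d to 0-9 only

# Anchored: the entire string must be the key.
HEX_KEY = re.compile(r"^(?:[\dA-F]{4}-){3}[\dA-F]{4}$", FLAGS)
# Unanchored: finds a key anywhere inside a longer string.
UNANCHORED = re.compile(r"(?:[\dA-F]{4}-){3}[\dA-F]{4}", FLAGS)

print(bool(HEX_KEY.match("066b-e77b-ce41-4279")))        # lowercase hex
print(bool(HEX_KEY.match("066G-E77B-CE41-4279")))        # G is not a hex digit
print(bool(HEX_KEY.match("key: 066B-E77B-CE41-4279")))   # extra text before the key
print(UNANCHORED.search("key: 066B-E77B-CE41-4279").group(0))
```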