Regular expression to exclude email IDs with special characters

I have the sample set of email IDs below:
EmailAddress
abc#in.in
#abc#in.in
abc#in.in&
a#b#c#in.in
a!bc#in.in
a$bc#in.in
a+bc#in.+in
ab-c-#in.in
ab/c\#in.in
ab\c#in.in
ab~~~~c#in.in
una02#gmail.com
I have to separate out the invalid email IDs, i.e. those containing special characters other than - _ # .
I wrote the regex below and it is working fine. Please point out if I missed any possible scenario or if this regex can be improved. Thanks in advance.
[^\$\+\\/~#!&]*
Clean List
abc#in.in
ab-c-#in.in
una02#gmail.com
Invalid List
#abc#in.in
abc#in.in&
a#b#c#in.in
a!bc#in.in
a$bc#in.in
a+bc#in.+in
ab/c\#in.in
ab\c#in.in
ab~~~~c#in.in

You eliminated the addresses with # in the local part.
I think that under RFC 5322 it is a valid character.
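One way to reproduce the two lists above is an allow-list instead of the deny-list in the question. The sketch below is only an illustration under stated assumptions, not the asker's method: it treats `#` as the single separator exactly as it is written in the sample data, allows only ASCII letters, digits, `.`, `_` and `-` on either side, and the names `ALLOWED` and `split_addresses` are hypothetical.

```python
import re

# Assumption: "#" is the single separator, as written in the sample data.
# Allowed on both sides: ASCII letters, digits, ".", "_" and "-".
ALLOWED = re.compile(r"[A-Za-z0-9._-]+#[A-Za-z0-9._-]+")

def split_addresses(addresses):
    """Return (clean, invalid) lists; fullmatch forces the whole string to fit."""
    clean, invalid = [], []
    for address in addresses:
        (clean if ALLOWED.fullmatch(address) else invalid).append(address)
    return clean, invalid

samples = [
    "abc#in.in", "#abc#in.in", "abc#in.in&", "a#b#c#in.in", "a!bc#in.in",
    "a$bc#in.in", "a+bc#in.+in", "ab-c-#in.in", "ab/c\\#in.in",
    "ab\\c#in.in", "ab~~~~c#in.in", "una02#gmail.com",
]
clean, invalid = split_addresses(samples)
print(clean)
print(invalid)
```

Note that a character class followed by `*` with no anchors can match the empty string, so the pattern in the question only rejects anything when it is forced to cover the whole input (for example with `fullmatch` or `^...$`).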

Related

Regular Exp match anything but not specific string

I am handling user input in my program using a regular expression.
The string contains /_MyWord/, and only a-z is accepted before /_MyWord/.
The string must not contain /s/123, /s/32A, or atr/will at the beginning.
My try:
^(?!.*/s/123)(?!.*/s/32A )(?!.*atr/will)([/a-z]+)/_MyWord/(.*)$
Example:
/s/123/QWERERTYU/_MyWord/45454545 -> fail
/DFGH/FGHJK/GHJK/_MyWord/DFGHJ452 -> OK
HiCanYouHelpMe/_MyWord/fgh -> OK
/_MyWord/HiCanYouHelpMefgh -> OK
Can anyone help me finish the regular expression?
If I got your question correctly, try this regex:
^(?!.*\/s\/123)(?!.*\/s\/32A)(?!.*atr\/will)([\/a-zA-Z]*)\/_MyWord\/(.*)$
Unescaped: ^(?!.*/s/123)(?!.*/s/32A)(?!.*atr/will)([/a-zA-Z]*)/_MyWord/(.*)$
Changed ([\/a-z]+) to ([\/a-zA-Z]*) to accept both lower and upper case and to allow an empty prefix (e.g. /_MyWord/Test)
Regex101 Demo
Works for
/DFGH/FGHJK/GHJK/_MyWord/DFGHJ452
HiCanYouHelpMe/_MyWord/fgh
/_MyWord/HiCanYouHelpMefgh
Doesn't match:
/s/123/QWERERTYU/_MyWord/45454545
atr/will/DFGH/FGHJK/GHJK/_MyWord/DFGHJ452
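Also, you don't really need the lookaheads for /s/123 and /s/32A: both contain digits, so when they appear before /_MyWord/ they are already rejected because the prefix group only allows [\/a-zA-Z]. You may therefore want to remove (?!.*\/s\/123)(?!.*\/s\/32A) from the beginning. Note that the lookaheads also reject those substrings when they appear after /_MyWord/, which the simplified pattern would no longer do.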
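The behaviour described above can be checked with a short Python sketch. `FULL` is the unescaped pattern from the answer; `SIMPLE` is an assumed simplified form with the two digit-containing lookaheads dropped. The helper name `accepts` and the last test input are hypothetical and only there to show where the two patterns differ.

```python
import re

# Pattern from the answer above, written without escaped slashes.
FULL = re.compile(
    r"^(?!.*/s/123)(?!.*/s/32A)(?!.*atr/will)([/a-zA-Z]*)/_MyWord/(.*)$"
)
# Simplified variant: the two digit-containing lookaheads are dropped.
SIMPLE = re.compile(r"^(?!.*atr/will)([/a-zA-Z]*)/_MyWord/(.*)$")

def accepts(pattern, text):
    """True if the pattern matches the input."""
    return pattern.match(text) is not None

cases = [
    "/s/123/QWERERTYU/_MyWord/45454545",
    "/DFGH/FGHJK/GHJK/_MyWord/DFGHJ452",
    "HiCanYouHelpMe/_MyWord/fgh",
    "/_MyWord/HiCanYouHelpMefgh",
    "atr/will/DFGH/FGHJK/GHJK/_MyWord/DFGHJ452",
    "abc/_MyWord//s/123",  # hypothetical input: /s/123 after /_MyWord/
]
for text in cases:
    print(text, accepts(FULL, text), accepts(SIMPLE, text))
```

Both patterns agree on the five examples from the thread. They differ only on the last input, where `/s/123` appears after `/_MyWord/`: the full pattern rejects it and the simplified one accepts it.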

Regex number range target [duplicate]

I am trying to have my regex match the following:
169.254.0.0-169.254.254.255
Could anyone please help me achieve this?
So far I have this:
169\.254\.([1-9]{1,2}|[1-9]{1,2}[1-4])
But it also picks up 169.254.255.1, which should not be one of the matches.
Please help!
thanks
This is the regex I use for general IP validation:
(([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]?){4}
Breakdown:
1. `[0-9](?!\d)` -> Any number 0 through 9 (the `(?!\d)` makes sure it only grabs standalone digits)
2. `|[1-9][0-9](?!\d)` -> Or any number 10-99 (the `(?!\d)` makes sure it only grabs two-digit entries)
3. `|1[0-9]{2}(?!\d)` -> Or any number 100-199
4. `|2[0-4][0-9](?!\d)` -> Or any number 200-249
5. `|25[0-5](?!\d)` -> Or any number 250-255
6. `[.]?` -> With or without a `.`
7. `{4}` -> Items 1-6 repeated exactly 4 times
This hasn't failed me yet for IP address validation.
For your specific case, this should do it:
(169\.254\.)((([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}|2[0-4][0-9]|25[0-4])[.])(([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}|2[0-4][0-9]|25[0-5])))
This is very long because I couldn't figure out a shorter way to accept 169.254.(0-254).255 while still rejecting 169.254.255.1.
Edit: Fixed due to comments
The regex ([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-4]) matches 0-254.
See this page for more discussion.
I've written an article that provides regular expressions for all the components of a generic URI (as defined in RFC3986: Uniform Resource Identifier (URI): Generic Syntax)
See: Regular Expression URI Validation
One of the components of a generic URI is an IPv4 address. Here is the free-spacing mode Python version from that article:
import re

re_python_rfc3986_IPv4address = re.compile(r""" ^
# RFC-3986 URI component: IPv4address
(?: (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) \.){3} # (dec-octet "."){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) # dec-octet
$ """, re.VERBOSE)
And the un-commented JavaScript version:
var re_js_rfc3986_IPv4address = /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/;
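For the specific range in the question (169.254.0.0 to 169.254.254.255), the octet alternations above can be combined into an anchored pattern. The following Python sketch is one possible way to do it, not the method of any answer here; the names `RANGE_PATTERN`, `in_range_regex` and `in_range_numeric` are hypothetical, and the numeric check with the standard `ipaddress` module is included only as a cross-check.

```python
import ipaddress
import re

# Third octet is limited to 0-254, fourth octet to 0-255.
OCTET_0_254 = r"(?:25[0-4]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
OCTET_0_255 = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
RANGE_PATTERN = re.compile(r"169\.254\." + OCTET_0_254 + r"\." + OCTET_0_255)

LOW = ipaddress.IPv4Address("169.254.0.0")
HIGH = ipaddress.IPv4Address("169.254.254.255")

def in_range_regex(text):
    """Regex check; fullmatch anchors the pattern at both ends."""
    return RANGE_PATTERN.fullmatch(text) is not None

def in_range_numeric(text):
    """Cross-check by parsing the address and comparing numerically."""
    try:
        return LOW <= ipaddress.IPv4Address(text) <= HIGH
    except ipaddress.AddressValueError:
        return False

for candidate in ["169.254.0.0", "169.254.254.255", "169.254.255.1"]:
    print(candidate, in_range_regex(candidate), in_range_numeric(candidate))
```

If the surrounding code already parses addresses, the numeric comparison is usually easier to read and maintain than the regex.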

How to group provided string correctly?

I have the following regex:
^([A-Za-z]{2,3}\d{6}|\d{5}|\d{3})((\d{3})?)(\d{2}|\d{3}|\d{6})(\d{2}|\d{3})$
I use this regex to match different, yet similar strings:
# MOR644-004-007-001
MOR644004007001 # string provided
# VUF00101-050-08-01
VUF001010500801 # string provided
# MF001317-077944-01
MF00131707794401 # string provided
These strings need to be grouped as shown in the comment above each one; my problem is that they are not being grouped correctly.
The first string, MOR644004007001, is grouped as (MOR644004) (007) (001), but should be (MOR644) (004) (007) (001).
The second string, VUF001010500801, is grouped as (VUF001010) (500) (801), but should be (VUF00101) (050) (08) (01).
How can I change ([A-Za-z]{2,3}\d{6}|\d{5}|\d{3})((\d{3})?) so that it groups the provided strings correctly?
I am not sure that you can do what you want to.
Let's consider the first two strings:
# MOR644-004-007-001
MOR644004007001 # string provided
# VUF00101-050-08-01
VUF001010500801 # string provided
Now, both strings are composed of 3 letters followed by 12 digits. Thus, given a regex R, if R does not depend on particular (sequences of) characters or digits (i.e., it uses [A-Za-z] and \d but not, say, MO or 0070), then it will match both strings in the same way.
So, if you want them matched differently, you need to look at the particular occurrence of certain characters or digits. We need more data from you in order to give you an answer.
Finally, I suggest you take a look at this tool:
http://regex.inginf.units.it/ (demo: http://regex.inginf.units.it/demo.html). It is a research project that automatically generates a regex from (many) examples of extraction. I warmly suggest you try it, especially if you know for sure that an underlying pattern is present in your case (i.e. strings beginning with VUF must be matched differently from strings beginning with MOR) but you are unable to find it. Again, you will need to provide many examples to the engine. Needless to say, if a generic pattern does not exist, the tool won't find it ;)
Considering your comment to Serv I'd say the (only?) solution is to have one regex for each possibility, like -
MOR(\d{3})(\d{3})(\d{3})(\d{3})|VUF(\d{5})(\d{3})(\d{2})(\d{2})|MF(\d{6})(\d{6})(\d{2})
and then use the execution environment (JS/php/python - you haven't provided which one) to piece the parts together.
See example on regex101 here. Note that substitution, only as an example, matches only the second string.
Regards
Take a look at this. I have used what is called a named group. As others pointed out earlier, it's better to have one regex for each string. I have shown it here for the first string, MOR644004007001; you can easily extend it to the other two strings:
import re
# MOR644-004-007-001
MOR = "MOR644004007001" # string provided
# VUF00101-050-08-01
VUF = "VUF001010500801" # string provided
# MF001317-077944-01
MF = "MF00131707794401" # string provided
MORcompile = re.compile(r'(?P<first>\w{,6})(?P<second>\d{,3})(?P<third>\d{,3})(?P<fourth>\d{,3})')
MORsearch = MORcompile.search(MOR.strip())
print(MORsearch.group('first'))
print(MORsearch.group('second'))
print(MORsearch.group('third'))
print(MORsearch.group('fourth'))
MOR644
004
007
001
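The one-pattern-per-prefix idea suggested in the answers can be sketched for all three sample strings as follows. This is only an illustration: the digit widths are taken from the three samples in the question and are assumptions for any other input, and the names `PATTERNS` and `hyphenate` are hypothetical.

```python
import re

# One pattern per known prefix. The digit widths come from the three sample
# strings in the question; they are assumptions for any other input.
PATTERNS = [
    re.compile(r"(MOR\d{3})(\d{3})(\d{3})(\d{3})"),
    re.compile(r"(VUF\d{5})(\d{3})(\d{2})(\d{2})"),
    re.compile(r"(MF\d{6})(\d{6})(\d{2})"),
]

def hyphenate(code):
    """Return the hyphen-separated form, or None if no pattern fits."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(code)
        if match:
            return "-".join(match.groups())
    return None

for code in ["MOR644004007001", "VUF001010500801", "MF00131707794401"]:
    print(code, "->", hyphenate(code))
```

Keeping the patterns in a list makes it straightforward to add another prefix later without touching the matching logic.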

Regex: How to match a string that is not only numbers

Is it possible to write a regular expression that matches all strings that do not contain only digits? If we have these strings:
abc
a4c
4bc
ab4
123
It should match the first four, but not the last one. I have tried fiddling around in RegexBuddy with lookaheads and such, but I can't seem to figure it out.
(?!^\d+$)^.+$
This uses a negative lookahead to reject lines made up entirely of digits, and then matches the entire line.
Unless I am missing something, I think the most concise regex is...
/\D/
...or in other words, is there a not-digit in the string?
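The two approaches so far (the negative lookahead and the single `\D` test) can be compared on the strings from the question with a small Python sketch. The function names `not_only_digits` and `has_non_digit` are hypothetical and chosen only for this illustration.

```python
import re

def not_only_digits(text):
    """Lookahead version: reject all-digit input, then match the whole line."""
    return re.match(r"(?!^\d+$)^.+$", text) is not None

def has_non_digit(text):
    """Short version: search anywhere in the string for one non-digit."""
    return re.search(r"\D", text) is not None

for sample in ["abc", "a4c", "4bc", "ab4", "123"]:
    print(sample, not_only_digits(sample), has_non_digit(sample))
```

On these five samples both functions give the same answer: true for the first four and false for `123`. Note that `has_non_digit` relies on `re.search`, which scans the whole string, rather than `re.match`, which only tries the start.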
jjnguy had it correct (if slightly redundant) in an earlier revision.
.*?[^0-9].*
@Chad, your regex,
\b.*[a-zA-Z]+.*\b
should probably allow for non letters (eg, punctuation) even though Svish's examples didn't include one. Svish's primary requirement was: not all be digits.
\b.*[^0-9]+.*\b
Then, you don't need the + in there since all you need is to guarantee 1 non-digit is in there (more might be in there as covered by the .* on the ends).
\b.*[^0-9].*\b
Next, you can do away with the \b on either end since these are unnecessary constraints (invoking reference to alphanum and _).
.*[^0-9].*
Finally, note that this last regex shows that the problem can be solved with just the basics, those basics which have existed for decades (eg, no need for the look-ahead feature). In English, the question was logically equivalent to simply asking that 1 counter-example character be found within a string.
We can test this regex in a browser by copying the following into the location bar, replacing the string "6576576i7567" with whatever you want to test.
javascript:alert(new String("6576576i7567").match(".*[^0-9].*"));
/^\d*[a-z][a-z\d]*$/
Or, case insensitive version:
/^\d*[a-z][a-z\d]*$/i
Optional digits at the beginning, then at least one letter, then any mix of letters and digits.
Try this:
/^.*\D+.*$/
It matches if the string contains any symbol that is not a digit. It should work in most regex flavors that support \D.
Since you said "match", not just validate, the following regex will match correctly
\b.*[a-zA-Z]+.*\b
Passing Tests:
abc
a4c
4bc
ab4
1b1
11b
b11
Failing Tests:
123
If you are trying to match words that contain at least one letter and are made up of digits and letters (or just letters), this is what I have used:
(\d*[a-zA-Z]+\d*)+
If we want to restrict valid characters so that string can be made from a limited set of characters, try this:
(?!^\d+$)^[a-zA-Z0-9_-]{3,}$
or
(?!^\d+$)^[\w-]{3,}$
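As a quick check of the restricted-character variant, here is a minimal Python sketch using the first of the two patterns. The names `RESTRICTED` and `is_valid` are hypothetical, and the test inputs beyond those in the question are made up for illustration.

```python
import re

# At least 3 characters from letters, digits, "_" and "-",
# but not a string consisting only of digits.
RESTRICTED = re.compile(r"(?!^\d+$)^[a-zA-Z0-9_-]{3,}$")

def is_valid(text):
    return RESTRICTED.match(text) is not None

for sample in ["abc", "123", "ab", "a-1", "12_", "a b", "1234a"]:
    print(repr(sample), is_valid(sample))
```

In Python 3 the `\w` in the second pattern also matches non-ASCII letters and digits by default, so the two patterns are not strictly interchangeable there.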
/\w+/:
Matches one or more word characters, i.e. any letter, digit, or underscore.
.*[^0-9]{1,}.*
Works fine for us.
We wanted to use the accepted answer, but it does not work within a YANG model.
The one I provided here is easy to understand and clear:
the start and end can be any characters, but there must be at least one non-numeric character, which is the key requirement.
I am using /^[0-9]*$/gm in my JavaScript code to check whether a string consists only of digits. If it does, the check fails; otherwise the function returns the string.
Below is a working code snippet with test cases:
function isValidURL(string) {
  var res = string.match(/^[0-9]*$/gm);
  if (res == null)
    return string;
  else
    return "fail";
}
var testCase1 = "abc";
console.log(isValidURL(testCase1)); // abc
var testCase2 = "a4c";
console.log(isValidURL(testCase2)); // a4c
var testCase3 = "4bc";
console.log(isValidURL(testCase3)); // 4bc
var testCase4 = "ab4";
console.log(isValidURL(testCase4)); // ab4
var testCase5 = "123";
console.log(isValidURL(testCase5)); // fail
I had to do something similar in MySQL, and the following, whilst oversimplified, seems to have worked for me:
WHERE fieldname REGEXP '^[a-zA-Z0-9]+$'
AND fieldname NOT REGEXP '^[0-9]+$'
This shows all fields that are alphabetic or alphanumeric and hides any fields that are purely numeric.
example:
name1 - Displayed
name - Displayed
name2 - Displayed
name3 - Displayed
name4 - Displayed
n4ame - Displayed
324234234 - Not Displayed