What is wrong with this regex email pattern? - regex

I have a pattern to validate an HTML5 email field:
[A-z\u00c0-\u017e0-9._%+-]+@[A-z\u00c0-\u017e0-9.-]+\.[A-z]{2,3}$
It should allow European characters as well as numbers and other symbols.
I am getting this error:
A part followed by '@' should not contain the symbol 'á'
It allows abc@défg.com but not ábc@defg.com
Is anyone able to help? Thanks!

Related

Multiple Email validation in a single input field separated by ;

I am currently writing software in which a user can enter more than one email address in an input field, separated by ";".
I have a regex that validates a single email address, but it doesn't work when the field contains several addresses separated this way.
Has anyone ever created such a regex, or is anyone able to help me?
Thanks in advance; I look forward to a response.
Here is my regex:
[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]{2,4}+(\;|)
Just put the pattern that matches the subsequent emails inside a non-capturing group with a preceding ;, and make the group repeat zero or more times.
^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]{2,4}+(?:;[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]{2,4}+)*$
One more thing: you need to escape the dot.
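A minimal JavaScript sketch of the same idea, building the list pattern from the single-address pattern so it is not written out twice. Two things here are my own adjustments rather than part of the answer: the possessive `+` after `{2,4}` is dropped because JavaScript regexes do not support possessive quantifiers, and the hyphen in the last character class is moved to the end so it cannot be read as a range. The function name `isValidEmailList` is made up for illustration.

```javascript
// Pattern for one address, based on the regex in the question.
// Assumption: possessive "+" removed and "-" moved to the end of the class.
const single = "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]{2,4}";

// One address, followed by zero or more ";address" groups, anchored at both ends.
const emailList = new RegExp(`^${single}(?:;${single})*$`);

// Hypothetical helper: true when the whole input is a ";"-separated address list.
function isValidEmailList(input) {
  return emailList.test(input);
}

console.log(isValidEmailList("a@example.com"));               // true
console.log(isValidEmailList("a@example.com;b@example.org")); // true
console.log(isValidEmailList("a@example.com;"));              // false (trailing separator)
console.log(isValidEmailList("a@example.com b@example.org")); // false (wrong separator)
```

If you would rather report which address is wrong, splitting the input on ";" and testing each part against the single-address pattern is probably easier to maintain.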

Twitter Name Validation

In our registration form we now want to ask the user to enter their Twitter name (e.g. @paul).
Can anyone tell what characters are allowed in it?
e.g. a-z, A-Z, underscores, 0-9
anything else?
I believe it's letters, numbers and underscores only, and a maximum of 15 characters.
A quick search unveiled this post (non-Twitter) covering the same topic:
http://kagan.mactane.org/blog/2009/09/22/what-characters-are-allowed-in-twitter-usernames/
The above post also contains regex examples to help you validate:
Full regex – /^[a-zA-Z0-9_]{1,15}$/
Perl-compatible regex – /^\w{1,15}$/
This is the final JavaScript function:
function validTwitteUser(sn) {
    return /^[a-zA-Z0-9_]{1,15}$/.test(sn);
}
Check this page from Twitter for the official guidelines/rules
http://support.twitter.com/articles/101299-why-can-t-i-register-certain-usernames#
For JavaScript / TypeScript:
Firstly,
npm i twitter-text
# or
yarn add twitter-text
Then,
import { isValidUsername } from 'twitter-text'
console.log(isValidUsername('@helloworld')) // true
There are other implementations (Java, Ruby, ObjC), please take a look at https://github.com/twitter/twitter-text
Note: the bundle size is too large for merely a string validation...
Whilst the regexes here all seem correct, they all allow usernames of fewer than 4 characters to pass validation. As per Twitter:
Your username cannot be longer than 15 characters. Your name can be
longer (50 characters) or shorter than 4 characters, but usernames
are kept shorter for the sake of ease.
A username can only contain alphanumeric characters (letters A-Z,
numbers 0-9) with the exception of underscores, as noted above. Check
to make sure your desired username doesn't contain any symbols,
dashes, or spaces.
So therefore, the correct regex is:
/^[a-zA-Z0-9_]{4,15}$/
To nit-pick, you also can't use the words Twitter or Admin, but for my purposes the above suffices.
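For anyone who does want the nit-picked version, here is a rough sketch. It assumes that "can't use the word" means the username must not contain "twitter" or "admin" anywhere, in any letter case; that reading is my assumption, and the function name `isValidTwitterUsername` is made up.

```javascript
// Hypothetical helper: 4 to 15 characters from [a-zA-Z0-9_], and
// (assumption) no occurrence of "twitter" or "admin" in any letter case.
function isValidTwitterUsername(name) {
  const wellFormed = /^[a-zA-Z0-9_]{4,15}$/.test(name);
  const reserved = /twitter|admin/i.test(name);
  return wellFormed && !reserved;
}

console.log(isValidTwitterUsername("paul_42"));    // true
console.log(isValidTwitterUsername("abc"));        // false (too short)
console.log(isValidTwitterUsername("my_Twitter")); // false (reserved word)
```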

How to detect that a certain string is not an email address but a twitter id?

Is there a way to differentiate between an email address and a twitter id?
Both use the '@' character and the email regex will be contained by the Twitter ID regex.
What's the best way to approach this?
Should I require whitespace before the '@' character in order to identify that it's a Twitter ID?
Not entirely sure which characters are allowed in Twitter usernames, but basically like so:
/(?:^|\s)@[a-zA-Z0-9_.-]+\b/
You can test that it's preceded by whitespace using (?<=\s) and then check for the valid characters of Twitter IDs, which are only [A-Za-z0-9_].
That gives you a resulting regex of: (?<=\s|^)@[A-Za-z0-9_]+
You could additionally check for a dot, comma or whitespace after it, to make sure that it's properly formatted within a sentence and not some weird artifact:
(?<=\s|^)@[A-Za-z0-9_]+(?=[\s.,])
Note that the lookbehind and lookahead (?<= and ?=) might not work in your language of choice, but I'll assume they do since you didn't specify.
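As a rough illustration in JavaScript (where lookbehind works in current engines), the sketch below pulls Twitter IDs out of a sentence. The `|$` alternative in the lookahead is my addition, so that an ID at the very end of the text is also found; `findMentions` is a made-up name.

```javascript
// Lookbehind: the "@" must follow whitespace or the start of the text.
// Lookahead: the ID must be followed by whitespace, ".", "," or (my addition) the end.
const mentionPattern = /(?<=\s|^)@[A-Za-z0-9_]+(?=[\s.,]|$)/g;

// Hypothetical helper: returns every Twitter ID found in the text.
function findMentions(text) {
  return text.match(mentionPattern) || [];
}

console.log(findMentions("Hi @paul, mail bob@example.com or ping @anna"));
// [ '@paul', '@anna' ]
```

The email address is skipped because its "@" is preceded by a letter rather than whitespace.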
Email addresses never start with an @, while Twitter IDs always do.
isTwitter = address[0] == '@'
A Twitter ID wouldn't pass an email regex check.
Regular email (matched case-insensitively):
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
A Twitter ID won't have the trailing domain part:
^@[A-Za-z0-9_]+$
So check whether it's a valid email; if not, check whether it's a valid Twitter ID.
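A small JavaScript sketch of that check order, using the two patterns above. The `i` flag stands in for case-insensitive matching of the email pattern, and the `classify` name and its return values are invented for the example.

```javascript
const emailPattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i;
const twitterPattern = /^@[A-Za-z0-9_]+$/;

// Hypothetical helper: try the email pattern first, then the Twitter ID pattern.
function classify(input) {
  if (emailPattern.test(input)) return "email";
  if (twitterPattern.test(input)) return "twitter";
  return "unknown";
}

console.log(classify("bob@example.com")); // email
console.log(classify("@paul"));           // twitter
console.log(classify("paul"));            // unknown
```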
Further reading:
How to Find or Validate an Email Address

CodeIgniter - Does 'name' regex match in Cart class support Unicode?

I want to include Unicode characters (to be more specific, Tamil words) in the 'name' of my CodeIgniter cart. I found this example. I tried the following, so that the regex could match anything:
$this->cart->product_name_rules = '.+';
$this->cart->product_name_rules = '.*';
$this->cart->product_name_rules = '.';
But for all these, I get the error "An invalid name was submitted as the product name: சும்மாவா சொன்னாங்க பெரியவங்க The name can only contain alpha-numeric characters, dashes, underscores, colons, and spaces" in my log.
Also, thinking it could be due to Unicode support, I tried the following:
$this->cart->product_name_rules = '\p{Tamil}';
But to no avail. Can you please point out what is wrong here?
Try adding each Tamil character individually to your regex. I had to do this for special characters in input keys:
if ( ! preg_match("/^[a-z0-9àÀâÂäÄáÁãÃéÉèÈêÊëËìÌîÎïÏòÒôÔöÖõÕùÙûÛüÜçÇ’ñÑß¡¿œŒæÆåÅøØö:_\.\-\/-\\\,]+$/i", $str))
{
exit('Disallowed Key Characters.');
}
Here, someone posted how he managed to save Cyrillic characters in CodeIgniter 1.7.2's cart.
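One possible explanation, sketched in JavaScript rather than PHP: the wording of the error message suggests that the rule string is placed inside a character class, roughly `^[` + rules + `]+$`. That is an assumption I have not checked against your CodeIgniter version. If it holds, `.+` only allows a literal dot and plus sign, and a script property such as Tamil only works when the regex runs in Unicode mode (the `u` flag in JavaScript; in PHP that would be the `u` modifier). `buildNameCheck` is a made-up helper.

```javascript
const productName = "சும்மாவா சொன்னாங்க பெரியவங்க";

// Assumption: the rules end up inside a character class, as in "^[rules]+$".
function buildNameCheck(rules) {
  return new RegExp("^[" + rules + "]+$", "iu");
}

// Inside a class, "." and "+" are literal characters, so the name is rejected.
console.log(buildNameCheck(".+").test(productName));                     // false

// A Tamil script property plus whitespace, in Unicode mode, accepts it.
console.log(buildNameCheck("\\p{Script=Tamil}\\s").test(productName));   // true
```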

Regular Expression for some email rules

I was using a regular expression for email formats which I thought was ok but the customer is complaining that the expression is too strict. So they have come back with the following requirement:
The email must contain an "@" symbol and end with either .xx or .xxx (e.g. .nl or .com). They are happy for anything matching this to pass validation. I have started the expression by checking whether the string contains an "@" symbol, as below:
^(?=.*[@])
This seems to work, but how do I add the last requirement (must end with .xx or .xxx)?
A regex simply enforcing your two requirements is:
^.+@.+\.[a-zA-Z]{2,3}$
However, there are email validation libraries for most languages that will generally work better than a regex.
I always use this for emails:
^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$
Try http://www.ultrapico.com/Expresso.htm as well!
It is not possible to validate every email address with a regex, but for your requirements this simple regex works. It is not complete and does not check for errors in any way, but it meets the specs exactly:
[^@]+@.+\.\w{2,3}$
Explanation:
[^@]+: Match one or more characters that are not @
@: Match the @
.+: Match one or more of any character
\.: Match a .
\w{2,3}: Match 2 or 3 word characters (a-z, A-Z, 0-9 and _)
$: End of string
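A quick JavaScript sketch of how that pattern behaves. The leading `^` is my addition so that the whole string is checked; the sample addresses are invented. Note that `\w` also accepts digits and underscores, so the last example passes even though it does not look like a real top-level domain.

```javascript
// Same pattern as above, with "^" added (assumption) to anchor the start.
const simpleEmail = /^[^@]+@.+\.\w{2,3}$/;

console.log(simpleEmail.test("user@example.nl"));   // true
console.log(simpleEmail.test("user@example.com"));  // true
console.log(simpleEmail.test("user@example.info")); // false (4 characters after the dot)
console.log(simpleEmail.test("user.example.com"));  // false (no @)
console.log(simpleEmail.test("user@example.c_1"));  // true (\w allows "_" and digits)
```

Swapping `\w` for `[a-zA-Z]` would reject that last case, if letters only is what the customer means by .xx or .xxx.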
Try this:
([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})\be(\w*)s\b
A good tool for testing regular expressions:
http://gskinner.com/RegExr/
You could use
[@].+\.[a-z0-9]{2,3}$
This should work:
^[^@\r\n\s]+[^.@]@[^.@][^@\r\n\s]+\.(\w){2,}$
I tested it against these invalid emails:
@exampleexample@domaincom.com
example@domaincom
exampledomain.com
exampledomain@.com
exampledomain.@com
example.domain@.@com
e.x+a.1m.5e@em.a.i.l.c.o
some-user@internal-email.company.c
some-user@internal-ema@il.company.co
some-user@@internal-email.company.co
@test.com
test@asdaf
test@.com
test.@com.co
And these valid emails:
example@domain.com
e.x+a.1m.5e@em.a.i.l.c.om
some-user@internal-email.company.co
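If you want to re-run that check yourself, something like the following JavaScript sketch would do it. The array and variable names are mine; the pattern and the sample addresses are the ones listed above.

```javascript
const pattern = /^[^@\r\n\s]+[^.@]@[^.@][^@\r\n\s]+\.(\w){2,}$/;

const invalid = [
  "@exampleexample@domaincom.com", "example@domaincom", "exampledomain.com",
  "exampledomain@.com", "exampledomain.@com", "example.domain@.@com",
  "e.x+a.1m.5e@em.a.i.l.c.o", "some-user@internal-email.company.c",
  "some-user@internal-ema@il.company.co", "some-user@@internal-email.company.co",
  "@test.com", "test@asdaf", "test@.com", "test.@com.co",
];

const valid = [
  "example@domain.com", "e.x+a.1m.5e@em.a.i.l.c.om",
  "some-user@internal-email.company.co",
];

// Collect any address whose result differs from what the list says it should be.
const wronglyAccepted = invalid.filter((address) => pattern.test(address));
const wronglyRejected = valid.filter((address) => !pattern.test(address));

console.log("wrongly accepted:", wronglyAccepted);
console.log("wrongly rejected:", wronglyRejected);
```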
Edit:
This one appears to validate all of the addresses from that Wikipedia page, though it probably allows some invalid emails as well. The parentheses will split it into everything before and after the @:
^([^\r\n]+)@([^\r\n]+\.?\w{2,})$
niceandsimple@example.com
very.common@example.com
a.little.lengthy.but.fine@dept.example.com
disposable.style.email.with+symbol@example.com
other.email-with-dash@example.com
user@[IPv6:2001:db8:1ff::a0b:dbd0]
"much.more unusual"@example.com
"very.unusual.@.unusual.com"@example.com
"very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com
postbox@com
admin@mailserver1
!#$%&'*+-/=?^_`{}|~@example.org
"()<>[]:,;@\\\"!#$%&'*+-/=?^_`{}| ~.a"@example.org
" "@example.org
üñîçøðé@example.com
üñîçøðé@üñîçøðé.com
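To show what the two capture groups give you, here is a short JavaScript sketch. The `splitAddress` helper and its property names are made up; only two of the listed addresses are tried, so this is not a check of the whole list.

```javascript
const splitPattern = /^([^\r\n]+)@([^\r\n]+\.?\w{2,})$/;

// Hypothetical helper: returns the parts before and after the last "@", or null.
function splitAddress(address) {
  const match = splitPattern.exec(address);
  return match ? { local: match[1], domain: match[2] } : null;
}

console.log(splitAddress("very.common@example.com"));
// { local: 'very.common', domain: 'example.com' }

// The first group is greedy, so a quoted local part containing "@" stays together.
console.log(splitAddress('"very.unusual.@.unusual.com"@example.com'));
// { local: '"very.unusual.@.unusual.com"', domain: 'example.com' }
```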