What does it mean to normalize an email address? - django

In this example, Django talks about normalizing an email address with self.normalize_email(email) where self is BaseUserManager. When I search for "normalizing emails" it seems to be a practice across all platforms. I see tutorials of how to do it, but nothing really explaining what it is and what it's used for.

For email addresses, foo@bar.com and foo@BAR.com are equivalent; the domain part is case-insensitive according to the RFC specs. Normalizing means providing a canonical representation, so that any two equivalent email strings normalize to the same thing.
The comments on the Django method explain:
Normalize the email address by lowercasing the domain part of it.

One application of normalizing emails is to prevent multiple signups. If your application lets the public sign up, it might attract the "unkind" types, who could attempt to sign up multiple times with the same email address by mixing symbols and upper and lower case to make variants of that address.
From Django's repository, the docstring of normalize_email is the following:
Normalize the email address by lowercasing the domain part of it.
This method lowercases the domain part of an email address, because that part is case-insensitive. Consider the following examples:
>>> from django.contrib.auth.models import BaseUserManager
>>> BaseUserManager.normalize_email("user@example.com")
'user@example.com'
>>> BaseUserManager.normalize_email("user@EXAMPLE.COM")
'user@example.com'
>>> BaseUserManager.normalize_email("user@example.COM")
'user@example.com'
>>> BaseUserManager.normalize_email("user@EXAMPLE.com")
'user@example.com'
>>> BaseUserManager.normalize_email("user@ExAmPlE.CoM")
'user@example.com'
As you can see, all of these emails are equivalent because the case after the @ is irrelevant.
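For anyone who wants to see the idea without Django installed, here is a rough sketch in plain Python. It is only an illustration of "lowercase the domain, leave the local part alone", not Django's actual source, and the standalone function is my own stand-in for the manager method.

```python
def normalize_email(email):
    """Lowercase only the domain part of an email address."""
    email = email or ""
    try:
        # Split on the last "@" so an "@" inside the local part stays intact.
        local_part, domain = email.strip().rsplit("@", 1)
    except ValueError:
        # No "@" present, so there is no domain to lowercase.
        return email
    # The local part keeps its case; only the domain is lowercased.
    return local_part + "@" + domain.lower()


print(normalize_email("User@ExAmPlE.CoM"))  # User@example.com
```

Note that the part before the @ is left untouched, so "User@example.com" and "user@example.com" still count as different strings after this kind of normalization.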

Related

Default regex django uses to validate email

Recently, I started playing with Django and created a custom form for user registration. In that form, to create the field for email, I use something like
email = forms.EmailField()
I observed that an address such as a@a.a is considered invalid by the form. Of course this is a nonsense email address. Nonetheless, I wonder how Django checks for validity.
I found some topics on the net discussing how to check for validity of an email address but all of them were providing some custom ways. Couldn't find something talking about the django default validator.
In their docs on the email field they specify
Uses EmailValidator to validate that the given value is a valid email address, using a moderately complex regular expression.
However that's not very specific so I decided to ask here.
For anyone also interested in this, I would suggest looking up the implementation (django.core.validators) as was kindly suggested by iklinac in the comments.
In it, there is not just the source but also mention of the standards that were used to derive the regexes that check whether the domain and the literal have a valid format.
You should check the docs here: https://www.geeksforgeeks.org/emailfield-django-forms/#:~:text=EmailField%20in%20Django%20Forms%20is,max_length%20and%20min_length%20are%20provided.
If you want to check validation, use the clean function like this:
from django.forms.fields import EmailField
email = EmailField()
my_email = "a@a.a"
print(email.clean(my_email))
If the email is valid, this function returns the value; otherwise it raises a ValidationError.

Obfuscating a string that can afterwards be used in a web address

I have a small simple server in Flask, and I'd like to be able to route to a user's page with the following route:
@app.route("/something/<string:username>", methods=["GET"])
When it's a clear username it's not a problem. However, I want to add simple obfuscation that, given a key, produces a new string that can still be used in a web address.
I tried my luck with several methods I found in Stack Overflow, but the output strings have various issues like non-ASCII characters, or characters that give me issues in the routing (like having a / which confuses Flask).
Ideally I'd like to have two functions, obfuscate(key, string) and deobfuscate(key, string) so I'll be able to use like so:
@app.route("/something/<string:username>", methods=["GET"])
def user_page(username):
    # username is an obfuscated string
    clear_username = deobfuscate(MY_KEY, username)
    return flask.make_response("Hi {}".format(clear_username), 200)
...
...
def create_user(username):
    # username is a clear string
    save_to_database(username)
    return obfuscate(MY_KEY, username)
To summarize, the obfuscation needs to be simple but good enough that you won't be able to figure it out by looking at the URL, and two-way so that I can figure out what the original string was and print it out.
I ended up solving the issue with itsdangerous, which is a dependency of Flask so I have it on my server anyway.
As the example here shows:
>>> from itsdangerous import URLSafeSerializer
>>> s = URLSafeSerializer('secret-key')
>>> s.dumps([1, 2, 3, 4])
'WzEsMiwzLDRd.wSPHqC0gR7VUqivlSukJ0IeTDgo'
>>> s.loads('WzEsMiwzLDRd.wSPHqC0gR7VUqivlSukJ0IeTDgo')
[1, 2, 3, 4]
It's safe to assume I won't have any surprises as the docstring says:
Works like :class:Serializer but dumps and loads into a URL safe string consisting of the upper and lowercase character of the alphabet as well as _, - and ..
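If you'd rather have the exact obfuscate(key, string) / deobfuscate(key, string) pair from the question without any extra package, something like the following might do. To be clear, this is not itsdangerous and not encryption: it is a standard-library sketch that XORs the text with bytes derived from the key and then base64-encodes it with the URL-safe alphabet. The helper _keystream is a name I made up, and anyone determined could undo the scheme, so treat it as light obfuscation only.

```python
import base64
import hashlib
from itertools import cycle


def _keystream(key):
    # Hypothetical helper: repeat the SHA-256 digest of the key forever.
    return cycle(hashlib.sha256(key.encode("utf-8")).digest())


def obfuscate(key, text):
    data = text.encode("utf-8")
    mixed = bytes(b ^ k for b, k in zip(data, _keystream(key)))
    # The URL-safe alphabet is A-Z, a-z, 0-9, "-" and "_"; drop "=" padding.
    return base64.urlsafe_b64encode(mixed).decode("ascii").rstrip("=")


def deobfuscate(key, token):
    # Restore the padding that obfuscate() stripped before decoding.
    padded = token + "=" * (-len(token) % 4)
    mixed = base64.urlsafe_b64decode(padded)
    return bytes(b ^ k for b, k in zip(mixed, _keystream(key))).decode("utf-8")


token = obfuscate("my-key", "alice")
print(token, deobfuscate("my-key", token))
```

Because the output never contains a /, it should pass through a <string:username> route segment unchanged.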

How to generate hash in django 1.9 like Invitation Link

I want to send the user an email that contains a URL plus a hash,
like this:
www.mywebsite.com/user/verify/121#$%3h2%^1kj3#$h2kj1h%$3kj%$21h
and save this hash against the user in the Database like this
ID | Email               | Hash
1  | youremail@gmail.com | 121#$%3h2%^1kj3#$h2kj1h%$3kj%$21h
When the user follows the link from the email, the application should compare the hash in the URL with the stored one and act accordingly.
My question is simple: how do I generate a unique hash for each user, and how do I store it in the database?
If by "hash", you mean a unique string, you can just use uuid.uuid4 for that purpose.
>>> import uuid
>>> unique_id = str(uuid.uuid4())
>>> print(unique_id)
d8814205-f11e-46e1-925e-a878fc75cb8d
>>> # replace dashes, if you like
>>> unique_id.replace("-", "")
I've used this for projects where I need to verify a user's email.
P.S.: It's not called a hash, it's called a unique ID. Hashing is something else, where you generate a value from a given string. See this question for more explanation.
Django has a Cryptographic Signing module, which helps produce unique and verifiable signatures for any data you need. If you are trying to do this to verify that the request is done by the appropriate user or not, you can use the library to verify requests, without storing the hash in the database.
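To give a feel for how verification without a stored hash can work, here is a minimal standard-library sketch of the signing idea using HMAC. It is not Django's signing module, just the same principle; SECRET_KEY, make_token and verify_token are names I picked for the illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical secret; keep the real one private


def _sign(value):
    # HMAC-SHA256 of the value, keyed with the server-side secret.
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def make_token(user_id):
    # The token carries the user id plus a signature over it.
    value = str(user_id)
    return "{}-{}".format(value, _sign(value))


def verify_token(token):
    # Split off the signature (hex digits never contain "-").
    value, _, signature = token.rpartition("-")
    expected = _sign(value)
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None


token = make_token(121)
print(token, verify_token(token))
```

The link in the email would then end in the token, and the view only needs the secret key to check it. A tampered or made-up token makes verify_token return None.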

How to make case insensitive queries with Django Models

I have a member model which contains an email field. I recently realized that if part of the email is capitalized, it won't show up in Django queries when I filter by email (multiple member objects have the same email, but with different capitalization). I could have made all emails lower-case when entering them into the database, but it's too late for that now (as the website is already launched). So how do I check who has a certain email, without being case sensitive?
Just use iexact:
User.objects.filter(email__iexact='email@email.com')
Case-insensitive exact match. If the value provided for comparison is None, it will be interpreted as an SQL NULL (see isnull for more details).
Member.objects.filter(email__iexact=email)

mail address validation in python/django

We are developing our website with Python and the Django framework. Currently we are looking for an API or tool to validate the physical address the user enters. It doesn't need to be very sophisticated. We just need to avoid spam registrations. For this step, we need to verify that the country, state, city, and zip code belong together. For instance, Berkeley as a city with zip code 94704 is not in NY state, etc.
Any thoughts?
You can give pygeocoder a try. It's a Python wrapper for Google's Geocoding V3 API. It can fulfill some of your validation needs.
from pygeocoder import Geocoder
g = Geocoder()
g.geocode("1 wellington, ottawa").valid_address
>>> True
g.geocode("1 wellington, chicago").valid_address
>>> False
It can handle some minor misspellings too:
result = g.geocode("1600 amphiteather, mountain view")
result.valid_address
>>> True
print(result)
>>> 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA
g.geocode("16000000 amphitheater, mountain view").valid_address
>>> False
But Google doesn't always have the right postal/zip code. I believe USPS has a public API for the US.
Disclaimer: I made pygeocoder.
You could probably use a GeoCoding service (like Google's) to verify that an address exists.