Encrypting Django User Model Fields - django

Is there any way Django provides us to Encrypt all / atleast fields like first_name, last_name, email_id of auth.User model just like how it does encrypts PASSWORD field before storing it into Database ?
My Workaround:
I have gone through documentation & few questions on StackOverflow, according to which it would be possible to inherit default BaseUser model & define our own myUser model the way we want, by defining the Custom Character Field which encrypts & decrypts characters.
Problem with this is in my application, I have provided SEARCH option for easy access of fields which are characters. If I encrypt all such Char Fields, it's difficult for me to query for search option.
For example: If ABCD, ABCDE, ABC are strings in database & user wishes to know all such entries which have BC, none of results pop out. Reason is each of ABCD, ABCDE, ABC encrypts to different / unique strings ( I am using AES encryption provided by PyCrypto ). Also BC gets encrypted to some unique string which has no similarity between that of ABCD, ABCDE, ABC ( for obvious reason that I am using AES algorithm with key length as 32 ). And the query I have written like
MyModel.objects.filter(first_name__icontains='BC')
would not return any result. ( Yes I want search to be not case sensitive ).
[Note: I have added all required methods like "to_python" , "get_db_prep_value" in Custom Field, also tried lookup method. But yeah actual problem is each string gets encrypted to unique characters in AES of same length]
Since I am new to Django, my question may not be that like a Django developer. I would like to know answer for either of above two questions. Unless I get answer I am deadlocked. Thanks in advance, but please be kind to me & answer.

I tried a lot, found no useful answers for querying partial matches if fields are encrypted. So I had to do this in Python (Found no other way to do it).
This work around works fine only if database we are working with is small, otherwise it comes with cost of performance.
Query all tuples from database, use python to do partial matches.
result = []
temp_result = MyModel.objects.all()
for temp in temp_result:
if query.lower() in temp.first_name.lower():
result.append(temp)
or something like above. I know this is rude way of Querying, but for the given conditions this was only available solution.

Related

How can I dynamically create multi-level hierarchical forms in Django?

I'm building an advanced search page for a scientific database using Django. The goal is to be able to allow some dynamically created sophisticated searches, joining groups of search terms with and & or.
I got part-way there by following this example, which allows multiple terms anded together. It's basically a single group of search terms, dynamically created, that I can either and-together or or-together. E.g.
<field1|field2> <is|is not|contains|doesn't contain> <search term> <->
<+>
...where <-> will remove a search term row and <+> will add a new row.
But I would like the user to be able to either add another search term row, or add an and-group and an or-group, so that I'd have something like:
<and-group|or-group> <->
<field1|field2> <is|is not|contains|doesn't contain> <search term> <->
<+term|+and-group|_or-group>
A user could then add terms or groups. The result search might end up like:
and-group
compound is lysine
or-group
tissue is brain
tissue is spleen
feeding status is not fasted
Thus the resulting filter would be like the following.
Data.objects.filter(Q(compound="lysine") & (Q(tissue=brain) | Q(tissue="spleen")) & ~Q(feeding_status="fasted"))
Note - I'm not necessarily asking how to get the filter expression below correct - it's just the dynamic hierarchical construction component that I'm trying to figure out. Please excuse me if I got the Q and/or filter syntax wrong. I've made these queries before, but I'm still new to Django, so getting it right off the top of my head here is pretty much guaranteed to be zero-chance. I also skipped the model relationships I spanned here, so let's assume these are all fields in the same model, for simplicity.
I'm not sure how I would dynamically add parentheses to the filter expression, but my current code could easily join individual Q expressions with and or or.
I'm also not sure how I could dynamically create a hierarchal form to create the sub-groups. I'm guessing any such solution would have to be a hack and that there are not established mechanisms for doing something like this...
Here's a screenshot example of what I've currently got working:
UPDATE:
I got really far following this example I found. I forked that fiddle and got this proof of concept working before incorporating it into my Django project:
http://jsfiddle.net/hepcat72/d42k38j1/18/
The console spits out exactly the object I want. And there are no errors. Clicking the search button works for form validation. Any fields I leave empty causes a prompt to fill in the field. Here's a demo gif:
Now I need to process the POST input to construct the query (which I think I can handle) and restore the form above the results - which I'm not quite sure how to accomplish - perhaps a recursive function in a custom tag?
Although, is there a way to snapshot the form and restore it when the results load below it? Or maybe have the results load in a different frame?
I don't know if I'm teaching a grandmother to suck eggs, but in case not, one of the features of the Python language may be useful.
foo( bar = 27, baz = None)
can instead be coded
args = {}
a1, a2 = 'bar', 'baz'
d[a1] = 27
d[a2] = None
foo( **args )
so an arbitrary Q object specified by runtime keys and values can be constructed q1 = Q(**args)
IIRC q1 & q2 and q1 | q2 are themselves Q objects so you can build up a filter of arbitrary complexity.
I'll also include a mention of Django-filter which is usually my answer to filtering questions like this one, but I suspect in this case you are after greater power than it easily provides. Basically, it will "and" together a list of filter conditions specified by the user. The built-in ones are simple .filter( key=value), but by adding code you can create custom filters with complex Q expressions related to a user-supplied value.
As for the forms, a Django form is a linear construct, and a formset is a list of similar forms. I think I might resort to JavaScript to build some sort of tree representing a complex query in the browser, and have the submit button encode it as JSON and return it through a single text field (or just pick it out of request.POST without using a form). There may be some Javascript out there already written to do this, but I'm not aware of it. You'd need to be sure that malicious submission of field names and values you weren't expecting doesn't result in security issues. For a pure filtering operation, this basically amounts to being sure that the user is entitled to get all data in database table in any case.
There's a form JSONField in the Django PostgreSQL extensions, which validates that user-supplied (or Javascript-generated) text is indeed JSON, and supplies it to you as Python dicts and lists.

Django Session counter

I'm making a food order project in Django, and I want to set unique order numbers for quick verification of the user when they come to pick the food up after the order.
If I create a random combination of numbers and letters, there can be an issue of two orders having the same order number. Is it possible to make the order number unique to each session?
Just use normal incremental order numbers and One Time Passwords(OTP)
User pyotp to generate random OTP for order.
When user orders, create a secret string. Then Generate an OTP for user and send it to them.
When they arrive to pick up, they show the OTP. You can check it using pyotp and the secret string to verify.
Use counter based OTP.
Store the secret key in the order model.
Simple and effective.
Are you storing this on the request.session or is it a model in the database? If it's the model in the database that's a lot simpler. Just set the unique=True kwarg in the field. Though you still have to generate the unique fields.
Look at this guy's _createHash function in his question. Combined with the accepted answer you can create unique order ID's this way. If it's just attached the the request.session object, it'll be more difficult(if not impossible) to be 100% sure that the string has not been repeated, because there's no way to query all the existing Sessions for a custom added attribute. I don't think session is the word you're looking for, though.

Django - How to add an entry in Ascending order in the database efficiently?

I am working on making an application in Django that can manage my GRE words and other details of each word. So whenever I add a new word to it that I have learnt, it should insert the word and its details in the database alphabetically. Also while retrieving, I want the details of the particular word I want to be extracted from the database.
Efficiency is the main issue.
Should I use SQLite? Should I use a file? Should I use a JSON object to store the data?
If using a file is the most efficient, what data structure should I implement?
Are there any functions in Django to efficiently do this?
Each word will have - meaning, sentence, picture, roots. How should I store all this information?
It's fine if the answer is not Django specific and talks about the algorithm or the type of database.
I'm going to answer from the data perspective since this is not totally related to django.
From your question it appears you have a fixed identifier for each "row": the word, which is a string, and a fixed set of attributes.
I would recommend using any enterprise level RDBMS. In the case of django, the most popular for the python ecosystem is PostgreSQL.
As for the ordering, just create the table with an index on the word name (this will be automatically done for you if you use the word as primary key), and retrieve your records using order_by in django.
Here's some info on django field options (check primary_key=True)
And here's the info for the order_by order_by method
Keep in mind you can also set the ordering in the Meta class of the model.
For your search case, you'll have to implement an endpoint that is capable of querying your database with startswith. You can check an example here
Example model:
class Word(models.Model):
word = models.CharField(max_length=255, primary_key=True)
roots = ...
picture = ...
On your second question: "Is this costly?"
It really depends. With 4000 words I'll say: NO
You probably want to add a delay in the client to do the query anyways (for example "if the user has typed in and 500ms have passed w/o further input")
If I'm to give 1 good advice to any starting developer, it's don't optimize prematurely

How to order django query set filtered using '__icontains' such that the exactly matched result comes first

I am writing a simple app in django that searches for records in database.
Users inputs a name in the search field and that query is used to filter records using a particular field like -
Result = Users.objects.filter(name__icontains=query_from_searchbox)
E.g. -
Database consists of names- Shiv, Shivam, Shivendra, Kashiva, Varun... etc.
A search query 'shiv' returns records in following order-
Kahiva, Shivam, Shiv and Shivendra
Ordered by primary key.
My question is how can i achieve the order -
Shiv, Shivam, Shivendra and Kashiva.
I mean the most relevant first then lesser relevant result.
It's not possible to do that with standard Django as that type of thing is outside the scope & specific to a search app.
When you're interacting with the ORM consider what you're actually doing with the database - it's all just SQL queries.
If you wanted to rearrange the results you'd have to manipulate the queryset, check exact matches, then use regular expressions to check for partial matches.
Search isn't really the kind of thing that is best suited to the ORM however, so you may which to consider looking at specific search applications. They will usually maintain an index, which avoids database hits and may also offer a percentage match ordering like you're looking for.
A good place to start may be with Haystack

Django: searching on an EncryptedCharField (django-extensions), is this possible?

is this possible?
For a model with EncryptedCharField named "first_name" i notice that the field does not decrypt when I search on it. In all other uses it is fine. This does not work:
if form.is_valid():
cd = form.cleaned_data
search_results = MyTable.objects.filter(first_name__icontains=cd['search_term'])
is this by design or am i doing something wrong?
thanks for you help...
Encrypting the search term first, even if the exact decrypted value, would not work as the cipher is not going to be the same as the one stored in the db. So this would not work:
crypter = Crypter.Read(settings.ENCRYPTED_FIELD_KEYS_DIR)
if form.is_valid():
cd = form.cleaned_data
cipher = crypter.Encrypt(cd['search_term'])
search_results = MyTable.objects.filter(first_name__icontains=cipher)
When something is encrypted (or at least, when it is done properly), it is impossible to gain the value that has been encrypted, without knowing the value. This means that while you can check the value of say a password very quickly, as the user has given you the value of the password, it is very hard to find out the value of the password from the encrypted string. This is part of the P=NP topic.
When you search say via MyTable.objects.filter(first_name=cipher), you are just comparing encrypted strings, which is fine. However, when you try MyTable.objects.filter(first_name_icontains=cipher), you are asking django to unencrypt all of the values, compare them, then return what matches. However, django cannot do that, as no one knows what the value of the decrypted first_name field is. This is by design, as it means that even if the database is compromised, the data is safe (It is also why you should beware any website or organisation that will show you your password, as it means they have not encrypted the value in their database). Overall, not being able to see a users password is a good thing, and even if you do not agree, it is a small price to pay for good security.
You could simply store the HMAC hash of the value in another field, then search for that.