Django: Editable BINARY field which displays as HEX in Admin - django

Update: I have now figured out that there is a reason to define get_prep_value() and that doing so improves Django's use of the field. I have also been able to get rid of the wrapper class. All this has, finally, enabled me to also eliminate the __getattribute__ implementation in the data model, which was annoying. So, apart from Django calling to_python() super often, I'm now fine as far as I can see.
One morning, you wake up and find yourself using Django 1.4.2 along with DjangoRESTFramework 2.1.2 on Python 2.6.8. And hey, things could definitely be worse. This Django admin magic provides you with forms for your easily specified relational data model, making it a pleasure to maintain the editorial part of your database. Your business logic behind the RESTful URLs accesses both the editorial data and specific database tables for their needs, and even those are displayed in the Django admin, partially because it's easily done and nice to have, partially because some automatically generated records require a mini workflow.
But wait. You still haven't implemented those binary fields as BINARY. They're VARCHARS. You had put that on your ToDo list for later. And later is now.
Okay, there are those write-once-read-many-times cases with small table sizes where an optimization would not necessarily pay off. But in another case, you're wasting both storage and performance due to frequent INSERTs and DELETEs in a table which will get large.
So what would you want to have? A clear mapping between the DB and Django, where the DB stores BINARY and Django deals with hex strings of twice the length. Can't be that hard to achieve, can it?
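The mapping described here can be sketched with the standard library's binascii module. This is only a minimal illustration, written in Python 3 syntax (the setup in the question runs Python 2.6); the helper names hex_to_binary and binary_to_hex are made up for this example.

```python
import binascii


def hex_to_binary(hexstr):
    """Convert a hex string (the Django side) to raw bytes for a BINARY column."""
    return binascii.a2b_hex(hexstr)


def binary_to_hex(raw):
    """Convert raw bytes from the database back to a hex string of twice the length."""
    return binascii.b2a_hex(raw).decode("ascii")


raw = hex_to_binary("00ff10ab")
print(len(raw))            # 4 bytes stored for 8 hex characters
print(binary_to_hex(raw))  # 00ff10ab
```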
You search the Web and find folks who want CHAR instead of VARCHAR, others who want BLOBs, and everybody seems to do it a bit differently. Finally, you end up at Writing custom model fields where the VARCHAR -> CHAR case is officially dealt with. So you decide to go with this information.
Starting with __init__(), db_type() and to_python(), you notice that to_python() gets rarely called and add __metaclass__ = models.SubfieldBase only to figure that Django now calls to_python() even if it has done so before. The other suggestions on the page suddenly start to make more sense to you, so you're going to wrap your data in a class, such that you can protect it from repeated calls to to_python(). You also follow the suggestion to Put a __str__() or __unicode__() method on the class you're wrapping up as a field and implement get_prep_value().
While the resulting code does not do what you expect, one thing you notice is that get_prep_value() never gets called so far, so you're removing it for now. What you do figure is that Django consistently appears to get a str from the DB and a unicode from the admin, which is cool, and end up with something like this (boiled down to essentials, really).
import binascii
from django.db import models
class MyHexWrapper(object):
    def __init__(self, hexstr):
        self.hexstr = hexstr
    def __len__(self):
        return len(self.hexstr)
    def __str__(self):
        return self.hexstr
class MyHexField(models.CharField):
    __metaclass__ = models.SubfieldBase
    def __init__(self, max_length, *args, **kwargs):
        assert(max_length % 2 == 0)
        self.max_length = max_length
        super(MyHexField, self).__init__(max_length=max_length, *args, **kwargs)
    def db_type(self, connection):
        # the DB column holds half as many bytes as there are hex characters
        return 'binary(%s)' % (self.max_length // 2)
    def to_python(self, data):
        if isinstance(data, MyHexWrapper): # protect object
            return data
        if isinstance(data, str): # binary string from DB side
            return MyHexWrapper(binascii.b2a_hex(data))
        if isinstance(data, unicode): # unicode hex string from admin
            return MyHexWrapper(data)
And... it won't work. The reason, of course, being that while you have found a reliable way to create MyHexWrapper objects from all sources including Django itself, the path backwards is clearly missing. From the remark above, you were thinking that Django calls str() or unicode() for admin and get_prep_value() in the direction of the DB. But if you add get_prep_value() above, it will never be called, and there you are, stuck.
That can't be, right? So you're not willing to give up easily. And suddenly you get this one nasty thought, and you're making a test, and it works. And you don't know whether you should laugh or cry.
So now you try this modification, and, believe it or not, it just works.
class MyHexWrapper(object):
    def __init__(self, hexstr):
        self.hexstr = hexstr
    def __len__(self):
        return len(self.hexstr)
    def __str__(self): # called on its way to the DB
        return binascii.a2b_hex(self.hexstr)
    def __unicode__(self): # called on its way to the admin
        return self.hexstr
It just works? Well, if you use such a field in code, like for a RESTful URL, then you'll have to make sure you have the right kind of string; that's a matter of discipline.
But then, it still only works most of the time. Because when you make such a field your primary key, then Django will call quote(getattr()) and while I found a source claiming that getattr() "nowdays" will use unicode() I can't confirm. But that's not a serious obstacle once you got this far, eh?
class MyModel(models.Model):
    myhex = MyHexField(max_length=32, primary_key=True, editable=False)
    # other fields
    def __getattribute__(self, name):
        if (name == 'myhex'):
            return unicode(super(MyModel, self).__getattribute__(name))
        return super(MyModel, self).__getattribute__(name)
Works like a charm. However, now you lean back and look at your solution as a whole. And you can't help but notice that it diverges from the documentation you referred to, that it relies on undocumented or internal behaviour you never intended to depend on, and that it is error-prone and awkward for the developer because what you have to implement and obey is scattered across several places.
So how can the objective be achieved in a cleaner way? Is there another level with hooks and magic in Django where this mapping should be located?
Thank you for your time.
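As the update at the top notes, defining get_prep_value() turned out to be the missing piece. A rough, Django-free sketch of how the two conversions might be split is below. HexConversions is a hypothetical stand-in class, not a Django API; in a real project these method bodies would live on a custom model field, and which hooks Django calls (and when) depends on the version. The sketch uses Python 3 syntax, where bytes and str are distinct types, so the str/unicode distinction from the question does not apply directly.

```python
import binascii


class HexConversions(object):
    """Hypothetical stand-in that mirrors two custom-field hooks by name."""

    def __init__(self, max_length):
        if max_length % 2 != 0:
            raise ValueError("max_length must be even")
        self.max_length = max_length

    def to_python(self, value):
        # Raw bytes from the database become a hex string.
        if isinstance(value, (bytes, bytearray)):
            return binascii.b2a_hex(bytes(value)).decode("ascii")
        # Hex strings (for example from a form) and None pass through
        # unchanged, so repeated calls are harmless.
        return value

    def get_prep_value(self, value):
        # Hex strings become raw bytes on the way to the database.
        if value is None:
            return None
        return binascii.a2b_hex(value)


field = HexConversions(max_length=8)
hex_value = field.to_python(b"\xde\xad\xbe\xef")
print(hex_value)                                # deadbeef
print(field.to_python(hex_value) == hex_value)  # True
```

Because to_python() leaves hex strings alone, no wrapper class is needed to protect the value from repeated calls.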

Related

Prefetch_related on queryset.get()

Today I have written a DRF view method using prefetch_related:
def post(self, request, post_uuid, format=None):
    post = Post.objects.prefetch_related('postimage_set').get(uuid=post_uuid)
    postimage_set = post.postimage_set.all()
    for image in postimage_set:
        ...
    return Response('', status.HTTP_200_OK)
And I fear that I am using prefetch_related wrongfully with this. Does it make sense to use prefetch_related here or will this fetch all posts as well as all postimages and then filter this set to just one instance? I'm super thankful for any help on this.
Looks kinda unnatural. Without looking at your database structure I can only guess that what you really want to do is:
PostImage.objects.filter(post__uuid=post_uuid) (mind the double underscore between post and uuid; that simple trick follows the relation attribute), which should result in a single query.
Moreover, if you are uncertain about the number of queries that will hit the database, you can write a very precise test with one of the assertions available since Django 1.3: assertNumQueries

Is there a better way to store a list of integers in a MySQL Model?

I'd like to store a list of integers in a MySQL field.
My current workaround:
import datetime
from django.db import models
class myModel(models.Model):
    testList = models.CharField(max_length=255)
    def set_testList(self, data):
        self.testList = ','.join(map(str, data))
    def get_testList(self):
        return list(map(int, self.testList.split(',')))
This works fine as long as I go through set_testList and get_testList to set and retrieve the field.
This gets particularly annoying as I have 4-5 such fields in some models, and having to set and retrieve every field through its own set and get methods makes the code much less readable and increases db queries.
Is it possible to create a solution where I wouldn't have to go through custom methods to achieve this?
The optimal case would be to set the field using myModel.objects.create(testList=[1,2,3,4]) and retrieve it using myModel.objects.get(pk=1).values(), with the conversion occurring 'behind the scenes'.
Is something like this possible (without having to migrate to PostgreSQL)?
You can define your own Django model field, like:
# app/fields.py
from django.db import models
class IntegerListField(models.CharField):
    description = 'list of integers'
    def from_db_value(self, value, expression, connection):
        if value is None:
            return None
        return list(map(int, value.split(',')))
    def to_python(self, value):
        if isinstance(value, list):
            return value
        if value is None:
            return None
        return list(map(int, value.split(',')))
    def get_prep_value(self, value):
        if value is None:
            return None
        return ','.join(map(str, value))
Then you can use that field in your model:
# app/models.py
import datetime
from django.db import models
from app.fields import IntegerListField
class myModel(models.Model):
    testList = IntegerListField(max_length=255)
So now Django will automatically convert the list of integers between the Python world and the database world.
The above is of course a rough sketch. You probably should read the documentation on Writing custom model fields.
So "under the hood" on the database side, we still use a VARCHAR or whatever CharField uses here. We have just added some extra logic that automatically converts values from the database to a list of integers and converts them back to strings before storing them. We thus did not construct a new database type. I think, however, that it is more convenient to work with a list of integers on your model.
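One detail the field code above does not cover is the empty list: ''.split(',') yields [''], and int('') raises ValueError. A minimal, Django-free illustration of the two conversions with that case handled follows. parse_int_list and serialize_int_list are hypothetical helper names, and mapping an empty string to an empty list is an assumption about the desired behaviour.

```python
def parse_int_list(text):
    """Turn '1,2,3' into [1, 2, 3]; None stays None, '' becomes []."""
    if text is None:
        return None
    if text == "":
        return []
    return [int(part) for part in text.split(",")]


def serialize_int_list(values):
    """Turn [1, 2, 3] into '1,2,3'; None stays None, [] becomes ''."""
    if values is None:
        return None
    return ",".join(str(v) for v in values)


print(parse_int_list("1,2,3"))        # [1, 2, 3]
print(parse_int_list(""))             # []
print(serialize_int_list([4, 5, 6]))  # 4,5,6
```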
While Willem's answer is great and perfectly correct from a purely technical POV, I wish to add that the question itself suggests a possible database design issue.
You are using a relational database, not a mere bit bucket, and relational modeling rules state that fields should be atomic (one field should only store one single atomic value), which is not the case anymore with this solution.
Theoretically, the right design would be a distinct table (model) holding the values, with a foreign key to the "master" model. One of the benefits here is that you can query the master model on the related model's values...
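To illustrate what querying on the related values looks like, here is a small sketch using Python's built-in sqlite3 module rather than Django models. The table and column names (master, master_value) and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE master_value ("
    "master_id INTEGER REFERENCES master(id), value INTEGER)"
)
conn.executemany("INSERT INTO master VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.executemany(
    "INSERT INTO master_value VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 20), (2, 30)],
)


def masters_with_value(value):
    """Return the names of master rows that have the given integer attached."""
    rows = conn.execute(
        "SELECT m.name FROM master m "
        "JOIN master_value v ON v.master_id = m.id "
        "WHERE v.value = ? ORDER BY m.name",
        (value,),
    )
    return [name for (name,) in rows]


print(masters_with_value(20))  # ['a', 'b']
print(masters_with_value(10))  # ['a']
```

With a comma-separated string column, the same lookup would need string matching on every row instead of an indexable join.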
Now I know from experience that it's just plain overkill (and useless overhead) for some use cases (if you never need to query on those values for example), and you didn't provide any context for your question so it's impossible to tell whether denormalizing is a sensible design here or not, but I thought this little reminder could be useful (for you, but also for future readers).
PS: also, more and more RDBMSs are building (more or less complete and performant...) support for JSON fields nowadays, so you might want to check this solution too (possibly wrapping the JSON field in a custom one to make sure you only ever get lists of integers).
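The "integers only" wrapper mentioned here can be sketched in plain Python with the json module. The function names are hypothetical, and how this would plug into a particular JSON field is left open.

```python
import json


def _is_plain_int(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def dump_int_list(values):
    """Serialize a list of integers to JSON, rejecting anything else."""
    values = list(values)
    if not all(_is_plain_int(v) for v in values):
        raise TypeError("expected only integers, got %r" % (values,))
    return json.dumps(values)


def load_int_list(text):
    """Parse JSON and check that the result is a list of integers."""
    data = json.loads(text)
    if not isinstance(data, list) or not all(_is_plain_int(v) for v in data):
        raise ValueError("stored value is not a list of integers")
    return data


print(dump_int_list([1, 2, 3]))    # [1, 2, 3]
print(load_int_list("[4, 5, 6]"))  # [4, 5, 6]
```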

Django POST validation seeing number as string

I'm a bit confused where to go with this as I thought it would be part of Django's validation... I'm on 1.8 because I'm using an older database connection library that was last tested with 1.8 (rewriting a frontend for old data).
models.py:
class Order(models.Model):
    #rest of class#
    RequestorNumber = models.SmallIntegerField(db_column='requestor_no')
class Requestor(models.Model):
    RequestorNumber = models.SmallIntegerField(primary_key=True, db_column="requester_no")
    Requestor = models.CharField(max_length=20, db_column = "requester")
    def __str__(self):
        return self.Requestor
forms.py
class OrderForm(forms.ModelForm):
    RequestorNumber = forms.ModelChoiceField(queryset=Requestor.objects.all().order_by('RequestorNumber'), label="Requestor")
So this creates a correct dropdown in the template, with values as integers and text as the descriptions ex:
<option value="1" selected="selected">JOHN DOE</option>
When the form is submitted, the POST QueryDict has a proper entry when printing the entire request:
...
'OrderForm-RequestorNumber': ['1']
...
This comes in as a string (as I would expect), but when is_valid() runs, the validator rejects it and the webpage shows:
'JOHN DOE' value must be an integer.
Is this by design? I feel like it's trying to ignore the value of the selected option in the form and referring back to the object's __str__ definition as what needs to be saved. If this is dumb, I'm also all ears to figure out what a more correct method is; the only problem is I can't change the DB schema, and all tables are managed=False in the meta.
EDIT: I overwrote the clean_RequestorNumber in the form to literally output the value it thinks is supposed to be saved, and it's giving the value of the __str__ of the method rather than the primary key.
I need to change this behavior but I can't nail down the spot in source code where the validation is being done. Between models.py, fields.py, and widgets.py i can see the required, valid_choice, and other validations but I can't spot where this is being pushed around. Once I can spot it I can try writing my own class but I can't figure out what to overwrite.
class OrderForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        super(OrderForm, self).__init__(*args, **kwargs)
        self.fields['RequestorNumber'].choices = [(x.pk, x.Requestor) for x in Requestor.objects.all()]
    RequestorNumber = forms.ChoiceField()
Doing it the old way before ModelChoiceField was a thing validates correctly. Not sure if this was fixed past 1.8 but I'm still having trouble finding the spot in the source that would provide the incorrect behavior for the validation. If someone can point me in the correct direction I'd be happy as I would still rather fix the version I'm on so I can have some cleaner code to deal with. Going to leave this up for a week if someone can help and then just mark this as the answer.

Django: overriding single model field validation

I am using django.db.models.fields.DecimalField in one case, but its validation error is quite bad.
For example, when a user enters 3,4 instead of 3.4 it says 'Enter a number'. Well, 3,4 is as much a number in some countries as 3.4 is, at least to those who are not well versed in computer stuff.
So for that reason I am trying to override this field's validation so I can validate it myself.
My problem is that before the ModelForm's clean_my_field() is called, the model's own validation runs and has already raised an error.
So i looked up https://docs.djangoproject.com/en/dev/ref/models/instances/#validating-objects
After reading this i understood that i could do
def full_clean(self):
    super(MyModel, self).full_clean(exclude = 'my_field')
and my_field would be excluded from validation and i could validate it myself in
def clean(self):
    pass
    #how do i access cleaned data here anyway?
    #self.cleaned_data does not exist
    #is self.my_field the only way?
But alas - it does not work. self.my_field value is old value in clean() method and cleaned_data is nowhere to be found.
All this makes me think my approach is wrong. I could write my own field which extends django's DecimalField i guess. I thought this approach would work... Can someone clear this up for me as - WHY it does not work. why is that exclude there if it does not work? Django version 1.4.2 by the way.
Alan
Edit: I dug deeper. It seems that even if i override all models cleaning methods and dont use super in them at all - the fields are STILL cleaned at some point and the error is already raised by then.
I guess i will be doing some extending to django.db.models.fields.DecimalField in this case.
An answer about why the exclude is there in the full_clean method would still be nice. Why is it there if it does not work?
I know it's an old question but for the ones that didn't find an answer, what I did was to add localize=True in the ModelAdmin for the Admin Site:
formfield_overrides = {
    models.DecimalField: {'localize': True},
}
That will make the FormField locale-aware, accepting a comma as the decimal separator (depending on the current locale, of course). It will also display the value localized.
https://docs.djangoproject.com/en/dev/topics/i18n/formatting/#locale-aware-input-in-forms
If you are targeting a single DecimalField or you are writing a custom Form or ModelForm just follow the instructions on the URL.
def views_name(request):
    ......
    field = dec_num(form.cleaned_data['....'])
    .........
    return render(request, 'page.html', {.....})
def dec_num(value):
    return value.replace(",", ".")
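If you do go the manual route, a slightly more defensive variant of dec_num might convert to Decimal and report bad input. This is a sketch only: parse_decimal is a hypothetical name, and it assumes the input contains no thousands separators, so something like '1.234,5' is rejected rather than guessed at.

```python
from decimal import Decimal, InvalidOperation


def parse_decimal(text):
    """Accept either ',' or '.' as the decimal separator and return a Decimal."""
    normalized = text.strip().replace(",", ".")
    if normalized.count(".") > 1:
        raise ValueError("ambiguous number: %r" % text)
    try:
        return Decimal(normalized)
    except InvalidOperation:
        raise ValueError("not a number: %r" % text)


print(parse_decimal("3,4"))  # 3.4
print(parse_decimal("3.4"))  # 3.4
```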

Form/ModelForm instances between requests

I want to write a custom form field (and possibly a widget too) and I'm not sure about how the form instances are shared between requests. For example, if I render a form with data from a model instance, is that instance still available when I am validating data? If so, does that mean that there is another database hit to look up the model again between requests?
Similarly, if I write a custom field that takes in a list of data to display in its __init__ method, will that list of data be available to validate against when the user POSTs the data?
It would be really helpful if someone could point me to parts of the django source where this occurs. I've been looking at the models.py, forms.py, fields.py and widgets.py from django.forms, but I'm still not 100% sure how it all works out.
Eventually, what I want to do is have a field that works something like this (the key part is the last line):
class CustomField(ChoiceField):
    def __init__(self, data_dict, **kwargs):
        super(CustomField, self).__init__(**kwargs)
        self.data_dict = data_dict
        self.choices = data_dict.keys()
    def validate(self, value):
        if value not in self.data_dict:
            raise ValidationError("Invalid choice")
        else:
            return self.data_dict[value]
Will that data_dict be available on the next request? If I create a custom forms.Form and initialize it with the data_dict, will that be available on the next request? (e.g. with a factory method or something...).
Side note: I'm doing this because I want to (eventually) use something like Bootstrap's typeahead and I'd like to pass it "pretty values" which I then convert server-side (basically, like how option values in a select can have a different submitted value). I've done this with client-side javascript in the past, but it would be nice to consolidate it all into a form field.
There's nothing magical about forms. Like everything else in Django (or just about any web framework), objects don't persist between requests, and need to be reinstantiated each time. This happens in the normal view pattern for form handling: you instantiate it once for a POST, and a separate time for a GET. If you have data associated with the form, it would need to be passed in each time.
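In practice this means the data_dict from the question has to be supplied again on every request, for example through a small factory. The sketch below uses plain Python stand-ins instead of Django classes; PrettyChoice, build_field, handle_get, and handle_post are invented names for illustration, and the hard-coded dict stands in for whatever source the real data comes from.

```python
class PrettyChoice(object):
    """Stand-in for a form field that maps pretty values to stored values."""

    def __init__(self, data_dict):
        self.data_dict = data_dict

    def clean(self, value):
        if value not in self.data_dict:
            raise ValueError("Invalid choice")
        return self.data_dict[value]


def build_field():
    # Called once per request; nothing is remembered from earlier requests.
    return PrettyChoice({"Red apple": 1, "Green pear": 2})


def handle_get():
    field = build_field()  # first instance, discarded after the response
    return sorted(field.data_dict)


def handle_post(submitted):
    field = build_field()  # second instance, built from scratch
    return field.clean(submitted)


print(handle_get())               # ['Green pear', 'Red apple']
print(handle_post("Green pear"))  # 2
```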