Better to save a slug to the DB or generate dynamically? - django

I am working on a django project and would like to include a slug at the end of the url, as is done here on stackoverflow.com: http://example.com/object/1/my-slug-generated-from-my-title
The object ID will be used to look up the item, not the slug -- and, like stackoverflow.com, the slug won't matter at all when getting the link (just in displaying it).
Question: is there a downside (or upside) to generating the slug dynamically, rather than saving it as an actual database field?
For example (not real code):
    class Widget(models.Model):
        title = models.CharField()

        def _slug(self):
            return slugify(self.title)
        slug = property(_slug)
Rather than using something like an AutoSlugField (for example)?
Since my plan is to have it match the title, I didn't know if it made sense to have a duplicate field in the database.
Thanks!

If you're using the slug for decorative (rather than lookup) purposes, generating it dynamically is the best idea.
Additionally, the code sample you posted can be written like this:
    @property
    def slug(self):
        return slugify(self.title)

Try making a slug out of the word "café" or "浦安鉄筋家族".
Chances are that it'll look like poo, unless you're really well-prepared.
Sometimes you need the ability to customize slugs.
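To see why non-English titles are a problem, here is a minimal sketch of an ASCII-only slug function. The helper name ascii_slug is hypothetical and this is not Django's own slugify, just a rough standard-library approximation of how an ASCII-only slugifier tends to behave:

```python
import re
import unicodedata


def ascii_slug(title):
    """Rough stand-in for an ASCII-only slugify (hypothetical helper, not Django's)."""
    # Decompose accented characters, then drop everything that is not ASCII.
    text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    # Remove anything that is not a word character, whitespace, or a hyphen.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", text)


print(ascii_slug("My Slug Generated From My Title"))  # my-slug-generated-from-my-title
print(ascii_slug("café"))                             # cafe (the accent is lost)
print(repr(ascii_slug("浦安鉄筋家族")))                 # '' (nothing survives)
```

With a slugifier like this, the Japanese title yields an empty slug, which is the kind of case where a hand-written slug is needed.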

The downside is that you regenerate the slug every time you render the page. The upside is that you're not taking up space in the database with a field that will never be queried directly.
Either way is fine; it just depends on your performance vs. space requirements.

The main downside of generating slugs dynamically is that you lose the ability to customize slugs per object, e.g. to make them shorter and prettier. For English titles this can be OK, but for non-English content the generated slugs can be ugly.

Related

two-stage submission of django forms

I have a django model that looks something like this:
    class MyModel(models.Model):
        a = models.BooleanField(default=False)
        b = models.CharField(max_length=33, blank=False)
        c = models.CharField(max_length=40, blank=True)
and a corresponding form
    class MyForm(ModelForm):
        class Meta:
            model = MyModel
Asking the user to fill in the form is a two-phase process. First I ask whether a is True or False. After a submit (maybe better with ajax, but let's keep it simple at first) I can fill in a dropdown list with choices for b and decide whether or not to show c as an option. There is no need to show a as a choice any more; it becomes descriptive text.
So I want to present a form twice, and the same model is behind it, slowly being filled in. First choose a, then be reminded that a is True and be asked about b and c. Or that a is False and be asked about only b. After submitting this second form, I save the object.
I'm not clear how best to do this in django. One way is to make two separate form classes, one of which has hidden fields. But if the same model is behind both and the model has required fields, I'm anticipating this will get me in trouble, since the first form won't have satisfied the requirement that b be non-empty. In addition, there's a small amount of fragility introduced, since updating the model requires updating two forms (and probably at least one view).
Alternatively, I could use non-model forms and have full freedom, but I'd love to believe that django has foreseen this need and that I can do it more easily.
Any suggestions on what the right idiom is?
You can use the Form Wizard from django-formtools for that: https://django-formtools.readthedocs.io/en/latest/wizard.html
It works rather simply: you define multiple forms and combine them. At the end, you can use the collected data to your liking in a custom done() method. The docs tell you everything. For a super quick approach, you can use JS to hide some of your fields (utilizing localStorage, for example).

Should I define get_absolute_url() if the url is not unique?

In the Django framework, models can define a get_absolute_url() method. The method is used to create urls for instances of the model and it is considered a good practice to define and use this method.
Is it still a good practice to define this method even if the generated urls are not unique?
Example:
    class Project(models.Model):
        name = models.CharField(max_length=128)

        def get_absolute_url(self):
            return reverse('xxx:project_details', args=(self.id,))

    class Item(models.Model):
        project = models.ForeignKey(Project, on_delete=models.CASCADE)
        name = models.CharField(max_length=128)

        def get_absolute_url(self):
            return reverse('xxx:project_details', args=(self.project.id,))
Currently, the Item instances can only be seen in a list on the project_details page and I intend to keep it that way. The get_absolute_url() method returns the project details url. This means that all Items of the same project return the same url. Is that okay? I found it useful, because some generic views use get_absolute_url() and automatically redirect to the correct page. However, I am new to Django and want to know whether this will cause problems later.
Short answer: I would advise against doing this. Usually get_absolute_url links to a "unique identifier" for that object.
Is it still a good practice to define this method even if the generated urls are not unique?
The documentation on get_absolute_url(..) [Django-doc] mentions that:
Define a get_absolute_url() method to tell Django how to calculate
the canonical URL for an object. To callers, this method should
appear to return a string that can be used to refer to the object over
HTTP.
(...)
Similarly, a couple of other bits of Django, such as the syndication
feed framework, use get_absolute_url() when it is defined. If it
makes sense for your model’s instances to each have a unique URL,
you should define get_absolute_url().
For mathematics and computer science, the article on canonical form [wiki] says that:
(...) The distinction between "canonical" and "normal" forms varies by subfield. In most fields, a canonical form specifies a unique representation for every object, while a normal form simply specifies its form, without the requirement of uniqueness. (...)
So this hints that get_absolute_url makes more sense if the URLs that are generated are unique. After all, the get_absolute_url aims to show the details of that specific object. You use it to redirect(some_object), etc.
Although it will of course not raise errors, it is not very common to use get_absolute_url to link to the detail page of a parent object.
If you proceed, you might implement it as: return self.project.get_absolute_url(..) instead of implementing the same reverse(..) logic, since if you later alter the get_absolute_url of the project, then the get_absolute_url of item will be updated as well.
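The delegation suggested above can be sketched with plain Python classes. These are stand-ins rather than real Django models, and the URL layout /projects/<pk>/ is an assumption made only for illustration (in Django the project would call reverse() instead):

```python
class Project:
    """Plain-Python stand-in for the Project model (no Django required)."""

    def __init__(self, pk):
        self.pk = pk

    def get_absolute_url(self):
        # Hypothetical URL layout; a real model would use reverse() here.
        return f"/projects/{self.pk}/"


class Item:
    """Stand-in for the Item model, holding a reference to its project."""

    def __init__(self, project, name):
        self.project = project
        self.name = name

    def get_absolute_url(self):
        # Delegate to the parent instead of repeating its URL logic.
        return self.project.get_absolute_url()


project = Project(pk=7)
item = Item(project, "first item")
print(item.get_absolute_url())  # /projects/7/
```

If the project's URL scheme changes later, only Project.get_absolute_url has to be touched.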

Django Model Optimization

    class Model(models.Model):
        .......
        .......
        .......
        .......
        first_name = models.CharField(max_length=50)
        last_name = models.CharField(max_length=50)

        def full_name(self):
            return '%s %s' % (self.first_name, self.last_name)
Would calling Model.objects.get().full_name() be efficient,
or would
Model.objects.filter().values('first_name', 'last_name') and then
concatenating the strings later be better?
The question is about database optimization. Basically, I want to know whether calling a method of a model loads the whole object or not. If not, then I feel both would result in the same database operations; but if it does load the whole object, then the values() method would be the better optimization.
Please reply, and share any experience you have on this topic, as well as any statistics comparing the two if you have them.
Please note that this is an example and not the actual use case; the model also contains many other fields.
Some will feel that using defer() or only() would also give the desired result. But what I understood from the Django documentation is that it basically only prevents those fields' data from being converted to Python objects, and does not affect the SQL look-ups. Therefore I don't think that's any better.
Please help me out
Thanks in advance.
The question is not whether "calling a method of model loads the whole object or not", because that's irrelevant. The "loading of the whole object" has already been done by the get call. The method will operate on the model object returned by that call, which unless you specify otherwise (using for example defer or only) will be the entire object.
When you use get or filter and then access the object or objects from those querysets, you don't get extra queries as long as you only use fields of the model you are accessing. In your case those fields would be first_name and last_name.
It is different when you have a foreign key to another model. The simple query you made before does not fetch the related object from the database, so when you access a field of that other model, you hit the database again. To solve this problem, see the documentation of select_related and prefetch_related.
Hope it helps!
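The foreign-key point above can be illustrated at the SQL level without Django. This is only a sketch using the standard sqlite3 module; the person and company tables and the run() helper are hypothetical, and the JOIN stands in for the kind of query select_related is meant to produce:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        company_id INTEGER REFERENCES company(id)
    );
    INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO person VALUES
        (1, 'Ada', 'Lovelace', 1),
        (2, 'Alan', 'Turing', 2),
        (3, 'Grace', 'Hopper', 1);
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one statement and count it, so both patterns can be compared."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Lazy pattern: one query for the people, then one more per related company.
people = run("SELECT first_name, last_name, company_id FROM person ORDER BY id")
lazy = [
    (first + " " + last, run("SELECT name FROM company WHERE id = ?", (cid,))[0][0])
    for first, last, cid in people
]
lazy_queries = counter["queries"]

# Joined pattern: the related rows arrive together with the people.
counter["queries"] = 0
joined = [
    (first + " " + last, company)
    for first, last, company in run(
        "SELECT p.first_name, p.last_name, c.name "
        "FROM person p JOIN company c ON c.id = p.company_id ORDER BY p.id"
    )
]
joined_queries = counter["queries"]

print(lazy_queries, joined_queries)  # 4 1
print(lazy == joined)                # True
```

Both patterns return the same data; the difference is only in how many round trips to the database are made.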

Django: Editable BINARY field which displays as HEX in Admin

Update: I have now figured out that there is a reason to define get_prep_value(), and that doing so improves Django's use of the field. I have also been able to get rid of the wrapper class. All this has, finally, enabled me to also eliminate the __getattribute__ implementation in the data model, which was annoying. So, apart from Django calling to_python() very often, I'm now fine as far as I can see.
One morning, you wake up and find yourself using Django 1.4.2 along with DjangoRESTFramework 2.1.2 on Python 2.6.8. And hey, things could definitely be worse. This Django admin magic provides you with forms for your easily specified relational data model, making it a pleasure to maintain the editorial part of your database. Your business logic behind the RESTful URLs accesses both the editorial data and specific database tables for their needs, and even those are displayed in the Django admin, partially because it's easily done and nice to have, partially because some automatically generated records require a mini workflow.
But wait. You still haven't implemented those binary fields as BINARY. They're VARCHARS. You had put that on your ToDo list for later. And later is now.
Okay, there are those write-once-read-many-times cases with small table sizes where an optimization would not necessarily pay. But in another case, you're wasting both storage and performance due to frequent INSERTs and DELETEs in a table which will get large.
So what would you want to have? A clear mapping between the DB and Django, where the DB stores BINARY and Django deals with hex strings of twice the length. Can't be that hard to achieve, can it?
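The mapping described above can be sketched in isolation with the standard binascii module. This is only an illustration of the round trip, written for Python 3 (bytes vs. str) rather than the Python 2.6 of the question; the function names are hypothetical and not part of any Django API:

```python
import binascii


def hex_to_binary(hexstr):
    """Hex string of 2n characters -> n raw bytes, as a BINARY(n) column would store them."""
    return binascii.a2b_hex(hexstr)


def binary_to_hex(raw):
    """Raw bytes from the database -> hex string as shown to the user."""
    return binascii.b2a_hex(raw).decode("ascii")


hexstr = "00ff10ab"
raw = hex_to_binary(hexstr)
print(len(hexstr), len(raw))         # 8 4
print(binary_to_hex(raw) == hexstr)  # True
```

In a custom field, the first direction is what a method like get_prep_value() would do and the second is what to_python() would do.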
You search the Web and find folks who want CHAR instead of VARCHAR, others who want BLOBs, and everybody seems to do it a bit differently. Finally, you end up at Writing custom model fields where the VARCHAR -> CHAR case is officially dealt with. So you decide to go with this information.
Starting with __init__(), db_type() and to_python(), you notice that to_python() gets rarely called and add __metaclass__ = models.SubfieldBase only to figure that Django now calls to_python() even if it has done so before. The other suggestions on the page suddenly start to make more sense to you, so you're going to wrap your data in a class, such that you can protect it from repeated calls to to_python(). You also follow the suggestion to Put a __str__() or __unicode__() method on the class you're wrapping up as a field and implement get_prep_value().
While the resulting code does not do what you expect, one thing you notice is that get_prep_value() never gets called so far, so you're removing it for now. What you do figure is that Django consistently appears to get a str from the DB and a unicode from the admin, which is cool, and end up with something like this (boiled down to essentials, really).
    class MyHexWrapper(object):
        def __init__(self, hexstr):
            self.hexstr = hexstr

        def __len__(self):
            return len(self.hexstr)

        def __str__(self):
            return self.hexstr

    class MyHexField(models.CharField):
        __metaclass__ = models.SubfieldBase

        def __init__(self, max_length, *args, **kwargs):
            assert(max_length % 2 == 0)
            self.max_length = max_length
            super(MyHexField, self).__init__(max_length=max_length, *args, **kwargs)

        def db_type(self, connection):
            return 'binary(%s)' % (self.max_length // 2)

        def to_python(self, data):
            if isinstance(data, MyHexWrapper):  # protect object
                return data
            if isinstance(data, str):  # binary string from DB side
                return MyHexWrapper(binascii.b2a_hex(data))
            if isinstance(data, unicode):  # unicode hex string from admin
                return MyHexWrapper(data)
And... it won't work. The reason, of course, being that while you have found a reliable way to create MyHexWrapper objects from all sources including Django itself, the path backwards is clearly missing. From the remark above, you were thinking that Django calls str() or unicode() for admin and get_prep_value() in the direction of the DB. But if you add get_prep_value() above, it will never be called, and there you are, stuck.
That can't be, right? So you're not willing to give up easily. And suddenly you get this one nasty thought, and you're making a test, and it works. And you don't know whether you should laugh or cry.
So now you try this modification, and, believe it or not, it just works.
    class MyHexWrapper(object):
        def __init__(self, hexstr):
            self.hexstr = hexstr

        def __len__(self):
            return len(self.hexstr)

        def __str__(self):  # called on its way to the DB
            return binascii.a2b_hex(self.hexstr)

        def __unicode__(self):  # called on its way to the admin
            return self.hexstr
It just works? Well, if you use such a field in code, like for a RESTful URL, then you'll have to make sure you have the right kind of string; that's a matter of discipline.
But then, it still only works most of the time. Because when you make such a field your primary key, then Django will call quote(getattr()) and while I found a source claiming that getattr() "nowdays" will use unicode() I can't confirm. But that's not a serious obstacle once you got this far, eh?
    class MyModel(models.Model):
        myhex = MyHexField(max_length=32, primary_key=True, editable=False)
        # other fields

        def __getattribute__(self, name):
            if name == 'myhex':
                return unicode(super(MyModel, self).__getattribute__(name))
            return super(MyModel, self).__getattribute__(name)
Works like a charm. However, now you lean back and look at your solution as a whole. And you can't help but notice that it diverges from the documentation you referred to, that it relies on undocumented or internal behaviour which you did not intend to use, and that it is error-prone and hard on the developer, because what you have to implement and obey is scattered across several places.
So how can the objective be achieved in a cleaner way? Is there another level with hooks and magic in Django where this mapping should be located?
Thank you for your time.

Django save image using upload_to instance keeps instance properties None

I am trying to give an uploaded image a nicer path, using this code (in models.py):
    def get_image_path_photos(instance, filename):
        return os.path.join('photos', str(instance.someproperty), filename)
and the model
    class Photo(models.Model):
        someproperty = models.CharField(max_length=17, blank=False, null=False, default="something")
        photo = models.ImageField(upload_to=get_image_path_photos, blank=True, null=True)
When I save this after a new insert, it saves the file in the path /photos/something/ (it keeps using the default value).
When I edit the object, add a photo and save it, the file is saved to the correct path.
So it must have something to do with the fact that, while saving the new object, it doesn't exist yet.
I tried the same with instance.id and this keeps being None as well (I read that using auto increment on the id solves this, but that sounds like using the default value as well, and the default pk/id already is auto increment).
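A minimal sketch of what the upload_to callable sees, using stand-in objects instead of real model instances (SimpleNamespace is an assumption made for illustration; it only mimics attribute access). The function is the one from the question: it builds the path from whatever the instance holds at the moment it is called, so a new, unsaved object with an auto-increment key still has id None:

```python
import os
from types import SimpleNamespace


def get_image_path_photos(instance, filename):
    # Same function as in the question: the path depends on the instance state.
    return os.path.join('photos', str(instance.someproperty), filename)


# Stand-ins for unsaved model instances (hypothetical, not Django models).
with_default = SimpleNamespace(id=None, someproperty="something")
with_value = SimpleNamespace(id=None, someproperty="bar")

print(get_image_path_photos(with_default, "avatar_big.png"))
print(get_image_path_photos(with_value, "avatar_big.png"))

# Building the path from the id of an unsaved object puts 'None' in it.
print(os.path.join('photos', str(with_value.id), "avatar_big.png"))
```

So if the path contains the default value, the instance handed to the function really did carry the default value at that point.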
I found some similar questions, but none with an answer that solves my problem.
I thought of using the pre_save signal... but somehow my gut says this isn't the right way.
I found the solution to my problem myself, please see my answer. A good lesson: don't give a slug name in the URL the same name as the field.
Sorry about this. The problem is a bit more complicated. I use the field someproperty also in the url as a slug on the posts....
I just found out something I didn't expect.
I did my POST (using Django REST framework) to the URL that has the default value in it, but I filled in the field with something else.
Then, because I defined the slug name the same as the field name, it overwrites anything you fill in in the field with the value from the URL.
This isn't exactly what I meant to happen, but it makes sense.
Probably the solution is to give the slug a name that differs from the field name.
I'll keep this question and answer anyway, because for me it was quite a puzzle (it might be of help to somebody).
As an addition to the answer of jpic:
I used the URLs in Django REST framework, let's say http:\someurl\api\photos\\
and posted the photo there.
Posting the photo avatar_big.png using someproperty=bar:
saved the photo in photos\something\ when using the url http:\someurl\api\photos\something
and saved the photo in photos\bar\ when using the url http:\someurl\api\photos\bar
The problem is (I guess, I still have to check this) that the slug name I use for the URL is the same as the field name.
This is the code I use in views.py (a class-based view I use in django-rest-framework):
    class PhotoBySomePropertyListOrCreateModelView(ListOrCreateModelView):
        permissions = (IsAuthenticated, )
        form = PhotoForm

        def get_queryset(self):
            someproperty = self.kwargs['someproperty']
            return Photo.objects.filter(someproperty=someproperty)
and in urls.py:
    url(r'^api/photos/(?P<someproperty>[\w:]+)/$', PhotoBySomePropertyListOrCreateModelView.as_view(resource=PhotoResource)),
Here you see the problem: it doesn't listen to the value filled in for the field 'someproperty', but to the value in the URL.
Changing it to
    url(r'^api/photos/(?P[\w:]+)/$', PhotoBySomePropertyListOrCreateModelView.as_view(resource=PhotoResource)),
should do the trick... also adjust the view, of course.
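The name clash can be sketched with the standard re module alone. The named group in the pattern becomes a keyword argument called someproperty. The merge step at the end is only an assumption about how the posted value could end up overwritten, in line with the guess above; it is not taken from the framework's source:

```python
import re

# URL pattern from urls.py; the named group is handed to the view as a kwarg.
pattern = re.compile(r'^api/photos/(?P<someproperty>[\w:]+)/$')

match = pattern.match('api/photos/something/')
url_kwargs = match.groupdict()
print(url_kwargs)  # {'someproperty': 'something'}

# Hypothetical illustration: if URL kwargs are merged over the posted data,
# a key with the same name as the field replaces the submitted value.
posted = {'someproperty': 'bar'}
merged = {**posted, **url_kwargs}
print(merged['someproperty'])  # something
```

With a group name that differs from the field name, the two keys no longer collide and the posted value survives such a merge.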