I have several models inheriting from a base model.
The fields in the base model are needed rarely, but Django keeps doing complex inner joins to retrieve those fields whenever I use any of the inherited models.
How can I tell Django to avoid this? I only need the fields in the base model rarely.
Note: maybe only(...) would work (I haven't checked), but I would need to add it in many places in the code.
Use abstract model inheritance.
In short, setting abstract = True in the base class's Meta makes Django use abstract inheritance, meaning each derived model's table will contain its own copy of all the fields defined in the base model, and no separate base table is created.
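To make the difference tangible, here is a minimal sketch of the table layouts the two inheritance styles roughly correspond to. It uses Python's built-in sqlite3 module rather than Django itself, and the table and column names (base, child_concrete, child_abstract, base_ptr_id, created_by) are hypothetical stand-ins, not what Django would necessarily generate for your models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Concrete (multi-table) inheritance: base fields live in their own table,
    -- and the child table points at the base row.
    CREATE TABLE base (id INTEGER PRIMARY KEY, created_by TEXT);
    CREATE TABLE child_concrete (
        base_ptr_id INTEGER PRIMARY KEY REFERENCES base(id),
        name TEXT
    );

    -- Abstract inheritance: the child table carries its own copy of the base fields.
    CREATE TABLE child_abstract (
        id INTEGER PRIMARY KEY,
        created_by TEXT,
        name TEXT
    );
""")
conn.execute("INSERT INTO base (id, created_by) VALUES (1, 'alice')")
conn.execute("INSERT INTO child_concrete (base_ptr_id, name) VALUES (1, 'widget')")
conn.execute(
    "INSERT INTO child_abstract (id, created_by, name) VALUES (1, 'alice', 'widget')"
)

# Loading a full child row from the concrete layout needs a join to the base table.
concrete_row = conn.execute(
    "SELECT c.name, b.created_by FROM child_concrete AS c "
    "JOIN base AS b ON b.id = c.base_ptr_id"
).fetchone()

# The abstract layout returns the same data from a single table, with no join.
abstract_row = conn.execute(
    "SELECT name, created_by FROM child_abstract"
).fetchone()
```

Both queries return the same data; the point of the sketch is only that the second one never touches a base table.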
By the way, Jacob Kaplan-Moss, one of Django's maintainers, has quite a strong opinion against concrete inheritance:
"model inheritance also offers a really excellent opportunity to shoot yourself in the foot: concrete (multi-table) inheritance"
and again:
"I’d strongly suggest that Django users approach any use of concrete inheritance with a large dose of skepticism."
Personally, I have never had to use model inheritance at all; however, after reading that blog entry, I am quite convinced that concrete inheritance should be avoided as much as possible.
I'd say the only ways to avoid this are either making your base class abstract or writing custom SQL queries that don't hit the base table.
While there is general consensus that multi-table inheritance isn't a very good idea in the long term (Jacobian, others), I am wondering whether in some use cases the "extra joins" created by Django during querying might be worth it.
My issue is having a Single Source of Truth in the database. Say, for Person Objects who are identified using an Identity Number and Identity Type. E.g. ID Number 222, Type Passport.

    from django.db import models

    class Person(models.Model):
        identity_number = models.CharField(max_length=20)
        identity_type = models.IntegerField()

    class Student(Person):
        student_number = models.CharField(max_length=20)

    class Employee(Person):
        employee_number = models.CharField(max_length=20)

With abstract inheritance, any subclass of Person (e.g. Student, Parent, Supervisor, Employee) inheriting from an abstract Person class will have identity_number and identity_type stored in its own table.
With multi-table inheritance, since they all share the same Person table, I can be sure that if I create a unique constraint on both columns in the Person model, no duplicates will exist in the database.
With abstract inheritance, keeping duplicates out of the database would require extra validation logic in the application, which also slightly degrades performance. Doesn't that cancel out the cost of the "extra join" that Django has to do with concrete inheritance?
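The uniqueness concern can be sketched at the database level with Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables person, student_abstract and employee_abstract and the PASSPORT code are hypothetical, and the schema is a simplification of what Django would create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared parent table, roughly what multi-table inheritance provides.
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        identity_number TEXT NOT NULL,
        identity_type INTEGER NOT NULL,
        UNIQUE (identity_number, identity_type)
    );
    -- Separate, unrelated tables, roughly what abstract inheritance provides.
    CREATE TABLE student_abstract (
        id INTEGER PRIMARY KEY,
        identity_number TEXT NOT NULL,
        identity_type INTEGER NOT NULL,
        UNIQUE (identity_number, identity_type)
    );
    CREATE TABLE employee_abstract (
        id INTEGER PRIMARY KEY,
        identity_number TEXT NOT NULL,
        identity_type INTEGER NOT NULL,
        UNIQUE (identity_number, identity_type)
    );
""")

PASSPORT = 1  # hypothetical integer code for the "Passport" identity type


def insert_identity(table, number, id_type):
    """Insert one identity row; return False if a unique constraint rejects it."""
    try:
        # The table name comes only from the fixed strings used below.
        conn.execute(
            f"INSERT INTO {table} (identity_number, identity_type) VALUES (?, ?)",
            (number, id_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Shared table: the second insert of the same identity is rejected.
shared_first = insert_identity("person", "222", PASSPORT)
shared_second = insert_identity("person", "222", PASSPORT)

# Separate tables: each constraint only sees its own table, so both succeed.
student_ok = insert_identity("student_abstract", "222", PASSPORT)
employee_ok = insert_identity("employee_abstract", "222", PASSPORT)
```

In the separate-table layout the same passport ends up stored twice without any database error, which is the gap that application-level validation would have to close.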
It's a mistake to think about your data modeling in object-oriented terms at all. It's an abstraction that fits poorly to relational databases, by hiding some very important details that can massively affect performance (as pointed out in the articles) or correctness (as you've pointed out above).
A traditional SQL approach to your example would offer two possibilities:
1. Having a Person table with the IDs and then Student, etc. with foreign keys back to it.
2. Having a single table for everything, with some additional fields to distinguish the different kinds of person.
Now, if your evaluation led you to prefer 1, you might notice that in Django this could be accomplished by using a concrete inheritance model (it's the same as what you describe above). In that case, by all means, use inheritance if you'd find the resulting access patterns in Django more elegant.
So I'm not saying you shouldn't use inheritance, I'm saying you should only look at it after you've modeled your data from the SQL perspective. If you did that in the example above, you would never even consider splitting everything into separate tables—which has all the problems you noted—as suggested by the abstract inheritance model.
Building on this question: Which is better: Foreign Keys or Model Inheritance?
I would like to know whether it is possible to replace a OneToOneField with multi-table inheritance (MTI).
That is, I have:
    class GroupUser(models.Model):
        group = models.OneToOneField(Group)
        ...other fields....
and I want:
    class GroupUser(Group):
        ...other fields....
I think that should be faster, or not?
Is it possible?
It won't be faster. With concrete (multi-table) inheritance, which is what this sounds like, the child table is still linked to the parent table through a one-to-one column that Django adds automatically, so in terms of efficiency it is the same as an explicit OneToOneField.
The choice also depends on the business logic. Inheritance is meant for situations where you have things of a similar type, so that you can define common fields and methods in the parent class and reduce repetitive code. From your example, it sounds like Group and GroupUser are two entirely different things that most likely don't share many attributes, so unless I misunderstand your intention, OneToOneField is the better candidate.
Is it acceptable to inherit from multiple QuerySet classes?
It's a simple question, but I didn't find much information on Google.
I'd like to inherit from django-model-utils's InheritanceQuerySet and my custom mixins (which subclass Django's models.QuerySet).
-- EDIT --
Suppose InheritanceQuerySet has a _clone() method.
Down the road, I may need to inherit from OtherQuerySet, which also has a _clone() method.
Each _clone() copies something specific to its class and calls super()._clone().
I was worried that the first _clone() in the MRO would hide the second one and affect the functionality.
(But I guess that since each _clone() calls super(), I don't need to worry about "hiding"; writing things out sometimes solves the problem.)
Then I was worried because "queryset multiple inheritance" doesn't yield many Google results, although I think it's a really good way to add functionality to a manager.
(I'm thinking of making a queryset that inherits from multiple queryset-related mixins, which have either object or models.QuerySet as their base.
Then I can use PassThroughManager or something similar (from_queryset from Django 1.7) to use the all-powerful queryset.)
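The reasoning about _clone() and the MRO can be checked with a small plain-Python sketch. The classes below are hypothetical stand-ins, not Django's QuerySet or django-model-utils's InheritanceQuerySet; they only mimic the pattern of each _clone() copying its own state and then delegating to super().

```python
class BaseQuerySet:
    """Stand-in for a common QuerySet base; not Django's real implementation."""

    def __init__(self):
        self.filters = []

    def _clone(self):
        # Create a new instance of the most-derived class and copy base state.
        clone = self.__class__()
        clone.filters = list(self.filters)
        return clone


class InheritanceMixin(BaseQuerySet):
    def __init__(self):
        super().__init__()
        self.subclasses = []

    def _clone(self):
        clone = super()._clone()  # next class in the MRO, not necessarily the base
        clone.subclasses = list(self.subclasses)
        return clone


class OtherMixin(BaseQuerySet):
    def __init__(self):
        super().__init__()
        self.extra = {}

    def _clone(self):
        clone = super()._clone()
        clone.extra = dict(self.extra)
        return clone


class CombinedQuerySet(InheritanceMixin, OtherMixin):
    pass


qs = CombinedQuerySet()
qs.filters.append("active")
qs.subclasses.append("child")
qs.extra["hint"] = "x"

cloned = qs._clone()
mro_names = [cls.__name__ for cls in CombinedQuerySet.__mro__]
```

Because every _clone() calls super(), the call travels along the whole MRO (CombinedQuerySet, InheritanceMixin, OtherMixin, BaseQuerySet), so neither mixin's copy step is skipped. The chain only breaks if one of the classes fails to call super().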
I've got a django app, where I'd like to define a relationship between two classes at a base level. It also makes sense to me to define the relationship between the children of those base classes - so that I get something like this:

    class BaseSummary(models.Model):
        base_types...

    class BaseDetail(models.Model):
        base_detail_types...
        base_summary = models.ForeignKey('BaseSummary')

    class ChildSummary(BaseSummary):
        child_summary_types...

    class ChildDetail(BaseDetail):
        child_detail_type...
        child_summary = models.ForeignKey('ChildSummary')

Does Django support this? And if it is supported, is something like this going to cause scalability problems?
Thanks!
Yes, this is supported. Yes, it can cause performance problems. You should read Jacob's post on model inheritance: http://jacobian.org/writing/concrete-inheritance/
"Since 1.0, Django’s supported model inheritance. It’s a neat feature, and can go a long way towards increasing flexibility in your modeling options. However, model inheritance also offers a really excellent opportunity to shoot yourself in the foot: concrete (multi-table) inheritance. If you’re using concrete inheritance, Django creates implicit joins back to the parent table on nearly every query. This can completely devastate your database’s performance."
It is supported, and won't cause scalability problems. My advice, however, is that you only refer to the Child classes (i.e. don't create references to the Base classes, and don't instantiate them).
Base Model Classes should be extend-only (sort of like an Abstract Class in other languages).
Let's say I have an abstract base class that looks like this:

    class StellarObject(BaseModel):
        title = models.CharField(max_length=255)
        description = models.TextField()
        slug = models.SlugField(blank=True, null=True)

        class Meta:
            abstract = True

Now, let's say I have two actual database classes that inherit from StellarObject:

    class Planet(StellarObject):
        type = models.CharField(max_length=50)
        size = models.IntegerField()  # max_length is ignored by IntegerField, so it is dropped

    class Star(StellarObject):
        mass = models.IntegerField()  # max_length is ignored by IntegerField, so it is dropped

So far, so good. If I want to get Planets or Stars, all I do is this:
    Thing.objects.all()     # or
    Thing.objects.filter()  # or count(), etc.
But what if I want to get ALL StellarObjects? If I do:
    StellarObject.objects.all()
It of course returns an error, because an abstract class isn't an actual database object, and therefore cannot be queried. Everything I've read says I need to do two queries, one each on Planets and Stars, and then merge them. That seems horribly inefficient. Is that the only way?
At its root, this is part of the mismatch between objects and relational databases. The ORM does a great job in abstracting out the differences, but sometimes you just come up against them anyway.
Basically, you have to choose between abstract inheritance, in which case there is no database relationship between the two classes, or multi-table inheritance, which keeps the database relationship at a cost of efficiency (an extra database join) for each query.
You can't query abstract base classes. For multi-table inheritance you can use django-model-utils and its InheritanceManager, which extends the standard QuerySet with a select_subclasses() method that does exactly what you need: it left-joins all inherited tables and returns an instance of the appropriate type for each row.
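The left-join idea can be illustrated without Django, using Python's built-in sqlite3 module. This is a rough sketch of the general technique, not the SQL that django-model-utils actually emits; the table names, the stellarobject_ptr_id column, the sample values and the classify() helper are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A concrete parent table plus one table per child type.
    CREATE TABLE stellarobject (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE planet (
        stellarobject_ptr_id INTEGER PRIMARY KEY REFERENCES stellarobject(id),
        size INTEGER
    );
    CREATE TABLE star (
        stellarobject_ptr_id INTEGER PRIMARY KEY REFERENCES stellarobject(id),
        mass INTEGER
    );
    INSERT INTO stellarobject (id, title) VALUES (1, 'Earth'), (2, 'Sun');
    INSERT INTO planet (stellarobject_ptr_id, size) VALUES (1, 10);
    INSERT INTO star (stellarobject_ptr_id, mass) VALUES (2, 20);
""")

# One query over the parent table, left-joining every child table.
rows = conn.execute("""
    SELECT s.id, s.title, p.stellarobject_ptr_id, st.stellarobject_ptr_id
    FROM stellarobject AS s
    LEFT JOIN planet AS p ON p.stellarobject_ptr_id = s.id
    LEFT JOIN star AS st ON st.stellarobject_ptr_id = s.id
    ORDER BY s.id
""").fetchall()


def classify(row):
    """Pick the child type whose joined key is present (hypothetical helper)."""
    _id, title, planet_id, star_id = row
    if planet_id is not None:
        return ("Planet", title)
    if star_id is not None:
        return ("Star", title)
    return ("StellarObject", title)


typed = [classify(row) for row in rows]
```

A single round trip returns every object along with enough information to decide which child class each row belongs to, at the cost of one extra join per child table.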
Don't use an abstract base class if you need to query on the base. Use a concrete base class instead.
This is an example of polymorphism in your models (polymorph - many forms of one).
Option 1 - If there's only one place you deal with this:
For the sake of a little bit of if-else code in one or two places, just deal with it manually - it'll probably be much quicker and clearer in terms of dev/maintenance (i.e. maybe worth it unless these queries are seriously hammering your database - that's your judgement call and depends on circumstance).
Option 2 - If you do this quite a bit, or really demand elegance in your query syntax:
Luckily there's a library to deal with polymorphism in django, django-polymorphic - those docs will show you how to do this precisely. This is probably the "right answer" for querying straightforwardly as you've described, especially if you want to do model inheritance in lots of places.
Option 3 - If you want a halfway house:
This kind of has the drawbacks of both of the above, but I've used it successfully in the past to automatically do all the zipping together from multiple query sets, whilst keeping the benefits of having one query set object containing both types of models.
Check out django-querysetsequence which manages the merge of multiple query sets together.
It's not as well supported or as stable as django-polymorphic, but worth a mention nevertheless.
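For a sense of what "zipping together" two result sets involves, here is a plain-Python sketch that merges two already-sorted sequences of different types. It does not use django-querysetsequence or its API; the dataclasses and lists below are hypothetical stand-ins for models and for ordered querysets such as Planet.objects.order_by("title").

```python
from dataclasses import dataclass
from heapq import merge
from operator import attrgetter


@dataclass
class Planet:
    title: str


@dataclass
class Star:
    title: str


# Stand-ins for two separately executed, title-ordered queries.
planets = [Planet("Earth"), Planet("Venus")]
stars = [Star("Sirius"), Star("Sun")]

# heapq.merge lazily interleaves inputs that are each already sorted by the key.
combined = list(merge(planets, stars, key=attrgetter("title")))
titles = [obj.title for obj in combined]
kinds = [type(obj).__name__ for obj in combined]
```

The merge still costs one query per model, but the combined sequence keeps each object's real type and a single consistent ordering.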
In this case I think there's no other way.
For optimization, you could avoid inheriting from the abstract StellarObject and instead make it a separate table connected via a foreign key to the Star and Planet objects.
That way both of them would have, e.g., star.stellar_info.description.
Another way would be to add an additional model for handling the information and use StellarObject as the through model in a many-to-many relation.
I would consider moving away from either an abstract inheritance pattern or the concrete base pattern if you're looking to tie distinct sub-class behaviors to the objects based on their respective child class.
When you query via the parent class (which it sounds like you want to do), Django treats the resulting objects as objects of the parent class, so accessing child-class-level methods requires re-casting the objects into their "proper" child class on the fly so they can see those methods. At that point, a series of if statements hanging off a parent-class-level method would arguably be a cleaner approach.
If the sub-class behavior described above isn't an issue, you could consider a custom manager attached to an abstract base class sewing the models together via raw SQL.
If you're mainly interested in assigning a discrete set of identical data fields to a bunch of objects, I'd relate them through a foreign key, like bx2 suggests.
That seems horribly inefficient. Is that the only way?
As far as I know, it is the only way with Django's ORM. As currently implemented, abstract classes are a convenient mechanism for factoring common attributes of classes out into superclasses. The ORM does not provide a similar abstraction for querying.
You'd be better off using another mechanism for implementing hierarchy in the database. One way to do this would be to use a single table and "tag" rows using type. Or you can implement a generic foreign key to another model that holds properties (the latter doesn't sound right even to me).
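The single-table, type-tagged layout might look roughly like this. The sketch uses Python's built-in sqlite3 module, and the table name stellar_object, the kind column and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table for every kind of object; the kind column tags each row.
    CREATE TABLE stellar_object (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('planet', 'star')),
        title TEXT NOT NULL,
        size INTEGER,  -- only meaningful for planets
        mass INTEGER   -- only meaningful for stars
    );
    INSERT INTO stellar_object (kind, title, size, mass) VALUES
        ('planet', 'Earth', 10, NULL),
        ('star', 'Sun', NULL, 20);
""")

# Querying everything is a plain single-table select, with no join or merge.
all_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM stellar_object ORDER BY id")
]

# Querying one kind is just a filter on the tag column.
planet_titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM stellar_object WHERE kind = 'planet' ORDER BY id"
    )
]
```

The trade-off is that type-specific columns must be nullable, so the database can no longer guarantee that every planet has a size or every star a mass.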