How to query abstract-class-based objects in Django? - django

Let's say I have an abstract base class that looks like this:
class StellarObject(BaseModel):
    title = models.CharField(max_length=255)
    description = models.TextField()
    slug = models.SlugField(blank=True, null=True)

    class Meta:
        abstract = True
Now, let's say I have two actual database classes that inherit from StellarObject
class Planet(StellarObject):
    type = models.CharField(max_length=50)
    size = models.IntegerField()  # max_length has no effect on IntegerField

class Star(StellarObject):
    mass = models.IntegerField()
So far, so good. If I want to get Planets or Stars, all I do is this:
Thing.objects.all() #or
Thing.objects.filter() #or count(), etc...
But what if I want to get ALL StellarObjects? If I do:
StellarObject.objects.all()
It of course returns an error, because an abstract class isn't an actual database object, and therefore cannot be queried. Everything I've read says I need to do two queries, one each on Planets and Stars, and then merge them. That seems horribly inefficient. Is that the only way?

At its root, this is part of the mismatch between objects and relational databases. The ORM does a great job in abstracting out the differences, but sometimes you just come up against them anyway.
Basically, you have to choose between abstract inheritance, in which case there is no database relationship between the two classes, or multi-table inheritance, which keeps the database relationship at a cost of efficiency (an extra database join) for each query.
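To make the trade-off concrete, here is a minimal sketch of the two table layouts using Python's built-in sqlite3 module rather than the Django ORM. The table and column names (including mt_planet and stellarobject_ptr_id) are assumptions chosen for illustration and only approximate what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Abstract inheritance: each concrete model gets its own complete table.
    CREATE TABLE planet (id INTEGER PRIMARY KEY, title TEXT, type TEXT);
    CREATE TABLE star (id INTEGER PRIMARY KEY, title TEXT, mass INTEGER);
    INSERT INTO planet (title, type) VALUES ('Mars', 'rocky');
    INSERT INTO star (title, mass) VALUES ('Sirius', 2);

    -- Multi-table inheritance: shared columns live in a parent table.
    CREATE TABLE stellarobject (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE mt_planet (
        stellarobject_ptr_id INTEGER PRIMARY KEY REFERENCES stellarobject(id),
        type TEXT
    );
    INSERT INTO stellarobject (id, title) VALUES (1, 'Venus');
    INSERT INTO mt_planet VALUES (1, 'rocky');
""")

# Abstract layout: "all stellar objects" needs one SELECT per table, combined.
abstract_titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM planet UNION ALL SELECT title FROM star ORDER BY title"
    )
]

# Multi-table layout: the parent table can be queried directly...
parent_titles = [row[0] for row in conn.execute("SELECT title FROM stellarobject")]

# ...but reading a planet together with its shared columns costs a join.
planet_rows = conn.execute(
    "SELECT s.title, p.type FROM mt_planet p "
    "JOIN stellarobject s ON s.id = p.stellarobject_ptr_id"
).fetchall()
```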

You can't query abstract base classes. For multi-table inheritance you can use django-model-utils and its InheritanceManager, which extends the standard QuerySet with a select_subclasses() method that does exactly what you need: it left-joins all inherited tables and returns an instance of the appropriate type for each row.
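The left-join idea can be illustrated without the library. The sketch below uses sqlite3 and hand-written SQL; it is not django-model-utils code, and the schema and the classify helper are hypothetical. It only shows how one query over the parent table, left-joined to every child table, lets each row be mapped to its most specific type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stellarobject (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE planet (stellarobject_ptr_id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE star (stellarobject_ptr_id INTEGER PRIMARY KEY, mass INTEGER);
    INSERT INTO stellarobject VALUES (1, 'Mars'), (2, 'Sirius');
    INSERT INTO planet VALUES (1, 'rocky');
    INSERT INTO star VALUES (2, 2);
""")

# One query: the parent table left-joined to every child table.
rows = conn.execute("""
    SELECT s.title,
           p.stellarobject_ptr_id, p.type,
           t.stellarobject_ptr_id, t.mass
    FROM stellarobject s
    LEFT JOIN planet p ON p.stellarobject_ptr_id = s.id
    LEFT JOIN star t ON t.stellarobject_ptr_id = s.id
    ORDER BY s.id
""").fetchall()


def classify(row):
    """Pick the child type whose joined columns are non-NULL (hypothetical helper)."""
    title, planet_id, planet_type, star_id, mass = row
    if planet_id is not None:
        return ("Planet", title, planet_type)
    if star_id is not None:
        return ("Star", title, mass)
    return ("StellarObject", title, None)


objects = [classify(row) for row in rows]
```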

Don't use an abstract base class if you need to query on the base. Use a concrete base class instead.

This is an example of polymorphism in your models (polymorph - many forms of one).
Option 1 - If there's only one place you deal with this:
For the sake of a little bit of if-else code in one or two places, just deal with it manually - it'll probably be much quicker and clearer in terms of dev/maintenance (i.e. maybe worth it unless these queries are seriously hammering your database - that's your judgement call and depends on circumstance).
Option 2 - If you do this quite a bit, or really demand elegance in your query syntax:
Luckily there's a library to deal with polymorphism in django, django-polymorphic - those docs will show you how to do this precisely. This is probably the "right answer" for querying straightforwardly as you've described, especially if you want to do model inheritance in lots of places.
Option 3 - If you want a halfway house:
This kind of has the drawbacks of both of the above, but I've used it successfully in the past to automatically do all the zipping together from multiple query sets, whilst keeping the benefits of having one query set object containing both types of models.
Check out django-querysetsequence which manages the merge of multiple query sets together.
It's not as well supported or as stable as django-polymorphic, but worth a mention nevertheless.
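For a sense of what "zipping together" several result sets involves, here is a small standard-library sketch. The dataclasses are stand-ins for the Planet and Star models, and the two lists stand in for querysets already ordered by title; this is not how django-querysetsequence is implemented, only an illustration of the merge step.

```python
from dataclasses import dataclass
from heapq import merge
from operator import attrgetter


@dataclass
class Planet:  # stand-in for the Django model
    title: str


@dataclass
class Star:  # stand-in for the Django model
    title: str


# Stand-ins for two querysets that are each already ordered by title.
planets = [Planet("Earth"), Planet("Mars")]
stars = [Star("Deneb"), Star("Sirius")]

# heapq.merge lazily interleaves already-sorted iterables into one sorted stream.
combined = list(merge(planets, stars, key=attrgetter("title")))
titles = [obj.title for obj in combined]
```

Because querysets are iterable, the same merge call should accept them in place of the lists, at the cost of one database query per model.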

In this case I think there's no other way.
As an optimization, you could drop the inheritance from the abstract StellarObject and make it a separate table, connected to Star and Planet via a foreign key.
That way both of them would expose the shared fields through the relation, e.g. star.stellar_info.description.
Another way would be to add an additional model for handling the information and use StellarObject as the through model in a many-to-many relation.

I would consider moving away from either an abstract inheritance pattern or the concrete base pattern if you're looking to tie distinct sub-class behaviors to the objects based on their respective child class.
When you query via the parent class -- which it sounds like you want to do -- Django treats the resulting objects as objects of the parent class, so accessing child-class-level methods requires re-casting the objects into their 'proper' child class on the fly so they can see those methods... at which point a series of if statements hanging off a parent-class-level method would arguably be a cleaner approach.
If the sub-class behavior described above isn't an issue, you could consider a custom manager attached to an abstract base class sewing the models together via raw SQL.
If you're interested mainly in assigning a discrete set of identical data fields to a bunch of objects, I'd relate along a foreign-key, like bx2 suggests.

That seems horribly inefficient. Is that the only way?
As far as I know, it is the only way with Django's ORM. As currently implemented, abstract classes are a convenient mechanism for factoring common attributes of classes out into superclasses. The ORM does not provide a similar abstraction for querying.
You'd be better off using another mechanism for implementing hierarchy in the database. One way to do this would be to use a single table and "tag" rows using type. Or you can implement a generic foreign key to another model that holds properties (the latter doesn't sound right even to me).
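The single-table approach with a type tag might look like the following sqlite3 sketch. The column names (kind, planet_type, mass) are assumptions for illustration; in Django this would correspond to one concrete model with a type field and nullable type-specific fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stellarobject (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('planet', 'star')),
        title TEXT NOT NULL,
        planet_type TEXT,  -- only meaningful for planets
        mass INTEGER       -- only meaningful for stars
    );
    INSERT INTO stellarobject (kind, title, planet_type)
        VALUES ('planet', 'Mars', 'rocky');
    INSERT INTO stellarobject (kind, title, mass)
        VALUES ('star', 'Sirius', 2);
""")

# Everything in one query, no join and no merge step.
all_titles = [
    row[0] for row in conn.execute("SELECT title FROM stellarobject ORDER BY id")
]

# Restricting to one kind is a plain filter on the tag column.
star_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM stellarobject WHERE kind = 'star'")
]
```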

Related

Django's Multi-Table vs. Abstract Inheritance

While there is general consensus that multi-table inheritance isn't a very good idea in the long term (Jacobian, Others), I am wondering if in some use cases the "extra joins" created by Django during querying might be worth it.
My issue is having a Single Source of Truth in the database. Say, for Person Objects who are identified using an Identity Number and Identity Type. E.g. ID Number 222, Type Passport.
class Person(models.Model):
    identity_number = models.CharField(max_length=20)
    identity_type = models.IntegerField()

class Student(Person):
    student_number = models.CharField(max_length=20)

class Employee(Person):
    employee_number = models.CharField(max_length=20)
In abstract inheritance, any subclass model of Person (e.g. Student, Parent, Supervisor, Employee) inheriting from an abstract Person class will have identity_number and identity_type stored in its own table.
In multi-table inheritance, since they all share the same Person table, I can be sure that if I create a unique constraint on both columns in the Person model, no duplicates will exist in the database.
With abstract inheritance, keeping duplicates out of the database would require extra validation logic in the application, which also slightly degrades performance. Doesn't that cancel out the cost of the "extra join" that Django has to do with concrete inheritance?
It's a mistake to think about your data modeling in object-oriented terms at all. It's an abstraction that fits poorly to relational databases, by hiding some very important details that can massively affect performance (as pointed out in the articles) or correctness (as you've pointed out above).
A traditional SQL approach to your example would offer two possibilities:
1. Having a Person table with the IDs and then Student, etc. with foreign keys back to it.
2. Having a single table for everything, with some additional fields to distinguish the different kinds of person.
Now, if your evaluation led you to prefer 1, you might notice that in Django this could be accomplished by using a concrete inheritance model (it's the same as what you describe above). In that case, by all means, use inheritance if you'd find the resulting access patterns in Django more elegant.
So I'm not saying you shouldn't use inheritance, I'm saying you should only look at it after you've modeled your data from the SQL perspective. If you did that in the example above, you would never even consider splitting everything into separate tables—which has all the problems you noted—as suggested by the abstract inheritance model.
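The first SQL option can be sketched with sqlite3 to show why a shared Person table gives a single source of truth. The schema below is an assumption that only approximates what concrete inheritance would produce, and the identity type code 1 is a hypothetical stand-in for "Passport".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        identity_number TEXT NOT NULL,
        identity_type INTEGER NOT NULL,
        UNIQUE (identity_number, identity_type)
    );
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        student_number TEXT NOT NULL
    );
    CREATE TABLE employee (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        employee_number TEXT NOT NULL
    );
    INSERT INTO person (id, identity_number, identity_type) VALUES (1, '222', 1);
    INSERT INTO student VALUES (1, 'S-001');
""")

# A second person row with the same identity is rejected by the database itself,
# so no application-level validation is needed.
try:
    conn.execute(
        "INSERT INTO person (identity_number, identity_type) VALUES ('222', 1)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The same person can still also become an employee without duplicating the identity.
conn.execute("INSERT INTO employee VALUES (1, 'E-001')")
person_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```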

Convert OneToOneField to MultipleTableInheritance

Building on this question: Which is better: Foreign Keys or Model Inheritance?
I would like to know whether it is possible to replace a OneToOneField with MTI (multi-table inheritance).
That is, I have:
class GroupUser(models.Model):
    group = models.OneToOneField(Group)
    ...other fields....
and I want:
class GroupUser(Group):
    ...other fields....
I think that should be faster, or not?
Is it possible?
It won't be faster. With concrete (multi-table) inheritance, which is what your second snippet would use, Django still creates an implicit OneToOneField on the child table (group_ptr here) that links each GroupUser row to its Group row, so technically the efficiency is the same as with an explicit OneToOneField.
The choice also depends on the business logic. Inheritance is meant for situations where you have things of a similar type, so that you can define common fields and methods in the parent class and reduce repetitive code. From your example it sounds like Group and GroupUser are two entirely different things that most likely don't share many attributes, so unless I misunderstand your intention, OneToOneField is the better candidate.

Model polymorphism and model-view separation

I'm encountering a dilemma of sorts while making my Django application, but I think the problem I'm encountering may apply to the MVC pattern generally. I'm making a Question model which can be used to construct quizzes or questionnaires. The Question base class would be a simple free response question. I'd like to support different types of questions such as multiple choice questions or sliding scale questions, and these would be subclasses of the Question base class with extra fields added such as an array of possible choices. I'd like to be able to extend my question models to support more types of questions in the future, and for that I can rely on polymorphism and pass objects of the Question type between the model layer and the view layer for all subclasses of Question.
The problem I'm encountering is that the view has to know the type of question it has received in order to render it. If it gets a multiple choice question it needs to draw the radio choice widgets, etc. So now if I extend my models with more types of questions I have to add it to both the model and the view layers. This seems to defeat the point of polymorphism since the views receiving the Question objects would always have to know the subclass type of the questions received. I can get around this problem by delegating the responsibility of rendering the question back to the model. If the Question model has a virtual function called render_question() that its subclasses override then the view layer can call that function to get the right HTML to output without worrying about the type of question. But now I have the problem of having the HTML rendering code bound up with the model.
Could there be a third solution that does not have either of the downsides of the solutions I've thought of? Or is this truly a dilemma about which one has to make a difficult decision?
The separation between model/view is intended to decouple presentation from data. Your initial description of a polymorphic model hierarchy for Question is indeed a valid approach.
What you really want to do here is consider using Django's model inheritance to handle data hierarchies i.e. have:
BaseQuestion <- FreeQuestion,
                MultipleChoiceQuestion,
                SlidingScaleQuestion etc.
Then you can build a BaseQuestionView that knows how to display a BaseQuestion (for instance render the Question string, style it and what not) and using the same principle construct:
BaseQuestionView <- FreeQuestionView,
                    MultipleChoiceQuestionView,
                    SlidingScaleQuestionView
You can make the BaseQuestionView abstract, so that it pulls all BaseQuestion model instances from the DB and invokes an abstract render_question method that is implemented in each of the FreeQuestionView, MultipleChoiceQuestionView and SlidingScaleQuestionView subclasses. Thus FreeQuestionView knows it's working with a FreeQuestion model and only implements how to render the widget for the answer (a text field). MultipleChoiceQuestionView would only implement how to render radio buttons, etc.
In other words it is almost exactly what you presented in your first case, except the rendering implementation lies in the View classes not Model classes.
Same principle can be applied when you want to render the same class of objects in different ways.
With multi-table model inheritance you can access a subclass from the base instance with dot notation, e.g. question.freequestion. This returns the FreeQuestion instance associated with the base instance, or raises FreeQuestion.DoesNotExist if the question is not of that class.
Using class-based views you can add mixins that render your question differently depending on whether it is a FreeQuestion, a MultipleChoiceQuestion, etc., relying on Python's MRO, or you can subclass them.
As far as I know there is no automatic way for Django to make the correlation between inherited models and inherited views, you have to make the mapping yourself.
Perhaps the easiest approach is to explicitly request, for your initial Question QuerySet, the related FreeQuestions, MultipleChoiceQuestions, etc. individually and cast them into a list before feeding it to your main renderer, which then looks up map[question.__class__] to find the renderer method from the mixin. Alternatively, you can keep the question type in the base class to avoid having to deal with class mappings and let the DB help you in this regard.
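The class-to-renderer mapping mentioned above can be sketched in plain Python. The classes here are simple stand-ins for the Django models, and the renderer functions and their output strings are hypothetical; in a real project they would live in the view layer and return template-rendered HTML.

```python
class Question:  # stand-in for the base model
    def __init__(self, text):
        self.text = text


class FreeQuestion(Question):
    pass


class MultipleChoiceQuestion(Question):
    def __init__(self, text, choices):
        super().__init__(text)
        self.choices = choices


# View-layer renderers: the models know nothing about presentation.
def render_free(question):
    return f"{question.text} [text box]"


def render_multiple_choice(question):
    options = ", ".join(f"( ) {choice}" for choice in question.choices)
    return f"{question.text} {options}"


# The explicit mapping from model class to renderer.
RENDERERS = {
    FreeQuestion: render_free,
    MultipleChoiceQuestion: render_multiple_choice,
}


def render(question):
    """Look up the renderer registered for the question's concrete class."""
    return RENDERERS[type(question)](question)


questions = [
    FreeQuestion("Name?"),
    MultipleChoiceQuestion("Colour?", ["red", "blue"]),
]
rendered = [render(q) for q in questions]
```

Adding a new question type then means adding one model class and one entry in the mapping, which keeps the coupling between the two layers in a single, visible place.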
However, normally you don't want your model behavior to dynamically change for the same class (which is what you are effectively doing with BaseQuestion). This is especially true when designing with REST in mind, as you want explicit URLs to map to explicit concrete not abstract types.

Django Model Inheritance and Relationships

I've got a django app, where I'd like to define a relationship between two classes at a base level. It also makes sense to me to define the relationship between the children of those base classes - so that I get something like this:
class BaseSummary(models.Model):
    base_types...

class BaseDetail(models.Model):
    base_detail_types...
    base_summary = models.ForeignKey('BaseSummary')

class ChildSummary(BaseSummary):
    child_summary_types...

class ChildDetail(BaseDetail):
    child_detail_type...
    child_summary = models.ForeignKey('ChildSummary')
Does Django support this? And if it is supported, is something like this going to cause scalability problems?
Thanks!
Yes, this is supported. Yes, it can cause performance problems. You should read Jacob's post on model inheritance: http://jacobian.org/writing/concrete-inheritance/
Since 1.0, Django’s supported model inheritance. It’s a neat feature, and can go a long way towards increasing flexibility in your modeling options. However, model inheritance also offers a really excellent opportunity to shoot yourself in the foot: concrete (multi-table) inheritance. If you’re using concrete inheritance, Django creates implicit joins back to the parent table on nearly every query. This can completely devastate your database’s performance.
It is supported, and won't cause scalability problems. My advice, however, is that you only refer to the Child classes (i.e. don't create references to the Base classes, and don't instantiate them).
Base Model Classes should be extend-only (sort of like an Abstract Class in other languages).

Django: How to reduce inner joins on Model inheritance?

I have several models inheriting from a base model.
The fields in the base model are needed rarely, but Django keeps doing complex inner joins to retrieve those fields whenever I use any of the inherited models.
How can I tell Django to avoid this ? I only need the fields in this model rarely.
Note: maybe only(..) would work (I didn't check), but I would need to add it in many places in the code.
Use abstract model inheritance.
In short, setting abstract = True in the base class's Meta makes Django use abstract inheritance, meaning each derived model will contain its own copy of all the fields defined in the base model.
By the way, one of Django's maintainers, Jacob Kaplan-Moss, has quite a strong opinion against concrete inheritance:
model inheritance also offers a really excellent opportunity to shoot yourself in the foot: concrete (multi-table) inheritance
and again:
I’d strongly suggest that Django users approach any use of concrete inheritance with a large dose of skepticism.
Personally, I have never had to use model inheritance at all; however, after reading that blog entry, I am quite convinced that concrete inheritance should be avoided as much as possible.
I'd say the only possibility to avoid this is either to make your base class abstract or to write custom SQL queries that don't hit the base table.