Django ORM join with another table - django

I have a table known as messages. In my application, users can send different type of messages. Like forwarding an event, etc. As such, I have columns type and value in that table.
What I want to do is for a particular type, goto a particular table and make sure the value is valid (typically this maps to the id of that table). There could be multiple types, and each one has to be mapped to a different table. Is there a way to logically write this in the built in django ORM? Right now I'm only seeing this feasible if I use straight SQL, which I would rather not if I can get away with it...
Right now I'm doing something like:
Messages.objects.all().filter(Q(user_id=id))...etc
I want to add to the statement above checking the type and for the particular type, check the table associated with it.

It sounds like you have a "polymorphic association". There are a couple ways to do analogous things in Django, but I think the one that most closely matches what you described is the contenttypes module, which uses separate columns for the type and for the value as in your application.

You may just want to define a multi-table inheritance structure. This gets you the same result in that you have multiple types of messages and field inheritance, but you don't have to mess with a type field at all.
class Message(models.Model):
value = models.CharField(max_length=9000)
class Event(Message):
pass
class Tweet(Message):
pass
In view post handler:
...
if request.POST['type'] == 'event':
Event.objects.create(value=request.POST['value'])
elif request.POST['type'] == 'tweet':
Tweet.objects.create(value=request.POST['value'])
...
Then, get your objects:
all_tweets = Tweet.objects.all()
all_messages = Message.objects.all()
for message in all_messages:
try:
event = message.event
except Event.DoesNotExist:
# this message is not an Event
pass
If this sort of structure doesn't work for you, there are two other styles of Model inheritance supported by Django: Abstract base classes and Proxy models. You could also hold a GenericForeignKey as suggested by #AndrewGorcester.

Related

What is the most Django-appropriate way to combine multiple database columns into one model field?

I have several times come across a want to have a Django model field that comprises multiple database columns, and am wondering what the most Django way to do it would be.
Three use cases come specifically to mind.
I want to provide a field that wraps another field, keeping record of whether the wrapped field has been set or not. A use case for this particular field would be for dynamic configuration. A new configuration value is introduced, and a view marks itself as dependent upon a configuration value, redirecting if the value isn't set. Storing whether it's been set yet or not allows for easy indefinite caching of the state. This also lets the configuration value itself be not-nullable, and the application can ignore any value it might have when unset.
I want to provide a money field that combines a decimal (or integer) value, and a currency.
I want to provide a file field with a link to some manner of access rule to determine whether the request should include it/a request for it should succeed.
For each of the use cases, there exists a workaround, that in each case seems less elegant.
Define the configuration fields as nullable. This is undesirable for a few reasons: it removes the validity of NULL as a value for the configuration itself, so tristates and other use valid cases for NULL have to become a pair of fields or a different data type, or an edge case; null=True on the fields allows them to be set back to None in modelforms and the admin without writing a custom FormField for them every time; and every nullable column in a database is arguably bad design.
Define the field as a subclass of DecimalField with an argument accepting a string, and use that to contribute another field to the model. (This is what django-money does). Again, this is undesirable: fields are appearing "as if by magic" on the model; and configuring the currency field becomes not obvious.
Define the combined file+rule field instead as an entire model, and one-to-one to it from the model where you want to have the field. This is a solution to all use cases, but again comes with downsides: there's an extra JOIN required for every instance of the field - one can imagine a User with profile_picture, cv, passport, private_key etc.; there's an implicit requirement to .select_related(*fields) on every query that would ever want to access the fields; and the layout of the related model is going to have cold data interleaved with hot data all over the place given that it's reused everywhere.
In addition to solution 3., there's also the option to define a mixin factory that produces the multiple fields with matching names and whatever desired properties and methods. Again this isn't perfect because the user ends up with fields being defined in the model body, but also above that in the inheritance list.
I think the main reason this keeps sending me in circles is because custom Django model fields are always defined in terms of a single base field, because it's done by inheritance.
What is the accepted way to achieve this end?

django save instance between parent and child model class

I came across this problem on form save the data needs to be persisted somewhere then go through a payment process then on success retrieve the data and save to the proper model.
I have seen this done using session, but with some hacky way to persist file uploads when commit=False and it doesn't seem very pythonic
I am thinking if I have a model class A, and have a child class extending A, such as A_Temp
class A(models.Model):
name = models.CharField(max_lenght=25)
image = models.ImageField()
class A_Temp(A):
pass
class AForm(forms.ModelForm):
class Meta:
model = A_Temp
On model form (A_Temp) save, it stores to A_Temp, and when payment successful, it move the instance to the parent model class A.
Here are the questions:
Has anyone done this before?
How to properly move an instance of a child model class to the parent model class?
Edit:
There are other different ways to do it, such as adding extra fields to the table, yes I would've done that if I am using PHP without a ORM framework, but since the ORM is pretty decent in django, I thought that I might trial something different.
Since I am asking here, means I am not convinced myself about this approach as well. What are your thoughts?
As suggested in the question comments, adding an extra field to your model containing payment state may be the easiest approach. Conceptually it's the same object, it's just that the state changes once payment has been made. As you've indicated, you will need logic to purge out items from your database which never proceed through the required states such as payment. This may involve adding both a payment_state and state_change_time field to your model which indicates when the state last changed. If the state is PAYMENT_PENDING for for too long, that record could be purged.
If you take the approach that unpaid items are stored in a different table as you've suggested, you still have to manage that table to determine when it's safe to delete items. For example, if a payment is never processed, when will you delete record from the A_temp table? Also, having a separate table means that you really only have two states possible, paid and unpaid as determine by the table in which the record occurs. Having a single table with a payment_state may be more flexible in that it allows you to extend the state as required. eg. Let's say you decide you need the payment states ITEM_SUBMITTED, AWAITING_PAYMENT, PAYMENT_ACCEPTED, PAYMENT_REJECTED. This could all be implemented with a single state field. If this was implemented as you've described, you'd need a separate table for each state.
Having said all that, if you're still set on having a separate table structure, you can create a function which will copy the values from an instance of A_temp to A. Something like the following may work, but any relationship type fields such as ForeignKey are likely to require special attention.
def copy_A_temp_to_A(a, a_temp):
for field_name in a._meta.fields:
value = getattr(a, field_name)
setattr(a_temp, field_name, value)
When you need to do the move from A_temp to A, you'd have to instantiate an A instance, then call the copy function, save the instance and delete the A_temp instance from the database.

Is there a way to create a so-called multi-typed model field?

Preface
I need to have objects (Object model) with an individual set of fields (Field model). It contains name and type (see the diagram). Each connection between Object and Field stores the field's value. Datatype of value depends on the Field type property and physically the value will be stored in one of the predefined db columns (value_number, value_text, ...).
How I want it to work:
field = Field.objects.get(pk=1)
sought_for = Object.fields.filter(field=field, value='test')
Is there a way to create such a field that can be put to QuerySet just as simple as in the example but it actually, depending on the field's type, uses different db column or even columns as I suppose that in the future there will types that involve more than one column to store its value.
P.S. I tried some EAV applications but they seemed to be too complicated for my case.
The diagram:
Field model, it stores name and type of fields
FieldValue, the model the values for the fields are stored.
UPD: Eventually I came to a thought that the very approach to use Postgres (or any relational database) is not the best choice. I got this implemented easily in MongoDB.

Django - How to annotate QuerySet using multiple field values?

I have a model called "Story" that has two integer fields called "views" and "votes". When I retrieve all the Story objects I would like to annotate the returned QuerySet with a "ranking" field that is simply "views"/"votes". Then I would like to sort the QuerySet by "ranking". Something along the lines of...
Story.objects.annotate( ranking=CalcRanking('views','votes') ).sort_by(ranking)
How can I do this in Django? Or should it be done after the QuerySet is retrieved in Python (like creating a list that contains the ranking for each object in the QuerySet)?
Thanks!
PS: In my actual program, the ranking calculation isn't as simple as above and depends on other filters to the initial QuerySet, so I can't store it as another field in the Story model.
In Django, the things you can pass to annotate (and aggregate) must be subclasses of django.db.models.aggregates.Aggregate. You can't just pass arbitrary Python objects to it, since the aggregation/annotation actually happens inside the database (that's the whole point of aggregate and annotate). Note that writing custom aggregations is not supported in Django (there is no documentation for it). All information available on it is this minimal source code: https://code.djangoproject.com/browser/django/trunk/django/db/models/aggregates.py
This means you either have to store the calculations in the database somehow, figure out how the aggregation API works or use raw sql (raw method on the Manager) to do what you do.

Django ORM: Optimizing queries involving many-to-many relations

I have the following model structure:
class Container(models.Model):
pass
class Generic(models.Model):
name = models.CharacterField(unique=True)
cont = models.ManyToManyField(Container, null=True)
# It is possible to have a Generic object not associated with any container,
# thats why null=True
class Specific1(Generic):
...
class Specific2(Generic):
...
...
class SpecificN(Generic):
...
Say, I need to retrieve all Specific-type models, that have a relationship with a particular Container.
The SQL for that is more or less trivial, but that is not the question. Unfortunately, I am not very experienced at working with ORMs (Django's ORM in particular), so I might be missing a pattern here.
When done in a brute-force manner, -
c = Container.objects.get(name='somename') # this gets me the container
items = c.generic_set.all()
# this gets me all Generic objects, that are related to the container
# Now what? I need to get to the actual Specific objects, so I need to somehow
# get the type of the underlying Specific object and get it
for item in items:
spec = getattr(item, item.get_my_specific_type())
this results in a ton of db hits (one for each Generic record, that relates to a Container), so this is obviously not the way to do it. Now, it could, perhaps, be done by getting the SpecificX objects directly:
s = Specific1.objects.filter(cont__name='somename')
# This gets me all Specific1 objects for the specified container
...
# do it for every Specific type
that way the db will be hit once for each Specific type (acceptable, I guess).
I know, that .select_related() doesn't work with m2m relationships, so it is not of much help here.
To reiterate, the end result has to be a collection of SpecificX objects (not Generic).
I think you've already outlined the two easy possibilities. Either you do a single filter query against Generic and then cast each item to its Specific subtype (results in n+1 queries, where n is the number of items returned), or you make a separate query against each Specific table (results in k queries, where k is the number of Specific types).
It's actually worth benchmarking to see which of these is faster in reality. The second seems better because it's (probably) fewer queries, but each one of those queries has to perform a join with the m2m intermediate table. In the former case you only do one join query, and then many simple ones. Some database backends perform better with lots of small queries than fewer, more complex ones.
If the second is actually significantly faster for your use case, and you're willing to do some extra work to clean up your code, it should be possible to write a custom manager method for the Generic model that "pre-fetches" all the subtype data from the relevant Specific tables for a given queryset, using only one query per subtype table; similar to how this snippet optimizes generic foreign keys with a bulk prefetch. This would give you the same queries as your second option, with the DRYer syntax of your first option.
Not a complete answer but you can avoid a great number of hits by doing this
items= list(items)
for item in items:
spec = getattr(item, item.get_my_specific_type())
instead of this :
for item in items:
spec = getattr(item, item.get_my_specific_type())
Indeed, by forcing a cast to a python list, you force the django orm to load all elements in your queryset. It then does this in one query.
I accidentally stubmled upon the following post, which pretty much answers your question :
http://lazypython.blogspot.com/2008/11/timeline-view-in-django.html