Is Django's queryset.distinct() necessary when you are not using queryset.values()?

I'm trying to wrap my head around the distinct() method of Django's QuerySet class. What I'm having trouble understanding is when to actually use it. Note that I'm not talking about the "DISTINCT ON" feature of PostgreSQL.
I understand that each model instance has an id and that ids are unique, so when you query model instances it doesn't seem possible to get duplicate rows. So is the following use of distinct() redundant?
User.objects.distinct()
I know that one correct use of distinct() is together with values(): if you don't select the id, the returned values may contain duplicates, and distinct() removes them.
Is there any other scenario where one might need distinct() (e.g. when using select_related or prefetch_related)?

A common use of distinct() is to eliminate duplicates when you filter across multiple tables.
Consider the following models:
class Parent(models.Model):
    pass
class Child(models.Model):
    parent = models.ForeignKey(Parent, on_delete=models.CASCADE)
    value = models.IntegerField()
Populated with the following data:
p = Parent.objects.create()
Child.objects.create(parent=p, value=1)
Child.objects.create(parent=p, value=2)
Child.objects.create(parent=p, value=3)
If you filter a Parent queryset by the related value column, the query joins the Child table, so you get the same Parent back once for every Child that matches the filter:
Parent.objects.filter(child__value__gt=0)
# <QuerySet [<Parent: Parent object (1)>, <Parent: Parent object (1)>, <Parent: Parent object (1)>]>
Parent.objects.filter(child__value__gt=1)
# <QuerySet [<Parent: Parent object (1)>, <Parent: Parent object (1)>]>
But if you use distinct(), the duplicates are eliminated:
Parent.objects.filter(child__value__gt=0).distinct()
# <QuerySet [<Parent: Parent object (1)>]>
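The duplication comes from the SQL join rather than from Django itself. As a rough illustration, here is a minimal sketch using the standard-library sqlite3 module. The table names parent and child are assumptions chosen for readability (Django would normally prefix them with the app label), and the SQL is only an approximation of what the ORM generates.

```python
import sqlite3

# Hypothetical schema mirroring the Parent/Child models above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent(id),
        value INTEGER NOT NULL
    );
    INSERT INTO parent (id) VALUES (1);
    INSERT INTO child (parent_id, value) VALUES (1, 1), (1, 2), (1, 3);
""")

JOIN_SQL = """
    SELECT {distinct} parent.id
    FROM parent
    INNER JOIN child ON child.parent_id = parent.id
    WHERE child.value > ?
"""


def parent_ids(threshold, distinct=False):
    """Return ids of parents that have a child with value > threshold."""
    sql = JOIN_SQL.format(distinct="DISTINCT" if distinct else "")
    return [row[0] for row in conn.execute(sql, (threshold,))]


without_distinct = parent_ids(0)               # one row per matching child
with_distinct = parent_ids(0, distinct=True)   # one row per parent
print(without_distinct, with_distinct)
```

The join yields one result row per matching child, which is why the same parent shows up several times until DISTINCT collapses the rows.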

Related

Getting the primary key of a ForeignKey field without querying the database again in Django

I have the following models:
class Model1(models.Model):
    ...
class Model2(models.Model):
    ...
    model1 = models.ForeignKey(Model1)
Now, let's say I have an object of Model2 with pk=241 which is related to another object of Model1 with pk=102. I am querying them as follows:
model2 = Model2.objects.get(pk=241)
Now, if I want the pk of the referenced Model1 object, I do the following:
model2.model1.pk
This should not query the database again according to what I understand about tables, but if I run the following:
from django.db import connection
connection.queries
I get a list of 2 queries. Why do I need to query my database again to only get the primary key of my related object? Is there a way to avoid doing this?
I am aware of select_related(). However, what if I want to access the Model1 object's pk in the save() method of the Model2 class?
Moreover, is select_related() required even if I want to just retrieve the pk of the related object and nothing more?
You can access the underlying field without a db hit. Django stores the raw foreign key value on the instance under the field name with an _id suffix:
model2.model1_id
You don't need select_related here, since you are not actually accessing the related object.
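To make the difference between the two attributes concrete, here is a toy sketch in plain Python. It is not Django's implementation: LazyForeignKey, MODEL1_ROWS and query_log are hypothetical names invented for illustration. It only mimics the idea that the related object is loaded lazily on first access, while the id column is already present on the instance.

```python
# Toy stand-in for the Model1 table, plus a log of simulated queries.
MODEL1_ROWS = {102: {"pk": 102}}
query_log = []


class LazyForeignKey:
    """Descriptor that loads the related row the first time it is read."""

    def __init__(self, id_attr, table):
        self.id_attr = id_attr
        self.table = table
        self.cache_attr = "_cached_" + id_attr

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.cache_attr not in instance.__dict__:
            related_id = getattr(instance, self.id_attr)
            # Simulated database hit.
            query_log.append("SELECT ... WHERE id = %s" % related_id)
            instance.__dict__[self.cache_attr] = self.table[related_id]
        return instance.__dict__[self.cache_attr]


class Model2:
    model1 = LazyForeignKey("model1_id", MODEL1_ROWS)

    def __init__(self, pk, model1_id):
        self.pk = pk
        self.model1_id = model1_id  # plain column value, already loaded


model2 = Model2(pk=241, model1_id=102)

pk_from_column = model2.model1_id     # no simulated query
queries_after_column = len(query_log)

pk_from_object = model2.model1["pk"]  # triggers one simulated query
queries_after_object = len(query_log)
```

Reading model1_id never touches the simulated table, whereas the first read of model1 does, which matches the extra entry seen in connection.queries.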

Getting model instance when using QuerySet.values()

Let's say I have two models, one referencing the other:
class Shelf(models.Model):
    pass
class Book(models.Model):
    shelf = models.ForeignKey(Shelf)
I'd like to use values() on a QuerySet of Book instances:
In [1]: Book.objects.create(shelf=Shelf.objects.create())
Out[1]: <Book: Book object>
In [2]: Book.objects.values()
Out[2]: [{'id': 1, 'shelf_id': 1}]
The problem is that the returned dictionaries contain just the primary keys of the related Shelf instances instead of the instances themselves. Is there a way to get the actual instances in a single query? E.g.:
In [2]: Book.objects.values()
Out[2]: [{'id': 1, 'shelf': <Shelf: Shelf object>}]
The reason I'm using values() is so that I can merge two QuerySets for different models which I want to sort and render into a single table in a view.
You can get similar output with a generator that calls .get() for each shelf id, at the cost of one extra query per book:
(Shelf.objects.get(pk=shelf_id) for shelf_id in Book.objects.values_list("shelf", flat=True))
According to the docs:
If you have a field called foo that is a ForeignKey, the default
values() call will return a dictionary key called foo_id, since this
is the name of the hidden model attribute that stores the actual value
(the foo attribute refers to the related model).
I don't think it is possible with the default values() implementation. You can create a custom manager and override the values() method to create a multi-level dictionary for each ForeignKey of the model.
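One possible alternative to overriding values() is to merge the instances in afterwards. The sketch below is plain Python and makes several assumptions: attach_related is a hypothetical helper, the Shelf class is a stand-in for the real model, and the rows and the id-to-instance mapping are hard-coded. In a real project the rows might come from Book.objects.values() and the mapping from Shelf.objects.in_bulk(...), which would mean two queries in total rather than one.

```python
class Shelf:
    """Plain stand-in for a Shelf model instance."""

    def __init__(self, pk):
        self.pk = pk

    def __repr__(self):
        return "<Shelf %d>" % self.pk


def attach_related(rows, instances_by_id, fk_key, out_key):
    """Return copies of rows with the foreign key id swapped for the instance."""
    merged = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != fk_key}
        # A missing id maps to None instead of raising.
        new_row[out_key] = instances_by_id.get(row[fk_key])
        merged.append(new_row)
    return merged


# Hard-coded sample data shaped like the values() output in the question.
book_rows = [{"id": 1, "shelf_id": 1}, {"id": 2, "shelf_id": 1}]
shelves = {1: Shelf(1)}

result = attach_related(book_rows, shelves, "shelf_id", "shelf")
print(result)
```

The original rows are left untouched, so the same values() output can still be sorted or merged with rows from another model before rendering.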

Get all related objects of an object in Django?

Let us say I have a model which contains related (foreign key) fields. Likewise, those Foreign Key fields may refer to models which may or may not contain related fields. Note that relational fields in Django may be one-to-one, many-to-one, or many-to-many.
Now, given an instance of a model, I want to recursively and dynamically get all instances of the models related to it, either directly or indirectly down the line. Conceptually, I want to perform a traversal of the related objects and return them.
Example:
class Model1(models.Model):
    rfield1 = models.ForeignKey("Model2")
    rfield2 = models.ManyToManyField("Model3")
    normalfield1 = models.CharField(max_length=50)
class Model2(models.Model):
    sfield = models.ForeignKey("Model3")
    normalfield = models.CharField(max_length=50)
class Model3(models.Model):
    normalfield = models.CharField(max_length=50)
Let's say, I have an instance of model Model1 model1, and I want to get objects directly related to it i.e. all Model2 and Model3 objects, and also those which are indirectly related i.e. all Model3 objects related to the Model2 objects retrieved previously. I also want to consider the case of a One-to-One field where the related field is defined on the OTHER MODEL.
Also, note that I might not know the model of the instance I'm currently working on. In the previous example, I may not know that model1 is an instance of the Model1 model, so I want to perform all of this dynamically.
In order to do this, I think I need a way to get all related fields of an object.
1. How do I get all the related fields of an object?
2. How should I use them to get the actual related objects?
Or is there a better way to do this? Thank you very much!
UPDATE:
I already know how to perform 1, and 2 basically follows directly from 1. :) Update later.
If you have model1, getting all its many-to-many field names (etc.) is easy, since these are all stored in the meta's 'local_many_to_many' list:
[field.name for field in model1._meta.local_many_to_many]
The foreign keys are a bit more tricky since they are stored with all other fields in the meta's 'local_fields' list. Hence we need to make sure that it has a relation of sorts. This can be done as follows:
[field.name for field in model1._meta.local_fields if field.rel]
This method requires no knowledge of your models. Further interrogation can also be done on the field object if the name is not enough.
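Once the related field names are known, the recursive part of the question is an ordinary graph traversal. The sketch below is plain Python under stated assumptions: collect_related and get_related are hypothetical names, the objects are assumed to be hashable, and the toy graph uses strings in place of model instances. In Django, get_related would have to be written on top of the field lists shown above; note also that newer Django versions expose the relation as field.remote_field rather than field.rel.

```python
from collections import deque


def collect_related(start, get_related):
    """Breadth-first walk over every object reachable from start.

    get_related is a hypothetical callable that returns the objects
    directly related to the object it is given.
    """
    seen = {start}
    found = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for obj in get_related(current):
            if obj not in seen:  # guards against cycles and repeats
                seen.add(obj)
                found.append(obj)
                queue.append(obj)
    return found


# Toy graph shaped like the example: model1 points at a Model2 and a Model3,
# and that Model2 points at another Model3.
graph = {
    "model1": ["model2", "model3a"],
    "model2": ["model3b"],
    "model3a": [],
    "model3b": [],
}

related = collect_related("model1", lambda name: graph[name])
print(related)
```

The seen set matters because reverse relations (for example a one-to-one field defined on the other model) make it easy to walk back to an object that was already visited.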

Django models: Reference field return multiple type of models

As a project to figure out Django I'm trying to build a small game.
A player has a base. A base has several types of items it can harbor (Vehicle, Defense, Building).
I have 3 static tables which contain information for the first level of each item (in the game these values are used in formulas to calculate stuff for upgrades). I've used a sequence to insert all these items in these different tables so the IDs are unique across tables.
To keep track of what items the player has per base I have a table 'Property'. I want to use a single field as a reference to the ID of an item, and I'm trying to get this done with the Django models.
Warning: my knowledge of Django models is pretty limited and I've been stuck on this for a few days now.
Is this possible and if so how can it be done?
I tried decorating the save method so that it overwrites the item field with the id of the assigned object, and then querying the object by id when trying to 'get' it. However, I can't get past the obvious restriction of the model when defining that field as an Integer. I hoped it wouldn't validate until I called save().
def getPropertyItemID(func):
    """
    This method sets the referral ID to an item to the actual ID.
    """
    def decoratedFunction(*args):
        # Grab a reference to the data object we want to update.
        data_object=args[0]
        # Set the ID if item is not empty.
        if data_object.item is not None:
            data_object.item=data_object.item.id
        # Execute the function we're decorating
        return func(*args)
    return decoratedFunction
class Property(models.Model):
    """
    This class represents items that a user has per base.
    """
    user=models.ForeignKey(User)
    base=models.ForeignKey(Base)
    item=models.IntegerField()
    amount=models.IntegerField(default=0)
    level=models.SmallIntegerField(default=0)
    class Meta:
        db_table='property'
    #getPropertyItemID
    def save(self):
        # Now actually save the object
        super(Property, self).save()
I hope you can help me here. The end result I'd like to be able to put to use would be something like:
# Adding - automatically saving the ID of item regardless of the class
# of item
item = Property(user=user, base=base, item=building)
item.save()
# Retrieving - automatically create an instance of an object based on the ID
# of item, regardless of the table this ID is found in.
building = Property.objects.all().distinct(True).get(base=base, item=Building.objects.all().distinct(True).get(name='Tower'))
# At this point building should be an instance of the Building model
If I'm completely off and I can achieve this differently I'm all ears :)
I think you are looking for a Generic Relationship:
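# Import paths match the `generic` module used below; newer Django versions
# expose GenericForeignKey from django.contrib.contenttypes.fields instead.
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
class Property(models.Model):
    user=models.ForeignKey(User)
    base=models.ForeignKey(Base)
    content_type = models.ForeignKey(ContentType) # Which model is `item` representing?
    object_id = models.PositiveIntegerField() # What is its primary key?
    item=generic.GenericForeignKey('content_type', 'object_id') # Easy way to access it.
    amount=models.IntegerField(default=0)
    level=models.SmallIntegerField(default=0)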
This lets you create items as you mentioned, however you would probably need to look at a different way of filtering those items out.
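The idea behind a generic relationship can be shown without Django: store which table the item lives in next to its primary key, and resolve the pair on access. The sketch below is a toy model in plain Python, not the contenttypes framework. TABLES, the simplified Property class and find_properties are all hypothetical names. It also hints at the filtering caveat: lookups go through the (content_type, object_id) pair rather than through item itself.

```python
# Toy "tables": model name -> {primary key -> row}.
TABLES = {
    "vehicle": {1: {"name": "Truck"}},
    "building": {1: {"name": "Tower"}},
}


class Property:
    """Records which table and which row the item lives in."""

    def __init__(self, content_type, object_id, amount=0, level=0):
        self.content_type = content_type  # which table `item` refers to
        self.object_id = object_id        # primary key inside that table
        self.amount = amount
        self.level = level

    @property
    def item(self):
        # Resolve the (table, primary key) pair to the actual row.
        return TABLES[self.content_type][self.object_id]


def find_properties(properties, content_type, object_id):
    """Filter on the stored pair instead of on the resolved item."""
    return [
        prop for prop in properties
        if prop.content_type == content_type and prop.object_id == object_id
    ]


properties = [Property("vehicle", 1, amount=2), Property("building", 1)]
towers = find_properties(properties, "building", 1)
print(towers[0].item["name"])
```

Because the table name is stored alongside the id, the ids no longer need to be unique across the three item tables.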

django: iterating over items in ManyToMany table without intermediate model (without using 'through')

I have a simple case with 2 models, Item and Category, with a ManyToMany between them. I want to show a page listing all categories and, for each category, its list of items. I have hundreds of categories, so Django hits the db hundreds of times (when iterating through categories and calling items.all() for each one). I need to select data from the intermediate table manually and use select_related() to pull the item and category for each record: one query instead of hundreds.
I know that introducing 'through' would solve the problem, but I don't want to do it now because it may break existing code (with 'through' you can't use add, create, or assignment to create relationships, which I want to avoid for now).
So, is it possible at all without creating a model for the intermediate table?
You could make a model for your existing table, mark it as unmanaged, and simply not use it as the through model for the m2m, e.g.:
class ItemCategory(models.Model):
    item = models.ForeignKey('Item')
    category = models.ForeignKey('Category')
    class Meta:
        db_table = 'the_name_of_the_existing_m2m_table'
        managed = False
Something like that, anyway.
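With such a model in place, the page can be built from a single joined query over the intermediate table, grouping the rows in Python. The sketch below uses the standard-library sqlite3 module to show the shape of that query and the grouping step. The table names, column names and sample rows are assumptions made up for the example, and the SQL only approximates what a select_related() call on the unmanaged model would produce.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: two model tables plus the existing m2m table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_categories (
        id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES item(id),
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'tools'), (2, 'toys');
    INSERT INTO item VALUES (1, 'hammer'), (2, 'robot'), (3, 'saw');
    INSERT INTO item_categories (item_id, category_id)
        VALUES (1, 1), (3, 1), (2, 2), (2, 1);
""")

# One query over the intermediate table, joining both sides.
rows = conn.execute("""
    SELECT category.name, item.name
    FROM item_categories
    JOIN item ON item.id = item_categories.item_id
    JOIN category ON category.id = item_categories.category_id
    ORDER BY category.name, item.name
""").fetchall()

# Group the flat (category, item) pairs into one list per category.
items_by_category = defaultdict(list)
for category_name, item_name in rows:
    items_by_category[category_name].append(item_name)

print(dict(items_by_category))
```

Since this is an inner join, a category with no items produces no rows and would have to be added separately if it should still be listed.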