Efficient ways of removing duplicates from a queryset? - django

I have a table called Clue which has a foreignkey relation with another entity called Entry.
class Entry(models.Model):
    entry_text = models.CharField(max_length=50, unique=True)
    .....
class Clue(models.Model):
    entry = models.ForeignKey(Entry, on_delete=models.CASCADE)
    ......
Now, let's say I have the following queryset
clues = Clue.objects.filter(clue_text=clue.clue_text)
which returns something like this-
[<Clue: ATREST-Still>, <Clue: ATREST-Still>, <Clue: ATREST-Still>, <Clue: YET-Still>, <Clue: YET-Still>, <Clue: SILENT-Still>]
As you can see, these are different Clue objects, but some of them are tied to the same Entry object.
I tried the following:-
clues = Clue.objects.filter(clue_text=clue.clue_text).distinct()
But this won't work as the field repeating is a foreign key value. Correct me if I am wrong.
Essentially, I want my queryset to look something like this
[<Clue: ATREST-Still>, <Clue: YET-Still>, <Clue: SILENT-Still>]
I was able to achieve it with the code below, but I am looking for a solution that works at the database level rather than in memory.
This is my approach
clue_objs = []
temp = {}
clues = Clue.objects.filter(clue_text=clue.clue_text)
for clue in clues:
    if not temp.get(clue.entry.entry_text):
        temp[clue.entry.entry_text] = 1
        clue_objs.append(clue)

You can call .distinct() on the QuerySet you obtain with .values_list(…) [Django-doc], so something like:
clues = Clue.objects.filter(
    clue_text=clue.clue_text
).values_list('clue_text', flat=True).distinct()
But this looks more like a modeling problem: if you have a lot of duplicated data, that often means you should construct a new model that stores that data only once, and then reference that model with a relation (like a ForeignKey, OneToOneField or ManyToManyField).
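For the result the question asks for (one Clue per Entry), one database-level option is to group by the foreign key and keep a single row per group. The sketch below is not Django code: it uses the standard-library sqlite3 module, with hypothetical tables named clue and entry standing in for the two models and sample rows matching the output shown in the question. It keeps the clue with the lowest id for each entry. On PostgreSQL, Django's distinct() also accepts field names, so something like .order_by('entry').distinct('entry') should give a similar result.

```python
import sqlite3

# Hypothetical schema mirroring the Entry and Clue models from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY, entry_text TEXT UNIQUE);
    CREATE TABLE clue (
        id INTEGER PRIMARY KEY,
        entry_id INTEGER REFERENCES entry(id),
        clue_text TEXT
    );
    INSERT INTO entry (id, entry_text)
        VALUES (1, 'ATREST'), (2, 'YET'), (3, 'SILENT');
    INSERT INTO clue (id, entry_id, clue_text) VALUES
        (1, 1, 'Still'), (2, 1, 'Still'), (3, 1, 'Still'),
        (4, 2, 'Still'), (5, 2, 'Still'), (6, 3, 'Still');
""")

# One row per entry: group on the foreign key and keep the lowest clue id.
rows = conn.execute("""
    SELECT MIN(clue.id), entry.entry_text, clue.clue_text
    FROM clue JOIN entry ON entry.id = clue.entry_id
    WHERE clue.clue_text = ?
    GROUP BY clue.entry_id
    ORDER BY MIN(clue.id)
""", ("Still",)).fetchall()

unique_clues = [f"{entry_text}-{clue_text}" for _, entry_text, clue_text in rows]
print(unique_clues)  # ['ATREST-Still', 'YET-Still', 'SILENT-Still']
```

The deduplication happens inside the database, so only one row per entry is transferred to Python.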

Related

How to sort by the sum of a related field in Django using class based views?

If I have a model of an Agent that looks like this:
class Agent(models.Model):
    name = models.CharField(max_length=100)
and a related model that looks like this:
class Deal(models.Model):
    agent = models.ForeignKey(Agent, on_delete=models.CASCADE)
    price = models.IntegerField()
and a view that looked like this:
from django.views.generic import ListView
class AgentListView(ListView):
    model = Agent
I know that I can adjust the sort order of the agents in the queryset and I even know how to sort the agents by the number of deals they have like so:
from django.db.models import Count
queryset = Agent.objects.all().annotate(uc_count=Count('deal')).order_by('-uc_count')
However, I cannot figure out how to sort the agents by the sum of the prices of their deals.
Given you already know how to annotate and sort by those annotations, you're 90% of the way there. You just need to use the Sum aggregate and follow the relationship backwards.
The Django docs give this example:
Author.objects.annotate(total_pages=Sum('book__pages'))
You should be able to do something similar:
queryset = Agent.objects.all().annotate(deal_total=Sum('deal__price')).order_by('-deal_total')
My spidey sense is telling me you may need to add distinct=True to the Sum aggregation, but I'm not sure without testing.
Building on Greg Kaleka's answer and the question you asked under his response, this is likely the solution you are looking for:
from django.db.models import Case, IntegerField, Sum, When
queryset = Agent.objects.all().annotate(
    deal_total=Sum('deal__price'),
    o=Case(
        When(deal_total__isnull=True, then=0),
        default=1,
        output_field=IntegerField()
    )
).order_by('-o', '-deal_total')
Explanation:
What's happening is that deal_total adds up the prices of each agent's deals, but if an Agent has no deals to begin with, the sum is None. The Case/When annotation o is 0 for those agents and 1 for everyone else, so ordering by '-o' first puts agents with deals ahead of agents without any, and '-deal_total' then sorts the agents with deals by their totals.
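The effect of the extra ordering flag can be reproduced in plain SQL. The sketch below uses the standard-library sqlite3 module with hypothetical agent and deal tables and made-up sample rows. It approximates the kind of query the ORM builds and is not its exact output. Databases differ on where NULL sorts (PostgreSQL, for example, puts NULLs first in a descending sort by default), which is why an explicit flag is useful.

```python
import sqlite3

# Hypothetical tables standing in for the Agent and Deal models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE deal (id INTEGER PRIMARY KEY, agent_id INTEGER, price INTEGER);
    INSERT INTO agent VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Bob has no deals, so his SUM is NULL.
    INSERT INTO deal (agent_id, price) VALUES (1, 100), (1, 50), (3, 400);
""")

rows = conn.execute("""
    SELECT agent.name,
           SUM(deal.price) AS deal_total,
           CASE WHEN SUM(deal.price) IS NULL THEN 0 ELSE 1 END AS o
    FROM agent LEFT JOIN deal ON deal.agent_id = agent.id
    GROUP BY agent.id
    ORDER BY o DESC, deal_total DESC
""").fetchall()

print(rows)  # [('Cy', 400, 1), ('Ann', 150, 1), ('Bob', None, 0)]
```

Agents with deals come first, sorted by total, and the agent without deals ends up last regardless of how the database orders NULL.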

Django related_name naming best practice in case of multiple relations

After a year of Django experience I found out that I am not quite sure that I use Django related_names correctly.
Imagine I have three models
class classA(models.Model):
    pass
class classB(models.Model):
    pass
class classC(models.Model):
    modelA = models.ForeignKey(classA)
    modelB = models.ForeignKey(classB)
Fine. Now I am thinking of adding related_name to classC's modelA and modelB, but the frustrating thing is that I cannot use the same name for two fields. In other words, this code is apparently wrong
class classC(models.Model):
    modelA = models.ForeignKey(classA, related_name='classC') # wrong
    modelB = models.ForeignKey(classB, related_name='classC') # wrong
On the other hand, coming up with an approach like this:
class classC(models.Model):
    modelA = models.ForeignKey(classA, related_name='classA') # wrong
    modelB = models.ForeignKey(classB, related_name='classB') # wrong
would result in very misleading code (at least to me). Consider this:
obj = classA.objects.filter(classC__in=classA_qs)
So such naming results in very confusing code: classC = classA_instance.
What is the best practice for naming related_names? And is there something I am missing about ManyToManyFields? I have a large project, but I've never used ManyToManyFields, always going for a third table like classC in the example. Is there something I am missing?
How about using variable related_names? That way you can name them according to their app and class.
class ClassB(models.Model):
    readers = models.ForeignKey('Reader',
                                related_name='readable_%(app_label)s_%(class)s_set+')
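To see what such a template turns into, here is a minimal sketch that mimics the substitution with ordinary Python string formatting. The helper expand_related_name and the app label "library" are hypothetical and exist only for illustration. Django performs the equivalent substitution itself, using the lower-cased app label and model class name. Note also that a related_name ending in '+', as in the answer above, tells Django not to create the reverse relation at all, so the template here is shown without it.

```python
# Template as in the answer above, minus the trailing '+'.
related_name_template = "readable_%(app_label)s_%(class)s_set"


def expand_related_name(template, app_label, model_class_name):
    """Mimic the placeholder substitution; both values are lower-cased."""
    return template % {
        "app_label": app_label.lower(),
        "class": model_class_name.lower(),
    }


# Hypothetical values: a model named ClassB in an app labelled "library".
expanded = expand_related_name(related_name_template, "library", "ClassB")
print(expanded)  # readable_library_classb_set
```

Because the app label and class name are part of the result, each model that uses the template gets a distinct reverse name.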

Django Query. Correct use of objects.select_related()

I have this models:
class A(models.Model):
    onefield = ...
class B(models.Model):
    property = models.CharField(...)
    foreign = models.ForeignKey(A)
So, I want to get all the A objects that are referenced through the foreign key of B objects with property = x.
I tried this:
query = B.objects.filter(property=x).select_related('A')
But it doesn't work. Is it possible to do this with select_related()?
Although I hesitate to contradict the illustrious Alex, and he's technically correct (the best kind of correct, after all), select_related is not the answer here. Using that never gives you different objects as a result; it only improves the efficiency of subsequently accessing related objects.
To get As, you need to start your query from A. You can use the double-underscore syntax to filter on the related property. So:
query = A.objects.filter(b__property=x)
You need to write .select_related('foreign'). select_related takes a field name, not a class name.
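Filtering A through the reverse relation translates into a join, and a join can return the same A row once per matching B. The sketch below shows this in plain SQL with the standard-library sqlite3 module. The tables a and b and their sample rows are hypothetical stand-ins for the two models. In Django, the usual counterpart to the DISTINCT query is appending .distinct() to the filtered queryset.

```python
import sqlite3

# Hypothetical tables standing in for models A and B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, onefield TEXT);
    CREATE TABLE b (
        id INTEGER PRIMARY KEY,
        property TEXT,
        foreign_id INTEGER REFERENCES a(id)
    );
    INSERT INTO a VALUES (1, 'first'), (2, 'second'), (3, 'third');
    -- Two B rows with property 'x' point at the same A row (id 1).
    INSERT INTO b (property, foreign_id)
        VALUES ('x', 1), ('x', 1), ('y', 2), ('x', 3);
""")

# Plain join: one result row per matching B.
joined = conn.execute("""
    SELECT a.id FROM a JOIN b ON b.foreign_id = a.id
    WHERE b.property = ? ORDER BY a.id
""", ("x",)).fetchall()

# DISTINCT collapses the repeats so each A appears once.
distinct = conn.execute("""
    SELECT DISTINCT a.id FROM a JOIN b ON b.foreign_id = a.id
    WHERE b.property = ? ORDER BY a.id
""", ("x",)).fetchall()

print(joined)    # [(1,), (1,), (3,)]
print(distinct)  # [(1,), (3,)]
```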

ForeignKey and a ManyToMany Self Query

I have a Django model as follows:
class Topic(models.Model):
    name = models.CharField(db_index=True, max_length=30)
    categorykey = models.ForeignKey('Category')
class Category(models.Model):
    categorykey = models.CharField(db_index=True, max_length=30)
    relatedcategories = models.ManyToManyField("Category", symmetrical=False)
The categories can have related categories. For example, if the category is "Vet", the related categories might be "Animals", "Medicine", etc. I want to find all the Topics within a category and its related categories.
I cannot figure out how to do that. I think I want something like:
categorykey="Vet"
topics=list(Topic.objects.filter(categorykey__relatedcategories__in=categorykey))
But that just throws an error. Any ideas?
Try this:
topics = Topic.objects.filter(categorykey__relatedcategories__categorykey='Vet')
Or this:
vet_category = Category.objects.get(categorykey='Vet')
topics = Topic.objects.filter(categorykey__relatedcategories=vet_category)
(Depending on which is more convenient for you.)

How can I get a list of objects from a PostgreSQL view to display

This is the model for the view.
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)
    class Meta:
        db_table = u'qry_desc_char'
This is the SQL I use to create the view:
CREATE VIEW qry_desc_char as
SELECT
tbl_desc.iid_id,
tbl_desc.cid_id,
tbl_desc.cs,
tbl_char.cid,
tbl_char.charname
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
I don't know if I need a function in models, in views, or in both. I want to get a list of objects from that view and display it. This might be easy, but I'm new to Django and Python, so I'm having some problems.
Django 1.1 brought in a new feature that you might find useful. You should be able to do something like:
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)
    class Meta:
        db_table = u'qry_desc_char'
        managed = False
The managed Meta class option is described in the Django documentation. A relevant quote:
If False, no database table creation or deletion operations will be performed for this model. This is useful if the model represents an existing table or a database view that has been created by some other means. This is the only difference when managed is False. All other aspects of model handling are exactly the same as normal.
Once that is done, you should be able to use your model normally. To get a list of objects you'd do something like:
qry_desc_char_list = QryDescChar.objects.all()
To actually get the list into your template you might want to look at generic views, specifically the object_list view.
If your RDBMS lets you create writable views, and the view you create has exactly the same structure as the table Django would create, I guess that should work directly.
(This is an old question, but is an area that still trips people up and is still highly relevant to anyone using Django with a pre-existing, normalized schema.)
In your SELECT statement you will need to add a numeric "id" because Django expects one, even on an unmanaged model. You can use the row_number() window function to accomplish this if there isn't a guaranteed unique integer value on the row somewhere (and with views this is often the case).
In this case I'm using an ORDER BY clause with the window function, but you can do anything that's valid, and while you're at it you may as well use a clause that's useful to you in some way. Just make sure you do not try to use Django ORM dot references to relations because they look for the "id" column by default, and yours are fake.
Additionally I would consider renaming my output columns to something more meaningful if you're going to use it within an object. With those changes in place the query would look more like (of course, substitute your own terms for the "AS" clauses):
CREATE VIEW qry_desc_char as
SELECT
row_number() OVER (ORDER BY tbl_char.cid) AS id,
tbl_desc.iid_id AS iid_id,
tbl_desc.cid_id AS cid_id,
tbl_desc.cs AS a_better_name,
tbl_char.cid AS something_descriptive,
tbl_char.charname AS name
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
Once that is done, in Django your model could look like this:
class QryDescChar(models.Model):
    iid_id = models.ForeignKey('WhateverIidIs', related_name='+',
                               db_column='iid_id', on_delete=models.DO_NOTHING)
    cid_id = models.ForeignKey('WhateverCidIs', related_name='+',
                               db_column='cid_id', on_delete=models.DO_NOTHING)
    a_better_name = models.CharField(max_length=10)
    something_descriptive = models.IntegerField()
    name = models.CharField(max_length=50)
    class Meta:
        managed = False
        db_table = 'qry_desc_char'
You don't need the "_id" part on the end of the id column names, because you can give the Django field a more descriptive name and declare the actual column with the "db_column" argument, as I did above (here I only use it to prevent Django from adding another "_id" to the end of cid_id and iid_id, which would add zero semantic value to your code). Also, note the "on_delete" argument. Django does its own thing when it comes to cascading deletes, and on an interesting data model you don't want this; with views you'll just get an error and an aborted transaction. Prior to Django 1.5 you have to patch it to make DO_NOTHING actually mean "do nothing". Otherwise it will still try to (needlessly) query and collect all related objects before going through its delete cycle, and the query will fail, halting the entire operation.
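The row_number() idea described above can be tried out without PostgreSQL. The following sketch uses the standard-library sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The sample rows are made up, and the extra tbl_desc.iid_id in the ORDER BY is my own addition as a tie-breaker so the numbering is deterministic for this data. It is not part of the original view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_char (cid INTEGER PRIMARY KEY, charname TEXT);
    CREATE TABLE tbl_desc (iid_id INTEGER, cid_id INTEGER, cs TEXT);
    INSERT INTO tbl_char VALUES (10, 'colour'), (20, 'shape');
    INSERT INTO tbl_desc VALUES (1, 10, 'red'), (1, 20, 'round'), (2, 10, 'blue');

    -- Same shape as the view above, with a synthetic integer id column.
    CREATE VIEW qry_desc_char AS
    SELECT row_number() OVER (ORDER BY tbl_char.cid, tbl_desc.iid_id) AS id,
           tbl_desc.iid_id, tbl_desc.cid_id, tbl_desc.cs,
           tbl_char.cid, tbl_char.charname
    FROM tbl_desc JOIN tbl_char ON tbl_desc.cid_id = tbl_char.cid;
""")

rows = conn.execute(
    "SELECT id, iid_id, cs, charname FROM qry_desc_char ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Each row of the view now carries a unique integer, which is what an unmanaged model needs for its primary key. The numbers are recomputed on every query, so they identify rows only within a single result and can shift when the underlying tables change.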
Incidentally, I wrote an in-depth explanation of how to do this just the other day.
You are trying to fetch records from a view. This is not correct as a view does not map to a model, a table maps to a model.
You should use Django ORM to fetch QryDescChar objects. Please note that Django ORM will fetch them directly from the table. You can consult Django docs for extra() and select_related() methods which will allow you to fetch related data (data you want to get from the other table) in different ways.