Django: Join two tables in a single query

In my template I have a table which I want to populate with data from two different tables which use foreign-keys.
The solution I found while googling around is to use values(); an example of my queryset is below:
data = Table1.objects.all().filter(date=today).values('table2__price')
My template
{% for item in data %}
<tr class="original">
<td>{{ item.fruit }}</td>
<td>{{ item.table2__price }}</td>
</tr>
{% endfor %}
It does fetch price from table2, but it also makes the data from Table1 (item.fruit) disappear. If I remove the values() call, the Table1 data populates as expected.
Does anyone have feedback on what I am doing wrong?
Thanks

If you make use of .values(…) [Django-doc], you retrieve a queryset of dictionaries that contain only the values specified. This is not a good idea: not only, as you found out, are the other fields of the model no longer available, but you also "erode" the model layer, because each item is no longer a Table1 object but a dictionary. This is a form of the primitive obsession antipattern [refactoring.guru].
You can make use of .annotate(…) [Django-doc] to add extra attributes for the elements that arise from this queryset:
from django.db.models import F
data = Table1.objects.filter(date=today).annotate(
    table2__price=F('table2__price')
)
If there is however a ForeignKey (or OneToOneField) from Table1 to Table2, you can select the columns of Table2 in the same query with .select_related(…) [Django-doc]:
data = Table1.objects.filter(date=today).select_related('table2')
In the template, you can then use:
{{ item.fruit }}
{{ item.table2.price }}
{{ item.table2.other_attr }}
The .select_related(…) part is not strictly necessary, but without it Django makes an extra query per item, which produces an N+1 problem.
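To make the N+1 point concrete, here is a minimal sketch that uses only the standard-library sqlite3 module instead of Django. The table and column names are assumptions that mirror Table1 and Table2 from the question, and the JOIN is only similar in spirit to the SQL that .select_related(…) produces, not a copy of it.

```python
import sqlite3

# Hypothetical schema standing in for Table1 (which holds the foreign key) and Table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        fruit TEXT,
        table2_id INTEGER REFERENCES table2(id)
    );
    INSERT INTO table2 VALUES (1, 0.5), (2, 0.25), (3, 1.2);
    INSERT INTO table1 VALUES (1, 'apple', 1), (2, 'banana', 2), (3, 'cherry', 3);
""")


def rows_n_plus_one(conn):
    """One query for table1, then one extra query per row for its price."""
    queries = 1
    result = []
    parents = conn.execute(
        "SELECT fruit, table2_id FROM table1 ORDER BY id"
    ).fetchall()
    for fruit, table2_id in parents:
        price = conn.execute(
            "SELECT price FROM table2 WHERE id = ?", (table2_id,)
        ).fetchone()[0]
        queries += 1
        result.append((fruit, price))
    return result, queries


def rows_joined(conn):
    """A single JOIN that fetches both tables' columns at once."""
    sql = (
        "SELECT t1.fruit, t2.price FROM table1 AS t1 "
        "JOIN table2 AS t2 ON t2.id = t1.table2_id ORDER BY t1.id"
    )
    return conn.execute(sql).fetchall(), 1


lazy_rows, lazy_queries = rows_n_plus_one(conn)
joined_rows, joined_queries = rows_joined(conn)
print(lazy_queries, joined_queries)  # prints "4 1"
```

Both functions return the same rows; only the number of round trips differs, and the gap grows with the number of rows in the first table.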

Related

How do I properly compare two different models based on their field values and output the differences?

I am trying to figure out how to output the differences between two querysets that have similar fields. I have two different models with identical fields, and at times I want to show the differences between them. I have researched and successfully used sets and lists in my Python code, but I can't quite figure out how to leverage them here, or determine whether I can. Sets seem to keep just the field values, and when I try to loop through my sets it doesn't work, because they hold only the values without the keys.
Example:
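from django.db import models

class Team(models.Model):
    # CharField requires max_length; 100 is only an illustrative value
    player = models.CharField(max_length=100)
    coach = models.CharField(max_length=100)

class TeamCommittee(models.Model):
    player = models.CharField(max_length=100)
    coach = models.CharField(max_length=100)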
I want to be able to query both models at times and be able to exclude data from one model if it exists in the other. I have spent the afternoon trying to use sets, lists, and loops in my django templates but can't quite work this out. I have also tried various querysets using exclude....
I tried something like....
query1 = TeamCommittee.objects.filter(id=self.object.pk).values('coach','player')
query2 = Team.objects.filter(id=self.object.pk).exclude(id__in=query1)
When I use the approach above, I get TypeError: Cannot use multi-field values as a filter value.
I am wondering if I can do this via a query or if I need to dump my querysets and go down a path of manipulating a data dictionary? That seems extreme for what I am trying to do though. This does seem to be a bit more complicated because I am trying to cross reference two different models. If it was the same model this would be a lot easier but it's not an option for this particular use case.
Thanks in advance for any thoughts on the right way to approach this.
If you want to compare on the basis of the ID of both tables, probably you can use this:
teamID = list(Team.objects.all().values_list('id', flat=True))
query1 = TeamCommittee.objects.filter(id__in=teamID)
teamCommitteeID = list(TeamCommittee.objects.all().values_list('id', flat=True))
query2 = Team.objects.filter(id__in=teamCommitteeID)
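If the comparison has to be on field values rather than on IDs, one option is to pull the value pairs out of both models and diff them in Python. The sketch below is plain Python: the two lists stand in for what list(Team.objects.values_list('coach', 'player')) and the TeamCommittee equivalent might return, and the rows and the helper name only_in_first are made up for illustration.

```python
# Stand-ins for the (coach, player) tuples returned by values_list() on each model.
team_rows = [("Smith", "Alice"), ("Smith", "Bob"), ("Jones", "Carol")]
committee_rows = [("Smith", "Alice"), ("Jones", "Dave")]


def only_in_first(first, second):
    """Return the rows of `first` that are absent from `second`, in order."""
    seen = set(second)  # tuples are hashable, so membership checks are cheap
    return [row for row in first if row not in seen]


only_in_team = only_in_first(team_rows, committee_rows)
only_in_committee = only_in_first(committee_rows, team_rows)
print(only_in_team)       # [('Smith', 'Bob'), ('Jones', 'Carol')]
print(only_in_committee)  # [('Jones', 'Dave')]
```

Because each row is a tuple in a fixed field order, the position tells you which value is the coach and which is the player, so the keys are not lost the way they are with a bare set of values.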
I was super close...Instead I just did this...
query1 = TeamCommittee.objects.filter(id=self.object.pk).values('coach','player').distinct()
Then in my template I did a very simple....
{% for item in query1.all %}
{{ item }}
{% endfor %}
Then when I wanted to get the values out of the other queryset I just did the same thing with the loop.
query2 = Team.objects.filter(id=self.object.pk).values('coach','player').distinct()
Then in my template I did a very simple....
{% for item in query2.all %}
{{ item }}
{% endfor %}
Sometimes simplicity is hard.
The answer above after additional testing only partially worked. I ultimately abandoned that approach and instead incorporated the logic below into my template. I did not need to create separate queries, I just needed to loop through the fields that were already available to me as part of the DetailView I was using....
{% for author in author_detail.author_set.all %}
  {% for book in book_detail.book_set.all %}
    {% if author.author_name %}
      {% if book.book_name in author.book_name %}
        {% if author.book_name == book.book_name %}
        {% elif author.book_name != book.book_name %}
          {{ author.author_name }}
        {% endif %}
      {% endif %}
    {% endif %}
  {% endfor %}
{% endfor %}
Thank you to everyone who made suggestions to get me to this point.

Django: pagination with prefetch_related

I have a model for specifications, which are divided into categories. For the template, in order to display the category as a header in the table, I use prefetch_related on Category like this:
categories = Category.objects.distinct().prefetch_related('specifications').filter(filters)
Now I can loop over the categories and show the related specifications like:
{% for category in categories %}
<tr>
<th colspan="7">{{ category.name }} - ({{ category.abbr }})</th>
</tr>
{% for specification in category.specifications.all %}
...
I also want to use the paginator, but now the paginator only counts the categories and not the related specifications. Is it possible to paginate on the specifications with the given query or should I change the query to retrieve all the specifications?
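Use Prefetch:
from django.db.models import Prefetch

categories = Category.objects.distinct().filter(filters)
category_ids = categories.values_list('id', flat=True)  # category ids on page
categories = categories.prefetch_related(
    Prefetch(
        'specifications',
        # Specification is the assumed name of the model behind the
        # 'specifications' relation (the answer originally wrote Specialization)
        queryset=Specification.objects.filter(category_id__in=category_ids)
    )
)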
This issues one more DB request (to fetch the category ids), but I think it costs less than prefetching all specifications. It depends on your data structure, but it is definitely one possible solution.
Have you tried django-tables2?
With it you can render the table with something like the following.
Create a CategoryTable class and use it in your view:
import django_tables2 as tables
from django.shortcuts import render

class CategoryTable(tables.Table):
    class Meta:
        model = Category

def your_view(request):
    ...
    # the parentheses let the chained calls span several lines
    categories = (
        Category.objects.distinct()
        .prefetch_related('specifications')
        .filter(filters)
    )
    table = CategoryTable(categories)
    table.paginate(page=request.GET.get("page", 1), per_page=25)
    return render(request, "category_template.html", {"table": table})
Then, in your template just put:
{% load django_tables2 %}
{% render_table table %}
What you are trying to achieve is an anti-pattern for prefetch_related. Prefetching fetches "all" the related rows, whereas your use case is to paginate the specifications.
Prefetching is good in cases where the number of related rows per parent row is around 5 (or up to 10). Remember that you waste DB network bandwidth as the number of prefetched child rows grows, so if you are considering pagination, it is best to avoid prefetching.
Note: Child rows = specifications, Parent rows = categories, in your use case.
My answer would be to avoid using prefetch in this case. Just use the following:
categories = Category.objects.distinct().filter(filters)
--
{% for category in categories %}
<tr>
<th colspan="7">{{ category.name }} - ({{ category.abbr }})</th>
</tr>
# Use some table lib like django-tables
...
If this is a simple internal project, go ahead with prefetch; otherwise you are going to hit DB performance issues.
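One way to paginate on the specifications themselves, as the question asks, is to fetch the specifications ordered by category, take one page of them, and regroup that page by category for display. The sketch below shows the idea in plain Python. The list of dicts stands in for an ordered queryset, and the field names, rows, and helper names are invented for illustration. In a Django project, Paginator would play the role of page_of, and the built-in {% regroup %} template tag could do the grouping.

```python
from itertools import groupby
from operator import itemgetter

# Stand-in for a specifications queryset ordered by category.
specifications = [
    {"category": "Engine", "name": "Power"},
    {"category": "Engine", "name": "Torque"},
    {"category": "Engine", "name": "Displacement"},
    {"category": "Body", "name": "Length"},
    {"category": "Body", "name": "Width"},
]


def page_of(items, page, per_page):
    """Return the 1-based `page` of `items`, `per_page` items at a time."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


def group_by_category(rows):
    """Group consecutive rows that share a category, preserving order."""
    return [
        (category, [row["name"] for row in group])
        for category, group in groupby(rows, key=itemgetter("category"))
    ]


first_page = group_by_category(page_of(specifications, page=1, per_page=2))
second_page = group_by_category(page_of(specifications, page=2, per_page=2))
print(first_page)   # [('Engine', ['Power', 'Torque'])]
print(second_page)  # [('Engine', ['Displacement']), ('Body', ['Length'])]
```

The page size now counts specifications, and a category header is repeated on the next page when its specifications spill over.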

Using queryset manager with prefetch_related

I have been successfully using this brilliant technique to keep my code DRY, encapsulating ORM relations in querysets so that the code in views stays simple and free of foreign-key dependencies. But this time I face the following issue, best described by code:
View:
vendors_qs = vendors_qs.select_related().prefetch_related('agreement_vendors')
Model
class AgreementVendorsQuerySet(models.query.QuerySet):
    def some_filter1(self, option):
        result = self.filter(.....)
        return result
    def some_filter2(self, option):
        result = self.filter(.....)
        return result
And a template:
{% for vendor in vendors_qs %}
<tr>
...
<td>
{% for vend_agr in vendor.agreement_vendors.all %}
{{vend_agr.serial_number}}
{% endfor %}
</td>
</tr>
{% endfor %}
The question is: how and where do I apply some_filter1 to the vendor agreements, given that they are fetched through a prefetch_related relation? Do I have to apply the filter in the template somehow, or in the view itself?
If I didn't put the question clearly enough, I will answer your questions to clarify further...
UPDATE:
Anna's answer looks very much like the truth, but one detail remains unclear: what if I want to apply several filters based on if-conditions? For example, if the filters were to apply to vendors, the code would simply look like:
if condition_1:
    vendors_qs = vendors_qs.filter1()
if condition_2:
    vendors_qs = vendors_qs.filter2()
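If I understand your question correctly, you need something like this, where some_filter stands for the already filtered agreements queryset:
vendors_qs = vendors_qs.prefetch_related(
    models.Prefetch(
        'agreement_vendors',
        queryset=some_filter,
        to_attr='agreement_vendors_list',
    )
)
And then in the template you can loop over it like {% for vend_agr in vendor.agreement_vendors_list %}. For conditional filters, build that agreements queryset step by step first and pass the final queryset to Prefetch.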

django best practice query foreign key

I am trying to understand the best way to structure queries in django to avoid excessive database hits.
This is similar to the question: Django best practice with foreign key queries, but involves greater 'depth' in the queries.
My situation:
models.py
class Bucket(models.Model):
    categories = models.ManyToManyField('Category')
class Category(models.Model):
    name = models.CharField(max_length=50)
class SubCategory(models.Model):
    category = models.ForeignKey(Category)
class SubSubCategory(models.Model):
    subcat = models.ForeignKey(SubCategory)
views.py
def showbucket(request, id):
    bucket = Bucket.objects.prefetch_related('categories',).get(pk=id)
    cats = bucket.categories.prefetch_related('subcategory_set__subsubcategory_set',)
    return render_to_response('showbucket.html', locals(), context_instance=RequestContext(request))
and relevant template:
{% for c in cats %}
{{c}}
<ul>
{% for d in c.subcategory_set.all %}
<li>{{d}}</li>
<ul>
{% for e in d.subsubcategory_set.all %}
<li>{{e}}</li>
{% endfor %}
</ul>
{% endfor %}
</ul>
{% endfor %}
Despite the use of prefetch_related(), I seem to be hitting the database each time the top two for statements are evaluated, e.g. {% for c in cats %}, (at least I believe so from reading the debug_toolbar). Other ways I've tried have ended up with (C x D x E) number of database hits. Is this something inherently wrong with my use of prefetch, queries, or models? What is the best way in Django to access database objects with a "depth > 1" so-to-speak?
Use select_related() instead:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.select_related
bucket = Bucket.objects.select_related('categories',).get(id=id)
cats = bucket.categories.select_related('subcategory_set__subsubcategory_set',)
So, I found out there are a couple of things going on here:
First, my current understanding on select_related vs prefetch_related:
select_related() follows foreign-key relationships, causing larger result sets but means that later use of FK won't require additional database hits. It is limited to FK and one-to-one relationships.
prefetch_related() does a separate lookup for each relationship and joins them in Python, and is meant to be used for many-to-many and many-to-one relationships, as well as GenericRelation and GenericForeignKey.
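To illustrate the "separate lookup, joined in Python" idea, here is a rough sketch with the standard-library sqlite3 module rather than Django. The schema is an assumption loosely based on Category and SubCategory above, and the code only imitates the strategy; it is not what Django executes internally.

```python
import sqlite3
from collections import defaultdict

# Hypothetical tables standing in for Category and SubCategory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subcategory (
        id INTEGER PRIMARY KEY,
        name TEXT,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'fruit'), (2, 'veg');
    INSERT INTO subcategory VALUES (1, 'citrus', 1), (2, 'berry', 1), (3, 'root', 2);
""")


def categories_with_children(conn):
    """Fetch parents and children in two queries, then join them in Python."""
    # Query 1: the parent rows.
    parents = conn.execute("SELECT id, name FROM category ORDER BY id").fetchall()
    parent_ids = [parent_id for parent_id, _ in parents]
    # Query 2: every child of those parents in a single IN (...) lookup.
    placeholders = ", ".join("?" for _ in parent_ids)
    children = conn.execute(
        "SELECT category_id, name FROM subcategory "
        f"WHERE category_id IN ({placeholders}) ORDER BY id",
        parent_ids,
    ).fetchall()
    # The "join" happens here, in Python, keyed on the foreign key value.
    by_parent = defaultdict(list)
    for category_id, name in children:
        by_parent[category_id].append(name)
    return [(name, by_parent[parent_id]) for parent_id, name in parents]


tree = categories_with_children(conn)
print(tree)  # [('fruit', ['citrus', 'berry']), ('veg', ['root'])]
```

The number of queries depends on the number of relationship levels, not on the number of rows, which is why each extra level of depth should add roughly one query.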
By the book, I should be using prefetch_related(), as I was not 'following' the foreign keys.
That's what I had understood going into this, but my template seemed to be causing additional queries when evaluating the for loops, even when I added {% with %} tags.
At first, I thought I had discovered something similar to this issue, but I was unable to replicate it when I built out my simplified example. I switched from using the debug toolbar to direct checking using the following template code (from the article Tracking SQL Queries for a Request using Django by Karen Tracey; I would link, but I am link-limited):
{% with sql_queries|length as qcount %}
{{ qcount }} quer{{ qcount|pluralize:"y,ies" }}
{% for qdict in sql_queries %}
{{ qdict.sql }} ({{ qdict.time }} seconds)
{% endfor %}
{% endwith %}
Using this method, I see only 5 queries when using prefetch_related() (7 with debug_toolbar), and the number of queries grows linearly when using select_related() (plus 2 for debug_toolbar), which I believe is expected.
I will gladly take any other advice/tools on getting a handle on these types of issues.

Filter Django Haystack results like QuerySet?

Is it possible to combine a Django Haystack search with "built-in" QuerySet filter operations, specifically filtering with Q() instances and lookup types not supported by SearchQuerySet? In either order:
haystack-searched -> queryset-filtered
or
queryset-filtered -> haystack-searched
Browsing the Django Haystack documentation didn't give any directions how to do this.
You could filter your queryset based on the results of a Haystack search, using the objects' PKs:
from haystack.forms import ModelSearchForm

def view(request):
    if request.GET.get('q'):
        form = ModelSearchForm(request.GET, searchqueryset=None, load_all=True)
        searchqueryset = form.search()
        # collect the primary keys of the matching search results
        results = [r.pk for r in searchqueryset]
        docs = Document.objects.filter(pk__in=results)
        # do something with your plain old regular queryset
        return render_to_response('results.html', {'documents': docs})
Not sure how this scales, but for small resultsets (a few hundred, in my case), this works fine.
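One caveat with the pk__in approach is that the database returns the rows in its own order, not in the order of the search ranking. If the ranking matters, the objects can be put back in search order in Python. The sketch below is plain Python: Doc is a hypothetical stand-in for the model, and ranked_pks plays the role of the results list above.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a model instance."""
    pk: int
    title: str


def in_search_order(ranked_pks, objects):
    """Reorder `objects` to follow `ranked_pks`, skipping PKs that were filtered out."""
    by_pk = {obj.pk: obj for obj in objects}
    return [by_pk[pk] for pk in ranked_pks if pk in by_pk]


ranked_pks = [42, 7, 19]                                # order from the search backend
fetched = [Doc(7, "b"), Doc(19, "c"), Doc(42, "a")]     # order from the database
ordered = in_search_order(ranked_pks, fetched)
print([doc.pk for doc in ordered])  # [42, 7, 19]
```

In Django itself, QuerySet.in_bulk() returns this kind of primary-key-to-object mapping, so it could replace the dictionary built by hand here.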
From the docs:
SearchQuerySet.load_all(self)
Efficiently populates the objects in the search results. Without using
this method, DB lookups are done on a per-object basis, resulting in
many individual trips to the database. If load_all is used, the
SearchQuerySet will group similar objects into a single query,
resulting in only as many queries as there are different object types
returned.
http://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#load-all
Therefore, after you have a filtered SQS, you can do a load_all() on it and just access the database objects via SearchResult.object. E.g.
from haystack.query import SearchQuerySet

sqs = SearchQuerySet()
# filter as needed, then load_all
sqs = sqs.load_all()
for result in sqs:
    my_obj = result.object
    # my_obj is your model object
If you want to preserve the relevance ordering, you have to access the database object through "object".
Example in your template:
{% for result in results %}
{{ result.object.title }}
{{ result.object.author }}
{% endfor %}
But this is really bad, since Haystack will make an extra request like "SELECT * FROM blah WHERE id = 42" for each result.
It seems you are fetching those objects from your database because you didn't put some extra fields in your index, is that right? If you add the title AND the author to your SearchIndex, you can just use your results:
{% for result in results %}
{{ result.title }}
{{ result.author }}
{% endfor %}
and avoid some extra queries.