Do I need to optimize database access in django templates? - django

Say I have 3 models:
class Franchises(models.Model):
    name = models.CharField()

class Stores(models.Model):
    franchise = models.ForeignKey(Franchises)
    name = models.CharField()

class Items(models.Model):
    store = models.ForeignKey(Stores)
    name = models.CharField()
In the view:
items = Items.objects.all()
In the template:
{% for item in items %}
    <div>{{ item.store.franchise.name }}</div>
{% endfor %}
Will evaluating item.store.franchise.name hit the database, and what do I need to do to optimize the database access?

Queries in Django are lazy, meaning that they resolve as little as possible until the data is actually accessed. As written, each item triggers one query for its store and another for that store's franchise, so the loop issues two extra queries per item (many queries).
If you know you will need those relations, you can tell the query to fetch them all right away:
Items.objects.select_related('store__franchise').all()
Django will do the joins ahead of time so that the related objects are already cached on each result. When you access them, no further queries are triggered.
The Django QuerySet documentation has more information on select_related.
A really cool way to test this out is to start the Django shell with ./manage.py shell and look at the queries that get issued:
from django import db
from django.db import connection

# connection.queries is only populated when settings.DEBUG is True
db.reset_queries()
items = list(Items.objects.all())
print(connection.queries)

db.reset_queries()
items = list(Items.objects.select_related('store__franchise').all())
print(connection.queries)
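To make the difference concrete outside of Django, here is a minimal sketch using only the standard-library sqlite3 module. It is not what the ORM runs internally; the table names and sample data are hypothetical stand-ins for the three models, and it only illustrates the SQL pattern: one lookup per relation per item versus a single joined query.

```python
import sqlite3

# Hypothetical tables mirroring Franchises -> Stores -> Items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE franchise (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE store (id INTEGER PRIMARY KEY, franchise_id INTEGER, name TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, store_id INTEGER, name TEXT);
    INSERT INTO franchise VALUES (1, 'Acme');
    INSERT INTO store VALUES (1, 1, 'Downtown'), (2, 1, 'Uptown');
    INSERT INTO item VALUES (1, 1, 'hammer'), (2, 1, 'nails'), (3, 2, 'saw');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed


def lazy_names():
    """One query for the items, then two more per item (store, then franchise)."""
    names = []
    for (store_id,) in conn.execute("SELECT store_id FROM item ORDER BY id").fetchall():
        (franchise_id,) = conn.execute(
            "SELECT franchise_id FROM store WHERE id = ?", (store_id,)
        ).fetchone()
        (name,) = conn.execute(
            "SELECT name FROM franchise WHERE id = ?", (franchise_id,)
        ).fetchone()
        names.append(name)
    return names


def joined_names():
    """A single query that joins all three tables up front."""
    rows = conn.execute("""
        SELECT franchise.name FROM item
        JOIN store ON store.id = item.store_id
        JOIN franchise ON franchise.id = store.franchise_id
        ORDER BY item.id
    """).fetchall()
    return [name for (name,) in rows]


def count_selects(fn):
    """Run fn and return (its result, how many SELECT statements it issued)."""
    statements.clear()
    result = fn()
    return result, sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))


lazy_result, lazy_queries = count_selects(lazy_names)
joined_result, joined_queries = count_selects(joined_names)
print(lazy_queries, joined_queries)
```

Both functions return the same names; only the number of round trips differs, and the gap grows with the number of items.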

Related

Django Loading up Class based model data on a template

I am new to Django. Before this I had only made one project, and that was just a follow-along video, so the fix is probably very easy. I am trying to make a restaurant reservation page. Later on there will be user authentication so that each user only sees the reservations they made, but right now I want to display all of the reservations in my model and learn how to actually display the data.
Here's the model:
https://gyazo.com/066e3e060492990008d608a012f588f3
here's the view: https://gyazo.com/6947ed97d84b38f1e73680e28f3a0a9a
Here's the template: https://gyazo.com/966c4810b3c7f4dd8dad2e5b71c2179c
I spent about 3 hours watching videos of other people loading their model data into their websites. When they do it I understand everything, but for some reason I can't apply it to my own project. My guess is that my for loop is wrong, since I have a hard time writing them and always seem to mess them up. Any help that gets me looking in the right direction would be appreciated.
When using class-based views you have to work with Django's conventions. For example, where you have reservations = Reservation.objects.all(), reservations is not a class attribute that the class-based view recognizes. What you can do is rename it to queryset instead.
from django.views.generic import ListView
from .models import Reservation  # assumption: Reservation is defined in this app's models.py

class ReservationList(ListView):
    model = Reservation
    queryset = Reservation.objects.all()
    context_object_name = 'reservations'  # using this attribute to alias the queryset
    template_name = "make_a_reservation.html"
This way you can now use the name reservations in your template as you did with:
{% for i in reservations %}
    ...
{% endfor %}
That should work.

Design Question: Low load data aggregation for overview page

What is the best way to achieve low load on the database or application server for this use case:
Let's say I want to build a web application that has an overview page for each user. The overview page shows the user's data in aggregated form. For example, if it were a library application, it would show how many times the user visited the library in total, how many books they read in total, how many books were returned late in total, and how many minutes they spent in the building. Each time the user visits the overview page, the up-to-date values should be displayed. The numbers change while the user interacts with the site.
What I could do is for every overview page refresh do several counts in the database. But that would be expensive.
views.py
def overview(request, userID):
    booksCount = Book.objects.count()
    booksReadCount = Book.objects.filter(UserID=userID, Status='read').count()
    # ... many more, same way
    libraryVisitedCount = LibraryVisits.objects.filter(UserID=userID).count()
    # many counts like these on different tables for the user
    data = {
        "booksCount": booksCount,
        "booksReadCount": booksReadCount,
        # ... many more
        "libraryVisitedCount": libraryVisitedCount,
    }
    return render(..., context=data)
I have thought about storing a JSON object with the overview data in a database table and updating that JSON each time an event happens on the site that affects one of the counts.
Or I could use a materialized view, but to refresh it I would have to recalculate the data of all users each time, right?
Other ideas? I'm using the Django web framework and a Postgres database.
TL;DR: Isn't there a better way to get these counts than running several COUNT queries against the database each time?
Thanks.
Let's say the Book, LibraryVisit, etc. models each have a ForeignKey to the User model with a related_name, like this:
class Book(models.Model):
    UserID = models.ForeignKey(User, related_name='books', on_delete=models.DO_NOTHING)

class LibraryVisit(models.Model):
    UserID = models.ForeignKey(User, related_name='library_visit', on_delete=models.DO_NOTHING)
Then you can use annotations with conditional aggregation like this:
from django.db.models import Count, Q

def overview(request, userID):
    # distinct=True keeps the two reverse-relation joins from multiplying each other's rows
    users = User.objects.filter(pk=userID).annotate(
        book_read_count=Count('books', filter=Q(books__Status='read'), distinct=True),
        library_visited_count=Count('library_visit', distinct=True),
    )
    # FYI: please use snake_case for attributes such as model fields, per the PEP 8 style guide
    data = {
        "user_object": users.first(),  # first item of the User queryset; the DB is hit once in this step
        "booksCount": Book.objects.count(),
    }
    # access the counts in the view like this:
    # data["user_object"].book_read_count
    # data["user_object"].library_visited_count
    return render(..., context=data)
# 'books' and 'library_visit' are the related_name values defined on the foreign keys
And render counts in template like this:
{{ user_object.book_read_count }}
{{ user_object.library_visited_count }}
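One caveat when a single query aggregates over two reverse relations at once: the joins multiply each other's rows, so plain SUM/COUNT values come out inflated unless the counts are made distinct. The sketch below shows this at the SQL level with the standard-library sqlite3 module; the table layout and sample data are hypothetical and only approximate what the ORM would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY);
    CREATE TABLE book (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
    CREATE TABLE library_visit (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO app_user VALUES (1);
    -- user 1 has three books (two of them read) and four library visits
    INSERT INTO book (user_id, status) VALUES (1, 'read'), (1, 'read'), (1, 'unread');
    INSERT INTO library_visit (user_id) VALUES (1), (1), (1), (1);
""")

JOINS = """
    FROM app_user u
    LEFT JOIN book b ON b.user_id = u.id
    LEFT JOIN library_visit v ON v.user_id = u.id
    WHERE u.id = 1
"""

# Naive aggregation: every book row is repeated once per visit and vice versa.
naive = conn.execute(
    "SELECT SUM(CASE WHEN b.status = 'read' THEN 1 ELSE 0 END), COUNT(v.id)" + JOINS
).fetchone()

# Distinct aggregation: count each book and each visit only once.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT CASE WHEN b.status = 'read' THEN b.id END),"
    " COUNT(DISTINCT v.id)" + JOINS
).fetchone()

print(naive, distinct)
```

The naive query reports 8 read books and 12 visits for a user who has 2 and 4; the distinct variant returns the real numbers while still using a single query.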

Should I use ArrayField or ManyToManyField for tags

I am trying to add tags to a model for a postgres db in django and I found two solutions:
using foreign keys:
class Post(models.Model):
    tags = models.ManyToManyField('Tag')
    ...

class Tag(models.Model):
    name = models.CharField(max_length=140)
using array field:
from django.contrib.postgres.fields import ArrayField

class Post(models.Model):
    tags = ArrayField(models.CharField(max_length=140))
    ...
Assuming that I don't care about supporting other database backends in my code, which solution is recommended?
If you use an ArrayField:
Each row in your DB is going to be somewhat larger, so Postgres will make more use of TOAST tables.
Every time you fetch the row, you pay the cost of loading all those values, unless you explicitly defer the field or otherwise exclude it from the query via only(), values(), or similar. If that's what you need, then so be it.
Filtering on values in that array, while possible, isn't as convenient, and the Django ORM doesn't make it as obvious as it does for M2M tables.
If you use an M2M field:
You can filter more easily on the related values.
The related rows are not loaded by default. You can use prefetch_related if you need them, and then get fancy if you want only a subset of those values loaded.
Total storage in the DB is going to be slightly higher with M2M because of the keys and the extra id fields.
The cost of the joins in this case is negligible because the join columns are keys.
That said, the answer above isn't mine. I ran into this dilemma a while ago when I was learning Django, and I found the answer in this question: Django Postgres ArrayField vs One-to-Many relationship.
I hope this is what you were looking for.
If you want the tags to be tracked (for example, how many tags there are, how often a particular tag is used, etc.), then go for the first option, since you can add more fields to the Tag model, which adds richness to the app.
On the other hand, if you just want a plain list of tags for display or minimal processing, go for the ArrayField option.
But if you wish to save time and still add richness to the app, you can use this:
https://github.com/alex/django-taggit
It is as simple as this to initialise:
from django.db import models
from taggit.managers import TaggableManager

class Food(models.Model):
    # ... fields here
    tags = TaggableManager()
and can be used in the following way :
>>> apple = Food.objects.create(name="apple")
>>> apple.tags.add("red", "green", "delicious")
>>> apple.tags.all()
[<Tag: red>, <Tag: green>, <Tag: delicious>]
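To illustrate the points above about filtering and per-tag statistics with a separate tag table, here is a rough sketch using the standard-library sqlite3 module in place of Postgres. The three tables approximate what a ManyToManyField sets up (two model tables plus a join table); all table names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);  -- the M2M join table
    INSERT INTO post VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO tag VALUES (1, 'django'), (2, 'postgres');
    INSERT INTO post_tags VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 2);
""")


def posts_tagged(name):
    """Titles of all posts carrying the given tag, found through the join table."""
    rows = conn.execute("""
        SELECT post.title FROM post
        JOIN post_tags ON post_tags.post_id = post.id
        JOIN tag ON tag.id = post_tags.tag_id
        WHERE tag.name = ? ORDER BY post.id
    """, (name,)).fetchall()
    return [title for (title,) in rows]


def tag_usage():
    """How many posts use each tag, including tags that are not used at all."""
    return conn.execute("""
        SELECT tag.name, COUNT(post_tags.post_id) FROM tag
        LEFT JOIN post_tags ON post_tags.tag_id = tag.id
        GROUP BY tag.id ORDER BY tag.name
    """).fetchall()


print(posts_tagged("postgres"))
print(tag_usage())
```

Because tags live in their own table, questions like "which posts have this tag" or "how often is each tag used" become ordinary joins and GROUP BY queries.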

Django prefetch_related query not working as required, troubleshooting needed

I have two simple Django models:
class PhotoStream(models.Model):
    cover = models.ForeignKey('links.Photo')
    creation_time = models.DateTimeField(auto_now_add=True)

class Photo(models.Model):
    owner = models.ForeignKey(User)
    which_stream = models.ManyToManyField(PhotoStream)
    image_file = models.ImageField(upload_to=upload_photo_to_location, storage=OverwriteStorage())
Currently the only data I have is 6 photos, that all belong to 1 photostream. I'm trying the following to prefetch all related photos when forming a photostream queryset:
queryset = PhotoStream.objects.order_by('-creation_time').prefetch_related('photo_set')
for obj in queryset:
    print(obj.photo_set.all())
# print(connection.queries)
Checking via the debug toolbar, I've found that the above runs exactly the same number of queries as it does when I remove the prefetch_related part of the statement. It's clearly not working. I've tried prefetch_related('cover') as well, and that doesn't work either.
Can anyone point out what I'm doing wrong, and how to fix it? My goal is to get all related photos for every photostream in the queryset. How can I possibly do this?
Printing connection.queries after running the for loop includes, among other things:
SELECT ("links_photo_which_stream"."photostream_id") AS "_prefetch_related_val", "links_photo"."id", "links_photo"."owner_id", "links_photo"."image_file" FROM "links_photo" INNER JOIN "links_photo_which_stream" ON ("links_photo"."id" = "links_photo_which_stream"."photo_id") WHERE "links_photo_which_stream"."photostream_id" IN (1)
Note: I've simplified my models posted in the question, hence the query above doesn't include some fields that actually appear in the output, but are unrelated to this question.
Here are some extracts from the prefetch_related documentation:
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python.
And, some more:
>>> Pizza.objects.all().prefetch_related('toppings')
This implies a self.toppings.all() for each Pizza; now each time self.toppings.all() is called, instead of having to go to the database for the items, it will find them in a prefetched QuerySet cache that was populated in a single query.
The _prefetch_related_val column in the query you posted shows that the prefetch is in fact running. With a single photostream, the number of queries is the same with or without prefetch_related: one for the photostreams and one for the photos. The difference appears once there are more photostreams. Without prefetch_related, each obj.photo_set.all() hits the database, so you get one extra query per photostream; with it, each call reads from the prefetched QuerySet cache that was built by a single extra query.
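The following sketch mimics that behaviour with the standard-library sqlite3 module: one query for the parents, one IN query for all related rows, and the grouping done in Python. It is an approximation of the idea rather than Django's implementation; the table names follow the query shown in the question, and the sample data is made up.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photostream (id INTEGER PRIMARY KEY);
    CREATE TABLE photo_which_stream (photo_id INTEGER, photostream_id INTEGER);
    INSERT INTO photostream VALUES (1), (2), (3);
    INSERT INTO photo_which_stream VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3);
""")

selects = []


def record(sql):
    # Keep only SELECT statements so the counts below are easy to read.
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)


conn.set_trace_callback(record)


def stream_ids(limit):
    rows = conn.execute("SELECT id FROM photostream ORDER BY id LIMIT ?", (limit,))
    return [sid for (sid,) in rows]


def photos_lazy(limit):
    """One query for the streams, then one more query per stream."""
    result = {}
    for sid in stream_ids(limit):
        rows = conn.execute(
            "SELECT photo_id FROM photo_which_stream"
            " WHERE photostream_id = ? ORDER BY photo_id", (sid,))
        result[sid] = [pid for (pid,) in rows]
    return result


def photos_prefetched(limit):
    """One query for the streams, one IN query for all photos, grouped in Python."""
    ids = stream_ids(limit)
    placeholders = ", ".join("?" * len(ids))
    grouped = defaultdict(list)
    rows = conn.execute(
        "SELECT photostream_id, photo_id FROM photo_which_stream"
        f" WHERE photostream_id IN ({placeholders}) ORDER BY photo_id", ids)
    for sid, pid in rows:
        grouped[sid].append(pid)
    return {sid: grouped[sid] for sid in ids}


def count_queries(fn, limit):
    selects.clear()
    result = fn(limit)
    return result, len(selects)


lazy_one, lazy_one_n = count_queries(photos_lazy, 1)
pre_one, pre_one_n = count_queries(photos_prefetched, 1)
lazy_all, lazy_all_n = count_queries(photos_lazy, 3)
pre_all, pre_all_n = count_queries(photos_prefetched, 3)
print(lazy_one_n, pre_one_n, lazy_all_n, pre_all_n)
```

With one stream both approaches issue two queries, which matches what the debug toolbar showed; with three streams the per-stream version needs four while the prefetch-style version still needs two.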

Select a single field from a foreign key

So I have a very simple (simplified) model:
class MyObject(models.Model):
owning_user = models.ForeignKey(User, null=True)
Now in one of my templates I'm trying to iterate over a list of these objects to determine whether something should be displayed, similar to this:
{% for my_object in foo.my_object_set.all %}
    {% if my_object.owning_user.id == user.id %}
        Show Me!
    {% endif %}
{% endfor %}
This works fine, but what I am finding is that the expression
my_object.owning_user.id
fetches every field of the owning user before getting the id. I verified this both in django-debug-toolbar and by checking the connection queries:
# django-debug-toolbar states this is repeated multiple times
SELECT ••• FROM "auth_user" WHERE "auth_user"."id" = 1
# The following test code also confirms this
from django.db import connection
conn = connection
bearing_type.owning_user.id
print(conn.queries[-1])
Now since this query repeats over 1000 times and takes 2ms per query it is taking 2 seconds just to perform this - when all I care about is the id...
Is there any way at all to run a query that gets just the id from the owning_user instead of querying all the fields?
Note, I'm trying really hard here to avoid making a raw query
If you use my_object.owning_user_id instead of my_object.owning_user, then Django will use the id instead of looking up the user.
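In this case, you only need the id, but if you needed other user attributes, you could use select_related. In the view, you would do:
foo.my_object_set.select_related('owning_user')
In the template, you don't have as much control, since you can't pass arguments. You can call select_related without arguments, but that only follows non-null foreign keys, so it would not follow owning_user here because it is declared with null=True:
{{ foo.my_object_set.select_related }}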
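Why owning_user_id is free while owning_user.id is not can be pictured with a small toy model. This is plain Python, not Django code: FakeUserTable and MyObjectRow are hypothetical stand-ins that only imitate the idea that the foreign key value is already stored on the row, whereas the related object has to be looked up on first access.

```python
class FakeUserTable:
    """Stands in for the auth_user table and counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.lookups = 0

    def get(self, pk):
        self.lookups += 1
        return self.rows[pk]


class MyObjectRow:
    """Toy stand-in for a model instance with a lazily loaded foreign key."""

    def __init__(self, owning_user_id, users):
        self.owning_user_id = owning_user_id  # already present on this row
        self._users = users
        self._cached_user = None

    @property
    def owning_user(self):
        # The first access triggers a lookup; later accesses reuse the cached object.
        if self._cached_user is None and self.owning_user_id is not None:
            self._cached_user = self._users.get(self.owning_user_id)
        return self._cached_user


users = FakeUserTable({1: {"id": 1, "username": "alice"}})
rows = [MyObjectRow(1, users) for _ in range(1000)]

ids_direct = [row.owning_user_id for row in rows]
lookups_after_direct = users.lookups

ids_via_relation = [row.owning_user["id"] for row in rows]
lookups_after_relation = users.lookups

print(lookups_after_direct, lookups_after_relation)
```

Reading the stored key never touches the user table, while going through the relation costs one lookup per row, which is the repeated query the debug toolbar reported.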