Reducing db queries in django - django

I have a view that searches through a database of movie credits, and converts and returns results like so --
# From the following results:
Avatar - James Cameron - director
Avatar - James Cameron - writer
Avatar - James Cameron - editor
Avatar - Julie Jones - writer
Crash - John Smith - director
# ...display in the template as:
Avatar - James Cameron (director, writer, editor)
Avatar - Julie Jones (writer)
Crash - John Smith (director)
However, when I do this conversion and do print connection.queries I am hitting the database about 100 times. Here is what I currently have --
# in models
class VideoCredit(models.Model):
    video = models.ForeignKey(VideoInfo)
    # if the credit is a current user, FK to his profile,
    profile = models.ForeignKey('UserProfile', blank=True, null=True)
    # else, just add his name
    name = models.CharField(max_length=100, blank=True)
    # normalize name for easier searching / pulling of name
    normalized_name = models.CharField(max_length=100)
    position = models.ForeignKey(Position)
    timestamp = models.DateTimeField(auto_now_add=True)
    actor_role = models.CharField(max_length=50, blank=True)

class VideoInfo(models.Model):
    title = models.CharField(max_length=256, blank=True)
    uploaded_by = models.ForeignKey('UserProfile')
    ...

class Position(models.Model):
    position = models.CharField(max_length=100)
    ordering = models.IntegerField(max_length=3)

class UserProfile(models.Model):
    user = models.ForeignKey(User, unique=True)
    ...
In my view, I am building a list of three-tuples in the form of (name, video, [list_of_positions]) for the display of credits --
credit_set = VideoCredit.objects.filter(***depends on a previous function***)
list_of_credit_tuples = []
# I am creating a 'checklist' to see whether to append to the positions
# list or create a new tuple entry
checklist = []
for credit in credit_set:
    if credit.profile:  # check to see if the credit has an associated profile
        name = credit.profile
    else:
        name = credit.normalized_name
    if (credit.normalized_name, credit.video) in checklist:
        list_of_keys = [(name, video) for name, video, positions in list_of_credit_tuples]
        index = list_of_keys.index((name, credit.video))
        list_of_credit_tuples[index][2].append(credit.position)
    else:
        list_of_credit_tuples.append((name, credit.video, [credit.position]))
        checklist.append((credit.normalized_name, credit.video))
...
And finally, in my template to display the credits (note: if the credit has a profile, provide a link to profile of user) --
{% for name, video, positions in list_of_credit_tuples %}
    <p>{% if name.full_name %}
        {{name.full_name}}
    {% else %}
        {{name}}
    {% endif %}
    {{video}}
    ({% for position in positions %}{% ifchanged %}{{position}}{% endifchanged %}{% if not forloop.last %}, {% endif %}{% endfor %})
{% endfor %}
Why and where is this view creating so many db queries? How and in which ways could I make this view function more efficient / better? Thank you.

You will want to look into select_related() (https://docs.djangoproject.com/en/1.3/ref/models/querysets/#select-related) to cut down the number of queries. Every time your loop touches credit.profile, credit.video or credit.position, Django runs a separate query to load that related object, so the count grows with the number of credits. If you know ahead of time that you are going to read data from models related by a foreign key, add select_related() to the queryset so the related rows are fetched in the same query through a join. Even better, if you only need a couple of foreign keys, name just those fields.
Whenever Django runs far more queries than you expected, select_related() is usually the first thing to check.
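As a rough sketch of how the grouping loop in the question could look once the related objects arrive together with the credits: the version below uses plain namedtuples as stand-ins for VideoCredit rows (the Credit type and the group_credits helper are made up for illustration) and collects positions under a (name, video) key in an OrderedDict instead of rescanning the list on every iteration. In the real view the input would presumably be something like credit_set.select_related('profile', 'video', 'position').

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in for a VideoCredit row; in the real view the rows would
# come from something like:
#     credit_set.select_related('profile', 'video', 'position')
Credit = namedtuple("Credit", "profile normalized_name video position")


def group_credits(credits):
    """Return [(name, video, [positions])], keeping first-seen order."""
    grouped = OrderedDict()
    for credit in credits:
        # Prefer the linked profile, fall back to the plain name.
        name = credit.profile if credit.profile else credit.normalized_name
        positions = grouped.setdefault((name, credit.video), [])
        # Skip repeated positions for the same person and video.
        if credit.position not in positions:
            positions.append(credit.position)
    return [(name, video, positions) for (name, video), positions in grouped.items()]


credits = [
    Credit(None, "James Cameron", "Avatar", "director"),
    Credit(None, "James Cameron", "Avatar", "writer"),
    Credit(None, "James Cameron", "Avatar", "editor"),
    Credit(None, "Julie Jones", "Avatar", "writer"),
    Credit(None, "John Smith", "Crash", "director"),
]

for name, video, positions in group_credits(credits):
    print("%s - %s (%s)" % (video, name, ", ".join(positions)))
```

The grouping itself never touches the database; any remaining queries would come from how credit_set is built and from attribute access in the template.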

Try adding this Django snippet which returns the number of queries as well as the queries themselves to your template:
http://djangosnippets.org/snippets/159/
This should easily tell you where your leak is originating from.

Related

Using Django, how to count amount of free shelf space using a queryset?

I'm developing an inventory management web app that allows teams within an organisation to be assigned boxes to store things in.
There are multiple storage rooms and multiple shelves and sections. Every box is assigned a project.
I would like to write a queryset in Django that shows how many empty spaces there are in a given room (how many locations there are that do not have boxes assigned to them) e.g. for the above picture showing room A
Room: A
Empty Spaces: 4
Here is a simplified version of my code:
HTML:
{% for each_space in Room_data %}
{
<p>"Room": "Room {{each_space.loc_room}}",</p>
<p>"Empty Spaces": *** HERE I NEED HELP ***,</p>
},
{% endfor %}
Model:
class Location(models.Model):
    loc_room = models.CharField()
    loc_section = models.IntegerField()
    loc_shelf = models.CharField()

class Box(models.Model):
    box_contents = models.CharField()
    project_assigned_to = models.ForeignKey()
    Location = models.OneToOneField()

class Project(models.Model):
    project_name = models.CharField()
    project_manager = models.ForeignKey()
Views:
def dashboard(request):
    Room_data = Location.objects.all()
    return render(request, 'main_app/dashboard.html', {"Room_data": Room_data})
I've been stuck on this for a lot of today so I was hoping somebody might know the best direction forward. Thank you in advance.
You can obtain a list of locations with no Box with:
Location.objects.filter(box__isnull=True)
or even simpler:
Location.objects.filter(box=None)
We can obtain a QuerySet with such Locations for a given room some_room with:
Location.objects.filter(box=None, loc_room=some_room)
This will result in a query that looks like:
SELECT location.*
FROM location
LEFT OUTER JOIN box ON location.id = box.Location_id
WHERE box.id IS NULL
AND location.loc_room = some_room
We can also count the number of such Locations with:
Location.objects.filter(box=None, loc_room=some_room).count()
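If you want to retrieve the number of empty locations per room, we can group by room and annotate a count. Rooms without any empty location do not appear in the result, because the filter removes their rows before grouping:
from django.db.models import Count

Location.objects.values(
    'loc_room'
).filter(
    box__isnull=True
).annotate(
    nempty=Count('id')
).order_by('loc_room')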
If the view passes this queryset to the template as Room_data, we can render it like:
{% for each_space in Room_data %}
<p>"Room": "Room {{ each_space.loc_room }}",</p>
<p>"Empty Spaces": {{ each_space.nempty }},</p>
{% endfor %}
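To see what the grouped query computes without a Django project at hand, here is a small sketch using Python's built-in sqlite3 module. The table and column names (location, box, location_id) and the sample rows are assumptions for illustration, not the names Django would generate for the models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (id INTEGER PRIMARY KEY, loc_room TEXT);
    CREATE TABLE box (
        id INTEGER PRIMARY KEY,
        location_id INTEGER UNIQUE REFERENCES location(id)
    );
    INSERT INTO location (id, loc_room) VALUES
        (1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B');
    INSERT INTO box (id, location_id) VALUES (1, 1), (2, 4);
""")

# Keep only locations without a matching box, then count them per room.
rows = conn.execute("""
    SELECT location.loc_room, COUNT(location.id) AS nempty
    FROM location
    LEFT OUTER JOIN box ON location.id = box.location_id
    WHERE box.id IS NULL
    GROUP BY location.loc_room
    ORDER BY location.loc_room
""").fetchall()

for room, nempty in rows:
    print("Room %s: %d empty spaces" % (room, nempty))
```

A room whose locations are all occupied would be missing from rows entirely, which mirrors the behaviour of the filtered ORM query.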

Django : m2m relationship create two line instead of one

I have extended the UserModel this way :
# users/models.py
from django.contrib.auth.models import AbstractUser
from django.db import models

class CustomUser(AbstractUser):
    # add additional fields in here
    credit = models.IntegerField(default=200)
    follow = models.ManyToManyField('self', related_name='follow')

    def __str__(self):
        return self.username
But I am stuck as to how I should add/remove a follower. I have created a view with :
@login_required
def follow(request, user_id):
    user = get_object_or_404(CustomUser, pk=user_id)
    if CustomUser.objects.filter(follow=user.pk).exists():
        request.user.follow.remove(user)
    else:
        request.user.follow.add(user)
    return redirect('profil', user_id)
Issue :
Let's say request.user.pk is 1 and user_id is 2.
For the add part (in the else), I would expect one new row in the database with from_customuser_id=1 and to_customuser_id=2. However, it creates two rows:
one with from_customuser_id=1 and to_customuser_id=2, as expected
one with from_customuser_id=2 and to_customuser_id=1, which I don't need.
And for the remove part (in the if), I would expect it to remove only the row
from_customuser_id=1 and to_customuser_id=2
But it removes both rows.
I read the docs about Django model relations but didn't find how to solve this issue.
Question :
How should I update my code so that the add method inserts only one row with from_customuser_id=1, to_customuser_id=2, and the remove method deletes only that row (assuming the current user has the id 1)?
Not sure if it is relevant but for sake of completeness this is the related part of my urls.py :
path('follow/<int:user_id>', views.follow, name='follow'),
path('unfollow/<int:user_id>', views.follow, name='unfollow'),
And this is how I call them in templates :
{% if follow %}
<a href="{% url 'follow' user_profil.id %}">
Unfollow {{ user_profil.username }}
</a>
{% else %}
<a href="{% url 'unfollow' user_profil.id %}">
Follow {{ user_profil.username }}
</a>
{% endif %}
When you have a ManyToManyField it essentially creates a relationship between both the objects. This also allows you to do reverse lookups.
For example:
class Person(models.Model):
    name = models.CharField(max_length=100)

class Pet(models.Model):
    owners = models.ManyToManyField(Person, related_name="pets")
    name = models.CharField(max_length=100)

bob = Person.objects.create(name="Bob")
john = Person.objects.create(name="John")
kitty_kat = Pet.objects.create(name="Kitty Kat")
kitty_kat.owners.set([bob, john])
According to these models one pet can be owned by multiple people, and one person can have multiple pets. So if I do
bob.pets.all() # I get kitty kat
kitty_kat.owners.all() # I get bob & john
When the relationship points at the same model, Django treats it as symmetrical by default: it does not add a reverse accessor, and every add() stores the relation in both directions, which is why you see two rows.
For example:
class Person(models.Model):
    name = models.CharField(max_length=100)
    followers = models.ManyToManyField('self')

bob = Person.objects.create(name="Bob")
john = Person.objects.create(name="John")
john.followers.add(bob)
john.followers.all() # I get bob
bob.followers.all() # I also get john, because of the mirrored row
To avoid this, pass symmetrical=False to the field; then only one row is created:
followers = models.ManyToManyField('self', related_name='+', symmetrical=False)
Setting related_name to '+' (or to any name ending in '+') also disables the reverse lookup (which you don't need in this case)
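The difference between the two modes can be illustrated with a toy model of the join table. This is plain Python, not Django code: the FollowTable class and its methods are invented here only to mimic which rows get written and deleted in each mode.

```python
class FollowTable:
    """Toy stand-in for the many-to-many join table (not Django's implementation)."""

    def __init__(self, symmetrical):
        self.symmetrical = symmetrical
        self.rows = set()  # each row is (from_id, to_id)

    def add(self, from_id, to_id):
        self.rows.add((from_id, to_id))
        if self.symmetrical:
            # A symmetrical relation also stores the mirrored row.
            self.rows.add((to_id, from_id))

    def remove(self, from_id, to_id):
        self.rows.discard((from_id, to_id))
        if self.symmetrical:
            # ...and removes the mirrored row again.
            self.rows.discard((to_id, from_id))


symmetric = FollowTable(symmetrical=True)
symmetric.add(1, 2)
print(sorted(symmetric.rows))  # both directions are stored

one_way = FollowTable(symmetrical=False)
one_way.add(1, 2)
print(sorted(one_way.rows))  # only the row that was asked for
```

With the one-way table, user 1 following user 2 says nothing about user 2 following user 1, which is the behaviour the question asks for.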

Django | How to make a reverse Relationship with two models

I want to make a ticket system. I have the main model (TicketSystem) and a model with the messages from the users (TicketSystem_Messages).
In the model "TicketSystem_Messages" is the ForeignKey to the model "TicketSystem".
Here is my code:
class TicketSystem(models.Model):
    subject = models.CharField(_('Subject'), max_length=30, blank=False, default="N/A")
    message = models.TextField(_('Message'), null=False, blank=False)
    created_date = models.DateTimeField(_('Date'), default=datetime.utcnow().replace(tzinfo=utc))
    fertig_date = models.DateTimeField(_('Date'), default=datetime.utcnow().replace(tzinfo=utc))

class TicketSystem_Messages(models.Model):
    user_name = models.ForeignKey(User)
    status = models.ForeignKey(TicketSystem_Status)
    ticketid = models.ForeignKey(TicketSystem)
    message = models.TextField(_('Message'), null=False, blank=False)
    created_date = models.DateTimeField(_('Sent'), default=datetime.utcnow().replace(tzinfo=utc))
At the moment I get the tickets without the messages:
sql_TicketSystem = TicketSystem.objects.filter(id=kwargs['pk'])
I want to make a LEFT JOIN like this
SELECT * FROM TicketSystem LEFT JOIN TicketSystem_Messages ON Ticketsystem.id = TicketSystem_Messages.ticketid
I heard something about "select_related" and "prefetch_related" and tried it but it does not work.
Rather than using raw SQL joins, in Django you can traverse model relationships in either direction, regardless of how they are represented in the database.
To find all the messages for a given TicketSystem instance:
my_ticket = TicketSystem.objects.get(id=0) # or something
my_ticket_messages = my_ticket.ticketsystem_messages_set.all() # or do other filters here
To find all the messages using a queryset:
TicketSystem_Messages.objects.filter(ticketid=my_ticket)
To find all tickets with more than one message:
from django.db.models import Count
TicketSystem.objects.annotate(message_count=Count('ticketsystem_messages')).filter(message_count__gt=1)
If I want to list all tickets with the last status from the model "TicketSystem_Messages", it doesn't work, although with "TicketSystem.objects.get" it works without any problems.
Example:
sql_TicketSystem = TicketSystem.objects.all()
sql_TicketSystem_Messages = sql_TicketSystem.ticketsystem_messages_set.all()
UPDATE:
{% for data in sql_TicketSystem %}
{{ data.subject }}
{% for msg in data.ticketsystem_messages_set.all %}
{{msg.status}}
{% empty %}
Unknown
{% endfor %}
{% endfor %}
This works :-)

Django left join on models

Assume you have two models. Let's name them computer and log_file.
I want to make a join so that I always display all computer objects, but if there is a related log_file object it is also added to the query result.
What is the best way to achieve this? Relationship: one computer has one log file, but it may not have been uploaded to the database yet. It only gets uploaded when my script throws an error.
Sorry for my beginner question, I'm new to Django.
following simplified models as an example:
Model 1: computer
id (pk)
computer name
mac address (unique)
Model 2: log file / max one for each computer
id (pk)
mac address
text field
required query: list of all computers and also the log file if there is any in the database. The join is made by the value of the mac address.
I think the SQL query would look like the following, but I am not able to translate it to the ORM.
SELECT *
FROM computer
LEFT JOIN log_file ON computer.mac_address = log_file.mac_address;
Thank you!
I found a workaround using a calculated field. See the following question:
How to add a calculated field to a Django model
Everything below @property
I added some extra code to my model:
class ConnecThor(models.Model):
    # Incrementing ID (created automatically)
    hostname = models.CharField(max_length=40)
    MAC_address = models.CharField(max_length=17,
                                   default='00.00.00.00.00.00', help_text="Example format: 255.255.255.255",
                                   primary_key=True)
    IP_address = models.CharField(max_length=15, default='000.000.000.000', help_text="Example format: 192.148.0.1")
    config_version = models.CharField(max_length=10, default='000')
    creation_date = models.DateTimeField(auto_now_add=True, null=True,
                                         editable=False)  # timezone.now() applied during makemigrations

    @property
    def log_file(self):
        try:
            return LogFiles.objects.get(MAC_address=self.MAC_address)
        except LogFiles.DoesNotExist:
            print('logfile not found')

    class Meta(object):
        db_table = 'connecthor'  # table name
And to my template:
{% if connecthor.log_file %}
{% if connecthor.log_file.viewed > 0 %}
<td>Approved</td>
{% else %}
<td>Failed update</td>
{% endif %}
{% else %}
<td><label class="badge badge-warning">Not Found</label></td>
{% endif %}
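For reference, the LEFT JOIN from the question can be tried out with Python's built-in sqlite3 module. The table layout and sample rows below are assumptions for illustration; every computer comes back, with NULL (None in Python) where no log file has been uploaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE computer (
        id INTEGER PRIMARY KEY, name TEXT, mac_address TEXT UNIQUE
    );
    CREATE TABLE log_file (
        id INTEGER PRIMARY KEY, mac_address TEXT, content TEXT
    );
    INSERT INTO computer (name, mac_address) VALUES
        ('alpha', '00:00:00:00:00:01'),
        ('beta',  '00:00:00:00:00:02');
    INSERT INTO log_file (mac_address, content) VALUES
        ('00:00:00:00:00:02', 'update failed');
""")

# One query returns all computers plus the log text where one exists.
rows = conn.execute("""
    SELECT computer.name, log_file.content
    FROM computer
    LEFT JOIN log_file ON computer.mac_address = log_file.mac_address
    ORDER BY computer.name
""").fetchall()

for name, content in rows:
    print(name, "->", content if content is not None else "no log file")
```

The property-based workaround looks up the log file separately for each computer, whereas a join like this one fetches everything in a single query.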

Prefetch related starting from a single object - getting first in second prefetch and count and order

I have 3 models: Product, Company and Category.
class Product(Meta):
    categories = models.ManyToManyField(Category)
    company = models.ForeignKey(Company, related_name='products', on_delete=models.CASCADE)
    updated_at = models.DateTimeField(auto_now_add=False, auto_now=True)
I need:
to get all the products of a company
show the product first category
count the number products per company and show
order products by reverse updated_at
I start from:
1. Company.objects.get(pk=company_pk).prefetch_related('products')
will give me an error, because get returns an object:
class CompanyProductListView(ListView):
    model = Company
    template_name_suffix = '_company_list'

    def get_queryset(self):
        company_pk = self.kwargs.get('pk')
        return Company.objects.get(pk=company_pk).prefetch_related('products')
get without prefetch works.
return Company.objects.filter(pk=company_pk).prefetch_related('products')
there is no error, but in template:
{% for company in company_list %}
{{ company.name}}
{% endfor %}
I loop even though there is only one company, but nothing is displayed.
Besides that, I need to attach the first category to each product and count the number of products.
I'm thinking of accessing something like this:
{{company.name}}
{% for product in company.products %}
{{ product.name }}
{{ product.category }}
This query will get a little complicated, but should help you solve your issue.
PS: I haven't tested this but should mostly work. Will take a deeper look once I get some more time.
First we get the company we want:
company = Company.objects.get(pk=company_pk)
Then we fetch the first category of every product; this can be done using this question as a guide (note that distinct() with field names is only supported on PostgreSQL):
first_categories = Category.objects.order_by('product__id', '-id').distinct('product__id')
Now we use first_categories to limit the amount of data we prefetch (from a different perspective: we query the Product model instead of the Company model)
from django.db.models import Prefetch

product_list = Product.objects.filter(company=company).prefetch_related(
    Prefetch('categories', queryset=first_categories)
)
Put together in the view:
def get_queryset(self):
    company_pk = ...
    company = ...
    first_categories = ...
    product_list = ...
    return product_list