Use of select_related in a simple query in Django

I have a model in Django in which a field has a foreign key relationship to the Teacher model. I have come across select_related in Django and want to use it in my view, but I am not sure whether I should use it in my query.
My models:
from django.db import models
class Teacher(models.Model):
    # CharField assumed: the original declared OneToOneField here, which takes a related model, not max_length
    name = models.CharField(max_length=255, default="", blank=True)
    address = models.CharField(max_length=255, default="", blank=True)
    college_name = models.CharField(max_length=255, default="", blank=True)
class OnlineClass(models.Model):
    teacher = models.ForeignKey(Teacher, on_delete=models.CASCADE)
My view:
def get(self, request, *args, **kwargs):
    teacher = self.request.user.teacher
    classes = OnlineClass.objects.filter(teacher=teacher)  # the confusion is here
    serializer_class = self.get_serializer_class()
    serializer = serializer_class(classes, many=True)
    return Response(serializer.data, status=status.HTTP_200_OK)
I have marked the line in question with a comment. I want to list all the classes of that teacher, and here I have used filter. Can select_related be used here? My understanding is that I only need it if I also want to show other fields of the Teacher model, e.g. name or college_name, and that otherwise the way I have done it is correct. Also, is select_related only used for GET APIs and not for POST APIs?

First, the easiest way to get all classes per teacher is by using the related_name attribute (https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.ForeignKey.related_name).
class OnlineClass(models.Model):
    teacher = models.ForeignKey(
        Teacher,
        on_delete=models.CASCADE,
        related_name='classes'
    )
# All classes of a teacher
teacher.classes.all()
When select_related is used, Django adds SQL joins to the query it generates, so the related rows are fetched together with the main ones. This reduces the number of queries sent to the database and gets the data faster, and yes, it only matters when reading data.
for obj in OnlineClass.objects.all():
    # This hits the database on every iteration to get the teacher data,
    # with a new query like: select * from teacher_table where id = ...
    print(obj.teacher)
for obj in OnlineClass.objects.select_related('teacher').all():
    # This doesn't hit the database again.
    # The Django ORM has already fetched the
    # OnlineClass and Teacher data with a single joined SQL query.
    print(obj.teacher)
I think that in your example, with only one teacher, using select_related or not doesn't make a big difference.
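To make the difference in query counts concrete, here is a minimal sketch that uses the standard-library sqlite3 module instead of the ORM. The table names, the title column and the sample rows are assumptions made up for illustration; the SQL only approximates what Django would generate.

```python
import sqlite3

# Hypothetical tables standing in for the Teacher and OnlineClass models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE onlineclass (
        id INTEGER PRIMARY KEY,
        title TEXT,
        teacher_id INTEGER REFERENCES teacher(id)
    );
    INSERT INTO teacher VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO onlineclass VALUES
        (1, 'Algebra', 1), (2, 'Logic', 1), (3, 'Compilers', 2);
""")

query_count = 0

def run(sql, params=()):
    """Execute one statement and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# Like OnlineClass.objects.all(): one query for the classes,
# then one extra query per class to load its teacher.
query_count = 0
lazy = []
for class_id, title, teacher_id in run(
    "SELECT id, title, teacher_id FROM onlineclass ORDER BY id"
):
    (teacher_name,) = run("SELECT name FROM teacher WHERE id = ?", (teacher_id,))[0]
    lazy.append((title, teacher_name))
lazy_queries = query_count

# Like select_related('teacher'): a single query that joins both tables.
query_count = 0
joined = run("""
    SELECT c.title, t.name
    FROM onlineclass AS c
    INNER JOIN teacher AS t ON t.id = c.teacher_id
    ORDER BY c.id
""")
joined_queries = query_count

print(lazy_queries, joined_queries)  # 4 1
print(lazy == joined)  # True
```

Both loops produce the same data; only the number of round trips differs, and the gap grows with the number of classes.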

select_related is used to select additional data from related objects when the query is executed. It results in a more complex query. But it boosts performance if you have to access related data, since no additional database queries will be required.
See documentation here.
In your code it would be possible to use select_related, but it would be inefficient, because you're not accessing related objects of the queried classes. So using select_related would result in a more complex query without any advantage.
If you wanted to use select_related, the syntax would be: classes = OnlineClass.objects.select_related('teacher').filter(teacher=teacher)

Related

Dynamically count vs. database record

I have a Post model as shown below. Right now I use number_of_likes to record how many likes a post has, which means I have to maintain the number_of_likes field manually.
I added this field to Post mainly for two reasons, and I would like to hear your advice:
1. It is easy to write the serialization using declarative syntax (every post needs this).
2. I don't need to filter and count on the Like model, which is more expensive than just reading the value from a field.
class Post(models.Model):
    ...
    number_of_likes = models.IntegerField()
class Like(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    post = models.ForeignKey(Post, on_delete=models.CASCADE)
I would like to know which method is better: using Like.objects.filter(user=user).count() or maintaining a new field such as number_of_likes. If I choose the latter, what is the best way to maintain this field?
As @WillemVanOnsem suggested, the best way to display this data is with an annotation. For example:
from django.db.models import Count
posts = Post.objects.annotate(num_of_likes=Count('like'))
# usage
for post in posts:
    print(post.num_of_likes)
# or
posts.values('pk', 'num_of_likes')
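For readers who think in SQL, the annotation above corresponds roughly to a LEFT OUTER JOIN with GROUP BY. The sketch below runs that query with the standard-library sqlite3 module; the table names (post, post_like), the title column and the sample rows are assumptions for illustration, not the SQL Django emits verbatim.

```python
import sqlite3

# Hypothetical tables standing in for the Post and Like models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_like (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        post_id INTEGER REFERENCES post(id)
    );
    INSERT INTO post VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO post_like (user_id, post_id) VALUES (10, 1), (11, 1), (10, 2);
""")

# The LEFT OUTER JOIN keeps posts without likes; COUNT(l.id) ignores the
# NULLs produced for them, so such posts get a count of 0.
rows = conn.execute("""
    SELECT p.id, COUNT(l.id) AS num_of_likes
    FROM post AS p
    LEFT OUTER JOIN post_like AS l ON l.post_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(rows)  # [(1, 2), (2, 1), (3, 0)]
```

The count is computed by the database in the same query that loads the posts, so there is no stored counter to keep in sync.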

Django project architecture advice

I have a Django project with a Post model which looks like this:
class BasicPost(models.Model):
    author = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    published = models.BooleanField(default=False)
    created_date = models.DateTimeField(auto_now_add=True)
    title = models.CharField(max_length=100, blank=False)
    body = models.TextField(max_length=999)
    media = models.ImageField(blank=True)
    def get_absolute_url(self):
        return reverse('basic_post', args=[str(self.pk)])
    def __str__(self):
        return self.title
Also, I use the default User model that comes with Django.
I want to save which posts each user has read, so that I can send them posts they haven't read.
My question is: what is the best way to do this? If I use a many-to-many field, should I put it on the User model and save all the posts the user has read, or should I do it in the other direction and put the many-to-many field on the Post model, saving for each post which users have read it?
There are going to be more than 1 million posts in the Post model and about 50,000 users, and I want the most efficient filters for returning unread posts to a user.
If I should use the first option, how do I extend the User model?
Thanks!
On your first question (which way to go): I believe that a ManyToManyField creates database indexes on both foreign keys by default. So whether you put the relation on User or on BasicPost, both the forward and the reverse lookups go through an index. Django will create an intermediate table for you with three columns like: (id, user_id, basic_post_id). Every access to this table uses the index on user_id or basic_post_id, and each (user_id, basic_post_id) pair appears at most once. So the real decision is in your application: whether you filter starting from the set of 1 million posts or from the 50k users.
On your second question (how to extend User): it's generally recommended to subclass User from the very beginning. If it's too late for that because your project is too far advanced, you can do this in your models.py:
from django.contrib.auth.models import User
class BasicPost(models.Model):
    # your code
    readers = models.ManyToManyField(to='auth.User', related_name="posts_already_read")
# "manually" add method to User class
def _unread_posts(user):
    # pass the user instance directly; readers__in would expect an iterable of users
    return BasicPost.objects.exclude(readers=user)
User.unread_posts = _unread_posts
Haven't run this code though! Hope this helps.
Could you have a separate ReadPost model instead of a potentially large m2m, which you could save when a user reads a post? That way you can just query the ReadPost models to get the data, instead of storing it all in the blog post.
Maybe something like this:
from django.utils import timezone
class UserReadPost(models.Model):
    user = models.ForeignKey("auth.User", on_delete=models.CASCADE, related_name="read_posts")
    seen_at = models.DateTimeField(default=timezone.now)
    post = models.ForeignKey(BasicPost, on_delete=models.CASCADE, related_name="read_by_users")
You could add a unique_together constraint to make sure that only one UserReadPost object is created for each user and post (to make sure you don't count any twice), and use get_or_create() when creating new records.
Then finding the posts a user has read is:
posts = UserReadPost.objects.filter(user=current_user).values_list("post", flat=True)
This could also be extended relatively easily. For example, if your BasicPost objects can be edited, you could add an updated_at field to the post. Then you could compare the seen_at of the UserReadPost field to the updated_at field of the BasicPost to check if they've seen the updated version.
Downside is you'd be creating a lot of rows in the DB for this table.
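As a rough sketch of how the separate read-tracking table behaves, the following uses the standard-library sqlite3 module rather than the ORM. The UNIQUE constraint plays the role of unique_together, INSERT OR IGNORE is a stand-in for get_or_create(), and the NOT EXISTS query returns the unread posts. Table names, helper names and sample data are assumptions for illustration.

```python
import sqlite3

# Hypothetical tables standing in for BasicPost and UserReadPost.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE basicpost (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE userreadpost (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL REFERENCES basicpost(id),
        UNIQUE (user_id, post_id)  -- plays the role of unique_together
    );
    INSERT INTO basicpost VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

def mark_read(user_id, post_id):
    """Record a read; a repeated (user, post) pair is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO userreadpost (user_id, post_id) VALUES (?, ?)",
        (user_id, post_id),
    )

def unread_post_ids(user_id):
    """Return ids of posts that have no read record for this user."""
    rows = conn.execute("""
        SELECT p.id FROM basicpost AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM userreadpost AS r
            WHERE r.post_id = p.id AND r.user_id = ?
        )
        ORDER BY p.id
    """, (user_id,)).fetchall()
    return [post_id for (post_id,) in rows]

mark_read(7, 1)
mark_read(7, 1)  # duplicate, ignored thanks to the UNIQUE constraint
mark_read(7, 3)
mark_read(8, 2)

read_rows = conn.execute(
    "SELECT COUNT(*) FROM userreadpost WHERE user_id = 7"
).fetchone()[0]
print(read_rows)           # 2
print(unread_post_ids(7))  # [2]
print(unread_post_ids(8))  # [1, 3]
```

With the models above, something like BasicPost.objects.exclude(read_by_users__user=current_user) should express the same unread-posts query in the ORM, though that line is untested here.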
If you show your posts in chronological order (by created_date, for example), another option is to extend the user model with a latest_read_post_id field.
In that case:
class BasicPost(models.Model):
    # your code
    def is_read_by(self, user):
        # <= so that the latest read post itself also counts as read
        return self.id <= user.latest_read_post_id

Django QuerySet for finding related foreign key fields

I'm trying to query a related field to a Catalog class in which many items are related to by foreign key. I'm currently trying:
article = forms.ModelChoiceField(
    queryset=Catalog.objects.select_related('article_products'))
It seems to do the same query as:
queryset = Catalog.objects.all()
Can anyone help steer me in the right direction? Here is the model I'm working with.
class Catalog(models.Model):
    products = models.CharField(max_length=200)
    def __unicode__(self):
        return self.products
class Article(models.Model):
    catalog = models.ForeignKey(Catalog, related_name='article_products')
    title = models.CharField(max_length=200)
    abstract = models.TextField(max_length=1000, blank=True)
    full_text = models.TextField(blank=True)
    proquest_link = models.CharField(max_length=200, blank=True, null=True)
    ebsco_link = models.CharField(max_length=200, blank=True, null=True)
    def __unicode__(self):
        return self.title
My goal is to have a form select field with all of the articles related to the Catalog. It currently just displays the name of the Catalog.
I do not think the select_related method will accomplish the goal you have set out to achieve with this ModelChoiceField. You are quite correct that the two queries below return the same resulting queryset:
Catalog.objects.all().select_related('article_products')
Catalog.objects.all()
The select_related method of Django querysets serves a different function, specifically as a performance booster to reduce the number of database accesses required to obtain the data you want to retrieve from a model instance. The Django reference about this method contains very good documentation, with examples explaining why you would use the select_related method for performance purposes.
With that being said, your original purpose remains: The form field would display all of the articles related to a given catalog.
In order to achieve this goal, it seems best to filter the queryset of the Article objects being given to the form field. First of all, if we want to display Article objects within the ModelChoiceField, we should certainly give the ModelChoiceField a queryset containing Article objects rather than Catalog objects, like so:
article = forms.ModelChoiceField(queryset=Article.objects.all())
But this queryset argument is not quite right, either. We are still passing the queryset of all Article objects that exist in the database. Instead, we want to pass only the articles that are associated with a given Catalog object. To achieve this goal, we can filter the Article queryset to obtain only the Article objects that are related to a certain Catalog object, like so:
# cat is some catalog object
article = forms.ModelChoiceField(queryset=Article.objects.filter(catalog=cat))
In this example, the queryset filter returns only Article objects which contain a reference to the given Catalog object. This queryset will be used to populate the ModelChoiceField.
For more information about filtering by field lookup, see the Django documentation here.

select_related with reverse foreign keys

I have two Models in Django. The first has the hierarchy of what job functions (positions) report to which other positions, and the second is people and what job function they hold.
class PositionHierarchy(models.Model):
    pcn = models.CharField(max_length=50)
    title = models.CharField(max_length=100)
    level = models.CharField(max_length=25)
    report_to = models.ForeignKey('PositionHierarchy', null=True)
class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    ...
    position = models.ForeignKey(PositionHierarchy)
When I have a Person record and I want to find the person's manager, I have to do
manager = person.position.report_to.person_set.all()[0]
# Can't use .first() because we haven't upgraded to 1.6 yet
If I'm getting people with a QuerySet, I can join (and avoid a second trip to the database) with position and report_to using Person.objects.select_related('position', 'position__report_to').filter(...), but is there any way to avoid making another trip to the database to get the person_set? I tried adding 'position__report_to__person_set' or just 'position__report_to__person' to the select_related, but that doesn't seem to change the query. Is this what prefetch_related is for?
I'd like to make a custom manager so that when I do a query to get Person records, I also get their PositionHierarchy and their manager's Person record without more round trips to the database. This is what I have so far:
class PersonWithManagerManager(models.Manager):
    def get_query_set(self):
        qs = super(PersonWithManagerManager, self).get_query_set()
        return qs.select_related(
            'position',
            'position__report_to',
        ).prefetch_related(
        )
Yes, that is what prefetch_related() is for. It will require an additional query, but the idea is that it will get all of the related information at once, instead of once per Person.
In your case:
qs.select_related('position__report_to').prefetch_related('position__report_to__person_set')
should require two queries, regardless of the number of Persons in the original query set.
Compare this example from the documentation:
>>> Restaurant.objects.select_related('best_pizza').prefetch_related('best_pizza__toppings')
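The "two queries regardless of the number of Persons" behaviour can be sketched with the standard-library sqlite3 module: one joined query for people and their positions, then a single IN (...) query for everyone holding a manager position, with the matching done in Python. This only approximates what the ORM does; table names, column names and sample rows are assumptions for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical tables standing in for PositionHierarchy and Person.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE positionhierarchy (
        id INTEGER PRIMARY KEY,
        title TEXT,
        report_to_id INTEGER REFERENCES positionhierarchy(id)
    );
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        position_id INTEGER REFERENCES positionhierarchy(id)
    );
    INSERT INTO positionhierarchy VALUES
        (1, 'Director', NULL), (2, 'Engineer', 1), (3, 'Analyst', 1);
    INSERT INTO person VALUES (1, 'Dana', 1), (2, 'Eli', 2), (3, 'Fay', 3);
""")

# Query 1 (the select_related part): people joined to their position
# and to the position they report to.
people = conn.execute("""
    SELECT pe.first_name, boss.id
    FROM person AS pe
    INNER JOIN positionhierarchy AS po ON po.id = pe.position_id
    LEFT OUTER JOIN positionhierarchy AS boss ON boss.id = po.report_to_id
    ORDER BY pe.id
""").fetchall()

# Query 2 (the prefetch_related part): everyone holding one of the
# manager positions, fetched in one go with IN (...).
boss_ids = sorted({boss_id for _, boss_id in people if boss_id is not None})
placeholders = ", ".join("?" for _ in boss_ids)
holders = defaultdict(list)
for position_id, first_name in conn.execute(
    "SELECT position_id, first_name FROM person "
    f"WHERE position_id IN ({placeholders}) ORDER BY id",
    boss_ids,
):
    holders[position_id].append(first_name)

# Stitch the two result sets together in Python.
managers = {name: holders.get(boss_id, []) for name, boss_id in people}
print(managers)  # {'Dana': [], 'Eli': ['Dana'], 'Fay': ['Dana']}
```

Adding more people changes the size of the two result sets, not the number of queries.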

Basic relations in django (built on top of a legacy db)

I've googled on and on, and I just don't seem to get it.
How do I recreate simple join queries in django?
in models.py (Fylker is county, Dagensrepresentanter is persons)
class Fylker(models.Model):
    id = models.CharField(max_length=6, primary_key=True)
    navn = models.CharField(max_length=300)
    def __unicode__(self):
        return self.navn
    class Meta:
        db_table = u'fylker'
class Dagensrepresentanter(models.Model):
    id = models.CharField(max_length=33, primary_key=True)
    etternavn = models.CharField(max_length=300, blank=True)
    fornavn = models.CharField(max_length=300, blank=True)
    fylke = models.ForeignKey(Fylker, db_column='id')
    def __unicode__(self):
        return u'%s %s' % (self.fornavn, self.etternavn)
    class Meta:
        ordering = ['etternavn']  # set default ordering
        db_table = u'dagensrepresentanter'
Since the models are auto-created by django, I have added the ForeignKey and tried to connect it to the county. The id fields are inherited from the db I'm trying to integrate into this django project.
By querying
Dagensrepresentanter.objects.all()
I get all the people, but without their county.
By querying
Dagensrepresentanter.objects.all().select_related()
I get a join on Dagensrepresentanter.id and Fylker.id, but I want that join to be on fylke, i.e.
SELECT * FROM dagensrepresentanter d , fylker f WHERE d.fylke = f.id
This way I'd get the county name (Fylke navn) in the same resultset as all the persons.
Additional request:
I've read over the Django docs and quite a few questions here on Stack Overflow, but I can't seem to get my head around this ORM thing. It's the queries that hurt. Do you have any good resources (blog posts with experiences/explanations, etc.) for people who are accustomed to thinking of databases as an SQL thing and need to start thinking in Django ORM terms?
Your legacy database may not have foreign key constraints (for example, if it is using MyISAM then foreign keys aren't even supported).
You have two choices:
1. Add foreign key constraints to your tables (this would involve upgrading to InnoDB if you are on MyISAM). Then run ./manage.py inspectdb again and the relationships should appear.
2. Use the tables as is (i.e., with no explicit relationships between them) and compose queries manually (e.g., Mytable.objects.get(other_table_id=23)), either at the object level or by writing your own SQL queries. Either way, you lose much of the benefit of Django's ORM query language.
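As an illustration of the "write your own SQL" route, here is a minimal sketch of the join from the question, run with the standard-library sqlite3 module as a stand-in for the legacy database. The column fylke on dagensrepresentanter is taken from the SQL in the question; the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the legacy tables, with no foreign key constraints.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fylker (id TEXT PRIMARY KEY, navn TEXT);
    CREATE TABLE dagensrepresentanter (
        id TEXT PRIMARY KEY, etternavn TEXT, fornavn TEXT, fylke TEXT
    );
    INSERT INTO fylker VALUES ('OS', 'Oslo'), ('HO', 'Hordaland');
    INSERT INTO dagensrepresentanter VALUES
        ('p1', 'Hansen', 'Kari', 'OS'), ('p2', 'Berg', 'Ola', 'HO');
""")

# Explicit join on d.fylke = f.id, so the county name comes back
# in the same result set as the person.
rows = conn.execute("""
    SELECT d.fornavn, d.etternavn, f.navn
    FROM dagensrepresentanter AS d
    INNER JOIN fylker AS f ON d.fylke = f.id
    ORDER BY d.etternavn
""").fetchall()

print(rows)  # [('Ola', 'Berg', 'Hordaland'), ('Kari', 'Hansen', 'Oslo')]
```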