Django DRF serializer method field on many to many running 2n queries

I'm using Django 2.2 and Django REST Framework.
I have the following model structure
class MyModel(models.Model):
    name = models.CharField(max_length=200)
class Tag(models.Model):
    name = models.CharField(max_length=50, unique=True)
class MyModelRelation(models.Model):
    # on_delete is required since Django 2.0; CASCADE is assumed here
    obj = models.ForeignKey(MyModel, related_name='relation', on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    tags = models.ManyToManyField(Tag)
    def tag_list(self):
        return self.tags.all().values_list('name', flat=True).distinct()
I want to return the tag list along with each MyModel instance. The serializer is
class MyModelSerializer(serializers.ModelSerializer):
    tags_list = serializers.SerializerMethodField(read_only=True)
    def get_tags_list(self, obj):
        return obj.relation.tag_list()
    class Meta:
        model = MyModel
        fields = [
            'name',
            'tags_list'
        ]
and the view is
class ObjListView(ListAPIView):
    serializer_class = MyModelSerializer
    def get_queryset(self):
        return super().get_queryset().select_related('relation').prefetch_related('relation__tags')
But fetching 58 records runs almost 109 queries.
The queries against my_app_mymodel and my_app_mymodelrelation_tags are repeated many times.

This is how I suggest you solve the problem. Instead of extracting the names at the database level, you can do it at the serializer level, which makes things easier and faster. First, remove the tag_list method from the model class. Then add the annotation to your view:
from django.db.models import F
def get_queryset(self):
    return super().get_queryset().annotate(tags_list=F('relation__tags')).select_related('relation')
Then in your serializer:
class MyModelSerializer(serializers.ModelSerializer):
    tags_list = serializers.SlugRelatedField(many=True, slug_field='name', read_only=True)
    ...
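One likely reason the original code repeats queries: chaining .values_list() or .distinct() onto self.tags.all() builds a new database query every time it is called, so anything loaded by prefetch_related is ignored, whereas plain iteration over .all() can reuse the prefetched rows. Here is a minimal sketch of collecting the distinct names in Python instead. FakeManager and the SimpleNamespace objects are hypothetical stand-ins so the example runs without a database, and tag_names is just an illustrative helper name, not part of Django or DRF.

```python
from types import SimpleNamespace


def tag_names(relation):
    """Return distinct tag names, in first-seen order, from loaded tags.

    Only relation.tags.all() is used, which is the call a
    prefetch_related cache can answer without a new query.
    """
    names = []
    for tag in relation.tags.all():
        if tag.name not in names:
            names.append(tag.name)
    return names


class FakeManager:
    """Hypothetical stand-in for a related manager with prefetched rows."""

    def __init__(self, items):
        self._items = list(items)

    def all(self):
        return list(self._items)


relation = SimpleNamespace(
    tags=FakeManager([
        SimpleNamespace(name="red"),
        SimpleNamespace(name="blue"),
        SimpleNamespace(name="red"),
    ])
)
print(tag_names(relation))
```

In a real project the same loop could live in get_tags_list on the serializer, provided the view prefetches the tags.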

Related

Using select_related() in nested serializers

I've been using select_related() to speed up a large DRF call with great success, but I've hit a wall.
My main serializer references two other serializers, and one of those references yet another serializer. I'm unsure how to implement prefetching in the second-level serializer.
serializer.py
class DocumentsThinSerializer(serializers.ModelSerializer):
    class Meta:
        model = Documents
        fields = ('confirmed', )
class PersonThinSerializer(serializers.ModelSerializer):
    documents = DocumentsThinSerializer()
    class Meta:
        model = Person
        # a declared field must also be listed in fields
        fields = ('name', 'age', 'gender', 'documents')
class EventThinSerializer(serializers.ModelSerializer):
    day = DayThinSerializer()
    person = PersonThinSerializer()
    @staticmethod
    def setup_eager_loading(queryset):
        return queryset.select_related('day', 'person')
    class Meta:
        model = Event
views.py
class EventList(generics.ListAPIView):
    authentication_classes = (SessionAuthentication, BasicAuthentication)
    permission_classes = (IsAuthenticated,)
    queryset = Event.objects.all()
    serializer_class = EventThinSerializer
    def get_queryset(self):
        queryset = super().get_queryset()
        return self.get_serializer_class().setup_eager_loading(queryset)
As you can see, I'm using the static method setup_eager_loading() to get things going, but I can't find a queryset hook for my PersonThinSerializer() to get the speedup when accessing the DocumentsThinSerializer() in the same way.

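Assuming person.documents is a single-valued relation (a foreign key on Person, or a one-to-one field on either side), you should be able to add "person__documents" to the select_related call in EventThinSerializer.setup_eager_loading. select_related only follows single-valued relations, so if Documents instead holds a plain foreign key to Person, use prefetch_related('person__documents') for that hop:
class EventThinSerializer(serializers.ModelSerializer):
    day = DayThinSerializer()
    person = PersonThinSerializer()
    @staticmethod
    def setup_eager_loading(queryset):
        return queryset.select_related('day', 'person', 'person__documents')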

Can I use prefetch_related to cache filtered querysets?

I am using DRF to serialize some related models. In my toy example below, assume that each author can have a million books. Clearly doing a db query for all "good" books and then another db query for all "bad" books is inefficient.
This post [http://ses4j.github.io/2015/11/23/optimizing-slow-django-rest-framework-performance/] offered some suggestions regarding prefetch_related. But I found that this only helped when I made subsequent calls to .books.all() rather than .books.filter() as happens in the properties below.
Is there any automatic way in Django to cache the books queryset and not have subsequent filters to it hit the database again?
Here is some code:
models.py:
class Author(models.Model):
    name = models.CharField(max_length=100)
    @property
    def good_books(self):
        return self.books.filter(is_good=True)
    @property
    def bad_books(self):
        return self.books.filter(is_good=False)
class Book(models.Model):
    title = models.CharField(max_length=100)
    is_good = models.BooleanField(default=False)
    author = models.ForeignKey(Author, related_name="books")
serializers.py:
class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ("title",)
class AuthorSerializer(serializers.ModelSerializer):
    class Meta:
        model = Author
        fields = ("name", "good_books", "bad_books",)
    # source is left out: DRF rejects a source identical to the field name
    good_books = BookSerializer(many=True, read_only=True)
    bad_books = BookSerializer(many=True, read_only=True)
    @staticmethod
    def setup_eager_loading(queryset):
        queryset = queryset.prefetch_related("books")
        return queryset
views.py:
class AuthorViewSet(viewsets.ReadOnlyModelViewSet):
    serializer_class = AuthorSerializer
    def get_queryset(self):
        queryset = Author.objects.all()
        queryset = self.get_serializer_class().setup_eager_loading(queryset)
        return queryset
Thanks.
edit:
Using Prefetch:
@staticmethod
def setup_eager_loading(queryset):
    queryset = queryset.prefetch_related(
        Prefetch("books", queryset=Book.objects.filter(is_good=True), to_attr="good_books"),
        Prefetch("books", queryset=Book.objects.filter(is_good=False), to_attr="bad_books"),
    )
    return queryset
This still gives me extra db hits for the calls to filter.

Instead of filtering in model properties, which are evaluated separately for each author, you can prefetch at the view level using Prefetch with the to_attr argument:
from django.db.models import Prefetch
class AuthorViewSet(viewsets.ReadOnlyModelViewSet):
    serializer_class = AuthorSerializer
    def get_queryset(self):
        queryset = Author.objects.prefetch_related(
            Prefetch('books', queryset=Book.objects.filter(is_good=True), to_attr='good_books'),
            Prefetch('books', queryset=Book.objects.filter(is_good=False), to_attr='bad_books')
        )
        return queryset

You need to evaluate your queryset first for its results to be cached; see "Caching and QuerySets" in the Django docs.
So instead of
    return queryset
you could do
    return list(queryset)
Be aware that in certain cases querysets are not cached.
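As the question notes, only .books.all() reuses prefetched rows; each .books.filter(...) call is a new query. One option, sketched here under assumptions, is to prefetch "books" once and split the loaded rows in Python. LoadedBooks and the SimpleNamespace objects are hypothetical stand-ins so the example runs without a database, and split_books is an illustrative helper name rather than anything Django provides.

```python
from types import SimpleNamespace


def split_books(author):
    """Partition an author's already-loaded books without filter() calls."""
    good, bad = [], []
    for book in author.books.all():
        (good if book.is_good else bad).append(book)
    return good, bad


class LoadedBooks:
    """Hypothetical stand-in for a related manager with prefetched rows."""

    def __init__(self, books):
        self._books = list(books)

    def all(self):
        return list(self._books)


author = SimpleNamespace(
    name="A. Writer",
    books=LoadedBooks([
        SimpleNamespace(title="Hit", is_good=True),
        SimpleNamespace(title="Flop", is_good=False),
        SimpleNamespace(title="Classic", is_good=True),
    ]),
)
good, bad = split_books(author)
print([b.title for b in good], [b.title for b in bad])
```

The trade-off is that every book is loaded into memory, so with very large per-author collections the two filtered Prefetch objects may be the better fit.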

Aggregate (and other annotated) fields in Django Rest Framework serializers

I am trying to figure out the best way to add annotated fields, such as any aggregated (calculated) fields to DRF (Model)Serializers. My use case is simply a situation where an endpoint returns fields that are NOT stored in a database but calculated from a database.
Let's look at the following example:
models.py
class IceCreamCompany(models.Model):
    name = models.CharField(primary_key=True, max_length=255)
class IceCreamTruck(models.Model):
    company = models.ForeignKey('IceCreamCompany', related_name='trucks')
    capacity = models.IntegerField()
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
    class Meta:
        model = IceCreamCompany
desired JSON output:
[
    {
        "name": "Pete's Ice Cream",
        "total_trucks": 20,
        "total_capacity": 4000
    },
    ...
]
I have a couple of solutions that work, but each has some issues.
Option 1: add getters to model and use SerializerMethodFields
models.py
from django.db.models import Sum
class IceCreamCompany(models.Model):
    name = models.CharField(primary_key=True, max_length=255)
    def get_total_trucks(self):
        return self.trucks.count()
    def get_total_capacity(self):
        return self.trucks.aggregate(Sum('capacity'))['capacity__sum']
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
    def get_total_trucks(self, obj):
        # call the model method; without the parentheses the bound method itself is returned
        return obj.get_total_trucks()
    def get_total_capacity(self, obj):
        return obj.get_total_capacity()
    total_trucks = serializers.SerializerMethodField()
    total_capacity = serializers.SerializerMethodField()
    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')
The above code can perhaps be refactored a bit, but it won't change the fact that this option will perform 2 extra SQL queries per IceCreamCompany which is not very efficient.
Option 2: annotate in ViewSet.get_queryset
models.py as originally described.
views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.all()
    serializer_class = IceCreamCompanySerializer
    def get_queryset(self):
        return IceCreamCompany.objects.annotate(
            total_trucks=Count('trucks'),
            total_capacity=Sum('trucks__capacity')
        )
This will get the aggregated fields in a single SQL query but I'm not sure how I would add them to the Serializer as DRF doesn't magically know that I've annotated these fields in the QuerySet. If I add total_trucks and total_capacity to the serializer, it will throw an error about these fields not being present on the Model.
Option 2 can be made to work without a serializer by using a plain View, but if the model contains a lot of fields and only some are required in the JSON, building the endpoint without a serializer would be a somewhat ugly hack.

Possible solution:
views.py
from django.db.models import Count, Sum
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.all()
    serializer_class = IceCreamCompanySerializer
    def get_queryset(self):
        return IceCreamCompany.objects.annotate(
            total_trucks=Count('trucks'),
            total_capacity=Sum('trucks__capacity')
        )
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField()
    total_capacity = serializers.IntegerField()
    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')
By using Serializer fields I got a small example to work. The fields must be declared as the serializer's class attributes so DRF won't throw an error about them not existing in the IceCreamCompany model.

I made a slight simplification of elnygreen's answer by annotating the queryset when I defined it. Then I don't need to override get_queryset().
# views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.annotate(
        total_trucks=Count('trucks'),
        total_capacity=Sum('trucks__capacity'))
    serializer_class = IceCreamCompanySerializer
# serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField()
    total_capacity = serializers.IntegerField()
    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')
As elnygreen said, the fields must be declared as the serializer's class attributes to avoid an error about them not existing in the IceCreamCompany model.
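To make the "single SQL query" point concrete, here is a rough sketch using Python's built-in sqlite3 module. The table and column names are simplified assumptions, not the names Django would generate for these models, and the SQL only approximates what annotate() with Count and Sum asks the database for: one LEFT OUTER JOIN with a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (name TEXT PRIMARY KEY);
    CREATE TABLE truck (
        id INTEGER PRIMARY KEY,
        company_id TEXT REFERENCES company(name),
        capacity INTEGER
    );
    INSERT INTO company VALUES ('Pete''s Ice Cream'), ('Empty Co');
    INSERT INTO truck (company_id, capacity) VALUES
        ('Pete''s Ice Cream', 200),
        ('Pete''s Ice Cream', 300);
""")

# One query returns every company together with both aggregates.
rows = conn.execute("""
    SELECT c.name,
           COUNT(t.id)     AS total_trucks,
           SUM(t.capacity) AS total_capacity
    FROM company AS c
    LEFT OUTER JOIN truck AS t ON t.company_id = c.name
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for name, total_trucks, total_capacity in rows:
    print(name, total_trucks, total_capacity)
```

Note that a company with no trucks gets a count of 0 but a sum of None (SQL NULL), so the serializer field for total_capacity has to tolerate a null value.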

You can hack the ModelSerializer constructor to modify the queryset it's passed by a view or viewset.
from django.db.models import QuerySet
class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField(read_only=True)
    total_capacity = serializers.IntegerField(read_only=True)
    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')
    def __new__(cls, *args, **kwargs):
        # only a queryset passed as the first positional argument is rewritten
        if args and isinstance(args[0], QuerySet):
            queryset = cls._build_queryset(args[0])
            args = (queryset, ) + args[1:]
        return super().__new__(cls, *args, **kwargs)
    @classmethod
    def _build_queryset(cls, queryset):
        # modify the queryset here
        return queryset.annotate(
            total_trucks=...,
            total_capacity=...,
        )
There is no significance in the name _build_queryset (it's not overriding anything), it just allows us to keep the bloat out of the constructor.

Django rest framework - filtering for serializer field

I have a question about Django REST framework.
When products are rendered for a remote client, each product should carry a field with filtered data.
For example, model may be like this.
class Product(models.Model):
    name = models.CharField()
class Like(models.Model):
    product = models.ForeignKey(Product, related_name="likes")
    whether_like = models.BooleanField()  # the flag used in the filter below
On the client, only the likes whose value is true should be counted for each product, not the false ones.
So I tried the code below in the serializer.
class ProductSerializer(serializers.ModelSerializer):
    likes = serializers.PrimaryKeyRelatedField(many=True, queryset=Like.objects.filter(whether_like=True))
    class Meta:
        model = Product
        fields = ('id', 'name', 'likes')
But, that is not working as I wanted.
What should I do?
The following is extra view code.
@api_view(['GET'])
def product_list(request, user_id, format=None):
    if request.method == 'GET':
        products = Product.objects.all()
        serializer = ProductSerializer(products, many=True)
        return Response(serializer.data)

How about something like this:
class ProductSerializer(serializers.ModelSerializer):
    likes = serializers.SerializerMethodField()
    def get_likes(self, product):
        qs = Like.objects.filter(whether_like=True, product=product)
        serializer = LikeSerializer(instance=qs, many=True)
        return serializer.data
    class Meta:
        model = Product
        fields = ('id', 'name', 'likes')
(LikeSerializer omitted for brevity.)

Instead of a SerializerMethodField, which causes one additional database query per object, you can (starting with Django 1.7) use Prefetch objects in the queryset of your DRF ViewSet:
from rest_framework import viewsets
from django.db.models import Prefetch
class ProductViewSet(viewsets.ModelViewSet):
    queryset = Product.objects.prefetch_related(Prefetch(
        'likes',
        queryset=Like.objects.filter(whether_like=True)))
The prefetch needs just one additional query for all products, which performs far better than one query per product with SerializerMethodField.
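To illustrate why the prefetch stays at two queries regardless of the number of products, here is a rough sketch of the idea using Python's built-in sqlite3 module: one query for the products, one for all matching likes, and the grouping done in Python. The table names, column names, and sample rows are assumptions made up for this example, and the SQL is only an approximation of what Django issues.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_like (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(id),
        whether_like INTEGER
    );
    INSERT INTO product VALUES (1, 'cone'), (2, 'sundae');
    INSERT INTO product_like VALUES (1, 1, 1), (2, 1, 0), (3, 2, 1), (4, 2, 1);
""")

# Query 1: the products themselves.
products = conn.execute("SELECT id, name FROM product ORDER BY id").fetchall()

# Query 2: every matching like for those products, fetched in one go.
ids = [pid for pid, _ in products]
placeholders = ", ".join("?" for _ in ids)
likes = conn.execute(
    "SELECT id, product_id FROM product_like "
    f"WHERE whether_like = 1 AND product_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

# The join happens in Python, not in further queries.
likes_by_product = defaultdict(list)
for like_id, product_id in likes:
    likes_by_product[product_id].append(like_id)

result = [
    {"id": pid, "name": name, "likes": likes_by_product[pid]}
    for pid, name in products
]
print(result)
```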

Django: running a method on all queryset objects

I want to know if the following is possible and if someone could explain how. I'm using Django REST Framework.
In my models module I have a class called Product. Product has a method called is_product_safe_for_user, which takes the user object and self (the product).
model.py
class Product(models.Model):
    title = models.CharField(max_length=60, help_text="Title of the product.")
    for_age = models.CharField(max_length=2,)
    def is_product_safe_for_user(self, user):
        if self.for_age > user.age:
            return "OK"
(ignore the syntax above, it's just to give you an idea)
What I want to do is run the method on all of the queryset objects, something like below, but I don't know how...
class ProductListWithAge(generics.ListAPIView):
    permission_classes = (permissions.IsAuthenticated,)
    model = Product
    serializer_class = ProductSerializer
    def get_queryset(self):
        Product.is_product_safe_for_user(self,user)
        # then somehow apply this to my queryset
        return Product.objects.filter()
There will also be times when I want to run the method on just one object.
Or should it go into the serializer? If so, how?
class ProductSerializer(serializers.ModelSerializer):
    safe = serializers.Field(Product='is_product_safe_for_user(self,user)')
    class Meta:
        model = Product
        fields = ('id', 'title', 'active', 'safe')

You could write a custom manager for your model. Something like this:
class OnlySafeObjects(models.Manager):
    def filter_by_user(self, user):
        return self.get_queryset().filter(for_age__gte=user.age)
class Product(models.Model):
    # your normal stuff
    objects = models.Manager()  # keep the default manager available
    onlysafeobjects = OnlySafeObjects()
Then you would use it like this:
safe_products = Product.onlysafeobjects.filter_by_user(request.user)
