Displaying group permissions causes duplicate queries in Django

TL;DR: I wanted to serialize a Group along with the names of its permissions, but this triggered a lot of duplicate queries for content_type from the Permission model. I tried to solve it with prefetch_related, but it didn't work. What am I doing wrong?
My serializer for the retrieve method is given below:
class RetrieveGroupSerializer(serializers.ModelSerializer):
    user_set = UserSerializer(many=True, read_only=True)
    permissions = PermissionsSerializer(many=True, read_only=True)

    class Meta:
        model = Group
        fields = ('name', 'user_set', 'permissions')
The serializer for list method is given below
class GroupSerializer(serializers.ModelSerializer):
    user_set = UserSerializer(many=True)
    permissions = PermissionsSerializer(many=True)

    class Meta:
        model = Group
        fields = ('url', 'user_set', 'permissions')
The view is given below:
class GroupViewSet(
        mixins.CreateModelMixin,
        mixins.RetrieveModelMixin,
        mixins.UpdateModelMixin,
        mixins.ListModelMixin,
        viewsets.GenericViewSet):
    """
    Creates, updates, and retrieves User Groups
    """
    queryset = Group.objects.all().prefetch_related('user_set').prefetch_related('permissions__content_type')
    serializer_class = GroupSerializer
    permission_classes = (
        IsAuthenticated,
    )
    # Per-action serializers; any other action falls back to serializer_class
    action_serializer_classes = {
        "create": CreateGroupSerializer,
        "retrieve": RetrieveGroupSerializer,
        "update": UpdateGroupSerializer
    }

    def get_serializer_class(self):
        try:
            return self.action_serializer_classes[self.action]
        except (KeyError, AttributeError):
            return super(GroupViewSet, self).get_serializer_class()
When I use the list method I don't see any duplicate queries, but when I use the retrieve method on a single group instance I get a lot of them.
The content_type from the Permission model is queried 62 times.
So I used prefetch_related on the foreign key in the Permission model, but the result is the same.
The same queryset works well for the list method and doesn't cause duplicate queries.
Apart from the duplicate queries themselves, I am also confused about how the same queryset can produce such different results.

That's likely because the browsable API also renders a create/update form containing a dropdown of the content types, and that form won't use the prefetch optimizations.
Try requesting the response as JSON and check how many queries it performs, or remove the permissions field from the update serializer and see whether the number of queries changes.

Related

Django Rest FrameWork Reduce number of queries using group by

I am writing an API using Django REST Framework. The API fetches a list of clients. A client has many projects. My API should return the list of clients with the number of projects completed, pending, and in total. My API works, but it issues too many SQL queries. The API is paginated.
class ClientViewSet(ModelViewSet):
    """
    A simple view for creating clients, updating and retrieving
    """
    model = Client
    queryset = Client.objects.all()
    serializer_class = ClientSerializer
Now my client serializer:
class ClientSerializer(serializers.ModelSerializer):
    total_projects_count = serializers.SerializerMethodField()
    on_going_projects_count = serializers.SerializerMethodField()
    completed_projects_count = serializers.SerializerMethodField()

    class Meta:
        model = Client
        fields = '__all__'

    # Each method below calls a model method that runs its own query per client
    def get_total_projects_count(self, obj):
        return obj.total_projects_count()

    def get_on_going_projects_count(self, obj):
        return obj.on_going_project_count()

    def get_completed_projects_count(self, obj):
        return obj.completed_projects_count()
Project has a foreign key to Client. I tried to fetch all projects as shown below and group them using annotate, but annotate worked only on a single field.
projects = Project.objects.filter(client__in=queryset).values('client', 'status')
How can I group by multiple fields and pass that extra data to the serializer? Or is there a better approach? I also tried prefetch_related, but total_projects_count was still executing new SQL queries.
You need to annotate the calculated fields in the queryset and then, instead of calling the methods, use the annotated columns to return the relevant result. This will make sure that all data is retrieved using a single query, which will definitely be faster.
Update your queryset.
class ClientViewSet(ModelViewSet):
    """
    A simple view for creating clients, updating and retrieving
    """
    model = Client
    queryset = Client.objects.annotate(total_projects_count_val=...)
    serializer_class = ClientSerializer
Then, in your serializer, use the annotated column
class ClientSerializer(serializers.ModelSerializer):
    total_projects_count = serializers.SerializerMethodField()
    on_going_projects_count = serializers.SerializerMethodField()
    completed_projects_count = serializers.SerializerMethodField()

    class Meta:
        model = Client
        fields = '__all__'

    def get_total_projects_count(self, obj):
        # Read the value annotated on the queryset instead of running a new query
        return obj.total_projects_count_val
    ...
Looking at the method names, I think you will need Case-When annotation.
I reduced the number of queries by using the annotations below:
from django.db.models import Count, Q
pending = Count('project', filter=Q(project__status="pending"))
finished = Count('project', filter=Q(project__status="finished"))
queryset = Client.objects.annotate(pending=pending).annotate(finished=finished)
I was now able to access queryset[0].finished and so on. As I was using the pagination provided by DRF, the generated query was:
SELECT "clients_client"."id",
"clients_client"."created_at",
"clients_client"."updated_at",
"clients_client"."client_name",
"clients_client"."phone_number",
"clients_client"."email",
"clients_client"."address_lane",
"clients_client"."state",
"clients_client"."country",
"clients_client"."zipCode",
"clients_client"."registration_number",
"clients_client"."gst",
COUNT("projects_project"."id") FILTER (WHERE "projects_project"."status" = 'pending') AS "pending",
COUNT("projects_project"."id") FILTER (WHERE "projects_project"."status" = 'finished') AS "finished"
FROM "clients_client"
LEFT OUTER JOIN "projects_project"
ON ("clients_client"."id" = "projects_project"."client_id")
GROUP BY "clients_client"."id"
ORDER BY "clients_client"."id" ASC
LIMIT 6
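For anyone who wants to try the grouped query outside Django, here is a minimal sketch using Python's built-in sqlite3 module. It is written with SUM(CASE WHEN ...), i.e. the Case-When form mentioned earlier, rather than COUNT ... FILTER. The two tables and their rows are made up for illustration and are much smaller than the real Client and Project models.

```python
import sqlite3

# Hypothetical, stripped-down tables standing in for the Client and Project models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, client_name TEXT);
    CREATE TABLE project (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(id),
        status TEXT
    );
    INSERT INTO client VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO project (client_id, status) VALUES
        (1, 'pending'), (1, 'finished'), (1, 'finished'), (2, 'pending');
""")

# A single grouped query returns every counter for every client,
# including clients that have no projects at all (LEFT OUTER JOIN).
rows = conn.execute("""
    SELECT c.id,
           c.client_name,
           COUNT(p.id) AS total,
           SUM(CASE WHEN p.status = 'pending' THEN 1 ELSE 0 END) AS pending,
           SUM(CASE WHEN p.status = 'finished' THEN 1 ELSE 0 END) AS finished
    FROM client AS c
    LEFT OUTER JOIN project AS p ON c.id = p.client_id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

The point is only that all three counters come back in one round trip per page of clients, which is what the annotate() call achieves through the ORM.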

How to implement a simple "like" feature in Django REST Framework?

I'm a beginner building the backend API for a social media clone using DRF. The frontend will be built later and not in Django. I'm currently using Postman to interact with the API.
I'm trying to implement a "like" feature as you would have on Facebook or Instagram. I cannot send the correct data with Postman to update the fields which bear the many-to-many relationship.
Here is some of my code:
models.py
class User(AbstractUser):
    liked_haikus = models.ManyToManyField('Haiku', through='Likes')

class Haiku(models.Model):
    user = models.ForeignKey(User, related_name='haikus', on_delete=models.CASCADE)
    body = models.CharField(max_length=255)
    liked_by = models.ManyToManyField('User', through='Likes')
    created_at = models.DateTimeField(auto_now_add=True)

class Likes(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    haiku = models.ForeignKey(Haiku, on_delete=models.CASCADE)
serializers.py
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['username', 'password', 'url', 'liked_haikus']
        extra_kwargs = {'password': {'write_only': True}}

    def create(self, validated_data):
        password = validated_data.pop('password')
        user = User(**validated_data)
        user.set_password(password)
        user.save()
        token = Token.objects.create(user=user)
        return user

class HaikuSerializer(serializers.ModelSerializer):
    class Meta:
        model = Haiku
        fields = ['user', 'body', 'liked_by', 'created_at']

class LikesSerializer(serializers.ModelSerializer):
    model = Likes
    fields = ['haiku_id', 'user_id']
views.py
class UserViewSet(viewsets.ModelViewSet):
    queryset = User.objects.all()
    serializer_class = UserSerializer
    permission_classes = [permissions.IsAuthenticated]

    # Extra route that lists the haikus written by one user
    @action(detail=True, methods=['get'])
    def haikus(self, request, pk=None):
        user = self.get_object()
        serializer = serializers.HaikuSerializer(user.haikus.all(), many=True)
        return Response(serializer.data)

class UserCreateViewSet(viewsets.ModelViewSet):
    queryset = User.objects.all()
    serializer_class = UserSerializer
    permission_classes = [permissions.AllowAny]

class HaikuViewSet(viewsets.ModelViewSet):
    queryset = Haiku.objects.all()
    serializer_class = HaikuSerializer
    permission_classes = [permissions.IsAuthenticated]

class LikesViewSet(viewsets.ModelViewSet):
    queryset = Likes.objects.all()
    serializer_class = LikesSerializer
    permission_classes = [permissions.IsAuthenticated]
urls.py
router = routers.DefaultRouter(trailing_slash=False)
router.register('users', views.UserViewSet)
router.register('haikus', views.HaikuViewSet)
router.register('register', views.UserCreateViewSet)
router.register('likes', views.LikesViewSet)
urlpatterns = [
path('admin/', admin.site.urls),
path('', include(router.urls)),
path('api-auth/', include('rest_framework.urls', namespace='rest_framework')),
path('api-auth-token', obtain_auth_token, name='api_token_auth')
]
Using the Django Admin I can manually set users to like posts and the fields in the db will update and reflect in API requests.
With Postman, I've tried sending both PUT and PATCH to, for example:
http://127.0.0.1:8000/haikus/2
with "form data" where key ="liked_by" and value="3" (Where 3 is a user_id). I got a 200 response and JSON data for the endpoint back, but there was no change in the data.
I've tried GET and POST to http://127.0.0.1:8000/likes and I receive the following error message:
AttributeError: 'list' object has no attribute 'values'
I've looked at nested-serializers in the DRF docs, but they don't seem to be quite the same use-case.
How can I correct my code and use Postman to properly update the many-to-many fields?
I think I need to probably write an update function to one or several of the ViewSets or Serializers, but I don't know which one and don't quite know how to go about it.
All guidance, corrections and resources appreciated.
To update the liked_by many-to-many field, the serializer expects you to provide primary key(s).
Just edit your HaikuSerializer as follows and it will work.
class HaikuSerializer(serializers.ModelSerializer):
    liked_by = serializers.PrimaryKeyRelatedField(
        many=True,
        queryset=User.objects.all())

    class Meta:
        model = models.Haiku
        fields = ['created_by', 'body', 'liked_by', 'created_at']

    def update(self, instance, validated_data):
        liked_by = validated_data.pop('liked_by')
        for i in liked_by:
            instance.liked_by.add(i)
        instance.save()
        return instance
adnan kaya has provided the correct code, and I have upvoted and accepted his answer. I want to go through his solution to explain it for future readers of this question.
liked_by = serializers.PrimaryKeyRelatedField(
    many=True,
    queryset=User.objects.all())
You can read about PrimaryKeyRelatedField here: https://www.django-rest-framework.org/api-guide/relations/
Since liked_by is a ManyToManyField it has special properties in that ManyToMany relations create a new table in the DB that relates pks to each other. This line tells Django that this field is going to refer to one of these tables via its primary key. It tells it that liked by is going to have multiple objects in it and it tells it that these objects are going to come from a particular queryset.
def update(self, instance, validated_data):
    liked_by = validated_data.pop('liked_by')
    for i in liked_by:
        instance.liked_by.add(i)
    instance.save()
    return instance
ModelSerializer is a class that provides its own built-in create and update methods, which are fairly basic and operate in a straightforward manner. update, for example, will just update the field: it takes the incoming data and uses it to replace the existing data in the field it is directed at.
You can read more about ModelSerializers here: https://www.django-rest-framework.org/api-guide/serializers/#modelserializer
You can override these methods by declaring your own. I have declared update here. update is a method that takes three arguments. The first is self. You can call this whatever you want, but there is a strong convention to call it self for readability. It refers to the serializer object the method was called on, so you can use that class's other methods and attributes. Next is instance. This is the model object (here, a Haiku) that is currently stored and that you are trying to update. Finally, there is validated_data. This is the incoming data, after validation, that you want to apply to the entry. It is a dictionary.
liked_by = validated_data.pop('liked_by')
Because validated_data is a dictionary you can use the .pop() method on it. Pop can take the key of the dictionary and "pop it off" leaving you with the value (more formally, .pop('key') will return its 'value'). This is nice because, at least in my case, it is the value that you want added to the entry.
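Here is a small sketch of what .pop() does, using a plain dict as a stand-in for validated_data. The keys and values are invented for illustration (in DRF the liked_by entries would be User objects rather than integers). The second half shows one thing worth knowing: if the request leaves out liked_by, as a PATCH may do, a bare .pop('liked_by') raises KeyError, and passing a default avoids that.

```python
# Plain dict standing in for validated_data; the keys and values are made up.
validated_data = {"body": "an old silent pond", "liked_by": [3, 7]}

# pop() removes the key and hands back its value.
liked_by = validated_data.pop("liked_by")
print(liked_by)        # the list that was stored under "liked_by"
print(validated_data)  # the dict no longer contains "liked_by"

# With a default, a missing key no longer raises KeyError.
partial_data = {"body": "a frog jumps in"}
no_likes = partial_data.pop("liked_by", [])
print(no_likes)
```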
for i in liked_by:
    instance.liked_by.add(i)
This is a simple Python for loop. It is here because, in my use case, the value taken from the validated_data dictionary is potentially a list.
The .add() method is a special method that can be used with ManyToMany relationships. You can read about the special methods for ManyToMany relations here: https://docs.djangoproject.com/en/3.1/ref/models/relations/
It does what it advertises: it adds the value you send to it to the relation you call it on, instead of replacing what is already there. In this case that is instance.liked_by (the current contents of the entry).
instance.save()
This saves the new state of the instance. (The .add() calls already write the new relation rows to the database, so save() only matters if other fields on the instance were changed.)
return instance
This returns the updated instance, now with the validated data appended to it.
I'm not sure if this is the most ideal, Pythonic, or efficient way of implementing a like feature in a social media web app, but it is a straightforward way of doing it. This code can be repurposed to add all sorts of many-to-many relationships to your models (friends lists/followers and tags, for example).
This is my understanding of what is going on here, and I hope it helps make the confusing topic of ManyToMany relationships clearer.

Django REST Framework: Setting up prefetching for nested serializers

My Django-powered app with a DRF API is working fine, but I've started to run into performance issues as the database gets populated with actual data. I've done some profiling with Django Debug Toolbar and found that many of my endpoints issue tens to hundreds of queries in the course of returning their data.
I expected this, since I hadn't previously optimized anything with regard to database queries. Now that I'm setting up prefetching, however, I'm having trouble making use of properly prefetched serializer data when that serializer is nested in a different serializer. I've been using this awesome post as a guide for how to think about the different ways to prefetch.
Currently, my ReadingGroup serializer does prefetch properly when I hit the /api/readinggroups/ endpoint. My issue is the /api/userbookstats/ endpoint, which returns all UserBookStats objects. The related serializer, UserBookStatsSerializer, has a nested ReadingGroupSerializer.
The models, serializers, and viewsets are as follows:
models.py
class ReadingGroup(models.Model):
    owner = models.ForeignKey(settings.AUTH_USER_MODEL)
    users = models.ManyToManyField(settings.AUTH_USER_MODEL)
    book_type = models.ForeignKey(BookType)
    ....
    <other group related fields>

    def __str__(self):
        return '%s group: %s' % (self.name, self.book_type)

class UserBookStats(models.Model):
    reading_group = models.ForeignKey(ReadingGroup)
    user = models.ForeignKey(settings.AUTH_USER_MODEL)
    alias = models.CharField()
    total_books_read = models.IntegerField(default=0)
    num_books_owned = models.IntegerField(default=0)
    fastest_read_time = models.IntegerField(default=0)
    average_read_time = models.IntegerField(default=0)
serializers.py
class ReadingGroupSerializer(serializers.ModelSerializer):
    users = UserSerializer(many=True, read_only=True)
    owner = UserSerializer(read_only=True)

    class Meta:
        model = ReadingGroup
        fields = ('url', 'id', 'owner', 'users')

    @staticmethod
    def setup_eager_loading(queryset):
        # select_related for 'to-one' relationships
        queryset = queryset.select_related('owner')
        # prefetch_related for 'to-many' relationships
        queryset = queryset.prefetch_related('users')
        return queryset

class UserBookStatsSerializer(serializers.HyperlinkedModelSerializer):
    reading_group = ReadingGroupSerializer()
    user = UserSerializer()
    awards = AwardSerializer(source='award_set', many=True)

    class Meta:
        model = UserBookStats
        fields = ('url', 'id', 'alias', 'total_books_read', 'num_books_owned',
                  'average_read_time', 'fastest_read_time', 'awards')

    @staticmethod
    def setup_eager_loading(queryset):
        # select_related for 'to-one' relationships
        queryset = queryset.select_related('user')
        # prefetch_related for 'to-many' relationships
        queryset = queryset.prefetch_related('award_set')
        # set up prefetching for nested serializers
        groups = Prefetch('reading_group', queryset=ReadingGroup.objects.prefetch_related('userbookstats_set'))
        queryset = queryset.prefetch_related(groups)
        return queryset
views.py
class ReadingGroupViewset(viewsets.ModelViewSet):
    def get_queryset(self):
        qs = ReadingGroup.objects.all()
        qs = self.get_serializer_class().setup_eager_loading(qs)
        return qs

class UserBookStatsViewset(viewsets.ModelViewSet):
    def get_queryset(self):
        qs = UserBookStats.objects.all()
        qs = self.get_serializer_class().setup_eager_loading(qs)
        return qs
I've optimized the prefetching for the ReadingGroup endpoint (I actually posted about eliminating duplicate queries for that endpoint here), and now I'm working on the UserBookStats endpoint.
The issue I'm having is that, with my current setup_eager_loading in the UserBookStatsSerializer, it doesn't appear to use the prefetching set up by the eager loading method in the ReadingGroupSerializer. I'm still a little hazy on the syntax for the Prefetch object - I was inspired by this excellent answer to try that approach.
Obviously the get_queryset method of UserBookStatsViewset doesn't call setup_eager_loading for the ReadingGroup objects, but I'm sure there's a way to accomplish the same prefetching.
prefetch_related() supports prefetching inner relations by using double underscore syntax:
queryset = queryset.prefetch_related('reading_group', 'reading_group__users', 'reading_group__owner')
I don't think Django REST provides any elegant solutions out of the box for fetching all necessary fields automatically.
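To make the effect of prefetching concrete, here is a rough model of it in plain Python with sqlite3. This is not Django's implementation, just a sketch of the general idea (one extra query per relation, with the joining done in Python). The table names, the run() helper, and the sample rows are all made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical miniature schema: each stats row points at a reading group,
# and each group has many users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stats (id INTEGER PRIMARY KEY, alias TEXT, reading_group_id INTEGER);
    CREATE TABLE group_users (reading_group_id INTEGER, username TEXT);
    INSERT INTO stats VALUES (1, 'ann', 1), (2, 'bob', 1), (3, 'cy', 2);
    INSERT INTO group_users VALUES (1, 'ann'), (1, 'bob'), (2, 'cy');
""")

query_count = 0

def run(sql, params=()):
    """Execute one query, count it, and return all rows."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

def load_naive():
    """One query for the stats, then one more per row for that row's group users."""
    result = []
    for alias, group_id in run("SELECT alias, reading_group_id FROM stats ORDER BY id"):
        users = run(
            "SELECT username FROM group_users WHERE reading_group_id = ? ORDER BY username",
            (group_id,),
        )
        result.append((alias, [name for (name,) in users]))
    return result

def load_prefetched():
    """One query for the stats, one IN query for all users, joined in Python."""
    stats = run("SELECT alias, reading_group_id FROM stats ORDER BY id")
    group_ids = sorted({group_id for _, group_id in stats})
    placeholders = ", ".join("?" for _ in group_ids)
    users_by_group = defaultdict(list)
    for group_id, name in run(
        "SELECT reading_group_id, username FROM group_users "
        f"WHERE reading_group_id IN ({placeholders}) ORDER BY username",
        group_ids,
    ):
        users_by_group[group_id].append(name)
    return [(alias, users_by_group[group_id]) for alias, group_id in stats]

query_count = 0
naive = load_naive()
naive_queries = query_count

query_count = 0
prefetched = load_prefetched()
prefetched_queries = query_count

print(naive_queries, prefetched_queries)
```

Both functions return the same data; the naive one issues a query per stats row, while the prefetched one stays at two queries no matter how many rows there are.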
As an alternative to prefetching all nested relationships manually, there is a package called django-auto-prefetching, which automatically traverses the related fields on your model and serializer to find all the models that need to be mentioned in prefetch_related and select_related calls. All you need to do is add the AutoPrefetchViewSetMixin to your ViewSets:
from django_auto_prefetching import AutoPrefetchViewSetMixin

class ReadingGroupViewset(AutoPrefetchViewSetMixin, viewsets.ModelViewSet):
    def get_queryset(self):
        qs = ReadingGroup.objects.all()
        return qs

class UserBookStatsViewset(AutoPrefetchViewSetMixin, viewsets.ModelViewSet):
    def get_queryset(self):
        qs = UserBookStats.objects.all()
        return qs
Any extra prefetches with more complex Prefetch objects can be added in the get_queryset method on the ViewSet.

djangorestframework: Filtering in a related field

Basically, I want to filter out inactive users from a related field of a ModelSerializer. I tried Dynamically limiting queryset of related field as well as the following:
class MySerializer(serializers.ModelSerializer):
    users = serializers.PrimaryKeyRelatedField(queryset=User.objects.filter(active=True), many=True)

    class Meta:
        model = MyModel
        fields = ('users',)
Neither of these approaches worked for just filtering the queryset. I want to do this for a nested related Serializer class as a field (but couldn't even get it to work with a RelatedField).
How do I filter queryset for nested relation?
I'll be curious to see a better solution as well. I've used a custom method in my serializer to do that. It's a bit more verbose but at least it's explicit.
Some pseudo code where a GarageSerializer would filter the nested relation of cars:
class MyGarageSerializer(...):
    users = serializers.SerializerMethodField('get_cars')

    def get_cars(self, garage):
        # Build the filtered queryset by hand, then serialize it with the nested serializer
        cars_queryset = Car.objects.all().filter(Q(garage=garage) | ...).select_related()
        serializer = CarSerializer(instance=cars_queryset, many=True, context=self.context)
        return serializer.data
Obviously, replace the queryset with whatever you want. You don't always need to pass the context (I used it to retrieve some query parameters in the nested serializer), and you probably don't need the .select_related (that was an optimisation).
One way to do this is to create a method on the Model itself and reference it in the serializer:
# models.py
class MyModel(models.Model):
    #...
    def my_filtered_field(self):
        return self.othermodel_set.filter(field_a='value_a').order_by('field_b')[:10]

# serializers.py
class MyModelSerializer(serializers.ModelSerializer):
    my_filtered_field = OtherModelSerializer(many=True, read_only=True)

    class Meta:
        model = MyModel
        fields = [
            'my_filtered_field',
            # Other fields ...
        ]
Another way to avoid the SerializerMethodField solution and therefore still allow writing to the serializer as well would be to subclass the RelatedField and do the filtering there.
To only allow active users as values for the field, the example would look like:
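class ActiveUsersPrimaryKeyField(serializers.PrimaryKeyRelatedField):
    def get_queryset(self):
        # Narrow the field's base queryset down to active users
        return super().get_queryset().filter(active=True)

class MySerializer(serializers.ModelSerializer):
    # The base queryset must still be supplied; without it super().get_queryset() returns None
    users = ActiveUsersPrimaryKeyField(many=True, queryset=User.objects.all())

    class Meta:
        model = MyModel
        fields = ('users',)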
Also see this response.
Note that this only restricts the set of input values to active users, though, i.e. only when creating or updating model instances, inactive users will be disallowed.
If you also use your serializer for reading and MyModel already has a relation to a user that has become inactive in the meantime, it will still be serialized. To prevent this, one way is to filter the relation using django's Prefetch objects. Basically, you'll filter out inactive users before they even get into the serializer:
from django.db.models import Prefetch

# Fetch a model instance, eagerly prefetching only those users that are active
model_with_active_users = MyModel.objects.prefetch_related(
    Prefetch("users", queryset=User.objects.filter(active=True))
).first()

# Serialize the data with the serializer defined above and see that only active users are returned
data = MySerializer(model_with_active_users).data

How can I apply a filter to a nested resource in Django REST framework?

In my app I have the following models:
class Zone(models.Model):
    name = models.SlugField()

class ZonePermission(models.Model):
    zone = models.ForeignKey('Zone')
    user = models.ForeignKey(User)
    is_administrator = models.BooleanField()
    is_active = models.BooleanField()
I am using Django REST framework to create a resource that returns zone details plus a nested resource showing the authenticated user's permissions for that zone. The output should be something like this:
{
    "name": "test",
    "current_user_zone_permission": {
        "is_administrator": true,
        "is_active": true
    }
}
I've created serializers like so:
class ZonePermissionSerializer(serializers.ModelSerializer):
    class Meta:
        model = ZonePermission
        fields = ('is_administrator', 'is_active')

class ZoneSerializer(serializers.HyperlinkedModelSerializer):
    current_user_zone_permission = ZonePermissionSerializer(source='zonepermission_set')

    class Meta:
        model = Zone
        fields = ('name', 'current_user_zone_permission')
The problem with this is that when I request a particular zone, the nested resource returns the ZonePermission records for all the users with permissions for that zone. Is there any way of applying a filter on request.user to the nested resource?
BTW I don't want to use a HyperlinkedIdentityField for this (to minimise http requests).
Solution
This is the solution I implemented based on the answer below. I added the following code to my serializer class:
current_user_zone_permission = serializers.SerializerMethodField('get_user_zone_permission')

def get_user_zone_permission(self, obj):
    user = self.context['request'].user
    zone_permission = ZonePermission.objects.get(zone=obj, user=user)
    serializer = ZonePermissionSerializer(zone_permission)
    return serializer.data
Thanks very much for the solution!
I'm faced with the same scenario. The best solution that I've found is to use a SerializerMethodField and have that method query and return the desired values. You can have access to request.user in that method through self.context['request'].user.
Still, this seems like a bit of a hack. I'm fairly new to DRF, so maybe someone with more experience can chime in.
You have to use filter instead of get; otherwise, if no record or more than one record matches, you will get an exception.
current_user_zone_permission = serializers.SerializerMethodField('get_user_zone_permission')

def get_user_zone_permission(self, obj):
    user = self.context['request'].user
    # filter() returns a queryset, so the nested serializer needs many=True
    zone_permission = ZonePermission.objects.filter(zone=obj, user=user)
    serializer = ZonePermissionSerializer(zone_permission, many=True)
    return serializer.data
You can now subclass ListSerializer and override its to_representation method, using the approach I described here: https://stackoverflow.com/a/28354281/3246023
By default, to_representation calls data.all() on the nested queryset. So you effectively need to apply data = data.filter(**your_filters) before that method runs. Then you need to set your subclassed ListSerializer as the list_serializer_class on the Meta of the nested serializer.
1) Subclass ListSerializer, overriding to_representation and then calling super
2) Add the subclassed ListSerializer as the Meta list_serializer_class on the nested serializer
If you're using the QuerySet / filter in multiple places, you could use a getter function on your model, and then even drop the 'source' kwarg for the Serializer / Field. DRF automatically calls functions/callables if it finds them when using its get_attribute function.
class Zone(models.Model):
    name = models.SlugField()

    def current_user_zone_permission(self):
        return ZonePermission.objects.get(zone=self, user=user)
I like this method because it keeps your API consistent under the hood with the api over HTTP.
class ZoneSerializer(serializers.HyperlinkedModelSerializer):
    current_user_zone_permission = ZonePermissionSerializer()

    class Meta:
        model = Zone
        fields = ('name', 'current_user_zone_permission')
Hopefully this helps some people!
Note: The names don't need to match, you can still use the source kwarg if you need/want to.
Edit: I just realised that the function on the model doesn't have access to the user or the request. So perhaps a custom model field / ListSerializer would be more suited to this task.
I would do it in one of two ways.
1) Either do it through a prefetch in your view:
serializer = ZoneSerializer(
    Zone.objects.prefetch_related(
        Prefetch('zonepermission_set',
                 queryset=ZonePermission.objects.filter(user=request.user),
                 to_attr='current_user_zone_permission'))
    .get(id=pk))
2) Or do it through .to_representation:
class ZoneSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Zone
        fields = ('name',)

    def to_representation(self, obj):
        data = super(ZoneSerializer, self).to_representation(obj)
        # Only the requesting user's permissions for this zone
        permissions = ZonePermission.objects.filter(zone=obj, user=self.context['request'].user)
        data['current_user_zone_permission'] = ZonePermissionSerializer(permissions, many=True).data
        return data