How can I prefetch nested tables? - django

I'm working on an app with a DRF API. Development has been going on for some months, but only now are we running into performance issues when populating the database with actual data. I have done some profiling and found out that many endpoints query the database hundreds of times to fetch the necessary data. This is something that can be solved with select_related and prefetch_related, that much I know, but I'm having a hard time picturing it.
Models
class LegalFile(models.Model):
    code = models.CharField(max_length=75)
    description = models.CharField(max_length=255)
    contactrole = models.ManyToManyField(LegalFileContactRole, blank=True, related_name='contactroles')

class LegalFileContactRole(models.Model):
    contact = models.ForeignKey(Contact, on_delete=models.DO_NOTHING, blank=True, related_name='legal_file_contact')
    subtype = models.ForeignKey(ContactSubType, on_delete=models.DO_NOTHING, related_name='legal_file_contact_role', null=True)

class Contact(models.Model):
    name = models.CharField(max_length=150)
    surname_1 = models.CharField(max_length=50, blank=True, null=True)
    surname_2 = models.CharField(max_length=50, blank=True, null=True)

class ContactSubType(models.Model):
    pass
Say I want to list all the LegalFiles in the database, and fetch the names and both surnames of the contacts associated to each LegalFile through LegalFileContactRole. Is there a DRYer way than using lowercase notation to prefetch like LegalFile.prefetch_related('contactrole__contact__name', 'contactrole__contact__surname_1', 'contactrole__contact__surname_2')? This kind of nested relationship is a recurring thing in the app.
Editing in response to Marc Compte's comment:
I have not explained myself properly, I think. In fact, this is missing a good chunk of context - I have not mentioned serializers at all, for instance.
I have taken the approach described in this post and created a method to set eager loading up:
class LegalFileReadSerializer(serializers.ModelSerializer):
    contactrole = LegalFileContactRoleSerializer(many=True)

    @classmethod
    def setup_eager_loading(cls, queryset):
        queryset = queryset.prefetch_related('contactrole__contact__name', 'contactrole__contact__surname_1', 'contactrole__contact__surname_2')
        return queryset

    class Meta:
        model = LegalFile
        fields = '__all__'
        read_only_fields = ('__all__',)

class LegalFileViewSet(CustomViewSet):
    model = models.LegalFile
    read_serializer_class = serializers.LegalFileReadSerializer
    write_serializer_class = serializers.LegalFileWriteSerializer

    def get_queryset(self):
        queryset = super().get_queryset()
        queryset = self.get_serializer_class().setup_eager_loading(queryset)
        return queryset
(Note this is still work in progress and will have to be decoupled further for reuse, in a mixin or something)
Could I, so to speak, chain serializers? Like:
class LegalFileReadSerializer(serializers.ModelSerializer):
    contactrole = LegalFileContactRoleSerializer(many=True)

    @classmethod
    def setup_eager_loading(cls, queryset):
        queryset = queryset.prefetch_related('contactrole',)
        return queryset

    class Meta:
        model = LegalFile
        fields = '__all__'
        read_only_fields = ('__all__',)

class LegalFileContactRoleSerializer(serializers.ModelSerializer):
    contact = ContactReadSerializer()
    subtype = ContactSubTypeReadSerializer()

    @classmethod
    def setup_eager_loading(cls, queryset):
        queryset = queryset.prefetch_related('contact',)
        return queryset

    class Meta:
        model = models.LegalFileContactRole
        fields = '__all__'

class ContactReadSerializer(serializers.ModelSerializer):
    # And so on
    pass
This, in my head at least, makes sense, but I don't know if this behavior is possible or even advisable. It seems that if I chain serializers like this (implying again that it's possible to do so) I'm gonna run into problems picking up too much unneeded information.
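One point worth noting for the question above: prefetch_related() takes relation paths, not column names, so a lookup has to stop at the related model (contactrole__contact); name, surname_1 and surname_2 then arrive with each Contact row. A minimal sketch of one DRY way to build those paths follows. The helper prefetch_paths and the LEGAL_FILE_RELATIONS tree are hypothetical names, not part of Django or DRF; the tree is assumed to mirror how the read serializers nest.

```python
def prefetch_paths(tree, prefix=""):
    """Flatten a nested relation tree into prefetch_related() lookup strings."""
    paths = []
    for name, children in tree.items():
        path = f"{prefix}__{name}" if prefix else name
        if children:
            # A deeper lookup also prefetches every relation on the way to it,
            # so only the leaf paths need to be emitted.
            paths.extend(prefetch_paths(children, path))
        else:
            paths.append(path)
    return paths


# Hypothetical description of the nesting: LegalFile -> contactrole -> contact / subtype.
LEGAL_FILE_RELATIONS = {"contactrole": {"contact": {}, "subtype": {}}}

lookups = prefetch_paths(LEGAL_FILE_RELATIONS)
print(lookups)
# Intended use inside setup_eager_loading (assumption, not run here):
#     queryset.prefetch_related(*lookups)
```

Each serializer could keep its own small tree and the parent could merge the children's trees under the field name, which is roughly the "chaining" idea described in the question without each serializer having to touch the queryset itself.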

Related

When don't we use "__all__" in ModelSerializer in Django Rest Framework

This is just my curiosity but I will be very happy if anyone answers my question.
I am using Django Rest Framework but I'm a beginner. In serializers.py, I use ModelSerializer and set the fields attribute to "__all__".
This is an example.
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = "__all__"
And then, I just thought
when don't we use "__all__" in serializers.py??
As long as we create models.py in advance, I think we usually use all fields in each Model.
I would like you to teach me when we omit specific fields that come from each Model.
Thank you.
So the second question is a bit harder to explain in a comment:
If we use some fields of all fields in Model, how do we store information of the rest of fields?
Various cases:
Fields with defaults:
class Log(models.Model):
    message = models.TextField()
    created_at = models.DateTimeField(auto_now_add=True)

class LogSerializer(serializers.ModelSerializer):
    class Meta:
        model = Log
        fields = ('message',)
For autogenerated, think user profile models via the post_save signal or calculated fields:
class OrderLine(models.Model):
    order = models.ForeignKey(Order)
    name = models.CharField(max_length=200)
    quantity = models.IntegerField()
    price = models.DecimalField()

class OrderLineSerializer(serializers.ModelSerializer):
    order = serializers.PrimaryKeyRelatedField()
    product = serializers.IntegerField()

    class Meta:
        model = OrderLine
        fields = ('quantity', 'product', 'order')
In this case, product is the primary key of a product. The serializer will have a save method that looks up the product and puts its name and price on the OrderLine. This is standard practice, as you cannot reference a product directly in your orders; otherwise your orders would change whenever you change (the price of) your product.
And derived from request:
class BlogPost(models.Model):
    author = models.ForeignKey(User)
    post = models.TextField()

class BlogPostSerializer(serializers.ModelSerializer):
    class Meta:
        model = BlogPost
        fields = ('post',)

    def create(self, validated_data):
        instance = BlogPost(**validated_data)
        instance.author = self.context['request'].user
        instance.save()
        return instance
These are pretty much the common cases.
There are many cases, but I think the two main ones are:
When you don't want all fields to be returned by the serializer.
When you need some method of the serializer to know its fields. In such case, you should traverse fields array, but it doesn't work if you use __all__, only if you have an actual list of fields.
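A small sketch of the second point, assuming it refers to reading Meta.fields directly: "__all__" is a plain sentinel string, so iterating over it yields characters rather than field names. The two Meta classes and the helper declared_field_names are hypothetical stand-ins, written without DRF so the snippet runs on its own.

```python
class AllMeta:
    # Same shape as a serializer Meta that uses the sentinel.
    fields = "__all__"


class ExplicitMeta:
    # Same shape as a serializer Meta that lists its fields.
    fields = ("message", "created_at")


def declared_field_names(meta):
    """Return the explicitly listed field names, or None when the sentinel is used."""
    if meta.fields == "__all__":
        return None
    return list(meta.fields)


print(list(AllMeta.fields))                # characters of the sentinel, not field names
print(declared_field_names(ExplicitMeta))  # the actual list of names
```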

Django rest API, nested serializer add/edit multiple real estate images to one listing?

I am pretty stuck working with DRF for the first time. I am looking to upload multiple Images to a single real estate Listing.
My image model
class Image(models.Model):
    photo = models.ImageField(blank=True, upload_to=get_image_filename)
    listing = models.ForeignKey(Listing, on_delete=models.CASCADE)
My Image, Listing, and Listing detail serializers
class ListingSerializer(serializers.HyperlinkedModelSerializer):
    image_set = ImageSerializerForListingDetail(many=True, required=False)

    class Meta:
        model = Listing
        fields = ['url', 'address', 'image_set', ]

class ListingDetailSerializer(serializers.HyperlinkedModelSerializer):
    user = AccountSerializer(read_only=True)
    image_set = ImageSerializerForListingDetail(many=True, required=False)

    class Meta:
        model = Listing
        fields = '__all__'
        depth = 1

class ImageSerializerForListingDetail(serializers.ModelSerializer):
    image_url = serializers.SerializerMethodField()

    class Meta:
        model = Image
        fields = ('image_url', )

    # The object passed in is an Image instance, not a Listing.
    def get_image_url(self, image):
        return image.photo.url
My view
class ListingViewSet(viewsets.ModelViewSet):
    queryset = Listing.objects.all()
    serializer_class = ListingSerializer
    detail_serializer_class = ListingDetailSerializer
    permission_classes = [IsOwnerOrReadOnly, ]

    '''Show detailed Listing view'''
    def get_serializer_class(self):
        if self.action == 'retrieve':
            if hasattr(self, 'detail_serializer_class'):
                return self.detail_serializer_class
        return super(ListingViewSet, self).get_serializer_class()
I am having trouble figuring out how to upload/edit multiple Images for a single Listing, and where to override. I would like this to be possible when both creating and editing listings. Any help is greatly appreciated. Thanks!
This specific use case does have a section dedicated in the docs for "Writable nested objects"
https://www.django-rest-framework.org/api-guide/serializers/#writable-nested-representations
If you're supporting writable nested representations you'll need to write .create() or .update() methods that handle saving multiple objects.
The doc should cover the appropriate example you are looking for!
It seems like this should do the trick, and then I still need to work on the update method.
class ListingSerializer(serializers.HyperlinkedModelSerializer):
    user = UsernameSerializer(read_only=True)
    image_set = ImageSerializerForListingDetail(many=True, required=False,
                                                read_only=True)

    class Meta:
        model = Listing
        exclude = ('url', )
        depth = 1

    def create(self, validated_data):
        images_data = validated_data.pop('image_set')
        listing = Listing.objects.create(**validated_data)
        for image_data in images_data:
            Image.objects.create(listing=listing, **image_data)
        return listing
Is there anything special that needs to be done with images? That's my one big concern. I always thought I needed to use request.FILES, but I am seeing that it has been deprecated in DRF 3?
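One thing to watch in a create() like the one above: DRF leaves read-only fields out of validated_data, so with read_only=True on image_set the key will not be there and a bare pop('image_set') raises KeyError. A minimal sketch of a defensive split, using plain dicts in place of validated_data; the helper name split_nested is hypothetical.

```python
def split_nested(validated_data, key):
    """Separate the nested list stored under `key` from the parent's own fields.

    Returns (parent_fields, children). A missing key yields an empty list
    instead of raising KeyError, which matters when the nested field is
    read-only or simply omitted from the request.
    """
    parent_fields = dict(validated_data)  # copy so the caller's dict is untouched
    children = parent_fields.pop(key, [])
    return parent_fields, children


# Nested data present: it is split off from the listing's own fields.
print(split_nested({"address": "1 Main St", "image_set": [{"photo": "a.jpg"}]}, "image_set"))
# Nested data absent: no exception, just an empty list of children.
print(split_nested({"address": "1 Main St"}, "image_set"))
```

Inside the serializer the same idea is simply validated_data.pop('image_set', []) before creating the Listing.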

Polymorphic models serializer

I'm using a Polymorphic model for setting up notifications:
My models:
class Notification(PolymorphicModel):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    created_by = models.ForeignKey(ElsUser, on_delete=models.CASCADE, default=None, related_name="creatednotifications")
    created_on = models.DateTimeField(default=timezone.now)
    created_for = models.ForeignKey(ElsUser, on_delete=models.CASCADE, default=None, related_name="receivednotifications")
    read = models.DateTimeField(default=None, null=True, blank=True)
    message = models.CharField(default=None, blank=True, null=True, max_length=800)

    @property
    def total(self):
        return self.objects.filter(created_for=self.request.user).count()

    @property
    def unread(self):
        return self.objects.filter(created_for=self.request.user,read=None).count()

    @property
    def read(self):
        return self.objects.filter(created_for=self.request.user).exclude(read=None).count()

class WorkflowNotification(Notification):
    # permission_transition = models.ForeignKey(WorkflowStatePermissionTransition, on_delete=models.CASCADE)
    action = models.ForeignKey(UserAction, on_delete=models.CASCADE)
Currently I have just one model, WorkflowNotification, inheriting from the polymorphic model, but there will be many more in the future.
I'm trying to get the count (total) of notifications for the logged-in user in the API. total is defined as a property on the model for that purpose.
My serializer:
class NotificationSerializer(serializers.ModelSerializer):
    total = serializers.ReadOnlyField()
    read = serializers.ReadOnlyField()
    unread = serializers.ReadOnlyField()

    class Meta:
        model = Notification
        fields = ['id', 'total','read', 'unread']
In the view:
class NotificationsMeta(generics.ListAPIView):
    serializer_class = NotificationSerializer
    queryset = Notification.objects.all()
When I try to run the server it shows:
Got AttributeError when attempting to get a value for field `total` on serializer `NotificationSerializer`.
The serializer field might be named incorrectly and not match any attribute or key on the `WorkflowNotification` instance.
Original exception text was: Manager isn't accessible via WorkflowNotification instances.
Since you need the 'meta data' only, what is the use of making a model serializer? Or any serializer, for that matter? Serializers will give you serialized instances of the objects of your model. So if you have multiple objects, you will get multiple serialized objects in response.
Just make your view a normal APIView. Since there is no necessity of serializing anything.
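from rest_framework.response import Response
from rest_framework.views import APIView

class NotificationsMeta(APIView):
    def get(self, request, format=None):
        qs = Notification.objects.filter(created_for=self.request.user)
        response = {
            'total': qs.count(),
            # read is a nullable timestamp: NULL means the notification is still unread
            'read': qs.exclude(read=None).count(),
            'unread': qs.filter(read=None).count()
        }
        return Response(response)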
Now remove those property functions from your model.
I didn't test your queries, just copied them from your model. You will need to check if they are working properly. Hope this helps.
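Since read is a nullable timestamp on the model, a NULL value means "not read yet", so the unread count filters on read=None and the read count excludes it. A minimal sketch of that counting logic in plain Python, with a hypothetical NotificationRow dataclass standing in for the model and a list standing in for the queryset:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class NotificationRow:
    """Stand-in for a Notification row; `read` stays None until it is read."""
    created_for: str
    read: Optional[datetime] = None


def notification_counts(rows: List[NotificationRow], user: str) -> dict:
    """Mirror filter(created_for=user) plus the read/unread split."""
    mine = [row for row in rows if row.created_for == user]
    unread = sum(1 for row in mine if row.read is None)
    return {"total": len(mine), "read": len(mine) - unread, "unread": unread}


rows = [
    NotificationRow("alice"),
    NotificationRow("alice", read=datetime(2020, 1, 1)),
    NotificationRow("bob"),
]
print(notification_counts(rows, "alice"))
```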
I am not sure how a model property that runs its own query can return the right data through the serializer; I have a knowledge gap there. As an alternative, I think the following should work.
class NotificationSerializer(serializers.ModelSerializer):
    total = serializers.SerializerMethodField()
    read = serializers.ReadOnlyField()
    unread = serializers.ReadOnlyField()

    class Meta:
        model = Notification
        # a field declared on the serializer must also be listed here
        fields = ['total', 'read', 'unread']

    def get_total(self, obj):
        user = self.context['request'].user
        return Notification.objects.filter(created_for=user).count()
If this works, you can do the same kind of thing for read and unread too.
In order to get the notifications for the current user, we need to override get_queryset in the view.
class NotificationsMeta(generics.ListAPIView):
    serializer_class = NotificationSerializer

    def get_queryset(self):
        return Notification.objects.filter(created_for=self.request.user)

Django DRF: POST to CreateAPIView with list of PrimaryKeyRelatedFields

I have a many-to-one relationship between the following models
class Story(models.Model):
    id = models.CharField(max_length=12, primary_key=True)

class Article(models.Model):
    id = models.CharField(max_length=16, primary_key=True)
    title = models.CharField(max_length=500)
    address = models.URLField()
    story = models.ForeignKey(to=Story, blank=True, null=True, on_delete=models.CASCADE)
Suppose I post several article objects to the database successfully.
I identify that the articles with the ids
['1', '2', '3']
are all reporting on a particular Story.
I want to create a Story via a POST method to a CreateAPIView view like this
POST http://127.0.0.1/news/story {'articles': ['1', '2', '3']}
Here is my serializer
class StorySerializer(serializers.ModelSerializer):
    id = serializers.ReadOnlyField()
    articles = serializers.PrimaryKeyRelatedField(many=True, allow_empty=False, queryset=Article.objects.all())

    class Meta:
        model = Story
        fields = ('id', 'articles')
Here is my view
class StoryList(generics.ListCreateAPIView):
    serializer_class = StorySerializer
    queryset = Story.objects.all()
I want to ensure that 1) the articles exist. 2) the article story is updated before the Story object is created.
Suppose I run this as it is, I will get the following error:
Got a TypeError when calling Story.objects.create(). This may be
because you have a writable field on the serializer class that is not
a valid argument to Story.objects.create(). You may need to make the
field read-only, or override the StorySerializer.create() method to
handle this correctly.
So here is an attempt to override the create() method:
def create(self, validated_data):
    story_id = None
    for article_id in validated_data['articles']:
        article = Article.objects.get(id=article_id)
        story_id = article.story_id
        if story_id:
            break
    story = Story.objects.get(id=story_id) if story_id else Story.objects.create()
    for article_id in validated_data['articles']:
        article = Article.objects.get(id=article_id)
        article.story_id = story.id
        article.save()
    story.save()
    return story

def update(self, instance, validated_data):
    return self.create(validated_data)
The idea here is to make sure there are no overlapping stories by merging them.
When I try POST to this view, I encounter a DoesNotExist thrown by the line Article.objects.get(id=article_id)
My questions are
1) Minor : Why am I getting this error
2) Major : Is there a cleaner / correct way of addressing such a use case in django?
Thank you
class StoryList(generics.ListCreateAPIView):
    serializer_class = StorySerializer
    query_set = Story.objects.all()
It should be queryset, not query_set.
1) Minor : Why am I getting this error
You passed an article_id that does not exist.
2) Major : Is there a cleaner / correct way of addressing such a use case in django?
drf-writable-nested can handle nested writes in DRF well.
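It may also help to know that a PrimaryKeyRelatedField declared with a queryset resolves each submitted key to a model instance during validation, so validated_data['articles'] holds Article objects rather than ids, and an unknown id is rejected with a validation error before create() runs. Under that assumption the second lookup in create() is unnecessary. Below is a minimal sketch of the merge logic on its own, with a hypothetical ArticleStub dataclass in place of the model and a hypothetical helper assign_story:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArticleStub:
    """Stand-in for an Article instance as it would appear in validated_data."""
    id: str
    story_id: Optional[str] = None


def assign_story(articles: List[ArticleStub], new_story_id: str) -> str:
    """Reuse the first story already attached to any article, else use new_story_id.

    Every article ends up pointing at the chosen story, which is the
    "merge overlapping stories" idea from the question.
    """
    story_id = next((a.story_id for a in articles if a.story_id), new_story_id)
    for article in articles:
        article.story_id = story_id  # in Django this would be followed by article.save()
    return story_id


articles = [ArticleStub("1"), ArticleStub("2", story_id="s9"), ArticleStub("3")]
print(assign_story(articles, new_story_id="s-new"))
print([a.story_id for a in articles])
```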

Conditionally Limiting Related Resource Data in TastyPie

I am new to Tastypie (and Django) and am running into a problem with circular many-to-many relationships in my app's api.
I have 3 models: RiderProfile, Ride, and RideMembership. A RiderProfile can belong to multiple Rides, and a Ride can have multiple RiderProfiles. The many-to-many relationship is mediated by RideMembership. My models look like:
class RiderProfile(models.Model):
    user = models.OneToOneField(User)
    age = models.IntegerField(max_length=2)
    rides = models.ManyToManyField('riderapp.Ride', through="RideMembership")

    def __unicode__(self):
        return self.user.get_username()

class Ride(models.Model):
    name = models.CharField(max_length=64)
    riders = models.ManyToManyField(RiderProfile, through="RideMembership")

    def __unicode__(self):
        return self.name

class RideMembership(models.Model):
    rider = models.ForeignKey(RiderProfile)
    ride = models.ForeignKey(Ride)
    date_joined = models.DateField()
    invite_reason = models.CharField(max_length=64)

    def __unicode__(self):
        # name is a CharField value, not a method, so it must not be called
        return self.rider.user.get_username() + ' to ' + self.ride.name
My TastyPie resources look like:
class UserResource(ModelResource):
    ...

class RideResource(ModelResource):
    class Meta:
        queryset = Ride.objects.all()
        resource_name = 'rides'

    riders = fields.ToManyField('riderapp.api.RiderProfileResource', 'riders', full=True)

class RiderProfileResource(ModelResource):
    class Meta:
        queryset = RiderProfile.objects.all()
        resource_name = 'riders'

    user = fields.ForeignKey(UserResource, 'user', full=True)
    rides = fields.ToManyField('riderapp.api.RideResource', 'rides', full=True)
When I GET either a RiderProfile or Ride (list or detail), I get a recursion error because the models are fetching themselves infinitely. I have tried using the RelationalField.use_in parameter, which is very close to what I am trying to accomplish, as it prevents a field from being included based on whether the request is for a list or a detail. However, I am trying to remove a resource field based on which endpoint is called.
For instance, a request for /rides:
I would like to have a list of all the RiderProfile items involved, but without their Ride list.
Likewise, a request for /riders:
I would like to have a list of all the Ride items for the RiderProfile, but without their Rider list.
What is the recommended solution for this? I have been playing with the dehydrate cycle, but am struggling to modify the set of related resources. I have also read answers about using multiple ModelResources for Rides and Riders. Is there a recommended way to accomplish this?
Thanks in advance for your advice!
Update
I added extra ModelResources for use with each endpoint (RiderProfileForRideResource and RideForRiderProfileResource), and it is working. I am just not sure this is the best approach. It creates additional endpoints that I don't really want to expose. Any thoughts on a better way?
class UserResource(ModelResource):
    ...

class RideResource(ModelResource):
    class Meta:
        queryset = Ride.objects.all()
        resource_name = 'rides'

    riders = fields.ToManyField('riderapp.api.RiderProfileForRideResource', 'riders', full=True)

class RideForRiderProfileResource(ModelResource):
    class Meta:
        queryset = Ride.objects.all()
        resource_name = 'rides_for_riders'

class RiderProfileResource(ModelResource):
    class Meta:
        queryset = RiderProfile.objects.all()
        resource_name = 'riders'

    user = fields.ForeignKey(UserResource, 'user', full=True)
    rides = fields.ToManyField('riderapp.api.RideForRiderProfileResource', 'rides', full=True)

class RiderProfileForRideResource(ModelResource):
    class Meta:
        queryset = RiderProfile.objects.all()
        resource_name = 'riders_for_ride'

    user = fields.ForeignKey(UserResource, 'user', full=True)

class RideMembershipResource(ModelResource):
    class Meta:
        queryset = RideMembership.objects.all()
        resource_name = 'rider_membership'
This might not be the cleanest way to do it, but you could try to remove riders or rides in the dehydrate cycle by checking the resource URI path of the API call you have made:
class RideResource(ModelResource):
    class Meta:
        queryset = Ride.objects.all()
        resource_name = 'rides'

    riders = fields.ToManyField('riderapp.api.RiderProfileResource', 'riders', full=True)

    def dehydrate(self, bundle):
        # You make api call to 'riders' and are dehydrating related source RideResource. Path should be of the form API/app_name/riders
        # When call made directly to this resource then uri path will be API/app_name/rides and if statement will not be entered
        if 'riders' in bundle.request.path:
            del bundle.data['riders']
        # dehydrate must hand the bundle back to Tastypie
        return bundle
and vice versa for the opposite relation.
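You can use a callable for the use_in attribute of your resource field instead of overriding dehydrate. The callable receives the bundle and returns True when the field should be included, so the riders field has to be left out when the request came in through the riders endpoint:
def riders_check(bundle):
    return 'riders' not in bundle.request.path
Something like,
riders = fields.ToManyField('riderapp.api.RiderProfileForRideResource', 'riders', full=True, use_in=riders_check)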
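A callable given to use_in is called with the bundle and should return True when the field is to be included in the output. A minimal sketch of such a predicate, checked against fake bundles so that it runs without Tastypie; the names include_riders and fake_bundle and the /api/v1/ prefix are assumptions made for illustration.

```python
from types import SimpleNamespace


def include_riders(bundle):
    """Return True when the `riders` field should be part of the output.

    The field is left out when the request came in through the riders
    endpoint, which is what breaks the rides -> riders -> rides recursion.
    """
    return "riders" not in bundle.request.path


def fake_bundle(path):
    # Hypothetical stand-in exposing only the attribute the predicate reads.
    return SimpleNamespace(request=SimpleNamespace(path=path))


print(include_riders(fake_bundle("/api/v1/rides/")))   # request made to the rides endpoint
print(include_riders(fake_bundle("/api/v1/riders/")))  # request made to the riders endpoint
```

The mirror-image predicate (checking for 'rides' in the path) would go on the rides field of the rider resource. Matching on substrings of the path is fragile if other URLs happen to contain the same words, so comparing against the resolved resource name may be a safer variant.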