Django REST framework extremely slow (recursive relations) - django

For my project I started out using Laravel for the API, then switched to Django / Django REST framework to gain speed, as I need to query large amounts of data.
Now I have the following situation:
I have a "Group" which has "Subjects", and Subject has a recursive relation to itself.
A group can have 2000+ subjects (including the descendant subjects); a parent subject has roughly 30 child subjects.
This is my code:
serializers
    class RecursiveField(serializers.Serializer):
        def to_representation(self, value):
            serializer = self.parent.parent.__class__(value, context=self.context)
            return serializer.data

    class SubjectSerializer(serializers.ModelSerializer):
        parent_of = RecursiveField(many=True, read_only=True)

        class Meta:
            model = Subject
            fields = ("id", "name", "parent_of", "parent")

    class GroupSerializer(serializers.ModelSerializer):
        subjects = SubjectSerializer(many=True, read_only=True)

        class Meta:
            model = Group
            fields = ("id", "name", "subjects")

        def setup_eager_loading(cls, queryset):
            return queryset.prefetch_related("subjects")
views
    class GroupViewSet(ModelViewSet):
        class Paginator(BasePaginator):
            model = Group

        queryset = Group.objects.all()
        serializer_class = serializers.GroupSerializer
        pagination_class = Paginator

        def get_queryset(self):
            return self.get_serializer().setup_eager_loading(GroupViewSet.queryset)
I tested the same request with the Laravel API and it is much faster: still noticeably slow, but acceptable (5-10 seconds). With Django REST framework it is too slow (about 1 minute), and that's just a page with 1 group that has 2500 subjects.
I do know what takes so long: the RecursiveField class, because when I remove it the request completes in less than 2 seconds. So my question is, what is the main cause? Is it the recursive relation itself (I doubt it), or is it because I don't prefetch?
And of course, what's the best way to do this?
Thank you

You have a few options, but I don't think any are great. Recursive queries aren't very well supported with Django.
Rework your data model so that you don't need recursion to fetch the subjects from the database. You could add a root ForeignKey on Subject, pointing to Subject, that identifies the root subject of each tree. This would let you grab all subjects in a tree fairly easily. You'd then have to arrange them in your View/ViewSet to fit the ordering (if that's necessary).
Use raw() and your database's recursive functionality to fetch the models. This would require raw SQL and can be painful to maintain.
Use django_cte. I've used this in one of my projects for a few queries, but I'm not a big fan of it. It breaks some functionality with update() being called on an empty queryset. However, it will work and it won't require you to drop down to raw SQL.
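As a rough illustration of the first option: once all subjects of a tree have been fetched as one flat list (for example with something like Subject.objects.filter(...).values("id", "name", "parent"), the exact filter depending on your models), they can be nested in memory instead of issuing one query per level. The sketch below uses plain dicts in place of Subject rows. build_tree and the sample rows are hypothetical; the id, name, parent and parent_of keys mirror the serializer fields in the question.

```python
from collections import defaultdict


def build_tree(rows):
    """Nest flat subject rows (dicts with id, name, parent) into a tree.

    Rows are first grouped by parent id, then each row gets its children
    attached recursively. No database access happens here.
    """
    children = defaultdict(list)
    for row in rows:
        children[row["parent"]].append(row)

    def attach(row):
        return {
            "id": row["id"],
            "name": row["name"],
            "parent": row["parent"],
            "parent_of": [attach(child) for child in children[row["id"]]],
        }

    # Rows whose parent is None are the roots of the tree.
    return [attach(row) for row in children[None]]


# Hypothetical sample data standing in for a flat queryset result.
rows = [
    {"id": 1, "name": "Math", "parent": None},
    {"id": 2, "name": "Algebra", "parent": 1},
    {"id": 3, "name": "Geometry", "parent": 1},
    {"id": 4, "name": "Linear algebra", "parent": 2},
]
tree = build_tree(rows)
```

The nested result could then be returned directly from the view, bypassing the recursive serializer for the subject tree.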

The problem is not DRF, but the data structure itself.
It is very slow in Django to query all ancestors/descendants recursively; you should use a more efficient data structure.
For this very reason I wrote django-treenode, which performs tree operations without querying the database.
You can read the docs here: https://github.com/fabiocaccamo/django-treenode

Related

Django Rest Framework group fields

I'm exposing a REST API for a legacy application.
I have a Company model class that defines the following fields:
address_street (required)
address_street_number (required)
shipping_address_street (optional)
shipping_address_street_number (optional)
billing_address_street (optional)
... you got the point
I would like to group all the address fields into an Address serializer in order to have a cleaner structure.
Something like:
    {
        "adress": {"street": "foo", "street_number": "bar"},
        "shipping_address": {"street": "ham", "street_number": "spam"},
        "billing_address": null
    }
So far, I can create a CompanySerializer from a rest_framework.serializers.Serializer and manually build my Company objects from it.
It is tedious, but it will work.
But how can I build a rest_framework.serializers.ModelSerializer for my Company model that changes the way the fields are structured, while still having my model fields automatically populated by REST framework?
DRF nested model serializers seem to work only for relations, not for groups of fields.
Do I have to build my model instances by hand, or is there a way to decouple the representation of a model serializer from my object?
From the ModelSerializer documentation:
The process of automatically determining a set of serializer fields
based on the model fields is reasonably complex
You should probably stick to the "tedious" method you mention (you will have to put in some effort if the serializer representation and the model fields have different structures altogether).
ModelSerializer is tightly linked to the model in question, so overriding that behaviour seems to bring little benefit when you can do the same thing with a plain Serializer and put the object creation under save.
Maybe you could override the data property on the Serializer subclass so that you get a dict that is fit for direct consumption by the model; that might make it less tedious.
You can build custom serialiser fields with SerializerMethodField:
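    from rest_framework.fields import SerializerMethodField
    from rest_framework.serializers import ModelSerializer

    class AdressSerializer(ModelSerializer):
        adress = SerializerMethodField()
        shipping_address = SerializerMethodField()

        class Meta:
            model = Company  # the model from the question
            fields = ("adress", "shipping_address")

        def get_adress(self, instance):
            return {
                "street": instance.address_street,
                "street_number": instance.address_street_number,
            }

        def get_shipping_address(self, instance):
            # same approach as above, using the shipping_address_* fields
            return {
                "street": instance.shipping_address_street,
                "street_number": instance.shipping_address_street_number,
            }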
If you need to populate the model from the same data representation, the best approach is to override the serialiser's save method. I don't think there is an "automatic" way of doing it.
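To make the mapping in both directions concrete, here is a minimal sketch in plain Python. The two helpers, group_fields and flatten_fields, are hypothetical names and not part of DRF; the idea is that the first could be used when building the representation and the second inside an overridden save/create to turn the nested input back into the flat model fields. The sub-field names are taken from the question.

```python
def group_fields(flat, prefix, names=("street", "street_number")):
    """Collect `<prefix>_<name>` keys from a flat dict into a nested dict.

    Returns None when every value of the group is missing or None, which
    matches the `"billing_address": null` shape in the question.
    """
    group = {name: flat.get(f"{prefix}_{name}") for name in names}
    if all(value is None for value in group.values()):
        return None
    return group


def flatten_fields(nested, prefix, names=("street", "street_number")):
    """Inverse of group_fields: expand a nested dict (or None) to flat keys."""
    nested = nested or {}
    return {f"{prefix}_{name}": nested.get(name) for name in names}


# Hypothetical flat data standing in for a Company instance's fields.
company = {
    "address_street": "foo",
    "address_street_number": "bar",
    "shipping_address_street": "ham",
    "shipping_address_street_number": "spam",
    "billing_address_street": None,
    "billing_address_street_number": None,
}
nested_shipping = group_fields(company, "shipping_address")
flat_shipping = flatten_fields(nested_shipping, "shipping_address")
```

Keeping the prefix logic in one place avoids writing one get_ method per address group by hand.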

How to design a Django API to handle a "Dynamic" form?

I have built an Angular form that contains a form array. So the user has the control to add fields or delete fields from the form. What I am trying to understand is how I can design a Django API that can handle the post for this kind of dynamic form?
I do not want the user to create an account on the website in order to place his order. Is that even possible?
You should be more concerned about how to model your data; then you can think about your interface. Here are a few options for modeling your data:
Option one is to use the regular Django ORM; in this case you can use a JSONField for any dynamic properties.
Option two is to use a schemaless data model, such as a document-based database (e.g. MongoDB).
Here is a simple example of how to use Django's JSONField:
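Your model:
    class Order(models.Model):
        customer = models.ForeignKey(User, on_delete=models.CASCADE)
        # any additional static fields
        # JSONField: django.db.models.JSONField (Django 3.1+) or django.contrib.postgres.fields.JSONField
        properties = JSONField()
Your view:
    def create_order_view(request):
        if request.method == "POST":
            # do your validation
            # the model field is named customer, so pass customer=, not user=
            Order.objects.create(customer=request.user, properties=request.POST["properties"])
            return HttpResponse(status=200)  # django.http.HttpResponse, since this is a plain Django view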
This example is far from complete: you still have to add validation and error handling, and it is a better idea to use Django REST framework to construct your API.
Finally, as I said, there are many options for modeling your data beyond the ones mentioned above. To decide which model to use, you have to know how your data is going to be consumed, so that you can optimize for query time. There are many other factors as well, but they are out of the scope of this answer.
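The "do your validation" step is left open in the view. One possible shape for it, as a sketch only: assume the dynamic form rows are sent as a JSON-encoded list of objects and check that before storing them. The function name parse_properties and the required keys "name" and "quantity" are illustrative assumptions, not something the original answer specifies.

```python
import json


def parse_properties(raw, required_keys=("name", "quantity")):
    """Parse a JSON string holding a list of item dicts and validate it.

    Returns (items, errors). items is None when validation fails;
    errors is a list of human-readable messages.
    """
    try:
        items = json.loads(raw)
    except (TypeError, ValueError):
        return None, ["properties is not valid JSON"]
    if not isinstance(items, list):
        return None, ["properties must be a list"]
    errors = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"item {index} must be an object")
            continue
        missing = [key for key in required_keys if key not in item]
        if missing:
            errors.append(f"item {index} is missing: {', '.join(missing)}")
    return (items if not errors else None), errors
```

In the view, the returned items would be passed as properties, and a non-empty errors list would be turned into a 400 response.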
For me, I used Django REST framework to build the API.
The way to achieve this is simple: create the main model instance first, then iterate through the items (the dynamic part) and create each child instance with its ForeignKey pointing to the newly created parent. I will use Order and Item to demonstrate the idea; the Item model has a ForeignKey field to the Order model.
In the Item model, add a related_name argument to the ForeignKey field:
    order = models.ForeignKey(Order, related_name='items', on_delete=models.CASCADE)
serializers.py
    class ItemSerializer(serializers.ModelSerializer):
        class Meta:
            model = Item
            fields = [
                # ....your fields...
            ]

    class OrderSerializer(serializers.ModelSerializer):
        items = ItemSerializer(many=True)

        class Meta:
            model = Order
            fields = [
                'order',  # ....
                'items',  # the nested field declared above has to be listed
            ]

        def create(self, validated_data):
            items_data = validated_data.pop("items")
            order = Order.objects.create(**validated_data)
            order.total_fees = order.delivery_fees
            order.save()  # persist total_fees, which was set after create()
            for item in items_data:
                Item.objects.create(order=order, **item)
            return order

Django rest framework: automatically create a url for each field of a model

I have a large table of data (~30 MB) that I converted into a model in Django. Now I want to have access to that data through a REST API.
I've successfully installed the Django REST framework, but I'm looking for a way to automatically create a URL for each field in my model. My model has about 100 fields, and each field has about 100,000 entries.
If my model is named Sample,
models.py
    class Sample(models.Model):
        index = models.IntegerField(primary_key=True)
        year = models.IntegerField(blank=True, null=True)
        name = models.TextField(blank=True, null=True)
        # ...97 more fields...
then I can access the whole model using Django REST framework like this:
urls.py
    class SampleSerializer(serializers.HyperlinkedModelSerializer):
        class Meta:
            model = Sample
            fields = ( **100 fields**)

    class SampleViewSet(viewsets.ModelViewSet):
        queryset = Sample.objects.all()
        serializer_class = SampleSerializer

    router = routers.DefaultRouter()
    router.register(r'sample', SampleViewSet)
But of course my browser can't load all of that data in a reasonable amount of time. I could manually make a different class and URL for each field, but there must be a better way... I want to be able to go to my_site.com/sample/year (for example) and have it list all of the years in JSON format, or my_site.com/sample/name and list all the names, etc.
Please help me figure out how to do this, thanks!
You might be able to do that using a custom viewset route.
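Something like this:
    from rest_framework.decorators import list_route
    from rest_framework.response import Response
    from rest_framework.viewsets import ModelViewSet

    class MyModelViewSet(ModelViewSet):
        # list_route was replaced by @action(detail=False) in DRF 3.8
        @list_route()
        def sample_field(self, request):
            # the field name arrives in the query string, e.g. ?field=year
            desired_field = request.query_params.get('field')
            if not desired_field:
                return Response({'detail': 'field is required'}, status=400)
            values = MyModel.objects.values_list(desired_field, flat=True)
            # a queryset is not JSON serializable, so convert it to a list;
            # this is an example, you might want to do something more involved
            return Response(list(values))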
You will be able to get this from the URL:
/api/model/sample_field/?field=foo
This extra method on the viewset creates a new endpoint under the samples endpoint. Since it's a list_route, you can reach it at /sample_field.
So following your code, it would be:
mysite.com/sample/sample_field/?field=year
for example.
There are many interesting details in your question, but with this sample I think you might be able to achieve what you want.
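Because the field name comes straight from the query string, it is worth checking it against a known set of field names before handing it to the ORM. The sketch below shows only that check, with a list of dicts standing in for the queryset. column_values is a hypothetical helper; with a real model the allowed set could be built from the model's _meta.get_fields().

```python
def column_values(rows, field, allowed_fields):
    """Return the values of one column, or raise ValueError for unknown fields.

    `rows` stands in for a queryset; with the ORM the last line would be
    roughly `Model.objects.values_list(field, flat=True)`.
    """
    if field not in allowed_fields:
        raise ValueError(f"unknown field: {field!r}")
    return [row[field] for row in rows]


# Hypothetical sample rows using field names from the question's model.
rows = [
    {"index": 0, "year": 1999, "name": "a"},
    {"index": 1, "year": 2004, "name": "b"},
]
years = column_values(rows, "year", allowed_fields={"index", "year", "name"})
```

In the viewset, a ValueError would map to a 400 response rather than an unhandled server error.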
Try using pagination. You can do it in almost the same way as in your question. Pagination in Django lets you divide the results into pages, so you don't have to display all the entries on the same page. I think this is the best option for you.
Refer to the Django documentation on pagination.
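Since the question already uses Django REST framework, one way to get pagination is to switch it on globally in the project settings. The fragment below is a sketch: the page size of 100 is an arbitrary choice for illustration.

```python
# settings.py (fragment)
REST_FRAMEWORK = {
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "PAGE_SIZE": 100,  # arbitrary page size chosen for illustration
}
```

With this in place, a list endpoint such as /sample/ returns one page of results together with next/previous links, and /sample/?page=2 returns the following page.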

How can I update two models in one serializer in Django Rest Framework?

I have a database schema that has each object of a certain type being stored across two separate tables (one row in each table, different data in each, with a foreign key from one to the other.)
Unfortunately, Django Rest Framework tends to assume that there is a one to one correspondence between serializers and models, which is not true of my case. How should I be approaching this? It seems like the serializer should return the representation of the object which will be the actual HTTP response of the ajax requests, so using two serializers doesn't seem right. I've looked at extending BaseSerializer (which is how I currently plan to implement this if I don't find better solutions), but certain methods take in an instance, which should contain all the data needed to serialize the object, whereas I have two instances relevant.
Any advice would be super appreciated! Thank you.
The "Writable nested representations" section of the DRF documentation might help you.
You have 2 models ModelA and ModelB. Create your first model's serializer
    class ModelASerializer(serializers.ModelSerializer):
        class Meta:
            model = ModelA
            fields = ('fields', ..)
Then in other model's serializer add the first serializer and override the required methods (like create, update). Something like this:
    class ModelBSerializer(serializers.ModelSerializer):
        # add the serializer for the foreignkey model
        model_a = ModelASerializer()

        class Meta:
            model = ModelB
            fields = ('fields', ..)

        def create(self, validated_data):
            modela_data = validated_data.pop('model_a')
            model_b = ModelB.objects.create(**validated_data)
            ModelA.objects.create(model_b=model_b, **modela_data)
            return model_b

        # override update too ..
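The update override mentioned in the last comment can follow the same pattern as create: pop the nested data, apply the remaining values to the main instance, then apply the nested values to the related one. The sketch below shows that flow with plain objects instead of model instances, so the save() calls appear only as comments. The helper names are hypothetical, and it assumes the related row is reachable as model_b.model_a, which depends on how the relation is declared.

```python
from types import SimpleNamespace


def apply_update(instance, data):
    """Copy each validated value onto the instance, as update() would."""
    for attr, value in data.items():
        setattr(instance, attr, value)
    return instance


def update_both(model_b, validated_data):
    """Update model_b and its related model_a from one nested payload."""
    validated_data = dict(validated_data)  # avoid mutating the caller's dict
    modela_data = validated_data.pop("model_a", {})
    apply_update(model_b, validated_data)  # followed by model_b.save() with the ORM
    apply_update(model_b.model_a, modela_data)  # followed by model_b.model_a.save()
    return model_b


# Plain objects standing in for a ModelB row and its related ModelA row.
b = SimpleNamespace(title="old", model_a=SimpleNamespace(note="old"))
update_both(b, {"title": "new", "model_a": {"note": "new"}})
```

Inside the serializer, the body of update_both would become update(self, instance, validated_data), with instance taking the place of model_b.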

Dynamic model choice field in django formset using multiple select elements

I posted this question on the django-users list, but haven't had a reply there yet.
I have models that look something like this:
    class ProductGroup(models.Model):
        name = models.CharField(max_length=10, primary_key=True)
        def __unicode__(self): return self.name

    class ProductRun(models.Model):
        date = models.DateField(primary_key=True)
        def __unicode__(self): return self.date.isoformat()

    class CatalogItem(models.Model):
        cid = models.CharField(max_length=25, primary_key=True)
        group = models.ForeignKey(ProductGroup)
        run = models.ForeignKey(ProductRun)
        pnumber = models.IntegerField()
        def __unicode__(self): return self.cid
        class Meta:
            unique_together = ('group', 'run', 'pnumber')

    class Transaction(models.Model):
        timestamp = models.DateTimeField()
        user = models.ForeignKey(User)
        item = models.ForeignKey(CatalogItem)
        quantity = models.IntegerField()
        price = models.FloatField()
Let's say there are about 10 ProductGroups and 10-20 relevant
ProductRuns at any given time. Each group has 20-200 distinct
product numbers (pnumber), so there are at least a few thousand
CatalogItems.
I am working on formsets for the Transaction model. Instead of a
single select menu with the several thousand CatalogItems for the
ForeignKey field, I want to substitute three drop-down menus, for
group, run, and pnumber, which uniquely identify the CatalogItem.
I'd also like to limit the choices in the second two drop-downs to
those runs and pnumbers which are available for the currently
selected product group (I can update them via AJAX if the user
changes the product group, but it's important that the initial page
load as described without relying on AJAX).
What's the best way to do this?
As a point of departure, here's what I've tried/considered so far:
My first approach was to exclude the item foreign key field from the
form, add the substitute dropdowns by overriding the add_fields
method of the formset, and then extract the data and populate the
fields manually on the model instances before saving them. It's
straightforward and pretty simple, but it's not very reusable and I
don't think it is the right way to do this.
My second approach was to create a new field which inherits both
MultiValueField and ModelChoiceField, and a corresponding
MultiWidget subclass. This seems like the right approach. As
Malcolm Tredinnick put it in
a django-users discussion,
"the 'smarts' of a field lie in the Field class."
The problem I'm having is when/where to fetch the lists of choices
from the db. The code I have now does it in the Field's __init__,
but that means I have to know which ProductGroup I'm dealing with
before I can even define the Form class, since I have to instantiate the
Field when I define the form. So I have a factory
function which I call at the last minute from my view--after I know
what CatalogItems I have and which product group they're in--to
create form/formset classes and instantiate them. It works, but I
wonder if there's a better way. After all, the field should be
able to determine the correct choices much later on, once it knows
its current value.
Another problem is that my implementation limits the entire formset
to transactions relating to (CatalogItems from) a single
ProductGroup.
A third possibility I'm entertaining is to put it all in the Widget
class. Once I have the related model instance, or the cid, or
whatever the widget is given, I can get the ProductGroup and
construct the drop-downs. This would solve the issues with my
second approach, but doesn't seem like the right approach.
One way of setting field choices of a form in a formset is in the form's __init__ method by overwriting the self.fields['field_name'].choices, but since a more dynamic approach is desired, here is what works in a view:
    from django.forms.models import modelformset_factory

    user_choices = [(1, 'something'), (2, 'something_else')]  # some basic choices
    PurchaserChoiceFormSet = modelformset_factory(PurchaserChoice, form=PurchaserChoiceForm, extra=5, max_num=5)
    my_formset = PurchaserChoiceFormSet(self.request.POST or None, queryset=worksheet_choices)
    # and now for the magical for loop
    for choice_form in my_formset:
        choice_form.fields['model'].choices = user_choices
I wasn't able to find the answer for this but tried it out and it works in Django 1.6.5. I figured it out since formsets and for loops seem to go so well together :)
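For the situation in the question, the choices assigned inside that loop could be computed per form from the product group of the form's current item, so each row of the formset gets its own run and pnumber lists. The sketch below shows only the choice computation, with (group, run, pnumber) tuples standing in for CatalogItem rows. choices_for_group and the sample data are hypothetical.

```python
def choices_for_group(items, group):
    """Build (value, label) choices for runs and pnumbers within one group.

    `items` is an iterable of (group, run, pnumber) tuples standing in for
    CatalogItem rows. Returns (run_choices, pnumber_choices), each sorted
    and free of duplicates.
    """
    runs = sorted({run for g, run, _ in items if g == group})
    pnumbers = sorted({p for g, _, p in items if g == group})
    return [(r, r) for r in runs], [(p, str(p)) for p in pnumbers]


# Hypothetical catalog data: group name, run date (ISO string), product number.
items = [
    ("shirts", "2014-05-01", 10),
    ("shirts", "2014-05-01", 11),
    ("shirts", "2014-06-01", 10),
    ("mugs", "2014-05-01", 3),
]
run_choices, pnumber_choices = choices_for_group(items, "shirts")
```

Because the choices depend only on the group, the same function could serve both the initial page load and an AJAX refresh when the user changes the product group.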
I ended up sticking with the second approach, but I'm convinced now that it was the Short Way That Was Very Long. I had to dig around a bit in the ModelForm and FormField innards, and IMO the complexity outweighs the minimal benefits.
What I wrote in the question about the first approach, "It's straightforward and pretty simple," should have been the tip-off.