Suppose I have a model, MyModel, with a property method that uses another model's queryset.
class OtherModel(models.Model)
...
class MyModel(models.Model):
simple_attr = models.CharField('Yada yada')
#property
def complex_attr(self):
list_other_model = OtherModel.objects.all()
...
# Complex algorithm using queryset from 'OtherModel' and simple_attr
return result
This causes my get_queryset() method on MyModel to query the database to generate the list_other_model variable every time for every single row.
Which causes my MyModel ListView to generate hundreds of SQL queries. Not efficient.
How can I architect a Manager or get_queryset method to cache the variable list_other_model for each row when using MyModel.objects.all()?
I hope my question makes sense--I'm on my sixth shot of espresso, and still haven't found a way to reduce the db queries.
Not sure if this is the best way to do it, but it works.
If someone posts a better answer, I'll accept theirs.
class OtherModel(models.Model)
...
class MyModelManager(models.Manager):
def get_queryset(self):
self.model.list_other_model = OtherModel.objects.all()
return super(MyModelManager, self).get_queryset()
class MyModel(models.Model):
simple_attr = models.CharField('Yada yada')
list_other_model = None
objects = MyModelManager()
#property
def complex_attr(self):
...
# Complex algorithm using queryset from 'OtherModel' and simple_attr
return result
Related
I have an endpoint that can follow this format:
www.example.com/ModelA/2/ModelB/5/ModelC?word=hello
Model C has a FK to B, which has a FK to A. I should only ever see C's that correspond to the same A and B at one time. In the above example, we should...
filter C by those with an FK to B id = 5 and A id = 2
also filter C by the field 'word' that contains hello.
I know how to use the filter_queryset() method to accomplish #1:
class BaseModelCViewSet(GenericViewSet):
queryset = ModelC.objects.all()
class ModelCViewSet(BaseModelCViewSet, mixins.RetrieveModelMixin, mixins.ListModelMixin):
def filter_queryset(self, queryset):
return queryset.filter(ModelB=self.kwargs["ModelB"], ModelB__ModelA=self.kwargs["ModelA"])
I also know how to use a Filterset class to filter by fields on ModelC to accomplish #2
class ModelCFilterSet(GeoFilterSet):
word = CharFilter(field_name='word', lookup_expr='icontains')
But I can only get one or the other to work. If I add filterset_class = ModelCFilterSet to ModelCViewSet then it no longer does #1, and without it, it does not do #2.
How do I accomplish both? Ideally I want all of this in the ModelCFilterSet
Note - As hinted by the use of GeoFilterSet I will (later on) be using DRF to add a GIS query, this is just a simplified example. So I think that restricts me to using FilterSet classes in some manner.
I'm not sure this would be of help in your situation, but I often use nested urls in DRF in a way that would be convenient to perform the task. I use a library called drf-nested-routers that does part of the job, namely keeps track of the relations by the provided ids. Let me show an example:
# views.py
from rest_framework import exceptions, viewsets
class ModelBViewSet(viewsets.ModelViewSet):
# This is a viewset for the nested part that depends on ModelA
queryset = ModelB.objects.order_by('id').select_related('model_a_fk_field')
serializer_class = ModelBSerializer
filterset_class = ModelBFilterSet # more about it below
def get_queryset(self, *args, **kwargs):
model_a_entry_id = self.kwargs.get('model_a_pk')
model_a_entry = ModelA.objects.filter(id=model_a_entry_id).first()
if not model_a_entry:
raise exceptions.NotFound("MAYDAY")
return self.queryset.filter(model_c_fk_field=model_a_entry)
class ModelAViewSet(viewsets.ModelViewSet):
queryset = ModelA.objects.order_by('id')
serializer_class = ModelASerializer
# urls.py
from rest_framework_nested import routers
router = routers.SimpleRouter()
router.register('model-a', ModelAViewSet, basename='model_a')
model_a_router = routers.NestedSimpleRouter(router, 'model-a', lookup='model_a')
model_a_router.register('model-b', ModelBViewSet, basename='model_b')
...
In this case I can make a query like www.example.com/ModelA/2/ModelB/ that will only return the entries of ModelB that point to the object of ModelA with id 2. Likewise, www.example.com/ModelA/2/ModelB/5 will return only the corresponding object of ModelB in case it is related to ModelA-id2. A further level for ModelC would act correspondingly.
Sticking to the example, by now we have filtered the entries of ModelB related to a particular object of ModelA, that is we have received the relevant queryset. Next, we have to search for a particular subset within this queryset, and here's where FilterSet comes in play. The easiest way to customise its behaviour is by writing specific methods.
# filters.py
import django_filters
class ModelBFilterSet(django_filters.FilterSet):
word = django_filters.CharFilter(
method="get_word",
)
def get_word(self, queryset, name, value):
return queryset.filter(word__icontains=value)
In fact, you don't even have to use a method here; the way you pasted would work as well (word = CharFilter(field_name='word', lookup_expr='icontains')), I just wanted to point out that there is such an option too.
The filter starts its job with the queryset that has already been processed by the viewset and now it will just narrow down our sample using the given parameter.
I haven't tried this with a three-level nested URL, only checked on the example of two levels, but I think the third level should act in the same way.
Figured it out. So creating a filter_queryset() method overwrites the one that is in the GenericAPIView class (which I already knew).
However - that overwritten class is also responsible for using the FilterSet class that I defined. So by overwriting it, I also "broke" the FilterSet.
Solution was adding a super() to call the original class before the one I wrote:
def filter_queryset(self, queryset):
queryset = super().filter_queryset(queryset)
return queryset.filter(asset=self.kwargs["asset"], asset__program=self.kwargs["program"])
I have problem with my endpoint performance which returns around 40 items and response takes around 17 seconds.
I have model:
class GameTask(models.Model):
name= models.CharField()
description = RichTextUploadingField()
...
and another model like that:
class TaskLevel(models.Model):
master_task = models.ForeignKey(GameTask, related_name="sub_levels", on_delete-models.CASCADE)
sub_tasks = models.ManyToManyField(GameTask, related_name="master_levels")
...
So basicly I can have "normal" tasks, but when I create TaskLevel object I can add master_task as a task which gonna fasten other tasks added to sub_tasks field.
My serializers look like:
class TaskBaseSerializer(serializers.ModelSerializer):
fields created by serializers.SerializerMethodField()
...
class TaskLevelSerializer(serializers.ModelSerializer):
sub_tasks = serializers.SerializerMethodField()
class Meta:
model = TaskLevel
def get_sub_tasks(self, obj: TaskLevel):
sub_tasks = get_sub_tasks(level=obj, user=self.context["request"].user) # method from other module
return TaskBaseSerializer(sub_tasks, many=True, context=self.context).data
class TaskSerializer(TaskBaseSerializer):
levels_config = serializers.SerializerMethodField()
class Meta:
model = GameTask
def get_levels_config(self, obj: GameTask):
if is_mastertask(obj):
return dict(
sub_levels=TaskLevelSerializer(
obj.sub_levels.all().order_by("number"), many=True, context=self.context
).data,
progress=get_progress(
master_task=obj, user=self.context["request"].user
),
)
return None
When I tried to measure time it turned out that get_levels_config method takes around 0.25 seconds for one multilevel-task (which contain 7 subtasks). Is there any way to improve this performance? If any more detailed methods are needed I will add them
Your code might be suffering from N+1 problem. TaskSerializer.get_levels_config() performs database queries from obj.sub_levels.all().order_by("number").
What happens when serializing multiple instances like:
TaskSerializer(tasks, many=True)
each instance calls .get_levels_config()
You can use prefetch_related & selected_related(more explanation here).
You will have to manually check for prefetched objects since you are using SerializerMethodField. There's also the functions get_progress & get_sub_tasks which I assume does another query.
Here are some examples that can be used around your code:
Prefetching:
GameTask.objects.prefetch_related("sub_levels", "master_levels")
# Accessing deeper level
GameTask.objects.prefetch_related(
"sub_levels",
"master_levels",
"sub_levels__sub_tasks",
).select_related(
"master_levels__master_task",
)
Checking prefetch:
def get_sub_tasks(self, obj: TaskLevel):
if hasattr(obj, "_prefetched_objects_cache") and obj._prefetched_objects_cache.get("sub_tasks", None):
return obj._prefetched_objects_cache
return obj.sub_tasks.all()
How does DRF by default handle serializing a manytomany?
I see it defaults to render the field as an array of ids ex: [1,2,3]
And only uses 2 queries when I prefetch the related model.
However, when I generate it myself with .values_list('id', flat=True) it makes an extra query for every row.
Models
class Fails(models.Model):
runs = models.ManyToManyField(Runs, related_name='fails')
class Runs(models.Model):
name = models.TextField()
View
class FailsViewSet(viewsets.ModelViewSet):
...
def get_queryset(self):
...
return Fails.objects.filter(**params).prefetch_related('runs')
Serializer
class FailsSerializer(QueryFieldsMixin, serializers.ModelSerializer):
runs = serializers.SerializerMethodField()
def get_failbin_regressions(self, obj):
runids = self.context.get('runids')
return obj.runs.values_list('id', flat=True) #this creates an extra query for every row
The end goal is to get runs to display a filtered list of runids.
return obj.runs.values_list('id', flat=True).filter(id__in=runids)
or
runs = obj.runs.values_list('id', flat=True)
return [x for x in runs if x in runids] #to avoid an extra query from the .filter
I know the filter creates more queries, I assume the prefetch model is lost in the serializerMethodField.
Is there a way of getting the list of ids like drf does it without the extra query cost when I do it manually?
I can't find any documentation on how they implement the manytomany render.
By calling:
obj.runs.values_list('id', flat=True)
you are performing a new DB query. Since it will be called for every instance, you'll have a lot of extra queries.
prefetch_related loads the associated instances. So you can interact with the Python objects without extra queries. You could fix your issue with:
def get_failbin_regressions(self, obj):
runids = self.context.get('runids')
return [run.id for run in obj.runs.all() if run.id in runids]
I have a Row class and a SpottedPub model. They are almost the same except that Row has a couple extra fields.
What i need to do is to gracefully update spottedpub fields from row object attributes. I tried this
spotted_pubs = SpottedPub.objects.filter(notification__rule__suite__campaign=self,
name=particular_row.name)
if spotted_pubs.all():
spotted_pubs.update(**row.__dict__)
but I get an error saying that I pass too many kwargs to update():
FieldDoesNotExist: SpottedPub has no field named 'profit_weight'
is there a way to drop kwargs that don't correspond to fields?
I tried writing a custom manager
class CustomSpottedPubManager(models.Manager):
def update_drop_fields(self, **fields):
queryset = super(CustomSpottedPubManager, self).get_queryset()
print(queryset)
# drop unwanted fields here
queryset.update(**fields)
and attaching it to the model
class SpottedPub(BaseUserObject):
objects = CustomSpottedPubManager()
but update_drop_fields() method doesn't get called because I access this method already after filtering:
filter(notification__rule__suite__campaign=self,
name=particular_row.name)
Your idea to create an update_drop_fields and filter the fields looks good.
You can create a queryset with the custom update_drop_fields() method,
class CustomSpottedPubQueryset(models.QuerySet):
def update_drop_fields(self, **fields):
return self.update(**fields)
then create a manager with the queryset methods.
class SpottedPub(BaseUserObject):
objects = CustomSpottedPubQueryset.as_manager()
I have a private boolean flag on my model, and a custom manager that overwrites the get_query_set method, with a filter, removing private=True:
class myManager(models.Manager):
def get_query_set(self):
qs = super(myManager, self).get_query_set()
qs = qs.filter(private=False)
return qs
class myModel(models.Model):
private = models.BooleanField(default=False)
owner = models.ForeignKey('Profile', related_name="owned")
#...etc...
objects = myManager()
I want the default queryset to exclude the private models be default as a security measure, preventing accidental usage of the model showing private models.
Sometimes, however, I will want to show the private models, so I have the following on the manager:
def for_user(self, user):
if user and not user.is_authenticated():
return self.get_query_set()
qs = super(myManager, self).get_query_set()
qs = qs.filter(Q(owner=user, private=True) | Q(private=False))
return qs
This works excellently, with the limitation that I can't chain the filter. This becomes a problem when I have a fk pointing the myModel and use otherModel.mymodel_set. otherModel.mymodel_set.for_user(user) wont work because mymodel_set returns a QuerySet object, rather than the manager.
Now the real problem starts, as I can't see a way to make the for_user() method work on a QuerySet subclass, because I can't access the full, unfiltered queryset (basically overwriting the get_query_set) form the QuerySet subclass, like I can in the manager (using super() to get the base queryset.)
What is the best way to work around this?
I'm not tied to any particular interface, but I would like it to be as djangoy/DRY as it can be. Obviously I could drop the security and just call a method to filter out private tasks on each call, but I really don't want to have to do that.
Update
manji's answer below is very close, however it fails when the queryset I want isn't a subset of the default queryset. I guess the real question here is how can I remove a particular filter from a chained query?
Define a custom QuerySet (containing your custom filter methods):
class MyQuerySet(models.query.QuerySet):
def public(self):
return self.filter(private=False)
def for_user(self, user):
if user and not user.is_authenticated():
return self.public()
return self.filter(Q(owner=user, private=True) | Q(private=False))
Define a custom manager that will use MyQuerySet (MyQuerySet custom filters will be accessible as if they were defined in the manager[by overriding __getattr__]):
# A Custom Manager accepting custom QuerySet
class MyManager(models.Manager):
use_for_related_fields = True
def __init__(self, qs_class=models.query.QuerySet):
self.queryset_class = qs_class
super(QuerySetManager, self).__init__()
def get_query_set(self):
return self.queryset_class(self.model).public()
def __getattr__(self, attr, *args):
try:
return getattr(self.__class__, attr, *args)
except AttributeError:
return getattr(self.get_query_set(), attr, *args)
Then in the model:
class MyModel(models.Model):
private = models.BooleanField(default=False)
owner = models.ForeignKey('Profile', related_name="owned")
#...etc...
objects = myManager(MyQuerySet)
Now you can:
¤ access by default only public models:
MyModel.objects.filter(..
¤ access for_user models:
MyModel.objects.for_user(user1).filter(..
Because of (use_for_related_fields = True), this same manager wil be used for related managers. So you can also:
¤ access by default only public models from related managers:
otherModel.mymodel_set.filter(..
¤ access for_user from related managers:
otherModel.mymodel_set.for_user(user).filter(..
More informations: Subclassing Django QuerySets & Custom managers with chainable filters (django snippet)
To use the chain you should override the get_query_set in your manager and place the for_user in your custom QuerySet.
I don't like this solution, but it works.
class CustomQuerySet(models.query.QuerySet):
def for_user(self):
return super(CustomQuerySet, self).filter(*args, **kwargs).filter(private=False)
class CustomManager(models.Manager):
def get_query_set(self):
return CustomQuerySet(self.model, using=self._db)
If you need to "reset" the QuerySet you can access the model of the queryset and call the original manager again (to fully reset). However that's probably not very useful for you, unless you were keeping track of the previous filter/exclude etc statements and can replay them again on the reset queryset. With a bit of planning that actually wouldn't be too hard to do, but may be a bit brute force.
Overall manji's answer is definitely the right way to go.
So amending manji's answer you need to replace the existing "model"."private" = False with ("model"."owner_id" = 2 AND "model"."private" = True ) OR "model"."private" = False ). To do that you will need to walk through the where object on the query object of the queryset to find the relevant bit to remove. The query object has a WhereNode object that represents the tree of the where clause, with each node having multiple children. You'd have to call the as_sql on the node to figure out if it's the one you are after:
from django.db import connection
qn = connection.ops.quote_name
q = myModel.objects.all()
print q.query.where.children[0].as_sql(qn, connection)
Which should give you something like:
('"model"."private" = ?', [False])
However trying to do that is probably way more effort than it's worth and it's delving into bits of Django that are probably not API-stable.
My recommendation would be to use two managers. One that can access everything (an escape hatch of sort), the other with the default filtering applied. The default manager is the first one, so you need to play around with the ordering depending on what you need to do. Then restructure your code to know which one to use - so you don't have the problem of having the extra private=False clause in there already.