Override queryset 'only' method in Django

To reduce the number of queries when people access frequently used foreign key references, I've created a custom manager that always applies select_related for the foreign key relation.
However, there are places in the code where people have used only() on this same model when building a queryset. How can I override the queryset's only() method so that it always includes this foreign key relation?
class CustomManager(models.Manager):
    def get_queryset(self):
        return super(CustomManager, self).get_queryset().select_related('user')
When this model is queried as below, I run into an error which says "Field cannot be both deferred and traversed using select_related at the same time":
Model.objects.only('id', 'field_1').get(pk=1)
Is there any way to mitigate this? I have to use select_related as above, since it saves a lot of queries.
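One possible direction, sketched under assumptions: the error goes away when the relation passed to select_related is also among the fields given to only(), so an only() override can add it whenever the caller left it out. The names AlwaysIncludeOnlyMixin, required_only_fields, RecordingQuerySet and DemoQuerySet are hypothetical; RecordingQuerySet is a stand-in for models.QuerySet so that the merging logic can be run without Django.

```python
class AlwaysIncludeOnlyMixin:
    """Hypothetical mixin: make only() always keep certain relations loaded."""

    # Relations that the manager always passes to select_related().
    required_only_fields = ("user",)

    def only(self, *fields):
        extra = tuple(
            name
            for name in self.required_only_fields
            # Skip names already requested, directly or as "user__something".
            if not any(f == name or f.startswith(name + "__") for f in fields)
        )
        return super().only(*fields, *extra)


class RecordingQuerySet:
    """Stand-in for models.QuerySet (assumption) that just returns the fields."""

    def only(self, *fields):
        return fields


class DemoQuerySet(AlwaysIncludeOnlyMixin, RecordingQuerySet):
    pass


requested = DemoQuerySet().only("id", "field_1")
print(requested)
```

In a Django project the mixin would presumably be combined with models.QuerySet instead of the stand-in, and the manager built from that queryset class (for example via Manager.from_queryset) while keeping the get_queryset() override that adds select_related('user').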

Related

Optimal project organization and querysets

I have 2 models Company and Product with FK on Product:
class Product(Meta):
    company = models.ForeignKey(Company, related_name='products', on_delete=models.CASCADE)
For a view that gathers a company's products, what is the optimal approach (the view uses information from both models)?
1) add the View in companies app and as queryset use:
Company.objects.prefetch_related('products').get(pk=company_pk)
2) add the View in products app and as queryset use:
Product.objects.select_related('company').filter(company=company_pk)
What about ordering: can it be chained with prefetch_related or select_related?
The Django docs illustrate the difference quite well:
prefetch_related(*lookups)
Returns a QuerySet that will
automatically retrieve, in a single batch, related objects for each of
the specified lookups.
This has a similar purpose to select_related, in that both are
designed to stop the deluge of database queries that is caused by
accessing related objects, but the strategy is quite different.
select_related works by creating an SQL join and including the fields
of the related object in the SELECT statement. For this reason,
select_related gets the related objects in the same database query.
However, to avoid the much larger result set that would result from
joining across a ‘many’ relationship, select_related is limited to
single-valued relationships - foreign key and one-to-one.
select_related(*fields)
Returns a QuerySet that will “follow” foreign-key relationships,
selecting additional related-object data when it executes its query.
This is a performance booster which results in a single more complex
query but means later use of foreign-key relationships won’t require
database queries.

Django - copy and insert queryset clone using bulk_create

My goal is to create a clone of a queryset and then insert it into the database.
Following the suggestions of this post, I have the following code:
qs_new = copy.copy(qs)
MyModel.objects.bulk_create(qs_new)
However, with this code I run into a duplicate primary key error. For now, I can only come up with the following workaround:
qs_new = copy.copy(qs)
for x in qs_new:
    x.id = None
MyModel.objects.bulk_create(qs_new)
Question: Can I implement this code snippet without going through a loop?
I can't think of a way without a loop, but here is a suggestion:
# add all fields here except 'id'
qs = qs.values('field1', 'field2', 'field3')
new_qs = [MyModel(**i) for i in qs]
MyModel.objects.bulk_create(new_qs)
Note that bulk_create behaves differently depending on the underlying database. With Postgres you get the new primary keys set:
Support for setting primary keys on objects created using
bulk_create() when using PostgreSQL was added.
https://docs.djangoproject.com/en/1.10/ref/models/querysets/#django.db.models.query.QuerySet.bulk_create
You should, however, make sure that the objects you are creating either have no primary keys or only have keys that are not taken yet. In the latter case you should run both the code that sets the PKs and the bulk_create call inside transaction.atomic().
Fetching the values explicitly as suggested by Shang Wang might be faster because only the given values are retrieved from the DB instead of fetching everything. If you have foreign key relations or m2m relations you might want to avoid simply throwing the complex instances into bulk_create but instead explicitly naming all attributes that are required when constructing a new MyModel instance.
Here is an example:
class MyModel(Model):
    name = TextField(...)
    related = ForeignKey(...)
    my_m2m = ManyToManyField(...)
In the case of MyModel above, you would preserve the ForeignKey relation by passing related_id (the PK of the related object) to the constructor of MyModel, rather than passing related.
With m2m relations, you might end up skipping bulk_create altogether because you need each specific new PK, the corresponding original PK (from the instance that was copied) and the m2m relations of that original instance. Then you would have to create new m2m relations with the new PK and these mappings.
# add all fields here except 'id'
qs = qs.values('name', 'related_id')
MyModel.objects.bulk_create([MyModel(**i) for i in qs])
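The m2m bookkeeping described above can be sketched in plain Python. This is only an illustration under assumptions: the link rows are represented as (original_pk, target_pk) pairs, as they might be read from the m2m through table, and remap_m2m_links, original_links and old_to_new_pk are hypothetical names with made-up data.

```python
def remap_m2m_links(original_links, old_to_new_pk):
    """Build link rows for the copies from the originals' link rows.

    original_links: iterable of (original_pk, target_pk) pairs.
    old_to_new_pk: dict mapping each original pk to the pk of its copy.
    """
    return [
        (old_to_new_pk[original_pk], target_pk)
        for original_pk, target_pk in original_links
        # Ignore links of instances that were not copied.
        if original_pk in old_to_new_pk
    ]


# Made-up data: instances 1 and 2 were copied to 101 and 102.
original_links = [(1, 10), (1, 11), (2, 10)]
old_to_new_pk = {1: 101, 2: 102}

new_links = remap_m2m_links(original_links, old_to_new_pk)
print(new_links)
```

The resulting pairs are what the new m2m rows would need to contain; how they are written back (one by one or in bulk on the through model) depends on the project.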
Note for completeness:
If you have overridden save() on your model (or if you are inheriting from a third-party model with a custom save method), it won't be executed, and neither will any post_save handlers (yours or third-party).
I tried it, and you do need a loop to set the id to None; then it works. So in the end it may look like this:
qs_new = copy.copy(qs)
for q in qs_new:
    q.id = None
    # also, you can set other fields if you need
MyModel.objects.bulk_create(qs_new)
This works for me.

Django: Using prefetch_related with get() and filter()

I am trying to make queries using prefetch_related to prefetch many-to-many objects.
Using prefetch_related when doing an all() query works okay, but I do not know how I'm supposed to use it with get() or filter(). Here is my model:
class Project(models.Model):
    …
    funders = models.ManyToManyField(Organization, related_name="funders")
class Organization(models.Model):
    …
    name = models.CharField(max_length=200, unique=True)
Trying the lines below doesn’t seem to work:
Project.objects.get(id=project_id).select_related('funders')
and
Project.objects.filter(id__in=['list-of-ids']).select_related('funders')
How am I supposed to go about it?
Thanks in advance.
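You chained them in the wrong order: get() returns a model instance, not a queryset, so it has to come last. And because funders is a many-to-many field, use prefetch_related rather than select_related:
Project.objects.prefetch_related('funders').get(id=project_id)
and
Project.objects.prefetch_related('funders').filter(id__in=['list-of-ids'])
You have to call select_related and prefetch_related on the manager (Project.objects) or on a queryset, before get().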
select_related() is a queryset method. The documentation on querysets has two sections of methods: methods that return a new queryset, and methods that do not return a queryset. get() is in the second section, so you can't chain any other queryset methods after it.
One other thing: prefetch_related() runs an extra query for each model. Since you're only fetching a single project, project.funders.all() will run exactly 1 query to fetch all related organizations regardless of your use of prefetch_related(). prefetch_related becomes useful when you need the related organizations for multiple projects.
Since funders is an m2m field, you cannot use select_related; you have to use prefetch_related instead. select_related only works on foreign key and one-to-one relations.
Project.objects.prefetch_related('funders').filter(id__in=[id1, id2])
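Why prefetching pays off for several projects can be sketched without Django: the related rows for all projects are fetched in one batch and then grouped per project in Python. The sketch below only mimics that grouping step; attach_funders, project_ids and funder_links are hypothetical names, and the data stands in for the results of the two queries.

```python
from collections import defaultdict


def attach_funders(project_ids, funder_links):
    """Group (project_id, funder_name) rows by project in a single pass."""
    by_project = defaultdict(list)
    for project_id, funder_name in funder_links:
        by_project[project_id].append(funder_name)
    # Projects without funders still get an (empty) list.
    return {pid: by_project.get(pid, []) for pid in project_ids}


# Made-up data standing in for the two queries.
project_ids = [1, 2, 3]                        # first query: the projects
funder_links = [(1, "A"), (1, "B"), (3, "A")]  # second query: all their funders

funders = attach_funders(project_ids, funder_links)
print(funders)
```

Without the batch, each project would need its own lookup for its funders, which is the per-object query pattern that prefetch_related is meant to avoid.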

Filter across different models by one common field in django

I have the following django model manager:
class EntityManager(models.Manager):
    ...
    def filter(self, uuid, *args, **kwargs):
        entity_qs = EmptyQuerySet()
        for Model in entity_classes:
            count = Model.objects.filter(uuid=uuid, *args, **kwargs).count()
            if count:
                entity_qs = Model.objects.filter(uuid=uuid, *args, **kwargs)
                break
        return entity_qs
The uuid field is common to the different models and is unique across them. The idea of the code above is to get the row count for each model and, when it is positive, return the actual queryset that will yield the required instance on evaluation. So in the worst case we do len(entity_classes) SELECT statements plus one more SELECT when the resulting queryset is evaluated.
The question is: is it possible to filter across different models by one common field with the Django ORM in a more efficient way than I do?
First, to try to answer your question directly: given what you've described, I don't know any way around checking each table. If you index on uuid that will certainly speed up the queries. Also, use exists() instead of count().
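A minimal sketch of the exists() variant, assuming every model in the list exposes objects.filter(...) returning something with an exists() method, as Django models do. find_entity_queryset is a hypothetical helper name, and FakeManager, FakeQuerySet, Person and Building are stand-ins with made-up data so that the loop can be run without a database.

```python
def find_entity_queryset(model_classes, uuid, **filters):
    """Return the first queryset that has a row with this uuid, else None."""
    for model in model_classes:
        queryset = model.objects.filter(uuid=uuid, **filters)
        if queryset.exists():  # cheaper than count(): stops at the first row
            return queryset
    return None


class FakeQuerySet:
    """Stand-in for a Django QuerySet (assumption, for illustration only)."""

    def __init__(self, rows):
        self.rows = rows

    def exists(self):
        return bool(self.rows)


class FakeManager:
    """Stand-in for Model.objects that filters a list of dicts."""

    def __init__(self, rows):
        self.rows = rows

    def filter(self, **conditions):
        matches = [
            row for row in self.rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]
        return FakeQuerySet(matches)


class Person:
    objects = FakeManager([{"uuid": "a1"}])


class Building:
    objects = FakeManager([{"uuid": "b2"}])


found = find_entity_queryset([Person, Building], "b2")
print(found.rows)
```

This still checks each table in turn, so the worst case remains one query per model; it only makes each check cheaper.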
But the fact that uuid is unique across all tables might be a sign that you should reorganize your schema. If you can't do away with the idea entirely, consider linking from your user model to a new model that specifies both the table and the primary key of the corresponding row.
Django has a built-in way of doing this: the contenttypes framework. From the documentation:
Adding a foreign key from one of your own models to ContentType allows your model
to effectively tie itself to another model class.... A normal ForeignKey can only
"point to" one other model.... The contenttypes application provides a special field
type (GenericForeignKey) which works around this and allows the relationship to be
with any model.

Django-rest-framework serializer makes a lot of queries

Let's say I have this model:
class Place(models.Model):
    ....
    owner = ForeignKey(CustomUserModel)
    ....
And I have this DRF serializer that returns a list of Places (the view calling it uses DRF's generics.ListAPIView class):
class PlaceSerializer(serializers.ModelSerializer):
    owner = UserModelSerializer()  # Gets only specific fields for a place owner
    class Meta:
        model = Place
The problem is, when the serializer gets a query that returns, let's say... 50 places, I can see (in connection.queries) that a query is being made for each owner foreign key relation, which sums up to a lot of queries. This of course has a big impact on performance.
It is also important to mention that, for the view calling the serializer, get_queryset() returns only Places that are within a certain distance of a center point, using a custom query. I used Django's extra() method for that.
I have tried using select_related and prefetch_related with the query mentioned above, but it doesn't seem to make any difference in terms of queries being made later on by the serializer.
What am I missing?
select_related will work as expected with serializers.
Make sure you're setting that in the 'queryset' attribute on the view if you're using the generic views.
Using select_related inside 'get_queryset' will work too.
Otherwise the only thing I can suggest is trying to narrow the issue down with some more debugging. If you still believe there's an issue and have a minimal example that'll replicate it then raise the issue as a ticket, or take the discussion to the mailing list.