Django: select_related and GenericRelation

Does select_related work for GenericRelation relations, or is there a reasonable alternative? At the moment Django's doing individual sql calls for each item in my queryset, and I'd like to avoid that using something like select_related.
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
from django.db import models

class Claim(models.Model):
    # String reference, since Proof is defined below
    proof = generic.GenericRelation('Proof')

class Proof(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')
I'm selecting a bunch of Claims, and I'd like the related Proofs to be pulled in instead of queried individually.

There isn't a built-in way to do this. But I've posted a technique for simulating select_related on generic relations on my blog.
Blog content summarized:
We can use Django's _content_object_cache field to essentially create our own select_related for generic relations.
from django.contrib.contenttypes.models import ContentType

# Map each content type ID to the set of object IDs used in the queryset.
generics = {}
for item in queryset:
    generics.setdefault(item.content_type_id, set()).add(item.object_id)

# Fetch all the ContentTypes in one query, keyed by ID.
content_types = ContentType.objects.in_bulk(generics.keys())

# One in_bulk query per content type for the actual related objects.
relations = {}
for ct, fk_list in generics.items():
    ct_model = content_types[ct].model_class()
    relations[ct] = ct_model.objects.in_bulk(list(fk_list))

# Pre-populate the cache attribute Django checks on content_object access.
for item in queryset:
    setattr(item, '_content_object_cache',
            relations[item.content_type_id][item.object_id])
Here we get all the different content types used by the relationships in the queryset, plus the set of distinct object IDs for each one, then use the built-in in_bulk manager method to fetch all the content types at once in a ready-to-use dictionary keyed by ID. Then we do one query per content type, again using in_bulk, to get all the actual objects.
Finally, we simply assign the relevant object to the _content_object_cache attribute of the source item. This is the attribute Django would check, and populate if necessary, if you called x.content_object directly. By pre-populating it, we ensure that Django never needs to make the individual lookup - in effect, we're implementing a kind of select_related() for generic relations.
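The asker's direction is the reverse one (from each Claim to its Proofs). The same idea applies there by grouping in Python; here is a hedged sketch, where cached_proofs is a hypothetical attribute rather than a Django API:

from collections import defaultdict
from django.contrib.contenttypes.models import ContentType

claims = list(Claim.objects.all())
claim_ct = ContentType.objects.get_for_model(Claim)

# One query for all Proofs attached to any of the claims.
proofs_by_claim_id = defaultdict(list)
proof_qs = Proof.objects.filter(content_type=claim_ct,
                                object_id__in=[c.pk for c in claims])
for proof in proof_qs:
    proofs_by_claim_id[proof.object_id].append(proof)

for claim in claims:
    claim.cached_proofs = proofs_by_claim_id[claim.pk]  # hypothetical attribute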

Looks like select_related and generic relations don't work together. I guess you could write some kind of accessor for Claim that fetches them all via a single query. This post gives some pointers on raw SQL to get generic objects, if you need them.

You can use the .extra() method to pull fields in manually:
Claim.objects.filter(proof__filteryouwant=valueyouwant).extra(select={'field_to_pull': 'proof_proof.field_to_pull'})
The .filter() does the join, and the .extra() pulls in the field. proof_proof is the SQL table name for the Proof model.
If you need more than one field, specify each of them in the dictionary.
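For instance, pulling two columns at once might look like this (the lookup and column names are placeholders, not from the original question):

claims = Claim.objects.filter(
    proof__verified=True  # placeholder lookup
).extra(select={
    'proof_text': 'proof_proof.text',        # hypothetical column
    'proof_created': 'proof_proof.created',  # hypothetical column
})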

Related

Which technique for database design is better for performance?

I need to create a Django PostgreSQL database with a field shared across multiple tables, to be used as a filter for users, but I don't know which technique performs better.
I can create a table holding the field and add a foreign key to it in every table:
class Tablefilter(models.Model):
    filter_field = models.Field()

class Tablefilted(models.Model):
    table_filter = models.ForeignKey(Tablefilter)
Or just inherit that field in every model:
class Tablefilter(models.Model):
    filter_field = models.Field()

class Tablefilted(Tablefilter):
    field = models.Field()
Your question is missing some elaboration.
However, filtering is better done in your views, not in your models. So basically have the frontend supply some parameters to your views, and use Django's built-in filter() method, as in the sketch below.
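A minimal sketch of that approach, assuming the foreign-key variant above (the view and template names are illustrative):

from django.shortcuts import render

def filtered_list(request):
    qs = Tablefilted.objects.all()
    value = request.GET.get('filter_field')  # parameter supplied by the frontend
    if value:
        qs = qs.filter(table_filter__filter_field=value)
    return render(request, 'filtered_list.html', {'objects': qs})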

Store user-defined filters in database

I need to allow users to create and store filters for one of my models. The only decent idea I came up with is something like this:
class MyModel(models.Model):
    field1 = models.CharField()
    field2 = models.CharField()

class MyModelFilter(models.Model):
    owner = models.ForeignKey('User', on_delete=models.CASCADE,
                              verbose_name=_('Filter owner'))
    filter = models.TextField(_('JSON-defined filter'), blank=False)
So the filter field stores a string like:
{"field1": "value1", "field2": "value2"}.
Then, somewhere in code:
import json
from functools import reduce

filters = MyModelFilter.objects.filter(owner_id=owner_id)
querysets = [MyModel.objects.filter(**json.loads(f.filter)) for f in filters]
result_queryset = reduce(lambda x, y: x | y, querysets)
This is not safe, and I need to control the available filter keys somehow (see the whitelist sketch below). On the other hand, it exposes the full power of Django's queryset filters; for example, with this code I can filter on related models.
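One hedged way to control the keys is a whitelist check before applying the stored filter (the allowed set below is illustrative):

import json

ALLOWED_FILTER_KEYS = {'field1', 'field2'}  # illustrative whitelist

def safe_filter_kwargs(raw_json):
    data = json.loads(raw_json)
    unknown = set(data) - ALLOWED_FILTER_KEYS
    if unknown:
        raise ValueError('Disallowed filter keys: %s' % ', '.join(sorted(unknown)))
    return data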
So I wonder, is there any better approach to this problem, or maybe a 3rd-party library, that implements same functionality?
UPD:
The reduce in the code combines the querysets with an OR condition.
UPD2:
User-defined filters will be used by another part of system to filter newly added model instances, so I really need to store them on server-side somehow (not in cookies or something like that).
SOLUTION:
In the end, I used django-filter to generate the filter form, then grabbed its query data, converted it to JSON and saved it to the database.
After that, I could deserialize that field and use it in my FilterSet again. One problem that I couldn't solve cleanly is testing a single model instance against my FilterSet (when the instance is already fetched and I need to check whether it matches the filter), so I ended up doing it manually (by checking each filter condition against the instance).
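A hedged sketch of that solution; MyModelFilterSet is an assumed FilterSet (one is sketched in the answer below), and the helper names are illustrative:

import json

def save_filter(request, owner):
    # request.GET holds the submitted django-filter form data.
    MyModelFilter.objects.create(owner=owner,
                                 filter=json.dumps(request.GET.dict()))

def apply_filter(stored_filter):
    data = json.loads(stored_filter.filter)
    return MyModelFilterSet(data, queryset=MyModel.objects.all()).qs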
Are you sure this is actually what you want to do? Are your end users going to know what a filter is, or how to format the filter?
I suggest that you look into the Django-filter library (https://django-filter.readthedocs.io/).
It will enable you to create filters for your Django models, and then assist you with rendering the filters as forms in the UI.
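A minimal FilterSet and a view using it might look like this (class, field and template names are assumptions):

import django_filters
from django.shortcuts import render

class MyModelFilterSet(django_filters.FilterSet):
    class Meta:
        model = MyModel
        fields = ['field1', 'field2']

def my_model_list(request):
    fs = MyModelFilterSet(request.GET, queryset=MyModel.objects.all())
    return render(request, 'my_model_list.html', {'filter': fs})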

Django - copy and insert queryset clone using bulk_create

My goal is to create a clone of a queryset and then insert it into the database.
Following the suggestions of this post, I have the following code:
import copy

qs_new = copy.copy(qs)
MyModel.objects.bulk_create(qs_new)
However, with this code I run into a duplicate primary key error. For now, the only work-around I can come up with is the following:
qs_new = copy.copy(qs)
for x in qs_new:
    x.id = None
MyModel.objects.bulk_create(qs_new)
Question: Can I implement this without the loop?
I can't think of a way without a loop, but here's a suggestion:
# add all fields here except 'id'
qs = qs.values('field1', 'field2', 'field3')
new_qs = [MyModel(**i) for i in qs]
MyModel.objects.bulk_create(new_qs)
Note that bulk_create behaves differently depending on the underlying database. With Postgres you get the new primary keys set:
Support for setting primary keys on objects created using
bulk_create() when using PostgreSQL was added.
https://docs.djangoproject.com/en/1.10/ref/models/querysets/#django.db.models.query.QuerySet.bulk_create
You should, however, make sure that the objects you are creating either have no primary keys or only keys that are not yet taken. In the latter case, you should run the code that sets the PKs as well as the bulk_create inside transaction.atomic(), as sketched below.
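A hedged sketch of that transactional variant; compute_free_pk is a hypothetical helper that returns a key that is not yet taken:

from django.db import transaction

with transaction.atomic():
    for obj in objs:
        obj.pk = compute_free_pk(obj)  # hypothetical helper
    MyModel.objects.bulk_create(objs)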
Fetching the values explicitly, as suggested by Shang Wang, might be faster because only the given values are retrieved from the DB instead of full instances. If you have foreign key or m2m relations, you might want to avoid simply throwing the complex instances into bulk_create and instead explicitly name all the attributes required when constructing a new MyModel instance.
Here is an example:
class MyModel(Model):
    name = TextField(...)
    related = ForeignKey(...)
    my_m2m = ManyToManyField(...)
In the case of MyModel above, you would want to preserve the ForeignKey relation by passing related_id and the PK of the related object to the MyModel constructor, rather than passing related.
With m2m relations, you might end up skipping bulk_create altogether, because you need each specific new PK, the corresponding original PK (from the instance that was copied), and the m2m relations of that original instance. Then you would have to create new m2m relations with the new PKs and these mappings (see the sketch after the example below).
# add all fields here except 'id'
qs = qs.values('name', 'related_id')
MyModel.objects.bulk_create([MyModel(**i) for i in qs])
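For the m2m case described above, here is a hedged sketch. It assumes PostgreSQL (so bulk_create sets the new PKs) and an m2m target model named Tag; the columns of the auto-created through table follow Django's <model>_id naming:

old_objs = list(qs)
new_objs = [MyModel(name=o.name, related_id=o.related_id) for o in old_objs]
MyModel.objects.bulk_create(new_objs)  # PKs are set on PostgreSQL

Through = MyModel.my_m2m.through
links = [
    Through(mymodel_id=new.pk, tag_id=tag_id)
    for old, new in zip(old_objs, new_objs)
    for tag_id in old.my_m2m.values_list('pk', flat=True)
]
Through.objects.bulk_create(links)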
Note for completeness:
If you have overridden save() on your model (or are inheriting from a 3rd-party model with a custom save method), it won't be executed, and neither will any post_save handlers (yours or 3rd-party).
I tried it, and you do need a loop to set the id to None; then it works. So finally it may look like this:
qs_new = copy.copy(qs)
for q in qs_new:
    q.id = None
    # also, you can set other fields if you need to
MyModel.objects.bulk_create(qs_new)
This works for me.

Django-taggit prefetch_related

I'm building a basic time logging app right now and I have a todo model that uses django-taggit. My Todo model looks like this:
class Todo(models.Model):
    project = models.ForeignKey(Project)
    description = models.CharField(max_length=300)
    is_done = models.BooleanField(default=False)
    billable = models.BooleanField(default=True)
    date_completed = models.DateTimeField(blank=True, null=True)
    completed_by = models.ForeignKey(User, blank=True, null=True)
    tags = TaggableManager()

    def __unicode__(self):
        return self.description
I'm trying to get a list of unique tags for all the Todos in a project, and I have managed to get this to work using a set comprehension. However, for every Todo in the project I have to query the database to get the tags. My set comprehension is:
unique_tags = { tag.name.lower() for todo in project.todo_set.all() for tag in todo.tags.all() }
This works just fine, but for every todo in the project it runs a separate query to grab the tags. I was wondering if there is any way to use something like prefetch_related to avoid these duplicate queries:
unique_tags = { tag.name.lower() for todo in project.todo_set.all().prefetch_related('tags') for tag in todo.tags.all() }
Running the previous code gives me the error:
'tags' does not resolve to a item that supports prefetching - this is an invalid parameter to prefetch_related().
I did see that someone asked a very similar question here: Optimize django query to pull foreign key and django-taggit relationship however it doesn't look like it ever got a definite answer. I was hoping someone could help me out. Thanks!
Taggit now supports prefetch_related directly on tag fields (in version 0.11.0 and later, released 2013-11-25).
This feature was introduced in this pull request. In its test case, notice that after prefetching tags using .prefetch_related('tags'), listing the tags issues 0 additional queries.
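With a recent enough taggit installed, the query from the question works as written:

unique_tags = {tag.name.lower()
               for todo in project.todo_set.all().prefetch_related('tags')
               for tag in todo.tags.all()}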
Slightly hackish solution:
from django.contrib.contenttypes.models import ContentType
from taggit.models import TaggedItem

ct = ContentType.objects.get_for_model(Todo)
todo_pks = [each.pk for each in project.todo_set.all()]
tagged_items = TaggedItem.objects.filter(content_type=ct, object_id__in=todo_pks)  # only one db query
unique_tags = {each.tag for each in tagged_items}
Explanation
I say it is hackish because we had to use TaggedItem and ContentType, which taggit uses internally.
Taggit doesn't provide any method for your particular use case, because it is generic: the intention is that any instance of any model can be tagged, so it makes use of ContentType and GenericForeignKey.
The models used internally in taggit are Tag and TaggedItem. Tag only contains the string representation of the tag; TaggedItem is the model used to associate these tags with any object. Since the tags should be associatable with any object, TaggedItem uses ContentType.
The APIs provided by taggit, like tags.all() and tags.add(), internally use TaggedItem and filter on that model to give you the tags for a particular instance.
Since your requirement is to get all the tags for a particular list of objects, we had to make use of taggit's internal classes.
Use django-tagging and the usage_for_model method:
def usage_for_model(self, model, counts=False, min_count=None, filters=None):
    """
    Obtain a list of tags associated with instances of the given
    Model class.

    If ``counts`` is True, a ``count`` attribute will be added to
    each tag, indicating how many times it has been used against
    the Model class in question.

    If ``min_count`` is given, only tags which have a ``count``
    greater than or equal to ``min_count`` will be returned.
    Passing a value for ``min_count`` implies ``counts=True``.

    To limit the tags (and counts, if specified) returned to those
    used by a subset of the Model's instances, pass a dictionary
    of field lookups to be applied to the given Model as the
    ``filters`` argument.
    """
A slightly less hackish answer than akshar's, but only slightly...
You can use prefetch_related as long as you traverse the tagged_item relations yourself, using the clause prefetch_related('tagged_items__tag'). Unfortunately, todo.tags.all() won't take advantage of that prefetch - the 'tags' manager will still end up doing its own query - so you have to step over the tagged_items relation there too. This should do the job:
unique_tags = {tagged_item.tag.name.lower()
               for todo in project.todo_set.all().prefetch_related('tagged_items__tag')
               for tagged_item in todo.tagged_items.all()}

Django ManyToMany generic "through" model

I'm writing a gallery field. The field subclasses ManyToManyField and adds its own AJAX widget and related machinery. I want to make this solution as compact as possible (I mean, I want to write as little code as possible to reuse this in other projects).
I've decided to create an intermediate table (that provides a 'through' parameter to ManyToManyField), which will hold ordering information:
class IntermediateModel(models.Model):
    from_content_type = models.ForeignKey(ContentType)
    from_object_id = models.PositiveIntegerField()
    from_content_object = generic.GenericForeignKey('from_content_type', 'from_object_id')
    to_content_type = models.ForeignKey(ContentType)
    to_object_id = models.PositiveIntegerField()
    to_content_object = generic.GenericForeignKey('to_content_type', 'to_object_id')
    order = models.PositiveIntegerField()
The following questions arise:
Is it possible in Django for a "through" model for m2m to have both foreign keys pointing to generic relations (like the one above)? If so, how can I achieve this?
If it is possible, can such a model hold generic relations between more than one m2m field? Like: Class <-> Intermediate <-> Student, Gallery <-> Intermediate <-> Photo, both using Intermediate as the 'through' model?
EDIT: just tested - I can ;) Can I use abstract classes with 'through' tables? I figured that if the complex scenario mentioned above doesn't work, I could just create two abstract classes that provide ordering and some other stuff, and then always create normal subclasses to actually build the relations :)
If the difference between the intermediate models is just the way you handle them, maybe you just need "proxy" models: Django model subclassing with the Meta option "proxy" set to True. This way you can handle them separately while storing them in the same database table (if that is an option for your needs).
Read this. Maybe that is what you want. Instead of having two database tables with the same structure, you can have one table with two (or more) ways of accessing and handling it.
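A minimal sketch of the proxy idea against the IntermediateModel above (the subclass name and accessor are illustrative):

class GalleryPhotoLink(IntermediateModel):
    class Meta:
        proxy = True  # same table as IntermediateModel, separate Python class

    def photo(self):
        return self.to_content_object  # convenience accessor, illustrative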