I have been mulling this over for a while, looking at many Stack Overflow questions and going through the aggregation docs.
I need to get a dataset of PropertyImpressions grouped by date. Here is the PropertyImpression model:
#models.py
class PropertyImpression(models.Model):
    '''
    Impression data for Property Items
    '''
    property = models.ForeignKey(Property, db_index=True)
    imp_date = models.DateField(auto_now_add=True)
I have tried many variations of the view code, but I'm posting this version because I consider it the most logical and simple, and according to the documentation and examples it should do what I'm trying to do.
#views.py
def admin_home(request):
    '''
    this is the home dashboard for admins, which currently just means staff.
    Other users that try to access this page will be redirected to login.
    '''
    prop_imps = PropertyImpression.objects.values('imp_date').annotate(count=Count('id'))
    return render(request, 'reportcontent/admin_home.html', {'prop_imps': prop_imps})
Then in the template, the {{ prop_imps }} variable gives me a list of PropertyImpressions, but they are grouped by both imp_date and property. I need them grouped by imp_date only. According to the values() docs, shouldn't adding .values('imp_date') make it group by just that field?
When I leave off the .annotate() call, I get a list of all the imp_dates, which is really close, but as soon as I group by the date field it for some reason groups by both imp_date and property.
Maybe you have defined a default ordering in your PropertyImpression model?
In this case, you should add order_by() before annotate() to reset it:
prop_imps = PropertyImpression.objects.values('imp_date').order_by() \
    .annotate(count=Count('id'))
It's explained in Django documentation here:
Fields that are mentioned in the order_by() part of a queryset (or which are used in the default ordering on a model) are used when selecting the output data, even if they are not otherwise specified in the values() call. These extra fields are used to group “like” results together and they can make otherwise identical result rows appear to be separate. This shows up, particularly, when counting things.
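The effect is easy to reproduce outside Django. The sketch below uses Python's sqlite3 module and a hypothetical impression table (the table and column names are assumptions, not the asker's actual schema) to show how an extra column in GROUP BY, which is roughly what an ordering field adds, splits the per-date counts:

```python
import sqlite3

# Hypothetical table standing in for PropertyImpression; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE impression (id INTEGER PRIMARY KEY, property_id INTEGER, imp_date TEXT)"
)
conn.executemany(
    "INSERT INTO impression (property_id, imp_date) VALUES (?, ?)",
    [(1, "2014-01-01"), (2, "2014-01-01"), (1, "2014-01-01"), (2, "2014-01-02")],
)

# Grouping by the date alone: one row per date.
by_date = conn.execute(
    "SELECT imp_date, COUNT(id) FROM impression GROUP BY imp_date ORDER BY imp_date"
).fetchall()

# Grouping by the date plus an extra column (what an ordering field drags in):
# the same date now appears once per property.
by_date_and_property = conn.execute(
    "SELECT imp_date, COUNT(id) FROM impression "
    "GROUP BY imp_date, property_id ORDER BY imp_date, property_id"
).fetchall()

print(by_date)
print(by_date_and_property)
```

Calling order_by() with no arguments clears the ordering, so only the fields named in values() remain in the grouping.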
I would like to have my search results sorted by use in a foreign key relation when searching for one of my Django models.
Example:
Model "Tag" -> Many to Many <- Model "Post"
If I am searching for a tag, I would like to get the query matching tags returned in the order they are used in the relation. This means the most used tag that meets the search criteria first, etc.
Is that possible, if so, how?
I am having big problems adapting the proposed approach to my application, so here is some code for clarification:
class Tag(models.Model):
    class Meta:
        ordering = #by number of relations to Post
class Post(models.Model):
    tags = models.ManyToManyField('Tag')
Here's a more complete answer based on Alexandr's:
For the sake of the answer, I assume that your Tag model has a name field to search by.
from django.db.models import Count
search_term = 'this is the string you search for'
query = Tag.objects.filter(name__contains=search_term).annotate(post_count=Count('post')).order_by('-post_count')
In more detail:
Tag.objects.filter(name__contains=search_term) returns a QuerySet with Tag instances whose name contains the expression defined in the variable search_term (we assume this comes from the user, who wants to search for tags).
.annotate(post_count=Count('post')) adds an extra field to the instances that contains the number of relations to Posts for that specific instance. You can refer to this in your code in the following way:
for tag in query:
    print('This tag is used for', tag.post_count, 'posts')
Finally, .order_by('-post_count') sets the order to be by post count, in descending order (this is what the - denotes).
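For intuition, the filter/annotate/order_by chain corresponds roughly to a LEFT OUTER JOIN with GROUP BY and an ORDER BY on the count. Here is a sketch with Python's sqlite3 module; the table names and sample rows are made up for illustration, and the SQL Django actually generates will differ in detail:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: tag, plus the many-to-many link table between posts and tags.
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO tag (id, name) VALUES (1, 'django'), (2, 'django-admin'), (3, 'python');
    INSERT INTO post_tags (post_id, tag_id) VALUES (10, 2), (11, 2), (10, 1), (12, 3);
""")

search_term = "django"
rows = conn.execute(
    """
    SELECT tag.name, COUNT(post_tags.post_id) AS post_count
    FROM tag
    LEFT OUTER JOIN post_tags ON post_tags.tag_id = tag.id
    WHERE tag.name LIKE ?
    GROUP BY tag.id, tag.name
    ORDER BY post_count DESC
    """,
    ("%" + search_term + "%",),
).fetchall()

# Matching tags only, most used first.
print(rows)
```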
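To order by the most used tag, you should probably use something like:
from django.db.models import Count
# matching_tags is the already-filtered Tag queryset; 'post' is the default
# reverse name of Post.tags (use the related_name instead if one is set)
queryset = matching_tags.annotate(used_count=Count('post'))
queryset = queryset.order_by('-used_count')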
I have two simple Django models:
class PhotoStream(models.Model):
    cover = models.ForeignKey('links.Photo')
    creation_time = models.DateTimeField(auto_now_add=True)
class Photo(models.Model):
    owner = models.ForeignKey(User)
    which_stream = models.ManyToManyField(PhotoStream)
    image_file = models.ImageField(upload_to=upload_photo_to_location, storage=OverwriteStorage())
Currently the only data I have is 6 photos, that all belong to 1 photostream. I'm trying the following to prefetch all related photos when forming a photostream queryset:
queryset = PhotoStream.objects.order_by('-creation_time').prefetch_related('photo_set')
for obj in queryset:
    print(obj.photo_set.all())
#print connection.queries
Checking via the debug toolbar, I've found that the above does exactly the same number of queries it would have done if I remove the prefetch_related part of the statement. It's clearly not working. I've tried prefetch_related('cover') as well - that doesn't work either.
Can anyone point out what I'm doing wrong, and how to fix it? My goal is to get all related photos for every photostream in the queryset. How can I possibly do this?
Printing connection.queries after running the for loop includes, among other things:
SELECT ("links_photo_which_stream"."photostream_id") AS "_prefetch_related_val", "links_photo"."id", "links_photo"."owner_id", "links_photo"."image_file" FROM "links_photo" INNER JOIN "links_photo_which_stream" ON ("links_photo"."id" = "links_photo_which_stream"."photo_id") WHERE "links_photo_which_stream"."photostream_id" IN (1)
Note: I've simplified my models posted in the question, hence the query above doesn't include some fields that actually appear in the output, but are unrelated to this question.
Here are some of the extracts from prefetch_related:
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python.
And, some more:
>>> Pizza.objects.all().prefetch_related('toppings')
This implies a self.toppings.all() for each Pizza; now each time self.toppings.all() is called, instead of having to go to the database for the items, it will find them in a prefetched QuerySet cache that was populated in a single query.
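With only one photostream in the database, as in your case, the number of queries is the same with or without prefetch_related: one for the photostreams and one for their photos. The query you posted, with its _prefetch_related_val column and the IN (1) clause, is the prefetch query, so prefetching is working. The difference shows once there are several photostreams: without prefetch_related, each obj.photo_set.all() call hits the database, one query per photostream, whereas with it every call reads from the prefetched QuerySet cache that was built by that single extra query.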
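A toy model of the mechanism, in plain Python rather than Django, may help. FakeDB and both helper functions are invented for this sketch; they only count "queries" to compare per-object lookups with a single batched lookup joined in Python. With one photostream, as in the question, both approaches issue two queries; the gap appears as soon as there are more.

```python
from collections import defaultdict


class FakeDB:
    """Toy stand-in for a database that counts how many queries it receives."""

    def __init__(self, stream_ids, links):
        self.stream_ids = stream_ids  # ids of all photostreams
        self.links = links            # (photo_id, stream_id) pairs
        self.query_count = 0

    def all_streams(self):
        self.query_count += 1
        return list(self.stream_ids)

    def photos_for_streams(self, stream_ids):
        # Models one query of the form "... WHERE stream_id IN (...)".
        self.query_count += 1
        wanted = set(stream_ids)
        return [(p, s) for p, s in self.links if s in wanted]


def without_prefetch(db):
    # One query for the streams, then one more per stream.
    return {s: [p for p, _ in db.photos_for_streams([s])] for s in db.all_streams()}


def with_prefetch(db):
    # One query for the streams, one for all related photos, joined in Python.
    streams = db.all_streams()
    cache = defaultdict(list)
    for photo_id, stream_id in db.photos_for_streams(streams):
        cache[stream_id].append(photo_id)
    return {s: cache[s] for s in streams}


links = [(1, 1), (2, 1), (3, 2), (4, 3)]
lazy_db = FakeDB([1, 2, 3], links)
eager_db = FakeDB([1, 2, 3], links)
lazy_result = without_prefetch(lazy_db)
eager_result = with_prefetch(eager_db)
print(lazy_db.query_count, eager_db.query_count)
```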
I'm building a basic time logging app right now and I have a todo model that uses django-taggit. My Todo model looks like this:
class Todo(models.Model):
    project = models.ForeignKey(Project)
    description = models.CharField(max_length=300)
    is_done = models.BooleanField(default=False)
    billable = models.BooleanField(default=True)
    date_completed = models.DateTimeField(blank=True, null=True)
    completed_by = models.ForeignKey(User, blank=True, null=True)
    tags = TaggableManager()
    def __unicode__(self):
        return self.description
I'm trying to get a list of unique tags for all the Todos in a project and I have managed to get this to work using a set comprehension, however for every Todo in the project I have to query the database to get the tags. My set comprehension is:
unique_tags = { tag.name.lower() for todo in project.todo_set.all() for tag in todo.tags.all() }
This works just fine, however for every todo in the project it runs a separate query to grab all the tags. I was wondering if there is any way I can do something similar to prefetch_related in order to avoid these duplicate queries:
unique_tags = { tag.name.lower() for todo in project.todo_set.all().prefetch_related('tags') for tag in todo.tags.all() }
Running the previous code gives me the error:
'tags' does not resolve to a item that supports prefetching - this is an invalid parameter to prefetch_related().
I did see that someone asked a very similar question here: Optimize django query to pull foreign key and django-taggit relationship however it doesn't look like it ever got a definite answer. I was hoping someone could help me out. Thanks!
Taggit now supports prefetch_related directly on tag fields (in version 0.11.0 and later, released 2013-11-25).
This feature was introduced in this pull request. In the test case for it, notice that after prefetching tags using .prefetch_related('tags'), there are 0 additional queries for listing the tags.
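Slightly hackish solution:
from django.contrib.contenttypes.models import ContentType
from taggit.models import TaggedItem
ct = ContentType.objects.get_for_model(Todo)
todo_pks = [each.pk for each in project.todo_set.all()]
# one query; select_related('tag') avoids an extra query per item for each.tag
tagged_items = TaggedItem.objects.filter(content_type=ct, object_id__in=todo_pks).select_related('tag')
unique_tags = set([each.tag for each in tagged_items])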
Explanation
I say it is hackish because we had to use TaggedItem and ContentType, which taggit uses internally.
Taggit doesn't provide any method for your particular use case. The reason is that it is generic: the intention for taggit is that any instance of any model can be tagged, so it makes use of ContentType and GenericForeignKey for that.
The models used internally in taggit are Tag and TaggedItem. Model Tag only contains the string representation of the tag. TaggedItem is the model which is used to associate these tags with any object. Since the tags should be associatable with any object, TaggedItem uses model ContentType.
The APIs provided by taggit, like tags.all() and tags.add(), internally make use of TaggedItem and filter on this model to give you the tags for a particular instance.
Since your requirement is to get all the tags for a particular list of objects, we had to make use of the internal classes used by taggit.
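To make that structure concrete, here is a toy sketch in plain Python. These are not taggit's real models; the tuple types and sample rows are invented for illustration. Each tagged-item row records which kind of object it points at, that object's id, and the tag, so the tags for a set of Todos can be collected in one pass over the rows:

```python
from collections import namedtuple

# Simplified stand-ins for taggit's rows; plain tuples, not the real models.
Tag = namedtuple("Tag", "id name")
TaggedItem = namedtuple("TaggedItem", "content_type object_id tag")

python_tag, django_tag = Tag(1, "Python"), Tag(2, "django")
tagged_items = [
    TaggedItem("todo", 1, python_tag),
    TaggedItem("todo", 2, python_tag),
    TaggedItem("todo", 2, django_tag),
    TaggedItem("project", 1, Tag(4, "internal")),  # same object_id, different model
    TaggedItem("todo", 99, Tag(3, "other")),       # a todo outside the project
]

# Primary keys of the todos we care about (hypothetical values).
todo_pks = {1, 2}

# Filter on both the content type and the object id, then collect tag names.
unique_tags = {
    item.tag.name.lower()
    for item in tagged_items
    if item.content_type == "todo" and item.object_id in todo_pks
}
print(sorted(unique_tags))
```

Filtering on the content type as well as the object id matters because ids from different models can collide.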
Use django-tagging and its usage_for_model method:
def usage_for_model(self, model, counts=False, min_count=None, filters=None):
    """
    Obtain a list of tags associated with instances of the given
    Model class.
    If ``counts`` is True, a ``count`` attribute will be added to
    each tag, indicating how many times it has been used against
    the Model class in question.
    If ``min_count`` is given, only tags which have a ``count``
    greater than or equal to ``min_count`` will be returned.
    Passing a value for ``min_count`` implies ``counts=True``.
    To limit the tags (and counts, if specified) returned to those
    used by a subset of the Model's instances, pass a dictionary
    of field lookups to be applied to the given Model as the
    ``filters`` argument.
    """
A slightly less hackish answer than akshar's, but only slightly...
You can use prefetch_related as long as you traverse the tagged_item relations yourself, using the clause prefetch_related('tagged_items__tag'). Unfortunately, todo.tags.all() won't take advantage of that prefetch - the 'tags' manager will still end up doing its own query - so you have to step over the tagged_items relation there too. This should do the job:
unique_tags = { tagged_item.tag.name.lower()
                for todo in project.todo_set.all().prefetch_related('tagged_items__tag')
                for tagged_item in todo.tagged_items.all() }
I'm trying to restrict the selectable values of a 'persons' field in a particular form.
I have a TaskPerson model that has two foreign keys: one for 'task' and one for 'person'.
In my form, the persons field should allow the user to select one or more persons, but only those persons which match a certain task.
I've attempted this:
persons = [tp.person for tp in TaskPerson.objects.filter(task=thistask)]
form.fields["persons"].queryset = persons
This list comprehension gives me the correct person objects I require, but my form doesn't display at all, presumably because it gives me only a standard Python list rather than a QuerySet.
I had a look over the docs, but I'm not quite sure how to progress. Could someone please advise how I can correctly display my form?
Many thanks
You can easily get a QuerySet of Person objects by following the reverse relationship to TaskPerson:
http://docs.djangoproject.com/en/dev/topics/db/queries/#following-relationships-backward
form.fields['persons'].queryset = Person.objects.filter(taskperson__task=thistask)
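The reverse lookup keeps the selection in the database as a single join instead of building a list in Python. Roughly, it corresponds to the SQL below, sketched with Python's sqlite3 module; the table names and sample rows are assumptions for illustration, not Django's actual generated names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables standing in for Person and TaskPerson.
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE taskperson (id INTEGER PRIMARY KEY, task_id INTEGER, person_id INTEGER);
    INSERT INTO person (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO taskperson (task_id, person_id) VALUES (7, 1), (7, 3), (8, 2);
""")

this_task_id = 7
people = conn.execute(
    """
    SELECT person.id, person.name
    FROM person
    INNER JOIN taskperson ON taskperson.person_id = person.id
    WHERE taskperson.task_id = ?
    ORDER BY person.id
    """,
    (this_task_id,),
).fetchall()

# Only the people linked to task 7.
print(people)
```

If the same person can be linked to a task more than once, the join returns that person repeatedly; in Django, adding .distinct() to the queryset removes the duplicates.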
I've essentially got two tables: Page(PK=url) and PageProperty(PK=url+name).
Here is how I have my Models set up:
class Page(model.Model):
    url = model.CharField(primary_key=True, max_length=255, db_column='url')
    #.....
class PageProperty(model.Model):
    # table with compound key (url + name)
    url = model.ForeignKey('Page', to_field='url', db_column='url', primary_key=True)
    name = model.CharField(primary_key=True, max_length=20)
    value = model.TextField()
I have a ModelAdmin set up so I can inline-edit PageProperty objects from Page. It's a legacy database and I know there's a lot of data in there, but the admin is only showing ONE of the PageProperty objects, not all of them.
I think you might need to apply the extra option to your TabularInline. Example:
class PagePropertyInline(admin.TabularInline):
    model = PageProperty
    extra = 3
You could probably do some magic to make the number of extra items dynamic (such as the number of PageProperty objects for a given Page), but I'll leave that up to you.
I would suggest further reading on InlineModelAdmin options and Formsets.
Because it felt as though a non-integer primary key was too much against the grain, I ended up buckling down and migrating the schema to use an auto-generated integer pk for both tables. After that everything was smooth sailing again and the inlines worked perfectly.