Specifying Django Related Model Sort Order - django

I have Django models that are lists of items and I want the items on each list to have a separate sort order. Maybe list 1 will sort line items by name, list 2 by date, and list 3 by priority.
The models looks more or less like this:
class ListItem(models.Model):
priority = models.IntegerField(choices=PRIORITY_CHOICES)
name = models.CharField(max_length=240)
due_date = models.DateTimeField(null=True, blank=True)
list = models.ForeignKey(List)
Right now I'm using this query in my view:
lists = List.objects.filter(user=request.user)
return render_to_response('showlists.html', {'lists': lists })
I've read examples that imply that I could do something like this:
lists = List.objects.filter(user=request.user).select_related().order_by("items.name")
Assuming that worked, it would give me the collection of lists with an identical sort order for each batch of items. How would I go about sorting each individual list's related items differently?

The syntax for ordering across related models is the same double-underscore format as for filter, so you can do:
List.objects.filter(user=request.user).select_related().order_by("listitems__name")
You should just able to use the different fields in the same way in the order_by call.

If the lists are small, then you could do it in python, using sort and tagging
list = <...code to get sublist here ...>
list.sort(key=lambda x: x.due_date) #sort by due date, say
Otherwise kick off a query for each sub-list

Related

Efficiently select latest items of different categories

Consider the following model:
class Data(Model):
created_at = models.DateTimeField()
category = models.CharField(max_length=7)
I want to select the latest object for all categories.
Following this question, i'm selecting the distinct categories and then making a separate query for each of them:
categories = Data.objects.distinct('category').values_list('category', flat=True)
for category in categories:
latest_obj = Data.objects.filter(category=category).latest('created_at')
The downside of the approach is that it makes lots of queries (1 for the distinct categories, and then a separate query per category).
Is there a way to do this with a single query?
Typically, you would use a group by in relational database. Django has an aggergation API
(https://docs.djangoproject.com/en/dev/topics/db/aggregation/#aggregation) which allows you to do the following:
from django.db.models import Max
Data.objects.values('category').annotate(latest=Max('created_at'))
This will perform a single query and return a list like this:
[{'category' : 'cat1', 'latest' : '01/01/01' },{'category' : 'cat2' 'latest' : '02/02/02' }]
But I guess you might want to retrieve the data record id as well within this list. Django does not make thinks simple for you in this case. The problem is django uses all fields in the value clause to make the grouping and you cannot return extra columns from the query.
EDIT: I originally proposed to add a second values() clause to the end of the query based on web resources but this does not add extra columns to the result set.

Django models: retrieving unique foreign key instances

I have two tables like so:
class Collection(models.Model):
name = models.CharField()
class Image(models.Model):
name = models.CharField()
image = models.ImageField()
collection = models.ForeignKey(Collection)
I'd like to retrieve the first image out of every collection. I have attempted:
image_list = Image.objects.order_by('collection.id').distinct('collection.id')
but it didn't work out the way I expected :(
Any ideas?
Thanks.
Don't use dots to separate fields that span relations in Django; the double-underscore convention is used instead -- it means "follow this relation to get to this field"
this is more correct:
image_list = Image.objects.order_by('collection__id').distinct('collection__id')
However, it probably doesn't do what you want.
The concept of "first" doesn't always apply in relational databases the way you seem to be using it. For all of the records in the image table with the same collection id, there is no record which is 'first' or 'last' -- they're all just records. You could put another field on that table to define a specific order, or you could order by id, or alphabetically by name, but none of those will happen by default.
What will probably work best for you is to get the list of collections with one query, and then get a single item per collection, in separate queries:
collection_ids = Image.objects.values_list('collection', flat=True).distinct()
image_list = [
Image.objects.filter(collection__id=c)[0] for c in collection_ids
]
If you want to apply an order to the Images, to define which is 'first', then modify it like this:
collection_ids = Image.objects.values_list('collection', flat=True).distinct()
image_list = [
Image.objects.filter(collection__id=c).order_by('-id')[0] for c in collection_ids
]
You could also write raw SQL -- MySQL aggregation has the interesting property that fields which are not aggregated over can still appear in the final output, and essentially take a random value from the set of matching records. Something like this might work:
Image.objects.raw("SELECT image.* FROM app_image GROUP BY collection_id")
This query should get you one image from each collection, but you will have no control over which one is returned.
As written in my comment, you cannot use specific fields with distinct under MySQL. However, you can achieve the same result with the following:
from itertools import groupby
all_images = Image.objects.order_by('collection__id')
images_by_collection = groupby(all_images, lambda image: image.collection_id)
image_list = sum([group for key, group in images_by_collection], [])
Unfortunately, this results in a "bigger" query to the DB (all images are retrieved).
dict([(c.id, c.image_set.all()[0]) for c in Collection.objects.all()])
That will create a dictionary of the first image (by default ordering) in each collection, keyed by the collection's id. Be aware, though, that this will generate 1+N queries, where N is the total number of collection objects.
To get around that, you'll either need to wait for Django 1.4 and prefetch_related or use something like django-batch-select.
First get the distinct result, then do your filters.
I think you should try this one.
image_list = Image.objects.distinct()
image_list = image_list.order_by('collection__id')

getting next 10 data elements, sorted in django

A very simple question say i have the following models:
class Tweet(models.Model):
entry = Models.CharField()
time_of_entry = Models.DateTimeField()
in my view i do the following:
def get_tweets(request):
tw = Tweet.objects.all().order_by(-time_of_entry)
this will return me the tweets ordered in descending order by time.
when i filter it this way, say there are 150 tweets arranged in variable tw.
Is there any way i can get tweets in range 20 to 30 of the above queryset?
I want to show 10 tweets at a time, from the above list, and then in my template i want to have a simple button, which reads load_10_next_tweets and does the same. I do not want to use pagination.
How can this be done?
Two ways. The first is to get an iterator and then islice it multiple times. The second is to slice the query set itself, e.g. tw[20:30].

Is Concatenating Django QuerySets In A Loop The Right Thing To Do?

I have a Django model called Collection that represents a collection of items (CollectionItem). Each Collection only contains items of a specific type. (CollectionItem has a foreign key to Collection).
I want to get all of the CollectionItems that are in public-flagged lists of a specific type and return them sorted by a particular field. Here's the query code that I use:
lists = Collection.objects.filter(is_public=True, type=7)
items = CollectionItem.objects.none()
for list in lists:
items |= CollectionItem.objects.filter(collection=list)
items = items.order_by('name')
I have to imagine that this will not scale well at all when I have a large database with tons of lists and items. Is there a better way to do this in Django? Or is the inefficiency involved in the query loop negligible enough compared to other options that I shouldn't worry too much?
Sounds like you just need:
items = CollectionItem.objects.filter(
collection__is_public=True, collection__type=7
).order_by('name')

Django DB, finding Categories whose Items are all in a subset

I have a two models:
class Category(models.Model):
pass
class Item(models.Model):
cat = models.ForeignKey(Category)
I am trying to return all Categories for which all of that category's items belong to a given subset of item ids (fixed thanks). For example, all categories for which all of the items associated with that category have ids in the set [1,3,5].
How could this be done using Django's query syntax (as of 1.1 beta)? Ideally, all the work should be done in the database.
Category.objects.filter(item__id__in=[1, 3, 5])
Django creates the reverse relation ship on the model without the foreign key. You can filter on it by using its related name (usually just the model name lowercase but it can be manually overwritten), two underscores, and the field name you want to query on.
lets say you require all items to be in the following set:
allowable_items = set([1,3,4])
one bruteforce solution would be to check the item_set for every category as so:
categories_with_allowable_items = [
category for category in
Category.objects.all() if
set([item.id for item in category.item_set.all()]) <= allowable_items
]
but we don't really have to check all categories, as categories_with_allowable_items is always going to be a subset of the categories related to all items with ids in allowable_items... so that's all we have to check (and this should be faster):
categories_with_allowable_items = set([
item.category for item in
Item.objects.select_related('category').filter(pk__in=allowable_items) if
set([siblingitem.id for siblingitem in item.category.item_set.all()]) <= allowable_items
])
if performance isn't really an issue, then the latter of these two (if not the former) should be fine. if these are very large tables, you might have to come up with a more sophisticated solution. also if you're using a particularly old version of python remember that you'll have to import the sets module
I've played around with this a bit. If QuerySet.extra() accepted a "having" parameter I think it would be possible to do it in the ORM with a bit of raw SQL in the HAVING clause. But it doesn't, so I think you'd have to write the whole query in raw SQL if you want the database doing the work.
EDIT:
This is the query that gets you part way there:
from django.db.models import Count
Category.objects.annotate(num_items=Count('item')).filter(num_items=...)
The problem is that for the query to work, "..." needs to be a correlated subquery that looks up, for each category, the number of its items in allowed_items. If .extra had a "having" argument, you'd do it like this:
Category.objects.annotate(num_items=Count('item')).extra(having="num_items=(SELECT COUNT(*) FROM app_item WHERE app_item.id in % AND app_item.cat_id = app_category.id)", having_params=[allowed_item_ids])