Django: copy and manipulate a QuerySet? - django

I need to re-order a Django queryset, because I want to put None values at the bottom of an ORDER BY DESC query on a FloatField.
Unfortunately, I'm struggling to find an elegant way to manipulate a Django queryset. Here's what I have so far:
cities = City.objects.filter(country__id=country.id).order_by('value')
if cities.count() > 1:
    cities_sorted = cities
    del cities_sorted[:]
    for city in cities:
        cities_sorted += city
    # Add code to
    cities = cities_sorted
Currently, this fails with 'QuerySet' object does not support item deletion.
Any idea how I can copy and re-order this QuerySet to put None items last?

A QuerySet is evaluated into a plain Python list when you call list() on it, and that list can then be reordered freely:
cities_sorted = list(cities)
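Once the rows are in a list, the reordering itself can be done with sorted(). Here is a minimal sketch, assuming each object exposes a numeric value attribute that may be None. The City namedtuple and the sample data are hypothetical stand-ins for model instances, not part of the original question.

```python
from collections import namedtuple

# Hypothetical stand-in for City model instances (only the fields used here).
City = namedtuple("City", ["name", "value"])

cities = [
    City("A", 1.5),
    City("B", None),
    City("C", 3.0),
    City("D", None),
    City("E", 0.2),
]


def sort_desc_nulls_last(rows):
    """Sort by value, highest first, with None values moved to the end."""
    # The first key element is False for real values and True for None,
    # so None rows sort last; the second element negates the value to get
    # descending order among the remaining rows.
    return sorted(rows, key=lambda c: (c.value is None, -(c.value or 0.0)))


cities_sorted = sort_desc_nulls_last(cities)
print([c.name for c in cities_sorted])
```

If the ordering should happen in the database instead, Django 1.11 and later accept an expression such as order_by(F('value').desc(nulls_last=True)), which avoids loading and sorting the rows in Python.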

Related

Django filter empty fields

Currently I am filtering empty fields in all entries of the queryset like this:
data_qs = DataValue.objects.filter(study_id=study.id)  # queryset
opts = DataValue._meta  # meta info from the DataValue model
field_names = [field.name for field in opts.fields]  # DataValue field names
field_not_empty = []  # fields that have at least one non-None value
for field in field_names:
    for entry in data_qs.values(field):
        if entry.get(field) is not None:
            field_not_empty.append(field)
            break
It works, but I'm not sure it is an appropriate solution.
Does anyone know how to find the empty fields across a whole queryset? The table has more than 30 fields, and depending on the study ID, one queryset may have field1 entirely empty while another has field2 entirely empty.
Does the Django ORM provide an easy and clean way to do this?
Thanks in advance
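To check whether a field in a QuerySet is empty, suppose the field is named "title".
This excludes all rows where title is an empty string:
DataValue.objects.filter(study_id=study.id).exclude(title__exact='')
If you only want the rows where it is empty, filter on it instead:
DataValue.objects.filter(study_id=study.id, title__exact='')
Note that an empty string is not the same as NULL; for values stored as NULL (None in Python), use title__isnull=True instead.
Hope it helped.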
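The loop in the question runs one query per field. As a rough sketch of an alternative, the rows can be fetched once (for example with .values()) and scanned in a single pass. The function name and the sample rows below are hypothetical; the rows are plain dicts standing in for what .values() would yield.

```python
def non_empty_fields(rows, field_names):
    """Return the names in field_names that are not None in at least one row."""
    remaining = set(field_names)  # fields for which no value has been seen yet
    found = set()
    for row in rows:
        for name in list(remaining):
            if row.get(name) is not None:
                found.add(name)
                remaining.discard(name)
        if not remaining:
            break  # every field already has a value; stop scanning early
    # Preserve the original field order in the result.
    return [name for name in field_names if name in found]


# Hypothetical rows, shaped like the dicts produced by QuerySet.values().
rows = [
    {"field1": None, "field2": 3, "field3": None},
    {"field1": None, "field2": None, "field3": "x"},
]
print(non_empty_fields(rows, ["field1", "field2", "field3"]))
```

This trades many small queries for one larger one, so it is only a good fit when the filtered queryset is small enough to iterate over.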

Django queryset get max id's for a filter

I want to get a list of max ids for a filter I have in Django
class Foo(models.Model):
    name = models.CharField()
    poo = models.CharField()
Foo.objects.filter(name__in=['foo','koo','too']).latest_by_id()
The end result should be a queryset containing only the latest object (by id) for each name. How can I do that in Django?
Edit: I want multiple objects in the end result, not just one object.
Edit1: Added __in. Once again, I need only the latest object for each name (so the names in the result are distinct).
Something like this.
my_id_list = [Foo.objects.filter(name=name).latest('id').id for name in ['foo','koo','too']]
Foo.objects.filter(id__in=my_id_list)
The above works. But I want a more concise way of doing it. Is it possible to do this in a single query/filter annotate combination?
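You can try:
from django.db.models import Max
qs = Foo.objects.filter(name__in=['foo','koo','too'])
# Group by name and take the max (i.e. latest) pk within each group;
# values('name') makes annotate() aggregate per name rather than per row
max_pks = qs.values('name').annotate(mpk=Max('pk')).order_by().values_list('mpk', flat=True)
# Then filter the queryset down to those pks
result = qs.filter(pk__in=max_pks)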
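The grouped-max idea can be checked outside Django with plain SQL. Below is a small sketch using the standard-library sqlite3 module; the table layout mirrors the Foo model, but the table name foo, the sample rows, and the helper function are assumptions made for illustration only.

```python
import sqlite3


def latest_rows_per_name(conn, names):
    """Return the row with the highest id for each of the given names."""
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT id, name, poo FROM foo "
        "WHERE id IN ("
        "  SELECT MAX(id) FROM foo WHERE name IN ({}) GROUP BY name"
        ") ORDER BY name"
    ).format(placeholders)
    return conn.execute(sql, list(names)).fetchall()


# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT, poo TEXT)")
conn.executemany(
    "INSERT INTO foo (id, name, poo) VALUES (?, ?, ?)",
    [
        (1, "foo", "a"),
        (2, "koo", "b"),
        (3, "foo", "c"),
        (4, "too", "d"),
        (5, "koo", "e"),
        (6, "zoo", "f"),
    ],
)
conn.commit()

latest = latest_rows_per_name(conn, ["foo", "koo", "too"])
print(latest)
```

The subquery form does not rely on DISTINCT ON, so the same shape of query should be usable on backends that lack it.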
If you are using PostgreSQL, you can do the following:
Foo.objects.order_by('name', '-id').distinct('name')
MySQL is more complicated since it lacks a DISTINCT ON clause. Here is a raw query that is very hard to get Django to generate from ORM calls. Be aware that with SELECT * and GROUP BY, MySQL does not guarantee which row it returns for each name, so check the result against your data:
Foo.objects.raw("""
    SELECT
        *
    FROM
        `foo`
    GROUP BY `foo`.`name`
    ORDER BY `foo`.`name` ASC , `foo`.`id` DESC
""")

Join annotations in Django without raw SQL

I have a model that has arbitrary key/value pairs (attributes) associated with it. I'd like to have the option of sorting by those dynamic attributes. Here's what I came up with:
from django.db import models
from django.db.models import expressions
class Item(models.Model):
    pass
class Attribute(models.Model):
    item = models.ForeignKey(Item, related_name='attributes')
    key = models.CharField()
    value = models.CharField()
def get_sorted_items():
    return Item.objects.all().annotate(
        first=select_attribute('first'),
        second=select_attribute('second'),
    ).order_by('first', 'second')
def select_attribute(attribute):
    # Correlated subquery that picks the value stored under the given key
    return expressions.RawSQL("""
        select app_attribute.value from app_attribute
        where app_attribute.item_id = app_item.id
        and app_attribute.key = %s""", (attribute,))
This works, but it has a bit of raw SQL in it, so it makes my co-workers wary. Is it possible to do this without raw SQL? Can I make use of Django's ORM to simplify this?
I would expect something like this to work, but it doesn't:
def get_sorted_items():
    return Item.objects.all().annotate(
        first=Attribute.objects.filter(key='first').values('value'),
        second=Attribute.objects.filter(key='second').values('value'),
    ).order_by('first', 'second')
Approach 1
Using Django 1.8+ Conditional Expressions
(see also Query Expressions)
items = Item.objects.all().annotate(
    first=models.Case(models.When(attributes__key='first', then=models.F('attributes__value')), default=models.Value('')),
    second=models.Case(models.When(attributes__key='second', then=models.F('attributes__value')), default=models.Value(''))
).distinct()
for item in items:
    print(item.first, item.second)
Approach 2
Using prefetch_related with a custom models.Prefetch object
keys = ['first', 'second']
items = Item.objects.all().prefetch_related(
    models.Prefetch('attributes',
        queryset=Attribute.objects.filter(key__in=keys),
        to_attr='prefetched_attrs'),
)
This way every item from the queryset will have a list under its .prefetched_attrs attribute.
This list contains all attributes related to the item that pass the filter.
Now, because you want attribute.value, you can implement something like this:
class Item(models.Model):
    #...
    def get_attribute(self, key, default=None):
        try:
            return next((attr.value for attr in self.prefetched_attrs if attr.key == key), default)
        except AttributeError:
            raise AttributeError('You didnt prefetch any attributes')
# and the usage will be:
for item in items:
    print(item.get_attribute('first'), item.get_attribute('second'))
Some notes about the differences between the two approaches:
You have somewhat better control over the filtering process with the custom Prefetch object. The conditional-expressions approach is somewhat harder to optimize, IMHO.
With prefetch_related you get the whole attribute object, not just the value you are interested in.
Django executes prefetch_related after the main queryset is evaluated, which means an additional query is run for each lookup in the prefetch_related call. In a way this is good, because it keeps the main queryset untouched by the filters, so no additional clauses like .distinct() are needed.
prefetch_related always puts the returned objects into a list, which is not very convenient when a prefetch returns one element per object, so additional model methods are needed to use it comfortably.
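Because the prefetched attributes live in Python lists, sorting by them also has to happen in Python with this approach. The sketch below shows that lookup and sort without a database. Attr and ItemStub are hypothetical stand-ins for the Attribute and Item models, and the sample data is invented.

```python
class Attr:
    """Hypothetical stand-in for an Attribute row."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


class ItemStub:
    """Hypothetical stand-in for an Item with prefetched attributes."""

    def __init__(self, name, prefetched_attrs=None):
        self.name = name
        # Only set the attribute when a prefetch "happened", as Django would.
        if prefetched_attrs is not None:
            self.prefetched_attrs = prefetched_attrs

    def get_attribute(self, key, default=None):
        try:
            attrs = self.prefetched_attrs
        except AttributeError:
            raise AttributeError("attributes were not prefetched")
        # First matching value, or the default when the key is absent.
        return next((a.value for a in attrs if a.key == key), default)


items = [
    ItemStub("a", [Attr("first", "x"), Attr("second", "2")]),
    ItemStub("b", [Attr("first", "x"), Attr("second", "1")]),
    ItemStub("c", [Attr("second", "9")]),  # no "first" attribute
]

# Sort by ("first", "second"), treating a missing attribute as an empty string.
items_sorted = sorted(
    items,
    key=lambda i: (i.get_attribute("first", ""), i.get_attribute("second", "")),
)
print([i.name for i in items_sorted])
```

Sorting in Python means the whole result set has to be loaded first, so this suits small result sets better than paginated ones.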

Django Queryset's annotate returns only a single result

I have a model called Schedule which has a ManyToManyField, roomId. This links to the Room model, which has an important ForeignKey, buildingId.
For my QuerySet, I need the list of buildingIds for each roomId.
What I've tried:
queryset = Schedule.objects.all().annotate(buildingId=F('roomId__buildingId'))
And also:
queryset = Schedule.objects.all().annotate(buildingId=RawSQL("select roomId from api_room where buildingId_id = 1", ()))
The second one is just a test which should return two results.
Both of these only return the first result. So the buildingId I get is the ID of the first result, instead of a list of all matching results.
That won't work as you expect: it will return only one buildingId for each Schedule row. The database can't return data in the tree format that would be required here.
What you can do is access each building from the Schedule objects using:
from django.db.models import Prefetch
queryset = Schedule.objects.all().prefetch_related(Prefetch('roomId', queryset=Room.objects.all().select_related('buildingId')))
for schedule in queryset:
    rooms = schedule.roomId.all()
    for room in rooms:
        building = room.buildingId
That will execute only 2 queries on your database.
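To see why a single annotation cannot hold a list: a join returns flat rows, one per (schedule, room) pair, and the nesting has to be rebuilt afterwards. A minimal sketch of that regrouping step follows; the tuple layout and sample rows are assumptions for illustration, not output from the models above.

```python
from collections import defaultdict


def buildings_per_schedule(rows):
    """Group flat (schedule_id, room_id, building_id) rows by schedule."""
    grouped = defaultdict(list)
    for schedule_id, _room_id, building_id in rows:
        # Keep each building once per schedule, in first-seen order.
        if building_id not in grouped[schedule_id]:
            grouped[schedule_id].append(building_id)
    return dict(grouped)


# Hypothetical flat join result: one row per schedule/room combination.
rows = [
    (1, 101, 7),
    (1, 102, 7),
    (1, 103, 8),
    (2, 201, 9),
]
building_ids = buildings_per_schedule(rows)
print(building_ids)
```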

Custom SQL for GeoDjango on ForeignKey

I have the following models:
class UserProfile(models.Model):
    user = models.OneToOneField(User)
    location = models.PointField(blank=True, null=True, srid=CONSTANTS.SRID)
    objects = models.GeoManager()
class Item(models.Model):
    owner = models.ForeignKey(UserProfile)
    objects = models.GeoManager()
Now I need to sort the Items by distance to some point:
p = Point(12.5807203, 50.1250706)
Item.objects.all().distance(p, field='owner__location')
But that raises an error:
TypeError: ST_Distance output only available on GeometryFields.
From the question "GeoDjango GeoQuerySet.distance() results in 'ST_Distance output only available on GeometryFields' when specifying a reverse relationship in field_name" I can see there is already a ticket for this.
I don't like the solution proposed in that question, since that way I would lose the distances.
So I was thinking that I could achieve this with a custom SQL query. I know that this:
UserProfile.objects.distance(p)
will produce something like this:
SELECT (ST_distance_sphere("core_userprofile"."location",ST_GeomFromEWKB('\x0101000020e6100000223fd12b5429294076583c5002104940'::bytea))) AS "distance", "core_userprofile"."id", "core_userprofile"."user_id", "core_userprofile"."verified", "core_userprofile"."avatar_custom", "core_userprofile"."city", "core_userprofile"."location", "core_userprofile"."bio" FROM "core_userprofile"
So my question is: is there an easy way to manually construct such a query that sorts items by distance?
Since the geometry you're measuring distance to is on UserProfile, it makes sense to query for UserProfile objects and then handle each Item object they own. (The distance is the same for all items owned by a profile.)
For example:
all_profiles = UserProfile.objects.all()
for profile in all_profiles.distance(p).order_by('distance'):
    for item in profile.item_set.all():
        process(item, profile.distance)
You may be able to make this more efficient with prefetch_related:
all_profiles = UserProfile.objects.all()
all_profiles = all_profiles.prefetch_related('item_set')  # we'll need these
for profile in all_profiles.distance(p).order_by('distance'):
    for item in profile.item_set.all():  # items already prefetched
        process(item, profile.distance)
If it's important for some reason to query directly for Item objects, try using extra:
items = Item.objects.all()
items = items.select_related('owner')
distance_select = "st_distance_sphere(core_userprofile.location, ST_GeomFromEWKT('%s'))" % p.wkt
items = items.extra({'distance': distance_select})
items = items.order_by('distance')
Raw queries are another option, which let you get model objects from a raw SQL query:
items = Item.objects.raw("SELECT core_item.* FROM core_item JOIN core_userprofile ...")
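The first suggestion (compute the distance once per profile, then reuse it for every item that profile owns) can be sketched without a database. This is only an illustration: the haversine formula stands in for the spherical distance the database would compute, and the function names, dict layout, and sample coordinates are all assumptions.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371000.0):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def items_by_owner_distance(profiles, items, point):
    """Return (distance_m, item_id) pairs sorted by the owner's distance.

    profiles: {profile_id: (lon, lat) or None}; items: [(item_id, owner_id)].
    Items whose owner has no location are skipped.
    """
    lon, lat = point
    # One distance per profile, shared by all of that profile's items.
    distance = {
        pid: haversine_m(lon, lat, loc[0], loc[1])
        for pid, loc in profiles.items()
        if loc is not None
    }
    located = [(distance[owner], item_id) for item_id, owner in items if owner in distance]
    return sorted(located)


# Hypothetical data: profile 3 has no location (the field is nullable).
profiles = {1: (12.58, 50.12), 2: (14.42, 50.09), 3: None}
items = [(10, 2), (11, 1), (12, 3), (13, 1)]
result = items_by_owner_distance(profiles, items, (12.5807203, 50.1250706))
print([item_id for _, item_id in result])
```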