Join annotations in Django without raw SQL - django

I have a model that has arbitrary key/value pairs (attributes) associated with it. I'd like to have the option of sorting by those dynamic attributes. Here's what I came up with:
from django.db import models
from django.db.models import expressions
class Item(models.Model):
    pass
class Attribute(models.Model):
    item = models.ForeignKey(Item, related_name='attributes')
    key = models.CharField()
    value = models.CharField()
def get_sorted_items():
    return Item.objects.all().annotate(
        first=select_attribute('first'),
        second=select_attribute('second'),
    ).order_by('first', 'second')
def select_attribute(attribute):
    # Correlated subquery: picks the value of the given key for each item row.
    return expressions.RawSQL("""
        select app_attribute.value from app_attribute
        where app_attribute.item_id = app_item.id
        and app_attribute.key = %s""", (attribute,))
This works, but it has a bit of raw SQL in it, so it makes my co-workers wary. Is it possible to do this without raw SQL? Can I make use of Django's ORM to simplify this?
I would expect something like this to work, but it doesn't:
def get_sorted_items():
    return Item.objects.all().annotate(
        first=Attribute.objects.filter(key='first').values('value'),
        second=Attribute.objects.filter(key='second').values('value'),
    ).order_by('first', 'second')
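For reference, the correlated subquery used in the raw SQL above can be tried outside Django. This is only a sketch using the standard-library sqlite3 module: the table and column names come from the question, while the sample rows and the aliases attr_first and attr_second are assumptions made for illustration. Note that sqlite3 uses ? as its placeholder where the Django snippet uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table app_item (id integer primary key);
    create table app_attribute (
        id integer primary key,
        item_id integer references app_item(id),
        key text,
        value text
    );
    insert into app_item (id) values (1), (2), (3);
    -- Sample data, invented for this sketch.
    insert into app_attribute (item_id, key, value) values
        (1, 'first', 'b'), (1, 'second', 'x'),
        (2, 'first', 'a'), (2, 'second', 'z'),
        (3, 'first', 'a'), (3, 'second', 'y');
""")

# One correlated subquery per dynamic attribute, then sort on both results.
query = """
    select app_item.id,
        (select app_attribute.value from app_attribute
         where app_attribute.item_id = app_item.id
         and app_attribute.key = ?) as attr_first,
        (select app_attribute.value from app_attribute
         where app_attribute.item_id = app_item.id
         and app_attribute.key = ?) as attr_second
    from app_item
    order by attr_first, attr_second
"""
rows = conn.execute(query, ("first", "second")).fetchall()
for row in rows:
    print(row)
```

Each item appears exactly once in the result, because the attributes are looked up per row instead of being joined in.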

Approach 1
Using Django 1.8+ Conditional Expressions
(see also Query Expressions)
items = Item.objects.all().annotate(
    # The lookups use the related_name 'attributes' declared on Attribute.item.
    first=models.Case(
        models.When(attributes__key='first', then=models.F('attributes__value')),
        default=models.Value('')),
    second=models.Case(
        models.When(attributes__key='second', then=models.F('attributes__value')),
        default=models.Value(''))
).distinct()
for item in items:
    print(item.first, item.second)
Approach 2
Using prefetch_related with custom models.Prefetch object
keys = ['first', 'second']
items = Item.objects.all().prefetch_related(
    models.Prefetch('attributes',
                    queryset=Attribute.objects.filter(key__in=keys),
                    to_attr='prefetched_attrs'),
)
This way every item from the queryset will contain a list under the .prefetched_attrs attribute.
This list will contain all of the item's attributes that match the filter.
Now, because you want to get the attribute.value, you can implement something like this:
class Item(models.Model):
    #...
    def get_attribute(self, key, default=None):
        try:
            # Return the first matching value, or the default if the key is absent.
            return next((attr.value for attr in self.prefetched_attrs if attr.key == key), default)
        except AttributeError:
            raise AttributeError("You didn't prefetch any attributes")
# and the usage will be:
for item in items:
    print(item.get_attribute('first'), item.get_attribute('second'))
Some notes about the differences between the two approaches:
You have a bit more control over the filtering process with the custom Prefetch object. The conditional-expressions approach is a bit harder to optimize, IMHO.
With prefetch_related you get the whole Attribute object, not just the value you are interested in.
Django executes prefetch_related after the main queryset has been evaluated, which means an additional query is executed for each lookup in the prefetch_related call. In a way this is good, because it keeps the main queryset free of the extra filters, so no additional clauses like .distinct() are needed.
prefetch_related always puts the returned objects into a list, which is not very convenient when a prefetch returns one element per object, so additional model methods are needed to make it pleasant to use.
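Since prefetched attributes live on the Python side, they cannot be passed to order_by, so with Approach 2 the sorting would have to happen after the queryset is evaluated. The sketch below exercises the get_attribute lookup without a database: Item here is a plain class standing in for the model, SimpleNamespace objects stand in for Attribute rows, and make_item is a hypothetical helper used only to build test data.

```python
from types import SimpleNamespace


class Item:
    """Plain stand-in for the Django model, for illustration only."""

    def get_attribute(self, key, default=None):
        try:
            attrs = self.prefetched_attrs
        except AttributeError:
            raise AttributeError("You didn't prefetch any attributes")
        # First matching value wins; fall back to the default otherwise.
        return next((attr.value for attr in attrs if attr.key == key), default)


def make_item(name, **attrs):
    # Hypothetical helper: mimics what Prefetch(..., to_attr=...) would set.
    item = Item()
    item.name = name
    item.prefetched_attrs = [
        SimpleNamespace(key=key, value=value) for key, value in attrs.items()
    ]
    return item


items = [
    make_item("x", first="b", second="1"),
    make_item("y", first="a", second="2"),
    make_item("z", first="a", second="1"),
    make_item("w", second="3"),  # no 'first' attribute at all
]

# Sort in Python on the two dynamic attributes; missing keys sort as ''.
ordered = sorted(
    items,
    key=lambda it: (it.get_attribute("first", ""), it.get_attribute("second", "")),
)
names = [it.name for it in ordered]
print(names)
```

Sorting in Python only makes sense when the whole result set fits in memory; for paginated results the database-side ordering of Approach 1 or the raw subquery is the more natural fit.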

Related

How to use select_related with Django create()?

class Parent(models.Model):
    # some fields
class Child(models.Model):
    parent = models.ForeignKey(Parent)
    species = models.ForeignKey(Species)
    # other fields
I have a function like this:
1. def some_function(unique_id):
2.     parent_object = Parent.objects.get(unique_id=unique_id)
3.     new_child = Child.objects.create(name='Joe', parent=parent_object)
4.     call_some_func(new_child.parent.name, new_child.species.name)
In line 4, a DB query is generated for Species. Is there any way I can use select_related to fetch the Species up front, so as to prevent the extra query?
Can it be done while I use .create()? This is just an example; I am using many other fields too, and they query the DB every time.
The only way I can think of is to run this code after line 3:
child_obj = Child.objects.select_related('species').get(id=new_child.id)
create() only takes the field values for the new object and then calls save() with force_insert=True, which is not related to what you're asking, so it seems it's not possible. Also, noticing that create performs an INSERT ... RETURNING ... statement (on PostgreSQL, at least), I don't think it would be possible anyway because you cannot return columns from other tables.
Possibly the best approach is what you already suggested: do a get() afterwards with the related fields you need.

Django: "distinct" alternatives for non-PostgreSQL databases

Consider the following model:
class Test(Model):
    # Some fields...
class TestExecution(Model):
    test = ForeignKey(Test)
    execution_date = DateTimeField()
    # more fields...
class Goal(Model):
    tests = ManyToManyField(Test)
    # more fields...
I want to get the latest result of each test performed as part of a certain goal, so I perform the following query:
TestExecution.objects.filter(test__goal_id=goal_id).order_by("execution_date")
but the problem is that I get ALL the executions performed, and I want only the latest for each test.
I saw that the distinct(*fields) method can be used to eliminate duplicate executions of the same test, but it only works in PostgreSQL, so it is not suitable for me.
Is there any other way to filter a QuerySet so that it'll include only rows that are distinct on selected columns?
You can remove duplicates in Python rather than in the query, with something like list(set(list_of_objects)) (I recommend checking first that it works for your case). To remove duplicates from list_of_objects, you need to define what makes an object unique.
To do that, you need to make the object hashable by defining both the __hash__ and __eq__ methods:
def __eq__(self, other):
    return self.execution_date==other.execution_date\
        and self.title==other.title
def __hash__(self):
    return hash(('title', self.title,
                 'execution_date', self.execution_date))
You can also do this more easily, though less cleanly, by using values_list in the query:
list(set(TestExecution.objects.filter(test__goal_id=goal_id)
         .values_list("sth", flat=True)
         .order_by("execution_date")))
If the objects are not hashable, remove the duplicates the dirty way:
seen_titles = set()
new_list = []
for obj in myList:
    if obj.title not in seen_titles:
        new_list.append(obj)
        seen_titles.add(obj.title)
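The loop above keeps the first object seen for each title. Since the question asks for the latest execution per test, a variant that keeps the newest entry per key may be closer to what is needed. The following is a minimal sketch in plain Python: the Execution namedtuple is a stand-in for TestExecution rows (only test_id and execution_date are assumed), and latest_per_test is a hypothetical helper name.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for TestExecution rows; a real queryset would yield model instances.
Execution = namedtuple("Execution", ["test_id", "execution_date"])


def latest_per_test(executions):
    """Keep only the most recent execution for each test_id."""
    latest = {}
    for execution in executions:
        current = latest.get(execution.test_id)
        if current is None or execution.execution_date > current.execution_date:
            latest[execution.test_id] = execution
    # Return the survivors ordered by date, like the original order_by().
    return sorted(latest.values(), key=lambda e: e.execution_date)


rows = [
    Execution(1, datetime(2020, 1, 1)),
    Execution(2, datetime(2020, 1, 2)),
    Execution(1, datetime(2020, 1, 3)),
]
result = latest_per_test(rows)
for execution in result:
    print(execution.test_id, execution.execution_date.date())
```

This still loads every execution for the goal into memory, so it is a reasonable fit for small result sets only.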

Create new model instances from existing ones

I have a model:
class MyModel(models.Model):
    fieldA = models.IntegerField()
    fieldB = models.IntegerField()
    fieldC = models.IntegerField()
Now, let's get a QuerySet, e.g.
qs = MyModel.objects.all()
I'd like to be able to change fieldB and fieldC of all instances in qs with the same value and save them as NEW records in my database. I need something similar to qs.update(fieldB=2, fieldC=3) but I don't want to override the original records in qs. Is there a Django-way to do so (i.e., something not involving a manually coded for loop)?
I'm not aware of a Django call that will do what you want in a single call, like update does.
This is the closest I've been able to get. Assuming that the objects you want to operate on are in qs:
MyModel.objects.bulk_create(
    MyModel(**{pair[0]: pair[1] for pair in x.items()
               if pair[0] != MyModel._meta.pk.name})
    for x in qs.values())
Notes:
qs.values() returns one dictionary per object in qs.
The {pair[0]...} expression is a dictionary comprehension that creates a new dictionary minus the primary key field defined on MyModel.
MyModel(**...) creates a new MyModel object with its primary key set to None. It effectively creates a copy.
bulk_create creates all the objects in one query. It has caveats that one should be aware of.
If it needs to be mentioned, what you have here are a dict comprehension and a generator expression, not for loops.
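The dictionary step can be checked on its own with plain dicts standing in for the rows that qs.values() returns. In this sketch copy_without_pk is a hypothetical helper, the primary key name "id" is an assumption, and the overrides argument is one possible way to apply the new fieldB and fieldC values that the question asks for.

```python
def copy_without_pk(row, pk_name="id", **overrides):
    """Return a copy of row without its primary key, with overrides applied."""
    data = {key: value for key, value in row.items() if key != pk_name}
    data.update(overrides)
    return data


# Plain dicts standing in for qs.values(); the values are made up.
rows = [
    {"id": 1, "fieldA": 10, "fieldB": 20, "fieldC": 30},
    {"id": 2, "fieldA": 11, "fieldB": 21, "fieldC": 31},
]
copies = [copy_without_pk(row, fieldB=2, fieldC=3) for row in rows]
for data in copies:
    print(data)
```

Each resulting dict could then be expanded into the model constructor, as in MyModel(**data), before being handed to bulk_create.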

Filter on a distinct field with TastyPie

Suppose I have a Person model that has a first name field and a last name field. There will be many people who have the same first name. I want to write a TastyPie resource that allows me to get a list of the unique first names (without duplicates).
Using the Django model directly, you can do this easily by saying something like Person.objects.values("first_name").distinct(). How do I achieve the same thing with TastyPie?
Update
I've adapted the apply_filters method linked below to use the values before making the distinct call.
def apply_filters(self, request, applicable_filters):
    qs = self.get_object_list(request).filter(**applicable_filters)
    # ''.split(',') returns [''], so drop empty names before calling values().
    values = [v for v in request.GET.get('values', '').split(',') if v]
    if values:
        qs = qs.values(*values)
    distinct = request.GET.get('distinct', False) == 'True'
    if distinct:
        qs = qs.distinct()
    return qs
values returns dictionaries instead of model objects, so I don't think you need to override alter_list_data_to_serialize.
Original response
There is a nice solution to the distinct part of the problem here involving a light override of apply_filters.
I'm surprised I'm not seeing a slick way to filter which fields are returned, but you could implement that by overriding alter_list_data_to_serialize and deleting unwanted fields off the objects just before serialization.
def alter_list_data_to_serialize(self, request, data):
    data = super(PersonResource, self).alter_list_data_to_serialize(request, data)
    fields = request.GET.get('fields', None)
    if fields is not None:
        fields = fields.split(',')
        # Data might be a bundle here. If so, operate on data.objects instead.
        data = [
            dict((k,v) for k,v in d.items() if k in fields)
            for d in data
        ]
    return data
Combine those two to use something like /api/v1/person/?distinct=True&values=first_name to get what you're after. That would work generally and would still work with additional filtering (&last_name=Jones).
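One detail worth checking in a handler like apply_filters is how the query-string values parse when a parameter is missing: splitting an empty string yields a list with one empty name, which is truthy. The sketch below shows the parsing in isolation; parse_list_param and parse_bool_param are hypothetical helper names, and a plain dict stands in for request.GET.

```python
def parse_list_param(raw):
    """Split a comma-separated parameter, dropping empty names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def parse_bool_param(raw):
    """Only the literal string 'True' switches the flag on."""
    return raw == "True"


# A plain dict standing in for request.GET.
params = {"distinct": "True", "values": "first_name"}
values = parse_list_param(params.get("values", ""))
distinct = parse_bool_param(params.get("distinct", ""))
print(values, distinct)

# The pitfall: an absent parameter must not turn into [''].
print("".split(","))
print(parse_list_param(""))
```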

Is it possible to order by an annotation with django TastyPie?

I'm trying to order by a count of a ManyToMany field. Is there a way to do this with TastyPie?
For example:
class Person(models.Model):
    friends = models.ManyToManyField(User, ..)
I want PersonResource to spit out JSON that is ordered by the number of friends a person has.
Is that possible?
I know this is an old question, but I recently encountered this problem and came up with a solution.
Tastypie doesn't easily allow custom ordering, but it is easy to modify the queryset it uses.
I actually just modified the default queryset for the model using a custom manager.
For instance:
class PersonManager(models.Manager):
    def get_query_set(self):  # named get_queryset() in Django 1.6+
        return super(PersonManager, self).get_query_set().\
            annotate(friend_count=models.Count('friends'))
class Person(models.Model):
    objects = PersonManager()
    friends = ...
You could also add the annotation in Tastypie, either in the queryset=... in the Meta class, or by overriding the get_object_list(self, request) method.
I wasn't able to get the results ordering as per coaxmetal's solution, so I solved this a different way, by overriding the get_object_list on the Resource object as per http://django-tastypie.readthedocs.org/en/latest/cookbook.html. Basically if the 'top' querystring parameter exists, then the ordered result is returned.
from django.db.models import Count
class MyResource(ModelResource):
    class Meta:
        queryset = MyObject.objects.all()
    def get_object_list(self, request):
        try:
            # Raises KeyError when the 'top' parameter is absent.
            most_popular = request.GET['top']
            result = super(MyResource, self).get_object_list(request).annotate(num_something=Count('something')).order_by('num_something')
        except KeyError:
            result = super(MyResource, self).get_object_list(request)
        return result
I have not used TastyPie, but your problem seems to be more general. If ordering inside the ORM query is not an option for you, you can sort in Python instead by storing tuples of the form (Person, friend_count). This is pretty easy:
p_list = []
for person in Person.objects.all():
    friendcount = len(person.friends.all())
    p_list.append((person, friendcount))
Then, you can use the built in sorted function like so:
sorted_list = [person for (person, fc) in sorted(p_list, key=lambda x: x[1])]
The last line extracts the Persons from the list of tuples after sorting it by the number of friends each person has.
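The tuple-sorting idea can be tried without a database. In this sketch the names and friend lists are made up, and plain strings stand in for Person objects; it also shows the reverse=True variant, which may be what is wanted when the most-connected people should come first.

```python
# Made-up data: (person, list of friends) pairs standing in for model objects.
people = [
    ("alice", ["bob", "carol"]),
    ("bob", []),
    ("carol", ["alice"]),
]

# Pair each person with a friend count, as in the (Person, friend_count) tuples.
counted = [(name, len(friends)) for name, friends in people]

# Ascending by friend count, then the same sort in descending order.
ascending = [name for name, count in sorted(counted, key=lambda pair: pair[1])]
descending = [
    name for name, count in sorted(counted, key=lambda pair: pair[1], reverse=True)
]
print(ascending)
print(descending)
```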