Aggregation and extra values with Django

I have a model which looks like this:
class MyModel(models.Model):
    value = models.DecimalField()
    date = models.DateTimeField()
I'm doing this request:
MyModel.objects.aggregate(Min("value"))
and I'm getting the expected result:
{"value__min": the_actual_minimum_value}
However, I can't figure out a way to get, at the same time, the minimum value AND the associated date (the date at which the minimum value occurred).
Does the Django ORM allow this, or do I have to use raw SQL?

What you want to do is annotate the query, so that you get back your usual results but also have some data added to the result. So:
MyModel.objects.annotate(Min("value"))
This returns the normal results, with value__min added to each of them as an additional value.
In reply to your comment, I think this is what you are looking for? This will return the dates with their corresponding Min values.
MyModel.objects.values('date').annotate(Min("value"))
Edit: In further reply to your comment, since you want the lowest-valued entry and also want the date field in the result, you could do something like this:
MyModel.objects.values('date').annotate(min_value=Min('value')).order_by('min_value')[0]
This gives you the dict you are asking for: the results are ordered by min_value, and the first element is always the one with the lowest value.
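If only the single lowest entry is needed, the grouping step can arguably be skipped. A minimal sketch, assuming the MyModel fields from the question; lowest_entry is a hypothetical helper name, and it takes the model class as an argument so that it does not depend on any particular project:

```python
def lowest_entry(model):
    """Return {'value': ..., 'date': ...} for the row with the smallest value.

    Returns None when the table is empty, because first() does not raise.
    """
    # order_by("value") sorts ascending, so the first row holds the minimum.
    return model.objects.order_by("value").values("value", "date").first()
```

Calling lowest_entry(MyModel) should issue a single query. If several rows share the minimum value, which of them comes back is not defined unless a second ordering field is added.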

Related

Annotate one part of a range to a new field

So we've been using a DateTimeRangeField in a booking model to denote start and end. The rationale for this might not have been great (separate start and end fields might have been better in hindsight), but we're over a year into this now and there's no going back.
It's generally been fine except I need to annotate just the end datetime onto a related model's query. And I can't work out the syntax.
Here's a little toy example where I want a list of Employees with the end of their last booking annotated on.
class Booking(models.Model):
    timeframe = DateTimeRangeField()
    employee = models.ForeignKey('Employee')
sq = Booking.objects.filter(employee=OuterRef('pk')).values('timeframe')
Employee.objects.annotate(last_on_site=Subquery(sq, output_field=DateTimeField()))
That doesn't work because the annotated value is the range, not the single value. I've tried a heap of modifiers (e.g. __1 and .1), but nothing works.
Is there a way to get just the one value? I guess you could simulate this without the complication of the subquery by doing a simple values lookup: Booking.objects.values('timeframe__start') (or whatever). That's essentially what I'm trying to do here.

Thanks to some help in IRC, it turns out you can use the RangeStartsWith and RangeEndsWith transform classes directly. They are normally just registered to give you __startswith and __endswith lookups on range values, but used directly they can pull back the value itself.
In my example, that means just modifying the annotation slightly:
from django.contrib.postgres.fields.ranges import RangeEndsWith
sq = Booking.objects.filter(employee=OuterRef('pk')).values('timeframe')
Employee.objects.annotate(last_on_site=RangeEndsWith(Subquery(sq[:1])))

Return object when aggregating grouped fields in Django

Assuming the following example model:
# models.py
class event(models.Model):
    location = models.CharField(max_length=10)
    type = models.CharField(max_length=10)
    date = models.DateTimeField()
    attendance = models.IntegerField()
I want to get the attendance number for the latest date of each event location and type combination, using Django ORM. According to the Django Aggregation documentation, we can achieve something close to this, using values preceding the annotation.
... the original results are grouped according to the unique combinations of the fields specified in the values() clause. An annotation is then provided for each unique group; the annotation is computed over all members of the group.
So using the example model, we can write:
event.objects.values('location', 'type').annotate(latest_date=Max('date'))
which does indeed group events by location and type, but does not return the attendance field, which is what I actually need.
Another approach I tried was to use distinct i.e.:
event.objects.distinct('location', 'type').annotate(latest_date=Max('date'))
but I get an error
NotImplementedError: annotate() + distinct(fields) is not implemented.
I found some answers which rely on database specific features of Django, but I would like to find a solution which is agnostic to the underlying relational database.

Alright, I think this one might actually work for you. It is based upon an assumption, which I think is correct.
When you create your model objects, they should all be unique: it seems highly unlikely that you would have two events on the same date, in the same location, and of the same type. So with that assumption, let's begin. (As a formatting note, class names tend to start with a capital letter to differentiate classes from variables or instances, so I use Event below.)
from django.db.models import Max

# First you get your desired events with your criteria.
results = Event.objects.values('location', 'type').annotate(latest_date=Max('date'))
# Make an empty list to store the values you want.
results_list = []
# Then iterate through your results, looking up the objects
# you want and populating the list.
for r in results:
    result = Event.objects.get(location=r['location'], type=r['type'], date=r['latest_date'])
    results_list.append(result)
# Now you have a list of objects that you can do whatever you want with.
You may need to check the exact value that Max('date') returns, but this should get you on the right path.
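The loop above issues one extra query per group, and get() raises MultipleObjectsReturned if two events do share the same location, type and date. As an alternative sketch that stays database-agnostic, the selection can be done in Python over a single query. latest_per_group is a hypothetical helper, and it assumes the table is small enough to iterate over in memory:

```python
def latest_per_group(events):
    """Keep, for each (location, type) pair, the event with the latest date.

    `events` is any iterable of objects with location, type and date
    attributes, for example Event.objects.all().
    """
    latest = {}
    for event in events:
        key = (event.location, event.type)
        # Replace the stored event only when this one is strictly newer.
        if key not in latest or event.date > latest[key].date:
            latest[key] = event
    return list(latest.values())
```

Each returned object is a full row, so attendance is available directly, for example [e.attendance for e in latest_per_group(Event.objects.all())].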

Django get count of each age

I have this model:
class User_Data(AbstractUser):
    date_of_birth = models.DateField(null=True, blank=True)
    city = models.CharField(max_length=255, default='', null=True, blank=True)
    address = models.TextField(default='', null=True, blank=True)
    gender = models.TextField(default='', null=True, blank=True)
And I need to run a django query to get the count of each age. Something like this:
Age || Count
10 || 100
11 || 50
and so on.....

Here is what I did with lambda:
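from operator import itemgetter

# list() is needed on Python 3, where map() returns an iterator without .count()
usersAge = list(map(lambda x: calculate_age(x[0]), User_Data.objects.values_list('date_of_birth')))
users_age_data_source = [[x, usersAge.count(x)] for x in set(usersAge)]
users_age_data_source = sorted(users_age_data_source, key=itemgetter(0))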
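The calculate_age helper is not shown in the question. A minimal sketch of what such a helper might look like, together with a Counter-based tally that avoids calling list.count() once per distinct age; the today parameter and the age_counts name are assumptions added for illustration, and None birth dates are skipped because the field is nullable:

```python
from collections import Counter
from datetime import date


def calculate_age(born, today=None):
    """Age in whole years on `today` (defaults to the current date)."""
    today = today or date.today()
    # The tuple comparison is True (counts as 1) when this year's
    # birthday has not happened yet.
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


def age_counts(birth_dates, today=None):
    """Return sorted (age, count) pairs, ignoring missing birth dates."""
    ages = Counter(calculate_age(d, today) for d in birth_dates if d is not None)
    return sorted(ages.items())
```

With the model from the question, the input could come from User_Data.objects.values_list('date_of_birth', flat=True).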

There are a few ways of doing this. I've had to do something very similar recently. This example works in Postgres.
Note: I've written the following code the way I have so that syntactically it works, and so that I can write between each step. But you can chain these together if you desire.
First we need to annotate the queryset to obtain the 'age' parameter. Since it's not stored as an integer, and can change daily, we can calculate it from the date of birth field by using the database's 'current_date' function:
from django.db.models.expressions import RawSQL

ud = User_Data.objects.annotate(
    age=RawSQL("""(DATE_PART('year', current_date) - DATE_PART('year', "app_userdata"."date_of_birth"))::integer""", []),
)
Note: you'll need to change the "app_userdata" part to match up with the table of your model. You can pick this out of the model's _meta, but this just depends if you want to make this portable or not. If you do, use a string .format() to replace it with what the model's _meta provides. If you don't care about that, just put the table name in there.
Now we pick the 'age' value out so that we get a ValuesQuerySet with just this field
ud = ud.values('age')
And then annotate THAT queryset with a count of age
ud = ud.annotate(
    count=Count('age'),
)
At this point we have a ValuesQuerySet that has both 'age' and 'count' as fields. Order it so it comes out in a sensible way:
ud = ud.order_by('age')
And there you have it.
You must build up the queryset in this order, otherwise you'll get some interesting results. That is, you can't put both annotations in a single annotate() call: the count depends on the age annotation, and because a kwargs dict has no notion of the order in which the kwargs were defined, the queryset's field/dependency checking will fail.
Hope this helps.
If you aren't using Postgres, the only thing you'll need to change is the RawSQL annotation, to match whatever database engine you're using. However that engine gets the year of a date, either from a field or from its built-in "current date" function, as long as you can get it out as an integer, the rest works exactly the same way.
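As one illustration of adapting the expression to another engine, here is a sketch for SQLite written against the standard library's sqlite3 module rather than the ORM. The table name user_data and the function age_histogram are hypothetical, and the reference date is passed in as a parameter instead of using the database's current-date function so that the result is reproducible. Like the Postgres expression, it subtracts years only, so it ignores whether the birthday has already passed this year:

```python
import sqlite3

# SQLite has no DATE_PART; strftime('%Y', ...) extracts the year as text,
# and CAST turns it into an integer.
AGE_SQL = (
    "CAST(strftime('%Y', ?) AS INTEGER) "
    "- CAST(strftime('%Y', date_of_birth) AS INTEGER)"
)


def age_histogram(conn, today):
    """Return (age, count) rows, grouped and ordered by age."""
    query = (
        "SELECT " + AGE_SQL + " AS age, COUNT(*) AS count "
        "FROM user_data WHERE date_of_birth IS NOT NULL "
        "GROUP BY age ORDER BY age"
    )
    return conn.execute(query, (today,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_data (date_of_birth TEXT)")
conn.executemany(
    "INSERT INTO user_data VALUES (?)",
    [("2000-06-15",), ("2000-01-01",), ("2010-03-03",), (None,)],
)
rows = age_histogram(conn, "2020-03-01")
```

In a Django project, the same SQL fragment would go inside the RawSQL annotation, with the real table name in place of user_data.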

django aggregate for multiple days

I have a model which has two attributes, date and length, and others which are not relevant. I need to display a list of the sums of length for each day in a template.
The solution I've used so far is looping day by day and creating list of sums using aggregations like:
for day in month:
    sums.append(MyModel.objects.filter(date=day).aggregate(Sum('length')))
But it seems very inefficient to me because of the number of db lookups. Isn't there a better way to do this? Like caching everything and then filtering it without touching the db?

.values() can be used to group by date, so you will only get unique dates together with the sum of length fields via .annotate():
>>> from django.db.models import Sum
>>> MyModel.objects.values('date').annotate(total_length=Sum('length'))
From docs:
When .values() clause is used to constrain the columns that are returned in the result set, the method for evaluating annotations is slightly different. Instead of returning an annotated result for each result in the original QuerySet, the original results are grouped according to the unique combinations of the fields specified in the .values() clause.
Hope this helps.
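Days with no rows do not appear in the grouped result at all, which matters when the template should show every day of the month. A small sketch of filling those gaps in Python, assuming date is a DateField (so the keys are datetime.date objects) and that the rows have the shape produced by the query above; daily_totals is a hypothetical helper name:

```python
from datetime import timedelta


def daily_totals(rows, first_day, last_day):
    """Return (day, total) pairs for every day in the inclusive range.

    `rows` is an iterable of dicts with 'date' and 'total_length' keys.
    Days missing from `rows` get a total of 0.
    """
    totals = {row["date"]: row["total_length"] for row in rows}
    result = []
    day = first_day
    while day <= last_day:
        result.append((day, totals.get(day, 0)))
        day += timedelta(days=1)
    return result
```

The database is still hit only once; the loop runs over the in-memory result.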

Filtering QuerySet by __count of RelatedManager

I've got a QuerySet I'd like to filter by the count of a related_name. Currently I've got something like this:
objResults = myObjects.filter(Q(links_by_source__status=ACCEPTED),Q(links_by_source__count=1))
However, when I run this I get the following error message:
Cannot resolve keyword 'count' into field
I'm guessing that this query is operating individually on each of the links_by_source connections, therefore there is no count function since it's not a QuerySet I'm working with. Is there a way of filtering so that, for each object returned, the number of links_by_source is exactly 1?

You need to use an aggregation function to get the count before you can filter on it.
from django.db.models import Count
myObjects.filter(
    links_by_source__status=ACCEPTED
).annotate(
    link_count=Count('links_by_source')
).filter(link_count=1)
Note that the order of annotate and filter matters here: as written, the query counts only the ACCEPTED links. I'm not sure whether you want that, or whether you want to check that the total count of all links is 1.
