Django model aggregation

I have a simple hierarchical model with a Person and a RunningScore as its child.
The model stores running scores for many users. Simplified, it looks something like this:
class Person(models.Model):
    firstName = models.CharField(max_length=200)
    lastName = models.CharField(max_length=200)
class RunningScore(models.Model):
    person = models.ForeignKey('Person', related_name="scores")
    time = models.DecimalField(max_digits=6, decimal_places=2)
If I get a single Person, it comes with all the RunningScores associated with it, which is the standard behavior. My question is simple: if I'd like to get a Person with only one RunningScore child (say the best result, i.e. min(time)), how can I do that?
I read the official Django documentation but have not found a solution.

I am not 100% sure I understand what you mean, but maybe this will help:
from django.db.models import Min
Person.objects.annotate(min_running_time=Min('scores__time'))
The queryset will fetch Person objects with an additional min_running_time attribute. The lookup is 'scores__time' because the foreign key on RunningScore declares related_name="scores".
You can also add a filter:
Person.objects.annotate(min_running_time=Min('scores__time')).filter(firstName__startswith='foo')
Accessing the first object's min_running_time attribute:
first_person = Person.objects.annotate(min_running_time=Min('scores__time'))[0]
print(first_person.min_running_time)
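To make it clearer what an annotation like min_running_time computes, here is a rough sketch using only Python's sqlite3 module. It is not the SQL Django actually generates, and the table and column names are assumptions made up for the example. It shows the idea: one row per person, joined to the scores, with the smallest time per group.

```python
import sqlite3

# Hypothetical table and column names, chosen only for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE running_score (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        time REAL
    );
    INSERT INTO person (id, first_name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO running_score (person_id, time) VALUES
        (1, 12.5), (1, 11.75), (2, 14.0);
""")

# One row per person with the smallest related time (NULL when there are no scores).
rows = conn.execute("""
    SELECT p.id, p.first_name, MIN(s.time) AS min_running_time
    FROM person AS p
    LEFT OUTER JOIN running_score AS s ON s.person_id = p.id
    GROUP BY p.id, p.first_name
    ORDER BY p.id
""").fetchall()

for person_id, first_name, min_running_time in rows:
    print(person_id, first_name, min_running_time)
```

Note that this gives you the minimum value only, not the RunningScore row it came from; if you need the related object itself, ordering the person's scores and taking the first one is the simpler route.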
EDIT:
You can define a method or a property such as the following one to get the related object:
class Person(models.Model):
    ...
    @property
    def best_runner(self):
        # The reverse accessor is `scores` because of related_name="scores".
        try:
            return self.scores.order_by('time')[0]
        except IndexError:
            return None

If you want only one RunningScore for a given Person, you can use ordering and limit the queryset to one object.
Something like this, where person is a Person instance:
person.scores.order_by('time')[0]
Ordering by 'time' puts the lowest (best) time first; use '-time' if you want the highest instead. Here is the documentation on limiting querysets:
https://docs.djangoproject.com/en/1.3/topics/db/queries/#limiting-querysets

Related

Bin a queryset using Django?

Let's say we have the following simplistic models:
class Category(models.Model):
    name = models.CharField(max_length=264)
    def __str__(self):
        return self.name
    class Meta:
        verbose_name_plural = "categories"
class Status(models.Model):
    name = models.CharField(max_length=264)
    def __str__(self):
        return self.name
    class Meta:
        verbose_name_plural = "status"
class Product(models.Model):
    title = models.CharField(max_length=264)
    description = models.CharField(max_length=264)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    # DecimalField requires both max_digits and decimal_places.
    price = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.ForeignKey(Status, on_delete=models.CASCADE)
My aim is to get some statistics, such as total products, total sales, and average sales, based on which price bin each product belongs to.
The price bins could be something like 0-100, 100-500, 500-1000, etc.
I know how to do something like that with pandas (see "Binning column with python pandas").
I am searching for a way to do this with the Django ORM.
One of my thoughts is to convert the queryset into a list, apply a function to get the appropriate price bin, and then do the statistics.
Another thought, which I am not sure how to implement, is the same as the one above, but applying the bin function directly to the queryset field I am interested in.

There are three pathways I can see.
The first is composing the SQL you want directly and sending it to your database through your model's manager: .objects.raw("[sql goes here]"). This answer shows how to define groups with a simple function on the content; something like that could work:
SELECT FLOOR(grade/5.00)*5 As Grade,
       COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1
The second is that there is no reason you can't move the queryset (with .values() or .values_list()) into a pandas dataframe or similar and then bin it, as you mentioned. There is probably some efficiency loss in getting the queryset into a dataframe and then processing it, but I am not sure it would always be significant. If it's easier to compose and maintain, that might be fine.
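If you go the "pull the values out and bin them in Python" route and don't want pandas, a minimal sketch with the standard library could look like the following. The bin edges come from the question; the function names, the bin labels, and the sample prices are assumptions for illustration. In practice the prices would come from something like Product.objects.values_list('price', flat=True).

```python
from bisect import bisect_right
from collections import defaultdict
from decimal import Decimal

# Bin edges from the question: 0-100, 100-500, 500-1000.
# Each bin includes its lower edge and excludes its upper edge.
BIN_EDGES = [0, 100, 500, 1000]


def price_bin(price):
    """Return a label such as '100-500' for the bin that contains price."""
    i = bisect_right(BIN_EDGES, price)
    if i == 0:
        return "below 0"
    if i == len(BIN_EDGES):
        return "1000+"
    return f"{BIN_EDGES[i - 1]}-{BIN_EDGES[i]}"


def bin_stats(prices):
    """Group prices by bin and compute count, total and average per bin."""
    grouped = defaultdict(list)
    for price in prices:
        grouped[price_bin(price)].append(price)
    return {
        label: {
            "count": len(values),
            "total": sum(values),
            "average": sum(values) / len(values),
        }
        for label, values in grouped.items()
    }


# Hypothetical sample data standing in for the queryset values.
prices = [Decimal("20"), Decimal("80"), Decimal("100"), Decimal("750")]
stats = bin_stats(prices)
for label, summary in stats.items():
    print(label, summary)
```

Unlike dividing by a fixed width, bisect handles uneven bins such as 100-500 and 500-1000 without special cases.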
The third way I would try (which I think is what you really want) is chaining .annotate() to label each product with the bin it belongs to, and then using the aggregate Count function to count how many are in each bin. This is more advanced ORM work than I've done, but I think you'd start by looking at the docs section on conditional aggregation. I've adapted this slightly to create the 'price_class' column first, with annotate.
from django.db.models import Count, F, Q
from django.db.models.functions import Floor
Product.objects.annotate(price_class=Floor(F('price') / 100)).aggregate(
    class_zero=Count('pk', filter=Q(price_class=0)),
    class_one=Count('pk', filter=Q(price_class=1)),
    class_two=Count('pk', filter=Q(price_class=2)),  # etc etc
)
I haven't tested this. Floor lives in django.db.models.functions (Django 2.2 and later), and you may need ExpressionWrapper to force price_class into the right type of output_field. Note also that dividing by 100 only gives equal-width bins. All the best.
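For reference, conditional counts like class_zero, class_one and class_two correspond roughly to SQL of the following shape. This is a sketch with Python's sqlite3 module, not the exact query Django emits; the product table and its sample rows are assumptions for the example. CAST(... AS INTEGER) truncates toward zero, which matches a floor only for non-negative prices.

```python
import sqlite3

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL);
    INSERT INTO product (price) VALUES (20.0), (80.0), (150.0), (250.0), (299.0);
""")

# COUNT ignores NULLs, so each CASE counts only the rows in its own bin.
row = conn.execute("""
    SELECT
        COUNT(CASE WHEN CAST(price / 100 AS INTEGER) = 0 THEN 1 END) AS class_zero,
        COUNT(CASE WHEN CAST(price / 100 AS INTEGER) = 1 THEN 1 END) AS class_one,
        COUNT(CASE WHEN CAST(price / 100 AS INTEGER) = 2 THEN 1 END) AS class_two
    FROM product
""").fetchone()

counts = dict(zip(("class_zero", "class_one", "class_two"), row))
print(counts)
```

The result is a single row of counts, which is also what .aggregate() hands back as a dictionary.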

Calling a model's method from another model

I am quite new to Django, so my question might seem trivial.
I have 2 models (simplified for demonstration purposes):
class Subarticle(models.Model):
    parent_article = models.ForeignKey(Article, related_name='subarticles')
    priority = models.IntegerField()
    ....
    def getCheapest(self, quantity):  # find cheapest subarticle based on qty
        # code
and
class Article(models.Model):
    sub_article_qty = models.IntegerField()
    def production_cost(self):
        sub_article = Subarticle.objects.filter(parent_article=self).order_by('priority').first
        sub_article_price = sub_article.getCheapest(self.sub_article_qty)
        return sub_article_price*self.sub_article_qty
So basically every article has one or more sub-articles, and I want to find the cost of the article based on the cheapest price of the sub-article with the lowest priority number.
I am using "rest_framework" to send the model data, with approximately the following serializers:
from .models import Subarticle, Article
from rest_framework import serializers
class SubarticleSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Subarticle
        fields = ('parent_article', 'priority')
class ArticleSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Article
        fields = ('sub_article_qty', 'subarticles', 'production_cost')
But trying to do it like this gives me the following error:
Exception raised in callable attribute "production_cost"; original exception was: 'function' object has no attribute 'getCheapest'
Is it even possible to do this the way I am trying, or is there some other way of achieving it?

first is a method; you didn't call it.
sub_article = Subarticle.objects.filter(parent_article=self).order_by('priority').first()
Note, an easier way of spelling this is:
sub_article = self.subarticles.all().order_by('priority').first()
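The difference between first and first() can be seen without Django at all. The sketch below uses a made-up FakeQuerySet class (purely hypothetical, not Django's QuerySet) to show that leaving off the parentheses gives you the method object itself, which is why the attribute lookup fails.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a queryset; this is not Django code."""

    def __init__(self, items):
        self._items = list(items)

    def first(self):
        # Return the first item, or None when there is nothing to return.
        return self._items[0] if self._items else None


qs = FakeQuerySet(["cheap", "pricey"])

not_called = qs.first   # the method object itself, not a result
called = qs.first()     # the first item

try:
    not_called.upper()  # fails: a method object has no such attribute
except AttributeError as exc:
    message = str(exc)

print(callable(not_called), called)
print(message)
```

Django's first() also returns None when the queryset is empty, so production_cost may want to handle an article without sub-articles before calling getCheapest.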

How to query a field in an ListField of EmbeddedModelField in django-nonrel?

Let's say I have this:
class Parent(models.Model):
    id = models.IntegerField(primary_key=True)
    children = ListField(EmbeddedModelField('Child'))
class Child(models.Model):
    id = models.IntegerField(primary_key=True)
In the mongo interactive shell, finding Parents with a particular Child is as easy as:
db.myapp_parent.find({'children.id': 123})
How is this done in django-nonrel?
I tried a few things, including looking into raw queries, but raw_results is not a method on Parent.objects for some reason.
FWIW, this is what I have in my requirements.txt:
git+https://github.com/django-nonrel/django#nonrel-1.3
git+https://github.com/django-nonrel/djangotoolbox#toolbox-1.3
git+https://github.com/django-nonrel/mongodb-engine#mongodb-engine-1.3

I think I found the answer myself:
https://groups.google.com/forum/#!topic/django-non-relational/kCLOcI7nHS0
Basically, it looks like this is not supported yet, so the workaround is raw queries.
To make raw queries, the code in the question should be modified to:
from django_mongodb_engine.contrib import MongoDBManager
class Parent(models.Model):
    id = models.IntegerField(primary_key=True)
    children = ListField(EmbeddedModelField('Child'))
    objects = MongoDBManager()
class Child(models.Model):
    id = models.IntegerField(primary_key=True)
Then
Parent.objects.raw_query({'children.id': 123})
works.
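In case the dotted key is unfamiliar: for a list of embedded documents, {'children.id': 123} matches a parent when any element of children has id equal to 123. Here is a plain-Python sketch of that one rule, using made-up documents; it is a simplified model of this single query shape, not of MongoDB's matching in general.

```python
def matches_children_id(document, wanted_id):
    """True if any embedded child in document['children'] has the wanted id."""
    return any(
        child.get("id") == wanted_id
        for child in document.get("children", [])
    )


# Hypothetical documents standing in for the myapp_parent collection.
parents = [
    {"_id": 1, "children": [{"id": 123}, {"id": 7}]},
    {"_id": 2, "children": [{"id": 8}]},
    {"_id": 3, "children": []},
]

matching_ids = [p["_id"] for p in parents if matches_children_id(p, 123)]
print(matching_ids)
```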

After looking around for a while, I found that the following, mentioned there, worked like magic for me and appears to avoid the need for raw queries (adapted to your example):
from django_mongodb_engine.query import A
...
Parent.objects.filter(children=A('id', '123'))
As for requirements:
git+https://github.com/django-nonrel/django#nonrel-1.5
git+https://github.com/django-nonrel/djangotoolbox#toolbox-1.8
git+https://github.com/django-nonrel/mongodb-engine
#(django-mongodb-engine==0.6.0)
#(pymongo==3.2)

Figuring out how to design my model and using "through"

I'm trying to figure out how to design my model. I've been going over the documentation, and it ultimately seems like I should be using the "through" attribute, but I just can't figure out how to get it to work how I want.
If someone could take a look and point out what I'm missing, that would be really helpful. I have pasted my model below.
This is what I am trying to do:
1) Have a list of server types
2) Each server type will need to have different parts available to that specific server type
3) The asset has a FK to the servermodel, which has a M2M to the parts specific to that server type.
My question is, how can each "Asset" store metadata for each "Part" specific to that "Asset"? For example, each "Asset" should have its own last_used data for the part that's assigned to it.
Thanks! :)
class Part(models.Model):
    part_description = models.CharField(max_length=30, unique=True)
    last_used = models.CharField(max_length=30)
    def __unicode__(self):
        return self.part_description
class ServerModel(models.Model):
    server_model = models.CharField(max_length=30, unique=True)
    parts = models.ManyToManyField(Part)
    def __unicode__(self):
        return self.server_model
class Asset(models.Model):
    server_model = models.ForeignKey(ServerModel)
    serial_number = models.CharField(max_length=10, unique=True)
    def __unicode__(self):
        return self.server_model.server_model
EDIT:
Thank you for the help!
I may not have explained myself clearly, though. It's probably my confusing model names.
Example:
ServerModel stores the type of server being used, say "Dell Server 2000".
The "Dell Server 2000" should be assigned specific parts:
"RAM"
"HARD DISK"
"CDROM"
Then, I should be able to create 10x Assets with a FK to the ServerModel. Now, each of these assets should be able to mark when the "RAM" part was last used for this specific asset.

I'm not sure I understand exactly what you want to do, but basically you can solve that with a "through" model, as you expected:
import datetime
class Part(models.Model):
    name = models.CharField(max_length=30, unique=True)
class ServerModel(models.Model):
    server_model = models.CharField(max_length=30, unique=True)
    parts = models.ManyToManyField(Part, through='Asset')
class Asset(models.Model):
    server_model = models.ForeignKey(ServerModel)
    part = models.ForeignKey(Part)
    serial_number = models.CharField(max_length=10, unique=True)
    # Pass the callable, not its result, so the default is computed per instance.
    used = models.DateTimeField(default=datetime.datetime.now)
The first thing to notice is the relation of the parts to the ServerModel using the "through" model: for each Part instance assigned to the "parts" property of a ServerModel instance, a new Asset instance is created (Phew - hope that doesn't sound too complicated). At the time of creation, the "used" property of the Asset instance is set to the current date and time. That's what default=datetime.datetime.now does; note that it is passed without parentheses, because datetime.datetime.now() would be evaluated only once, when the class is defined.
If you do that, you can then query the database for the last asset containing your part. That queryset can be sorted by the "used" property of the Asset model, which is the date when the Asset instance was created.
ServerModel.objects.filter(parts__name='ThePartYouAreLookingFor').order_by('asset__used')
I'm not absolutely sure the queryset is correct, so if someone finds huge nonsense in it, feel free to edit ;)
edit:
The models above do not do exactly that. But you do not even need a through model for what you want:
class ServerModel(models.Model):
    server_model = models.CharField(max_length=30, unique=True)
    parts = models.ManyToManyField(Part)
class Asset(models.Model):
    server_model = models.ForeignKey(ServerModel)
    parts = models.ForeignKey(Part)
    serial_number = models.CharField(max_length=10, unique=True)
    used = models.DateTimeField(default=datetime.datetime.now)
Basically you can just add assets and then query all assets that have a RAM part. Because parts is a foreign key to Part, the lookup has to go through a field on Part (here the name field from the Part model above):
Asset.objects.filter(parts__name__contains='RAM').order_by('used')
Take the last result of that queryset (or the first, if you order by '-used' instead) and you have the date of the last usage of your 'RAM' part.
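A side note on the default= argument of the used field: passing datetime.datetime.now() (with parentheses) stores one fixed timestamp computed when the class body runs, while passing datetime.datetime.now (no parentheses) lets the default be computed each time it is needed. The sketch below imitates this with a hypothetical Field class; it is not Django's implementation, only an illustration of the difference.

```python
import datetime
import time


class Field:
    """Hypothetical, minimal imitation of how a field resolves its default."""

    def __init__(self, default):
        self.default = default

    def get_default(self):
        # Call the default if it is callable, otherwise return it unchanged.
        return self.default() if callable(self.default) else self.default


frozen = Field(default=datetime.datetime.now())  # evaluated once, right here
fresh = Field(default=datetime.datetime.now)     # evaluated on every get_default()

first_frozen, first_fresh = frozen.get_default(), fresh.get_default()
time.sleep(0.05)
second_frozen, second_fresh = frozen.get_default(), fresh.get_default()

print(first_frozen == second_frozen)  # the frozen default never changes
print(second_fresh > first_fresh)     # the callable default moves with the clock
```

In a real project, django.utils.timezone.now or auto_now_add=True on the DateTimeField are common alternatives for a creation timestamp.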

Django: construct a QuerySet inside a view?

I have models as follows:
class Place(models.Model):
    name = models.CharField(max_length=300)
class Person(models.Model):
    name = models.CharField(max_length=300)
class Manor(models.Model):
    place = models.ManyToManyField(Place, related_name="place")
    lord = models.ManyToManyField(Person, related_name="lord")
    overlord = models.ManyToManyField(Person, related_name="overlord")
I want to get all the Places attached with the relation 'lord' to a particular person, and then get the centre, using a GeoDjango method. This is as far as I've got:
person = get_object_or_404(Person, namesidx=namesidx)
manors = Manor.objects.filter(lord=person)
places = []
for manor in manors:
    place_queryset = manor.place.all()
    for place in place_queryset:
        places.append(place)
if places.collect():
    centre = places.collect().centroid
However, this gives me:
AttributeError at /name/208460/gamal-of-shottle/
'list' object has no attribute 'collect'
Can I either (a) do this in a more elegant way to get a QuerySet of places back directly, or (b) construct a QuerySet rather than a list in my view?
Thanks for your help!

The way you're doing this, places is a standard list, not a QuerySet, and collect is a method that only exists on GeoDjango QuerySets.
You should be able to do the whole query in one go by following the relations with the double-underscore syntax:
places = Place.objects.filter(manor__lord=person)
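Note that your use of related_name="place" on the Manor.place field is very confusing: related_name sets the reverse attribute from Place back to Manor, so it should be called manors. The lookup above assumes the default reverse name; with related_name="place" left as it is in the question, it would have to be spelled place__lord=person, and after renaming it to manors it becomes manors__lord=person.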
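To show what following the relations in one query amounts to, here is a rough sqlite3 sketch of the joins involved. The table names, column names, and sample rows are assumptions for the example, and this is not the SQL Django generates. It also shows why filtering across many-to-many relations can return the same place more than once, which .distinct() on the queryset would take care of.

```python
import sqlite3

# Hypothetical schema and data: two manors share the same lord and one place.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE manor_place (manor_id INTEGER, place_id INTEGER);
    CREATE TABLE manor_lord (manor_id INTEGER, person_id INTEGER);
    INSERT INTO place (id, name) VALUES (1, 'Shottle'), (2, 'Duffield'), (3, 'Elsewhere');
    INSERT INTO manor_place (manor_id, place_id) VALUES (1, 1), (2, 1), (2, 2), (3, 3);
    INSERT INTO manor_lord (manor_id, person_id) VALUES (1, 10), (2, 10), (3, 20);
""")

JOINS = """
    FROM place AS pl
    JOIN manor_place AS mp ON mp.place_id = pl.id
    JOIN manor_lord AS ml ON ml.manor_id = mp.manor_id
    WHERE ml.person_id = ?
    ORDER BY pl.id
"""

# Without DISTINCT, a place appears once per matching manor.
with_duplicates = conn.execute("SELECT pl.id, pl.name" + JOINS, (10,)).fetchall()

# With DISTINCT, each place appears once.
places = conn.execute("SELECT DISTINCT pl.id, pl.name" + JOINS, (10,)).fetchall()

print(with_duplicates)
print(places)
```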
