Django aggregation: Group and sum over a date range

I am fetching some user stats daily and recording them in a model as follows (irrelevant parts are stripped off for simplicity):
class User(models.Model):
    group = models.CharField(
        choices=(('A', 'Group A'), ('B', 'Group B')),
    )
class Stats(models.Model):
    day = models.DateField()
    user = models.ForeignKey(User)
    follower_count = models.PositiveIntegerField()
As seen above, each user belongs to a group.
How can I get the sum of each user's follower_count for each group over a date range?
In other words, what is the best way to build a data structure like the following from these models? Is it possible to do it with a single aggregate query?
[{
    'date': '2015-07-15',
    'Group A': 26,  # sum of followers of users in Group A on 2015-07-15
    'Group B': 15,
}, {
    'date': '2015-07-16',
    'Group A': 30,
    'Group B': 18,
}, {
    'date': '2015-07-17',
    'Group A': 32,
    'Group B': 25,
}]
Thank you.

You should be able to get the desired aggregate query by using this block of code.
from django.db.models import Case, Sum, When
Stats.objects.values('day').annotate(
    # Each Sum only counts rows whose user belongs to the given group.
    group_a=Sum(Case(When(user__group='A', then='follower_count'))),
    group_b=Sum(Case(When(user__group='B', then='follower_count'))),
)
Basically, it tells the Django ORM to sum follower_count separately for groups A and B, with the column aliases "group_a" and "group_b" respectively. The aggregation is performed with a GROUP BY on the 'day' field.
The resulting queryset will give you the details you want. The rest will be just formatting. You may use the basic JSON serializer Django provides to get the format you want, but if it is for a Web API, you might want to take a look at Django REST Framework, particularly the serializers.
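The formatting step could look roughly like the sketch below. It is plain Python and works on a list of dicts shaped like the queryset's rows ('day', 'group_a', 'group_b'). The names GROUP_LABELS and format_rows are made up for illustration, and the second sample row is invented to show the None case. To restrict the query to a date range, a filter such as .filter(day__range=(start, end)) before .values('day') would presumably be needed as well.

```python
import datetime

# Hypothetical mapping from annotation alias to the label used in the question.
GROUP_LABELS = {"group_a": "Group A", "group_b": "Group B"}


def format_rows(rows):
    """Reshape rows like {'day': date, 'group_a': int, 'group_b': int}."""
    formatted = []
    for row in rows:
        entry = {"date": row["day"].isoformat()}
        for alias, label in GROUP_LABELS.items():
            # Sum() gives None when no row matched the group; report 0 instead.
            entry[label] = row[alias] or 0
        formatted.append(entry)
    return formatted


# Stand-in for list(queryset). The first row uses the question's numbers,
# the second is invented to show how a missing group is handled.
sample_rows = [
    {"day": datetime.date(2015, 7, 15), "group_a": 26, "group_b": 15},
    {"day": datetime.date(2015, 7, 16), "group_a": 30, "group_b": None},
]
result = format_rows(sample_rows)
```

Keeping the label mapping in one dict means a third group only needs one more annotation and one more entry here.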

Related

Convert raw sql query to django orm

I wrote this query in PostgreSQL and I'm unsure how to convert it to the Django ORM:
SELECT count(*),
       concat(date_part('month', issue_date), '/', date_part('year', issue_date)) AS date
FROM affiliates_issuelog
WHERE tenant_id = '{tenant_id}'
GROUP BY date_part('month', issue_date),
         date_part('year', issue_date)
ORDER BY date_part('year', issue_date) DESC,
         date_part('month', issue_date) DESC
I have this model, which records the insertion of new affiliates by date and by institution (tenant). I need the query to return the total number of records inserted per month of each year. Until now I have been building my pages with ListView, but I don't know how to aggregate this data using the ORM.
class IssueLog():
    tenant = models.ForeignKey("tenants.Federation", on_delete=models.CASCADE)
    issue_date = models.DateField(
        default=date.today, verbose_name=_("date of issue")
    )
    class Meta:
        verbose_name = _("Entrada de emissão")
        verbose_name_plural = _("Entradas de emissão")
    def __str__(self):
        return f"{self.institution}, {self.tenant}"
The pages that return a list of data are built like the example below. Is it possible to return the data I want through get_queryset()? I have already solved my problem with the raw query, but the project uses only the ORM, so I would like to keep that pattern for the sake of the team. Example:
class AffiliateExpiredListView(HasRoleMixin, AffiliateFilterMxin, ListView):
    allowed_roles = "federation"
    model = Affiliate
    ordering = "-created_at"
    template_name_suffix = "issued_list_expired"
    paginate_by = 20
    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        context["renew_form"] = AffiliateRenewForm()
        tenant_t = self.request.user.tenant
        context["cancel_presets"] = tenant_t.cancelationreason_set.all()
        return context
    def get_queryset(self):
        return super().get_queryset().filter(is_expired=True).order_by('full_name')

You can query with:
from django.db.models import Count
from django.db.models.functions import ExtractMonth, ExtractYear
IssueLog.objects.values(
    year=ExtractYear('issue_date'),
    month=ExtractMonth('issue_date')
).annotate(
    total=Count('pk')
).order_by('-year', '-month')
This will make a queryset with dictionaries that look like:
<QuerySet [
    {'year': 2022, 'month': 2, 'total': 14},
    {'year': 2022, 'month': 1, 'total': 25},
    {'year': 2021, 'month': 12, 'total': 13}
]>
I would not do the string formatting in the database query, but rather in the template or elsewhere in Python.
Note, however, that the model cannot be declared with abstract = True: an abstract model has no table and is only used for inheritance, to implement logic and reuse it somewhere else.
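As a rough illustration of doing the month/year formatting in Python rather than in SQL, the sketch below builds the same "month/year" string that the original concat() produced. The helper name label_month and the output keys "date" and "count" are made up for illustration. The rows are the example rows shown above.

```python
def label_month(row):
    """Return 'month/year' for a row like {'year': 2022, 'month': 2, ...}."""
    return f"{row['month']}/{row['year']}"


# Stand-in for list(queryset), using the example rows from the answer.
rows = [
    {"year": 2022, "month": 2, "total": 14},
    {"year": 2022, "month": 1, "total": 25},
    {"year": 2021, "month": 12, "total": 13},
]

# Mirror the two columns of the raw SQL query: the count and the date label.
labelled = [{"date": label_month(r), "count": r["total"]} for r in rows]
```

In a ListView, a transformation like this could run in get_context_data() or be left to the template, depending on where the team prefers to keep presentation logic.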

Django fetch manytomany values as list

class Groups:
    name = models.CharField(max_length=100)
class User:
    firstname = models.CharField(max_length=100)
    lastname = models.CharField(max_length=100)
    group = models.ManyToManyField(Groups)
I need the user details with the group names as a list, not as separate records.
User.objects.values('firstname', 'lastname', 'group__name')
When I query like above, I get this:
<QuerySet [{'firstname': 'Market', 'lastname': 'Cloud', 'group__name': 'Group 5'},
           {'firstname': 'Market', 'lastname': 'Cloud', 'group__name': 'Group 4'}]>
but I want this:
<QuerySet [{'firstname': 'Market', 'lastname': 'Cloud', 'group__name': ['Group 5', 'Group 4']}]>
Is there a way to query like this without doing a separate query?

If you're using PostgreSQL, you can use the ARRAY_AGG function. In the Django ORM it looks like this:
from django.contrib.postgres.aggregates import ArrayAgg
User.objects \
    .annotate(list_of_group_names=ArrayAgg('group__name')) \
    .order_by('id').distinct() \
    .values('firstname', 'lastname', 'list_of_group_names')
Note: distinct is useful because joining tables can result in duplicates.
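On a database without ARRAY_AGG, one possible fallback (not part of the answer above) is to keep the single flat values() query and merge the rows in Python. The sketch below assumes rows shaped like the question's output. The function name collect_groups is made up, and keying on the name pair is a simplification: two different users with the same name would be merged, so including 'id' in values() and keying on it would be safer.

```python
def collect_groups(rows):
    """Merge flat rows into one dict per user with a list of group names."""
    merged = {}
    for row in rows:
        # Simplifying assumption: the name pair identifies a user.
        key = (row["firstname"], row["lastname"])
        entry = merged.setdefault(
            key,
            {"firstname": row["firstname"], "lastname": row["lastname"], "group__name": []},
        )
        entry["group__name"].append(row["group__name"])
    return list(merged.values())


# Stand-in for list(User.objects.values('firstname', 'lastname', 'group__name')).
flat_rows = [
    {"firstname": "Market", "lastname": "Cloud", "group__name": "Group 5"},
    {"firstname": "Market", "lastname": "Cloud", "group__name": "Group 4"},
]
users = collect_groups(flat_rows)
```

This still issues only one query; the grouping cost moves from the database to the application.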

Is there a way to get the columns from a joined table in the model instance dict object?

t = PurchaseHeader.objects.first()
t.__dict__
{
    '_state': <django.db.models.base.ModelState object at 0x7f4b34aa7fa0>,
    'id': 3,
    'ref': 'jhkh',
    'goods': Decimal('-100.00'),
    'discount': Decimal('0.00'),
    'vat': Decimal('-20.00'),
    'total': Decimal('-120.00'),
    'paid': Decimal('-120.00'),
    'due': Decimal('0.00'),
    'date': datetime.date(2020, 11, 7),
    'due_date': datetime.date(2020, 11, 14),
    'period': '202007',
    'status': 'c',
    'created': datetime.datetime(2020, 11, 7, 15, 46, 48, 191772, tzinfo=<UTC>),
    'cash_book_id': None,
    'supplier_id': 1128,
    'type': 'pc'
}
When I joined the supplier table, I was disappointed to find that its columns are not included in the dict: below, t.__dict__ is the same as above. I noticed that the Supplier model instance is cached inside t._state, so I guess I could write my own method, inherited by all models, that does what I want: put all the columns from all tables inside one dict. But I wondered whether anybody knows a way of doing this sort of thing out of the box?
t = PurchaseHeader.objects.select_related("supplier").first()
t.__dict__

select_related's goal is to fetch the related data up front, using a join in the original query, so that no second query is needed when you access "supplier". It does not add the related columns to the instance's __dict__.
If you want to obtain a dict based on your model that also contains the data of a relation, your best bet is Django REST Framework's ModelSerializer with a nested serializer. Assuming that your supplier model is called Supplier, it would look something like this:
from rest_framework import serializers
class SupplierSerializer(serializers.ModelSerializer):
    class Meta:
        model = Supplier
        fields = ['name', 'other_field']  # Add more Supplier fields
class PurchaseHeaderSerializer(serializers.ModelSerializer):
    # Nested serializer: the supplier is rendered as a dict of its own fields.
    supplier = SupplierSerializer(read_only=True)
    class Meta:
        model = PurchaseHeader
        # A declared field such as 'supplier' must be listed here.
        fields = ['supplier', 'vat', 'total']  # Add more PurchaseHeader fields
You can then use the PurchaseHeaderSerializer like this:
purchase_header = PurchaseHeader.objects.select_related("supplier").first()
the_dict_you_want = PurchaseHeaderSerializer(instance=purchase_header).data

Serializers in django rest framework with dynamic fields

I am trying to build a small api with django rest framework but I don't want to map directly the tables with calls (as in the examples).
I have the following database schema:
In models.py:
class ProductType(models.Model):
    name = models.CharField(max_length=255, blank=False, null=False, unique=True)
class Product(models.Model):
    @staticmethod
    def get_accepted_fields(self):
        return {'color': 'pink', 'size': 34, 'speed': 0, 'another_prop': ''}
    name = models.CharField(max_length=255, blank=False, null=False, unique=True)
class ProductConfig(models.Model):
    product_type = models.ForeignKey(ProductType)
    product = models.ForeignKey(Product)
    # a json field with all kind of fields: eg: {"price": 123, "color": "red"}
    value = models.TextField(blank=True)
As you can see, every product can have multiple configurations, and the value field is a JSON string with different parameters. The JSON will be one level deep only. Each configuration will have a flag saying whether it is active (so one product will have only one active configuration).
So the data will look, for example, like this:
store_producttype
=================
id  name
1   type1
2   type2
store_product
=============
id  name
1   car
store_productconfig
===================
id  product_type_id  product_id  value                                                                 active
1   2                1           { "color": "red", "size": 34, "speed": 342}                           0
2   1                1           { "color": "blue", "size": 36, "speed": 123, "another_prop": "xxx"}   1
What I want to know is how can I get /product/1/ like this:
{
    "id": 1,
    "name": "car",
    "type": "type1",
    "color": "blue",
    "size": 36,
    "speed": 123,
    "another_prop": "xxx"
}
and how I can create a new product by posting JSON similar to the one above.
The JSON fields are defined, but some of them may be missing (e.g. "another_prop" in productconfig.id=1).
An update will always create a new row in productconfig and mark the previous one as inactive (active=0).
So every product can have different configurations, and I want to be able to go back to a specific configuration in some specific cases. I am not really bound to this data model, so if you have suggestions for improvement I am open to them, but I don't want to have those properties as columns in the table.
The question is: what is the best way to write the serializers for this model? Is there a good example somewhere for such a use case?
Thank you.

Let's take this step by step:
In order to get JSON like the one you posted, you must first transform your string (the ProductConfig value field) into a dictionary. This can be done by using ast.literal_eval.
Then, in your product serializer, you must specify the source for each field, like this:
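from rest_framework import serializers
class ProductSerializer(serializers.ModelSerializer):
    # serializers.Field is the read-only field of DRF 2.x;
    # in DRF 3 the equivalent class is serializers.ReadOnlyField.
    color = serializers.Field(source='value_dict.color')
    size = serializers.Field(source='value_dict.size')
    type = serializers.Field(source='type.name')
    class Meta:
        model = Product
        fields = (
            'id',
            'color',
            'size',
            'type',
        )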
This should work just fine for creating the representation that you want. However, it will not automatically create the product config, because DRF doesn't yet allow nested object creation.
This leads us to the next step:
For creating a product with a configuration from JSON, you must override the post method in your view, and create it yourself. This part shouldn't be so hard, but if you need an example, just ask.
This is more of a suggestion: if the json fields are already defined, wouldn't it be easier to define them as separate fields in your productConfig model?
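The first step above (turning the stored string into a dictionary and merging it into the response) might be sketched as follows. This sketch uses json.loads instead of ast.literal_eval, on the assumption that the value column really holds JSON: literal_eval parses Python literals and rejects JSON's true, false, and null. The function name flatten_product and its plain-dict arguments are made up for illustration and stand in for the model instances.

```python
import json


def flatten_product(product, type_name, config_value):
    """Build the flat response dict from a product and its active config.

    product      -- dict with 'id' and 'name' (stand-in for a Product row)
    type_name    -- name of the related ProductType
    config_value -- JSON string stored in ProductConfig.value (may be empty)
    """
    data = {"id": product["id"], "name": product["name"], "type": type_name}
    # The value field is declared blank=True, so tolerate an empty string.
    if config_value:
        data.update(json.loads(config_value))
    return data


# Values taken from the question's active configuration (productconfig.id=2).
flat = flatten_product(
    {"id": 1, "name": "car"},
    "type1",
    '{ "color": "blue", "size": 36, "speed": 123, "another_prop": "xxx"}',
)
```

Because the config keys are merged last, a key named "id" or "name" inside the JSON would overwrite the product's own value; filtering against the accepted fields first would avoid that.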

Annotate in SQLite

I have the following model:
class Model(...):
    date = DateField()
    user = ForeignKey()
    data = ForeignKey()
    time = IntegerField()
I'd like to sum the time field for every user for a single data value, so I do:
Model.objects.filter(date=..., data=...).values('user_id').annotate(time=Sum('time'))
but I receive a result that looks like:
[{'user_id': 1, 'time': 20}, {'user_id': 1, 'time': 10}, {'user_id': 2, 'time': 20}]
So the grouping does not work. I checked the query generated by Django, and I don't know why Django uses date and data for grouping as well, not only user. Am I doing something wrong, or is this only an SQLite issue?

You should append .order_by() to your query set to clear default model ordering.
For your code:
(Model.objects.filter(date=…, data=…)
    .values('user_id')
    .annotate(time=Sum('time'))
    .order_by())  # <---- Here
This is fully explained in the warning in the documentation on default ordering:
"Except that it won't quite work. The default ordering by name will
also play a part in the grouping ... you should ... clearing any ordering in the query."
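The effect can be reproduced outside Django with plain SQL. The sketch below uses Python's built-in sqlite3 module and a made-up table (log) whose ordering_col column stands in for whatever field the model's default ordering pulls into the GROUP BY. It does not show the exact SQL Django generates, only why an extra grouping column splits one user into several rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: ordering_col plays the role of the default-ordering field.
conn.execute("CREATE TABLE log (user_id INTEGER, ordering_col TEXT, time INTEGER)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [(1, "a", 20), (1, "b", 10), (2, "a", 20)],
)

# Grouping by user only: one row per user, which is what was wanted.
by_user = conn.execute(
    "SELECT user_id, SUM(time) FROM log GROUP BY user_id ORDER BY user_id"
).fetchall()

# Grouping by user and the ordering column: user 1 is split into two rows,
# matching the unexpected output in the question.
by_user_and_ordering = conn.execute(
    "SELECT user_id, SUM(time) FROM log "
    "GROUP BY user_id, ordering_col ORDER BY user_id, ordering_col"
).fetchall()
conn.close()
```

Calling .order_by() with no arguments removes the ordering column from the query, which corresponds to the first statement here.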
