How to annotate sum over Django JSONField (Array of objects) data?


I have a model something like this:
# models.py
from django.db import models

class MyModel(models.Model):
    orders = models.JSONField(null=True, blank=True, default=list)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
I store JSON data in this structure:
[
    {
        "order_name": "first order",
        "price": 200
    },
    {
        "order_name": "second order",
        "price": 800
    },
    {
        "order_name": "third order",
        "price": 100
    }
]
I want to sum the price of all the JSON objects, i.e. 200 + 800 + 100.

One way is to use jsonb_array_elements (PostgreSQL) to expand each array element into its own row and then apply a normal aggregate function.
For example:
from django.db import models

Model.objects.annotate(
    # This will break the items into multiple rows.
    annotate_field_1=models.Func(models.F('array_field__items'), function='jsonb_array_elements'),
).aggregate(
    # Then aggregate over the rows; Count here returns the number of elements.
    total=models.Count('annotate_field_1'),
)['total']

I haven't worked with JSONArrayField, but I did a little research and found that the following example can give you a clue:
from django.db import models
from django.db.models import Sum
from django.db.models.fields.json import KeyTextTransform
from django.db.models.functions import Cast

MyModel.objects.annotate(
    order_price_sum=Sum(
        Cast(
            KeyTextTransform("price", "orders"), models.FloatField()
        )
    ),
)
I tried to adapt it to your specific question. You can find more helpful information at the following link: https://dev.to/saschalalala/aggregation-in-django-jsonfields-4kg5
Workaround:
I was trying to figure out how to handle a JSON array using annotate in Django, but it does not seem to be well documented, so I am sharing this workaround to achieve the goal:
total = 0
for i in MyModel.objects.exclude(orders__isnull=True).values('orders'):
    total += sum(j.get('price', 0) for j in i.get('orders') if j is not None)
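The loop above can also be wrapped in a small helper so the arithmetic can be checked without a database. This is only a sketch: sum_order_prices is a made-up name, and rows stands in for what .values('orders') would return (an iterable of dicts).

```python
def sum_order_prices(rows):
    """Sum the "price" of every order found in an iterable of row dicts.

    Each row is expected to look like {"orders": [{"price": 200}, ...]},
    which is the shape produced by .values('orders').
    """
    total = 0
    for row in rows:
        # "orders" may be None (the field is nullable) or an empty list.
        for order in row.get("orders") or []:
            if order is not None:
                total += order.get("price", 0)
    return total


# Sample data copied from the question, plus a row with no orders.
rows = [
    {"orders": [
        {"order_name": "first order", "price": 200},
        {"order_name": "second order", "price": 800},
        {"order_name": "third order", "price": 100},
    ]},
    {"orders": None},
]
print(sum_order_prices(rows))  # 1100
```

Note that this still pulls every row into Python, so it only makes sense while the table is small.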

Related

How to apply a function on the values selected in Django queryset?

Say I'm having the below model in Django
class Book(models.Model):
    id = models.AutoField(primary_key=True)
    volumes = JSONField()
I want to get the length of title of all the Books as values -
[
    {
        "id": 1,
        "volumes": [
            {
                "order": 1
            },
            {
                "order": 2
            }
        ],
        "length_of_volumes": 2
    },
]
I tried the following, but it's not the proper way to do it, as volumes is not a CharField:
from django.db.models.functions import Length
Books.objects.all().values('id', 'title', length_of_volumes=Length('volumes'))

len('title') just computes the length of the string 'title', which has five characters, so calling .values(…) with it is the same as calling .values(length_of_title=5).
You can make use of the Length expression [Django-doc]:
from django.db.models.functions import Length
Books.objects.values('id', 'title', length_of_title=Length('title'))
Note: normally a Django model is given a singular name, so Book instead of Books.

Similar to Willem's answer, but uses annotation. Taken from https://stackoverflow.com/a/34640020/14757226.
from django.db.models.functions import Length
qs = Books.objects.annotate(length_of_title=Length('title')).values('id', 'title', 'length_of_title')
An advantage is that you can then add filter or exclude clauses that query on the length of the title, for example if you only wanted results where the title is shorter than 10 characters.

You can use a dict structure:
values = []
for b in Books.objects.all().values('id', 'title'):
    # .values() yields dicts, so use key access rather than attribute access.
    values.append({
        'id': b['id'],
        'title': b['title'],
        'length_of_title': len(b['title'])
    })

django annotate with dynamic column name

I have a model in django app, with the following structure:
class items(models.Model):
    name = models.CharField(max_length=50)
    location = models.CharField(max_length=3)
I wanted to create a pivot table with the count of each location for each name/item, which I managed to do as follows:
queryset_res = items.objects.values('name')\
    .annotate(NYC=Sum(Case(When(location='NYC', then=1), default=Value(0), output_field=IntegerField())))\
    .annotate(LND=Sum(Case(When(location='LND', then=1), default=Value(0), output_field=IntegerField())))\
    .annotate(ASM=Sum(Case(When(location='ASM', then=1), default=Value(0), output_field=IntegerField())))\
    .annotate(Total=Count('location'))\
    .values('name', 'NYC', 'LND', 'Total')\
    .order_by('-Total')
This gives me how many times each name appears against each location, which is all OK.
My question is: how can I make the locations dynamic, either from a list or from the model data itself, so that if new locations are added I don't have to come back and change the code again?
Many Thanks
AB

You can pass arguments dynamically in Python by unpacking them: *[1, 2, 3] for positional arguments and **{'key': 'value'} for keyword arguments.
from django.db.models import Case, Count, Sum, IntegerField, Value, When

def get_annotation(key):
    # Build {column_name: expression} so it can be unpacked into annotate().
    return {
        key: Sum(
            Case(
                When(location=key, then=Value(1)),
                default=Value(0),
                output_field=IntegerField(),
            ),
        ),
    }

queryset_res = items.objects.values('name')
location_list = ['NYC', 'LSA', 'ASM']  # ...etc
for key in location_list:
    queryset_res = queryset_res.annotate(**get_annotation(key))
queryset_res = (
    queryset_res.annotate(Total=Count("location"))
    .values("name", "Total", *location_list)
    .order_by("-Total")
)
Now you can implement a set of queries simply by changing location_list.
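To make the expected result concrete without touching the database, here is a plain-Python sketch of the same pivot computed in memory. The pivot function and the sample rows are hypothetical stand-ins, not Django APIs; location_list plays the same role as above and decides which columns exist.

```python
from collections import defaultdict


def pivot(rows, location_list):
    """Count rows per name, with one column per entry in location_list.

    rows is an iterable of (name, location) tuples.
    """
    def empty_counts():
        # ** unpacking again, here inside a dict literal: the generated
        # per-location counters are merged into the new dict.
        return {"Total": 0, **{loc: 0 for loc in location_list}}

    table = defaultdict(empty_counts)
    for name, location in rows:
        table[name]["Total"] += 1
        if location in location_list:
            table[name][location] += 1
    # Highest total first, like .order_by("-Total").
    return sorted(
        ({"name": name, **counts} for name, counts in table.items()),
        key=lambda row: row["Total"],
        reverse=True,
    )


sample_rows = [("pen", "NYC"), ("pen", "NYC"), ("pen", "LSA"), ("ink", "ASM")]
for row in pivot(sample_rows, ["NYC", "LSA", "ASM"]):
    print(row)
```

Adding a new location only means adding one more entry to the list passed in; the counting code stays unchanged, which is the same property the queryset version gets from looping over location_list.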

Django queryset grouped by count of values in Postgres JSONField

My model:
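from django.contrib.postgres.fields import JSONField
from django.db import models

class Image(models.Model):
    # A callable default avoids sharing one mutable object between instances.
    tags = JSONField(null=False, blank=True, default=list)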
tags field value can be empty, or something like:
[
{"tag": "xxx"},
{"tag": "yyy"},
{"tag": "zzz"}
]
The number of dicts may vary (from 0 to N).
I need to make a query that counts Images grouped by number of tags. Something like:
{
"0": "345",
"1": "1223",
"2": "220",
...
"N": "23"
}
where the key is the number of tags, and the value is the count of Image objects that contains this number of tags.
How can I do that? Thank you for your help!
UPDATE
I modified my code: now I don't use JSONField, but a dedicated model:
class ImageTag(models.Model):
    image = models.ForeignKey(Image, on_delete=models.CASCADE)
    tag = models.CharField(max_length=255)  # max_length is an assumed value; the original omitted it
The question is the same :)
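To pin down the expected output, here is an in-memory sketch of the grouping. It is not a database-level query: count_by_number_of_tags is a hypothetical helper, and the sample list simply stands in for the tags value of each Image.

```python
from collections import Counter


def count_by_number_of_tags(images_tags):
    """Map 'number of tags' to 'how many images have that many tags'."""
    # An empty value ([], {} or None) counts as zero tags.
    return Counter(len(tags or []) for tags in images_tags)


# One entry per image; each entry is that image's tags value.
sample = [
    [{"tag": "xxx"}, {"tag": "yyy"}, {"tag": "zzz"}],
    [{"tag": "xxx"}],
    [],
    [{"tag": "yyy"}],
    {},
]
print(dict(count_by_number_of_tags(sample)))
```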

Serializers in django rest framework with dynamic fields

I am trying to build a small API with Django REST framework, but I don't want to map the tables directly to API calls (as in the examples).
I have the following database schema:
In models.py:
class ProductType(models.Model):
    name = models.CharField(max_length=255, blank=False, null=False, unique=True)

class Product(models.Model):
    @staticmethod
    def get_accepted_fields():
        return {'color': 'pink', 'size': 34, 'speed': 0, 'another_prop': ''}

    name = models.CharField(max_length=255, blank=False, null=False, unique=True)

class ProductConfig(models.Model):
    product_type = models.ForeignKey(ProductType, on_delete=models.CASCADE)
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
    # a json field with all kind of fields: eg: {"price": 123, "color": "red"}
    value = models.TextField(blank=True)
As you can see, every product can have multiple configurations, and the value field is JSON with different parameters. The JSON will be one level deep only. Each configuration will have a flag saying whether it is active or not (so one product will have only one active configuration).
So the data will look, for example, like this:
store_producttype
=================
1   type1
2   type2

store_product
=============
id  name
1   car

store_productconfig
===================
id  product_type_id  product_id  value                                                                active
1   2                1           { "color": "red", "size": 34, "speed": 342}                          0
2   1                1           { "color": "blue", "size": 36, "speed": 123, "another_prop": "xxx"}  1
What I want to know is how I can get /product/1/ to return something like this:
{
    "id": 1,
    "name": "car",
    "type": "type1",
    "color": "blue",
    "size": 36,
    "speed": 123,
    "another_prop": "xxx"
}
and how to create a new product by posting JSON similar to the one above.
The JSON fields are predefined, but some of them may be missing (e.g. "another_prop" in productconfig.id=1).
An update, in any case, will create a new row in productconfig and mark the previous one as inactive (active=0).
So every product can have different configurations, and in some specific cases I want to go back to a particular configuration from the past. I am not really bound to this data model, so if you have suggestions for improvement I am open to them, but I don't want to have those properties as columns in the table.
The question is: what is the best way to write the serializers for this model? Is there a good example somewhere for such a use case?
Thank you.

Let's take this step by step:
In order to get JSON like the one you posted, you must first transform your string (the ProductConfig value field) into a dictionary. This can be done with ast.literal_eval.
Then, in your product serializer, you must specify the source for each field, like this:
from rest_framework import serializers

class ProductSerializer(serializers.ModelSerializer):
    # value_dict is assumed to be an attribute on Product exposing the parsed config.
    # ReadOnlyField is the DRF 3 name; older releases used serializers.Field.
    color = serializers.ReadOnlyField(source='value_dict.color')
    size = serializers.ReadOnlyField(source='value_dict.size')
    type = serializers.ReadOnlyField(source='type.name')

    class Meta:
        model = Product
        fields = (
            'id',
            'color',
            'size',
            'type',
        )
This should work just fine for creating the representation that you want. However, it will not automatically create the product config, because DRF doesn't yet allow nested object creation.
This leads us to the next step:
To create a product with a configuration from JSON, you must override the post method in your view and create it yourself. This part shouldn't be too hard, but if you need an example, just ask.
This is more of a suggestion: if the JSON fields are already defined, wouldn't it be easier to define them as separate fields in your ProductConfig model?
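As a rough sketch of the first step (turning the stored string into a dictionary and flattening it into the response), something like the following could work. Since the value column holds JSON, the standard json module is used here instead of ast.literal_eval; that swap is my assumption, and it relies on the stored text always being valid JSON. flatten_product is a hypothetical helper, not part of DRF.

```python
import json


def flatten_product(product_id, name, type_name, config_value):
    """Merge a product's own columns with its JSON configuration string."""
    # json.loads understands true/false/null, which ast.literal_eval does not.
    config = json.loads(config_value) if config_value else {}
    return {"id": product_id, "name": name, "type": type_name, **config}


# Values taken from the active configuration row in the question.
row = flatten_product(
    1, "car", "type1",
    '{ "color": "blue", "size": 36, "speed": 123, "another_prop": "xxx"}',
)
print(row)
```

A serializer could then return this dictionary from its representation step, with missing keys such as another_prop simply absent from the output.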

Django query - Is it possible to group elements by common field at database level?

I have a database model as shown below. Consider the data to be 2 different books, each having 3 ratings.
class Book(models.Model):
    name = models.CharField(max_length=50)

class Review(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE)
    review = models.CharField(max_length=1000)
    rating = models.IntegerField()
Question: Is it possible to group all the ratings in a list for each book with a single query? I'm looking to do this at the database level, without iterating over the QuerySet in my code. The output should look something like:
[
    {
        'book__name': 'book1',
        'rating': [3, 4, 4],
        'average': 3.66
    },
    {
        'book__name': 'book2',
        'rating': [2, 1, 1],
        'average': 1.33
    }
]
I've tried this query, but neither are the ratings grouped by book name, nor is the average correct :
Review.objects.annotate(average=Avg('rating')).values('book__name','rating','average')
Edit : Added clarification that I'm looking for a method to group the elements at database level.

You can do this. Hope this helps.
from django.db.models import Avg

Review.objects.values('book__name').annotate(average=Avg('rating'))
UPDATE:
If you want all the ratings of a particular book in a list, then you can do this:
from collections import defaultdict

ratings = defaultdict(list)
for result in Review.objects.values('book__name', 'rating').order_by('book__name', 'rating'):
    ratings[result['book__name']].append(result['rating'])
You will get a structure like this:
{ book__name: [rating1, rating2, ...], ... }
UPDATE:
q = Review.objects.values('book__name').annotate(average=Avg('rating')).filter().prefetch_related('rating')
q[0].ratings.all()  # gives all the ratings of a particular book name
q[0].average  # gives the average of all the ratings of a particular book name
Hope this works (I'm not sure, sorry), but you need to add the related_name attribute:
class Review(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE, related_name='rating')
UPDATE:
Sorry to say, but what you need is something like GROUP_CONCAT in SQL, which the Django ORM does not currently support out of the box. (On PostgreSQL, django.contrib.postgres.aggregates.ArrayAgg covers a similar need.)
You can use raw SQL or itertools:
from django.db import connection

sql = """
    SELECT name, AVG(rating) AS average, GROUP_CONCAT(rating) AS rating
    FROM book JOIN review ON book.id = review.book_id
    GROUP BY name
"""
cursor = connection.cursor()
cursor.execute(sql)
data = cursor.fetchall()
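For anyone who wants to try the GROUP_CONCAT query without a Django project, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows are my assumption, modelled on the question (two books, three ratings each); real Django tables would normally carry an app prefix in their names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE review (id INTEGER PRIMARY KEY, book_id INTEGER, rating INTEGER);
    INSERT INTO book (id, name) VALUES (1, 'book1'), (2, 'book2');
    INSERT INTO review (book_id, rating) VALUES
        (1, 3), (1, 4), (1, 4), (2, 2), (2, 1), (2, 1);
""")

sql = """
    SELECT name, AVG(rating) AS average, GROUP_CONCAT(rating) AS rating
    FROM book JOIN review ON book.id = review.book_id
    GROUP BY name
    ORDER BY name
"""
result = {}
for name, average, ratings in conn.execute(sql):
    # GROUP_CONCAT returns one comma-separated string; split it back into ints.
    # SQLite does not guarantee the order of the concatenated values, so sort them.
    result[name] = {
        "rating": sorted(int(r) for r in ratings.split(",")),
        "average": round(average, 2),
    }
conn.close()
print(result)
```

The grouping and averaging happen inside the database; Python only splits the concatenated string back into a list.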
