Django GROUP BY without aggregate

I would like to write the following PostgreSQL query using the Django ORM:
SELECT t.id, t.field1 FROM mytable t JOIN ... JOIN ... WHERE .... GROUP BY id
Note that there is NO aggregate function (like SUM, COUNT etc.) in the SELECT part.
By the way, it's perfectly legal SQL in PostgreSQL to GROUP BY the primary key only.
How do I write it in the Django ORM?
I saw workarounds like adding .annotate(dummy=Count('*')) to the queryset (but it slows down execution) or introducing a dummy custom aggregate function (but that's a dirty hack). How can I do it in a clean way?
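To make the intent of the query concrete, here is a minimal sketch at the SQL level using Python's built-in sqlite3 module, not the ORM. It assumes the point of GROUP BY id is to collapse the duplicate rows that a join produces; the child table and the sample rows are hypothetical. Under that assumption SELECT DISTINCT returns the same rows, which suggests QuerySet.distinct() may be worth trying on the ORM side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, field1 TEXT);
    -- Hypothetical joined table: several children may point to one parent.
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES mytable(id)
    );
    INSERT INTO mytable VALUES (1, 'x'), (2, 'y');
    INSERT INTO child VALUES (1, 1), (2, 1), (3, 2);
""")

base = "SELECT {distinct} t.id, t.field1 FROM mytable t JOIN child c ON c.parent_id = t.id"

# Plain join: the parent row with two children appears twice.
joined = conn.execute(base.format(distinct="") + " ORDER BY t.id").fetchall()

# GROUP BY the primary key, no aggregate: one row per parent.
grouped = conn.execute(
    base.format(distinct="") + " GROUP BY t.id ORDER BY t.id"
).fetchall()

# SELECT DISTINCT over the same columns: also one row per parent.
distinct = conn.execute(
    base.format(distinct="DISTINCT") + " ORDER BY t.id"
).fetchall()

print(joined)
print(grouped)
print(distinct)
```

Whether DISTINCT is an acceptable substitute depends on the real query, for example on which columns are selected and how the planner handles each form.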

Related

Annotating a QuerySet with joined tables in Django

How can I annotate a Django QuerySet with data from a custom join expression, without using raw SQL?
I'd like to translate the following query for the Django ORM, without resorting to raw SQL:
SELECT a.*, b.name as b_name
FROM a
JOIN b ON ST_Within(ST_Centroid(a.geom), b.geom)
As far as I could tell, the best candidate for doing something like this is the annotate(...) function, but the documentation didn't have anything on how to add a joined table to the annotated QuerySet.
My other idea was to use something similar to ManyToManyField (maybe subclass it) that can use custom ON ... expressions for its joined model.
Any other ideas?

Hourly grouping of rows using Django

I have been trying to group the rows of a table by hour using a DateTimeField.
SQL:
SELECT strftime('%H', created_on), count(*)
FROM users_test
GROUP BY strftime('%H', created_on);
This query works fine, but the corresponding Django query does not.
Django queries I've tried:
Test.objects.extra({'hour': 'strftime("%%H", created_on)'}).values('hour').annotate(count=Count('id'))
# SELECT (strftime("%H", created_on)) AS "hour", COUNT("users_test"."id") AS "count" FROM "users_test" GROUP BY (strftime("%H", created_on)), "users_test"."created_on" ORDER BY "users_test"."created_on" DESC
It adds an additional GROUP BY on "users_test"."created_on", which I guess is what gives incorrect results.
It would be great if anyone could explain this to me and provide a solution as well.
Environment:
Python 3
Django 1.8.1
Thanks in Advance
References (Possible Duplicates) (But None helping out):
Grouping Django model entries by day using its datetime field
Django - Group By with Date part alone
Django aggregate on .extra values

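To fix it, append order_by() to the query chain. This overrides the default ordering from the model's Meta. Like this:
from django.db.models import Count

# The outer parentheses let the chained calls span several lines.
(Test.objects
    .extra({'hour': 'strftime("%%H", created_on)'})
    .order_by()  # <------ here: clears the default ordering
    .values('hour')
    .annotate(count=Count('id')))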
In my environment (also Postgres):
>>> print(Material
...     .objects
...     .extra({'hour': 'strftime("%%H", data_creacio)'})
...     .order_by()
...     .values('hour')
...     .annotate(count=Count('id'))
...     .query)
SELECT (strftime("%H", data_creacio)) AS "hour",
       COUNT("material_material"."id") AS "count"
FROM "material_material"
GROUP BY (strftime("%H", data_creacio))
Learn more in the Django docs for order_by:
If you don’t want any ordering to be applied to a query, not even the default ordering, call order_by() with no parameters.
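To see why the stray GROUP BY column matters, here is a minimal sketch that runs both forms of the query with Python's built-in sqlite3 module instead of the ORM. The table name users_test and the strftime expression come from the question; the sample timestamps are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_test (id INTEGER PRIMARY KEY, created_on TEXT)")
conn.executemany(
    "INSERT INTO users_test (created_on) VALUES (?)",
    [
        ("2015-05-01 10:05:00",),
        ("2015-05-01 10:40:00",),
        ("2015-05-02 10:15:00",),
        ("2015-05-02 11:00:00",),
    ],
)

# Grouping by the hour only: one row per hour of the day.
by_hour = conn.execute(
    "SELECT strftime('%H', created_on) AS hour, COUNT(id) "
    "FROM users_test "
    "GROUP BY strftime('%H', created_on) "
    "ORDER BY 1"
).fetchall()

# Grouping by the hour AND the raw timestamp, which is what the default
# ordering adds: every distinct timestamp becomes its own group.
by_hour_and_timestamp = conn.execute(
    "SELECT strftime('%H', created_on) AS hour, COUNT(id) "
    "FROM users_test "
    "GROUP BY strftime('%H', created_on), created_on "
    "ORDER BY created_on"
).fetchall()

print(by_hour)
print(by_hour_and_timestamp)
```

With these rows the first query yields one count per hour, while the second yields one row per timestamp, each with a count of 1.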
Side note:
Using extra() may introduce an SQL injection vulnerability into your code. Use it with caution and escape any parameters that a user can supply. Compare with the docs:
Warning
You should be very careful whenever you use extra(). Every time you
use it, you should escape any parameters that the user can control by
using params in order to protect against SQL injection attacks.
Please read more about SQL injection protection.

Executing Anti Join in Django ORM

I have two models:
from django.db import models

class Note(models.Model):
    pass  # <attribs>

class Permalink(models.Model):
    note = models.ForeignKey(Note, on_delete=models.CASCADE)
I want to execute a query: get all notes which don't have a permalink.
In SQL, I would do it as something like:
SELECT * FROM Note WHERE id NOT IN (SELECT note FROM Permalink);
I'm wondering how to do this in the ORM.
Edit: I don't want to pull all the permalinks into my application. I would prefer it to run as a query inside the DB.

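You should be able to use this query (in filters the reverse relation is named after the lowercased model, permalink; permalink_set is only the attribute name on a Note instance):
Note.objects.filter(permalink__isnull=True)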

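You can also use the following (the subquery must select the foreign key column note_id, not the permalink's own id):
Note.objects.exclude(id__in=Permalink.objects.all().values_list('note_id', flat=True))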
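The two anti-join forms can be compared at the SQL level with a minimal sketch using Python's built-in sqlite3 module. The table and column names (note, permalink, note_id) are assumptions modelled on the question, and the sample rows are made up. Roughly, a filter on permalink__isnull=True corresponds to the LEFT JOIN form and an exclude with id__in to the NOT IN form. The last query shows what goes wrong when the subquery selects the permalink's own id instead of the note it points to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE permalink (
        id INTEGER PRIMARY KEY,
        note_id INTEGER NOT NULL REFERENCES note(id)
    );
    INSERT INTO note (id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Permalink ids deliberately differ from the note ids they point to.
    INSERT INTO permalink (id, note_id) VALUES (1, 3), (2, 3);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Anti-join via NOT IN on the foreign key column.
not_in = ids(
    "SELECT id FROM note "
    "WHERE id NOT IN (SELECT note_id FROM permalink) ORDER BY id"
)

# Anti-join via LEFT JOIN ... IS NULL.
left_join = ids(
    "SELECT n.id FROM note n "
    "LEFT JOIN permalink p ON p.note_id = n.id "
    "WHERE p.id IS NULL ORDER BY n.id"
)

# Mistake: comparing note ids against permalink ids.
wrong_column = ids(
    "SELECT id FROM note "
    "WHERE id NOT IN (SELECT id FROM permalink) ORDER BY id"
)

print(not_in, left_join, wrong_column)
```

Both correct forms return notes 1 and 2, the ones without a permalink. The mistaken query returns note 3, which is the only note that does have permalinks.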

Django: Count of Group Elements

How can we achieve the following via the Django 1.5 ORM:
SELECT TO_CHAR(date, 'IW/YYYY') week_year, COUNT(*) FROM entries GROUP BY week_year;
EDIT: cf. Follow up: Count of Group Elements With Joins in Django in case you need a join.

I had to do something like this recently.
You need to add your week_year column via Django's extra, then you can use that column in the values method.
It's not obvious, but if you then use annotate, Django will GROUP BY all of the fields mentioned in the values clause (as described in the docs here: https://docs.djangoproject.com/en/dev/topics/db/aggregation/#values).
So your code should look like:
Entry.objects.extra(select={'week_year': "TO_CHAR(date, 'IW/YYYY')"}).values('week_year').annotate(Count('id'))

Prevent multiple SQL queries with model relations

Is it possible to prevent multiple queries when I use the Django ORM? Example:
product = Product.objects.get(name="Banana")
for provider in product.providers.all():
    print(provider.name)
This code will make 2 SQL queries:
1 - SELECT ... FROM stock_product WHERE stock_product.name = 'Banana'
2 - SELECT stock_provider.id, stock_provider.name FROM stock_provider INNER JOIN stock_product_reference ON (stock_provider.id = stock_product_reference.provider_id) WHERE stock_product_reference.product_id = 1
I confess, I use Doctrine (PHP) for some projects. With Doctrine it's possible to specify joins when retrieving the object (relations are populated in the object, so there is no need to query the database again to get a relation's attribute values).
Is it possible to do the same with Django's ORM?
PS: I hope my question is comprehensible; English is not my primary language.

In Django 1.4 or later, you can use prefetch_related. It's like select_related but allows M2M relations and such.
product = Product.objects.prefetch_related('providers').get(name="Banana")
You still get two queries, though. From the docs:
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python.
As for packing this down into a single query, Django won't do it like Doctrine because it doesn't do that much post-processing of the result set (Django would have to remove all the redundant column data, since you'll get a row per provider and each of these rows will have a copy of all of product's fields).
So if you want to pack this down to one query, you're going to have to turn it around and run the query on the Provider table (I'm guessing at your schema):
providers = Provider.objects.filter(product__name="Banana").select_related('product')
This should pack it down to one query, but you won't get a single product ORM object out of it, instead needing to get the product fields via providers[k].product.
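The redundancy described above can be seen with a minimal sketch that runs the single joined query directly, using Python's built-in sqlite3 module rather than the ORM. The table names come from the SQL in the question; the column layout and the provider names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stock_provider (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stock_product_reference (
        product_id INTEGER REFERENCES stock_product(id),
        provider_id INTEGER REFERENCES stock_provider(id)
    );
    INSERT INTO stock_product VALUES (1, 'Banana');
    INSERT INTO stock_provider VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO stock_product_reference VALUES (1, 1), (1, 2);
""")

# One query for the product and all of its providers.
rows = conn.execute("""
    SELECT p.id, p.name, v.id, v.name
    FROM stock_product p
    JOIN stock_product_reference r ON r.product_id = p.id
    JOIN stock_provider v ON v.id = r.provider_id
    WHERE p.name = 'Banana'
    ORDER BY v.id
""").fetchall()

# Every row repeats the product columns; only the provider columns differ.
product = rows[0][:2]
providers = [row[2:] for row in rows]

print(rows)
print(product, providers)
```

Collapsing the repeated product columns back into one object, as the last lines do by hand, is the post-processing step the answer says Django does not perform.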

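You can use prefetch_related, sometimes in combination with select_related, to load all related objects up front in a small, fixed number of queries (one extra query per prefetched relationship) instead of one query per object: https://docs.djangoproject.com/en/1.5/ref/models/querysets/#prefetch-related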
