Selecting all fields and grouping by one - django

I want to write a query like SELECT * FROM users GROUP BY some_attribute. How can I do that using Django ORM?
User.objects.all().values('some_attribute').annotate(count=Count('*'))
doesn't work, because it just selects some_attribute, instead of * - all.
I need to do this with the ORM; I don't want to write a raw SQL statement.

Related

Django add function to "from" clause?

I'm trying to write a Django query that generates the following:
select group_id, cardinality(array_agg(distinct aul))
from report_table, unnest(active_user_list) as aul
group by group_id
Where active_user_list is an array[int] type.
I'm trying to get a count of the unique items in the arrays of all rows that are in a group. The queryset.extra method gets me very close to this, but adds double quotes around unnest(active_user_list) as aul and doesn't work. I created a custom sql function that does work, but I'd prefer to do it in Django if possible.

Django GROUP BY without aggregate

I would like to write the following query in Postgresql using Django ORM:
SELECT t.id, t.field1 FROM mytable t JOIN ... JOIN ... WHERE .... GROUP BY id
Note that there is NO aggregate function (like SUM, COUNT etc.) in the SELECT part.
By the way, it's perfectly legal SQL to GROUP BY the primary key only in PostgreSQL.
How do I write it in Django ORM?
I have seen workarounds such as adding .annotate(dummy=Count('*')) to the queryset (but that slows down execution) or introducing a dummy custom aggregate function (but that is a dirty hack). How can I do it cleanly?

django valueslist queryset across database engines

In one of our Django apps we use two database engines, A and B. Both are the same database but with different schemas. Both schemas contain a table called C, but database routing always points it to database B. We built a values_list queryset from one of the models in A and passed it to a filter on table C with the __in lookup, but it always returns an empty result even though matching records exist. When we convert the values_list queryset to a list first and then use it in the __in filter on table C, it works fine.
Not working
data = modelindbA.objects.values_list('somecolumn',flat=True)
info = C.objects.filter(somecolumn__in=data).values_list
Working
data = modelindbA.objects.values_list('somecolumn',flat=True)
data = list(data)
info = C.objects.filter(somecolumn__in=data).values_list
I have read the Django docs and other SO questions but couldn't find anything relevant. My guess is that this fails because the two models are in different database schemas. I need help troubleshooting this issue.
When you use a queryset with __in, Django constructs a single SQL query that uses a subquery for the __in clause. That whole query runs against the second database, so the subquery cannot see the rows stored in the first database, and no rows match.
By contrast, if you convert the first queryset to a list, Django will go ahead and fetch the data from the first database. When you then pass that data to the second query, hitting the second database, it will work as expected.
See the documentation for the in field lookup for more details:
You can also use a queryset to dynamically evaluate the list of values instead of providing a list of literal values.... This queryset will be evaluated as subselect statement:
SELECT ... WHERE blog.id IN (SELECT id FROM ... WHERE NAME LIKE '%Cheddar%')
This happens because the values_list method returns a django.db.models.query.QuerySet, not a list.
When both models are in the same schema, the ORM can fold the queryset into a single query, but when the schemas are different this fails.
Just use list().
I would even recommend using list() within a single schema, since it can reduce the complexity of the query and may work better on big tables.
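The behaviour described in these answers can be reproduced outside Django. The following is a minimal sketch using Python's sqlite3 module, not the asker's actual setup: the two in-memory connections stand in for databases A and B, the table names `source` and `c` are hypothetical, and it assumes the source table also exists (but is empty) in B.

```python
import sqlite3

# Two separate in-memory databases stand in for "A" and "B" (hypothetical setup).
db_a = sqlite3.connect(":memory:")
db_b = sqlite3.connect(":memory:")

# Assumption: the source table exists in both, but only A holds the rows we want.
for db in (db_a, db_b):
    db.execute("CREATE TABLE source (somecolumn INTEGER)")
db_a.executemany("INSERT INTO source VALUES (?)", [(1,), (2,)])

# Table c is only ever queried through B, as with the routed model C.
db_b.execute("CREATE TABLE c (somecolumn INTEGER)")
db_b.executemany("INSERT INTO c VALUES (?)", [(1,), (2,), (3,)])

# Subquery form: the whole statement runs on B, where "source" is empty.
subquery_rows = db_b.execute(
    "SELECT somecolumn FROM c "
    "WHERE somecolumn IN (SELECT somecolumn FROM source)"
).fetchall()

# list() form: fetch the values from A first, then pass them to B as literals.
values = [row[0] for row in db_a.execute("SELECT somecolumn FROM source")]
placeholders = ", ".join("?" for _ in values)
list_rows = db_b.execute(
    "SELECT somecolumn FROM c WHERE somecolumn IN (%s) ORDER BY somecolumn"
    % placeholders,
    values,
).fetchall()

print(subquery_rows)
print(list_rows)
```

In this sketch the subquery form finds nothing, while the list form finds the matching rows. If the source table did not exist in B at all, the subquery form would raise an error instead of returning an empty result.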

Django: Count of Group Elements

How can we achieve the following via the Django 1.5 ORM:
SELECT TO_CHAR(date, 'IW/YYYY') week_year, COUNT(*) FROM entries GROUP BY week_year;
EDIT: cf. Follow up: Count of Group Elements With Joins in Django in case you need a join.
I had to do something like this recently.
You need to add your week_year column via Django's extra, then you can use that column in the values method.
...it's not obvious but if you then use annotate Django will GROUP BY all of the fields mentioned in the values clause (as described in the docs here https://docs.djangoproject.com/en/dev/topics/db/aggregation/#values)
So your code should look like:
Entry.objects.extra(select={'week_year': "TO_CHAR(date, 'IW/YYYY')"}).values('week_year').annotate(Count('id'))
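To sanity-check what such a grouped query should return, the same counts can be computed in plain Python. This is only a sketch of the grouping logic, not a replacement for the ORM call: the sample dates are made up, and `week_year` is a hypothetical helper that mimics PostgreSQL's 'IW/YYYY' format (ISO week number, then calendar year).

```python
from collections import Counter
from datetime import date


def week_year(d):
    # ISO week number (PostgreSQL 'IW') followed by the calendar year ('YYYY').
    return "%02d/%04d" % (d.isocalendar()[1], d.year)


# Hypothetical entry dates, for illustration only.
entry_dates = [date(2013, 1, 7), date(2013, 1, 9), date(2013, 1, 14)]

# Group by the derived key and count the members of each group,
# which is what GROUP BY week_year with COUNT(*) does in SQL.
counts = Counter(week_year(d) for d in entry_dates)
print(dict(counts))
```

Each key of `counts` corresponds to one week_year group, and its value to the Count('id') annotation for that group.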

Prevent multiple SQL queries with model relations

Is it possible to prevent multiple queries when I use the Django ORM? Example:
product = Product.objects.get(name="Banana")
for provider in product.providers.all():
    print(provider.name)
This code makes two SQL queries:
1 - SELECT ... FROM stock_product WHERE stock_product.name = 'Banana'
2 - SELECT stock_provider.id, stock_provider.name FROM stock_provider INNER JOIN stock_product_reference ON (stock_provider.id = stock_product_reference.provider_id) WHERE stock_product_reference.product_id = 1
I confess that I use Doctrine (PHP) for some projects. With Doctrine it is possible to specify joins when retrieving an object (the relations are populated on the object, so there is no need to query the database again to get a related attribute).
Is it possible to do the same with Django's ORM?
PS: I hope my question is comprehensible; English is not my primary language.
In Django 1.4 or later, you can use prefetch_related. It's like select_related but allows M2M relations and such.
product = Product.objects.prefetch_related('providers').get(name="Banana")
You still get two queries, though. From the docs:
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python.
As for packing this down into a single query, Django won't do it like Doctrine because it doesn't do that much post-processing of the result set (Django would have to remove all the redundant column data, since you'll get a row per provider and each of these rows will have a copy of all of product's fields).
So if you want to pack this down to one query, you're going to have to turn it around and run the query on the Provider table (I'm guessing at your schema):
providers = Provider.objects.filter(product__name="Banana").select_related('product')
This should pack it down to one query, but you won't get a single product ORM object out of it, instead needing to get the product fields via providers[k].product.
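The trade-off described above (two queries joined in Python versus one JOIN that repeats the product columns) can be sketched with Python's sqlite3 module. This is an illustration only: the table names are taken from the SQL in the question, while the column layout and the sample rows ('Acme', 'Fruitco') are assumptions, and the two-query part only roughly approximates what prefetch_related does.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE stock_product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stock_provider (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stock_product_reference (product_id INTEGER, provider_id INTEGER);
    INSERT INTO stock_product VALUES (1, 'Banana');
    INSERT INTO stock_provider VALUES (10, 'Acme');
    INSERT INTO stock_provider VALUES (11, 'Fruitco');
    INSERT INTO stock_product_reference VALUES (1, 10);
    INSERT INTO stock_product_reference VALUES (1, 11);
""")

# Two queries, combined in Python (roughly the prefetch_related approach).
product = db.execute(
    "SELECT id, name FROM stock_product WHERE name = ?", ("Banana",)
).fetchone()
provider_rows = db.execute(
    "SELECT p.name FROM stock_provider p "
    "JOIN stock_product_reference r ON r.provider_id = p.id "
    "WHERE r.product_id = ? ORDER BY p.id",
    (product[0],),
).fetchall()
prefetched = {
    "product": product[1],
    "providers": [name for (name,) in provider_rows],
}

# One query with JOINs: the product columns repeat on every provider row.
joined = db.execute(
    "SELECT prod.id, prod.name, p.name FROM stock_product prod "
    "JOIN stock_product_reference r ON r.product_id = prod.id "
    "JOIN stock_provider p ON p.id = r.provider_id "
    "WHERE prod.name = ? ORDER BY p.id",
    ("Banana",),
).fetchall()

print(prefetched)
print(joined)
```

The single-query result carries one copy of the product's id and name per provider, which is the redundant column data the answer says Django would otherwise have to strip out.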
You can use prefetch_related, sometimes in combination with select_related, to fetch all related objects in a small, fixed number of queries (one extra query per prefetched relationship) instead of one query per object: https://docs.djangoproject.com/en/1.5/ref/models/querysets/#prefetch-related