How does distinct parameter work with the Count method in annotate?

I ran into a problem with annotate() when using Count() to count rows from several related tables at once.
Here is a quick example:
from django.db.models import Count, F
match_session_instance = MatchSessionInstance.objects.filter(match_session=match_session, status="main")
match_instances = MatchSessionInstance.objects.filter(match_session=match_session)
action_counts = match_instances.values(player_number=F("player_pk__number"), player_name=F("player_pk__player"))\
    .annotate(pass_count=Count("live_match_pass__id", distinct=True),
              corner_count=Count("live_match_corner__id", distinct=True))
I'm no longer facing the problem: I found the cause and fixed it, but the fix itself is what puzzles me now.
I don't understand how the distinct=True parameter fixed it!
I googled a bit and found this source, which helped me:
Count on multiple fields in Django querysets
I know what distinct() does as a queryset method in the ORM, but I have no idea how it works in this form, especially when the columns I'm counting never contain duplicate data.
How can I understand this?
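One way to see what distinct=True changes, sketched here with Python's built-in sqlite3 module rather than the Django ORM: annotating across two reverse relations makes the query join both tables, so every pass row is paired with every corner row of the same instance. A plain COUNT then counts the pairs, while COUNT(DISTINCT ...) counts each id once, even though the ids themselves are never duplicated in their own tables. The table and column names below are hypothetical stand-ins for the models in the question.

```python
import sqlite3

# Hypothetical, simplified tables standing in for the models in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instance (id INTEGER PRIMARY KEY);
    CREATE TABLE pass (id INTEGER PRIMARY KEY, instance_id INTEGER);
    CREATE TABLE corner (id INTEGER PRIMARY KEY, instance_id INTEGER);
    INSERT INTO instance (id) VALUES (1);
    INSERT INTO pass (id, instance_id) VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO corner (id, instance_id) VALUES (1, 1), (2, 1);
""")

# Both joins in one query: 3 passes x 2 corners = 6 joined rows for instance 1.
joins = """
    FROM instance i
    LEFT JOIN pass p ON p.instance_id = i.id
    LEFT JOIN corner c ON c.instance_id = i.id
    GROUP BY i.id
"""

plain = conn.execute("SELECT COUNT(p.id), COUNT(c.id)" + joins).fetchone()
distinct = conn.execute(
    "SELECT COUNT(DISTINCT p.id), COUNT(DISTINCT c.id)" + joins
).fetchone()

print(plain)     # counts joined rows, so both numbers are inflated
print(distinct)  # counts each pass id and each corner id once
```

With a single Count() there is only one join, so no duplication appears; the inflated numbers show up only once a second multi-valued relation is counted in the same query.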

Related

Django Query, Distinct and Order_By combination not working

There are similar questions here but I haven't been able to find one that helps me.
I have two models, Chat and Post.
There are multiple Chats, and each Chat has multiple Posts attached to it.
I'm trying to get the latest post for each chat.
Post.objects.order_by('-id').distinct('Chat')
The idea is to order the posts by ID descending (so the newest post is first), and then grab the distinct ones based on the Chats.
but since order_by and distinct don't match I'm getting the error:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
So how exactly do I go about doing this? Rawsql? Thanks!

If you use distinct() on a related-model field, the ordering must start with that same field:
Post.objects.order_by('chat', '-id').distinct('chat')
Also you can look at this question
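A rough way to picture why the ordering has to start with the DISTINCT ON field: PostgreSQL keeps the first row of each group in the sorted output, so rows for the same chat must be adjacent. The sketch below imitates that behaviour in plain Python on made-up (chat, id) pairs. It only illustrates the semantics and is not what the ORM runs.

```python
from itertools import groupby

# Made-up posts as (chat, post_id) pairs; a higher id means a newer post.
posts = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]

# Equivalent of ORDER BY chat ASC, id DESC: rows of one chat become adjacent,
# newest first within each chat.
ordered = sorted(posts, key=lambda p: (p[0], -p[1]))

# Equivalent of DISTINCT ON (chat): keep the first row of each chat group.
latest = [next(rows) for _, rows in groupby(ordered, key=lambda p: p[0])]

print(latest)
```

Sorting by id alone would interleave the chats, which is why PostgreSQL rejects that combination.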

Django: how to filter() after distinct()

If we chain a call to filter() after a call to distinct(), the filter is applied to the query before the distinct. How do I filter the results of a query after applying distinct?
Example.objects.order_by('a','foreignkey__b').distinct('a').filter(foreignkey__b='something')
The where clause in the SQL resulting from filter() means the filter is applied to the query before the distinct. I want to filter the queryset resulting from the distinct.
This is probably pretty easy, but I just can't quite figure it out and I can't find anything on it.
Edit 1:
I need to do this in the ORM...
SELECT z.column1, z.column2, z.column3
FROM (
SELECT DISTINCT ON (b.column1, b.column2) b.column1, b.column2, c.column3
FROM table1 a
INNER JOIN table2 b ON ( a.id = b.id )
INNER JOIN table3 c ON ( b.id = c.id)
ORDER BY b.column1 ASC, b.column2 ASC, c.column4 DESC
) z
WHERE z.column3 = 'Something';
(I am using Postgres by the way.)
So I guess what I am asking is "How do you nest subqueries in the ORM? Is it possible?" I will check the documentation.
Sorry if I was not specific earlier. It wasn't clear in my head.

This is an old question, but when using Postgres you can do the following to force nested queries on your 'Distinct' rows:
foo = Example.objects.order_by('a','foreign_key__timefield').distinct('a')
bar = Example.objects.filter(pk__in=foo).filter(some_field=condition)
bar is the nested query as requested in OP without resorting to raw/extra etc. Tested working in 1.10 but docs suggest it should work back to at least 1.7.
My use case was to filter up a reverse relationship. If Example has some ForeignKey to model Toast then you can do:
Toast.objects.filter(pk__in=bar.values_list('foreign_key', flat=True))
This gives you all instances of Toast where the most recent associated example meets your filter criteria.
A big health warning about performance, though: if bar is likely to be a huge queryset, you're probably going to have a bad time.
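The difference between filtering before and after the "latest row per group" step can be reproduced with sqlite3. This is only a sketch: SQLite has no DISTINCT ON, so the subquery uses MAX(id) ... GROUP BY a as a substitute, which assumes that newer rows have higher ids. The table and its data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE example (id INTEGER PRIMARY KEY, a TEXT, some_field TEXT);
    INSERT INTO example (id, a, some_field) VALUES
        (1, 'x', 'keep'),
        (2, 'x', 'drop'),
        (3, 'y', 'keep');
""")

# Filter first, then pick the latest row per "a" (what chaining .filter() does).
filter_then_latest = conn.execute("""
    SELECT MAX(id) FROM example
    WHERE some_field = 'keep'
    GROUP BY a
    ORDER BY 1
""").fetchall()

# Pick the latest row per "a" in a subquery, then filter the survivors
# (the shape of the pk__in approach).
latest_then_filter = conn.execute("""
    SELECT id FROM example
    WHERE id IN (SELECT MAX(id) FROM example GROUP BY a)
      AND some_field = 'keep'
    ORDER BY id
""").fetchall()

print(filter_then_latest)
print(latest_then_filter)
```

Row 1 survives only in the first query: it is not the latest row for 'x', so the nested form discards it before the outer filter is applied.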

Thanks a ton for the help guys. I tried both suggestions and could not bend either of those suggestions to work, but I think it started me in the right direction.
I ended up using
from django.db.models import Max, F
Example.objects.annotate(latest=Max('foreignkey__timefield')).filter(foreignkey__timefield=F('latest'), foreignkey__a='Something')
This checks what the latest foreignkey__timefield is for each Example. A row is kept only if it is the latest one and a='Something'; otherwise it is filtered out.
This does not nest subqueries but it gives me the output I am looking for - and it is fairly simple. If there is simpler way I would really like to know.

No, you can't do this in one simple SELECT.
As you said in the comments, in the Django ORM filter() maps to the SQL WHERE clause and distinct() maps to DISTINCT. In SQL, DISTINCT is always applied after WHERE, operating on the filtered result set; see the SQLite docs, for example.
But you could write a subquery to nest SELECTs. How exactly depends on the actual goal (I don't know exactly what yours is yet; could you elaborate?).
Also, in your query, distinct('a') keeps only the first occurrence of each Example with the same a. Is that what you want?

What is the best way to get the count of the number of items in a Django queryset?

I am doing a Django query. I want to know how many MyModels there are that have myAttribute value of "X". This is how I am doing it:
len(MyModel.objects.filter(myAttribute="X"))
Is this the most efficient way to handle it? I am concerned that this unnecessarily gets more data from the database than I need and instead I should be using the Count() function. However, from examples I have seen posted I am unsure if I can combine Count() with filter(). Can someone please advise?

The most efficient way to get just the count is to use count(), which has the database run a SELECT COUNT(*) instead of loading every matching row into Python:
MyModel.objects.filter(myAttribute="X").count()
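To make the difference concrete, here is a small sqlite3 sketch of the two strategies. The table name, column name, and data are made up for illustration, and the SQL is only an approximation of what the ORM would send.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, my_attribute TEXT)")
conn.executemany(
    "INSERT INTO mymodel (my_attribute) VALUES (?)",
    [("X",), ("X",), ("Y",), ("X",)],
)

# Roughly what len(queryset) implies: fetch every matching row, count in Python.
rows = conn.execute(
    "SELECT id, my_attribute FROM mymodel WHERE my_attribute = 'X'"
).fetchall()
count_in_python = len(rows)

# Roughly what .count() implies: the database counts and returns one number.
(count_in_db,) = conn.execute(
    "SELECT COUNT(*) FROM mymodel WHERE my_attribute = 'X'"
).fetchone()

print(count_in_python, count_in_db)
```

Both give the same number; the second transfers a single value regardless of how many rows match.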

django : How to write alias in queryset

How can one write an alias for a column name in a Django queryset?
This would be useful for union-style combinations of two linked fields pointing to the same foreign model (for instance).
For example, in MySQL:
select m as n, b as a from xyz
How can I do this in a Django queryset?
models.Table.objects.all().values('m', 'b')
Any help is really appreciated.

You can annotate the fields as you want, with the F expression:
from django.db.models import F
models.Table.objects.all().values('m', 'b').annotate(n=F('m'), a=F('b'))
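For comparison, the SQL from the question can be tried directly with sqlite3 to see which column names an AS alias produces. This sketch shows only the target SQL, not ORM code, and the sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row report its column names
conn.execute("CREATE TABLE xyz (m INTEGER, b INTEGER)")
conn.execute("INSERT INTO xyz (m, b) VALUES (1, 2)")

# The aliases replace the original column names in the result set.
row = conn.execute("SELECT m AS n, b AS a FROM xyz").fetchone()

print(row.keys(), tuple(row))
```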

Although this could have been done before by using extra(select={'n':'m','a':'b'}), I agree that this really should have been a part of values() itself.
To that end, and inspired by Alex's ticket, I have just posted a patch that adds this feature. I hope you'll find it useful!

Your reason in the comment makes no sense. Each field in the model has its own column in the database, and there's never any danger of mixing them up. You can of course tell a field to use a column name that's different from the field name:
myfield = models.CharField(max_length=10, db_column='differentname')
but I don't know if that will help you, because I still don't know what your problem is.

I'm presuming this isn't possible, so I've raised a ticket for the feature to be added, I think there is some merit to being able to do this. Please see the ticket for more information.
https://code.djangoproject.com/ticket/16735

You can also use .alias() in your queryset to store an expression for reuse in later calls:
.alias(alias=SomeExpression()).annotate(SomeOtherExpression('alias'))
.alias(alias=SomeExpression()).order_by('alias')
.alias(alias=SomeExpression()).update(field=F('alias'))
Note that alias() does not add the expression to the selected columns. For your specific case, where the renamed columns should appear in the output, use annotate() instead:
models.Table.objects.annotate(n=F('m'), a=F('b')).values('n', 'a')

I really can't understand what you are trying to do. However, it sounds like what you are looking for is the extra() queryset method, which for most purposes acts in the same manner as AS does in SQL.

Django database query - return the most recent three objects

This can't be hard, but... I just need to get the most recent three objects added to my database field.
So, query with reverse ID ordering, maximum three objects.
Been fiddling round with
Records.objects.order_by(-id)[:3]
Records.objects.all[:3]
and including an if clause to check whether there are actually three objects:
num_maps = Records.objects.count()
if (num_maps > 3): # etc...
and using reverse() and filter() for a while...
But just can't figure it out! Nothing I do gives the right result and using num_maps feels pretty inelegant. Not getting much joy from the documentation. Can anyone help?!

All you should need is:
Records.objects.all().order_by('-id')[:3]
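The argument you pass to order_by must be a string ('-id', not -id), and all is a method, so it has to be called before it can be sliced. The all() call is optional here, because the manager exposes order_by() directly. There is no need to check that there are at least three objects before running this query, because the [:3] slice will not break if there are fewer than 3.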
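The slicing behaviour can be checked against a table that holds fewer than three rows. The sketch below uses sqlite3 with a made-up records table; ORDER BY ... LIMIT 3 approximates the SQL behind order_by('-id')[:3].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO records (name) VALUES (?)", [("first",), ("second",)])

# Newest first, at most three rows; with only two rows, two come back.
newest = conn.execute(
    "SELECT id, name FROM records ORDER BY id DESC LIMIT 3"
).fetchall()

print(newest)
```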
