Django Annotation Count with Subquery & OuterRef


I'm trying to build a high-score table for a quiz. For each person who appeared as the answer to be guessed, the table should show the percentage (or total number) of correct guesses. These are the models involved.
The Quiz model:
class Quiz(models.Model):
    participants = models.ManyToManyField(
        User,
        through="Participant",
        through_fields=("quiz", "correct_user"),
        blank=True,
        related_name="related_quiz",
    )
    fake_users = models.ManyToManyField(User, related_name="quiz_fakes")
    user_quizzed = models.ForeignKey(
        User, related_name="user_taking_quiz", on_delete=models.CASCADE, null=True
    )
    time_started = models.DateTimeField(default=timezone.now)
    time_end = models.DateTimeField(blank=True, null=True)
    final_score = models.IntegerField(blank=True, default=0)
This model also has some properties, which I consider unrelated to the problem at hand.
The Participant model:
class Participant(models.Model):  # QuizAnswer FK -> QUIZ
    guessed_user = models.ForeignKey(
        User, on_delete=models.CASCADE, related_name="clicked_in_quiz", null=True
    )
    correct_user = models.ForeignKey(
        User, on_delete=models.CASCADE, related_name="solution_in_quiz", null=True
    )
    quiz = models.ForeignKey(
        Quiz, on_delete=models.CASCADE, related_name="participants_in_quiz"
    )

    @property
    def correct(self):
        return self.guessed_user == self.correct_user
To spell out what I am trying to do, here is how I think it should work:
1. For each User in User.objects.all(), find all Participant objects where user.id equals correct_user (from the Participant model).
2. For each Participant object, evaluate whether correct_user == guessed_user.
3. Count the Participant objects for which that comparison is True for the User, and expose the count as a field sum_of_correct_guesses.
4. Return a queryset of all users with the values [User, sum_of_correct_guesses].
Ideally this would be percentage_of_correct_guesses, but that is an afterthought that should be easy to add by dividing sum_of_correct_guesses by the number of times that person was the one to be guessed.
I've also written some pseudocode for a single person, using plain Python arithmetic, to illustrate roughly how it should work:
# PYTHON PSEUDO QUERY ---------------------
person = get_object_or_404(User, pk=3)  # Example person
y = Participant.objects.filter(
    correct_user=person
)  # Find Participant objects where person is the one to be guessed
y_corr = []  # empty list to act as "queryset" in the for loop
for el in y:  # for each Participant object
    if el.correct:  # if correct_user == guessed_user
        y_corr.append(el)  # add to "queryset"
y_percentage_corr = len(y_corr) / len(y)  # fraction of correct guesses
print("Percentage correct: ", y_percentage_corr)  # debug display
# ---------------------------------------------
What I've tried (with no success so far) is to use an ExpressionWrapper with Count() and a Q object:
percentage_correct_guesses = ExpressionWrapper(
    Count("pk", filter=Q(clicked_in_quiz=F("id")), distinct=True)
    / Count("solution_in_quiz"),
    output_field=fields.DecimalField(),
)
all_users = (
    User.objects.all().annotate(score=percentage_correct_guesses).order_by("score")
)
Any help or directions to resources on how to do this is greatly appreciated :))

I found an answer while looking around for related problems:
Django 1.11 Annotating a Subquery Aggregate
What I've done is:
1. Create a filter with an OuterRef() that points to a User and checks whether that User is the correct_user, combined with a comparison between guessed_user and correct_user. The filtered queryset outputs the value correct_user for every row the filter accepts.
2. Annotate that queryset with a count of how many times each correct_user occurs.
3. Annotate User with that count; this is the annotation that really drives the whole operation. Notice how OuterRef() and Subquery are used to tell the filter which user is supposed to be correct_user.
Below is the code snippet that worked for me; it looks very similar to the answer in the question linked above:
from django.db.models import Count, OuterRef, Subquery, F, Q

crit1 = Q(correct_user=OuterRef('pk'))
crit2 = Q(correct_user=F('guessed_user'))
compare_participants = Participant.objects.filter(crit1 & crit2).order_by().values('correct_user')
count_occurrences = compare_participants.annotate(c=Count('*')).values('c')
most_correctly_guessed_on = (
    User.objects.annotate(correct_clicks=Subquery(count_occurrences))
    .values('first_name', 'correct_clicks')
    .order_by('-correct_clicks')
)
return most_correctly_guessed_on
This works wonderfully, thanks to Oli.
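To see what the correlated subquery is doing at the database level, here is a minimal sketch using Python's built-in sqlite3 module instead of the ORM. The table and column names (app_user, participant, times_shown) are hypothetical stand-ins, not the tables Django would generate, and the SQL only approximates the shape of the query above. It also adds a second subquery for the denominator, which is one possible way to get at the percentage mentioned in the question.

```python
import sqlite3

# Hypothetical stand-in tables; Django's real table names will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        correct_user_id INTEGER,
        guessed_user_id INTEGER
    );
    INSERT INTO app_user VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO participant (correct_user_id, guessed_user_id) VALUES
        (1, 1), (1, 1), (1, 2),
        (2, 2), (2, 3);
""")

# Each inner SELECT refers to u.id from the outer query, which is the
# role OuterRef('pk') plays in the ORM version.
rows = conn.execute("""
    SELECT u.first_name,
           (SELECT COUNT(*) FROM participant p
             WHERE p.correct_user_id = u.id
               AND p.guessed_user_id = p.correct_user_id
             GROUP BY p.correct_user_id) AS correct_clicks,
           (SELECT COUNT(*) FROM participant p
             WHERE p.correct_user_id = u.id
             GROUP BY p.correct_user_id) AS times_shown
      FROM app_user u
     ORDER BY correct_clicks DESC
""").fetchall()

for first_name, correct_clicks, times_shown in rows:
    # A grouped subquery yields NULL (None) when no rows match.
    if times_shown:
        share = (correct_clicks or 0) / times_shown
        print(first_name, correct_clicks, f"{share:.0%}")
    else:
        print(first_name, "never shown as the answer")
```

Note that a user with no matching rows gets NULL rather than 0 from the grouped subquery. In the ORM, wrapping the Subquery in Coalesce from django.db.models.functions is one way to turn that into 0.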

Related

Django: Filtering a related field by date yields unwanted results

models:
class Vehicle(models.Model):
    licence_plate = models.CharField(max_length=16)

class WorkTime(models.Model):
    work_start = models.DateTimeField()
    work_end = models.DateTimeField()
    # SET_NULL requires the column to be nullable
    vehicle = models.ForeignKey(
        Vehicle, on_delete=models.SET_NULL, null=True, related_name="work_times"
    )
However, when I try to filter the working times using:
qs = Vehicle.objects.filter(
    work_times__work_start__date__gte="YYYY-MM-DD",
    work_times__work_end__date__lte="YYYY-MM-DD",
).distinct()
I get results that do not fit the given timeframe. Most commonly, when work_end matches for some entry, everything from WorkTime is returned.
What I would like to have:
for vehicle in qs:
    for work_time in vehicle.work_times.all():
        print(vehicle, work_time.work_start, work_time.work_end)

The filter has no effect on the .work_times of the Vehicles; it only ensures that each Vehicle in qs has at least one WorkTime in the given range.
You can use a Prefetch object to filter efficiently on a related manager:
from django.db.models import Prefetch

qs = Vehicle.objects.prefetch_related(
    Prefetch(
        'work_times',
        WorkTime.objects.filter(
            work_start__date__range=('2021-03-01', '2021-03-12')
        ),
        to_attr='filtered_work_times'
    )
)
and then you can work with:
for vehicle in qs:
    for work_time in vehicle.filtered_work_times:
        print(vehicle, work_time.work_start, work_time.work_end)
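The difference between filtering the parent rows and filtering the related rows can be illustrated without Django. The following is a rough sketch using the standard-library sqlite3 module; the table layout and sample data are made up for illustration, and the three queries only approximate what the ORM would issue.

```python
import sqlite3

# Hypothetical tables loosely mirroring Vehicle and WorkTime.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY, licence_plate TEXT);
    CREATE TABLE work_time (
        id INTEGER PRIMARY KEY,
        vehicle_id INTEGER,
        work_start TEXT,
        work_end TEXT
    );
    INSERT INTO vehicle VALUES (1, 'AB-123'), (2, 'CD-456');
    INSERT INTO work_time (vehicle_id, work_start, work_end) VALUES
        (1, '2021-03-02 08:00', '2021-03-02 16:00'),
        (1, '2021-01-15 08:00', '2021-01-15 16:00'),
        (2, '2020-12-01 08:00', '2020-12-01 16:00');
""")
in_range = "date(w.work_start) BETWEEN '2021-03-01' AND '2021-03-12'"

# 1) Filtering across the relation only decides WHICH vehicles are returned.
matching_vehicles = conn.execute(
    "SELECT DISTINCT v.id FROM vehicle v "
    f"JOIN work_time w ON w.vehicle_id = v.id WHERE {in_range}"
).fetchall()

# 2) Reading the relation afterwards is a separate, unfiltered query.
all_for_vehicle_1 = conn.execute(
    "SELECT w.work_start FROM work_time w "
    "WHERE w.vehicle_id = 1 ORDER BY w.work_start"
).fetchall()

# 3) A filtered prefetch corresponds to putting the condition on that
#    second query instead.
filtered_for_vehicle_1 = conn.execute(
    "SELECT w.work_start FROM work_time w "
    f"WHERE w.vehicle_id = 1 AND {in_range} ORDER BY w.work_start"
).fetchall()

print(matching_vehicles, all_for_vehicle_1, filtered_for_vehicle_1)
```

Query 2 is why the out-of-range January entry still shows up in the original loop, while query 3 returns only the March entry.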

The best way to do an efficient filter query in django

models.py file
I am not so good at this aspect of Django. Can someone please help me? I would like to know whether there is a more efficient way to write the class method already_voted.
class Vote(TimeStamped):
    voter = models.ForeignKey(get_user_model(), verbose_name=_("Vote"), on_delete=models.CASCADE)
    contester = models.ForeignKey(Contester, verbose_name=_("Contester"), on_delete=models.CASCADE,
                                  help_text=("The chosen contester"), related_name="votes")
    ip_address = models.GenericIPAddressField(
        _("Voter's IP"),
        protocol="both",
        unpack_ipv4=False,
        default="None",
        unique=True
    )
    num_vote = models.PositiveIntegerField(_("Vote"), default=0)

    class Meta:
        unique_together = ('voter', 'contester')
        verbose_name = _("Vote")
        verbose_name_plural = _("Votes")
        permissions = (
            ("vote_multiple_times", "can vote multiple times"),
        )
    ....
    ....
    @classmethod
    def already_voted(cls, contester_id, voter_id=None, ip_addr=None):
        return cls.objects.filter(contester_id=contester_id).exists() and \
            (cls.objects.filter(ip_address=ip_addr).exists() or \
             cls.objects.filter(voter_id=voter_id).exists())

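The class method may be fine, but make sure the fields it filters on are indexed:
contester = models.ForeignKey(db_index=True)  # plus the existing arguments
Notice that:
voter doesn't need an extra index because it is the first field in the unique_together constraint.
contester needs its own index because, although it is part of unique_together, it is not the first field in the constraint.
ip_address doesn't need an extra index because it has a unique constraint.
(Django creates an index on a ForeignKey by default, so db_index=True only makes this explicit.)
Also:
unique_together is conventionally written as a list of tuples, although a single tuple is accepted for one set of fields. The Django docs note that UniqueConstraint offers more functionality and that unique_together may be deprecated in the future.
Edited 5 Feb 2021 in response to the OP's comment:
You can get the result in a single query using Exists, but it is less readable, and I'm not sure whether it is more efficient or the best way: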
from django.db.models import Exists

q_ip = Vote.objects.filter(ip_address="1")
q_voter = Vote.objects.filter(voter=2)
already_voted = (
    Vote
    .objects
    .filter(contester=3)
    .filter(Exists(q_ip) | Exists(q_voter))
    .exists()
)
The underlying SQL shows that this is just one query:
SELECT ( 1 ) AS "a"
FROM "a1_vote"
WHERE ( "a1_vote"."contester" = 3
AND ( EXISTS(SELECT U0."id",
U0."voter",
U0."contester",
U0."ip_address",
U0."num_vote"
FROM "a1_vote" U0
WHERE U0."ip_address" = '1')
OR EXISTS(SELECT U0."id",
U0."voter",
U0."contester",
U0."ip_address",
U0."num_vote"
FROM "a1_vote" U0
WHERE U0."voter" = 2) ) )
LIMIT 1

Django Counting related object with a certain condition

I have these model classes in my Django app:
class Ad(models.Model):
    ...

class Click(models.Model):
    time = models.DateTimeField(auto_now_add=True)
    ip = models.GenericIPAddressField()
    ad = models.ForeignKey(
        to=Ad,
        related_name='clicks',  # the queries below use clicks__...
        on_delete=models.CASCADE
    )

class View(models.Model):
    time = models.DateTimeField(auto_now_add=True)
    ip = models.GenericIPAddressField()
    ad = models.ForeignKey(
        to=Ad,
        related_name='views',
        on_delete=models.CASCADE
    )
Assume I have a queryset of Ad objects. I want to annotate each ad with the number of clicks that happened between hour 12 and hour 13 (a range lookup could be used). First I did it like this:
query.filter(clicks__time__hour__range=[12, 13]).annotate(
    views_count=Count('views', distinct=True),
    clicks_count=Count('clicks', distinct=True),
)
However, ads without any clicks in that range are omitted from the result this way, and I need them to be present in the final queryset.
Is there a proper way to do this, perhaps with Django conditional expressions?

As per the docs, you should be able to apply the filter inside the Count aggregate. The lookup in Q is resolved relative to Ad, so it has to go through the clicks relation:
from django.db.models import Count, Q

query.annotate(
    views_count=Count('views', distinct=True),
    clicks_count=Count('clicks', distinct=True, filter=Q(clicks__time__hour__range=[12, 13])),
)
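As a rough illustration of why a conditional count keeps every ad in the result, here is a sketch using the standard-library sqlite3 module. The table names (ad, ad_view, ad_click) and the sample data are hypothetical, and the CASE WHEN form is only one way a database can express a filtered aggregate; the SQL that Django actually emits depends on the backend.

```python
import sqlite3

# Hypothetical tables loosely mirroring Ad, View and Click.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ad (id INTEGER PRIMARY KEY);
    CREATE TABLE ad_view (id INTEGER PRIMARY KEY, ad_id INTEGER);
    CREATE TABLE ad_click (id INTEGER PRIMARY KEY, ad_id INTEGER, time TEXT);
    INSERT INTO ad VALUES (1), (2);
    INSERT INTO ad_view (ad_id) VALUES (1), (1), (2);
    INSERT INTO ad_click (ad_id, time) VALUES
        (1, '2021-03-01 12:30:00'),
        (1, '2021-03-01 09:00:00'),
        (2, '2021-03-01 15:00:00');
""")

# The condition lives inside the aggregate, not in WHERE, so no ad row
# is dropped; clicks outside the window simply are not counted.
rows = conn.execute("""
    SELECT a.id,
           COUNT(DISTINCT v.id) AS views_count,
           COUNT(DISTINCT CASE
               WHEN CAST(strftime('%H', c.time) AS INTEGER) BETWEEN 12 AND 13
               THEN c.id END) AS clicks_count
      FROM ad a
      LEFT JOIN ad_view v ON v.ad_id = a.id
      LEFT JOIN ad_click c ON c.ad_id = a.id
     GROUP BY a.id
     ORDER BY a.id
""").fetchall()

print(rows)
```

Ad 2 has no click in the 12 to 13 window but still appears, with a clicks_count of 0. The DISTINCT keeps the two joins from inflating each other's counts.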

Division in Django Query

Model:
class Vote(models.Model):
    thumbs_up = models.ManyToManyField(
        settings.AUTH_USER_MODEL, blank=True, related_name='thumbs_up')
    thumbs_down = models.ManyToManyField(
        settings.AUTH_USER_MODEL, blank=True, related_name='thumbs_down')
View:
qs = Vote.objects.all()
percent_min = request.GET.get('min-rating')
percent_max = request.GET.get('max-rating')
qs = qs.annotate(
    percent=(Count('thumbs_up') / (Count('thumbs_down') + Count('thumbs_up'))) * 100
).filter(percent__gte=percent_min)
qs = qs.annotate(
    percent=(Count('thumbs_up') / (Count('thumbs_down') + Count('thumbs_up'))) * 100
).filter(percent__lte=percent_max)
I also tried this, which didn't work either:
qs = qs.annotate(
    up=Count('thumbs_up', distinct=True),
    combined=Count('thumbs_up', distinct=True) + Count('thumbs_down', distinct=True),
    result=(F('up') / F('combined')) * 100,
).filter(result__gte=percent_min)
I'm attempting to filter by minimum and maximum percentages based on user votes (up and down), but I can't seem to get it to work properly.
With the current code, if I enter a maximum percentage of 74%, for example, it filters out everything rated 100% and leaves the rest. The opposite happens if I enter 74% as a minimum percentage: it filters out everything except the entries rated 100%.
There are currently no entries rated 0, as I still have to tackle the divide-by-zero issue next.
Any insights would be greatly appreciated.

So I came up with this, which seems to be working:
qs = qs.annotate(
    meh=Count('thumbs_meh', distinct=True),
    up=Count('thumbs_up', distinct=True),
    combined=Count('thumbs_up', distinct=True)
    + Count('thumbs_down', distinct=True)
    + Count('thumbs_meh', distinct=True),
    result=Case(
        When(combined=0, then=0),
        default=((F('up') + (F('meh') / 2)) / (1.0 * F('combined'))) * 100,
    ),
).filter(result__gte=rating_min)
I added another model field for 'meh' votes, hence the addition to the query.
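A likely reason the earlier attempts behaved as all-or-nothing is integer division in the database: dividing one integer count by another truncates, so anything below 100% becomes 0 before it is multiplied by 100. The sketch below shows the effect with the standard-library sqlite3 module; the tally table is hypothetical and simply holds precomputed counts, and other databases may behave differently (for example, raising an error on division by zero instead of returning NULL).

```python
import sqlite3

# Hypothetical table holding precomputed vote counts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tally (id INTEGER PRIMARY KEY, up INTEGER, combined INTEGER);
    INSERT INTO tally (up, combined) VALUES (3, 4), (4, 4), (0, 0);
""")

rows = conn.execute("""
    SELECT up / combined * 100 AS int_percent,
           CASE WHEN combined = 0 THEN 0
                ELSE up / (1.0 * combined) * 100
           END AS float_percent
      FROM tally
     ORDER BY id
""").fetchall()

for int_percent, float_percent in rows:
    print(int_percent, float_percent)
```

The first column collapses 3 of 4 to 0, which matches the observation that only entries rated 100% survive a minimum filter. Multiplying by 1.0 forces floating-point division, and the CASE guard covers the zero-vote row.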

You can do something similar to
Vote.objects.filter(percentage__range=(min_perct, max_perct))
although this requires a separate field on the model. Try adding:
class Vote(models.Model):
    # ...
    percentage = models.DecimalField(default=0, max_digits=5, decimal_places=2)

# Remember to update this field after every update!
# thumbs_up and thumbs_down stand for the vote counts here
v = Vote()
v.percentage = thumbs_up / (thumbs_up + thumbs_down)
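If you go with a stored field, the value assigned to it could be computed by a small helper like the one below. This is only a sketch: the function name vote_percentage is made up, it takes plain integer counts, it scales to 0 to 100 (unlike the bare ratio above), and it returns 0 when there are no votes to avoid dividing by zero.

```python
from decimal import Decimal, ROUND_HALF_UP


def vote_percentage(thumbs_up: int, thumbs_down: int) -> Decimal:
    """Return the share of up-votes as a percentage with two decimals."""
    total = thumbs_up + thumbs_down
    if total == 0:
        # No votes yet: report 0 rather than dividing by zero.
        return Decimal("0.00")
    value = Decimal(thumbs_up) * 100 / Decimal(total)
    # Two decimal places fit a DecimalField(max_digits=5, decimal_places=2).
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(vote_percentage(3, 1))
print(vote_percentage(1, 2))
print(vote_percentage(0, 0))
```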

Django annotate fails when querying unique values

For a Django project I need to combine two parts lists into one.
models.py:
class UserBuild(models.Model):
    project = models.ForeignKey(Project)
    created = models.DateTimeField(auto_now_add=True)
    updated = models.DateTimeField(auto_now=True)
    part = models.ForeignKey(Parts)
    part_quantity = models.IntegerField(max_length=5, null=True, blank=True)
    suggested_quantity = models.IntegerField(
        max_length=5, null=True, blank=True
    )
views.py:
def CombineProjects(request, template_name='combined_projects.html'):
    ...
    build_set = UserBuild.objects.values(
        #'pk',
        #'project__pk',
        'part__number',
        'part__part_type__name',
        'part__price',
        'part__description',
        'part__category__name'
    ).filter(
        project__in=projects
    ).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
Basically, I want to group all parts that are the same and sum their quantities. The query above works, but if I uncomment either the pk or the project__pk argument, the parts are no longer grouped (I assume because those values differ even when the part is the same).
Is there some way to keep the grouping but also include the pk and project__pk values?

The problem is that your list of values is what you group and sum by; so you really want to group by project and part number only.
I would try:
build_set = UserBuild.objects.values(
    'project__pk',
    'part__number',
).filter(
    project__in=projects
).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
Then merge the other attributes into this list afterwards. I'm not sure this will work, however; you might need to do:
build_set = UserBuild.objects.values(
    'project_id',
    'part_id',
).filter(
    project__in=projects
).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
and do more of the work by hand afterwards.
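The "work by hand" step might look something like the following plain-Python sketch. The row shapes, the parts lookup dictionary, and all sample values are hypothetical; in practice the grouped rows would come from the query and the part details from a second query keyed by part id.

```python
from collections import defaultdict

# Hypothetical rows shaped like the output of
# .values('project_id', 'part_id').annotate(total=Sum('part_quantity')).
grouped = [
    {"project_id": 1, "part_id": 10, "total": 4},
    {"project_id": 2, "part_id": 10, "total": 1},
    {"project_id": 2, "part_id": 11, "total": 7},
]

# Hypothetical part details, e.g. loaded separately and keyed by part id.
parts = {
    10: {"number": "R-100", "price": 0.05},
    11: {"number": "C-220", "price": 0.12},
}

# Collapse the per-project rows into one entry per part, keeping track
# of which projects contributed to each total.
combined = defaultdict(lambda: {"total": 0, "project_ids": []})
for row in grouped:
    entry = combined[row["part_id"]]
    entry["total"] += row["total"]
    entry["project_ids"].append(row["project_id"])

# Merge the part attributes into each combined entry.
merged = [{**parts[part_id], **entry} for part_id, entry in sorted(combined.items())]

for item in merged:
    print(item)
```

This keeps one row per part with the summed quantity, while the project ids that would otherwise split the grouping are carried along as a list.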
