Django: is a complex select query possible for this?

I have the following model used to store a bidirectional relationship between two users. The records are always inserted where the smaller user id is user_a while the larger user id is user_b.
Is there a way to retrieve all records belonging to a reference user, together with the correct value of the status (negating relationship_type when the reference user is user_a), depending on whether the reference user's id is larger or smaller than the other user's id?
Perhaps two separate queries, one where reference user = user_a and another where reference user = user_b, followed by a union?
class Relationship(models.Model):
    RELATIONSHIP_CHOICES = (
        (0, 'Blocked'),
        (1, 'Allowed'),
        (-2, 'Pending_A'),
        (2, 'Pending_B'),
        (-3, 'Blocked_A'),
        (3, 'Blocked_B'),
    )
    user_a = models.ForeignKey(CustomUser, on_delete=models.SET_NULL, related_name='user_a', null=True)
    user_b = models.ForeignKey(CustomUser, on_delete=models.SET_NULL, related_name='user_b', null=True)
    relationship_type = models.SmallIntegerField(choices=RELATIONSHIP_CHOICES, default=0)
A SQL query of what I'm trying to achieve:
(SELECT user_b as user_select, -relationship_type as type_select WHERE user_a='reference_user') UNION (SELECT user_a as user_select, relationship_type as type_select WHERE user_b='reference_user')
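The SQL above can be tried outside Django against an in-memory SQLite database. This is only a sketch: the table name relationship, the integer columns, and the sample rows are assumptions made for illustration, and a FROM clause is added so that the statement runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationship ("
    "user_a INTEGER, user_b INTEGER, relationship_type INTEGER)"
)
# Rows are stored with the smaller user id in user_a (made-up sample data).
conn.executemany(
    "INSERT INTO relationship VALUES (?, ?, ?)",
    [(1, 5, 2), (5, 9, -3), (2, 3, 1)],
)


def relationships_for(reference_user):
    """Return (other_user, type) pairs as seen from reference_user's side."""
    sql = """
        SELECT user_b AS user_select, -relationship_type AS type_select
        FROM relationship WHERE user_a = ?
        UNION
        SELECT user_a AS user_select, relationship_type AS type_select
        FROM relationship WHERE user_b = ?
        ORDER BY user_select
    """
    return conn.execute(sql, (reference_user, reference_user)).fetchall()


rows = relationships_for(5)
```

Here rows holds one pair per relationship of user 5, with the type negated only for the record in which user 5 is stored as user_a.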

Given you have the id of the user user_id, you can filter with:
from django.db.models import Q
Relationship.objects.filter(Q(user_a_id=user_id) | Q(user_b_id=user_id))
If you have a CustomUser object user, it is almost the same:
from django.db.models import Q
Relationship.objects.filter(Q(user_a=user) | Q(user_b=user))
If you are looking to obtain Relationships with a given type, we can do the following:
from django.db.models import Q
rel_type = 2 # example rel_type
Relationship.objects.filter(
    Q(user_a=user, relationship_type=rel_type) |
    Q(user_b=user, relationship_type=-rel_type)
)
Here we thus retrieve Relationship objects with user_a the given user and relationship_type=2, or Relationship objects with user_b the given user, and relationship_type=-2.
We could annotate the querysets, and then take the union, like:
from django.db.models import F
qs1 = Relationship.objects.filter(
    user_a=user, relationship_type=rel_type
).annotate(
    user_select=F('user_b'),
    rel_type=F('relationship_type')
)
# second half: the given user is user_b, so the stored type is negated
qs2 = Relationship.objects.filter(
    user_b=user, relationship_type=-rel_type
).annotate(
    user_select=F('user_a'),
    rel_type=-F('relationship_type')
)
qs = qs1.union(qs2)
Although I do not know if that is a good idea: the annotations are not "writable" (so you cannot update them).
It might be better to implement some sort of "proxy object" that can swap user_a and user_b, and negate the relationship type, and thus is able to act as if it is a real Relationship object.
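A minimal sketch of such a "proxy object" in plain Python, for illustration only. The names RelationshipRow and RelationshipView are hypothetical stand-ins (not Django models), and the sign convention follows the SQL in the question, where the type is negated when the reference user is user_a.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipRow:
    """Stand-in for a stored Relationship record (hypothetical)."""
    user_a_id: int
    user_b_id: int
    relationship_type: int


@dataclass(frozen=True)
class RelationshipView:
    """A stored record as seen from one participant's side."""
    row: RelationshipRow
    reference_user_id: int

    def __post_init__(self):
        if self.reference_user_id not in (self.row.user_a_id, self.row.user_b_id):
            raise ValueError("reference user is not part of this relationship")

    @property
    def other_user_id(self):
        # Swap sides depending on where the reference user is stored.
        if self.reference_user_id == self.row.user_a_id:
            return self.row.user_b_id
        return self.row.user_a_id

    @property
    def relationship_type(self):
        # Assumption: negate when the reference user is user_a.
        if self.reference_user_id == self.row.user_a_id:
            return -self.row.relationship_type
        return self.row.relationship_type


row = RelationshipRow(user_a_id=1, user_b_id=5, relationship_type=2)
seen_by_a = RelationshipView(row, reference_user_id=1)
seen_by_b = RelationshipView(row, reference_user_id=5)
```

The stored record stays untouched, so updates can still go through the real object while the view only handles the swap and the sign.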

As you said, the id in user_a is always smaller than the id in user_b. So if you query with user_b=user, you get the records in which the reference user's id is the larger one, and with user_a=user those in which it is the smaller one. So I think you can use the following querysets:
user = CustomUser.objects.get(id=1)
user_a_references = Relationship.objects.filter(user_a=user)
user_b_references = Relationship.objects.filter(user_b=user)
all_relation_ships = user_a_references.union(user_b_references)

Related

How to apply an arbitrary filter on a specific chained prefetch_related() within Django?

I'm trying to reduce the number of queries fired by an API. I have four models, namely User, Content, Rating, and UserRating, with some relations between them. I want the API to return all existing contents along with their rating count and the score a specific user has given to each.
I used to use Content.objects.all() as the queryset, but I realized that with a large amount of data this fires a huge number of queries. So I've made some effort to optimize them using select_related() and prefetch_related(). However, I'm still left with an extra search in Python that I hope to remove by using a controlled prefetch_related(), that is, by applying a filter to just one specific prefetch in a nested prefetch and select.
Here are my models:
from django.db import models
from django.conf import settings
class Content(models.Model):
    title = models.CharField(max_length=50)
class Rating(models.Model):
    count = models.PositiveBigIntegerField(default=0)
    content = models.OneToOneField(Content, on_delete=models.CASCADE)
class UserRating(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL, blank=True, null=True, on_delete=models.CASCADE
    )
    score = models.PositiveSmallIntegerField()
    rating = models.ForeignKey(
        Rating, related_name="user_ratings", on_delete=models.CASCADE
    )
    class Meta:
        unique_together = ["user", "rating"]
Here's what I've done so far:
contents = (
    Content.objects.select_related("rating")
    .prefetch_related("rating__user_ratings")
    .prefetch_related("rating__user_ratings__user")
)
for c in contents:  # serializer like
    user_rating = c.rating.user_ratings.all()
    for u in user_rating:  # how to remove this dummy search?
        if u.user_id == 1:
            print(u.score)
Queries:
(1) SELECT "bitpin_content"."id", "bitpin_content"."title", "bitpin_rating"."id", "bitpin_rating"."count", "bitpin_rating"."content_id" FROM "bitpin_content" LEFT OUTER JOIN "bitpin_rating" ON ("bitpin_content"."id" = "bitpin_rating"."content_id"); args=(); alias=default
(2) SELECT "bitpin_userrating"."id", "bitpin_userrating"."user_id", "bitpin_userrating"."score", "bitpin_userrating"."rating_id" FROM "bitpin_userrating" WHERE "bitpin_userrating"."rating_id" IN (1, 2); args=(1, 2); alias=default
(3) SELECT "users_user"."id", "users_user"."password", "users_user"."last_login", "users_user"."is_superuser", "users_user"."first_name", "users_user"."last_name", "users_user"."email", "users_user"."is_staff", "users_user"."is_active", "users_user"."date_joined", "users_user"."user_name" FROM "users_user" WHERE "users_user"."id" IN (1, 4); args=(1, 4); alias=default
As you can see from the queries above, I now have only three queries rather than the many that were fired before. However, I guess I can remove the Python search (the second for loop) by filtering the last query, i.e. "users_user"."id" IN (1,) instead. According to this post and my efforts, I couldn't apply a .filter(rating__user_ratings__user_id=1) to the third query. Actually, I couldn't adapt the Prefetch(..., queryset=...) instance given in this answer to my problem.
I think you are looking for Prefetch object:
https://docs.djangoproject.com/en/4.0/ref/models/querysets/#prefetch-objects
Try this:
from django.db.models import Prefetch
contents = Content.objects.select_related("rating").prefetch_related(
    Prefetch(
        "rating__user_ratings",
        queryset=UserRating.objects.filter(user__id=1),
        to_attr="user_rating_number_1",
    )
)
for c in contents:  # serializer like
    print(c.rating.user_rating_number_1[0].score)
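One caveat with the loop above: with to_attr the prefetched rows are stored as a plain list, so for a content the user has not rated the list is empty and [0] raises IndexError. A small guard could look like the sketch below; first_score is a hypothetical helper name, and SimpleNamespace objects stand in for UserRating rows so the snippet runs without Django.

```python
from types import SimpleNamespace


def first_score(user_ratings, default=None):
    """Return the score of the first prefetched rating, or default if none."""
    first = next(iter(user_ratings), None)
    return default if first is None else first.score


# Stand-ins for the lists that to_attr would place on each Rating.
rated = [SimpleNamespace(score=4)]
unrated = []

score_when_rated = first_score(rated)
score_when_unrated = first_score(unrated)
```

In the loop this would be used as first_score(c.rating.user_rating_number_1). Because of unique_together on user and rating, the list holds at most one entry per user.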

Django attribute of most recent reverse relation

I have two models:
class Test(models.Model):
    test_id = models.CharField(max_length=20, unique=True, db_index=True)
class TestResult(models.Model):
    test = models.ForeignKey("Test", to_field="test_id", on_delete=models.CASCADE)
    status = models.CharField(max_length=30, choices=status_choices)
with status_choices as an enumeration of tuples of strings.
Some Test objects may have zero related TestResult objects, but most have at least one.
I want to filter Test objects based on their most recent TestResult status.
I have tried this:
queryset = Test.objects.all()
queryset = queryset.annotate(most_recent_result_pk=Max("testresult__pk"))
queryset = queryset.annotate(
    current_status=Subquery(
        TestResult.objects.filter(pk=OuterRef("most_recent_result_pk")).values("status")[:1]
    )
)
But I get the error:
column "u0.status" must appear in the GROUP BY clause or be used in an
aggregate function LINE 1: ...lts_testresult"."id") AS
"most_recent_result_pk", (SELECT U0."status...
I can find the most recent TestResult object fine with the first annotation of the pk, but the second annotation breaks everything. It seems like it ought to be easy to find an attribute of the TestResult object, once its pk is known. How can I do this?
You can do this with one subquery, without annotating this first:
from django.db.models import OuterRef, Subquery
queryset = Test.objects.annotate(
    current_status=Subquery(
        TestResult.objects.filter(
            # the ForeignKey uses to_field="test_id", so match on that field
            test_id=OuterRef('test_id')
        ).order_by('-pk').values('status')[:1]
    )
)
This will generate a query that looks like:
SELECT test.*,
       (SELECT U0.status
        FROM testresult U0
        WHERE U0.test_id = test.test_id
        ORDER BY U0.id DESC
        LIMIT 1
       ) AS current_status
FROM test
or without subquery:
from django.db.models import F, Max
queryset = Test.objects.annotate(
    max_testresult=Max('testresult__test__testresult__pk')
).filter(
    testresult__pk=F('max_testresult')
).annotate(
    current_status=F('testresult__status')
)
That being said, ordering by primary key is not a good idea to retrieve the latest object. You can see primary keys as "blackboxes" that simply hold a value to refer to it.
It is often better to use a column that stores the timestamp:
class TestResult(models.Model):
    test = models.ForeignKey("Test", to_field="test_id", on_delete=models.CASCADE)
    status = models.CharField(max_length=30, choices=status_choices)
    created = models.DateTimeField(auto_now_add=True)
and then query with:
from django.db.models import OuterRef, Subquery
queryset = Test.objects.annotate(
    current_status=Subquery(
        TestResult.objects.filter(
            # the ForeignKey uses to_field="test_id", so match on that field
            test_id=OuterRef('test_id')
        ).order_by('-created').values('status')[:1]
    )
)
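To see why ordering by a timestamp can give a different answer than ordering by primary key, the correlated subquery can be run directly against an in-memory SQLite database. This is a sketch under assumptions: the table and column names and the sample rows are made up, and testresult.test_id is taken to store the value of test.test_id, as the to_field in the models suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER PRIMARY KEY, test_id TEXT UNIQUE);
    CREATE TABLE testresult (
        id INTEGER PRIMARY KEY,
        test_id TEXT REFERENCES test (test_id),
        status TEXT,
        created TEXT
    );
    INSERT INTO test (id, test_id) VALUES (1, 'T-1'), (2, 'T-2'), (3, 'T-3');
    -- Result 2 was inserted later but describes an earlier run.
    INSERT INTO testresult (id, test_id, status, created) VALUES
        (1, 'T-1', 'passed', '2024-01-02'),
        (2, 'T-1', 'failed', '2024-01-01'),
        (3, 'T-2', 'failed', '2024-01-03');
""")


def current_statuses(order_column):
    """Latest status per test, where 'latest' is decided by order_column."""
    # order_column is interpolated into the SQL, so only allow known names.
    if order_column not in ("id", "created"):
        raise ValueError(order_column)
    sql = f"""
        SELECT test.test_id,
               (SELECT U0.status
                FROM testresult U0
                WHERE U0.test_id = test.test_id
                ORDER BY U0.{order_column} DESC
                LIMIT 1) AS current_status
        FROM test
        ORDER BY test.id
    """
    return conn.execute(sql).fetchall()


by_created = current_statuses("created")
by_pk = current_statuses("id")
```

With this sample data the two orderings disagree for T-1, and T-3 (no results at all) gets NULL as its current status.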

How can I filter a Django queryset by the latest of a related model?

Imagine I have the following 2 models in a contrived example:
class User(models.Model):
    name = models.CharField()
class Login(models.Model):
    user = models.ForeignKey(User, related_name='logins')
    success = models.BooleanField()
    datetime = models.DateTimeField()
    class Meta:
        get_latest_by = 'datetime'
How can I get a queryset of Users that only contains users whose last login was not successful?
I know the following does not work, but it illustrates what I want to get:
User.objects.filter(login__latest__success=False)
I'm guessing I can do it with Q objects, and/or Case When, and/or some other form of annotation and filtering, but I can't suss it out.
We can use a Subquery here:
from django.db.models import OuterRef, Subquery
latest_login = Subquery(
    Login.objects.filter(
        user=OuterRef('pk')
    ).order_by('-datetime').values('success')[:1]
)
User.objects.annotate(
    latest_login=latest_login
).filter(latest_login=False)
This will generate a query that looks like:
SELECT auth_user.*, (
    SELECT U0.success
    FROM login U0
    WHERE U0.user_id = auth_user.id
    ORDER BY U0.datetime DESC
    LIMIT 1
) AS latest_login
FROM auth_user
WHERE (
    SELECT U0.success
    FROM login U0
    WHERE U0.user_id = auth_user.id
    ORDER BY U0.datetime DESC
    LIMIT 1
) = False
So the outcome of the Subquery is the success of the latest Login object, and if that is False, we add the related User to the QuerySet.
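The behaviour of that WHERE clause can be checked with a small in-memory SQLite database. This is only a sketch: the sample users and logins are made up, and success is stored as 0/1 because SQLite has no separate boolean type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE login (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES auth_user (id),
        success INTEGER,
        datetime TEXT
    );
    INSERT INTO auth_user (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO login (user_id, success, datetime) VALUES
        (1, 1, '2024-01-01 09:00'),
        (1, 0, '2024-01-02 09:00'),
        (2, 0, '2024-01-01 09:00'),
        (2, 1, '2024-01-02 09:00');
""")

sql = """
    SELECT auth_user.name
    FROM auth_user
    WHERE (
        SELECT U0.success
        FROM login U0
        WHERE U0.user_id = auth_user.id
        ORDER BY U0.datetime DESC
        LIMIT 1
    ) = 0
    ORDER BY auth_user.id
"""
users_with_failed_last_login = [name for (name,) in conn.execute(sql)]
```

Only alice is returned: bob's most recent login succeeded, and carol has no logins, so her subquery yields NULL and the comparison filters her out.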
You can first annotate the max dates, and then filter based on success and the max date using F expressions:
from django.db.models import F, Max
User.objects.annotate(
    max_date=Max('logins__datetime')
).filter(logins__datetime=F('max_date'), logins__success=False)
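To check the boolean, use success=False, and to get the latest record, use latest(). Since success is a field of Login and get_latest_by is set on Login, the query has to start from Login; note that it returns the most recent failed Login, not the users whose last login failed:
Login.objects.filter(success=False).latest()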

Getting distinct objects of a queryset from a reverse relation in Django

class Customer(models.Model):
    name = models.CharField(max_length=189)
class Message(models.Model):
    message = models.TextField()
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE, related_name="messages")
    created_at = models.DateTimeField(auto_now_add=True)
What I want here is a queryset of distinct Customers ordered by Message.created_at. My database is MySQL.
I have tried the following.
qs = Customer.objects.all().order_by("-messages__created_at").distinct()
m = Message.objects.all().values("customer").distinct().order_by("-created_at")
m = Message.objects.all().order_by("-created_at").values("customer").distinct()
In the end, I used a set to accomplish this, but I think I might be missing something. My current solution:
customers = set(Interaction.objects.all().values_list("customer").distinct())
customer_list = list()
for c in customers:
    customer_list.append(c[0])
EDIT
Is it possible to get a list of customers ordered by their last message time, where the queryset also contains the last message value as another field?
Based on your comment you want to order the customers based on their latest message. We can do so by annotating the Customers and then sort on the annotation:
from django.db.models import Max
Customer.objects.annotate(
    last_message=Max('messages__created_at')
).order_by("-last_message")
A potential problem is what to do for Customers that have written no message at all. In that case the last_message attribute will be NULL (None) in Python. We can specify this with nulls_first or nulls_last in the .order_by of an F-expression. For example:
from django.db.models import F, Max
Customer.objects.annotate(
    last_message=Max('messages__created_at')
).order_by(F('last_message').desc(nulls_last=True))
A nice bonus is that the Customer objects of this queryset will have an extra attribute: the .last_message attribute will specify what the last time was when the user has written a message.
You can also decide to filter them out, for example with:
from django.db.models import F, Max
Customer.objects.filter(
    messages__isnull=False,
).annotate(
    last_message=Max('messages__created_at')
).order_by('-last_message')
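The EDIT in the question also asks for the text of the last message as an extra field. At the SQL level one way to get it is a correlated subquery next to the MAX aggregate. The sketch below runs that against an in-memory SQLite database rather than through the ORM; the table names, column names, and sample rows are assumptions for illustration. The "IS NULL" sort key is a portable way to put customers without messages last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE message (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (id),
        message TEXT,
        created_at TEXT
    );
    INSERT INTO customer (id, name) VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO message (customer_id, message, created_at) VALUES
        (1, 'hi', '2024-01-01'),
        (1, 'bye', '2024-01-05'),
        (2, 'yo', '2024-01-03');
""")

sql = """
    SELECT c.name,
           MAX(m.created_at) AS last_message_at,
           (SELECT m2.message
            FROM message m2
            WHERE m2.customer_id = c.id
            ORDER BY m2.created_at DESC
            LIMIT 1) AS last_message_text
    FROM customer c
    LEFT JOIN message m ON m.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY MAX(m.created_at) IS NULL, MAX(m.created_at) DESC
"""
customers_by_last_message = conn.execute(sql).fetchall()
```

Each customer appears exactly once, most recent message first, and Cy (no messages) comes last with NULL in both extra columns.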

Annotate django query if filtered row exists in second table

I have two tables (similar to the ones below):
class Piece(models.Model):
    cost = models.IntegerField(default=50)
    piece = models.CharField(max_length=256)
class User_Piece(models.Model):
    user = models.ForeignKey(User)
    piece = models.ForeignKey(Piece)
I want to do a query that returns all items in Piece, but annotates each row with whether or not the logged in user owns that piece (so there exists a row in User_Piece where user is the logged in user).
I tried:
pieces = Piece.objects.annotate(owned=Count('user_piece__id'))
But it puts a count > 0 for any piece that is owned by any user. I'm not sure where/how to add the condition that the user_piece must belong to the user I want. If I filter on user_piece__user=user, then I don't get all the rows from Piece, only those that are owned.
You could use the Exists subquery wrapper:
from django.db.models import Exists, OuterRef
subquery = User_Piece.objects.filter(user=user, piece=OuterRef('pk'))
Piece.objects.annotate(owned=Exists(subquery))
https://docs.djangoproject.com/en/dev/ref/models/expressions/#exists-subqueries
In newer versions of Django, you can do:
from django.db.models import Exists, OuterRef
pieces = Piece.objects.annotate(
    owned=Exists(User_Piece.objects.filter(piece=OuterRef('id'), user=request.user))
)
for piece in pieces:
    print(piece.owned)  # prints True or False
Of course, you can replace the name owned with any name you want.
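What the Exists annotation boils down to in SQL can be tried with an in-memory SQLite database. This is a sketch with assumed table names, column names, and sample rows, not the exact query Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE piece (id INTEGER PRIMARY KEY, piece TEXT, cost INTEGER);
    CREATE TABLE user_piece (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        piece_id INTEGER REFERENCES piece (id)
    );
    INSERT INTO piece (id, piece, cost) VALUES
        (1, 'pawn', 50), (2, 'rook', 50), (3, 'king', 50);
    INSERT INTO user_piece (user_id, piece_id) VALUES (1, 1), (2, 1), (2, 2);
""")


def pieces_with_ownership(user_id):
    """Return every piece with a flag telling whether user_id owns it."""
    sql = """
        SELECT p.piece,
               EXISTS (
                   SELECT 1
                   FROM user_piece up
                   WHERE up.piece_id = p.id AND up.user_id = ?
               ) AS owned
        FROM piece p
        ORDER BY p.id
    """
    # SQLite returns 0/1 for EXISTS, so convert to bool for readability.
    return [(name, bool(owned)) for name, owned in conn.execute(sql, (user_id,))]


owned_by_user_1 = pieces_with_ownership(1)
```

All pieces are returned regardless of ownership, and pieces owned only by other users are flagged False, which is the difference from the Count annotation in the question.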
Easy approach, but be careful with performance:
pk_pieces = (
    User_Piece.objects
    .filter(user=user)
    .values_list('piece_id', flat=True)
    .distinct()
)
pieces = Piece.objects.filter(id__in=pk_pieces)
Also, notice that you have an n:m relationship, so you can rewrite the models as:
class Piece(models.Model):
    cost = models.IntegerField(default=50)
    piece = models.CharField(max_length=256)
    users = models.ManyToManyField(User, through='User_Piece',  # <- HERE!
                                   related_name='pieces')  # <- HERE!
And get the user's pieces as:
pieces = currentLoggedUser.pieces.all()