sqlalchemy-flask: How to count self relation - flask

I have the following model:
Base = declarative_base()

class Comment(Base):
    __tablename__ = "commenting_comment"
    id = Column(UUID, primary_key=True, index=True)
    text = Column(String)
    parent_id = Column(UUID, ForeignKey('commenting_comment.id'))
    parent = relationship('Comment', remote_side=[id], backref="replies")
I want to count the number of replies for each Comment.
Currently, I have the following query:
ids = ["uuid1", "uuid2", "uuid3"]
qs = db.query(
    Comment,
    func.count('id').label("replies_count"),
).select_from(Comment).group_by(Comment.id).filter(
    Comment.id.in_(ids)
).all()
There are two problems:
1- It returns replies_count=1 for all of the rows.
2- I want replies_count to be in the comment's body but it returns replies_count out of the comment body. (as follows)
comments = [
{"Comment": {...}, "replies_count": 1},
{"Comment": {...}, "replies_count": 1},
]
How can I solve these two problems?
UPDATE
In Django, the corresponding query would be:
Comment.objects.filter(id__in=ids).annotate(replies_count=Count("replies"))
But I don't know how to implement it in Flask and SQLAlchemy.
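One way to get the Django-style annotation in SQLAlchemy is to outer-join the table to an alias of itself and count the alias rows. The query in the question returns 1 everywhere because, without a join, each group holds exactly one row and func.count('id') just counts that row. The following is a minimal sketch, not a definitive implementation: it assumes integer ids instead of UUID so that it runs on in-memory SQLite, and comments_with_reply_counts and demo are hypothetical helper names. The count is merged into a plain dict per comment to address the second problem.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, aliased, declarative_base, relationship

Base = declarative_base()


class Comment(Base):
    __tablename__ = "commenting_comment"
    # Assumption: Integer replaces UUID so the sketch runs on SQLite.
    id = Column(Integer, primary_key=True)
    text = Column(String)
    parent_id = Column(Integer, ForeignKey("commenting_comment.id"))
    parent = relationship("Comment", remote_side=[id], backref="replies")


def comments_with_reply_counts(session, ids):
    """Return one dict per requested comment, with replies_count included."""
    reply = aliased(Comment)  # second reference to the same table
    rows = (
        session.query(Comment, func.count(reply.id).label("replies_count"))
        # Outer join keeps comments that have no replies (count becomes 0).
        .outerjoin(reply, reply.parent_id == Comment.id)
        .filter(Comment.id.in_(ids))
        .group_by(Comment.id)
        .order_by(Comment.id)
        .all()
    )
    return [
        {"id": comment.id, "text": comment.text, "replies_count": count}
        for comment, count in rows
    ]


def demo():
    """Build a tiny in-memory database and run the query on it."""
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add_all([
            Comment(id=1, text="root"),
            Comment(id=2, text="reply a", parent_id=1),
            Comment(id=3, text="reply b", parent_id=1),
            Comment(id=4, text="no replies"),
        ])
        session.commit()
        return comments_with_reply_counts(session, [1, 4])
```

Counting reply.id rather than a literal means unmatched rows from the outer join contribute nothing, which is what gives 0 for comments without replies.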

Related

Django DRF: how to groupby on a foreign fields?

I have a model where users can upvote other users for specific topics. Something like:
#models.py
class Topic(models.Model):
    # CharField replaces the non-existent StringField; max_length is an assumed value.
    name = models.CharField(max_length=255)

    def __str__(self):
        return str(self.name)

class UserUpvotes(models.Model):
    """Holds total upvotes by user and topic"""
    user = models.ForeignKey(User)
    topic = models.ForeignKey(Topic)
    upvotes = models.PositiveIntegerField(default=0)
Using DRF, I have an API that returns the following: topic_id, topic_name, and upvotes, which is the total upvotes for a given topic.
One of the project requirements is for the API to use these field names specifically: topic_id, topic_name, and upvotes
#serializers.py
class TopicUpvotesSerializer(serializers.ModelSerializer):
    topic_name = serializers.StringRelatedField(source="topic")

    class Meta:
        model = UserUpvotes
        fields = ["topic_id", "topic_name", "upvotes"]
My trouble is aggregating these fields. I'm filtering the UserUpvotes by user or team and then aggregating by topic.
Desired output
This is the result I want to get. When I don't perform any aggregations (and there are views where this will be the case), it works.
[
    {
        "topic_id": 3,
        "topic_name": "Korean Studies",
        "upvotes": 14
    },
    {
        "topic_id": 12,
        "topic_name": "Inflation",
        "upvotes": 3
    },
]
At first, I tried creating a TopicSerializer and then assigning it to the topic field in TopicUpvotesSerializer. But then the resulting JSON would have a nested "topic" field and the aggregation would fail.
Attempt 1
#views.py
def get_queryset(self):
    return (
        UserUpvotes.objects.filter(user__team=team)
        .values("topic")
        .annotate(upvotes=models.Sum("upvotes"))
        .order_by("-upvotes")
    )
My problem is that the topic_id and topic_name fields are not showing. I get something like:
[
    {
        "topic_name": "3",
        "upvotes": 14
    },
    {
        "topic_name": "12",
        "upvotes": 3
    },
]
Attempt 2
Another queryset attempt:
# views.py
def get_queryset(self):
    return (
        UserUpvotes.objects.filter(user__team=team)
        .values("topic__id", "topic__name")
        .annotate(upvotes=models.Sum("upvotes"))
        .order_by("-upvotes")
    )
Which yields:
[
    {
        "upvotes": 14
    },
    {
        "upvotes": 3
    },
]
The aggregation worked on the queryset level, but the serializer failed to find the correct fields.
Attempt 3
This was the closest I got:
# views.py
def get_queryset(self):
    return (
        UserUpvotes.objects.filter(user__team=team)
        .values("topic__id", "topic__name")
        .annotate(upvotes=models.Sum("upvotes"))
        .values("topic_id", "topic", "upvotes")
        .order_by("-upvotes")[:n]
    )
Which yields:
[
    {
        "topic_id": 3,
        "topic_name": "3",
        "upvotes": 14
    },
    {
        "topic_id": 12,
        "topic_name": "12",
        "upvotes": 3
    },
]
I have no idea why "topic_name" is simply transforming the "topic_id" into a string, instead of calling the string method.
Work with a serializer for the topic:
class TopicSerializer(serializers.ModelSerializer):
    upvotes = serializers.IntegerField(read_only=True)

    class Meta:
        model = Topic
        fields = ['id', 'name', 'upvotes']
Then, in the ModelViewSet, annotate the queryset:
from django.db.models import Sum
from rest_framework.viewsets import ModelViewSet

class TopicViewSet(ModelViewSet):
    serializer_class = TopicSerializer
    queryset = Topic.objects.annotate(upvotes=Sum('userupvotes__upvotes'))
Desired output
This is the result I want to get. When I don't perform any aggregations (and there are views where this will be the case), it works.
[
    {
        "topic_name": 3,
        "topic_name": "Korean Studies",
        "upvotes": 14
    },
    {
        "topic_name": 12,
        "topic_name": "Inflation",
        "upvotes": 3
    },
]
The serialized FK will always give you the ID of the related model. I am not sure why you name it topic_name if it is equal to an ID. If you really want the name field of the Topic model, then in topic_name = serializers.StringRelatedField(source="topic") you should set source="topic.name" instead.
However, if you are trying to get the ID of the relation, you can still use ModelSerializer:
class TopicUpvotesSerializer(serializers.ModelSerializer):
    class Meta:
        model = UserUpvotes
        fields = "__all__"
@willem-van-onsem's answer is the correct one for the problem as I had put it.
But... I had another use case (sorry! ◑﹏◐), where the Users API used the UserUpvotes serializer as a nested field. So I had to find another solution. This is what I eventually ended up with. I'm posting it in case it helps anyone.
class UserUpvotesSerializer(serializers.ModelSerializer):
    topic_name = serializers.SerializerMethodField()

    def get_topic_name(self, obj):
        try:
            # Model instance: follow the foreign key.
            _topic_name = obj.topic.name
        except (AttributeError, TypeError):
            # Row from values().annotate(): a plain dict, which raises
            # AttributeError on obj.topic.
            _topic_name = obj.get("topic__name", None)
        return _topic_name

    class Meta:
        model = UserUpvotes
        fields = ["topic_id", "topic_name", "upvotes"]
I still have no idea why the SerializerMethodField works and the StringRelatedField doesn't. It feels like a bug?
Anyway, the rub here is that, after the values().annotate() aggregation, obj is no longer a UserUpvotes instance but a dict. Attribute access such as obj.topic.name then fails on the dict, while subscripting a model instance gives a 'UserUpvotes' object is not subscriptable error.
I don't know if there are any other edge cases I should be aware of (this is when I REALLY miss type hints in Django), but it works so far.
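The difference between the two fields can be illustrated without Django. The sketch below is a plain-Python stand-in, not DRF code: it assumes that a string-related field simply returns str() of whatever value it is handed, and the names Topic, string_related and topic_name_of are hypothetical. After .values("topic"), the value handed over is the primary key, so str() gives "3" rather than the topic's name.

```python
from types import SimpleNamespace


class Topic:
    """Stand-in for the Topic model: str() returns the name."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    def __str__(self):
        return str(self.name)


def string_related(value):
    """Assumed behaviour of a string-related field: str() of the value."""
    return str(value)


def topic_name_of(obj):
    """Return the topic name for a values() dict or a model-like object."""
    if isinstance(obj, dict):
        # Row produced by values("topic__id", "topic__name").
        return obj.get("topic__name")
    # Model-like instance: follow the relation.
    return obj.topic.name


# A values() row carries only the key, so str() can only stringify the id.
row_after_values = {"topic": 3, "upvotes": 14}
# A model-like instance carries the related object itself.
instance = SimpleNamespace(topic=Topic(3, "Korean Studies"), upvotes=14)
```

Checking the type explicitly, instead of catching an exception, makes it obvious which of the two shapes each branch expects.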

Django Alternative To Inner Join When Annotating

I'm having some trouble generating an annotation for the following models:
class ResultCode(GenericSteamDataModel):
    id = models.IntegerField(db_column='PID')
    result_code = models.IntegerField(db_column='resultcode', primary_key=True)
    campaign = models.OneToOneField(SteamCampaign, db_column='campagnePID', on_delete=models.CASCADE)
    sale = models.BooleanField(db_column='ishit')
    factor = models.DecimalField(db_column='factor', max_digits=5, decimal_places=2)

    class Meta:
        managed = False
        constraints = [
            models.UniqueConstraint(fields=['result_code', 'campaign'], name='result_code per campaign unique')
        ]

class CallStatistics(GenericShardedDataModel, GenericSteamDataModel):
    objects = CallStatisticsManager()
    project = models.OneToOneField(SteamProject, primary_key=True, db_column='projectpid', on_delete=models.CASCADE)
    result_code = models.ForeignKey(ResultCode, db_column='resultcode', on_delete=models.CASCADE)

    class Meta:
        managed = False
The goal is to find the sum of factors based on the result_code field in the ResultCode and CallStatistics models, when sale=True.
Note that:
Result codes are not unique by themselves (as described in the model).
A Project has a relation to a Campaign.
The following annotation generates the desired result (a possible solution):
result = CallStatistics.objects.all().values('project').annotate(
    sales_factored=models.Sum(
        models.Case(
            models.When(
                models.Q(sale=True) & models.Q(project__campaign=models.F('result_code__campaign')),
                then=models.F('result_code__factor')
            )
        )
    )
)
The problem is that the generated query performs an inner join on result_code between the two models.
Trying to add another field to the same annotation (one that should not be joined with ResultCode), for example:
sales=models.Sum(Cast('sale', models.IntegerField())),
results in a wrong summation.
The question is whether there is an alternative to the automatic inner join that Django generates, so that the following fields (and similar ones) can be retrieved in one annotation:
...
sales=models.Sum(Cast('sale', models.IntegerField())),
sales_factored= [sum of factors, without Inner Join]
...
Thanks in advance for taking your time for this.
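The wrong summation is consistent with join fan-out: because a result code is not unique on its own, joining on it alone duplicates each CallStatistics row once per campaign that uses the code. The sketch below reproduces that effect in plain SQL through Python's sqlite3 module and compares it with a correlated subquery, which reads the factor without multiplying rows. It rests on assumptions: the table and column names are hypothetical, and the campaign is stored directly on call_stats instead of being reached through the project. In the ORM, the closest analogue of the second query would probably be a Subquery with OuterRef, which is not shown here.

```python
import sqlite3

# Hypothetical, simplified schema (campaign flattened onto call_stats).
SCHEMA = """
CREATE TABLE result_code (resultcode INTEGER, campaign INTEGER, factor REAL);
CREATE TABLE call_stats (project INTEGER, campaign INTEGER,
                         resultcode INTEGER, sale INTEGER);
-- The same result code exists in two campaigns with different factors.
INSERT INTO result_code VALUES (1, 10, 0.5), (1, 20, 2.0);
-- Two sales for project 100, both in campaign 10.
INSERT INTO call_stats VALUES (100, 10, 1, 1), (100, 10, 1, 1);
"""

# Joining on resultcode alone matches two result_code rows per call,
# so the plain SUM(cs.sale) is counted twice.
JOIN_QUERY = """
SELECT cs.project,
       SUM(cs.sale) AS sales,
       SUM(CASE WHEN cs.sale = 1 AND rc.campaign = cs.campaign
                THEN rc.factor END) AS sales_factored
FROM call_stats cs
JOIN result_code rc ON rc.resultcode = cs.resultcode
GROUP BY cs.project
"""

# A correlated subquery looks up the factor per row without adding rows.
SUBQUERY_QUERY = """
SELECT cs.project,
       SUM(cs.sale) AS sales,
       SUM(CASE WHEN cs.sale = 1 THEN (
               SELECT rc.factor FROM result_code rc
               WHERE rc.resultcode = cs.resultcode
                 AND rc.campaign = cs.campaign
           ) END) AS sales_factored
FROM call_stats cs
GROUP BY cs.project
"""


def run(query):
    """Execute a query against a fresh in-memory copy of the demo data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

With this data the joined query reports 4 sales for project 100 while the subquery version reports 2; both agree on the factored sum of 1.0.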

Filter annotate When Case with foreign key

I have 2 models:
class ServiceRequest(models.Model):
    post_time = models.DateTimeField(default=timezone.now)
    service = models.IntegerField(default=102)

class RequestSession(models.Model):
    request = models.ForeignKey(ServiceRequest)
    post_time = models.DateTimeField(default=timezone.now)
If service has the value 102, the request will have one or more RequestSession rows.
I have a queryset that gets all ServiceRequest objects: ServiceRequest.objects.all()
My question is: how do I order by post_time if service != 102, and by the greatest post_time of requestsession_set if service == 102?
I tried annotate, but I don't know how to get the greatest post_time of the related RequestSession rows inside When(..., then=...).
The query I tried:
queryset.annotate(
    time_filter=Case(
        When(service_id=102, then=('requestsession_set__post_time')),
        default=Value('post_time'),
        output_field=DateTimeField(),
    ),
).order_by("-time_filter")
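Try:
When(service=102, then=Max('requestsession__post_time')),
Notice that I have also replaced service_id with service, since service is simply an IntegerField in your example. The lookup uses requestsession rather than requestsession_set: without a related_name, requestsession_set is the reverse manager attribute, while the name used in query lookups is the lowercased model name.
Also, it should be default='post_time', without Value. See https://docs.djangoproject.com/en/2.2/ref/models/expressions/#value-expressions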
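For reference, the ordering this annotation is aiming for can be written directly in SQL. The sketch below uses Python's sqlite3 module with hypothetical table names and hand-made rows; it is an illustration of the intended semantics, not the exact SQL Django generates. Requests with service 102 are ranked by their newest session, all others by their own post_time.

```python
import sqlite3

# Hypothetical tables mirroring ServiceRequest and RequestSession.
SCHEMA = """
CREATE TABLE service_request (id INTEGER PRIMARY KEY, service INTEGER, post_time TEXT);
CREATE TABLE request_session (id INTEGER PRIMARY KEY, request_id INTEGER, post_time TEXT);
INSERT INTO service_request VALUES
    (1, 102, '2020-01-01'),
    (2, 5,   '2020-01-04'),
    (3, 102, '2020-01-10');
INSERT INTO request_session VALUES
    (1, 1, '2020-01-05'),
    (2, 1, '2020-01-03'),
    (3, 3, '2020-01-02');
"""

# CASE picks the newest session time for service 102 and the request's
# own post_time otherwise; LEFT JOIN keeps requests without sessions.
ORDER_QUERY = """
SELECT sr.id,
       CASE WHEN sr.service = 102 THEN MAX(rs.post_time)
            ELSE sr.post_time END AS time_filter
FROM service_request sr
LEFT JOIN request_session rs ON rs.request_id = sr.id
GROUP BY sr.id, sr.service, sr.post_time
ORDER BY time_filter DESC
"""


def ordered_requests():
    """Return (id, time_filter) rows, newest first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        return conn.execute(ORDER_QUERY).fetchall()
    finally:
        conn.close()
```

Request 3 has the newest post_time of its own but sorts last, because for service 102 only its session time counts.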

Django get foreign key object inside Queryset

I have 3 models:
class Event(models.Model):
    cur_datetime = models.DateTimeField(default=datetime.datetime(1970, 1, 1, 0, 0, 0, 0, pytz.UTC), blank=True, null=True)
    week_no = models.IntegerField()
    day_no = models.IntegerField()

class SubEvent(models.Model):
    event = models.ForeignKey(Event, on_delete=models.CASCADE)
    version = models.IntegerField()

class SubSubEvent(models.Model):
    sub_event = models.ForeignKey(SubEvent, on_delete=models.CASCADE)
    length = models.IntegerField()
I want to get a Queryset from SubSubEvent model, which includes all the Foreign keys as one single object. What I have now is:
querySet = SubSubEvent.objects.filter(sub_event__event__cur_datetime__range=[from_date, to_date])
This will return a queryset, and using a for loop to get __dict__ on each of objects, I get something like this:
{'event_id': 1, '_state': <django.db.models.base.ModelState object at 0x7fd7d9cefeb8>, 'id': 10, 'length': '1'}
This is just a part of the query I want to achieve. What I really want is all the fields behind event_id instead of just the id number. In other words, all the fields (including data) from Event plus SubEvent plus SubSubEvent in one queryset. This queryset should contain objects with cur_datetime, week_no, day_no, version and length.
It sounds like you're looking for select_related().
qs = SubSubEvent.objects \
    .select_related('sub_event__event') \
    .filter(sub_event__event__cur_datetime__range=[from_date, to_date])
You can then access the related SubEvent and Event resources without hitting the database.
sub_sub_event = qs[0]
sub_event = sub_sub_event.sub_event # doesn't hit the database
event = sub_sub_event.sub_event.event # doesn't hit the database

Querying data from Django

Here's what my model structure looks like:
class Visitor(models.Model):
    id = models.AutoField(primary_key=True)

class Session(models.Model):
    id = models.AutoField(primary_key=True)
    visit = models.ForeignKey(Visitor)
    sequence_no = models.IntegerField(null=False)

class Track(models.Model):
    id = models.AutoField(primary_key=True)
    session = models.ForeignKey(Session)
    # Referenced by name because Action is defined further down.
    action = models.ForeignKey('Action')
    when = models.DateTimeField(null=False, auto_now_add=True)
    sequence_no = models.IntegerField(null=False)

class Action(models.Model):
    id = models.AutoField(primary_key=True)
    url = models.CharField(max_length=65535, null=False)
    host = models.IntegerField(null=False)
As you can see, each Visitor has multiple Sessions; each Session has multiple Tracks and each Track has one Action. Tracks are always ordered in ascending order by session and sequence_no. A Visitor's average time on a site (i.e. a particular Action.host) is the difference in Track.when (time) between the highest and lowest Track.sequence_no, divided by the number of Sessions of that Visitor.
I need to calculate the average time of visitors on the site, which is the sum of the time for each visitor on the Action.host divided by the number of visitors.
I could query this using SQL, but I'd like to keep my query as Djangonic as possible and I'm still very lost with complex queries.
For a specific Action object you can gather interesting data about Sessions:
from django.db.models import Min, Max
from yourapp.models import *

host = 1  # I suppose you want to calculate for each site
sessions = list(Session.objects.filter(
    track__action__host=host,
).annotate(
    start=Min('track__when'),
    end=Max('track__when'),
).values('visit_id', 'start', 'end'))
You will get something in the line of:
[
    { 'visit_id': 1, 'start': datetime(...), 'end': datetime(...) },
    { 'visit_id': 1, 'start': datetime(...), 'end': datetime(...) },
    { 'visit_id': 2, 'start': datetime(...), 'end': datetime(...) },
    ....
]
Now it's only a matter of getting the desired result from the data:
number_of_visitors = len(set(s['visit_id'] for s in sessions))
total_time = sum((s['end'] - s['start']).total_seconds() for s in sessions)
average_time_spent = total_time / number_of_visitors
Another way is to use two queries instead of one, and avoid the len(set(...)) snippet:
sessions = Session.objects.filter(
    track__action__host=host,
).annotate(
    start=Min('track__when'),
    end=Max('track__when'),
)
number_of_visitors = sessions.values('visit_id').distinct().count()
total_time = sum((s['end'] - s['start']).total_seconds()
                 for s in sessions.values('start', 'end'))
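In the Django version this was written for, there is no way to do calculated fields beyond the provided aggregations, so you either do it in raw SQL or in code like this. (Since Django 1.8, F() expressions inside annotate() cover some of these cases.)
At least the proposed solution uses Django's ORM as far as possible.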
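The arithmetic of the first variant can be checked on hand-made rows shaped like the values('visit_id', 'start', 'end') output. This is a small sketch: average_time_spent is a hypothetical helper name and the sample timestamps are invented for illustration.

```python
from datetime import datetime


def average_time_spent(sessions):
    """Average seconds on site per distinct visitor.

    `sessions` is a list of dicts with 'visit_id', 'start' and 'end' keys.
    """
    visitors = {s["visit_id"] for s in sessions}
    if not visitors:
        return 0.0  # avoid dividing by zero when there is no data
    total_seconds = sum(
        (s["end"] - s["start"]).total_seconds() for s in sessions
    )
    return total_seconds / len(visitors)


# Invented sample data: visitor 1 has two sessions (60 s and 120 s),
# visitor 2 has one session (30 s).
sample_sessions = [
    {"visit_id": 1, "start": datetime(2020, 1, 1, 10, 0, 0),
     "end": datetime(2020, 1, 1, 10, 1, 0)},
    {"visit_id": 1, "start": datetime(2020, 1, 2, 9, 0, 0),
     "end": datetime(2020, 1, 2, 9, 2, 0)},
    {"visit_id": 2, "start": datetime(2020, 1, 3, 8, 0, 0),
     "end": datetime(2020, 1, 3, 8, 0, 30)},
]
```

For the sample rows the total is 210 seconds across 2 visitors, so the function returns 105.0.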