Django 2.0.2, Python 3.4
models.py
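from django.db import models

class Post(models.Model):
    Id = models.AutoField(primary_key=True)
    content = models.TextField()

class Reply(models.Model):
    Id = models.AutoField(primary_key=True)
    PostId = models.ForeignKey(Post, on_delete=models.CASCADE)
    content = models.TextField()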
view.py
Post.objects.all().annotate(lastreply=F("Reply__content__last"))
Can a "last" lookup like this be used inside F()?
As far as I know, latest cannot be used with F().
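One possible solution is to include a timestamp in the Reply class:
class Post(models.Model):
    Id = models.AutoField(primary_key=True)
    content = models.TextField()

class Reply(models.Model):
    Id = models.AutoField(primary_key=True)
    PostId = models.ForeignKey(Post, on_delete=models.CASCADE)
    content = models.TextField()
    timestamp = models.DateTimeField(auto_now_add=True)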
Then you can use a query of this format to get the latest reply for each post.
from django.db.models import F, Max
Reply.objects.annotate(max_time=Max('PostId__reply__timestamp')).filter(timestamp=F('max_time'))
Note that this can be very slow for a large number of records.
If you are using PostgreSQL, you can use distinct() with a field name instead:
Reply.objects.order_by('PostId', '-timestamp').distinct('PostId')
An F() expression has no way to do that, but Django has another way to handle it: subquery expressions.
https://docs.djangoproject.com/en/2.0/ref/models/expressions/#subquery-expressions
For this problem, the code below works:
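from django.db.models import OuterRef, Subquery

# Replies for the current outer Post, newest first
sub_qs = Reply.objects.filter(
    PostId=OuterRef('pk')
).order_by('-timestamp')

# Annotate each Post with the content of its newest reply
qs = Post.objects.annotate(
    last_reply_content=Subquery(
        sub_qs.values('content')[:1]))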
How does it work?
sub_qs is the queryset on the related model, from which we want only the last reply for each post. The filter on OuterRef('pk') restricts it to the replies that belong to the current post of the outer query, and order_by('-timestamp') sorts them so that the most recent reply comes first and the oldest comes last.
sub_qs = Reply.objects.filter(
    PostId=OuterRef('pk')
).order_by('-timestamp')
The second part is the Post queryset with an annotate() call. We want to expose the result of sub_qs as an extra field, and Subquery is what allows one queryset to be embedded inside annotate().
We use .values('content') to select only the content field, and slice sub_qs with [:1] to keep only the first row.
qs = Post.objects.annotate(
    last_reply_content=Subquery(
        sub_qs.values('content')[:1]))
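To make the semantics of the annotation concrete, here is a minimal pure-Python sketch of what the subquery computes for each post. It does not use Django at all; the dictionaries and the annotate_last_reply helper are hypothetical stand-ins for the Post and Reply tables, and the real query is of course executed by the database.

```python
from datetime import datetime

# Hypothetical in-memory stand-ins for rows of the Post and Reply tables.
posts = [
    {"id": 1, "content": "first post"},
    {"id": 2, "content": "second post"},
]
replies = [
    {"post_id": 1, "content": "old reply", "timestamp": datetime(2018, 1, 1)},
    {"post_id": 1, "content": "new reply", "timestamp": datetime(2018, 2, 1)},
]


def annotate_last_reply(posts, replies):
    """Attach the content of the newest reply to each post (None if it has no replies)."""
    annotated = []
    for post in posts:
        # Role of the OuterRef filter: keep only replies of the current post.
        related = [r for r in replies if r["post_id"] == post["id"]]
        # Role of ordering newest first and slicing one row: pick the newest reply.
        newest = max(related, key=lambda r: r["timestamp"], default=None)
        content = newest["content"] if newest else None
        annotated.append({**post, "last_reply_content": content})
    return annotated


result = annotate_last_reply(posts, replies)
```

A post without replies ends up with None, which matches how an empty Subquery annotation yields NULL.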
Related
Two models Users (built-in) and Posts:
class Post(models.Model):
    post_date = models.DateTimeField(default=timezone.now)
    user = models.ForeignKey(User, on_delete=models.CASCADE, null=True, related_name='user_post')
    post = models.CharField(max_length=100)
I want to have an API endpoint that returns the percentage of users that have posted. Basically I want SUM(unique users who have posted) / total_users
I have been playing around with annotate and aggregate, but I keep getting the sum of posts for each user, or the sum of users per post (which is one...). How can I get the number of unique users who have posted, divide that by the total user count, and return the result?
I feel like I am missing something silly but my brain has gone to mush staring at this.
class PostParticipationAPIView(generics.ListAPIView):
    queryset = Post.objects.all()
    serializer_class = PostSerializer

    def get_queryset(self):
        start_date = self.request.query_params.get('start_date')
        end_date = self.request.query_params.get('end_date')
        # How can I take something like this, divide it by User.objects.all().count() * 100, and assign it to something to return as the queryset?
        queryset = Post.objects.filter(post_date__gte=start_date, post_date__lte=end_date).distinct('user').count()
        return queryset
My goal is to end up with the endpoint like:
{
    "total_participation": 97.3
}
Thanks for any guidance.
BCBB
EDIT
OK, I am still struggling a bit. I tried to create a serializer that just had a decimal field for participation_percentage like:
percentage_participation = serializers.DecimalField(max_digits=5, decimal_places=2, max_value=100, min_value=0)
Then I calculate in the view, but I get an error:
Got AttributeError when attempting to get a value for field percentage_participation on serializer ParticipationSerializer.
The serializer field might be named incorrectly and not match any attribute or key on the str instance.
Original exception text was: 'str' object has no attribute 'percentage_participation'.
Error was the same if I made it a CharField (in case there was some string coercion?).
So then I tried to move it to a SerializerMethodField and put all the calculation logic in there. This calculated fine, but I had to provide a queryset in the view. If I provided model objects, it just returned the percentage once per object in the queryset (say Posts.objects.all() had a total of 100 posts, it returned the percentage 100 times).
So then I tried to override the get_queryset in the view, but I HAVE to return something. If I just return { "meh", "hello" } then I return the percentage from the SerializerMethodField one time and the end result is exactly what I want.
I just have no idea as to WHY or how to do this correctly.
Thanks for your help.
EDIT #2
OK so I realized why I was only getting one, it was iterating over the string I returned, which was one character. When I returned "meh" it gave me three of the percentage, iterating over each character in the string...
I am not understanding from playing around, reading the docs, or using GoogleFu how to do this properly. I just want to be able to perform some kind of summary logic on records from the DB - how can I do this properly?!?!
Thank you for all your time.
BCBB
Something like this should work:
# get total user count
total_users = User.objects.count()
# get unique set of users with a post (distinct on a field requires PostgreSQL)
total_users_who_posted = Post.objects.filter(...).distinct("user").count()
# calculate the percentage, guarding against division by zero
percentage = {
    "total_participation": (total_users_who_posted * 100) / total_users if total_users else 0
}
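The arithmetic itself can be isolated in a small helper. This is only a sketch: participation_percentage is a hypothetical name, the counts 73 and 75 are made-up inputs, and rounding to one decimal is an assumption based on the 97.3 shown in the question.

```python
def participation_percentage(users_who_posted, total_users):
    """Return the share of users who posted as a percentage, rounded to one decimal."""
    if total_users == 0:
        # No users at all: report 0 instead of raising ZeroDivisionError.
        return 0.0
    return round(users_who_posted * 100 / total_users, 1)


# Made-up counts standing in for the two ORM count() calls.
payload = {"total_participation": participation_percentage(73, 75)}
```

The resulting dict is a single summary object rather than a list of rows, so it can be returned directly as the response body instead of being passed through a list serializer.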
I don't think it is possible to do this entirely with Django's ORM, but you can use the ORM to get the user counts (users with posts, and total users):
from django.db.models import BooleanField, Case, Count, Q, Value, When

counts = (User
    .objects
    .annotate(posted=Case(When(user_post__isnull=False, then=Value(True)),
                          default=Value(False),
                          output_field=BooleanField()))
    .values('posted')
    # distinct=True keeps users with several posts from being counted more than once
    .aggregate(posted_users=Count('pk', distinct=True, filter=Q(posted=True)),
               total_users=Count('pk', distinct=True, filter=Q(posted__isnull=False))))
# This will result in a dict containing the following:
# counts = {'posted_users': ...,
#           'total_users': ....}
I want to get a list of max ids for a filter I have in Django
class Foo(models.Model):
    name = models.CharField()
    poo = models.CharField()
Foo.objects.filter(name__in=['foo','koo','too']).latest_by_id()
The end result should be a queryset containing only the latest object by id for each name. How can I do that in Django?
Edit: I want multiple objects in the end result, not just one object.
Edit1: Added __in. Once again, I need only the latest (and therefore distinct) object for each name.
Something like this.
my_id_list = [Foo.objects.filter(name=name).latest('id').id for name in ['foo','koo','too']]
Foo.objects.filter(id__in=my_id_list)
The above works. But I want a more concise way of doing it. Is it possible to do this in a single query/filter annotate combination?
You can try:
from django.db.models import Max

qs = Foo.objects.filter(name__in=['foo', 'koo', 'too'])
# Group by name and get the max (i.e. latest) pk in each group
max_pks = qs.values('name').annotate(mpk=Max('pk')).order_by().values_list('mpk', flat=True)
# Then filter the queryset down to those pks
result = qs.filter(pk__in=max_pks)
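The grouping step is the important part: the maximum has to be taken per name, not per row. A minimal pure-Python sketch of the same idea follows; the rows list and the latest_per_name helper are hypothetical and only mimic what the database does.

```python
# Hypothetical rows standing in for the Foo table.
rows = [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "koo"},
    {"id": 3, "name": "foo"},
    {"id": 4, "name": "zoo"},
]


def latest_per_name(rows, names):
    """Return, for each requested name that occurs, the row with the highest id."""
    latest = {}
    for row in rows:
        if row["name"] not in names:
            continue  # same role as filter(name__in=...)
        current = latest.get(row["name"])
        if current is None or row["id"] > current["id"]:
            latest[row["name"]] = row  # keep the max id per name
    return sorted(latest.values(), key=lambda r: r["id"])


latest_rows = latest_per_name(rows, {"foo", "koo", "too"})
```

Names with no matching row (here "too") simply do not appear in the result, which is also what the queryset version returns.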
If you are using PostgreSQL you can do the following
Foo.objects.order_by('name', '-id').distinct('name')
MySQL is more complicated since it lacks a DISTINCT ON clause. Here is a raw query, which is very hard to get Django to generate from ORM function calls:
Foo.objects.raw("""
    SELECT f.*
    FROM `foo` AS f
    INNER JOIN (
        SELECT `name`, MAX(`id`) AS max_id
        FROM `foo`
        GROUP BY `name`
    ) AS latest ON f.`id` = latest.max_id
    ORDER BY f.`name` ASC
""")
I have a django model that has a date field and a separate time field. I am trying to use a filter to find a value on the latest record by date/time that is less than the current record's date time.
How do I use annotate/aggregate to combine the date and time fields into one and then do a filter on it?
models.py
class Note(models.Model):
    note_date = models.DateField(null=True)
    note_time = models.TimeField(null=True)
    note_value = models.PositiveIntegerField(null=True)

def get_last(n):
    """
    n: Note
    return: Return the note_value of the most recent Note prior to given Note.
    """
    latest = Note.objects.filter(
        note_date__lte=n.note_date
    ).order_by(
        '-note_date', '-note_time'
    ).first()
    return latest.note_value if latest else 0
This will return any notes from a previous date, but if I have two notes on the same date, one at 3pm and one at 1pm, and I send the 3pm note to the function, I want to get the value of the 1pm note. Is there a way to annotate the two fields into one for comparison, or do I have to perform a raw SQL query? Is there a way to combine the date and time components into one, similar to how you could use Concat for strings?
Note.objects.annotate(
    my_dt=Concat('note_date', 'note_time')
).filter(
    my_dt__lt=Concat(models.F('note_date'), models.F('note_time'))
).first()
I am late to this, but here is what I did:
from django.db.models import DateTimeField, ExpressionWrapper, F

notes = Note.objects.annotate(
    my_dt=ExpressionWrapper(F('note_date') + F('note_time'), output_field=DateTimeField())
)
This adds a new field my_dt of datetime type, which can then be used in further filters.
Found an answer using models.Q here: filter combined date and time in django
Note.objects.filter(
    models.Q(note_date__lt=n.note_date) | models.Q(
        note_date=n.note_date,
        note_time__lt=n.note_time
    )
).order_by('-note_date', '-note_time').first()
I guess I just wasn't searching by the right criteria.
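The condition "earlier date, or same date and earlier time" is equivalent to comparing the combined date and time. The following pure-Python sketch illustrates that with datetime.combine; the notes list and the previous_note_value helper are hypothetical and stand in for the Note table and the lookup function.

```python
from datetime import date, datetime, time

# Hypothetical rows standing in for the Note table.
notes = [
    {"note_date": date(2020, 5, 1), "note_time": time(9, 0), "note_value": 1},
    {"note_date": date(2020, 5, 2), "note_time": time(13, 0), "note_value": 2},
    {"note_date": date(2020, 5, 2), "note_time": time(15, 0), "note_value": 3},
]


def stamp(note):
    """Combine the separate date and time fields into one datetime."""
    return datetime.combine(note["note_date"], note["note_time"])


def previous_note_value(notes, current):
    """Return the value of the most recent note strictly before `current`, or 0."""
    earlier = [
        n for n in notes
        # Same shape as the two Q objects joined with | in the query.
        if n["note_date"] < current["note_date"]
        or (n["note_date"] == current["note_date"]
            and n["note_time"] < current["note_time"])
    ]
    # The two-field condition selects exactly the notes with an earlier combined stamp.
    assert all(stamp(n) < stamp(current) for n in earlier)
    latest = max(earlier, key=stamp, default=None)
    return latest["note_value"] if latest else 0
```

Passing the 3pm note returns the value of the 1pm note on the same day, which is the case the plain date filter could not handle.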
Here is another approach, which concatenates the date and time as strings and casts the result to a datetime:
from datetime import datetime

from django.db.models import CharField, DateTimeField, Value
from django.db.models.functions import Cast, Concat

notes = Note.objects.annotate(my_dt=Cast(
    Concat('note_date', Value(" "), 'note_time', output_field=CharField()),
    output_field=DateTimeField()
)).filter(my_dt__lte=datetime.now())
Here is another solution, building on the others:
def get_queryset(self):
    from django.db import models
    from django.utils import timezone

    datetime_wrapper = models.ExpressionWrapper(
        models.F('note_date') + models.F('note_time'),
        output_field=models.DateTimeField()
    )
    return Note.objects.annotate(
        note_datetime=datetime_wrapper
    ).filter(note_datetime__gt=timezone.now()).order_by('note_datetime')
My question is simple: in a Django app, I have a table Users and a table StatusUpdates. In the StatusUpdates table, I have a column user which is a foreign key pointing back to Users. How can I do a search expressing something like:
users.filter(latest_status_update.text__contains='Hello')
Edit:
Please excuse my lack of clarity. The query that I would like to make is something like "Give me all the users whose latest status update contains the text 'hello'". In Django code, I would do the following (which is really inefficient and ugly):
hello_users = []
for user in User.objects.all():
    latest_status_update = StatusUpdate.objects.filter(user=user).order_by('-creation_date')[0]
    if 'Hello' in latest_status_update.text:
        hello_users.append(user)
return hello_users
Edit 2:
I've already found the solution but since I was asked, here are the important parts of my models:
class User(models.Model):
    ...

class StatusUpdate(models.Model):
    user = models.ForeignKey(User)
    text = models.CharField(max_length=140)
    creation_date = models.DateTimeField(auto_now_add=True, editable=False)
    ....
Okay, I think I got it:
from django.db.models import Max, F

User.objects\
    .annotate(latest_status_update_id=Max('statusupdate__id'))\
    .filter(
        statusupdate__id=F('latest_status_update_id'),
        statusupdate__text__icontains='hello'
    )
For more info, see this section of the Django documentation.
Please note: I ended up changing my strategy a bit and treating the highest ID as the latest update. I did this because I realized that a User could post two updates at the same time, and that would break a query based on the creation date.
latest_status_updates = [
    update for update in (
        user.statusupdate_set.order_by('-creation_date').first()
        for user in User.objects.all()
    )
    # first() returns None for users without any status update
    if update is not None and 'hello' in update.text
]
users = list({status_update.user for status_update in latest_status_updates})
EDIT:
Now I first get all LATEST status updates of each user into a list which is then filtered by the text field found in StatusUpdate class. In the second line, I extract users out of the filtered status updates and then produce a unique list of users.
I hope this helps!
Not sure I understand. Are you trying to do something like this?
(StatusUpdates
    .objects
    .select_related("user")
    .filter(text__contains="hello")
    .order_by("-updated")
    .first())
This will return the StatusUpdate that was modified last (if you have a field called updated that stores the time of the last modification) which contains "Hello" in the text field. If none of the StatusUpdates contains that string, it will return None.
Then you can do:
latest = (StatusUpdates
    .objects
    .select_related("user")
    .filter(text__contains="hello")
    .order_by("-updated")
    .first())
# then, if you need the user too
if latest is not None:
    user = latest.user  # does not hit the DB again, since select_related was used
If this isn't what you need, please provide more details (models) and clarify your requirement.
According to the documentation, exclude() joins multiple parameters with AND, but my query seems to use OR logic instead.
I have the following model:
class Book(models.Model):
    # some fields
    authors = models.ManyToManyField(Author)
I want to get all books except those that have only one author and whose author's id is '12':
q_and = Book.objects.annotate(authors_number=Count('authors')).exclude(authors_number=1, authors__in=['12'])
But instead, I get a result that is the same as a query with OR logic:
q_or = Book.objects.annotate(authors_number=Count('authors')).exclude(authors_number=1).exclude(authors__in=['12'])
If I filter books with the only author and author's id='12', I get those I need to exclude:
need_to_exclude = Book.objects.annotate(authors_number=Count('authors')).filter(authors_number=1, authors__in=['12'])
I know how to do what I want in two queries, but how can I do the same in one query using exclude()?
And what's wrong with my query?
Try this; it expresses the OR operation explicitly:
from django.db.models import Count, Q
q = Q(authors_number=1) | Q(authors__in=['12'])
q_and = Book.objects.annotate(authors_number=Count('authors')).exclude(q).distinct()
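The difference between the two forms of exclusion is plain boolean logic: excluding "A and B" keeps more rows than excluding "A or B". The sketch below shows this on a hypothetical in-memory list of books. It only models the logic; how Django translates exclude() across a many-to-many join is not modelled here, so it is worth checking the generated SQL (for example with str(queryset.query)) for the real query.

```python
# Hypothetical books with the ids of their authors.
books = [
    {"title": "A", "author_ids": {12}},
    {"title": "B", "author_ids": {12, 7}},
    {"title": "C", "author_ids": {7}},
    {"title": "D", "author_ids": {3, 4}},
]


def single_author(book):
    return len(book["author_ids"]) == 1


def has_author_12(book):
    return 12 in book["author_ids"]


# Exclude only books where BOTH conditions hold (what the question wants).
exclude_and = [
    b["title"] for b in books
    if not (single_author(b) and has_author_12(b))
]

# Exclude books where EITHER condition holds (what two chained excludes do).
exclude_or = [
    b["title"] for b in books
    if not (single_author(b) or has_author_12(b))
]
```

Here exclude_and keeps B, C and D, while exclude_or keeps only D, which is the kind of over-exclusion described in the question.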