Django aggregation over filtered query - django

Assuming I have these two models:
class User(models.Model):
    username = models.CharField(max_length=255)
class Item(models.Model):
    user = models.ForeignKey(User)
    enabled = models.BooleanField()
    price = models.IntegerField()
I would like to build an efficient query that returns the top 10 users who have at least 10 enabled items, ranked by the highest average price of those items (sorted by best average).
In other words, I am trying to create a top 10 "leaderboard" on my site for the users whose items have the highest average price. However, some of the items may be disabled and still exist in my database. I am trying to exclude them in my ORM query but can't find a good way of doing it.
This operation runs every 5 minutes or so; it does not run while generating a page.

I can't test right now, but I think this should work:
from django.db.models import Avg, Count
top_users = (
    User.objects
    # 'item' is the default reverse lookup name for Item.user.
    # Filtering before annotate() restricts the aggregates to enabled items.
    .filter(item__enabled=True)
    .annotate(
        item_count=Count('item'),
        avg_price=Avg('item__price'),
    )
    .filter(item_count__gte=10)
    .order_by('-avg_price')[:10]  # keep only the top 10
)
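For reference, the shape of SQL that a filter-then-aggregate query like this is expected to correspond to can be tried with the standard library's sqlite3 module. This is only a sketch: the table names (app_user, app_item) and the sample data are made up for illustration, and the SQL Django actually generates may differ in detail.

```python
import sqlite3

# Hypothetical tables loosely mirroring the User and Item models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE app_item (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES app_user(id),
        enabled INTEGER,
        price INTEGER
    );
""")
conn.executemany("INSERT INTO app_user VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])

rows = []
rows += [(1, 1, 100)] * 10   # alice: 10 enabled items at 100
rows += [(1, 0, 9999)] * 5   # alice: disabled items must not affect the average
rows += [(2, 1, 50)] * 12    # bob: 12 enabled items at 50
rows += [(3, 1, 500)] * 9    # carol: only 9 enabled items, below the threshold
conn.executemany(
    "INSERT INTO app_item (user_id, enabled, price) VALUES (?, ?, ?)", rows)

# WHERE drops disabled items before grouping; HAVING applies the 10-item minimum.
leaderboard = conn.execute("""
    SELECT u.username, COUNT(i.id) AS item_count, AVG(i.price) AS avg_price
    FROM app_user u
    JOIN app_item i ON i.user_id = u.id
    WHERE i.enabled = 1
    GROUP BY u.id
    HAVING COUNT(i.id) >= 10
    ORDER BY avg_price DESC
    LIMIT 10
""").fetchall()

for username, item_count, avg_price in leaderboard:
    print(username, item_count, avg_price)
```

The disabled items are removed by the WHERE clause before grouping, which is why the order of the filter and the aggregation matters.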

Related

Count different values of user over a period of time with selecting the latest for that day

I am trying to aggregate model values over a period of time in Django. The model is used to keep a kind of activity log.
class Activity(models.Model):
    PLACE_CHOICES = (('home', 'Home'), ('office', 'Office'))
    userId = models.IntegerField()
    date = models.DateField()
    place = models.CharField(max_length=25, choices=PLACE_CHOICES)
A user can have multiple places for a day ('place' can be duplicated), but I need only the latest record for that day.
I need to group the data over a period of time (say 15 days) for multiple users. The filtering arguments are something like this:
Activity.objects.filter(userId__in=[1,2], date__gte='2022-01-01', date__lte='2022-01-15')
This only shows the filtering. I tried other methods, like:
Activity.objects.filter(
    userId__in=[1, 2], date__gte='2022-01-01', date__lte='2022-01-15'
).annotate(
    home=Count('place', distinct=True, filter=Q(place='home')),
    office=Count('place', distinct=True, filter=Q(place='office')),
).values('userId', 'home', 'office')
I need values like this:
[{userId:1, home:4, office:3},{userId:2, home:1, office:3}]
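To pin down the expected result before translating it to the ORM, here is a plain-Python sketch of one reading of the requirement: keep only the latest record per user and day, then count places per user. The Activity model has no timestamp, so "latest" is assumed here to mean the highest id; that assumption, the sample rows, and the helper name count_latest_places are illustrative only.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical rows: (id, userId, date, place). A higher id is assumed to be "later".
rows = [
    (1, 1, date(2022, 1, 1), "home"),
    (2, 1, date(2022, 1, 1), "office"),   # later entry for the same day wins
    (3, 1, date(2022, 1, 2), "home"),
    (4, 2, date(2022, 1, 1), "office"),
    (5, 2, date(2022, 1, 2), "office"),
    (6, 2, date(2022, 1, 2), "home"),     # later entry for the same day wins
    (7, 2, date(2022, 1, 3), "office"),
]

def count_latest_places(rows):
    # Step 1: keep one place per (user, day); iterating in id order means
    # the last assignment for each key is the latest record.
    latest = {}
    for row_id, user_id, day, place in sorted(rows):
        latest[(user_id, day)] = place
    # Step 2: count the surviving places per user.
    counts = defaultdict(Counter)
    for (user_id, _day), place in latest.items():
        counts[user_id][place] += 1
    return [
        {"userId": user_id, "home": c["home"], "office": c["office"]}
        for user_id, c in sorted(counts.items())
    ]

result = count_latest_places(rows)
print(result)
```

A helper like this can also serve as a reference implementation when checking whatever ORM query you end up with.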

Django ORM: Get all records and the respective last log for each record

I have two models, the simple version would be this:
class Users(models.Model):
    name = models.CharField()
    birthdate = models.CharField()
    # other fields that play no role in calculations or filters, but I simply need to display
class UserLogs(models.Model):
    user_id = models.ForeignKey(to='Users', related_name='user_daily_logs', on_delete=models.CASCADE)
    reference_date = models.DateField()
    hours_spent_in_chats = models.DecimalField()
    hours_spent_in_p_channels = models.DecimalField()
    hours_spent_in_challenges = models.DecimalField()
    # other fields that play no role in calculations or filters, but I simply need to display
What I need to write is a query that will return all the fields of all users, with the latest log (reference_date) for each user. So for n users and m logs, the query should return n records. It is guaranteed that each user has at least one log record.
Restrictions:
the query needs to be written in django orm
the query needs to start from the user model. So anything that goes like Users.objects... is OK. Anything that goes like UserLogs.objects... is not. That's because of filters and logic in the viewset, which is beyond my control
It has to be a single query, and no iterations in python, pandas or itertools are allowed. The Queryset will be directly processed by a serializer.
I shouldn't have to specify the names of the columns that need to be returned, one by one. The query must return all the columns from both models
Attempt no. 1 returns only the user id and the log date (for obvious reasons). It is the right date, but I also need the other columns:
test = Users.objects.select_related("user_daily_logs").values("user_daily_logs__user_id").annotate(
    max_date=Max("user_daily_logs__reference_date"))
Attempt no. 2 generates an error (Cannot resolve expression type, unknown output_field):
logs = UserLogs.objects.filter(user_id=OuterRef('pk')).order_by('-reference_date')[:1]
users = Users.objects.annotate(latest_log=Subquery(logs))
This seems impossible taking into account all the restrictions.
One approach would be to use prefetch_related:
users = Users.objects.all().prefetch_related(
    models.Prefetch(
        'user_daily_logs',
        queryset=UserLogs.objects.filter().order_by('-reference_date'),
        to_attr='daily_logs',
    )
)
This will do two db queries and return all logs for every user, which may or may not be a problem depending on the number of records. If you need only logs for the current day, as the name suggests, you can add that to the filter and reduce the number of UserLogs records. Of course, you need to get the first element from the list:
user.daily_logs[0]
For that you can create a @property on the Users model, which could look roughly like this:
@property
def latest_log(self):
    # daily_logs only exists when the Prefetch above was applied
    if not getattr(self, 'daily_logs', None):
        return None
    return self.daily_logs[0]
user.latest_log
You can also go a step further and try the following Subquery inside Prefetch to limit the queryset to one element, but I am not sure about the performance of this one (credits: Django prefetch_related with limit).
latest_ids = UserLogs.objects.filter(
    user_id=OuterRef('user_id')
).order_by('-reference_date').values_list('id', flat=True)[:1]
users = Users.objects.all().prefetch_related(
    models.Prefetch(
        'user_daily_logs',
        queryset=UserLogs.objects.filter(id__in=Subquery(latest_ids)),
        to_attr='daily_logs',
    )
)
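For comparison, the "latest row per user" selection can be written as a correlated subquery in plain SQL. The sketch below runs it with the standard library's sqlite3 module; the table names, the reduced set of columns, and the sample data are assumptions made for illustration, not what Django would emit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_logs (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        reference_date TEXT,
        hours_spent_in_chats REAL
    );
    INSERT INTO users VALUES (1, 'ann'), (2, 'ben');
    INSERT INTO user_logs VALUES
        (1, 1, '2022-01-01', 1.5),
        (2, 1, '2022-01-03', 2.0),
        (3, 2, '2022-01-02', 0.5);
""")

# For each user, join only the log whose id is the newest one for that user.
latest = conn.execute("""
    SELECT u.id, u.name, l.reference_date, l.hours_spent_in_chats
    FROM users u
    JOIN user_logs l ON l.id = (
        SELECT x.id
        FROM user_logs x
        WHERE x.user_id = u.id
        ORDER BY x.reference_date DESC, x.id DESC
        LIMIT 1
    )
    ORDER BY u.id
""").fetchall()

for row in latest:
    print(row)
```

This returns one row per user (n rows for n users), which is the result shape the question asks for.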

Multiple payments related to multiple customer implementation in django

I'm writing a customer management system for my business and got stuck on the payments entry system. This will run on a local dedicated server and should have only one user so code performance is not really an issue.
Every adult customer who enters the store is given a numbered card (Card, for the rest of this question), and his/her ID (from the Customer model) is attached to it by a foreign key relation. There is an "entrance fee subtotal", which is the result of a choice field on the Card model (there are only two choices and those won't change for a long time) plus kids' 'fees'.
This, along with two other kinds of models (Product and Service), will compose the customer's bill. I have it working just fine, except for the payments registration.
As many Customers may be part of a family, and they may split their total bill quite often, I believe Payment should be a model with a ManyToManyField related to Card, so it could cover multiple payment methods (treated as another choice field, since it will be either money, credit or debit cards), but I can't figure out how to model it or how to handle it in my view/template.
I'm using django 1.9 & postgres 9.5 & python 2.7.
Bootstrap 3 along with some JS for styling (probably irrelevant).
Enough said, here's some code:
models.py
class Customer(models.Model):
    id = models.AutoField(primary_key=True)  # unnecessary but I had already written it
    name = models.CharField(max_length=40)
    last = models.CharField(max_length=80)
class Card(models.Model):
    entrance_type1 = 1
    entrance_type2 = 2
    entrance_choices = (
        (entrance_type1, 'Fun'),
        (entrance_type2, 'Really Fun, kinda expensive'),
    )
    entrance_types = {
        1: "Fun",
        2: "Really Fun",
    }
    entrance_fee = {
        'kid': 5.0,
        entrance_type1: 15.0,
        entrance_type2: 35.0,
    }
    id = models.AutoField(primary_key=True)  # Yeah, I do that
    date = models.DateTimeField(auto_now=True, auto_now_add=False)
    card_number = models.IntegerField()
    entrance_type = models.PositiveIntegerField(choices=entrance_choices)
    kids_number = models.PositiveIntegerField()
    id_costumer = models.ForeignKey(Customer)
    entrances_value = models.DecimalField(max_digits=6, decimal_places=2)
    # will be entrance_fee[entrance_type] + entrance_fee['kid'] * kids_number
    status = models.BooleanField(default=1)  # should be 0 after payment(s)
Anyway, I really need help modelling payments for those. A payment should contain the payment method, the date, and the Cards it relates to.
I'm already getting ideas for the views/template step, so I won't be strict about those in answers.
I know my question is kind of fuzzy and confusing, but I can't figure out how to make it better (and maybe this is why I can't solve it by myself), so please comment with your doubts and I'll edit it (including removing this part once it improves) after lunch.
Thanks in advance
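As a starting point for the modelling question, here is a framework-free sketch of the data shape described above: a Payment that references several Cards (the many-to-many side), plus the entrance subtotal formula from the Card comment. It uses plain dataclasses rather than Django models, so it says nothing about migrations or forms. The fee table (type 1 at 15, type 2 at 35), the outstanding helper, and the sample family are assumptions for illustration, and products and services are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from typing import List

# Assumed fee table: 'kid' plus the two entrance types.
ENTRANCE_FEE = {"kid": Decimal("5.00"), 1: Decimal("15.00"), 2: Decimal("35.00")}

@dataclass
class Card:
    card_number: int
    entrance_type: int
    kids_number: int = 0

    @property
    def entrances_value(self) -> Decimal:
        # entrance_fee[entrance_type] + entrance_fee['kid'] * kids_number
        return ENTRANCE_FEE[self.entrance_type] + ENTRANCE_FEE["kid"] * self.kids_number

@dataclass
class Payment:
    method: str          # 'money', 'credit' or 'debit'
    amount: Decimal
    cards: List[Card]    # one payment may cover several cards
    date: datetime = field(default_factory=datetime.now)

def outstanding(cards, payments):
    """Amount still owed for a group of cards that is billed together."""
    owed = sum((c.entrances_value for c in cards), Decimal("0"))
    paid = sum((p.amount for p in payments), Decimal("0"))
    return owed - paid

# A family with two cards splitting the bill across two payment methods.
family = [Card(101, 1, kids_number=2), Card(102, 2)]
payments = [
    Payment("money", Decimal("30.00"), family),
    Payment("credit", Decimal("20.00"), family),
]
print(outstanding(family, payments))
```

In a Django version, Payment.cards would presumably become the ManyToManyField to Card mentioned in the question, and a card's status could be flipped once the outstanding amount reaches zero.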

Creating a query with foreign keys and grouping by some data in Django

I have thought about my problem for days and I need a fresh view on it.
I am building a small application for a client to manage his deliveries.
# models.py - Clients app
class ClientPR(models.Model):
    title = models.CharField(max_length=5,
                             choices=TITLE_LIST,
                             default='mr')
    last_name = models.CharField(max_length=65)
    first_name = models.CharField(max_length=65, verbose_name='Prénom')
    frequency = WeekdayField(default=[])  # Return a CommaSeparatedIntegerField from 0 for Monday to 6 for Sunday...
    [...]
# models.py - Delivery app
class Truck(models.Model):
    name = models.CharField(max_length=40, verbose_name='Nom')
    description = models.CharField(max_length=250, blank=True)
    color = models.CharField(max_length=10,
                             choices=COLORS,
                             default='green',
                             unique=True,
                             verbose_name='Couleur Associée')
class Order(models.Model):
    # String reference, because OrderDelivery is defined further down.
    delivery = models.ForeignKey('OrderDelivery', verbose_name='Delivery')
    client = models.ForeignKey(ClientPR)
    order = models.PositiveSmallIntegerField()
class OrderDelivery(models.Model):
    # Pass the callable: d.today() would be evaluated only once, at import time.
    date = models.DateField(default=d.today)
    truck = models.ForeignKey(Truck, verbose_name='Camion', unique_for_date="date")
So I was trying to build a query, and I got this one:
(ClientPR.objects.today().filter(order__delivery__date=date.today())
    .order_by('order__delivery__truck', 'order__order'))
But it does not do what I really want.
I want a list of Client objects (querysets) grouped by truck and ordered by today's delivery order!
The thing is, I want to have EVERY client for the day, even those who are not in the delivery list, and with filter that cannot work.
I can make a query from the OrderDelivery model, but I will only get the clients for the delivery, not all of them for the day...
Maybe I will need to do it with a Q object? Or even raw SQL?
Maybe I have built my model relationships the wrong way? Or I need to scale back what I want to do... Well, for now, I need your help to see the problem with new eyes!
Thanks to those who will take some time to help me.
After some tests, I decided to go with 2 queries for one table.
One from the OrderDelivery queryset to get a list of clients grouped by truck, and another one from the ClientPR queryset for all the clients without a delivery set for them.
That way, no problem!
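The two-query approach can be sketched in plain SQL with the standard library's sqlite3 module. The schema below is a simplified stand-in for the models above (the table and column names such as client_order and stop_order are made up), and the date is hard-coded so the example is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE truck (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE delivery (id INTEGER PRIMARY KEY, delivery_date TEXT, truck_id INTEGER);
    CREATE TABLE client_order (
        id INTEGER PRIMARY KEY, delivery_id INTEGER, client_id INTEGER, stop_order INTEGER
    );
    INSERT INTO client VALUES (1, 'Durand'), (2, 'Martin'), (3, 'Petit'), (4, 'Roux');
    INSERT INTO truck VALUES (1, 'Green'), (2, 'Blue');
    INSERT INTO delivery VALUES (1, '2024-01-15', 1), (2, '2024-01-15', 2);
    INSERT INTO client_order VALUES (1, 1, 2, 2), (2, 1, 1, 1), (3, 2, 3, 1);
""")
today = "2024-01-15"

# Query 1: clients with a delivery today, grouped by truck and in delivery order.
planned = conn.execute("""
    SELECT t.name, o.stop_order, c.last_name
    FROM client_order o
    JOIN delivery d ON d.id = o.delivery_id
    JOIN truck t ON t.id = d.truck_id
    JOIN client c ON c.id = o.client_id
    WHERE d.delivery_date = ?
    ORDER BY t.id, o.stop_order
""", (today,)).fetchall()

# Query 2: every client that has no delivery planned for today.
unplanned = conn.execute("""
    SELECT c.last_name
    FROM client c
    WHERE c.id NOT IN (
        SELECT o.client_id
        FROM client_order o
        JOIN delivery d ON d.id = o.delivery_id
        WHERE d.delivery_date = ?
    )
    ORDER BY c.last_name
""", (today,)).fetchall()

print(planned)
print(unplanned)
```

Together the two result sets cover every client exactly once, which is what a single filtered query could not provide.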

Django: Distinct on foreign key relationship

I'm working on a Ticket/Issue-tracker in django where I need to log the status of each ticket. This is a simplification of my models.
class Ticket(models.Model):
    assigned_to = models.ForeignKey(User)
    comment = models.TextField(_('comment'), blank=True)
    created = models.DateTimeField(_("created at"), auto_now_add=True)
class TicketStatus(models.Model):
    STATUS_CHOICES = (
        (10, _('Open'),),
        (20, _('Other'),),
        (30, _('Closed'),),
    )
    ticket = models.ForeignKey(Ticket, verbose_name=_('ticket'))
    user = models.ForeignKey(User, verbose_name=_('user'))
    status = models.IntegerField(_('status'), choices=STATUS_CHOICES)
    date = models.DateTimeField(_("created at"), auto_now_add=True)
Now, getting the status of a ticket is easy: sort by date and retrieve the first row, like this.
ticket = Ticket.objects.get(pk=1)
ticket.ticketstatus_set.order_by('-date')[0].get_status_display()
But then I also want to be able to filter on status in the Admin, and those filters have to get the status through a Ticket queryset, which suddenly makes it more complex. How would I get a queryset with all Tickets with a certain status?
I guess you are trying to avoid a loop (asking for the status of each ticket) to filter the queryset manually. As far as I know you cannot avoid that loop. Here are some ideas:
# select_related avoids a lot of database hits when entering the loop
t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
# this is a list with the result
ticket_array = [ts.ticket for ts in t_status]
Or, since you mention you were looking for a QuerySet, this might be what you are looking for:
# select_related avoids a lot of database hits when entering the loop
t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
# this is a QuerySet with the result
tickets = Ticket.objects.filter(pk__in=[ts.ticket.pk for ts in t_status])
However, the problem might be in the way you are modelling the data. What you called TicketStatus is more like a TicketStatusLog, because you want to keep track of the user who changed the status and the date of the change.
Therefore, the reasonable approach is to add a field 'current_status' to the Ticket model that is updated each time a new TicketStatus is created. This way (1) you don't have to sort a table each time you ask for a ticket's status and (2) you would simply do something like Ticket.objects.filter(current_status=ID_STATUS) for what I think you are asking.
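The 'current_status' idea can be illustrated without Django. The sketch below keeps a status log per ticket and refreshes a cached current_status on every new entry. The dataclass, the add_status method, and the sample tickets are hypothetical; in a real project the update would live wherever TicketStatus rows are created (for instance in a save() override, which is an assumption about your code).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple

OPEN, OTHER, CLOSED = 10, 20, 30   # same values as STATUS_CHOICES

@dataclass
class Ticket:
    comment: str = ""
    current_status: Optional[int] = None   # denormalized copy of the latest log entry
    status_log: List[Tuple[datetime, int]] = field(default_factory=list)

    def add_status(self, status: int, when: Optional[datetime] = None) -> None:
        """Append a log entry and refresh the cached current status."""
        self.status_log.append((when or datetime.now(), status))
        self.current_status = status

tickets = [Ticket("printer broken"), Ticket("login fails")]
tickets[0].add_status(OPEN)
tickets[1].add_status(OPEN)
tickets[1].add_status(CLOSED)

# Filtering by status no longer needs to sort or scan the log.
open_tickets = [t for t in tickets if t.current_status == OPEN]
print([t.comment for t in open_tickets])
```

The trade-off is the usual one for denormalization: reads become a simple equality filter, while every write path has to remember to update the cached field.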