Django: How to do Aggregate (GroupBy ID) & Latest timestamp?


I have a table like this:
ID | Time Stamp
1 | 2012-07-28 18:57:48.160912+01
1 | 2012-07-28 20:57:43.063327+01
2 | 2012-07-28 21:17:16.016665+01
I would like to see the latest entry for each ID.
If I do this, I get only a single object, the very latest entry overall:
open_deals = all_deals.latest('time_stamp')
--> 2 | 2012-07-28 21:17:16.016665+01
But I would like to get:
--> 1 | 2012-07-28 20:57:43.063327+01
--> 2 | 2012-07-28 21:17:16.016665+01
I need to somehow aggregate or group by the ID, but I can't find a function for that in the documentation.
Any tips? Thank you
Update:
I have tried the solution below:
result_list = [deal.dealchangelog_set.latest('time_stamp') for deal in open_deals]
result_set = set()
for item in result_list:
    result_set.add(item.pk)
return open_deals.filter(pk__in=result_set)
Unfortunately, the list still contains three objects instead of two. :-(
Here are my models. Note that I am not using deal_id as the pk; the pk is still an integer.
In my case I need the latest entry per deal_id, which isn't unique. (For simplicity I showed the UUID as an integer above.)
class Deal(models.Model):
    deal_id = UUIDField()
    status = models.ForeignKey(DealStatus, verbose_name=_(u"Deal Status"), null=True, blank=True)
    contact = models.ForeignKey(Contact)
    deal_type = models.ForeignKey(DealType)
class DealChangeLog(models.Model):
    deal = models.ForeignKey(Deal)
    time_stamp = CreationDateTimeField()
Update 2:
def get_open_deals(call):
    all_deals = Deal.objects.filter(contact=call.contact)
    closed_deals = all_deals.filter(status__in=[5, 6])
    closed_deal_list = []
    if closed_deals:
        for item in closed_deals:
            closed_deal_list.append(item.deal_id)
    open_deals = all_deals.exclude(deal_id__in=closed_deal_list)
    result_list = [deal.dealchangelog_set.latest('time_stamp') for deal in open_deals]
    result_set = set()
    for item in result_list:
        result_set.add(item.pk)
    return open_deals.filter(pk__in=result_set)

I don't know the full context, but I would probably split the model up like this:
class Deal(models.Model):
    # max_length is required on CharField; 100 is an arbitrary choice
    title = models.CharField(max_length=100)
    # maybe signal changes etcetera...
class DealChangeLog(models.Model):
    deal = models.ForeignKey(Deal)
    # auto_now and auto_now_add are mutually exclusive; auto_now_add records the creation time
    date = models.DateTimeField(auto_now_add=True)
Then you can get your result with a list comprehension:
results = [deal.dealchangelog_set.latest('date') for deal in Deal.objects.all()]
To get the results as a queryset you can do:
Deal.objects.filter(id__in=[result.deal.id for result in results])
But if you can work with the results as a list, I don't see much point in the extra queries.
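The list comprehension above runs one latest() query per deal. The underlying idea, keeping only the newest row per group, can be sketched in plain Python. This is a stand-in for the database work, not ORM code; the rows data and the latest_per_id name are made up for illustration.

```python
from datetime import datetime


def latest_per_id(rows):
    """Return {id: newest timestamp} for an iterable of (id, timestamp) pairs."""
    latest = {}
    for row_id, stamp in rows:
        # Keep the timestamp only if it is newer than what we hold for this id.
        if row_id not in latest or stamp > latest[row_id]:
            latest[row_id] = stamp
    return latest


# Hypothetical rows mirroring the table in the question (timezone dropped).
rows = [
    (1, datetime(2012, 7, 28, 18, 57, 48)),
    (1, datetime(2012, 7, 28, 20, 57, 43)),
    (2, datetime(2012, 7, 28, 21, 17, 16)),
]

result = latest_per_id(rows)
```

In Django itself, something like Deal.objects.values('deal_id').annotate(latest=Max('dealchangelog__time_stamp')) should express the same grouping in a single query, though that is untested against the models in the question.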

Related

Django 2.2 ORM Exclude Not Working as expected

I'm trying to get all customers from the Cust model that have no record in Stat (linked by ForeignKey), using Django 2.2.
class Cust(models.Model):
    name = models.CharField(max_length=50)
    active = models.BooleanField(default=True)
class Stat(models.Model):
    cust = models.ForeignKey(Cust, on_delete=models.PROTECT, null=True)
    date = models.DateField(null=True, blank=True)
I tried this, but it doesn't work:
month = datetime.now()
month = month.strftime("%Y-%m")
inner_qs = Stat.objects.filter(date__icontains=month)
data = Cust.objects.exclude(id__in=inner_qs)
print(inner_qs)
print(data)
The query above returns:
<QuerySet [<Stat: Ruth>]>
<QuerySet [<Cust: Jhonny>, <Cust: Rony>, <Cust: Sinta>, <Cust: Ruth>]>
As you can see, I need the customer behind [<Stat: Ruth>] excluded from the data queryset.
What I expected is:
<QuerySet [<Stat: Ruth>]>
<QuerySet [<Cust: Jhonny>, <Cust: Rony>, <Cust: Sinta>]>

According to the Django docs on __in:
In a given iterable; often a list, tuple, or queryset. It’s not a common use case, but strings (being iterables) are accepted.
Here inner_qs is a queryset of Stat objects, so id__in=inner_qs compares Cust ids against Stat primary keys rather than against Stat.cust_id.
There are several ways to solve this problem.
Replace
inner_qs = Stat.objects.filter(date__icontains=month)
data = Cust.objects.exclude(id__in=inner_qs)
1. with
inner_qs = Stat.objects.filter(date__icontains=month)
data = Cust.objects.exclude(id__in=[o.cust_id for o in inner_qs])
2. Another way is to replace it with
# flat=True yields plain cust_id values instead of 1-tuples
id_list = Stat.objects.filter(date__icontains=month).values_list('cust_id', flat=True)
data = Cust.objects.exclude(id__in=id_list)
Here, you first build the list of ids to exclude, then pass it to exclude().
3. A revised version of #2 with distinct()
id_list = (
    Stat.objects.filter(date__icontains=month)
    .values_list('cust_id', flat=True)
    .distinct()
    .order_by()
)
# distinct() returns unique cust_id values
# the empty order_by() clears any default ordering, whose columns could otherwise defeat distinct()
data = Cust.objects.exclude(id__in=id_list)
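To make the difference concrete, here is a small plain-Python sketch of what the original exclude compares versus what the fixes compare. The dictionaries are hypothetical stand-ins for the two tables (the Stat primary key of 10 is an assumption), and none of this is ORM code.

```python
# Hypothetical in-memory stand-ins for the Cust and Stat tables.
customers = {1: "Jhonny", 2: "Rony", 3: "Sinta", 4: "Ruth"}
# Each stat row has its own primary key plus a cust_id foreign key.
stats = [{"id": 10, "cust_id": 4}]

# Roughly what exclude(id__in=inner_qs) does: match on Stat primary keys.
wrong_ids = {s["id"] for s in stats}
wrong = [name for cid, name in customers.items() if cid not in wrong_ids]

# What was intended: match on the foreign key column.
right_ids = {s["cust_id"] for s in stats}
right = [name for cid, name in customers.items() if cid not in right_ids]
```

With this data, wrong still contains Ruth because no customer id equals the Stat primary key, while right drops her.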

Django query assigns database columns to wrong model instance properties

Really weird problem - when I query for a model instance the data comes back assigned to the wrong properties.
The model:
class SaleLineItem(models.Model):
    sale = models.ForeignKey(Sale, on_delete=models.CASCADE, related_name="sale_line_items")
    stock_unit = models.ForeignKey(StockUnit, on_delete=models.CASCADE, related_name="sale_line_items")
    currency = models.CharField(max_length=3)
    price_original = models.FloatField()
    price_paid = models.FloatField()
    tax_amount = models.FloatField(null=True, blank=True)
    num_sold = models.IntegerField()
    sale_line_item_id = models.CharField(max_length=30, null=True, blank=True)
    status = models.CharField(max_length=20, choices=SALE_STATUS_CHOICES, null=True, blank=True)
The database row:
id | currency | price_original | price_paid | tax_amount | num_sold | sale_line_item_id | status | sale_id | stock_unit_id
-------+----------+----------------+------------+------------+----------+-------------------+-----------+---------+---------------
15726 | THB | 130 | 130 | | 1 | | delivered | 16219 | 2
And the query:
sli = SaleLineItem.objects.get(pk=15726)
print(sli.pk)
-------------------------
16219
print(sli.stock_unit_id)
-------------------------
THB
print(sli.currency)
-------------------------
130.0
The data get populated on the object but everything is "shifted" by one column.
But if I do the query this way:
SaleLineItem.objects.filter(pk=15726).values()
-------------------------
<QuerySet [{'id': 15726, 'sale_id': 16219, 'stock_unit_id': 2, 'currency': 'THB', 'price_original': 130.0, 'price_paid': 130.0, 'tax_amount': None, 'num_sold': 1, 'sale_line_item_id': None, 'status': 'delivered'}]>
. . . the result is correct.
I thought I might have un-migrated models but I ran both makemigrations and migrate to no effect.
Same result when I use lower-level QuerySet methods:
qs = SaleLineItem.objects.all()
clone = qs._chain()
clone.query.add_q(Q(pk=15726))
print(clone)
------------------------------
<QuerySet [<SaleLineItem: SaleLineItem object (16219)>]>
Note the pk on the model __str__ is incorrect.
Any ideas what's happening here?
Running:
Python 3.7.3
Django 2.2.1
Postgres 10

Turns out it's because I overrode __init__ with an extra (non-field) argument.
@classmethod
def from_db(cls, db, field_names, values):
    if len(values) != len(cls._meta.concrete_fields):
        values_iter = iter(values)
        values = [
            next(values_iter) if f.attname in field_names else DEFERRED
            for f in cls._meta.concrete_fields
        ]
    new = cls(*values)
    new._state.adding = False
    new._state.db = db
    return new
Database values are populated onto the model using *values, and the model expects fields in a specific order. So you can't have an extra argument in __init__ or the order gets messed up.
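The shift can be reproduced without Django. The sketch below uses two hypothetical classes, Base and Shifted, that only imitate the cls(*values) call shown above; they are not Django's implementation.

```python
class Base:
    """Stand-in for a model whose __init__ takes field values positionally."""
    fields = ("id", "sale_id", "currency")

    def __init__(self, *args):
        # Assign positional values to fields in declaration order.
        for name, value in zip(self.fields, args):
            setattr(self, name, value)

    @classmethod
    def from_db(cls, values):
        # Imitates the cls(*values) call in from_db.
        return cls(*values)


class Shifted(Base):
    def __init__(self, extra=None, *args):
        # The extra leading parameter swallows the first database value.
        self.extra = extra
        super().__init__(*args)


row = (15726, 16219, "THB")
ok = Base.from_db(row)
bad = Shifted.from_db(row)
```

Here ok.id is the real primary key, while bad.id receives the sale id and bad.sale_id receives the currency, which is the same one-column shift seen in the question. If a custom argument is really needed, passing it as a keyword-only argument or setting it after construction avoids disturbing the positional order.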
Edit:
Had not read this part in the docs (https://docs.djangoproject.com/en/2.1/ref/models/instances/):
You may be tempted to customize the model by overriding the __init__
method. If you do so, however, take care not to change the calling
signature . . .

Trying to filter based upon a value in another table

I have 2 tables as
class ItemFollowers(models.Model):
item = models.ForeignKey(Items, models.DO_NOTHING, db_column='item')
user = models.ForeignKey(AuthUser, models.DO_NOTHING, db_column='user')
And the other one is
class UsrPosts(models.Model):
item = models.ForeignKey('Items', models.DO_NOTHING, db_column='item')
# Some other fields
How can I select the UsrPosts related to the items followed by some user? i.e. I can have records in ItemFollowers like (item0, user0), (item1, user0), (item5, user0). I need to filter UsrPosts based upon the user (aka. request.user.id)
Here is an inefficient, non-working way to get UsrPosts:
itms = ItemFollowers.objects.filter(user_id=request.user.id)
qry = Q(item_id=itms[0].item.id) | ..... | Q(item_id=itms[N].item.id)
posts = UsrPosts.objects.filter(qry)
Is there some filter magic to get it in one query?

itms = ItemFollowers.objects.filter(user_id=request.user.id).values_list('item')
posts = UsrPosts.objects.filter(item__in = itms)

Free dates in reservation system

I am working on a reservation system, and I can not find a good way to select free dates. Here is my model:
class rental_group(models.Model):
    group_title = models.CharField(max_length = 30)
    description = models.TextField()
class rental_units(models.Model):
    group = models.ForeignKey(rental_group)
    number = models.CharField(max_length = 6)
class Reservations(models.Model):
    #name = models.ForeignKey(customer)
    rental_unit = models.ForeignKey(rental_units)
    start_date = models.DateField()
    end_date = models.DateField()
I tried to select available rooms with this query:
rental_units.objects.filter(group__rental_group = 'Bungalow').exclude(Q(Reservations__start_date__lt = arrival)&Q(Reservations__end_date__gt = arrival)|Q(Reservations__start_date__lt = departure)&Q(Reservations__end_date__gt = departure))
This works perfectly when there is only one reservation. When there are several reservations on the same number, things go wrong. For example, when I have two reservations on number 120, this query returns 120 twice when everything is available, and returns 120 once if the new reservation falls within one of the existing reservation dates (it should return nothing, i.e. "not available").
Is this possible with a query? Or should I iterate over the reservations and remove each reserved house from a list (which could take a lot of time when there are many reservations)?

You want Reservations, not rental_units, so filter them:
Reservations.objects.filter(
    rental_unit__group__group_title='Bungalow'
).exclude(
    Q(start_date__lt=arrival) & Q(end_date__gt=arrival)
    | Q(start_date__lt=departure) & Q(end_date__gt=departure)
)

OK, it seems that what I wanted is not possible, so I chose a different approach:
Make a list of all rental_units: all_units
Make a list of rental_units from the query below, and remove those items from all_units:
rental_units.objects.filter(group__rental_group = 'Bungalow').filter(Q(Reservations__start_date__gt = arrival)&Q(Reservations__start_date__lt = departure)|Q(Reservations__end_date__gt = arrival)&Q(Reservations__end_date__lt = departure))
I refined the queryset, so it differs from the one at the start of the topic.
The queryset returns all Bungalows that have a reservation starting or ending between the arrival and departure dates.
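One case the start-or-end-between check does not seem to cover is a requested stay that lies entirely inside an existing reservation. A common alternative, swapped in here rather than taken from the posts above, is the single interval-overlap test: two ranges collide when each starts before the other ends. The sketch below is plain Python with made-up data and function names, and it assumes a reservation's end date is the checkout day, so a new stay may begin on it.

```python
from datetime import date


def overlaps(res_start, res_end, arrival, departure):
    """True if a reservation collides with the requested stay."""
    # Two ranges overlap exactly when each one starts before the other ends.
    return res_start < departure and res_end > arrival


def free_units(all_units, reservations, arrival, departure):
    """Drop every unit that has at least one overlapping reservation."""
    taken = {
        unit
        for unit, start, end in reservations
        if overlaps(start, end, arrival, departure)
    }
    return [unit for unit in all_units if unit not in taken]


# Hypothetical data: unit "120" has two reservations, "121" has none.
reservations = [
    ("120", date(2024, 7, 1), date(2024, 7, 10)),
    ("120", date(2024, 7, 20), date(2024, 7, 25)),
]
units = ["120", "121"]
# A stay inside the first reservation: unit 120 is not free.
available = free_units(units, reservations, date(2024, 7, 5), date(2024, 7, 8))
# A stay in the gap between the two reservations: both units are free.
later = free_units(units, reservations, date(2024, 7, 10), date(2024, 7, 20))
```

Because the taken units are collected into a set first, a unit with several reservations is removed once and never listed twice.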

Reducing queries for manytomany models in django

EDIT:
It turns out the real question is - how do I get select_related to follow the m2m relationships I have defined? Those are the ones that are taxing my system. Any ideas?
I have two classes for my django app. The first (Item class) describes an item along with some functions that return information about the item. The second class (Itemlist class) takes a list of these items and then does some processing on them to return different values. The problem I'm having is that returning a list of items from Itemlist is taking a ton of queries, and I'm not sure where they're coming from.
class Item(models.Model):
    # for archiving purposes
    archive_id = models.IntegerField()
    users = models.ManyToManyField(User, through='User_item_rel',
                                   related_name='users_set')
    # for many to one relationship (tags)
    tag = models.ForeignKey(Tag)
    sub_tag = models.CharField(default='',max_length=40)
    name = models.CharField(max_length=40)
    purch_date = models.DateField(default=datetime.datetime.now())
    date_edited = models.DateTimeField(auto_now_add=True)
    price = models.DecimalField(max_digits=6, decimal_places=2)
    buyer = models.ManyToManyField(User, through='Buyer_item_rel',
                                   related_name='buyers_set')
    comments = models.CharField(default='',max_length=400)
    house_id = models.IntegerField()
    class Meta:
        ordering = ['-purch_date']
    def shortDisplayBuyers(self):
        if len(self.buyer_item_rel_set.all()) != 1:
            return "multiple buyers"
        else:
            return self.buyer_item_rel_set.all()[0].buyer.name
    def listBuyers(self):
        return self.buyer_item_rel_set.all()
    def listUsers(self):
        return self.user_item_rel_set.all()
    def tag_name(self):
        return self.tag
    def sub_tag_name(self):
        return self.sub_tag
    def __unicode__(self):
        return self.name
and the second class:
class Item_list:
    def __init__(self, list = None, house_id = None, user_id = None,
                 archive_id = None, houseMode = 0):
        self.list = list
        self.house_id = house_id
        self.uid = int(user_id)
        self.archive_id = archive_id
        self.gen_balancing_transactions()
        self.houseMode = houseMode
    def ret_list(self):
        return self.list
So after I construct Itemlist with a large list of items, Itemlist.ret_list() takes up to 800 queries for 25 items. What can I do to fix this?

Try using select_related
As per a question I asked here

Dan is right in telling you to use select_related.
select_related can be read about here.
It returns, in a single query, the data for the main object in your queryset together with the related models named in the select_related call.
So, instead of a query like:
select * from item
followed by a separate query like this every time you access a related object on one of the items:
select * from item_list where item_id = <one of the items from the query above>
the ORM will generate a query like:
select a.*, b.*
from item a join item_list b
on a.id = b.item_id
In other words: it will hit the database once for all the data.

You probably want to use prefetch_related
It works similarly to select_related, but can handle relations that select_related cannot, such as many-to-many. The join happens in Python, but I've found it more efficient for this kind of work than the large number of queries.
Related reading on the subject
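The query-count difference can be illustrated with a toy simulation. FakeDB below is a hypothetical stand-in that only counts lookups; it is not how the ORM is implemented, and the data is invented.

```python
class FakeDB:
    """Tiny stand-in for a database that counts the queries it receives."""

    def __init__(self, buyers_by_item):
        self.buyers_by_item = buyers_by_item
        self.queries = 0

    def buyers_for(self, item_id):
        # One query per item: the pattern behind the large query counts.
        self.queries += 1
        return list(self.buyers_by_item.get(item_id, []))

    def buyers_for_many(self, item_ids):
        # One query for the whole batch, similar to an IN (...) lookup.
        self.queries += 1
        return {i: list(self.buyers_by_item.get(i, [])) for i in item_ids}


item_ids = list(range(25))
data = {i: ["buyer%d" % i] for i in item_ids}

naive_db = FakeDB(data)
naive = {i: naive_db.buyers_for(i) for i in item_ids}

batched_db = FakeDB(data)
batched = batched_db.buyers_for_many(item_ids)
```

Both approaches produce the same mapping, but the first issues 25 lookups and the second issues one. In Django the batched form would be requested with something like Item.objects.prefetch_related('buyer', 'users'); which lookups to pass depends on the relations the model methods actually touch, so treat those names as an assumption based on the model in the question.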
