How to join on multiple columns with a group by in Django/Postgres - django

I have the following tables that I need to join on date and currency:
from django.db import models

class Transaction(models.Model):
    description = models.CharField(max_length=100)
    date = models.DateField()
    amount = models.FloatField()
    currency = models.ForeignKey(Currency, on_delete=models.PROTECT)

class ExchangeRate(models.Model):
    currency = models.ForeignKey(Currency, on_delete=models.PROTECT)
    rate = models.FloatField()
    date = models.DateField()
I need to join on both the date and currency columns, multiply the rate and the amount to give me the 'converted_amount'. I then need to group all the transactions by calendar month and sum up the 'converted_amount'.
Is this possible using the Django ORM or would I need to use SQL directly? If so, how do I go about doing this in Postgres?

Assuming that the dates in the "Exchange rates" table are independent of the dates in the Transactions table, so that the rate for each transaction is the latest one dated on or before the transaction date, you can try this in Postgres:
SELECT t.Currency
     , date_trunc('month', t.Date) AS period_of_time
     , sum(t.amount * er.Rate) AS sum_by_currency_by_period_of_time
FROM Transactions AS t
CROSS JOIN LATERAL
     ( -- latest rate for this currency dated on or before the transaction
       SELECT r.Rate
       FROM "Exchange rates" AS r
       WHERE r.Currency = t.Currency
         AND r.Date <= t.Date
       ORDER BY r.Date DESC
       LIMIT 1
     ) AS er
GROUP BY t.Currency, date_trunc('month', t.Date)

Assuming that your Currency model has a symbol column (change to your needs) you can achieve this with the following Django statements:
from your.models import Transaction, ExchangeRate
from django.db.models.functions import ExtractMonth
from django.db.models import Sum, F, Subquery, OuterRef

# Rates for the same currency dated on or before the transaction, newest first
rates = ExchangeRate.objects.filter(
    currency=OuterRef("currency"), date__lte=OuterRef("date")
).order_by("-date")

Transaction.objects.annotate(
    # Month number 1-12; use TruncMonth instead to keep different years apart
    month=ExtractMonth("date"),
    rate=Subquery(rates.values("rate")[:1]),
    conversion=F("amount") * F("rate"),
).values("currency__symbol", "month").annotate(sum=Sum("conversion")).order_by(
    "currency__symbol", "month"
)
This will result in a list like:
{'currency__symbol': '$', 'month': 2, 'sum': 105.0},...
The subquery annotates each transaction with the most recent exchange rate that passes the date filter. Make sure that every transaction has such a rate; otherwise its rate is NULL and the transaction contributes nothing to the sum.
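To sanity-check either query against a handful of rows, the same rule can be written in plain Python. This is only a sketch, not part of the original answers: the function name monthly_converted_totals and the tuple layouts are made up for illustration, and it assumes the rate to use is the latest one dated on or before the transaction.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def monthly_converted_totals(transactions, rates):
    """Sum amount * rate per (currency, first day of month).

    transactions: iterable of (currency, date, amount)
    rates: iterable of (currency, date, rate)
    """
    # Per currency, a date-sorted list of (date, rate) pairs.
    by_currency = defaultdict(list)
    for currency, day, rate in rates:
        by_currency[currency].append((day, rate))
    for pairs in by_currency.values():
        pairs.sort()

    totals = defaultdict(float)
    for currency, day, amount in transactions:
        pairs = by_currency.get(currency, [])
        # Index of the first rate dated strictly after the transaction.
        idx = bisect_right([d for d, _ in pairs], day)
        if idx == 0:
            continue  # no usable rate: the row adds nothing, like NULL in SUM()
        rate = pairs[idx - 1][1]
        totals[(currency, day.replace(day=1))] += amount * rate
    return dict(totals)


example_rates = [("USD", date(2023, 1, 1), 2.0), ("USD", date(2023, 2, 1), 3.0)]
example_transactions = [("USD", date(2023, 1, 15), 10.0), ("USD", date(2023, 2, 1), 5.0)]
print(monthly_converted_totals(example_transactions, example_rates))
```

Keying the totals on the first day of the month rather than the month number keeps February 2022 and February 2023 in separate buckets.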

Related

django query left join, sum and group by

I have a model:
class Product(models.Model):
    name = models.CharField(max_length=100)

class Sales(models.Model):
    product_id = models.ForeignKey(Product, on_delete=models.CASCADE, related_name='products')
    date = models.DateTimeField(null=True)
    price = models.FloatField()
How do I return data as the following sql query (annotate sales with product name, group by product, day and month, and calculate sum of sales):
select p.name
, extract(day from date) as day
, extract(month from date) as month
, sum(s.price)
from timetracker.main_sales s
left join timetracker.main_product p on p.id = s.product_id_id
group by month, day, p.name;
Thanks,
If only the ORM was as simple as SQL... I spent several hours trying to figure it out...
PS. Why when executing Sales.objects.raw(sql) with sql query above I get "Raw query must include the primary key"
You can annotate with:
from django.db.models import Sum
from django.db.models.functions import ExtractDay, ExtractMonth
Product.objects.values(
    'name',
    month=ExtractMonth('products__date'),
    day=ExtractDay('products__date'),
).annotate(
    total_price=Sum('products__price')
).order_by('name', 'month', 'day')
Note: Normally one does not add a suffix …_id to a ForeignKey field, since Django
will automatically add a "twin" field with an …_id suffix. Therefore it should
be product, instead of product_id.
Note: The related_name=… parameter
is the name of the relation in reverse, so from the Product model to the Sales
model in this case. Therefore it (often) does not make much sense to name it the
same as the forward relation. You thus might want to consider renaming the products relation to sales.
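On the PS: Model.objects.raw() builds model instances, so every row it returns has to carry the model's primary key, and an aggregated row has none. Aggregate SQL is normally run through a plain DB-API cursor instead (in Django, django.db.connection.cursor()). The sketch below is a stand-in that uses the standard library's sqlite3 so it runs without a Django project; the table names, sample rows and the sales_by_day helper are made up, and strftime replaces Postgres's extract().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY, product_id INTEGER, date TEXT, price REAL
    );
    INSERT INTO product VALUES (1, 'tea'), (2, 'coffee');
    INSERT INTO sales (product_id, date, price) VALUES
        (1, '2023-02-03 09:00:00', 2.5),
        (1, '2023-02-03 17:30:00', 1.5),
        (2, '2023-02-04 10:00:00', 4.0);
""")


def sales_by_day(connection):
    """Return (name, month, day, total_price) rows as plain tuples."""
    cursor = connection.execute("""
        SELECT p.name,
               CAST(strftime('%m', s.date) AS INTEGER) AS month,
               CAST(strftime('%d', s.date) AS INTEGER) AS day,
               SUM(s.price) AS total_price
        FROM sales AS s
        LEFT JOIN product AS p ON p.id = s.product_id
        GROUP BY p.name, month, day
        ORDER BY p.name, month, day
    """)
    return cursor.fetchall()


for row in sales_by_day(conn):
    print(row)
```

The rows come back as tuples rather than Sales objects, which is why no primary key is needed.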

query to django model to find best company sale in the month

I have two Django models: "Company" and a "MonthlyReport" for each company.
I want to find out which companies had sales in the current month that were more than 20% higher than in the previous month.
class Company(models.Model):
    name = models.CharField(max_length=50)

class MonthlyReport(models.Model):
    company = models.ForeignKey(Company, on_delete=models.CASCADE)
    sale = models.IntegerField()
    date = models.DateField()
How can I find the companies whose sales are more than 20% above the previous month's?
You can certainly do it using the ORM. You will need to combine Max (or Sum, depending on your use case) with a Q() filter, and annotate the ratio of this month's sales to last month's on the queryset before filtering on it.
You could do it in a single expression, but I have split it up because the date calculations and the query expressions are quite long. I have also put the required increase in a separate variable rather than hardcoding it.
from datetime import datetime, timedelta
from django.db.models import FloatField, Max, Q
from django.db.models.functions import Cast

SALES_INCREASE = 1.2

# Get the start dates of this month and last month
this_month = datetime.now().date().replace(day=1)
last_month = (this_month - timedelta(days=15)).replace(day=1)

# Get the maximum sale this month
amount_this_month = Max(
    'monthlyreport__sale',
    filter=Q(monthlyreport__date__gte=this_month),
)
# Get the maximum sale last month (from its first day, but before this month)
amount_last_month = Max(
    'monthlyreport__sale',
    filter=Q(monthlyreport__date__gte=last_month)
    & Q(monthlyreport__date__lt=this_month),
)

# sale is an IntegerField, so cast to float to avoid integer division
Company.objects.annotate(
    percentage_increase=Cast(amount_this_month, FloatField())
    / Cast(amount_last_month, FloatField())
).filter(percentage_increase__gte=SALES_INCREASE)
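The date arithmetic and the ratio test are easy to check in isolation. The following is a small sketch in plain Python; month_starts and grew_by are hypothetical helper names, not part of the answer, and grew_by treats a previous value of zero as "no growth" by assumption.

```python
from datetime import date, timedelta


def month_starts(today):
    """Return (first day of this month, first day of the previous month)."""
    this_month = today.replace(day=1)
    # Stepping back 15 days from the 1st always lands in the previous month.
    last_month = (this_month - timedelta(days=15)).replace(day=1)
    return this_month, last_month


def grew_by(current, previous, factor=1.2):
    """True if current is at least factor times previous (true division)."""
    return previous > 0 and current / previous >= factor


print(month_starts(date(2023, 3, 31)))
print(grew_by(130, 100))
```

Python's / is true division even for integers, whereas dividing two integer columns in Postgres truncates, so a ratio such as 130/100 would come out as 1 there.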
There is probably a way to do this using the ORM, but I would just go the Python way.
First add a related name to MonthlyReport:
class Company(models.Model):
    name = models.CharField(max_length=50)

class MonthlyReport(models.Model):
    company = models.ForeignKey(Company, related_name="monthly_reports", on_delete=models.CASCADE)
    sale = models.IntegerField()
    date = models.DateField()
Then
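best_companies = []
companies = Company.objects.all()
for company in companies:
    # Newest first: index 0 is the current report, index 1 the previous one
    latest_reports = list(company.monthly_reports.order_by("-date")[:2])
    if len(latest_reports) < 2:
        continue  # not enough history to compare
    current_report, previous_report = latest_reports
    # Skip companies whose previous sale is 0 to avoid dividing by zero
    if previous_report.sale and current_report.sale / previous_report.sale > 1.2:
        best_companies.append(company)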

Django ORM query on fk set

I have a problem with lookup that looks for a value in related set.
class Room(models.Model):
    name = models.CharField(max_length=64)

class Availability(models.Model):
    date = models.DateField()
    closed = models.BooleanField(default=False)
    room = models.ForeignKey(Room, on_delete=models.CASCADE)
Assume there is one availability for every date in a year. How can I use an ORM query to find whether a room, within a given date range (e.g. 7 days), meets both conditions:
an availability exists for every day within the date range
none of those availabilities has closed=True
I was unable to find any ORM examples that check whether all objects within a date range exist.
You can loop over the dates and, for each one, require that the room has an Availability on that date with closed=False:
from datetime import date, timedelta

rooms = Room.objects.all()
start_date = date(2022, 7, 21)  # first day
for dd in range(7):  # number of days
    dt = start_date + timedelta(days=dd)
    # Each filter() call adds its own join, so every date must match separately
    rooms = rooms.filter(availability__date=dt, availability__closed=False)
After the for loop, rooms is a QuerySet of all Rooms that have an Availability with closed=False for every date in the range.
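The condition itself ("every day is present and none is closed") can be stated as a set comparison, which is handy for checking the query on sample data. A minimal sketch in plain Python follows; is_bookable and the (date, closed) tuple layout are assumptions for illustration, and it relies on the question's premise of at most one availability per date.

```python
from datetime import date, timedelta


def is_bookable(availabilities, start, days):
    """availabilities: iterable of (date, closed) pairs for one room."""
    # Every date that must be covered.
    wanted = {start + timedelta(days=i) for i in range(days)}
    # Dates that have an availability which is not closed.
    open_days = {day for day, closed in availabilities if not closed}
    # Bookable only if every wanted date is an open day.
    return wanted <= open_days


week = [(date(2022, 7, 21) + timedelta(days=i), False) for i in range(7)]
print(is_bookable(week, date(2022, 7, 21), 7))
```

A missing day and a closed day fail the same way: in both cases the wanted date is absent from open_days.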

Django Queryset - extracting only date from datetime field in query (inside .values() )

I want to extract some particular columns from django query
models.py
class table(models.Model):
    id = models.IntegerField(primary_key=True)
    date = models.DateTimeField()
    address = models.CharField(max_length=50)
    city = models.CharField(max_length=20)
    cityid = models.IntegerField()
This is what I am currently using for my query
obj = table.objects.filter(date__range=(start, end)).values('id', 'date', 'address', 'city').annotate(count=Count('cityid')).order_by('date', '-count')
I am hoping to have a SQL query that is similar to this
select DATE(date), id,address,city, COUNT(cityid) as count from table where date between "start" and "end" group by DATE(date), address,id, city order by DATE(date) ASC,count DESC;
At least in Django 1.10.5, you can use something like this, without extra and RawSQL:
from django.db.models.functions import Cast
from django.db.models.fields import DateField
table.objects.annotate(date_only=Cast('date', DateField()))
And for filtering, you can use date lookup (https://docs.djangoproject.com/en/1.11/ref/models/querysets/#date):
table.objects.filter(date__date__range=(start, end))
For the below case.
select DATE(date), id,address,city, COUNT(cityid) as count from table where date between "start" and "end" group by DATE(date), address,id, city order by DATE(date) ASC,count DESC;
You can use extra(), which lets you call database functions directly, for the date part and annotate the count as usual:
table.objects.filter(date__range=(start, end)).extra(select={'day': 'DATE(date)'}).values('day', 'id', 'address', 'city').annotate(count=Count('cityid')).order_by('day', '-count')
Hope it will help you.
Thanks.

Django date effective retrieval of ORM records

If I have a Django Employee model with a start_date and end_date date field, how can I use get in the ORM to date effectively select the correct record if different versions of the record exist over time based on these date fields?
So I could have the following records:
start_date, end_date, emp
01/01/2013, 31/01/2013, Emp1
01/02/2013, 28/02/2013, Employee1
01/03/2013, 31/12/4000, EmpOne
And if today's date is 10/02/2013 then I would want Employee1.
Something similar to:
from django.utils import timezone
current_year = timezone.now().year
Employee.objects.get(end_date__year=current_year)
or
res = Employee.objects.filter(end_date__gt=datetime.now()).order_by('-start_date')
Or is there a more efficient way of doing the same?
Your second example looks fine. I corrected the filter parameters so that both start_date and end_date are checked (inclusively, since the ranges in your data include their end dates). I also used first(), which adds a LIMIT 1 for better performance and returns None when nothing matches:
from django.utils import timezone

today = timezone.localdate()
employee = Employee.objects.filter(
    start_date__lte=today, end_date__gte=today
).order_by('-start_date').first()
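The selection rule can be checked against the sample rows from the question without a database. This is a sketch only: effective_record and the (start_date, end_date, emp) tuple layout are assumptions for illustration, and both range ends are treated as inclusive.

```python
from datetime import date


def effective_record(records, on_date):
    """Return the record whose [start, end] range contains on_date, or None."""
    matches = [r for r in records if r[0] <= on_date <= r[1]]
    # If ranges ever overlap, prefer the one that started most recently.
    return max(matches, key=lambda r: r[0], default=None)


records = [
    (date(2013, 1, 1), date(2013, 1, 31), "Emp1"),
    (date(2013, 2, 1), date(2013, 2, 28), "Employee1"),
    (date(2013, 3, 1), date(4000, 12, 31), "EmpOne"),
]
print(effective_record(records, date(2013, 2, 10)))
```

Preferring the latest start_date mirrors the order_by('-start_date') in the query.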