Django filter model with timestamp - django

I have the following models:
class User(models.Model):
    id = models.CharField(max_length=10, primary_key=True)
class Data(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    timestamp = models.IntegerField(default=0)
Given a single user, I would like to know how to filter that user's data by timestamp. For example:
Obtain the data from user1 between 1 hour ago and now.
I have the current timestamp as now = time.time(), and 1 hour ago as hour_ago = now - 3600.
I would like to obtain the Data that has a timestamp between these two values.

Use range to obtain data between two values.
You can use range anywhere you can use BETWEEN in SQL: for dates, numbers and even characters.
For example:
Data.objects.filter(timestamp__range=(start, end))
From the docs:
import datetime
start_date = datetime.date(2005, 1, 1)
end_date = datetime.date(2005, 3, 31)
Entry.objects.filter(pub_date__range=(start_date, end_date))

You can use __gte, which means greater than or equal to, and __lte, which means less than or equal to. So try this:
Data.objects.filter(timestamp__gte=hour_ago, timestamp__lte=now)
You can find similar examples in the official docs.
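Because timestamp is an IntegerField while time.time() returns a float, it can help to compute the window bounds as whole seconds first. A minimal sketch, assuming a hypothetical helper named last_hour_window (it is not part of Django); the returned pair can be passed to timestamp__range or to the __gte/__lte lookups shown above:

```python
import time

def last_hour_window(now=None, seconds=3600):
    """Return (start, end) as whole-second Unix timestamps.

    last_hour_window is a hypothetical helper, not part of Django.
    """
    if now is None:
        now = time.time()
    end = int(now)            # drop the fractional part of the float
    start = end - seconds     # one hour earlier by default
    return start, end

hour_ago, now = last_hour_window()
# e.g. Data.objects.filter(user=user1, timestamp__range=(hour_ago, now))
```

Note that range is inclusive on both ends, just like the __gte/__lte pair.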

Related

query to django model to find best company sale in the month

I have two Django models: "Company" and "MonthlyReport", which holds each company's monthly reports.
I want to find out which companies' sales in the current month were more than 20% above the previous month's sales.
class Company(models.Model):
    name = models.CharField(max_length=50)
class MonthlyReport(models.Model):
    company = models.ForeignKey(Company, on_delete=models.CASCADE)
    sale = models.IntegerField()
    date = models.DateField()
How can I find the companies whose sales are more than 20% higher than in the previous month?
You can certainly do it using the ORM. You will need to combine Max (or Sum, depending on your use case) with a Q() expression filter and annotate the percentage increase to the queryset before filtering it.
You could do it in a single piece of code, but I have split it up because the date calculations and the query expressions are quite long. I have also put the increase value in a separate variable rather than hardcoding it.
from datetime import datetime, timedelta
from django.db.models import Max, Q
SALES_INCREASE = 1.2
# Get the start dates of this month and last month
this_month = datetime.now().date().replace(day=1)
last_month = (this_month - timedelta(days=15)).replace(day=1)
# Get the maximum sale this month
amount_this_month = Max(
    'monthlyreport__sale',
    filter=Q(monthlyreport__date__gte=this_month),
)
# Get the maximum sale from the start of last month up to, but not including, this month
amount_last_month = Max(
    'monthlyreport__sale',
    filter=Q(monthlyreport__date__gte=last_month)
    & Q(monthlyreport__date__lt=this_month),
)
# Note: sale is an IntegerField, so some databases (e.g. PostgreSQL) may use
# integer division here; cast to a float field if the ratio comes back truncated.
Company.objects.annotate(
    percentage_increase=amount_this_month / amount_last_month
).filter(percentage_increase__gte=SALES_INCREASE)
Edit - removed incorrect code addition
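The date arithmetic in the answer above can be checked on its own, without Django. A small sketch, assuming a hypothetical helper name month_starts; it uses the same trick of stepping back 15 days from the first of the month, which always lands in the previous month:

```python
from datetime import date, timedelta

def month_starts(today):
    """Return the first day of today's month and of the previous month.

    month_starts is a hypothetical helper used only for illustration.
    """
    this_month = today.replace(day=1)
    # The 1st minus 15 days is always inside the previous month
    last_month = (this_month - timedelta(days=15)).replace(day=1)
    return this_month, last_month

print(month_starts(date(2024, 1, 31)))
# (datetime.date(2024, 1, 1), datetime.date(2023, 12, 1))
```

The January case shows that the year rolls back correctly as well.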
There is probably a way to do this using the ORM, but I would just do it in Python.
First add a related name to MonthlyReport:
class Company(models.Model):
    name = models.CharField(max_length=50)
class MonthlyReport(models.Model):
    company = models.ForeignKey(Company, related_name="monthly_reports", on_delete=models.CASCADE)
    sale = models.IntegerField()
    date = models.DateField()
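Then
best_companies = []
companies = Company.objects.all()
for company in companies:
    # Newest first, so the first item is the current month's report
    latest_reports = list(company.monthly_reports.order_by("-date")[:2])
    if len(latest_reports) < 2:
        continue  # not enough history to compare
    current_report, previous_report = latest_reports
    # Skip companies with zero sales last month to avoid dividing by zero
    if previous_report.sale and current_report.sale / previous_report.sale > 1.2:
        best_companies.append(company)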

Django ORM query on fk set

I have a problem with a lookup that looks for a value in a related set.
class Room(models.Model):
    name = models.CharField(max_length=64)
class Availability(models.Model):
    date = models.DateField()
    closed = models.BooleanField(default=False)
    room = models.ForeignKey(Room, on_delete=models.CASCADE)
Considering that there is one availability for every date in a year, how can I use an ORM query to find whether a room, within a given date range (e.g. 7 days), satisfies both conditions:
an availability exists for every day within the given date range
none of the availabilities has closed=True
I was unable to find any ORM examples that check whether all objects within a date range exist.
You can iterate over the dates and require, for each date, an Availability with closed=False:
from datetime import date, timedelta
rooms = Room.objects.all()
start_date = date(2022, 7, 21)  # first day
for dd in range(7):  # number of days
    dt = start_date + timedelta(days=dd)
    rooms = rooms.filter(availability__date=dt, availability__closed=False)
Each .filter() call on a multi-valued relation adds its own join, so the condition for each date is matched against a separate Availability row. After the for loop, rooms is a QuerySet of all Rooms that have, for every date in that range, an Availability object with closed=False.
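The rule that the chained filters enforce can be spelled out in plain Python, which is handy for unit-testing the logic. A minimal sketch, assuming a hypothetical is_bookable helper and a dict that maps each date to the closed flag of one room's Availability:

```python
from datetime import date, timedelta

def is_bookable(availability, start_date, days):
    """Check that every day in the range exists and is not closed.

    availability maps date -> closed flag for one room (hypothetical shape).
    """
    for offset in range(days):
        day = start_date + timedelta(days=offset)
        if day not in availability:   # no Availability for this day
            return False
        if availability[day]:         # Availability exists but closed=True
            return False
    return True

start = date(2022, 7, 21)
full_week = {start + timedelta(days=i): False for i in range(7)}
print(is_bookable(full_week, start, 7))  # True
```

A room fails the check as soon as one day is missing or closed, which mirrors what the seven chained filters do in the database.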

Django annotation on compound-like primary key with filter ignoring primary key, resulting in too many annotated items

Please see EDIT1 below, as well.
Using Django 3.0.6 and python3.8, given following models
class Plants(models.Model):
    plantid = models.TextField(primary_key=True, unique=True)
class Pollutions(models.Model):
    pollutionsid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    pollutant = models.TextField()
    releasesto = models.TextField(blank=True, null=True)
    amount = models.FloatField(db_column="amount", blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'pollutions'
        unique_together = (('plantid', 'releasesto', 'pollutant', 'year'))
class Monthp(models.Model):
    monthpid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    month = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    power = models.IntegerField(null=False)
    class Meta:
        managed = False
        db_table = 'monthp'
        unique_together = ('plantid', 'year', 'month')
For each plant, I'd like to annotate the amount of CO2 and the Sum of its power for a given year, based on a foreign key relationship and a filter. For the sake of debugging, I have replaced Sum with Count in the following query:
annotated = tmp.all().annotate(
    energy=Count('monthp__power', filter=Q(monthp__year=YEAR)),
    co2=Count('pollutions__amount', filter=Q(pollutions__year=YEAR, pollutions__pollutant="CO2", pollutions__releasesto="Air")),
)
However, this returns too many items (or, when using Sum, a wrong total):
annotated.first().co2 # 60, but it should be 1
annotated.first().energy # 252, but it should be 1
although my database guarantees - as denoted, that (plantid, year, month) and (plantid, releasesto, pollutant, year) are unique together, which can easily be demonstrated:
pl = annotated.first().plantid
testplant = Plants.objects.get(pk=pl) # plant object
pco2 = Pollutions.objects.filter(plantid=testplant, year=YEAR, pollutant="CO2", releasesto="Air")
len(pco2) # 1, as expected
Why does Django return too many results, and how can I tell Django to limit the annotated elements to the 'current primary key', in other words to annotate only the elements whose foreign key matches the primary key?
I can achieve what I intend by using distinct and Max:
energy=Sum('yearly__power', distinct=True, filter=Q(yearly__year=YEAR)),
co2=Max('pollutions__amount', ...
However, the performance is unacceptable.
I have tested using model_to_dict and appending the wanted values "by hand" to the dict. This works for the values themselves but not for sorting the resulting dict (e.g. by energy), and it is actually faster than the workaround directly above.
It strikes me as conceptually odd that the manual approach is faster than letting the database do what it is intended to do.
Is this a limitation of Django's ORM or am I missing something?
EDIT1:
This behaviour has been a known bug for 11 years.
Others have even "spent a whole day on this".
I am now trying it with subqueries. However, the foreign key I am using is not the primary key of its table, so the "usual" approach of using "pk=''" does not work. More precisely, trying:
tmp = Plants.objects.filter(somefilter)
subq1 = Subquery(Yearly.objects.filter(pk=OuterRef('plantid'), year=YEAR))
tmp1 = tmp.all().annotate(
    energy=Count(Subquery(subq1))
)
returns
OperationalError at /xyz
no such column: U0.yid
Which definitely makes sense because Plants has no clue what a yid is, it only knows plantids. How do I adjust the subquery to that?
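A likely explanation for the inflated counts, offered as an assumption rather than a diagnosis of this exact schema: when two multi-valued relations are joined in one query, every monthp row is paired with every pollutions row, so each aggregate sees the product of both. The sketch below reproduces the effect with Python's sqlite3 module and simplified, hypothetical versions of the tables:

```python
import sqlite3

def fan_out_counts():
    """Return plain and DISTINCT counts after joining two child tables."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical versions of the tables from the question
    conn.executescript("""
        CREATE TABLE plants (plantid TEXT PRIMARY KEY);
        CREATE TABLE monthp (monthpid INTEGER PRIMARY KEY, plantid TEXT, power INTEGER);
        CREATE TABLE pollutions (pollutionsid INTEGER PRIMARY KEY, plantid TEXT, amount REAL);
        INSERT INTO plants VALUES ('p1');
        INSERT INTO monthp (plantid, power) VALUES ('p1', 10), ('p1', 20), ('p1', 30);
        INSERT INTO pollutions (plantid, amount) VALUES ('p1', 1.5), ('p1', 2.5);
    """)
    row = conn.execute("""
        SELECT COUNT(m.power), COUNT(DISTINCT m.monthpid),
               COUNT(o.amount), COUNT(DISTINCT o.pollutionsid)
        FROM plants p
        LEFT JOIN monthp m ON m.plantid = p.plantid
        LEFT JOIN pollutions o ON o.plantid = p.plantid
        GROUP BY p.plantid
    """).fetchone()
    conn.close()
    return row

print(fan_out_counts())  # (6, 3, 6, 2)
```

Here 3 monthp rows and 2 pollutions rows produce 6 joined rows, so both plain counts are 6 while the DISTINCT counts recover 3 and 2. Computing each aggregate in its own subquery avoids the multiplication.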

Define time period in database for analyses

I want to store time series in a database. The values of these time series are usually defined for a certain period, for example:
country population in 2014, 2015, 2016, etc.
number of houses in country in 2014, 2015, 2016
I want to combine the data of these variables to be able to do some statistics, e.g. housing vs. population. This is only possible if I make sure the time periods are exactly the same. The periods are usually on a per year/quarter/month basis. How can I best store these values so that I can compare them later?
I currently use start_date (datetime) and end_date (datetime), which works but needs a good GUI to prevent one person from entering, for example:
start = 1-1-2016 & end = 31-12-2016
while another enters:
start = 1-1-2016 & end = 1-1-2017
I think it would be a good idea to leave the user free to define the period but help them define it correctly. How would you suggest doing this?
BTW: I work with Django so my current model has the following two fields:
period_start = models.DateField(null=False)
period_end = models.DateField(null=False)
Edit 8-5-2018 10:32: added some information on storing data
Some extra information for added clarity:
I store my data in two tables: (1) the variable definition and (2) the values.
The variable definition and value models look roughly like this:
class VarDef(models.Model):
    name = models.CharField(max_length=2000, null=False)
    unit = models.CharField(max_length=20, null=False)
    desc = models.CharField(max_length=2000, blank=True, null=True)
class VarValue(models.Model):
    value = models.DecimalField(max_digits=60, decimal_places=20, null=False)
    var = models.ForeignKey(VarDef, on_delete=models.CASCADE, null=False,
                            related_name='var_values')
    period_start = models.DateField(null=False)
    period_end = models.DateField(null=False)
It is hard to answer without your full code (models, views, etc.). But keep in mind that you can query Django date fields using the lt and gt lookups like this:
import datetime
# user input from your view, hardcoded here just for the sake of the example
start_date = datetime.date(2005, 1, 1)
end_date = datetime.date(2005, 3, 31)
house_data = HousesData.objects.filter(period_start__gt=start_date, period_end__lt=end_date).all()
country_data = CountryData.objects.filter(period_start__gt=start_date, period_end__lt=end_date).all()
# Do the rest of your calculation
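One way to avoid the 31-12-2016 versus 1-1-2017 ambiguity from the question is to derive both dates from a period type and an index instead of accepting two free-form dates. A minimal sketch, assuming a hypothetical period_bounds helper and an inclusive end date (a half-open convention would work equally well, as long as it is applied everywhere):

```python
import calendar
from datetime import date

def period_bounds(year, kind="year", index=1):
    """Return (period_start, period_end) with an inclusive end date.

    period_bounds is a hypothetical helper; kind is 'year', 'quarter' or 'month'.
    """
    if kind == "year":
        first_month, last_month = 1, 12
    elif kind == "quarter":
        if not 1 <= index <= 4:
            raise ValueError("quarter index must be 1..4")
        first_month = 3 * (index - 1) + 1
        last_month = first_month + 2
    elif kind == "month":
        if not 1 <= index <= 12:
            raise ValueError("month index must be 1..12")
        first_month = last_month = index
    else:
        raise ValueError("unknown period kind: %r" % kind)
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(year, last_month)[1]
    return date(year, first_month, 1), date(year, last_month, last_day)

period_start, period_end = period_bounds(2016, "quarter", 2)
# period_start is 2016-04-01, period_end is 2016-06-30
```

Two series entered this way for the same year, quarter or month always get identical period_start and period_end values, so they can be matched with a plain equality filter.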

Django combine filter on two fields

I am relatively new to Django. I'm having a problem when filtering data. I have two models, given below:
class Account(models.Model):
    name = models.CharField(max_length=60)
    hotel = models.ForeignKey(Hotel)
    account_type = models.CharField(choices=ACCOUNT_TYPE, max_length=30)
class Transaction(models.Model):
    account = models.ForeignKey(Account, related_name='transaction')
    transaction_type = models.CharField(choices=TRANSACTION_TYPE, max_length=15)
    description = models.CharField(max_length=100, blank=True, null=True)
    date_created = models.DateTimeField(default=timezone.now)
ACCOUNT_TYPE is:
ACCOUNT_TYPE = (
    (0, 'Asset'),
    (1, 'Liabilities'),
    (2, 'Equity'),
    (3, 'Income'),
    (4, 'Expense')
)
I want to filter all the transactions where the account type is Income or Expense within a given date range. How can I combine those filters in Django?
I have tried like this:
income_account = Account.objects.filter(account_type=3)
expense_account = Account.objects.filter(account_type=4)
transactions = Transaction.objects.filter(
    Q(
        account=income_account,
        date_created__gte=request.data['start_date'],
        date_created__lte=request.data['end_date'],
    ) & Q(
        account=expense_account,
        date_created__gte=request.data['start_date'],
        date_created__lte=request.data['end_date'],
    )
).order_by('date_created')
But it's not working. It raises the following error:
ProgrammingError: more than one row returned by a subquery used as an expression
income_account and expense_account are not single objects; each is a queryset that can contain several accounts. So instead of account=income_account and account=expense_account, use in: account__in=income_account and account__in=expense_account. The two conditions also need to be combined with | rather than &, because no account is both Income and Expense.
You could also simplify the queryset like this:
accounts = Account.objects.filter(Q(account_type=3) | Q(account_type=4))
transactions = Transaction.objects.filter(
    account__in=accounts,
    date_created__gte=request.data['start_date'],
    date_created__lte=request.data['end_date']
).order_by('date_created')
Instead of having multiple querysets, you can have only one, as Q allows ORing of filters. You could do:
Transaction.objects.filter(
    (Q(account__account_type=3) | Q(account__account_type=4)) &
    Q(date_created__range=[start_date, end_date])
)
The __range can be used to get dates between the specified start_date and end_date.
You can always use in to look up records by multiple values. So, if you want the Transactions whose account type is Income or Expense, you can use it like this:
Transaction.objects.filter(
    Q(account__account_type__in=[3, 4])
    & Q(date_created__gte=request.data['start_date'])
    & Q(date_created__lte=request.data['end_date'])
).order_by('date_created')
This will work for you if the labels themselves are stored in account_type:
result = Account.objects.filter(account_type__in=['Income', 'Expense'])
or, if the codes from ACCOUNT_TYPE are stored:
result = Account.objects.filter(account_type__in=['3', '4'])
I have put 3 and 4 as strings because you declared account_type as a CharField.
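The answers above mix labels such as 'Income' with stored codes such as 3. With Django choices, the first element of each pair is the value stored in the database and the second is only the display label, so it can be safer to look the codes up by label than to hardcode them. A minimal sketch, assuming a hypothetical codes_for helper:

```python
ACCOUNT_TYPE = (
    (0, 'Asset'),
    (1, 'Liabilities'),
    (2, 'Equity'),
    (3, 'Income'),
    (4, 'Expense'),
)

def codes_for(*labels, choices=ACCOUNT_TYPE):
    """Translate display labels into the codes stored in the database.

    codes_for is a hypothetical helper used only for illustration.
    """
    by_label = {label: code for code, label in choices}
    return [by_label[label] for label in labels]

wanted = codes_for('Income', 'Expense')
print(wanted)  # [3, 4]
# e.g. Transaction.objects.filter(account__account_type__in=wanted)
```

An unknown label raises KeyError, which surfaces typos early instead of silently returning an empty queryset.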