How to prevent database to search for old objects? - django

I have a Django app which uses Postgresql as the database. I store some objects that have a datetime field. In my queries I usually want to fetch the objects that are stored in last day or last week, so older objects are of no importance for me.
I cannot delete the older objects because I sometimes want to fetch all the data.
I want to optimize the app.
Is there any way to just search the data stored in the last day and not search for the other data?
Edit:
Imagine there are so many records, say 1 million, and only small amount of them are for today. If I use Model.objects.filter(datetime_field__gte=last_week), does the database check all the records?

You can query filtering on the DateTime field.
import datetime
last_week = datetime.date.today() - datetime.timedelta(days=7)
Model.objects.filter(datetime_field__gte=last_week)
Be sure to check the docs here and here

create a new input query2 for example
def SearchW(request):
serchedWl = Modal.objects.all()
query1= request.GET.get('query1') #main query
query2= request.GET.get('query2') #last week
if query1:
serchedWl =serchedWl.filter(Q(field__contains=query1))
if query2:
serchedWl =serchedWl.filter(Q(datetime_field__contains=query2))
context={
'title':'your page',
'serchedWl': serchedWl,
}
return render(request,'to your page .html',context)
don't forget the tag is serchedWl

Related

How to properly use dates in object filters

I'm using Django ORM to access database models, it works well when I use objects.all(), it returns all the objects in the database. But when I want to filter on a date I add a filter using the new date type it doesnt return anything, I get a blank QuerySet. After searching and trying different things for many hours I discovered object.filter(date__gte=date) works.
For example:
This works, I get all the records where date = today:
today = date.today()
Model.objects.filter(date__gte=today)
These do not work, they return empty QuerySets:
Model.objects.filter(date__contains=today)
Model.objects.filter(date__startswith=today)
Model.objects.filter(date__date=date.today())
My question is what am I doing wrong that one type of query works but not the other, when they should all return the same data?
You can do it like this(Reference) for DateTimeField:
Model.objects.filter(date__date=datetime.today())
If its a DateField, then simply do:
Model.objects.filter(date=datetime.today())

how to get latest foreign key value in models.py

I have a little problem with getting latest foreign key value in my django app. Here are my two models:
class Stock(models.Model):
...
class Dividend(models.Model):
date = models.DateField('pay date')
stock = models.ForeignKey(Stock, related_name="dividends")
class Meta:
ordering = ["date"]
I would like to get latest dividend from stock object. So basically this - stock.dividends.latest('date'). However, everytime I call stock.dividends.latest('date'), it fires up sql query to get latest dividend. I have latest() method in for cycle for every stock I have. I would like to avoid these sql queries. May I somehow define new method in class Stock that would get latest dividend within sql query for stock object?
I cannot change default ordering from "date" to "-date".
Using select_related('dividends') loads dividends objects with stock, but latest probably uses order_by and it requires sql query anyway. :(
EDIT1: To make more clear what I want, here is an example. Let's say I have 100 symbols in shares.keys():
for stock in Stock.objects.filter(symbol__in=shares.keys()): # 1 sql query
latest_dividend = stock.dividends.latest('date') # 100 sql queries
... #do something with latest dividend
Well and in some cases I might have 500 symbols in shares.keys(). That is why I need to avoid making sql queries on getting latest dividend for stock.
I have the same problem with you, so I tested many Django queries. Finally, I found out that we can use this:
Stock.objects.all().annotate(latest_date=Max('dividends__date')).filter(dividends__date=F('latest_date')).values('dividends')
I'm not sure my solution is the best, but here it is (works only with PostgreSQL):
stocks = list(Stock.objects.filter(**something))
dividends = Dividend.objects.filter(
stock__in=stocks,
).order_by(
'stock_id',
'-date'
).distinct(
'stock_id',
)
dividends_dict = {d.stock_id: d for d in dividends}
for stock in stocks:
stock.latest_dividend = dividends_dict.get(stock.id)
I'm a little confused by your question, I'm assuming you are trying to access the dividends from your stock object in order to limit your queries to the database. I believe that is the least number queries of possible.
stock_options = stock.objects.get(pk=your_query)
order_options = stock.dividend_set.order_by('-date')[:5]
likeon: Thanks for your answer. But I think I can avoid initializing that large dictionary (I have 5000 stocks and 280 000 dividends). But your list gave me an idea. Your code requires 2 sql queries. Here is my example (EDIT1).
for stock in Stock.objects.filter(symbol__in=shares.keys())\
.prefetch_related('dividends'): # 2 sql queries
latest_dividend = list(stock.dividends.all())[-1] # 0 sql queries
... #do something with latest_dividend
My code also requires 2 sql queries, but I do not have to reorder it and create list from stocks and all 280 000 dividends (I only create dict from current stock dividends every cycle). May be creating one dict is quicker than creating len(shares.keys()) dicts, not sure.
I thought there would be easier solution (avoid creating list/dictionary from dividends), but this is good enough for now. Thanks for answers!
As long as I understood you can do it this way:
stock.dividends.last()
as implementation in Django is like this:
def first(self):
"""Return the first object of a query or None if no match is found."""
for obj in (self if self.ordered else self.order_by('pk'))[:1]:
return obj
Also, you can use .latest(*fields, field_name=None) too.

Django queryset aggregate by time interval

Hi I am writing a Django view which ouputs data for graphing on the client side (High Charts). The data is climate data with a given parameter recorded once per day.
My query is this:
format = '%Y-%m-%d'
sd = datetime.datetime.strptime(startdate, format)
ed = datetime.datetime.strptime(enddate, format)
data = Climate.objects.filter(recorded_on__range = (sd, ed)).order_by('recorded_on')
Now, as the range is increased the dataset obviously gets larger and this does not present well on the graph (aside from slowing things down considerably).
Is there an way to group my data as averages in time periods - specifically average for each month or average for each year?
I realize this could be done in SQL as mentioned here: django aggregation to lower resolution using grouping by a date range
But I would like to know if there is a handy way in Django itself.
Or is it perhaps better to modify the db directly and use a script to populate month and year fields from the timestamp?
Any help much appreciated.
Have you tried using django-qsstats-magic (https://github.com/kmike/django-qsstats-magic)?
It makes things very easy for charting, here is a timeseries example from their docs:
from django.contrib.auth.models import User
import datetime, qsstats
qs = User.objects.all()
qss = qsstats.QuerySetStats(qs, 'date_joined')
today = datetime.date.today()
seven_days_ago = today - datetime.timedelta(days=7)
time_series = qss.time_series(seven_days_ago, today)
print 'New users in the last 7 days: %s' % [t[1] for t in time_series]

Get daily counts of objects from Django

I have a Django model with a created timestamp and I'd like to get the counts of objects created on each day. I was hoping to use the aggregation functionality in Django but I can't figure out how to solve my problem with it. Assuming that doesn't work I can always fall back to just getting all of the dates with values_list but I'd prefer to give the work to Django or the DB. How would you do it?
Alex pointed to the right answer in the comment:
Count number of records by date in Django
Credit goes to ara818
Guidoism.objects.extra({'created':"date(created)"}).values('created').annotate(created_count=Count('id'))
from django.db.models import Count
Guidoism.objects \
# get specific dates (not hours for example) and store in "created"
.extra({'created':"date(created)"})
# get a values list of only "created" defined earlier
.values('created')
# annotate each day by Count of Guidoism objects
.annotate(created_count=Count('id'))
I learn new tricks every day reading stack.. awesome!
Use the count method:
YourModel.objects.filter(published_on=datetime.date(2011, 4, 1)).count()

How do I get the related objects In an extra().values() call in Django?

Thank to this post I'm able to easily do count and group by queries in a Django view:
Django equivalent for count and group by
What I'm doing in my app is displaying a list of coin types and face values available in my database for a country, so coins from the UK might have a face value of "1 farthing" or "6 pence". The face_value is the 6, the currency_type is the "pence", stored in a related table.
I have the following code in my view that gets me 90% of the way there:
def coins_by_country(request, country_name):
country = Country.objects.get(name=country_name)
coin_values = Collectible.objects.filter(country=country.id, type=1).extra(select={'count': 'count(1)'},
order_by=['-count']).values('count', 'face_value', 'currency_type')
coin_values.query.group_by = ['currency_type_id', 'face_value']
return render_to_response('icollectit/coins_by_country.html', {'coin_values': coin_values, 'country': country } )
The currency_type_id comes across as the number stored in the foreign key field (i.e. 4). What I want to do is retrieve the actual object that it references as part of the query (the Currency model, so I can get the Currency.name field in my template).
What's the best way to do that?
You can't do it with values(). But there's no need to use that - you can just get the actual Collectible objects, and each one will have a currency_type attribute that will be the relevant linked object.
And as justinhamade suggests, using select_related() will help to cut down the number of database queries.
Putting it together, you get:
coin_values = Collectible.objects.filter(country=country.id,
type=1).extra(
select={'count': 'count(1)'},
order_by=['-count']
).select_related()
select_related() got me pretty close, but it wanted me to add every field that I've selected to the group_by clause.
So I tried appending values() after the select_related(). No go. Then I tried various permutations of each in different positions of the query. Close, but not quite.
I ended up "wimping out" and just using raw SQL, since I already knew how to write the SQL query.
def coins_by_country(request, country_name):
country = get_object_or_404(Country, name=country_name)
cursor = connection.cursor()
cursor.execute('SELECT count(*), face_value, collection_currency.name FROM collection_collectible, collection_currency WHERE collection_collectible.currency_type_id = collection_currency.id AND country_id=%s AND type=1 group by face_value, collection_currency.name', [country.id] )
coin_values = cursor.fetchall()
return render_to_response('icollectit/coins_by_country.html', {'coin_values': coin_values, 'country': country } )
If there's a way to phrase that exact query in the Django queryset language I'd be curious to know. I imagine that an SQL join with a count and grouping by two columns isn't super-rare, so I'd be surprised if there wasn't a clean way.
Have you tried select_related() http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4
I use it a lot it seems to work well then you can go coin_values.currency.name.
Also I dont think you need to do country=country.id in your filter, just country=country but I am not sure what difference that makes other than less typing.