Get objects created in last 30 days, for each past day - django

I am looking for a fast method to count a model's objects created within the past 30 days, for each day separately. For example:
27.07.2013 (today) - 3 objects created
26.07.2013 - 0 objects created
25.07.2013 - 2 objects created
...
27.06.2013 - 1 objects created
I am going to use this data in the Google Charts API. Do you have any idea how to get this data efficiently?

items = Foo.objects.filter(createdate__lte=datetime.datetime.today(), createdate__gt=datetime.datetime.today()-datetime.timedelta(days=30)).\
values('createdate').annotate(count=Count('id'))
This will (1) filter results to contain the last 30 days, (2) select just the createdate field and (3) count the id's, grouping by all selected fields (i.e. createdate). This will return a list of dictionaries of the format:
[
{'createdate': <datetime.date object>, 'count': <int>},
{'createdate': <datetime.date object>, 'count': <int>},
...
]
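One caveat worth checking: grouping by createdate only gives per-day counts if the field holds plain dates. The sketch below is a pure-Python stand-in with no database involved (the sample timestamps and the helper name count_per_day are made up for illustration). It shows that values carrying a time component have to be truncated to a date before they are counted.

```python
import datetime
from collections import Counter


def count_per_day(timestamps):
    """Count timestamps per calendar day, ignoring the time of day."""
    return Counter(ts.date() for ts in timestamps)


# Hypothetical creation times: two on 27 July 2013, one on 25 July 2013.
created = [
    datetime.datetime(2013, 7, 27, 9, 15),
    datetime.datetime(2013, 7, 27, 18, 40),
    datetime.datetime(2013, 7, 25, 12, 0),
]

per_day = count_per_day(created)
for day, count in sorted(per_day.items()):
    print(day.isoformat(), count)
```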
EDIT:
I don't believe there's a way to get all dates, even those with count == 0, with just SQL. You'll have to insert each missing date through python code, e.g.:
import datetime
# needed to use .append() later on
items = list(items)
dates = [x.get('createdate') for x in items]
# use date objects so the membership test matches the dates in the results
today = datetime.date.today()
for d in (today - datetime.timedelta(days=x) for x in range(0, 30)):
    if d not in dates:
        items.append({'createdate': d, 'count': 0})
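As a variation on the same idea, the missing days can be filled with a dictionary lookup, which avoids scanning a list for every day and returns the rows already in order. This is only a sketch on plain dictionaries: fill_missing_days, its parameters and the sample rows are illustrative names, not part of Django.

```python
import datetime


def fill_missing_days(rows, end, days=30):
    """Return one row per day for the `days` days ending at `end`, newest first."""
    counts = {row['createdate']: row['count'] for row in rows}
    window = (end - datetime.timedelta(days=offset) for offset in range(days))
    return [{'createdate': day, 'count': counts.get(day, 0)} for day in window]


# Hypothetical query result: only days with at least one object appear.
rows = [
    {'createdate': datetime.date(2013, 7, 27), 'count': 3},
    {'createdate': datetime.date(2013, 7, 25), 'count': 2},
]

filled = fill_missing_days(rows, end=datetime.date(2013, 7, 27), days=3)
for row in filled:
    print(row['createdate'].strftime('%d.%m.%Y'), row['count'])
```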

I think @knbk's solution can be optimized somewhat on the Python side. The version below uses fewer iterations, and set operations are highly optimized in Python.
from_date = datetime.date.today() - datetime.timedelta(days=7)
# created_at__gt keeps the 7 days after from_date (today and the 6 days before it)
orders = Order.objects.filter(created_at__gt=from_date, dealer__executive__branch__user=user)
# values() must come before annotate() so the count is grouped by created_at
orders = orders.values('created_at').annotate(count=Count('id')).order_by('created_at')
if len(orders) < 7:
    orders_list = list(orders)
    dates = set([(datetime.date.today() - datetime.timedelta(days=i)) for i in range(7)])
    order_set = set([ord['created_at'] for ord in orders])
    # dates in the window that have no orders
    for dt in (dates - order_set):
        orders_list.append({'created_at': dt, 'count': 0})
    orders_list = sorted(orders_list, key=lambda item: item['created_at'])
else:
    orders_list = orders

Related

python Find the most reported month

I am trying to find the most frequently mentioned month (October, mentioned 2 times, in the example below). My idea was to use a dictionary, but I struggled to separate the months: my solution does not work for the first str value (commented out below), where there are spaces around some of the commas. Can someone please suggest how I can modify the split so that it handles both the commas and the whitespace?
import re
#str="May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990"
str="May-29-1990,Oct-18-1980,Sept-1-1980,Oct-2-1990"
val=re.split(',',str)
monthList=[]
myDictionary={}
#put the months in a list
def sep_month():
    for item in val:
        if not item.isdigit():
            month,day,year=item.split("-")
            monthList.append(month)
#process the month list from above
def count_month():
    for item in monthList:
        if item not in myDictionary.keys():
            myDictionary[item]=1
        else:
            myDictionary[item]=myDictionary.get(item)+1
    for k,v in myDictionary.items():
        if v==2:
            print(k)
sep_month()
count_month()
from datetime import datetime
import calendar
from collections import Counter
datesString = "May-29-1990,Oct-18-1980,Sep-1-1980,Oct-2-1990"
datesListString = datesString.split(",")
datesList = []
for dateStr in datesListString:
    datesList.append(datetime.strptime(dateStr, '%b-%d-%Y'))
monthsOccurrencies = Counter((calendar.month_name[date.month] for date in datesList))
print(monthsOccurrencies)
# Counter({'October': 2, 'May': 1, 'September': 1})
One thing to be aware of in my solution: for %b (the locale's abbreviated month name) to work, Sept had to be changed to Sep. You can use either full month names (%B) or abbreviated names (%b). If the input string cannot be produced with correctly formatted month names, replace the wrong ones (for example "Sept" with "Sep") and then always work with date objects.
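The replacement step can be sketched as follows. The FIXES mapping and the parse_date helper are hypothetical names introduced here, and the example assumes an English or C locale, since %b is locale dependent. It also strips the stray spaces found in the original input string.

```python
from collections import Counter
from datetime import datetime

# Assumed mapping of non-standard abbreviations to the ones %b accepts
# in an English/C locale; extend as needed.
FIXES = {'Sept': 'Sep'}


def parse_date(text):
    """Parse 'Mon-D-YYYY', tolerating surrounding spaces and 'Sept'."""
    month, day, year = text.strip().split('-')
    month = FIXES.get(month, month)
    return datetime.strptime('-'.join((month, day, year)), '%b-%d-%Y')


raw = "May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990"
parsed = [parse_date(part) for part in raw.split(',')]

# Count by month number so the result does not depend on locale names.
months = Counter(d.month for d in parsed)
print(months.most_common(1))
```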
I'm not sure that regex is the best tool for this job; I would just use strip() along with split() to handle your whitespace issues and get a list of just the month abbreviations. Then you could create a dict with counts by month using the list method count(). For example:
dates = 'May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990'
months = [d.split('-')[0].strip() for d in dates.split(',')]
month_counts = {m: months.count(m) for m in set(months)}
print(month_counts)
# {'May': 1, 'Oct': 2, 'Sept': 1}
Or even better with collections.Counter:
from collections import Counter
dates = 'May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990'
months = [d.split('-')[0].strip() for d in dates.split(',')]
month_counts = Counter(months)
print(month_counts)
# Counter({'Oct': 2, 'May': 1, 'Sept': 1})
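If you do want to keep re.split as in the question, a pattern that consumes the whitespace around each comma is one way to do it. This is a small sketch rather than a recommendation over the approaches above; it also shows most_common for picking out the top month directly.

```python
import re
from collections import Counter

dates = "May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990"

# Split on a comma together with any whitespace around it.
parts = re.split(r'\s*,\s*', dates.strip())
months = [part.split('-')[0] for part in parts]

# most_common(1) returns a list holding the single most frequent (month, count) pair.
top_month, top_count = Counter(months).most_common(1)[0]
print(top_month, top_count)
```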

Count unique dates from a multidimensional list

I am querying a database for Leads. Leads have a "lead generated date" and a possible "closed" date.
What I would like to do is get a month by month total for leads generated/leads closed per month in the format [MM/YYYY, leads generated, leads closed] for Google Visualization API.
I have my query logic set and currently have a result similar to:
[
["09/2011","09/2011"],
["09/2011","10/2011"],
["10/2011","12/2011"],
...
]
I am stuck trying to come up with an efficient way to parse this and get the result of:
[
["09/2011", 2, 1],
["10/2011", 1, 1],
["12/2011", 0, 1]
]
Any help would be appreciated!
It's not that beautiful, but this should work:
from collections import defaultdict
d1 = defaultdict(int)
d2 = defaultdict(int)
data = [["09/2011","09/2011"],["09/2011","10/2011"],["10/2011","12/2011"]]
for d in data:
    d1[d[0]] += 1
    d2[d[1]] += 1
out = []
for key in set(d1.keys()) | set(d2.keys()):
    out.append([key, d1.get(key, 0), d2.get(key, 0)])
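Iterating over a set gives the months in arbitrary order, and sorting the "MM/YYYY" strings directly would go wrong across years. A possible variant that returns the rows chronologically is sketched below. The function name monthly_totals is made up, and it assumes that a lead without a closed date shows up as None or an empty string, which the question does not specify.

```python
from collections import Counter


def monthly_totals(pairs):
    """Turn [generated, closed] month pairs into [month, generated, closed] rows."""
    generated = Counter(g for g, _ in pairs)
    # Assumption: an open lead has None or '' as its closed month, so skip it.
    closed = Counter(c for _, c in pairs if c)

    def chronological(label):
        month, year = label.split('/')
        return int(year), int(month)

    months = sorted(set(generated) | set(closed), key=chronological)
    # A Counter returns 0 for keys it has never seen.
    return [[m, generated[m], closed[m]] for m in months]


data = [["09/2011", "09/2011"], ["09/2011", "10/2011"], ["10/2011", "12/2011"]]
for row in monthly_totals(data):
    print(row)
```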

django-rating filtering in django

I am trying to do something pretty simple and trivial but with no luck.
I am using django-rating to rate specific objects on my site.
On my model which I wanted to rate I have a field :
rating = RatingField(range=5)
Now, all I want is to filter all of the objects which have a rating of 2 and above, for example.
If rating was an IntegerField, for example, I would only need to do:
objects.filter( rating__gte = 2)
How can I do the same using django-rating?
Reading the django-rating documentation, I found this trick to sort by rating:
# In this example, ``rating`` is the attribute name for your ``RatingField``
qs = qs.extra(select={
    'rating': '((100/%s*rating_score/(rating_votes+%s))+100)/2'
              % (MyModel.rating.range, MyModel.rating.weight)
})
qs = qs.order_by('-rating')
Perhaps you can modify this code sample and use extra where to get your results:
qs = qs.extra(where=[
    '((100/%s*rating_score/(rating_votes+%s))+100)/2 >= 2 ' %
    (MyModel.rating.range, MyModel.rating.weight),
])
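Before relying on the >= 2 comparison, it may help to evaluate the expression by hand. The sketch below does that in plain Python under an assumption that is not confirmed here: that rating_score is the sum of all votes and rating_votes is the number of votes. If that holds, the expression lives on a scale of roughly 50 to 100 rather than 1 to 5, so the threshold would have to be converted to the same scale. The helper names are hypothetical.

```python
def rating_expression(score, votes, rating_range=5, weight=0):
    """Evaluate ((100/range*score/(votes+weight))+100)/2 with float division."""
    return ((100.0 / rating_range * score / (votes + weight)) + 100) / 2


def threshold_for(average, rating_range=5):
    """Value of the expression for a given average rating when weight is 0."""
    return ((100.0 / rating_range * average) + 100) / 2


# Hypothetical object: 5 votes summing to 10, i.e. an average of 2 out of 5.
print(rating_expression(score=10, votes=5))
# The same average expressed on the scale of the expression.
print(threshold_for(2))
# An object averaging 1 out of 5 still passes a ">= 2" comparison.
print(rating_expression(score=5, votes=5) >= 2)
```

Under these assumptions, a filter for an average of 2 out of 5 would compare the expression against threshold_for(2) instead of 2.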

The queryset's `count` is wrong after `extra`

When I use extra in a certain way on a Django queryset (call it qs), the result of qs.count() is different than len(qs.all()). To reproduce:
Make an empty Django project and app, then add a trivial model:
class Baz(models.Model):
    pass
Now make a few objects:
>>> Baz(id=1).save()
>>> Baz(id=2).save()
>>> Baz(id=3).save()
>>> Baz(id=4).save()
Using the extra method to select only some of them produces the expected count:
>>> Baz.objects.extra(where=['id > 2']).count()
2
>>> Baz.objects.extra(where=['-id < -2']).count()
2
But add a select clause to the extra and refer to it in the where clause, and the count is suddenly wrong, even though the result of all() is correct:
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).all()
[<Baz: Baz object>, <Baz: Baz object>] # As expected
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
0 # Should be 2
I think the problem has to do with django.db.models.sql.query.BaseQuery.get_count(). It checks whether the BaseQuery's select or aggregate_select attributes have been set; if so, it uses a subquery. But django.db.models.sql.query.BaseQuery.add_extra adds only to the BaseQuery's extra attribute, not select or aggregate_select.
How can I fix the problem? I know I could just use len(qs.all()), but it would be nice to be able to pass the extra'ed queryset to other parts of the code, and those parts may call count() without knowing that it's broken.
Redefining get_count() and monkeypatching appears to fix the problem:
def get_count(self):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()
        obj = AggregateQuery(obj.model, obj.connection)
        obj.add_subquery(subquery)
    obj.add_count_column()
    number = obj.get_aggregation()[None]
    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)
    return number
django.db.models.sql.query.BaseQuery.get_count = quuux.get_count
Testing:
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
2
Updated to work with Django 1.2.1:
def basequery_get_count(self, using):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()
        obj = AggregateQuery(obj.model)
        obj.add_subquery(subquery, using=using)
    obj.add_count_column()
    number = obj.get_aggregation(using=using)[None]
    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)
    return number
models.sql.query.Query.get_count = basequery_get_count
I'm not sure if this fix will have other unintended consequences, however.
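The offset and limit arithmetic at the end of the patched function can be checked in isolation. The sketch below copies just those lines into a standalone helper (apply_slice_marks is a made-up name, not a Django function) and compares the result with slicing a plain list of the same length.

```python
def apply_slice_marks(total, low_mark=0, high_mark=None):
    """Adjust a raw COUNT for an offset (low_mark) and an end index (high_mark)."""
    number = max(0, total - low_mark)
    if high_mark is not None:
        number = min(number, high_mark - low_mark)
    return number


# Compare against slicing a real list of the same length.
rows = list(range(10))
print(apply_slice_marks(10, 2, 5), len(rows[2:5]))
print(apply_slice_marks(10, 8, 20), len(rows[8:20]))
print(apply_slice_marks(10, 12), len(rows[12:]))
```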

Using the "extra fields " from django many-to-many relationships with extra fields

The Django documentation gives this example of associating extra data with a M2M relationship. Although that is straightforward, now that I am trying to make use of the extra data in my views it is feeling very clumsy (which typically means "I'm doing it wrong").
For example, using the models defined in the linked document above I can do the following:
# Some people
ringo = Person.objects.create(name="Ringo Starr")
paul = Person.objects.create(name="Paul McCartney")
me = Person.objects.create(name="Me the rock Star")
# Some bands
beatles = Group.objects.create(name="The Beatles")
my_band = Group.objects.create(name="My Imaginary band")
# The Beatles form
m1 = Membership.objects.create(person=ringo, group=beatles,
date_joined=date(1962, 8, 16),
invite_reason= "Needed a new drummer.")
m2 = Membership.objects.create(person=paul, group=beatles,
date_joined=date(1960, 8, 1),
invite_reason= "Wanted to form a band.")
# My Imaginary band forms
m3 = Membership.objects.create(person=me, group=my_band,
date_joined=date(1980, 10, 5),
invite_reason= "Want to be a star.")
m4 = Membership.objects.create(person=paul, group=my_band,
date_joined=date(1980, 10, 5),
invite_reason= "Wanted to form a better band.")
Now if I want to print a simple table that for each person gives the date that they joined each band, at the moment I am doing this:
bands = Group.objects.all().order_by('name')
for person in Person.objects.all():
    print person.name,
    for band in bands:
        print band.name,
        try:
            m = person.membership_set.get(group=band.pk)
            print m.date_joined,
        except:
            print 'NA',
    print ""
Which feels very ugly, especially the "m = person.membership_set.get(group=band.pk)" bit. Am I going about this whole thing wrong?
Now, say I wanted to order the people by the date that they joined a particular band (say, the Beatles): is there any order_by clause I can put on Person.objects.all() that would let me do that?
Any advice would be greatly appreciated.
You should query the Membership model instead:
members = Membership.objects.select_related('person', 'group').all().order_by('date_joined')
for m in members:
    # the foreign key on Membership is called group, not band
    print m.group.name, m.person.name, m.date_joined
Using select_related here we avoid the 1 + n queries problem, as it tells the ORM to do the join and selects everything in one single query.
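Once the memberships have been fetched in a single query, the person-by-band table from the question can be built in memory instead of calling get() for every cell. The sketch below uses plain tuples as stand-ins for Membership rows (no Django involved, and the variable names are illustrative) to show the lookup and the ordering by join date for one band.

```python
from datetime import date

# Stand-ins for Membership rows: (person name, group name, date joined).
memberships = [
    ("Ringo Starr", "The Beatles", date(1962, 8, 16)),
    ("Paul McCartney", "The Beatles", date(1960, 8, 1)),
    ("Me the rock Star", "My Imaginary band", date(1980, 10, 5)),
    ("Paul McCartney", "My Imaginary band", date(1980, 10, 5)),
]

# One pass builds a lookup keyed by (person, group).
joined = {(person, group): when for person, group, when in memberships}
people = sorted({person for person, _, _ in memberships})
bands = sorted({group for _, group, _ in memberships})

# One line per person, one cell per band, "NA" where there is no membership.
for person in people:
    cells = [str(joined.get((person, band), "NA")) for band in bands]
    print(person, *cells)

# People ordered by the date they joined one particular band.
beatles = sorted(
    (when, person) for person, group, when in memberships if group == "The Beatles"
)
print([person for _, person in beatles])
```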