The queryset's `count` is wrong after `extra` - django

When I use `extra` in a certain way on a Django queryset (call it `qs`), the result of `qs.count()` is different from `len(qs.all())`. To reproduce:
Make an empty Django project and app, then add a trivial model:
class Baz(models.Model):
    pass
Now make a few objects:
>>> Baz(id=1).save()
>>> Baz(id=2).save()
>>> Baz(id=3).save()
>>> Baz(id=4).save()
Using the extra method to select only some of them produces the expected count:
>>> Baz.objects.extra(where=['id > 2']).count()
2
>>> Baz.objects.extra(where=['-id < -2']).count()
2
But add a select clause to the extra and refer to it in the where clause, and the count is suddenly wrong, even though the result of all() is correct:
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).all()
[<Baz: Baz object>, <Baz: Baz object>] # As expected
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
0 # Should be 2
I think the problem has to do with django.db.models.sql.query.BaseQuery.get_count(). It checks whether the BaseQuery's select or aggregate_select attributes have been set; if so, it uses a subquery. But django.db.models.sql.query.BaseQuery.add_extra adds only to the BaseQuery's extra attribute, not select or aggregate_select.
How can I fix the problem? I know I could just use len(qs.all()), but it would be nice to be able to pass the extra'ed queryset to other parts of the code, and those parts may call count() without knowing that it's broken.

Redefining get_count() and monkeypatching appears to fix the problem:
def get_count(self):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()
        obj = AggregateQuery(obj.model, obj.connection)
        obj.add_subquery(subquery)
    obj.add_count_column()
    number = obj.get_aggregation()[None]
    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)
    return number
django.db.models.sql.query.BaseQuery.get_count = get_count
Testing:
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
2
Updated to work with Django 1.2.1:
def basequery_get_count(self, using):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()
        obj = AggregateQuery(obj.model)
        obj.add_subquery(subquery, using=using)
    obj.add_count_column()
    number = obj.get_aggregation(using=using)[None]
    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)
    return number
models.sql.query.Query.get_count = basequery_get_count
I'm not sure if this fix will have other unintended consequences, however.
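If the unintended consequences worry you, one option (a generic stdlib sketch of my own, not Django-specific) is to scope the monkeypatch with a context manager, so the replacement `get_count` is only in effect where you know you need it:

```python
from contextlib import contextmanager

@contextmanager
def patched(target, attr, replacement):
    """Temporarily replace target.attr, restoring the original on exit."""
    original = getattr(target, attr)
    setattr(target, attr, replacement)
    try:
        yield
    finally:
        setattr(target, attr, original)
```

Used e.g. as `with patched(models.sql.query.Query, 'get_count', basequery_get_count): n = qs.count()`; outside the `with` block, Django's stock behavior is untouched.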

Optional input data

For the problem formulation
import pyomo.environ as pe
model = pe.AbstractModel()
model.I = pe.Set()
model.p = pe.Param(model.I)
model.create_instance("input.dat")
and the input.dat
set I := 1 2 3 ;
param p :=
1 0.1
2 0.2
3 0.3
;
param q :=
1 1.1
2 2.2
3 3.3
;
The following error is shown
AttributeError: 'AbstractModel' object has no attribute 'q'
How to silence create_instance in this case? The model is fully specified. The "excess" data (parameter q in this case) is needed for another model and the models share this input.dat. I could go with a try/except for the AttributeError and just carry on I guess, but then I would need to guard each create_instance call. I looked for a "skip_undefined" kwarg or similar in the documentation. Is there another preferred way to handle this situation?
According to the documentation, if you load your data using the `load` method of the `DataPortal` class, parameters not used by the model are omitted.
Therefore you may try:
from pyomo.environ import *

data = DataPortal()
data.load(filename='./input.dat')
model = AbstractModel()
model.I = Set()
model.p = Param(model.I)
instance = model.create_instance(data)

Django postgres Json field with numeric keys

I have a model with a Postgres JSON field.
class MyModel(models.Model):
    data = JSONField(null=True)
then, I do:
m1 = MyModel.objects.create(data={'10':'2017-12-1'})
m2 = MyModel.objects.create(data={'10':'2018-5-1'})
I want to query all the MyModel objects whose key '10' starts with '2017', so I want to write:
MyModel.objects.filter(data__10__startswith='2017')
The problem is that the 10 is interpreted as an integer, and therefore, in the generated query it is treated as a list index and not a key.
Is there any way to solve this (except writing raw queries)?
This is the generated query:
SELECT "systools_mymodel"."id", "systools_mymodel"."data" FROM "systools_mymodel" WHERE ("systools_mymodel"."data" ->> 10)::text LIKE '2017%' LIMIT 21;
And I want the 10 to be quoted (i.e. `->> '10'`), which would give me the right answer.
Thanks!
A very hackish solution (use at your own risk; tested under Django 2.0.5, voids warranty...):
# patch_jsonb.py
from django.contrib.postgres.fields.jsonb import KeyTransform

def as_sql(self, compiler, connection):
    key_transforms = [self.key_name]
    previous = self.lhs
    while isinstance(previous, KeyTransform):
        key_transforms.insert(0, previous.key_name)
        previous = previous.lhs
    lhs, params = compiler.compile(previous)
    if len(key_transforms) > 1:
        return "(%s %s %%s)" % (lhs, self.nested_operator), [
            key_transforms] + params
    try:
        int(self.key_name)
    except ValueError:
        if self.key_name.startswith("K") and self.key_name[1:].isnumeric():
            lookup = "'%s'" % self.key_name[1:]
        else:
            lookup = "'%s'" % self.key_name
    else:
        lookup = "%s" % self.key_name
    return "(%s %s %s)" % (lhs, self.operator, lookup), params

def patch():
    KeyTransform.as_sql = as_sql
Usage:
Add this to the bottom of your settings.py:
import patch_jsonb
patch_jsonb.patch()
Instead of __123__ lookups use __K123__ lookups - the uppercase K will be stripped by this patch:
MyModel.objects.filter(data__K10__startswith='2017')
And consider avoiding using numbers as jsonb object keys...
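The key-quoting rule the patch applies can be exercised in isolation; this is just the lookup-formatting branch of the `as_sql` above lifted into a plain function (the function name is mine, not Django's):

```python
def quote_key(key_name):
    """Numeric keys pass through unquoted (array index); 'K'-prefixed
    numeric keys are quoted with the K stripped (object key); anything
    else is quoted as a normal string key."""
    try:
        int(key_name)
    except ValueError:
        if key_name.startswith("K") and key_name[1:].isnumeric():
            return "'%s'" % key_name[1:]
        return "'%s'" % key_name
    return key_name

print(quote_key("10"))    # 10  (unquoted: list index)
print(quote_key("K10"))   # '10'  (quoted: object key)
print(quote_key("name"))  # 'name'
```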

How to add time in a python list

This is the list:
lsty = ['1:07:11', '2:37:28', '07:11', '1:07:11']
Time can be like '2:37:28' (2h 37m 28s) or '07:11' (7m 11s). How can I sum up the list?
You may find the native Python `datetime.timedelta` object useful; it allows you to represent time in a way that Python understands and to perform arithmetic with other timedelta objects.
Perhaps something like this? This is totally untested:
from datetime import timedelta

def sum_times(times):
    total = timedelta(0)
    for time in times:
        time_split = [int(v) for v in time.split(':')]  # Extract just time vals
        if len(time_split) == 2:  # Just mins/secs
            t_delt = timedelta(minutes=time_split[0],
                               seconds=time_split[1])
        else:
            t_delt = timedelta(hours=time_split[0],
                               minutes=time_split[1],
                               seconds=time_split[2])
        total += t_delt  # This is where the magic happens
    hours, rem = divmod(int(total.total_seconds()), 3600)
    minutes, seconds = divmod(rem, 60)
    return '%d:%02d:%02d' % (hours, minutes, seconds)
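For what it's worth, the same sum can be done with plain integer arithmetic and no `timedelta` at all; the Horner-style accumulation below handles both `MM:SS` and `HH:MM:SS` strings (a sketch of my own, same input assumptions as above):

```python
def to_seconds(t):
    secs = 0
    for part in t.split(':'):
        secs = secs * 60 + int(part)  # works for MM:SS and HH:MM:SS alike
    return secs

lsty = ['1:07:11', '2:37:28', '07:11', '1:07:11']
total = sum(to_seconds(t) for t in lsty)
h, rem = divmod(total, 3600)
m, s = divmod(rem, 60)
print('%d:%02d:%02d' % (h, m, s))  # 4:59:01
```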

Get objects created in last 30 days, for each past day

I am looking for a fast method to count a model's objects created within the past 30 days, for each day separately. For example:
27.07.2013 (today) - 3 objects created
26.07.2013 - 0 objects created
25.07.2013 - 2 objects created
...
27.06.2013 - 1 objects created
I am going to use this data in the Google Charts API. Do you have any idea how to get this data efficiently?
items = Foo.objects.filter(
    createdate__lte=datetime.datetime.today(),
    createdate__gt=datetime.datetime.today() - datetime.timedelta(days=30),
).values('createdate').annotate(count=Count('id'))
This will (1) filter results to contain the last 30 days, (2) select just the createdate field and (3) count the id's, grouping by all selected fields (i.e. createdate). This will return a list of dictionaries of the format:
[
    {'createdate': <datetime.date object>, 'count': <int>},
    {'createdate': <datetime.date object>, 'count': <int>},
    ...
]
EDIT:
I don't believe there's a way to get all dates, even those with count == 0, with just SQL. You'll have to insert each missing date through python code, e.g.:
import datetime

# needed to use .append() later on
items = list(items)

dates = [x.get('createdate') for x in items]
for d in (datetime.datetime.today() - datetime.timedelta(days=x) for x in range(0, 30)):
    if d not in dates:
        items.append({'createdate': d, 'count': 0})
I think this is a somewhat more optimized version of knbk's solution on the Python side. It does fewer iterations, and set operations are highly optimized in Python (both in processing and in CPU cycles).
from_date = datetime.date.today() - datetime.timedelta(days=7)
orders = Order.objects.filter(created_at__gte=from_date, dealer__executive__branch__user=user)
orders = orders.values('created_at').annotate(count=Count('id')).order_by('created_at')
if len(orders) < 7:
    orders_list = list(orders)
    dates = set(datetime.date.today() - datetime.timedelta(days=i) for i in range(7))
    order_dates = set(o['created_at'] for o in orders)
    # add a zero-count entry for each day that had no orders
    for dt in (dates - order_dates):
        orders_list.append({'created_at': dt, 'count': 0})
    orders_list = sorted(orders_list, key=lambda item: item['created_at'])
else:
    orders_list = orders
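Stripped of the ORM parts, the zero-filling step is just a dictionary lookup per day; a stdlib-only sketch (the function and names are mine, not from either answer):

```python
import datetime

def fill_missing_days(counts, days=7):
    """counts maps date -> number of objects; return one dict per day for
    the last `days` days, oldest first, with 0 where nothing was created."""
    today = datetime.date.today()
    return [
        {'created_at': today - datetime.timedelta(days=offset),
         'count': counts.get(today - datetime.timedelta(days=offset), 0)}
        for offset in range(days - 1, -1, -1)
    ]
```

For example, `fill_missing_days({datetime.date.today(): 3})` returns 7 entries, all zero except today's.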

django-rating filtering in django

I am trying to do something pretty simple and trivial but with no luck.
I am using django-rating to rate specific objects on my site.
On my model which I wanted to rate I have a field :
rating = RatingField(range=5)
Now, all I want is to filter all of the objects which have a rating of 2 and above, for example.
If rating was IntegerField for example, I would only need to do :
objects.filter(rating__gte=2)
how can I do the same using django-rating ?
Reading the django-rating documentation I found this trick to sort by rating:
# In this example, ``rating`` is the attribute name for your ``RatingField``
qs = qs.extra(select={
    'rating': '((100/%s*rating_score/(rating_votes+%s))+100)/2'
              % (MyModel.rating.range, MyModel.rating.weight)
})
qs = qs.order_by('-rating')
Perhaps you can modify this code sample and use extra's where clause to get your results:
qs = qs.extra(where=[
    '((100/%s*rating_score/(rating_votes+%s))+100)/2 >= 2' %
    (MyModel.rating.range, MyModel.rating.weight),
])