I am trying to update two fields:
# This works, changes the correct field.
DModel.objects.filter(ticker=i['tic']).filter(trail=True).update(
    last_high_price=Case(
        When(
            LessThan(F('last_high_price'),
                     i['high_price']),
            then=Value(i['last_high_price'])),
        default=F('last_high_price')
    )
)
# Using the same condition to change another field in same row. Does not work?
DModel.objects.filter(ticker=i['tic']).filter(trail=True).update(
    date_high_price=Case(
        When(
            LessThan(F('last_high_price'),
                     i['high_price']),
            then=Value(i['last_date_time'])),
        output_field=DateField(), default=F('date_high_price')
    )
)
The field date_high_price does not update. I get a 200 response, but the field remains null.
If I remove output_field, I get a FieldError.
Edit:
Leaving the question as is in case someone else faces the same problem.
Basically, the first statement updates last_high_price, so when the second statement rechecks the same condition it is no longer true and the default value is used. Move the date_high_price update before the last_high_price update and it works.
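Besides reordering the two statements, both columns could be set in a single UPDATE, which in Django terms would mean passing both Case expressions to one update() call. The sketch below uses plain sqlite3 rather than the ORM so it runs on its own; the table layout and the sample values are assumptions made up for illustration. SQLite and PostgreSQL evaluate every right-hand expression against the stored row before assigning, but some backends (MySQL, for one) are documented to assign left to right, so check your database before relying on this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dmodel (ticker TEXT, last_high_price REAL, date_high_price TEXT)"
)
conn.execute("INSERT INTO dmodel VALUES ('ABC', 10.0, NULL)")

new_high, new_date = 12.5, "2024-01-02"  # made-up sample values

# Both CASE expressions sit in one UPDATE, so on SQLite both compare against
# the stored last_high_price (10.0), not the value written by this statement.
conn.execute(
    """
    UPDATE dmodel
    SET last_high_price = CASE WHEN last_high_price < :high
                               THEN :high ELSE last_high_price END,
        date_high_price = CASE WHEN last_high_price < :high
                               THEN :date ELSE date_high_price END
    WHERE ticker = 'ABC'
    """,
    {"high": new_high, "date": new_date},
)
row = conn.execute(
    "SELECT last_high_price, date_high_price FROM dmodel"
).fetchone()
print(row)
```

With one statement there is no intermediate state in which the price has changed but the date has not.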
The model is:
class RecordModel(BaseModel):
    visibility_setting = models.PositiveIntegerField()
    visible_to = models.ManyToManyField(UserModel, blank=True)
I need to return rows depending on each row's visibility_setting:
if visibility_setting == 0 - return the row without any checks,
if visibility_setting == 1 - return the row only if the user is in the visible_to m2m relation.
The query works for the second case, but for the first case my approach of setting None does not work (as expected):
Feed.objects.filter(
    visible_to=Case(
        When(visibility_setting=0, then=None),  # no data returned in this case, but I want to
                                                # return all rows with visibility_setting=0
        When(visibility_setting=1, then=user.id),  # rows with visibility_setting=1 are queried okay
    )
)
I am a little stuck on which approach to use here. Can Case be skipped entirely for some conditions, or is there a specific default that skips the visible_to relation check?
You can .filter(…) with Q objects:
from django.db.models import Q
Feed.objects.filter(Q(visibility_setting=0) | Q(visible_to=my_user)).distinct()
The .distinct() call prevents the same Feed from being returned multiple times.
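To see why the duplicates appear without .distinct(), here is a minimal sketch in plain sqlite3. It assumes the ORM joins the many-to-many through table with a LEFT OUTER JOIN for this OR condition; the table names, column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE feed (id INTEGER PRIMARY KEY, visibility_setting INTEGER);
    CREATE TABLE feed_visible_to (feed_id INTEGER, user_id INTEGER);
    INSERT INTO feed VALUES (1, 0), (2, 1), (3, 1);
    -- feed 1 is public but also lists two users; feed 2 lists user 7
    INSERT INTO feed_visible_to VALUES (1, 7), (1, 8), (2, 7), (3, 9);
    """
)

query = """
    SELECT {distinct} feed.id
    FROM feed
    LEFT OUTER JOIN feed_visible_to ON feed_visible_to.feed_id = feed.id
    WHERE feed.visibility_setting = 0 OR feed_visible_to.user_id = 7
    ORDER BY feed.id
"""
# The join yields one row per (feed, user) pair, so public feed 1 shows up twice.
without_distinct = [r[0] for r in conn.execute(query.format(distinct=""))]
with_distinct = [r[0] for r in conn.execute(query.format(distinct="DISTINCT"))]
print(without_distinct)  # [1, 1, 2]
print(with_distinct)     # [1, 2]
```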
I have a queryset that returns a lot of data. It can be filtered by year, which returns around 100k rows, or show everything, which returns around 1 million rows.
The objective of this annotation is to generate an xlsx spreadsheet.
Model representation; RelatedModel is the many-to-many table between Model and AnotherModel:
Model:
id
field1
field2
field3
RelatedModel:
foreign_key_model (Model)
foreign_key_another (AnotherModel)
The queryset annotates each row with whether the relation exists. This annotation is very slow and can take several minutes.
Model.objects.all().annotate(
    related_exists=Exists(RelatedModel.objects.filter(foreign_key_model=OuterRef('id'))),
    related_column=Case(
        When(related_exists=True, then=Value('The relation exists!')),
        When(related_exists=False, then=Value("The relation doesn't exist!")),
        default=Value('This is the default value!'),
        output_field=CharField(),
    )
).values_list(
    'related_column',
    'field1',
    'field2',
    'field3'
)
If the only thing needed is to change how True / False is displayed in the xlsx, one option is to keep a single related_exists BooleanField annotation and customize how it is converted when the xlsx document is created, e.g. in a serializer. The database should return raw, unformatted values, and the app should prepare them for display.
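A minimal sketch of that idea, assuming the queryset is reduced to values_list('related_exists', 'field1', 'field2', 'field3') and the rows are then handed to whichever xlsx writer is in use. The helper name labelled_rows and the sample data are hypothetical.

```python
def labelled_rows(rows):
    """Yield spreadsheet rows, replacing the leading boolean with a label."""
    for related_exists, *fields in rows:
        if related_exists:
            label = "The relation exists!"
        else:
            label = "The relation doesn't exist!"
        yield [label, *fields]


# Stand-in for the tuples a values_list() call would return.
sample = [(True, "a", "b", "c"), (False, "d", "e", "f")]
result = list(labelled_rows(sample))
print(result)
```

Because the function is a generator, it can wrap a streamed queryset without holding a second copy of a million rows in memory.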
Other things to consider:
Indexes to speed-up filtering.
If you have millions of records after filtering, in one table - maybe table partitioning could be considered.
But let's look into raw sql of original query. It will be like this:
SELECT [model_fields],
    EXISTS([CLIENT_SELECT]) AS related_exists,
    CASE
        WHEN EXISTS([CLIENT_SELECT]) = true THEN 'The relation exists!'
        WHEN EXISTS([CLIENT_SELECT]) = false THEN 'The relation does not exist!'
        ELSE 'This is the default value!'
    END AS related_column
FROM model;
Right away we can see that the nested query for Exists, CLIENT_SELECT, appears 3 times. Even though it is exactly the same each time, it may be executed at least 2 and up to 3 times. The database may optimize this to be faster than 3x, but it is still not as good as running it once.
First, EXISTS returns either True or False, so we can keep just the check that it is True and make 'The relation does not exist!' the default value.
related_column=Case(
    When(related_exists=True, then=Value('The relation exists!')),
    default=Value('The relation does not exist!')
)
Why does related_column perform the same select again instead of taking the value of related_exists?
Because we cannot reference a calculated column while calculating another column. This is a database-level constraint that Django knows about, so it duplicates the expression.
Wait, then we do not actually need the related_exists column. Let's keep only related_column, with a CASE statement and 1 EXISTS subquery.
Here Django gets in the way: until 3.0 we cannot use expressions in filters without annotating them first.
So in our case, in order to use Exists in When, we first need to add it as an annotation, yet it will not be used as a reference but copied in full.
Good news!
Since Django 3.0 we can use expressions that output BooleanField directly in QuerySet filters, without having to annotate first. Exists is one such BooleanField expression.
Model.objects.all().annotate(
    related_column=Case(
        When(
            Exists(RelatedModel.objects.filter(foreign_key_model=OuterRef('id'))),
            then=Value('The relation exists!'),
        ),
        default=Value("The relation doesn't exist!"),
        output_field=CharField(),
    )
)
And only one nested select, and one annotated field.
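The resulting query shape can be tried out directly in SQL. The sketch below uses sqlite3 with made-up tables and rows; the SQL Django generates will differ in naming and quoting, so treat it as an approximation of the single-subquery form rather than the exact output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE model (id INTEGER PRIMARY KEY, field1 TEXT);
    CREATE TABLE relatedmodel (foreign_key_model_id INTEGER);
    INSERT INTO model VALUES (1, 'a'), (2, 'b');
    INSERT INTO relatedmodel VALUES (1);
    """
)

# The EXISTS subquery appears exactly once, inside the CASE condition.
rows = conn.execute(
    """
    SELECT model.id,
           CASE
               WHEN EXISTS (SELECT 1 FROM relatedmodel
                            WHERE relatedmodel.foreign_key_model_id = model.id)
               THEN 'The relation exists!'
               ELSE 'The relation doesn''t exist!'
           END AS related_column
    FROM model
    ORDER BY model.id
    """
).fetchall()
print(rows)
```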
Django 2.1, 2.2
Here's the commit that finalized support for boolean expressions, although many preconditions for it were added earlier. One of them is the presence of a conditional attribute on the expression object, along with a check for this attribute.
So, although it is not recommended and not tested, the following little hack seems to work for Django 2.1 and 2.2 (earlier versions have no conditional check and would require more intrusive changes):
create Exists expression instance
monkey patch it with conditional = True
use it as condition in When statement
related_model_exists = Exists(RelatedModel.objects.filter(foreign_key_model=OuterRef('id')))
# Monkey patch: mark the expression as usable as a condition.
related_model_exists.conditional = True
Model.objects.all().annotate(
    related_column=Case(
        When(
            related_model_exists,
            then=Value('The relation exists!'),
        ),
        default=Value("The relation doesn't exist!"),
        output_field=CharField(),
    )
)
Related checks
A relatedmodel_set__isnull=True check is not suitable, for several reasons:
it performs a LEFT OUTER JOIN, which is less efficient than EXISTS
because the LEFT OUTER JOIN joins the tables, it is ONLY suitable in a filter() condition (not in annotate with When), and only for OneToOne or OneToMany relations (where the One is on the relatedmodel side)
You can considerably simplify your query to:
from django.db.models import Case, CharField, Value, When
Model.objects.all().annotate(
    related_column=Case(
        When(relatedmodel_set__isnull=True, then=Value("The relation doesn't exist!")),
        default=Value("The relation exists!"),
        output_field=CharField()
    )
)
Where relatedmodel_set is the related_name on your foreign key.
I'm using PostgreSQL:
articles = Article.objects.filter(articleinreports__report__in=reports_in_time_range).distinct().annotate(
    total_views=Case(
        When(articleinreports__report=last_report,
             then=(F("articleinreports__ios_views") + F("articleinreports__android_views"))),
        default=0, output_field=IntegerField()
    )
).order_by('-total_views')
Why is distinct() not being applied, so that I get two rows for an article instead of one for each unique article in the database?
Removing order_by alone does not solve it (and creates another problem anyway); the duplicates only go away when I remove both the annotation and the order_by.
When I try distinct('id'), I get an error that it must match the order_by field, which defeats the purpose of ordering by my desired criteria.
Why is this happening, and how do I solve it for my use case?
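One likely explanation, offered as an assumption since the generated SQL is not shown: the filter joins one articleinreports row per report, the Case annotation gives each joined row a different total_views, and DISTINCT only collapses rows that are identical in every selected column. The sqlite3 sketch below uses invented tables and numbers (report 2 stands in for last_report) and contrasts that with grouping per article and aggregating the CASE, which in the ORM would roughly correspond to wrapping the Case in an aggregate such as Sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE article (id INTEGER PRIMARY KEY);
    CREATE TABLE articleinreports (
        article_id INTEGER, report_id INTEGER,
        ios_views INTEGER, android_views INTEGER
    );
    INSERT INTO article VALUES (1);
    -- report 2 plays the role of last_report
    INSERT INTO articleinreports VALUES (1, 1, 5, 5), (1, 2, 30, 12);
    """
)

# DISTINCT sees (1, 42) and (1, 0) as different rows, so both survive.
per_join_row = conn.execute(
    """
    SELECT DISTINCT a.id,
           CASE WHEN r.report_id = 2
                THEN r.ios_views + r.android_views ELSE 0 END
    FROM article a JOIN articleinreports r ON r.article_id = a.id
    WHERE r.report_id IN (1, 2)
    ORDER BY 2 DESC
    """
).fetchall()

# Aggregating per article leaves exactly one row per article.
grouped = conn.execute(
    """
    SELECT a.id,
           SUM(CASE WHEN r.report_id = 2
                    THEN r.ios_views + r.android_views ELSE 0 END)
    FROM article a JOIN articleinreports r ON r.article_id = a.id
    WHERE r.report_id IN (1, 2)
    GROUP BY a.id
    ORDER BY 2 DESC
    """
).fetchall()
print(per_join_row)  # [(1, 42), (1, 0)]
print(grouped)       # [(1, 42)]
```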
I have a somewhat complicated model, so I will do my best to give an example that simplifies my current state, and my need.
I have a queryset:
qs = MyModel.objects.all()
Each instance in this queryset, has a many-2-many field to another model, let's call it 'First_M2M'. First_M2M has a foreign key to another model, and a many-2-many to yet another model (FkModel and Second_M2M, respectively):
qs[0].first_m2m.fk_model.name # This is a string.
qs[0].first_m2m.second_m2m.all() # This is a many2many manager.
The Second_M2M has another many-2-many relationship, Third_M2M:
qs[0].first_m2m.second_m2m[0].third_m2m.all() # Also a m2m manager.
Now that's what I'm trying to do: I want to order my qs, based on a value from one of the second_m2m instances. However, I need to choose which instance is it, and this is done by querying a field in the fk_model (to determine the first_m2m instance) AND a field in one of the instances in the third_m2m (this will determine the second_m2m).
In order to make it even more interesting, the value to order by, is YAML.
Here's what I tried to do:
qs.annotate(val_to_filter_by=Case(
    When(
        first_m2m__fk_model__name='foo',
        first_m2m__second_m2m__third_m2m__some_field='bar'),
    then='first_m2m__second_m2m__value_field',
    default=Value(None),
    output_field=YAMLField()
    )
).order_by(val_to_filter)
I believe what I got wrong is the query itself: it is not specific enough for Django to determine which instance it should take. But I can't find my mistake.
Any help will be much appreciated.
Solved it...
I had a mistake with my 'then'. It was part of my 'Case' instead of the 'When'. Here is the solution:
qs.annotate(val_to_filter=Case(
    When(
        first_m2m__fk_model__name='foo',
        first_m2m__second_m2m__third_m2m__some_field='bar',
        then=F('first_m2m__second_m2m__value_field')
    ),
    default=Value(''),
    output_field=YAMLField()
)).order_by('val_to_filter')
UPDATE: Didn't solve it...
Although the query is now valid and runs, I am getting ALL instances of the second_m2m instead of a single instance. I am still not sure how to get exactly what I need; it seems Django is not my friend in this case.
UPDATE2:
Changed the default to 'default=None' and added
.exclude(val_to_filter=None)
right before the ordering. Seems to work...
I'm surprised that this question apparently doesn't yet exist. If it does, please help me find it.
I want to use annotate (Count) and order_by, but I don't want to count every instance of a related object, only those that meet a certain criterion.
To wit, that I might list swallows by the number of green coconuts they have carried:
swallow.objects.annotate(num_coconuts=Count('coconuts_carried__husk__color = "green"')).order_by('num_coconuts')
For Django >= 1.8:
from django.db.models import Sum, Case, When, IntegerField
swallow.objects.annotate(
    num_coconuts=Sum(Case(
        When(coconuts_carried__husk__color="green", then=1),
        output_field=IntegerField(),
    ))
).order_by('num_coconuts')
This should be the right way.
swallow.objects.filter(
    coconuts_carried__husk__color="green"
).annotate(
    num_coconuts=Count('coconuts_carried')
).order_by('num_coconuts')
Note that when you filter on a related field, the raw SQL is an INNER JOIN plus a WHERE clause. The annotation then acts on that result set, which contains only the related rows selected by the filter, so swallows with no green coconuts are left out of the result entirely.
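The two answers above do not return the same rows, which a small sqlite3 sketch can show. The schema is a simplification made up for illustration (the husk colour is stored directly on the coconut row), and the SQL only approximates what the ORM would generate for each approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE swallow (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE coconut (
        id INTEGER PRIMARY KEY, swallow_id INTEGER, husk_color TEXT
    );
    INSERT INTO swallow VALUES (1, 'african'), (2, 'european');
    INSERT INTO coconut VALUES (1, 1, 'green'), (2, 1, 'green'),
                               (3, 1, 'brown'), (4, 2, 'brown');
    """
)

# Conditional aggregation: every swallow is kept; no match sums to NULL.
conditional = conn.execute(
    """
    SELECT s.name, SUM(CASE WHEN c.husk_color = 'green' THEN 1 END)
    FROM swallow s LEFT OUTER JOIN coconut c ON c.swallow_id = s.id
    GROUP BY s.id, s.name
    ORDER BY s.id
    """
).fetchall()

# Filter first, then count: swallows without green coconuts disappear.
filtered = conn.execute(
    """
    SELECT s.name, COUNT(c.id)
    FROM swallow s INNER JOIN coconut c ON c.swallow_id = s.id
    WHERE c.husk_color = 'green'
    GROUP BY s.id, s.name
    ORDER BY s.id
    """
).fetchall()
print(conditional)  # [('african', 2), ('european', None)]
print(filtered)     # [('african', 2)]
```

The None in the first result mirrors a Case with no default; adding default=0 to the Case would turn it into 0. On Django 2.0 and later, aggregates also accept a filter argument, so Count('coconuts_carried', filter=Q(coconuts_carried__husk__color="green")) is another way to express the conditional count.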