Map (apply function) Django QuerySet

Is there a mechanism to map over the items of a Django QuerySet without triggering its evaluation?
I am thinking of something like Python's map: a method that applies a function to each item of the QuerySet while keeping the lazy evaluation.
For example, using the models from the Django documentation example, is there something like this (not real code):
>>> Question.objects.all().map(lambda q: q.pub_date + timedelta(hours=1))
which keeps the lazy evaluation?

I was just working through something like this myself. I think the simplest way to do it is to use a Python list comprehension:
[q.pub_date + timedelta(hours=1) for q in Question.objects.all()]
Django runs this as a single ordinary query, but note that iterating evaluates the QuerySet, so the result is a plain list and the laziness is lost.
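If only the per-row computation needs to be deferred, a generator function is one option, since none of its body runs until the first item is requested. The sketch below uses a hypothetical FakeQuerySet stand-in (not a Django class) so that it runs without a database; lazy_map is likewise a made-up helper name. The result is a plain generator, not a QuerySet.

```python
from datetime import datetime, timedelta


class FakeQuerySet:
    """Hypothetical stand-in for a QuerySet: records when it is iterated."""

    def __init__(self, pub_dates):
        self._pub_dates = pub_dates
        self.evaluated = False

    def __iter__(self):
        # A real QuerySet would hit the database at this point.
        self.evaluated = True
        return iter(self._pub_dates)


def lazy_map(func, iterable):
    """Apply func to each item; the iterable is untouched until consumed."""
    for item in iterable:
        yield func(item)


qs = FakeQuerySet([datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 2, 9, 30)])

shifted_lazy = lazy_map(lambda d: d + timedelta(hours=1), qs)
evaluated_before_consumption = qs.evaluated  # nothing has been iterated yet

shifted = list(shifted_lazy)  # iteration (and evaluation) starts here
evaluated_after_consumption = qs.evaluated

# By contrast, the built-in map() calls iter() on its argument right away.
qs_for_map = FakeQuerySet([datetime(2024, 1, 1, 12, 0)])
mapped = map(lambda d: d + timedelta(hours=1), qs_for_map)
map_evaluated_immediately = qs_for_map.evaluated
```

Once the generator has been consumed it cannot be reused, so this only postpones the work; it does not give back QuerySet methods such as filter().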

You can use annotate for this purpose, for example:
from datetime import timedelta
from django.db.models import F, ExpressionWrapper, DateTimeField
Question.objects.annotate(
    new_pub_date=ExpressionWrapper(
        F('pub_date') + timedelta(hours=1),
        output_field=DateTimeField()
    )
)
For something a little more complex than this example, you can use Func, Case, and When.

You can use values_list to get any column you like:
Question.objects.values_list('pub_date')
This returns 1-tuples; pass flat=True to get the bare values. It is simpler than anything you can cook up yourself.

Related

What is the usage of `FilteredRelation()` objects in Django ORM (Django 2.X)?

I've seen that Django 2.0 introduces a FilteredRelation object for querysets. What is this newly introduced FilteredRelation used for?
What I've looked into:
I read the Django 2.0 documentation, but I could not understand the idea behind this FilteredRelation object.
I looked at the following code, but I didn't get it.
>>> from django.db.models import FilteredRelation, Q
>>> Restaurant.objects.annotate(
...     pizzas_vegetarian=FilteredRelation(
...         'pizzas', condition=Q(pizzas__vegetarian=True),
...     ),
... ).filter(pizzas_vegetarian__name__icontains='mozzarella')
Main Question
So now my question is: what is FilteredRelation used for, and when should you use it in a QuerySet?

I think the documentation itself is self-explanatory.
You could achieve the same result in two ways:
Method-1
from django.db.models import FilteredRelation, Q
result_1 = Restaurant.objects.annotate(
    pizzas_vegetarian=FilteredRelation('pizzas', condition=Q(pizzas__vegetarian=True)),
).filter(pizzas_vegetarian__name__icontains='mozzarella')
Method-2
result_2 = Restaurant.objects.filter(pizzas__vegetarian=True, pizzas__name__icontains='mozzarella')
You will get better performance with Method-1 since the filtering in the WHERE clause of the first queryset will only operate on vegetarian pizzas.
UPDATE
The Django #29555 ticket has more information regarding the usage and performance.
The FilteredRelation() not only improves performance but also creates
correct results when aggregating with multiple LEFT JOINs.
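To see why the placement of the condition matters for aggregation, here is a rough sketch in plain SQL using the standard-library sqlite3 module. The tables and rows are made up for illustration, and the SQL is only an approximation of the shape involved (condition in the JOIN's ON clause versus in WHERE), not the exact query Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pizza (
        id INTEGER PRIMARY KEY,
        restaurant_id INTEGER REFERENCES restaurant(id),
        name TEXT,
        vegetarian INTEGER
    );
    INSERT INTO restaurant VALUES (1, 'Verde'), (2, 'Carne');
    INSERT INTO pizza VALUES
        (1, 1, 'Mozzarella', 1),
        (2, 1, 'Salami', 0),
        (3, 2, 'Pepperoni', 0);
""")

# Condition inside the ON clause: restaurants with no vegetarian pizza
# survive the LEFT JOIN and are counted as 0.
on_clause = conn.execute("""
    SELECT r.name, COUNT(p.id)
    FROM restaurant r
    LEFT JOIN pizza p ON p.restaurant_id = r.id AND p.vegetarian = 1
    GROUP BY r.id
    ORDER BY r.id
""").fetchall()

# Same condition in WHERE: rows are filtered after the join, so
# restaurants with no vegetarian pizza disappear from the result.
where_clause = conn.execute("""
    SELECT r.name, COUNT(p.id)
    FROM restaurant r
    LEFT JOIN pizza p ON p.restaurant_id = r.id
    WHERE p.vegetarian = 1
    GROUP BY r.id
    ORDER BY r.id
""").fetchall()

print(on_clause)     # [('Verde', 1), ('Carne', 0)]
print(where_clause)  # [('Verde', 1)]
```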

How to use a tsvector field to perform ranking in Django with postgresql full-text search?

I need to perform a ranking query using postgresql full-text search feature and Django with django.contrib.postgres module.
According to the doc, it is quite easy to do this using the SearchRank class by doing the following:
>>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
>>> vector = SearchVector('body_text')
>>> query = SearchQuery('cheese')
>>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
This probably works well but this is not exactly what I want since I have a field in my table which already contains tsvectorized data that I would like to use (instead of recomputing tsvector at each search query).
Unfortunately, I can't figure out how to provide this tsvector field to the SearchRank class instead of a SearchVector object on a raw data field.
Is anyone able to indicate how to deal with this?
Edit:
Of course, simply trying to instantiate a SearchVector from the tsvector field does not work and fails with this error (approximate, since I translated it from French):
django.db.utils.ProgrammingError: ERROR: function to_tsvector(tsvector) does not exist

If your model has a SearchVectorField like so:
from django.contrib.postgres.search import SearchVectorField
from django.db import models
class Entry(models.Model):
    ...
    search_vector = SearchVectorField()
you would use an F expression:
from django.contrib.postgres.search import SearchRank
from django.db.models import F
...
Entry.objects.annotate(
    rank=SearchRank(F('search_vector'), query)
).order_by('-rank')

I've been seeing mixed answers here on SO and in the official documentation. F expressions aren't used in the documentation for this, although that may just be because the documentation doesn't provide an example of using SearchRank with a SearchVectorField.
Looking at the output of .explain(analyze=True):
Without the F expression:
Sort Key: (ts_rank(to_tsvector(COALESCE((search_vector)::text, ''::text))
When the F expression is used:
Sort Key: (ts_rank(search_vector, ...)
In my experience, the only difference between using an F expression and passing the field name in quotes is that the F expression returns much faster but is sometimes less accurate, depending on how you structure the query; in some cases it can be useful to enforce the COALESCE. In my case, using the F expression with my SearchVectorField gives about a 3-5x speedup.
Ensuring your SearchQuery has a config kwarg also improves things dramatically.

Django equivalent of SQL not in

I have a very simple query: select * from tbl1 where title not in('asdasd', 'asdasd').
How do I translate that to Django? It's like I want the opposite of: Table.objects.filter(title__in=myListOfTitles)

Try using exclude:
Table.objects.exclude(title__in=myListOfTitles)

(This thread is old, but people may still find it via Google.)
You can use models.Q with "~" as follows:
from django.db.models import Q
Table.objects.filter(~Q(title__in=myListOfTitles))
This method is especially helpful when you have multiple conditions.

Table.objects.exclude(title__in=myListOfTitles)

Django provides two options:
Method 1: exclude(<condition>)
Method 2: filter(~Q(<condition>))
An example of Method 2, using Q():
>>> from django.db.models import Q
>>> queryset = User.objects.filter(~Q(id__lt=5))
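Both forms should end up as a negated IN predicate in the WHERE clause. A minimal sketch of that SQL, run through the standard-library sqlite3 module against a made-up tbl1 table (the rows and the my_list_of_titles values are illustrative, and the exact SQL Django emits may differ in quoting and NULL handling):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO tbl1 (title) VALUES (?)",
    [("alpha",), ("beta",), ("gamma",)],
)

my_list_of_titles = ["alpha", "gamma"]

# One "?" placeholder per title keeps the values parameterized.
placeholders = ", ".join("?" for _ in my_list_of_titles)
sql = f"SELECT title FROM tbl1 WHERE NOT (title IN ({placeholders})) ORDER BY id"

remaining = [row[0] for row in conn.execute(sql, my_list_of_titles)]
print(remaining)  # ['beta']
```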

Django Query Related Field Count

I've got an app where users create pages. I want to run a simple DB query that returns how many users have created more than 2 pages.
This is essentially what I want to do, but of course it's not the right method:
User.objects.select_related('page__gte=2').count()
What am I missing?

You should use aggregates.
from django.db.models import Count
User.objects.annotate(page_count=Count('page')).filter(page_count__gte=2).count()
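In SQL terms, annotating with a count and then filtering on it corresponds to GROUP BY with a HAVING clause, and the final .count() counts the surviving groups. A rough sketch with the standard-library sqlite3 module, using made-up app_user and app_page tables (the names, rows, and exact SQL are assumptions, not what Django necessarily generates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_page (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES app_user(id)
    );
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- ann has 3 pages, bob has 2, cy has none
    INSERT INTO app_page (user_id) VALUES (1), (1), (1), (2), (2);
""")

# Inner query: one row per user whose page count is >= 2.
# Outer query: how many such users there are.
users_with_two_or_more = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT u.id
        FROM app_user u
        LEFT JOIN app_page p ON p.user_id = u.id
        GROUP BY u.id
        HAVING COUNT(p.id) >= 2
    ) AS sub
""").fetchone()[0]

print(users_with_two_or_more)  # 2
```

For strictly "more than 2 pages", the comparison would be > 2 (page_count__gt=2 in the ORM call).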

In my case, I didn't use the final .count() like the other answer, and it also works nicely; it returns the matching users as a queryset rather than a number.
from django.db.models import Count
User.objects.annotate(our_param=Count("all_comments")).filter(our_param__gt=12)

Use the aggregate() function together with the aggregation functions in django.db.models.
This is very useful and does not clash with other annotated aggregate columns.
Note: use aggregate() as the last step of the calculation, because it turns your queryset into a dict.
Below is my code snippet using them:
from django.db.models import Count, Sum
cnt = q.values("person__year_of_birth").filter(person__year_of_birth__lte=year_interval_10)\
    .filter(person__year_of_birth__gt=year_interval_10-10)\
    .annotate(group_cnt=Count("visit_occurrence_id")).aggregate(Sum("group_cnt"))

Making queries using F() and timedelta at django

I have the following model:
class Process(models.Model):
    title = models.CharField(max_length=255)
    date_up = models.DateTimeField(auto_now_add=True)
    days_activation = models.PositiveSmallIntegerField(default=0)
Now I need to query for all Process objects that have expired, according to their value of days_activation.
I tried
from datetime import datetime, timedelta
from django.db.models import F
Process.objects.filter(date_up__lte=datetime.now()-timedelta(days=F('days_activation')))
and received the following error message:
TypeError: unsupported type for timedelta days component: F
I can of course do it in Python:
filter(lambda x: x.date_up <= datetime.now() - timedelta(days=x.days_activation),
       Process.objects.all())
but I really need to produce a django.db.models.query.QuerySet.

7 days == 1 day * 7
F is deep-black Django magic and the objects that encounter it
must belong to the appropriate magical circles to handle it.
In your case, django.db.models.query.filter knows about F, but datetime.timedelta does not.
Therefore, you need to keep the F out of the timedelta argument list.
Fortunately, multiplication of timedelta * int is supported by F,
so the following can work:
Process.objects.filter(date_up__lte=datetime.now()-timedelta(days=1)*F('days_activation'))
As it turns out, this will work with PostgreSQL, but will not work with SQLite (for which Django 1.11 only supports + and - for timedelta,
perhaps because of a corresponding SQLite limitation).
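The two plain-Python facts behind this answer can be checked without Django. In the sketch below, FakeF is a hypothetical stand-in for django.db.models.F (it only holds a column name), used to show that timedelta() rejects such an object immediately, while the multiplication can be moved outside the constructor.

```python
from datetime import timedelta


class FakeF:
    """Hypothetical stand-in for django.db.models.F: holds a column name."""

    def __init__(self, name):
        self.name = name


# 7 days == 1 day * 7: the multiplication can happen outside timedelta().
seven_days_equal = timedelta(days=7) == timedelta(days=1) * 7

# timedelta() validates its arguments as soon as it is called, so an
# object standing for a database column cannot be passed as `days`.
try:
    timedelta(days=FakeF("days_activation"))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(seven_days_equal)  # True
print(error_message)
```

With the real F class, timedelta(days=1) * F('days_activation') works because the expression object handles the multiplication itself and defers it to the database.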

You are mixing two layers: the run-time layer and the database layer. The F function is just a helper which allows you to build slightly more complex queries with the Django ORM. You are using timedelta and F together and expecting the Django ORM to be smart enough to convert these things to raw SQL, but as far as I can see, it can't. Maybe I am wrong and there is something about the Django ORM that I do not know.
Anyway, you can rewrite your ORM call with extra() and build the WHERE clause manually, using native SQL functions that are equivalent to datetime.now() and timedelta.
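As an illustration of what such a hand-written WHERE clause might look like, here is a sketch for SQLite using the standard-library sqlite3 module and a made-up process table. The date functions are backend-specific (this one uses SQLite's datetime() with modifiers), so the clause is an assumption to adapt per database; a string like this is the kind of thing that could be passed to extra(where=[...]).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE process (
        id INTEGER PRIMARY KEY,
        title TEXT,
        date_up TEXT,
        days_activation INTEGER
    );
    -- 'expired' went up 10 days ago and was valid for 3 days;
    -- 'active' went up 1 day ago and is valid for 30 days.
    INSERT INTO process (title, date_up, days_activation) VALUES
        ('expired', datetime('now', '-10 days'), 3),
        ('active',  datetime('now', '-1 days'), 30);
""")

# Build the per-row interval from the column itself, e.g. '-3 days'.
where_clause = "date_up <= datetime('now', '-' || days_activation || ' days')"

expired = [
    row[0]
    for row in conn.execute(f"SELECT title FROM process WHERE {where_clause}")
]
print(expired)  # ['expired']
```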

You have to extend Aggregate. Do it like this:
from django.db import models as DM
class BaseSQL(object):
    # DATE_SUB() and NOW() are MySQL functions, so this targets MySQL.
    function = 'DATE_SUB'
    template = '%(function)s(NOW(), interval %(expressions)s day)'
class DurationAgr(BaseSQL, DM.Aggregate):
    def __init__(self, expression, **extra):
        super(DurationAgr, self).__init__(
            expression,
            output_field=DM.DateTimeField(),
            **extra
        )
Process.objects.filter(date_up__lte=DurationAgr('days_activation'))
Hopefully it will work for you. :)

I tried to use the solution by Lutz Prechelt above, but got a MySQL syntax error.
That's because we can't perform arithmetic operations with INTERVAL in MySQL.
So, for MySQL, my solution is to create a custom DB function:
from django.db.models import DateField, F, Func
class MysqlSubDate(Func):
    function = 'SUBDATE'
    output_field = DateField()
Example of usage:
.annotate(remainded_days=MysqlSubDate('end_datetime', F('days_activation')))
You can also use a timedelta, which will be converted into an INTERVAL:
.annotate(remainded_days=MysqlSubDate('end_datetime', datetime.timedelta(days=10)))
