I have the following models, with three classes: Project, CrawlerProject and CrawlerResults.
class CrawlerProject(models.Model):
    user = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
    cralwer_results_M2M = models.ManyToManyField(CrawlerResults, blank=True)

class Project(models.Model):
    user = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
    crawler_project_M2M = models.ManyToManyField(CrawlerProject, blank=True)
Here, I want to count the total number of CrawlerResults objects across all CrawlerProjects within each individual Project object.
projects = Project.objects.all().prefetch_related('crawler_project_M2M')
for each_proj in projects:
    # Pseudocode: count all the crawler_results objects of all crawler_project objects in the current `project` object.
    total_num_of_crawler_results = each_proj.crawler_project_M2M__cralwer_results_M2M.count()
How can I get the total count across the nested ManyToMany relations efficiently (in a single query)?
Try annotating each Project with a Count that spans both relations:
from django.db.models import Count
projects = Project.objects.annotate(
    crawler_results_count=Count('crawler_project_M2M__cralwer_results_M2M')
)
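To make the single-query idea concrete, here is a minimal sketch of the kind of SQL a count across two many-to-many relations boils down to. It uses the standard-library sqlite3 module rather than Django, and the table and column names (project_cp, cp_results and so on) are hypothetical stand-ins, not the names Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY);
    CREATE TABLE crawlerproject (id INTEGER PRIMARY KEY);
    CREATE TABLE crawlerresults (id INTEGER PRIMARY KEY);
    -- hypothetical link tables standing in for the two M2M relations
    CREATE TABLE project_cp (project_id INTEGER, crawlerproject_id INTEGER);
    CREATE TABLE cp_results (crawlerproject_id INTEGER, crawlerresults_id INTEGER);

    INSERT INTO project VALUES (1), (2);
    INSERT INTO crawlerproject VALUES (10), (11), (12);
    INSERT INTO crawlerresults VALUES (100), (101), (102), (103);
    INSERT INTO project_cp VALUES (1, 10), (1, 11), (2, 12);
    INSERT INTO cp_results VALUES (10, 100), (10, 101), (11, 102), (12, 103);
""")

# One grouped query: join project -> crawler projects -> results, count per project.
counts = conn.execute("""
    SELECT p.id, COUNT(r.crawlerresults_id)
    FROM project p
    LEFT JOIN project_cp pc ON pc.project_id = p.id
    LEFT JOIN cp_results r ON r.crawlerproject_id = pc.crawlerproject_id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()
print(counts)  # [(1, 3), (2, 1)]
```

Note that this counts joined rows, so a result attached to two crawler projects of the same project would be counted twice.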
Related
These are my models:
from django.db import models

class A(models.Model):
    # fields

class B(models.Model):
    a = models.ForeignKey(A)
    # fields
I have some items from model A:
items = A.objects.filter(some_column=some_value)
Now I want 2 model B objects for each object in items: if items holds 5 objects, I want 10 objects from model B in total, 2 per object of model A. I tried some queries, but ended up querying model B once for each model A object.
The solution should also be well optimized; I would like to avoid 20 separate queries for 20 objects in items.
If this is not possible with the ORM, I can use a raw query as well.
You can get those using the reverse relation and prefetch_related, like:
items = A.objects.prefetch_related('b_set').filter(some_column=some_value)
for item in items:
    # Here you get all model B objects for this particular item
    obj_of_model_B = item.b_set.all()  # b_set is the lowercased model name plus _set
You can also override the reverse accessor name using related_name:
class A(models.Model):
    # fields

class B(models.Model):
    a = models.ForeignKey(A, related_name='custom_name')
    # fields
and then use it like:
items = A.objects.prefetch_related('custom_name').filter(some_column=some_value)
for item in items:
    # Here you get all model B objects for this particular item
    obj_of_model_B = item.custom_name.all()
Use prefetch_related. It won't query inside the for loop; there will be only two queries:
a = A.objects.prefetch_related('b_set')
Read about prefetch_related in the docs for more detailed information:
https://docs.djangoproject.com/en/3.0/topics/db/queries/
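The answers above fetch every related B object; they do not cap the result at two per A. Since the question allows a raw query, here is a minimal sketch of one way to express "at most two B rows per A" in a single statement, using the standard-library sqlite3 module. The tables a and b and the choice of "the two lowest ids" are assumptions for illustration; other databases may reject LIMIT inside an IN subquery and need a window function instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 2);
""")

# Keep a row of b only if its id is among the two lowest ids for its a_id.
rows = conn.execute("""
    SELECT b.a_id, b.id
    FROM b
    WHERE b.id IN (
        SELECT b2.id FROM b AS b2
        WHERE b2.a_id = b.a_id
        ORDER BY b2.id
        LIMIT 2
    )
    ORDER BY b.a_id, b.id
""").fetchall()
print(rows)  # [(1, 1), (1, 2), (2, 4), (2, 5)]
```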
We have Project as the main model, which contains two M2M fields.
class First(models.Model):
    first_results_M2M = models.ManyToManyField(First_Results)

class Second(models.Model):
    second_results_M2M = models.ManyToManyField(Second_Results)

class Project(models.Model):
    project_first_M2M = models.ManyToManyField(First)
    project_second_M2M = models.ManyToManyField(Second)
I'm trying to count all the objects present in first_results_M2M across all the project_first_M2M objects within each Project object.
Here is an example that counts all the first_results_M2M objects for the Project object with id 1:
total_first_all = First_Results.objects.filter(first__project__id=1).count()
I want to render total_first_all and total_second_all for each project in the template.
Project_Query = Project.objects.all()
for each_proj in Project_Query:
    print(each_proj.total_first_all)  # should print the count of `first_results_M2M` for each project object
Please let me know how to achieve this in a more efficient/faster way than annotate, i.e. besides:
.annotate(total_first_all=Count('project_first_M2M__first_results_M2M'))
You can .annotate(..) your queryset, like:
from django.db.models import Count
project_query = Project.objects.annotate(
    total_first_all=Count('project_first_M2M__first_results_M2M')
)
for project in project_query:
    print(project.total_first_all)
This will not make a query per Project object, but calculate the counts for all Projects in "bulk".
When you need several such counts at once, you can use subqueries instead, which avoids stacking nested JOINs in a single query:
from django.db.models import Count, OuterRef, Subquery
project_query = Project.objects.annotate(
    total_first_all=Subquery(
        First_Results.objects.filter(
            first__project=OuterRef('pk')
        ).values('first__project').values(cnt=Count('*')).order_by('first__project')
    ),
    total_second_all=Subquery(
        Second_Results.objects.filter(
            second__project=OuterRef('pk')
        ).values('second__project').values(cnt=Count('*')).order_by('second__project')
    )
)
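Why subqueries help when there are several counts can be seen in a small sqlite3 sketch. Joining two independent relations in one query multiplies the rows, so both counts come out inflated, whereas one correlated subquery per count does not. The schema here is deliberately simplified and hypothetical (results hang directly off a project instead of going through M2M link tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- simplified, hypothetical schema: results reference a project directly
    CREATE TABLE project (id INTEGER PRIMARY KEY);
    CREATE TABLE first_result (id INTEGER PRIMARY KEY, project_id INTEGER);
    CREATE TABLE second_result (id INTEGER PRIMARY KEY, project_id INTEGER);
    INSERT INTO project VALUES (1);
    INSERT INTO first_result VALUES (1, 1), (2, 1);
    INSERT INTO second_result VALUES (1, 1), (2, 1), (3, 1);
""")

# Both relations joined at once: 2 x 3 = 6 joined rows for project 1.
joined = conn.execute("""
    SELECT p.id, COUNT(f.id), COUNT(s.id)
    FROM project p
    LEFT JOIN first_result f ON f.project_id = p.id
    LEFT JOIN second_result s ON s.project_id = p.id
    GROUP BY p.id
""").fetchall()

# One correlated subquery per count: each relation is counted on its own.
subqueried = conn.execute("""
    SELECT p.id,
           (SELECT COUNT(*) FROM first_result f WHERE f.project_id = p.id),
           (SELECT COUNT(*) FROM second_result s WHERE s.project_id = p.id)
    FROM project p
""").fetchall()

print(joined)      # [(1, 6, 6)]
print(subqueried)  # [(1, 2, 3)]
```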
I have two models, Project and Session. One project has many sessions, one user has many projects:
class Project(models.Model):
    class Meta:
        ordering = [models.functions.Lower("name")]

    name = models.CharField(max_length=255)
    user = models.ForeignKey(User, on_delete=models.CASCADE)

class Session(models.Model):
    start = models.DateTimeField()
    end = models.DateTimeField()
    timezone = TimeZoneField()
    breaks = models.IntegerField(default=0, validators=[MinValueValidator(0)])
    project = models.ForeignKey(Project, on_delete=models.CASCADE)

    def duration(self):
        # returns minutes in (end - start)
I want a way to get all projects for a given user, sorted by the sum of duration in all its sessions. As session.duration() is not a database field, but rather is calculated from database fields, I cannot get this information in a single database query.
My current solution is:
sessions = Session.objects.filter(project__user=self)
groups = [[a, sum([s.duration() for s in b])] for a, b in groupby(
sessions, key=lambda s: s.project
)]
groups = sorted(groups, key=lambda g: g[1], reverse=True)
return [g[0] for g in groups]
This gets all relevant sessions in a single query, but then I group them by project and this takes too long: about a second when there are about 100 projects. Is there a way to accomplish this that takes less time, and ideally doesn't require a database call for every project?
I am using Django 2.0.
You can use annotations and aggregation to achieve this. First, modify the Session model a bit by changing this line:
project = models.ForeignKey(Project, on_delete=models.CASCADE)
to this:
project = models.ForeignKey(Project, related_name='sessions', on_delete=models.CASCADE)
Now every Project instance will have a sessions attribute, a manager that gives access to the queryset of all Sessions related to that Project.
Instead of taking all user sessions like you do now, you can take all of the user's Projects and loop through each project's Sessions like:
projects = Project.objects.filter(user=self)
for p in projects:
    sessions = p.sessions.all()
Then you can annotate the sessions queryset with a computed expression, like:
from django.db.models import ExpressionWrapper, F, fields
duration_ = ExpressionWrapper(F('end') - F('start'), output_field=fields.DurationField())
sessions = p.sessions.annotate(d=duration_)
At this point each member of the sessions queryset will have a field named d holding the duration of the corresponding Session.
To sum the durations, we can use the aggregation feature of Django querysets, like this:
from django.db.models import Sum
total = sessions.aggregate(total_duration=Sum('d'))["total_duration"]
What we're doing on the 2nd line is creating a single element from a queryset ("aggregating" it), by adding all the values in the d field, and assigning the result to a field called total_duration. The result of this expression:
sessions.aggregate(total_duration=Sum('d'))
is a dict with only one key (total_duration), from which we take the value.
Next, you can build a list of projects and durations, and sort it afterwards by duration, e.g. like this:
import operator
from datetime import timedelta

plist = []
for p in projects:
    sessions = p.sessions.annotate(d=duration_)
    # total holds the sum of this project's sessions (None if it has no sessions)
    total = sessions.aggregate(total_duration=Sum('d'))["total_duration"]
    plist.append({'p': p, 'total': total or timedelta(0)})
# Longest total duration first, matching reverse=True in the question
plist.sort(key=operator.itemgetter('total'), reverse=True)
projects = [item['p'] for item in plist]
To sum it up:
import operator
from datetime import timedelta
from django.db.models import F, Sum, ExpressionWrapper, fields

duration_ = ExpressionWrapper(F('end') - F('start'), output_field=fields.DurationField())
projects = Project.objects.filter(user=self)
plist = []
for p in projects:
    sessions = p.sessions.annotate(d=duration_)
    # total holds the sum of this project's sessions (None if it has no sessions)
    total = sessions.aggregate(total_duration=Sum('d'))["total_duration"]
    plist.append({'p': p, 'total': total or timedelta(0)})
# Longest total duration first, matching reverse=True in the question
plist.sort(key=operator.itemgetter('total'), reverse=True)
projects = [item['p'] for item in plist]
References: the Django documentation on Query Expressions and on Aggregation.
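The loop above still runs one aggregate query per project. As a sketch of what a single grouped query could look like at the SQL level, the sqlite3 example below sums end minus start per project and sorts by that total in one statement. The column names start_ts and end_ts, and storing times as integer minutes, are assumptions made for illustration; in the ORM this would presumably correspond to annotating Project with a Sum over the sessions relation and ordering by it, which is not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    -- start_ts / end_ts are hypothetical integer columns holding minutes
    CREATE TABLE session (id INTEGER PRIMARY KEY, project_id INTEGER,
                          start_ts INTEGER, end_ts INTEGER);
    INSERT INTO project VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO session VALUES (1, 1, 0, 30), (2, 1, 100, 110),
                               (3, 2, 0, 90);
""")

# One query: total duration per project, longest first; projects with no
# sessions get 0 instead of NULL thanks to COALESCE.
ranked = conn.execute("""
    SELECT p.name, COALESCE(SUM(s.end_ts - s.start_ts), 0) AS total
    FROM project p
    LEFT JOIN session s ON s.project_id = p.id
    GROUP BY p.id, p.name
    ORDER BY total DESC
""").fetchall()
print(ranked)  # [('beta', 90), ('alpha', 40), ('gamma', 0)]
```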
These are my models
class Order(models.Model):
    name = ...

class OrderDetail(models.Model):
    order = models.OneToOneField(Order, null=False)
    comment = ...

class LastUpdate(models.Model):
    order = models.OneToOneField(Order, null=False)
    date = ...
When I write Order.objects.all().values() it gives me a list of dicts that only contain name.
But I need to get the name, orderdetail__comment and lastupdate__date values.
I can get them by writing
Order.objects.values('name','orderdetail__comment','lastupdate__date').all()
but there are a lot of models related to Order and I don't want to list all of them.
How can I get all the values of the related fields?
First, query with select_related, which takes the relation names (not the names of fields on the related models):
orders = Order.objects.select_related('orderdetail', 'lastupdate')
then get the values with:
orders.values('name', 'orderdetail__comment', 'lastupdate__date')
I have models:
class Z(models.Model):
    name = ...

class B(models.Model):
    something = models...
    other = models.ForeignKey(Z)

class A(models.Model):
    date = models.DateTimeField()
    objs_b = models.ManyToManyField(B)

    def get_obj_b(self, z_id):
        self.obj_b = self.objs_b.get(other=z_id)
and query:
qs = A.objects.filter(...)
but if I want to get the B object related to each A, I must call get_obj_b:
for item in qs:
    item.get_obj_b(my_known_z_id)
This generates many queries. How can I do it more simply? I cannot change the models, and generally I must use the filter function (not my own manager).
If you are using Django 1.4 or later, I would suggest that you use prefetch_related like this:
A.objects.all().prefetch_related('objs_b__other')
This reduces the number of queries to three regardless of how many A objects there are: one for model A, one for 'objs_b', and one for 'other'.
And you can combine it with a filter suggested by pastylegs:
A.objects.filter(objs_b__other__id=z_id).prefetch_related('objs_b__other')
For details see: https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related
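What prefetching does behind the scenes can be sketched with the standard-library sqlite3 module: fetch the parents once, fetch all of their related rows in one IN query, then match them up in Python with no further queries. The link table a_objs_b, the sample rows and the z_id value are hypothetical, and this is only an illustration of the idea, not Django's actual implementation.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, other_id INTEGER);
    -- hypothetical link table standing in for the objs_b M2M relation
    CREATE TABLE a_objs_b (a_id INTEGER, b_id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (10, 7), (11, 8), (12, 7);
    INSERT INTO a_objs_b VALUES (1, 10), (1, 11), (2, 12);
""")

# Query 1: the parent rows.
a_ids = [row[0] for row in conn.execute("SELECT id FROM a ORDER BY id")]

# Query 2: every related b row for all parents at once.
placeholders = ", ".join("?" for _ in a_ids)
related = conn.execute(
    f"SELECT l.a_id, b.id, b.other_id FROM a_objs_b l "
    f"JOIN b ON b.id = l.b_id WHERE l.a_id IN ({placeholders}) "
    f"ORDER BY l.a_id, b.id",
    a_ids,
).fetchall()

# Stitch the children onto their parents in Python, with no further queries.
by_parent = defaultdict(list)
for a_id, b_id, other_id in related:
    by_parent[a_id].append((b_id, other_id))

z_id = 7  # hypothetical known Z id
picked = {a_id: [b for b in by_parent[a_id] if b[1] == z_id] for a_id in a_ids}
print(picked)  # {1: [(10, 7)], 2: [(12, 7)], 3: []}
```

Filtering the already-fetched rows in Python, as the last step does, is what keeps the per-item lookup from issuing a new query each time.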