Complex Query for Django ORM - django

I'm trying to execute a complex query using Django's ORM and I can't seem to find a nice solution. Namely, I have a web application where users answer questions based on a video. I need to display all the videos for a specified user that have at least one question unanswered (not responded to). I haven't been able to figure it out yet with the ORM ... I know that I could probably write a SQL query for this and just execute it with the raw SQL function, but I really would prefer to stay in the ORM.
Models: Video, Question, Response and default User.
Relationships:
Question has a many to many relation towards video
Response has a foreign key each to Question, Video and User
What the query needs to do:
Display all the videos for a specified user that have at least one video question unanswered (not responded to).
Any help would be awesome! I've been struggling with this for way too long.
EDIT: The models I have are (simplified):
class Video(TimeStampedModel):
    title = models.CharField(max_length=200)
    source_id = models.CharField(max_length=20)

class Question(TimeStampedModel):
    DEMOGRAPHIC_QUESTION = 'd'
    QUESTION_TYPES = (
        (VIDEO_QUESTION, 'Video related question'),
        (DEMOGRAPHIC_QUESTION, 'Demographic question'),
    )
    MULTIPLE_CHOICE = 0
    PLAIN_TEXT = 1
    RESPONSE_TYPE = (
        (MULTIPLE_CHOICE, 'Multiple Choice'),
        (PLAIN_TEXT, 'Plain Text')
    )
    type = models.CharField(max_length=1, choices=QUESTION_TYPES)
    videos = models.ManyToManyField(Video, null=True, blank=True)
    title = models.CharField(max_length=500)
    priority = models.IntegerField()

class Response(TimeStampedModel):
    user = models.ForeignKey(User)
    question = models.ForeignKey(Question)
    video = models.ForeignKey(Video, blank=True, null=True)
    choice = models.ForeignKey(Choice, null=True, blank=True, related_name='selected_choice')
    text = models.CharField(max_length=500, blank=True)

# Not relevant but included for clarity
class Choice(TimeStampedModel):
    question = models.ForeignKey(Question)
    text_response = models.CharField(max_length=500)
    image = models.FileField(upload_to=_get_choice_img_path, blank=True)
    value = models.IntegerField(default=0)
    external_id = models.IntegerField(default=0)

Judging by how your models are laid out, I think something close to the following should work.
q = Response.objects.select_related().filter(user__username=user).filter(choice=None)
videos = Video.objects.filter(id__in=q.extra(where=["{}>=1".format(q.count())]).values('video_id'))
The first line joins the related models and keeps the given user's responses that have no choice selected (here user is assumed to be a username; the default User model has username, not name). The second line takes the query generated in the first line, checks that its count is at least 1, and gets the Videos that belong to that query.
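For reference, the condition stated in the question (a video has at least one attached question that this user has not answered for that video) has the shape of a NOT EXISTS subquery in SQL. The sketch below runs that query shape on an in-memory SQLite database. The table and column names are simplified stand-ins chosen for illustration, not the tables Django would generate for the models above, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT);
    -- stand-in for the Question.videos many-to-many table
    CREATE TABLE question_videos (question_id INTEGER, video_id INTEGER);
    CREATE TABLE response (user_id INTEGER, question_id INTEGER, video_id INTEGER);

    INSERT INTO video VALUES (1, 'intro'), (2, 'advanced');
    INSERT INTO question VALUES (10, 'q1'), (11, 'q2');
    INSERT INTO question_videos VALUES (10, 1), (11, 1), (10, 2);
    -- user 7 answered both questions on video 1 and nothing on video 2
    INSERT INTO response VALUES (7, 10, 1), (7, 11, 1);
""")


def videos_with_unanswered(conn, user_id):
    """Return titles of videos with at least one question the user has not answered."""
    rows = conn.execute("""
        SELECT DISTINCT v.title
        FROM video v
        JOIN question_videos qv ON qv.video_id = v.id
        WHERE NOT EXISTS (
            SELECT 1 FROM response r
            WHERE r.user_id = ?
              AND r.question_id = qv.question_id
              AND r.video_id = v.id
        )
        ORDER BY v.title
    """, (user_id,)).fetchall()
    return [title for (title,) in rows]


print(videos_with_unanswered(conn, 7))  # user who answered everything on video 1
print(videos_with_unanswered(conn, 8))  # user who answered nothing
```

Within the ORM, this kind of correlated subquery is usually written with the Exists and OuterRef expressions from django.db.models rather than with extra().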

Related

Speeding up a Django query

My company has a pretty complicated web of Django models that I don't deal too much with. But sometimes I need to do queries on it. One that I'm doing right now is taking an inconveniently long time. So because I'm not the best at understanding how to use annotations effectively and I don't really understand subqueries at all (and probably some other key Django stuff) I was hoping someone here could help figure out how to do a better job at getting this result quicker.
Here's a facsimile of the relevant models in our database.
class Company(models.Model):
    name = models.CharField(max_length=255)

    @property
    def active_humans(self):
        if hasattr(self, '_active_humans'):
            return self._active_humans
        else:
            self._active_humans = Human.objects.filter(active=True, departments__company=self).distinct()
            return self._active_humans

class Department(models.Model):
    name = models.CharField(max_length=225)
    company = models.ForeignKey(
        'muh_project.Company',
        related_name="departments",
        on_delete=models.PROTECT
    )
    humans = models.ManyToManyField('muh_project.Human', through='muh_project.Job', related_name='departments')

class Job(models.Model):
    name = models.CharField(max_length=225)
    department = models.ForeignKey(
        'muh_project.Department',
        on_delete=models.PROTECT
    )
    human = models.ForeignKey(
        'muh_project.Human',
        on_delete=models.PROTECT
    )

class Human(models.Model):
    active = models.BooleanField(default=True)

    @property
    def fixed_happy_dogs(self):
        # "dogs" is the related_name declared on Dog.human
        return self.dogs.filter(is_neutered_spayed=True, disposition="happy")

class Dog(models.Model):
    is_neutered_spayed = models.BooleanField(default=True)
    disposition = models.CharField(max_length=225)
    age = models.IntegerField()
    human = models.ForeignKey(
        'muh_project.Human',
        related_name="dogs",
        on_delete=models.PROTECT
    )
    human_job = models.ForeignKey(
        'muh_project.Job',
        blank=True,
        null=True,
        on_delete=models.PROTECT
    )
What I'm trying to do (in the language of this silly toy example) is to get the number of humans with at least one of a certain type of dog for each of some companies. So what I'm doing is running this.
rows = []
company_type = "Tech"
fixed_happy_dogs = Dog.objects.filter(is_neutered_spayed=True, disposition="happy")
old_dogs = fixed_happy_dogs.filter(age__gte=7)
companies = Company.objects.filter(name__icontains=company_type)
for company in companies.order_by('id'):
    humans = company.active_humans
    num_humans = humans.distinct().count()
    humans_with_fixed_happy_dogs = humans.filter(dogs__in=fixed_happy_dogs).distinct().count()
    humans_with_old_dogs = humans.filter(dogs__in=old_dogs).distinct().count()
    rows.append(f'{company.id};{num_humans};{humans_with_fixed_happy_dogs};{humans_with_old_dogs}')
It generally takes anywhere from 45 - 120 seconds to run depending on how many companies I run it over. I'd like to cut that down. I do need the final result as a list of strings as shown.
One low-hanging fruit would be to add a database index on the Dog.disposition column (db_index=True on the field), since it is used in the .filter() call. Without an index, the database probably has to do a sequential scan over the table on every pass through the for loop.
For this task specifically, I'd recommend Django Debug Toolbar, which shows every SQL query that is executed. That helps you pinpoint the slowest ones, and you can then run EXPLAIN on them to see what goes wrong.
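To make the index suggestion concrete, here is a minimal sketch that asks a database for its query plan before and after adding an index on disposition. It uses an in-memory SQLite table as a stand-in (the table layout and index name are assumptions for illustration); the production database and its EXPLAIN output will look different, but the scan-versus-index distinction is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dog (id INTEGER PRIMARY KEY, disposition TEXT, "
    "is_neutered_spayed INTEGER, age INTEGER)"
)

QUERY = "SELECT id FROM dog WHERE disposition = 'happy'"


def query_plan(conn, sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    # The last column of each EXPLAIN QUERY PLAN row describes one step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = query_plan(conn, QUERY)
conn.execute("CREATE INDEX dog_disposition_idx ON dog (disposition)")
plan_after = query_plan(conn, QUERY)

print("before:", plan_before)  # a full scan of the table
print("after: ", plan_after)   # a search that uses dog_disposition_idx
```

In a Django project the equivalent check can be made from the ORM with QuerySet.explain(), or by copying the SQL shown in Django Debug Toolbar into the database shell.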

Using ForeignKey to sort with order_by and distinct not working

I'm trying to sort model Game by each title and most recent update(post) without returning duplicates.
views.py
'recent_games': Game.objects.all().order_by('title', '-update__date_published').distinct('title')[:5],
The distinct method on the query works perfectly however the update__date_published doesn't seem to be working.
models.py
Model - Game
class Game(models.Model):
    title = models.CharField(max_length=100)
    slug = models.SlugField(unique=True)
    description = models.TextField()
    date_published = models.DateTimeField(default=timezone.now)
    cover = models.ImageField(upload_to='game_covers')
    cover_display = models.ImageField(default='default.png', upload_to='game_displays')
    developer = models.CharField(max_length=100)
    twitter = models.CharField(max_length=50, default='')
    reddit = models.CharField(max_length=50, default='')
    platform = models.ManyToManyField(Platform)

    def __str__(self):
        return self.title
Model - Update
class Update(models.Model):
    author = models.ForeignKey(User, models.SET_NULL, blank=True, null=True,)  # If user is deleted keep all updates by said user
    article_title = models.CharField(max_length=100, help_text="Use format: Release Notes for MM/DD/YYYY")
    content = models.TextField(help_text="Try to stick with a central theme for your game. Bullet points is the preferred method of posting updates.")
    date_published = models.DateTimeField(db_index=True, default=timezone.now, help_text="Use date of update not current time")
    game = models.ForeignKey(Game, on_delete=models.CASCADE)
    article_image = models.ImageField(default='/media/default.png', upload_to='article_pics', help_text="")
    platform = ChainedManyToManyField(
        Platform,
        horizontal=True,
        chained_field="game",
        chained_model_field="game",
        help_text="You must select a game first to autopopulate this field. You can select multiple platforms using Ctrl & Select (PC) or ⌘ & Select (Mac).")
See the Django documentation on distinct() for reference; note that the examples after the first one only work on PostgreSQL.
See also the documentation on reverse lookups across relationships, which is what update__date_published relies on.
Example -
Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
Your Query-
Game.objects.order_by('title', '-update__date_published').distinct('title')[:5]
You said:
The -update__date_published does not seem to be working as the Games are only returning in alphabetical order.
The reason is that the first order_by field is title; the secondary order field -update__date_published would only kick in if you had several identical titles, which you don't because of distinct().
If you want the Game objects to be ordered by latest update rather than by title, omitting title from the ordering seems the obvious solution, until you get a ProgrammingError saying that the DISTINCT ON field must appear at the start of the ORDER BY clause.
The real solution to sorting games by latest update is to annotate each game with its latest update date and order by that annotation:
from django.db.models import Max

games = (Game.objects
         .annotate(max_date=Max('update__date_published'))
         .order_by('-max_date'))[:5]
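The difference between the two orderings can be seen without a database. The sketch below is plain Python over made-up (title, update date) pairs standing in for the joined Game and Update rows; the function names are hypothetical. The first function mimics ordering by title and then keeping one row per title, the second mimics computing each game's latest update and ordering on that.

```python
from datetime import date

# Hypothetical (game title, update date) pairs standing in for the joined rows.
updates = [
    ("Asteroids", date(2019, 1, 5)),
    ("Zelda", date(2019, 3, 1)),
    ("Asteroids", date(2019, 2, 1)),
    ("Myst", date(2019, 2, 20)),
]


def distinct_on_title(rows):
    """Mimic ordering by title, then newest update, keeping one row per title."""
    ordered = sorted(rows, key=lambda r: (r[0], -r[1].toordinal()))
    seen, result = set(), []
    for title, published in ordered:
        if title not in seen:  # keep only the first (newest) row per title
            seen.add(title)
            result.append((title, published))
    return result


def by_latest_update(rows):
    """Mimic annotating each game with its max update date and sorting on it."""
    latest = {}
    for title, published in rows:
        latest[title] = max(published, latest.get(title, published))
    return sorted(latest.items(), key=lambda item: item[1], reverse=True)


print([title for title, _ in distinct_on_title(updates)])  # alphabetical
print([title for title, _ in by_latest_update(updates)])   # most recently updated first
```

In the first function the date only decides which row survives for each title; it never changes the order of the titles themselves, which is why the original query came back alphabetically.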
The most probable misunderstanding here is the join in your ORM query. Related objects are usually lazy-loaded, so the date_published field is not yet available when you try to sort on it. You need the select_related method to load the FK relation as a join.
'recent_games': Game.objects.select_related('update').all().order_by('title', '-update__date_published').distinct('title')[:5]

Django Query: How to order posts by amount of upvotes?

I'm currently working on a website (with Django), where people can write a story, which can be upvoted by themselves or by other people. Here are the classes for Profile, Story and Upvote:
class Profile(AbstractBaseUser, PermissionsMixin):
    email = models.EmailField(unique=True)
    first_name = models.CharField(max_length=30, null=True)
    last_name = models.CharField(max_length=30, null=True)

class Story(models.Model):
    author = models.ForeignKey('accounts.Profile', on_delete=models.CASCADE, related_name="author")
    title = models.CharField(max_length=50)
    content = models.TextField(max_length=10000)

class Upvote(models.Model):
    profile = models.ForeignKey('accounts.Profile', on_delete=models.CASCADE, related_name="upvoter")
    story = models.ForeignKey('Story', on_delete=models.CASCADE, related_name="upvoted_story")
    upvote_time = models.DateTimeField(auto_now=True)
As you can see, Upvote uses two foreign keys to store the upvoter and the related story. Now I want to make a query which gives me all the stories, sorted by the amount of upvotes they have. I've tried my best to come up with some queries myself, but it's not exactly what I'm searching for.
This one doesn't work at all, since it just gives me all the stories in the order they were created, for some reason. Also it contains duplicates, although I want them to be grouped by story.
hot_feed = Upvote.objects.annotate(upvote_count=Count('story')).order_by('-upvote_count')
This one kind of works. But if I try to access a particular story in my template, it just gives me back the id. So I'm not able to fetch the title, author and content from that id, since it's just an integer and not an object.
hot_feed = Upvote.objects.values('story').annotate(upvote_count=Count('story')).order_by('-upvote_count')
Could someone help me out with finding the query I'm searching for?
You are querying from the wrong model: here you basically fetch Upvotes ordered by the number of stories, or something similar.
But you probably want to retrieve Story objects ordered by their number of upvotes, so you need to use Story as the "queryset root" and annotate it with the number of upvotes:
from django.db.models import Count

Story.objects.annotate(
    upvote_count=Count('upvoted_story')
).order_by('-upvote_count')
I think the related_name of your story is however a bit "misleading". The related_name is the name of the relation "in reverse", so probably a better name is upvotes:
class Upvote(models.Model):
    profile = models.ForeignKey(
        'accounts.Profile',
        on_delete=models.CASCADE,
        related_name='upvotes'
    )
    story = models.ForeignKey(
        'Story',
        on_delete=models.CASCADE,
        related_name='upvotes'
    )
    upvote_time = models.DateTimeField(auto_now=True)
In that case the query is:
Story.objects.annotate(
    upvote_count=Count('upvotes')
).order_by('-upvote_count')
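For readers who think in SQL, annotating Story with a Count over the reverse relation corresponds to a LEFT JOIN from stories to upvotes with a GROUP BY on the story. The sketch below shows that shape on an in-memory SQLite database; the table names, columns and sample rows are simplified assumptions for illustration, not the tables Django would create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE story (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE upvote (id INTEGER PRIMARY KEY, profile_id INTEGER, story_id INTEGER);

    INSERT INTO story VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO upvote (profile_id, story_id) VALUES (1, 2), (2, 2), (1, 3);
""")


def stories_by_upvotes(conn):
    """Return (title, upvote_count) pairs, most upvoted first."""
    return conn.execute("""
        SELECT s.title, COUNT(u.id) AS upvote_count
        FROM story s
        LEFT JOIN upvote u ON u.story_id = s.id
        GROUP BY s.id, s.title
        ORDER BY upvote_count DESC, s.id
    """).fetchall()


print(stories_by_upvotes(conn))
```

Because the query starts from the story table, each row is a whole story (so title, author and content stay available), and stories without any upvotes still appear with a count of 0.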

Is there any possible solution for getting more than one value inside function in django?

I am creating a blog application using Django and I am also very much new to django.
This is the models I created
class categories(models.Model):
    Title = models.CharField(max_length=40, default='GST')

class Blog(models.Model):
    User = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, null=True, blank=True)
    Date = models.DateTimeField(default=datetime.now)
    Blog_title = models.CharField(max_length=255)
    likes = models.ManyToManyField(settings.AUTH_USER_MODEL, related_name='likes', blank=True)
    Description = RichTextUploadingField(blank=True, null=True, config_name='special')
    Blog_image = models.ImageField(upload_to='blog_image', null=True, blank=True)
    Category = models.ForeignKey(categories, on_delete=models.CASCADE, related_name='blogs')
I was wondering how to count the total number of blogs under a particular category.
I want to track the count for every category.
I did something like this in my model:
def categories_count(self):
    for a in categories.objects.all():
        categories_count = Blog.objects.filter(Category__Title=a.Title).count()
    return categories_count
But it returns only one value. Can anyone suggest some suitable code to resolve this?
Thank you
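You can get a list of tuples of category title and blog count with the following query. The reverse relation is named blogs, the related_name set on Blog.Category:
from django.db.models import Count

categories.objects.annotate(blog_count=Count('blogs')).values_list('Title', 'blog_count')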
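As for why the method in the question yields a single value: a function returns once, and the loop variable is overwritten on every pass, so at most one category's count can come back. The plain-Python sketch below reproduces that with made-up data (the lists and function names are hypothetical, and it assumes the return sits after the loop) and contrasts it with collecting one count per category.

```python
# Hypothetical (blog title, category title) pairs standing in for Blog rows.
blogs = [
    ("Filing returns", "GST"),
    ("Input credit", "GST"),
    ("Salary slabs", "Income Tax"),
]
category_titles = ["GST", "Income Tax", "Customs"]


def last_count_only(titles, rows):
    """Shape of the original method: the variable is overwritten on each pass."""
    count = 0
    for title in titles:
        count = sum(1 for _, category in rows if category == title)
    return count  # only the count for the final category survives


def count_per_category(titles, rows):
    """Collect one count per category instead of overwriting a single value."""
    return {
        title: sum(1 for _, category in rows if category == title)
        for title in titles
    }


print(last_count_only(category_titles, blogs))
print(count_per_category(category_titles, blogs))
```

The annotate query does the same collection in one database round trip, instead of one count query per category.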

django cms plugin that display model with specific value checked

I made a model that displays articles and when you create an article you have the possibility to choose if this article will be a featured one.
So this is basically what I have in my Article model :
class Article(ModelMeta, TranslatableModel):
    taints_cache = True
    """
    Press article element,
    """
    date_created = models.DateTimeField(auto_now_add=True)
    date_modified = models.DateTimeField(auto_now=True)
    date_realization = models.DateField(_('Realised in'),
                                        default=timezone.now)
    image = FilerImageField(verbose_name=_('Featured image'), blank=True,
                            null=True,
                            on_delete=models.SET_NULL,
                            related_name='image_press_article',
                            help_text=_('Set if the article will be featured'))
    sources = models.ManyToManyField(ArticleSource, verbose_name=_('Source'),
                                     blank=False, null=True, related_name='sources_press_article')
    regions = models.ManyToManyField(Country, verbose_name=_('Country of the article'),
                                     blank=True, null=True,
                                     related_name='regions_press_article')
    global_regions = models.BooleanField('Global', default=True)
    featureArticle = models.BooleanField(_('Feature'), help_text=_('Feature this article'), default=False)
Then I created a plugin that displays the featured articles.
In the Django plugin admin, I let the user choose which articles to display (up to a maximum of 3).
But this selection list shows all of my articles.
What I want is for the plugin admin to list only the articles that are marked as "featured", instead of all of them.
Here is what I have in my cms_plugin's model:
class FeaturedArticlePlugin(CMSPlugin):
    selected_article = SortedManyToManyField(Article, blank=True, verbose_name=_('Selected articles'),
                                             help_text=_('Select the featured articles to display'))

    def __str__(self):
        return u'%s Selected articles' % self.selected_article.all()

    def copy_relations(self, oldinstance):
        self.selected_article = oldinstance.selected_article.all()
And in my cms_plugins.py :
class PressPlugin(CMSPluginBase):
    module = 'Press'

class PressFeaturedArticlePlugin(PressPlugin):
    module = _('Press')
    name = _('Press feature')
    model = FeaturedArticlePlugin
    render_template = 'djangocms_press/plugins/feature_article.html'
    number_article = 3

    def render(self, context, instance, placeholder):
        """
        Get a list of selected_articles
        """
        selected_article = instance.selected_article.all()
        number_selected_article = selected_article.count()
        feature_article_list = list(selected_article[:self.number_article])
        context['instance'] = instance
        context['feature_article_list'] = feature_article_list
        return context

plugin_pool.register_plugin(PressFeaturedArticlePlugin)
So, I am sure it's nothing complicated, but I can't figure it out.
Does anyone have a clue?
EDIT
From what I understand, the only thing that controls the display of all articles is this line:
selected_article = SortedManyToManyField(Article, blank=True, verbose_name=_('Selected articles'),
                                         help_text=_('Select the featured articles to display'))
So what I am supposed to do is filter this selected_article on featureArticle=True. But how do I do that?
I am not quite sure if I am missing something, but couldn't you just apply a filter here?
selected_article = instance.selected_article.filter(featureArticle=True)
number_selected_article = selected_article.count()
Or is the problem with the lines after?
feature_article_list = list(selected_article[:self.number_article])
If your problem is selecting the extra articles, maybe you need to order them by date and take only as many as are still needed?
feature_article_list = list(Article.objects.all().order_by('-date_created')[:self.number_article - number_selected_article])
That will select only the extra articles needed.
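The two ideas above (offer only featured articles, and top up the selection with the newest articles until the limit is reached) can be sketched in plain Python. The records, field order and function names below are made up for illustration and stand in for the Article queryset; this is not django CMS API.

```python
from datetime import date

MAX_ARTICLES = 3  # mirrors number_article in the plugin

# Hypothetical article records: (title, is_featured, date_created).
articles = [
    ("A", True, date(2020, 1, 1)),
    ("B", False, date(2020, 3, 1)),
    ("C", False, date(2020, 2, 1)),
    ("D", False, date(2020, 4, 1)),
]
selected = ["A"]  # what the editor picked in the plugin admin


def featured_only(records):
    """Titles a choice list restricted to featured articles would offer."""
    return [title for title, is_featured, _ in records if is_featured]


def fill_up(selected_titles, records, limit=MAX_ARTICLES):
    """Keep the selected titles, then top up with the newest other articles."""
    chosen = list(selected_titles[:limit])
    newest_first = sorted(records, key=lambda r: r[2], reverse=True)
    for title, _, _ in newest_first:
        if len(chosen) >= limit:
            break
        if title not in chosen:
            chosen.append(title)
    return chosen


print(featured_only(articles))
print(fill_up(selected, articles))
```

Note that the top-up step skips titles that are already selected, which the slice in the answer above does not do on its own.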
Edit: Your situation kind of reminds me of a problem I once had. So I'll refer you to the same page that helped me in the past just in case you'd manage to figure it out.
Restrict django admin change permissions
Edit 2 : "I created a plugin that displays the featured articles. But the thing is, in the django plugin admin I let the user the possibility to choose which article he wants to display (with a maximum of 3). But in this choosing list, all my articles are listed."
Isn't it ok if all the articles are displayed there? How can you choose among them if they are not all displayed?