Django - joining multiple tables (models) and filtering out based on their attribute - django

I'm new to Django and to ORMs in general, so I'm having trouble coming up with a query that joins multiple tables.
I have 4 models that need joining: Category, SubCategory, Product and Packaging. Example values would be:
Category: 'male'
SubCategory: 'shoes'
Product: 'nikeXYZ'
Packaging: 'size_36: 1'
Each model has a FK to the model above it (i.e. SubCategory has a category field, etc.).
My question is: how can I filter Product by a given Category (e.g. male) and only show products that have a Packaging whose available attribute is set to True? Obviously I want to minimise the hits on my database (ideally do it with 1 SQL query).
I could do something along these lines:
available = Product.objects.filter(packaging__available=True)
subcategories = SubCategory.objects.filter(category_id=<id_of_male>)
products = available.filter(subcategory_id__in=subcategories)
but I think that requires at least 2 hits on the database (available, subcategories). Is there a way to do it in one go?

Try this:
lookup = {'packaging__available': True, 'subcategory__category_id__in': ['ids of males']}
product_objs = Product.objects.filter(**lookup)

You can follow FKs across models with double underscores (__) in a lookup, use the reverse _set manager on an instance, or build a list of ids first and filter with __in.

I think this should work but it's not tested:
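Product.objects.filter(packaging__available=True, subcategory__category_id__in=[id_of_male])
It isn't tested, but note how the lookup names are chosen: a forward FK is addressed by its field name (subcategory here), while a reverse relation is addressed by its related_name or, if you didn't set one, by the lowercase model name (packaging here). The _set suffix is only used for attribute access on an instance (e.g. product.packaging_set), not inside filter lookups.
subcategory__category_id__in=[id_of_male] can probably be simplified to subcategory__category_id=id_of_male.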
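To see what a one-query version boils down to at the SQL level, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are made up for illustration and are not what Django would generate for your app. It shows the kind of join a single filter spanning both relations produces. Joining against packaging can repeat a product once per matching packaging row, which is why the sketch uses DISTINCT (in the ORM, the corresponding tool is .distinct()).

```python
import sqlite3

# Hypothetical schema mirroring Category -> SubCategory -> Product -> Packaging.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subcategory (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, subcategory_id INTEGER);
CREATE TABLE packaging (id INTEGER PRIMARY KEY, label TEXT, available INTEGER, product_id INTEGER);
INSERT INTO category VALUES (1, 'male'), (2, 'female');
INSERT INTO subcategory VALUES (1, 'shoes', 1), (2, 'shoes', 2);
INSERT INTO product VALUES (1, 'nikeXYZ', 1), (2, 'nikeABC', 1), (3, 'nikeW', 2);
INSERT INTO packaging VALUES
    (1, 'size_36', 1, 1), (2, 'size_37', 1, 1),
    (3, 'size_36', 0, 2),
    (4, 'size_36', 1, 3);
""")
# nikeXYZ has two available packagings, nikeABC has none,
# nikeW has one but belongs to the other category.


def available_products(category_id):
    """Names of products in the category with at least one available packaging."""
    rows = conn.execute(
        """
        SELECT DISTINCT product.name
        FROM product
        JOIN subcategory ON product.subcategory_id = subcategory.id
        JOIN packaging ON packaging.product_id = product.id
        WHERE subcategory.category_id = ? AND packaging.available = 1
        ORDER BY product.name
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(available_products(1))  # prints ['nikeXYZ']
```

Without DISTINCT, nikeXYZ would come back twice here, once per available packaging.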

Related

Get models in Django that have all of the values in ManyToMany field (AND-query, no reverse lookups allowed)

I have such a model in Django:
class VariantTag(models.Model):
    saved_variants = models.ManyToManyField('SavedVariant')
I need to get all VariantTag objects whose saved_variants ManyToMany field contains exactly the given ids, say (250, 251), no more, no less. Because of the nature of the code I am dealing with, I cannot do a reverse lookup with _set. So I am looking for a query (or several queries plus additional Python filtering) that gets me there, but in this form:
query = Q(...)
tag_queryset = VariantTag.objects.filter(query)
How can this be achieved?
I should probably stress that the supplied saved variants (e.g. (250, 251)) should be AND-ed, not OR-ed.
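Use the in lookup (note that it matches tags containing any of the listed ids, so on its own it gives the OR result):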
tag_queryset = VariantTag.objects.filter(saved_variants__in=[250,251])
So far I was able to achieve AND result by the following code:
tag_ids = VariantTag.objects.filter(
    variant_tag_type__name=tag_data['tag'],
    saved_variants__in=saved_variant_ids,
).values_list('id', flat=True).distinct()
for tag_id in tag_ids:
    saved_variants = list(VariantTag.objects.get(id=tag_id).saved_variants.all().values_list('id', flat=True))
    if all(s in saved_variant_ids for s in saved_variants) and len(saved_variants) == len(saved_variant_ids):
        return VariantTag.objects.get(id=tag_id)
So, I am doing the following:
Getting the OR - result
Iterating over the resulting ids of the retrieved model and for each one of them getting all of the ids of the ManyToMany field
Checking if all of the obtained ids of the ManyToMany field are in the required ids list (saved_variant_ids)
If yes - get the model by the id: VariantTag.objects.get(id=tag_id)
In my case there will be only one such model that has the required ids in its ManyToMany field. If that is not the case for you, just append the ids of the matching models (in my case tag_id) to a list, then make one query for all of them.
If anyone has a more concise way of doing an AND ManyToMany query, it would be interesting to see.
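For what it's worth, the "exactly these ids" check can be pushed into one grouped query. Below is a minimal sketch with sqlite3 against a hypothetical through table (the table and column names are assumptions, not necessarily what Django generates). It keeps a tag only if its total number of linked variants and its number of wanted variants both equal the size of the wanted set. In the ORM the same idea can probably be written with two Count annotations, one of them using the filter argument added in Django 2.0, but that variant is untested here.

```python
import sqlite3

# Hypothetical many-to-many through table between VariantTag and SavedVariant.
# Tag 1 holds exactly (250, 251), tag 2 a subset, tag 3 a superset, tag 4 neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE varianttag_saved_variants (varianttag_id INTEGER, savedvariant_id INTEGER);
INSERT INTO varianttag_saved_variants VALUES
    (1, 250), (1, 251),
    (2, 250),
    (3, 250), (3, 251), (3, 252),
    (4, 300);
""")


def tags_with_exactly(variant_ids):
    """Ids of tags whose linked variants are exactly variant_ids (AND, no extras)."""
    wanted = sorted(set(variant_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT varianttag_id
        FROM varianttag_saved_variants
        GROUP BY varianttag_id
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN savedvariant_id IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        ORDER BY varianttag_id
    """
    # Parameter order follows the placeholders: total count, the ids, matched count.
    params = [len(wanted), *wanted, len(wanted)]
    return [tag_id for (tag_id,) in conn.execute(sql, params)]


print(tags_with_exactly([250, 251]))  # prints [1]
```

This assumes the through table holds no duplicate (tag, variant) pairs, which is normally the case for a ManyToMany relation.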

How to limit prefetch_related data in django

I have two tables, brand and product. Each brand has multiple products.
I used prefetch_related to get, for each brand, only the related products with the minimum product price. The problem is that when 2 products share the same minimum price, both records are selected. How can I limit this to one?
alternatives_data = Brand.objects.filter(category__category_slug=category_slug).prefetch_related(
    Prefetch('products', queryset=Product.objects.annotate(
        min_brand_price=Min('brand__products__product_price')
    ).filter(
        product_price=F('min_brand_price')
    ).order_by('product_id')))
I tried everything, but nothing works!
To prevent a query from returning multiple records with duplicate values in specific columns, use the distinct method.
In your case, add .distinct('product_price') to the Product queryset inside the prefetch.
There is, however, one caveat: passing field names to distinct is supported on PostgreSQL only.
Documentation
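Another way to break the tie that does not depend on PostgreSQL is a correlated subquery that picks a single product id per brand, ordering by price and then by id. Here is a minimal sketch of that idea in plain SQL via sqlite3. The table layout is an assumption based on the field names in the question. In the ORM, Subquery and OuterRef (available since Django 1.11) are the usual building blocks for this pattern, though that translation is not tested here.

```python
import sqlite3

# Hypothetical product table: brand 10 has two products tied at the minimum price.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, brand_id INTEGER, product_price REAL);
INSERT INTO product VALUES
    (1, 10, 5.0), (2, 10, 5.0), (3, 10, 9.0),
    (4, 20, 7.0), (5, 20, 3.0);
""")

# For each brand keep only the row whose id is first when that brand's products
# are ordered by (price, id), so a price tie resolves to the lowest product_id.
rows = conn.execute("""
    SELECT p.brand_id, p.product_id, p.product_price
    FROM product AS p
    WHERE p.product_id = (
        SELECT p2.product_id
        FROM product AS p2
        WHERE p2.brand_id = p.brand_id
        ORDER BY p2.product_price, p2.product_id
        LIMIT 1
    )
    ORDER BY p.brand_id
""").fetchall()

print(rows)  # prints [(10, 1, 5.0), (20, 5, 3.0)]
```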

Count only published videos

I have a Category model and Video model
Category:
    name = CharField()
Video:
    name = CharField()
    category = ManyToManyField()
    is_live = BooleanField()
I want to get all categories with a video count, but I want to exclude videos that are not live.
This is my starting point:
Category.objects.annotate(video_count=Count('video'))
# I tried this but I'm not sure if this is the right way
Category.objects.exclude(video__is_live=False)
Any Ideas?
If you want to filter the field you are annotating, you need to use raw SQL as you can't do it through the ORM yet. I wrote a blog post about this:
http://timmyomahony.com/blog/filtering-annotations-django/
Your situation is a little more complicated as you have a M2M relationship which uses an intermediate table. You need something like the following which joins all 3 tables and counts only those that are marked is_live=True (this is totally untested so you will need to play around with it)
categories = Category.objects.all().extra(select = {
    "video_count" : """
    SELECT COUNT(*)
    FROM myapp_videocategory
    JOIN myapp_video ON myapp_videocategory.video_id = myapp_video.id
    WHERE myapp_videocategory.category_id = myapp_category.id
    AND myapp_video.is_live = True
    """
}).order_by("-video_count",)
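Here is a small runnable sketch of that correlated subquery, using sqlite3 and made-up table names (the real intermediate table name depends on your app and field names). Note that categories with no live videos still appear, with a count of 0. Also note that newer Django versions (2.0 and later) accept a filter argument on aggregates, so something like Count('video', filter=Q(video__is_live=True)) may let you skip the raw SQL entirely. That variant is not tested here.

```python
import sqlite3

# Hypothetical tables for Category, Video and the many-to-many link between them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE video (id INTEGER PRIMARY KEY, name TEXT, is_live INTEGER);
CREATE TABLE video_category (video_id INTEGER, category_id INTEGER);
INSERT INTO category VALUES (1, 'music'), (2, 'sport'), (3, 'news');
INSERT INTO video VALUES (1, 'a', 1), (2, 'b', 0), (3, 'c', 1);
INSERT INTO video_category VALUES (1, 1), (2, 1), (3, 1), (2, 2);
""")

# The subquery is evaluated once per category row and counts only live videos
# linked to that category.
rows = conn.execute("""
    SELECT category.name,
           (SELECT COUNT(*)
            FROM video_category
            JOIN video ON video_category.video_id = video.id
            WHERE video_category.category_id = category.id
              AND video.is_live = 1) AS video_count
    FROM category
    ORDER BY video_count DESC, category.name
""").fetchall()

print(rows)  # prints [('music', 2), ('news', 0), ('sport', 0)]
```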

Efficiently select latest items of different categories

Consider the following model:
class Data(Model):
    created_at = models.DateTimeField()
    category = models.CharField(max_length=7)
I want to select the latest object for all categories.
Following this question, I'm selecting the distinct categories and then making a separate query for each of them:
categories = Data.objects.distinct('category').values_list('category', flat=True)
for category in categories:
    latest_obj = Data.objects.filter(category=category).latest('created_at')
The downside of the approach is that it makes lots of queries (1 for the distinct categories, and then a separate query per category).
Is there a way to do this with a single query?
Typically, you would use a GROUP BY in a relational database. Django has an aggregation API
(https://docs.djangoproject.com/en/dev/topics/db/aggregation/#aggregation) which allows you to do the following:
from django.db.models import Max
Data.objects.values('category').annotate(latest=Max('created_at'))
This will perform a single query and return a list like this:
[{'category': 'cat1', 'latest': '01/01/01'}, {'category': 'cat2', 'latest': '02/02/02'}]
But I guess you might want to retrieve the id of each data record in this list as well. Django does not make things simple for you in this case. The problem is that Django uses all the fields in the values() clause for the grouping, so you cannot return extra columns from the query.
EDIT: I originally proposed adding a second values() clause to the end of the query, based on web resources, but this does not add extra columns to the result set.
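At the SQL level, the usual way to get the full row (including its id) is to join the table back against the grouped result. Here is a minimal sketch with sqlite3. The table name data is an assumption, and the timestamps are stored as ISO strings so that MAX compares them correctly. If two rows of one category share the latest timestamp, both come back.

```python
import sqlite3

# Hypothetical table matching the Data model from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id INTEGER PRIMARY KEY, created_at TEXT, category TEXT);
INSERT INTO data VALUES
    (1, '2001-01-01 10:00:00', 'cat1'),
    (2, '2001-01-02 10:00:00', 'cat1'),
    (3, '2002-02-02 09:00:00', 'cat2'),
    (4, '2002-02-01 09:00:00', 'cat2');
""")

# The inner query finds the newest timestamp per category; the join then pulls
# the complete row that carries that timestamp.
rows = conn.execute("""
    SELECT d.id, d.category, d.created_at
    FROM data AS d
    JOIN (
        SELECT category, MAX(created_at) AS latest
        FROM data
        GROUP BY category
    ) AS newest
      ON newest.category = d.category AND newest.latest = d.created_at
    ORDER BY d.category
""").fetchall()

print(rows)
# prints [(2, 'cat1', '2001-01-02 10:00:00'), (3, 'cat2', '2002-02-02 09:00:00')]
```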

Django DB, finding Categories whose Items are all in a subset

I have a two models:
class Category(models.Model):
    pass
class Item(models.Model):
    cat = models.ForeignKey(Category)
I am trying to return all Categories for which all of that category's items belong to a given subset of item ids. For example, all categories for which all of the items associated with that category have ids in the set [1,3,5].
How could this be done using Django's query syntax (as of 1.1 beta)? Ideally, all the work should be done in the database.
Category.objects.filter(item__id__in=[1, 3, 5])
Django creates the reverse relationship on the model without the foreign key. You can filter on it by using its related name (usually just the lowercase model name, but it can be overridden manually), two underscores, and the field name you want to query on. Note that this query returns every category with at least one item in the set, not only the categories whose items are all in it.
Let's say you require all items to be in the following set:
allowable_items = set([1,3,4])
One brute-force solution would be to check the item_set of every category, like so:
categories_with_allowable_items = [
    category for category in
    Category.objects.all() if
    set([item.id for item in category.item_set.all()]) <= allowable_items
]
but we don't really have to check all categories, as categories_with_allowable_items is always going to be a subset of the categories related to all items with ids in allowable_items... so that's all we have to check (and this should be faster):
categories_with_allowable_items = set([
    item.cat for item in
    Item.objects.select_related('cat').filter(pk__in=allowable_items) if
    set([siblingitem.id for siblingitem in item.cat.item_set.all()]) <= allowable_items
])
If performance isn't really an issue, then the latter of these two (if not the former) should be fine. If these are very large tables, you might have to come up with a more sophisticated solution. Also, if you're using a particularly old version of Python, remember that you'll have to import the sets module.
I've played around with this a bit. If QuerySet.extra() accepted a "having" parameter I think it would be possible to do it in the ORM with a bit of raw SQL in the HAVING clause. But it doesn't, so I think you'd have to write the whole query in raw SQL if you want the database doing the work.
EDIT:
This is the query that gets you part way there:
from django.db.models import Count
Category.objects.annotate(num_items=Count('item')).filter(num_items=...)
The problem is that for the query to work, "..." needs to be a correlated subquery that looks up, for each category, the number of its items in allowed_items. If .extra had a "having" argument, you'd do it like this:
Category.objects.annotate(num_items=Count('item')).extra(having="num_items=(SELECT COUNT(*) FROM app_item WHERE app_item.id IN %s AND app_item.cat_id = app_category.id)", having_params=[allowed_item_ids])
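If you do go the raw SQL route, the HAVING clause does not need a correlated subquery. Comparing each category's total item count with its count of allowed items inside one GROUP BY is enough. Here is a minimal sketch using sqlite3. The app_item and app_category table names follow the ones used above, and the sample data is invented. Categories with no items at all never show up, because the query only looks at the item table.

```python
import sqlite3

# Category 1 has items 1 and 3, category 2 has items 5 and 6,
# category 3 has item 7, and category 4 has no items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_category (id INTEGER PRIMARY KEY);
CREATE TABLE app_item (id INTEGER PRIMARY KEY, cat_id INTEGER);
INSERT INTO app_category VALUES (1), (2), (3), (4);
INSERT INTO app_item VALUES (1, 1), (3, 1), (5, 2), (6, 2), (7, 3);
""")


def categories_within(allowed_item_ids):
    """Ids of categories that have items, all of which are in allowed_item_ids."""
    allowed = sorted(set(allowed_item_ids))
    placeholders = ", ".join("?" for _ in allowed)
    sql = f"""
        SELECT cat_id
        FROM app_item
        GROUP BY cat_id
        HAVING COUNT(*) = SUM(CASE WHEN id IN ({placeholders}) THEN 1 ELSE 0 END)
        ORDER BY cat_id
    """
    return [cat_id for (cat_id,) in conn.execute(sql, allowed)]


print(categories_within([1, 3, 5]))  # prints [1]
```

Category 2 is rejected for [1, 3, 5] because item 6 is outside the set. Adding 6 to the allowed ids makes it qualify.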