Exclude fields with the same value using Django querysets - django

In my project, I have Articles and User Responses that share the same "title" value. I only want the first Article for each title, because the other objects with the same title are the users' responses. How can I exclude objects that have the same "title" from a queryset?
I tried this:
q1 = Article.objects.order_by().values('title').distinct()
*This works, but it returns something like a list of dictionaries, not Article objects.
So I tried to turn it back into a queryset of objects:
q2 = Article.objects.filter(title__in=q1).distinct()
*But this returns all the articles with repeated titles again.
How can I exclude objects with the same title from a queryset without converting it to a list?

On PostgreSQL only, you can pass positional arguments (*fields) to distinct() to specify the names of the fields to which the DISTINCT should apply.
If that is your case, the following should work:
Article.objects.filter(title__in=q1).order_by('title').distinct('title')
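On other databases, one portable alternative is to pick the lowest primary key for each title and filter on those keys. The sketch below runs the equivalent SQL with the standard-library sqlite3 module so it can be tried without a Django project. The table name article and its rows are hypothetical, and it assumes the original article always has a lower id than the responses that share its title. In the ORM the same shape would be roughly Article.objects.filter(pk__in=Article.objects.values('title').annotate(first_id=Min('pk')).order_by().values('first_id')).

```python
import sqlite3

# Hypothetical data: ids 2 and 4 are responses to article 1, id 5 to article 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(1, "Intro"), (2, "Intro"), (3, "Setup"), (4, "Intro"), (5, "Setup")],
)

# Keep only the row with the smallest id within each group of equal titles.
first_articles = conn.execute(
    """
    SELECT id, title
    FROM article
    WHERE id IN (SELECT MIN(id) FROM article GROUP BY title)
    ORDER BY id
    """
).fetchall()

print(first_articles)
```

Because the outer query is still a plain filter on the primary key, the result stays a queryset of full objects rather than a list of values.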

Related

Django Aggregate on related field with filter

Following problem:
I have product groups containing products. These products can be visible in the frontend or not. I determine their visibility with the method frontend() (which contains a filter) like so:
product_groups.first().products.frontend()
Now I want to put a link for a product group on the homepage only if there are four or more products in it.
With annotations I would do:
product_groups.annotate(num_products=Count('products')).filter(num_products__gte=4)
But this of course gives me the count of all products, not the count of products visible in the frontend.
So how do I put the additional filter frontend() into my query? To be clear, I want the Count() not on 'products' but on products.frontend().
Edit:
This is not a duplicate of the suggested question. If the filter function frontend() was simple enough to pull out the filter and stick it in the aggregate function, the suggested question would answer my problem.
My frontend() function is quite complicated and an aggregate of multiple other filter functions. So I would really like to use the frontend() function.
Edit:
This needs to work in Django 1.8.
If you want to reuse the frontend() method on your Product model's QuerySet, you can use Subquery expressions (note that Subquery and OuterRef were added in Django 1.11, so this does not work on 1.8):
from django.db.models import Count, IntegerField, OuterRef, Subquery

# assumption: `Product` has a FK to `ProductGroup`
# assumption 2: frontend() returns a `QuerySet` of `Product` and is a method of the `Product` model's default `QuerySet`
frontend_products = Product.objects.filter(product_group=OuterRef('pk')).frontend().values('product_group')
total_products = frontend_products.annotate(total=Count('pk')).values('total')
q = product_groups.annotate(num_frontend_products=Subquery(total_products, output_field=IntegerField()))
Note that this will populate num_frontend_products with None instead of 0 for the groups where there isn't any corresponding product. You might want to modify the queryset further with conditional annotations to replace None with 0.
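To see why the None appears and how a COALESCE removes it, here is a minimal sketch of the SQL shape using the standard-library sqlite3 module. The tables, the visible column (standing in for whatever frontend() filters on) and the data are all hypothetical. In the ORM, wrapping the subquery as Coalesce(Subquery(...), Value(0)), with Coalesce imported from django.db.models.functions, should have the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_group (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE product ("
    "id INTEGER PRIMARY KEY, product_group_id INTEGER, visible INTEGER)"
)
conn.executemany("INSERT INTO product_group (id) VALUES (?)", [(1,), (2,)])
# Group 1 has two visible products; group 2 has none.
conn.executemany(
    "INSERT INTO product (product_group_id, visible) VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 0), (2, 0)],
)

# A grouped subquery yields no row (hence NULL) when nothing matches;
# COALESCE turns that NULL into 0.
counts = conn.execute(
    """
    SELECT g.id,
           (SELECT COUNT(*) FROM product AS p
             WHERE p.product_group_id = g.id AND p.visible = 1
             GROUP BY p.product_group_id) AS raw_count,
           COALESCE(
               (SELECT COUNT(*) FROM product AS p
                 WHERE p.product_group_id = g.id AND p.visible = 1
                 GROUP BY p.product_group_id),
               0) AS num_frontend_products
    FROM product_group AS g
    ORDER BY g.id
    """
).fetchall()

print(counts)
```

With the coalesced value, a filter such as num_frontend_products__gte=4 compares against a number for every group instead of silently skipping the NULL ones.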

Return object when aggregating grouped fields in Django

Assuming the following example model:
# models.py
class event(models.Model):
    location = models.CharField(max_length=10)
    type = models.CharField(max_length=10)
    date = models.DateTimeField()
    attendance = models.IntegerField()
I want to get the attendance number for the latest date of each event location and type combination, using Django ORM. According to the Django Aggregation documentation, we can achieve something close to this, using values preceding the annotation.
... the original results are grouped according to the unique combinations of the fields specified in the values() clause. An annotation is then provided for each unique group; the annotation is computed over all members of the group.
So using the example model, we can write:
event.objects.values('location', 'type').annotate(latest_date=Max('date'))
which does indeed group events by location and type, but does not return the attendance field, which is what I actually need.
Another approach I tried was to use distinct i.e.:
event.objects.distinct('location', 'type').annotate(latest_date=Max('date'))
but I get an error
NotImplementedError: annotate() + distinct(fields) is not implemented.
I found some answers which rely on database specific features of Django, but I would like to find a solution which is agnostic to the underlying relational database.
Alright, I think this one might actually work for you. It is based upon an assumption, which I think is correct.
When you create your model objects, they should all be unique. It seems highly unlikely that you would have two events on the same date, in the same location and of the same type. So with that assumption, let's begin. (As a formatting note, class names tend to start with a capital letter to distinguish classes from variables or instances.)
from django.db.models import Max

# First, get the latest date for each location/type combination.
results = Event.objects.values('location', 'type').annotate(latest_date=Max('date'))
# Make an empty list to store the objects you want.
results_list = []
# Then iterate through 'results', looking up the matching object
# for each group and adding it to the list.
for r in results:
    # get() raises MultipleObjectsReturned if the uniqueness assumption does not hold.
    result = Event.objects.get(location=r['location'], type=r['type'], date=r['latest_date'])
    results_list.append(result)
# Now you have a list of objects that you can do whatever you want with.
You might have to check the exact output of Max('date'), but this should get you on the right path.
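The loop above issues one extra query per group. As a rough sketch of how the same lookup can be done in a single statement on any relational database, the example below uses a correlated subquery through the standard-library sqlite3 module. The table layout mirrors the model, but the rows are hypothetical and the dates are stored as ISO strings, which compare chronologically. On newer Django versions a similar query can be built with Subquery and OuterRef.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event ("
    "id INTEGER PRIMARY KEY, location TEXT, type TEXT, date TEXT, attendance INTEGER)"
)
conn.executemany(
    "INSERT INTO event (location, type, date, attendance) VALUES (?, ?, ?, ?)",
    [
        ("Paris", "talk", "2020-01-01", 10),
        ("Paris", "talk", "2020-02-01", 25),
        ("Paris", "demo", "2020-01-15", 7),
        ("Rome", "talk", "2020-03-01", 40),
        ("Rome", "talk", "2020-01-20", 30),
    ],
)

# Keep each row whose date equals the latest date within its
# (location, type) group; the full row, including attendance, is returned.
latest = conn.execute(
    """
    SELECT e.location, e.type, e.date, e.attendance
    FROM event AS e
    WHERE e.date = (
        SELECT MAX(e2.date)
        FROM event AS e2
        WHERE e2.location = e.location AND e2.type = e.type
    )
    ORDER BY e.location, e.type
    """
).fetchall()

print(latest)
```

If two events in the same group share the latest date, this returns both of them, whereas the get() call in the loop would raise an error.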

Django annotate a field value to queryset

I want to attach a field value (id) to a QS like below, but Django throws a 'str' object has no attribute 'lookup' error.
Book.objects.all().annotate(some_id='somerelation__id')
It seems I can get my id value using Sum()
Book.objects.all().annotate(something=Sum('somerelation__id'))
Is there no way to simply annotate raw field values onto a queryset? Using Sum() in this case doesn't feel right.
There are at least three ways of accessing related objects in a queryset.
1. Using Django's double-underscore join syntax:
If you just want to use a field of a related object as a condition in your SQL query, you can refer to a field named field on a related object named related_object as related_object__field. All possible lookup types are listed in the Django documentation under Field lookups.
Book.objects.filter(related_object__field=True)
2. Using annotate() with F():
You can populate an annotated field in a queryset by referring to the field with an F() object. F() represents the field of a model or an annotated field.
Book.objects.annotate(added_field=F("related_object__field"))
3. Accessing object attributes:
Once the queryset is evaluated, you can access related objects through attributes on that object.
book = Book.objects.get(pk=1)
author = book.author.name # just one author, or…
authors = book.author_set.values("name") # several authors
This triggers an additional query unless you're making use of select_related().
My advice is to go with solution #2, as you're already halfway down that road and I think it'll give you exactly what you're asking for. The problem you're facing right now is that annotate() expects an expression, but you're passing a plain string ('somerelation__id') that Django doesn't know what to do with.
Also, the Django documentation on annotate() is pretty straightforward. You should look into that (again).
You have <somerelation>_id "by default". For example, comment.user_id. It works because each Comment belongs to exactly one User. But if a Book has many Authors, what is author_id supposed to be in this case?

django values not working

When I try to call values() with more than 3 fields it seems to 'break' (i.e. it doesn't group duplicate entries together).
My model is a through model with three fields, 2 ForeignKey and one DateTimeField
class ProjectView(models.Model):
    user = models.ForeignKey(User)
    project = models.ForeignKey(Project)
    datetime_created = models.DateTimeField()
I want to do:
ProjectView.objects.filter(datetime_created__gt=yesterday).values('project__id', 'project__title', 'project__thumbnail', 'project__creator_username')
If I get rid of any one of the values fields, it groups the results by project without duplicates; with 4 values it seems to do no grouping. Am I doing something wrong?
If you take a look at the docs for values(), you'll see no guarantee of grouping or distinctness. If you want that behaviour, you'll have to call .order_by() and/or .distinct() when making your call to the ORM.
That it works at all is probably just a side effect of the SQL generated. If you want to see the SQL, take a look at django-debug-toolbar.
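As a small illustration of the difference, here is a sketch in plain SQL through the standard-library sqlite3 module; the project_view table and its rows are hypothetical. Selecting columns alone keeps one row per view, and only DISTINCT collapses the duplicates. In the ORM that would correspond roughly to appending .order_by().distinct() to the values() call, where the empty order_by() clears any ordering that would otherwise be added to the selected columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_view ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, project_id INTEGER)"
)
# Project 10 was viewed three times, project 20 twice.
conn.executemany(
    "INSERT INTO project_view (user_id, project_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (1, 20), (2, 10)],
)

# Without DISTINCT there is one result row per view.
plain = conn.execute(
    "SELECT project_id FROM project_view ORDER BY project_id"
).fetchall()

# With DISTINCT each project appears once.
unique = conn.execute(
    "SELECT DISTINCT project_id FROM project_view ORDER BY project_id"
).fetchall()

print(len(plain), unique)
```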

Django DB, finding Categories whose Items are all in a subset

I have two models:
class Category(models.Model):
    pass
class Item(models.Model):
    cat = models.ForeignKey(Category)
I am trying to return all Categories for which all of that category's items belong to a given subset of item ids. For example, all categories for which all of the items associated with that category have ids in the set [1,3,5].
How could this be done using Django's query syntax (as of 1.1 beta)? Ideally, all the work should be done in the database.
Category.objects.filter(item__id__in=[1, 3, 5])
Django creates the reverse relationship on the model without the foreign key. You can filter on it by using its related name (usually just the lowercase model name, but it can be overridden manually), two underscores, and the field name you want to query on. Note that this matches categories that have at least one item in the set, not only those whose items are all in it.
Let's say you require all items to be in the following set:
allowable_items = set([1,3,4])
One brute-force solution would be to check the item_set for every category, like so:
categories_with_allowable_items = [
    category for category in
    Category.objects.all() if
    set([item.id for item in category.item_set.all()]) <= allowable_items
]
But we don't really have to check all categories, as categories_with_allowable_items is always going to be a subset of the categories related to the items with ids in allowable_items, so that's all we have to check (and this should be faster):
# The foreign key on Item is named `cat` in the models above.
categories_with_allowable_items = set([
    item.cat for item in
    Item.objects.select_related('cat').filter(pk__in=allowable_items) if
    set([siblingitem.id for siblingitem in item.cat.item_set.all()]) <= allowable_items
])
If performance isn't really an issue, then the latter of these two (if not the former) should be fine. If these are very large tables, you might have to come up with a more sophisticated solution. Also, if you're using a particularly old version of Python, remember that you'll have to import the sets module.
I've played around with this a bit. If QuerySet.extra() accepted a "having" parameter I think it would be possible to do it in the ORM with a bit of raw SQL in the HAVING clause. But it doesn't, so I think you'd have to write the whole query in raw SQL if you want the database doing the work.
EDIT:
This is the query that gets you part way there:
from django.db.models import Count
Category.objects.annotate(num_items=Count('item')).filter(num_items=...)
The problem is that for the query to work, "..." needs to be a correlated subquery that looks up, for each category, the number of its items in allowed_items. If .extra had a "having" argument, you'd do it like this:
Category.objects.annotate(num_items=Count('item')).extra(having="num_items=(SELECT COUNT(*) FROM app_item WHERE app_item.id IN %s AND app_item.cat_id = app_category.id)", having_params=[allowed_item_ids])
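Since .extra() has no such argument, the raw SQL route is what remains. Below is a minimal sketch of one way to write that query, run through the standard-library sqlite3 module so it can be tried in isolation. The table names app_category and app_item follow the naming used above, the sample rows are hypothetical, and the HAVING clause compares each category's total item count with its count of allowed items instead of using a correlated subquery. Later Django versions can express a similar comparison with conditional aggregation (Case and When inside an aggregate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_category (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE app_item (id INTEGER PRIMARY KEY, cat_id INTEGER)")
conn.executemany("INSERT INTO app_category (id) VALUES (?)", [(1,), (2,), (3,)])
# Category 1 owns items 1 and 3, category 2 owns 5 and 6, category 3 owns 7.
conn.executemany(
    "INSERT INTO app_item (id, cat_id) VALUES (?, ?)",
    [(1, 1), (3, 1), (5, 2), (6, 2), (7, 3)],
)

allowed_item_ids = [1, 3, 5]
# One "?" placeholder per allowed id, so the values are passed as parameters.
placeholders = ", ".join("?" for _ in allowed_item_ids)

# A category qualifies when every one of its items is in the allowed set,
# i.e. its total item count equals its count of allowed items.
# Categories without any items are dropped by the inner join.
query = f"""
    SELECT c.id
    FROM app_category AS c
    JOIN app_item AS i ON i.cat_id = c.id
    GROUP BY c.id
    HAVING COUNT(*) = SUM(CASE WHEN i.id IN ({placeholders}) THEN 1 ELSE 0 END)
    ORDER BY c.id
"""
matching = [row[0] for row in conn.execute(query, allowed_item_ids)]

print(matching)
```

Here only category 1 qualifies: category 2 also owns item 6, and category 3 owns only item 7, neither of which is in the allowed set.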