Django Haystack: Filtering by object properties - django

I'm trying to build a page that renders searched Oscar products and filters them by their Category using GET attributes. I'm overriding get_queryset and building my object list from there
class ProductSearchView(ListView):
model = Product
template_name = "productsearch/product_list.html"
queryset = Product.objects.none()
def get_queryset(self):
word_query_attr = self.request.GET.get('q', None) # word query
sqs = SearchQuerySet().models(Product).filter(Q(title__icontains=word_query_attr) |
Q(category__name__icontains=word_query_attr) |
Q(upc__icontains=word_query_attr) |
Q(description__icontains=word_query_attr))
qs = Product.objects.all()
if self.request.GET.get('cat', None):
cat_attr = self.request.GET.get('cat', None)
category = Category.objects.get(name=cat_attr)
qs = qs.filter(categories__in=category.get_children())
My question is, can I use SearchQuerySet() to filter through fields from result objects? (in this case, categories from Product objects)
If not, is there an effective way I can create a Product queryset using SearchQuerySet() results?
I've tried filtering through IDs
object_ids = [result.object.id for result in sqs]
qs = qs.filter(id__in=object_ids).distinct()
But there are two problems: it isn't scalable (as noted here) and some queries are very slow when I'm dealing with ~900 results.

I might be missing something, but the SearchQuerySet filter in your example is already restricting by category name? Could you just change that filter to exact instead of icontains and pass your cat GET param?
sqs = SearchQuerySet().models(Product).filter(
Q(title__icontains=word_query_attr) |
Q(upc__icontains=word_query_attr) |
Q(description__icontains=word_query_attr)
)
cat_attr = self.request.GET.get('cat')
if cat_attr:
sqs.filter(category__name__exact=cat_attr)
I think Haystack should let you chain filters like that.

Related

How to filter in django with empty fields when using ChoiceField

I have a programme where users should be able to filter different types of technologies by their attributes. My question is, how would I filter the technologies when there's potential conflicts and empty values in the parameters I use to filter?
Forms.py:
class FilterDataForm(forms.ModelForm):
ASSESSMENT = (('', ''),('Yes', 'Yes'),('No', 'No'),)
q01_suitability_for_task_x = forms.ChoiceField(label='Is the technology suitable for x?',
choices=ASSESSMENT, help_text='Please select yes or no', required=False,)
q02_suitability_for_environment_y = forms.ChoiceField(label='Is the technology suitable for environment Y?',
choices=ASSESSMENT, help_text='Please select yes or no', required=False)
There are many fields in my model like the ones above.
views.py
class TechListView(ListView):
model = MiningTech
def get_queryset(self):
q1 = self.request.GET.get('q01_suitability_for_task_x', '')
q2 = self.request.GET.get('q02_suitability_for_environment_y', '')
object_list = MiningTech.objects.filter(q01_suitability_for_task_x=q1).filter(
q02_suitability_for_environment_y=q2)
return object_list
The difficulty is that not all technology db entries will have data. So in my current setup there's times where I will filter out objects that have one attribute but not another.
For instance if my db has:
pk1: q01_suitability_for_task_x=Yes; q02_suitability_for_environment_y=Yes;
pk2: q01_suitability_for_task_x=Yes; q02_suitability_for_environment_y='';
In the form, if I don't select any value for q01_suitability_for_task_x, and select Yes for q02_suitability_for_environment_y, I get nothing back in the queryset because there are no q01_suitability_for_task_x empty fields.
Any help would be appreciated. I'm also ok with restructuring everything if need be.
The problem is that your self.request.GET.get(...) code defaults to an empty string if there is no value found, so your model .filter() is looking for matches where the string is ''.
I would restructure the first part of get_queryset() to build a dictionary that can be unpacked into your filter. If the value doesn't exist then it doesn't get added to the filter dictionary:
filters = {}
q1 = self.request.GET.get('q01_suitability_for_task_x', None)
q2 = self.request.GET.get('q02_suitability_for_environment_y', None)
if q1 is not None:
filters['q01_suitability_for_task_x'] = q1
... etc ...
object_list = MiningTech.objects.filter(**filters)
If you have a lot of q1, q2, etc. items then consider putting them in a list, looping through and inserting into the dictionary if .get(...) returns anything.
Edit: Because there are indeed a lot possible filters, the final solution looks as follows:
def get_queryset(self):
filters = {}
for key, value in self.request.GET.items():
if value != '':
filters[key] = value
object_list = Tech.objects.filter(**filters)

Django: Change Queryset of already defined Prefetch object

Context
I really want to, but I don't understand how I can limit an already existing Prefetch object
Models
class MyUser(AbstractUser):
pass
class Absence(Model):
employee = ForeignKey(MyUser, related_name='absences', on_delete=PROTECT)
start_date = DateField()
end_date = DateField()
View
class UserAbsencesListAPIView(ListAPIView):
queryset = MyUser.objects.order_by('first_name')
serializer_class = serializers.UserWithAbsencesSerializer
filterset_class = filters.UserAbsencesFilterSet
Filter
class UserAbsencesFilterSet(FilterSet):
first_name = CharFilter(lookup_expr='icontains', field_name='first_name')
from_ = DateFilter(method='filter_from', distinct=True)
to = DateFilter(method='filter_to', distinct=True)
What do I need
With the Request there are two arguments from_ and to. I should return Users with their Absences, which (Absences) are bounded by from_ and/or to intervals. It's very simple for a single argument, i can limit the set using Prefetch object:
def filter_from(self, queryset, name, value):
return queryset.prefetch_related(
Prefetch(
'absences',
Absence.objects.filter(Q(start_date__gte=value) | Q(start_date__lte=value, end_date__gte=value)),
)
)
Similarly for to.
But what if I want to get a limit by two arguments at once?
When the from_ attribute is requested - 'filter_from' method is executed; for the to argument, another method filter_to is executed.
I can't use prefetch_related twice, I get an exception ValueError: 'absences' lookup was already seen with a different queryset. You may need to adjust the ordering of your lookups..
I've tried using to_attr, but it looks like I can't access it in an un-evaluated queryset.
I know that I can find the first defined Prefetch in the _prefetch_related_lookups attribute of queryset, but is there any way to apply an additional filter to it or replace it with another Prefetch object so that I can end up with a query similar to:
queryset.prefetch_related(
Prefetch(
'absences',
Absence.objects.filter(
Q(Q(start_date__gte=from_) | Q(start_date__lte=from_, end_date__gte=from_))
& Q(Q(end_date__lte=to) | Q(start_date__lte=to, end_date__gte=to))
),
)
)
django-filter seems to have its own built-in filter for range queries:
More info here and here
So probably just easier to use that instead:
def filter_date_range(self, queryset, name, value):
if self.lookup_expr = "range":
#return queryset with specific prefetch
if self.lookup_expr = "lte":
#return queryset with specific prefetch
if self.lookup_expr = "gte":
#return queryset with specific prefetch
I haven't tested this and you may need to play around with the unpacking of value but it should get you most of the way there.

Optimize the filtering of Queryset results in Django

I'm overriding Django Admin's list_filter (to customize the filter that shows on the right side on the django admin UI for a listview). The following code works, but isn't optimized: it increases SQL queries by "number of product categories".
(The parts to focus on, in the following code sample are, qs.values_list('product_category', flat=True) which only returns an id (int), so I've to use ProductCategory.objects.get(id=i).)
Wondering if this can be simplified?
(E.g. data: Suppose the product categories are "baked" "fried" "raw" etc., and the Items are "bread" "fish fry" "cake". So when the Item list is displayed in Django Admin, all product categories will show on the 'Filter By' column on the right side of the UI.)
from django.utils.translation import ugettext_lazy as _
from django.contrib.admin import SimpleListFilter
from product_category.model import ProductCategory
class ProductCategoryFilter(SimpleListFilter):
title = _('ProductCategory')
parameter_name = 'product_category'
def lookups(self, request, model_admin):
qs = model_admin.get_queryset(request)
ordered_filter_obj_list = []
# TODO: Works, but increases SQL queries by "number of product categories"
for i in (
qs.values_list("product_category", flat=True)
.distinct()
.order_by("product_category")
):
cat = ProductCategory.objects.get(id=i)
ordered_filter_obj_list.append((i, cat))
return ordered_filter_obj_list
def queryset(self, request, queryset):
if self.value():
return queryset.filter(product_category__exact=self.value())
# P.S. Above filter is used in another class like so
class ItemAdmin(admin.ModelAdmin):
list_filter = (ProductCategoryFilter,)
Probably you are looking for select_related, I do not know your exact models structure, but you may use it as follow:
cats = set()
for p in Product.objects.all().select_related('category'):
# Without select_related(), this would make a database query for each
# loop iteration in order to fetch the related categories for each product.
cats.add(p.category)
I am Assuming there is some relation between your Product and ProductCategory models. Hope this help.
Hah, phrasing the question makes it clear in your own head! Found an answer mins after posting this:
(Instead of doing an objects.get() inside the for loop, we can do objects.all() (which is a single SQL Query) and fill up a temporary dictionary. Then use this temp dict to find the associated string value.)
def lookups(self, request, model_admin):
qs = model_admin.get_queryset(request)
category_list = {}
for x in ProductCategory.objects.all():
category_list[x.id] = str(x)
ordered_filter_obj_list = []
for i in (
qs.values_list("product_category", flat=True)
.distinct().order_by("product_category")
):
ordered_filter_obj_list.append((i, category_list[i]))
return ordered_filter_obj_list
First parameter on the tuple list is the value of the lookup, and the second is just the name for display. This can be done in a single SQL query, or via the Django ORM:
def lookups(self, request, model_admin):
qs = model_admin.get_queryset(request).select_related('product_category')
values = qs.values('product_category_id', 'product_category__name') #assuming ProductCategory has an attribute 'name'
unique_categories = values.distinct('product_category_id', 'product_category__name')
categories = []
for c in unique_categories:
categories.append((c['product_category_id'], c['product_category__name']))
return categories

AND search with reverse relations

I'm working on a django project with the following models.
class User(models.Model):
pass
class Item(models.Model):
user = models.ForeignKey(User)
item_id = models.IntegerField()
There are about 10 million items and 100 thousand users.
My goal is to override the default admin search that takes forever and
return all the matching users that own "all" of the specified item ids within a reasonable timeframe.
These are a couple of the tests I use to better illustrate my criteria.
class TestSearch(TestCase):
def search(self, searchterm):
"""A tuple is returned with the first element as the queryset"""
return do_admin_search(User.objects.all())
def test_return_matching_users(self):
user = User.objects.create()
Item.objects.create(item_id=12345, user=user)
Item.objects.create(item_id=67890, user=user)
result = self.search('12345 67890')
assert_equal(1, result[0].count())
assert_equal(user, result[0][0])
def test_exclude_users_that_do_not_match_1(self):
user = User.objects.create()
Item.objects.create(item_id=12345, user=user)
result = self.search('12345 67890')
assert_false(result[0].exists())
def test_exclude_users_that_do_not_match_2(self):
user = User.objects.create()
result = self.search('12345 67890')
assert_false(result[0].exists())
The following snippet is my best attempt using annotate that takes over 50 seconds.
def search_by_item_ids(queryset, item_ids):
params = {}
for i in item_ids:
cond = Case(When(item__item_id=i, then=True), output_field=BooleanField())
params['has_' + str(i)] = cond
queryset = queryset.annotate(**params)
params = {}
for i in item_ids:
params['has_' + str(i)] = True
queryset = queryset.filter(**params)
return queryset
Is there anything I can do to speed it up?
Here's some quick suggestions that should improve performance drastically.
Use prefetch_related` on the initial queryset to get related items
queryset = User.objects.filter(...).prefetch_related('user_set')
Filter with the __in operator instead of looping through a list of IDs
def search_by_item_ids(queryset, item_ids):
return queryset.filter(item__item_id__in=item_ids)
Don't annotate if it's already a condition of the query
Since you know that this queryset only consists of records with ids in the item_ids list, no need to write that per object.
Putting it all together
You can speed up what you are doing drastically just by calling -
queryset = User.objects.filter(
item__item_id__in=item_ids
).prefetch_related('user_set')
with only 2 db hits for the full query.

django model search form

Firstly, I did my homework and looked around before posting! My question seems like a very basic thing that must’ve been covered before.
I'm now looking at Django-filter as a potential solution, but would like some advice on if this is the right way to go and if there any other solutions.
I have a Django app wit 10 models, each model has a few fields. Most fields are ChoiceField that users populate using forms with the default select widget. There is a separate form for each model.
I want to create a separate form for each model (in separate views) that users will use to search the database. The search form will contain only drop-down boxes (the select widgets) with the same choices as the forms used to populate the database with the addition of the “any” option.
I know how to use .object.filter(), however the “any” option would correspond to not include specific fields in the filter and I'm not sure how to add model fields to the filter based on users’ selection
I briefly looked at Haystack as an option but it seems to be made for full text search rather than “model filed search” I'm after.
Sample model (simplified):
class Property():
TYPE_CHOICES = (‘apartment’, ‘house’, ‘flat’)
type = charfield(choices=TYPE_CHOICES)
LOC_CHOICES = (‘Brussels’, ‘London’, ‘Dublin’, ‘Paris’)
location = charfield(choices=LOC_CHOICES)
price = PostivieInteger()
Users can select only “type”, only “location” or both (not making selection is equal to ANY) in which case I end up with 3 different filters:
Property.objects.filter(type=’apartment’)
Property.objects.filter(location=’Dublin’)
Property.objects.filter(type=’apartment’, location=’Dublin’)
The main question: django-filter the best option?
Question 1: what’s the best option of accomplishing this overall?
Question 2: how do I add model fields to the filter based on user’s form selection?
Question 3: how do I do the filter based on user selection? (I know how to use .filter(price_lt=).exclude(price_gt=) but again how do I do it dynamically based on selection as “ANY” would mean this is not included in the query)
I had a similar case like yours (real estate project), I ended up with the following approach, you can refine this to your needs...I removed select_related and prefetch_related models for easier reading
properties/forms.py:
class SearchPropertyForm(forms.Form):
property_type = forms.ModelChoiceField(label=_("Property Type"), queryset=HouseType.objects.all(),widget=forms.Select(attrs={'class':'form-control input-sm'}))
location = forms.ModelChoiceField(label=_('Location'), queryset=HouseLocation.objects.all(), widget=forms.Select(attrs={'class':'form-control input-sm'}))
Then in the properties/views.py
# Create a Mixin to inject the search form in our context
class SeachPropertyMixin(object):
def get_context_data(self, **kwargs):
context = super(SeachPropertyMixin, self).get_context_data(**kwargs)
context['search_property_form'] = SearchPropertyForm()
return context
In your actual view (I apply the search form as a sidebar element in my detailview only:
# Use Class Based views, saves you a great deal of repeating code...
class PropertyView(SeachPropertyMixin,DetailView):
template_name = 'properties/view.html'
context_object_name = 'house'
...
queryset = HouseModel.objects.select_related(...).prefetch_related(...).filter(flag_active=True, flag_status='a')
Finally your search result view (this is performed as GET request, since we are not altering any data in our DB, we stick to the GET method):
# Search results should return a ListView, here is how we implement it:
class PropertySearchResultView(ListView):
template_name = "properties/propertysearchresults.html"
context_object_name = 'houses'
paginate_by = 6
queryset = HouseModel.objects.select_related(...).prefetch_related(...).order_by('-sale_price').filter(flag_active=True, flag_status='a')
def get_queryset(self):
qs = super(PropertySearchResultView,self).get_queryset()
property_type = self.request.GET.get('property_type')
location = self.request.GET.get('location')
'''
Start Chaining the filters based on the input, this way if the user has not
selected a filter it wont be used.
'''
if property_type != '' and property_type is not None:
qs = qs.filter(housetype=property_type)
if location != '' and location is not None:
qs = qs.filter(location=location)
return qs
def get_context_data(self, **kwargs):
context = super(PropertySearchResultView, self).get_context_data()
'''
Add the current request to the context
'''
context['current_request'] = self.request.META['QUERY_STRING']
return context
Your solution works. I've modified it and I'm not using ModelChoiceField but the standard form.ChoiceField. The reason for that is that I wanted to add option "Any". My "if" statements look like:
if locality != 'Any Locality':
qs = qs.filter(locality=locality)
if property_type != 'Any Type':
qs = qs.filter(property_type=property_type)
if int(price_min) != 0:
qs = qs.filter(price__gte=price_min)
if int(price_max) != 0:
qs = qs.filter(price__lte=price_max)
if bedrooms != 'Any Number':
qs = qs.filter(bedrooms=bedrooms)
And so on....
This does the job, however it seems like an ugly and hacky solution to a simple problem. I would think is a common use case. I feel there should be a cleaner solution...
I've tried the django-filter. It is close to doing what I want but I couldn't add the "Any" choice and it filters inline rather than returning. It should do with some modifications.
Cheers